Nonlinear scaling limit to the power-law dependency network of free software packages (Download)
Arnab K. Ray, Rajiv Nair, and G. Nagarjuna
Homi Bhabha Centre for Science Education, Tata Institute of Fundamental Research, V. N. Purav Marg, Mankhurd, Mumbai 400088
It is a matter of common knowledge that when it comes to installing a software package from the free-software Debian GNU/Linux repository, many other packages — the “dependencies” — are also called for as prerequisites. This leads to a network of these dependencies. We treat every such package as a node in a network of dependency relationships. Each dependency relationship connecting any two packages (nodes) is treated as a link. Links can further be of two types — incoming links and outgoing links, giving rise to two distinct kinds of network. The task is to measure the number of software packages, ϕ, which are connected by a particular number of links, x, in either kind of network. In other words, this gives a frequency plot of ϕ ≡ ϕ(x) versus x. We posit a mathematical model for this kind of a distribution by a general logistic-type equation,
![]() | (1) |
with λ being a power-law exponent, α being a saturation exponent, and η being a “switch” parameter for nonlinearity. Determination of the magnitude of both α and η should be important in understanding the global behaviour of the distribution of networks, particularly the distribution of a disproportionately high number of links connected to a very few special nodes — the so-called “top nodes”.
Integration of Eq. (1), which is a nonlinear differential equation, yields the general integral solution (for α≠0),
![]() | (2) |
in which c is an integration constant. It is quite obvious that when η = 0, which implies the absence of nonlinearity, there will be a global power-law distribution for the data, going as ϕ(x) ~ xλ. This has been shown by the continuous straight lines in both the log-log plots in Figs. 1 & 2, for the network of incoming and outgoing links, respectively. However, as x-→∞, nonlinearity causes a convergence of ϕ towards a non-zero limiting value η, i.e. ϕ-→η. Under the calibrated values of α = -1 and λ = -2, as derived from the data for both types of network, this particular situation is described by the solution
![]() | (3) |
a result that very strongly indicates that nonlinearity characterises the type of the network, influences the properties of its top nodes (corresponding to high values of x), and restricts the number of links connected to these special nodes. A scale for the onset of nonlinear effects can also be ascertained by requiring the two terms on the right hand side of Eq. (3) to be in rough equipartition. This will yield the nonlinear scale as xnl ~ c|η|-1∕2. The Debian data indicate that roughly the top 1% of the nodes fall within this scale. On the other hand, the very weakly linked nodes (for small values of x) are best modelled by a Gibbs-Boltzmann or a log-normal distribution.
We also note here that following a power series expansion of Eq. (2), a natural and self-consistent truncation for the series can only be achieved for α = -1, leading to the result shown in Eq. (3). This is also the only value of α that will spare ϕ(x) from the occurrence of divergences or unphysical solutions.

![[ ( ) ]-1∕α
ϕ(x) = η+ x--αλ ,
c](rnn1x.png)

