Critical Truths About Power Laws


Science  10 Feb 2012:
Vol. 335, Issue 6069, pp. 665-666
DOI: 10.1126/science.1216142

The ability to summarize observations using explanatory and predictive theories is the greatest strength of modern science. A theoretical framework is perceived as particularly successful if it can explain very disparate facts. The observation that some apparently complex phenomena can exhibit startling similarities to dynamics generated with simple mathematical models (1) has led to empirical searches for fundamental laws by inspecting data for qualitative agreement with the behavior of such models. A striking feature that has attracted considerable attention is the apparent ubiquity of power-law relationships in empirical data. However, although power laws have been reported in areas ranging from finance and molecular biology to geophysics and the Internet, the data are typically insufficient and the mechanistic insights are almost always too limited for the identification of power-law behavior to be scientifically useful (see the figure). Indeed, even most statistically “successful” calculations of power laws offer little more than anecdotal value.

By power-law behavior, one typically means that some physical quantity or probability distribution y(x) satisfies (2, 3) y(x) ∝ x^{-λ} for x > x_0, where λ is called the “exponent” of the power law. In the equation, the power-law behavior occurs in the tail of the distribution (i.e., for x > x_0). A power-law distribution has a so-called “heavy tail,” so extreme events are far more likely than they would be in, for example, a Gaussian distribution. Examples of such relationships have been reported in a wide range of situations, including the Gutenberg-Richter law in seismology (4), allometric scaling in animals (5), the distribution of hyperlinks on the World Wide Web (6), the sometimes vehemently refuted (7) “scale-free” nature of the Internet (8), a purported unified theory of urban living (9), patterns of insurgent and terrorist activity (10), and (ironically) the paper publication rates of statistical physicists (11). A subtlety to note is that this list includes two different types of reported power laws: bivariate power laws like allometric scaling and power-law probability distributions like the paper publication rates.
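To make the "heavy tail" concrete, the following minimal sketch compares the probability of exceeding a threshold under a standard Gaussian and under a Pareto (power-law) distribution. The tail exponent `alpha = 2.0` and the cutoff `x0 = 1.0` are illustrative assumptions, not values from the text; in this parameterization the exponent of the density is λ = alpha + 1.

```python
import math

def gaussian_tail(x, mu=0.0, sigma=1.0):
    """P(X > x) for a Gaussian, via the complementary error function."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))

def pareto_tail(x, x0=1.0, alpha=2.0):
    """P(X > x) for a Pareto law whose density is proportional to x**-(alpha + 1) for x > x0."""
    if x <= x0:
        return 1.0
    return (x / x0) ** (-alpha)

# Illustrative thresholds (assumed, not taken from any data set).
for x in (2.0, 5.0, 10.0):
    print(f"x = {x:5.1f}  Gaussian: {gaussian_tail(x):.3e}  power law: {pareto_tail(x):.3e}")
```

The two distributions have different scales, so the comparison is only qualitative: the Gaussian tail collapses faster than exponentially, whereas the power-law tail shrinks by a fixed factor for each multiplicative step in x.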

Power laws in statistical physics emerge naturally from microscopic theories and can be related to observable macroscopic phenomena. A good example is magnetization (3). The derivation of a power law suggests that—in a certain (“critical”) regime—phenomena do not possess a preferred scale in space, time, or something else: They are, in a sense, “scale free.” However, as Philip Anderson pointed out in 1972 (12), one must be cautious when claiming power-law behavior in finite systems, and it is not clear whether power laws are relevant or useful in so-called “complex systems” (13, 14). It is important to take a nuanced approach and consider not only whether or not one has or can derive a detailed mechanistic model of a system's driving dynamics, but also the extent of statistical support for a reported power law. One additionally needs to consider empirical support, as theories for power-law behavior arise from infinite systems, and real systems are finite.
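The "scale free" property can be made explicit with a short worked step, assuming the pure form y(x) = C x^{-λ} with an arbitrary constant C (the constant is an assumption for illustration). Rescaling x by any factor b > 0 changes y only by a constant factor:

```latex
y(bx) = C\,(bx)^{-\lambda} = b^{-\lambda}\,C\,x^{-\lambda} = b^{-\lambda}\,y(x), \qquad b > 0,
\qquad\text{and}\qquad
\log y = \log C - \lambda \log x .
```

No particular value of x is singled out by the functional form, and on logarithmic axes the relationship is a straight line with slope -λ. In a finite system this can hold only over a limited range of x, which is why the caution above matters.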

The power law reported for allometric scaling stands out as genuinely good (see the figure) (5): Not only is there a sound theory underlying why there should be a power-law relationship between body size x and metabolic performance y, but this relationship has been supported empirically over many orders of magnitude (from bacteria to whales). The clear dependence of various biological characteristics on body size is, of course, insufficient by itself to infer a causal relationship, but few people would dispute the reality of such a relationship.

Purported power laws fall loosely into two categories: those with statistical support—by itself a nontrivial task (15)—and those without it. Numerous scholars have neglected to apply careful statistical tests to data that were reported to exhibit power-law relationships; so-called “scale-free” networks are perhaps the best known and most widely discussed examples (2, 6, 13). However, when formal statistical tools have been applied to network data, evidence favoring power-law relationships has almost always been negligible (7, 15, 16).

As a rule of thumb, a candidate power law should exhibit an approximately linear relationship on a log-log plot over at least two orders of magnitude in both the x and y axes. This criterion rules out many data sets, including just about all biological networks. Examination (15) of the statistical support for numerous reported power laws has revealed that the overwhelming majority of them failed statistical testing (sometimes rather epically). For example, a recent study found (17) that the number of interacting partners (i.e., the degree) of proteins in yeast is power-law distributed, but careful statistical analysis refutes this claim (18). Noise or incomplete data can further distort the picture (19). Trying to discern a power-law relationship by eyeballing straight lines (or even trying to find them using, for example, least-squares fitting) on log-log plots of data can be appealing, but the human ability to detect patterns from even the flimsiest of evidence might lead researchers to conclude the existence of a bona fide power law based on purely qualitative criteria.
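As one hedged illustration of moving beyond eyeballing, the sketch below checks how many orders of magnitude a data set spans and estimates the exponent by maximum likelihood for a continuous power law, rather than by least squares on a log-log plot. This is a standard estimator, not necessarily the procedure used in the cited analyses, and it is only a first step: it does not test whether a power law is a plausible description of the data at all. The function names, the cutoff `x0`, and the synthetic data are assumptions made for illustration.

```python
import math
import random

def decades_spanned(values):
    """Number of orders of magnitude covered by a list of positive values."""
    return math.log10(max(values) / min(values))

def mle_exponent(data, x0):
    """Maximum-likelihood estimate of lam for a density p(x) ~ x**-lam, x >= x0.

    Returns (estimate, approximate standard error).
    """
    tail = [x for x in data if x >= x0]
    n = len(tail)
    log_sum = sum(math.log(x / x0) for x in tail)
    lam = 1.0 + n / log_sum
    return lam, (lam - 1.0) / math.sqrt(n)

def sample_power_law(n, lam, x0, rng):
    """Draw n synthetic values by inverse-transform sampling."""
    return [x0 * (1.0 - rng.random()) ** (-1.0 / (lam - 1.0)) for _ in range(n)]

rng = random.Random(0)  # fixed seed so the illustration is repeatable
data = sample_power_law(10_000, lam=2.5, x0=1.0, rng=rng)  # synthetic, assumed exponent
lam_hat, err = mle_exponent(data, x0=1.0)
print(f"estimated exponent: {lam_hat:.2f} +/- {err:.2f}")
print(f"x range spans {decades_spanned(data):.1f} orders of magnitude")
```

Because the data here are generated from a true power law, the estimate should land close to the assumed exponent. With real data, the choice of `x0` and a goodness-of-fit test against alternative heavy-tailed distributions matter at least as much as the estimate itself.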

Even if a reported power law surmounts the statistical hurdle, it often lacks a generative mechanism. Indeed, the same power law (that is, with the same value of λ) can arise from many different mechanisms (3). In the absence of a mechanism, purely empirical fitting does have the potential to be interesting, but one should simply report such results in a neutral fashion rather than provide unsubstantiated suggestions of universality. The fact that heavy-tailed distributions occur in complex systems is certainly important (because it implies that extreme events occur more frequently than would otherwise be the case), and statistically sound empirical fits of event data, when used with caution, can help in data interpretation (as it is certainly useful to estimate how often extreme events occur in a given system). However, a statistically sound power law is no evidence of universality without a concrete underlying theory to support it. Moreover, knowledge of whether or not a distribution is heavy-tailed is far more important than whether it can be fit using a power law.

Suppose that one generates a large number of independent random variables x_i drawn from heavy-tailed distributions, which need not be power laws. Then, by a version of the central limit theorem (CLT), the sum of these random variables is generically power-law distributed (20). Few people today would express amazement at finding that the CLT holds in a given context (when one adds up random variables drawn from distributions with finite moments), and the CLT is a vital tool in statistics, providing the basis for many rigorous scientific analyses. It also holds ubiquitously, including in situations in which random variables are drawn from heavy-tailed distributions; in such cases, however, power laws replace the Gaussian distribution as the limiting situation. One thus expects power laws to emerge naturally for rather unspecific reasons, simply as a by-product of mixing multiple (potentially rather disparate) heavy-tailed distributions. For example, it is possible to decompose a supposedly “power-law” degree distribution of a metabolic network into separate distributions of metabolites of different types (16). The degree distribution for each of these metabolite classes is different, reflecting the different roles that they play in the organism.
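A small simulation can illustrate the contrast between the two regimes, under assumptions that are ours rather than the text's: Pareto-distributed terms with tail exponent 1.5 (infinite variance) versus uniform terms, 50 terms per sum, and an ad hoc outlier criterion of 10 interquartile ranges above the median. The sketch shows only that sums of heavy-tailed terms remain heavy-tailed; it does not demonstrate convergence to any particular limiting distribution.

```python
import random

def quantile(sorted_vals, q):
    """Simple empirical quantile by index (no interpolation)."""
    return sorted_vals[int(q * (len(sorted_vals) - 1))]

def outlier_fraction(sums, k=10.0):
    """Fraction of sums lying more than k interquartile ranges above the median."""
    s = sorted(sums)
    median = quantile(s, 0.5)
    iqr = quantile(s, 0.75) - quantile(s, 0.25)
    return sum(1 for v in s if v > median + k * iqr) / len(s)

def simulate_sums(draw, n_terms, n_sums):
    """Return n_sums independent sums, each of n_terms draws from draw()."""
    return [sum(draw() for _ in range(n_terms)) for _ in range(n_sums)]

rng = random.Random(1)  # fixed seed so the illustration is repeatable
# Heavy-tailed terms: Pareto with tail exponent 1.5 (assumed for illustration).
heavy = simulate_sums(lambda: rng.paretovariate(1.5), n_terms=50, n_sums=2000)
# Finite-variance terms: uniform on [0, 1).
light = simulate_sums(lambda: rng.random(), n_terms=50, n_sums=2000)

print("heavy-tailed terms:   ", outlier_fraction(heavy))
print("finite-variance terms:", outlier_fraction(light))
```

For the uniform terms, the sums cluster tightly in a Gaussian-like bell and no sum comes close to the outlier threshold. For the Pareto terms, a visible fraction of sums lies far out in the tail, typically because a single very large term dominates the sum.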

How good is your power law?

The chart reflects the level of statistical support—as measured in (16, 21)—and our opinion about the mechanistic sophistication underlying hypothetical generative models for various reported power laws. Some relationships are identified by name; the others reflect the general characteristics of a wide range of reported power laws. Allometric scaling stands out from the other power laws reported for complex systems.

Finally, and perhaps most importantly, even if the statistics of a purported power law have been done correctly, there is a theory that underlies its generative process, and there is ample and uncontroversial empirical support for it, a critical question remains: What genuinely new insights have been gained by having found a robust, mechanistically supported, and in-all-other-ways superb power law? We believe that such insights are very rare.

Power laws do have an interesting and possibly even important role to play, but one needs to be very cautious when interpreting them. The most productive use of power laws in the real world will therefore, we believe, come from recognizing their ubiquity (and perhaps exploiting them to simplify or even motivate subsequent analysis) rather than from imbuing them with a vague and mistakenly mystical sense of universality.

References and Notes

Acknowledgments: We thank J. Carlson, A. Clauset, and A. Lewis for useful discussions and Ch. Barnes, A. MacLean, and C. Wiuf for helpful comments on the manuscript.
