Self-Similarity in the Distribution and Abundance of Species


Science  09 Apr 1999:
Vol. 284, Issue 5412, pp. 334-336
DOI: 10.1126/science.284.5412.334


If the fraction of species in area A that are also found in one-half of that area is independent of A, the distribution of species is self-similar and a number of observed patterns in ecology, including the widely cited species-area relationship connecting species richness to censused area, follow. Self-similarity also leads to a species-abundance distribution, which deviates considerably from the commonly assumed lognormal distribution and predicts considerably more rare species than the latter. Because the abundance distribution is derived under the condition of self-similarity, it may be widely applicable beyond ecology.

Patterns in the distribution and abundance of species within a biome are central concerns in ecology, providing important information about total species richness, the likelihood of species extinction under habitat loss, the design of reserves, and the processes that allow species to coexist and partition resources (1). A number of mathematical functions have been suggested as useful for characterizing observed patterns. Perhaps the most widely cited, though by no means the only plausible ones, are the power law form of the species-area relationship (SAR) (2–4) and the lognormal species-abundance distribution (4–6). The former states that the number of species found in a census patch of area $A$ is a constant power of $A$: $S = cA^z$; the latter states that the fraction of species with $n$ individuals is a Gaussian function of $\log(n)$.

Available data sets suggest that the lognormal abundance distribution may underestimate the number of rare species in an ecosystem or biome (1, 4, 7–10). In general, however, the power of existing data sets to distinguish among candidate functions describing patterns, and therefore among the underlying theories that generate these functions, is quite limited by incomplete censusing and other sources of bias (1, 9, 11). An overarching theoretical framework that unifies our understanding of patterns of species abundance and distribution in ecology is therefore desirable, for three reasons: these empirical limitations; the fact that an effort (4, 5) to demonstrate a theoretical connection between the lognormal abundance distribution and the species-area relationship has been questioned on theoretical grounds (12); and the prospect that establishing mathematical linkages and incompatibilities among patterns may help us understand the mechanisms generating observed patterns.

Consider area $A_0$ where there are $S_0$ species. The number of individuals in each species is described by probability distribution $P_0(n)$, where $P_0(n)S_0$ is the expected number of species with $n$ individuals. For convenience we take $A_0$ to be a rectangle with a length-to-width ratio of $\sqrt{2}$ so that repetitive bisections perpendicular to the long dimension yield at each stage rectangles of shape similar to the original. We denote by $A_i$ the area of each of the rectangles that are formed at the $i$th bisection, so that $A_i = A_0/2^i$, and we denote by $S_i$ the number of species found on average in an $A_i$ rectangle (Fig. 1).

Figure 1

Origin of the recursion relation for $P_i(n)$. Consider the case $i = 4$ and $n = 3$. Diamond symbols correspond to individuals of a particular species found in a patch. On the left side of the “picture equation” there are three individuals of a particular species in an $A_4$ rectangle. $P_4(3)$ is the probability that the particular species has exactly three individuals here. The right side of the equation sums the probabilities for all possible ways species can have three individuals in patch $A_4$. Algebraically, using Eqs. 1 to 3, this equation can be written $P_4(3) = 2(1 - a)P_5(3) + (2a - 1)[P_5(2)P_5(1) + P_5(1)P_5(2)]$. Denoting $2(1 - a)$ by $x$, and noting that then $2a - 1 = 1 - x$, this expression becomes $P_4(3) = xP_5(3) + (1 - x)[2P_5(2)P_5(1)]$. This particular case of Eq. 8 readily generalizes to all $i$ and $n$. Note that we are assuming, parsimoniously, that the $P_i(n)$ distributions are constructed from independent draws from the ensemble distribution for the $P_{i+1}(n)$. One could, perhaps, construct landscapes based on suitably constrained dependent draws, while still imposing the species-area relationship through the $(1 - a)$ and $2(1 - a)$ probability constraints for pairing empty and occupied halves. Whether abundance distributions that emerge from dependent draws are biologically reasonable and are shaped independent of scale, as are ours (see Figs. 2 and 3), is unclear.

We define self-similarity in conformity with the fractal literature (13): a pattern is self-similar if it does not vary with spatial scale. We impose self-similarity in the distribution of species by assuming that if a species is known to be in an $A_i$ rectangle, and nothing else about that species (such as its abundance) is known, then the probability that under bisection it will be found in at least a specific one of the two resulting $A_{i+1}$ rectangles is a constant, $a$, that is independent of $i$. This implies that the fraction of those species found in $A_i$ that are also found in a specific one of the two $A_{i+1}$ is the same constant $a$. The resulting spatial distribution of species is self-similar in the sense that the likelihood of occurrence in a half-patch under bisection is independent of spatial scale (14).

If a species is known to exist in patch $A_i$, there are three mutually exclusive possibilities for its presence or absence in the two $A_{i+1}$ patches that compose $A_i$: it is found only in the left half, it is found only in the right half, or it is found in both halves. From the above definition of $a$, the probability associated with each of these options is easily worked out:

$$P(\text{left half only}) = 1 - a \tag{1}$$

$$P(\text{right half only}) = 1 - a \tag{2}$$

$$P(\text{both halves}) = 2a - 1 \tag{3}$$

Note that the probabilities of these three options sum to 1, as they must. Because the probability that a species in $A_i$ is at least in a specific bisection of $A_i$ must be at least 0.5, it follows that $0.5 \le a \le 1$. The extreme values of $a$ correspond to the case in which one species is found everywhere ($a = 1$) and every individual belongs to a unique species ($a = 0.5$).
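The step described as easily worked out can be spelled out in a few lines. The shorthand $p_L$, $p_R$, and $p_B$ (probability of left half only, right half only, and both halves) is introduced here purely for illustration and is not the authors' notation. The only inputs are the definition of $a$, applied to each half in turn, and the requirement that the three outcomes exhaust the possibilities.

```latex
\begin{align*}
p_L + p_B &= a   && \text{(found in the left half, alone or with the right)} \\
p_R + p_B &= a   && \text{(found in the right half, alone or with the left)} \\
p_L + p_R + p_B &= 1 && \text{(the species is somewhere in } A_i\text{)} \\[4pt]
\text{first} + \text{second} - \text{third:}\quad p_B &= 2a - 1 \\
p_L = p_R &= a - p_B = 1 - a
\end{align*}
```

The bound $a \ge 0.5$ also falls out directly, since $p_B = 2a - 1$ cannot be negative.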

By application under repeated bisections of our probability rule, it follows that the average number of species found in any particular $A_i$ rectangle is

$$S_i = a^i S_0 \tag{4}$$

From Eq. 4 it follows that $S_i/S_j = a^{i-j}$. Now define $z$ by letting

$$a = 2^{-z} \tag{5}$$

Then $S_i/S_j = 2^{-iz}/2^{-jz}$. However, $A_i/A_j = 2^{-i}/2^{-j}$, so we can write $S_i/S_j = (A_i/A_j)^z$. This is equivalent to $S_i = cA_i^z$, which is just the power law form of the SAR. Thus, we have shown that our self-similarity condition leads to the power law form of the SAR. Elsewhere (15) we have shown that the power law form of the SAR implies Eq. 4 and thus self-similarity. Note from Eq. 5 that $0.5 \le a \le 1$ implies $1 \ge z \ge 0$.
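The equivalence between the bisection rule and the power law can be checked numerically. The following is a minimal sketch, not code from the paper: the function names `sar_exponent` and `mean_species`, and the values of `a`, `s0`, and `area0` in the demonstration, are assumptions chosen for illustration.

```python
import math


def sar_exponent(a: float) -> float:
    """SAR exponent z defined by a = 2**(-z) (Eq. 5)."""
    if not 0.5 <= a <= 1.0:
        raise ValueError("a must lie in [0.5, 1]")
    return math.log2(1.0 / a)


def mean_species(s0: float, a: float, i: int) -> float:
    """Mean species count S_i = a**i * S_0 after i bisections (Eq. 4)."""
    return s0 * a ** i


if __name__ == "__main__":
    a, s0, area0 = 0.8, 100.0, 1.0  # illustrative values only
    z = sar_exponent(a)
    c = s0 / area0 ** z             # constant in S = c * A**z
    for i in range(6):
        area_i = area0 / 2 ** i     # A_i = A_0 / 2**i
        # The two columns agree: bisection rule vs. power law.
        print(i, mean_species(s0, a, i), c * area_i ** z)
```

The two printed columns coincide up to floating-point rounding for any admissible `a`, which is the content of the argument from Eq. 4 to the SAR.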

Consider, next, the consequence of Eqs. 1 and 2 above, which can be reexpressed as

$$P(\text{found only in a specified } A_{i+1} \mid \text{found in } A_i) = 1 - a \tag{6}$$

Using Eq. 6 combined with the same reasoning that led to Eq. 4, the average number, $E_i$, of species found only in a specified $A_i$ rectangle is given by

$$E_i = (1 - a)^i S_0 \tag{7}$$

Defining $z' = -\log_2(1 - a)$, Eq. 7 is equivalent to $E(A_i)/E(A_j) = (A_i/A_j)^{z'}$ or $E(A) = c'A^{z'}$. This is just the “endemics-area relationship” previously derived by us from the SAR (15). We note that $0.5 \le a \le 1$ implies $z' \ge 1$ and that, for the commonly observed value $z = 0.25$, we have $z' = 2.65$.
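The quoted value of the endemics exponent can be reproduced from the two definitions of $z$ and $z'$ in terms of $a$. This is a small illustrative sketch; the function names `endemics_exponent` and `mean_endemics` are assumptions, not names from the paper.

```python
import math


def endemics_exponent(z: float) -> float:
    """Endemics-area exponent z' = -log2(1 - a), with a = 2**(-z)."""
    if not 0.0 < z <= 1.0:
        # z = 0 gives a = 1, for which z' diverges.
        raise ValueError("z must lie in (0, 1]")
    a = 2.0 ** (-z)
    return -math.log2(1.0 - a)


def mean_endemics(s0: float, a: float, i: int) -> float:
    """Mean number of species confined to one A_i rectangle (Eq. 7)."""
    return s0 * (1.0 - a) ** i


if __name__ == "__main__":
    for z in (0.2, 0.25, 0.3, 0.4, 1.0):
        print(z, round(endemics_exponent(z), 3))
```

For $z = 0.25$ the function returns a value that rounds to 2.65, matching the text, and at the upper limit $z = 1$ it returns 1, the smallest value $z'$ can take.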

To derive the distribution $P_0(n)$ of abundances of individuals within species, we introduce the notion of a smallest patch size or unit rectangle within $A_0$. This area, $A_m$, contains on average one individual, so that $A_m = A_0/2^m$, where the mean total number of individuals in $A_0$ is $N_0 = 2^m$. Because the unit rectangle contains one species as well as one individual, $a^m S_0 = S_m = 1$, or $S_0 = a^{-m}$. Moreover, using Eq. 5, $S_0 = N_0^z$.

We generalize our definition of $P$ so that $P_i(n)$ is the probability that if a species is found in a patch of area $A_i$, then it contains $n$ individuals. Our interest ultimately is in $P_0(n)$, the fraction of species in the entire surface that have $n$ individuals, but to obtain that distribution we derive it recursively from the $P_i(n)$ for $0 < i \le m$. Note that $P_m(1) = 1$ (there will be on average one individual of whatever species is present in a unit rectangle) and, for each $i$, $P_i(n) = 0$ for $n > 2^{m-i}$ (on average, one cannot fit more individuals into an area than there are unit patches in that area) and $\sum_n P_i(n) = 1$ (the sum of probabilities of all possible occurrences is 1).

Using Eqs. 1 to 3, and letting $2(1 - a) = x$, where $0 \le x \le 1$, it can readily be shown (Fig. 1) that the $P_i(n)$ satisfy the following double recursion relation (16):

$$P_i(n) = x\,P_{i+1}(n) + (1 - x)\sum_{k=1}^{n-1} P_{i+1}(n - k)\,P_{i+1}(k) \tag{8}$$

Analytical solutions to Eq. 8 can be derived for the first few values of $n$ (17), but we have not been able to derive the general analytical solution for all $i$, $n$, and $x$. Nevertheless, numerical solutions for $P_0(n)$ are revealing. With $P$ plotted against $\log(n)$, these species-abundance distributions are seen to deviate considerably from lognormal, being skewed more toward rarity (more species with low abundance) (Fig. 2).
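A numerical solution of the recursion can be sketched as follows, assuming the form generalized from the worked case in the Fig. 1 caption: a term $xP_{i+1}(n)$ for a species confined to one half, plus $(1 - x)$ times the sum over all ways of splitting $n$ individuals between two occupied halves. This is an illustrative pure-Python sketch, not the authors' code; the function name `abundance_distribution` and the list layout (index equals abundance) are assumptions.

```python
def abundance_distribution(m: int, x: float) -> list[float]:
    """Return P_0 as a list p with p[n] = P_0(n) for n = 0 .. 2**m.

    Starts from the unit rectangle, where P_m(1) = 1, and applies the
    recursion m times, doubling the patch area at each step.
    p[0] is always 0 because a species that is present has n >= 1.
    """
    if m < 0 or not 0.0 <= x <= 1.0:
        raise ValueError("need m >= 0 and 0 <= x <= 1")
    p = [0.0, 1.0]                       # P_m: exactly one individual
    for _ in range(m):
        size = len(p) - 1                # largest abundance in a half-patch
        new = [0.0] * (2 * size + 1)     # largest abundance doubles
        # Species confined to one half (probability x in total).
        for n in range(1, size + 1):
            new[n] += x * p[n]
        # Species in both halves (probability 1 - x): abundances in the
        # two halves are independent draws, so their sum is a convolution.
        for j in range(1, size + 1):
            for k in range(1, size + 1):
                new[j + k] += (1.0 - x) * p[j] * p[k]
        p = new
    return p


if __name__ == "__main__":
    p0 = abundance_distribution(8, 0.484)
    print("sum of probabilities:", sum(p0))
    print("P_0(1):", p0[1])
```

The direct double loop costs on the order of $4^m$ multiplications, so it is practical only for small $m$; reaching $m = 24$ as in Fig. 2 would need a faster convolution, which this sketch does not attempt. Two quick sanity checks are that the probabilities sum to 1 and that $P_0(1) = x^m$, since a species with a single individual must be confined to one half at every bisection.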

Figure 2

Species-abundance distributions [$P_0(n)$ versus $\ln(n)$] from Eq. 8 for $m = 24$ and $x = 0.484$, 0.376, and 0.260 (corresponding to the species-area exponents $z = 0.4$, 0.3, and 0.2, and total species richnesses $S_0 = 772$, 148, and 28, respectively). The total number of individuals in each case is $2^{24} \approx 1.7 \times 10^7$. Dashed portions of distributions correspond to the rarest and most abundant single species.

Plotted on a linear abundance scale, the distributions are more skewed toward commonness than the Gaussian but less so than the lognormal. Because the lognormal distribution results from a product of random variables and the normal from a sum, it is not surprising that the distribution resulting from the sum of products in Eq. 8 exhibits intermediate features. Plotted on log-log scales, the $P_0(n)$ are seen to be of the form $P_0(n) \sim n^{c(x)}$ (18) for $n$ values sufficiently below the modal abundance, with $c \sim 3/2$, 1, and 3/4 for $x = 0.26$, 0.376, and 0.484; the exponents $c(x)$ are independent of $m$, as expected from self-similarity (Fig. 3). The parameter $x$ in the species-abundance distribution can be related to the SAR parameter $z$; using the relations $x = 2(1 - a)$ and $a = 2^{-z}$, we get $z = -\log_2[1 - (x/2)]$. Corresponding to the values $x = 0.260$, 0.376, and 0.484 in Fig. 2 are the values for the SAR power $z = 0.2$, 0.3, and 0.4.
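The correspondence between $x$, $z$, and total richness quoted for Fig. 2 can be checked with a few lines of arithmetic. This sketch relies only on relations stated in the text ($x = 2(1 - a)$, $a = 2^{-z}$, and $S_0 = a^{-m}$); the function names are assumptions made for illustration.

```python
import math


def z_from_x(x: float) -> float:
    """SAR exponent from the recursion parameter: z = -log2(1 - x/2)."""
    return -math.log2(1.0 - x / 2.0)


def x_from_z(z: float) -> float:
    """Inverse relation: x = 2 * (1 - 2**(-z))."""
    return 2.0 * (1.0 - 2.0 ** (-z))


def total_species(m: int, x: float) -> float:
    """Total richness S_0 = a**(-m), with a = 1 - x/2."""
    return (1.0 - x / 2.0) ** (-m)


if __name__ == "__main__":
    for x in (0.484, 0.376, 0.260):
        print(x, round(z_from_x(x), 3), round(total_species(24, x), 1))
```

With $m = 24$ the three $x$ values give exponents close to 0.4, 0.3, and 0.2 and richness values within about one species of the 772, 148, and 28 quoted in the Fig. 2 caption; small differences reflect rounding of $x$ to three decimals.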

Figure 3

Species-abundance distributions [$\ln P_0(n)$ versus $\ln(n)$] from Eq. 8 for $x = 0.484$ and $m = 14$ through 24 (corresponding to a total number of individuals ranging from $\sim 1.6 \times 10^4$ to $1.7 \times 10^7$).

There is considerable observational support for our self-similarity condition and the abundance distribution it predicts. First, numerous census data sets are compatible with the power-law form of the SAR, as reviewed by Rosenzweig (3). Second, the few tests carried out on the endemics-area relationship (Eq. 7) show good agreement (15, 19), although considerably more testing is needed. Third, there is considerable evidence (7, 19, 20) that the fraction of species in common to two spatially separated censused patches is a decreasing function of interpatch distance ($\propto d^{-2z}$), in conformity with self-similarity (15). Fourth, measurements of the dependence of species richness on the shape as well as area of censused patches agree with predictions (19, 21). Fifth, Kunin (22) presented empirical evidence that the amount of habitat occupied by a given plant species exhibits an approximate scale independence when viewed at different scales of resolution through “censusing windows” of various sizes. Our theory not only predicts this result but also provides an explicit relation between the box-counting fractal dimension implied by Kunin's finding and the abundance of the given species (23). Sixth, available abundance data, while often qualitative at best because of sampling problems (9, 11), generally resemble our predicted distribution more than they do the lognormal, exhibiting considerably more rarity than is predicted by the latter distribution (1, 4, 8–10). Important exceptions to this occur, however, indicating that self-similarity and the SAR do not always describe species abundance and distributions (24).

Two caveats are in order. First, it is extremely unlikely that a strictly constant value of $z$ in the SAR holds across an entire accessible scale range (3). If, however, $z$ is a nonconstant function of scale area, so that $z = z(i)$, then that dependence can be inserted into Eq. 8 and an abundance distribution can still be derived. The nature of the breakdown of strict self-similarity in small patches (for example, strong attraction ($x \approx 0$) or repulsion ($x \approx 1$) between nearby individuals of the same species) will then influence the shape of the abundance distribution in larger patches in a testable manner. Nevertheless, an abundance distribution skewed toward rarity relative to the lognormal still results as long as the curvature in $z(i)$ is not extreme. Second, ecosystems are heterogeneous with respect to habitat quality, and thus quantities like $S(A_i)$ and $P_i(n)$ depend on which patch of area $A_i$ is censused. Moreover, the minimum area per individual ($A_m$) will differ among species and among individuals in a species and thus can be defined only statistically (especially for motile organisms). Thus all statements we have made about the number of species, or the number of individuals within a particular species, in a patch of area $A$ refer to the average over all the nonoverlapping patches of area $A$ that compose the system.

We have demonstrated that self-similarity theory provides an overarching framework within which empirically supported patterns in ecology are unified, new and plausible results are derived, and the connection between the SAR and the lognormal abundance distribution is questioned. Because our recursion relation for the species-abundance distribution is derived under the assumption of self-similarity, it may be more widely applicable to other spatial arrays of types of objects or to the distribution of energy fluctuations in turbulent media (25).

