Technical Comments

Response to Comment by Volkov et al. on "Computational Improvements Reveal Great Bacterial Diversity and High Metal Toxicity in Soil"


Science  18 Aug 2006:
Vol. 313, Issue 5789, p. 918
DOI: 10.1126/science.1121569


Volkov et al. claim that significant conclusions about the total number of species (S) cannot be made because different abundance models cannot be distinguished and the sensitivity of the chi-square measure to changes in estimates of S is low. We point out that currently available data do not support these claims.

The first point of Volkov et al. (1) depends on γ, the Cot curve “retardation factor,” being a free parameter. The authors suggest that the value of γ depends on the species structure of samples. This is a new concept in the 30-year-old field of DNA reassociation kinetics and has no published experimental or theoretical precedent. Historically, γ is a heuristic constant determined from calibration standards (e.g., Escherichia coli DNA). The factors responsible for deviation of reassociation kinetics (measured by optical absorbance or S1-nuclease digestion) from true second-order kinetics have been examined by experimentation (2) and modeling (3) and appear to arise primarily from use of randomly sheared DNA, not DNA composition or structure. Consequently, the value of γ is expected to be the same for calibration standard(s) and samples. The value of γ was measured by independent labs as 0.44 and 0.45 (2). Fixing γ at these values is the recommended approach for DNA reassociation measured by optical absorbance (4). In the absence of contradictory experimental data, caution dictates that well-accepted precepts about DNA reassociation kinetics be given primacy. Fixing γ strongly constrains the data, such that different abundance models can be effectively discerned using the framework we described in (5). Volkov et al. abandoned this constraint, using instead a different value of γ for each soil curve (γ = 0.019, 0.12, 0.16). Their analyses deviate widely from historical precedent (i.e., γ = 0.45). Their justification for this approach, although interesting, cannot be accepted in the absence of supporting experimental evidence, which is currently lacking. This is a reasonable area for future study.
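The role of γ can be made concrete with the modified second-order form commonly used for reassociation curves, C/C0 = (1 + k·Cot)^(−γ). What follows is only a minimal sketch: this response does not write the equation out, so the functional form, the function name `fraction_unreassociated`, the rate constant `k`, and the sample Cot values are all assumptions made for illustration.

```python
def fraction_unreassociated(cot, k, gamma=0.45):
    """Fraction of DNA still single-stranded at a given Cot.

    Assumed form (not stated in the text): C/C0 = (1 + k * Cot) ** (-gamma).
    gamma = 1 recovers ideal second-order kinetics; gamma = 0.45 is the
    fixed calibration value quoted in the text.
    """
    if cot < 0 or k <= 0:
        raise ValueError("cot must be >= 0 and k must be > 0")
    return (1.0 + k * cot) ** (-gamma)


if __name__ == "__main__":
    k = 1.0  # hypothetical rate constant, chosen only for illustration
    cots = (0.1, 1.0, 10.0, 100.0)  # hypothetical Cot values
    # Compare ideal kinetics, the calibration value, and the three
    # per-soil values of gamma attributed to Volkov et al. in the text.
    for gamma in (1.0, 0.45, 0.16, 0.12, 0.019):
        row = [fraction_unreassociated(c, k, gamma) for c in cots]
        print(f"gamma={gamma:<5} " + "  ".join(f"{v:.3f}" for v in row))
```

Under this assumed form, smaller values of γ flatten the curve over the same Cot range, which illustrates why leaving γ free can trade off against the abundance model, whereas fixing it constrains the fit.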

We do not agree with the second assertion of Volkov et al. that the data do not effectively constrain total diversity. To the contrary, we find that the probability of observing values of S that differ significantly from our reported values is quite small. The authors' claim appears to be based on the observation that values of χ² are all much less than 1 when S differs from our best-fit estimate by a factor of 4 to 8. However, the lack of error bars on the experimental data points prevents direct interpretation of these χ² values. Instead, one can use Monte Carlo (MC) error analysis (6) to compute the expected distributions of S and assess the likelihood of the alternative curve fits presented in figure 2 in (1). The distributions P(ln S) shown in Fig. 1 assume that the experimental errors are Gaussian with standard deviation σ equal to the standard deviation of the observed curve-fit residuals. From these distributions, the probability of observing values of S that are greater or less than a reference value is computed using a Gaussian distribution with the same mean and standard deviation as P(ln S). For the noncontaminated soil, the probability of observing a 4-fold reduction in S from the best-fit value S_non is 9.9 × 10⁻³ and of observing an 8-fold reduction is 2.7 × 10⁻⁴. For the low-metal-contamination soil, the probability of observing a 4-fold increase from the best-fit value S_low is 6.5 × 10⁻¹²⁷ and of observing an 8-fold increase is 1.1 × 10⁻²⁸³. For the high-metal-contamination soil, the probability of observing a 4-fold increase from the best-fit value S_high is 2.1 × 10⁻¹²⁵ and of observing an 8-fold increase is 3.0 × 10⁻²⁷⁹. We conclude that the quality of fits is sensitive enough to variations of S to support the assertions of our original report (5).
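The tail-probability calculation described above can be sketched in a few lines. The function name `fold_change_tail` is hypothetical, and the sketch assumes that the Gaussian approximating P(ln S) is centered exactly on the best-fit ln S. The text does not state this (the MC mean may differ from the best fit), so the output should not be expected to reproduce the reported probabilities exactly, only their order of magnitude.

```python
import math


def fold_change_tail(fold, sd_ln_s):
    """One-sided Gaussian probability of a fold-change in S of at least `fold`.

    Assumes ln S ~ Normal(ln S_best, sd_ln_s**2), i.e. the distribution is
    centered on the best-fit value (an assumption, see the lead-in). By
    symmetry the same number applies to a fold-increase or a fold-reduction.
    """
    if fold < 1.0 or sd_ln_s <= 0.0:
        raise ValueError("fold must be >= 1 and sd_ln_s must be > 0")
    z = math.log(fold) / sd_ln_s
    # Upper tail of the standard normal, written with erfc so that very
    # small probabilities are not lost to rounding.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    # SDs of ln S as quoted in the Fig. 1 caption.
    sds = {"noncontaminated": 6.6e-1, "low-metal": 5.9e-2, "high-metal": 5.9e-2}
    for soil, sd in sds.items():
        for fold in (4, 8):
            print(f"{soil:16s} {fold}-fold: {fold_change_tail(fold, sd):.1e}")
```

Working in terms of ln S keeps a k-fold change at a fixed distance ln k from the center regardless of the magnitude of S, which is why the SD of ln S alone determines these probabilities under the stated assumption.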

Fig. 1.

Computed χ² as a function of S superimposed on the normalized distributions of ln S for a Zipf species-abundance distribution fit to (A) noncontaminated, (B) low-metal, and (C) high-metal soil Cot curves. For each soil, P(ln S) is approximated as the histogram of ln S computed by 10³ MC trials. The residual errors for the noncontaminated (σ_non), low-metal (σ_low), and high-metal (σ_high) soils are 4.42 × 10⁻³, 2.93 × 10⁻³, and 3.40 × 10⁻³, respectively. The SD of ln S, computed by MC simulation, is 6.6 × 10⁻¹ for the noncontaminated soil, 5.9 × 10⁻² for the low-metal soil, and 5.9 × 10⁻² for the high-metal soil.
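The MC procedure behind P(ln S) can be illustrated with a toy example. This is a sketch under strong assumptions, not the analysis used for Fig. 1: it replaces the Zipf abundance model with S equally abundant species, assumes the curve form (1 + k·Cot/S)^(−γ) with γ fixed at 0.45, fits ln S with a simple golden-section search, and runs on synthetic data. The names `model`, `fit_ln_s`, and `monte_carlo_ln_s` are hypothetical.

```python
import math
import random
import statistics

GAMMA = 0.45  # retardation factor held fixed, as the text recommends


def model(cot, ln_s, k=1.0):
    """Toy Cot curve for S equally abundant species (assumed form)."""
    return (1.0 + k * cot / math.exp(ln_s)) ** (-GAMMA)


def fit_ln_s(cots, ys, lo=0.0, hi=20.0, tol=1e-6):
    """Least-squares estimate of ln S by golden-section search on [lo, hi]."""
    def sse(ln_s):
        return sum((model(c, ln_s) - y) ** 2 for c, y in zip(cots, ys))

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(c) < sse(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 0.5 * (a + b)


def monte_carlo_ln_s(cots, ys, n_trials=200, seed=0):
    """Refit noisy copies of the best-fit curve and collect ln S estimates.

    Noise is Gaussian with sigma equal to the SD of the observed fit
    residuals, following the description in the text.
    """
    rng = random.Random(seed)
    best = fit_ln_s(cots, ys)
    fitted = [model(c, best) for c in cots]
    sigma = statistics.pstdev([y - f for y, f in zip(ys, fitted)])
    samples = []
    for _ in range(n_trials):
        noisy = [f + rng.gauss(0.0, sigma) for f in fitted]
        samples.append(fit_ln_s(cots, noisy))
    return best, sigma, samples


if __name__ == "__main__":
    # Synthetic "observed" curve: hypothetical true ln S = 8 and noise level.
    data_rng = random.Random(1)
    cots = [10 ** (1 + 0.2 * i) for i in range(26)]
    ys = [model(c, 8.0) + data_rng.gauss(0.0, 4e-3) for c in cots]
    best, sigma, samples = monte_carlo_ln_s(cots, ys)
    print("best-fit ln S:", round(best, 3))
    print("residual SD:", f"{sigma:.2e}")
    print("MC mean, SD of ln S:",
          round(statistics.mean(samples), 3),
          f"{statistics.pstdev(samples):.2e}")
```

A histogram of `samples` plays the role of P(ln S) in this toy setting; its mean and SD are the two quantities a Gaussian tail calculation would need.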

The soil DNA reassociation data we analyzed in (5) were obtained by extracting the x,y points from figure 1 in Sandaa et al. (7), because the original data were not freely available. The original data are now available from R. A. Sandaa and were used by Volkov et al. and in our analysis above. The diversity estimates we obtain with the original data (Fig. 1) are generally consistent with our conclusions in (5).

