## Abstract

Volkov *et al*. claim that significant conclusions about the total number of species (*S*) cannot be made because different abundance models cannot be distinguished and the sensitivity of the chi-square measure to changes in estimates of *S* is low. We point out that currently available data do not support these claims.

The first point of Volkov *et al*. (*1*) depends on γ, the Cot curve “retardation factor,” being a free parameter. The authors suggest that the value of γ depends on the species structure of samples. This is a new concept in the 30-year-old field of DNA reassociation kinetics and has no published experimental or theoretical precedent. Historically, γ is a heuristic constant determined from calibration standards (e.g., *Escherichia coli* DNA). The factors responsible for deviation of reassociation kinetics (measured by optical absorbance or S1-nuclease digestion) from true second-order kinetics have been examined by experimentation (*2*) and modeling (*3*) and appear to arise primarily from use of randomly sheared DNA, not DNA composition or structure. Consequently, the value of γ is expected to be the same for calibration standard(s) and samples. The value of γ was measured by independent labs as 0.44 and 0.45 (*2*). Fixing γ at these values is the recommended approach for DNA reassociation measured by optical absorbance (*4*). In the absence of contradictory experimental data, caution dictates that well-accepted precepts about DNA reassociation kinetics be given primacy. Fixing γ strongly constrains the data, such that different abundance models can be effectively discerned using the framework we described in (*5*). Volkov *et al*. abandoned this constraint, using instead a different value of γ for each soil curve (γ = 0.019, 0.12, 0.16). Their analyses deviate widely from historical precedent (i.e., γ = 0.45). Their justification for this approach, although interesting, cannot be accepted in the absence of supporting experimental evidence, which is currently lacking. This is a reasonable area for future study.
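To illustrate why the choice of γ matters, the sketch below evaluates a retarded second-order reassociation curve for the values of γ discussed above. The functional form C/C₀ = (1 + k·C₀t)^(−γ) is an assumption here: it is a commonly used way to write retarded second-order kinetics, but it is not stated in this text. The function name, the rate constant `k`, and the Cot value are hypothetical choices for illustration only.

```python
def single_stranded_fraction(cot, k, gamma):
    """Fraction of DNA still single-stranded at a given Cot value.

    Assumed form (not taken from the text): C/C0 = (1 + k * Cot) ** (-gamma).
    gamma = 1 recovers ideal second-order kinetics; gamma < 1 retards the curve.
    """
    return (1.0 + k * cot) ** (-gamma)


if __name__ == "__main__":
    k = 1.0      # hypothetical rate constant, arbitrary units
    cot = 100.0  # hypothetical Cot value, arbitrary units
    # gamma values mentioned in the text: calibration-based and per-curve fits
    for gamma in (0.45, 0.16, 0.12, 0.019):
        frac = single_stranded_fraction(cot, k, gamma)
        print(f"gamma = {gamma}: fraction single-stranded = {frac:.3f}")
```

Under this assumed form, a smaller γ flattens the curve, so a free γ can trade off against the abundance model when fitting, whereas a fixed γ removes that freedom.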

We do not agree with the second assertion of Volkov *et al*. that the data do not effectively constrain total diversity. To the contrary, we find that the probability of observing values of *S* that differ significantly from our reported values is quite small. The authors' claim appears to be based on the observation that values of χ^{2} are all much less than 1 when *S* differs from our best-fit estimate by a factor of 4 to 8. However, the lack of error bars on the experimental data points prevents direct interpretation of these χ^{2} values. Instead, one can use Monte Carlo (MC) error analysis (*6*) to compute the expected distributions of *S* and assess the likelihood of the alternative curve fits presented in figure 2 in (*1*). The resulting distributions P(ln*S*), shown in Fig. 1, assume that the experimental errors are Gaussian with standard deviation σ equal to the standard deviation of the observed curve fit residuals. We compute the probability of observing a value of *S* greater or less than a reference value from a Gaussian distribution with the same mean and standard deviation as P(ln*S*). For the noncontaminated soil, the probability of observing a 4-fold reduction in *S* from the best-fit value *S*_{non} is 9.9 × 10^{–3} and of observing an 8-fold reduction is 2.7 × 10^{–4}. For the low-metal-contamination soil, the probability of observing a 4-fold increase from the best-fit value *S*_{low} is 6.5 × 10^{–127} and of observing an 8-fold increase is 1.1 × 10^{–283}. For the high-metal-contamination soil, the probability of observing a 4-fold increase from the best-fit value *S*_{high} is 2.1 × 10^{–125} and of observing an 8-fold increase is 3.0 × 10^{–279}. We conclude that the quality of fits is sensitive enough to variations of *S* to support the assertions of our original report (*5*).
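The Gaussian tail calculation described above can be sketched as follows. This is a minimal illustration, not the code used for the reported values: the function names are hypothetical, and the ln *S* samples in the usage example are placeholder numbers standing in for the output of an MC error analysis.

```python
import math
import statistics


def gaussian_tail(x, mean, sd, upper):
    """P(X >= x) if upper is True, else P(X <= x), for X ~ Normal(mean, sd)."""
    z = (x - mean) / sd
    sign = 1.0 if upper else -1.0
    return 0.5 * math.erfc(sign * z / math.sqrt(2.0))


def fold_change_probability(ln_s_samples, s_best, fold):
    """Probability of S at least `fold` times away from the best-fit value.

    P(ln S) is approximated by a Gaussian with the same mean and standard
    deviation as the samples. fold > 1 gives P(S >= fold * s_best);
    fold < 1 gives P(S <= fold * s_best).
    """
    mean = statistics.fmean(ln_s_samples)
    sd = statistics.stdev(ln_s_samples)
    ln_ref = math.log(s_best * fold)
    return gaussian_tail(ln_ref, mean, sd, upper=fold > 1)


if __name__ == "__main__":
    # Placeholder samples of ln S centered on a hypothetical best fit of 1000.
    center = math.log(1000.0)
    samples = [center - 0.6, center, center + 0.6]
    for fold in (0.25, 0.125):
        p = fold_change_probability(samples, 1000.0, fold)
        print(f"P(S <= {fold} x best fit) = {p:.2e}")
```

Because the tail probability falls off as the complementary error function of ln(fold)/σ, a narrow P(ln*S*) makes a 4- or 8-fold change extremely improbable, which is the pattern seen for the two contaminated soils.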

The soil DNA reassociation data we analyzed in (*5*) were obtained by extracting the *x,y* points from figure 1 in Sandaa *et al*. (*7*), because the original data were not freely available. The original data are now available from R. A. Sandaa and were used by Volkov *et al*. and in our analysis above. The diversity estimates we obtain with the original data (Fig. 1) are generally consistent with our conclusions in (*5*).