## Abstract

Shipley *et al*. (Reports, 3 November 2006, p. 812) predicted plant community composition and relative abundances with a high level of accuracy by maximizing Shannon's index of information entropy (species diversity), subject to constraints on plant trait averages. We show that the entropy maximization assumption is relatively unimportant and that the high accuracy is due largely to a statistical effect.

Shipley *et al*. (*1*) combined a trait-based framework and the assumption of maximum entropy to predict relative abundances of plant species at sites varying in successional age. Specifically, they chose relative abundances such that Shannon's index of information entropy (i.e., Shannon's diversity index) was maximized, subject to the constraint of reproducing the average values of traits at different sites. Both their use of traits and their maximization of entropy are interesting. The maximum entropy assumption is particularly novel and intriguing as this assumption essentially maximizes biodiversity, which could be interpreted in terms of minimizing harm due to specialized natural enemies (*2*) and/or maximizing niche differentiation with respect to unmeasured traits, thereby gaining hypothesized benefits of high diversity such as resilience against environmental fluctuations (*3*, *4*).
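To make the optimization concrete, the following is a minimal sketch of entropy maximization under trait-average constraints. It is not the code used in (*1*); the function name `maxent_abundances`, the damped Newton solver, and the synthetic trait matrix are all assumptions made for illustration. It relies on the standard result that the entropy-maximizing distribution under linear constraints has the form p_i ∝ exp(Σ_j λ_j t_ij), and finds the multipliers λ by minimizing the convex dual.

```python
import numpy as np

def maxent_abundances(traits, target_means, n_iter=200, tol=1e-9):
    """Maximize -sum(p * log p) subject to sum(p) = 1 and traits.T @ p = target_means.

    traits: (S, K) array, one row per species, one column per trait.
    target_means: (K,) array of abundance-weighted trait averages for one site.
    """
    traits = np.asarray(traits, dtype=float)
    target = np.asarray(target_means, dtype=float)
    centered = traits - target           # constraints become centered.T @ p = 0
    lam = np.zeros(traits.shape[1])      # one Lagrange multiplier per trait

    def probs(lam):
        z = centered @ lam
        w = np.exp(z - z.max())          # shift by the max for numerical stability
        return w / w.sum()

    def dual(lam):
        z = centered @ lam
        return z.max() + np.log(np.exp(z - z.max()).sum())

    for _ in range(n_iter):
        p = probs(lam)
        grad = centered.T @ p            # constraint violation at the current p
        if np.abs(grad).max() < tol:
            break
        hess = centered.T @ (centered * p[:, None]) - np.outer(grad, grad)
        step = np.linalg.solve(hess + 1e-12 * np.eye(lam.size), grad)
        t, f0 = 1.0, dual(lam)
        # Backtracking line search on the convex dual objective.
        while t > 1e-8 and dual(lam - t * step) > f0 - 1e-4 * t * (grad @ step):
            t *= 0.5
        lam = lam - t * step
    return probs(lam)

def shannon_entropy(p):
    nz = p[p > 0]
    return float(-(nz * np.log(nz)).sum())

# Synthetic example (hypothetical data): 10 species, 3 traits.
rng = np.random.default_rng(0)
traits = rng.normal(size=(10, 3))
observed = np.array([0.40, 0.20, 0.10, 0.10, 0.05, 0.05, 0.04, 0.03, 0.02, 0.01])
target = traits.T @ observed             # trait averages computed from the same site
predicted = maxent_abundances(traits, target)
```

By construction, `predicted` reproduces the site's trait averages and has entropy at least as high as `observed`, since the observed abundances are themselves one feasible solution.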

A key question raised by the results in Shipley *et al*. (*1*) is the relative importance of the entropy assumption versus the trait constraints in achieving good predictions of abundances. We tested the importance of the maximum entropy criterion by repeating the analysis under the opposite assumption; specifically, we chose relative abundances to minimize entropy (*5*), while maintaining the same trait constraints as in (*1*). Predictions of relative abundances of species in individual plots were almost as good when entropy was minimized (*r*^{2} = 0.86) as when it was maximized (*r*^{2} = 0.94), which suggests that this assumption is relatively unimportant (*6*). In contrast, when we tested the importance of the trait constraints by repeating the analysis using only four traits instead of eight traits as constraints, the accuracy of the predictions dropped substantially (Table 1). Thus, although the maximum entropy assumption consistently improves predictions, the constraints on trait averages are clearly much more important.
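The minimization can be illustrated in a similar spirit. The sketch below is not the procedure of note (*5*), which is not reproduced here; it is a brute-force illustration under our own assumptions. Because Shannon entropy is concave, its minimum over the set of abundances satisfying the linear trait constraints lies at a vertex of that set, so for a small species pool one can enumerate candidate vertices directly. The function name `min_entropy_abundances` and the synthetic data are hypothetical.

```python
import itertools
import numpy as np

def shannon_entropy(p):
    nz = p[p > 0]
    return float(-(nz * np.log(nz)).sum())

def min_entropy_abundances(traits, target_means, tol=1e-9):
    """Minimize Shannon entropy subject to sum(p) = 1, p >= 0, traits.T @ p = target_means.

    Enumerates basic feasible solutions, so it is only practical for small species pools.
    """
    traits = np.asarray(traits, dtype=float)
    n_species, n_traits = traits.shape
    A = np.vstack([np.ones(n_species), traits.T])       # (K + 1, S) constraint matrix
    b = np.concatenate([[1.0], np.asarray(target_means, dtype=float)])
    best_p, best_h = None, np.inf
    for support in itertools.combinations(range(n_species), n_traits + 1):
        cols = list(support)
        sub = A[:, cols]
        if abs(np.linalg.det(sub)) < 1e-12:             # skip singular bases
            continue
        x = np.linalg.solve(sub, b)
        if (x < -tol).any():                            # negative abundance: infeasible
            continue
        p = np.zeros(n_species)
        p[cols] = np.clip(x, 0.0, None)
        h = shannon_entropy(p)
        if h < best_h:
            best_p, best_h = p, h
    return best_p, best_h

# Synthetic example (hypothetical data): 6 species, 2 traits.
rng = np.random.default_rng(1)
traits = rng.normal(size=(6, 2))
observed = np.array([0.35, 0.25, 0.15, 0.12, 0.08, 0.05])
target = traits.T @ observed
p_min, h_min = min_entropy_abundances(traits, target)
```

The minimum-entropy solution has at most K + 1 species with nonzero abundance for K traits, which helps explain why trait constraints alone can still pin down the dominant species.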

| Entropy criterion | Constrained to reproduce plot averages of | *r*^{2} for untransformed relative abundances | *r*^{2} for log-transformed relative abundances |
|---|---|---|---|
| Maximization | 8 traits | 0.94* | 0.48 |
| Minimization | 8 traits | 0.86 | 0.41 |
| Maximization | 4 traits | 0.71 | 0.24 |
| Minimization | 4 traits | 0.59 | 0.20 |

The primacy of the trait-based assumption for obtaining good predictions of abundances leads to the further question of the degree to which this assumption is successful because of strong convergence in traits at a given successional stage or merely because the predictions are being made for the same sites from which the trait averages were calculated. To assess this question, we used cross-validation analysis. Specifically, we fit the trait patterns across successional ages based on data from all but one site and then predicted abundances at the remaining site using the same constrained entropy maximization approach as before. The key difference between the cross-validation and the original analysis is that under cross-validation, the site for which predictions were made was excluded when calculating the trait constraints. The cross-validation procedure showed poor predictive ability compared with the original analysis (*r*^{2} = 0.32 versus *r*^{2} = 0.94). This indicates that traits are not converging strongly with successional age in this system and that the success of the trait-based approach for this data set is in part due to an element of circularity.
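The leave-one-out step can be sketched as follows. This is an illustration only: the linear dependence of trait averages on successional age, the function name `loo_trait_predictions`, and the synthetic data are assumptions, not the fitted model used in our analysis. For each site, the trait-age relationship is fitted to all other sites and then used to predict that site's trait averages, which would in turn serve as the constraints for the entropy maximization.

```python
import numpy as np

def loo_trait_predictions(ages, site_trait_means):
    """Leave-one-out predictions of site-level trait averages from successional age.

    ages: (n_sites,) successional age of each site.
    site_trait_means: (n_sites, n_traits) observed trait averages per site.
    Returns an (n_sites, n_traits) array in which row i is predicted without site i.
    """
    ages = np.asarray(ages, dtype=float)
    means = np.asarray(site_trait_means, dtype=float)
    design = np.column_stack([np.ones_like(ages), ages])   # intercept + linear age term
    preds = np.empty_like(means)
    for i in range(len(ages)):
        keep = np.arange(len(ages)) != i                   # exclude the held-out site
        coef, *_ = np.linalg.lstsq(design[keep], means[keep], rcond=None)
        preds[i] = design[i] @ coef
    return preds

# Synthetic example (hypothetical data): 12 sites, 2 traits with noisy age trends.
rng = np.random.default_rng(2)
ages = np.linspace(2.0, 42.0, 12)
trend = np.column_stack([0.5 + 0.02 * ages, 3.0 - 0.05 * ages])
site_means = trend + rng.normal(scale=0.3, size=trend.shape)
held_out_predictions = loo_trait_predictions(ages, site_means)
```

When site-to-site scatter around the age trend is large, the held-out predictions differ markedly from the observed averages, and abundances predicted from them degrade accordingly.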

Insofar as the trait-based approach does succeed in reproducing relative abundances in this data set, its success is largely limited to the common species that dominate the trait averages. When *r*^{2} values are calculated for log-transformed relative abundances, thus giving greater weight to rare species, predictive value drops precipitously (Table 1).
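The effect of the log transformation on *r*^{2} can be seen in a small numerical sketch. Here *r*^{2} is taken to be the squared Pearson correlation between observed and predicted abundances, which is an assumption about the statistic, and the abundance vectors are invented for illustration: dominant species are predicted well and rare species poorly.

```python
import numpy as np

def r_squared(observed, predicted):
    """Squared Pearson correlation between two abundance vectors."""
    return float(np.corrcoef(observed, predicted)[0, 1] ** 2)

# Hypothetical abundances: the three common species are predicted closely,
# while the three rare species are off by roughly an order of magnitude.
observed = np.array([0.50, 0.30, 0.15, 0.040, 0.009, 0.001])
predicted = np.array([0.48, 0.32, 0.14, 0.001, 0.050, 0.009])

r2_raw = r_squared(observed, predicted)
# All values are positive here; zero abundances would need separate handling.
r2_log = r_squared(np.log(observed), np.log(predicted))
print(f"untransformed r^2 = {r2_raw:.2f}, log-transformed r^2 = {r2_log:.2f}")
```

On the untransformed scale the few common species dominate the variance, so errors in rare species barely register; on the log scale every species contributes comparably and those errors become visible.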

Shipley *et al*.'s analysis (*1*) is intriguing and does well at predicting abundances, but the reasons for its success are not as clear as they might first appear. The maximum entropy assumption is of secondary importance, and the traits do well not because of strong convergence but mainly for statistical reasons. These results may, to some degree, simply reflect the scale of the data set. The large variation in trait averages among sites of similar successional age presumably reflects random variation in seed arrival. Such stochastic effects typically become less important at larger spatial scales, where abundances effectively average over many local deviations. Shipley *et al*.'s approach may thus hold promise for application at the larger scales used in dynamic global vegetation models. Toward this goal, it would be important to know to what degree trait averages are convergent across regions with different floras but similar environments, and whether these averages can be predicted from first principles. On a more practical level, it would be interesting to determine which traits provide the greatest predictive power relative to their measurement effort and whether additional constraints on higher moments of trait distributions would increase predictive power substantially. Future research should test this approach on other sites and data sets and use cross-validation analysis and alternative assumptions to illuminate the contribution of different elements of the analysis to its relative success or failure.