## Abstract

Tomkins *et al*. (Reports, 14 May 2010, p. 892) reported a strong negative correlation between breeding values and mutational load in cow-pea weevils. Here, I show that this result can be attributed to a statistical artifact. By testing the observed correlation against an incorrect null hypothesis, they find a negative correlation where one does not exist.

A correlation between additive genetic breeding values and mutational load is to be expected if genetic variation in traits closely related to fitness is largely the result of partially recessive, deleterious mutations (*1*–*3*). Estimating this correlation requires knowledge of an individual’s breeding value, as well as its mutational load. Predicting breeding values is relatively straightforward when pedigree data are available (*1*, *4*), but estimating variation in mutational load among individuals or families is less so (*5*, *6*).

Using an elegant experimental setup, Tomkins *et al*. (*7*) first predicted offspring phenotypes [*p*, see (*8*)] from the breeding values of their outbred but related parents and compared these to the observed phenotypes [*o*, see (*9*)] of their inbred offspring. They then used the difference between the observed and the predicted phenotype (*p* – *o*) as an estimate of the amount of inbreeding depression shown by a particular family, and thereby of the among-family variance in inbreeding depression [following (*6*)]. Using the amount of inbreeding depression found in the offspring as a measure of mutational load, they then related *p* – *o* to variation in predicted breeding values (i.e., *p*) and thereby tested for a correlation between additive genetic breeding values and the load of partially deleterious mutations.

However, before drawing any conclusions from the strength and direction of this correlation, it is crucial to explicitly formulate the correlation between *p* and *p* – *o* expected under the null hypothesis of no genetic correlation between additive genetic breeding values and mutational load. Tomkins *et al*. [supporting online material (SOM) for (*7*)] state that under the null hypothesis “the observed [inbred family mean] is random with respect to the predicted” (as is the case in Fig. 1A). This implies that inbred offspring do not resemble their parents, even though the trait is heritable. However, both *p* and *o* have an additive genetic component, and hence predicted and observed phenotypes will be correlated (Fig. 1B). This is true irrespective of whether there is a correlation between *p* and *p* – *o* (*10*).

Unlike *p* and *o*, we would at first sight expect *p* – *o* and *p* to be uncorrelated under the null hypothesis. However, according to Tomkins *et al*., the correlation between *p* – *o* and *p* may well be nonzero in the absence of a correlation between breeding value and inbreeding depression, because “...larger values of predicted – observed will always tend to be associated with larger values of the predicted, simply because more minus anything returns a larger number than less minus anything” and this would result in a nonzero correlation between *p* – *o* and *p* [SOM for (*7*)]. Although correct for two series of random numbers (Fig. 1C), as outlined above, *p* and *o* are not independent, even under the null hypothesis. Hence, we cannot treat *o* as a random number. Although at first glance it may appear that by correlating *p* – *o* and *p* we are correlating *p* with *p* minus “something,” by subtracting *o* from *p*, we are in fact removing the dependence between the two. So, if the null hypothesis is true, the correlation between *p* and *p* – *o* really is zero (Fig. 1D) (*11*).
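The argument above can be checked numerically. The following is a minimal simulation sketch, not part of the original analysis: it generates *p* and *o* from the definitions in notes (*8*) and (*9*), with inbreeding depression drawn independently of the predicted breeding value (the null hypothesis). The function name `simulate_null`, the normal distributions, and all variance components are illustrative assumptions rather than estimates from Tomkins *et al*.

```python
import numpy as np

def simulate_null(n=200_000, mu=10.0, var_a_hat=1.0, var_resid=0.5,
                  var_e=1.0, mean_d=1.0, var_d=0.5, seed=1):
    """Simulate predicted (p) and observed (o) family means under the null.

    All parameter values are hypothetical and chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    a_hat = rng.normal(0.0, np.sqrt(var_a_hat), n)      # predicted breeding value
    # True breeding value; built so that cov(a_hat, a) = var(a_hat).
    a = a_hat + rng.normal(0.0, np.sqrt(var_resid), n)
    e = rng.normal(0.0, np.sqrt(var_e), n)              # environmental deviation
    d = rng.normal(mean_d, np.sqrt(var_d), n)           # inbreeding depression, independent of a_hat
    p = mu + a_hat
    o = mu + a + e - d
    return p, o

p, o = simulate_null()
r_p_o = np.corrcoef(p, o)[0, 1]          # predicted vs. observed
r_p_diff = np.corrcoef(p, p - o)[0, 1]   # predicted vs. (predicted - observed)
print(f"corr(p, o)     = {r_p_o:.3f}")
print(f"corr(p, p - o) = {r_p_diff:.3f}")
```

Under these assumptions, *p* and *o* should come out clearly positively correlated, whereas the correlation between *p* and *p* – *o* should be indistinguishable from zero, in line with notes (*10*) and (*11*).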

Nevertheless, when Tomkins *et al*. randomized *o* relative to *p*, and subsequently correlated *p* – *o*_{randomized} and *p*, they found a strong positive correlation between the two. This correlation, however, is an artifact, introduced because *p* and *o* are not independent. If *p* is relatively large, we are most likely to draw a random value of *o* that is smaller than *p*. Similarly, if *p* is below average, a random value of *o* is more likely to be larger than *p*. Consequently, if *p* is large, *p* minus a random value of *o* is on average large, and if *p* is small, *p* minus a random value of *o* is on average small. Randomization thus generates a positive correlation between *p* – *o*_{randomized} and *p* (*12*). Hence, rather than exposing potential biases, in this case randomization generates a bias that is not there in the original data (Fig. 2).
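The effect of randomization can be illustrated with a short simulation sketch. It is not a reconstruction of the procedure used by Tomkins *et al*.: the variance components, sample size, and variable names below are assumptions made for illustration, and the data are generated under the null hypothesis following notes (*8*) and (*9*). The expected value `r_expected` follows from note (*12*), together with the additional assumption that var(*p* – *o*_{randomized}) = var(*p*) + var(*o*) once *o* has been shuffled.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Illustrative variance components (assumptions, not estimates from the study).
a_hat = rng.normal(0.0, 1.0, n)                 # predicted breeding value
a = a_hat + rng.normal(0.0, np.sqrt(0.5), n)    # true breeding value, cov(a_hat, a) = var(a_hat)
e = rng.normal(0.0, 1.0, n)                     # environmental deviation
d = rng.normal(1.0, np.sqrt(0.5), n)            # inbreeding depression, independent of a_hat

p = 10.0 + a_hat
o = 10.0 + a + e - d

# Shuffle o relative to p, which removes the covariance between them.
o_randomized = rng.permutation(o)

r_observed = np.corrcoef(p, p - o)[0, 1]
r_randomized = np.corrcoef(p, p - o_randomized)[0, 1]
r_expected = np.sqrt(p.var() / (p.var() + o.var()))
r_corrected = r_observed - r_randomized

print(f"corr(p, p - o)            = {r_observed:.3f}")
print(f"corr(p, p - o_randomized) = {r_randomized:.3f} (expected {r_expected:.3f})")
print(f"'corrected' correlation   = {r_corrected:.3f}")
```

In this sketch the observed correlation is close to zero because the null hypothesis holds by construction, yet subtracting the randomized correlation yields a clearly negative "corrected" value.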

From their randomization tests, Tomkins *et al*. erroneously conclude that under their null hypothesis, *p* – *o* and *p* are significantly positively correlated. To correct for this, they calculate the genetic correlation between breeding value and inbreeding depression as the correlation between *p* – *o* and *p*, minus the correlation between *p* – *o*_{randomized} and *p*. However, if the latter is significantly greater than zero, whereas the former is close to zero, this “corrected” correlation will be significantly negative. Indeed, whereas Tomkins *et al*. found the average correlation between *p* – *o*_{randomized} and *p* to be 0.3, the average “corrected” correlation between *p* – *o* and *p* was –0.24. Using the correct null hypothesis (i.e., the correlation between *p* and *p* – *o* is equal to zero) instead, we obtain a mean estimate of –0.24 + 0.3 = 0.06.
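The sign of the "corrected" correlation can be made explicit with a short worked derivation. It starts from note (*12*) and additionally assumes that, once *o* has been randomized, *p* and *o*_{randomized} are uncorrelated, so that their variances add. The symbol *r*_{corrected} is introduced here only as shorthand for the quantity described above.

```latex
\begin{align}
\operatorname{cov}(p,\; p - o_{\mathrm{randomized}}) &= \operatorname{var}(p) \\
\operatorname{var}(p - o_{\mathrm{randomized}}) &= \operatorname{var}(p) + \operatorname{var}(o) \\
r(p,\; p - o_{\mathrm{randomized}}) &= \sqrt{\frac{\operatorname{var}(p)}{\operatorname{var}(p) + \operatorname{var}(o)}} \;>\; 0 \\
r_{\mathrm{corrected}} &= r(p,\; p - o) - r(p,\; p - o_{\mathrm{randomized}})
\end{align}
```

If the null hypothesis holds, so that *r*(*p*, *p* – *o*) = 0, then *r*_{corrected} = –*r*(*p*, *p* – *o*_{randomized}) < 0. Adding the randomized correlation back to the reported average gives –0.24 + 0.30 = 0.06, the estimate quoted above.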

Above I have argued that the apparent negative correlation between breeding value and inbreeding depression is an artifact of overestimating the correlation between *p* and *p* – *o* under the null hypothesis, and that this overestimation results from *p* and *o* not being independent if a trait is heritable. This argument is corroborated by a strong negative correlation between the heritability of a trait and the corrected correlation between *p* and *p* – *o*, with the traits with the highest heritability having the most negative correlations (*r* = –0.66, *P* = 0.006, using estimates from table S1).
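Why the bias should grow with heritability can be sketched as follows. This is an illustration of the direction of the effect only, and it does not reproduce the correlation of –0.66 reported above. The function `expected_corrected_r`, the scaling of total phenotypic variance to 1, the parameter `accuracy_sq` (the fraction of additive genetic variance captured by the predicted breeding value), and the value of `var_d` are all hypothetical choices. The formula used for the randomized correlation assumes cov(*p*, *o*_{randomized}) = 0, as in note (*12*).

```python
import numpy as np

def expected_corrected_r(h2, accuracy_sq=0.5, var_d=0.25):
    """Expected 'corrected' correlation under the null (hypothetical parameters).

    h2          : narrow-sense heritability, with phenotypic variance scaled to 1
    accuracy_sq : assumed fraction of var(a) captured by the predicted breeding value
    var_d       : assumed among-family variance in inbreeding depression
    """
    var_p = accuracy_sq * h2      # var(p) = var(a_hat)
    var_o = 1.0 + var_d           # var(o) = var(a) + var(e) + var(d)
    r_randomized = np.sqrt(var_p / (var_p + var_o))
    # The true corr(p, p - o) is zero under the null, so the "correction"
    # simply subtracts the randomized correlation from zero.
    return 0.0 - r_randomized

h2_values = np.array([0.1, 0.3, 0.5, 0.7])
corrected = expected_corrected_r(h2_values)
for h2, r in zip(h2_values, corrected):
    print(f"h2 = {h2:.1f}: expected corrected r = {r:.2f}")
```

Under these assumptions, a more heritable trait has a larger var(*p*) relative to var(*o*), a larger randomized correlation, and hence a more negative "corrected" correlation, even though no true correlation is present.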

The idea of a negative correlation between additive genetic breeding values and mutational loads is appealing because it provides an answer to the enigmatic question of how genetic variation is maintained in the face of selection (*13*). Using the correct null hypothesis, the study by Tomkins *et al*. suggests that this correlation is very weak at best. However, more work is required to establish the statistical power provided by their experimental design and whether the correlation between *p* – *o* and *p* provides an unbiased estimate of the genetic correlation between mutational loads and breeding values for fitness.

## References and Notes

- 8. *p* = μ + *â*, where μ is the population mean and *â* the predicted breeding value, given by the mean of the parental predicted breeding values.
- 9. *o* = μ + *a* + *e* – *d*, where *a* is the true breeding value, *e* the environmental deviation, and *d* the amount of inbreeding depression (i.e., mutational load).
- 10. If *â* is an unbiased prediction of *a*, then cov(*â*, *a*) = var(*â*) [see (*4*)] and cov(*â*, *e*) = 0. From this, it follows that cov(*p*, *o*) = cov(*â*, *a* + *e* – *d*) = var(*â*) – cov(*â*, *d*). So if cov(*â*, *d*) = 0 (the null hypothesis), then cov(*p*, *o*) = var(*â*).
- 11. As cov(*p*, *o*) = var(*â*) – cov(*â*, *d*), cov(*p*, *p* – *o*) = var(*p*) – cov(*p*, *o*) = var(*â*) – [var(*â*) – cov(*â*, *d*)] = cov(*â*, *d*). So if cov(*â*, *d*) = 0 (the null hypothesis), then cov(*p*, *p* – *o*) = 0.
- 12. As cov(*p*, *p* – *o*) = var(*p*) – cov(*p*, *o*) and cov(*p*, *o*_{randomized}) = 0, cov(*p*, *p* – *o*_{randomized}) = var(*p*).
- **Acknowledgments:** I am grateful to L. F. Keller, P. Nietlisbach, B. Tschirren, and P. W. Wandeler for comments and discussion, as well as to the Swiss National Science Foundation for funding (grant 31003A-116794).