Technical Comments

Comment on “Additive Genetic Breeding Values Correlate with the Load of Partially Deleterious Mutations”


Science  02 Sep 2011:
Vol. 333, Issue 6047, pp. 1221
DOI: 10.1126/science.1200996

This article has been retracted. Please see:

Abstract

Tomkins et al. (Reports, 14 May 2010, p. 892) reported a strong negative correlation between breeding values and mutational load in cow-pea weevils. Here, I show that this result can be attributed to a statistical artifact. By testing the observed correlation against an incorrect null hypothesis, they find a negative correlation where one does not exist.

A correlation between additive genetic breeding values and mutational load is to be expected if genetic variation in traits closely related to fitness is largely the result of partially recessive, deleterious mutations (1–3). Estimating this correlation requires knowledge of an individual’s breeding value, as well as its mutational load. Predicting breeding values is relatively straightforward when pedigree data are available (1, 4), but estimating variation in mutational load among individuals or families is less so (5, 6).

Using an elegant experimental setup, Tomkins et al. (7) first predicted offspring phenotypes [p, see (8)] from the breeding values of their outbred but related parents and compared these to the observed phenotypes [o, see (9)] of their inbred offspring. They then used the difference between the predicted and the observed phenotype (p – o) as an estimate of the amount of inbreeding depression shown by a particular family, and thereby of the among-family variance in inbreeding depression [following (6)]. Using the amount of inbreeding depression found in the offspring as a measure of mutational load, they then related p – o to variation in predicted breeding values (i.e., p) and thereby tested for a correlation between additive genetic breeding values and the load of partially deleterious mutations.

However, before drawing any conclusions from the strength and direction of this correlation, it is crucial to explicitly formulate the correlation between p and p – o expected under the null hypothesis of no genetic correlation between additive genetic breeding values and mutational load. Tomkins et al. [supporting online material (SOM) for (7)] state that under the null hypothesis “the observed [inbred family mean] is random with respect to the predicted” (as is the case in Fig. 1A). This implies that inbred offspring do not resemble their parents, even though the trait is heritable. However, both p and o have an additive genetic component, and hence predicted and observed phenotypes will be correlated (Fig. 1B). This is true irrespective of whether there is a correlation between p and p – o (10).

Fig. 1

The relationship between p, o, and p – o. Unlike x and y (A), p and o are not independent (B). Although x and y are uncorrelated, x – y and x are positively correlated (C), but p – o and p are not (D). p is simulated by drawing 1000 values from a normal distribution with mean of 10 (i.e., μ) and a variance of 5 [i.e., var(â)]. o is simulated as â + e – d, where e and d are drawn from normal distributions with means of 0 and 2 (i.e., mean e and d) and variances of 5 and 2.5 [i.e., var(e) and var(d)], respectively. Here the variances of â and a are equal; in other words, the reliability of the predicted breeding values is 1 (4). Furthermore, if â is an unbiased prediction of a, it follows from (10) that the slope of o against p (B) is always equal to 1. To make them comparable to p and o, x and y are drawn from normal distributions with means of 10 and 8 and variances of 5 and 12.5, respectively.
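The simulation described in the Fig. 1 caption can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the figure. It assumes NumPy, an arbitrary seed, and that o is built as p + e – d (that is, μ + a + e – d with a = â, following (8) and (9)). The helper name r is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, for reproducibility only
n = 1000

# Null hypothesis: predicted breeding value and inbreeding depression
# are drawn independently of each other.
p = rng.normal(10.0, np.sqrt(5.0), n)   # predicted phenotype, var(a-hat) = 5
e = rng.normal(0.0, np.sqrt(5.0), n)    # environmental deviation, var(e) = 5
d = rng.normal(2.0, np.sqrt(2.5), n)    # inbreeding depression, var(d) = 2.5
o = p + e - d                           # observed phenotype (assumes a = a-hat)

# Independent comparison series with matching means and variances.
x = rng.normal(10.0, np.sqrt(5.0), n)
y = rng.normal(8.0, np.sqrt(12.5), n)


def r(u, v):
    """Pearson correlation between two 1-D arrays."""
    return np.corrcoef(u, v)[0, 1]


results = {
    "r(x, y)      [panel A]": r(x, y),
    "r(p, o)      [panel B]": r(p, o),
    "r(x, x - y)  [panel C]": r(x, x - y),
    "r(p, p - o)  [panel D]": r(p, p - o),
}
for name, value in results.items():
    print(f"{name}: {value:+.3f}")
```

Under these assumptions, panels A and D should give correlations near zero, whereas panels B and C should be clearly positive. Exact values depend on the seed.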

Unlike p and o, we would at first sight expect p – o and p to be uncorrelated under the null hypothesis. However, according to Tomkins et al., the correlation between p – o and p may well be nonzero in the absence of a correlation between breeding value and inbreeding depression, because “...larger values of predicted – observed will always tend to be associated with larger values of the predicted, simply because more minus anything returns a larger number than less minus anything” and this would result in a nonzero correlation between p – o and p [SOM for (7)]. Although correct for two series of random numbers (Fig. 1C), as outlined above, p and o are not independent, even under the null hypothesis. Hence, we cannot treat o as a random number. Although at first glance it may appear that by correlating p – o and p we are correlating p with p minus “something,” by subtracting o from p, we are in fact removing the dependence between the two. So, if the null hypothesis is true, the correlation between p and p – o really is zero (Fig. 1D) (11).

Nevertheless, when Tomkins et al. randomized o relative to p, and subsequently correlated p – o randomized and p, they found a strong positive correlation between the two. This correlation, however, is an artifact, introduced because p and o are not independent. If p is relatively large, we are most likely to draw a random value of o that is smaller than p. Similarly, if p is below average, a random value of o is more likely to be larger than p. Consequently, if p is large, p minus a random value of o is on average large, and if p is small, p minus a random value of o is on average small. Randomization thus generates a positive correlation between p – o randomized and p (12). Hence, rather than exposing potential biases, in this case randomization generates a bias that is not there in the original data (Fig. 2).

Fig. 2

The effect of randomizing o on the correlation between p, o, and p – o. Unlike p and o randomized, which are uncorrelated (A), p and p – o randomized are positively correlated (B). The histograms represent the distribution of 1000 values of r(p, o randomized) and r(p, p – o randomized), respectively. The dotted vertical lines indicate the value of r(p, o) and r(p, p – o) (see Fig. 1, B and D). For details on the simulation of p and o, see Fig. 1.
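The randomization artifact can be reproduced with a short permutation sketch. It is illustrative only: it assumes NumPy, an arbitrary seed, the Fig. 1 parameter values, and that "randomizing" means permuting o across families. The variable names are hypothetical, and this is not the procedure of Tomkins et al. as implemented.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed
n, n_perm = 1000, 1000

# Simulate p and o under the null hypothesis, as in Fig. 1 (assumes a = a-hat).
p = rng.normal(10.0, np.sqrt(5.0), n)
e = rng.normal(0.0, np.sqrt(5.0), n)
d = rng.normal(2.0, np.sqrt(2.5), n)
o = p + e - d


def r(u, v):
    """Pearson correlation between two 1-D arrays."""
    return np.corrcoef(u, v)[0, 1]


r_obs = r(p, p - o)  # close to zero under the null hypothesis

r_rand_o = np.empty(n_perm)     # r(p, o randomized), as in Fig. 2A
r_rand_diff = np.empty(n_perm)  # r(p, p - o randomized), as in Fig. 2B
for i in range(n_perm):
    o_rand = rng.permutation(o)  # breaks the dependence between p and o
    r_rand_o[i] = r(p, o_rand)
    r_rand_diff[i] = r(p, p - o_rand)

# Subtracting the randomized mean turns a near-zero correlation negative.
corrected = r_obs - r_rand_diff.mean()

print(f"r(p, p - o)                 : {r_obs:+.3f}")
print(f"mean r(p, o randomized)     : {r_rand_o.mean():+.3f}")
print(f"mean r(p, p - o randomized) : {r_rand_diff.mean():+.3f}")
print(f"'corrected' correlation     : {corrected:+.3f}")
```

In this sketch the data contain no correlation between breeding value and inbreeding depression, yet the "corrected" value comes out clearly negative, because the randomized correlation being subtracted is itself positive.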

From their randomization tests, Tomkins et al. erroneously conclude that under their null hypothesis, p – o and p are significantly positively correlated. To correct for this, they calculate the genetic correlation between breeding value and inbreeding depression as the correlation between p – o and p, minus the correlation between p – o randomized and p. However, if the latter is significantly greater than zero, whereas the former is close to zero, this “corrected” correlation will be significantly negative. Indeed, whereas Tomkins et al. found the average correlation between p – o randomized and p to be 0.3, the average “corrected” correlation between p – o and p was –0.24. Using the correct null hypothesis (i.e., the correlation between p and p – o is equal to zero) instead, we obtain a mean estimate of –0.24 + 0.3 = 0.06.

Above I have argued that the apparent negative correlation between breeding value and inbreeding depression is an artifact from an overestimation of the correlation between p and p – o under the null hypothesis and that this overestimation results from p and o not being independent if a trait is heritable. This argument is corroborated by a strong negative correlation between the heritability of a trait and the corrected correlation between p and p – o, with traits with the highest heritability having the most negative correlations (r = –0.66, P = 0.006, using estimates from table S1).
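One way to see why the bias should grow with heritability is to turn the covariance in (12) into a correlation. The step below is a sketch, assuming that o randomized is independent of p and has the same variance as o. It is an extrapolation from (12), not a result stated in the original report.

```latex
r(p,\, p - o_{\mathrm{randomized}})
  = \frac{\operatorname{cov}(p,\, p - o_{\mathrm{randomized}})}
         {\sqrt{\operatorname{var}(p)\,\operatorname{var}(p - o_{\mathrm{randomized}})}}
  = \frac{\operatorname{var}(p)}
         {\sqrt{\operatorname{var}(p)\,[\operatorname{var}(p) + \operatorname{var}(o)]}}
  = \sqrt{\frac{\operatorname{var}(p)}{\operatorname{var}(p) + \operatorname{var}(o)}}
```

Under these assumptions, the expected randomized correlation rises as var(p) = var(â) makes up a larger share of the phenotypic variance, so subtracting it gives more negative "corrected" correlations for more heritable traits. With the values used for Fig. 1, var(p) = 5 and var(o) = 12.5, the expression gives √(5/17.5) ≈ 0.53.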

The idea of a negative correlation between additive genetic breeding values and mutational loads is appealing because it provides an answer to the enigmatic question of how genetic variation is maintained in the face of selection (13). Using the correct null hypothesis, the study by Tomkins et al. suggests that this correlation is very weak at best. However, more work is required to establish the statistical power provided by their experimental design and whether the correlation between p – o and p provides an unbiased estimate of the genetic correlation between mutational loads and breeding values for fitness.

References and Notes

  8. p = μ + â, where μ is the population mean and â the predicted breeding value, given by the mean of the parental predicted breeding values.
  9. o = μ + a + e – d, where a is the true breeding value, e the environmental deviation, and d the amount of inbreeding depression (i.e., mutational load).
  10. If â is an unbiased prediction of a, then cov(â, a) = var(â) (see 4) and cov(â, e) = 0. From this, it follows that cov(p, o) = cov(â, a + e – d) = var(â) – cov(â, d). So if cov(â, d) = 0 (the null hypothesis), then cov(p, o) = var(â).
  11. As cov(p, o) = var(â) – cov(â, d), cov(p, p – o) = var(p) – cov(p, o) = var(â) – [var(â) – cov(â, d)] = cov(â, d). So if cov(â, d) = 0 (the null hypothesis), then cov(p, p – o) = 0.
  12. As cov(p, p – o) = var(p) – cov(p, o) and cov(p, o randomized) = 0, cov(p, p – o randomized) = var(p).

Acknowledgments: I am grateful to L. F. Keller, P. Nietlisbach, B. Tschirren, and P. W. Wandeler for comments and discussion, as well as to the Swiss National Science Foundation for funding (grant 31003A-116794).