## Abstract

A mathematical model is presented in which a single mutation can affect multiple phenotypic characters, each of which is subject to stabilizing selection. A wide range of mutations is allowed, including ones that produce extremely small phenotypic changes. The analysis shows that, when three or more characters are affected by each mutation, a single optimal genetic sequence may become common. This result provides a hypothesis to explain the low levels of variation and low rates of substitution that are observed at some loci.

Many continuously varying phenotypic characters are subject to stabilizing selection, so that the optimal phenotypic value lies between the minimum and maximum possible values (1-7). These phenotypic characters can be anything from the circumference of the stem of a plant to the distance between two subunits within a protein. Models of stabilizing selection often allow for a continuous range of mutations, so that some mutations have very small effects, whereas others have substantial effects (2, 3, 8-14). We follow this approach in the present study. Analyses of stabilizing selection have concentrated on models for which any given mutation affects only one phenotypic character. Nevertheless, mutations that affect multiple characters are well known and are commonly regarded as ubiquitous (5, 14-25). Here, we show that, when three or more characters are affected by each mutation, a single optimal genetic sequence may become predominant. This finding contrasts sharply with the usual finding that, in equilibrium, the optimal sequence is rare and many slightly suboptimal sequences are present.

Consider a simple nonpleiotropic model of viability selection in a very large population of haploid and asexual organisms (the results are expected to generalize to sex and diploidy). Parents produce offspring and then die, so that generations are discrete. After birth, offspring undergo viability selection, and the probability that an individual will survive depends on phenotype, which is described by measurements on k different characters. An individual's measurement on the *i*th character is denoted by z_{i} (where −∞ < z_{i} < ∞). These characters are chosen so that they affect fitness independently, and z_{i} = 0 is the optimal value for each character. We use a Gaussian fitness scheme, so that the probability of surviving viability selection for a particular individual is proportional to Π_{i=1}^{k} exp[−z_{i}^{2}/(2V)], where V > 0.
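As a minimal sketch of the Gaussian fitness scheme just described, the following Python function evaluates the unnormalized survival probability for a list of phenotypic measurements. The function name `relative_viability` and the choice to pass the phenotype as a plain list are illustrative assumptions, not part of the original model description.

```python
import math

def relative_viability(z, V):
    """Unnormalized survival probability prod_i exp(-z_i^2 / (2V)).

    z : phenotypic measurements z_1..z_k (optimum at 0 for each character)
    V : width parameter of the Gaussian fitness function, must be positive
    """
    if V <= 0:
        raise ValueError("V must be positive")
    # The product of exponentials equals the exponential of the summed exponents.
    return math.exp(-sum(zi * zi for zi in z) / (2.0 * V))
```

An individual at the optimum on every character gets the value 1, and the value falls as any measurement moves away from zero.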

After viability selection, the remaining individuals reproduce, and fertility is unaffected by phenotype. The phenotype of a particular offspring on the *i*th character is assumed to depend on its “genotypic value” on that character (x_{i}) plus a normally distributed environmental noise component, e_{i} (so z_{i} = x_{i} + e_{i}). The distribution of e_{i} has mean zero and variance V_{e}. For i ≠ j, e_{i} and e_{j} are uncorrelated.

For the *i*th character, an individual's genotypic value (x_{i}) is identical to that of its mother, unless a new mutation has occurred in the part of the genetic sequence that controls the character. The rate of such mutations is denoted by Θ (where 0 ≤ Θ ≤ 1). For now, we assume that mutations that affect one character do not affect other characters and that mutations to different characters occur independently. Thus, the probability that an individual will have one or more new mutations (U) is given by U = 1 − (1 − Θ)^{k}.
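The relation between the per-character rate Θ and the genome-wide probability U can be checked numerically. The helper below is a small sketch; the name `prob_any_mutation` and the argument `n_units` (the number of independently mutating units, here the k characters) are assumptions introduced for illustration.

```python
def prob_any_mutation(theta, n_units):
    """Probability of at least one new mutation, U = 1 - (1 - theta)^n_units.

    theta   : mutation rate per independently mutating unit (0 <= theta <= 1)
    n_units : number of such units (k characters in the nonpleiotropic model)
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    return 1.0 - (1.0 - theta) ** n_units
```

When Θ is small, U is close to, but always slightly below, the product of Θ and the number of units.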

Most fitness-affecting characters are probably controlled by many codons. Therefore, we treat the genotypic value (x_{i}) as a continuous variable, and each possible value is associated with a different sequence of codons. Throughout this report, we will discuss gene sequences in terms of the sequence of codons, and we will treat two codons as identical if they code for the same amino acid (that is, we ignore “silent” variation).

Mutant values of x_{i} are distributed around the parental value. When a mutation occurs that alters x_{i}, the probability that the mutant offspring will have a value of x_{i} in the interval y + dy > x_{i} > y is f(y − x*)dy, where dy is infinitesimal and x* is the value of x_{i} for the mutant's mother. We use the traditional Gaussian function

$$f(y - x^{*}) = \frac{1}{\sqrt{2\pi m^{2}}} \exp\left[-\frac{(y - x^{*})^{2}}{2m^{2}}\right] \tag{1}$$

Thus, *m* gives the standard deviation of mutant effects for a single character.

Let us define α = ΘV_{s}/m^{2}, where V_{s} = V + V_{e}. For now, we assume that α << 1.

Models similar to the one just described have been studied previously (13, 26-28). Our analysis agrees with previous work in that we find that, at equilibrium, each character has a distribution of x_{i} values that is smooth and bell-shaped and has a peak at x_{i} = 0 (the optimum). The smoothness of this distribution implies that, regardless of the strength of selection and the mutation rate, the sequence of codons for which x_{i} = 0 (the optimal sequence) is virtually absent at equilibrium. Instead, many suboptimal sequences are present (Fig. 1A). (With a smooth distribution, any single value of x_{i} has infinitesimal frequency.) These results (and others reported below) are proved in (29).

Let us define w as the probability of surviving viability selection for an individual with a particular set of genotypic values (x_{1}, x_{2}, … , x_{k}), relative to that of an individual of the optimal genotype. As demonstrated elsewhere (13), the value of w is given by

$$w = \prod_{i=1}^{k} \exp\left[-\frac{x_{i}^{2}}{2V_{s}}\right] \tag{2}$$

Let w̄ represent the mean value of w at equilibrium (thus, w̄ is proportional to the percentage of offspring that survive viability selection). We can show that, for this model, w̄ > 1 − U [in agreement with Bürger (27, 28)]. Thus, at equilibrium, the decrease in fitness caused by having a suboptimal sequence of codons for any character is typically less than Θ. Very rough estimates of Θ suggest that values as high as 10^{−2} apply for many phenotypic characters (13, 30).
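The expression for w cited from (13) can be recovered by averaging the Gaussian survival probability over the environmental noise. The derivation below is a sketch for one character with genotypic value x. It assumes the noise components e_{i} are independent across characters (the text states only that they are uncorrelated, which is equivalent if they are jointly normal) and uses V_{s} = V + V_{e}.

```latex
% Average survival over environmental noise e ~ N(0, V_e), with z = x + e
\begin{align*}
\mathbb{E}_{e}\!\left[\exp\!\left(-\frac{(x+e)^{2}}{2V}\right)\right]
 &= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi V_{e}}}
    \exp\!\left(-\frac{e^{2}}{2V_{e}}\right)
    \exp\!\left(-\frac{(x+e)^{2}}{2V}\right) de \\
 &= \sqrt{\frac{V}{V+V_{e}}}\,
    \exp\!\left(-\frac{x^{2}}{2(V+V_{e})}\right).
\end{align*}
% Divide by the value at x = 0 (the optimal genotype) and
% multiply over the k independently acting characters:
\[
w = \prod_{i=1}^{k} \exp\!\left(-\frac{x_{i}^{2}}{2V_{s}}\right),
\qquad V_{s} = V + V_{e}.
\]
```

The prefactor that depends only on V and V_{e} cancels when viability is expressed relative to the optimal genotype, which is why environmental noise enters only through V_{s}.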

Pleiotropic mutations affect multiple characters. To introduce pleiotropy, we collect the characters into sets of size Ω (where Ω is a positive integer and k is a multiple of Ω). There are Q = k/Ω such sets. The Ω characters in any set have a common genetic basis, and a mutation affecting one character will affect each of the other Ω − 1 characters in the same set. The rate at which the genetic sequences coding for the characters in any given set undergo codon-altering mutations is given by Θ, and sets mutate independently. Thus, the probability that any given individual will have one or more new mutations (*U*) is now given by U = 1 − (1 − Θ)^{Q}. Mutant effects follow a multidimensional Gaussian distribution (17). In particular, pick any set and assign the characters that make it up the numbers 1, 2, … , Ω. Consider an individual that undergoes a mutation to this set and who is born to a mother whose genotypic values on these characters are x^{*}_{1}, x^{*}_{2}, … , x^{*}_{Ω}, respectively. The probability that this individual will have x_{1} in the interval y_{1} + dy_{1} > x_{1} > y_{1} and x_{2} in the interval y_{2} + dy_{2} > x_{2} > y_{2}, … is given by Π_{i=1}^{Ω}[f(y_{i} − x^{*}_{i})dy_{i}], where the dy_{i} values are infinitesimal and f(y_{i} − x^{*}_{i}) is given by Eq. 1. When Ω = 1, this pleiotropic model is identical to the nonpleiotropic model considered above.
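The pleiotropic mutation scheme can be sketched in a few lines of Python. The single-character density is taken to be a zero-mean Gaussian with standard deviation m, as the text describes. The function names and the use of a `random.Random` instance passed in by the caller are illustrative assumptions.

```python
import math
import random

def mutation_density(delta, m):
    """Zero-mean Gaussian density with standard deviation m at delta = y - x*."""
    return math.exp(-delta * delta / (2.0 * m * m)) / math.sqrt(2.0 * math.pi * m * m)

def joint_mutation_density(y, x_star, m):
    """Joint density for one set: product of single-character densities."""
    density = 1.0
    for yi, xi in zip(y, x_star):
        density *= mutation_density(yi - xi, m)
    return density

def mutate_set(x_star, m, rng):
    """Draw mutant values: every character in the set shifts independently."""
    return [xi + rng.gauss(0.0, m) for xi in x_star]

# Hypothetical usage: one mutation hitting a set of Omega = 3 characters.
rng = random.Random(0)
mutant = mutate_set([0.0, 0.0, 0.0], 0.1, rng)
```

With a set of size one, `mutate_set` reduces to the single-character mutation of the nonpleiotropic model.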

The equilibrium distribution for a single character, x_{1}, when mutations affect two characters (Ω = 2) is shown in Fig. 1B. The distribution is more peaked than that in Fig. 1A (the nonpleiotropic result) but is still continuous. This finding is in agreement with recent theoretical studies (19-23).

When mutations affect three or more characters (Ω ≥ 3), a qualitatively new phenomenon occurs. The distribution of any given character (i) contains a singularity at x_{i} = 0 (Fig. 1, C and D). Thus, a nonnegligible fraction of the population has perfect genomes. When Ω ≥ 3, the proportion of the population for which x_{1} ≠ 0 is of order α = ΘV_{s}/m^{2}. Individuals with the perfect genome for character 1 also are genetically perfect for characters 2, 3, … , Ω. Furthermore, if the proportion of the population for which x_{1} = 0 is denoted by P, then the proportion of individuals who have the perfect genome with respect to all k traits is equal to P^{k/Ω}. Previous analyses of similar models have suggested the possibility of singular behavior of the type noted here (28, 31). However, in these previous studies, only highly implausible fitness functions were shown to lead to singularities.

When Ω = 2, mean fitness (w̄) is greater than 1 − U (just as when Ω = 1). However, for Ω ≥ 3, we have w̄ = 1 − U.

To gain an intuitive understanding of these results, consider two modified versions of our model, each of which makes the unrealistic assumption that all mutations are deleterious. For the first of these models, assume that only two genotypes are possible, one optimal and one suboptimal. Let (1 − s) represent the relative viability of suboptimal individuals. Optimal individuals mutate to suboptimal ones at a rate U, but not vice versa. In this well-known model, if s > U, then, at equilibrium, the frequency of optimal individuals is given by 1 − (U/s) and w̄ = 1 − U. However, if s < U, then, at equilibrium, optimal individuals are entirely absent from the population and w̄ = 1 − s, so that w̄ > 1 − U.
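The two regimes of this two-genotype model can be reproduced by iterating its recursion. The sketch below assumes the life cycle used in the text (viability selection on offspring, then reproduction with one-way mutation from optimal to suboptimal at rate U). The function name, starting frequency, and number of generations are arbitrary choices for illustration.

```python
def two_genotype_equilibrium(s, U, p0=0.5, generations=5000):
    """Iterate the optimal-genotype frequency p among newborn offspring.

    s : viability cost of the suboptimal genotype (relative viability 1 - s)
    U : probability that an optimal parent produces a suboptimal offspring
    Returns (p, mean_w), where mean_w is mean relative viability at the end.
    """
    p = p0
    for _ in range(generations):
        mean_w = p + (1.0 - p) * (1.0 - s)  # mean relative viability of offspring
        p = p * (1.0 - U) / mean_w          # selection, then one-way mutation
    mean_w = p + (1.0 - p) * (1.0 - s)
    return p, mean_w

# Hypothetical parameter choices covering both regimes.
p_strong, w_strong = two_genotype_equilibrium(s=0.1, U=0.01)   # s > U
p_weak, w_weak = two_genotype_equilibrium(s=0.01, U=0.1)       # s < U
```

Setting the recursion to equilibrium with p > 0 forces the mean viability to equal 1 − U, which gives p = 1 − U/s. When s < U, no positive solution exists and p decays to zero, leaving a mean viability of 1 − s.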

When s < U, this model resembles the nonpleiotropic model (Ω = 1). In both models, the optimal genotype vanishes, and w̄ > 1 − U. However, in the nonpleiotropic model, deleterious mutations affect slightly suboptimal genotypes, as well as optimal ones. Nevertheless, some of these deleterious mutations are, effectively, canceled out, because when Ω = 1, nearly optimal genotypes are created at a nonnegligible rate by beneficial mutations. When Ω = 1 and x_{1} ≠ 0, 50% of mutations will move x_{1} in the direction of the optimum (although some will push x_{1} beyond the optimum). The creation of nearly optimal genotypes by mutation is also likely when Ω = 2 (29). In contrast, when Ω ≥ 3, mutations that improve fitness and produce a nearly optimal genotype on all Ω affected characters are extremely unlikely (29). Roughly speaking, this is because, when Ω ≥ 3, only a very small region of “genotypic space” corresponds to near optimality.
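The geometric intuition can be illustrated with a small Monte Carlo experiment. This is a related sketch, not the analysis of (29): it estimates how often a mutation brings a parent that sits a distance d from the optimum strictly closer to it when Ω characters mutate together. Because fitness depends only on the sum of squared genotypic values when all characters share the same V_{s}, a smaller distance means higher fitness. The function name, the sample size, and the fixed seed are assumptions for illustration.

```python
import random

def fraction_improving(omega, d, m, n_draws=20000, seed=0):
    """Estimate the share of mutations that move a parent closer to the optimum.

    omega : number of characters altered by each mutation
    d     : parent's distance from the optimum in genotypic space
    m     : standard deviation of mutant effects on each character
    """
    rng = random.Random(seed)
    parent = [d] + [0.0] * (omega - 1)  # by symmetry, put the parent on one axis
    parent_sq = d * d
    hits = 0
    for _ in range(n_draws):
        mutant_sq = sum((xi + rng.gauss(0.0, m)) ** 2 for xi in parent)
        if mutant_sq < parent_sq:
            hits += 1
    return hits / n_draws

# Hypothetical setting: a nearly optimal parent, d = m / 2.
estimates = {omega: fraction_improving(omega, d=0.5, m=1.0) for omega in (1, 2, 3)}
```

For a parent close to the optimum relative to m, the improving region is a ball of radius d whose volume shrinks like d^Ω, so the estimated fraction falls quickly as Ω grows.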

A second modified model illuminates the impact of this shift in favor of deleterious mutations. Say that when offspring are produced, they may undergo a certain number of mutational steps, each of which decreases fitness by a factor of (1 − s), where 0 < s < 1. The number of mutational steps follows a Poisson distribution with mean and variance equal to λ. Thus U, the genome-wide probability of at least one new mutation, is given by U = 1 − e^{−λ}. This is a well-known model (32), and, at equilibrium, w̄ = 1 − U and the optimal genotype takes a nonnegligible frequency. This model is analogous to the pleiotropic model above when Ω ≥ 3. In both models, w̄ = 1 − U and the optimal genotype is preserved at equilibrium because there is a strong tendency for mutations to degrade fitness in nearly optimal genotypes. When Ω ≥ 3, the loss of nearly optimal genotypes because of selection is not compensated for by a substantial gain of such genotypes because of beneficial mutations, and thus the superiority of the optimal genotype allows it to rise to a nonnegligible frequency.
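The behavior of this second model can be checked by iterating the distribution of the number of accumulated mutational steps. The sketch below assumes selection acts on offspring before reproduction and truncates the distribution at `max_steps` classes, renormalizing the negligible tail. The function name, truncation point, and generation count are illustrative assumptions.

```python
import math

def poisson_step_equilibrium(lam, s, max_steps=60, generations=300):
    """Iterate class frequencies p[j] (j accumulated steps) among offspring.

    lam : mean number of new mutational steps per offspring (Poisson)
    s   : each step multiplies viability by (1 - s)
    Returns (frequency of the optimal class, mean relative viability).
    """
    n = max_steps + 1
    pois = [math.exp(-lam) * lam ** j / math.factorial(j) for j in range(n)]
    fitness = [(1.0 - s) ** j for j in range(n)]
    p = [1.0] + [0.0] * max_steps           # start with every offspring optimal
    for _ in range(generations):
        mean_w = sum(pj * wj for pj, wj in zip(p, fitness))
        q = [pj * wj / mean_w for pj, wj in zip(p, fitness)]  # after selection
        new_p = [0.0] * n
        for i, qi in enumerate(q):          # add Poisson-distributed new steps
            if qi == 0.0:
                continue
            for j in range(n - i):
                new_p[i + j] += qi * pois[j]
        total = sum(new_p)                  # renormalize the truncated tail
        p = [x / total for x in new_p]
    mean_w = sum(pj * wj for pj, wj in zip(p, fitness))
    return p[0], mean_w

# Hypothetical parameters: lam = 0.1 and s = 0.1.
p_opt, w_bar = poisson_step_equilibrium(lam=0.1, s=0.1)
```

Under these assumptions the mean viability settles at e^{−λ} = 1 − U while the optimal class keeps a substantial frequency (e^{−λ/s} for this life cycle), in line with the behavior described in the text.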

What happens when our assumption that α = ΘV_{s}/m^{2} << 1 does not hold? For any value of α, the frequency of the optimal genotype is negligible when Ω = 1 and Ω = 2. However, there is always a critical value of Ω, say Ω*, such that when Ω ≥ Ω*, a nonnegligible frequency of the optimal genotype appears. When α << 1, Ω* = 3. However, if α << 1 is not satisfied, then we may have Ω* > 3 (29).

For many proteins, there is very little within-population variation (33-35). Low amounts of variation can lead to low substitution rates (3, 9, 11), and proteins exist that have apparently not changed at all for at least 100 million years (34, 36). Lack of variation can be a consequence of small population size and genetic drift, but drift will not stop substitutions. In some cases, natural selection is clearly the cause of low amounts of variation (36-38) or infrequent substitutions (34, 36). In large populations, stabilizing selection on one or two characters can produce low amounts of variation and substitution only if mutations that have very small selective effects are exceedingly rare. However, our results show that, when each mutation affects three or more phenotypic characters, variation (and thus, substitution) can be suppressed in favor of an optimal sequence even when mutations of very small effect are common.