Evolutionary Trade-Offs, Pareto Optimality, and the Geometry of Phenotype Space


Science  01 Jun 2012:
Vol. 336, Issue 6085, pp. 1157-1160
DOI: 10.1126/science.1217405


Biological systems that perform multiple tasks face a fundamental trade-off: A given phenotype cannot be optimal at all tasks. Here we ask how trade-offs affect the range of phenotypes found in nature. Using the Pareto front concept from economics and engineering, we find that best–trade-off phenotypes are weighted averages of archetypes—phenotypes specialized for single tasks. For two tasks, phenotypes fall on the line connecting the two archetypes, which could explain linear trait correlations, allometric relationships, as well as bacterial gene-expression patterns. For three tasks, phenotypes fall within a triangle in phenotype space, whose vertices are the archetypes, as evident in morphological studies, including on Darwin’s finches. Tasks can be inferred from measured phenotypes based on the behavior of organisms nearest the archetypes.

Consider a biological system whose phenotype is defined by a vector of traits, $v$. Traits considered here are quantitative measures such as bird beak length and not genetic traits such as DNA sequences. The space of all phenotypes is called the morphospace. Most theories of natural selection maximize a specific fitness function $F(v)$, resulting in an optimal phenotype, usually a point in morphospace. This approach has several limitations: First, the fitness function is often unknown. Second, in many cases, organisms need to perform multiple tasks that all contribute to fitness (1); thus, fitness is an increasing function of the performance at all tasks, $F(P_1(v), \ldots, P_k(v))$, where $P_i(v)$ is the performance at task $i$. The best phenotype for one task is usually not the best for other tasks, resulting in a trade-off situation. Maximizing fitness is thus a multi-objective optimization problem (2–5).

To address this issue, we employ the Pareto front concept (2–6), used in engineering and economics to find the set of designs that are the best trade-offs between different requirements. Consider two phenotypes v and v′. If v′ is better at all tasks than v, the latter will be eliminated by natural selection (Fig. 1A). Repeating this for all possible phenotypes, one remains with the Pareto front: the set of phenotypes that cannot be improved at all tasks at once. The Pareto front describes all optima for all conceivable fitness functions that are increasing functions of the performance in each task. Which of the phenotypes on the front is selected depends on the relative contributions of each task to the organism’s fitness in its natural habitat, provided that evolution has had sufficient time and genetic variance to reach the predicted point.
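
The elimination step described above can be sketched in a few lines. This is a minimal illustration, not the authors' software: it assumes performances have already been measured and are stored as one row per phenotype and one column per task, and it uses the standard Pareto dominance rule (at least as good at every task and strictly better at one or more). The function name and the example numbers are hypothetical.

```python
import numpy as np

def pareto_front_mask(performance):
    """Return a boolean mask marking the non-dominated rows.

    performance: array of shape (n_phenotypes, n_tasks); larger is better.
    """
    perf = np.asarray(performance, dtype=float)
    on_front = np.ones(perf.shape[0], dtype=bool)
    for i in range(perf.shape[0]):
        # Phenotype j dominates i if it is at least as good at every task
        # and strictly better at one or more tasks.
        at_least_as_good = np.all(perf >= perf[i], axis=1)
        strictly_better = np.any(perf > perf[i], axis=1)
        if np.any(at_least_as_good & strictly_better):
            on_front[i] = False
    return on_front

# Hypothetical performances of five phenotypes at two tasks.
example = np.array([
    [1.0, 0.1],
    [0.8, 0.6],
    [0.5, 0.5],  # dominated by [0.8, 0.6]
    [0.3, 0.9],
    [0.2, 0.2],  # dominated by several others
])
front = pareto_front_mask(example)
```

The pairwise loop is quadratic in the number of phenotypes, which is adequate for data sets of the size discussed here.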

Fig. 1

(A) The Pareto front (best trade-offs) is what remains after eliminating (crossed-out symbol) all feasible phenotypes v that are dominated on all tasks by other feasible phenotypes v′. (B) The two archetypes in morphospace maximize performance in tasks 1 and 2. Phenotype v is farther from both archetypes than v′, its projection on the line segment that connects the archetypes. Thus, v has lower performance than v′ in both tasks, hence lower fitness. Eliminating all such points v, one remains with the Pareto front: the line segment connecting the two archetypes [unlike (A), axes are traits, not performances]. (C) The area ratios of rodent molar areas show a linear relationship (11). Most of the morphospace is empty. Herbivores (circles), faunivores (triangles), and omnivores (squares) are indicated.

The Pareto front is typically a small region of morphospace. This may explain the long-standing observation that most of morphospace is empty (7): Phenotypes such as animal shapes found in nature fill only a small fraction of morphospace.

We next calculate the Pareto front in morphospace. This requires two assumptions (which will be relaxed below). (i) Each performance function is maximized by a single phenotype. The phenotype that is best at task $i$ will be called the archetype for task $i$, denoted $v_i^*$. (ii) Performance decreases with distance from the archetype (Fig. 1B). By distance, we mean a metric based on an inner product norm, such as Euclidean distance [mathematically, $P_i(v) = P_i(d_i)$, where $d_i = (v - v_i^*)^T M (v - v_i^*)$ and $M$ is a positive-definite matrix; Euclidean distance, $d_i = \lVert v - v_i^* \rVert^2$, is the case $M = I$]. For two tasks, geometric considerations show that the Pareto front is the line segment that connects the two archetypes (Fig. 1B). This is because any point off the line segment is farther from both archetypes than its projection on the line; thus, points off the line have lower performance at both tasks, and hence lower fitness, and will be selected against. The position of a phenotype on the line relates to the relative importance of the two tasks in the habitat in which the organism evolved: the closer to an archetype, the more important that task (8).
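
The geometric argument can be checked numerically. The sketch below assumes the Euclidean case (M = I) and uses made-up archetype coordinates; `project_onto_segment` is a hypothetical helper name. It projects a phenotype onto the segment between two archetypes and compares distances before and after.

```python
import numpy as np

def project_onto_segment(v, a1, a2):
    """Closest point to v on the segment from a1 to a2 (Euclidean metric)."""
    v, a1, a2 = (np.asarray(x, dtype=float) for x in (v, a1, a2))
    direction = a2 - a1
    # Position along the segment, clipped so the result stays between the archetypes.
    t = np.dot(v - a1, direction) / np.dot(direction, direction)
    t = np.clip(t, 0.0, 1.0)
    return a1 + t * direction

# Made-up archetypes and an off-segment phenotype in a two-trait morphospace.
a1 = np.array([0.0, 0.0])
a2 = np.array([4.0, 0.0])
v = np.array([1.0, 2.0])
v_proj = project_onto_segment(v, a1, a2)

dist_before = [np.linalg.norm(v - a1), np.linalg.norm(v - a2)]
dist_after = [np.linalg.norm(v_proj - a1), np.linalg.norm(v_proj - a2)]
```

Here the projection is at distance 1 and 3 from the two archetypes, whereas the original point is at roughly 2.24 and 3.61, so the projected phenotype is closer to both.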

The case of a trade-off between two tasks may explain the widespread occurrence of linear relations between traits (2, 9, 10). As an example, the area proportions of the molar teeth of 29 rodent species show an approximately linear relationship (11) (Fig. 1C). Species are distributed along the line according to their diet: herbivores at one end, carnivores at the other, and omnivores in the middle. Thus, the archetypes correspond to the ends of the observed line segment: a herbivore archetype with equal-sized molars, and a carnivore archetype with molars in the ratio 2:1:0. Omnivore molars are weighted averages of these archetypes. As in many morphological studies, the traits here are normalized to account for organism size: Because all molar areas scale with size, taking the ratio of molars removes the effect of organism size variation (8). Additionally, the present theory might explain cases of allometry, when traits depend on total organism size (9–12). Allometric relations often behave as power laws, observed as lines in logarithmic plots, as predicted when the performance decays with a metric that is a function of the log of the traits (8), as suggested, for example, by scaling laws for metabolic transport (12). Other explanations for allometric relations include physical or developmental constraints (10, 11).
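
The link between a straight front in log-trait space and a power law can be written out explicitly. The following is a short worked sketch under the stated assumption that performance decays with a distance measured on the logarithms of the traits; the symbols $x$, $y$ (two traits), $(x_1, y_1)$ and $(x_2, y_2)$ (the two archetypes), $b$, and $c$ are introduced here for illustration only.

```latex
% Assumption: the front is a straight segment in log-trait coordinates.
\begin{align}
\log x &= \theta \log x_1 + (1-\theta)\log x_2, \\
\log y &= \theta \log y_1 + (1-\theta)\log y_2, \qquad 0 \le \theta \le 1 .
\end{align}
% Eliminating \theta (this requires x_1 \neq x_2):
\begin{align}
\log y &= \log y_2 + b\,(\log x - \log x_2), &
b &= \frac{\log y_1 - \log y_2}{\log x_1 - \log x_2}, \\
y &= c\,x^{b}, & c &= y_2\, x_2^{-b}.
\end{align}
```

Under this assumption the allometric exponent $b$ is set entirely by the trait values of the two archetypes.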

For more than two tasks, the Pareto front is the full polygon or polyhedron whose vertices are the archetypes (8) (Fig. 2) [or, equivalently, the convex hull of the archetypes, defined as the set of all points that are weighted averages of the archetypes: $v = \sum_{i=1}^{k} \theta_i v_i^*$, with nonnegative weights $\theta_i$ that sum to one. For particular fitness and performance functions, the weights can be calculated: $\theta_i = \frac{\partial F}{\partial P_i}\frac{\partial P_i}{\partial d_i} \Big/ \sum_{j=1}^{k} \frac{\partial F}{\partial P_j}\frac{\partial P_j}{\partial d_j}$. Weights sum to one, $\sum_{i=1}^{k} \theta_i = 1$, and they are nonnegative, $\theta_i \ge 0$, because fitness increases with performance, $\partial F/\partial P_i > 0$, and performance decreases with distance from its archetype, $\partial P_i/\partial d_i < 0$ (8)]. For three tasks, the Pareto front is the full triangle whose vertices are the three archetypes. In this case, because a triangle defines a plane, even high-dimensional data on many traits are expected to collapse onto two dimensions. The closer a point is to one of the vertices of the triangle, the more important the corresponding task is to fitness in the organism’s habitat.
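
The weighted-average form follows from the condition for a fitness maximum. The steps below are a sketch of that reasoning, assuming a differentiable fitness function, a symmetric positive-definite $M$ shared by all tasks, and the shorthand $w_i$, which is introduced here and is not notation from the text.

```latex
% Fitness as a function of phenotype v, with d_i = (v - v_i^*)^T M (v - v_i^*):
\begin{align}
\nabla_v F
  &= \sum_{i=1}^{k} \frac{\partial F}{\partial P_i}\,
     \frac{\partial P_i}{\partial d_i}\; 2M\,(v - v_i^*) = 0 .
\end{align}
% M is invertible, so with w_i = (\partial F/\partial P_i)(\partial P_i/\partial d_i):
\begin{align}
\sum_{i=1}^{k} w_i\,(v - v_i^*) = 0
\quad\Longrightarrow\quad
v = \sum_{i=1}^{k} \theta_i\, v_i^*, \qquad
\theta_i = \frac{w_i}{\sum_{j=1}^{k} w_j} .
\end{align}
% Every w_i is negative (positive times negative), so each ratio \theta_i is
% nonnegative, and the \theta_i sum to one by construction.
```

Because the weights depend only on the signs and relative sizes of the derivatives, the optimum lies in the convex hull of the archetypes for any increasing fitness function.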

Fig. 2

Pareto front geometry. (A) Two tasks form a line. (B) Three tasks form a triangle. (C) Four tasks form a tetrahedron. If only some relevant traits are measured and others are not, lines and triangles should still be found, because a projection of a convex hull on a subspace is still a convex hull (8). The distribution of phenotypes along the front depends on the second derivative of the performance and fitness functions (8).

We find evidence for such triangular suites of variation in several classic studies of animal morphology and evolution. In these studies, there was no theory to explain why the data resemble a triangle. The species near the vertices of the triangles have distinct behavior that suggests which task is optimized by each archetype (Fig. 3, A to C). A triangle is found in the study of Grant and colleagues on Darwin’s finches (13) (Fig. 3A). Measurements of five beak and body traits (five-dimensional morphospace) fall on a two-dimensional plane: Two principal components, related to body size and beak shape, account for 99% of the variation (8). On this plane, the data fall within a triangle [$P < 10^{-4}$, according to a statistical test of triangularity; Fig. 3A, inset (8, 14)]. The triangle suggests three archetypes, one at each vertex. The species near the archetypes suggest which tasks may be optimized by each archetype, in this case tasks connected with diet: (1) probing for insects and nectar (long beak, cactus finch), (2) crushing large, hard seeds (thick beak, large ground finch), and (3) crushing small, soft seeds (small beak, small ground finch). Intermediate finch species perform a combination of these tasks (8).
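
The collapse of many traits onto a plane can be illustrated with synthetic data. The sketch below does not use the finch measurements: it draws made-up phenotypes as weighted averages of three hypothetical archetypes in a five-trait space, adds a small amount of noise, and computes the fraction of variance carried by each principal component from a singular value decomposition.

```python
import numpy as np

def explained_variance_ratio(data):
    """Fraction of total variance carried by each principal component."""
    x = np.asarray(data, dtype=float)
    centered = x - x.mean(axis=0)
    # Squared singular values of the centered data are proportional to
    # the variances along the principal axes.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

rng = np.random.default_rng(0)
# Three hypothetical archetypes in a five-trait morphospace.
archetypes = np.array([
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
])
# Nonnegative weights that sum to one for each synthetic phenotype.
weights = rng.dirichlet(np.ones(3), size=500)
traits = weights @ archetypes + rng.normal(scale=0.01, size=(500, 5))

ratio = explained_variance_ratio(traits)
top_two = ratio[:2].sum()
```

With these settings almost all of the variance falls on the first two components, as expected for points that lie within a triangle.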

Fig. 3

Triangular suites of variation, and trade-offs in E. coli gene expression. (A) Darwin’s ground finches (13). Axes correspond to size and beak shape. Polygons are boundaries of intraspecies variation. See (8) for species definitions. Inset: Statistical test for triangularity. Define t ratio as the ratio of the area of the minimal-area triangle (red) to the area of the convex hull of the data (purple). The P-value is the fraction of times that randomized data have a larger t ratio than the real data, based on $10^4$ randomized data sets that preserve the statistics of each trait independently (8). (B) Leaf-cutter ant (Atta sexdens) (15): poison sac (pheromone gland that marks the trail) length (normalized to pronotal width) versus head width. (C) Bat (Microchiroptera) wing aspect ratio versus body mass (16). Archetypes and inferred tasks are listed below each figure. (D) E. coli promoter activity was measured with fluorescent reporters (18). (E) Clustered correlation matrix of the top 200 temporally varying genes reveals two anticorrelated clusters. (F) Percentage of total promoter activity of three genes at different time points, in four different media conditions (8), as bacteria transit from exponential phase (1) to stationary phase (2).
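
One way to build randomized data sets that preserve the statistics of each trait independently is to permute every trait column on its own, which keeps each marginal distribution and removes correlations between traits. The sketch below shows only this randomization step and assumes that this is an acceptable reading of the caption; the t ratio itself requires a minimal-area enclosing triangle and is not implemented here. The function name and the toy data are hypothetical.

```python
import numpy as np

def shuffle_traits_independently(data, rng):
    """Permute each trait (column) separately across individuals (rows)."""
    x = np.asarray(data, dtype=float)
    shuffled = np.empty_like(x)
    for j in range(x.shape[1]):
        # Each column keeps exactly the same values, in a new random order.
        shuffled[:, j] = rng.permutation(x[:, j])
    return shuffled

rng = np.random.default_rng(1)
# Toy data: six individuals, two strongly correlated traits.
toy = np.array([
    [1.0, 1.1],
    [2.0, 2.1],
    [3.0, 2.9],
    [4.0, 4.2],
    [5.0, 5.0],
    [6.0, 6.1],
])
randomized = shuffle_traits_independently(toy, rng)
```

Repeating the shuffle many times and recomputing the test statistic on each randomized set gives the null distribution against which the real data are compared.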

We also noted a triangle-shaped suite of variation in E. O. Wilson’s study of leaf-cutter ants (15) [Fig. 3B, $P < 10^{-4}$ (8, 14)]. The three archetypes are associated with nursing/gardening, foraging outside the nest, and soldiering. Intermediate ants perform a combination of these tasks. Additionally, Norberg and Rayner’s study of bat wings (16) [Fig. 3C, $P < 3 \times 10^{-2}$ (8, 14)] shows a triangular pattern. Archetypes seem to be associated with eating insects in vegetation, hawking insects in the air far above vegetation, and eating large prey in vegetation.

The present considerations might apply beyond animal morphology. For example, bacteria face a trade-off in partitioning the total amount of proteins they can make at a given moment between the different types of proteins—that is, how much of each gene to express. A given expression pattern cannot be optimal, at the same time, for two different tasks such as rapid growth (which requires ribosomes) and survival (which requires stress response proteins) (17). Thus, the theory predicts that gene-expression patterns fall on low-dimensional Pareto fronts, whose vertices are archetypal expression patterns optimal for a single task.

We tested this hypothesis on Escherichia coli gene expression (Fig. 3D). The activity of 1600 promoters was tracked with fluorescent reporters as bacteria grew from exponential to stationary phase (18). Activity was normalized by the summed activity of all promoters at each time point, to represent the instantaneous allocation of transcription resources. The top 200 temporally varying promoters account for 96% of the total temporal variation and control genes in two main families (Fig. 3E) (8): growth genes (ribosomes, transcription, and translation) and stress/survival genes (oxidative stress, etc.). This high-dimensional data set falls on a line [Fig. 3F, $P < 10^{-4}$ (8), fig. S10]. At one end, expression is devoted mostly to growth genes (exponential phase, archetype 1), and at the other end expression is devoted primarily to stress/survival genes (stationary phase, archetype 2). Over time, the expression program gradually moves along the line from archetype 1 to 2. The instantaneous allocation at each time point is, to a good approximation, a weighted average of two archetypal expression programs: growth and survival. Similar analysis may explain low-dimensional patterns in gene-expression measurements in bacteria (19) and cancer cells (20).
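
The normalization and the weighted-average description can be sketched on synthetic numbers. The example below is not the promoter data set: it assumes a matrix with one row per promoter and one column per time point, two made-up archetypal allocations, and a least-squares estimate of the mixing weight. All names and values are hypothetical.

```python
import numpy as np

def normalize_allocation(activity):
    """Divide each time point (column) by the summed activity of all promoters."""
    a = np.asarray(activity, dtype=float)
    return a / a.sum(axis=0, keepdims=True)

def mixing_weight(x, arch1, arch2):
    """Least-squares theta such that x is close to theta*arch1 + (1-theta)*arch2."""
    d = arch1 - arch2
    return float(np.dot(x - arch2, d) / np.dot(d, d))

# Made-up archetypal allocations over three promoters (each sums to one).
growth = np.array([0.7, 0.2, 0.1])
survival = np.array([0.1, 0.2, 0.7])

# Synthetic time course moving from the growth to the survival archetype,
# with an arbitrary total activity at each time point.
thetas = np.array([1.0, 0.75, 0.5, 0.25, 0.0])
totals = np.array([5.0, 4.0, 3.0, 2.0, 1.0])
raw = np.stack([t * growth + (1.0 - t) * survival for t in thetas], axis=1) * totals

alloc = normalize_allocation(raw)
recovered = np.array([mixing_weight(alloc[:, k], growth, survival)
                      for k in range(alloc.shape[1])])
```

Normalization removes the changing total activity, after which the recovered weights trace the movement from one archetype to the other.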

Relaxing the assumptions (i) and (ii) above generally preserves the topology of the Pareto front, with mildly curved lines instead of straight edges, but nevertheless with distinct vertices that can be related to archetypes (Fig. 4) (8).

Fig. 4

(A) Relaxing assumption (i): When performance is maximized in a region rather than a single point, the Pareto front is the line that connects the closest point in the region to the other archetypes (8). (B and C) When all performance functions decay with the same distance metric, the Pareto front is a straight line. The front is the set of tangent points between equi-performance contours (8). (D) Relaxing assumption (ii): When each performance function decays with a different metric (different elliptical contours), the front is slightly curved. Root mean square deviation from a straight line is 21%, averaged over ellipses of all orientations and major/minor axis ratios spanning a hundredfold range (8). For three tasks, triangles with curved edges are generally found.
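
The effect of different metrics can be explored numerically for two tasks. The sketch below assumes quadratic distances $d_i = (v - v_i^*)^T M_i (v - v_i^*)$ with a separate positive-definite matrix per task, for which the stationarity condition gives $v(\theta) = (\theta M_1 + (1-\theta) M_2)^{-1}(\theta M_1 v_1^* + (1-\theta) M_2 v_2^*)$. The matrices, archetypes, and function names are made up, and the deviation measure is a simple maximum distance from the chord rather than the root mean square figure quoted in the caption.

```python
import numpy as np

def two_task_front(a1, a2, m1, m2, n_points=51):
    """Trace the two-task front for quadratic distances with metrics m1 and m2."""
    points = []
    for theta in np.linspace(0.0, 1.0, n_points):
        # Stationarity: theta*m1 (v - a1) + (1 - theta)*m2 (v - a2) = 0.
        lhs = theta * m1 + (1.0 - theta) * m2
        rhs = theta * (m1 @ a1) + (1.0 - theta) * (m2 @ a2)
        points.append(np.linalg.solve(lhs, rhs))
    return np.array(points)

def max_deviation_from_chord(points, a1, a2):
    """Largest perpendicular distance of the front from the line through a1 and a2."""
    d = (a2 - a1) / np.linalg.norm(a2 - a1)
    rel = points - a1
    perpendicular = rel - np.outer(rel @ d, d)
    return float(np.linalg.norm(perpendicular, axis=1).max())

a1 = np.array([0.0, 0.0])
a2 = np.array([1.0, 1.0])

# Same metric for both tasks: the front is the straight segment.
same = two_task_front(a1, a2, np.eye(2), np.eye(2))
straight_dev = max_deviation_from_chord(same, a1, a2)

# Different elliptical contours for each task: the front bends.
curved = two_task_front(a1, a2, np.diag([1.0, 2.0]), np.diag([2.0, 1.0]))
curved_dev = max_deviation_from_chord(curved, a1, a2)
```

In both cases the front still starts and ends at the two archetypes; only the path between them changes.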

The present theory addresses traits that have a trade-off. If a trade-off does not exist, trait values can vary independently. Observed phenotypes in this case may fill an uncorrelated cloud in morphospace (8).

Variation in traits within a population in a given species often falls on the same line as variations between species—a phenomenon called “evolution along genetic lines of least resistance” (21). This can be explained by the present framework: Variation within a species reflects the range of habitats it inhabits, each with differential importance of the tasks. Thus, populations should be distributed on the same Pareto front as different species facing the same tasks.

Finally, Pareto optimality need not be the only or generic explanation for low dimensionality and lines/triangles in biological data. It may work for some examples and not others, especially if biological constraints other than natural selection are important. The following experimental tests can refute the theory in a specific example: (i) A point in the middle of the front has higher performance in one of the tasks than a point close to the relevant vertex (this might also imply that different tasks are at play). (ii) A mutant can be found that has higher performance at all tasks than existing phenotypes. Both of these tests require measuring performance (1, 7, 13), but not the more difficult task of measuring fitness. (iii) Laboratory evolution experiments can follow a mutant that is off the Pareto front (has lower performance in all tasks than the wild type), under conditions in which all tasks are required. Provided with sufficient genetic variation, the mutant is predicted to evolve phenotypes closer to the front.

Supplementary Materials

Materials and Methods

Figs. S1 to S24

Table S1

References (22–46)

References and Notes

  1. Supplementary materials are available on Science Online.
  2. Software that analyzes data in terms of triangles and their significance, and suggests archetype trait values, is available at
  3. Acknowledgments: We thank N. Barkai, R. Milo, O. Feinerman, C. Tabin, J. Losos, M. Khammash, T. Dayan, and N. Ulanovski for discussion. U.A. holds the Abisch-Frenkel Professorial Chair; O.S. is an Azrieli Fellow. This work was supported by the Israel Science Foundation and the European Research Council (FP7).