Mathematical and Computational Challenges in Population Biology and Ecosystems Science


Science  17 Jan 1997:
Vol. 275, Issue 5298, pp. 334-343
DOI: 10.1126/science.275.5298.334


Mathematical and computational approaches provide powerful tools in the study of problems in population biology and ecosystems science. The subject has a rich history intertwined with the development of statistics and dynamical systems theory, but recent analytical advances, coupled with the enhanced potential of high-speed computation, have opened up new vistas and presented new challenges. Key challenges involve ways to deal with the collective dynamics of heterogeneous ensembles of individuals, and to scale from small spatial regions to large ones. The central issues—understanding how detail at one scale makes its signature felt at other scales, and how to relate phenomena across scales—cut across scientific disciplines and go to the heart of algorithmic development of approaches to high-speed computation. Examples are given from ecology, genetics, epidemiology, and immunology.

Mathematical and computational approaches to biological questions, a marginal activity a short time ago, are now recognized as providing some of the most powerful tools in learning about nature; such approaches guide empirical work and provide a framework for synthesis and analysis (1, 2). In some areas of biology, such as molecular biology, the advent has been recent but rapid—for example, as an adjunct to the analysis of nucleic acid sequences or the structural analysis of macromolecules. In population biology, in contrast, the marriage between mathematical and empirical approaches has a century-long history, rich in tradition and in the insights it has provided. Statistics and stochastic processes, for example, derive their origins from biological questions, as in Galton's invention of the method of genetic correlations and Fisher's creation of the analysis of variance to study problems in agriculture (1). Branching processes were developed to describe genealogical histories, and even such classical subjects as dynamical systems theory have been enriched by contact with problems in population biology [see (3, 4)].

In recent years, the nature of the game has changed, primarily because of the availability of high-speed computation. Classical approaches to population biology—like classical approaches to other problems in biology—emphasized deterministic systems of low dimensionality, and thereby swept as much stochasticity and heterogeneity as possible under the rug. New techniques and the availability of more powerful computers have led to the development of highly detailed models in which a wide variety of components and mechanisms can be incorporated. In a model of animal grouping, every animal can be tracked; in a forest model, every tree; in an epidemiological model, every individual in the population.

Because models of this sort may provide an unjustified sense of verisimilitude, it is important to recognize them for what they are: imitations of reality that represent at best individual realizations of complex processes in which stochasticity, contingency, and nonlinearity underlie a diversity of possible outcomes. Individual simulations cannot be taken as more than representative of this diversity, but repeated simulations can provide statistical ensembles that contain robust kernels of truth. The problem becomes one of the central problems in science: determining what is signal and what is noise by understanding what detail at the level of individual units is essential to understanding more macroscopic regularities.
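The distinction between a single realization and a statistical ensemble can be made concrete with a toy branching process of the kind mentioned earlier. The sketch below is illustrative only and is not drawn from any of the models discussed here: the offspring rule (each individual leaves two offspring with probability `p_two`, otherwise none), the establishment cap, and all function names are assumptions chosen for simplicity. Any one run ends in either extinction or establishment, which says little on its own; the fraction of extinctions across many runs is the robust quantity, and for this rule it should approach (1 - p_two) / p_two when p_two exceeds one-half.

```python
import random


def lineage_goes_extinct(p_two, rng, cap=100, max_generations=1000):
    """Simulate one lineage; return True if it dies out.

    Each individual independently leaves 2 offspring with probability
    p_two and 0 otherwise (a hypothetical rule chosen for simplicity).
    A lineage that reaches `cap` individuals is treated as established.
    """
    n = 1
    for _ in range(max_generations):
        if n == 0:
            return True
        if n >= cap:
            return False
        n = 2 * sum(1 for _ in range(n) if rng.random() < p_two)
    return n == 0


def extinction_fraction(p_two, runs, seed=0):
    """Fraction of extinct lineages across an ensemble of independent runs."""
    rng = random.Random(seed)
    extinct = sum(lineage_goes_extinct(p_two, rng) for _ in range(runs))
    return extinct / runs


if __name__ == "__main__":
    p = 0.6
    print("ensemble estimate:", extinction_fraction(p, runs=2000, seed=1))
    print("analytical value: ", (1 - p) / p)
```

Single runs of `lineage_goes_extinct` disagree with one another; only the ensemble average is comparable with the analytical value.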

The issues raised above cut across population biology and ecosystems science, from the immune system to the biosphere. At each level, dynamics can be observed to emerge from the collective behaviors of individual units. The challenge, then, is to develop mechanistic models that begin from what is understood (or hypothesized) about the interactions of the individual units, and to use computation and analysis to explain emergent behavior in terms of the statistical mechanics of ensembles of such units. In the following sections, this challenge is examined for a range of scientific problems. Many of the ideas are explicated in more detail in (1) and represent conclusions derived more recently in (5). The areas discussed range across a spectrum of problems in population biology, from the populations of B cells and T cells in the immune system, to the variety of genotypes within a population, to the diversity of populations in the biosphere. Though the nature of the biological problems differs, the similarity is what stands out: An individual organism is a biosphere in miniature—with competition, exploitation, mutualism, succession, and nutrient cycling—that provides the stage for evolutionary changes on the small scale, including selfish and cooperative behaviors. Although the subdisciplines that are highlighted have their individual cultures and dynamics, the commonality of the mathematical and computational challenges can foster positive feedbacks that would otherwise not occur.


The characterization of ecological interactions provides one of the most venerable of venues for mathematical biology, dating back at least as far as Volterra's consideration of the fluctuations of the Adriatic fisheries. The challenges facing us today—for example, in the consideration of global change and the loss of biodiversity, and in achieving a sustainable future (6)—elevate the complexities to new levels.

General circulation models are providing detailed information on likely scenarios of climate change and the global fluxes of key elements such as carbon and nitrogen. Typically, the resolution of such models is at the scale of hundreds of kilometers; how then can we assess likely effects on natural and managed systems, where the scales of interest are typically on the order of meters or even centimeters? Even more difficult, how can we extrapolate from the level of effects on individual plants and animals to changes in the distribution of individuals over longer time scales and broader space scales, and hence in community-level patterns and the fluxes of nutrients?

Individual-based models, such as the forest growth simulators JABOWA (7), FORET (8), and SORTIE (9), provide a point of departure, but the amount of detail in such models cannot be supported in terms of what we can measure and parameterize. The result is that these models produce cartoons that may look like nature but represent no real systems. However, they do represent powerful experimental tools, which become more valuable when used to produce exhaustive simulations that allow exploration of parameter space and model structures; such models permit adequate representation of the full statistical ensemble of possible realizations associated with the many stochastic elements. The development of extensive sets of outputs from multiple runs forms the basis for extracting essential and more robust features that can be compared with data, and that can provide the foundation for simplification (10, 11). Simplification techniques may include familiar tools such as renormalization or moment closure (12) in approximations that present more interpretable representations of pattern and dynamics. Computation is an essential adjunct to analysis in developing and testing these approximations.

SORTIE provides a case study in the range of computational problems that can arise with ecological data. Designed to simulate the growth of northeastern forests, SORTIE is a stochastic and mechanistic model that follows the fates of individual trees and their offspring. It uses species-specific information on growth rates, fecundity, mortality, and seed dispersal distances, as well as detailed, spatially explicit information about local light regimes, which change in response to changing distributional patterns of nine dominant or subdominant species. The outputs are dynamic maps of tree species distributions that look like real forests (Fig. 1) and match data observed in real forests at appropriate levels of spatial resolution. Models of this sort, if verified, obviously provide powerful tools for prediction under various hypothetical scenarios of future climate change; more reliably, they provide tools for exploring hypotheses regarding the mechanisms underlying the maintenance of biodiversity and ecosystem processes.

Fig. 1.

Visualization of a 9-hectare SORTIE forest, 500 years into the simulation. Each cylinder represents an individual, where height and cylinder diameter are based on species-specific parameters (96). Green, Eastern hemlock; purple, beech; yellow, yellow birch.

Yet it is fair and important to ask how seriously such predictions should be taken. Surely, such models should not be expected to predict where every tree will be at each point in time; only aggregate statistical properties can be reliably predicted, typically over broad spatial and temporal scales. The great detail regarding local light regimes may be important to the growth of individual trees, but forest dynamics can respond in predictable ways only to more general features of light regimes. To derive robust statements about these systems, it is essential to understand what detail at the local level affects the broader scale patterns, and what is noise.

One approach to this problem [for example, (10)] is to carry out extensive simulations in which different degrees of smoothing and aggregation are used, to determine how much information is lost by averaging, and to find out where error is compressed and where it is enlarged in the course of this process. SORTIE typically involves tens of thousands of trees, each having an associated light regime resolved into 216 pixels. The magnitude of the system requires high computational power even for individual simulations; the tasks described above magnify this challenge by requiring exploration of statistical ensembles through multiple runs and complex statistical analyses. Simulations carried out for heterogeneous environments require an interface between large dynamic simulations and geographic information systems, providing real-time feedbacks between the two. In some cases, these tasks simply involve known techniques and many cycles, and in other cases they involve the development of new algorithms. There are many outstanding theoretical challenges.

Simplification through extensive simulations is a powerful brute-force method, but the development of analytical approaches to simplification transforms art into science. Again, there is the need both for adapting existing methodologies and for developing new ones. SORTIE may provide the starting point, but abstracted analytical descriptions can potentially reproduce essential qualitative features, and thereby provide more robust and interpretable descriptions of vegetational dynamics. Evaluation of such simplifications requires the output from extensive simulations, the numerical solutions of coupled partial-differential integral equations (11, 13), and the development of theoretical generalizations that may raise sophisticated mathematical challenges. The richness of mathematical and computational issues is matched only by the great potential for increasing our ability to understand and predict the dynamics of forests. Moreover, the creation of interfaces between the self-organizing dynamics implicit in these models and the imposed environmental regimes derived from geographical information systems, remote sensing, or the output of climate models allows exploration of the interplay between intrinsic and extrinsic factors in shaping vegetational patterns.

Global change and vegetational responses to it provide one set of challenges, but similar issues exist in the description of other ecological phenomena. Populations typically are made up of diverse and heterogeneous assemblages of individuals, each with unique characteristics. As such, they differ from the more uniform assemblages usually treated in statistical mechanics, but the challenges are similar. How do we represent the mean dynamics of such heterogeneous assemblages without retaining all of the detail, much of it irrelevant to the essential dynamics? How much information, beyond variances and covariances, do we need to retain in order to provide reasonable descriptions, and how can we close up those descriptions in terms of the dynamics of the higher moments? Similar questions exist classically not only in the physical sciences, but also in evolutionary biology (14). Evolution feeds off the variances and covariances within populations, and in return helps to shape that variance-covariance structure. The recognition of this phenomenon, and of ways to deal with it, has provided some of the most powerful approximations to the dynamics of quantitative inheritance.
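A small numerical illustration may help show why variances enter the mean dynamics of a heterogeneous assemblage. The sketch below assumes, purely for illustration, that each of several patches grows logistically with the same hypothetical parameters `r` and `k`; none of this is taken from the models cited above. Because logistic growth is quadratic in density, the average growth rate across patches equals the growth rate evaluated at the mean density minus a correction proportional to the between-patch variance. A mean-field description that ignores the variance therefore overestimates growth.

```python
from statistics import fmean, pvariance


def logistic_growth(n, r, k):
    """Per-patch growth rate r * n * (1 - n / k)."""
    return r * n * (1.0 - n / k)


def mean_growth_exact(densities, r, k):
    """Average the growth rate over every patch (retains all detail)."""
    return fmean([logistic_growth(n, r, k) for n in densities])


def mean_growth_mean_field(densities, r, k):
    """Evaluate the growth rate at the mean density (ignores heterogeneity)."""
    return logistic_growth(fmean(densities), r, k)


def mean_growth_second_moment(densities, r, k):
    """Mean-field rate corrected by the between-patch variance."""
    correction = r * pvariance(densities) / k
    return mean_growth_mean_field(densities, r, k) - correction


if __name__ == "__main__":
    patches = [10.0, 50.0, 90.0]  # hypothetical densities
    for f in (mean_growth_exact, mean_growth_mean_field,
              mean_growth_second_moment):
        print(f.__name__, f(patches, r=1.0, k=100.0))
```

In this example the mean and variance suffice to describe the mean growth rate, but the rate of change of the variance itself depends on the third moment, and so on upward. This is where a closure assumption becomes necessary.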

The maintenance of biological diversity and approaches to sustainable use raise similar issues. The heterogeneous distribution of resources and exploiters is a fact of overwhelming importance to understanding dynamic interactions, as well as an ecological and evolutionary consequence of those interactions (11, 13). Thus, the description of the dynamics of aggregations of fish, krill, birds, or foraging vertebrates requires an understanding of how factors at the level of individuals determine the cohesion, fusion, and fission of groups, and of the consequences of those processes and patterns for ecological interactions such as harvesting for food or predation. Again, a powerful starting point is the individual: Lagrangian descriptions of individual movements make attractive cartoons (15) and can provide a basis for analysis; and again, extensive simulations can provide the foundation for the exploration of robust cause-and-effect relations and for the extraction of statistical mechanical and Eulerian field descriptions that capture the essence of the dynamics. As with the vegetational systems, the interplay between extrinsic and intrinsic factors can also be explored through computation, for example by imposing flow regimes derived from the Navier-Stokes equations upon the dynamics of attraction and repulsion in marine systems (16).

Spatial heterogeneity is the most obvious of ways that nonuniform distributions may be important, but other dimensions provide even greater challenges. In epidemiology (see below), heterogeneous mixing among different risk groups can provide a fundamentally altered view of disease dynamics, especially for sexually transmitted diseases (STDs). Regarding biological diversity, although it is widely acknowledged that species are being lost at rates never before experienced, what is equally important is the loss of diversity at other scales—not only within species (genetic diversity, or even simply the loss of populations), but also within functional groups of species performing essential ecosystem functions. The most important consequences of the disappearance of biodiversity may be in the loss of such ecosystem services as the maintenance of fluxes of nutrients and pollutants, the mediation of climate and weather, and the stabilization of coastlines.

In developing priorities for the conservation of biodiversity, it becomes important to identify and understand the most fragile and critical components of ecological systems, in terms of their capability to sustain these services. Again, this means understanding the degree to which aggregate behavior is linked to the dynamics of higher moments representing distributional features. The approach is the same as discussed previously [for example, (17)]: extensive simulations of detailed models, comparison with aggregated models, and the development of rules for relating these models to one another and for providing simplified descriptions. In all of these problems, there are common mathematical and computational challenges that range from techniques for representing and accessing data sets, to algorithms for simulation of large-scale spatially stochastic systems, to the development and analysis of simplified descriptions. These themes will reappear below.

Genetics and Evolution

The heritage of mathematics in evolutionary and genetic studies has been extraordinary, beginning with the work of the three giants—Fisher, Haldane, and Wright—and continuing to this day. Although much of the basic framework of population genetics thus has roots deep in the history of the subject, contemporary questions ranging from the very basic (18) to the applied [for example, conservation biology (19) and the use of transgenic organisms] are raising new and important mathematical challenges. Despite the relative simplicity of the underlying genetic models, complexities ranging from multiple loci to spatial factors to the role of frequency dependence in evolution (20) lead to problems that require sophisticated computational approaches. The considerations underlying the management and analysis of genetic sequence data are well known; hence, the following discussion focuses on other facets of evolution and genetics that lead to deep computational and mathematical challenges, especially regarding dynamics.

Although the dynamics of alleles at single loci were well understood in the 1920s, the inclusion of just one more locus leads to models whose dynamics are still not completely understood, even in the deterministic case (21). A full understanding of the behavior of these two-locus models has required the use of a variety of computational approaches, from straightforward simulation [for example, (22)] to more complex analyses based on optimization (23) or the use of computer algebra systems. The consideration of as few as three loci leads to models whose behavior can only be understood by means of numerical approaches, except for some very special cases (21, 24); yet the number of loci exhibiting genetic variation in populations of higher organisms is well into the thousands. Including all this complexity leads to the consideration of populations in which the number of possible genotypes could be much larger than the population. Thus, stochastic effects become paramount, and even the simulation of such populations (25) leads to problems of substantial computational difficulty (26).
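For orientation, the selection-free baseline of the two-locus model is simple enough to iterate directly. The following sketch assumes random mating, two alleles per locus, no selection, and a recombination fraction `r`; the function and variable names are hypothetical. Under these assumptions the linkage disequilibrium D shrinks by a factor (1 - r) each generation while allele frequencies stay fixed. The incompletely understood behavior described above arises only once selection is added to this recursion.

```python
def disequilibrium(gametes):
    """Linkage disequilibrium D for gamete frequencies (AB, Ab, aB, ab)."""
    g_AB, g_Ab, g_aB, g_ab = gametes
    return g_AB * g_ab - g_Ab * g_aB


def next_generation(gametes, r):
    """One generation of random mating with recombination fraction r.

    No selection, mutation, or drift is included (simplifying assumptions).
    """
    g_AB, g_Ab, g_aB, g_ab = gametes
    d = disequilibrium(gametes)
    return (g_AB - r * d, g_Ab + r * d, g_aB + r * d, g_ab - r * d)


def iterate(gametes, r, generations):
    """Apply the recursion repeatedly and return the final frequencies."""
    for _ in range(generations):
        gametes = next_generation(gametes, r)
    return gametes


if __name__ == "__main__":
    start = (0.5, 0.0, 0.0, 0.5)  # complete association, D = 0.25
    end = iterate(start, r=0.1, generations=10)
    print("D after 10 generations:", disequilibrium(end))
    print("predicted (1 - r)^10 D:", 0.25 * 0.9 ** 10)
```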

Faced with the impossibility of constructing a theory of evolution of characters controlled at many loci by detailed consideration of what is going on at each locus, evolutionary biologists have turned to more macroscopic representations at the level of the phenotype, an attractive option because of the ease of observation and description. The simplest such approaches involve quantitative traits, such as height or weight, or other traits of ecological interest that represent the sum of multiple small effects. Recently, there have been substantial efforts (14) to integrate the long tradition of using statistical approaches to model the dynamics of quantitative traits with the more mechanistic genetic approaches, and hence to provide a rigorous basis for treating quantitative traits. The problem of closure arises again, and even under simplifying assumptions concerning the relation between genotype and phenotype, further approximations are required to obtain a closed system of equations (14, 27). Confirmation of the appropriateness of these approximations ultimately rests on comparisons with both natural and artificial populations as well as on the results of computer simulations.

The study of complex adaptations can lead to questions about the evolution of evolvability itself (28, 29). How does selection act to modify the capability of organisms to adapt to changing environments? This can become an extraordinarily complex question; one intriguing avenue to identifying the kinds of questions that arise has been to create “artificial life” through computer simulations [for example, (30)], and hence to explore how the rules that govern evolution develop and become modified. Often, the resulting simulations are so seductive that the boundary between truth and fiction becomes blurred, but the potential for developing novel insights cannot be denied. The computational problems that arise are substantial and are leading to innovations in programming.

The flow between computation and biology is not one-way; as in the example of artificial life, computation can draw inspiration from biology. A case in point involves the invocation of evolutionary processes that use a variety of distinct approaches (29, 31, 32), all of which have at least some of the formal structure of genetic systems, to solve very complex optimization problems by identifying strategies with computer “genotypes.” For various reasons, the solutions found by such approaches may bear little similarity to how natural selection solves similar problems (32). Historically, the search for optimization principles to apply to natural evolutionary systems has had limited success, largely because of frequency dependence (the dependence of relative fitnesses on the frequencies of types in the populations); that is, evolution is best understood as a problem in game theory rather than optimization theory.

To address problems of frequency dependence, which arise naturally in the consideration of most interesting ecological problems, Maynard Smith introduced the notion of an evolutionarily stable strategy (ESS) (33), which has been used extensively to understand the evolution of behavior, especially altruistic behaviors. An elegant theory developed by Hamilton (34) based on inclusive fitnesses can explain why individuals might forgo their own fitnesses to help relatives, but the evolution of altruism between unrelated individuals is much more difficult to explain.

The central issue in the evolution of altruism is to determine how cooperation can evolve through individual selection. A simple model system is provided by the familiar game of prisoner's dilemma, for which the game theoretic solution (for a single encounter) is noncooperation (Fig. 2). Evolutionary biologists have been able to explain the evolution of altruism by focusing on multiple repetitions of the games and on correlations that arise in time or space; such correlations affect realized payoffs because they affect who plays with whom.

Fig. 2.

Payoff matrix in the prisoner's dilemma game, where each box lists the payoff to player 2 when players 1 and 2 play the pair of strategies indicated [redrawn from (97)]. The game is a prisoner's dilemma if the reward for cooperation is greater than the average of the sucker's payoff and the temptation payoff, and the payoffs are ordered so that temptation payoff > reward for cooperation > punishment > sucker's payoff. In an evolutionary sense, the problem is to explain how strategies involving cooperation among nonrelated individuals evolve.
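The two conditions in the caption translate directly into a short check. The sketch below is only an illustration; the function name and the numerical payoffs used to exercise it are assumptions, not values taken from the figure.

```python
def is_prisoners_dilemma(temptation, reward, punishment, sucker):
    """Return True if the four payoffs define a prisoner's dilemma.

    Condition 1: temptation > reward > punishment > sucker.
    Condition 2: reward exceeds the average of temptation and sucker,
    so that mutual cooperation beats taking turns exploiting each other.
    """
    ordered = temptation > reward > punishment > sucker
    cooperation_beats_alternation = reward > (temptation + sucker) / 2
    return ordered and cooperation_beats_alternation


if __name__ == "__main__":
    # Illustrative payoffs only.
    print(is_prisoners_dilemma(5, 3, 1, 0))  # both conditions hold
    print(is_prisoners_dilemma(5, 2, 1, 0))  # ordering holds, averaging fails
```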

In particular, when the game is played repeatedly, as in the iterated prisoner's dilemma (35), it can be shown that tit-for-tat, which consists of beginning with cooperation and then copying the move made by the other “player” in the previous interaction, is better than the pure defecting strategy [and that no pure strategy is an ESS (36)]. Sophisticated simulations (37) allow exploration of more complex ESSs in which individuals remember past interactions, and the result is a greater ease of evolving cooperative strategies. Spatial localization of interactions further increases the probability that the same partners will play the game repeatedly, and thereby facilitates the evolution of cooperation.
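The sense in which tit-for-tat outperforms pure defection can be seen in a minimal simulation of the repeated game. The sketch below is an illustration under stated assumptions: the payoff values, the number of rounds, and all names are hypothetical, and the strategies see only the opponent's past moves. In a direct pairing, tit-for-tat loses narrowly to a defector because it is exploited in the first round, yet two tit-for-tat players earn far more from each other than two defectors do. A rare defector in a tit-for-tat population therefore does worse than the residents.

```python
# Payoffs (to player a, to player b) for each pair of moves; values are
# illustrative and satisfy the prisoner's dilemma conditions.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate first, then repeat the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """Defect regardless of what the opponent has done."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play the iterated game and return the total payoffs (a, b)."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        payoff_a, payoff_b = PAYOFF[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print("TFT vs TFT:", play(tit_for_tat, tit_for_tat, 10))
    print("TFT vs defector:", play(tit_for_tat, always_defect, 10))
    print("defector vs defector:", play(always_defect, always_defect, 10))
```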

In general, the introduction of explicit space produces further complications, leading to results that depend fundamentally on population structure and movement rules. The underlying principle is that the evolution of traits for which fitnesses are frequency-dependent requires knowledge of which individuals are interacting; thus, for large populations, simulations (38, 39) are needed to understand dynamics in spatially structured populations. Prisoner's dilemma is a caricature, and more biologically relevant studies are beginning to show the importance of the spatial localization of interactions in the evolution of both cooperative and antagonistic behaviors (38, 40). Substantial questions remain to be explored, including the evolution of more complex behaviors [for example, (41)] and coevolutionary questions. For parasite-host systems, the problem has been well studied [for example, (42)], but more diffuse interactions involving many species introduce challenges similar to those that arise in going from two loci to many loci. Fundamental challenges exist in understanding how community properties emerge from the evolution of component species, an issue that is at the core of research into biodiversity.

Infectious Diseases

The mathematical theory of the population biology of infectious diseases dates back at least as far as Daniel Bernoulli's mathematical analysis of smallpox control in 1760. The main impetus for this highly successful field has been the great impact of disease on human health and agriculture, both historically and in facing the threat of acquired immunodeficiency syndrome (AIDS) and other emerging diseases. However, parasite ecology—which effectively links ecological and immunological dynamics—also presents a number of fundamental questions for mathematical and computational research. Simple models have been remarkably successful in capturing many features of host-parasite dynamics and control (43, 44). However, as with ecology, the interaction between spatial and genetic heterogeneity, nonlinearity, and stochasticity can complicate this picture.

A major preoccupation for epidemiological modeling is how transmission varies with social or geographical space (44, 45). A key theoretical issue here is how, and in what detail, to represent spatial variations in the intrinsically nonlinear contact process underlying transmission. One of the best illustrations of this process is provided by the highly dynamic spatiotemporal epidemic pattern of measles (Fig. 3) (46, 47, 48, 49). An important set of analyses of simple, homogeneous models predicted the possibility of chaotic dynamics (50); however, the resulting large-amplitude epidemics generate unrealistically low persistence of infection in small communities (51). Adding successive layers of social and geographical space—and moving from deterministic to stochastic models—improves spatial realism and may reduce the propensity for chaos (46, 47, 52, 53, 54).

Fig. 3.

The spatiotemporal dynamics of measles illustrate the major open computational problems presented by heterogeneities in infectious disease transmission. (A) Time series of total weekly measles notifications for 60 towns and cities in England and Wales, for the period 1944 to 1994; the vertical blue line represents the onset of mass vaccination around 1968. (B) An image plot, showing the breakdown of cases for individual centers, ranked by population size; white indicates zero notifications, and other colors represent cases on a visible light scale, from red (small) to green and blue (large). The large-scale prevaccination dynamics are well represented by age-structured deterministic models (46, 47, 54, 98, 99, 100, 101). (C and D) Pattern comparisons. The average observed biennial pattern (± SE) is compared in (C) with the limit cycles of the best-fit deterministic model (solid line) [see (47, 98) for more details]. By contrast with homogeneous models, which tend to predict large-amplitude chaotic dynamics (50), this age-structured formulation indicates that stochastically perturbed coexisting limit cycles may be the norm (102). A horizontal line in (B) marks the population threshold, the critical community size (CCS) (103), above which measles persisted endemically, without local extinction of infection, in the prevaccination era. Recent developments of stochastic models can begin to capture this threshold (54), though much more needs to be done in explaining fully the complex spatiotemporal structure summarized in (B). The framing of explicit spatial structure, whether as “patch” models (46, 99), pair approximations to individual-level interactions (104), or power-law approaches to irregular epidemics in small populations (105), shows promise for exploring the persistence of measles metapopulations (101).
However, modeling anything approaching the full hierarchical spatial dynamics will require refinements to computational and analytical approaches and to nonlinear statistical analyses of the balance between deterministic and stochastic dynamics (58). One of the most interesting questions for such models is to explore the emergent spatial effects of vaccination. The CCS in (B) remained remarkably constant during most of the vaccine era (53, 106), and there was considerable persistence of measles even in the 1990s when high vaccine uptake greatly reduced its incidence. Preliminary analyses (53) indicate that this may be the result of a “rescue effect” arising from the observed decorrelation of epidemics caused by vaccination (107). This is illustrated in (D), which shows how simulated epidemics in two coupled centers (center 1, solid blue line; center 2, dashed red line) show global extinctions of infection, when the epidemics are in phase (top panel). Moving the epidemics out of phase (bottom panel) eliminates fadeouts attributable to cross-infection between the centers; details are given in (53). Global fadeouts of infection in the two centers are denoted by breaks in the green line on the time axis. The triangle and X in each panel illustrate the phase shift; these points, which are a year apart in the top panel, are brought together in the bottom panel by shifting center 2's dynamics forward in time. Long-term changes in the availability of susceptibles, as a result of birth rate trends, can also affect the spatiotemporal dynamics of infection in complex ways (108).

The major computational challenge in these highly nonlinear stochastic systems is to represent hierarchical spatial complexity and especially its impact on vaccination strategies. Depending on the problem, all scales—from the individual level to big cities—may be important, both in terms of social space [family and school infection dynamics (55)] and in terms of geographic spread and coherency (Fig. 3). As in ecology and evolution, a central question is: How spatially aggregated and parsimonious a model can provide useful results in a given context? This is particularly important in comparisons between directly transmitted human infections—where long-range movements may bring infection dynamics comparatively close to mean field behavior (in which every individual is assumed to have equal contact with every other individual, thus experiencing the mean or average field)—and the equivalent infections in natural populations, where more restricted movements and host population dynamics add extra complexities (56).
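The mean field limit described above corresponds to the simplest compartmental epidemic model, in which susceptible, infectious, and recovered fractions mix uniformly. The sketch below integrates that model with a fixed-step Euler scheme. It is a baseline for comparison rather than any of the structured models cited here, and the parameter names (`beta` for the transmission rate, `gamma` for the recovery rate), their values, and the step size are all illustrative assumptions.

```python
def simulate_sir(beta, gamma, i0=1e-3, days=400.0, dt=0.1):
    """Integrate the mean-field SIR model; return final (s, i, r) fractions.

    Every individual is assumed to contact every other equally, so new
    infections occur at rate beta * s * i and recoveries at gamma * i.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    for _ in range(int(round(days / dt))):
        new_infections = beta * s * i * dt
        recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - recoveries
        r += recoveries
    return s, i, r


if __name__ == "__main__":
    # beta / gamma above 1: a large epidemic; below 1: infection fades out.
    print("beta/gamma = 2.0:", simulate_sir(beta=0.5, gamma=0.25))
    print("beta/gamma = 0.8:", simulate_sir(beta=0.2, gamma=0.25))
```

Spatial or social structure can be thought of as replacing the single product `beta * s * i` with a sum over groups or locations, which is where the questions of aggregation raised above begin.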

It is risky to model at a given level of detail without having data at the relevant spatial grain. Notifiable infectious diseases are unusually well provided here (Fig. 3), with large and often as yet uncomputerized spatiotemporal data sets. These data provide a huge potential testbed for developing methods for characterizing spatiotemporal dynamics in nonlinear, nonstationary stochastic systems. An encouraging development is that the current, generally nonparametric, approaches to characterizing chaos and other nonlinear behaviors are increasingly incorporating lessons from mechanistic epidemiological models (49, 57, 58).

The main focus for modeling social space (the space of social interactions) and disease is, of course, on AIDS and other sexually transmitted infections. Simple models illustrated clearly that heterogeneities in contact rates can substantially alter the predicted course of epidemics (43). This area has seen an explosion of research, both in data analysis of contact structures and in graph-theoretic and other approaches to modeling (43, 59, 60). Models and data analysis are most productive when combined, especially in allowing the observations to limit the universe of possible networks. The major computational challenge is how to deal with the complexity of networks, where concurrency of partnerships often means that closure to a few moments of the distribution is difficult (60). This problem is especially acute given the sensitivity of obtaining data for STD networks, in that the nature of the network is generally only partially and imperfectly known (61). The use of mathematical models for human immunodeficiency virus (HIV) transmission will be especially important in assessing the impact of potential vaccines (62). Another major computational challenge—which developed with the AIDS epidemic and is currently being applied to another pathogen, the bovine spongiform encephalopathy agent (63)—is to estimate the parameters of transmission models from disease incidence and other demographic data.
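One standard way of seeing how contact heterogeneity alters predictions, offered here as background rather than as a result stated in this article, is the effective contact rate under proportionate mixing: the mean rate plus the ratio of the variance to the mean. Individuals with many partners are both more likely to be infected and more likely to transmit, so the variance inflates the rate that governs early epidemic growth. The function name and the example distributions below are hypothetical.

```python
from statistics import fmean, pvariance


def effective_contact_rate(rates):
    """Mean + variance / mean of the contact-rate distribution.

    Assumes proportionate (random) mixing, in which partners are chosen
    in proportion to their own contact rates.
    """
    mean_rate = fmean(rates)
    return mean_rate + pvariance(rates) / mean_rate


if __name__ == "__main__":
    uniform = [2, 2, 2, 2]  # everyone has the same rate
    skewed = [1, 1, 1, 5]   # same total contacts, one highly active person
    print("uniform:", effective_contact_rate(uniform))
    print("skewed: ", effective_contact_rate(skewed))
```

The expression relies on only the first two moments of the distribution. The concurrency of partnerships is exactly the kind of network feature that such a low-order summary cannot capture.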

One hope for the future for both of these areas is network information embedded in viral genomes. A body of recent work indicates exciting possibilities for estimating epidemiological parameters from the birth and death processes of pathogen evolutionary trees (64). More generally, new mathematical and computational techniques will be needed to understand the epidemiological implications of the rapidly accumulating data on pathogen sequences, especially in the context of parasite genetic diversity and the host immunological response to it (65).

The other major area of current epidemiological interest, the impact of host and parasite genetic heterogeneity and coevolution (66), has a distinguished history in population genetics and epidemiology. However, the revolution in both genome research and molecular epidemiology is now providing the foundations for much more detailed explorations of the dynamics of host and parasite strains. An important linked area here is the question of immunoepidemiology (67)—modeling the population-dynamic implications of the immunological processes described in the next section. These approaches come together, for example, in recent work on the strain dynamics of malaria (68), in which models of observed strain and immunological variation indicate a set of cocirculating strains rather than the traditional homogeneous picture of a single, highly transmissible entity.

The major computational question is again to represent hierarchical spatial dynamics, but with the added problem (and hence the added dimensionality) of complex within-host dynamics and host-parasite genetic diversity. The genetic dynamics of a wide variety of pathogens, from influenza (69) and HIV to macroparasitic worms (70) and plant parasites (66), have major implications for the dynamics of control, the evolution of resistance, and the emergence of new pathogens.

These issues present a range of technical computational problems in the assimilation and analysis of data and model construction. For instance, moment closure (12) is a promising possibility for approximating the relatively smooth stochastic dynamics of helminth worm infections and some plant pathogens (66). By contrast, the spiky dynamics and frequent local extinctions of infection in measles and influenza seem to require more computer-intensive simulation approaches.
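One example of a computer-intensive simulation approach is an event-by-event stochastic simulation (the Gillespie algorithm). The sketch below applies it to a closed SIR (susceptible, infected, recovered) epidemic and counts how often the infection fades out before causing a large outbreak. It is a minimal illustration under strong assumptions (no births, no spatial structure, homogeneous mixing), and the function names and parameter values are hypothetical rather than taken from the cited studies.

```python
import random


def gillespie_sir(s, i, r, beta, gamma, rng):
    """Exact stochastic simulation of a closed SIR epidemic.

    Returns a list of (time, S, I, R) tuples, one per event,
    ending when no infected individuals remain.
    """
    n = s + i + r
    t = 0.0
    path = [(t, s, i, r)]
    while i > 0:
        infection_rate = beta * s * i / n
        recovery_rate = gamma * i
        total_rate = infection_rate + recovery_rate
        t += rng.expovariate(total_rate)  # waiting time to the next event
        if rng.random() < infection_rate / total_rate:
            s, i = s - 1, i + 1           # infection event
        else:
            i, r = i - 1, r + 1           # recovery event
        path.append((t, s, i, r))
    return path


def fadeout_fraction(n_runs, population, beta, gamma, threshold, seed=0):
    """Fraction of runs, each started from one infective, that end with
    fewer than `threshold` individuals ever infected."""
    rng = random.Random(seed)
    fadeouts = 0
    for _ in range(n_runs):
        path = gillespie_sir(population - 1, 1, 0, beta, gamma, rng)
        if path[-1][3] < threshold:
            fadeouts += 1
    return fadeouts / n_runs


# Illustrative values only: R0 = beta / gamma = 2 in a population of 200.
print(fadeout_fraction(n_runs=200, population=200, beta=2.0, gamma=1.0, threshold=40))
```

Even when the deterministic model predicts a major epidemic, a substantial fraction of stochastic runs die out early, which is the kind of behavior that moment equations for means and variances capture poorly.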

Over the next few years, we foresee further major development in computational approaches to the complexities of host-parasite spatial and genetic dynamics. Two areas that are likely to be of particular interest are integrating dynamics at the epidemiological, genetic, and immunological levels and exploring the new dynamical properties of systems revealed by parasite control strategies (Fig. 3). In terms of impact on human welfare, research on the dynamics of infectious diseases in developing nations is an important priority.

Immunology and Virology

Historically, mathematical and computational methods have not played a large role in immunology and virology. This is now changing, and impressive advances have come from the use of simple models applied to the interpretation of quantitative data.

The best example is in AIDS research. As is well known, AIDS develops slowly; the average time from HIV infection to the development of full-blown AIDS is about 10 years. Modeling of the progression to AIDS has received considerable attention and has been able to capture much of the observed phenomenology (71, 72). The suggestion that progression to AIDS involves a diversity threshold (72) has generated debate, new theory, and new experimentation (73). The role of the immune response in determining the pace of disease progression has yet to be clarified, but mathematical modeling has helped focus attention on the role of cytotoxic T cells (74, 75). Other key areas in which modeling has played and will continue to play an important role are the understanding of how HIV evolves resistance to antiretroviral drugs and the design of treatment strategies (76).

Much of the 10-year period until AIDS develops has been characterized as a period of clinical latency, with low but constant levels of virus and infected cells in circulation. Giving HIV-1-infected patients potent antiretroviral drugs and using simple dynamical models to analyze the ensuing decline in viral load has led to important insights into the in vivo processes involved in HIV infection. This analysis established that HIV is rapidly replicating and cleared from the body (77) and revealed that the average rate of HIV production was greater than 10 billion virus particles per day, that free virus particles were cleared with a half-life that is probably 6 hours or less, and that productively infected T cells had a life-span of about 1.5 days (78). These results, which derive from mathematical modeling, firmly put to rest the view of AIDS as a slow disease in which little happens for years after infection, and replaced it with a new paradigm in which rapid viral dynamics was the centerpiece. Most important, uncovering the rapid replication of HIV led to a new understanding of the observed rapid evolution of the virus and the seemingly inevitable emergence of drug-resistant forms of HIV-1. In part as a result of this increased understanding, treatment protocols using a single drug are being replaced by protocols using combinations of antiretroviral drugs, which have a greater antiretroviral effect and which increase the number of mutations needed for resistance. The early clinical results of combination therapy, along with mathematical modeling, have now been used to obtain minimal estimates for how long therapy needs to be maintained until HIV is eliminated from the body (79).
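The logic of such an analysis can be sketched with a simple two-compartment model, given here as a generic textbook simplification and not necessarily the exact model used in (77, 78). It assumes that therapy completely blocks new infections from time zero, that the system was at steady state before treatment, that productively infected cells die at rate delta, and that free virus is cleared at rate c. The function names and the starting viral load are hypothetical, and the quoted life-span of 1.5 days is interpreted here as a mean lifetime.

```python
import math


def clearance_rate_from_half_life(half_life_days):
    """Convert a half-life (days) into a first-order rate constant (per day)."""
    return math.log(2) / half_life_days


def viral_load_after_block(t, v0, c, delta):
    """Viral load t days after new infections are fully blocked.

    Solves dT*/dt = -delta * T* and dV/dt = p * T* - c * V, starting
    from the pre-treatment steady state p * T*(0) = c * V(0).
    Requires c != delta.
    """
    return v0 * (c * math.exp(-delta * t) - delta * math.exp(-c * t)) / (c - delta)


c = clearance_rate_from_half_life(6.0 / 24.0)  # half-life of 6 hours
delta = 1.0 / 1.5                              # mean infected-cell life-span of 1.5 days
v0 = 1.0e5                                     # hypothetical starting load (copies per ml)

for day in (0, 1, 2, 4, 7):
    print(day, f"{viral_load_after_block(day, v0, c, delta):.3g}")
```

Because clearance of free virus is much faster than the death of infected cells in this parameter range, the late decline of the viral load is governed mainly by delta, whereas the brief initial shoulder reflects c. In this simplified picture, the steady-state production rate equals c multiplied by the total amount of virus in the body, which is how a rapid clearance rate implies rapid production.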

The new finding that HIV uses two receptors for entry into target cells—a primary receptor (CD4) and a coreceptor [a chemokine receptor, either fusin (now renamed CXCR4) or CCR5] (80)—provides new challenges and opportunities for modeling. Using concepts from population genetics, researchers have argued that individuals who are homozygous for a 32-nucleotide deletion in the CCR5 gene are resistant to HIV-1 infection and otherwise show no drastic decrease in fitness as a result of this deletion (81). The homozygous defect is found in approximately 1% of Caucasians of Western European ancestry (81). Models of HIV-1 dynamics have assumed that infection is a single-step process. New models need to account for coreceptors and for the interesting finding that high-affinity binding of HIV-1 gp120 to the first HIV receptor, CD4, causes conformational changes in gp120 that lead to the creation of a new recognition site on gp120 for CCR5 (82). Lastly, CCR5 has been identified as the major coreceptor for macrophage-tropic HIV-1 strains. Although some mathematical models have considered macrophage infection (79, 83), none yet have incorporated coreceptors.
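As a small worked example of the population-genetic reasoning involved, the homozygote frequency quoted above can be converted into an allele frequency. The sketch assumes Hardy-Weinberg proportions (random mating and no selection on the locus), which is an assumption introduced here for illustration and not a claim made in the cited work; the function name is likewise hypothetical.

```python
import math


def hardy_weinberg_from_homozygotes(homozygote_freq):
    """Infer allele and genotype frequencies from the frequency of
    homozygotes for the deletion, assuming Hardy-Weinberg proportions."""
    q = math.sqrt(homozygote_freq)  # frequency of the deletion allele
    p = 1.0 - q                     # frequency of the wild-type allele
    return {
        "deletion_allele_freq": q,
        "heterozygote_freq": 2.0 * p * q,
        "wildtype_homozygote_freq": p * p,
    }


result = hardy_weinberg_from_homozygotes(0.01)
for name, value in result.items():
    print(name, f"{value:.2f}")
```

Under this assumption, a homozygote frequency of about 1% corresponds to a deletion allele frequency of about 10% and a heterozygote frequency of about 18%.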

Opportunities also exist for modeling to provide insights into the dynamics of other infectious diseases. Hepatitis, which currently infects more than 250 million people worldwide, is an important target for modeling, and work in this direction has begun (84). Models that incorporate immune responses and deal with the issue of drug resistance that can arise during treatment are of great importance and can yield insights into treatment strategies for tuberculosis, HIV, and other infectious agents (76, 85).

Spatial considerations, which play a large role in ecological and epidemiological modeling, also enter into virological and immunological problems. For example, in humans, detection of virus is most easily done in the blood, yet virus can be distributed throughout the body. Models and experiments now need to address the question of observability—that is, how well do measurements in blood reflect other compartments? New experiments and models are being designed that take into consideration bodily compartments where virus and T cells are found, for example, lymph nodes (86). Also, because drugs are transported through tissues, drug concentrations vary in space and time. Models need to be developed that allow for drug transport and differing concentrations at different locations, although some modeling has been initiated in other contexts (87). Such models are particularly relevant for agents such as monoclonal antibodies that can rapidly bind to cells as they move through tissue (88). The implication of spatial and temporal gradients for the generation and selection of drug-resistant organisms needs to be examined.

In basic immunology, issues related to mutation also have been the focus of mathematical modeling and intense experimentation (89, 90). During the course of an immune response, B lymphocytes within germinal centers can rapidly mutate the genes that code for antibody variable regions. The immune system thus provides an environment in which evolution occurs on a time scale of weeks. Among the large number of mutant B cells that are generated, selection chooses for survival those B cells that have increased binding affinity for the antigen that initiated the response. After 2 to 3 weeks, antibodies can have improved their equilibrium binding constant for antigen by one to two orders of magnitude, and may have sustained as many as 10 point mutations. How can the immune system generate and select variants with higher fitness this rapidly and this effectively? An optimal control model has suggested that mutation should be turned on and off episodically in order to allow new variants time to expand without being subjected to the generally deleterious effects of mutation (90). Time-varying mutation could be implemented by having cells recycle through one region of the germinal center, mutating while there, and proliferating in a different region of the germinal center (90). This suggestion has generated new experimental investigations of events that occur within germinal centers (91). Opportunities exist for a range of models that address basic questions about in vivo cell population dynamics and evolution, as well as more detailed questions involving the immunological mechanisms underlying affinity maturation.

Control of the immune response is another area ripe for modeling. What determines the intensity of a response? How is the response shut off when the antigen is eliminated? Feedback mechanisms may exist to control the response intensity, response length, and type of response (cellular or antibody). Some models of a basic feedback mechanism involving two types of helper T cells, TH1 and TH2, have been developed (92); others are needed. Regulatory mechanisms involve interactions among many cell populations that communicate by direct cell-cell contact and through the secretion of cytokines. Diagrams representing the elements of regulatory schemes commonly have scores of elements. Because of the complexities involved, theorists have an opportunity to lead experimentation by providing suggestions as to what needs to be measured and how such measurements can be used to provide an insightful view of possible control mechanisms.

A fundamental feature of the immune system is its diversity. Successful recognition of antigens appears to require a repertoire of at least 10^5 different lymphocyte clones. The diversity of the immune system has challenged experimentalists, and many recent advances have come from developing experimental models with limited immune diversity. However, models based on ecological concepts may provide insights into the control of clonal diversity (75, 93), and modern computational methods now make it practical to consider models with tens of thousands of clones. Thus, it is possible to develop models that start to approach the size of small immune systems. Simulations have suggested that from simple rules of cell response, emergent phenomena arise that may have immunological significance (94). The challenge in using computation is to develop models that address important questions, are realistic enough to capture the relevant immunology, and yet are simple enough to be revealing.

Conclusions

The problems discussed above are distinguished by their centrality to basic and applied biological research as well as by the mathematical and computational challenges they pose. In this regard, they are in a great tradition that reaches back to Galton and Fisher, to Lotka and Volterra, with such recent examples as the contribution of population biology to the development of the theory of chaos (1, 3, 5, 95). This is not surprising: the central issues, understanding how detail at one scale makes its signature felt at other scales and how to relate phenomena across scales, cut across scientific disciplines, and indeed go to the heart of algorithmic development of approaches to high-speed computation.

Imaginative and efficient computational approaches are essential in dealing with the overwhelming complexity of biological systems. Such approaches should comprise the storage and retrieval of vast amounts of information as well as the development of simulation methods that must interact with those data structures and deal with complex hierarchical systems, taking advantage where possible of parallel structures and symmetries that allow simplification and efficient organization of computational steps. The potential for benefits to mathematics and computational sciences as well as to the applications of these methods will create a rich mutualism, in which the rate of advance is nonlinear. The face of the science of computational population biology and ecosystems science will change in the next decade. Key challenges involve ways to describe the dynamics of systems that are aggregates of heterogeneous units, representing the behavior of the means and lowest moments in closed form. Spatial heterogeneity and spatial localization of interactions introduce qualitatively new dynamics, and they present theoretical and computational issues that are similar across a range of biological levels.

