Special Reviews

Systems Biology: A Brief Overview


Science  01 Mar 2002:
Vol. 295, Issue 5560, pp. 1662-1664
DOI: 10.1126/science.1069492


To understand biology at the system level, we must examine the structure and dynamics of cellular and organismal function, rather than the characteristics of isolated parts of a cell or organism. Properties of systems, such as robustness, emerge as central issues, and understanding these properties may have an impact on the future of medicine. However, many breakthroughs in experimental devices, advanced software, and analytical methods are required before the achievements of systems biology can live up to their much-touted potential.

Since the days of Norbert Wiener, system-level understanding has been a recurrent theme in biological science (1). The major reason it is gaining renewed interest today is that progress in molecular biology, particularly in genome sequencing and high-throughput measurements, enables us to collect comprehensive data sets on system performance and gain information on the underlying molecules. This was not possible in the days of Wiener, when molecular biology was still an emerging discipline. There is now a golden opportunity for system-level analysis to be grounded in molecular-level understanding, resulting in a continuous spectrum of knowledge.

System-level understanding, the approach advocated in systems biology (2), requires a shift in our notion of “what to look for” in biology. While an understanding of genes and proteins continues to be important, the focus is on understanding a system's structure and dynamics. Because a system is not just an assembly of genes and proteins, its properties cannot be fully understood merely by drawing diagrams of their interconnections. Although such a diagram represents an important first step, it is analogous to a static roadmap, whereas what we really seek to know are the traffic patterns, why such traffic patterns emerge, and how we can control them.

Identifying all the genes and proteins in an organism is like listing all the parts of an airplane. While such a list provides a catalog of the individual components, by itself it is not sufficient to understand the complexity underlying the engineered object. We need to know how these parts are assembled to form the structure of the airplane. This is analogous to drawing an exhaustive diagram of gene-regulatory networks and their biochemical interactions. Such diagrams provide limited knowledge of how changes to one part of a system may affect other parts, but to understand how a particular system functions, we must first examine how the individual components dynamically interact during operation. We must seek answers to questions such as: What is the voltage on each signal line? How are the signals encoded? How can we stabilize the voltage against noise and external fluctuations? And how do the circuits react when a malfunction occurs in the system? What are the design principles and possible circuit patterns, and how can we modify them to improve system performance?

A system-level understanding of a biological system can be derived from insight into four key properties:

1) System structures. These include the network of gene interactions and biochemical pathways, as well as the mechanisms by which such interactions modulate the physical properties of intracellular and multicellular structures.

2) System dynamics. How a system behaves over time under various conditions can be understood through metabolic analysis, sensitivity analysis, dynamic analysis methods such as phase portrait and bifurcation analysis, and by identifying essential mechanisms underlying specific behaviors. Bifurcation analysis traces time-varying change(s) in the state of the system in a multidimensional space where each dimension represents a particular concentration of the biochemical factor involved.

3) The control method. Mechanisms that systematically control the state of the cell can be modulated to minimize malfunctions and provide potential therapeutic targets for treatment of disease.

4) The design method. Strategies to modify and construct biological systems having desired properties can be devised based on definite design principles and simulations, instead of blind trial-and-error.

Progress in any of the above areas requires breakthroughs in our understanding of computational sciences, genomics, and measurement technologies, and integration of such discoveries with existing knowledge.

Identification of gene-regulatory logic (3) and biochemical networks is a major challenge. The conventional methods for creating a network model include performing a series of experiments to identify specific interactions and conducting extensive literature surveys. Several attempts are under way to create a large-scale, comprehensive database on gene-regulatory and biochemical networks (4). Although such databases are useful sources of knowledge, many network structures remain to be identified. Substantial research has been done on expression profiling, in which clustering analysis is used to identify genes that are coexpressed with genes of known function (5, 6). Although clustering analysis provides insight into the “correlation” among genes and biological phenomena, it does not reveal the “causality” of regulatory relationships. Several methods have been proposed to automatically discover regulatory relationships solely on the basis of microarray data (7–9). At present, such methods use information derived from mRNA abundance, so there is limited scope to infer causality based on transcriptional regulation. Posttranscriptional and posttranslational mechanisms of regulation must be incorporated as large-scale data become available, but many properties have yet to be measured with sufficient accuracy or in high throughput. Although it is not possible to incorporate all the desired data into the automated discovery system, analysis of transcriptional regulation may provide very useful information because of the possible hypotheses it generates to allow us to infer the network structure. In general, when multiple hypotheses are generated by automated discovery analysis, it reflects a lack of information. This type of analysis can be combined with entropy-based decision-making algorithms to theoretically suggest an experiment that most reduces the number of ambiguous network hypotheses. Although such algorithms have yet to reach a level of practical application, they may prove useful for determining the optimal order of experiments needed to resolve ambiguous hypotheses (10). Progress in this area would lead to an increased emphasis on hypothesis-driven research in biology (Fig. 1).
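The idea of choosing the experiment that most reduces ambiguity among network hypotheses can be illustrated with a minimal sketch. This is not the algorithm of reference (10); it is a generic expected-entropy calculation under simplifying assumptions: each hypothesis predicts one deterministic outcome per experiment, and all hypothesis names, experiment names, and outcomes below are hypothetical.

```python
import math
from collections import defaultdict


def entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_posterior_entropy(prior, predictions):
    """Expected uncertainty over hypotheses after running one experiment.

    prior:       dict mapping hypothesis -> prior probability
    predictions: dict mapping hypothesis -> outcome that hypothesis predicts
    """
    # Hypotheses predicting the same outcome cannot be told apart by it.
    groups = defaultdict(list)
    for hypothesis, p in prior.items():
        groups[predictions[hypothesis]].append(p)

    expected = 0.0
    for group_probs in groups.values():
        p_outcome = sum(group_probs)
        if p_outcome == 0:
            continue
        posterior = [p / p_outcome for p in group_probs]
        expected += p_outcome * entropy(posterior)
    return expected


def best_experiment(prior, experiments):
    """Return the experiment name with the lowest expected posterior entropy."""
    return min(
        experiments,
        key=lambda name: expected_posterior_entropy(prior, experiments[name]),
    )


# Four hypothetical, equally likely network hypotheses.
prior = {"H1": 0.25, "H2": 0.25, "H3": 0.25, "H4": 0.25}

# Hypothetical experiments and the outcome each hypothesis predicts.
experiments = {
    "knockout_A": {"H1": "up", "H2": "up", "H3": "up", "H4": "down"},
    "knockout_B": {"H1": "up", "H2": "up", "H3": "down", "H4": "down"},
    "overexpress_C": {"H1": "up", "H2": "down", "H3": "none", "H4": "none"},
}

chosen = best_experiment(prior, experiments)
print(chosen)
```

In this toy setting the overexpress_C experiment is chosen because it splits the four hypotheses into three groups, leaving an expected 0.5 bits of uncertainty, compared with 1.0 bits for knockout_B. Real data would require noisy, probabilistic outcome models rather than deterministic predictions.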

Figure 1

Hypothesis-driven research in systems biology. A cycle of research begins with the selection of contradictory issues of biological significance and the creation of a model representing the phenomenon. Models can be created either automatically or manually. The model represents a computable set of assumptions and hypotheses that need to be tested or supported experimentally. Computational “dry” experiments, such as simulation, on models reveal computational adequacy of the assumptions and hypotheses embedded in each model. Inadequate models would expose inconsistencies with established experimental facts, and thus need to be rejected or modified. Models that pass this test become subjects of a thorough system analysis where a number of predictions may be made. A set of predictions that can distinguish a correct model among competing models is selected for “wet” experiments. Successful experiments are those that eliminate inadequate models. Models that survive this cycle are deemed to be consistent with existing experimental evidence. While this is an idealized process of systems biology research, the hope is that advancement of research in computational science, analytical methods, technologies for measurements, and genomics will gradually transform biological research to fit this cycle for a more systematic and hypothesis-driven science.

Once we have attained an understanding of network structure, we will be able to investigate network dynamics. In reality, the analysis of structure and the analysis of dynamics are overlapping processes, because dynamic analysis may yield useful predictions of unknown interactions. For dynamic analysis of a cellular system, we need to create a model. But first it is important to carefully consider the purpose of model building: whether the aim is to obtain an in-depth understanding of system behavior or to predict complex behaviors in response to complex stimuli, we must first define the scope and abstraction level of the model.

The choice of analytical method used depends on the availability of biological knowledge to incorporate into the model. A steady-state analysis can be done using only the network structure, without knowing the rate constants for a particular reaction. For example, flux balance analysis (FBA) was used to predict switching of the metabolic pathway in Escherichia coli under different nutritional conditions based on knowledge of only the metabolic network structure; this was experimentally confirmed (11). With some knowledge of steady-state rate constants, traditional stability analysis and sensitivity analysis provide insights into how systems behavior changes when stimuli and rate constants are modified to reflect dynamic behavior. Bifurcation analysis, in which a dynamic simulator is coupled with analysis tools, can provide a detailed illustration of dynamic behavior (12, 13). This type of analysis has become conventional in dynamic systems and is already used in many studies on biological simulation.
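As a rough illustration of what a bifurcation-style analysis reports, the sketch below scans one parameter of a toy model and counts its steady states. The model is an assumption made for illustration only (a single self-activating gene with rate equation dx/dt = basal + beta·x²/(1 + x²) − gamma·x); it is not taken from references (12, 13), and the function names and parameter values are hypothetical.

```python
def rate(x, beta, basal=0.05, gamma=1.0):
    """dx/dt for a toy self-activating gene (illustrative model only)."""
    return basal + beta * x * x / (1.0 + x * x) - gamma * x


def steady_states(beta, x_max=12.0, n=12000):
    """Locate steady states on [0, x_max] and classify their stability.

    Returns a list of (x_star, is_stable) tuples. Roots are bracketed by
    sign changes of dx/dt on a uniform grid and refined by bisection.
    """
    xs = [x_max * i / n for i in range(n + 1)]
    found = []
    for a, b in zip(xs, xs[1:]):
        fa, fb = rate(a, beta), rate(b, beta)
        if fa * fb < 0:
            lo, hi = a, b
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if rate(lo, beta) * rate(mid, beta) <= 0:
                    hi = mid
                else:
                    lo = mid
            # In one dimension a steady state is stable when dx/dt
            # crosses zero from positive to negative.
            found.append((0.5 * (lo + hi), fa > 0))
    return found


# Scan the feedback strength and report how many steady states exist.
for beta in (1.0, 3.0, 8.0):
    states = steady_states(beta)
    summary = [(round(x, 3), "stable" if s else "unstable") for x, s in states]
    print(beta, len(states), summary)
```

With these assumed parameters the scan finds one steady state at beta = 1, three at beta = 3 (two stable states separated by an unstable one), and one again at beta = 8. The change in the number of steady states between those values is the kind of qualitative transition that bifurcation analysis is meant to locate; dedicated continuation tools do this far more carefully than a grid scan.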

Once both the network structure and its functional properties are understood for a large number of regulatory circuits, studies on classifications and comparison of circuits will provide further insights into the richness of design patterns used and how design patterns of regulatory circuits have been modified or conserved through evolution. The hope is that intensive investigation will reveal a possible evolutionary family of circuits as well as a “periodic table” for functional regulatory circuits.

Robustness is an essential property of biological systems (14). Understanding the mechanisms and principles underlying biological robustness is necessary for an in-depth understanding of biology at the system level. The phenomenological properties exhibited by robust systems can be classified into three areas: (i) adaptation, which denotes the ability to cope with environmental changes; (ii) parameter insensitivity, which indicates a system's relative insensitivity to specific kinetic parameters; and (iii) graceful degradation, which reflects the characteristic slow degradation of a system's functions after damage, rather than catastrophic failure. In engineering systems, robustness is attained by using (i) a form of system control such as negative-feedback and feed-forward control; (ii) redundancy, whereby multiple components with equivalent functions are introduced for backup; (iii) structural stability, where intrinsic mechanisms are built to promote stability; and (iv) modularity, where subsystems are physically or functionally insulated so that failure in one module does not spread to other parts and lead to system-wide catastrophe. Not surprisingly, these approaches used in engineering systems are also found in biological systems. Bacterial chemotaxis is an example of negative feedback that attains all three aspects of robustness (15–17). Redundancy is seen at the gene level, where it functions in control of the cell cycle and circadian rhythms, and at the circuit level, where it operates in alternative metabolic pathways in E. coli. Structural stability provides insensitivity to parameter changes in the network responsible for segment formation in Drosophila (18). And modularity is exploited at various scales, from the cell itself to compartmentalized yet interacting signal-transduction cascades (19).
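How negative feedback can yield both adaptation and parameter insensitivity can be sketched with a deliberately simple linear model. This is an assumption-laden toy (integral feedback on a one-variable system), not the chemotaxis models of references (15–17); the variable names, gains, and step input are hypothetical.

```python
def simulate(k=1.0, ki=0.5, y0=1.0, dt=0.01, t_end=60.0, t_step=10.0):
    """Euler simulation of a toy integral-feedback loop.

    Output:    y = k * u - x
    Feedback:  dx/dt = ki * (y - y0)
    The input u steps from 1.0 to 2.0 at time t_step.
    Returns the list of output values y at each time step.
    """
    u_before, u_after = 1.0, 2.0
    step_index = int(round(t_step / dt))
    n_steps = int(round(t_end / dt))

    # Start at the steady state for the initial input, where y equals y0.
    x = k * u_before - y0
    ys = []
    for i in range(n_steps):
        u = u_after if i >= step_index else u_before
        y = k * u - x
        ys.append(y)
        # The feedback variable integrates the deviation from the set point.
        x += dt * ki * (y - y0)
    return ys


# Adaptation: the output responds to the step, then returns to the set point.
ys = simulate()
print("peak:", max(ys), "final:", ys[-1])

# Parameter insensitivity: the final output is the same for different gains.
finals = {(k, ki): simulate(k=k, ki=ki)[-1]
          for k, ki in [(1.0, 0.5), (3.0, 0.2), (0.5, 1.0)]}
print(finals)
```

In this sketch the output jumps when the input changes and then relaxes back to the set point y0, and it does so for each of the assumed gain combinations. The size and duration of the transient depend on k and ki, but the adapted steady-state value does not, which is the sense of parameter insensitivity described above.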

To conduct a systems-level analysis, a comprehensive set of quantitative data is required. Projects already under way, such as the Alliance for Cellular Signaling (AfCS) (20), are making large-scale measurements with the ultimate goal of creating an in-depth simulation model of cells. Exploratory studies on modeling should be done at the earliest stage of such a project to identify where measurement bottlenecks exist in building the final model and to avoid acquiring data with little value for model building, such as measurements of insufficient coverage and accuracy.

Comprehensiveness in measurements requires consideration of three aspects: (i) factor comprehensiveness, which reflects the numbers of mRNA transcripts and proteins that can be measured at once; (ii) time-line comprehensiveness, which represents the time frame within which measurements are made; and (iii) item comprehensiveness, which refers to the simultaneous measurement of multiple items, such as mRNA and protein concentrations, phosphorylation, localization, and so forth. Model-based experiment planning dictates where accuracy is critical and where it is not, so that resources can be optimally allocated.

Complete system-level analysis of biological regulation requires high-throughput, accurate measurements, goals that are perhaps beyond the scope of current experimental practices. Technical innovations in experimental devices, single-molecule measurements, femto-lasers that permit visualization of molecular interactions, and nano-technologies are critical aspects of systems biology research. For example, microfluidic systems, also known as micro-TAS (total analysis system), enable minute quantities (picoliters) of samples to be measured more rapidly and more precisely. Various prototypes for polymerase chain reaction and electrophoresis have been developed (21–24). Such methods not only speed up measurements, but also encourage automation.

Software infrastructure is another critical component of systems biology research. Although attempts have been made to build simulation software and to make use of the many analysis and computing packages originally designed for general engineering purposes, there is no common infrastructure or standard to enable integration of these resources. The Systems Biology Mark-up Language (SBML) and CellML represent attempts to define a standard for an XML-based, computer-readable model definition that enables models to be exchanged between software tools. Systems Biology Workbench (SBW) is built on SBML and provides a framework of modular open-source software for systems biology research. Both SBML and SBW are collective efforts of a number of research institutions sharing the same vision (25).
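To make the notion of an XML-based, computer-readable model definition concrete, the sketch below parses a small model description with the Python standard library. The XML is a simplified fragment loosely modeled on SBML element names; it omits the XML namespace, compartments, kinetic laws, and other parts that real SBML files carry, so it should be read as an illustration rather than valid SBML. The model, species, and reaction identifiers are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, SBML-inspired fragment (NOT valid SBML: no namespace,
# no compartments, no kinetic laws). All identifiers are made up.
MODEL_XML = """\
<sbml level="2" version="1">
  <model id="toy_cascade">
    <listOfSpecies>
      <species id="S1" initialConcentration="1.0"/>
      <species id="S2" initialConcentration="0.0"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="R1">
        <listOfReactants><speciesReference species="S1"/></listOfReactants>
        <listOfProducts><speciesReference species="S2"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""


def read_model(xml_text):
    """Extract the model id, species, and reaction wiring from the XML."""
    root = ET.fromstring(xml_text)
    model = root.find("model")

    # Map each species id to its initial concentration.
    species = {
        s.get("id"): float(s.get("initialConcentration"))
        for s in model.iter("species")
    }

    # Map each reaction id to its (reactants, products) lists.
    reactions = {}
    for r in model.iter("reaction"):
        reactants = [ref.get("species") for ref in r.find("listOfReactants")]
        products = [ref.get("species") for ref in r.find("listOfProducts")]
        reactions[r.get("id")] = (reactants, products)

    return model.get("id"), species, reactions


model_id, species, reactions = read_model(MODEL_XML)
print(model_id, species, reactions)
```

Because the structure is declared in the file rather than hard-coded in a program, any tool that understands the same schema could read this network, which is the kind of exchange between software tools that a shared standard is meant to enable. For real models, a dedicated SBML library and the official specification should be used instead of hand-written parsing.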

How does the idea of systems biology impact pharmaceutical industries and medical practice? The most feasible application of systems biology research is to create a detailed model of cell regulation, focused on particular signal-transduction cascades and molecules to provide system-level insights into mechanism-based drug discovery (26–28). Such models may help to identify feedback mechanisms that offset the effects of drugs and predict systemic side effects. It may even be possible to use a multiple drug system to guide the state of malfunctioning cells to the desired state with minimal side effects. Such a systemic response cannot be rationally predicted without a model of intracellular biochemical and genetic interactions. It is not inconceivable that the U.S. Food and Drug Administration may one day mandate simulation-based screening of therapeutic agents, just as plans for all high-rise buildings are required to undergo structural dynamics analysis to confirm earthquake resistance.

Although systems biology is in its infancy, its potential benefits are enormous in both scientific and practical terms. A transition is occurring in biology from the molecular level to the system level that promises to revolutionize our understanding of complex biological regulatory systems and to provide major new opportunities for practical application of such knowledge.

