The Coordinates of Truth


Science  02 Oct 2009:
Vol. 326, Issue 5949, pp. 53-54
DOI: 10.1126/science.1177637

The scientific method has driven conceptual inquiry for centuries and still forms the basis of scientific investigation. Yet, the hypothesis-based research paradigm itself has received scant attention recently. Here, I propose an alternative model for this paradigm, based on decision, information, and game theory. Analysis of biomedical research efforts with this model may provide a framework for predicting their likely contributions to knowledge, assessing their impact on human health, and managing research priorities.

The scientific method provides a rationale upon which scientific principles are developed, tested, and validated or rejected (1, 2). For any natural phenomenon, there is a fundamental solution or truth that explains its basis. This solution exists in nature, regardless of whether the observer formulates the best hypothesis to explain it. It may thus be viewed as a set of coordinates in a multidimensional space: the coordinates of truth (see the first figure, panel A). By proposing hypotheses and testing their statistical validity, the hypothesis-driven experiment allows testing and validation of a scientific principle.

One goal of scientific discovery is to refine the hypothesis and increase its precision. At times, experiments yield unexpected findings and shift the view of the hypothesis without disproving it completely (see the first figure, panel B). Examples of such a paradigm shift include the discovery of reverse transcriptase (3, 4), which changed the central paradigm of biology that genetic information flowed only from DNA to RNA, and of discontinuous genes in mammalian organisms (5, 6), which revolutionized the understanding of gene regulation and structure. At other times, experiments may refute a hypothesis, thereby directing attention toward more productive avenues of study.

The accuracy and predictability of a hypothesis depend on the validity of the inputs used to generate and test it. Because problems are typically complex and information regarding their solution is limited, the solution is more likely to be found if the information base is greater. This rationale is a driving force behind systems biology, which attempts to define biological complexity from a systemic perspective using information technology. Rather than testing scientific hypotheses, it provides an abundance of data that facilitates hypothesis generation.

The relative value of discovery aimed at hypothesis generation versus hypothesis testing has been debated (7, 8). High-profile journals publish systems biology studies, including the human genome sequence, but most papers focus on hypothesis-driven investigations. Yet, there is synergy between hypothesis generation and hypothesis testing: If well designed, these efforts complement one another and can lead to fundamental breakthroughs. But how do we strike the right balance?

Validating hypotheses.

(A) Testing determines whether a hypothesis encompasses the true solution to the problem. (B) Experiments can lead to a change in the assumptions and predictions that encompasses the solution but fundamentally changes the decision space.

In the “coordinates” model, exploration through discovery research defines an unknown space with greater precision. When explored at low resolution, the solution to a problem may lie in a gap in the knowledge space. Increasing the density of information markedly raises the likelihood of defining the coordinates of a hypothesis that encompasses the solution (see the second figure).

The human genome project provides an example: Though not hypothesis-driven, it yielded a powerful information base that generated highly directed hypotheses regarding the causes of many human diseases. For scientists competing to find the best solution to a problem, game theory enters the process: Each hypothesizer games the system by using his or her own view of fragmentary data to postulate a decision shape that contains the coordinates of the true solution. The more limited the leads, the less likely it is that the problem can be distilled to its essence. With more background, an observer is better able to constrain the variables that define the optimal solution (see the second figure).
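The effect of information density can be illustrated with a toy simulation. The sketch below is not part of the model as published; it assumes a two-dimensional unit square as the knowledge space, a randomly placed "truth" point, and randomly placed probes that each reveal anything within a fixed radius. The names `coverage_probability`, `n_probes`, and `radius` are hypothetical and chosen only for illustration.

```python
import random


def coverage_probability(n_probes, radius, trials=2000, seed=0):
    """Estimate how often a hidden 'truth' point in the unit square lies
    within `radius` of at least one randomly placed probe.

    Assumption: each probe stands for one piece of background information,
    and the truth counts as 'found' if any probe lands close enough to it.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        tx, ty = rng.random(), rng.random()  # hidden coordinates of truth
        for _ in range(n_probes):
            px, py = rng.random(), rng.random()  # one probe of the space
            if (px - tx) ** 2 + (py - ty) ** 2 <= radius ** 2:
                hits += 1
                break  # one nearby probe is enough for this trial
    return hits / trials


# Low-resolution versus high-resolution exploration of the same space.
low = coverage_probability(n_probes=5, radius=0.1)
high = coverage_probability(n_probes=100, radius=0.1)
print(f"low resolution:  {low:.2f}")
print(f"high resolution: {high:.2f}")
```

Under these assumptions, the sparse exploration leaves the truth in a gap in most trials, whereas the dense exploration usually places at least one probe near it. The real knowledge space is of course multidimensional and not uniformly sampled, so the numbers carry no meaning beyond the qualitative trend.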

Generating hypotheses.

Exploration of an undefined space using low-resolution (A) or high-resolution information (B) illustrates the increased likelihood of defining a hypothesis that correctly explains a complex observation.

These considerations have implications for scientific funding. For example, the investigator-initiated grants at the National Institutes of Health allow investigators to propose and test any hypothesis as long as the rationale is justified to a set of peers. The process begins with the vision of the individual scientist and ends with a judgment of its scientific merit. Recently, changes have been proposed for rating these proposals, stressing their impact (9), but the evaluation remains largely subjective. The meaning of “impact” is ill defined, and there is no systematic way to assign value. In this and many other systems for awarding grants, the scientific community does not take full advantage of the scientific method to prioritize its research portfolio. For example, formal evaluation of hypotheses is not an inherent part of the review. Also, there have been few criteria by which to judge and prioritize grants for hypothesis-generating research.

How should hypothesis-generating research be evaluated? Several considerations seem relevant. If systems biology approaches are unconnected to scientific questions, they are unlikely to yield novel or fundamental insights. This research need not be driven by a hypothesis, but should be directed toward a specific question. For example, transcriptional arrays have been used to understand cell transformation and characterize cancer treatment and prognosis. The usefulness of array data for addressing such questions depends on how the data are collected. Analyses of tumor biopsies that contain stromal cells will be much less informative than analyses of RNA derived only from tumor cells isolated, for example, by laser capture microdissection.

Although technological advances such as those of systems biology have catalyzed progress, technical innovation alone is not the solution. The value of hypothesis-generating efforts should be analyzed critically for the pertinence of the methodology to the question, the overall significance of the problem, and the likelihood of generating a viable and high-impact hypothesis. Translational research, at the nexus between clinical observation and scientific discoveries that can be applied to the treatment of human disease, may be among the best-suited applications of this approach. Reexamination of the scientific research method offers a framework not only to judge the impact of hypothesis generation on scientific discovery, but also to assess its potential to advance clinical research and treatment.

Hypothesis generation can create an organized body of knowledge from which insight can emerge. The model described here is relevant not only to biomedical research but also to other scientific disciplines. For example, in physics, the patterns of particle decay detected at the CERN Large Hadron Collider provide information that both tests and generates hypotheses about the nature of elementary particles. A modern and rigorous view of the hypothesis-driven research paradigm can similarly help to consolidate a foundation that fundamentally transforms biology and medicine.

References and Notes

  1. I thank C. S. Nabel, E. G. Nabel, G. Griffin, and J. Esparza for discussions and comments. The views expressed here are those of the author and do not reflect the views or policies of the NIH.