Policy Forum: Molecular Biology

NIH Molecular Libraries Initiative


Science  12 Nov 2004:
Vol. 306, Issue 5699, pp. 1138-1139
DOI: 10.1126/science.1105511

The purpose of the Molecular Libraries Initiative (MLI) component of the NIH Roadmap for Medical Research (1, 2) is to expand the availability, flexibility, and use of small-molecule chemical probes for basic research. Because this initiative is particularly novel and far-reaching, it has been the subject of considerable discussion (3-5), and sometimes misinterpretation (6), in the research community.

Two imperatives motivated the development of the MLI. The first, related to NIH's mission in basic biomedical research, was the need for fundamentally new approaches to determine function and therapeutic potential for all genes in the newly sequenced human genome. The second, related to NIH's mission to improve public health, was the need to accelerate the translation of basic research discoveries into new therapeutics.

Expense (particularly large capital costs), expertise, and cultural divides between public and private sectors have historically kept discovery and optimization of small molecules largely restricted to pharmaceutical and biotechnology companies. Dissemination of small-molecule research tools and technologies into the public sector via the MLI is timely for several reasons.

First, sequencing of the human genome has provided an abundance of new targets for study, and small-molecule research tools will accelerate the translation of genome sequence into biological and therapeutic insights (7). Small molecules are complementary to nucleic acid-based translational tools such as knockout mice and siRNAs, in that they target the protein gene product rather than the gene locus or mRNA, have virtually limitless structural diversity, can affect particular target functions for defined periods in isolated proteins, cells, or organisms, and can serve as either agonists or antagonists. The characteristics that make this class of molecule useful as drugs—their potential for selectivity, cell permeability, and subtle reversible modulation of important physiological functions—also make them good research tools for dissecting the functions of novel genes, pathways, and cells.

The human genome encodes 20,000 to 25,000 genes (8) and perhaps a million proteins, of which only ∼500 are targeted by currently available small molecules (9). For the most part, pharmaceutical and biotechnology companies prefer to focus on the “druggable genome” thought to be more amenable to drug development (10). The majority of the genome that is currently considered “undruggable” (i.e., unmanipulable by small molecules) is therefore a major focus of the MLI.

Large libraries of small molecules have traditionally been unavailable to academic researchers, but with the advent of combinatorial chemistry and commercial suppliers of high-quality compound libraries, small molecules can now be obtained on a large scale. At the same time, advances in robotics and informatics have made screening and analysis of such large compound libraries possible. Up to a million compounds can now be screened against a target in a single day, three orders of magnitude greater than was possible only a decade ago. Together, these developments make a public-sector small-molecule screening and chemistry initiative such as the MLI possible.

The MLI was developed over the course of 9 months through consultations with representatives of multiple NIH institutes, and external consultants from the public and private sectors. The MLI research agenda has three components focused on screening, cheminformatics, and technology development, and is being carried out via NIH grant and contract mechanisms (11).

The Molecular Libraries Screening Center Network (MLSCN) will be a consortium of five or six high-throughput screening (HTS) centers that will screen assays submitted by the research community against a large number of compounds (>100,000) maintained in a central compound repository, and perform the optimization chemistry required to produce in vitro chemical probes of the targets or phenotypes studied in the assays (12). All results will be placed into a new public database (PubChem, see below), and probe compounds will be made available without encumbrance to all researchers, in public and private sectors, for their use in studying biology and disease. The first of the MLSCN screening centers (the NIH Chemical Genomics Center) has been established within the NIH intramural program. It will begin full-scale screening in early 2005 (13). The other MLSCN centers will be funded via extramural grants; applications submitted in response to the NIH Roadmap Request for Applications (12) are under review, and awards will be made in the late spring of 2005. The contract for the Molecular Libraries Small Molecule Repository, which will house the screening collection, was recently awarded (14), and the composition of the compound collection is being determined by a distinguished panel of chemists from the public and private sectors.

A comprehensive database of chemical structures and their activities, PubChem (15), has been developed by the National Center for Biotechnology Information at the National Library of Medicine/NIH. PubChem links small-molecule information to GenBank, MEDLINE, and the other Entrez databases, and will serve as a public portal for MLSCN screening results and chemistry data. New algorithms and tools for computational chemistry, virtual screening, and other research aspects of cheminformatics, will be funded via new grants (11).

The ultimate goal of the MLI is to develop small-molecule modulators of thousands of cellular targets. To succeed, the MLI will be developing technology in four critical areas (11):

  (i) Chemical diversity: production of pilot-scale compound libraries in novel areas of chemical space (16), and methods development for natural product isolation, characterization, and chemistry. Compounds will be placed into the Small Molecule Repository and screened by the MLSCN centers.

  (ii) Assay diversity: development of innovative high-throughput assays for novel proteins, cellular phenotypes, biological functions, and disease mechanisms.

  (iii) Instrumentation: development of new technologies to allow HTS of novel assay formats and to increase throughput and accuracy of current screening technologies (e.g., methods for highly parallel noncompetitive detection of target binding, lab-on-chip technologies enabling complex multistep assays).

  (iv) Predictive ADME/Toxicology: development of data sets and analysis algorithms for improved prediction of ADME (absorption, distribution, metabolism, and excretion) and toxicity properties of small molecules.

Although the MLI will utilize tools and technologies found in biopharmaceutical companies, the MLI has quite distinct goals and deliverables from those of the private sector. For example, the need to develop compounds for human use as quickly as possible drives biopharmaceutical companies to screen only “druglike” compounds having potential for intellectual property novelty, on assays for a relatively limited group of targets (e.g., GPCRs, kinases, nuclear hormone receptors), and to not disclose the results of screens, keeping them as trade secrets or intellectual property. In contrast, the types of compounds synthesized and screened by the MLI will be broader, including metabolic intermediates, a range of natural products, and agents with potential in vivo toxicity. Types of assays will be broader also, including, for example, protein-protein interactions, splicing events, and diverse cellular and even organismal phenotypes. Via PubChem and the compound repository, all data and chemical probes will be available to the entire research community without encumbrance. The MLI will thereby enable study of the majority of biological and chemical space that is currently unexplored.

The small-molecule research tools produced by the MLI should accelerate validation of new drug targets (17) and thereby enable new drug development, but clinically useful compounds cannot be expected from the MLI itself. Drug development is a complex, time-consuming, and expensive process (18, 19), only the first steps of which will be performed by the MLI (see the figure below). The “probe compounds” produced by the MLI will have potency and aqueous solubility adequate for in vitro applications, but chemical modifications will generally be needed to confer the selectivity, pharmacokinetic, and metabolic properties required for in vivo use. From the probe stage, 10 to 12 chemist-years are commonly required to develop a “lead compound” with minimal pharmaceutical properties, and an additional 20 to 30 chemist-years to produce a “clinical candidate” compound appropriate for testing in humans. During this time, >3000 different compounds based on the initial probes are typically synthesized and tested. Even after this investment in chemical optimization, >90% of clinical candidates fail in human testing (19).
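The chemistry effort quoted above can be tallied in a few lines. The following is only a rough sketch of that arithmetic, not a model from the cited sources: the constant and function names are hypothetical, and the per-success figure assumes that every clinical candidate consumes the same chemistry effort and that failures are independent, which the text does not claim.

```python
# Chemist-year ranges quoted in the text, as (low, high) pairs.
PROBE_TO_LEAD = (10, 12)
LEAD_TO_CANDIDATE = (20, 30)

# The text says ">90%" of clinical candidates fail, so 0.90 is a lower bound.
CLINICAL_FAILURE_RATE = 0.90


def total_chemist_years():
    """Chemist-years from probe to one clinical candidate, as (low, high)."""
    low = PROBE_TO_LEAD[0] + LEAD_TO_CANDIDATE[0]
    high = PROBE_TO_LEAD[1] + LEAD_TO_CANDIDATE[1]
    return (low, high)


def chemist_years_per_success(failure_rate=CLINICAL_FAILURE_RATE):
    """Chemistry effort per candidate that survives human testing.

    ASSUMPTION (not from the text): each candidate costs the same effort,
    so the expected effort per success is effort / (1 - failure_rate).
    """
    low, high = total_chemist_years()
    success_rate = 1.0 - failure_rate
    return (low / success_rate, high / success_rate)


if __name__ == "__main__":
    print("Probe to clinical candidate:", total_chemist_years())
    print("Per successful candidate (rough):", chemist_years_per_success())
```

Under these assumptions, one clinical candidate represents 30 to 42 chemist-years beyond the probe stage, and because more than 90% of candidates fail, the chemistry effort behind each success is at least about ten times that.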

Figure. Interface of the MLI and drug development.

Examination of the cost and expected success rate at each stage of drug development demonstrates that the assay development, HTS, and hit-to-probe chemistry steps to be performed by the MLI are inexpensive and straightforward compared with the later stages of drug development (see the figure above). This should allow the MLI to produce probes for a broad range of targets with its relatively limited budget of ∼$100 million per year (compared with >$30 billion per year spent by pharmaceutical and biotechnology companies on drug development). We estimate that the probes produced by the MLI will entail only 2% of the cost, and 5% of the time, required to develop a novel drug (19). In exceptional circumstances, the NIH itself might attempt the subsequent steps of drug development, particularly if a small-molecule probe shows special potential for development into a drug for an orphan indication or a disease occurring primarily in the developing world, where there is unlikely to be commercial interest. In this case, NIH Roadmap mechanisms such as the Translational Research Core Services Program and the Regional Translational Research Centers (20) could be used. But for the most part, we expect that chemists in the public and private sectors will use the MLI probes as proof-of-concept compounds and as starting points to produce a variety of chemical analogs with improved properties. For this reason, intellectual property claims on the MLI probes will be strongly discouraged, as such claims would prevent the widest use of these tools and are contrary to the “community resource” nature of this initiative (21).
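To put the budget comparison in proportion, the figures in the paragraph above can be combined as follows. This is a minimal sketch using only the numbers quoted in the text; the names are hypothetical, and the total drug cost and timeline passed to `probe_share` are left to the reader because the text does not state them.

```python
# Figures quoted in the text.
MLI_BUDGET_USD = 100e6       # "~$100 million per year"
INDUSTRY_SPEND_USD = 30e9    # ">$30 billion per year", so a lower bound
PROBE_COST_FRACTION = 0.02   # probes entail "only 2% of the cost"
PROBE_TIME_FRACTION = 0.05   # "and 5% of the time"


def budget_ratio():
    """MLI budget as a fraction of industry drug-development spending.

    Because industry spending is quoted as a lower bound, this ratio
    is an upper bound on the true fraction.
    """
    return MLI_BUDGET_USD / INDUSTRY_SPEND_USD


def probe_share(total_cost, total_time):
    """Cost and time attributable to the probe stage.

    total_cost and total_time are caller-supplied totals for developing
    one novel drug; the text gives only the fractions, not the totals.
    """
    return (total_cost * PROBE_COST_FRACTION, total_time * PROBE_TIME_FRACTION)


if __name__ == "__main__":
    print(f"MLI budget is at most {budget_ratio():.2%} of industry spending")
```

The ratio works out to 1/300, so the MLI budget is at most about a third of one percent of what industry spends on drug development each year.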

The MLI is a bold initiative to catalyze science in the genome era. By providing a new path for discovery, this program aims to accelerate science and its translation into benefits for the health of the public.
