News Focus: Genetics

Can SNPs Deliver on Susceptibility Genes?

Science  27 Jul 2001:
Vol. 293, Issue 5530, pp. 593-595
DOI: 10.1126/science.293.5530.593

Minor differences in people's DNA ought to predict their risk of certain diseases. Is research on so-called SNPs living up to its promise?

When genetic markers called SNPs burst on the scene several years ago, scientists hailed them as a salvation for weary gene hunters. For decades researchers had been trying to track the genes involved in major killers such as heart disease and cancer. These complex diseases are not caused by a defect in a single gene, as is cystic fibrosis; rather, they arise from the interaction of multiple genes and environmental factors. But these so-called susceptibility genes emit such weak signals that chasing them has frustrated even the most seasoned geneticist.

As DNA sequencing capabilities soared in the late 1990s, researchers hit upon a new strategy: tracking down risk genes by using single-nucleotide polymorphisms (SNPs), fondly known as “snips.” SNPs are simply locations along the chromosomes where a single base varies among different people. Where some have a guanine in a given string of nucleotides, for instance, others might have a cytosine.

The premise of SNP mapping is simple: Common diseases such as cancer must be caused in part by common mutations. And the most common mutations in the genome are SNPs, which occur about every 1000 bases. All scientists have to do, the reasoning goes, is find enough SNPs on the human genome map and see whether distinct SNPs occur more often in people with a given disease. If so, these SNPs are either implicated in the disease or near another genetic variation that is a possible culprit. Now, several years, tens of millions of dollars, and millions of SNPs later, how goes the hunt for susceptibility genes?
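The comparison at the heart of this premise, whether one version of a SNP turns up more often in patients than in healthy controls, can be sketched as a chi-square test on allele counts. This is only an illustration, not the method of any study described here; the function name and the counts are hypothetical.

```python
def allele_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Return (chi_square, odds_ratio) for a 2x2 table of allele counts.

    Assumes every count is greater than zero.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    total = sum(row_totals)

    chi_square = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if the allele were unrelated to disease status
            expected = row_totals[i] * col_totals[j] / total
            chi_square += (table[i][j] - expected) ** 2 / expected

    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    return chi_square, odds_ratio


# Hypothetical counts: 200 chromosomes from patients, 200 from controls
chi_square, odds_ratio = allele_association(60, 140, 40, 160)
# 3.841 is the 5% critical value for chi-square with 1 degree of freedom
significant = chi_square > 3.841
```

A single test like this says nothing about the multiple-testing burden of scanning many SNPs at once, which is part of why the hunt has proved slow.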

Slowly and painstakingly, is the word from the trenches. One obstacle is that only about 15% of candidate SNPs are ready for prime time—the others haven't been characterized well enough for meaningful studies. Another is that researchers have to analyze huge numbers of SNPs in lots of people to find true links between DNA and disease. Finally, SNPs are not distributed as randomly in the genome as clean statistical analyses demand, and they don't always stay put. Thus, trying to scan the entire genome for disease genes using SNPs as signposts is a feat now considered nearly impossible.

In response, many researchers are focusing their SNP studies, using traditional methods to narrow down suspect chromosomal regions. This strategy has yielded success recently in studies of diabetes and the gastrointestinal ailment Crohn's disease. But even after taming their wild SNP hunts, researchers tell battle stories of endless assays, false leads, and depressingly high costs in time and money.

The technique may still pan out, many gene hunters say, provided researchers fine-tune the map of SNPs in the human genome; develop cheaper, faster, and more sophisticated methods for analyzing SNPs; and, perhaps, augment the SNP map with yet another type of genome map, called a haplotype map, an idea researchers explored last week (see p. 583).

“Researchers are going after SNPs because all other approaches have worked horribly,” explains molecular geneticist Pui-Yan Kwok of Washington University in St. Louis. “When you study complex disease traits, you need to have all the tools that you can.”

What makes a SNP a SNP?

When the hunt for SNPs began in earnest in 1998, researchers weren't sure how hard it would be. That year 10 pharmaceutical companies, five academic centers, and the Wellcome Trust joined forces to create a public SNP map (Science, 19 December 1997, p. 2047). Their goal was to identify 300,000 SNPs and map 150,000 of them along the human chromosomes by 2001—a goal they met handily. SNPs proved remarkably easy to find. Nearly 3 million putative SNPs have been deposited into the public database; Celera Genomics of Rockville, Maryland, boasts of its own map with at least that many; and other companies such as Incyte Genomics in Palo Alto, California, and GlaxoSmithKline in Greenford, U.K., are stockpiling their own SNPs.

But determining whether these genetic variations are true SNPs has proved more daunting. To live up to the name “polymorphism,” a single-nucleotide change at a given location must be shared by at least 1% of a specific population, such as African Americans, say, or Pima Indians of the southwestern United States. This somewhat arbitrary distinction between a polymorphism and a mutation—as even rarer oddities are called—helps researchers zero in on those variations most likely to reveal a link to a specific disease in a given population.

“You have to know whether that SNP is relevant in the population you are interested in,” says Rick Lundberg of Celera. But doing so requires researchers to check hundreds or thousands of DNA samples from the group they are studying to see whether a candidate SNP reaches the 1% benchmark.
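As a rough sketch of that check, the snippet below counts alleles in a set of genotyped samples and compares the rarer allele's frequency with the 1% benchmark. It assumes the benchmark is applied to the frequency of the rarer allele and that the site has only two alleles; the function names and the sample data are hypothetical.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """genotypes: two-letter strings such as "GG", "GC", "CC", one per person."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    if len(counts) < 2:
        return 0.0  # only one allele seen, so no variation in this sample
    return min(counts.values()) / sum(counts.values())


def is_polymorphism(genotypes, threshold=0.01):
    """True if the rarer allele reaches the threshold in this sample."""
    return minor_allele_frequency(genotypes) >= threshold


# Hypothetical panel of 100 people: 3 carry one copy of the C allele
panel = ["GG"] * 97 + ["GC"] * 3
frequency = minor_allele_frequency(panel)   # 3 of 200 chromosomes
passes = is_polymorphism(panel)
```

With small panels the estimate is noisy, which is why hundreds or thousands of samples are needed before a rare variant can be confidently placed above or below the line.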

Geneticists call this process genotyping and traditionally do it by sequencing DNA from each sample. Genotyping can be managed for a handful of SNPs, but looking for millions of SNPs in thousands of individuals—at 20 cents to $1 per SNP per DNA sample—is simply too expensive for many studies, says Kwok.
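The arithmetic behind that complaint is easy to sketch. The per-genotype prices below come from the figures Kwok cites; the study size of 1 million SNPs in 1000 people is an illustrative assumption, as is the function name.

```python
def genotyping_cost(n_snps, n_samples, cost_per_genotype):
    """Total cost in dollars of typing every SNP in every DNA sample."""
    return n_snps * n_samples * cost_per_genotype


# Assumed study size: 1 million SNPs typed in 1000 people
low = genotyping_cost(1_000_000, 1_000, 0.20)
high = genotyping_cost(1_000_000, 1_000, 1.00)
print(f"${low:,.0f} to ${high:,.0f}")
```

Under these assumptions the bill runs from $200 million to $1 billion for a single study.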

[Figure caption] Complex disease marker? SNPs are single-base differences in DNA.

To keep costs down, many investigators are concentrating on validating just the 60,000 or so SNPs known to reside within genes, as opposed to those in noncoding regions. To do that, mappers from the original SNP mapping project and others are genotyping a standard panel of DNA samples (maintained at the Coriell Institute for Medical Research in Camden, New Jersey) from anonymous donors from around the world.

Paleo SNP studies

One of the pioneers of SNP expeditions, geneticist Graeme Bell of the University of Chicago (UC), has suffered many of the setbacks that plague SNP research today. In the mid-1990s, he and UC colleague Nancy Cox began a search for genes related to adult-onset diabetes in a population of Mexican Americans in Starr County, Texas. Some people in this population run an increased risk of diabetes due to an unknown gene or genes somewhere at the end of chromosome 2. The researchers made their own SNP map, as others had done in smaller, highly focused studies. The UC team sequenced a 1.7-million-base region of DNA from 10 people with diabetes and then looked for SNPs common to that group but not controls. They turned up a SNP that seemed to fit the bill: a guanine in place of an adenine at position 43.

But SNP43's location—in an intron, or noncoding region—didn't make sense. So Bell's team tried again … and again. Over the course of another year, they resequenced a 66,000-nucleotide stretch in and around SNP43 and came up with about 180 additional polymorphisms. When they compared the patterns of 98 SNPs in 100 diabetics and 100 controls, the team hit pay dirt. A combination of three SNPs marks a heightened susceptibility to diabetes in this population, the researchers reported in the October 2000 issue of Nature Genetics. The combination also conveys a risk—albeit lower—in two northern European populations, one from Germany and the other from the Bothnia region of Finland.

When asked what he would do differently if he could perform the study again, Bell answers: “I wouldn't do it.” Indeed, says human geneticist David Altshuler of Harvard Medical School in Boston and the Whitehead Institute for Biomedical Research/MIT Center for Genome Research in Cambridge, Massachusetts, “one of the main things Bell showed us is how tremendously difficult this kind of study can be.”

Second-generation SNPs

Diabetes is a complex disease, however, and no one is convinced that these SNPs on chromosome 2 are the only players. Bell's team had to spend so much time pinpointing SNPs that they weren't able to thoroughly blanket the region in search of gene candidates, says Altshuler. The team might have missed some.

“Graeme is an advertisement for the human genome project,” Altshuler says. “If his lab were starting today, they would go to the Web, click the map, and they would not be wasting time discovering polymorphisms.”

Spurred by that premise, a team led by Altshuler tried a slightly different approach. The investigators went back to published studies linking other SNPs to diabetes (not including Bell's) and retested 16 reported SNPs in a new group of 333 Scandinavian trios (a child who had diabetes and the child's parents). After genotyping and looking for associations, the investigators found that 13 SNPs were common enough in the population to study. Of those, two showed clear associations, either increasing or decreasing risk in the same direction as reported originally. The team tested those two in three additional patient populations, including a group of non-Scandinavian Canadians—in all, analyzing the DNA of some 3000 people to find a gene reproducibly associated with a risk of diabetes.
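One common way to look for associations in parent-child trios is the transmission disequilibrium test (TDT), which asks whether parents who carry one copy of an allele pass it to their affected child more often than the 50% expected by chance. The article does not say which statistic the team used, so the sketch below is an assumption, and the counts are hypothetical.

```python
def tdt_statistic(transmitted, untransmitted):
    """TDT chi-square (1 degree of freedom).

    transmitted:   times heterozygous parents passed the allele to the
                   affected child
    untransmitted: times heterozygous parents passed the other allele
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no heterozygous parents to count")
    return (transmitted - untransmitted) ** 2 / total


# Hypothetical counts from heterozygous parents
statistic = tdt_statistic(60, 40)
# 3.841 is the 5% critical value for chi-square with 1 degree of freedom
significant = statistic > 3.841
```

Because each parent serves as his or her own control, a test of this kind is less easily fooled by differences in ancestry between patients and controls.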

Only one SNP held up in the target populations: a mutation that changes an amino acid in a hormone receptor that regulates fat metabolism. The SNP association had been reported in 1998, but four out of five subsequent analyses with only a few hundred patients each could not confirm the linkage. The team figured out why: The diabetes-inducing version of the SNP is common—about 85% of the samples carried it—but the polymorphism increased someone's risk of diabetes by only about 25%. Finding such a subtle genetic contribution to diabetes using traditional family-based linkage studies would have required samples from 3 million siblings, says Altshuler. So although it is unwieldy, he says, the SNP approach is an improvement over traditional genetic tools.
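A back-of-the-envelope sample-size calculation shows why studies of a few hundred patients could miss such an effect. The sketch below uses a standard two-proportion formula and rests on loud assumptions that are not in the article: the 85% figure is treated as the allele frequency in controls, the 25% increase as an allelic odds ratio of 1.25, and the test as two-sided at the 5% level with 80% power. The function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def case_allele_frequency(control_freq, odds_ratio):
    """Allele frequency in cases implied by the control frequency and odds ratio."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)


def alleles_needed_per_group(p_ctrl, p_case, alpha=0.05, power=0.80):
    """Chromosomes needed in each group to tell two allele frequencies apart."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_ctrl + p_case) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_ctrl * (1 - p_ctrl) + p_case * (1 - p_case))
    )
    return ceil(numerator ** 2 / (p_case - p_ctrl) ** 2)


p_ctrl = 0.85                                   # assumed control frequency
p_case = case_allele_frequency(p_ctrl, 1.25)    # assumed odds ratio of 1.25
n_alleles = alleles_needed_per_group(p_ctrl, p_case)
n_people = ceil(n_alleles / 2)                  # two chromosomes per person
```

Under these assumptions the requirement comes to well over a thousand people in each group, which is consistent with the team's need to pool some 3000 samples.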

SNP success

Another team used SNPs successfully without gathering thousands of samples, despite early setbacks. Gilles Thomas of France's biomedical research agency INSERM and Fondation Jean Dausset CEPH in Paris and his colleagues were hunting for genes that might play a role in Crohn's disease, an ailment in which people mount an abnormal inflammatory response to the normal microbes in their gut. In 1996, the team fingered a 20-million-base-long region of chromosome 16 by traditional methods as carrying at least one risk gene. But several rounds of subsequent SNP analyses failed to link Crohn's to two genes that researchers suspected might play a role.

Thomas's group proceeded to make ever-finer maps of the region. They narrowed it to a 160,000-nucleotide span, using a database search and non-SNP markers, and then turned to SNPs for finer detail. They found 13 candidate SNPs by comparing DNA sequences from Crohn's patients and unaffected people. The team then used the SNPs as markers to analyze 235 families with members harboring the disease.

The approach worked: Crohn's patients carried at least one of three SNPs more frequently than controls, Thomas's group reported this May in Nature. And the more copies of the bad SNPs they carried, the greater their risk. All three culprits fell in a gene that encodes NOD2—a protein involved in microbe recognition and the body's first line of defense against pathogens. Until now, researchers hadn't demonstrated that the gene contributed to Crohn's disease, so the finding might open new avenues for treating the ailment.

Tailoring SNP studies

New technologies may drive down the expense and person-hours necessary for a solid SNP study. To improve speed, for example, some teams are refining methods to perform many DNA amplifications and SNP genotyping reactions simultaneously. Some of these “multiplexing” techniques rely on enzymatic reactions coupled to fluorescent-tagged probes, or mass spectrometry, rather than sequencing a stretch of DNA to find single-nucleotide differences.

At the analysis end, statisticians are working out more powerful ways to separate disease-related SNPs from the noise of irrelevant genetic variation. One technique, haplotype analysis, is generating almost as much enthusiasm as SNPs did when they were first introduced. A more sophisticated version of SNP analysis, this new approach relies on the fact that certain SNPs travel together on a block of DNA—in other words, they tend to be inherited together. Researchers can thus focus on just a few SNPs per block, cutting down on the cost of looking for links between segments of DNA and disease. Finding haplotypes is a tricky computational problem, but new computer algorithms can find the SNPs that are inherited as a package.
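The idea of typing only a few SNPs per block can be sketched with a simple greedy selection: measure how strongly each pair of SNPs is correlated (the r-squared statistic) and keep a SNP only if no SNP already chosen predicts it. This is an illustration of the principle, not one of the new algorithms mentioned above. It assumes the haplotypes are already known, which is the hard part, and the data, the function names, and the 0.8 threshold are all hypothetical.

```python
def r_squared(haplotypes, i, j):
    """Squared correlation between SNP i and SNP j.

    haplotypes: list of tuples, one per chromosome, with 0 or 1 at each SNP.
    """
    n = len(haplotypes)
    p_i = sum(h[i] for h in haplotypes) / n
    p_j = sum(h[j] for h in haplotypes) / n
    p_ij = sum(h[i] * h[j] for h in haplotypes) / n
    denominator = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denominator == 0:
        return 0.0  # one of the SNPs does not vary in this sample
    return (p_ij - p_i * p_j) ** 2 / denominator


def pick_tag_snps(haplotypes, threshold=0.8):
    """Greedily keep each SNP that no already-kept SNP predicts well."""
    tags = []
    for snp in range(len(haplotypes[0])):
        if not any(r_squared(haplotypes, snp, t) >= threshold for t in tags):
            tags.append(snp)
    return tags


# Hypothetical chromosomes: SNPs 0, 1 and 2 always travel together
haplotypes = [
    (0, 0, 0, 1),
    (0, 0, 0, 0),
    (1, 1, 1, 0),
    (1, 1, 1, 1),
]
tags = pick_tag_snps(haplotypes)
```

Here two SNPs stand in for four, and the savings grow with the length of the blocks.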

Several researchers are lobbying for funds to build such a haplotype map, and the National Institutes of Health sponsored a meeting in Washington, D.C., on 18 and 19 July to explore the idea. But no matter how detailed a SNP or haplotype map researchers draw, linking SNPs to disease is likely to work only on a case-by-case basis, depending on which disease, which genes, and which population each researcher chooses to study. As Kwok says, “Some investigators will be very lucky and others will not.”
