Genomicists Tackle the Primate Tree


Science  13 Apr 2007:
Vol. 316, Issue 5822, pp. 218-221
DOI: 10.1126/science.316.5822.218

Primates are taking center stage in genomics, with the macaque serving as an early milestone in understanding our relatives' genomes, and therefore our own.

The deciphering of the human genome was a humbling experience. The promise of the project, in the words of James Watson, was “to find out what being human is.” But even when most of the 3 billion bases of the human genome had been properly placed, much about the sequence defied understanding. Where in the 20,000 human genes uncovered are the ones that set Homo sapiens apart from other mammals, or other primates? To find out, genomicists have been scrambling for more data ever since, most recently from primates. “The goal is to reconstruct the history of every gene in the human genome,” says Evan Eichler, a geneticist at the University of Washington, Seattle. And that requires data from our relatives. DNA from different branches of the primate tree will allow us “to trace back the evolutionary changes that occurred at various time points, leading from the common ancestors of the primate clade to Homo sapiens,” says Bruce Lahn, a human geneticist at the University of Chicago in Illinois.

In 2005, the unraveling of the chimp genome provided tantalizing hints about differences between us and our closest relative (Science, 2 September 2005, p. 1468). Now on page 222, the third primate genome, that of the rhesus macaque, begins to put the chimp and human genomes into perspective. Macaques are Old World monkeys, which split perhaps 25 million years ago from the ape lineage that led to both chimpanzees and humans (see diagram). So when compared to apes, monkeys can help identify the more primitive genetic variants, allowing researchers to tease out the changes that evolved only in apes. Researchers want to take such analyses back to even more ancient evolutionary divergences, and so seven more primate genome sequences are under way, as is the sequencing of the DNA of two close nonprimate relatives. Together, these genomes “should teach us general principles of primate evolution,” says Lahn.

A consortium of more than 100 researchers who have been unraveling the macaque genome are detecting genes that have changed faster than expected in the chimp and human lineages; such speed is usually a telltale sign of significance in evolution. They are also finding that dozens of base changes known to put humans at risk for disease also exist in the healthy macaque—but not in the chimpanzee. That suggests that some gene variants implicated in disease are relics of the ancestral primate condition. Such studies “may be the bridge between comparative genomics and evolutionary biology,” says Richard Gibbs, director of the Baylor College of Medicine Human Genome Sequencing Center in Houston, Texas, and coordinator of the rhesus macaque genome project.

Gibbs and his colleagues are tackling evolutionary biology in reverse. They are identifying key genomic differences without yet knowing how or whether those differences translate into traits that provide survival advantages. Traditionally, researchers have first traced changes in the shapes and sizes of beaks, bodies, brains, and so on, then sought the genes behind them. The hope is that the two modes of inquiry will meet in the middle. But so far researchers have come up short in linking genomic changes to traits subjected to natural selection and other evolutionary forces, ironically because of sparse biological data on nonhuman primates, says glycobiologist Ajit Varki of the University of California, San Diego: “[Without] basic information about the chimp, its physiology, its diseases, its anatomy, you are really very impoverished about what you can say.”

Complete coverage.

Researchers plan to eventually sequence the genomes of all these primates and related species, with human, macaque, and chimp now published. The animals are arranged in an artist's rendition of their family tree, with estimated divergence dates in millions of years.


Beyond mouse

In 2001, the human genome sequence drove home how little we knew about our genomic selves. About one-third of our genes were complete unknowns. Researchers immediately started lining up our DNA with that of worms, fish, and rodents to see what genes matched up and to try to pin down functions. They found not just genes but also conserved regions within the “junk” DNA that played as critical a role in genome function as the genes themselves. Their finds led to an unquenchable thirst for sequence data as a way to clarify how genomes work. “Every additional species increases our ability to resolve functional from nonfunctional [DNA],” explains Ross Hardison, a molecular biologist at Pennsylvania State University in State College.

The surprise of the chimp genome, the first nonhuman primate genome to be sequenced, was the large number of insertions and deletions that differed between humans and our closest living cousins. There were more changes in the order and number of genes and blocks of genes than changes in single base pairs, highlighting the importance of this kind of expansion and shuffling in primate speciation.

But the chimp data proved frustrating as well, because researchers couldn't put the chimp-human comparisons into an evolutionary context. If humans had one base, say a C, at a position where chimpanzees had a G, researchers had no way of knowing which base represented the ancestral condition. Consequently, there was no way to tell whether the change at that position had occurred only in humans—and therefore perhaps helped define Homo sapiens—or in the chimp. And so in 2005, the National Human Genome Research Institute began stuffing more primates into the sequencing pipeline and approved the $20 million rhesus macaque sequencing project. “It is great to finally have a [distant relative] that allows us to assign differences between the human and chimpanzee genomes to either the human or the chimpanzee evolutionary lineage,” says Svante Pääbo of the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany.
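The logic of using a more distant relative to assign a change to one lineage can be sketched in a few lines of Python. This is a minimal illustration that assumes simple parsimony (the base shared with the outgroup is taken as ancestral); the function name and the toy sequences are invented for the example and are not drawn from the macaque project, whose actual analyses are more involved.

```python
def assign_lineage(human: str, chimp: str, outgroup: str) -> str:
    """Say which lineage changed at one aligned position, given an outgroup base."""
    if human == chimp:
        return "no difference"
    if outgroup == chimp:
        # Chimp still carries the presumed ancestral base, so human changed.
        return "human lineage"
    if outgroup == human:
        # Human still carries the presumed ancestral base, so chimp changed.
        return "chimp lineage"
    # The outgroup matches neither, so this simple rule cannot decide.
    return "unresolved"


# Toy aligned sequences (made up for illustration only).
human = "ACGTC"
chimp = "ACGTG"
macaque = "ACGTG"

calls = [assign_lineage(h, c, m) for h, c, m in zip(human, chimp, macaque)]
print(calls)
```

In the toy data, the last position is a C in human and a G in both chimp and macaque, so the rule attributes the change to the human lineage. Without the macaque column, that position could only be reported as a difference.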

Sequencers aren't stopping at the macaque. Sequencing of many primates, including the orangutan, the gibbon, and a New World monkey, the marmoset, is under way, with promises that the baboon should be next. In 2006, the Wellcome Trust Sanger Institute in Hinxton, U.K., started deciphering the gorilla genome, planning coverage similar to that of the macaque. Meanwhile, genomicists have started sequencing key genes and regulatory regions from other primates, too. “To tell what is human-specific, you need this comparative context,” says Anne D. Yoder, an evolutionary biologist at the Duke Lemur Center in Durham, North Carolina.

To know a genome

Already, the primate genomic data are revealing bits of our genetic history. For example, more than 98% of chimp and human bases agree. So researchers hoping to pick out areas with fewer base changes than expected—such as regulatory regions conserved in all apes—are awash in a tide of virtually identical DNA. But when the search is expanded to additional primates, there's more variation in the sequences, and previously undetectable conserved regions, even small regulatory sequences, begin to surface.

For example, Dario Boffelli, now at Children's Hospital Research Center in Oakland, California, and his colleagues at the Joint Genome Institute in Walnut Creek, California, wanted to understand the regulation of genes that help maintain healthy levels of cholesterol in the body. They looked at 558,000 bases covering genes involved in cholesterol processing, comparing human and six other primates: baboon, colobus monkey, dusky titi, marmoset, owl monkey, and squirrel monkey. They discovered regions with virtually the same sequence in all the primates. Subsequent experiments showed that three of the newly identified conserved regions do indeed regulate genes in the cholesterol pathway, Boffelli's team reported in January in Genome Biology.
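The idea of scanning several aligned primate sequences for stretches that are identical in every species can be sketched as follows. This is a simplified illustration, not the method Boffelli's team used: it assumes the sequences are already aligned to equal length, treats every character (including any gap symbol) as an ordinary base, and uses an invented function name and toy data.

```python
def conserved_runs(aligned: list[str], min_len: int) -> list[tuple[int, int]]:
    """Return (start, end) spans, end exclusive, where all sequences are identical."""
    runs = []
    start = None
    length = len(aligned[0])
    # Iterate one step past the end so a run touching the last column is closed.
    for i in range(length + 1):
        same = i < length and len({seq[i] for seq in aligned}) == 1
        if same and start is None:
            start = i
        elif not same and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


# Toy alignment of three hypothetical species.
aligned = [
    "ACGTACGTTT",
    "ACGTTCGTTT",
    "ACGTACGATT",
]
print(conserved_runs(aligned, min_len=3))
```

With only two nearly identical sequences, almost every column would count as conserved. Adding more divergent species breaks up the runs, so the spans that remain stand out.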

DNA reader.

Baylor's Richard Gibbs led more than 100 researchers in the sequencing of the macaque genome.


In other cases, particularly as researchers look for differences that reflect independent evolution, data from even one additional primate can help. In one analysis, the macaque team looked at 64,000 places in the macaque genome where they knew a disease-related mutation existed. In the past, researchers have assumed that such mutations were specific to humans. A few chimp genes had hinted that some problematic bases might predate humans, but the macaque drives home how often this may be the case. Hardison and his Pennsylvania State colleague Webb Miller found more than 200 sites where the macaque had the same base at the same position as the diseased or at-risk human. In 97 instances, both the chimp and the macaque matched the aberrant human base; in 48 cases it was just the chimp. And in 84 cases the rhesus, but not the chimp, matched the diseased human sequence, possibly because chimps also independently evolved away from the ancestral condition at those sites.
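The three-way bookkeeping in that comparison can be sketched in Python. The classification function and the toy sites below are hypothetical and only illustrate the sorting rule; the three counts at the end are the ones reported above, and together they add up to 229 sites.

```python
from collections import Counter


def classify_site(risk_base: str, chimp_base: str, macaque_base: str) -> str:
    """Report which nonhuman primates carry the human disease-associated base."""
    in_chimp = chimp_base == risk_base
    in_macaque = macaque_base == risk_base
    if in_chimp and in_macaque:
        return "both"
    if in_chimp:
        return "chimp only"
    if in_macaque:
        return "macaque only"
    return "neither"


# Hypothetical sites: (human risk base, chimp base, macaque base).
toy_sites = [("T", "T", "T"), ("A", "A", "G"), ("C", "G", "C"), ("G", "A", "A")]
toy_tally = Counter(classify_site(*site) for site in toy_sites)
print(toy_tally)

# Category counts as reported in the text above.
reported = {"both": 97, "chimp only": 48, "macaque only": 84}
total_reported = sum(reported.values())
print(total_reported)
```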

For example, about 1 in 15,000 people have phenylketonuria because their gene for an enzyme needed to process the amino acid phenylalanine is defective. Left untreated, the condition leads to the buildup of a toxic byproduct that causes intellectual disability. In macaques, that same defective gene is the normal condition and has no ill effects. It could be that many “disease” variants in humans are simply ancestral variants “where [a dietary or environmental] change between the human ancestor and the human has made a variant that used to be good, bad,” says Miller.

In addition, the macaque genome consortium combed the macaque, chimp, and human genomes for families of genes that had expanded in one or more species. A family consists of the original gene and any subsequent copies, many of which evolve slightly different sequences and functions over time.

One in particular intrigued Miller. This family, called PRAME—short for “preferentially expressed antigen of melanoma” because the genes are activated in melanoma and other types of tumors—has had a complex history in humans. It has at least 26 intact members on chromosome 1. It's one of the regions of the human genome that “are going wild,” says Miller. The chimp has a similarly complex set of PRAME genes, but Miller found just eight PRAME genes in the macaque. “The cluster is very simple [and has] remained stable for millions of years,” he explains. Working from this simpler, presumably ancestral set, he and his colleagues hope to unravel the timing and types of duplications that resulted in the abundance of human PRAME genes.

Elsewhere in the genome, the consortium found that the macaque has as many as 33 major histocompatibility complex genes (called HLA genes in humans), more than triple the number in humans. “When you see a dramatic change, it suggests there was some evolutionary selection that favored those extra copies,” says James Sikela, a computational biologist at the University of Colorado Health Sciences Center in Denver. “The tough question is, 'What favored that event?'”

While Sikela and colleagues ponder the macaque's need for HLA genes, Adam Siepel of Cornell University and his collaborators found other genes in which mutations were apparently favored by selection. Such positive selection, as it is called, typically shows up as bases that have mutated faster than would occur by chance. So Siepel's team compared 10,376 macaque genes with their equivalents in both the chimp and human genomes. They sought genes with a relative mutation rate that was higher in bases that changed the encoded amino acid than in bases that did not alter the coding. The researchers found 178 such genes, “considerably more” than previously identified in human-chimp scans, says Siepel. Some genes, such as a few involved in the formation of hair shafts, were changing rapidly in the three species, possibly because climate change or mate-selection strategies spurred rapid evolution, Siepel speculates.
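The distinction between base changes that alter the encoded amino acid and those that do not can be made concrete with a short Python sketch. It is a crude count, not the statistical test Siepel's team applied: it simply compares two aligned coding sequences codon by codon using the standard genetic code, and it does not normalize by the number of possible changes of each kind, as real rate comparisons do. The function name and the example sequences are invented.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_changes(seq_a: str, seq_b: str) -> tuple[int, int]:
    """Return (amino-acid-changing, silent) counts of differing codons."""
    changing = silent = 0
    usable = min(len(seq_a), len(seq_b))
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(0, usable - 2, 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            silent += 1
        else:
            changing += 1
    return changing, silent


# Toy coding sequences: Met-Leu-Glu versus Met-Leu-Asp.
seq_a = "ATGCTTGAA"
seq_b = "ATGCTCGAT"
print(count_codon_changes(seq_a, seq_b))
```

In the toy pair, CTT and CTC both encode leucine (a silent change), while GAA to GAT swaps glutamate for aspartate (an amino-acid-changing one). A gene in which the second kind of change is unusually frequent relative to the first is a candidate for positive selection.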

Other positively selected genes detected in at least one species included those involved in cell adhesion and cell signaling, as well as genes coding for membrane proteins. “We don't really know enough at this stage to point to a case where we have a really nice story of a difference at the molecular level that we can connect to a known phenotypic difference,” Siepel laments.

Siepel and others say that such stories require more primate sequences. Evidence of positive selection in the same genes in multiple species will provide more clues to what prompted such rapid evolution. Moreover, researchers can be more confident about labeling a gene as “human-specific” once they have looked in a number of our relatives and not found it. “The more primates one can compare, the better,” says Sikela.

Sequencing decisions, however, require tough choices about which species to sequence and how thoroughly. For his part, Boffelli thinks seven or eight primates would suffice and favors apes over prosimians, the most primitive living primates. With ape DNA, it will be easier to look for positive selection that led to humans. But Yoder thinks it's also important to understand how the whole primate branch has evolved, a point long made by researchers studying anatomy and behavior. “If you are going to understand which genes are primate-specific, you need a pretty broad phylogenetic spectrum, [with] things outside the primate clade but close to it,” she notes. That argument has already brought tree shrews and flying lemurs (which are not lemurs at all) into the picture, with researchers planning a quick skim across the DNA to get a very rough draft sequence.

Know thy genes.

The genomes of the gorilla, chimp, orangutan and human (left to right) will help clarify our evolution.


Others warn, however, that the quick skim, which is also planned for the bushbaby, mouse lemur, and tarsier, might not be enough. With anything short of finished sequence, computer programs may flag differences as signs of evolution when they are in reality sequencing errors, says Miller. That was the lesson of the chimp genome, which initially was not a very polished draft.

Varki says the genomic work promises to be challenging in other ways, too: “At the genomic level, evolution is extremely messy, involving every conceivable mechanism, probably with lots of blind alleys and red herrings. Deciphering the significance of these molecular changes will be far, far more complicated than I imagined.” Nonetheless, Siepel predicts, “we're going to learn a lot in the next 5 years.”
