Dynamics of Mammalian Chromosome Evolution Inferred from Multispecies Comparative Maps


Science  22 Jul 2005:
Vol. 309, Issue 5734, pp. 613-617
DOI: 10.1126/science.1111387


The genome organizations of eight phylogenetically distinct species from five mammalian orders were compared in order to address fundamental questions relating to mammalian chromosomal evolution. Rates of chromosome evolution within mammalian orders were found to increase since the Cretaceous-Tertiary boundary. Nearly 20% of chromosome breakpoint regions were reused during mammalian evolution; these reuse sites are also enriched for centromeres. Analysis of gene content in and around evolutionary breakpoint regions revealed increased gene density relative to the genome-wide average. We found that segmental duplications populate the majority of primate-specific breakpoints and often flank inverted chromosome segments, implicating their role in chromosomal rearrangement.

Whole-genome analysis of human, mouse, rat, and cattle genomes has indicated that breakpoints are reused (convergently) in karyotypic evolution, implying some bias for or against breakage in certain regions of mammalian genomes (1–3). To extend observations based on the few sequenced mammalian genomes (i.e., human, mouse, and rat) (4–7), we annotated homologous synteny blocks (HSBs) in the latest radiation hybrid (RH) genome maps of cat, cattle, dog, pig, and horse (8–12). The HSBs were defined for each species with the use of the human genome as a reference (NCBI Build 33) and required a minimum of two adjacent markers on the same chromosome in both species, without interruption by genes from other HSBs. Inversions were counted only if determined by three or more markers, each ≥1 million base pairs (Mbp) apart from its neighbors. We did not use the dog whole-genome sequence assembly because it was not available at the time of analysis. Furthermore, because of the low comparative resolution of many horse chromosome maps, only a subset of equine chromosomes was included for chromosome-specific analyses. We used the GRIMM-Synteny–based mouse-human and rat-human whole-genome sequence synteny blocks (13), but we allowed the threshold for considered rearrangements to be ≥1 Mbp to make resolution comparable to that of RH-based gene maps.
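
The HSB rule described above can be illustrated with a minimal sketch. It is not the tool used in the study (14): the `Marker` record, the `call_hsbs` function, and the example markers are hypothetical. It is also stricter than the stated rule, because any intervening marker that maps to a different chromosome ends the current run, and it ignores marker order and orientation in the compared species.

```python
from itertools import groupby
from typing import NamedTuple


class Marker(NamedTuple):
    name: str
    hsa_chr: str    # human chromosome carrying the marker
    hsa_pos: float  # position on the human reference, in Mbp
    other_chr: str  # chromosome carrying the ortholog in the compared species


def call_hsbs(markers, min_markers=2):
    """Return (hsa_chr, other_chr, start_mbp, end_mbp, n_markers) per block.

    Simplifying assumption: a block is a run of markers that are adjacent in
    human order and share the same chromosome in the compared species.
    """
    ordered = sorted(markers, key=lambda m: (m.hsa_chr, m.hsa_pos))
    blocks = []
    for (hsa_chr, other_chr), run in groupby(
        ordered, key=lambda m: (m.hsa_chr, m.other_chr)
    ):
        run = list(run)
        # Runs shorter than the minimum (two markers in the text) are dropped.
        if len(run) >= min_markers:
            blocks.append(
                (hsa_chr, other_chr, run[0].hsa_pos, run[-1].hsa_pos, len(run))
            )
    return blocks


# Hypothetical markers, for illustration only.
example = [
    Marker("A", "16", 1.0, "5"),
    Marker("B", "16", 2.0, "5"),
    Marker("C", "16", 3.0, "7"),  # singleton: ends the first run
    Marker("D", "16", 4.0, "5"),
    Marker("E", "16", 5.0, "5"),
    Marker("F", "16", 6.0, "5"),
]
print(call_hsbs(example))
```

In this toy input the single marker on chromosome 7 splits the chromosome 5 markers into two blocks; a faithful implementation would need an explicit rule for such singletons.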

We identified 1159 pairwise HSBs between the genomes of human and the six nonprimate species (average size = 13.7 Mbp; median = 7.4 Mbp) (table S1). A bioinformatics tool (14) was designed to align and compare HSBs across species (Fig. 1) (fig. S1). Using the seven-species pairwise HSBs, we compiled multispecies HSBs (table S2) and constructed an evolutionary scenario depicting the rearrangements between all genomes and their ancestors (Fig. 2) (fig. S2). We were able to reconstruct a genome for the ferungulate ancestor of Cetartiodactyla (pig and cattle) and Carnivora (cat and dog) (Fig. 2). Combining the putative ferungulate ancestor with the computed primate-rodent ancestor, we then estimated an ancestral boreoeutherian mammalian genome that contains 48% comparative coverage of the human genome (Fig. 2). The ferungulate and boreoeutherian ancestors had 24 pairs of chromosomes, and they contain the majority of the ancestral chromosomes expected on the basis of chromosome painting of diverse mammals (15). In contrast to these cytogenetic studies that largely reveal conservation of synteny but not gene order (15), we now have reliable reconstructions of internal chromosome structure for most ancestral chromosomes. Eleven of 24 boreoeutherian ancestor (BA) chromosomes contain large segments that are colinear with human chromosomes, whereas other chromosomes show more extensive rearrangement in each mammalian lineage (e.g., BA11, BA15, and BA19, Fig. 2 and fig. S2).

Fig. 1.

Multispecies comparative chromosome architecture of human chromosome 16 (HSA16). Gray blocks indicate HSBs, with the chromosomal identity indicated by either a number or an uppercase letter and a number. Lowercase letters indicate order of the HSB in that species' chromosome (in alphabetical order). Telomere and centromere positions are indicated by dark gray rectangles and ovals, respectively. Reuse breakpoints are indicated with arrowheads labeled RB on the right side of the chromosome ideogram. Positions of cancer-associated breakpoints (10 or more confirmed cases) (32) are indicated with arrowheads followed by a number indicating the associated gene: 83 = CREBBP, 84 = MYH11, 85 = FUS, 86 = CBFB. See table S9 for details of all numbered cancer breakpoints and their occurrences. Computed boreoeutherian ancestral HSBs aligned to HSA16 (BA10c and BA21b) are shown on the right. For visualization of all chromosomes, see fig. S1 and (14).

Fig. 2.

Genome architecture of the ancestors of three mammalian lineages computed by MGR (33) from the seven starting genomes and compared to the human genome (far left). Each human chromosome is assigned a unique color and is divided into blocks corresponding to the seven-way HSBs common to all species. The size of each block is approximately proportional to the actual size of the block in human. Physical gaps between blocks are shown in human to give an indication of the coverage. Also in human, the heterochromatic/centromere regions are denoted by hatched gray boxes. Numbers above the reconstructed ancestral chromosomes indicate the human chromosome homolog. Diagonal lines within each block (from top left to bottom right) indicate the relative order and orientation of genes within the block. Black arrowheads under the ancestral chromosomes indicate that the two adjacent HSBs separated by the arrowhead were not found in every one of the most parsimonious solutions explored; these are considered “weak” adjacencies. Arrowheads at the ends of HSB chromosomes indicate that some alternative solutions placed these chromosome-end HSBs adjacent to HSBs from other chromosomes.

We defined an evolutionary breakpoint region as an interval between two HSBs that is demarcated by the end-sequence coordinates of those HSBs on each side. We identified 492 breakpoint regions in our data set (table S3); 367 of these were refined to <4 Mbp in size (average = 1.2 Mbp; median = 1.0 Mbp). We focused on these to avoid any possible errors in regions of low comparative coverage. Breakpoints were further categorized as lineage-specific (found in only one species), order-specific (overlapping between species of the same order), superordinal (overlapping in all representatives of a superordinal clade), and reuse (occurring in the same breakpoint region in different species).
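
As a rough illustration of these definitions, the sketch below derives breakpoint regions from HSB coordinates on one human chromosome and applies the <4 Mbp filter. The function names, the `ORDER` lookup, and the coarse classifier are assumptions made for this example. Telling superordinal breakpoints apart from reuse breakpoints requires the phylogeny, so the classifier reports those two cases together.

```python
# Species-to-order lookup for the six nonprimate species named in the text.
ORDER = {
    "mouse": "Rodentia",
    "rat": "Rodentia",
    "cat": "Carnivora",
    "dog": "Carnivora",
    "cattle": "Cetartiodactyla",
    "pig": "Cetartiodactyla",
}


def breakpoint_regions(hsbs, max_size_mbp=4.0):
    """hsbs: (start_mbp, end_mbp) pairs for one species on one human chromosome.

    A breakpoint region is the interval between the end of one HSB and the
    start of the next; only regions narrower than max_size_mbp are kept.
    """
    ordered = sorted(hsbs)
    regions = []
    for (_, left_end), (right_start, _) in zip(ordered, ordered[1:]):
        size = right_start - left_end
        if 0 <= size < max_size_mbp:
            regions.append((left_end, right_start))
    return regions


def crude_category(species_with_break):
    """Coarse label from the set of species sharing a breakpoint region."""
    orders = {ORDER[s] for s in species_with_break}
    if len(species_with_break) == 1:
        return "lineage-specific"
    if len(orders) == 1:
        return "order-specific"
    # Needs the tree to decide: shared ancestry versus independent reuse.
    return "superordinal or reuse"


# Hypothetical coordinates, for illustration only.
print(breakpoint_regions([(0, 10), (11, 20), (26, 30)]))
print(crude_category({"mouse", "rat"}))
```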

Early comparative mapping studies concluded that there were three phases of chromosome evolution: (i) an early phase, 100 to 300 million years ago (Ma), with a slow rate of rearrangement; (ii) a second phase, 65 to 100 Ma, when there was an overall rate increase in mammalian lineages; and (iii) a reduction of mammalian rates during the Cenozoic Era (16). Other studies based primarily on chromosome painting data (unordered comparative maps) suggest a more dichotomous view of rearrangements during the Cenozoic (17), although recent studies based on chromosome painting of more genomes and a refined phylogeny (18) did not lend support to the bimodal model (15).

We looked for trends in the ordered genome data by examining rates of chromosome breakage throughout mammalian history. An evolutionary time scale (19) was used to infer rates (and assess confidence intervals) of breakage over time (Fig. 3). In contrast to previous studies (16), our results suggest an increase in breakage rates after the Cretaceous-Tertiary (K-T) boundary. Superordinal lineages predating the K-T boundary (i.e., the beginning of the Cenozoic) evolved at a rate of roughly 0.11 to 0.43 breaks per million years, whereas in ordinal and familial evolutionary lineages during the Cenozoic we find rate increases by factors of 2 to 4 in carnivores, primates, and cetartiodactyls, and by as much as a factor of 5 in rodents (Fig. 3). The only exception is the cat lineage, whose lower rate is partly a by-product of reduced map resolution relative to other species (20). Furthermore, our taxon sampling masks the fact that the rodent and primate rate increase occurred even later than shown in Fig. 3, because early primate and rodent ancestors, with origins around 75 Ma, had very conserved genome organizations (15). Thus, both ordered and unordered mapping data support the contention that early eutherian ancestors retained fairly conserved genomes.

Fig. 3.

Rates of chromosome breakage during mammalian evolution. The time scale is based on molecular divergence estimates (19). Rates (above the branches, in breaks per million years and 95% confidence intervals) were calculated using the total number of lineage, order, or superordinal breakpoints defined by the multispecies breakpoint analysis, and dividing these by the estimated time on the branch of the tree. The vertical gray dashed line indicates the K-T boundary, marking the abrupt extinction of the dinosaurs at ∼65 Ma and preceding the appearance of most crown-group placental mammal orders in the Cenozoic Era (19).
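
The rate calculation described in the caption is a simple ratio, sketched below. The interval method is an assumption: the text does not say how the 95% confidence intervals were obtained, so this example treats the breakpoint count as a Poisson variable and uses a normal approximation. The function name and the input numbers are hypothetical.

```python
import math


def breakage_rate(n_breakpoints, branch_length_my, z=1.96):
    """Return (rate, low, high) in breaks per million years.

    rate = breakpoints on the branch / branch duration.
    Assumption: Poisson counts with a normal-approximation interval,
    which is crude for small counts and is not the paper's stated method.
    """
    rate = n_breakpoints / branch_length_my
    half_width = z * math.sqrt(n_breakpoints) / branch_length_my
    return rate, max(0.0, rate - half_width), rate + half_width


# Hypothetical branch: 20 breakpoints over 40 million years.
rate, low, high = breakage_rate(20, 40.0)
print(f"{rate:.2f} breaks/My (approx. 95% CI {low:.2f} to {high:.2f})")
```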

Nearly 20% of all classified breakpoints were categorized as reuse (Fig. 1), suggesting a high frequency of independent rearrangements occurring at the same regions of the genome in different mammalian lineages. A majority of reuse breakpoints (71%) involve the two rodents and one or two other species; in most cases the two other most rapidly evolving genomes were cattle or dog (fig. S3), confirming and extending previous findings (1, 2). Multispecies alignments also afforded an examination of the relationship between evolutionary and cancer-associated chromosome breakpoints (Fig. 1) (figs. S1 and S5). The more frequent cancer-associated chromosome aberrations (more than nine human cases) fell within or near (±0.4 Mbp) evolutionary breakpoint regions three times as often as did the less frequent cancer-associated aberrations (two to nine human cases), whereas outside of the evolutionary breakpoint regions their distributions were not different. These data, and the complete absence of cancer-associated breakpoints in the three longest HSBs conserved across all the mammalian genomes studied (on HSA3, HSA13, and HSA20, fig. S1), suggest a link between meiotic and mitotic chromosome instability.

We analyzed the human gene content of all evolutionary chromosomal breakpoint regions that could be refined to <4 Mbp. We defined the narrowest interval (usually defined by rat and mouse) of the breakpoint region as a “core breakpoint,” and then analyzed gene density (NCBI Build 33 RefSeq predicted genes) in windows surrounding the midpoint of the core breakpoint (table S4). When the central 1 Mbp around the core breakpoint was compared to the overall gene density per Mbp outside of the breakpoint regions, there was a significant increase in gene density (P < 0.0001) (genome-wide average 12.3 genes per Mbp versus 17.6 genes per Mbp in breakpoints) (table S5). One of the most gene-dense regions of the human genome, the major histocompatibility complex (∼26 genes per Mbp), is also characterized by recurrent breaks in different mammalian lineages (e.g., dog, cat, cattle, murid rodents), marked amounts of gene turnover (21), and variation in centromere placement (22).
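
The windowed density measure can be sketched as follows, assuming each gene is represented by a single midpoint coordinate in Mbp. The function name and the example coordinates are hypothetical, and the significance test reported above is not reproduced here.

```python
def window_gene_density(gene_midpoints_mbp, core_start, core_end, window_mbp=1.0):
    """Genes per Mbp in a window centered on the core breakpoint midpoint."""
    midpoint = (core_start + core_end) / 2
    low = midpoint - window_mbp / 2
    high = midpoint + window_mbp / 2
    # Count genes whose midpoint falls in the half-open window [low, high).
    n_genes = sum(low <= g < high for g in gene_midpoints_mbp)
    return n_genes / window_mbp


# Hypothetical gene midpoints around a core breakpoint at 9.8 to 10.2 Mbp.
genes = [9.6, 9.9, 10.1, 10.4, 10.6, 12.0]
print(window_gene_density(genes, 9.8, 10.2))
```

Comparing this value with the density computed outside all breakpoint regions gives the kind of contrast reported in the text (17.6 versus 12.3 genes per Mbp).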

Recent segmental duplications annotated in the human genome arose during the last 40 million years of primate evolution (23, 24). However, early studies of human-mouse evolutionary breakpoints (25, 26) were unable to distinguish breakpoints that occurred during primate evolution from those that occurred on the rodent lineage—a necessary piece of evidence to implicate segmental duplication as a potential cause of primate chromosome rearrangements. We considered a chromosome breakpoint shared by all or nearly all nonhuman species, when aligned with the human genome sequence, to be evidence of a rearrangement occurring in the human lineage (fig. S1). We identified 40 breakpoints that could be classified as primate-specific (table S6); that is, they occurred somewhere after divergence of primates and rodents at 85 Ma (19). We cannot rule out, however, that some of these breaks preceded the basal divergence of primates from tree shrews and flying lemurs (18, 19). For primate-specific breakpoints, 98% contained segmental duplications (table S6). On the basis of comparison to the reconstructed ancestor (Fig. 2), we could infer that many of the primate-specific breaks involved inversions. Roughly 85% of the primate-specific breakpoint regions were populated by intrachromosomal duplications (table S6). In 62% of the cases, these intrachromosomal duplications flanked the inverted HSBs. Therefore, we suggest that in these cases duplications promoted nonallelic homologous recombination, and thus a chromosome rearrangement (23, 27). Because hundreds of regions across the human genome are occupied by primate-specific segmental duplications, whereas only a few dozen of these co-occur with primate-specific chromosomal rearrangements, such duplications are more likely to have promoted chromosomal rearrangements than to have resulted from them (23).

As chromosomes evolve by breakage and fusion, telomeres must be able to form de novo for meiosis and mitosis to occur normally. Extreme examples of conservation of telomere position are found at HSA14qter and HSA20qter (fig. S1), where conservation exists in chromosomes from four and five other species, respectively. On a genome-wide basis, 70% of telomere positions (N = 254) are conserved in more than one species (table S7). Within Rodentia, 34% of telomere positions are conserved (fig. S4), although this is not surprising given the relatively recent divergence of mouse and rat. By contrast, the longer evolutionary time separating cat and dog, and cattle and pig, is reflected by a very small fraction (<5%) of telomere positions being conserved exclusively within both orders (fig. S4). Conversely, a much larger fraction (40 to 50%) of carnivore and cetartiodactyl telomere positions are conserved with other orders, consistent with their slower overall rate of chromosomal evolution relative to mouse and rat. Although the dog genome is evolving more rapidly than the cat genome (Fig. 3) (28), dog telomere positions are often more conserved with homologous positions on chromosomes of other species (table S7). Our data show that sites of ancient telomere fusions, which would be signified by a telomere being “replaced” by a centromere in the new species, are likely to be quite rare. Most cases of telomere-to-centromere “conversions” appear to result from an internal breakage followed by centromerization of the former telomere. As an example, the telomeric region of ancestral chromosome 9a (HSA6p) may have become a centromere in an ancestral carnivore by the internal breakage of the segment followed by the de novo appearance of a centromere and a telomere, as represented on CFA35a and FCAB2a (fig. S1).

In contrast to telomeres, centromeres are more dynamic, rapidly evolving structures that can be repositioned among closely related species (29). In primates, a relatively large number of cryptic “neocentromeres” can develop into functional centromeres de novo, are associated with chromosomal abnormalities (30), and have evolutionary importance in karyotype evolution and speciation (31). For the multispecies analysis, the positions of 85 centromeres could be unambiguously determined. Of these, 52 (61%) show conservation of position in two or more species (table S8). The positions of 20 are conserved within Carnivora (N = 14) and Cetartiodactyla (N = 6), supporting a slower rate of chromosome evolution within these mammalian orders (table S8). The two rodent genomes were not included in the analysis of centromere conservation within and among orders because reliable positions for many metacentric rat centromeres in the sequence assembly were not available. For a given species, 39% of all centromere positions were found to be unique. Thus, a large fraction of centromeres analyzed were repositioned either by independent chromosomal rearrangements or by de novo centromere emergence, affirming the rapid evolution of centromeres.

Our analyses further revealed that telomere and centromere positions tend to cluster at sites of evolutionary breakages (fig. S1). Among the 85 centromere positions that could be classified, 38 were unambiguously assigned to HSBs, of which 28 (74%) occurred at the boundaries of evolutionary breakpoints. Furthermore, all 216 nonhuman telomeres appeared at the boundaries of evolutionary breakpoints or at the ends of computed ancestral chromosomes. These observations are logical given the requirement that the viability of a gamete containing the breakage is dependent on proper chromosome segregation in daughter cells as well as in subsequent meioses in an offspring. Another apparently related phenomenon is the joint clustering of centromeres and telomeres around evolutionary breakpoints. For example, there are 20 positions of clustering of telomere/centromere positions across the entire multispecies comparative landscape (fig. S1). Of these, 11 are clusters found in multiple species. Most of the centromeres that appear at evolutionary breakpoints as defined on the human genome are associated with the formation of acrocentric centromeres in other species.

The association between reuse breakpoints and the positions of centromeres or telomeres was significant (χ² = 14.5, P < 0.001, 1 df). When telomeres and centromeres were analyzed separately, only centromeres were found to be significantly associated with reuse breakpoints (P < 0.01; table S8). This observation suggests a possible mechanism for chromosome evolution and the appearance of reuse breakpoints, whereby these evolutionary breakages preferentially occur at sites of ancestral centromeres or neocentromeres in independent lineages. Alternatively, reuse breakpoints may represent unstable chromosomal sites that, after breakage, will tend to form a new centromere or telomere.
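
A test of this kind can be sketched as a Pearson χ² on a 2 × 2 table (reuse versus non-reuse breakpoints, with versus without a centromere or telomere). The cell counts below are hypothetical, and whether the original analysis applied a continuity correction is not stated, so none is used here. For 1 df the upper-tail probability is erfc(√(χ²/2)), which lets the reported statistic be checked directly.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denominator


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts, for illustration only.
stat = chi2_2x2(20, 10, 10, 20)
print(stat, p_value_1df(stat))

# The statistic reported in the text is consistent with its stated P < 0.001.
print(p_value_1df(14.5) < 0.001)
```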

We have shown that tremendous evolutionary activity exists at breakpoint regions, including reuse, increased gene density, segmental duplication accumulation, and the emergence of centromeres and telomeres. Taken together with our identification of reuse breakage occurring at the highest frequency between species with the most accelerated rates of chromosome evolution, our data suggest that there exist a limited and nonrandom number of regions in mammalian genomes that can be disrupted by these various dynamic processes. Given sufficient evolutionary time, these sites become “recycled” in different species. Future challenges lie in more fully interpreting the structure and function of breakpoint regions across a broader range of mammalian taxa, with the use of whole-genome sequence-based maps from phylogenetically divergent species.

Supporting Online Material

Materials and Methods

Figs. S1 to S5

Tables S1 to S9

References and Notes
