Special Reviews

The Centromere Paradox: Stable Inheritance with Rapidly Evolving DNA


Science  10 Aug 2001:
Vol. 293, Issue 5532, pp. 1098-1102
DOI: 10.1126/science.1062939


Every eukaryotic chromosome has a centromere, the locus responsible for poleward movement at mitosis and meiosis. Although conventional loci are specified by their DNA sequences, current evidence favors a chromatin-based inheritance mechanism for centromeres. The chromosome segregation machinery is highly conserved across all eukaryotes, but the DNA and protein components specific to centromeric chromatin are evolving rapidly. Incompatibilities between rapidly evolving centromeric components may be responsible for both the organization of centromeric regions and the reproductive isolation of emerging species.

Inheritance of genetic information requires a faithful copying mechanism. DNA replication and repair provide high-fidelity inheritance over evolutionary periods, whereas epigenetic inheritance is less rigid. DNA methylation can mediate epigenetic inheritance during development and even between generations of complex organisms (1). Stable protein-based inheritance is well-established for prions (2), and chromatin-based mechanisms are thought to maintain developmental states (3). Over the past decade, several authors have argued that centromeres, the sites of spindle attachment at mitosis and meiosis, can also be maintained epigenetically (4–8). Here, we examine recent studies on the basis for centromeric inheritance. These point to a novel chromatin-based mechanism for the maintenance of centromere location during multiple rounds of cell division. This mechanism may be responsible for the enigmatic organization of centromeric DNA and for the rapid onset of reproductive isolation as species emerge.

DNA Sequences at Centromeres

Delimiting the precise boundaries of centromeres has proven to be a daunting task. In animals and plants, centromeres are contained within regions of highly repetitive satellite DNA, which confounds even the most powerful mapping methods. Chromosomes in Saccharomyces cerevisiae are exceptional in that they lack satellite sequences and their centromeres have been precisely localized. Each of these “point” centromeres specifies spindle attachment with only ∼125 base pairs (bp) of DNA. However, this simplicity is evolutionarily derived, as centromeres from other fungal lineages include arrays of repeats (9, 10), much like what is found in animals and plants.

The simplicity of S. cerevisiae centromeres on small chromosomes led to the idea that specific repeated sequence elements might specify centromere location for larger chromosomes when present in a sufficient number of copies (11). However, this hypothesis has fallen out of favor for reasons that have been extensively reviewed (5–7). Most compelling is the lack of any common repetitive elements in many “neocentromeres” that occasionally are found in humans. Recent studies have documented human neocentromeres that lack α-satellite (12–15), which is the single common DNA sequence found in native human centromeres. Even fine-structure mapping of two human neocentromeres has failed to detect α-satellite or other tandem repeats found near centromeres (15, 16).

Could there be more subtle sequence motifs that have gone undetected within complex centromeres? Unfortunately, there are several technical difficulties in characterizing centromeric satellites. Centromeric and flanking sequences appear to be indistinguishable in a well-studied Drosophila minichromosome (17), and it is unknown what subset of α-satellite is found at human centromeres as opposed to satellites found in surrounding heterochromatin (18). In addition, there are no reports of a complete satellite-containing centromere that has been cloned, much less sequenced. In this “postgenomic” era, centromeric sequencing has only just begun.

A candidate centromeric motif would have to be reiterated over hundreds of kilobases, because this is the minimal size of fully functional centromeres in several organisms. In Drosophila, mapping of a minichromosome centromere has determined that 420 kb of primarily tandem repeats is required for full function and that removal of material from either end leads to a progressive reduction in transmission (19). In maize, a fully functional supernumerary B chromosome centromere contains ∼500 kb of tandem repeats, and partial deletions reduce transmission (20). In humans, even the smallest minichromosomes retain at least 100 kb of α-satellite (21), and human artificial chromosomes are found to contain α-satellite arrays in the megabase range (22, 23). Therefore, “regional” rather than point centromeres are characteristic of both animals and plants.

One interesting feature of most centromeric satellite repeats is their unit length. Despite the lack of universal sequence motifs, repeat unit length can be remarkably similar between organisms. For example, the basic α-satellite unit in primates is 171 bp; in the fish Sparus aurata, the centromeric repeat is 186 bp; in the insect Chironomus pallidivittatus, it is 155 bp; in both Arabidopsis and maize, it is 180 bp; and in rice, it is 168 bp (24–26). It is striking that the narrow range of repeat lengths found for these centromeric satellites corresponds closely to the range of nucleosomal unit lengths (11, 27). Larger repeat lengths, such as the 340-bp repeat found at pig centromeres, roughly encompass two nucleosomes. There may be exceptions, such as the pentameric satellite repeats in Drosophila melanogaster. These exceptions notwithstanding, selection for nucleosomal length might sometimes constrain evolution of centromeric satellites, consistent with their structural (noncoding) role in the genome.
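The comparison of repeat unit lengths can be checked with a few lines of arithmetic. The sketch below uses only the lengths quoted in the preceding paragraph; the mean it computes is a convenient summary of those six values, not a measured nucleosomal repeat length, and the function names are illustrative.

```python
# Centromeric repeat unit lengths (bp) quoted in the text.
repeat_bp = {
    "primates (alpha-satellite)": 171,
    "Sparus aurata": 186,
    "Chironomus pallidivittatus": 155,
    "Arabidopsis": 180,
    "maize": 180,
    "rice": 168,
}
PIG_REPEAT_BP = 340  # the larger pig centromeric repeat


def mean_unit_length(lengths):
    """Arithmetic mean of the quoted repeat unit lengths, in bp."""
    values = list(lengths.values())
    return sum(values) / len(values)


def units_spanned(repeat, unit):
    """How many units of length `unit` fit into a repeat of length `repeat`."""
    return repeat / unit


mean_bp = mean_unit_length(repeat_bp)
pig_units = units_spanned(PIG_REPEAT_BP, mean_bp)

print(f"range: {min(repeat_bp.values())}-{max(repeat_bp.values())} bp")
print(f"mean unit length: {mean_bp:.1f} bp")
print(f"pig repeat spans about {pig_units:.2f} mean units")
```

The six quoted lengths fall within a window of about 30 bp, and the pig repeat comes out close to two of these units, in line with the statement that it roughly encompasses two nucleosomes.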

Centromeric Proteins

The failure to detect common motifs that distinguish centromeres from noncentromere DNAs has led to speculations that non-DNA sequence determinants maintain centromeres (5, 7). Such determinants might account for examples of reversible centromere inactivation (28, 29). However, evidence is lacking for DNA modifications, noncoding RNAs, or enzymatic activities that might uniquely “mark” centromeres. Nevertheless, there are proteins found only at centromeres, and these are candidates for maintaining centromeric location. Many of these proteins are present at centromeres only at mitosis (30). Some of these form the kinetochore, a proteinaceous structure that assembles on centromeric chromatin and connects the centromere to spindle microtubules. The kinetochore appears at prophase and disappears at telophase (31). Other protein complexes, such as cohesins, are not part of the kinetochore but also disappear from centromeric regions at mitosis. Mutations in kinetochore or cohesion components compromise chromosome segregation; however, only constitutive proteins are candidates for directly maintaining centromeres.

The best candidate for maintaining mammalian centromeres is CENP-A (32), which is of special interest because it is the centromeric histone (33, 34). CENP-A is present at native centromeres and at neocentromeres (30, 35, 36) but is absent from centromeres that are mutated (13) or inactivated (37, 38). CENP-A, a member of the histone H3 protein family, copurifies with nucleosomes (33), replaces H3 in purified nucleosome particles (27), and can be assembled in vitro into nucleosomal core particles lacking H3 (39, 40). In vivo, each centromeric nucleosome consists of a histone octamer with CENP-A instead of H3 (27). Because CENP-A survives the protamine transition that removes all noncentromeric histones (33) and is present in sperm as distinct foci (4), it behaves as a heritable centromeric molecule.

CENP-A has counterparts in other eukaryotes (Fig. 1). Experimental evidence has confirmed that Cse4p in S. cerevisiae (41), HCP-3 in Caenorhabditis elegans (43) (Fig. 1B), Cid (for centromere identifier) in D. melanogaster (42) (Fig. 1D), and SpCENP-A in Schizosaccharomyces pombe (44) are exclusively centromeric. For example, in S. pombe, SpCENP-A is found only at the nonredundant central core region and not in the surrounding tandem repeats (44). In addition, molecular mapping of CENP-A in two human neocentromeres localizes it to the center of a region that includes the primary constriction (15, 16), which is the consistent cytological feature of regional centromeres in metaphase chromosomes. Indeed, wherever there is a centromere marker in metaphase chromosomes, the centromeric histone is found: Even in C. elegans, where the holokinetochore extends from one end of the chromosome to the other, HCP-3 is found at the underlying holocentromere (Fig. 1B). At the other end of the spectrum, the point centromeres of S. cerevisiae have what may be a single Cse4p-containing centromeric nucleosome (41). Thus, a centromeric H3-like histone underlies the vastly different types of centromeres found across eukaryotes.

Figure 1

(A) Schematic alignment of centromeric histones. Centromeric histones are similar to H3 in their core domains (light blue) but differ in their NH2-terminal tails even from each other, indicated by the various colors. (B) Antibody to HCP-3 labeling (green) of C. elegans holocentric chromosomes (red) at prophase (94). (C) Antibody to CENP-A (green) and antibody to CENP-B labeling (purple) of human metaphase chromosomes (red) [from (38) (with permission)]. CENP-A is located precisely at the centromere, whereas CENP-B binds to surrounding α-satellite. (D) Antibody to Cid labeling (green) of D. melanogaster prometaphase chromosomes (red) (95). The sister centromeres and holocentromeres have split.

Centromeric histones contain sequence features that distinguish them from histone H3, including a noncanonical NH2-terminal tail, a more divergent core histone fold, and a slightly longer loop 1 region (27, 45) (Fig. 1A). Although histone H3 is evolutionarily constrained, centromeric histones are strikingly divergent. We attribute this difference to the necessity of H3 to interact with the entire genome, whereas each centromeric variant need only interact with the current centromeric DNA (45). This DNA consists of satellite repeats, which are the most rapidly evolving components of eukaryotic genomes (46). We propose that the interaction between the centromeric histone and centromeric DNA is responsible for the approximately nucleosomal repeat lengths found for satellites.

In addition to the centromeric histone, other constitutive components of centromeres have been identified. Only one of these, mammalian CENP-C, is evolutionarily conserved, sharing a single conserved motif of less than 20 amino acids in length with the S. cerevisiae centromere protein Mif2. Among sequenced eukaryotes, a homologous sequence has been found in many organisms, including HCP-4 in C. elegans (47, 48), but not in Drosophila or Giardia. The mitotic localization of CENP-C and HCP-4 depends on the presence of the centromeric histone (47–49) [but not vice versa (47, 48, 50)]. Furthermore, HCP-4 does not localize to centromeres at interphase (47, 48). Other proteins have been found to be constitutive components of centromeres, but only in particular organisms (51). In yeasts, protein complexes that are constitutively present at centromeres appear to be required for localization of centromeric nucleosomes (44, 52). However, no counterparts to these complexes have been documented in other organisms. In S. cerevisiae, the presence of these complexes at centromeres could be a consequence of the constitutive nature of the kinetochore itself (53). Thus, the centromeric H3-like histone is the only current candidate protein for universally maintaining centromeres.

Maintenance of Centromeric Chromatin

Nucleosomes are distributed between daughter strands at replication (54), and current evidence favors a similar distributive segregation of centromeric nucleosomes (55). For centromeres to be maintained, new centromeric nucleosomes must be assembled, and H3-containing nucleosomes must be excluded to prevent a gradual degradation of centromeric identity. Two general models have been proposed for this selective assembly. In one, old centromeric nucleosomes specifically direct the deposition of new centromeric nucleosomes (6, 8) (Fig. 2A). This process could be mediated by an interaction between centromeric nucleosomes or facilitated by a nucleosome loading factor (44). Adenosine triphosphate–dependent nucleosome remodeling complexes are attractive mediators of this process, given that centromeric nucleosomes can be assembled without DNA replication, both in humans (55) and in flies (56). A second model supposes that centromeric histones are available only at a restricted time or nuclear location during the cell cycle (27, 42, 46) (Fig. 2B). This restriction would prevent the poisoning of centromeres by the vast excess of histone H3 and promote the assembly of centromeric nucleosomes. These two models are not mutually exclusive, as it seems likely that centromere maintenance involves both restricted availability of histones during centromere replication (56) and nucleosome remodeling after replication (8).
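The dilution argument can be made concrete with a toy calculation. The sketch below is a deliberately simplified illustration, not a model taken from the cited studies: it assumes that each round of replication passes exactly half of the old centromeric nucleosomes to each daughter strand, and that a hypothetical fraction `refill` of the resulting gaps is filled with centromeric histone rather than H3. Both the function name and the `refill` parameter are assumptions introduced here.

```python
def centromeric_fraction(divisions, refill=0.0):
    """Fraction of centromeric sites still carrying the centromeric histone.

    divisions: number of rounds of replication and cell division.
    refill:    hypothetical fraction (0..1) of the gaps left by distributive
               segregation that are refilled with centromeric histone
               instead of H3.
    """
    fraction = 1.0
    for _ in range(divisions):
        inherited = fraction / 2          # old nucleosomes split between daughters
        gaps = fraction / 2               # formerly centromeric sites now empty
        fraction = inherited + refill * gaps
    return fraction


for n in (1, 5, 10):
    print(n, centromeric_fraction(n), centromeric_fraction(n, refill=1.0))
```

With `refill=0.0` occupancy halves at every division, so centromeric identity is lost within a few generations; with `refill=1.0` it is fully restored each cycle. Either proposed mechanism, recognition or sequestration, amounts to keeping the refilled fraction close to one.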

Figure 2

Proposed mechanisms for centromere maintenance in dividing cells. (A) Recognition model (8). After replication of centromeric DNA, gaps in nucleosome arrays are filled by a replication-independent mechanism. Old centromeric nucleosomes discriminate centromeric octamers (light gray) from H3-containing octamers (black), either by a direct interaction or indirectly through a chromosome remodeling factor (blue), leading to specific assembly of new centromeric nucleosomes. (B) Sequestration model, showing replication timing along the chromosome (15, 56) (top). In the early S phase nucleus, histone H3 is excluded from the heterochromatic compartment, but centromeric histones are enriched (bottom). This leads to exclusive deposition of centromeric histones as centromeres replicate. Centromeres (cen), like euchromatin (eu) (both green) replicate earlier than their flanking heterochromatin (het) (orange).

Centromeres replicate in mid S phase (56–58); however, they are embedded within a heterochromatin compartment in which all other DNA replicates late (56). Should this nonreplicating compartment exclude replication-dependent histone H3 but sequester centromeric histones, the compartment would ensure that only the latter are available for assembly. This model is supported by four lines of evidence. First, both the Drosophila cid and S. pombe Cnp1 genes are expressed during a brief period of early S phase (42, 44) [although human Cenp-A does not follow this pattern (55)], suggesting early cell cycle availability of centromeric histones. Second, centromeres replicate before heterochromatin in mammals (57), barley (58), and Drosophila (56) (Fig. 2B). Third, histone H3 is not deposited at replicating Drosophila centromeres (56). Fourth, centromeric histones from diverse species appear to become sequestered within both Drosophila and human heterochromatin (42).

Consistent with the sequestration model, most native centromeres are inextricably linked to late-replicating heterochromatin and both consist of highly repetitive DNA. This has made it difficult to determine whether there is a causal relationship between centromeres and pericentric heterochromatin. However, human neocentromeres lacking repeats have been compared with their noncentromeric parents, and in this way, essential centromeric components can be identified. Several nonrepetitive neocentromeres accumulate Heterochromatin-associated Protein 1 (HP1), suggesting a causal link between heterochromatin and centromere function (30). Consistent with this view, late replication timing, a hallmark of heterochromatin, is an acquired feature of the regions on both sides flanking the ∼300 kb CENP-A–associated region of the mardel(10) neocentromere (15). Thus, the best characterized human centromere appears to be remarkably similar to S. pombe centromeres, where a central core containing the SpCENP-A centromeric histone is surrounded by repeats with heterochromatic features (44, 59).

These recent cytogenetic and molecular studies complement a growing body of genetic evidence that has revealed that heterochromatin is required for centromere function. Reduced chromosome transmission results from removal of flanking heterochromatin in Drosophila (60) and S. pombe (61). In both organisms, modifiers that disrupt heterochromatic silencing impair chromosome transmission (59, 60, 62). In S. pombe, this disruption of silencing and impaired transmission is accompanied by the hyperacetylation of histones flanking the central core (63, 64). Similarly, inhibition of histone deacetylases in mammalian cells causes delocalization of HP1 from heterochromatin and reduced chromosome transmission (65). Thus, flanking heterochromatin is a prerequisite for maintaining regional centromeres. Perhaps loss of heterochromatin was responsible for the utilization of DNA-binding proteins for centromere localization in budding yeast and the emergence of holocentricity in nematodes.

Centromere Evolution

Centromeric repeats comprise the most rapidly evolving DNA sequences in eukaryotic genomes, differing even between closely related species (46, 66, 67). This is not to say that satellites are hypermutable but rather that sequence variants are fixed by expansion and contraction (68) and can arise de novo at new sites (69). These satellite changes are brought about by a variety of mutational processes, including replication slippage, unequal exchange, transposition, and excision (68, 70). Such rapid change is paradoxical: Why has not a single optimal sequence been fixed at centromeres? A clue comes from examination of centromeric histones. These are expected to maintain favorable interactions with centromeric satellite (27). Comparison of Cid from closely related Drosophila species reveals that both the NH2-terminal tail and the histone core domain contain regions that have undergone frequent episodes of adaptive evolution (45). This is unexpected for a histone molecule, as histones are among the most evolutionarily constrained eukaryotic proteins. Within the histone core domain, most adaptive changes lie in loop 1, a region that makes direct H3-DNA contacts (71), suggesting that centromeric histone binding is sequence dependent (27). Consistent with this finding, all confirmed centromeric histones have a longer loop 1 than H3 (27, 72). Thus, the adaptive signal and its location provide compelling evidence that Cid has evolved in concert with centromeric DNA, reminiscent of the co-evolution of rDNA and its transcription factor (70). Similarly, the NH2-terminal tail may be adapting under constraints imposed by centromeric DNA. Understanding the basis of these adaptive changes could resolve the paradox of rapidly evolving centromeres.

Asymmetry at female meiosis may be the key. Of the four products of meiosis, three are lost and only one becomes the oocyte nucleus. There is evidence that the asymmetry of the meiotic tetrad provides an opportunity for chromosomes to compete for inclusion into the oocyte nucleus by attaining a preferable orientation at meiosis (73–75). Centromeres that can exploit this opportunity at meiosis I will “win,” and even a slight advantage at each female meiosis will be enough to rapidly drive a centromere to fixation. Additional recruitment of centromeric nucleosomes, for example, by the expansion of a centromeric satellite, would confer this advantage (Fig. 3). Genetic evidence that some animal and plant centromeres are “stronger” at meiosis dates back nearly half a century (73, 76). In maize, centromere strength is characteristic of heterochromatic “knobs,” which display poleward movement and meiotic drive during female meiosis (25, 73), and a similar drive process might contribute to the success of selfish B chromosomes (77). In humans, a variety of Robertsonian translocations, with two adjacent centromeres, consistently display a higher than expected transmission ratio (78).
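The claim that even a slight transmission advantage rapidly drives a centromere to fixation can be illustrated with a minimal deterministic sketch. It is not the authors' model: it assumes random mating, an infinite population, no fitness cost, equal frequencies in the two sexes, and a hypothetical parameter `k`, the probability that a heterozygous female passes the stronger centromere to the oocyte (`k = 0.5` is Mendelian). Males are assumed to transmit both centromeres equally.

```python
def next_frequency(p, k):
    """Frequency of the driving centromere after one generation.

    p: current frequency of the driving centromere.
    k: probability that a heterozygous female transmits it to the egg.
    """
    egg = p * p + 2 * p * (1 - p) * k   # homozygotes plus biased heterozygotes
    sperm = p                           # male meiosis is symmetric here
    return (egg + sperm) / 2


def generations_to_reach(p0, k, target=0.99, max_gen=100_000):
    """Generations needed to go from p0 to `target`; None if never reached."""
    p = p0
    for generation in range(max_gen):
        if p >= target:
            return generation
        p = next_frequency(p, k)
    return None


for k in (0.51, 0.55, 0.60):
    print(k, generations_to_reach(0.01, k))
```

Under these assumptions the per-generation change is p(1 - p)(k - 0.5), so the driving centromere spreads like an allele with a selective advantage of k - 0.5, and stronger drive shortens the time to near-fixation. Any fertility cost, such as the nondisjunction discussed below, would slow or halt this spread.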

Figure 3

Centromere drive model. Expansion of a satellite that binds Cid provides more microtubule attachment sites. This stronger centromere drives in female meiosis but also leads to increased nondisjunction. A mutation in Cid that alters sequence specificity leads to more extensive binding of the weaker centromere, providing more microtubule attachment sites. This restores meiotic balance and alleviates nondisjunction.

In females, the winning centromeres simply exploit the inherently destructive process of forming the egg and thus might not reduce fecundity. However, in Drosophila males, heterochromatic differences between paired chromosomes at meiosis I can cause nondisjunction (79) manifested as skewed sex ratios or infertility. We propose that these chromosome pairs have centromeric imbalances. Cid is the best candidate to relieve deleterious effects associated with centromere meiotic drive (45). For example, if Cid were to mutate such that it preferentially bound the weaker centromere, centromeric balance would be restored (Fig. 3). Such a beneficial cid allele will drive to fixation itself. This process suffices to explain both the evolutionary dynamics of satellite DNA and the adaptive evolution of Cid (80). Episodes of drive and deleterious mutation by transposons (17, 18) would lead to the accumulation of satellites representing centromeric relics surrounding functional centromeres (46, 81). This would also provide a mechanism for the well-documented fixation of chromosome-specific satellites (67) in successive episodes of drive.

Consider this process occurring in two isolated populations of the same species. Satellite-Cid configurations will diverge rapidly. In each population, Cid will evolve to suppress the deleterious effects of satellites that have driven through that population. By so doing, Cid becomes incompatible with the independently evolving centromeric satellites in the other population. Crosses between the populations will result in hybrid defects as centromeric drive is released again. Thus, the satellite-Cid drive process results in the onset of reproductive isolation between the two populations. In other words, speciation is an inevitable consequence of centromere evolution (82).

Hybrids between closely related species display a common syndrome of defects, most prominently, the sterility and inviability of the heterogametic sex, referred to as Haldane's rule (83). In most species, the XY male is the heterogametic sex; however, in many lineages, such as birds and butterflies, the ZW female is heterogametic. We think that our model is sufficient to explain the bias for hybrid sterility. In hybrids, there would be two sets of centromeric satellites and two alleles encoding the centromeric histone. The imbalance in centromere strength would be highest between the heterogametic pair, regardless of whether that pair is XY or ZW, because the centromeres of the heterogametic pair are always the most dissimilar. Thus, infertility and distortion of the sex ratio (84) would be symptoms displayed by hybrids bringing together incompatible satellite-Cid combinations (85).

Because centromeric histones are found in all eukaryotes, and asymmetric female meiosis has been documented in both animals and plants, our model for the origin of species is widely applicable. The model is testable, in that adaptive evolution should be found for centromeric histones in all organisms with asymmetric meiosis, but not necessarily in organisms with symmetric meiosis. Moreover, we predict that species in early stages of postzygotic reproductive isolation will show critical changes in residues that mediate contacts between the centromeric histone and centromeric DNA. If this model survives these tests, then we will have had a glimpse into how the basic machinery of chromosome segregation can result in the astonishing diversifications of life.

* To whom correspondence should be addressed. E-mail: steveh@fhcrc.org

