Four Evolutionary Strata on the Human X Chromosome


Science  29 Oct 1999:
Vol. 286, Issue 5441, pp. 964-967
DOI: 10.1126/science.286.5441.964

This article has a correction.


Human sex chromosomes evolved from autosomes. Nineteen ancestral autosomal genes persist as differentiated homologs on the X and Y chromosomes. The ages of individual X-Y gene pairs (measured by nucleotide divergence) and the locations of their X members on the X chromosome were found to be highly correlated. Age decreased in stepwise fashion from the distal long arm to the distal short arm in at least four “evolutionary strata.” Human sex chromosome evolution was probably punctuated by at least four events, each suppressing X-Y recombination in one stratum, without disturbing gene order on the X chromosome. The first event, which marked the beginnings of X-Y differentiation, occurred about 240 to 320 million years ago, shortly after divergence of the mammalian and avian lineages.

The human X and Y chromosomes, like those of other animals, are thought to have evolved from an ordinary pair of autosomes (1). The pseudoautosomal regions at the termini of the X and Y chromosomes still recombine during male meiosis, ensuring X-Y nucleotide sequence identity there. Elsewhere on the X and Y chromosomes, however, X-Y recombination has been suppressed. These nonrecombining regions of the X and Y chromosomes have become highly differentiated during evolution, and only a few X-Y sequence similarities persist within them. These modern X-Y gene pairs are the remaining “fossils” where extensive sequence identity between ancestral X and Y chromosomes once existed. The recent discovery of many X-Y genes has made it possible to examine the entire group to search for patterns of human sex chromosome evolution. Thus far, the human sex chromosomes—the best characterized mammalian sex chromosomes—have been found to contain 19 X-Y gene pairs (2).

We first compared the locations of all 19 pairs of genes on the human X and Y chromosomes (Fig. 1). We determined the relative positions of the X-linked genes through radiation hybrid analysis, in many cases confirming previously published localizations (3). Map positions of the Y-linked homologs were obtained principally from the literature (4–6). On the X chromosome, most of the X-Y genes map to the short arm, where they are concentrated toward the distal end. By contrast, the X-Y genes are found as singletons or small clusters throughout the euchromatic portion of the Y chromosome. In general, the map order of the X-linked genes corresponds poorly to that of the Y-linked homologs. Local exceptions to this rule are provided by three small gene clusters that are present on both X and Y chromosomes (Fig. 1).

Figure 1

Map of homologous genes in nonrecombining regions of human X and Y chromosomes. Pseudoautosomal regions of X and Y are black; heterochromatic region of Y is gray. Radiation hybrid analysis (3) was used to map genes on the X chromosome, which is drawn on a centiRay scale. K S-defined strata on the X chromosome are indicated. The boundary between strata 2 and 1 is somewhere between SMCX and RPS4X; here, it is arbitrarily shown at the centromere (white oval). Genes and pseudogenes on the Y chromosome were ordered previously by analysis of naturally occurring deletions (4, 5). UBE1X has a homolog on the squirrel monkey Y chromosome but not on the human Y chromosome (29). Brackets denote three small gene clusters (labeled a, b, c) that are present on both X and Y chromosomes.

We next measured, for each of the 19 X-Y gene pairs, synonymous nucleotide divergence between the X-linked and Y-linked coding regions (7). Because synonymous substitutions do not alter the encoded protein, they are generally assumed to be nearly neutral with respect to selection. The statistic K S (the estimated mean number of synonymous substitutions per synonymous site) is often used to gauge evolutionary time (8). In the present context, K S values provide a measure of the evolutionary time that has elapsed since the gene pairs started differentiating into distinct X and Y forms. The calculated K S values are given in Table 1, where gene pairs are listed according to map order on the X chromosome.

Table 1

Sequence divergence between homologous X- and Y-linked genes.


We noted that the 19 K S values appeared to cluster into approximately four groups (Fig. 2): 0.94 to 1.25 (group 1), 0.52 to 0.58 (group 2), 0.23 to 0.36 (group 3), and 0.05 to 0.12 (group 4). Each X-Y gene pair's K S value differed significantly from those of all gene pairs in other groups (P ≤ 0.02). The most striking observation was that, on the X chromosome, the four K S-defined groups of genes are arranged in an orderly sequence (Fig. 2). X-Y genes are stratified by age along the length of the X chromosome. By contrast, on the Y chromosome, the K S-defined groups appear to be scrambled (compare Table 1 and Fig. 1).
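As an illustration of the grouping just described, the following minimal Python sketch assigns a K S value to one of the four groups using only the ranges quoted above. The names KS_GROUPS and ks_group are hypothetical, and the lookup is not the statistical procedure used to define the groups; values falling in the gaps between ranges are simply left unassigned.

```python
# K S ranges for the four groups, as quoted in the text (group: (low, high)).
KS_GROUPS = {
    1: (0.94, 1.25),
    2: (0.52, 0.58),
    3: (0.23, 0.36),
    4: (0.05, 0.12),
}


def ks_group(ks):
    """Return the group whose quoted range contains ks, or None if no range does."""
    for group, (low, high) in KS_GROUPS.items():
        if low <= ks <= high:
            return group
    return None


# Illustrative inputs only; these are not values taken from Table 1.
for value in (1.0, 0.55, 0.30, 0.08, 0.70):
    print(value, ks_group(value))
```

A value such as 0.70 falls between the ranges of groups 1 and 2 and is returned as None, which mirrors the observation that the groups are separated by gaps.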

Figure 2

Plot of K S (Table 1) versus X-chromosome map position (Fig. 1) for 19 X-Y gene pairs.

What might account for the orderly stratification of X-Y genes by age on the human X chromosome? We hypothesize that, during evolution, differentiation of the X from the Y chromosome was initiated one region, or stratum, at a time. Regions were recruited in the order of their physical position, with stratum 1 (containing the genes of group 1) having been the first to embark on X-Y differentiation, and stratum 4 having been the most recent. Genes in the same stratum began differentiating into X and Y homologs at about the same time, accounting for their similar K S values.

X-Y differentiation would have occurred only after X-Y recombination ceased (9). Our findings suggest that during evolution, X-Y recombination was suppressed regionally, beginning with stratum 1 and subsequently expanding in discrete steps to include strata 2, 3, and 4. Chromosomal inversions, which are known to be capable of suppressing recombination across broad regions in mammals (10), would appear to be the most likely mechanism. These inversions must have occurred on the evolving Y chromosome, where the strata have been scrambled, but not on the X chromosome, where the order of strata apparently has been preserved (Figs. 1 and 2). [Had the strata on the human X chromosome been extensively shuffled during evolution—as may have occurred on the mouse X chromosome after divergence of the human and murine lineages (11)—we would have observed no correlation between the age of X-Y gene pairs and the map positions of their X-chromosomal members.] In the modern human sex chromosomes, the proximal boundary of the pseudoautosomal region is spanned by a gene that is intact on the X chromosome, but grossly interrupted on the Y chromosome (12), consistent with disruption of an ancient pseudoautosomal region by a Y-chromosomal inversion. We speculate that this particular event was the most recent in a series of inversions, each of which enabled X-Y differentiation to begin in one stratum.

This model of staged, region-by-region initiation of X-Y differentiation also accounts for two global features of the X chromosome's gene content: (i) the concentration in strata 3 and 4 of genes with detectable Y homologs (Fig. 1) and (ii) the concentration on the short arm (strata 2, 3, and 4) of genes that escape X inactivation, some with and some without Y homologs (13). Evolutionary theory predicts that once X-Y recombination ceased within a stratum, the genes on the affected portion of the Y chromosome began to decay, with most of the Y-linked genes ultimately being obliterated (1). As an adaptive response, homologous genes on the X chromosome were up-regulated, and subsequently became subject to X inactivation, processes thought to have spread during evolution on a gene-by-gene or cluster-by-cluster basis (14). If decay of Y-linked genes and adaptation of X-linked homologs were gradual evolutionary processes, then one would expect the youngest X strata to exhibit the highest densities of (i) genes with detectable Y homologs and (ii) genes that escape inactivation. Both predictions are met (Fig. 1) (13).

A comparison of the youngest (group 4) gene pairs with the older (groups 1 through 3) gene pairs illustrates certain temporal features of X-Y differentiation. We measured both synonymous and nonsynonymous substitutions for each gene pair (Table 1). Nonsynonymous substitutions alter the encoded protein and are constrained by selection. Thus, their frequency (K A, the estimated mean number of nonsynonymous substitutions per nonsynonymous site) is a function of both evolutionary time and selective constraints on the encoded proteins. The degree of constraint can be reflected in the ratio K S/K A; values greater than one indicate the presence of constraints on both homologs, and values in the vicinity of one are consistent with lack of constraint on at least one homolog (8, 15). In groups 1 through 3, 10 of 11 gene pairs exhibit K S/K A ratios of 3 or higher (Table 1), suggesting that natural selection has preserved the Y copies of these genes. Without such selection, these X-Y homologies (especially those in groups 1 and 2) would no longer be visible. By contrast, the seven gene pairs in group 4 show K S/K A ratios of 1 to 2, and in five of these pairs, the Y copy is known to be a pseudogene. Among the group 4 pairs, X-Y homology is readily apparent even in the absence of selective constraint, because there has been little time for erosion of sequence similarity. Thus, the Y-chromosomal genes of the older groups, and especially those of groups 1 and 2, are survivors of an early winnowing process that is still ongoing in group 4.

To determine the age of the K S-defined strata, we used two methods. First, we considered published information on homologs of representative genes in diverse mammals. The maximum age of stratum 4, for example, was suggested by the prior observation that homologs of STS and KAL1 are pseudoautosomal or autosomal in prosimians (16–18). Assuming that suppression of X-Y recombination is an irreversible evolutionary step (14), this implies that X-Y differentiation in stratum 4 began less than 50 million years ago (Ma), when the simian and prosimian lineages diverged (19). Minimum ages of the strata could also be inferred. For example, STS and KAL1 have been shown to have X- and Y-specific homologs in both New and Old World monkeys (16, 17), suggesting that X-Y differentiation in stratum 4 began at least 30 Ma, when the New and Old World monkey lineages diverged (19, 20). Using similar logic, we inferred the ages of stratum 3 (80 to 130 million years), stratum 2 (130 to 170 million years), and stratum 1 (130 to 350 million years) from prior data on gene homologs in more-distantly related species, including nonprimate mammals, marsupials, monotremes, and birds (21).
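The bracketing logic of this first method can be sketched in a few lines of Python. The function name bracket_age and the tuple layout are hypothetical choices made for illustration. The sketch assumes, as the text does, that suppression of X-Y recombination is irreversible, so a lineage that shares the X-Y differentiated state sets a minimum age, and a lineage in which the genes remain pseudoautosomal or autosomal sets a maximum age.

```python
def bracket_age(observations):
    """Bracket the onset of X-Y differentiation for one stratum.

    observations: list of (lineage, divergence_ma, differentiated), where
    divergence_ma is the time (million years ago) at which that lineage split
    from the human lineage, and differentiated is True if the stratum's genes
    have X- and Y-specific homologs in that lineage.
    Returns (min_age, max_age); either bound is None when no observation sets it.
    """
    # Shared differentiation implies the event predates the split.
    min_age = max((t for _, t, d in observations if d), default=None)
    # Genes still pseudoautosomal or autosomal imply the event postdates the split.
    max_age = min((t for _, t, d in observations if not d), default=None)
    return min_age, max_age


# Stratum 4, using the two divergence times quoted in the text.
stratum4 = [
    ("New World monkeys", 30, True),   # X- and Y-specific STS and KAL1 homologs
    ("prosimians", 50, False),         # homologs pseudoautosomal or autosomal
]
print(bracket_age(stratum4))  # (30, 50)
```

With only these two observations the sketch returns the 30 to 50 million year bracket stated for stratum 4; the brackets for the older strata would require the additional species data cited in (21), which are not reproduced here.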

These cross-species comparisons yielded reasonably precise estimates of age for strata 2, 3, and 4 (the younger strata) but only crude estimates of age for stratum 1. Because this oldest stratum might contain information about the origins of mammalian sex chromosomes, its age is of great interest. Here, we used a second dating method, based on K S values for X-Y gene pairs. Theory predicts that among human X-Y gene pairs, K S values should be roughly proportional to age (8). This expectation is met by the X-Y gene pairs of strata 2, 3, and 4 (Fig. 3). By extrapolation, we estimated that X-Y differentiation began 240 to 320 Ma in stratum 1 (Fig. 3). These findings suggest that X-Y divergence began shortly after the mammalian lineage arose, having diverged from the lineage of birds (with Z-W sex chromosomes) between 300 and 350 Ma (19). [Because the sex chromosomes of birds appear to be completely unrelated to the mammalian sex chromosomes, it is thought that they arose independently, from a different autosomal pair (22).] Interestingly, our K S findings indicate that SOX3 and SRY (the primary sex-determining gene) are among the oldest known X-Y gene pairs in humans (Table 1). This finding strengthens the hypothesis of Foster and Graves that an ordinary autosomal pair became sex chromosomes when mutations fashioned one allele of SOX3, originally an autosomal gene, into the male-determining factor SRY (23). Indeed, formal cluster analysis of the K S values we report suggests that the X-Y genes of group 1 might actually comprise two distinct strata, with SRY/SOX3 perhaps being older than the two other X-Y gene pairs of group 1 (RPS4X/Y and RBMX/Y) (24). Although the difference in K S values between SRY/SOX3 and the two other X-Y gene pairs is not statistically significant, the evidence is suggestive.

Figure 3

Plot of X-Y divergence time (age) versus average K S value for X-Y gene pairs (weight-averaged) in each stratum. The X chromosome schematic is adapted from Fig. 1. Maximum and minimum age estimates for strata 2, 3, and 4 are bracketed; these are not statistical confidence intervals. Theory predicts an approximately linear relationship between age and K S value (8); the shaded area is calibrated with respect to stratum 2, whose age is 130 to 170 million years (21) and whose average K S value is 0.53. By extrapolation, the age of stratum 1 is estimated between 240 and 320 million years.
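The linear extrapolation behind Fig. 3 can be illustrated with a short Python sketch. It assumes strict proportionality between age and K S, calibrated on stratum 2 (age 130 to 170 million years, average K S 0.53). The function name extrapolate_age is hypothetical, and the input K S of 1.0 is a placeholder rather than the weighted average for stratum 1, which is given in Table 1 and is not reproduced here.

```python
def extrapolate_age(ks, calib_ks=0.53, calib_age=(130.0, 170.0)):
    """Scale a calibration age range (million years) in proportion to K S.

    Defaults are the stratum 2 calibration values quoted in the text.
    Returns (low, high) age estimates in million years.
    """
    low, high = calib_age
    return ks * low / calib_ks, ks * high / calib_ks


# Placeholder K S of 1.0 (an assumption, not the Table 1 value for stratum 1).
low, high = extrapolate_age(1.0)
print(f"{low:.0f} to {high:.0f} million years")  # 245 to 321 million years
```

Under this assumption a K S near 1.0 gives roughly 245 to 321 million years, close to the 240 to 320 million year estimate reported for stratum 1. Because the scaling is linear, the same function applied to the calibration value itself returns the calibration range unchanged.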

If future studies establish that the group 1 genes are divisible into two strata, these results would also help date the emergence of X inactivation during mammalian sex chromosome evolution. XIST, an X-specific gene which plays a pivotal role in X inactivation (25), is located near RPS4X and therefore would be in the younger of the two strata, not in the stratum where the nascent X and Y chromosomes first differentiated. This would controvert the hypothesis of Chandra, who speculated that X inactivation emerged contemporaneously with the chromosomal sex-determining mechanism (26).

Consistent with our evolutionary map, Graves and colleagues have postulated that the long arm and proximal short arm of the human X chromosome are at least 170 million years old (27, 28). They have referred to this portion of the X as the “XCR” (X conserved region). Graves's XCR corresponds approximately to our strata 1 and 2. They have also postulated that the distal short arm of the human X chromosome is younger. This “XAR” (X added region) was attributed to translocation of an autosome to the pseudoautosomal region of both X and Y after divergence of placental mammals from marsupials (27, 28). Our strata 3 and 4 are found within Graves's XAR.

In conclusion, we postulate that the evolution of human sex chromosomes was punctuated by at least four events, plausibly a series of inversions on the Y chromosome (Fig. 4). Each event suppressed X-Y recombination in one stratum and enabled X-Y differentiation to proceed there. The first of these events, which created stratum 1, was roughly contemporaneous with the birth of the mammalian sex chromosomes and the emergence of SRY as the primary sex determinant. This occurred about 240 to 320 Ma, shortly after the mammalian and avian lineages diverged. The pseudoautosomal region was expanded by translocation of autosomal material between the second and third events (which created strata 2 and 3, respectively). The fourth event occurred relatively recently, during primate evolution, creating stratum 4, where X-Y differentiation is still in its earliest stages.

Figure 4

A proposed sequence of evolutionary events that generated four strata on the human X chromosome. Four inversions on the Y chromosome are postulated. Each inversion reduced the size of the pseudoautosomal (X-Y recombining) region (black; for simplicity, only one pseudoautosomal region is shown for each chromosome) and enlarged the portions of the X (yellow) and Y (blue) chromosomes that did not recombine during male meiosis. Ongoing decay and loss of Y genes offset these periodic expansions of the nonrecombining region of the Y chromosome. Points of divergence from the sex chromosomes of other mammals are indicated. This model does not preclude the occurrence of (i) additional inversions or other rearrangements within the nonrecombining portion of the evolving Y chromosome or (ii) similar rearrangements on the evolving X chromosome, so long as they do not disturb the fundamental order among the four strata.

  • * Present address: Department of Human Genetics, University of Chicago, 924 East 57th Street, Chicago, IL 60637, USA.

  • To whom correspondence should be addressed. E-mail: dcpage{at}

