Sequence Variants in the RNF212 Gene Associate with Genome-Wide Recombination Rate


Science  07 Mar 2008:
Vol. 319, Issue 5868, pp. 1398-1401
DOI: 10.1126/science.1152422


The genome-wide recombination rate varies between individuals, but the mechanism controlling this variation in humans has remained elusive. A genome-wide search identified sequence variants in the 4p16.3 region correlated with recombination rate in both males and females. These variants are located in the RNF212 gene, a putative ortholog of the ZHP-3 gene that is essential for recombination and chiasma formation in Caenorhabditis elegans. It is noteworthy that the haplotype formed by two single-nucleotide polymorphisms (SNPs) associated with the highest recombination rate in males is associated with a low recombination rate in females. Consequently, if the frequency of the haplotype changes, the average recombination rate will increase for one sex and decrease for the other, but the sex-averaged recombination rate of the population can stay relatively constant.

Recombination generates part of the diversity that fuels evolution. In humans, it has been suggested that recombination rate must be highly regulated (1), as too little recombination can lead to inaccurate disjunction and aneuploidy (2, 3), whereas ectopic exchange can lead to chromosomal rearrangements (4). Some regions in the genome, known as hotspots, have a much higher recombination rate per physical distance unit than the genome as a whole. By using high-density single-nucleotide polymorphism (SNP) data, from which historical recombination events can be inferred, and sperm data, substantial advances have been made in the understanding of local recombination rate (5–11). Furthermore, male and female recombination patterns are different in both genome-wide and regional recombination rates (12, 13). It is also firmly established that genome-wide recombination rate varies substantially among women (12, 13), and there have been hints that this also is true in men (14–16).

Previously, we genotyped a large number of families with a genome-wide microsatellite set of ∼1000 markers. This work allowed us to estimate the recombination rate for thousands of men and women and demonstrated that maternal recombination rate increases with the age of the mother and that there is a positive correlation between the number of children and the recombination rate of a woman (17). A common inversion on chromosome 17q21.31 was also identified that associates with recombination rate and fertility of women (18). Here, we performed a genome-wide scan for variants associated with recombination rate by genotyping 1887 males and 1702 females with recombination rate estimates on the Illumina Hap300 chip [see (19) for a description of study groups]. After quality filtering, 309,241 SNPs were tested for association with recombination frequencies. Male and female recombination rates were studied separately with weighted regression where the weight of a person was proportional to the number of children used to estimate recombination rate. We fitted an additive model with the estimated recombination rate regressed on the number of copies of an allele (0, 1, or 2) a person carried. The results were then adjusted for relatedness between individuals and potential population stratification with the method of genomic control (20). Specifically, standard errors of the effect estimates resulting from the regressions were multiplied by a factor of 1.041 for males and 1.067 for females, corresponding to dividing the chi-square test statistics by an adjustment factor of 1.084 = 1.041² and 1.138 = 1.067², respectively [see (19) for quality control and statistical analysis].
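The weighted additive regression and the genomic-control adjustment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation [see (19) for that]: the function names and the toy data are hypothetical, and only the inflation factors 1.041 and 1.067 are taken from the text.

```python
def weighted_additive_fit(allele_counts, rates, weights):
    """Weighted least-squares fit of rate = intercept + slope * allele_count.

    allele_counts: copies of the allele carried by each person (0, 1, or 2)
    rates: estimated recombination rate of each person
    weights: weight of each person (here, the number of children used)
    """
    total_w = sum(weights)
    x_mean = sum(w * x for w, x in zip(weights, allele_counts)) / total_w
    y_mean = sum(w * y for w, y in zip(weights, rates)) / total_w
    sxx = sum(w * (x - x_mean) ** 2 for w, x in zip(weights, allele_counts))
    sxy = sum(w * (x - x_mean) * (y - y_mean)
              for w, x, y in zip(weights, allele_counts, rates))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def genomic_control(std_error, chi_square, se_factor):
    """Inflate a standard error by se_factor and deflate the matching
    chi-square statistic by se_factor squared."""
    return std_error * se_factor, chi_square / se_factor ** 2


# Hypothetical toy data: six people, rates lying exactly on a line.
counts = [0, 1, 2, 1, 0, 2]
children = [2, 3, 4, 2, 5, 3]
rates = [2700.0 - 70.0 * c for c in counts]
intercept, slope = weighted_additive_fit(counts, rates, children)

# The chi-square divisors quoted in the text are the squared SE factors.
male_divisor = round(1.041 ** 2, 3)
female_divisor = round(1.067 ** 2, 3)
```

Because the toy rates are exactly linear in the allele count, the fit recovers the slope regardless of the weights; with real data the weights change the estimate by giving parents with more genotyped children more influence.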

For the recombination rate of males, three SNPs achieved genome-wide significance (P < 1.6 × 10⁻⁷, fig. S1). They were rs3796619 (P = 1.1 × 10⁻¹⁴), rs1670533 (P = 1.8 × 10⁻¹¹), and rs2045065 (P = 1.6 × 10⁻¹¹), which were all located within a small region in strong linkage disequilibrium (LD) on chromosome 4p16.3 (Fig. 1). The same three SNPs were also associated with the female recombination rate (Table 1). The last two SNPs achieved genome-wide significance; no other SNP did. However, when we compared males with females, the three SNPs each showed opposite effects. For example, allele T of rs3796619 was associated with low recombination rate in males, but high recombination rate in females. These three SNPs were genotyped in a replication sample of 1248 males and 1663 females for whom recombination rate estimates were also available. Overall, 3135 males and 3365 females were genotyped for the three SNPs. A total of 4388 nuclear families and 19,578 individuals genotyped with genome-wide microsatellite markers (19) contributed to the final recombination rate estimates confirming the original associations. With the samples combined, rs3796619 showed the strongest association with male recombination rate (P = 3.2 × 10⁻²⁴), and each copy of allele T (compared with C) was estimated to decrease recombination rate by 70.7 centimorgans (cM) (Table 1). For female recombination rate, the SNP rs1670533 showed the strongest association (P = 1.9 × 10⁻¹²), and, relative to the TT homozygote, each copy of allele C was estimated to increase recombination rate by 88.2 cM. The third SNP, rs2045065, was highly correlated with rs1670533 (r² = 0.99), and their effects could not be distinguished from each other. Henceforth, for simplicity, we focused on the joint effects of rs3796619 and rs1670533.
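The numbers in this paragraph can be checked with a few lines of arithmetic. The text does not say how the genome-wide significance threshold was derived; the sketch below assumes a Bonferroni correction at a family-wise level of 0.05 over the 309,241 SNPs tested, which happens to reproduce the quoted value. The variable names are illustrative.

```python
# Assumption: threshold = 0.05 / number of SNPs tested (Bonferroni).
n_snps_tested = 309_241
bonferroni_threshold = 0.05 / n_snps_tested

# Discovery and replication samples add up to the combined totals.
males_total = 1887 + 1248
females_total = 1702 + 1663

print(f"threshold = {bonferroni_threshold:.2e}")
print(f"combined sample: {males_total} males, {females_total} females")
```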

Fig. 1.

(A) The pairwise correlation structure in a 300-kb interval (0.9 to 1.2 Mb, National Center for Biotechnology Information Build 34) on chromosome 4. The upper plot includes pairwise D′ for 230 common SNPs [with MAF (minor allele frequency) > 5%] from the HapMap release 22 for the CEU population (24); the lower plot includes pairwise r² values for the same set of SNPs. (B) Sex-averaged recombination rate (saRR) estimates (in centimorgans per megabase) in the same interval, on the basis of the HapMap dataset (26). (C) Location of nine known genes in this region. (D) Schematic view of the correlation between recombination rates and genotypes for SNPs in the interval from the genome-wide association study of 1887 males (blue dots) and 1702 females (red dots). Plotted is –log₁₀ P, where P is the adjusted P value, against the chromosomal location of the markers. All four panels use the same horizontal megabase scale indicated at the bottom of panel (D).

Table 1.

Association of sequence variants with male and female recombination rate. The two alleles of each SNP, rs3796619, rs1670533, and rs2045065, are shown. The frequency of the first allele (e.g., allele T of rs3796619) in the sample is given. Effect is estimated on the basis of an additive model. Displayed is the estimated effect of the first allele relative to the second allele.


Although rs3796619 and rs1670533 are not surrogates of each other, they are in strong LD (D′ = 1, r² = 0.60), and only three of the four possible haplotypes were observed in our sample. Pairwise comparisons of the effects of these three haplotypes on recombination rate were made; an additive model was assumed (Fig. 2, A and B). For males, haplotype [C,T] was associated with significantly higher recombination rate than both [T,T] and [T,C] (P = 3.2 × 10⁻⁷ and 7.0 × 10⁻²³, respectively). Haplotype [T,T] was associated with a slightly higher recombination rate than haplotype [T,C], but the difference did not reach significance (P = 0.061). Hence, most, if not all, of the difference in male recombination rate between the three haplotypes could be explained by SNP rs3796619. For females, haplotype [T,C] was associated with significantly higher recombination rate than both haplotype [T,T] (P = 6.6 × 10⁻⁷) and haplotype [C,T] (P = 5.4 × 10⁻¹¹), although there was not a significant difference between the latter two haplotypes (P = 0.47). Hence, the SNP rs1670533 alone may account for all the differences in female recombination rate between these haplotypes.

Fig. 2.

(A and B) Difference in recombination rate between the three haplotypes of rs3796619 and rs1670533. Size of the circle is proportional to the frequency of the haplotype in the population (67.1% [C,T], 22.7% [T,C], 10.2% [T,T]). Each arrow indicates the comparison of two haplotypes and the difference of the haplotype the arrow is pointing to relative to the second haplotype.
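The reported LD between rs3796619 and rs1670533 (D′ = 1, r² = 0.60) can be reproduced from the three haplotype frequencies given in the Fig. 2 legend. The sketch below uses the standard textbook definitions of D, D′, and r² for two biallelic SNPs; the function name `ld_stats` is hypothetical, and the unobserved fourth haplotype [C,C] is assumed to have frequency zero.

```python
def ld_stats(hap_freqs, allele_a, allele_b):
    """Return (D, D_prime, r_squared) for two biallelic SNPs.

    hap_freqs maps (allele at SNP 1, allele at SNP 2) to a haplotype
    frequency; haplotypes that are absent count as frequency 0.
    allele_a and allele_b are the reference alleles at SNP 1 and SNP 2.
    """
    p_a = sum(f for (a, _), f in hap_freqs.items() if a == allele_a)
    p_b = sum(f for (_, b), f in hap_freqs.items() if b == allele_b)
    p_ab = hap_freqs.get((allele_a, allele_b), 0.0)
    d = p_ab - p_a * p_b
    # D is scaled by its largest possible magnitude given the allele
    # frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r_squared = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r_squared


# Haplotypes are written [rs3796619, rs1670533], as in the text.
haplotypes = {("C", "T"): 0.671, ("T", "C"): 0.227, ("T", "T"): 0.102}
d, d_prime, r_squared = ld_stats(haplotypes, "T", "C")
print(f"D' = {d_prime:.2f}, r^2 = {r_squared:.2f}")
```

D′ reaches 1 precisely because one of the four possible haplotypes is missing, whereas r² stays well below 1 because the two SNPs have different allele frequencies and so cannot be perfect surrogates of each other.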

For the association of rs3796619 with male recombination rate and the association of rs1670533 with female recombination rate, testing the additive model against the full model and treating the genotype as a categorical variable showed that, in both cases, these results were not significantly different from that expected under the additive model (P = 0.77 and 0.68, respectively). It is noteworthy that haplotype [C,T] was associated with high male recombination rate and low female recombination and that the exact opposite was true for haplotype [T,C]. To ensure that these observations were not an artifact of estimating recombination rates through family linkage data, however unlikely, we analyzed the data of 2152 couples, a subset of the 3135 males and 3365 females studied. When the genotypes of the spouses were included in the regression, they were not statistically significant and did not affect the association between the genotypes of the individuals and their estimated recombination rates.

Association results for individual chromosomes, with the estimated effect presented as a percentage of the average recombination rate of a chromosome, are displayed in Fig. 3 (see fig. S2 for effects in centimorgans). For males, allele T of rs3796619 was estimated to have a negative effect on recombination rate for 21 of the 22 autosomes, and this effect was significant (P < 0.05) for 13 chromosomes. Although this supports the hypothesis that the effect of the variant is indeed genome-wide, a test of heterogeneity indicated that the percentage change is not the same for all chromosomes (P = 0.001). For females, allele C of rs1670533 was estimated to have a positive effect on recombination rate for 22 of the 23 chromosomes, 13 of them significantly so. Unlike males, for females, a test of heterogeneity was not significant (P > 0.05), but chromosome 21 stood out as one that gives an estimated effect in the opposite direction.

Fig. 3.

Effect on recombination rate for individual chromosomes and all chromosomes combined. Estimate and 95% confidence interval are displayed for percentage change. (A) Effect of allele T of rs3796619 in males. (B) Effect of allele C of rs1670533 in females.

Relative to the population average, each copy of rs3796619 T was estimated to decrease male genome-wide recombination rate by 2.62%. For rs1670533 C and female genome-wide recombination rate, the estimated effect was an increase of 1.87%. From the regression results, rs3796619 T explained 3.5% of the variation in male recombination rate and rs1670533 C explained 1.7% of the variation in female recombination rate. These numbers, however, substantially underestimate the total parental effect explained by these variants. This is because, on the basis of the recombination counts of a few children only, the inherent recombination rate of an individual, which mathematically can be viewed as the average recombination counts in the children (if the person were to have an infinite number of children), was measured with substantial error. The latter has to be estimated and deducted from the total variation in the regression analyses in order to properly evaluate the contribution of the identified variants. With a method that utilizes the different chromosomes as pseudo-replicates, we estimated that about 6.6% of the total variation in the paternal recombination count of a child could be attributed to a systematic effect associated with the father [see (19) for a description of the method and the decomposition of the total variation into various components]. A systematic maternal effect was estimated to account for about 11% of the variation in the maternal recombination count of a child. For the fathers and mothers in this study, who have, on average, about 2.75 genotyped children, the paternal effect accounted for ∼16% of the variation in the estimated recombination rates of the fathers, and the maternal effect accounted for ∼26% of the variation in the estimated recombination rates of the mothers. Hence, rs3796619 explains about 22% (3.5/16) of the paternal effect, and rs1670533 explains about 6.5% (1.7/26) of the maternal effect.
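The step from a per-child share of variation to a per-parent share, and the final ratios, can be approximated as follows. This is a simplified sketch and not the method of (19): it assumes that a child's recombination count is a parental effect plus independent noise, and that every parent has exactly 2.75 children, so that averaging over n children shrinks the noise variance by a factor of n. The function name is hypothetical.

```python
def parental_share_of_mean(per_child_share, n_children):
    """Share of variance in a parent's estimated rate (a mean over
    n_children counts) that is due to the parental effect.

    If f = s / (s + e) for a single child, the mean over n children has
    variance s + e / n, so the share becomes n*f / (1 + (n - 1)*f).
    """
    f = per_child_share
    return n_children * f / (1 + (n_children - 1) * f)


# Inputs from the text: 6.6% (paternal) and 11% (maternal) per child,
# and about 2.75 genotyped children per parent.
paternal = parental_share_of_mean(0.066, 2.75)
maternal = parental_share_of_mean(0.11, 2.75)

# Fractions of the parental effect explained by each variant,
# using the rounded percentages quoted in the text.
share_rs3796619 = 3.5 / 16
share_rs1670533 = 1.7 / 26

print(f"paternal ~ {paternal:.0%}, maternal ~ {maternal:.0%}")
print(f"rs3796619: {share_rs3796619:.0%}, rs1670533: {share_rs1670533:.1%}")
```

Under these assumptions the paternal value comes out at about 16% and the maternal value at about 25%, close to the ∼16% and ∼26% reported; the small gap for mothers is plausibly due to the simplification that all parents have the same number of children.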

The three SNPs with the strongest association with recombination rate in males and females were located in an LD block spanning 200 kb. Two genes, SPON2 and RNF212, are located within this LD block (Fig. 1). The SPON2 gene may act as an opsonin and pattern-recognition molecule for a range of pathogens through detection of carbohydrate structures and activation of macrophages. The RNF212 gene has not been characterized, but homology searches indicate that the gene encodes a RING finger protein that may be a ubiquitin ligase. Gene ortholog predictions suggest homology between the mammalian RNF212 gene and the ZHP-3 gene (a homolog of the ZIP3 gene in yeast), which is involved in meiotic recombination (21) (fig. S3). Meiotic recombination is likely to depend, in large part, on recombinatorial repair of programmed meiotic double-strand breaks (22). The synaptonemal complex (SC), a structure formed by close association of axes of homologous sister chromatid pairs, is essential for crossover formation and completion of meiosis (23). Three proteins involved in this complex, Zip2, Zip3, and Zip4 (from the yeast Zip genes), mediate protein-protein interactions (23). On the basis of gene knockout experiments, the ZIP3 homolog in Caenorhabditis elegans, ZHP-3 (K02B12.8), is essential for reciprocal recombination between homologous chromosomes and, thus, for chiasma formation (21). Similarities between distantly related organisms, such as Saccharomyces cerevisiae and C. elegans, suggest that the structure and function of the SC are conserved among distantly related species. This suggests that the RNF212 gene may play a crucial role in recombination and assembly of the SC (21) in mammals, as do its putative orthologs in C. elegans and S. cerevisiae. Although the conservation between RNF212 in humans and ZHP-3 in C. elegans is limited (36 out of 118 amino acids in the RING finger domain) (21), our findings support the idea that the RNF212 protein may be involved in recombination.

In the LD block containing RNF212, on the basis of the HapMap group composed of European Americans (CEU), data release 22 (24), there are a large number of SNPs highly correlated with either rs3796619 or rs1670533 or with haplotypes formed by alleles of the two SNPs. Nine SNPs tag rs3796619 and 43 SNPs tag rs1670533 with a pairwise correlation coefficient r² > 0.9. One of the SNPs strongly correlated with rs3796619 is rs4045481 (r² = 0.96), a silent (synonymous) SNP in the third exon of RNF212 mRNA (BC050356). A deletion, rs33995490, in the coding region of the fourth exon of another RNF212 mRNA (BC036250) is strongly correlated with allele C of rs1670533 (r² = 1.0) (figs. S4 and S5). Thus, these variants may affect recombination rate, but functional work is required to prove causality. We sequenced the exons of the RNF212 gene and did not find any other common coding variants in the RefSeq Gene variant (NM_194439.1) that could account for the observed association with recombination rate (see table S2 for details).

A phylogenetic analysis of a 55-kb region containing rs3796619 and rs1670533 in the HapMap data (24) revealed three well-differentiated clusters of haplotypes showing notable differences in frequency between the Yoruban Nigerians (YRI) and CEU and East Asians (CHB and JPT) (fig. S6). The [C,T] and [T,C] haplotypes that associate most strongly with recombination rate have a combined frequency of only 17% in the YRI sample, but reach a frequency of 91% and 98% in the CEU and East Asian samples, respectively. Several SNPs in this region show an unusual degree of divergence among the HapMap groups, on the basis of the rank percentile of their FST values (Wright's coefficient, a measure of variance in allele frequencies among populations) among all autosomal SNPs with the same overall frequency in the HapMap. Specifically, we identified eight SNPs whose FST values are in the top 0.5% for differences between the YRI and East Asian HapMap samples and also in the top 5% of differences between the YRI and CEU samples. Each of these SNPs differentiated a subset of [T,T] haplotypes from the rest, perhaps indicating an episode of positive selection (or a severe founder effect) that increased the frequency of [C,T] and [T,C] haplotypes in the ancestors of European and East Asian populations.
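To give a feel for how large a frequency difference of this kind is on the FST scale, the sketch below computes a simple two-population FST as (H_T − H_S)/H_T from expected heterozygosities. It is only an illustration and not the analysis described above, which ranks per-SNP FST values against other HapMap SNPs. The assumptions are: the [C,T] and [T,C] haplotypes are pooled and treated as a single allele, the two populations are weighted equally, and no correction is made for sample size. The function name is hypothetical.

```python
def fst_two_populations(p1, p2):
    """Wright's FST for one biallelic locus in two equally weighted
    populations with allele frequencies p1 and p2."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)            # pooled heterozygosity
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_total == 0:
        return 0.0                               # locus fixed in both
    return (h_total - h_within) / h_total


# Combined [C,T] + [T,C] frequency: 17% in YRI and 91% in CEU (from the text).
fst_yri_ceu = fst_two_populations(0.17, 0.91)
print(f"illustrative FST (YRI vs CEU) = {fst_yri_ceu:.2f}")
```

Identical frequencies in the two populations give an FST of 0, and complete fixation of different alleles gives 1, so the value obtained here sits in the upper part of the range.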

Although an inversion on 17q21.31 has previously been associated with recombination rate in women, the variants described here are shown to associate with both male and female recombination rate. Notably, they are associated with opposite effects in the two sexes. It is possible that variants with such properties are unique to recombination rates. Biologically, the processes of male and female recombination are different (25), which may allow such variants to exist. These variants could serve a key function from an evolutionary perspective, as they allow the transfer of the recombination contribution from one sex to the other with minimal impact on the average recombination rate for the population as a whole.

Supporting Online Material

Materials and Methods

Figs. S1 to S6

Tables S1 to S3


References and Notes
