Evidence for Selective Advantage of Pathogenic FGFR2 Mutations in the Male Germ Line


Science  01 Aug 2003:
Vol. 301, Issue 5633, pp. 643-646
DOI: 10.1126/science.1085710


Observed mutation rates in humans appear higher in male than female gametes and often increase with paternal age. This bias, usually attributed to the accumulation of replication errors or inefficient repair processes, has been difficult to study directly. Here, we describe a sensitive method to quantify substitutions at nucleotide 755 of the fibroblast growth factor receptor 2 (FGFR2) gene in sperm. Although substitution levels increase with age, we show that even high levels originate from infrequent mutational events. We propose that these FGFR2 mutations, although harmful to embryonic development, are paradoxically enriched because they confer a selective advantage to the spermatogonial cells in which they arise.

Striking male biases in mutation rate are observed in many human genetic disorders. Among these, three genes (RET, FGFR2, and FGFR3) encoding receptor tyrosine kinase proteins stand out because of the very high apparent rate of specific nucleotide substitutions and near-exclusive paternal origin of mutations (1, 2). We focused this work on position 755 of FGFR2 (normally a cytosine, 755C), which shows the highest inferred mutation rate within this gene (3).

The local sequence context of the 755C nucleotide, part of a CpG doublet, is shown in Fig. 1A. Heterozygous 755 C-to-G (C>G) transversions (4) cause ∼66% of cases of Apert syndrome (5, 6), a dominantly inherited malformation (7) that usually occurs by new mutation. The mutations arise exclusively from the unaffected father and are associated with increased paternal age (8, 9). On the basis of the observed birth prevalence of Apert syndrome of ∼1 in 70,000 (10, 11), the apparent rate of 755C>G mutations in sperm is 9.4 × 10⁻⁶, which is elevated 200- to 800-fold over the background genomic rate for C>G transversions at CpG dinucleotides (12, 13). Of the other possible heterozygous substitutions at position 755, C>A has not been documented, but the C>T transition was identified in five subjects from two families and was associated with a normal or mild Crouzon syndrome phenotype (7, 14, 15). Highlighting the unusual mutability of the 755C nucleotide, the C>T has also been described in cis with four additional mutations, generating the mutant FGFR2 alleles 755_756CG>TT (14, 16), 755_756CG>TC (15), 755_757CGC>TCT (14) (Fig. 1A), and 755C>T; 943G>T (17). In the heterozygous state, these multiple mutations (expected to occur at rates of ∼10⁻¹¹) (13) are consistently associated with more severe phenotypes (Apert syndrome, Pfeiffer syndrome, or severe syndactyly) (7) than 755C>T alone.
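The apparent per-sperm rate of about 9.4 × 10⁻⁶ follows from simple arithmetic on the figures quoted above. The sketch below is illustrative only: the variable names are ours, and it assumes that each affected birth corresponds to one mutant paternal gamete.

```python
# Assumption: each affected birth reflects one mutant sperm, so the birth
# prevalence times the 755C>G share of cases gives the per-sperm rate.
apert_birth_prevalence = 1 / 70_000   # ~1 in 70,000 births (10, 11)
fraction_755_c_to_g = 0.66            # ~66% of Apert cases (5, 6)

apparent_rate = apert_birth_prevalence * fraction_755_c_to_g
print(f"apparent 755C>G rate per sperm: {apparent_rate:.1e}")

# Background C>G rate at CpG implied by the quoted 200- to 800-fold elevation.
background_low = apparent_rate / 800
background_high = apparent_rate / 200
print(f"implied background rate: {background_low:.1e} to {background_high:.1e}")
```

The implied background range is a back-calculation from the stated fold elevation, not a value reported in the text.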

Fig. 1.

Sequence context and detection of mutations at FGFR2 position 755. (A) DNA sequence around nucleotide 755 (in bold) in relation to intron/exon structure (intron nucleotide in lower case), –112 G/A snp, and oligonucleotide primers (table S1). The MboI site (yellow) and relevant nucleotide substitutions (red), with corresponding amino acid substitutions and heterozygous phenotypes, are indicated. (B) Flow diagram summarizing the protocol for quantifying mutations. (C to E) Representative pyrograms from unspiked sperm samples. E and S indicate dispensation of enzyme and substrate, respectively; the filled arrowhead marks the diagnostic peak for 755G; and unfilled arrowheads, the two diagnostic peaks for 755T. Relative peak heights are used to quantify the different molecular species. In (C), G predominates, with a small amount of T. In (D), G is absent and T predominates; note equal heights of peaks diagnostic for T. In (E), the pyrogram shows mixed G and T. Exceptionally, the second T diagnostic peak is lower than the first. (F) DNA sequence electropherogram of a clone from this sample, showing the double mutation 755_756CG>TC (filled arrowheads).

We developed a protocol to enrich samples for mutations at 755C (Fig. 1B). Genomic DNA samples were digested with MboI, which cuts only the normal sequence (Fig. 1A); the undigested fraction was amplified by polymerase chain reaction (PCR), further enriched by repeat MboI digestion, and quantified with the use of Pyrosequencing technology (Pyrosequencing AB, Uppsala, Sweden) and statistical modeling (18) (fig. S1A). To determine absolute mutation levels, samples were spiked with two concentrations of genomic DNA from the patient (“GR”) heterozygous for the 755_757CGC>TCT substitution. Representative pyrograms are shown in Fig. 1, C and D.

To validate the method, we quantified the 755C>G mutation when progressively diluted with normal DNA from blood (Fig. 2A). At mutant concentrations of ≥10⁻⁵ (comparable to the birth rate of Apert syndrome), point estimates were accurate within twofold; below 10⁻⁵, mutation levels were overestimated, probably because of in vitro DNA damage and PCR errors (19). Next, we undertook pilot analyses of sperm samples from an individual with a relatively high 755C>G count (Fig. 2B). We obtained consistent estimates both in sample replicates and in different samples, indicating that analysis of single sperm samples accurately reflects mutation prevalence in the individual (18) (fig. S2A).

Fig. 2.

Validation of estimated mutation levels at FGFR2 position 755, calculated as posterior mean ±95% equal tail probability interval (ETPI). (A) Measurement of 755 C>G in a reconstitution experiment employing two different sources of Apert genomic DNA. (B) Measurement of C>G, C>T, and C>A levels in 14 replicates of a single sperm sample (left) and 13 different samples from the same individual (right).

We next measured mutation levels in samples from three sources: (i) blood from healthy individuals (n = 11), (ii) sperm from healthy men without a family history of Apert syndrome (n = 99), and (iii) sperm from unaffected fathers of children with Apert syndrome caused by the 755C>G mutation (n = 6). The levels of 755 C>G, C>T, and C>A are shown in Fig. 3A, B, and C, respectively. Only low levels (<10⁻⁵) of all mutations were found in blood, which excludes the possibility that higher levels in sperm were caused by contamination or PCR artifacts. In sperm, the level of 755C>A never exceeded 6.3 × 10⁻⁶ and showed no paternal age effect (r = –0.06, P = 0.71), but both 755C>G and 755C>T reached high levels (maxima of 1.6 × 10⁻⁴ and 1.4 × 10⁻⁴, respectively), that were positively correlated with donor age (r = 0.39, P < 0.0001 for 755C>G; r = 0.45, P < 0.0001 for 755C>T). The average level of 755C>G was 1.66-fold higher than that of 755C>T (permutation test, P < 0.05). Levels of the 755C>G mutation in the sperm of fathers of Apert syndrome children all fell within the envelope of normal values (Fig. 3A), indicating that these men are sampled from the general population and have a very low risk of fathering another affected child.

Fig. 3.

Mutation levels at FGFR2 position 755 in blood (gray square) and sperm (♦) and comparison with observed paternal age effect for Apert syndrome. (A to C) Levels (posterior mean ±95% ETPI) of C>G (A), C>T (B), and C>A (C) plotted against donor age. Results for the sperm of six fathers of Apert children heterozygous for the 755C>G mutation are indicated (♢). (D) Relative levels of C>G in sperm (black), observed/expected (O/E) rates of Apert syndrome births (gray; numbers denote Apert children with fathers in specified age category), and rates predicted from a best fit model of observed birth data (white) in relation to the father's age.

One sperm sample from a normal 37-year-old donor showed an atypical pyrogram with different heights of the two peaks diagnostic for 755T (Fig. 1E); this would occur if an additional substitution at 756G had occurred (fig. S1A). We cloned and sequenced the MboI-digested PCR product from this individual and demonstrated that ∼20% of 755T sequences harbored the double mutation 755_756CG>TC (Fig. 1F), which causes Apert syndrome (14–16).

We compared our measurements of sperm mutation prevalence in the population with paternal age data for a cohort of Apert patients (18). The increase in levels of the 755C>G mutation in sperm with donor age closely mirrored both the raw estimates of the relative birth rate of Apert syndrome at different paternal ages and a best fit model (18) of these birth data (Fig. 3D). The predicted birth prevalence of the 755C>G mutation from analysis of sperm is 1.5 × 10⁻⁵, within 1.6-fold of observed values (10, 11). We conclude that there is excellent correspondence between our estimates of 755C>G mutation levels in sperm and epidemiological data on Apert syndrome births.

To explore the occurrence of the 755 C>G and C>T mutations on the two FGFR2 alleles, we determined their phase in relation to a G/A single nucleotide polymorphism (snp), located at position –112 in the upstream intron (8) (Fig. 1A and fig. S1B), by allele-specific PCR (18) in sperm samples heterozygous for the snp. The reproducibility of these measurements is shown in fig. S2. For the more prevalent C>G mutation, most individuals showed a strong predominance of mutations on one or other allele (Fig. 4A), whereas this skewing effect was significantly less marked for the C>T mutation (Fig. 4B) (permutation test, P < 0.0001). Crucially, the observed pattern of relative skewing and mutation prevalence is incompatible with all neutral models of mutation accumulation, in which the more prevalent (and, under neutral conditions, necessarily more frequent) mutation will always exhibit less skewing [Supporting Online Material (SOM) Text]. Instead, it shows that the two mutations are subject to differential selection, with the C>G mutation arising less frequently but conferring a stronger selective advantage than the C>T mutation (Fig. 4C).

Fig. 4.

The unequal distribution of mutations at 755C on paired FGFR2 alleles signals positive selection. Proportions (posterior mean ±95% ETPI) of C>G (A) and C>T (B) mutations on the –112G allele in sperm of –112GA heterozygotes (♦, n = 46) plotted against the estimated level of the corresponding mutation. For sperm samples (♢, n = 3) from fathers of Apert syndrome children, the allele on which the mutation occurred in the child (8) is indicated beside the data point in (A). (C) Model explaining the patterns of occurrence of C>G and C>T mutations. We estimate (18) that C>T arises 2.2 times more frequently than C>G, but the growth rate of C>T is only 0.26 that of C>G. The prevalence of C>G mutations tends to be greater, but the distribution on the two FGFR2 alleles (black and white segments of column on right) is more skewed.
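The qualitative logic of Fig. 4C can be illustrated with a toy simulation. This is a minimal sketch under our own assumptions and is not the statistical model of (18): mutations arise as a Poisson process on each allele, every mutant clone grows exponentially from its time of origin, and the baseline rate and growth values for C>G (`CG_RATE`, `CG_GROWTH`) are hypothetical. Only the ratios 2.2 and 0.26 are taken from the caption.

```python
import math
import random

def event_times(rate, horizon, rng):
    """Event times of a Poisson process with the given rate on [0, horizon)."""
    times, t = [], rng.expovariate(rate)
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times

def allele_load(rate, growth, horizon, rng):
    """Relative mutant load on one allele; each clone grows exponentially."""
    return sum(math.exp(growth * (horizon - t))
               for t in event_times(rate, horizon, rng))

def mean_skew_and_load(rate, growth, n_men=5000, horizon=1.0, seed=1):
    """Mean allelic skew (0.5 = balanced, 1.0 = one allele only) and mean load."""
    rng = random.Random(seed)
    skews, loads = [], []
    for _ in range(n_men):
        a = allele_load(rate, growth, horizon, rng)
        b = allele_load(rate, growth, horizon, rng)
        loads.append(a + b)
        if a + b > 0:  # skew is undefined when no mutation has occurred
            skews.append(max(a, b) / (a + b))
    return sum(skews) / len(skews), sum(loads) / len(loads)

# Hypothetical baseline for C>G; C>T uses the ratios quoted in the caption.
CG_RATE, CG_GROWTH = 1.0, 8.0
cg_skew, cg_load = mean_skew_and_load(CG_RATE, CG_GROWTH)
ct_skew, ct_load = mean_skew_and_load(2.2 * CG_RATE, 0.26 * CG_GROWTH)
print(f"C>G: mean skew {cg_skew:.2f}, mean relative load {cg_load:.1f}")
print(f"C>T: mean skew {ct_skew:.2f}, mean relative load {ct_load:.1f}")
```

Under these assumed parameters the rarer but faster-growing mutation ends up both more prevalent and more skewed toward one allele, which is the pattern a neutral model cannot produce.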

In fact, positive selection of infrequent mutations would readily explain several otherwise puzzling features of our data. These include the very high absolute levels of particular mutations (C>G and C>T but not C>A) in specific tissues (sperm but not blood) (Fig. 3, A to C), the predominance of C>G over C>T despite the higher expected C>T mutation frequency at methylated cytosines (12, 13), the greater individual scatter in levels of the C>G mutation (compare Fig. 3, A and B: the initial occurrence of C>G is more dependent on chance), and the observation of multiple mutations in cis to C>T (14–17), also identified in one sperm sample (Fig. 1F) and attributable to sequential selection events (13). We conclude that the major factor underlying the paternal age effect is not the accumulation of replication errors or inefficient repair processes but positive selection of infrequent mutations acting over the course of time.

How can this selective process be explained? The 755 C>G and C>T mutations, respectively, encode Ser252-to-Trp252 and Ser252-to-Leu252 substitutions in FGFR2 (Fig. 1A). The Ser252-to-Trp252 substitution reduces the dissociation rate of specific ligands bound to the mutant receptor (20) and increases the repertoire of ligand binding (21); in the mutant crystal structure, the Trp side chain makes an additional contact with bound FGF2 (22). By contrast, the effect of the Ser252-to-Leu252 substitution on ligand binding is weaker (20), correlating with a milder clinical phenotype (14, 15). Hence, we envisage that selection acts because of a dominant gain of function in the encoded protein. Positive selection is not anticipated for 755C>A (resulting in a truncated protein), whereas multiple nucleotide changes featuring 755C>T may be enriched by sequential selection events that generate mutant FGFR2 proteins with a stronger gain of function than the Ser252-to-Leu252 substitution (reflected in their more severe clinical phenotypes).

The constancy of FGFR2 mutation levels over many months (Fig. 2B and fig. S2) indicates that the mutations are present in spermatogonia with stem cell–like properties. Assuming a stem cell population of 10⁷ to 10⁸ (SOM Text), clonal expansions to ∼10³ cells (in essence, generating small neoplastic lesions) would account for the elevated mutation prevalence. FGFR2 may play a role analogous to RET, a regulator of spermatogonial stem cell fate (23) that shows a similar predilection for mutations in older males (1, 2). The concept of selective advantage leading to clonal expansion, unfamiliar in the context of spermatogenesis, is commonplace in tumor biology (24); somatic FGFR2 mutations detected in gastric cancers were previously described as germline mutations in Crouzon and Pfeiffer syndromes (6, 25).
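The link between clone size and mutation prevalence can be checked with a short calculation. The sketch below is illustrative: the function name is ours, and it assumes a single mutant clone and equal sperm output from every stem cell, so that the mutant fraction in sperm equals clone size divided by the stem cell population.

```python
def mutant_fraction(clone_size, stem_cell_population):
    """Fraction of sperm carrying the mutation, assuming equal output per stem cell."""
    return clone_size / stem_cell_population

CLONE_SIZE = 1e3                  # ~10^3 cells per expanded clone
STEM_CELL_POOLS = (1e7, 1e8)      # assumed range of stem cell population sizes

for pool in STEM_CELL_POOLS:
    fraction = mutant_fraction(CLONE_SIZE, pool)
    print(f"stem cell population {pool:.0e}: mutant fraction {fraction:.0e}")
```

The resulting fractions, between 10⁻⁵ and 10⁻⁴, span the same order of magnitude as the higher mutation levels measured in sperm.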

These data link the paternal age effect in a genetic disorder, the prevalence of mutations in sperm, and positive selection in the male germ line. In addition to their utility for genetic counseling, our findings have wider implications. We have demonstrated the power of Pyrosequencing technology to quantify complex mixtures of DNA species. We propose that signal transduction by FGFR2 and probably FGFR3 (26) influences the fate of spermatogonial stem cells. Finally, we present a concrete example of a type of evolutionary conflict, distinct from meiotic drive, whereby a mutation that is harmful to the organism may be advantageous in the cellular context of the testis, leading to a sex bias in the observed mutation rate (1, 2, 27).

Supporting Online Material

Materials and Methods

SOM Text

Figs. S1 and S2

Tables S1 and S2

References and Notes
