Report

Adaptation to Climate Across the Arabidopsis thaliana Genome

Science  07 Oct 2011:
Vol. 334, Issue 6052, pp. 83-86
DOI: 10.1126/science.1209244

Abstract

Understanding the genetic bases and modes of adaptation to current climatic conditions is essential to accurately predict responses to future environmental change. We conducted a genome-wide scan to identify climate-adaptive genetic loci and pathways in the plant Arabidopsis thaliana. Amino acid–changing variants were significantly enriched among the loci strongly correlated with climate, suggesting that our scan effectively detects adaptive alleles. Moreover, from our results, we successfully predicted relative fitness among a set of geographically diverse A. thaliana accessions when grown together in a common environment. Our results provide a set of candidates for dissecting the molecular bases of climate adaptations, as well as insights about the prevalence of selective sweeps, which has implications for predicting the rate of adaptation.

Climate change has already led to altered distributions of species, phenotypic variation, and allele frequencies (1–5), and the impact of changing climates is expected to intensify. The capacity to respond to changing climate is likely to vary widely as a consequence of variation among species in their degree of phenotypic plasticity and their potential for genetic adaptation (6), which in turn depends on the amount of standing genetic variation and the rate at which new genetic variation arises. Arabidopsis thaliana is an excellent model for investigating the genetic basis and mode of adaptation to climate owing to the extensive climatic variation across its native range, as well as the availability of genome-wide single-nucleotide polymorphism (SNP) data among a geographically diverse collection. We examined the correlations between 107 ecologically important phenotypes in A. thaliana (7) and 13 climate variables that represent extremes and seasonality of temperature and precipitation, photosynthetically active radiation (PAR), relative humidity, season lengths, and aridity (figs. S1 to S4). We observed strong correlations between day length and phenotypes related to development time and flowering, supporting the observation that flowering time in the field is modulated by complex environmental cues that are difficult to simulate under controlled growth conditions (8–10). Correlations were also found between leaf yellowing (chlorosis) and temperature (11), as well as between dormancy-related traits and those related to temperature and moisture (12), consistent with a role for both vernalization and moisture in breaking dormancy. These results provide evidence for a genetic basis for climate adaptations in A. thaliana.

We conducted genome-wide scans to detect climate associations for ~215,000 SNPs genotyped across 948 A. thaliana accessions distributed throughout the native range of the species (fig. S5) (13). Because we cannot be certain that our model completely accounts for the effects of population history (13), we tested whether our results detect true signals of adaptations by assessing enrichment of likely functional [i.e., nonsynonymous (NS)] variants relative to putative neutrally evolving [i.e., synonymous (S) and intergenic] variants in the 1% tail of the combined climate correlation distributions (14). We found that intergenic SNPs show deficits and NS SNPs show strong and significant enrichments in the tail relative to their proportions in the genome as a whole for climate overall (Fig. 1A), as well as for all but 1 of the 13 individual climate variables (precipitation seasonality; Fig. 1B). The pattern was similar when we controlled for allele frequency but contrasts sharply to climate correlations that did not control for population history (13) (fig. S6).
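The enrichment calculation described above can be illustrated with a minimal sketch. It assumes each SNP has one climate-correlation score (higher meaning a stronger correlation) and one class label; the function names, the label strings, and the simple label-shuffling null are illustrative assumptions, not the authors' pipeline, whose permutation scheme is described in the supporting material (13). In particular, shuffling labels freely ignores linkage disequilibrium between neighboring SNPs.

```python
import random
from collections import Counter


def tail_enrichment(scores, classes, tail_fraction=0.01):
    """Enrichment of each SNP class in the top `tail_fraction` of scores.

    Enrichment = (share of the class in the tail) / (share among all SNPs).
    A value above 1 means the class is overrepresented in the tail.
    """
    n = len(scores)
    n_tail = max(1, round(n * tail_fraction))
    # Indices of the SNPs with the strongest climate correlations.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    tail = order[:n_tail]
    genome_counts = Counter(classes)
    tail_counts = Counter(classes[i] for i in tail)
    return {
        c: (tail_counts[c] / n_tail) / (genome_counts[c] / n)
        for c in genome_counts
    }


def permutation_p_value(scores, classes, target, n_perm=1000,
                        tail_fraction=0.01, seed=0):
    """One-sided permutation P value for enrichment of class `target`.

    Assumption: class labels are shuffled freely across SNPs, which
    ignores linkage disequilibrium; this is a simplification.
    """
    rng = random.Random(seed)
    observed = tail_enrichment(scores, classes, tail_fraction)[target]
    shuffled = list(classes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if tail_enrichment(scores, shuffled, tail_fraction)[target] >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Calling `tail_enrichment(scores, classes)` with labels such as `"NS"`, `"S"`, and `"intergenic"` returns one ratio per class, which corresponds to the quantity plotted in Fig. 1 under the assumptions above.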

Fig. 1

Enrichment of amino acid–changing SNPs (red), synonymous SNPs (green), and intergenic SNPs (yellow) in the 1% tails of the distributions for (A) climate overall (using a rank statistic based on the minimum rank across climate variables) and (B) for each individual climate variable. Enrichments shown are relative to the proportion of each class of SNPs in the genome overall. Gray dots show the distribution of results of 1000 permutations. The gray line shows the expected enrichment under the null hypothesis of no enrichment. Enrichments that are significant relative to permutations are denoted by asterisks.

Notably, for climate overall and for many individual variables, S variants are also significantly enriched in the tail (Fig. 1). The tendency toward enrichment of S variants is expected because of linkage disequilibrium under neutral processes but may be intensified by background selection (15) and/or hitchhiking (16). NS SNPs were slightly enriched relative to S SNPs (ratio of NS to S = 1.146, P = 0.036) for climate overall. In addition, precipitation in the wettest and driest months, relative humidity, length of the growing season, and PAR were enriched for NS relative to S variants (ratios ranging from 1.137 to 1.361 and P values ranging from 0.025 to 1 × 10⁻⁴). Given that we do not have data for all individual SNPs, but rather use SNPs to represent variation in the genome, these results are surprisingly strong.
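As a rough illustration of the "climate overall" statistic and the NS-to-S ratio, the sketch below combines per-variable scores by taking each SNP's minimum rank across climate variables (as in the Fig. 1 caption) and then compares tail enrichment of NS and S SNPs. The function names, the input layout, and the definition of the ratio as a quotient of two enrichments are assumptions made for this example; the exact procedure is given in the supporting material.

```python
def min_rank_statistic(scores_by_variable):
    """Best (smallest) rank of each SNP across climate variables.

    scores_by_variable: dict mapping a variable name to a list of
    per-SNP scores, where a higher score means a stronger correlation.
    Rank 1 is the strongest SNP for a given variable.
    """
    n = len(next(iter(scores_by_variable.values())))
    best = [n] * n
    for scores in scores_by_variable.values():
        order = sorted(range(n), key=lambda i: scores[i], reverse=True)
        for rank, i in enumerate(order, start=1):
            best[i] = min(best[i], rank)
    return best


def ns_to_s_ratio(min_ranks, classes, tail_fraction=0.01):
    """Ratio of NS enrichment to S enrichment in the tail.

    Assumes labels "NS" and "S" appear in `classes` and that at least
    one S SNP falls in the tail (otherwise the ratio is undefined).
    """
    n = len(min_ranks)
    n_tail = max(1, round(n * tail_fraction))
    # Smallest minimum ranks form the tail; ties are broken by SNP index.
    tail = sorted(range(n), key=lambda i: min_ranks[i])[:n_tail]

    def enrichment(label):
        in_tail = sum(1 for i in tail if classes[i] == label) / n_tail
        overall = classes.count(label) / n
        return in_tail / overall

    return enrichment("NS") / enrichment("S")
```

A ratio above 1 indicates that amino acid–changing SNPs are overrepresented in the tail relative to synonymous SNPs, which is the comparison reported in the text.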

We examined which biological processes are overrepresented among strong climate correlations, focusing in particular on climate variables for which we observed a significant NS-to-S enrichment (Table 1 and table S2). PAR shows the largest number of enriched categories, including photosynthesis, auxin biosynthesis, and gravitropism. In addition, we found enrichment of processes related to energy metabolism (i.e., starch metabolism and mitochondrial electron transport) with both precipitation extremes. These links between energy metabolism and water availability likely result from variation in photosynthetic capacity across precipitation gradients due to differences in the proportion of time when stomata are open (17).

Table 1

Enrichment of biological processes (BPs) in the 1% tail (P < 1 × 10⁻³) for climate variables with significant NS relative to S enrichments.

Although pleiotropic gene functions may influence the rate of adaptation (18, 19), we have an incomplete understanding of the extent and magnitude of their effects on adaptation (20). We find substantial overlap in the 1% tails of climate variables, with pairwise combinations sharing 0 to 70% of top SNPs (fig. S7), suggesting that pleiotropy may be common among adaptive alleles. However, some of these results may be due to correlations among the climate variables themselves, rather than pleiotropy per se. Indeed, a significant positive correlation was observed between the matrix of pairwise correlation coefficients among climate variables and the matrix of their proportional overlap of SNPs (Mantel r² = 0.59, P = 2 × 10⁻⁴). Hence, pairs of variables whose SNP overlap is an outlier relative to the variable correlation matrix are particularly interesting (e.g., fig. S8).
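A Mantel test of the kind reported above compares two square matrices over the same set of items by correlating their off-diagonal entries and building a null distribution from joint row and column permutations of one matrix. The sketch below is a generic, one-sided implementation using Pearson's r; the function names and the default number of permutations are assumptions, and it is not claimed to match the authors' settings.

```python
import math
import random


def _upper(matrix, order):
    """Upper-triangle entries of `matrix` after reordering rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(a, b, n_perm=9999, seed=0):
    """Return (r, p) for a one-sided Mantel test of positive association.

    a, b: symmetric square matrices (lists of lists) over the same items,
    e.g. climate-variable correlations and proportional SNP overlap.
    """
    n = len(a)
    identity = list(range(n))
    x = _upper(a, identity)
    r_obs = _pearson(x, _upper(b, identity))
    rng = random.Random(seed)
    perm = identity[:]
    hits = 0
    for _ in range(n_perm):
        # Permute rows and columns of b together, keeping a fixed.
        rng.shuffle(perm)
        if _pearson(x, _upper(b, perm)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)
```

Squaring the returned `r` gives the r² form in which the statistic is quoted in the text.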

It would be difficult to confirm the candidates from climate-related genome scans, even if it were possible to predict climate with absolute certainty, because of the scale of such tests. We thus validated our model by reasoning that if we are observing true signals, then they should be able to predict the relative fitness of genotypes grown in a particular climate. We tested our ability to predict the relative fitness of 147 A. thaliana accessions planted in the fall in a common garden in Lille, France (Fig. 2A). In particular, we selected all SNPs in the 0.01% tail of correlations with any climate variable and pruned this set of SNPs to include only one per chromosomal region on the basis of patterns of linkage disequilibrium. We identified alleles that are more common within a window of climate similar to Lille’s. Then, we asked whether the count of these alleles could predict relative fitness, as measured by total silique length (21) among the accessions. We created a null distribution by conducting the same analysis on resampled sets of SNPs. We found a strong and significant correlation between the number of favored alleles and fitness (Spearman’s rho = 0.48, P = 0.003; Fig. 2, B and C), indicating that our climate scan is picking up a true signal. As no accessions from within 100 km of Lille were included in the analysis, the correlation between relative fitness and the number of favorable alleles is robust to home versus away effects. Further, additional analyses support this conclusion (13).
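The prediction test can be sketched as follows: count, for each accession, how many candidate SNPs carry the allele favored in the target climate, correlate that count with fitness using Spearman's rho, and compare the result with rho values obtained from randomly resampled SNP sets of the same size. All names and the data layout (a 0/1 genotype matrix and a list of favored alleles) are hypothetical, and drawing null SNP sets uniformly at random is an assumption; the authors' resampling scheme and linkage-disequilibrium pruning are described in the supporting material.

```python
import random


def _ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks (0.0 if constant)."""
    rx, ry = _ranks(x), _ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den if den else 0.0


def favored_allele_counts(genotypes, favored, snps):
    """Per-accession count of SNPs in `snps` carrying the favored allele."""
    return [sum(1 for s in snps if row[s] == favored[s]) for row in genotypes]


def prediction_test(genotypes, favored, fitness, candidate_snps,
                    n_resample=1000, seed=0):
    """Observed rho and a one-sided P value from resampled SNP sets."""
    rng = random.Random(seed)
    rho_obs = spearman(
        favored_allele_counts(genotypes, favored, candidate_snps), fitness)
    n_snps = len(favored)
    hits = 0
    for _ in range(n_resample):
        # Assumption: null SNP sets are drawn uniformly at random.
        sample = rng.sample(range(n_snps), len(candidate_snps))
        rho = spearman(favored_allele_counts(genotypes, favored, sample), fitness)
        if rho >= rho_obs:
            hits += 1
    return rho_obs, (hits + 1) / (n_resample + 1)
```

Here `fitness` would hold a measure such as total silique length per accession, in the same order as the rows of `genotypes`.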

Fig. 2

The SNPs with strongest climate correlations predict ranks in reproductive success (fitness) in Lille, France. (A) Red dots show the locations where accessions included in the experiment were collected, and the green cross shows the location where plants were grown. (B) The relation between total silique length (a measure of reproductive success) and the number of alleles expected to be favorable based on the climate analysis. (C) Observed Spearman correlation coefficient between total silique length and number of favorable alleles (red line) compared to the distribution of correlation coefficients from permutations.

The geographic extent of climate-correlated SNPs provides at least an initial picture of how climate shapes patterns of genetic variation in A. thaliana. Geographic extents varied widely across climate variables, with day length and relative humidity representing the extremes (Fig. 3A); SNPs correlated with day length tended to be localized, whereas SNPs correlated with relative humidity tended to be widespread (Fig. 3B and fig. S9). These results, at least in part, can be understood in relation to the geographic distribution of the climate variables themselves.

Fig. 3

Geographic distributions of SNPs with the strongest climate correlations. (A) Distributions of SNP extents for all SNPs and for SNPs in the 1% tail for climate overall, day length, relative humidity, and PHS. SNPs represented in the plot were filtered to remove redundant information resulting from linkage disequilibrium between SNPs. (B) Distributions of the top five regions for day length overlaid on a map of the distribution of this variable (with values ranging from 12.3 to 16.6 hours). The central panel contains polygons showing the geographic extents of all five SNPs, and other panels show the central feature and extent of each individual SNP.

Narrow SNP distributions may correspond to “hard selective sweeps,” or situations in which a new variant was driven quickly to high frequency in the population. We conducted a scan for hard selective sweeps based on extended pairwise haplotype homozygosity (PHS) (22), which identified partial selective sweeps throughout the genome, and examined the geographic extents of these genomic regions (Fig. 3A). SNPs identified as candidates for selective sweeps were, indeed, shifted toward narrow geographic distributions, consistent with the idea that hard sweeps result in narrow geographic distributions. To quantify the generality of these results, we examined overlap between the 1% tails of the overall climate correlation distributions and the PHS results and found overlap threefold greater than expected by chance if the two variables were independent. This increased to nearly 10-fold enrichment when we examined overlap among the 10% of climate-related SNPs with the smallest geographic extents; enrichments were strongest for aridity, maximum temperature, precipitation in the driest month, and length of the growing season. Although selection on standing variation also plays a role, these results suggest that selective sweeps are an important mode of adaptation in A. thaliana. The importance of selective sweeps suggests that species like A. thaliana may reach adaptive limits under rapid climate change, owing to the constraints imposed by waiting for new mutations.
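The fold enrichment of overlap between two tails can be computed by comparing the observed number of shared SNPs with the number expected if membership in the two tails were independent. The sketch below assumes each tail is given as a set of SNP identifiers drawn from the same pool of `n_total` SNPs; the function name is hypothetical.

```python
def overlap_fold_enrichment(tail_a, tail_b, n_total):
    """Observed overlap divided by the overlap expected under independence.

    tail_a, tail_b: sets of SNP identifiers (e.g. the climate tail and
    the PHS tail). n_total: number of SNPs from which both were drawn.
    """
    observed = len(tail_a & tail_b)
    # Under independence, each SNP is in both tails with probability
    # (|A| / n) * (|B| / n), so the expected overlap is |A| * |B| / n.
    expected = len(tail_a) * len(tail_b) / n_total
    return observed / expected
```

With two 1% tails from 10,000 SNPs, one shared SNP is expected by chance, so three shared SNPs would correspond to a threefold enrichment.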

Supporting Online Material

www.sciencemag.org/cgi/content/full/334/6052/83/DC1

Materials and Methods

Figs. S1 to S10

Table S1

References (23–35)

References and Notes

  1. See supporting material in Science Online.
  2. Acknowledgments: Funded by NIH GM083068 and NSF DEB0519961 to J.B. A.M.H. was supported by a V. Dropkin Postdoctoral Fellowship, and M.W.H. was supported by an NSF Predoctoral Fellowship and a Graduate Assistance in Areas of National Need (GAANN) training grant. F.R. was supported by a Bonus Qualité Recherche (BQR) grant from the University of Lille, and B.B. received funding from a Ph.D. fellowship from the French Research Ministry and a mobility grant from the Collège Doctoral Européen. This is contribution 11-389-J from the Kansas Agricultural Experiment Station. We thank J. Borevitz, A. Fournier-Level, M. Nordborg, A. Platt, J. Schmitt, members of the Bergelson laboratory, and two anonymous reviewers for helpful input. Climate data for the 948 accessions used in these analyses, result files for the correlation analyses, and a browser that allows for viewing the results in their genomic context are available at http://bergelson.uchicago.edu/regmap-data/climate-genome-scan/. The genotype data used for these analyses resulted from the RegMap project (http://regmap.uchicago.edu).