Perspective: Human Genetics

GTEx detects genetic effects


Science  08 May 2015:
Vol. 348, Issue 6235, pp. 640-641
DOI: 10.1126/science.aab3002

One of the lessons from the past several years of genomic analysis is that well-conceived, ambitious, and thoughtfully analyzed genetic studies carried out by large consortia can advance the field in giant leaps. They do so both by providing new insight and by generating data sets that are widely accessible to all investigators. It is thus remarkable that, even though we now know that the vast majority of common polymorphisms (variants of a particular DNA sequence) that are associated with disease risk act by modulating gene expression, “big science” transcription analyses have been lacking. This deficit is now addressed with the publication of the first results from the Genotype-Tissue Expression (GTEx) Consortium (1), which also includes the findings of Melé et al. (2) and Rivas et al. (3), on pages 648, 660, and 666, respectively, in this issue.

GTEx is an effort coordinated by the U.S. National Human Genome Research Institute to understand the genetic basis for variation among individuals in transcript abundance across many tissues (4). Hitherto, our knowledge of the genetics of gene expression in humans has derived mostly from studies of blood (5), lymphoblast cell lines (6), and isolated studies of accessible tissues such as fat or skin (7). The plan for GTEx is to associate whole-genome sequence variation with RNA sequencing data for more than 50 tissue types from almost 1000 postmortem donors whose next of kin gave consent. This knowledge will provide direct evidence addressing the function of the many thousands of disease-associated variants supplied by genome-wide association studies (GWAS) and will illuminate mechanisms of variation for disease risk among healthy people. The pilot phase results (1-3) are based on data from the first 237 donors, of whom around 100 have RNA samples analyzed in 9 tissues, with data from smaller subsets of donors available for 33 other tissues. The main GTEx Consortium article reports on the genetic regulation of gene expression, whereas Melé et al. provide an overview of differences in the “transcriptome” (all RNA molecules, including messenger RNA, ribosomal RNA, transfer RNA, and other long noncoding RNA transcripts) across tissues and individuals. Rivas et al. report on the effect that protein-truncating variants have on human transcription, generating a quantitative model of how nonsense-mediated decay (the elimination of transcripts that contain a premature stop codon) varies across tissues and may be genetically regulated.

Previous studies in many organisms have established that common regulatory polymorphisms (expression quantitative trait loci, or cis-eQTLs) located within a few hundred kilobases of a gene significantly influence the expression of at least half of all genes in one tissue or another (8). They act locally to influence expression of a nearby gene, and may explain anywhere from a few percent to more than half the variance in abundance of the specific transcript among individuals. These effects are much larger than those typically associated with disease, so the largest eQTL effects can be detected with sample sizes of as few as 100 individuals (9). It is to be expected that rare variants also contribute to disease, although their discovery is in its infancy. Epigenetic influences such as chromatin modification and microRNA regulation certainly also explain substantial amounts of the variance. A critical feature of transcriptional variation is the very high degree of co-regulation, sometimes of thousands of genes. This can be attributed to the collective effects of trans-acting regulatory factors (transcription factors, hormones, environmental agents) as well as variation in the abundance of cell types within tissues.
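To make the scale of these effects concrete, the following is a minimal sketch of a single cis-eQTL test on simulated data. It is an illustration only and not the analysis pipeline used by the GTEx Consortium: the function names (`simulate_eqtl`, `eqtl_test`), the allele frequency, and the assumption of an additive effect under Hardy-Weinberg proportions are all hypothetical choices made here. It regresses expression on genotype dosage (0, 1, or 2 copies of an allele) and reports the slope, the fraction of variance explained, and the t-statistic for 100 individuals.

```python
import numpy as np

def simulate_eqtl(n=100, maf=0.3, var_explained=0.2, seed=0):
    """Simulate genotype dosages and expression for one gene (hypothetical setup).

    Expression is scaled to unit variance, with `var_explained` of it
    attributable to an additive effect of the variant.
    """
    rng = np.random.default_rng(seed)
    genotype = rng.binomial(2, maf, size=n).astype(float)  # dosage 0/1/2
    genotype_var = 2 * maf * (1 - maf)                     # Hardy-Weinberg assumption
    beta = np.sqrt(var_explained / genotype_var)           # additive effect size
    noise = rng.normal(0.0, np.sqrt(1 - var_explained), size=n)
    return genotype, beta * genotype + noise

def eqtl_test(genotype, expression):
    """Simple linear regression of expression on genotype dosage.

    Returns (slope, r_squared, t_statistic), where the t-statistic has
    n - 2 degrees of freedom.
    """
    n = len(genotype)
    r = np.corrcoef(genotype, expression)[0, 1]
    r_squared = r ** 2
    slope = r * np.std(expression, ddof=1) / np.std(genotype, ddof=1)
    t_stat = r * np.sqrt((n - 2) / (1 - r_squared))
    return slope, r_squared, t_stat

# Usage: a variant explaining 20% of expression variance, 100 individuals.
genotype, expression = simulate_eqtl(n=100, var_explained=0.2)
slope, r_squared, t_stat = eqtl_test(genotype, expression)
print(f"slope={slope:.2f}  variance explained={r_squared:.2f}  t={t_stat:.2f}")
```

Under these assumptions, the expected t-statistic grows roughly with the square root of the sample size times the correlation, which is why effects explaining a large share of variance are detectable in about 100 individuals whereas the much smaller effects typical of disease associations are not.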

One of the major contributions of these first GTEx papers is quantification of the relative contributions of cis-eQTLs in different tissues, suggesting (for example) that thyroid and tibial nerve have twice as many genes regulated by local polymorphisms as blood or heart (1). However, blood seems to have a relatively high level of allele-specific expression (transcription predominantly from one of the two chromosomes), whereas brain samples are depleted for this phenomenon. Interestingly, it appears that allele-specific expression is more strongly conserved among tissues within individuals than is overall transcript abundance (1).

There has been much debate on how much genetic variation for gene expression is shared across tissues. Different analytical methods applied to comparisons of different data sets have led to diverse conclusions (7, 10). The GTEx Consortium tackles the issue by studying a common set of individuals with two different analytical approaches: pairwise linear models and joint Bayesian analysis. The conclusion is striking, namely that around half of all cis-eQTLs, particularly those proximal to a promoter, are active in the majority of tissues, whereas the other half tend to be specific to one or two tissues (1). Variants that affect splicing, namely the generation of alternative transcript isoforms, also tend to be conserved across tissues, with more than 80% detected in multiple tissues, but with a wide range of similarity among pairwise comparisons: Whole blood shares fewer than 10% of its splicing QTLs with sun-exposed skin, but almost 50% with the heart's left ventricle.

Switches in modularity.

Many genes tend to be coexpressed in modules. Modules may differ in their presence (gray) or size (yellow) among tissues and between individuals (blue). Some genes switch between modules [from the blue to yellow module in the brain (right); from the blue to gray module in the liver (left)].


One of the big surprises reported (1) is the discovery of module QTLs (modQTLs), which are regulatory variants that influence the co-regulation of gene expression. The idea is that most genes are organized into expression modules. Even though the genes in a module may be located on different chromosomes, they tend to have similar expression levels (11). Often, many genes in a module share a function such as controlling the cell division cycle. The reported analysis finds 117 modules of between 25 and 414 transcripts each. Some of these are observed in multiple tissues, but not necessarily with the same expression profiles across tissues. It turns out that quite a few genes switch module “membership” between individuals (see the figure). These switches can be associated with local regulatory variants (modQTLs), only about half of which were detected as cis-eQTLs in the relevant tissues (1). Because genes in modules often share regulatory motifs for a common set of transcription factors, and these motifs tend to harbor regulatory polymorphisms, the implication is that the full GTEx project will uncover new principles accounting for variation in the co-regulation of genes within and between tissues.
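The notion of an expression module can be illustrated with a toy example. The sketch below is not the method used in the GTEx analysis; it is a deliberately simple stand-in that groups genes whose expression is strongly correlated across individuals, by thresholding the gene-gene correlation matrix and taking connected components. The function name `coexpression_modules`, the threshold of 0.5, and the simulated data (two hidden regulatory factors, each driving five genes) are all assumptions made for illustration.

```python
import numpy as np

def coexpression_modules(expr, threshold=0.5):
    """Group genes into modules by correlation across individuals.

    expr: array of shape (individuals, genes).
    Two genes are linked when |Pearson r| > threshold; a module is a
    connected component of the resulting graph. Returns a list of sorted
    gene-index lists, ordered by each module's smallest index.
    """
    corr = np.corrcoef(expr, rowvar=False)   # gene-by-gene correlation
    linked = np.abs(corr) > threshold
    unvisited = set(range(expr.shape[1]))
    modules = []
    while unvisited:
        start = min(unvisited)
        unvisited.remove(start)
        stack, module = [start], []
        while stack:                          # depth-first traversal
            gene = stack.pop()
            module.append(gene)
            neighbours = [h for h in unvisited if linked[gene, h]]
            for h in neighbours:
                unvisited.remove(h)
                stack.append(h)
        modules.append(sorted(module))
    return modules

# Simulated data (hypothetical): 200 individuals, 10 genes, two hidden
# factors; genes 0-4 follow the first factor and genes 5-9 the second.
rng = np.random.default_rng(1)
n_individuals = 200
factors = rng.normal(size=(n_individuals, 2))
membership = np.repeat([0, 1], 5)
expr = factors[:, membership] + rng.normal(scale=0.5, size=(n_individuals, 10))

modules = coexpression_modules(expr)
print(modules)
```

In this framing, a gene that "switches" module membership is one whose strongest correlations move from one group of genes to another when the same grouping is computed in a different tissue or in a different subset of individuals.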

What are the implications for personalized medicine? These are barely hinted at in the studies (1-3), but the most obvious is validation of inferences from bioinformatics processing of GWAS data and information from the Encyclopedia of DNA Elements (ENCODE) project (which identified functional elements in the human genome sequence) (12). The measurement of chromatin features designed to annotate enhancers and other regulatory elements has led to the realization that disease-associated variants tend to be enriched in the vicinity of genes that are more likely to be active in disease-relevant tissues, such as lymphoid cells in autoimmune disease or neurons in psychiatric disorders (13, 14). GTEx provides direct evidence that this is the case, and the project's accompanying portal allows anyone to look up the tissues in which a disease variant influences the expression of a nearby gene, and in which direction. Notably, very often, “nearby” does not mean the most adjacent. Additionally, the important point is made (1, 3) that profiling gene expression across 50 tissues demonstrates that genes harboring protein-truncating variants are often not even expressed in the most relevant tissue. This implies that variants predicted to be deleterious on the basis of DNA sequence may actually be highly unlikely to contribute to a disease.

Within the next 2 years, the full GTEx data should be available. There is no question that with an order of magnitude more data, the analyses will greatly exceed verification of the findings reported in the pilot studies. Vastly more cis-eQTLs will be found, intricacies of allele-specific expression and splicing will be worked out, and mechanisms responsible for switches in modularity inferred. We can also expect more complete integration of the GTEx data with the ENCODE analyses (13), using statistical approaches yet to be conceived, and a pilot ENCODE Tissue Expression (EnTEx) project will report chromatin profiling of some GTEx samples. Such analyses will allow us to sift through the suggestive GWAS peaks and explain more of the variance for disease and attribute it to appropriate cell types. Yet more ambitious GTEx projects might be conceived, evaluating how genetic regulatory effects vary in the context of disease (“GTEx-D”) and across environments (“GTEx-E”), if investments are made in the research of genotype-tissue expression from patients who have chronic diseases or have lived with different lifestyles or environmental exposures such as toxins or severe socioeconomic stress (15).

Perhaps most important, we can begin to outline an enhanced program of genome-enabled precision medicine. Although there is justifiable excitement about the ability of DNA sequencing to identify the causes of congenital abnormalities, to predict the progression of tumors, and to personalize the prescription of drugs, the static genome has its limitations. If, 20 years from now, gene expression profiling is incorporated side-by-side with genotype analysis as a standard component of medical diagnostics, the GTEx project will be seen to have brought us closer to realization of this vision.

