Research Article

Systematic analysis of complex genetic interactions

Science  20 Apr 2018:
Vol. 360, Issue 6386, eaao1729
DOI: 10.1126/science.aao1729

Trigenic interactions in yeast link bioprocesses

To dissect the genotype-phenotype landscape of a cell, it is necessary to understand interactions between genes. Building on the digenic protein-protein interaction network, Kuzmin et al. created a trigenic landscape of yeast by using a synthetic genetic array (see the Perspective by Walhout). Triple-mutant analyses indicated that the majority of genes with trigenic associations functioned within the same biological processes. These converged on networks identified in the digenic interaction landscape. Although the overall effects were weaker for trigenic than for digenic interactions, trigenic interactions were more likely to bridge biological processes in the cell.

Science, this issue p. eaao1729; see also p. 269

Structured Abstract

INTRODUCTION

Genetic interactions occur when mutations in different genes combine to result in a phenotype that is different from expectation based on those of the individual mutations. Negative genetic interactions occur when a combination of mutations leads to a fitness defect that is more exacerbated than expected. For example, synthetic lethality occurs when two mutations, neither of which is lethal on its own, generate an inviable double mutant. Alternatively, positive genetic interactions occur when genetic perturbations combine to generate a double mutant with a greater fitness than expected. Global digenic interaction studies have been useful for understanding the functional wiring diagram of the cell and may also provide insight into the genotype-to-phenotype relationship, which is important for tracking the missing heritability of human health and disease. Here we describe a network of higher-order trigenic interactions and explore its implications.

RATIONALE

Variation in phenotypic outcomes in different individuals is caused by genetic determinants that act as modifiers. Modifier loci are prevalent in human populations, but knowledge regarding how variants interact to modulate phenotype in different individuals is lacking. Similarly, in yeast, traits including conditional essentiality—in which certain genes are essential in one genetic background but nonessential in another—often result from an interplay of multiple modifier loci. Because complex modifiers may underlie the genetic basis of physiological states found in natural populations, it is critical to understand the landscape of higher-order genetic interactions.

RESULTS

To survey trigenic interactions, we designed query strains that sampled key features of the global digenic interaction network: (i) digenic interaction strength, (ii) average number of digenic interactions, and (iii) digenic interaction profile similarity. In total, we tested ~400,000 double and ~200,000 triple mutants for fitness defects and identified ~9500 digenic and ~3200 trigenic negative interactions. Although trigenic interactions tend to be weaker than digenic interactions, they were both enriched for functional relationships. About one-third of trigenic interactions identified “novel” connections that were not observed in our digenic control network, whereas the remaining approximately two-thirds of trigenic interactions “modified” a digenic interaction, suggesting that the global digenic interaction network is important for understanding the trigenic interaction network. Despite their functional enrichment, trigenic interactions also bridged distant bioprocesses. We estimate that the global trigenic interaction network is ~100 times as large as the global digenic network, highlighting the potential for complex genetic interactions to affect the biology of inheritance.

CONCLUSION

The extensive network of trigenic interactions and their ability to generate functionally diverse phenotypes suggest that higher-order genetic interactions may play a key role in the genotype-to-phenotype relationship, genome size, and speciation.

Systematic analysis of trigenic interactions.

We surveyed for trigenic interactions and found that they are ~100 times as prevalent as digenic interactions, often modify a digenic interaction, and connect functionally related genes as well as genes in more diverse bioprocesses (multicolored nodes). PPI, protein-protein interaction.

Abstract

To systematically explore complex genetic interactions, we constructed ~200,000 yeast triple mutants and scored negative trigenic interactions. We selected double-mutant query genes across a broad spectrum of biological processes, spanning a range of quantitative features of the global digenic interaction network and tested for a genetic interaction with a third mutation. Trigenic interactions often occurred among functionally related genes, and essential genes were hubs on the trigenic network. Despite their functional enrichment, trigenic interactions tended to link genes in distant bioprocesses and displayed a weaker magnitude than digenic interactions. We estimate that the global trigenic interaction network is ~100 times as large as the global digenic network, highlighting the potential for complex genetic interactions to affect the biology of inheritance, including the genotype-to-phenotype relationship.

Genetic interactions occur when a combination of mutations in different genes leads to an unexpected phenotype that deviates from a model incorporating the combined effects of the corresponding single-mutant phenotypes. In humans, each individual carries thousands of different variants that may modulate gene function, which means that there is incredible potential for combinatorial genetic interactions to determine our personal phenotype (1, 2). Indeed, genetic interactions are thought to represent a substantial component of the missing heritability associated with current genome-wide association studies (GWAS) (3). However, the statistical limitations associated with GWAS data sets preclude the detection of specific genetic interactions, and thus potential genetic networks underlying inherited traits, including diseases, remain elusive (3–5). To address the role of genetic interactions in the genotype-to-phenotype relationship, we have been exploring their general principles through systematic analysis of genetic networks underlying cellular fitness in a genetically tractable model system, the budding yeast Saccharomyces cerevisiae (6). Our previous studies focused predominantly on genetic interactions involving two genes (digenic interactions) (7). In this study, we analyzed a series of single, double, and triple mutants by quantifying their colony size, as a proxy for fitness, to systematically explore complex genetic interactions.

There are two basic types of fitness-based genetic interactions. A negative genetic interaction refers to a combination of mutations that results in a fitness defect that is more severe than expected (8). Synthetic lethality is an extreme example of a negative genetic interaction and occurs when two mutations, neither of which is lethal on its own, combine and lead to an inviable double-mutant phenotype (9, 10). Conversely, a positive genetic interaction occurs when a combination of genetic perturbations generates a double mutant with a greater fitness than expected; one example is genetic suppression, in which the fitness defect of a query mutant is alleviated by a mutation in a second gene (11). To map a global digenic interaction network for yeast, we constructed millions of double mutants and identified hundreds of thousands of negative and positive genetic interactions (7). To put these results in perspective, although only ~1000 of the ~6000 total yeast genes are individually essential and cause lethality when deleted (12) and an equivalent number of nonessential genes cause a slow-growth defect under standard laboratory conditions (13), ~550,000 different yeast gene pairs display a combinatorial negative genetic interaction, including a subset of ~10,000 extreme synthetic lethal interactions involving nonessential gene pairs (7). Thus, there are numerous potential ways to generate extreme lethal phenotypes through negative digenic interactions of nonessential gene pairs.

The set of digenic interactions for a query gene, its genetic interaction profile, provides a quantitative measure of function, because genes with similar roles have overlapping profiles (14, 15). Genes belonging to the same biological pathway or protein complex display highly similar genetic interaction profiles. Moreover, a global network based on digenic interaction profile similarity reveals a hierarchy of functional modules, which includes detailed pathways and complexes, that in turn cluster into larger modules corresponding to bioprocesses. Those larger units subsequently assemble into modules corresponding to cellular compartments to outline the functional architecture of a cell (7).
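As a rough illustration of how overlapping profiles can separate related from unrelated genes, the sketch below compares toy digenic interaction profiles. The use of a Pearson correlation as the similarity measure, the gene labels, and every score are assumptions made for illustration; the text above does not specify the metric.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length score vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical digenic interaction scores against the same six array genes.
profiles = {
    "geneE": [-0.30, -0.25, 0.00, 0.02, -0.20, 0.01],
    "geneF": [-0.28, -0.22, 0.01, 0.00, -0.18, 0.03],  # assumed same pathway as geneE
    "geneA": [0.00, 0.02, -0.25, -0.30, 0.01, -0.02],  # assumed unrelated process
}

sim_related = pearson(profiles["geneE"], profiles["geneF"])
sim_unrelated = pearson(profiles["geneE"], profiles["geneA"])
print(f"E vs F: {sim_related:.2f}, E vs A: {sim_unrelated:.2f}")
```

In this toy case the two genes assumed to share a pathway have nearly identical profiles, whereas the unrelated gene's profile is anticorrelated with them.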

A complete understanding of the role of genetic interactions in the genotype-to-phenotype relationship requires that we also investigate complex, higher-order genetic interactions involving more than two genes. Because there are ~2000 times as many triple gene combinations as gene pairs (~18 million), it is possible that there is a substantially greater number of trigenic than digenic interactions and that higher-order interactions may be important for driving inherited traits. In this study, we surveyed yeast trigenic interactions, sampling quantitative features of the digenic network, and explored the implications of the higher-order genetic interaction network.
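The combinatorial argument can be checked with a short calculation. The sketch below assumes exactly 6000 genes (the text gives this only as an approximate count) and counts unordered pairs and triples; the ratio simplifies to (n − 2)/3.

```python
import math

n_genes = 6000  # approximate yeast gene count quoted in the text

pairs = math.comb(n_genes, 2)    # unordered gene pairs
triples = math.comb(n_genes, 3)  # unordered gene triples

print(f"pairs:   {pairs:,}")
print(f"triples: {triples:,}")
print(f"ratio:   {triples / pairs:.0f}")
```

With these inputs there are about 18 million pairs and about 36 billion triples, a ratio of roughly 2000.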

Mapping trigenic interactions quantitatively and surveying the global trigenic landscape

To explore the trigenic interaction landscape, we designed query strains that sampled three key quantitative features of our global digenic interaction network (7). We designed query strains carrying mutations in two genes spanning a range of the following features: (1) digenic interaction strength, (2) number of digenic interactions (average digenic interaction degree), and (3) digenic interaction profile similarity (Fig. 1A and table S1). Gene pairs were selected to fill bins of varying digenic interaction attributes and to cover all major biological processes in the cell, thus producing a sample that would provide a diverse survey of the trigenic interaction landscape. We largely focused on unambiguous singletons because duplicated genes represent a relatively small subset of genes and thus can only represent a small fraction of the global trigenic interaction network. For this survey, we constructed 151 double-mutant query strains and 302 single-mutant strains, encompassing 47 temperature-sensitive alleles of different essential genes and 255 deletion alleles of nonessential genes. The query strains in this set were selected to span the different digenic attribute bins according to predefined thresholds (table S1). An additional 31 double-mutant queries fell outside of the defined thresholds but were included for validation and comparison purposes (data S1 to S3) (16). The fitness of the resulting query strains was measured using a quantitative growth assay, and the behavior of the single- and double-mutant query strains showed strong agreement with our previously published data set (figs. S1 and S2 and data S4) (7, 15).

Fig. 1 Triple-mutant synthetic genetic array (SGA) analysis.

(A) Criteria for selecting query strains for sampling trigenic interaction landscape of singleton genes in yeast. The gene pairs were grouped into three general categories based on a range of features: (1) Digenic interaction strength. Gene pairs were directly connected by zero to very weak (digenic interaction score: 0 to –0.08, n = 74 strains), weak (–0.08 to –0.1, n = 32), or moderate (<–0.1, n = 45) negative digenic interactions. (2) Number of digenic interactions. Gene pairs had a low (10 to 45 interactions, n = 50), intermediate (46 to 70, n = 53), or high (>71, n = 48) average digenic interaction degree (denoted by the number of black edges of each node). (3) Digenic interaction profile similarity. Gene pairs had low (score: –0.02 to 0.03, n = 46; represented by genes A and B, which show a relatively low overlap of genetic interactions with genes K to R), intermediate (0.03 to 0.1, n = 59; represented by genes C and D, which display an intermediate overlap of genetic interactions), or high (>0.1, n = 46, represented by genes E and F, which display a relatively high level of overlap of genetic interactions) functional similarity, as measured by digenic interaction profile similarity and coannotation to the same GO term(s). Query mutant genes were either nonessential deletion mutant alleles (Δ) or conditional temperature-sensitive (ts) alleles of essential genes. (B) Diagram illustrating the triple-mutant SGA experimental strategy. To quantify a trigenic interaction, three types of screens are conducted in parallel. To estimate triple-mutant fitness, a double-mutant query strain carrying two desired mutated genes of interest (red and blue filled circles) is crossed into a diagnostic array of single mutants (black filled circle). Meiosis is induced in heterozygous triple mutants, and haploid triple-mutant progeny is selected in sequential replica pinning steps. 
In parallel, single-mutant control query strains are used to generate double mutants for fitness analysis. (C) Triple-mutant SGA quantitative scoring strategy. The top equation shows the quantification of a digenic interaction, where εij is the digenic interaction score, ƒij is the observed double-mutant fitness, and the expected double-mutant fitness is expressed as the product of single-mutant fitness estimates ƒiƒj. In the bottom equation, the trigenic interaction score (τijk) is derived from the digenic interaction score, where ƒijk is the observed triple-mutant fitness and ƒiƒjƒk is the triple-mutant fitness expectation expressed as the product of three single-mutant fitness estimates. The influence of digenic interactions is subtracted from the expectation, and each digenic interaction is scaled by the fitness of the third mutation.
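The binning scheme in panel A can be sketched as three small lookup functions. The thresholds are taken from the legend; how values that fall exactly on a boundary are assigned, the function names, and the example query pair are assumptions made for illustration. The legend lists "46 to 70" and ">71" for interaction degree, so this sketch treats any average degree above 70 as high.

```python
def strength_bin(score):
    """Bin a direct negative digenic interaction score between the query genes."""
    if score < -0.1:
        return "moderate"
    if score < -0.08:
        return "weak"
    return "zero to very weak"

def degree_bin(avg_degree):
    """Bin the average digenic interaction degree of the two query genes."""
    if avg_degree > 70:
        return "high"
    if avg_degree >= 46:
        return "intermediate"
    return "low"

def similarity_bin(similarity):
    """Bin the digenic interaction profile similarity of the two query genes."""
    if similarity > 0.1:
        return "high"
    if similarity >= 0.03:
        return "intermediate"
    return "low"

# Hypothetical query gene pair.
query = {"strength": -0.09, "avg_degree": 52, "similarity": 0.12}
bins = (
    strength_bin(query["strength"]),
    degree_bin(query["avg_degree"]),
    similarity_bin(query["similarity"]),
)
print(bins)
```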

Trigenic interaction screening required development and implementation of three operational components. First, synthetic genetic array (SGA) analysis, an automated form of yeast genetics that is often used to cross a query gene mutation into an array of single mutants to generate a defined set of haploid double mutants (6), was adapted such that a double-mutant query strain could be crossed into an array of single mutants to generate triple mutants for trigenic interaction analysis (Fig. 1B). Because the identification of a trigenic interaction requires comparison with the corresponding double mutants, we also conducted screens in which the individual mutants of the query gene pair were scored for digenic interactions (Fig. 1B). Second, for experimental feasibility, we assembled a diagnostic array of 1182 strains, comprising 990 nonessential gene deletion mutants and 192 essential gene mutants carrying temperature-sensitive alleles, which combine to span ~20% of the yeast genome (data S5). The diagnostic array was designed to be highly representative of the rest of the genome in terms of exhibited genetic interaction profiles (fig. S3). Briefly, array strains were selected from a larger genetic interaction data set for their ability to represent different regions of the global network in a minimally redundant way. This was accomplished by iteratively selecting strains to maximize the performance of profile similarities when predicting coannotations to a functional gold standard (17). Third, we developed a scoring method, the τ-SGA score, which combines double- and triple-mutant fitness estimates derived from colony size measurements to identify trigenic interactions quantitatively (Fig. 1C). The τ-SGA score differs from the MinDC score reported previously (18), because it accounts for all cases in which two of the genes are not independent, resulting in an expectation that contains digenic interaction effects scaled by the fitness of the noninteracting genes (fig. S4) (16). The final τ-SGA trigenic interaction score thus accounts for digenic effects while still enabling detection of trigenic interactions in cases where digenic effects are present but cannot fully explain the triple-mutant fitness defect.
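The scoring logic described for Fig. 1C can be sketched as follows. This is a reconstruction from the verbal description in the legend (a multiplicative expectation, with each digenic interaction scaled by the fitness of the third mutation), not the published implementation; the function names and all fitness values are hypothetical.

```python
def digenic_score(f_ij, f_i, f_j):
    """epsilon_ij: observed double-mutant fitness minus the multiplicative expectation."""
    return f_ij - f_i * f_j

def trigenic_score(f_ijk, f_i, f_j, f_k, eps_ij, eps_ik, eps_jk):
    """tau_ijk: observed triple-mutant fitness minus an expectation that includes
    the three digenic effects, each scaled by the fitness of the third mutant."""
    expected = f_i * f_j * f_k + eps_ij * f_k + eps_ik * f_j + eps_jk * f_i
    return f_ijk - expected

# Hypothetical fitness estimates relative to wild type (1.0).
f_i, f_j, f_k = 0.9, 0.8, 1.0
f_ij, f_ik, f_jk = 0.70, 0.90, 0.78
f_ijk = 0.50

eps_ij = digenic_score(f_ij, f_i, f_j)
eps_ik = digenic_score(f_ik, f_i, f_k)
eps_jk = digenic_score(f_jk, f_j, f_k)
tau = trigenic_score(f_ijk, f_i, f_j, f_k, eps_ij, eps_ik, eps_jk)
print(f"tau = {tau:.3f}")
```

In this toy case the digenic effects are weak, so most of the triple-mutant defect remains unexplained and the trigenic score is strongly negative.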

We focused exclusively on the analysis of deleterious negative trigenic interactions for two reasons. First, quantitative scoring of negative genetic interactions is often more accurate than that for positive interactions because there is a greater signal-to-noise ratio for negative genetic interactions. Hence, negative genetic interactions are associated with lower false-positive and false-negative rates than positive interactions (8), a feature that is important for the robust statistical analysis necessary to differentiate true trigenic interactions from the extensive background digenic network. Second, negative digenic interactions are generally more functionally informative than positive digenic interactions (8), and thus the large-scale mapping of a negative trigenic interaction network is expected to provide the most mechanistic insight into gene function and pathway wiring.

Trigenic interactions are enriched for functionally related genes

To obtain sufficient precision, we carried out each analysis, which involved screening the individual query genes for digenic interactions and the double-mutant query for trigenic interactions, in at least two replicate screens with four colonies per screen (fig. S5). In total, we tested 410,399 double and 195,666 triple mutants for fitness defects, meeting a previously established intermediate magnitude cutoff (15) (data S2) and identified 9363 digenic and 3196 trigenic negative interactions. From detailed validation of trigenic interactions of our CLN1-CLN2 double-mutant query, which was screened previously (19), we estimated a false-negative rate of ~40%, a false-positive rate of ~20%, and a true-positive rate between ~60 and ~75% (table S2 and fig. S6) (16), which is consistent with our previous global digenic network analysis (8).
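The counts above translate directly into the fraction of tested mutants that showed a negative interaction. The short calculation below uses only the numbers quoted in this paragraph; the variable names are illustrative.

```python
tested = {"digenic": 410_399, "trigenic": 195_666}
negative = {"digenic": 9_363, "trigenic": 3_196}

# Fraction of tested mutant combinations that scored as negative interactions.
density = {kind: negative[kind] / tested[kind] for kind in tested}

for kind, fraction in density.items():
    print(f"{kind}: {fraction:.3f}")
```

The resulting fractions, about 0.023 for digenic and 0.016 for trigenic interactions, agree with the background fractions quoted in the Fig. 2 and Fig. 6 legends, which suggests (although the text does not state it) that those backgrounds were derived from these totals.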

The distribution of trigenic interaction degree for array strains shows that the majority of low-degree genes (70%) account for ~88% of all trigenic interactions, whereas highly connected genes contribute the remaining ~12% of interactions. Thus, the trigenic interactions are not associated with a small set of highly connected genes; rather, the interactions are distributed across many different genes (fig. S7). On the other hand, with a smaller, more biased set of double-mutant query genes, the distribution of trigenic interaction degree shows that ~22% of them accounted for 51% of trigenic interactions, indicating that a particular subset of the digenic queries were enriched for trigenic interactions (fig. S7). About one-third of the newly mapped trigenic interactions identified connections that were not observed in our digenic control network; we refer to these as “novel” trigenic interactions. The remaining approximately two-thirds of the trigenic interactions overlapped a digenic interaction while still exhibiting a stronger than expected fitness defect in the triple mutant; these we refer to as “modified” trigenic interactions (fig. S8A). Thus, although a substantial fraction of trigenic interactions elucidate totally new functional information, the majority of the trigenic interactions we mapped expand upon the digenic interaction network.
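The distinction between "novel" and "modified" trigenic interactions can be sketched as a simple set lookup. The sketch assumes that a trigenic interaction overlaps a digenic interaction when either single-mutant query gene shows a negative digenic interaction with the same array gene; that reading, along with the gene labels and interactions, is an assumption made for illustration.

```python
# Hypothetical negative digenic interactions from the single-mutant control
# screens, stored as (query_gene, array_gene) pairs.
digenic_hits = {("geneA", "geneK"), ("geneB", "geneM")}

# Hypothetical negative trigenic interactions: ((query_1, query_2), array_gene).
trigenic_hits = [
    (("geneA", "geneB"), "geneK"),
    (("geneA", "geneB"), "geneL"),
    (("geneA", "geneB"), "geneM"),
]

def classify(trigenic_hit, digenic_hits):
    """Label a trigenic interaction by whether it overlaps a digenic interaction."""
    (q1, q2), array_gene = trigenic_hit
    overlaps = (q1, array_gene) in digenic_hits or (q2, array_gene) in digenic_hits
    return "modified" if overlaps else "novel"

labels = [classify(hit, digenic_hits) for hit in trigenic_hits]
print(labels)
```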

We first assessed the functional information embedded in the trigenic network by comparing the distributions of digenic and trigenic interactions across different biological processes. As observed previously (15), digenic interactions were enriched among genes annotated to the same biological process and, although the magnitude of trigenic interaction enrichment was somewhat lower, they were comparably enriched for genes within the same bioprocess (Fig. 2A). We also evaluated the enrichment of digenic and trigenic interactions across common functional standards, including annotation to the same Gene Ontology (GO) biological process, subcellular-localization pattern, protein-protein interaction, and gene coexpression (Fig. 2B). Like digenic interactions (7, 15), genes involved in trigenic interactions were significantly enriched for all of these standards, with genes participating in the “modified” class of trigenic interactions exhibiting stronger functional relationships (fig. S8B) as well as a stronger magnitude of interactions (fig. S8C). Thus, trigenic interactions resemble digenic interactions in that they are rich in functional information, which means that genes participating in many trigenic interactions can be predicted from alternative data sets and general knowledge of cellular function.

Fig. 2 Functional characterization of trigenic interactions.

(A) Frequency of negative genetic interactions within biological processes. For our analysis, we used the fraction of screened query-array combinations exhibiting negative interactions belonging to functional gene sets annotated by SAFE (spatial analysis of functional enrichment) on the global genetic interaction network (55). The “within process” category received a count for any combination in which both genes for digenic interactions or all three genes for trigenic interactions were annotated to the same term. The size of the circle assigned to each “within process” element reflects the fold increase over the background fraction of interactions (digenic = 0.023, trigenic = 0.016). Significance was assessed with a hypergeometric test; P < 0.05. Blue circles represent significant enrichment; gray circles denote no significant change. (B) Enrichment of negative digenic and trigenic interactions across four functional standards. The dashed line indicates no enrichment. The functional standards are merged protein-protein interaction (PPI) (56–60), coannotation (based on SAFE terms) (7), coexpression (61), and colocalization (62). Significance was assessed with a hypergeometric test; * represents 10⁻⁴ < P < 0.01, ** represents P < 10⁻⁴.
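The fold enrichment and hypergeometric test named in this legend can be sketched with the standard library alone. The counts below are hypothetical and were chosen only so that the background fraction equals 0.023; the function name and the exact way combinations were tallied are assumptions, not the authors' pipeline.

```python
import math

def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n items are drawn without replacement from N items,
    K of which belong to the category of interest."""
    total = math.comb(N, n)
    upper = min(K, n)
    favorable = sum(math.comb(K, x) * math.comb(N - K, n - x)
                    for x in range(k, upper + 1))
    return favorable / total

# Hypothetical counts for one bioprocess.
N = 1000  # screened query-array combinations
n = 23    # combinations showing a negative interaction (background = 0.023)
K = 50    # combinations annotated "within process"
k = 5     # "within process" combinations showing a negative interaction

fold = (k / K) / (n / N)  # fold increase over the background fraction
p_value = hypergeom_upper_tail(k, N, K, n)
print(f"fold enrichment = {fold:.2f}, P = {p_value:.4f}")
```

Here the within-process fraction (0.10) is about 4.3 times the background, and the upper-tail probability falls below the 0.05 cutoff used in the legend.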

Trigenic interactions expand functional connections mapped by the global digenic network

Our functional analysis revealed that trigenic interactions have some properties distinct from those of digenic interactions, which suggests that trigenic interactions may be useful for discovering previously unknown connections between genes and their corresponding pathways. As an illustrative example, we examined the MDY2-MTC1 double-mutant query, which is a highly connected hub within the trigenic network. MDY2 encodes a protein that interacts and functions with components of the GET (guided entry of tail-anchor) pathway (20), which is important for Golgi–to–endoplasmic reticulum (ER) trafficking and inserting tail-anchored proteins into ER membranes. MTC1 encodes a protein of unknown function that localizes to the early Golgi apparatus. The MTC1 digenic interaction profile is similar to that of USO1, which is involved in vesicle-mediated ER-to-Golgi transport (21), and RUD3, which encodes a Golgi matrix protein important for the structural organization of the cis-Golgi (22), suggesting that MTC1 also has a role in the early secretory pathway.

As expected from these previous findings, the MDY2 and MTC1 digenic interactions identified in our screen were enriched for genes involved in cell polarity and the early secretory pathway (Fig. 3, fig. S9A, and data S6). However, the MDY2-MTC1 double-mutant query profile encompassed a much more functionally diverse set of genes (Fig. 3). For example, although we observed novel trigenic interactions with ER-to-Golgi transport genes, we also observed trigenic interactions with genes involved in other modes of vesicle trafficking, including endocytosis and peroxisome biology. Moreover, MDY2-MTC1 trigenic interactions identified connections to genes with more diverse functions, such as components of the elongator complex (Fig. 3), which controls the modification of wobble nucleosides in tRNAs, and several genes involved in DNA replication and repair. Notably, the MDY2-MTC1 query also showed a trigenic interaction with TOR1 (target of rapamycin 1), which encodes the key kinase subunit of the TORC1 complex that is required for growth in response to nutrients by regulating ribosome biogenesis, nutrient transport, and autophagy (23). Consistent with this observation, the MDY2-MTC1 trigenic interaction network captures a set of genes that have a dual role in TORC1 signaling and sorting of the general amino acid permease, including GTR1, MEH1, and LST4 (24–26).

Fig. 3 The MDY2-MTC1 double mutant: a hub on the trigenic interaction network.

Representative digenic interactions are highlighted for MDY2 and MTC1 single-mutant query genes, and representative trigenic interactions are shown for the MDY2-MTC1 double-mutant query. The network was visualized using Cytoscape (63). Genes were chosen from representative protein complexes (8) in which ≥50% of members on the diagnostic array display genetic interactions. Negative genetic interactions (ε or τ < –0.08, P < 0.05) are depicted. All of the digenic and trigenic interactions displayed have been confirmed by tetrad analysis. Nodes are color coded on the basis of their biological roles and are labeled with gene names. Genes are grouped according to specific protein complexes.

The spectrum of bioprocesses that are represented in genetic interaction profiles can be visualized by mapping functional enrichment within the context of the global yeast digenic interaction profile similarity network, which clusters genes into 17 distinct bioprocesses (7, 27) (Fig. 4A). In comparison with the MDY2 and MTC1 digenic interaction profiles (Fig. 4, B and C), the MDY2-MTC1 trigenic interactions were enriched not only for vesicle trafficking and cell polarity bioprocess regions of the network but also in regions encompassing genes annotated to the tRNA wobble modification bioprocess, DNA replication and repair, as well as mitosis and chromosome segregation (Fig. 4D). Thus, the MDY2-MTC1 trigenic interaction profile exhibited a more expanded and functionally diverse set of connections than either of the corresponding MDY2 or MTC1 digenic interaction profiles.

Fig. 4 Enrichment of genetic interactions within bioprocesses defined by a global network of digenic interaction profile similarities.

(A) The global digenic interaction profile similarity network (7) was annotated using SAFE (55), identifying network regions enriched for similar GO biological process terms as outlined by dashed lines. rRNA, ribosomal RNA; ncRNA, noncoding RNA; MVB, multivesicular body. (B) MDY2 digenic interactions showing bioprocess enrichments. (C) MTC1 digenic interactions showing bioprocess enrichments. (D) MDY2-MTC1 trigenic interactions showing bioprocess enrichments.

We used a variety of assays to test three functional connections revealed by the MDY2-MTC1 trigenic interaction profile. First, although the MDY2-MTC1 double-mutant strain did not show an exaggerated cell biological phenotype associated with the early trafficking function (fig. S9B), it displayed a marked synthetic sick phenotype when combined with deletion of SLA1, which is involved in cortical actin assembly and endocytic vesicle formation. Consistent with this interaction, the double mutant showed an extended Sla1 patch lifetime, reflecting a defect in endocytosis (Fig. 5A) (28). Second, given a negative trigenic interaction with OAF1, which encodes an oleate-activated transcription factor involved in peroxisome organization and biogenesis (29), we used fluorescence microscopy to explore peroxisome morphology. The MDY2-MTC1 double mutant displayed an accumulation of relatively small peroxisomes (Fig. 5B), which may be indicative of a defect in ER-derived peroxisome membrane biogenesis (30). Third, the MDY2-MTC1 double mutant showed pronounced sensitivity to hydroxyurea (HU) but not methyl methanesulfonate (MMS), which is consistent with a specific defect in DNA replication and reflects the negative genetic interactions we observed with a number of DNA replication and repair genes, including NSE4 and NSE5, which encode components of the Smc5-Smc6 complex that mediates resolution of DNA structures spanning sister chromatids (Fig. 5C and fig. S10, A to C) (31). We suspect that the MDY2-MTC1 double mutant may be primarily defective in trafficking functions that can modulate signaling or metabolic pathways and thereby influence DNA synthesis and repair pathways indirectly (fig. S10, D to H).

Fig. 5 Trigenic interactions reflect the physiology of the MDY2-MTC1 double-mutant query strain.

(A) Endocytic membrane trafficking is impaired in the mdy2Δ-mtc1Δ double-mutant query strain. (Top) Example of tetrad analysis confirmations for the mdy2Δ-mtc1Δ-sla1Δ triple-mutant strain. (Bottom left) Endocytic uptake dynamics were examined with the Sla1–green fluorescent protein (GFP) reporter. Representative kymographs are displayed for the wild type and the mdy2Δ-mtc1Δ double mutant. (Bottom right) Lifetime of Sla1-GFP endocytic vesicle formation was quantified across ~100 different patches in two independent experiments. Error bars represent SD. (B) Peroxisome biogenesis was monitored in the wild type (wt) and in mdy2Δ, mtc1Δ, and mdy2Δ-mtc1Δ mutants using Pex14p-GFP reporter. (C) Growth response to HU and MMS for the wild type and mdy2Δ, mtc1Δ, and mdy2Δ-mtc1Δ mutants. YPD, yeast extract, peptone, and dextrose.

Trigenic interaction profiles are more functionally diverse than their corresponding digenic profiles

To test the generality of whether query genes connect to more functionally divergent genes through trigenic interactions than through digenic interactions, we compared digenic profile similarity of pairs of genes spanned by either digenic or trigenic interactions. Indeed, genes involved in trigenic interactions tend to show profiles that are less similar than those connected by digenic interactions, which suggests that they are less functionally related than those connected by digenic interactions (Fig. 6A and fig. S11A). We also found that trigenic interactions were more enriched than digenic interactions for connections that bridge several different biological processes, including mRNA and tRNA processing, vesicle trafficking, mitosis and chromosome segregation, and glycosylation and protein folding and targeting (Fig. 6B). Moreover, as we showed for the MDY2-MTC1 double-mutant query (Fig. 4), trigenic interaction profiles were generally enriched for genes spanning more diverse bioprocesses than the corresponding digenic interaction profiles (Fig. 6C and fig. S11, B to D). Genes involved in vesicle trafficking were particularly enriched for trigenic interactions occurring between bioprocesses (Fig. 6B and figs. S7 and S11D). As we observed for MDY2-MTC1, other double-mutant queries carrying mutations in genes implicated in membrane trafficking were enriched for trigenic interactions with genes involved in DNA replication and repair machinery, which may indicate a general connection between these two bioprocesses. For example, the digenic query strain MVP1-MRL1, which carries mutations in genes required for sorting proteins to the vacuole (32, 33), and the strain SEC27-GET4, which carries mutations in genes involved in ER-to-Golgi transport (34) and the insertion of tail-anchored proteins into ER membrane (20), both exhibited an enrichment of trigenic interactions with DNA replication and repair machinery (fig. S11D). 
In general, our findings show that trigenic interaction profiles are composed of connections involving genes that are more functionally diverse than their corresponding digenic interaction profiles (Fig. 6C). However, despite their higher tendency to connect diverse processes, a significant fraction of trigenic interactions occurs among genes within the same bioprocess (P < 1 × 10⁻¹⁶; hypergeometric test) (Fig. 2A).

Fig. 6 Trigenic interactions are more functionally distant than digenic interactions.

(A) Distribution of genetic interaction profile similarities of genes showing digenic and trigenic interactions. P values are based on a Wilcoxon rank sum test; **P < 10^−30. (B) Frequency of negative genetic interactions between biological processes using SAFE annotations for digenic and trigenic interactions (55). The size of the circle assigned to each “between process” element reflects the fold increase over the background fraction of interactions (digenic = 0.023, trigenic = 0.016); P < 0.05 based on a hypergeometric test. The “between process” category received a count for any combinations that were not counted in the “within process” category shown in Fig. 2A. Filled blue circles represent significant enrichment, the open blue circle represents significant underenrichment, and gray circles denote no significant change. Trigenic versus digenic fold change (the ratio of trigenic interaction enrichment to digenic interaction enrichment) is represented by filled squares (black is maximal fold change; white is no fold change). In cases for which the “between process” enrichment was observed but did not reach significance at the P < 0.05 threshold, the square is outlined with a dashed line. (C) Number of SAFE bioprocess clusters enriched for digenic or trigenic interactions.

Gene features of trigenic interactions and the expanse of the global trigenic landscape

Having selected the query gene pairs based on the properties of the global digenic interaction network, we can assess how these properties relate to trigenic interaction frequency. The strongest correlation with the number of observed trigenic interactions for each gene pair was the digenic genetic interaction profile similarity (correlation coefficient r = 0.41, P = 1.2 × 10^−7), followed by the average number of digenic interactions of the query genes (r = 0.25, P = 1.9 × 10^−3) and the strength of a direct negative genetic interaction between the query gene pair (r = 0.23, P = 5.4 × 10^−3) (fig. S12, A to C). Thus, numerous trigenic interactions were observed for functionally related query genes, which display overlapping profiles on the digenic similarity network and often show a digenic interaction with each other (7) (Fig. 7A). As observed for digenic interactions (7), the frequency of trigenic interactions was highly correlated with the fitness defect of the double-mutant query strain (fig. S13). Consistent with this observation, essential genes exhibited high connectivity on the trigenic interaction network. A double-mutant query that carries at least one temperature-sensitive allele of an essential gene, which is often associated with a fitness defect at the semipermissive screening temperature, exhibited more genetic interactions than a query deleted for a pair of nonessential genes (P = 0.035) (Fig. 7B). More generally, query genes that are highly connected on the digenic network are also highly connected on the trigenic network (fig. S12D).

Fig. 7 Relation of digenic and trigenic interaction networks.

(A) Trigenic interaction degree distribution correlated with three quantitative features of genes on the digenic interaction network: (i) Interaction profile similarity of the two genes in the double-mutant query gene pair (bin thresholds: –0.02, 0.03, 0.1, +∞), which generates three bins for average digenic interaction profile similarity (r): –0.02 < r < 0.03; 0.03 ≤ r < 0.1; 0.1 ≤ r. (ii) Negative digenic interaction strength associated with the double-mutant query gene pair (bin thresholds: 0, –0.08, –0.1, –∞), which generates three bins for digenic interaction score (ε): ε < –0.1; –0.1 ≤ ε < –0.08; –0.08 ≤ ε < 0. (iii) Average digenic interaction degree, which represents the average number of negative genetic interactions associated with each of the genes of the double-mutant query gene pair (bin thresholds: 10, 45, 70, +∞), which generates three bins for average digenic interaction degree: 10 ≤ degree < 45; 45 ≤ degree <70; 70 ≤ degree. The bin with the highest average negative trigenic interaction degree at the intermediate interaction score cutoff (τ < –0.08) of 63.5 is shown in dark blue. (B) Essentiality determines trigenic interaction degree. Number of single mutants: 254 nonessential genes, 47 essential genes. Number of double mutants: 111 nonessential gene pairs, 40 essential or mixed essentiality gene pairs. Mean genetic interaction is represented; error bars indicate SEM; P values are based on a t test. Negative genetic interactions (ε or τ < –0.08, P < 0.05) are depicted. (C) Cumulative distribution of negative digenic and trigenic interaction score magnitudes. Pairwise significance was assessed with a Wilcoxon rank sum test. (D) Estimates of the number of digenic and trigenic interactions at the intermediate score cutoff (ε or τ < –0.08, P < 0.05). Bootstrapping was used to generate the estimate by sampling 10,000 times with replacement. Dashed lines indicate the 95% CIs; solid lines denote the estimated extent of the trigenic interaction landscape. 
This conservative estimate of the total number of trigenic interactions in the yeast genome covers ~26% of the interaction space. For the total genome-wide estimate, see fig. S15B and table S3.

Notably, trigenic interactions tend to be ~25% weaker than digenic interactions (P < 1.7 × 10^−98) (Fig. 7C), which means the average digenic interaction often has a more profound phenotype than the average trigenic interaction. However, to fully understand the potential for trigenic interactions to drive fitness defects, we also need to estimate the frequency at which they occur. Because we have mapped digenic interactions comprehensively and we know the false-positive and false-negative rates associated with this analysis, we can estimate the number of digenic interactions within the yeast genome, revealing a distribution that centers on ~6 × 10^5 total negative interactions (Fig. 7D) (16). Furthermore, because digenic interaction properties are predictive of trigenic interaction degree, we can also extrapolate our findings to estimate the number of negative trigenic interactions across the whole genome. As noted earlier, we selected gene pairs for trigenic analysis to fill bins of varying attributes, including double-mutant queries with either weak or strong interactions, as well as those with either sparse or rich genetic interaction profiles, as depicted schematically in Figs. 1A and 7A (table S1). For extrapolation to the whole genome, we used the mean trigenic interaction degree of double mutants in a given bin as the expected degree for any hypothetical pair from the genome with similar characteristics (data S7 and fig. S14). Integrating this value across the total number of gene pairs with the given characteristics, which preserves the double-mutant distribution across different digenic interaction features, and summing over all bins yielded an estimate of the total number of trigenic interactions.
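The bin-based extrapolation described above, together with bootstrap resampling of query pairs of the kind mentioned for Fig. 7D, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure from the supplementary methods: the bin labels, query degrees, and genome-wide pair counts are invented placeholders, and the sketch does not correct for a triple being reachable through more than one query pair or for array coverage.

```python
import random
from collections import defaultdict
from statistics import mean

# Hypothetical screened queries: (bin label, observed negative trigenic degree).
QUERIES = [("low", 2), ("low", 4), ("low", 3),
           ("mid", 20), ("mid", 35),
           ("high", 60), ("high", 90)]
# Hypothetical number of gene pairs in the genome that fall into each bin.
GENOME_PAIRS = {"low": 1_000_000, "mid": 50_000, "high": 5_000}

def extrapolate(queries, genome_pairs):
    """Sum over bins of (mean trigenic degree in bin) x (genome pairs in bin)."""
    by_bin = defaultdict(list)
    for label, degree in queries:
        by_bin[label].append(degree)
    # A bin with no sampled query contributes nothing to the total.
    return sum(mean(degs) * genome_pairs[label] for label, degs in by_bin.items())

def bootstrap_ci(queries, genome_pairs, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile interval from resampling query pairs with replacement.
    Each resampled query keeps its bin label and its trigenic degree."""
    rng = random.Random(seed)
    estimates = sorted(
        extrapolate(rng.choices(queries, k=len(queries)), genome_pairs)
        for _ in range(n_boot)
    )
    lo = estimates[int(n_boot * alpha / 2)]
    hi = estimates[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi

point = extrapolate(QUERIES, GENOME_PAIRS)
low, high = bootstrap_ci(QUERIES, GENOME_PAIRS, n_boot=2_000)
print(f"estimate = {point:,.0f}; 95% interval = ({low:,.0f}, {high:,.0f})")
```

Because the sparsest bin typically contains most of the gene pairs in the genome, its mean degree dominates the total, which is why a low value assigned to that bin makes the overall estimate conservative.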

In the yeast genome, there are 2000 times as many possible triple gene combinations (36 billion) as possible gene pairs (18 million), but the density of interactions (as both observed and extrapolated) is similar, reduced by only a factor of ~3 for trigenic interactions (table S3). We predict that ~10^8 trigenic combinations exhibit a negative genetic interaction, generating a conservative estimate of on the order of 100 times as many trigenic interactions as observed for the global digenic network (Fig. 7D and table S3). To establish confidence intervals (CIs) for the estimate, we repeated the extrapolation process with 10,000 bootstrapped samplings of the 151 double-mutant query pairs, keeping their associated trigenic interaction degrees and the corresponding digenic interaction features constant. The bin with the lowest digenic interaction degree encompasses a large fraction of the potential double mutants in the genome and is assigned a low trigenic interaction degree, which means that the summarized estimate provided is likely a conservative underestimate. Moreover, because our binning scheme restricts our extrapolation to ~25% of the potential trigenic interaction space (e.g., by omitting potential double-mutant queries that show a positive digenic interaction), we are underestimating its extent, and the true number of trigenic interactions is likely to be several times higher (fig. S15 and table S3) (16). The vast expanse of the global trigenic interaction network points to the potential for higher-order interactions to affect all aspects of the genetics of outbred populations, including the genotype-to-phenotype relationship.
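The combinatorial figures quoted above can be checked with a few lines of arithmetic. The sketch below assumes a round count of 6000 genes (chosen because it reproduces the ~18 million pairs stated in the text) and uses the approximate estimates quoted in this section. Dividing the trigenic estimate by only the ~25% of the space covered by the extrapolation is our reading of how the factor of ~3 arises, not a statement of the authors' calculation.

```python
from math import comb

N_GENES = 6000  # assumed round number of yeast genes

pairs = comb(N_GENES, 2)    # possible gene pairs
triples = comb(N_GENES, 3)  # possible gene triples

DIGENIC_EST = 6e5        # ~6 x 10^5 negative digenic interactions (from the text)
TRIGENIC_EST = 1e8       # ~10^8 negative trigenic interactions (from the text)
COVERED_FRACTION = 0.25  # ~25% of trigenic space covered by the extrapolation

digenic_density = DIGENIC_EST / pairs
# Assumption: density is taken over the covered part of the trigenic space.
trigenic_density = TRIGENIC_EST / (COVERED_FRACTION * triples)

print(f"pairs   = {pairs:,}")
print(f"triples = {triples:,}")
print(f"triples per pair = {triples / pairs:.0f}")
print(f"digenic density  = {digenic_density:.1%}")
print(f"trigenic density = {trigenic_density:.1%}")
print(f"trigenic / digenic count = {TRIGENIC_EST / DIGENIC_EST:.0f}")
```

Under these assumptions the densities come out near 3% and 1%, and the ratio of counts is on the order of 100, in line with the values reported in the text.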

Discussion

Systematic mapping of trigenic interactions revealed that their properties resemble those of digenic interactions because they often connect functionally related genes, which means that trigenic interactions have the potential to contribute to our understanding of the functional wiring diagram of the cell. The global digenic network is predictive of trigenic interactions because query gene pairs showing properties associated with shared functionality, such as overlapping digenic interaction profiles or a negative digenic genetic interaction, often map numerous trigenic interactions (Fig. 7A). Thus, if two query genes are in the same or similar bioprocess cluster on the global digenic profile similarity network (Fig. 4A), they will likely show a rich trigenic interaction profile, as we observed for the MDY2-MTC1 double mutant query (Figs. 3 and 4D). Gene essentiality and the average digenic interaction degree of the query gene pair were also correlated with trigenic connectivity (Fig. 7B), indicating that highly connected hubs are consistent on both the digenic and trigenic interaction networks (fig. S12, C and D).

Many of the trigenic interactions we observed overlapped with at least one digenic interaction. In some cases, we chose query gene pairs displaying a negative genetic interaction and so all of the trigenic interactions in these profiles accentuated the query interaction (fig. S8). Moreover, a substantial proportion of trigenic interactions measured for noninteracting query pairs exacerbated a digenic interaction that was previously seen between one or both of the query genes and the third gene (fig. S8). Thus, our findings show that negative trigenic interactions often highlight the potential for a third mutation to amplify the phenotype of a digenic interaction. Analogously, in human genetics, the variation in an individual’s genetic background can have profound influence on the penetrance of the phenotype associated with a disease gene (35).

Although we found that trigenic interactions tend to be slightly weaker than digenic interactions, they are ~100 times more numerous and are more functionally diverse than their digenic counterparts. The expanse of possible three-gene combinations makes exhaustive trigenic interaction mapping intractable with our current methodology. However, the substantial overlap of the digenic and trigenic networks indicates that the genetic landscape of the cell expands with higher-order genetic interactions but does not change drastically in terms of its functional modularity. Thus, the global digenic network is highly informative of potential trigenic interactions and can be used effectively to predict candidate query gene pairs for efficient trigenic interaction analysis.

Trigenic interaction data may inform a wide variety of subjects within biology. For example, the number, magnitude, and properties of digenic and trigenic interactions clarify aspects of speciation theory in evolutionary biology (36, 37). Hybrids between species exhibit reduced fitness, which is usually attributed to negative (epistatic) interactions among genes that diverged in isolated populations. Each population may evolve fixed variants that are neutral or adaptive in their own genetic backgrounds. When these variants are brought together for the first time in hybrid genomes, they may cause deleterious genetic interactions, also termed Dobzhansky-Muller incompatibilities. As populations diverge from one another, the number of potential digenic interactions increases as the square of the number of substitutions, the so-called “snowball effect” (36, 38, 39). That is, each subsequent substitution in a distinct population has the potential to interact with any substitution from the other population (and vice versa), and thus the probability of a speciation event grows with each step. Most speciation genetics research has focused on these digenic interactions. However, the number of trigenic combinations accumulates exponentially faster than the number of digenic combinations. Both digenic and trigenic interactions have been implicated in speciation (40, 41), but the general extent to which digenic or complex negative genetic interactions drive speciation remains unknown. If digenic interactions do, in fact, play a major role in orchestrating speciation, then the frequency and/or the strength of deleterious trigenic interactions must be relatively smaller than those of digenic interactions. Our systematic analysis shows that trigenic interactions are somewhat less likely to occur (by a factor of 3; ~3% versus ~1% for digenic versus trigenic, respectively) and generally weaker (~25% weaker) than digenic interactions.
Nevertheless, modeling based on our findings suggests that trigenic interactions are sufficiently common and often strong enough to play a key role in the evolution of hybrid inviability (fig. S16 and table S4) (16). Because the connections associated with higher-order interactions may often overlap with those of simpler interactions and because those simpler interactions require fewer substitutions and will often manifest first, our findings may also suggest that the evolution of even more highly complex interactions may be limited, even though their absolute numbers increase exponentially, a possibility that is consistent with evolutionary theory (38).
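To make the counting argument behind the snowball effect concrete, the sketch below tabulates how many two-way and three-way combinations can be formed among k substitutions. It is a deliberately simplified count, assuming any substitutions can combine and ignoring which lineage each arose in; it is not the model used for fig. S16. The function name is hypothetical.

```python
from math import comb

def potential_combinations(k: int) -> tuple[int, int]:
    """Return (pairwise, three-way) combinations among k substitutions."""
    return comb(k, 2), comb(k, 3)

for k in (10, 20, 40, 80):
    n_pairs, n_triples = potential_combinations(k)
    # The ratio equals (k - 2) / 3, so three-way combinations outpace
    # pairwise ones by a factor that keeps growing as substitutions accumulate.
    print(f"k={k:3d}  pairs={n_pairs:6d}  triples={n_triples:7d}  "
          f"ratio={n_triples / n_pairs:.1f}")
```

Even if only ~1% of three-way combinations interact, compared with ~3% of pairs, this simplified count has three-way combinations exceeding pairs ninefold once k is larger than about 29, so the expected number of trigenic incompatibilities would then exceed the digenic number.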

Our trigenic interaction study is also relevant to synthetic biology efforts aimed at efficient synthesis (42, 43) and design of minimal genomes (44, 45). Digenic synthetic lethal interactions were recently noted as a major constraint in the design of the minimal genome for the bacterium Mycoplasma mycoides, for which a viable genome could be constructed only after resolving lethal interactions that arose between nonessential genes (44). For species in which systematic gene perturbation studies have been conducted, the proportion of essential genes is relatively small (e.g., ~20% in yeast, ~10% in human cells, which increases to 20% when only expressed genes are considered) (13, 46–48). However, we expect that digenic and trigenic interactions will dictate much larger minimal genomes than the essential gene set, even for growth under simple laboratory conditions. With the complete digenic network (7), we estimate that the minimal yeast genome would encompass more than ~70% of genes after accounting for digenic interactions (table S5) (16). With the inclusion of constraints imposed by trigenic interactions, we expect that a minimal genome, without a substantial fitness defect, may nearly approach the complete set of genes encoded in the genome. Thus, genetic interactions may help to explain the large gap between the number of genes with strong individual fitness defects and the total genome size, and the prevalence of yeast negative trigenic interactions suggests that many genomes lack the potential for substantial compression while maintaining normal fitness.

It is important to consider other types of genetic interactions in addition to those associated with severe loss-of-function alleles due to entire open reading frame deletions of nonessential genes or temperature-sensitive alleles of essential genes. Our analysis revealed that double mutants with strong fitness defects often show rich trigenic interaction profiles (fig. S13C). Similarly, single-mutant fitness defects also correlate with digenic interaction degree (fig. S13A) (7). Presumably, the weaker fitness effects associated with the variation found in natural populations may require higher-order combinations, involving more than two genes, to influence trait heritability through genetic interactions (49). In the yeast model system, genetic interactions were found to play an important role in the heritability of a number of different quantitative traits, possibly with a greater contribution made by digenic interactions versus higher-order interactions (4, 5, 50). The genetic mechanism underlying conditional essentiality, in which a given yeast gene is nonessential in one genetic background but essential in another, often appears to be associated with a complex set of modifier loci (49), as do a number of other traits (51, 52). Thus, both digenic and higher-order interactions are established components of the genetic architecture of yeast complex traits, and similar findings have been made in a number of other organisms (53). In part because model organism populations have allele distributions that differ from those in humans, the degree to which higher-order genetic interactions will contribute to the genetics of complex human disease remains to be seen (50, 54). Nevertheless, the extensive landscape of trigenic interactions revealed here for yeast, as well as their capacity for generating functionally diverse phenotypes and driving speciation, suggests that higher-order genetic interactions may play a key role in the genotype-to-phenotype relationship.

Materials and methods summary

The supplementary materials contain a detailed description of materials and methods for the construction of yeast single-, double-, and triple-mutant strains as well as quantification of genetic interactions and any associated analyses. General methodological information and references to specific sections appear throughout the text.

Supplementary Materials

www.sciencemag.org/content/360/6386/eaao1729/suppl/DC1

Materials and Methods

Figs. S1 to S16

Tables S1 to S5

References (64–82)

Data S1 to S7

References and Notes

  1. Materials and methods are available as supplementary materials.
Acknowledgments: We thank H. Friesen, J. Hou, and J. Moffat for discussions and M. P. Masinas and J. Nelson for logistical help. Funding: This work was primarily supported by the NIH (grant R01HG005853 to C.B., B.J.A., and C.L.M. and grants R01HG005084 and R01GM104975 to C.L.M.), the Canadian Institutes of Health Research (CIHR) (grants FDN-143264 and FDN-143265 to C.B. and B.J.A.), and the NSF (grant DBI-0953881 to C.L.M.). Computing resources and data storage services were partially provided by the Minnesota Supercomputing Institute and the University of Minnesota Office of Information Technology, respectively. Additional support was provided by the CIHR (grant MOP-79368 to G.W.B.), the NSF (DEB-1456462 to D.B.), the Swiss National Science Foundation (R.L.), the Canton of Geneva and the European Research Council Consolidator Grant program (R.L.), Natural Science and Engineering Research Council of Canada Postgraduate Scholarship-Doctoral PGS D2 (E.K.), a University of Toronto Open Fellowship (E.K.), and a University of Minnesota Doctoral Dissertation Fellowship (B.V.). C.L.M., B.J.A., and C.B. are fellows of the Canadian Institute for Advanced Research. Author contributions: Conceptualization: E.K., B.V., B.J.A., C.B., and C.L.M.; Methodology and investigation: E.K., B.V., W.W., R.D., Y.C., A.B., M.M.U., J.v.L., E.N.K., C.P., A.J.D., M.P., J.Z.Y.W., J.H., M.R., K.X., H.H., B.-J.S.L., E.S., and H.Z.; Formal analysis: E.K., B.V., W.W., M.M.U., E.N.K., C.P., A.J.D., J.H., K.X., H.H., M.C., R.L., A.C., D.B., and G.W.B.; Resources: G.T.; Data curation: M.U.; Writing – original draft: E.K., B.V., B.J.A., C.B., and C.L.M.; Writing – review and editing: E.K., B.V., W.W., R.D., A.B., M.M.U., J.v.L., E.N.K., C.P., M.C., D.B., G.W.B., B.J.A., C.B., and C.L.M.; Supervision: B.J.A., C.B., and C.L.M.; Project administration: N.V.D. and S.S.; Funding acquisition: B.J.A., C.B., and C.L.M. Competing interests: The authors declare no competing interests.
Data and materials availability: All data files (data S1 to S7) associated with this study are described in detail and available in the supplementary materials and can be downloaded from http://boonelab.ccbr.utoronto.ca/supplement/kuzmin2018/supplement.html. Data files S1 to S7 have also been deposited in the DRYAD Digital Repository (doi: 10.5061/dryad.tt367).