Research Article

Global Mapping of the Yeast Genetic Interaction Network


Science  06 Feb 2004:
Vol. 303, Issue 5659, pp. 808-813
DOI: 10.1126/science.1091317

Abstract

A genetic interaction network containing ∼1000 genes and ∼4000 interactions was mapped by crossing mutations in 132 different query genes into a set of ∼4700 viable yeast gene deletion mutants and scoring the double mutant progeny for fitness defects. Network connectivity was predictive of function because interactions often occurred among functionally related genes, and similar patterns of interactions tended to identify components of the same pathway. The genetic network exhibited dense local neighborhoods; therefore, the position of a gene on a partially mapped network is predictive of other genetic interactions. Because digenic interactions are common in yeast, similar networks may underlie the complex genetics associated with inherited phenotypes in other organisms.

Gene deletion mutations have been constructed for each of the ∼6000 known or predicted genes (1) in the budding yeast Saccharomyces cerevisiae, of which ∼73% are nonessential (2). Synthetic genetic array (SGA) analysis, an approach that automates the isolation of yeast double mutants (3), enables large-scale mapping of genetic interactions. In a typical SGA screen, a mutation in a query gene of interest is crossed into an array of viable gene deletion mutants to generate an output array of double mutants, which can then be scored for specific phenotypes (3). Synthetic lethal or sick interactions, in which the combination of mutations in two genes causes cell death or reduced fitness, respectively, are of particular interest because they can identify genes whose products buffer one another and impinge on the same essential biological process (4). To determine the basic principles of genetic interaction networks, we conducted a large-scale analysis of synthetic genetic interactions in yeast. Because many of the genes that control the essential processes of eukaryotic cells are highly conserved, we expect that specific elements of the yeast genetic network and its general properties are also conserved.

Large-scale genetic network analysis. We performed 132 SGA screens, focused on query genes involved in actin-based cell polarity, cell wall biosynthesis, microtubule-based chromosome segregation, and DNA synthesis and repair. The query mutations were either deletion alleles of nonessential genes or conditional (partially functional) alleles of essential genes. Each SGA screen was conducted three times, and putative interactions scored multiple times were then evaluated by tetrad or random spore analysis. We also attempted to confirm candidate interactions observed only one time in three trials (∼25% of the total), if the function of the candidate interactor was either similar to those found in multiple screens or uncharacterized previously. The resulting confirmed data set, containing ∼4000 interactions amongst ∼1000 genes (5, 6) (table S1), should contain few false positives (incorrect interactions); however, for a given screen, each round of SGA analysis identified new interactions, and we estimate the frequency of false negatives (true interactions that were not identified) to be in the range of 17 to 41% (5) (figs. S1 and S2). The number of confirmed interactions per query gene varied from 1 to 146, with an average of 34 interactions per screen. By comparison, yeast proteins tend to show about eight physical interactions in large-scale screens (5) (table S2), suggesting that the genetic interaction network is at least four times as dense as the protein-protein interaction network. This greater density reflects the fact that genetic interactions map functional relations (5), which transcend physical interactions.

Approximately 20% of the query genes we attempted (not included in the 132 query genes listed above) showed no more positives than would be expected for a wild-type control query gene and were aborted after the first SGA screen (5). Assuming that gene pairs not yet tested by SGA behave similarly to those analyzed here, the yeast synthetic genetic network contains on the order of ∼100,000 interactions.
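The order of magnitude of this extrapolation can be checked with a back-of-envelope calculation. The sketch below is not the authors' calculation (their estimate is described in the supporting material); it simply assumes that the interaction rate seen among screened pairs applies uniformly to all untested pairs, and it ignores false negatives, the bias in how query genes were chosen, and the aborted screens.

```python
from math import comb

# Approximate figures quoted in the text.
confirmed_interactions = 4000
query_genes = 132
array_mutants = 4700
total_genes = 6000

# Observed interaction rate among the gene pairs that were actually screened.
tested_pairs = query_genes * array_mutants
rate = confirmed_interactions / tested_pairs

# Assumption: the same rate holds for every untested gene pair.
low = rate * comb(array_mutants, 2)   # pairs among viable deletion mutants only
high = rate * comb(total_genes, 2)    # pairs among all known or predicted genes

print(f"rate per tested pair: {rate:.4f}")
print(f"extrapolated interactions: {low:,.0f} to {high:,.0f}")
```

Under these simplifying assumptions both bounds fall within the same order of magnitude as the ∼100,000 interactions stated above.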

Identification of functionally related genes by synthetic genetic interactions. We assessed the relationship between the large-scale synthetic genetic interaction data and annotation with Gene Ontology (GO) functional attributes (7), applying three different computational approaches. First, we examined 756 specific GO attributes (5) and found that, in 80 cases, genes sharing the same attribute interact genetically more often than expected by chance (P′ < 0.05; considering only tested gene pairs, P′ is a P value corrected for multiple hypothesis testing). Second, we examined each of 285,390 different GO attribute pairs and found that 1,755 attribute pairs were “bridged” significantly by genetic interaction (P′ < 0.05). Here, a gene pair “bridges” two GO attributes if one gene has the first attribute, the other gene has the second attribute, and neither gene has both. A network mapping these relationships revealed four highly connected subnetworks, containing GO attributes associated with actin-based functions, secretion, microtubule-based functions, or DNA synthesis or repair (Fig. 1). The relative topology of these subnetworks identifies general functions that buffer one another; for example, microtubule-based functions buffer both actin-based and DNA synthesis or repair functions. In the third computational approach, we examined whether genetically interacting genes tend to have similar GO function annotation. We found that over 12% of genetic interactions involve genes with an identical GO annotation (12 times more than expected by chance; P = 10⁻²⁹⁶), and over 27% of genetic interactions are between genes with a similar or identical GO annotation (an eightfold increase; P < 10⁻³²²), which is a conservative estimate because we defined two GO attributes to be similar only if they were annotated with significantly overlapping sets of genes (5).
In summary, the results of this large-scale analysis suggest that synthetic genetic relationships frequently coincide with a known functional relationship between gene pairs. Thus, the complete genetic network will represent a global map of functional relationships between genes.
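The "bridging" definition used in the second approach can be written as a small predicate. The following is a minimal sketch of that definition only; the function name and the placeholder attribute labels are hypothetical, and the significance testing applied to the resulting counts is not shown.

```python
def bridges(attrs_a, attrs_b, go1, go2):
    """Return True if a gene pair bridges GO attributes go1 and go2.

    attrs_a and attrs_b are the sets of GO attributes annotated to each gene.
    One gene must carry go1, the other go2, and neither gene may carry both.
    """
    a_has_both = go1 in attrs_a and go2 in attrs_a
    b_has_both = go1 in attrs_b and go2 in attrs_b
    if a_has_both or b_has_both:
        return False
    return (go1 in attrs_a and go2 in attrs_b) or (go2 in attrs_a and go1 in attrs_b)


# Placeholder attribute labels, not real GO terms.
print(bridges({"GO_X"}, {"GO_Y"}, "GO_X", "GO_Y"))          # one attribute each
print(bridges({"GO_X", "GO_Y"}, {"GO_Y"}, "GO_X", "GO_Y"))  # first gene has both
```

Counting how many interacting pairs satisfy this predicate for a given attribute pair, and comparing that count with what is expected among tested pairs, would then give the input for a test of association.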

Fig. 1.

A network of genetically connected gene functions. GO attribute names are colored according to the legend. Pairs of different GO attributes are linked if they are connected by genetic interactions significantly more often than would be expected by chance (P < 0.002) (table S7). The subsets of GO attributes were clustered using a network layout algorithm (5). Because the P values have been corrected for multiple hypothesis testing by resampling (5), it would be unlikely (P < 0.002) to find any lines in this network if synthetic interaction and GO annotation were unrelated. A section of the complete map on the left (outlined in red) is shown in greater detail on the right. Additional views of this map may be found in figs. S9 and S10. Significance is based on Fisher's exact test of association, and the P value is corrected for multiple hypothesis testing by resampling (29).

Overlap of genetic interaction with other gene pair characteristics. We explored the relationship between genetic interactions and a variety of other characteristics of gene or protein pairs (5) (table S3). Synthetic genetic interactions were significantly more abundant between genes with the same mutant phenotype (P = 10⁻³¹⁶), between genes encoding proteins with the same subcellular localization (P = 10⁻⁷⁰), and between genes encoding proteins within the same protein complex (P = 10⁻⁶⁸). Synthetic genetic interactions were also enriched amongst gene pairs encoding homologous proteins (P = 10⁻²²), but this accounted for relatively few (2%) of the observed interactions.

Two-dimensional hierarchical clustering of genetic interaction profiles. To organize all genes in the network by their genetic interaction patterns, we performed two-dimensional hierarchical clustering analysis (Fig. 2A). This method clusters the query genes (vertical axis) according to the overlap of their interactions with array genes and clusters array genes (horizontal axis) according to overlap of their interactions with query genes. Sets of genes that function within the same pathway or complex tend to cluster together. Examples of clustered query genes (Fig. 2B) include actin patch assembly (ARC40 and ARP2), the chitin synthase III pathway (BNI4, CHS6, CHS3, SKT5, CHS7, and CHS5), the prefoldin complex (GIM3, GIM4, GIM5, PAC10, and YKE2), and sister chromatid cohesion (CTF18, DCC1, CTF8, and CTF4). Examples of clustered array genes (Fig. 2B) include components of the protein kinase C (PKC) mitogen-activated protein (MAP) kinase signal transduction pathway (BCK1 and SLT2), the dynein-dynactin spindle orientation pathway (ARP1, NUM1, DYN1, PAC11, PAC1, DYN2, JNM1, YMR299c, NIP100, and BIK1), and the spindle checkpoint pathway (BFA1, BUB2, BUB1, MAD1, MAD2, MAD3, and BUB3) (8).
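The idea of grouping genes by the overlap of their interaction profiles can be illustrated with a toy example. The sketch below assumes a Jaccard index as the overlap measure (the similarity metric actually used for the clustering is not stated here) and finds the most similar pair of profiles, which is the first merge an agglomerative clustering would make. The gene labels are placeholders, not data from the screens.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap of two interaction sets: |A ∩ B| / |A ∪ B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy profiles: query gene -> set of interacting array genes.
profiles = {
    "queryA": {"g1", "g2", "g3", "g4"},
    "queryB": {"g1", "g2", "g3", "g5"},
    "queryC": {"g6", "g7"},
}


def most_similar_pair(profiles):
    """Return the pair of genes whose interaction sets overlap the most."""
    return max(
        combinations(sorted(profiles), 2),
        key=lambda pair: jaccard(profiles[pair[0]], profiles[pair[1]]),
    )


pair = most_similar_pair(profiles)
print(pair, jaccard(profiles[pair[0]], profiles[pair[1]]))
```

Applying the same comparison to the transposed data (array gene to set of query genes) would give the second dimension of the clustering.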

Fig. 2.

Two-dimensional hierarchical clustering of the synthetic genetic interactions determined by SGA analysis. (A) Synthetic genetic interactions are represented as red lines. Rows, 132 query genes; columns, 1007 array genes. The cluster trees organize query and array genes that show similar patterns of genetic interactions. (B) Sections [yellow outlines in (A)] are expanded to allow visualization of specific query gene and array gene clusters. Synthetic genetic interactions are represented as red squares.

The clustergram highlights particular pathways that buffer one another. For example, query genes involved in the establishment of sister chromatid cohesion during chromosome replication, CTF18, DCC1, CTF8, and CTF4 (9), interact similarly with sets of genes encoding components of several different pathways, including the MAD/BUB spindle checkpoint pathway, the RAD51 pathway that controls recombinational repair of double-strand breaks, the RAD9 DNA damage checkpoint, and the TOF1/MRC1 DNA replication checkpoint pathway. These findings are consistent with a role of cohesion in establishing spindle tension during mitosis and in the repair of double-strand breaks caused by stalled replication forks.

Clustering uncharacterized genes with the components of defined pathways should enable us to predict specific biological functions. For example, the clustering analysis revealed that the genetic interaction pattern observed for CSM3 was most similar to that of the DNA replication checkpoint genes MRC1 and TOF1 (Fig. 2B), whose products interact directly with the DNA replication machinery and facilitate Rad53 activation in response to replication stress (10). Indeed, after methyl methanesulfonate (MMS)-induced DNA damage, csm3Δ rad9Δ double mutants were unable to slow S phase progression (Fig. 3A), like tof1Δ rad9Δ double mutants (11). In addition, like mrc1Δ rad9Δ double mutants (12), csm3Δ rad9Δ double mutants were defective in cell cycle arrest in response to replication fork stalling and activation of the Rad53 checkpoint kinase (Fig. 3B). Moreover, Csm3 has been shown to bind Tof1 by two-hybrid (13) and co-immunoprecipitation (14) assays. Thus, Csm3 may function at the level of Mrc1 and Tof1 in the Rad53 DNA replication checkpoint pathway (10).

Fig. 3.

(A) CSM3 is required for checkpoint activation in response to replication blocks. Logarithmically growing cultures were arrested in G1 and released into the cell cycle in the presence of 0.035% MMS. At the indicated times (min) after release, cell samples were taken and the cell cycle distribution was analyzed by flow cytometry. The positions of cells with a 1C and 2C DNA content are indicated. (B) Cell extracts derived from the MMS-treated cells were analyzed by immunoblotting to detect the activated (phosphorylated) form of Rad53 (Rad53-P). (C) ymr299cΔ exhibits abnormal cytoplasmic microtubules and defects in mitotic spindle positioning similar to dyn1Δ and arp1Δ. Cells were stained with antibody to tubulin and DAPI (4′,6′-diamidino-2-phenylindole); the percentage of cells with microtubule orientation defects was scored in large-budded cells (5) (fig. S4).

The uncharacterized gene YMR299c clusters with the genes encoding the dynein-dynactin spindle orientation pathway (15), suggesting it may be a new component of this pathway. We found that the predicted YMR299c protein sequence showed weak similarity to mammalian cytoplasmic dynein light intermediate chain (fig. S3), and analysis of the YMR299c deletion mutant revealed a number of phenotypes known to be associated with cells defective for dynein-dynactin function, including exaggerated cytoplasmic microtubules and a more severe nuclear migration defect (Fig. 3C). Ymr299c localized to cortical dots, one or two per cell, that were motile and colocalized to the tip of cytoplasmic microtubules (5) (fig. S4). Thus, Ymr299c may function as the yeast dynein light intermediate chain.

Predicting protein-protein interactions from common neighbors in the genetic interaction network. Although genetically interacting genes do encode proteins in the same complex more often than would be expected by chance (5) (table S3), the predictive value of this correlation is limited because only ∼1% of the gene pairs encode proteins in the same complex, which presumably reflects that we are largely mapping interactions amongst genes in nonessential pathways. However, analysis of the genetic network and ∼15,000 protein-protein interactions collected from large-scale studies and the literature (16) revealed that the number of common neighbors between two genes in the SGA network correlates with a known protein-protein interaction between the corresponding gene products. For example, the CTF8 product is known to affinity-purify with that of CTF18, and these two genes are not connected in the genetic network but share a number of common interactions (Fig. 2B). Whereas only 30 of 4039 genetically interacting gene pairs encode physically-interacting proteins, 28 of 333 gene pairs with more than 16 common synthetic genetic neighbors encode physically interacting proteins, an ∼11-fold enrichment. The sensitivity of this predictive approach is limited by the size of the genetic network, but the accuracy of the approach increases with the number of shared interactions (5) (figs. S5 and S6), suggesting that its usefulness as a predictor will improve as the data set grows.
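The common-neighbor count and the quoted enrichment can be reproduced in a few lines. This is a minimal sketch: the helper function and the toy network with placeholder gene labels are illustrative assumptions, while the four counts in the enrichment calculation are taken directly from the text.

```python
def common_neighbors(network, gene_a, gene_b):
    """Genes that interact genetically with both gene_a and gene_b."""
    return network.get(gene_a, set()) & network.get(gene_b, set())


# Hypothetical toy network (placeholder labels): gene -> genetic interaction partners.
toy_network = {
    "geneA": {"n1", "n2", "n3"},
    "geneB": {"n2", "n3", "n4"},
}
print(sorted(common_neighbors(toy_network, "geneA", "geneB")))

# Enrichment from the counts quoted in the text.
baseline = 30 / 4039   # genetically interacting pairs that also interact physically
shared = 28 / 333      # pairs with more than 16 common genetic neighbors
fold = shared / baseline
print(f"fold enrichment: {fold:.1f}")
```

The ratio of the two fractions comes out at roughly 11, in line with the ∼11-fold enrichment reported above.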

The small world of genetic interactions. The yeast synthetic genetic network exhibits two properties shared by networks as diverse as the World Wide Web and protein-protein interaction maps (17, 18). First, the connectivity distribution of array genes follows a power-law distribution (Fig. 4A), containing many genes with few interactions and a few genes with many interactions. Highly connected “hub genes” are likely to be more important for fitness than less connected genes, because random mutations in organisms lacking these genes would be more likely to be associated with a fitness defect. Indeed, hubs associated with conserved genes may be potential targets for anti-cancer drugs because cancer cells often carry a large mutation load and thus may be killed preferentially (19). The top five array gene hubs include four components of the prefoldin complex, GIM3, GIM5, PAC10, and GIM4, which functions as a chaperone for actin and tubulin and thereby buffers many cellular processes (4).
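A power-law degree distribution appears as a straight line on log-log axes, so a quick check is to fit a line to log(count) against log(degree). The sketch below uses an ordinary least-squares fit on synthetic counts generated from an exact power law; the counts are not data from this study, and a log-log line fit is only a rough diagnostic rather than a rigorous test of scale-free behavior.

```python
import math


def loglog_slope(degree_counts):
    """Least-squares slope of log(count) versus log(degree)."""
    points = [
        (math.log(k), math.log(n))
        for k, n in degree_counts.items()
        if k > 0 and n > 0
    ]
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


# Synthetic counts following n(k) = 1000 * k**-2 exactly (illustrative only).
synthetic = {k: 1000 * k ** -2.0 for k in range(1, 21)}
slope = loglog_slope(synthetic)
print(f"fitted exponent: {slope:.3f}")
```

For the synthetic input the fitted slope recovers the exponent used to generate it; real degree counts would scatter around the line.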

Fig. 4.

(A) The degree distribution of SGA array genes not also used as query genes. The number of genes with each degree (number of interaction partners) is shown on linear and log-log (inset) scales. The fit to a straight line in the log-log plot indicates a power-law degree distribution, a characteristic of a “scale-free” network. An analysis of the degree distribution of the query genes is included in the supplementary material (5) (fig. S11). (B) The topology of the genetic network neighborhood of three query genes (SGS1, RAD27, and BIM1). Genes are represented as nodes and synthetic genetic interactions are represented as edges connecting the nodes. The nodes are colored according to a defined subset of GO functional annotations (table S8). 95% of the nodes were tested for synthetic genetic interactions against each other.

Second, the genetic network appears to be an example of a small-world network in which the length of the shortest path between a pair of vertices tends to be small (i.e., the network has a short characteristic path length) and local neighborhoods tend to be densely connected (20). The observed genetic network has a short characteristic path length of 3.3, which is similar to that of random graphs with the same degree distribution [3.2 (5)], as expected for a small-world network (17, 18). The topology of the genetic network also exhibits dense local neighborhoods because the immediate neighbors of a gene, its genetic interaction partners, tend to interact with one another. The dense neighborhood characteristic of small-world networks is of particular interest because it can be exploited to predict interactions, as has been shown previously for protein-protein interactions (21).
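The characteristic path length is the mean shortest-path length over all connected pairs of vertices. A minimal sketch of how it could be computed with breadth-first search on an undirected graph follows; the four-node toy graph is a placeholder, and this is not the code used to obtain the values of 3.3 and 3.2 quoted above.

```python
from collections import deque


def shortest_path_lengths(graph, source):
    """Breadth-first search distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def characteristic_path_length(graph):
    """Mean shortest-path length over all connected ordered pairs of nodes."""
    total = 0
    pairs = 0
    for source in graph:
        for target, d in shortest_path_lengths(graph, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


# Toy undirected graph: a simple chain a - b - c - d.
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
cpl = characteristic_path_length(chain)
print(f"characteristic path length: {cpl:.3f}")
```

Each unordered pair is visited twice, once from each end, which leaves the mean unchanged.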

To examine neighborhood density of genetic interactions in more detail, we tested extensively whether the genetic network neighbors of three query genes tended to interact with one another. We examined 2561 unique pairs of genes with the property that each genetically interacts with the same query gene, SGS1, RAD27, or BIM1, by using a spot assay version of random spore analysis (5) (fig. S7). In total, 24, 18, and 18% of the tested interactions were confirmed positive for the SGS1, RAD27, and BIM1 network neighbors, respectively (Fig. 4B; table S4), which is highly enriched compared with that observed (∼1%) for the average query gene against all SGA-tested gene pairs.
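The quantity measured here, the fraction of pairs of a gene's neighbors that interact with each other, corresponds to a local clustering coefficient. The following sketch computes it for a toy undirected graph; the node labels and edges are placeholders and do not represent the SGS1, RAD27, or BIM1 data.

```python
from itertools import combinations


def neighborhood_density(graph, gene):
    """Fraction of pairs of gene's neighbors that are themselves connected."""
    neighbors = sorted(graph[gene])
    pairs = list(combinations(neighbors, 2))
    if not pairs:
        return 0.0
    linked = sum(1 for a, b in pairs if b in graph[a])
    return linked / len(pairs)


# Toy graph: query Q has four neighbors; n1-n2 and n3-n4 also interact.
toy = {
    "Q": {"n1", "n2", "n3", "n4"},
    "n1": {"Q", "n2"},
    "n2": {"Q", "n1"},
    "n3": {"Q", "n4"},
    "n4": {"Q", "n3"},
}
density = neighborhood_density(toy, "Q")
print(f"neighborhood density of Q: {density:.3f}")
```

In the toy graph two of the six neighbor pairs are connected; in the screens described above the corresponding fractions were 18 to 24%, against a background of ∼1%.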

Higher order genetic interactions. Given the relatively large number of synthetic double mutant combinations, we were interested in the frequency of higher order synthetic genetic interactions. To test for triple mutant genetic interactions, we performed SGA analysis with a query strain carrying deletion mutations in both BNI1 and BIM1 (table S5). Although a total of 171 genetic interactions were identified in the BIM1 BNI1 double mutant screen, tetrad analysis revealed only four triple mutants with synthetic genetic fitness defects not attributable to a double mutant interaction. Triple mutants were also identified when a BNI1 KRE1 double mutant query was tested for triple mutant interactions (table S6). In total, 156 genetic interactions were identified in the BNI1 KRE1 double mutant screen, 29 of which showed triple mutant effects. For the special case of paralogs, in which the genes are highly similar and the products are functionally redundant, we anticipated that each single-mutant query strain would show fewer pairwise synthetic genetic interactions, whereas a double mutant query strain would identify many triple mutant interactions. Indeed, this was the case for a query strain carrying deletion alleles of the paralogs CLN1 and CLN2, which encode similar G1 cyclins; a total of 34 SGA interactions were identified, and tetrad analysis revealed that all were triple mutant effects (22). On the basis of this limited collection of screens, it appears that the rate of synthetic interaction among gene triplets is substantial but less than that of gene pairs. However, there are 2000-fold more gene triplets than gene pairs in S. cerevisiae, so the total number of trigenic interactions may outnumber that of digenic interactions.
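The 2000-fold figure follows from simple combinatorics: the ratio of the number of unordered triplets to unordered pairs among n genes is (n − 2)/3. A short check, assuming n ≈ 6000 genes as stated in the introduction:

```python
from math import comb

genes = 6000  # approximate number of known or predicted yeast genes

pairs = comb(genes, 2)
triplets = comb(genes, 3)
ratio = triplets / pairs  # algebraically equal to (genes - 2) / 3

print(f"gene pairs:    {pairs:,}")
print(f"gene triplets: {triplets:,}")
print(f"triplets per pair: {ratio:.0f}")
```

Because the ratio is so large, even a per-triplet interaction rate far below the per-pair rate could yield more trigenic than digenic interactions in total.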

Population genetics of yeast synthetic genetic interactions. Because inbred laboratory yeast strains carrying defined mutations display an extensive number of synthetic genetic interactions, we anticipate that similar interactions may also occur in outbred strains, which carry different alleles of genes due to the accumulation of mutations within the population. Indeed, synthetic genetic interactions may play an important role in determining the genetic basis of phenotypic variation because mutations are protected from selection if they display a deleterious phenotype only in combination with a mutant allele at a second locus. Phillips and Johnson (23) have established a theoretical framework for synthetic genetic interactions, which indicates that a conservative equilibrium frequency estimate for synthetic genetic combinations in diploids can be of the order of 1 in 1000 carrier gametes. Thus, the frequency of double mutant zygotes that would be homozygous for a synthetic genetic gene pair is the product of the gamete frequencies, 1 in 1,000,000 for a given synthetic mutant pair. Nevertheless, the genomic load of synthetic genetic effects has remained a mystery because the number of mutated genes that can accumulate within a population and the number of synthetic lethal interactions per gene have remained unknown for any organism. For yeast, we now know that the growth rate of close to 50% (∼2500) of homozygous diploid gene deletion mutants is normal under six different environmental conditions (2). This surprisingly large fraction of apparently benign mutations indicates that thousands of mutated genes may have the potential to accumulate within the diploid cells of a natural yeast population as single homozygous mutations in diploids. 
Given that ∼2500 loci are buffered individually from selection when null and that a substantial fraction of these, perhaps 10 to 50% as a conservative estimate, may participate in synthetic genetic interactions with ∼30 other loci, then on the order of 0.8 to 4% of the zygotes formed in yeast populations would have a synthetic double mutant phenotype (5). Because the potential for creating synthetic double mutant combinations should increase with gene number, the genomic load of synthetic effects may be even higher in humans.
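The quoted range of 0.8 to 4% can be approximately reproduced by multiplying the figures given in this paragraph. The sketch below is one reading of the arithmetic, not the derivation in the supporting material (5): it assumes each participating locus contributes ∼30 partner combinations, each at a zygote frequency of 1 in 1,000,000, and it sums these small frequencies without correcting for overlap.

```python
buffered_loci = 2500          # loci with no growth defect when deleted
partners_per_locus = 30       # approximate synthetic partners per participating locus
gamete_freq = 1 / 1000        # carrier gamete frequency for one synthetic combination
pair_zygote_freq = gamete_freq ** 2  # product of the two gamete frequencies


def synthetic_zygote_fraction(participating_fraction):
    """Approximate fraction of zygotes carrying some synthetic double mutant pair."""
    combinations_at_risk = buffered_loci * participating_fraction * partners_per_locus
    return combinations_at_risk * pair_zygote_freq


low = synthetic_zygote_fraction(0.10)
high = synthetic_zygote_fraction(0.50)
print(f"estimated range: {low:.2%} to {high:.2%}")
```

Under these assumptions the bounds come out close to the 0.8 and 4% quoted in the text after rounding.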

Synthetic genetic interactions and complex human disease. Knowledge of the yeast genetic interaction network may be relevant to our understanding of complex human diseases (24, 25), the genetic bases of which are difficult to map. An exemplary case is cystic fibrosis where a primary “Mendelian” defect in the CFTR (cystic fibrosis transmembrane conductance regulator)–encoded chloride channel is modified by at least seven other genes, many enhancing the severity of the pulmonary phenotype (26, 27). Thus, the CFTR interactions resemble those observed with the query genes in an SGA screen (fig. S8) with the difference being that genetic saturation is difficult to achieve in humans. More generally, many of the human diseases considered to be simple Mendelian single gene effects may be sensitive to modifying mutations (polymorphisms) in many genes.

A pure synthetic interaction amongst disease genes, where the individual mutant genes have no phenotype but the combination of two mutant alleles leads to the disease, is referred to as a digenic disease. For example, mutations in two genes, ROM1 (encoding the retinal outer segment membrane protein 1) and RDS (retinal degeneration slow), are asymptomatic singly, but together cause retinitis pigmentosa (28). The synthetic or digenic interactions observed for a given gene can extend to multiple interacting partners. For example, Bardet-Biedl syndrome, a retinitis pigmentosa variant, results from combinations of mutant alleles in two genes from as many as six, such as BBS2 and BBS4 or BBS2 and BBS6 (26, 28). Because asymptomatic mutations can accumulate in the population and probably have the potential to interact with a large number of different genes, digenic effects may underlie many common diseases that are familial but not Mendelian in their inheritance. For complex heterogeneous human disease syndromes such as glaucoma, type II diabetes, lupus erythematosus, schizophrenia, Alzheimer's disease, and retinitis pigmentosa, a component of the genetic basis of the disease may be similar to the synthetic effects we see within a dense local neighborhood of the yeast genetic interaction map (Fig. 4B), where multiple pairs of genes have the potential to combine and compromise cellular fitness through a related mechanism. Mapping the expected dense network of digenic interactions in humans will be challenging; however, because elements of the genetic networks derived from model organisms are likely to be conserved, there exists potential for predicting candidate genetic interactions from large-scale functional genomics information.

Supporting Online Material

www.sciencemag.org/cgi/content/full/303/5659/808/DC1

Materials and Methods

Tables S1 to S8

Figs. S1 to S11

