Systematic humanization of yeast genes reveals conserved functions and genetic modularity


Science  22 May 2015:
Vol. 348, Issue 6237, pp. 921-925
DOI: 10.1126/science.aaa0769

Staying the same across a billion years

How far across evolution do families of genes retain their function? Yeast and humans are separated by roughly a billion years of evolutionary history, and yet genes from one can substitute for orthologous genes in the other. To study this effect systematically, Kachroo et al. replaced over 400 essential yeast genes with their human orthologs. Roughly half of the human genes could functionally replace their yeast counterparts. Genes being in the same pathway was as important as sequence or expression similarity in determining replaceability.

Science, this issue p. 921


To determine whether genes retain ancestral functions over a billion years of evolution and to identify principles of deep evolutionary divergence, we replaced 414 essential yeast genes with their human orthologs, assaying for complementation of lethal growth defects upon loss of the yeast genes. Nearly half (47%) of the yeast genes could be successfully humanized. Sequence similarity and expression only partly predicted replaceability. Instead, replaceability depended strongly on gene modules: Genes in the same process tended to be similarly replaceable (e.g., sterol biosynthesis) or not (e.g., DNA replication initiation). Simulations confirmed that selection for specific function can maintain replaceability despite extensive sequence divergence. Critical ancestral functions of many essential genes are thus retained in a pathway-specific manner, resilient to drift in sequences, splicing, and protein interfaces.

The ortholog-function conjecture posits that orthologous genes in diverged species perform similar or identical functions (1). The conjecture is supported by comparative analyses of gene-expression patterns, genetic interaction maps, and chemogenomic profiling (2–6), and it is widely used to predict gene function across species. However, even if two genes perform similar functions in different organisms, it may not be possible to substitute one for the other, particularly if the organisms are widely diverged. The extent to which deeply divergent orthologs can stand in for each other, and which principles govern such functional equivalence across species, are largely unknown.

In this study, we systematically addressed these questions by replacing a large number of yeast genes with their human orthologs. Humans and the baker’s yeast Saccharomyces cerevisiae diverged from a common ancestor approximately 1 billion years ago (7). They share several thousand orthologous genes, accounting for more than one-third of the yeast genome (8). Yeast and human orthologs tend to be recognizable but often highly diverged; amino acid identity ranges from 9 to 92%, with a genome-wide average of 32%. Although we know of individual examples of human genes capable of replacing their fungal orthologs (9–12), the extent and specific conditions under which human genes can substitute for their yeast orthologs are generally not known.

We focused on the set of genes essential for yeast cell growth under standard laboratory conditions (13, 14) and for which the yeast-human orthology is 1:1 (i.e., genes without lineage-specific duplicate genes that might mask the effects). Based on the availability of full-length human cDNA recombinant clones (15, 16) and matched yeast strains with conditionally null alleles of the test genes (17–19), we selected 469 human genes to study (Fig. 1A).

Fig. 1 Systematic functional replacement of essential yeast genes by their human counterparts.

(A) Of 547 human genes with 1:1 orthology to essential yeast genes, 469 human open reading frames (ORFs) were subcloned into single-copy yeast expression vectors under control of either the GAL or GPD promoters. Using three distinct assay classes (repressible yeast-gene promoter, temperature-sensitive yeast allele, and heterozygous diploid knockout strain), we obtained 126, 151, and 375 informative replaceability assays, respectively. (B) Representative examples of the three assay classes. (C) Combining assays and literature, 200 human genes could functionally replace their yeast orthologs and 224 genes could not. Some human genes were toxic using GAL induction but replaced their yeast orthologs upon reducing expression.

We first subcloned and sequence-verified each human protein coding sequence into a single-copy, centromeric yeast plasmid under the transcriptional control of either an inducible (GAL) or constitutively active (GPD) promoter (see supplementary materials and methods). We assembled a matched set of yeast strains in which each orthologous yeast gene could be conditionally down-regulated [via a tetracycline-repressible promoter (17)], inactivated [via a temperature-sensitive allele (18)], or segregated away genetically [following sporulation of a heterozygous diploid deletion strain (13, 19)] (Fig. 1A and fig. S1). After verifying that the loss of the relevant yeast gene conferred a strong growth defect, we tested whether expression of the human ortholog could complement the growth defect, as illustrated for several examples in Fig. 1B (also figs. S2 to S4). When expressed in the permissive condition, 73 of the human genes exhibited toxicity; reducing the genes’ expression levels allowed us to assay replacement in 66 cases (table S1).

Overall, we performed 652 informative growth assays surveying 414 human-yeast orthologs (Fig. 1, A and C). In total, 176 yeast genes (43%) could be replaced by their human orthologs in at least one of the three strain backgrounds, whereas 238 (57%) could not (table S1). We collated previously published reports of yeast gene complementation by human genes: Our assays recapitulated these cases with 90% precision and 72% recall (table S1), and incorporating the literature data for subsequent analyses brought the observed complementation rate to 47% (Fig. 1C). For randomly selected subsets of strains, we additionally validated the assays by confirming human protein expression using Western blot analysis (fig. S5), verifying complementation by tetrad dissection (table S1), and subcloning the yeast test genes into the assay vectors and confirming positive complementation (table S2).
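The rates quoted above follow directly from the reported counts. The short Python sketch below redoes that arithmetic using only numbers given in the text and in the Fig. 1 caption; the variable names are ours and are purely illustrative.

```python
# Counts reported in the text and in the Fig. 1 caption.
assays_by_class = {
    "repressible promoter": 126,
    "temperature-sensitive allele": 151,
    "heterozygous diploid knockout": 375,
}
total_assays = sum(assays_by_class.values())

# Assay-only outcome for the tested ortholog pairs.
replaceable, nonreplaceable = 176, 238
tested = replaceable + nonreplaceable
assay_rate = replaceable / tested

# Outcome after combining assays with literature reports (Fig. 1C).
lit_replaceable, lit_nonreplaceable = 200, 224
combined_rate = lit_replaceable / (lit_replaceable + lit_nonreplaceable)

print(total_assays, tested)
print(f"assay-only rate: {assay_rate:.1%}")
print(f"with literature: {combined_rate:.1%}")
```

The Fig. 1C totals sum to 424 rather than 414 because that panel combines the assays with literature reports.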

Given that roughly half of the tested human genes successfully replaced and half did not, we next investigated factors determining replaceability. We assembled 104 quantitative features of the genes or ortholog pairs, including calculated properties of the genes’ sequences (e.g., gene and protein lengths, sequence similarities, codon usage, and predicted protein aggregation potential) and properties such as protein interactions, mRNA and protein abundances, transcription and translation rates, and mRNA splicing features (table S3). We then quantified how well each feature predicted replaceability (Fig. 2A and table S3).

Fig. 2 Properties of gene modules can predict replaceability.

(A) One hundred four quantitative features of proteins or ortholog pairs were evaluated for their ability to explain replaceability, assessing each feature’s predictive strength as the area under a ROC curve (AUC) and determining significance by shuffling replacement status 1000 times, measuring mean AUCs ± 1 SD (error bars). AUCs above 0.58 were generally individually significant with 95% confidence. Starred features were included in the integrated classifier (leftmost bar). (B) Distribution of amino acid identities among the tested ortholog pairs (left y axis) and fraction of replaceable genes in each sequence-identity bin (right y axis). (C) Relative proportion of replaceable and nonreplaceable genes among 12 broad KEGG (20) pathway classes.
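The per-feature evaluation described in (A) can be sketched as follows. This is a minimal illustration, not the authors' code: it computes the AUC through the rank-sum identity and builds a null distribution by shuffling the replaceability labels. The function names and the eight-gene toy data are assumptions made up for the example and do not come from the paper.

```python
import random

def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    Equals the probability that a randomly chosen positive outscores a
    randomly chosen negative, counting ties as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def shuffled_null(scores, labels, n_shuffles=1000, seed=0):
    """Mean and sample SD of AUCs obtained after shuffling the labels."""
    rng = random.Random(seed)
    labels = list(labels)  # copy so the caller's labels are not reordered
    aucs = []
    for _ in range(n_shuffles):
        rng.shuffle(labels)
        aucs.append(auc(scores, labels))
    mean = sum(aucs) / len(aucs)
    sd = (sum((a - mean) ** 2 for a in aucs) / (len(aucs) - 1)) ** 0.5
    return mean, sd

# Made-up toy data (NOT from the paper): one feature measured on eight genes.
feature = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
replaced = [1, 1, 0, 1, 0, 1, 0, 0]

observed = auc(feature, replaced)
null_mean, null_sd = shuffled_null(feature, replaced)
print(f"observed AUC = {observed:.3f}; shuffled = {null_mean:.3f} ± {null_sd:.3f}")
```

A feature would be judged informative when its observed AUC lies well outside the shuffled distribution. The threshold of 0.58 quoted in the caption reflects the size of the real data set, not this toy example.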

Notably, sequence similarity only partly predicted replaceability. This tendency was strongest for highly similar (>50% amino acid identity) or dissimilar (<20%) ortholog pairs. However, most pairs fell into an intermediate range of 20 to 50% sequence identity, which poorly predicted replaceability (Fig. 2B). Instead, replaceability was best predicted by properties of specific gene modules. In particular, proteins in the same pathway or complex tended to be similarly replaceable (Fig. 2A). Replaceable genes also tended to be shorter and more highly expressed. Using these features in a supervised Bayesian network classification algorithm (fig. S6), we achieved a high overall cross-validated prediction rate [area under the receiver operating characteristic (ROC) curve of 0.825 (Fig. 2A)] and correct prediction of 8 of 10 literature cases withheld from all computational analyses (table S4). Properties such as human gene splice-form counts, yeast 5′ and 3′ untranslated region lengths, codon adaptation measures, and yeast mRNA half-lives showed little relationship with replaceability (Fig. 2A and table S3).

The strong association between replaceability and gene modules led us to investigate this phenomenon in more depth, examining replaceability as a function of specific protein complexes and pathways. Broad Kyoto Encyclopedia of Genes and Genomes (KEGG) (20) pathway classes showed highly differential replaceability: Metabolic enzymes (e.g., enzymes participating in lipid, amino acid, and carbohydrate metabolism) tended to be replaceable, whereas proteins involved in DNA replication and repair or in cell growth tended not to be replaceable (Fig. 2C).

Among large protein complexes and pathways, we observed both extremes of replaceability. Some were entirely nonreplaceable: For example, we did not observe a single successful replacement among 13 tested members of the TRiC chaperone complex, the DNA replication initiation origin recognition complex, or its interacting minichromosome maintenance (MCM) complex (Fig. 3, A and B). In contrast, some pathways were almost entirely replaceable: Among 19 components of the sterol biosynthesis pathway (which catalyzes the conversion of acetyl–coenzyme A to cholesterol in humans and ergosterol in yeast), only the human farnesyl-diphosphate farnesyltransferase 1 (FDFT1) and farnesyl diphosphate synthase (FDPS) enzymes failed to replace their yeast orthologs. All other tested components were replaceable, suggesting that yeast and humans both retain the same essential complement of ancestral sterol biosynthesis functionality (Fig. 3C and fig. S7).

Fig. 3 The modular nature of functional replacement.

(A) None of the four tested human TRiC/CCT chaperonin genes replaced their yeast counterparts. (B) Similarly, no genes tested in the origin recognition complex (ORC) or the MCM complex were replaceable. (C) In contrast, 17 of 19 sterol biosynthesis genes were replaceable. In two cases, the yeast gene had two human orthologs but only one could complement. Human HMGCS1 (but not HMGCS2) replaced yeast ERG13; human IDI1 (but not IDI2) replaced yeast IDI1. Human PMVK, a nonhomologous protein that carries out the same reaction as yeast Erg8 (27), complemented temperature-sensitive allele erg8-1.

The modular nature of replaceability was particularly evident in the case of the 26S proteasome complex. Of 28 tested subunits, 21 human genes replaced their yeast counterparts (Fig. 4A). However, the nonreplaceable subunits were not randomly distributed; rather, they clustered in two physically interacting groups: one consisting of the 19S lid components Rpn3 and Rpn12 and one consisting of the 20S inner-core heptameric beta-ring subunits β1, β2, β5, β6, and β7. Thus, of the two central heteroheptameric rings, all testable components of the alpha ring replaced, whereas most of the beta ring did not.
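To make the module-level contrast concrete, the sketch below tabulates the replaceable fraction for the modules whose counts are stated explicitly in the text and figure captions, next to the overall rate from Fig. 1C. The origin recognition and MCM complexes are left out because their individual counts are not given here. The data structure and names are illustrative only.

```python
# (replaceable, tested) counts as reported in the text and figure captions.
modules = {
    "TRiC/CCT chaperonin": (0, 4),
    "sterol biosynthesis": (17, 19),
    "26S proteasome": (21, 28),
}
overall = 200 / (200 + 224)  # combined rate from Fig. 1C

for name, (ok, n) in modules.items():
    print(f"{name:22s} {ok:2d}/{n:<2d} = {ok / n:.0%} (overall {overall:.0%})")

# Within the proteasome, the failures fall into two interacting clusters.
nonreplaceable = {
    "19S lid": ["Rpn3", "Rpn12"],
    "20S beta ring": ["β1", "β2", "β5", "β6", "β7"],
}
n_failed = sum(len(subunits) for subunits in nonreplaceable.values())
print("nonreplaceable proteasome subunits:", n_failed)
```

The seven named nonreplaceable subunits account for the full difference between the 28 tested and the 21 replaceable proteasome genes.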

Fig. 4 Proteasome subunits are differentially replaceable.

(A) Yeast 26S proteasome genes were generally replaceable, except for two interacting clusters, in the 19S regulatory “lid” particle and in the 20S core β-subunit ring. (B) The yeast α6-β6 subunit interface (top panel) sterically accommodates the human subunit (bottom panel, showing superposition of human α6 onto the yeast α6) despite only 50% sequence identity at the interface. (C) Alpha subunits from diverse eukaryotes generally complemented the yeast mutant, but beta subunits did not (unlike plasmid-expressed S. cerevisiae genes, included as positive controls). ND, not determined. (D) In simulated evolution of interacting proteins Ubc9 and Smt3, if binding to the extant partner is not enforced (“Non-Bound”), a protein’s ability to bind its ancestral partner decays rapidly as sequences diverge. However, if extant binding is enforced (“Wild Type” and “Low Stability”), even highly diverged proteins often still bind to their ancestral partners. (Dots indicate right-censored data; see fig. S14.)

An examination of the alpha and beta subunit structures showed that subunit-subunit interfacial amino acids were conserved to similar degrees between yeast and human subunits (fig. S8A), although beta subunits exhibited elevated rates of nonsynonymous substitutions compared with alpha subunits (fig. S8B). Even when interfacial amino acids were only partly conserved, modeling human alpha subunits into the known structure of the yeast proteasome (21) revealed that human proteins could be sterically accommodated into the yeast intersubunit interface, as shown for human α6 (Fig. 4B) packing against yeast β6, in spite of sharing only 50% identical amino acids at the interface (fig. S8A). Only orthologous alpha subunits replaced; nonorthologs failed (fig. S9).

We further confirmed this trend across alpha and beta proteasome subunits by cloning and assaying subunits from additional organisms, including another yeast (Saccharomyces kluyveri), the nematode Caenorhabditis elegans, and several beta subunits from the frog Xenopus laevis. In all cases, alpha subunits complemented the loss of the yeast orthologs, whereas beta subunits generally failed to complement (Fig. 4C). The pattern of replaceability across species suggests that alpha and beta subunits experienced different evolutionary pressures, in each case operating at the level of the system of genes (the alpha or beta heteroheptamer).

To determine further why proteasome alpha subunits were replaceable but beta subunits were not, we isolated human β2 subunit mutants that complemented the yeast defect (figs. S10 to S12). A single serine-to-glycine substitution [Ser214→Gly214 (S214G)] was sufficient to rescue growth (fig. S11). β2 subunits act as proteases, but yeast β2 catalytic activity is dispensable if the proteasome assembles with other functioning protease subunits (22). Notably, a catalytically dead [Thr44→Ala44 (T44A)] human β2 failed to complement, whereas an S214G, T44A double mutant complemented successfully (fig. S11). We conclude that the S214G mutant is competent to assemble an intact proteasome, although the subunit may not be catalytically active. Thus, native human β2 needs only one amino acid change to pack within the yeast proteasome.

Theory predicts that evolutionary divergence creates Dobzhansky-Muller incompatibilities, because evolutionarily novel mutations in one species are untested in the other species’ genetic background and may be deleterious there (23, 24). To better understand how proteins retain the ability to interact with their ortholog’s interaction partners, even when they have diverged substantially, we developed a biochemically realistic divergence model in which we simulated the evolution of two physically interacting proteins, which both diverge over time. We considered three distinct scenarios: (i) Both thermodynamic stability and binding to the extant partner were selected at ancestral levels; (ii) binding was selected at ancestral levels but stability was not; and (iii) stability was selected at ancestral levels but binding was not. Thermodynamic stability (ΔGfolding) and binding energy (ΔGinteraction) were calculated using the empirical FoldX energy function (25). Under all scenarios, we evaluated whether an evolved member of the pair could still bind to its ancestral partner, for which binding was not enforced. We found that ancestral binding decayed rapidly under scenario (iii) but much more slowly under the other two scenarios (Fig. 4D and figs. S13 to S15). Natural selection for a protein interaction thus preserves the interaction interface in a manner consistent with binding to the ancestral partner (figs. S16 and S17), even though many lineages will eventually accumulate mutations that cause incompatibilities with the ancestral interactor.
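The logic of the binding-enforced versus non-enforced comparison can be illustrated with a deliberately crude toy model. The sketch below is not the model used in the study: it replaces the FoldX energy function with a random contact-energy table, ignores thermodynamic stability entirely, and uses arbitrary values for sequence length, step count, and binding cutoff. Every name and parameter in it is an assumption introduced for illustration.

```python
import random

N_TYPES = 20   # residue types
N_SITES = 30   # interface positions per protein (arbitrary choice)

def binding_energy(a, b, table):
    """Sum of per-site contact energies; lower means tighter binding."""
    return sum(table[x][y] for x, y in zip(a, b))

def simulate(enforce_binding, steps=3000, sample_every=500, seed=1):
    rng = random.Random(seed)
    # Hypothetical contact-energy table: random numbers, not FoldX.
    table = [[rng.gauss(0.0, 1.0) for _ in range(N_TYPES)] for _ in range(N_TYPES)]
    # Ancestral pair: a random protein A and a partner B chosen to bind it well.
    anc_a = [rng.randrange(N_TYPES) for _ in range(N_SITES)]
    anc_b = [min(range(N_TYPES), key=lambda y: table[x][y]) for x in anc_a]
    e_anc = binding_energy(anc_a, anc_b, table)
    # "Binds" means an energy no more than 20% worse than ancestral (arbitrary).
    cutoff = e_anc + 0.2 * abs(e_anc)
    a, b = anc_a[:], anc_b[:]
    history = []
    for step in range(1, steps + 1):
        # Propose one point mutation in either partner.
        new_a, new_b = a[:], b[:]
        target = new_a if rng.random() < 0.5 else new_b
        target[rng.randrange(N_SITES)] = rng.randrange(N_TYPES)
        # Selection: keep the mutation only if the extant pair still binds.
        if not enforce_binding or binding_energy(new_a, new_b, table) <= cutoff:
            a, b = new_a, new_b
        if step % sample_every == 0:
            history.append({
                "step": step,
                "divergence_a": sum(x != y for x, y in zip(a, anc_a)) / N_SITES,
                "binds_extant": binding_energy(a, b, table) <= cutoff,
                "binds_ancestral": binding_energy(a, anc_b, table) <= cutoff,
            })
    return history

for label, enforce in [("binding enforced", True), ("binding not enforced", False)]:
    for row in simulate(enforce):
        print(label, row)
```

Comparing the `binds_ancestral` column between the two runs, as a function of `divergence_a`, mirrors the question asked in Fig. 4D. Any quantitative behavior of this toy depends on its arbitrary parameters and should not be read as a reproduction of the published simulations.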

Our data demonstrate that a substantial portion of conserved yeast and human genes perform much the same roles in both organisms, to an extent that the protein-coding DNA of a human gene can actually substitute for that of the yeast. The strong pathway-specific pattern of individual replacements suggests that group-wise replacement of the genes should be feasible, raising the possibility of humanizing entire cellular processes in yeast. Such strains would simplify drug discovery against human proteins, enable studies of the consequences of human genetic polymorphisms [as in (26) and fig. S7], and empower functional studies of entire human cellular processes in a simplified organism.

Supplementary Materials

Materials and Methods

Figs. S1 to S18

Tables S1 to S6

References (28–55)

Data S1 to S3

References and Notes

  1. Acknowledgments: We thank M. Minnix and A. Royall for assistance with cloning and assays, K. Drew for structural modeling assistance, M. Tsechansky for TANGO assistance, and C. Boone for providing the temperature-sensitive yeast strain collection. This work was supported by Cancer Prevention and Research Institute of Texas (CPRIT) research fellowships to A.H.K. and J.M.L.; NIH grant R01 GM088344, Defense Threat Reduction Agency grant HDTRA1-12-C-0007, and NSF Science and Technology Center BEACON funds (DBI-0939454) to C.O.W.; and grants from the NIH, NSF, CPRIT, and the Welch Foundation (F-1515) to E.M.M.