Research Article

Conservation, acquisition, and functional impact of sex-biased gene expression in mammals

See allHide authors and affiliations

Science  19 Jul 2019:
Vol. 365, Issue 6450, eaaw7317
DOI: 10.1126/science.aaw7317

The genetics of sexual dimorphism

In mammals, many species exhibit sex-specific phenotypes that differ between males and females. Although attention has been directed to the effects of the X and Y sex chromosomes, we do not understand how sex affects the rest of the genome. Naqvi et al. examined gene expression in 12 tissues in male and female humans, mice, rats, dogs, and cynomolgus macaques and identified diversity in gene expression between the sexes. Examining sex-biased gene expression in human height identified opposing male or female bias. Although conservation of differential sex-specific gene expression among species was observed, specific genes differed in the sexes among species and lineages suggesting the evolution of species- or lineage-specific sex-biased expression.

Science, this issue p. eaaw7317

Structured Abstract


Sex differences are widespread in humans and other mammals. For example, the distribution of height or body size is shifted upwards in males relative to females, and sex differences are found in the immune and cardiovascular systems as well as in metabolism. However, little is known about how gene expression differs between the sexes in a broad range of mammalian tissues and species. A catalog of such sex-biased gene expression could help us understand phenotypic sex differences. Assessing the extent to which sex-biased gene expression is conserved across the body could also have important implications for the use of nonhuman mammals as models of sex-biased human biology.


To identify both conserved and lineage- or species-specific sex differences in gene expression, we sequenced RNA from male and female samples in 12 tissues in each of four nonhuman mammals (cynomolgus macaque, mouse, rat, and dog) and analyzed these data jointly with publicly available data from postmortem male and female human tissues. To assess the impact of sex-biased gene expression on the sex difference in mean human height, we applied methods that integrate the effects of genetic variation on both gene expression and phenotype (height in this case). We sought to understand which transcription factors (TFs) contribute to evolutionary changes in sex bias by analyzing motifs gained or lost concurrently with lineage- or species-specific changes in sex bias.


Linear modeling revealed ~3000 genes with conserved (species-shared) sex bias in gene expression, most of which was tissue specific. The cumulative effects of conserved sex bias explain ~12% of the sex difference in mean human height, and cases such as that of LCORL, a TF with conservation of both female-biased expression and genetic association with height, suggest a contribution to sex differences in body size beyond humans. However, most sex-biased gene expression (~77%) was specific to single species or subsets of species, implying that it arose more recently during evolution. We identified 83 instances where TFs showed sex-biased expression in the same tissue, in which their motifs were associated with gain or loss of sex bias at other genes, accounting for a significant portion (~27%) of lineage-specific changes in sex bias.


By conducting a 12-tissue, five-species survey of sex differences in gene expression, we found that although conserved sex bias in gene expression exists throughout the body, most sex bias has been acquired more recently during mammalian evolution. Height is likely subject to opposing selective pressures in males and females; our study thus documents how such selective forces can result in sex-biased expression which, when layered upon genetic pathways acting identically in males and females, can lead to trait distributions shifted between the sexes. Our findings also suggest that, in many cases, molecular sex differences observed in humans may not be mirrored in nonhuman mammals.

RNA sequencing of male and female samples in 12 tissues and five species reveals the functional impact and mechanistic underpinnings of sex-biased gene expression.

A survey of sex differences in gene expression using RNA sequencing data (left) leads to the discovery of both conserved (species-shared) and lineage- or species-specific sex biases in expression across the genome. Genes with conserved sex bias contribute to the sex difference in mean height in humans and other mammals, whereas lineage-specific changes can be partially explained by gains and losses of motifs for sex-biased TFs.


Sex differences abound in human health and disease, as they do in other mammals used as models. The extent to which sex differences are conserved at the molecular level across species and tissues is unknown. We surveyed sex differences in gene expression in human, macaque, mouse, rat, and dog, across 12 tissues. In each tissue, we identified hundreds of genes with conserved sex-biased expression—findings that, combined with genomic analyses of human height, explain ~12% of the difference in height between females and males. We surmise that conserved sex biases in expression of genes otherwise operating equivalently in females and males contribute to sex differences in traits. However, most sex-biased expression arose during the mammalian radiation, which suggests that careful attention to interspecies divergence is needed when modeling human sex differences.

Males and females exhibit differences across a wide range of biological processes. Studies in humans have documented sex differences in anthropometric traits (1), energy metabolism (2), brain morphology (3), and immune (4) and cardiac (5) function. Sex differences are also evident in the incidence, prevalence, and mortality across diseases, including autoimmune disorders (6), cardiovascular diseases (7), and autism (8). Sex differences are common in other mammals besides humans, many of which are models of sex-biased human traits and diseases (9). For example, males are larger than females in most mammalian species (10), whereas sex differences in brain structures (11) and immune (12) and cardiac (13) function have been observed in rodents. These phenotypic sex differences are likely associated with, and may be caused by, sex differences in gene activity or function.

The sex chromosomes are one source of sex differences in gene activity. The Y chromosome harbors male-specific genes (14), some broadly expressed (15). Incomplete inactivation of the second X chromosome in females results in female-biased expression of some X-linked genes (16). However, given the scale and complexity of gene networks, and the greater number of autosomal genes, it is unlikely that sexually dimorphic expression of sex-linked genes accounts for all phenotypic sex differences in mammals. Understanding the molecular origins of these sex differences therefore requires a genome-wide, multitissue, and comparative approach to sex biases in gene expression.

Our understanding of sex bias in mammalian gene expression is lacking in three regards. First, the degree to which sex-biased expression is conserved across the mammalian lineage and the extent of conservation in different tissues and organ systems are unknown. Multitissue studies of sex bias in gene expression focused on humans (17, 18) or mice (19). Multispecies studies in Drosophila (2024) examined RNA from whole carcasses or gonads, whereas studies in mammals that examined nonreproductive tissues focused on single tissues (25, 26). Second, little is known about how sex differences in gene expression across the body cumulatively result in phenotypic sex differences. Sex-biased expression of the autosomal genes VGLL3 (27) and IL-33 (28), as well as the X-linked gene TLR7 (29), appears to contribute to sexually dimorphic immune phenotypes. However, most complex traits are polygenic and underpinned by variation in hundreds or even thousands of genes (30). Third, apart from single-gene studies in Drosophila (31), lineage-specific regulatory changes that drive the evolution of sex-biased expression remain unexplored. Progress has been made in understanding mechanisms of X-linked dosage compensation (32, 33), the lack of which can lead to sex-biased expression on the X chromosome, but additional mechanisms likely contribute to genome-wide sex-biased gene expression. Thus, previous studies sought to understand the extent of sex-biased expression across either tissues (1719) or species (25, 26), or they explored its phenotypic impact (2729) or underlying evolutionary mechanisms (31) for individual genes. Assessing sex-biased expression across tissues and species, together with its cumulative contribution to phenotypic sex differences, would advance our understanding of molecular differences between males and females.


A five-species, 12-tissue survey of sex differences in gene expression

To assess sex differences in nonhuman mammals, we collected RNA sequencing data from three males and three females from cynomolgus macaque (Macaca fascicularis, cyno), mouse (Mus musculus), rat (Rattus norvegicus), and dog (Canis familiaris). Together with humans, these five species, whose last common ancestor lived 80 to 100 million years ago, span the evolution of the Boreoeutheria, including all placental mammals except Afrotheria and Xenarthra (which include the elephant and anteater, respectively). We sampled 12 tissues from each individual: adipose, adrenal gland, brain, colon, heart, liver, lung, muscle, pituitary, skin, spleen, and thyroid. These tissues represent many organ systems and all three germ layers (Fig. 1A). We designed tissue collection and processing procedures to minimize biological and technical variation (34) (table S1). We used our RNA sequencing (RNA-seq) data to systematically improve the transcriptome annotations of each nonhuman mammalian species, which we then assessed using the percentage of reads from independent studies that mapped to our annotations versus existing annotations (e.g., a 16% increase in read mapping rate in dog) (fig. S1).

Fig. 1 Five-species, 12-tissue survey of sex differences in gene expression.

(A) Schematic of study design, with tissues chosen for analysis in all five species highlighted in humans. (B) Hierarchical clustering of 349 RNA-seq samples. (Top) Pairwise estimates of Jensen-Shannon divergence (JSD) between pairs of samples. Six random human samples per tissue, in addition to all nonhuman samples, were included for display purposes. (Middle) Tree dendrogram obtained by hierarchical clustering (average linkage) based on pairwise JSD values. (Bottom) Sample labels by tissue, species, and sex.

To assess sex differences in humans, we analyzed RNA-seq data from the Genotype-Tissue Expression Consortium (GTEx, v6p release) (35). To reduce the possibility of sex biases in cell-type composition, pathology, or other factors driving our results, we performed stringent quality control for samples from each of the 12 target tissues using individual- and sample-level metadata from GTEx and our own evaluation of histological images (34) (table S2). We adjusted gene expression values using top principal components to remove variation due to hidden technical or biological confounders. In three tissues (adipose, brain, and skin) for which expression data from purified cell populations is available, there is a correlation between sample-level cell-type proportions estimated by CIBERSORT (36) and top principal component loadings (fig. S2). Although this approach controls for variation in cell-type composition in the human samples, we acknowledge that some sex biases, especially those specific to nonhuman mammals, could reflect sex differences in cell-type composition.

We removed outlier samples (34) to obtain 740 human and 277 nonhuman RNA-seq samples (see table S3 for human sample sizes by sex and tissue). We clustered all nonhuman samples and a randomly chosen subset of human samples, using the expression levels of 12,939 one-to-one orthologous protein-coding genes. With the exception of human adipose tissue and lung, which cluster closely together, samples cluster first by tissue and then by species (Fig. 1B). This tissue-dominated clustering agrees with prior studies (37, 38) and indicates consistent sampling of tissues across species and also that the nonhuman data generated in this study are comparable to the human data from GTEx (35). There are no cases where samples cluster by sex before tissue or species, indicating that species effects dominate over sex effects. Nevertheless, sex contributes significantly to gene expression variation as pairwise within-sex distances in each tissue-species combination are significantly lower than pairwise between-sex distances (fig. S3).

Both our reanalysis of GTEx data and our analysis of our own data replicated published estimates of sex bias in six human and mouse tissues (27, 3945) (Pearson’s correlation coefficient r = 0.29 to 0.92) (figs. S4 and S5). These results indicate that expression values are comparable across species and yield reproducible estimates of sex bias.

Conserved sex-biased gene expression exists across the body

Within each tissue, we used a linear mixed model to identify genes that showed a consistent sex bias [false discovery rate (FDR) 4.5%, as estimated by permutation of male and female sample labels] across species while controlling for differences in expression variability and sample size between species. Further, we required that genes show a fold change ≥ 1.05 in the same direction in at least four of the five species studied. We assume that such genes likely had a conserved sex bias in the common ancestor of Boreoeutheria (example in Fig. 2A). Of 113,853 expressed gene-tissue pairs, 3885 pairs (corresponding to 3161 genes) show a conserved sex bias. We used a rank-based statistic to confirm that gene-tissue pairs with conserved sex bias also have low P values for sex bias in each of the individual species (fig. S6). Conserved sex bias is generally of modest magnitude (~90% of sex-biased gene-tissue pairs had a less than twofold change between the sexes) (fig. S7) but reproducible in independent datasets (Pearson’s r = 0.18 to 0.78) (fig. S8). The number of genes with conserved sex bias per tissue varies from 128 in colon to 805 in pituitary (Fig. 2B and table S4) and is not correlated with tissue sample size or rates of between-species gene expression divergence (Pearson’s r = 0.093 and 0.0083 and P = 0.77 and 0.97, respectively) (fig. S9). A naïve approach, requiring P < 0.05 in at least four of five species for each tissue, found a smaller number of gene-tissue pairs with conserved sex bias but revealed between-tissue patterns that were correlated with results from the linear mixed model (fig. S10). Of genes with conserved sex bias in any of the 12 tissues examined, 562 genes (18%) are sex-biased in more than one tissue (Fig. 2C). In cases of multitissue sex bias, the bias is significantly more likely to be in the same direction in multiple tissues (P = 0.00035, two-sided Fisher’s exact test) (Fig. 2D). Thus, conserved sex bias in gene expression is mostly tissue-specific, but a significant minority of genes shows concordant sex bias across multiple tissues, implying that some regulatory factors result in similar profiles of sex-biased expression in multiple tissues or cell types.

Fig. 2 Conserved sex bias in gene expression across the body.

(A) Example of gene with conserved female-biased expression. CPM, counts per million. (B) Heatmap of conserved male (blue) and female (orange) sex bias across genes (rows) and tissues (columns). (C) The y axis represents number of genes with conserved sex bias in one (left) or multiple (right) tissues. (D) Of genes with conserved sex bias in multiple tissues, the number concordant (same direction) or discordant (opposite direction) in multiple tissues is plotted. Significance as assessed by two-sided Fisher’s exact test comparing to equal proportions.

We considered the extent to which genes with conserved sex-biased expression were enriched for sex linkage. All assayed Y-linked genes are male-biased (fig. S11A), as expected, whereas X-linked genes are significantly enriched for conserved female bias (2.1- to 10.2-fold increase relative to autosomes two-sided Fisher’s exact test) (fig. S11B). The enrichment for X-linked genes is driven by genes that escape X inactivation in females. In turn, the enrichment for X-escape genes is largely driven by the subset of X-escape genes that have a nonrecombining Y-linked homolog in mammals (two-sided Fisher’s exact test) (fig. S11B). Despite these enrichments, most (85 to 95%, depending on the tissue) genes with conserved sex bias are autosomal (fig. S12). We compared the magnitude of sex bias between autosomal and X-linked genes using independent, publicly available datasets (27, 39, 40, 43, 44) (seven mouse, three human) to avoid ascertainment bias. X-linked genes show significantly higher magnitudes of sex bias in four of the ten datasets (adjusted P < 0.05, two-sided Wilcoxon rank-sum test) (fig. S13). Thus, the sex chromosomes, primarily as a result of harboring genes with both X- and Y-linked homologs, contribute a small but significant fraction of conserved sex bias in gene expression.

Most sex bias in gene expression has arisen since the last common ancestor of boreoeutherian mammals

We investigated sex-biased gene expression specific to subsets of the five species, mindful that differences in statistical power between species could result in false positive calls of lineage-specific sex bias. For example, a gene with true primate-specific male bias might falsely appear to have a human-specific male bias if its expression is significantly biased in humans but does not reach statistical significance in cyno. At the same time, false positive calls of sex bias in single species will by necessity appear to be species specific. We used mashr (46) to model the covariation in sex bias across tissues and species and to more confidently determine the lineage of sex bias. We repeated the mashr procedure using permuted male/female sample labels to empirically estimate the FDR for any given set of sex-biased genes (34). This increased the number of rodent-specific gains of sex bias in most tissues (fig. S14). After using mashr to estimate sex bias in each tissue-species combination, we assigned each sex-biased gene-tissue pair (other than those with conserved sex bias) to one of 12 lineage-specific categories by parsimony: primate-specific gains or losses, rodent-specific gains or losses, gains specific to one of the five species, multiple gains or losses, and more complex patterns of sex bias inconsistent with single gains or losses (examples in Fig. 3A and table S4). In each category, we used the permutation-estimated FDR to estimate the number of true positive sex-biased gene-tissue pairs.

Fig. 3 Most sex bias in gene expression has arisen since the last common ancestor of Boreoeutheria.

(A) Examples of genes with lineage-specific sex bias. (B) Number of true-positive sex-biased gene-tissue pairs (y axis) in each evolutionary class was calculated as the difference between the total number discovered across all tissues using true or permuted sex labels. Evolutionary classes defined in main text are designated as ancestral, acquired, or complex relative to last common ancestor of Boreoeutheria (the five species considered here). (C) Comparisons of ancestral to acquired sex biases as in (B), but performed in each tissue separately. Upper and lower confidence intervals represent fraction of sex bias estimated to be ancestral when counting all complex events as ancestral or acquired, respectively.

We assessed how much of the sex-biased gene expression observed in the five species was present in the common ancestor of Boreoeutheria (i.e., ancestral). Instances of ancestrally sex-biased expression included gene-tissue pairs that we previously identified as having a “conserved” sex bias as well as gene-tissue pairs that lost sex bias in the primate or rodent lineages or in multiple lineages. Instances of acquired sex bias included gene-tissue pairs with primate-, rodent-, or species-specific sex bias, as well as multiple gains of sex bias. By this logic, 6539 (23%) of sex-biased gene-tissue pairs were likely sex-biased in the common ancestor, and 22,194 (77%) likely acquired sex bias after divergence from a common ancestor. An additional 8495 gene-tissue pairs exhibited more complex patterns and could not be confidently assigned as ancestrally sex-biased or acquired (Fig. 3B). If all such “complex” events were ancestral, the ancestral fraction of sex bias would be 40%, whereas if they were acquired, the fraction would be 18%. Performing these calculations in each tissue separately, we found that ancestral sex bias constituted the minority of total sex bias in all tissues except the pituitary (Fig. 3C). We also quantified the fraction of ancestral bias using a range of fold-change cutoffs up to 1.5 and found that, for all cases, ancestral sex bias was in the minority (fig. S15). Repeating this analysis with conserved sex bias called by mashr rather than the linear mixed model yielded similar results, with both methods detecting similar numbers of gene-tissue pairs with conserved sex bias (fig. S15). We conclude that most sex bias in gene expression in nonreproductive tissues arose during, rather than before, the boreoeutherian radiation.

Sex-biased gene expression is associated with reduced selective constraint

We assessed the degree of selective constraint operating on sex-biased gene expression. Reasoning that genes functioning across many tissues and cell types face increased selective constraint on gene expression levels, we compared the breadth of expression of genes with and without sex bias, in each tissue. Sex-biased genes showed significantly lower expression breadth than genes with no bias, with the exception of lung, where sex-biased genes were more broadly expressed (adjusted P < 0.05, two-sided Wilcoxon rank-sum test) (Fig. 4A). These differences in expression breadth could either be downstream consequences of, or have predated, the observed sex bias. We thus analyzed expression breadth in chicken, an evolutionary outgroup to mammals, reasoning that patterns found in both human and chicken were likely present in the common mammalian ancestor before the acquisition of sex bias. Again, sex-biased genes in mammals showed almost uniformly lower expression breadth in chicken than unbiased genes (adjusted P < 0.05, two-sided Wilcoxon rank-sum test) (fig. S16).

Fig. 4 Sex-biased gene expression is associated with reduced selective constraint.

In each tissue, genes were binned as showing no sex bias or showing sex bias of any evolutionary type. Human breadth (A) was calculated on the basis of median expression values in the 12 selected GTEx tissues (34), expression constraint (B) represents the genome-wide percentile, and sequence conservation (C) is calculated as the mean coding phyloP score (34). In each heatmap, the group median of the indicated gene-level trait is plotted; asterisks indicate a Benjamini-Hochberg–adjusted P < 0.05 from a two-sided Wilcoxon rank-sum test, placed on the group (“No bias” or “Sex-biased”) with the lower value of the gene-level trait.

To assess conservation of expression levels in a tissue-specific manner, we used estimates of mammalian gene expression-level constraint learned from 16 species (47) and seven tissues, five of which were also assessed in our study. As with expression breadth, sex-biased genes showed lower constraint than unbiased genes in heart, muscle, and liver but higher constraint in lung (adjusted P < 0.05, two-sided Wilcoxon rank-sum test) (Fig. 4B).

We observed that genes sex-biased in heart, spleen, and liver showed lower sequence conservation than unbiased genes, whereas genes sex-biased in adipose, brain, and lung showed higher sequence conservation than unbiased genes (adjusted P < 0.05, two-sided Wilcoxon rank-sum test) (Fig. 4C). Thus, some sex-biased genes are relatively strongly constrained at the sequence level, perhaps because they perform important or pleiotropic functions. However, considering both expression levels and sequence conservation, our findings indicate that sex-biased gene expression is primarily associated with reduced selective constraint, from before the divergence of the boreoeutherian lineages.

Conserved sex bias in autosomal gene expression contributes to sex differences in mammalian height and body size

Males are larger than females in most mammalian taxa (10). Human males are, on average, 10 to 15 cm (7 to 13%) taller than females, but the distributions of height in males and females overlap substantially (Fig. 5A). The genetic architecture of human height is polygenic and largely shared between the sexes. A recent meta-analysis reported 712 genome-wide significant loci (48), and only a handful of sex-specific associations with height have been discovered (49, 50); recent human studies reported a between-sex genetic correlation of 0.96 (50, 51). Studies in humans and other mammals have concluded that height is likely subject to opposing selective pressures between the sexes, where increased height enhances reproductive success in males and decreased height favors reproductive success in females (52, 53).

Fig. 5 Conserved sex bias in autosomal gene expression contributes to sex differences in human height.

(A) Overlapping but shifted distributions of male and female heights. Theoretical normal distributions using published means and standard deviations of male and female heights in individuals of European ancestry from the United Kingdom (53). (B) TWAS z-scores for genome-wide significant height genes with either female (orange) or male (blue) bias in one of 12 tissues, either conserved across mammals (left), specific to humans (middle), or specific to primates (right). For each gene, TWAS z-scores were meta-analyzed across 48 GTEx tissues. (C) TWAS z-scores for gene-tissue pairs with either female (orange) or male (blue) bias, either conserved across mammals (left), specific to humans (middle), or specific to primates (right), in all cases in same tissue as computed TWAS z-score. Points represent group means; whiskers represent 95% confidence intervals. P value for mean difference calculated by 1000 permutations of male and female point labels.

Could sex-biased gene expression contribute to the sex difference in height and body size observed in humans and other mammals? To link variation in gene expression to variation in height, we turned to transcriptome-wide association studies (TWAS) that integrate an expression quantitative trait loci (eQTL) study from a given tissue or cell type and genome-wide association studies (GWAS) of a given trait (54).

If sex-biased gene expression contributes to sex differences in height, genes with male-biased expression levels should mostly identify height-increasing effects, as measured by TWAS, whereas female-biased genes should identify height-decreasing effects. We considered genes with genome-wide significant associations for height, as annotated in the NHGRI-EBI GWAS catalog (55). We used TWAS to combine height GWAS statistics from a meta-analysis (48) of data from the UK Biobank (56) and GIANT consortium (57) (which we verified were correlated; Pearson’s r = 0.83, P < 2.2 × 10−16) (fig. S17) with reference eQTL panels from 43 different tissues generated using data from GTEx (35). TWAS z-scores largely agree in sign across the 43 tissues (figs. S18 and S19; see table S5 for all TWAS z-scores). We therefore combined z-scores for each gene across tissues by meta-analysis. Sixty-two genome-wide significant height genes have both computed TWAS z-scores and conserved sex bias in at least one tissue. Genes with conserved male-biased expression have more-positive TWAS z-scores than genes with conserved female-biased expression (mean z-score difference = 18, P = 0.023, group permutation test), but this difference was not seen when analyzing genes with human-specific or primate-specific sex bias (Fig. 5B). Expanding our analyses to include TWAS results for all genes allowed for greater stringency by only considering TWAS z-scores calculated for the same tissue in which sex-biased expression was observed. Five hundred sixty gene-tissue pairs have both computed TWAS z-scores and conserved sex bias; these are distributed across all 12 tissues, with the largest numbers in muscle, adipose, and pituitary (fig. S20), and they are enriched for metabolic functions (adjusted P value < 0.05, two-sided Fisher’s exact test) (table S6). Gene-tissue pairs with conserved male bias have more-positive TWAS z-scores than those with conserved female bias (mean z-score difference = 0.7, P = 0.039, group permutation test), but this difference was not seen when considering gene-tissue pairs with human- or primate-specific sex bias (Fig. 5C). Together, these results indicate that genes with conserved male-biased expression show height-increasing effects, whereas genes with conserved female-biased expression show height-decreasing effects.

We sought to quantify the fraction of sex difference in height explained by conserved sex bias in gene expression, focusing on cases where the sex bias was in the same tissue as the TWAS z-score (i.e., Fig. 5C). We estimated the contribution of conserved sex bias to the height sex difference with two approaches. One approach used a physical scale with the effect sizes of eQTLs in GWAS, and the other examined a relative fold-change on the basis of TWAS z-scores (34) (fig. S21). The two approaches yielded similar estimates of the contribution of conserved sex-biased gene expression: ~1.6 cm, or 12%, of the observed sex difference in mean height.

Genes with conserved male and female bias show the largest difference in height TWAS z-scores, suggesting a contribution to sex differences in size in other mammals. Indeed, all five species assessed in this study exhibit sex differences in size (fig. S22). Consider the transcription factor LCORL, which shows conserved female bias in the pituitary (fig. S23A) and is height-decreasing in humans (cross-tissue TWAS z-score = −28.7). Although LCORL lacks a predictive TWAS model in the pituitary, an allele at the LCORL locus associated with increased expression in the pituitary is associated with decreased height (fig. S23B). Notably, genetic variation at the LCORL locus has been associated with height or body size in dogs (58), cattle (59), and horses (60). Reanalysis of publicly available RNA-seq data (61) shows that LCORL is one of the most strongly female-biased autosomal genes in the cattle pituitary (1.6-fold higher in females) (fig. S23C and table S7). An allele associated with increased body size in horses is associated with decreased LCORL expression in hair root (62), indicating that the negative association between LCORL expression and height likely extends beyond humans. These observations suggest that female-biased expression of LCORL contributes to sex differences in size in multiple species. Beyond LCORL, studies have observed significant overlap in genome-wide-significant height loci between humans (57), dogs (58), and cattle (59) (table S8), suggesting a broader contribution of conserved sex-biased gene expression to sex differences in body size in a range of mammals.

Conserved and acquired sex biases in gene expression are similarly enriched for specific biological pathways and show similar magnitudes of bias

Our results indicate that although sex-biased gene expression overall shows signs of lowered selective constraint, conserved sex bias contributes to sex differences in height or body size. This raises the possibility that more-recently acquired sex bias in expression has little or no functional impact. To assess this possibility, we compared genes with either conserved or acquired sex bias with respect to (i) overrepresentation in specific biological pathways via Gene Ontology (GO) category enrichment and (ii) the magnitude of their sex bias.

We observed similar degrees of enrichments for biological pathways among conserved and acquired sex biases. Genes with conserved male bias in pituitary are enriched for cyclic adenosine monophosphate signaling, which functions in response to stress (63). Genes with conserved female bias in colon and thyroid are enriched for adaptive immune pathways, whereas genes with conserved female bias in adipose tissue are enriched for mitochondrial translation and ribosomal RNA processing. At the same time, genes with acquired male bias in the liver, adipose tissue, and heart are enriched for functions related to fatty acid metabolism, regulation of hormone secretion, and nucleotide metabolism, respectively, and genes with acquired female bias in the liver are enriched for extracellular matrix organization (adjusted P < 0.05, two-sided Fisher’s exact test) (table S6).

We compared the magnitude of conserved and acquired sex bias using 10 independent human and mouse datasets to minimize differences due to ascertainment; we found no significant differences (adjusted P > 0.05, two-sided Wilcoxon rank-sum test) (fig. S24). Considering that conserved sex bias, although generally small in magnitude, can nevertheless contribute to sex differences in height, these results suggest that acquired sex bias could also be functionally consequential. Additional studies are needed to demonstrate the functional impact of acquired sex-biased gene expression.

Evolutionary turnover of motifs for sex-biased transcription factors reflects lineage-specific changes in sex bias

One mechanism by which sex-biased expression could evolve is via sex-biased transcription factors (TFs). For example, male-biased TF expression in muscle would result in higher TF activity in male muscle. Genes that acquired motifs for this TF in, for example, the primate lineage would then show a primate-specific sex bias in muscle. To test this idea, we searched for motifs enriched in the promoters of sex-biased genes with lineage-specific changes in a given tissue, relative to their unbiased orthologs. We repeated this analysis with random, equally sized sets of genes showing no lineage-specific sex bias in order to calculate an empirical P value for motif enrichment (34). Because of the nonparametric nature of this P value calculation and our desire to analyze enriched motifs inclusively as a set, we considered motifs at a 10% FDR. We found 83 instances in which such motifs matched predicted binding sites of TFs with sex bias in the same tissue (Fig. 6A and table S9). This was significantly more (P = 0.014, tissue permutation test) than the ~67 instances of matches expected when randomly assigning the tissue of sex bias for each TF (Fig. 6B), which, combined with the 10% FDR for motif discovery, yields ~6.7 matches expected by chance. By quantifying the enrichment of each motif in its corresponding set of sex-biased orthologs (34), we estimated that these 83 instances account for the lineage-specific sex bias of 6073 gene-tissue pairs, or 27% of all lineage-specific sex bias. Furthermore, 13 TFs showed matches to enriched motifs in more than one tissue, significantly more than the approximately one such TF expected by chance (P = 0.032, tissue permutation test), as determined by the motif discovery FDR and permuting the tissue of TF sex bias as described above (fig. S25). This suggests that gains and losses of motifs for sex-biased TFs could, in some cases, coordinate the evolution of sex-biased gene expression across multiple tissues or cell types.

Fig. 6 Evolutionary turnover of motifs for sex-biased transcription factors is associated with gains and losses of sex bias.

(A) Representative gained or lost motifs in promoters of genes with lineage-specific gains or losses of sex bias (top) aligned with motifs for sex-biased TFs in same tissue (bottom). The lineage of sex bias gain or loss is indicated above each motif; the sex-biased TF and lineage of its sex bias are indicated below. (B) Total number of matches between gained or lost motifs and sex-biased TFs when considering tissue of TF sex bias (black) or randomly chosen tissues (gray). (C) Enrichment of ChIP-seq peaks in promoters of genes with lineage-specific sex biases containing gained or lost motifs for the TF. The sex-biased TF, along with tissue of sex bias and motif gain or loss and cell type in which ChIP-seq was performed, are indicated to left. The log2 odds ratio for genes with lineage-specific sex bias and containing the motif as compared with a background set of genes with no motif is shown on the x axis, with 95% confidence intervals by Fisher’s exact test. (D) Effect of Pknox1 knockout (x axis) (64) versus sex bias (y axis), both in mouse muscle, for genes that show loss of sex bias in primate lineage and contain a motif for PKNOX1 in mouse.

To confirm that genes with lineage-specific gains or losses of motifs for sex-biased TFs are TF-bound in living cells, we leveraged publicly available data from chromatin immunoprecipitation sequencing (ChIP-seq) in human and mouse (table S9). Although these assays were almost invariably performed in a different cell type than the tissue of TF sex bias and motif gain, we reasoned that sex-biased genes with gained motifs in a given tissue should nevertheless show enrichment for TF ChIP-seq signal. Eleven of 15 cases with available data showed significant enrichment of ChIP-seq peaks in the promoters of genes with a gain or loss of sex bias and the relevant motif, relative to a background set of genes lacking the motif (two-sided Fisher’s exact test), which is consistent with this prediction (Fig. 6C). Thus, the evolutionary gains and losses of motifs we observed likely correspond to gains and losses of binding by cognate TFs.

If gains and losses of motifs for sex-biased TFs contribute to lineage-specific changes in sex bias in their target genes, there should be directional agreement between the activating or repressive effect of the TF, the sex bias of the TF, and the sex bias of the target gene. For example, target genes activated (or repressed) by a male-biased TF should be male (or female) biased, and the opposite should be true for female-biased TFs. Rigorously testing this prediction requires experimental manipulation of the TF in the tissue where lineage-specific changes in sex bias are observed. Such data are available for PKNOX1, a homeobox TF with conserved male-biased expression in muscle (64) (fig. S26). Genes with loss of muscle-specific sex bias in the primate lineage show depletion (at a stringent 5% FDR) of PKNOX1-matching motifs relative to mouse, rat, and dog (Fig. 5A, examples of PKNOX1 targets in fig. S26). Genes with a PKNOX1-matching motif show significant positive correlation between the effect of Pknox1 knockout (64) and the effect of sex on muscle gene expression (Pearson’s r = 0.40, P = 9.02 × 10−6) (Fig. 6D). Thus, both ChIP-seq and TF knockout data confirm that gains and losses of regulation by sex-biased TFs have contributed to the evolution of sex bias.


Comparative studies of sex-biased gene expression have implications for the use of nonhuman mammals as models of sex-biased human traits or diseases. Conserved sex bias in gene expression across the body indicates that certain molecular sex differences in humans are amenable to study in a wide range of mammalian model organisms. However, in many cases, nonhuman models may not adequately recreate the human sex differences in question. This is supported by two lines of evidence: (i) in each tissue, samples cluster by species rather than sex, and (ii) most sex bias in gene expression has arisen recently and is thus not shared between most mammals. For example, genetic variants that decrease expression of the TF KLF14 in human adipose tissue tend to increase insulin resistance and risk for type 2 diabetes only in females, but elimination of Klf14 expression in mouse adipose tissue leads to analogous phenotypes in both sexes (65). Nonhuman mammals may still be useful as models of physiological or systems-level sex differences, but caution should be exercised when extrapolating specific molecular findings to humans.

We find that conserved sex bias in autosomal gene expression explains ~12% of the sex difference in mean human height, whereas all common single-nucleotide polymorphisms are thought to explain 60% of the heritability of height (66). Although these two numbers are not directly comparable (the former relates to between-group differences, the latter to between-individual variation), these height heritability estimates suggest that additional genes and instances of sex bias relevant to sex differences in height remain to be discovered. Deletions of the SHOX gene, located in the pseudoautosomal region of the human X and Y chromosomes, contribute to short stature in Turner syndrome (67), whereas increases in sex chromosome number (and thus SHOX dosage) increase height (68). Although the height GWAS used here excluded the pseudoautosomal regions, precluding analysis of SHOX, targeted studies indicate that SHOX dosage is positively correlated with height (67, 68). In light of reports that expression of SHOX is male-biased in multiple tissues (16), it may be that SHOX contributes a fraction of the sex difference in height [discussed further in (69)].

Studies of selection on height have illustrated how males and females can have different optimal values for a quantitative trait, with increased height favored in males and decreased height favored in females (53). Our finding that conserved sex bias in gene expression contributes to sex differences in height suggests one way in which such optimal values can be reached—through the acquisition and maintenance of sex-biased gene expression (70). Thus, although some conserved sex bias in gene expression may have arisen through selectively neutral processes, opposing selective forces between the sexes appear to have been at work here. Height is also subject to balancing selection, in which extreme variation in either direction negatively impacts reproductive fitness (53). A recent study in Drosophila found a strong signature of balancing selection at loci with opposite fitness effects in females and males, establishing that sexual antagonism and balancing selection can coincide (71). Future studies may identify mechanisms that reduce fitness at the extremes of the height distributions in both sexes. Whereas our study focused exclusively on height, genetic pleiotropy may broaden the reach of our findings. Sex-biased gene expression resulting from opposing selective pressures on male and female height could result in sex differences in phenotypes yet to be identified.

Our results also illustrate one way in which sex-biased gene expression can lead to phenotypic sex differences: autosomal genes, operating identically in males and females to influence a trait, can be expressed more abundantly in one sex. Although most genetic variation influencing height acts identically between the sexes, pronounced sex-specific genetic effects have been demonstrated for waist-to-hip ratio, body mass index (7274), thyroid hormone levels (75), and obsessive-compulsive disorder (76). Fully accounting for such sex differences in genetic architecture in association mapping (77), and integrating this information with sex biases in gene expression, may reveal additional mechanisms underlying phenotypic sex differences.

Our finding of sex-biased TFs underlying lineage-specific changes in sex bias provides molecular insight into mechanisms underlying the evolution of sex-biased gene expression in nonreproductive mammalian tissues. We focused on regulatory changes in promoter regions because of the lack of tissue-specific enhancer annotations in cyno, rat, or dog. However, single-gene studies in Drosophila indicate that sex-biased gene expression can evolve through more complex changes in cis-regulatory elements at larger genomic distances from their target gene (31). Studying gains and losses of TF binding motifs in promoters, although an important first step, is a simplifying approach. It is thus necessary to catalog both tissue and species specificity of mammalian enhancers to enable detailed analyses of the cis-regulatory changes driving gains or losses of sex-biased gene expression during mammalian evolution. Most of the sex-biased TFs we identified as contributing to lineage-specific evolution of sex bias are autosomal, indicating that their sex biases could arise as a result of trans-regulatory effects of sex chromosomes or sex hormones. Distinguishing between these two possibilities is an important future direction for research.

Materials and methods summary

Human (GTEx) samples were filtered on the basis of cause of death, medical history, and notes from GTEx pathologists (35), and additional detailed evaluations were conducted on samples with available histology. Samples from cynomolgus macaque, mouse, rat, and dog were collected within 1 hour of euthanizing healthy animals, and only tissues from nonestrous females were used. RNA extraction, library preparation, and RNA-seq of nonhuman mammals were performed in batches randomized with respect to tissue, species, and sex. Analyzing the combined human and nonhuman dataset, we used a linear mixed model (78) to identify genes showing consistent sex bias across species in each tissue, and we used mashr to identify lineage-specific changes in sex bias. These analyses were repeated with permuted male/female sample labels to empirically estimate FDRs. Magnitudes of sex bias were assessed in independent datasets by reanalyzing raw data, where available. For lineage-specific sex biases in each tissue, motif analysis (79, 80) was used to identify TF binding sites enriched in the set of sex-biased orthologs relative to the unbiased orthologs. Height-increasing or -decreasing effects of gene-tissue pairs were determined by combining publicly available TWAS predictive models (54) based on eQTL information from GTEx with height GWAS summary statistics from the UK Biobank (56), GIANT consortium (57), and a meta-analysis of the two studies (48).

Supplementary Materials

Materials and Methods

Figs. S1 to S26

Tables S1 to S9

References (82106)

References and Notes

  1. Materials and methods are available as supplementary materials.
Acknowledgments: We thank D. W. Bellott, L. Chmatal, and R. Ransohoff for critically reading the manuscript. Funding: S.N. and A.K.G. were supported by a research grant from Biogen, Inc. This work was supported by Biogen, Whitehead Institute, National Institutes of Health (grants R01HG007852 and U01HG007857), Howard Hughes Medical Institute, generous gifts from Brit and Alexander d’Arbeloff and Arthur W. and Carol Tobin Brill. Author contributions: S.N., A.K.G., J.F.H., and D.C.P. designed the study. J.F.H. procured cyno tissue samples. M.L.G. procured mouse and rat tissue samples, with assistance from S.N. S.N. processed tissue samples and performed computational analyses, with assistance from A.K.G. R.N.M. performed histological evaluations on human tissue sections. D.C.P. supervised work. S.N. and D.C.P. wrote the paper. Competing interests: The authors declare no competing interests. Data and materials availability: Raw RNA-seq data (.fastq files) have been deposited in the Gene Expression Omnibus (GSE125483). Processed data (expression count and TPM matrices and sample metadata) and code required to reproduce the analyses are available at and at Zenodo (81).
View Abstract

Stay Connected to Science

Navigate This Article