Research ArticlesGenetics

A genetic signature of the evolution of loss of flight in the Galapagos cormorant

See allHide authors and affiliations

Science  02 Jun 2017:
Vol. 356, Issue 6341, eaal3345
DOI: 10.1126/science.aal3345

Loss of flight in the Galapagos cormorant

Although rare among existing birds, the loss of flight appears to have occurred multiple times in evolutionary history. However, the genetic changes that ground avian species are not well understood. Burga et al. sequenced genomes from three cormorant species and compared them with that of the flightless Galapagos cormorant (see the Perspective by Cooper). They identified variants in genes involved in primary ciliogenesis. Functional analyses of these variants suggest that the impaired function of the genes may be responsible for skeletal changes associated with the Galapagos cormorant’s loss of flight.

Science, this issue p. eaal3345; see also p. 904

Structured Abstract


Changes in the size and proportion of limbs and other structures have played a key role in the evolution of species. One common class of limb modification is recurrent wing reduction and loss of flight in birds. Indeed, Darwin used the occurrence of flightless birds as an argument in favor of his theory of natural selection. Loss of flight has evolved repeatedly and is found among 26 families of birds in 17 different orders. Despite the frequency of these modifications, we have a limited understanding of their underpinnings at the genetic and molecular levels.


To better understand the evolution of changes in limb size, we studied a classic case of recent loss of flight in the Galapagos cormorant (Phalacrocorax harrisi). Cormorants are large water birds that live in coastal areas or near lakes, and P. harrisi is the only flightless cormorant among approximately 40 extant species. The entire population is distributed along the coastlines of Isabela and Fernandina islands in the Galapagos archipelago. P. harrisi has a pair of short wings, which are smaller than those of any other cormorant. The extreme reduction of the wings and pectoral skeleton observed in P. harrisi is an attractive model for studying the evolution of loss of flight because it occurred very recently; phylogenetic evidence suggests that P. harrisi diverged from its flighted relatives within the past 2 million years. We developed a comparative and predictive genomics approach that uses the genome sequences of P. harrisi and its flighted relatives to find candidate genetic variants that likely contributed to the evolution of loss of flight.


We sequenced and de novo assembled the whole genomes of P. harrisi and three closely related flighted cormorant species. We identified thousands of coding variants exclusive to P. harrisi and classified them according to their probability of altering protein function based on conservation. Variants most likely to alter protein function were significantly enriched in genes mutated in human skeletal ciliopathies, including Ofd1, Evc, Wdr34, and Ift122. We carried out experiments in Caenorhabditis elegans to confirm that a missense variant present in the Galapagos cormorant IFT122 protein is sufficient to affect ciliary function. The primary cilium is essential for Hedgehog (Hh) signaling in vertebrates, and individuals affected by ciliopathies have small limbs and ribcages, mirroring the phenotype of P. harrisi. We also identified a 4–amino acid deletion in the regulatory domain of Cux1, a highly conserved transcription factor that has been experimentally shown to regulate limb growth in chicken. The four missing amino acids are perfectly conserved in all birds and mammals sequenced to date. We tested the consequences of this deletion in a chondrogenic cell line and showed that it impairs the ability of CUX1 to transcriptionally up-regulate cilia-related genes (some of which contain function-altering variants in P. harrisi) and to promote chondrogenic differentiation. Finally, we show that positive selection may have played a role in the fixation of the variants associated with loss of flight in P. harrisi.


Our results indicate that the combined effect of variants in genes necessary for the correct transcriptional regulation and function of the primary cilium likely contributed to the evolution of highly reduced wings and other skeletal adaptations associated with loss of flight in P. harrisi. Our approach may be generally useful for identification of variants underlying evolutionary novelty from genomes of closely related species.

Comparative and predictive genomics of loss of flight.

Comparison of the genomes of four closely related cormorant species allowed us to predict function-altering variants exclusively affecting the Galapagos cormorant and to test their functional consequences. Our results implicate ciliary dysfunction as a likely contributor to the evolution of loss of flight.


We have a limited understanding of the genetic and molecular basis of evolutionary changes in the size and proportion of limbs. We studied wing and pectoral skeleton reduction leading to flightlessness in the Galapagos cormorant (Phalacrocorax harrisi). We sequenced and de novo assembled the genomes of four cormorant species and applied a predictive and comparative genomics approach to find candidate variants that may have contributed to the evolution of flightlessness. These analyses and cross-species experiments in Caenorhabditis elegans and in chondrogenic cell lines implicated variants in genes necessary for transcriptional regulation and function of the primary cilium. Cilia are essential for Hedgehog signaling, and humans affected by skeletal ciliopathies suffer from premature bone growth arrest, mirroring skeletal features associated with loss of flight.

The evolution of loss of flight is one the most recurrent limb modifications encountered in nature (1). Indeed, Darwin used the occurrence of flightless birds as an argument in favor of his theory of natural selection (2). He proposed that loss of flight could evolve as a result of selection in favor of larger bodies and relaxed selection due to the absence of predators. Loss of flight has evolved repeatedly and is found among 26 families of birds in 17 different orders (1). Moreover, recent studies strongly suggest that the ratites (ostriches, emus, rheas, cassowaries, and kiwis), long thought to derive from a single flightless ancestor, may constitute a polyphyletic group characterized by multiple independent instances of loss of flight and convergent evolution (35). However, despite the ubiquity and evolutionary importance of loss of flight (6), the underlying genetic and molecular mechanisms remain unknown.

The Galapagos cormorant (Phalacrocorax harrisi) is the only flightless cormorant among approximately 40 extant species (7). The entire population is distributed along the coastlines of Isabela and Fernandina islands in the Galapagos archipelago. P. harrisi has a pair of short wings, which are smaller than those of any other cormorant (Fig. 1A)—a deviation from the allometric relationship between wing length and body mass (7). The radius and ulna are disproportionately small relative to the humerus, but no digits have been fused or lost, unlike in some ratites (8). In addition, the Galapagos cormorant differs from its flighted relatives in a delay in the onset of several developmental landmarks after hatching (9), shortened remiges (flight feathers), underdeveloped pectoral muscles, a long and narrow skull and pelvis, a disproportionately long tibiotarsus, a factor of 1.6 increase in body mass, and a highly reduced keel (7). The keel is an extension of the sternum that runs along its midline and provides an attachment surface for the flight muscles, the largest muscles in birds. Flightless taxa, such as ratites and Cretaceous Hesperornis, have evolved flat sternums in which the keel has been largely reduced or lost (10).

Fig. 1 The Galapagos cormorant, a model for studying flightlessness evolution.

(A) The average wing length of an adult Galapagos cormorant male is 19 cm (3.6 kg body mass), whereas the wing length of its closest relative, the double-crested cormorant, is 31.5 cm (2.2 kg body mass). [Illustration by Katie Bertsche from specimens 134079 and 151575, Museum of Vertebrate Zoology at Berkeley] (B) The CEGMA score is a good predictor of genome completeness from a gene-centric perspective. The blue line is the linear regression model (r2 = 0.75, P = 4.3 × 10−7). Genomes reported in this study are red triangles; other published avian genomes are black circles (table S2). (C) Bayesian phylogram reconstructed with fourfold degenerate sites from whole genome sequences. The orange bar illustrates the time span between the approximate origin of the proto-Galapagos archipelago (9 Ma) and the origin of the oldest extant island, San Cristobal (4 Ma). Nodes represent median divergence ages. Blue bars indicate the 95% highest posterior density interval. (D) Heterozygosity levels inferred from whole genome sequences. Birds are not drawn to scale.

In contrast to ratites and penguins, which became flightless more than 50 million years ago (Ma) (5, 11), the Galapagos cormorant and its flighted relatives are estimated to share a common ancestor at ~2 Ma (12). This recent and extreme modification of wing size and pectoral skeleton makes P. harrisi an attractive model for studying loss of flight.

High-quality genome sequences of four cormorant species

To identify variants associated with loss of flight, we sequenced and de novo assembled the 1.2-Gb genomes of the Galapagos cormorant (Galapagos Islands, Ecuador) and three flighted cormorant species: the double-crested cormorant (Phalacrocorax auritus; Minnesota, USA), the neotropical cormorant (Phalacrocorax brasilianus; Valdivia, Chile), and the pelagic cormorant (Phalacrocorax pelagicus; Alaska, USA). P. auritus and P. brasilianus are the closest relatives of P. harrisi (1214), and P. pelagicus is part of a sister clade and served as an outgroup. Genomes were assembled from a combination of short insert and mate-pair Illumina libraries with SOAPdenovo2 (15) (table S1). Among these four genomes, the Galapagos cormorant’s assembly had the longest contig and scaffold N50 metrics (contig N50, 103 kb; scaffold N50, 4.6 Mb; table S1B). We evaluated the completeness of the cormorants’ genomes by estimating the total number of uniquely annotated proteins in each assembly and by using the CEGMA pipeline (16, 17). Overall, we found agreement between these two independent metrics in a data set including the four cormorant genomes and 17 recently published bird genomes (r2 = 0.75, P = 4.3 × 10−7; Fig. 1B and table S2).

Commonly used metrics of assembly quality, such as contig and scaffold N50, were very poor predictors of the total number of proteins present in each assembly (r2 = 0.13, P = 0.15, and r2 = 0.06, P = 0.81; fig. S1 and table S2). Three of the four cormorant genomes (P. harrisi, P. auritus, and P. pelagicus) obtained the highest CEGMA scores and numbers of uniquely annotated genes among all bird genomes (red triangles, Fig. 1B). The following CEGMA scores, a means to estimate genome completeness, were obtained for the cormorants: P. harrisi, 90.3%; P. auritus, 91.3%; P. brasilianus, 72.6%; P. pelagicus, 87.1%. In contrast, Sanger and PacBio genomes had lower scores for other birds: Gallus gallus (Sanger assembly) (18), 80.7%; Taeniopygia guttata (Sanger assembly) (19), 71.4%; and Melopsittacus undulatus (PacBio assembly) (20), 79.0%. Thus, our cormorant genomes perform even better than genomes assembled from Sanger sequences and PacBio long reads (complete statistics in table S2).

Phylogeny and genetic diversity

We reconstructed the cormorant phylogeny using a Bayesian framework (17) and confirmed the phylogenetic relationship among the four sequenced species (Fig. 1C). Moreover, our results indicate that P. harrisi last shared a common ancestor with P. auritus and P. brasilianus at ~2.37 Ma, in agreement with an estimate from mitochondrial DNA (12) (Fig. 1C). Española, the oldest extant island in the Galapagos archipelago, emerged no earlier than 4 Ma, and proto-Galapagos islands existed at 9 Ma or earlier (21). Our results are consistent with the view that P. harrisi lost the ability to fly while inhabiting the archipelago.

We calculated the proportion of single-nucleotide polymorphism (SNP) heterozygous sites for each sequenced individual to estimate the levels of intraspecific genetic diversity (Fig. 1D). P. harrisi showed the lowest proportion of heterozygous SNPs among the sequenced cormorants (0.00685%; Fig. 1D). The heterozygosity of P. harrisi is even lower than that of the crested ibis, Nipponia nippon, a highly endangered bird with a small effective population size and a known recent population bottleneck (22) (0.0172%; Fig. 1D). The low level of heterozygosity found in the Galapagos cormorant is most likely due to its small population size (~1500 individuals) and multiple population bottlenecks (23).

Discovery and characterization of function-altering variants in P. harrisi

To investigate the genetics of flightlessness evolution, we developed a comparative and predictive genomics approach (24, 25) that uses the genome sequences of P. harrisi and its flighted relatives to identify genetic variants that likely contributed to the evolution of loss of flight. Both coding and cis-regulatory variants have been implicated in the evolution of morphological traits (26, 27). However, determining the impact of regulatory variants is not straightforward. To identify the contribution of regulatory variants to the evolution of loss of flight in P. harrisi, we searched for ultraconserved noncoding sequences showing accelerated molecular evolution (2830). We identified 11 ultraconserved noncoding regions in tetrapods that show accelerated evolution in P. harrisi but not in the other cormorants [false discovery rate (FDR) < 5%]. One of these regions was located in an intron of the gene FTO (fig. S2), which has been associated with obesity in humans (31); however, none of these regions overlapped with experimentally validated or putative mouse limb enhancers (17, 32, 33) (table S7).

We thus focused on characterizing coding variants because we are better able to predict their molecular consequences. For our variant discovery approach to be comprehensive, it was imperative to interrogate most of the Galapagos cormorant’s genes. To increase our power to do so, we annotated genes using homology-based and transcriptome-based gene annotations. The latter were derived with mRNA expression data from the developing wing of a double-crested cormorant embryo (fig. S3C) (17). We then predicted all missense, deletion, and insertion variants in ortholog pairs between P. harrisi and each of its three flighted relatives (fig. S4) (17).

We used PROVEAN (34), a phylogeny-corrected variant effect predictor, to evaluate the impact on protein function of each of the Galapagos cormorant’s variants on a genome-wide scale. PROVEAN predictions have been validated in experimental evolution studies that mimic the process of gradual accumulation of mutations in nature (35). A PROVEAN score is calculated for each variant; the more negative the score, the more likely a given variant is to alter protein function. We examined the distribution of PROVEAN scores obtained when comparing 12,442 ortholog pairs between P. harrisi and P. auritus (Fig. 2A). Of these, 4959 pairs (40%) did not contain coding variants; the remaining 7483 pairs contained a total of 23,402 coding variants: 22,643 single amino acid substitutions, 456 deletions, and 303 insertions (Fig. 2B). Most variants were predicted to be neutral (the distribution is centered around zero). As expected, deletions and insertions were enriched in the tails of the distribution (Fig. 2B). Very similar numbers of variants and PROVEAN score distributions were obtained for the other homology-based (fig. S5) and transcriptome-based annotations (fig. S3).

Fig. 2 Distribution of the effect of variants between P. auritus and P. harrisi.

(A) We used PROVEAN to predict the effect on protein function of 23,402 variants contained in 12,442 orthologous pairs between P. auritus and P. harrisi; 4966 pairs contained no variants. The more negative the score, the more likely the variant affects protein function. PROVEAN score thresholds used in this study are drawn as vertical dashed lines. Numbers of proteins and variants found for each threshold are shown in the inset. (B) Density of PROVEAN scores for each class of variant. The same variants shown in (A) were classified as single amino acid substitutions, deletions, and insertions. Numbers of variants in each class are indicated. a.u., arbitrary units.

Enrichment for genes mutated in skeletal ciliopathies

To identify proteins carrying function-altering variants in the Galapagos cormorant, we applied a stringent threshold to our four prediction data sets: PROVEAN score less than –5, twice the threshold for discovery of human disease variants (17, 34) (Fig. 2A). In our data set, variants with a PROVEAN score less than –5 typically occur at residues that have been perfectly conserved at least since mammals and birds last shared a common ancestor (~300 Ma; fig. S6). Consequently, changes in these residues are likely to alter protein function or stability.

On the basis of theoretical and experimental considerations (36), we hypothesized that flightlessness is likely to have a polygenic basis and that the underlying variants would be enriched in certain biological pathways. Consistent with this hypothesis, gene enrichment analysis of function-altering variants in the Galapagos cormorant revealed that genes implicated in human developmental disorders were significantly overrepresented (17) (table S3A). Strikingly, 8 of the 19 significantly enriched categories (Human Phenotype Ontology) consisted of genes that affect limb development when mutated, leading to disorders such as polydactyly, syndactyly, and duplication of limb bones. Control analyses showed no enrichment of these categories in the flighted cormorants (17) (table S3, B and C).

Many of the genes underlying the enrichment for limb syndromes are those mutated in a family of human disorders known as ciliopathies. For instance, 17 of 25 genes (65%) in the “duplication of hand bones” category and all 12 genes (100%) in the “preaxial hand polydactyly” category are mutated in human ciliopathies (table S4). Moreover, ciliopathy-associated genes were present in all of the enriched categories (table S4). Ciliopathies comprise a phenotypically diverse group of rare genetic disorders that result from defects in the formation or function of cilia (37).

Cilia are hairlike microtubule-based structures that are nucleated by the basal body (centriole and associated proteins) and project from the surface of cells. Primary cilia are essential for mediating Hedgehog (Hh) signaling in vertebrates, serving as antennae for morphogens during development (38). We confirmed by Sanger sequencing the presence of predicted function-altering variants in Ofd1, Evc, Talpid3, Dync2h1, Ift122, Wdr34, and Kif7, all of which are necessary for the assembly or functioning of the primary cilium and are mutated in human ciliopathies, particularly those affecting the skeleton (Table 1). We also found a likely function-altering variant in Gli2, a transcription factor necessary for Hh signaling (39) (Table 1). Humans affected by diverse skeletal ciliopathies have small limbs and ribcages (37), suggesting a parallel with the main features of the Galapagos cormorant: small wings and a flattened sternum. However, the consequences of ciliopathies in humans are often more severe and pleiotropic, likely as a consequence of the overrepresentation of loss-of-function alleles in patients (40). Interestingly, although ciliopathies do not exclusively affect the forelimb in humans, differences between forelimbs and hindlimbs are commonly found among patients. For example, digital abnormalities affect the hands more often than the feet among oral-facial-digital syndrome type I (40, 41) and Ellis–van Creveld (42) patients, which suggests that forelimbs and hindlimbs differ in their intrinsic sensitivity to ciliary dysfunction.

Table 1 Function-altering variants in P. harrisi are enriched for genes that cause skeletal ciliopathies in humans.

Sanger-validated examples of function-altering variants (PROVEAN score < –5) in P. harrisi. Cilia/Hh-related genes were found on the basis of functional enrichment for human syndromes. PCP (planar cell polarity) genes were selected according to literature evidence linking cilia and PCP. These variants are fixed in the population.

View this table:

The Galapagos cormorant ortholog of human OFD1 (mutated in oral-facial-digital syndrome 1) contains three predicted function-altering variants with a PROVEAN score less than –5: Arg325 → Cys (R325C), –6.913; Lys517 → Thr (K517T), –5.673; and Glu889 → Gly (E889G), –5.068 (fig. S6, A and B). Ofd1 knockout mice display polydactyly and shortened long bones (43). Also, a function-altering missense variant [Gln691 → Leu (Q691L), score –5.491] was found in IFT122, a component of the IFT complex that controls ciliogenesis and the ciliary localization of Shh pathway regulators (44). Null Ift122 mutants show severe limb and skeletal phenotypes in mice (45), and mutations in Ift122 have been associated with Sensenbrenner syndrome in humans, which is characterized by craniofacial, ectodermal, and skeletal abnormalities, including limb shortening (46). Strikingly, the mutated Gln in IFT122 is virtually invariant among eukaryotes as different as green algae, C. elegans, Drosophila, and vertebrates (fig. S6C). To directly test whether the nonsynonymous substitution in IFT122 affects protein function, we generated a C. elegans knock-in strain using CRISPR/Cas9 homology-directed genome editing (Fig. 3A). The edited strain carries the Galapagos cormorant missense variant at the corresponding orthologous position in daf-10 [Gln862 → Leu (Q862L)], the ortholog of IFT122 in C. elegans (47).

Fig. 3 The Galapagos cormorant variant IFT122 Q691L affects ciliary function in vivo.

(A) The daf-10 gene (IFT122 ortholog) was targeted with CRISPR/Cas9 homology-mediated repair in C. elegans to introduce a nonsynonymous substitution present exclusively in the Galapagos cormorant (IFT122 Q691L). The resulting edited knock-in strain contains the daf-10 Q862L substitution and 10 synonymous substitutions (17). Edited strains were sequenced with Sanger sequencing to confirm genotypes. gRNA, guide RNA. (B to D) Representative bordering behavior of N2 wild-type worms (B), daf-10(e1387) containing a premature stop codon Q892X (C), and daf-10 Q862L knock-in strain (D). (E) Quantification of bordering behavior in N2, daf-10(e1387), and two independently generated knock-in daf-10 Q862L strains (n = 3, t test). *P < 0.05; error bars denote SE.

In invertebrates, cilia do not mediate Shh signaling but are necessary for detecting external sensory inputs (48). The only ciliated cells in C. elegans are sensory neurons, and mutations in cilia components affect dispersal behavior, chemotaxis, and dauer formation (49). We tested the bordering behavior of worms (accumulation of animals on the thickest part of a bacterial lawn), which is known to be mediated by ciliated neurons (50). We found that daf-10(e1387) mutants carrying a premature stop (Q892X) displayed an increase in bordering behavior in a dispersal assay relative to the wild type [73% for daf-10(e1387) versus 30% for N2; P = 0.019, t test; Fig. 3, B, C, and E]. Two independently generated daf-10 Q862L knock-in lines phenocopied the effect of the premature stop allele (70% for line 1 and 69% for line 2 versus 30% for N2; P = 0.018 and 0.016, respectively; Fig. 3, B, D, and E; see movie S1). Daf-10(e1387) ciliary neurons fail to incorporate fluorescent dyes, like many other loss-of-function mutants in cilia components (49). In contrast, the daf-10 Q862L knock-in worms incorporated the DiO dye in the same manner as wild-type worms, which suggests that this allele is hypomorphic (fig. S7). Overall, these results indicate that the IFT122 Q691L missense variant present in the Galapagos cormorant can affect protein function in vivo in C. elegans.

The planar cell polarity (PCP) pathway exhibits a genetic link to cilia (38, 5153). We found function-altering variants in members of the PCP pathway in P. harrisi: Fat1 atypical cadherin (Fat1), Dachsous cadherin-related 1 (Dchs1), and Disheveled-1 (Dvl1) (Table 1). The Galapagos cormorant FAT1 contains two function-altering variants, Ser1717 → Leu and Tyr2462 → Cys (S1717L and Y2462C; Table 1). The mutated Ser and Tyr are conserved from zebrafish to humans (fig. S5D). Fat1 knockout mice show very selective defects in muscles of the upper body but not in posterior muscles (54). In addition, Dvl1 is mutated in humans with Robinow syndrome, characterized by limb shortening (55, 56).

Sanger sequencing of 20 Galapagos cormorant individuals from two different populations (Cabo Hammond and Cañones Sur) (57) revealed only homozygous carriers for all of the variants in Table 1, indicating that these variants are most likely fixed in the Galapagos cormorant. In summary, we found an overrepresentation of predicted function-altering variants in genes that, when mutated in humans and mice, cause skeletal ciliopathies and bone growth defects.

CUX1 is mutated in P. harrisi

To identify the most likely function-altering variants in P. harrisi, we applied a more stringent PROVEAN score threshold: –12.5 delta alignment score, five times the threshold used for discovery of human disease variants (34). This strategy narrowed our search to 23 proteins (0.16% of annotated proteins in P. harrisi) (table S5). We manually curated these 23 proteins and performed additional Sanger sequencing, reducing the list of proteins with confirmed or putative variants to 12 (table S5) (17). These variants were exclusively small deletions. Among these 12 proteins, two stood out from their known role in development: LGALS-3 and CUX1. LGALS-3 is affected by a 7–amino acid deletion in P. harrisi (PROVEAN score, –26.319). LGALS-3 (Galectin-3) is localized at the base of the primary cilium and is necessary for correct ciliogenesis in mice (58), but it has not been implicated in human ciliopathies. Moreover, LGALS-3 physically interacts with SUFU, an important regulator of mammalian Hh signaling (59), and knockout mice show pleiotropic defects in chondrocyte differentiation (60).

In addition, we found a 4–amino acid deletion (PROVEAN score, –15.704) in CUX1. CUX1 (cut-like homeobox 1), also known as CDP, is a highly conserved transcription factor with diverse roles in development. CUX1 contains four DNA binding domains: three CUT domains (CR1 to CR3) and one homeodomain (HD) (Fig. 4A) (61). The full-length isoform, which contains four DNA binding domains (CR1-3HD), acts exclusively as a transcriptional repressor and has rapid and unstable DNA binding dynamics. In contrast, smaller isoforms such as CR2-3HD and CR3HD can act as both repressors and activators of gene expression, and show slow and stable DNA binding dynamics in vitro (61). Although insect and bird wings evolved independently, it is noteworthy that cut, the Drosophila ortholog of Cux1, is necessary for the proper development of wings and flight muscles in flies (62). In chicken, Cux1 mRNA expression in the limb at embryonic stage 23 is restricted to the ectoderm bordering the apical ectodermal ridge (AER) (63). The AER is one of the key signaling centers that drive limb development. At later stages, Cux1 is expressed in the developing joints of both chicken (64) and mouse (fig. S8) and is also detected in chondrocytes in developing bones of mice (fig. S9). A function-altering variant in CUX1 is a strong candidate to contribute to loss of flight in P. harrisi, because adenovirus-mediated overexpression in the developing chicken wing of a form of CUX1 missing the Cut2 DNA binding domain results in severe wing truncation (63, 65). These truncations most strongly affect distal skeletal elements (digits, radius, and ulna). As already noted, in P. harrisi the radius and ulna are disproportionately small relative to the humerus (7).

Fig. 4 The Galapagos cormorant Cux1 is a transcriptional activation hypomorph.

(A) Western blot showing the expression of CUX1 isoforms in the developing wing of a mallard embryo (22 days). The most abundant band corresponds to the predicted size of the CR3HD CUX1 isoform. (B) Protein alignment showing the deleted AGSQ residues in the Galapagos cormorant CUX1 and their high degree of conservation among vertebrates. (C) Differential up-regulation of genes by CUX1-Ancestral and CUX1-Δ4aa variants in ATDC5 cells. Pooled stable lines carrying CR3HD CUX1-Ancestral or CUX1-Δ4aa variants were generated by lentiviral transduction and puromycin selection. Control cells were transduced with an empty vector. Gene expression levels were measured by RT-qPCR (n = 5 biological replicates, each comprising three technical replicates). (D) Luciferase-based assay to test the repression activity of CUX1 C-terminal domain lacking CR3 and HD domains. GAL4 DNA binding domain was fused to CUX1-Ancestral or CUX1-Δ4aa variants. Both constructs equally repressed a promoter containing UAS binding sites in COS-7 cells (n = 3 biological replicates, each comprising three technical replicates). In (C) and (D), error bars denote SE. *P < 0.05, **P < 0.01, ***P < 0.001 (ANOVA and Tukey HSD); black and red asterisks respectively denote significant difference of CUX1 variant versus control and CUX1-Ancestral variant versus CUX1-Δ4aa variant.

We Sanger-sequenced and confirmed the predicted Cux1 12–base pair (bp) deletion in P. harrisi. We also confirmed that this variant is fixed in the population and absent in the other cormorant species (fig. S10A). The 12-bp deletion in Cux1 removes four amino acids, Ala-Gly-Ser-Gln (AGSQ), immediately adjacent to the C-terminal end of the homeodomain (Fig. 4B). We refer to this variant as CUX1-Δ4aa. Alignment of CUX1 orthologs from available vertebrate genomes revealed that the four missing residues are extremely conserved among tetrapods (Fig. 4B). The deleted Ser is phosphorylated in human cells (66), but the consequences of this modification are unknown.

The Cux1 deletion does not include any of the predicted residues responsible for DNA contact and recognition (67), but given its close proximity to the homeodomain, we decided to test whether the DNA binding activity of CUX1 was affected. We chose to express the CR3HD isoform because Western blot analysis revealed that this was the most abundant CUX1 isoform expressed in the developing wing of mallard embryos (~50 kDa; Fig. 4A). We performed electrophoretic mobility shift assay (EMSA) with purified CR3HD CUX1-Ancestral and CUX1-Δ4aa protein variants (fig. S10B) as described (68) and found that DNA binding was not abolished in the deletion variant (fig. S10C). CUX1 is able to both directly repress and activate gene expression through its C-terminal tail (69, 70). We performed a luciferase reporter assay (69, 71) and found that both variants were equally capable of repressing the expression of a UAS/tk luciferase reporter (Fig. 4D). Thus, the Galapagos cormorant CUX1-Δ4aa variant appears to not affect DNA binding in vitro or the C-terminal repression activity in COS-7 cells.

CUX1 regulates the expression of cilia and PCP genes

We hypothesized that the Cux1 deletion variant is mechanistically related to the enrichment of function-altering variants in ciliopathy-related genes. This inference came from the fact that transgenic mice overexpressing the CR3HD-CUX1 isoform develop polycystic kidneys. Cilia in cystic epithelial cells from these animals were twice as long as the ones in control epithelial cells (72). Furthermore, the CR2CR3HD-CUX1 isoform has been shown to directly up-regulate the expression of RPGRIPL1, also known as FTM, a component of the cilia basal body that is involved in Shh signaling and mutated in human ciliopathies (73). Also, Cux1 knockout mice show deregulation of SHH expression in hair follicles (74).

To test whether Cux1 could globally regulate expression of cilia genes, we analyzed expression array data from human-derived Hs578t cells stably expressing a short hairpin RNA against Cux1, as well as cells overexpressing the human CR2CR3HD-CUX1 isoform (75). In concordance with the role of Cux1 as a regulator of cell growth and proliferation (76), genes significantly up- or down-regulated in both conditions (P < 0.05 and >10% change) were enriched for pathways such as “cell cycle” and “mitotic G1-G1/S phases” (P = 3.99 × 10−5 and 0.016, respectively; table S6). We also found enrichment for cilia-related categories such as “assembly of the primary cilium” and “intraflagellar transport” (P = 0.00012 and 0.0057, respectively; table S6). These results suggest that cilia-related genes are enriched among Cux1 targets.

To further test whether Cux1 can regulate ciliary genes in an appropriate cellular context, we generated ATDC5 stable lines expressing N-terminal His-tagged versions of CR3HD CUX1-Ancestral and CUX1-Δ4aa variants. ATDC5 is a well-characterized mouse chondrogenic cell line that largely recapitulates in vitro the differentiation landmarks of chondrocytes (77). We performed quantitative reverse transcription polymerase chain reaction (RT-qPCR) on a selected number of genes containing predicted strong function-altering variants in P. harrisi (Table 1) and showing detectable levels of expression in ATDC5 cells. In addition, we measured the expression of Ptch1, the receptor of the Hh pathway. Our experiments indicate that the CUX1-Ancestral variant transcriptionally up-regulated the expression of Ofd1 (factor of 1.7, P = 1.2 × 10−6; Fig. 4C) and Fat1 (factor of 1.8, P = 0.029; Fig. 4C) and down-regulated the expression of Ift122 (factor of 0.77, P = 0.0025; Fig. 4C) and Ptch1 (factor of 0.53, P = 0.014; Fig. 4C) relative to the control line. In contrast, neither Dync2h1 nor Wdr34 expression levels were changed by CUX1-Ancestral overexpression (Fig. 4C). These results suggest that cilia- and Hh-related genes are likely transcriptional targets of CUX1 in chondrocytes.

Impaired transcriptional activity of the Galapagos cormorant CUX1

The Galapagos cormorant CUX1 showed impaired transcriptional activity relative to the ancestral variant. Ofd1 was significantly up-regulated in CUX1-Δ4aa cells relative to control cells [factor of 1.2, P = 0.021, analysis of variance (ANOVA) and Tukey honest significant difference (HSD) test, Fig. 4C]; however, Ofd1 up-regulation was significantly reduced in CUX1-Δ4aa cells relative to CUX1-Ancestral cells (factor of 1.2 versus 1.7, P = 6 × 10−5; Fig. 4C). Similarly, although Fat1 was significantly up-regulated in CUX1-Ancestral cells relative to control cells (factor of 1.8, P = 0.029), Fat1 expression levels in CUX1-Δ4aa cells were not significantly different from control lines (factor of 1.2, P = 0.57; Fig. 4C). The difference between Fat1 up-regulation in CUX1-Δ4aa and CUX1-Δ4aa cells was not significant (factor of 1.8 versus 1.3, P = 0.17; Fig. 4C). In contrast, CUX1-Δ4aa cells significantly repressed both Ift122 (factor of 0.78, P = 0.0026) and Ptch1 (factor of 0.56, P = 0.024), and there were no significant differences between CUX1-Ancestral and CUX1-Δ4aa cells (factor of 0.78 versus 0.78 for Ift122, P = 0.99; factor of 0.53 versus 0.56 for Ptch1, P = 0.96; Fig. 4C). These results suggest that the 4–amino acid deletion in the Galapagos cormorant CUX1 affects its ability to activate but not to repress gene expression; they are also consistent with our luciferase reporter assays, which showed no effect on repression (Fig. 4D). It is notable that both the transcriptional activator (Cux1) and its target genes (Ofd1 and Fat1) exhibit function-altering variants in the Galapagos cormorant.

CR3HD-CUX1 promotes chondrogenesis

Chondrocytes are the main engine of bone growth. The growth of skeletal elements depends on the precise regulation of chondrocyte proliferation and hypertrophy. Mutations that affect cilia result in premature arrest of bone growth due to defects in Indian Hedgehog (IHH) signaling in chondrocytes (78). To test the role of CR3HD-CUX1 in chondrogenesis, we differentiated control, CUX1-Ancestral, and CUX1-Δ4aa ATDC5 cell lines and quantified the expression of Ihh and Sox9, two well-established markers of chondrocyte differentiation in vitro and in vivo (78). Overexpression of both CR3HD CUX1-Ancestral and CUX1-Δ4aa variants promoted chondrogenic differentiation of ATDC5 cells after 7 and 12 days of differentiation (Fig. 5A). However, the CUX1-Δ4aa variant was not as efficient as the ancestral variant, showing significant differences from CUX1-Ancestral in Ihh expression after 7 days of differentiation (~50% decrease, P = 5.9 × 10−4, ANOVA and Tukey HSD test; Fig. 5A) and in Sox9 expression after 12 days (~15% decrease, P = 1.6 × 10−2; Fig. 5A). These results suggest that the Galapagos cormorant CUX1 is probably not as effective as the ortholog from its flighted relatives in promoting chondrogenic differentiation, and that mutations in Cux1 may affect the dynamics of chondrogenesis. This observation is further supported by findings that CUX1 is expressed in the hypertrophic chondrocytes of developing bones in mice, and that the bones of Cux1 mutant mice are thin and flaky (79).

Fig. 5 CR3HD CUX1 promotes chondrogenesis.

(A) ATDC5 control cells and cells carrying CR3HD CUX1-Ancestral or CUX1-Δ4aa variants were differentiated into chondrocytes. Gene expression levels were measured by RT-qPCR (n = 4 biological replicates, each comprising three technical replicates) after 7 and 12 days. Error bars denote SE. *P < 0.05, **P < 0.01, ***P < 0.001 (ANOVA and Tukey HSD); black and red asterisks respectively denote significant difference of CUX1 variant versus control and CUX1-Ancestral variant versus CUX1-Δ4aa variant. (B) Proposed mechanism for the reduction of wing size in P. harrisi. Left: Normal functioning of IHH signaling pathway in vertebrates. CUX1 regulates the expression of cilia-related genes such as Ofd1 and promotes chondrogenesis. Right: State of the IHH pathway in P. harrisi. Proteins in red have predicted function-altering variants in P. harrisi. We propose that these variants would affect both cilia formation and functioning, leading to a reduction in IHH pathway activity. As a result, the pool of proliferating chondrocytes would decrease in wing bones and the number of hypertrophic chondrocytes would increase, resulting in impaired bone growth.

Possible evolutionary scenarios

Loss of flight has traditionally been attributed to relaxed selection. In this scenario, the first cormorants that inhabited the Galapagos Islands found a unique environment that lacked predators and provided food year-round, drastically reducing the need to migrate. However, we found no evidence for pseudogenization of developmental genes in P. harrisi (17) (tables S10 and S11). On the other hand, loss of flight in the Galapagos cormorant is thought to confer an advantage for diving by decreasing buoyancy via shorter wings and by indirectly allowing increased oxygen storage via larger body size (80). This advantage could make flightlessness a target of positive selection.

We evaluated whether any of our candidate genes (Table 1) showed signatures of positive selection in the Galapagos cormorant lineage by estimating the ratio of nonsynonymous to synonymous substitutions (ω = dN/dS). This is a very stringent test of selection because it assumes that all sites in a protein are evolving under the same selective pressure, a condition rarely met in highly conserved regulatory genes (81). We found that 3 of 11 tested genes showed signs of positive selection (ω > 1) in the Galapagos cormorant lineage compared to a background phylogeny of 35 taxa (Ofd1 ω = 1.92, Evc ω = 1.93, Gli2 ω = 1.10; table S8).

One of these three genes, Gli2, showed a statistically significant difference [ω = 1.10 (Galapagos branch) versus ω = 0.11 (background branch), P = 0.0024; table S8]. In contrast, Gli2 showed no sign of selection in the sister group of P. harrisi (P. auritus and P. brasilianus) [ω = 0.0001 (sister branch) versus ω = 0.11 (background branch), P = 0.46]. As a control, we also analyzed Gli3, the partially redundant paralog of Gli2, which also mediates Hh signaling but has no predicted function-altering variants in P. harrisi, and found no evidence for positive selection [ω = 0.04 (Galapagos branch) versus ω = 0.15 (background branch), P = 0.11; table S8]. These results suggest that selection toward flightlessness may be partially responsible for the phenotype of P. harrisi.


The study of evolution of flightlessness in the Rallidae family led to the hypothesis that flightlessness could be a fast-evolving heterochronic condition (10, 82). Heterochrony, the relative change in the rate or timing of developmental events among species, is thought to be an important factor contributing to macroevolutionary change (83). Yet virtually nothing is known about its genetic and molecular mechanisms.

Diverse myological, osteological, and developmental observations suggest that flightlessness in the Galapagos cormorant is caused by the retention into adulthood of juvenile characteristics affecting pectoral and forelimb development (a class of heterochrony known as paedomorphosis) (7). Here, we propose a genetic and molecular model that may explain this heterochronic condition, where the perturbations of cilia/Ihh signaling may be responsible for the reduction in growth of both keel and wings in the Galapagos cormorant (Fig. 5B). However, we cannot rule out a role of Cux1 in the AER. Of special interest is the gene Fat1, a target of Cux1 (Fig. 4D), which contains two putative function-altering variants (Table 1). Fat1−/− mouse mutants are viable and show selective defects in facial, pectoral, and shoulder muscles but not in hindlimb muscles (54). Thus, variants in Fat1 could explain the underdeveloped pectoral muscles of P. harrisi.

Although we have identified multiple variants that likely contribute to the flightless phenotype of P. harrisi, we cannot exclude the possibility that other genes and pathways may contribute to the phenotype, nor the contribution of noncoding regulatory variants (17). Further characterization of the individual and joint contributions of the variants found in this study will help us to reconstruct the chain of events leading to flightlessness and to genetically dissect macroevolutionary change. We hypothesize that mutations in cilia or functionally related genes could be responsible for limb and other skeletal heterochronic transformations in birds and diverse organisms, including humans.

Supplementary Materials

Materials and Methods

Figs. S1 to S10

Tables S1 to S13

Movie S1

References (84109)

References and Notes

  1. The onset times of several homologous developmental events, such as the appearance of immature feathers, dropping of egg tooth, and independence from parental care, occur progressively later in the Galapagos cormorant chick than in P. auritus.
  2. A recent phylogeny of the Phalacrocoracidae suggested classifying P. harrisi, P. auritus, and P. brasilianus as members of the monophyletic genera Nannopterum.
  3. See supplementary materials.
  4. The chicken CUX1-WT variant studied by Tavares et al. was inadvertently missing the Cut2 DNA binding domain, as revealed by the chicken genome generated subsequently to their study (Ensembl ID: ENSGALP00000002788). This isoform is likely hypomorphic, and its overexpression may exert a dominant negative effect.
Acknowledgments: Supported by the Jane Coffin Childs Memorial Fund for Medical Research (A.B.), the Howard Hughes Medical Institute (L.K.), the U.S. Geological Survey through the Wildlife Program of the Ecosystems Mission Area (A.M.R.), a Gruss-Lipper postdoctoral fellowship from the EGL Charitable Foundation (E.B.-D.), FONDECYT grant 11130305 (C.V.), and the National Institute of Arthritis and Musculoskeletal and Skin Diseases (K.L.). Field collection permits and export permits were by the Galapagos National Park, facilitated by the Charles Darwin Foundation. Any use of trade names is for descriptive purposes only and does not imply endorsement by the U.S. government. We thank C. Koo and M. Arbing at the UCLA-DOE Protein Expression Laboratory Core Facility for protein purification; S. Feng at the Broad Stem Cell Research Center High Throughput Sequencing Core for assistance; C. Cicero for granting access to cormorant specimens (134079 and 151575) at the Museum of Vertebrate Zoology at Berkeley; C. Valle (Universidad San Francisco de Quito) for advice; and K. Garrett (Natural History Museum of Los Angeles), K. Burns (San Diego Natural History Museum and San Diego State University Museum of Biodiversity), and R. Duerr (International Bird Rescue) for providing samples used in preliminary stages of this study. Author contributions: A.B. and L.K. conceived the study; A.B. coordinated the collection of samples, prepared libraries, assembled and annotated genomes, and performed analyses and experiments; E.B.-D. and A.B. performed the accelerated evolution analysis; P.C.W., A.M.R., C.V., and P.G.P. provided DNA or tissue samples; W.W. and A.B. carried out ATDC5 cell line experiments supervised by K.L.; A.B. and L.K. wrote the manuscript; and all authors discussed and agreed on the final version of the manuscript. All sequencing data from this study is available through the NCBI Sequence Read Archive under Bioproject accession number PRJNA327123. Alignments used for phylogenetic analysis and selection test are available at DRYAD doi:10.5061/dryad.8m2t5. The authors declare no competing financial interests.
View Abstract

Navigate This Article