Conserved Regulation of Cardiac Calcium Uptake by Peptides Encoded in Small Open Reading Frames

Science  06 Sep 2013:
Vol. 341, Issue 6150, pp. 1116-1120
DOI: 10.1126/science.1238802

smORFing for Calcium

Genomes contain thousands of small open reading frames (smORFs), short DNA sequences coding for peptides of less than 100 amino acids. Magny et al. (p. 1116, published on 22 August) describe two smORF-encoded peptides of less than 30 amino acids regulating calcium transport and, hence, regular heart contraction, in the fruit fly Drosophila. These peptides seem to have been conserved for more than 550 million years in a range of species from flies to humans, where they have been implicated in severe heart diseases. Such conservation suggests that smORFs might be an ancient part of our functional genome.


Small open reading frames (smORFs) are short DNA sequences that are able to encode small peptides of less than 100 amino acids. Study of these elements has been neglected despite thousands existing in our genomes. We and others previously showed that peptides as short as 11 amino acids are translated and provide essential functions during insect development. Here, we describe two peptides of less than 30 amino acids regulating calcium transport, and hence influencing regular muscle contraction, in the Drosophila heart. These peptides seem conserved for more than 550 million years in a range of species from flies to humans, in which they have been implicated in cardiac pathologies. Such conservation suggests that the mechanisms for heart regulation are ancient and that smORFs may be a fundamental genome component that should be studied systematically.

Thousands of small open reading frames (smORFs) exist in animal and plant genomes, yet their relevance and functionality have rarely been addressed because of their challenging properties (1). Detection of small peptides requires specific biochemical and bioinformatics techniques that are rarely used in the characterization of whole genomes. Thus, the number of translated smORFs and their biological functions are still unknown. We and others previously characterized a Drosophila gene, tarsal-less (tal/pri), encoding four smORFs as short as 11 amino acids that are translated and provide essential functions during development (2–4). These results demonstrate that extremely short smORFs can be functional and suggest, when extrapolated by bioinformatics and combined with the latest data from deep RNA sequencing, that hundreds of smORF-encoding transcripts exist in the fly genome (5). However, the tal gene is a single example and seems present only in arthropods (2, 3, 6), leaving the questions about the conservation and wider relevance of smORFs unanswered. The characterization of several smORFs displaying conservation of amino-acid sequence, translation, and biological function of the encoded peptides throughout evolution would be a powerful indicator that smORFs represent an important but neglected part of our genomes.

Using a bioinformatics method (5), we scrutinized the pool of polyadenylated, polysome-associated putative noncoding RNAs in which tal was initially included (7) and identified two potentially functional smORFs of 28 and 29 amino acids in the transcript encoded by the gene putative noncoding RNA 003 in 2L (pncr003:2L) (Fig. 1A) (6). As with tal, these smORFs have similar amino acid sequences to one another and follow strong Kozak sequences (fig. S1A). These peptides are highly hydrophobic, with a predicted alpha-helical secondary structure (fig. S1B).
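As a purely illustrative aside, the basic idea of scanning a transcript for smORFs can be sketched in a few lines of Python. This is not the bioinformatics method of (5), which also weighs features such as conservation and Kozak context; it is a minimal sketch that assumes only the definition used above (an ATG-initiated open reading frame encoding fewer than 100 amino acids) and searches the forward strand only. The function name `find_smorfs` and its length thresholds are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def find_smorfs(seq, min_aa=10, max_aa=99):
    """Return (start, end, peptide) tuples for forward-strand ORFs.

    An ORF here runs from the first ATG of an open stretch to the next
    in-frame stop codon; 'end' is the index just past the stop codon.
    min_aa and max_aa are illustrative thresholds (assumptions).
    """
    seq = seq.upper().replace("U", "T")
    hits = []
    for frame in range(3):
        start = None  # index of the first ATG since the last stop codon
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and CODON_TABLE.get(codon) == "*":
                # Translate from the ATG up to (not including) the stop;
                # codons with ambiguous bases are rendered as "X".
                peptide = "".join(
                    CODON_TABLE.get(seq[i:i + 3], "X")
                    for i in range(start, pos, 3)
                )
                if min_aa <= len(peptide) <= max_aa:
                    hits.append((start, pos + 3, peptide))
                start = None
    return hits


# Toy transcript: a 12-codon ORF (ATG + 11 alanine codons) in frame 2.
toy = "CC" + "ATG" + "GCT" * 11 + "TAA" + "GG"
print(find_smorfs(toy))
```

A scan of this kind returns every short ATG-to-stop stretch, most of which are expected to be noncoding by chance, which is why additional evidence such as sequence conservation and tagged-peptide expression is needed to argue that a given smORF is translated.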

Fig. 1 pncr003:2L peptide expression in muscles and heart.

(A) Annotated genomic region from the Flybase Genome Browser displaying pncr003:2L, nearby genes and deficiencies generated in this work, Df(2L)scl12 (green bar), and Df(2L)sclg6 (dark blue bar). In transheterozygous combination, these two deficiencies generate a homozygous deletion (Df(2L)scl, red bar), eliminating the pncr003:2L transcript and the CG13283 and CG13282 genes. (B to D) Expression of pncr003:2L mRNA in Drosophila muscles (arrowhead), in (B) stage 17 embryos; (C) larval somatic muscles (arrowhead) and heart (arrow), and (D) in the adult heart (arrow). (E to E’’) ORFA-GFP expression (green; arrowheads) surrounding the phalloidin-stained sarcomeres (magenta) in adult transversal heart fibers. (F to F’’) FH-ORFA peptides display a reticular pattern (green; arrowheads) in adult longitudinal heart fibers labeled with phalloidin (magenta). Blue, 4′,6-diamidino-2-phenylindole (DAPI)–stained nuclei.

We corroborated the structure and sequence of the pncr003:2L transcript by means of reverse transcription polymerase chain reaction (RT-PCR). Next, we studied pncr003:2L expression by means of in situ hybridization, which showed strong expression in somatic muscles and in the post-embryonic heart (Fig. 1, B to D, and fig. S2, A to F). We tested the in vivo translation and subcellular localization of these peptides by generating C-terminal green fluorescent protein (GFP)–tagged fusions within the pncr003:2L cDNA of each ORF and expressing these UAS-smORF-GFP fusions (fig. S8) in muscles with Dmef2-Gal4. We observed the GFP signal at the dyads (Fig. 1E and fig. S1, C and D) (8), the structures in which the sarco-endoplasmic reticulum (SER) membrane lies closest to both the plasma membrane and the sarcomeres, an arrangement that facilitates the conversion of the voltage signal into calcium release and muscle contraction (fig. S2G). Similar results were obtained with N-terminal Flag-hemagglutinin-tagged smORFs (UAS-FH-smORF) (Fig. 1F and figs. S1, E and F, and S8).

To obtain a null mutant for pncr003:2L, we generated two small overlapping deficiencies around the {WH}f02056 insertion (Fig. 1A). Together, these two deletions generate a synthetic homozygous deficiency [“Df(2L)scl”] that eliminates the pncr003:2L transcript and the CG13283 and CG13282 genes and represents our null condition for the pncr003:2L locus, as corroborated with RT-PCR and in situ hybridization (fig. S2, A to F).

Df(2L)scl mutants showed no behavioral or morphological muscle phenotype, even at the ultrastructural level (fig. S2, H to Q). We analyzed muscle function using time-lapse recordings of adult fly hearts (9), which provide an excellent read-out of muscle contraction (Fig. 2A). Df(2L)scl mutants showed significantly more arrhythmic cardiac contractions than those of wild-type flies (Fig. 2, A and B; tables S5 and S6; and movies S1 and S2). These effects are due to a requirement for pncr003:2L peptides and not the other genes removed in Df(2L)scl because the phenotype (i) is mimicked by RNA interference on pncr003:2L and (ii) is rescued by restoring expression of UAS-pncr003:2L or either of its encoded peptides in Df(2L)scl mutants, but is not rescued by smORFs carrying frameshifts in the peptide sequence (Fig. 2B, figs. S3A and S8, and tables S5 and S6). Correspondingly, intracellular electrophysiology recordings in cardiac cells show irregular action potentials (APs), involving “double” and occasionally failed APs in the nonrescued mutants (Fig. 2C, fig. S3C, and table S7).

Fig. 2 Role of pncr003:2L in cardiac muscle contraction.

(A) Kymographs comparing the pattern of heart contractions for wild-type and Df(2L)scl hearts. The mutant shows irregular periods, some being abnormally long (asterisk). A normal heart period is indicated (green). (B) Arrhythmicity index of pncr003:2L loss-of-function and rescue genotypes (left) and excess-of-function genotypes (right), normalized to age-matched wild-type controls (9). Columns represent mean, and error bars represent SE. (C) Sample traces of intracellular recordings from adult cardiomyocytes of wild-type (green); Df(2L)scl (red); and Df(2L)scl rescued by UAS-pncr003:2L (blue). Arrows indicate “double” action potentials. Arrowheads indicate failed action potentials. Gray dashed line indicates resting potentials. Sample peaks from each trace (underlined) appear magnified. (D) Ca2+ transients during heart contraction of Df(2L)scl and rescue genotypes (left) and gain-of-function genotypes (right) color-coded as in (B). The fluorescent Ca2+ sensor G-CaMP3 was used to visualize calcium levels. Y axis values are ratios of calcium-dependent fluorescence on its decay phase normalized to basal intensities and presented as percentages relative to wild-type controls; x axis values are percentage of time from the point of maximum transient amplitude.
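For readers unfamiliar with the measure in (B), the following Python sketch shows one way an arrhythmicity index normalized to controls could be computed. The exact definition used here is given in the supplementary methods (9) and is not reproduced in this text; the sketch assumes a definition common in the fly heart literature, namely the standard deviation of the heart periods of one fly divided by their median. The function names, the use of the population standard deviation, and the example values are all illustrative assumptions, not data from this study.

```python
from statistics import mean, median, pstdev


def arrhythmicity_index(periods):
    """Variability of heart periods for one fly (assumed definition).

    periods: heart periods in seconds, one value per beat.
    Returns the standard deviation of the periods divided by their median.
    """
    return pstdev(periods) / median(periods)


def normalized_to_control(test_flies, control_flies):
    """Express each test fly's index relative to the mean control index.

    Each argument is a list of per-fly period lists. A value of 1.0 means
    "as arrhythmic as the average control fly". The control mean is assumed
    to be nonzero.
    """
    control_mean = mean(arrhythmicity_index(p) for p in control_flies)
    return [arrhythmicity_index(p) / control_mean for p in test_flies]


# Made-up periods (seconds) for illustration only.
controls = [[0.50, 0.52, 0.49, 0.51], [0.48, 0.50, 0.51, 0.49]]
mutants = [[0.50, 0.90, 0.45, 0.52], [0.47, 0.49, 1.10, 0.50]]
print(normalized_to_control(mutants, controls))
```

Dividing by the median rather than the mean keeps a few abnormally long periods, such as the one marked with an asterisk in (A), from inflating the denominator.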

Because the smORF peptides localize in the dyads, we checked a possible physiological function related to Ca2+ trafficking during muscle contraction by visualizing intracellular Ca2+ (9). During heart contraction, the Ca2+ transients of pncr003:2L mutants showed significantly higher amplitudes and steeper decay than those of wild-type controls (Fig. 2D; fig. S3, D and E; and table S8). Overexpression of either peptide in a wild-type fly, but not of frameshifted smORFs, produced reciprocal effects on Ca2+ transients but similar arrhythmias to Df(2L)scl. Altogether, these results suggest (i) a primary role for the pncr003:2L gene during Ca2+ trafficking at the SER, which would be secondarily required for regular muscle contraction; and (ii) that such a role is mediated by the peptides encoded by the 28– and 29–amino acid smORFs.
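The axes described for Fig. 2D (fluorescence on the decay phase normalized to basal intensity, against percentage of time from the point of maximum amplitude) can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions, not the analysis pipeline of (9): it assumes a single transient per trace, a known basal fluorescence value, and evenly spaced samples, and it omits the final step of expressing values as percentages relative to wild-type controls. The function name `decay_profile` and the toy trace are hypothetical.

```python
def decay_profile(trace, basal):
    """Rescale the decay phase of one Ca2+ transient.

    trace: fluorescence samples for a single transient, evenly spaced in time.
    basal: basal (resting) fluorescence used for normalization.
    Returns (percent_time, F / basal) pairs, where percent_time runs from
    0 at the point of maximum amplitude to 100 at the last sample.
    """
    peak = max(range(len(trace)), key=trace.__getitem__)  # index of maximum
    decay = trace[peak:]
    steps = len(decay) - 1
    if steps < 1:
        raise ValueError("trace has no samples after its peak")
    return [(100.0 * i / steps, f / basal) for i, f in enumerate(decay)]


# Toy trace in arbitrary fluorescence units, for illustration only.
toy_trace = [10.0, 20.0, 40.0, 30.0, 20.0, 10.0]
for percent_time, ratio in decay_profile(toy_trace, basal=10.0):
    print(f"{percent_time:6.1f}% of decay: F/F0 = {ratio:.2f}")
```

Putting time on a percentage scale lets transients of different durations be overlaid, so that a steeper decay appears as a curve that falls below the control at the same relative time.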

We searched for conservation of these smORFs in other species by using Basic Local Alignment Search Tool (BLAST) and only identified them in other Drosophilids [with Ka/Ks scores of <0.2 supporting translation (10)]. Because the pncr003:2L peptides have a predicted helical structure, we searched for possible structural homologs (9) and retrieved the 30–amino acid human sarcolipin (Sln) peptide (Fig. 3A and tables S1 and S2) (11). However, the Sln and pncr003:2L peptides display noticeable differences in their amino acid sequences (Fig. 3B). If they were true homologs, peptides with intermediate sequences should exist in the stem lineages to both flies and humans. We devised a bioinformatics protocol (9) to identify possible pncr003:2L homologs in arthropods (Fig. 3B and fig. S4) plus nonannotated homologs of sln and its longer paralogue phospholamban (pln) (Fig. 3B and fig. S4) (12), until basal arthropod smORFs identified basal vertebrate homologs with the expected intermediate amino acid changes (fig. S4, A to C). Supporting their putative homology, we found that (i) antibodies to sarcolipin recognize the pncr003:2L peptides (Fig. 3, C and E, and fig. S5, A and B), and (ii) threading the pncr003:2L amino acid sequences on the Pln three-dimensional (3D) structure (13) also produces a compatible structure (Fig. 3D and tables S1 and S2).

Fig. 3 Putative homology of sequence and structure between human and Drosophila peptides.

(A) Secondary structure of the conserved domain [underlined in (B)] of Sarcolipin (top) and Drosophila pncr003:2L ORFA peptide (bottom). Blue, nitrogen atoms; red, oxygen atoms. (B) Phylogenetic tree of vertebrate and arthropod (pncr003:2L, labeled “Sarcolamban”) peptides. Asterisks indicate sequences identified in this study (supplementary data file S1). Putative ancestral consensus sequences (left) and further analysis (fig. S4) (9) suggest that the two vertebrate peptides arose from a duplication of a single ancestor that also diverged independently into the different arthropod Sarcolamban peptides. Analysis of RNA (cDNA) sequences (arrows) indicates that all peptides arise from single smORFs (red boxes) uninterrupted by exons, suggesting that ancestral peptides were also encoded by smORFs. (C) Western blots from Drosophila S2 cells showing that the antibody to human Sarcolipin (left lanes) recognizes the Drosophila FH-tagged Sarcolamban 18-kD peptides SclA and SclB, but not the 10-kD FH-tag alone. Right lanes show positive controls, with antibody to HA recognizing all peptides. (D) A compatible structure for Sarcolamban-A (magenta) is obtained by threading it onto the C-terminal domain of vertebrate Phospholamban (green). (E to E’’’) Drosophila FH-SclA peptides (arrowheads) surrounding the sarcomeres (red) are recognized by antibodies to Sarcolipin (green) and Flag (blue) in larval somatic muscles.

A phylogenetic tree of all these peptides suggests that Sln and Pln emerged from a gene duplication in vertebrates, whereas an independent and more recent duplication in flies gave rise to pncr003:2L ORFA and ORFB peptides. The tree, sequence alignments, and further bioinformatics analysis (fig. S4, supplementary data file S1, and tables S1 and S2) (9) are altogether compatible with a single origin for the Sln, Pln, and pncr003:2L peptides from an ancestral peptide-encoding smORF of ~30 amino acids (Fig. 3B and fig. S4B). We suggest that pncr003:2L and its arthropod homologs should be renamed sarcolamban (scl) in order to reflect their similarity and probable homology to vertebrate sln and pln.

Conservation of smORFs across such an evolutionary distance (>550 million years of divergence) has not been described; therefore, we scrutinized their functional homology. Sln and Pln regulate Ca2+ traffic in mammal muscles by dampening the activity of the Sarco-endoplasmic Reticulum Ca2+ adenosine triphosphatase (SERCA), whose function is to retrieve Ca2+ from the cytoplasm back into the SER, leading to muscle relaxation (fig. S2G) (14). The effects of removing sln upon the vertebrate muscle Ca2+ transients are remarkably similar to the effects we observed in Df(2L)scl mutants (Fig. 2D) (15). Furthermore, abnormal levels of Sln expression have been related to human heart arrhythmias (16), and Sln and Pln have been shown to bind SERCA (17). In flies, the Scl peptides colocalize with Drosophila SERCA (Ca-P60A) (Fig. 4A and fig. S5C) and coimmunoprecipitate with it (Fig. 4D). Furthermore, the arrhythmia and abnormal transients of Df(2L)scl mutants are corrected by reducing the function of Ca-P60A (Fig. 2, B to D), a genetic interaction that is consistent with a down-regulating role of Scl upon SERCA activity (18). Last, threading the sequence of Ca-P60A onto the 3D structure of vertebrate SERCA produces a compatible structure that seems able to dock Scl similarly to Sln and Pln binding to SERCA (Fig. 4, B and C; fig. S5, D and E; and tables S3 and S4) (17).

Fig. 4 Sarcolamban interacts with Ca-P60A SERCA.

(A to A’’’) Colocalization of sarcolamban FH-SclA peptides (green) and Ca-P60A SERCA (red) in the SER and dyads (arrowheads) surrounding the adult heart sarcomeres (blue, phalloidin). (B and C) Interaction between the Drosophila SclA (magenta) and Ca-P60A, modeled from vertebrate SERCA1a in the EI conformation (9). SclA docks onto Ca-P60A similarly to Phospholamban (yellow) and Sarcolipin (fig. S5, D and E) onto human SERCA1a (C). Peptide C-termini are down. (D) FH-tagged Drosophila SclA and SclB and the human Sln and Pln peptides pull down the 100-kD Drosophila Ca-P60A (revealed with antibody to Ca-P60) from transfected S2 cells. Negative control lanes with Flag-only peptides or beads without antibodies (“ONLY beads”) do not show similar Ca-P60A signal. (E to E’’’) Human Sln peptides (green; arrowheads) expressed in the Drosophila adult heart surround the sarcomeres (red; labeled with antibody to Tropomyosin1). Blue, DAPI-stained nuclei (arrow).

Our studies suggest that Sln and Pln can bind fly Ca-P60A and can mimic Scl function. Modeling suggests that fly and vertebrate peptides could bind each other’s SERCA (tables S3 and S4), and indeed human peptides can pull down fly Ca-P60A (Fig. 4D). Sln and Pln expressed in fly muscles and cultured cells localize similarly to Scl and Ca-P60A (Fig. 4E and figs. S5F and S6) and produce arrhythmias and Ca2+ transients similar to those produced by overexpressing fly Scl peptides (Fig. 2, B and D). Furthermore, expression of human Pln in Df(2L)scl flies can rescue the mutant Ca2+ transients toward wild type, and the strong arrhythmia phenotype of ectopic Pln is itself reduced (Fig. 2, B and D). The human peptide overexpression and rescue effects do not completely reproduce those observed with fly peptides, and this suggests that although this family of peptides may share a regulatory function on Ca2+ pumps, each seems finely tuned to its own species-specific SERCA regulation.

Altogether, our results suggest that this family of peptides may represent an ancient system for the regulation of Ca2+ traffic, whose alteration can result in irregular muscle contractions. We propose that the Drosophila sarcolamban (scl) gene, previously annotated as the long noncoding RNA pncr003:2L, actually encodes two functional smORFs of 28 and 29 amino acids that are translated into bioactive peptides. The analysis of related amino acid sequences across multiple species is compatible with a conservation of these peptides and their putative molecular structure from flies to vertebrates, correlated with the conservation of their biological role in regulating Ca2+ uptake at the SER. We speculate that this remarkable conservation, together with previous reports on the tal gene (2–4), might indicate that smORFs can reveal both sequence conservation and important biological functions. Bioinformatics predictions (1, 5) and recent ribosomal profiling data from vertebrates (19) suggest that translated smORFs may be abundant. We believe that smORFs cannot be dismissed as irrelevant, but that their functionality should be considered whenever encountered.

Supplementary Materials

Materials and Methods

Figs. S1 to S8

Tables S1 to S9

References (20–34)

Movies S1 and S2

Supplementary data file S1

References and Notes

  1. Material and methods are available as supplementary materials on Science Online.
  2. Acknowledgments: We thank Rose Phillips, Roger Phillips, and J. Thorpe for technical support; M. Ramaswami for the antibody to Ca-P60A; and F. Casares, M. Baylies, I. Galindo, C. Alonso, and laboratory members for manuscript comments. E.M. was supported by Conacyt, F.P. was supported by a Daphne Jackson Fellowship and the UK Medical Research Council, and J.N. was supported by a Royal Society University Fellowship. Otherwise, this work was funded by a Wellcome Trust Fellowship (ref 087516) awarded to J.P.C. The GenBank accession number for Drosophila Scl sequences is NR_001662.