Technical Comments

STAT Genes Found in C. elegans


Science  09 Jul 1999:
Vol. 285, Issue 5425, pp. 167
DOI: 10.1126/science.285.5425.167a

Signal transducer and activator of transcription (STAT) proteins appear to be evolutionarily conserved and to carry out crucial and irreplaceable functions in metazoan species (1). G. Ruvkun and O. Hobert stated, however, that in Caenorhabditis elegans only “a very distant partial STAT gene” was identified (2, p. 2039). Our analysis led us to a different conclusion. We searched the database using TBLASTN (3) with the human Stat5b protein sequence and found one apparent STAT-like sequence (p < 2.3 × 10^-20) in C. elegans. This sequence is on an unfinished cosmid, Y51H4 (EMBL accession number AL031823). The sequence, designated here as ce-stat-a, encodes a protein, CE-STAT-A, that is 29% identical and 40% similar to the human Stat5b protein and shares the same organization of functional domains. These domains include a predicted coiled-coil domain for protein-protein interaction and complex formation; an SH2 domain, with a distinctive sequence that most clearly defines the STAT proteins; a DNA-binding domain; and a tyrosine phosphorylation motif (Fig. 1). We also noticed a second sequence following ce-stat-a in the same cosmid, which is similar to fragments of ce-stat-a. Whether this second sequence represents a separate gene remains to be verified.
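As a rough illustration of how figures such as "29% identical and 40% similar" can be derived from a pairwise alignment, the sketch below counts identical and similar residue pairs over the aligned (non-gap) columns. The grouping of similar amino acids, the choice of denominator, and the toy sequences are assumptions made for this example; they are not the settings used in the analysis above, and BLAST itself counts a pair as similar when its substitution-matrix score is positive.

```python
# Hypothetical conservative-substitution groups (an assumption for illustration).
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ", "AG"]


def identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Gap columns ('-') are skipped; identical pairs also count as similar.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    identical = similar = columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # ignore columns that contain a gap
        columns += 1
        if a == b:
            identical += 1
            similar += 1
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1

    if columns == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / columns, 100.0 * similar / columns


# Toy alignment, not real STAT sequence.
percent_identity, percent_similarity = identity_and_similarity("MKV-LDE", "MRVALEQ")
print(f"{percent_identity:.1f}% identical, {percent_similarity:.1f}% similar")
```

Other conventions, such as dividing by the full length of the query or of the alignment including gaps, give different percentages for the same alignment.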

Figure 1

Coiled-coil domains (A), DNA-binding domains (B), SH2 domains (C), and potential tyrosine phosphorylation sites (D) of human Stat5b, CE-STAT-A, and CE-STAT-B were aligned using MegAlign from the LaserGene program. Relative starting and ending positions of the four functional domains were selected in reference to those in human Stat1, which are defined by crystal structure (6). CE-STAT-B does not contain a coiled-coil domain predicted with high probability. Amino acids that are identical in at least two sequences are shaded, and residues that are functionally conserved among all STATs are boxed.

We identified another STAT-like sequence from cosmid F58E6 (EMBL accession number Z70754) with a BLAST search of GenBank. We also obtained a cDNA sequence for this gene, which we designate as ce-stat-b (GenBank accession number AF164113) (4). The protein translation of this sequence is only 17% identical and 27% similar to human Stat5b. CE-STAT-B is nonetheless similar to mammalian STAT proteins in its DNA-binding domain, SH2 domain, and tyrosine phosphorylation motif (Fig. 1). CE-STAT-B also contains a C3H2C3-type zinc finger RING structure spanning amino acids 98 through 142 (CRI CQMHEGDMVR PCDCAGTMGD VHEECLTKWV NMSNKKTCEI CK). The significance of this motif remains to be verified experimentally.
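The quoted segment can be inspected directly. The short sketch below strips the spacing from the sequence as printed above, confirms that its length matches the stated span of residues 98 through 142, and lists the positions of every cysteine and histidine. It does not decide which of these residues coordinate zinc; the function and variable names are illustrative only.

```python
RING_START = 98  # first residue of the segment, as stated in the text
RING_SEGMENT = "CRI CQMHEGDMVR PCDCAGTMGD VHEECLTKWV NMSNKKTCEI CK".replace(" ", "")


def residue_positions(segment, start, residues="CH"):
    """Return (amino acid, protein position) for each residue of interest."""
    return [(aa, start + i) for i, aa in enumerate(segment) if aa in residues]


positions = residue_positions(RING_SEGMENT, RING_START)
segment_end = RING_START + len(RING_SEGMENT) - 1
cys = [pos for aa, pos in positions if aa == "C"]
his = [pos for aa, pos in positions if aa == "H"]

print(f"segment covers residues {RING_START}-{segment_end}")
print("Cys positions:", cys)
print("His positions:", his)
```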

We did not find Janus kinase (JAK) sequences in the C. elegans genome. It has been shown that many protein tyrosine kinases (PTKs) other than JAK kinases can phosphorylate STAT proteins (5). It is possible that there are other more ancient PTKs that can activate the STAT proteins, and that the JAK-STAT pathway evolved later than the original PTK-STAT pathways. The presence of STAT without JAK in the worm genome could lead us to a better understanding of the PTK-STAT signal transduction mechanism.

* Boyer Center of Molecular Medicine



Response: Liu et al. are correct in amending one point (among 589 such points) in our analysis (1) of developmental control genes present and missing in the C. elegans genome sequence. The stat-a gene they describe is indeed a member of the STAT gene superfamily. This gene is present on two overlapping yeast artificial chromosome (YAC) sequences (2) that were released as working drafts (and continue to be working drafts) in October 1998 and January 1999, respectively, after our analysis of the worm genomic sequence during the summer of 1998. The second STAT gene they describe is the degenerate STAT gene that we described (1). We agree that the presence of a STAT gene without a JAK kinase or a cytokine-like receptor implies that this STAT protein (as well as STAT proteins in animals that do have the other known pathway components) may couple to more than these canonical inputs. This conclusion is similar to that which we presented (1) for the vestigial C. elegans HEDGEHOG and TOLL signaling pathways.

More generally, we wish to reiterate the major proviso to our original analysis (1). We analyzed the C. elegans genome when it was estimated to be about 99% complete. We classified 414 transcription factor genes and 125 signal transduction genes in that census. On the basis of an estimated 1% of the genome not yet sequenced at the time of our analysis, we predict that approximately four more transcription factor genes and one more signaling gene from the families that are present in C. elegans will emerge when the dust settles. We also reported that C. elegans was missing 32 particular transcription factor genes and 18 signaling genes that, because they are present in Drosophila and chordates, would be expected to be present in C. elegans. For each of these 50 genes, there was a 1% chance that our assertion that it was missing was wrong. So the detection of a STAT gene in that 1% of the genome sequence is within the tolerance of the analysis.
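The arithmetic behind these estimates can be sketched as follows. The calculation assumes that genes are spread evenly over the genome and that the 50 "missing" calls are independent, each with a 1% chance of being wrong; these simplifications are ours for illustration and are not stated in the analysis itself.

```python
UNSEQUENCED = 0.01  # estimated fraction of the genome not yet sequenced

FOUND = {"transcription factor": 414, "signal transduction": 125}
MISSING_CALLS = 32 + 18  # genes reported as absent


def expected_additional(found_count, unsequenced=UNSEQUENCED):
    """Genes expected in the unsequenced fraction, scaled from the sequenced part."""
    return found_count * unsequenced / (1.0 - unsequenced)


for family, count in FOUND.items():
    print(f"{family}: about {expected_additional(count):.1f} more genes expected")

# Expected number of wrong "missing" calls, and the chance of at least one.
expected_wrong = MISSING_CALLS * UNSEQUENCED
p_at_least_one = 1.0 - (1.0 - UNSEQUENCED) ** MISSING_CALLS
print(f"expected wrong calls: {expected_wrong:.2f}")
print(f"chance that at least one call is wrong: {p_at_least_one:.3f}")
```

Under these assumptions, finding one of the 50 genes in the unsequenced 1% is not a surprising outcome.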

We would expect that perhaps one or two more missing genes will emerge when the genome sequence is complete. But for those pathways missing multiple components (like the Hedgehog and receptor/JAK/STAT pathways), it is unlikely that all the missing genes will appear. So the interpretation that the components that remain in C. elegans couple to other pathways is likely to pertain when the genome sequence is complete.
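The argument about pathways with several missing components can be made concrete with a small calculation. Assuming, for illustration only, that each missing component independently has a 1% chance of lying in the unsequenced fraction, the chance that all of them turn up falls off sharply with the number of components. The component counts below are hypothetical and are not taken from the analysis.

```python
P_SINGLE = 0.01  # chance that one "missing" gene lies in the unsequenced 1%


def p_all_appear(n_missing, p_single=P_SINGLE):
    """Probability that every one of n independent missing components turns up."""
    return p_single ** n_missing


# Hypothetical numbers of missing components in a pathway.
for n in (1, 2, 3):
    print(f"{n} missing component(s): probability all appear = {p_all_appear(n):.0e}")
```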

