Web Fig. 1


The Sequence of the Human Genome
J. Craig Venter,* Mark D. Adams, Eugene W. Myers, Peter W. Li, Richard J. Mural, Granger G. Sutton, Hamilton O. Smith, Mark Yandell, Cheryl A. Evans, Robert A. Holt, Jeannine D. Gocayne, Peter Amanatides, Richard M. Ballew, Daniel H. Huson, Jennifer Russo Wortman, Qing Zhang, Chinnappa D. Kodira, Xiangqun H. Zheng, Lin Chen, Marian Skupski, Gangadharan Subramanian, Paul D. Thomas, Jinghui Zhang, George L. Gabor Miklos, Catherine Nelson, Samuel Broder, Andrew G. Clark, Joe Nadeau, Victor A. McKusick, Norton Zinder, Arnold J. Levine, Richard J. Roberts, Mel Simon, Carolyn Slayman, Michael Hunkapiller, Randall Bolanos, Arthur Delcher, Ian Dew, Daniel Fasulo, Michael Flanigan, Liliana Florea, Aaron Halpern, Sridhar Hannenhalli, Saul Kravitz, Samuel Levy, Clark Mobarry, Knut Reinert, Karin Remington, Jane Abu-Threideh, Ellen Beasley, Kendra Biddick, Vivien Bonazzi, Rhonda Brandon, Michele Cargill, Ishwar Chandramouliswaran, Rosane Charlab, Kabir Chaturvedi, Zuoming Deng, Valentina Di Francesco, Patrick Dunn, Karen Eilbeck, Carlos Evangelista, Andrei E. Gabrielian, Weiniu Gan, Wangmao Ge, Fangcheng Gong, Zhiping Gu, Ping Guan, Thomas J. Heiman, Maureen E. Higgins, Rui-Ru Ji, Zhaoxi Ke, Karen A. Ketchum, Zhongwu Lai, Yiding Lei, Zhenya Li, Jiayin Li, Yong Liang, Xiaoying Lin, Fu Lu, Gennady V. Merkulov, Natalia Milshina, Helen M. Moore, Ashwinikumar K. Naik, Vaibhav A. Narayan, Beena Neelam, Deborah Nusskern, Douglas B. Rusch, Steven Salzberg, Wei Shao, Bixiong Shue, Jingtao Sun, Zhen Yuan Wang, Aihui Wang, Xin Wang, Jian Wang, Ming-Hui Wei, Ron Wides, Chunlin Xiao, Chunhua Yan, Alison Yao, Jane Ye, Ming Zhan, Weiqing Zhang, Hongyu Zhang, Qi Zhao, Liansheng Zheng, Fei Zhong, Wenyan Zhong, Shiaoping C. Zhu, Shaying Zhao, Dennis Gilbert, Suzanna Baumhueter, Gene Spier, Christine Carter, Anibal Cravchik, Trevor Woodage, Feroze Ali, Huijin An, Aderonke Awe, Danita Baldwin, Holly Baden, Mary Barnstead, Ian Barrow, Karen Beeson, Dana Busam, Amy Carver, Angela Center, Ming Lai Cheng, Liz Curry, Steve Danaher, Lionel Davenport, Raymond Desilets, Susanne Dietz, Kristina Dodson, Lisa Doup, Steven Ferriera, Neha Garg, Andres Gluecksmann, Brit Hart, Jason Haynes, Charles Haynes, Cheryl Heiner, Suzanne Hladun, Damon Hostin, Jarrett Houck, Timothy Howland, Chinyere Ibegwam, Jeffery Johnson, Francis Kalush, Lesley Kline, Shashi Koduru, Amy Love, Felecia Mann, David May, Steven McCawley, Tina McIntosh, Ivy McMullen, Mee Moy, Linda Moy, Brian Murphy, Keith Nelson, Cynthia Pfannkoch, Eric Pratts, Vinita Puri, Hina Qureshi, Matthew Reardon, Robert Rodriguez, Yu-Hui Rogers, Deanna Romblad, Bob Ruhfel, Richard Scott, Cynthia Sitter, Michelle Smallwood, Erin Stewart, Renee Strong, Ellen Suh, Reginald Thomas, Ni Ni Tint, Sukyee Tse, Claire Vech, Gary Wang, Jeremy Wetter, Sherita Williams, Monica Williams, Sandra Windsor, Emily Winn-Deen, Keriellen Wolfe, Jayshree Zaveri, Karena Zaveri, Josep F. Abril, Roderic Guigó, Michael J. Campbell, Kimmen V. Sjolander, Brian Karlak, Anish Kejariwal, Huaiyu Mi, Betty Lazareva, Thomas Hatton, Apurva Narechania, Karen Diemer, Anushya Muruganujan, Nan Guo, Shinji Sato, Vineet Bafna, Sorin Istrail, Ross Lippert, Russell Schwartz, Brian Walenz, Shibu Yooseph, David Allen, Anand Basu, James Baxendale, Louis Blick, Marcelo Caminha, John Carnes-Stine, Parris Caulk, Yen-Hui Chiang, My Coyne, Carl Dahlke, Anne Deslattes Mays, Maria Dombroski, Michael Donnelly, Dale Ely, Shiva Esparham, Carl Fosler, Harold Gire, Stephen Glanowski, Kenneth Glasser, Anna Glodek, Mark Gorokhov, Ken Graham, Barry Gropman, Michael Harris, Jeremy Heil, Scott Henderson, Jeffrey Hoover, Donald Jennings, Catherine Jordan, James Jordan, John Kasha, Leonid Kagan, Cheryl Kraft, Alexander Levitsky, Mark Lewis, Xiangjun Liu, John Lopez, Daniel Ma, William Majoros, Joe McDaniel, Sean Murphy, Matthew Newman, Trung Nguyen, Ngoc Nguyen, Marc Nodell, Sue Pan, Jim Peck, William Rowe, Robert Sanders, John Scott, Michael Simpson, Thomas Smith, Arlan Sprague, Timothy Stockwell, Russell Turner, Eli Venter, Mei Wang, Meiyuan Wen, David Wu, Mitchell Wu, Ashley Xia, Ali Zandieh, Xiaohong Zhu

Web Fig. 1: Annotation of the Celera Human Genome Assembly

Initial annotations of the Celera compartmentalized shotgun assembly (CSA) of the human genome, including transcripts, sequence characteristics, polymorphisms, and molecular markers, are presented. Each track of the figure is divided into three areas: forward-strand transcripts, sequence analysis, and reverse-strand transcripts (from top to bottom, respectively). The end of each chromosome tier is depicted as white space, as it is not yet clear that the CSA includes the telomeres. The genome sequence is displayed on a nucleotide scale of approximately 600 kbp/cm. Molecular genetic markers are shown above the nucleotide scale at the top of each track and are derived from the Marshfield map (http://research.marshfieldclinic.org/genetics/Map_Markers/maps/IndexMapFrames.html). Genes are adjacent to the sequence analysis tiers. They are color-coded by the algorithm used to define the transcript structure (see figure key) and are given a minimum length of 20 kb for display purposes. The structure of transcripts with two or more exons is displayed in one of two expanded transcript tiers at 120 kb/cm resolution above or below the genes for forward- and reverse-strand transcripts, respectively. Exons are depicted as black boxes and intronic regions are color-coded for transcripts assigned to the 14 largest Gene Ontology (GO, http://www.geneontology.org) categories. Single-exon transcripts are color-coded by GO classification and are displayed in a tier between the unexpanded transcripts and the sequence analysis tiers. Transcripts predicted by Celera's annotation algorithm (Otto) that correspond to RefSeq transcripts (http://www.ncbi.nlm.nih.gov/LocusLink/refseq.html) are assigned HUGO gene symbols (http://www.gene.ucl.ac.uk/nomenclature) if the RefSeq transcripts are associated with HUGO symbols by LocusLink (http://www.ncbi.nlm.nih.gov/LocusLink) and if the transcripts are longer than 25 kbp (to prevent overlap of gene symbols).
There are three sequence analyses in the middle section of the tracks: G+C content, CpG islands, and SNP density. G+C content is depicted in a nonlinear scale described in the legend. A black box indicates the position of CpG islands. SNPs were identified by comparison of the Celera sequence with a genome assembly available at http://genome.ucsc.edu/. The range of SNP density is depicted above the color gradient in the legend. The natural log of the SNP density is used to color-code the SNP density analysis tier. Gaps within scaffolds are visible as white space in the G+C content tier if the gap is sufficiently large. Gaps between scaffolds are assigned a length of 2 kbp. Scaffold order along the chromosomes was determined by mate-pair information and alignment of scaffold sequence to the GeneMap'99 STS map (http://www.ncbi.nlm.nih.gov/genemap99/) and the Washington University BAC fingerprint map (http://www.genome.wustl.edu/gsc/mapping/). The centromere is depicted as a blue line crossing the annotation tiers and its position is approximated by the transition from p to q arms along the genome sequence, except for acrocentric chromosomes, for which the centromere is placed at the beginning of the sequence analysis tiers.
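
The display conventions stated in the caption can be illustrated with a minimal Python sketch. The function names are hypothetical, and two details are assumptions that the caption does not specify: that a short gene is simply drawn at the minimum length, and that SNP density is counted per kbp before the natural log is taken. This is not the code used to produce the figure.

```python
import math

KBP_PER_CM = 600               # main nucleotide scale (approximate, from the caption)
MIN_GENE_DISPLAY_BP = 20_000   # minimum drawn length of a gene
INTER_SCAFFOLD_GAP_BP = 2_000  # length assigned to gaps between scaffolds


def bp_to_cm(bp, kbp_per_cm=KBP_PER_CM):
    """Convert a length in base pairs to centimetres on the page."""
    return bp / (kbp_per_cm * 1000)


def gene_display_length(start, end):
    """Drawn length of a gene: its real length, but never below the minimum."""
    return max(end - start, MIN_GENE_DISPLAY_BP)


def scaffold_offsets(lengths, gap=INTER_SCAFFOLD_GAP_BP):
    """Start coordinate of each ordered scaffold, with a fixed gap between neighbours."""
    offsets = []
    pos = 0
    for length in lengths:
        offsets.append(pos)
        pos += length + gap
    return offsets


def log_snp_density(snp_count, window_bp):
    """Natural log of SNPs per kbp (assumed unit); None when no SNPs are present."""
    if snp_count == 0:
        return None  # ln(0) is undefined, so leave the window uncoloured
    return math.log(snp_count / (window_bp / 1000))
```

At the 600 kbp/cm scale, `bp_to_cm(600_000)` corresponds to 1 cm of track, and a 5 kb gene would be drawn at the 20 kb minimum.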

The figure was generated with "gff2ps" (http://www1.imim.es/software/gfftools/GFF2PS.html), a genome annotation tool that converts General Feature Formatted records (http://www.sanger.ac.uk/Software/formats/GFF/) to a PostScript output [J. F. Abril, R. Guigó, Bioinformatics 16, 743 (2000)]. [PDF of figure caption text]
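
For readers unfamiliar with the input format, the following is a minimal sketch of reading General Feature Format records and separating them by strand, mirroring the forward- and reverse-strand tiers of the figure. It assumes the common tab-delimited layout (seqname, source, feature, start, end, score, strand, frame, optional group field). The names `GffRecord`, `parse_gff_line`, and `split_by_strand` are hypothetical and are not part of gff2ps.

```python
from collections import namedtuple

GffRecord = namedtuple(
    "GffRecord",
    "seqname source feature start end score strand frame group",
)


def parse_gff_line(line):
    """Parse one tab-delimited GFF line; return None for blank or comment lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) < 8:
        raise ValueError("expected at least 8 tab-separated fields")
    seqname, source, feature, start, end, score, strand, frame = fields[:8]
    group = fields[8] if len(fields) > 8 else ""
    return GffRecord(
        seqname, source, feature,
        int(start), int(end),
        None if score == "." else float(score),  # "." marks a missing score
        strand, frame, group,
    )


def split_by_strand(records):
    """Separate records into forward-strand and reverse-strand lists."""
    forward = [r for r in records if r.strand == "+"]
    reverse = [r for r in records if r.strand == "-"]
    return forward, reverse
```

A made-up record such as `"chr1\totto\texon\t1000\t1200\t.\t+\t.\tgene1"` parses to a forward-strand exon spanning positions 1000 to 1200 with no score.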

Downloadable PDFs of Chromosome Maps
Chromosome 1 (889K)    Chromosome 9 (555K)     Chromosome 17 (611K)
Chromosome 2 (788K)    Chromosome 10 (557K)    Chromosome 18 (412K)
Chromosome 3 (713K)    Chromosome 11 (672K)    Chromosome 19 (620K)
Chromosome 4 (598K)    Chromosome 12 (648K)    Chromosome 20 (460K)
Chromosome 5 (647K)    Chromosome 13 (444K)    Chromosome 21 (354K)
Chromosome 6 (659K)    Chromosome 14 (499K)    Chromosome 22 (440K)
Chromosome 7 (608K)    Chromosome 15 (485K)    Chromosome X (540K)
Chromosome 8 (547K)    Chromosome 16 (504K)    Chromosome Y (296K)
Legend Key
PDF of Legend Key