Table 1

Summary of construction of the Drosophila Gene Collection (DGC). RNA for the various libraries was obtained from the following sources: LD, 0- to 22-hour embryos; GM, ovaries, stages 1 to 6 of oogenesis; HL and GH, adult head; LP, mixed larval and early pupal stages; and SD, Schneider L2 cell line. Sequence reads were quality trimmed before submission to GenBank essentially as described in (14); we estimate the accuracy of the high-quality region to be better than 99% and that of the additional bases included in the total submission to be 97%. A list of the clones that make up the current DGC can be found

| Library name | LD | GM | HL | GH | LP | SD | Totals |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Number of 5′ ESTs sequenced | 32,714 | 6,201 | 2,981 | 21,212 | 8,935 | 6,962 | 79,636 |
| Average submitted length in bp | 554 | 505 | 508 | 577 | 584 | 544 | 546 |
| Average high-quality length in bp | 457 | 408 | 405 | 478 | 488 | 448 | 447 |
| Estimated percentage of clones extending 5′ of AUG | 83.4 | 78.6 | 46.0 | 78.9 | 79.4 | 81.0 | 80.0 |
| Percentage of clones longer than the corresponding clone in the GenBank test set | 33.6 | 36.9 | 21.7 | 35.7 | 33.4 | 40.5 | 34.5 |
| Number of 3′ ESTs sequenced | 3,813 | 714 | 204 | 3,111 | 381 | 857 | 9,080 |
| Number of clones selected for DGC | 2,594 | 467 | 137 | 1,867 | 209 | 575 | 5,849 |
Average size in kb of cDNAs in DGC2.