Of Mice, Men, and the Genome
- 1Baylor College of Medicine, Houston, Texas 77030, USA; 2Texas Children's Cancer Center, Department of Hematology-Oncology, Houston, Texas 77030, USA
Master, I marvel how the fishes live in the sea. (W. Shakespeare, Pericles Act II, Sc. I, V. 28)
In this issue, Carninci et al. (2000) and the RIKEN Genome Exploration Research Group introduce one of the “marvels” of nature that many have predicted but that we are just beginning to grasp in full scope. This is the first large-scale publication of a mammalian “transcriptosome,” or the genome as it is expressed, in which the sequenced clones are specifically enriched for full-length sequences. This massive effort represents the accumulated data from the RIKEN-MEI 4.02 release of May 15, 2000 (http://genome.rtc.riken.go.jp/), including 914,452 3′-end murine sequences representing 126,693 nonredundant clones found in >80 cDNA libraries generated from histologically and developmentally diverse tissues.
Perhaps even more important than the public identification of so many new mouse cDNAs is the clear explanation of the methods used to construct the Cap-Trapped, full-length, normalized, and subtracted libraries. These methods were responsible for a sequence redundancy level of <2% and a rate of novel sequence discovery that ran as high as 20%–39% for the most heavily subtracted sublibraries. It is these two factors that made it economically feasible to consider sequencing such a large part of the transcriptosome. As pointed out by the authors, these methods are likely to be quite useful to the other ongoing or proposed cDNA discovery and sequencing efforts, including the recently announced Mammalian Gene Collection (MGC, http://www.ncbi.nlm.nih.gov/MGC/index.html). Of the transcripts obtained by these methods, 88% appear to contain at least one probable ATG start site and evidence of full coding length, and …











