Large-Scale Concatenation cDNA Sequencing

Wei Yu, Björn Andersson, Kim C. Worley, Donna M. Muzny, Yan Ding, Wen Liu, Jennifer Y. Ricafrente, Meredith A. Wentland, Greg Lennon¹, and Richard A. Gibbs²

Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030; ¹Human Genome Center, Lawrence Livermore National Laboratories, Livermore, California 94550

Abstract

A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7–2.0 kb was sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences with an efficiency similar to that of other shotgun libraries constructed from single DNA fragments (>20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥98% identity), and 16 clones generated nonexact matches (57%–97% identity) to known genes from human or other species. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching.
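The match categories and counts reported above can be checked with a short sketch. This is only an illustration, not the authors' analysis pipeline: the function names are hypothetical, and treating every hit below 98% identity as nonexact is an assumption, since the abstract gives the observed nonexact range (57%–97%) rather than a formal lower cutoff.

```python
from collections import Counter

# "Exact" threshold stated in the abstract (>= 98% identity).
EXACT_MIN_IDENTITY = 98.0


def classify_dna_match(identity_percent):
    """Bin a clone by the percent identity of its best DNA database hit.

    None means the search returned no hit. Assumption: any hit below
    the exact threshold counts as nonexact.
    """
    if identity_percent is None:
        return "no_match"
    if identity_percent >= EXACT_MIN_IDENTITY:
        return "exact"
    return "nonexact"


def tally(identities):
    """Count clones per category for a list of best-hit identities."""
    return Counter(classify_dna_match(p) for p in identities)


# Counts as reported in the abstract.
reported = {"no_match": 37, "exact": 12, "nonexact": 16}

total_clones = sum(reported.values())
matched_clones = reported["exact"] + reported["nonexact"]

# Eight of the matched clones had end sequences that found no similarity,
# so the remainder is the number of end sequences that did find a DNA match.
end_sequence_dna_matches = matched_clones - 8
```

With the reported figures, the three categories sum to the 65 analyzed clones, the matched clones number 28, and 20 of their end sequences would have found a DNA match on their own.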

[All 65 cDNA clone sequences described in this paper have been submitted to the GenBank data library under accession nos. U79240–U79304.]

Footnotes

  • 2 Corresponding author.

  • E-MAIL agibbs@bcm.tmc.edu; FAX (713) 798-5741.

    • Received November 15, 1996.
    • Accepted February 4, 1997.