Development and Application of a Salmonid EST Database and cDNA Microarray: Data Mining and Interspecific Hybridization Characteristics

Table 1. Salmonid EST Project Summary Statistics^a




                                                Atlantic salmon^b  Rainbow trout^c    Chinook salmon^d   Sockeye salmon^e   Lake whitefish^f
Number of good sequences^g                      61,819^h           14,544             1317               1243               1465
Average trimmed EST length (bp)^i               563                484                492                456                486
Number of contigs^j                             11,560             2370               136                291                138
Number of singletons                            17,150             6611               949                229                1038
Number of putative transcripts                  28,710             8981               1085               520                1176
Max. assembled sequence size (no. of ESTs)      252                93                 10                 21                 28
Average assembled sequence size (no. of ESTs)   2.15               1.61               1.21               2.39               1.24
Number of assembled ESTs with^k
    Significant BLASTX hits                     10,511             3562               239                337                253
    Significant BLASTN hits                     13,459             4337               462                331                466
    No significant BLAST hits                   11,802             3667               566                118                663
Percentage with no significant BLAST hits^k     41.1               40.8               52.2               22.7               56.4
Number of contigs containing^i
    2 ESTs                                      5606               1360               90                 108                97
    3 ESTs                                      2322               454                26                 96                 21
    4-5 ESTs                                    2030               350                12                 48                 9
    6-10 ESTs                                   1149               145                8                  32                 8
    11-20 ESTs                                  331                41                 0                  6                  1
    21-30 ESTs                                  67                 12                 0                  1                  2
    31-50 ESTs                                  36                 4                  0                  0                  0
    >50 ESTs                                    19                 4                  0                  0                  0
  • a Assembled from the March 3, 2003, version of the GRASP EST database using PHRAP. Results of CAP3 and stackPACK assemblies of the March 3, 2003 GRASP EST database are available at http://web.uvic.ca/cbr/grasp

  • b Salmo salar

  • c Oncorhynchus mykiss

  • d Oncorhynchus tshawytscha

  • e Oncorhynchus nerka

  • f Coregonus clupeaformis

  • g A sequence is considered “good” if its trimmed PHRED20 length is at least 100 bases.

  • h Includes 55,082 good forward (3′) and 6737 good reverse (5′) reads. Of 5606 good reverse reads from clones with good forward reads, 2268 overlap/cluster with the corresponding forward reads.

  • i Vector, low-quality, and contaminating bacterial sequences are trimmed.

  • j A contig (contiguous sequence) contains two or more ESTs.

  • k Threshold for BLASTN and BLASTX significance: 10⁻⁵
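
The derived rows of Table 1 can be cross-checked from the primary counts. The sketch below is illustrative only: the variable and function names are hypothetical, and it assumes (the table does not state this explicitly) that putative transcripts are contigs plus singletons, that the average assembled sequence size is good sequences divided by putative transcripts, and that the percentage with no significant BLAST hits uses putative transcripts as its denominator. Under those assumptions the reported values are reproduced, with the averages agreeing to within 0.01.

```python
# Values transcribed from Table 1, in the table's column order.
SPECIES = ["Atlantic salmon", "Rainbow trout", "Chinook salmon",
           "Sockeye salmon", "Lake whitefish"]

good_sequences = [61819, 14544, 1317, 1243, 1465]
contigs = [11560, 2370, 136, 291, 138]
singletons = [17150, 6611, 949, 229, 1038]
transcripts = [28710, 8981, 1085, 520, 1176]   # "putative transcripts" row
no_blast_hits = [11802, 3667, 566, 118, 663]

# Contigs per size bin: 2, 3, 4-5, 6-10, 11-20, 21-30, 31-50, >50 ESTs.
contig_bins = [
    [5606, 2322, 2030, 1149, 331, 67, 36, 19],  # Atlantic salmon
    [1360, 454, 350, 145, 41, 12, 4, 4],        # Rainbow trout
    [90, 26, 12, 8, 0, 0, 0, 0],                # Chinook salmon
    [108, 96, 48, 32, 6, 1, 0, 0],              # Sockeye salmon
    [97, 21, 9, 8, 1, 2, 0, 0],                 # Lake whitefish
]


def derived_rows():
    """Recompute the derived rows per species (assumed definitions)."""
    rows = []
    for i, name in enumerate(SPECIES):
        rows.append({
            "species": name,
            # Assumption: putative transcripts = contigs + singletons.
            "transcripts": contigs[i] + singletons[i],
            # Assumption: average size = good ESTs per putative transcript.
            "avg_size": good_sequences[i] / transcripts[i],
            # Assumption: percentage is relative to putative transcripts.
            "pct_no_hit": 100 * no_blast_hits[i] / transcripts[i],
            # The size bins should add up to the contig count.
            "binned_contigs": sum(contig_bins[i]),
        })
    return rows


if __name__ == "__main__":
    for row in derived_rows():
        print(f'{row["species"]}: transcripts={row["transcripts"]}, '
              f'avg_size={row["avg_size"]:.3f}, '
              f'pct_no_hit={row["pct_no_hit"]:.1f}, '
              f'binned_contigs={row["binned_contigs"]}')
```

The same arithmetic applies to footnote h: 55,082 forward reads plus 6737 reverse reads give the 61,819 good Atlantic salmon sequences.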

Genome Res. 14: 478-490
