Example of a CRAW report (text format) for a cluster containing EST data derived from a possibly chimeric clone. The 34 sequences shown can be represented as two consensus sequences and an outlier sequence without information loss. The cluster was automatically partitioned into two consistent subgroups, with one sequence (GenBank accession no. AA015595) flagged as inconsistent with both subgroups. Similarity searching with BLAST against the NCBI nonredundant database indicates that the first subgroup, consisting of sequences representing GenBank accession nos. N20971 to AA076342, is highly similar to the 3′ end of mouse mRNA for talin. The second subgroup, sequences T90923 to AA136000, is identical to the coding region of human tubulin α-6 chain. Sequence AA015595 is a putative chimeric sequence within which 110 bases (contained in the first 5 positions of the CRAW report) are highly similar to the 3′ UTR of talin mRNA. The rest of the sequence is highly similar to tubulin.
