Data flow of the computational search for transcription-induced chimeras
| Process<sup>a</sup> | Resulting data |
|---|---|
| 1 Data download | cDNA data from GenBank version 136 |
| 2 Alignment and clustering | 26,057 clusters<sup>b</sup> of expressed sequences aligned to the genome |
| 3 Sense/antisense separation | 29,613 clusters on separate strands |
| 4 Computational detection of gene fusion | 322 pairs of fused genes |
| 5 Filtering out alignment artifacts | 281 pairs of fused genes |
| 6 Manual filtration of artifacts | Final data set: 212 pairs of fused genes |
<sup>a</sup> Procedures were performed as described in the Methods section.

<sup>b</sup> A “cluster” is a group of ESTs that overlap on the genome and contains at least one RNA sequence.
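
The cluster definition in footnote b can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Alignment` record, its field names, the `"EST"`/`"mRNA"` labels, and the half-open coordinate convention are all assumptions made for illustration. The sketch groups alignments that overlap on the same chromosome and keeps only groups containing at least one RNA sequence; strand separation (step 3) is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """Hypothetical record for one expressed sequence aligned to the genome."""
    name: str
    chrom: str
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # exclusive (assumed convention)
    kind: str   # "EST" or "mRNA" (assumed labels)


def cluster_alignments(alignments):
    """Group overlapping alignments; keep groups with at least one mRNA."""
    clusters = []
    current = []
    current_chrom = None
    current_end = None

    # Sorting by chromosome and start lets a single pass merge overlaps.
    for aln in sorted(alignments, key=lambda a: (a.chrom, a.start, a.end)):
        overlaps = (
            bool(current)
            and aln.chrom == current_chrom
            and aln.start < current_end
        )
        if overlaps:
            current.append(aln)
            current_end = max(current_end, aln.end)
        else:
            if current:
                clusters.append(current)
            current = [aln]
            current_chrom = aln.chrom
            current_end = aln.end

    if current:
        clusters.append(current)

    # Footnote b: a cluster must contain at least one RNA sequence.
    return [c for c in clusters if any(a.kind == "mRNA" for a in c)]
```

In this sketch, an EST that overlaps nothing else is dropped, because its group contains no RNA sequence, while an isolated mRNA still forms a cluster on its own.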