Transcription-mediated gene fusion in the human genome

Table 2.

Data flow of the computational search for transcription-induced chimeras


Step  Process (a)                              Resulting data
1     Data download                            cDNA data from GenBank version 136
2     Alignment and clustering                 26,057 clusters (b) of expressed sequences aligned to the genome
3     Sense/antisense separation               29,613 clusters on separate strands
4     Computational detection of gene fusion   322 pairs of fused genes
5     Filtering out alignment artifacts        281 pairs of fused genes
6     Manual filtration of artifacts           Final data set: 212 pairs of fused genes
  • a Procedures were performed as described in the Methods section.

  • b A “cluster” is a group of ESTs that overlap on the genome and that contains at least one RNA sequence.
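
The cluster definition in footnote b, together with the strand separation in step 3, can be illustrated with a minimal sketch. This is not the authors' implementation, which is described in the Methods section and not reproduced here. The `Alignment` record, the `cluster_alignments` function, the `split_strands` flag, and the demo coordinates are all hypothetical names and values chosen for illustration. The sketch assumes that "overlap on the genome" means sharing at least one base on the same chromosome.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Alignment:
    """Hypothetical record for one expressed sequence aligned to the genome."""
    name: str
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    strand: str  # "+" or "-"
    kind: str    # "EST" or "RNA"


def cluster_alignments(alignments: List[Alignment],
                       split_strands: bool = False) -> List[List[Alignment]]:
    """Group alignments with overlapping spans; keep groups with >= 1 RNA.

    With split_strands=True, sequences on opposite strands are never
    placed in the same cluster (an assumed reading of step 3).
    """
    def group_of(a: Alignment):
        return (a.chrom, a.strand if split_strands else "")

    clusters: List[List[Alignment]] = []
    current: List[Alignment] = []
    current_group = None
    current_end = -1

    # Sort so that members of one group are adjacent and ordered by start.
    for a in sorted(alignments, key=lambda a: (*group_of(a), a.start, a.end)):
        if current and group_of(a) == current_group and a.start < current_end:
            # Overlaps the running cluster: extend it.
            current.append(a)
            current_end = max(current_end, a.end)
        else:
            if current:
                clusters.append(current)
            current, current_group, current_end = [a], group_of(a), a.end
    if current:
        clusters.append(current)

    # Footnote b: a cluster must contain at least one RNA sequence.
    return [c for c in clusters if any(a.kind == "RNA" for a in c)]


# Made-up coordinates, for illustration only.
demo = [
    Alignment("rna1", "chr1", 100, 500, "+", "RNA"),
    Alignment("est1", "chr1", 400, 700, "+", "EST"),
    Alignment("est2", "chr1", 450, 900, "-", "EST"),
    Alignment("rna2", "chr1", 800, 1200, "-", "RNA"),
    Alignment("est3", "chr1", 5000, 5300, "+", "EST"),  # no RNA support
]

merged = cluster_alignments(demo)
separated = cluster_alignments(demo, split_strands=True)
print(len(merged), len(separated))
```

In this toy input the strand-agnostic pass yields one cluster, and separating by strand splits it into two, which is consistent with the cluster count rising between steps 2 and 3 of the table. The lone EST is dropped in both passes because its group contains no RNA sequence.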

This Article

  1. Genome Res. 16: 30-36
