Gene and alternative splicing annotation with AIR

Table 2.

Projection rates of mouse RefSeq and GenBank mRNA transcripts and exons on the rat genome

                                              Mouse exons  Pct. (%)  Mouse transcripts  Pct. (%)
Original                                               NA        NA            125,972        NA
Aligned on mouse                                  761,241       100            119,280       100
Projection stage
   Complete projections                           659,360      86.6             68,805      57.7
   Partial projections                             38,612       5.1             38,666      32.4
   Not projected                                   64,062       8.4             11,809       9.9
      in unprojected transcripts                   38,991       5.1
      in projected transcripts                     25,071       3.3
Total from complete and partial projections       697,972      91.7            107,471      90.1
Refinement stage
   Endorsed unaltered                             659,436      86.6             85,418      71.6
      canonical                                   645,147      84.7
      non-canonical                                14,289       1.9
   Endorsed altered                                23,603       3.1             17,548      14.7
      by 1–10 bp                                   19,145       2.5
      by >10 bp                                     4,458       0.6
   Projected, not endorsed                         14,933       2.0              4,505       3.8
   Rejected (a)                                    40,004       5.3              4,505       3.8
      in rejected transcripts                       6,699       0.9
      in endorsed transcripts                      33,305       4.4
   Missed (b) (found with BLASTN)                  12,275       1.6                 NA        NA
Total endorsed projections                        683,039      89.7            102,966      86.3
  • NA = not applicable.

    For the projection stage, “complete” exon projections have both ends of the exon contained in matches; the exon projection is then the entire interval between the projected endpoints. “Complete” transcript projections have all exons “complete”; “partial” transcript projections have at least one completely or partially projected exon. Both “complete” and “partial” transcripts were submitted for refinement in stage two. For the refinement stage, endorsed unaltered “canonical” projections had canonical splice signals and required no alteration, whereas “non-canonical” ones could not be extended to a nearby consensus splice site. “Missed” projections were identified by a BLASTN search (E = 2.0, default parameters) of all rejected exons from endorsed transcripts. Internal rejected exons were searched against the genomic interval between their adjacent exons; marginal rejected exons were searched against a 50,000-bp interval past the aligned end of the transcript, consistent with the orientation of the transcript. A threshold of 50% of the exon length or a minimum of 50-bp coverage of the exon was applied to the BLASTN results.

  • a Includes the number of unprojected exons from partially projected transcripts

  • b Counted from the number of rejected exons from endorsed transcripts
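
The projection-stage categories and the BLASTN coverage threshold described in the table notes can be illustrated with a minimal Python sketch. It is not the AIR implementation: all names (`ExonProjection`, `classify_exon`, `classify_transcript`, `passes_blastn_threshold`) are hypothetical. The notes do not define a "partial" exon projection, so the sketch assumes it means exactly one exon end lies in a match, and it reads the threshold as "covered length is at least 50% of the exon or at least 50 bp".

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ExonProjection:
    """Hypothetical record: whether each exon end falls inside a match."""
    start_in_match: bool
    end_in_match: bool


def classify_exon(p: ExonProjection) -> str:
    # "Complete": both ends of the exon are contained in matches.
    if p.start_in_match and p.end_in_match:
        return "complete"
    # Assumption: "partial" means exactly one end is contained in a match.
    if p.start_in_match or p.end_in_match:
        return "partial"
    return "not projected"


def classify_transcript(exon_classes: Sequence[str]) -> str:
    # "Complete" transcripts have all exons complete.
    if exon_classes and all(c == "complete" for c in exon_classes):
        return "complete"
    # "Partial" transcripts have at least one complete or partial exon.
    if any(c in ("complete", "partial") for c in exon_classes):
        return "partial"
    return "not projected"


def passes_blastn_threshold(exon_len: int, covered_bp: int,
                            min_frac: float = 0.5, min_bp: int = 50) -> bool:
    # Assumed reading: a hit counts if it covers at least half of the exon
    # or at least 50 bp of it.
    return covered_bp >= min_frac * exon_len or covered_bp >= min_bp


exons = [ExonProjection(True, True), ExonProjection(True, False)]
classes = [classify_exon(e) for e in exons]
print(classes, classify_transcript(classes))
```

Under these assumptions, a transcript with one complete and one partial exon is classified as "partial" and would still be submitted to the refinement stage.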

This Article

  1. Genome Res. 15: 54-66
