Sequence Information for the Splicing of Human Pre-mRNA Identified by Support Vector Machine Classification

Table 1.

SVM Performance in Distinguishing Real From Pseudo Exons


Flanks     Splice sites
US   DS    3′   5′    Exon body   ROC     Specificity(a)
CV(b)                             0.609   0.484
+    -     -    -     -           0.791   0.638
-    +     -    -     -           0.784   0.618
+    +     -    -     -           0.855   0.695
-    -     +    -     -           0.823   0.672
-    -     -    +     -           0.837   0.698
-    -     +    +     -           0.907   0.777
+    +     +    +     -           0.932   0.825
-    -     -    -     +           0.946   0.841
+    +     -    -     +           0.984   0.956
-    -     +    +     +           0.987   0.964
+    +     +    +     +           0.991   0.976
  • Performance is reported as ROC value and specificity. Each row is one SVM test. ROC values were measured on untouched sets of ∼2200 real and ∼2300 pseudo exons. The first five columns indicate which components the SVM used (+, used; -, not used). US, upstream; DS, downstream; TP, true positive; FN, false negative; FP, false positive; SE, sensitivity; SP, specificity.

  • a Specificity = TP/(TP + FP) at a sensitivity (SE = TP/(TP + FN)) of 0.90

  • b The SVM classified on the basis of the acceptor and the donor consensus values
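
The two measures in the table can be illustrated with a minimal sketch. It assumes each exon has a real-valued SVM score and a label (1 for real, 0 for pseudo). The function names `roc_auc` and `specificity_at_sensitivity` and the toy scores are hypothetical and are not taken from the article. The ROC value is computed here as the area under the ROC curve through the rank-sum identity, which may differ in detail from the authors' procedure. The specificity follows footnote a, TP/(TP + FP), a quantity often called precision elsewhere.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count real/pseudo pairs in which the real exon scores higher; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def specificity_at_sensitivity(scores, labels, target_se=0.90):
    """TP / (TP + FP) at the highest threshold whose sensitivity reaches target_se."""
    n_pos = sum(labels)
    # Sweep candidate thresholds from the highest score downward.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        if tp / n_pos >= target_se:  # SE = TP / (TP + FN)
            return tp / (tp + fp)
    raise ValueError("target sensitivity cannot be reached")


# Toy data (made up for illustration): 5 real and 5 pseudo exons.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
print(roc_auc(toy_scores, toy_labels))
print(specificity_at_sensitivity(toy_scores, toy_labels))
```

The pairwise loop is quadratic in the number of exons, which is acceptable for a sketch at the scale of the test sets described above; a sort-based rank computation would be the usual choice for larger sets.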

This Article

  1. Genome Res. 13: 2637-2650
