SLAM: Cross-Species Gene Finding and Alignment with a Generalized Pair Hidden Markov Model

Table 1. Results on the Test Sets

Test set / method | Nucleotide level: SN SP AC | Exon level: SN SP (SN+SP)/2 ME WE
The ROSETTA set
 ROSETTA 0.935 0.978 0.949 0.833 0.829 0.831 0.048 0.047
 SGP-1 0.940 0.960 0.940 0.700 0.760 0.730 0.120 0.040
 SLAM 0.951 0.981 0.960 0.783 0.755 0.769 0.038 0.057
 TWINSCAN.p 0.960 0.941 0.940 0.855 0.824 0.840 0.045 0.081
 TWINSCAN 0.984 0.889 0.923 0.839 0.767 0.803 0.034 0.118
 GENSCAN 0.975 0.908 0.929 0.817 0.770 0.793 0.057 0.107
HoxA
 SLAM 0.852 0.896 0.864 0.727 0.533 0.630 0.000 0.333
 TWINSCAN.p 0.976 0.829 0.896 0.773 0.531 0.652 0.000 0.312
 TWINSCAN 0.949 0.511 0.704 0.591 0.173 0.382 0.000 0.707
 SGP-2 0.640 0.637 0.619 0.409 0.173 0.291 0.091 0.596
 GENSCAN 0.932 0.687 0.796 0.545 0.235 0.390 0.000 0.569
Elastin
 SLAM 0.876 0.981 0.926 0.802 0.859 0.831 0.121 0.059
 TWINSCAN.p 0.942 0.950 0.945 0.879 0.889 0.884 0.066 0.056
 TWINSCAN 0.933 0.877 0.903 0.835 0.826 0.831 0.110 0.120
 SGP-2 0.755 0.998 0.873 0.593 0.900 0.747 0.352 0.017
 GENSCAN 0.947 0.766 0.852 0.835 0.731 0.783 0.121 0.231
  • Sensitivity SN = TP/(TP + FN) and specificity SP = TP/(TP + FP), where TP = true positives, TN = true negatives, FP = false positives, and FN = false negatives, are shown at both the nucleotide and exon levels. ME denotes entirely missed exons and WE denotes wrong exons. The approximate correlation AC = 1/2 [TP/(TP + FN) + TP/(TP + FP) + TN/(TN + FP) + TN/(TN + FN)] − 1 summarizes the overall nucleotide sensitivity and specificity in a single number. Within each of the three data sets, the methods are divided into three classes: those operating on a syntenic DNA pair, those operating on a human sequence using matches against a database of mouse sequences as evidence, and a single-organism gene finder (GENSCAN).
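The definitions in the table note can be written out as a short Python sketch. This is only an illustration of the formulas as stated there, not code from the SLAM authors. The `Counts` container, the function names, and the example counts are all hypothetical; the only table values used are the exon-level SN and SP of the ROSETTA row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion counts for one prediction run (hypothetical container)."""
    tp: int  # true positives
    tn: int  # true negatives
    fp: int  # false positives
    fn: int  # false negatives


def sensitivity(c: Counts) -> float:
    """SN = TP / (TP + FN)."""
    return c.tp / (c.tp + c.fn)


def specificity(c: Counts) -> float:
    """SP = TP / (TP + FP), as defined in the table note."""
    return c.tp / (c.tp + c.fp)


def approximate_correlation(c: Counts) -> float:
    """AC = 1/2 * (sum of the four conditional proportions) - 1."""
    total = (
        c.tp / (c.tp + c.fn)
        + c.tp / (c.tp + c.fp)
        + c.tn / (c.tn + c.fp)
        + c.tn / (c.tn + c.fn)
    )
    return 0.5 * total - 1.0


def exon_average(sn: float, sp: float) -> float:
    """Exon-level summary column, (SN + SP) / 2."""
    return (sn + sp) / 2.0


# Made-up counts, chosen only to exercise the formulas.
counts = Counts(tp=90, tn=890, fp=10, fn=10)
print(f"SN={sensitivity(counts):.3f}")
print(f"SP={specificity(counts):.3f}")
print(f"AC={approximate_correlation(counts):.3f}")

# Exon-level SN and SP taken from the ROSETTA row of the table.
print(f"(SN+SP)/2={exon_average(0.833, 0.829):.3f}")
```

A perfect prediction (FP = FN = 0) gives AC = 1. The functions do not guard against zero denominators, which a real evaluation script would need to handle.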

This Article

  1. Genome Res. 13: 496-502