Regulatory Potential Scores From Genome-Wide Three-Way Alignments of Human, Mouse, and Rat

Table 1.

Rates From Leave-One-Out Cross-Validation for Selected Nested Alphabets and Orders



                      REG alignment segments                  AR alignment segments
Alphabets and orders  Correct (TP)  Unclassified  Wrong (FN)  Correct (TN)  Unclassified  Wrong (FP)
10 symbols
   order 1 0.758242 0.000000 0.241758 0.865385 0.000000 0.134615
   order 2 0.805861 0.000000 0.194139 0.850000 0.000000 0.150000
   order 3 0.637363 0.000000 0.362637 0.907692 0.000000 0.092308
   order 4 0.065934 0.919414 0.014652 0.042308 0.957692 0.000000
9 symbols
   order 1 0.743590 0.000000 0.256410 0.873077 0.000000 0.126923
   order 2 0.761905 0.000000 0.238095 0.876923 0.000000 0.123077
   order 3 0.703297 0.000000 0.296703 0.773077 0.000000 0.226923
   order 4 0.069597 0.919414 0.010989 0.088462 0.911538 0.000000
8 symbols
   order 1 0.761905 0.000000 0.238095 0.846154 0.000000 0.153846
   order 2 0.809524 0.000000 0.190476 0.800000 0.000000 0.200000
   order 3 0.706960 0.000000 0.293040 0.773077 0.000000 0.226923
   order 4 0.161172 0.835165 0.003663 0.046154 0.953846 0.000000
7 symbols
   order 1 0.750916 0.000000 0.249084 0.846154 0.000000 0.153846
   order 2 0.791209 0.000000 0.208791 0.834615 0.000000 0.165385
   order 3 0.758242 0.000000 0.241758 0.800000 0.000000 0.200000
   order 4 0.293040 0.695971 0.010989 0.123077 0.869231 0.007692
6 symbols
   order 1 0.747253 0.000000 0.252747 0.846154 0.000000 0.153846
   order 2 0.743590 0.000000 0.256410 0.853846 0.000000 0.146154
   order 3 0.783883 0.000000 0.216117 0.765385 0.000000 0.234615
   order 4 0.498168 0.421245 0.080586 0.384615 0.546154 0.069231
5 symbols
   order 1 0.732601 0.000000 0.267399 0.873077 0.000000 0.126923
   order 2 0.725275 0.000000 0.274725 0.861538 0.000000 0.138462
   order 3 0.681319 0.000000 0.318681 0.838462 0.000000 0.161538
   order 4 0.633700 0.000000 0.366300 0.792308 0.000000 0.207692
   order 5 0.340659 0.351648 0.307692 0.584615 0.361538 0.053846
  • Thinking of REG as the category to be recognized, the Correct and Wrong columns are also labeled TP (true positive) and FN (false negative) for REG, and TN (true negative) and FP (false positive) for AR. When an order is reached that gives high rates of unclassified elements (a sign of over-fitting), larger orders, for which such rates become even higher, are not listed. The 10-symbol alphabet and order 2 are the ones used for the 3-way RP and 3-way LRP scores.
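
The relationship among the three rates in each half of a row can be illustrated with a short Python sketch. It copies two rows of Table 1 (10-symbol alphabet, orders 2 and 4), checks that Correct, Unclassified, and Wrong sum to 1 for each class, and reports the fraction of correct calls among the segments that were classified at all. The helper names `check_row` and `classified_accuracy`, the tolerance, and the "accuracy among classified" summary are assumptions made for illustration; they do not appear in the article.

```python
# Rates copied from Table 1 (10-symbol alphabet). Each tuple is
# (correct, unclassified, wrong) for one class of alignment segments.
rows = {
    ("10 symbols", 2): {"REG": (0.805861, 0.000000, 0.194139),
                        "AR": (0.850000, 0.000000, 0.150000)},
    ("10 symbols", 4): {"REG": (0.065934, 0.919414, 0.014652),
                        "AR": (0.042308, 0.957692, 0.000000)},
}


def check_row(rates, tol=1e-5):
    """Return True when the three rates of one class sum to 1 within tol.

    The tolerance is an assumption that absorbs six-decimal rounding.
    """
    return abs(sum(rates) - 1.0) <= tol


def classified_accuracy(rates):
    """Hypothetical summary: fraction correct among segments that got a call."""
    correct, _unclassified, wrong = rates
    called = correct + wrong
    return correct / called if called > 0 else float("nan")


for (alphabet, order), classes in rows.items():
    for label, rates in classes.items():
        assert check_row(rates), (alphabet, order, label)
        print(f"{alphabet} order {order} {label}: "
              f"unclassified={rates[1]:.3f} "
              f"accuracy among classified={classified_accuracy(rates):.3f}")
```

Under these assumptions, the order-4 row shows why the footnote treats a high unclassified rate as the deciding signal: the few segments that do receive a call can still be mostly correct, yet over 90% of segments receive no call at all.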

Genome Res. 14: 700-707
