Extraction of Functional Binding Sites from Unique Regulatory Regions: The Drosophila Early Developmental Enhancers

Table 2.

Results of Individual Trainings

Sequence Statistics Best Parameters
Name L c-MAP CC OQ PQ m_min m_max k_max Z c
eve2 728 0.15 0.62 0.51 0.80 9 9 2 9.7 0.15
hairy6 547 0.65 0.55 0.59 0.05 7 9 2 6.3 0.73
hairy7 932 0.16 0.53 0.41 0.77 8 9 1 11 0.11
eve37 508 0.29 0.52 0.46 0.43 8 9 1 4.9 0.29
tll 480 0.15 0.46 0.37 0.65 11 12 2 3.7 0.16
iab2 1745 0.07 0.46 0.33 0.89 9 11 4 22.3 0.10
kr730 718 0.32 0.43 0.40 0.31 8 9 1 3.8 0.33
sal 516 0.22 0.42 0.32 0.24 12 14 4 8.4 0.54
ftzprox 396 0.23 0.41 0.34 0.24 9 10 4 7.6 0.55
enint 900 0.20 0.34 0.29 0.39 7 7 1 4 0.23
Average 784 0.24 0.47 0.39 0.32 8.8 0.33
  • All 10 regions from the training set show a positive statistical correlation; rows are sorted by CC. The best selectivity (PQ) is observed for the eve2 and iab2 regions. The hairy stripe 6 region (hairy6) shows poor selectivity, mainly because of its very high optimal coverage cutoff c (0.73). The average of the observed coverage values (c-MAP), 0.24, was used as the default cutoff in the subsequent trainings on the group of 10. L is sequence length in bp; Z is the corresponding Z-score cutoff value; CC is the correlation coefficient; OQ is overlap quality; PQ is prediction quality.
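The default cutoff quoted in the caption can be checked against the table rows. The following is a minimal sketch, assuming the c-MAP and CC columns are transcribed correctly from the table above; the names `ROWS`, `mean`, `default_cutoff`, `mean_cc`, and `sorted_by_cc` are illustrative and do not come from the article.

```python
# Values transcribed from Table 2: (region name, c-MAP, CC).
# Variable and function names are hypothetical, chosen for this sketch only.
ROWS = [
    ("eve2", 0.15, 0.62),
    ("hairy6", 0.65, 0.55),
    ("hairy7", 0.16, 0.53),
    ("eve37", 0.29, 0.52),
    ("tll", 0.15, 0.46),
    ("iab2", 0.07, 0.46),
    ("kr730", 0.32, 0.43),
    ("sal", 0.22, 0.42),
    ("ftzprox", 0.23, 0.41),
    ("enint", 0.20, 0.34),
]


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Average observed coverage (c-MAP), rounded to two decimals as in the table.
default_cutoff = round(mean(c_map for _, c_map, _ in ROWS), 2)

# Average correlation coefficient across the 10 regions.
mean_cc = round(mean(cc for _, _, cc in ROWS), 2)

# The caption states that rows are sorted by CC (non-increasing order).
sorted_by_cc = all(a[2] >= b[2] for a, b in zip(ROWS, ROWS[1:]))

print(default_cutoff, mean_cc, sorted_by_cc)  # prints: 0.24 0.47 True
```

Both averages agree with the c-MAP and CC entries of the Average row, and the CC column is in non-increasing order, as the caption states.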

This Article

  1. Genome Res. 12: 470-481
