Homotypic Regulatory Clusters in Drosophila

Table 1.

Parameter Values in the Point of Global Agreement

| Motif | Number of loci | Maximum | CCmax | Window size (bp) | PWM cutoff | Site E-value | Cluster E-value | Cluster frequency | Cluster significance |
|---|---|---|---|---|---|---|---|---|---|
| BCD | 8 | GLOBAL | 0.622 | 550 | 4.2 | 2.9 × 10^-3 | 4 × 10^-4 | 1.2 × 10^-6 | 1.1 × 10^-4 |
| KR | 6 | GLOBAL | 0.583 | 600 | 4 | 3.5 × 10^-3 | 2.3 × 10^-3 | 8.1 × 10^-6 | 1.21 × 10^-3 |
| CAD | 6 | local | 0.65 | 575 | 4.8 | 1.9 × 10^-3 | 1.2 × 10^-3 | 2.3 × 10^-6 | 1.74 × 10^-4 |
| KNI | 3 | GLOBAL | 0.65 | 625 | 3.6 | 4.6 × 10^-3 | 1.1 × 10^-2 | 5.1 × 10^-5 | 3.56 × 10^-3 |
| HB | 3 | local | 0.443 | 400 | 6.5 | 3.8 × 10^-4 | 4.4 × 10^-3 | 1.7 × 10^-6 | 1.00 × 10^-3 |
| FTZ | 3 | local | 0.399 | 575 | 4.3 | 2.7 × 10^-3 | 6 × 10^-4 | 1.6 × 10^-6 | 6.76 × 10^-4 |
| EVE | 2 | GLOBAL | 0.503 | 750 | 3.7 | 4.2 × 10^-3 | 4.4 × 10^-3 | 1.9 × 10^-5 | 5.91 × 10^-3 |
| PRD | 2 | GLOBAL | 0.61 | 550 | 7.5 | 1.1 × 10^-4 | 4.1 × 10^-2 | 4.5 × 10^-6 | 2.62 × 10^-4 |
| TLL | 2 | local | 0.419 | 500 | 4.3 | 2.7 × 10^-3 | 1.4 × 10^-2 | 3.8 × 10^-5 | 2.42 × 10^-3 |
| Average window | | | | 576 | | | | | |
  • Optimal parameters, correlation values, and absolute cluster significance are shown for nine motifs at the point of the global correlation maximum. All maxima yield a narrow range of optimal window sizes, with high correlation values and very high cluster significance. In some cases (CAD, HB, FTZ, TLL) we also observed local maxima with similar correlation values, but their corresponding window sizes do not agree with the rest of the data (500–600 bp). We estimated cluster frequency (the probability of finding a cluster at any given position in the genome) as the product of the cluster E-value cutoff (a conditional probability; see Methods) and the site E-value cutoff. The last column shows the estimated cluster significance for a locus sequence (25 kb), which accounts for multiple independent statistical tests performed simultaneously (Bonferroni correction). The cluster significance reflects the probability that a given locus sequence (25 kb) will contain a cluster by chance.
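The cluster frequency column can be checked against the definition given in the footnote (site E-value cutoff times cluster E-value cutoff). The following is a minimal sketch of that arithmetic only; the names `TABLE_1` and `cluster_frequency` are illustrative and not taken from the article, and the Bonferroni-corrected significance in the last column is not reproduced because the number of tests is not stated here.

```python
# Values copied from Table 1: (site E-value, cluster E-value, reported cluster frequency).
# The dictionary and function names are hypothetical, chosen for illustration.
TABLE_1 = {
    "BCD": (2.9e-3, 4e-4, 1.2e-6),
    "KR": (3.5e-3, 2.3e-3, 8.1e-6),
    "CAD": (1.9e-3, 1.2e-3, 2.3e-6),
    "KNI": (4.6e-3, 1.1e-2, 5.1e-5),
    "HB": (3.8e-4, 4.4e-3, 1.7e-6),
    "FTZ": (2.7e-3, 6e-4, 1.6e-6),
    "EVE": (4.2e-3, 4.4e-3, 1.9e-5),
    "PRD": (1.1e-4, 4.1e-2, 4.5e-6),
    "TLL": (2.7e-3, 1.4e-2, 3.8e-5),
}


def cluster_frequency(site_e_value, cluster_e_value):
    """Estimate the per-position cluster frequency as the product of the
    site E-value cutoff and the (conditional) cluster E-value cutoff."""
    return site_e_value * cluster_e_value


for motif, (site_e, cluster_e, reported) in TABLE_1.items():
    estimate = cluster_frequency(site_e, cluster_e)
    # The table reports two significant digits, so small rounding gaps are expected.
    print(f"{motif}: estimated {estimate:.2e}, reported {reported:.1e}")
```

For every motif the product agrees with the reported cluster frequency to within the rounding of the tabulated values.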

This Article

  1. Genome Res. 13: 579-588
