Predicting Gene Regulatory Elements in Silico on a Genomic Scale

Table 4.

Highest Scoring Patterns for the Cluster C(5,2,4)(0000022)

Pattern     N+   Total+   Score   TRANSFAC (exact matches)

A. Highest score in experiment allowing patterns to have at most 3 wild cards and no group characters
CCCCT..T    22   27       7.09    Y$DDR2_01, Y$DDR2_02, Y$TPI_02
A..AGGGG    22   27       7.09
GGGGC       20   27       4.09    Y$GAL2_02, Y$SUC2_02, Y$RRNA_01, Y$ERG11_01
GCCCC       20   27       4.09    Y$CYB2_02
G..GGGG     19   28       3.73    Y$CYC1_04, Y$CYC1_05, Y$CYC1_06
CCCC..C     19   28       3.73    Y$GAL3_01, Y$MAL2R_01
CCCC...T    25   42       3.65    Y$SUC2_01, Y$DDR2_01, Y$DDR2_02, Y$TPI_02, Y$GAL3_01, Y$GAL4_01, Y$MAL2R_01, Y$MAL63_01, Y$PDC1_02, Y$HAP4_01
A...GGGG    25   42       3.65    Y$SUC2_02, Y$RRNA_01, Y$ERG11_01, Y$MEL1_02, Y$FPS1_01
CCCCT       25   38       3.03    Y$DDR2_01, Y$DDR2_02, Y$TPI_02
AGGGG       25   38       3.03    Y$CAR1_02
CCCT..TT    19   22       2.95    Y$DDR2_01
AA..AGGG    19   22       2.95
GGG.TG      20   21       2.93
CA.CCC      20   21       2.93    Y$GAL1_04, Y$CYC1_12, Y$GAL1_14, Y$DDR2_02, Y$TPI_02

B. Highest score in experiment allowing patterns having at most one group character with two alternative letters (all pairs allowed)
CCCCT[GT]   20   28       3.86    Y$DDR2_01, Y$DDR2_02, Y$TPI_02
CCCCT[AT]   20   24       3.58    Y$DDR2_02, Y$TPI_02
[CG]CCCC    24   47       3.27    Y$CYB2_02, Y$GAL2_02, Y$SUC2_02, Y$RRNA_01, Y$ERG11_01
CCCC[CT]    29   58       2.94    Y$DDR2_01, Y$DDR2_02, Y$TPI_02, Y$SUC2_02, Y$CAR1_02, Y$ERG11_01
[AG]CCCC    29   48       2.90    Y$CYB2_02, Y$DDR2_02, Y$TPI_02, Y$CYC1_04, Y$CYC1_05, Y$CYC1_06, Y$GAL2_02, Y$SUC2_02, Y$RRNA_01, Y$CAR1_02, Y$ERG11_01, Y$GAL1_15

  • Table: Trivial pattern variants were removed, e.g., patterns ending with a wild-card character.

  • N+: Number of upstream regions matching the pattern.

  • Total+: Total number of matches in the upstream regions.

  • Score: Normalized version of the pattern score.

  • TRANSFAC: TRANSFAC entries matching the pattern.

  • Part B: Best patterns from experiment 2 that were not also found in experiment 1.
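
The N+ and Total+ columns can be illustrated with a minimal Python sketch. It treats each pattern as a regular expression, where "." is a wild card and "[GT]" is a group character, and counts both the number of regions with at least one match and the total number of matches. The function names, the toy sequences, and the choice to count overlapping matches are assumptions made for illustration; they are not taken from the article, and the normalized score is not computed because its definition is not given in this table.

```python
import re

def count_matches(pattern, regions):
    """Return (regions_matched, total_matches) for one pattern.

    `pattern` uses regex syntax: '.' is a wild card, '[GT]' a group character.
    Overlapping occurrences are counted separately (an assumption).
    """
    # A lookahead lets re.findall report overlapping occurrences.
    rx = re.compile(f"(?=({pattern}))")
    per_region = [len(rx.findall(seq)) for seq in regions]
    regions_matched = sum(1 for c in per_region if c > 0)
    return regions_matched, sum(per_region)

def reverse_complement(pattern):
    """Reverse complement of a pattern made only of A, C, G, T and '.'."""
    return pattern[::-1].translate(str.maketrans("ACGT", "TGCA"))

# Hypothetical toy upstream regions, not data from the article.
regions = [
    "TTACCCCTGATAA",
    "GGCCCCTTTTCCCCTAAT",
    "ATATATATAT",
]

print(count_matches("CCCCT..T", regions))   # wild-card pattern (part A style)
print(count_matches("CCCCT[GT]", regions))  # group-character pattern (part B style)
print(reverse_complement("CCCCT..T"))
```

In part A, several adjacent rows (for example CCCCT..T and A..AGGGG) are reverse complements of each other and carry identical counts, which is what one would expect if both strands are searched; the helper above only makes that pairing easy to check and does not establish how the original search was run.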

This Article

  1. Genome Res. 8: 1202-1215