Novel transcription regulatory elements in Caenorhabditis elegans muscle genes

Table 1.

C. elegans muscle genes used as training and test sets


Serial  Gene symbol  Elegans gene ID  Briggsae gene ID  Operon  Rank  Motif1  Motif2  Motif3
1 mlc-3 F09F7.2 CBG24046 N 1 + + +
2 unc-22 ZK617.1 ND N 6 + + +
3 unc-87 F08B6.4 CBG12778 N 15 + + +
4 gpd-2 K10B3.8 ND Y 16 + + +
5 unc-54 F11C3.3 CBG19730 N 29 + + +
6 unc-120 D1081.2 CBG12542 N 38 + + +
7 myo-3 K12F2.1 CBG23416 N 42 + + +
8 mup-2 T22E5.5 CBG05057 N 46 + + +
9 lev-11 Y105E8B.1 CBG19793 N 85 + + +
10 deb-1 ZC477.9 CBG05763 N 91 + + +
11 tni-1 F42E11.4 CBG17351 N 166 + + +
12 unc-89 C09D1.1 CBG12078 N 227 + + +
13 mlc-1 C36E6.3 ND N 329 + + +
14 unc-112 C47E8.7 CBG04558 N 405 + + +
15 act-4 M03F4.2 ND N 573 + + -
16 unc-97 F14D12.2 CBG14705 N 964 + + +
17 let-2 F01G12.5 CBG16372 N 974 + + +
18 unc-105 C41C4.5 CBG00750 N 1514 + + +
19 myo-1 R06C7.10 CBG21911 N 1631 + + +
20 unc-15 F07A5.7 CBG11932 N 1955 + + -
21 pat-3 ZK1058.2 CBG03601 N 2117 + + +
22 unc-45 F30H5.1 CBG15283 N 2238 + + -
23 mef-2 W10D5.1 CBG12442 N 2320 + + -
24 pat-4 C29F9.7 CBG15792 N 2449 + + -
25 unc-60 C38C3.5 CBG06572 N 2491 + + -
26 pat-10 F54C1.7 CBG10771 N 2523 + + -
27 act-2 T04C12.5 ND N 2672 + + -
28 sup-10 R09G11.1 CBG01870 N 2806 + + -
29 act-1 T04C12.4 ND N 3161 + + -
30 atn-1 W04D2.1 CBG23504 N 3399 + + -
31 lam-1 W03F8.5 CBG20003 N 3463 - + +
32 unc-52 ZC101.2 CBG11064 N 3546 + + -
33 act-3 T04C12.6 ND N 3913 + + -
34 unc-68 K11C4.5 CBG19042 N 4139 + - +
35 myo-2 T18D3.4 CBG00120 N 4154 + + -
36 hlh-1 B0304.1 CBG13470 N 8517 - + +
37 epi-1 K08C7.3 CBG04423 N 11,340 - + -
38 gpd-3 K10B3.7 ND Y 11,342 - + -
39 emb-9 K04H4.1 CBG10116 Y 11,503 - + -
40 mec-8 F46A9.6 CBG03748 N 12,040 - + -
41 egl-19 C48A7.1 CBG05858 N 12,377 - + -
  • Shaded genes were included in the training set for motif discovery; the remaining genes were used as a test set. Putative C. briggsae orthologs of the C. elegans muscle genes are given. Presence (+) or absence (-) of site predictions in the 2000 bp upstream of each gene is given, along with the rank of the gene when genes are ordered by the combined score of the three motifs in their upstream regions (equation 4). C. elegans genes inside operons according to Blumenthal et al. (2002) are marked Y (yes) in the Operon column; all others are marked N (no).
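
For readers who want to work with the table programmatically, the following is a minimal sketch of how its rows could be parsed and the motif columns tallied. It assumes each row is whitespace-separated in the column order given in the header; the names `MuscleGene`, `parse_row`, and `SAMPLE_ROWS` are hypothetical and do not come from the article, and only five rows copied from the table are used. The combined score of equation 4 is not reproduced here, so the rank is read as given.

```python
from typing import NamedTuple, Tuple


class MuscleGene(NamedTuple):
    """One row of Table 1 (hypothetical record type, not from the article)."""
    serial: int
    symbol: str
    elegans_id: str
    briggsae_id: str          # "ND" where the table lists no C. briggsae ID
    in_operon: bool           # True for "Y" in the Operon column
    rank: int                 # rank as printed in the table
    motifs: Tuple[bool, bool, bool]  # site predicted for Motif1, Motif2, Motif3


def parse_row(line: str) -> MuscleGene:
    """Parse one whitespace-separated table row into a MuscleGene."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}: {line!r}")
    serial, symbol, ce_id, cb_id, operon, rank, m1, m2, m3 = fields
    return MuscleGene(
        serial=int(serial),
        symbol=symbol,
        elegans_id=ce_id,
        briggsae_id=cb_id,
        in_operon=(operon == "Y"),
        rank=int(rank.replace(",", "")),  # some ranks are printed as "11,340"
        motifs=(m1 == "+", m2 == "+", m3 == "+"),
    )


# Five rows copied verbatim from the table above, for illustration only.
SAMPLE_ROWS = """\
1 mlc-3 F09F7.2 CBG24046 N 1 + + +
4 gpd-2 K10B3.8 ND Y 16 + + +
15 act-4 M03F4.2 ND N 573 + + -
34 unc-68 K11C4.5 CBG19042 N 4139 + - +
37 epi-1 K08C7.3 CBG04423 N 11,340 - + -
"""

genes = [parse_row(row) for row in SAMPLE_ROWS.splitlines() if row.strip()]

# Genes with a predicted site for all three motifs in their upstream region.
all_three = [g.symbol for g in genes if all(g.motifs)]

# Genes the table marks as lying inside an operon.
in_operons = [g.symbol for g in genes if g.in_operon]
```

Stripping the thousands separator before converting the rank keeps the rows with ranks above 10,000 comparable with the rest of the column.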

This Article

  1. Genome Res. 14: 2457-2468