C. elegans muscle genes used as training and test sets
| Serial | Gene symbol | Elegans gene ID | Briggsae gene ID | Operon | Rank | Motif1 | Motif2 | Motif3 |
|---|---|---|---|---|---|---|---|---|
| 1 | mlc-3 | F09F7.2 | CBG24046 | N | 1 | + | + | + |
| 2 | unc-22 | ZK617.1 | ND | N | 6 | + | + | + |
| 3 | unc-87 | F08B6.4 | CBG12778 | N | 15 | + | + | + |
| 4 | gpd-2 | K10B3.8 | ND | Y | 16 | + | + | + |
| 5 | unc-54 | F11C3.3 | CBG19730 | N | 29 | + | + | + |
| 6 | unc-120 | D1081.2 | CBG12542 | N | 38 | + | + | + |
| 7 | myo-3 | K12F2.1 | CBG23416 | N | 42 | + | + | + |
| 8 | mup-2 | T22E5.5 | CBG05057 | N | 46 | + | + | + |
| 9 | lev-11 | Y105E8B.1 | CBG19793 | N | 85 | + | + | + |
| 10 | deb-1 | ZC477.9 | CBG05763 | N | 91 | + | + | + |
| 11 | tni-1 | F42E11.4 | CBG17351 | N | 166 | + | + | + |
| 12 | unc-89 | C09D1.1 | CBG12078 | N | 227 | + | + | + |
| 13 | mlc-1 | C36E6.3 | ND | N | 329 | + | + | + |
| 14 | unc-112 | C47E8.7 | CBG04558 | N | 405 | + | + | + |
| 15 | act-4 | M03F4.2 | ND | N | 573 | + | + | - |
| 16 | unc-97 | F14D12.2 | CBG14705 | N | 964 | + | + | + |
| 17 | let-2 | F01G12.5 | CBG16372 | N | 974 | + | + | + |
| 18 | unc-105 | C41C4.5 | CBG00750 | N | 1514 | + | + | + |
| 19 | myo-1 | R06C7.10 | CBG21911 | N | 1631 | + | + | + |
| 20 | unc-15 | F07A5.7 | CBG11932 | N | 1955 | + | + | - |
| 21 | pat-3 | ZK1058.2 | CBG03601 | N | 2117 | + | + | + |
| 22 | unc-45 | F30H5.1 | CBG15283 | N | 2238 | + | + | - |
| 23 | mef-2 | W10D5.1 | CBG12442 | N | 2320 | + | + | - |
| 24 | pat-4 | C29F9.7 | CBG15792 | N | 2449 | + | + | - |
| 25 | unc-60 | C38C3.5 | CBG06572 | N | 2491 | + | + | - |
| 26 | pat-10 | F54C1.7 | CBG10771 | N | 2523 | + | + | - |
| 27 | act-2 | T04C12.5 | ND | N | 2672 | + | + | - |
| 28 | sup-10 | R09G11.1 | CBG01870 | N | 2806 | + | + | - |
| 29 | act-1 | T04C12.4 | ND | N | 3161 | + | + | - |
| 30 | atn-1 | W04D2.1 | CBG23504 | N | 3399 | + | + | - |
| 31 | lam-1 | W03F8.5 | CBG20003 | N | 3463 | - | + | + |
| 32 | unc-52 | ZC101.2 | CBG11064 | N | 3546 | + | + | - |
| 33 | act-3 | T04C12.6 | ND | N | 3913 | + | + | - |
| 34 | unc-68 | K11C4.5 | CBG19042 | N | 4139 | + | - | + |
| 35 | myo-2 | T18D3.4 | CBG00120 | N | 4154 | + | + | - |
| 36 | hlh-1 | B0304.1 | CBG13470 | N | 8517 | - | + | + |
| 37 | epi-1 | K08C7.3 | CBG04423 | N | 11,340 | - | + | - |
| 38 | gpd-3 | K10B3.7 | ND | Y | 11,342 | - | + | - |
| 39 | emb-9 | K04H4.1 | CBG10116 | Y | 11,503 | - | + | - |
| 40 | mec-8 | F46A9.6 | CBG03748 | N | 12,040 | - | + | - |
| 41 | egl-19 | C48A7.1 | CBG05858 | N | 12,377 | - | + | - |
Shaded genes were included in the training set for motif discovery; the remaining genes were used as a test set. Putative C. briggsae orthologs of the C. elegans muscle genes are listed. The presence (+) or absence (-) of predicted sites in the 2000 bp upstream of each gene is indicated, along with the rank of the gene when genes are ordered by the combined score of the three motifs in their upstream regions (equation 4). C. elegans genes located inside operons according to Blumenthal et al. (2002) are marked Y (yes) in the Operon column; all others are marked N (no).
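
As a reading aid, the column layout described above can be loaded into simple records. The following is a minimal sketch in Python, not part of the original analysis: the `MuscleGene` class, its field names, and the `parse_row` helper are illustrative assumptions, and only a handful of rows copied from the table are included. It does not compute the combined score of equation 4; it only reads the reported ranks and motif calls.

```python
from dataclasses import dataclass
from typing import Tuple

# A few rows copied verbatim from the table (a sample, not the full table).
SAMPLE_ROWS = """\
| 1 | mlc-3 | F09F7.2 | CBG24046 | N | 1 | + | + | + |
| 4 | gpd-2 | K10B3.8 | ND | Y | 16 | + | + | + |
| 15 | act-4 | M03F4.2 | ND | N | 573 | + | + | - |
| 36 | hlh-1 | B0304.1 | CBG13470 | N | 8517 | - | + | + |
| 38 | gpd-3 | K10B3.7 | ND | Y | 11,342 | - | + | - |
"""


@dataclass
class MuscleGene:
    """Hypothetical record type mirroring the nine table columns."""
    serial: int
    symbol: str
    elegans_id: str
    briggsae_id: str          # kept as given, including "ND"
    in_operon: bool           # True when the Operon column is "Y"
    rank: int                 # reported rank, thousands separator removed
    motifs: Tuple[bool, bool, bool]  # presence of Motif1, Motif2, Motif3


def parse_row(line: str) -> MuscleGene:
    """Split one pipe-delimited table row into a MuscleGene record."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    serial, symbol, ce_id, cb_id, operon, rank, m1, m2, m3 = cells
    return MuscleGene(
        serial=int(serial),
        symbol=symbol,
        elegans_id=ce_id,
        briggsae_id=cb_id,
        in_operon=(operon == "Y"),
        rank=int(rank.replace(",", "")),
        motifs=tuple(m == "+" for m in (m1, m2, m3)),
    )


genes = [parse_row(line) for line in SAMPLE_ROWS.splitlines() if line.strip()]

# Genes in the sample with predicted sites for all three motifs.
all_three = [g.symbol for g in genes if all(g.motifs)]
# Genes in the sample annotated as lying inside an operon.
operon_genes = [g.symbol for g in genes if g.in_operon]

print(all_three)     # ['mlc-3', 'gpd-2']
print(operon_genes)  # ['gpd-2', 'gpd-3']
```

Stripping the thousands separator before converting the rank lets rows written as `8517` and `11,342` be compared numerically.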











