Abundance of Four-Base Words in Expressed Signatures

| Rank (by ratio)<sup>a</sup> | Word<sup>b</sup> | N2 (2-step)<sup>c</sup> | N4 (4-step) | N2:N4 Adj. ratio<sup>d</sup> | ||||
|---|---|---|---|---|---|---|---|---|
| A. Word use in 17-base signatures for which the 2-step abundance was significantly greater than 4-step abundance. This subset of expressed signatures is described as “Bin 2” in the Methods section. | ||||||||
| 1 | **GGCC** | 1 | 232 | 0.0043 | ||||
| 2 | **TTAA** | 6 | 846 | 0.0071 | ||||
| 3 | **TATA** | 3 | 415 | 0.0072 | ||||
| 4 | **TCGA** | 7 | 757 | 0.0092 | ||||
| 5 | **AGCT** | 11 | 1120 | 0.0098 | ||||
| 6 | **CGCG** | 2 | 155 | 0.0129 | ||||
| 7 | **TGCA** | 13 | 819 | 0.0159 | ||||
| 8 | **CATG** | 15 | 747 | 0.0201 | ||||
| 9 | **ATAT** | 18 | 762 | 0.0236 | ||||
| 10 | **GCGC** | 3 | 106 | 0.0283 | ||||
| 11 | **CCGG** | 13 | 401 | 0.0324 | ||||
| 12 | **AATT** | 33 | 887 | 0.0372 | ||||
| 13 | **GTAC** | 12 | 317 | 0.0379 | ||||
| 14 | **ACGT** | 12 | 291 | 0.0412 | ||||
| 15 | **CTAG** | 30 | 341 | 0.0880 | ||||
| 16 | TTTA | 111 | 788 | 0.1409 | ||||
| 17 | TAAA | 127 | 705 | 0.1801 | ||||
| 18 | TTGA | 202 | 1034 | 0.1954 | ||||
| 19 | TAGA | 81 | 412 | 0.1966 | ||||
| 20 | TCAC | 138 | 436 | 0.3165 | ||||
| 21 | TCGG | 89 | 249 | 0.3574 | ||||
| 22 | TCAA | 311 | 799 | 0.3892 | ||||
| 23 | TAAG | 171 | 403 | 0.4243 | ||||
| 24 | TGAC | 154 | 352 | 0.4375 | ||||
| 25 | TCCG | 98 | 221 | 0.4434 | ||||
| [see Supplemental Figures S4 to S6 for complete tables] | ||||||||
| 246 | GAAT | 484 | 172 | 2.8140 | ||||
| 247 | CGTA | 186 | 65 | 2.8615 | ||||
| 248 | GCTT | 490 | 171 | 2.8655 | ||||
| 249 | GATA | 385 | 131 | 2.9389 | ||||
| 250 | ACAT | 488 | 159 | 3.0692 | ||||
| 251 | GCTA | 307 | 85 | 3.6118 | ||||
| 252 | ACTC | 495 | 136 | 3.6397 | ||||
| 253 | ACTT | 600 | 162 | 3.7037 | ||||
| 254 | CCTA | 265 | 70 | 3.7857 | ||||
| 255 | ACTA | 365 | 89 | 4.1011 | ||||
| B. Word use in 17-base signatures for which the 4-step abundance was significantly greater than 2-step abundance. | ||||||||
| 1 | **GCGC** | 189 | 1 | 0.0053 | ||||
| 2 | **TTAA** | 837 | 10 | 0.0119 | ||||
| 3 | **AGCT** | 1359 | 21 | 0.0155 | ||||
| 4 | **TATA** | 407 | 9 | 0.0221 | ||||
| 5 | **GGCC** | 321 | 9 | 0.0280 | ||||
| 6 | **TCGA** | 782 | 22 | 0.0281 | ||||
| 7 | **TGCA** | 886 | 27 | 0.0305 | ||||
| 8 | **CCGG** | 449 | 19 | 0.0423 | ||||
| 9 | **CATG** | 729 | 31 | 0.0425 | ||||
| 10 | **ATAT** | 726 | 33 | 0.0455 | ||||
| 11 | **GTAC** | 405 | 21 | 0.0519 | ||||
| 12 | **CGCG** | 196 | 11 | 0.0561 | ||||
| 13 | **ACGT** | 366 | 21 | 0.0574 | ||||
| 14 | **AATT** | 879 | 52 | 0.0592 | ||||
| 15 | **CTAG** | 396 | 32 | 0.0808 | ||||
| 16 | TAAA | 750 | 111 | 0.1480 | ||||
| 17 | TTTA | 718 | 112 | 0.1560 | ||||
| 18 | TAGA | 546 | 95 | 0.1740 | ||||
| 19 | TTGA | 1100 | 213 | 0.1936 | ||||
| 20 | TCAC | 502 | 209 | 0.4163 | ||||
| 21 | TGGC | 381 | 161 | 0.4226 | ||||
| 22 | AGCC | 380 | 166 | 0.4368 | ||||
| 23 | TGAC | 414 | 184 | 0.4444 | ||||
| 24 | TTAG | 447 | 200 | 0.4474 | ||||
| 25 | GCGG | 184 | 83 | 0.4511 | ||||
| [see Supplemental Figures S7 to S9 for complete tables] | ||||||||
| 246 | AAAG | 443 | 1151 | 2.5982 | ||||
| 247 | GCTT | 227 | 594 | 2.6167 | ||||
| 248 | CATT | 229 | 605 | 2.6419 | ||||
| 249 | CATA | 157 | 429 | 2.7325 | ||||
| 250 | CCTT | 179 | 504 | 2.8156 | ||||
| 251 | ACTC | 185 | 523 | 2.8270 | ||||
| 252 | ACAT | 183 | 647 | 3.5355 | ||||
| 253 | ACTT | 205 | 744 | 3.6293 | ||||
| 254 | ACTA | 111 | 444 | 4.0000 | ||||
| 255 | CCTA | 60 | 265 | 4.4167 | ||||

<sup>a</sup> For brevity, only the first 25 and last 10 rows of the 255 four-base words are shown; GATC was not considered because it is rarely observed among expressed signatures. For the complete set of data corresponding to this subset of signatures, the other “bins”, and the 20-base expressed signatures, see Supplemental Figures S4-S9.

<sup>b</sup> Palindromic words are shown in bold; other “bad” words are indicated in italics. Frame 1 (see Fig. 5A) was not considered because only the 16 words initiating with “TC” can be observed in this frame.

<sup>c</sup> “N” indicates the frequency of occurrence of the word among the frames of the expressed signatures for either of the indicated steppers. The frequency of the words was calculated with all expressed signatures weighted equally, independent of expression abundance.

<sup>d</sup> “Adj. ratio” indicates that the ratio was adjusted to account for the different numbers of frames in the 2- and 4-step reactions in which word frequencies were counted. For the 17-base expressed signatures, words were counted in 2-step frames 3 and 5 and in 4-step frames 2, 4, and 6 (Fig. 5A). Therefore, frequency counts for the 4-step words were adjusted by a factor of 2/3 prior to calculating the ratio.
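
The word set and the frame adjustment described in the footnotes can be illustrated with a minimal Python sketch. It is not the authors' code. The function names (`reverse_complement`, `is_palindromic`, `four_base_words`, `adjusted_ratio`) and the example counts are hypothetical. It assumes that “palindromic” means a word equal to its own reverse complement, and that the 2/3 adjustment is applied by multiplying a raw 4-step count before taking the N2:N4 ratio.

```python
from itertools import product

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word: str) -> str:
    """Return the reverse complement of a DNA word."""
    return word.translate(COMPLEMENT)[::-1]


def is_palindromic(word: str) -> bool:
    """Assumption: 'palindromic' means equal to its own reverse complement."""
    return word == reverse_complement(word)


def four_base_words(excluded=("GATC",)):
    """Enumerate all four-base words, dropping the excluded ones (GATC per footnote a)."""
    all_words = ("".join(bases) for bases in product("ACGT", repeat=4))
    return [w for w in all_words if w not in excluded]


def adjusted_ratio(n2: float, n4_raw: float,
                   frames_2step: int = 2, frames_4step: int = 3) -> float:
    """N2:N4 ratio after scaling the raw 4-step count by frames_2step / frames_4step.

    The defaults reflect footnote d: two 2-step frames (3 and 5) versus
    three 4-step frames (2, 4, and 6), giving a factor of 2/3.
    """
    n4_adjusted = n4_raw * frames_2step / frames_4step
    return n2 / n4_adjusted


words = four_base_words()
palindromes = [w for w in words if is_palindromic(w)]
print(len(words), len(palindromes))        # 255 words, 15 of them palindromic

# Hypothetical counts, not taken from the table: 20 / (300 * 2/3) = 0.1
print(round(adjusted_ratio(20, 300), 4))
```

The 15 palindromic words produced by this sketch are the words ranked 1 to 15 in both parts of the table. The listed ratios in part A are reproduced by dividing the listed counts directly (for example, 6/846 ≈ 0.0071), so the tabulated N4 values may already include the adjustment; treating the input to `adjusted_ratio` as a raw count is therefore an assumption.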











