Most Frequent Tripeptide Sequences Observed Within the Genomes Studied
| Organism | N(Occ) | N(Seq) (expected) | N(Seq) (observed) | Sequences observed |
| --- | --- | --- | --- | --- |
| M. jannaschii | 12 | 0.1 ± 0.3 | 2 | KEE (4.0), LKK (1.8) |
| (1773 ORFs) | 10 | 0.4 ± 0.6 | 1 | KKL (1.9) |
| | 8 | 1.4 ± 1.1 | 2 | KKK (1.0), LKE (1.6) |
| | 7 | 2.7 ± 1.6 | 4 | IIK (2.1), KKE (1.1), LNK (3.0), RLL (4.8) |
| | 6 | 5.4 ± 2.2 | 9 | EKE (1.6), EKL (1.8), IKK (1.0), KIE (1.8), KKD (3.3), KKI (1.3), RKK (1.8), VKE (2.6), VKK (2.0) |
| E. coli | 11 | 0.0 ± 0.2 | 1 | AKK (3.5) |
| (4290 ORFs) | 10 | 0.1 ± 0.3 | 3 | KKK (3.5), RSH (9.7), RSR (4.8) |
| | 9 | 0.4 ± 0.6 | 2 | EAK (2.5), RLK (2.5) |
| | 8 | 1.2 ± 1.1 | 7 | AAQ (3.9), EEA (3.5), EVK (3.3), GLL (4.3), LEI (8.0), LLG (2.9), RRG (4.7) |
| | 7 | 3.6 ± 1.8 | 8 | DGE (7.4), EEV (6.5), GGK (4.1), KLA (2.4), LAS (2.8), NLA (4.0), RRR (3.1), SEE (5.4) |
| S. cerevisiae | 21 | 0.0 ± 0.0 | 1 | SKK (3.3) |
| (6215 ORFs) | 19 | 0.0 ± 0.0 | 1 | KKK (2.8) |
| | 17 | 0.0 ± 0.1 | 1 | VGE (16.5) |
| | 16 | 0.0 ± 0.1 | 2 | AKK (4.3), WIH (120.2) |
| | 15 | 0.0 ± 0.2 | 2 | DEL (7.3), IAN (12.0) |
| | 13 | 0.1 ± 0.3 | 1 | SKL (2.2) |
| | 12 | 0.3 ± 0.5 | 4 | EKK (1.5), LKK (1.5), LLL (1.9), LSK (1.9) |
| | 11 | 0.6 ± 0.7 | 2 | KKE (2.9), LLK (1.6) |
| | 10 | 1.3 ± 1.1 | 1 | GKK (2.8) |
| | 9 | 2.9 ± 1.7 | 7 | DEE (7.1), DSK (2.7), FWC (158.6), LSI (2.3), MLL (6.5), QKI (4.6), SSS (3.0) |
| | 8 | 6.1 ± 2.4 | 12 | DDE (7.0), EVD (8.8), IPK (5.8), KEK (2.2), KKD (2.6), KKN (1.9), LDL (2.3), LLV (2.6), RRK (3.5), SLA (3.2), SSL (1.7), TKK (2.2) |
| A. thaliana | 99 | 0.0 ± 0.0 | 1 | SSS (3.4) |
| (25561 ORFs) | 54 | 0.0 ± 0.0 | 1 | DYW (95.3) |
| | 43 | 0.0 ± 0.1 | 1 | SSL (1.4) |
| | 40 | 0.0 ± 0.2 | 1 | ASS (2.6) |
| | 39 | 0.1 ± 0.2 | 2 | DEL (5.2), TSS (2.6) |
| | 38 | 0.1 ± 0.3 | 1 | SKL (1.8) |
| | 36 | 0.2 ± 0.4 | 1 | LKL (1.8) |
| | 34 | 0.2 ± 0.5 | 1 | LLS (1.6) |
| | 32 | 0.3 ± 0.6 | 1 | EEE (8.2) |
| | 31 | 0.4 ± 0.6 | 1 | SST (2.3) |
| | 30 | 0.5 ± 0.7 | 2 | LSS (1.1), STS (2.0) |
| | 29 | 0.6 ± 0.8 | 4 | KKK (3.4), LLL (1.3), PSS (2.1), RRR (3.8) |
| | 28 | 0.8 ± 0.9 | 2 | SSI (1.8), VSS (1.7) |
| | 26 | 1.2 ± 1.1 | 3 | DSD (4.3), GSS (1.9), LVF (3.2) |
| | 25 | 1.6 ± 1.3 | 2 | DEE (6.7), KKR (2.7) |
| | 24 | 1.9 ± 1.3 | 4 | FLL (2.1), FSS (1.7), LSL (0.9), RRS (2.1) |
| | 23 | 2.5 ± 1.5 | 8 | DDE (7.5), EED (6.8), SFL (1.7), SLL (1.0), SSR (1.2), SSV (1.1), VSA (2.3), VTL (2.7) |
| C. elegans | 70 | 0.0 ± 0.0 | 1 | KKK (4.6) |
| (19833 ORFs) | 45 | 0.0 ± 0.0 | 1 | LCE (20.4) |
| | 38 | 0.0 ± 0.0 | 2 | SKL (2.3), YNP (33.7) |
| | 36 | 0.0 ± 0.0 | 1 | PGY (20.8) |
| | 32 | 0.0 ± 0.0 | 2 | GKK (4.3), TKY (5.8) |
| | 30 | 0.0 ± 0.0 | 1 | SSK (2.2) |
| | 28 | 0.0 ± 0.1 | 3 | DDE (11.5), KKN (2.2), SKK (1.8) |
| | 26 | 0.1 ± 0.2 | 2 | DSD (7.7), RRK (3.8) |
| | 24 | 0.2 ± 0.4 | 5 | AKK (2.8), DEE (8.4), KKL (1.5), KRK (2.4), LKK (1.6) |
| | 23 | 0.3 ± 0.5 | 5 | AKL (2.6), DEL (4.9), GRK (4.6), KKE (2.3), KKI (2.1) |
| | 22 | 0.4 ± 0.6 | 3 | EKK (2.3), SKN (1.6), TNS (3.9) |
| | 21 | 0.7 ± 0.8 | 1 | TRR (5.4) |
| | 20 | 1.1 ± 1.1 | 4 | ERA (5.3), KKQ (2.3), RKL (1.8), RRR (5.7) |
| | 19 | 1.8 ± 1.3 | 7 | DKE (3.6), FGK (4.3), INY (5.4), LGL (2.8), NKK (3.1), SSF (1.5), VSS (2.9) |
| | 18 | 2.6 ± 1.5 | 9 | EKL (1.8), FGG (12.2), KSE (2.1), LFN (2.5), LKI (1.6), RIC (9.5), SRR (3.3), SSS (1.8), VKK (1.8) |
| H. sapiens | 32 | 0.0 ± 0.0 | 1 | DEL (6.3) |
| (14760 ORFs) | 31 | 0.0 ± 0.0 | 1 | EKK (5.3) |
| | 28 | 0.0 ± 0.0 | 1 | KKK (4.5) |
| | 25 | 0.0 ± 0.1 | 1 | LKF (5.1) |
| | 22 | 0.1 ± 0.3 | 1 | EEE (6.3) |
| | 21 | 0.2 ± 0.4 | 2 | LLL (1.6), SDQ (6.0) |
| | 20 | 0.3 ± 0.5 | 2 | LAL (2.2), SSK (1.9) |
| | 19 | 0.4 ± 0.6 | 3 | EEL (2.5), LLK (2.2), WNK (28.0) |
| | 18 | 0.7 ± 0.8 | 3 | ASS (2.1), TRL (2.7), TSL (1.8) |
| | 17 | 1.0 ± 1.0 | 6 | KGK (3.4), KRK (3.3), LGL (1.6), LLS (1.6), RKK (3.5), SLL (1.2) |
| | 16 | 1.6 ± 1.3 | 5 | EDD (7.1), RRR (5.8), SES (1.7), SKL (1.2), TEL (2.2) |
| | 15 | 2.7 ± 1.6 | 9 | GSS (1.9), KRR (4.2), NKI (8.5), PSS (1.8), RRK (3.8), SSL (1.0), SSS (1.2), TKL (1.8), TVV (5.0) |
| | 14 | 4.2 ± 2.0 | 9 | APL (2.2), EKP (3.2), ERA (4.1), GKK (2.6), KSS (1.5), LVS (2.2), PGP (4.4), SCC (11.1), TEV (3.3) |
| | 13 | 6.5 ± 2.4 | 13 | AKL (1.6), CGF (12.8), DSD (4.7), DTM (18.3), EDL (2.3), KKN (3.9), LEA (2.6), PPQ (4.8), SHL (2.8), SSP (1.7), SVS (1.9), TSI (3.3), VSS (2.0) |
| | 12 | 10.1 ± 3.0 | 20 | AAS (2.2), EED (3.8), EKL (1.4), EVD (5.4), FGG (9.4), KAK (2.5), LKL (1.0), LPQ (3.0), LSL (0.9), LSS (1.0), PAS (2.3), QGL (2.6), RPY (7.6), SEI (2.6), SLS (1.0), SLT (2.0), SSV (1.4), TAL (1.9), TTV (3.8), VLL (1.7) |
For each organism, the number of ORFs used in the analysis is given in parentheses. N(Occ) is the number of occurrences of a particular sequence in the genome; N(Seq) is the number of sequences that occur N(Occ) times. The expected value of N(Seq) is derived from the genome jumbling method (uncertainties are one standard deviation). The value in parentheses after each sequence is the ratio of the number of times that sequence is observed to the number of times it is expected based on positional amino acid frequencies. Sequences in boldface are known recognition motifs; italicized sequences belong, entirely or in part, to highly repeated sequences (e.g., homologous proteins or transposon ORFs); and underlined sequences take the form XKK (XSS in A. thaliana).
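The quantities defined in this legend can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on assumptions that are not stated in the legend: the input is a list holding one tripeptide per ORF, taken from the same location in each protein; the expected count of a sequence is the product of the residue frequencies at the three positions, multiplied by the number of ORFs; and "jumbling" is modeled as an independent permutation of the residues at each position across ORFs. All function names are hypothetical.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def occurrence_counts(tripeptides):
    """Map each tripeptide to its number of occurrences, N(Occ)."""
    return Counter(tripeptides)


def n_seq_histogram(counts):
    """Map each N(Occ) value to N(Seq), the number of sequences with that count."""
    return Counter(counts.values())


def observed_to_expected(tripeptides):
    """Ratio of observed count to the count expected from positional frequencies."""
    n = len(tripeptides)
    # Residue counts at each of the three positions, treated as independent
    # (an assumption about what "positional amino acid frequencies" means).
    position_counts = [Counter(t[i] for t in tripeptides) for i in range(3)]
    ratios = {}
    for seq, observed in occurrence_counts(tripeptides).items():
        expected = n
        for i, residue in enumerate(seq):
            expected *= position_counts[i][residue] / n
        ratios[seq] = observed / expected
    return ratios


def jumbled_n_seq(tripeptides, n_occ, trials=200, seed=0):
    """Mean and standard deviation of N(Seq) at n_occ over shuffled data sets.

    Assumption: each trial permutes the residues at each position
    independently across ORFs, which preserves positional frequencies.
    """
    rng = random.Random(seed)
    columns = [[t[i] for t in tripeptides] for i in range(3)]
    results = []
    for _ in range(trials):
        for column in columns:
            rng.shuffle(column)
        shuffled = ["".join(residues) for residues in zip(*columns)]
        # Counter returns 0 when no sequence occurs exactly n_occ times.
        results.append(n_seq_histogram(occurrence_counts(shuffled))[n_occ])
    return mean(results), pstdev(results)


if __name__ == "__main__":
    toy = ["SKL", "SKL", "SKL", "AKK", "DEL", "DEL"]  # made-up data
    print(n_seq_histogram(occurrence_counts(toy)))
    print(observed_to_expected(toy))
    print(jumbled_n_seq(toy, n_occ=3))
```

Under these assumptions, a table row corresponds to one N(Occ) value: the observed N(Seq) comes from `n_seq_histogram`, the expected N(Seq) and its uncertainty from `jumbled_n_seq`, and the parenthesized ratios from `observed_to_expected`.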











