Nonrandom Tripeptide Sequence Distributions at Protein Carboxyl Termini

Table 1.

Position-Specific Amino Acid Frequencies (Expressed as a Percentage) at the Three C-Terminal Positions for Each of the Genomes Studied

AA  M. jannaschii             E. coli                   S. cerevisiae
     −1    −2    −3   ORF      −1    −2    −3   ORF      −1    −2    −3   ORF
A   2.4   1.9   2.7   5.4    10.2   9.4   7.6   9.5     5.2   4.2   4.8   5.5
C   0.8   0.5   1.0   1.3     1.2   1.2   1.1   1.2     1.7   1.5   1.4   1.3
D   4.0   3.2   4.3   5.5     4.1   3.7   4.1   5.1     5.1   5.2   5.4   5.8
E  13.3   8.6   9.0   8.7     8.2   6.4   8.3   5.7     6.5   5.8   6.2   6.5
F   4.5   5.4   3.9   4.3     3.6   2.8   3.4   3.9     5.3   5.8   5.5   4.5
G   4.9   6.7   4.6   6.3     6.7   6.5   5.7   7.4     2.6   4.5   4.6   5.0
H   1.4   1.1   1.2   1.4     4.3   2.6   2.7   2.3     2.7   2.3   2.3   2.2
I  10.0   9.6  11.2  10.5     3.7   4.0   5.2   6.0     7.1   6.3   6.6   6.6
K  17.2  17.6  14.8  10.4    10.9   8.7   7.0   4.4    11.5  10.9   8.8   7.3
L  11.7  11.2  12.3   9.5     7.9   9.6  10.1  10.6    10.5   9.3  10.3   9.6
M   0.6   2.6   2.3   2.3     1.3   2.1   1.7   2.8     2.2   1.9   2.3   2.1
N   4.2   6.1   5.4   5.3     4.0   4.4   4.2   4.0     7.1   5.5   5.8   6.1
P   1.5   1.4   2.0   3.4     2.9   3.2   4.7   4.4     2.3   2.9   3.6   4.3
Q   3.7   1.9   1.3   1.4     6.6   5.4   4.7   4.4     4.4   4.0   4.1   3.9
R   5.6   6.5   6.3   3.8     8.7   7.5   7.9   5.5     4.7   6.5   4.9   4.5
S   3.8   4.6   5.2   4.5     6.2   7.0   5.9   5.8     6.7   8.7   8.3   9.0
T   1.9   3.2   2.7   4.0     1.1   5.3   4.7   5.4     3.8   5.3   4.6   5.9
V   2.9   4.2   5.5   6.8     4.8   6.3   6.5   7.1     5.3   4.6   5.7   5.6
W   1.5   0.8   0.9   0.7     1.6   1.1   1.5   1.5     1.7   1.0   1.3   1.0
Y   4.1   2.8   3.6   4.4     2.2   2.8   3.0   2.9     3.5   3.8   3.7   3.4

AA  C. elegans                A. thaliana               H. sapiens
     −1    −2    −3   ORF      −1    −2    −3   ORF      −1    −2    −3   ORF
A   5.2   4.1   4.8   6.3     6.2   4.9   5.5   6.3     5.3   6.0   6.4   7.0
C   2.7   2.2   2.0   2.1     2.6   2.2   2.0   1.8     3.2   2.7   2.4   2.2
D   4.3   4.7   4.6   5.3     4.5   4.8   4.9   5.5     4.5   5.0   4.4   4.9
E   5.7   5.5   5.4   6.5     5.1   5.8   5.1   6.8     5.1   6.9   6.8   7.1
F   8.5   5.7   5.7   4.9     5.7   5.0   5.0   4.3     4.6   3.4   3.8   3.7
G   3.0   4.4   4.2   5.3     3.7   5.5   4.9   6.4     3.7   6.2   6.1   6.8
H   3.4   2.6   2.2   2.3     2.6   2.5   2.4   2.3     3.5   2.8   2.7   2.5
I   6.4   6.1   6.3   6.2     5.4   5.1   5.1   5.3     4.5   3.7   3.6   4.4
K   8.8  10.1   8.7   6.5     6.2   7.7   7.1   6.4     7.5   7.9   7.2   5.7
L   9.3   8.1   8.7   8.7    10.2   8.6   9.9   9.5    11.1   8.5   9.2   9.9
M   1.9   2.2   2.1   2.6     2.1   2.0   2.1   2.4     2.1   1.9   2.1   2.2
N   7.4   6.6   5.1   4.9     5.1   4.5   4.2   4.4     4.0   3.8   3.3   3.7
P   2.6   3.8   4.6   4.9     4.2   4.3   5.0   4.8     5.6   6.1   6.2   6.2
Q   5.0   4.0   3.9   4.1     3.2   3.5   3.9   3.5     4.8   4.8   4.6   4.7
R   4.5   6.7   5.9   5.2     6.6   7.2   6.3   5.4     5.2   6.4   5.6   5.6
S   6.7   8.6   9.0   8.0    10.0  10.8  10.7   9.0     9.5   9.4  10.0   8.0
T   3.2   4.8   6.5   5.8     4.6   5.6   5.4   5.1     4.9   5.2   6.4   5.3
V   6.1   5.3   5.8   6.2     6.9   5.6   6.0   6.7     6.5   4.9   4.9   6.1
W   1.1   1.4   1.1   1.1     1.5   1.5   1.4   1.3     1.4   1.5   1.6   1.2
Y   4.3   3.1   3.4   3.2     3.7   3.1   3.2   2.9     3.1   2.8   2.6   2.7

  • The fourth column for each organism (ORF) shows the amino acid frequencies across the entire ORF array.
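
As an illustration of how a table like this could be tabulated, the following is a minimal sketch and not the authors' pipeline. The function name `cterminal_frequencies`, the handling of a trailing `*` stop symbol, the skipping of sequences shorter than three residues, and the restriction of the percentages to the 20 standard amino acids are all assumptions made for this example. It counts the residue at each of the last three positions of every protein sequence and, for the ORF column, the residues over the full length of every sequence.

```python
from collections import Counter

# The 20 standard amino acids, in the row order used by the table.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def cterminal_frequencies(sequences, n_positions=3):
    """Tabulate percent amino acid frequencies at the last n_positions residues.

    Returns a dict keyed by -1, -2, ..., -n_positions and "ORF"; each value
    maps a one-letter amino acid code to a percentage.
    Assumption: sequences shorter than n_positions are skipped entirely.
    """
    position_counts = {-(i + 1): Counter() for i in range(n_positions)}
    orf_counts = Counter()

    for seq in sequences:
        # Assumption: a trailing "*" marks the stop codon and is not a residue.
        seq = seq.upper().rstrip("*")
        if len(seq) < n_positions:
            continue
        orf_counts.update(seq)  # whole-sequence composition
        for pos, counts in position_counts.items():
            counts[seq[pos]] += 1  # negative index counts from the C terminus

    def to_percent(counts):
        # Assumption: percentages are taken over the 20 standard residues only,
        # so ambiguous codes such as X do not enter the denominator.
        total = sum(counts[aa] for aa in AMINO_ACIDS)
        if total == 0:
            return {aa: 0.0 for aa in AMINO_ACIDS}
        return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}

    table = {pos: to_percent(counts) for pos, counts in position_counts.items()}
    table["ORF"] = to_percent(orf_counts)
    return table


if __name__ == "__main__":
    # Hypothetical toy input; real use would read every ORF of a genome.
    demo = ["MKTAYL", "MSEEKL", "MAAGRL", "MDDKEL"]
    result = cterminal_frequencies(demo)
    for aa in AMINO_ACIDS:
        row = [result[key][aa] for key in (-1, -2, -3, "ORF")]
        print(aa, " ".join(f"{value:5.1f}" for value in row))
```

Comparing each positional column with the ORF column, as the table does, is what shows whether a residue is over- or under-represented near the C terminus relative to its overall abundance.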

Genome Res. 13: 617-623
