BLAT—The BLAST-Like Alignment Tool

Table 6.

Sensitivity and Specificity of Single Near-Perfect (One Mismatch Allowed) Amino Acid K-mer Matches as a Search Criterion

A. Identity  K=4      K=5      K=6     K=7    K=8    K=9
   71%       1.000    0.992    0.946   0.823  0.725  0.515
   73%       1.000    0.995    0.965   0.867  0.785  0.586
   75%       1.000    0.998    0.978   0.905  0.840  0.657
   77%       1.000    0.999    0.987   0.935  0.886  0.727
   79%       1.000    0.999    0.993   0.959  0.924  0.791
   81%       1.000    1.000    0.997   0.976  0.952  0.849
   83%       1.000    1.000    0.999   0.987  0.973  0.897
   85%       1.000    1.000    0.999   0.994  0.986  0.936
   87%       1.000    1.000    1.000   0.997  0.994  0.964
   89%       1.000    1.000    1.000   0.999  0.998  0.982
   91%       1.000    1.000    1.000   1.000  0.999  0.993
   93%       1.000    1.000    1.000   1.000  1.000  0.998

B. K         4        5        6       7      8      9
   F         1.2E+08  6.0E+06  300078  14985  749    37
  • (A) Columns are for K sizes of 4–9. Rows represent various percentage identities between the homologous sequences. The table entries show the fraction of homologies detected. (B) K represents the size of the near-perfect match. F shows how many near-perfect matches of this size are expected to occur by chance in a translated genome of 3 billion bases using a query of 167 amino acids.
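
The entries in both parts of the table can be approximated with a short calculation. The following is a minimal sketch under several assumptions that this page does not state: positions match independently, the homologous region is 33 amino acids long and is tiled by non-overlapping K-mers, the amino acid alphabet has 20 equally likely letters, and a 3-billion-base genome translated in six frames yields 6 billion amino acids. The function names and constants are illustrative, not taken from the BLAT source code.

```python
# Sketch of the sensitivity (A) and chance-match (B) estimates for a
# single near-perfect (one mismatch allowed) amino acid K-mer match.
# All constants except QUERY_AA are assumptions, not from the table page.

AMINO_ALPHABET = 20            # assumed: 20 equally likely amino acids
HOMOLOGY_LEN = 33              # assumed: homologous region length (amino acids)
GENOME_AA = 6_000_000_000      # assumed: 3e9 bases, six frames, 3 bases/codon
QUERY_AA = 167                 # query length given in the table footnote


def near_perfect_prob(match_prob, k):
    """Probability that a K-mer matches at all K positions or all but one."""
    one_mismatch = k * match_prob ** (k - 1) * (1 - match_prob)
    no_mismatch = match_prob ** k
    return one_mismatch + no_mismatch


def sensitivity(identity, k, homology_len=HOMOLOGY_LEN):
    """Fraction of homologies with at least one near-perfect K-mer hit."""
    tiles = homology_len // k  # non-overlapping K-mers in the region
    return 1 - (1 - near_perfect_prob(identity, k)) ** tiles


def chance_matches(k, query_len=QUERY_AA, genome_len=GENOME_AA,
                   alphabet=AMINO_ALPHABET):
    """Expected number of near-perfect K-mer matches arising by chance."""
    return query_len * (genome_len / k) * near_perfect_prob(1 / alphabet, k)


if __name__ == "__main__":
    for k in range(4, 10):
        print(k, f"{sensitivity(0.71, k):.3f}", f"{chance_matches(k):.3g}")
```

Under these assumptions the sketch agrees with the tabulated values that were spot-checked, for example 0.515 for 71% identity at K = 9 and roughly 300078 chance matches at K = 6. A different assumed region length or genome size would shift the numbers accordingly.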

This Article

  1. Genome Res. 12: 656-664
