Analysis of Small Human Proteins Reveals the Translation of Upstream Open Reading Frames of mRNAs

Table 2.

Sequence Conservation of the Upstream CDS and the Longest ORF of Each RefSeq cDNA


RefSeq ID | Upstream CDS^a (%) | Longest ORF^a (%) | Longest ORF RefSeq definition
NM_005770 | 100 | 94 | Small EDRK-rich factor 2 (SERF2)
NM_015532 | 95 | 89 | Glutamate receptor, ionotropic, N-methyl D-aspartate-like 1A (GRINL1A)
NM_016215 | 71 | 92 | EGF-like domain, multiple 7 (EGFL7)
  • Each amino acid sequence was deduced from the corresponding nucleotide sequence region of each cDNA. The mouse orthologous regions were extracted from the cDNA data below.

    NM_005770 [upstream CDS: NM_011354 (+19−+198); longest ORF: NM_011354 (+101−+397)].

    NM_015532 [upstream CDS: AK051868 (+13−+267); longest ORF: AK051868 (+122−+1222)].

    NM_016215 [upstream CDS: NM_198724 (+125−+367); longest ORF: NM_198724 (+294−+1130)].

  • a Each value is the percentage of similar amino acid residues over the entire aligned region when the human protein sequence is aligned with that of its mouse ortholog using CLUSTAL W (http://www.ddbj.nig.ac.jp/E-mail/clustalw-j.html).
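
The similarity values in the table can be illustrated with a minimal sketch that scores an already aligned pair of protein sequences. Several points here are assumptions rather than details given in the table notes: "similar" is taken to mean identical residues or residues in the same CLUSTAL W "strong" conservation group, the denominator is taken to be the full alignment length including gapped columns, and the function names and toy sequences are hypothetical.

```python
# Assumption: residue groups that CLUSTAL W marks as strongly conserved (':').
# The table notes do not state which similarity definition was actually used.
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]


def residues_similar(a: str, b: str) -> bool:
    """Return True if two aligned residues are identical or share a strong group."""
    if a == "-" or b == "-":
        return False  # a gap never counts as similar
    if a == b:
        return True
    return any(a in group and b in group for group in STRONG_GROUPS)


def percent_similarity(aligned_human: str, aligned_mouse: str) -> float:
    """Percentage of similar columns over the entire alignment (gaps included)."""
    if len(aligned_human) != len(aligned_mouse):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_human:
        raise ValueError("alignment is empty")
    pairs = zip(aligned_human.upper(), aligned_mouse.upper())
    similar = sum(residues_similar(a, b) for a, b in pairs)
    return 100.0 * similar / len(aligned_human)


if __name__ == "__main__":
    # Toy alignment for illustration only; not taken from the cDNAs listed above.
    print(round(percent_similarity("MKT-LV", "MRSALI"), 1))
```

In the toy alignment, five of the six columns are similar (one identical pair is M/M, another is L/L, and K/R, T/S and V/I each share a strong group), while the gapped column counts against the score.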

Genome Res. 14: 2048-2052
