A large number of novel coding small open reading frames in the intergenic regions of the Arabidopsis thaliana genome are transcribed and/or under purifying selection

Figure 2.

Frequency distributions of posterior probabilities for simulated coding and noncoding sequences. (A) Distribution of the posterior probability (pp) for simulated sequences resembling noncoding sequences (NCDSs). Ten thousand random sequences were generated based on the hexamer and pentamer frequencies of intronic ORFs. The great majority of simulated sequences have very small pp values, and only 5% of the pp values are >0.2239. (B) The pp distribution for simulated sequences resembling coding sequences (CDSs). Random sequences were generated based on cDNA CDSs. Approximately 10% of the CDS-like random sequences have pp values <0.2239.
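
The caption does not spell out the simulation procedure, so the following is only a minimal sketch of one plausible reading. It assumes that "based on the hexamer and pentamer frequencies" means a fifth-order Markov chain, in which each base is drawn in proportion to the count of the hexamer formed by the preceding pentamer plus that base. It also shows how a cutoff such as 0.2239 could be read off as the upper 5% point of the noncoding pp values, and how the fraction of CDS-like sequences falling below it could be counted. All function names here (`kmer_counts`, `simulate_sequence`, `upper_tail_cutoff`, `fraction_below`) are hypothetical and are not taken from the article.

```python
import random
from collections import Counter

BASES = ["A", "C", "G", "T"]


def kmer_counts(seqs, k):
    """Count every overlapping k-mer in a list of training sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def simulate_sequence(hexamers, length, rng):
    """Draw one sequence from a fifth-order Markov chain (an assumption).

    The next base is sampled in proportion to count(previous pentamer + base).
    """
    # Pentamer counts are derived here as hexamer-prefix counts.
    prefixes = Counter()
    for hexamer, count in hexamers.items():
        prefixes[hexamer[:5]] += count
    keys = list(prefixes)
    seq = rng.choices(keys, weights=[prefixes[p] for p in keys])[0]
    while len(seq) < length:
        context = seq[-5:]
        weights = [hexamers.get(context + b, 0) for b in BASES]
        if sum(weights) == 0:
            # Context never seen in training: fall back to uniform bases.
            weights = [1, 1, 1, 1]
        seq += rng.choices(BASES, weights=weights)[0]
    return seq


def upper_tail_cutoff(null_scores, tail=0.05):
    """Return a score with roughly `tail` of the null scores strictly above it."""
    ordered = sorted(null_scores)
    n_above = int(round(tail * len(ordered)))
    index = max(len(ordered) - n_above - 1, 0)
    return ordered[index]


def fraction_below(scores, cutoff):
    """Fraction of scores strictly below the cutoff."""
    return sum(1 for s in scores if s < cutoff) / len(scores)
```

Under these assumptions, one would train `kmer_counts(..., 6)` on intronic ORFs, simulate ten thousand sequences, score each with the posterior-probability model (not shown here), and pass the resulting values to `upper_tail_cutoff`. Applying `fraction_below` to the scores of CDS-like simulations then gives the share of coding-like sequences that the cutoff would miss.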

This Article

  1. Genome Res. 17: 632-640
