Revisiting the Saccharomyces cerevisiae predicted ORFeome

Qian-Ru Li (1,6), Anne-Ruxandra Carvunis (1,2,6), Haiyuan Yu (1,6), Jing-Dong J. Han (1,6,7), Quan Zhong (1), Nicolas Simonis (1), Stanley Tam (1), Tong Hao (1), Niels J. Klitgord (1), Denis Dupuy (1), Danny Mou (1), Ilan Wapinski (3,4), Aviv Regev (3,5), David E. Hill (1), Michael E. Cusick (1), and Marc Vidal (1,8)

  1. Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, and Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA;
  2. TIMC-IMAG, CNRS UMR5525, Faculté de Médecine, 38706 La Tronche Cedex, France;
  3. Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA;
  4. School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, USA;
  5. Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
  6. These authors contributed equally to this work.

Abstract

Accurately defining the coding potential of an organism, i.e., all protein-encoding open reading frames (ORFs) or “ORFeome,” is a prerequisite to fully understanding its biology. ORFeome annotation involves iterative computational predictions from genome sequences combined with experimental verifications. Here we reexamine a set of Saccharomyces cerevisiae “orphan” ORFs recently removed from the original ORFeome annotation due to lack of conservation across evolutionarily related yeast species. We show that many orphan ORFs produce detectable transcripts and/or translated products in various functional genomics and proteomics experiments. By combining a naïve Bayes model that predicts the likelihood that an ORF encodes a functional product with experimental verification of strand-specific transcripts, we argue that orphan ORFs should remain candidates for functional ORFs. In support of this model, interstrain intraspecies genome sequence variation is lower across orphan ORFs than in intergenic regions, indicating that orphan ORFs endure functional constraints and resist deleterious mutations. We conclude that ORFs should be evaluated based on multiple levels of evidence and not be removed from ORFeome annotation solely based on low sequence conservation in other species. Rather, such ORFs might be important for micro-evolutionary divergence between species.
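
To illustrate how a naïve Bayes model can combine several independent lines of evidence into one likelihood that an ORF is functional, here is a minimal sketch. It is not the authors' model: the evidence names, the likelihood values, the prior, and the function `posterior_functional` are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

# Hypothetical evidence types with illustrative likelihoods. None of these
# names or numbers come from the article.
# Each entry: (P(evidence | functional), P(evidence | non-functional))
LIKELIHOODS = {
    "transcript_detected": (0.90, 0.30),
    "peptide_detected": (0.60, 0.05),
    "conserved_in_related_species": (0.80, 0.10),
}


def posterior_functional(evidence, prior=0.5):
    """Return P(functional | evidence) under a naive Bayes model.

    `evidence` maps an evidence name to True or False. Evidence types
    missing from the dict are treated as unobserved and skipped.
    """
    # Work in log-odds so independent evidence terms simply add up.
    log_odds = math.log(prior / (1.0 - prior))
    for name, (p_func, p_non) in LIKELIHOODS.items():
        if name not in evidence:
            continue
        if evidence[name]:
            log_odds += math.log(p_func / p_non)
        else:
            log_odds += math.log((1.0 - p_func) / (1.0 - p_non))
    # Convert log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# An "orphan"-like ORF: expressed and translated, but not conserved.
orphan = {
    "transcript_detected": True,
    "peptide_detected": True,
    "conserved_in_related_species": False,
}
orphan_posterior = posterior_functional(orphan)
```

With these made-up numbers, the lack of conservation lowers the odds, but the transcript and peptide evidence outweigh it, so the posterior stays well above the prior. This mirrors the abstract's point that low conservation alone need not rule an ORF out.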

Footnotes

  • 7 Present address: Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China.

  • 8 Corresponding author. E-mail marc_vidal{at}dfci.harvard.edu; fax (617) 632-5739.

  • [Supplemental material is available online at www.genome.org.]

  • Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.076661.108.

    • Received January 29, 2008.
    • Accepted May 5, 2008.
