Inferring gene expression from ribosomal promoter sequences, a crowdsourcing approach

Gustavo Stolovitzky1

  1 IBM;
  2 University of Notre Dame;
  3 Princeton University;
  4 Weizmann Institute of Science;
  5 -
  * Corresponding author; email: pmeyerr{at}us.ibm.com

Abstract

The Gene Promoter Expression Prediction challenge consisted of predicting gene expression from promoter sequences in a previously unknown, experimentally generated data set. The challenge was presented to the community in the framework of the sixth Dialogue for Reverse Engineering Assessments and Methods (DREAM6), a community effort to evaluate the status of systems biology modeling methodologies. Nucleotide-specific promoter activity was obtained by measuring fluorescence from promoter sequences fused upstream of a yellow fluorescent protein gene and inserted in the same genomic site of the yeast S. cerevisiae. Twenty-one teams submitted results predicting the expression levels of 53 different promoters from yeast ribosomal protein genes. Analysis of participant predictions shows that accurate values for lowly expressed and mutated promoters were difficult to obtain, although in the latter case only when the mutation induced a large change in promoter activity compared to the wild-type sequence. As in previous DREAM challenges, we found that aggregation of participant predictions provided robust results, but did not fare better than the three best algorithms. Finally, this study not only provides a benchmark for the assessment of methods predicting the activity of a specific set of promoters from their sequence, but it also shows that the top-performing algorithm, which used machine-learning approaches, can be improved by the addition of biological features such as transcription factor binding sites.
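The idea of aggregating participant predictions can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the challenge itself: the data are synthetic, the aggregation rule is a simple per-promoter mean, the score is a Pearson correlation, and the names `pearson` and `aggregate` are hypothetical. Only the counts (21 teams, 53 promoters) come from the abstract.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def aggregate(predictions):
    """Average the teams' predictions promoter by promoter (assumed rule)."""
    return [mean(column) for column in zip(*predictions)]


random.seed(0)  # fixed seed so the synthetic example is reproducible
n_promoters, n_teams = 53, 21  # counts taken from the abstract

# Synthetic "measured" activities and noisy team predictions (not real data).
truth = [random.uniform(0.0, 10.0) for _ in range(n_promoters)]
teams = [[t + random.gauss(0.0, 3.0) for t in truth] for _ in range(n_teams)]

team_scores = sorted((pearson(p, truth) for p in teams), reverse=True)
aggregate_score = pearson(aggregate(teams), truth)

print("best three team scores:", [round(s, 3) for s in team_scores[:3]])
print("aggregate score:", round(aggregate_score, 3))
```

Because the simulated teams make independent errors, averaging cancels much of the noise here. Real submissions share biases and differ widely in quality, which is consistent with the reported finding that the aggregate was robust but did not outperform the three best algorithms.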

  • Received March 12, 2013.
  • Accepted August 14, 2013.

This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported), as described at http://creativecommons.org/licenses/by-nc/3.0/.
