Effects of transcriptional noise on estimates of gene and transcript expression in RNA sequencing experiments

  • Mihaela Pertea
  • Johns Hopkins University
  • * Corresponding author; email: ales.varabyou{at}jhu.edu
  • Abstract

    RNA sequencing is widely used to measure gene expression across a vast range of animal and plant tissues and conditions. Most studies of computational methods for gene expression analysis use simulated data to evaluate the accuracy of these methods. These simulations typically include reads generated from known genes at varying levels of expression. Until now, simulations have not included reads arising from transcriptional noise, that is, the products of erroneous transcription, erroneous splicing, and other processes that affect transcription in living cells. Here we examine the effects of realistic amounts of transcriptional noise on the ability of leading computational methods to assemble and quantify the genes and transcripts in an RNA-sequencing experiment. We demonstrate that the inclusion of noise leads to systematic errors in how these programs measure expression, including systematic underestimates of transcript abundance levels and large increases in the number of false-positive genes and transcripts. Our results also suggest that alignment-free computational methods sometimes fail to detect transcripts expressed at relatively low levels.

    • Received May 21, 2020.
    • Accepted December 18, 2020.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.

    ACCEPTED MANUSCRIPT

    Genome Res. gr.266213.120 Published by Cold Spring Harbor Laboratory Press
