Tree-based differential testing using inferential uncertainty for RNA-seq

  1. Rob Patro¹
  1. ¹Department of Computer Science, University of Maryland, College Park, Maryland 20742, USA;
  2. ²Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA;
  3. ³Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27514, USA
  • Corresponding author: rob{at}cs.umd.edu
  • Abstract

    Identifying differentially expressed transcripts is a crucial yet challenging problem in transcriptomics. The abundance estimates of certain transcripts carry substantial uncertainty which, if ignored, can inflate the number of false positives and, if accounted for, may reduce power. Here, we introduce a data-driven differential testing method that maximizes biological resolution while retaining statistical power. Given a set of RNA-seq samples, TreeTerminus arranges transcripts in a hierarchical tree structure that encodes different layers of resolution for interpreting the abundance of transcriptional groups, with uncertainty generally decreasing as one ascends the tree from the leaves. We introduce mehenDi, which uses the tree structure from TreeTerminus for differential testing. The nodes output by mehenDi, called the selected nodes, are determined in a data-driven manner to maximize the signal that can be extracted from the data while controlling for the uncertainty associated with estimating transcript abundances. The selected nodes can include both transcripts (leaves) and inner nodes, and no two selected nodes have an ancestor/descendant relationship. We evaluate our method on both simulated and experimental data sets and compare its performance with other tree-based differential testing methods, as well as with uncertainty-aware differential transcript/gene expression methods. Our method detects inner nodes that show a strong signal for differential expression and that would have been overlooked when analyzing the transcripts alone.
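
The idea of selecting nodes at mixed resolution, with no selected node being an ancestor or descendant of another, can be illustrated with a toy sketch. This is not the mehenDi algorithm, whose details are not given in the abstract. The `Node` class, the per-node `uncertainty` score, the `p_value` field, and the thresholds `max_uncertainty` and `alpha` are all hypothetical names and values chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical tree node: a transcript (leaf) or a transcript group (inner node)."""
    name: str
    uncertainty: float  # assumed inferential-uncertainty score (lower is better)
    p_value: float      # assumed differential-expression p-value for this node
    children: List["Node"] = field(default_factory=list)


def select_nodes(node: Node, max_uncertainty: float = 0.5, alpha: float = 0.05) -> List[str]:
    """Toy top-down selection (an illustrative assumption, not the published method).

    Descend only while every child is estimated with acceptable uncertainty.
    Otherwise stop at the current node and report it if it is itself
    low-uncertainty and significant. A call returns either the node itself or
    results from disjoint child subtrees, never both, so no two selected
    nodes can be in an ancestor/descendant relationship.
    """
    if node.children and all(c.uncertainty <= max_uncertainty for c in node.children):
        selected: List[str] = []
        for child in node.children:
            selected.extend(select_nodes(child, max_uncertainty, alpha))
        return selected
    if node.uncertainty <= max_uncertainty and node.p_value <= alpha:
        return [node.name]
    return []


# Made-up example: t1 and t2 are hard to tell apart (high uncertainty),
# but their parent group g1 is well estimated and shows a signal.
root = Node("root", 0.1, 0.2, [
    Node("g1", 0.2, 0.01, [
        Node("t1", 0.9, 0.4),
        Node("t2", 0.8, 0.3),
    ]),
    Node("t3", 0.1, 0.001),
])

print(select_nodes(root))  # ['g1', 't3']
```

In this made-up tree, the inner node `g1` is reported alongside the transcript `t3`. Raising `max_uncertainty` to 1.0, so that uncertainty is effectively ignored and the traversal always reaches the leaves, returns only `t3`, which mirrors the abstract's point that a signal at an inner node can be missed when only transcripts are tested.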

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.279981.124.

    • Freely available online through the Genome Research Open Access option.

    • Received September 3, 2024.
    • Accepted August 7, 2025.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.

    Genome Res. © 2025 Singh et al.; Published by Cold Spring Harbor Laboratory Press
