Recalibrating differential gene expression by genetic dosage variance prioritizes functionally relevant genes

  1. Tuuli Lappalainen1,6,7
  1. Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, 17165 Solna, Sweden;
  2. Department of Integrative Structural and Computational Biology, Scripps Research Institute, La Jolla, California 92037, USA;
  3. Center for Immunity and Immunotherapies, Seattle Children's Research Institute, Seattle, Washington 98101, USA;
  4. Department of Pediatrics, University of Washington School of Medicine, Seattle, Washington 98105, USA;
  5. Department of Genome Science, University of Washington, Seattle, Washington 98195, USA;
  6. New York Genome Center, New York, New York 10013, USA;
  7. Department of Systems Biology, Columbia University, New York, New York 10032, USA
  • Corresponding authors: philipp.rentzsch{at}scilifelab.se, tuuli.lappalainen{at}scilifelab.se
  • Abstract

    Differential expression (DE) analysis is a widely used method for identifying genes that are functionally relevant for an observed phenotype or biological response. However, typical DE analysis includes selection of genes based on a threshold of fold change in expression under the implicit assumption that all genes are equally sensitive to dosage changes of their transcripts. This tends to favor highly variable genes over more constrained genes where even small changes in expression may be biologically relevant. To address this limitation, we have developed a method to recalibrate each gene's DE fold change based on genetic expression variance observed in the human population. The newly established metric ranks statistically differentially expressed genes, not by nominal change of expression, but by relative change in comparison to natural dosage variation for each gene. We apply our method to RNA sequencing data sets from in vitro stimulus response and neuropsychiatric disease experiments. Compared to the standard approach, our method adjusts the bias in discovery toward highly variable genes and enriches for pathways and biological processes related to metabolic and regulatory activity, indicating a prioritization of functionally relevant driver genes. Tissue-specific recalibration increases detection of known disease-relevant processes. Altogether, our method provides a novel view on DE and contributes toward bridging the existing gap between statistical and biological significance. We believe that this approach will simplify the identification of disease-causing molecular processes and enhance the discovery of therapeutic targets.
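The ranking idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the recalibration formula, so the code below assumes the simplest possible form: each gene's log2 fold change is divided by the standard deviation of its genetic expression (dosage) variation in the population. The function names, the `alpha` cutoff, the gene labels, and all numbers are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
def recalibrate(log2_fc, dosage_sd):
    """Express a log2 fold change in units of the gene's genetic dosage SD.

    Assumption: recalibration is a plain ratio; the article's actual
    formula is not stated in the abstract.
    """
    if dosage_sd <= 0:
        raise ValueError("dosage_sd must be positive")
    return log2_fc / dosage_sd


def rank_genes(de_results, dosage_sd_by_gene, alpha=0.05):
    """Rank statistically significant genes by recalibrated effect size.

    de_results: gene -> (log2 fold change, adjusted p-value)
    dosage_sd_by_gene: gene -> SD of genetic expression variation
    Genes that are not significant, or that lack a variance estimate,
    are skipped.
    """
    ranked = []
    for gene, (log2_fc, padj) in de_results.items():
        if padj >= alpha or gene not in dosage_sd_by_gene:
            continue
        ranked.append((gene, recalibrate(log2_fc, dosage_sd_by_gene[gene])))
    # Largest absolute recalibrated change first.
    ranked.sort(key=lambda item: abs(item[1]), reverse=True)
    return ranked


# Hypothetical toy data: GENE_A is highly variable, GENE_B is constrained.
de_results = {
    "GENE_A": (2.0, 0.001),
    "GENE_B": (0.5, 0.01),
    "GENE_C": (3.0, 0.2),  # not significant, so it is dropped
}
dosage_sd_by_gene = {"GENE_A": 1.0, "GENE_B": 0.1, "GENE_C": 0.5}

ranked = rank_genes(de_results, dosage_sd_by_gene)
```

In this toy example the constrained gene ranks above the highly variable one even though its nominal fold change is smaller, which is the shift in prioritization the abstract describes.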

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.280360.124.

    • Freely available online through the Genome Research Open Access option.

    • Received January 9, 2025.
    • Accepted July 8, 2025.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
