Analysis of coding gene expression from small RNA sequencing
These authors contributed equally to this work.
Abstract
The popularity of microRNA expression analyses is reflected by the existence of thousands of sRNA-seq studies in which matched total RNA-seq data are often unavailable. The lack of paired sequencing experiments limits the analysis of microRNA–gene regulatory networks. Here, we explore whether protein-coding gene expression can be quantified directly from transcript fragments present in sRNA-seq experiments. We analyze studies containing matched total RNA and small RNA from four human tissues and recover transcript fragments from the sRNA-seq data sets. We find that the expression levels of protein-coding gene transcripts derived from sRNA-seq data sets are comparable to those from total RNA-seq experiments (R² ranging from 0.33 to 0.76). Analyses across multiple tissues and species show similar correlations, indicating that the approach is applicable across organisms. We confirm that transcript half-life and the expression of housekeeping or highly abundant genes do not bias the results. Analysis of the expression of both microRNAs and coding genes from the same sRNA-seq experiments demonstrates that, as expected, the expression profiles of known microRNA–target pairs are inversely correlated. For a dual mRNA/miRNA profile, we recommend sequencing the ≥25 nucleotide fraction at 5 million or more reads. To confirm the utility of this approach, we apply our method to breast cancer sRNA-seq data sets lacking total RNA-seq data and achieve 75% recall and 64% accuracy comparing inferred coding gene expression with qPCR-validated targets. Our findings demonstrate that quantifying mRNA fragments from sRNA-seq experiments provides a reliable approach to investigate microRNA–mRNA interactions when total RNA-seq is unavailable.
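The comparison between assays summarized above can be illustrated with a minimal sketch. It assumes per-gene read counts are already available for both libraries, normalizes them to log2 counts per million, and computes the squared Pearson correlation over shared genes. The function names, the pseudocount, the choice of normalization, and the toy counts are all hypothetical and are not taken from the article; the authors' actual pipeline may differ.

```python
import math


def log_cpm(counts, pseudocount=1.0):
    """Convert raw per-gene counts to log2 counts per million.

    The pseudocount (an assumption of this sketch) avoids log2(0).
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no reads")
    return {g: math.log2(c / total * 1e6 + pseudocount) for g, c in counts.items()}


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("zero variance in one of the inputs")
    return (sxy * sxy) / (sxx * syy)


def compare_assays(srna_counts, total_counts):
    """R^2 of log2-CPM expression over genes present in both libraries."""
    srna = log_cpm(srna_counts)
    total = log_cpm(total_counts)
    shared = sorted(set(srna) & set(total))
    return r_squared([srna[g] for g in shared], [total[g] for g in shared])


if __name__ == "__main__":
    # Toy counts, invented for illustration only.
    srna = {"GAPDH": 900, "ACTB": 700, "TP53": 40, "MYC": 120, "BRCA1": 15}
    total = {"GAPDH": 52000, "ACTB": 61000, "TP53": 3100, "MYC": 4800, "BRCA1": 900}
    print(f"R^2 = {compare_assays(srna, total):.2f}")
```

Working on the log scale keeps a handful of highly abundant genes from dominating the correlation, which is one reason this sketch normalizes before comparing.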
Footnotes
- [Supplemental material is available for this article.]
- Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.281364.125.
- Received September 4, 2025.
- Accepted December 17, 2025.
This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.











