Enhanced detection of RNA modifications and read mapping with high-accuracy nanopore RNA basecalling models
- Gregor Diensthuber1,2,5,
- Leszek P. Pryszcz1,5,
- Laia Llovera1,
- Morghan C. Lucas1,2,
- Anna Delgado-Tejedor1,2,
- Sonia Cruciani1,2,
- Jean-Yves Roignant3,4,
- Oguzhan Begik1 and
- Eva Maria Novoa1,2
- 1Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona 08003, Spain;
- 2Universitat Pompeu Fabra, Barcelona 08003, Spain;
- 3Center for Integrative Genomics, Faculty of Biology and Medicine, University of Lausanne, 1015 Lausanne, Switzerland;
- 4Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg-University Mainz, 55128 Mainz, Germany
- 5These authors contributed equally to this work.
Abstract
In recent years, nanopore direct RNA sequencing (DRS) has become a valuable tool for studying the epitranscriptome, owing to its ability to detect multiple modifications within the same full-length native RNA molecules. Although RNA modifications can be identified in the form of systematic basecalling “errors” in DRS data sets, N6-methyladenosine (m6A) modifications produce relatively low “errors” compared with other RNA modifications, limiting the applicability of this approach to m6A sites that are modified at high stoichiometries. Here, we demonstrate that the use of alternative RNA basecalling models, trained with fully unmodified sequences, increases the “error” signal of m6A, leading to enhanced detection and improved sensitivity even at low stoichiometries. Moreover, we find that high-accuracy alternative RNA basecalling models can show up to 97% median basecalling accuracy, outperforming currently available RNA basecalling models, which show 91% median basecalling accuracy. Notably, the use of high-accuracy basecalling models is accompanied by a significant increase in the number of mapped reads, especially in shorter RNA fractions, and by increased basecalling error signatures at pseudouridine (Ψ)- and N1-methylpseudouridine (m1Ψ)-modified sites. Overall, our work demonstrates that alternative RNA basecalling models can be used to improve the detection of RNA modifications, read mappability, and basecalling accuracy in nanopore DRS data sets.
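As a rough illustration of what a basecalling “error” signal at a single site can look like, the sketch below computes the fraction of reads whose basecall disagrees with the reference base and compares a sample with an unmodified control. It is a minimal sketch on made-up toy pileups, not the authors' pipeline; the function names `mismatch_frequency` and `error_difference` and the use of `-` to mark a deletion are assumptions made for this example.

```python
from collections import Counter


def mismatch_frequency(ref_base, called_bases):
    """Fraction of reads whose basecall differs from the reference base.

    called_bases holds one symbol per read covering the position;
    "-" is used here (by assumption) to mark a deletion.
    """
    if not called_bases:
        return 0.0
    counts = Counter(called_bases)
    mismatches = sum(n for base, n in counts.items() if base != ref_base)
    return mismatches / len(called_bases)


def error_difference(ref_base, sample_calls, control_calls):
    """Difference in mismatch frequency between a sample and a control."""
    return (mismatch_frequency(ref_base, sample_calls)
            - mismatch_frequency(ref_base, control_calls))


# Toy pileups at one reference adenosine (hypothetical data).
control_calls = list("AAAAAAAAAA")  # unmodified control: every read agrees
sample_calls = list("AAAAAAGG-A")   # sample: 3 of 10 reads disagree

print(mismatch_frequency("A", sample_calls))
print(error_difference("A", sample_calls, control_calls))
```

Under this simplified view, a basecalling model that miscalls modified bases more consistently yields a larger difference between sample and control, which is what makes sites with low modification stoichiometry easier to separate from background.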
Footnotes
- [Supplemental material is available for this article.]
- Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.278849.123.
- Received December 12, 2023.
- Accepted September 10, 2024.
This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.











