Molecular barcoding of native RNAs using nanopore sequencing and deep learning

  1. Eva Maria Novoa 1,2,5,6
  1. Garvan Institute of Medical Research, Darlinghurst 2010, NSW, Australia;
  2. St-Vincent's Clinical School, UNSW Sydney, Darlinghurst 2010, NSW, Australia;
  3. CHU Sainte-Justine Research Centre, Montreal, QC H3T 1C5, Canada;
  4. Department of Biochemistry and Molecular Medicine, Université de Montréal, Montreal, QC H3T 1J4, Canada;
  5. Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, 08003 Barcelona, Spain;
  6. Universitat Pompeu Fabra (UPF), 08005 Barcelona, Spain
  7. These authors contributed equally to this work.

  • Corresponding authors: martinalexandersmith{at}gmail.com, eva.novoa{at}crg.eu
  • Abstract

    Nanopore sequencing enables direct measurement of RNA molecules without conversion to cDNA, thus opening the gates to a new era for RNA biology. However, the lack of molecular barcoding of direct RNA nanopore sequencing data sets severely affects the applicability of this technology to biological samples, where RNA availability is often limited. Here, we provide the first experimental protocol and associated algorithm to barcode and demultiplex direct RNA nanopore sequencing data sets. Specifically, we present a novel and robust approach to accurately classify raw nanopore signal data by transforming current intensities into images or arrays of pixels, followed by classification using a deep learning algorithm. We demonstrate the power of this strategy by developing the first experimental protocol for barcoding and demultiplexing direct RNA sequencing libraries. Our method, DeePlexiCon, can classify 93% of reads with 95.1% accuracy or 60% of reads with 99.9% accuracy. The availability of an efficient and simple multiplexing strategy for native RNA sequencing will improve the cost-effectiveness of this technology, as well as facilitate the analysis of lower-input biological samples. Overall, our work exemplifies the power, simplicity, and robustness of signal-to-image conversion for nanopore data analysis using deep learning.
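The two ideas in the abstract, turning a raw current trace into an image and trading read recovery against accuracy, can be illustrated with a minimal sketch. The abstract does not say which signal-to-image transform is used, so the Gramian angular summation field below is an assumption chosen for illustration. The function names (`rescale`, `signal_to_image`, `demultiplex`), the example current values, and the cutoff values are all hypothetical and are not taken from DeePlexiCon.

```python
import math


def rescale(signal):
    """Linearly rescale current intensities to the interval [-1, 1]."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        # A flat trace carries no shape information; map it to zeros.
        return [0.0 for _ in signal]
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in signal]


def signal_to_image(signal):
    """Assumed transform: Gramian angular summation field.

    Each rescaled sample x_i is encoded as an angle phi_i = arccos(x_i),
    and pixel (i, j) of the square image is cos(phi_i + phi_j).
    """
    scaled = rescale(signal)
    # Clamp before arccos to guard against floating-point overshoot.
    phi = [math.acos(max(-1.0, min(1.0, x))) for x in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]


def demultiplex(probabilities, cutoff):
    """Return the index of the most probable barcode, or None if the
    classifier's top probability falls below the chosen cutoff."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return best if probabilities[best] >= cutoff else None


# Hypothetical current intensities for a very short trace.
image = signal_to_image([80.0, 95.5, 110.0, 72.3])

# Hypothetical class probabilities for four barcodes.
lenient = demultiplex([0.10, 0.70, 0.10, 0.10], cutoff=0.5)
strict = demultiplex([0.10, 0.70, 0.10, 0.10], cutoff=0.9)
```

In this sketch the same read is assigned to a barcode under the lenient cutoff and left unclassified under the strict one. That is the kind of trade-off behind the two operating points reported above: a looser cutoff keeps more reads at lower accuracy, and a stricter one keeps fewer reads at higher accuracy.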

    Footnotes

    • Received January 6, 2020.
    • Accepted August 4, 2020.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
