Rapid and accurate alignment of nucleotide conversion sequencing reads with HISAT-3N

  1. Daehwan Kim¹
  1. ¹University of Texas Southwestern Medical Center
  • * Corresponding author; email: daehwan.kim{at}utsouthwestern.edu
  • Abstract

    Sequencing technologies that use nucleotide conversion, such as cytosine-to-thymine in bisulfite-seq and thymine-to-cytosine in SLAM-seq, are powerful tools for exploring the chemical intricacies of cellular processes. To date, no unified method has been developed for aligning converted sequences, and no single package consolidates alignment for these technologies. In this paper, we describe HISAT-3N (hierarchical indexing for spliced alignment of transcripts - 3 nucleotides), which can rapidly and accurately align sequences containing any nucleotide conversion by leveraging the hierarchical index and repeat index algorithms originally developed for the HISAT software. Tests on real and simulated data sets demonstrate that HISAT-3N is faster than other modern systems, with greater alignment accuracy, higher scalability, and smaller memory requirements. HISAT-3N is therefore well suited as an aligner for converted sequence technologies.
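    The "3 nucleotides" idea can be illustrated with a small sketch. This is not HISAT-3N's implementation (which builds on hierarchical and repeat indexes); it only shows, under simplifying assumptions, why collapsing the converted base in both the read and the reference lets a converted read match its origin. The function names `convert`, `find_conversion_aware`, and `count_conversions` are hypothetical and chosen for illustration, and the exact string search stands in for a real index.

```python
def convert(seq: str, src: str = "C", dst: str = "T") -> str:
    """Collapse base `src` into `dst`, leaving a three-letter alphabet."""
    return seq.upper().replace(src, dst)


def find_conversion_aware(reference: str, read: str,
                          src: str = "C", dst: str = "T") -> list:
    """Return every reference offset where the read matches exactly
    after both sequences are reduced to the three-letter alphabet."""
    ref3 = convert(reference, src, dst)
    read3 = convert(read, src, dst)
    hits = []
    start = ref3.find(read3)
    while start != -1:
        hits.append(start)
        start = ref3.find(read3, start + 1)
    return hits


def count_conversions(reference: str, read: str, pos: int,
                      src: str = "C", dst: str = "T") -> int:
    """Count positions where the reference has `src` and the read has `dst`,
    comparing in the original four-letter alphabet."""
    window = reference[pos:pos + len(read)].upper()
    return sum(1 for r, q in zip(window, read.upper()) if r == src and q == dst)


# Toy example: the read derives from reference[2:8] = "GTCCGA",
# with one of its two cytosines converted to thymine.
reference = "ACGTCCGATC"
read = "GTCTGA"
hits = find_conversion_aware(reference, read)
conversions = count_conversions(reference, read, hits[0])
```

    Passing `src="T", dst="C"` gives the SLAM-seq direction instead. Note that a three-letter match also accepts a read C opposite a reference T, which is not a valid C-to-T conversion, so a practical tool would re-check candidate positions against the original four-letter sequences.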

    • Received December 29, 2020.
    • Accepted June 3, 2021.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.

    ACCEPTED MANUSCRIPT

    This Article

    1. Genome Res. gr.275193.120 Published by Cold Spring Harbor Laboratory Press
