A Hitchhiker's Guide to long-read genomic analysis

  Fritz J. Sedlazeck (1,2,3)

  1. Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
  2. Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
  3. Department of Computer Science, Rice University, Houston, Texas 77005, USA
  4. These authors contributed equally to this work.

  • Corresponding author: fritz.sedlazeck@bcm.edu
  • Abstract

    Over the past decade, long-read sequencing has evolved into a pivotal technology for uncovering the hidden and complex regions of the genome. Significant advances in cost efficiency, scalability, and accuracy have driven this evolution. Concurrently, novel analytical methods have emerged to harness the full potential of long reads. Together, these advances have enabled milestones such as the first complete human genome, improved identification and understanding of complex genomic variants, and deeper insights into the interplay between epigenetics and genomic variation. This mini-review provides a comprehensive overview of the latest developments in long-read DNA sequencing analysis, encompassing both reference-based and de novo assembly approaches. We cover the entire workflow, from initial data processing to variant calling and annotation, focusing on how these methods improve our ability to interpret a wide array of genomic variants. We also discuss the current challenges, limitations, and future directions in the field, with a detailed examination of state-of-the-art bioinformatics methods for long-read sequencing.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
