Unraveling the hidden complexity of cancer through long-read sequencing

Qiuhui Li; Ayse G. Keskus; Justin Wagner; Michal B. Izydorczyk; Winston Timp; Fritz J. Sedlazeck; Alison P. Klein; Justin M. Zook; Mikhail Kolmogorov; Michael C. Schatz

doi:10.1101/gr.280041.124

Review

- ¹Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA;
- ²Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland 20892, USA;
- ³Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, USA;
- ⁴Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;
- ⁵Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA;
- ⁶Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA;
- ⁷Department of Computer Science, Rice University, Houston, Texas 77251, USA;
- ⁸Sidney Kimmel Comprehensive Cancer Center, Department of Oncology, Johns Hopkins Medicine, Baltimore, Maryland 21031, USA

Published March 20, 2025. https://doi.org/10.1101/gr.280041.124

Abstract

Cancer is fundamentally a disease of the genome, characterized by extensive genomic, transcriptomic, and epigenomic alterations. Most current studies rely on short-read sequencing, gene panels, or microarrays to explore these alterations; however, these technologies can systematically miss or misrepresent certain types of alterations, especially structural variants, complex rearrangements, and alterations within repetitive regions. Long-read sequencing is rapidly emerging as a transformative technology for cancer research by providing a comprehensive view across the genome, transcriptome, and epigenome, including the ability to detect alterations that previous technologies have overlooked. In this review, we explore the current applications of long-read sequencing for both germline and somatic cancer analysis. We provide an overview of the computational methodologies tailored to long-read data and highlight key discoveries and resources within cancer genomics that were inaccessible with prior technologies. We also address future opportunities and persistent challenges, including the experimental and computational requirements for scaling to larger sample sizes, the hurdles in sequencing and analyzing complex cancer genomes, and opportunities for leveraging machine learning and artificial intelligence technologies for cancer informatics. We further discuss how the telomere-to-telomere genome and the emerging human pangenome could enhance the resolution of cancer genome analysis, potentially revolutionizing early detection and disease monitoring in patients. Finally, we outline strategies for transitioning long-read sequencing from research applications to routine clinical practice.
