Unraveling the hidden complexity of cancer through long-read sequencing

    • ¹Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA;
    • ²Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland 20892, USA;
    • ³Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, USA;
    • ⁴Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;
    • ⁵Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA;
    • ⁶Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA;
    • ⁷Department of Computer Science, Rice University, Houston, Texas 77251, USA;
    • ⁸Sidney Kimmel Comprehensive Cancer Center, Department of Oncology, Johns Hopkins Medicine, Baltimore, Maryland 21031, USA
Published March 20, 2025. https://doi.org/10.1101/gr.280041.124

Abstract

Cancer is fundamentally a disease of the genome, characterized by extensive genomic, transcriptomic, and epigenomic alterations. Most current studies use short-read sequencing, gene panels, or microarrays to explore these alterations; however, these technologies can systematically miss or misrepresent certain types of alterations, especially structural variants, complex rearrangements, and alterations within repetitive regions. Long-read sequencing is rapidly emerging as a transformative technology for cancer research by providing a comprehensive view across the genome, transcriptome, and epigenome, including the ability to detect alterations that previous technologies have overlooked. In this review, we explore the current applications of long-read sequencing for both germline and somatic cancer analysis. We provide an overview of the computational methodologies tailored to long-read data and highlight key discoveries and resources within cancer genomics that were inaccessible with prior technologies. We also address future opportunities and persistent challenges, including the experimental and computational requirements for scaling to larger sample sizes, the hurdles in sequencing and analyzing complex cancer genomes, and opportunities for leveraging machine learning and artificial intelligence technologies for cancer informatics. We further discuss how the telomere-to-telomere genome and the emerging human pangenome could enhance the resolution of cancer genome analysis, potentially revolutionizing early detection and disease monitoring in patients. Finally, we outline strategies for transitioning long-read sequencing from research applications to routine clinical practice.
