Long reads decipher genomes and transcriptomes and offer novel insights into biology and diseases

  1. Fritz J. Sedlazeck3,4,5
  1. 1Institute for Integrative Systems Biology (I2SysBio), Spanish National Research Council (CSIC), Paterna 46980, Spain;
  2. 2Department of Human Genetics and Department of Internal Medicine, Radboudumc Research Institute for Medical Innovation, Radboud Centre for Infectious Diseases (RCI), Radboud University Medical Center, 6500 HB Nijmegen, The Netherlands;
  3. 3Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;
  4. 4Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA;
  5. 5Department of Computer Science, Rice University, Houston, Texas 77005, USA
  • Corresponding authors: Ana.Conesa@csic.es, Alexander.Hoischen@radboudumc.nl, Fritz.Sedlazeck@bcm.edu
    We are excited to present Part 2 of the Special Issue on “Long-read DNA and RNA Sequencing Applications in Biology and Medicine,” following the success of Part 1. Due to the record number of submissions, we are thrilled to feature an additional 39 manuscripts showcasing the growing interest and groundbreaking science driven by long-read technologies.

    In this Special Issue of Genome Research, we have assembled a diverse collection of articles highlighting novel methods, applications, and insights emerging from long-read technologies. This Special Issue contains four reviews, two perspectives, 16 research, and 17 method articles. While the long-read research primarily focuses on sequencing, such as Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio) HiFi-based sequencing, this Special Issue also highlights another long-read technology: optical genome mapping (OGM). Although not sequencing-based, OGM is particularly valuable for detecting large complex structural variants (SVs) and chromosomal aberrations.

    The maturation of long-read sequencing (LRS) technologies, marked by enhanced accuracy, increased throughput, and reduced costs, has propelled a substantial expansion in human genetics and genomics research. This trend is clearly reflected in this Special Issue, where four review articles, authored by leading experts, and two perspective articles predominantly emphasize human LRS applications. These reviews and perspectives explore the contributions of LRS to human population sequencing, which advances our understanding of human genome architecture and SVs through improved genome assemblies and graph genomes (Rausch et al. 2025), and to rare disease diagnostics, where it uncovers previously undetected variants in rare disease patients, with particular attention to SVs, short tandem repeat (STR) expansions, and variant phasing (Del Gobbo and Boycott 2025). They provide an update on the emerging use of LRS for direct readout of DNA modifications, notably DNA methylation (Montano and Timp 2025), and provide novel insights into cancer-related genomic and epigenomic alterations (Li et al. 2025). Another perspective focuses on transcriptome biology, unraveling an unprecedented level of isoform diversity that creates new analytical challenges (Monzó et al. 2025). Finally, a Hitchhiker's Guide to long-read genomic analysis reviews the development of novel bioinformatic tools tailored to LRS data analysis (Mahmoud et al. 2025).

    The growing maturity and applicability of LRS in human research are further demonstrated by the prevalence of human-focused studies within this Special Issue. Notably, several of these studies highlight clinical utility or potential diagnostic applications.

    This Special Issue features a diverse collection of research articles demonstrating the power of LRS and related technologies in various biological and clinical applications.

    Structural variation and epigenetics

    Several articles demonstrate the value of LRS for exploring structural variation and DNA methylation. Paulin et al. (2025) show that LRS significantly contributes to improving the human reference genome and the detection of somatic SVs. Genner et al. (2025) assess novel analytical methods for detecting methylation in primary human tissues using nanopore sequencing. Groza et al. (2025) use methylation data from HiFi sequencing and graph genomes to link SVs and methylation, identifying thousands of SV-methylation quantitative trait loci. Porter et al. (2025) demonstrate the complementary use of DNA and methylation sequencing to understand somatic variants in human papillomavirus (HPV)-transformed cervical cancers, elucidating allele-specific epigenetic effects of HPV integration. Oggenfuss et al. (2025) investigate SV and transposable element insertions in Candida albicans and other pathogens, identifying their presence near coding sequences and at centromeres, suggesting potential impact on gene expression and a role in centromere evolution.

    Transcripts and isoforms

    Six articles highlight the utility of LRS for RNA analysis, with three showcasing the latest isoform sequencing methods. Black et al. (2025) use long-read single-cell RNA-seq with MAS-seq in chronic lymphocytic leukemia (CLL) samples to shed light on subclonal evolution, which may help guide patient-specific therapies. Karakulak et al. (2025) employ MAS-seq to identify numerous novel transcript isoforms in patient-derived clear cell renal cell carcinoma organoids. Stemerdink et al. (2025) utilize Iso-Seq for long isoform detection in the human retina, focusing on the identification of Usher syndrome–associated transcripts and featuring indirect targeted enrichment of full-length isoforms up to 20 kb. Choquet et al. (2025) complement these studies by using direct RNA nanopore sequencing to reveal allele-specific effects on splicing in human lymphoblastoid cell lines. Zeng et al. (2025) demonstrate a combined DNA and RNA analysis in hepatitis B virus-driven hepatocellular carcinoma that examines the transcriptional consequences of somatically integrated viral DNA, including fusion gene detection. Wang et al. (2025) employ single-cell RNA-seq to identify 44,325 isoforms in mouse retina cells, revealing 38% novel and 17% exclusively expressed isoforms and highlighting significant cell type–specific variation.

    Rare disease research

    A subset of articles highlights the benefits of LRS for rare disease (RD) discovery. Rafehi et al. (2025) demonstrate the advantages of LRS, particularly adaptive sampling, for characterizing repeat expansions in ataxia. Steyaert et al. (2025) show that moderate-coverage HiFi sequencing detects previously hidden variants, explaining >12% of undiagnosed rare disease cases within the European Solve-RD consortium.

    Optical genome mapping

    Three articles showcase optical genome mapping as a complementary long-read technology. Vervoort et al. (2025) combine OGM with fiber-FISH and LRS to resolve paralogous sequences and recombination mechanisms in one of the most complex loci in the human genome, which is associated with 22q11.2 Deletion Syndrome. Sahajpal et al. (2025) demonstrate the sensitivity of OGM for detecting previously hidden SVs in neural tube defect genomes. van der Sanden et al. (2025) show that OGM effectively identifies clinically relevant repeat expansions, providing accurate length estimates for very long repeats, which is extremely challenging for other methods, and revealing somatic repeat instability in a surprisingly high number of cases.

    This Special Issue features advancements in novel analysis methodologies and tools for LRS data.

    Improving genome assembly and SV detection

    Several articles focus on improving genome assembly and SV detection. Vrček et al. (2025) introduce GNNome, a geometric deep-learning framework that reconstructs genome assemblies with high contiguity and quality, overcoming traditional challenges posed by repetitive regions. Zhang et al. (2025) present MotifScope, a tool for multisample motif discovery and tandem repeat visualization, providing insights into genome-wide repeat structures. Chakravarty et al. (2025) develop RAmbler, a reference-guided assembler that reconstructs complex repetitive regions in human chromosomes, significantly outperforming existing tools in resolving centromeres and other repeats. Munro et al. (2025) improve nanopore adaptive sampling scalability for the PromethION platform using readfish, optimizing SV detection in human genomes.

    Human disease diagnostics

    Methods tailored for human disease research are also featured. Chen et al. (2025) propose NanoRCS, a nanopore-based consensus sequencing method that accurately profiles tumor cell-free DNA, enabling sensitive early cancer detection. Dondi et al. (2025) introduce LongSom, a computational workflow for detecting de novo somatic variants in single-cell RNA-seq data, addressing tumor heterogeneity. Jensen et al. (2025) integrate transcriptomics and long-read genomics to prioritize SVs, refining diagnostic workflows for undiagnosed cases. Beaulaurier et al. (2025) present a novel approach for de novo antibody identification using full-length single B cell transcriptomics, enhancing precision medicine applications.

    Transcriptome analysis

    Significant advancements are seen in long-read transcriptome analysis and isoform characterization. Peng et al. (2025) develop single-cell Rapid Capture Hybridization sequencing (scRaCH-seq) for accurate isoform usage and coding mutation detection in peripheral blood samples from CLL patients. Pryszcz et al. (2025) introduce SeqTagger, a tool for rapid demultiplexing of direct RNA nanopore sequencing data sets. Qin et al. (2025) enhance fusion transcript identification with CTAT-LR-Fusion by combining LRS and short-read sequencing, improving the detection of clinically relevant gene fusions in cancer. Keil et al. (2025) introduce SQANTI-reads, a quality assessment framework for multisample long-read RNA-seq experiments, identifying under-annotated genes and novel transcripts. Park and Cenik (2025) leverage long-read RNA-seq to reveal allele-specific N6-methyladenosine (m6A) modifications in human and mouse cells, uncovering sequence determinants of m6A deposition. Murali et al. (2025) develop Biosurfer, a computational tool for tracking regulatory mechanisms leading to protein isoform diversity, revealing novel patterns of frameshifts and codon splits.

    Expanding applications beyond human genomics

    This Special Issue also highlights the growing use of LRS beyond human genomics, with tools specifically developed for diverse biological systems. Horsfield et al. (2025) optimize nanopore adaptive sampling with the GNASTy algorithm to improve pneumococcal serotype surveillance for epidemiological studies. Milia et al. (2025) present a pangenome-wide association mapping approach to link SVs to cattle breed traits. Paniagua et al. (2025) evaluate long-read RNA-seq strategies for genome annotation, identifying thousands of novel loci and isoforms in the Florida manatee.

    The long and the short of it

    Based on our assessment of the manuscripts in this Special Issue, we have identified several key insights that characterize this new era of genomics driven by long-read technologies:

    • LRS data are now indispensable for achieving truly comprehensive whole-genome analysis.

    • Long-read technologies are significantly improving diagnostic rates in rare disease cases, with SVs, mobile element insertions, and STRs emerging as critical variant types. The ability of LRS to phase clinical variants also critically streamlines and accelerates definitive diagnoses.

    • The simultaneous readout of DNA methylation is proving highly valuable in rare disease research. This approach enables the analysis of methylation in cis with SVs, the direct detection of imprinting disorders and skewed X-inactivation, and the utilization of recently developed epigenetic signatures for numerous rare diseases.

    • LRS is transforming transcriptome analysis, providing unprecedented insights into transcript isoform landscapes, and fundamentally challenging existing paradigms in annotation and analysis.

    These advancements underscore the transformative power of LRS technologies, lending exceptional relevance to Sydney Brenner's statement, “Progress in science depends on new techniques, new discoveries, and new ideas, probably in that order.”

    Competing interest statement

    A.C., A.H., and F.J.S. served as Guest Editors for this Special Issue of Genome Research and had access to all papers before publication. F.J.S. obtained research support from Illumina, PacBio, and ONT. For some research studies by A.H., reagent costs were in part shared between Radboudumc and Bionano or Radboudumc and PacBio. A.C. obtained in-kind support from PacBio. A.C. participates with ONT in the EU-funded project LongTREC.

    Acknowledgments

    We would like to thank Hillary Sussman and the editorial team at Genome Research for the tremendous amount of work that went into the two Special Issues on Long-Read Sequencing and all the help provided to us. A.H. was supported by a ZonMW (The Netherlands Organization for Health Research and Development) Vici grant no. 09150182310053. F.J.S. was supported by the National Institutes of Health (1UG3NS132105-01 and 1U01HG011758-01). A.C. was supported by the Spanish Ministry of Science and Innovation grant no. PID2023-152976NB-I00.

    • Received March 17, 2025.
    • Accepted March 17, 2025.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
