Long-read sequencing (LRS) has matured, and its dramatically increased accuracy, ever-increasing throughput, and broader access now allow new and advanced studies, even at scale. This Special Issue of Genome Research on “Long-read DNA and RNA Sequencing Applications in Biology and Medicine” garnered a record number of submissions, reflecting both the intense and broad interest in the technologies and the next round of revolutionary genomic science enabled by them. This interest is rooted in the core feature shared by all long-read technologies: the use of much longer DNA molecules (from tens of kb to Mb scale). This offers several benefits common to most long-read technologies, including improved sensitivity to structural variants (SVs) and detection of (complex) cytogenetic aberrations; the ability to assemble and phase genomes, with the potential to move away from variants to alleles/haplotypes; sensitivity for repeat expansion/contraction detection; and access to the “dark genome,” for example, repeat-rich and GC-rich areas of the human genome, even including segmental duplications (SegDups) or other sequence homologies.
In this first of two Special Issues, we have assembled a diverse collection of research and review articles highlighting novel applications and developments around long-read sequencing. Three trends can be deduced from this Special Issue: a steadily increasing number of human genome and transcriptome studies are now enabled, more assembly-based analyses show the power of long-read sequencing, and non-human studies showcase its value for non-reference and model species.
Two-thirds of all manuscripts in this issue include the study of human samples, especially focusing on germline testing and rare disease genetics. It is remarkable to see the successes of these studies even though long-read sequencing has been on the market and in routine use for only a decade. First, rare disease cohort studies of the added benefit of LRS are emerging (Hiatt et al. 2024; Eisfeldt et al. 2024a), and many more systematically designed studies are still under way. Currently, these focus on the identification of structural variants (SVs) and short tandem repeat (STR) expansions. In addition, efforts to provide complete sets of SVs are now possible at the population level (Gustafson et al. 2024), yielding SV sets with breakpoint resolution and dramatically altering the catalog of SVs in human populations. Intriguing studies focusing on very complex SVs appear in this issue (Guitart et al. 2024; Eisfeldt et al. 2024b; Bilgrav Saether et al. 2024), and additional tools and visualizations of all clinically or biologically relevant variant types are emerging (De Coster et al. 2024; Tesi et al. 2024; Zhou et al. 2024), as well as tools to call additional genomic information (Gocuk et al. 2024). While current costs are still prohibitive for utilizing LRS as a first-tier, generic clinical test for all germline variants in human genetics, this may rapidly change. In the meantime, Iyer et al. (2024) review approaches for cost-efficient targeted sequencing to enhance variant detection in certain regions of the genome.
Advancements in both accuracy and read length have driven the rapid expansion of improved assembly-based analysis methods, which are now capable of delivering nearly telomere-to-telomere and chromosome-level assemblies, for non-human and non-reference species, too, as illustrated by Kamath et al. (2024), Koren et al. (2024), Gardner et al. (2024), and Byerly et al. (2024). Long-read data now offer unprecedented insights into some of the most complex regions of genomes (de Groot et al. 2024; Volarić et al. 2024) and pave the way for the development of the next generation of high-quality reference genomes (Li et al. 2024). Moreover, novel assays and analysis methods facilitate studying the relationship between the genome and the epigenome using long reads (Jha et al. 2024).
Long reads can capture full-length transcript molecules, offering enhanced resolution for detecting alternative isoforms and accurately profiling transcriptome complexity. This Special Issue highlights the expanding applications of LRS in transcriptome studies, which are increasingly addressing complex questions in transcriptome biology.
Several studies in this issue demonstrate the power of LRS in cancer research, where it has identified deep intronic variants and aberrant splicing across various cancer types (Gulsuner et al. 2024; Pacholewska et al. 2024) and characterized novel transcriptional features linked to tumor progression (Lee et al. 2024). Additionally, increased read throughput and reduced costs are enabling large-scale studies to capture transcriptome variability across multiple samples. For example, Zhang et al. (2024) profiled isoform diversity in house mouse populations using long-read RNA-seq, while Adams and Vollmers (2024) combined R2C2 library preparation with nanopore sequencing to profile the transcriptome across numerous mouse tissues, creating a Tissue-Level Atlas of Mouse Isoforms (TAMI).
Adaptations and advancements in experimental protocols paired with RNA LRS have allowed researchers to go beyond bulk transcriptomics. Numerous studies now employ long reads for single-cell and spatial transcriptomics analyses, as reviewed by Belchikov et al. (2024), to profile isoform usage across diverse cell types. Meanwhile, the integration of long-read RNA sequencing with a translatomic method (LR Frac-seq) introduced by Ritter et al. (2024) enables subcellular fractionation, revealing subcellular enrichment for thousands of transcripts, while the long-read Ribo-STAMP (LR-Ribo-STAMP) technique captures the translating transcriptome (Jagannatha et al. 2024). Single-molecule nanopore sequencing technologies uniquely enable the sequencing of native RNA molecules, offering exciting possibilities for studying RNA modifications and their roles in transcriptome function and disease (Diensthuber et al. 2024; Teng et al. 2024). Despite all these successes, important challenges, such as coverage needs, remain for long-read RNA applications, as reviewed by Calvo-Roitberg et al. (2024). Nevertheless, Zeglinski et al. (2024) demonstrate the value of nanopore sequencing even for quality control of gene therapy vectors.
This Special Issue also highlights the power of long-read sequencing for rapid pathogen characterization and epidemiological control. For instance, Gomez-Simmonds et al. (2024) show the application of nanopore sequencing to profile numerous enterobacteria isolates, identifying sources of acquired antibiotic resistance, while Slizovskiy et al. (2024) present TELSeq, a targeted method to detect antimicrobial resistance genes (ARGs) and mobile genetic elements to study the transferability of ARGs. However, Lohde et al. (2024) caution that nanopore sequencing accuracy can be compromised by DNA methylation, causing errors that may lead to false isolate identification. This limitation could be addressed by using PCR-based sequencing, offering a promising pathway for rapid outbreak tracing with long-read technologies. Conversely, bacterial DNA modifications can be of scientific interest, as they impact bacterial function, and Liu et al. (2024) introduce a novel pipeline to accurately detect DNA modifications in prokaryotes.
We would like to thank all the authors, reviewers, and the Genome Research editorial team, especially Dr. Hillary Sussman, for their hard work and significant contributions to this Special Issue. Given all these studies highlighting the benefits of long-read sequencing, it is clear that these technologies have already made a substantial impact in many studies. We look forward to the next decade of innovations around long reads as they reveal novel genomic and biological insights, and we are reminded of this quote from Carl Sagan: “Somewhere, something incredible is waiting to be known.”
Competing interest statement
A.C., A.H., and F.J.S. served as Guest Editors for this issue of Genome Research, and had access to all papers prior to publication. F.J.S. obtained research support from Illumina, PacBio, and ONT. For some research studies by A.H., reagent costs were in part shared between Radboudumc and Bionano, or Radboudumc and PacBio. A.C. obtained in-kind support from PacBio. A.C. participates with ONT in the EU-funded project LongTREC.
Notes
[1] Article and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.280179.124.
References
- Adams M, Vollmers C. 2024. Generation and analysis of a mouse multitissue genome annotation atlas. Genome Res (this issue) 34: 2108–2117. doi:10.1101/gr.279217.124
- Belchikov N, Hsu J, Li XJ, Jarroux J, Hu W, Joglekar A, Tilgner HU. 2024. Understanding isoform expression by pairing long-read sequencing with single-cell and spatial transcriptomics. Genome Res (this issue) 34: 1735–1746. doi:10.1101/gr.279640.124
- Bilgrav Saether K, Eisfeldt J, Bengtsson JD, Lun MY, Grochowski CM, Mahmoud M, Chao H-T, Rosenfeld JA, Liu P, Ek M, et al. 2024. Leveraging the T2T assembly to resolve rare and pathogenic inversions in reference genome gaps. Genome Res (this issue) 34: 1785–1797. doi:10.1101/gr.279346.124
- Byerly PA, von Thaden A, Leushkin E, Hilgers L, Liu S, Winter S, Schell T, Gerheim C, Hamadou AB, Greve C, et al. 2024. Haplotype-resolved genome and population genomics of the threatened garden dormouse in Europe. Genome Res (this issue) 34: 2094–2107. doi:10.1101/gr.279066.124
- Calvo-Roitberg E, Daniels RF, Pai AA. 2024. Challenges in identifying mRNA transcript starts and ends from long-read sequencing data. Genome Res (this issue) 34: 1719–1734. doi:10.1101/gr.279559.124
- De Coster W, Höijer I, Bruggeman I, D'Hert S, Melin M, Ameur A, Rademakers R. 2024. Visualization and analysis of medically relevant tandem repeats in nanopore sequencing of control cohorts with pathSTR. Genome Res (this issue) 34: 2074–2080. doi:10.1101/gr.279265.124
- de Groot N, van der Wiel M, Le NG, de Groot NG, Bruijnesteijn J, Bontrop RE. 2024. Unraveling the architecture of major histocompatibility complex class II haplotypes in rhesus macaques. Genome Res (this issue) 34: 1811–1824. doi:10.1101/gr.278968.124
- Diensthuber G, Pryszcz LP, Llovera L, Lucas MC, Delgado-Tejedor A, Cruciani S, Roignant J-Y, Begik O, Novoa EM. 2024. Enhanced detection of RNA modifications and read mapping with high-accuracy nanopore RNA basecalling models. Genome Res (this issue) 34: 1865–1877. doi:10.1101/gr.278849.123
- Eisfeldt J, Ameur A, Lenner F, de Boer ETB, Ek M, Wincent J, Vaz R, Ottosson J, Jonson T, Ivarsson S, et al. 2024a. A national long-read sequencing study on chromosomal rearrangements uncovers hidden complexities. Genome Res (this issue) 34: 1774–1784. doi:10.1101/gr.279510.124
- Eisfeldt J, Higginbotham EJ, Lenner F, Howe J, Fernandez BA, Lindstrand A, Scherer SW, Feuk L. 2024b. Resolving complex duplication variants in autism spectrum disorder using long-read genome sequencing. Genome Res (this issue) 34: 1763–1773. doi:10.1101/gr.279263.124
- Gardner C, Chen J, Hadfield C, Lu Z, Debruin D, Zhan Y, Donlin MJ, Ahn T-H, Lin Z. 2024. Chromosome-level subgenome-aware de novo assembly provides insight into Saccharomyces bayanus genome divergence after hybridization. Genome Res (this issue) 34: 2133–2146. doi:10.1101/gr.279364.124
- Gocuk SA, Lancaster J, Su S, Jolly JK, Edwards TL, Hickey DG, Ritchie ME, Blewitt ME, Ayton LN, Gouil Q. 2024. Measuring X-Chromosome inactivation skew for X-linked diseases with adaptive nanopore sequencing. Genome Res (this issue) 34: 1954–1965. doi:10.1101/gr.279396.124
- Gomez-Simmonds A, Annavajhala MK, Seeram D, Hokunson TW, Park H, Uhlemann A-C. 2024. Genomic epidemiology of carbapenem-resistant Enterobacterales at a New York City hospital over a 10-year period reveals complex plasmid-clone dynamics and evidence for frequent horizontal transfer of blaKPC. Genome Res (this issue) 34: 1895–1907. doi:10.1101/gr.279355.124
- Guitart X, Porubsky D, Yoo D, Dougherty ML, Dishuck PC, Munson KM, Lewis AP, Hoekzema K, Knuth J, Chang S, et al. 2024. Independent expansion, selection, and hypervariability of the TBC1D3 gene family in humans. Genome Res (this issue) 34: 1798–1810. doi:10.1101/gr.279299.124
- Gulsuner S, AbuRayyan A, Mandell JB, Lee MK, Bernier GV, Norquist BM, Pierce SB, King M-C, Walsh T. 2024. Long-read DNA and cDNA sequencing identify cancer-predisposing deep intronic variation in tumor-suppressor genes. Genome Res (this issue) 34: 1825–1831. doi:10.1101/gr.279158.124
- Gustafson JA, Gibson SB, Damaraju N, Zalusky MPG, Hoekzema K, Twesigomwe D, Yang L, Snead AA, Richmond PA, De Coster W, et al. 2024. High-coverage nanopore sequencing of samples from the 1000 Genomes Project to build a comprehensive catalog of human genetic variation. Genome Res (this issue) 34: 2061–2073. doi:10.1101/gr.279273.124
- Hiatt SM, Lawlor JMJ, Handley LH, Latner DR, Bonnstetter ZT, Finnila CR, Thompson ML, Boston LB, Williams M, Nunez IR, et al. 2024. Long-read genome sequencing and variant reanalysis increase diagnostic yield in neurodevelopmental disorders. Genome Res (this issue) 34: 1747–1762. doi:10.1101/gr.279227.124
- Iyer SV, Goodwin S, McCombie WR. 2024. Leveraging the power of long reads for targeted sequencing. Genome Res (this issue) 34: 1701–1718. doi:10.1101/gr.279168.124
- Jagannatha P, Tankka AT, Lorenz DA, Yu T, Yee BA, Brannan KW, Zhou CJ, Underwood JG, Yeo GW. 2024. Long-read Ribo-STAMP simultaneously measures transcription and translation with isoform resolution. Genome Res (this issue) 34: 2012–2024. doi:10.1101/gr.279176.124
- Jha A, Bohaczuk SC, Mao Y, Ranchalis J, Mallory BJ, Min AT, Hamm MO, Swanson E, Dubocanin D, Finkbeiner C, et al. 2024. DNA-m6A calling and integrated long-read epigenetic and genetic analysis with fibertools. Genome Res (this issue) 34: 1976–1986. doi:10.1101/gr.279095.124
- Kamath SS, Bindra M, Pal D, Jain C. 2024. Telomere-to-telomere assembly by preserving contained reads. Genome Res (this issue) 34: 1908–1918. doi:10.1101/gr.279311.124
- Koren S, Bao Z, Guarracino A, Ou S, Goodwin S, Jenike KM, Lucas J, McNulty B, Park J, Rautiainen M, et al. 2024. Gapless assembly of complete human and plant chromosomes using only nanopore sequencing. Genome Res (this issue) 34: 1919–1930. doi:10.1101/gr.279334.124
- Lee J, Snell EA, Brown J, Booth CE, Banks RE, Turner DJ, Vasudev NS, Lagos D. 2024. Long-read RNA sequencing of archival tissues reveals novel genes and transcripts associated with clear cell renal cell carcinoma recurrence and immune evasion. Genome Res (this issue) 34: 1849–1864. doi:10.1101/gr.278801.123
- Li K, Smith ML, Chris Blazier J, Kochan KJ, Wood JMD, Howe K, Kwitek AE, Dwinell MR, Chen H, Ciosek JL, et al. 2024. Construction and evaluation of a new rat reference genome assembly, GRCr8, from long reads and long-range scaffolding. Genome Res (this issue) 34: 2081–2093. doi:10.1101/gr.279292.124
- Liu X, Ni Y, Ye L, Guo Z, Tan L, Li J, Yang M, Chen S, Li R. 2024. Nanopore strand-specific mismatch enables de novo detection of bacterial DNA modifications. Genome Res (this issue) 34: 2025–2038. doi:10.1101/gr.279012.124
- Lohde M, Wagner GE, Dabernig-Heinz J, Viehweger A, Braun SD, Monecke S, Diezel C, Stein C, Marquet M, Ehricht R, et al. 2024. Accurate bacterial outbreak tracing with Oxford Nanopore sequencing and reduction of methylation-induced errors. Genome Res (this issue) 34: 2039–2047. doi:10.1101/gr.278848.123
- Pacholewska A, Lienhard M, Brüggemann M, Hänel H, Bilalli L, Königs A, Heß F, Becker K, Köhrer K, Kaiser J, et al. 2024. Long-read transcriptome sequencing of CLL and MDS patients uncovers molecular effects of SF3B1 mutations. Genome Res (this issue) 34: 1832–1848. doi:10.1101/gr.279327.124
- Ritter AJ, Draper JM, Vollmers C, Sanford JR. 2024. Long-read subcellular fractionation and sequencing reveals the translational fate of full-length mRNA isoforms during neuronal differentiation. Genome Res (this issue) 34: 2000–2011. doi:10.1101/gr.279170.124
- Slizovskiy IB, Bonin N, Bravo JE, Ferm PM, Singer J, Boucher C, Noyes NR. 2024. Factors impacting target-enriched long-read sequencing of resistomes and mobilomes. Genome Res (this issue) 34: 2048–2060. doi:10.1101/gr.279226.124
- Teng H, Stoiber M, Bar-Joseph Z, Kingsford C. 2024. Detecting m6A RNA modification from nanopore sequencing using a semisupervised learning framework. Genome Res (this issue) 34: 1987–1999. doi:10.1101/gr.278960.124
- Tesi N, Salazar A, Zhang Y, van der Lee S, Hulsman M, Knoop L, Wijesekera S, Krizova J, Schneider A-F, Pennings M, et al. 2024. Characterizing tandem repeat complexities across long-read sequencing platforms with TREAT and otter. Genome Res (this issue) 34: 1942–1953. doi:10.1101/gr.279351.124
- Volarić M, Despot-Slade E, Veseljak D, Mravinac B, Meštrović N. 2024. Long-read genome assembly of the insect model organism Tribolium castaneum reveals spread of satellite DNA in gene-rich regions by recurrent burst events. Genome Res (this issue) 34: 1878–1894. doi:10.1101/gr.279225.124
- Zeglinski K, Montellese C, Ritchie ME, Alhamdoosh M, Vonarburg C, Bowden R, Jordi M, Gouil Q, Aeschimann F, Hsu A. 2024. An optimized protocol for quality control of gene therapy vectors using nanopore direct RNA sequencing. Genome Res (this issue) 34: 1966–1975. doi:10.1101/gr.279405.124
- Zhang W, Guenther A, Gao Y, Ullrich K, Huettel B, Ahmad A, Duan L, Wei K, Tautz D. 2024. Full-length RNA transcript sequencing traces brain isoform diversity in house mouse natural populations. Genome Res (this issue) 34: 2118–2132. doi:10.1101/gr.279166.124
- Zhou Y, Song L, Li H. 2024. Full-resolution HLA and KIR gene annotations for human genome assemblies. Genome Res (this issue) 34: 1931–1941. doi:10.1101/gr.278985.124