Research

Whole-genome variant detection in long-read sequencing data from ultralow input patient samples

    • 1 Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA;
    • 2 Department of Genetics, Stanford University, Palo Alto, California 94304, USA;
    • 3 Pacific Biosciences, Menlo Park, California 94025, USA
    • 4 These authors contributed equally to this work.
    • 5 Present address: Cancer Epigenetics Institute, Nuclear Dynamics and Cancer Program, Fox Chase Cancer Center, Philadelphia, PA 19111, USA
Current Issue: Genome Research Vol 36 Issue 6

Abstract

Long-read sequencing provides a more complete view of the genome than short-read sequencing, with improved detection of structural variants, tandem repeats (TRs), and small variants (single-nucleotide variants [SNVs] and insertions and deletions) in difficult-to-map regions. One limitation of long-read sequencing has been its high input DNA requirement, with several micrograms required per sample. Here, we evaluate two methods of amplification-based long-read, whole-genome sequencing: ultralow input HiFi (ULI-HiFi) sequencing and droplet multiple displacement amplification (dMDA) sequencing. When benchmarked against the Genome in a Bottle reference set (NA24385), ULI-HiFi shows higher precision and recall for SNVs than the dMDA-amplified samples (SNV F1 scores of 99.82% for ULI-HiFi and 89.46% for dMDA). Across a catalog of more than 1.6 million TRs, ULI-HiFi achieves 90.4% perfect concordance and 98.9% accuracy when allowing for single motif differences. ULI-HiFi also illuminates medically important genes that were poorly mapped by short-read sequencing. We further apply ULI-HiFi to analyze normal, polyp, and adenocarcinoma samples from a patient with familial adenomatous polyposis (FAP), a hereditary form of colorectal cancer. We identify a TR that progressively expanded in length from normal to polyp to adenocarcinoma. This repeat is located in the 5′ UTR of LIMD1, a reported tumor suppressor. Reporter assays reveal significantly reduced expression in colorectal cancer cell lines with increasing repeat length in the LIMD1 5′ UTR. We conclude that ULI-HiFi improves the characterization of genetic variants in dark regions of genomes from patient samples, enabling a better understanding of human disease.
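For readers unfamiliar with the two benchmark metrics quoted above, the following is a minimal illustrative sketch, not the authors' pipeline. It assumes the standard definition of F1 as the harmonic mean of precision and recall, and it assumes that "allowing for single motif differences" means a called TR allele length may differ from the truth by at most one motif unit. The function names, the one-allele-per-locus simplification, and the toy inputs are hypothetical.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from variant-call counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)  # fraction of calls that are correct
    recall = tp / (tp + fn)     # fraction of true variants that were called
    return 2 * precision * recall / (precision + recall)


def tr_concordance(calls: dict, truth: dict, motif_len: dict) -> tuple:
    """Return (perfect, within_one_motif) concordance fractions.

    calls, truth: locus ID -> allele length in bp (hypothetical, one allele per locus).
    motif_len:    locus ID -> repeat motif length in bp.
    Only loci present in both call sets are compared.
    """
    shared = calls.keys() & truth.keys()
    if not shared:
        return 0.0, 0.0
    perfect = sum(calls[k] == truth[k] for k in shared)
    # Assumption: "single motif difference" = length off by at most one motif unit.
    within_one = sum(abs(calls[k] - truth[k]) <= motif_len[k] for k in shared)
    return perfect / len(shared), within_one / len(shared)


if __name__ == "__main__":
    # Toy counts, not values from the study.
    print(f1_score(tp=90, fp=10, fn=10))
    print(tr_concordance(
        calls={"a": 30, "b": 33, "c": 40},
        truth={"a": 30, "b": 30, "c": 30},
        motif_len={"a": 3, "b": 3, "c": 3},
    ))
```

Under these assumptions, the within-one-motif fraction is always at least as large as the perfect-concordance fraction, which matches the ordering of the 90.4% and 98.9% figures reported above.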
