Analytical validation of germline small variant detection using long-read HiFi genome sequencing

  Stuart A. Scott (1,2)
  1. Clinical Genomics Laboratory, Stanford Medicine, Palo Alto, California 94304, USA
  2. Department of Pathology, Stanford University, Stanford, California 94305, USA
  • 3 Present address: Influx Bio, San Francisco, CA 94124, USA

  • Corresponding author: sascott{at}stanford.edu
  • Abstract

    Long-read sequencing can interrogate difficult genomic regions and phase variants; however, short-read sequencing is more commonly implemented for clinical testing. Given the advances in long-read HiFi sequencing chemistry and variant calling, we analytically validated this technology for small variant detection (single nucleotide variants [SNVs] and insertions/deletions [indels]; <50 bp). HiFi genome sequencing was performed on DNA from reference materials and clinical specimen types, and accuracy results were compared to short-read genome sequencing data. HiFi genome sequencing recall and precision across Genome in a Bottle (GIAB)-defined non-difficult and difficult genomic regions (high confidence) for SNVs are >99.9% and >99.7%, respectively, and for indels are >99.8% and >99.1%, respectively. Moreover, HiFi genome sequencing outperforms short-read genome sequencing on overall SNV/indel F1-score accuracy at all paired sequencing depths, and these results are further stratified across 100 GIAB-defined genomic regions for a comprehensive evaluation of performance. Of note, HiFi genome sequencing F1-scores for SNVs and indels surpass 99% at ∼15× and ∼25×, respectively. In addition, high confidence small variant concordance across all HiFi genome sequencing reproducibility assessments (two specimens, three independent sequencing data sets) is >99.8% for SNVs and >98.6% for indels, and average high confidence small variant concordance between paired blood, saliva, and swab specimens is >99.8% in all cases. Taken together, these data underscore that long-read HiFi genome sequencing detection of SNVs and indels is highly accurate and robust, which supports the implementation of this technology for clinical diagnostic testing.
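For readers less familiar with the accuracy metrics named above, the following is a minimal sketch of how recall, precision, and F1-score are conventionally derived from true-positive, false-positive, and false-negative variant counts. It uses the standard textbook definitions and is not the authors' benchmarking pipeline; the `Counts` class, the function names, and the example counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Hypothetical container for benchmark counts against a truth set."""
    tp: int  # variants called and present in the truth set
    fp: int  # variants called but absent from the truth set
    fn: int  # truth-set variants that were not called


def recall(c: Counts) -> float:
    """Fraction of truth-set variants that were detected: TP / (TP + FN)."""
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def precision(c: Counts) -> float:
    """Fraction of called variants that are correct: TP / (TP + FP)."""
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def f1_score(c: Counts) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


if __name__ == "__main__":
    # Illustrative counts only; these are not values from the study.
    example = Counts(tp=90, fp=10, fn=10)
    print(f"recall={recall(example):.3f} "
          f"precision={precision(example):.3f} "
          f"F1={f1_score(example):.3f}")
```

Because F1 is the harmonic mean of precision and recall, it stays high only when both are high, which is why it serves as a single summary figure when comparing platforms across sequencing depths.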

    Footnotes

    • Received December 9, 2023.
    • Accepted April 11, 2025.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
