An efficient genotyper and star-allele caller for pharmacogenomics
- Ananth Hari (1,2),
- Qinghui Zhou (3),
- Nina Gonzaludo (4),
- John Harting (4),
- Stuart A. Scott (5),
- Xiang Qin (6),
- Steve Scherer (6),
- S. Cenk Sahinalp (2) and
- Ibrahim Numanagić (3)
- (1) Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland 20742, USA;
- (2) Cancer Data Science Laboratory, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA;
- (3) Department of Computer Science, University of Victoria, Victoria, British Columbia V8P 5C2, Canada;
- (4) Pacific Biosciences, Menlo Park, California 94025, USA;
- (5) Department of Pathology, Stanford University, Palo Alto, California 94304, USA;
- (6) Baylor College of Medicine Human Genome Sequencing Center, Houston, Texas 77030, USA
Abstract
High-throughput sequencing provides sufficient means for determining genotypes of clinically important pharmacogenes that can be used to tailor medical decisions to individual patients. However, pharmacogene genotyping, also known as star-allele calling, is a challenging problem that requires accurate copy number calling, structural variation identification, variant calling, and phasing within each pharmacogene copy present in the sample. Here we introduce Aldy 4, a fast and efficient tool for genotyping pharmacogenes that uses combinatorial optimization for accurate star-allele calling across different sequencing technologies. Aldy 4 adds support for long reads and uses a novel phasing model and improved copy number and variant calling models. We compare Aldy 4 against the current state-of-the-art star-allele callers on a large and diverse set of samples and genes sequenced by various sequencing technologies, such as whole-genome and targeted Illumina sequencing, barcoded 10x Genomics, and Pacific Biosciences (PacBio) HiFi. We show that Aldy 4 is the most accurate star-allele caller with near-perfect accuracy in all evaluated contexts, and hope that Aldy remains an invaluable tool in the clinical toolbox even with the advent of long-read sequencing technologies.
Footnotes
- [Supplemental material is available for this article.]
- Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.277075.122.
- Received August 11, 2022.
- Accepted December 12, 2022.
This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.











