An efficient genotyper and star-allele caller for pharmacogenomics

  1. Ibrahim Numanagić (3)
  1. Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland 20742, USA
  2. Cancer Data Science Laboratory, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
  3. Department of Computer Science, University of Victoria, Victoria, British Columbia V8P 5C2, Canada
  4. Pacific Biosciences, Menlo Park, California 94025, USA
  5. Department of Pathology, Stanford University, Palo Alto, California 94304, USA
  6. Baylor College of Medicine Human Genome Sequencing Center, Houston, Texas 77030, USA
  • Corresponding author: inumanag{at}uvic.ca
    Abstract

    High-throughput sequencing provides sufficient means for determining genotypes of clinically important pharmacogenes that can be used to tailor medical decisions to individual patients. However, pharmacogene genotyping, also known as star-allele calling, is a challenging problem that requires accurate copy number calling, structural variation identification, variant calling, and phasing within each pharmacogene copy present in the sample. Here we introduce Aldy 4, a fast and efficient tool for genotyping pharmacogenes that uses combinatorial optimization for accurate star-allele calling across different sequencing technologies. Aldy 4 adds support for long reads and uses a novel phasing model and improved copy number and variant calling models. We compare Aldy 4 against the current state-of-the-art star-allele callers on a large and diverse set of samples and genes sequenced by various sequencing technologies, such as whole-genome and targeted Illumina sequencing, barcoded 10x Genomics, and Pacific Biosciences (PacBio) HiFi. We show that Aldy 4 is the most accurate star-allele caller with near-perfect accuracy in all evaluated contexts, and hope that Aldy remains an invaluable tool in the clinical toolbox even with the advent of long-read sequencing technologies.

    Footnotes

    • Received August 11, 2022.
    • Accepted December 12, 2022.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
