Efficient and accurate KIR and HLA genotyping with massively parallel sequencing data

  1. Heng Li1,2
  1. Department of Data Science, Dana-Farber Cancer Institute, Boston, Massachusetts 02215, USA;
  2. Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts 02115, USA;
  3. Lyda Hill Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, Texas 75390, USA
  • Present addresses: 4Department of Biomedical Data Science, Dartmouth College, Hanover, NH 03755, USA; 5Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA; 6GV20 Therapeutics, Cambridge, MA 02139, USA; 7Department of Pathology and Laboratory Science, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA

  • Corresponding authors: lib3{at}chop.edu, hli{at}ds.dfci.harvard.edu
  • Abstract

    Killer cell immunoglobulin-like receptor (KIR) genes and human leukocyte antigen (HLA) genes play important roles in innate and adaptive immunity. They are highly polymorphic and cannot be genotyped with standard variant calling pipelines. Compared with HLA genes, many KIR genes are similar to each other in sequence and may be absent from some chromosomes. Therefore, although many tools have been developed to genotype HLA genes using common sequencing data, none of them work for KIR genes. Even specialized KIR genotypers cannot resolve all the KIR genes. Here we describe T1K, a novel computational method for the efficient and accurate inference of KIR or HLA alleles from RNA-seq, whole-genome sequencing, or whole-exome sequencing data. T1K jointly considers alleles across all genotyped genes, so it can reliably identify which genes are present and distinguish homologous genes, including the challenging KIR2DL5A/KIR2DL5B genes. This model also benefits HLA genotyping, where T1K achieves high accuracy in benchmarks. Moreover, T1K can call novel single-nucleotide variants and process single-cell data. Applying T1K to tumor single-cell RNA-seq data, we found that KIR2DL4 expression was enriched in tumor-specific CD8+ T cells. T1K may open opportunities for HLA and KIR genotyping across various sequencing applications.

    Footnotes

    • Received December 11, 2022.
    • Accepted May 4, 2023.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.

    Preprint Server