Ultrasensitive allele inference from immune repertoire sequencing data with MiXCR

  1. Dmitry A Bolotin (affiliation 2)
  1. Stanford University
  2. MiLaboratories Inc
  3. University of Liege
  4. University of Louisville
  5. MiLaboratories Inc, Central European Institute of Technology, Masaryk University
  • * Corresponding author; email: t.mikelov{at}gmail.com
  • Abstract

    Allelic variability in the adaptive immune receptor loci, which harbor the gene segments that encode B cell and T cell receptors (BCR/TCR), is of critical importance for immune responses to pathogens and vaccines. Adaptive immune receptor repertoire sequencing (AIRR-seq) has become widespread in immunology research, making it the most readily available source of information about allelic diversity in immunoglobulin (IG) and T cell receptor (TR) loci. Here we present a novel algorithm for ultrasensitive and specific inference of variable (V) and joining (J) gene alleles, allowing reconstruction of individual high-quality gene segment libraries. The approach can be applied to infer allelic variants from peripheral blood lymphocyte BCR and TCR repertoire sequencing data, including hypermutated isotype-switched BCR sequences, thus allowing high-throughput discovery of novel alleles from a wide variety of existing datasets. The algorithm is implemented as part of the MiXCR software. We demonstrate the accuracy of this approach using AIRR-seq paired with long-read genomic sequencing data, comparing it to a widely used algorithm, TIgGER. We applied the algorithm to a large set of IG heavy chain (IGH) AIRR-seq data from 450 donors of ancestrally diverse population groups, and to the largest reported full-length TCR alpha and beta chain (TRA; TRB) AIRR-seq dataset, representing 134 individuals. This allowed us to assess the genetic diversity within the IGH, TRA and TRB loci in different populations and to establish a freely accessible online database of V and J gene alleles inferred from AIRR-seq data, together with their population frequencies.

    • Received November 26, 2023.
    • Accepted October 3, 2024.

    This manuscript is Open Access.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International license), as described at http://creativecommons.org/licenses/by-nc/4.0/.

    ACCEPTED MANUSCRIPT

    1. Genome Res. gr.278775.123 Published by Cold Spring Harbor Laboratory Press
