RT Journal
A1 Mikelov, Artem
A1 Nefediev, George
A1 Tashkeev, Alexander
A1 Rodriguez, Oscar L.
A1 Aguilar Ortmans, Diego
A1 Skatova, Valeriia
A1 Izraelson, Mark
A1 Davydov, Alexey N.
A1 Poslavsky, Stanislav
A1 Rahmouni, Souad
A1 Watson, Corey T.
A1 Chudakov, Dmitriy
A1 Boyd, Scott D.
A1 Bolotin, Dmitry
T1 Ultrasensitive allele inference from immune repertoire sequencing data with MiXCR
JF Genome Research
JO Genome Research
YR 2024
FD December 01
VO 34
IS 12
SP 2293
OP 2303
DO 10.1101/gr.278775.123
UL http://genome.cshlp.org/content/34/12/2293.abstract
AB Allelic variability in the adaptive immune receptor loci, which harbor the gene segments that encode B cell and T cell receptors (BCR/TCR), is of critical importance for immune responses to pathogens and vaccines. Adaptive immune receptor repertoire sequencing (AIRR-seq) has become widespread in immunology research making it the most readily available source of information about allelic diversity in immunoglobulin (IG) and T cell receptor (TR) loci. Here, we present a novel algorithm for extrasensitive and specific variable (V) and joining (J) gene allele inference, allowing the reconstruction of individual high-quality gene segment libraries. The approach can be applied for inferring allelic variants from peripheral blood lymphocyte BCR and TCR repertoire sequencing data, including hypermutated isotype-switched BCR sequences, thus allowing high-throughput novel allele discovery from a wide variety of existing data sets. The developed algorithm is a part of the MiXCR software. We demonstrate the accuracy of this approach using AIRR-seq paired with long-read genomic sequencing data, comparing it to a widely used algorithm, TIgGER. We applied the algorithm to a large set of IG heavy chain (IGH) AIRR-seq data from 450 donors of ancestrally diverse population groups, and to the largest reported full-length TCR alpha and beta chain (TRA and TRB) AIRR-seq data set, representing 134 individuals.
This allowed us to assess the genetic diversity within the IGH, TRA, and TRB loci in different populations and to establish a database of alleles of V and J genes inferred from AIRR-seq data and their population frequencies with free public access through the VDJ.online database.