Gene prediction in the immunoglobulin loci
Abstract
The V(D)J recombination process rearranges the variable (V), diversity (D), and joining (J) genes in the immunoglobulin loci to generate antibody repertoires. Annotation of these loci across various species and predicting the V, D, and J genes (IG genes) is critical for studies of the adaptive immune system. However, since the standard gene finding algorithms are not suitable for predicting IG genes, they have been semi-manually annotated in very few species. We developed the IGDetective algorithm for predicting IG genes and applied it to species with the assembled IG loci. IGDetective generated the first large collection of IG genes across many species and enabled their evolutionary analysis, including the analysis of the "bat IG diversity" hypothesis. This analysis revealed extremely conserved V genes in evolutionary distant species indicating that these genes may be subjected to the same selective pressure, e.g., pressure driven by common pathogens. IGDetective also revealed extremely diverged V genes and a new family of evolutionary conserved V genes in bats with unusual noncanonical cysteines. Moreover, in difference from all other previously reported antibodies, these cysteines are located within complementarity-determining regions. Since cysteines form disulfide bonds, we hypothesize that these cysteine-rich V genes might generate antibodies with noncanonical conformations and could potentially form a unique part of the immune repertoire in bats. We also analyzed the diversity landscape of the recombination signal sequences and revealed their features that trigger the high/low usage of the IG genes.
- Received February 9, 2022.
- Accepted May 6, 2022.
- Published by Cold Spring Harbor Laboratory Press
This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.











