Independent expansion, selection, and hypervariability of the TBC1D3 gene family in humans

  1. Evan E. Eichler1,7
  1. Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA;
  2. Tisch Cancer Institute, Division of Hematology and Medical Oncology, The Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA;
  3. Department of Biochemistry;
  4. Department of Medicine, Division of Cardiovascular Medicine, Stanford University, Stanford, California 94305, USA;
  5. Department of Pediatrics, Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, Missouri 64108, USA;
  6. Department of Pediatrics, School of Medicine, University of Missouri Kansas City, Kansas City, Missouri 64108, USA;
  7. Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
  • Corresponding author: ee3{at}uw.edu
  • Abstract

    TBC1D3 is a primate-specific gene family that has expanded in the human lineage and has been implicated in neuronal progenitor proliferation and expansion of the frontal cortex. The gene family and its expression have been challenging to investigate because it is embedded in high-identity and highly variable segmental duplications. We sequenced and assembled the gene family using long-read sequencing data from 34 humans and 11 nonhuman primate species. Our analysis shows that this particular gene family has independently duplicated in at least five primate lineages, and the duplicated loci are enriched at sites of large-scale chromosomal rearrangements on Chromosome 17. We find that all human copy-number variation maps to two distinct clusters located at Chromosome 17q12 and that humans are highly structurally variable at this locus, differing by as many as 20 copies and ∼1 Mbp in length depending on haplotypes. We also show evidence of positive selection, as well as a significant change in the predicted human TBC1D3 protein sequence. Last, we find that, despite multiple duplications, human TBC1D3 expression is limited to a subset of copies and, most notably, to a single paralog group: TBC1D3-CDKL. These observations may help explain why a gene potentially important in cortical development can be so variable in the human population.

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.279299.124.

    • Freely available online through the Genome Research Open Access option.

    • Received March 8, 2024.
    • Accepted July 29, 2024.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
