Estimation of intrafamilial DNA contamination in family trio genome sequencing using deviation from Mendelian inheritance

  1. Young Seok Ju (affiliations 3, 6)
  1. Department of Medicine, Washington University School of Medicine, St. Louis, Missouri 63110, USA;
  2. Research Center for Natural Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea;
  3. Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea;
  4. McDonnell Genome Institute, St. Louis, Missouri 63108, USA;
  5. Center for Supercomputing Applications, Division of National Supercomputing, Korea Institute of Science and Technology Information, Daejeon 34141, Korea;
  6. GENOME INSIGHT Incorporated, Daejeon 34051, Korea;
  7. Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea;
  8. Center for Synaptic Brain Dysfunctions, Institute for Basic Science, Daejeon 34141, Korea;
  9. Department of Psychiatry, Seoul National University Bundang Hospital, Seongnam 13620, Korea;
  10. Department of Psychiatry, Seoul National University College of Medicine, Seoul 03080, Korea;
  11. Department of Obstetrics and Gynecology, Seoul National University College of Medicine, Seoul 03080, Korea;
  12. Department of Laboratory Medicine, International St. Mary's Hospital, Catholic Kwandong University College of Medicine, Incheon 22711, Korea
  • Corresponding authors: ysju{at}kaist.ac.kr, obigriffith{at}wustl.edu
  • Abstract

    With the increasing number of sequencing projects involving families, quality control tools optimized for family genome sequencing are needed. However, accurately quantifying contamination in a DNA mixture is particularly difficult when genetically related family members are the sources. We developed TrioMix, a maximum likelihood estimation (MLE) framework based on Mendel's law of inheritance, to quantify DNA mixture between family members in genome sequencing data of parent–offspring trios. TrioMix can accurately deconvolute any intrafamilial DNA contamination, including parent–offspring, sibling–sibling, parent–parent, and even multiple familial sources. In addition, TrioMix can be applied to detect genomic abnormalities that deviate from Mendelian inheritance patterns, such as uniparental disomy (UPD) and chimerism. A genome-wide depth and variant allele frequency plot generated by TrioMix facilitates tracing the origin of Mendelian inheritance deviations. We showed that TrioMix could accurately deconvolute genomes in both simulated and real data sets.
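
To make the idea of a Mendelian-inheritance-based MLE concrete, the following is a minimal illustrative sketch, not TrioMix's actual implementation. It assumes a single contamination source (the mother's DNA mixed into the child's sample) and uses only sites where the parents are homozygous for different alleles, so an uncontaminated child must be heterozygous with an expected variant allele frequency of 0.5. A maternal fraction x shifts the expected alt-read fraction to 0.5(1 - x) where the mother is homozygous reference and to 0.5(1 + x) where she is homozygous alternate. The function names, the site tuple layout, and the golden-section optimizer are all assumptions made for this example.

```python
import math

def log_likelihood(x, sites):
    """Binomial log-likelihood of a maternal contamination fraction x.

    Each site is a hypothetical tuple (alt_reads, depth, mother_alt_dosage),
    restricted to sites where the parents are opposite homozygotes:
    mother_alt_dosage is 0 (mother hom-ref, father hom-alt) or
    2 (mother hom-alt, father hom-ref).
    """
    total = 0.0
    for alt_reads, depth, mother_alt_dosage in sites:
        child_vaf = 0.5                      # obligate heterozygote by Mendel's law
        mother_vaf = mother_alt_dosage / 2   # 0.0 or 1.0
        # Expected alt-read fraction in a mixture of child and mother DNA.
        p = (1 - x) * child_vaf + x * mother_vaf
        p = min(max(p, 1e-9), 1 - 1e-9)      # keep log() finite at the boundaries
        total += alt_reads * math.log(p) + (depth - alt_reads) * math.log(1 - p)
    return total

def estimate_contamination(sites, tol=1e-6):
    """Maximize the log-likelihood over x in [0, 1] by golden-section search.

    The log-likelihood is concave in x because p is linear in x,
    so a unimodal search is sufficient for this simplified model.
    """
    lo, hi = 0.0, 1.0
    ratio = (math.sqrt(5) - 1) / 2
    while hi - lo > tol:
        a = hi - ratio * (hi - lo)
        b = lo + ratio * (hi - lo)
        if log_likelihood(a, sites) < log_likelihood(b, sites):
            lo = a
        else:
            hi = b
    return (lo + hi) / 2

if __name__ == "__main__":
    # Synthetic read counts constructed to match a 20% maternal fraction.
    example_sites = [(40, 100, 0), (60, 100, 2)]
    print(round(estimate_contamination(example_sites), 3))
```

With the synthetic counts above, the estimate should come out at approximately 0.2. Handling several simultaneous sources (father, sibling) or sequencing error would require additional parameters and site classes that this sketch deliberately omits.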

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.276794.122.

    • Freely available online through the Genome Research Open Access option.

    • Received March 28, 2022.
    • Accepted October 31, 2022.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
