Precise genotyping of circular mobile elements from metagenomic data uncovers human-associated plasmids with recent common ancestors

  1. Eitan Yaffe1,3
  1. Infectious Diseases Section, Veteran Affairs Palo Alto Health Care System, Palo Alto, California 94304, USA;
  2. Washington University in St. Louis, St. Louis, Missouri 63130, USA;
  3. Department of Medicine, Stanford University School of Medicine, Stanford, California 94305, USA;
  4. Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, California 94305, USA
  • Corresponding authors: relman{at}stanford.edu, eitan.yaffe{at}stanford.edu
  • Abstract

    Mobile genetic elements with circular genomes play a key role in the evolution of microbial communities. Their circular genomes correspond to circular walks in metagenome graphs, and yet, assemblies derived from natural microbial communities produce graphs riddled with spurious cycles, complicating the accurate reconstruction of circular genomes. We present DomCycle, an algorithm that reconstructs likely circular genomes based on the identification of so-called “dominant” graph cycles. In the implementation, we leverage paired reads to bridge assembly gaps and scrutinize cycles through a nucleotide-level analysis, making the approach robust to misassembly artifacts. We validated the approach using simulated and real sequencing data. Application of DomCycle to 32 publicly available DNA shotgun sequence data sets from diverse natural environments led to the reconstruction of hundreds of circular mobile genomes. Clustering revealed 20 highly prevalent and cryptic plasmids that have clonal population structures with recent common ancestors. This method facilitates the study of microbial communities that evolve through horizontal gene transfer.
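
To make the idea of a "dominant" cycle concrete, the following is a minimal sketch and not the DomCycle implementation. The abstract does not define dominance, so the sketch assumes one plausible reading: a cycle is dominant when its weakest edge has much higher read coverage than any competing edge that enters or leaves the cycle. The function names, the `+ 1.0` pseudocount, the `min_score` threshold of 2.0, and the toy coverage values are all hypothetical. Paired-read gap bridging and the nucleotide-level checks mentioned above are not modeled.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]

# Toy assembly graph: directed edges between contigs with made-up coverage values.
EDGES: Dict[Edge, float] = {
    ("a", "b"): 40.0, ("b", "c"): 38.0, ("c", "a"): 41.0,  # well-supported cycle
    ("c", "d"): 3.0, ("d", "a"): 2.0,                      # weakly supported detour
}


def simple_cycles(edges: Dict[Edge, float]) -> List[List[str]]:
    """Enumerate simple directed cycles by DFS (suitable for toy graphs only)."""
    adj: Dict[str, List[str]] = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, [])
    order = sorted(adj)
    rank = {node: i for i, node in enumerate(order)}
    cycles: List[List[str]] = []

    def dfs(start: str, node: str, path: List[str]) -> None:
        for nxt in adj[node]:
            if nxt == start:
                cycles.append(path[:])
            # Only visit higher-ranked nodes so each cycle is reported once,
            # rooted at its lowest-ranked vertex.
            elif rank[nxt] > rank[start] and nxt not in path:
                path.append(nxt)
                dfs(start, nxt, path)
                path.pop()

    for s in order:
        dfs(s, s, [s])
    return cycles


def cycle_score(cycle: List[str], edges: Dict[Edge, float]) -> float:
    """Assumed score: bottleneck coverage on the cycle over the strongest competing edge."""
    on_cycle = {(cycle[i], cycle[(i + 1) % len(cycle)]) for i in range(len(cycle))}
    members = set(cycle)
    bottleneck = min(edges[e] for e in on_cycle)
    external = max(
        (cov for (u, v), cov in edges.items()
         if (u, v) not in on_cycle and (u in members or v in members)),
        default=0.0,
    )
    return bottleneck / (external + 1.0)  # pseudocount avoids division by zero


def dominant_cycles(edges: Dict[Edge, float], min_score: float = 2.0) -> List[List[str]]:
    """Keep cycles whose assumed dominance score reaches the (hypothetical) threshold."""
    return [c for c in simple_cycles(edges) if cycle_score(c, edges) >= min_score]


print(dominant_cycles(EDGES))  # [['a', 'b', 'c']]
```

In this toy graph both `a-b-c` and `a-b-c-d` are cycles, but only `a-b-c` passes: its weakest edge (38) far outweighs the strongest competing edge (3), whereas the longer cycle has a bottleneck of 2 against a competing edge of 41. This is the sense in which a graph "riddled with spurious cycles" can still yield a small set of likely circular genomes.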

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.275894.121.

    • Freely available online through the Genome Research Open Access option.

    • Received June 15, 2021.
    • Accepted April 1, 2022.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
