Quadrupia provides a comprehensive catalog of G-quadruplexes across genomes from the tree of life

  1. Ilias Georgakopoulos-Soares1
  1. Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania 17033, USA;
  2. Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming,” Vari 16672, Greece;
  3. Department of Basic Sciences, School of Medicine, University of Crete, Heraklion 71003, Greece;
  4. Department of Chemistry and State Key Laboratory of Marine Environmental Health, City University of Hong Kong, Kowloon Tong, Hong Kong SAR 999077, China;
  5. Shenzhen Research Institute of the City University of Hong Kong, Shenzhen 518057, China;
  6. Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, USA;
  7. Advanced Manufacturing Laboratory, Department of Manufacturing Systems, Faculty of Mechanical Engineering and Robotics, AGH University of Krakow, Krakow 30-059, Poland;
  8. Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin, Dell Pediatric Research Institute, Austin, Texas 78723, USA;
  9. Department of Life Sciences, School of Sciences, European University Cyprus, Nicosia 1516, Cyprus;
  10. Cancer Genetics, Genomics and Systems Biology Laboratory, Basic and Translational Cancer Research Center (BTCRC), Nicosia 1516, Cyprus
  11. These authors contributed equally to this work.

  • Corresponding authors: pavlopoulos{at}fleming.gr, izg5139{at}psu.edu
  • Abstract

    G-quadruplex DNA structures exert a profound influence on essential biological processes, including transcription, replication, telomere maintenance, and genomic stability. These structures have demonstrably shaped organismal evolution. However, a comprehensive, organism-wide G-quadruplex map encompassing the diversity of life has remained elusive. Here, we introduce Quadrupia, the most extensive and well-characterized G-quadruplex database to date, facilitating the exploration of G-quadruplex structures across the evolutionary spectrum. Quadrupia has identified G-quadruplex sequences in 108,449 reference genomes, with a total of 140,181,277 G-quadruplexes. The database also hosts a collection of 319,784 G-quadruplex clusters of 20 or more members, annotated by taxonomic distributions, multiple sequence alignments, profile hidden Markov models, and cross-references to G-quadruplex 3D structures. Examination of G-quadruplexes across functional genomic elements in different taxa indicates preferential orientation and positioning, with significant differences between individual taxonomic groups. For example, we find that G-quadruplexes in bacteria with a single replication origin display a profound preference for the leading orientation. Finally, we experimentally validate the most frequently observed G-quadruplexes using CD spectroscopy, UV melting, and fluorescence-based approaches.

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.279790.124.

    • Freely available online through the Genome Research Open Access option.

    • Received July 15, 2024.
    • Accepted August 21, 2025.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
