A butterfly pan-genome reveals that a large amount of structural variation underlies the evolution of chromatin accessibility
- Angelo A. Ruggieri¹,
- Luca Livraghi²,³,
- James J. Lewis⁴,
- Elizabeth Evans¹,
- Francesco Cicconardi⁵,
- Laura Hebberecht⁵,
- Yadira Ortiz-Ruiz¹,⁶,
- Stephen H. Montgomery⁵,
- Alfredo Ghezzi¹,
- José Arcadio Rodriguez-Martinez¹,
- Chris D. Jiggins⁷,
- W. Owen McMillan³,
- Brian A. Counterman⁸,
- Riccardo Papa¹,⁶ and
- Steven M. Van Belleghem¹,⁹
- ¹Department of Biology, University of Puerto Rico–Rio Piedras, San Juan PR 00931, Puerto Rico;
- ²Department of Biological Sciences, The George Washington University, Washington, DC 20052, USA;
- ³Smithsonian Tropical Research Institute, Apartado 0843-03092 Panamá, Panama;
- ⁴Department of Zoology, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada;
- ⁵School of Biological Sciences, Bristol University, Bristol BS8 1QU, United Kingdom;
- ⁶Molecular Sciences and Research Center, University of Puerto Rico, San Juan 00926, Puerto Rico;
- ⁷Department of Zoology, University of Cambridge, Cambridge CB2 3EJ, United Kingdom;
- ⁸Department of Biological Sciences, Auburn University, Auburn, Alabama 36849, USA;
- ⁹Ecology, Evolution and Conservation Biology, Biology Department, KU Leuven, 3000 Leuven, Belgium
Abstract
Although insertions and deletions (indels) are the most common structural variants (SVs) found across genomes, little is known about how much these SVs vary within populations and between closely related species, or about their significance in evolution. To address these questions, we characterized the evolution of indel SVs using genome assemblies of three closely related Heliconius butterfly species. Over the relatively short evolutionary timescales investigated, up to 18.0% of the genome was composed of indels between two haplotypes of an individual Heliconius charithonia butterfly, and up to 62.7% included lineage-specific SVs between the genomes of the most distant species (11 Mya). Lineage-specific sequences were mostly characterized as transposable elements (TEs) inserted at random throughout the genome, and their overall distribution was affected by linked selection in a similar way to that of single-nucleotide substitutions. Using chromatin accessibility profiles (i.e., ATAC-seq) of head tissue in caterpillars to identify sequences with potential cis-regulatory function, we found that of the 31,066 identified differences in chromatin accessibility between species, 30.4% were within lineage-specific SVs and 9.4% were characterized as TE insertions. These TE insertions were localized closer to gene transcription start sites than expected at random and were enriched for sites with significant resemblance to several transcription factor binding sites with known function in neuron development in Drosophila. We also identified 24 TE insertions with head-specific chromatin accessibility. Our results show high rates of structural genome evolution that were previously overlooked in comparative genomic studies and suggest a high potential for structural variation to serve as raw material for adaptive evolution.
Footnotes
- [Supplemental material is available for this article.]
- Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.276839.122.
- Received April 16, 2022.
- Accepted September 13, 2022.
This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
