Diversity, duplication, and genomic organization of homeobox genes in Lepidoptera
- Peter O. Mulhair1,
- Liam Crowley1,
- Douglas H. Boyes1,2,
- Amber Harper1,4,
- Owen T. Lewis1,
- Darwin Tree of Life Consortium3 and
- Peter W.H. Holland1
- 1Department of Biology, University of Oxford, Oxford OX1 3SZ, United Kingdom;
- 2UK Centre for Ecology and Hydrology, Wallingford OX10 8BB, United Kingdom;
- 5Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK;
- 6Marine Biological Association of the United Kingdom, Plymouth PL1 2PB, UK;
- 7University of Liverpool, Liverpool L69 3BX, UK;
- 8University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, UK;
- 9Department of Biology, University of Oxford, Oxford OX1 3SZ, UK;
- 10Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK;
- 11Royal Botanic Gardens, London TW9 3AE, UK;
- 12Royal Botanic Garden Edinburgh, Edinburgh EH3 5LR, UK;
- 13University of Plymouth, Plymouth PL4 8AA, UK;
- 14Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH8 9YL, UK;
- 15Natural History Museum, London SW7 5BD, UK;
- 16EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
Abstract
Homeobox genes encode transcription factors with essential roles in patterning and cell fate in developing animal embryos. Many homeobox genes, including Hox and NK genes, are arranged in gene clusters, a feature likely related to transcriptional control. Sparse taxon sampling and fragmentary genome assemblies mean that little is known about the dynamics of homeobox gene evolution across Lepidoptera or about how changes in homeobox gene number and organization relate to diversity in this large order of insects. Here we analyze an extensive data set of high-quality genomes to characterize the number and organization of all homeobox genes in 123 species of Lepidoptera from 23 taxonomic families. We find most Lepidoptera have around 100 homeobox loci, including an unusual Hox gene cluster in which the lab gene is repositioned and the ro gene is next to pb. A topologically associating domain spans much of the gene cluster, suggesting deep regulatory conservation of the Hox cluster arrangement in this insect order. Most Lepidoptera have four Shx genes, divergent zen-derived loci, but these loci underwent dramatic duplication in several lineages, with some moths having over 165 homeobox loci in the Hox gene cluster; this expansion is associated with local LINE element density. In contrast, the NK gene cluster content is more stable, although there are differences in organization compared with other insects, as well as major rearrangements within butterflies. Our analysis represents the first description of homeobox gene content across the order Lepidoptera, exemplifying the potential of newly generated genome assemblies for understanding genome and gene family evolution.
Footnotes
- 3 A list of Darwin Tree of Life Consortium members and their affiliations is provided at the end of this paper.
- [Supplemental material is available for this article.]
- Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.277118.122.
- Freely available online through the Genome Research Open Access option.
- Received July 12, 2022.
- Accepted November 29, 2022.
This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.











