Abstract

Genome structural variants (SVs) comprise a sizable portion of functionally important genetic variation, yet many evade discovery using short reads. While long-read sequencing can reveal hidden SVs, their role in organismal trait variation remains largely unclear. To address this gap, we investigate the molecular basis of 50 classical phenotypes in 11 Drosophila melanogaster strains using highly contiguous de novo genome assemblies generated with Oxford Nanopore Technologies long reads. These assemblies enable construction of a pangenome graph containing nucleotide-resolution maps of SVs, including complex rearrangements such as the interchromosomal inverted duplication Dp(2;4)eyD and large tandem duplications at the Bar locus. We uncover new candidate causal mutations for 15 phenotypes and new molecular alleles for 2 mutations comprising tandem duplications, transposable element (TE) insertions, and indels. For example, the wing vein phenotype plexus (px1) links to a 1.5 kb partial tandem gene duplication, and the century-old Curved (c1) wing phenotype links to a 7.5 kb DM412 retrotransposon disrupting the coding sequence of the muscle protein gene Strn-Mlck. We also identify a candidate intergenic enhancer for AblpeyD, a finding supported by CRISPR-Cas9. We further reveal 8 SV alleles of previously identified causal genes, including uncharacterized SVs underlying the extensively studied white and yellow phenotypes. Overall, 67.4% of genes causing phenotypic changes harbor candidate SVs >100 bp, whereas only 28% are expected based on euchromatic SVs. Together, our results indicate that SVs are strongly enriched among this class of large-effect, deleterious visible phenotypes in Drosophila.
