Method

Predicted protein 3D structures provide essential insights into the genetic architecture underlying phenotypic diversity in maize

    • 1 Key Laboratory of Maize Biology and Genetic Breeding in Arid Areas of the Northwest Region, College of Agronomy, Northwest A&F University, Yangling, Shaanxi 712100, China;
    • 2 Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agriculture Sciences in Weifang, Weifang, Shandong 261325, China;
    • 3 Section of Plant Breeding and Genetics, Cornell University, Ithaca, New York 14853, USA;
    • 4 Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853, USA;
    • 5 Agricultural Research Service, United States Department of Agriculture, Ithaca, New York 14853, USA;
    • 6 Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus 8000, Denmark;
    • 7 Texas Advanced Computing Center, University of Texas at Austin, Austin, Texas 78758, USA
Published October 31, 2025. Vol 36 Issue 1, pp. 214-225. https://doi.org/10.1101/gr.280514.125

Abstract

Variation in protein 3D structures reflects genetic variation and contributes to phenotypic diversity, yet its underlying genetic mechanisms remain unclear. To investigate the relationship between protein 3D structure and phenotype, we predict the 3D structures of 795,649 proteins from 26 maize (Zea mays L.) inbred lines using AlphaFold2. Population genetics analysis of these protein 3D structures reveals that buried residues have higher genomic evolutionary rate profiling (GERP) scores than exposed residues, indicating that buried residues are under stronger purifying selection. The design of the maize nested association mapping population, which comprises about 5000 individuals, makes it possible to use haplotype information and protein 3D structural variation to reveal the molecular mechanisms linking genetic diversity and phenotypic variation. Across 32 agronomic traits, associating protein 3D structure variation with phenotypes (structure-based proteome-wide association study [PWAS]) identifies 14.2% more significant proteins (96 vs. 84) than associating protein sequence with phenotypes (sequence-based PWAS). Moreover, structure-based PWAS identifies 24 significant proteins that sequence-based PWAS does not detect, whereas sequence-based PWAS identifies 12 significant proteins that structure-based PWAS does not detect. Structure-based proteome-wide predictions (PWPs) improve genomic prediction accuracy by an average of 3.8% compared with sequence-based PWPs. In general, predicted protein 3D structures represent a powerful approach for understanding the natural diversity of protein haplotypes.
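
The protein counts reported above can be cross-checked with simple set arithmetic. The following is a minimal sketch, not part of the authors' analysis: the function name `pwas_overlap` and its parameters are hypothetical, and the only inputs are the four counts given in the abstract (96 and 84 significant proteins in total, 24 and 12 found by only one method).

```python
def pwas_overlap(structure_total, structure_unique, sequence_total, sequence_unique):
    """Derive shared and combined protein counts from per-method totals.

    Hypothetical helper for illustration only; the inputs are the counts
    quoted in the abstract, not data from the underlying study.
    """
    # Proteins found by both methods, computed from each side independently.
    shared_from_structure = structure_total - structure_unique
    shared_from_sequence = sequence_total - sequence_unique
    if shared_from_structure != shared_from_sequence:
        raise ValueError("totals and unique counts are mutually inconsistent")

    shared = shared_from_structure
    # Proteins found by at least one of the two methods.
    union = shared + structure_unique + sequence_unique
    # Relative increase of the structure-based total over the sequence-based total.
    relative_gain = (structure_total - sequence_total) / sequence_total
    return {"shared": shared, "union": union, "relative_gain": relative_gain}


summary = pwas_overlap(
    structure_total=96, structure_unique=24,
    sequence_total=84, sequence_unique=12,
)
print(summary)
```

Under these counts, both subtractions give the same number of shared proteins (96 - 24 and 84 - 12), so the reported figures are mutually consistent, and the relative gain of 12/84 is roughly 14%, in line with the percentage quoted in the abstract.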
