ML-MAGES enables multivariate genetic association analyses with genes and effect size shrinkage

  1. Sohini Ramachandran1,3
  1. 1Data Science Institute, Brown University, Providence, Rhode Island 02912, USA;
  2. 2Microsoft Research, Cambridge, Massachusetts 02142, USA;
  3. 3Department of Ecology, Evolution, and Organismal Biology, Brown University, Providence, Rhode Island 02912, USA
  • Corresponding author: xiran_liu1{at}brown.edu
  • Abstract

    A fundamental goal of genetics is to identify which and how genetic variants are associated with a trait, often using the regression results from genome-wide association (GWA) studies. Important methodological challenges account for inflation in GWA effect estimates as well as in investigating more than one trait simultaneously. We leverage machine learning approaches for these two challenges, developing a computationally efficient method called ML-MAGES. First, we shrink the inflation in GWA effect sizes caused by nonindependence among variants using neural networks. We then cluster variant associations among multiple traits via variational inference. We compare the performance of shrinkage via neural networks to regularized regression and fine-mapping, two approaches used for addressing inflated effects but dealing with variants in focal regions of different sizes. Our neural network shrinkage outperforms both methods in approximating the true effect sizes in simulated data. Our infinite mixture clustering approach offers a flexible, data-driven way to distinguish different types of associations—trait-specific, shared across traits, or nonprioritized—among multiple traits based on their regularized effects. Clustering applied to our neural network shrinkage results also produces consistently higher precision and recall for distinguishing gene-level associations in simulations. We demonstrate the application of ML-MAGES on association analyses of two quantitative traits and two binary traits in the UK Biobank. Our identified associated genes from single-trait enrichment tests overlap with those having known relevant biological processes to the traits. Besides trait-specific associations, ML-MAGES identifies several variants with shared multitrait associations, suggesting putative shared genetic architecture.

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.280440.125.

    • Freely available online through the Genome Research Open Access option.

    • Received February 13, 2025.
    • Accepted October 10, 2025.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.

    OPEN ACCESS ARTICLE

    This Article

    1. Genome Res. © 2025 Liu et al.; Published by Cold Spring Harbor Laboratory Press

    Article Category

    ORCID

    Share

    Preprint Server