Matrix sketching framework for linear mixed models in association studies

  Petros Drineas²

  ¹ Computational Genomics, IBM T.J. Watson Research Center, Yorktown Heights, New York 10598, USA
  ² Computer Science Department, Purdue University, West Lafayette, Indiana 47907, USA
  ³ These authors contributed equally to this work.

  • Corresponding author: pdrineas{at}purdue.edu
  • Abstract

    Linear mixed models (LMMs) are widely used in genome-wide association studies to control for population stratification and cryptic relatedness. However, estimating LMM parameters is computationally expensive because it requires large-scale matrix operations to build the genetic relationship matrix (GRM). Over the past 25 years, Randomized Linear Algebra has provided alternative approaches to such matrix operations through matrix sketching, which often yields fast, efficient, and provably accurate approximations. We use matrix sketching to develop a fast and efficient LMM method, Matrix-Sketching LMM (MaSk-LMM), which sketches the genotype matrix to reduce its dimensions and speed up computations. Our framework comes with theoretical guarantees and shows strong empirical performance compared with the current state of the art on simulated traits and complex diseases.
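
To make the idea of sketching the genotype matrix concrete, the following is a minimal illustrative sketch in Python with NumPy. It is not the MaSk-LMM implementation. It assumes a standardized genotype matrix X (individuals by markers), a GRM defined as X Xᵀ / m, and a dense Gaussian sketching matrix applied to the marker dimension; the function names, the simulated genotypes, the choice of sketch, and the sketch size are all assumptions made for illustration.

```python
import numpy as np

def standardize(G):
    """Center and scale each marker (column) of a genotype matrix."""
    mu = G.mean(axis=0)
    sd = G.std(axis=0)
    sd[sd == 0] = 1.0  # guard against monomorphic markers
    return (G - mu) / sd

def sketched_grm(X, s, rng):
    """Approximate K = X X^T / m by compressing the m markers to s columns.

    Assumption: a dense Gaussian sketch S (m x s) with entries N(0, 1/s),
    so that E[S S^T] = I and E[(X S)(X S)^T] = X X^T.
    """
    n, m = X.shape
    S = rng.standard_normal((m, s)) / np.sqrt(s)
    Xs = X @ S                      # n x s sketched genotype matrix
    return (Xs @ Xs.T) / m

# Hypothetical toy data: 50 individuals, 5000 markers, sketch size 2000.
rng = np.random.default_rng(0)
n, m, s = 50, 5000, 2000
freqs = rng.uniform(0.1, 0.9, size=m)                  # allele frequencies
G = rng.binomial(2, freqs, size=(n, m)).astype(float)  # 0/1/2 genotypes
X = standardize(G)

K_exact = (X @ X.T) / m
K_sketch = sketched_grm(X, s, rng)

# Relative Frobenius-norm error of the sketched GRM.
rel_err = np.linalg.norm(K_sketch - K_exact) / np.linalg.norm(K_exact)
print(f"relative error: {rel_err:.3f}")
```

A dense Gaussian sketch is used here only because it is the simplest to state and analyze; forming X S this way is not itself cheaper than computing X Xᵀ directly. Practical speed-ups generally depend on structured or sparse sketching matrices and on running the downstream computations on the smaller sketched matrix.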

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.279230.124.

    • Freely available online through the Genome Research Open Access option.

    • Received February 28, 2024.
    • Accepted August 12, 2024.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
