Genotype imputation via matrix completion

  1. Kenneth Lange (5)
  1. Department of Human Genetics, University of California, Los Angeles, California 90095, USA;
  2. Department of Statistics, North Carolina State University, Raleigh, North Carolina 27695-8203, USA;
  3. Department of Preventive Medicine, University of Southern California, Los Angeles, California 90089, USA;
  4. Interdepartmental Program in Bioinformatics, University of California, Los Angeles, California 90095, USA;
  5. Department of Biomathematics, Department of Human Genetics, and Department of Statistics, University of California, Los Angeles, California 90095, USA

    Abstract

    Most current genotype imputation methods are model-based and computationally intensive, taking days to impute one chromosome pair on 1000 people. We describe an efficient genotype imputation method based on matrix completion. Our matrix completion method is implemented in MATLAB and tested on real data from HapMap 3, simulated pedigree data, and simulated low-coverage sequencing data derived from the 1000 Genomes Project. Compared with leading imputation programs, the matrix completion algorithm embodied in our program MENDEL-IMPUTE achieves comparable imputation accuracy while reducing run times significantly. Implementation in a lower-level language such as Fortran or C is apt to further improve computational efficiency.
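
To make the idea of imputation by matrix completion concrete, the following is a minimal sketch in Python with NumPy. It is not the MENDEL-IMPUTE algorithm, whose details the abstract does not give. It assumes a small individuals-by-SNPs matrix of genotype dosages (0, 1, or 2) with missing entries marked as NaN, and it fills them by repeatedly projecting onto a fixed-rank approximation via truncated SVD. The function name `impute_low_rank`, the fixed `rank`, and the toy data are all illustrative assumptions.

```python
import numpy as np


def impute_low_rank(genotypes, rank=2, n_iter=100):
    """Fill NaN entries of a dosage matrix by iterated truncated SVD.

    Hypothetical helper for illustration only; not the MENDEL-IMPUTE method.
    """
    observed = ~np.isnan(genotypes)
    # Start from the per-SNP (column) mean of the observed dosages.
    col_means = np.nanmean(genotypes, axis=0)
    filled = np.where(observed, genotypes, col_means)
    for _ in range(n_iter):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        # Best rank-`rank` approximation of the current filled matrix.
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
        # Keep observed genotypes fixed; update only the missing entries.
        filled = np.where(observed, genotypes, low_rank)
    # Dosages are only meaningful between 0 and 2 copies of the allele.
    return np.clip(filled, 0.0, 2.0)


# Toy data (an assumption): 12 individuals who each carry one of two
# genotype patterns across 6 SNPs, so the true matrix has rank 2.
a = np.array([0, 1, 2, 1, 0, 2], dtype=float)
b = np.array([2, 1, 0, 0, 1, 1], dtype=float)
truth = np.vstack([a, b] * 6)

masked = truth.copy()
masked[0, 2] = np.nan  # hide one genotype in an "a" individual
masked[1, 4] = np.nan  # hide one genotype in a "b" individual

imputed = impute_low_rank(masked, rank=2)
calls = np.rint(imputed).astype(int)  # round dosages to genotype calls
```

The sketch relies on the same intuition the abstract appeals to: individuals who share haplotype segments make the genotype matrix approximately low rank, so missing entries can be inferred from the observed ones without an explicit population-genetic model. Real data would need a rank or penalty chosen from the data and windowing along the chromosome, neither of which is attempted here.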

    Footnotes

    • Received July 11, 2012.
    • Accepted December 3, 2012.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/.
