Fast and accurate out-of-core PCA framework for large scale biobank data


Figure 2.

Runtime and memory usage. Runtime and memory usage for calculating the top K = 40 PCs on either random subsets of common SNPs for 50,000 individuals (left column) or random subsets of individuals for 3,000,000 common SNPs (right column). We used 20 CPU threads for all programs except FlashPCA2, which does not support multithreading. Detailed commands are included in the Benchmarking section. PLINK2 (FastPCA) ran out of memory when using more than 1,000,000 SNPs.

This Article

  1. Genome Res. 33: 1599-1608
