
Principal components analysis (PCA) based on insertion presence/absence states for 5060 Alu insertion loci in 160 individuals with at least 100,000 read sets, including a single Vietnamese individual and Venter. PCA was performed on the matrix of summed pairwise insertion-state differences between individuals; that is, for each locus, a pair of individuals received a distance score of zero if they shared the same state and one if they differed, and these scores were summed across loci. Individuals are plotted by their scores on the first three principal components. (A) PCA for all 160 individuals; (B) for the 110 individuals assayed using primer AluSPv2; (C) for the 50 individuals assayed using primer AluSPv3. A threshold of three reads was used to assign present and absent states, and every locus included was supported by at least 10 coverage-corrected reads in at least one individual. Individuals are colored by source population, as per the legend. The line dropping from each individual indicates its score on the third principal component.
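
The distance construction described in the caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline, and it rests on several assumptions: the function names (`call_states`, `pairwise_state_differences`, `pca_scores`) are hypothetical; an insertion is called present when its read count is at least the threshold; and the PCA step treats each row of the distance matrix as a feature vector, centers the columns, and takes a singular value decomposition. The caption does not specify how the PCA was computed from the matrix, so that step in particular may differ from the original analysis.

```python
import numpy as np


def call_states(read_counts, threshold=3):
    """Convert read counts (individuals x loci) to 0/1 states.

    Assumption: a locus is 'present' when reads >= threshold.
    """
    return (np.asarray(read_counts) >= threshold).astype(np.int64)


def pairwise_state_differences(states):
    """Sum, over loci, of 0/1 mismatch scores for every pair of individuals.

    `states` is an (n_individuals, n_loci) array of 0/1 calls. For binary
    data the number of differing loci is a(1-b) + (1-a)b summed over loci.
    """
    s = np.asarray(states, dtype=np.int64)
    return s @ (1 - s).T + (1 - s) @ s.T


def pca_scores(matrix, n_components=3):
    """Scores on the leading principal components of `matrix`.

    Assumption: rows are observations; columns are centered before the SVD.
    """
    x = np.asarray(matrix, dtype=float)
    x = x - x.mean(axis=0)
    u, sing, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, :n_components] * sing[:n_components]


# Toy data only: 6 hypothetical individuals, 20 hypothetical loci.
rng = np.random.default_rng(0)
toy_counts = rng.poisson(2, size=(6, 20))
toy_states = call_states(toy_counts)
toy_distances = pairwise_state_differences(toy_states)
toy_scores = pca_scores(toy_distances)  # one row of 3 scores per individual
```

Each row of `toy_scores` gives the coordinates one would plot for an individual; the third column corresponds to the component shown by the drop lines in the figure.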











