Joanna L. Kelley; Jennifer Madeoy; John C. Calhoun; Willie Swanson; Joshua M. Akey

Figure 2.

The correlation between TD_Seq and TD_Gen predicts the performance of a simple outlier approach in ascertained data sets. The correlation, r, between Tajima’s D derived from complete sequence (TD_Seq) and genotype (TD_Gen) data was calculated from the data sets described in Figure 1. N_D denotes the number of chromosomes used for SNP discovery. Discovered SNPs were then genotyped in the sample panel and used to calculate TD_Gen (see Fig. 1 legend). For each value of N_D, there are eight points, which correspond to all combinations of simulation parameters: σ (20 and 200), fraction of positively selected loci (1% and 10%), and threshold used in defining candidate selection genes (1% and 5%). For each value of N_D, the correlation between TD_Seq and TD_Gen differed by <1% across the eight parameter combinations, so the average correlation is shown for presentation purposes. The gray shaded area demarcates the range of PPV_G/PPV_S values across the simulation parameters for each value of N_D.
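The two quantities in the legend above, the correlation r and the ratio PPV_G/PPV_S, can be illustrated with a minimal Python sketch. It is not the authors' code. The function names `pearson_r` and `outlier_ppv` are hypothetical, and the per-locus values are toy numbers invented for illustration rather than taken from the simulations. The sketch also assumes that candidate loci are those in the lower tail of the Tajima's D distribution and that PPV is the fraction of candidates that are truly selected.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def outlier_ppv(stat, is_selected, threshold):
    """PPV when the lowest `threshold` fraction of loci are called candidates.

    Assumption: positive selection shifts Tajima's D downward, so outliers
    are taken from the lower tail of the empirical distribution.
    """
    n_candidates = max(1, int(round(threshold * len(stat))))
    ranked = sorted(range(len(stat)), key=lambda i: stat[i])
    candidates = ranked[:n_candidates]
    true_positives = sum(1 for i in candidates if is_selected[i])
    return true_positives / n_candidates


# Toy per-locus values (illustrative only, not from the paper).
td_seq = [-2.1, -1.8, -0.5, 0.1, 0.3, -0.2, 0.6, 0.9, -0.1, 0.4]
td_gen = [-1.0, 0.2, -1.2, 0.5, 0.7, 0.1, 0.9, 1.2, 0.3, 0.8]
is_selected = [True, True] + [False] * 8

r = pearson_r(td_seq, td_gen)
ppv_s = outlier_ppv(td_seq, is_selected, threshold=0.2)
ppv_g = outlier_ppv(td_gen, is_selected, threshold=0.2)
print(f"r = {r:.3f}, PPV_G/PPV_S = {ppv_g / ppv_s:.2f}")
```

In this toy case, ascertainment moves one selected locus out of the lower tail of TD_Gen, so the outlier approach recovers fewer true positives from genotype data than from sequence data. That is the kind of relationship between r and PPV_G/PPV_S that the figure summarizes.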

Genomic signatures of positive selection in humans and the limits of outlier approaches
