
Comparison of sample selection methods. (A) Cumulative fraction of SVs covered for a given number of samples chosen by the ILP, greedy, topN, and random approaches. SVCollector's greedy approach approximates the optimal ILP solution and outperforms the topN and random approaches at recovering unique SVs. (B) Number of SVs covered using three sample selection methods. Red: the median number of SVs covered over 100 trials, each drawing a random sample of 100 individuals; the red ribbon spans the minimum and maximum number of SVs covered over those 100 trials. Green: the number of SVs covered by the 100 best-ranked (greedy) individuals selected from the SNV data. Blue: the number of SVs covered by the 100 best-ranked (greedy) individuals selected from the SV data. Data are from the 1000 Genomes Project (The 1000 Genomes Project Consortium 2015).
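The difference between the greedy and topN strategies compared in panel (A) can be illustrated with a minimal sketch. This is not SVCollector's implementation: the function names, the `TOY` data set, and the representation of each sample as a plain set of SV identifiers are assumptions made for illustration only. Greedy selection repeatedly picks the sample that adds the most SVs not yet covered, whereas topN simply takes the samples carrying the most SVs, regardless of overlap.

```python
import random

# Hypothetical toy input: sample name -> set of SV identifiers it carries.
TOY = {
    "s1": {1, 2, 3, 4},
    "s2": {1, 2, 3},
    "s3": {5, 6},
    "s4": {7},
}

def greedy_select(sample_svs, n):
    """Pick up to n samples, each time taking the one that adds the most uncovered SVs."""
    covered = set()
    chosen = []
    remaining = dict(sample_svs)
    for _ in range(min(n, len(remaining))):
        # Iterate in sorted order so ties are broken deterministically by name.
        best = max(sorted(remaining), key=lambda s: len(remaining[s] - covered))
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered

def topn_select(sample_svs, n):
    """Pick the n samples with the most SVs, ignoring overlap between samples."""
    ranked = sorted(sample_svs, key=lambda s: (-len(sample_svs[s]), s))
    chosen = ranked[:n]
    covered = set()
    for s in chosen:
        covered |= sample_svs[s]
    return chosen, covered

def random_select(sample_svs, n, seed=0):
    """Pick n samples uniformly at random (seeded for reproducibility)."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(sample_svs), min(n, len(sample_svs)))
    covered = set()
    for s in chosen:
        covered |= sample_svs[s]
    return chosen, covered

if __name__ == "__main__":
    for name, fn in [("greedy", greedy_select), ("topN", topn_select), ("random", random_select)]:
        chosen, covered = fn(TOY, 2)
        print(name, chosen, len(covered))
```

On this toy input, topN picks two heavily overlapping samples, while greedy picks a second sample that contributes only new SVs, which is the behavior the cumulative curves in panel (A) summarize at scale.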











