Application of Atlas-SNP2 to the 454 Watson genomic sequence data

We ran Atlas-SNP2 with both sets of prior values to assess the variant allele probabilities. Consistent with the tuning results on the S. aureus data set, the “set 1” priors gave reasonable resolution. In the run using the “set 1” priors, approximately 2.66 million loci (boxes highlighted in dark blue and blue) were called with high confidence when the variant read coverage at each locus was greater than two. The quality of these discoveries was indicated by the high confirmation rate against the dbSNP database: 92.6% of the loci were found in dbSNP Build 129 (using only the high-quality entries with the quality flag set to “1”). Compared with the Affymetrix 500K microarray genotype results, we detected 72.8% of the Affymetrix sites with variant alleles overall (heterozygotes = 50% and homozygotes = 92%), and the genotype concordance was as high as 99.2%. When we also included the loci in gray boxes, which had a variant read coverage of at most two per site, the total rose to around 3.4 million loci, and the overall detection sensitivity for loci on the Affymetrix 500K platform increased to 81% (heterozygotes = 71.1% and homozygotes = 94.2%), which was close to the expected numbers (Wheeler et al. 2008), whereas the dbSNP confirmation rate decreased to 83.3%. This illustrated that Atlas-SNP2 could achieve high accuracy, while depth of coverage was an important factor in detection sensitivity.
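The evaluation metrics reported above (detection sensitivity by genotype class, genotype concordance, and database confirmation rate) can be illustrated with a minimal Python sketch. This is not the Atlas-SNP2 code or the authors' evaluation pipeline. The function names, the dictionary-based inputs, the toy site identifiers, and the choice to compute concordance only over detected array sites are all assumptions made for illustration.

```python
from collections import Counter


def ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def evaluate_against_array(array_genotypes, called_genotypes):
    """Compare SNP calls with array genotypes (hypothetical helper).

    array_genotypes:  dict mapping site -> "het" or "hom" for array sites
                      that carry a variant allele.
    called_genotypes: dict mapping site -> "het" or "hom" for sites called
                      by the SNP caller.
    Assumption: concordance is computed over detected array sites only.
    """
    total = Counter()      # array variant sites per genotype class
    detected = Counter()   # array variant sites that were also called
    concordant = 0         # detected sites where both genotypes agree

    for site, array_gt in array_genotypes.items():
        total[array_gt] += 1
        call = called_genotypes.get(site)
        if call is None:
            continue  # site missed by the caller
        detected[array_gt] += 1
        if call == array_gt:
            concordant += 1

    n_detected = sum(detected.values())
    return {
        "sensitivity_overall": ratio(n_detected, sum(total.values())),
        "sensitivity_het": ratio(detected["het"], total["het"]),
        "sensitivity_hom": ratio(detected["hom"], total["hom"]),
        "concordance": ratio(concordant, n_detected),
    }


def confirmation_rate(called_sites, database_sites):
    """Fraction of called sites that are present in a reference database."""
    called = set(called_sites)
    return ratio(len(called & set(database_sites)), len(called))


# Toy data, invented purely to exercise the functions.
array = {"s1": "het", "s2": "het", "s3": "hom", "s4": "hom"}
calls = {"s1": "het", "s3": "hom", "s4": "het", "s9": "hom"}
known = {"s1", "s3", "s9"}

metrics = evaluate_against_array(array, calls)
rate = confirmation_rate(calls, known)
```

On the toy data, three of the four array sites are detected, so the overall sensitivity is 0.75 (0.5 for heterozygotes, 1.0 for homozygotes). Two of the three detected sites agree in genotype, and three of the four called sites appear in the toy database. Separating heterozygous from homozygous sites matters because, as the reported numbers suggest, heterozygotes are the class most affected by low read coverage.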