Large-Scale Validation of Single Nucleotide Polymorphisms in Gene Regions

Table 1.

Comparison of SNP Annotations Between Failed and Functional Assays and Between Polymorphic Classes





Functional^c: the Non-Polymorphic, MAF ≤0.05, and MAF >0.05 columns

Variable N^a Failed (N = 21,899) Non-Polymorphic (N = 77,809) MAF^d ≤0.05 (N = 12,169) MAF >0.05 (N = 114,222)
Heterozygosity 26,248 0.19/0.39/0.49 0.01/0.13/0.42 0.10/0.24/0.43 0.32/0.44/0.49
Length 193,940 394/419/551^b 356/401/534 401/436/561 401/447/548
RS Build 193,940 87/92/108^b 86/92/106 87/92/108 88/100/111
NCBI validation 193,941
    No 73% (11,077)^b 86% (55,541) 72% (7,641) 57% (59,340)
    Yes 27% (4,157) 14% (8,683) 28% (2,929) 43% (44,573)
MolType 193,940
    cDNA 13% (1,924)^b 18% (11,754) 9% (948) 5% (5,397)
    Genomic 87% (13,310) 82% (52,470) 91% (9,622) 95% (98,515)
SNP Type 193,932
    Not annotated 34% (5,219)^b 30% (19,462) 32% (3,385) 31% (32,131)
    Intron 30% (4,608) 31% (19,962) 34% (3,614) 38% (39,046)
    Locus region 15% (2,227) 14% (9,301) 15% (1,571) 15% (15,491)
    mRNA UTR 13% (2,002) 15% (9,720) 13% (1,331) 11% (11,718)
    Coding 0% (30) 0% (130) 0% (15) 0% (94)
    Coding nonsynon 4% (609) 5% (3,344) 3% (358) 3% (2,663)
    Coding synon 3% (425) 3% (1,792) 2% (243) 2% (2,287)
    Exception 1% (110) 1% (496) 1% (53) 0% (465)
    Splice site 0% (4) 0% (13) 0% (0) 0% (13)
Submitter count 193,941
    1 48% (7,382) 62% (39,822) 49% (5,203) 38% (39,873)
    2 33% (5,056) 28% (17,869) 35% (3,694) 38% (39,572)
    3 14% (2,148) 9% (5,527) 12% (1,321) 18% (18,321)
    >3 4% (648) 1% (1,006) 4% (352) 6% (6,147)
RS Mapping 193,941
    0 8% (1,178)^b 6% (3,885) 5% (544) 2% (2,315)
    1 90% (13,746) 92% (59,280) 93% (9,819) 95% (99,185)
    >1 2% (310) 2% (1,059) 2% (207) 2% (2,413)
eXTEND mapping 226,059
    0 29% (6,281)^b 9% (7,006) 7% (860) 7% (7,455)
    1 57% (12,532) 73% (56,566) 72% (8,782) 81% (92,947)
    >1 14% (3,085) 18% (14,208) 21% (2,526) 12% (13,811)
TSC MAF range 7,997
    [0, 0.025] 23% (120) 74% (1,325) 29% (129) 5% (239)
    [0.025, 0.05] 6% (31) 7% (133) 22% (99) 3% (163)
    [0.05, 0.075] 3% (16) 3% (59) 10% (43) 3% (141)
    [0.075, 0.1] 5% (28) 2% (33) 10% (46) 5% (282)
    >0.1 63% (331) 13% (234) 29% (130) 84% (4,415)
  • The first eight annotation categories were drawn from NCBI dbSNP for the 193,940 SNPs that overlap with SNPs in this study.

    Quantitative variables are summarized as 1st quartile/median/3rd quartile. Categorical variables are summarized by column percent (count).

  • a N: Number of valid nonmissing observations for each variable.

  • b Comparisons between failed and functional assays significant at α = 0.05. All significant results have P-values < 10^-6.

  • c All comparisons among functional assay groups are statistically significant with P-values < 10^-6.

  • d MAF: Minor allele frequency.
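
The summary conventions described in the notes above (quartiles for quantitative variables, column percent with count for categorical ones) can be illustrated with a minimal Python sketch. The function names are hypothetical, the quartile interpolation method is an assumption (the table does not say which was used), and the five-value sample is made up for illustration. Only the NCBI validation counts for failed assays are taken from the table.

```python
from statistics import quantiles


def column_percent(counts):
    """Format each category as 'percent (count)', where percent is the
    category's share of the column total, rounded to a whole number."""
    total = sum(counts.values())
    return {k: f"{round(100 * v / total)}% ({v:,})" for k, v in counts.items()}


def quartile_summary(values):
    """Format a quantitative variable as '1st quartile/median/3rd quartile'.
    The 'inclusive' interpolation method is an assumption of this sketch."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return f"{q1:g}/{med:g}/{q3:g}"


# NCBI validation counts for failed assays, copied from the table.
failed_validation = {"No": 11077, "Yes": 4157}
print(column_percent(failed_validation))
# {'No': '73% (11,077)', 'Yes': '27% (4,157)'}

# Made-up values, used only to show the output format.
print(quartile_summary([1, 2, 3, 4, 5]))
# 2/3/4
```

The first call reproduces the "73% (11,077)" and "27% (4,157)" cells of the Failed column, which confirms that percentages are taken within each column rather than across a row.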

This Article

  1. Genome Res. 14: 1664-1668
