On the probability that a novel variant is a disease-causing mutation

Table 4.

Comparison of our case-control results to traditional case-control analyses


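A

Columns "Single site" through "50 kb" give P-values by length of sequence examined (a).

| Group one freq | Group two freq | Single site (b) | 5 kb | 10 kb | 20 kb | 50 kb | Traditional case-control |
|---|---|---|---|---|---|---|---|
| 0.10 | 0.00 | 0.001 | 0.024 | 0.047 | 0.091 | 0.211 | 0.001 |
| 0.10 | 0.02 | 0.021 | 0.347 | 0.569 | 0.808 | 0.979 | 0.017 |
| 0.10 | 0.03 | 0.057 | 0.678 | 0.889 | 0.985 | 1.0 | 0.045 |
| 0.10 | 0.20 | 0.057 | 0.684 | 0.893 | 0.986 | 1.0 | 0.048 |
| 0.10 | 0.25 | 0.008 | 0.161 | 0.294 | 0.499 | 0.813 | 0.005 |
| 0.10 | 0.38 | 3.0 × 10⁻⁶ | 6.2 × 10⁻⁵ | 1.2 × 10⁻⁴ | 2.5 × 10⁻⁴ | 6.2 × 10⁻⁴ | 3.6 × 10⁻⁶ |
| 0.30 | 0.05 | 2.0 × 10⁻⁶ | 4.1 × 10⁻⁵ | 8.3 × 10⁻⁵ | 1.7 × 10⁻⁴ | 4.1 × 10⁻⁴ | 3.3 × 10⁻⁶ |
| 0.30 | 0.10 | 4.4 × 10⁻⁴ | 0.009 | 0.018 | 0.036 | 0.087 | 4.1 × 10⁻⁴ |
| 0.30 | 0.18 | 0.056 | 0.671 | 0.885 | 0.984 | 0.999 | 0.047 |
| 0.30 | 0.44 | 0.047 | 0.612 | 0.843 | 0.971 | 0.999 | 0.040 |
| 0.30 | 0.55 | 4.1 × 10⁻⁴ | 0.008 | 0.017 | 0.033 | 0.081 | 3.5 × 10⁻⁴ |
| 0.30 | 0.62 | 6.2 × 10⁻⁶ | 1.3 × 10⁻⁴ | 2.6 × 10⁻⁴ | 5.2 × 10⁻⁴ | 0.001 | 5.6 × 10⁻⁶ |
| Number of sites expected | | | 21 | 43 | 85 | 214 | |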

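B

Column groups are indexed by sample size (a): 100, 200, and 500.

| Group one freq | Group two freq | 100: Case control method | 100: Our method, single site | 100: Our method, 5 kb | 200: Case control method | 200: Our method, single site | 200: Our method, 5 kb | 500: Case control method | 500: Our method, single site | 500: Our method, 5 kb |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.10 | 0.05 | 0.058 | 0.069 | 0.799 | 0.007 | 0.008 | 0.200 | 2.2 × 10⁻⁵ | 2.2 × 10⁻⁵ | 6.9 × 10⁻⁴ |
| 0.10 | 0.06 | 0.140 | 0.164 | 0.975 | 0.037 | 0.042 | 0.674 | 9.8 × 10⁻⁴ | 0.001 | 0.032 |
| 0.10 | 0.07 | 0.282 | 0.322 | 0.999 | 0.128 | 0.144 | 0.974 | 0.016 | 0.018 | 0.419 |
| 0.10 | 0.13 | 0.347 | 0.387 | 1.0 | 0.184 | 0.202 | 0.994 | 0.035 | 0.038 | 0.688 |
| 0.10 | 0.15 | 0.131 | 0.149 | 0.966 | 0.033 | 0.036 | 0.621 | 0.001 | 7.9 × 10⁻⁴ | 0.024 |
| 0.10 | 0.17 | 0.041 | 0.047 | 0.670 | 0.004 | 0.004 | 0.108 | 4.6 × 10⁻⁶ | 4.9 × 10⁻⁶ | 1.5 × 10⁻⁴ |
| 0.30 | 0.21 | 0.039 | 0.044 | 0.647 | 0.003 | 0.004 | 0.100 | 3.9 × 10⁻⁶ | 4.2 × 10⁻⁶ | 1.3 × 10⁻⁴ |
| 0.30 | 0.24 | 0.177 | 0.194 | 0.987 | 0.056 | 0.061 | 0.799 | 0.003 | 0.003 | 0.080 |
| 0.30 | 0.26 | 0.373 | 0.403 | 1.0 | 0.208 | 0.222 | 0.996 | 0.046 | 0.049 | 0.773 |
| 0.30 | 0.34 | 0.391 | 0.421 | 1.0 | 0.225 | 0.240 | 0.997 | 0.055 | 0.058 | 0.827 |
| 0.30 | 0.37 | 0.138 | 0.152 | 0.968 | 0.036 | 0.039 | 0.647 | 0.001 | 9.8 × 10⁻⁴ | 0.030 |
| 0.30 | 0.39 | 0.058 | 0.065 | 0.782 | 0.007 | 0.008 | 0.198 | 2.3 × 10⁻⁵ | 2.5 × 10⁻⁵ | 7.7 × 10⁻⁴ |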
  • Note that the length-dependent P-values obtained using our method can be approximated by multiplying the traditional case-control P-value by the number of polymorphic sites expected in the region sequenced. The genome-wide average θ, 8.25 × 10⁻⁴, was used.

  • (A) P-values for our method examining 5, 10, 20, and 50 kb, and for the traditional case-control method, at varying case and control allele frequencies, using 50 cases and 50 controls.

  • (B) P-values for the traditional case-control method and for our method examining 5 kb, at varying case and control allele frequencies, using 100, 200, and 500 cases and controls. The numbers of variant sites expected in 5 kb among 100, 200, and 500 cases are 24.2, 27.1, and 30.9, respectively.

  • a (panel A) Using our method

  • b (panel A) P-value obtained using our method for a single site that is known to be polymorphic

  • a (panel B) Sample size refers to the number of individuals in each group (cases and controls)
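The approximation described in the first note can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: it assumes the expected number of polymorphic sites follows Watterson's formula, E[S] = θ · L · Σ(1/i) for i = 1 to 2n − 1, evaluated over the 2n chromosomes of the n cases. The table does not state this formula, although under that assumption the values listed in panels A and B (21, 43, 85, 214 and 24.2, 27.1, 30.9) appear to be recovered. The function names and the cap at 1.0 are hypothetical choices made for this sketch.

```python
THETA = 8.25e-4  # genome-wide average theta per site, as given in the table note


def expected_variant_sites(n_individuals, length_bp, theta=THETA):
    """Expected number of polymorphic sites in a region of length_bp.

    Assumption (not stated in the table): Watterson's formula applied to
    the 2 * n_individuals chromosomes sampled from diploid individuals.
    """
    n_chromosomes = 2 * n_individuals
    harmonic = sum(1.0 / i for i in range(1, n_chromosomes))
    return theta * length_bp * harmonic


def approx_region_p(case_control_p, n_sites):
    """Approximate length-dependent P-value: traditional P times expected sites.

    The cap at 1.0 is an addition for this sketch so the result stays a
    valid probability; the approximation is only meaningful for small P.
    """
    return min(1.0, case_control_p * n_sites)


if __name__ == "__main__":
    # Panel A setting: 50 cases, regions of 5, 10, 20, and 50 kb.
    for length_bp in (5_000, 10_000, 20_000, 50_000):
        sites = expected_variant_sites(50, length_bp)
        print(f"{length_bp} bp: {sites:.1f} expected sites")

    # Panel B setting: 5 kb among 100, 200, and 500 cases.
    for n_cases in (100, 200, 500):
        sites = expected_variant_sites(n_cases, 5_000)
        print(f"{n_cases} cases: {sites:.1f} expected sites")

    # Example: scale a small traditional P-value by the 5 kb site count.
    print(approx_region_p(3.3e-6, expected_variant_sites(50, 5_000)))
```

The product is only a rough guide: for the larger P-values in the table the tabulated 5 kb values are well below the simple product, so the sketch is best read as a check on the small-P rows.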

This Article

  1. Genome Res. 15: 960-966