Detection of common single nucleotide polymorphisms synthesizing quantitative trait association of rarer causal variants

Fumihiko Takeuchi; Shotai Kobayashi; Toshio Ogihara; Akihiro Fujioka; Norihiro Kato

doi:10.1101/gr.115832.110

Detection of common single nucleotide polymorphisms synthesizing quantitative trait association of rarer causal variants

¹Department of Gene Diagnostics and Therapeutics, Research Institute, National Center for Global Health and Medicine, Tokyo 162-8655, Japan;
²Pathogen Genomics Center, National Institute of Infectious Diseases, Tokyo 162-8640, Japan;
³Shimane University Hospital, Izumo 693-8501, Japan;
⁴Department of Geriatric Medicine and Nephrology, Osaka University Graduate School of Medicine, Suita 565-0871, Japan;
⁵Amagasaki Health Medical Foundation, Amagasaki 661-0012, Japan

Abstract

Genome-wide association (GWA) studies have identified hundreds of common (minor allele frequency ≥5%) single nucleotide polymorphisms (SNPs) associated with phenotype traits or diseases, yet causal variants accounting for the association signals have rarely been determined. A question then raised is whether a GWA signal represents an “indirect association” as a proxy of a strongly correlated causal variant with similar frequency, or a “synthetic association” of one or more rarer causal variants in linkage disequilibrium (D′ ≈ 1, but r² not large); answering the question generally requires extensive resequencing and association analysis. Instead, we propose to test statistically whether a quantitative trait (QT) association of an SNP represents a synthetic association or not by inspecting the QT distribution at each genotype, not requiring the causal variant(s) to be known. We devised two test statistics and assessed the power by mathematical analysis and simulation. Testing the heterogeneity of variance was powerful when low-frequency causal alleles are linked mostly to one SNP allele, while testing the skewness outperformed when the causal alleles are linked evenly to either of the SNP alleles. By testing a statistic combining these two in 5000 individuals, we could detect synthetic association of a GWA signal when causal alleles sum up to 3% in frequency. Such signal only partially explains the heritability contributed by the whole locus. The proposed test is useful for designing fine mapping after studying association of common SNPs exhaustively; we can prioritize which GWA signal and which individuals to be resequenced, and identify the causal variants efficiently.

Genome-wide association (GWA) studies have identified hundreds of common (minor allele frequency [MAF] ≥5%) single nucleotide polymorphisms (SNPs) associated with a few hundred traits or diseases, yet the associated SNPs and their proxies mostly do not show evident function related to the target trait, and eventual identification of causal variants accounting for GWA signals has been challenging (Wellcome Trust Case Control Consortium 2007; McCarthy et al. 2008). A question that is then raised is whether a common SNP identified in a GWA study represents an “indirect association” as a proxy of a strongly correlated causal variant with similar frequency, or a “synthetic association” of one or more rarer causal variants that are in linkage disequilibrium (LD) (D′ ≈ 1, but r² not large) with the common SNP (Cirulli and Goldstein 2010; Dickson et al. 2010).

Synthetic association accounted for GWA signals in several studies. In a GWA study for dose of anticoagulant drug warfarin, the strongest association signal in the CYP2C9 gene was observed at an SNP rs4917639, whose minor allele (frequency 18%) is a composite of two functional alleles CYP2C9*2 (rs1799853, frequency 11%) and CYP2C9*3 (rs1057910, frequency 7%) (Wadelius et al. 2007; Takeuchi et al. 2009). In a GWA study for anemia in patients treated for chronic hepatitis, the strongest signal was observed at an SNP rs6051702 in C20orf194, whose minor allele (frequency 19%) is almost exactly a composite of two causal variants in the neighboring ITPA gene (frequency 8% and 12%) (Fellay et al. 2010). When there are many rare causal variants, but no common one, as in the HBB gene for sickle cell anemia or the GJB2/GJB6 locus for hearing loss, the association of common SNPs detected in GWA studies were attributable to the rare variants (Dickson et al. 2010). Using simulations, Dickson and colleagues showed that synthetic association is likely to occur when there are multiple rare variants in a locus (Dickson et al. 2010).

In general, identification of the causal variants accounting for a synthetic association requires extensive resequencing and association analysis. Instead, here we propose to test statistically whether a quantitative trait (QT) association of an SNP represents a synthetic association or not by inspecting only the QT distribution at each genotype of the SNP, without a priori knowledge about rarer causal variants. We focus on two statistics of the QT distribution: the heterogeneity of variance (i.e., heteroscedasticity) among SNP genotypes and the skewness. The statistical tests were examined in real data of the apolipoprotein E (APOE) gene, and in simulated data for representative models of synthetic association. Moreover, we formulated a general mathematical model of synthetic association, and assessed the test statistics theoretically. The two statistics were suitable for complementary scenarios: Heteroscedasticity was more sensitive than skewness when low-frequency (<5%) causal alleles were linked mostly to one SNP allele, while skewness outperformed when the causal alleles were linked in balance to either of the two SNP alleles. We thus devised a test combining the two statistics, which was powerful for any of the assumed models.

Previous Section Next Section

Results

Test of heteroscedasticity

We first show a schematic example of synthetic association and illustrate how QT variance can differ among individuals classified by marker SNP genotypes. We assume a common marker SNP with alleles A and a, and a single causal variant with alleles B₁ and b₁. The allele B₁ (5% in frequency) is always linked to allele A (20% in frequency); thus, existing haplotype classes are AB₁, Ab₁, and ab₁. We assume the QT is normally distributed with the unit variance and the mean equal to 2, 1, and 0 within a subgroup of individuals having genotype B₁/B₁, B₁/b₁, and b₁/b₁, respectively. The QT distribution in the whole population becomes a mixture of the normal distributions combined according to the frequency of genotypes B₁/B₁, B₁/b₁, and b₁/b₁ (Fig. 1A). Individuals with A/A genotype at the marker SNP are enriched with the genotypes of B₁/B₁ and B₁/b₁ at the causal variant, thus their QT distribution widens (Fig. 1B). On the contrary, individuals with a/a genotype at the marker all have b₁/b₁ genotype at the causal variant, and their QT variance equals one (Fig. 1D). The QT variance is the largest in the subgroup with A/A genotype, which is linked more frequently to the low-frequency causal allele B₁, and the smallest in the subgroup with a/a genotype. Indeed, QT variance among individuals of a specific marker genotype enlarges proportionally to two factors: the variance of the causal genotype within the subgroup, and the squared effect-size of the causal allele (equation M2). The low-frequency causal variant causes the synthetic association of the marker SNP, and the heteroscedasticity of QT distribution among the marker genotypes.

View larger version:

Download as PowerPoint Slide

Figure 1.

Probability distribution of the QT value within subgroups classified by marker SNP genotypes. (A) In the whole population, the total QT distribution (gray curve) comprises a mixture of normal distributions (black curves) with unit variance and the mean 0, 1, or 2, which correspond to genotypes b₁/b₁, B₁/b₁, and B₁/B₁ at the causal variant. As genotype B₁/B₁ is rare (0.25%), the corresponding curve appears flat. (B) QT distribution among individuals with A/A genotype at the marker. As B₁/B₁ and B₁/b₁ genotypes are enriched in this subgroup due to LD, the variance is enlarged, as noticeable from the lower peak and wider distribution of the gray curve. (C) Individuals with the A/a genotype have either genotypes b₁/b₁ or B₁/b₁, and the QT variance is moderately enlarged. (D) All individuals with a/a genotype at the marker have b₁/b₁ genotype at the causal variant. The QT variance is 1.10 in A, 1.38 in B, 1.19 in C, and 1 in D.

We could exemplify the detection of synthetic association using heteroscedasticity in the APOE gene, which is known to associate with LDL cholesterol (LDL-C) level through three classical isoforms coded by two functional (or causal) variants—rs7412 (Arg158Cys) and rs429358 (Cys112Arg). As compared with E3 (the most common isoform), E2 (coded by rs7412) and E4 (coded by rs429358) decreased and increased the LDL-C level, respectively (Weisgraber et al. 1981; Weisgraber 1994; Bennet et al. 2007). The two variants had MAF <10% (in Europeans and East Asians) and were not included in SNP chips of GWA scan (except for the recent ones containing >1 million SNPs). In a GWA study for lipids in 1210 Japanese (F Takeuchi, et al., in prep.), we initially found four SNPs near APOE to attain locus-wise significant P-values for LDL-C association, although any of these were not significant after adjustment for the two functional variants. When only the chip SNPs were analyzed, rs405509 and rs377702 showed statistically independent signals of association (Supplemental Fig. 1). In a larger panel of 4840 individuals, the association signals remained at the two chip SNPs, and heteroscedasticity was significant for rs405509 (P = 0.019) (Table 1). Indeed, the causal minor alleles of rs7412 (T) and rs429358 (C) were linked to alternate alleles of rs405509 (C and A, respectively), demonstrating synthetic association (Fig. 2). The two causal variants could simultaneously enlarge the QT variance at all three genotypes of rs405509, and consequently, diminish heteroscedasticity (equation M5). However, in this case, as the effect-size of rs7412 was much larger than that of rs429358, heteroscedasticity remained detectable; rs7412 enlarged the QT variance at C/C genotype of rs405509 to 1.182, whereas rs429358 kept the QT variance at A/A genotype at 0.978. On the other hand, the heteroscedasticity of rs377702 did not reach statistical significance due to its recombination with rs7412 (D′ = 0.34). Thus, even if we identified the association signals at rs405509 and rs377702 via the GWA scan, by detecting heteroscedasticity we could notice the presence of synthetic association and the necessity to search for variants not on the chip.

View this table:

Table 1.

Testing heteroscedasticity of SNPs in the APOE locus associated with LDL-C

View larger version:

Download as PowerPoint Slide

Figure 2.

Haplotype classes (A), their phylogeny (B) for the marker SNP rs405509 showing synthetic association of functional variants rs7412 and rs429358 in the APOE locus. LD coefficients between the SNPs associated with LDL-C (C). Haplotype frequencies were calculated using the PLINK software (Purcell et al. 2007).

We next estimated the power to detect synthetic association at an SNP that could be identified in a GWA study. We assumed that the marker SNP has MAF ≥5%, and that the proportion of QT variance explained by the marker is R²_mrk = 0.00592, a borderline level to attain genome-wide significance (see Supplemental Notes). Figure 3 illustrates the statistical power for detecting heteroscedasticity in 5000 individuals. We examined four representative models of synthetic association by simulation. Under Model 1, there are l causal variants with alleles B₁ and b₁, B₂ and b₂, up to B_l and b_l, and the low-frequency causal alleles B_i have a uniform effect (e.g., increase QT) and are all linked to marker allele A. The QT variance enlarges for individuals with A/A genotype at the marker since they carry various numbers of the causal alleles, whereas individuals with a/a genotype at the marker carry none. Heteroscedasticity of the marker was detectable (power >0.8) in the region marked with an asterisk: For example, when the A allele frequency, p_A ≥ 45%, or alternatively when p_A = 25% and the cumulative frequency of causal alleles is <3%. For a fixed value of p_A, the power for detecting heteroscedasticity increases as the cumulative frequency of causal alleles decreases. When p_A becomes small, the detectable range narrows; the highest cumulative frequency in the detectable range changes proportionally to as estimated in equation M12.

View larger version:

Download as PowerPoint Slide

Figure 3.

Power for detecting synthetic association by testing heteroscedasticity. The power was computed from simulation under four representative genetic models of synthetic association (see Methods), assuming the strength of marker association (R²_mrk) of 0.00592. Horizontal and vertical axes represent the frequency of the marker allele A, and the cumulative frequency of causal alleles B_i (linked to allele A), respectively. The asterisk indicates the region where synthetic association is detectable with power >0.8. The black region of the parameter space should be neglected, as it does not include causal variants accounting for the marker association.

We next examine Models 2–4, where both of the marker alleles are loaded with low-frequency causal alleles. In addition to l causal variants with alleles B₁ and b₁, B₂ and b₂, up to B_l and b_l, there are m other causal variants with alleles C₁ and c₁, C₂ and c₂, up to C_m and c_m, and we designate the low-frequency alleles B_i and c_j as causal. The two groups of causal alleles, B_i and c_j, affect the QT in opposing directions and are linked to alternate alleles A and a of the marker, respectively, and thus synthetically generate the marker association. The QT variance at marker genotype A/A enlarges due to the causal alleles B_i, and the variance at marker genotype a/a enlarges due to the causal alleles c_j (equation M3). Indeed, the variances for all marker genotypes increase and become less heterogeneous than under Model 1. Under Model 2, there is exact balance in effect-size and cumulative frequency between the two groups of causal alleles. The heteroscedasticity disappears if p_A = 50% (equation M5), and became undetectably weak around the frequency (Fig. 3). The heteroscedasticity was detectable when p_A is close to 5% or 95%: For example, when p_A = 15% or 85% and the cumulative frequency of the causal alleles B_i, which equals the cumulative frequency of c_j, is <1%. Under Models 3 and 4, where the causal alleles B_i and c_j are not balanced, heteroscedasticity still disappeared, but around a different marker allele frequency. Under Model 3, the effect-size of the causal variants is uniform, yet the cumulative frequency of alleles c_j is half that of alleles B_i, and under Model 4, the cumulative frequencies are identical, yet the effect-size of alleles C_j is half that of alleles B_i. Heteroscedasticity was undetectable around p_A = 65% and 80% under Models 3 and 4, respectively. At p_A = 25% heteroscedasticity was detectable when the cumulative frequency of B_i alleles was <2%.

Test of skewness

As the test of heteroscedasticity could not detect synthetic association at a certain marker allele frequency around p_A = 50%, when both alleles of the marker were loaded with low-frequency causal alleles (Fig. 3, Models 2–4) we introduced the test of skewness to cope with such a case. We observed that synthetic association skews the QT distribution at the marker genotypes A/A and a/a oppositely (equation M7): QT distribution among individuals with marker genotype A/A is skewed toward the effect direction of causal alleles B_i, and the QT distribution at genotype a/a is skewed toward the opposite direction, which is the effect direction of causal alleles c_j. Thus, we added the skewness test statistics for the two genotypes, taking the direction into account (equation M8). Accordingly, under Model 2, the test of skewness could detect synthetic association around p_A = 50% (Fig. 4). The detectable range with regard to the cumulative frequency of causal alleles B_i is wide (extends up to 2%) when p_A is around 50% and is narrow when near 5% or 95%; as estimated in equation M14, the maximum detectable cumulative frequency is proportional to . On the other hand, the skewness test was less powerful than the heteroscedasticity test when all causal alleles are linked to one marker allele (Model 1 in Fig. 4 vs. Fig. 3).

View larger version:

Download as PowerPoint Slide

Figure 4.

Power for detecting synthetic association by testing skewness. The power was computed from simulation under four representative genetic models, assuming the strength of marker association (R²_mrk) of 0.00592. The format of the figure is the same as Figure 3.

Combined test

Between the two tests to detect synthetic association, the test of heteroscedasticity was more powerful when one marker allele was loaded with the causal alleles (Model 1), and the test of skewness was more powerful when both of the marker alleles were loaded with a balanced amount of causal alleles (Model 2). By combining the two tests (equation M10), we devised the third test that was powerful under all of the models (Fig. 5). The detectable range for the cumulative frequency of B_i alleles exceeds 1% when the causal variants are exactly balanced (Model 2), and is up to 2% otherwise. Overall, we could detect synthetic association if the cumulative frequency of all causal alleles, B_i and c_j altogether, is <3%.

View larger version:

Download as PowerPoint Slide

Figure 5.

Power for detecting synthetic association by the combined test of heteroscedasticity and skewness. The power was computed from simulation under four representative genetic models, assuming the strength of marker association (R²_mrk) of 0.00592. The format of the figure is the same as Figure 3.

The power to detect synthetic association is influenced by the strength of marker SNP association and sample size. So far, we studied association at a borderline level of genome-wide significance, which is much weaker than some reported SNPs, for example, of lipid traits (Chasman et al. 2009). When the strength of association is doubled to R²_mrk = 0.0118, synthetic association could be detected if the cumulative frequency of all causal alleles is <6%, in a wider range (Supplemental Fig. 2). When the sample size is halved to 2500 individuals, the detectable region narrowed (Supplemental Fig. 3), because the χ² statistics of the tests are proportional to the sample size (equations M5 and M9).

Previous Section Next Section

Discussion

As seen through mathematical analysis and simulation, we could detect an SNP representing synthetic association of rarer causal variant(s) by testing heteroscedasticity and skewness. The test only requires the genotype–phenotype data obtained in association studies, and the causal variants can be unknown. Whereas previous studies of synthetic association were based on empirical results and simulation (Dickson et al. 2010), we introduced a general mathematical formulation (see Methods) and estimated the variance and skewness of the marker SNP. We also performed computer simulations under representative models of synthetic association and obtained concordant results. The test of heteroscedasticity outperformed the test of skewness when low-frequency causal alleles were linked mostly to one SNP allele, while the test of skewness was better when the causal alleles were linked in balance to either of the two SNP alleles. The test combining the two could detect synthetic association if the cumulative frequency of causal alleles is <3% when tested in 5000 individuals for a marker SNP associated with QT at a borderline level of genome-wide significance (Fig. 5).

Genetic or environmental factors not correlated or interacting with the tested marker SNP do not skew the proposed test statistics. Thus, even when there is allelic heterogeneity, the variants not in LD with the marker SNP have no effects on the test. Although we modeled the causal variants to have an additive effect on QT, the mode of inheritance does not change the results, because homozygotes for a low-frequency allele are rare and negligible. In the power assessment by simulation, we modeled the causal variants to be in complete LD (D′ = 1) with the marker. When LD decays, the heteroscedasticity or skewness at the marker becomes weaker and less detectable. However, since the marker is associated with QT, causal variant(s) of the same directional effect should be loaded mostly to one allele of the marker, thus the decay of LD would be limited.

There are a few limitations in using heteroscedasticity and skewness to detect synthetic association. False positives arise if a causal variant itself shows heteroscedasticity. This can result from a strong gene–environment interaction. Indeed, the test of heteroscedasticity has been used for detecting such interaction (Pare et al. 2010). Another possible source of false positives is population stratification; if two subpopulations have a different mean QT at a specific causal genotype, the QT variance enlarges when the subpopulations are combined. Although a realistic level of population stratification is unlikely problematic (Supplemental Table 1), we recommend applying the test to each cohort separately.

Although the tests we proposed are limited to QTs, the idea of stratifying individuals by marker genotype leads to another test of synthetic association, which is applicable to quantitative as well as dichotomous traits. Here, we compare the association of neighboring common SNPs among the strata. If rare causal alleles are loaded onto the marker allele A, but not on the allele a, neighboring SNPs in LD with the causal variants will show association in the individuals with A/A or A/a genotypes, but not in the individuals with a/a genotype. As a result, the association P-value of neighboring SNPs would be distributed differently among the strata. Contrarily, if the marker SNP represents indirect association, no neighboring SNPs will be associated in any of the strata.

The proposed test can be helpful in understanding the “missing heritability” that GWA studies failed to account for (Maher 2008). If synthetic association is detected at a GWA signal SNP, the heritability of the SNP is likely an underestimate of the heritability of the whole locus (Dickson et al. 2010). Actually, in the APOE example, the explained variance was much smaller for the leading SNPs on a GWA chip (R² = 0.006 and 0.005) than for the two causal variants (R² = 0.040 and 0.008; see Table 1). In other words, our method can identify loci that contribute to a trait more than what we would expect from GWA study results.

It is unknown what proportion of GWA signals are due to synthetic association rather than indirect association. The situation is likely to differ by the function and molecular evolution of the genes. For example, several common causal variants are known for pharmacological traits that have not been under evolutionary selection (Cirulli and Goldstein 2010). On the contrary, only rare causal variants are found in genes for renal salt reabsorption, which have been under purifying selection; homozygotes of mutant alleles are susceptible to severe renal salt wasting and hypotension, although heterozygotes confer health benefits from lower blood pressure in postreproductive ages (Ji et al. 2008). Although the proposed test would help detecting synthetic association of a marker SNP, the discovery of the causal variants can require resequencing of a large number of individuals if the causal variants are rare. The aim of the proposed test is to assess potential synthetic association at a particular locus and then use the information to help design future resequencing studies.

Testing synthetic association is useful for designing fine mapping after exhaustively interrogating association of common SNPs at a locus. Exhaustive analysis of common SNPs (MAF ≥5%) is becoming accomplishable by genotyping with SNP chips of the GWA test and by imputing the unassayed SNPs using the HapMap or the 1000 genomes project data. The next focus is to explore rarer variants by resequencing and to identify the causal variants. Since resequencing is still expensive, we need to prioritize which GWA loci and which individuals are to be resequenced. Such information is obtainable by testing synthetic association of the common leading SNP(s), showing the strongest association in a locus. If synthetic association is detected for the leading SNP(s), rarer variants need to be examined in order to pinpoint the variants causing synthetic association. Moreover, if heteroscedasticity is detected, we can discover the causal variants efficiently by resequencing individuals having the homozygote genotype with larger QT variance, and especially those having extreme QT values, who are enriched with the rare causal alleles. Alternatively, if the test for synthetic association is not significant (in >5000 samples), the leading SNP(s) or their proxies are likely causal. Whereas conventional fine-mapping techniques aim to find the causal SNP(s) or haplotype(s) from a set of SNPs tested for association (McCarthy et al. 2008), our method is unique in suggesting that causal variants can be discovered if the study is extended to rarer variants.

Numerous SNP associations have been identified in recent GWA studies, yet our understanding of causal variants is very limited; it is not easy to prove functional changes, let alone the causality with the associated phenotype (Cirulli and Goldstein 2010). We proposed a simple statistical test, which helps to detect whether a common SNP associated with a QT is a noncausal marker in LD with rarer causal variant(s). The proposed test statistic can serve as a milepost in fine mapping and help understand the genetic structure of complex traits.

Previous Section Next Section

Methods

Modeling the probability distribution of genotype and QT

We model a QT-associated marker SNP with alleles (referred to as the marker alleles), A and a. We assume the low-frequency allele of each causal variant—which we call the causal allele—is linked exclusively to one of the marker alleles; l causal variants each have alleles B₁ and b₁, B₂ and b₂, up to B_l and b_l, where the causal allele B_i is linked to marker allele A; m other causal variants each have alleles C₁ and c₁, C₂ and c₂, up to C_m and c_m, where the causal allele c_j is linked to marker allele a. We impose one assumption for mathematical convenience. Among the haplotype classes sharing a specific marker allele (A or a), we assume the probability distribution of causal variant alleles are independent among the variants; for example, among the haplotype classes carrying the A allele, the frequency (conditional on the marker allele being A) of the haplotype class carrying both B₁ and B₂ should equal the product of the frequencies of classes carrying B₁ and B₂, which is very small (e.g., 0.01%, if the frequency is 1% both for B₁ and B₂). The assumed frequency would differ only marginally from the actual frequency: The haplotype class carrying both B₁ and B₂ does not exist initially if the two causal variants arose separately in the phylogeny, and increases to the assumed frequency by recombination. This assumption enables us to rewrite the test statistics into simple forms (see Supplemental Notes). As we assume Hardy-Weinberg equilibrium, the frequencies of multivariant genotypes can be calculated from those of the haplotype classes.

Each individual's dose of the capital letter alleles, A, B_i, or C_j, is represented by random variables, x, y_i, or z_j, respectively, and the QT value is represented by a random variable q. The allele B_i (or C_j) is modeled to affect the QT by d_i (or e_j, respectively); specifically, the QT value q of an individual with multivariant genotype (y₁,…,y_l,z₁,…,z_m) has the probability density of a normal distribution with the unit variance and the mean of . Thus, the probability density function of the genotype and QT level (x,y₁,…,y_l,z₁,…,z_m,q) becomes Formula where p(x,y₁,…,y_l,z₁,…,z_m) represents the frequency of the multivariant genotype (x,y₁,…,y_l,z₁,…,z_m). The expectation of the QT value is Formula where E[·] represents the expectation (see equation S5 in the Supplemental Notes for derivation). Similarly, when conditioned on a specific genotype x₀ at the marker, Formula where E[·|x=x₀] represents the conditional expectation.

Variance

The variance of QT among individuals having a specific marker genotype x₀ becomes Formula where Var[·|x=x₀] represents the conditional variance (see Supplemental equation S6 for derivation). Equation M2 indicates that the inflation (above one) of QT variance decomposes into a sum of terms, each corresponding to one causal variant and determined by the square of the effect-size, d_i or e_j, and the conditional variance of genotype. Designating the frequencies of alleles A, B_i and c_j as p_A, p_{B_i}, and , respectively, the genotype variance becomes Formula where the approximation is under . By substituting the genotype variance into equation M2, we obtain Formula Thus, the QT variance at marker genotypes x = 2 (A/A) and x = 0 (a/a) is determined by the contribution of alleles B_i and c_j, respectively, and the average of the two variances equals the variance at genotype x = 1 (A/a). The average of three variances weighted by marker genotype frequency becomes Formula We test the heterogeneity of QT variance (i.e., heteroscedasticity) among marker genotypes using Bartlett's test (Bartlett 1937). The χ² statistic (two degrees of freedom) is expected to become Formula when the sample size N is large (thus, the constant for the Bartlett's test statistic equals one); for the second equality, equations M3 and M4 were substituted, and ln(1 + x) was approximated as x−x²/2. In the curly brackets of the final formula in equation M5, the contribution by causal alleles B_i (linked to marker allele A) is subtracted by the contribution of alleles c_j (linked to allele a). Thus, the statistic for heteroscedasticity is maximized when all low-frequency causal alleles are linked to the same marker allele, and diminishes when they are linked evenly to both of the marker alleles.

Skewness

The third central moment of the QT distribution among the individuals having a specific marker genotype x₀ is Formula which decomposes into a sum of terms, each contributed by one causal variant (see Supplemental equation S7 for derivation); μ₃[·|x=x₀] represents the conditional third central moment. The third central moment of the genotypes y_i and z_j conditional on a marker genotype becomes Formula where the approximation is under . By substituting the genotype moment into equation M6, we obtain Formula The z statistic (standard normal distribution) for testing skewness (Stuart et al. 1999) of QT among the individuals with genotype x₀ becomes Formula where is the number of the individuals, and the variance in the denominator is approximated as one for this statistic. By substituting equation M7, and converting into N multiplied by the marker genotype frequency, Formula As the QT distributions at the two homozygote marker genotypes should be skewed to opposite directions under synthetic association, z_x ₌ ₂ and z_x ₌ ₀ would have opposite signs. We take their difference and obtain a χ² statistic with one degree of freedom, Formula which we adopt as the test statistic for skewness. The test statistic is expected to become Formula reflecting the contribution by causal alleles B_i and c_j.

Statistical tests of synthetic association and type I error rate

We combine two types of statistics, heteroscedasticity and skewness, to devise the combined test. Using Fisher's method, the P-values for testing heteroscedasticity and skewness, p_{heteroscedasticity} and p_skewness, respectively, are combined as a χ² statistic (four degrees of freedom), Formula The significance level was set to 0.05 for all tests.

Before testing, we applied rank-based inverse normal transformation (Blom 1958) to the whole QT distribution. The transformation avoids detecting spurious signals when the QT distribution is skewed as a whole. The transformed QT value q_i of the i-th individual is Formula where r_i is the rank of the individual, N is the total number of individuals, c = 3/8, and Φ⁻¹ is the standard normal quantile. We strongly recommend applying the transformation, although it can cause false positives when the marker association is extremely strong, as explained below.

Type I error rate of the tests were assessed from simulated and empirical data. Under the “null hypothesis” of indirect association, we inspected the distribution of nominal P-value, and assessed the test as accurate, conservative, or anticonservative, if the actual type I error rate was equal, smaller, or larger than the nominal P-value, respectively; an anticonservative test cannot be used. For a marker showing association at a borderline level of genome-wide significance (R²_mrk = 0.00592), the heteroscedasticity test was accurate, but the skewness test tended to be conservative, due to the inverse normal transformation (Supplemental Fig. 4); this was not calibrated. As the tests for heteroscedasticity and skewness were not correlated, they could be combined using Fisher's method. When the marker association was as large as R²_mrk = 0.1, which is exceptional for GWA signals, the inverse normal transformation caused spurious heteroscedasticity and skewness, thus the proposed tests were not valid. For gene expression data (Stranger et al. 2007), the heteroscedasticity test was accurate, and the skewness test was slightly conservative (Supplemental Fig. 5; Supplemental Table 2).

Models for simulation

If the strength of the marker SNP association , and the frequency of variants are specified, we can calculate the effect-size of the causal variants (d_i,e_j) by solving Supplemental equation S3, and determine the genetic model. We systematically explore four representative models of synthetic association by simulation. (Plots of d_i according to variant frequency are shown in Supplemental Fig. 6.)

Model 1

All causal alleles linked to marker allele A have identical effect-size, and there are no causal alleles linked to allele a. By solving Supplemental equation S3 under and m = 0, Formula By substituting this to the test statistic of heteroscedasticity (equation M5), and solving for , Formula Thus, when R²_mrk = 0.00592 and N = 5000, heteroscedasticity is detectable at a power >0.8 under a significance level of 0.05 (which requires a noncentrality parameter of χ² > 9.64 for the χ² distribution with two degrees of freedom; “+2” in the parenthesis is negligible) if Formula Heteroscedasticity is detectable if the cumulative frequency of causal alleles (left term) is smaller than a certain function of the frequency of the marker allele A (right term); the detectable range with regard to is wider when p_A is larger.

Model 2

Causal alleles are linked to the two marker alleles in a balanced way, such that the effect-size is uniform, as , and the cumulative frequencies equal between causal alleles B_i (linked to marker allele A) and causal alleles c_j (linked to marker allele a), as . By solving Supplemental equation S3 under the constraints, Formula By substituting this (approximating the last term in the square-root as zero) to the test statistic of skewness (equation M9), and solving for , Formula Thus, when R²_mrk = 0.00592 and N = 5000, skewness is detectable at a power >0.8 under a significance level of 0.05 (which requires a noncentrality parameter of χ² > 7.85 for the χ² distribution with one degree of freedom) if Formula Skewness is detectable if the cumulative frequency of causal alleles B_i (left term) is smaller than a certain function of the frequency of the marker allele A (right term); the detectable range with regard to is widest when p_A is around 0.5.

Model 3

The effect-size of causal alleles is uniform, as , yet the cumulative frequency of the causal alleles B_i is twice the cumulative frequency of causal alleles c_j, as . Then, Formula

Model 4

The cumulative frequencies are equal between causal alleles linked to the two marker alleles, as , yet the effect-size of causal alleles B_i is twice the effect-size of causal alleles c_j, as . Then, Formula

Power assessment by simulation

We assessed the power of the three tests under each of the four models by simulation. In any of the models, we assumed that effect-size equals among the causal alleles linked to the same marker allele (i.e., and ). In such a case, the tests remain the same if instead there was one composite allele of B_i's and another composite of c_j's. Using this property, we actually simulated the special case with one causal allele linked to each one of the marker alleles; the simulation results apply to the general case with multiple causal alleles.

Simulations were performed under the following parameter values; R²_mrk = 0.00592, 0.0118; Other parameters—, d_i, and e_j—were determined according to constraints. We randomly generated 5000 (or 2500) individuals using simulation and applied the tests. The power was assessed from 1000 simulation trials. We used the R software for computation.

Previous Section Next Section

Acknowledgments

We thank the participants in the lipid study and Drs. Toru Nabika (Shimane University), Tomohiro Katsuya (Osaka University), and Yukio Yamori (Mukogawa Women's University). We also thank anonymous reviewers for their constructive comments. This work was supported by the Program for Promotion of Fundamental Studies in Health Sciences of the National Institute of Biomedical Innovation Organization; a Grant of the National Center for Global Health and Medicine; and the Ministry of Health, Labor and Welfare.

Previous Section Next Section

Footnotes

↵6 Corresponding author.

E-mail fumihiko{at}takeuchi.name.
[Supplemental material is available for this article. The synthetic association test software is freely available at http://www.fumihiko.takeuchi.name/PUBLICATIONS/synthetic.R.]
Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.115832.110.

Received September 24, 2010.
Accepted March 9, 2011.

Copyright © 2011 by Cold Spring Harbor Laboratory Press

Previous Section

References

↵
1. Bartlett MS
Bartlett MS. 1937. Properties of sufficiency and statistical tests. Proc R Soc Lond A Math Phys Sci 160: 268–282.
CrossRef Google Scholar
↵
1. Bennet AM,
2. Di Angelantonio E,
3. Ye Z,
4. Wensley F,
5. Dahlin A,
6. Ahlbom A,
7. Keavney B,
8. Collins R,
9. Wiman B,
10. de Faire U
Bennet AM, Di Angelantonio E, Ye Z, Wensley F, Dahlin A, Ahlbom A, Keavney B, Collins R, Wiman B, de Faire U, et al. 2007. Association of apolipoprotein E genotypes with lipid levels and coronary risk. JAMA 298: 1300–1311.
CrossRef Medline Google Scholar
↵
1. Blom G
Blom G. 1958. Statistical estimates and transformed β-variables. Wiley, New York.
Google Scholar
↵
1. Chasman DI,
2. Pare G,
3. Mora S,
4. Hopewell JC,
5. Peloso G,
6. Clarke R,
7. Cupples LA,
8. Hamsten A,
9. Kathiresan S,
10. Malarstig A
Chasman DI, Pare G, Mora S, Hopewell JC, Peloso G, Clarke R, Cupples LA, Hamsten A, Kathiresan S, Malarstig A, et al. 2009. Forty-three loci associated with plasma lipoprotein size, concentration, and cholesterol content in genome-wide analysis. PLoS Genet 5: e1000730. doi: 10.1371/journal.pgen.1000730.
CrossRef Medline Google Scholar
↵
1. Cirulli ET,
2. Goldstein DB
Cirulli ET, Goldstein DB. 2010. Uncovering the roles of rare variants in common disease through whole-genome sequencing. Nat Rev Genet 11: 415–425.
CrossRef Medline Google Scholar
↵
1. Dickson SP,
2. Wang K,
3. Krantz I,
4. Hakonarson H,
5. Goldstein DB
Dickson SP, Wang K, Krantz I, Hakonarson H, Goldstein DB. 2010. Rare variants create synthetic genome-wide associations. PLoS Biol 8: e1000294. doi: 10.1371/journal.pbio.1000294.
CrossRef Medline Google Scholar
↵
1. Fellay J,
2. Thompson AJ,
3. Ge D,
4. Gumbs CE,
5. Urban TJ,
6. Shianna KV,
7. Little LD,
8. Qiu P,
9. Bertelsen AH,
10. Watson M
Fellay J, Thompson AJ, Ge D, Gumbs CE, Urban TJ, Shianna KV, Little LD, Qiu P, Bertelsen AH, Watson M, et al. 2010. ITPA gene variants protect against anaemia in patients treated for chronic hepatitis C. Nature 464: 405–408.
CrossRef Medline Google Scholar
↵
1. Ji W,
2. Foo JN,
3. O'Roak BJ,
4. Zhao H,
5. Larson MG,
6. Simon DB,
7. Newton-Cheh C,
8. State MW,
9. Levy D,
10. Lifton RP
Ji W, Foo JN, O'Roak BJ, Zhao H, Larson MG, Simon DB, Newton-Cheh C, State MW, Levy D, Lifton RP. 2008. Rare independent mutations in renal salt handling genes contribute to blood pressure variation. Nat Genet 40: 592–599.
CrossRef Medline Google Scholar
↵
1. Maher B
Maher B. 2008. Personal genomes: The case of the missing heritability. Nature 456: 18–21.
Medline Google Scholar
↵
1. McCarthy MI,
2. Abecasis GR,
3. Cardon LR,
4. Goldstein DB,
5. Little J,
6. Ioannidis JP,
7. Hirschhorn JN
McCarthy MI, Abecasis GR, Cardon LR, Goldstein DB, Little J, Ioannidis JP, Hirschhorn JN. 2008. Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat Rev Genet 9: 356–369.
CrossRef Medline Google Scholar
↵
1. Pare G,
2. Cook NR,
3. Ridker PM,
4. Chasman DI
Pare G, Cook NR, Ridker PM, Chasman DI. 2010. On the use of variance per genotype as a tool to identify quantitative trait interaction effects: a report from the Women's Genome Health Study. PLoS Genet 6: e1000981. doi: 10.1371/journal.pgen.1000981.
CrossRef Medline Google Scholar
↵
1. Purcell S,
2. Neale B,
3. Todd-Brown K,
4. Thomas L,
5. Ferreira MA,
6. Bender D,
7. Maller J,
8. Sklar P,
9. de Bakker PI,
10. Daly MJ
Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, et al. 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81: 559–575.
CrossRef Medline Google Scholar
↵
1. Stranger BE,
2. Forrest MS,
3. Dunning M,
4. Ingle CE,
5. Beazley C,
6. Thorne N,
7. Redon R,
8. Bird CP,
9. de Grassi A,
10. Lee C
Stranger BE, Forrest MS, Dunning M, Ingle CE, Beazley C, Thorne N, Redon R, Bird CP, de Grassi A, Lee C, et al. 2007. Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science 315: 848–853.
Abstract/FREE Full Text
↵
1. Stuart A,
2. Ord J,
3. Arnold S
Stuart A, Ord J, Arnold S. 1999. Kendall's advanced theory of statistics. Arnold, London, United Kingdom.
Google Scholar
↵
1. Takeuchi F,
2. McGinnis R,
3. Bourgeois S,
4. Barnes C,
5. Eriksson N,
6. Soranzo N,
7. Whittaker P,
8. Ranganath V,
9. Kumanduri V,
10. McLaren W
Takeuchi F, McGinnis R, Bourgeois S, Barnes C, Eriksson N, Soranzo N, Whittaker P, Ranganath V, Kumanduri V, McLaren W, et al. 2009. A genome-wide association study confirms VKORC1, CYP2C9, and CYP4F2 as principal genetic determinants of warfarin dose. PLoS Genet 5: e1000433. doi: 10.1371/journal.pgen.1000433.
CrossRef Medline Google Scholar
1. Takeuchi F,
2. Isono M,
3. Katsuya T,
4. Yamamoto K,
5. Yokota M,
6. Sugiyama T,
7. Nabika T,
8. Fujioka A,
9. Ohnaka K,
10. Asano H
Takeuchi F, Isono M, Katsuya T, Yamamoto K, Yokota M, Sugiyama T, Nabika T, Fujioka A, Ohnaka K, Asano H, et al. 2010. Blood pressure and hypertension are associated with 7 loci in the Japanese population. Circulation 121: 2302–2309.
Abstract/FREE Full Text
↵
1. Wadelius M,
2. Chen LY,
3. Eriksson N,
4. Bumpstead S,
5. Ghori J,
6. Wadelius C,
7. Bentley D,
8. McGinnis R,
9. Deloukas P
Wadelius M, Chen LY, Eriksson N, Bumpstead S, Ghori J, Wadelius C, Bentley D, McGinnis R, Deloukas P. 2007. Association of warfarin dose with genes involved in its action and metabolism. Hum Genet 121: 23–34.
CrossRef Medline Google Scholar
↵
1. Weisgraber KH
Weisgraber KH. 1994. Apolipoprotein E: structure-function relationships. Adv Protein Chem 45: 249–302.
CrossRef Medline Google Scholar
↵
1. Weisgraber KH,
2. Rall SC Jr.,
3. Mahley RW
Weisgraber KH, Rall SC Jr., Mahley RW. 1981. Human E apoprotein heterogeneity. Cysteine-arginine interchanges in the amino acid sequence of the apo-E isoforms. J Biol Chem 256: 9077–9083.
FREE Full Text
↵

Wellcome Trust Case Control Consortium. 2007. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447: 661–678.

CrossRef Medline Google Scholar

[1] ↵

Bartlett MS

Bartlett MS. 1937. Properties of sufficiency and statistical tests. Proc R Soc Lond A Math Phys Sci 160: 268–282.

CrossRef Google Scholar

[2] Bartlett MS

[3] ↵

Bennet AM,

Di Angelantonio E,

Ye Z,

Wensley F,

Dahlin A,

Ahlbom A,

Keavney B,

Collins R,

Wiman B,

de Faire U

Bennet AM, Di Angelantonio E, Ye Z, Wensley F, Dahlin A, Ahlbom A, Keavney B, Collins R, Wiman B, de Faire U, et al. 2007. Association of apolipoprotein E genotypes with lipid levels and coronary risk. JAMA 298: 1300–1311.

CrossRef Medline Google Scholar

[4] Bennet AM,

[5] Di Angelantonio E,

[6] Ye Z,

[7] Wensley F,

[8] Dahlin A,

[9] Ahlbom A,

[10] Keavney B,

[11] Collins R,

[12] Wiman B,

[13] de Faire U

[14] ↵

Blom G

Blom G. 1958. Statistical estimates and transformed β-variables. Wiley, New York.

Google Scholar

[15] Blom G

[16] ↵

Chasman DI,

Pare G,

Mora S,

Hopewell JC,

Peloso G,

Clarke R,

Cupples LA,

Hamsten A,

Kathiresan S,

Malarstig A

Chasman DI, Pare G, Mora S, Hopewell JC, Peloso G, Clarke R, Cupples LA, Hamsten A, Kathiresan S, Malarstig A, et al. 2009. Forty-three loci associated with plasma lipoprotein size, concentration, and cholesterol content in genome-wide analysis. PLoS Genet 5: e1000730. doi: 10.1371/journal.pgen.1000730.

CrossRef Medline Google Scholar

[17] Chasman DI,

[18] Pare G,

[19] Mora S,

[20] Hopewell JC,

[21] Peloso G,

[22] Clarke R,

[23] Cupples LA,

[24] Hamsten A,

[25] Kathiresan S,

[26] Malarstig A

[27] ↵

Cirulli ET,

Goldstein DB

Cirulli ET, Goldstein DB. 2010. Uncovering the roles of rare variants in common disease through whole-genome sequencing. Nat Rev Genet 11: 415–425.

CrossRef Medline Google Scholar

[28] Cirulli ET,

[29] Goldstein DB

[30] ↵

Dickson SP,

Wang K,

Krantz I,

Hakonarson H,

Goldstein DB

Dickson SP, Wang K, Krantz I, Hakonarson H, Goldstein DB. 2010. Rare variants create synthetic genome-wide associations. PLoS Biol 8: e1000294. doi: 10.1371/journal.pbio.1000294.

CrossRef Medline Google Scholar

[31] Dickson SP,

[32] Wang K,

[33] Krantz I,

[34] Hakonarson H,

[35] Goldstein DB

[36] ↵

Fellay J,

Thompson AJ,

Ge D,

Gumbs CE,

Urban TJ,

Shianna KV,

Little LD,

Qiu P,

Bertelsen AH,

Watson M

Fellay J, Thompson AJ, Ge D, Gumbs CE, Urban TJ, Shianna KV, Little LD, Qiu P, Bertelsen AH, Watson M, et al. 2010. ITPA gene variants protect against anaemia in patients treated for chronic hepatitis C. Nature 464: 405–408.

CrossRef Medline Google Scholar

[37] Fellay J,

[38] Thompson AJ,

[39] Ge D,

[40] Gumbs CE,

[41] Urban TJ,

[42] Shianna KV,

[43] Little LD,

[44] Qiu P,

[45] Bertelsen AH,

[46] Watson M

[47] ↵

Ji W,

Foo JN,

O'Roak BJ,

Zhao H,

Larson MG,

Simon DB,

Newton-Cheh C,

State MW,

Levy D,

Lifton RP

Ji W, Foo JN, O'Roak BJ, Zhao H, Larson MG, Simon DB, Newton-Cheh C, State MW, Levy D, Lifton RP. 2008. Rare independent mutations in renal salt handling genes contribute to blood pressure variation. Nat Genet 40: 592–599.

CrossRef Medline Google Scholar

[48] Ji W,

[49] Foo JN,

[50] O'Roak BJ,

[51] Zhao H,

[52] Larson MG,

[53] Simon DB,

[54] Newton-Cheh C,

[55] State MW,

[56] Levy D,

[57] Lifton RP

[58] ↵

Maher B

Maher B. 2008. Personal genomes: The case of the missing heritability. Nature 456: 18–21.

Medline Google Scholar

[59] Maher B

[60] ↵

McCarthy MI,

Abecasis GR,

Cardon LR,

Goldstein DB,

Little J,

Ioannidis JP,

Hirschhorn JN

McCarthy MI, Abecasis GR, Cardon LR, Goldstein DB, Little J, Ioannidis JP, Hirschhorn JN. 2008. Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat Rev Genet 9: 356–369.

CrossRef Medline Google Scholar

[61] McCarthy MI,

[62] Abecasis GR,

[63] Cardon LR,

[64] Goldstein DB,

[65] Little J,

[66] Ioannidis JP,

[67] Hirschhorn JN

[68] ↵

Pare G,

Cook NR,

Ridker PM,

Chasman DI

Pare G, Cook NR, Ridker PM, Chasman DI. 2010. On the use of variance per genotype as a tool to identify quantitative trait interaction effects: a report from the Women's Genome Health Study. PLoS Genet 6: e1000981. doi: 10.1371/journal.pgen.1000981.

CrossRef Medline Google Scholar

[69] Pare G,

[70] Cook NR,

[71] Ridker PM,

[72] Chasman DI

[73] ↵

Purcell S,

Neale B,

Todd-Brown K,

Thomas L,

Ferreira MA,

Bender D,

Maller J,

Sklar P,

de Bakker PI,

Daly MJ

Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, et al. 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81: 559–575.

CrossRef Medline Google Scholar

[74] Purcell S,

[75] Neale B,

[76] Todd-Brown K,

[77] Thomas L,

[78] Ferreira MA,

[79] Bender D,

[80] Maller J,

[81] Sklar P,

[82] de Bakker PI,

[83] Daly MJ

[84] ↵

Stranger BE,

Forrest MS,

Dunning M,

Ingle CE,

Beazley C,

Thorne N,

Redon R,

Bird CP,

de Grassi A,

Lee C

Stranger BE, Forrest MS, Dunning M, Ingle CE, Beazley C, Thorne N, Redon R, Bird CP, de Grassi A, Lee C, et al. 2007. Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science 315: 848–853.

Abstract/FREE Full Text

[85] Stranger BE,

[86] Forrest MS,

[87] Dunning M,

[88] Ingle CE,

[89] Beazley C,

[90] Thorne N,

[91] Redon R,

[92] Bird CP,

[93] de Grassi A,

[94] Lee C

[95] ↵

Stuart A,

Ord J,

Arnold S

Stuart A, Ord J, Arnold S. 1999. Kendall's advanced theory of statistics. Arnold, London, United Kingdom.

Google Scholar

[96] Stuart A,

[97] Ord J,

[98] Arnold S

[99] ↵

Takeuchi F,

McGinnis R,

Bourgeois S,

Barnes C,

Eriksson N,

Soranzo N,

Whittaker P,

Ranganath V,

Kumanduri V,

McLaren W

Takeuchi F, McGinnis R, Bourgeois S, Barnes C, Eriksson N, Soranzo N, Whittaker P, Ranganath V, Kumanduri V, McLaren W, et al. 2009. A genome-wide association study confirms VKORC1, CYP2C9, and CYP4F2 as principal genetic determinants of warfarin dose. PLoS Genet 5: e1000433. doi: 10.1371/journal.pgen.1000433.

CrossRef Medline Google Scholar

[100] Takeuchi F,

[101] McGinnis R,

[102] Bourgeois S,

[103] Barnes C,

[104] Eriksson N,

[105] Soranzo N,

[106] Whittaker P,

[107] Ranganath V,

[108] Kumanduri V,

[109] McLaren W

[110] Takeuchi F,

Isono M,

Katsuya T,

Yamamoto K,

Yokota M,

Sugiyama T,

Nabika T,

Fujioka A,

Ohnaka K,

Asano H

Takeuchi F, Isono M, Katsuya T, Yamamoto K, Yokota M, Sugiyama T, Nabika T, Fujioka A, Ohnaka K, Asano H, et al. 2010. Blood pressure and hypertension are associated with 7 loci in the Japanese population. Circulation 121: 2302–2309.

Abstract/FREE Full Text

[111] Takeuchi F,

[112] Isono M,

[113] Katsuya T,

[114] Yamamoto K,

[115] Yokota M,

[116] Sugiyama T,

[117] Nabika T,

[118] Fujioka A,

[119] Ohnaka K,

[120] Asano H

[121] ↵

Wadelius M,

Chen LY,

Eriksson N,

Bumpstead S,

Ghori J,

Wadelius C,

Bentley D,

McGinnis R,

Deloukas P

Wadelius M, Chen LY, Eriksson N, Bumpstead S, Ghori J, Wadelius C, Bentley D, McGinnis R, Deloukas P. 2007. Association of warfarin dose with genes involved in its action and metabolism. Hum Genet 121: 23–34.

CrossRef Medline Google Scholar

[122] Wadelius M,

[123] Chen LY,

[124] Eriksson N,

[125] Bumpstead S,

[126] Ghori J,

[127] Wadelius C,

[128] Bentley D,

[129] McGinnis R,

[130] Deloukas P

[131] ↵

Weisgraber KH

Weisgraber KH. 1994. Apolipoprotein E: structure-function relationships. Adv Protein Chem 45: 249–302.

CrossRef Medline Google Scholar

[132] Weisgraber KH

[133] ↵

Weisgraber KH,

Rall SC Jr.,

Mahley RW

Weisgraber KH, Rall SC Jr., Mahley RW. 1981. Human E apoprotein heterogeneity. Cysteine-arginine interchanges in the amino acid sequence of the apo-E isoforms. J Biol Chem 256: 9077–9083.

FREE Full Text

[134] Weisgraber KH,

[135] Rall SC Jr.,

[136] Mahley RW

[137] ↵

Wellcome Trust Case Control Consortium. 2007. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447: 661–678.

CrossRef Medline Google Scholar

Detection of common single nucleotide polymorphisms synthesizing quantitative trait association of rarer causal variants

Abstract

Results

Test of heteroscedasticity

Test of skewness

Combined test

Discussion

Methods

Modeling the probability distribution of genotype and QT

Variance

Skewness

Statistical tests of synthetic association and type I error rate

Models for simulation

Model 1

Model 2

Model 3

Model 4

Power assessment by simulation

Acknowledgments

Footnotes

References

Articles citing this article

This Article

Article Category

Services

Citing Articles

Google Scholar

PubMed/NCBI

Share

Preprint Server

Navigate This Article

Current Issue

In This Issue