High-throughput allele-specific expression across 250 environmental conditions

Gene-by-environment (GxE) interactions determine common disease risk factors and biomedically relevant complex traits. However, quantifying how the environment modulates genetic effects on human quantitative phenotypes presents unique challenges. Environmental covariates are complex and difficult to measure and control at the organismal level, as found in GWAS and epidemiological studies. An alternative approach focuses on the cellular environment using in vitro treatments as a proxy for the organismal environment. These cellular environments simplify the organism-level environmental exposures to provide a tractable influence on subcellular phenotypes, such as gene expression. Expression quantitative trait loci (eQTL) mapping studies identified GxE interactions in response to drug treatment and pathogen exposure. However, eQTL mapping approaches are infeasible for large-scale analysis of multiple cellular environments. Recently, allele-specific expression (ASE) analysis emerged as a powerful tool to identify GxE interactions in gene expression patterns by exploiting naturally occurring environmental exposures. Here we characterized genetic effects on the transcriptional response to 50 treatments in five cell types. We discovered 1455 genes with ASE (FDR < 10%) and 215 genes with GxE interactions. We demonstrated a major role for GxE interactions in complex traits. Genes with a transcriptional response to environmental perturbations showed sevenfold higher odds of being found in GWAS. Additionally, 105 genes that indicated GxE interactions (49%) were identified by GWAS as associated with complex traits. Examples include GIPR–caffeine interaction and obesity and include LAMP3–selenium interaction and Parkinson disease. Our results demonstrate that comprehensive catalogs of GxE interactions are indispensable to thoroughly annotate genes and bridge epidemiological and genome-wide association studies.

be obtained. Because we were interested in early changes in the transcriptome, cells were treated for six hours (no cell doublings).

RNA-seq library preparation
Treated cells were collected by centrifugation at 2000 rpm and washed 2x using ice cold PBS. Collected pellets were lysed on the plate, using Lysis/Binding Buffer (Ambion), and frozen at -80 • . Poly-adenylated mRNAs were subsequently isolated from thawed lysates using the Dynabeads mRNA Direct Kit (Ambion) and following the manufacturer instructions. RNA-seq libraries were prepared using a protocol modified from the NEBNext Ultradirectional (NEB) library preparation protocol to use 96 Barcodes from BIOOScientific added by ligation, as described in (Moyerbrailean et al., 2015). The individual libraries were quantified using the KAPA real-time PCR system, following the manufacturer instructions and using a custom-made series of standards obtained from serial dilutions of the phi-X DNA (Illumina).

Two step high-throughput screening approach
We used a two step approach to gene expression analysis that we recently developed (Moyerbrailean et al., 2015).
Briefly, in the first step all samples were experimentally processed in parallel, from tissue culture and treatments to library preparation, thus minimizing experimental variation from testing dozens of conditions at the same time ( Figure S1). Additionally, high multiplexing allows for the reduction of the number of controls that would need to be repeated across different treatment batches in a less multiplexed experimental setup (here 32 treatments plus 3 controls for each of three individuals represented on a plate, see Figure S1). A 96-library pooling and shallow sequencing strategy (<10M reads per library, Table S2) was then used to minimize the amount of resources used in the screening step. This allowed for a rapid screen of a large number of treatment conditions, while sequencing resources could be strategically allocated to the in depth analysis of relevant cases.
For the second step, we repooled a selection of the initial libraries (Section 8.1, Figure 1B), without additional experimental costs. Furthermore, using a two-step approach allowed us to repool the samples to achieve a more uniform allocation of sequencing reads across samples (130M reads/sample on average, Table S4).

Sequencing
A flowchart of the 2-step sequencing procedure can be found in Figure S2. Pools of 96 samples from Step 1 were sequenced on two lanes of an Illumina HiSeq2500 in fast mode to obtain 50bp PE reads, at the University of Chicago and at the Michigan State University Genomics Cores; or on one lane of the Illumina NextSeq500 for 75 cycles PE in HO mode in the Luca/Pique-Regi laboratory. To prepare subpools for deep sequencing (Step 2), we used the re-pooling approach described in (Moyerbrailean et al., 2015) which allows for iterative adjustments of pooling proportions in order to reach the desired total number of reads through multiple repooling and sequencing runs.
Step 2 resequencing was performed on the NextSeq500 in the Luca/Pique-Regi laboratory. The number of reads collected for each sample in step 1 and step 2 is reported in Table S2 and Table   S4, respectively. We collected 6.6 billion reads in step 1 (averaging 8.2 million reads per sample) and 33.5 billion reads of deep sequencing data in step 2 (averaging 113 million reads per sample). Note that reads from shallow sequencing (obtained during the first step) were not combined to the ones from deep sequencing (second step) because of differences in read length.
7 Pre-processing of RNA-seq data 7.1 Sequence alignment RNA-seq data for step 1 was processed as described in (Moyerbrailean et al., 2015). Briefly, raw sequencing reads were aligned to the hg19 human reference genome using bwa mem (Li and Durbin, 2009)(http:// bio-bwa.sourceforge.net). Reads with quality <10 and duplicate reads were removed using samtools rmdup (http://github.com/samtools/). Read counts per sample after filtering steps can be found in Table S2.

Differential gene expression
To identify differentially expressed (DE) genes, we used the method implemented in the software DESeq2 (Love et al., 2014) (R version 3.2.1, DESeq2 version 1.8.1). DESeq2 estimates variance-mean dependence in the read counts for each gene and tests for differential expression based on a model using the negative binomial distribution. Transcripts with fewer than 20 reads total on a given plate were discarded. To better estimate the dispersion parameters, the DESeq2 model was fit on all sequencing data from a single plate simultaneously (i.e., all treatment samples, and without merging the technical replicates of the control samples, see Figure S1): where, for each transcript i and sample j, the read counts K ij are modeled using a negative binomial distribution with fitted mean µ ij and a gene-specific dispersion parameter α i . The fitted mean is composed of a sample-specific size factor s j and a parameter q ij proportional to the expected true concentration of fragments for sample j. The coefficient β 0 represents the mean effect intercept, β Cl(j) represents the cell line effect (in our case one parameter for each of the 3 cell lines in each plate), and β Tr(j) represents the specific treatment/control effect (one parameter for each treatment and vehicle control used in that plate).
We then contrasted the treatment effect parameter β Tr(j) to the appropriate matched control (β CO1 , β CO2 or β CO3 ) using the default DEseq Wald test for each transcript, and a Benjamini-Hochberg (BH) adjusted p-value was calculated with automatic independent filtering (DEseq2 default setting). DE genes were determined as genes with at least one transcript having a Benjamini-Hochberg controlled FDR (Benjamini and Hochberg, 1995) (BH-FDR) of 10% and an absolute log 2 fold-change value > 0.25. The same procedure was used for step 1 and step 2.
A summary of differential expression for both steps can be found in Tables S3, and S5, and a full set of differential expression results from step 2 can be found in Table S6. 7 8.2 Summary of 1st step DE analysis.
In step one, we used shallow RNA-seq (8.2 million reads/sample on average, Table S2) to coarsely characterize global changes in gene expression. To identify differentially expressed (DE) genes we used the method implemented in the software DEseq2 (Love et al., 2014) using gene annotations from Ensembl version 69, described in Section 8.1 ( Figure S3, Table S3). This global characterization of transcriptional responses showed that certain treatments induce gene expression changes across all cell types (e.g. dexamethasone, retinoic acid), while others have a cell type specific effect (e.g. vitamin B6 in PBMCs) ( Figure 1B). Of the 50 treatments considered, 16 do not induce strong changes in gene expression (defined here as >80 DE genes, 10% FDR,|logFC| >0.25) in any cell type, while 8 result in strong gene expression changes across all cell types (dexamethasone, caffeine, tunicamycin, iron and manganese, vitamin D, aspirin, retinoic acid). Some of the more extreme responses, such as those corresponding to manganese, were determined to be a toxic response and were removed from any subsequent analysis.

Summary of 2nd step DE analysis.
32 treatment conditions with at least 80 DE genes from step one (see Section 8.2) were selected for deep sequencing. In addition, 12 treatment conditions with fewer than 80 DE genes were chosen to confirm that treatments with a small response from the shallow sequencing data similarly have a small response when deep sequenced.
Overall, 297 samples (32 treatments and 3 controls across up to 5 cell types) were selected for repooling ( Figure   1B, Table S4). Gene annotations from Ensembl version 75 were used, consisting of 204,940 transcripts corresponding to 60,234 unique ENSG gene identifiers. Sequencing reads covering transcripts were counted using bedtools coverage, using the -s and -split options to account for strandedness and for BED12 input, respectively.

Transcript and gene level FPKM summarization for each individual sample.
To perform Principal Component Analysis and network analysis, we calculated FPKMs for each sample (defined as the combination of a single individual and a single treatment, e.g., dexamethasone in GM19239) from the number of reads covering each transcript. To control for potential confounders, we fit the following linear model: where for each transcript i, S i is the transcript GC content proportion, L i is the transcript length, and S i * L i is an interaction term between GC content and transcript length. The residual of the linear model, the GC-corrected log 10 FPKMs are then quantile normalized within each individual.
For analyses at the gene level, expression of a single transcript was chosen to represent each gene for each different cell type. The most highly expressed transcript (based on average expression across the cell type before quantile normalization) was selected from each gene to represent the overall expression of that gene.
After quantile normalization log 10 FPKMs were further corrected within each individual by subtracting that individual's average value per transcript across all treatments. This was calculated after removing the top and bottom 10%-iles of data, usually referred to as 10% trimmed mean or Tukey's mean. The procedure is implemented in R "mean" function using the "trim=0.1" option.

Hierarchical clustering and PCA
Principal component analysis and hierarchical clustering was performed on a Pearson correlation matrix using transcript FPKM values for all samples on a plate (i.e., three individuals of a cell type treated with a single treatment panel, see Figure S1). Note that the input data is quantile normalized which makes Pearson correlation more similar to rank correlation in terms of robustness to outliers. Heatmaps were produced on the same correlation matrices, with dendrograms of Euclidian distance calculated using the "hclust" function with linkage "method=complete".
Samples corresponding to the same treatment tend to cluster together, and all controls (CO1, CO2, CO3) also cluster together ( Figures S4 -S5). We see similar results when performing hierarchical clustering . Specifically, treatments that elicit strong responses cluster distinctly from control samples, and are often separated from other samples along the first or second PC (for example, selenium in HUVECs, dexamethasone and aldosterone in SMCs, tunicamycin in melanocytes -see Figures S6 and S4). In contrast, treatments that don't have a strong response cluster close with the controls (e.g., B vitamins in PBMCs, Figures S4 and S6). We also see clustering of treatments in biologically relevant ways. For example, in LCLs ( Figure S4), PC1 separates the controls from the metal ions (selenium and copper), while PC2 separates the controls from nuclear receptor ligands (dexamethasone and vitamin A) and from caffeine. Finally, we also observe good concordance with the differential gene expression results. For example, vitamin D elicits a strong response in PBMCs, but less so in HUVECs (Figure 1B,also in FigureS12B). This is reflected in the fact that in PBMCs, vitamin D clusters apart from other treatments (and furthest from the controls), while in HUVECs, vitamin D clusters with the controls ( Figures S4, S6).

Network analysis with WGCNA
For network analysis we normalized the data as in the PCA procedure. We combined all the data across cell types, treatments and individuals resulting in a matrix with 14,527 rows (genes) and 297 columns (samples).
We then used WGCNA (Langfelder and Horvath, 2008) version 1.47 implemented in R to build an unsigned network. A soft thresholding power of 6 was chosen, and the network was built using the automated blockwise modules pipeline using Pearson correlations, a signed topological overlap matrix, and a minimum module size of 10. Modules were cut from the network dendrogram with the Dynamic Hybrid Tree cut method, and the module eigengene was calculated as the first principal component of each module's expression matrix. A measure of module membership was calculated for each gene by correlating the gene's expression profile with its module's eigengene. Modules that did not have at least 3 genes with connection to the eigengene (> 0.5) were disbanded.
Also, all genes with low connectivity to the module's eigengene (< 0.3) were removed from their modules. Modules with very highly correlated eigengenes were merged using WGCNA's default iterative clustering method.
The minimum module size was set to 10, while the largest module contained 1,456 genes (median module size was 42 genes). A total of 7,936 genes, corresponding to 54.6% of the genes considered was assigned to a module (Table S7). The resulting coexpression network is made up of 87 modules and was found to approximate a scale free topology R 2 = 0.8557, indicating that the degree distribution (or measure of network connectivity) followed an expected power law. Each module eigengene was then evaluated for significant differences in expression in each cell type between treatment versus control using Student's t-test (Table S8). We used this t statistic to plot a heatmap of association of each eigengene to each cell-type/treatment combination ( Figure S11). We then used a cytoscape version 3.2.1 to plot the global network ( Figure 1C), and individual module networks that show different properties across treatments or cell types ( Figure S12 and Figure S13).
We assigned a treatment to each module based on the most significant effect size of treatment on the module eigen-gene expression. Known target genes for specific treatments were categorized as hub genes in treatmentspecific modules (Fig S12, Fig S13). For example, the known glucocorticoid targets, TSC22D3 and FKBP5, are hub genes in Dexamethasone M66 and showed a similar transcriptional response in all analyzed cell types ( Fig S13A). To identify the modules that capture cell type specific gene expression response to a treatment, we considered seven treatments that were assayed in at least four cell types (dexamethasone, caffeine, selenium, vitamin A, aspirin, phthalate, and vitamin D) and are associated with 81 out of 87 modules (nominal p < 0.01).
First, we examined specificity of each module across treatments and cell types. 33 of the 81 modules had only one cell type with a significant association in a given treatment while 42 of the 81 modules had significant association with one treatment in a cell type. By taking the overlap, we identified ten modules that are only associated with one environment, demonstrating that these ten modules contain genes that respond specifically to treatment in a particular cell type. Then, we analyzed modules significantly associated with a treatment in at least two cell types (59% of the modules) to investigate the extent each module represents similar patterns of gene expression responses across cell types. Overall, 88% of the modules (42 of 48 modules) exhibited similar gene expression response patterns across cell types with sharp differences being treatment-dependent. Of the five cell types, SMCs expressed the most treatment-specific responses with 3/5 modules showing opposite gene expression patterns across treatments.
9 Allele specific expression analysis 9.1 Core set of SNPs for analysis To create a core set of SNPs for ASE analysis, we started with the autosomal 1KG SNPs from the phase 3 release (v5b.20130502, downloaded on 08/24/15), and first removed SNPs with low minor allele frequency (MAF < 5%). We also removed SNPs within regions of annotated copy number variation and ENCODE blacklisted regions (Degner et al., 2012), leaving a total of 7,085,180 SNPs in the core set.

Joint genotyping
Counts of reads covering each allele at selected SNPs (Section 9.1) were obtained by "piling up" aligned reads for each sample over SNPs using samtools mpileup and the hg19 human reference genome. Reads with a SNP at the beginning or at the end of the read were also removed to avoid any potential bias, as well as those within a reference skip (i.e., within a splice junction, meaning the read does not actually cover the SNP To verify that none of the samples had been contaminated with reads from another individual during library preparation, we compared the allele ratio,ρ, across samples processed at the same time (based on our study design these are samples from the same cell type, see Figure S1). The allele ratio is indicative of the genotype, as it follows a trimodal distribution with peaks for homozygous reference and alternate, and heterozygous (Harvey et al., 2015). For each cell type, we examined theρ for SNPs covered in each of the three individuals across all samples. For each cell type,theρ values were correlated across the three individuals, and the resulting correlation matrix used to perform PCA ( Figure S8). We expect the three individuals to group into three distinct clusters on the PCA plot, in the absence of sample contamination or mix-up.  Figure S9). The inference step takes into account the appropriate dispersion estimate for a given SNP's coverage, technical noise of that sample, and the genotyping uncertainty estimate (calculated in 9.2), and tests for the possibility that the allele ratio (ρ, or the proportion of reference reads to total reads) is different than 0.5. At the same time it also gives an estimate of ρ in log-odds form,β = log(ρ/(1 − ρ)) that can be interpreted as log(reference reads / alternate reads) while taking into account overdispersion and genotyping error. We also obtain an estimate of the standard error forβ, that we can use for downstream analyses of cASE. A summary of the amount of ASE detected in each sample is in Figure S10 and To calculate a SNP-based expression and fold change (rather than gene-based, as in the FPKM (Section 8.4) and DESeq2 (Section 8.1) methods), we first calculated the read coverage at each SNP, adjusted by the sequence depth of the sample: where the transcripts per million (TPM) for SNP i in sample j is calculated as the read coverage R ij times 10 6 , divided by the sequence depth of the sample, D j . The average expression level at a SNP is then calculated as the average between treatment T and control C samples: Similarly, the SNP-based fold change was calculated as: 10 Conditional ASE analysis Most of the ASE signal is shared between treatment and control for a given SNP (dots along the y = x line in Figure 3D). However, there is evidence for SNPs showing ASE only in the control (along the x-axis) or ASE only in the treatment condition (along the y-axis). These SNPs represent candidates for conditional ASE (cASE): SNPs that display ASE only in certain environmental conditions. Additionally, there is a large number of SNPs falling in the space comprised between the y = x line and the axes. These SNPs may represent cases of shared-ASE with large overdispersion or cases of cASE with quantitative differences in the genetic effect across conditions. In general testing for differences in genetic effects across two conditions is particularly challenging as it implies comparing two noisy measurements to determine whether they are different while taking into account heterogeneity of the underlying true genetic effects.
This problem has been previously faced in the context of condition-specific eQTL mapping (e.g. reQTL mapping and tissue-specific eQTL mapping) (Mangravite et al., 2013, Maranville et al., 2013, Qiu et al., 2014, Maranville et al., 2012, Maranville et al., 2011, Ç alışkan et al., 2015, Barreiro et al., 2012, Lee et al., 2014, Siddle et al., 2014, Franco et al., 2013, Idaghdour et al., 2012, Fairfax et al., 2014, Flutre et al., 2013 and some of the approaches developed can be broadly translated to applications for cASE analysis. With regard to condition-specific eQTL mapping, three major approaches have been used so far: i) independent eQTL calling and comparing p-values, ii) comparing summary statistics using a meta-analysis approach that takes into account heterogeneity of the sub-groups, and iii) directly modeling the interaction term using ANOVA or QTL mapping of the fold-change.
In the first approach, independent eQTL are determined in the two conditions analyzed, and the p-values for each SNP compared across conditions. The SNP is defined then as an eQTL only in the condition with the lowest p-value (Fairfax et al., 2014). A major limitation of this approach is that higher p-values may result from incomplete power in any of the conditions analysed. Another modification of this approach is based on setting different FDR thresholds across conditions so that a SNP is assigned to condition 1 if it has a very low FDR in that condition (e.g.<1%) and a very high FDR in condition 2 (FDR>90%) (Barreiro et al., 2012).
The second class of approaches directly compares different eQTL configurations in a Bayesian framework (Maranville et al., 2011, Wen andStephens, 2014) and incorporates in the model heterogeneity in the effect sizes between the groups contrasted. This class of models also defines the genetic effect of the response eQTL in a strict on/off mode between the groups compared.
Finally the third class of approaches uses a linear model with an interaction term to directly test for geneby-environment interactions at a given gene and SNP (Ç alışkan et al., 2015), and has been also recently applied to GxE analysis with ASE data (Knowles et al., 2015). This class of approaches includes eQTL mapping of gene expression log-fold change calculated across the two conditions tested (Smirnov et al., 2009,Barreiro et al., 2012, Maranville et al., 2011. As opposed to the Bayesian approaches they do not provide information on the specific eQTL configuration, they theoretically allow capturing any type of eQTL configuration, although they have the greatest power for eQTLs with opposite genetic effects in the two conditions tested and may be confounded with increased group heterogeneity. To test for cASE, here we used two methods that belong to the classes of approaches described in 2 and 3 above. The first one is MeSH (Meta-analysis of Subgroup Heterogeneity), which identifies qualitative GxE interactions. We also developed a test for cASE that compares the evidence for ASE in the treatment to ASE in the control using theβ estimates from QuASAR directly, thus identifying quantitative GxE interactions. We refer to this second method as ∆AST, Differential Allele-Specific Test. a Bayes factor is derived contrasting each of the configurations that assume ASE in at least one of the conditions to the configuration with no ASE in either condition as baseline. Specifically, BF treatment contrasts the evidence for β t = 0, β c = 0 to β t = 0, β c = 0; BF control contrasts the evidence for β t = 0, β c = 0 to β t = 0, β c = 0, and BF shared contrasts the evidence for β t = 0, β c = 0 to β t = 0, β c = 0. To look specifically at conditional ASE, a Bayes factor for cASE is calculated as BF treatment − BF shared (treatment-only cASE) and BF control − BF shared (control-only cASE). SNPs with cASE are then identified as SNPs with BF cASE > 30.

∆AST: A novel method to measure cASE
The cASE models tested in MESH assume extreme ASE differences between treatment and control. However, extreme on/off cases of ASE may not be the only relevant ones as cASE may also exist where the genetic effect (β) differs to a significant degree between treatment and control (β T −β C = 0) but is non-zero in both conditions.
To capture cASE genes that may not fit perfectly to the models tested by MESH, we developed an alternative approach named ∆AST to identify quantitative differences in the ASE ratios between treatment and control.
Differential Z-scores (Z ∆ ) were calculated from QuASAR betas using the following formula. For each SNP, where β T and se T represent the beta and standard error estimates for the treatment condition, and β C and se C represent the beta and standard error estimates for the control condition. The Z ∆ scores were then normalized by the standard deviation across Z ∆ scores corresponding to control vs control (CO1 vs. CO2). Finally p-values (p ∆ ) are calculated from the Z ∆ scores as p ∆ = 2 × pnorm(−|z|). Under the null, Z ∆ are asymptotically normally distributed and Figure 3E shows that when contrasting the two sets of controls the p ∆ -values are almost uniformly distributed as expected. To further correct for this small deviation we use the control vs. control p-values to empirically estimate the FDR as detailed in the following section.

Empirical estimation of the cASE false discovery rate
For ∆AST analysis of cASE, we were also able to take advantage of our study design that includes at least two pairs of vehicle controls per individual (CO1 and CO2) in each treatment plate. This allowed us to apply our statistical methods to a subset of control data, contrasting within each individual and plate the two controls (CO2 to CO1), and to empirically determine the false discovery rate (FDR) or to recalibrate the p-values. This is very similar to a permutation procedure to estimate the p-value in eQTL studies. Permuting read assignments in ASE studies may not recapitulate the overdispersed nature of the allelic imbalances and there is a small number of permutations possible across individuals. Nevertheless, in our experimental design, contrasting the two control samples could be used to empirically reveal the underlying null distribution similar to the permutation based approaches that are possible in QTL studies with big sample sizes.
To this end, we ran ∆AST analysis on 120,273 SNPs, contrasting the two controls (CO2 to CO1) as we did for any of the treatments to its matched control (Tx to CO1 or Tx to CO2). To focus on actual false positives, we removed 8,040 control SNPs with a SNP-based log 2 ( fold change ) > 1 (see Section 9.3) in the CO2 versus CO1) comparison (see Table S12 for results of the CO2 vs CO1 analysis). For our test statistics for cASE (i.e., Z ∆ ), we can observe the empirical distribution obtained when contrasting the two sets of controls. This empirical distribution represents a sample from the null distribution. Using this empirical distribution we can then derive a corrected p-value based on the ranks observed in the control vs. control (this is exactly the same as in a permutation based approach). After calculating the corrected p-values derived from Z ∆ , we applied multiple test correction using the q-value method (Storey, 2003).
We identified 67 cASE SNPs using the ∆AST method, corresponding to an FDR of 10%. When we relax the FDR threshold (25%), we find a total of 178 cASE SNPs (corresponding to 160 unique genes), of which 68 SNPs are identified also with MeSH. The list of significant cASE SNPs is in Table S11 (also see Figure 4). Of the 31 genes with dexamethasone reQTLs previously identified in 116 LCL and 88 PBMC samples (Maranville et al., 2011, Maranville et al., 2013 and tested in our dataset, here we validated 21 genes with cASE (p <0.05), 6 of which in response to dexamethasone, including the population-specific reQTL gene NQO1 (p = 0.03).
When we considered all cASE SNPs, we observed a significant positive correlation between the gene expression log 2 (fold change) after treatment and differences in the genetic effect in treatment and control samples (Pearson's r = 0.204, p-value = 0.038, Figure 5A) . However, we observed a negative correlation between gene expression levels and the difference in the genetic effect in the treatment and the control samples (Pearson's r = -0.381, p-value = 6.7×10 −5 , Figure 5B).

Analysis of environmental displacement of genetic effects (EDGE)
In addition to characterizing significant differences in the allelic ratio parameters (β) between treatment and control to determine cASE, we also characterized how correlated they are across all heterozygous sites. This correlation measures the consistency of the genetic effect between the treatment and control, and therefore a lower correlation indicates a higher perturbation or displacement of the genetic effects, in our case the ASE β parameters estimated by QuASAR. Within each treatment and cell line subgroup, we examined the Pearson's correlation of the treatment standardized effect size (ASE Z T =β T /se T ) to the matched control one (ASE Z C =β C /se C ). To make the measurement more comparable across environments, we normalized the treatment-control correlation to the correlation observed between the two controls (CO1 and CO2) as measured within the same cell type (see Figure S19A-B). Formally, we define this as the environmental displacement of genetic effect (EDGE) index: where Pearson(Z s,t , Z s,c ) is the sample Pearson correlation coefficient between treatment t and control c ASE Z-scores across all observed SNPs in cell type s. Equivalently, Pearson(Z s,CO1 , Z s,CO2 ) is the Pearson correlation coefficient between the two controls sets ASE Z-scores across all observed SNPs in cell type s. These values can be found in Figure S19C, Figure S20, and Table S14.
We also asked whether the EDGE index was fully explained by the SNPs with significant cASE. While the EDGE index and the proportion of tests with cASE are very well correlated (see Figure S19C), removing significant SNPs did not substantially alter the EDGE index ( Figure S21) indicating that there may remain a similar fraction of cASE SNPs with a smaller effect size that we are not able to detect and a higher coverage may be necessary.

Identification of induced ASE
Conditional ASE analyses require that for both treatment and control conditions the SNP has a sufficiently high coverage (Section 9.3), resulting in the somewhat implicit requirement that the gene containing that SNP is expressed both in treatment and control conditions. However, many genes may have very low expression levels for the control condition and very high expression with ASE in the treatment. In other words, the expression of these genes would be induced by a specific treatment. For these cases, ASE in the control condition is not well defined, as it is exemplified in the extreme case where 0:0 reads are counted for both alleles. We denote this type of phenomenon as induced ASE (iASE), which are cases when the ASE can only be observed in genes that are induced by the treatment. Genes with iASE may be as interesting as genes with cASE and would probably be missed by studies that only consider baseline eQTLs or ASE. Physiologically, the iASE allele with high expression may exceed the threshold for a downstream effect on an observable cellular or organismal phenotype that may only manifest in a particular environmental condition.
To identify genes with iASE, we selected SNPs that were well covered in the treatment (i.e., >40 reads) and had ASE (10% FDR) but had little to no expression in the matched control. We used a coverage threshold in the control of 10×(D C /D T ), where D C and D T represent the sequencing depth of the control and treatment libraries, respectively in TPM (see Section 9.3). This equates to a ratio of 40 reads to 10 (expression in the control is 4-fold lower than the minimum required for a gene to be considered expressed in the ASE analysis), while accounting for sequencing depth differences. Finally, we required the SNP-based log 2 ( fold change) (Section 9.3) to be greater than log 2 ( 5 ). This threshold represents a strong difference between treatment and control expression,  et al., 2010). This program implements a univariate linear mixed model (LMM) for marker association tests with a phenotype of interest (i.e., GWAS traits) that takes into account population stratification and sample structure.
Similar to other methods that partition heritability estimates across SNP functional categories (Gusev et al., 2014), GEMMA estimates fold change, to contrast heritability per SNP in one category with heritability per SNP across the genome. In particular, the enrichment in ith category = (heritability per SNP in category i)/(heritability per SNP in all categories). We used GEMMA to jointly analyze summary statistics from 18 GWAS meta-analysis studies with annotations of regulatory variation.
To run GEMMA we first annotated SNPs to the closest gene, using bedtools closest. We then partitioned SNPs genome wide to create a category file. Each SNP was assigned to one of the following categories: cASE (genic regions with conditional allele specific expression) or iASE (genic regions with induced allele specific expression), ASE (genic regions with allele specific expression), Other Genic (genic regions that do not fall into any of the categories above) and Intergenic (greater than 100kb from any gene). Here genic regions are defined as the gene body +100 kb on both sides of the gene. This genomic definition was chosen to be inclusive of the minimum regulatory region usually tested in eQTL studies. We then used GEMMA to compute the SNP correlations among different categories from a reference panel (502 individuals of European ancestry from the 1000 genome project). This was followed by summing the z 2 statistics from the GWAS meta-analysis with the categories. Finally, we computed the proportion of variance, and fold enrichment of heritability explained by each category. A table of the results can be found in Table S17.

Analysis of SNPs in the GWAS catalog
We downloaded the GWAS catalog (Welter et al., 2014) (version 1.0.1) on January 5th, 2016. To identify the overlap between our annotations and those associated with a GWAS trait, we intersected the unique genes of interest from our data with the reported genes from the GWAS catalog. Of the genes expressed in our dataset, 32,451 genes are differentially expressed in at least one treatment condition while 7,734 genes are not. 22% of DE genes (7,010 genes) are identified in the GWAS catalog while only 4% of non-differentially expressed genes (292 genes) are previously reported in the GWAS catalog. Using a Fisher's exact test on a 2x2 contingency table, we found an enrichment of DE genes among the genes reported to be associated with a trait in the GWAS catalog (p < 2.2 × 10 −16 , OR = 7.0).
We further studied whether we find an enrichment of genes containing cASE or iASE among genes associated with organismal traits. Of the 215 genes that contain iASE or cASE, 105 genes (49%) were found in the GWAS catalog. When compared to DE genes (after removing genes that contain either iASE or cASE, 6,906 were found in GWAS while 25,331 were not) using a Fisher's exact test, we found a significant enrichment of iASE/cASE genes among GWAS genes (p < 2.2 × 10 − 16, OR = 3.5). We also found that when compared to genes with ASE (not iASE or cASE), genes containing iASE or cASE have 1.4 higher odds to be relevant for complex traits (p = 0.025).
To assess our results relevant to other studies, we compared our iASE and cASE genes to eGenes, genes associated with an eQTL, from the GTEx study (The GTEx Consortium, 2015). 194 iASE or cASE genes were also found to be eGenes leaving 26,899 eGenes without evidence of GxE in our study. We found, again, that iASE and cASE genes were enriched among genes associated with organismal traits with 50% in the GWAS catalog (97 genes) as compared to 24% eGenes (6,419 genes) (Fisher's exact test, p = 4.3 × 10 − 15, OR = 3.2).
We also repeated these enrichment analyses excluding the iASE cases, but we essentially obtain the same odd ratios and with a significant enrichment for cASE genes.

Examples of cASE/iASE in GWAS genes with a biological connection
One example of iASE that occurs following treatment with caffeine is in the gene GIPR associated with obesityrelated traits (Wen et al., 2014, Mahajan et al., 2014, Berndt et al., 2013, Fox et al., 2012, Okada et al., 2012, Wen et al., 2012,Speliotes et al., 2010,Saxena et al., 2010. Previous studies have shown that caffeine can help prevent or treat obesity (Ohara et al., 2016, Kim et al., 2016, Li et al., 2015. This iASE links the effect of caffeine to obesity through GIPR. The direction of the effect, with the low-BMI allele being in phase with the high expression allele, is in agreement with a study showing that GIPR expression is lower in obese patients (Ceperuelo-Mallafré et al., 2014). Another example of iASE occurs at rs4619 in IGFBP1 following dexamethasone treatment. IGFBP1 is a gene that interacts with insulin-like growth factors and promotes cell migration. IGFBP1 has also been associated with rheumatoid arthritis and major depressive disorder in genome wide association studies (GENDEPInvestigators et al., 2013,Padyukov et al., 2011. These two disorders are seemingly unrelated until 20 we consider that dexamethasone is a drug often used to alleviate rheumatoid arthritis (BUNIM et al., 1958, Ferreira et al., 2016 while glucocorticoids are often misregulated in depression (Zunszain et al., 2011, Mokhtari et al., 2013, Carvalho and Pariante, 2008. Here, our data suggests that IGFBP1 may be associated with these very different phenotypes because it participates in one of the glucocorticoid response pathways. We identified cASE following copper treatment at rs7587 in SAMM50, a gene associated with liver enzyme levels and non-alcoholic fatty liver disease (Kitamoto et al., 2013, Kawaguchi et al., 2012, Chambers et al., 2011,Kamatani et al., 2010,Yuan et al., 2008. Reduced copper levels is associated with non-alcoholic fatty liver disease and severe hepatic steatosis (Feldman et al., 2015, Aigner et al., 2010. Our data suggest that copper facilitates these liver responses through modulation of allele-specific expression in SAMM50. A missense SNP (rs17482078) in ERAP1 showed cASE in response to selenium treatment (ERAP1 is shown in a selenium response module in Fig S14). This SNP is associated with Behçet's disease (Kirino et al., 2013), an autoimmune disorder.
Individuals with Behçet's disease have lower selenium levels (Delilbaşi et al., 1991). Another example of cASE was identified at rs189899 in the gene, GOSR2, following treatment with mono-n-butyl phthalate.   Table S2: Summary of shallow sequencing. Plate ID, Cell Type, Cell Line, Barcode ID (see Section 4), Treatment ID (see Table S1), Total reads (total number of reads sequenced), Aligned reads (number of reads after initial alignment), Quality reads (number of reads passing quality filtering), Clean reads (number of reads after duplicate removal). See attached file, Supplemental Table S2.xlsx, also available at http://genome.grid.wayne.edu/ gxebrowser/Tables/Supplemental_Table_S2.xlsx     Table S8: Z-scores for module-treatment combinations. For each module, listed is the t-score for the treatments in the cell type indicated. Also included for each module is the number of genes, the assigned treatment (based on the highest average t-score across cell types), the number of heterozygous and ASE SNPs in genes within the module, and the proportion of ASE to heterozygous SNPs in genes within the module. See attached file, Supplemental Table S8.xlsx, also available at http://genome.grid.wayne.edu/ gxebrowser/Tables/Supplemental_Table_S8.xlsx      Table S12: cASE results for all SNPs tested. For each SNP tested for cASE, listed is the same information in S11, as well as ASE data (similar to Table S10) for treatment and control SNPs, and two additional columns: the SNP-based expression and SNP-based log( fold change ), described in Section 9.3. See attached file, Supplemental Table S12.txt.gz, also available at http://genome.grid.wayne.edu/ gxebrowser/Tables/Supplemental_Table_S12.txt.gz Table S13: cASE genes with a reQTL signal. Out of 83 genes reported to have GxE in previous reQTL and ASE studies and able to be tested for cASE in our data, 63 genes are listed in this table that also have cASE in our data (p-value <0.05). The table denotes the genes, the environment in which the reQTL was identified, the citation of the paper establishing the reQTL, and the cell types and treatments for which we find cASE           Figure S1: Study design for one cell type (A) and all cell types (B). (A) Each cell type is represented by three individuals, each treated with treatments belonging to two treatment panels. For each treatment panel, the three individuals were treated in parallel on the same plate to analyze 32 samples (corresponding to sequencing libraries with unique barcode IDs, and represented by the different colors on the plate design). The 32 conditions correspond to 23 treatments and 3 vehicle controls for panel one, and to 26 treatments and 2 vehicle controls for panel two. For each control sample (colored in shades of grey), three technical replicates were performed (i.e., triplicates of the same sample/library). (B) The study design described in (A) is repeated for each cell type in the study.
Step 2 Step        LCL HUVEC q q q q q q q q q q q q q q q q q q q q q q q q q q q qq q q q q q q q q q q q q q qq q q q q q q

PBMC SMC
q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q Mel q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q    Figure S14: Module 23 containing ERAP1 in Melanocytes following Selenium. Nodes are sized proportional to their connectivity (normalized within each module) and colored based on the sign and magnitude of the gene expression log-fold change (red denotes an increase in expression following treatment). Edges are shaded based on the weight of the correlation between the two genes (darker indicates stronger correlation between samples). Gene names in black are significantly differentially expressed (1% FDR) following treatment.  Figure S15: Example diagrams and plots describing cASE and iASE SNPs. Left, gene expression in treatment (T) and control (C) samples for both reference (R) and alternate (A) alleles. Middle, barplot of example expression levels. The dotted line represents a threshold for the detection of transcript expression. Inset is a forest plot ofβ values calculated from the reads covering each alleles, using the provided formulas. Note that in the control sample, theβ is undefined for iASE. Right, scatter plot contrasting expression levels (normalized to library coverage) of the two alleles for both treatment (black point) and control (grey point). The line represents the magnitude of the fold-change between treatment and control. The dotted line represents equal expression of the two alleles (i.e., no ASE). and treatment (x-axis) conditions. The size of the point indicates the degree of differential expression for the gene containing the SNP, while the color indicates whether the treatment causes the gene to be up-or downregulated.

Proportion of cASE
Normalized to # SNPs tested Figure S18: Percent of cASE SNPs identified in each cell type. For each cell type, plotted is the percent of cASE SNPs identified, relative to the number of SNPs tested for that group. The dotted black line represents the average percent of cASE SNPs across all categories or cell type. Groups with a "*" are significantly enriched or depleted (Binomial p-value < 0.05) relative to the average.  Figure S21: Comparison of EDGE index values with and without including cASE SNPs. For each sample, the EDGE index was calculated as described in Section 10.4, using either all SNPs in that subgroup (y-axis) or only SNPs that do not display significant cASE (x-axis). Note that for this analysis, we only included SNPs passing the default coverage filter (40 reads).