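Table 2. Experimental validation by sequencing RNA samples of informative heterozygotes

Column groups: Allele H through P-value hold the in silico allelic imbalance data (predicted alleles, allele counts in HapMap and dbEST). H > L through Allele ratio P-value hold the validation by sequencing in LCL RNA: H > L, L > H, L = H, and Unknown come from the AI95 analysis per LCL, and the last two columns from the locus-specific analysis (AILS).

| Gene (a) | SNP | CHR | Allele H (b) | Allele L (b) | HapMap H | HapMap L | dbEST H | dbEST L | No. of ESTs (c) | P-value | H > L (d) | L > H (e) | L = H (f) | Unknown (g) | Average allele ratio RNA/DNA (h) | Allele ratio P-value (i) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| STEAP1 | rs4015375 | 7 | C | G | 8 | 112 | 19 | 0 | 23 | 1.88E-17 | 5 | 0 | 0 | 0 | 7.70 | 1.84E-04 |
| GCS1 | rs1063588 | 2 | T | C | 11 | 109 | 17 | 23 | 48 | 8.51E-06 | 2 | 0 | 1 | 7 | 1.07 | 4.04E-02 |
| WRB | rs2837005 | 21 | T | C | 25 | 95 | 10 | 2 | 12 | 2.57E-05 | 16 | 0 | 0 | 1 | 2.34 | 4.49E-04 |
| METAP1 | rs1238741 | 4 | C | T | 11 | 109 | 12 | 17 | 34 | 1.11E-04 | 9 | 0 | 0 | 2 | 1.07 | 1.27E-01 |
| GEMIN6 | rs1056104 | 2 | A | G | 6 | 114 | 12 | 35 | 56 | 3.58E-04 | 3 | 0 | 0 | 3 | 1.19 | 2.28E-02 |
| PBK | rs1052874 | 8 | G | C | 6 | 114 | 11 | 32 | 53 | 5.06E-04 | 4 | 0 | 0 | 3 | 1.40 | 1.74E-03 |
| PAX8 | rs1478 | 2 | G | T | 17 | 103 | 21 | 32 | 65 | 5.35E-04 | 6 | 1 | 2 | 8 | 1.57 | 3.42E-02 |
| HADHB | rs1056471 | 2 | G | C | 10 | 110 | 38 | 122 | 211 | 6.85E-04 | 6 | 0 | 0 | 2 | 1.20 | 2.30E-05 |
| OAS1 | rs2660 | 12 | G | A | 43 | 77 | 37 | 22 | 67 | 7.97E-04 | 29 | 0 | 0 | 0 | 2.17 | 1.11E-10 |
| LGMN | rs2236264 | 14 | T | C | 13 | 107 | 25 | 65 | 199 | 2.00E-03 | 8 | 0 | 0 | 2 | 1.28 | 1.34E-04 |
| EPHX2 | rs1042064 | 8 | C | T | 31 | 89 | 34 | 36 | 84 | 2.44E-03 | 20 | 2 | 0 | 4 | 2.87 | 2.77E-04 |
| MTHFD2 | rs12196 | 2 | A | G | 68 | 52 | 72 | 22 | 119 | 2.45E-03 | 25 | 0 | 0 | 6 | 1.18 | 2.81E-08 |
| RAB7L1 | rs823137 | 1 | G | A | 59 | 59 | 15 | 2 | 19 | 3.37E-03 | 7 | 1 | 0 | 17 | 1.07 | 3.18E-02 |
| PISD | rs8461 | 22 | T | C | 28 | 92 | 24 | 27 | 68 | 3.37E-03 | 10 | 0 | 0 | 15 | 1.41 | 1.91E-02 |
| ARTS-1 | rs26653 | 5 | C | G | 33 | 85 | 7 | 2 | 17 | 4.33E-03 | 17 | 0 | 3 | 3 | 1.43 | 6.06E-03 |
| ELL3 | rs2788 | 15 | G | A | 8 | 112 | 7 | 25 | 43 | 1.78E-02 | 6 | 0 | 0 | 2 | 1.33 | 3.94E-04 |
| CORO1C | rs2111211 | 12 | C | T | 54 | 64 | 55 | 8 | 67 | 2.93E-08 | 6 | 0 | 9 | 7 | 1.50 | 1.37E-01 |
| VPS39 | rs7086 | 15 | C | G | 12 | 108 | 34 | 64 | 119 | 1.04E-05 | 0 | 4 | 4 | 3 | 0.89 | 1.92E-01 |
| SNX6 (j) | rs9264 | 14 | C | T | 65 | 53 | 62 | 14 | 97 | 1.81E-04 | NA | NA | NA | NA | NA | NA |
| CD200 | rs1050572 | 3 | A | G | 6 | 114 | 11 | 30 | 72 | 3.43E-04 | 1 | 3 | 0 | 2 | 0.93 | 6.31E-02 |
| CXCL16 | rs1051007 | 17 | G | A | 7 | 111 | 12 | 32 | 60 | 4.98E-04 | 0 | 3 | 0 | 3 | 0.84 | 1.05E-02 |
| FVT1 | rs6810 | 18 | A | G | 57 | 63 | 56 | 21 | 86 | 6.42E-04 | 1 | 11 | 5 | 13 | 0.93 | 1.25E-01 |
| FLJ12788 | rs2301984 | 2 | G | A | 12 | 108 | 11 | 18 | 39 | 6.98E-04 | 0 | 3 | 1 | 6 | 1.01 | 8.48E-01 |
| PTPN12 | rs3750050 | 7 | G | A | 11 | 109 | 12 | 23 | 39 | 7.05E-04 | 2 | 4 | 0 | 2 | 0.91 | 8.98E-02 |
| GRN | rs5848 | 17 | T | C | 20 | 100 | 103 | 212 | 518 | 8.14E-04 | 9 | 3 | 1 | 4 | 1.64 | 5.62E-02 |
| TRAF3 | rs1131877 | 14 | C | T | 26 | 92 | 9 | 4 | 22 | 9.05E-04 | 1 | 4 | 9 | 4 | 0.95 | 5.55E-01 |
| SEC61A1 | rs1042907 | 3 | G | C | 89 | 31 | 103 | 10 | 155 | 9.11E-04 | 1 | 4 | 1 | 13 | 0.98 | 2.98E-01 |
| MGC5576 | rs6823 | 12 | C | G | 55 | 63 | 78 | 37 | 144 | 1.44E-03 | 0 | 4 | 12 | 15 | 0.91 | 2.89E-01 |
| STK33 | rs2289921 | 11 | G | C | 49 | 71 | 14 | 3 | 24 | 1.53E-03 | 9 | 7 | 1 | 10 | 1.15 | 1.50E-01 |
| LMAN1 | rs1127220 | 18 | C | T | 30 | 90 | 23 | 23 | 56 | 2.85E-03 | 0 | 0 | 15 | 0 | 0.99 | 9.30E-01 |
| GATM | rs1049518 | 15 | A | G | 39 | 77 | 38 | 30 | 96 | 3.49E-03 | 9 | 11 | 1 | 7 | 1.09 | 2.48E-01 |
| HPS4 | rs3747134 | 22 | G | A | 10 | 110 | 8 | 17 | 38 | 3.59E-03 | 0 | 0 | 7 | 2 | 0.85 | 2.53E-01 |
| ARPC5 | rs11755 | 1 | A | G | 51 | 69 | 54 | 31 | 100 | 4.44E-03 | 0 | 5 | 15 | 9 | 0.93 | 1.99E-01 |
| FXYD2 | rs11999 | 11 | C | A | 36 | 80 | 20 | 14 | 69 | 4.58E-03 | 10 | 7 | 0 | 6 | 4.91 | 9.38E-02 |
| MCM2 | rs893293 | 3 | C | T | 24 | 96 | 28 | 46 | 165 | 7.78E-03 | 3 | 0 | 3 | 14 | 1.19 | 1.66E-01 |
| PIK3R1 | rs3756668 | 5 | A | G | 56 | 64 | 14 | 3 | 23 | 8.21E-03 | 6 | 0 | 13 | 13 | 0.99 | 9.01E-01 |
| ZNF350 | rs2278414 | 19 | A | G | 16 | 104 | 6 | 7 | 17 | 8.30E-03 | 2 | 4 | 0 | 11 | 0.88 | 7.52E-02 |
| ACSL5 | rs8624 | 10 | C | T | 30 | 90 | 22 | 25 | 56 | 8.99E-03 | 6 | 0 | 4 | 13 | 1.12 | 9.60E-02 |
| CDK2 | rs2069398 | 12 | G | A | 108 | 12 | 59 | 0 | 128 | 9.39E-03 | 3 | 4 | 2 | 2 | 0.97 | 6.69E-01 |
| PPID | rs2070629 | 4 | C | T | 78 | 42 | 19 | 2 | 26 | 2.13E-02 | 0 | 20 | 0 | 5 | 0.80 | 6.79E-03 |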

a Genes in the upper part of the table (STEAP1 to ELL3) were validated by either qualitative or quantitative analysis of allelic expression data. The data points in bold correspond to the data fulfilling the validation criteria mentioned in the text. Genes from CORO1C to PPID did not fulfill the criteria for validation.

b Alleles are ordered based on the “expected high (H) expressor” and “expected low (L) expressor” as predicted by the EST-genotype comparison.

c Total number of EST sequence traces in UniGene; a maximum of two per library was included in the EST allele counts.
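
The per-library cap described in footnote c can be illustrated with a minimal Python sketch. The input layout (a sequence of `(library_id, allele)` pairs) and the function name `capped_allele_counts` are hypothetical, and the sketch assumes that the first two ESTs encountered per library are the ones kept; the table does not say how the two were chosen.

```python
from collections import Counter


def capped_allele_counts(est_calls, max_per_library=2):
    """Count alleles over ESTs, keeping at most `max_per_library` per library.

    `est_calls` is an iterable of (library_id, allele) pairs
    (hypothetical layout). Assumption: the first ESTs seen for a
    library are the ones that count toward its quota.
    """
    used_per_library = Counter()
    allele_counts = Counter()
    for library_id, allele in est_calls:
        if used_per_library[library_id] >= max_per_library:
            continue  # this library has already contributed its quota
        used_per_library[library_id] += 1
        allele_counts[allele] += 1
    return allele_counts
```

With three "C" ESTs from one library and one "G" plus one "C" from another, only two of the three ESTs from the first library contribute to the counts.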

d Number of heterozygous individuals showing overexpression of the predicted high allele as determined by consistent deviation in independent cDNA samples beyond the 95% confidence interval. If >80% of samples fulfilled the prediction, the data points fulfill the validation criteria and are shown in bold.

e Number of heterozygous individuals showing overexpression of the expected “low” allele as determined by consistent deviation in independent cDNA samples beyond the 95% confidence interval.

f Number of heterozygous individuals showing equal expression of alleles, as determined by both RNA samples falling within the 95% confidence interval observed for genomic DNA controls.

g Number of informative samples that did not fall into the preceding three categories in the allele ratio analysis and remained “unclassified.”
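
The four AI95 categories in footnotes d to g can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' procedure: the 95% interval is taken as mean ± 1.96 SD of the H/L ratios in genomic DNA controls, each individual contributes two or more independent cDNA ratios, and the ">80%" criterion is computed over classified (non-unknown) individuals. All function names are hypothetical.

```python
from statistics import mean, stdev


def dna_interval(dna_ratios, z=1.96):
    """Assumed 95% interval of H/L ratios in genomic DNA controls."""
    m, s = mean(dna_ratios), stdev(dna_ratios)
    return m - z * s, m + z * s


def classify_individual(rna_ratios, interval):
    """Assign one heterozygous individual to an AI95 category.

    `rna_ratios` holds the H/L ratios from independent cDNA samples.
    """
    low, high = interval
    if all(r > high for r in rna_ratios):
        return "H>L"      # consistent overexpression of the predicted high allele
    if all(r < low for r in rna_ratios):
        return "L>H"      # consistent overexpression of the predicted low allele
    if all(low <= r <= high for r in rna_ratios):
        return "L=H"      # all samples inside the DNA control interval
    return "unknown"      # inconsistent samples stay unclassified


def passes_ai95(labels, threshold=0.8):
    """Assumed criterion: more than 80% of classified individuals are H>L."""
    classified = [x for x in labels if x != "unknown"]
    if not classified:
        return False
    return classified.count("H>L") / len(classified) > threshold
```

Under these assumptions, a row with counts 5, 0, 0, 0 (as for STEAP1) passes, while a row with counts 6, 1, 2, 8 (as for PAX8) does not pass on the AI95 analysis alone.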

h The average ratio of the predicted high allele to the predicted low allele in RNA, divided by the same ratio in control heterozygous DNA samples, i.e., (H_RNA/L_RNA)/(H_DNA/L_DNA). If this ratio is >1 and the difference between the values in RNA and DNA is statistically significant (t-test), the candidate SNP is considered validated and is shown in bold.

i P-value (t-test, two-tailed) for difference between the H/L ratios in genomic DNA versus RNA.
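
The locus-specific quantities in footnotes h and i can be sketched as below. The sketch assumes that the "average ratio" is the mean of per-sample H/L ratios, and it uses Welch's unequal-variance form of the t statistic; the table states only that a two-tailed t-test was used, so the variant is an assumption. Function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def normalized_allele_ratio(rna_ratios, dna_ratios):
    """(H_RNA/L_RNA) / (H_DNA/L_DNA), using means of per-sample H/L ratios."""
    return mean(rna_ratios) / mean(dna_ratios)


def welch_t(rna_ratios, dna_ratios):
    """Welch t statistic and degrees of freedom for RNA vs DNA H/L ratios.

    Assumption: unequal variances; the source does not name the variant.
    """
    va = variance(rna_ratios) / len(rna_ratios)
    vb = variance(dna_ratios) / len(dna_ratios)
    t = (mean(rna_ratios) - mean(dna_ratios)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (
        va ** 2 / (len(rna_ratios) - 1) + vb ** 2 / (len(dna_ratios) - 1)
    )
    return t, df
```

Converting the statistic to a two-tailed P-value requires the t distribution, which the standard library does not provide; if SciPy is available, `scipy.stats.ttest_ind(rna_ratios, dna_ratios, equal_var=False)` returns the statistic and a two-sided P-value directly.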

j Both RNA and DNA samples showed (concordant) variation of allele ratios in SNX6; thus, unequal expression could be caused by DNA copy-number variation or by unidentified SNPs underlying the sequencing primers. These data were omitted from further analysis.