Sequence features in regions of weak and strong linkage disequilibrium

Table 3. Sequence composition of quartiles of the genome, defined according to the extent of linkage disequilibrium



Columns: Feature; Mean (±S.E.) for the genome covered by HapMap; Genome quantiles, defined using LD: Q1 (Low LD), Q2, Q3, Q4 (High LD); Trend
Basic sequence features
    GC bases 4080.3 (±2.3) 4349.6 4102.3 3964.8 3904.6 Decreases with LD
    Bases in CpG islands 72.0 (±0.7) 93.8 72.8 63.4 57.9 Decreases with LD
    Polymorphism (π) 10.1 (±0.02) 11.9 10.6 9.6 8.3 Decreases with LD
Genes and related features
    Known genes (per 1000 kb) 6.4 (±0.4) 6.6 6.1 6.2 6.7 U shaped
    Genic bases (exon, intron, UTR) 3854.7 (±16.9) 3764.8 3456.9 3603.1 4594.0 U shaped
    Coding bases 116.2 (±0.8) 112.4 104.1 112.0 136.2 U shaped
    Exonic bases 222.1 (±1.5) 225.8 204.2 214.6 243.9 U shaped
    Intronic bases 3678.0 (±16.6) 3584.5 3293.5 3432.5 4401.5 U shaped
    UTR (3′ and 5′) 105.9 (±0.8) 113.4 100.1 102.6 107.7 U shaped
Other features
    Bases in transcription factor binding sites 101.7 (±0.3) 110.1 107.1 98.7 90.8 Decreases with LD
    Bases in transcribed fragments (a) 251.3 (±2.4) 290.9 258.0 232.9 223.3 Decreases with LD
    Predictions of conserved elements (phastCons) 485.0 (±1.5) 520.5 499.1 465.9 454.5 Decreases with LD
    Conserved noncoding sequence 139.1 (±0.6) 164.6 154.3 132.1 105.7 Decreases with LD
    Identical base in alignment with M. musculus 2531.5 (±4.7) 2768.9 2678.6 2466.2 2212.2 Decreases with LD
    Identical base in alignment with R. norvegicus 2454.0 (±4.8) 2679.8 2600.5 2395.5 2140.4 Decreases with LD
Repeat content
    Total bases in repeats 4787.2 (±5.0) 4421.4 4642.0 4858.8 5226.7 Increases with LD
    Bases in LINE repeats 2090.7 (±4.6) 1649.7 1988.4 2235.9 2488.9 Increases with LD
    Bases in SINE repeats 1359.7 (±3.9) 1474.0 1307.8 1261.6 1395.3 U shaped
    Bases in LTR repeats 851.2 (±2.4) 808.3 872.1 895.0 829.2 ∩ shaped
    Bases in DNA repeats 302.6 (±0.7) 306.9 305.2 301.0 297.3 Decreases with LD
    Bases in simple repeats 89.0 (±0.3) 109.3 91.2 82.1 73.4 Decreases with LD
    Bases in low complexity repeats 57.6 (±0.1) 56.1 56.8 58.9 58.5 Increases with LD
    Bases in satellite repeats 20.8 (±1.3) 5.4 6.9 8.4 62.5 Increases with LD
    Bases in other repeats 14.0 (±0.2) 10.4 12.0 14.3 19.4 Increases with LD
  a Only applies to chromosomes 6, 7, 13, 14, 18, 19, 20, 21, 22, and X.

  Average base counts (per 10,000 bases) and standard errors are presented for each feature.
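
The per-10,000-base averages and standard errors reported in the table can be illustrated with a minimal sketch. It assumes, hypothetically, that the genome is divided into windows and that each window contributes one density value. The source does not state how windows were defined or how the standard errors were estimated, so the function name, its inputs, and the use of the standard error of the mean are all assumptions made for illustration.

```python
import math


def per_10kb_mean_and_se(feature_bases, window_lengths):
    """Mean feature density (bases per 10,000 bases) and its standard error.

    feature_bases[i]:  bases annotated with the feature in window i (hypothetical input)
    window_lengths[i]: total bases in window i (hypothetical input)
    """
    if len(feature_bases) != len(window_lengths) or len(feature_bases) < 2:
        raise ValueError("need at least two windows, with matching list lengths")

    # Scale each window's count to a density per 10,000 bases.
    densities = [10_000 * f / n for f, n in zip(feature_bases, window_lengths)]

    k = len(densities)
    mean = sum(densities) / k

    # Sample variance (k - 1 denominator), then standard error of the mean.
    variance = sum((d - mean) ** 2 for d in densities) / (k - 1)
    se = math.sqrt(variance / k)
    return mean, se
```

Under these assumptions, calling the function on the windows that fall in one LD quartile would give a value comparable in form to a single Q1 to Q4 cell, and calling it on all windows would give a value comparable in form to the "Mean (±S.E.)" column.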

This Article

  1. Genome Res. 15: 1519-1534