Segmental duplications and gene conversion: Human luteinizing hormone/chorionic gonadotropin β gene cluster

Table 1.

Summary statistics of sequence variation data for individual LHB/CGB genes and predicted hotspot region

Columns, left to right:
Analyzed region | Length (bp) | GC % | Pᵃ | Segregating sites (S): All | S: ≥10% MAFᶜ | S: Singletons | MSVᵈ: Type 1 | MSVᵈ: Type 2 | MSVᵈ: % | SNPs in dbSNP | πᵉ | θᶠ | Dᵍ | Haplotypesʰ: All, No. | Haplotypesʰ: MAF ≥10%, No. | Haplotypesʰ: MAF ≥10%, Carriers (%)

Individual genes
LHB 1541 63 E 20 15 2 0 7 35.0 12 0.00371 0.00254 1.35 14 2 65.9
H 17 10 1 0 6 35.3 12 0.00348 0.00246 1.28 7 2 84
M 20 10 5 1 7 40.0 11 0.00313 0.00295 0.20 16 2 50
All 27 8 1 10 55.0 12 0.00375 0.00301
CGB 1543 64 E 18 14 2 7 72.2 2 0.00406 0.00228 2.24* 15 2 74.5
H 20 16 2 8 8 80.0 2 0.00530 0.00304 2.39* 13 2 72
M 28 23 3 10 13 82.1 5 0.00666 0.00413 2.06* 14 3 60.8
All 35 6 11 16 77.1 5 0.00545 0.00401
CGB2 1521 63 E 8 4 2 2 3 62.5 2 0.00115 0.00103 0.29 10 4 92.6
H 12 3 6 2 6 66.7 2 0.00101 0.00176 -1.26 8 3 80
M 24 4 10 2 9 45.8 2 0.00237 0.00359 -1.12 12 2 56.5
All 34 15 5 13 52.9 2 0.00145 0.00384
CGB1 1510 63 E 5 4 0 0 1 20.0 1 0.00095 0.00065 0.99 6 3 82.9
H 12 4 5 2 1 25.0 1 0.00119 0.00177 -0.98 8 3 78
M 14 5 3 2 2 28.6 1 0.00157 0.00226 -0.95 9 3 75.9
All 20 5 2 5 35.0 1 0.00124 0.00239
CGB5 1661 64 E 13 7 2 0 8 61.5 2 0.00155 0.00153 0.03 15 2 73.4
H 13 6 5 1 5 46.2 2 0.00117 0.00175 -0.99 9 2 78
M 13 10 1 2 8 76.9 2 0.00213 0.00178 0.59 11 4 78.3
All 25 7 2 13 60.0 2 0.00165 0.00259
CGB7 2233 63 E 30 27 1 6 6 40.0 4 0.00552 0.00271 3.18** 31 2 47.8
H 29 27 0 4 10 48.3 4 0.00484 0.00300 2.04* 16 2 52
M 41 25 1 8 13 51.2 5 0.00429 0.00418 0.09 21 2 45.6
All 50 1 9 14 46.0 5 0.00550 0.00385
Predicted recombination hotspot regionᵇ
All region 8338 56 E 64 42 14 0.00212 0.00196 0.32
Intergenic CGB5/8 1879 52 E 17 12 2 0.00257 0.00219 0.61
CGB8 2156 62 E 15 6 7 1 5 42.9 6 0.00152 0.00178 -0.54 10 3 50
Intergenic CGB8/7 4303 54 E 32 24 5 0.00227 0.00208 0.35 19

  • a (E) Estonians (n=47); (H) Han (n=25); (M) Mandenka (n=23).

  • b Data for CGB8 are from 11 Estonian individuals, obtained by resequencing the potential recombination hotspot (see text for explanation).

  • c (MAF) Minor allele frequency.

  • d Multisite variation (Fredman et al. 2004): (MSV1) SNPs that are also represented as paralogous sequence variants (PSVs) among the duplicons. (MSV2) SNPs present in >1 duplicated gene.

  • e Estimate of nucleotide diversity per site from average pairwise difference among individuals.

  • f Estimate of nucleotide diversity per site from number of segregating sites (S).

  • g Significance level of Tajima's D statistic: (**) p < 0.01; (*) p < 0.05.

  • h Haplotype distribution as estimated by the PHASE algorithm (Stephens et al. 2001).
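The relationship between the S, π, θ, and D columns can be illustrated with a minimal sketch. It uses the textbook forms of Watterson's estimator and Tajima's D; the table does not state which software or exact formula produced the published values, so this is an illustration and not the authors' procedure. The function names are hypothetical, and the sample size is assumed to be two chromosomes per individual (n = 94 for the 47 Estonians).

```python
from math import sqrt


def harmonic_sums(n):
    """Return a1 = sum(1/i) and a2 = sum(1/i^2) for i = 1..n-1."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    return a1, a2


def watterson_theta_per_site(S, n, length):
    """Watterson's theta per site from S segregating sites,
    n sampled chromosomes, and a region of `length` bp."""
    a1, _ = harmonic_sums(n)
    return S / (a1 * length)


def tajimas_d(pi_per_site, S, n, length):
    """Tajima's D from per-site pi, S, n chromosomes, and region length."""
    a1, a2 = harmonic_sums(n)
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    k = pi_per_site * length  # mean pairwise differences over the region
    return (k - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))


# LHB in Estonians, values taken from the table row:
# S = 20, length = 1541 bp, pi = 0.00371.
# Assumption: 47 diploid individuals give n = 94 chromosomes.
n_chrom = 2 * 47
theta = watterson_theta_per_site(S=20, n=n_chrom, length=1541)
d = tajimas_d(pi_per_site=0.00371, S=20, n=n_chrom, length=1541)
print(f"theta = {theta:.5f}, D = {d:.2f}")
```

Under these assumptions the results come out close to the tabulated θ = 0.00254 and D = 1.35 for LHB in Estonians. Small discrepancies for other rows are possible because π is rounded in the table and the published analysis may have treated some variant sites differently.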

Genome Res. 15: 1535-1546