Estimates of numbers of constrained bases per CNG and extrapolation of the number of constrained bases to the whole genome
All values are numbers of constrained bases.

| Taxon | Per CNG: in CNGs | Per CNG: in flanksᵃ | Per CNG: total | Per chromosome or genome: CNGs, Chromosome 21ᵇ | Per chromosome or genome: CNGs, genomeᶜ | Per chromosome or genome: coding, genomeᵈ |
|---|---|---|---|---|---|---|
| Hominids | 45 | 76 | 121 | 2.7 × 10⁵ | 2.4 × 10⁷ | 1.9 × 10⁷ |
| Murids | 81 | 162 | 243 | 4.9 × 10⁵ | 4.9 × 10⁷ | 2.3 × 10⁷ |
a The number of bases is calculated by summing the contributions from the 1000-base flanking segments 5′ and 3′ of each CNG according to No. of constrained bases = $\sum_i C_i f_i$, where $C_i$ is the average constraint and $f_i$ is the average fraction of the I bases in the 1000-base segment, averaged over the set of CNG loci
b The set of 2262 CNGs on the long arm of human Chromosome 21
c Based on predicted numbers of CNGs from regression of CNG density on gene density for Chromosome 21. See text for details
d Assumes that there are 24,000 mammalian protein-coding genes of average length 1500 bases, that three-quarters of nucleotide substitutions in a gene lead to an amino acid substitution, and that constraint levels at amino acid sites are 0.69 and 0.84 for hominids and murids, respectively
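
The arithmetic behind the coding column can be checked directly from the assumptions listed in footnote d. The following is a minimal sketch, not the authors' code: the constant and function names are hypothetical, and the estimate is assumed to be the simple product of gene number, average gene length, the fraction of substitutions that change an amino acid, and the constraint level.

```python
# Values below are taken from footnote d; all names are illustrative only.
N_GENES = 24_000               # assumed number of mammalian protein-coding genes
MEAN_GENE_LENGTH = 1_500       # assumed average coding length in bases
FRACTION_AA_CHANGING = 0.75    # share of substitutions that change an amino acid
CONSTRAINT = {"Hominids": 0.69, "Murids": 0.84}  # constraint at amino acid sites


def constrained_coding_bases(constraint: float) -> float:
    """Estimate constrained coding bases in the genome as a simple product."""
    return N_GENES * MEAN_GENE_LENGTH * FRACTION_AA_CHANGING * constraint


estimates = {taxon: constrained_coding_bases(c) for taxon, c in CONSTRAINT.items()}

for taxon, value in estimates.items():
    # Two significant figures, matching the precision used in the table.
    print(f"{taxon}: {value:.1e}")
```

Under these assumptions the products are about 1.86 × 10⁷ for hominids and 2.27 × 10⁷ for murids, which round to the 1.9 × 10⁷ and 2.3 × 10⁷ reported in the coding column.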