Evolutionary constraints in conserved nongenic sequences of mammals

Table 4.

Estimates of numbers of constrained bases per CNG and extrapolation of the number of constrained bases to the whole genome

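| Taxon | Constrained bases per CNG: in CNGs | Constrained bases per CNG: in flanksᵃ | Constrained bases per CNG: total | Constrained bases per chromosome or genome: CNGs, Chromosome 21ᵇ | Constrained bases per chromosome or genome: CNGs, genomeᶜ | Constrained bases per chromosome or genome: coding, genomeᵈ |
|---|---|---|---|---|---|---|
| Hominids | 45 | 76 | 121 | 2.7 × 10⁵ | 2.4 × 10⁷ | 1.9 × 10⁷ |
| Murids | 81 | 162 | 243 | 4.9 × 10⁵ | 4.9 × 10⁷ | 2.3 × 10⁷ |
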
  • a The number of bases is calculated by summing over the contributions from the 1000-base flanking segments 5′ and 3′ of each CNG according to No. of constrained bases = Σᵢ Cᵢfᵢ, where Cᵢ is the average constraint and fᵢ is the average fraction of the i bases in the 1000-base segment, averaged over the set of CNG loci

  • b The set of 2262 CNGs on the long arm of human Chromosome 21

  • c Based on predicted numbers of CNGs from regression of CNG density on gene density for Chromosome 21. See text for details

  • d Assumes that there are 24,000 mammalian protein-coding genes of average length 1500 bases, that three-quarters of nucleotide substitutions in a gene lead to an amino acid substitution, and that constraint levels at amino acid sites are 0.69 and 0.84 for hominids and murids, respectively
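
The arithmetic behind footnote d can be checked with a short sketch. It multiplies the four quantities stated in the footnote (number of genes, average gene length, fraction of substitutions that change an amino acid, and constraint at amino acid sites). The function and variable names are illustrative assumptions and do not come from the article; only the numerical inputs are taken from the footnote.

```python
def coding_constrained_bases(n_genes, mean_length, nonsyn_fraction, constraint):
    """Constrained coding bases as the product of the four footnote-d quantities."""
    return n_genes * mean_length * nonsyn_fraction * constraint


# Inputs as stated in footnote d; the names are hypothetical.
N_GENES = 24_000          # mammalian protein-coding genes
MEAN_LENGTH = 1_500       # average gene length in bases
NONSYN_FRACTION = 0.75    # substitutions that lead to an amino acid change
CONSTRAINT = {"Hominids": 0.69, "Murids": 0.84}  # constraint at amino acid sites

estimates = {
    taxon: coding_constrained_bases(N_GENES, MEAN_LENGTH, NONSYN_FRACTION, c)
    for taxon, c in CONSTRAINT.items()
}

for taxon, value in estimates.items():
    # Two significant figures, matching the precision used in the table
    print(f"{taxon}: {value:.1e}")
```

Under these inputs the products are about 1.86 × 10⁷ for hominids and 2.27 × 10⁷ for murids, which round to the 1.9 × 10⁷ and 2.3 × 10⁷ reported in the "Coding, genome" column.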

This Article

  1. Genome Res. 15: 1373-1378
