Comparative sequence analyses reveal rapid and divergent evolutionary changes of the WFDC locus in the primate lineage



Figure 1.

The WFDC locus in human and other primates. (A) The long-range organization of the WFDC locus on human chromosome 20q13 is depicted (oriented relative to the centromere [CEN] and telomere [TEL]) and consists of the two indicated subloci separated by a 215-kb region containing unrelated genes (gray rectangle). The 145-kb centromeric WFDC sublocus contains the indicated six genes and four pseudogenes. The 322-kb telomeric WFDC sublocus contains the indicated 10 genes and one pseudogene. (B) The long-range organization of the centromeric WFDC sublocus in human and 12 nonhuman primates is depicted, as deduced by comparative analyses of the orthologous genomic sequences (see Table 2). Also shown are the two genes immediately flanking this sublocus on each side. Solid arrows represent small serine protease inhibitor genes and trappin (WFDC and PI3, respectively); open arrows represent semenogelin genes (SEMG); hatched arrows represent nonrelated neighboring genes; gray arrows with an X represent WFDC, trappin, and SEMG pseudogenes. The direction of all arrows reflects the gene’s transcriptional orientation. //, Gap between sequenced BACs (clone gaps in Table 2). “St” under a gene indicates the presence of a premature stop codon; “Fs” under a gene indicates the presence of a frameshift; “D” under a gene indicates that the corresponding sequence was retrieved from public databases rather than generated as part of this study (specifically, for chimpanzee, macaque, and orangutan). Note that squirrel monkey has two SEMG1 genes, labeled as “1a” and “1b” below each, and that owl monkey has a single unique SEMG gene (a SEMG1-SEMG2 chimera, which is labeled below the gene with an *). The two boxed areas reflect the WFDC12-to-ΨPI3 and PI3-to-ΨWFDC15d paralogous duplicons, respectively. (C) The G + C content of the human centromeric WFDC sublocus is shown. The Y-axis is scaled to reflect a minimum of 43% and maximum of 70% G + C content. Note the alternating genomic segments of low (I and III) and high (II and IV) G + C content. Also shown are the positions of CpG islands (defined as >200-bp stretches of sequence with a G + C content of >50% and an observed CpG to expected CpG ratio of >0.6). Note that panel C is drawn to scale relative to panel B.
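
The CpG island criteria quoted in the caption can be illustrated with a minimal Python sketch. It assumes the widely used observed/expected formula (number of CpG dinucleotides × sequence length, divided by the product of the C and G counts). The caption does not state which formula or software the authors used, so that formula and all function names here are hypothetical illustrations, not the authors' method.

```python
def gc_fraction(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio.

    Assumed formula (not stated in the caption):
        (count of "CG" * sequence length) / (count of C * count of G)
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


def meets_cpg_island_criteria(seq, min_len=200, min_gc=0.5, min_ratio=0.6):
    """Apply the three thresholds from the caption to one candidate stretch:
    longer than 200 bp, G + C content above 50%, obs/exp CpG above 0.6."""
    return (
        len(seq) > min_len
        and gc_fraction(seq) > min_gc
        and cpg_obs_exp(seq) > min_ratio
    )
```

The sketch tests a single candidate stretch. Locating islands along a sublocus would additionally require a scanning strategy (for example, sliding windows), which the caption does not specify.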

This Article

  1. Genome Res. 17: 276-286
