Word frequency analysis reveals enrichment of dinucleotide repeats on the human X chromosome and [GATA]n in the X escape region

Figure 6.

Physical distribution of [GATA]n on the X chromosome, based on NCBI release B35.1 unmasked sequence. The histogram on the left shows the distribution of [GATA]n/[TATC]n-derived word frequencies in 100-kb bins along the chromosome. On the right is a detailed map of the location and length of [GATA]n/[TATC]n tandem repeats in the distal p arm of the X chromosome. A single contiguous repeat is defined as being at least 8 bp long. An interruption of no more than 4 bp (one repeat unit) of incorrect sequence was allowed if it was followed by at least two more correct units. The locations of genes on the + and − strands are also indicated. The region highlighted in yellow marks XE in both graphs.
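
The repeat definition in the caption can be illustrated with a short sketch. This is not the authors' code; it is one possible reading of the stated rules, and the function name `find_repeats` and its parameters are hypothetical. It assumes that a "correct" stretch is a run of at least two in-phase copies of the unit (8 bp for GATA), and that two such runs are joined when no more than 4 bp of other sequence lies between them. Leading or trailing partial units are ignored, which the caption does not specify.

```python
import re


def find_repeats(seq, unit="GATA", min_units=2, max_gap=4):
    """Return half-open (start, end) intervals of tandem repeats of `unit`.

    Assumed reading of the caption: a pure run needs at least `min_units`
    copies of the unit; a gap of at most `max_gap` bp is tolerated when it
    is followed by another pure run of at least `min_units` copies.
    """
    pure_run = re.compile("(?:%s){%d,}" % (re.escape(unit), min_units))
    merged = []
    for match in pure_run.finditer(seq.upper()):
        start, end = match.span()
        if merged and start - merged[-1][1] <= max_gap:
            # Short interruption followed by a valid run: extend the repeat.
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    # Toy sequence: 3 units, a 4-bp interruption, 2 units,
    # then a 5-bp spacer and a separate 2-unit repeat.
    toy = "CC" + "GATA" * 3 + "GTTA" + "GATA" * 2 + "CCCCC" + "GATA" * 2
    for start, end in find_repeats(toy):
        print(start, end, end - start)
```

Calling the same function with `unit="TATC"` covers the reverse-complement repeat, so both strands can be mapped by running it twice on the same sequence.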

This Article

  1. Genome Res. 16: 477-484
