LETTER

Word frequency analysis reveals enrichment of dinucleotide repeats on the human X chromosome and [GATA]n in the X escape region

Department of Cell Biology, University of Massachusetts Medical School, Worcester, Massachusetts 01655, USA
Published March 13, 2006. Vol 16 Issue 4, pp. 477-484. https://doi.org/10.1101/gr.4627606

Abstract

Most of the human genome encodes neither protein nor known functional RNA, yet available approaches to seek meaningful information in the “noncoding” sequence are limited. The unique biology of the X chromosome, one of which is silenced in mammalian females, can yield clues into sequence motifs involved in chromosome packaging and function. Although autosomal chromatin has some capacity for inactivation, evidence indicates that sequences enriched on the X chromosome render it fully competent for silencing, except in specific regions that escape inactivation. Here we have used a linguistic approach by analyzing the frequency and distribution of nine base-pair genomic “words” throughout the human genome. Results identify previously unknown sequence differences on the human X chromosome. Notably, the dinucleotide repeats [AT]n, [AC]n, and [AG]n are significantly enriched across the X chromosome compared with autosomes. Moreover, a striking enrichment (>10-fold) of [GATA]n is revealed throughout the 10-Mb segment at Xp22 that escapes inactivation, and is confirmed by fluorescence in situ hybridization. A similar enrichment is found in other eutherian genomes. Our findings clearly demonstrate sequence differences relevant to the novel biology and evolution of the X chromosome. Furthermore, they implicate simple sequence repeats, linked to gene regulation and unusual DNA structures, in the regulation and formation of facultative heterochromatin. Results suggest a new paradigm whereby a regional escape from X inactivation is due to the presence of elements that prevent heterochromatinization, rather than the lack of other elements that promote it.
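The word-frequency idea described in the abstract can be illustrated with a minimal sketch. It counts overlapping nine-base-pair words in two sequences and reports how much more frequent each word is in one than in the other. The function names, the pseudocount, and the toy sequences are assumptions made for illustration only; the abstract does not specify how the authors computed or tested enrichment, so this should not be read as their method.

```python
from collections import Counter

K = 9  # word length stated in the abstract


def count_words(seq, k=K):
    """Count overlapping k-bp words, skipping windows with non-ACGT bases."""
    seq = seq.upper()
    valid = set("ACGT")
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= valid:
            counts[word] += 1
    return counts


def enrichment(target_counts, background_counts, pseudocount=1.0):
    """Per-word ratio of relative frequency in target versus background.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for words that are absent from the background.
    """
    t_total = sum(target_counts.values())
    b_total = sum(background_counts.values())
    if t_total == 0 or b_total == 0:
        raise ValueError("both sequences must contain at least one valid word")
    ratios = {}
    for word in set(target_counts) | set(background_counts):
        t_freq = (target_counts[word] + pseudocount) / t_total
        b_freq = (background_counts[word] + pseudocount) / b_total
        ratios[word] = t_freq / b_freq
    return ratios


if __name__ == "__main__":
    # Toy sequences only; real input would be chromosome-scale FASTA data.
    escape_like = "GATA" * 10 + "ACGTTGCAACGT"
    background_like = "ACGTTGCA" * 8
    ratios = enrichment(count_words(escape_like), count_words(background_like))
    for word, ratio in sorted(ratios.items(), key=lambda kv: -kv[1])[:4]:
        print(word, round(ratio, 2))
```

On real data the target and background would be, for example, the X chromosome and the pooled autosomes, and a raw ratio like this one would need a statistical test before any word could be called significantly enriched.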
