Genome Res. 14:67-78, 2004
©2004 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/04 $5.00

Letter

Ordered Partitioning Reveals Extended Splice-Site Consensus Information

1 Department of Biology, Wesleyan University, Middletown, Connecticut 06459, USA
2 Department of Mathematics and Computer Science, Wesleyan University, Middletown, Connecticut 06459, USA

Using recently available cDNA and genomic data (Berkeley Drosophila Genome Project; http://www.fruitfly.org), we computed a large sample of 10,057 Drosophila splice sites. An information-theoretic analysis of the nucleotide sequences adjacent to these splice sites showed a strong correlation between the sizes of introns and exons and the levels of information, which is a measure of sequence conservation. The strong correlation permitted us to determine extensive consensus sequences at the donor and acceptor sites of longer introns. These sequences were further refined and extended by examining the information in regions around splice sites that only partially matched the consensus. The correlation between length and information provided the basis for determining alternative consensus arrangements associated with shorter introns, as well as general base-composition preferences that likely promote spliceosome function. We also observed a correlation between information near splice sites and the lengths of nonadjacent introns, indicating that there are long-range effects spanning multiple introns. The ordered partitioning approach used in this analysis may become increasingly useful as large genomic data sets become available.
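
The abstract does not spell out the computation, so the following is only a minimal sketch of the general idea: sort splice sites by intron length, split them into contiguous groups, and compute the per-position information (in bits) of the aligned sequences in each group. It assumes a uniform background over the four nucleotides and applies no small-sample correction; the function names and the `(intron_length, sequence)` input layout are hypothetical and not taken from the article.

```python
import math
from collections import Counter


def position_information(seqs):
    """Per-position information in bits for equal-length DNA sequences.

    Assumes a uniform background (maximum 2 bits per position) and
    applies no small-sample correction.
    """
    length = len(seqs[0])
    info = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        total = sum(counts.values())
        # Shannon entropy of the observed nucleotide frequencies at position i.
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        info.append(2.0 - entropy)
    return info


def ordered_partitions(sites, n_bins):
    """Sort (intron_length, sequence) pairs by length and split them
    into at most n_bins contiguous groups of roughly equal size."""
    ordered = sorted(sites, key=lambda site: site[0])
    size = math.ceil(len(ordered) / n_bins)
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def information_by_length(sites, n_bins):
    """Total information (summed over positions) for each length-ordered group."""
    totals = []
    for group in ordered_partitions(sites, n_bins):
        seqs = [seq for _, seq in group]
        totals.append(sum(position_information(seqs)))
    return totals
```

Calling `information_by_length(sites, n_bins)` on a list of aligned splice-site windows gives one total per length class, which is the kind of quantity that could then be compared against intron or exon size.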
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1715204.
3 Corresponding author. [Supplemental material is available online at www.genome.org.]