LETTER

Evaluation of regulatory potential and conservation scores for detecting cis-regulatory modules in aligned mammalian genome sequences

    • 1 Center for Comparative Genomics and Bioinformatics, Huck Institutes of Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
    • 2 Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
    • 3 Department of Computer Science and Engineering, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
    • 4 Department of Statistics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
    • 5 Department of Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
    • 6 National Human Genome Research Institute, Rockville, Maryland 20852, USA
Published July 15, 2005. Vol 15 Issue 8, pp. 1051-1060. https://doi.org/10.1101/gr.3642605

Abstract

Techniques of comparative genomics are being used to identify candidate functional DNA sequences, and objective evaluations are needed to assess their effectiveness. Different analytical methods score distinctive features of whole-genome alignments among human, mouse, and rat to predict functional regions. We evaluated three of these methods for their ability to identify the positions of known regulatory regions in the well-studied HBB gene complex. Two methods, multispecies conserved sequences and phastCons, quantify levels of conservation to estimate a likelihood that aligned DNA sequences are under purifying selection. A third function, regulatory potential (RP), measures the similarity of patterns in the alignments to those in known regulatory regions. The methods can correctly identify 50%–60% of noncoding positions in the HBB gene complex as regulatory or nonregulatory, with RP performing better than the other two methods. When evaluated by the ability to discriminate genomic intervals, RP reaches a sensitivity of 0.78 and a true discovery rate of ∼0.6. The performance is better on other reference sets; both phastCons and RP scores can capture almost all regulatory elements in those sets along with ∼7% of the human genome.
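The interval-level figures quoted above (sensitivity and true discovery rate) can be illustrated with a small sketch. It assumes the conventional definitions: sensitivity as the fraction of known regulatory intervals overlapped by at least one prediction, and true discovery rate as the fraction of predicted intervals that overlap a known regulatory interval. The abstract does not spell out its exact overlap criterion, so the rule used here is an assumption. The function name `interval_metrics` and all coordinates are hypothetical and are not data from the study.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open (start, end) coordinates


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two half-open intervals share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def interval_metrics(predicted: List[Interval],
                     known: List[Interval]) -> Tuple[float, float]:
    """Compute (sensitivity, true discovery rate) at the interval level.

    Assumed definitions (not taken from the paper):
      sensitivity         = known intervals hit by a prediction / all known
      true discovery rate = predictions hitting a known interval / all predictions
    """
    known_hit = sum(any(overlaps(k, p) for p in predicted) for k in known)
    pred_hit = sum(any(overlaps(p, k) for k in known) for p in predicted)
    sensitivity = known_hit / len(known) if known else float("nan")
    tdr = pred_hit / len(predicted) if predicted else float("nan")
    return sensitivity, tdr


# Toy example with made-up coordinates.
known_regions = [(100, 200), (500, 600), (900, 1000)]
predicted_regions = [(150, 250), (550, 580), (700, 750), (1200, 1300)]

sens, tdr = interval_metrics(predicted_regions, known_regions)
print(f"sensitivity = {sens:.2f}, true discovery rate = {tdr:.2f}")
```

In this toy case two of the three known intervals are recovered and two of the four predictions are correct, so raising a score threshold would typically trade sensitivity against true discovery rate.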
