
A machine learning algorithm (gkm-SVM) accurately predicts cis-regulatory elements. (A) An example of heart DNase-seq signals (raw data) and peaks (MACS2) at the GATA4 locus across multiple human samples. (B) A genome-wide heat map of DNase-seq read densities in 1000-bp windows centered on heart DHSs, shown for 1000 randomly sampled regions. Regions with a DHS observed in at least one sample were grouped by the configuration of DHS peaks across the five samples. (C) ROC curves of gkm-SVM models for the five replicates, evaluated against reserved test sets. (D) Comparison of the fraction of predicted regions overlapping DHSs (precision) and the fraction of DHSs overlapping predicted regions (recall).
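
The overlap-based metrics in panel D can be illustrated with a minimal sketch. It assumes the standard definitions (precision as the fraction of predicted regions overlapping a DHS, recall as the fraction of DHSs overlapping a predicted region) and half-open genomic intervals. The function names and example coordinates are hypothetical, and this is not necessarily how the authors computed the values.

```python
from typing import List, Tuple

# (chromosome, start, end), half-open coordinates; an assumed representation
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two intervals share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_overlapping(query: List[Interval], target: List[Interval]) -> float:
    """Fraction of query intervals that overlap at least one target interval."""
    if not query:
        return 0.0
    hits = sum(any(overlaps(q, t) for t in target) for q in query)
    return hits / len(query)


def precision_recall(predicted: List[Interval], dhs: List[Interval]) -> Tuple[float, float]:
    """Overlap-based precision and recall of predicted regions against observed DHSs."""
    precision = fraction_overlapping(predicted, dhs)  # predicted regions supported by a DHS
    recall = fraction_overlapping(dhs, predicted)     # DHSs recovered by a prediction
    return precision, recall


# Made-up coordinates for illustration only
predicted = [("chr8", 100, 200), ("chr8", 500, 600)]
dhs = [("chr8", 150, 250), ("chr8", 1000, 1100), ("chr8", 2000, 2100), ("chr1", 100, 200)]
precision, recall = precision_recall(predicted, dhs)
```

The pairwise scan is quadratic in the number of regions; for genome-wide region sets, a sorted sweep or an interval tree would be the more practical choice.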











