
(A) Overview of the methodology. The yellow box shows the main classifier, which takes two sets of sequences as input: enhancers and controls. The classifier is first used to select a homogeneous set of enhancers and is then applied again to discriminate between the selected set and the control sequences. (B) Distribution of positive sequences predicted correctly. Almost one-third of the sequences are consistently predicted as positives (>50% of the time); the red dotted line marks this threshold. Sequences to the right of the line were considered homogeneous. (C) ROC curves for five different methods on the selected homogeneous sets, comparing our method with four state-of-the-art methods. Our method achieves the largest area under the ROC curve (0.92; shaded in gray).
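The selection step in panels (A) and (B) can be illustrated with a minimal sketch. It assumes that each enhancer is scored in several classification runs and that a sequence counts as homogeneous when it is called positive in strictly more than 50% of the runs in which it was scored. The caption does not say how the repeated predictions were produced, so the per-run calls, the function names, and the sequence IDs below are hypothetical and serve only as illustration.

```python
from typing import Dict, List, Optional


def positive_rate(calls: List[Optional[bool]]) -> Optional[float]:
    """Fraction of scored runs in which a sequence was called positive.

    A None entry marks a run in which the sequence was not scored
    (an assumption; the caption does not describe the run design).
    """
    scored = [c for c in calls if c is not None]
    if not scored:
        return None  # never scored, so no rate can be computed
    return sum(scored) / len(scored)


def select_homogeneous(
    calls_by_seq: Dict[str, List[Optional[bool]]],
    threshold: float = 0.5,
) -> List[str]:
    """Keep sequences called positive in more than `threshold` of runs."""
    selected = []
    for seq_id, calls in calls_by_seq.items():
        rate = positive_rate(calls)
        if rate is not None and rate > threshold:
            selected.append(seq_id)
    return selected


# Hypothetical toy data: per-run positive calls for three enhancers.
calls = {
    "enh_1": [True, True, None, True],    # positive in 3 of 3 scored runs
    "enh_2": [True, False, False, None],  # positive in 1 of 3 scored runs
    "enh_3": [True, False, True, False],  # exactly 50%, not above threshold
}
selected = select_homogeneous(calls)
print(selected)
```

In this sketch the selected IDs would then form the positive set for the second classification round against the control sequences.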











