Alkes L. Price; Eleazar Eskin; Pavel A. Pevzner

Whole-genome analysis of Alu repeat elements reveals complex evolutionary history

Figure 1.

Applicability of k-means clustering to different kinds of clustering problems. Disjoint clusters of similar size are easily identified (A). Small subfamilies nested inside large subfamilies, a typical scenario in Alu repeat subfamilies, are not easily identified, because k-means tends to split off a larger cluster (B) instead of identifying the nested subfamily (C).
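
The tendency described in panels B and C can be reproduced on toy data. The following is a minimal sketch, not the authors' data or method: it runs plain Lloyd-style k-means with k = 2 on hypothetical two-dimensional points, where `large_family` is an elongated grid and `nested_subfamily` is a tight group placed inside it. All names and coordinates are assumptions made for illustration. One initial centroid is deliberately placed on the nested subfamily, which is the most favorable start for recovering it.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, initial_centroids, max_iter=100):
    """Plain Lloyd-style k-means from the given initial centroids.

    Returns (labels, centroids); iteration stops once the assignment
    of points to centroids no longer changes.
    """
    centroids = list(initial_centroids)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = mean_point(members)
    return labels, centroids


# Hypothetical toy data: a large elongated family (40 points) and a
# small, tight subfamily (5 points) nested inside it.
large_family = [(float(x), float(y)) for x in range(20) for y in (0, 1)]
nested_subfamily = [(5.0, 0.5), (5.1, 0.5), (4.9, 0.5), (5.0, 0.6), (5.0, 0.4)]
points = large_family + nested_subfamily

# Favorable start: one centroid sits exactly on the nested subfamily.
labels, centroids = kmeans(points, [(5.0, 0.5), (14.0, 0.5)])

nested_labels = set(labels[len(large_family):])
large_in_nested_cluster = sum(
    1 for lab in labels[:len(large_family)] if lab in nested_labels
)
print("clusters containing nested points:", nested_labels)
print("large-family points in the same cluster:", large_in_nested_cluster)
```

On this toy data the nested subfamily is not isolated even with the favorable start: the cluster that contains it also absorbs half of the large family, which corresponds to the split in panel B rather than the desired result in panel C.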