OrthoMCL: Identification of Ortholog Groups for Eukaryotic Genomes



Figure 2

Illustration of sequence relationships and similarity matrix construction. Dotted arrows represent “recent” paralogy (duplication subsequent to speciation); solid arrows represent orthology. The upper right half of the matrix contains initial weights, calculated as the average –log10(P-value) from pairwise WU-BLASTP similarities. The lower left half contains the corrected weights supplied to the MCL algorithm: the edge weight wij connecting each pair of sequences is divided by Wij/W, where W represents the average weight among all ortholog (underlined) and “recent” paralog (italicized) pairs, and Wij represents the average edge weight among all ortholog pairs from species i and j. The net result of this normalization is to correct for systematic differences in comparisons between two species (e.g., differences attributable to nucleotide composition bias) and, when i = j, to minimize the impact of “recent” paralogs (duplication within a given species) on the clustering of cross-species orthologs.
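
The normalization described in the caption can be illustrated with a minimal Python sketch. This is not the OrthoMCL implementation: the function name `normalize_weights`, the dictionary layout, and the sequence and species labels are hypothetical, and the sketch assumes that when i = j the average Wii is taken over the “recent” paralog pairs within species i. The example weights are made up for illustration.

```python
from collections import defaultdict


def normalize_weights(edges, species_of):
    """Divide each edge weight w_ij by (W_ij / W).

    edges:      dict mapping (seq_a, seq_b) -> initial weight
                (e.g. average -log10 P-value for the pair)
    species_of: dict mapping sequence id -> species label
    W is the mean over all edges; W_ij is the mean over edges joining
    species i and j (assumption: for i == j, the within-species pairs).
    """
    if not edges:
        return {}

    # W: average weight over all ortholog and "recent" paralog pairs.
    overall_mean = sum(edges.values()) / len(edges)

    # Group edge weights by the unordered pair of species they connect.
    by_species_pair = defaultdict(list)
    for (a, b), weight in edges.items():
        key = tuple(sorted((species_of[a], species_of[b])))
        by_species_pair[key].append(weight)

    # W_ij: average weight for each species pair.
    pair_mean = {k: sum(v) / len(v) for k, v in by_species_pair.items()}

    # Corrected weight: w_ij / (W_ij / W).
    corrected = {}
    for (a, b), weight in edges.items():
        key = tuple(sorted((species_of[a], species_of[b])))
        corrected[(a, b)] = weight / (pair_mean[key] / overall_mean)
    return corrected


# Hypothetical example: two sequences from species X, one from species Y.
example_species = {"x1": "X", "x2": "X", "y1": "Y"}
example_edges = {
    ("x1", "y1"): 100.0,  # ortholog pair
    ("x2", "y1"): 50.0,   # ortholog pair
    ("x1", "x2"): 150.0,  # "recent" paralog pair within species X
}
corrected = normalize_weights(example_edges, example_species)
```

In this toy input the within-species edge starts out heavier than either cross-species edge. After correction it is scaled down toward the overall average, while the cross-species edges are scaled up, which is the effect the caption attributes to the i = j case.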

This Article

  1. Genome Res. 13: 2178-2189
