Markup | Genome Research

Figure 1

Schematic illustration of protein–protein interologs and the mapping methods. (A) Original interolog mapping. Theoretically, A-A′ and B-B′ should be orthologs between the two organisms. Operationally, only best-matching homologs are required. (B) Generalized interolog mapping. Proteins A₁′, A₂′, A₃′, and A₄′ in the target organism are all homologs of protein A in the source organism. These proteins form the A′ family. Likewise, protein B's homologs (B₁′, B₂′, B₃′) form the B′ family in the target organism. If protein A is known to interact with protein B, the A′ family and the B′ family are predicted to be interacting families. All possible pairs between these two families are considered generalized interologs (shown as black, dashed lines with arrows). (C) Comparison with the gold standards. After the interactions in the source organism are mapped onto the target organism, the predictions (i.e., generalized interologs) are compared with the gold standard positives and negatives. True positives are the predictions that overlap with the gold standard positives. False positives are those that overlap with the gold standard negatives. (D) Schematic illustration of protein–DNA interologs and regulogs. In the source organism, TF A binds to its binding site (S_A) and regulates the downstream gene B. To perform the regulog mapping, TF A′ in the target organism needs to be the ortholog of A. Proteins B and B′ should also be orthologs. The DNA sequence upstream of gene B′ needs to contain the same motif (S_A′) as S_A. In practice, however, TF A and A′ need only share ≥30% identity. The interaction between TF A′ and S_A′ is the protein–DNA interolog of that between A and S_A. The regulatory relationships A → B and A′ → B′ are regulogs.
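
The mapping in panel B and the comparison in panel C can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the dictionary-based homolog table, and the toy gold standard sets are all hypothetical, and interactions are treated as unordered pairs.

```python
from itertools import product


def generalized_interologs(source_interactions, homologs):
    """Map source-organism interactions onto the target organism.

    source_interactions: iterable of (A, B) protein pairs (hypothetical input).
    homologs: dict from a source protein to the set of its target homologs.
    Returns a set of unordered target pairs (frozensets).
    """
    predictions = set()
    for a, b in source_interactions:
        # Every pair between the A' family and the B' family is a prediction.
        for a_t, b_t in product(homologs.get(a, ()), homologs.get(b, ())):
            predictions.add(frozenset((a_t, b_t)))
    return predictions


def compare_with_gold_standard(predictions, gold_positives, gold_negatives):
    """Return (true positives, false positives) as defined in panel C."""
    true_positives = predictions & gold_positives
    false_positives = predictions & gold_negatives
    return true_positives, false_positives


# Toy data mirroring panel B: four homologs of A, three homologs of B.
homologs = {
    "A": {"A1'", "A2'", "A3'", "A4'"},
    "B": {"B1'", "B2'", "B3'"},
}
predictions = generalized_interologs([("A", "B")], homologs)

# Illustrative gold standards only; not data from the paper.
gold_positives = {frozenset(("A1'", "B1'"))}
gold_negatives = {frozenset(("A2'", "B3'"))}
true_positives, false_positives = compare_with_gold_standard(
    predictions, gold_positives, gold_negatives
)
```

With the toy families above, the sketch yields 4 × 3 = 12 generalized interologs. Predictions that fall in neither gold standard set are simply left unscored, which matches the caption's definition of true and false positives as overlaps.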