Figure 1

Schematic illustration of protein–protein interologs and the mapping methods. (A) Original interolog mapping. Theoretically, A–A′ and B–B′ should be orthologs between the two organisms. Operationally, only best-matching homologs are required. (B) Generalized interolog mapping. Proteins A1′, A2′, A3′, and A4′ in the target organism are all homologs of protein A in the source organism. These proteins form the A′ family. Likewise, protein B's homologs (B1′, B2′, B3′) form the B′ family in the target organism. If we know that protein A interacts with B, we can predict that the A′ family and the B′ family are interacting families. All possible pairs between these two families are considered generalized interologs (shown as black, dashed lines with arrows). (C) Comparison with the gold standards. After the interactions in the source organism are mapped onto the target organism, the predictions (i.e., generalized interologs) are compared with the gold standard positives and negatives. True positives are the predictions that overlap with the gold standard positives. False positives are those that overlap with the gold standard negatives. (D) Schematic illustration of protein–DNA interologs and regulogs. In the source organism, TF A binds to its binding site (SA) and regulates the downstream gene B. To perform the regulog mapping, TF A′ in the target organism needs to be the ortholog of A. Proteins B and B′ should also be orthologs. The DNA sequence upstream of gene B′ needs to contain the same motif (SA′) as SA. In practice, however, TF A and A′ need only share ≥30% identity. The interaction between TF A′ and SA′ is the protein–DNA interolog of that between A and SA. The regulatory relationships A → B and A′ → B′ are regulogs.
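The mapping in panel B and the scoring in panel C can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used to produce the figure: the function names (`generalized_interologs`, `score_predictions`), the homolog table, and the gold standard sets are all hypothetical toy inputs, and interactions are treated as unordered pairs.

```python
from itertools import product


def generalized_interologs(source_interactions, homologs):
    """Pair every target homolog of A with every target homolog of B.

    source_interactions: iterable of (A, B) pairs known in the source organism.
    homologs: dict mapping a source protein to its homologs in the target
              organism (its "family"). Hypothetical input format.
    Returns a set of unordered pairs (frozensets) of target proteins.
    """
    predictions = set()
    for a, b in source_interactions:
        family_a = homologs.get(a, ())
        family_b = homologs.get(b, ())
        # All possible pairs between the two families are predictions.
        for a_target, b_target in product(family_a, family_b):
            predictions.add(frozenset((a_target, b_target)))
    return predictions


def score_predictions(predictions, gold_positives, gold_negatives):
    """Return (true positives, false positives) as defined in panel C."""
    true_positives = predictions & gold_positives
    false_positives = predictions & gold_negatives
    return true_positives, false_positives


# Toy data mirroring panel B: A has four target homologs, B has three.
homologs = {
    "A": ["A1'", "A2'", "A3'", "A4'"],
    "B": ["B1'", "B2'", "B3'"],
}
source_interactions = [("A", "B")]

# Hypothetical gold standards, invented purely for illustration.
gold_positives = {frozenset(("A1'", "B1'")), frozenset(("A2'", "B3'"))}
gold_negatives = {frozenset(("A4'", "B2'"))}

predictions = generalized_interologs(source_interactions, homologs)
true_positives, false_positives = score_predictions(
    predictions, gold_positives, gold_negatives
)

print(len(predictions))      # 4 x 3 = 12 candidate pairs
print(len(true_positives))   # 2
print(len(false_positives))  # 1
```

Predictions that fall in neither gold standard set are simply left unscored in this sketch, which matches the caption's definitions of true and false positives as overlaps with the two sets.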
