Figure 1.

Schematic of the scaled organization and intrinsic properties of the protein domain universe graph (PDUG). The PDUG is built hierarchically, so that each level of evolutionary divergence can be considered independently. Domain structures are compared with one another using DALI (see Methods), and the structural graph is created from these comparisons (Dokholyan et al. 2002). All sequences from NRDB with >25% identity to the original sequence of each domain on the PDUG are collected into a gene family. All equilogs (sequences with the same function) (Apweiler et al. 2000) matching the gene family are collected and used to create a probabilistic GO tree, from which the FFS is calculated using equation 2. As an example of how a structural neighborhood is built, consider the domain inside the blue rectangle: all of the domains in red rectangles are its structural neighbors.
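
The structural-neighborhood step can be illustrated with a minimal sketch. It assumes that two domains are joined by an edge whenever their pairwise DALI score meets a cutoff. The cutoff value, the domain labels, and the function names `build_structural_graph` and `structural_neighbors` are hypothetical and chosen only for illustration; they are not taken from the caption.

```python
def build_structural_graph(pair_scores, z_min):
    """Build an undirected graph: an edge joins two domains whose score >= z_min.

    pair_scores maps (domain_a, domain_b) -> pairwise structural similarity score.
    z_min is an assumed cutoff; the caption does not state its value.
    """
    graph = {}
    for (a, b), score in pair_scores.items():
        # Register both domains so that isolated domains still appear as nodes.
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if a != b and score >= z_min:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def structural_neighbors(graph, domain):
    """Return the sorted list of domains directly connected to `domain`."""
    return sorted(graph.get(domain, set()))


# Toy scores for four hypothetical domains (illustrative numbers only).
toy_scores = {
    ("d1", "d2"): 12.0,
    ("d1", "d3"): 4.5,
    ("d2", "d3"): 10.1,
    ("d1", "d4"): 9.3,
}
toy_graph = build_structural_graph(toy_scores, z_min=9.0)
print(structural_neighbors(toy_graph, "d1"))
```

In this toy example, `d1` plays the role of the domain in the blue rectangle, and the domains returned by `structural_neighbors` correspond to those in red rectangles.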
