Table 4. Domain Composition and Functions of Lineage-Specific Clusters of Size ≥ 20

| Species | Cluster size | Average score density | Domain organization | Function |
| --- | --- | --- | --- | --- |
| M. tuberculosis | 90 | 0.38 | Multitransmembrane proteins; PPE family | Predicted surface protein, interaction with host cells |
| M. tuberculosis | 67 | 0.37 | Signal-peptide-containing, non-globular proteins consisting mostly of glycine-rich repeats; PE family | Predicted surface protein, interaction with host cells |
| H. pylori | 34 | 0.34 | Outer membrane protein | Predicted surface protein, interaction with host cells |
| E. coli | 31 | 0.30 | Helix-turn-helix DNA-binding domain (LysR family), solute-binding domain | Transcription regulation of various metabolic operons |
| Synechocystis sp. | 30 | 0.26 | Histidine kinase | Signal transduction, sensing of environmental stimuli |
| M. pneumoniae | 25 | 0.62 | Predicted non-globular domain | Unknown |
| M. tuberculosis | 24 | 0.21 | Signal-peptide-containing protein | Predicted surface protein (mce1), interaction with host cells |
| A. fulgidus | 24 | 0.23 | Histidine kinase | Signal transduction, sensing of environmental stimuli |
| Synechocystis sp. | 22 | 0.39 | Diguanylate cyclase/phosphodiesterase (GGDEF and EAL domains) | Signal transduction, sensing of environmental stimuli |
| M. tuberculosis | 21 | 0.29 | Short-chain dehydrogenase | Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) |
| M. tuberculosis | 20 | 0.45 | Beta-ketoacyl synthase, acyl transferase, thioesterase | Polyketide synthase |