Table 4.

Domain Composition and Functions of Lineage-Specific Clusters of Size ≥ 20[i]

| Species | Cluster size | Average score density | Domain organization[ii] | Function |
|---|---|---|---|---|
| M. tuberculosis | 90 | 0.38 | Multitransmembrane proteins; PPE family | Predicted surface protein, interaction with host cells |
| M. tuberculosis | 67 | 0.37 | Signal-peptide-containing, non-globular proteins, consist mostly of glycine-rich repeats; PE family | Predicted surface protein, interaction with host cells |
| H. pylori | 34 | 0.34 | Outer membrane protein | Predicted surface protein, interaction with host cells |
| E. coli | 31 | 0.30 | Helix-turn-helix DNA-binding domain (LysR family), solute-binding domain | Transcription regulation of various metabolic operons |
| Synechocystis sp. | 30 | 0.26 | Histidine kinase | Signal transduction, sensing of environmental stimuli |
| M. pneumoniae | 25 | 0.62 | Predicted non-globular domain | Unknown |
| M. tuberculosis | 24 | 0.21 | Signal-peptide-containing protein | Predicted surface protein (mce1), interaction with host cells |
| A. fulgidus | 24 | 0.23 | Histidine kinase | Signal transduction, sensing of environmental stimuli |
| Synechocystis sp. | 22 | 0.39 | Diguanylate cyclase/phosphodiesterase (GGDEF and EAL domains) | Signal transduction, sensing of environmental stimuli |
| M. tuberculosis | 21 | 0.29 | Short chain dehydrogenase | Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) |
| M. tuberculosis | 20 | 0.45 | Beta-ketoacyl synthase, acyl transferase, thioesterase | Polyketide synthase |

[i] Two more clusters of size ≥ 20 included transposases and were omitted.

[ii] Analyzed using the SMART, PSI-BLAST and SEG programs.