Table 1.

Completely Sequenced Organisms and Other Fragmentary Data Considered in this Analysis

Organism                 Domain[ii]  Code[iii]  ORFs[iv]  Partitions[v]
H. influenzae            B           HI         1713      1377
M. genitalium            B           MG         468       361
Synechocystis sp.        B           Ssp        3168      2002
M. pneumoniae            B           MP         677       424
H. pylori                B           HP         1577      1226
E. coli                  B           EC         4290      2473
B. subtilis              B           BS         4100      2573
B. burgdorferi           B           BB         850       696
A. aeolicus              B           AE         1522      1157
M. tuberculosis          B           MT         3924      2329
T. pallidum              B           TP         1031      852
C. trachomatis           B           CT         877       718
C. jejuni                B           CJ         1731      1323
R. prowazekii            B           RP         837       653
M. jannaschii            A           MJ         1735      1180
M. thermoautotrophicum   A           MTH        1871      1227
A. fulgidus              A           AF         2437      1423
P. horikoshii OT3        A           PH         2061      1373
S. cerevisiae            E           SC         6182      4437
C. elegans               E           CE         19,099    7558
S. pombe[vi]             E           SP         3579      2248
H. sapiens               E           Hs
M. musculus[vi]          E           Mm

[i] Predicted ORF products considered in this study are essentially as described in the original publications: H. influenzae (Fleischman et al. 1995), M. genitalium (Fraser et al. 1995), M. jannaschii (Bult et al. 1996), Synechocystis sp. strain PCC6803 (Kaneko et al. 1996), M. pneumoniae (Himmelreich et al. 1996), H. pylori (Tomb et al. 1997), E. coli (Blattner et al. 1997), M. thermoautotrophicum (Smith et al. 1997), B. subtilis (Kunst et al. 1997), A. fulgidus (Klenk et al. 1997), B. burgdorferi (Fraser et al. 1997), A. aeolicus (Deckert et al. 1998), M. tuberculosis (Cole et al. 1998), P. horikoshii (Kawarabayasi et al. 1998), T. pallidum (Fraser et al. 1998), C. trachomatis (Stephens et al. 1998), R. prowazekii (Andersson et al. 1998), and C. elegans (The C. elegans Sequencing Consortium, 1998). Yeast S. cerevisiae ORF products (Goffeau et al. 1997) correspond to those indicated in the MIPS server (http://www.mips.biochem.mpg.de/), with a few modifications. The preliminary complete proteome of C. jejuni (ftp.sanger.ac.uk/in/pub/pathogens/Cj/) was also considered.

[ii] (B) Bacteria; (A) Archaea; and (E) Eukarya.

[iii] Organism abbreviations used in Figs. 1 and 3.

[iv] The total number of predicted ORF products.

[v] The total number of distinct partitions.

[vi] S. pombe ORF products correspond to those at the Sanger ftp server (ftp.sanger.ac.uk under /pub/yeast/sequences/pombe/pompep/); human (H. sapiens) and mouse (M. musculus) sequences correspond to Hsuniq and Mmuniq (Boguski et al. 1995). Incomplete data sets were used: for S. pombe, 3579 ORF products representing at least 68% of the total proteome (V. Wood, pers. comm.); for human and mouse, 43,088 and 8,821 sets of clustered ESTs derived from GenBank release 106, respectively. Hsuniq and Mmuniq were used solely as targets for comparisons with the other organisms.