Table 3.

Cluster Splitting of Gene Clusters Within 2029 cDNA Clones

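| Gene ID | Copies | Core clusters | Total | Percent of copies | Diversity index |
|---|---|---|---|---|---|
| Ef1_α | 669 | 636 (646), 3 (4) | 639 | 95.52 | 0.049 |
| Cytochrom_cox_I | 274 | 229 (232), 24 (24), 12 (14), 2 (2) | 267 | 97.45 | 0.121 |
| clone_190B1 | 254 | 237 (241), 2 (2) | 239 | 94.09 | 0.068 |
| tubulin_β | 207 | 203 (205) | 203 | 98.07 | 0.023 |
| 40SRibo_protS6 | 183 | 176 (176) | 176 | 96.17 | 0.045 |
| 40SRibo_protS4 | 100 | 99 (99) | 99 | 99.00 | 0.012 |
| 60SRibo_protL4 | 85 | 84 (86) | 84 | 98.82 | 0.014 |
| GAPDH | 82 | 77 (77) | 77 | 93.90 | 0.074 |
| Ef1_β | 67 | 66 (76) | 66 | 98.51 | 0.018 |
| human_calmodulin | 32 | 22 (26), 4 (4) | 26 | 81.25 | 0.324 |
| heat_shock_cogKD71 | 28 | 22 (23) | 22 | 78.57 | 0.271 |
| heat_shock_cogKD90 | 26 | 20 (20) | 20 | 76.92 | 0.276 |
| human_TNF_receptor | 12 | 8 (8) | 8 | 66.67 | 0.442 |
| clone_244D14 | 8 | 3 (3) | 3 | 37.50 | 0.318 |
| clone_241F17 | 2 | 2 (2) | 2 | 100.00 | 0.000 |
| Total | 2029 | | 1932 | 95.22 | 0.137 |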

[i] The diversity index is low (< 0.2) for most gene clusters. For example, GAPDH (human glyceraldehyde-3-phosphate dehydrogenase) is present in the library in 82 copies. Clustering places 77 of these copies in a calculated cluster of size 77 (the numbers in brackets denote the sizes of the calculated clusters), which corresponds to 93.90% of the copies of that gene. We consider only calculated clusters that are pure, because only those clusters contribute to gene identification.
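The arithmetic behind the "Total" and "Percent of copies" columns can be checked with a short sketch. It assumes, based on the GAPDH example in the note, that "Total" is the sum of the copies found in a gene's core clusters and that the percentage is that total divided by the number of copies in the library. The function name `percent_of_copies` and the data layout are illustrative only, and the diversity index is left out because its definition is not given here.

```python
# Three rows copied from Table 3.
# gene -> (copies in library, [(copies found, calculated cluster size), ...])
rows = {
    "GAPDH": (82, [(77, 77)]),
    "human_calmodulin": (32, [(22, 26), (4, 4)]),
    "Cytochrom_cox_I": (274, [(229, 232), (24, 24), (12, 14), (2, 2)]),
}


def percent_of_copies(copies, core_clusters):
    """Return (total, percent) for one gene.

    Assumption: 'Total' is the sum of the copies found in the core
    clusters; the bracketed cluster sizes are not used in the sum.
    """
    total = sum(found for found, _size in core_clusters)
    return total, 100.0 * total / copies


for gene, (copies, clusters) in rows.items():
    total, pct = percent_of_copies(copies, clusters)
    print(f"{gene}: total={total}, percent of copies={pct:.2f}")
```

For GAPDH this gives a total of 77 and 93.90%, which agrees with the values quoted in the note.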