Prediction of Protein Functional Domains from Sequences Using Artificial Neural Networks

Table 3. Performance of Neural Network Recognizers for Various Domain Groups

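Counts are true positives (tp), false positives (fp), false negatives (fn), and true negatives (tn).

| A. Protein Domain Groups | Training tp | Training fp | Training fn | Training tn | Test tp | Test fp | Test fn | Test tn | Total tp | Total fp | Total fn | Total tn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Immunoglobulin domain | 1721 | 24 | 15 | 2428 | 829 | 18 | 19 | 1155 | 2552 | 71 | 32 | 3554 |
| Zinc finger, C2H2 type | 330 | 9 | 9 | 540 | 147 | 3 | 6 | 245 | 477 | 12 | 15 | 785 |
| Protein kinase domain | 1120 | 0 | 0 | 264 | 522 | 0 | 0 | 119 | 1642 | 0 | 0 | 383 |
| EF hand | 409 | 14 | 1 | 116 | 186 | 3 | 1 | 62 | 594 | 17 | 3 | 178 |
| EGF-like domain | 291 | 2 | 0 | 332 | 145 | 3 | 0 | 137 | 436 | 3 | 0 | 471 |
| WD repeat | 186 | 2 | 4 | 168 | 76 | 0 | 1 | 92 | 262 | 2 | 5 | 260 |
| Fibronectin type III domain | 237 | 1 | 0 | 507 | 113 | 0 | 0 | 228 | 350 | 2 | 0 | 734 |
| ABC transporters | 530 | 3 | 0 | 90 | 239 | 2 | 0 | 43 | 769 | 1 | 0 | 137 |
| Globin | 595 | 2 | 2 | 200 | 269 | 1 | 0 | 92 | 866 | 3 | 0 | 292 |
| Homeobox domain | 584 | 0 | 1 | 58 | 260 | 0 | 0 | 31 | 845 | 0 | 0 | 89 |
| ANK repeat | 99 | 1 | 3 | 56 | 54 | 0 | 1 | 22 | 155 | 2 | 2 | 77 |
| Sushi domain (SCR repeat) | 106 | 0 | 0 | 77 | 59 | 0 | 0 | 32 | 165 | 0 | 0 | 109 |
| RNA recognition motif (=RRM, RBD, or RNP domain) | 215 | 1 | 2 | 50 | 99 | 1 | 1 | 26 | 315 | 2 | 2 | 76 |
| Tetratricopeptide repeat | 98 | 0 | 0 | 152 | 45 | 0 | 0 | 74 | 143 | 0 | 0 | 226 |
| Short chain dehydrogenase | 316 | 0 | 0 | 55 | 142 | 0 | 0 | 29 | 458 | 0 | 0 | 84 |
| Trypsin | 296 | 0 | 1 | 215 | 146 | 0 | 0 | 97 | 442 | 0 | 1 | 312 |
| DEAD/H box helicase domain | 277 | 1 | 4 | 499 | 117 | 0 | 2 | 238 | 397 | 2 | 3 | 736 |
| Kazal-type serine protease inhibitor domain | 211 | 0 | 0 | 84 | 99 | 0 | 0 | 38 | 310 | 0 | 0 | 122 |
| Response regulator receiver domain | 242 | 0 | 0 | 20 | 112 | 0 | 0 | 11 | 354 | 0 | 0 | 31 |
| Spectrin repeat | 21 | 0 | 1 | 293 | 21 | 0 | 1 | 124 | 43 | 0 | 1 | 417 |
| 4Fe-4S ferredoxins and rel. iron-sulf. clust. bind. domains | 183 | 0 | 0 | 51 | 86 | 0 | 0 | 24 | 269 | 0 | 0 | 75 |
| RAS family | 203 | 0 | 0 | 32 | 96 | 0 | 0 | 14 | 299 | 0 | 0 | 46 |
| SH3 domain | 185 | 0 | 0 | 27 | 94 | 0 | 0 | 6 | 279 | 0 | 0 | 33 |
| APOA/APOE-repeat | 33 | 0 | 0 | 498 | 16 | 5 | 2 | 229 | 51 | 12 | 0 | 720 |
| Zinc-binding dehydrogenases | 152 | 0 | 0 | 74 | 80 | 0 | 1 | 26 | 233 | 0 | 0 | 100 |
| Cytochrome C | 142 | 1 | 6 | 73 | 71 | 0 | 1 | 33 | 213 | 1 | 7 | 106 |
| ATPases associated with various cell. activities (AAA) | 122 | 0 | 0 | 102 | 69 | 0 | 0 | 38 | 191 | 0 | 0 | 140 |
| Helix-loop-helix DNA-binding domain | 153 | 0 | 1 | 21 | 74 | 0 | 0 | 13 | 227 | 0 | 1 | 34 |
| Alpha amylase | 135 | 0 | 0 | 118 | 69 | 0 | 0 | 51 | 204 | 0 | 0 | 169 |
| HSP70 protein | 130 | 1 | 0 | 247 | 63 | 0 | 0 | 117 | 193 | 0 | 0 | 365 |
| Ligand-binding domain of nuclear hormone receptors | 176 | 1 | 0 | 13 | 81 | 0 | 0 | 10 | 257 | 0 | 0 | 14 |
| Protein-tyrosine-phosphatase domain | 110 | 0 | 0 | 42 | 56 | 0 | 0 | 20 | 166 | 0 | 0 | 62 |
| Lectin in C-type domain | 129 | 0 | 0 | 23 | 59 | 0 | 0 | 15 | 188 | 0 | 0 | 38 |
| Intermediate filament proteins | 127 | 2 | 0 | 192 | 61 | 2 | 0 | 86 | 188 | 2 | 0 | 280 |
| Zinc-binding metalloprotease domain | 148 | 1 | 0 | 4 | 75 | 0 | 0 | 1 | 223 | 1 | 0 | 5 |
| Acyl carrier protein domain | 102 | 0 | 0 | 8 | 54 | 0 | 0 | 2 | 156 | 0 | 0 | 10 |
| PAS domain | 77 | 0 | 0 | 52 | 46 | 0 | 0 | 15 | 123 | 0 | 0 | 67 |
| FOS/JUN DNA-binding domain | 108 | 0 | 0 | 69 | 64 | 0 | 0 | 23 | 172 | 0 | 0 | 92 |
| Snake toxin | 102 | 0 | 1 | 31 | 51 | 0 | 0 | 12 | 153 | 0 | 1 | 43 |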
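
| B. Functional Groups | Training tp | Training fp | Training fn | Training tn | Test tp | Test fp | Test fn | Test tn | Total tp | Total fp | Total fn | Total tn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Permeases | 289 | 22 | 12 | 593 | 139 | 7 | 4 | 270 | 428 | 29 | 16 | 863 |
| Sensory transduction histidine kinases | 177 | 2 | 3 | 167 | 87 | 2 | 1 | 75 | 264 | 2 | 4 | 244 |
| Glycosyltransferases I | 204 | 1 | 1 | 103 | 95 | 0 | 0 | 46 | 300 | 2 | 0 | 148 |
| Thiol-disulfide isomerase and thioredoxins | 139 | 0 | 4 | 65 | 77 | 1 | 4 | 17 | 218 | 1 | 6 | 82 |
| Serine/threonine protein kinases | 1107 | 1 | 10 | 264 | 515 | 1 | 2 | 121 | 1622 | 1 | 12 | 386 |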
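
The raw counts in the table can be turned into rates that are easier to compare across domain groups. The following is a minimal sketch, not a description of how the authors evaluated their networks: the names `Counts`, `ratio`, and `summarize` are hypothetical, and the choice of sensitivity, specificity, precision, and accuracy as summary measures is an assumption made for illustration. The example row is the test-set result for "Zinc finger, C2H2 type", copied from the table.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Counts:
    """One tp/fp/fn/tn group from the table (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def summarize(c: Counts) -> Dict[str, Optional[float]]:
    """Derive standard rates from one group of counts (illustrative only)."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": ratio(c.tp, c.tp + c.fn),  # members correctly recognized
        "specificity": ratio(c.tn, c.tn + c.fp),  # non-members correctly rejected
        "precision": ratio(c.tp, c.tp + c.fp),    # positive calls that are correct
        "accuracy": ratio(c.tp + c.tn, total),    # all correct calls
    }


# Test-set counts for "Zinc finger, C2H2 type", taken from the table.
zinc_finger_test = Counts(tp=147, fp=3, fn=6, tn=245)

for name, value in summarize(zinc_finger_test).items():
    print(name, "n/a" if value is None else f"{value:.4f}")
```

Returning `None` for a zero denominator keeps the helper usable on rows where a class is absent, rather than raising an error.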


Genome Res. 11: 1410-1417
