DIAN: A Novel Algorithm for Genome Ontological Classification

Table 3.

Comparison between DIAN and MGI Ontological Assignments

Results from the comparative approach are shown. Several intrinsic problems were identified with this approach, so the type I and type II variation values reported here are for comparative purposes only and cannot be interpreted strictly as type I and type II errors.
Table 3A. Comparing Assignments Made to the Cellular Role Ontology
| Concept | DIAN node number | Highest level matching GO nodes | Present in DIAN and GO | Present in DIAN only | Present in GO only | Type I variation | Type II variation | Sensitivity | Selectivity |
|---|---|---|---|---|---|---|---|---|---|
| Chromosome structure | 1.1 | GO:0007001;GO:0006323 | 7 | 4 | 4 | 0.267 | 0.267 | 0.636 | 0.636 |
| Transcription factors | 1.4 | GO:0003700 | 59 | 54 | 38 | 0.358 | 0.252 | 0.608 | 0.522 |
| DNA duplication | 3.2 | GO:0006260;GO:0003964 | 10 | 2 | 2 | 0.143 | 0.143 | 0.833 | 0.833 |
| Cell-cell adhesion | 5.2 | GO:0007155 | 35 | 15 | 14 | 0.234 | 0.219 | 0.714 | 0.700 |
| Transcription factors | 9.1.1.1 | GO:0008135 | 1 | 0 | 1 | 0.000 | 0.500 | 0.500 | 1.000 |
| Microtubule | 6.2 | GO:0007017 | 9 | 0 | 2 | 0.000 | 0.182 | 0.818 | 1.000 |
| DNA repair | 8.1 | GO:0006281 | 14 | 2 | 9 | 0.080 | 0.360 | 0.609 | 0.875 |
| Programmed cell death | 8.2 | GO:0006915 | 14 | 7 | 23 | 0.159 | 0.523 | 0.378 | 0.667 |
| Channel and transporter | 4.6 | GO:0006810;GO:0005216 | 47 | 8 | 27 | 0.098 | 0.329 | 0.635 | 0.855 |
| Amino acid metabolism | 9.2 | GO:0006519 | 4 | 1 | 7 | 0.083 | 0.407 | 0.476 | 0.625 |
| Stress response | 8.4 | GO:0006950 | 5 | 6 | 55 | 0.091 | 0.833 | 0.083 | 0.455 |
| Nucleotide metabolism | 9.4 | GO:0006140;GO:0006205;GO:0006143 | 0 | 2 | 7 | 0.222 | 0.778 | 0.000 | 0.000 |
| Cofactor metabolism | 9.5 | GO:0006731 | 4 | 0 | 3 | 0.000 | 0.429 | 0.571 | 1.000 |
Total DIAN and GO: 229
Total DIAN only: 113
Total GO only: 214
Total: 556
Average type I: 0.203
Average type II: 0.385
Sensitivity: 0.517
Selectivity: 0.670
DIAN assignments made to a group of well-characterized, nonredundant mouse sequences were compared to assignments made by the MGI to the GO Process and Function ontologies. GO nodes corresponding to DIAN nodes are listed, along with the abbreviated essential concept from the DIAN Role ontology. For brevity, only the highest level GO nodes are listed. The number of sequences whose assignment is shared by both sets of ontologies is indicated (DIAN and GO), as well as the number of sequence assignments that differed (DIAN only, GO only). These numbers are used to calculate Type I and Type II variation using the following equations: Type I variation = DIAN only/(DIAN and GO + DIAN only + GO only); Type II variation = GO only/(DIAN and GO + DIAN only + GO only); Sensitivity = DIAN and GO/(DIAN and GO + GO only); Selectivity = DIAN and GO/(DIAN and GO + DIAN only). Sensitivity is defined as the ability of the DIAN algorithm to make what are believed to be all possible correct assignments. Selectivity is defined as the ability of the DIAN algorithm to avoid making what is believed to be an incorrect assignment.
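The four equations in the legend can be checked against any row of the table. The following is a minimal Python sketch, not the authors' code: the function name `variation_metrics` and its parameter names are hypothetical, and returning NaN for an empty denominator is an assumption, since the legend does not cover that case.

```python
import math


def variation_metrics(both, dian_only, go_only):
    """Compute the four comparison metrics from the three sequence counts.

    both      -- sequences assigned by both DIAN and GO
    dian_only -- sequences assigned by DIAN only
    go_only   -- sequences assigned by GO only
    """
    total = both + dian_only + go_only

    def ratio(numerator, denominator):
        # Assumption: an empty denominator yields NaN (not specified in the legend).
        return numerator / denominator if denominator else math.nan

    return {
        "type_i": ratio(dian_only, total),           # DIAN only / all assignments
        "type_ii": ratio(go_only, total),            # GO only / all assignments
        "sensitivity": ratio(both, both + go_only),  # shared / all GO assignments
        "selectivity": ratio(both, both + dian_only),  # shared / all DIAN assignments
    }


# Counts taken from the "Chromosome structure" row of Table 3A.
chromosome = variation_metrics(both=7, dian_only=4, go_only=4)
for name, value in chromosome.items():
    print(f"{name}: {value:.3f}")
```

With the counts 7, 4, and 4, this gives 0.267 for both variation values and 0.636 for both sensitivity and selectivity, matching the first row of Table 3A.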
Table 3B. Comparing Assignments Made to the Protein Function Ontology
| Concept | DIAN node number | Highest level matching GO nodes | Present in DIAN and GO | Present in DIAN only | Present in GO only | Type I variation | Type II variation | Sensitivity | Selectivity |
|---|---|---|---|---|---|---|---|---|---|
| Hormones and active peptides | 10 | GO:0005179;GO:0005103;GO:0005104;GO:0005105;GO:0005106;GO:0005109;GO:0005110;GO:0005111;GO:0005112;GO:0005113;GO:0005114;GO:0005115;GO:0005116;GO:0005117;GO:0005118;GO:0005119;GO:0005120;GO:0005121;GO:0005122;GO:0005123;GO:0005124;GO:0005177;GO:0005178;GO:0005186 | 7 | 3 | 8 | 0.167 | 0.444 | 0.467 | 0.700 |
| Inhibitors | 12 | GO:0004857;GO:0008189;GO:0005074;GO:0005092;GO:0008200;GO:0005517 | 12 | 3 | 9 | 0.125 | 0.375 | 0.571 | 0.800 |
| DNA or RNA associated proteins | 3 | GO:0003676;GO:0003735;GO:0004748;GO:0003910;GO:0003911;GO:0004518;GO:0003899;GO:0008534;GO:0008263;GO:0003907;GO:0003905;GO:0003906;GO:0003904;GO:0004844;GO:0003908 | 255 | 21 | 26 | 0.070 | 0.086 | 0.907 | 0.924 |
| Protein secretion and chaperones | 13 | GO:0003754;GO:0008565;GO:0006605 | 11 | 3 | 2 | 0.188 | 0.125 | 0.846 | 0.786 |
| Electron transport proteins | 5 | GO:0005489 | 0 | 7 | 6 | 0.538 | 0.462 | 0.000 | 0.000 |
| Other transport proteins | 6 | GO:0005215 | 62 | 17 | 19 | 0.173 | 0.194 | 0.765 | 0.785 |
| Structural proteins | 7 | GO:0005198 | 31 | 23 | 40 | 0.245 | 0.426 | 0.437 | 0.574 |
| Receptors | 8 | GO:0004872 | 67 | 43 | 15 | 0.344 | 0.120 | 0.817 | 0.609 |
| Cytokines and growth factors | 9 | GO:0008083;GO:0005125;GO:0008009 | 35 | 19 | 10 | 0.297 | 0.156 | 0.778 | 0.648 |
Total DIAN and GO: 480
Total DIAN only: 139
Total GO only: 135
Total: 754
Average type I: 0.184
Average type II: 0.179
Sensitivity: 0.780
Selectivity: 0.775
  • A group of well-characterized, nonredundant mouse sequences was assigned to the Protein Function ontology by the DIAN domain-based mapping algorithm. These assignments were compared to assignments made to the GO Process and Function ontologies by the MGI.
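The summary lines under Table 3B appear to be consistent with ratios taken over the pooled counts, rather than with a mean of the per-row ratios. The Python sketch below checks this under that assumption; the variable names are illustrative only, and the counts are copied from the nine rows of Table 3B.

```python
# (DIAN and GO, DIAN only, GO only) counts for each Table 3B row, in table order.
table_3b_counts = [
    (7, 3, 8),       # Hormones and active peptides
    (12, 3, 9),      # Inhibitors
    (255, 21, 26),   # DNA or RNA associated proteins
    (11, 3, 2),      # Protein secretion and chaperones
    (0, 7, 6),       # Electron transport proteins
    (62, 17, 19),    # Other transport proteins
    (31, 23, 40),    # Structural proteins
    (67, 43, 15),    # Receptors
    (35, 19, 10),    # Cytokines and growth factors
]

# Pool the counts column by column before taking any ratio.
both_total = sum(row[0] for row in table_3b_counts)
dian_only_total = sum(row[1] for row in table_3b_counts)
go_only_total = sum(row[2] for row in table_3b_counts)
grand_total = both_total + dian_only_total + go_only_total

pooled = {
    "type_i": dian_only_total / grand_total,
    "type_ii": go_only_total / grand_total,
    "sensitivity": both_total / (both_total + go_only_total),
    "selectivity": both_total / (both_total + dian_only_total),
}

print(both_total, dian_only_total, go_only_total, grand_total)
for name, value in pooled.items():
    print(f"{name}: {value:.3f}")
```

The pooled counts come to 480, 139, 135, and 754, and the resulting ratios round to 0.184, 0.179, 0.780, and 0.775, which agree with the totals and summary values printed under Table 3B.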

This Article

  1. Genome Res. 11: 1766-1779