Large-Scale Protein Annotation through Gene Ontology

Table 5.

Number of Contigs Annotated with Selected GO Nodes and Their Children Nodes in the Three GO Categories, and Chi-Square Test Results

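| GO term | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | X | Y | Contig | Chi | P value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Physiological process | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| Spermatogenesis | 4 | 4 | 2 | 1 | 1 | 7 | 1 | 9 | 3 | 3 | 2 | 3 | 1 | 1 | 1 | 3 | 1 | 0 | 1 | 7 | 1 | 2 | 8 | 11 | 77 | 484 | 0.00E-00 |
| Homophilic cell adhesion | 2 | 0 | 1 | 5 | 24 | 0 | 1 | 1 | 0 | 2 | 3 | 0 | 4 | 1 | 0 | 11 | 0 | 9 | 0 | 3 | 0 | 1 | 3 | 1 | 72 | 235 | 1.04E-37 |
| Olfaction | 4 | 0 | 0 | 0 | 0 | 12 | 5 | 1 | 4 | 0 | 25 | 1 | 1 | 1 | 0 | 2 | 7 | 0 | 5 | 0 | 0 | 0 | 0 | 0 | 68 | 165 | 5.45E-24 |
| DNA-dependent DNA replication | 8 | 5 | 4 | 6 | 2 | 8 | 23 | 1 | 1 | 4 | 1 | 4 | 4 | 2 | 0 | 5 | 8 | 5 | 2 | 5 | 0 | 6 | 1 | 0 | 105 | 103 | 1.75E-12 |
| Transcription regulation | 143 | 100 | 105 | 49 | 67 | 93 | 93 | 71 | 87 | 52 | 106 | 102 | 41 | 50 | 55 | 76 | 85 | 39 | 171 | 67 | 11 | 34 | 68 | 6 | 1771 | 97 | 2.10E-11 |
| Immune response | 96 | 52 | 43 | 31 | 29 | 44 | 24 | 16 | 20 | 21 | 45 | 29 | 4 | 21 | 22 | 13 | 30 | 8 | 65 | 15 | 8 | 24 | 15 | 0 | 675 | 73 | 1.84E-07 |
| Ribosome biogenesis | 7 | 5 | 4 | 2 | 8 | 3 | 4 | 2 | 4 | 0 | 3 | 5 | 1 | 2 | 3 | 3 | 3 | 1 | 4 | 1 | 0 | 0 | 2 | 0 | 67 | 14 | 0.88 |
| Cell shape and cell size control | 11 | 10 | 5 | 1 | 7 | 6 | 7 | 2 | 7 | 3 | 10 | 8 | 2 | 3 | 5 | 3 | 8 | 2 | 6 | 3 | 1 | 4 | 1 | 0 | 115 | 14 | 0.90 |
| Exocytosis | 20 | 8 | 13 | 7 | 10 | 8 | 13 | 5 | 5 | 8 | 10 | 14 | 2 | 7 | 7 | 5 | 9 | 3 | 7 | 6 | 0 | 5 | 3 | 0 | 175 | 13 | 0.90 |
| Central nervous system development | 9 | 8 | 4 | 5 | 4 | 3 | 5 | 3 | 4 | 2 | 7 | 8 | 2 | 2 | 4 | 5 | 7 | 0 | 7 | 1 | 2 | 1 | 3 | 0 | 96 | 12 | 0.94 |
| Molecular function | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| Calcium-dependent cell adhesion molecule | 1 | 0 | 1 | 4 | 23 | 0 | 1 | 1 | 0 | 2 | 2 | 0 | 4 | 1 | 0 | 10 | 0 | 9 | 0 | 3 | 0 | 1 | 2 | 1 | 66 | 238 | 2.59E-38 |
| Olfactory receptor | 6 | 1 | 1 | 0 | 1 | 13 | 5 | 3 | 9 | 0 | 31 | 1 | 1 | 1 | 0 | 3 | 8 | 0 | 9 | 0 | 0 | 0 | 0 | 0 | 93 | 181 | 4.33E-27 |
| Tumor antigen | 14 | 4 | 0 | 1 | 1 | 2 | 4 | 2 | 3 | 2 | 3 | 5 | 5 | 2 | 0 | 0 | 2 | 3 | 5 | 1 | 1 | 2 | 23 | 0 | 85 | 161 | 3.20E-23 |
| G-protein coupled receptor | 94 | 50 | 59 | 30 | 50 | 84 | 41 | 23 | 30 | 28 | 76 | 33 | 24 | 30 | 9 | 16 | 56 | 11 | 85 | 23 | 2 | 18 | 59 | 1 | 932 | 130 | 2.43E-17 |
| Defense/immunity protein | 75 | 45 | 33 | 16 | 21 | 46 | 20 | 18 | 15 | 9 | 30 | 14 | 6 | 17 | 19 | 7 | 11 | 4 | 55 | 14 | 6 | 22 | 10 | 0 | 513 | 106 | 4.10E-13 |
| Transcription factor | 116 | 78 | 80 | 37 | 58 | 78 | 77 | 55 | 74 | 52 | 95 | 79 | 24 | 41 | 51 | 59 | 60 | 33 | 149 | 51 | 9 | 24 | 59 | 4 | 1443 | 100 | 5.04E-12 |
| Glycopeptide hormone | 6 | 6 | 2 | 0 | 3 | 4 | 3 | 1 | 1 | 1 | 1 | 5 | 2 | 2 | 1 | 1 | 3 | 1 | 5 | 1 | 1 | 1 | 3 | 0 | 54 | 13 | 0.91 |
| Metallopeptidase | 14 | 8 | 11 | 6 | 7 | 6 | 8 | 8 | 4 | 10 | 13 | 13 | 3 | 3 | 5 | 6 | 7 | 3 | 7 | 6 | 2 | 2 | 4 | 0 | 156 | 13 | 0.92 |
| Serotonin receptor | 11 | 9 | 4 | 3 | 7 | 4 | 7 | 4 | 3 | 5 | 11 | 6 | 2 | 1 | 3 | 6 | 6 | 0 | 6 | 4 | 0 | 3 | 3 | 0 | 108 | 13 | 0.92 |
| Protein serine/threonine kinase | 56 | 43 | 32 | 23 | 24 | 29 | 23 | 16 | 23 | 21 | 23 | 31 | 8 | 15 | 18 | 18 | 29 | 5 | 27 | 11 | 5 | 7 | 20 | 1 | 508 | 11 | 0.97 |
| Cellular component | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| Secretory vesicle | 5 | 17 | 5 | 1 | 1 | 2 | 3 | 2 | 2 | 5 | 9 | 5 | 1 | 8 | 6 | 4 | 8 | 1 | 1 | 2 | 1 | 12 | 3 | 0 | 104 | 88 | 6.09E-10 |
| Intermediate filament | 6 | 5 | 3 | 2 | 3 | 9 | 2 | 2 | 1 | 2 | 2 | 12 | 0 | 0 | 1 | 3 | 18 | 0 | 0 | 3 | 1 | 2 | 3 | 0 | 80 | 86 | 1.30E-09 |
| 26S proteasome | 11 | 7 | 5 | 1 | 0 | 2 | 3 | 1 | 2 | 3 | 5 | 2 | 1 | 7 | 3 | 5 | 20 | 2 | 2 | 1 | 2 | 1 | 1 | 0 | 87 | 85 | 1.72E-09 |
| Nuclear membrane lumen | 173 | 91 | 87 | 50 | 72 | 79 | 76 | 57 | 64 | 83 | 83 | 97 | 23 | 39 | 63 | 70 | 60 | 48 | 78 | 49 | 12 | 33 | 51 | 22 | 1560 | 82 | 6.79E-09 |
| Collagen | 9 | 11 | 7 | 0 | 1 | 5 | 2 | 5 | 2 | 1 | 5 | 1 | 1 | 7 | 0 | 1 | 2 | 0 | 2 | 2 | 4 | 8 | 3 | 0 | 79 | 78 | 3.12E-08 |
| Respiratory chain complex I | 6 | 4 | 4 | 1 | 5 | 2 | 2 | 1 | 3 | 1 | 4 | 1 | 0 | 2 | 3 | 3 | 4 | 0 | 3 | 0 | 1 | 2 | 2 | 0 | 54 | 12 | 0.94 |
| NADH dehydrogenase (ubiquinone) | 6 | 4 | 4 | 1 | 5 | 2 | 2 | 1 | 3 | 1 | 4 | 1 | 0 | 2 | 3 | 3 | 4 | 0 | 3 | 0 | 1 | 2 | 2 | 0 | 54 | 12 | 0.94 |
| Mitochondrial inner membrane | 36 | 31 | 35 | 15 | 25 | 24 | 16 | 17 | 16 | 15 | 28 | 23 | 11 | 13 | 13 | 22 | 23 | 8 | 21 | 12 | 7 | 10 | 14 | 2 | 437 | 12 | 0.95 |
| Small ribosomal subunit | 9 | 6 | 4 | 3 | 4 | 3 | 6 | 2 | 3 | 1 | 4 | 3 | 0 | 3 | 1 | 6 | 4 | 1 | 3 | 3 | 1 | 2 | 2 | 0 | 74 | 11 | 0.96 |
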
  • Numbers or X, Y in the column headings indicate the chromosome. The results indicate that contigs annotated with some GO nodes or their children nodes are distributed unevenly across chromosomes. Examples include spermatogenesis and homophilic cell adhesion under “Physiological process,” tumor antigen and defense/immunity protein under “Molecular function,” and secretory vesicle and intermediate filament under “Cellular component,” among many others.

  • Total number of contigs annotated with the GO node listed under the GO term column or its children nodes.

  • Chi-square test score.

  • Significance level from the chi-square test.
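
To illustrate how a chi-square score and significance level of this kind can be computed, here is a minimal sketch of a goodness-of-fit test for one GO node across chromosomes. It rests on assumptions that the table does not state: expected counts are taken as proportional to a per-chromosome background count (for example, all annotated contigs on each chromosome), and the degrees of freedom are taken as the number of chromosomes minus one. The function names `chi_square_statistic` and `chi_square_sf` are hypothetical, and the counts in the usage lines are toy values, not data from Table 5.

```python
import math


def chi_square_statistic(observed, background):
    """Chi-square goodness-of-fit statistic for one GO node.

    observed:   contigs annotated with the GO node, one count per chromosome.
    background: assumed reference counts per chromosome (hypothetical input;
                the table does not say how expected counts were derived).
    """
    if len(observed) != len(background):
        raise ValueError("observed and background must have the same length")
    total_obs = sum(observed)
    total_bg = sum(background)
    stat = 0.0
    for obs, bg in zip(observed, background):
        # Expected count if the GO node followed the background distribution.
        expected = total_obs * bg / total_bg
        if expected <= 0:
            raise ValueError("every chromosome needs a positive expected count")
        stat += (obs - expected) ** 2 / expected
    return stat


def chi_square_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df >= 1.

    Uses the recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1),
    starting from Q(1/2, y) = erfc(sqrt(y)) or Q(1, y) = exp(-y), with y = x/2.
    """
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Toy example with four groups (NOT values from Table 5).
toy_observed = [10, 20, 30, 40]
toy_background = [250, 250, 250, 250]
score = chi_square_statistic(toy_observed, toy_background)
p_value = chi_square_sf(score, df=len(toy_observed) - 1)
print(f"chi-square = {score:.1f}, P = {p_value:.2e}")
```

A large score with a small P value, as in the upper rows of each category, would point to an uneven distribution under these assumptions. A small score with a P value near 1, as in the lower rows, would point to counts that track the background closely.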

This Article

  1. Genome Res. 12: 785-794