Large-Scale Protein Annotation through Gene Ontology

Table 2. Statistics on Textual Information Analyzed by GO Engine

Statistic                                                    MeSH term   Title    Abstract   Definition line
Number of proteins                                           110608      106190   113073     516952
Number of articles                                           71703       77314    82654      n/a
Number of unique words                                       40011       18175    26630      25915
Average number of words per article or per definition line   19.05       2.70     11.65      6.56
  • Text information was extracted from titles, abstracts, and MeSH terms of articles referenced in GenBank and SWISS-PROT records and from the definition lines of protein records.
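As a rough illustration of what the last two rows measure, the sketch below counts unique words and the average number of words per record for a list of text strings. The function name `text_statistics`, the lowercase whitespace tokenization, and the sample strings are all assumptions made for illustration; the table does not state how GO Engine tokenized or filtered words, so this will not reproduce the published figures.

```python
def text_statistics(records):
    """Return (unique_word_count, average_words_per_record) for a list of strings.

    Assumption: words are lowercased and split on whitespace, with no
    stop-word removal. The source does not specify the actual procedure.
    """
    vocabulary = set()
    total_words = 0
    for text in records:
        words = text.lower().split()
        vocabulary.update(words)
        total_words += len(words)
    # Guard against an empty input list to avoid division by zero.
    average = total_words / len(records) if records else 0.0
    return len(vocabulary), average


# Hypothetical definition lines, invented for illustration only.
sample = [
    "protein kinase C alpha",
    "putative protein kinase",
    "hypothetical protein",
]
unique_count, average_words = text_statistics(sample)
```

Applying the same function separately to the MeSH terms, titles, abstracts, and definition lines would give one pair of values per column.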

This Article

  1. Genome Res. 12: 785-794
