GALA, a Database for Genomic Sequence Alignments and Annotations

Table 1.

Annotation Statistics and Sources for Fields in the GALA Database

| Category | Entries | Source | Example fields from this category | Reference | URL |
|---|---|---|---|---|---|
| Genes | 35,535 | LocusLink at NCBI | Name, type, orientation, exons, coding | Pruitt and Maglott 2001 | http://www.ncbi.nlm.nih.gov/LocusLink/ |
| Genes | 865 | RefSeq at NCBI and HGB | Name, type, orientation, exons, coding | Pruitt and Maglott 2001 | http://www.ncbi.nlm.nih.gov/LocusLink/refseq.html http://genome.ucsc.edu/ |
| Gene products and function | 17,388 | LocusLink at NCBI | Product, biological process, cellular component, molecular function, conserved domain | Pruitt and Maglott 2001 | http://www.ncbi.nlm.nih.gov/LocusLink/ |
| Expression data | 602,388 | UniGene at NCBI | Tissue | Wheeler et al. 2002 | http://www.ncbi.nlm.nih.gov/ |
| Genetic disorders | 2,802 | OMIM | Disorder | Hamosh et al. 2002 | http://www.ncbi.nlm.nih.gov/omim/ |
| Alternate gene model: Acembly genes | 123,238 | Acembly and HGB | Name, type, orientation, exons, coding | J. Thierry-Mieg et al., unpublished | http://www.acedb.org/Cornell/acembly/ http://genome.ucsc.edu/ |
| Alternate gene model: Ensembl genes | 27,561 | Ensembl and HGB | Name, type, orientation, exons, coding | Hubbard et al. 2002 | http://www.ensembl.org/ http://genome.ucsc.edu/ |
| Alternate gene model: Genscan genes | 42,737 | Genscan and HGB | Name, type, orientation, exons, coding | Burge and Karlin 1997 | http://genes.mit.edu/GENSCAN.html http://genome.ucsc.edu/ |
| Alternate gene model: RefSeq genes | 16,222 | RefSeq and HGB | Name, type, orientation, exons, coding | Pruitt and Maglott 2001 | http://www.ncbi.nlm.nih.gov/LocusLink/refseq.html http://genome.ucsc.edu/ |
| Alternate gene model: Twinscan genes | 25,744 | Twinscan and HGB | Name, type, orientation, exons, coding | Korf et al. 2001 | http://genes.cs.wustl.edu/ http://genome.ucsc.edu/ |
| Local alignments | 1,585,186 | MGSC and HGB | Length, percent identity, gap size, identity step | Waterston et al. 2002; Schwartz et al. 2003 | http://bio.cse.psu.edu/ http://genome.ucsc.edu/ |
| Gap free alignments | 33,970,427 | MGSC and HGB | Length, percent identity | Waterston et al. 2002; Schwartz et al. 2003 | http://bio.cse.psu.edu/ http://genome.ucsc.edu/ |
| SNPs | 1,956,922 | dbSNP at NCBI and HGB | Type, allele, frequency | Sherry et al. 2001 | http://www.ncbi.nlm.nih.gov/SNP/ http://genome.ucsc.edu/ |
| Repeats | 4,891,898 | HGB and RepeatMasker | Name, class, family | Kent et al. 2002; Smit and Green 1999 | http://genome.ucsc.edu/ http://repeatmasker.genome.washington.edu/cgi-bin/RepeatMasker |
| CpG islands | 26,942 | HGB | Name | Kent et al. 2002 | http://genome.ucsc.edu/ |
| Transcription factor binding sites | 7,655,424 | TRANSFAC, Cister, and tffind | Factor name, strand, score | Wingender et al. 2001; Frith et al. 2001 | http://www.gene-regulation.com/pub/databases.html http://sullivan.bu.edu/~mfrith/cister.shtml |
| Recombination rate | 8,475 | deCODE, Marshfield, Genethon and HGB | Marker, recombination rate, range | Kong et al. 2002; Browman et al. 1998; Hudson et al. 1995 | http://www.decodegenetics.com/ http://research.marshfieldclinic.org/genetics/Map_Markers/maps/IndexMapFrames.html http://www.genethon.fr/php/index_us.php http://genome.ucsc.edu/ |
Note: Users query on fields such as those listed as examples in column 4. The number of entries for each field is subject to change as the source databases update their entries. For all categories except gene products and function, the number of entries is simply a count of the number of rows in the database table. For gene products and function, the number of entries is the number of gene rows that have data in this category. NCBI, National Center for Biotechnology Information at NIH; OMIM, Online Mendelian Inheritance in Man; HGB, Human Genome Browser at UCSC; and MGSC, Mouse Genome Sequencing Consortium.
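The two counting rules described in the note can be illustrated with a minimal sketch. It does not reproduce the GALA schema: the table name `genes`, the columns `product` and `molecular_function`, the toy rows, and the helpers `count_entries` and `build_demo` are all hypothetical, and an in-memory SQLite database stands in for the real one.

```python
import sqlite3


def count_entries(conn, table, data_columns=None):
    """Count entries for one category (hypothetical helper, not GALA code).

    With data_columns=None every row counts, which is the rule the note
    gives for most categories. Otherwise only rows holding a non-NULL value
    in at least one of the listed columns count, which mirrors the rule for
    gene products and function.
    """
    if data_columns is None:
        sql = f"SELECT COUNT(*) FROM {table}"
    else:
        has_data = " OR ".join(f"{col} IS NOT NULL" for col in data_columns)
        sql = f"SELECT COUNT(*) FROM {table} WHERE {has_data}"
    return conn.execute(sql).fetchone()[0]


def build_demo():
    """Build a toy in-memory table; names and values are invented."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE genes (name TEXT, product TEXT, molecular_function TEXT)"
    )
    conn.executemany(
        "INSERT INTO genes VALUES (?, ?, ?)",
        [
            ("geneA", "kinase", "ATP binding"),
            ("geneB", None, None),  # a gene row with no product/function data
            ("geneC", "transporter", None),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo()
    print(count_entries(conn, "genes"))  # plain row count
    print(count_entries(conn, "genes", ["product", "molecular_function"]))
```

On the toy data the plain row count is 3, while the count restricted to rows with product or function data is 2, which is why the gene products and function figure in the table is smaller than the number of gene rows.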

This Article

  1. Genome Res. 13: 732-741
