Table 2.

Functional Classification of Individual cDNAs[iii]

Cell cycle
Column groups: cDNA data; Best database hit; Tissue specificity

| Clone ID | Accession no. | Contig size (bp) | ORF size (aa) | Chromosomal location | Description of best hit | Database accession no. | P-value | Gene family | Tissue | Score, # ESTs |
|---|---|---|---|---|---|---|---|---|---|---|
| DKFZp434A0530 | AL136842 | 2768 | 254 | 2p22.1 | gene: Borg2; product: “CRIB-containing BORG2 protein”; Homo sapiens CRIB-containing BORG2 protein (BORG2) mRNA, complete cds. | EMBL AF164118 | 2.1e-99 | | | |
| DKFZp434A1135 | AL122068 | 3010 | 670 | 5q13 | Homo sapiens Rad 17-like protein (RAD17) mRNA, complete cds. | EMBL AF076838 | 0 | | | |
| DKFZp434A1315 | AL136755 | 1848 | 387 | 1q21.2 | product: “F1N21.3”; The sequence of BAC F1N21 from Arabidopsis thaliana chromosome 1, complete sequence. | EMBL AC002130 | 5.7e-22 | | | |
| DKFZp434B174 | AL80146 | 1546 | 398 | 15q21.3 | Homo sapiens mRNA for cyclin B2, complete cds. | EMBL AB020981 | 0 | | ear | 6.386 |
| DKFZp434G0514 | AL136750 | 1503 | 379 | 4p16.2 | cell growth regulating nucleolar protein LYAR - mouse | PIR A40683 | 2.7e-144 | | | |
| DKFZp434H152 | AL136840 | 4619 | 855 | 10p13 | gene: cdc23; “SPBC1347.10”; product: “cell division cycle protein 23”; S. pombe chromosome II cosmid c1347. | EMBL AL035548 | 7e-21 | | | |
| DKFZp434J037 | AL136891 | 3443 | 628 | 1q32.1 | gene: KIAA0537; product: “KIAA0537 protein”; Homo sapiens mRNA for KIAA0537 protein, complete cds. | EMBL AB011109 | 2.6e-148 | protein kinase | | |
| DKFZp434N0250 | AL117525 | 1584 | 462 | 1q43-q44 | product: “AKT3 protein kinase”; Homo sapiens AKT3 protein kinase mRNA, complete cds. | EMBL AF135794 | 2.1e-249 | protein kinase | | |
| DKFZp434P107 | AL136894 | 2380 | 422 | 9q34 | XPMC2 protein - African clawed frog | PIR S53818 | 5.9e-10 | | | |
| DKFZp434P2235 | AL136860 | 2027 | 549 | 17q12 | oncogene 1 (tre-2 locus) (clone 210) - human | PIR S22155 | 5.5e-226 | | testis | 5.8112 |
| DKFZp564A0723 | AL80116 | 2524 | 712 | 6q14.3-q16.1 | gene: ORC3L; product: “origin recognition complex ORC3L subunit”; Homo sapiens origin recognition complex ORC3L subunit (ORC3L) mRNA, complete cds. | EMBL AF135044 | 0 | | | |
| DKFZp564E2182 | AL50261 | 2367 | 204 | 6q22.1-q22.33 | Homo sapiens CGI-98 protein mRNA, complete cds. | EMBL AF151856 | 1.2e-265 | | | |
| DKFZp564G1816 | AL136599 | 4775 | 984 | 3q12.2-q12.3 | gene: KIAA0797; product: “KIAA0797 protein”; Homo sapiens mRNA for KIAA0797 protein, partial cds. | EMBL AB018340 | 2.1e-50 | | | |
| DKFZp564K142 | AL136636 | 2241 | 335 | 17p11.2 | Rattus norvegicus implantation-associated protein (IAG2) mRNA, partial cds. | EMBL AF008554 | 9.4e-184 | | | |
| DKFZp564L0562 | AL80090 | 941 | 185 | 4q31.21 | Homo sapiens mRNA for APC10, complete cds. | EMBL AB012109 | 4.4e-178 | | | |
| DKFZp564N0582 | AL50264 | 1646 | 144 | 3p21.1 | Homo sapiens DRR1 (DRR1) mRNA, complete cds. | EMBL AF089853 | 0 | | brain | 5.1650 |
| DKFZp564N0582 | AL50264 | 1646 | 144 | 3p21.1 | Homo sapiens DRR1 (DRR1) mRNA, complete cds. | EMBL AF089853 | 0 | | retina | 5.457 |
| DKFZp566G0346 | AL136719 | 4503 | 262 | 9q22.1 | Homo sapiens spindlin mRNA, complete cds. | EMBL AF106682 | 0 | | | |

[i] The cDNAs have been grouped into ten functional categories (see Statistics - Classification) based on sequence similarity data. The cDNA clones are available from the Resource Center of the German Genome project using the clone ID shown in the first column. The respective sequences are available at the EMBL/GenBank/DDBJ databases under the accession numbers shown in the second column. The third column provides the size of the individual cDNA inserts, and the fourth column shows the size of the encoded/predicted proteins. The chromosomal location of the respective genes is shown in the fifth column. Columns 6–8 describe the database hits with the highest similarity: the description of the best hit, the accession number of the best hit (and the database where this hit was found), and the P-value of this hit are provided in these three columns, respectively. Similarities were predicted based on BLASTX and BLASTN2 analyses. Selection of the “representative = best” hit was done using the following criteria: (1) A BLASTX hit was judged better than a BLASTN hit. (2) In cases where the best BLASTX hit (only with the TREMBL database) had been calculated from the same nucleotide sequence entry that was the best hit in the BLASTN analysis, the BLASTN hit is given. (3) Genomic sequence entries are given only when no other hits were available.
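
The three selection criteria can be read as a small decision procedure. The Python sketch below is one possible reading, not the authors' pipeline: the `Hit` record, its field names, the placeholder accessions in the usage example, and the use of the lowest P-value to pick the best hit within each program are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one database hit (field names are assumptions)."""
    program: str                 # "BLASTX" or "BLASTN"
    database: str                # e.g. "TREMBL", "PIR", "EMBL"
    accession: str
    p_value: float
    is_genomic: bool = False     # True for genomic sequence entries
    # For translated protein entries: accession of the nucleotide entry
    # the protein was derived from (assumed to be known to the caller).
    source_nucleotide: Optional[str] = None


def best_of(hits: Sequence[Hit]) -> Optional[Hit]:
    """Assumption: within one program, the best hit has the lowest P-value."""
    return min(hits, key=lambda h: h.p_value) if hits else None


def select_representative(hits: Sequence[Hit]) -> Optional[Hit]:
    # Criterion 3: genomic entries are used only when nothing else is available.
    non_genomic = [h for h in hits if not h.is_genomic]
    pool = non_genomic if non_genomic else list(hits)

    best_x = best_of([h for h in pool if h.program == "BLASTX"])
    best_n = best_of([h for h in pool if h.program == "BLASTN"])

    if best_x is None:
        return best_n

    # Criterion 2: a TREMBL hit translated from the nucleotide entry that is
    # itself the best BLASTN hit is reported as the BLASTN hit.
    if (
        best_n is not None
        and best_x.database == "TREMBL"
        and best_x.source_nucleotide == best_n.accession
    ):
        return best_n

    # Criterion 1: otherwise a BLASTX hit is preferred over a BLASTN hit.
    return best_x


if __name__ == "__main__":
    # Placeholder accessions, not taken from the table.
    protein = Hit("BLASTX", "TREMBL", "PROT0002", 1e-40, source_nucleotide="NUC0001")
    nucleotide = Hit("BLASTN", "EMBL", "NUC0001", 1e-90)
    print(select_representative([protein, nucleotide]).accession)
```

Under this reading, criterion 3 is applied first as a filter, so a genomic entry can only be returned when every available hit is genomic.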

[ii] If classification of a protein to a major gene family was possible (based on similarity information), the respective family is shown in column 9. Based on the availability of EST information, tissue-specific expression of transcripts has been depicted in columns 10–13, showing the tissue, an arbitrary score (see WWW2001) and the absolute number of ESTs sequenced from that particular tissue (at the time of analysis), respectively.

[iii] This section is excerpted from the full table, available on-line at http://www.dkfz-heidelberg.de/abt0840/GCC.