Markup | Genome Research

Table 2.

Functional Classification of Individual cDNAs[iii]

Cell cycle

cDNA data

Best database hit

Tissue specificity

| Clone ID | Accession no. | Contig size (bp) | ORF size (aa) | Chromosomal location | Description of best hit | Database accession no. | P-value | Gene family | Tissue | Score | # ESTs |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DKFZp434A0530 | AL136842 | 2768 | 254 | 2p22.1 | gene:Borg2; product: “CRIB-containing BORG2 protein”; Homo sapiens CRIB-containing BORG2 protein (BORG2) mRNA, complete cds. | EMBL AF164118 | 2.1e-99 | | | | |
| DKFZp434A1135 | AL122068 | 3010 | 670 | 5q13 | Homo sapiens Rad 17-like protein (RAD17) mRNA, complete cds. | EMBL AF076838 | | | | | |
| DKFZp434A1315 | AL136755 | 1848 | 387 | 1q21.2 | product: “F1N21.3”; The sequence of BAC F1N21 from Arabidopsis thaliana chromosome 1, complete sequence. | EMBL AC002130 | 5.7e-22 | | | | |
| DKFZp434B174 | AL80146 | 1546 | 398 | 15q21.3 | Homo sapiens mRNA for cyclin B2, complete cds. | EMBL AB020981 | | | ear | 6.38 | |
| DKFZp434G0514 | AL136750 | 1503 | 379 | 4p16.2 | cell growth regulating nucleolar protein LYAR - mouse | PIR A40683 | 2.7e-144 | | | | |
| DKFZp434H152 | AL136840 | 4619 | 855 | 10p13 | gene:cdc23; “SPBC1347.10”; product: “cell division cycle protein 23”; S. pombe chromosome II cosmid c1347. | EMBL AL035548 | 7e-21 | | | | |
| DKFZp434J037 | AL136891 | 3443 | 628 | 1q32.1 | gene:KIAA0537; product: “KIAA0537 protein”; Homo sapiens mRNA for KIAA0537 protein, complete cds. | EMBL AB011109 | 2.6e-148 | protein kinase | | | |
| DKFZp434N0250 | AL117525 | 1584 | 462 | 1q43-q44 | product: “AKT3 protein kinase”; Homo sapiens AKT3 protein kinase mRNA, complete cds. | EMBL AF135794 | 2.1e-249 | protein kinase | | | |
| DKFZp434P107 | AL136894 | 2380 | 422 | 9q34 | XPMC2 protein - African clawed frog | PIR S53818 | 5.9e-10 | | | | |
| DKFZp434P2235 | AL136860 | 2027 | 549 | 17q12 | oncogene 1 (tre-2 locus) (clone 210) - human | PIR S22155 | 5.5e-226 | | testis | 5.81 | |
| DKFZp564A0723 | AL80116 | 2524 | 712 | 6q14.3-q16.1 | gene:ORC3L; product: “origin recognition complex ORC3L subunit”; Homo sapiens origin recognition complex ORC3L subunit (ORC3L) mRNA, complete cds. | EMBL AF135044 | | | | | |
| DKFZp564E2182 | AL50261 | 2367 | 204 | 6q22.1-q22.33 | Homo sapiens CGI-98 protein mRNA, complete cds. | EMBL AF151856 | 1.2e-265 | | | | |
| DKFZp564G1816 | AL136599 | 4775 | 984 | 3q12.2-q12.3 | gene:KIAA0797; product: “KIAA0797 protein”; Homo sapiens mRNA for KIAA0797 protein, partial cds. | EMBL AB018340 | 2.1e-50 | | | | |
| DKFZp564K142 | AL136636 | 2241 | 335 | 17p11.2 | Rattus norvegicus implantation-associated protein (IAG2) mRNA, partial cds. | EMBL AF008554 | 9.4e-184 | | | | |
| DKFZp564L0562 | AL80090 | 941 | 185 | 4q31.21 | Homo sapiens mRNA for APC10, complete cds. | EMBL AB012109 | 4.4e-178 | | | | |
| DKFZp564N0582 | AL50264 | 1646 | 144 | 3p21.1 | Homo sapiens DRR1 (DRR1) mRNA, complete cds. | EMBL AF089853 | | | brain | 5.16 | |
| DKFZp564N0582 | AL50264 | 1646 | 144 | 3p21.1 | Homo sapiens DRR1 (DRR1) mRNA, complete cds. | EMBL AF089853 | | | retina | 5.45 | |
| DKFZp566G0346 | AL136719 | 4503 | 262 | 9q22.1 | Homo sapiens spindlin mRNA, complete cds. | EMBL AF106682 | | | | | |

[i] The cDNAs have been grouped into ten functional categories (see Statistics - Classification) based on sequence similarity data. The cDNA clones are available from the Resource Center of the German Genome project using the clone ID shown in the first column. The respective sequences are available from the EMBL/GenBank/DDBJ databases under the accession numbers shown in the second column. The third column gives the size of the individual cDNA inserts, and the fourth column gives the size of the encoded/predicted proteins. The chromosomal location of the respective genes is shown in the fifth column. Columns 6–8 describe the database hit with the highest similarity: the description of the best hit, its accession number (and the database in which the hit was found), and its P-value, respectively. Similarities were predicted based on BLASTX and BLASTN2 analyses. The “representative = best” hit was selected using the following criteria: (1) A BLASTX hit was judged better than a BLASTN hit. (2) Where the best BLASTX hit (TREMBL database only) had been calculated from the same nucleotide sequence entry that was the best hit in the BLASTN analysis, the BLASTN hit is given. (3) Genomic sequence entries are given only when no other hits were available.

[ii] Where a protein could be assigned to a major gene family (based on similarity information), the respective family is shown in column 9. Where EST information was available, tissue-specific expression of transcripts is shown in columns 10–13, which give the tissue, an arbitrary score (see WWW2001), and the absolute number of ESTs sequenced from that particular tissue (at the time of analysis), respectively.

[iii] This section is excerpted from the full table, available on-line at http://www.dkfz-heidelberg.de/abt0840/GCC.
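
The three selection criteria in footnote [i] can be read as a small decision procedure. The following is a minimal Python sketch of that reading, not the authors' actual pipeline. The `Hit` class, its field names (in particular `source_nucleotide` and `is_genomic`), and the choice to rank hits within one program by lowest P-value are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence


@dataclass(frozen=True)
class Hit:
    """One database hit (hypothetical record layout, not from the paper)."""
    program: str                  # "BLASTX" or "BLASTN"
    database: str                 # e.g. "TREMBL", "PIR", "EMBL"
    accession: str
    p_value: float
    is_genomic: bool = False      # True for genomic entries (BACs, cosmids, ...)
    # Assumed field: nucleotide entry a TREMBL protein was translated from.
    source_nucleotide: Optional[str] = None


def _lowest_p(hits: Iterable[Hit]) -> Optional[Hit]:
    """Return the hit with the smallest P-value, or None if there are none."""
    return min(hits, key=lambda h: h.p_value, default=None)


def representative_hit(hits: Sequence[Hit]) -> Optional[Hit]:
    """Pick the "representative = best" hit following criteria (1)-(3)."""
    # Criterion 3: genomic entries are used only if nothing else is available.
    non_genomic = [h for h in hits if not h.is_genomic]
    pool = non_genomic or list(hits)

    best_x = _lowest_p(h for h in pool if h.program == "BLASTX")
    best_n = _lowest_p(h for h in pool if h.program == "BLASTN")
    if best_x is None or best_n is None:
        return best_x or best_n

    # Criterion 2: a TREMBL protein translated from the very entry that is
    # the best BLASTN hit adds no independent evidence; report the BLASTN hit.
    if (best_x.database == "TREMBL"
            and best_x.source_nucleotide == best_n.accession):
        return best_n

    # Criterion 1: otherwise a BLASTX hit outranks a BLASTN hit.
    return best_x
```

Under these assumptions, a clone with a PIR protein hit and an EMBL nucleotide hit would be reported with the PIR hit, while a clone whose only hit is a genomic BAC entry would be reported with that BAC entry.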