Table 1.

Sequence Sets Used in Analyses

| Data set | No. of seqs. | No. of chars. | Description |
|---|---|---|---|
| Fungal nucleotide data sets | | | |
| NcrEST | 3,578 | 1,821,906 | N. crassa ESTs from the Neurospora Genome Project.[i] |
| Ncr contigs | 2,093 | 1,147,268 | N. crassa sequences assembled from “Ncr EST.”[ii] |
| ScerEST | 3,424 | 1,136,588 | S. cerevisiae ESTs from TIGR.[iii] |
| CAL | 1,631 | 14,929,251 | genomic sequence from C. albicans.[iv] |
| ENI | 13,404 | 5,594,817 | nucleotide sequences from A. nidulans.[v] |
| Fungal amino acid data sets | | | |
| NCR | 1,007 | 400,653 | translated ORFs for non-EST N. crassa sequences.[v] |
| SC | 6,227 | 2,908,935 | translated ORFs from complete S. cerevisiae genome.[vi] |
| NAscF | 2,130 | 735,449 | translated ORFs from nonascomycete fungi.[v] |
| Spo | 8,358 | 3,708,009 | translated ORFs from S. pombe.[v] |
| Nonfungal nucleotide data sets | | | |
| HMEST | 1,228,825 | 455,623,980 | human and mouse ESTs from dbEST.[vii] |
| Nonfungal amino acid data sets | | | |
| NF | 206,898 | 64,637,987 | translated ORFs for nonfungal organisms.[viii] |
| EUTH | 166,241 | 44,409,356 | translated ORFs from eutherian (placental) mammals.[v] |

[ii] These assembled sequences were clustered into 1197 discontigs, which correspond to putative unique loci. These sequences can be retrieved from http://molbio.ahpcc.unm.edu/search/discontigs.html using the discontig numbers given in this paper.

[viii] Subset of GSDB (Skupski et al. 1999) kindly provided by Marian Skupski of the National Center for Genome Resources.