Table 1.
Sequence Sets Used in Analyses
| Data set | No. of seqs. | No. of chars. | Description |
| Fungal nucleotide data sets | | | |
| NcrEST | 3,578 | 1,821,906 | N. crassa ESTs from the Neurospora Genome Project. |
| Ncr contigs | 2,093 | 1,147,268 | N. crassa sequences assembled from “Ncr EST.” |
| ScerEST | 3,424 | 1,136,588 | S. cerevisiae ESTs from TIGR. |
| CAL | 1,631 | 14,929,251 | genomic sequence from C. albicans. |
| ENI | 13,404 | 5,594,817 | nucleotide sequences from A. nidulans. |
| Fungal amino acid data sets | | | |
| NCR | 1,007 | 400,653 | translated ORFs for non-EST N. crassa sequences. |
| SC | 6,227 | 2,908,935 | translated ORFs from complete S. cerevisiae genome. |
| NAscF | 2,130 | 735,449 | translated ORFs from nonascomycete fungi. |
| Spo | 8,358 | 3,708,009 | translated ORFs from S. pombe. |
| Nonfungal nucleotide data sets | | | |
| HMEST | 1,228,825 | 455,623,980 | human and mouse ESTs from dbEST. |
| Nonfungal amino acid data sets | | | |
| NF | 206,898 | 64,637,987 | translated ORFs for nonfungal organisms. |
| EUTH | 166,241 | 44,409,356 | translated ORFs from eutherian (placental) mammals. |
These assembled sequences were clustered into 1197 discontigs, which correspond to putative unique loci. These sequences can be retrieved from http://molbio.ahpcc.unm.edu/search/discontigs.html using the discontig numbers used in this paper.

Subset of GSDB (Skupski et al. 1999) kindly provided by Marian Skupski of the National Center for Genome Resources.











