Sequence Sets Used in Analyses
| Data set | No. of seqs. | No. of chars. | Description |
| Fungal nucleotide data sets | |||
| NcrEST | 3,578 | 1,821,906 | N. crassa ESTs from the Neurospora Genome Project.[i] |
| Ncr contigs | 2,093 | 1,147,268 | N. crassa sequences assembled from “NcrEST.”[ii] |
| ScerEST | 3,424 | 1,136,588 | S. cerevisiae ESTs from TIGR.[iii] |
| CAL | 1,631 | 14,929,251 | genomic sequence from C. albicans.[iv] |
| ENI | 13,404 | 5,594,817 | nucleotide sequences from A. nidulans.[v] |
| Fungal amino acid data sets | |||
| NCR | 1,007 | 400,653 | translated ORFs for non-EST N. crassa sequences.[v] |
| SC | 6,227 | 2,908,935 | translated ORFs from complete S. cerevisiae genome.[vi] |
| NAscF | 2,130 | 735,449 | translated ORFs from nonascomycete fungi.[v] |
| Spo | 8,358 | 3,708,009 | translated ORFs from S. pombe.[v] |
| Nonfungal nucleotide data sets | |||
| HMEST | 1,228,825 | 455,623,980 | human and mouse ESTs from dbEST.[vii] |
| Nonfungal amino acid data sets | |||
| NF | 206,898 | 64,637,987 | translated ORFs for nonfungal organisms.[viii] |
| EUTH | 166,241 | 44,409,356 | translated ORFs from eutherian (placental) mammals.[v] |
[ii] These assembled sequences were clustered into 1,197 discontigs, which correspond to putative unique loci. The sequences can be retrieved from http://molbio.ahpcc.unm.edu/search/discontigs.html using the discontig numbers given in this paper.
[viii] Subset of GSDB (Skupski et al. 1999) kindly provided by Marian Skupski of the National Center for Genome Resources.