Table 1.

Estimated Intergenic Fractions

                  Homo sapiens                  Drosophila melanogaster  Caenorhabditis elegans  Arabidopsis thaliana
Euchromatin       3180000                       123000                   97800                   130000
Sequenced DNA     369000                        123000                   91000                   119000
Gene-to-gene      45.4                          9.0                      5.3                     4.7
cDNA aligned      1061                          1628                     583                     1401
Genomic quality   1.2                           23.3                     2.4                     15.7
Nested genes      6%                            8%                       4%                      1%
5th percentile    2.5                           0.9                      0.8                     0.9
Genomic length    43.4                          9.5                      5.0                     2.6
95th percentile   165.5                         36.3                     14.2                    5.4
%, missing half   11%                           10%                      21%                     30%
Intergenic DNA    Discussed in text of article  3%                       10%                     46%

[i] The first three rows list the euchromatic genome size, the amount of genomic sequence that was analyzed, and the annotation-based estimate of the gene-to-gene distance. The next three rows describe the cDNA alignments. These rows list the number of aligned cDNAs, our quality assessment for the genomic contigs (i.e., the median of the genomic contig size divided by the genomic length for the 95th-percentile gene), and our estimate of the frequency of nested genes (i.e., genes on the reverse strand or inside an intron). The genomic length is given in the next three rows by its arithmetic mean and by its 5th and 95th percentile values. Next, we indicate what fraction of the largest genes would have to be unidentified for half of the intragenic space to be missing. The last row lists the intergenic fraction, computed by correcting the mean genomic length for nested genes, dividing that by the mean gene-to-gene distance, and subtracting the result from one. Note: In Drosophila melanogaster, we do not count scaffold joins longer than 1 kb as contiguous when computing the genomic quality. All lengths are reported in kb.
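
The calculation behind the last row can be sketched in a few lines of Python. This is only an illustration under a stated assumption: the note does not give the exact form of the nested-gene correction, so the sketch simply scales the mean genomic length by one minus the nested-gene frequency. The function name `intergenic_fraction` and the dictionary layout are hypothetical. Because the inputs are the rounded values from the table, the results are expected to land only within about a percentage point of the tabulated fractions, not to match them exactly.

```python
# Rounded values copied from Table 1 (lengths in kb, percentages as fractions).
# Homo sapiens is omitted because its intergenic fraction is discussed in the text.
TABLE_1 = {
    "Drosophila melanogaster": {
        "gene_to_gene": 9.0, "mean_length": 9.5, "nested": 0.08, "reported": 0.03,
    },
    "Caenorhabditis elegans": {
        "gene_to_gene": 5.3, "mean_length": 5.0, "nested": 0.04, "reported": 0.10,
    },
    "Arabidopsis thaliana": {
        "gene_to_gene": 4.7, "mean_length": 2.6, "nested": 0.01, "reported": 0.46,
    },
}


def intergenic_fraction(mean_length_kb, gene_to_gene_kb, nested_fraction):
    """Return 1 - (corrected mean genomic length / mean gene-to-gene distance).

    ASSUMPTION: the nested-gene correction is modeled as a simple scaling by
    (1 - nested_fraction); the correction actually used may differ.
    """
    corrected_length = mean_length_kb * (1.0 - nested_fraction)
    return 1.0 - corrected_length / gene_to_gene_kb


def estimate_all(table=TABLE_1):
    """Apply the sketch to every species in the table."""
    return {
        species: intergenic_fraction(
            row["mean_length"], row["gene_to_gene"], row["nested"]
        )
        for species, row in table.items()
    }


if __name__ == "__main__":
    for species, fraction in estimate_all().items():
        reported = TABLE_1[species]["reported"]
        print(f"{species}: sketch {fraction:.1%}, table {reported:.0%}")
```

Any remaining gap between the sketch and the table would be consistent with rounding of the inputs or with a correction that differs from the simple scaling assumed here.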