Table 1.

Properties of SHARCGS assemblies from 30-mer data sets encompassing all 30-mer reads that can be deduced from the forward or reverse strand; no sequence errors permitted.


Rows 1–10: Arabidopsis thaliana BACs; rows 11–20: Drosophila melanogaster BACs; rows 21–30: Homo sapiens BACs. Rows labeled “Avg.” give the average value of each parameter. Column labels: 1, BAC number; 2, species name (abbreviated); 3, GenBank accession number of the BAC; 4, chromosome assignment; 5, length of the BAC in bp; 6, percentage of bases masked using RepeatMasker [for BACs 21–30, superscripts indicate the repeat classes present, ordered by decreasing frequency from left to right: S, SINEs (ALUs, MIRs); L, LINEs (LINE1, LINE2, L3/CR1); R, LTR elements (MaLRs, ERVL, ERV_class I, ERV_class II); D, DNA elements (MER1_type, MER2_type)]; 7, number of SHARCGS contigs >50 bp; 8, average length of SHARCGS contigs in bp (only contigs >50 bp were counted); 9, size of the largest SHARCGS contig; 10, N50 length of the SHARCGS assembly; 11, SHARCGS contig coverage of the source BAC insert sequence. All SHARCGS contigs are 100% identical to the reference sequence.
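
For readers who want to recompute columns 7–10 from a list of contig lengths, the following minimal sketch may help. It assumes the common definition of N50 (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled length); the legend does not state which definition was used, so this is an assumption. The function names `n50` and `contig_summary` and the example lengths are hypothetical and are not taken from SHARCGS or from the table.

```python
def n50(lengths):
    """N50 under the common definition (an assumption, see lead-in):
    walk contigs from longest to shortest and return the length at which
    the running total first reaches half of the total assembled length."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # no contigs supplied


def contig_summary(lengths, min_len=50):
    """Summarise contigs strictly longer than min_len bp, mirroring
    columns 7-10 of the legend (count, average, largest, N50)."""
    kept = [n for n in lengths if n > min_len]
    if not kept:
        return {"count": 0, "mean": 0.0, "largest": 0, "n50": 0}
    return {
        "count": len(kept),
        "mean": sum(kept) / len(kept),
        "largest": max(kept),
        "n50": n50(kept),
    }


# Hypothetical contig lengths in bp; the 40 bp contig is excluded by the >50 bp filter.
example = contig_summary([40, 60, 100, 200, 640])
```

Whether the filter on contigs >50 bp was also applied before computing the N50 in column 10 is not stated in the legend; the sketch applies it for consistency with columns 7 and 8.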