Assemblathon 1: A competitive assessment of de novo short read assembly methods

  1. Benedict Paten (3,29)
  1. UCSC;
  2. UC Davis;
  3. UC Santa Cruz;
  4. Agency for Science, Singapore;
  5. Wellcome Trust Sanger;
  6. EMBL-EBI;
  7. CRACS Portugal;
  8. Genome Sciences Centre, BC Cancer Agency;
  9. DOE JGI;
  10. UC Berkeley;
  11. Symbiose, IRISA;
  12. CNRS/Symbiose, IRISA;
  13. CSHL;
  14. U MD;
  15. Monsanto;
  16. U of Georgia;
  17. UCSF;
  18. Softberry;
  19. GAC, Norwich;
  20. Sainsbury Lab;
  21. U of Chicago;
  22. BGI-Shenzhen;
  23. Broad;
  24. Sanger;
  25. BC Cancer Genome Sciences Centre;
  26. Iowa State;
  27. GAC, Sainsbury Lab & Wellcome Trust;
  28. Broad Institute
  * Corresponding author; email: benedict{at}soe.ucsc.edu

Abstract

Low-cost short-read sequencing technology has revolutionised genomics, though it is only just becoming practical for the high-quality de novo assembly of a novel large genome. We describe the Assemblathon 1 competition, which aimed to comprehensively assess the state of the art in de novo assembly methods when applied to current sequencing technologies. In a collaborative effort, teams were asked to assemble a simulated Illumina HiSeq dataset of an unknown, simulated diploid genome. A total of 41 assemblies from 17 different groups were received. Novel haplotype-aware assessments of coverage, contiguity, structure, base calling and copy number were made. We establish that within this benchmark (1) it is possible to assemble the genome to a high level of coverage and accuracy, and that (2) large differences exist between the assemblies, suggesting room for further improvements in current methods.

  • Received May 20, 2011.
  • Accepted September 8, 2011.

This manuscript is Open Access.

Genome Res. gr.126599.111 Copyright © 2011, Cold Spring Harbor Laboratory Press
