Chromosome-scale shotgun assembly using an in vitro method for long-range linkage

Abstract

Long-range and highly accurate de novo assembly from short-read data is one of the most pressing challenges in genomics. Recently, it has been shown that read pairs generated by proximity ligation of DNA in the chromatin of living tissue can address this problem, dramatically increasing the scaffold contiguity of assemblies. Here, we describe a simpler approach ('Chicago') based on in vitro reconstituted chromatin. We generated two Chicago datasets with human DNA and developed a statistical model and a new software pipeline ('HiRise') that can identify poor-quality joins and produce accurate, long-range sequence scaffolds. We used these to construct a highly accurate de novo assembly and scaffolding of a human genome with a scaffold N50 of 20 Mbp. We also demonstrated the utility of Chicago for improving existing assemblies by re-assembling and scaffolding the genome of the American alligator. With a single library and one lane of Illumina HiSeq sequencing, we increased the scaffold N50 of the American alligator from 508 kbp to 10 Mbp.
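
For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed: sort scaffold lengths from longest to shortest and report the length at which the running total first reaches half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from HiRise or from the data in this article.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    N50 is the length L such that scaffolds of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)   # longest scaffold first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (not real data): total length is 300,
# so the halfway point of 150 is reached at the second scaffold.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # 70
```

Because N50 depends only on the length distribution, joining scaffolds raises it whether or not the joins are correct, which is why the abstract reports accuracy alongside contiguity.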

  • Received April 23, 2015.
  • Accepted December 21, 2015.

This manuscript is Open Access.

This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International license), as described at http://creativecommons.org/licenses/by/4.0/.

ACCEPTED MANUSCRIPT

This Article

Genome Res. gr.193474.115. Published by Cold Spring Harbor Laboratory Press.
