Simulation of nanopore sequencing signal data with tunable parameters

  1. Ira Deveson1,3
  1. 1 Garvan Institute of Medical Research;
  2. 2 University of Sydney
  • * Corresponding author; email: i.deveson{at}garvan.org.au
  • Abstract

    In silico simulation of high-throughput sequencing data is a technique used widely in the genomics field. However, there is currently a lack of effective tools for creating simulated data from nanopore sequencing devices, which measure DNA or RNA molecules in the form of time-series current signal data. Here, we introduce Squigulator, a fast and simple tool for simulation of realistic nanopore signal data. Squigulator takes a reference genome, transcriptome or read sequences and generates corresponding raw nanopore signal data. The simulated data are compatible with basecalling software from Oxford Nanopore Technologies (ONT) and other third-party tools, thereby providing a useful substrate for development, testing, debugging, validation and optimization at every stage of a nanopore analysis workflow. The user may generate data with preset parameters emulating specific ONT protocols, generate noise-free 'ideal' data, or deterministically modify a range of experimental variables and/or noise parameters to shape the data to their needs. We present a brief example of Squigulator's use, creating simulated data to model the degree to which different parameters impact the accuracy of ONT basecalling and downstream variant detection. This analysis reveals new insights into the nature of ONT data and basecalling algorithms. We provide Squigulator as an open-source tool for the nanopore community.
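
The general idea of turning a nucleotide sequence into a time-series current signal can be illustrated with a toy sketch. This is not Squigulator's implementation. It assumes a hypothetical 3-mer pore model with randomly generated current levels, an exponentially distributed dwell time per k-mer, and Gaussian amplitude noise. The names `toy_pore_model`, `simulate_signal`, `mean_dwell`, `noise_scale` and `random_dwell` are invented for illustration only.

```python
import random
from itertools import product

K = 3  # toy k-mer length (assumption; chosen only to keep the table small)


def toy_pore_model(k=K, seed=0):
    """Build a hypothetical k-mer -> (mean_pA, stdev_pA) table.

    The values are random placeholders, not measured pore-model levels.
    """
    rng = random.Random(seed)
    return {
        "".join(kmer): (rng.uniform(60.0, 130.0), rng.uniform(1.0, 3.0))
        for kmer in product("ACGT", repeat=k)
    }


def simulate_signal(seq, model, k=K, mean_dwell=10,
                    noise_scale=1.0, random_dwell=True, seed=0):
    """Return a list of simulated current samples for `seq`.

    noise_scale=0 and random_dwell=False give noise-free 'ideal' data;
    a fixed seed makes the noisy output reproducible.
    """
    rng = random.Random(seed)
    signal = []
    for i in range(len(seq) - k + 1):
        mean, stdev = model[seq[i:i + k]]
        if random_dwell:
            # Number of samples emitted while this k-mer sits in the pore.
            dwell = max(1, round(rng.expovariate(1.0 / mean_dwell)))
        else:
            dwell = mean_dwell
        for _ in range(dwell):
            if noise_scale == 0:
                signal.append(mean)
            else:
                signal.append(rng.gauss(mean, stdev * noise_scale))
    return signal


model = toy_pore_model()
ideal = simulate_signal("ACGTAC", model, mean_dwell=5,
                        noise_scale=0.0, random_dwell=False)
noisy = simulate_signal("ACGTAC", model, mean_dwell=5, noise_scale=2.0)
```

In this sketch, raising `noise_scale` or changing `mean_dwell` alters the signal in a controlled way while the underlying sequence stays fixed, which is the kind of parameter sweep the abstract describes for testing basecalling accuracy.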

    • Received November 14, 2023.
    • Accepted April 24, 2024.

    This manuscript is Open Access.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International license), as described at http://creativecommons.org/licenses/by-nc/4.0/.

