German Nudelman; Antonio Frasca; Brandon Kent; Kirsten C. Sadler; Stuart C. Sealfon; Martin J. Walsh; Elena Zaslavsky

Figure 1.

Overview of the embryonic zebrafish long-read transcriptome analysis. (A) Schematic of the long-read–based transcriptome reconstruction pipeline. Pooled RNA from α-amanitin-treated and untreated embryos was collected and profiled with long-read sequencing. Temporally matched pre- and post-ZGA pooled embryonic RNA samples were profiled with short-read RNA-seq. Raw long reads were assembled into transcripts using Iso-Seq (O'Grady et al. 2016), mapped to the GRCz10 reference genome using GMAP (Wu and Watanabe 2005), and annotated against the reference transcriptome using Cuffcompare (Trapnell et al. 2012). Novel transcripts were compared to the short-read data and computationally validated to construct a final long-read-augmented transcriptome. Select transcripts were experimentally validated. (B) Coverage of the reference GRCz10 zebrafish transcriptome by long-read sequencing. The gray histogram shows the number of RefSeq-annotated transcripts across a range of transcript lengths. The light purple histogram shows the subset of those reference transcripts that long-read transcripts overlap by any amount; 52.8% of reference transcripts have some overlap with the long-read data. (C) Potential novelty in the long-read transcriptome. Exon–intron structures of long-read transcripts were compared against the reference annotation. A majority of the observed transcripts corresponded to potentially novel genes or isoforms.

High resolution annotation of zebrafish transcriptome using long-read sequencing
