High resolution annotation of zebrafish transcriptome using long-read sequencing

Figure 1.

Overview of the embryonic zebrafish long-read transcriptome analysis. (A) Schematic of the long-read–based transcriptome reconstruction pipeline. Pooled RNA from α-amanitin-treated and untreated embryos was collected and profiled with long-read sequencing. Temporally corresponding pre- and post-ZGA pooled embryonic RNA samples were profiled with short-read RNA-seq. Long-read raw data were assembled into transcripts using Iso-Seq (O'Grady et al. 2016), mapped to the GRCz10 reference genome using GMAP (Wu and Watanabe 2005), and annotated against the reference transcriptome using Cuffcompare (Trapnell et al. 2012). Novel transcripts were compared to the short-read data and computationally validated to construct a final long-read-augmented transcriptome. Select transcripts were experimentally validated. (B) Coverage of the reference GRCz10 zebrafish transcriptome by long-read sequencing. The gray histogram shows the number of RefSeq-annotated transcripts across a range of transcript lengths. The light purple histogram shows the long-read transcripts that overlap RefSeq annotations by any amount; 52.8% of reference transcripts have some overlap with the long-read data. (C) Potential novelty in the long-read transcriptome. Long-read transcripts were compared against the reference annotation for similarity in exon–intron structure. A majority of the observed transcripts corresponded to potentially novel genes or isoforms.
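The "overlap by any amount" measure in panel B can be illustrated with a minimal sketch. This is not the authors' code: the `Transcript` type, the function names, the half-open coordinates, the same-strand requirement, and the toy coordinates are all assumptions made for illustration, and the sketch compares whole transcript spans rather than exons.

```python
from collections import defaultdict
from typing import NamedTuple


class Transcript(NamedTuple):
    """Hypothetical minimal transcript record (span only, no exon structure)."""
    chrom: str
    strand: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def overlaps(a: Transcript, b: Transcript) -> bool:
    """True if the two spans share at least one base on the same chromosome and strand."""
    return (
        a.chrom == b.chrom
        and a.strand == b.strand
        and a.start < b.end
        and b.start < a.end
    )


def fraction_overlapped(reference, long_reads) -> float:
    """Fraction of reference transcripts overlapped by any long-read transcript."""
    if not reference:
        return 0.0
    # Group long-read transcripts by (chromosome, strand) to limit comparisons.
    by_locus = defaultdict(list)
    for lr in long_reads:
        by_locus[(lr.chrom, lr.strand)].append(lr)
    hits = sum(
        1
        for ref in reference
        if any(overlaps(ref, lr) for lr in by_locus.get((ref.chrom, ref.strand), []))
    )
    return hits / len(reference)


# Toy data (assumed coordinates, not taken from the study).
reference = [
    Transcript("chr1", "+", 100, 500),
    Transcript("chr1", "+", 1000, 2000),
    Transcript("chr1", "-", 100, 500),
    Transcript("chr2", "+", 50, 150),
]
long_reads = [
    Transcript("chr1", "+", 450, 900),  # overlaps the first reference transcript
    Transcript("chr2", "+", 150, 300),  # adjacent to, but not overlapping, the fourth
    Transcript("chr1", "-", 0, 120),    # overlaps the third reference transcript
]

covered = fraction_overlapped(reference, long_reads)
print(f"{covered:.1%} of reference transcripts have some long-read overlap")
```

On real annotations, an interval tree or a sorted sweep would replace the pairwise scan, and the choice between span-level and exon-level overlap would change the resulting percentage.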

This Article

  1. Genome Res. 28: 1415-1425
