# FusionInspector: In silico Validation of Fusion Transcript Predictions

<img src="images/FusionInspectorsvg.png" width="175" />

FusionInspector is a component of the [Trinity Cancer Transcriptome Analysis Toolkit (CTAT)](https://github.com/NCIP/Trinity_CTAT/wiki). FusionInspector assists in fusion transcript discovery by performing a supervised analysis of fusion predictions, attempting to recover and re-score evidence for such predictions. As of July 2017, FusionInspector has been included as a component of the [STAR-Fusion](https://github.com/STAR-Fusion/STAR-Fusion/wiki) suite.

Given a list of candidate fusion genes (as derived from running any fusion transcript prediction tool, such as [Prada](http://bioinformatics.mdanderson.org/main/PRADA:Overview), [FusionCatcher](http://biorxiv.org/content/early/2014/11/19/011650), [SoapFuse](http://soap.genomics.org.cn/soapfuse.html), [TophatFusion](http://ccb.jhu.edu/software/tophat/fusion_index.html), [DISCASM/GMAP-Fusion](https://github.com/DISCASM/DISCASM/wiki), [STAR-Fusion](https://github.com/STAR-Fusion/STAR-Fusion/wiki), or other), FusionInspector extracts the genomic regions for the fusion partners and constructs mini-fusion-contigs containing the pairs of genes in their proposed fused orientation.  The original reads are aligned to these candidate fusion contigs; fusion-supporting reads that would normally align as discordant pairs or split reads should align as concordant 'normal' reads in this fusion-gene context.  Those reads supporting each fusion (spanning fragments and fusion-breakpoint-containing reads) are identified, reported, and scored accordingly. An overview of the FusionInspector process is shown below.

<img src="images/FusionInspector-alg_overview.png" />
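
To make the two kinds of supporting evidence concrete, here is a toy sketch of how a read pair aligned to a mini-fusion-contig could be labeled. It is not FusionInspector's actual implementation. It assumes a simplified contig in which geneA occupies coordinates below a hypothetical `boundary` and geneB occupies coordinates at or above it, and it ignores strand, spacer sequence, and alignment quality. The function names and labels are illustrative only.

```python
from typing import List, Tuple

# One aligned block of a read on the contig, as a half-open interval [start, end).
Block = Tuple[int, int]


def side(blocks: List[Block], boundary: int) -> str:
    """Return 'A', 'B', or 'both' depending on which gene region the blocks cover."""
    if not blocks:
        raise ValueError("a read needs at least one aligned block")
    in_a = any(start < boundary for start, _ in blocks)
    in_b = any(end > boundary for _, end in blocks)
    if in_a and in_b:
        return "both"
    return "A" if in_a else "B"


def classify_pair(read1: List[Block], read2: List[Block], boundary: int) -> str:
    """Label a read pair aligned to a toy geneA + geneB contig (hypothetical labels)."""
    sides = (side(read1, boundary), side(read2, boundary))
    if "both" in sides:
        # At least one read has aligned sequence in both genes,
        # so it contains the fusion breakpoint.
        return "breakpoint_read"
    if set(sides) == {"A", "B"}:
        # One mate lies wholly in geneA and the other wholly in geneB.
        return "spanning_fragment"
    return "no_fusion_support"


if __name__ == "__main__":
    boundary = 1000  # hypothetical end of the geneA region on the contig
    print(classify_pair([(960, 1000), (1200, 1240)], [(1250, 1325)], boundary))
    print(classify_pair([(800, 875)], [(1250, 1325)], boundary))
```

In this simplified picture, both kinds of supporting pairs align concordantly to the contig; what distinguishes them is only whether a single read crosses from one gene into the other.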


Optionally, [Trinity de novo transcriptome assembly](https://github.com/trinityrnaseq/trinityrnaseq/wiki) can be executed as part of the FusionInspector routine in order to de novo reconstruct fusion transcripts from the mapped reads.

## Visualizations

The evidence for fusions as evaluated by FusionInspector is easily viewed and navigated via [html-based fusion reports](FusionInspector-Visualizations) included as output. Alternatively, outputs in standard formats (bed, gtf, fasta) can be visualized in a genome browser such as IGV. Examples are provided below.

<img src="images/FusionInspector_igv_report.png" />

See [FusionInspector Visualizations](FusionInspector-Visualizations) for detailed options.


## Installation Requirements

Installing FusionInspector should be a breeze. It does require additional popular software such as the STAR aligner and samtools, but FusionInspector is written in Python and doesn't require any compilation. If you can use Docker, we have a FusionInspector Docker image and a Singularity image that come with all companion software integrated. See [Installing FusionInspector](installing-FusionInspector) for all installation details.

## Running FusionInspector

FusionInspector requires one or more lists of fusion candidates, each listing one fusion per line in the format geneA--geneB:

    B3GNT1--NPSR1
    ZNF709--DYRK1A
    ZNF844--NCBP2
    RBX1--HAPLN2
    FAM180B--TRIM60
    CASP9--ADCYAP1
    HS3ST3A1--C1QTNF2
    OPTC--AP000347.4
    GRIA2--ZW10


We'll call the file containing this list 'fusions.listA.txt'.  Let's assume we have another such list from another source, and we'll call it 'fusions.listB.txt'.

>It's ok to have a tab-delimited file containing other attributes (such as the raw output from some fusion-prediction tool) as long as the first column fits the above format.
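
If you want to check a candidate list before a run, a small pre-flight script can help. The sketch below is not part of FusionInspector; the function name `read_fusion_names` and the validation pattern are assumptions made for illustration. It takes the first tab-delimited column of each line and checks that it looks like geneA--geneB. Skipping blank lines and lines starting with `#` is also an assumption about how header lines in raw tool output might look.

```python
import re
from typing import Iterable, List

# Assumed pattern: two non-empty, whitespace-free gene names joined by "--".
FUSION_NAME = re.compile(r"^\S+--\S+$")


def read_fusion_names(lines: Iterable[str]) -> List[str]:
    """Collect geneA--geneB names from the first column of each line."""
    names = []
    for line_number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Assumption: blank lines and '#'-prefixed header/comment lines are ignored.
        if not line.strip() or line.startswith("#"):
            continue
        first_column = line.split("\t")[0].strip()
        if not FUSION_NAME.match(first_column):
            raise ValueError(
                f"line {line_number}: expected geneA--geneB, got {first_column!r}"
            )
        names.append(first_column)
    return names


if __name__ == "__main__":
    example = [
        "B3GNT1--NPSR1\n",
        "ZNF709--DYRK1A\t12\t3\n",  # extra columns after the first are ignored
        "OPTC--AP000347.4\n",
    ]
    print(read_fusion_names(example))
```

To check a real file, pass an open file handle, for example `read_fusion_names(open("fusions.listA.txt"))`.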

Given this list of fusions, we'll run FusionInspector like so:

    FusionInspector --fusions fusions.listA.txt,fusions.listB.txt \
                    --genome_lib /path/to/CTAT_genome_lib \
                    --left_fq rnaseq_1.fq --right_fq rnaseq_2.fq \
                    --out_dir my_FusionInspector_outdir \
                    --out_prefix finspector \
                    --vis
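
When launching many samples from a script, it can be convenient to assemble this command programmatically. The following is a minimal sketch, assuming only the command-line options shown in the example above; the helper name `build_fusioninspector_command` is hypothetical and not part of FusionInspector.

```python
import shlex
from typing import List, Sequence


def build_fusioninspector_command(
    fusion_lists: Sequence[str],
    genome_lib: str,
    left_fq: str,
    right_fq: str,
    out_dir: str,
    out_prefix: str,
    vis: bool = True,
) -> List[str]:
    """Assemble the FusionInspector argument list using the options shown above."""
    if not fusion_lists:
        raise ValueError("at least one fusion candidate list is required")
    command = [
        "FusionInspector",
        "--fusions", ",".join(fusion_lists),  # multiple lists are comma-separated
        "--genome_lib", genome_lib,
        "--left_fq", left_fq,
        "--right_fq", right_fq,
        "--out_dir", out_dir,
        "--out_prefix", out_prefix,
    ]
    if vis:
        command.append("--vis")
    return command


command = build_fusioninspector_command(
    ["fusions.listA.txt", "fusions.listB.txt"],
    genome_lib="/path/to/CTAT_genome_lib",
    left_fq="rnaseq_1.fq",
    right_fq="rnaseq_2.fq",
    out_dir="my_FusionInspector_outdir",
    out_prefix="finspector",
)
# Print a shell-quoted version of the command for logging or review.
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` once FusionInspector is on your `PATH`.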

## Outputs of FusionInspector

The [FusionInspector Outputs](FusionInspector-Outputs-Described) page fully describes each of the outputs. If [Trinity de novo assembly is included](De-novo-reconstruction-of-fusion-transcripts), the reconstructed fusion transcript sequences and fusion/genome alignments are provided and integrated into the visualizations.


## Example data and execution

See the 'test/' subdirectory and examine the README.txt file included.  Example data and command execution info are provided.


## User support

Contact us on our Google group: <https://groups.google.com/forum/#!forum/trinity_ctat_users>

## Acknowledgements

FusionInspector is primarily a collaboration between [Brian Haas (Broad Institute)](https://personal.broadinstitute.org/bhaas/) and [Alex Dobin (Cold Spring Harbor Laboratory)](https://scholar.google.com/citations?user=2LAxpBsAAAAJ&hl=en), and is developed as part of the [Trinity Cancer Transcriptome Analysis Toolkit](https://github.com/NCIP/Trinity_CTAT/wiki). The [igv-reports](https://github.com/igvteam/igv-reports) based fusion report derives from a collaboration with [James Robinson](https://github.com/jrobinso).

## Funding

FusionInspector is supported as part of the [Trinity CTAT Project](https://github.com/NCIP/Trinity_CTAT/wiki), funded by the [National Cancer Institute Informatics Technology for Cancer Research](https://itcr.cancer.gov/).