[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat)](http://bioconda.github.io/recipes/terrace/README.html)
[![Anaconda-Server Badge](https://anaconda.org/bioconda/terrace/badges/downloads.svg)](https://anaconda.org/bioconda/terrace)

# Introduction

TERRACE is a circRNA assembler for paired-end RNA-seq data.

# Installation

TERRACE can be installed via
[conda](http://bioconda.github.io/recipes/terrace/README.html).
To install it from source code instead, please follow
[INSTALL](https://github.com/Shao-Group/TERRACE/blob/master/INSTALL.md).

# Usage

The usage of `TERRACE` is:
```
terrace -i <input.bam> -o <output.gtf> -fa <reference-genome.fa> --read_length <length-of-paired-end-reads> -r [reference_annotation.gtf] -fe [feature_file] [options]
```

`input.bam` is the read alignment file generated by an RNA-seq aligner (for example, STAR or HISAT2).
Make sure that it is sorted; otherwise, run `samtools` to sort it:
```
samtools sort input.bam > input.sort.bam
```
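
Whether sorting is needed can often be read from the BAM header. The following is a minimal sketch, assuming that "sorted" here means coordinate-sorted (the default order produced by `samtools sort`) and that the aligner recorded the sort order in the `@HD` header line. The function name `is_coordinate_sorted` is hypothetical and not part of TERRACE or samtools.

```shell
# Hypothetical helper: reads a SAM/BAM header (as text) on stdin and succeeds
# only if the @HD line declares coordinate sort order (SO:coordinate).
# Assumption: the header is accurate; it is a declaration, not a guarantee.
is_coordinate_sorted() {
    grep -q '^@HD.*SO:coordinate'
}

# Example usage, assuming samtools is installed and input.bam exists:
#   samtools view -H input.bam | is_coordinate_sorted \
#       || samtools sort -o input.sort.bam input.bam
```

If the header has no `SO` field, this check reports the file as unsorted, so sorting again is the safe fallback.
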
The reconstructed circular transcripts are written in GTF format to `output.gtf`. Detailed documentation of the [GTF format](https://useast.ensembl.org/info/website/upload/gff.html) is available from Ensembl.

`reference-genome.fa` is the reference genome file in FASTA format. A GENCODE GRCh37 or GRCh38 genome is recommended.

`length-of-paired-end-reads` is the length of the reads used to produce the alignment file.
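
If the read length is not known, it can be estimated from the alignments themselves. The sketch below reports the most frequent sequence length among SAM records read from stdin. The function name `most_common_read_length` is hypothetical. It assumes untrimmed reads of uniform length; hard-clipped or trimmed reads have a shorter `SEQ` field and can skew the result.

```shell
# Hypothetical helper: reads SAM records on stdin and prints the most frequent
# SEQ length (column 10). Header lines (starting with @) and records whose
# SEQ is "*" are skipped. Ties are broken arbitrarily.
most_common_read_length() {
    awk -F'\t' '
        /^@/ || $10 == "*" { next }
        { count[length($10)]++ }
        END {
            best = ""; best_n = 0
            for (len in count)
                if (count[len] > best_n) { best = len; best_n = count[len] }
            if (best != "") print best
        }'
}

# Example usage, assuming samtools is installed and input.bam exists:
#   samtools view input.bam | head -n 100000 | most_common_read_length
```
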

`reference_annotation.gtf` is the annotation file in GTF format. This parameter is optional.

`feature_file` is a CSV file generated by TERRACE that contains the circRNA features used by a machine learning model to assign confidence scores. This parameter is optional. For detailed usage of this file, see the `Scoring` section below.


TERRACE supports the following parameters. Please refer
to the additional explanations below the table.

 Parameters | Default Value | Description
 ------------------------- | ------------- | ----------
 --help  | | print usage of TERRACE and exit
 --version | | print version of TERRACE and exit
 --preview | | show the inferred `library_type` and exit
 --library_type               | empty | chosen from {empty, unstranded, first, second}

Providing `--library_type` is highly recommended. The values `unstranded`, `first`, and `second`
correspond to `fr-unstranded`, `fr-firststrand`, and `fr-secondstrand` used in standard Illumina
sequencing libraries. If none of them is given (the default, `empty`), TERRACE
will try to infer the `library_type` by itself. Note that this inference is based
on the `XS` tag stored in the input `bam` file. If the input `bam` file does not contain the `XS` tag,
it is essential to provide the `library_type` to TERRACE. You can use `--preview` to see
the inferred `library_type`.
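
Because the inference depends on the `XS` tag, it can help to check whether the alignments carry it before relying on the default. The following is a rough sketch that counts SAM records with an `XS:A:+` or `XS:A:-` strand tag. The function name `count_xs_tagged` is hypothetical. Only the character-typed form `XS:A` is matched, on the assumption that this is the strand tag meant here; some aligners use an integer `XS:i` tag for an unrelated purpose.

```shell
# Hypothetical helper: reads SAM records on stdin and prints how many of them
# carry an XS:A:+ or XS:A:- strand tag among the optional fields (column 12 on).
count_xs_tagged() {
    awk -F'\t' '
        /^@/ { next }
        {
            for (i = 12; i <= NF; i++)
                if ($i ~ /^XS:A:[+-]$/) { n++; break }
        }
        END { print n + 0 }'
}

# Example usage, assuming samtools is installed and input.bam exists:
#   samtools view input.bam | head -n 100000 | count_xs_tagged
```

A result of `0` on a reasonably large sample suggests that `--library_type` should be given explicitly.
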

# Running TERRACE on a small example
A small example of input data `example-input.bam` is available in the `example` directory.

Suppose TERRACE has been installed following the steps in the `Installation` section, so that the executable file `terrace` is at `src/terrace`.

The following commands enter the `example` directory and run TERRACE with `example-input.bam` as input:
```
cd ./example
../src/terrace -i example-input.bam -o example-output.gtf --read_length 150
```

An output file named `example-output.gtf` will appear in the `example` directory.
The output file stores the reconstructed circular transcripts assembled by TERRACE in GTF format. 
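
As a quick sanity check on the result, the number of distinct transcripts in a GTF file can be counted from the `transcript_id` attribute, which the GTF format requires in the ninth column. This is a minimal sketch; the function name `count_transcripts` is hypothetical, and it assumes the output follows the standard attribute syntax `transcript_id "..."`.

```shell
# Hypothetical helper: reads a GTF file on stdin and prints the number of
# distinct transcript_id values found in the attribute column (column 9).
# Comment lines (starting with #) are ignored.
count_transcripts() {
    awk -F'\t' '
        /^#/ { next }
        match($9, /transcript_id "[^"]+"/) { seen[substr($9, RSTART, RLENGTH)] = 1 }
        END { n = 0; for (id in seen) n++; print n }'
}

# Example usage from within the example directory:
#   count_transcripts < example-output.gtf
```
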

# Scoring

By default, the `score` field of the `output.gtf` generated by TERRACE contains abundance values. We provide a pre-trained Random Forest model that generates more reliable scores (between 0 and 1) and integrates them into the `score` field of the GTF file. After the scores are integrated, a user-defined threshold can be provided to generate a supplementary `precise.gtf` file that contains only the circRNAs with scores above the given threshold. Please refer to [RF-scoring/README](https://github.com/Shao-Group/TERRACE/blob/master/RF-scoring/README.md) for details of score generation, integration, and the `precise.gtf` file.

To use the scoring functionality, TERRACE needs to be run so that it generates a feature file, as follows.

```
cd ./example
../src/terrace -i example-input.bam -o example-output.gtf --read_length 150 -fe feature_file
```

An output file named `example-output.gtf` and a feature file named `feature_file` will appear in the `example` directory.
The output file stores the reconstructed circular transcripts assembled by TERRACE in GTF format. The feature file stores the features of the output circRNAs needed for score generation.
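
The thresholding idea described in this section can be illustrated with a small filter on the GTF `score` field (column 6). This is only a sketch of the concept, not the script shipped in `RF-scoring`; the function name `filter_gtf_by_score` and the file name `scored.gtf` are hypothetical, and it assumes the Random Forest scores have already been integrated into the `score` field.

```shell
# Hypothetical helper: reads a GTF file on stdin and prints only the records
# whose score (column 6) is strictly above the threshold given as $1.
# Comment lines are passed through; records with a missing score (".") are dropped.
filter_gtf_by_score() {
    awk -F'\t' -v min="$1" '
        /^#/ { print; next }
        $6 != "." && $6 + 0 > min + 0 { print }'
}

# Example usage with an illustrative threshold of 0.5:
#   filter_gtf_by_score 0.5 < scored.gtf > filtered.gtf
```

For the supported workflow and the actual `precise.gtf` output, follow the RF-scoring README linked above.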
