These are the QC steps applied to every sample.

```sh
# 1. remove rRNA reads (hisat2 wrapper)
python rRNA_QC.py --fq1 xx_L2_1.clean.fq.gz --fq2 xx_L2_2.clean.fq.gz --name xx --r1 1 --r2 15 --readsNum 10000 --pNum 3 --tool hisat2 --outDir Qc/rRNA/

# 2. cutadapt wrapper
python adcut.py --fq1 Qc/rRNA/xx/xx_1.fq.gz --fq2 Qc/rRNA/xx/xx_2.fq.gz --name xx --ad7 AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC --ad5 AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT --overlap 6 --err 0.1 --outDir Qc/adapter/xx

# 3. filter low-quality reads from the output of step 2
python tmkQC.py --fq1 Qc/adapter/xx/xx_adap_rm_1.fq.gz --fq2 Qc/adapter/xx/xx_adap_rm_2.fq.gz --name xx --lowQ 20 --lowP 0.1 --NP 0.15 --ncpu 2 --outDir Qc/tmkQC/xx

# 4. apply fastqc
fastqc -t 2 -o Qc/FastQC/xx Qc/tmkQC/xx/xx_1.clean.fq.gz Qc/tmkQC/xx/xx_2.clean.fq.gz
```

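The main step in `rRNA_QC.py` is a call to `hisat2`, which aligns the reads against an rRNA index to estimate the rRNA ratio and writes the read pairs that do not align concordantly (the non-rRNA reads) through `--un-conc-gz`.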

```
hisat2 -p <pNum> -x <rRNA NR database index> -1 <fq1> -2 <fq2> --un-conc-gz <no rRNA gz>
```
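As a rough illustration of how an rRNA ratio could be read from such a run, the sketch below parses the "overall alignment rate" line that `hisat2` prints in its alignment summary on stderr. Treating that rate as the rRNA ratio is an assumption, not necessarily what `rRNA_QC.py` does, and `parse_alignment_rate` is a hypothetical helper name.

```python
import re


def parse_alignment_rate(summary_text):
    """Return the overall alignment rate as a fraction in [0, 1].

    Hypothetical helper: assumes `summary_text` holds the alignment
    summary that hisat2 writes to stderr. Returns None if the
    "overall alignment rate" line is not present.
    """
    match = re.search(
        r"([0-9]+(?:\.[0-9]+)?)% overall alignment rate", summary_text
    )
    if match is None:
        return None
    # Convert the percentage to a fraction.
    return float(match.group(1)) / 100.0
```

The summary text could be captured by redirecting stderr of the `hisat2` call to a log file and passing the contents of that file to the function.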

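Next, `adcut.py` runs cutadapt with the parameters below. Reads in which an adapter is found are discarded rather than trimmed (`--discard-trimmed`), with a maximum error rate of 0.1 (`-e`) and a minimum adapter overlap of 6 bases (`-O`):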

```
--discard-trimmed -e 0.1 -O 6
```
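For orientation, here is a minimal sketch of how a wrapper might assemble the full paired-end cutadapt command from these parameters. The function name `build_cutadapt_cmd` is hypothetical, and mapping `--ad7` to cutadapt's `-a` (read 1 adapter) and `--ad5` to `-A` (read 2 adapter) is an assumption about `adcut.py`, not something confirmed by its source.

```python
def build_cutadapt_cmd(fq1, fq2, out1, out2, ad7, ad5, overlap=6, err=0.1):
    """Build a paired-end cutadapt command as an argument list.

    Assumption: ad7 is the 3' adapter searched in read 1 (-a) and
    ad5 is the 3' adapter searched in read 2 (-A).
    """
    return [
        "cutadapt",
        "-a", ad7,               # adapter for read 1 (assumed mapping)
        "-A", ad5,               # adapter for read 2 (assumed mapping)
        "-e", str(err),          # maximum allowed error rate
        "-O", str(overlap),      # minimum adapter overlap
        "--discard-trimmed",     # drop reads in which an adapter was found
        "-o", out1,              # output for read 1
        "-p", out2,              # output for read 2
        fq1, fq2,
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Building a list rather than a single string avoids shell quoting problems with file paths.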

Third, `tmkQC.py` calls the `tmkQC` command to remove low-quality reads. Its command-line usage is:

```
tmkQC

Usage: 
	 -f    the path of fq files or fq gzip files, seperated by comma.
	 -a    the path of adaptor files or adaptor gzip files, seperated by comma.
	 -p     threads number.
	 -o     output directory.
	 -s     sample name.
	 -N     N rate. default: 0.1
	 -l     low qual. default: 5
	 -r     low qual bases rate ( low qual bases number/read length ). the reads will be discarded if the rate is greater that this value. default: 0.5
	 -L     the minimal length of read. The reads(perhaps have been truncated) will be discarded if its length is less than this value.
	 -g     the output sequence data is compressed by gzip.
```

The parameters we used are:

```
-p 2 -N 0.15 -l 20 -r 0.1
```

These settings discard any read that meets either of the following conditions:

1. The N rate (N bases / read length) is greater than 0.15.
2. The rate of low-quality bases (bases below Q20 / read length) is greater than 0.1.

No read-length filter (`-L`) is applied.
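The two filter conditions can be sketched in Python as follows. This is an illustration of the rule as described here, not the `tmkQC` implementation: the function name `should_discard` is hypothetical, Phred+33 quality encoding is assumed, and the sketch checks a single read without modelling how `tmkQC` treats the two mates of a pair.

```python
def should_discard(seq, qual, n_rate=0.15, low_q=20, low_rate=0.1,
                   phred_offset=33):
    """Return True if a read fails the N-rate or low-quality-rate filter.

    seq  -- base string of the read
    qual -- quality string of the same length (Phred+33 assumed)
    """
    length = len(seq)
    if length == 0:
        # Assumption: an empty read is treated as unusable.
        return True
    # Fraction of bases called as N.
    n_fraction = seq.upper().count("N") / length
    # Fraction of bases with quality strictly below the threshold.
    low_count = sum(1 for c in qual if ord(c) - phred_offset < low_q)
    low_fraction = low_count / length
    return n_fraction > n_rate or low_fraction > low_rate
```

For example, a 20-base read with four `N` bases has an N rate of 0.2 and is discarded, while the same read with three `N` bases (rate 0.15, not greater than the threshold) passes this check.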