#!/mnt/home/ettwiller/yan/exe/miniconda2/envs/my-python2-env/bin/python

'''
# Logs:

Created on Jan 15, 2020 By Bo Yan

Modifications on Feb 12, 2020:
    Add DensityPlot function to plot correlation between two tsscount groups
Modifications on April 20, 2020:
    Add absolute path file for output, input
Modifications on Aug 19, 2021:
    Add option in DensityPlot, use data.loc[(data[name[0]]>=cutoff) & (data[name[1]]>=cutoff)] or data.loc[(data[name[0]]>=cutoff) | (data[name[1]]>=cutoff)]
'''

'''
Based on python 2.7

Handles comparisons between different Cappable-seq tsscount groups, and between Enrich and Control.
Temp files are named after the current time (HHMMSS), so the script can be run by several processes at once as long as they do not start within the same second in the same directory.

Functions and Usage:

Here TPM_1_lib means the TPM value equivalent to 1 read in the corresponding library
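
A minimal sketch of how TPM_1_lib can be derived from one attribute field, in the spirit of FindTPM_1 below.
The helper name tpm_per_read is hypothetical and the regular expressions are simplified assumptions.

```python
import re

def tpm_per_read(attribute):
    # attribute is the last GTF column, e.g. "nio=2;TPM=9.76438538076;"
    nio = int(re.search("nio=([0-9]+);", attribute).group(1))
    tpm = float(re.search("TPM=([0-9.]+);", attribute).group(1))
    # TPM scales linearly with the read count, so TPM/nio is the TPM of one read
    return tpm / nio

tpm_1 = tpm_per_read("nio=2;TPM=9.76438538076;")
```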

(1) Class Pair object:
A list containing Cappable-seq tsscount.gtf files generated by CountTssGTF.py.
If comparing Enrich and Control, the order should be [enrich, control]

(2) Class function object.compareEnvsCon(output_file)
Compare Enrich vs Control
Only output the positions that exist in enrich, and calculate the EnrichRatio;
for positions with 0 reads in control, control_TPM=0 and control_nio=0

EnrichRatio=(TPM_Enrich)/(TPM_Control),
and for the positions having 0 read in control (nio_control=0;TPM_control=0), EnrichRatio=(TPM_Enrich)/(TPM_1_control);
here I do not round the EnrichRatio
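
A small worked example of the EnrichRatio rule above, using the values from the gtf example in (3).
The function name enrich_ratio is hypothetical; this is a sketch, not the code path used by compareEnvsCon.

```python
def enrich_ratio(tpm_enrich, tpm_control, tpm_1_control):
    # Fall back to the TPM of a single control read when the position
    # has no reads in control, which avoids dividing by zero.
    denominator = tpm_control if tpm_control > 0 else tpm_1_control
    return float(tpm_enrich) / denominator

# control TPM divided by its 15 reads gives the TPM of one control read
tpm_1_control = 0.236966824645 / 15
covered = enrich_ratio(1.42180094787, 0.236966824645, tpm_1_control)
absent = enrich_ratio(1.4691943128, 0.0, tpm_1_control)
```

Here covered is about 6 and absent is about 93, which matches the Ratio values of the two example gtf lines.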

(3) with filter: 
    object.outputEnvsCon(output_file, TPM_cutoff=1, EnrichRatio_cutoff=1)

Compare enrich and control tsscount files, filter TSS positions based on EnrichRatio and TPM cutoff
$python /mnt/home/ettwiller/yan/Bo_script/CompareTSS.py filter --input enrich.tsscount.gtf control.tsscount.gtf --output .tsscount.gtf --tpmcutoff --ratiocutoff
    Ordering of input files: enrich.tsscount.gtf control.tsscount.gtf

output_file, e.g.:
<chr,TSS,strand><enrich_TPM><control_TPM><enrich_nio><control_nio><EnrichRatio>

Same as FilterTssGTF.py, except that EnrichRatio is not rounded here.
Need to run object.compareEnvsCon() first.
The output file attribute:
    nio=enrich;TPM=enrich;Ratio=EnrichRatio;nio_control=control;TPM_control=control;
    Ratio=(TPM_Enrich)/(TPM_control)
    for the entry with nio=0 in control, report Ratio=(TPM_Enrich)/(TPM_1_Control), nio_control=0, TPM_control=0.

output file is a gtf, e.g.:
chr1    .       tssgtf  630887  630887  .       -       1-coordination  nio=90;TPM=1.42180094787;Ratio=6.0;nio_control=15;TPM_control=0.236966824645;
chr1    .       tssgtf  966978  966978  .       -       1-coordination  nio=93;TPM=1.4691943128;Ratio=93.0000000005;nio_control=0;TPM_control=0;

--tpmcutoff : default 1
    Only positions with tpm>=cutoff in Enrich will be saved in output.
--ratiocutoff : default 1
    Only positions with EnrichRatio>=cutoff in Enrich will be saved in output.
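
A rough sketch of the filter step: a position is kept only if it passes both cutoffs, and its values are then written as the attribute column.
The helper names passes_cutoffs and attribute_column are hypothetical.

```python
def passes_cutoffs(tpm, ratio, tpm_cutoff=1.0, ratio_cutoff=1.0):
    # Both conditions must hold for a position to be written to the output
    return tpm >= tpm_cutoff and ratio >= ratio_cutoff

def attribute_column(nio, tpm, ratio, nio_control, tpm_control):
    # The trailing empty string yields the final ";" of the attribute column
    fields = ["nio=" + nio, "TPM=" + tpm, "Ratio=" + ratio,
              "nio_control=" + nio_control, "TPM_control=" + tpm_control, ""]
    return ";".join(fields)

kept = passes_cutoffs(1.42180094787, 6.0)
dropped = passes_cutoffs(0.5, 6.0)
attribute = attribute_column("90", "1.42180094787", "6.0", "15", "0.236966824645")
```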

(4) with plotEnrich: 
    object.calculatePositions(TPM_cutoff, EnrichRatio_cutoff)

Compare enrich and control tsscount files, plot x-axis TPM vs y-axis EnrichRatio for TSS positions (above TPM cutoff) in enrich.
$python CompareTSS.py plotEnrich --input enrich.tsscount.gtf control.tsscount.gtf --output .png --tpmcutoff --ratiocutoff
    Ordering of input files: enrich.tsscount.gtf control.tsscount.gtf

Compare Enrich vs Control:
Plot the positions with TPM at or above the cutoff in Enrich as a png.
Report the number of positions and the total TPM at or above the TPM and EnrichRatio cutoffs for enrich.

Need to run object.compareEnvsCon() first.

    object.PlotEnrichRatio(self, png, TPM_cutoff)
Compare Enrich vs Control, Plot x-axis log10(TPM_Enrich) vs y-axis log10(EnrichRatio).
Only plot the positions with TPM_Enrich>=cutoff.
Need to run object.compareEnvsCon() first.

output is a png.
Calculate the number of positions and the total TPM with EnrichRatio>=cutoff and TPM>=cutoff.

--tpmcutoff : Default 1
    Only positions with tpm>=cutoff in Enrich will be plotted.
--ratiocutoff : Default 1
    Ratio = (TPM_Enrich)/(TPM_Control)
        
(5) with extract: 
    object.compareGroups(self, output, nio=None)

Extract the nio/tpm of TSS positions in multiple input files
$python CompareTSS.py extract --input input1.tsscount.gtf input2 input3 --output --verbose optional

Extract TPM of TSS for each input files listed in the object list.
Use tpm=0 for positions that do not exist in a given input file

--verbose: output nio instead of TPM
        
output e.g.
<chr;TSS 1-coordinate;strand><TPM in input1><TPM in input2><TPM in input3>.
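
A minimal sketch of the extract step with illustrative (made-up) TPM values: every position seen in any file gets one column per file, defaulting to "0".
The function name merge_groups and the dict-per-file input are assumptions for this example.

```python
def merge_groups(groups):
    # groups: one dict per input file, mapping (chr, TSS, strand) -> TPM string
    merged = {}
    for index, group in enumerate(groups):
        for key, tpm in group.items():
            # A fresh list is built for every new key, so rows never
            # share the same list object.
            row = merged.setdefault(key, ["0"] * len(groups))
            row[index] = tpm
    return merged

rep1 = {("chr1", "630887", "-"): "1.42"}
rep2 = {("chr1", "630887", "-"): "1.30", ("chr1", "966978", "-"): "1.47"}
table = merge_groups([rep1, rep2])
```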

(6) with correlation: 
    object.compareGroups(self, output, nio=None)
    object.DensityPlot(self, input, cutoff, png)

Compute and plot the Pearson correlation between TSS in input1 and input2, e.g. correlation and scatter plot between replicates
$python CompareTSS.py correlation --input input1.tsscount.gtf input2.tsscount.gtf --output png --tpmcutoff --Flag AND

Calculate correlation for all positions without applying cutoff;
plot density plot for the positions with TPM_input1>=cutoff and/or TPM_input2>=cutoff, depending on Flag.

Because the plot uses a log scale, for positions that only exist in one file, log10(TPM_1_lib) is used for the other file, which has no TSS information.
These positions lie along a vertical (x<0.1, y axis starting at 0) or horizontal line (y<0.1, x axis starting at 0) in the correlation plot (if --tpmcutoff 1)
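
A short sketch of the log10 substitution described above. The function name log10_tpm and the value 0.05 for TPM_1_lib are assumptions for illustration.

```python
import math

def log10_tpm(tpm, tpm_1_lib):
    # A TPM of 0 has no logarithm, so a missing position is plotted as if
    # it had exactly one read in that library.
    return math.log10(tpm_1_lib if tpm == 0 else tpm)

present = log10_tpm(100.0, 0.05)
missing = log10_tpm(0, 0.05)
```

With TPM_1_lib below 0.1, missing falls below -1, i.e. left of (or below) the 0.1 tick, which is why these positions form a line near the axis.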

--Flag: AND (default) or OR
Flag does not affect the calculation for pearson correlation which uses all the TSS, but only affects the output Density plot.
    Flag=AND:
    df = data.loc[(data[name[0]]>=cutoff) & (data[name[1]]>=cutoff)]
    only plot the positions with TPM_input1>=cutoff and TPM_input2>=cutoff
    use this for a better view for correlation between Replicates

    Flag=OR:
    df = data.loc[(data[name[0]]>=cutoff) | (data[name[1]]>=cutoff)]
    plot the positions with TPM_input1>=cutoff or TPM_input2>=cutoff
    use this for comparison between Two conditions
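
The difference between the two flags can be sketched without pandas on a few illustrative (made-up) TPM pairs.
The function name select_positions is hypothetical; it mirrors the two data.loc masks above.

```python
def select_positions(rows, cutoff, flag="AND"):
    # rows: (TPM_input1, TPM_input2) pairs
    if flag == "AND":
        return [r for r in rows if r[0] >= cutoff and r[1] >= cutoff]
    if flag == "OR":
        return [r for r in rows if r[0] >= cutoff or r[1] >= cutoff]
    raise ValueError("flag must be AND or OR")

rows = [(5.0, 4.0), (3.0, 0.0), (0.2, 0.4)]
both = select_positions(rows, 1, "AND")
either = select_positions(rows, 1, "OR")
```

Only OR keeps the pair (3.0, 0.0), a position that is present in one input and absent in the other.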

--output: a png
--tpmcutoff: default 1
    only plot the positions with TPM_input1>=cutoff and/or TPM_input2>=cutoff
'''

try:
    import os, sys
    from subprocess import check_call
    import time
    import re
    import argparse
    import math
    from scipy.stats import gaussian_kde
    import numpy as np
    import random
    import matplotlib.pyplot as plt
    import pandas as pd
    plt.switch_backend('agg') # need this for running on server
except ImportError as e:
    # report which module is missing instead of hiding the cause
    print "module error: {}".format(e)
    quit()

def CreateTemp():
    '''
    create a temp file based on the time
    '''
    localtime = time.asctime(time.localtime())
    Prefix = ''.join(localtime.split()[-2].split(':')) # '151542'
    return 'comparison.temp' + Prefix # comparison.temp151542

def FindTPM_1(input_gtf):
    '''
    Use to find the value of TPM_1 in the input_gtf
    input e.g.:
    NC_000913.3     M9.control.tss  TSS     148     148     9.76438538076   +       1-coordination  nio=2;TPM=9.76438538076;
    
    return 9.76438538076/2, which is a float
    '''
    with open(input_gtf) as f:
        line = f.readline().split('\t')
        nio = int(re.findall('nio=(\d.*?);', line[-1])[0]) # '2'
        TPM = float(re.findall('TPM=(\d.*?);', line[-1])[0]) # '9.76438538076'
    return TPM/nio # float


class Pair:
   
    def __init__(self, filelist):
        self.inputlist = filelist # A list containing the files for comparison

        self.comparison = None # A file containing the TPM and EnrichRatio for Enrich and Control
        self.TPM_1_control = None # control TPM_1
        self.TPM_1_input = None # A list containing the TPM_1 for all the input files
    


    # Functions for Comparing Enrich and Control
    def compareEnvsCon(self, output_file):
        '''
        Compare Enrich vs Control, enrich should be the first file in self.inputlist.

        Only output the positions with TPM>0 in enrich, and calculate the EnrichRatio,
        for positions having 0 read in control, control_TPM=0

        EnrichRatio=(TPM_Enrich)/(TPM_Control),
        and for the positions having 0 read in control (nio_control=0;TPM_control=0), EnrichRatio=(TPM_Enrich)/(TPM_1_control);
        here I do not round the EnrichRatio, since a low fold e.g. 0.003 is rounded to 0.0, which breaks the log10(Ratio) transformation.
        In FilterTssGTF.py output, the EnrichRatio is rounded to 2 digits.

        output self.comparison:
        <chr,TSS,strand><enrich_TPM><control_TPM><enrich_nio><control_nio><EnrichRatio>
        '''

        if len(self.inputlist)!=2:
            print "Error: compareEnvsCon takes exactly two input files."
            quit()

        self.TPM_1_control = FindTPM_1(self.inputlist[1])

        # create dicts saving control tpm and nio, keyed by (chr, TSS, strand)
        dic_control_tpm = {}
        dic_control_nio = {}
        with open(self.inputlist[1]) as f1:
            for line in f1:
                fields = line.strip().split('\t')
                key = (fields[0], fields[3], fields[6])
                dic_control_tpm[key] = re.findall('TPM=(\d.*?);', line.strip())[0]
                dic_control_nio[key] = re.findall('nio=(\d.*?);', line.strip())[0]

        output = open(output_file, 'w')
        count_enrich = 0 # total number of reads in enrich
        with open(self.inputlist[0]) as f:
            for line in f:
                key = (line.strip().split('\t')[0], line.strip().split('\t')[3], line.strip().split('\t')[6])
                temp = [';'.join(key)]
                TPM_Enrich = re.findall('TPM=(\d.*?);', line.strip())[0]
                nio_Enrich = re.findall('nio=(\d.*?);', line.strip())[0]
                count_enrich += int(nio_Enrich)
                temp.extend([TPM_Enrich, dic_control_tpm.get(key, '0'), nio_Enrich, dic_control_nio.get(key, '0')])
                EnrichRatio = float(TPM_Enrich)/float(dic_control_tpm.get(key, self.TPM_1_control))
                temp.append(str(EnrichRatio))
                print>>output, '\t'.join(temp)
        output.close()
        print "The total number of reads in Enrich is: {}".format(count_enrich)

        self.comparison = output_file

        return


    def calculatePositions(self, TPM_cutoff, EnrichRatio_cutoff):
        '''
        calculate the number of positions and the total TPM at or above the TPM and EnrichRatio cutoffs for comparison between enrich and control.
        '''
        if self.comparison:
            data = pd.read_csv(self.comparison, header=None, sep='\t', names = ['Enrich', 'Control', 'EnrichNio', 'ControlNio', 'EnrichRatio'])
            tpm = data.loc[data['Enrich']>=TPM_cutoff, :]
            ratio = data.loc[(data['Enrich']>=TPM_cutoff) & (data['EnrichRatio']>=EnrichRatio_cutoff), :]
            print "Total number of TSS in Enrich: {}.".format(data.shape[0])
            print "Total number of TSS in Enrich with TPM>={}: {}".format(TPM_cutoff, tpm.shape[0])
            print "Total number of TSS in Enrich with TPM>={} and EnrichRatio>={}: {}".format(TPM_cutoff, EnrichRatio_cutoff, ratio.shape[0])
            print "Total TPM of TSS in Enrich: {}.".format(sum(data['Enrich']))
            print "Total TPM of TSS in Enrich with TPM>={}: {}".format(TPM_cutoff, sum(tpm['Enrich']))
            print "Total TPM of TSS in Enrich with TPM>={} and EnrichRatio>={}: {}".format(TPM_cutoff, EnrichRatio_cutoff, sum(ratio['Enrich']))
        else:
            print "Need to run object.compareEnvsCon first."
        return 


    def outputEnvsCon(self, output_file, TPM_cutoff, EnrichRatio_cutoff):
        '''
        Output positions in Enrich with TPM and EnrichRatio >=cutoff.
        Same as FilterTssGTF.py, except that EnrichRatio is not rounded here.

        Here EnrichRatio: Original division, No round
        self.comparison
        <chr,TSS,strand><enrich_TPM><control_TPM><enrich_nio><control_nio><EnrichRatio>
        '''
        if self.comparison:
            print "Compare enrich and control."
            print "Save positions with TPM>={} and EnrichRatio>={} in output.".format(TPM_cutoff, EnrichRatio_cutoff)

            output = open(output_file, 'w')
            with open(self.comparison) as f:
                for line in f:
                    line = line.strip().split('\t')
                    if float(line[5]) >= EnrichRatio_cutoff and float(line[1]) >= TPM_cutoff:
                        temp = [line[0].split(';')[0], '.', 'tssgtf', line[0].split(';')[1], line[0].split(';')[1], '.', line[0].split(';')[2], '1-coordination']
                        attri = ['nio='+line[3],'TPM='+line[1],'Ratio='+line[5], 'nio_control='+line[4], 'TPM_control='+line[2],''] # I add '' here to have the ';' at the end of '\t'.join(temp)
                        temp.append(';'.join(attri))
                        print>>output, '\t'.join(temp)
            output.close()
        else:
            print "Need to run object.compareEnvsCon first."
        return 


    def jitter(self, values, var): # required by PlotEnrichRatio
        '''
        Add random jitter to every number in values.
        Each jittered value is randomly drawn from a normal distribution (mean=number, standard deviation=var)
        
        var controls the jitter amount
        '''
        ls_jitter = []
        for item in values:
            ls_jitter.append(random.choice(np.random.normal(item, var, size=50)))
        return ls_jitter

    def PlotEnrichRatio(self, png, TPM_cutoff):
        '''
        Plot y-axis EnrichRatio vs x-axis EnrichTPM
        input e.g.
        <chr,TSS,strand><enrich_TPM><control_TPM><enrich_nio><control_nio><EnrichRatio>
        
        TPM_cutoff: only plot the positions with TPM_Enrich>=cutoff
        '''
        if self.comparison:
            print "Compare enrich and control."
            print "Plot y-axis EnrichRatio vs x-axis EnrichTPM."
            data = pd.read_csv(self.comparison, header=None, sep='\t', names = ['Enrich', 'Control', 'EnrichNio', 'ControlNio', 'EnrichRatio'])
            df = data.loc[data['Enrich']>=TPM_cutoff, :]
            
            # log10 transformation; Enrich TPM and EnrichRatio are always > 0 in self.comparison
            x = self.jitter([np.log10(item) for item in df['Enrich']], 0.06) 
            y = self.jitter([np.log10(item) for item in df['EnrichRatio']], 0.06)

            ## plot
            plt.rcParams['font.size'] = 8 # change the font size for all the text

            fig, ax = plt.subplots(dpi=300) # by default figsize=(7.6, 6.4)
            pcm = ax.scatter(x, y, s=1.5, edgecolor='none', alpha=0.6) # 'none' draws no marker edge
            
            plt.ylabel('log10(EnrichRatio)')
            plt.xlabel('log10(Enrich TPM)')
            ax.set_aspect(1 / ax.get_data_ratio())
            plt.savefig(png, transparent=True)
        else:
            print "Need to run object.compareEnvsCon first."
        return

    # Functions for Extracting TSS tpm or nio from multiple groups (groups>=2)
    def compareGroups(self, output, nio=None):
        '''
        extract TPM of TSS for each input file listed in self.inputlist (a list).
        Use tpm=0 for positions that do not exist in a given input file
        
        output e.g.
        <chr;TSS 1-coordinate;strand><TPM in input1><TPM in input2><TPM in input3>.

        if nio: 
            output nio instead of TPM
        '''
        
        number_file = len(self.inputlist) # number of input files
        dic_tpm = {}
        dic_nio = {}
        i = 0
        ls_TPM_1 = [] 

        for input in self.inputlist:
            ls_TPM_1.append(FindTPM_1(input))

            with open(input) as f:
                for line in f:
                    key = (line.strip().split('\t')[0], line.strip().split('\t')[3], line.strip().split('\t')[6]) # key: (chr, TSS, strand)
                    #If choose to use TPM=TPM_1 and nio=1 for positions that do not exist in one of the repeats:
                    #dic_tpm[key] = [item for item in tpm1] # Add TPM_1 to all the positions
                    #dic_nio[key] = ['1']*number_file # Add 1 read to all the positions
                    #note: dic_tpm[key]=tpm1 returns wrong result: using list1=list2 is very confusing, since the change in any list is shown in both lists

                    # use TPM=0 and nio=0 for positions that do not exist in one of the repeats
                    if key not in dic_tpm:
                        dic_tpm[key] = ['0']*number_file
                    dic_tpm[key][i] = re.findall('TPM=(\d.*?);', line.strip().split('\t')[-1])[0]

                    if key not in dic_nio:
                        dic_nio[key] = ['0']*number_file
                    dic_nio[key][i] = re.findall('nio=(\d.*?);', line.strip().split('\t')[-1])[0]

            i +=1

        with open(output,'w') as f:
            if nio:
                for item in dic_nio:
                    ls = [';'.join([item[0], item[1], item[2]])]
                    ls.extend(dic_nio[item])
                    print>>f, '\t'.join(ls)
            else:
                for item in dic_tpm:
                    ls = [';'.join([item[0], item[1], item[2]])]
                    ls.extend(dic_tpm[item])
                    print>>f, '\t'.join(ls)
        
        self.TPM_1_input = ls_TPM_1[:]
        return ls_TPM_1 # list saving TPM_1 for all the input files
    
    def DensityPlot(self, input, cutoff, png, Flag):
        '''
        Calculate the pearson correlation between input files

        Generate a DensityDot Plot of TPM_input1 vs TPM_input2
        
        input is the output generated by compareGroups, e.g.
        <chr;TSS 1-coordinate;strand><TPM in input1><TPM in input2>

        Flag does not affect the calculation for pearson correlation which uses all the TSS, but only affects the output Density plot.

        Flag=AND:
        df = data.loc[(data[name[0]]>=cutoff) & (data[name[1]]>=cutoff)]
        only plot the positions with TPM_input1>=cutoff and TPM_input2>=cutoff, use this for a better view for correlation between Replicates

        Flag=OR:
        df = data.loc[(data[name[0]]>=cutoff) | (data[name[1]]>=cutoff)]
        only plot the positions with TPM_input1>=cutoff or TPM_input2>=cutoff, use this for comparison between Two conditions
 
        png: name of output png file

        '''
        name = ['rep1', 'rep2']
        data = pd.read_csv(input, header=None, sep='\t', names = [name[0], name[1]])
        
        # correlation based on all the positions without TPM cutoff
        print "Correlation between two:"
        print np.corrcoef(data[name[0]], data[name[1]])

        if Flag == 'AND':
            # plot TSS existing in both above cutoff using &
            df = data.loc[(data[name[0]]>=cutoff) & (data[name[1]]>=cutoff)]
            print "The output plot shows the TSS positions at or above the cutoff in both groups."
        elif Flag == 'OR':
            # plot TSS existing in at least one above cutoff 
            df = data.loc[(data[name[0]]>=cutoff) | (data[name[1]]>=cutoff)]
            print "The output plot shows the TSS positions at or above the cutoff in either group."
        else:
            print "Choose --Flag AND or --Flag OR"
            quit() 
        
        # log10 transformation, use TPM_1 for positions with 0
        x = np.asarray([math.log10(self.TPM_1_input[0]) if item ==0 else math.log10(item) for item in df[name[0]]])
        y = np.asarray([math.log10(self.TPM_1_input[1]) if item ==0 else math.log10(item) for item in df[name[1]]])

        xscale = int(math.floor(x.max())) # maximum scale of x
        yscale = int(math.floor(y.max()))

        # Sort the points by density, so that the densest points are plotted last
        xy = np.vstack([x,y])
        z = gaussian_kde(xy)(xy)
        idx = z.argsort()
        x, y, z = x[idx], y[idx], z[idx]

        ## plot
        plt.rcParams['font.size'] = 8 # change the font size for all the text

        fig, ax = plt.subplots(dpi=300) # by default figsize=(7.6, 6.4)
        pcm = ax.scatter(x, y, c=z, s=2, cmap='coolwarm', edgecolor='none', alpha=0.8) # blue red color range, no marker edge

        # axis label
        xaxis, yaxis = [-1], [-1] # corresponding to 10^-1 = 0.1 for TPM x-axis label.
        xaxis.extend(range(0, xscale+1)) # axis: [-1, 0, 1, 2 ..], axis label: [10^-1, 10^0, 10^1, 10^2 ..]
        yaxis.extend(range(0, yscale+1))
        xaxis_ticks, yaxis_ticks  = ['0.1'], ['0.1'] # axis_ticks: label of ticks, use 0.1 for -1
        xaxis_ticks.extend([str(int(math.pow(10, item))) for item in range(0, xscale+1)])
        yaxis_ticks.extend([str(int(math.pow(10, item))) for item in range(0, yscale+1)])
        
        plt.xticks(xaxis, xaxis_ticks)
        plt.yticks(yaxis, yaxis_ticks)
        
        plt.xlabel('TPM of {}'.format(name[0]), fontsize=12)
        plt.ylabel('TPM of {}'.format(name[1]), fontsize=12)

        # Density color bar
        plt.colorbar(pcm, aspect=10, shrink = 0.3)
        # plt.ax.xaxis.set_ticks_position('top'): change colorbar position
        # aspect 20: ratio of long to short dimensions; shrink 1.0: fraction by which to multiply the size of the colorbar
        # plt.colorbar().set_label(label='Density', rotation=0, position=(0, 1.2)): add label for colorbar, but difficult to change the position

        # Make an axes square in screen units even with different limits. Should be called after plotting.
        # print ax.get_aspect() # auto
        # print ax.get_data_ratio() # 0.977325008998231
        ax.set_aspect(1 / ax.get_data_ratio()) # ax.get_data_ratio: Return the aspect ratio of the raw data.
        # print ax.get_aspect() # 1.02320107517
        plt.savefig(png, transparent=True) 


##-------Parser
if __name__ == '__main__':

    parser = argparse.ArgumentParser()
    subparsers = parser.add_subparsers(help='sub-command help', dest = 'mode')
    
    parser_a = subparsers.add_parser('filter', help='compare enrich and control, filter based on TPM and EnrichRatio cutoff')
    parser_a.add_argument('--input', nargs = '+', help = 'tsscount files', dest='input_file')
    parser_a.add_argument('--output', help='output tsscount.gtf containing enrich and control', dest='output_file')
    parser_a.add_argument('--tpmcutoff', help='tpm cutoff for enrich', dest='TPM_cutoff', type=float, default=1)
    parser_a.add_argument('--ratiocutoff', help='EnrichRatio cutoff for enrich', dest='EnrichRatio_cutoff', type=float, default=1)
    # outputEnvsCon(self, output_file, TPM_cutoff=1, EnrichRatio_cutoff=1)

    parser_b = subparsers.add_parser('plotEnrich', help='compare enrich and control, plot the EnrichRatio and calculate the TSS and Reads')
    parser_b.add_argument('--input', nargs = '+', help = 'tsscount files', dest='input_file')
    parser_b.add_argument('--output', help='png file for EnrichRatio', dest='output_file')
    parser_b.add_argument('--tpmcutoff', help='tpm cutoff for plotting enrich and calculation', dest='TPM_cutoff', type=float, default=1)
    parser_b.add_argument('--ratiocutoff', help='EnrichRatio cutoff for calculation', dest='EnrichRatio_cutoff', type=float, default=1)
    # Comparelist.calculatePositions(self, TPM_cutoff, EnrichRatio_cutoff)
    # Comparelist.PlotEnrichRatio(self, png, TPM_cutoff)

    parser_c = subparsers.add_parser('extract', help='extract TSS from different groups')
    parser_c.add_argument('--input', nargs = '+', help = 'tsscount files', dest='input_file')
    parser_c.add_argument('--output', help='files saving TPM or nio for different groups', dest='output_file')
    parser_c.add_argument('--verbose', help='add to output nio instead of tpm', action='store_true')
    # compareGroups(self, output, nio=None)

    parser_d = subparsers.add_parser('correlation', help='extract TSS from repeats and plot correlation')
    parser_d.add_argument('--input', nargs = '+', help = 'tsscount files', dest='input_file')
    parser_d.add_argument('--output', help='png file of correlation', dest='output_file')
    parser_d.add_argument('--tpmcutoff', help='tpm cutoff for plotting two repeats', dest='TPM_cutoff', type=float, default=1)
    parser_d.add_argument('--Flag', help='Use AND to plot TSS existing in both and OR to plot TSS existing in either', dest='Flag', type=str, default='AND')

    args = parser.parse_args()

    Comparelist = Pair([os.path.abspath(item) for item in args.input_file]) # with absolute path for input files
    output = os.path.abspath(args.output_file)

    if args.mode =='filter':
        tempfile = CreateTemp() # create a temp file based on time for analysis
        Comparelist.compareEnvsCon(tempfile)
        Comparelist.outputEnvsCon(output, args.TPM_cutoff, args.EnrichRatio_cutoff)
        os.remove(tempfile)

    elif args.mode == 'plotEnrich':
        tempfile = CreateTemp()
        Comparelist.compareEnvsCon(tempfile)
        Comparelist.PlotEnrichRatio(output, args.TPM_cutoff)
        Comparelist.calculatePositions(args.TPM_cutoff, args.EnrichRatio_cutoff)
        os.remove(tempfile)
    
    elif args.mode == 'extract':
        print Comparelist.compareGroups(output, args.verbose) # return ls_TPM_1, list saving TPM_1 for all the input files
    
    elif args.mode == 'correlation':
        temp_output = CreateTemp()
        Comparelist.compareGroups(temp_output)
        Comparelist.DensityPlot(temp_output, args.TPM_cutoff, output, args.Flag)
        os.remove(temp_output)


