GeneMarkS-2
===========
Article Name  : Modeling leaderless transcription and atypical genes results in more accurate gene prediction in prokaryotes

Authors       : Alex Lomsadze^, Karl Gemayel^, Shiyuyun Tang and Mark Borodovsky
                ^ joint first authors

Affiliation   : Georgia Institute of Technology
Group Website : topaz.gatech.edu
Publication   : Genome Research, 2018

Install
-------
This distribution is intended for 64-bit Linux.

Structure:     GeneMarkS-2 has five components: 
> - gms2.pl       : Controls the entire GeneMarkS-2 algorithm
> - biogem        : Implements the training stages of GeneMarkS-2
> - gmhmmp2       : Implements the prediction stages of GeneMarkS-2
> - compp         : Checks for convergence by comparing consecutive sets of gene predictions
> - mgm_*.mod     : Files of pre-computed model parameters used in the algorithm

Forthcoming updates of the code will be available at 
      http://topaz.gatech.edu/GeneMark/license_download.cgi

Execution
---------

To run GeneMarkS-2, execute the Perl script 'gms2.pl'. Invoking 'perl gms2.pl' on its own
after compilation prints a usage message showing all possible input parameters.
To run GeneMarkS-2 with default parameters, use the command line:

perl gms2.pl --seq sequence.fasta --genome-type TYPE

Here 'sequence.fasta' is the FASTA file containing the genome sequence.
TYPE is one of "bacteria", "archaea" or "auto" (automatic detection of the organism domain).

Usage
-----
Usage: gms2.pl --seq SEQ --genome-type TYPE
Basic Options: 
--seq                 File containing genome sequence in FASTA format
--genome-type         Type of genome: archaea, bacteria, auto 
--gcode               The genetic code number (default: 11. Choices: 11 and 4)
--output              Name of output file (default: gms2.lst)
--format              Format of output file (default: lst); Supported: gtf, gff3 and lst
--fnn                 Name of output file that will hold nucleotide sequences of predicted genes
--faa                 Name of output file that will hold protein sequences of predicted genes
--git                 Change gene ID format
--advanced-options    Show the advanced options
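
Example: the basic options above can be combined in a single run. The sketch below only
assembles and prints such a command line; it does not run GeneMarkS-2. The function name
'build_gms2_cmd' and the output file naming scheme (a common prefix with .gff, .fnn and .faa
extensions) are assumptions made for illustration and are not part of the distribution.

```shell
#!/bin/sh
# Sketch: build a gms2.pl command line from the basic options listed above.
# build_gms2_cmd and the prefix-based output names are hypothetical.
build_gms2_cmd() {
    seq_file=$1   # FASTA file with the genome sequence
    gtype=$2      # archaea, bacteria or auto
    prefix=$3     # hypothetical common prefix for all output files

    # Reject genome types that the usage message does not list.
    case "$gtype" in
        archaea|bacteria|auto) ;;
        *) echo "unknown genome type: $gtype" >&2; return 1 ;;
    esac

    # echo joins its arguments with single spaces, giving one command line.
    echo "perl gms2.pl --seq $seq_file --genome-type $gtype" \
         "--gcode 11 --output $prefix.gff --format gff3" \
         "--fnn $prefix.fnn --faa $prefix.faa"
}

# Print the command for inspection.
build_gms2_cmd sequence.fasta auto sample
```

To execute the printed command rather than just inspect it, pipe the output to 'sh' from the
directory that contains gms2.pl.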

Version: 1.02    May 3, 2018
------------
