1.2.0:

* now requires a new version of GMTK (>= 0.1 f6e7c01e2b27 (ticket161)) for increased speed
* now requires Python >=2.6 (released 3.5 years ago) and <=3.0
  (although we will switch to Python 2.7 when we need it, so if you
  are upgrading, go all the way now)
* now requires Genomedata >=1.3.1
* default setuptools version now 0.6c11
* segway: now has experimental Torque support. PBS and PBS Pro will
  probably work with some changes--please e-mail Michael if you want
  to help with identifying the necessary changes (thanks to Jay Hesselberth)
* segway: new escaping method from Genomedata tracknames to GMTK
  tracknames allows any arbitrary Genomedata trackname, but this means
  that previous files will be incompatible, since "_" is now replaced
  by "_5F" and "." by "_2E" (formerly both of these were "_", which
  was ambiguous)
* segway: allow concatenated segmentation of tracks with --trackname track1,track2, which
  gives you only one set of parameters. you may need to adjust
  existing train.tab files to use track_specs instead of
  include_tracknames because of other necessary changes
* segway: LSF: fix bug where wrapper script sometimes did not delete
  temporary observation files after a crash
* segway: LSF: no longer generates mktemp warning messages
* segway: more informative error message when supervision labels overlap each other
* segway: segway.sh has cd command in addition to segway command
* segway: MC/MX/REAL_MAT direct parameters are no longer trained (MEAN/COVAR/DENSECPT parameters are)
* segway-layer: add --bigBed option, which works if you have
  bedToBigBed in your path
* segway: add --bigBed option which calls segway-layer --bigBed
* segway: removes PYTHONINSPECT from environment for sub-tasks, which
  caused identify mode failures before
* segway-layer: fix issue #19. Now doesn't override previous colors
  when mnemonics aren't supplied (thanks to Jay Hesselberth)
* segway-winner: fix issue #21. Now works on Python 2.5.1
* installation: always require drmaa. setuptools extras_require wasn't that useful in practice
* test: use fixed random seed to generate input.master every time
* test: when files differ, output is in a form that makes it easier to diff
* test: update for new GMTK

1.1.0:

* segway: now again includes posterior output in bedGraph format
  (thanks to Avinash Sahu)
* segway: --old-directory renamed to --recover
* segway: --recover now supports identify as well as train. less
  clunky and more consistent than the old recipe for training recovery
  included in the docs (posterior still not supported)
* segway: train.tab is now optional, so you should be able to run
  identify on pre-Segway 1.0.0 training directories that don't have
  it, as long as you specify all the same options as you did before
* segway-winner: new command to pick winning parameters in interrupted training run
* segway: train.tab now includes input master filename in multi-instance training
* segway: changed memory progression edge error to be more easily understandable
* segway: when some error kills one training instance, there is now a more detailed error at end
* segway: remove --keep-going, which hasn't done anything in a few versions
* test: filenames are sorted before comparisons
* test: now includes posterior test
* test/README: added description of test system
* test: training now only goes through two rounds to speed the tests

1.0.2:

* segway: train.tab is now sorted for easier test comparison
* segway: eliminate reporting of "info criterion" (which was flaky) in likelihood.tab
* test: automated testing framework in test/test.sh

1.0.1:

* segway: fix various option setup errors
* setup: add description
* setup: fix download URL
* docs: fix quickstart errors

1.0.0:

For the first official release, I have improved the user interface in
a number of ways. Segway should be much simpler to run. However, the
command-line interfaces have changed, so your scripts will have to
change as well.

* segway: change command-line signature. Usage is now
  segway train GENOMEDATA TRAINDIR
  segway identify GENOMEDATA TRAINDIR IDENTIFYDIR
  segway posterior GENOMEDATA TRAINDIR IDENTIFYDIR
  segway identify+posterior GENOMEDATA TRAINDIR IDENTIFYDIR

  Accordingly, have removed --no-train, --no-identify, --no-posterior, --directory

* segway: change --random-starts to --iterations
* segway: training now writes important file locations hyperparameters
  into TRAINDIR/train.tab
* segway: idenfication now automatically runs segway-layer as well

* segway: logs/run.sh is no longer flushed immediately. UUID is available from logs/segway.sh
* docs changed accordingly
* docs: add quick start tutorial to top

0.2.7:

* all: display hint when trying to run on cluster
* segway-layer: add --track-line-set option
* segway-layer: produce layers for every specified mnemonic, even if empty
* segway-layer: reverse previous ordering, so lines show up in genome
  browser in same order as mnemonics file
* segway-layer: ignores lines that start with #

0.2.6:

* segway: SGE: fix bug where Viterbi jobs would immediately crash with
  a pthreads error when using recent versions of PyTables/numexpr

0.2.5:

* segway: fix bug where aborted jobs were not returning maxvmem/vmem info
* segway: make RuntimeError for edge of memory usage progression print
  troubleshooting information below traceback
* docs: add more troubleshooting info

0.2.4:

* segway: print error filename when failing due to end of memory progression
* segway: LSF: add -E /bin/true to submissions to avoid trivial job
  failures on broken hosts
* docs: fix a mistake on description of explicit length distribution

0.2.3:

* segway: new --tracks-from option allows specification of a filename
* segway: LSF: now selects and declares tmp disk space usage. This
  perhaps could be extended to SGE with some cluster administrator effort.
* segway-layer: fix bug where not all segment labels were in an input file
* segway: with --distribution=asinh_normal, starting covariance
  parameters are now asinh(variance). This fixes a bug with negative
  variances
* segway: fix bug where --split-sequences was ignored
* segway: use Genomedata-reported metadata for whole genome rather
  than accumulating by chromosome
* prereqs: optbuild must now be >= 0.1.10

0.2.2:

* increase task wrapper robustness
* bug fixes
* docs fixes
* setup.py changes

0.2.1:

* docs: fix errors
* add sge_setup.py
* installation: add explicit PyTables installer

0.2.0:

* segway: add hierarchical segmentation
* segway: fix bug where only one random start is allowed
* segway: change "chunk" to "window" to avoid confusion with GMTK usage of chunk
* segway: updated system check to consider GE* _and SGE*_ systems as SGE.
* segway: removed dependence upon genomedata._util
* everything: anything that supported gzipped I/O now may support "-" to mean stdin/stdout
* segway-layer: now supports no arguments, which means use stdin/stdout
* segway-layer: now supports input files with no track line
* segway-layer: now supports non-integer segment labels
* GMTK-installation: fixed default shell to be /bin/bash as 'declare' call in
  installation script caused GMTK installation to fail.
* installation: move from path.py to forked-path

0.1.21:

* segway: add --segtransition-weight-scale

0.1.20:

* segway: limits the number of jobs dumped into the queuing system at
  once to avoid overloading it
* segway: increase robustness by adding a ulimit wrapper script to
  explicitly limit the memory used for each job and erase temporary files
* segway: add --old-directory option for identify mode recovery of a
  previous partial run
* segway-task: bugfix: won't throw a nested exception when temporary
  files were not created

0.1.19:

* segway: now set CARD_FRAMEINDEX dynamically to allow for
  --split-sequences to change between train and identify
* segway: --num-segs now called --num-labels
* LSF: fix bug related to: now appends to output/error instead of overwriting

0.1.18:

* LSF: now appends to output/error instead of overwriting
* recommit dumpnames.list removal (bugfix)

0.1.17:

* now requires genomedata >= 0.1.6
* remove dumpnames.list
* by default check all jobs every 30 min instead of every 54 min

0.1.16:

* now requires GMTK version 20091016
* segway: a change in the semantics of --trainable-params for
  training. If it is specified and the file exists, then it will be
  used as a parameter file in the first round of training. Before it
  was clobbered (with --clobber), or training was disabled.
* segway: allow multiple sets of parameters with --trainable-params,
  one for each random start. This together with the previous change
  will allow for the easy continuation of a multithread training run.
* segway: changed error text when wrong number of arguments provided
* segway: fixed bug that causes failure in command-line logging during
  identify phase

0.1.15:

* now requires GMTK version 20091016
* segway: more informative error on memory progression failure
* segway: now works on LSF even when LSF_UNIT_FOR_LIMITS has been changed
* segway: add jobname to "queued" diagnostic messages
* segway: log/res_usage.tab becomes log/jobs.tab
* segway: add jobid, jobname to jobs.tab
* segway: eliminate core dumps of jobs in LSF
* segway: make job abnormal error handling more robust
* segway: add log/details.sh to give all details of queued commands
* segway: queue longest jobs first
* segway: changed way GMTK model is specified for speedup
* segway: LSF: select only hosts with sufficient memory
* segway: eliminate 10 MB memory guard for declared usage versus crash usage
* docs updates

0.1.14:

* there is now a Segway issue tracker at http://code.google.com/p/segway-genome/
  file your bugs and feature requests there
* now requires GMTK version 20091004
* now requires drmaa-python >= 0.4a3
* segway: off-load identify file creation to the subtasks
* segway: --split-sequences now called --max-frames, reenabled for all tasks
* segway: by default, split sequences instead of throwing errors on long sequences
* segway: CARD_FRAMEINDEX is now based on --max-frames value
* segway: supervision and dinucleotide tracks are temporarily disabled
* segway: block e-mail report creation by cluster
* segway: fixes for LSF support
* segway: disabled a lot of diagnostic noise
* segway: default distribution is now asinh_norm
* segway: will now resubmit jobs in a cohort before all of the other
  jobs are done. this increases the load on the cluster manager, so
  job completion is only checked every 60 s per thread
* segway-res-usage: remove

0.1.13:

* segway: adding switching weights
* segway: restoring weighting to global number of datapoints in a track
* segway: now supports Platform LSF with FedStage DRMAA for LSF
* segway: moved SGE-specific code into modular driver architecture
* segway: --resolution option allows downsampling for speed
* segway: --ruler-scale option allows specification of ruler scale
  (formerly always 10)
* segway: start quoting words with spaces in log/run.sh
* genomedata >0.1.2 is now a prerequisite
* segway: identify BED output now names tracks segway.<uuid> in order
  to prevent conflicts
* segway-layer: automatically recolor in groups defined by period
  (such as 1.3) as well as alphabetical characters (A3)
* segway: fix bug: --dry-run causes traceback during training
* segway: --distribution=arcsinh_norm becomes --distribution=asinh_norm
* segway: now creates log/segway.sh to report arguments from which Segway was called
* segway: now prints out Segway distribution version to log files
* segway: with only one random start, now runs in a single-threaded mode

0.1.12:

* segway: LEN_SEG_EXPECTED becomes 100000
* segway: P(segTransition(0) | segCountDown(-1), seg(-1)) is now
  weighted with scale which can be set through
  -DSEGTRANSITION_WEIGHT_SCALE (not exposed to user, but you can hack
  include files)
* segway: fixed --semisupervised bugs
* segway-layer: add --mnemonic-file option
* BED parsing now handles empty lines by ignoring them instead of
  generating an error
* docs changes

0.1.11:

* optbuild >0.1.6 is now a prerequisite
* totally different memory management regime that eliminates old
  prediction stuff, and instead relies on trying at 2 GiB, 4 GiB, etc.
  until failure. globally shares minimum values for each thread
* temporary: minimum segment length is always at least 10
* temporarily disabled --split-sequences
* new prereq: switch from old DRMAA module to new drmaa module
* new prereq: optplus 0.1.1
* remove old "state" random variable from default structure
* BIC is now defined as -(2/n) ln L + k ln N where n is num of bases
  and N is num of sequences
* new segway-layer program changes the format of segway.bed.gz to
  something that is more intuitive for large numbers of labels, using
  thick/thin lines
* segway: remove now-unused --resource-profile option
* segway: add --seg-table option to describe segment hyperparameters,
  including different minimum and maximum segment lengths for each label
* remove --min-seg-len
* added some robustness to determining executable paths for queue
  submission, eliminating calls to bash for sub-jobs
* segway: add --exclude-coords option, works like --include-coords
* segway: add arcsinh-norm distribution
* segway: disabled track weighting
* segway: add --semisupervised=FILE option to get semisupervision
  labels from a BED file
* segway: add --drm-opt=OPTION option to allow specification of an
  arbitrary distributed resource manager option (e.g. for SGE -p sets
  priority, -l sets a resource requirement, etc.)
* segway: log memory and CPU usage to log/res_usage.tab
* segway: segway.inc now referenced with more relative paths

0.1.10:

* fix off-by-one error in declaring CARD_SEGCOUNTDOWN

0.1.9:

* replace state min seg len regime with one based on a ruler and segCountDown
* default: don't train any deterministic CPT
* stop creating input.master file before training when you have
  multiple random starts, and are going to be creating input.0.master,
  etc.
* segway.inc is changed so that CARD_SEG is now overridable. If you
  have an old segway.inc you must replace the CARD_SEG bits with
  the #ifndef CARD_SEG segment of a new segway.inc
* new classmethod segway.run.Runner.fromoptions() to make it easier to
  subclass while keeping the same command-line interface.
* fix bug in calculating BIC, reported in log likelihood logs
* default triangulation filenames now include the number of labels in the filename
* triangulation filenames now go into triangulation/ directory
* fixed a regression in 0.1.8: now copies final params file to
  params.params again
* segway now implements supports for a range argument to --num-segs

0.1.8:

* colorbrewer package is now a prerequisite
* segway: now produces BED9 output instead of BED4 output, with RGB
  colors from the ColorBrewer Dark2 scheme (Colors from
  www.ColorBrewer.org by Cynthia A. Brewer, Geography, Pennsylvania
  State University.)
* segway: fixed error when attempting to copy likelihood files in the
  case of only one random start
* segway: make error more explicit if you forget to set --num-segs=N
  if N is greater than 3 in the supplied model file
* packaging: will not install as zipped egg as a temporary measure to enhance debugging

0.1.7:

* segway: add --dont-train option to allow specification of a file with list of variables not to train

0.1.6:

* no changes, this is to fix a packaging/tagging problem

0.1.5:

Currently broken
* segway: --num-segs for a range does not work

New features/bugfixes:
* segway: allow posterior with --num-segs greater than 2
* segway: hard maximum of 2,000,000 for sequence length.
* segway: --split-sequences will split sequences below 5,000,000
* segway: if --num-segs was used and segway generated the input.master
  file, then use the Bayesian Information Criterion instead of likelihood to optimize parameters
* segway: now copies selected params.params and input.master rather than moving
* segway: bugfix to allow training to run
* segway: enhancement of the memory prediction algorithm
* segway: now spins off segway-task for Viterbi jobs, which produces BED output in parallel
  (the segway-task interface is undocumented and subject to change)

Documentation:
* there is now some documentation in doc/segway.rst (in
  reStructuredText format, a lightweight markup format)

0.1.4:

* genomedata>0.1.0 package is now a prerequisite
* optplus package is now a prerequisite
* 20090302 version of GMTK is now a prerequisite
* genomedata-load-data, genomedata-load-seq, genomedata-name-tracks, genomedata-save-metadata: moved to genomedata package
* segway: changed args from a number of h5filenames to a single genomedata directory 
* segway: added extra discrete variable called "state" to the default
  model. seg is now a child of this. transitions all occur state to
  state rather than seg to seg
* segway: start_seg -> state_state conditional probability table is no longer generated
  randomly, now reflects a geometric distribution with expected
  segment length 10000
* segway: start_seg -> start_state is no longer trained
* segway: seg_seg -> start_state is no longer generated randomly, instead starting
  probability defaults to 1/NUM_SEGS for each segment
* segway: forces a hard minimum segment limit of 10
* segway: new --num-segs option: allows more than 2 segment labels. even allows a range of number of segments
* segway: change some visual formatting of posterior track
* segway: bugfix: actually use default res usage file by default
* segway: don't create a posterior triangulation when --dry-run used
* segway: memory usage new requested in 1G increments instead of 1M
* segway: always uses island mode to save memory (at the cost of speed)
* segway: use ulimit to make a hard max on memory consumption

0.1.3:

* segway: weights probability from different tracks by the number of
  datapoints in each track. useful for dealing with data on
  heterogeneous scales
* seg.inc is now segway.inc
* seg.str is now segway.str
* segway-res-usage: now produces a tab-delimited output file that can
  be used to specify Segway's memory management
* segway-res-usage: calibrates resource usage of training bundle step, posterior, identify
* segway: uses the new resource usage file by default
* segway: new --resource-profile option to use another resource usage file
* segway: by default, abort if an input sequence is going to take more
  than 15G of RAM to process using any of the commands being run
* segway-res-usage: try to guess which tracks are going to be way too huge and skip them
* segway: write observations/observations.tab which gives information
  about what exactly is in each observation file
* segway-res-usage, segway: identify unique runs with uuid.uuid1() instead of os.getpid()
* segway: fixed a bug regarding the setting of initial parameter
  values with the --track/-t option
* segway: when rewriting regions due to --include-region or
  --split-sequences, ensure that the coordinates are in a monotonically
  increasing order

0.1.2:

* segway bugfixes: do not create segway.bed.gz or
  posterior.seg0.wig.gz when --no-identify or --no-posterior are
  enabled, respectively

0.1.1:

* Now requires optbuild 0.1.6
* new segway-res-usage program calibrates memory usage
* segway: changed -b option: It now specifies a BED filename instead of a directory where a file
  named segway.bed.gz is found
* --force/-f is now --clobber/-c
* segway: now will use a params.params file during identify
  implicitly, if it is in the directory specified with -d
* segway: method for generating starting parameters is somewhat different now
* segway: now generates a lot of directories for working files instead of dumping lots of stuff in one directory
* _load_data: ignore invalid chromosome specifications instead of
  triggering a fatal error
* genomedata-save-metadata: can now handle empty columns
* stop creating lots of auxiliary output files during Viterbi identify
* there is no directory named "out" created by default
* segway: now arranges some working files into subdirectories, some name changes
* runs posterior decoding during identify, creates wig files
* segway: no option --no-posterior disables posterior decoding
* segway: figure out fully-qualified path to GMTK executables and
  queue that. This increases robustness to minor configuration
  differences on parallel nodes.
* add segway-res-usage program to calibrate resource usage, which involved some refactoring of run.py
* adding different methods to create initial parameters. interface may be exposed in a future version
* jt_info.txt is now created in work directory instead of current directory
* fix bug: don't repeat track line multiple times in BED file
* performance: greatly improve genomedata-save-metadata

0.1.0:

* new segway option: --distribution: choices are norm, gamma
* default distribution is now norm
* _load_data: add support for wigFix with step, span values set
* _util.py: mark some functions for replacement with genomedata

0.1.0a4:

* Now requires optbuild>0.1.4
* Understandable structure files (e.g. the name for observation track "obs0" becomes "h3k27me3")
* Save log likelihood files
* Rename segway-load-seq -> genomedata-load-seq
* Rename segway-load-data -> genomedata-load-data
* Rename segway-name-tracks -> genomedata-name-tracks
* New command genomedata-save-metadata
* New command: _load_data: load data quickly (in C)

(These five commands will eventually be split out into a new genomedata package)

* New command: segway-calc-distance: calculate distance between identified segments and an external BED feature file
* New command: gtf2bed: convert GTF to BED format

* New segway option --track: use only a particular track
* Generate a "dinucleotide" track when specified with --track
* New segway option --prior-strength: uses dirichlet tables on seg_seg transition CPT
* segway --help now uses option groups

* New h5histogram options: --include-identify, --identify-label: now
  allows getting histogram per identified segment
* New h5histogram option: --num-bins: allow specification of number of histogram bins

* identification output files are now gzipped bed rather than wig
* rename "nonmissing" to "presence" throughout
* Move from normal distribution to gamma distribution
* Calculate starting parameters based on something close to the maximum likelihood estimation plus jitter
* Hacky (and wrong) support for memory requirements for up to six
  observation tracks, will be redone when GMTK memory optimization is
  done

* better handling of errors and KeyboardInterrupts during parallel operation of segway
* Stops using deprecated -inputTrainableParameters option to GMTK

* Checking for unset metadata before using genomedata files
* Corrections to README
* Bugfixes

0.1.0a3:

Files from previous versions are no longer compatible.

* Binary output of observations for GMTK. This means it takes a few
  minutes to write out observations for the whole genome before
  training or Viterbi rather than many hours.
* Output of segment lengths for diagnostic purposes.
* Changed observation storage from float32 to float64.
* Faster conversion of Viterbi output files to wiggle tracks.
* Observation files now have more descriptive names.

0.1.0a2:

Files from previous versions are no longer compatible.

* Efficient multiple random start EM training queue submission
* Addition of h5histogram command
* Limiting data analyzed to a particular region with --include-region
* More efficient data loading
* Some interface and file format changes
* Performance improvements
* Bugfixes
