package calhoun.analysis.crf.solver;

import calhoun.analysis.crf.LocalPathSimilarityScore;
import calhoun.analysis.crf.solver.semimarkov.CleanLocalScoreSemiMarkovGradient;

/** Computes an objective function which is the expected value of a local path similarity score on a
 * semi-Markov model. Requires a {@link CacheProcessor} and a {@link LocalPathSimilarityScore} to be configured.<p>
 * <h2>Debugging output</h2>
 * To get a better understanding of what the objective function is doing, several properties can be set that
 * cause the objective function to write out trace files showing its calculations during training. When turning
 * these options on, you should usually set <code>maxIters = 1</code> and <code>requireConvergence = false</code> in your optimizer
 * so that only a single training iteration is performed, possibly setting the starting point to some predetermined value. Each of these
 * properties is configured with a filename, and each time {@link #apply} is called, the file is overwritten with
 * data from the current call. The logging options are:
 *
 * <ul>
 * <li> <b><code>alphaFile</code></b> - computation of alpha values for Markov states; includes all nodes and edges.
 * <li> <b><code>alphaLengthFile</code></b> - computation of alpha values for semi-Markov states; includes all segments.
 * <li> <b><code>betaLengthFile</code></b> - computation of beta values for semi-Markov states; includes all segments.
 * <li> <b><code>expectFile</code></b> - computation of expected values for each Markov feature.
 * <li> <b><code>expectLengthFile</code></b> - computation of expected values for each semi-Markov feature.
 * <li> <b><code>nodeMarginalFile</code></b> - computation of the marginal probability of each state at each position.
 * </ul>
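 * <p>
 * As a sketch, assuming JavaBean-style setters derived from the property names above (the setter names and
 * file paths here are illustrative, not confirmed API):
 * <pre>{@code
 * MaximumExpectedAccuracySemiMarkovGradient objective = new MaximumExpectedAccuracySemiMarkovGradient();
 * objective.setAlphaFile("alpha.trace");             // assumed setter for the alphaFile property
 * objective.setNodeMarginalFile("marginals.trace");  // assumed setter for the nodeMarginalFile property
 * }</pre>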
 *
 * <h4>Implementation Notes</h4>
 * The general normalization scheme works as follows. When updating alpha values in the forward pass, we compute
 * segments of length 1 first and then work backwards through longer segments.
 * <p>
 * Instead of always normalizing to 1, we discretize the normalization. We choose an arbitrary normalization factor w,
 * such as 50. The normalization factor at any position is then an integer v, and all entries at that position
 * represent alpha[y]*e^(v*w).
 * <p>
 * The normalization can be computed at any position as follows: 1) sum the elements of the alpha array to get s;
 * 2) compute v = log(s)/w. Because of integer division, v will always be an appropriate normalizer; it may be
 * positive or negative. 3) Divide all elements of the array by e^(v*w).
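 * <p>
 * A minimal sketch of the three steps, with illustrative values (not the actual implementation):
 * <pre>{@code
 * double w = 50.0;                       // arbitrary normalization factor
 * double[] alpha = {1e30, 2e30, 5e29};   // unnormalized alpha entries at one position
 * double s = 0.0;
 * for (double a : alpha) s += a;         // 1) sum the elements to get s
 * int v = (int) (Math.log(s) / w);       // 2) integer division yields the normalizer exponent
 * double norm = Math.exp(v * w);
 * for (int y = 0; y < alpha.length; y++) alpha[y] /= norm;  // 3) stored entry represents alpha[y]*e^(v*w)
 * }</pre>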
 *
 */
public class MaximumExpectedAccuracySemiMarkovGradient extends CleanLocalScoreSemiMarkovGradient {
}