calhoun.analysis.crf.io
Class InputHandlerInterleaved

java.lang.Object
  extended by calhoun.analysis.crf.io.InputHandlerInterleaved
All Implemented Interfaces:
InputHandler, java.io.Serializable

public class InputHandlerInterleaved
extends java.lang.Object
implements InputHandler

An InputHandler for input files that consist of multiple different sequences interleaved together. This input handler holds a list of InterleavedInputComponents. When reading a file, it opens a reader on the file and, for each sequence, passes the reader to each InterleavedInputComponent in turn. Training data is assumed to be the first line of each sequence, with an IntInput used to encode the hidden states.
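The read loop described above can be illustrated with a small, self-contained sketch. It does not use the calhoun classes: `InterleavedReadSketch`, `LineComponent`, and `ONE_LINE` are hypothetical stand-ins, and the assumption that every component consumes exactly one line is made only for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the "pass one reader to each component in turn" loop.
// None of these names belong to the calhoun API.
final class InterleavedReadSketch {

    /** Hypothetical component: consumes its share of one sequence from the shared reader. */
    interface LineComponent {
        String read(BufferedReader reader) throws IOException;
    }

    /** Assumption for illustration: a component that consumes exactly one line. */
    static final LineComponent ONE_LINE = BufferedReader::readLine;

    /** Reads sequences until end of input; each sequence holds one entry per component, in order. */
    static List<List<String>> readAll(String text, List<LineComponent> components) {
        if (components.isEmpty()) {
            throw new IllegalArgumentException("need at least one component");
        }
        List<List<String>> records = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            while (true) {
                List<String> record = new ArrayList<>();
                for (LineComponent component : components) {
                    String line = component.read(reader);
                    if (line == null) {
                        // End of input is only legal on a sequence boundary.
                        if (record.isEmpty()) {
                            return records;
                        }
                        throw new IllegalStateException("input ended in the middle of a sequence");
                    }
                    record.add(line);
                }
                records.add(record);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // First line of each sequence stands in for the hidden states, second for an input track.
        String text = "0011\nACGT\n1100\nTTGA\n";
        System.out.println(readAll(text, Arrays.asList(ONE_LINE, ONE_LINE)));
    }
}
```

The sketch fails on a partially read sequence rather than returning it, which is one possible design choice; the real handler may behave differently.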

This input handler is useful for test data that contains multiple inputs in a file along with the training data. It is included to support backwards compatibility with the old input format.

This input handler can also work with "literal" input, where the location string that is passed in is not a file name, but the actual input data. This is used frequently to pass small volumes of data in unit tests.
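The difference between a literal location and a file location can be sketched as follows. This is a minimal illustration, not the handler's actual implementation: `LocationSketch`, `open`, and `firstLine` are hypothetical names, and only standard `java.io` and `java.nio.file` calls are used.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical sketch: choose the reader source based on a "location is literal" flag.
final class LocationSketch {

    /** Opens the location either as the data itself (literal) or as a file path. */
    static BufferedReader open(String location, boolean locationIsLiteral) throws IOException {
        if (locationIsLiteral) {
            // The string is the data, so read straight from it.
            return new BufferedReader(new StringReader(location));
        }
        // Otherwise treat the string as the path of a file to read.
        return Files.newBufferedReader(Paths.get(location));
    }

    /** Convenience helper for the demonstration: returns the first line of the data. */
    static String firstLine(String location, boolean locationIsLiteral) {
        try (BufferedReader reader = open(location, locationIsLiteral)) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Literal mode: handy for small inputs in unit tests, no temporary file needed.
        System.out.println(firstLine("0011\nACGT\n", true));
    }
}
```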

See Also:
Serialized Form

Constructor Summary
InputHandlerInterleaved()
          creates a new input handler, usually to be configured from an XML file
InputHandlerInterleaved(InterleavedInputComponent base)
          creates a new input handler, containing a single InterleavedInputComponent
InputHandlerInterleaved(InterleavedInputComponent base, boolean locationIsLiteral)
          creates a new input handler, containing a single InterleavedInputComponent, and sets whether the location string is literal input data
 
Method Summary
 java.util.List<InterleavedInputComponent> getComponents()
          gets the current set of input components configured for this input handler.
 IntInput getHiddenStateReader()
          gets the reader used to read in results for training data.
 boolean isLocationIsLiteral()
          gets the meaning of the input location string.
 java.util.Iterator<? extends InputSequence<?>> readInputData(java.lang.String location)
          returns the input data read from the specified location.
 java.util.List<? extends TrainingSequence<?>> readTrainingData(java.lang.String location)
           
 java.util.List<? extends TrainingSequence<?>> readTrainingData(java.lang.String location, boolean predict)
          returns the training data read from the specified location.
 void setComponents(java.util.List<InterleavedInputComponent> components)
          sets the current set of input components configured for this input handler.
 void setHiddenStateReader(IntInput hiddenStateReader)
          sets the reader used to get hidden sequences.
 void setLocationIsLiteral(boolean literal)
          sets the meaning of the input location string.
 void writeInputData(java.lang.String location, java.util.Iterator<? extends InputSequence<?>> data)
          writes input data to the specified location.
 void writeTrainingData(java.lang.String location, java.util.List<? extends TrainingSequence<?>> data)
          writes training data to the specified location.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

InputHandlerInterleaved

public InputHandlerInterleaved()
creates a new input handler, usually to be configured from an XML file


InputHandlerInterleaved

public InputHandlerInterleaved(InterleavedInputComponent base)
creates a new input handler, containing a single InterleavedInputComponent

Parameters:
base - the single input component which is contained in the input file

InputHandlerInterleaved

public InputHandlerInterleaved(InterleavedInputComponent base,
                               boolean locationIsLiteral)
creates a new input handler, containing a single InterleavedInputComponent, and sets whether the location string is literal input data

Parameters:
base - the single input component which is contained in the input file
locationIsLiteral - if true then the location string passed in to the read commands is the actual input data. Otherwise, it is the location of a file from which to read the data.
Method Detail

readInputData

public java.util.Iterator<? extends InputSequence<?>> readInputData(java.lang.String location)
                                                             throws java.io.IOException
Description copied from interface: InputHandler
returns the input data read from the specified location. The result is returned as an Iterator because the inference algorithms can predict on the sequences one at a time. The interpretation of the location string is dependent on the particular InputHandler implementation used.

Specified by:
readInputData in interface InputHandler
Parameters:
location - string location of the data. Meaning is implementation dependent.
Returns:
an iterator over input sequences
Throws:
java.io.IOException - if there is a problem reading the data
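
Returning an Iterator lets a caller pull sequences one at a time instead of loading them all up front. The following is a minimal sketch of that idea, assuming one line per sequence for simplicity; `LazySequenceIteratorSketch` and its `linesRead` counter are hypothetical and are not part of the calhoun API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch: an iterator that reads from the underlying reader only on demand.
final class LazySequenceIteratorSketch implements Iterator<String> {
    private final BufferedReader reader;
    private String lookAhead;   // next sequence, or null once the input is exhausted
    int linesRead = 0;          // exposed only to show that reading is lazy

    LazySequenceIteratorSketch(BufferedReader reader) {
        this.reader = reader;
        advance();
    }

    /** Assumption for illustration: the data is passed in as a literal string. */
    static LazySequenceIteratorSketch fromLiteral(String data) {
        return new LazySequenceIteratorSketch(new BufferedReader(new StringReader(data)));
    }

    private void advance() {
        try {
            lookAhead = reader.readLine();
            if (lookAhead != null) {
                linesRead++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return lookAhead != null;
    }

    @Override
    public String next() {
        if (lookAhead == null) {
            throw new NoSuchElementException();
        }
        String current = lookAhead;
        advance();   // read just one sequence ahead
        return current;
    }

    public static void main(String[] args) {
        Iterator<String> sequences = fromLiteral("ACGT\nTTGA\n");
        while (sequences.hasNext()) {
            System.out.println(sequences.next());
        }
    }
}
```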

readTrainingData

public java.util.List<? extends TrainingSequence<?>> readTrainingData(java.lang.String location)
                                                               throws java.io.IOException
Specified by:
readTrainingData in interface InputHandler
Throws:
java.io.IOException

readTrainingData

public java.util.List<? extends TrainingSequence<?>> readTrainingData(java.lang.String location,
                                                                      boolean predict)
                                                               throws java.io.IOException
Description copied from interface: InputHandler
returns the training data read from the specified location. Training data includes input data and hidden sequences. The result is returned as a List of training sequences. The interpretation of the location string is dependent on the particular InputHandler implementation used.

Specified by:
readTrainingData in interface InputHandler
Parameters:
location - string location of the data. Meaning is implementation dependent.
Returns:
a list of training sequences
Throws:
java.io.IOException - if there is a problem reading the data
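
As a rough illustration of how training data pairs hidden states with input data, the sketch below treats the first line of each sequence as the hidden states and the second as the input. The encoding used by IntInput is not described here; the one-digit-per-position format, the length check, and the names `TrainingPairSketch`, `Pair`, `parseDigits`, and `pair` are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: pair a line of hidden states with the input line that follows it.
final class TrainingPairSketch {

    /** Hypothetical holder for one training example. */
    static final class Pair {
        final int[] hidden;
        final String input;

        Pair(int[] hidden, String input) {
            this.hidden = hidden;
            this.input = input;
        }
    }

    /** Assumed encoding: one decimal digit per position. */
    static int[] parseDigits(String line) {
        int[] states = new int[line.length()];
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c < '0' || c > '9') {
                throw new IllegalArgumentException("not a digit at position " + i + ": " + c);
            }
            states[i] = c - '0';
        }
        return states;
    }

    /** Groups lines two at a time: hidden states first, then the input they label. */
    static List<Pair> pair(List<String> lines) {
        if (lines.size() % 2 != 0) {
            throw new IllegalArgumentException("expected an even number of lines");
        }
        List<Pair> pairs = new ArrayList<>();
        for (int i = 0; i < lines.size(); i += 2) {
            int[] hidden = parseDigits(lines.get(i));
            String input = lines.get(i + 1);
            // Assumption: one hidden state per input position.
            if (hidden.length != input.length()) {
                throw new IllegalArgumentException("length mismatch in sequence " + (i / 2));
            }
            pairs.add(new Pair(hidden, input));
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<Pair> pairs = pair(Arrays.asList("0011", "ACGT", "1100", "TTGA"));
        System.out.println(pairs.size() + " training pairs, first hidden states: "
                + Arrays.toString(pairs.get(0).hidden));
    }
}
```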

writeInputData

public void writeInputData(java.lang.String location,
                           java.util.Iterator<? extends InputSequence<?>> data)
                    throws java.io.IOException
Description copied from interface: InputHandler
writes input data to the specified location. The interpretation of the location string is dependent on the particular InputHandler implementation used.

Specified by:
writeInputData in interface InputHandler
Parameters:
location - string location of the data. Meaning is implementation dependent.
data - an iterator over input sequences
Throws:
java.io.IOException - if there is a problem writing the data

writeTrainingData

public void writeTrainingData(java.lang.String location,
                              java.util.List<? extends TrainingSequence<?>> data)
                       throws java.io.IOException
Description copied from interface: InputHandler
writes training data to the specified location. Training data includes input data and hidden sequences. The interpretation of the location string is dependent on the particular InputHandler implementation used.

Specified by:
writeTrainingData in interface InputHandler
Parameters:
location - string location of the data. Meaning is implementation dependent.
data - a list of training sequences to write out.
Throws:
java.io.IOException - if there is a problem writing the data

getComponents

public java.util.List<InterleavedInputComponent> getComponents()
gets the current set of input components configured for this input handler.

Returns:
the interleaved input components that make up the file.

setComponents

public void setComponents(java.util.List<InterleavedInputComponent> components)
sets the current set of input components configured for this input handler.

Parameters:
components - the interleaved input components that make up the file.

isLocationIsLiteral

public boolean isLocationIsLiteral()
gets the meaning of the input location string.

Returns:
true if the location string is the actual input data; false if it is the location of a file from which to read the data.

setLocationIsLiteral

public void setLocationIsLiteral(boolean literal)
sets the meaning of the input location string.

Parameters:
literal - true if the location string is the actual input data; false if it is the location of a file from which to read the data.

getHiddenStateReader

public IntInput getHiddenStateReader()
gets the reader used to read in results for training data.

Returns:
the IntInput used to read in the hidden sequences for training

setHiddenStateReader

public void setHiddenStateReader(IntInput hiddenStateReader)
sets the reader used to get hidden sequences. Must be set to read in training data.

Parameters:
hiddenStateReader - the reader that will be used to access hidden states