calhoun.analysis.crf.io
Interface InputHandler

All Superinterfaces:
java.io.Serializable
All Known Implementing Classes:
CompositeInput.LegacyInputHandler, InputHandlerBase, InputHandlerDirectory, InputHandlerFile, InputHandlerInterleaved

public interface InputHandler
extends java.io.Serializable

Interface for classes that handle reading and writing of input data for the CRF. The 'location' of the input data is passed in as a text string, and the interpretation of this string is left up to the InputHandler implementation. Commonly it will be a directory path, a file path, or some sort of configuration string. The InputHandler returns InputSequences and TrainingSequences to the engine.

The InputHandler is also responsible for writing out data. This is necessary for subsetting and other partitioning utilities. As with reading, an implementation-dependent location string specifies where the data should be written. When writing data, the input handler can safely assume that the sequences it receives are in the same format as those it created during reading.
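
The following is a minimal sketch of how two implementations might interpret the same location string differently, one as a file path and one as a directory path. It does not use the real InputHandler API: SimpleHandler, FileHandler, DirectoryHandler and roundTrip are hypothetical names, and sequences are reduced to plain strings so the example is self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LocationStringSketch {

    /** Hypothetical stand-in for InputHandler; sequences are plain strings here. */
    interface SimpleHandler {
        List<String> read(String location) throws IOException;
        void write(String location, List<String> data) throws IOException;
    }

    /** Interprets the location as one file with one sequence per line (no embedded newlines). */
    static class FileHandler implements SimpleHandler {
        public List<String> read(String location) throws IOException {
            return Files.readAllLines(Paths.get(location));
        }
        public void write(String location, List<String> data) throws IOException {
            Files.write(Paths.get(location), data);
        }
    }

    /** Interprets the location as a directory with one file per sequence. */
    static class DirectoryHandler implements SimpleHandler {
        public List<String> read(String location) throws IOException {
            List<Path> sorted;
            try (Stream<Path> files = Files.list(Paths.get(location))) {
                // Sort by file name so the read order matches the write order.
                sorted = files.sorted().collect(Collectors.toList());
            }
            List<String> result = new ArrayList<>();
            for (Path p : sorted) {
                result.add(new String(Files.readAllBytes(p), StandardCharsets.UTF_8));
            }
            return result;
        }
        public void write(String location, List<String> data) throws IOException {
            Path dir = Files.createDirectories(Paths.get(location));
            for (int i = 0; i < data.size(); i++) {
                // Zero-padded index keeps lexicographic order equal to list order.
                Path file = dir.resolve(String.format("seq%05d.txt", i));
                Files.write(file, data.get(i).getBytes(StandardCharsets.UTF_8));
            }
        }
    }

    /** Writes data to a fresh temporary location, then reads it back with the same handler. */
    static List<String> roundTrip(SimpleHandler handler, List<String> data) {
        try {
            Path tmp = Files.createTempDirectory("handler-sketch");
            String location = tmp.resolve("data").toString();
            handler.write(location, data);
            return handler.read(location);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("ACGT", "TTGA");
        System.out.println(roundTrip(new FileHandler(), data));
        System.out.println(roundTrip(new DirectoryHandler(), data));
    }
}
```

In both cases the caller passes only a string. Each handler reads back exactly what it wrote, which mirrors the assumption above that sequences received for writing are in the format the handler itself produced.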


Method Summary
 java.util.Iterator<? extends InputSequence<?>> readInputData(java.lang.String location)
          returns the input data read from the specified location.
 java.util.List<? extends TrainingSequence<?>> readTrainingData(java.lang.String location)
           
 java.util.List<? extends TrainingSequence<?>> readTrainingData(java.lang.String location, boolean predict)
          returns the training data read from the specified location.
 void writeInputData(java.lang.String location, java.util.Iterator<? extends InputSequence<?>> data)
          writes input data to the specified location.
 void writeTrainingData(java.lang.String location, java.util.List<? extends TrainingSequence<?>> data)
          writes training data to the specified location.
 

Method Detail

readTrainingData

java.util.List<? extends TrainingSequence<?>> readTrainingData(java.lang.String location,
                                                               boolean predict)
                                                               throws java.io.IOException
returns the training data read from the specified location. Training data includes input data and hidden sequences. The result is returned as a List; most training algorithms hold all of the training data at once. The interpretation of the location string is dependent on the particular InputHandler implementation used.

Parameters:
location - string location of the data. Meaning is implementation dependent.
Returns:
a list of training sequences
Throws:
java.io.IOException - if there is a problem reading the data

readTrainingData

java.util.List<? extends TrainingSequence<?>> readTrainingData(java.lang.String location)
                                                               throws java.io.IOException
Throws:
java.io.IOException

readInputData

java.util.Iterator<? extends InputSequence<?>> readInputData(java.lang.String location)
                                                             throws java.io.IOException
returns the input data read from the specified location. The result is returned as an Iterator because the inference algorithms can predict on the sequences one at a time. The interpretation of the location string is dependent on the particular InputHandler implementation used.

Parameters:
location - string location of the data. Meaning is implementation dependent.
Returns:
an iterator over input sequences
Throws:
java.io.IOException - if there is a problem reading the data
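
The one-at-a-time behavior described above can be sketched with a look-ahead iterator that reads a sequence only when the caller asks for it. This is an illustration under stated assumptions, not the library's implementation: LazySequenceIterator is a hypothetical name, each line of text stands in for one InputSequence, and I/O failures during iteration are rethrown as UncheckedIOException because Iterator methods cannot throw checked exceptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical lazy iterator: one line of text stands in for one input sequence. */
class LazySequenceIterator implements Iterator<String> {
    private final BufferedReader reader;
    private String next; // look-ahead element; null once the input is exhausted

    LazySequenceIterator(BufferedReader reader) {
        this.reader = reader;
        advance();
    }

    /** Reads the next line into the look-ahead slot and closes the reader at end of input. */
    private void advance() {
        try {
            next = reader.readLine();
            if (next == null) {
                reader.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return next != null;
    }

    @Override
    public String next() {
        if (next == null) {
            throw new NoSuchElementException();
        }
        String current = next;
        advance();
        return current;
    }

    public static void main(String[] args) {
        BufferedReader in = new BufferedReader(new StringReader("ACGT\nTTGA\n"));
        Iterator<String> it = new LazySequenceIterator(in);
        while (it.hasNext()) {
            // Only the current sequence and one look-ahead line are held in memory.
            System.out.println(it.next());
        }
    }
}
```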

writeTrainingData

void writeTrainingData(java.lang.String location,
                       java.util.List<? extends TrainingSequence<?>> data)
                       throws java.io.IOException
writes training data to the specified location. Training data includes input data and hidden sequences. The interpretation of the location string is dependent on the particular InputHandler implementation used.

Parameters:
location - string location of the data. Meaning is implementation dependent.
data - a list of training sequences to write out.
Throws:
java.io.IOException - if there is a problem writing the data

writeInputData

void writeInputData(java.lang.String location,
                    java.util.Iterator<? extends InputSequence<?>> data)
                    throws java.io.IOException
writes input data to the specified location. The interpretation of the location string is dependent on the particular InputHandler implementation used.

Parameters:
location - string location of the data. Meaning is implementation dependent.
data - an iterator over input sequences
Throws:
java.io.IOException - if there is a problem writing the data
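
Because the data arrives as an Iterator, a writer can emit each sequence as it is produced instead of collecting them first. The sketch below assumes a one-sequence-per-line text format. StreamingWriteSketch, writeSequences and render are hypothetical names, and CharSequence stands in for InputSequence.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.Iterator;

class StreamingWriteSketch {

    /** Writes each sequence on its own line as it is pulled from the iterator; returns the count. */
    static int writeSequences(Writer out, Iterator<? extends CharSequence> data) throws IOException {
        int count = 0;
        while (data.hasNext()) {
            out.append(data.next()).append('\n');
            count++;
        }
        out.flush();
        return count;
    }

    /** Convenience wrapper that writes to an in-memory buffer and returns the text. */
    static String render(Iterator<? extends CharSequence> data) {
        StringWriter buffer = new StringWriter();
        try {
            writeSequences(buffer, data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(Arrays.asList("ACGT", "TTGA").iterator()));
    }
}
```

A real handler would open its Writer from the location string in whatever way its implementation defines; that step is left out here.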