calhoun.analysis.crf.io
Class InputHandlerDirectory

java.lang.Object
  extended by calhoun.analysis.crf.io.InputHandlerBase
      extended by calhoun.analysis.crf.io.InputHandlerDirectory
All Implemented Interfaces:
InputHandler, java.io.Serializable

public class InputHandlerDirectory
extends InputHandlerBase

an InputHandler used when the input is spread across several files within a single directory. A separate InputComponentIO is used for each file, and a map associates each file name with its InputComponentIO. For training, hidden sequences are stored in a separate file in the same directory, whose name is set with the hiddenSequenceFile property. For this InputHandler, the location passed to the read and write methods is the path to the directory containing the input data.
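
The directory layout described above can be illustrated with a small, self-contained sketch. It does not use the calhoun classes: ComponentReader is a hypothetical stand-in for InputComponentIO (whose methods are not shown on this page), and the file names and line-per-sequence format are assumptions made only for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DirectoryLayoutSketch {
    /** Hypothetical stand-in for InputComponentIO; the real interface is not shown here. */
    interface ComponentReader {
        List<String> read(Path file) throws IOException;
    }

    /** Reads each configured file in the directory with its own reader. */
    static Map<String, List<String>> readAll(Path directory, Map<String, ComponentReader> readers)
            throws IOException {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, ComponentReader> entry : readers.entrySet()) {
            // The map key is a file name relative to the input directory.
            Path file = directory.resolve(entry.getKey());
            result.put(entry.getKey(), entry.getValue().read(file));
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Assumed layout: two component files in one temporary directory.
        Path dir = Files.createTempDirectory("crf-input");
        Files.write(dir.resolve("ref.txt"), List.of("ACGT", "TTGA"));
        Files.write(dir.resolve("aln.txt"), List.of("A-GT", "TTG-"));

        Map<String, ComponentReader> readers = new LinkedHashMap<>();
        readers.put("ref.txt", Files::readAllLines);
        readers.put("aln.txt", Files::readAllLines);

        System.out.println(readAll(dir, readers));
    }
}
```

In the real class, the equivalent map would presumably be supplied through setInputReaders and the directory path passed as the location argument.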

See Also:
Serialized Form

Constructor Summary
InputHandlerDirectory()
           
 
Method Summary
 java.lang.String getHiddenSequenceFile()
          gets the name of the hidden sequence file.
 TrainingSequenceIO getHiddenStateReader()
          gets the reader used to read in results for training data.
 java.util.Map<java.lang.String,InputComponentIO> getInputReaders()
          gets the readers used to read in input sequences.
 java.util.Iterator<? extends InputSequence<?>> readInputData(java.lang.String location)
          returns the input data read from the specified location.
 java.util.List<? extends TrainingSequence<?>> readTrainingData(java.lang.String location)
           
 java.util.List<? extends TrainingSequence<?>> readTrainingData(java.lang.String location, boolean predict)
          returns the training data read from the specified location.
 void setHiddenSequenceFile(java.lang.String hiddenSequenceFile)
          sets the name of the hidden sequence file.
 void setHiddenStateReader(TrainingSequenceIO hiddenStateReader)
          sets the reader used to get hidden sequences.
 void setInputReaders(java.util.Map<java.lang.String,InputComponentIO> inputReader)
          sets the readers used to read in input sequences.
 void writeInputData(java.lang.String location, java.util.Iterator<? extends InputSequence<?>> data)
          writes input data to the specified location.
 void writeTrainingData(java.lang.String location, java.util.List<? extends TrainingSequence<?>> data)
          writes training data to the specified location.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

InputHandlerDirectory

public InputHandlerDirectory()
Method Detail

readInputData

public java.util.Iterator<? extends InputSequence<?>> readInputData(java.lang.String location)
                                                             throws java.io.IOException
Description copied from interface: InputHandler
returns the input data read from the specified location. The result is returned as an Iterator because the inference algorithms can predict on the sequences one at a time. The interpretation of the location string is dependent on the particular InputHandler implementation used.

Parameters:
location - string location of the data. Meaning is implementation dependent.
Returns:
an iterator over input sequences
Throws:
java.io.IOException - if there is a problem reading the data
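
To illustrate why an Iterator suits one-at-a-time prediction, here is a minimal sketch of a lazy iterator that holds only one sequence in memory at a time. It is not the library's implementation: the class name LazySequenceIterator and the one-sequence-per-line format are assumptions, and plain strings stand in for InputSequence objects.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

class LazySequenceIterator implements Iterator<String> {
    private final BufferedReader reader;
    private String next; // the sequence the next call to next() will return, or null at end

    LazySequenceIterator(BufferedReader reader) {
        this.reader = reader;
        advance();
    }

    /** Reads one more line; Iterator methods cannot throw checked exceptions, so wrap them. */
    private void advance() {
        try {
            next = reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return next != null;
    }

    @Override
    public String next() {
        if (next == null) {
            throw new NoSuchElementException();
        }
        String current = next;
        advance();
        return current;
    }

    public static void main(String[] args) {
        Iterator<String> sequences =
                new LazySequenceIterator(new BufferedReader(new StringReader("ACGT\nTTGA\n")));
        while (sequences.hasNext()) {
            // A predictor would process each sequence here before the next one is read.
            System.out.println(sequences.next());
        }
    }
}
```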

readTrainingData

public java.util.List<? extends TrainingSequence<?>> readTrainingData(java.lang.String location)
                                                               throws java.io.IOException
Throws:
java.io.IOException

readTrainingData

public java.util.List<? extends TrainingSequence<?>> readTrainingData(java.lang.String location,
                                                                      boolean predict)
                                                               throws java.io.IOException
Description copied from interface: InputHandler
returns the training data read from the specified location. Training data includes input data and hidden sequences. The result is returned as a List. The interpretation of the location string is dependent on the particular InputHandler implementation used.

Parameters:
location - string location of the data. Meaning is implementation dependent.
Returns:
a list of training sequences
Throws:
java.io.IOException - if there is a problem reading the data
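
Since training data combines input data with hidden sequences, a reader has to pair the two. The sketch below shows one way this pairing might look; it is an illustration only. The Pair class, the use of strings for inputs, and the use of int arrays for hidden states are all assumptions, not the TrainingSequence API.

```java
import java.util.ArrayList;
import java.util.List;

class TrainingPairSketch {
    /** Hypothetical pairing of an observed sequence with its hidden states. */
    static final class Pair {
        final String input;
        final int[] hidden;

        Pair(String input, int[] hidden) {
            this.input = input;
            this.hidden = hidden;
        }
    }

    /** Pairs the i-th input with the i-th hidden sequence, checking that lengths agree. */
    static List<Pair> pair(List<String> inputs, List<int[]> hidden) {
        if (inputs.size() != hidden.size()) {
            throw new IllegalArgumentException("input and hidden sequence counts differ");
        }
        List<Pair> out = new ArrayList<>();
        for (int i = 0; i < inputs.size(); i++) {
            // Each position in the input needs exactly one hidden state.
            if (inputs.get(i).length() != hidden.get(i).length) {
                throw new IllegalArgumentException("length mismatch in sequence " + i);
            }
            out.add(new Pair(inputs.get(i), hidden.get(i)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Pair> data = pair(
                List.of("ACGT", "TT"),
                List.of(new int[] {0, 0, 1, 1}, new int[] {1, 0}));
        System.out.println(data.size() + " training sequences");
    }
}
```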

writeInputData

public void writeInputData(java.lang.String location,
                           java.util.Iterator<? extends InputSequence<?>> data)
                    throws java.io.IOException
Description copied from interface: InputHandler
writes input data to the specified location. The interpretation of the location string is dependent on the particular InputHandler implementation used.

Parameters:
location - string location of the data. Meaning is implementation dependent.
data - an iterator over input sequences
Throws:
java.io.IOException - if there is a problem writing the data

writeTrainingData

public void writeTrainingData(java.lang.String location,
                              java.util.List<? extends TrainingSequence<?>> data)
                       throws java.io.IOException
Description copied from interface: InputHandler
writes training data to the specified location. Training data includes input data and hidden sequences. The interpretation of the location string is dependent on the particular InputHandler implementation used.

Parameters:
location - string location of the data. Meaning is implementation dependent.
data - a list of training sequences to write out.
Throws:
java.io.IOException - if there is a problem writing the data
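
For a directory-based handler, writing training data amounts to writing the input components and the hidden sequences to separate files under the same location. A rough sketch under stated assumptions: the file names, the one-sequence-per-line text format, and the write helper are hypothetical and do not reflect the formats the real InputComponentIO or TrainingSequenceIO implementations use.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class DirectoryWriteSketch {
    /** Writes inputs and hidden sequences to separate files in one directory. */
    static void write(Path directory, String inputFile, List<String> inputs,
                      String hiddenFile, List<String> hidden) throws IOException {
        // Create the target directory if it does not exist yet.
        Files.createDirectories(directory);
        Files.write(directory.resolve(inputFile), inputs);
        Files.write(directory.resolve(hiddenFile), hidden);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("crf-train");
        write(dir, "ref.txt", List.of("ACGT"), "hidden.txt", List.of("0011"));
        System.out.println(Files.readAllLines(dir.resolve("hidden.txt")));
    }
}
```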

getHiddenStateReader

public TrainingSequenceIO getHiddenStateReader()
gets the reader used to read in results for training data.

Returns:
the TrainingSequenceIO used to read in the hidden sequences for training

setHiddenStateReader

public void setHiddenStateReader(TrainingSequenceIO hiddenStateReader)
sets the reader used to get hidden sequences. Must be set before training data is read.

Parameters:
hiddenStateReader - the reader that will be used to access hidden states

getInputReaders

public java.util.Map<java.lang.String,InputComponentIO> getInputReaders()
gets the readers used to read in input sequences. Must be set before any of the read methods are called.

Returns:
a map from file names within the directory to the readers used to read in input sequences.

setInputReaders

public void setInputReaders(java.util.Map<java.lang.String,InputComponentIO> inputReader)
sets the readers used to read in input sequences. Must be set before any of the read methods are called. The value is a map that associates file names within the directory with input components.

Parameters:
inputReader - a map from file names within the directory to the readers used to read in input sequences.

getHiddenSequenceFile

public java.lang.String getHiddenSequenceFile()
gets the name of the hidden sequence file. This is the name of the file within the input directory that holds the hidden sequences for training data.

Returns:
the name of the hidden sequence file.

setHiddenSequenceFile

public void setHiddenSequenceFile(java.lang.String hiddenSequenceFile)
sets the name of the hidden sequence file. This is the name of the file within the input directory that holds the hidden sequences for training data.

Parameters:
hiddenSequenceFile - the name of the hidden sequence file within the input directory.