SIGRS
Class KDStatistics

java.lang.Object
  extended by SIGRS.KDStatistics

public class KDStatistics
extends java.lang.Object

SIGRS is a collection of routines for finding regions of contrasting composition (CCRs) in sequence files using a partial sum process. The significance of segments is evaluated using Karlin-Altschul statistics, specifically the Karlin-Dembo extension that allows nucleotides to have a Markov dependence (see e.g. Karlin & Altschul (1993) and Karlin & Dembo (1992)).

The routines are provided as is, and no guarantee regarding stability etc. is given, so use at your own risk!

See publication Larsson, P., Hinas, A., Ardell, D.H., Kirsebom, L.A., Virtanen, A. and Söderbom, F. De novo search for non-coding RNA genes in the AT-rich genome of Dictyostelium discoideum: performance of Markov-dependent genome feature scoring

Questions and comments can be directed to Pontus.Larsson@icm.uu.se


Constructor Summary
KDStatistics()
           
 
Method Summary
static double expect(double[][] s, double[][] p)
          Calculates the expected score based on a score matrix and the associated probabilities
static double K(double theta, double[] u, double[][] s, double[][] p)
          Estimates the parameter K for a score matrix assuming Markov-dependent letters.
static double[] theta(double[][] s, double[][] p)
          Estimates the parameter theta* of Step 1 on p. 137 of Karlin & Dembo (1992). Includes a simple routine for numerical approximation.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

KDStatistics

public KDStatistics()
Method Detail

expect

public static double expect(double[][] s,
                            double[][] p)
Calculates the expected score based on a score matrix and the associated probabilities

Parameters:
s - The score matrix
p - The associated probabilities
Returns:
The expected score
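
As an illustration only, the expected score can be sketched as a probability-weighted sum over the score matrix. The class ExpectSketch and the method expectedScore are hypothetical names, not part of SIGRS, and the sketch assumes that p holds joint probabilities summing to 1; the actual routine may interpret p differently (for example as Markov transition probabilities weighted by a stationary distribution).

```java
// Hypothetical sketch, not the SIGRS implementation.
final class ExpectSketch {

    // Sums s[i][j] * p[i][j] over all cells, assuming p[i][j] is the
    // joint probability of the letter pair (i, j).
    static double expectedScore(double[][] s, double[][] p) {
        if (s.length != p.length) {
            throw new IllegalArgumentException("s and p must have the same number of rows");
        }
        double sum = 0.0;
        for (int i = 0; i < s.length; i++) {
            if (s[i].length != p[i].length) {
                throw new IllegalArgumentException("row " + i + " differs in length");
            }
            for (int j = 0; j < s[i].length; j++) {
                sum += s[i][j] * p[i][j];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Made-up two-letter example: matches score +1, mismatches score -1.
        double[][] s = {{1.0, -1.0}, {-1.0, 1.0}};
        double[][] p = {{0.2, 0.3}, {0.3, 0.2}};
        System.out.println(expectedScore(s, p));
    }
}
```

Karlin-Altschul-style statistics generally require the expected score to be negative, so a check like this is a reasonable sanity test on a score matrix before theta and K are estimated.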

K

public static double K(double theta,
                       double[] u,
                       double[][] s,
                       double[][] p)
Estimates the parameter K for a score matrix assuming Markov-dependent letters. Follows the steps on pages 137-139 in Karlin & Dembo (1992). Not much care is taken to ensure that the iterations converge or that exceptions do not occur.

Parameters:
theta - The estimated theta for the score matrix
u - The right frequency eigenvector of PHI(theta) (can be obtained from the theta estimation)
s - The score matrix
p - The associated probabilities
Returns:
The estimate for the parameter K
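
A sketch of how K and theta are typically combined in Karlin-Altschul-style statistics: for a sequence of length n, the expected number of segments scoring at least S is approximated by K * n * exp(-theta * S), and the corresponding p-value by 1 - exp(-E). The class and method names below are hypothetical, and whether SIGRS applies exactly this form is an assumption.

```java
// Hypothetical sketch, not part of SIGRS.
final class SegmentSignificanceSketch {

    // Approximate expected number of segments with score >= score
    // in a sequence of length n: E = K * n * exp(-theta * score).
    static double expectedCount(double k, double theta, double n, double score) {
        return k * n * Math.exp(-theta * score);
    }

    // Approximate probability of seeing at least one such segment by chance:
    // P = 1 - exp(-E), computed with expm1 for accuracy when E is small.
    static double pValue(double k, double theta, double n, double score) {
        return -Math.expm1(-expectedCount(k, theta, n, score));
    }

    public static void main(String[] args) {
        // Made-up parameter values, for illustration only.
        double k = 0.1;
        double theta = 0.5;
        double n = 1.0e6;
        double score = 40.0;
        System.out.println("E = " + expectedCount(k, theta, n, score));
        System.out.println("P = " + pValue(k, theta, n, score));
    }
}
```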

theta

public static double[] theta(double[][] s,
                             double[][] p)
Estimates the parameter theta* of Step 1 on p. 137 of Karlin & Dembo (1992). Includes a simple routine for numerical approximation. No special care is taken to make sure it converges!

Returns:
An array with the estimated theta in the first position and the right frequency eigenvector u of Step 2 in the remaining positions
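
Because theta and the eigenvector u come back packed in one array, a caller has to split them before passing them to K. The sketch below shows one way to do that; ThetaResultSketch and its methods are hypothetical helpers, and the array in main holds made-up values standing in for a real return value.

```java
import java.util.Arrays;

// Hypothetical helper, not part of SIGRS.
final class ThetaResultSketch {

    // The estimated theta sits in the first position of the returned array.
    static double thetaOf(double[] result) {
        return result[0];
    }

    // The right frequency eigenvector u occupies the remaining positions.
    static double[] eigenvectorOf(double[] result) {
        return Arrays.copyOfRange(result, 1, result.length);
    }

    public static void main(String[] args) {
        // With SIGRS on the classpath this would be:
        //   double[] result = SIGRS.KDStatistics.theta(s, p);
        double[] result = {0.35, 0.25, 0.25, 0.25, 0.25}; // made-up values
        double theta = thetaOf(result);
        double[] u = eigenvectorOf(result);
        // theta and u can then be passed on:
        //   double k = SIGRS.KDStatistics.K(theta, u, s, p);
        System.out.println("theta = " + theta + ", u = " + Arrays.toString(u));
    }
}
```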