SIGRS
Class Methods

java.lang.Object
  extended by SIGRS.Methods

public class Methods
extends java.lang.Object

SIGRS is a collection of routines for finding regions of contrasting composition (CCRs) in sequence files using a partial sum process. The significance of segments is evaluated using Karlin-Altschul statistics, specifically an extension by Karlin and Dembo that allows nucleotides to have a Markov dependence (see e.g. Karlin & Altschul (1993) and Karlin & Dembo (1992)).

The routines are provided as is, without any guarantee regarding stability etc., so use them at your own risk!

See publication Larsson, P., Hinas, A., Ardell, D.H., Kirsebom, L.A., Virtanen, A. and Söderbom, F. De novo search for non-coding RNA genes in the AT-rich genome of Dictyostelium discoideum: performance of Markov-dependent genome feature scoring

Questions and comments can be directed to Pontus.Larsson@icm.uu.se


Field Summary
static char[] nucleotideCharCode
           
static java.lang.String nucleotideCode
           
 
Constructor Summary
Methods()
           
 
Method Summary
static byte[][] addElement(byte[] n, byte[][] arr)
          Adds an element to the end of an array
static byte[] addElement(byte n, byte[] arr)
          Adds an element to the end of an array
static double[][] addElement(double[] n, double[][] arr)
          Adds an element to the end of an array
static java.io.File[] addElement(java.io.File nO, java.io.File[] arr)
          Adds an element to the end of an array
static int[] addElement(int n, int[] arr)
          Adds an element to the end of an array
static java.lang.Object[] addElement(java.lang.Object nO, java.lang.Object[] arr)
          Adds an element to the end of an array
static java.lang.String[] addElement(java.lang.String nO, java.lang.String[] arr)
          Adds an element to the end of an array
static byte[] append(byte[] nE, byte[] src)
          Appends an array to the end of another array
static char[] append(char[] nE, char[] src)
          Appends an array to the end of another array
static double[][] append(double[][] nE, double[][] src)
          Appends an array to the end of another array
static double[] append(double[] nE, double[] src)
          Appends an array to the end of another array
static void append(java.io.File f1, java.io.File f2)
          Appends the contents of one file to the other
static int[] append(int[] nE, int[] src)
          Appends an array to the end of another array
static byte[] append(int start, int stop, byte[] nE, byte[] src)
           
static char[] append(int start, int stop, char[] nE, char[] src)
           
static java.lang.String[][] append(java.lang.String[][] nE, java.lang.String[][] src)
          Appends an array to the end of another array
static java.lang.String[] append(java.lang.String[] nE, java.lang.String[] src)
          Appends an array to the end of another array
static java.io.File concatenateFile(java.io.File src, java.io.File concat)
           
static int[][] countDiNucleotides(byte[] seq)
          Counts the number of each dinucleotide in a sequence. Only A, C, G and T are counted.
static int[][] countDiNucleotides(java.io.File seqFile)
          Counts the number of each dinucleotide in a sequence. Only A, C, G and T are counted.
static int[] countMonoNucleotides(byte[] seq)
          Counts the number of each nucleotide in a sequence. Only A, C, G and T are counted.
static int[] countMonoNucleotides(java.io.File seqFile)
          Counts the number of each nucleotide in a sequence. Only A, C, G and T are counted.
static java.lang.String decode(byte[] byteSeq)
           
static byte[] encode(java.lang.String seq)
          Encodes a nucleotide sequence into bytes where 1=A, 2=C, 3=G, 4=T, 5=X (Masked), any other nucleotide is encoded as 0=N
static double[] getColumn(int c, double[][] arr)
          Returns the desired column from a matrix array
static java.lang.String[] getColumn(int c, java.lang.String[][] arr)
          Returns the desired column from a matrix array
static java.lang.String getFileContents(java.io.File f)
          Reads the entire content of a file into a string
static int indexOf(int obj, int[] arr)
          Finds the index of an element within an array
static int indexOf(java.lang.String obj, java.lang.String[] arr)
          Finds the index of an element within an array
static double logN(double x, double N)
          Calculates the logarithm in an arbitrary base
static double max(double[] arr)
          Finds the greatest element in the input array
static double max(double[][] arr)
          Finds the greatest element in the input array
static double min(double[] arr)
          Finds the minimum element in the input array
static double min(double[][] arr)
          Finds the minimum element in the input array
static double[] newDoubleArray(int len, double fill)
          Creates a new array of specified length and with all elements set to a specified value
static java.lang.String pad(java.lang.String str, int padLength)
          Pads a string with whitespace to a specified length.
static java.lang.String[][] parseBigFasta(java.io.File f)
           
static java.lang.String[][] parseFasta(java.io.File fastaFile)
          Parses a file in FASTA format and returns an array with the identifiers and sequences.
static int[] quickSort(int[] src)
          Sorts an array in ascending order
static byte[][] removeElementAt(int n, byte[][] arr)
          Removes an element from an array and shifts the remaining elements to the left
static int[] reverse(int[] arr)
          Reverses the order of the elements in an array
static byte[] reverseComplement(byte[] seq)
          Gets the reverse complement of a byte encoded sequence
static char reverseComplement(char c)
           
static java.io.File reverseComplement(java.io.File seqFile, java.io.File revFile)
           
static double[][] setColumn(double[][] src, int index, double[] col)
          Sets a column of a matrix
static int skipIdLine(char[] cBuff, int i, int limit)
           
static int skipLineBreak(char[] cBuff, int i, int limit)
           
static int[] subarray(int start, int[] src)
          Extracts the last subsegment of an array
static char[] subarray(int start, int stop, char[] src)
          Extracts a subsegment of an array
static double[] subarray(int start, int stop, double[] src)
          Extracts a subsegment of an array
static double[][] subarray(int start, int stop, double[][] arr)
          Extracts a subsegment of an array
static int[] subarray(int start, int stop, int[] src)
          Extracts a subsegment of an array
static java.lang.String[][] subarray(int start, int stop, java.lang.String[][] src)
           
static java.lang.String[][] subarray(int start, java.lang.String[][] src)
          Extracts a subsegment of an array
static double sum(double[] arr)
          Sums the elements of an array
static int sum(int[] arr)
          Sums the elements of an array
static double[] vectorAdd(double[] arr, double f)
          Adds a scalar to an array
static int[][] vectorAdd(int[][] v1, int[][] v2)
          Adds two arrays together
static int[] vectorAdd(int[] v1, int[] v2)
          Adds two arrays together
static double[][] vectorDivide(double[][] arr, double f)
          Divides an array by a scalar
static double[] vectorDivide(double[] arr, double f)
          Divides an array by a scalar
static double vectorInnerProduct(double[] v1, double[] v2)
          Calculates the inner product of two vectors
static double[][] vectorMultiply(double[][] arr, double f)
          Multiplies a matrix by a scalar
static double[] vectorMultiply(double[] arr, double f)
          Multiplies an array by a scalar
static double[] vectorMultiply(double[] v1, double[] v2)
          Multiplies the values of two arrays
static int[] vectorMultiply(int[] arr, int f)
          Multiplies an array by a scalar
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

nucleotideCode

public static java.lang.String nucleotideCode

nucleotideCharCode

public static char[] nucleotideCharCode
Constructor Detail

Methods

public Methods()
Method Detail

addElement

public static byte[] addElement(byte n,
                                byte[] arr)
Adds an element to the end of an array

Parameters:
n - The new element to add
arr - The array to add the new element to
Returns:
A new array with the new element added to the input array

addElement

public static byte[][] addElement(byte[] n,
                                  byte[][] arr)
Adds an element to the end of an array

Parameters:
n - The new element to add
arr - The array to add the new element to
Returns:
A new array with the new element added to the input array

addElement

public static double[][] addElement(double[] n,
                                    double[][] arr)
Adds an element to the end of an array

Parameters:
n - The new element to add
arr - The array to add the new element to
Returns:
A new array with the new element added to the input array

addElement

public static int[] addElement(int n,
                               int[] arr)
Adds an element to the end of an array

Parameters:
n - The new element to add
arr - The array to add the new element to
Returns:
A new array with the new element added to the input array

addElement

public static java.io.File[] addElement(java.io.File nO,
                                        java.io.File[] arr)
Adds an element to the end of an array

Parameters:
nO - The new element to add
arr - The array to add the new element to
Returns:
A new array with the new element added to the input array

addElement

public static java.lang.Object[] addElement(java.lang.Object nO,
                                            java.lang.Object[] arr)
Adds an element to the end of an array

Parameters:
nO - The new element to add
arr - The array to add the new element to
Returns:
A new array with the new element added to the input array

addElement

public static java.lang.String[] addElement(java.lang.String nO,
                                            java.lang.String[] arr)
Adds an element to the end of an array

Parameters:
nO - The new element to add
arr - The array to add the new element to
Returns:
A new array with the new element added to the input array
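The addElement overloads are documented as returning a new array rather than changing the input. The following is a minimal sketch of that behaviour for the int[] overload, not the SIGRS source. The class name AddElementSketch is hypothetical, and the sketch assumes the new element lands at index arr.length.

```java
import java.util.Arrays;

final class AddElementSketch {
    // Returns a copy of arr that is one element longer, with n in the last slot.
    static int[] addElement(int n, int[] arr) {
        int[] out = Arrays.copyOf(arr, arr.length + 1);
        out[arr.length] = n;
        return out;
    }

    public static void main(String[] args) {
        int[] grown = addElement(7, new int[] {1, 2, 3});
        System.out.println(Arrays.toString(grown)); // [1, 2, 3, 7]
    }
}
```

Because every call copies the whole array, building a long array one element at a time this way costs quadratic time; a java.util.ArrayList may be a better fit for large inputs.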

append

public static byte[] append(byte[] nE,
                            byte[] src)
Appends an array to the end of another array

Parameters:
nE - The array to append
src - The new array gets appended to the end of this one
Returns:
An array with the first array appended to the end of the second
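Note the argument order: the array to append comes first and the array it is appended to comes second. Read literally, the result holds all of src followed by all of nE. A minimal sketch under that reading (AppendSketch is a hypothetical name, and this is not the library's implementation):

```java
import java.util.Arrays;

final class AppendSketch {
    // Assumed result layout: every element of src, then every element of nE.
    static byte[] append(byte[] nE, byte[] src) {
        byte[] out = new byte[src.length + nE.length];
        System.arraycopy(src, 0, out, 0, src.length);
        System.arraycopy(nE, 0, out, src.length, nE.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] joined = append(new byte[] {3, 4}, new byte[] {1, 2});
        System.out.println(Arrays.toString(joined)); // [1, 2, 3, 4]
    }
}
```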

append

public static byte[] append(int start,
                            int stop,
                            byte[] nE,
                            byte[] src)

append

public static char[] append(char[] nE,
                            char[] src)
Appends an array to the end of another array

Parameters:
nE - The array to append
src - The new array gets appended to the end of this one
Returns:
An array with the first array appended to the end of the second

append

public static char[] append(int start,
                            int stop,
                            char[] nE,
                            char[] src)

append

public static double[] append(double[] nE,
                              double[] src)
Appends an array to the end of another array

Parameters:
nE - The array to append
src - The new array gets appended to the end of this one
Returns:
An array with the first array appended to the end of the second

append

public static double[][] append(double[][] nE,
                                double[][] src)
Appends an array to the end of another array

Parameters:
nE - The array to append
src - The new array gets appended to the end of this one
Returns:
An array with the first array appended to the end of the second

append

public static void append(java.io.File f1,
                          java.io.File f2)
                   throws java.lang.Exception
Appends the contents of one file to the other

Parameters:
f1 - The file whose contents will be appended
f2 - The file that will be appended to
Throws:
java.lang.Exception

append

public static int[] append(int[] nE,
                           int[] src)
Appends an array to the end of another array

Parameters:
nE - The array to append
src - The new array gets appended to the end of this one
Returns:
An array with the first array appended to the end of the second

append

public static java.lang.String[] append(java.lang.String[] nE,
                                        java.lang.String[] src)
Appends an array to the end of another array

Parameters:
nE - The array to append
src - The new array gets appended to the end of this one
Returns:
An array with the first array appended to the end of the second

append

public static java.lang.String[][] append(java.lang.String[][] nE,
                                          java.lang.String[][] src)
Appends an array to the end of another array

Parameters:
nE - The array to append
src - The new array gets appended to the end of this one
Returns:
An array with the first array appended to the end of the second

concatenateFile

public static java.io.File concatenateFile(java.io.File src,
                                           java.io.File concat)
                                    throws java.lang.Exception
Throws:
java.lang.Exception

countMonoNucleotides

public static int[] countMonoNucleotides(byte[] seq)
Counts the number of each nucleotide in a sequence. Only A, C, G and T are counted; any other nucleotide is counted as N.

Returns:
1x6 array with nucleotide counts: [0] -> N, [1] -> A, [2] -> C, [3] -> G, [4] -> T, [5] -> X
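Since the byte codes produced by encode() double as array indices, counting can be a single pass over the sequence. A minimal sketch, assuming the input uses the codes 0 to 5 and that slot 5 holds masked (X) positions; MonoCountSketch is a hypothetical name and the handling of out-of-range codes is an assumption:

```java
import java.util.Arrays;

final class MonoCountSketch {
    // counts[code] is the number of positions in seq carrying that byte code.
    static int[] countMonoNucleotides(byte[] seq) {
        int[] counts = new int[6];
        for (byte b : seq) {
            // Assumption: anything outside 0..5 is folded into slot 0 (N).
            int code = (b >= 0 && b <= 5) ? b : 0;
            counts[code]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        byte[] seq = {1, 1, 2, 4, 0, 5}; // A A C T N X
        System.out.println(Arrays.toString(countMonoNucleotides(seq)));
    }
}
```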

countMonoNucleotides

public static int[] countMonoNucleotides(java.io.File seqFile)
                                  throws java.lang.Exception
Counts the number of each nucleotide in a sequence. Only A, C, G and T are counted; any other nucleotide is counted as N.

Returns:
1x6 array with nucleotide counts: [0] -> N, [1] -> A, [2] -> C, [3] -> G, [4] -> T, [5] -> X
Throws:
java.lang.Exception

countDiNucleotides

public static int[][] countDiNucleotides(byte[] seq)
Counts the number of each dinucleotide in a sequence. Only A, C, G and T are counted; any other nucleotide is counted as N.

Returns:
6x6 array with dinucleotide counts: [0] -> N, [1] -> A, [2] -> C, [3] -> G, [4] -> T, [5] -> X
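The documentation does not say whether dinucleotides are counted in overlapping windows or which index refers to the first nucleotide. The sketch below assumes overlapping pairs and counts[first][second]; both choices, and the name DiCountSketch, are assumptions rather than confirmed SIGRS behaviour. Input codes are assumed to lie in 0 to 5.

```java
final class DiCountSketch {
    // counts[a][b] = number of positions i with seq[i] == a and seq[i + 1] == b.
    static int[][] countDiNucleotides(byte[] seq) {
        int[][] counts = new int[6][6];
        for (int i = 0; i + 1 < seq.length; i++) {
            counts[seq[i]][seq[i + 1]]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[][] counts = countDiNucleotides(new byte[] {1, 2, 1, 2}); // A C A C
        System.out.println("AC=" + counts[1][2] + " CA=" + counts[2][1]); // AC=2 CA=1
    }
}
```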

countDiNucleotides

public static int[][] countDiNucleotides(java.io.File seqFile)
                                  throws java.lang.Exception
Counts the number of each dinucleotide in a sequence. Only A, C, G and T are counted; any other nucleotide is counted as N.

Returns:
6x6 array with dinucleotide counts: [0] -> N, [1] -> A, [2] -> C, [3] -> G, [4] -> T, [5] -> X
Throws:
java.lang.Exception

skipLineBreak

public static int skipLineBreak(char[] cBuff,
                                int i,
                                int limit)

skipIdLine

public static int skipIdLine(char[] cBuff,
                             int i,
                             int limit)

decode

public static java.lang.String decode(byte[] byteSeq)

encode

public static byte[] encode(java.lang.String seq)
Encodes a nucleotide sequence into bytes where 1=A, 2=C, 3=G, 4=T, 5=X (Masked), any other nucleotide is encoded as 0=N

Parameters:
seq - The input sequence
Returns:
The input sequence encoded to a byte array
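The encoding can be pictured as a lookup in a six-character code string whose index is the byte value. The sketch below is one way to get the documented mapping; the lookup string "NACGTX", the case folding of the input and the class name EncodeSketch are assumptions, and the actual values of the nucleotideCode and nucleotideCharCode fields are not documented here. A matching decode is included for round-tripping.

```java
import java.util.Arrays;

final class EncodeSketch {
    // Index in this string equals the byte code: 0=N, 1=A, 2=C, 3=G, 4=T, 5=X.
    static final String CODE = "NACGTX";

    static byte[] encode(String seq) {
        byte[] out = new byte[seq.length()];
        for (int i = 0; i < seq.length(); i++) {
            int code = CODE.indexOf(Character.toUpperCase(seq.charAt(i)));
            out[i] = (byte) Math.max(code, 0); // unknown symbols become 0 = N
        }
        return out;
    }

    // Inverse mapping; expects codes in 0..5 and throws on anything else.
    static String decode(byte[] byteSeq) {
        StringBuilder sb = new StringBuilder(byteSeq.length);
        for (byte b : byteSeq) {
            sb.append(CODE.charAt(b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(encode("ACGTXNR"))); // [1, 2, 3, 4, 5, 0, 0]
    }
}
```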

getColumn

public static double[] getColumn(int c,
                                 double[][] arr)
Returns the desired column from a matrix array

Parameters:
c - Index of the column to return
arr - The array
Returns:
An array corresponding to the desired column from the input matrix
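A column of a Java double[][] is not stored contiguously, so extracting it means visiting each row. A minimal sketch, assuming the matrix is indexed as arr[row][column] and that every row has at least c + 1 entries (GetColumnSketch is a hypothetical name):

```java
import java.util.Arrays;

final class GetColumnSketch {
    // Collects arr[row][c] for every row into a new array.
    static double[] getColumn(int c, double[][] arr) {
        double[] col = new double[arr.length];
        for (int row = 0; row < arr.length; row++) {
            col[row] = arr[row][c];
        }
        return col;
    }

    public static void main(String[] args) {
        double[][] m = {{1, 2}, {3, 4}};
        System.out.println(Arrays.toString(getColumn(1, m))); // [2.0, 4.0]
    }
}
```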

getColumn

public static java.lang.String[] getColumn(int c,
                                           java.lang.String[][] arr)
Returns the desired column from a matrix array

Parameters:
c - Index of the column to return
arr - The array
Returns:
An array corresponding to the desired column from the input matrix

indexOf

public static int indexOf(int obj,
                          int[] arr)
Finds the index of an element within an array

Parameters:
obj - The element to search for
arr - The array to search within
Returns:
The index of the first occurrence of the element within the array or -1 if the element was not found

indexOf

public static int indexOf(java.lang.String obj,
                          java.lang.String[] arr)
Finds the index of an element within an array

Parameters:
obj - The element to search for
arr - The array to search within
Returns:
The index of the first occurrence of the element within the array or -1 if the element was not found
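The contract above (first match, or -1 when absent) corresponds to a linear scan. The sketch below compares strings with equals() rather than ==; whether SIGRS does the same is not stated, so treat that and the name IndexOfSketch as assumptions.

```java
final class IndexOfSketch {
    // Returns the lowest index i with arr[i].equals(obj), or -1 if there is none.
    static int indexOf(String obj, String[] arr) {
        for (int i = 0; i < arr.length; i++) {
            if (arr[i] != null && arr[i].equals(obj)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String[] ids = {"seq1", "seq2", "seq2"};
        System.out.println(indexOf("seq2", ids)); // 1
        System.out.println(indexOf("seq9", ids)); // -1
    }
}
```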

logN

public static double logN(double x,
                          double N)
Calculates the logarithm in an arbitrary base

Parameters:
x - Value to take logarithm of
N - Base of logarithm
Returns:
The logarithm of x in base N
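A logarithm in an arbitrary base follows from the change-of-base identity log_N(x) = ln(x) / ln(N). A minimal sketch using java.lang.Math (LogNSketch is a hypothetical name; the library may compute it differently):

```java
final class LogNSketch {
    // Change of base: log_N(x) = ln(x) / ln(N).
    static double logN(double x, double N) {
        return Math.log(x) / Math.log(N);
    }

    public static void main(String[] args) {
        // Floating-point rounding means the result may differ from 3 in the last digits.
        System.out.println(logN(8.0, 2.0));
    }
}
```

Results are subject to floating-point rounding, so compare against expected values with a tolerance rather than with ==.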

max

public static double max(double[] arr)
Finds the greatest element in the input array

Parameters:
arr - The input array to search
Returns:
The maximum value in the input array

max

public static double max(double[][] arr)
Finds the greatest element in the input array

Parameters:
arr - The input array to search
Returns:
The maximum value in the input array

min

public static double min(double[] arr)
Finds the minimum element in the input array

Parameters:
arr - The input array to search
Returns:
The minimum value in the input array

min

public static double min(double[][] arr)
Finds the minimum element in the input array

Parameters:
arr - The input array to search
Returns:
The minimum value in the input array

newDoubleArray

public static double[] newDoubleArray(int len,
                                      double fill)
Creates a new array of specified length and with all elements set to a specified value

Parameters:
len - The length of the array
fill - The value all elements will have
Returns:
A new double array of specified length and containing specified values

pad

public static java.lang.String pad(java.lang.String str,
                                   int padLength)
Pads a string with whitespace to a specified length. If the input string is longer, it is truncated instead.

Parameters:
str - The string to be padded
padLength - The desired length of the string
Returns:
The input string with whitespace added at the end so that the total length equals padLength. If the input string was longer, it is truncated instead.
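The pad-or-truncate rule means the result always has exactly padLength characters. A minimal sketch, assuming the padding character is a plain space and that truncation keeps the leading characters (PadSketch is a hypothetical name):

```java
final class PadSketch {
    // Returns str cut or space-padded on the right to exactly padLength characters.
    static String pad(String str, int padLength) {
        if (str.length() >= padLength) {
            return str.substring(0, padLength);
        }
        StringBuilder sb = new StringBuilder(str);
        while (sb.length() < padLength) {
            sb.append(' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + pad("ab", 4) + "]");     // [ab  ]
        System.out.println("[" + pad("abcdef", 3) + "]"); // [abc]
    }
}
```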

parseFasta

public static java.lang.String[][] parseFasta(java.io.File fastaFile)
                                       throws java.lang.Exception
Parses a file in FASTA format and returns an array with the identifiers and sequences. For files with few long sequences, use parseBigFasta() instead.

Parameters:
fastaFile - Sequence file in FASTA format
Returns:
Nx2-array holding the identifiers and sequences. [i][0] -> Identifier, [i][1] -> Sequence
Throws:
java.lang.Exception
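To illustrate the Nx2 return layout, the sketch below parses FASTA text held in a String instead of a File. It assumes the identifier is the header line without the leading '>' and that sequence lines are concatenated with surrounding whitespace trimmed; these details and the name FastaSketch are assumptions, not the documented behaviour of parseFasta.

```java
import java.util.ArrayList;
import java.util.List;

final class FastaSketch {
    // Returns one row per record: [i][0] -> identifier, [i][1] -> sequence.
    static String[][] parse(String text) {
        List<String[]> records = new ArrayList<>();
        String id = null;
        StringBuilder seq = new StringBuilder();
        for (String line : text.split("\r?\n")) {
            if (line.startsWith(">")) {
                if (id != null) {
                    records.add(new String[] {id, seq.toString()});
                }
                id = line.substring(1).trim();
                seq.setLength(0);
            } else if (id != null) {
                seq.append(line.trim()); // join wrapped sequence lines
            }
        }
        if (id != null) {
            records.add(new String[] {id, seq.toString()});
        }
        return records.toArray(new String[0][]);
    }

    public static void main(String[] args) {
        String[][] recs = parse(">s1\nACGT\nAC\n>s2\nGG\n");
        System.out.println(recs[0][0] + " " + recs[0][1]); // s1 ACGTAC
    }
}
```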

parseBigFasta

public static java.lang.String[][] parseBigFasta(java.io.File f)
                                          throws java.lang.Exception
Throws:
java.lang.Exception

quickSort

public static int[] quickSort(int[] src)
Sorts an array in ascending order

Parameters:
src - The array to sort
Returns:
A new array with the input array elements sorted in ascending order

getFileContents

public static final java.lang.String getFileContents(java.io.File f)
                                              throws java.lang.Exception
Reads the entire content of a file into a string

Parameters:
f - The file to read
Returns:
The contents of the input file in a string
Throws:
java.lang.Exception

removeElementAt

public static byte[][] removeElementAt(int n,
                                       byte[][] arr)
Removes an element from an array and shifts the remaining elements to the left

Parameters:
n - The index of the element to be removed
arr - The array to remove the element from
Returns:
An array with the element removed
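Removing an element and shifting the rest left can be done with two block copies into an array that is one slot shorter. A minimal sketch, assuming 0 <= n < arr.length and that the rows themselves are shared with the input rather than copied (RemoveElementSketch is a hypothetical name):

```java
final class RemoveElementSketch {
    // Copies arr[0..n) and arr(n..end) into a new array of length arr.length - 1.
    static byte[][] removeElementAt(int n, byte[][] arr) {
        byte[][] out = new byte[arr.length - 1][];
        System.arraycopy(arr, 0, out, 0, n);
        System.arraycopy(arr, n + 1, out, n, arr.length - n - 1);
        return out;
    }

    public static void main(String[] args) {
        byte[][] rows = {{1}, {2}, {3}};
        byte[][] kept = removeElementAt(1, rows);
        System.out.println(kept.length + " rows, last starts with " + kept[1][0]); // 2 rows, last starts with 3
    }
}
```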

reverse

public static int[] reverse(int[] arr)
Reverses the order of the elements in an array

Parameters:
arr - The array to reverse
Returns:
A copy of the array with the elements in reverse order

reverseComplement

public static byte[] reverseComplement(byte[] seq)
Gets the reverse complement of a byte encoded sequence

Parameters:
seq - The sequence to reverse complement
Returns:
The reverse complement of the input sequence
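With the encoding 1=A, 2=C, 3=G, 4=T, the complement of a base code is 5 minus the code (A and T swap, C and G swap). The sketch below uses that identity while reading the input back to front. Leaving N (0) and X (5) unchanged is an assumption, as is the name ReverseComplementSketch.

```java
import java.util.Arrays;

final class ReverseComplementSketch {
    static byte[] reverseComplement(byte[] seq) {
        byte[] out = new byte[seq.length];
        for (int i = 0; i < seq.length; i++) {
            byte code = seq[seq.length - 1 - i];
            // For codes 1..4 the complement is 5 - code; other codes pass through.
            out[i] = (code >= 1 && code <= 4) ? (byte) (5 - code) : code;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] seq = {1, 1, 2, 0}; // A A C N
        System.out.println(Arrays.toString(reverseComplement(seq))); // [0, 3, 4, 4]
    }
}
```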

reverseComplement

public static java.io.File reverseComplement(java.io.File seqFile,
                                             java.io.File revFile)
                                      throws java.lang.Exception
Throws:
java.lang.Exception

reverseComplement

public static char reverseComplement(char c)

setColumn

public static double[][] setColumn(double[][] src,
                                   int index,
                                   double[] col)
Sets a column of a matrix

Parameters:
src - The matrix to set the column in
index - The index (zero based) of the column to set
col - The values of the column to set. Must have the same length as the number of rows in src
Returns:
A new matrix
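Since the method is documented as returning a new matrix, a straightforward reading is that the source rows are copied before the column is overwritten. A minimal sketch under that reading; the length check and the name SetColumnSketch are assumptions:

```java
import java.util.Arrays;

final class SetColumnSketch {
    // Returns a copy of src in which column `index` holds the values of col.
    static double[][] setColumn(double[][] src, int index, double[] col) {
        if (col.length != src.length) {
            throw new IllegalArgumentException("col must have one value per row of src");
        }
        double[][] out = new double[src.length][];
        for (int row = 0; row < src.length; row++) {
            out[row] = src[row].clone(); // copy so src stays untouched
            out[row][index] = col[row];
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] m = {{1, 2}, {3, 4}};
        System.out.println(Arrays.deepToString(setColumn(m, 0, new double[] {9, 8})));
    }
}
```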

subarray

public static char[] subarray(int start,
                              int stop,
                              char[] src)
Extracts a subsegment of an array

Parameters:
start - The start index (inclusive)
stop - The stop index (exclusive)
src - The input array
Returns:
A subsegment of the input array
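The half-open range (start inclusive, stop exclusive) matches the convention of java.util.Arrays.copyOfRange, so the result has stop - start elements. A minimal sketch built on that standard call (SubarraySketch is a hypothetical name; how SIGRS treats out-of-range indices is not documented):

```java
import java.util.Arrays;

final class SubarraySketch {
    // Copies src[start], ..., src[stop - 1] into a new array.
    static char[] subarray(int start, int stop, char[] src) {
        return Arrays.copyOfRange(src, start, stop);
    }

    public static void main(String[] args) {
        char[] part = subarray(1, 3, "ACGT".toCharArray());
        System.out.println(new String(part)); // CG
    }
}
```

Note that Arrays.copyOfRange pads with zero values when stop exceeds the array length, which may differ from the library's behaviour.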

subarray

public static double[] subarray(int start,
                                int stop,
                                double[] src)
Extracts a subsegment of an array

Parameters:
start - The start index (inclusive)
stop - The stop index (exclusive)
src - The input array
Returns:
A subsegment of the input array

subarray

public static double[][] subarray(int start,
                                  int stop,
                                  double[][] arr)
Extracts a subsegment of an array

Parameters:
start - The start index (inclusive)
stop - The stop index (exclusive)
arr - The input array
Returns:
A subsegment of the input array

subarray

public static int[] subarray(int start,
                             int[] src)
Extracts the last subsegment of an array

Parameters:
start - The start index (inclusive)
src - The input array
Returns:
A subsegment of the input array starting from the supplied start index and continuing up to the end

subarray

public static int[] subarray(int start,
                             int stop,
                             int[] src)
Extracts a subsegment of an array

Parameters:
start - The start index (inclusive)
stop - The stop index (exclusive)
src - The input array
Returns:
A subsegment of the input array

subarray

public static java.lang.String[][] subarray(int start,
                                            java.lang.String[][] src)
Extracts a subsegment of an array

Parameters:
start - The start index (inclusive)
src - The input array
Returns:
A subsegment of the input array starting from the supplied start index and continuing up to the end

subarray

public static java.lang.String[][] subarray(int start,
                                            int stop,
                                            java.lang.String[][] src)

sum

public static double sum(double[] arr)
Sums the elements of an array

Parameters:
arr - The array whose elements to sum
Returns:
The sum of the elements

sum

public static int sum(int[] arr)
Sums the elements of an array

Parameters:
arr - The array whose elements to sum
Returns:
The sum of the elements

vectorAdd

public static double[] vectorAdd(double[] arr,
                                 double f)
Adds a scalar to an array

Parameters:
arr - Array to add to
f - The scalar to add to the input array
Returns:
A new array

vectorAdd

public static int[] vectorAdd(int[] v1,
                              int[] v2)
Adds two arrays together

Parameters:
v1 - Vector 1
v2 - Vector 2
Returns:
A new vector where the elements are the sum of the corresponding elements of v1 and v2

vectorAdd

public static int[][] vectorAdd(int[][] v1,
                                int[][] v2)
Adds two arrays together

Parameters:
v1 - Vector 1
v2 - Vector 2
Returns:
A new vector where the elements are the sum of the corresponding elements of v1 and v2

vectorDivide

public static double[] vectorDivide(double[] arr,
                                    double f)
Divides an array by a scalar

Parameters:
arr - Array to divide
f - The scalar to divide the input array by
Returns:
The input array divided by f

vectorDivide

public static double[][] vectorDivide(double[][] arr,
                                      double f)
Divides an array by a scalar

Parameters:
arr - Array to divide
f - The scalar to divide the input array by
Returns:
The input array divided by f

vectorInnerProduct

public static double vectorInnerProduct(double[] v1,
                                        double[] v2)
Calculates the inner product of two vectors

Parameters:
v1 - Vector 1
v2 - Vector 2
Returns:
The inner product of v1 and v2
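The inner product of two vectors of equal length is the sum of the products of corresponding elements. A minimal sketch; the length check and the name InnerProductSketch are assumptions, since the documentation does not say how mismatched lengths are handled:

```java
final class InnerProductSketch {
    // Returns v1[0]*v2[0] + v1[1]*v2[1] + ... for vectors of equal length.
    static double vectorInnerProduct(double[] v1, double[] v2) {
        if (v1.length != v2.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < v1.length; i++) {
            sum += v1[i] * v2[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        double dot = vectorInnerProduct(new double[] {1, 2, 3}, new double[] {4, 5, 6});
        System.out.println(dot); // 32.0
    }
}
```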

vectorMultiply

public static double[] vectorMultiply(double[] arr,
                                      double f)
Multiplies an array by a scalar

Parameters:
arr - Array to scale
f - The scalar to scale the input array by
Returns:
The input array scaled by f

vectorMultiply

public static double[] vectorMultiply(double[] v1,
                                      double[] v2)
Multiplies the values of two arrays

Parameters:
v1 - The first array
v2 - The second array
Returns:
A new array where the elements are the products of the corresponding elements of the input arrays

vectorMultiply

public static double[][] vectorMultiply(double[][] arr,
                                        double f)
Multiplies a matrix by a scalar

Parameters:
arr - Matrix to scale
f - The scalar to scale the input matrix by
Returns:
The input matrix scaled by f

vectorMultiply

public static int[] vectorMultiply(int[] arr,
                                   int f)
Multiplies an array by a scalar

Parameters:
arr - Array to scale
f - The scalar to scale the input array by
Returns:
The input array scaled by f