PHENOPROPHET
------------------------------------------------------
	PhenoProphet assigns each transcription factor (TF) in a network a score representing the confidence that deleting that TF will yield a phenotypic change of interest. Given a network, the PhenoProphet score of a TF for a phenotype of interest is based on the enrichment of the TF's targets for genes known to be linked to that phenotype. Specifically, a hypergeometric enrichment test is applied to each regulator's targets at many network sizes, ranging from the 500 most confident network edges to the 40,000 most confident network edges. The PhenoProphet score for a regulator is the maximum, over all network size cutoffs, of the -log hypergeometric p-value for the regulator's target genes.
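
	The scoring procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the PhenoProphet implementation (which runs in R). It assumes that the enrichment background is the full set of target genes in the network, that the p-value is the upper tail P(X >= k), and that the logarithm is base 10; the source does not state any of these. The names hypergeom_upper_tail and phenoprophet_scores and the toy network are hypothetical.

```python
from math import comb, log10


def hypergeom_upper_tail(k, M, n, N):
    """P(X >= k) when N genes are drawn from M, of which n are phenotype-linked."""
    total = comb(M, N)
    hits = sum(comb(n, x) * comb(M - n, N - x) for x in range(k, min(n, N) + 1))
    return hits / total


def phenoprophet_scores(edges, all_targets, linked, max_size=40000, step=500):
    """edges: iterable of (confidence, regulator, target) triples.

    Returns {regulator: max over size cutoffs of -log10(p)}.
    Assumption: background = all target genes; log base 10.
    """
    ranked = sorted(edges, key=lambda e: e[0], reverse=True)
    M = len(all_targets)
    n = len(linked & all_targets)
    scores = {}
    for size in range(step, max_size + 1, step):
        # Collect each regulator's targets among the `size` most confident edges.
        targets = {}
        for _, reg, tgt in ranked[:size]:
            targets.setdefault(reg, set()).add(tgt)
        for reg, tset in targets.items():
            k = len(tset & linked)
            p = hypergeom_upper_tail(k, M, n, len(tset))
            score = -log10(max(p, 1e-300))  # guard against underflow to 0
            scores[reg] = max(scores.get(reg, 0.0), score)
        if size >= len(ranked):
            break  # no more edges to add
    return scores


# Toy network: 2 regulators, 6 target genes, 2 phenotype-linked genes.
toy_edges = [
    (0.9, "A", "g1"), (0.8, "A", "g2"), (0.7, "B", "g3"),
    (0.6, "B", "g4"), (0.5, "A", "g5"), (0.4, "B", "g1"),
]
toy_targets = {"g1", "g2", "g3", "g4", "g5", "g6"}
toy_linked = {"g1", "g2"}
toy_scores = phenoprophet_scores(toy_edges, toy_targets, toy_linked, max_size=6, step=2)
```

	In this toy network, regulator A scores higher than regulator B because both of A's two most confident targets are phenotype-linked. Taking the maximum over cutoffs means a regulator keeps its best enrichment even if lower-confidence edges later dilute it.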

SYSTEM REQUIREMENTS
------------------------------------------------------
	* R 

INSTALLATION INSTRUCTIONS
------------------------------------------------------
1. Unpack PhenoProphet
	tar -zxvf phenoprophet_VERSION.tar.gz
2. Execute the following lines or add them to your shell configuration file
	export PHENOPROPHET_DIR=$HOME/<phenoprophet_location>/CODE/
	export PATH=${PHENOPROPHET_DIR}:$PATH
	
EXAMPLE USAGE
------------------------------------------------------
	phenoprophet -n ${PHENOPROPHET_DIR}/../DATA/CRYPTO_CAPSULE_NETWORK/INPUT/network.adj.mtr -f ${PHENOPROPHET_DIR}/../DATA/CRYPTO_CAPSULE_NETWORK/INPUT/regulators.txt -g ${PHENOPROPHET_DIR}/../DATA/CRYPTO_CAPSULE_NETWORK/INPUT/target_genes.txt -p ${PHENOPROPHET_DIR}/../DATA/CRYPTO_CAPSULE_NETWORK/INPUT/capsule_size_phenotype_linked_genes.txt

MAXIMUM NETWORK SIZE AND NETWORK SIZE INTERVALS
------------------------------------------------------
	The -s and -i options set, respectively, the maximum allowable network size and the number of next most confident interactions added at each sampling of the network. Set -s to the size of the largest network that recovers true interactions better than random. Using NetProphet, we have observed better-than-chance recovery of the S. cerevisiae network at 40,000 edges. In practice, we have not observed significant changes in PhenoProphet scores when setting the maximum allowable network size above 40,000 edges. Set -i based on sampling tolerance: smaller values sample the network at more sizes. By default, -s is 40000 and -i is 500.
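
	To make the effect of -s and -i concrete, the following Python sketch lists the network sizes that would be sampled. It assumes that sampling starts at -i edges and grows by -i edges per step up to -s, which matches the 500 to 40,000 range described above but is not confirmed against the code. The function name network_size_cutoffs is hypothetical.

```python
def network_size_cutoffs(max_size=40000, interval=500):
    """Network sizes sampled: interval, 2*interval, ..., up to max_size.

    Assumption: sampling starts at `interval` edges and adds `interval`
    edges per step; sizes above max_size are never used.
    """
    return list(range(interval, max_size + 1, interval))


default_cutoffs = network_size_cutoffs()       # defaults: -s 40000, -i 500
coarse_cutoffs = network_size_cutoffs(40000, 2000)
```

	Under this assumption the defaults give 80 network sizes, while -i 2000 gives 20, so a larger -i trades sampling resolution for fewer enrichment tests.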

DESCRIPTION OF MAJOR FILES
------------------------------------------------------
	Required Input Files:
		1. networkFile - A representation of a transcriptional regulatory network as a tab-separated adjacency matrix of dimensions # of regulators by # of target genes. Each cell i,j of the matrix contains the confidence of an interaction between regulator i and target gene j. The NetProphet algorithm can be used to infer such a network from gene expression data.
		2. regulatorGeneNamesFile - A file listing one regulator gene identifier per line. The regulator gene identifiers should be ordered as they are in the network file.
		3. targetGeneNamesFile - A file listing one target gene identifier per line. The target gene identifiers should be ordered as they are in the network file.
		4. phenotypeLinkedGenesFile - A file listing one phenotype-linked gene per line. A phenotype-linked gene is a gene that is associated with a phenotype of interest, potentially through null mutant phenotyping.
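
	Because the regulator and target gene name files must follow the row and column order of the network file, it can help to check the inputs before running PhenoProphet. The Python sketch below reads the three files and verifies that the matrix dimensions match the name lists. It assumes the matrix file has no header row or label column, which this README does not state. The function names and the toy file contents are hypothetical.

```python
import os
import tempfile


def read_names(path):
    """Read one identifier per line, skipping blank lines."""
    with open(path) as fh:
        return [line.strip() for line in fh if line.strip()]


def load_network(network_path, regulators_path, targets_path):
    """Return (confidence, regulator, target) triples in row-major order.

    Assumption: the matrix is tab-separated with no header row or label column.
    """
    regulators = read_names(regulators_path)
    targets = read_names(targets_path)
    with open(network_path) as fh:
        rows = [[float(x) for x in line.rstrip("\n").split("\t")]
                for line in fh if line.strip()]
    if len(rows) != len(regulators):
        raise ValueError("matrix has %d rows but %d regulators were listed"
                         % (len(rows), len(regulators)))
    for i, row in enumerate(rows):
        if len(row) != len(targets):
            raise ValueError("row %d has %d columns but %d target genes were listed"
                             % (i + 1, len(row), len(targets)))
    return [(rows[i][j], regulators[i], targets[j])
            for i in range(len(regulators)) for j in range(len(targets))]


# Toy input files: 2 regulators by 3 target genes.
with tempfile.TemporaryDirectory() as tmp:
    net = os.path.join(tmp, "network.adj.mtr")
    regs = os.path.join(tmp, "regulators.txt")
    tgts = os.path.join(tmp, "target_genes.txt")
    with open(net, "w") as fh:
        fh.write("0.9\t0.1\t0.0\n0.2\t0.8\t0.5\n")
    with open(regs, "w") as fh:
        fh.write("TF1\nTF2\n")
    with open(tgts, "w") as fh:
        fh.write("g1\ng2\ng3\n")
    edges = load_network(net, regs, tgts)
```

	A dimension mismatch raises a ValueError, which usually indicates that a name file is missing entries or that the matrix was transposed.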
		
	Output Files:
		1. PhenoProphet_Scores.txt - A file listing each regulator and its PhenoProphet score. The file is sorted so that the regulators with the highest/most confident PhenoProphet scores are listed first.

REFERENCES
------------------------------------------------------
Maier EJ, Haynes BC, Gish SR, Wang ZA, Skowyra ML, Marulli AL, Doering TL, & Brent MR. (2014). Model-driven mapping of transcriptional networks reveals the circuitry and dynamics of virulence regulation. Manuscript submitted for publication.

Haynes BC, Maier EJ, Kramer MH, Wang PI, Brown H, & Brent MR. (2013). Mapping functional transcription factor networks from gene expression data. Genome Research.
