ChIP-seq analysis was done in multiple batches.

All scripts, except those for Supplemental Figs. S17 and S18, were nearly identical and are therefore not included as individual files.

Scripts for Supplemental Figs. S17 and S18 are available in a separate folder.

Typical steps included:

############ MAPPING, SORTING, and REMOVING DUPLICATES ############ 
A1. Mapping with Bowtie 1 to the mm9 genome. Reads longer than 50 bases were trimmed to the first 50 bases.

A2. Removing unmapped or multi-mapping reads.

A3. Sorting by coordinates.
The following command was used to achieve steps A1, A2, and A3:
gzip -cd  $FASTQ_FILE(s)| bowtie  -v 2 -m 1 --best --strata -S --time -p 8 MM9_BW1_INDEX - | samtools view -F 0x4 -ubt mm9.fa.fai | samtools sort -@ 4 -m 8G - -o Sample.bam
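The command above does not show how reads longer than 50 bases were trimmed (step A1). One possible way, given here only as a minimal sketch and not necessarily what the original scripts did, is to cut the sequence and quality lines of each FASTQ record before they reach bowtie. The function name trim_fastq_to_50 is hypothetical, and the sketch assumes plain 4-line FASTQ records on stdin.

```shell
# Hypothetical helper (not part of the original scripts): keep only the first
# 50 characters of the sequence and quality lines of each FASTQ record.
# Assumes standard 4-line records read from stdin; header and "+" lines
# are passed through unchanged.
trim_fastq_to_50() {
  awk 'NR % 4 == 2 || NR % 4 == 0 { print substr($0, 1, 50); next }
       { print }'
}
```

Under these assumptions it could be placed between the gzip and bowtie stages of the pipeline, e.g. gzip -cd $FASTQ_FILE | trim_fastq_to_50 | bowtie ... -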

A4. Removing PCR duplicates using
samtools view -F 0x4 -ub Sample.bam | samtools rmdup -s --output-fmt-option nthreads=8 - $sample.nodup.bam

A5. Samples were indexed using 
samtools index $sample.nodup.bam

A6. All of the above (steps A1-A5) can be achieved using a script similar to
BOWTIE_ChIPSEQ_LSD1_KO_mES_TUFTS_Bowtie1_mm9.pl


############ READ EXTENSION AND CONVERSION TO BIGWIGS FOR VISUALIZATION ############
B1. For visualization of the data after mapping, the reads were extended to a total length of 180 bases in the direction of the read and converted to a BED file.
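The strand-aware extension in B1 can be sketched as follows. This is an illustration only, not the original Perl code: the function name extend_bed_reads is hypothetical, the input is assumed to be tab-separated BED6 with the strand in column 6, and clipping at chromosome ends (using mm9.chrom.sizes) is left out.

```shell
# Hypothetical helper (not part of the original scripts): extend each read to
# a fixed total length in its own direction. Reads BED6 from stdin.
# $1 = total length after extension (defaults to 180).
extend_bed_reads() {
  awk -v len="${1:-180}" 'BEGIN { FS = OFS = "\t" }
    # forward-strand read: keep the start, move the end
    $6 == "+" { $3 = $2 + len }
    # reverse-strand read: keep the end, move the start (never below 0)
    $6 == "-" { $2 = $3 - len; if ($2 < 0) $2 = 0 }
    { print }'
}
```

The same helper with an argument of 200 would correspond to the extension length used for the coverage calculations in section C.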

B2. The extended reads were converted to bedgraph files and normalized to 1 million non-mitochondrial and non-ribosomal reads.
gzip -cd Sorted_Extended_Sample.nodup.bed.gz|bedtools genomecov -bg -g mm9.chrom.sizes -scale $NORM_FACTOR -i stdin |gzip -3 > Normalized_Sample.bedgraph.gz
where 
$NORM_FACTOR=int(1000000*100000/$reads)/100000; 
i.e. $NORM_FACTOR is 1000000/$reads truncated to 5 decimal places; this value was then used to scale the bedgraph files.
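The same arithmetic can be reproduced on the command line. The sketch below is a shell/awk rendering of the Perl expression shown in B2; the function name norm_factor is hypothetical, and its argument is assumed to be the read count used for normalization.

```shell
# Hypothetical helper (not part of the original scripts): scale factor to
# 1 million reads, truncated (not rounded) to 5 decimal places.
# $1 = number of reads used for normalization (must be > 0).
norm_factor() {
  awk -v reads="$1" 'BEGIN {
    printf "%.5f\n", int(1000000 * 100000 / reads) / 100000
  }'
}
```

Unlike the Perl expression, this always prints five decimal places (for example 0.50000 rather than 0.5), which makes no numerical difference when the value is passed to bedtools genomecov -scale.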

B3. The bedgraph files were converted to bigwig files using
wigToBigWig Normalized_Sample.bedgraph.gz mm9.chrom.sizes Normalized_Sample.bw

A typical script for steps B1-B3 would be
CHIPSEQ_BAM_BED_BEDGRAPH_BW_EXTEND_ChIP-Seq_LSD1_KO_mES_NEW.pl



############ READ EXTENSION AND CALCULATION OF READ COVERAGES OF A FEATURE SET ############
C1. For calculating coverage of a feature by ChIP-seq reads, reads were extended to 200 bases, and reads mapping to chrM or the rRNA locus (Rn45s, chr17:39978942-39986774) were ignored. The results were written as BED files.
A typical script would be 
Generate_Extended_Bed_FILES_FOR_COVERAGE.pl
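The filtering described in C1 can be sketched as a small awk filter. This is an assumption-laden illustration, not the content of the script named above: the function name drop_chrM_and_rRNA is hypothetical, the Rn45s interval is treated as 1-based inclusive and converted to BED coordinates, and any read overlapping the interval is dropped (the original may instead have required full containment, or filtered before extension).

```shell
# Hypothetical helper (not part of the original scripts): remove reads on chrM
# and reads overlapping the Rn45s locus (chr17:39978942-39986774, taken here
# as 1-based inclusive, i.e. BED interval 39978941-39986774).
# Reads tab-separated BED from stdin.
drop_chrM_and_rRNA() {
  awk 'BEGIN { FS = OFS = "\t" }
    $1 == "chrM" { next }
    $1 == "chr17" && $3 > 39978941 && $2 < 39986774 { next }
    { print }'
}
```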

C2. The coverage was then calculated using
gzip -cd $BEDFILE | bedtools coverage -mean -a FEATURES -b stdin > RAW_Sample.coverage

C3. The raw coverage was then normalized to the total number of reads in millions and/or converted to RPKM values.
A typical script for steps C2-C3 would be
Get_Coverage_WT_KO_ENHANCERS_HDAC_WIDTH_450.pl
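
The per-million normalization in C3 can be sketched as below. This is only an illustration under stated assumptions: the function name normalize_coverage_per_million is hypothetical, the raw coverage value is assumed to be the last tab-separated column (as written by bedtools coverage -mean), and the total read count is supplied by the caller. The RPKM conversion is not shown because its exact formula is not given here.

```shell
# Hypothetical helper (not part of the original scripts): append a column with
# the last-column coverage value scaled to 1 million reads.
# $1 = total number of reads in the sample (must be > 0).
# Reads tab-separated coverage lines from stdin.
normalize_coverage_per_million() {
  awk -v reads="$1" 'BEGIN { FS = OFS = "\t" }
    { print $0, sprintf("%.5f", $NF * 1000000 / reads) }'
}
```

Under these assumptions it could be used as normalize_coverage_per_million $reads < RAW_Sample.coverage, where $reads is the same count used for the normalization factor in B2.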