*****aln_dna_aa.pl*****
This Perl script generates a nucleotide alignment file from an amino acid sequence alignment obtained by other programs such as ClustalW, MAFFT, and Muscle.

1. Preparing input files.
Nucleotide sequence file: This file should be in fasta format. Gaps are not allowed. The sequence name and the nucleotide sequence should be separated by a line break.

Aligned amino acid sequence file: This file should also be in fasta format. Sequences should already be aligned (by other programs). The sequence name and the amino acid sequence should be separated by a line break.

2. Run the script
Type as follows.
$ perl aln_dna_aa.pl nucleotide_sequence_file aligned_amino_acid_sequence_file output_file

3. Reading the output file.
The output file is also in fasta format. It contains the nucleotide alignment that corresponds to the amino acid sequence alignment.
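
The idea behind this conversion can be illustrated with a minimal Python sketch. It is not the code of aln_dna_aa.pl; the function name back_translate is hypothetical, and the sketch assumes that each residue in the aligned amino acid sequence corresponds to exactly three nucleotides in the ungapped nucleotide sequence, in the same order.

```python
def back_translate(aligned_aa: str, dna: str) -> str:
    """Thread an ungapped coding sequence onto an aligned protein sequence."""
    codons = []
    pos = 0  # index of the next unused nucleotide in dna
    for residue in aligned_aa:
        if residue == "-":
            # One amino acid gap becomes three nucleotide gaps.
            codons.append("---")
        else:
            # One residue consumes the next codon of the nucleotide sequence.
            codons.append(dna[pos:pos + 3])
            pos += 3
    return "".join(codons)


print(back_translate("M-KT", "ATGAAAACC"))  # ATG---AAAACC
```

Here the gap in the second column of the amino acid alignment becomes "---" in the nucleotide alignment, so the codon boundaries stay in frame.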



*****dS_dN_mNG_comp.pl*****
This Perl script conducts a sliding window analysis based on the Modified Nei-Gojobori method (Zhang et al. 1998) with the complete deletion option. For each window, it computes the numbers of synonymous and nonsynonymous sites, the numbers of synonymous and nonsynonymous differences, the proportions of synonymous and nonsynonymous differences, and the rates of synonymous and nonsynonymous substitutions (multiple hits are corrected by the Jukes-Cantor (1969) method).

1. Preparing an input file.
The input file should be in fasta format. All sequences should already be aligned by other programs, such as MEGA and Muscle. The sequence name and the nucleotide sequence should be separated by a line break. The nucleotide sequence for each sequence name should be given on a single line, without any line break inside it. In addition, all nucleotide characters must be in capital letters.
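
If an alignment file has wrapped sequence lines or lowercase letters, it can be converted to the layout described above before running the script. The following Python sketch is one possible way to do this; the function name normalize_fasta is hypothetical and is not part of the scripts distributed here.

```python
def normalize_fasta(text: str) -> str:
    """Join wrapped sequence lines and upper-case them; headers are kept as-is."""
    records = []  # list of [header, sequence] pairs
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            records.append([line, ""])
        elif records:
            # Append this piece to the sequence of the most recent header.
            records[-1][1] += line.upper()
    return "".join(f"{header}\n{seq}\n" for header, seq in records)


wrapped = ">seq1\natgaaa\nacc\n>seq2\natg---\nacc\n"
print(normalize_fasta(wrapped), end="")
```

Each record in the result has one header line followed by one sequence line in capital letters.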

2. Run the script
Type as follows.
$ perl dS_dN_mNG_comp.pl input_file window_size sliding_size transition/transversion_ratio output_file

window_size: window size (no. of codons)
sliding_size: sliding size (no. of codons)
transition/transversion_ratio: transition/transversion ratio (if you want to estimate the ratio from the actual data, please type n/a)
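
To see how the window size and the sliding size interact, the following Python sketch lists window boundaries in codon positions. It is only an illustration: the function name codon_windows is hypothetical, positions are assumed to be 1-based and inclusive, and a trailing window shorter than the window size is assumed to be dropped. How dS_dN_mNG_comp.pl treats the last partial window is not stated here.

```python
def codon_windows(n_codons: int, window_size: int, sliding_size: int):
    """Return (start, end) codon positions, 1-based and inclusive."""
    if window_size < 1 or sliding_size < 1:
        raise ValueError("window_size and sliding_size must be positive")
    windows = []
    start = 1
    # Keep only windows that fit completely inside the alignment (assumption).
    while start + window_size - 1 <= n_codons:
        windows.append((start, start + window_size - 1))
        start += sliding_size
    return windows


print(codon_windows(100, 30, 10))
```

With 100 codons, a window size of 30, and a sliding size of 10, this gives eight windows, from codons 1-30 to codons 71-100.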

3. Reading an output file.
The first six lines show the method used and information about the input file, output file, window size, and sliding size.

The columns of the following lines are:
column 1: start codon position for the window
column 2: end codon position for the window (for the overall result, this column shows the transition/transversion ratio)
column 3: number of synonymous sites
column 4: number of nonsynonymous sites
column 5: number of synonymous differences
column 6: number of nonsynonymous differences
column 7: proportion of synonymous differences
column 8: proportion of nonsynonymous differences
column 9: synonymous substitution rate (corrected by JC method)
column 10: nonsynonymous substitution rate (corrected by JC method)
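
The relationship between the columns can be sketched in Python. This is an illustration rather than the script's own code: the function names and the example numbers are hypothetical. A proportion of differences is the number of differences divided by the number of sites, and the Jukes-Cantor correction is d = -(3/4) ln(1 - (4/3) p). Returning NaN when p is 0.75 or larger is an assumption of this sketch, because the correction is undefined there.

```python
import math


def proportion(differences: float, sites: float) -> float:
    """Proportion of differences, e.g. column 5 / column 3 for synonymous sites."""
    return differences / sites


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor correction: d = -(3/4) * ln(1 - (4/3) * p)."""
    if p >= 0.75:
        return float("nan")  # the logarithm is undefined here (assumption: report NaN)
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


p_s = proportion(5.0, 50.0)          # hypothetical window: 5 differences over 50 sites
print(round(p_s, 4))                 # 0.1
print(round(jukes_cantor(p_s), 4))   # 0.1073
```

The corrected rate is always at least as large as the uncorrected proportion, and the two are nearly equal when the proportion is small.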



*****vcf_to_fasta.pl*****
This Perl script generates the neo-Y chromosome assembly from the genome sequence based on the female DNA, vcf files based on female and male short reads (output from GATK), and a depth file based on male short reads (output from bedtools). The neo-Y chromosome assembly is generated by replacing bases in the neo-X chromosome assembly with male-specific variants.

1. Preparing input files.
Two vcf files are required: one based on the female DNA short reads and one based on the male DNA short reads, both mapped onto the genome assembly based on the female DNA. Variant calling should be done with GATK. In addition, a depth file based on the male DNA short reads is necessary. Depth should be computed with bedtools.

2. Run the script
Type as follows.
$ perl vcf_to_fasta.pl genome_sequence_file vcf_male_file vcf_female_file coverage_file output_file

genome_sequence_file: fasta file based on the female DNA
coverage_file: depth file based on the male short reads (including positions with zero coverage)
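
The general idea of applying male-specific variants can be sketched in Python as follows. This is not the logic of vcf_to_fasta.pl itself. The function names are hypothetical, only single-nucleotide variants are handled, genotype fields are ignored, and masking zero-coverage positions with N is purely an assumption of this sketch, since the way the script uses the depth file is not described here.

```python
def parse_snps(vcf_lines):
    """Collect single-nucleotide variants as {(chrom, pos): (ref, alt)}."""
    snps = {}
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        if len(ref) == 1 and len(alt) == 1 and alt in "ACGT":
            snps[(chrom, int(pos))] = (ref, alt)
    return snps


def male_specific_snps(male_vcf_lines, female_vcf_lines):
    """Keep variants called in the male but not identically in the female."""
    male = parse_snps(male_vcf_lines)
    female = parse_snps(female_vcf_lines)
    return {key: val for key, val in male.items() if female.get(key) != val}


def apply_snps(chrom, sequence, snps, zero_depth_positions=()):
    """Substitute variants into one sequence (positions are 1-based)."""
    bases = list(sequence)
    for (c, pos), (ref, alt) in snps.items():
        if c == chrom and bases[pos - 1].upper() == ref.upper():
            bases[pos - 1] = alt
    for pos in zero_depth_positions:
        bases[pos - 1] = "N"  # assumption: mask positions without male reads
    return "".join(bases)


male = ["#CHROM\tPOS\tID\tREF\tALT", "chrX\t2\t.\tC\tT\t50\tPASS\t.", "chrX\t4\t.\tA\tG\t50\tPASS\t."]
female = ["chrX\t4\t.\tA\tG\t50\tPASS\t."]
specific = male_specific_snps(male, female)
print(apply_snps("chrX", "ACGAT", specific, zero_depth_positions=[5]))  # ATGAN
```

In this toy example the variant at position 4 is shared by both sexes and is left unchanged, the male-specific variant at position 2 is substituted, and position 5 is masked because it has no male coverage.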



If you need more detailed information, please contact

Masafumi Nozawa, PhD

Department of Biological Sciences
School of Science
Tokyo Metropolitan University
1-1 Minamiosawa, Hachioji
Tokyo 192-0397, JAPAN
Email: manozawa@tmu.ac.jp
URL: https://sites.google.com/view/masafumi-nozawa/top
