PROJECT SCRIPT OVERVIEW
=======================
This collection of shell, R and Python scripts covers the complete workflow, from read mapping and quantification through association analyses to network inference.


####1. WGS&Rnaseq_mapping####
  WGS_mapping.sh: Align WGS FASTQ with BWA, mark duplicates and sort to BAM using samtools.
  RNA_seq_mapping.sh: Align RNA-seq FASTQ with HISAT2 and generate coordinate-sorted BAM files.
  VCFmake.sh: Joint-genotype GVCFs with GATK GenotypeGVCFs and apply hard filters to produce a multi-sample VCF.


####2. Quantification####
  Readcounts.sh: Count gene-level reads with featureCounts.
  TPM_quantification.sh: Compute gene expression as TPM (transcripts per million).
  PSI_quantification.sh: Compute alternative-splicing PSI values with Leafcutter.
  PDUI_quantification.sh: Estimate APA PDUI indices using DaPars.
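
For orientation, the standard TPM definition can be sketched in a few lines of Python. This is only an illustration of the formula, not the contents of TPM_quantification.sh, which may rely on a dedicated tool; the function name `tpm` and its inputs (raw counts and gene lengths in base pairs) are assumptions made for this example.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to TPM, given gene lengths in base pairs.

    Hypothetical helper for illustration only.
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Step 1: normalise each count by gene length in kilobases (reads per kilobase).
    rpk = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    total = sum(rpk)
    if total == 0:
        # No reads at all: return zeros rather than dividing by zero.
        return [0.0] * len(rpk)
    # Step 2: rescale so that the values of one sample sum to one million.
    return [r / total * 1e6 for r in rpk]
```

Because the length normalisation happens before the per-sample scaling, TPM values always sum to one million within a sample, which makes them comparable across samples.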


####3. TWAS####
  TWAS.R: Perform transcriptome-wide association tests between traits and molecular phenotypes; outputs significant genes.


####4. XGBOOST####
  XGBOOST.R: Train an XGBoost regressor on molecular features (expression, APA, AS) to predict egg production;
             reports feature importance and cross-validation metrics.


####5. molQTL_mapping####
The current template demonstrates an eQTL analysis;
substituting the expression and covariate matrices and adjusting parameters such as --phenotype_groups lets you run sQTL or apaQTL analyses within the same framework.

  PEER_estimate.R: Infer hidden factors with PEER and export covariates.
  molQTL_mapping.sh: Combine expression matrix, genotype VCF and covariates; run cis-molQTL scans via tensorQTL.


####6. GWAS####
  GWAS.sh: Master script that runs a GWAS with EMMAX for each trait.
  emmaxfilter_ZJU.pl: Filter EMMAX output and summarise significant SNPs.
  LD_Clumping.sh: LD-clump GWAS hits with PLINK to obtain lead SNPs.
  MANHATTAN_QQ.r: Draw Manhattan + QQ plots using base R.
  MCMCglmm_heritability_estimate.R: Estimate SNP-based heritability with MCMCglmm.
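
The idea behind LD clumping can be sketched as a greedy procedure: take the most significant SNP as an index SNP, absorb nearby SNPs in high LD with it, and repeat. The Python below is a simplified illustration of that idea, not the PLINK implementation and not the contents of LD_Clumping.sh; the function name `clump`, the input layout and the threshold values are assumptions chosen for this example.

```python
def clump(snps, r2, p1=1e-4, p2=1e-2, r2_min=0.5, window_bp=250_000):
    """Greedy LD clumping sketch (hypothetical helper).

    snps: list of dicts with keys "id", "chrom", "pos", "p".
    r2:   dict mapping frozenset({id_a, id_b}) to pairwise r^2;
          missing pairs are treated as r^2 = 0.
    Returns a dict mapping each index (lead) SNP id to its clumped SNP ids.
    """
    ordered = sorted(snps, key=lambda s: s["p"])  # most significant first
    assigned = set()
    clumps = {}
    for index in ordered:
        # A SNP can start a clump only if it is unassigned and passes p1.
        if index["id"] in assigned or index["p"] > p1:
            continue
        assigned.add(index["id"])
        members = []
        for other in ordered:
            if other["id"] in assigned:
                continue
            if other["chrom"] != index["chrom"]:
                continue
            if abs(other["pos"] - index["pos"]) > window_bp:
                continue
            if other["p"] > p2:
                continue
            pair = frozenset((index["id"], other["id"]))
            if r2.get(pair, 0.0) >= r2_min:
                members.append(other["id"])
                assigned.add(other["id"])
        clumps[index["id"]] = members
    return clumps
```

The keys of the returned dict play the role of lead SNPs. In practice the pairwise r^2 values would come from the genotype data rather than from a precomputed dict.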


####7. Colocalization####
  COLOC_out.sh: Iterate through tissue × trait combinations, format input and launch COLOC.
  COLOC_withSNPoutput.R: Run coloc.abf in R and export both region- and SNP-level posterior probabilities.


####8. MediationAnalysis####
  causal_mediation.R: Decompose total genetic effects into direct and molecular-mediated components for a chosen Y–X–M trio.
  Run_Mediation.sh: Shell wrapper to batch-run causal_mediation.R and collate results.
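
As a rough illustration of the decomposition, the Python sketch below uses the linear product-of-coefficients approach: the total effect of X on Y equals the direct effect plus the effect mediated through M. It is not the method implemented in causal_mediation.R, which may use a dedicated mediation package and resampling-based inference. The function name `mediation_decompose` and the reading of X as genotype, M as molecular phenotype and Y as trait are assumptions made for this example.

```python
def _cov(u, v):
    """Population covariance of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


def mediation_decompose(x, m, y):
    """Split the total effect of x on y into direct and mediated parts.

    Hypothetical helper: ordinary least squares with an intercept,
    solved in closed form from variances and covariances.
    """
    sxx, smm, sxm = _cov(x, x), _cov(m, m), _cov(x, m)
    sxy, smy = _cov(x, y), _cov(m, y)
    det = sxx * smm - sxm ** 2
    if sxx == 0 or det == 0:
        raise ValueError("x is constant or x and m are perfectly collinear")
    total = sxy / sxx                         # slope of Y ~ X
    a = sxm / sxx                             # slope of M ~ X
    direct = (sxy * smm - smy * sxm) / det    # X coefficient in Y ~ X + M
    b = (smy * sxx - sxy * sxm) / det         # M coefficient in Y ~ X + M
    indirect = a * b                          # effect mediated through M
    proportion = indirect / total if total != 0 else float("nan")
    return {"total": total, "direct": direct,
            "indirect": indirect, "proportion_mediated": proportion}
```

With ordinary least squares the identity total = direct + indirect holds exactly, which gives a quick sanity check on any fitted trio.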


####9. WGCNA####
  RunWGCNA.R: Build weighted gene-co-expression networks, detect modules and export module eigengenes.
  BayesianNetwork.R: Infer Bayesian causal networks among modules with bnlearn.

