TY  - JOUR
A1  - Cochran, Kelly
A1  - Srivastava, Divyanshi
A1  - Shrikumar, Avanti
A1  - Balsubramani, Akshay
A1  - Hardison, Ross C.
A1  - Kundaje, Anshul
A1  - Mahony, Shaun
T1  - Domain adaptive neural networks improve cross-species prediction of transcription factor binding
Y1  - 2022/01/18 
JF  - Genome Research 
JO  - Genome Research 
DO  - 10.1101/gr.275394.121 
SP  - gr.275394.121 
UR  - http://genome.cshlp.org/content/early/2022/01/18/gr.275394.121.abstract 
N2  - The intrinsic DNA sequence preferences and cell-type-specific cooperative partners of transcription factors (TFs) are typically highly conserved. Hence, despite the rapid evolutionary turnover of individual TF binding sites, predictive sequence models of cell-type-specific genomic occupancy of a TF in one species should generalize to closely matched cell types in a related species. To assess the viability of cross-species TF binding prediction, we train neural networks to discriminate ChIP-seq peak locations from genomic background and evaluate their performance within and across species. Cross-species predictive performance is consistently worse than within-species performance, which we show is caused in part by species-specific repeats. To account for this domain shift, we use an augmented network architecture to automatically discourage learning of sequence features specific to the training species. This domain adaptation approach corrects for prediction errors on species-specific repeats and improves overall cross-species model performance. Our results demonstrate that cross-species TF binding prediction is feasible when models account for domain shifts driven by species-specific repeats.
ER  -