We hypothesize that integrating multiple sources of genetic and epigenetic information is necessary for the efficient and safe application of CRISPR-Cas in a therapeutic context. In this project, we aim to develop a deep learning-based approach for cell type-specific prediction of the efficacy and specificity of CRISPR-Cas9 nucleases that incorporates similarity between datasets into a pre-training strategy. We will integrate various types of information and investigate different approaches for measuring the similarity between datasets. To strengthen robustness, we will train the models to be aware of adversarial examples. Experimental validation in various therapeutically applied human cell types will enable us to fine-tune the models in an iterative process and ensure clinical relevance.
(supervised by PI Backofen)
Data collection and analysis, data imputation, setting up the similarity measures and network architectures, and performing model pre-training, training, and evaluation in various experimental settings.
(supervised by PI Cathomen)
Collection and analysis of CRISPR-Cas9 DNA binding and cleavage data as well as the experimental validation of algorithms.