Integrating information from similar data sites when developing clinical prediction models for patients from a target site

Project summary

We hypothesize that local prediction models, which often suffer from small sample sizes, can benefit from integrating information from similar external sites. The project will develop methodology for weighting similar data from external sites in a prediction model for a target site. We will investigate different strategies for quantifying similarity, including 1) propensity score-based approaches, 2) similarity quantified parametrically as well as nonparametrically via the between-site difference in the conditional empirical distribution of the outcome given the covariates, and 3) a deep learning approach. Besides empirical evaluation, we will provide comprehensive mathematical theory on the parameter estimates of the corresponding regression models.
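To make the first strategy more concrete, the following is a minimal sketch of one possible propensity score-based weighting, not the project's actual method. It assumes that similarity is measured by a logistic "site membership" model, that each external observation is weighted by its estimated odds of belonging to the target site, and that the prediction model is a weighted logistic regression. All function names (`fit_logistic`, `similarity_weights`, `simulate`) and the toy data are hypothetical and chosen for illustration only.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(beta, x):
    # beta = [intercept, coef_1, ..., coef_p]
    return sigmoid(beta[0] + sum(b * xj for b, xj in zip(beta[1:], x)))


def fit_logistic(X, y, weights=None, lr=0.1, n_iter=2000):
    """Weighted logistic regression fitted by plain gradient descent."""
    n, p = len(X), len(X[0])
    w = weights if weights is not None else [1.0] * n
    total = sum(w)
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, yi, wi in zip(X, y, w):
            err = wi * (predict_proba(beta, xi) - yi)
            grad[0] += err
            for j in range(p):
                grad[j + 1] += err * xi[j]
        beta = [b - lr * g / total for b, g in zip(beta, grad)]
    return beta


def similarity_weights(X_target, X_external):
    """Assumed similarity measure: estimated odds that an external
    observation belongs to the target site, given its covariates."""
    X = X_target + X_external
    site = [1] * len(X_target) + [0] * len(X_external)
    beta = fit_logistic(X, site)
    weights = []
    for x in X_external:
        p = predict_proba(beta, x)
        weights.append(p / (1.0 - p))
    return weights


def simulate(n, mean):
    # Hypothetical toy data: one covariate, binary outcome.
    X = [[random.gauss(mean, 1.0)] for _ in range(n)]
    y = [1 if random.random() < sigmoid(x[0]) else 0 for x in X]
    return X, y


random.seed(0)
X_tgt, y_tgt = simulate(30, 0.0)    # small target site
X_near, y_near = simulate(50, 0.0)  # external site resembling the target
X_far, y_far = simulate(50, 3.0)    # external site with shifted covariates

w_ext = similarity_weights(X_tgt, X_near + X_far)

# Target observations keep weight 1; external ones get similarity weights.
beta_target = fit_logistic(
    X_tgt + X_near + X_far,
    y_tgt + y_near + y_far,
    weights=[1.0] * len(X_tgt) + w_ext,
)

print("mean weight, similar site:   ", sum(w_ext[:50]) / 50)
print("mean weight, dissimilar site:", sum(w_ext[50:]) / 50)
print("weighted model coefficients: ", beta_target)
```

In this sketch the weights depend only on the covariates, so they address differences in case mix between sites but not differences in the outcome model itself. The second strategy in the summary, which compares the conditional distribution of the outcome given the covariates, targets exactly that gap.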

Our methods

  • Combining knowledge- and data-driven modeling
  • Neural networks
  • Meta-learning
  • Local perspective

Principal investigator 1

Doctoral researcher position 1

(supervised by PI Rohde)


Development of an asymptotic small data framework and theoretical investigation of the proposed approaches.


  • Master’s degree in mathematics with a strong background in stochastics

Principal investigator 2

Doctoral researcher position 2

(supervised by PI Zöller)


Algorithmic development of the approaches, with a focus on evaluating them in small data and applied settings.


  • Master’s degree or equivalent in mathematics, (bio-)statistics, computer science or similar
  • Advanced programming skills in, e.g., R, Python, or Julia
  • Ideally, experience in federated learning, prediction modeling, and/or deep learning
  • Interest in modeling clinical data

Administrative Manager

Marc Schumacher

Institute of Medical Biometry and Statistics,
Faculty of Medicine and Medical Center –
University of Freiburg