SmallData Seminar Series 2025: Year in Review

In 2025, the SmallData Seminar Series explored how to extract reliable insight from datasets that are noisy, incomplete, or highly heterogeneous. Across invited talks, Associated Researcher presentations, and hands-on workshops, the series linked methodological advances to concrete biomedical and life-science questions within the small data landscape.

One strong thread was AI that remains dependable when data are imperfect. Talks covered self-supervised approaches to capture subtle behavioral patterns, methods to reconstruct missing tracking information with uncertainty estimates, and robust, explainable pipelines for microscopy that can generalize across devices and cohorts. Theoretical contributions clarified why structure-informed deep learning can stay stable even in high-dimensional small-data settings.

A second theme centered on trustworthy inference under bias and limited study designs. Presentations examined causal strategies for rare diseases where randomized trials are infeasible, frameworks that correct selection bias by modeling how data enter the study, and Bayesian learning as a principled way to incorporate prior knowledge while quantifying uncertainty.

Finally, activities emphasized data integration and foundation models. Our workshop on imaging foundation models developed the idea of a joint 2D/3D biomedical imaging foundation-model dataset with agreed benchmarks, while an LLM workshop tackled low signal-to-noise biomedical text and laid groundwork for real use cases.

2025 Seminar Series at a Glance

  • Analysis of complex and subtle behavior enabled by self-supervised deep learning — GS France Rose (University Hospital of Cologne)
  • A joint foundation model for 2D/3D biomedical imaging data: Datasets and benchmarks — Workshop, moderated by Simon Ging
  • Forensic applications using DNA methylation analysis and mathematical prediction models — AR Jana Naue
  • Methodological aspects of the emulated target trial approach to optimize treatment strategies for a rare pediatric disease — AR Martin Wolkewitz & GS Derek Hazard (IMBI, University of Freiburg)
  • Dealing with heterogeneity and low signal-to-noise ratio when extracting information from biomedical textual data — Workshop, moderated by Fabian Kabus
  • Engineering immune cells for improved antitumor functionality — GS Evelyn Ullrich (Goethe University Frankfurt)
  • Robust and Trustworthy AI for Cell Microscopy — AR Maria Kalweit
  • Bayesian (deep) learning for “small” data — AR Nadja Klein
  • Modeling heterogeneity and observations when analyzing longitudinal and time-to-event data — AR Susanne Weber
  • The Abundance in Scarcity – Small Data in Cardiorenal Healthcare — AR Janis Nolde
  • From pathogenesis to tailored treatments in Inborn Errors of Immunity — AR Maria Elena Maccari
  • Overcoming Selection Bias in Statistical Studies With Amortized Bayesian Inference — GS Jonas Arruda (University of Bonn)
  • Mathematical deep learning for high-dimensional systems in small-data regimes — AR Diyora Salimova

* AR, Associated Researcher; GS, Guest Speaker

Overall, 2025 reinforced a simple message: small data demands smarter methods, not smaller ambitions. 

We would like to extend our sincere thanks to all CRC members, including our Associated Researchers, for organizing and delivering these workshops and presentations!

Administrative Manager

Marc Schumacher

Institute of Medical Biometry and Statistics,
Faculty of Medicine and Medical Center –
University of Freiburg