
In 2025, the SmallData Seminar Series explored how to extract reliable insight from datasets that are noisy, incomplete, or highly heterogeneous. Across invited talks, Associated Researcher presentations, and hands-on workshops, the series linked methodological advances to concrete biomedical and life-science questions within the small data landscape.
One strong thread was AI that remains dependable when data are imperfect. Talks covered self-supervised approaches to capture subtle behavioral patterns, methods to reconstruct missing tracking information with uncertainty estimates, and robust, explainable pipelines for microscopy that can generalize across devices and cohorts. Theoretical contributions reinforced why structure-informed deep learning can stay stable even in high-dimensional small-data settings.
A second theme centered on trustworthy inference under bias and constrained study designs. Presentations examined causal strategies for rare diseases where randomized trials are infeasible, frameworks that correct selection bias by modeling how data enter the study, and Bayesian learning as a principled way to incorporate prior knowledge while quantifying uncertainty.
Finally, activities emphasized data integration and foundation models. Our workshop on imaging foundation models developed the idea of a joint 2D/3D biomedical imaging foundation-model dataset with agreed benchmarks, while an LLM workshop tackled low signal-to-noise biomedical text and laid the groundwork for real use cases.




2025 Seminar Series at a Glance
* AR, Associated Researcher; GS, Guest Speaker
Overall, 2025 reinforced a simple message: small data demands smarter methods, not smaller ambitions.
We would like to extend our sincere thanks to all CRC members, including our Associated Researchers, for organizing and delivering these workshops and presentations!

