TabPFN Improves DNA Ancestry Prediction in Forensic Genetics

A new study by our doctoral researchers Carola Heinzel and Lennart Purucker, together with PIs Frank Hutter and Peter Pfaffelhuber, explores how TabPFN, a foundation model for tabular data, can be used to predict biogeographical ancestry (i.e., where a person’s ancestors likely came from) based on DNA data. This type of analysis is valuable in forensic genetics, for example when investigators work with DNA samples whose origins are unknown.

The researchers tested TabPFN against well-established methods such as Snipper and PLS-DA. Across multiple datasets, TabPFN consistently came out ahead. In one example, it raised accuracy for continental classification from 84% to 93%, and in a tougher test distinguishing between European populations, it improved results from 43% to 48%. To make the approach practical, the team has also shared an offline tool so that others can apply TabPFN directly to their own data.

The study also makes clear where the limits are. TabPFN works best when the populations being compared are genetically distinct, for example populations from different continents, but performance drops when groups are very similar, as is the case within Europe. The authors emphasize that researchers should always validate the model carefully on their own data before applying it in real cases. Even with those caveats, TabPFN offers a promising step forward in forensic genetics, combining high accuracy with efficiency and ease of use.
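
For readers who want to follow the advice about checking a model on their own data, the sketch below shows one simple way to do it: hold out part of the labelled samples, predict their population, and report accuracy per population so that weak spots between similar groups become visible. It is an illustration only and not the study's evaluation protocol. The genotype data are synthetic, all function names are hypothetical, and the nearest-centroid classifier is a plain placeholder standing in for whichever model is being checked (it is not TabPFN).

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx), holding out the same fraction per class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly classified samples for each true class."""
    hits, totals = defaultdict(int), defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        hits[truth] += int(truth == pred)
    return {c: hits[c] / totals[c] for c in totals}


def nearest_centroid_predict(X_train, y_train, X_test):
    """Placeholder classifier (NOT TabPFN): pick the closest class mean."""
    sums, counts = {}, defaultdict(int)
    for x, y in zip(X_train, y_train):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}

    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return [min(centroids, key=lambda c: sq_dist(x, centroids[c])) for x in X_test]


def evaluate(X, y, predict_fn, test_fraction=0.25, seed=0):
    """Hold out a stratified test set and return per-class accuracy."""
    train_idx, test_idx = stratified_split(y, test_fraction, seed)
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_test = [y[i] for i in test_idx]
    return per_class_accuracy(y_test, predict_fn(X_train, y_train, X_test))


def synthetic_genotypes(allele_freq, n_samples, n_markers, rng):
    """Synthetic 0/1/2 allele counts; made-up data for illustration only."""
    return [
        [sum(rng.random() < allele_freq for _ in range(2)) for _ in range(n_markers)]
        for _ in range(n_samples)
    ]


rng = random.Random(42)
X = synthetic_genotypes(0.9, 20, 10, rng) + synthetic_genotypes(0.1, 20, 10, rng)
y = ["A"] * 20 + ["B"] * 20
print(evaluate(X, y, nearest_centroid_predict))
```

To check a different model, one would pass `evaluate` another function with the same `(X_train, y_train, X_test)` signature. Reporting accuracy per population rather than a single overall figure matters here because closely related groups may be confused with each other even when the overall number looks good.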

Read the full article here.
