Individual multi-state health modeling using machine learning and LLMs
This project is led by Mathias Lindholm and Filip Lindskog, respectively Associate Professor in Mathematical Statistics and Professor of Insurance Mathematics in the Department of Mathematics, Stockholm University.
Duration of the project: 2025-2027
Background
The project focuses on health-state modeling, and in particular on how state-of-the-art machine learning techniques can improve current practice. This matters to insurers, and also to healthcare providers and anyone else who needs to plan for future healthcare needs.
Objectives
The objectives can be split into two parts:
- State-of-the-art research on inference and prediction based on individual-level multi-state health data using statistical machine learning techniques.
- Communicating new results and sharing best practice on process modeling of individual multi-state health data to the actuarial community and the surrounding industry.
The research will focus on the following topics:
- Individual-level health-state modeling using machine learning techniques
- Understanding and deconstructing LLM-based multi-state approaches
- Generating pseudo multi-state health data
Questions that will be addressed within these research activities include, for example:
- The amount of individual-level health-state data will vary a lot between individuals due to their life event histories. How should this be handled and combined with state-of-the-art machine learning techniques?
- Recent case studies summarize individual life event histories using LLMs. What type of dynamics do these LLMs capture, and what aspects are potentially missed?
- Individual-level health data describing life event histories will contain sensitive information. How can one generate individual-level synthetic pseudo-health data that can be shared with, e.g., researchers and the surrounding community?
The research topics will be of a theoretical nature, but the results will be developed in close connection with real individual-level insurance or health registry data from Sweden and the Nordics.
The research output will be published open access in leading scientific journals in actuarial science or closely related fields, and the research will be presented at academic conferences and industry seminars. In addition, knowledge will be shared with SCOR through specific events.
Fields of research / technical cooperation
Individual health data, multi-state health data, large language models (LLMs), Markov models, generating individual-level pseudo health data, machine learning techniques, gradient boosting
Mathias Lindholm holds a PhD in Mathematical Statistics from Stockholm University (2008). He has worked as an Actuary and Quantitative Analyst at AFA Insurance (2010-2013) and as a Risk Analyst at Länsförsäkringar Alliance in Stockholm (2013-2015). Returning to Stockholm University, he was an Affiliated Researcher from 2011 to 2014. He has been a Senior Lecturer in Mathematical Statistics at Stockholm University since 2015, and an Associate Professor in Mathematical Statistics since 2017. He is a member of the Statistics Sweden expert group for mortality forecasting and a Board member of the Swedish Society of Actuaries.
Filip Lindskog holds an MSc from KTH Royal Institute of Technology (2000) and a PhD from ETH Swiss Federal Institute of Technology (2004). He was Assistant/Associate Professor at the Division of Mathematical Statistics at KTH from 2004-2015, Director of Studies in Financial Mathematics at KTH from 2009-2015, and Director of the Applied and Computational Mathematics Master's program at KTH from 2012-2015. He has been Director of Stockholm University's Actuarial Mathematics Master's Program since 2016, Professor of Insurance Mathematics since 2015, and Head of the Mathematical Statistics Division since 2018. He has been the Editor of the Scandinavian Actuarial Journal since 2018.