
70th Annual Meeting of the German Society for Medical Informatics, Biometry and Epidemiology (GMDS)

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS)
07.-11.09.2025
Jena

Meeting Abstract

Potential of Machine Learning for Discharge Management Using Routine Health Insurance Data

Zully Ritter - Department of Medical Informatics, University Medical Center Göttingen (UMG), Göttingen, Germany
Miriam Cindy Maurer - Department of Medical Informatics, University Medical Center Göttingen (UMG), Göttingen, Germany
Jacqueline Beinecke - Department of Medical Informatics, University Medical Center Göttingen (UMG), Göttingen, Germany
Lisa Weller - aQua-Institute for Applied Quality Improvement and Research in Health Care, Göttingen, Germany
Thorsten Pollmann - aQua-Institute for Applied Quality Improvement and Research in Health Care, Göttingen, Germany
Matthias Kretzler - BKK Dachverband e.V., Berlin, Germany
Thomas Grobe - aQua-Institute for Applied Quality Improvement and Research in Health Care, Göttingen, Germany
Anne-Christin Hauschild - Department of Predictive Deep Learning in Medicine and Healthcare, Justus-Liebig University Gießen, Gießen, Germany; Department of Medical Informatics, University Medical Center Göttingen (UMG), Göttingen, Germany

Text

Introduction: The use of artificial intelligence, particularly machine learning models, to develop predictive models from routine health insurance data is becoming increasingly common [1], [2]. However, it is not always clear when and how to leverage the advantages of advanced AI and explainable AI (XAI) models to create a stable and reliable framework for processing large datasets, managing bias, and handling imbalanced data collections.

Methods: To explore this, the KI-Thrust project conducted a comprehensive comparison between traditional regression models, namely logistic regression (LR), and various machine learning models (including, but not limited to, random forest (RF), neural networks (NN), and AdaBoost (AB)) on a large-scale routine health insurance dataset collected within the project. The dataset comprises around two million training samples with features such as age, sex, diagnoses, mortality, unplanned readmissions, and aid use. The study focused on two outcomes: mortality (OM) and unplanned readmission (OUR). Three data models (M1, M2, and M3) were implemented for each outcome: M1 uses all features except those related to pre-existing conditions, which are represented by ICD-10 codes; M2 includes all features; M3 is identical to M2 but adds a time component. Sampling techniques were used to manage the imbalanced classes [3]. An automated explainability tool [4] was developed using several XAI techniques, including Shapley values and integrated gradients [5].
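The class-imbalance handling described above can be sketched as follows. This is a minimal illustration on synthetic data, not the KI-Thrust dataset; it uses scikit-learn's `class_weight="balanced"` as a stand-in for the weighted loss functions discussed in the abstract (SMOTE-style resampling per [3] would replace the weighting step).

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the routine-data task: ~1% positive class,
# mirroring the <1% mortality rate mentioned in the abstract.
X, y = make_classification(n_samples=20_000, n_features=20,
                           weights=[0.99, 0.01], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Plain logistic regression vs. a class-weighted variant that
# reweights the loss inversely to class frequency.
plain = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
weighted = LogisticRegression(max_iter=1000,
                              class_weight="balanced").fit(X_tr, y_tr)

for name, model in [("plain", plain), ("weighted", weighted)]:
    rec = recall_score(y_te, model.predict(X_te))
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: minority-class recall={rec:.2f}, AU-ROC={auc:.3f}")
```

On data this skewed, the unweighted model typically achieves a reasonable AU-ROC but misses many minority-class cases; the weighted variant trades some precision for substantially higher minority-class recall.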

Results: Classical regression models performed similarly to the machine learning models. In all cases, overall OM classification performance was about 20% better than OUR classification. The highest AU-ROC was achieved by the AB model with M2 (88.9% for OM and 69.4% for OUR). The highest area under the precision-recall curve was obtained by LR with M2 (16.1%), with AB nearly identical at 16.09%. Machine learning models offer a better understanding and handling of highly imbalanced data (e.g., mortality class < 1.0%), which traditional regression alone cannot accomplish. Additionally, feature importance was analyzed: the top three features for OM with M2 were age, organic (including symptomatic) mental disorders (F00–F09), and care level 0 (PFL_GR_0) for AB, and age, F00–F09, and polypharmacy (POLYMED) for RF. These insights were compiled into a white paper guiding the use of XAI in predictive models built on routine insurance data.
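The per-model feature rankings reported above can be reproduced in outline with a random forest's built-in importances. The feature names below merely echo those in the abstract for illustration; the data is synthetic, and impurity-based importance is only one of the attribution methods (alongside Shapley values and integrated gradients) the project compared.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Illustrative feature names echoing the abstract; synthetic data,
# not the KI-Thrust routine insurance dataset.
features = ["AGE", "SEX", "F00_F09", "PFL_GR_0", "POLYMED", "N_DIAG"]
X, y = make_classification(n_samples=5_000, n_features=len(features),
                           n_informative=3, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Rank features by impurity-based importance and report the top three,
# analogous to the per-model rankings reported for OM with M2.
order = np.argsort(rf.feature_importances_)[::-1]
top3 = [features[i] for i in order[:3]]
print("Top-3 features:", top3)
```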

Conclusion: Using routine health data to predict discharge management can significantly improve patient care. This study shows that machine learning models slightly outperform logistic regression models, though they require more time and computational resources; in return, they provide a better understanding of the data. Upsampling and downsampling improve performance mainly when 80% of the data is sampled, but they do not surpass weighted loss functions, which are therefore preferred. More extensive hyperparameter tuning and greater neural network complexity do not guarantee that machine learning models will greatly outperform logistic regression, which remains the gold standard for predicting discharge management from routine health insurance data due to its inherent interpretability. Our results demonstrate that comparing XAI methods and models is crucial for validating and selecting relevant features for outcome prediction, ensuring the trustworthiness and reproducibility of the AI solutions presented.

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

[1] Ferrari AJ, Santomauro DF, Aali A, Abate YH, Abbafati C, Abbastabar H, et al. Global incidence, prevalence, years lived with disability (YLDs), disability-adjusted life-years (DALYs), and healthy life expectancy (HALE) for 371 diseases and injuries in 204 countries and territories and 811 subnational locations, 1990–2021: a systematic analysis for the Global Burden of Disease Study 2021. The Lancet. 2024 May;403(10440):2133–61.
[2] Stephan AJ, Hanselmann M, Bajramovic M, Schosser S, Laxy M. Development and validation of prediction models for stroke and myocardial infarction in type 2 diabetes based on health insurance claims: does machine learning outperform traditional regression approaches? Cardiovasc Diabetol. 2025 Feb 18;24(1):80.
[3] Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: Synthetic Minority Over-sampling Technique. J Artif Intell Res. 2002 Jun 1;16:321–57.
[4] Ribeiro MT, Singh S, Guestrin C. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco, CA, USA: ACM; 2016. p. 1135–44. DOI: 10.1145/2939672.2939778
[5] Sundararajan M, Taly A, Yan Q. Axiomatic attribution for deep networks. In: Proceedings of the 34th International Conference on Machine Learning (ICML'17), PMLR Volume 70. Sydney, NSW, Australia; 2017. p. 3319–28.