70. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V.

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS)
07.-11.09.2025
Jena


Meeting Abstract

Comparison of methods to handle missing values in the index test in a diagnostic accuracy study – two simulation studies

Katharina Stahlmann 1
Dennis Juljugin 1
Bastiaan Kellerhuis 2
Johannes B. Reitsma 2
Nandini Dendukuri 3
Antonia Zapf 1
1Institute of Medical Biometry and Epidemiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
2Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, Netherlands
3Department of Medicine, McGill University, Montreal, Canada

Text

Introduction: Most diagnostic accuracy studies apply a complete case analysis (CCA) or single imputation to address missing values in the index test [1], [2], which may lead to biased results [3]. Although other methods to handle missing values in the index test are available, they are rarely used, owing to a lack of systematic comparisons and recommendations [4]. We therefore conducted two simulation studies to compare the performance of different methods in estimating the AUC of a continuous index test (study 1) and the sensitivity and specificity of a binary index test (study 2) when the index test has missing values.

Methods: We simulated data for a reference standard, an index test, and three covariates, varying the sample size, the prevalence of the target condition, the correlation between index test and covariates (study 1 only), and the true AUC or the true sensitivity and specificity. Subsequently, we induced missing values in the index test, varying the proportion of missing values and the missingness mechanism. Several methods (study 1: multiple imputation (MI), empirical likelihood, and inverse probability weighting approaches; study 2: single imputation, MI, and product multinomial framework-based methods) were compared with CCA regarding their performance in estimating the true AUC or the true sensitivity and specificity, respectively.
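As a rough illustration of this simulation design, the following minimal sketch generates a reference standard, a continuous index test, and a single covariate, induces missing values under MCAR or MAR, and estimates the AUC by CCA. All function names, parameter values, the binormal data model, and the logistic missingness model are assumptions made for illustration only; the abstract does not specify the data-generating mechanisms actually used, and the studies used three covariates rather than one.

```python
import math
import random


def simulate(n, prevalence, shift, rho, rng):
    """Simulate rows (d, x, z): reference standard d, index test x, covariate z."""
    rows = []
    for _ in range(n):
        d = 1 if rng.random() < prevalence else 0
        z = rng.gauss(0.0, 1.0)
        # Within each disease group x has unit variance and correlation rho with z.
        x = shift * d + rho * z + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        rows.append((d, x, z))
    return rows


def induce_missing(rows, mechanism, rng, intercept=-0.5, slope=1.5):
    """Set the index test to None with a probability given by the mechanism."""
    out = []
    for d, x, z in rows:
        if mechanism == "MCAR":
            logit = intercept              # same probability for every row
        elif mechanism == "MAR":
            logit = intercept + slope * z  # depends only on the observed covariate
        else:
            raise ValueError("mechanism must be 'MCAR' or 'MAR'")
        p_missing = 1.0 / (1.0 + math.exp(-logit))
        out.append((d, None if rng.random() < p_missing else x, z))
    return out


def auc(pos, neg):
    """Empirical AUC (Mann-Whitney): P(x_pos > x_neg) + 0.5 * P(tie)."""
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def cca_auc(rows):
    """Complete case analysis: drop rows whose index test is missing."""
    pos = [x for d, x, _ in rows if d == 1 and x is not None]
    neg = [x for d, x, _ in rows if d == 0 and x is not None]
    return auc(pos, neg)


rng = random.Random(2025)
full = simulate(n=1000, prevalence=0.3, shift=1.0, rho=0.5, rng=rng)
# Binormal model with equal unit variances: AUC = Phi(shift / sqrt(2)).
true_auc = 0.5 * (1.0 + math.erf(1.0 / 2.0))
for mechanism in ("MCAR", "MAR"):
    observed = induce_missing(full, mechanism, rng)
    print(mechanism, "CCA AUC:", round(cca_auc(observed), 3),
          "true AUC:", round(true_auc, 3))
```

A full simulation study would repeat these steps over many replications and scenario combinations and summarise bias and other performance measures; the sketch shows a single replication only.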

Results: For the AUC under missing completely at random (MCAR) with many missing values, CCA performs well with respect to bias when the sample size is small, and all methods perform well when the sample size is large. If values are missing at random (MAR), all methods are severely biased if the sample size and prevalence are small. An augmented inverse probability weighting method performs well with higher prevalence, and standard MI methods perform well with larger sample size.
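To make the idea behind inverse probability weighting concrete, the sketch below computes a weighted Mann-Whitney AUC in which each complete case is weighted by the inverse of its estimated probability of being observed. This is a plain (non-augmented) IPW estimator, not the augmented variant evaluated in the study, and the stratified estimate of the observation probability (by reference standard and quantile group of one covariate), the function names, and the simulated example data are all assumptions for illustration.

```python
import math
import random


def weighted_auc(pos, neg):
    """Weighted Mann-Whitney AUC; pos and neg are lists of (value, weight)."""
    num = 0.0
    den = 0.0
    for xp, wp in pos:
        for xn, wn in neg:
            w = wp * wn
            den += w
            num += w * (1.0 if xp > xn else 0.5 if xp == xn else 0.0)
    return num / den


def ipw_auc(rows, n_strata=5):
    """rows: (d, x_or_None, z). Weight = 1 / P(observed | d, stratum of z)."""
    zs = sorted(z for _, _, z in rows)
    # Cut points at empirical quantiles of the covariate.
    cuts = [zs[len(zs) * k // n_strata] for k in range(1, n_strata)]

    def stratum(d, z):
        return (d, sum(z >= c for c in cuts))

    total, seen = {}, {}
    for d, x, z in rows:
        key = stratum(d, z)
        total[key] = total.get(key, 0) + 1
        seen[key] = seen.get(key, 0) + (x is not None)
    pos, neg = [], []
    for d, x, z in rows:
        if x is None:
            continue
        key = stratum(d, z)
        weight = total[key] / seen[key]  # inverse of the observed fraction
        (pos if d == 1 else neg).append((x, weight))
    return weighted_auc(pos, neg)


# Illustrative data: index test missing at random given the covariate z.
rng = random.Random(7)
rows = []
for _ in range(1500):
    d = 1 if rng.random() < 0.3 else 0
    z = rng.gauss(0.0, 1.0)
    x = 1.0 * d + 0.5 * z + math.sqrt(0.75) * rng.gauss(0.0, 1.0)
    p_missing = 1.0 / (1.0 + math.exp(0.5 - 1.5 * z))  # logit = -0.5 + 1.5 * z
    rows.append((d, None if rng.random() < p_missing else x, z))
print("IPW AUC:", round(ipw_auc(rows), 3))
```

With small samples or low prevalence, some strata contain very few complete cases, so the weights become large and unstable, which is one plausible reason why weighting approaches can struggle in such scenarios.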

Regarding sensitivity and specificity, MI outperforms all other methods under MCAR and MAR. Under missing not at random (MNAR), most methods in both studies are biased, with some improvements for MI methods when observed covariates are highly correlated with the index test.
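For a binary index test, multiple imputation can be sketched as follows: missing test results are imputed several times from a model fitted to the observed data, sensitivity and specificity are computed in each completed data set, and the results are pooled with Rubin's rules. The imputation model here (a Beta posterior with a uniform prior within strata of the reference standard and one binary covariate), the function names, and the example data are assumptions for illustration; the abstract does not state which MI models were compared.

```python
import random
import statistics


def impute_once(rows, rng):
    """rows: (d, t_or_None, c), all binary. Returns one completed copy."""
    counts = {}
    for d, t, c in rows:
        if t is not None:
            pos, neg = counts.get((d, c), (0, 0))
            counts[(d, c)] = (pos + t, neg + 1 - t)
    # Draw each stratum's positivity rate from its Beta posterior (uniform prior)
    # so that parameter uncertainty carries over into the imputations.
    rate = {}
    for key in sorted({(d, c) for d, _, c in rows}):
        pos, neg = counts.get(key, (0, 0))
        rate[key] = rng.betavariate(pos + 1, neg + 1)
    return [(d, t if t is not None else int(rng.random() < rate[(d, c)]), c)
            for d, t, c in rows]


def sens_spec(rows):
    """Sensitivity, specificity, and the two group sizes of a completed data set."""
    diseased = [t for d, t, _ in rows if d == 1]
    healthy = [t for d, t, _ in rows if d == 0]
    sens = sum(diseased) / len(diseased)
    spec = 1.0 - sum(healthy) / len(healthy)
    return sens, spec, len(diseased), len(healthy)


def pool(estimates, variances):
    """Rubin's rules: pooled estimate and total variance."""
    m = len(estimates)
    within = statistics.fmean(variances)
    between = statistics.variance(estimates) if m > 1 else 0.0
    return statistics.fmean(estimates), within + (1.0 + 1.0 / m) * between


def mi_sens_spec(rows, m=20, seed=0):
    """Return ((sensitivity, variance), (specificity, variance)) pooled over m imputations."""
    rng = random.Random(seed)
    sens, spec, v_sens, v_spec = [], [], [], []
    for _ in range(m):
        se, sp, n_pos, n_neg = sens_spec(impute_once(rows, rng))
        sens.append(se)
        spec.append(sp)
        v_sens.append(se * (1.0 - se) / n_pos)
        v_spec.append(sp * (1.0 - sp) / n_neg)
    return pool(sens, v_sens), pool(spec, v_spec)


# Illustrative data: true sensitivity 0.8, specificity 0.9, MAR given covariate c.
rng = random.Random(11)
rows = []
for _ in range(800):
    d = int(rng.random() < 0.3)
    c = int(rng.random() < (0.7 if d else 0.3))   # covariate related to disease
    t = int(rng.random() < (0.8 if d else 0.1))
    missing = rng.random() < (0.5 if c else 0.2)  # depends on c only
    rows.append((d, None if missing else t, c))
(sens, sens_var), (spec, spec_var) = mi_sens_spec(rows)
print("MI sensitivity:", round(sens, 3), "MI specificity:", round(spec, 3))
```

Under MNAR, the probability of a missing result depends on the unobserved test value itself, so an imputation model of this kind would be misspecified unless the covariates capture most of that dependence.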

Discussion: Most methods perform well when few values are missing. For calculating the AUC with many missing values under MCAR or MAR, we recommend CCA given a small sample size and MI or the augmented inverse probability weighting approach given a large sample size. For calculating sensitivity and specificity, MI should be used under MCAR and MAR. Regardless of the outcome parameter, sensitivity analyses are recommended, especially if MNAR is likely.

Conclusion: While MI outperforms competing methods when estimating sensitivity and specificity for an index test with missing values, there is no single best method for estimating the AUC. For AUC estimation, the augmented inverse probability weighting method, MI and, in some cases, CCA worked well. Their performance, however, depended heavily on the sample size, prevalence, proportion of missing values, and, to a lesser extent, the correlation between index test and covariates. A small effective sample size and MNAR were shown to carry a high risk of biased results in both studies.

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.

The contribution has been accepted as a poster at the MEMTAB conference in Birmingham April/May 2025: Stahlmann K, Juljugin D, Kellerhuis B, Reitsma JB, Dendukuri N, Zapf A. Comparison of methods to handle missing values in the index test in a diagnostic accuracy study – two simulation studies.


References

[1] Shinkins B, Thompson M, Mallett S, Perera R. Diagnostic accuracy studies: how to report and analyse inconclusive test results. BMJ. 2013;346:f2778.
[2] Schuetz GM, Schlattmann P, Dewey M. Use of 3×2 tables with an intention to diagnose approach to assess clinical performance of diagnostic tests: meta-analytical evaluation of coronary CT angiography studies. BMJ. 2012;345:e6717.
[3] Whiting PF, Rutjes AW, Westwood ME, et al. A systematic review classifies sources of bias and variation in diagnostic test accuracy studies. J Clin Epidemiol. 2013;66:1093-1104. DOI: 10.1016/j.jclinepi.2013.05.014
[4] Stahlmann K, Reitsma JB, Zapf A. Missing values and inconclusive results in diagnostic studies - A scoping review of methods. Stat Methods Med Res. 2023;32(9):1842-55.