Deutsche Gesellschaft für Orthopädie und Unfallchirurgie (DGOU), Deutsche Gesellschaft für Orthopädie und Orthopädische Chirurgie (DGOOC), Deutsche Gesellschaft für Unfallchirurgie (DGU), Berufsverband für Orthopädie und Unfallchirurgie (BVOU)
28.-31.10.2025
Berlin

Meeting Abstract

Deep learning bone tumor entity classification model collapses under real-world distribution shifts

Anna Curto Vilalta - Technical University of Munich, Munich, Germany; Klinikum rechts der Isar, Munich, Germany

Ines del Val Guardiola - Technical University of Munich, Munich, Germany; Klinikum rechts der Isar, Munich, Germany

Rüdiger von Eisenhart-Rothe - Klinikum rechts der Isar, Munich, Germany

Sarah Consalvo - Klinikum rechts der Isar, Munich, Germany

Daniel Rueckert

Jendrik Hardes

Florian Hinterwimmer - Technical University of Munich, Munich, Germany; Klinikum rechts der Isar, Munich, Germany

Text

Objectives and questions: Artificial intelligence (AI) models have demonstrated significant potential in classifying bone tumors. However, their clinical adoption remains limited due to poor generalizability across different healthcare centers. This study aims to assess the impact of training AI models on single-center data and evaluate their performance on datasets with distribution shifts caused by variations in imaging centers, scanners, and acquisition conditions.

Material and methods: This retrospective study included X-rays from 635 patients diagnosed with Enchondroma or Atypical Cartilaginous Tumor (ACT). We fine-tuned a pre-trained Vision Transformer to classify bone tumor entities. A weighted cross-entropy loss function was applied to counteract bias towards the majority class (Enchondroma).
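As an illustration of the weighting idea only, the following minimal sketch computes inverse-frequency class weights and a weighted cross-entropy in plain Python. The weighting scheme, the function names, the class counts, and the predicted probabilities are all assumptions made for this example; the abstract does not state which weights or implementation were used.

```python
import math

def inverse_frequency_weights(class_counts):
    """Weight each class by total / (n_classes * count), so rarer classes weigh more."""
    total = sum(class_counts)
    n_classes = len(class_counts)
    return [total / (n_classes * count) for count in class_counts]

def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class), normalised by the sum of the sample weights."""
    numerator = 0.0
    denominator = 0.0
    for p, y in zip(probs, labels):
        w = weights[y]
        numerator += -w * math.log(p[y])
        denominator += w
    return numerator / denominator

# Hypothetical class counts (NOT taken from the study): 0 = Enchondroma, 1 = ACT.
counts = [300, 100]
weights = inverse_frequency_weights(counts)

# Hypothetical predicted probabilities for three cases; the single ACT case
# (label 1) is predicted poorly, as a majority-biased model would do.
probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]
labels = [0, 0, 1]

unweighted_loss = weighted_cross_entropy(probs, labels, [1.0, 1.0])
weighted_loss = weighted_cross_entropy(probs, labels, weights)
print(f"unweighted: {unweighted_loss:.3f}, weighted: {weighted_loss:.3f}")
```

With these made-up numbers the weighted loss is larger than the unweighted one, because the poorly predicted ACT case carries more weight. This is the mechanism by which weighting discourages a model from defaulting to the majority class.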

To assess model robustness, we simulated real-world distribution shifts. We introduced three perturbation scenarios to the test set: (1) sensor noise, modeled by adding Gaussian noise; (2) reduced image quality, simulated via image blurring; and (3) variations in acquisition settings, mimicked by modifying brightness and contrast. Model performance was evaluated on the test dataset using classification metrics, including accuracy, sensitivity, and specificity. For sensitivity and specificity calculations, we considered Enchondroma as the negative class and ACT as the positive class.
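The three perturbation scenarios can be sketched roughly as follows with NumPy. This is a hedged illustration, not the study's pipeline: the function names, the box (mean) filter used as the blur, the noise level, and the brightness and contrast values are all assumptions, and the image is a synthetic array standing in for a grayscale radiograph scaled to [0, 1].

```python
import numpy as np

def add_gaussian_noise(img, sigma, rng):
    """Scenario 1: additive zero-mean Gaussian noise, clipped back to [0, 1]."""
    noisy = img + rng.normal(0.0, sigma, size=img.shape)
    return np.clip(noisy, 0.0, 1.0)

def box_blur(img, k=3):
    """Scenario 2: k x k mean filter with edge padding (k odd); a stand-in blur."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    height, width = img.shape
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + height, dx:dx + width]
    return out / (k * k)

def adjust_brightness_contrast(img, brightness=0.0, contrast=1.0):
    """Scenario 3: scale deviations from the mean (contrast), then add an offset (brightness)."""
    mean = img.mean()
    out = (img - mean) * contrast + mean + brightness
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(0)
image = rng.random((64, 64))  # synthetic stand-in, NOT a real radiograph

# Illustrative parameter values only; the abstract does not report the ones used.
noisy = add_gaussian_noise(image, sigma=0.05, rng=rng)
blurred = box_blur(image, k=3)
shifted = adjust_brightness_contrast(image, brightness=0.1, contrast=1.2)
```

Each function returns an array of the same shape as its input, so the perturbed images can be passed to an unchanged classifier and compared against the original test set.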

Results: As shown in Table 1 [Tab. 1], the model achieved an overall accuracy of 77%, with a sensitivity of 45.5% and a specificity of 89.3% on the test set. These results show that the model struggles to detect the minority class (ACT) because of the class imbalance.
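The three reported figures are linked by a standard identity: accuracy is the prevalence-weighted average of sensitivity and specificity. As a rough check, and assuming all three values refer to the same test set, the identity can be solved for the share of ACT cases, written here as \pi (a symbol introduced only for this illustration). Because the reported values are rounded, the result is approximate.

```latex
\text{accuracy} = \pi \cdot \text{sensitivity} + (1 - \pi) \cdot \text{specificity}

\pi = \frac{\text{specificity} - \text{accuracy}}{\text{specificity} - \text{sensitivity}}
    = \frac{0.893 - 0.770}{0.893 - 0.455} \approx 0.28
```

Under this estimate, roughly 28% of the test cases would be ACT, so a model that labels every case as Enchondroma would still reach an accuracy of about 72%. This is why accuracy alone says little about performance on the minority class.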

**Table 1: Model performance on the original and modified test datasets.**

When robustness was evaluated on the modified test sets, which simulate real-world distribution shifts, the model’s ability to distinguish the two classes collapsed. Under these conditions, sensitivity for ACT dropped to 0%, while specificity (the share of Enchondroma cases classified correctly) reached 100%, indicating that the model classified all cases as Enchondroma.
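The collapse pattern described above can be reproduced with a small sketch that computes accuracy, sensitivity, and specificity with ACT as the positive class. The helper `binary_metrics` and the label lists are hypothetical and chosen only for illustration; they are not the study's data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, and specificity for a two-class problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Guard against empty classes instead of dividing by zero.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }

# Hypothetical labels (NOT study data): 0 = Enchondroma (negative), 1 = ACT (positive).
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

# A collapsed model predicts Enchondroma for every case.
y_pred = [0] * len(y_true)

metrics = binary_metrics(y_true, y_pred, positive=1)
print(metrics)  # sensitivity 0.0, specificity 1.0, accuracy 0.7
```

In this toy example the accuracy stays at the majority-class share (0.7) even though no ACT case is detected, which is why sensitivity and specificity need to be reported alongside accuracy.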

Discussion and conclusions: Our results highlight the challenges of AI bias in bone tumor classification: a model trained on single-center data failed even under very small distribution shifts (Figure 1 [Fig. 1]). This strong reliance on dataset-specific features raises concerns about the reliability of such models in broader clinical settings. To improve robustness and generalizability, multi-center data sharing is essential for developing accurate AI-based diagnostic tools.

Citation Note

Curto Vilalta A, del Val Guardiola I, von Eisenhart-Rothe R, Consalvo S, Rueckert D, Hardes J, Hinterwimmer F. Deep learning bone tumor entity classification model collapses under real-world distribution shifts. In: Deutsche Gesellschaft für Orthopädie und Unfallchirurgie, Deutsche Gesellschaft für Orthopädie und Orthopädische Chirurgie, Deutsche Gesellschaft für Unfallchirurgie, Berufsverband für Orthopädie und Unfallchirurgie, editors. Deutscher Kongress für Orthopädie und Unfallchirurgie (DKOU 2025). Berlin, 28.-31.10.2025. Düsseldorf: German Medical Science GMS Publishing House; 2025. DocAB35-4055.

DOI: 10.3205/25dkou245

License

© Curto Vilalta et al.
This abstract is distributed under the terms of the Creative Commons Attribution 4.0 International License.

Published: 2025-10-31

