German Congress of Orthopaedics and Traumatology (DKOU 2025)
Deutscher Kongress für Orthopädie und Unfallchirurgie 2025 (DKOU 2025)
Deep learning for cartilaginous tumor classification using CT and MRI
Objectives and questions: Tumors of the musculoskeletal system are rare and heterogeneous, which limits the experience non-tumor experts have with them. Three entities, namely Enchondroma, Chondrosarcoma, and Atypical Cartilaginous Tumor (ACT), are of particular clinical significance because their prognoses and treatment requirements differ. A misdiagnosis or a missed diagnosis can lead to unnecessary interventions or treatment delays. Although Deep Learning (DL) has shown potential in cancer classification, its use in orthopedic oncology is still in its infancy.
This study explores the potential of multi-modal DL models that integrate CT, T1-weighted MRI, and T2-weighted MRI to improve diagnostic accuracy for these tumors. Specifically, we aim to determine whether (1) a combination of modalities is more diagnostically useful than a single modality, (2) models trained on data from one hospital can generalize to another, and (3) the DL model's performance is comparable to that of radiologists.
Material and methods: This retrospective study used data from 369 patients from Institute-1 and 198 patients from Institute-2. Models were developed using ResNet feature extraction and an intermediate fusion strategy based on the EmbraceNet architecture to integrate the different modalities. Both uni-modal and multi-modal approaches were explored, covering each individual modality as well as their combinations. The DL models were trained on data from Institute-1 and evaluated on both a held-out test dataset from Institute-1 and the full dataset from Institute-2. Performance metrics, including F1 score, accuracy, sensitivity, and specificity, were calculated and compared with those of senior and junior radiologists.
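The intermediate fusion step described above can be illustrated with a minimal sketch in the spirit of EmbraceNet, where each modality is first mapped to an embedding of the same size and the fused vector takes each feature from one randomly selected modality. This is not the study's implementation: the function name `embrace_fuse`, the uniform selection probabilities, and the constant toy embeddings are assumptions made for illustration, and the ResNet feature extractors and learned projection layers are omitted.

```python
import random


def embrace_fuse(embeddings, rng):
    """Fuse per-modality embeddings for one sample, EmbraceNet-style.

    embeddings: dict mapping a modality name to a list of floats, or to
                None when that modality is missing for this sample.
                (Hypothetical interface; all vectors must have equal length.)
    rng:        random.Random instance used for the modality sampling.
    """
    present = {name: vec for name, vec in embeddings.items() if vec is not None}
    if not present:
        raise ValueError("at least one modality must be available")
    dims = {len(vec) for vec in present.values()}
    if len(dims) != 1:
        raise ValueError("all embeddings must share one dimension")
    dim = dims.pop()
    names = sorted(present)
    # For every feature index, draw one available modality (uniformly here,
    # which is an assumption) and copy that modality's value at the index.
    return [present[rng.choice(names)][i] for i in range(dim)]


rng = random.Random(0)
# Toy embeddings: constant vectors make the source of each feature visible.
fused_both = embrace_fuse({"CT": [1.0] * 8, "T1": [2.0] * 8}, rng)
fused_ct_only = embrace_fuse({"CT": [1.0] * 8, "T1": None}, rng)
```

Because a missing modality is simply excluded from the sampling, the same fusion code handles uni-modal and multi-modal inputs, which is one reason this family of fusion methods suits datasets where not every patient has every scan.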
Results: The multi-modal model combining CT and T1-weighted MRI outperformed all three uni-modal models as well as the model using all three modalities (CT, T1, T2). It achieved an F1 score of 0.68 on the test dataset from Institute-1, which was comparable to the performance of senior radiologists. Performance on the external dataset from Institute-2 was lower (F1 score: 0.54) and comparable to that of junior radiologists. This difference was due to inter-institutional variability between the datasets, which was demonstrated mathematically.
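For readers less familiar with the reported metrics, the sketch below shows one common way to compute accuracy, F1 score, sensitivity, and specificity for a three-class problem. Each class is treated one-vs-rest, and the per-class F1 scores are macro-averaged. The averaging scheme is an assumption, since the abstract does not state which one was used, and the labels and predictions are toy values, not study data.

```python
def per_class_metrics(y_true, y_pred, labels):
    """One-vs-rest sensitivity, specificity, and F1 for each class."""
    results = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        tn = len(pairs) - tp - fp - fn
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        results[label] = {
            "sensitivity": sensitivity,
            "specificity": specificity,
            "f1": f1,
        }
    return results


def summarize(y_true, y_pred, labels):
    """Return overall accuracy, macro-averaged F1, and per-class metrics."""
    per_class = per_class_metrics(y_true, y_pred, labels)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    macro_f1 = sum(m["f1"] for m in per_class.values()) / len(labels)
    return accuracy, macro_f1, per_class


# Toy example only; these are not results from the study.
labels = ["enchondroma", "act", "chondrosarcoma"]
y_true = ["enchondroma", "enchondroma", "act", "act",
          "chondrosarcoma", "chondrosarcoma"]
y_pred = ["enchondroma", "act", "act", "act",
          "chondrosarcoma", "enchondroma"]
accuracy, macro_f1, per_class = summarize(y_true, y_pred, labels)
```

Macro averaging weights the three tumor entities equally regardless of how many cases each has, so a different averaging scheme could yield different numbers on an imbalanced dataset.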
Discussion and conclusions: The study demonstrates that multi-modal deep-learning models can achieve performance similar to that of clinicians in differentiating Enchondromas, Chondrosarcomas, and Atypical Cartilaginous Tumors (ACT). Combining CT and T1-weighted MRI proved more effective than either modality alone. Interestingly, T2-weighted MRI did not add diagnostic value for the model, whereas radiologists rely on it for diagnosis. The results show the potential of integrating DL into clinical workflows for musculoskeletal tumor detection to reduce diagnostic delays. Future work should focus on improving model generalization by training on more diverse datasets.



