
German Congress of Orthopaedics and Traumatology (DKOU 2025)

Deutsche Gesellschaft für Orthopädie und Unfallchirurgie (DGOU), Deutsche Gesellschaft für Orthopädie und Orthopädische Chirurgie (DGOOC), Deutsche Gesellschaft für Unfallchirurgie (DGU), Berufsverband für Orthopädie und Unfallchirurgie (BVOU)
28.-31.10.2025
Berlin


Meeting Abstract

Improving patient education on tibial osteotomy for knee osteoarthritis management with a customized ChatGPT: A readability and quality evaluation

Benjamin Bartek 1
Stephan Oehme 1
Danko Milinkovic 1
Stephen Fahy 1
1 Centrum für Muskuloskeletale Chirurgie, Charité Universitätsmedizin, Berlin, Germany

Text

Objectives and questions: Knee osteoarthritis (OA) greatly affects patients’ quality of life and often leads to surgical intervention. While total knee arthroplasty (TKA) is a common solution, it may not be ideal for younger patients with unicompartmental OA, who could benefit more from high tibial osteotomy (HTO). Effective patient education is essential for informed decision-making, yet most online health information is too complex for the average person to comprehend. AI tools like ChatGPT offer a potential solution, but their responses often exceed the general public’s literacy level. This study evaluated whether a customized ChatGPT model could enhance readability and source accuracy in patient education on knee OA and tibial osteotomy.

Material and methods: Frequently asked questions about HTO were collected using Google’s “People Also Ask” feature and rewritten at an 8th-grade reading level. Two versions of ChatGPT-4 were compared: the standard model and a fine-tuned version, “The Knee Guide”, optimized for readability and source citation using Instruction-Based Fine-Tuning (IBFT) and Reinforcement Learning from Human Feedback (RLHF). Response quality was assessed using the DISCERN criteria, and readability was assessed using the Flesch Reading Ease Score (FRES) and the Flesch-Kincaid Grade Level (FKGL).
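For readers unfamiliar with the two readability metrics, the following is a minimal sketch of how FRES and FKGL are computed from word, sentence, and syllable counts using the standard published formulas. It is not the tooling used in this study, which the abstract does not describe. The function names `count_syllables` and `readability` are hypothetical, and the syllable counter is a rough vowel-group heuristic assumed here for illustration; dedicated readability tools count syllables more carefully and will give somewhat different scores.

```python
import re


def count_syllables(word):
    """Rough heuristic (an assumption, not a validated counter):
    count runs of consecutive vowels and drop one for a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text):
    """Return (FRES, FKGL) for an English text using the standard formulas."""
    # Naive sentence split on terminal punctuation; abbreviations will be miscounted.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    # Flesch Reading Ease Score: higher means easier to read.
    fres = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    # Flesch-Kincaid Grade Level: approximate US school grade.
    fkgl = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return fres, fkgl
```

Both scores depend only on average sentence length and average syllables per word, so shorter sentences and shorter words raise FRES and lower FKGL.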

Results: The standard ChatGPT-4 model had a mean DISCERN score of 38.41 (range: 25–46), indicating poor quality, while “The Knee Guide” achieved a mean score of 45.9 (range: 33–66), reflecting moderate quality. Inter-rater reliability was strong, with a Cronbach’s alpha of 0.86. Readability improved significantly with “The Knee Guide”, which had a mean FKGL of 8.2 ± 1.42 (range: 5–10.7) and a mean FRES of 60 ± 7.83 (range: 47–76), compared with the standard model’s mean FKGL of 13.9 ± 1.39 (range: 11–16) and mean FRES of 32 ± 8.3 (range: 14–47). These differences were statistically significant (p < 0.001).
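As an illustration of the reliability statistic reported above, the sketch below computes Cronbach's alpha with the standard formula, treating each rater as an "item" and each rated response as an observation. This arrangement is an assumption, since the abstract does not say how the coefficient was calculated. The function name `cronbach_alpha` is hypothetical, and the scores in the usage note are made-up values for demonstration, not study data.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a list of raters.

    ratings: one list per rater, each holding one score per rated response,
             in the same response order for every rater.
    """
    k = len(ratings)
    if k < 2:
        raise ValueError("need at least two raters")
    n = len(ratings[0])
    if n < 2 or any(len(r) != n for r in ratings):
        raise ValueError("each rater needs the same number (>= 2) of scores")

    # Sum of each rater's sample variance across responses.
    rater_variance_sum = sum(variance(r) for r in ratings)
    # Sample variance of the per-response totals summed over raters.
    totals = [sum(scores) for scores in zip(*ratings)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")

    return k / (k - 1) * (1 - rater_variance_sum / total_variance)
```

For example, calling `cronbach_alpha([[40, 35, 45, 30], [42, 33, 44, 31]])` on these invented scores from two raters gives a value close to 1, because the raters rank the four responses almost identically.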

Discussion and conclusions: Fine-tuning ChatGPT significantly enhanced the readability and quality of HTO-related patient education materials. “The Knee Guide” demonstrated the potential of customized AI models in making complex medical information more accessible and easier to understand for patients.