Prompt Engineering Strategies for Context-Aware Medical Text Anonymization Using LLMs: Insights from the GraSCCo Corpus

Meeting Abstract

DOI: 10.3205/25gmds210, URN: urn:nbn:de:0183-25gmds2101

Markus Wolfien

Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, TUD Dresden University of Technology, Dresden, Germany; Center for Scalable Data Analytics and Artificial Intelligence Dresden/Leipzig, Dresden, Germany

Florin Teschner

Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, TUD Dresden University of Technology, Dresden, Germany

Hung Manh Nguyen

Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, TUD Dresden University of Technology, Dresden, Germany

Martin Sedlmayr

Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, TUD Dresden University of Technology, Dresden, Germany

Publisher: German Medical Science GMS Publishing House, Düsseldorf

Keywords: German clinical text, large language model (LLM), prompt engineering, anonymization, semantic pre-processing, de-identification

Date: 2026-04-01. Language: English.

This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License.

Conference: 70th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Jena, 7–11 September 2025. Session PS 12: Machine learning and AI applications. Abstr. 239.

Funding: Federal Ministry of Research, Technology and Space (BMFTR)

Introduction: The anonymization of clinical texts remains an ongoing challenge for enabling secondary use of healthcare data. With the increasing capabilities of large language models (LLMs) like ChatGPT-4.0, new opportunities arise for automating de-identification tasks [1], [2]. However, performance is highly sensitive to prompt design and document formatting [3]. This study evaluates how different prompt engineering strategies and input structuring affect anonymization quality on synthetic German discharge letters from the GraSCCo corpus [4], [5].

Methods: Three anonymization strategies were compared using ChatGPT-4.0: (i) a single static prompt applied in a continuous session, (ii) prompt renewal with an isolated session per document, and (iii) structured input with semantically segmented sections combined with prompt renewal. All approaches used the GeMTeX anonymization guideline as a reference. Outputs were manually reviewed and evaluated using precision, recall, F1-score, and error rate.

Results: Anonymization performance remained largely stable across iterations, with F1-scores ranging from about 0.72 (static prompt) to 0.79 (structured input). The error rate dropped from 19.4% to 7.6%, indicating a slight benefit of both prompt renewal and document structuring. However, these improvements were accompanied by a notable increase in false positives, particularly the masking of non-identifying medical terms such as medications and lab values.

Discussion: Prompt engineering and input formatting can affect the reliability of LLM-based anonymization [6]. While structured prompting did not improve the overall F1-score, it increased over-masking, underscoring the need for a careful balance between data utility and privacy. Future work should explore prompt fine-tuning, guided pre-processing based on segment length, and hybrid approaches that combine LLMs with rule-based verification. Local deployment, using tools such as Ollama or open-weight models such as DeepSeek, may support clinical integration under privacy-sensitive conditions.

Acknowledgements: This work was supported by the Federal Ministry of Research, Technology and Space (BMFTR) as part of the GeMTeX-Project (FKZ: 01ZZ2314F).

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.

References:

[1] Patsakis C, Lykousas N. Man vs the machine in the struggle for effective text anonymisation in the age of large language models. Sci Rep. 2023 Sep 25;13(1):16026.

[2] Liu Z, Huang Y, Yu X, Zhang L, Wu Z, Cao C, Dai H, Zhao L, Li Y, Shu P, Zeng F, Sun L, Liu W, Shen D, Li Q, Liu T, Zhu D, Li X. DeID-GPT: Zero-shot Medical Text De-Identification by GPT-4 [Preprint]. arXiv. 2023. DOI: 10.48550/arXiv.2303.11032

[3] Shusterman R, Waters AC, O’Neill S, Bangs M, Luu P, Tucker DM. An active inference strategy for prompting reliable responses from large language models in medical practice. Npj Digit Med. 2025 Feb 22;8(1):1–10.

[4] Modersohn L, Schulz S, Lohr C, Hahn U. GRASCCO – The First Publicly Shareable, Multiply-Alienated German Clinical Text Corpus. In: German Medical Data Sciences 2022 – Future Medicine: More Precise, More Integrative, More Sustainable! IOS Press; 2022. p. 66–72. DOI: 10.3233/SHTI220805

[5] Lohr C, Matthies F, Faller J, Modersohn L, Riedel A, Hahn U, Kiser R, Boeker M, Meineke F. De-Identifying GRASCCO – A Pilot Study for the De-Identification of the German Medical Text Project (GeMTeX) Corpus. In: German Medical Data Sciences 2024. IOS Press; 2024. p. 171–9. DOI: 10.3233/SHTI240853

[6] Wiest IC, Leßmann ME, Wolf F, Ferber D, Treeck MV, Zhu J, Ebert MP, Westphalen CB, Wermke M, Kather JN. Deidentifying Medical Documents with Local, Privacy-Preserving Large Language Models: The LLM-Anonymizer. NEJM AI. 2025 Mar 27;2(4):AIdbp2400537.
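
Illustrative note on the evaluation metrics: the Methods name precision, recall, and F1-score as evaluation measures. As a minimal sketch of how these could be computed from manually reviewed masking decisions, consider the Python snippet below. The class name `MaskingCounts`, the function names, and the example counts are hypothetical and are not taken from the study. The abstract does not define its error rate, so that metric is left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaskingCounts:
    """Hypothetical tally of manually reviewed masking decisions."""
    true_positives: int   # identifying spans that were masked
    false_positives: int  # non-identifying spans that were masked (over-masking)
    false_negatives: int  # identifying spans that were left unmasked


def precision(c: MaskingCounts) -> float:
    """Share of masked spans that were truly identifying."""
    masked = c.true_positives + c.false_positives
    return c.true_positives / masked if masked else 0.0


def recall(c: MaskingCounts) -> float:
    """Share of identifying spans that were actually masked."""
    identifying = c.true_positives + c.false_negatives
    return c.true_positives / identifying if identifying else 0.0


def f1_score(c: MaskingCounts) -> float:
    """Harmonic mean of precision and recall, written in terms of counts."""
    denominator = 2 * c.true_positives + c.false_positives + c.false_negatives
    return 2 * c.true_positives / denominator if denominator else 0.0


if __name__ == "__main__":
    # Made-up counts, chosen only to illustrate over-masking.
    baseline = MaskingCounts(true_positives=8, false_positives=2, false_negatives=2)
    over_masked = MaskingCounts(true_positives=8, false_positives=8, false_negatives=2)
    for name, counts in (("baseline", baseline), ("over-masked", over_masked)):
        print(name, precision(counts), recall(counts), f1_score(counts))
```

In this sketch, additional false positives (for example masked medications or lab values) lower precision and F1-score while recall stays unchanged, which is one way to read the trade-off between privacy and data utility raised in the Discussion.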