70. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V.
Building a Free and Open-Source Trusted Research Environment: A Researcher-Driven Approach to Clinical Data Sharing Without Data Leaving Institutional Boundaries
Text
Background: Data sharing (DS) of clinical study data is technically, administratively, and legally challenging. It involves multiple processes (extracting, preparing, and transferring data) while ensuring institutional and ethical compliance. Data sharing tasks often require manual intervention by data management personnel. The process becomes much simpler when each site maintains an automated infrastructure for managing metadata, study documentation, analysis tools, and secure access to sensitive datasets. To implement this idea in our data management infrastructure (built around REDCap [1]), we developed a modular data access and delivery tool using Django [2] that automates metadata management, user-driven access requests, and secure data delivery within research environments.
Methods: In our use case, data are retrieved from REDCap [1] via its API. Our tool creates a JSON file with structured metadata. Other data sources can be used as long as their output format is compliant. Multiple data sources from different instances can be integrated by a custom extraction script. The tool then populates a searchable, web-based study catalogue. Researchers can explore variable-level metadata without needing REDCap access and can view supplementary study materials. Data access requests are submitted through the same interface, with users selecting the individual variables of interest. These requests are reviewed and approved or denied by the designated data owner. Upon approval, a cron job collects the requested data, and a secure transfer script places the CSV file in the researcher’s home directory, which is accessible throughout the IBE analysis environment (RStudio etc.). To guarantee data security, Ginko Virtual Monitor provides controlled remote access and prevents data leakage by blocking any data download or copying.
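The catalogue and delivery steps described above can be sketched as follows. This is a minimal illustration, not the published tool: the function names (`build_catalogue`, `catalogue_to_json`, `deliver_approved_subset`) and the catalogue layout are hypothetical, and the input rows are assumed to follow the column names of a REDCap data dictionary export (`field_name`, `form_name`, `field_type`, `field_label`). The actual JSON format and the transfer script used in the tool may differ.

```python
import csv
import json
from pathlib import Path


def build_catalogue(study_id, data_dictionary):
    """Build a variable-level catalogue from REDCap-style data dictionary rows.

    `data_dictionary` is assumed to be a list of dicts, as obtained from a
    JSON metadata export. The output layout is a hypothetical example.
    """
    variables = [
        {
            "name": row["field_name"],
            "form": row["form_name"],
            "type": row["field_type"],
            "label": row["field_label"],
        }
        for row in data_dictionary
    ]
    return {
        "study": study_id,
        "n_variables": len(variables),
        "variables": variables,
    }


def catalogue_to_json(catalogue):
    """Serialise the catalogue; non-ASCII labels (e.g. umlauts) are kept."""
    return json.dumps(catalogue, indent=2, ensure_ascii=False)


def deliver_approved_subset(records, approved_variables, target_dir, filename):
    """Write only the approved variables of `records` to a CSV file.

    `records` is a list of dicts (one per participant). Columns that were
    not approved are dropped silently; approved columns that are missing
    from a record are written as empty cells.
    """
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    out_path = target_dir / filename
    with out_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(
            handle,
            fieldnames=approved_variables,
            extrasaction="ignore",  # ignore columns outside the approval
        )
        writer.writeheader()
        writer.writerows(records)
    return out_path
```

In such a setup, a scheduled job would call `deliver_approved_subset` once per approved request, with `target_dir` pointing to the requesting researcher's home directory. Restricting the output to the approved variable list at write time means that unapproved columns never reach the analysis environment.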
Results: As a use case, data from the PEACHES cohort [3] are delivered automatically to several researchers. More than 2,400 variables are available. Manual processing of one request required five different people and took approximately 40 minutes. The automated workflow reduces this to under two minutes and involves only two individuals, allowing immediate approval. Approved datasets are automatically delivered to user home directories, which can also be accessed by collaborators or support staff. This setup also supports integration with multiple REDCap servers across different institutional installations.
Discussion: Our Trusted Research Environment (TRE) combines free and open-source components into a flexible, reusable framework aligned with the FAIR principles [4]. While REDCap itself is not open-source, it is freely available for non-profit academic use and widely supported. The Django-based catalogue and access workflow will be published under an open-source license to promote reuse. Governance requirements such as informed consent and ethics approval remain necessary. A key strength of the system is its lightweight, modular design: it can be deployed without specialized infrastructure, making it suitable even for smaller institutions. Unlike more complex or federated solutions, this TRE combines strict data containment with researcher autonomy, significantly reducing administrative burden while enabling secure, FAIR-compliant, and ethically responsible data analysis.
Conclusion: Our open-source TRE offers a secure, scalable, and fully in-house solution that enables compliant, researcher-driven clinical data reuse without relying on third-party cloud services or external data transfers.
The authors declare that they have no competing interests.
The authors declare that an ethics committee vote is not required.
References
[1] REDCap. Research Electronic Data Capture. Vanderbilt University; 2025 [cited 2025 Apr 03]. Available from: https://projectredcap.org/
[2] Django Software Foundation. Django: The web framework for perfectionists with deadlines. 2025 [cited 2025 Apr 03]. Available from: https://www.djangoproject.com/
[3] Zentrum für Prävention und Gesundheitsförderung Bayern. Mutter-Kind-Projekt PEACHES. [cited 2025 Apr 03]. Available from: https://www.zpg-bayern.de/mutter-kind-projekt-peaches.html
[4] Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A, et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data. 2016;3:160018. DOI: 10.1038/sdata.2016.18
[5] Farhi E. DARTS: The Data Analysis Remote Treatment Service. J Open Source Softw. 2023;8(90):5562. DOI: 10.21105/joss.05562



