Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS)
07.-11.09.2025
Jena

Meeting Abstract

Clustering neonatal and pediatric intensive care time series using k-means

Anis Abdollahi-Sissan - Embedded Software (Informatik 11) - RWTH Aachen, Aachen, Germany

Camelia Oprea - Embedded Software (Informatik 11) - RWTH Aachen, Aachen, Germany

Lena Olivier - Neonatology Section of the Department of Pediatric and Adolescent Medicine - RWTH Aachen University Hospital, Aachen, Germany

Mark Schoberer - Neonatology Section of the Department of Pediatric and Adolescent Medicine - RWTH Aachen University Hospital, Aachen, Germany

André Stollenwerk - Embedded Software (Informatik 11) - RWTH Aachen, Aachen, Germany

Text

Introduction: Continuous sensor recordings in clinical settings such as neonatal and pediatric intensive care units (NICU/PICU) produce large volumes of unlabeled multivariate time series (MTS) data. Labeling these data is time-intensive, requires trained staff, and is error-prone. Unsupervised learning methods such as clustering could help detect trends and anomalies in these signals and identify relevant deviations from expected patterns. Such methods have already shown promising results in applications such as ECG and respiratory signal clustering [1], [2]. A clustering-based filtering step could support decision-making in NICU/PICU environments, where continuous human monitoring is not possible and the data are more heterogeneous than in adult patients.

Methods: We propose an implementation of k-means clustering to identify recurring physiological patterns and evaluate the feasibility of unsupervised MTS clustering in neonatal and pediatric intensive care data. Clustering was performed on data from nine ventilated neonatal and pediatric patients, covering six clinically relevant parameters: airway flow, airway pressure, chest impedance, heart rate, respiratory rate, and fraction of inspired oxygen. Six scenarios were defined to explore different algorithm design choices, including multivariate and univariate input as well as variations in data resolution, input window length, number of clusters, and centroid initialization strategy.

Preprocessing included resampling the parameters to common frequencies, linear interpolation of small sensor-caused data gaps, splitting segments at longer gaps, and normalization to avoid parameter bias [3]. Data were segmented into overlapping windows ranging from 10 s to 3,600 s using sliding windows with fixed strides (1/3 or 1/5 of the window size), chosen to align with the expected duration of physiological events. This approach is intended to capture temporal patterns adequately while accounting for misalignment. Euclidean distance (ED) was used as the distance metric for clustering because of its computational efficiency and suitability for large volumes of fixed-length time segments.
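The windowing step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the choice of z-normalization (the abstract does not name the normalization method), and the synthetic 1 Hz heart-rate series are all hypothetical.

```python
import math


def z_normalize(values):
    """Scale a series to zero mean and unit variance.

    Assumption: the abstract only says "normalization"; z-scoring is one
    common choice, used here purely for illustration.
    """
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance)
    if std == 0.0:
        return [0.0 for _ in values]  # constant series: no variation to scale
    return [(v - mean) / std for v in values]


def sliding_windows(values, window_size, stride_fraction=3):
    """Cut a series into overlapping fixed-length windows.

    The stride is window_size // stride_fraction, so stride_fraction=3
    corresponds to a stride of 1/3 of the window size.
    """
    stride = max(1, window_size // stride_fraction)
    last_start = len(values) - window_size
    return [values[start:start + window_size]
            for start in range(0, last_start + 1, stride)]


# Hypothetical heart-rate series: 60 samples at an assumed rate of 1 Hz.
heart_rate = [120 + 10 * math.sin(2 * math.pi * t / 20) for t in range(60)]
normalized = z_normalize(heart_rate)

# 30-sample windows with a stride of 10 samples (1/3 of the window size).
windows = sliding_windows(normalized, window_size=30, stride_fraction=3)
print(len(windows), "windows of length", len(windows[0]))
```

With a stride of 1/3 of the window size, consecutive windows share two thirds of their samples, so an event that straddles one window boundary is likely to fall fully inside a neighboring window.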

Results: Because ground truth labels were not available, clustering quality was assessed visually by the research team and clinical experts. While some trends (such as increases or decreases in heart rate and inspired oxygen) were recognized, we could not generally identify a meaningful clustering, especially in the multivariate settings. Larger window sizes often led to convergence into a single cluster, while smaller windows produced many empty clusters or lacked meaningful separation. Nevertheless, simpler patterns such as heart rate minima were clustered successfully. Univariate clustering also captured some parameter-specific patterns.
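To make the notion of empty clusters concrete, the following toy sketch runs a plain k-means (Lloyd's algorithm) with Euclidean distance on four short segments and counts the members of each cluster. It is an illustrative assumption, not the authors' code, and it does not reproduce their results: the segments, the function names, and the predefined centroids are invented for the example.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length segments."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(segments, k, iterations=20, seed=0, initial_centroids=None):
    """Plain k-means; centroids are predefined or sampled at random."""
    rng = random.Random(seed)
    start = initial_centroids or rng.sample(segments, k)
    centroids = [list(c) for c in start]
    labels = [0] * len(segments)
    for _ in range(iterations):
        # Assignment step: each segment joins its nearest centroid.
        labels = [min(range(k), key=lambda j: euclidean(s, centroids[j]))
                  for s in segments]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [s for s, lab in zip(segments, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


def cluster_sizes(labels, k):
    """Number of segments assigned to each cluster index."""
    return [labels.count(j) for j in range(k)]


# Toy segments: two low-valued and two high-valued windows.
segments = [[0.0, 0.1, 0.0], [0.1, 0.0, 0.1],
            [5.0, 5.1, 5.0], [5.1, 5.0, 5.1]]

# Predefined centroids taken from the data: both clusters are populated.
labels, _ = kmeans(segments, k=2,
                   initial_centroids=[segments[0], segments[2]])
print(cluster_sizes(labels, 2))

# A third predefined centroid far from all data never attracts a segment.
far_init = [[0.0, 0.0, 0.0], [5.0, 5.0, 5.0], [100.0, 100.0, 100.0]]
labels_far, _ = kmeans(segments, k=3, initial_centroids=far_init)
sizes_far = cluster_sizes(labels_far, 3)
print(sizes_far)
```

Checking the cluster sizes after each run is a cheap way to spot both failure modes named above: one cluster absorbing nearly all windows, or clusters that remain empty.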

Discussion: Although ED does not account for misalignment between the compared segments, the use of overlapping sliding windows helped align periodic signals such as airway flow. In contrast, clustering less repetitive signals such as heart rate showed limited success. The lack of meaningful structure in the multivariate clustering likely results from mixing signals with differing temporal characteristics. Predefined centroids offered only minor improvements over random initialization, indicating that the data structure dominates the clustering outcomes.
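The sensitivity of ED to misalignment, and the way overlapping windows can compensate for it in periodic signals, can be illustrated with a synthetic sine wave. The period, window length, and stride below are hypothetical values chosen for the example and are not taken from the study.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length segments."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


period = 12  # samples per cycle (assumed)
window = 12  # window length equal to one cycle (assumed)
signal = [math.sin(2 * math.pi * t / period) for t in range(48)]

reference = signal[0:window]

# Same waveform, shifted by half a period: the shapes are identical,
# but ED compares sample by sample and reports a large distance.
d_shifted = euclidean(reference, signal[6:6 + window])

# Same waveform, shifted by a full period: the distance is close to zero.
d_aligned = euclidean(reference, signal[12:12 + window])

# Overlapping windows with a stride of 1/3 of the window size: some of
# the window starts coincide with the phase of the reference.
stride = window // 3
distances_by_start = {
    start: euclidean(reference, signal[start:start + window])
    for start in range(0, len(signal) - window + 1, stride)
}

print(round(d_shifted, 3), round(d_aligned, 3))
print({s: round(d, 3) for s, d in distances_by_start.items()})
```

For a strictly periodic signal, at least some overlapping windows end up in phase with each other. A signal without a fixed period offers no such guarantee, which is consistent with the limited success reported for less repetitive parameters.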

Conclusion: Overall, the results suggest that, while k-means provides a simple baseline for exploring patterns in unlabeled intensive care time series data, its interpretability and clinical utility are limited. Future work should explore time-series-specific distance metrics and alternative clustering algorithms such as k-Shape.
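As one example of a time-series-specific distance, the sketch below computes a shift-invariant, cross-correlation-based distance of the kind commonly associated with k-Shape: one minus the maximum normalized cross-correlation over all integer shifts. It is a simplified illustration under our own assumptions (equal-length inputs, direct O(n²) computation, hypothetical function names and toy pulses), not a k-Shape implementation and not part of the reported study.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length segments."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def shape_based_distance(x, y):
    """1 - max normalized cross-correlation over all integer shifts.

    Assumes len(x) == len(y). Values lie in [0, 2]; 0 means the two
    segments have the same shape up to a shift and a positive scale.
    """
    norm = (math.sqrt(sum(v * v for v in x))
            * math.sqrt(sum(v * v for v in y)))
    if norm == 0.0:
        return 1.0  # convention chosen here for all-zero segments
    n = len(x)
    best = -math.inf
    for shift in range(-(n - 1), n):
        # Cross-correlation at this shift, restricted to the overlap.
        cc = sum(x[i] * y[i - shift]
                 for i in range(n) if 0 <= i - shift < n)
        best = max(best, cc)
    return 1.0 - best / norm


# Two toy segments containing the same pulse at different positions.
early_pulse = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0]
late_pulse = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]

ed = euclidean(early_pulse, late_pulse)
sbd = shape_based_distance(early_pulse, late_pulse)
print(round(ed, 3), round(sbd, 3))
```

In this toy case ED treats the two pulses as clearly different, whereas the shift-invariant distance treats them as the same shape. Whether that property helps on real NICU/PICU recordings remains an open question for the proposed future work.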

The authors declare that they have no competing interests.

The authors declare that a positive ethics committee vote has been obtained.

References

[1] Nezamabadi K, Sardaripour N, Haghi B, Forouzanfar M. Unsupervised ECG Analysis: A Review. IEEE Reviews in Biomedical Engineering. 2022;16:208-224. DOI: 10.1109/RBME.2022.3154893
[2] Robles-Rubio CA, Kearney RE, Bertolizio G, Brown KA. Automatic unsupervised respiratory analysis of infant respiratory inductance plethysmography signals. PLoS ONE. 2020;15:e0238402. DOI: 10.1371/journal.pone.0238402
[3] Singh D, Singh B. Investigating the impact of data normalization on classification performance. Applied Soft Computing. 2020;97:105524. DOI: 10.1016/j.asoc.2019.105524

Citation note

Abdollahi-Sissan A, Oprea C, Olivier L, Schoberer M, Stollenwerk A. Clustering neonatal and pediatric intensive care time series using k-means. In: Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie, editors. 70. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS). Jena, 07.-11.09.2025. Düsseldorf: German Medical Science GMS Publishing House; 2026. DocAbstr. 357.

DOI: 10.3205/25gmds207

License

© Abdollahi-Sissan et al.
This abstract is distributed under the terms of the license Creative Commons Attribution 4.0 International

Published: 2026-04-01
