Healthcare time-series data frequently contain missing values due to sensor errors, technical failures, and clinical interruptions, which can compromise both patient care quality and research validity. This study compares missing-data imputation methods on ICU mechanical ventilation parameters from two datasets: the public MIMIC-IV database and a private hospital dataset (MINIC). We implemented and assessed five approaches: Nearest Neighbors, Multiple Imputation by Chained Equations (MICE), MICE with Random Forests (MiceForest), Gated Recurrent Unit (GRU) neural networks, and self-supervised learning auto-encoders. Performance was evaluated using R², RMSE, and distribution preservation metrics across different missingness patterns and temporal windows. GRU-based models performed best, achieving R² scores up to 0.98 and consistently lower RMSE than the other methods. MICE-based approaches preserved data distributions but struggled to capture temporal dependencies. These findings suggest that deep learning approaches, particularly GRU networks, are better suited to imputing missing values in healthcare time-series data, especially mechanical ventilation parameters.
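The evaluation idea summarized above (hide known values, impute them, then score the imputed values against the hidden truth with RMSE and R²) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the signal is synthetic rather than drawn from MIMIC-IV or MINIC, the function names are hypothetical, and the forward-fill imputer is a simple stand-in baseline, not one of the five methods assessed in the study.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error over paired values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def forward_fill(series):
    """Stand-in baseline imputer (assumption): carry the last observed value forward.

    Leading gaps, if any, are filled with the first observed value.
    """
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    first_observed = next(v for v in filled if v is not None)
    return [first_observed if v is None else v for v in filled]


# Synthetic stand-in for one ventilation parameter sampled over time (assumption).
rng = random.Random(0)
truth = [20.0 + 5.0 * math.sin(t / 10.0) for t in range(200)]

# Hide 40 values at random positions; index 0 stays observed.
masked_idx = sorted(rng.sample(range(1, 200), 40))
masked_set = set(masked_idx)
observed = [None if i in masked_set else v for i, v in enumerate(truth)]

# Impute, then score only at the positions that were hidden.
imputed = forward_fill(observed)
y_true = [truth[i] for i in masked_idx]
y_pred = [imputed[i] for i in masked_idx]

print(f"RMSE on masked values: {rmse(y_true, y_pred):.3f}")
print(f"R2 on masked values:   {r2_score(y_true, y_pred):.3f}")
```

Scoring only the masked positions keeps the metrics from being inflated by values the imputer never had to estimate. Any other imputer could be substituted for `forward_fill` under the same protocol.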