Temporal Fusion of Convolutional and LSTM Networks for Vision-Based Fall Detection Using Anatomical Keypoints
Issued Date
2025-01-01
Scopus ID
2-s2.0-105041620016
Journal Title
2025 28th International Conference on Computer and Information Technology, ICCIT 2025
Start Page
3246
End Page
3251
Rights Holder(s)
SCOPUS
Bibliographic Citation
2025 28th International Conference on Computer and Information Technology, ICCIT 2025 (2025), 3246-3251
Suggested Citation
Tasnim H., Joy A.D., Dutta A., Rabbi R., Zereen A.N. Temporal Fusion of Convolutional and LSTM Networks for Vision-Based Fall Detection Using Anatomical Keypoints. 2025 28th International Conference on Computer and Information Technology, ICCIT 2025 (2025), 3246-3251. doi:10.1109/ICCIT68739.2025.11490414 Retrieved from: https://repository.li.mahidol.ac.th/handle/123456789/117419
Title
Temporal Fusion of Convolutional and LSTM Networks for Vision-Based Fall Detection Using Anatomical Keypoints
Abstract
Falls remain a leading cause of injury and serious health consequences, particularly among elderly individuals. Traditional fall detection systems often rely on wearable devices equipped with sensors, which can be inconvenient. Existing deep learning-based approaches, on the other hand, mostly analyze image or video data directly and involve complex, resource-intensive architectures that are unsuitable for practical, resource-constrained settings. To address these issues, this study proposes a hybrid Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) architecture that leverages both the spatial and temporal dependencies of YOLOv8-extracted anatomical keypoints from sequential video frames of the Le2i fall detection dataset. Motion-based feature engineering and hyperparameter tuning are also applied, enabling the model to achieve an accuracy of 98.48% using only 52 features: 34 anatomical keypoint features and 18 motion features (velocity, rolling mean, and standard deviation). A comparative analysis against baseline LSTM, Recurrent Neural Network (RNN), and Gated Recurrent Unit (GRU) models demonstrates the superior performance of the proposed CNN-LSTM approach. Additionally, to enable practical use of the model in resource-limited settings, a web interface is developed for real-time monitoring, alerts, and space-specific filtering, allowing separate monitoring of personal areas while addressing privacy concerns.
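The abstract names three motion statistics (velocity, rolling mean, and standard deviation) but does not say which keypoint-derived signals they are computed on or what window length is used. As a rough illustration only, the sketch below computes the three statistics for a single per-frame signal. The function name `motion_features`, the `window` parameter and its default, the choice of a trailing window, and the sample hip-height values are all assumptions made for this example and are not taken from the paper.

```python
from statistics import mean, pstdev


def motion_features(signal, window=3):
    """Per-frame (velocity, rolling mean, rolling std) for one 1-D signal.

    `signal` holds one float per video frame (hypothetical example: the
    normalised y-coordinate of a hip keypoint). `window` is an assumed
    trailing-window length, not a value reported in the abstract.
    """
    features = []
    for t, value in enumerate(signal):
        # Velocity: difference from the previous frame; 0.0 for the first frame.
        velocity = value - signal[t - 1] if t > 0 else 0.0
        # Trailing window ending at frame t, shorter at the start of the clip.
        recent = signal[max(0, t - window + 1): t + 1]
        # Population standard deviation, so a one-frame window yields 0.0.
        features.append((velocity, mean(recent), pstdev(recent)))
    return features


# Hypothetical hip heights over five frames (image y grows downward).
hip_y = [0.40, 0.42, 0.55, 0.80, 0.82]
for row in motion_features(hip_y):
    print(tuple(round(x, 3) for x in row))
```

Applying three statistics to six signals would give the 18 motion features mentioned in the abstract, but that decomposition is a guess; the abstract confirms only the total of 18 and the overall count of 52 features.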
