Temporal Fusion of Convolutional and LSTM Networks for Vision-Based Fall Detection Using Anatomical Keypoints

dc.contributor.authorTasnim H.
dc.contributor.authorJoy A.D.
dc.contributor.authorDutta A.
dc.contributor.authorRabbi R.
dc.contributor.authorZereen A.N.
dc.contributor.correspondenceTasnim H.
dc.contributor.otherMahidol University
dc.date.accessioned2026-06-20T18:14:14Z
dc.date.available2026-06-20T18:14:14Z
dc.date.issued2025-01-01
dc.description.abstractFalls remain a leading cause of injury and serious health consequences, particularly among elderly individuals. Traditional fall detection systems often rely on wearable devices equipped with sensors, which can be inconvenient. Existing deep learning-based approaches, on the other hand, mostly analyze image or video data directly and involve complex, resource-intensive architectures that are unsuitable for practical, resource-constrained settings. To address these issues, this study proposes a hybrid Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) architecture that exploits both the spatial and temporal dependencies of YOLOv8-extracted anatomical keypoints from sequential video frames of the Le2i fall detection dataset. Motion-based feature engineering and hyperparameter tuning are also applied, enabling the model to achieve an accuracy of 98.48% using only 52 features, including 34 anatomical keypoints and 18 motion features (velocity, rolling mean, standard deviation). A comparative analysis with baseline LSTM, Recurrent Neural Network (RNN), and Gated Recurrent Unit (GRU) models is also conducted, demonstrating the superior performance of the proposed CNN-LSTM approach. Additionally, to enable practical use of the model in resource-limited settings, a web interface is developed for real-time monitoring, alerts, and space-specific filtering, allowing separate monitoring of personal areas while addressing privacy concerns.
dc.identifier.citation2025 28th International Conference on Computer and Information Technology ICCIT 2025 (2025), 3246-3251
dc.identifier.doi10.1109/ICCIT68739.2025.11490414
dc.identifier.scopus2-s2.0-105041620016
dc.identifier.urihttps://repository.li.mahidol.ac.th/handle/123456789/117419
dc.rights.holderSCOPUS
dc.subjectComputer Science
dc.titleTemporal Fusion of Convolutional and LSTM Networks for Vision-Based Fall Detection Using Anatomical Keypoints
dc.typeConference Paper
mu.datasource.scopushttps://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=105041620016&origin=inward
oaire.citation.endPage3251
oaire.citation.startPage3246
oaire.citation.title2025 28th International Conference on Computer and Information Technology ICCIT 2025
oairecerif.author.affiliationMahidol University
oairecerif.author.affiliationBRAC University
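
The abstract describes a 52-value feature vector per frame: 34 keypoint values plus 18 motion features built from velocity, rolling mean, and standard deviation. The record does not say which signals the motion features are computed from, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the 34 values are 17 (x, y) pairs, that six tracked coordinates (the hypothetical `TRACKED` indices) each contribute three statistics, and that the rolling window is three frames. The function name `motion_features` and all parameters are illustrative.

```python
from statistics import mean, pstdev

# Hypothetical choice of six coordinates inside the 34-value keypoint vector
# (assumed layout: x0, y0, x1, y1, ...). The abstract does not specify these.
TRACKED = [1, 11, 13, 23, 25, 27]


def motion_features(frames, tracked=TRACKED, window=3):
    """Append velocity, rolling mean, and rolling standard deviation
    for each tracked coordinate to every frame's keypoint vector.

    frames: list of 34-value lists, one per video frame, in time order.
    Returns a list of 52-value lists (34 keypoint values + 6 * 3 motion values).
    """
    out = []
    for t, frame in enumerate(frames):
        feats = list(frame)
        start = max(0, t - window + 1)  # window is truncated at the sequence start
        for i in tracked:
            history = [f[i] for f in frames[start:t + 1]]
            # Frame-to-frame difference; defined as 0.0 for the first frame.
            velocity = frame[i] - frames[t - 1][i] if t > 0 else 0.0
            feats += [velocity, mean(history), pstdev(history)]
        out.append(feats)
    return out
```

Calling `motion_features` on a sequence of 34-value frames yields one 52-value vector per frame, which is the per-timestep input width a sequence model such as the CNN-LSTM described in the abstract would consume. The population standard deviation is used here only so that the first frame, with a single-sample window, has a defined value.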
