Optimizing Stroke Recognition with MediaPipe and Machine Learning: An Explainable AI Approach for Facial Landmark Analysis
Issued Date
2025-01-01
eISSN
21693536
Scopus ID
2-s2.0-86000506261
Journal Title
IEEE Access
Rights Holder(s)
SCOPUS
Bibliographic Citation
IEEE Access (2025)
Suggested Citation
Ul Karim R., Mahdi S., Samin A., Zereen A.N., Abdullah-Al-Wadud M., Uddin J. Optimizing Stroke Recognition with MediaPipe and Machine Learning: An Explainable AI Approach for Facial Landmark Analysis. IEEE Access (2025). doi:10.1109/ACCESS.2025.3550577 Retrieved from: https://repository.li.mahidol.ac.th/handle/20.500.14594/106800
Abstract
Early detection of stroke is critical for improving survival rates and recovery. This study presents an innovative approach to stroke diagnosis through the analysis of facial landmarks, combining MediaPipe's facial landmark detection with three machine learning models: Random Forest (RF), Extreme Gradient Boosting (XGB), and Categorical Boosting (CB). A comprehensive dataset of stroke and non-stroke facial images is curated, with MediaPipe extracting 228 facial landmarks from key regions affected by stroke-related asymmetry. Explainable AI (XAI) techniques are applied to enhance model interpretability, allowing for a deeper understanding of the facial regions that contribute most to stroke prediction. The machine learning models are individually optimized through hyperparameter tuning, and we further propose a Multimodal Voting Classifier (MVC). By aggregating the individual predictions through majority voting or weighted averaging, the system reduces the likelihood of incorrect classifications that might arise from a single model's limitations. The result is a more stable and generalized model whose 94.75% accuracy exceeds that of any individual model. Feature importance analysis, facilitated by XAI, identified key facial regions, such as the eyes, cheeks, and lips, as critical indicators of stroke, significantly improving the model's diagnostic precision. Additionally, t-SNE visualization is employed to provide insight into how the stroke and non-stroke cases separate, reinforcing the model's robustness. Real-time metrics show the model's efficiency and adaptability across diverse hardware, enabling practical clinical integration. The proposed system offers a cost-effective, non-invasive diagnostic tool that can be implemented in real time, potentially improving accessibility to stroke diagnosis in remote and resource-limited areas.
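
The two aggregation schemes named in the abstract, majority voting and weighted averaging, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the binary label convention (1 = stroke, 0 = non-stroke), the 0.5 decision threshold, and the example model outputs are all assumptions made for illustration, and the record does not state the weights used in the paper.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Hard voting. predictions[m][i] is model m's label for sample i."""
    n_samples = len(predictions[0])
    voted = []
    for i in range(n_samples):
        # Count the label each model assigned to sample i.
        votes = Counter(model_preds[i] for model_preds in predictions)
        # Keep the most frequent label (with three binary voters there is no tie).
        voted.append(votes.most_common(1)[0][0])
    return voted


def weighted_average_vote(
    probabilities: Sequence[Sequence[float]],
    weights: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[int]:
    """Soft voting. probabilities[m][i] is model m's P(stroke) for sample i."""
    total_weight = sum(weights)
    n_samples = len(probabilities[0])
    voted = []
    for i in range(n_samples):
        # Weighted mean of the per-model stroke probabilities for sample i.
        avg = sum(w * p[i] for w, p in zip(weights, probabilities)) / total_weight
        voted.append(1 if avg >= threshold else 0)
    return voted


# Hypothetical outputs of three models (standing in for RF, XGB, CB) on four samples.
rf_labels, xgb_labels, cb_labels = [1, 0, 1, 0], [1, 1, 0, 0], [0, 1, 1, 0]
hard = majority_vote([rf_labels, xgb_labels, cb_labels])

rf_prob, xgb_prob, cb_prob = [0.9, 0.4, 0.6, 0.1], [0.8, 0.7, 0.3, 0.2], [0.3, 0.6, 0.7, 0.2]
soft = weighted_average_vote([rf_prob, xgb_prob, cb_prob], weights=[1.0, 1.0, 1.0])
```

In this toy example each model mislabels one of the first three samples, yet both aggregation schemes return the same label for all of them, which is the effect the abstract attributes to the voting classifier: an error made by one model is outvoted by the other two.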