Optimizing Stroke Recognition with MediaPipe and Machine Learning: An Explainable AI Approach for Facial Landmark Analysis

Ul Karim R.; Mahdi S.; Samin A.; Zereen A.N.; Abdullah-Al-Wadud M.; Uddin J.

dc.contributor.author	Ul Karim R.
dc.contributor.author	Mahdi S.
dc.contributor.author	Samin A.
dc.contributor.author	Zereen A.N.
dc.contributor.author	Abdullah-Al-Wadud M.
dc.contributor.author	Uddin J.
dc.contributor.correspondence	Ul Karim R.
dc.contributor.other	Mahidol University
dc.date.accessioned	2025-03-24T18:21:57Z
dc.date.available	2025-03-24T18:21:57Z
dc.date.issued	2025-01-01
dc.description.abstract	Early detection of stroke is critical for improving survival rates and recovery. This study presents an innovative approach to stroke diagnosis based on the analysis of facial landmarks, combining MediaPipe's facial landmark detection with three machine learning models: Random Forest (RF), Extreme Gradient Boosting (XGB), and Categorical Boosting (CB). A comprehensive dataset of stroke and non-stroke facial images is curated, with MediaPipe extracting 228 facial landmarks from key regions affected by stroke-related asymmetry. Explainable AI (XAI) techniques are applied to enhance model interpretability, allowing for a deeper understanding of the facial regions that contribute most to stroke prediction. The machine learning models are individually optimized through hyperparameter tuning, and a Multimodal Voting Classifier (MVC) is then proposed. By aggregating the predictions through majority voting or weighted averaging, the system reduces the likelihood of incorrect classifications that might arise from a single model's limitations. The result is a more stable and generalized model whose 94.75% accuracy exceeds that of any individual model. Feature importance analysis, facilitated by XAI, identified key facial regions - such as the eyes, cheeks, and lips - as critical indicators of stroke, significantly improving the model's diagnostic precision. Additionally, t-SNE visualization is employed to provide insight into the separation of stroke and non-stroke cases, reinforcing the model's robustness. Real-time metrics show the model's efficiency and adaptability across diverse hardware, enabling practical clinical integration. The proposed system offers a cost-effective, non-invasive diagnostic tool that can be implemented in real time, potentially improving accessibility to stroke diagnosis in remote and resource-limited areas.
dc.identifier.citation	IEEE Access (2025)
dc.identifier.doi	10.1109/ACCESS.2025.3550577
dc.identifier.eissn	21693536
dc.identifier.scopus	2-s2.0-86000506261
dc.identifier.uri	https://repository.li.mahidol.ac.th/handle/123456789/106800
dc.rights.holder	SCOPUS
dc.subject	Materials Science
dc.subject	Computer Science
dc.subject	Engineering
dc.title	Optimizing Stroke Recognition with MediaPipe and Machine Learning: An Explainable AI Approach for Facial Landmark Analysis
dc.type	Article
mu.datasource.scopus	https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=86000506261&origin=inward
oaire.citation.title	IEEE Access
oairecerif.author.affiliation	Woosong University
oairecerif.author.affiliation	King Saud University
oairecerif.author.affiliation	Mahidol University
oairecerif.author.affiliation	BRAC University
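
The abstract describes aggregating model predictions "through majority voting or weighted averaging". As a rough illustration only, the two aggregation rules might look like the sketch below. It is not the authors' implementation: the function names, the example labels and probabilities, the weights, and the 0.5 decision threshold are all assumptions made for this example, and the record does not say how the MVC is actually configured.

```python
from collections import Counter


def majority_vote(labels):
    """Hard voting: return the label predicted by the most models.

    Ties go to the label that appears first in the input list.
    """
    return Counter(labels).most_common(1)[0][0]


def weighted_average_vote(probas, weights, threshold=0.5):
    """Soft voting: weighted mean of per-model stroke probabilities.

    Returns (predicted_label, combined_score), where the label is 1
    (stroke) if the combined score reaches the threshold, else 0.
    """
    if len(probas) != len(weights):
        raise ValueError("probas and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    score = sum(p * w for p, w in zip(probas, weights)) / total_weight
    return (1 if score >= threshold else 0), score


# Hypothetical outputs of three models (e.g. RF, XGB, CB) for one image.
hard_labels = [1, 0, 1]             # assumed class predictions
stroke_probas = [0.80, 0.40, 0.65]  # assumed stroke probabilities
weights = [1.0, 1.0, 2.0]           # assumed per-model weights

hard_result = majority_vote(hard_labels)
soft_result, soft_score = weighted_average_vote(stroke_probas, weights)
print(hard_result, soft_result, round(soft_score, 3))
```

With fitted scikit-learn estimators, the `VotingClassifier` class offers the same two modes through its `voting="hard"` and `voting="soft"` options; whether the authors used it is not stated in this record.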

Collections

Scopus 2025
