Automating Manga Character Analysis: A Robust Deep Vision-Transformer Approach to Facial Landmark Detection
Issued Date
2024-01-01
Resource Type
eISSN
2169-3536
Scopus ID
2-s2.0-85204185759
Journal Title
IEEE Access
Rights Holder(s)
SCOPUS
Bibliographic Citation
IEEE Access (2024)
Suggested Citation
Vachmanus S., Phinklao N., Phongsarnariyakul N., Plongcharoen T., Hotta S., Tuarob S. Automating Manga Character Analysis: A Robust Deep Vision-Transformer Approach to Facial Landmark Detection. IEEE Access (2024). doi:10.1109/ACCESS.2024.3459419 Retrieved from: https://repository.li.mahidol.ac.th/handle/20.500.14594/101342
Abstract
Comics, particularly Japanese manga, are a powerful medium that blends images and text to convey ideas and encapsulate a unique cultural heritage. Going beyond mere entertainment, manga merges diverse styles and content deeply rooted in Japanese cultural heritage. This study applies computer vision analysis to manga images, with a specific focus on facial landmark detection, acknowledging the growing significance of technology in analyzing such images. Through a comprehensive exploration of various methods, the research identifies BERT Pre-Training of Image Transformers (BEiT), an extension of Bidirectional Encoder Representations from Transformers (BERT) to images, as a standout performer due to its efficiency and effectiveness. The BEiT model's success lies in its ability to extract facial features, establishing it as a go-to solution for landmark detection on manga faces. It achieved the lowest Failure Rate among the landmark detection networks compared, approximately 9.4%, with a Mean Average Error of about 4.6 pixels. Beyond its technical accomplishments, this study carries cultural significance, contributing to the ongoing narrative of manga in Japan.