Segmentation of Broken Khmer Characters
Issued Date
2022-01-01
Resource Type
Scopus ID
2-s2.0-85141579275
Journal Title
2022 3rd International Conference on Big Data Analytics and Practices, IBDAP 2022
Start Page
65
End Page
68
Rights Holder(s)
SCOPUS
Bibliographic Citation
2022 3rd International Conference on Big Data Analytics and Practices, IBDAP 2022 (2022), 65-68
Suggested Citation
Pravesjit S., Kantawong K., That V., Longpradit P. Segmentation of Broken Khmer Characters. 2022 3rd International Conference on Big Data Analytics and Practices, IBDAP 2022 (2022), 65-68. doi:10.1109/IBDAP55587.2022.9907272 Retrieved from: https://repository.li.mahidol.ac.th/handle/123456789/84350
Title
Segmentation of Broken Khmer Characters
Author(s)
Author's Affiliation
Other Contributor(s)
Abstract
Preprocessing is the first step in handwritten character recognition. Segmentation is an important preprocessing step that separates groups of words into single characters. Broken characters are among the most difficult cases that arise when handwritten characters are segmented. This paper therefore focuses on the segmentation of broken characters. The proposed character segmentation workflow first finds the middle layer of words by projection analysis and bounding box analysis, which initially segments the document image into images of isolated characters and images of broken characters. A thinning algorithm is then applied to extract the skeleton of the characters in the middle layer, and a chain code algorithm is used to find loops, junction points, vertex shapes, and straight lines. Finally, rules based on the aspect ratio (height/width) are used to combine the separated pieces of the broken Khmer characters and reconstruct single isolated characters. The proposed algorithm achieves an accuracy of 79.5%.
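
The abstract does not give implementation details, so the following is only a minimal sketch of what projection analysis, bounding box analysis, and the height/width aspect ratio could look like on a binary image. The function names, the `frac` threshold, and the rule "the middle layer is the longest run of dense rows" are assumptions made for illustration, not the authors' method.

```python
from typing import List, Tuple

# Binary image as a list of rows: 1 = ink, 0 = background (assumed representation).
Image = List[List[int]]


def horizontal_projection(img: Image) -> List[int]:
    """Count the ink pixels in each row (horizontal projection profile)."""
    return [sum(row) for row in img]


def middle_band(img: Image, frac: float = 0.5) -> Tuple[int, int]:
    """Return (top, bottom) row indices of the longest run of rows whose
    ink count is at least frac * the peak count.

    This is a hypothetical rule for locating the "middle layer";
    frac is an illustrative parameter and must be greater than 0.
    """
    profile = horizontal_projection(img)
    peak = max(profile)
    if peak == 0:
        raise ValueError("image contains no ink")
    threshold = frac * peak
    best = (0, -1)  # empty run
    start = None
    # A trailing 0 acts as a sentinel that closes a run ending at the last row.
    for y, count in enumerate(profile + [0]):
        if count >= threshold and start is None:
            start = y
        elif count < threshold and start is not None:
            if y - start > best[1] - best[0] + 1:
                best = (start, y - 1)
            start = None
    return best


def bounding_box(img: Image) -> Tuple[int, int, int, int]:
    """Return (top, left, bottom, right) of all ink pixels, inclusive."""
    rows = [y for y, row in enumerate(img) if any(row)]
    cols = [x for x in range(len(img[0])) if any(row[x] for row in img)]
    if not rows:
        raise ValueError("image contains no ink")
    return rows[0], cols[0], rows[-1], cols[-1]


def aspect_ratio(box: Tuple[int, int, int, int]) -> float:
    """Height divided by width of an inclusive bounding box."""
    top, left, bottom, right = box
    return (bottom - top + 1) / (right - left + 1)
```

In a sketch like this, a piece whose aspect ratio falls far outside the range expected for a whole character would be a candidate for merging with a neighbouring piece; the actual thresholds and merging rules are not stated in the abstract.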
