Publication:
An optimal approach towards recognizing broken Thai characters in OCR systems

dc.contributor.author: Chaivatna Sumetphong [en_US]
dc.contributor.author: Supachai Tangwongsan [en_US]
dc.contributor.other: Mahidol University [en_US]
dc.date.accessioned: 2018-06-11T04:44:43Z
dc.date.available: 2018-06-11T04:44:43Z
dc.date.issued: 2012-12-01 [en_US]
dc.description.abstract: This paper presents a novel technique for recognizing broken Thai characters found in degraded Thai text documents by modeling the task as a set-partitioning problem (SPP). The technique searches for the optimal set-partition of the connected components, in which each subset yields a reconstructed Thai character. Given the non-linear nature of the objective function needed for optimal set-partitioning, we design an algorithm, which we call Heuristic Incremental Integer Programming (HIIP), that employs integer programming (IP) with an incremental approach, using heuristics to hasten convergence. To generate corrected Thai words, we adopt a probabilistic generative approach based on a Thai dictionary corpus. The proposed technique is applied successfully to a Thai historical document and a poor-quality Thai fax document, with promising accuracy rates over 93%. © 2012 IEEE. [en_US]
dc.identifier.citation: 2012 International Conference on Digital Image Computing Techniques and Applications, DICTA 2012. (2012) [en_US]
dc.identifier.doi: 10.1109/DICTA.2012.6411736 [en_US]
dc.identifier.other: 2-s2.0-84874352445 [en_US]
dc.identifier.uri: https://repository.li.mahidol.ac.th/handle/20.500.14594/14005
dc.rights: Mahidol University [en_US]
dc.rights.holder: SCOPUS [en_US]
dc.source.uri: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84874352445&origin=inward [en_US]
dc.subject: Computer Science [en_US]
dc.title: An optimal approach towards recognizing broken Thai characters in OCR systems [en_US]
dc.type: Conference Paper [en_US]
dspace.entity.type: Publication
mu.datasource.scopus: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84874352445&origin=inward [en_US]
