Automatic categorization of tax forms and component block decomposition

dc.contributor.advisor: Sukanya Phongsuphap
dc.contributor.author: Benjawan Pisuthisombut
dc.date.accessioned: 2024-02-13T03:00:07Z
dc.date.available: 2024-02-13T03:00:07Z
dc.date.copyright: 2007
dc.date.created: 2007
dc.date.issued: 2007
dc.description: Computer Science (Mahidol University 2007)
dc.description.abstract: This paper investigates methods for automatically classifying tax form types and decomposing tax forms into their component blocks. We considered three methods. In the first method, the input image was first separated into component blocks; the type of tax form was then identified by comparing these blocks with the component blocks of template images. In the second and third methods, the type of the input tax form was identified first, and form models and registration techniques were then used to decompose the input form into its component blocks. The second method identified the form type by matching the form-type image taken from the input tax form against a prototype form-type image, using a correlation coefficient. The third method identified the form type by recognizing the characters and digits at the top of the input tax form. Experiments were performed on 520 tax form images covering 26 types, with 20 images per type. The first method achieved an average classification accuracy of 61.15%, but it could not extract all component blocks correctly. The second and third methods achieved form identification accuracies of 88.65% and 100%, respectively, and both extracted all component blocks correctly by using form models for component block decomposition. These results show that combining character recognition of the tax form type with a form model has the potential to serve as the basis of a system for classifying tax form images and decomposing them into component blocks.
dc.format.extent: xv, 255 leaves : ill.
dc.format.mimetype: application/pdf
dc.identifier.citation: Research Project (M.Sc. (Computer Science))--Mahidol University, 2007
dc.identifier.uri: https://repository.li.mahidol.ac.th/handle/20.500.14594/97018
dc.language.iso: eng
dc.publisher: Mahidol University. Mahidol University Library and Knowledge Center
dc.rights: This work is copyrighted by Mahidol University. It is reserved for educational use only. The source must be cited, the content must not be modified, and commercial use is prohibited.
dc.rights.holder: Mahidol University
dc.subject: Document imaging systems
dc.subject: Optical character recognition devices
dc.subject: Image processing
dc.subject: Tax forms -- Research -- Thailand
dc.title: Automatic categorization of tax forms and component block decomposition
dc.title.alternative: การจำแนกประเภทและการแยกส่วนประกอบของแบบฟอร์มภาษีโดยอัตโนมัติ
dc.type: Master Thesis
dcterms.accessRights: open access
mods.location.url: http://mulinet11.li.mahidol.ac.th/e-thesis/4737989.pdf
thesis.degree.department: Faculty of Science
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Mahidol University
thesis.degree.level: Master's degree
thesis.degree.name: Master of Science
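
The second method in the abstract identifies the form type by comparing a form-type image with prototype images using a correlation coefficient. The following is a minimal sketch of that idea, not the thesis implementation. It assumes the Pearson correlation coefficient computed over equally sized grayscale images stored as nested lists. The function names, the prototype labels, and the toy 3x3 images are hypothetical and chosen only for illustration.

```python
from math import sqrt


def correlation_coefficient(image_a, image_b):
    """Pearson correlation between two equally sized grayscale images."""
    # Flatten both images into plain pixel lists.
    a = [pixel for row in image_a for pixel in row]
    b = [pixel for row in image_b for pixel in row]
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat image carries no pattern to correlate
    return cov / sqrt(var_a * var_b)


def classify_form_type(region, prototypes):
    """Return (label, score) for the prototype that correlates best."""
    scores = {
        label: correlation_coefficient(region, proto)
        for label, proto in prototypes.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical prototypes: toy 3x3 patterns standing in for form-type images.
PROTOTYPES = {
    "type_A": [[0, 255, 0], [255, 255, 255], [0, 255, 0]],
    "type_B": [[255, 0, 255], [0, 255, 0], [255, 0, 255]],
}

# A noisy input region that resembles the "type_A" pattern.
noisy_region = [[10, 240, 5], [250, 245, 235], [0, 250, 20]]
label, score = classify_form_type(noisy_region, PROTOTYPES)
```

Because the coefficient subtracts each image's mean and divides by its spread, it is insensitive to uniform brightness and contrast changes between scans. It does assume that the compared regions are already aligned and of equal size.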
