Thai type styles recognition

Sutat Saetang

dc.contributor.advisor	Chularat Tanprasert
dc.contributor.advisor	Damras Wongsawang
dc.contributor.author	Sutat Saetang
dc.date.accessioned	2025-02-03T07:45:40Z
dc.date.available	2025-02-03T07:45:40Z
dc.date.copyright	1998
dc.date.created	2025
dc.date.issued	1998
dc.description	Computer science (Mahidol University 1998)
dc.description.abstract	Thai printed character recognition has been a very popular research topic in Thailand. Three commercial Thai OCR software packages are currently available to the public, but none of them preserves the type styles of the original document image (normal, bold, italic, and bold italic) in the output text file. Users who need to retain a document's original type styles must therefore restore them by hand, which is slow and tedious. This research presents a technique for preserving these Thai type styles by combining a specific preprocessing step with a supervised neural network learning algorithm. Only four type styles of Thai typed characters are considered: normal, bold, italic, and bold italic. Distinguishing them requires extracting two main features: the thickness and the inclination of the character. Several types of templates were designed and tested for extracting these two features from the raw bitmap character images. The best template preserves both features and gives an average recognition rate of 95.85% on unseen test patterns. The results confirm that the proposed technique effectively preserves the type styles of Thai typed fonts from the original document image in the output text file.
dc.description.abstract	Thai printed character recognition is one of the most popular current research topics in Thailand. Three software packages of this kind are currently on the Thai market, but none of them can recognize the type styles of the original document, such as normal, bold, italic, and bold italic. Users who want the original document's type styles in the output text file therefore have to correct it by hand afterwards, which is time-consuming and tedious. This research presents a technique for recognizing Thai type styles with the help of a supervised artificial neural network. Four styles are recognized: normal, bold, italic, and bold italic. They can be described by two main features of Thai characters: thickness and inclination. Several templates were designed and tested for extracting these two main features from Thai character images. The best template for extracting the two features achieved a recognition rate of 95.85% on the test data. It can therefore be concluded that the developed technique recognizes Thai type styles from character images effectively.
dc.format.extent	ix, 58 leaves : ill.
dc.format.mimetype	application/pdf
dc.identifier.citation	Research Project (M.Sc. (Computer science))--Mahidol University, 1998
dc.identifier.isbn	9746611364
dc.identifier.uri	https://repository.li.mahidol.ac.th/handle/123456789/103522
dc.language.iso	eng
dc.publisher	Mahidol University. Mahidol University Library and Knowledge Center
dc.rights	This work is copyrighted by Mahidol University and is reserved for educational use only. The source must be cited, the content must not be modified, and the work must not be used for commercial purposes.
dc.rights.holder	Mahidol University
dc.subject	Algorithms
dc.subject	Computer architecture
dc.subject	Prints, Thai
dc.title	Thai type styles recognition
dc.title.alternative	การรู้จำลักษณะพิเศษของตัวอักษรไทย
dc.type	Master Thesis
dcterms.accessRights	open access
mods.location.url	http://mulinet11.li.mahidol.ac.th/e-thesis/scan/1088676.pdf
thesis.degree.department	Faculty of Science
thesis.degree.discipline	Computer science
thesis.degree.grantor	Mahidol University
thesis.degree.level	Master's degree
thesis.degree.name	Master of Science
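
The abstract names two features that separate the four type styles: character thickness and character inclination. The record does not describe the thesis's templates or its neural network, so the following is only a minimal sketch of how two such measurements could be taken from a binary character bitmap. The function names, the `#`/`.` bitmap encoding, the run-length measure of thickness, and the centroid-slope measure of inclination are all assumptions made for illustration, not the author's method.

```python
from statistics import mean


def parse(rows):
    """Convert rows of '#' (ink) and '.' (background) into a 0/1 bitmap.
    The text encoding is an assumption chosen for readability."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


def horizontal_runs(row):
    """Return the lengths of consecutive runs of ink pixels in one row."""
    runs, count = [], 0
    for pixel in row:
        if pixel:
            count += 1
        elif count:
            runs.append(count)
            count = 0
    if count:
        runs.append(count)
    return runs


def stroke_thickness(bitmap):
    """Hypothetical thickness feature: mean horizontal ink run length."""
    runs = [r for row in bitmap for r in horizontal_runs(row)]
    return mean(runs) if runs else 0.0


def inclination(bitmap):
    """Hypothetical inclination feature: least-squares slope of each row's
    ink centroid against the row index, negated so that a right-leaning
    (italic-like) glyph gives a positive value."""
    points = []
    for y, row in enumerate(bitmap):
        xs = [x for x, pixel in enumerate(row) if pixel]
        if xs:
            points.append((y, mean(xs)))
    if len(points) < 2:
        return 0.0
    y_bar = mean(y for y, _ in points)
    x_bar = mean(x for _, x in points)
    num = sum((y - y_bar) * (x - x_bar) for y, x in points)
    den = sum((y - y_bar) ** 2 for y, _ in points)
    return -num / den


# Toy glyphs: a thin upright bar, a thick upright bar, a thin slanted bar.
upright_thin = parse(["..#..."] * 4)
upright_bold = parse(["..##.."] * 4)
slanted_thin = parse(["...#..", "..#...", ".#....", "#....."])

for name, glyph in [("thin", upright_thin), ("bold", upright_bold),
                    ("slanted", slanted_thin)]:
    print(name, stroke_thickness(glyph), inclination(glyph))
```

In a pipeline like the one the abstract outlines, a feature pair of this kind would be computed per character and passed to a supervised classifier with four output classes; that step is omitted here because the record gives no details of the network.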

Collections

Thesis and Thematic paper
