Co-occurrence-based error correction approach to word segmentation

Ekawat Chaowicharat; Kanlaya Naruedomkul

dc.contributor.author	Ekawat Chaowicharat	en_US
dc.contributor.author	Kanlaya Naruedomkul	en_US
dc.contributor.other	Mahidol University	en_US
dc.date.accessioned	2018-05-03T08:08:29Z
dc.date.available	2018-05-03T08:08:29Z
dc.date.issued	2011-12-01	en_US
dc.description.abstract	A number of word segmentation algorithms have been offered in the past; however, there is still room for improvement. Co-occurrence-Based Error Correction (CBEC), the proposed approach in this chapter, is a novel Thai word segmentation approach that was designed to provide accurate segmentation results based on context and purpose. CBEC quickly segments the input string using any available algorithm; maximal matching was used in the experiment. Next, CBEC checks its segmentation output against an error risk data bank to determine if there is any error risk. The error risk data bank is developed based on a training corpus. The current version of the error risk bank was based on the training corpus available at BEST 2009. Then, CBEC re-segments the input string using the co-occurrence score of the word sequence to ensure the accuracy of the segmentation result. © 2012, IGI Global.	en_US
dc.identifier.citation	Cross-Disciplinary Advances in Applied Natural Language Processing: Issues and Approaches. (2011), 354-364	en_US
dc.identifier.doi	10.4018/978-1-61350-447-5.ch023	en_US
dc.identifier.other	2-s2.0-84898589146	en_US
dc.identifier.uri	https://repository.li.mahidol.ac.th/handle/123456789/11746
dc.rights	Mahidol University	en_US
dc.rights.holder	SCOPUS	en_US
dc.source.uri	https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84898589146&origin=inward	en_US
dc.subject	Computer Science	en_US
dc.title	Co-occurrence-based error correction approach to word segmentation	en_US
dc.type	Chapter	en_US
dspace.entity.type	Publication
mu.datasource.scopus	https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84898589146&origin=inward	en_US
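
The three CBEC steps described in the abstract (a fast first-pass segmentation, a check against an error risk data bank, and a co-occurrence-based re-segmentation) can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon, the risk bank entries, the co-occurrence counts, and every function name below are hypothetical toy values chosen for illustration, whereas the chapter derives its error risk bank from the BEST 2009 training corpus. The sketch also assumes that the risk bank stores adjacent word pairs and that the co-occurrence score is a simple sum of pair counts; the abstract does not specify either detail.

```python
# Toy data, all hypothetical. "ตากลม" is a well-known ambiguous Thai string:
# ตา|กลม ("round eyes") versus ตาก|ลม ("exposed to wind").
LEXICON = {"เธอ", "ตา", "ตาก", "กลม", "ลม"}

# Assumed form of the error risk bank: word pairs that the first-pass
# segmenter is known to get wrong (hand-written here, not corpus-derived).
RISK_BANK = {("ตาก", "ลม")}

# Assumed co-occurrence counts for adjacent word pairs.
COOCCURRENCE = {("เธอ", "ตา"): 3, ("ตา", "กลม"): 5, ("ตาก", "ลม"): 2}


def segmentations(text, lexicon):
    """Yield every split of text into lexicon words, trying longer words first."""
    if not text:
        yield []
        return
    for end in range(len(text), 0, -1):
        head = text[:end]
        if head in lexicon:
            for rest in segmentations(text[end:], lexicon):
                yield [head] + rest


def maximal_matching(text, lexicon):
    """First pass: pick the candidate with the fewest words.

    Ties go to the candidate enumerated first, i.e. the one with longer
    words earlier in the string.
    """
    candidates = list(segmentations(text, lexicon))
    return min(candidates, key=len) if candidates else [text]


def adjacent_pairs(words):
    """Return the consecutive word pairs of a segmentation."""
    return list(zip(words, words[1:]))


def has_error_risk(words, risk_bank):
    """Report whether any adjacent word pair appears in the risk bank."""
    return any(pair in risk_bank for pair in adjacent_pairs(words))


def cooccurrence_score(words, counts):
    """Sum the co-occurrence counts of adjacent pairs (unseen pairs count 0)."""
    return sum(counts.get(pair, 0) for pair in adjacent_pairs(words))


def cbec_segment(text, lexicon, risk_bank, counts):
    """Segment text; re-segment by co-occurrence score only when risk is found."""
    first_pass = maximal_matching(text, lexicon)
    if not has_error_risk(first_pass, risk_bank):
        return first_pass
    candidates = list(segmentations(text, lexicon))
    return max(candidates, key=lambda words: cooccurrence_score(words, counts))


if __name__ == "__main__":
    text = "เธอตากลม"
    print(maximal_matching(text, LEXICON))
    print(cbec_segment(text, LEXICON, RISK_BANK, COOCCURRENCE))
```

With this toy data the first pass returns เธอ|ตาก|ลม, which contains a pair listed in the risk bank, so the sketch re-scores all candidates and selects เธอ|ตา|กลม. Enumerating every candidate is exponential in the worst case; it is used here only to keep the example short.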

Collections

Scopus 2011-2015
