Co-occurrence-based error correction approach to word segmentation

Ekawat Chaowicharat; Kanlaya Naruedomkul

Issued Date

2011-12-01

Resource Type

Chapter

DOI

10.4018/978-1-61350-447-5.ch023

Other identifier(s)

2-s2.0-84898589146

Rights

Mahidol University

Rights Holder(s)

SCOPUS

Bibliographic Citation

Cross-Disciplinary Advances in Applied Natural Language Processing: Issues and Approaches. (2011), 354-364

Suggested Citation

Ekawat Chaowicharat, Kanlaya Naruedomkul. Co-occurrence-based error correction approach to word segmentation. Cross-Disciplinary Advances in Applied Natural Language Processing: Issues and Approaches. (2011), 354-364. doi:10.4018/978-1-61350-447-5.ch023. Retrieved from: https://repository.li.mahidol.ac.th/handle/123456789/11746

Title

Co-occurrence-based error correction approach to word segmentation

Author(s)

Ekawat Chaowicharat
Kanlaya Naruedomkul

Other Contributor(s)

Mahidol University

Abstract

A number of word segmentation algorithms have been offered in the past; however, there is still room for improvement. Co-occurrence-Based Error Correction (CBEC), the proposed approach in this chapter, is a novel Thai word segmentation approach that was designed to provide accurate segmentation results based on context and purpose. CBEC quickly segments the input string using any available algorithm; maximal matching was used in the experiment. Next, CBEC checks its segmentation output against an error risk data bank to determine if there is any error risk. The error risk data bank is developed based on a training corpus. The current version of the error risk bank was based on the training corpus available at BEST 2009. Then, CBEC re-segments the input string using the co-occurrence score of the word sequence to ensure the accuracy of the segmentation result. © 2012, IGI Global.
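
The three stages named in the abstract (a fast first-pass segmentation, a lookup in an error risk bank, and a co-occurrence-based re-segmentation) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the data structures or the scoring formula, so the toy English dictionary, the risk bank modelled as a set of risky tokens, the bigram counts, and every function name below are assumptions made only for illustration.

```python
from functools import lru_cache

# Toy data: every entry below is invented for illustration (assumption).
DICTIONARY = {"the", "theta", "table", "bled", "own", "down", "there"}
# Assumed form of the error risk bank: tokens often produced by wrong splits.
ERROR_RISK_BANK = {"theta", "bled"}
# Assumed co-occurrence data: counts of adjacent word pairs in a training corpus.
COOCCURRENCE = {("the", "table"): 5, ("table", "down"): 2, ("down", "there"): 4}
MAX_WORD_LEN = max(len(w) for w in DICTIONARY)


def maximal_matching(text):
    """Greedy left-to-right longest-match segmentation (the first pass)."""
    tokens, i = [], 0
    while i < len(text):
        for length in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            if text[i:i + length] in DICTIONARY:
                break
        # If nothing matched, length ends at 1: emit one unknown character.
        tokens.append(text[i:i + length])
        i += length
    return tokens


def all_segmentations(text):
    """Every split of text into dictionary words (empty list if none exists)."""
    @lru_cache(maxsize=None)
    def split_from(i):
        if i == len(text):
            return [[]]
        results = []
        for j in range(i + 1, min(len(text), i + MAX_WORD_LEN) + 1):
            word = text[i:j]
            if word in DICTIONARY:
                results.extend([word] + rest for rest in split_from(j))
        return results
    return split_from(0)


def cooccurrence_score(tokens):
    """Sum of adjacent-pair counts; a stand-in for the chapter's score."""
    return sum(COOCCURRENCE.get(pair, 0) for pair in zip(tokens, tokens[1:]))


def cbec_segment(text):
    """First pass, risk check, then re-segmentation only when a risk is found."""
    first_pass = maximal_matching(text)
    if not any(tok in ERROR_RISK_BANK for tok in first_pass):
        return first_pass
    candidates = all_segmentations(text) or [first_pass]
    return max(candidates, key=cooccurrence_score)


print(maximal_matching("thetabledownthere"))
print(cbec_segment("thetabledownthere"))
```

In this toy run the greedy first pass yields "theta / bled / own / there", the risk check flags "theta", and the re-segmentation step picks "the / table / down / there" because its adjacent word pairs have the higher co-occurrence total. The exhaustive enumeration is chosen here only for brevity; the abstract does not say how candidate segmentations are generated.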

Keyword(s)

Computer Science

URI

https://repository.li.mahidol.ac.th/handle/123456789/11746

Collections

Scopus 2011-2015
