Publication: A hybrid approach to discover semantic hierarchical sections in scholarly documents
Issued Date
2015-11-20
ISSN
15205363
Other identifier(s)
2-s2.0-84962602612
Rights
Mahidol University
Rights Holder(s)
SCOPUS
Bibliographic Citation
Proceedings of the International Conference on Document Analysis and Recognition, ICDAR. Vol.2015-November, (2015), 1081-1085
Suggested Citation
Suppawong Tuarob, Prasenjit Mitra, C. Lee Giles. A hybrid approach to discover semantic hierarchical sections in scholarly documents. Proceedings of the International Conference on Document Analysis and Recognition, ICDAR. Vol.2015-November, (2015), 1081-1085. doi:10.1109/ICDAR.2015.7333927. Retrieved from: https://repository.li.mahidol.ac.th/handle/20.500.14594/35790
Title
A hybrid approach to discover semantic hierarchical sections in scholarly documents
Author(s)
Suppawong Tuarob, Prasenjit Mitra, C. Lee Giles
Abstract
© 2015 IEEE. Scholarly documents are usually composed of sections, each of which serves a different purpose by conveying specific context. The ability to identify sections automatically would make it possible to understand how content differs across the sections of a document, such as what is covered in the introduction, which methodologies are used, the types of experiments, and trends. We propose a set of hybrid algorithms to 1) automatically identify section boundaries, 2) recognize standard sections, and 3) build a hierarchy of sections. Our algorithms achieve an F-measure of 92.38% in section boundary detection, 96% average accuracy in standard section recognition, and 95.51% accuracy in the section positioning task.