Extracting feature from unstructured free e-text thyroid ultrasound report
Issued Date
2024
Copyright Date
2020
Resource Type
Language
eng
File Type
application/pdf
No. of Pages/File Size
x, 76 leaves: ill.
Access Rights
open access
Rights
This work is copyrighted by Mahidol University. It is reserved for educational use only. The source must be cited, the content must not be modified, and the work must not be used for commercial purposes.
Rights Holder(s)
Mahidol University
Bibliographic Citation
Thesis (M.Sc. (Data Science for Health Care))--Mahidol University 2020
Suggested Citation
Parin Kittipongdaja. Extracting feature from unstructured free e-text thyroid ultrasound report. Thesis (M.Sc. (Data Science for Health Care))--Mahidol University 2020. Retrieved from: https://repository.li.mahidol.ac.th/handle/20.500.14594/99480
Title
Extracting feature from unstructured free e-text thyroid ultrasound report
Author(s)
Parin Kittipongdaja
Abstract
Thyroid ultrasound reports are stored as unstructured narrative text, which hinders gathering the data for further use. This research aimed to develop automatic clinical information extraction, using a rule-based approach and a machine-learning approach, to extract clinical information from thyroid ultrasound reports and store it in a structured format. The models employed Natural Language Processing (NLP) techniques to extract relevant clinical information based on Thyroid Imaging Reporting and Data System (TI-RADS) lexicons. A total of 60 thyroid ultrasound reports containing 477 sentences were annotated by an experienced clinician and used for model evaluation. The rule-based approach achieved 80.52% accuracy, 72.59% sensitivity (recall), 88.14% precision, 89.25% specificity, and a 79.62% F1 score, with an average processing time per report of 0.0369 seconds (standard deviation 0.0144 seconds). The machine-learning approach achieved 82.34% accuracy, 77.60% sensitivity (recall), 88.04% precision, 87.82% specificity, and an 82.49% F1 score, with an average processing time per report of 0.1917 seconds (standard deviation 0.0344 seconds). Despite the small training set and the lack of an inter-annotator variability evaluation, both models showed promisingly high performance with short processing times and could be implemented in real clinical settings for tasks such as information gathering, building a thyroid ultrasound database, and improving communication between ultrasound radiologists and diagnostic physicians.
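The evaluation measures named in the abstract follow the standard confusion-matrix definitions. The sketch below is illustrative only and is not taken from the thesis: the function names are hypothetical, and because the abstract does not report the underlying counts, the check works from the published precision and recall figures. Since the F1 score is the harmonic mean of precision and recall, recomputing it from the rounded percentages should land close to the reported F1 values.

```python
def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard metrics from confusion-matrix counts (hypothetical helper)."""
    def safe_div(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a denominator is zero.
        return num / den if den else 0.0

    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "sensitivity": safe_div(tp, tp + fn),   # also called recall
        "precision": safe_div(tp, tp + fp),
        "specificity": safe_div(tn, tn + fp),
    }


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, converted to fractions.
REPORTED = {
    "rule-based": {"precision": 0.8814, "recall": 0.7259, "f1": 0.7962},
    "machine-learning": {"precision": 0.8804, "recall": 0.7760, "f1": 0.8249},
}

if __name__ == "__main__":
    for name, m in REPORTED.items():
        recomputed = f1_score(m["precision"], m["recall"])
        # Small differences are expected because the inputs are rounded.
        print(f"{name}: reported F1 {m['f1']:.4f}, recomputed {recomputed:.4f}")
```

Recomputing F1 this way is a quick consistency check on any reported set of metrics. The other four measures cannot be rechecked from the abstract alone, because they depend on the true and false positive and negative counts.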
Description
Data Science for Health Care (Mahidol University 2020)
Degree Name
Master of Science
Degree Level
Master's degree
Degree Department
Faculty of Medicine Ramathibodi Hospital
Degree Discipline
Data Science for Health Care
Degree Grantor(s)
Mahidol University