Extracting feature from unstructured free e-text thyroid ultrasound report

dc.contributor.advisor: Ratchainant Thammasudjarit
dc.contributor.advisor: Ammarin Thakkinstian
dc.contributor.advisor: Oraluck Pattanaprateep
dc.contributor.advisor: Pawin Numthavaj
dc.contributor.author: Parin Kittipongdaja
dc.date.accessioned: 2024-07-08T02:55:49Z
dc.date.available: 2024-07-08T02:55:49Z
dc.date.copyright: 2020
dc.date.created: 2020
dc.date.issued: 2024
dc.description: Data Science for Health Care (Mahidol University 2020)
dc.description.abstract: Thyroid ultrasound reports are stored as unstructured narrative text, which hinders gathering the data for further use. This research aimed to develop automatic clinical information extraction, using a rule-based approach and a machine-learning approach, to extract clinical information from thyroid ultrasound reports and store it in a structured format. The models employed Natural Language Processing (NLP) techniques to extract relevant clinical information based on Thyroid Imaging Reporting and Data System (TI-RADS) lexicons. A total of 60 thyroid ultrasound reports containing 477 sentences were annotated by an experienced clinician and used for model evaluation. The rule-based approach achieved 80.52% accuracy, 72.59% sensitivity (recall), 88.14% precision, 89.25% specificity, and a 79.62% F1 score, with an average processing time per report of 0.0369 seconds (standard deviation 0.0144 seconds). The machine-learning approach achieved 82.34% accuracy, 77.60% sensitivity (recall), 88.04% precision, 87.82% specificity, and an 82.49% F1 score, with an average processing time per report of 0.1917 seconds (standard deviation 0.0344 seconds). Despite the small training set and the lack of an inter-annotator variability evaluation, both models showed promisingly high performance with short processing times and could be implemented in real clinical settings, for example to support information gathering, build a thyroid ultrasound database, and improve communication between ultrasound radiologists and diagnostic physicians.
dc.format.extent: x, 76 leaves: ill.
dc.format.mimetype: application/pdf
dc.identifier.citation: Thesis (M.Sc. (Data Science for Health Care))--Mahidol University 2020
dc.identifier.uri: https://repository.li.mahidol.ac.th/handle/20.500.14594/99480
dc.language.iso: eng
dc.publisher: Mahidol University. Mahidol University Library and Knowledge Center
dc.rights: This work is copyrighted by Mahidol University and is reserved for educational use only. The source must be cited; modifying the content and using it for commercial purposes are prohibited.
dc.rights.holder: Mahidol University
dc.subject: Machine learning
dc.subject: Natural language processing (Computer science)
dc.subject: Thyroid gland -- Cancer -- Diagnosis
dc.title: Extracting feature from unstructured free e-text thyroid ultrasound report
dc.type: Master Thesis
dcterms.accessRights: open access
mods.location.url: http://mulinet11.li.mahidol.ac.th/e-thesis/2563/561/6136454.pdf
thesis.degree.department: Faculty of Medicine Ramathibodi Hospital
thesis.degree.discipline: Data Science for Health Care
thesis.degree.grantor: Mahidol University
thesis.degree.level: Master's degree
thesis.degree.name: Master of Science
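
The abstract reports precision, sensitivity (recall), and F1 score for both approaches. As a small consistency check, the sketch below recomputes F1 as the harmonic mean of precision and recall from the percentages quoted in the abstract. The function name `f1_score` and the `reported` dictionary are illustrative assumptions, not taken from the thesis, and small differences in the last digit are expected because the published inputs are already rounded.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in the same units)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)


# Figures as quoted in the abstract, in percent.
reported = {
    "rule-based": {"precision": 88.14, "recall": 72.59, "f1": 79.62},
    "machine-learning": {"precision": 88.04, "recall": 77.60, "f1": 82.49},
}

for name, m in reported.items():
    computed = f1_score(m["precision"], m["recall"])
    print(f"{name}: computed F1 = {computed:.2f}, reported F1 = {m['f1']:.2f}")
```

Both recomputed values agree with the reported F1 scores to within rounding, which suggests the reported F1 follows the standard harmonic-mean definition.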
