Measuring Patient Similarities in Clinical Data Repository through Graph Representation
Issued Date
2024-01-01
Resource Type
Scopus ID
2-s2.0-85201392196
Journal Title
Proceedings - 21st International Joint Conference on Computer Science and Software Engineering, JCSSE 2024
Start Page
500
End Page
507
Rights Holder(s)
SCOPUS
Bibliographic Citation
Proceedings - 21st International Joint Conference on Computer Science and Software Engineering, JCSSE 2024 (2024), 500-507
Suggested Citation
Kritsrinopphadol T., Suvirat K., Chairat S., Tangudomkit K., Ingviya T., Bovornchutichai P., Chaichulee S. Measuring Patient Similarities in Clinical Data Repository through Graph Representation. Proceedings - 21st International Joint Conference on Computer Science and Software Engineering, JCSSE 2024 (2024), 500-507. doi:10.1109/JCSSE61278.2024.10613727 Retrieved from: https://repository.li.mahidol.ac.th/handle/20.500.14594/100591
Corresponding Author(s)
Other Contributor(s)
Abstract
Patient similarity aims at identifying patients with similar profiles. This benefits the diagnosis and treatment of complex or rare conditions as well as the formation of homogeneous cohorts for retrospective studies or clinical trials. This study introduces a framework that enables fast retrieval of similar patients in the clinical data repository (CDR) by representing both the patient's medical history and structured medical concepts in a graph format. Our approach utilises longitudinal patient data that includes detailed records of all visits, including patient age, care units attended, diagnoses made, and medications prescribed. The hierarchical concepts of medical knowledge are also incorporated into the graph to improve contextual understanding. The graph representation facilitates the embedding of nodes based on their contextual relationships and structural patterns within the graph. We use the Node2Vec embedding to convert nodes in the graph into vector representations in high-dimensional space and subsequently employ approximate nearest neighbours (ANN) for efficient search. We evaluated our approach on a large real-world dataset of over 172,394 patients and 3,439,340 visits as well as on the identification of patients with specific target diagnoses. Our approach provides a simple, fast, and efficient way to estimate patient similarities in clinical databases.
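The first stage described in the abstract, building a patient graph and sampling it with Node2Vec-style walks, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy records, the node naming scheme (`patient:`, `visit:`, `dx:`, `rx:`), the hierarchy edges, and the functions `build_graph` and `biased_walk` are all hypothetical and chosen only for illustration. The walk uses the standard Node2Vec return parameter `p` and in-out parameter `q`.

```python
import random
from collections import defaultdict

# Hypothetical toy visits: (patient, visit, age band, care unit, diagnoses, medications).
VISITS = [
    ("patient:1", "visit:1a", "age:60-69", "unit:cardiology", ["dx:I10"], ["rx:amlodipine"]),
    ("patient:2", "visit:2a", "age:60-69", "unit:cardiology", ["dx:I10"], ["rx:amlodipine"]),
    ("patient:3", "visit:3a", "age:20-29", "unit:orthopaedics", ["dx:S52"], ["rx:paracetamol"]),
]
# Hypothetical concept hierarchy edges (child concept, parent concept).
HIERARCHY = [("dx:I10", "dx:I10-I15"), ("dx:S52", "dx:S50-S59")]


def build_graph(visits, hierarchy):
    """Build an undirected graph as an adjacency map: node -> set of neighbours."""
    graph = defaultdict(set)

    def link(a, b):
        graph[a].add(b)
        graph[b].add(a)

    for patient, visit, age, unit, diagnoses, medications in visits:
        link(patient, visit)
        for concept in [age, unit, *diagnoses, *medications]:
            link(visit, concept)
    for child, parent in hierarchy:
        link(child, parent)
    return graph


def biased_walk(graph, start, length, p=1.0, q=1.0, rng=random):
    """Sample one second-order random walk with Node2Vec's p/q bias."""
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbours = sorted(graph[current])  # sorted for reproducibility
        if not neighbours:
            break
        if len(walk) == 1:
            # The first step has no previous node, so it is uniform.
            walk.append(rng.choice(neighbours))
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbours:
            if candidate == previous:
                weights.append(1.0 / p)  # step back to the previous node
            elif candidate in graph[previous]:
                weights.append(1.0)      # stay close to the previous node
            else:
                weights.append(1.0 / q)  # move further away
        walk.append(rng.choices(neighbours, weights=weights, k=1)[0])
    return walk


graph = build_graph(VISITS, HIERARCHY)
walk = biased_walk(graph, "patient:1", length=10, p=1.0, q=0.5, rng=random.Random(0))
print(walk)
```

In a full pipeline, many such walks per node would be fed to a skip-gram model to obtain node vectors, and the patient vectors would then be placed in an ANN index for retrieval; both of those steps are omitted here.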