Enhancing agreement in cardiotocography interpretation between midwives and obstetricians through a rule-based AI program: A comparative cross-sectional study
Issued Date
2026-01-01
ISSN
00207292
eISSN
18793479
Scopus ID
2-s2.0-105031065856
Pubmed ID
41732906
Journal Title
International Journal of Gynecology and Obstetrics
Rights Holder(s)
SCOPUS
Bibliographic Citation
International Journal of Gynecology and Obstetrics (2026)
Suggested Citation
Kaewsrinual S., Homdee N., Rekhawasin Pinnington T., Surasereewong S., Chanprapaph P. Enhancing agreement in cardiotocography interpretation between midwives and obstetricians through a rule-based AI program: A comparative cross-sectional study. International Journal of Gynecology and Obstetrics (2026). doi:10.1002/ijgo.70868 Retrieved from: https://repository.li.mahidol.ac.th/handle/123456789/115509
Abstract
Objective: To evaluate whether a rule-based artificial intelligence (AI) program can enhance interrater agreement in cardiotocography (CTG) interpretation between nurse–midwives and obstetricians.
Methods: CTG data from 50 singleton pregnancies at ≥32 weeks of gestation were used to develop a rule-based AI program based on National Institute of Child Health and Human Development (NICHD) 2008 guidelines, with content validity confirmed (item–objective congruence = 0.85). A 22-item CTG test representing NICHD categories I to III was generated using a local obstetrician consensus reference standard, defined as ≥70% agreement among seven obstetricians. Twenty nurse–midwives interpreted the same CTG tracings twice, before and after AI support, with a 1- to 2-month interval, while obstetricians completed the test once to establish the reference standard. Interrater agreement was evaluated relative to this local expert consensus, rather than neonatal outcomes or an external gold standard, using quadratic weighted kappa for ordinal data and intraclass correlation coefficients (ICC) for continuous data. Interpretation time and user satisfaction were also assessed.
Results: AI support significantly improved agreement across all ordinal parameters. Agreement on NICHD category interpretation increased from moderate (κ = 0.548) to almost perfect (κ = 0.906, P < 0.001). Improvements were also observed for baseline variability (κ = 0.459 to 0.853), fetal heart rate category (κ = 0.669 to 0.868), prolonged decelerations (κ = 0.719 to 0.963), and acceleration count (κ = 0.482 to 0.723). ICCs for variable and late decelerations improved from poor to good (0.328 to 0.725 and 0.304 to 0.676, respectively), whereas early decelerations remained low. Interpretation time decreased by a mean of 6.7 min with AI support (P < 0.001). Most midwives reported high satisfaction, with 70% strongly agreeing on its clinical utility.
Conclusion: This exploratory study suggests that a rule-based AI program was associated with improved interrater agreement between midwives and obstetricians and reduced CTG interpretation time, with high user satisfaction. These preliminary findings warrant confirmation in larger studies to assess generalizability and to determine whether improved agreement translates into better perinatal outcomes, which were not assessed in this study.
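Illustrative note: the abstract reports agreement on ordinal items as quadratic weighted kappa. The sketch below shows one common way that statistic can be computed for a single rater against a reference standard. It is not the authors' analysis code. The function name, the coding of NICHD categories I to III as 0, 1 and 2, and the example ratings are all assumptions made for illustration; the ratings are invented and are not study data.

```python
from collections import Counter


def quadratic_weighted_kappa(rater_a, rater_b, n_categories):
    """Cohen's kappa with quadratic weights.

    Ratings are assumed to be integers coded 0 .. n_categories - 1
    (hypothetical coding: NICHD category I -> 0, II -> 1, III -> 2).
    """
    if n_categories < 2:
        raise ValueError("need at least two ordinal categories")
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")

    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts per (a, b) pair
    marg_a = Counter(rater_a)                  # marginal counts, rater A
    marg_b = Counter(rater_b)                  # marginal counts, rater B
    max_dist = (n_categories - 1) ** 2

    weighted_observed = 0.0  # weighted disagreement actually observed
    weighted_expected = 0.0  # weighted disagreement expected by chance
    for i in range(n_categories):
        for j in range(n_categories):
            weight = (i - j) ** 2 / max_dist  # 0 on the diagonal, 1 at the extremes
            weighted_observed += weight * observed[(i, j)] / n
            weighted_expected += weight * (marg_a[i] / n) * (marg_b[j] / n)

    if weighted_expected == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - weighted_observed / weighted_expected


# Invented example: one rater's categories versus a consensus reference.
reference = [0, 0, 1, 1, 2, 2, 1, 0]
rater = [0, 1, 1, 1, 2, 1, 1, 0]
kappa = quadratic_weighted_kappa(reference, rater, n_categories=3)
print(f"quadratic weighted kappa = {kappa:.3f}")
```

Quadratic weights penalize a disagreement by the squared distance between categories, so confusing category I with III counts four times as much as confusing I with II. This is why the weighting suits ordinal scales such as the three-tier NICHD classification.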
