EC-RAFT: Automated Generation of Clinical Trial Eligibility Criteria through Retrieval-Augmented Fine-Tuning

Lekuthai N.; Pewngam N.; Sokrai S.; Achakulvisut T.

Issued Date

2025-01-01

Resource Type

Conference Paper

ISSN

0736587X

DOI

10.18653/v1/2025.findings-acl.491

Scopus ID

2-s2.0-105028640416

Journal Title

Proceedings of the Annual Meeting of the Association for Computational Linguistics

Start Page

9432

End Page

9444

Rights Holder(s)

SCOPUS

Bibliographic Citation

Proceedings of the Annual Meeting of the Association for Computational Linguistics (2025), 9432-9444

Suggested Citation

Lekuthai N., Pewngam N., Sokrai S., Achakulvisut T. EC-RAFT: Automated Generation of Clinical Trial Eligibility Criteria through Retrieval-Augmented Fine-Tuning. Proceedings of the Annual Meeting of the Association for Computational Linguistics (2025), 9432-9444. doi:10.18653/v1/2025.findings-acl.491 Retrieved from: https://repository.li.mahidol.ac.th/handle/123456789/114700

Title

EC-RAFT: Automated Generation of Clinical Trial Eligibility Criteria through Retrieval-Augmented Fine-Tuning

Author(s)

Lekuthai N.
Pewngam N.
Sokrai S.
Achakulvisut T.

Author's Affiliation

Mahidol University
Faculty of Medicine Ramathibodi Hospital, Mahidol University
Ravis Technology

Corresponding Author(s)

Lekuthai N.

Other Contributor(s)

Mahidol University

Abstract

Eligibility criteria (EC) are critical components of clinical trial design, defining the parameters for participant inclusion and exclusion. However, designing EC remains a complex, expertise-intensive process. Traditional approaches to EC generation may fail to produce comprehensive, contextually appropriate criteria. To address these challenges, we introduce EC-RAFT, a method that uses Retrieval-Augmented Fine-Tuning (RAFT) to generate structured and cohesive EC directly from clinical trial titles and descriptions. EC-RAFT integrates contextual retrieval, synthesized intermediate reasoning, and fine-tuned language models to produce comprehensive EC sets. To better evaluate clinical alignment with reference criteria, we also propose an LLM-guided evaluation pipeline. Our results demonstrate that our solution, which uses Llama-3.1-8B-Instruct as a base model, achieves a BERTScore of 86.23 and an EC-matched LLM-as-a-Judge score of 1.66 out of 3, outperforming zero-shot Llama-3.1 and Gemini-1.5 by 0.41 and 0.11 points, respectively. In addition, EC-RAFT outperforms other fine-tuned versions of Llama-3.1. EC-RAFT was trained in a low-cost setup and can therefore serve as a practical solution for EC generation while ensuring quality and relevance in clinical trial design. We release our code on GitHub at https://github.com/biodatlab/ec-raft/.
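
The abstract describes a training example as a trial title and description, augmented with retrieved context and paired with intermediate reasoning followed by the reference EC. The following is a minimal sketch of how one such example might be assembled, not the authors' implementation. Everything named here is an assumption for illustration: the `Trial` record, the `retrieve` and `build_example` helpers, the prompt wording, and in particular the token-overlap (Jaccard) retriever, which is a simple stand-in for whatever retriever EC-RAFT actually uses. The released code at the GitHub URL above is the authoritative reference.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """Hypothetical record for one clinical trial (illustrative only)."""
    title: str
    description: str
    criteria: str


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; a stand-in for a real retriever."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def retrieve(query: Trial, corpus: list[Trial], k: int = 2) -> list[Trial]:
    """Return the k corpus trials most similar to the query trial."""
    query_text = f"{query.title} {query.description}"
    ranked = sorted(
        corpus,
        key=lambda t: jaccard(query_text, f"{t.title} {t.description}"),
        reverse=True,
    )
    return ranked[:k]


def build_example(
    target: Trial, corpus: list[Trial], reasoning: str, k: int = 2
) -> dict[str, str]:
    """Assemble one prompt/completion pair for supervised fine-tuning.

    The prompt holds the target title and description plus retrieved
    trials with their criteria. The completion holds the intermediate
    reasoning followed by the reference criteria.
    """
    # Exclude the target itself so its criteria cannot leak into the prompt.
    candidates = [t for t in corpus if t is not target]
    context = "\n\n".join(
        f"Related trial: {t.title}\nCriteria: {t.criteria}"
        for t in retrieve(target, candidates, k)
    )
    prompt = (
        f"Title: {target.title}\n"
        f"Description: {target.description}\n\n"
        f"{context}\n\n"
        "Write the eligibility criteria for this trial."
    )
    completion = f"{reasoning}\n\n{target.criteria}"
    return {"prompt": prompt, "completion": completion}
```

Under these assumptions, calling `build_example` once per training trial yields prompt/completion pairs that a standard supervised fine-tuning loop could consume. The `reasoning` argument is passed in as a plain string because the abstract does not specify how the intermediate reasoning is synthesized.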

Keyword(s)

Computer Science
Social Sciences
Arts and Humanities

URI

https://repository.li.mahidol.ac.th/handle/123456789/114700

Collections

Scopus 2025
