PromptOps: Automated Tool for Testing Trustworthiness of LLMs
Issued Date
2025-01-01
Resource Type
ISSN
15301362
Scopus ID
2-s2.0-105035196380
Journal Title
Proceedings Asia Pacific Software Engineering Conference APSEC
Start Page
1005
End Page
1008
Rights Holder(s)
SCOPUS
Bibliographic Citation
Proceedings Asia Pacific Software Engineering Conference APSEC (2025), 1005-1008
Suggested Citation
Sontesadisai C., Sae-Ngow C., Rudeerudchanawong J., Dangsungnoen L., Ragkhitwetsagul C., Racharak T., Sunetnanta T. PromptOps: Automated Tool for Testing Trustworthiness of LLMs. Proceedings Asia Pacific Software Engineering Conference APSEC (2025), 1005-1008. doi:10.1109/APSEC66846.2025.00117 Retrieved from: https://repository.li.mahidol.ac.th/handle/123456789/116232
Title
PromptOps: Automated Tool for Testing Trustworthiness of LLMs
Author's Affiliation
Corresponding Author(s)
Other Contributor(s)
Abstract
Large Language Models (LLMs) are increasingly used across a wide range of natural language processing tasks. Despite their growing adoption, concerns remain about their trustworthiness, i.e., their reliability and validity across diverse applications. This paper introduces PromptOps, a novel visual-based LLM testing tool that applies the principles of metamorphic testing to assess LLMs beyond traditional accuracy metrics. The tool evaluates LLMs on critical properties such as robustness, fairness, and logical consistency. It enables users to design custom test cases via visual programming, define specific prompts, and automatically generate diverse test scenarios. By identifying areas for improvement in both performance and fairness, PromptOps fosters greater transparency for model developers. A video demonstration of PromptOps is available at https://youtu.be/M6TbvPIt9kE, and the tool is available at https://github.com/MUICT-SERU/PromptOps.
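To illustrate the general idea of metamorphic testing mentioned in the abstract, the following is a minimal sketch and not the PromptOps implementation. It assumes a simple metamorphic relation (a meaning-preserving change to the prompt should not change the answer) and uses a deliberately brittle stand-in model. All names here (`toy_model`, `add_typo`, `swap_name`, `metamorphic_test`) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict

# A "model" is any function from a prompt string to an answer string.
# In practice this would wrap a call to a real LLM (assumption).
Model = Callable[[str], str]
Transform = Callable[[str], str]


def add_typo(prompt: str) -> str:
    """Robustness perturbation: swap two adjacent characters mid-prompt."""
    i = len(prompt) // 2
    if i + 1 >= len(prompt):
        return prompt
    return prompt[:i] + prompt[i + 1] + prompt[i] + prompt[i + 2:]


def swap_name(prompt: str) -> str:
    """Fairness perturbation: replace one personal name with another."""
    return prompt.replace("John", "Maria")


def metamorphic_test(
    model: Model, prompt: str, transforms: Dict[str, Transform]
) -> Dict[str, bool]:
    """Check that each transformed prompt yields the same answer as the original.

    Returns a mapping from transform name to True (relation holds)
    or False (relation violated).
    """
    baseline = model(prompt).strip().lower()
    results = {}
    for name, transform in transforms.items():
        follow_up = model(transform(prompt)).strip().lower()
        results[name] = follow_up == baseline
    return results


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: a case-sensitive keyword classifier."""
    return "positive" if "good" in prompt else "negative"


results = metamorphic_test(
    toy_model,
    "John said the food was good.",
    {"typo": add_typo, "name_swap": swap_name, "uppercase": str.upper},
)
print(results)
```

In this sketch the stand-in model passes the typo and name-swap checks but fails the uppercase check, which is the kind of inconsistency that an accuracy score on a fixed test set would not reveal.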
