Autorepairability of ChatGPT and Gemini: A Comparative Study
Issued Date
2024-01-01
ISSN
15301362
Scopus ID
2-s2.0-105004737048
Journal Title
Proceedings - Asia-Pacific Software Engineering Conference, APSEC
Start Page
442
End Page
446
Rights Holder(s)
SCOPUS
Bibliographic Citation
Proceedings - Asia-Pacific Software Engineering Conference, APSEC (2024), 442-446
Suggested Citation
Sriwilailak C., Higo Y., Lapvikai P., Ragkhitwetsagul C., Choetkiertikul M. Autorepairability of ChatGPT and Gemini: A Comparative Study. Proceedings - Asia-Pacific Software Engineering Conference, APSEC (2024), 442-446. doi:10.1109/APSEC65559.2024.00056 Retrieved from: https://repository.li.mahidol.ac.th/handle/20.500.14594/110183
Abstract
In recent years, Automated Program Repair (APR), which aims to fix source code automatically without human intervention, has become an active research topic in software engineering, and a variety of automatic repair techniques have been proposed. Lapvikai et al. introduced a software quality metric called 'Autorepairability', which indicates how easily bugs in the target source code can be fixed using APR techniques. With Autorepairability, it becomes possible to check in advance whether program repair techniques will work effectively on the target software, and to refactor the software to improve its Autorepairability. Over the past two to three years, however, program repair using large language models (LLMs) has become more prevalent, and several studies have shown that these models have superior repair capabilities compared to traditional APR techniques. In this study, we applied Autorepairability to compare the performance of multiple APR techniques. Specifically, we measured and compared Autorepairability using ChatGPT and Gemini, two representative LLMs, and kGenProg, a traditional APR technique. The results showed that Gemini had higher repair capability than both ChatGPT and kGenProg. The five categories of code functionality for which Gemini achieved higher Autorepairability scores than ChatGPT are (1) geographic and mathematical operations, (2) validation, comparison, and searching operations, (3) data conversion operations, (4) data extraction and comparison operations, and (5) encoding operations.