AutoGPT Devloop: An Autonomous AI Development Framework for End-to-End Software Generation, Execution, and Self-Repair
Issued Date
2025-01-01
Scopus ID
2-s2.0-105040654853
Journal Title
6th Technology Innovation Management and Engineering Science International Conference Times Icon 2025 Proceedings
Rights Holder(s)
SCOPUS
Bibliographic Citation
6th Technology Innovation Management and Engineering Science International Conference Times Icon 2025 Proceedings (2025)
Suggested Citation
Asavathongkul T., Sa-Nga-Ngam P., Kiattisin S. AutoGPT Devloop: An Autonomous AI Development Framework for End-to-End Software Generation, Execution, and Self-Repair. 6th Technology Innovation Management and Engineering Science International Conference Times Icon 2025 Proceedings (2025). doi:10.1109/TIMES-iCON67125.2025.11488122 Retrieved from: https://repository.li.mahidol.ac.th/handle/123456789/117192
Abstract
We present the Autonomous AI Development Framework (AADF), a self-developing agent that converts high-level goals into functioning software through iterative planning, code generation, execution, and self-repair. AADF integrates (i) task decomposition, (ii) a semantic code memory (vector database), (iii) virtualized environment management, and (iv) automated file/version control to achieve autonomy across the software lifecycle. Using a Design Science Research approach, we evaluate AADF on a suite of 15 GUI/web tasks spanning five difficulty levels (3 trials per task; 45 trials in total). The framework achieves 100% success on Levels 1-2 and Level 4, with partial degradation on tasks that depend on external API credentials or large NLP resources (Level 3: 67% and 33% on two tasks; Level 5: 67% on one task). Overall, AADF completes 41/45 trials (91.1%) without manual code editing, demonstrating reliable self-repair for dependency conflicts and missing imports, and reducing human effort on environment setup, boilerplate, and repetitive file operations. We discuss error taxonomies (credentials, heavyweight downloads, concurrency), autonomy-speed trade-offs, and actionable design guidelines for safe deployment in practice.
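The headline figure can be cross-checked from the per-task rates quoted in the abstract. The sketch below is only an arithmetic reconstruction, not code from the paper: it assumes each quoted percentage is a rounded fraction of 3 trials (67% as 2/3, 33% as 1/3) and that every task not mentioned succeeded in all 3 trials. The task labels are placeholders, since the abstract does not name the affected tasks.

```python
from fractions import Fraction

TRIALS_PER_TASK = 3   # stated in the abstract
TOTAL_TASKS = 15      # stated in the abstract

# Tasks reported below 100% success. The labels are hypothetical;
# the fractions are the assumed exact values behind the rounded percentages.
degraded_tasks = {
    "level3_task_a": Fraction(2, 3),  # reported as 67%
    "level3_task_b": Fraction(1, 3),  # reported as 33%
    "level5_task_a": Fraction(2, 3),  # reported as 67%
}


def summarize(total_tasks, trials_per_task, degraded):
    """Return (total trials, successful trials, success rate in percent)."""
    total_trials = total_tasks * trials_per_task
    # Failed trials per degraded task = trials run minus trials that succeeded.
    failures = sum(
        trials_per_task - int(rate * trials_per_task)
        for rate in degraded.values()
    )
    successes = total_trials - failures
    return total_trials, successes, round(100 * successes / total_trials, 1)


total, ok, pct = summarize(TOTAL_TASKS, TRIALS_PER_TASK, degraded_tasks)
print(f"{ok}/{total} trials succeeded ({pct}%)")
```

Under these assumptions the three degraded tasks account for four failed trials, which matches the reported 41/45 (91.1%).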
