Sprint2Vec: a deep characterization of sprints in iterative software development
Issued Date
2024-01-01
ISSN
00985589
eISSN
19393520
Scopus ID
2-s2.0-85210995629
Journal Title
IEEE Transactions on Software Engineering
Rights Holder(s)
SCOPUS
Bibliographic Citation
IEEE Transactions on Software Engineering (2024)
Suggested Citation
Choetkiertikul M., Banyongrakkul P., Ragkhitwetsagul C., Tuarob S., Dam H.K., Sunetnanta T. Sprint2Vec: a deep characterization of sprints in iterative software development. IEEE Transactions on Software Engineering (2024). doi:10.1109/TSE.2024.3509016 Retrieved from: https://repository.li.mahidol.ac.th/handle/20.500.14594/102347
Abstract
Iterative approaches such as Agile Scrum are widely adopted to improve the software development process, yet schedule and budget overruns persist in many software projects. Several approaches apply machine learning techniques, particularly classification, to support decision-making in iterative software development. Existing approaches often characterize a sprint solely to predict productivity. We introduce Sprint2Vec, which draws on three aspects of sprint information (sprint attributes, issue attributes, and the developers involved in a sprint) to characterize a sprint comprehensively and predict both its productivity and its quality outcomes. Our approach combines traditional feature extraction techniques with automated, deep learning-based unsupervised feature learning techniques. Methods such as Long Short-Term Memory (LSTM) enable us to learn features from unstructured data, such as the textual descriptions of issues and sequences of developer activities. We evaluated our approach on two regression tasks: predicting the deliverability of a sprint (i.e., the amount of work delivered from a sprint) and its quality (i.e., the amount of delivered work that requires rework). Results on five well-known open-source projects (Apache, Atlassian, Jenkins, Spring, and Talendforge) show that our approach outperforms baseline and alternative approaches.
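The idea of characterizing a sprint from three aspects can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the example values, and the use of mean pooling are assumptions made for illustration, and the per-issue and per-developer vectors are stand-ins for the features that the abstract says are learned with methods such as LSTM. The error measure shown is likewise an assumed choice for a regression task, not one named in the abstract.

```python
from statistics import mean
from typing import List, Sequence


def mean_pool(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average a non-empty list of equal-length vectors element-wise."""
    return [mean(column) for column in zip(*vectors)]


def sprint_vector(
    sprint_attrs: Sequence[float],
    issue_vectors: Sequence[Sequence[float]],
    developer_vectors: Sequence[Sequence[float]],
) -> List[float]:
    """Concatenate the three aspects into one fixed-length sprint vector.

    Assumption: issue and developer vectors are already computed elsewhere
    (for example by a learned encoder) and are simply averaged here.
    """
    return (
        list(sprint_attrs)
        + mean_pool(issue_vectors)
        + mean_pool(developer_vectors)
    )


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between actual and predicted targets."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical sprint: two hand-crafted attributes, two issues, one developer.
example = sprint_vector(
    sprint_attrs=[14.0, 3.0],
    issue_vectors=[[1.0, 0.0], [0.0, 1.0]],
    developer_vectors=[[0.5, 0.5, 0.5]],
)
```

A vector like `example` would then be the input to any regressor trained to predict deliverability or quality, and `mean_absolute_error` could compare its predictions against the observed values.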