Abstract
Background: Numerous methodologies have been used to study technical debt. Among different data sources, Q&A sites provide an opportunity to study how users refer to and request support on technical debt. To date, only a few studies, each focusing on narrow aspects, have investigated technical debt through the lens of Stack Overflow.
Aims: We aim to gain an in-depth understanding of the characteristics of technical debt questions on Stack Overflow. In addition, we assess whether identification strategies based on machine learning can be used to automatically identify and classify technical debt questions.
Method: We use a combination of automated and manual processes to identify technical debt questions on Stack Overflow. The final set of 415 questions is analyzed both quantitatively and qualitatively to study (i) technical debt types, (ii) question length, (iii) perceived urgency, (iv) sentiment, and (v) emerging themes. Natural language processing and machine learning techniques are used to evaluate whether technical debt questions can be identified and classified automatically.
Results: Architecture debt is the most recurring debt type, followed by code and design debt. Most questions display mild urgency, with the frequency of questions steadily declining as urgency rises. Question length varies across debt types. Sentiment is mostly neutral. Twenty-nine recurrent themes emerge. Machine learning can be used to identify technical debt questions and to classify binary urgency, but not to classify debt types.
Conclusions: Different patterns emerge from the analysis of technical debt questions on Stack Overflow. The results provide further insights into the phenomenon and support the adoption of a more comprehensive strategy to identify technical debt questions.
Original language | English |
---|---|
Title of host publication | ESEM '22 |
Subtitle of host publication | Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement |
Editors | Fernanda Madeiral, Casper Lassenius, Tayana Conte, Tomi Mannisto |
Publisher | IEEE Computer Society |
Pages | 45-56 |
Number of pages | 12 |
ISBN (Electronic) | 9781450394277 |
Publication status | Published - Sept 2022 |
Event | 16th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2022 - Helsinki, Finland. Duration: 18 Sept 2022 → 23 Sept 2022 |
Publication series
Name | International Symposium on Empirical Software Engineering and Measurement |
---|---|
ISSN (Print) | 1949-3770 |
ISSN (Electronic) | 1949-3789 |
Conference
Conference | 16th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2022 |
---|---|
Country/Territory | Finland |
City | Helsinki |
Period | 18/09/22 → 23/09/22 |
Bibliographical note
Publisher Copyright: © 2022 Association for Computing Machinery.
Keywords
- Technical Debt
- Stack Overflow
- Machine Learning