Asking about Technical Debt: Characteristics and Automatic Identification of Technical Debt Questions on Stack Overflow

Nicholas Kozanidis, Roberto Verdecchia*, Emitza Guzman Ortega

*Corresponding author for this work

Research output: Chapter in Book / Report / Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Background: Numerous methodologies have been used to study technical debt. Among different data sources, Q&A sites provide an opportunity to study how users reference and request support on technical debt. To date, only a few studies, each focusing on narrow aspects, have investigated technical debt through the lens of Stack Overflow.

Aims: We aim to gain an in-depth understanding of the characteristics of technical debt questions on Stack Overflow. In addition, we assess whether identification strategies based on machine learning can be used to automatically identify and classify technical debt questions.

Method: We use a combination of automated and manual processes to identify technical debt questions on Stack Overflow. The final set of 415 questions is analyzed both quantitatively and qualitatively to study (i) technical debt types, (ii) question length, (iii) perceived urgency, (iv) sentiment, and (v) emerging themes. Natural language processing and machine learning techniques are used to evaluate whether technical debt questions can be identified and classified automatically.

Results: Architecture debt is the most recurring debt type, followed by code and design debt. Most questions display mild urgency, and the number of questions declines steadily as urgency rises. Question length varies across debt types. Sentiment is mostly neutral. 29 recurrent themes emerge. Machine learning can be used to identify technical debt questions and to classify binary urgency, but not to classify debt types.

Conclusions: Different patterns emerge from the analysis of technical debt questions on Stack Overflow. The results provide further insights into the phenomenon and support the adoption of a more comprehensive strategy to identify technical debt questions.
Original language: English
Title of host publication: ESEM '22
Subtitle of host publication: Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement
Editors: Fernanda Madeiral, Casper Lassenius, Tayana Conte, Tomi Mannisto
Publisher: IEEE Computer Society
Pages: 45-56
Number of pages: 12
ISBN (Electronic): 9781450394277
Publication status: Published - Sept 2022
Event: 16th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2022 - Helsinki, Finland
Duration: 18 Sept 2022 → 23 Sept 2022

Publication series

Name: International Symposium on Empirical Software Engineering and Measurement
ISSN (Print): 1949-3770
ISSN (Electronic): 1949-3789

Conference

Conference: 16th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2022
Country/Territory: Finland
City: Helsinki
Period: 18/09/22 → 23/09/22

Bibliographical note

Publisher Copyright:
© 2022 Association for Computing Machinery.

Keywords

  • Technical Debt
  • Stack Overflow
  • Machine Learning
