Operationalizing Threats to MSR Studies by Simulation-Based Testing

Research output: Chapter in Book / Report / Conference proceeding, Conference contribution, Academic, peer-review

Abstract

Quantitative studies on the border between Mining Software Repositories (MSR) and Empirical Software Engineering (ESE) apply data analysis methods, such as regression modeling, statistical tests, or correlation analysis, to commits or pull requests to better understand the software development process. Such studies assure the validity of the reported results by following a sound methodology. However, with increasing complexity, parts of the methodology can still go wrong, which may result in MSR/ESE studies with undetected threats to validity. In this paper, we propose to systematically protect against threats by operationalizing their treatment using simulations. A simulation substitutes observed and unobserved data, related to an MSR/ESE scenario, with synthetic data that is carefully defined according to plausible assumptions on the scenario. Within a simulation, unobserved data becomes transparent; this is the key difference from a real study and is necessary to detect threats to an analysis methodology. Running an analysis methodology on synthetic data may detect basic technical bugs and misinterpretations, but it also improves the trust in the methodology. The contribution of a simulation is to operationalize testing the impact of important assumptions. Assumptions still need to be rated for plausibility. We evaluate simulation-based testing by operationalizing undetected threats in the context of four published MSR/ESE studies. We recommend that future research use such a more systematic treatment of threats, as a contribution against the reproducibility crisis.
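
To illustrate the idea of running an analysis on synthetic data with a known ground truth, here is a minimal Python sketch. It is not taken from the paper. The scenario (an unobserved "developer experience" variable influencing both commit size and defects), all variable names, and all coefficients are hypothetical assumptions chosen for illustration. Because the simulation sets the true effect of commit size on defects to zero, any non-zero estimate reveals a threat in the analysis methodology.

```python
import random


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def residuals(y, x):
    """Residuals of y after regressing it on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = ols_slope(x, y)
    return [(yi - mean_y) - slope * (xi - mean_x) for xi, yi in zip(x, y)]


def simulate(n, true_effect, seed):
    """Generate synthetic commits under assumed (hypothetical) relationships."""
    rng = random.Random(seed)
    # Unobserved in a real study, but fully visible inside the simulation.
    experience = [rng.gauss(0, 1) for _ in range(n)]
    # Assumption: experienced developers write smaller commits.
    size = [-0.8 * e + rng.gauss(0, 1) for e in experience]
    # Assumption: experience lowers defects; size has the chosen true effect.
    defects = [
        true_effect * s - 0.5 * e + rng.gauss(0, 1)
        for s, e in zip(size, experience)
    ]
    return experience, size, defects


experience, size, defects = simulate(n=20000, true_effect=0.0, seed=42)

# Analysis under test: regress defects on commit size only.
naive = ols_slope(size, defects)

# Reference analysis: control for the normally unobserved variable by
# regressing both variables on it first and then regressing the residuals.
adjusted = ols_slope(residuals(size, experience), residuals(defects, experience))

print(f"true effect: 0.0, naive estimate: {naive:.3f}, adjusted: {adjusted:.3f}")
```

In this sketch the naive estimate is clearly positive although the simulated effect is zero, whereas the adjusted estimate stays close to zero. Whether the assumed relationships are plausible for a given study still has to be judged separately.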
Original language: English
Title of host publication: MSR 2022
Subtitle of host publication: Proceedings of the 19th International Conference on Mining Software Repositories
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 86-97
Number of pages: 12
ISBN (Electronic): 9781450393034
DOIs
Publication status: Published - 2022
Externally published: Yes
Event: 2022 Mining Software Repositories Conference, MSR 2022 - Pittsburgh, United States
Duration: 23 May 2022 to 24 May 2022

Conference

Conference: 2022 Mining Software Repositories Conference, MSR 2022
Country/Territory: United States
City: Pittsburgh
Period: 23/05/22 to 24/05/22
