Abstract
Work on Mining Software Repositories (MSR) typically involves processing abstractions of resources on individual revisions. Processing abstractions of resource changes, in turn, often requires working with all revisions of the repository history to guarantee a high resolution of the measured changes. Abstractions of resources and abstractions of resource changes are often so closely related that they can be used interchangeably in the processing. In practice, approaches that process abstractions over high revision counts face a scalability challenge. In this work, we address this challenge by incrementalizing the processing of repository resources and the corresponding abstractions. Our work is inspired by incrementalization theory, including insights on Abelian groups, group homomorphisms, and indexing. We provide a map-reduce interface that enables calls to foreign functionality and convenient operations for processing abstractions, such as mapping, filtering, group-wise aggregation, and joining. Apache Spark is used for distribution. We compare the scalability of our approach with available MSR approaches, i.e., with LISA, which reduces redundancy, and with DJ-Rex, which migrates an analysis to a distributed map-reduce framework.
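
The role of Abelian groups and group homomorphisms in incrementalization can be illustrated with a minimal sketch. It is not the paper's implementation or interface. It assumes that a revision is modeled as an integer-valued map from file paths to multiplicities (an Abelian group under pointwise addition), and that the abstraction is a simple count of files per extension. The names `add`, `negate`, and `count_by_extension` are hypothetical. Because the abstraction is a homomorphism, applying it to the change between two revisions and adding the result to the old abstraction gives the same answer as recomputing it on the new revision.

```python
from collections import defaultdict


def add(a, b):
    """Group operation: pointwise sum of two integer-valued maps (zeros dropped)."""
    out = defaultdict(int)
    for m in (a, b):
        for key, value in m.items():
            out[key] += value
    return {key: value for key, value in out.items() if value != 0}


def negate(a):
    """Group inverse: flip the sign of every multiplicity."""
    return {key: -value for key, value in a.items()}


def count_by_extension(files):
    """Hypothetical abstraction: file multiplicities aggregated per extension.

    It is a homomorphism: count(a + b) == count(a) + count(b).
    """
    out = {}
    for path, multiplicity in files.items():
        extension = path.rsplit(".", 1)[-1]
        out = add(out, {extension: multiplicity})
    return out


# Two consecutive revisions (illustrative data, not from the paper).
rev1 = {"a.py": 1, "b.py": 1, "c.md": 1}
rev2 = {"a.py": 1, "c.md": 1, "d.md": 1}

# The resource change is the group difference rev2 - rev1.
delta = add(rev2, negate(rev1))

# Full recomputation on the new revision ...
full = count_by_extension(rev2)
# ... versus the incremental update that only processes the change.
incremental = add(count_by_extension(rev1), count_by_extension(delta))

assert full == incremental
```

In this sketch the incremental path only touches the two changed files, which is the property that makes processing over high revision counts cheaper when consecutive revisions differ only slightly.
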
Original language | English |
---|---|
Title of host publication | SANER 2020 - Proceedings of the 2020 IEEE 27th International Conference on Software Analysis, Evolution, and Reengineering |
Editors | K. Kontogiannis, F. Khomh, A. Chatzigeorgiou, M.-E. Fokaefs, M. Zhou |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 320-331 |
ISBN (Electronic) | 9781728151434 |
DOIs | |
Publication status | Published - 1 Feb 2020 |
Externally published | Yes |
Event | 27th IEEE International Conference on Software Analysis, Evolution, and Reengineering, SANER 2020 - London, Canada. Duration: 18 Feb 2020 → 21 Feb 2020 |
Conference
Conference | 27th IEEE International Conference on Software Analysis, Evolution, and Reengineering, SANER 2020 |
---|---|
Country/Territory | Canada |
City | London |
Period | 18/02/20 → 21/02/20 |