Data clone detection and visualization in spreadsheets

Felienne Hermans, Ben Sedee, Martin Pinzger, Arie Van Deursen

Research output: Chapter in Book / Report / Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Spreadsheets are widely used in industry: it is estimated that end-user programmers outnumber programmers by a factor of 5. However, spreadsheets are error-prone, and numerous companies have lost money because of spreadsheet errors. One of the causes of spreadsheet problems is the prevalence of copy-pasting. In this paper, we study this cloning in spreadsheets. Based on existing text-based clone detection algorithms, we have developed an algorithm to detect data clones in spreadsheets: formulas whose values are copied as plain text in a different location. To evaluate the usefulness of the proposed approach, we conducted two evaluations: a quantitative evaluation in which we analyzed the EUSES corpus, and a qualitative evaluation consisting of two case studies. The results of the evaluation clearly indicate that 1) data clones are common, 2) data clones pose threats to spreadsheet quality, and 3) our approach supports users in finding and resolving data clones.
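
To make the notion of a data clone concrete, the following is a minimal Python sketch, not the algorithm from the paper. It assumes a spreadsheet has already been loaded into a flat list of cells, each carrying its computed value and, for formula cells, the formula text. The `Cell` type, the function name `find_data_clone_candidates`, and the sample addresses are all hypothetical. The sketch only flags plain numeric cells whose value exactly equals the result of a formula cell elsewhere; matching single cells like this would produce many coincidental hits on real data, so it should be read as an illustration of the definition only.

```python
from collections import defaultdict
from typing import List, NamedTuple, Optional, Tuple, Union


class Cell(NamedTuple):
    """Hypothetical flat representation of one spreadsheet cell."""
    address: str                    # e.g. "Budget!B4"
    value: Union[float, str]        # computed or typed-in value
    formula: Optional[str] = None   # None for plain (constant) cells


def find_data_clone_candidates(cells: List[Cell]) -> List[Tuple[str, List[str]]]:
    """Pair each plain numeric cell with formula cells that yield the same value."""
    # Index the addresses of numeric formula cells by their computed value.
    formula_cells_by_value = defaultdict(list)
    for cell in cells:
        if cell.formula is not None and isinstance(cell.value, (int, float)):
            formula_cells_by_value[cell.value].append(cell.address)

    # A plain numeric cell whose value matches a formula result is a candidate.
    candidates = []
    for cell in cells:
        is_plain_number = cell.formula is None and isinstance(cell.value, (int, float))
        if is_plain_number and cell.value in formula_cells_by_value:
            candidates.append((cell.address, formula_cells_by_value[cell.value]))
    return candidates


# Made-up sample data: Report!C2 holds a pasted copy of the value of Budget!B4.
example = [
    Cell("Budget!B4", 1250.0, "=SUM(B1:B3)"),
    Cell("Budget!B5", "Total"),
    Cell("Report!C2", 1250.0),
    Cell("Report!C3", 40.0),
]
clone_candidates = find_data_clone_candidates(example)
```

With the sample data, `clone_candidates` contains one entry linking `Report!C2` to `Budget!B4`. If the formula in `Budget!B4` later changes, the pasted copy silently goes stale, which is the kind of quality threat the abstract refers to.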

Original language: English
Title of host publication: 2013 35th International Conference on Software Engineering, ICSE 2013 - Proceedings
Pages: 292-301
Number of pages: 10
Publication status: Published - 30 Oct 2013
Externally published: Yes
Event: 2013 35th International Conference on Software Engineering, ICSE 2013 - San Francisco, CA, United States
Duration: 18 May 2013 - 26 May 2013

Publication series

Name: Proceedings - International Conference on Software Engineering
ISSN (Print): 0270-5257

Conference

Conference: 2013 35th International Conference on Software Engineering, ICSE 2013
Country/Territory: United States
City: San Francisco, CA
Period: 18/05/13 - 26/05/13

Keywords

  • clone detection
  • code smells
  • spreadsheet smells
  • spreadsheets
