TY - GEN
T1 - Data clone detection and visualization in spreadsheets
AU - Hermans, Felienne
AU - Sedee, Ben
AU - Pinzger, Martin
AU - Van Deursen, Arie
PY - 2013/10/30
Y1 - 2013/10/30
N2 - Spreadsheets are widely used in industry: it is estimated that end-user programmers outnumber programmers by a factor of 5. However, spreadsheets are error-prone: numerous companies have lost money because of spreadsheet errors. One of the causes of spreadsheet problems is the prevalence of copy-pasting. In this paper, we study this cloning in spreadsheets. Based on existing text-based clone detection algorithms, we have developed an algorithm to detect data clones in spreadsheets: formulas whose values are copied as plain text in a different location. To evaluate the usefulness of the proposed approach, we conducted two evaluations: a quantitative evaluation in which we analyzed the EUSES corpus, and a qualitative evaluation consisting of two case studies. The results of the evaluation clearly indicate that 1) data clones are common, 2) data clones pose threats to spreadsheet quality, and 3) our approach supports users in finding and resolving data clones.
AB - Spreadsheets are widely used in industry: it is estimated that end-user programmers outnumber programmers by a factor of 5. However, spreadsheets are error-prone: numerous companies have lost money because of spreadsheet errors. One of the causes of spreadsheet problems is the prevalence of copy-pasting. In this paper, we study this cloning in spreadsheets. Based on existing text-based clone detection algorithms, we have developed an algorithm to detect data clones in spreadsheets: formulas whose values are copied as plain text in a different location. To evaluate the usefulness of the proposed approach, we conducted two evaluations: a quantitative evaluation in which we analyzed the EUSES corpus, and a qualitative evaluation consisting of two case studies. The results of the evaluation clearly indicate that 1) data clones are common, 2) data clones pose threats to spreadsheet quality, and 3) our approach supports users in finding and resolving data clones.
KW - clone detection
KW - code smells
KW - spreadsheet smells
KW - spreadsheets
UR - http://www.scopus.com/inward/record.url?scp=84883681660&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84883681660&partnerID=8YFLogxK
U2 - 10.1109/ICSE.2013.6606575
DO - 10.1109/ICSE.2013.6606575
M3 - Conference contribution
AN - SCOPUS:84883681660
SN - 9781467330763
T3 - Proceedings - International Conference on Software Engineering
SP - 292
EP - 301
BT - 2013 35th International Conference on Software Engineering, ICSE 2013 - Proceedings
T2 - 2013 35th International Conference on Software Engineering, ICSE 2013
Y2 - 18 May 2013 through 26 May 2013
ER -