A dataset of Scratch programs: Scraped, shaped and scored

Efthimia Aivaloglou, Felienne Hermans, Jesus Moreno-Leon, Gregorio Robles

Research output: Chapter in Book / Report / Conference proceeding, Conference contribution, Academic, peer-review

Abstract

Scratch is increasingly popular, both as an introductory programming language and as a research target in the computing education research field. In this paper, we present a dataset of 250K recent Scratch projects from 100K different authors scraped from the Scratch project repository. We processed the projects' source code and metadata to encode them into a database that facilitates querying and further analysis. We further evaluated the projects in terms of programming skills and mastery, and included the project scoring results. The dataset enables the analysis of the source code of Scratch projects, of their quality characteristics, and of the programming skills that their authors exhibit. The dataset can be used for empirical research in software engineering and computing education.

Original language: English
Title of host publication: Proceedings - 2017 IEEE/ACM 14th International Conference on Mining Software Repositories, MSR 2017
Publisher: IEEE Computer Society
Pages: 511-514
Number of pages: 4
ISBN (Electronic): 9781538615447
Publication status: Published - 29 Jun 2017
Externally published: Yes
Event: 14th IEEE/ACM International Conference on Mining Software Repositories, MSR 2017 - Buenos Aires, Argentina
Duration: 20 May 2017 to 21 May 2017

Publication series

Name: IEEE International Working Conference on Mining Software Repositories
ISSN (Print): 2160-1852
ISSN (Electronic): 2160-1860

Conference

Conference: 14th IEEE/ACM International Conference on Mining Software Repositories, MSR 2017
Country/Territory: Argentina
City: Buenos Aires
Period: 20/05/17 to 21/05/17

Keywords

  • computing education
  • dataset
  • Scratch
