Towards a FAIR Dataset for non-functional requirements

Maria Isabel Limaylla-Lunarejo, Nelly Condori-Fernandez, Miguel R. Luaces

Research output: Chapter in Book / Report / Conference proceeding › Conference contribution › Academic › peer-review

Abstract

In recent years, the application of supervised Machine Learning (ML) algorithms in Requirements Engineering (RE) has improved the performance (e.g. accuracy, precision) and scalability of automatic requirements classification. However, the lack of publicly available labeled datasets remains a concern when conducting ML experiments. Few such datasets exist for non-functional requirements classification, and even fewer in the Spanish language. Moreover, most of the available datasets have limitations, such as imbalanced classes (e.g. PROMISE NFR). This study aims to generate a FAIR dataset of non-functional requirements in the Spanish language to facilitate reuse in ML classification experiments. We collected 109 non-functional requirements from final degree projects at the University of A Coruña. We conducted a pilot quasi-experiment in which the requirements were labeled with the categories and subcategories of the ISO/IEC 25010 quality model. The labeling was carried out by 7 annotators. Inter-annotator agreement, measured with Fleiss' Kappa, was substantial at the category level (0.78) and moderate at the subcategory level (0.48).
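The agreement statistic reported in the abstract can be illustrated with a minimal sketch of Fleiss' Kappa. The function names `fleiss_kappa` and `landis_koch`, the toy labels, and the use of the Landis and Koch scale to name the agreement bands are assumptions made for illustration; the abstract does not state which implementation or interpretation scale the authors used, and the toy data below is not taken from their dataset.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for items that each received the same number of labels.

    ratings: list of lists; ratings[i] holds the labels given to item i.
    categories: list of all possible labels.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of labels")

    # counts[i][j] = number of annotators who assigned category j to item i
    counts = []
    for item in ratings:
        tally = Counter(item)
        counts.append([tally[c] for c in categories])

    # Observed agreement: mean over items of the share of agreeing annotator pairs
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items

    # Chance agreement, from the overall proportion of each category
    total = n_items * n_raters
    p_e = sum(
        (sum(row[j] for row in counts) / total) ** 2
        for j in range(len(categories))
    )

    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1.0 - p_e)


def landis_koch(kappa):
    """Map a kappa value to the Landis and Koch agreement bands (assumed scale)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical toy data: 2 requirements, 3 annotators, 2 ISO/IEC 25010 categories
toy = [
    ["Security", "Security", "Usability"],
    ["Usability", "Usability", "Usability"],
]
kappa = fleiss_kappa(toy, ["Security", "Usability"])
print(round(kappa, 2), landis_koch(kappa))
```

Under this assumed scale, the reported values of 0.78 and 0.48 fall in the "substantial" and "moderate" bands respectively, which matches the wording of the abstract.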

Original language: English
Title of host publication: SAC '23
Subtitle of host publication: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing
Editors: Jiman Hong
Publisher: Association for Computing Machinery
Pages: 1414-1421
Number of pages: 8
ISBN (Electronic): 9781450395175
DOIs
Publication status: Published - 2023
Event: 38th Annual ACM Symposium on Applied Computing, SAC 2023 - Tallinn, Estonia
Duration: 27 Mar 2023 → 31 Mar 2023

Publication series

Name: Proceedings of the ACM Symposium on Applied Computing

Conference

Conference: 38th Annual ACM Symposium on Applied Computing, SAC 2023
Country/Territory: Estonia
City: Tallinn
Period: 27/03/23 → 31/03/23

Bibliographical note

Funding Information:
This research was partially funded by Xunta de Galicia/FEDER-UE ED413C 2021/53 (Database Lab, UDC) and ED431G 2019/04 (CITIUS, USC). The authors also want to acknowledge all researchers and practitioners who participated as annotators in our study.

Publisher Copyright:
© 2023 ACM.

Keywords

  • data labeling
  • FAIR principles
  • non-functional requirements
  • Spanish dataset
