Validation methodology for expert-annotated datasets: Event annotation case study

Oana Inel*, Lora Aroyo

*Corresponding author for this work

Research output: Chapter in Book / Report / Conference proceeding › Conference contribution › Academic › peer-review


Event detection remains a difficult task due to the complexity and ambiguity of such entities. On the one hand, we observe low inter-annotator agreement among experts when annotating events, regardless of the multitude of existing annotation guidelines and their numerous revisions. On the other hand, event extraction systems achieve lower F1-scores than systems for other entity types, such as people or locations. In this paper, we study the consistency and completeness of expert-annotated datasets for events and time expressions, and we propose a data-agnostic methodology for validating such datasets in terms of consistency and completeness. Furthermore, we combine the power of crowds and machines to correct and extend expert-annotated event datasets. We show the benefit of using crowd-annotated events to train and evaluate a state-of-the-art event extraction system: our results show that crowd-annotated events increase the system's performance by at least 5.3%.
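The abstract does not specify which agreement metric the authors use; as a minimal, stdlib-only illustration of how inter-annotator agreement on event annotations can be quantified, here is a sketch of Cohen's kappa applied to hypothetical token-level labels from two annotators (the labels and annotator data below are invented for illustration):

```python
from collections import Counter

def cohen_kappa(ann1, ann2):
    """Cohen's kappa: chance-corrected agreement between two annotators."""
    assert len(ann1) == len(ann2)
    n = len(ann1)
    # Observed agreement: fraction of items with identical labels.
    p_o = sum(a == b for a, b in zip(ann1, ann2)) / n
    # Expected chance agreement, from each annotator's label distribution.
    c1, c2 = Counter(ann1), Counter(ann2)
    p_e = sum((c1[lab] / n) * (c2[lab] / n) for lab in set(ann1) | set(ann2))
    return (p_o - p_e) / (1 - p_e)

# Hypothetical token-level event annotations (EVENT vs. non-event "O").
a1 = ["EVENT", "O", "EVENT", "O", "O", "EVENT", "O", "O"]
a2 = ["EVENT", "O", "O", "O", "EVENT", "EVENT", "O", "O"]
print(round(cohen_kappa(a1, a2), 3))  # prints 0.467
```

Even though the two annotators agree on 6 of 8 tokens, kappa is only about 0.47 once chance agreement on the frequent "O" label is discounted, which is the kind of low expert agreement the paper reports for events.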

Original language: English
Title of host publication: 2nd Conference on Language, Data and Knowledge, LDK 2019
Editors: Maria Eskevich, Gerard de Melo, Christian Fäth, John P. McCrae, Paul Buitelaar, Christian Chiarcos, Bettina Klimek, Milan Dojchinovski
Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
Number of pages: 15
ISBN (Electronic): 9783959771054
Publication status: Published - 1 May 2019
Event: 2nd Conference on Language, Data and Knowledge, LDK 2019 - Leipzig, Germany
Duration: 20 May 2019 - 23 May 2019

Publication series

Name: OpenAccess Series in Informatics
ISSN (Print): 2190-6807


Conference: 2nd Conference on Language, Data and Knowledge, LDK 2019


Keywords

  • Crowdsourcing
  • Event extraction
  • Human-in-the-loop
  • Time extraction


