Corpora annotated with negation: An overview

Salud María Jiménez-Zafra, Roser Morante, María Teresa Martín-Valdivia, L. Alfonso Ureña-López

Research output: Contribution to Journal › Article › Academic › peer-review

Abstract

Negation is a universal linguistic phenomenon with a great qualitative impact on natural language processing applications. The availability of corpora annotated with negation is essential to training negation processing systems. Currently, most corpora have been annotated for English, but the presence of languages other than English on the Internet, such as Chinese or Spanish, is greater every day. In this study, we present a review of the corpora annotated with negation information in several languages with the goal of evaluating what aspects of negation have been annotated and how compatible the corpora are. We conclude that it is very difficult to merge the existing corpora because we found differences in the annotation schemes used, and most importantly, in the annotation guidelines: the way in which each corpus was tokenized and the negation elements that have been annotated. Unlike other well-established tasks such as semantic role labeling or parsing, negation has no standard annotation scheme or guidelines, which hampers progress in its treatment.

Original language: English
Pages (from-to): 189-244
Number of pages: 56
Journal: Computational Linguistics
Volume: 46
Issue number: 1
Publication status: Published - Mar 2020

Funding

This work has been partially supported by a grant from the Ministerio de Educación, Cultura y Deporte (MECD-scholarship FPU014/00983), LIVING-LANG project (RTI2018-094653-B-C21), Fondo Europeo de Desarrollo Regional (FEDER), and REDES project (TIN2015-65136-C2-1-R) from the Spanish Government. R.M. was supported by the Netherlands Organization for Scientific Research (NWO) via the Spinoza-prize awarded to Piek Vossen (SPI 30-673, 2014-2019). We are thankful to the authors of the corpora who kindly answered our questions.

Funders: Funder number
Spanish Government
Ministerio de Educación, Cultura y Deporte: FPU014/00983, RTI2018-094653-B-C21
Nederlandse Organisatie voor Wetenschappelijk Onderzoek: SPI 30-673, 2014-2019
European Regional Development Fund: TIN2015-65136-C2-1-R
Ministerio de Educación y Cultura
