Evaluating the Robustness of Question-Answering Models to Paraphrased Questions

Paulo Alting von Geusau*, Peter Bloem

*Corresponding author for this work

Research output: Chapter in Book / Report / Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Understanding questions expressed in natural language is a fundamental challenge studied in applications such as question answering (QA). We explore whether recent state-of-the-art models are capable of recognizing two paraphrased questions using unsupervised learning. First, we test the performance of QA models on an existing paraphrased dataset (Dev-Para). Second, we create a new paraphrased evaluation set (Para-SQuAD) containing multiple paraphrased question pairs from the SQuAD dataset. We describe qualitative investigations of these models and of how they represent paraphrased questions in continuous space. The results demonstrate that the paraphrased dataset confuses the QA models and decreases their performance. Visualizing the sentence embeddings that the QA models produce for Para-SQuAD suggests that all models except BERT struggle to recognize paraphrased questions effectively.

Original language: English
Title of host publication: Artificial Intelligence and Machine Learning
Subtitle of host publication: 32nd Benelux Conference, BNAIC/Benelearn 2020, Leiden, The Netherlands, November 19–20, 2020, Revised Selected Papers
Editors: Mitra Baratchi, Lu Cao, Walter A. Kosters, Jefrey Lijffijt, Jan N. van Rijn, Frank W. Takes
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 1-14
Number of pages: 14
ISBN (Electronic): 9783030766405
ISBN (Print): 9783030766399
DOIs
Publication status: Published - 2021
Event: 32nd Benelux Conference on Artificial Intelligence and Belgian-Dutch Conference on Machine Learning, BNAIC/Benelearn 2020 - Virtual, Online
Duration: 19 Nov 2020 → 20 Nov 2020

Publication series

Name: Communications in Computer and Information Science
Volume: 1398 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 32nd Benelux Conference on Artificial Intelligence and Belgian-Dutch Conference on Machine Learning, BNAIC/Benelearn 2020
City: Virtual, Online
Period: 19/11/20 → 20/11/20

Bibliographical note

Funding Information:
We thank the three anonymous reviewers for their constructive comments, and Michael Cochez for his feedback and helpful notes on the manuscript.

Publisher Copyright:
© 2021, Springer Nature Switzerland AG.

Copyright:
Copyright 2021 Elsevier B.V., All rights reserved.

Keywords

  • Embeddings
  • Natural language
  • Question answering
  • Transformers
