Investigating the Robustness of Modelling Decisions for Few-Shot Cross-Topic Stance Detection: A Preregistered Study

Myrthe Reuver, Suzan Verberne, Antske Fokkens

Research output: Chapter in Book / Report / Conference proceeding > Conference contribution > Academic > peer-review

Abstract

For a viewpoint-diverse news recommender, identifying whether two news articles express the same viewpoint is essential. One way to determine "same or different" viewpoint is stance detection. In this paper, we investigate the robustness of operationalization choices for few-shot stance detection, with special attention to modelling stance across different topics. Our experiments test pre-registered hypotheses on stance detection. Specifically, we compare two stance task definitions (Pro/Con versus Same Side Stance), two LLM architectures (bi-encoding versus cross-encoding), and adding Natural Language Inference knowledge, with pre-trained RoBERTa models trained with shots of 100 examples from 7 different stance detection datasets. Some of our hypotheses and claims from earlier work can be confirmed, while others give more inconsistent results. The effect of the Same Side Stance definition on performance differs per dataset and is influenced by other modelling choices. We found no relationship between the number of training topics in the training shots and performance. In general, cross-encoding outperforms bi-encoding, and adding NLI training to our models gives considerable improvement, but these results are not consistent across all datasets. Our results indicate that it is essential to include multiple datasets and systematic modelling experiments when aiming to find robust modelling choices for the concept 'stance'.

Original language: English
Title of host publication: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Place of Publication: Torino, Italia
Publisher: ELRA and ICCL
Pages: 9245-9260
Number of pages: 16
ISBN (Electronic): 9782493814104
Publication status: Published - 2024
Event: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 - Hybrid, Torino, Italy
Duration: 20 May 2024 to 25 May 2024

Conference

Conference: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024
Country/Territory: Italy
City: Hybrid, Torino
Period: 20/05/24 to 25/05/24

Bibliographical note

Publisher Copyright:
© 2024 ELRA Language Resource Association: CC BY-NC 4.0.

Funding

This research is part of the Rethinking News Algorithms project (2020-2024), funded through the Open Competition Digitalization Humanities and Social Science (grant nr 406.D1.19.073) by the Netherlands Organization of Scientific Research (NWO). Our computing was done through SURF Research Cloud, a national supercomputer infrastructure in the Netherlands also funded by the NWO. Thanks to Urja Khurana, Michiel van der Meer, and other PhDs from CLTL for their helpful feedback. We would also like to thank all anonymous reviewers, whose comments improved both this version and earlier versions of this paper. All remaining errors or unclarities are our own.

Funders | Funder number
CLTL
Nederlandse Organisatie voor Wetenschappelijk Onderzoek

    Keywords

    • computational argumentation
    • preregistration
    • stance detection
