“Geen makkie”: Interpretable Classification and Simplification of Dutch Text Complexity

Eliza Hobo, Charlotte Pouw, Lisa Beinborn

Research output: Chapter in Book / Report / Conference proceeding; Conference contribution; Academic; peer-review

Abstract

An inclusive society needs to facilitate access to information for all of its members, including citizens with low literacy and with non-native language skills. We present an approach to assess Dutch text complexity at the sentence level and conduct an interpretability analysis to explore the link between neural models and linguistic complexity features. Building on these findings, we develop the first contextual lexical simplification model for Dutch and publish a pilot dataset for evaluation. We go beyond previous work, which primarily targeted lexical substitution, and propose strategies for adjusting the model’s linguistic register to generate simpler candidates. Our results indicate that continual pre-training and multi-task learning with conceptually related tasks are promising directions for ensuring the simplicity of the generated substitutions. Our code repository and the simplification dataset are available on GitHub.

Original language: English
Title of host publication: Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)
Editors: Ekaterina Kochmar, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Nitin Madnani, Anais Tack, Victoria Yaneva, Zheng Yuan, Torsten Zesch
Publisher: Association for Computational Linguistics (ACL)
Pages: 503-517
Number of pages: 15
ISBN (Electronic): 9781959429807
Publication status: Published - 2023
Event: 18th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2023 - Toronto, Canada
Duration: 13 Jul 2023 → …

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 18th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2023
Country/Territory: Canada
City: Toronto
Period: 13/07/23 → …

Bibliographical note

Publisher Copyright:
© 2023 Association for Computational Linguistics.

Funding

Eliza Hobo’s simplification experiments were initiated during an internship at the Gemeente of Amsterdam. Iva Gornishka has been a valuable source of insight and support in this process. Charlotte Pouw’s experiments on readability were initiated in a joint project with Florian Kunneman and Bruna Guedes supported by the Network Institute (VU Amsterdam) through the Academy Assistants Program. Lisa Beinborn’s work was supported by the Dutch National Science Organisation (NWO) through the projects CLARIAHPLUS (CP-W6-19-005) and VENI (Vl.Veni.211C.039).

Funders (with funder number where given):
Dutch National Science organisation
Network Institute
Nederlandse Organisatie voor Wetenschappelijk Onderzoek: CP-W6-19-005
Nederlandse Organisatie voor Wetenschappelijk Onderzoek
