Generative Expression Constrained Knowledge-Based Decoding for Open Data

Lucas Lageweg*, Benno Kruit

*Corresponding author for this work

Research output: Chapter in Book / Report / Conference proceeding, Conference contribution, Academic, peer-review

Abstract

In this paper, we present GECKO, a knowledge graph question answering (KGQA) system for data from Statistics Netherlands (Centraal Bureau voor de Statistiek). QA poses great challenges in terms of generating relevant answers and preventing hallucinations, a phenomenon found in language models that creates issues when attempting factual QA with these models alone. To overcome these limitations, the publicly available OData4 data of Statistics Netherlands was used to create a knowledge graph in which the answer generation decoding process is grounded, ensuring faithful answers. When processing a question, GECKO performs entity and schema retrieval, performs schema-constrained expression decoding, makes assumptions where needed, and executes the generated expression as an OData4 query to retrieve information. A novel method was implemented to perform the constrained knowledge-based expression decoding using an encoder-decoder model. Both a sparse and a dense entity retrieval method were evaluated. While the encoder-decoder model did not achieve production-ready performance, experiments show promising results for a rule-based baseline using a sparse entity retriever. Additionally, the results of qualitative user testing were positive. We therefore formulate recommendations for deployment to help guide users of Statistics Netherlands data to their answers more quickly.
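
The pipeline described in the abstract (retrieval, schema-constrained selection, execution as an OData4 query) can be illustrated with a minimal sketch. This is not the GECKO implementation: the toy schema, the identifiers (`T001`, `M1`, `D1`), the property names (`Observations`, `Measure`, `Region`), the base URL, and all function names are assumptions made for illustration, and the token-overlap retriever stands in for whichever sparse retriever the paper actually uses.

```python
from urllib.parse import quote

# Hypothetical toy schema; real identifiers and labels would come from
# the knowledge graph built on the OData4 data.
SCHEMA = {
    "table": "T001",
    "measures": {"M1": "population total", "M2": "average income"},
    "dimensions": {"D1": "Amsterdam", "D2": "Utrecht"},
}


def tokens(text):
    """Lowercase whitespace tokenization with question marks dropped."""
    return set(text.lower().replace("?", " ").split())


def sparse_retrieve(question, candidates):
    """Rank identifiers by token overlap between question and label.

    A stand-in for a sparse retriever; candidates without any overlap
    are dropped.
    """
    q = tokens(question)
    scored = [(len(q & tokens(label)), ident) for ident, label in candidates.items()]
    ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [ident for score, ident in ranked if score > 0]


def constrained_pick(model_scores, allowed):
    """Return the best-scoring identifier that the schema allows.

    Identifiers outside `allowed` are masked out, so an identifier the
    model invents can never end up in the final expression.
    """
    valid = {ident: s for ident, s in model_scores.items() if ident in allowed}
    if not valid:
        raise ValueError("no schema-valid candidate")
    return max(valid, key=valid.get)


def build_query(base_url, table, measure, dimension):
    """Assemble an OData-style URL with a percent-encoded $filter."""
    filt = f"Measure eq '{measure}' and Region eq '{dimension}'"
    return f"{base_url}/{table}/Observations?$filter={quote(filt)}"


question = "What is the total population of Amsterdam?"
measures = sparse_retrieve(question, SCHEMA["measures"])
regions = sparse_retrieve(question, SCHEMA["dimensions"])

# Hypothetical decoder scores; "M9" does not exist in the schema.
measure = constrained_pick({"M9": 0.9, "M1": 0.6, "M2": 0.3}, measures)
url = build_query("https://example.org/odata4", SCHEMA["table"], measure, regions[0])
print(url)
```

The point of the masking step is that the highest raw score (`M9`) is discarded because it is not a retrieved schema element, which is one simple way to keep a generated expression grounded in the knowledge graph.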

Original language: English
Title of host publication: The Semantic Web
Subtitle of host publication: 21st International Conference, ESWC 2024, Hersonissos, Crete, Greece, May 26–30, 2024, Proceedings, Part I
Editors: Albert Meroño Peñuela, Anastasia Dimou, Raphaël Troncy, Pasquale Lisena, Olaf Hartig, Maribel Acosta, Mehwish Alam, Heiko Paulheim
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 307-325
Number of pages: 19
Volume: 1
ISBN (Electronic): 9783031606267
ISBN (Print): 9783031606250
Publication status: Published - 2024
Event: 21st European Semantic Web Conference, ESWC 2024 - Hersonissos, Greece
Duration: 26 May 2024 to 30 May 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14664 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 21st European Semantic Web Conference, ESWC 2024
Country/Territory: Greece
City: Hersonissos
Period: 26/05/24 to 30/05/24

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
