Abstract
This dissertation investigates variation in linguistic framing of events. People differ in how they describe the same event, and this raises the question of how such variation can be systematically measured in textual data and how it can be explained. The study builds on the fundamental assumption that meaning arises from two complementary processes: first, the referential grounding of an expression, which identifies which event in the world the expression refers to; and second, the conceptual disambiguation of the expression, which determines which frame it evokes. While both processes have long been recognized in linguistic research, they have not previously been modeled together on a large scale in textual data. This dissertation demonstrates how such an integrated approach can be constructed and used for empirical analysis.
To achieve this, the research combines two resources. For referential grounding, it uses structured data from digital ontologies such as Wikidata, which contain information about events, their participants, locations, times, and types. For framing, it draws on FrameNet, a lexicographic database of more than a thousand semantic frames linked to lexical units and enriched with specifications of semantic roles. Using the Multilingual Wiki Extraction Platform, the study compiles a corpus in which event types are represented through large sets of incidents extracted from Wikidata, accompanied by texts sourced from the bibliographies of relevant Wikipedia pages. These texts are annotated in the Dutch FrameNet annotation tool with a two-layer system: entity linking connects expressions to the structured data of the events they denote, while frame annotation identifies which frames the same expressions evoke. The result is a corpus in which large numbers of texts are grouped under shared events, enriched with information on both their referential links and their framing structures. This corpus makes it possible to measure framing variation at a fine-grained level.
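As a rough illustration of the two-layer annotation described above, the sketch below pairs each expression with an event identifier (referential layer) and a frame label (conceptual layer), then summarizes framing variation per event. The abstract does not state which variation measure the dissertation uses; Shannon entropy over frame labels is an assumption made here for illustration only. The `Annotation` schema, the `frame_entropy` helper, the placeholder event identifiers `E1` and `E2`, and the toy annotations are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from math import log2


@dataclass(frozen=True)
class Annotation:
    """One annotated expression carrying both layers (hypothetical schema)."""
    expression: str  # surface form found in the text
    event_id: str    # referential layer: placeholder id of the linked event
    frame: str       # conceptual layer: label of the evoked frame


def frame_entropy(annotations, event_id):
    """Shannon entropy (in bits) of the frame labels used for one event.

    Assumed measure: higher entropy means more varied framing.
    Returns 0.0 when the event has no annotations.
    """
    counts = Counter(a.frame for a in annotations if a.event_id == event_id)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log2(1/p) summed over the observed frame labels
    return sum((n / total) * log2(total / n) for n in counts.values())


# Toy data: illustrative annotations, not taken from the actual corpus.
corpus = [
    Annotation("the attack", "E1", "Attack"),
    Annotation("the shooting", "E1", "Use_firearm"),
    Annotation("the killings", "E1", "Killing"),
    Annotation("the tragedy", "E1", "Catastrophe"),
    Annotation("the election", "E2", "Change_of_leadership"),
    Annotation("the vote", "E2", "Change_of_leadership"),
]

varied = frame_entropy(corpus, "E1")   # four different frames for one event
uniform = frame_entropy(corpus, "E2")  # a single frame for one event
```

In this toy setup, an event described through several different frames scores higher than one described through a single frame. Grouping the same computation by publication date would be one way to track variation over time.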
The question then becomes how to explain this variation. To address this, the dissertation develops a pragmatic framework centered on the role of common ground between writers and readers. The key idea is that the amount of shared knowledge about an event influences how it is described. When common ground is low, as immediately after an event, writers tend to provide extensive and varied descriptions, resulting in high framing variation. When common ground is high, as after time has passed, minimal descriptions are sufficient, resulting in low variation. Over time, therefore, variation decreases. Yet when new related events occur, they generate fresh narratives. Writers focus on the new developments with many and varied frames, while the initial event, now familiar, is referred to only briefly and in less diverse terms.
From this framework, the dissertation formulates hypotheses about framing variation as a function of common ground over time. These hypotheses concern frames, lexical units, constructions, and interfaces, and they are tested in the referentially grounded FrameNet corpus at the levels of event type, incident, and participant.
The contribution of this work is both empirical and theoretical. Empirically, it develops a large-scale corpus that integrates structured event data with FrameNet-based annotation, providing a new resource for corpus and computational linguistics. Theoretically, it offers a pragmatic model that links framing variation to dynamics of common ground and storytelling, shedding new light on how events are represented and narrated over time.
| Original language | English |
|---|---|
| Qualification | PhD |
| Awarding Institution | |
| Supervisors/Advisors | |
| Award date | 25 Nov 2025 |
| DOIs | |
| Publication status | Published - 25 Nov 2025 |