Open-domain extraction of future events from Twitter

Florian Kunneman*, Antal Van Den Bosch

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Explicit references on Twitter to future events can be leveraged to feed a fully automatic monitoring system of real-world events. We describe a system that extracts open-domain future events from the Twitter stream. It detects future time expressions and entity mentions in tweets, clusters tweets whose mentions overlap above certain thresholds, and summarizes these clusters into event descriptions that can be presented to users of the system. Terms for the event description are selected in an unsupervised fashion. We evaluated the system on a month of Dutch tweets by showing the top-250 ranked events found in this month to human annotators. Eighty per cent of the candidate events were assessed as being an event by at least three out of four human annotators, while all four annotators regarded sixty-three per cent as a real event. An added component that complements event descriptions with additional terms was not assessed as better than the original system, due to the occasional addition of redundant terms. Comparing the found events to gold-standard events from maintained calendars on the Web that were mentioned in at least five tweets, the system yields a recall-at-250 of 0.20 and a recall based on all retrieved events of 0.40.
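The clustering and recall-at-k ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a future date and entity mentions have already been extracted per tweet, it replaces the overlap thresholds with a simple minimum tweet count (`min_tweets`), and it ranks candidates by tweet count, which the abstract does not specify. All function names and the toy data are hypothetical.

```python
from collections import defaultdict
from datetime import date


def cluster_by_date_and_entity(tweets, min_tweets=2):
    """Group tweets that share a (future date, entity) pair.

    `tweets` holds (tweet_id, event_date, entities) tuples; the date and
    entity extraction is assumed to have happened upstream.
    """
    clusters = defaultdict(set)
    for tweet_id, event_date, entities in tweets:
        for entity in entities:
            clusters[(event_date, entity.lower())].add(tweet_id)
    # Keep only pairs supported by enough distinct tweets (assumed threshold).
    return {key: ids for key, ids in clusters.items() if len(ids) >= min_tweets}


def rank_events(clusters):
    """Rank candidates by supporting tweet count (an assumed ranking criterion)."""
    return sorted(clusters, key=lambda key: (-len(clusters[key]), key))


def recall_at_k(ranked, gold, k):
    """Fraction of gold-standard events found among the top-k candidates."""
    if not gold:
        return 0.0
    return len(set(ranked[:k]) & set(gold)) / len(gold)


# Toy data, invented purely for illustration.
tweets = [
    (1, date(2016, 9, 10), ["Festival"]),
    (2, date(2016, 9, 10), ["festival", "Park"]),
    (3, date(2016, 9, 12), ["Match"]),
    (4, date(2016, 9, 12), ["Match"]),
    (5, date(2016, 9, 12), ["Match"]),
]
clusters = cluster_by_date_and_entity(tweets)
ranked = rank_events(clusters)
gold = [(date(2016, 9, 12), "match"), (date(2016, 9, 20), "parade")]
print(ranked)
print(recall_at_k(ranked, gold, k=1))
```

In this toy run, the single-tweet pair for "park" is discarded, and one of the two gold events appears in the top-1 list. The reported recall-at-250 is the same quantity computed with k = 250 against calendar-derived gold events.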

Original language: English
Pages (from-to): 655-686
Number of pages: 32
Journal: Natural Language Engineering
Volume: 22
Issue number: 5
Publication status: Published - 1 Sep 2016
Externally published: Yes
