Crowdsourcing Salient Information from News and Tweets

O.A. Inel, T. Caselli, L.M. Aroyo

Research output: Chapter in Book / Report / Conference proceeding; Conference contribution; Academic; peer-review

Abstract

The increasing streams of information pose challenges to both humans and machines. On the one hand, humans need to identify relevant information and consume only the information that matches their interests. On the other hand, machines need to understand the information that is published in online data streams and generate concise and meaningful overviews. We consider events as prime factors to query for information and generate meaningful context. The focus of this paper is to acquire empirical insights for identifying salience features in tweets and news about a target event, i.e., the event of “whaling”. We first derive a methodology to identify such features by building up a knowledge space of the event enriched with relevant phrases and sentiments, ranked by their novelty. We applied this methodology to tweets and have performed preliminary work towards adapting it to news articles. Our results show that crowdsourcing text relevance, sentiments and novelty (1) can be a main step in identifying salient information, and (2) provides a deeper and more precise understanding of the data at hand compared to state-of-the-art approaches.
Language: English
Title of host publication: Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, Portoroz, Slovenia, May 23-28, 2016
Publisher: European Language Resources Association (ELRA)
ISBN (Print): 9782951740891
Publication status: Published - 2016

Cite this

Inel, O. A., Caselli, T., & Aroyo, L. M. (2016). Crowdsourcing Salient Information from News and Tweets. In Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, Portoroz, Slovenia, May 23-28, 2016. European Language Resources Association (ELRA).