Crowdsourcing inclusivity: Dealing with diversity of opinions, perspectives and ambiguity in annotated data: The Crowdtruth tutorial

Lora Aroyo, Zoltán Szlávik, Anca Dumitrache, Benjamin Timmermans, Oana Inel, Chris Welty

Research output: Chapter in Book / Report / Conference proceeding › Conference contribution › Academic › peer-review


In this tutorial, we introduce a novel crowdsourcing methodology called CrowdTruth [1, 9]. The central characteristic of CrowdTruth is harnessing the diversity in human interpretation to capture the wide range of opinions and perspectives, and thus provide more reliable, realistic and inclusive real-world annotated data for training and evaluating machine learning components. Unlike other methods, we do not discard dissenting votes, but incorporate them into a richer and more continuous representation of truth. CrowdTruth is a widely used crowdsourcing methodology adopted by industrial partners and public organizations such as Google, IBM, New York Times, Cleveland Clinic, Crowdynews, Sound and Vision archive, Rijksmuseum, and in a multitude of domains such as AI, news, medicine, social media, cultural heritage, and social sciences. The goal of this tutorial is to introduce the audience to a novel approach to crowdsourcing that takes advantage of the diversity of opinions and perspectives that is inherent to the Web, as methods that deal with disagreement and diversity in crowdsourcing have become increasingly popular. Creating this more complex notion of truth contributes directly to the larger discussion on how to make the Web more reliable, diverse and inclusive.

Original language: English
Title of host publication: The Web Conference 2019
Subtitle of host publication: Proceedings of The World Wide Web Conference WWW 2019
Publisher: Association for Computing Machinery, Inc
Number of pages: 2
ISBN (Electronic): 9781450366755
Publication status: Published - 13 May 2019
Event: 2019 World Wide Web Conference, WWW 2019 - San Francisco, United States
Duration: 13 May 2019 to 17 May 2019


Conference: 2019 World Wide Web Conference, WWW 2019
Country/Territory: United States
City: San Francisco


  • Ambiguity
  • Computational Social Sciences
  • Crowdsourcing
  • Digital Humanities
  • Diversity
  • Ground Truth
  • Inter-annotator Disagreement
  • Medical Text Annotation
  • Perspectives

