INCA: Infrastructure for content analysis

Damian Trilling, Bob Van De Velde, Anne C. Kroon, Felicia Locherbach, Theo Araujo, Joanna Strycharz, Tamara Raats, Lisa De Klerk, Jeroen G.F. Jonkman

Research output: Chapter in Book / Report / Conference proceeding › Conference contribution › Academic › peer-review


We present INCA (short for INfrastructure for Content Analysis), a Python module for collecting, storing, processing, and analyzing a wide variety of media content, including but not limited to news, political debates, social media, forums, and customer reviews. Using Elasticsearch as a database backend and Celery for task management, it makes automated content analysis (ACA) scalable. INCA's main objective is to enable and promote an integrated workflow. It focuses on the re-usability of data, processors, and analyses, making all steps of ACA accessible to social scientists without requiring advanced programming skills. Here, we present the aim, implementation, and recommended workflow for INCA.
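
The integrated workflow described above (collect, store, process, analyze, with each step re-using what earlier steps stored) can be illustrated with a minimal sketch. This is not INCA's API: every name below (`DocumentStore`, `collect`, `process`, `analyze`) is hypothetical, an in-memory dictionary stands in for the Elasticsearch backend, and the steps run synchronously rather than as Celery tasks.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class DocumentStore:
    """In-memory stand-in for a document database (hypothetical, not INCA's API)."""
    docs: dict = field(default_factory=dict)

    def save(self, doc_id, doc):
        # Merge new fields into the stored document so earlier results stay re-usable.
        self.docs.setdefault(doc_id, {}).update(doc)

    def all(self):
        return self.docs.items()


def collect(store, raw_items):
    """Collection step: store each raw text together with its source type."""
    for doc_id, (source, text) in enumerate(raw_items):
        store.save(doc_id, {"source": source, "text": text})


def process(store):
    """Processing step: add a lowercased token list next to the raw text."""
    for doc_id, doc in list(store.all()):
        tokens = [t.strip(".,!?").lower() for t in doc["text"].split()]
        store.save(doc_id, {"tokens": [t for t in tokens if t]})


def analyze(store, source=None):
    """Analysis step: count tokens, optionally restricted to one source type."""
    counts = Counter()
    for _, doc in store.all():
        if source is None or doc["source"] == source:
            counts.update(doc["tokens"])
    return counts


store = DocumentStore()
collect(store, [
    ("news", "Parliament debates the new budget."),
    ("review", "Great budget hotel, great staff!"),
])
process(store)
news_counts = analyze(store, source="news")
all_counts = analyze(store)
```

Because the processed tokens are saved alongside the raw text, the two analyses at the end re-use the same stored documents without collecting or processing them again, which is the kind of re-usability the abstract emphasizes.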
Original language: English
Title of host publication: Proceedings - IEEE 14th International Conference on eScience, e-Science 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 2
ISBN (Electronic): 9781538691564
Publication status: Published - 24 Dec 2018
Externally published: Yes
Event: 14th IEEE International Conference on eScience, e-Science 2018 - Amsterdam, Netherlands
Duration: 29 Oct 2018 to 1 Nov 2018

Publication series

Name: Proceedings - IEEE 14th International Conference on eScience, e-Science 2018


Conference: 14th IEEE International Conference on eScience, e-Science 2018


Keywords

  • Automated content analysis
  • Communication science
  • Python module
  • Social science

