Data science and automation in the process of theorizing: Machine learning’s power of induction in the co-duction cycle

Daan Kolkman, Gwendolyn K. Lee*, Arjen van Witteloostuijn

*Corresponding author for this work

Research output: Contribution to Journal › Article › Academic › peer-review

Abstract

Recent calls to take up data science revolve around either the superior predictive performance associated with machine learning or the potential of data science techniques for exploratory data analysis. Many believe that these strengths come at the cost of explanatory insights, which form the basis for theorization. In this paper, we show that this trade-off is false. When used as part of a full research process that includes inductive, deductive and abductive steps, machine learning can offer explanatory insights and provide a solid basis for theorization. We present a systematic five-step theory-building and theory-testing cycle that consists of: 1. Element identification (reduction); 2. Exploratory analysis (induction); 3. Hypothesis development (retroduction); 4. Hypothesis testing (deduction); and 5. Theorization (abduction). We demonstrate the usefulness of this approach, which we refer to as co-duction, in a vignette in which we study firm growth with real-world observational data.

Original language: English
Article number: e0309318
Pages (from-to): 1-30
Number of pages: 30
Journal: PLoS ONE
Volume: 19
Early online date: 4 Nov 2024
Publication status: Published - Nov 2024

Bibliographical note

Publisher Copyright:
This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
