Abstract
We identify three gaps that limit the utility and obstruct the progress of computational text analysis methods (CTAM) for social science research. First, we contend that CTAM development has prioritized technological over validity concerns, giving limited attention to the operationalization of social scientific measurements. Second, we identify a mismatch between CTAMs’ focus on extracting specific contents and document-level patterns, and social science researchers’ need for measuring multiple, often complex contents in the text. Third, we argue that the dominance of English language tools depresses comparative research and inclusivity toward scholarly communities examining languages other than English. We substantiate our claims by drawing upon a broad review of methodological work in the computational social sciences, as well as an inventory of leading research publications using quantitative textual analysis. Subsequently, we discuss implications of these three gaps for social scientists’ uneven uptake of CTAM, as well as the field of computational social science text research as a whole. Finally, we propose a research agenda intended to bridge the identified gaps and improve the validity, utility, and inclusiveness of CTAM.
Original language | English |
---|---|
Pages (from-to) | 1-18 |
Number of pages | 18 |
Journal | Communication Methods and Measures |
Volume | 16 |
Issue number | 1 |
Early online date | 27 Dec 2021 |
Publication status | Published - 2022 |
Bibliographical note
Funding Information: This work was supported by the Horizon 2020 Framework Programme [951832].
Publisher Copyright:
© 2021 The Author(s). Published with license by Taylor & Francis Group, LLC.