Studying conceptual change using embedding models has become increasingly popular in the Digital Humanities community, whereas critical observations about these models have received less attention. This paper investigates how known pitfalls can affect the conclusions drawn in a digital humanities study, using the concept “racism” as a use case. In addition, we suggest an approach for modeling a complex concept in terms of words and relations representative of the conceptual system. Our results show that different models created from the same data yield different results, but they also indicate that using different model architectures, comparing different corpora, and comparing against control words and relations can help to identify which results are solid and which may be artefacts. We propose guidelines for conducting similar studies, but also note that more work is needed to fully understand how artefacts can be distinguished from actual conceptual changes.
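The paper itself does not include code, but the instability it reports (different models trained on the same data yielding different results) is often quantified by comparing a target word's nearest neighbours across model runs. The sketch below illustrates one such diagnostic under stated assumptions: the embeddings are hypothetical toy vectors standing in for, e.g., two word2vec runs with different random seeds, and `neighbour_overlap` is an illustrative helper, not a function from the paper.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_neighbours(embeddings, target, k=3):
    """Rank all other words by cosine similarity to the target word."""
    ranked = sorted(
        (w for w in embeddings if w != target),
        key=lambda w: cosine(embeddings[target], embeddings[w]),
        reverse=True,
    )
    return ranked[:k]

def neighbour_overlap(model_a, model_b, target, k=3):
    """Jaccard overlap of the target's k nearest neighbours in two models.

    A value near 1.0 suggests a stable neighbourhood; a low value warns
    that conclusions about the target word may be run-specific artefacts.
    """
    a = set(nearest_neighbours(model_a, target, k))
    b = set(nearest_neighbours(model_b, target, k))
    return len(a & b) / len(a | b)

# Two hypothetical embedding models over the same vocabulary, standing in
# for two training runs on the same corpus with different random seeds.
VOCAB = ["racism", "prejudice", "bias", "weather", "music", "politics"]

def toy_model(seed, dim=8):
    rng = random.Random(seed)
    return {w: [rng.gauss(0, 1) for _ in range(dim)] for w in VOCAB}

model_1, model_2 = toy_model(seed=1), toy_model(seed=2)
print(neighbour_overlap(model_1, model_2, "racism", k=3))
```

In a real study the same comparison would be run for the target concept words and for control words, so that instability affecting the whole vocabulary can be separated from change specific to the concept under investigation.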
Title of host publication: The 1st International Workshop on Computational Approaches to Historical Language Change
Subtitle of host publication: Proceedings of the Workshop
Place of publication: Florence, Italy
Publisher: Association for Computational Linguistics (ACL)
Number of pages: 11
Publication status: Published - Aug 2019