TY - GEN
T1 - Evaluating the consistency of word embeddings from small data
AU - Bloem, Jelke
AU - Fokkens, Antske
AU - Herbelot, Aurélie
PY - 2019/9
Y1 - 2019/9
AB - In this work, we address the evaluation of distributional semantic models trained on smaller, domain-specific texts, particularly philosophical text. Specifically, we inspect the behaviour of models using a pretrained background space in learning. We propose a measure of consistency which can be used as an evaluation metric when no in-domain gold-standard data is available. This measure simply computes the ability of a model to learn similar embeddings from different parts of some homogeneous data. We show that in spite of being a simple evaluation, consistency actually depends on various combinations of factors, including the nature of the data itself, the model used to train the semantic space, and the frequency of the learned terms, both in the background space and in the in-domain data of interest.
UR - http://www.scopus.com/inward/record.url?scp=85076466118&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85076466118&partnerID=8YFLogxK
UR - https://aclanthology.org/volumes/R19-1/
U2 - 10.26615/978-954-452-056-4_016
DO - 10.26615/978-954-452-056-4_016
M3 - Conference contribution
AN - SCOPUS:85076466118
SN - 9789544520557
T3 - International Conference Recent Advances in Natural Language Processing, RANLP
SP - 132
EP - 141
BT - Natural Language Processing in a Deep Learning World
A2 - Angelova, Galia
A2 - Mitkov, Ruslan
A2 - Nikolova, Ivelina
A2 - Temnikova, Irina
PB - Incoma Ltd.
T2 - 12th International Conference on Recent Advances in Natural Language Processing, RANLP 2019
Y2 - 2 September 2019 through 4 September 2019
ER -