Abstract
In this work we analyze the named entity representations learned by Transformer-based language models. We investigate the role entities play in two tasks: a language modeling task and a sequence classification task. For this purpose we collect RefNews-12, a novel news topic classification dataset with 12 topics. We perform two complementary methods of analysis. First, we use diagnostic models that allow us to quantify to what degree entity information is present in the hidden representations. Second, we perform entity mention substitution to measure how substitute entities with different properties impact model performance. By controlling for model uncertainty we are able to show that entities are identified and, depending on the task, play a measurable role in the model's predictions. Additionally, we show that the entities' types alone are not enough to account for this. Finally, we find that the frequency with which entities occur is important for the masked language modeling task, and that the entities' distributions over topics are important for topic classification.
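The diagnostic-model analysis is, in essence, a probing classifier trained on frozen hidden states. Below is a minimal sketch assuming a BERT-style encoder and a linear probe; the model checkpoint, mean pooling, example texts, and labels are illustrative assumptions, not the paper's actual setup:

```python
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

# Hypothetical encoder choice; the paper's exact model may differ.
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModel.from_pretrained("bert-base-cased")

texts = ["Apple unveiled a new iPhone.", "Paris hosted the summit."]
entity_labels = [0, 1]  # hypothetical entity labels, e.g. ORG vs. LOC

features = []
with torch.no_grad():
    for text in texts:
        inputs = tokenizer(text, return_tensors="pt")
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, dim)
        # Mean-pool token states as a stand-in for an entity-mention span.
        features.append(hidden.mean(dim=1).squeeze(0).numpy())

# Linear probe: accuracy above a majority-class baseline suggests the
# hidden representations encode the probed entity information.
probe = LogisticRegression(max_iter=1000).fit(features, entity_labels)
print(probe.score(features, entity_labels))
```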
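Entity mention substitution can be sketched in a similar spirit for the masked language modeling task: swap a mention for substitute entities with different properties (e.g. corpus frequency) and observe how the model's predictions shift. The template and substitute entities below are hypothetical examples, not drawn from the paper:

```python
from transformers import pipeline

# Fill a masked slot under different substitute entities and compare
# the top prediction and its probability.
fill = pipeline("fill-mask", model="bert-base-cased")

template = "{} gave a speech in the [MASK] yesterday."
for entity in ["Obama", "Merkel"]:  # substitutes differing in properties
    top = fill(template.format(entity))[0]  # highest-scoring completion
    print(f"{entity}: {top['token_str']} (p={top['score']:.3f})")
```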
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the BlackboxNLP workshop: Analyzing and interpreting neural networks for NLP |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 384-393 |
| Number of pages | 10 |
| ISBN (Electronic) | 9781959429050 |
| Publication status | Published - 2022 |