TY - GEN
T1 - Semantic overfitting
T2 - 26th International Conference on Computational Linguistics, COLING 2016
AU - Ilievski, Filip
AU - Postma, Marten
AU - Vossen, Piek
PY - 2016/1/1
Y1 - 2016/1/1
AB - Semantic text processing faces the challenge of defining the relation between lexical expressions and the world to which they refer within a period of time. It is unclear whether the current test sets used to evaluate disambiguation tasks are representative of the full complexity of this time-anchored relation, resulting in semantic overfitting to a specific period and the frequent phenomena within it. We conceptualize and formalize a set of metrics that evaluate this complexity of datasets. We provide evidence for their applicability on five different disambiguation tasks. To challenge semantic overfitting of disambiguation systems, we propose a time-based, metric-aware method for developing datasets in a systematic and semi-automated manner, as well as an event-based QA task.
UR - http://www.scopus.com/inward/record.url?scp=85051955052&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85051955052&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85051955052
SN - 9784879747020
T3 - COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016: Technical Papers
SP - 1180
EP - 1191
BT - COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016
PB - Association for Computational Linguistics, ACL Anthology
Y2 - 11 December 2016 through 16 December 2016
ER -