TY - GEN
T1 - A Framework for Evaluating Entity Alignment Impact on Downstream Knowledge Discovery
AU - Shoilee, Sarah Binta Alam
AU - de Boer, Victor
AU - van Ossenbruggen, Jacco
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
N2 - Entity alignment (EA) is a crucial process in integrating data from multiple sources, facilitating Knowledge Discovery (KD). Despite advances in EA techniques, selecting the appropriate algorithm for downstream KD tasks remains challenging due to several issues. These issues include difficulties in aligning domain entities, the impact on KD tasks, and bias in data distribution. This paper presents a framework to address these challenges by providing a systematic approach to evaluate the impact of different EA algorithms based on three critical aspects: quality of alignment, information retrieved through alignment, and information imbalance or bias introduced through alignment. Our framework enables users to make informed decisions about algorithm selection, ensuring reliable, effective, and balanced KD. We demonstrate the application of the framework using a digital humanities case study, where the KD task involves enriching information about colonial collections. The choice of such a sensitive and historically imbalanced use case allows us to highlight how the proposed framework helps identify suitable algorithms and to emphasize the importance of understanding the propagated information biases introduced through data alignment.
AB - Entity alignment (EA) is a crucial process in integrating data from multiple sources, facilitating Knowledge Discovery (KD). Despite advances in EA techniques, selecting the appropriate algorithm for downstream KD tasks remains challenging due to several issues. These issues include difficulties in aligning domain entities, the impact on KD tasks, and bias in data distribution. This paper presents a framework to address these challenges by providing a systematic approach to evaluate the impact of different EA algorithms based on three critical aspects: quality of alignment, information retrieved through alignment, and information imbalance or bias introduced through alignment. Our framework enables users to make informed decisions about algorithm selection, ensuring reliable, effective, and balanced KD. We demonstrate the application of the framework using a digital humanities case study, where the KD task involves enriching information about colonial collections. The choice of such a sensitive and historically imbalanced use case allows us to highlight how the proposed framework helps identify suitable algorithms and to emphasize the importance of understanding the propagated information biases introduced through data alignment.
KW - Digital Humanities
KW - Entity Alignment
KW - Evaluation Framework
KW - Knowledge Discovery
UR - http://www.scopus.com/inward/record.url?scp=85210850469&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85210850469&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-77792-9_14
DO - 10.1007/978-3-031-77792-9_14
M3 - Conference contribution
AN - SCOPUS:85210850469
SN - 9783031777912
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 226
EP - 242
BT - Knowledge Engineering and Knowledge Management
A2 - Alam, Mehwish
A2 - Rospocher, Marco
A2 - van Erp, Marieke
A2 - Hollink, Laura
A2 - Gesese, Genet Asefa
PB - Springer Science and Business Media Deutschland GmbH
T2 - 24th International Conference on Knowledge Engineering and Knowledge Management, EKAW 2024
Y2 - 26 November 2024 through 28 November 2024
ER -