TY - GEN
T1 - Discovering Research Hypotheses in Social Science Using Knowledge Graph Embeddings
AU - de Haan, R.
AU - Tiddi, I.
AU - Beek, W.
PY - 2021
Y1 - 2021
N2 - In an era of ever-increasing availability of scientific publications, scientists struggle to keep pace with the literature, interpret research results and identify new research hypotheses to falsify. This is particularly true in fields such as the social sciences, where automated support for scientific discovery is still widely unavailable and unimplemented. In this work, we introduce an automated system that supports social scientists in identifying new research hypotheses. With the idea that knowledge graphs help model domain-specific information, and that machine learning can be used to identify the most relevant facts therein, we frame the problem of hypothesis discovery as a link prediction task, where the ComplEx model is used to predict new relationships between entities of a knowledge graph representing scientific papers and their experimental details. The final output consists of fully formulated hypotheses including the newly discovered triples (hypothesis statement), along with supporting statements from the knowledge graph (hypothesis evidence and hypothesis history). A quantitative and qualitative evaluation is carried out with experts in the field. Encouraging results show that a simple combination of machine learning and knowledge graph methods can serve as a basis for automated scientific discovery.
AB - In an era of ever-increasing availability of scientific publications, scientists struggle to keep pace with the literature, interpret research results and identify new research hypotheses to falsify. This is particularly true in fields such as the social sciences, where automated support for scientific discovery is still widely unavailable and unimplemented. In this work, we introduce an automated system that supports social scientists in identifying new research hypotheses. With the idea that knowledge graphs help model domain-specific information, and that machine learning can be used to identify the most relevant facts therein, we frame the problem of hypothesis discovery as a link prediction task, where the ComplEx model is used to predict new relationships between entities of a knowledge graph representing scientific papers and their experimental details. The final output consists of fully formulated hypotheses including the newly discovered triples (hypothesis statement), along with supporting statements from the knowledge graph (hypothesis evidence and hypothesis history). A quantitative and qualitative evaluation is carried out with experts in the field. Encouraging results show that a simple combination of machine learning and knowledge graph methods can serve as a basis for automated scientific discovery.
U2 - 10.1007/978-3-030-77385-4_28
DO - 10.1007/978-3-030-77385-4_28
M3 - Conference contribution
SN - 9783030773847
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 477
EP - 494
BT - The Semantic Web
A2 - Verborgh, Ruben
A2 - Hose, Katja
A2 - Paulheim, Heiko
A2 - Champin, Pierre-Antoine
A2 - Maleshkova, Maria
A2 - Corcho, Oscar
A2 - Ristoski, Petar
A2 - Alam, Mehwish
PB - Springer Science and Business Media Deutschland GmbH
T2 - 18th European Semantic Web Conference, ESWC 2021
Y2 - 6 June 2021 through 10 June 2021
ER -