TY - GEN
T1 - Tab2Know
T2 - 19th International Semantic Web Conference, ISWC 2020
AU - Kruit, Benno
AU - He, Hongyu
AU - Urbani, Jacopo
PY - 2020
Y1 - 2020
N2 - Tables in scientific papers contain a wealth of valuable knowledge for the scientific enterprise. To help the many of us who frequently consult this type of knowledge, we present Tab2Know, a new end-to-end system to build a Knowledge Base (KB) from tables in scientific papers. Tab2Know addresses the challenge of automatically interpreting the tables in papers and of disambiguating the entities that they contain. To solve these problems, we propose a pipeline that employs both statistical-based classifiers and logic-based reasoning. First, our pipeline applies weakly supervised classifiers to recognize the type of tables and columns, with the help of a data labeling system and an ontology specifically designed for our purpose. Then, logic-based reasoning is used to link equivalent entities (via sameAs links) in different tables. An empirical evaluation of our approach using a corpus of papers in the Computer Science domain has returned satisfactory performance. This suggests that ours is a promising step to create a large-scale KB of scientific knowledge.
UR - http://www.scopus.com/inward/record.url?scp=85096538260&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85096538260&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-62419-4_20
DO - 10.1007/978-3-030-62419-4_20
M3 - Conference contribution
AN - SCOPUS:85096538260
SN - 9783030624187
VL - 1
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 349
EP - 365
BT - The Semantic Web – ISWC 2020
A2 - Pan, Jeff Z.
A2 - Tamma, Valentina
A2 - d’Amato, Claudia
A2 - Janowicz, Krzysztof
A2 - Fu, Bo
A2 - Polleres, Axel
A2 - Seneviratne, Oshani
A2 - Kagal, Lalana
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 2 November 2020 through 6 November 2020
ER -