Abstract
Cultural heritage institutions hold collections of printed newspapers that are valuable resources for the study of history, linguistics and other Digital Humanities domains. Effective retrieval of newspaper content based on metadata alone is nearly impossible, which makes retrieval based on (digitized) full text particularly relevant. Europeana, Europe’s Digital Library, is in a position to provide access to large newspaper collections with full-text resources. Full-text corpora are also relevant to Europeana’s objective of promoting the use of cultural heritage resources within research infrastructures. We have derived requirements for aggregating and publishing Europeana’s newspaper full-text corpus in an interoperable way, based on investigations into the specific characteristics of cultural data, the needs of two research infrastructures (CLARIN and EUDAT) and the practices being promoted in the International Image Interoperability Framework (IIIF) community. We have then defined a “full-text profile” for the Europeana Data Model, which is being applied to Europeana’s newspaper corpus.
Original language | English |
---|---|
Title of host publication | 2nd Conference on Language, Data and Knowledge, LDK 2019 |
Editors | Maria Eskevich, Gerard de Melo, Christian Fath, John P. McCrae, Paul Buitelaar, Christian Chiarcos, Bettina Klimek, Milan Dojchinovski |
Publisher | Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing |
Pages | 1-14 |
Number of pages | 14 |
ISBN (Electronic) | 9783959771054 |
Publication status | Published - 1 May 2019 |
Event | 2nd Conference on Language, Data and Knowledge, LDK 2019 - Leipzig, Germany; Duration: 20 May 2019 → 23 May 2019 |
Publication series
Name | OpenAccess Series in Informatics |
---|---|
Volume | 70 |
ISSN (Print) | 2190-6807 |
Conference
Conference | 2nd Conference on Language, Data and Knowledge, LDK 2019 |
---|---|
Country/Territory | Germany |
City | Leipzig |
Period | 20/05/19 → 23/05/19 |
Funding
Nuno Freire: This work was partly supported by Portuguese national funds through Fundação para a Ciência e a Tecnologia (FCT) with reference UID/CEC/50021/2019, and by the European Commission under contract number 30-CE-0885387/00-80.e.
Keywords
- Cultural heritage
- Data aggregation
- Full-text
- Interoperability
- Metadata
- Research infrastructures