CEDAR: The Dutch Historical Censuses as Linked Open Data

Albert Meroño-Peñuela, Ashkan Ashkpour, Christophe Guéret, Stefan Schlobach

Research output: Contribution to Journal > Article > Academic > peer-review

Abstract

Here, we describe the CEDAR dataset, a five-star Linked Open Data representation of the Dutch historical censuses. These were conducted in the Netherlands once every 10 years from 1795 to 1971. We produce a linked dataset from a digitized sample of 2,288 tables. It contains more than 6.8 million statistical observations about the demography, labour and housing of Dutch society in the 18th, 19th and 20th centuries. The dataset is modeled using the RDF Data Cube, Open Annotation, and PROV vocabularies. These are used to represent the multidimensionality of the data, to express rules of data harmonization, and to keep track of the provenance of all data points and their transformations, respectively. We link observations within the dataset to well-known standard classification systems in social history, such as the Historical International Standard Classification of Occupations (HISCO) and the Amsterdamse Code (AC). The three contributions of the dataset are (1) easier access to integrated census data for historical researchers; (2) richer connections to related Linked Data resources; and (3) novel concept schemes of historical relevance, like classifications of historical religions and historical house types.
Language: English
Pages: 297-310
Number of pages: 14
Journal: Semantic Web
Volume: 8
Issue number: 2
DOI: 10.3233/SW-160233
Publication status: Published - 2017

Keywords

  • census data
  • Linked Open Data
  • RDF Data Cube
  • Social history

Cite this

Meroño-Peñuela, Albert; Ashkpour, Ashkan; Guéret, Christophe; Schlobach, Stefan. CEDAR: The Dutch Historical Censuses as Linked Open Data. In: Semantic Web. 2017; Vol. 8, No. 2. pp. 297-310.
@article{ba103be04d354166b43301342eb3215b,
title = "CEDAR: The Dutch Historical Censuses as Linked Open Data",
abstract = "Here, we describe the CEDAR dataset, a five-star Linked Open Data representation of the Dutch historical censuses. These were conducted in the Netherlands once every 10 years from 1795 to 1971. We produce a linked dataset from a digitized sample of 2,288 tables. It contains more than 6.8 million statistical observations about the demography, labour and housing of Dutch society in the 18th, 19th and 20th centuries. The dataset is modeled using the RDF Data Cube, Open Annotation, and PROV vocabularies. These are used to represent the multidimensionality of the data, to express rules of data harmonization, and to keep track of the provenance of all data points and their transformations, respectively. We link observations within the dataset to well known standard classification systems in social history, such as the Historical International Standard Classification of Occupations (HISCO) and the Amsterdamse Code (AC). The three contributions of the dataset are (1) an easier access to integrated census data for historical researchers; (2) richer connections to related Linked Data resources; and (3) novel concept schemes of historical relevance, like classifications of historical religions and historical house types.",
keywords = "census data, Linked Open Data, RDF Data Cube, Social history",
author = "Albert Mero{\~n}o-Pe{\~n}uela and Ashkan Ashkpour and Christophe Gu{\'e}ret and Stefan Schlobach",
year = "2017",
doi = "10.3233/SW-160233",
language = "English",
volume = "8",
pages = "297--310",
journal = "Semantic Web",
issn = "1570-0844",
publisher = "IOS Press",
number = "2",
}

TY - JOUR

T1 - CEDAR: The Dutch Historical Censuses as Linked Open Data

T2 - Semantic Web

AU - Meroño-Peñuela, Albert

AU - Ashkpour, Ashkan

AU - Guéret, Christophe

AU - Schlobach, Stefan

PY - 2017

Y1 - 2017

AB - Here, we describe the CEDAR dataset, a five-star Linked Open Data representation of the Dutch historical censuses. These were conducted in the Netherlands once every 10 years from 1795 to 1971. We produce a linked dataset from a digitized sample of 2,288 tables. It contains more than 6.8 million statistical observations about the demography, labour and housing of Dutch society in the 18th, 19th and 20th centuries. The dataset is modeled using the RDF Data Cube, Open Annotation, and PROV vocabularies. These are used to represent the multidimensionality of the data, to express rules of data harmonization, and to keep track of the provenance of all data points and their transformations, respectively. We link observations within the dataset to well known standard classification systems in social history, such as the Historical International Standard Classification of Occupations (HISCO) and the Amsterdamse Code (AC). The three contributions of the dataset are (1) an easier access to integrated census data for historical researchers; (2) richer connections to related Linked Data resources; and (3) novel concept schemes of historical relevance, like classifications of historical religions and historical house types.

KW - census data

KW - Linked Open Data

KW - RDF Data Cube

KW - Social history

UR - http://www.scopus.com/inward/record.url?scp=85004045046&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85004045046&partnerID=8YFLogxK

U2 - 10.3233/SW-160233

DO - 10.3233/SW-160233

M3 - Article

VL - 8

SP - 297

EP - 310

JO - Semantic Web

JF - Semantic Web

SN - 1570-0844

IS - 2

ER -