Nanopublications: A growing resource of provenance-centric scientific linked data

Tobias Kuhn, Albert Merono-Penuela, Alexander Malic, Jorrit H. Poelen, Allen H. Hurlbert, Emilio Centeno Ortiz, Laura I. Furlong, Nuria Queralt-Rosinach, Christine Chichester, Juan M. Banda, Egon Willighagen, Friederike Ehrhart, Chris Evelo, Tareq B. Malas, Michel Dumontier

Research output: Chapter in Book / Report / Conference proceeding; Conference contribution; Academic; peer-reviewed

Abstract

Nanopublications are a Linked Data format for scholarly data publishing that has received considerable uptake in the last few years. In contrast to the common Linked Data publishing practice, nanopublications work at the granular level of atomic information snippets and provide a consistent container format to attach provenance and metadata at this atomic level. While the nanopublications format is domain-independent, the datasets that have become available in this format are mostly from Life Science domains, including data about diseases, genes, proteins, drugs, biological pathways, and biotic interactions. More than 10 million such nanopublications have been published, which now form a valuable resource for studies on the domain level of the given Life Science domains as well as on the more technical levels of provenance modeling and heterogeneous Linked Data. We provide here an overview of this combined nanopublication dataset, show the results of some overarching analyses, and describe how it can be accessed and queried.
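The container idea described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the commonly described nanopublication layout in which an assertion, its provenance, and its publication info each live in their own named graph (the head graph that ties them together is omitted for brevity). The class name, the `#assertion`-style graph suffixes, the `example.org` URIs, and the sample triples are all hypothetical and only serve the illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Triple = Tuple[str, str, str]      # (subject, predicate, object)
Quad = Tuple[str, str, str, str]   # triple plus the named graph it belongs to


@dataclass
class Nanopublication:
    """Simplified, hypothetical model of one nanopublication."""
    uri: str
    assertion: List[Triple]
    provenance: List[Triple] = field(default_factory=list)
    pubinfo: List[Triple] = field(default_factory=list)

    def graph_uri(self, part: str) -> str:
        # Assumed naming scheme: one named graph per part of the container.
        return f"{self.uri}#{part}"

    def to_quads(self) -> List[Quad]:
        # Flatten the three parts into quads, keeping each part in its own graph.
        quads: List[Quad] = []
        parts = (
            ("assertion", self.assertion),
            ("provenance", self.provenance),
            ("pubinfo", self.pubinfo),
        )
        for part, triples in parts:
            graph = self.graph_uri(part)
            quads.extend((s, p, o, graph) for s, p, o in triples)
        return quads


def build_example() -> Nanopublication:
    # All identifiers below are made up for the sketch.
    np_uri = "http://example.org/np1"
    return Nanopublication(
        uri=np_uri,
        # One atomic statement, e.g. a gene-disease association.
        assertion=[("http://example.org/geneA",
                    "http://example.org/associatedWith",
                    "http://example.org/diseaseB")],
        # Provenance is about the assertion graph itself.
        provenance=[(f"{np_uri}#assertion",
                     "http://www.w3.org/ns/prov#wasDerivedFrom",
                     "http://example.org/sourceDataset")],
        # Publication info is about the nanopublication as a whole.
        pubinfo=[(np_uri,
                  "http://purl.org/dc/terms/creator",
                  "http://example.org/alice")],
    )


if __name__ == "__main__":
    for quad in build_example().to_quads():
        print(quad)
```

The point of the sketch is that provenance attaches to a single small assertion rather than to a whole dataset, which is what the abstract means by working at the atomic level.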

Language: English
Title of host publication: Proceedings - IEEE 14th International Conference on eScience, e-Science 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 83-92
Number of pages: 10
ISBN (Electronic): 9781538691564
DOIs: https://doi.org/10.1109/eScience.2018.00024
Publication status: Published - 24 Dec 2018
Event: 14th IEEE International Conference on eScience, e-Science 2018 - Amsterdam, Netherlands
Duration: 29 Oct 2018 to 1 Nov 2018

Conference

Conference: 14th IEEE International Conference on eScience, e-Science 2018
Country: Netherlands
City: Amsterdam
Period: 29/10/18 to 1/11/18

Keywords

  • Linked Data
  • Nanopublications
  • Provenance

Cite this

Kuhn, T., Merono-Penuela, A., Malic, A., Poelen, J. H., Hurlbert, A. H., Ortiz, E. C., ... Dumontier, M. (2018). Nanopublications: A growing resource of provenance-centric scientific linked data. In Proceedings - IEEE 14th International Conference on eScience, e-Science 2018 (pp. 83-92). [8588643] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/eScience.2018.00024