Making Digital Artifacts on the Web Verifiable and Reliable

T. Kuhn, M. Dumontier

Research output: Contribution to Journal, Article, Academic, peer-review

Abstract

The current Web has no general mechanisms to make digital artifacts - such as datasets, code, texts, and images - verifiable and permanent. For digital artifacts that are supposed to be immutable, there is moreover no commonly accepted method to enforce this immutability. These shortcomings have a serious negative impact on the ability to reproduce the results of processes that rely on Web resources, which in turn heavily impacts areas such as science where reproducibility is important. To solve this problem, we propose trusty URIs containing cryptographic hash values. We show how trusty URIs can be used for the verification of digital artifacts, in a manner that is independent of the serialization format in the case of structured data files such as nanopublications. We demonstrate how the contents of these files become immutable, including dependencies to external digital artifacts and thereby extending the range of verifiability to the entire reference tree. Our approach sticks to the core principles of the Web, namely openness and decentralized architecture, and is fully compatible with existing standards and protocols. Evaluation of our reference implementations shows that these design goals are indeed accomplished by our approach, and that it remains practical even for very large files.
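
The idea of a URI that carries a cryptographic hash of the artifact it identifies can be illustrated with a minimal sketch. This does not reproduce the encoding defined in the article: the suffix layout (a dot followed by an unpadded URL-safe Base64 SHA-256 digest), the function names, and the example URI are all assumptions made for illustration. It hashes raw bytes only, so it does not cover the serialization-independent handling of structured data that the abstract mentions.

```python
import base64
import hashlib


def content_hash(data: bytes) -> str:
    """Return the SHA-256 digest of data as URL-safe Base64 without padding."""
    digest = hashlib.sha256(data).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_hash_uri(base_uri: str, data: bytes) -> str:
    """Append the content hash to base_uri (hypothetical layout: base + '.' + hash)."""
    return f"{base_uri}.{content_hash(data)}"


def verify(uri: str, data: bytes) -> bool:
    """Check that the hash suffix of uri matches the hash of the retrieved bytes."""
    # URL-safe Base64 never contains '.', so the last '.' separates the suffix.
    _, _, suffix = uri.rpartition(".")
    return suffix == content_hash(data)


if __name__ == "__main__":
    original = b"a,b\n1,2\n"
    uri = make_hash_uri("http://example.org/dataset", original)
    print(uri)
    print(verify(uri, original))       # content unchanged
    print(verify(uri, b"a,b\n1,3\n"))  # content modified after the URI was minted
```

Because the hash is part of the identifier, anyone who holds the URI can check a retrieved copy without trusting the server that delivered it, and any change to the content requires a new URI.
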
Original language: English
Journal: IEEE Transactions on Knowledge and Data Engineering
DOI: 10.1109/TKDE.2015.2419657
Publication status: Published - 7 Jul 2015

Cite this

@article{b82e0095677c435aad71ee07dd6a9ddc,
title = "Making Digital Artifacts on the Web Verifiable and Reliable",
abstract = "The current Web has no general mechanisms to make digital artifacts - such as datasets, code, texts, and images - verifiable and permanent. For digital artifacts that are supposed to be immutable, there is moreover no commonly accepted method to enforce this immutability. These shortcomings have a serious negative impact on the ability to reproduce the results of processes that rely on Web resources, which in turn heavily impacts areas such as science where reproducibility is important. To solve this problem, we propose trusty URIs containing cryptographic hash values. We show how trusty URIs can be used for the verification of digital artifacts, in a manner that is independent of the serialization format in the case of structured data files such as nanopublications. We demonstrate how the contents of these files become immutable, including dependencies to external digital artifacts and thereby extending the range of verifiability to the entire reference tree. Our approach sticks to the core principles of the Web, namely openness and decentralized architecture, and is fully compatible with existing standards and protocols. Evaluation of our reference implementations shows that these design goals are indeed accomplished by our approach, and that it remains practical even for very large files.",
author = "T. Kuhn and M. Dumontier",
year = "2015",
month = "7",
day = "7",
doi = "10.1109/TKDE.2015.2419657",
language = "English",
journal = "IEEE Transactions on Knowledge and Data Engineering",
issn = "1041-4347",
publisher = "IEEE Computer Society",

}

Making Digital Artifacts on the Web Verifiable and Reliable. / Kuhn, T.; Dumontier, M.

In: IEEE Transactions on Knowledge and Data Engineering, 07.07.2015.

TY - JOUR

T1 - Making Digital Artifacts on the Web Verifiable and Reliable

AU - Kuhn, T.

AU - Dumontier, M.

PY - 2015/7/7

Y1 - 2015/7/7

N2 - The current Web has no general mechanisms to make digital artifacts - such as datasets, code, texts, and images - verifiable and permanent. For digital artifacts that are supposed to be immutable, there is moreover no commonly accepted method to enforce this immutability. These shortcomings have a serious negative impact on the ability to reproduce the results of processes that rely on Web resources, which in turn heavily impacts areas such as science where reproducibility is important. To solve this problem, we propose trusty URIs containing cryptographic hash values. We show how trusty URIs can be used for the verification of digital artifacts, in a manner that is independent of the serialization format in the case of structured data files such as nanopublications. We demonstrate how the contents of these files become immutable, including dependencies to external digital artifacts and thereby extending the range of verifiability to the entire reference tree. Our approach sticks to the core principles of the Web, namely openness and decentralized architecture, and is fully compatible with existing standards and protocols. Evaluation of our reference implementations shows that these design goals are indeed accomplished by our approach, and that it remains practical even for very large files.

AB - The current Web has no general mechanisms to make digital artifacts - such as datasets, code, texts, and images - verifiable and permanent. For digital artifacts that are supposed to be immutable, there is moreover no commonly accepted method to enforce this immutability. These shortcomings have a serious negative impact on the ability to reproduce the results of processes that rely on Web resources, which in turn heavily impacts areas such as science where reproducibility is important. To solve this problem, we propose trusty URIs containing cryptographic hash values. We show how trusty URIs can be used for the verification of digital artifacts, in a manner that is independent of the serialization format in the case of structured data files such as nanopublications. We demonstrate how the contents of these files become immutable, including dependencies to external digital artifacts and thereby extending the range of verifiability to the entire reference tree. Our approach sticks to the core principles of the Web, namely openness and decentralized architecture, and is fully compatible with existing standards and protocols. Evaluation of our reference implementations shows that these design goals are indeed accomplished by our approach, and that it remains practical even for very large files.

U2 - 10.1109/TKDE.2015.2419657

DO - 10.1109/TKDE.2015.2419657

M3 - Article

JO - IEEE Transactions on Knowledge and Data Engineering

JF - IEEE Transactions on Knowledge and Data Engineering

SN - 1041-4347

ER -