Trust is a broad concept that in many systems is often reduced to user reputation alone. However, user reputation is just one way to determine trust. The estimation of trust can be tackled from other perspectives as well, including by looking at provenance. Here, we present a complete pipeline for estimating the trustworthiness of artifacts given their provenance and a set of sample evaluations. The pipeline is composed of a series of algorithms for (1) extracting relevant provenance features, (2) generating stereotypes of user behavior from provenance features, (3) estimating the reputation of both stereotypes and users, (4) using a combination of user and stereotype reputations to estimate the trustworthiness of artifacts and (5) selecting sets of artifacts to trust. These algorithms rely on the W3C PROV recommendations for provenance and on evidential reasoning by means of subjective logic. We evaluate the pipeline over two tagging datasets: tags and evaluations from the Netherlands Institute for Sound and Vision's Waisda? video tagging platform, as well as crowdsourced annotations from the Steve. Museum project. The approach achieves up to 85% precision when predicting tag trustworthiness. Perhaps more importantly, the pipeline provides satisfactory results using relatively little evidence through the use of provenance.
|Number of pages||28|
|Journal||ACM Journal of Data and Information Quality|
|Publication status||Published - 2016|