The goal of the Linked Art Provenance project is to support art-historical provenance research with methods that automatically integrate information from heterogeneous sources. Art provenance concerns the changes in ownership of artworks over time, involving actors, events and places. It is an important source of information for researchers interested in the history of collections. An example of an art-historical research question where provenance information is indispensable is: “Can we identify all the paintings from the collection of Pieter Cornelis Baron van Leyden (1717-88), which was dispersed at an auction in the 19th century?” The auction of the 117 paintings was recorded in a catalogue, which can serve as a basis for recording a provenance trail for each of the paintings. The problem is that paintings at the time did not have unambiguous identifiers: identification relied on textual descriptions. Currently, a researcher has to search manually, based on this textual description, for sources of subsequent provenance transactions.

We aim to automate this process by matching the textual descriptions of objects with new sources of art provenance information, such as databases, websites and digitized auction catalogues. To do so, relevant sources of provenance information have to be identified, and the retrieved information has to be normalized. We will therefore create data harmonization pipelines for different types of sources. A pipeline will consist of the following steps:
1. Query formulation – transform the textual description of a painting into an appropriate query
2. Data retrieval – retrieve data from the source
3. Data conversion – convert the data into a standardized data model
4. Entity linking – identify entities (e.g. actors, places) and link them to structured vocabularies
5. Candidate event formulation – formulate candidate provenance events
The candidate events are evaluated by the provenance researcher, thereby giving the researcher full control over the resulting provenance trail.
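As a rough illustration of how the five steps could fit together, the following Python sketch chains them over a small in-memory source. It is not the project's implementation: every function name, the keyword-overlap matching, the record fields (`title`, `buyer`, `sale_year`), the `ex:actor/1` identifier and the toy data are assumptions made up for this example. A real pipeline would query actual databases or digitized catalogues and map the results onto a standardized data model.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional, Set

# Hypothetical stopword list; a real query builder would be more careful.
STOPWORDS = {"a", "an", "and", "in", "of", "the", "with"}


@dataclass
class CandidateEvent:
    description: str                 # catalogue description being traced
    title: str                       # title of the matching source record
    owner: str                       # new owner as named in the source
    owner_id: Optional[str]          # vocabulary identifier, None if unlinked
    year: int
    accepted: Optional[bool] = None  # left for the researcher to decide


def formulate_query(description: str) -> Set[str]:
    """Step 1: reduce a free-text description to a set of keywords."""
    tokens = re.findall(r"[a-z]+", description.lower())
    return {t for t in tokens if t not in STOPWORDS}


def retrieve(query: Set[str], source: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Step 2: keep records that share at least two keywords with the query."""
    return [r for r in source if len(query & formulate_query(r["title"])) >= 2]


def convert(record: Dict[str, str]) -> Dict[str, object]:
    """Step 3: map source-specific fields onto one common shape."""
    return {
        "title": record["title"],
        "owner": record["buyer"],
        "year": int(record["sale_year"]),
    }


def link_entities(record: Dict[str, object], vocabulary: Dict[str, str]) -> Dict[str, object]:
    """Step 4: attach a vocabulary identifier to the owner, if one is known."""
    return {**record, "owner_id": vocabulary.get(record["owner"])}


def run_pipeline(description: str,
                 source: List[Dict[str, str]],
                 vocabulary: Dict[str, str]) -> List[CandidateEvent]:
    query = formulate_query(description)
    records = [link_entities(convert(r), vocabulary) for r in retrieve(query, source)]
    # Step 5: wrap each linked record as an unreviewed candidate event.
    return [
        CandidateEvent(description, r["title"], r["owner"], r["owner_id"], r["year"])
        for r in records
    ]


# Made-up toy data, for illustration only.
SOURCE = [
    {"title": "Landscape with a windmill", "buyer": "Buyer A", "sale_year": "1850"},
    {"title": "Portrait of a lady", "buyer": "Buyer B", "sale_year": "1862"},
]
VOCABULARY = {"Buyer A": "ex:actor/1"}

candidates = run_pipeline("A landscape with windmill and cattle", SOURCE, VOCABULARY)
```

In this sketch the pipeline never sets `accepted` itself. Each candidate stays unreviewed until the provenance researcher confirms or rejects it, which mirrors the researcher's control over the resulting provenance trail.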