Cultural heritage institutions increasingly provide online access to their collections. To make collections of visual artworks suitable for human access and retrieval, the represented visual objects (e.g. plants or animals) need detailed and thorough annotations. Crowdsourcing has proven a viable means of mitigating the pitfalls of automatic annotation techniques. However, unlike traditional photographic image annotation, the artwork annotation task requires workers to possess the knowledge and skills needed to identify and recognise occurrences of visual classes. The extent to which crowdsourcing can be effectively applied to artwork annotation is still an open research question. Based on a real-life case study from the Rijksmuseum Amsterdam, this paper investigates the performance of a crowd of workers drawn from the CrowdFlower platform. Our contributions include a detailed analysis of crowd annotations obtained under two annotation configurations, and a comparison of these crowd annotations with those from trusted annotators. In this study we apply a novel method for the automatic aggregation of local (i.e. bounding box) annotations, and we examine how different knowledge extraction and aggregation configurations affect the identification and recognition aspects of artwork annotation. Our work sheds new light on the process of crowdsourcing artwork annotations and shows that techniques effective for photographic image annotation cannot be straightforwardly applied to artwork annotation, thus paving the way for new research in the area.