Adding Semantics to Detectors for Video Retrieval

C. Snoek, B. Huurnink, L. Hollink, M. de Rijke, A.T. Schreiber, M. Worring

Research output: Contribution to Journal › Article › Academic › peer-review


Abstract

In this paper, we propose an automatic video retrieval method based on high-level concept detectors. Research in video analysis has reached the point where over 100 concept detectors can be learned in a generic fashion, albeit with mixed performance. Such a set of detectors is very small still compared to ontologies aiming to capture the full vocabulary a user has. We aim to throw a bridge between the two fields by building a multimedia thesaurus, i.e., a set of machine learned concept detectors that is enriched with semantic descriptions and semantic structure obtained from WordNet. Given a multimodal user query, we identify three strategies to select a relevant detector from this thesaurus, namely: text matching, ontology querying, and semantic visual querying. We evaluate the methods against the automatic search task of the TRECVID 2005 video retrieval benchmark, using a news video archive of 85 h in combination with a thesaurus of 363 machine learned concept detectors. We assess the influence of thesaurus size on video search performance, evaluate and compare the multimodal selection strategies for concept detectors, and finally discuss their combined potential using oracle fusion. The set of queries in the TRECVID 2005 corpus is too small for us to be definite in our conclusions, but the results suggest promising new lines of research. © 2007 IEEE.
Original language: English
Pages (from-to): 975-986
Number of pages: 12
Journal: IEEE Transactions on Multimedia
Volume: 9
Issue number: 5
Publication status: Published - 2007

Bibliographical note

Snoek07a
