Using Google distance to weight approximate ontology matches

Risto Gligorov*, Warner Ten Kate, Zharko Aleksovski, Frank Van Harmelen

*Corresponding author for this work

Research output: Chapter in Book / Report / Conference proceeding › Conference contribution › Academic › peer-review


Discovering mappings between concept hierarchies is widely regarded as one of the hardest and most urgent problems facing the Semantic Web. The problem is even harder in domains where concepts are inherently vague and ill-defined, and cannot be given a crisp definition. A notion of approximate concept mapping is required in such domains, but until now no such notion has been available. The first contribution of this paper is a definition for approximate mappings between concepts. Roughly, a mapping between two concepts is decomposed into a number of submappings, and a sloppiness value determines the fraction of these submappings that can be ignored when establishing the mapping. A potential problem of such a definition is that, with an increasing sloppiness value, it will gradually allow mappings between any two arbitrary concepts. To improve on this trivial behaviour, we need to design a heuristic weighting which minimises the sloppiness required to conclude desirable matches, but at the same time maximises the sloppiness required to conclude undesirable matches. The second contribution of this paper is to show that a Google-based similarity measure has exactly these desirable properties. We establish these results by experimental validation in the domain of musical genres. We show that this domain does suffer from ill-defined concepts. We take two real-life genre hierarchies from the Web, compute approximate mappings between them at varying levels of sloppiness, and validate our results against a handcrafted Gold Standard. Our method makes use of the huge amount of knowledge that is implicit in the current Web, and exploits this knowledge as a heuristic for establishing approximate mappings between ill-defined concepts.
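
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The abstract does not give the exact form of the Google-based measure, so the sketch assumes the commonly cited Normalized Google Distance of Cilibrasi and Vitányi, computed from search hit counts. The function names, the hit-count arguments, and the list of boolean submapping results are all hypothetical, and the way the paper combines the distance with the sloppiness value is not shown here.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Assumed measure: NGD computed from search hit counts.

    fx, fy: hit counts for each term alone; fxy: hit count for both terms
    together; n: an estimate of the number of indexed pages.
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # Terms that never occur (or never co-occur) are treated as unrelated.
        return float("inf")
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


def sloppy_match(submapping_holds, sloppiness):
    """Accept a mapping if the failed fraction of submappings is within the sloppiness.

    submapping_holds: one boolean per submapping (hypothetical representation).
    sloppiness: fraction in [0, 1] of submappings that may be ignored.
    """
    if not submapping_holds:
        raise ValueError("a mapping needs at least one submapping")
    failed = sum(1 for ok in submapping_holds if not ok)
    return failed / len(submapping_holds) <= sloppiness
```

With sloppiness 0 every submapping must hold, and with sloppiness 1 any two concepts match, which is the trivial behaviour the abstract warns about. For instance, `sloppy_match([True, True, False, True], 0.25)` accepts the mapping because only one of four submappings fails.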

Original language: English
Title of host publication: 16th International World Wide Web Conference, WWW2007
Number of pages: 10
Publication status: Published - 2007
Event: 16th International World Wide Web Conference, WWW2007 - Banff, AB, Canada
Duration: 8 May 2007 to 12 May 2007


Conference: 16th International World Wide Web Conference, WWW2007
City: Banff, AB


  • Approximation
  • Google distance

