Abstract
This paper describes a tool for the automatic extension and trimming of a multilingual WordNet database for cross-lingual retrieval and multilingual ontology building in intranets and domain-specific document collections. Hierarchies, built from automatically extracted terms and combined with the WordNet relations, are trimmed with a disambiguation method based on the document salience of the words in the glosses. The disambiguation is tested in a cross-lingual retrieval task, showing considerable improvement (7%-11%). The condensed hierarchies can be used as browse interfaces to the documents, complementary to retrieval.
Original language | English |
---|---|
Title of host publication | Proceedings on NAACL-2001 workshop on WordNet and other lexical resources applications, extensions and customizations, Pittsburgh, USA, June 2001 |
Publisher | The Association for Computational Linguistics |
Publication status | Published - 2001 |