Learning Structural Classification Rules for Web-page Categorization

Heiner Stuckenschmidt, Jens Hartmann, Frank Van Harmelen

Research output: Contribution to Conference › Paper › Other research output

Abstract

Content-related metadata plays an important role in the development of intelligent web applications. One of the most established forms of providing content-related metadata is the assignment of web pages to content categories. We describe the Spectacle system for classifying individual web pages on the basis of their syntactic structure. This classification requires the specification of classification rules that associate common page structures with predefined classes. In this paper, we propose an approach for the automatic acquisition of these classification rules using techniques from inductive logic programming, and describe experiments in applying the approach to an existing web-based information system.
Original language: English
Publication status: Published - 2002
