Learning Structural Classification Rules for Web-page Categorization

Heiner Stuckenschmidt, Jens Hartmann, Frank Van Harmelen

Research output: Contribution to Conference > Paper > Other research output

Abstract

Content-related metadata plays an important role in the development of intelligent web applications. One of the most established forms of providing content-related metadata is the assignment of web pages to content categories. We describe the Spectacle system for classifying individual web pages on the basis of their syntactic structure. This classification requires the specification of classification rules that associate common page structures with predefined classes. In this paper, we propose an approach for the automatic acquisition of these classification rules using techniques from inductive logic programming, and we describe experiments in applying the approach to an existing web-based information system.
Original language: English
Publication status: Published - 2002

Fingerprint

Websites
Metadata
Logic programming
Syntactics
Experiments

Cite this

@conference{1ed1d738b5ab4c4b9192563f10370569,
title = "Learning Structural Classification Rules for Web-page Categorization",
abstract = "Content-related metadata plays an important role in the development of intelligent web applications. One of the most established forms of providing content-related metadata is the assignment of web pages to content categories. We describe the Spectacle system for classifying individual web pages on the basis of their syntactic structure. This classification requires the specification of classification rules that associate common page structures with predefined classes. In this paper, we propose an approach for the automatic acquisition of these classification rules using techniques from inductive logic programming, and we describe experiments in applying the approach to an existing web-based information system.",
author = "Heiner Stuckenschmidt and Jens Hartmann and {Van Harmelen}, Frank",
year = "2002",
language = "English",

}


TY - CONF

T1 - Learning Structural Classification Rules for Web-page Categorization

AU - Stuckenschmidt, Heiner

AU - Hartmann, Jens

AU - Van Harmelen, Frank

PY - 2002

Y1 - 2002

N2 - Content-related metadata plays an important role in the development of intelligent web applications. One of the most established forms of providing content-related metadata is the assignment of web pages to content categories. We describe the Spectacle system for classifying individual web pages on the basis of their syntactic structure. This classification requires the specification of classification rules that associate common page structures with predefined classes. In this paper, we propose an approach for the automatic acquisition of these classification rules using techniques from inductive logic programming, and we describe experiments in applying the approach to an existing web-based information system.


M3 - Paper

ER -