Unsupervised Feature Selection for Efficient Exploration of High Dimensional Data

Arnab Chakrabarti*, Abhijeet Das, Michael Cochez, Christoph Quix

*Corresponding author for this work

Research output: Chapter in Book / Report / Conference proceeding › Conference contribution › Academic › peer-review

Abstract

The exponential growth in the ability to generate, capture, and store high-dimensional data has driven sophisticated machine learning applications. However, high dimensionality often makes it difficult for analysts to identify and extract relevant features from datasets. Although many feature selection methods have shown good results in supervised learning, the major challenge lies in unsupervised feature selection. For example, in data visualization, high-dimensional data is difficult to visualize and interpret because of limited screen space, which results in visual clutter. Visualizations are more interpretable when rendered in a low-dimensional feature space. To mitigate these challenges, we present an approach to unsupervised feature clustering and selection using our novel graph clustering algorithm based on Clique-Cover Theory. We implemented our approach in an interactive data exploration tool that facilitates the exploration of relationships between features and generates interpretable visualizations.
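
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea of clustering features with a clique cover, not the authors' method. It assumes that features are linked when their absolute Pearson correlation reaches a threshold, that a simple greedy heuristic partitions the resulting graph into cliques, and that the first feature of each clique serves as its representative. The function names, the `threshold` parameter, and the sample data are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant feature is treated as uncorrelated
    return cov / (norm_x * norm_y)


def greedy_clique_cover(columns, threshold=0.8):
    """Partition feature names into cliques of a correlation graph.

    Assumption for illustration: two features are adjacent when
    |pearson| >= threshold. The greedy pass is a heuristic and does
    not guarantee a minimum number of cliques.
    """
    names = list(columns)
    adjacent = {name: set() for name in names}
    for a, b in combinations(names, 2):
        if abs(pearson(columns[a], columns[b])) >= threshold:
            adjacent[a].add(b)
            adjacent[b].add(a)

    cliques = []
    # Visit well-connected features first; ties keep input order.
    for name in sorted(names, key=lambda n: -len(adjacent[n])):
        for clique in cliques:
            # Join the first clique whose members are all adjacent.
            if all(member in adjacent[name] for member in clique):
                clique.append(name)
                break
        else:
            cliques.append([name])
    return cliques


def select_representatives(cliques):
    """Keep one feature per clique (here simply the first one added)."""
    return [clique[0] for clique in cliques]


if __name__ == "__main__":
    data = {
        "a": [1, 2, 3, 4, 5],
        "b": [2, 4, 6, 8, 11],    # strongly correlated with "a"
        "c": [1, -1, 1, -1, 1],
        "d": [-1, 1, -1, 1, -1],  # perfectly anti-correlated with "c"
    }
    groups = greedy_clique_cover(data, threshold=0.8)
    print(groups)
    print(select_representatives(groups))
```

Every feature inside a clique is pairwise correlated with every other member, which is why keeping a single representative per clique is a plausible way to reduce redundancy before visualization. Other representative choices, such as the feature with the highest variance, would fit the same structure.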

Original language: English
Title of host publication: Advances in Databases and Information Systems
Subtitle of host publication: 25th European Conference, ADBIS 2021, Tartu, Estonia, August 24–26, 2021, Proceedings
Editors: Ladjel Bellatreche, Marlon Dumas, Panagiotis Karras, Raimundas Matulevičius
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 183-197
Number of pages: 15
ISBN (Electronic): 9783030824723
ISBN (Print): 9783030824716
DOIs
Publication status: Published - 2021
Event: 25th European Conference on Advances in Databases and Information Systems, ADBIS 2021 - Tartu, Estonia
Duration: 24 Aug 2021 – 26 Aug 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12843 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 25th European Conference on Advances in Databases and Information Systems, ADBIS 2021
Country/Territory: Estonia
City: Tartu
Period: 24/08/21 – 26/08/21

Bibliographical note

Funding Information:
This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC-2023 Internet of Production – 390621612.

Publisher Copyright:
© 2021, Springer Nature Switzerland AG.
