Abstract
Finding well-defined clusters in data is a fundamental challenge for many data-driven applications and depends largely on good data representation. The representation learning literature suggests that one key characteristic of good latent representations is the ability to produce semantically mixed outputs when decoding linear interpolations of two latent representations. We propose the Mixing Consistent Deep Clustering (MCDC) method, which encourages interpolations to appear realistic while adding the constraint that interpolations of two data points must look like one of the two inputs. Applying this training method to various clustering (non-)specific autoencoder models, we found that it systematically changed the structure of the learned representations and improved clustering performance for the tested ACAI, IDEC, and VAE models on the MNIST, SVHN, and CIFAR-10 datasets. These outcomes have practical implications for numerous real-world clustering tasks, as they show that the proposed method can be added to existing autoencoders to further improve clustering performance.
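The idea of decoding a linear interpolation of two latent codes, and of asking the result to resemble one of the two inputs, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy linear `encode`/`decode` maps, the helper names, and in particular the squared-error form of `mixing_consistency_penalty`, which is a stand-in and not the loss defined in the paper. The adversarial component that encourages realistic-looking interpolations is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy autoencoder: linear maps stand in for real networks.
W_enc = rng.normal(size=(8, 2))   # 8-dim input -> 2-dim latent
W_dec = np.linalg.pinv(W_enc)     # 2-dim latent -> 8-dim output


def encode(x):
    return x @ W_enc


def decode(z):
    return z @ W_dec


def interpolate(z1, z2, alpha):
    """Linear interpolation of two latent codes: alpha*z1 + (1-alpha)*z2."""
    return alpha * z1 + (1.0 - alpha) * z2


def mixing_consistency_penalty(x1, x2, alpha):
    """Assumed penalty (not the paper's loss): squared error between the
    decoded interpolation and the input with the larger mixing weight."""
    z_mix = interpolate(encode(x1), encode(x2), alpha)
    x_mix = decode(z_mix)
    target = x1 if alpha >= 0.5 else x2
    return float(np.mean((x_mix - target) ** 2))


x1 = rng.normal(size=8)
x2 = rng.normal(size=8)

# Penalty for a mix dominated by x1, and the alpha = 1 endpoint.
penalty_near_x1 = mixing_consistency_penalty(x1, x2, alpha=0.9)
endpoint = interpolate(encode(x1), encode(x2), alpha=1.0)
```

In a real training setup the penalty would be one term added to an existing autoencoder objective, which is consistent with the abstract's claim that the method can be added to existing models.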
| Original language | English |
|---|---|
| Title of host publication | Machine Learning, Optimization, and Data Science |
| Subtitle of host publication | [Proceedings] 7th International Conference, LOD 2021, Grasmere, UK, October 4–8, 2021, Revised Selected Papers, Part I |
| Editors | Giuseppe Nicosia, Varun Ojha, Emanuele La Malfa, Gabriele La Malfa, Giorgio Jansen, Panos M. Pardalos, Giovanni Giuffrida, Renato Umeton |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 124-137 |
| Number of pages | 14 |
| Volume | 1 |
| ISBN (Electronic) | 9783030954673 |
| ISBN (Print) | 9783030954666 |
| Publication status | Published - 2022 |
| Event | 7th International Conference on Machine Learning, Optimization, and Data Science, LOD 2021, Virtual, Online. Duration: 4 Oct 2021 → 8 Oct 2021 |
Publication series
| Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
|---|---|
| Volume | 13163 LNCS |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | 7th International Conference on Machine Learning, Optimization, and Data Science, LOD 2021 |
|---|---|
| City | Virtual, Online |
| Period | 4/10/21 → 8/10/21 |
Bibliographical note
Publisher Copyright: © 2022, Springer Nature Switzerland AG.
Keywords
- Adversarial training
- Autoencoder
- Clustering