Low-Dimensional Perturb-and-MAP Approach for Learning Restricted Boltzmann Machines

Jakub M. Tomczak*, Szymon Zaręba, Siamak Ravanbakhsh, Russell Greiner

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

This paper introduces a new approach to maximum likelihood learning of the parameters of a restricted Boltzmann machine (RBM). The proposed method is based on the Perturb-and-MAP (PM) paradigm, which enables sampling from the Gibbs distribution. PM is a two-step process: (i) perturb the model using Gumbel perturbations, then (ii) find the maximum a posteriori (MAP) assignment of the perturbed model. We show that, under certain conditions, the resulting MAP configuration of the perturbed model is an unbiased sample from the original distribution. However, this approach requires an exponential number of perturbations, which is computationally intractable. Here, we apply an approximate approach based on first-order (low-dimensional) PM to calculate the gradient of the log-likelihood in a binary RBM. Our approach relies on optimizing the energy function with respect to the observable and hidden variables using a greedy procedure. First, for each variable, we determine whether flipping its value will decrease the energy; we then use the resulting local optimum to approximate the gradient. Moreover, we show that in some cases our approach works better than the standard coordinate-descent procedure for finding the MAP assignment, and we compare it with the Contrastive Divergence algorithm. We evaluate the quality of our approach empirically, first on toy problems, then on various image datasets and a text dataset.
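
The two PM steps and the flip test described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the usual binary RBM energy E(v, h) = -v^T W h - b^T v - c^T h, and it perturbs only the per-variable (bias) terms, which is one reading of "first-order" PM. The function names (`perturb_biases`, `greedy_flips`, `weight_gradient`), the model sizes, and the sweep order are hypothetical choices for illustration. The plain single-variable flip loop used here may differ from the greedy procedure in the paper.

```python
import numpy as np

def energy(v, h, W, b, c):
    """Binary RBM energy E(v, h) = -v^T W h - b^T v - c^T h (assumed form)."""
    return -(v @ W @ h) - b @ v - c @ h

def perturb_biases(b, c, rng):
    """Step (i): first-order Gumbel perturbation.

    Each binary variable gets one Gumbel draw per state; only the
    difference of the two draws matters, so it is folded into the bias.
    """
    b_t = b + rng.gumbel(size=b.shape) - rng.gumbel(size=b.shape)
    c_t = c + rng.gumbel(size=c.shape) - rng.gumbel(size=c.shape)
    return b_t, c_t

def greedy_flips(v, h, W, b_t, c_t, max_sweeps=100):
    """Step (ii), approximately: flip any variable whose flip lowers the
    perturbed energy, and stop when a full sweep makes no flip."""
    v, h = v.copy(), h.copy()
    for _ in range(max_sweeps):
        flipped = False
        for i in range(v.size):
            # Energy change caused by flipping v[i] to 1 - v[i].
            delta = (2 * v[i] - 1) * (W[i] @ h + b_t[i])
            if delta < 0:
                v[i] = 1 - v[i]
                flipped = True
        for j in range(h.size):
            # Energy change caused by flipping h[j] to 1 - h[j].
            delta = (2 * h[j] - 1) * (v @ W[:, j] + c_t[j])
            if delta < 0:
                h[j] = 1 - h[j]
                flipped = True
        if not flipped:
            break
    return v, h

def weight_gradient(v_data, W, c, v_map, h_map):
    """Log-likelihood gradient estimate for W: data term minus a model
    term in which the local optimum stands in for a model sample."""
    p_h = 1.0 / (1.0 + np.exp(-(v_data @ W + c)))  # p(h = 1 | v_data)
    return np.outer(v_data, p_h) - np.outer(v_map, h_map)

# Toy usage with arbitrary (hypothetical) sizes and parameters.
rng = np.random.default_rng(0)
n_v, n_h = 6, 4
W = 0.5 * rng.standard_normal((n_v, n_h))
b, c = np.zeros(n_v), np.zeros(n_h)
v0 = rng.integers(0, 2, size=n_v)   # a "data" vector, also the start point
h0 = rng.integers(0, 2, size=n_h)
b_t, c_t = perturb_biases(b, c, rng)
v_map, h_map = greedy_flips(v0, h0, W, b_t, c_t)
grad_W = weight_gradient(v0, W, c, v_map, h_map)
```

Every accepted flip strictly lowers the perturbed energy, so the loop ends at a configuration that no single flip can improve. That configuration is a local optimum and not necessarily the exact MAP assignment, which is why the resulting gradient is only an approximation.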

Original language: English
Pages (from-to): 1401-1419
Number of pages: 19
Journal: Neural Processing Letters
Volume: 50
Issue number: 2
DOIs
Publication status: Published - 1 Oct 2019
Externally published: Yes

Keywords

  • Greedy optimization
  • Gumbel perturbation
  • Restricted Boltzmann machine
  • Unsupervised deep learning
