Low-Dimensional Perturb-and-MAP Approach for Learning Restricted Boltzmann Machines

Jakub M. Tomczak, Szymon Zaręba, Siamak Ravanbakhsh, Russell Greiner

Research output: Contribution to Journal › Article › Academic › peer-review

Abstract

This paper introduces a new approach to maximum likelihood learning of the parameters of a restricted Boltzmann machine (RBM). The proposed method is based on the Perturb-and-MAP (PM) paradigm, which enables sampling from the Gibbs distribution. PM is a two-step process: (i) perturb the model using Gumbel perturbations, then (ii) find the maximum a posteriori (MAP) assignment of the perturbed model. We show that under certain conditions the resulting MAP configuration of the perturbed model is an unbiased sample from the original distribution. However, this approach requires an exponential number of perturbations, which is computationally intractable. Here, we apply an approximate approach based on first-order (low-dimensional) PM to calculate the gradient of the log-likelihood in a binary RBM. Our approach relies on optimizing the energy function with respect to the observable and hidden variables using a greedy procedure. First, for each variable we determine whether flipping its value will decrease the energy, and then we utilize the new local maximum to approximate the gradient. Moreover, we show that in some cases our approach works better than the standard coordinate-descent procedure for finding the MAP assignment, and we compare it with the Contrastive Divergence algorithm. We investigate the quality of our approach empirically, first on toy problems, then on various image datasets and a text dataset.
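
The two steps described in the abstract — Gumbel perturbation of the model, followed by an approximate MAP search via greedy single-bit flips — can be sketched for a small binary RBM. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`perturb_and_map_sample`, `pm_gradient`), the choice of perturbing only the unary (bias) terms as in first-order PM, and all shapes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def energy(v, h, W, b, c):
    # Standard binary RBM energy: E(v, h) = -b.v - c.h - v^T W h
    return -b @ v - c @ h - v @ W @ h

def perturb_and_map_sample(W, b, c, n_iters=50):
    """First-order (low-dimensional) Perturb-and-MAP sketch: add one i.i.d.
    Gumbel draw per state of each binary variable, then greedily flip bits
    to lower the perturbed energy until no single flip helps."""
    nv, nh = W.shape
    gb = rng.gumbel(size=(nv, 2))
    gc = rng.gumbel(size=(nh, 2))
    # For a binary variable, the difference of the two Gumbel draws acts as
    # additive noise on its effective bias (the difference is logistic).
    b_pert = b + gb[:, 1] - gb[:, 0]
    c_pert = c + gc[:, 1] - gc[:, 0]
    v = rng.integers(0, 2, nv).astype(float)
    h = rng.integers(0, 2, nh).astype(float)
    for _ in range(n_iters):
        changed = False
        for i in range(nv):
            # Energy change if v[i] is flipped: dE = -(1 - 2 v_i)(b_i + W_i.h)
            dE = -(1 - 2 * v[i]) * (b_pert[i] + W[i] @ h)
            if dE < 0:
                v[i] = 1 - v[i]
                changed = True
        for j in range(nh):
            dE = -(1 - 2 * h[j]) * (c_pert[j] + v @ W[:, j])
            if dE < 0:
                h[j] = 1 - h[j]
                changed = True
        if not changed:  # local optimum of the perturbed energy
            break
    return v, h

def pm_gradient(v_data, W, b, c):
    """One stochastic log-likelihood gradient: data-driven positive phase
    minus a PM sample standing in for the intractable negative phase."""
    h_data = 1 / (1 + np.exp(-(c + v_data @ W)))  # E[h | v] under the RBM
    v_s, h_s = perturb_and_map_sample(W, b, c)
    dW = np.outer(v_data, h_data) - np.outer(v_s, h_s)
    db = v_data - v_s
    dc = h_data - h_s
    return dW, db, dc

# Example: one gradient step for a tiny 4-visible / 3-hidden RBM.
W = 0.1 * rng.standard_normal((4, 3))
b, c = np.zeros(4), np.zeros(3)
v0 = rng.integers(0, 2, 4).astype(float)
dW, db, dc = pm_gradient(v0, W, b, c)
W += 0.05 * dW; b += 0.05 * db; c += 0.05 * dc
```

Note the contrast with Contrastive Divergence: where CD runs a few steps of Gibbs sampling from the data to get the negative-phase sample, this sketch draws it by perturbing the biases once and descending the perturbed energy greedily.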

Original language: English
Pages (from-to): 1401-1419
Number of pages: 19
Journal: Neural Processing Letters
Volume: 50
Issue number: 2
DOIs: 10.1007/s11063-018-9923-4
Publication status: Published - 1 Oct 2019
Externally published: Yes

Keywords

  • Greedy optimization
  • Gumbel perturbation
  • Restricted Boltzmann machine
  • Unsupervised deep learning

Cite this

Tomczak, Jakub M.; Zaręba, Szymon; Ravanbakhsh, Siamak; Greiner, Russell. Low-Dimensional Perturb-and-MAP Approach for Learning Restricted Boltzmann Machines. In: Neural Processing Letters. 2019; Vol. 50, No. 2, pp. 1401-1419.
@article{0138b76bbe2545ce873979e48cc59884,
title = "Low-Dimensional Perturb-and-MAP Approach for Learning Restricted Boltzmann Machines",
abstract = "This paper introduces a new approach to maximum likelihood learning of the parameters of a restricted Boltzmann machine (RBM). The proposed method is based on the Perturb-and-MAP (PM) paradigm that enables sampling from the Gibbs distribution. PM is a two step process: (i) perturb the model using Gumbel perturbations, then (ii) find the maximum a posteriori (MAP) assignment of the perturbed model. We show that under certain conditions the resulting MAP configuration of the perturbed model is an unbiased sample from the original distribution. However, this approach requires an exponential number of perturbations, which is computationally intractable. Here, we apply an approximate approach based on the first order (low-dimensional) PM to calculate the gradient of the log-likelihood in binary RBM. Our approach relies on optimizing the energy function with respect to observable and hidden variables using a greedy procedure. First, for each variable we determine whether flipping this value will decrease the energy, and then we utilize the new local maximum to approximate the gradient. Moreover, we show that in some cases our approach works better than the standard coordinate-descent procedure for finding the MAP assignment and compare it with the Contrastive Divergence algorithm. We investigate the quality of our approach empirically, first on toy problems, then on various image datasets and a text dataset.",
keywords = "Greedy optimization, Gumbel perturbation, Restricted Boltzmann machine, Unsupervised deep learning",
author = "Tomczak, {Jakub M.} and Szymon Zaręba and Siamak Ravanbakhsh and Russell Greiner",
year = "2019",
month = "10",
day = "1",
doi = "10.1007/s11063-018-9923-4",
language = "English",
volume = "50",
pages = "1401--1419",
journal = "Neural Processing Letters",
issn = "1370-4621",
number = "2",
}

TY - JOUR
T1 - Low-Dimensional Perturb-and-MAP Approach for Learning Restricted Boltzmann Machines
AU - Tomczak, Jakub M.
AU - Zaręba, Szymon
AU - Ravanbakhsh, Siamak
AU - Greiner, Russell
PY - 2019/10/1
Y1 - 2019/10/1
N2 - This paper introduces a new approach to maximum likelihood learning of the parameters of a restricted Boltzmann machine (RBM). The proposed method is based on the Perturb-and-MAP (PM) paradigm that enables sampling from the Gibbs distribution. PM is a two step process: (i) perturb the model using Gumbel perturbations, then (ii) find the maximum a posteriori (MAP) assignment of the perturbed model. We show that under certain conditions the resulting MAP configuration of the perturbed model is an unbiased sample from the original distribution. However, this approach requires an exponential number of perturbations, which is computationally intractable. Here, we apply an approximate approach based on the first order (low-dimensional) PM to calculate the gradient of the log-likelihood in binary RBM. Our approach relies on optimizing the energy function with respect to observable and hidden variables using a greedy procedure. First, for each variable we determine whether flipping this value will decrease the energy, and then we utilize the new local maximum to approximate the gradient. Moreover, we show that in some cases our approach works better than the standard coordinate-descent procedure for finding the MAP assignment and compare it with the Contrastive Divergence algorithm. We investigate the quality of our approach empirically, first on toy problems, then on various image datasets and a text dataset.
AB - This paper introduces a new approach to maximum likelihood learning of the parameters of a restricted Boltzmann machine (RBM). The proposed method is based on the Perturb-and-MAP (PM) paradigm that enables sampling from the Gibbs distribution. PM is a two step process: (i) perturb the model using Gumbel perturbations, then (ii) find the maximum a posteriori (MAP) assignment of the perturbed model. We show that under certain conditions the resulting MAP configuration of the perturbed model is an unbiased sample from the original distribution. However, this approach requires an exponential number of perturbations, which is computationally intractable. Here, we apply an approximate approach based on the first order (low-dimensional) PM to calculate the gradient of the log-likelihood in binary RBM. Our approach relies on optimizing the energy function with respect to observable and hidden variables using a greedy procedure. First, for each variable we determine whether flipping this value will decrease the energy, and then we utilize the new local maximum to approximate the gradient. Moreover, we show that in some cases our approach works better than the standard coordinate-descent procedure for finding the MAP assignment and compare it with the Contrastive Divergence algorithm. We investigate the quality of our approach empirically, first on toy problems, then on various image datasets and a text dataset.
KW - Greedy optimization
KW - Gumbel perturbation
KW - Restricted Boltzmann machine
KW - Unsupervised deep learning
UR - http://www.scopus.com/inward/record.url?scp=85054498287&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85054498287&partnerID=8YFLogxK
U2 - 10.1007/s11063-018-9923-4
DO - 10.1007/s11063-018-9923-4
M3 - Article
VL - 50
SP - 1401
EP - 1419
JO - Neural Processing Letters
JF - Neural Processing Letters
SN - 1370-4621
IS - 2
ER -