Data integrative Bayesian inference for mixtures of regression models

Research output: Contribution to Journal › Article › Academic › peer-review

Abstract

Modern data collection techniques, which often produce different types of relevant information, call for new statistical learning methods that are adapted to cope with data integration. In this paper, Bayesian inference is considered for mixtures of regression models with an unknown number of components, in a way that facilitates data integration and variable selection for high-dimensional data. In the approach presented, named data integrative mixture of regressions, data integration is accomplished by introducing a new data allocation scheme that summarizes additional data in the form of an informative prior on latent variables. To cope with high dimensionality, a shrinkage-type prior is assumed on the regression parameters, and a posteriori variable selection is conducted based on Bayesian credible intervals. Posterior estimation is achieved via a Markov chain Monte Carlo algorithm. The method is validated through simulation studies and illustrated by its performance on real data.
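
The variable selection step described in the abstract (keeping a regression parameter when its Bayesian credible interval excludes zero) can be illustrated with a minimal sketch. This is not the authors' implementation: the 95% equal-tailed interval, the function names `quantile` and `select_variables`, and the simulated Gaussian draws that stand in for MCMC output are all assumptions made for illustration.

```python
import random


def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list, 0 <= q <= 1."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def select_variables(draws, level=0.95):
    """Select coefficients whose equal-tailed credible interval excludes zero.

    draws: list of posterior draws, each a list of p regression coefficients.
    level: credible level (0.95 is an assumption, not taken from the paper).
    Returns (selected_indices, intervals), where intervals[j] = (lower, upper).
    """
    alpha = 1.0 - level
    p = len(draws[0])
    selected, intervals = [], []
    for j in range(p):
        column = sorted(d[j] for d in draws)
        lower = quantile(column, alpha / 2)
        upper = quantile(column, 1 - alpha / 2)
        intervals.append((lower, upper))
        # Keep variable j only if zero lies outside its credible interval.
        if lower > 0 or upper < 0:
            selected.append(j)
    return selected, intervals


# Hypothetical stand-in for MCMC output: Gaussian draws around assumed centres.
rng = random.Random(0)
centres = [2.0, 0.0, -1.5, 0.0]
draws = [[rng.gauss(c, 0.3) for c in centres] for _ in range(2000)]
selected, intervals = select_variables(draws)
print("selected coefficient indices:", selected)
```

In this toy setting the coefficients centred well away from zero are the ones expected to be kept, while those centred at zero are dropped. In a mixture of regressions the same check would be applied to each component's coefficients separately.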
Original language: English
Pages (from-to): 1-22
Number of pages: 22
Journal: Journal of the Royal Statistical Society: Series C (Applied Statistics)
DOI: 10.1111/rssc.12346
Publication status: Published - 2019

Fingerprint

Data Integration
Bayesian inference
Regression Model
Variable Selection
Regression
Data Allocation
Credible Interval
Statistical Learning
Markov Chain Monte Carlo Algorithms
Number of Components
Latent Variables
High-dimensional Data
Shrinkage
Dimensionality
Simulation Study
Unknown

Keywords

  • Bayesian lasso
  • Data Integration
  • Markov chain Monte Carlo (MCMC)
  • Mixture regression

Cite this

@article{667e93f897814a6f936907a4f62ae65e,
title = "Data integrative Bayesian inference for mixtures of regression models",
abstract = "Modern data collection techniques, which often produce different types of relevant information, call for new statistical learning methods that are adapted to cope with data integration. In the paper Bayesian inference is considered for mixtures of regression models with an unknown number of components, that facilitates data integration and variable selection for high dimensional data. In the approach presented, named data integrative mixture of regressions, data integration is accomplished by introducing a new data allocation scheme that summarizes additional data in the form of an informative prior on latent variables. To cope with high dimensionality, a shrinkage-type prior is assumed on the regression parameters, and a posteriori variable selection is conducted based on Bayesian credible intervals. Posterior estimation is achieved via a Markov chain Monte Carlo algorithm. The method is validated through simulation studies and illustrated by its performance on real data.",
keywords = "Bayesian lasso, Data Integration, Markov chain Monte Carlo (MCMC), Mixture regression",
author = "Mehran Aflakparast and {de Gunst}, M.C.M.",
year = "2019",
doi = "10.1111/rssc.12346",
language = "English",
pages = "1--22",
journal = "Journal of the Royal Statistical Society: Series C (Applied Statistics)",
issn = "0035-9254",
publisher = "Wiley-Blackwell",

}

TY  - JOUR
T1  - Data integrative Bayesian inference for mixtures of regression models
AU  - Aflakparast, Mehran
AU  - de Gunst, M.C.M.
PY  - 2019
Y1  - 2019
AB  - Modern data collection techniques, which often produce different types of relevant information, call for new statistical learning methods that are adapted to cope with data integration. In the paper Bayesian inference is considered for mixtures of regression models with an unknown number of components, that facilitates data integration and variable selection for high dimensional data. In the approach presented, named data integrative mixture of regressions, data integration is accomplished by introducing a new data allocation scheme that summarizes additional data in the form of an informative prior on latent variables. To cope with high dimensionality, a shrinkage-type prior is assumed on the regression parameters, and a posteriori variable selection is conducted based on Bayesian credible intervals. Posterior estimation is achieved via a Markov chain Monte Carlo algorithm. The method is validated through simulation studies and illustrated by its performance on real data.
KW  - Bayesian lasso
KW  - Data Integration
KW  - Markov chain Monte Carlo (MCMC)
KW  - Mixture regression
U2  - 10.1111/rssc.12346
DO  - 10.1111/rssc.12346
M3  - Article
SP  - 1
EP  - 22
JO  - Journal of the Royal Statistical Society: Series C (Applied Statistics)
JF  - Journal of the Royal Statistical Society: Series C (Applied Statistics)
SN  - 0035-9254
ER  -