Learning from a lot: Empirical Bayes for high-dimensional model-based prediction

Mark A. van de Wiel*, Dennis E. Te Beest, Magnus M. Münch

*Corresponding author for this work

Research output: Contribution to Journal; Article; Academic; peer-review


Empirical Bayes is a versatile approach to “learn from a lot” in two ways: first, from a large number of variables and, second, from a potentially large amount of prior information, for example, stored in public repositories. We review applications of a variety of empirical Bayes methods to several well-known model-based prediction methods, including penalized regression, linear discriminant analysis, and Bayesian models with sparse or dense priors. We discuss “formal” empirical Bayes methods that maximize the marginal likelihood but also more informal approaches based on other data summaries. We contrast empirical Bayes to cross-validation and full Bayes and discuss hybrid approaches. To study the relation between the quality of an empirical Bayes estimator and p, the number of variables, we consider a simple empirical Bayes estimator in a linear model setting. We argue that empirical Bayes is particularly useful when the prior contains multiple parameters, which model a priori information on variables termed “co-data”. In particular, we present two novel examples that allow for co-data: first, a Bayesian spike-and-slab setting that facilitates inclusion of multiple co-data sources and types and, second, a hybrid empirical Bayes–full Bayes ridge regression approach for estimation of the posterior predictive interval.

Original language: English
Pages (from-to): 2-25
Number of pages: 24
Journal: Scandinavian Journal of Statistics
Issue number: 1
Early online date: 1 Jun 2018
Publication status: Published - Mar 2019


European Research Council, Grant/Award Number: 320637; European Union 7th Framework program, Grant/Award Number: 611425.

We thank Paul Newcombe for fruitful discussions on spike-and-slab models and Carel Peeters for critical reading of this manuscript. This research has received funding from the European Research Council under ERC Grant Agreement 320637. It also received financial support from the European Union 7th Framework program under Grant Agreement 611425 (Oramod).

Funders (funder number)
European Union 7th Framework Program
H2020 European Research Council
Seventh Framework Programme (320637, 611425)
European Research Council


    • co-data
    • empirical Bayes
    • marginal likelihood
    • prediction
    • variable selection

