Abstract
The quality and practical usefulness of a regression model depend on both its interpretability and its prediction performance. This work presents new graphical tools that improve the interpretation of latent variable regression models and can also support better algorithms for variable selection. The graphs visualize, at a quantitative level, how much response-related variation and how much systematic orthogonal variation each explanatory variable contains. They also partition the explanatory variables into those that are crucial for both interpretation and predictive performance of the model, and those that are crucial for prediction performance but confounded by large contributions of orthogonal variation. Tools for assessing explanatory variables not only aid interpretation and understanding of the model but can also be crucial for variable selection aimed at parsimonious models with high explanatory information content as well as predictive performance. We show by example that using prediction performance as the sole criterion for variable selection can produce a reduced model in which the most selective variables have been lost.
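
The split of each explanatory variable into response-related and orthogonal variation can be illustrated with a small numerical sketch. The abstract does not spell out the computation, so the code below assumes the commonly used target-projection form of the selectivity ratio (explained variance divided by residual variance per variable). The function name `selectivity_ratio`, the synthetic data, and the use of ordinary least squares as a stand-in for a latent variable regression model are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def selectivity_ratio(X, b):
    """Per-variable selectivity ratio via target projection (assumed form).

    X : (n_samples, n_variables) data matrix
    b : (n_variables,) regression coefficients from any fitted model
    """
    Xc = X - X.mean(axis=0)            # column-center the data
    w = b / np.linalg.norm(b)          # normalized regression vector
    t = Xc @ w                         # target-projected scores
    p = Xc.T @ t / (t @ t)             # target-projected loadings
    X_tp = np.outer(t, p)              # response-related part of X
    E_tp = Xc - X_tp                   # remaining (orthogonal + noise) part
    explained = (X_tp ** 2).sum(axis=0)
    residual = (E_tp ** 2).sum(axis=0)
    return explained / residual, explained, residual


# Synthetic example (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
n = 200
signal = rng.normal(size=n)            # variation related to the response
interference = rng.normal(size=n)      # systematic orthogonal variation

X = np.column_stack([
    signal + 0.1 * rng.normal(size=n),                       # selective variable
    signal + 2.0 * interference + 0.1 * rng.normal(size=n),  # predictive but confounded
    interference + 0.1 * rng.normal(size=n),                 # carries only interference
])
y = signal

# Stand-in for a latent variable regression model: ordinary least squares
# on centered data. Any regression vector b could be passed instead.
Xc = X - X.mean(axis=0)
yc = y - y.mean()
b = np.linalg.lstsq(Xc, yc, rcond=None)[0]

sr, explained, residual = selectivity_ratio(X, b)
for name, value in zip(["selective", "confounded", "interference"], sr):
    print(f"{name:>12}: SR = {value:.3f}")
```

In this sketch the second and third variables receive sizeable regression coefficients because together they cancel the interference, yet their selectivity ratios are low. That is the kind of variable the abstract describes as important for prediction but confounded by orthogonal variation.
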
Original language | English |
---|---|
Pages (from-to) | 615-622 |
Number of pages | 8 |
Journal | Journal of Chemometrics |
Volume | 28 |
Issue number | 8 |
Publication status | Published - 1 Jan 2014 |
Externally published | Yes |
Keywords
- Latent variable regression
- Orthogonal variation
- Selectivity ratio
- Variable importance
- Variable selection