TY - JOUR

T1 - Principal Covariates Clusterwise Regression (PCCR)

T2 - Accounting for Multicollinearity and Population Heterogeneity in Hierarchically Organized Data

AU - Wilderjans, Tom Frans

AU - Vande Gaer, Eva

AU - Kiers, Henk A.L.

AU - Van Mechelen, Iven

AU - Ceulemans, Eva

PY - 2017/3/1

Y1 - 2017/3/1

N2 - In the behavioral sciences, many research questions pertain to a regression problem in that one wants to predict a criterion on the basis of a number of predictors. Although ordinary least squares regression suffices in many cases, the prediction problem is sometimes more challenging, for three reasons. First, multiple highly collinear predictors can be available, making it difficult to grasp their mutual relations as well as their relations to the criterion. In that case, it may be very useful to reduce the predictors to a few summary variables, on which one regresses the criterion and which at the same time yield insight into the predictor structure. Second, the population under study may consist of a few unknown subgroups that are characterized by different regression models. Third, the obtained data are often hierarchically structured, with, for instance, observations nested within persons, or participants nested within groups or countries. Although some methods have been developed that partially meet these challenges (i.e., principal covariates regression (PCovR), clusterwise regression (CR), and structural equation models), none of these methods adequately deals with all of them simultaneously. To fill this gap, we propose the principal covariates clusterwise regression (PCCR) method, which combines the key ideas behind PCovR (de Jong & Kiers in Chemom Intell Lab Syst 14(1–3):155–164, 1992) and CR (Späth in Computing 22(4):367–373, 1979). The PCCR method is validated by means of a simulation study and by applying it to cross-cultural data regarding satisfaction with life.

AB - In the behavioral sciences, many research questions pertain to a regression problem in that one wants to predict a criterion on the basis of a number of predictors. Although ordinary least squares regression suffices in many cases, the prediction problem is sometimes more challenging, for three reasons. First, multiple highly collinear predictors can be available, making it difficult to grasp their mutual relations as well as their relations to the criterion. In that case, it may be very useful to reduce the predictors to a few summary variables, on which one regresses the criterion and which at the same time yield insight into the predictor structure. Second, the population under study may consist of a few unknown subgroups that are characterized by different regression models. Third, the obtained data are often hierarchically structured, with, for instance, observations nested within persons, or participants nested within groups or countries. Although some methods have been developed that partially meet these challenges (i.e., principal covariates regression (PCovR), clusterwise regression (CR), and structural equation models), none of these methods adequately deals with all of them simultaneously. To fill this gap, we propose the principal covariates clusterwise regression (PCCR) method, which combines the key ideas behind PCovR (de Jong & Kiers in Chemom Intell Lab Syst 14(1–3):155–164, 1992) and CR (Späth in Computing 22(4):367–373, 1979). The PCCR method is validated by means of a simulation study and by applying it to cross-cultural data regarding satisfaction with life.

KW - clusterwise regression

KW - component analysis

KW - hierarchically organized data

KW - multicollinearity

KW - population heterogeneity

UR - http://www.scopus.com/inward/record.url?scp=85000405985&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85000405985&partnerID=8YFLogxK

U2 - 10.1007/s11336-016-9522-0

DO - 10.1007/s11336-016-9522-0

M3 - Article

C2 - 27905056

AN - SCOPUS:85000405985

VL - 82

SP - 86

EP - 111

JO - Psychometrika

JF - Psychometrika

SN - 0033-3123

IS - 1

ER -