TY - JOUR
T1 - Methods for significance testing of categorical covariates in logistic regression models after multiple imputation
T2 - Power and applicability analysis
AU - Eekhout, Iris
AU - Van De Wiel, Mark A.
AU - Heymans, Martijn W.
PY - 2017/8/22
Y1 - 2017/8/22
N2 - Background: Multiple imputation is a recommended method to handle missing data. For significance testing after multiple imputation, Rubin's Rules (RR) are easily applied to pool parameter estimates. In a logistic regression model, different methods are available to test whether a categorical covariate with more than two levels contributes significantly to the model. Examples include pooling chi-square tests with multiple degrees of freedom, pooling likelihood ratio test statistics, and pooling based on the covariance matrix of the regression model. These methods are more complex than RR and are not available in all mainstream statistical software packages. In addition, they do not always obtain optimal power levels. We argue that the median of the p-values from the overall significance tests from the analyses on the imputed datasets can be used as an alternative pooling rule for categorical variables. The aim of the current study is to compare different methods to test a categorical variable for significance after multiple imputation on applicability and power. Methods: In a large simulation study, we demonstrated the control of the type I error and power levels of different pooling methods for categorical variables. Results: This simulation study showed that, for non-significant categorical covariates, the type I error was controlled and the statistical power of the median pooling rule was at least equal to that of current multiple parameter tests. An empirical data example showed similar results. Conclusions: It can therefore be concluded that using the median of the p-values from the imputed data analyses is an attractive and easy-to-use alternative method for significance testing of categorical variables.
KW - Categorical covariates
KW - Logistic regression
KW - Multiple imputation
KW - Pooling
KW - Significance test
KW - Simulation study
UR - http://www.scopus.com/inward/record.url?scp=85027980590&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85027980590&partnerID=8YFLogxK
U2 - 10.1186/s12874-017-0404-7
DO - 10.1186/s12874-017-0404-7
M3 - Article
AN - SCOPUS:85027980590
SN - 1471-2288
VL - 17
JO - BMC Medical Research Methodology
JF - BMC Medical Research Methodology
IS - 1
M1 - 129
ER -