Different methods to complete datasets used for capture-recapture estimation

B.F.M. Bakker, S.C. Gerritse, P.G.M. Van der Heijden

Research output: Contribution to journal, Article, Academic, peer-review


We are interested in estimating the number of usual residents in the Netherlands. Capture-recapture estimation with three registers enables us to estimate the size of the total population, of which the usual residents are a part. However, usual residence cannot be used as a covariate because it is not available in one of the registers. We approach this as a missing data problem, for which several methods are available. In this manuscript we use the Expectation Maximization (EM) algorithm and Predictive Mean Matching (PMM). The EM algorithm is often used in categorical data analysis, but PMM has the advantage of flexibility in choosing which part of the observed data is used to impute the missing data. We identify four scenarios in which the missing data are completed via either the EM algorithm or PMM imputation. The scenarios lead to different population size estimates for usual residents; even small changes in the completed data change the estimates. In this study PMM imputation performs best in terms of flexibility, and it is theoretically better motivated.
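As a rough illustration of three-register capture-recapture estimation, the sketch below estimates the number of people missed by all three registers and adds it to the observed total. It assumes a log-linear model with all two-way interactions and no three-way interaction, for which the missed cell has a closed form. The abstract does not state which model the authors fit, so this is not their method. The function names and the counts are hypothetical.

```python
# Minimal sketch of three-list capture-recapture estimation.
# ASSUMPTION: log-linear model with all two-way interactions and no
# three-way interaction; the source does not say which model was fitted.
# Keys are inclusion patterns (register A, register B, register C).

def estimate_missing_cell(counts):
    """Estimate the count of people absent from all three registers."""
    numerator = (counts[(1, 1, 1)] * counts[(1, 0, 0)]
                 * counts[(0, 1, 0)] * counts[(0, 0, 1)])
    denominator = (counts[(1, 1, 0)] * counts[(1, 0, 1)]
                   * counts[(0, 1, 1)])
    if denominator == 0:
        raise ValueError("estimate undefined: a two-register cell is zero")
    return numerator / denominator


def estimate_population_size(counts):
    """Observed total plus the estimated unobserved cell."""
    return sum(counts.values()) + estimate_missing_cell(counts)


# Hypothetical counts, invented purely for illustration.
example_counts = {
    (1, 1, 1): 60,
    (1, 1, 0): 30,
    (1, 0, 1): 20,
    (0, 1, 1): 15,
    (1, 0, 0): 40,
    (0, 1, 0): 50,
    (0, 0, 1): 25,
}

if __name__ == "__main__":
    print(round(estimate_missing_cell(example_counts), 1))
    print(round(estimate_population_size(example_counts), 1))
```

Because the estimate is a ratio of products of cell counts, moving even a few records between cells can shift it noticeably. This matches the abstract's remark that small changes in the completed data lead to different population size estimates.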
Original language: English
Pages (from-to): 613-627
Number of pages: 15
Journal: Statistical Journal of the IAOS
Issue number: 4
Publication status: Published - 2015

