How linkage error affects hidden Markov models: a sensitivity analysis

Research output: Contribution to Journal › Article › Academic › peer-review


Latent class models (LCMs) are increasingly used to estimate and correct for classification error in categorical data without the need for an error-free “gold standard” data source. To accomplish this, LCMs require either multiple indicators of the same phenomenon within one data collection wave (a “latent structure model”) or multiple observations over time on a single indicator (a “hidden Markov model”, HMM), and they assume that the errors in these indicators are conditionally independent. Unfortunately, this “local independence” assumption is often unrealistic, untestable, and a source of serious bias. Linking independent data sources can solve this problem by making the local independence assumption plausible across sources, while potentially allowing for local dependence within sources. However, record linkage introduces a new problem: the records may be erroneously linked. In this paper we investigate the effects of linkage error on HMM estimates of employment contract types. Our data come from linking a labor force survey to administrative employer records; this linkage yields two indicators per time point that are plausibly conditionally independent. Our results indicate that false-negative linkage error (exclusion) is problematic only if it is large and highly correlated with the dependent variable. Moreover, under many conditions, false-positive linkage error (mislinkage) acts as another source of misclassification that the HMM can absorb into the error-rate estimates, leaving the latent transition estimates unbiased. In these cases, measurement error modeling already accounts for linkage error. Our results also indicate where these conditions break down and more complex methods would be needed.
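
The claim that mislinkage can behave like ordinary misclassification can be illustrated with a small sketch. It is not the paper's analysis: it assumes a single time point, a mislink probability that does not depend on the true class, and a wrongly linked record drawn at random from the population. The function name, the two contract classes, and all numbers are hypothetical. Under those assumptions, the indicator observed after linkage has an "effective" error matrix that is a mixture of the indicator's own error matrix and the population-wide distribution of the indicator, so it is still a valid misclassification matrix that a measurement error model could in principle estimate.

```python
def effective_error_matrix(error, marginal, mislink_rate):
    """Return P(observed = y | true = x) after random mislinkage.

    error[x][y]  : P(indicator = y | true class = x) for a correct link
    marginal[x]  : population share of true class x
    mislink_rate : probability that a record is linked to a random
                   other unit (assumed independent of the true class)
    """
    n_true = len(error)
    n_obs = len(error[0])
    # Indicator distribution of a randomly drawn (wrong) record:
    # average the rows of the error matrix over the class shares.
    donor = [
        sum(marginal[x] * error[x][y] for x in range(n_true))
        for y in range(n_obs)
    ]
    # Correct link with probability (1 - mislink_rate), random donor otherwise.
    return [
        [(1 - mislink_rate) * error[x][y] + mislink_rate * donor[y]
         for y in range(n_obs)]
        for x in range(n_true)
    ]


# Hypothetical two-class example: index 0 = permanent, 1 = temporary.
error = [[0.95, 0.05],
         [0.10, 0.90]]
marginal = [0.7, 0.3]
effective = effective_error_matrix(error, marginal, mislink_rate=0.05)

for row in effective:
    print([round(p, 5) for p in row])
```

Each row of the result still sums to one, which is why the extra error can, in this simplified setting, be absorbed into the error-rate parameters. The sketch says nothing about the cases where mislinkage depends on the true class or on time, which is where the abstract notes that these conditions break down.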
Original language: English
Article number: 8
Pages (from-to): 483–512
Journal: Journal of Survey Statistics and Methodology
Early online date: 29 May 2019
Publication status: Published - Jun 2020


  • linkage error
  • classification error
  • measurement error
  • latent class model (LCM)
  • hidden Markov model (HMM)
  • misclassification
