TY - JOUR
T1 - The SIMCLAS Model
T2 - Simultaneous Analysis of Coupled Binary Data Matrices with Noise Heterogeneity Between and Within Data Blocks
AU - Wilderjans, Tom F.
AU - Ceulemans, E.
AU - van Mechelen, I.
PY - 2012/10/1
Y1 - 2012/10/1
N2 - In many research domains different pieces of information are collected regarding the same set of objects. Each piece of information constitutes a data block, and all these (coupled) blocks have the object mode in common. When analyzing such data, an important aim is to obtain an overall picture of the structure underlying the whole set of coupled data blocks. A further challenge consists of accounting for the differences in information value that exist between and within (i.e., between the objects of a single block) data blocks. To tackle these issues, analysis techniques may be useful in which all available pieces of information are integrated and in which at the same time noise heterogeneity is taken into account. For the case of binary coupled data, however, only methods exist that go for a simultaneous analysis of all data blocks but that do not account for noise heterogeneity. Therefore, in this paper, the SIMCLAS model, being a Hierarchical Classes model for the simultaneous analysis of coupled binary two-way matrices, is presented. In this model, noise heterogeneity between and within the data blocks is accounted for by downweighting entries from noisy blocks/objects within a block. In a simulation study it is shown that (1) the SIMCLAS technique recovers the underlying structure of coupled data to a very large extent, and (2) the SIMCLAS technique outperforms a Hierarchical Classes technique in which all entries contribute equally to the analysis (i.e., noise homogeneity within and between blocks). The latter is also demonstrated in an application of both techniques to empirical data on categorization of semantic concepts.
AB - In many research domains different pieces of information are collected regarding the same set of objects. Each piece of information constitutes a data block, and all these (coupled) blocks have the object mode in common. When analyzing such data, an important aim is to obtain an overall picture of the structure underlying the whole set of coupled data blocks. A further challenge consists of accounting for the differences in information value that exist between and within (i.e., between the objects of a single block) data blocks. To tackle these issues, analysis techniques may be useful in which all available pieces of information are integrated and in which at the same time noise heterogeneity is taken into account. For the case of binary coupled data, however, only methods exist that go for a simultaneous analysis of all data blocks but that do not account for noise heterogeneity. Therefore, in this paper, the SIMCLAS model, being a Hierarchical Classes model for the simultaneous analysis of coupled binary two-way matrices, is presented. In this model, noise heterogeneity between and within the data blocks is accounted for by downweighting entries from noisy blocks/objects within a block. In a simulation study it is shown that (1) the SIMCLAS technique recovers the underlying structure of coupled data to a very large extent, and (2) the SIMCLAS technique outperforms a Hierarchical Classes technique in which all entries contribute equally to the analysis (i.e., noise homogeneity within and between blocks). The latter is also demonstrated in an application of both techniques to empirical data on categorization of semantic concepts.
KW - coupled data
KW - data fusion
KW - Hierarchical Classes Analysis
KW - hierarchical relations
KW - multi-set data
KW - multivariate binary data
KW - noise heterogeneity
KW - overlapping clustering
KW - simultaneous clusterings
UR - http://www.scopus.com/inward/record.url?scp=84867876189&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84867876189&partnerID=8YFLogxK
U2 - 10.1007/s11336-012-9275-3
DO - 10.1007/s11336-012-9275-3
M3 - Article
AN - SCOPUS:84867876189
VL - 77
SP - 724
EP - 740
JO - Psychometrika
JF - Psychometrika
SN - 0033-3123
IS - 4
ER -