A comparison of latent class, K-means, and K-median methods for clustering dichotomous data.

MJ Brusco, E Shireman, D Steinley - Psychological Methods, 2017 - psycnet.apa.org
Abstract
The problem of partitioning a collection of objects based on their measurements on a set of dichotomous variables is a well-established problem in psychological research, with applications including clinical diagnosis, educational testing, cognitive categorization, and choice analysis. Latent class analysis and K-means clustering are popular methods for partitioning objects based on dichotomous measures in the psychological literature. The K-median clustering method has recently been touted as a potentially useful tool for psychological data and might be preferable to its close neighbor, K-means, when the variable measures are dichotomous. We conducted simulation-based comparisons of the latent class, K-means, and K-median approaches for partitioning dichotomous data. Although all 3 methods proved capable of recovering cluster structure, K-median clustering yielded the best average performance, followed closely by latent class analysis. We also report results for the 3 methods within the context of an application to transitive reasoning data, in which it was found that the 3 approaches can exhibit profound differences when applied to real data. (PsycINFO Database Record (c) 2019 APA, all rights reserved)
American Psychological Association