Probabilistic consensus clustering using evidence accumulation

Clustering ensemble methods produce a consensus partition of a set of data points by combining the results of a collection of base clustering algorithms. In the evidence accumulation clustering (EAC) paradigm, the clustering ensemble is transformed into a pairwise co-association matrix, thus avoidin...

ver descrição completa

Detalhes bibliográficos
Autor principal: Lourenço, André Ribeiro (author)
Outros Autores: Bulo, Samuel Rota (author), Rebagliati, Nicola (author), Fred, Ana (author), Figueiredo, Mário (author), Pelillo, Marcello (author)
Formato: article
Idioma:eng
Publicado em: 2016
Assuntos:
Texto completo:http://hdl.handle.net/10400.21/5750
País:Portugal
Oai:oai:repositorio.ipl.pt:10400.21/5750
Descrição
Resumo:Clustering ensemble methods produce a consensus partition of a set of data points by combining the results of a collection of base clustering algorithms. In the evidence accumulation clustering (EAC) paradigm, the clustering ensemble is transformed into a pairwise co-association matrix, thus avoiding the label correspondence problem, which is intrinsic to other clustering ensemble schemes. In this paper, we propose a consensus clustering approach based on the EAC paradigm, which is not limited to crisp partitions and fully exploits the nature of the co-association matrix. Our solution determines probabilistic assignments of data points to clusters by minimizing a Bregman divergence between the observed co-association frequencies and the corresponding co-occurrence probabilities expressed as functions of the unknown assignments. We additionally propose an optimization algorithm to find a solution under any double-convex Bregman divergence. Experiments on both synthetic and real benchmark data show the effectiveness of the proposed approach.