Measuring similarity of complex and heterogeneous data in clustering of large data sets
Main author: | |
---|---|
Other authors: | , , |
Format: | article |
Language: | eng |
Published in: | 2012 |
Subjects: | |
Full text: | http://hdl.handle.net/10451/5659 |
Country: | Portugal |
OAI: | oai:repositorio.ul.pt:10451/5659 |
Abstract: | Cluster analysis, or classification, usually refers to a set of exploratory multivariate data analysis methods and techniques for finding a clustering structure in a dataset. That structure may consist either of groups of statistical data units or of groups of variables. In this work we deal with a generalization of this paradigm: the clustering of complex data described by three different types of variables, which frequently arise in a three-way context. We obtain compatible versions of the same affinity coefficient for measuring similarity between statistical data units described by those three types of variables. A global generalized similarity coefficient is then analyzed for this kind of mixed data, which often arises in data mining or knowledge mining. |
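
The abstract describes an affinity coefficient computed per variable type and then combined into a global similarity coefficient, but this record gives no formulas. The sketch below is only an illustration under two stated assumptions: that the basic affinity between two data units is the sum of square roots of products of their normalized profile values (the Bhattacharyya-type form), and that the global coefficient is a weighted mean of the per-type affinities with weights summing to 1. The function names `affinity` and `global_affinity` are hypothetical and are not taken from the article.

```python
import math


def affinity(x, y):
    """Affinity between two non-negative profiles of equal length.

    Assumed form: each profile is normalised to sum to 1, then the
    square roots of the element-wise products are summed.
    """
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    sx, sy = sum(x), sum(y)
    if sx <= 0 or sy <= 0:
        raise ValueError("profiles must have positive totals")
    return sum(math.sqrt((a / sx) * (b / sy)) for a, b in zip(x, y))


def global_affinity(blocks_k, blocks_l, weights):
    """Weighted mean of per-block affinities between units k and l.

    Assumption: one block of values per variable type, and weights
    that are non-negative and sum to 1.
    """
    if not (len(blocks_k) == len(blocks_l) == len(weights)):
        raise ValueError("need one weight per block")
    if any(w < 0 for w in weights) or not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(
        w * affinity(x, y)
        for w, x, y in zip(weights, blocks_k, blocks_l)
    )


# Two units described by two blocks of variables (illustrative values only).
unit_k = [[1, 0], [1, 1]]
unit_l = [[0, 1], [2, 2]]
print(global_affinity(unit_k, unit_l, [0.5, 0.5]))
```

Under these assumptions the coefficient lies in [0, 1]: proportional profiles give an affinity of 1, and profiles with disjoint support give 0. Normalizing each block before combining is what would let blocks measured on different scales contribute comparably.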