Measuring similarity of complex and heterogeneous data in clustering of large data sets
Main Author: | |
---|---|
Other Authors: | , , |
Format: | article |
Language: | eng |
Published: | 2012 |
Subjects: | |
Online Access: | http://hdl.handle.net/10451/5659 |
Country: | Portugal |
Oai: | oai:repositorio.ul.pt:10451/5659 |
Summary: | Cluster analysis, or classification, usually refers to a set of exploratory multivariate data analysis methods and techniques for finding a clustering structure in a dataset. That structure may consist either of groups of statistical data units or of groups of variables. In this work we deal with a generalization of this paradigm: the clustering of complex data described by three different types of variables, as frequently found in a three-way context. We obtain compatible versions of the same affinity coefficient for measuring similarity between statistical data units described by those three types of variables. A global generalized similarity coefficient is analyzed for this kind of mixed data, which often arises in data mining or knowledge mining. |