Robust Clustering Method for the Detection of Outliers: Using AIC to Select the Number of Clusters

Bibliographic Details
Main Author: Carla Santos Pereira (author)
Other Authors: Ana M. Pires (author)
Format: book
Language: eng
Published: 2013
Subjects:
Online Access: https://repositorio-aberto.up.pt/handle/10216/65809
Country: Portugal
Oai: oai:repositorio-aberto.up.pt:10216/65809
Description
Summary: In [14] we proposed a method to detect outliers in multivariate data based on clustering and robust estimators. To implement this method in practice it is necessary to choose a clustering method, a pair of location and scatter estimators, and the number of clusters, k. After several simulation experiments it was possible to give a number of guidelines regarding the first two choices. However, the choice of the number of clusters depends entirely on the structure of the particular data set under study. Our suggestion is to try several values of k (e.g. from 1 to a maximum reasonable k, which depends on the number of observations and on the number of variables) and to select the k that minimizes an adapted AIC. In this paper we analyze this AIC-based criterion for choosing the number of clusters k (and also the clustering method and the location and scatter estimators) by applying it to several simulated data sets with and without outliers.
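
The selection rule in the summary (try k = 1, ..., k_max and keep the k with the smallest criterion value) can be illustrated with a minimal sketch. This record does not give the form of the adapted AIC, so the sketch substitutes the textbook AIC = 2p - 2 ln L, computed from a classification likelihood with Gaussian clusters that share one variance. It also assumes 1-D data, plain k-means and non-robust estimators, none of which is claimed to match the paper. The names `kmeans_1d`, `aic` and `select_k` are hypothetical.

```python
import math
import random


def kmeans_1d(xs, k, iters=50, seed=0):
    """Plain Lloyd's k-means on 1-D data; returns the non-empty clusters."""
    rng = random.Random(seed)
    centers = rng.sample(xs, k)  # k observations drawn without replacement
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            clusters[nearest].append(x)
        # An empty cluster keeps its previous center.
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return [c for c in clusters if c]


def aic(clusters):
    """Textbook AIC = 2p - 2 ln L (assumption: NOT the paper's adapted AIC).

    The likelihood is a classification likelihood with Gaussian clusters
    that share one variance, which is a simplification made for this sketch.
    """
    n = sum(len(c) for c in clusters)
    k = len(clusters)
    wss = 0.0  # pooled within-cluster sum of squares
    for c in clusters:
        mean = sum(c) / len(c)
        wss += sum((x - mean) ** 2 for x in c)
    var = max(wss / n, 1e-12)  # floor avoids log(0) on degenerate data
    log_lik = sum(len(c) * math.log(len(c) / n) for c in clusters)
    log_lik -= 0.5 * n * (math.log(2 * math.pi * var) + 1.0)
    n_params = 2 * k  # k means + (k - 1) proportions + 1 shared variance
    return 2 * n_params - 2 * log_lik


def select_k(xs, k_max):
    """Return (best_k, scores), where scores maps each k to its AIC."""
    scores = {k: aic(kmeans_1d(xs, k)) for k in range(1, k_max + 1)}
    best_k = min(scores, key=scores.get)
    return best_k, scores
```

Calling `select_k(data, k_max)` on a list of floats returns the chosen k together with the AIC value for every candidate, so the whole profile can be inspected rather than only the minimizer. A version closer to the summary would replace the means and the variance with robust location and scatter estimators and would work on multivariate data.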