Summary: | Outlier identification is important in many applications of multivariate analysis, either because there is specific interest in finding anomalous observations or as a pre-processing step before applying some multivariate method, in order to protect the results from the possibly harmful effects of those observations. It is also of great interest in supervised classification (or discriminant analysis) if, when predicting group membership, one wants to be able to label an observation as belonging to none of the available groups. The identification of outliers in multivariate data is usually based on the Mahalanobis distance. The use of robust estimates of the mean and the covariance matrix is advised in order to avoid the masking effect (Rousseeuw and Leroy, 1985; Rousseeuw and van Zomeren, 1990; Rocke and Woodruff, 1996; Becker and Gather, 1999). However, the performance of these rules remains highly dependent on multivariate normality of the bulk of the data. The aim of the method described here is to remove this dependence.
|
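As a minimal sketch of the baseline approach the summary refers to (not the paper's own method), the following illustrates outlier flagging via robust Mahalanobis distances, using the Minimum Covariance Determinant estimator from scikit-learn for the robust mean and covariance; the data, threshold choice, and planted outliers are illustrative assumptions.

```python
import numpy as np
from scipy.stats import chi2
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(0)
# Bulk of the data: 200 bivariate normal observations
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=200)
# Three planted outliers, far from the bulk (indices 200, 201, 202)
X = np.vstack([X, [[8.0, -8.0], [9.0, 9.0], [-10.0, 10.0]]])

# Robust location and scatter via the Minimum Covariance Determinant,
# which resists the masking effect that classical estimates suffer from
mcd = MinCovDet(random_state=0).fit(X)
d2 = mcd.mahalanobis(X)  # squared robust Mahalanobis distances

# Flag observations beyond the 97.5% chi-square quantile with p = 2
# degrees of freedom -- the usual cutoff under multivariate normality
cutoff = chi2.ppf(0.975, df=X.shape[1])
outliers = np.flatnonzero(d2 > cutoff)
```

The chi-square cutoff is exactly where the normality assumption enters: it is only calibrated when the bulk of the data is multivariate normal, which is the dependence the summary says the proposed method removes.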