Summary: The aim of a supervised classification problem is to build, from a training sample, a decision rule according to which a new object is assigned to one of c predefined classes on the basis of its observed p-dimensional feature vector. When the classes are not perfectly separated, or when there is some uncertainty, it may be better not to classify. In that case we can introduce a rejection option, either for cases of doubt or for atypical observations (outliers). This work presents a method for classifying a new object into one of c + 2 classes. Special emphasis is given to the treatment of atypical observations: we propose a new outlier rejection rule, based on cluster analysis and a Mahalanobis-type distance with classical and robust estimators, which performed well in a simulation study with normal and non-normal data, with and without outliers. We considered three clustering methods: k-means, pam and mclust; and three pairs of location-scatter estimators: the classical estimators, the Reweighted Minimum Covariance Determinant with an approximate 25% breakdown point (RMCD25), and the Orthogonalised Gnanadesikan-Kettenring (OGK) estimator of Maronna and Zamar. The method is illustrated with two applications.
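As a rough illustration of the kind of outlier rejection considered here (a minimal sketch, not the authors' exact rule, and omitting the clustering step), the following Python code flags observations whose squared Mahalanobis distance, computed from a robust MCD estimate of location and scatter with a roughly 25% breakdown point, exceeds a chi-squared cutoff. The use of scikit-learn's MinCovDet, the cutoff level alpha and the simulated data are assumptions made for illustration only.

# Illustrative sketch: Mahalanobis-type outlier flagging with a robust
# (MCD) estimate of location and scatter. Not the authors' procedure.
import numpy as np
from scipy.stats import chi2
from sklearn.covariance import MinCovDet

def flag_outliers(X, alpha=0.025):
    """Mark rows of X whose squared robust Mahalanobis distance
    exceeds the chi-squared(1 - alpha) quantile with p degrees of freedom."""
    mcd = MinCovDet(support_fraction=0.75).fit(X)  # ~25% breakdown point
    d2 = mcd.mahalanobis(X)                        # squared distances to the robust centre
    cutoff = chi2.ppf(1 - alpha, df=X.shape[1])
    return d2 > cutoff

# Example: 200 bivariate normal points plus 10 gross outliers (simulated data)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(200, 2)), rng.normal(loc=8.0, size=(10, 2))])
print(flag_outliers(X).sum(), "observations flagged as atypical")

In the full rule described above, a distance of this type would be computed with respect to each class (or cluster) found in the training sample, and a new object far from all of them would be assigned to the outlier class rather than to one of the c predefined classes.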