Outlier detection and robust variable selection for least angle regression

Bibliographic Details
Main Author: Shahriari, Shirin (author)
Other Authors: Faria, Susana (author), Gonçalves, A. Manuela (author), Van Aelst, Stefan (author)
Format: conferencePaper
Language: eng
Published: 2014
Online Access: http://hdl.handle.net/1822/32500
Country: Portugal
Oai: oai:repositorium.sdum.uminho.pt:1822/32500
Description
Summary: Selecting a parsimonious subset of variables from a large number of predictors is an important problem in regression modeling. When the data contain vertical outliers and/or leverage points, outlier detection and variable selection are inseparable problems, so a robust method that performs both simultaneously is needed. An outlier detection and robust variable selection method is introduced that combines robust least angle regression with least trimmed squares regression on jack-knife subsets. In a second stage, the detected outliers are removed and standard least angle regression is applied to the cleaned data to robustly sequence the predictor variables in order of importance. The performance of the method is evaluated in simulations with data that contain vertical outliers and high leverage points. The simulation results indicate that the method performs well in both outlier detection and robust variable selection.
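
The two-stage idea in the summary (flag outliers with a robust fit, then sequence the predictors on the cleaned data) can be illustrated with a small sketch. This is not the authors' procedure: it omits the jack-knife subsets and the robust least angle regression step, approximates least trimmed squares with random starts and concentration steps, and uses a greedy forward ordering by correlation as a stand-in for least angle regression. The function names (ols, flag_outliers, sequence_predictors), the 0.75 retention fraction, the MAD-based scale, the 2.5 cutoff, and the simulated data are all assumptions made for illustration.

```python
import random
import statistics

def ols(X, y):
    """Least-squares fit with intercept, solved via the normal equations."""
    A = [[1.0] + list(row) for row in X]
    p = len(A[0])
    # Augmented system [A'A | A'y], reduced by Gauss-Jordan elimination.
    M = [[sum(a[i] * a[j] for a in A) for j in range(p)]
         + [sum(a[i] * yi for a, yi in zip(A, y))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][p] / M[i][i] for i in range(p)]

def residuals(X, y, beta):
    return [yi - beta[0] - sum(b * x for b, x in zip(beta[1:], row))
            for row, yi in zip(X, y)]

def corr(u, v):
    mu, mv = statistics.fmean(u), statistics.fmean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den

def flag_outliers(X, y, keep=0.75, starts=30, csteps=10, cutoff=2.5, seed=0):
    """Stage 1 (simplified): approximate LTS fit, then flag large residuals."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0]) + 1
    h = int(keep * n)  # number of observations the trimmed fit retains
    best_obj, best_res = float("inf"), None
    for _ in range(starts):
        idx = rng.sample(range(n), p + 1)  # small random starting subset
        for _ in range(csteps):
            # Concentration step: fit on the subset, keep the h best-fitting points.
            res = residuals(X, y, ols([X[i] for i in idx], [y[i] for i in idx]))
            idx = sorted(range(n), key=lambda i: res[i] ** 2)[:h]
        obj = sum(res[i] ** 2 for i in idx)  # trimmed sum of squares
        if obj < best_obj:
            best_obj, best_res = obj, res
    # Assumed flagging rule: residual beyond cutoff times a MAD-type scale.
    scale = 1.4826 * statistics.median(abs(r) for r in best_res)
    return [i for i, r in enumerate(best_res) if abs(r) > cutoff * scale]

def sequence_predictors(X, y):
    """Stage 2 stand-in (not LARS): greedy ordering by |corr| with the residual."""
    d = len(X[0])
    order, res = [], list(y)
    while len(order) < d:
        rest = [j for j in range(d) if j not in order]
        best = max(rest, key=lambda k: abs(corr([row[k] for row in X], res)))
        order.append(best)
        Xs = [[row[k] for k in order] for row in X]
        res = residuals(Xs, y, ols(Xs, y))
    return order

# Simulated data (assumed): only predictors 0 and 1 matter; 6 vertical outliers.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(60)]
y = [3 * r[0] + 1.5 * r[1] + rng.gauss(0, 0.5) for r in X]
for i in range(6):
    y[i] += 20.0

flagged = flag_outliers(X, y)
clean = [i for i in range(len(y)) if i not in flagged]
order = sequence_predictors([X[i] for i in clean], [y[i] for i in clean])
print("flagged:", flagged)
print("predictor order:", order)
```

In this toy setting the shifted responses sit far from the trimmed fit, so they are removed before the predictors are ordered. Leverage points and the jack-knife resampling described in the summary are not exercised here.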