Volume 4, Issue 3, May 2015, Page: 78-84
Robust Linear Regression Using L1-Penalized MM-Estimation for High Dimensional Data
Kamal Darwish, Yildiz Technical University, Department of Statistics, Istanbul, Turkey
Ali Hakan Buyuklu, Yildiz Technical University, Department of Statistics, Istanbul, Turkey
Received: Mar. 10, 2015; Accepted: Mar. 24, 2015; Published: Mar. 30, 2015
DOI: 10.11648/j.ajtas.20150403.12
Abstract
Large datasets in which the number of predictors p exceeds the sample size n have become increasingly common in recent years. Such datasets pose great challenges for building a good linear prediction model. When a dataset also contains a fraction of outliers or other contamination, linear regression becomes even more difficult. We therefore need methods that are sparse and robust at the same time. In this paper, we apply the MM-estimation approach and propose L1-penalized MM-estimation (MM-Lasso). The proposed estimator combines the sparse LTS estimator with penalized M-estimators to obtain sparse model estimates with a high breakdown value and good prediction performance. We implemented MM-Lasso in the C programming language. A simulation study demonstrates the favorable prediction performance of MM-Lasso.
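To make the idea of an L1-penalized M-estimation criterion concrete, the following C sketch evaluates one such objective: a bounded robust loss on scaled residuals plus a lasso penalty on the coefficients. The abstract does not state the exact formulation, so the Tukey bisquare loss, the tuning constant c, the row-major data layout, and the function names tukey_rho and penalized_m_objective are all assumptions made here for illustration and are not the authors' implementation. The residual scale is assumed to be supplied by an initial high-breakdown fit such as sparse LTS.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Tukey bisquare rho, normalized so that rho(u) = 1 for |u| >= c.
 * The choice of this loss is an assumption for illustration. */
double tukey_rho(double u, double c)
{
    if (fabs(u) >= c)
        return 1.0;
    double t = u / c;
    double s = 1.0 - t * t;
    return 1.0 - s * s * s;
}

/* Hypothetical helper: evaluates
 *   sum_i rho((y_i - x_i' beta) / scale) + lambda * sum_j |beta_j|
 * x is an n-by-p matrix stored row-major; scale is a robust residual
 * scale assumed to come from an initial fit (e.g. sparse LTS). */
double penalized_m_objective(const double *x, const double *y,
                             const double *beta, size_t n, size_t p,
                             double scale, double c, double lambda)
{
    assert(scale > 0.0 && c > 0.0);

    double loss = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double fit = 0.0;
        for (size_t j = 0; j < p; ++j)
            fit += x[i * p + j] * beta[j];
        loss += tukey_rho((y[i] - fit) / scale, c);
    }

    double penalty = 0.0;
    for (size_t j = 0; j < p; ++j)
        penalty += fabs(beta[j]);

    return loss + lambda * penalty;
}
```

Because the loss is bounded, a single gross outlier can add at most 1 to the first term, which is what limits its influence on the fit, while the L1 term is what drives some coefficients to exactly zero when the objective is minimized. The sketch only evaluates the criterion; it does not show how the minimization is carried out.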
Keywords
MM Estimate, Sparse Model, LTS Estimate, Robust Regression
To cite this article
Kamal Darwish, Ali Hakan Buyuklu, Robust Linear Regression Using L1-Penalized MM-Estimation for High Dimensional Data, American Journal of Theoretical and Applied Statistics. Vol. 4, No. 3, 2015, pp. 78-84. doi: 10.11648/j.ajtas.20150403.12