High-dimensional datasets, in which the number of predictors p exceeds the sample size n, have become increasingly common in recent years. Such datasets pose great challenges for building a linear model with good predictive performance. Moreover, when the data contain a fraction of outliers or other contamination, linear regression becomes a difficult problem. We therefore need methods that are sparse and robust at the same time. In this paper, we implement the MM-estimation approach and propose an L1-penalized MM-estimator (MM-Lasso). The proposed estimator combines the sparse LTS estimator with penalized M-estimation to obtain sparse model estimates with a high breakdown value and good prediction accuracy. We implemented MM-Lasso in the C programming language. A simulation study demonstrates the favorable prediction performance of MM-Lasso.
| Published in | American Journal of Theoretical and Applied Statistics (Volume 4, Issue 3) |
| DOI | 10.11648/j.ajtas.20150403.12 |
| Page(s) | 78-84 |
| Creative Commons | This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
| Copyright | Copyright © The Author(s), 2015. Published by Science Publishing Group |
Keywords: MM Estimate, Sparse Model, LTS Estimate, Robust Regression
| [1] | A. E. Hoerl and R. W. Kennard, “Ridge Regression: Biased Estimation for Nonorthogonal Problems,” Technometrics, vol. 12, no. 1, pp. 55–67, 1970. |
| [2] | R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society, Series B, vol. 58, no. 1, pp. 267–288, 1996. |
| [3] | B. Efron, T. Hastie, and R. Tibshirani, “Least angle regression,” The Annals of Statistics, vol. 32, pp. 407–499, 2004. |
| [4] | K. Knight and W. Fu, “Asymptotics for Lasso-Type Estimators,” The Annals of Statistics, vol. 28, pp. 1356–1378, 2000. |
| [5] | J. Fan and R. Li, “Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties,” Journal of the American Statistical Association, vol. 96, no. 456, pp. 1348–1360, 2001. |
| [6] | A. Alfons, C. Croux, and S. Gelper, “Sparse least trimmed squares regression for analyzing high dimensional large data sets,” The Annals of Applied Statistics, vol. 7, no. 1, pp. 226–248, 2013. |
| [7] | H. Wang, G. Li, and G. Jiang, “Robust regression shrinkage and consistent variable selection through the LAD-lasso,” Journal of Business & Economic Statistics, vol. 25, pp. 347–355, 2007. |
| [8] | G. Li, H. Peng, and L. Zhu, “Nonconcave penalized M-estimation with a diverging number of parameters,” Statistica Sinica, vol. 21, no. 1, pp. 391–419, 2011. |
| [9] | R. A. Maronna, “Robust ridge regression for high-dimensional data,” Technometrics, vol. 53, pp. 44–53, 2011. |
| [10] | J. A. Khan, S. Van Aelst, and R. H. Zamar, “Robust linear model selection based on least angle regression,” Journal of the American Statistical Association, vol. 102, pp. 1289–1299, 2007. |
| [11] | P. Rousseeuw and A. Leroy, Robust regression and outlier detection. John Wiley & Sons, 1987. |
| [12] | V. J. Yohai, “High Breakdown-point and High Efficiency Estimates for Regression,” The Annals of Statistics, vol. 15, pp. 642–656, 1987. |
| [13] | R. A. Maronna, R. D. Martin, and V. J. Yohai, Robust Statistics: Theory and Methods. John Wiley & Sons, Chichester. ISBN 978-0-470-01092-1, 2006. |
| [14] | A. E. Beaton and J. W. Tukey, “The fitting of power series, meaning polynomials, illustrated on band-spectroscopic data,” Technometrics, vol. 16, pp. 147–185, 1974. |
| [15] | R. A. Maronna, and V. J. Yohai, “Correcting MM Estimates for Fat Data Sets,” Computational Statistics & Data Analysis, vol. 54, pp. 3168-3173, 2010. |
| [16] | V. J. Yohai and R.H. Zamar, “High breakdown-point estimates of regression by means of the minimization of an efficient scale,” Journal of the American Statistical Association, vol. 83, pp. 406–413, 1988. |
| [17] | A. Alfons, simFrame: Simulation framework. R package version 0.5, 2012b. |
| [18] | A. Alfons, robustHD: Robust methods for high-dimensional data. R package version 0.1.0, 2012a. |
| [19] | R. Koenker, quantreg: Quantile regression. R package version 4.67, 2011. |
| [20] | T. Hastie and B. Efron, lars: Least angle regression, lasso and forward stagewise. R package version 0.9-8, 2011. |
APA Style
Kamal Darwish, Ali Hakan Buyuklu. (2015). Robust Linear Regression Using L1-Penalized MM-Estimation for High Dimensional Data. American Journal of Theoretical and Applied Statistics, 4(3), 78-84. https://doi.org/10.11648/j.ajtas.20150403.12
ACS Style
Kamal Darwish; Ali Hakan Buyuklu. Robust Linear Regression Using L1-Penalized MM-Estimation for High Dimensional Data. Am. J. Theor. Appl. Stat. 2015, 4(3), 78-84. doi: 10.11648/j.ajtas.20150403.12
AMA Style
Kamal Darwish, Ali Hakan Buyuklu. Robust Linear Regression Using L1-Penalized MM-Estimation for High Dimensional Data. Am J Theor Appl Stat. 2015;4(3):78-84. doi: 10.11648/j.ajtas.20150403.12
@article{10.11648/j.ajtas.20150403.12,
author = {Kamal Darwish and Ali Hakan Buyuklu},
title = {Robust Linear Regression Using L1-Penalized MM-Estimation for High Dimensional Data},
journal = {American Journal of Theoretical and Applied Statistics},
volume = {4},
number = {3},
pages = {78-84},
doi = {10.11648/j.ajtas.20150403.12},
url = {https://doi.org/10.11648/j.ajtas.20150403.12},
eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajtas.20150403.12},
abstract = {High-dimensional datasets, in which the number of predictors p exceeds the sample size n, have become increasingly common in recent years. Such datasets pose great challenges for building a linear model with good predictive performance. Moreover, when the data contain a fraction of outliers or other contamination, linear regression becomes a difficult problem. We therefore need methods that are sparse and robust at the same time. In this paper, we implement the MM-estimation approach and propose an L1-penalized MM-estimator (MM-Lasso). The proposed estimator combines the sparse LTS estimator with penalized M-estimation to obtain sparse model estimates with a high breakdown value and good prediction accuracy. We implemented MM-Lasso in the C programming language. A simulation study demonstrates the favorable prediction performance of MM-Lasso.},
year = {2015}
}
TY - JOUR
T1 - Robust Linear Regression Using L1-Penalized MM-Estimation for High Dimensional Data
AU - Kamal Darwish
AU - Ali Hakan Buyuklu
Y1 - 2015/03/30
PY - 2015
N1 - https://doi.org/10.11648/j.ajtas.20150403.12
DO - 10.11648/j.ajtas.20150403.12
T2 - American Journal of Theoretical and Applied Statistics
JF - American Journal of Theoretical and Applied Statistics
JO - American Journal of Theoretical and Applied Statistics
SP - 78
EP - 84
PB - Science Publishing Group
SN - 2326-9006
UR - https://doi.org/10.11648/j.ajtas.20150403.12
AB - High-dimensional datasets, in which the number of predictors p exceeds the sample size n, have become increasingly common in recent years. Such datasets pose great challenges for building a linear model with good predictive performance. Moreover, when the data contain a fraction of outliers or other contamination, linear regression becomes a difficult problem. We therefore need methods that are sparse and robust at the same time. In this paper, we implement the MM-estimation approach and propose an L1-penalized MM-estimator (MM-Lasso). The proposed estimator combines the sparse LTS estimator with penalized M-estimation to obtain sparse model estimates with a high breakdown value and good prediction accuracy. We implemented MM-Lasso in the C programming language. A simulation study demonstrates the favorable prediction performance of MM-Lasso.
VL - 4
IS - 3
ER -