Volume 8, Issue 5, September 2019, Page: 185-192
Analysis of Penalized Regression Methods in a Simple Linear Model on the High-Dimensional Data
Zari Farhadi, Department of Statistics, University of Tabriz, Tabriz, Iran
Reza Arabi Belaghi, Department of Statistics, University of Tabriz, Tabriz, Iran
Ozlem Gurunlu Alma, Department of Statistics, Mughla Sitki Kochman University, Mughla, Turkey
Received: Jun. 29, 2019;       Accepted: Sep. 3, 2019;       Published: Oct. 16, 2019
DOI: 10.11648/j.ajtas.20190805.14
Abstract
Shrinkage methods for linear regression were developed over the last ten years to address the weakness of ordinary least squares (OLS) regression with respect to prediction accuracy. At the same time, high-dimensional data are growing quickly in many areas, because technological advances make it possible to collect data with a large number of variables. In this paper, shrinkage methods are used to estimate regression coefficients effectively for the high-dimensional multiple regression model, where there are fewer samples than predictors. Regularization approaches have become the methods of choice for analyzing such high-dimensional data. We used three regularization methods based on penalized regression to select the appropriate model: Lasso, Ridge and Elastic Net. These methods have desirable features; they can simultaneously perform regularization and selection of appropriate predictor variables and estimate their effects. We compared the performance of the three regularized linear regression methods, using cross-validation to find the optimal point. Prediction accuracy was evaluated using the mean squared error (MSE). Through a simulation study and an analysis of real data, we found that all three methods are capable of producing appropriate models. The Elastic Net has better prediction accuracy than the rest; in the simulation study in particular, it outperformed the other two methods and showed a lower MSE.
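The comparison described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' code: it assumes a common parameterisation of the penalty, (1/2n)||y - Xb||^2 + lam*(alpha*||b||_1 + (1-alpha)/2*||b||_2^2), in which alpha = 0 gives Ridge, alpha = 1 gives Lasso and alpha = 0.5 is one possible Elastic Net. The simulated data (n = 20, p = 40, three nonzero coefficients), the lambda grid, the fold layout and all function names are hypothetical choices made only for illustration.

```python
import random

def soft_threshold(z, g):
    """Soft-thresholding operator: shrink z toward zero by g."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def elastic_net_cd(X, y, lam, alpha, n_sweeps=50):
    """Cyclic coordinate descent for the penalised least-squares objective.
    No intercept; runs a fixed number of sweeps without a convergence check."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, with beta = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            denom = col_sq[j] + lam * (1.0 - alpha)
            if denom == 0.0:
                continue
            # Correlation of column j with the partial residual (excluding j).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / denom
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def mse(X, y, beta):
    """Mean squared prediction error of the linear fit X @ beta."""
    errs = [(yi - sum(x * b for x, b in zip(row, beta))) ** 2 for row, yi in zip(X, y)]
    return sum(errs) / len(errs)

def cv_mse(X, y, lam, alpha, k=5):
    """Average held-out MSE over k interleaved folds."""
    n = len(y)
    total = 0.0
    for f in range(k):
        held = set(range(f, n, k))
        train = [i for i in range(n) if i not in held]
        beta = elastic_net_cd([X[i] for i in train], [y[i] for i in train], lam, alpha)
        total += mse([X[i] for i in held], [y[i] for i in held], beta)
    return total / k

# Simulated high-dimensional data: fewer samples than predictors (assumed setup).
random.seed(0)
n, p = 20, 40
true_beta = [2.0, -1.5, 1.0] + [0.0] * (p - 3)
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [sum(x * b for x, b in zip(row, true_beta)) + random.gauss(0, 0.5) for row in X]

methods = {"ridge": 0.0, "lasso": 1.0, "elastic net": 0.5}
lambdas = [0.01, 0.1, 1.0]
results = {}
for name, alpha in methods.items():
    scores = {lam: cv_mse(X, y, lam, alpha) for lam in lambdas}
    best_lam = min(scores, key=scores.get)
    results[name] = (best_lam, scores[best_lam])
    print(f"{name}: best lambda = {best_lam}, CV MSE = {scores[best_lam]:.3f}")
```

Which method attains the lowest cross-validated MSE in this sketch depends on the seed, the grid and the sparsity of the assumed coefficients, so its output should not be read as a reproduction of the paper's results.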
Keywords
Shrinkage Estimator, High Dimension, Cross-Validation, Ridge Regression, Elastic Net
To cite this article
Zari Farhadi, Reza Arabi Belaghi, Ozlem Gurunlu Alma, Analysis of Penalized Regression Methods in a Simple Linear Model on the High-Dimensional Data, American Journal of Theoretical and Applied Statistics. Vol. 8, No. 5, 2019, pp. 185-192. doi: 10.11648/j.ajtas.20190805.14
Copyright
Copyright © 2019 Authors retain the copyright of this article.
This article is an open access article distributed under the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.