American Journal of Theoretical and Applied Statistics

| Peer-Reviewed |

Analysis of Penalized Regression Methods in a Simple Linear Model on the High-Dimensional Data

Received: Jun. 29, 2019    Accepted: Sep. 03, 2019    Published: Oct. 16, 2019

Abstract

Shrinkage methods for linear regression were developed to address the weakness of ordinary least squares (OLS) regression with respect to prediction accuracy. At the same time, high-dimensional data are growing rapidly in many fields, as technological advances make it easy to collect data with very large numbers of variables. In this paper, shrinkage methods were used to estimate the regression coefficients effectively in a high-dimensional multiple regression model, where there are fewer samples than predictors. Regularization approaches have become the methods of choice for analyzing such high-dimensional data, and we used three regularization methods based on penalized regression to select an appropriate model. The Lasso, Ridge, and Elastic Net have desirable features: they can simultaneously perform regularization, select the relevant predictor variables, and estimate their effects. We compared the performance of these three regularized linear regression methods, using cross-validation to choose the optimal tuning parameter, and evaluated prediction accuracy by the mean squared error (MSE). Through a simulation study and an analysis of real data, we found that all three methods are capable of producing appropriate models, with the Elastic Net giving the best prediction accuracy; in the simulation study in particular, the Elastic Net outperformed the other two methods and achieved the lowest MSE.
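The three penalties compared in the abstract can be illustrated with a single coordinate-descent routine, since Ridge (alpha = 0) and the Lasso (alpha = 1) are the two endpoints of the Elastic Net. The sketch below is a minimal pure-Python illustration on invented toy data with more predictors than samples (p > n), not the implementation used in the paper; in practice one would use an optimized package such as glmnet or scikit-learn, and the lambda grid and data here are made up for demonstration.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator S(z, t): the proximal map of the L1 penalty."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def elastic_net(X, y, lam, alpha, n_iter=200):
    """Coordinate descent for
        (1/2n) * ||y - X b||^2 + lam * (alpha * ||b||_1 + (1 - alpha)/2 * ||b||_2^2).
    alpha = 1 gives the Lasso penalty, alpha = 0 gives Ridge."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # running residual y - X*beta (beta starts at zero)
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # correlation of feature j with the partial residual (j-th coordinate held out)
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            if new_bj != beta[j]:
                delta = new_bj - beta[j]
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta

def mse(X, y, beta):
    """Mean squared prediction error of beta on (X, y)."""
    p = len(beta)
    return sum((y[i] - sum(X[i][j] * beta[j] for j in range(p))) ** 2
               for i in range(len(y))) / len(y)

def cv_mse(X, y, lam, alpha, k=5):
    """k-fold cross-validated MSE for a given penalty (lam, alpha)."""
    n = len(X)
    total = 0.0
    for f in range(k):
        hold = set(range(f, n, k))
        Xtr = [X[i] for i in range(n) if i not in hold]
        ytr = [y[i] for i in range(n) if i not in hold]
        b = elastic_net(Xtr, ytr, lam, alpha)
        total += sum((y[i] - sum(X[i][j] * b[j] for j in range(len(b)))) ** 2
                     for i in hold)
    return total / n

if __name__ == "__main__":
    # Toy high-dimensional data: n = 30 samples, p = 50 predictors, sparse truth.
    random.seed(1)
    n, p = 30, 50
    beta_true = [3.0, -2.0] + [0.0] * (p - 2)
    X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [sum(X[i][j] * beta_true[j] for j in range(p)) + random.gauss(0, 0.1)
         for i in range(n)]
    # Choose lambda by cross-validation over a small grid (Elastic Net, alpha = 0.5).
    grid = [0.01, 0.1, 0.5, 1.0]
    best_lam = min(grid, key=lambda lam: cv_mse(X, y, lam, 0.5))
    print("chosen lambda:", best_lam)
    print("training MSE:", mse(X, y, elastic_net(X, y, best_lam, 0.5)))
```

The soft-threshold step is what sets coefficients exactly to zero whenever alpha > 0, which is why the Lasso and Elastic Net perform variable selection while pure Ridge only shrinks coefficients toward zero.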

DOI 10.11648/j.ajtas.20190805.14
Published in American Journal of Theoretical and Applied Statistics ( Volume 8, Issue 5, September 2019 )
Page(s) 185-192
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2019. Published by Science Publishing Group

Keywords

Shrinkage Estimator, High Dimension, Cross-Validation, Ridge Regression, Elastic Net

Cite This Article
  • APA Style

    Zari Farhadi, Reza Arabi Belaghi, Ozlem Gurunlu Alma. (2019). Analysis of Penalized Regression Methods in a Simple Linear Model on the High-Dimensional Data. American Journal of Theoretical and Applied Statistics, 8(5), 185-192. https://doi.org/10.11648/j.ajtas.20190805.14


Author Information
  • Zari Farhadi, Department of Statistics, University of Tabriz, Tabriz, Iran

  • Reza Arabi Belaghi, Department of Statistics, University of Tabriz, Tabriz, Iran

  • Ozlem Gurunlu Alma, Department of Statistics, Mugla Sitki Kocman University, Mugla, Turkey