Assessing the Quality of Ordinary Least Squares in General L^p Spaces

Kevin Hoffman; Hugo Moises Montesinos-Yufa

doi:10.11648/j.ajtas.20241306.12

Research Article | Peer-Reviewed

Published in American Journal of Theoretical and Applied Statistics (Volume 13, Issue 6)

Received: 20 September 2024 Accepted: 18 October 2024 Published: 18 November 2024

Abstract

In the context of regression analysis, we propose an estimation method capable of producing estimators that are closer to the true parameters than standard estimators when the residuals are non-normally distributed and when outliers are present. We achieve this improvement by minimizing the norm of the errors in general L^p spaces, as opposed to minimizing the norm of the errors in the typical L^2 space, which corresponds to Ordinary Least Squares (OLS). The generalized model proposed here, the Ordinary Least Powers (OLP) model, can implicitly adjust its sensitivity to outliers by changing its parameter p, the exponent of the absolute value of the residuals. Especially for residuals of large magnitude, such as those stemming from outliers or heavy-tailed distributions, different values of p implicitly place different relative weights on the corresponding observations. We fitted OLS and OLP models to simulated data under varying distributions that produce outlying observations and compared the mean squared errors relative to the true parameters. We found that OLP models with smaller values of p produce estimators closer to the true parameters when the error term follows an exponential or Cauchy distribution, and that larger values of p produce estimators closer to the true parameters when the error terms are uniformly distributed.
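
To make the idea concrete, here is a minimal sketch of an L^p fit for a single-predictor line, choosing a and b to minimize sum |y_i - a - b*x_i|^p. The abstract does not say how the OLP estimates are computed; this sketch assumes iteratively reweighted least squares (IRLS), a standard technique that is not necessarily the authors' method. The names `wls_line` and `olp_line`, the floor `eps`, the iteration count, and the toy data are all hypothetical. The plain iteration is restricted here to 1 <= p <= 2; larger p would likely need a damped update or a general-purpose optimizer.

```python
def wls_line(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xm = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    ym = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxx = sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return ym - b * xm, b


def olp_line(x, y, p=1.5, iters=200, eps=1e-8):
    """Approximately minimize sum(|y - a - b*x|**p) by IRLS (assumed method).

    Each pass solves a weighted least-squares problem with weights
    |residual|**(p - 2), so for p < 2 large residuals are down-weighted.
    eps is a hypothetical floor that keeps the weights finite.
    """
    if not 1 <= p <= 2:
        raise ValueError("this sketch only handles 1 <= p <= 2")
    a, b = wls_line(x, y, [1.0] * len(x))  # start from the OLS fit
    for _ in range(iters):
        w = [max(abs(yi - a - b * xi), eps) ** (p - 2) for xi, yi in zip(x, y)]
        a, b = wls_line(x, y, w)
    return a, b


# Toy data (made up for illustration): y = 1 + 2x with one gross outlier.
x = list(range(10))
y = [1 + 2 * xi for xi in x]
y[9] = 100

print("p = 2 (OLS):", olp_line(x, y, p=2.0))
print("p = 1      :", olp_line(x, y, p=1.0))
```

With p = 2 every weight equals 1, so the function reduces to OLS and the single outlier pulls the slope well above 2. With p = 1 the outlier's weight shrinks as its residual grows, and the fit settles close to the line through the nine clean points.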

Published in	American Journal of Theoretical and Applied Statistics (Volume 13, Issue 6)
DOI	10.11648/j.ajtas.20241306.12
Page(s)	193-202
Creative Commons	This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright	Copyright © The Author(s), 2024. Published by Science Publishing Group

Keywords

Regression Analysis, Least Squares, Robust Regression, Outliers, Simulation

Cite This Article

APA Style

Hoffman, K., Montesinos-Yufa, H. M. (2024). Assessing the Quality of Ordinary Least Squares in General Lp Spaces. American Journal of Theoretical and Applied Statistics, 13(6), 193-202. https://doi.org/10.11648/j.ajtas.20241306.12

ACS Style

Hoffman, K.; Montesinos-Yufa, H. M. Assessing the Quality of Ordinary Least Squares in General Lp Spaces. Am. J. Theor. Appl. Stat. 2024, 13(6), 193-202. doi: 10.11648/j.ajtas.20241306.12

AMA Style

Hoffman K, Montesinos-Yufa HM. Assessing the Quality of Ordinary Least Squares in General Lp Spaces. Am J Theor Appl Stat. 2024;13(6):193-202. doi: 10.11648/j.ajtas.20241306.12

@article{10.11648/j.ajtas.20241306.12,
  author = {Kevin Hoffman and Hugo Moises Montesinos-Yufa},
  title = {Assessing the Quality of Ordinary Least Squares in General Lp Spaces},
  journal = {American Journal of Theoretical and Applied Statistics},
  volume = {13},
  number = {6},
  pages = {193-202},
  doi = {10.11648/j.ajtas.20241306.12},
  url = {https://doi.org/10.11648/j.ajtas.20241306.12},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajtas.20241306.12},
  year = {2024}
}

TY  - JOUR
T1  - Assessing the Quality of Ordinary Least Squares in General Lp Spaces
AU  - Kevin Hoffman
AU  - Hugo Moises Montesinos-Yufa
Y1  - 2024/11/18
PY  - 2024
N1  - https://doi.org/10.11648/j.ajtas.20241306.12
DO  - 10.11648/j.ajtas.20241306.12
T2  - American Journal of Theoretical and Applied Statistics
JF  - American Journal of Theoretical and Applied Statistics
JO  - American Journal of Theoretical and Applied Statistics
SP  - 193
EP  - 202
PB  - Science Publishing Group
SN  - 2326-9006
UR  - https://doi.org/10.11648/j.ajtas.20241306.12
VL  - 13
IS  - 6
ER  -

Author Information

Kevin Hoffman

Department of Mathematics, Computer Science, and Statistics, Ursinus College, Collegeville, USA

http://orcid.org/0000-0001-6957-7519
Hugo Moises Montesinos-Yufa

Department of Mathematics, Computer Science, and Statistics, Ursinus College, Collegeville, USA
