Adaptive Lasso sparse logistic regression on high-dimensional data with multicollinearity
Issued Date
2025-02-25
Resource Type
eISSN
2630-0087
Scopus ID
2-s2.0-105020747709
Journal Title
Science Engineering and Health Studies
Volume
19
Rights Holder(s)
SCOPUS
Bibliographic Citation
Science Engineering and Health Studies Vol.19 (2025)
Suggested Citation
Sudjai N., Duangsaphon M., Chandhanayingyong C. Adaptive Lasso sparse logistic regression on high-dimensional data with multicollinearity. Science Engineering and Health Studies Vol.19 (2025). doi:10.69598/sehs.19.25020002 Retrieved from: https://repository.li.mahidol.ac.th/handle/123456789/113025
Author(s)
Sudjai N.; Duangsaphon M.; Chandhanayingyong C.
Abstract
A combination of high-dimensional sparse data and multicollinearity can destabilize a predictive model when it is applied to a new data set. The least absolute shrinkage and selection operator (Lasso) is widely employed in machine-learning algorithms for variable selection and parameter estimation. Although this method is computationally feasible for high-dimensional data, it has some drawbacks. The adaptive Lasso was therefore developed by placing an adaptive weight on the penalty function; this weight is related to the power order of the initial estimators. Hence, we focus on the power of the adaptive weight in two penalty functions: the adaptive Lasso and the adaptive elastic net. This study aimed to compare the performance of different powers of the adaptive weight for the adaptive Lasso and adaptive elastic net methods on high-dimensional sparse data with multicollinearity. Moreover, four penalized methods were compared: Lasso, elastic net, adaptive Lasso, and adaptive elastic net. They were evaluated using the mean of the predicted mean squared error in the simulation study and the classification accuracy in a real-data application. The results showed that the higher-order adaptive Lasso performed best on very high-dimensional sparse data with multicollinearity when the initial weight was obtained from a ridge estimator. However, for high-dimensional sparse data with multicollinearity, the square root of the adaptive Lasso together with an initial weight from the Lasso was the best option.
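To make the idea of a power-weighted penalty concrete, the sketch below shows one common way to fit an adaptive Lasso logistic regression: obtain an initial ridge estimate, form adaptive weights as the inverse of the absolute initial coefficients raised to a power `gamma` (the power order discussed in the abstract), and then solve a plain L1-penalized problem on rescaled features. This is a minimal illustration using scikit-learn, not the authors' exact pipeline; the data, `gamma`, and the penalty strength `C` are assumptions for demonstration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic sparse high-dimensional-style data (illustrative only).
X, y = make_classification(n_samples=200, n_features=50, n_informative=5,
                           random_state=0)

# Step 1: initial estimator — ridge (L2-penalized) logistic regression,
# one of the initial-weight choices mentioned in the abstract.
ridge = LogisticRegression(penalty="l2", C=1.0, max_iter=5000).fit(X, y)
beta_init = ridge.coef_.ravel()

# Step 2: adaptive weights w_j = 1 / |beta_init_j|^gamma.
# gamma is the power of the adaptive weight; e.g. 0.5 gives the
# square-root variant, larger values give higher-order weighting.
gamma = 1.0
w = 1.0 / (np.abs(beta_init) ** gamma + 1e-8)  # small constant avoids /0

# Step 3: the adaptive Lasso reduces to a plain Lasso on rescaled
# features X_j / w_j; unscale the coefficients afterwards.
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=1.0,
                           max_iter=5000).fit(X / w, y)
beta_adaptive = lasso.coef_.ravel() / w

# Coefficients driven exactly to zero are the variables dropped
# by the adaptive Lasso.
n_selected = int(np.count_nonzero(beta_adaptive))
print(n_selected)
```

Because weakly supported coefficients receive large weights, they are penalized more heavily and tend to be driven exactly to zero, which is what makes the method suitable for sparse variable selection.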
