M3S-ALG: Improved and robust prediction of allergenicity of chemical compounds by using a novel multi-step stacking strategy
Issued Date
2025-01-01
ISSN
0167739X
Scopus ID
2-s2.0-85201084839
Journal Title
Future Generation Computer Systems
Volume
162
Rights Holder(s)
SCOPUS
Bibliographic Citation
Future Generation Computer Systems Vol.162 (2025)
Suggested Citation
Charoenkwan P., Schaduangrat N., Phan L.T., Manavalan B., Shoombuatong W. M3S-ALG: Improved and robust prediction of allergenicity of chemical compounds by using a novel multi-step stacking strategy. Future Generation Computer Systems Vol.162 (2025). doi:10.1016/j.future.2024.07.033 Retrieved from: https://repository.li.mahidol.ac.th/handle/20.500.14594/102723
Abstract
Many chemicals cannot be brought to market because of their high allergenicity, so it is crucial to assess the allergenic potential of chemicals before introducing them into clinical therapeutics. However, assessing the allergenicity of chemical compounds experimentally is time-consuming and costly. To tackle this challenge, we propose M3S-ALG, which uses a novel multi-step stacking strategy (M3S) to identify the allergenicity of chemical compounds rapidly and accurately from their SMILES notation alone. The M3S method involves three steps. First, ten different balanced datasets were constructed using an under-sampling approach. Second, for each balanced dataset, 144 base-classifiers were trained and optimized, and their prediction scores for allergenic chemical compounds were treated as new probabilistic features. Third, the important probabilistic features were selected and used to construct the final stacked model (M3S-ALG). Experimental results show that M3S-ALG outperforms conventional ensemble strategies and its constituent base-classifiers on both the training and independent test datasets, which indicates the effectiveness and robustness of the proposed strategy in identifying the allergenicity of chemical compounds. In addition, M3S-ALG showed excellent prediction performance compared with existing methods on the independent test dataset, achieving a balanced accuracy of 0.877, an MCC of 0.712, and an AUC of 0.931. Finally, we developed a user-friendly online web server at https://pmlabqsar.pythonanywhere.com/M3SALG. This approach is expected to help the drug discovery and development community identify non-allergenic chemical compounds at scale.
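The three steps described in the abstract, together with two of the reported metrics (balanced accuracy and MCC), can be illustrated with a minimal standard-library Python sketch. This is not the authors' implementation. Every name here (`balanced_subsets`, `fit_centroid_scorer`, `m3s_sketch`) is hypothetical, the numeric feature is a stand-in for SMILES-derived descriptors, a single toy centroid scorer replaces the 144 tuned base-classifiers, and the final model simply averages the selected scores because the abstract does not name the meta-learner. A real stacking pipeline would also typically use cross-validated scores to limit leakage, which is omitted here.

```python
import math
import random


def balanced_subsets(y, n_subsets=10, seed=0):
    """Step 1: under-sample the majority class into balanced index sets."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    return [sorted(minority + rng.sample(majority, len(minority)))
            for _ in range(n_subsets)]


def fit_centroid_scorer(X, y, idx, col):
    """Toy base-classifier (assumption): logistic score from class means."""
    pos = [X[i][col] for i in idx if y[i] == 1]
    neg = [X[i][col] for i in idx if y[i] == 0]
    mu_pos, mu_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    mid, slope = (mu_pos + mu_neg) / 2, mu_pos - mu_neg

    def score(row):
        z = max(-50.0, min(50.0, slope * (row[col] - mid)))  # avoid overflow
        return 1.0 / (1.0 + math.exp(-z))
    return score


def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, tn, fp, fn


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity (both classes must be present)."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def m3s_sketch(X, y, n_subsets=10, keep=5, seed=0):
    """Steps 2 and 3: build probabilistic features, select, then combine."""
    # Step 2: one scorer per (balanced subset, feature column) pair.
    scorers = [fit_centroid_scorer(X, y, idx, col)
               for idx in balanced_subsets(y, n_subsets, seed)
               for col in range(len(X[0]))]

    # Step 3: keep the scorers with the best training balanced accuracy.
    def quality(scorer):
        return balanced_accuracy(y, [int(scorer(row) >= 0.5) for row in X])

    selected = sorted(scorers, key=quality, reverse=True)[:keep]
    # Stand-in meta-learner (assumption): average the selected scores.
    return lambda row: sum(s(row) for s in selected) / len(selected)


# Toy imbalanced data: six non-allergenic (0) and two allergenic (1) samples.
X = [[0.0], [0.1], [0.2], [0.3], [0.4], [0.5], [2.0], [2.1]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

model = m3s_sketch(X, y)
y_pred = [int(model(row) >= 0.5) for row in X]
print("balanced accuracy:", balanced_accuracy(y, y_pred))
print("MCC:", mcc(y, y_pred))
```

Balanced accuracy and MCC are both designed to stay informative when one class is much rarer than the other, which fits a setting where the training data had to be under-sampled to balance the classes.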