ACSMPred-FRL: Accelerating the identification of anticancer small molecules using feature representation learning
Issued Date
2025-01-01
eISSN
21693536
Scopus ID
2-s2.0-105017406061
Journal Title
IEEE Access
Rights Holder(s)
SCOPUS
Bibliographic Citation
IEEE Access (2025)
Suggested Citation
Shoombuatong W., Schaduangrat N., Mookdarsanit L., Hasan Mahmud S.M., Kusonmano K., Mookdarsanit P. ACSMPred-FRL: Accelerating the identification of anticancer small molecules using feature representation learning. IEEE Access (2025). doi:10.1109/ACCESS.2025.3613332 Retrieved from: https://repository.li.mahidol.ac.th/handle/123456789/112485
Abstract
Anticancer peptides (ACPs) have recently emerged as promising therapeutic agents for cancer treatment. They have been studied extensively because of their ability to selectively target cancer cells without directly affecting healthy cells. Meanwhile, many small-molecule drugs are being introduced and evaluated in preclinical and clinical trials for anticancer treatment. However, the experimental identification and characterization of anticancer small molecules (ACSMs) remains laborious and time-consuming. In this study, we propose a computational approach, termed ACSMPred-FRL, that can accurately and rapidly identify compounds with or without anticancer activity based on a feature representation learning scheme. In ACSMPred-FRL, we first constructed 442 base-classifiers by combining 13 machine learning (ML) algorithms with 14 molecular descriptors and 20 molecular embeddings (13 × 34 = 442). Next, probabilistic features from these 442 base-classifiers were generated and combined to provide multi-view information embedded in ACSMs. Then, principal component analysis (PCA) was applied to the 442 probabilistic features, yielding 42 principal components that served as input vectors for developing the final meta-classifier. Finally, the optimal feature subset was determined and used to train the final meta-classifier. Both cross-validation and independent tests showed that ACSMPred-FRL not only surpassed several conventional ML-based classifiers but also outperformed the existing method, highlighting its effectiveness and predictive capability. Specifically, on the independent test dataset, ACSMPred-FRL achieved an accuracy of 0.834, a specificity of 0.88, and a Matthews correlation coefficient of 0.672, representing improvements of 4.45%, 4.70%, and 9.20%, respectively, over the existing method.
Therefore, we anticipate that ACSMPred-FRL will serve as a useful and reliable tool for the large-scale identification of ACSMs, accelerating their application in cancer treatment. The source code and datasets used in this study are publicly available for reproducibility at https://github.com/lawankorn-m/ACSMPred-MVF.
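The feature representation learning scheme described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of molecular descriptors and embeddings, and substitutes three learners, three feature views, and four principal components for the 13 algorithms, 34 encodings, and 42 components reported above. All function names are hypothetical, and the final feature-subset selection step is omitted.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, matthews_corrcoef
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neighbors import KNeighborsClassifier


def make_views(n_samples=300, n_views=3, seed=0):
    """Synthetic stand-in for several encodings of the same compounds."""
    X, y = make_classification(
        n_samples=n_samples,
        n_features=10 * n_views,
        n_informative=6 * n_views,
        random_state=seed,
    )
    # Split the columns into equally sized blocks, one block per "view".
    return np.split(X, n_views, axis=1), y


def base_learners():
    """Placeholder learners (assumption: not the 13 algorithms of the paper)."""
    return [
        LogisticRegression(max_iter=1000),
        KNeighborsClassifier(n_neighbors=5),
        RandomForestClassifier(n_estimators=50, random_state=0),
    ]


def probabilistic_features(train_views, y_train, test_views, cv=5):
    """One probability column per (view, learner) pair."""
    train_cols, test_cols = [], []
    for X_tr, X_te in zip(train_views, test_views):
        for model in base_learners():
            # Out-of-fold probabilities keep training labels from leaking
            # into the features seen by the meta-classifier.
            oof = cross_val_predict(
                model, X_tr, y_train, cv=cv, method="predict_proba"
            )
            train_cols.append(oof[:, 1])
            # Refit on the full training view to score the held-out set.
            test_cols.append(model.fit(X_tr, y_train).predict_proba(X_te)[:, 1])
    return np.column_stack(train_cols), np.column_stack(test_cols)


views, y = make_views()
idx_train, idx_test = train_test_split(
    np.arange(len(y)), test_size=0.25, stratify=y, random_state=0
)
train_views = [v[idx_train] for v in views]
test_views = [v[idx_test] for v in views]
y_train, y_test = y[idx_train], y[idx_test]

# 3 views x 3 learners = 9 probabilistic features (442 in the paper).
P_train, P_test = probabilistic_features(train_views, y_train, test_views)

# Compress the probabilistic features with PCA (42 components in the paper).
pca = PCA(n_components=4, random_state=0).fit(P_train)
Z_train, Z_test = pca.transform(P_train), pca.transform(P_test)

# Meta-classifier trained on the principal components.
meta = LogisticRegression(max_iter=1000).fit(Z_train, y_train)
y_pred = meta.predict(Z_test)

# The three metrics quoted in the abstract.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "specificity": tn / (tn + fp),
    "mcc": matthews_corrcoef(y_test, y_pred),
}
print(metrics)
```

Generating the training-set probabilities out of fold is a design choice of this sketch; the abstract does not state how the probabilistic features were produced, so treat that detail as an assumption.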
