AttBiLSTM_DE: enhancing anticancer peptide prediction using word embedding and an optimized attention-based BiLSTM framework
Issued Date
2026-12-01
eISSN
20452322
Scopus ID
2-s2.0-105026668483
Pubmed ID
41326521
Journal Title
Scientific Reports
Volume
16
Issue
1
Rights Holder(s)
SCOPUS
Bibliographic Citation
Scientific Reports Vol.16 No.1 (2026)
Suggested Citation
Juthy M.J.N., Mahmud S.M.H., Hosen M.F., Aktar M.N., Goh K.O.M., Shoombuatong W. AttBiLSTM_DE: enhancing anticancer peptide prediction using word embedding and an optimized attention-based BiLSTM framework. Scientific Reports Vol.16 No.1 (2026). doi:10.1038/s41598-025-29767-9 Retrieved from: https://repository.li.mahidol.ac.th/handle/123456789/114419
Abstract
Cancer remains a major global health issue, causing numerous deaths annually. Standard treatments such as chemotherapy and radiotherapy are often cytotoxic, harming healthy cells and causing significant side effects. In this context, anticancer peptides (ACPs) offer a promising strategy by specifically triggering apoptosis in cancer cells while sparing healthy tissues. However, experimental screening for new ACPs is costly and labor-intensive. To address these challenges, we propose a computational framework, AttBiLSTM_DE, which combines an attention-based bidirectional LSTM (BiLSTM) architecture with optimized weighted features for accurate ACP prediction. First, we employed four NLP-based feature encoding techniques (One-Hot Encoding, Global Vectors (GloVe), fastText, and Word2Vec) to convert peptide sequences into numerical representations. Additionally, k-mer embedding was used to help the model recognize important subsequence fragments within the sequences. Next, we developed a stochastic Differential Evolution (DE) algorithm to construct hybrid features, optimize feature weights, and generate the most informative attributes. Finally, the weighted feature sets were analyzed with a BiLSTM model augmented by an attention mechanism. The bidirectional architecture captures contextual dependencies from both the preceding and succeeding positions in a peptide sequence, while the attention mechanism emphasizes the most pertinent aspects, thus enhancing the model's prediction performance. In extensive evaluation, AttBiLSTM_DE outperformed conventional attention-based deep learning models, achieving an accuracy of 95.85% and an AUC of 98.48%. These results indicate that AttBiLSTM_DE predicts ACPs effectively and can aid further cancer treatment and drug development. Furthermore, we have developed an online web server for real-time prediction based on the proposed model, publicly accessible at: https://att-bi-lstm-de-acp.vercel.app/
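
The sequence-to-token and sequence-to-matrix steps mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the choice of k = 3, the padding length, and the example peptide are all assumptions made purely for illustration, and the record does not state which settings the paper used.

```python
# Illustrative sketch only: names, k, and max_len are assumptions,
# not values taken from the paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def kmer_tokens(sequence, k=3):
    """Split a peptide into overlapping k-mers with stride 1."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def one_hot(sequence, max_len=12):
    """One-hot encode residues, truncating or zero-padding to max_len rows."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for residue in sequence[:max_len]:
        row = [0] * len(AMINO_ACIDS)
        if residue in index:  # non-standard residues stay all-zero
            row[index[residue]] = 1
        rows.append(row)
    # Pad short peptides with all-zero rows so every input has the same shape.
    rows.extend([0] * len(AMINO_ACIDS) for _ in range(max_len - len(rows)))
    return rows


peptide = "FLPIVGKLLS"  # hypothetical example sequence
tokens = kmer_tokens(peptide, k=3)
encoded = one_hot(peptide, max_len=12)
print(tokens)
print(len(encoded), len(encoded[0]))
```

In a pipeline of the kind the abstract describes, k-mer tokens like these would typically serve as the "words" passed to an embedding model such as Word2Vec, GloVe, or fastText, while the one-hot matrix is a fixed-size input that needs no training.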
