GRUATT-AVP: leveraging a novel attention-based gated recurrent unit to advance the accuracy of antiviral peptide prediction
Issued Date
2025-12-01
Resource Type
eISSN
20452322
Scopus ID
2-s2.0-105023232168
Pubmed ID
41309991
Journal Title
Scientific Reports
Volume
15
Issue
1
Rights Holder(s)
SCOPUS
Bibliographic Citation
Scientific Reports Vol.15 No.1 (2025)
Suggested Citation
Aziz M.T., Rupok A.S., Mahmud S.M.H., Goh K.O.M., Hosen M.F., Shoombuatong W., Nandi D. GRUATT-AVP: leveraging a novel attention-based gated recurrent unit to advance the accuracy of antiviral peptide prediction. Scientific Reports Vol.15 No.1 (2025). doi:10.1038/s41598-025-26565-1 Retrieved from: https://repository.li.mahidol.ac.th/handle/123456789/113403
Corresponding Author(s)
Other Contributor(s)
Abstract
Antiviral peptides (AVPs), produced by all living organisms, play a vital role as the first line of immune defense against viral infections. AVPs offer a promising path toward novel antiviral therapies that target diverse viruses, including those resistant to existing drugs. However, identifying AVPs with wet-lab methods is costly and labor-intensive, and existing computational methods still have limitations. In this study, a novel attention-based Gated Recurrent Unit framework, named GRUATT-AVP, is proposed for accurate and fast AVP identification. In GRUATT-AVP, several Natural Language Processing (NLP) based encoding mechanisms, including One-Hot Encoding, Word2Vec, GloVe, FastText, and ProtBert, are adopted to encode the peptide sequences. Subsequently, different embedding dimensions based on k-mers of fixed length (1–6) and pooling were explored to capture the local context within the sequences. Next, a further experiment was conducted to determine the best feature selection technique, and the SHAP technique was integrated to eliminate noisy and less important encoded features, thereby improving the model’s generalization performance. Finally, the most informative feature subset was fed into the GRUATT-AVP model for classification. To understand the contribution of each component of the model, an ablation study was performed; the results showed that the proposed model outperforms its ablated variants, supporting the model’s stability and efficacy. In AVP prediction, GRUATT-AVP outperformed several state-of-the-art classifiers, achieving an accuracy of 94.8% and an AUC of 0.986, suggesting promising therapeutic potential against viral infections. To ensure wide accessibility and practical usage, the GRUATT-AVP web server is available at https://gruatt-avp.vercel.app/.
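The encoding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes overlapping k-mers with a stride of 1 (the abstract does not state the stride) and a one-hot alphabet of the 20 standard amino acids. The function names `kmer_tokens` and `one_hot`, and the example peptide, are hypothetical.

```python
from typing import List

# Assumption: the 20 standard amino acids form the one-hot alphabet.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_tokens(sequence: str, k: int) -> List[str]:
    """Split a peptide into overlapping k-mers (stride 1, an assumption)."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    # A sequence shorter than k yields an empty token list.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def one_hot(sequence: str) -> List[List[int]]:
    """Encode each residue as a 20-dimensional indicator vector.

    Residues outside the standard alphabet become all-zero rows.
    """
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for residue in sequence:
        row = [0] * len(AMINO_ACIDS)
        if residue in index:
            row[index[residue]] = 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    peptide = "GLFDIVKKV"  # hypothetical example sequence
    for k in range(1, 7):  # the fixed k-mer lengths 1 to 6 named in the abstract
        print(k, kmer_tokens(peptide, k))
    print(len(one_hot(peptide)), "x", len(AMINO_ACIDS))
```

Token lists like these would be the "words" passed to an embedding model such as Word2Vec, GloVe, or FastText; the one-hot matrix needs no trained embedding.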
