Stack-AVP: A Stacked Ensemble Predictor Based on Multi-view Information for Fast and Accurate Discovery of Antiviral Peptides
Issued Date
2024-01-01
ISSN
00222836
eISSN
10898638
Scopus ID
2-s2.0-85210003994
Pubmed ID
39510347
Journal Title
Journal of Molecular Biology
Rights Holder(s)
SCOPUS
Bibliographic Citation
Journal of Molecular Biology (2024)
Suggested Citation
Charoenkwan P., Chumnanpuen P., Schaduangrat N., Shoombuatong W. Stack-AVP: A Stacked Ensemble Predictor Based on Multi-view Information for Fast and Accurate Discovery of Antiviral Peptides. Journal of Molecular Biology (2024). doi:10.1016/j.jmb.2024.168853 Retrieved from: https://repository.li.mahidol.ac.th/handle/20.500.14594/102224
Abstract
Antiviral peptides (AVPs) are short chains of amino acids capable of inhibiting viral replication, preventing viral entry, or disrupting viral membranes. They represent a promising area of research for developing new antiviral therapies because they can target a broad spectrum of viruses, including those resistant to traditional antiviral drugs. However, traditional experimental methods for identifying AVPs are often costly and labour-intensive. Multiple computational methods have been introduced for the in silico identification of AVPs, but these methods still have certain shortcomings. In this study, we propose a novel stacked ensemble learning framework, termed Stack-AVP, for fast and accurate AVP identification. In Stack-AVP, we investigated heterogeneous prediction models, which were trained with 12 commonly used machine learning algorithms coupled with a wide range of feature encoding schemes. Subsequently, these prediction models were used to generate multi-view features providing class information and probability information. Finally, we applied our feature selection method to determine the best feature subset for the construction of the final stacked model. Comparative assessments on the independent test dataset revealed that Stack-AVP outperformed current state-of-the-art methods, with an accuracy of 0.930, MCC of 0.860, and AUC of 0.975. Furthermore, our multi-view features were found to play a crucial role in improving AVP prediction performance. To help experimental scientists perform high-throughput identification of AVPs, the Stack-AVP prediction server is publicly accessible at https://pmlabqsar.pythonanywhere.com/Stack-AVP.
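The stacking idea described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the two encoders (`aac`, `group_composition`), the toy `CentroidModel` base learner, the residue groups, and the peptide strings are all hypothetical stand-ins chosen for brevity, and the feature selection step is omitted. The sketch only shows how out-of-fold predictions from several encoder/model pairs can be collected as probability and class columns and then fed to a final model.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Hypothetical coarse residue groups, used only as a second "view" of a sequence.
GROUPS = ("AVLIMFWC", "KRH", "DE", "STNQYGP")

def aac(seq):
    """Amino acid composition: fraction of each of the 20 standard residues."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]

def group_composition(seq):
    """Fraction of residues falling in each coarse group."""
    return [sum(seq.count(a) for a in g) / len(seq) for g in GROUPS]

class CentroidModel:
    """Toy base learner (an assumption, not one of the paper's 12 algorithms)."""

    def fit(self, X, y):
        # Store one mean vector per class (index 0 = negative, 1 = positive).
        self.centroids = []
        for label in (0, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids.append([sum(col) / len(rows) for col in zip(*rows)])
        return self

    def predict_proba(self, x):
        # Score in [0, 1]; closer to the positive centroid gives a higher score.
        d0 = math.dist(x, self.centroids[0])
        d1 = math.dist(x, self.centroids[1])
        return 0.5 if d0 + d1 == 0 else d0 / (d0 + d1)

def multi_view_features(seqs, labels, encoders, k=3):
    """Out-of-fold [probability, class] columns for every encoder."""
    n = len(seqs)
    meta = [[] for _ in range(n)]
    for encode in encoders:
        X = [encode(s) for s in seqs]
        for fold in range(k):
            train = [i for i in range(n) if i % k != fold]
            held_out = [i for i in range(n) if i % k == fold]
            model = CentroidModel().fit([X[i] for i in train],
                                        [labels[i] for i in train])
            for i in held_out:
                p = model.predict_proba(X[i])
                meta[i] += [p, 1 if p >= 0.5 else 0]  # probability + class info
    return meta

# Arbitrary toy strings, NOT real peptides; the labels are made up as well.
seqs = ["KRWKLLKKWRKL", "RKKWLKRLWKKA", "KWRRLLKKWAKL",
        "RRWKKLLKWKRL", "KKLWRKWLRKAL", "WKRKLLRKWKKL",
        "DEGSTDEPSGTD", "SEDGTPDESGTE", "GDESTDPEGSDT",
        "TEDSGDEPSTGD", "EDSGTEDPGSTD", "DSTEGDEPTSGE"]
labels = [1] * 6 + [0] * 6

meta = multi_view_features(seqs, labels, [aac, group_composition])
final_model = CentroidModel().fit(meta, labels)   # stacked (meta) model
score = final_model.predict_proba(meta[0])
```

Each row of `meta` holds four values here (a probability and a class label per encoder). In a fuller pipeline, more encoders and learners would widen this matrix, and a feature selection step would pick a subset of its columns before the final model is trained.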