iMRSAPred: Improved Prediction of Anti-MRSA Peptides Using Physicochemical and Pairwise Contact-Energy Properties of Amino Acids
Issued Date
2023-01-01
eISSN
24701343
Scopus ID
2-s2.0-85182005342
Journal Title
ACS Omega
Rights Holder(s)
SCOPUS
Bibliographic Citation
ACS Omega (2023)
Suggested Citation
Arif M., Fang G., Fida H., Musleh S., Yu D.J., Alam T. iMRSAPred: Improved Prediction of Anti-MRSA Peptides Using Physicochemical and Pairwise Contact-Energy Properties of Amino Acids. ACS Omega (2023). doi:10.1021/acsomega.3c08303 Retrieved from: https://repository.li.mahidol.ac.th/handle/20.500.14594/95602
Abstract
Methicillin-resistant Staphylococcus aureus (MRSA) is a growing threat to human health worldwide. Anti-MRSA peptides are potential antibiotic agents and play a significant role in combating MRSA infection. Traditional laboratory-based methods for annotating anti-MRSA peptides are precise but challenging, costly, and time-consuming. Computational methods capable of identifying anti-MRSA peptides can therefore accelerate the drug design process for treating bacterial infections. In this study, we developed a novel sequence-based predictor, “iMRSAPred”, for screening anti-MRSA peptides by incorporating energy estimation, physicochemical, and sequential information. We resolved the class imbalance in the data by using the synthetic minority oversampling technique plus Tomek link (SMOTETomek) algorithm. Furthermore, the Shapley additive explanation method was used to analyze the impact of top-ranked features on the prediction task. We evaluated multiple machine learning algorithms, i.e., CatBoost, Cascade Deep Forest, Kernel and Tree Boosting, support vector machine, and HistGBoost classifiers, by 10-fold cross-validation and independent testing. The proposed iMRSAPred method significantly improved the overall performance in terms of accuracy and Matthews correlation coefficient (MCC) by 5.45 and 0.083%, respectively, on the training data set. On the independent data set, iMRSAPred improved accuracy and MCC by 3.98 and 0.055%, respectively. We believe that the proposed method will be useful in large-scale anti-MRSA peptide prediction and will provide insights into other bioactive peptides.
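The abstract reports results as accuracy and MCC. As a reading aid, the minimal sketch below shows how these two metrics are conventionally computed from a binary confusion matrix. It is not the authors' code: the function names and the toy labels are hypothetical, and only the standard textbook definitions are assumed. MCC is useful on imbalanced data such as peptide sets because it accounts for all four confusion-matrix cells.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in [-1, 1].

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy labels, for illustration only (not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
acc_value = accuracy(*counts)
mcc_value = mcc(*counts)
```

For the toy labels, the counts are 2 true positives, 4 true negatives, 1 false positive, and 1 false negative, giving an accuracy of 0.75 and an MCC of 7/15.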