Publication: iAMY-SCM: Improved prediction and analysis of amyloid proteins using a scoring card method with propensity scores of dipeptides
Issued Date
2020-01-01
Resource Type
ISSN
10898646
08887543
08887543
Other identifier(s)
2-s2.0-85092215327
Rights
Mahidol University
Rights Holder(s)
SCOPUS
Bibliographic Citation
Genomics. (2020)
Suggested Citation
Phasit Charoenkwan, Sakawrat Kanthawong, Chanin Nantasenamat, Md Mehedi Hasan, Watshara Shoombuatong iAMY-SCM: Improved prediction and analysis of amyloid proteins using a scoring card method with propensity scores of dipeptides. Genomics. (2020). doi:10.1016/j.ygeno.2020.09.065 Retrieved from: https://repository.li.mahidol.ac.th/handle/20.500.14594/59898
Research Projects
Organizational Units
Authors
Journal Issue
Thesis
Title
iAMY-SCM: Improved prediction and analysis of amyloid proteins using a scoring card method with propensity scores of dipeptides
Other Contributor(s)
Abstract
© 2020 Elsevier Inc. Fast, accurate identification and characterization of amyloid proteins at a large-scale is essential for understating their role in therapeutic intervention strategies. As a matter of fact, there exist only one in silico model for amyloid protein identification using the random forest (RF) model in conjunction with various feature types namely the RFAmy. However, it suffers from low interpretability for biologists. Thus, it is highly desirable to develop a simple and easily interpretable prediction method with robust accuracy as compared to the existing complicated model. In this study, we propose iAMY-SCM, the first scoring card method-based predictor for predicting and analyzing amyloid proteins. Herein, the iAMY-SCM made use of a simple weighted-sum function in conjunction with the propensity scores of dipeptides for the amyloid protein identification. Cross-validation results indicated that iAMY-SCM provided an accuracy of 0.895 that corresponded to 10–22% higher performance than that of widely used machine learning models. Furthermore, iAMY-SCM achieving an accuracy of 0.827 as evaluated by an independent test, which was found to be comparable to that of RFAmy and was approximately 9–13% higher than widely used machine learning models. Furthermore, the analysis of estimated propensity scores of amino acids and dipeptides were performed to provide insights into the biophysical and biochemical properties of amyloid proteins. As such, this demonstrates that the proposed iAMY-SCM is efficient and reliable in terms of simplicity, interpretability and implementation. To facilitate ease of use of the proposed iAMY-SCM, a user-friendly and publicly accessible web server at http://camt.pythonanywhere.com/iAMY-SCM has been established. We anticipate that that iAMY-SCM will be an important tool for facilitating the large-scale prediction and characterization of amyloid protein.