Authors: Phasit Charoenkwan; Chanin Nantasenamat; Md Mehedi Hasan; Balachandran Manavalan; Watshara Shoombuatong
Affiliations: Kyushu Institute of Technology; Ajou University School of Medicine; Tulane University School of Medicine; Mahidol University; Chiang Mai University
Dates: 2022-08-04; 2022-08-04; 2021-09-01
Source: Bioinformatics. Vol.37, No.17 (2021), 2556-2562
ISSN: 1460-2059; 1367-4803
Scopus EID: 2-s2.0-85102066790
URI: https://repository.li.mahidol.ac.th/handle/20.500.14594/76067

Motivation: The identification of bitter peptides through experimental approaches is an expensive and time-consuming endeavor. Due to the huge number of newly available peptide sequences in the post-genomic era, the development of automated computational models for the identification of novel bitter peptides is highly desirable.

Results: In this work, we present BERT4Bitter, a bidirectional encoder representations from transformers (BERT)-based model for predicting bitter peptides directly from their amino acid sequence without using any structural information. To the best of our knowledge, this is the first time a BERT-based model has been employed to identify bitter peptides. Compared to widely used machine learning models, BERT4Bitter achieved the best performance, with accuracies of 0.861 and 0.922 in cross-validation and independent tests, respectively. Furthermore, extensive empirical benchmarking experiments on the independent dataset demonstrated that BERT4Bitter clearly outperformed the existing method, with improvements of 8.0% in accuracy and 16.0% in Matthews correlation coefficient, highlighting the effectiveness and robustness of BERT4Bitter.
We believe that the BERT4Bitter method proposed herein will be a useful tool for rapidly screening and identifying novel bitter peptides for drug development and nutritional research.

Mahidol University
Subjects: Biochemistry, Genetics and Molecular Biology; Computer Science; Mathematics
Title: BERT4Bitter: A bidirectional encoder representations from transformers (BERT)-based model for improving the prediction of bitter peptides
Type: Article
Indexed in: SCOPUS
DOI: 10.1093/bioinformatics/btab133