StackTTCA: a stacking ensemble learning-based framework for accurate and high-throughput identification of tumor T cell antigens

Charoenkwan P.; Schaduangrat N.; Shoombuatong W.

dc.contributor.author	Charoenkwan P.
dc.contributor.author	Schaduangrat N.
dc.contributor.author	Shoombuatong W.
dc.contributor.other	Mahidol University
dc.date.accessioned	2023-08-09T18:01:02Z
dc.date.available	2023-08-09T18:01:02Z
dc.date.issued	2023-07-28
dc.description.abstract	BACKGROUND: The identification of tumor T cell antigens (TTCAs) is crucial for providing insights into their functional mechanisms and for exploiting their potential in anticancer vaccine development, where TTCAs are highly promising. However, experimental technologies for discovering and characterizing new TTCAs are expensive and time-consuming. Although many machine learning (ML)-based models have been proposed for identifying new TTCAs, there is still a need for a robust model that achieves higher accuracy and precision. RESULTS: In this study, we propose a new stacking ensemble learning-based framework, termed StackTTCA, for accurate and large-scale identification of TTCAs. First, we constructed 156 baseline models by combining 12 feature encoding schemes with 13 popular ML algorithms. Second, these baseline models were trained and used to create a new probabilistic feature vector. Finally, the optimal probabilistic feature vector was determined using a feature selection strategy and then used to construct our stacked model. Comparative benchmarking experiments indicated that StackTTCA clearly outperformed several ML classifiers and the existing methods on the independent test, with an accuracy of 0.932 and a Matthews correlation coefficient of 0.866. CONCLUSIONS: In summary, the proposed stacking ensemble learning-based framework, StackTTCA, could help to precisely and rapidly identify true TTCAs for follow-up experimental verification. In addition, we developed an online web server ( http://2pmlab.camt.cmu.ac.th/StackTTCA ) to maximize user convenience for high-throughput screening of novel TTCAs.
dc.identifier.citation	BMC bioinformatics Vol.24 No.1 (2023), 301
dc.identifier.doi	10.1186/s12859-023-05421-x
dc.identifier.eissn	14712105
dc.identifier.pmid	37507654
dc.identifier.scopus	2-s2.0-85165966016
dc.identifier.uri	https://repository.li.mahidol.ac.th/handle/20.500.14594/88205
dc.rights.holder	SCOPUS
dc.subject	Biochemistry, Genetics and Molecular Biology
dc.title	StackTTCA: a stacking ensemble learning-based framework for accurate and high-throughput identification of tumor T cell antigens
dc.type	Article
mu.datasource.scopus	https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85165966016&origin=inward
oaire.citation.issue	1
oaire.citation.title	BMC bioinformatics
oaire.citation.volume	24
oairecerif.author.affiliation	Mahidol University
oairecerif.author.affiliation	Chiang Mai University
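
The stacking procedure summarized in the abstract (baseline models built from every pairing of a feature encoding with an ML algorithm, their predicted probabilities collected into a probabilistic feature vector, a feature selection step, then a stacked model) can be illustrated with a minimal, standard-library Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two encodings and two learners stand in for the 12 schemes and 13 algorithms, the leave-one-out scheme, the accuracy-based ranking, and the logistic-regression meta-learner are arbitrary choices, and the peptide sequences and labels are invented toy data, not real TTCAs.

```python
import math
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AVILMFWY")

# Feature encodings: hypothetical stand-ins for the paper's 12 schemes.
def aac(seq):
    """Amino acid composition: fraction of each of the 20 residues."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]

def physchem(seq):
    """Two crude descriptors: hydrophobic fraction and scaled length."""
    return [sum(c in HYDROPHOBIC for c in seq) / len(seq), len(seq) / 20.0]

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Baseline learners: hypothetical stand-ins for the paper's 13 algorithms.
def centroid_prob(train_x, train_y, x):
    """P(positive) from the relative distance to the two class centroids."""
    cents = {}
    for label in (0, 1):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        cents[label] = [sum(col) / len(rows) for col in zip(*rows)]
    d0, d1 = dist(x, cents[0]), dist(x, cents[1])
    return 0.5 if d0 + d1 == 0 else d0 / (d0 + d1)

def knn_prob(train_x, train_y, x, k=3):
    """P(positive) as the share of positives among the k nearest neighbours."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: dist(p[0], x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

ENCODINGS = {"AAC": aac, "PhysChem": physchem}
LEARNERS = {"centroid": centroid_prob, "knn": knn_prob}

def probabilistic_features(seqs, labels):
    """One leave-one-out probability column per (encoding, learner) pair."""
    columns = {}
    for (e_name, enc), (l_name, learner) in product(ENCODINGS.items(), LEARNERS.items()):
        X = [enc(s) for s in seqs]
        columns[f"{e_name}+{l_name}"] = [
            learner(X[:i] + X[i + 1:], labels[:i] + labels[i + 1:], X[i])
            for i in range(len(seqs))
        ]
    return columns

def column_accuracy(col, labels):
    return sum((p >= 0.5) == bool(y) for p, y in zip(col, labels)) / len(labels)

def select_features(columns, labels, keep=2):
    """Keep the baseline models whose probabilities classify best on their own."""
    ranked = sorted(columns, key=lambda n: column_accuracy(columns[n], labels), reverse=True)
    return ranked[:keep]

def meta_prob(w, x):
    z = w[-1] + sum(wi * xi for wi, xi in zip(w, x))  # w[-1] is the bias
    return 1 / (1 + math.exp(-z))

def fit_meta(rows, labels, epochs=200, lr=0.1):
    """Logistic-regression meta-learner trained by stochastic gradient ascent."""
    w = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            err = y - meta_prob(w, x)
            for j, xj in enumerate(x):
                w[j] += lr * err * xj
            w[-1] += lr * err
    return w

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Invented toy peptides and labels, for illustration only.
SEQS = ["AVILMFWYA", "LLFVAIWML", "VIAALFMWY", "FLWMIAVLA",
        "DEKRNQSTG", "KRDESTNQG", "GSTNQDEKR", "QNDEKRSTG"]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]

columns = probabilistic_features(SEQS, LABELS)       # 2 encodings x 2 learners
selected = select_features(columns, LABELS, keep=2)  # feature selection step
rows = [[columns[name][i] for name in selected] for i in range(len(SEQS))]
weights = fit_meta(rows, LABELS)                     # stacked (meta) model
preds = [int(meta_prob(weights, r) >= 0.5) for r in rows]
print("baseline models:", len(columns), "selected:", selected)
print("training-set MCC:", mcc(LABELS, preds))
```

The sketch scores the stacked model on the same toy data it was fitted on, so the printed value only shows how the pieces connect. An evaluation comparable to the one reported in the abstract would require a held-out independent test set.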

Collections

Scopus 2023

