Publication: A multi-layer graph analytics to identify bioinformatics tool usage practices from tool directories and pubmed indexed cross-citations
Issued Date
2017-02-21
Other identifier(s)
2-s2.0-85016218318
Rights
Mahidol University
Rights Holder(s)
SCOPUS
Bibliographic Citation
20th International Computer Science and Engineering Conference: Smart Ubiquitous Computing and Knowledge, ICSEC 2016. (2017)
Suggested Citation
Angkana Huang, Apirak Hoonlor. A multi-layer graph analytics to identify bioinformatics tool usage practices from tool directories and pubmed indexed cross-citations. 20th International Computer Science and Engineering Conference: Smart Ubiquitous Computing and Knowledge, ICSEC 2016. (2017). doi:10.1109/ICSEC.2016.7859911. Retrieved from: https://repository.li.mahidol.ac.th/handle/20.500.14594/42402
Abstract
© 2016 IEEE. The essence of bioinformatics is to research, develop, or apply computational tools to biology-related data. In this paper, we confirm that the number of bioinformatics tools grew exponentially from 1988 to 2016. To aid tool discovery, many directories have been established. Although some directories provide the number of citations to each tool, none describes how the tools were used. Currently, continuously reviewing the literature remains the key method for keeping up with rapidly changing best practices. To reduce this burden, we propose a method that systematically gathers documented usage from the literature and analyzes it so that the active tool combinations can be derived. Our implementation of the method found a total of 4,832 bioinformatics tools, published during 1988-2016, with known PubMed unique identifiers. From January to July 2016, the tools were cited in 13,619 publications. From those publications, 57 function sets (i.e., analysis patterns) were deduced by clustering the usage instances according to the tool functionalities used. A total of 666 tool combinations were observed from those function sets. The top five function sets consisted of 30-98 combinations each; the additional 43 function sets contained 2-9 combinations. The nonhomogeneous tool preferences prompt a search for their influential factors to guide the improvement of tool discovery methods.
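The relationship between usage instances, function sets, and tool combinations described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tool names, function labels, publication identifiers, and the helper `group_by_function_set` are all hypothetical, and exact matching on the set of functions stands in for whatever clustering the paper actually uses.

```python
from collections import defaultdict

# Hypothetical tool -> function annotations, as a tool directory might list them.
TOOL_FUNCTIONS = {
    "toolA": "alignment",
    "toolB": "alignment",
    "toolC": "variant calling",
    "toolD": "visualization",
}

# Hypothetical usage instances: citing publication -> tools it cites.
CITATIONS = {
    "pub1": {"toolA", "toolC"},
    "pub2": {"toolB", "toolC"},
    "pub3": {"toolA", "toolC"},
    "pub4": {"toolD"},
    "pub5": {"toolA", "toolD"},
}


def group_by_function_set(citations, tool_functions, min_tools=2):
    """Map each function set to the distinct tool combinations that realize it.

    Assumption: usage instances are grouped by exact equality of their
    function sets, a simplification of the clustering named in the abstract.
    """
    groups = defaultdict(set)
    for tools in citations.values():
        # Keep only tools that the directory knows about.
        known = frozenset(t for t in tools if t in tool_functions)
        if len(known) < min_tools:
            continue  # a single tool is not a combination
        function_set = frozenset(tool_functions[t] for t in known)
        groups[function_set].add(known)
    return dict(groups)


function_sets = group_by_function_set(CITATIONS, TOOL_FUNCTIONS)

for functions, combinations in function_sets.items():
    print(sorted(functions), "<-", [sorted(c) for c in combinations])
```

In this toy data, two different tool pairs realize the same "alignment + variant calling" pattern, which mirrors the abstract's observation that one function set can be served by many tool combinations.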