DiMA: sequence diversity dynamics analyser for viruses

dc.contributor.author: Tharanga S.
dc.contributor.author: Ünlü E.S.
dc.contributor.author: Hu Y.
dc.contributor.author: Sjaugi M.F.
dc.contributor.author: Çelik M.A.
dc.contributor.author: Hekimoğlu H.
dc.contributor.author: Miotto O.
dc.contributor.author: Öncel M.M.
dc.contributor.author: Khan A.M.
dc.contributor.correspondence: Tharanga S.
dc.contributor.other: Mahidol University
dc.date.accessioned: 2024-12-13T18:07:00Z
dc.date.available: 2024-12-13T18:07:00Z
dc.date.issued: 2024-11-22
dc.description.abstract: Sequence diversity is one of the major challenges in the design of diagnostic, prophylactic, and therapeutic interventions against viruses. DiMA is a novel, big data-ready tool designed to facilitate the dissection of sequence diversity dynamics for viruses, and it offers several features not found in other diversity analysis tools. DiMA provides a quantitative overview of sequence (DNA/RNA/protein) diversity using Shannon's entropy corrected for size bias, applied via a user-defined k-mer sliding window to an input alignment file; each k-mer position is then dissected into various diversity motifs. The motifs are defined based on the probability of distinct sequences at a given k-mer alignment position: the index is the predominant sequence, while all the others are (total) variants of the index. The total variants are sub-classified into the major (most common) variant, minor variants (occurring more than once and at an incidence lower than the major), and unique (singleton) variants. DiMA allows user-defined sequence metadata enrichment for analyses of the motifs. The application of DiMA was demonstrated on alignment data of the relatively conserved Spike protein (2,106,985 sequences) of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and the relatively highly diverse pol gene (2637 sequences) of human immunodeficiency virus-1 (HIV-1). The tool is publicly available as a web server (https://dima.bezmialem.edu.tr), as a Python library (via PyPI), and as a command-line client (via GitHub).
dc.identifier.citation: Briefings in bioinformatics Vol.26 No.1 (2024)
dc.identifier.doi: 10.1093/bib/bbae607
dc.identifier.eissn: 14774054
dc.identifier.pmid: 39592151
dc.identifier.scopus: 2-s2.0-85211001154
dc.identifier.uri: https://repository.li.mahidol.ac.th/handle/20.500.14594/102341
dc.rights.holder: SCOPUS
dc.subject: Biochemistry, Genetics and Molecular Biology
dc.subject: Computer Science
dc.title: DiMA: sequence diversity dynamics analyser for viruses
dc.type: Article
mu.datasource.scopus: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85211001154&origin=inward
oaire.citation.issue: 1
oaire.citation.title: Briefings in bioinformatics
oaire.citation.volume: 26
oairecerif.author.affiliation: University of Doha for Science and Technology
oairecerif.author.affiliation: Mahidol Oxford Tropical Medicine Research Unit
oairecerif.author.affiliation: Perdana University Centre for Bioinformatics
oairecerif.author.affiliation: Bezmiâlem Vakıf Üniversitesi
oairecerif.author.affiliation: NUS Yong Loo Lin School of Medicine
oairecerif.author.affiliation: İstanbul Tıp Fakültesi
oairecerif.author.affiliation: Nuffield Department of Medicine
oairecerif.author.affiliation: Mill Ln
oairecerif.author.affiliation: Yeni Elektrik Santral St. No:29/2
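
The k-mer sliding-window analysis and motif classification described in the abstract can be illustrated with a minimal Python sketch. This is not DiMA's API or implementation: the function names (`shannon_entropy`, `classify_motifs`, `sliding_window_diversity`) and the toy alignment are hypothetical, the entropy is the plain Shannon estimate without the size-bias correction the abstract mentions, gap characters are not treated specially, and ties in count are broken alphabetically, which is an assumption made here for determinism.

```python
import math
from collections import Counter


def shannon_entropy(counts):
    """Plain Shannon entropy in bits (no size-bias correction, unlike DiMA)."""
    counts = list(counts)
    total = sum(counts)
    return sum((c / total) * math.log2(total / c) for c in counts)


def classify_motifs(kmers):
    """Label each distinct k-mer at one position as index/major/minor/unique."""
    # Rank by descending count; alphabetical tie-break is an assumption.
    ranked = sorted(Counter(kmers).items(), key=lambda kv: (-kv[1], kv[0]))
    labels = {ranked[0][0]: "index"}  # predominant sequence
    for rank, (kmer, count) in enumerate(ranked[1:]):
        if rank == 0:
            labels[kmer] = "major"    # most common variant
        elif count == 1:
            labels[kmer] = "unique"   # singleton variant
        else:
            labels[kmer] = "minor"    # seen more than once, below the major
    return labels


def sliding_window_diversity(alignment, k):
    """Yield (start, entropy, labels) for every k-mer window of an alignment."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    for start in range(length - k + 1):
        kmers = [seq[start:start + k] for seq in alignment]
        counts = Counter(kmers)
        yield start, shannon_entropy(counts.values()), classify_motifs(kmers)


# Toy protein alignment (made up for illustration only).
toy_alignment = ["MKTAY", "MKTAY", "MKTAY", "MRTAY", "MRTAY", "MKSAY"]
results = list(sliding_window_diversity(toy_alignment, k=3))
for start, entropy, labels in results:
    print(start, round(entropy, 3), labels)
```

Each yielded tuple holds the window start, its entropy, and a mapping from every distinct k-mer at that position to its motif label. A real analysis would also need the size-bias correction and metadata handling described in the abstract.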