Publication:
i4mC-Mouse: Improved identification of DNA N4-methylcytosine sites in the mouse genome using multiple encoding schemes

dc.contributor.author: Md Mehedi Hasan [en_US]
dc.contributor.author: Balachandran Manavalan [en_US]
dc.contributor.author: Watshara Shoombuatong [en_US]
dc.contributor.author: Mst Shamima Khatun [en_US]
dc.contributor.author: Hiroyuki Kurata [en_US]
dc.contributor.other: Kyushu Institute of Technology [en_US]
dc.contributor.other: Ajou University, School of Medicine [en_US]
dc.contributor.other: Japan Society for the Promotion of Science [en_US]
dc.contributor.other: Mahidol University [en_US]
dc.date.accessioned: 2020-05-05T05:08:02Z
dc.date.available: 2020-05-05T05:08:02Z
dc.date.issued: 2020-01-01 [en_US]
dc.description.abstract: © 2020 The Authors. N4-methylcytosine (4mC) is one of the most important DNA modifications and is involved in regulating cell differentiation and gene expression. Accurate identification of 4mC sites is necessary to understand various biological functions. In this work, we developed a new computational predictor called i4mC-Mouse to identify 4mC sites in the mouse genome. Six encoding schemes that cover different characteristics of DNA sequence information were explored: k-space nucleotide composition (KSNC), k-mer nucleotide composition (Kmer), mono nucleotide binary encoding (MBE), dinucleotide binary encoding, electron–ion interaction pseudo potentials (EIIP), and dinucleotide physicochemical composition. We then built six random forest (RF)-based encoding models and linearly combined their probability scores to construct the final predictor. Of the six RF-based models, those built on the Kmer, KSNC, MBE, and EIIP encodings proved sufficient, contributing 10%, 45%, 25%, and 20% of the prediction performance, respectively. On the independent test, i4mC-Mouse predicted the 4mC sites with an accuracy of 0.816 and an MCC of 0.633, approximately 2.5% and 5% higher, respectively, than those of the existing method (4mCpred-EL). For experimental biologists, a freely available web application was implemented at http://kurata14.bio.kyutech.ac.jp/i4mC-Mouse/. [en_US]
dc.identifier.citation: Computational and Structural Biotechnology Journal. Vol.18, (2020), 906-912 [en_US]
dc.identifier.doi: 10.1016/j.csbj.2020.04.001 [en_US]
dc.identifier.issn: 20010370 [en_US]
dc.identifier.other: 2-s2.0-85083319627 [en_US]
dc.identifier.uri: https://repository.li.mahidol.ac.th/handle/123456789/54491
dc.rights: Mahidol University [en_US]
dc.rights.holder: SCOPUS [en_US]
dc.source.uri: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85083319627&origin=inward [en_US]
dc.subject: Biochemistry, Genetics and Molecular Biology [en_US]
dc.subject: Computer Science [en_US]
dc.title: i4mC-Mouse: Improved identification of DNA N4-methylcytosine sites in the mouse genome using multiple encoding schemes [en_US]
dc.type: Article [en_US]
dspace.entity.type: Publication
mu.datasource.scopus: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85083319627&origin=inward [en_US]
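
The abstract describes encoding each DNA sequence in several ways, scoring each encoding with its own model, and linearly combining the resulting probability scores. The following is a minimal Python sketch of that idea, not the authors' implementation. It assumes that the reported contributions (10%, 45%, 25%, 20%) act as the combination weights, that the Kmer encoding is a normalized k-mer frequency vector, and that 0.5 is the decision threshold; the function names and the example scores are hypothetical.

```python
from itertools import product

# Assumption: the contributions reported in the abstract are used as linear weights.
WEIGHTS = {"Kmer": 0.10, "KSNC": 0.45, "MBE": 0.25, "EIIP": 0.20}


def kmer_composition(seq, k=2):
    """Return normalized k-mer frequencies over the alphabet ACGT.

    One plausible form of a Kmer encoding; the exact definition used by
    i4mC-Mouse is not given in the abstract.
    """
    total = len(seq) - k + 1
    if total <= 0:
        raise ValueError("sequence is shorter than k")
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(total):
        sub = seq[i:i + k]
        if sub in counts:  # skip windows containing non-ACGT symbols
            counts[sub] += 1
    return [counts[m] / total for m in kmers]


def combine_scores(scores, weights=WEIGHTS):
    """Linearly combine per-encoding probability scores into one score."""
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical per-encoding probabilities, e.g. from four separate classifiers.
example_scores = {"Kmer": 0.62, "KSNC": 0.80, "MBE": 0.55, "EIIP": 0.70}
final_score = combine_scores(example_scores)
is_4mc_site = final_score >= 0.5  # threshold is an assumption

features = kmer_composition("ACGT", k=2)
```

In practice, each entry of `example_scores` would come from a trained classifier's predicted probability for the positive class; the values here are made up purely to show the arithmetic of the weighted sum.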
