Comparisons of genome assembly tools for characterization of Mycobacterium tuberculosis genomes using hybrid sequencing technologies
dc.contributor.author | Trisakul K. | |
dc.contributor.author | Hinwan Y. | |
dc.contributor.author | Eisiri J. | |
dc.contributor.author | Salao K. | |
dc.contributor.author | Chaiprasert A. | |
dc.contributor.author | Kamolwat P. | |
dc.contributor.author | Tongsima S. | |
dc.contributor.author | Campino S. | |
dc.contributor.author | Phelan J. | |
dc.contributor.author | Clark T.G. | |
dc.contributor.author | Faksri K. | |
dc.contributor.correspondence | Trisakul K. | |
dc.contributor.other | Mahidol University | |
dc.date.accessioned | 2024-09-06T18:13:12Z | |
dc.date.available | 2024-09-06T18:13:12Z | |
dc.date.issued | 2024-01-01 | |
dc.description.abstract | Background: Next-generation sequencing of Mycobacterium tuberculosis, the infectious agent causing tuberculosis, is improving the understanding of genomic diversity of circulating lineages and strain-types, and informing knowledge of drug resistance mutations. An increasingly popular approach to characterizing M. tuberculosis genomes (size: 4.4 Mbp) and variants (e.g., single nucleotide polymorphisms (SNPs)) involves the de novo assembly of sequence data. Methods: We compared the performance of genome assembly tools (Unicycler, RagOut, and RagTag) on sequence data from nine drug-resistant M. tuberculosis isolates (multidrug-resistant (MDR) n = 1; pre-extensively drug-resistant (pre-XDR) n = 8) generated using Illumina HiSeq, Oxford Nanopore Technologies (ONT) PromethION, and PacBio platforms. Results: Unicycler-based assemblies had significantly higher genome completeness (~98.7%; p = 0.01) than those from the other assembly tools (RagOut = 98.6%, and RagTag = 98.6%). Across isolates and sequencing platforms, RagOut-based assemblies were significantly larger (4,418,574 ± 8,824 bp; p < 0.001) than Unicycler and RagTag assemblies (Unicycler = 4,377,642 ± 55,257 bp, and RagTag = 4,380,711 ± 51,164 bp). RagOut-based assemblies had the fewest contigs (~32) and the largest assembly size (4,418,574 bp; vs. H37Rv reference size 4,411,532 bp) and were therefore chosen for downstream analysis. Pan-genome analysis of Illumina and PacBio hybrid assemblies revealed the greatest number of detected genes (4,639 genes; H37Rv reference contains 3,976 genes), while Illumina and ONT hybrid assemblies produced the highest number of SNPs. The number of genes from hybrid assemblies with ONT and PacBio long-reads (mean: 4,620 genes) was greater than that from short-read assembly alone (4,478 genes). All nine RagOut hybrid genome assemblies detected known mutations in genes associated with MDR-TB and pre-XDR-TB.
Conclusions: Unicycler performed best at producing contiguous genomes, whereas RagOut improved the quality of Unicycler’s genome assemblies by providing a longer genome size. Overall, our approach has demonstrated that short-read and long-read hybrid assembly can provide a more complete genome assembly than short-read assembly alone by detecting pan-genomes and more genes, including IS6110, and SNPs. | |
dc.identifier.citation | PeerJ Vol.12 No.8 (2024) | |
dc.identifier.doi | 10.7717/peerj.17964 | |
dc.identifier.eissn | 21678359 | |
dc.identifier.scopus | 2-s2.0-85202745674 | |
dc.identifier.uri | https://repository.li.mahidol.ac.th/handle/20.500.14594/101096 | |
dc.rights.holder | SCOPUS | |
dc.subject | Neuroscience | |
dc.subject | Biochemistry, Genetics and Molecular Biology | |
dc.subject | Agricultural and Biological Sciences | |
dc.title | Comparisons of genome assembly tools for characterization of Mycobacterium tuberculosis genomes using hybrid sequencing technologies | |
dc.type | Article | |
mu.datasource.scopus | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85202745674&origin=inward | |
oaire.citation.issue | 8 | |
oaire.citation.title | PeerJ | |
oaire.citation.volume | 12 | |
oairecerif.author.affiliation | Siriraj Hospital | |
oairecerif.author.affiliation | London School of Hygiene & Tropical Medicine | |
oairecerif.author.affiliation | Faculty of Medicine, Khon Kaen University | |
oairecerif.author.affiliation | Khon Kaen University | |
oairecerif.author.affiliation | Thailand Ministry of Public Health | |
oairecerif.author.affiliation | Thailand National Center for Genetic Engineering and Biotechnology |