Dataset of bulged G-quadruplex forming sequences in the human genome
dc.contributor.author | Papp C. | |
dc.contributor.author | Jenjaroenpun P. | |
dc.contributor.author | Mukundan V.T. | |
dc.contributor.author | Phan A.T. | |
dc.contributor.author | Kuznetsov V.A. | |
dc.contributor.other | Mahidol University | |
dc.date.accessioned | 2023-09-24T18:02:52Z | |
dc.date.available | 2023-09-24T18:02:52Z | |
dc.date.issued | 2023-10-01 | |
dc.description.abstract | When several runs of consecutive guanines lie close together in a nucleic acid sequence, a secondary structure called a G-quadruplex (G4) can form. Such structures in the genome could serve as structural and functional regulators in gene expression, DNA-protein binding, epigenetic modification, and genotoxic stress. Several types of G4-forming DNA sequences exist, including bulged G4-forming sequences (G4-BS). Such bulges occur when non-guanine bases are present at specific locations within the G-runs of a G4-forming sequence. At present, search algorithms do not identify stable G4-BS conformations, making genome-wide studies of G4-like structures difficult. The data provided in this study relate to the article "Stable bulged G-quadruplexes in the human genome: Identification, experimental validation and functionalization", published in Nucleic Acids Research [doi.org/10.1093/nar/gkad252]. Based on our in vitro studies and our analysis of G4-seq and G4 CUT&Tag data, we have specified and validated three pG4-BS models. In this article, we present a large 'raw' (unfiltered) dataset that includes three subfamilies of pG4-BS. For each pG4-BS, we provide strand-specific genomic boundaries. Data on pG4-BS might be useful in elucidating their structural, functional, and evolutionary roles. Furthermore, they may provide insight into the pathobiology of G4-like structures and their potential therapeutic applications. | |
dc.identifier.citation | Data in Brief Vol.50 (2023) | |
dc.identifier.doi | 10.1016/j.dib.2023.109550 | |
dc.identifier.eissn | 23523409 | |
dc.identifier.scopus | 2-s2.0-85171376383 | |
dc.identifier.uri | https://repository.li.mahidol.ac.th/handle/20.500.14594/90196 | |
dc.rights.holder | SCOPUS | |
dc.subject | Multidisciplinary | |
dc.title | Dataset of bulged G-quadruplex forming sequences in the human genome | |
dc.type | Data Paper | |
mu.datasource.scopus | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85171376383&origin=inward | |
oaire.citation.title | Data in Brief | |
oaire.citation.volume | 50 | |
oairecerif.author.affiliation | Siriraj Hospital | |
oairecerif.author.affiliation | NTU Institute of Structural Biology | |
oairecerif.author.affiliation | School of Physical and Mathematical Sciences | |
oairecerif.author.affiliation | A-Star, Bioinformatics Institute | |
oairecerif.author.affiliation | SUNY Upstate Medical University |