Dataset of bulged G-quadruplex forming sequences in the human genome
Issued Date
2023-10-01
eISSN
23523409
Scopus ID
2-s2.0-85171376383
Journal Title
Data in Brief
Volume
50
Rights Holder(s)
SCOPUS
Bibliographic Citation
Data in Brief Vol.50 (2023)
Suggested Citation
Papp C., Jenjaroenpun P., Mukundan V.T., Phan A.T., Kuznetsov V.A. Dataset of bulged G-quadruplex forming sequences in the human genome. Data in Brief Vol.50 (2023). doi:10.1016/j.dib.2023.109550 Retrieved from: https://repository.li.mahidol.ac.th/handle/20.500.14594/90196
Title
Dataset of bulged G-quadruplex forming sequences in the human genome
Abstract
When several runs of consecutive guanines lie close together in a nucleic acid sequence, a secondary structure called a G-quadruplex (G4) can form. Such structures in the genome could serve as structural and functional regulators in gene expression, DNA-protein binding, epigenetic modification, and genotoxic stress. Several types of G4-forming DNA sequences exist, including bulged G4-forming sequences (G4-BS). Such bulges arise when non-guanine bases interrupt the G-runs of a G4-forming sequence. Current search algorithms do not identify stable G4-BS conformations, which makes genome-wide studies of G4-like structures difficult. The data provided here relate to the article "Stable bulged G-quadruplexes in the human genome: Identification, experimental validation and functionalization", published in Nucleic Acids Research [doi.org/10.1093/nar/gkad252]. Based on our in vitro studies and our analysis of G4-seq and G4 CUT&Tag data, we have specified and validated three pG4-BS models. In this article, we present a large 'raw' (unfiltered) dataset that includes three subfamilies of pG4-BS. For each pG4-BS, we provide strand-specific genomic boundaries. Data on pG4-BS might be useful in elucidating their structural, functional, and evolutionary roles. Furthermore, they may provide insight into the pathobiology of G4-like structures and their potential therapeutic applications.
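To make the idea of a bulged G-run concrete, the following is a minimal Python sketch of a strand-aware scan for G4-like motifs in which a G-run may be interrupted by one non-guanine base. It is not the authors' algorithm, and it does not reproduce the three pG4-BS models, which this abstract does not specify. The run definitions (GGG, GNGG, GGNG), the loop length of 1 to 7 nucleotides, and all names (`RUN`, `LOOP`, `find_candidates`) are assumptions chosen for illustration only.

```python
import re

# Hypothetical motif definition (an assumption, not the published models):
# a G-run is GGG, or GGG interrupted by a single non-G base (GNGG / GGNG).
RUN = "GGG|G[ACT]GG|GG[ACT]G"
# Assumed loop length of 1-7 nt; lazy so the shortest loops are tried first.
LOOP = "[ACGT]{1,7}?"

# Four G-runs separated by three loops; each run is a named group.
MOTIF = re.compile(
    f"(?P<r1>{RUN}){LOOP}(?P<r2>{RUN}){LOOP}(?P<r3>{RUN}){LOOP}(?P<r4>{RUN})"
)

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_candidates(seq):
    """Return (start, end, strand, n_bulged_runs) tuples.

    Coordinates are 0-based, half-open, and always refer to the plus strand.
    Matches are non-overlapping on each strand (re.finditer semantics).
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for strand, text in (("+", seq), ("-", reverse_complement(seq))):
        for m in MOTIF.finditer(text):
            # A perfect run has length 3; a bulged run has length 4.
            bulged = sum(len(m.group(k)) == 4 for k in ("r1", "r2", "r3", "r4"))
            if strand == "+":
                start, end = m.start(), m.end()
            else:
                # Map reverse-complement coordinates back to the plus strand.
                start, end = n - m.end(), n - m.start()
            hits.append((start, end, strand, bulged))
    return hits


if __name__ == "__main__":
    # Second G-run (GAGG) carries a one-base bulge.
    print(find_candidates("GGGTGAGGTGGGTGGG"))
```

Reporting intervals in plus-strand coordinates together with a strand label mirrors the strand-specific genomic boundaries described above, although the file layout of the actual dataset is not stated here.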