Can Synthetic Data Allow for Smaller Sample Sizes in Chronic Urticaria Research?
Issued Date
2025-08-01
Resource Type
eISSN
20457022
Scopus ID
2-s2.0-105012764165
Journal Title
Clinical and Translational Allergy
Volume
15
Issue
8
Rights Holder(s)
SCOPUS
Bibliographic Citation
Clinical and Translational Allergy Vol.15 No.8 (2025)
Suggested Citation
Gutsche A., Salameh P., Jahandideh S.S., Roodsaz M., Kutan S., Salehzadeh-Yazdi A., Kocatürk E., Gregoriou S., Thomsen S.F., Kulthanan K., Tuchinda P., Dissemond J., Kasperska-Zajac A., Zajac M., Zamłyński M., van Doorn M., Parisi C.A.S., Peter J.G., Day C., McDougall C., Makris M., Fomina D., Kovalkova E., Streliaev N., Andrenova G., Lebedkina M., Khoskhkui M., Aliabadi M.M., Bauer A., Kiefer L., Muñoz M., Weller K., Kolkhir P., Metz M. Can Synthetic Data Allow for Smaller Sample Sizes in Chronic Urticaria Research?. Clinical and Translational Allergy Vol.15 No.8 (2025). doi:10.1002/clt2.70087 Retrieved from: https://repository.li.mahidol.ac.th/handle/123456789/111635
Title
Can Synthetic Data Allow for Smaller Sample Sizes in Chronic Urticaria Research?
Author(s)
Gutsche A.
Salameh P.
Jahandideh S.S.
Roodsaz M.
Kutan S.
Salehzadeh-Yazdi A.
Kocatürk E.
Gregoriou S.
Thomsen S.F.
Kulthanan K.
Tuchinda P.
Dissemond J.
Kasperska-Zajac A.
Zajac M.
Zamłyński M.
van Doorn M.
Parisi C.A.S.
Peter J.G.
Day C.
McDougall C.
Makris M.
Fomina D.
Kovalkova E.
Streliaev N.
Andrenova G.
Lebedkina M.
Khoskhkui M.
Aliabadi M.M.
Bauer A.
Kiefer L.
Muñoz M.
Weller K.
Kolkhir P.
Metz M.
Author's Affiliation
Charité – Universitätsmedizin Berlin
Erasmus MC
National and Kapodistrian University of Athens
Universität Duisburg-Essen
Sechenov First Moscow State Medical University
Mashhad University of Medical Sciences
Slaski Uniwersytet Medyczny w Katowicach
Universitätsklinikum Carl Gustav Carus Dresden
Siriraj Hospital
Bispebjerg Hospital
Lebanese American University
Université Libanaise
Constructor University
Bahçeşehir Üniversitesi
Hospital Italiano de Buenos Aires
Moscow Healthcare Department
University of Nicosia Medical School
Fraunhofer Institute for Translational Medicine and Pharmacology ITMP
Astana Medical University
University of Cape Town Lung Institute
Institut National de Santé Publique, d’Épidémiologie Clinique et de Toxicologie-Liban
Tediax B.V. Sterrenbos 5
Corresponding Author(s)
Other Contributor(s)
Abstract
Background: Robust data are essential for clinical and epidemiological research, yet in chronic spontaneous urticaria (CSU), certain patient groups, such as the elderly or patients with comorbidities, are often underrepresented. In clinical trials, strict inclusion and exclusion criteria frequently limit recruitment, making it difficult to achieve sufficient statistical power. Similarly, real-world observational studies may lack sufficient sample sizes for robust analysis. To address these limitations, we generated synthetic patient data that reflect these groups' clinical characteristics and variability. This approach enables more comprehensive analyses, facilitates hypothesis testing in otherwise inaccessible populations, and supports the generation of evidence where traditional data sources are insufficient.
Methods: A tree-based decision model was applied to generate synthetic data based on an existing set of real-world data (RWD) from the Chronic Urticaria Registry (CURE). Descriptive characteristics and the strength of association between relevant RWD variables and their synthetic counterparts were analyzed as indicators of replication accuracy, providing insight into how closely the synthetic data align with the RWD. Finally, we determined the minimum sample size required to generate high-quality synthetic data.
Results: The algorithm produced extensive synthetic data records that closely mirrored patient demographics and clinical disease characteristics. Smaller subgroups of the data were replicated equally well and followed the same distribution as the RWD. Known associations and correlations between disease-specific factors (disease control) and risk factors (age) yielded similar results in the synthetic data and the RWD, with no significant difference (p > 0.05). The smallest share of the RWD from which synthetic data could be generated while maintaining high accuracy relative to the RWD was 25%, enabling a fourfold increase in the synthetic population.
Conclusion: Synthetic data replicated RWD from patients with CSU with reasonable accuracy, even when generated from as little as 25% of the original population. This method has the potential to enlarge small patient subgroups in clinical and epidemiological research.
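The abstract does not specify the tree-based model or any software, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes a single-split regression tree (a stump) and a donor-sampling step: ages are resampled from the input records, and each synthetic disease control score is drawn from observed scores in the same leaf. The toy cohort, the 0 to 16 score range, and the names `best_split` and `synthesize` are all hypothetical; no CURE data are used. The last lines mirror the reported threshold: a 25% subsample is expanded fourfold, back to the original cohort size.

```python
import random
import statistics


def best_split(xs, ys):
    """Threshold on x that minimises the summed squared error of y in the two leaves."""
    best_t, best_sse = max(xs), float("inf")  # fallback: a single leaf holding everything
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        sse = 0.0
        for leaf in (left, right):
            mean = statistics.fmean(leaf)
            sse += sum((y - mean) ** 2 for y in leaf)
        if sse < best_sse:
            best_t, best_sse = t, sse
    return best_t


def synthesize(records, n_out, rng):
    """Generate n_out synthetic (age, score) pairs from real (age, score) records.

    Age is resampled with replacement; the score is drawn from the observed
    scores ("donors") that fall in the same leaf of the one-split tree.
    """
    ages = [a for a, _ in records]
    scores = [s for _, s in records]
    t = best_split(ages, scores)
    donors = {
        True: [s for a, s in records if a <= t],
        False: [s for a, s in records if a > t],
    }
    synthetic = []
    for _ in range(n_out):
        age = rng.choice(ages)
        synthetic.append((age, rng.choice(donors[age <= t])))
    return synthetic


# Hypothetical toy cohort (NOT registry data): score loosely increases with age.
rng = random.Random(0)
real = []
for _ in range(400):
    age = rng.randint(18, 85)
    score = min(16, max(0, round(8 + 0.05 * (age - 50) + rng.gauss(0, 3))))
    real.append((age, score))

# Keep 25% of the cohort, then generate four synthetic records per retained one.
subsample = rng.sample(real, k=len(real) // 4)
synthetic = synthesize(subsample, n_out=4 * len(subsample), rng=rng)

print("records:", len(real), len(subsample), len(synthetic))
print("mean score, real vs synthetic:",
      statistics.fmean(s for _, s in real),
      statistics.fmean(s for _, s in synthetic))
```

A single split is far cruder than a full decision tree, and a real evaluation would compare distributions and associations formally rather than by eye; the sketch only shows how donor sampling within leaves keeps synthetic values inside the observed range while preserving a dependence on age.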
