Synthetic Data Generation for Tabular Data: A Survey, Jing Sun, Shuyue Wang, Ze Shi, Guang-Lei Chen, 2023 ACM Computing Surveys, Vol. 56 (Association for Computing Machinery) DOI: 10.1145/3630248 - This survey reviews methods for generating synthetic tabular data, discussing the importance of preserving data distributions and inter-column relationships for data utility.
Modeling Tabular Data using Conditional GAN, Lei Xu, Maria Skoularidou, Alfredo Cuesta-Infante, Kalyan Veeramachaneni, 2019 Advances in Neural Information Processing Systems, Vol. 32 (Curran Associates, Inc.) DOI: 10.48550/arXiv.1907.00503 - This paper introduces CTGAN, highlighting the difficulties of generating high-fidelity synthetic tabular data that maintains complex, multi-modal, and inter-column relationships.
Synthetic Data: Generating and Evaluating, Bas Van Breugel, Marc J. D. van der Weide, 2021 IEEE Security & Privacy, Vol. 19 (IEEE) DOI: 10.1109/MSEC.2021.3059170 - This article gives an overview of synthetic data generation and assessment, stressing the significance of statistical fidelity, particularly inter-variable relationships, for the synthetic data's value.