Tabular data synthesis has received wide attention in the literature, because available data is often limited, incomplete, or difficult to obtain, and data privacy is becoming increasingly important. In this work, we present a generalized GAN framework for tabular synthesis that combines the adversarial training of GANs with the negative log-density regularization of invertible neural networks. The proposed framework serves two distinct objectives. First, decreasing the negative log-density of real records during adversarial training further improves synthesis quality. Second, increasing the negative log-density of real records yields realistic fake records that are not too close to real records, which reduces the chance of information leakage. We conduct experiments with real-world datasets for classification, regression, and privacy attacks. In general, the proposed method achieves the best synthesis quality (in terms of task-oriented evaluation metrics, e.g., F1) when the negative log-density is decreased during adversarial training. When the negative log-density is increased, our experimental results show that the distance between real and fake records increases, enhancing robustness against privacy attacks.
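To make the two objectives concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy one-dimensional invertible generator `x = scale * z + shift` with `z ~ N(0, 1)`, so the log-density of a record follows from the change-of-variables formula. The function names, the weight `gamma`, and the placeholder `adv_loss` value are all hypothetical and introduced only for illustration; the paper's actual loss may be formulated differently.

```python
import math


def affine_nll(x, scale, shift):
    """Negative log-density of x under x = scale * z + shift, z ~ N(0, 1).

    Toy stand-in for an invertible network (assumption, not the paper's model).
    """
    z = (x - shift) / scale                              # invert the map
    log_base = -0.5 * (z * z + math.log(2.0 * math.pi))  # log N(z; 0, 1)
    log_det = -math.log(abs(scale))                      # log |dz/dx|
    return -(log_base + log_det)


def mean_nll(real_records, scale, shift):
    """Average negative log-density of the real records."""
    total = sum(affine_nll(x, scale, shift) for x in real_records)
    return total / len(real_records)


def generator_objective(adv_loss, real_records, scale, shift, gamma):
    """Adversarial loss plus a signed negative log-density regularizer.

    gamma > 0: minimizing the objective decreases the NLL of real records
               (the synthesis-quality setting).
    gamma < 0: minimizing the objective increases the NLL of real records
               (the privacy setting, pushing fake records away from real ones).
    """
    return adv_loss + gamma * mean_nll(real_records, scale, shift)


if __name__ == "__main__":
    records = [0.1, -0.3, 0.5]   # made-up "real" records
    adv = 0.7                    # placeholder adversarial loss value
    print(generator_objective(adv, records, 1.0, 0.0, gamma=0.1))
    print(generator_objective(adv, records, 1.0, 0.0, gamma=-0.1))
```

In this sketch a single scalar `gamma` switches between the two regimes described in the abstract. Only its sign changes; the adversarial term is left untouched.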
|Title of host publication||Advances in Neural Information Processing Systems 34 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021|
|Editors||Marc'Aurelio Ranzato, Alina Beygelzimer, Yann Dauphin, Percy S. Liang, Jenn Wortman Vaughan|
|Publisher||Neural Information Processing Systems Foundation|
|Number of pages||11|
|Publication status||Published - 2021|
|Event||35th Conference on Neural Information Processing Systems, NeurIPS 2021 - Virtual, Online (Duration: 2021 Dec 6 → 2021 Dec 14)|
|Name||Advances in Neural Information Processing Systems|
|Conference||35th Conference on Neural Information Processing Systems, NeurIPS 2021|
|Period||2021 Dec 6 → 2021 Dec 14|
Bibliographical note
Funding Information:
Jaehoon Lee and Jihyeon Hyeong contributed equally. Noseong Park is the corresponding author. This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korean government (MSIT) (No. 2020-0-01361, Artificial Intelligence Graduate School Program (Yonsei University)).
© 2021 Neural Information Processing Systems Foundation. All rights reserved.
All Science Journal Classification (ASJC) codes
- Computer Networks and Communications
- Information Systems
- Signal Processing