Abstract
Synthesizing tabular data is attracting much attention for various purposes. With sophisticated synthetic data, for instance, one can augment one's training data. Over the past couple of years, tabular data synthesis techniques have improved greatly. Recent work has made progress in addressing many problems in synthesizing tabular data, such as the imbalanced distribution and multimodality problems. However, the data utility of state-of-the-art methods is not yet satisfactory. In this work, we significantly improve the utility by designing our generator and discriminator based on neural ordinary differential equations (NODEs). After showing that NODEs have theoretically preferred characteristics for generating tabular data, we introduce our designs. The NODE-based discriminator classifies based on the evolution trajectory of a hidden vector rather than on the hidden vector at the last layer only. Our generator also adopts an ODE layer at the very beginning of its architecture to transform its initial input vector (i.e., the concatenation of a noisy vector and a condition vector in our case) into another latent vector space suitable for the generation process. We conduct experiments with 13 datasets, covering tasks such as insurance fraud detection and online news article prediction, and our method outperforms other state-of-the-art tabular data synthesis methods in many cases of our classification, regression, and clustering experiments.
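The trajectory-based classification described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the dynamics `dh/dt = tanh(W h)`, the explicit Euler solver, the evenly spaced checkpoints, the logistic read-out, and every function name below are assumptions made purely for illustration. The only idea taken from the abstract is that the classifier sees several hidden states along the ODE trajectory instead of the final state alone.

```python
import math
import random


def matvec(W, h):
    """Multiply matrix W (a list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def ode_func(W, h):
    """Hypothetical hidden dynamics dh/dt = tanh(W h) (an assumption)."""
    return [math.tanh(v) for v in matvec(W, h)]


def trajectory(W, h0, t_end=1.0, n_steps=20, n_checkpoints=4):
    """Integrate with explicit Euler and record h(t) at evenly spaced checkpoints.

    Returns n_checkpoints + 1 states: h(0) followed by one state per checkpoint.
    """
    assert n_steps % n_checkpoints == 0
    dt = t_end / n_steps
    every = n_steps // n_checkpoints
    h = list(h0)
    states = [list(h)]
    for step in range(1, n_steps + 1):
        dh = ode_func(W, h)
        h = [x + dt * d for x, d in zip(h, dh)]
        if step % every == 0:
            states.append(list(h))
    return states


def trajectory_score(states, w_out, b_out=0.0):
    """Logistic score computed from all recorded states, not only the last one."""
    flat = [x for state in states for x in state]
    z = sum(w * x for w, x in zip(w_out, flat)) + b_out
    return 1.0 / (1.0 + math.exp(-z))


def make_demo(dim=3, n_checkpoints=4, seed=0):
    """Build random (untrained) parameters for a toy run."""
    rng = random.Random(seed)
    W = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
    h0 = [rng.uniform(-1, 1) for _ in range(dim)]
    w_out = [rng.uniform(-1, 1) for _ in range(dim * (n_checkpoints + 1))]
    return W, h0, w_out


if __name__ == "__main__":
    W, h0, w_out = make_demo()
    states = trajectory(W, h0)
    print(len(states), trajectory_score(states, w_out))
```

Under the same assumptions, the generator-side ODE layer would correspond to calling `trajectory` on the concatenated noise and condition vector and passing only the final state on to the rest of the generator.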
Original language | English |
---|---|
Title of host publication | The Web Conference 2021 - Proceedings of the World Wide Web Conference, WWW 2021 |
Publisher | Association for Computing Machinery, Inc |
Pages | 1506-1515 |
Number of pages | 10 |
ISBN (Electronic) | 9781450383127 |
Publication status | Published - 2021 Apr 19 |
Event | 2021 World Wide Web Conference, WWW 2021 - Ljubljana, Slovenia; Duration: 2021 Apr 19 → 2021 Apr 23 |
Publication series
Name | The Web Conference 2021 - Proceedings of the World Wide Web Conference, WWW 2021 |
---|---|
Conference
Conference | 2021 World Wide Web Conference, WWW 2021 |
---|---|
Country/Territory | Slovenia |
City | Ljubljana |
Period | 2021 Apr 19 → 2021 Apr 23 |
Bibliographical note
Funding Information: Jayoung Kim and Jinsung Jeon contributed equally to this research. Noseong Park is the corresponding author. This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-01361, Artificial Intelligence Graduate School Program (Yonsei University)).
Publisher Copyright:
© 2021 ACM.
All Science Journal Classification (ASJC) codes
- Computer Networks and Communications
- Software