Abstract
Classification datasets are often biased in observations, leaving only a few observations for minority classes. Our key contribution is detecting and reducing Under-represented (U-) and Over-represented (O-) artifacts arising from dataset imbalance, by proposing a Counterfactual Generative Smoothing approach in both feature space and data space, namely CGS_f and CGS_d. Our technical contribution is smoothing majority and minority observations by sampling a majority seed and transferring it to the minority class. Our proposed approaches not only outperform state-of-the-art methods on both synthetic and real-life datasets, but also effectively reduce both artifact types.
Original language | English |
---|---|
Title of host publication | CIKM 2021 - Proceedings of the 30th ACM International Conference on Information and Knowledge Management |
Publisher | Association for Computing Machinery |
Pages | 3058-3062 |
Number of pages | 5 |
ISBN (Electronic) | 9781450384469 |
Publication status | Published - 2021 Oct 26 |
Event | 30th ACM International Conference on Information and Knowledge Management, CIKM 2021 - Virtual, Online, Australia. Duration: 2021 Nov 1 → 2021 Nov 5 |
Publication series
Name | International Conference on Information and Knowledge Management, Proceedings |
---|---|
Conference
Conference | 30th ACM International Conference on Information and Knowledge Management, CIKM 2021 |
---|---|
Country/Territory | Australia |
City | Virtual, Online |
Period | 2021 Nov 1 → 2021 Nov 5 |
Bibliographical note
Funding Information: ∗Corresponding author, supported by Microsoft Research Asia and IITP grants (2021-0-01696, High-Potential Individuals Global Training Program, and 2021-0-01343, SNU AI Graduate School).
Publisher Copyright:
© 2021 ACM.
All Science Journal Classification (ASJC) codes
- Business, Management and Accounting(all)
- Decision Sciences(all)