Abstract
Although AI systems have achieved great success in various societal domains, they still face the challenging issue of producing discriminatory results with respect to protected attributes (e.g., gender and age). A popular approach to this issue is to remove protected attribute information from the decision process. However, this approach has a limitation in that information beneficial to the target task may also be eliminated. To overcome this limitation, we propose the Fairness-aware Disentangling Variational Auto-Encoder (FD-VAE), which disentangles the data representation into three subspaces: 1) Target Attribute Latent (TAL), 2) Protected Attribute Latent (PAL), and 3) Mutual Attribute Latent (MAL). On top of that, we propose a decorrelation loss that aligns the overall information into each subspace instead of removing the protected attribute information. After learning the representation, we re-encode MAL to contain only target information and combine it with TAL to perform downstream tasks. In experiments on the CelebA and UTK Face datasets, we show that the proposed method mitigates unfairness in facial attribute classification with respect to gender and age. Our method outperforms previous methods by large margins on two standard fairness metrics: equal opportunity and equalized odds.
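The sketch below illustrates the general idea described in the abstract: a VAE encoder whose latent space is split into three subspaces (TAL, PAL, MAL), together with an illustrative decorrelation-style penalty between subspaces. It is a minimal sketch, not the authors' implementation; the network sizes, latent dimensions, and the exact form of the decorrelation loss are assumptions made for illustration only.

```python
# Minimal sketch (assumed details, not the paper's code): latent partitioned into
# Target (TAL), Protected (PAL), and Mutual (MAL) subspaces, plus an illustrative
# decorrelation penalty between two subspaces.
import torch
import torch.nn as nn

class FDVAEEncoderSketch(nn.Module):
    def __init__(self, in_dim=512, tal_dim=16, pal_dim=16, mal_dim=16):
        super().__init__()
        self.dims = (tal_dim, pal_dim, mal_dim)      # sizes of TAL, PAL, MAL (assumed)
        latent_dim = sum(self.dims)
        self.backbone = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.mu_head = nn.Linear(256, latent_dim)
        self.logvar_head = nn.Linear(256, latent_dim)

    def forward(self, x):
        h = self.backbone(x)
        mu, logvar = self.mu_head(h), self.logvar_head(h)
        # Standard VAE reparameterization trick
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        z_t, z_p, z_m = torch.split(z, self.dims, dim=1)   # TAL, PAL, MAL
        return z_t, z_p, z_m, mu, logvar

def decorrelation_penalty(z_a, z_b):
    """Illustrative penalty: mean squared cross-covariance between two latent
    subspaces. The paper's actual decorrelation loss may take a different form."""
    z_a = z_a - z_a.mean(dim=0)
    z_b = z_b - z_b.mean(dim=0)
    cross = (z_a.T @ z_b) / z_a.shape[0]
    return (cross ** 2).mean()

# Usage: encode features, then penalize correlation between target and protected subspaces.
x = torch.randn(8, 512)
enc = FDVAEEncoderSketch()
z_t, z_p, z_m, mu, logvar = enc(x)
loss_decor = decorrelation_penalty(z_t, z_p)
```

In this sketch the decorrelation term only discourages linear dependence between the target and protected subspaces; the full training objective in the paper additionally involves the VAE reconstruction/KL terms and the re-encoding of MAL described in the abstract.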
Original language | English
---|---
Title of host publication | 35th AAAI Conference on Artificial Intelligence, AAAI 2021
Publisher | Association for the Advancement of Artificial Intelligence
Pages | 2403-2411
Number of pages | 9
ISBN (Electronic) | 9781713835974
Publication status | Published - 2021
Event | 35th AAAI Conference on Artificial Intelligence, AAAI 2021 - Virtual, Online (Duration: 2021 Feb 2 → 2021 Feb 9)
Publication series
Name | 35th AAAI Conference on Artificial Intelligence, AAAI 2021
---|---
Volume | 3B
Conference
Conference | 35th AAAI Conference on Artificial Intelligence, AAAI 2021
---|---
City | Virtual, Online
Period | 2021 Feb 2 → 2021 Feb 9
Bibliographical note
Funding Information: This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2019R1A2C2003760), by the Institute for Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (Development of framework for analyzing, detecting, mitigating of bias in AI model and training data) under Grant 2019-0-01396, and by the Artificial Intelligence Graduate School Program (YONSEI UNIVERSITY) under Grant 2020-0-01361.
Publisher Copyright:
© 2021, Association for the Advancement of Artificial Intelligence
All Science Journal Classification (ASJC) codes
- Artificial Intelligence