Fair representation for safe artificial intelligence via adversarial learning of unbiased information bottleneck

Jin Young Kim, Sung Bae Cho

Research output: Contribution to journal › Conference article

Abstract

Algorithmic bias refers to discrimination caused by algorithms, which arises from protected features such as gender and race. Even if we exclude a protected feature that induces unfairness from the input data, bias can still appear through proxy discrimination, i.e., the dependency between other attributes and the protected features. Several methods have been devised to reduce the bias, but identifying the cause of the problem has not yet been fully explored. In this paper, non-discriminated representation is formulated as a dual-objective optimization problem: encoding the data while obfuscating information about the protected features in the data representation, by exploiting the unbiased information bottleneck. An encoder learns the data representation, and a discriminator judges whether the representation still contains information about the protected features. The two are trained simultaneously in an adversarial fashion to achieve a fair representation. Moreover, the algorithmic bias is analyzed in terms of the bias-variance dilemma to reveal its cause, showing in theory and experiments that the proposed method is effective for reducing algorithmic bias. Experiments with the well-known benchmark datasets Adult, Census, and COMPAS demonstrate the efficacy of the proposed method compared to conventional techniques. Our method not only reduces the bias but also allows the latent representation to be reused in other classifiers: once a fair representation is learned, it can be used in various classifiers. We illustrate this by applying it to conventional machine learning models and visualizing the data representation with the t-SNE algorithm.
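The abstract describes an encoder that learns a representation while a discriminator tries to detect the protected attribute in it, trained against each other. The PyTorch sketch below illustrates one common way to wire up such an adversarial scheme; it is not the authors' implementation. The paper's unbiased information bottleneck objective is simplified here to a label-flipping adversarial loss, and the layer sizes, the weighting `lam`, and the helper names (`Encoder`, `Head`, `train_step`) are illustrative assumptions.

```python
# Minimal sketch of adversarial fair-representation learning, assuming an
# encoder/discriminator setup as described in the abstract. Not the paper's
# unbiased-information-bottleneck objective; sizes and losses are assumptions.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, in_dim, z_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                 nn.Linear(64, z_dim))
    def forward(self, x):
        return self.net(x)

class Head(nn.Module):
    """Binary logit head, used both as the task classifier (predict y)
    and as the discriminator (predict protected attribute s)."""
    def __init__(self, z_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim, 32), nn.ReLU(),
                                 nn.Linear(32, 1))
    def forward(self, z):
        return self.net(z)

def train_step(enc, clf, disc, opt_enc, opt_disc, x, y, s, lam=1.0):
    # x: features, y: task labels, s: protected attribute; y and s are
    # float tensors of shape (N, 1) for BCEWithLogitsLoss.
    bce = nn.BCEWithLogitsLoss()

    # 1) Discriminator step: learn to recover s from the (detached)
    #    representation, so gradients do not reach the encoder.
    z = enc(x).detach()
    loss_d = bce(disc(z), s)
    opt_disc.zero_grad(); loss_d.backward(); opt_disc.step()

    # 2) Encoder/classifier step: predict y well while fooling the
    #    discriminator; the flipped target pushes z to carry no
    #    information about s (a simple stand-in for the paper's
    #    obfuscation objective).
    z = enc(x)
    loss_task = bce(clf(z), y)
    loss_adv = bce(disc(z), 1.0 - s)
    loss = loss_task + lam * loss_adv
    opt_enc.zero_grad(); loss.backward(); opt_enc.step()
    return loss_task.item(), loss_d.item()
```

In this sketch `opt_enc` would be built over the joint parameters of the encoder and task head, e.g. `torch.optim.Adam(list(enc.parameters()) + list(clf.parameters()))`, so that both the task gradient and the adversarial gradient shape the representation. Once trained, `enc(x)` can be frozen and its output fed to other downstream classifiers, which is the reuse property the abstract highlights.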

Original language: English
Pages (from-to): 105-112
Number of pages: 8
Journal: CEUR Workshop Proceedings
Volume: 2560
Publication status: Published - 2020
Event: 2020 Workshop on Artificial Intelligence Safety, SafeAI 2020 - New York, United States
Duration: 2020 Feb 7 → …

All Science Journal Classification (ASJC) codes

  • Computer Science (all)
