Hi! We're the Unfairness Squad! We spent six months researching the intentional and unintentional ways that discrimination gets learned by machine learning and classification algorithms. These prediction algorithms have spread to industries such as bank lending, restaurant inspection grading, hiring, advertising, and more.
We answer the questions of how to quantify bias and how to improve the fairness of classifications for an 'underprivileged' group by implementing a Disparate Impact Detector, a preprocessing data repair algorithm, three Bayesian machine learning algorithms, and numerous evaluation metrics.
The Naive Bayes classification algorithm wasn't as successful in improving fairness because of its independence assumption, which treats the attributes in a dataset as uncorrelated. However, two modifications to it, Modified Bayes and Two Bayes, improved fairness. Adding the Feldman preprocessing also improved fairness when paired with the Bayesian algorithms.
For a census income dataset that we ran experiments on, the 'group fairness' metrics between the privileged and underprivileged groups move closer together as we progress from Naive Bayes to Modified and Two Bayes.
He/Him/They/Them, Computer Science/English major from Austin, Texas
He/Him, Computer Science major from Saratoga, California
She/Her, Computer Science major from Chesterfield, Missouri
He/Him, Computer Science/English major from Denver, Colorado
She/Her, Computer Science/Film major from Pacific Palisades, California
She/Her, Computer Science major from Seattle, Washington
The Feldman Disparate Impact Detector takes a dataset with a sensitive attribute as input and returns whether or not there is possible disparate impact. Disparate impact is the legal theory used to determine unintended discrimination. The intuition is that if the sensitive attribute were removed from a dataset, it should be hard to predict from the remaining attributes alone; if it is easy to predict, the dataset may contain disparate impact. In the algorithm, a classifier is trained on every column of the dataset except the sensitive attribute column and tries to predict the sensitive attribute. We want a high error rate from this classifier, because a high error rate indicates that the sensitive attribute was hard to predict from the individual's other attributes.
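To make the detector idea concrete, here is a minimal Python sketch, not our actual implementation. It assumes numeric features, uses a simple nearest-centroid classifier as a stand-in for whatever classifier is plugged in, scores it with a balanced error rate so that a small group counts as much as a large one, and flags the dataset when the error falls below a cutoff. The function names, the in-sample evaluation, and the `threshold=0.3` default are all hypothetical choices made for illustration.

```python
from statistics import mean

def nearest_centroid_fit(rows, labels):
    """Return one centroid (per-column mean vector) for each label."""
    centroids = {}
    for label in set(labels):
        members = [r for r, l in zip(rows, labels) if l == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return centroids

def nearest_centroid_predict(centroids, row):
    """Predict the label whose centroid is closest to the row."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def balanced_error_rate(true, pred):
    """Average the per-group error rates."""
    rates = []
    for group in set(true):
        idx = [i for i, t in enumerate(true) if t == group]
        rates.append(sum(pred[i] != group for i in idx) / len(idx))
    return mean(rates)

def possible_disparate_impact(features, sensitive, threshold=0.3):
    """Try to predict the sensitive attribute from the other columns.

    `features` must NOT contain the sensitive column. The threshold is a
    hypothetical cutoff; for brevity the classifier is scored on the same
    rows it was fit on, which a real experiment would avoid.
    """
    centroids = nearest_centroid_fit(features, sensitive)
    predicted = [nearest_centroid_predict(centroids, r) for r in features]
    error = balanced_error_rate(sensitive, predicted)
    return error < threshold, error
```

A low error means the remaining columns leak the sensitive attribute, so the function returns `True` for possible disparate impact; an error near 0.5 for two groups means the classifier did no better than guessing.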
The Feldman et al. preprocessing algorithm suggests “repairing” the dataset with respect to a particular protected attribute X. To do this, the algorithm iterates through the numerical columns in the data and repairs each one. For a given column, the algorithm finds the distribution (Fx) of values (Y) for each value (x) of the protected attribute, and creates a median distribution from them. The repair preserves rank, which means that a person in a certain percentile for Y in their own group's distribution Fx will end up at that same percentile in the median distribution.
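The rank-preserving repair of a single column can be sketched as follows. This is a simplified illustration under stated assumptions rather than our implementation: it performs a full repair only, uses a nearest-rank quantile, and resolves ties by taking the first matching rank. The names `quantile` and `repair_column` are hypothetical.

```python
from statistics import median

def quantile(sorted_vals, q):
    """Value at fraction q (0..1) of a sorted list, nearest-rank style."""
    idx = round(q * (len(sorted_vals) - 1))
    return sorted_vals[idx]

def repair_column(values, groups):
    """Map each value to the median distribution at its within-group percentile."""
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    sorted_by_group = {g: sorted(vs) for g, vs in by_group.items()}

    repaired = []
    for v, g in zip(values, groups):
        own = sorted_by_group[g]
        # Percentile of v inside its own group's distribution F_x.
        rank = own.index(v)
        q = rank / (len(own) - 1) if len(own) > 1 else 0.5
        # Median distribution: the median, across groups, of the value at q.
        repaired.append(median(quantile(s, q) for s in sorted_by_group.values()))
    return repaired
```

For example, with `values = [1, 2, 3, 10, 20, 30]` and groups `"aaabbb"`, the lowest value in each group maps to the same repaired value, as do the middle and highest values, so the column no longer reveals the group while each person keeps their rank within it.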
The Naïve Bayes Classifier predicts outcomes (i.e. classifications), such as whether or not an individual gets a loan, for a group of individuals or organizations. It bases these predictions on a series of conditional probabilities computed for each attribute in the dataset given a particular outcome. Naïve Bayes makes what is referred to as the independence assumption, which ignores any correlation that may exist between attributes. A model data structure is learned from a training dataset of instances whose classifications we already know, and the model is then used to iterate through each instance needing classification and predict its outcome.
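A minimal categorical Naïve Bayes in Python might look like the sketch below. It is an illustration, not our project code: the function names are hypothetical, attributes are assumed to be categorical, and add-one (Laplace) smoothing is an added assumption so that unseen values do not zero out a score.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Learn the model: class counts and per-class attribute value counts."""
    class_counts = Counter(labels)
    # value_counts[(attribute index, class)][value] = number of occurrences
    value_counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
    # Number of distinct values per attribute, needed for smoothing.
    n_values = [len({row[i] for row in rows}) for i in range(len(rows[0]))]
    return class_counts, value_counts, n_values

def classify(model, row):
    """Return the class maximizing P(C) * prod_i P(A_i | C)."""
    class_counts, value_counts, n_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # P(C)
        for i, value in enumerate(row):
            # P(A_i = value | C) with add-one smoothing (an assumption).
            score *= (value_counts[(i, label)][value] + 1) / (count + n_values[i])
        scores[label] = score
    return max(scores, key=scores.get)
```

Multiplying the per-attribute probabilities together is exactly where the independence assumption enters: each attribute contributes to the score as if it were unrelated to the others.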
Due to the naïve assumption made about the lack of correlation between attributes, Calders and Verwer propose a modification to the Naïve Bayes algorithm that allows us to improve the fairness of the classifications. We maintain the same model data structure; however, instead of basing the classification on the probability of the classification (P(C)*P(S|C)), we now base it on the probability of the sensitive attribute (P(S)*P(C|S)). After we have made the first classification using Naïve Bayes, we compute a discrimination score based on the difference between the number of privileged instances that we have classified as the positive outcome and the number of underprivileged instances receiving the positive outcome. While this discrimination score remains above 0.0, we artificially redistribute probabilities towards the underprivileged group, then reclassify the data and check the discrimination score again.
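The reclassify-and-redistribute loop can be sketched in Python as below. This is a simplified reading of the Calders and Verwer approach under several assumptions, not our exact code: outcomes are encoded as 1/0, the sensitive attribute takes the values `"priv"` and `"unpriv"`, the discrimination score is computed as a difference in positive rates (normalizing the counts by group size), the step size `delta=0.01` and the `max_iter` guard are illustrative, and all function names are hypothetical.

```python
from collections import Counter, defaultdict

POS, NEG = 1, 0                    # outcome encoding (assumed)
PRIV, UNPRIV = "priv", "unpriv"    # sensitive-attribute encoding (assumed)

def fit(rows, sens, labels):
    """Collect the counts behind P(S) * P(C|S) * prod_i P(A_i|C)."""
    joint = {(c, s): 0.0 for c in (POS, NEG) for s in (PRIV, UNPRIV)}
    class_counts = Counter(labels)
    attr_counts = defaultdict(Counter)
    for row, s, c in zip(rows, sens, labels):
        joint[(c, s)] += 1
        for i, value in enumerate(row):
            attr_counts[(i, c)][value] += 1
    n_values = [len({row[i] for row in rows}) for i in range(len(rows[0]))]
    return joint, class_counts, attr_counts, n_values

def predict(model, row, s):
    joint, class_counts, attr_counts, n_values = model
    scores = {}
    for c in (POS, NEG):
        score = joint[(c, s)]  # proportional to P(S) * P(C|S) for a fixed s
        for i, value in enumerate(row):
            # Add-one smoothing is an assumption of this sketch.
            score *= (attr_counts[(i, c)][value] + 1) / (class_counts[c] + n_values[i])
        scores[c] = score
    return POS if scores[POS] > scores[NEG] else NEG

def discrimination(preds, sens):
    """Positive rate of the privileged group minus that of the underprivileged."""
    def positive_rate(group):
        members = [p for p, s in zip(preds, sens) if s == group]
        return sum(members) / len(members)
    return positive_rate(PRIV) - positive_rate(UNPRIV)

def modified_bayes(rows, sens, labels, delta=0.01, max_iter=10_000):
    model = fit(rows, sens, labels)
    joint = model[0]
    actual_pos = sum(labels)
    preds = [predict(model, r, s) for r, s in zip(rows, sens)]
    for _ in range(max_iter):
        if discrimination(preds, sens) <= 0.0:
            break
        if sum(preds) < actual_pos:
            # Too few positives overall: shift mass toward (positive, underprivileged).
            shift = delta * joint[(NEG, PRIV)]
            joint[(POS, UNPRIV)] += shift
            joint[(NEG, UNPRIV)] -= shift
        else:
            # Enough positives overall: shift mass away from (positive, privileged).
            shift = delta * joint[(POS, UNPRIV)]
            joint[(NEG, PRIV)] += shift
            joint[(POS, PRIV)] -= shift
        preds = [predict(model, r, s) for r, s in zip(rows, sens)]
    return model, preds
```

Only the joint counts for class and sensitive attribute are modified; the per-attribute conditional probabilities are left untouched, which is why the same model data structure can be reused between iterations.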
With Two Bayes, instead of removing the actual sensitive attribute, we remove its ability to be used to classify an individual. This is done by splitting the training set into two disjoint sets, one for each sensitive group. This gives us one model trained only on the privileged group and one trained only on the underprivileged group. The classification process in Two Bayes chooses which model to use based on the individual's sensitive attribute value, allowing us to remove the correlation between sensitive attribute and classification without sacrificing accuracy.
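The split-and-dispatch structure of Two Bayes can be sketched as follows. This is a minimal illustration with hypothetical function names, assuming categorical attributes and add-one smoothing; it shows only the two-model structure described above and omits any further probability adjustment.

```python
from collections import Counter, defaultdict

def train(rows, labels):
    """Plain categorical Naive Bayes counts for one group's training rows."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
    n_values = [len({row[i] for row in rows}) for i in range(len(rows[0]))]
    return class_counts, value_counts, n_values

def classify(model, row):
    class_counts, value_counts, n_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total
        for i, value in enumerate(row):
            # Add-one smoothing is an assumption of this sketch.
            score *= (value_counts[(i, label)][value] + 1) / (count + n_values[i])
        scores[label] = score
    return max(scores, key=scores.get)

def train_two_bayes(rows, sens, labels):
    """Split the training set by sensitive group and train one model per group."""
    models = {}
    for group in set(sens):
        subset = [(r, c) for r, s, c in zip(rows, sens, labels) if s == group]
        models[group] = train([r for r, _ in subset], [c for _, c in subset])
    return models

def classify_two_bayes(models, row, s):
    """Pick the model matching the individual's sensitive attribute value."""
    return classify(models[s], row)
```

Note that the rows passed in should not include the sensitive attribute as a column: it is used only to decide which model to train and which model to consult.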
evanmallgood@gmail.com
stephherbers@gmail.com
mitchellm@carleton.edu