Advisor: Anna Rafferty
Times: Tuesday/Thursday 2:45-3:45. [You'll typically meet with me only on Thursdays.]
Almost every day I see a news article discussing how big data and machine learning are leading to increasingly powerful predictive systems in diverse applications, such as assessing insurance risk, calculating credit scores, and deciding who to hire for a job. These applications have real-world, high-stakes impacts on individuals, and the importance of fairness to individuals in these types of applications has been recognized through laws that protect individuals against discrimination based on certain protected classes. Yet many of these machine learning algorithms are not transparent in how they make decisions, and it can thus be difficult to determine whether an algorithm is treating individuals fairly - or even to decide what it means for an algorithm to be fair.
There has been increasing excitement in the past few years about developing computational definitions of fairness and discrimination. Applying these metrics has led to several realizations:
These realizations are troubling, and have sparked significant interest in computational ways to try to mitigate algorithmic unfairness.
In this project, you'll be exploring different ways of characterizing fairness and algorithmic approaches to improving fairness outcomes. Specifically, you will:
(Wondering how this project is different from the other comps project focused on algorithmic fairness? That project focuses specifically on two criminal justice datasets: understanding what previous analyses of those datasets have uncovered, and applying fair classification algorithms to them. In this project, we will not be focused on criminal justice datasets, and will focus somewhat more on algorithmic approaches for mitigating unfairness.)
In this project, you'll be moving frequently between mathematical definitions of fairness, conceptual ideas about what it means for something to be fair, and algorithmic instantiations of these ideas. You don't need to have previous experience with this kind of work, but you do need to be willing (and hopefully excited!) to be engaged in algorithmic and mathematical analyses. Previous experience working with large datasets may be helpful but is not necessary. Some courses that may be useful but are not required are Algorithms, Advanced Algorithms, Artificial Intelligence, Data Mining, Computational Models of Cognition, Data Science, and Linear Algebra.
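To give a flavor of what moving from a mathematical definition to code can look like, here is a minimal sketch of one commonly discussed fairness criterion, demographic parity, which compares the rate of positive predictions across groups. This is only an illustration and not necessarily a definition the project will adopt: the function names (`positive_rates`, `demographic_parity_gap`) and the toy data are made up for this example.

```python
from collections import defaultdict


def positive_rates(predictions, groups):
    """Return the fraction of positive (1) predictions within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += pred
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_gap(predictions, groups):
    """Largest difference in positive-prediction rate between any two groups.

    A gap of 0 means every group receives positive predictions at the same
    rate; larger gaps indicate a larger disparity under this one definition.
    """
    rates = positive_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: 1 = favorable decision, 0 = unfavorable decision.
predictions = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

print(positive_rates(predictions, groups))
print(demographic_parity_gap(predictions, groups))
```

Note that a small gap under this one metric does not by itself settle whether a system is fair; different definitions can disagree, which is part of what makes the conceptual side of this work interesting.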
Below are a few papers about some instances where algorithmic bias has been uncovered, what it means for an algorithm to be fair, and some strategies that researchers have used to try to create algorithms that are less biased. Note that these references are intended to provide a very minimal start for your literature search - they are certainly neither the only nor necessarily the best sources for ideas. You will be finding and reading many additional papers!
This article discusses a range of examples of algorithmic bias, including where biases have had an impact in the criminal justice system. In this comps project, we won't be focusing on the criminal justice system because there's another great comps project this year where that will be the focus!
This paper discusses what it means for an algorithm to be fair and shows a case study about fairness with regard to race when making loan decisions based on credit scores.
This paper shows an example where a dataset has a particular skew - images about cooking more often involve women than men - but the algorithm that's learning about images amplifies that bias, making it extremely unlikely to say that an image about cooking includes a man.
This paper introduces a particular definition of what it means for an algorithm to be biased, one that is grounded in legal definitions of bias.