Project: Explaining SVMs

Advisor: Dave Musicant

Meeting time: TTh 3:10-4:55

I. Background

Supervised machine learning techniques (neural networks, decision trees, support vector machines, etc.) are amazingly useful in lots of real-world applications. They have been used by doctors to help determine whether someone is ill, and by the credit industry to decide whether someone should receive a credit card, to name just two. A key desirable aspect of machine learning systems is explainability. If someone is rejected for credit, or determined to be ill, that person wants to know why. An answer such as "our algorithm says so" isn't very satisfying. Some machine learning algorithms, such as decision trees, produce models that are simple enough to serve as good explanations.

The support vector machine (SVM) is a machine learning algorithm that has been shown to be highly effective. In many applications, it provides more accurate results than simpler techniques such as decision trees. SVMs, however, produce a "black box" classifier for which it is difficult to give a simple explanation. There are a variety of techniques for approximating the outcome of an SVM in a way that makes the result more explainable, but the approximation tends to be less accurate. I have an idea for how to make SVMs considerably more explainable without needing to approximate them.
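
For concreteness, the standard SVM decision function (written here in LaTeX notation) shows where the "black box" character comes from:

    f(x) = \mathrm{sign}\Big( \sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b \Big)

where the \alpha_i and b are learned from the training data, the y_i are the training labels, and K is the kernel function. With a linear kernel this collapses to sign(w \cdot x + b) for a single weight vector w; with a nonlinear kernel, though, the prediction is a weighted vote over many training points, which is much harder to summarize for a person.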

This project is new machine learning research. As far as I know, the technique that we will implement has never been tried before. The ultimate goal, therefore, is to write up and publish this work (assuming it works). After we have implemented and experimented with the new technique, we will write a paper to submit to a major machine learning / data mining conference. KDD (Knowledge Discovery and Data Mining) has a paper submission deadline in mid-to-late February. This means there will be some pressure to finish parts of the project a little sooner than the typical comps schedule allows, but it also means that your comps crunch time won't come at the end of the term.

It is true that, since no one has tried this before, the results are somewhat unpredictable. The technique might work better than I expect, or it might work considerably worse. One of the challenges of doing original research is not quite knowing how it is going to turn out!

Part of the project will involve building a graphical user interface that users can experiment with in order to understand the explanations produced by the SVMs. This software would be released to the community at large, to coincide with submission of the research paper.

II. The Project

Here is a list of the concepts and technologies that will be necessary.

  1. Support vector machines. You'll be coming up to speed on how SVMs work. There are some good tutorials available, and I can help. There is also some good free SVM software that we can leverage (see the first sketch after this list).
  2. Data visualization. You will be building a graphical user interface to help users visualize the explanations that you provide. This means that you will need to learn techniques for visualizing high-dimensional data in a way that is understandable to people. You will also need to program in a language that supports such visualization (the first sketch below includes a rough example).
  3. Optimization. If we make sufficient progress on the project, mathematical optimization techniques could eventually come into play. We might use methods such as simulated annealing (sketched below) or genetic algorithms, or possibly others.
  4. Technical writing. An important part of this project will be actually writing the paper to submit to KDD. You'll use LaTeX to format it appropriately.
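
To make items 1 and 2 concrete, here is a minimal sketch of the kind of starting point I have in mind. It assumes, purely for illustration, a Python setup with scikit-learn and matplotlib; the actual SVM package, dataset, and visualization tools are all still to be decided. It trains a linear-kernel SVM on a toy dataset, inspects the learned weights (one simple but limited form of explanation), and projects the data to two dimensions for a rough picture:

    import matplotlib.pyplot as plt
    from sklearn.datasets import load_breast_cancer
    from sklearn.decomposition import PCA
    from sklearn.svm import SVC

    # Toy dataset: predict whether a tumor is malignant or benign.
    data = load_breast_cancer()
    X, y = data.data, data.target

    # Linear-kernel SVM; its weight vector gives one crude kind of explanation.
    clf = SVC(kernel="linear", C=1.0)
    clf.fit(X, y)

    # Features with the largest absolute weights influence the decision most.
    weights = clf.coef_[0]
    ranked = sorted(zip(data.feature_names, weights), key=lambda fw: -abs(fw[1]))
    for name, w in ranked[:5]:
        print(f"{name}: {w:.3f}")

    # Project to 2D with PCA just to get a rough picture of the predictions.
    X2d = PCA(n_components=2).fit_transform(X)
    plt.scatter(X2d[:, 0], X2d[:, 1], c=clf.predict(X), s=10)
    plt.title("Linear SVM predictions (PCA projection)")
    plt.show()

Note that reading off weights like this only works for a linear kernel; with a nonlinear kernel there is no single weight vector in the original feature space to inspect, which is part of what makes the explanation problem hard.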
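
For item 3, here is a bare-bones sketch of simulated annealing, one of the optimization methods mentioned above. The objective, neighborhood function, and cooling schedule below are placeholders for illustration only, not the ones we would actually use:

    import math
    import random

    def simulated_annealing(objective, initial, neighbor,
                            t0=1.0, cooling=0.995, steps=10000):
        # Minimize `objective`, starting from `initial`, proposing moves with
        # `neighbor`, and lowering the temperature geometrically.
        current, current_cost = initial, objective(initial)
        best, best_cost = current, current_cost
        temperature = t0
        for _ in range(steps):
            candidate = neighbor(current)
            cost = objective(candidate)
            # Always accept improvements; accept worse moves with a probability
            # that shrinks as the temperature drops.
            if cost < current_cost or random.random() < math.exp((current_cost - cost) / temperature):
                current, current_cost = candidate, cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
            temperature *= cooling
        return best, best_cost

    # Toy usage: find the minimum of a simple one-dimensional function.
    best_x, best_val = simulated_annealing(
        objective=lambda x: (x - 3) ** 2,
        initial=0.0,
        neighbor=lambda x: x + random.uniform(-0.5, 0.5),
    )
    print(best_x, best_val)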

III. References

Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach, 2nd edition, Section 20.6. Currently available from the textbook web page.

David R. Musicant. Data Mining via Mathematical Programming and Machine Learning. Chapters 1 and 2. Linked from my web page.

Glenn Fung, Sathyakama Sandilya, and Bharat Rao. Rule extraction for linear support vector machines. Computer Aided Diagnosis & Therapy Solutions, Siemens Medical Solutions, June 2004. Accepted for oral presentation at KDD 2005.

Haydemar Nunez, Cecilio Angulo, and Andreu Catala. Rule extraction from support vector machines. In Proceedings of ESANN 2002 (European Symposium on Artificial Neural Networks), Bruges, Belgium, 24-26 April 2002, pp. 107-112.