This is intended to give you a sense of what I think is important from the course, and what I will be thinking of when creating the exam.

I hate disclaimers, but here are some anyway. This is not a contract. I may have inadvertently left something off this list that ends up in an exam question. I make no guarantees that the exam will be 100% limited to the items listed below. Moreover, I will not be able to test all of this material given the time limitations of the exam; I will have to pick and choose some subset of it.

You are permitted one 8.5 x 11 sheet of paper with notes (both sides) for use as a reference during the exam.

Here are the specifics. Students should be able to...

Answer detailed questions about what decision trees are and how they work, from conceptual, algorithmic, and programming perspectives.

Explain differences between training, testing, and tuning sets, regarding their purposes and how they are used.

Describe how accuracy of supervised learning classifiers is measured, and produce novel measurement techniques to handle unusual circumstances.

Demonstrate that entropy has desired characteristics.

Explain and/or demonstrate what overfitting is, answer questions about it, and propose how to avoid it in novel situations.

Appropriately interpret computational learning theory results that we developed, and apply such results correctly.

Do the following regarding supervised learning-based backpropagation neural networks:
- show that they can calculate the output of a single perceptron and of a neural network, with an associated set of inputs, weights, and defined threshold function. Students should be able to do this either in terms of actual given numbers, or algebraically with given variables.
- indicate what an activation function is and what purpose it serves, what the sigmoid function is in particular, what the parameter "b" does, and what purpose the threshold value serves.
- augment a perceptron or neural network with extra inputs to serve as proxies for the threshold value in the activation function, and explain what purpose this serves
- demonstrate how the backpropagation process works with regard to the use of training points, iterations, etc. Students do not need to derive, explain, or use specific backpropagation formulas for updating weights.
- explain the effects that more vs. fewer hidden nodes have, as well as more vs. fewer layers in the network
- describe various ways that overfitting can be aggravated or reduced.

Do the following regarding neural network approaches to unsupervised learning, specifically Kohonen self-organizing maps:
- explain what kind of problem these techniques are used for, and how it contrasts with the kind for which backpropagation networks are used
- demonstrate what it means to normalize the data in this context; explain what role normalizing the data serves, when it is appropriate to normalize the data, and how the network calculations depend on normalized data.
- explain how many layers Kohonen SOMs have, and how the number of nodes in each layer is determined
- demonstrate how these networks are trained, and how the results are interpreted

Do the following regarding support vector machines:
- interpret and explain the basic setup, and the meaning of all of the variables used
- describe and answer questions about what distinguishes SVMs from other techniques, particularly with regard to overfitting
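
As an illustration of the perceptron items listed earlier (computing an output from inputs, weights, and a threshold, and replacing the threshold with an extra input), here is a minimal Python sketch. It rests on assumptions that are not stated in this list: the sigmoid is taken to be 1 / (1 + e^(-b*s)) with "b" controlling steepness, the threshold is subtracted from the weighted sum, and the extra input is fixed at -1. The function names and the numbers are made up for the example; the course's own notation may differ.

```python
import math

def sigmoid(s, b=1.0):
    """Assumed sigmoid form: 1 / (1 + e^(-b*s)); b controls steepness."""
    return 1.0 / (1.0 + math.exp(-b * s))

def perceptron_output(inputs, weights, threshold, b=1.0):
    """Weighted sum of the inputs, shifted by the threshold, then squashed."""
    s = sum(x * w for x, w in zip(inputs, weights))
    return sigmoid(s - threshold, b)

def augmented_output(inputs, weights, threshold, b=1.0):
    """Same perceptron with the threshold folded into an extra weight.

    An extra input fixed at -1 (an assumed convention) carries the threshold
    as its weight, so the threshold is handled like any other weight.
    """
    aug_inputs = list(inputs) + [-1.0]
    aug_weights = list(weights) + [threshold]
    s = sum(x * w for x, w in zip(aug_inputs, aug_weights))
    return sigmoid(s, b)

# Illustrative numbers only (not from the course).
inputs = [1.0, 0.0, 1.0]
weights = [0.5, -0.3, 0.25]
threshold = 0.75

plain = perceptron_output(inputs, weights, threshold)
augmented = augmented_output(inputs, weights, threshold)
print(plain, augmented)  # prints: 0.5 0.5
```

With these numbers the weighted sum equals the threshold, so both versions output 0.5. Under the assumed form, a larger b makes the sigmoid steeper around that point without moving it.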