Exam 1 guidelines

This list is intended to give you a sense of what I think is important from the course so far and what I will have in mind when creating the exam.

I hate disclaimers, but here are some anyway. This is not a contract. I may have inadvertently left something off this list that ends up in an exam question. I make no guarantees that the exam will be 100% limited to items listed below. Moreover, I will not be able to test all of this material given the time limitations of the exam. I will have to pick and choose some subset of it.

BRING A CALCULATOR WITH YOU.

You are permitted one 8.5 x 11 sheet of paper with notes (both sides) for use as a reference during the exam.

Here are the specifics: Students should be able to...

Evaluate precisely the output from a simple k-nearest neighbor example. Calculate Euclidean and Manhattan distances correctly, as well as Jaccard similarity. Appropriately assess big-O complexity of traditional k-nearest neighbor. Show depth of understanding with k-nearest neighbor by answering questions about details of or variations in the algorithm.
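
For practice, the distance measures and the basic algorithm can be sketched in Python as below. This is a minimal sketch, not necessarily the formulation used in class: the function names are made up for illustration, ties in the vote go to whichever label was seen first, and Jaccard similarity of two empty sets is taken to be 1 by assumption. Because it computes a distance to every training point and then sorts, one query costs O(nd + n log n) for n training points in d dimensions.

```python
import math
from collections import Counter

def euclidean(p, q):
    # Square root of the sum of squared coordinate differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    # Sum of absolute coordinate differences.
    return sum(abs(a - b) for a, b in zip(p, q))

def jaccard(s, t):
    # Size of the intersection divided by size of the union.
    s, t = set(s), set(t)
    if not s and not t:
        return 1.0  # assumed convention for two empty sets
    return len(s & t) / len(s | t)

def knn_predict(train, query, k, dist=euclidean):
    # train is a list of (point, label) pairs (hypothetical layout).
    neighbors = sorted(train, key=lambda pl: dist(pl[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

print(euclidean((0, 0), (3, 4)))        # 5.0
print(manhattan((0, 0), (3, 4)))        # 7
print(jaccard({1, 2, 3}, {2, 3, 4}))    # 0.5
```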

Explain the purpose of standardization. Determine when it is appropriate. Implement it on small examples.
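
As a small worked example, here is a sketch of z-score standardization of a single feature column. It assumes the population standard deviation (`statistics.pstdev`). The sample standard deviation (`statistics.stdev`) is another common choice, so check which one the course uses. The function name is hypothetical.

```python
import statistics

def standardize(column):
    # Subtract the mean and divide by the standard deviation so the
    # feature ends up with mean 0 and standard deviation 1.
    mean = statistics.mean(column)
    sd = statistics.pstdev(column)  # population SD: an assumption
    if sd == 0:
        return [0.0 for _ in column]  # constant feature: nothing to scale
    return [(x - mean) / sd for x in column]

print(standardize([2, 4, 4, 4, 5, 5, 7, 9]))
# [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

Here the mean is 5 and the population standard deviation is 2, so each value is simply (x - 5) / 2. Standardizing matters for distance-based methods because otherwise a feature measured on a large scale dominates the distance.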

Define and interpret measures of correctness on classification problems, such as accuracy, precision, and recall. Compute and interpret a confusion matrix. Define, implement, and explain the purpose of leave-one-out cross-validation.
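
The sketch below shows one way to compute these quantities for a two-class problem, along with leave-one-out cross-validation. The helper names, the (point, label) data layout, and the 1-nearest-neighbor classifier used to drive the cross-validation are all illustrative assumptions.

```python
def confusion_counts(actual, predicted, positive):
    # Returns (TP, FP, FN, TN) with respect to the chosen positive class.
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)

def precision(tp, fp):
    # Of everything predicted positive, what fraction really was positive?
    return tp / (tp + fp) if tp + fp else 0.0

def recall(tp, fn):
    # Of everything actually positive, what fraction did we find?
    return tp / (tp + fn) if tp + fn else 0.0

def nearest_neighbor(train, query):
    # 1-NN using squared Euclidean distance (same ordering as Euclidean).
    return min(train, key=lambda pl: sum((a - b) ** 2
                                         for a, b in zip(pl[0], query)))[1]

def leave_one_out(data, classify):
    # Hold out each point in turn, train on the rest, and test on the
    # held-out point. Returns the fraction classified correctly.
    correct = 0
    for i, (point, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if classify(rest, point) == label:
            correct += 1
    return correct / len(data)

actual    = ["y", "y", "y", "n", "n", "n", "n", "n"]
predicted = ["y", "y", "n", "y", "n", "n", "n", "n"]
tp, fp, fn, tn = confusion_counts(actual, predicted, positive="y")
print(tp, fp, fn, tn)                 # 2 1 1 4
print(accuracy(tp, fp, fn, tn))       # 0.75

data = [((0,), "a"), ((1,), "a"), ((10,), "b"), ((11,), "b"), ((5,), "b")]
print(leave_one_out(data, nearest_neighbor))  # 0.8
```

In the last example, the point at 5 is the only one misclassified: once it is held out, its nearest remaining neighbor is the point at 1, which has label "a".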

Demonstrate in small examples how locality-sensitive hashing (LSH) is used. Explain or draw conclusions about LSH, showing fluency with the theory behind it. Compare and contrast LSH with simpler, less efficient approaches. Appropriately assess the big-O complexity of different aspects of LSH.
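
To make the idea concrete, here is a rough sketch of one common LSH scheme: MinHash signatures for Jaccard similarity, combined with banding. This is an assumption about which variant is meant, and the version covered in class may differ. All names, the random linear hash functions, and the choice of prime are illustrative. The simpler alternative it replaces is comparing all pairs directly, which takes O(n^2) comparisons for n sets.

```python
import random
from collections import defaultdict
from itertools import combinations

def make_hash_funcs(n, prime=2_147_483_647, seed=0):
    # n random hash functions of the form h(x) = (a*x + b) mod prime.
    rng = random.Random(seed)
    return [(rng.randrange(1, prime), rng.randrange(0, prime), prime)
            for _ in range(n)]

def minhash_signature(items, hash_funcs):
    # items must be a non-empty set of non-negative integers.
    return [min((a * x + b) % p for x in items) for a, b, p in hash_funcs]

def candidate_pairs(signatures, bands, rows):
    # Split each signature into bands of `rows` values. Two sets become a
    # candidate pair if they agree on every row of at least one band.
    candidates = set()
    for band in range(bands):
        buckets = defaultdict(list)
        for name, sig in signatures.items():
            key = tuple(sig[band * rows:(band + 1) * rows])
            buckets[key].append(name)
        for names in buckets.values():
            candidates.update(combinations(sorted(names), 2))
    return candidates

def candidate_probability(s, rows, bands):
    # Chance that two sets with Jaccard similarity s become candidates.
    return 1 - (1 - s ** rows) ** bands

bands, rows = 20, 5
funcs = make_hash_funcs(bands * rows)
sets = {"A": {1, 2, 3, 4, 5}, "B": {1, 2, 3, 4, 5}, "C": {100, 200, 300}}
sigs = {name: minhash_signature(s, funcs) for name, s in sets.items()}
print(("A", "B") in candidate_pairs(sigs, bands, rows))  # True
print(candidate_probability(0.8, rows, bands))           # about 0.9996
```

Only the candidate pairs need an exact similarity check afterward, which is where the savings over the all-pairs approach come from. Identical sets always collide, while dissimilar sets collide only occasionally, by chance.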


One last word: a fine way to practice is to do problems in the textbook that we haven't done. I wouldn't be surprised if I looked there for inspiration in writing test questions.