Exam 2 guidelines

This is intended to give you a sense of what I think is important from the course so far, and what I will be thinking of when creating the exam.

I hate disclaimers, but here are some anyway. This is not a contract. I may have inadvertently left something off this list that ends up in an exam question. I make no guarantees that the exam will be 100% limited to items listed below. Moreover, I will not be able to test all of this material given the time limitations of the exam. I will have to pick and choose some subset of it.

BRING A CALCULATOR WITH YOU.

You are permitted one 8.5 x 11 sheet of paper with notes (both sides) for use as a reference during the exam.

Here are the specifics: Students should be able to...

Mathematically define PageRank, and calculate the actual values on toy examples. Demonstrate what happens in goofy situations, such as dead ends and spider traps, and discuss how spam farms and similar problems can be addressed. Be able to do PageRank algebraic analysis as we did in the homework. Demonstrate understanding of the topic-sensitive variation.
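
For practice on toy graphs, here is a minimal sketch of PageRank by repeated iteration with "taxation" (random teleports). It assumes one common formulation, and the names `pagerank`, `links`, and `beta` are illustrative, not notation from the course. Dead ends are deliberately left unpatched so you can watch rank leak away.

```python
def pagerank(links, beta=0.85, iterations=50):
    """links maps each page to the list of pages it links to.

    Assumption: every link target also appears as a key in links.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the teleport share (1 - beta) / n.
        new_rank = {p: (1.0 - beta) / n for p in pages}
        for p in pages:
            out = links[p]
            if out:
                # A page splits beta * its rank evenly among its out-links.
                share = beta * rank[p] / len(out)
                for q in out:
                    new_rank[q] += share
            # A dead end (no out-links) passes nothing on, so total rank shrinks.
        rank = new_rank
    return rank


cycle = pagerank({"A": ["B"], "B": ["C"], "C": ["A"]})   # well-behaved graph
trap = pagerank({"A": ["B"], "B": ["B"]})                # B is a spider trap
dead = pagerank({"A": ["B"], "B": []})                   # B is a dead end
```

Comparing the three results shows the goofy cases directly: the spider trap hoards most of the rank, and with a dead end the ranks no longer sum to 1.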

Mathematically define HITS, perform calculations on toy examples, and compare with PageRank.
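
A minimal sketch of the HITS iteration on a toy graph follows. It assumes the convention of scaling each score vector so its largest entry is 1; other normalizations (such as unit length) are also used, so check this against the definition from class. The function name `hits` and the example graph are illustrative only.

```python
def hits(links, iterations=50):
    """links maps each page to the list of pages it links to.

    Returns (hub, authority) score dictionaries.
    """
    pages = sorted(links)
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority of a page = sum of hub scores of the pages linking to it.
        auth = {p: 0.0 for p in pages}
        for p in pages:
            for q in links[p]:
                auth[q] += hub[p]
        top = max(auth.values()) or 1.0   # avoid dividing by zero
        auth = {p: a / top for p, a in auth.items()}
        # Hub score of a page = sum of authority scores of the pages it links to.
        hub = {p: sum(auth[q] for q in links[p]) for p in pages}
        top = max(hub.values()) or 1.0
        hub = {p: h / top for p, h in hub.items()}
    return hub, auth


hub, auth = hits({"A": ["B", "C"], "B": ["C"], "C": []})
```

Unlike PageRank, each page ends up with two scores: here C is the best authority but a worthless hub, and A is the reverse.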

Describe the association rules problem in terms of transactions, items, support, and confidence. Distinguish between an itemset and an association rule.

Define and calculate support of an itemset, as well as support, confidence, and interest of a rule.
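
The following sketch shows these calculations on four made-up transactions. It assumes support is a raw count of transactions, confidence of a rule I -> j is support(I with j) / support(I), and interest is confidence minus the fraction of transactions containing j. If class used fractions for support, divide the counts by the number of transactions. All names and data here are hypothetical.

```python
baskets = [
    {"milk", "bread"},
    {"milk", "bread", "beer"},
    {"milk", "beer"},
    {"bread"},
]


def support(itemset, baskets):
    """Number of baskets that contain every item in itemset."""
    return sum(1 for b in baskets if set(itemset) <= b)


def rule_support(lhs, rhs, baskets):
    """Support of the rule lhs -> rhs: baskets containing lhs and rhs together."""
    return support(set(lhs) | {rhs}, baskets)


def confidence(lhs, rhs, baskets):
    """Fraction of the baskets containing lhs that also contain rhs."""
    return rule_support(lhs, rhs, baskets) / support(lhs, baskets)


def interest(lhs, rhs, baskets):
    """Confidence minus the overall fraction of baskets containing rhs."""
    return confidence(lhs, rhs, baskets) - support({rhs}, baskets) / len(baskets)


conf_milk_bread = confidence({"milk"}, "bread", baskets)   # 2/3
int_milk_bread = interest({"milk"}, "bread", baskets)      # 2/3 - 3/4
```

Note that the rule milk -> bread has decent confidence but slightly negative interest, because bread is common in the data regardless of milk.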

Be able to explain, interpret, justify, and/or apply to toy examples the Apriori algorithm (and its detailed efficiency tricks) for finding frequent itemsets and quality association rules.
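
As a study aid, here is a bare-bones sketch of the level-wise Apriori idea: count single items, keep the frequent ones, then only count a size-k candidate if all of its size-(k-1) subsets were frequent. It assumes support is a raw count and omits the memory-saving tricks (triangular arrays, hashing, and so on). The function name and sample data are hypothetical.

```python
from itertools import combinations


def apriori(baskets, min_support):
    """Return {itemset: count} for every itemset in at least min_support baskets."""
    # Pass 1: count individual items.
    counts = {}
    for b in baskets:
        for item in b:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        items = sorted(set().union(*frequent))
        # Monotonicity: a candidate survives only if all (k-1)-subsets are frequent.
        candidates = [
            frozenset(c) for c in combinations(items, k)
            if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
        ]
        # Pass k: count only the surviving candidates.
        counts = {c: 0 for c in candidates}
        for b in baskets:
            for c in candidates:
                if c <= b:
                    counts[c] += 1
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


sample = [{"milk", "bread"}, {"milk", "bread", "beer"}, {"milk", "beer"}, {"bread"}]
found = apriori(sample, min_support=2)
```

With a threshold of 2, the pair {bread, beer} is infrequent, so the triple {milk, bread, beer} is never even counted.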

Be able to explain, interpret, justify, and answer questions about scalable association rules techniques, such as the simple randomized algorithm, SON, and Toivonen's algorithm.
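
To make the two-pass structure of SON concrete, here is a toy sketch. It assumes the usual outline: split the data into chunks, find itemsets that are frequent within each chunk at a proportionally lowered threshold, then count the union of those candidates over the full data. The in-chunk miner below is a brute-force stand-in (any in-memory frequent-itemset method could be used), and all names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_frequent(chunk, threshold):
    """Brute force: count every subset of every basket in the chunk."""
    counts = Counter()
    for basket in chunk:
        items = sorted(basket)
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return {s for s, c in counts.items() if c >= threshold}


def son(baskets, min_support, num_chunks=2):
    size = max(1, -(-len(baskets) // num_chunks))   # ceiling division
    chunks = [baskets[i:i + size] for i in range(0, len(baskets), size)]
    # Pass 1: an itemset frequent overall must be frequent in some chunk
    # at the scaled-down threshold, so the union has no false negatives.
    candidates = set()
    for chunk in chunks:
        threshold = min_support * len(chunk) / len(baskets)
        candidates |= local_frequent(chunk, threshold)
    # Pass 2: count every candidate on the full data to drop false positives.
    counts = {c: sum(1 for b in baskets if c <= b) for c in candidates}
    return {s: c for s, c in counts.items() if c >= min_support}


data = [{"milk", "bread"}, {"milk", "bread", "beer"}, {"milk", "beer"}, {"bread"}]
son_result = son(data, min_support=2)
```

The first pass produces false positives (here, {bread, beer} looks frequent inside the first chunk), and the second pass removes them. Toivonen's algorithm differs in that it mines a single random sample and uses the negative border to check whether anything was missed.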


One last word: a fine way to practice is to do problems in the textbook that we haven't done. I wouldn't be surprised if I looked there for inspiration in writing test questions.