CS 395 Assignment: Decision Trees

For this assignment, you will implement and run several variations of decision tree construction.

I have placed a dataset of congressional voting records in the directory /Accounts/courses/cs395/dmusican/voting-records. Use the RainForest algorithms RF-Write, RF-Read, and simple RF-Hybrid to build decision trees on this dataset using the ID3 criterion. Since this dataset is small enough to fit entirely in memory, you should simulate not having enough memory. Keep it simple: assume that exactly four AVC-groups can fit in memory at once. Report the following information:
Assuming that this entire dataset is used as training data, what level of training accuracy is achieved? What does your decision tree look like?
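
As a starting point, here is a minimal Python sketch of how an AVC-group for one tree node might be built in a single pass over its records, and how the ID3 criterion (information gain) can then be computed from an AVC-set alone, without revisiting the records. The record layout (a dict per record), the field names in the usage note, and the function names are all assumptions made for illustration; they are not the format of the actual dataset, and you are free to organize your code differently.

```python
from collections import Counter, defaultdict
from math import log2

def build_avc_group(records, attributes, label_key):
    """One pass over the records: for every attribute, count how often
    each (attribute value, class label) pair occurs."""
    avc = {a: defaultdict(Counter) for a in attributes}
    for r in records:
        for a in attributes:
            avc[a][r[a]][r[label_key]] += 1
    return avc

def entropy(counts):
    """Entropy in bits of a collection of class counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts)

def information_gain(avc_set):
    """ID3 criterion for one attribute.
    avc_set maps attribute value -> Counter of class labels."""
    class_totals = Counter()
    for label_counts in avc_set.values():
        class_totals.update(label_counts)
    n = sum(class_totals.values())
    # Weighted entropy of the partitions induced by the attribute.
    remainder = sum(
        sum(label_counts.values()) / n * entropy(label_counts.values())
        for label_counts in avc_set.values()
    )
    return entropy(class_totals.values()) - remainder
```

For example, with hypothetical records such as {"vote1": "y", "vote2": "n", "party": "democrat"}, calling build_avc_group(records, ["vote1", "vote2"], "party") gives one AVC-set per attribute, and the attribute whose AVC-set yields the largest information_gain is the one ID3 would split on.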

For RF-Hybrid, you should use the simple form described in the first paragraph of section 4.3 of the Gehrke paper.
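
One possible way to simulate the four AVC-group memory limit is to process the nodes that still need a split in batches, building AVC-groups for at most four nodes per scan of the data. The sketch below shows only that bookkeeping; the function name and the max_groups_in_memory parameter are hypothetical, and how each RainForest variant should use such batches is something to work out from the paper.

```python
def scan_batches(frontier_nodes, max_groups_in_memory=4):
    """Yield successive lists of nodes whose AVC-groups are allowed to be
    held in memory together (at most max_groups_in_memory per batch)."""
    for start in range(0, len(frontier_nodes), max_groups_in_memory):
        yield frontier_nodes[start:start + max_groups_in_memory]
```

Each batch yielded here corresponds to one simulated scan in which only that batch's AVC-groups exist in memory; discard them before moving on to the next batch.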