CS 327 / CGST 360: Artificial Intelligence

Assignment: Decision Trees

Programming assignment for CS 327 only

Write a program in Lisp, Java, or C++ to implement the ID3 decision tree algorithm. You should read in a space delimited dataset and output your decision tree and the training set accuracy in some readable format. For example, here is the tennis dataset. The first line will contain the names of the fields:

outlook temperature humidity wind playtennis
sunny hot high FALSE no
sunny hot high TRUE no
overcast hot high FALSE yes
rainy mild high FALSE yes
rainy cool normal FALSE yes
rainy cool normal TRUE no
overcast cool normal TRUE yes
sunny mild high FALSE no
sunny cool normal FALSE yes
rainy mild normal FALSE yes
sunny mild normal TRUE yes
overcast mild high TRUE yes
overcast hot normal FALSE yes
rainy mild high TRUE no

The last column is the classification attribute, and will always contain contain the values yes or no.

For output, you might want to do something like Weka does: turn the decision tree on its side, and use indentation to show levels of the tree as it grows from the left. This is probably easier than drawing the tree growing down from the top of the screen.

outlook = sunny
|  humidity = high: no
|  humidity = normal: yes
outlook = overcast: yes
outlook = rainy
|  windy = TRUE: no
|  windy = FALSE: yes

Training set accuracy = 100%

If you choose to code in Lisp, look up the make-hash-table function. You may find it useful. If you code in Java, you may find that the HashMap helps similarly.

Non-programming questions for CGST 360 only

Turn in solutions for textbook exercises 18.1, 18.2, 18.4, 18.5, 18.7, 18.10, and 18.14.