Write a program in Lisp to implement the ID3 decision tree algorithm. You should read in a space delimited dataset and output your decision tree and the training set accuracy in some readable format. For example, here is the tennis dataset. The first line will contain the names of the fields:
outlook temperature humidity wind playtennis sunny hot high FALSE no sunny hot high TRUE no overcast hot high FALSE yes rainy mild high FALSE yes rainy cool normal FALSE yes rainy cool normal TRUE no overcast cool normal TRUE yes sunny mild high FALSE no sunny cool normal FALSE yes rainy mild normal FALSE yes sunny mild normal TRUE yes overcast mild high TRUE yes overcast hot normal FALSE yes rainy mild high TRUE no
The last column is the classification attribute, and will always
contain contain the values yes
or no
.
For output, you might want to do something like Weka does: turn the decision tree on its side, and use indentation to show levels of the tree as it grows from the left. This is probably easier than drawing the tree growing down from the top of the screen.
outlook = sunny | humidity = high: no | humidity = normal: yes outlook = overcast: yes outlook = rainy | windy = TRUE: no | windy = FALSE: yes
You don't need to make your tree output look exactly like above: feel free to print out something similarly readable if you think it is easier to code.
You may find the Lisp make-hash-table
function
useful. Good luck, and have fun!