Data Mining Readings


As each reading is assigned, you should do the following:
For class on Wednesday, 4/2:
J. Friedman. Data Mining and Statistics: What's the Connection?

For class on Monday, 4/7:
N. Roussopoulos, S. Kelley, and F. Vincent. Nearest Neighbor Queries. SIGMOD 1995.

For class on Monday, 4/14:
J. Gehrke, R. Ramakrishnan, and V. Ganti. RAINFOREST - A Framework for Fast Decision Tree Construction of Large Datasets.

For class on Wednesday, 4/23:
P. Bradley, U. Fayyad, and C. Reina. Scaling Clustering Algorithms to Large Databases.

For class on Wednesday, 5/7:
S. Guha, R. Rastogi, and K. Shim. CURE: An Efficient Clustering Algorithm for Large Databases.

For class on Monday, 5/19:
R. Agrawal and R. Srikant. Fast Algorithms for Mining Association Rules.
This is a long paper. Focus on the Apriori algorithm and, to a lesser degree, the AprioriTid algorithm. Don't spend much time on AIS and SETM.