Association Rules
Association rules take a distinct approach that does not behave like any of the other algorithms we worked with. We used an implementation based on the one described by Kim et al. [1].
Fundamentals of Association Rules
Association rules attempt to capture relationships between items. An association rule has the form A1, A2, A3, ... => B1, B2, B3, ...: it expresses how the presence of one set of items predicts the presence of another. As a concrete example, the rule A => B, C means that the appearance of item A in someone's history implies that B and C will appear there as well.
It's not just the items that matter, however; another important factor is the confidence of a rule. Confidence measures how reliably a rule applies: of all the histories that contain the left-hand side, the fraction that also contain the right-hand side. It ranges from 0 to 1. A confidence of 1 means the rule always holds - every time we see A, we also see B and C. A confidence of 0 means it never holds - A does not imply B and C.
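As a minimal sketch of this definition (the histories-as-sets representation and the function name are our own illustration, not details of the implementation in [1]), confidence can be computed as the fraction of histories containing the rule's left-hand side that also contain its right-hand side:

```python
def confidence(histories, antecedent, consequent):
    """Confidence of the rule antecedent => consequent.

    histories: list of sets, one set of item ids per user.
    antecedent, consequent: sets of item ids.
    Returns the fraction of histories containing the antecedent
    that also contain the consequent, or 0.0 if none contain it.
    """
    with_antecedent = [h for h in histories if antecedent <= h]
    if not with_antecedent:
        return 0.0
    with_both = [h for h in with_antecedent if consequent <= h]
    return len(with_both) / len(with_antecedent)

# Example: does rating item A imply rating items B and C?
histories = [{"A", "B", "C"}, {"A", "B"}, {"B", "C"}, {"A", "B", "C"}]
print(confidence(histories, {"A"}, {"B", "C"}))  # 2 of the 3 histories with A also have B and C -> 0.667
```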
Association Rules for Recommendations
For our purposes we used association rules of the form A => B; that is, we considered only single-item relationships: how likely is the active user to rate item B, given that they have rated item A?
We built an n x n matrix containing the confidence value for every ordered pair of the n items in the dataset. We then represent the user as a vector in n-dimensional space, with one entry per item. Multiplying the matrix by this vector yields a recommendation vector, whose largest entries correspond to the items the user is most likely to rate, given the ones they have rated in the past.
You can then use this recommendation vector to order the user's preferences.
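To make the whole pipeline concrete, here is a small sketch using NumPy. The binary ratings matrix, the zeroed diagonal, the masking of already-rated items, and all names here are our own illustrative assumptions rather than details of the implementation in [1]:

```python
import numpy as np

def confidence_matrix(ratings):
    """Pairwise confidence matrix from a binary user-item matrix.

    ratings: (num_users, n) array with ratings[u, i] = 1 if user u rated item i.
    Returns an (n, n) matrix C where C[i, j] = confidence(i => j),
    i.e. the fraction of users who rated item i that also rated item j.
    """
    ratings = ratings.astype(float)
    co_counts = ratings.T @ ratings            # co_counts[i, j] = users who rated both i and j
    item_counts = np.diag(co_counts).copy()    # item_counts[i] = users who rated i
    item_counts[item_counts == 0] = 1.0        # avoid division by zero for unrated items
    C = co_counts / item_counts[:, None]       # row i normalized by the count of item i
    np.fill_diagonal(C, 0.0)                   # an item should not recommend itself
    return C

def recommend(C, user_vector, top_k=3):
    """Order unseen items for one user by their recommendation-vector score."""
    scores = C.T @ user_vector                 # scores[j] = sum of confidence(i => j) over rated items i
    scores[user_vector > 0] = -np.inf          # do not re-recommend items the user already rated
    return np.argsort(scores)[::-1][:top_k]

# Toy data: 4 users, 5 items (1 = rated).
ratings = np.array([[1, 1, 0, 0, 1],
                    [1, 1, 1, 0, 0],
                    [0, 1, 1, 1, 0],
                    [1, 0, 1, 0, 1]])
C = confidence_matrix(ratings)
print(recommend(C, ratings[0]))  # top unseen items for the first user
```

In this sketch the transpose appears because we chose the convention that row i of the matrix holds confidences of rules with item i as the antecedent; the largest remaining scores then give the ordering of the user's preferences.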
Advantages
- It is very fast. Building the matrix takes little time, and once it is built, generating recommendations amounts to a single matrix-vector multiplication.
- It generates credible results - the kind of recommendations you might make yourself, given what a person has already seen.
- It works well with sparse data sets, especially with a multi-level association rule index whose higher levels generalize items when the lower levels lack enough information. We didn't feel the need to implement this, but the option was there.
Disadvantages
- It generates obvious results. For example, because everyone rated and enjoyed Star Wars, it almost always recommends watching it if you haven't.
- It cannot predict ratings, only order preferences. If you want to predict actual ratings, you need to use a different algorithm.
References
[1] C. Kim and J. Kim. A recommendation algorithm using multi-level association rules. In Proceedings of the IEEE/WIC International Conference on Web Intelligence (WI 2003), 2003.