In the modeling section, we focus on three questions. First, how can we represent a user's political stance given their responses to articles? Second, how can we show recent changes in a user's responses? Third, how can we determine an article's political stance?
The first step is to fix a direction: negative = Left, 0 = Center, positive = Right. Next, we define a score change for every possible combination of article stance and user response, as shown in this table. For example, if a user agrees with a Left-leaning article, the score change is negative. Most importantly, when a user agrees with a Center article, the score moves towards 0. The total score is the sum of all score changes, and the value of each change is taken from the article score (see Article Model for more detail). However, we cannot use the total score alone to represent political stance, since users who read more would accumulate larger scores. So we use the Global User Score, which is simply Total Score / Number of Total Responses, to represent the overall user stance.
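The averaging step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not our exact implementation: the +1/-1 encoding of agree/disagree, the rule that a disagreement flips the sign of the article score, and the names score_change and global_user_score are all hypothetical stand-ins for the table of conditions.

```python
from typing import List, Tuple

AGREE = 1      # assumed encoding of an "agree" response
DISAGREE = -1  # assumed encoding of a "disagree" response


def score_change(article_score: float, response: int) -> float:
    """Score change for one response.

    Assumption: agreeing moves the user in the article's direction and
    disagreeing moves the user the opposite way. The real rules live in
    the table of conditions, which this sketch only approximates.
    """
    return article_score * response


def global_user_score(history: List[Tuple[float, int]]) -> float:
    """Total Score divided by the number of responses.

    `history` holds (article_score, response) pairs. An empty history
    is treated as Center (0.0), which is also an assumption.
    """
    if not history:
        return 0.0
    total = sum(score_change(a, r) for a, r in history)
    return total / len(history)


# Two agreements with Left articles, one with a Center article,
# and one disagreement with a Right article.
history = [(-1.0, AGREE), (-1.0, AGREE), (0.0, AGREE), (1.0, DISAGREE)]
print(global_user_score(history))
```

Because the score is an average, a user who agrees with 100 Left articles and a user who agrees with 2 end up with the same Global User Score, which is the point of dividing by the number of responses. Under the assumptions of this sketch, a response to an article scored near 0 also pulls the average towards 0.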
Once we have the user score, we consider how to model article bias with a numerical value. The first idea is simple: we use a number to represent the classification result, namely 1 for Right, 0 for Center, and -1 for Left. But that is not enough, and here is why. Recall that we have a text classifier with 85% accuracy. It sounds good at first, but if you do the math, it means roughly one in seven of our predicted labels is wrong, which is not ideal. Moreover, if we take a close look at the baseline, we find that the prediction accuracy for Center articles is much lower than for the other categories. This makes sense: it is a hard task even for us. How can we classify a Center article, given that we all have our own bias?
How can we deal with these problems? Besides the classifier score, we introduce a new term, User Rating, that takes user responses into account. Only experienced users with a Response History longer than 10 can affect the User Rating, because new users have not yet formed a clear stance and their scores may jump around at the early stage. We then combine the User Rating with the classifier score to produce the Article Score. This also helps us classify Center articles, as we believe a Center article should receive roughly the same number of agrees and disagrees from users on both sides. To be more specific, the formula is User Rating += User Score * 0.1 * Response, as also shown in this table below. For example, if a user with a negative score agrees with an article, the User Rating of that article decreases.
Right now, the weights of the classifier score and the User Rating are the same, namely 1. In the future, as we get more users, we can increase the weight of the User Rating and rely more on user responses. Moreover, we can use the articles that have a User Rating as a dataset to fine-tune the classifier, so the results should keep improving over time.
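To make the update rule concrete, here is a minimal Python sketch of the User Rating update and the combined Article Score. It follows the formula and the weights stated above, but the +1/-1 encoding of Response, the constant names, and the function names are assumptions made for illustration only.

```python
MIN_HISTORY = 10          # only users with more responses than this count
RATING_STEP = 0.1         # the 0.1 factor from the User Rating formula
CLASSIFIER_WEIGHT = 1.0   # current weight of the classifier score
USER_RATING_WEIGHT = 1.0  # current weight of the User Rating


def update_user_rating(rating: float, user_score: float,
                       response: int, history_len: int) -> float:
    """Apply User Rating += User Score * 0.1 * Response.

    Assumption: response is +1 for agree and -1 for disagree.
    Users with 10 or fewer past responses leave the rating unchanged.
    """
    if history_len <= MIN_HISTORY:
        return rating
    return rating + user_score * RATING_STEP * response


def article_score(classifier_score: float, user_rating: float) -> float:
    """Weighted sum of the classifier label (-1, 0, 1) and the User Rating."""
    return (CLASSIFIER_WEIGHT * classifier_score
            + USER_RATING_WEIGHT * user_rating)


# An experienced Left-leaning user (score -0.5) agrees with an article
# the classifier labeled Center (0): the rating moves towards Left.
rating = update_user_rating(0.0, user_score=-0.5, response=1, history_len=20)
print(article_score(0, rating))
```

In this sketch, if an equally experienced Right-leaning user with score 0.5 also agrees with the same article, the two updates cancel and the User Rating returns to 0, which matches the intuition that a Center article draws balanced agreement from both sides.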