I recently completed the rollout of a brand new “Rating” system for the image content on our ninemsn site. It lets users rate each image in a slideshow using the familiar “Star” method.
For example, a user might be about to rate an image “4 stars” when that image already has an “Average Rating” of 4.0 stars.
How is the “Average Rating” calculated?
Simple. Let’s say that before we rate it, 2 people rated this image 4 stars each. We can then work it out with this simple calculation:
Average Rating (4.0) = Total Rating (8) / Total People Who Rated (2)
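The same calculation as a minimal Python sketch. The function name and the choice to return 0.0 for an image nobody has rated yet are my own assumptions, not part of the system described here.

```python
def average_rating(total_rating: int, total_voters: int) -> float:
    """Return the mean star rating for one image."""
    if total_voters == 0:
        # Assumption: an unrated image is reported as 0.0 stars.
        return 0.0
    return total_rating / total_voters


# Two people rated the image 4 stars each: Total Rating = 8.
print(average_rating(8, 2))  # 4.0
```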
So, what’s the difference between Ranking and Average Rating?
Let’s take a deeper look into this. The rating system described above is used on a lot of sites like YouTube and iTunes, where users are allowed to rate content, in this case videos and music tracks. These ratings are then used to help other users discover popular content, acting as a kind of “wisdom of the crowd”, where people tend to like content that others like.
This kind of rating-based discovery is usually done by “Ranking” content based on its user ratings. You can see this on YouTube, where you can search for videos and sort the results by rating.
Now this is where it gets tricky, as Ranking is a different concept from Rating. The results in that YouTube search are ordered by Ranking, not by the average rating of each video.
Let’s say we have 2 videos that have been tagged “southpark”.
Title: “Kenny Dies Again”
Uploaded: 10 Feb 2010
So far, 100 people have rated this video and it has a “Total Rating” of 300, so its Average Rating is 3.0 stars (300 / 100).
Title: “Cartman plots to take over world”
Uploaded: 11 Feb 2011
So far, 2 people have rated this video 4 stars each, giving it a “Total Rating” of 8 and an Average Rating of 4.0 stars (8 / 2).
So if the above search request for “southpark” returned videos based on the Average Rating value then “Cartman plots to take over world” would be given a higher priority than “Kenny Dies Again”. This is clearly wrong as a lot more people rated “Kenny Dies Again” and therefore this video should be given a higher priority.
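To make the problem concrete, here is a small Python sketch that sorts the two example videos by Average Rating alone. The dictionary layout and field names are made up for illustration.

```python
# The two example videos; field names are hypothetical.
videos = [
    {"title": "Kenny Dies Again", "total_rating": 300, "votes": 100},
    {"title": "Cartman plots to take over world", "total_rating": 8, "votes": 2},
]


def average_rating(video):
    """Average Rating = Total Rating / Total People Who Rated."""
    return video["total_rating"] / video["votes"]


# Sort by Average Rating only, highest first.
by_average = sorted(videos, key=average_rating, reverse=True)

for video in by_average:
    print(f'{video["title"]}: {average_rating(video):.1f} stars from {video["votes"]} votes')
```

The video with only 2 votes comes out on top, which is exactly the ordering we want to avoid.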
So this is where Ranking becomes useful. Ranking takes the above situation into consideration and gives you a more accurate ordering of content, based on both the rating and the total number of votes each piece of content receives. In other words, it gives a more accurate picture of how people feel about a piece of content.
So in summary, average rating gives you an idea of how people feel about a single image, whereas Ranking lets you rank multiple images based on user ratings and see which is more relevant than the other.
Now that we understand the need for Ranking, how can we calculate it? One commonly used way to do this is the “Bayesian Average” calculation. Wikipedia has a good explanation of it.
Essentially, if we look at the above “southpark” search example again, video 2 had a higher average rating than video 1 only because it got far fewer votes. We need some “other constant value” to give video 1 a higher “relevancy weight” than video 2 and hence a higher ranking. This “other constant value” can be anything, so long as it is relevant to what you are trying to do and helps improve the “relevancy weight” and the quality of the calculation.
In this example, we are trying to see how a video’s rating determines its rank compared to all the other videos we have, and we know that a video rated by more people should carry more weight. So we can build the “other constant value” from the votes and ratings across all videos:
Other Constant Value = Average Number of Votes across all videos * Average Rating of all videos
This works out to be a good number to compare against as we are trying to compare our videos against all the existing rated videos. So we can use this formula to work out the Bayesian Rating of a video:
Bayesian Rating =
((Average Number of Votes across all videos * Average Rating of all videos) + (Total People Who Rated * Video’s Total Rating)) / (Average Number of Votes across all videos + Total People Who Rated)
So if we know that:
Average Number of Votes across all videos = 1500
Average Rating of all videos = 3.5
And we use this formula to calculate the ranking of video 1 and video 2 mentioned above:
Video 1: Rank = ((1500 * 3.5) + (100 * 300)) / (1500 + 100)
Therefore, Rank = 22.03
Video 2: Rank = ((1500 * 3.5) + (2 * 8)) / (1500 + 2)
Therefore, Rank = 3.5
So based on this it is very clear that video 1 has a much higher “weight” than video 2 :)
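The worked example above can be checked with a few lines of Python. This is only a sketch of the formula as written in this post; the function and variable names are my own.

```python
def rank(total_rating, votes, avg_votes, avg_rating):
    """Rank formula from this post.

    prior is the "other constant value":
    Average Number of Votes across all videos * Average Rating of all videos.
    """
    prior = avg_votes * avg_rating
    return (prior + votes * total_rating) / (avg_votes + votes)


AVG_VOTES = 1500   # Average Number of Votes across all videos
AVG_RATING = 3.5   # Average Rating of all videos

video1 = rank(total_rating=300, votes=100, avg_votes=AVG_VOTES, avg_rating=AVG_RATING)
video2 = rank(total_rating=8, votes=2, avg_votes=AVG_VOTES, avg_rating=AVG_RATING)

print(round(video1, 2), round(video2, 2))  # 22.03 3.51
```

Note that this follows the formula exactly as given here, which multiplies the number of voters by the Total Rating. The Bayesian average as it is usually stated adds the video’s Total Rating directly instead, which keeps the result on the same 1 to 5 star scale as the ratings themselves.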
The only thing you need to keep in mind here is that the “other constant value” we used above (i.e. Average Number of Votes across all videos * Average Rating of all videos) needs to be the same across all of your ranking calculations. The value itself will change over time, as both Average Number of Votes across all videos and Average Rating of all videos shift when more people vote, and that’s fine. Just make sure that when you calculate the rank for each content item on your site, you use the same value for every item. Confused?
Well, think of it this way. All the individual “Ranks” for your content would be stored in a separate database table (i.e. Video 1 = 22.03, Video 2 = 3.5), and your searches use this table to order the search results. The process of calculating the Rank and keeping it updated for all your videos would be a SQL job that runs every 5 minutes. Step 1 of this job would be to calculate a fresh “other constant value”, and step 2 would be to use that value to calculate and store the rank for each content item in the rank table. Hopefully this makes sense now.
I’m sorry if this all sounds very confusing. To be honest, I confused myself as I over-analysed the Bayesian average formula, but the secret is to keep it simple and work out a good, solid, reusable formula that you can use on your site to determine rank based on user ratings. If you have any questions, please ask using the comments section below and I’ll try my best to clarify.
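Here is a rough Python sketch of that two-step job, using an in-memory SQLite database as a stand-in for the real one. The table names (`video_rating`, `video_rank`), the column names, and the choice to treat “Average Rating of all videos” as the mean of the per-video averages are all assumptions for illustration, not the actual ninemsn schema.

```python
import sqlite3


def refresh_ranks(conn):
    """Recompute the constant once, then rewrite every rank with it."""
    cur = conn.cursor()

    # Step 1: a fresh "other constant value" from all rated videos.
    avg_votes, avg_rating = cur.execute(
        "SELECT AVG(votes), AVG(total_rating * 1.0 / votes) "
        "FROM video_rating WHERE votes > 0"
    ).fetchone()
    if avg_votes is None:
        return None  # nothing has been rated yet
    prior = avg_votes * avg_rating

    # Step 2: use that same value for every row in the rank table.
    cur.execute("DELETE FROM video_rank")
    cur.execute(
        "INSERT INTO video_rank (video_id, rank_value) "
        "SELECT video_id, (? + votes * total_rating) / (? + votes) "
        "FROM video_rating WHERE votes > 0",
        (prior, avg_votes),
    )
    conn.commit()
    return prior


# Demo with the two example videos only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE video_rating "
    "(video_id INTEGER PRIMARY KEY, total_rating INTEGER, votes INTEGER)"
)
conn.execute(
    "CREATE TABLE video_rank (video_id INTEGER PRIMARY KEY, rank_value REAL)"
)
conn.executemany(
    "INSERT INTO video_rating VALUES (?, ?, ?)",
    [(1, 300, 100), (2, 8, 2)],
)

prior = refresh_ranks(conn)

# Searches would order their results using the rank table.
ordered = conn.execute(
    "SELECT video_id, rank_value FROM video_rank ORDER BY rank_value DESC"
).fetchall()
print(ordered)
```

Because the constant is calculated once at the top of the job and passed into the single `INSERT`, every rank written in that run is based on the same value. In a real setup the function would be triggered by whatever scheduler you already use.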
Here is another good article that I found very useful:
Good luck and happy rating :)