Posted by Neil Paine on January 13, 2010
If you didn't catch it the first time around (since I know all of you read the PFR Blog now that I occasionally post over there), I highly recommend that you check out this series of posts that Doug Drinen wrote about various computer ranking systems and the methods behind them:
That last link is the topic I wanted to talk about today.
Every Friday in the BBR Rankings, I combine a pure won-lost rating with a strength-of-schedule component that factors in the point margin of each game. I combine the two this way because I think it's fair -- it rewards teams for wins and doesn't give undue credit to blowouts, while acknowledging that the best indicator of a team's "true" strength is still its margin of victory/defeat. However, logically and mathematically, this method is not exactly the most rigorous one in the world. Because the two elements are combined in a somewhat arbitrary fashion, the aim of the rating is not crystal clear -- it's certainly not predictive (nor is it intended as such), but while I call it retrodictive, it's not purely that either, because the point-margin component is itself a predictive element.
Obviously I'm still going to post them every week, but I also wanted to show you an alternative method that is the most purely retrodictive rating possible. It's called "maximum likelihood", and it seeks the set of team ratings that best explains the results that have already happened.
Think about the way the season has progressed so far, starting with last night's game between Orlando and Sacramento. The Magic beat the Kings, which is a data point for any rating system to work with, and it implies that Orlando is better than Sacramento. Therefore, all else being equal, the system would seek to create a rating that ranked Orlando ahead of Sacramento. However, all else is not equal -- Orlando has also lost this season to Indiana, Washington, Utah, and Oklahoma City, all of whom Sacramento has beaten. Because the computer can't find a ranking that is 100% consistent with every past result, the best it can do is find the ratings that make the actual results as probable as possible. It does this by assigning a probability to the actual result of each game, and then multiplying these probabilities together for the entire season, producing the likelihood that, given a certain set of ratings, the season would have played out exactly the way it has in real life. In essence, we want to try different combinations of ratings until we maximize that likelihood, hence the name of the method.
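To make the "multiply the probabilities" idea concrete, here is a minimal Python sketch. The numbers are made up purely for illustration (they are not real NBA probabilities): each list holds the probability that a hypothetical set of ratings assigned to the result that actually happened in three games.

```python
import math

# Hypothetical probabilities each candidate rating set gave to the ACTUAL
# outcome of three games (illustrative numbers, not real data).
candidate_a = [0.70, 0.60, 0.40]
candidate_b = [0.55, 0.50, 0.52]

def likelihood(probs):
    """Likelihood of the whole schedule: the product of the per-game probabilities."""
    return math.prod(probs)

like_a = likelihood(candidate_a)
like_b = likelihood(candidate_b)

# The candidate with the larger product explains the past results better.
best = "A" if like_a > like_b else "B"
print(f"A: {like_a:.3f}  B: {like_b:.3f}  better retrodiction: {best}")
```

Maximum likelihood repeats this comparison over every possible set of ratings and keeps the one with the largest product.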
If you want to know the math, for each game we'll assume the probability of the home team winning is p(hW) = exp(rH - rA + HC)/(exp(rH - rA + HC) + 1), where rH = the home team's rating, rA = the away team's rating, and HC = a home-court advantage term. When the home team wins in real life, the "likelihood" of the result is p(hW). When the road team wins, the likelihood is (1 - p(hW)). Now, instead of taking the product of all those game likelihoods, you can take the natural logarithm of each one and sum the logs over the entire season; since the logarithm is an increasing function, the ratings that maximize the sum are the same ones that maximize the product. The set of ratings that maximizes that sum is the set that best retrodicts the past. (If you have Excel, you can use the Solver tool to do this, telling it to maximize the sum of the natural logs by changing the team ratings and the home court term while keeping the sum of all ratings equal to zero.) This season, you get these ratings from the maximum likelihood method:
(Note: No, they don't all add to zero, but the solver will find the solution that both maximizes the sum of the natural logs and gets the average as close as possible to zero.)
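If you don't have Excel, the same fit can be sketched in a few lines of Python. This is only an illustration under some assumptions: the three teams and eight games below are invented, and plain gradient ascent stands in for the Solver (the step count and learning rate are arbitrary choices of mine). The probability formula is the p(hW) given above.

```python
import math

def home_win_prob(r_home, r_away, hc):
    """p(hW) = exp(rH - rA + HC) / (exp(rH - rA + HC) + 1)."""
    return 1.0 / (1.0 + math.exp(-(r_home - r_away + hc)))

def log_likelihood(games, ratings, hc):
    """Sum of the natural logs of each game's likelihood."""
    total = 0.0
    for home, away, home_won in games:
        p = home_win_prob(ratings[home], ratings[away], hc)
        total += math.log(p if home_won else 1.0 - p)
    return total

def fit(games, steps=5000, lr=0.05):
    """Gradient ascent on the log-likelihood (a stand-in for Excel's Solver)."""
    teams = sorted({team for game in games for team in game[:2]})
    ratings = {team: 0.0 for team in teams}
    hc = 0.0
    for _ in range(steps):
        grad = {team: 0.0 for team in teams}
        grad_hc = 0.0
        for home, away, home_won in games:
            p = home_win_prob(ratings[home], ratings[away], hc)
            err = (1.0 if home_won else 0.0) - p  # derivative of the log term
            grad[home] += err
            grad[away] -= err
            grad_hc += err
        for team in teams:
            ratings[team] += lr * grad[team]
        hc += lr * grad_hc
        # Re-center so the ratings average zero, like the Solver constraint.
        mean = sum(ratings.values()) / len(teams)
        for team in teams:
            ratings[team] -= mean
    return ratings, hc

# Invented results: (home team, away team, did the home team win?)
games = [
    ("A", "B", True), ("B", "A", True), ("A", "C", True), ("C", "A", False),
    ("B", "C", True), ("C", "B", True), ("A", "B", True), ("B", "C", False),
]

ratings, hc = fit(games)
for team, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{team}: {rating:+.3f}")
print(f"home-court term: {hc:+.3f}")
```

One caveat with this kind of fit: a team that is undefeated (or winless) has no finite best rating, because the likelihood keeps improving as its rating runs off toward infinity, so the toy schedule above gives every team at least one win and one loss.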
So these are the ratings that best "retrodict" the past. They are only concerned with past wins and losses (even the SOS adjustment and the HCA term are based purely on W-L), which is the polar opposite of the SRS, which only concerns itself with point differential and is chiefly interested in predicting future outcomes. And the BBR Rankings, I suppose, are a hybrid of both approaches. As always, the approach that's best depends on the philosophical goal you're trying to achieve with the rankings.