
The Moral of the Story: Humans and Computers Both Suck at Predicting

Posted by Neil Paine on June 30, 2011

One of my favorite quotes from Jonah Lehrer's oft-ripped Grantland piece was this:

"By nearly every statistical measure, the Mavs were outmanned by most of their playoff opponents. (According to one statistical analysis, the Los Angeles Lakers had four of the top five players in the series. The Miami Heat had three of the top four.) And yet, the Mavs managed to do what the best teams always do: They became more than the sum of their parts. They beat the talent."

Yep, because the stats guys were the only ones who didn't predict the playoffs with perfect accuracy:

ESPN National Expert Picks vs. Simple Rating System (no HCA), 2011 Playoffs:

Round Team A Team B Winner Adande Broussard Ford Legler Sheridan Stein Wilbon SRS
1 IND CHI CHI 1 1 1 1 1 1 1 1
1 PHI MIA MIA 1 1 1 1 1 1 1 1
1 NYK BOS BOS 1 1 1 1 1 1 1 1
1 ATL ORL ATL 0 0 0 0 0 0 0 0
1 MEM SAS MEM 0 0 0 0 0 0 0 0
1 NOR LAL LAL 1 1 1 1 1 1 1 1
1 POR DAL DAL 1 1 0 0 0 1 0 1
1 DEN OKC OKC 1 1 1 1 1 0 1 0
2 BOS MIA MIA 1 1 1 0 0 0 1
2 ATL CHI CHI 1 1 1 1 1 1 1 1
2 DAL LAL DAL 0 0 0 0 0 0 0 0
2 MEM OKC OKC 1 1 1 1 1 1 1 1
3 MIA CHI MIA 1 1 0 1 1 1 0 1
3 OKC DAL DAL 1 1 0 0 0 1 1 1
4 DAL MIA DAL 1 0 0 0 1 1 0 0

ESPN Experts: 63.5%
SRS: 66.7%
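The two percentages can be re-derived from the table. Below is a minimal sketch, with one loud assumption: the BOS-MIA row lists only seven values, and the code treats them as six expert picks followed by the SRS pick, since that reading reproduces both published figures. The names `ROWS` and `accuracy` are just illustrative.

```python
# Each row: (series, expert results, SRS result); 1 = correct pick, 0 = miss.
# Values are copied from the table above in column order.
# ASSUMPTION: the BOS-MIA row has only seven values in the table; it is
# treated here as six expert picks plus the SRS pick.
ROWS = [
    ("IND-CHI", [1, 1, 1, 1, 1, 1, 1], 1),
    ("PHI-MIA", [1, 1, 1, 1, 1, 1, 1], 1),
    ("NYK-BOS", [1, 1, 1, 1, 1, 1, 1], 1),
    ("ATL-ORL", [0, 0, 0, 0, 0, 0, 0], 0),
    ("MEM-SAS", [0, 0, 0, 0, 0, 0, 0], 0),
    ("NOR-LAL", [1, 1, 1, 1, 1, 1, 1], 1),
    ("POR-DAL", [1, 1, 0, 0, 0, 1, 0], 1),
    ("DEN-OKC", [1, 1, 1, 1, 1, 0, 1], 0),
    ("BOS-MIA", [1, 1, 1, 0, 0, 0], 1),
    ("ATL-CHI", [1, 1, 1, 1, 1, 1, 1], 1),
    ("DAL-LAL", [0, 0, 0, 0, 0, 0, 0], 0),
    ("MEM-OKC", [1, 1, 1, 1, 1, 1, 1], 1),
    ("MIA-CHI", [1, 1, 0, 1, 1, 1, 0], 1),
    ("OKC-DAL", [1, 1, 0, 0, 0, 1, 1], 1),
    ("DAL-MIA", [1, 0, 0, 0, 1, 1, 0], 0),
]


def accuracy(hits, total):
    """Return the share of correct picks as a percentage."""
    return 100.0 * hits / total


# Pool every expert pick together, as the "ESPN Experts" line does.
expert_hits = sum(sum(experts) for _, experts, _ in ROWS)
expert_picks = sum(len(experts) for _, experts, _ in ROWS)
srs_hits = sum(srs for _, _, srs in ROWS)

expert_pct = accuracy(expert_hits, expert_picks)
srs_pct = accuracy(srs_hits, len(ROWS))

print(f"ESPN experts: {expert_hits}/{expert_picks} = {expert_pct:.1f}%")
print(f"SRS: {srs_hits}/{len(ROWS)} = {srs_pct:.1f}%")
```

Under that assumption the experts go 66 for 104 and SRS goes 10 for 15, a gap of roughly three percentage points over only 15 series.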

I'm not doing this to pick on ESPN's national NBA experts, by the way. It's just my way of showing that nobody predicted the playoffs very well, and nobody saw Dallas going as far as they did. Nobody. For Jonah Lehrer to act as though statheads were the only ones who failed to see the potential for Dallas to upset L.A. and then win 4 games in a 6-game sample against the Heat is beyond absurd. (But you probably already knew that, because you've read the 600 other takedowns of Lehrer's article.)

The point: when it comes to the unpredictability of sports making them look silly, computers hardly have the market cornered.

38 Responses to “The Moral of the Story: Humans and Computers Both Suck at Predicting”

  1. Mike Jeremy Says:

    That's a pretty bad argument, IMO. Computers simply automate what humans intended them to do. They use the same (imperfect) analytical tools in predicting sports, so more or less both will make the same mistakes.

    This piece doesn't contribute anything to the whole epistemological debate in sports analysis (and even in the social sciences).

    Nobody predicted the playoffs very well, and nobody saw Dallas going as far as they did. Yes, nobody among the statheads (and computers, i.e. if you consider those instruments as "somebody").

  2. Neil Paine Says:

    #1 - "Computers" here is shorthand for any algorithm-based, unbiased method of predicting outcomes. The SRS is given as an example of the "dumbest" possible algorithm, which didn't even take into account home-court advantage and still predicted at a greater rate than a group of human experts (none of whom, I'm assuming, used hardcore stats in their predictions).

    This basically runs counter to what Lehrer was saying -- that humans need to incorporate intangible factors beyond the algorithm to predict more effectively. This data says that non-algorithm users were actually worse than the simplest possible "computer" system, and that neither did well because sports are inherently unpredictable.
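For readers who haven't seen it, the idea behind a simple rating system can be sketched in a few lines: a team's rating is its average point margin plus the average rating of its opponents, and the pick is just the higher-rated team with no home-court term. This is only an illustration under stated assumptions, not the exact procedure used for the table. The three teams and scores are hypothetical, and the plain fixed-point iteration is a convenience that settles for this small round-robin but is not guaranteed to converge for every schedule.

```python
from collections import defaultdict

# HYPOTHETICAL results, not real NBA scores: (team_a, team_b, pts_a, pts_b).
GAMES = [
    ("A", "B", 100, 90),
    ("B", "C", 95, 90),
    ("A", "C", 98, 95),
]


def simple_ratings(games, iterations=100):
    """Iterate: rating = average margin + average opponent rating."""
    margins = defaultdict(list)
    opponents = defaultdict(list)
    for a, b, pts_a, pts_b in games:
        margins[a].append(pts_a - pts_b)
        margins[b].append(pts_b - pts_a)
        opponents[a].append(b)
        opponents[b].append(a)

    avg_margin = {team: sum(m) / len(m) for team, m in margins.items()}
    ratings = dict(avg_margin)  # start from raw point differential
    for _ in range(iterations):
        # Recompute every rating from the previous round's ratings.
        ratings = {
            team: avg_margin[team]
            + sum(ratings[o] for o in opponents[team]) / len(opponents[team])
            for team in avg_margin
        }
    return ratings


ratings = simple_ratings(GAMES)


def pick(team_a, team_b):
    """Pick the higher-rated team, ignoring home-court advantage."""
    return team_a if ratings[team_a] >= ratings[team_b] else team_b


for team, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{team}: {rating:+.2f}")
print("Pick in A vs. C:", pick("A", "C"))
```

The point of the sketch is how little goes into the "dumbest" algorithm: scores and schedule, nothing else.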

  3. Dre Says:

    Preface: I picked Dallas to lose in round 1 :) (Arturo did fairly well picking them the whole way through though)

    That said, I find this scenario an interesting one. In an attempt to save face, I'll say this: if you had to bet on 4 consecutive coin flips with a slight bias in the coin on each flip, you'd hardly be considered foolish for always picking the side the bias favors. In fact, it would be the right move. And missing on 2-3 of the flips (which is what I'm assuming happened with DAL vs. LAL and MIA) would not even be a huge surprise. I do love looking at analyst and computer "predictions", and the recent Freakonomics podcast on doing that for football is great. Still, I think our sample here is small and the margin between the two was pretty tiny.

    Do you have more stuff like this, by the way? Despite a long paragraph trying to save face for being a terrible guesser (I lost to Abbott's mom in the Smackdown. . . .), this is really cool and I would love more.
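The coin-flip intuition in comment 3 is easy to check with a binomial calculation. A rough sketch, assuming a hypothetical 55% chance that the favored side comes up on each flip (the comment gives no number for the "slight bias") and assuming the flips are independent:

```python
from math import comb

P_FAVORITE = 0.55  # HYPOTHETICAL bias; the comment only says "slight"
FLIPS = 4


def prob_misses(k, n=FLIPS, p=P_FAVORITE):
    """Probability of exactly k misses in n flips when always picking the favorite."""
    return comb(n, k) * (1 - p) ** k * p ** (n - k)


p_two_or_three = prob_misses(2) + prob_misses(3)
print(f"P(2 or 3 misses in {FLIPS} flips) = {p_two_or_three:.3f}")
```

With those assumed numbers, missing two or three of the four flips happens more often than not, even though picking the favorite every time is still the best available strategy.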

  4. Bill Says:

    "Here's my problem with sabermetrics — it's a useful tool that feels like the answer. If we were smarter creatures, of course, we wouldn't get seduced by the numbers. We'd remember that not everything that matters can be measured, and that success in sports (not to mention car shopping) is shaped by a long list of intangibles."

    You've deconstructed a nice strawman, but I don't think Lehrer's argument is that sabermetrics is crap -- it's that sabermetrics will never have all of the answers, because some factors don't seem to quantify in a straightforward manner. (Stats won't predict that Dirk would tear a tendon in his hand or get the flu, for instance) Also, sometimes weird things happen. I think you're wearing your heart on your sleeve and are making a non-algorithmic, biased jump to defend your bread and butter.

    @1 "#1 - "Computers" here is shorthand for any algorithm-based, unbiased method of predicting outcomes."

    There's a world of difference between "algorithm-based" and "unbiased." I would argue most statistical methods are absolutely biased to some extent. There's still a human somewhere in the chain making at least semi-arbitrary pruning decisions in all of these algorithms.

  5. Mike Jeremy Says:

    Those ESPN analysts (sample doesn't represent the whole population of "non-algorithm" users, but anyway) scored lower because, as you pointed out, they weren't using the most objective mode of analysis. They are influenced by biases. However, that doesn't say anything on the (lack of) usefulness of intangible, non-empirical factors:

    maximization of roles
    team chemistry
    quality of coaching
    decision making
    ...and other qualitative data.

    Yes it's hard to quantify those things (if they are quantifiable at all), but as a group of objective and thinking individuals that abide by the scientific way of dealing with things, you can go back and look at the different algorithms that you have devised. Lehrer just pointed out the imperfections of your mode of analysis. Check where you might have gone wrong and how you can incorporate other factors that you didn't take into account and the things you might've overlooked.

    Better than saying "Hey nobody got it right! You can't control what the ball does, man!"

  6. mystic Says:

    Not quite sure how someone can use the fact that the Mavericks won as proof that sabermetrics doesn't work. Which team has a bigger stats department? Which team uses stats more than the Mavericks? They hired Roland Beech to run their department, and the guy was sitting on the bench helping the coaches make decisions. They paid Wayne Winston good money to get those APM data for players and lineups, to get a better impression of what is working and what is not. If anything, the Mavericks are possibly the best example of why all the teams should use those sabermetrics MORE, not less.

  7. Neil Paine Says:

    Right, the Mavs were maybe the worst possible example to choose, because their in-house metrics were used to make decisions like starting Barea. (Of course, Lehrer's argument in response to this realization is bewilderingly unfalsifiable: "I’m still sticking with my argument that a big part of the value of Barea is psychological: that little dynamo pisses defenses off.")

    More to the point, I think this whole conversation proves that anti-stat people are actually more obsessed with an ordered universe than the statheads. Where sabermetricians would shrug and say "the sample was small and the coin flipped heads 8 out of 10 times, but that doesn't change the fact that the next flip is 50-50," the anti-stat crowd is desperate to invent hidden, intangible factors that restore a sense of order to their perception of the universe.

  8. BSK Says:

    What is interesting is to look at how SRS did when there was not a strong consensus...

    In 9 series (including one involving the Mavs), there was a unanimous pick. They weren't always right, but they agreed.

    That leaves 6 series without a unanimous pick.
    In 4 of these, the computer had the right pick.
    Adande was a perfect 6 for 6 in these (his only 3 misses the whole playoff were the 3 0-fers)
    Broussard was 5-6.
    Ford was 2-6.
    Legler was 2-6.
    Sheridan was 3-6.
    Stein was 4-6.
    Wilbon was 3-5 (he didn't pick the Finals because he was commentating).
    That makes the humans a total of 25-35, slightly better than the computer.

    Not sure that this really adds anything, but thought it was interesting to look at.

    I'm too lazy to break it down further, since a couple picks were near-unanimous and some were quite split.
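The tally in comment 8 can be re-added directly from the records as listed there. A minimal sketch, using only the numbers quoted in the comment (the dictionary name `RECORDS` is illustrative):

```python
# (correct, picks) in the six non-unanimous series, as listed in comment 8.
RECORDS = {
    "Adande": (6, 6),
    "Broussard": (5, 6),
    "Ford": (2, 6),
    "Legler": (2, 6),
    "Sheridan": (3, 6),
    "Stein": (4, 6),
    "Wilbon": (3, 5),
}
COMPUTER = (4, 6)

human_hits = sum(hits for hits, _ in RECORDS.values())
human_picks = sum(picks for _, picks in RECORDS.values())
human_pct = 100.0 * human_hits / human_picks
computer_pct = 100.0 * COMPUTER[0] / COMPUTER[1]

print(f"Humans: {human_hits}/{human_picks} = {human_pct:.1f}%")
print(f"Computer: {COMPUTER[0]}/{COMPUTER[1]} = {computer_pct:.1f}%")
```

Summed this way, the listed records come to 25 correct out of 41 picks, so the "25-35" total looks like a miscount of the denominator. On these figures the pooled humans land slightly behind the computer rather than ahead, although with six series the difference is well within noise either way.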

  9. Anon Says:

    This is a poorly written article. Lehrer's preference for subjective analysis is clear as day.

    Mike Jeremy, there's a difference between seeking to improve the OBJECTIVE imperfections in the numbers - which APBRmetrics folks do all the time - and disregarding the numbers in favor of arbitrary, subjective factors such as picking teams whose stars know how to scowl for the cameras (

  10. AYC Says:

    Excellent post, BSK. I think Adande and Stein came to the same conclusion I did after the Mavs swept the Lakers: they were the best team, and this was their year. It annoys me when statheads try to chalk their success up to mere randomness and the small samples the playoffs provide. The Mavs' offensive execution in late-game situations was simply brilliant; just when most teams start to tighten up was when they played their sharpest ball. There was nothing random about it, and a team like MIA that doesn't execute well late couldn't hang with them. The statheads keep telling me that 4th quarter performance isn't any more important than the first quarter, but the 4th is when Dallas won games, time and time again. Apparently coaching, and the other 9 guys on the roster, still matter....

  11. Anon Says:

    "It annoys me when statheads try to chalk their success up to mere randomness and the small samples the playoffs provide."

    That's not even what they say either...

    "There was nothing random about it, and a team like MIA that doesn't execute well late couldn't hang with them."

    How did MIA get to the Finals then? Especially when they won a bunch of close games in the playoffs?

    "The statheads keep telling me that 4th quarter performance isn't any more important than the first quarter, but the 4th is when Dallas won games, time and time again."

    Actually, Dallas won games because they scored more points at the end of 48 minutes than the other team. Some were close games, others were decisive wins - including the Finals Game 6.

  12. CB Says:

    Well, Wayne Winston's computer said that the team with the best impact player usually wins. Dallas had the best (impact) player on the court in every playoff game (Dirk: APM/RAPM).

    so I guess the "computer" (APM) won in the end

  13. huevonkiller Says:


    What did BSK prove? Stein and his crew (in a small sample size) barely beat a rough-draft version of SRS, not adjusted for schedule and such.

    Go to this very site and you'll see that the advanced, intelligent stats said Dallas was the best team after the trade deadline. That, combined with their tactics, is what won (double teaming LeBron more than any other team, the Heat's inability to beat the Mavs 4 on 3, containing Chris Bosh, Wade getting injured and messing up some last minute plays, etc.).

  14. DJ Says:

    Since picking at random would give you 50% success, the step up to 63.5% is a big improvement over random. In short, that's pretty good prediction. And though the sample size is really too small for the numbers to count, in the long run predicting 66.7% is a lot better than 63.5%.

    As for stats vs. intangibles, it would seem reasonable to find a place for both.

    It's silly to ignore stats because of the many errors in judgment we make with respect to probabilistic judgments. And the discussion should probably start with the stats.

    But life isn't a repeatable trial--it's not a coin that we toss over and over. Some things that are completely obvious do get ignored or swept under the rug as statistical noise.

    Remember Steve Sax and Chuck Knoblauch? They threw to first just fine for several seasons. And then suddenly...

    There is a real psychological element to the game. To ignore that pressure affects how we perform seems silly. And statistics isn't a great tool for assessing things that only occur a few times in a lifetime.
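The small-sample caveat in comment 14 can be made concrete. As a rough sketch, suppose a picker with no skill at all called each of the 15 series by fair coin flip. That 50% baseline is an assumption for illustration; real series have favorites, so it flatters any predictor. How often would such a picker get at least 10 right, matching the 66.7% figure?

```python
from math import comb

SERIES = 15       # playoff series in the table
HITS_NEEDED = 10  # 10 of 15 is the 66.7% mark


def prob_at_least(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


p_by_luck = prob_at_least(HITS_NEEDED, SERIES)
print(f"P(at least {HITS_NEEDED} of {SERIES} by coin flip) = {p_by_luck:.3f}")
```

Under that assumption a pure guesser reaches 10 of 15 about 15% of the time, which is why one postseason cannot really separate a 63.5% method from a 66.7% one.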

  15. huevonkiller Says:


    Nuanced analysis of stats already exists for the playoffs, and that pretty much addresses most of the problem; see my previous post.

    And team-wise defense has been shown as the "intangible" often ignored by the media.


    There are many metrics that explain the win.

    The one year RAPM ranking looks pretty unreliable and wacky. The ten year RAPM looks decent though. Except for Pau Gasol, T-Mac, Lamarcus Aldridge, Shaq, some others.

  16. mystic Says:

    CB, correct. Wayne Winston actually said that the Mavericks should be favored to win in the finals, his numbers also had the Mavericks as the favorites for conference finals, and even when we take the minutes distribution against the Lakers into account the Mavericks would come out ahead (well, not as much as they beat them in the end, but whatever). His APM system seems like a pretty good predictor.

    Neil, I completely agree with your 2nd paragraph in #7. It is really, really weird behavior.

    Huevonkiller, just because you think those players are rated wrongly doesn't make it so. Take into account your cognitive bias, your inability to objectively judge 10 players at the same time while watching, and the number of games you actually watched during those years. How many of those 1230 games last season did you watch?

  17. Imadogg Says:

    Besides SRS, you should list all the other stats used to predict W-L in the playoffs. Comparing all these ESPN guys to all the "advanced stats" that can attempt to predict series seems more fair than comparing 7 guys to one stat.

    Adande only got 3 series wrong, the 3 that not one other person (nor SRS) got correct. Impressive sir

  18. Walter Says:

    What is often forgotten about the season is just how good the Mavs were when Dirk was healthy. They were likely the best team in the league if you exclude the results of that stretch of games where they played without Dirk, because they really tanked at that point.

    It would be interesting to calculate their advanced statistics (SRS, Off Rtg, Def Rtg, etc...) from the season and exclude that stretch of Dirk-less games and then use that to predict the playoffs as a more accurate estimate of how they would look in the playoffs. The computers may then do much much better.

    If that is the case, then it would support the computer-algorithm, advanced-stats approach, with the assumption that the human puts applicable inputs into the model.

  19. BSK Says:


    I wasn't really trying to "prove" anything and most certainly proved nothing! I was curious if the computer or humans were better going against the grain. It was basically a wash. If one of the predictors (an individual human, the humans collectively, or the computer) was significantly better than the rest when the results were seemingly less obvious, that would seem noteworthy. In this sample, no one separated him or itself.

  20. huevonkiller Says:


    They're not subjective or random ideas though. Sin embargo :), I'm simply regurgitating the other apbr research.

    Team results are different than individual results (see Wins Produced). The kryptonite of APM is noise and it shows up no matter what one does.

    Based on high Win Share players, T-Mac's and Gasol's importance is very understated. Same thing with an aging, but all-star level Shaq. Some of the other rankings are just goofy and not reasonable, based on the other metrics.

    The ten year rankings look decent, but they're still flawed. There is no end-all statistic in basketball, so one should look at everything that is available.


    Oh I know, I was just poking fun at the comment.

  21. Joseph Says:

    Barry picked Dallas over Portland in 6 -- Dallas won in 6. He had no pick for the Dallas-Lakers series (on ESPN's site). He picked Dallas in 5 against Oklahoma City -- Dallas won in 5. Finally, he picked Dallas in 6 against Miami and that was the result. I wonder what he knew that none of us did.

  22. aweb Says:

    "But life isn't a repeatable trial--it's not a coin that we toss over and over. Some things that are completely obvious do get ignored or swept under the rug as statistical noise.

    Remember Steve Sax and Chuck Knoblauch? They threw to first just fine for several seasons. And then suddenly..."

    See, this is meant to be an argument against stats and analysis, but before it happened to them, did anyone see it coming? Did a clever scout or pundit notice something "intangible" about them before it happened? No, and neither did stats. Once Sax and Knoblauch became unable to throw to first, stats showed that as well.

    Keeping it on basketball, some players become unable to make free throws, well past "bad shooting". Shaq is one example, he apparently was quite reliable in practice for many years but couldn't do it in a game. If you looked at 300 players shooting free throws in practice, would the numbers tell you which ones couldn't do well in games? Not at this point. But neither could any scout without access to the in-game information.

    Sports stats might not pick up on sudden unexpected results, but the thing is, people don't either.

  23. aweb Says:

    This also reminds me of ESPN's "Tuesday Morning Quarterback" columnist (Easterbrook), who usually compares a really simple prediction method, such as "take the best record; if tied, take the home team", to various pundits. The simple system is often better, and difficult to beat without resorting to detailed numerical analysis. His columns are usually terrible at dealing with numbers, and treat game theory from a shallow point of view, but his point here is generally the same: predicting is hard to do, and people aren't very good at it. Most people try to fit predictions to a narrative they have already chosen (such as: Miami is too talented! Miami isn't a team! Lakers always win! Chicago is the best TEAM! Once more for the Spurs! Boston wins while healthy!). No one chose a "Dallas finally puts it together" story before round 1, so people didn't predict it.

  24. Greyberger Says:

    Aweb makes a good point. Having a hand on the pulse of the NBA and knowing every major storyline intimately would actually have gotten in the way of predicting 2010-2011. The Mavericks got some attention early in the season (beyond Barkley), but by the end everybody talked about them like they were a paper tiger, old and vulnerable, etc.

  25. Mike Goodman Says:

    These Mavs were only about their 8th best team of the Nowitzki era.
    They had 2 pretty good series and 2 really good series. No bad series.
    At different times during the season, the Lakers were unbeatable, and the Lakers just could not get it together. Substitute any other team for "Lakers", and the stories happened.

  26. DJ Says:

    If you think that my earlier comment was "an argument against stats", then look again. Like at where I say "the discussion should probably start with the stats."

    My main point: the prediction rates for both humans and "computers" were actually good, if we compare them to guessing the winner at random. The title of the blog post, after all, is "humans and computers both suck at predicting." "Sucking" depends on expectations and context. If we expect 100% prediction, then, yeah, we'll be disappointed. But if you compare the actual rate to what we get by chance (50%), then we see, actually, that both humans and computers do far better than chance in predicting. And, I added, the computers had done better than the humans.

    If we are interested in prediction, statistics (or computers/algorithms) are limited because they cannot tell the story that needs to be told in certain situations. Statistics cannot recognize, for example, a real change in the ability of a player or team--whether caused by an injury or a psychological factor. Statistics cannot know whether to analyze data for the whole season, or data only for the games in which a star played. In retrospect, we can see that "computers/statistics" that looked at the whole season favored Miami, but if the same "computers" used a different segment of the season--one in which Nowitzki didn't miss games--we get Dallas as the favorite--like this.

    Because of such limitations, it's nuts to expect anything like 100% prediction from a statistical system. Because of such limitations, 67% might be really good.

    So: what is a good prediction rate? Do we judge it in comparison to other prediction systems? 67% is lousy compared to the accuracy with which we can predict the time of sunrise tomorrow. 67% may be good compared to the accuracy of predicting a baseball playoff series. If there is a system that predicts 67% of basketball playoff series outcomes, and the rest of the world only has systems that predict 64% of playoff series, then I would say the 67% system is damn good at predicting.

    And if we want to do the best predicting do we:
    1) choose at random
    2) use narrative information but no statistics
    3) use computers/statistics only
    4) use computers/statistics, but recognize the limitations of the system and overrule the system when appropriate?

  27. aweb Says:

    I think #4 is the most appropriate, except that the big question it raises is - "How do we recognize the limitations?" For where the computer went wrong and at least one of the experts got it right (Dallas vs. Miami, Dallas vs. LAL, Denver vs. OKC), is there anything about those series beforehand to indicate why the pick was wrong, aside from a deeper analysis of the numbers? The better models make more judgement calls on what data is important, but those decisions are best led by past data itself.

    How well predictions can do is certainly something to examine - the beauty of automated systems is you can feed them data from past years and get them to predict what has already happened. Bulls over Suns? Pistons over Lakers? How well do the best automated systems do over time? One year of NBA playoffs, in a year filled with major and minor upsets by NBA standards, is probably not the best way to judge experts or computers...

  28. marparker Says:

    Actually Arturo Galleti devised a model that predicted a Dallas win in every round.

    1st and 2nd round proof

    conference finals


  29. Neil Paine Says:

    #28 - Not to take anything away from his picks, but it seems like his "model" was more art than science -- he basically picked and chose when to use the full-season vs. post-break numbers, even saying before the Finals:

    "I’m putting my faith in what Dallas has done lately and trusting the post All Star Break and playoff numbers."

    Faith is definitely not science. In fact, science said "form is overrated", that there was no correlation between late-season performance and playoff success:

    Dallas, with its scorching 2nd half, happened to be the exception to that rule.

    (And of course, I was guilty of over-weighting late-season games at times during the Stat Geek Smackdown as well. But I never claimed my picks were anything but gut calls merely informed by the stats... And that's the problem with injecting gut calls into a system -- there's no way to predict ahead of time whether the human interference will add or subtract from the accuracy.)

  30. marparker Says:


    I'm not trying to make a hero out of the man. I'm putting it out there that there was a "computer" which saw this whole thing playing out like it did.

    I'm of the mindset that there is a perfect model out there that is yet undiscovered. I liked Arturo's work this postseason. Even if the model had Miami in 7, it provided some pretty good insight along the way. I like his reverse engineering concept. I think that is a huge step in the right direction.

  31. David Says:

    Neil, he actually used the combined model. A variant of the DBerri WOW stuff, I believe. The commentary is just color, really. When I read his stuff I just skim the pictures and commentary and go for the tables. I wonder how a meta-algorithm would have fared. You know, just take the mean/median prediction. In my work these types of "models" are typically best (smallest bias, lowest RMSE). I also used post-break numbers and went down in flames. But that was a conscious choice. But the teams were not who we thought they were (to channel Denny Green). So some split was needed, but which one? For DAL it seems to be Dirk. For the rest?
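The "meta-algorithm" idea in comment 31 amounts to pooling several models and taking the mean or median of their predictions. A toy sketch follows; the four model names and their win probabilities are entirely hypothetical and stand in for whatever systems one wanted to combine:

```python
from statistics import mean, median

# HYPOTHETICAL win probabilities for one team in one series, from four made-up models.
MODEL_PROBS = {
    "model_a": 0.30,
    "model_b": 0.55,
    "model_c": 0.61,
    "model_d": 0.52,
}

probs = list(MODEL_PROBS.values())
consensus_mean = mean(probs)
consensus_median = median(probs)

# Pick the team whenever the pooled probability reaches 50%.
pick_by_mean = consensus_mean >= 0.5
pick_by_median = consensus_median >= 0.5

print(f"mean = {consensus_mean:.3f}, pick team: {pick_by_mean}")
print(f"median = {consensus_median:.3f}, pick team: {pick_by_median}")
```

In this made-up case one pessimistic model drags the mean just under 50% while the median stays above it, which is the usual argument for the median when a single model is an outlier.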

  32. Ashish Says:

    It just means the computers weighed stats improperly. I personally picked Dallas over Boston by looking at SRS weighted more heavily towards the starting lineup, since starters usually have a bigger impact in the playoffs.

  33. DanielSong39 Says:

    There's really only one objective way to measure predictive success: use human and computer predictions to play in-running futures markets over the course of the regular season and playoffs.

    The proof is in the pudding and a good computer predictor will end up in the plus column over the long run.

    As for human predictors, most people will lose money in the betting markets but a small percentage of humans are good enough to win year after year. I can attest to this through personal experience.

  34. Hoops Maestro Says:

    You can't say "nobody" predicted the Dallas victory. Maybe no one in the national media. I predicted a Dallas-Miami Finals last summer after LeBron joined the Heat. (It's on the message boards at NBA Draft dot net somewhere.) Dallas had the deepest team in the NBA, and one of the most unstoppable players, and one of the best coaches. The Mavs had two solid centers, deadly outside shooting, and a few good perimeter defenders, and shot free throws exceptionally well.

    Portland was coming off recent trades and several injuries to key players. LA was getting old and playing poorly. Oklahoma City was too young and inexperienced. Miami was a top heavy team that had only been together for one season, and was weak at the key positions of PG and C.

    The Mavs were the best team in the NBA last season, but people wrote them off because of their previous recent playoff flops.

  35. Hoops Maestro Says:

    Another clear statistical clue that Dallas was going to do very, very well in the playoffs:

    Go to 82 games dot com, and look at their clutch stats for 2010-11. Then sort by the +/- per 48 minutes. The results will astound you.

    Dirk, Marion, Kidd, Chandler, and Terry are 5 of the top 6 players! No other team has any result remotely close to this. This unit was remarkably efficient in the clutch -- perhaps to an unprecedented degree. I've never seen anything like it before. This wasn't a fluke sample either -- it covered 36 to 48 games, including 77 to 154 minutes of "clutch" time production.

  36. marparker Says:


    in 07-08 Cleveland had 6 of the top 8.

    in 09-10 Dallas and Cleveland between them held every spot in the top 7, with Cleveland having 4 of the players

    However, in 08-09 Lakers had 3 of the top 6. Cleveland shared the other 3 spots.

    In the end I'm not sure what that tells us.

  37. huevonkiller Says:


    Pau Gasol played the most minutes of his career, he's on the wrong side of 30, and was complaining about fatigue throughout the season. I've been critical of Phil Jackson's coaching the past few years, and he failed this time like he has in other instances (Fisher, Smush Parker, Luke Walton, no Shannon, etc.). With the depth LA had there's no reason to put Pau out there that much. No need to downplay your own site, this place did a fine job of supporting my notion about health.

    I know a huge Laker fan that said they would get dominated and lose easily.... Simply because Pau and Kobe looked done. It is that simple, you shouldn't overlook that factor Neil. That isn't gut based, their statistical decline is tangible in the first round. Life isn't completely fair and people get injured at the wrong time, or play too many minutes, or whatever.

    Dallas was the best team in the NBA when healthy; the stats clearly showed that before the Finals and through the post-season. It didn't help that Mike Bibby had the lowest playoff PER ever, or that Miami had much more wear on them with their attacking playing style and high MPG. Lastly, James Jones was Miami's fourth best player this season and he didn't get on the court at all.

  38. huevonkiller Says:

    Second most minutes, sorry. :d