If you’ve ever played Fantasy Football, you’ve been burned by “Projections.” (For example: you start Wide Receiver A over Wide Receiver B because ESPN predicts that A will score 10 more points than B. On Sunday, A catches two passes for 10 yards; B hauls in three touchdowns. You lose. You blame ESPN.)

Most players dismiss faulty projections as either some kind of karmic retribution or personal failing, leading to random, hyperbolic statements like, “I’ll never bench Calvin Johnson in a Dome again.”

However, if you’re a Data Scientist – and your job hinges on making sense of numbers – these seemingly arbitrary figures need to be held accountable.

So, Datascope Analytics – a Chicago-based data science and design firm – combined pages and pages of historical projections and weekly scoring with a little homemade code to determine if ESPN’s weekly projections are accurate or not.

On their blog, the firm deep-dives into exactly how and why they analyzed the data, but the basic conclusion was that ESPN does consistently over-project for ‘Fantasy Relevant Players,’ or players that are most likely starting on teams.

The data is complex, though, and it can be interpreted in a variety of ways – for example, when Datascope took ‘underlying projections’ into account, ESPN’s predictions came off looking much better. So we talked to the lead Data Scientist on this project about his takeaways, how ESPN can assemble better projections, and the best way to use these numbers in Fantasy Football.

Chicago Inno: Now that it’s all said and done, are you impressed with ESPN’s projections or disappointed by them? 

Greg Reda: I’m hesitant to say their projections are bad, but they’re not great either. I think it’s really tough to project something as varying as an individual player’s performance. However, they make a huge mistake by not approaching things probabilistically.

For instance, let’s assume the Lions are going to score two touchdowns against the Bears next week. ESPN predicts Matthew Stafford is going to pass for both – one to Calvin Johnson, and the other to Golden Tate.

That’s a pretty hard-lined approach to prediction – “This guy gets one, the other guy gets one. The end.” Their approach doesn’t factor in any measure of uncertainty. Why not spread the wealth? For example, predict Calvin Johnson to score 0.75 touchdowns, Golden Tate to score 0.5, Joique Bell to score 0.5, and Reggie Bush to score 0.25. Sure, in real life a player can’t score half a touchdown, but in predicting things that way, ESPN would be assigning a level of confidence to their projections; they would effectively be saying “We think Calvin Johnson is more likely to score than Golden Tate, but Tate still has a good chance of getting in the end-zone.”
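To make the arithmetic in Reda’s example concrete, here is a minimal Python sketch. The fractional touchdown figures are the ones he gives above; the 6-points-per-touchdown value and the function name are assumptions for illustration only, not ESPN’s or Datascope’s method, and scoring rules vary by league.

```python
# Expected touchdowns per player, taken from the example in the interview.
expected_tds = {
    "Calvin Johnson": 0.75,
    "Golden Tate": 0.5,
    "Joique Bell": 0.5,
    "Reggie Bush": 0.25,
}

# Assumption: 6 fantasy points per touchdown (league settings vary).
POINTS_PER_TD = 6


def expected_td_points(shares, points_per_td=POINTS_PER_TD):
    """Convert expected touchdowns into expected fantasy points from touchdowns."""
    return {player: tds * points_per_td for player, tds in shares.items()}


# The fractional shares still add up to the two touchdowns the team is expected to score.
total_tds = sum(expected_tds.values())
td_points = expected_td_points(expected_tds)

print(f"Team total: {total_tds} touchdowns")
for player, points in td_points.items():
    print(f"{player}: {points} expected points from touchdowns")
```

The team total is unchanged at two touchdowns; what changes is that each player’s number now reflects how likely he is to be the one who scores.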

All too often we boil things down to a single, easily understood number in terms of measurement – “Jay Cutler is going to score 18 points this week” or “Jay Cutler averages 15 points a game” – but in doing so, we neglect to consider the likelihood of that event. Does that player’s performance vary more than others? When boiling things down to a single-number prediction, you fail to capture the uncertainty in your prediction.

CI: Does this change how you approach viewing ESPN projections and adjusting your roster?

GR: It doesn’t change how I adjust my roster. I’ve always felt that ESPN tended to over-project, but hadn’t looked into it until now. Others in my league had felt that way too. I think that speaks to how good humans are at recognizing patterns – ESPN only over-projects 3 out of 5 times – not much worse than 50/50.

That said, it also speaks much more to our negativity bias – we’re more likely to remember things that have a negative impact on us than those that have a neutral or positive one.

I tend to use FantasyPros since it aggregates projections from multiple sources. I think it’s a good way to see the uncertainty in the predictions – one guy might say 15 points, while the other guy says 7 – it’s a good way of seeing how “risky” a player is.
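One rough way to put a number on that kind of disagreement is to look at the spread across sources. The sketch below uses the two figures Reda mentions (15 and 7 points); the function name and the choice of range and sample standard deviation as “risk” measures are assumptions for illustration, not how FantasyPros presents its data.

```python
from statistics import mean, stdev


def summarize_projections(projections):
    """Summarize projections for one player from several sources.

    Returns the consensus (mean) along with two simple spread measures.
    A wider spread means the sources disagree more about the player.
    """
    return {
        "mean": mean(projections),
        "range": max(projections) - min(projections),
        "stdev": stdev(projections),  # sample standard deviation; needs 2+ values
    }


# The two projections from the example: one source says 15 points, another says 7.
summary = summarize_projections([15, 7])

print(f"Consensus: {summary['mean']} points")
print(f"Range: {summary['range']} points")
print(f"Standard deviation: {summary['stdev']:.2f} points")
```

Two players with the same consensus projection can have very different spreads, and the wider one is the “riskier” start in the sense described above.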

(Image via Datascopeanalytics.com)