Every week after the EPL games are completed, I post a couple of graphics – the Expected Final Table and each team’s Deviation from Expected Points. I realized I’ve never really explained how I do this and why they’re important, so I wanted to take a couple of minutes to explain what they are.
At the beginning of the season, I calculated the probability of each outcome for each game.¹ So for every game, each team has a probability of winning, a probability of losing, and a probability of drawing. I calculate the expected points for a game through a simple formula:
Expected points = 3 × Pr(win) + 1 × Pr(draw) + 0 × Pr(loss)
Fairly straightforward – three points for a win, so each team is expected to earn 3 × the probability of a win; one point for a draw, so 1 × the probability of a draw; and nothing for a loss. Adding up this number for a single team across all 38 games gives me their expected points total for the season, and doing this across all 20 EPL teams gives me the final table.
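To make the formula concrete, here is a minimal sketch in Python. The `MatchProbs` type, the function names, and the probabilities are all made up for illustration; they are not the model's actual outputs or the code behind the graphics.

```python
from typing import Iterable, NamedTuple


class MatchProbs(NamedTuple):
    """Outcome probabilities for one team in one game (hypothetical container)."""
    win: float
    draw: float
    loss: float


def expected_points(p: MatchProbs) -> float:
    # Three points for a win, one for a draw, none for a loss.
    return 3 * p.win + 1 * p.draw + 0 * p.loss


def season_expected_points(fixtures: Iterable[MatchProbs]) -> float:
    # Sum the per-game expected points over every game on the schedule.
    return sum(expected_points(p) for p in fixtures)


# Made-up probabilities for a two-game "season", for illustration only.
fixtures = [MatchProbs(0.5, 0.3, 0.2), MatchProbs(0.2, 0.3, 0.5)]
season_total = season_expected_points(fixtures)
print(round(season_total, 2))
```

With a real schedule, `fixtures` would hold 38 entries per team, one per game.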
The next step in the process is, after each game completes, to replace the expected point total for that game with the actual point total. Then I recalculate the expected points by adding the actual points earned for the games completed and the expected points for the remaining games, re-doing the final table with those values. Here is the most recent table:
I like this measure for one major reason – it controls for both strength of schedule so far and remaining strength of schedule. Before each team has played all the other teams, there can be huge discrepancies in how many points they’re expected to have earned, but the regular table doesn’t show that; it only shows how many points they’ve earned. As we get further into the season, this can become more important, especially in the title and relegation fights, because we’ll know exactly how each team is expected to do in their remaining games without having to do weird mental gymnastics like “Well, Norwich City has games against Chelsea and Spurs left, while Bournemouth is 2 points behind but has West Brom and Sunderland.” This does all that mental math for us “behind the scenes” so we don’t have to guesstimate whether Bournemouth will catch up to Norwich in this hypothetical situation.
It also lets us avoid “streak story-telling” where we overextrapolate from a few early results. Right now one of the big stories is that Chelsea is struggling – they’re currently in 13th place and everyone is worrying that the sky is falling. We all know they’re not going to finish there, but we don’t know what this early-season slump has done to their chances just by looking at the table. This measure lets us see exactly what we’d expect to happen by the end of the season and what would have to happen for them to catch up.
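The update step behind the expected final table (actual points for completed games plus expected points for the remaining ones) can be sketched roughly as follows. This is an illustration only: the team names, the three-game schedule, and every number are invented, and the sketch assumes each team's games are listed in the order they are played.

```python
def projected_points(expected_by_game, actual_by_game):
    """Actual points for games already played plus expected points for the rest.

    expected_by_game: pre-season expected points for each game, in fixture order.
    actual_by_game:   points actually earned (0, 1, or 3) in the games played so far.
    """
    played = len(actual_by_game)
    return sum(actual_by_game) + sum(expected_by_game[played:])


def projected_table(expected, actual):
    # Build (team, projected points) pairs and sort from most to fewest points.
    totals = {
        team: projected_points(games, actual.get(team, []))
        for team, games in expected.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Invented three-game season for two placeholder teams.
expected = {"Team A": [2.0, 1.5, 1.0], "Team B": [1.0, 1.5, 2.0]}
actual = {"Team A": [0], "Team B": [3]}  # one game played each

table = projected_table(expected, actual)
for team, points in table:
    print(team, points)
```

Because the remaining games enter as expected points, a team with a hard run-in is marked down automatically, which is the strength-of-schedule adjustment described above.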
The next graphic is the deviation from expectation. As an example, before week 1 Arsenal was expected to earn 82 points.² They were big favorites at home against West Ham in week 1, expected to earn 2.63 points. They lost, meaning they earned 0 points. So their expected points at the end of the season went from 82 to 79.37,³ a deviation of -2.63 points.
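The Arsenal arithmetic can be written out as a short Python sketch. The function names are my own for illustration, and the 82-point pre-season total is the approximate figure used in the example above, not an exact model output.

```python
def deviation(actual_points, expected_points):
    # Positive means the team beat expectations; negative means it fell short.
    return actual_points - expected_points


def cumulative_deviation(actual_by_game, expected_by_game):
    # Sum of per-game deviations over the games played so far.
    return sum(deviation(a, e) for a, e in zip(actual_by_game, expected_by_game))


preseason_total = 82.0  # approximate pre-season expected points from the example
expected_game = 2.63    # expected points at home against West Ham
actual_game = 0         # Arsenal lost, so they earned nothing

dev = deviation(actual_game, expected_game)
updated_total = preseason_total + dev

print(round(dev, 2))            # -2.63
print(round(updated_total, 2))  # 79.37
```

Summing these per-game deviations with `cumulative_deviation` gives a running total of how far a team sits above or below expectations.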
This measure does a couple of things. First, it lets me diagnose how well the model is doing so far.⁴ Second, it lets us see whether teams are exceeding, meeting, or falling below expectations. We expected Chelsea to perform at a much higher level than they have so far, and this model can quantify exactly how poor they’ve been compared to expectations. Similarly, we know Leicester City has been exceeding expectations, but this lets us quantify that. And despite a relatively slow start from Everton, they’re actually performing almost even with what we’d expect. Finally, we can see that Manchester City’s strong start has them performing significantly above expectations, and we can likely expect a little slump at some point as they regress to the mean.
Here is this week’s chart as an example:
Hopefully this explains the method a little more clearly for anyone who is interested – check my Twitter (@Soccermetric) or this webpage for weekly updates as the season progresses.
1. The method is available at http://soccer.chadmurphy.org/methods/predicting-late-season-outcomes-the-method/
2. I don’t remember exactly how many points it was, but this is close enough for demonstration purposes.
3. 82 - 2.63 = 79.37
4. Right now the expected value correlates at 0.48, which isn’t bad this early in the season but I’d like to see higher. I’ll write a blog post about proper hypothesis testing later because I think that’s important for the analytics community.