Parental Discretion Advised: Over-generalizations from incredibly small sample sizes to follow.
Those of you who follow me on Twitter (which I assume is almost all of my readers, but if not, you should follow me @Soccermetric) probably know I debuted prediction models for Serie A and the Bundesliga this weekend. They’re based on the TAM model (which needs a better nickname; I want to backronym NAGBE if possible), which is described in full here. The short version is that it looks at essentially two variables: in-season results and goal differential. The math is a lot more complicated than that, but this is the basic idea. The TAM model hasn’t done especially well through its first two weeks in the EPL, predicting 8/20 matches correctly, a meager 40%, or barely better than the 33% we’d expect from just flipping a three-sided coin.
However, it did much better in Serie A and the Bundesliga this weekend. It predicted 6/9 correct this week in the Bundesliga, or 67%. Here are the first week’s predictions:
The misses were the Wolfsburg, Hamburg, and Hertha Berlin matches. I’m incredibly pleased with 6/9, and am equally pleased with the 6/10 in Serie A (especially since one of the misses was Milan’s win in the Derby).
In one week, the two new models each got 6 outcomes correct, which is almost as many as the 8 the same model got correct in two weeks of the EPL (MOTSON outperforms the simple model). This is obviously a small sample, and counting “correct” outcomes isn’t the right way to measure this, but it’s evidence that maybe the EPL is just really difficult to predict (especially this season). I’m not planning on putting together a model for La Liga, but that one might be even easier: you’d be hard-pressed to put together a bad model as long as you started with Barcelona > Real Madrid = Atletico.
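For anyone who wants to check the arithmetic, here is a rough Python sketch that puts the three hit rates next to the three-sided-coin baseline. It assumes every match is an independent guess with a 1/3 chance of being right, which is a simplification (draws are not actually as likely as wins), and the function names are mine, not part of the TAM model. It is a sanity check on small samples, not a proper evaluation.

```python
from math import comb

def hit_rate(correct, total):
    """Share of matches where the predicted outcome was right."""
    return correct / total

def tail_prob(correct, total, p=1/3):
    """Chance of getting at least `correct` right out of `total`
    by pure guessing, assuming independent matches with success
    probability p (the three-sided coin baseline)."""
    return sum(
        comb(total, k) * p**k * (1 - p) ** (total - k)
        for k in range(correct, total + 1)
    )

# (correct, total) as reported in this post
results = {
    "EPL (two weeks)": (8, 20),
    "Bundesliga (one week)": (6, 9),
    "Serie A (one week)": (6, 10),
}

for league, (correct, total) in results.items():
    rate = hit_rate(correct, total)
    chance = tail_prob(correct, total)
    print(f"{league}: {correct}/{total} = {rate:.0%}, "
          f"P(at least {correct} by guessing) = {chance:.3f}")
```

The smaller the guessing probability, the harder it is to write the result off as luck, though with nine or ten matches per league none of this should be taken too seriously.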
The success of a model depends on the difficulty of the task. If the same model performs much better in other leagues, that may be evidence that EPL results are harder to predict than those of the other major European leagues, and we should adjust our expectations of our models accordingly.