I’ve been surprised by how well Twitter has embraced my model (MOTSON: “Model of the Same Old Nonsense”), and I feel fortunate that people are interested in what was initially a fun side project for me. The same questions pop up in my mentions every game day, so to save some repetition and to help new followers, I wanted to post a list of answers to the “frequently asked questions” I get.
What underlying stats do you use?
There’s a whole post about the method for anyone who is interested, but basically I use a number of offensive and defensive statistics from last season (2014-2015) combined with a “team strength” coefficient calculated from last year’s results.
Your model really likes Chelsea, what’s up with that?
The predictions I posted were all made based on 2014-2015 statistics, and more importantly the “team strength” coefficient was calculated from a season in which Chelsea won the Premier League title. The model thinks they are good, especially at home, where they were undefeated. They are not good this season, so my model has really struggled with over-valuing them.
Your model isn’t giving Leicester City enough credit this week, they’re way better than ______.
Just like with Chelsea, I’m using last year’s numbers. Leicester City have dramatically out-performed expectations this season. Unlike Chelsea, who would shock me if they climbed back to anywhere near their pre-season predictions, Leicester may still regress to the mean. We’ll see.
But Chelsea aren’t any good this year. Why don’t you update your predictions?
This was a personal decision: I’ve chosen not to update the model during the season so I can see how it does. Creating a model based on recent results is a different challenge, and there are a number of ways to tackle it, but that’s not my goal here. Initially I aimed to create a model where I could measure individual player contributions, and to do that I wanted to predict season performance. The individual game predictions were a byproduct of this model, and they’ve become far more popular than I had imagined, but they’re more of a diagnostic of how my model is doing overall.
I could update the model with a more recent team strength coefficient, but I’m a big believer in letting the model run its course.
Why does your model like Arsenal so much?
One of the features of the SVM model is that I can’t tell which individual variables drive any given prediction. I do know it predicts a pretty significant home field advantage. Other than that, it’s a black box that I can’t unpack.
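For anyone curious how you can see a home field advantage in a model you can't open up, here is a minimal sketch of one common approach: ask for the same fixture twice with the venue swapped and compare the answers. Everything in it is an assumption made for illustration. `black_box_predict` is a made-up stand-in with invented numbers, not the actual MOTSON SVM, and the single "strength" input is a simplification of whatever inputs a real model would take.

```python
import math


def black_box_predict(home_strength, away_strength):
    """Hypothetical stand-in for a trained model (NOT the real MOTSON SVM).

    Returns (P(home win), P(draw), P(away win)). The probe below treats this
    function as opaque and only looks at its inputs and outputs.
    """
    # Made-up internals: a strength gap plus an arbitrary bonus for the home side.
    gap = 1.5 * (home_strength - away_strength)
    scores = (gap + 0.4, 0.0, -gap)
    weights = [math.exp(s) for s in scores]
    total = sum(weights)
    return tuple(w / total for w in weights)


def home_advantage_probe(predict, strength_a, strength_b):
    """Return P(A wins when hosting B) minus P(A wins when visiting B)."""
    p_a_wins_at_home = predict(strength_a, strength_b)[0]
    # With the venue swapped, team A is the away side, so read the away-win slot.
    p_a_wins_away = predict(strength_b, strength_a)[2]
    return p_a_wins_at_home - p_a_wins_away


# Two evenly matched (hypothetical) teams: any gap is down to the venue alone.
gap = home_advantage_probe(black_box_predict, 0.6, 0.6)
print(f"Change in win probability from playing at home: {gap:+.3f}")
```

A positive gap means the model favours whichever team is at home, even though nothing here explains why. That is about as far as this kind of probing goes with a black box.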