Game Theory: Rotate or Full-Strength Squad in the League Cup Round of 16

Building on my previous “Game theory” post about how it’s rational to play a weaker squad in the Champions League compared to the EPL, I wanted to walk through the logic of doing so in the League Cup Round of 16.

The general idea is the same: you rotate if immediate results in the league matter more to you, and you play a full-strength squad if the cup matters more. Several factors shape that decision:

  • Your odds of winning today’s match with a full-strength squad vs. a rotated squad
  • Your odds of winning the Cup
  • Your league position relative to where you want to be, and whether you can risk dropping points this weekend
  • Psychological benefits
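
The factors above can be folded into a rough expected-value comparison. The sketch below is only an illustration: the function names, the probabilities, and the "value" weights are all made up for the example, and it assumes that a full-strength cup lineup slightly lowers the chance of winning the weekend league match.

```python
def squad_choice_value(p_win_cup_tie, p_win_league_match,
                       value_of_cup_run, value_of_league_win):
    """Expected value of one lineup choice for the midweek cup tie.

    All four inputs are hypothetical; the post names the factors but
    does not put numbers on them.
    """
    return (p_win_cup_tie * value_of_cup_run
            + p_win_league_match * value_of_league_win)


def should_play_full_strength(full, rotated, value_of_cup_run, value_of_league_win):
    """Compare two lineups, each given as (p_win_cup_tie, p_win_league_match)."""
    ev_full = squad_choice_value(*full, value_of_cup_run, value_of_league_win)
    ev_rotated = squad_choice_value(*rotated, value_of_cup_run, value_of_league_win)
    return ev_full > ev_rotated


# Made-up probabilities: the full-strength XI helps today but costs a
# little freshness at the weekend.
full = (0.70, 0.40)
rotated = (0.50, 0.45)

# A club that prizes a cup run (trophy odds plus the psychological lift).
cup_minded = should_play_full_strength(
    full, rotated, value_of_cup_run=3.0, value_of_league_win=1.0)

# A title chaser with no league points to spare.
league_minded = should_play_full_strength(
    full, rotated, value_of_cup_run=0.5, value_of_league_win=3.0)

print(cup_minded, league_minded)
```

With these invented numbers the cup-minded club plays its full-strength squad and the title chaser rotates. The point is only that the same probabilities give different answers once the weights change.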

So I wanted to walk through the logic for a couple of teams playing today to see what the rational choice would be.

Leicester City

Leicester City has been over-achieving in the league, with 19 points through 10 games as of today. You’d have to think they would have been ecstatic with a mid-table finish at the beginning of the season, so they’re well ahead of schedule by any measure. They can afford to drop points this weekend in the league.

Similarly, they’re playing a weaker opponent they’d be expected to beat in Hull City. Rotating the squad could make a significant difference there, but if they play a full-strength squad they’d be significant favorites, even on the road.

Their odds of winning the cup aren’t great, but a lot of that depends on the draw and what Arsenal, Chelsea, and the two Manchester teams plan on doing.

The psychological benefits of making a run in a tournament like this, and potentially getting a game at Wembley and winning a trophy, could be huge for a team like Leicester.

Based on these factors, my “model’s” prediction would be a full-strength squad, or close to it.


Arsenal

Arsenal have to be considered heavy favorites over Sheffield Wednesday, but this is also likely true for a rotated squad. A club the size of Arsenal should be able to play their second-best XI and beat Sheffield Wednesday.

As one of the top teams in England, Arsenal have to be considered one of the favorites to win the Cup. However, they’re probably co-favorites with Manchester City, Manchester United, and Chelsea (if Chelsea ever gets their act together…).

Arsenal’s league position is roughly where they want it, tied with Manchester City at the top of the league. However, because they’re tied they don’t really have any room to spare and can’t afford to take any risks in the league if they want to mount a serious title challenge. And because they’re Arsenal, they may want to be extra careful to avoid tempting the fates.

Psychological benefits are minimal – a loss with a rotated squad to Sheffield Wednesday would get a couple snickers in the papers tomorrow, but no one would actually think badly of them if they lost. Similarly, they’ve won the FA Cup two years in a row so there’s no burning need to win a lesser trophy anytime soon.

Prediction: Arsenal rotates.


Chelsea

Chelsea is a tough case.1 They’re playing Stoke, so they should be favored fairly heavily. That being said, they’ve been playing well below form, so maybe even a full-strength squad doesn’t win today. You’d struggle to assign them a win probability right now. What does seem clear is that Stoke is a good enough team that they should be able to beat a fully rotated Chelsea squad.

Also, who knows what Chelsea’s odds are of winning the tournament? They beat Arsenal earlier in the season, but that looks more and more like a fluke with every game that passes. You’d have to think they’re not favored over either of the Manchester clubs right now, nor would they be favored over Arsenal. A win today likely just gets them a tough game next round and an exit by the semi-finals.

Chelsea’s league position is abysmal by their standards (15th through 10 games), and clearly they can’t afford to lose any more ground if they want to keep their chances alive for a 4th place finish. The league is clearly the priority right now.

Psychological benefits: these are tricky. They could use a win right now to maybe build some momentum, building #confidence in the locker room and among the fans. Mourinho is clearly trying to fix things on the fly (see Mike Goodman’s great piece on this), and he could use some reps with some new tactical tweaks in a setting where there are no real consequences. Mourinho presumably could use a win to get some pressure off of him, especially given the rumors that he might be on his way out soon.

Prediction: tough to call, but the psychological benefits might outweigh other considerations.

  1. Spoiler alert, they played a full-strength squad

Manchester City’s Midfield Depth Problem

This week’s Manchester Derby was fairly uneventful, with City registering the first shot on goal for either team somewhere in the middle of the second half, and the game ending in a 0-0 draw. City was without two of their best attacking players (Sergio Aguero and David Silva), and it showed. To compensate, Manuel Pellegrini moved Yaya Toure up from his typical deep-lying midfielder/box-to-box role to a more attacking role behind Wilfried Bony, who would often come back deeper to link the defense to Toure in attack. From there, presumably the plan was for Toure to distribute the ball to Sterling and de Bruyne on the wing who would then either cut inside or cross the ball to Bony. I say “presumably” because de Bruyne usually either kicked the ball to the nearest Manchester United defender or as far as he could over the touchline, wasting the few opportunities City had to attack.

This plan worked relatively well up until the wingers got the ball, and as a City supporter I couldn’t have been happier with the Bony/Toure linkup. However, the American commentators focused quite a bit on how much City missed Aguero, and while he’s one of the best strikers in the EPL, I disagree that missing him was the problem. Running the quick numbers through my model, Bony as a replacement for Aguero is only a couple point downgrade over the course of a season. Missing Silva was the problem, and City needs a backup for him.1 Who can they get?

First, I looked at my “Points Above Replacement” spreadsheet, and confirmed the conventional wisdom: Yaya Toure and David Silva are both in the top 25 players in my database at their given position for Manchester City.2 From the few players who were improvements, I looked for players who could play at least centrally in addition to their primary position as either a defensive mid or attacking mid, and I eliminated players who come from rival teams who would be unlikely to sell.3 After these filters, I was left with what I consider six good options.4

Midfield Reinforcements Manchester City

The barplot shows the change in expected points for each of the six players I found as options for Manchester City. The best option according to my model is Swansea City’s midfielder Ki Sung-Yueng. He’s in his prime (he’ll be 27 in January), would be relatively easily buyable for a “big club” like Manchester City, and can play either as a defensive midfielder or more centrally.

The next best option, for me, is Milan Badelj. He’s the same age, and can play both centrally and as a defensive mid, and based on his history would be reasonably affordable to buy from Fiorentina.

Gary Medel and Daniele de Rossi are probably my least favorite buys on the list: both are older, de Rossi’s probably unbuyable and I’m not sure what sort of price tag it takes to buy a player from Inter Milan these days.

The other options are the young stars: Ilkay Gundogan, Marco Verratti, and Lorenzo Crisetig. Of the three, Gundogan’s price tag is probably too high, and he reportedly said “no” to Manchester United this summer. Crisetig is the biggest surprise on the list (my model likes him for Arsenal too), but he’s young with a big upside for me, and wouldn’t be too expensive as a speculative buy. I like Marco Verratti a lot, and PSG *may* need to sell someone if there’s any truth to the rumors that they’re going to buy Cristiano Ronaldo. He could be a long-term replacement for either of City’s two aging stars, so if he’s buyable I think City should pursue him.

To be clear, this is just a starting point. If I’m in charge at City, I’m surprised by how much the model likes Ki Sung-Yueng so I send scouts to every Swansea City game between now and January 1 and watch every bit of video I can get on him to see how well he’d fit the team’s style and how well he could slot in for either Toure or Silva. Same with Verratti, Badelj, and Crisetig. City’s depth could be their biggest weapon, but it was clear today that they don’t have a great second option for when David Silva is out and that could be what stops them from catching Arsenal.



  1. My model actually thinks Patrick Roberts would be a good replacement for him, but clearly Pellegrini doesn’t trust him as much as my model does so I’m operating under that assumption here.
  2. Yaya Toure is #16 in the box-to-box role, and Silva is #23 in the CAM role for Manchester City
  3. My model really likes Daley Blind as a replacement for both of them, but he’s obviously not an option
  4. Really five good options and Daniele de Rossi, but I’m such a huge fan of his I always like to add him when I’m talking box-to-box midfielders

There is no Debate: Everything is Analytics, Just Using Different Words

We’ve had a little time since the last major newspaper column about ambient temperature and analytics, so I wanted to post something I’ve been thinking about for a while now: everything is analytics, whether you call it that or not. The two sides don’t complement each other because there aren’t two sides. Unless you purely watch soccer for the artistic brilliance of the game and make zero judgments on the game, you’re analyzing things. You’re deciding which players are good and which players are bad, which team is good and which team is bad, who should have won a game, and who will win your favorite competition. This is the exact same thing the math folks are doing; what we think of as “analytics” is just a more formal way of doing it than most people use.

I’m not going to write a long-winded rhetorical explanation of this point; instead I’ll just provide a few quick examples:

“Real football men” say: “Napoli outplayed Fiorentina last weekend and really deserved their win.”

“Analytics” says: “Napoli’s xG total was higher than Fiorentina’s, so you’d expect them to win.”


“Real football men” say: “Walcott should have scored on his header, and Ozil’s goal was an easy finish.”

“Analytics” says: “Walcott’s header had an xG of 0.34, and Ozil’s was 0.84.”

“Real football men” say: “Arsenal’s playing well and could mount a real title challenge.”

“Analytics” says: “If Arsenal continues to meet expectations, they have a 53% chance of winning the league this year.”

Heat Map Week 9

“Real football men” say: “Chelsea is playing horribly this year.”

“Analytics” says: “Chelsea’s underachieving by about 7.5 points so far and needs to turn things around.”

Deviation Bar Week 9-2

“Real football men” say: “Leverkusen was unlucky to not win against Augsburg.”

“Analytics” says: “The Expected Goals values mean Leverkusen would have beaten Augsburg 84% of the time.”


“Real football men” say: “Fernando Torres has his confidence back!”

“Analytics” quietly turn up the air conditioning and weep at their desks….ok, so not everything is analytics.
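
For readers curious how per-shot xG values can turn into a statement like "would have won X% of the time," here is one common way to do it, offered as a sketch and not as the method behind the 84% figure above. It assumes every shot is an independent event that goes in with probability equal to its xG. The function names and the opponent's shot values are hypothetical; only the 0.34 and 0.84 come from the examples above.

```python
def goal_distribution(shot_xgs):
    """Probability of scoring exactly k goals, for k = 0..len(shot_xgs).

    Assumes each shot is an independent coin flip weighted by its xG.
    """
    dist = [1.0]  # before any shots, zero goals is certain
    for xg in shot_xgs:
        updated = [0.0] * (len(dist) + 1)
        for goals, prob in enumerate(dist):
            updated[goals] += prob * (1 - xg)   # this shot misses
            updated[goals + 1] += prob * xg     # this shot scores
        dist = updated
    return dist


def outcome_probabilities(team_xgs, opponent_xgs):
    """Return (win, draw, loss) probabilities for the first team."""
    team = goal_distribution(team_xgs)
    opponent = goal_distribution(opponent_xgs)
    win = sum(pt * po for t, pt in enumerate(team)
              for o, po in enumerate(opponent) if t > o)
    draw = sum(pt * po for t, pt in enumerate(team)
               for o, po in enumerate(opponent) if t == o)
    return win, draw, 1.0 - win - draw


# The two shot values quoted above, plus two invented opponent shots.
arsenal_shots = [0.34, 0.84]
opponent_shots = [0.10, 0.05]  # hypothetical

win, draw, loss = outcome_probabilities(arsenal_shots, opponent_shots)
print(f"win {win:.3f}, draw {draw:.3f}, loss {loss:.3f}")
```

Summing the shot values gives the familiar xG total (1.18 here), while the distribution adds the spread around that total, which is what a "would have won" percentage needs.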

Follow me on Twitter @Soccermetric


Chelsea Can Pick Up Ground The Next Four Games

Probably the biggest story of this season so far has been how Chelsea is underachieving. As of today they’ve earned 11 points through 9 games, and sit in 12th place in the league table, falling about 7 points below my model’s expectations (which put them in second place at the beginning of the season).

Deviation Bar Week 9-2

They’re the second biggest outlier in my model, only slightly out-performing Sunderland so far. My predictions still have them likely finishing in the top 4, and this week’s performance consolidated that position a little bit.1

Heat Map Chelsea Week 9

But can Chelsea turn it around? I look at the next four games to see what we’d expect from Chelsea and what their chances are of catching up to expectations.

Chelsea’s Next Four Games

In their next four games, Chelsea can expect to earn 7.84 points. They’re big favorites in the two home games (against Liverpool and Norwich City), but are only slight favorites in the two away matches (against West Ham and Stoke City). I know Chelsea’s form has been bad this season, but you’d still expect them to beat Stoke City fairly easily and…well who knows when West Ham is going to come back down to earth? If they win both of those games, they’d have 6 points right there, and a win at home against Norwich City would bring them to 9. Right there they’d be 1.16 points above expectation, and if they could beat Liverpool at home they’d be a full 4.16 points over the expectation. That would cut their deficit by more than half, bringing them more in line with the pre-season predictions that made them title contenders.
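
The arithmetic in the paragraph above can be checked in a few lines. This is a minimal sketch: the 7.84 figure is taken from the post, the helper names are invented for the example, and the per-match probabilities behind 7.84 are not given in the post, so `expected_points` is shown only as the general formula.

```python
POINTS_FOR_WIN = 3
POINTS_FOR_DRAW = 1


def expected_points(p_win, p_draw):
    """Expected points from one match given win and draw probabilities."""
    return POINTS_FOR_WIN * p_win + POINTS_FOR_DRAW * p_draw


def surplus_over_expectation(wins, draws, expected_total):
    """Points actually earned minus the model's expected total."""
    earned = POINTS_FOR_WIN * wins + POINTS_FOR_DRAW * draws
    return round(earned - expected_total, 2)


EXPECTED_NEXT_FOUR = 7.84  # expected points quoted in the post

# Wins over West Ham, Stoke City and Norwich City: 9 points.
print(surplus_over_expectation(3, 0, EXPECTED_NEXT_FOUR))

# Adding a home win over Liverpool: 12 points.
print(surplus_over_expectation(4, 0, EXPECTED_NEXT_FOUR))
```

The two printed values are 1.16 and 4.16, matching the figures in the text.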

Four wins in four games seems out of Chelsea’s range right now, but I think they have to turn things around eventually. If they can do it now, they can get right back into the thick of things and maybe mount a nominal title challenge.

  1. They were helped out by a draw between Liverpool and Spurs, their other main competitors for that last Champions League spot

Game Theory: A Long-Term Look at Selling Young Players

My last post looked at the calculations involved in whether a team battling relegation should sell strong young players, and showed that in a one-shot game it basically comes down to your perceptions of the probability of remaining in the league if you sell the young star vs. keeping the young star. For those who didn’t read (and don’t want to click the link above) here’s a quick refresher on the calculation:

EV(Keep Grealish) = Pr(Staying in the EPL)* (Total Revenue from being in the EPL)  + Pr(Relegated)* (Total Revenue from the Championship)  – (Money spent improving the squad)

EV(Sell Grealish) = Pr(Staying in the EPL) * (Total Revenue from being in the EPL) + Pr(Relegated)* (Total Revenue from the Championship) + (Money gained from Selling Grealish) – (Money spent replacing him)

If EV(Keep) > EV(Sell), then Villa should keep Grealish. If EV(Sell) > EV(Keep), then Villa should sell him.
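
As a worked illustration of the comparison above, here is a small sketch of the two formulas. Every number is hypothetical (in millions of pounds), and it assumes Pr(Relegated) = 1 − Pr(Staying in the EPL); nothing here comes from real Aston Villa figures.

```python
def ev_keep(p_stay, epl_revenue, championship_revenue, squad_spend):
    """EV(Keep): revenue weighted by survival odds, minus squad investment."""
    p_relegated = 1 - p_stay
    return (p_stay * epl_revenue
            + p_relegated * championship_revenue
            - squad_spend)


def ev_sell(p_stay, epl_revenue, championship_revenue,
            sale_fee, replacement_spend):
    """EV(Sell): same revenue terms, plus the fee, minus the replacement cost."""
    p_relegated = 1 - p_stay
    return (p_stay * epl_revenue
            + p_relegated * championship_revenue
            + sale_fee
            - replacement_spend)


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
EPL_REVENUE = 100.0
CHAMPIONSHIP_REVENUE = 20.0

keep = ev_keep(p_stay=0.30, epl_revenue=EPL_REVENUE,
               championship_revenue=CHAMPIONSHIP_REVENUE, squad_spend=5.0)
sell = ev_sell(p_stay=0.40, epl_revenue=EPL_REVENUE,
               championship_revenue=CHAMPIONSHIP_REVENUE,
               sale_fee=8.0, replacement_spend=13.0)

print(f"EV(Keep) = {keep:.1f}, EV(Sell) = {sell:.1f}")
print("Sell" if sell > keep else "Keep")
```

With these made-up inputs the net transfer spend is the same on both sides, so the whole difference comes from the ten-point change in the probability of staying up.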

I left the last post as is in the interest of simplicity and parsimony. However, the full math is a little more complicated than this and I wanted to walk through it a little more fully for interested readers here.

First, there is a psychic cost/benefit to selling young players. Fans may become upset at selling a club’s young star, there may be a decrease in locker room morale, or some other effects I can’t anticipate along those lines.  So add that to the “sell” side of the equation.1

Second, this isn’t really a one-shot game. Aston Villa (or any team) retaining their Premier League status is a yearly task.2 So they’d have to factor in the probability of retaining Premier League status this year (“t”), next year (“t+1”), the year after (“t+2”), etc. Presumably, if Grealish improves like he is expected to over the next 5-10 years, he will grow into a valuable member of Aston Villa’s squad, decreasing their probability of relegation in t+1, t+2, etc. So with this in mind, here’s the new equation (some of the names are abbreviated so it doesn’t get too unruly):

t = Pr(Stay) * (EPL Revenue) + Pr(Relegated) * (Championship Revenue) – (Net Transfer Spend)

EV(Keep) = t + Pr(Stay(t+1)) * (EPL Revenue(t+1)) + Pr(Relegated(t+1)) * (Championship Revenue(t+1))

EV(Sell) = t + Pr(Stay(t+1)) * (EPL Revenue(t+1)) + Pr(Relegated(t+1)) * (Championship Revenue(t+1)) + (Money gained from Selling Grealish(t+1)) – (Money spent replacing him(t+1))

In this formula, Pr(Stay(t+1)) for EV(Keep) presumably is higher based on Grealish’s improvement over a year. He will be a better player, and will be able to make a higher contribution to the team. How much higher, and how much will that improve their likelihood of staying in the Premier League?

Similarly, his value will increase over the year, so Money gained from Selling Grealish(t+1) will be higher, increasing EV(Sell). So this changes the calculation, but both sides probably increase proportionally (his effect on the probability of staying in the league will go up as his value goes up).

I’m not going to go through year two, but the process is the same, just adding the EV of t and t+1 to the formula for t+2, although Grealish’s value will continue to go up.
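
The "keep adding years" process can be sketched as a sum of season-by-season terms. This is an illustration under stated assumptions: the helper names and every figure (millions of pounds) are hypothetical, Pr(Relegated) is taken as 1 − Pr(Stay), and no discounting is applied because the formulas above do not include any.

```python
def season_value(p_stay, epl_revenue, championship_revenue, net_cash=0.0):
    """One season's term: expected league revenue plus net transfer cash.

    net_cash is positive for money coming in (a sale fee) and negative
    for money going out (squad or replacement spending).
    """
    p_relegated = 1 - p_stay
    return (p_stay * epl_revenue
            + p_relegated * championship_revenue
            + net_cash)


def multi_year_ev(seasons):
    """Add up the season terms for a list of per-season inputs."""
    return sum(season_value(**season) for season in seasons)


# Hypothetical two-season paths. Year t is identical on both paths;
# they differ only in what happens at t+1.
year_t = {"p_stay": 0.30, "epl_revenue": 100.0,
          "championship_revenue": 20.0, "net_cash": -5.0}

keep_path = [
    year_t,
    # Grealish has improved, so survival odds are assumed to rise.
    {"p_stay": 0.50, "epl_revenue": 100.0, "championship_revenue": 20.0},
]
sell_at_t_plus_1_path = [
    year_t,
    # Sale fee of 12 minus 10 spent on a replacement.
    {"p_stay": 0.45, "epl_revenue": 100.0, "championship_revenue": 20.0,
     "net_cash": 12.0 - 10.0},
]

print(f"EV(Keep) = {multi_year_ev(keep_path):.1f}")
print(f"EV(Sell) = {multi_year_ev(sell_at_t_plus_1_path):.1f}")
```

Extending to t+2 is a matter of appending another entry to each list, which mirrors the description above.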

The final wrinkle to the process is that the game presumably ends if Aston Villa gets relegated. A prospect like Jack Grealish wouldn’t want to play for a team in the Championship, and would be expected to leave in the summer instead of sticking around another year. So our new formula would be3:

EV(Keep(t+1)|EPL(t)) = t + Pr(Stay(t+1)) * (EPL Revenue(t+2)) + Pr(Relegated(t+1)) * (Championship Revenue(t+2))

EV(Keep(t)|Championship(t+1)) = Championship Revenue(t+1) + Sale price for Grealish(t+1 Championship)

EV(Sell(t+1)) = t + Pr(Stay(t+1)) * (EPL Revenue(t+2)) + Pr(Relegated(t+1)) * (Championship Revenue(t+2)) + (Money gained from Selling Grealish(t+1 EPL)) – (Money spent replacing him(t+1))

This version reflects Grealish’s presumably dramatically changing value based on Aston Villa’s success in year t. If Aston Villa keeps him and stays up, his value increases every year. But if Aston Villa keeps him and goes down, his value presumably decreases because teams know he won’t want to stay at a Championship-level team, which changes the expected value calculations yet again. There’s a risk/reward involved in keeping young stars around, and so much uncertainty that what a team chooses depends on assigning the correct probabilities to all the events and determining your risk aversion.

Finally, we add the psychic benefit/cost in, weighting for relegation and staying in the league to get our final, overly complicated looking equation.

EV(Keep(t+1)|EPL(t)) = t + Pr(Stay(t+1)) * (EPL Revenue(t+2)) + Pr(Stay(t+1)) * (Psychic Benefit(Stay)(t+2)) + Pr(Relegated(t+1)) * (Championship Revenue(t+2)) – Pr(Relegated(t+1)) * (Psychic Cost(Relegated)(t+2))

EV(Keep(t)|Championship(t+1)) = Championship Revenue(t+1) + Sale price for Grealish(t+1 Championship) – Pr(Relegated(t)) * (Psychic Cost(Relegated)(t+1))

EV(Sell(t+1)) = t + Pr(Stay(t+1)) * (EPL Revenue(t+2)) + Pr(Stay(t+1)) * (Psychic Benefit(Stay)(t+2)) + Pr(Relegated(t+1)) * (Championship Revenue(t+2)) – Pr(Relegated(t+1)) * (Psychic Cost(Relegated)(t+2)) + (Money gained from Selling Grealish(t+1 EPL)) – (Money spent replacing him(t+1)) – Psychic Cost(Sell)

All of this being said, the money earned for keeping/selling a young star is small compared to the money earned from staying in the EPL vs. the Championship. A couple million pounds’ difference in selling/keeping a young star would only matter if selling him changed the odds of relegation marginally at most. The driving force here is the first part of the equation:

Pr(Staying in the EPL) * (Total Revenue from being in the EPL) + Pr(Relegated)* (Total Revenue from the Championship)

How much better would Aston Villa be by selling Jack Grealish and buying more experienced players with the money they made? Would it be enough to offset the psychic cost, increased likelihood of remaining in the EPL the subsequent year, and profit they would make by keeping him an extra year? Given that Aston Villa, like all EPL teams, are single-minded forsakers of relegation, how risk-tolerant are they? If it were up to me, looking at the numbers, I’d sell now, but I likely estimate their probability of finding replacements who can improve their chances of staying up more highly than most do. Regardless of the conclusion, the goal here was to formalize the thought process of any team in the decision to sell a young player. It’s a complicated process with a lot of uncertainty in many different places, which is why you’ll see people argue both sides so passionately.

  1. In political science, William Riker asserted there was a psychic benefit to voting, e.g. wearing your “I voted today” sticker makes you feel good about yourself, and this is really the only reason why people vote.
  2. One could even apply this to the 7 or 8 teams who are perennially “safe” – change “relegated” to “Champions League” or “Winning the Title” or whatever your goal is.
  3. EV(Keep(t+1)|EPL(t)) is a conditional expectation that should read “The Expected Value of keeping Grealish in year t+1 given that they remained in the EPL after year t is…”

Game Theory: The Case for Villa Selling off Youth Players

My previous game theory posts (“The Logic of Not Caring About the Champions League” and “EPL’s Treatment of Europe is a Tragedy (of the Commons)”) were well-received, so I thought I’d put together another one related to a topic on my mind that may get a fair amount of attention when transfer rumor season starts. Aston Villa is currently involved in a serious relegation fight that has seen their odds of relegation increase to almost 70%. The heat map below shows how their odds of getting relegated have increased over the season, especially in the last couple of weeks.

Heat Map Aston Villa’s Finishes

I’ve run some analyses to see exactly where Villa’s best opportunities for growth are, and they all point to Jack Grealish being the biggest opportunity for them. The average gain in his position is a very solid 5 points, and the maximum gain is an astounding 20 points. Grealish may be a future Aston Villa and England star, but my model thinks he’s holding them back significantly today. But how do we know if it’s time to sell?

The immediate calculation is fairly easy and straightforward:

EV(Keep Grealish) = Pr(Staying in the EPL)* (Total Revenue from being in the EPL)  + Pr(Relegated)* (Total Revenue from the Championship)  – (Money spent improving the squad)

EV(Sell Grealish) = Pr(Staying in the EPL) * (Total Revenue from being in the EPL) + Pr(Relegated)* (Total Revenue from the Championship) + (Money gained from Selling Grealish) – (Money spent replacing him)

Then all you’d have to do is compare the two numbers: If EV(Keep) > EV(Sell), then you keep him. If EV(Sell) > EV(Keep), you sell him. Assuming net transfer spend in both situations is the same, the equation comes entirely down to what Aston Villa believes their chances of staying in the EPL are with/without him. If you believe my model, the chances improve significantly with the right replacement (many of whom are in Aston Villa’s buying range at first glance), therefore EV(Sell) is much greater than EV(Keep), meaning that it’s time to sell him.
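
A short derivation shows why equal net spend reduces the comparison to the survival probabilities. The symbols are shorthand introduced here for illustration only: p_K and p_S are the probabilities of staying up if Villa keep or sell, R_E and R_C are EPL and Championship revenue, and N is the (equal) net transfer spend, with Pr(Relegated) taken as one minus the probability of staying up.

```latex
\begin{align*}
EV(\text{Keep}) &= p_K R_E + (1 - p_K) R_C - N \\
EV(\text{Sell}) &= p_S R_E + (1 - p_S) R_C - N \\
EV(\text{Sell}) - EV(\text{Keep}) &= (p_S - p_K)(R_E - R_C)
\end{align*}
```

Since EPL revenue exceeds Championship revenue, the sign of the difference is the sign of p_S − p_K: under this assumption, selling is the better choice exactly when it raises the probability of staying up.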

This all assumes a one-shot game, which may be a reasonable assumption if you believe next year will be a similar fight to stay in the Premier League. I’ll post the longer version in my next post, but this illustrates the expected value calculations that teams go through on these decisions. Single-minded forsakers of relegation may have to go against club ethos and sell a young star so they can achieve their primary goal of maintaining Premier League status.

The Single-Minded Forsakers of Relegation

Sunderland’s disappointing start to the season saw them bring in Sam Allardyce, and it sounds like the vultures are starting to circle at Aston Villa. Both teams have underachieved significantly so far; only Chelsea sits lower on my list of underperformers this season. It should go without saying that Chelsea has the quality to avoid relegation, but Aston Villa and Sunderland both have to be concerned.

Deviation Bar Week 8-2

One of the most influential books in political science is Congress: The Electoral Connection by David Mayhew, and in it he coins the phrase “single-minded seekers of re-election.” The main argument is that members of Congress care about nothing other than re-election, and will pursue that goal single-mindedly. Richard Fenno relaxes this assumption a little bit, giving politicians goals like making good public policy and pursuing power in the institution, but his caveat is key: members of Congress cannot pursue other goals without achieving the primary goal, re-election. EPL teams have a similar goal: avoiding relegation, and Sunderland and Aston Villa prove this. EPL teams are single-minded forsakers of relegation – consolidating the league status is concern #1, and no other concerns matter until #1 is met.

Sunderland signing Sam Allardyce, much like West Ham before them, was met with any number of jokes about his boring, direct style of football.1 But he’s a proven manager who did a great job of consolidating West Ham’s position in the league, putting them into a good position where they can hire a manager who plays a more interesting style without worrying about their league position. With relegation seemingly out of the question, West Ham can afford to move on to concerns about playing an attractive style. Sunderland doesn’t have the same luxury, so Allardyce is their best choice.

Similarly, reportedly one of the reasons Aston Villa originally tapped Tim Sherwood to be their manager was his track record in developing young players. Aston Villa has been investing in youth for quite a while now and they have some promising potential stars, Jack Grealish being toward the top of the list. However good he may be, and however good Tim Sherwood may be at developing players, they may not have the luxury of keeping either one of them at the club. My model thinks replacing Grealish gives Aston Villa the highest upside, with a shocking 20 point increase for the max player, and an average of over 4 points.

Aston Villa Max Replacements

Aston Villa Average Points

Everyone likes the idea of having future stars in their lineup, especially a home-grown, local player like Jack Grealish. And everyone likes a manager who can develop young players into the next generation of stars. But if the rumor mill is true and Tim Sherwood’s position is tenuous, it’s yet another example of how teams are single-minded forsakers of relegation. Staying in the Premier League is job #1, and it’s the only one that matters because without it, you can’t achieve any of the others. Putting the youth project at least partially on the backburner might be necessary to forsake relegation and maintain their Premier League status.



  1. My favorite was a gif labeling Mourinho as “The Special One” while labeling Allardyce as “Route One.”

How Do We Make Analytics More Accessible?

The world doesn’t need another hot take on whether analytics are good, or useful, or the temperature at which stats people keep their offices. I honestly stopped reading the media pieces on the topic after the Rory Smith debacle a week or two ago, but I do think there is room to discuss ways to make analytics more accessible and interesting to a larger audience. Academia deals with this quite a bit, especially in disciplines like Political Science where what we study has consistent relevance to the media, so I wanted to share some of the strategies on how to make often dense statistical research  more accessible to media and practitioners.1

  1. DON’T assume anyone knows anything about math.
    • Most people haven’t taken a math class since college, and even then they didn’t like it. Not only should you not get lost in the math, you should avoid it entirely. Have it in reserve if they ask for more details, but be prepared for people’s eyes to glaze over when you start.
  2. DON’T assume anyone cares about math
    • People don’t care about the method. They care about what they can learn.
  3. DON’T use jargon.
    • Many of the metrics we use have technical-sounding names or abbreviations. Maybe the measure is clear and interesting, but when you lead with the technical part people will lose interest. Similarly, instead of saying “R²” talk about “correct predictions.” Present confidence intervals (don’t call them that) instead of p-values, and if possible show me, don’t tell me. Model fit can be intuitive when shown on a graph, but it can be daunting when it’s explained.
  4. DO start with a question
    • What do people care about? Is there a story in the news that you can shed light on with analytics? What can we learn from your method? Analytics that answer a question people care about are more likely to be embraced by media/practitioners than those that simply present a measure.
  5. DO focus on what we learn, rather than how you did it
    • What new insights does your method give us? What did we not know before that we do now? How does your analysis teach us something about soccer that we didn’t know before?
  6. DO explain why people should care about what you did.
    • In academia, we call this the “so what?” question, and it’s the most important part of this whole process. You did a bunch of math, so what? Why does this matter to a larger audience? Why should people care about what you did here?
  7. DO focus on clear, concise presentation of results.
    • I know of several high-profile studies in political science that only got attention because they had nice infographics attached to them. It could be a clever, clear infographic, or an interactive tool of some sort that people can play with. If you’re not good with graphics, it could be a table. Or a short paragraph, or anything that is clear, attention-getting, and concise. Save description for those who want it.
  8. DO be ready for people to not accept your conclusions.
    • Confirmation bias is a real thing, and it’s difficult to overcome. Analytics that confirm what people already believe are much easier to accept than those that don’t. And beyond that, some people just don’t like stats – you’ll never convert them and it’s not worth trying.

That’s all I’ve got – I’ve tried to distill a couple dozen articles about outreach in academia to a few bullet points. I’m confident that if soccer analytics folks focus on these things, and are patient enough, things are going to change. They changed in politics pretty quickly, and there’s no reason the same thing won’t happen in soccer.

  1. I don’t have a lot of experience with mass media and soccer, but I have had some folks in the industry reach out to me privately. In my political life, I’ve been quoted multiple times in USA Today and The New York Times, have had op-eds placed in the Washington Post and USA Today, and have had my research featured in The Huffington Post. I’ve also been on Voice of America and Al-Jazeera America, and am a regular guest on the News and Views radio show out of Minnesota, so I know a fair amount about public outreach for arcane topics.

Stats for Coaches and Journalists: Thoughts and a Draft Syllabus

I’ve mentioned this before, but my day job is professor of political science, and specifically I teach courses about statistical methods and research design (among other classes). With the latest round of “Analytics, LOL” foolishness from the media on Twitter, I thought I would do something productive and create a syllabus for a hypothetical course on soccer analytics for coaches and journalists.

I wanted to share a few thoughts about this idea, and have an open question for anyone who reads this (which I’ll tweet as well). The target audience is people who have some interest in learning about analytics, their uses, and some basics with a goal of being able to speak to analytics types/read blog posts written with analytics. I had a Twitter exchange with @unfitforpurpose about this, and it may be worth re-thinking without the assumption that people are interested in learning the material.

I’m assuming zero knowledge on the part of the audience members. Gab Marcotti tweeted something about many people not knowing what a standard deviation is, which I think is potentially even overstating both the lack of math knowledge and math awareness of the audience. I also think focusing on the math is problematic: at their core, analytics aren’t about math, they’re about using tools to answer a question. I’m a big believer that measurement for the sake of measurement (or math for the sake of math) is a waste of time. To really appreciate and understand analytics, you need to start with simple concepts like hypotheses, measurement, and operationalizing variables. Even in the analytics community, we often forget that measures of uncertainty only come after proper model building.

I broke the course down into four sections:

  1. “What is science?”
  2. “Case Study and Small Sample Research”
  3. “Stats and Large Sample Research”
  4. “Data visualization techniques”

I think people were picturing an hour-long seminar on how to do analytics, and I don’t think that’s the best way to do it. This isn’t a semester’s worth of learning, but I think it’s at least four 2-hour sessions to get a basic understanding of what we’re doing, although one could cut data viz out if the goal was just to understand rather than to produce, leaving us with three 2-hour sessions. Longer might be better, but if the goal is basic understanding then I think this would be enough.

However, I’m curious if other people think this could be broken down into a 1-hour session. What would you include? I don’t see it, but I’m thinking about ways it could be done and am curious for suggestions. Here’s the syllabus I wrote, and I may flesh it out even further with readings or videos if people are interested. Let me know what you think.



Game Theory: EPL’s Treatment of Europe is a Tragedy (of the Commons)

My last post asserted that it’s individually rational for each team in the EPL to not care about the Champions League.1 However, this leads to the very real possibility of a “tragedy of the commons” effect, and England losing a coveted Champions League spot.

The tragedy of the commons comes from Garrett Hardin, and describes a village where farmers all graze their sheep in a common area. This area belongs to everyone, and people are free to have as many sheep graze there as they can afford. It is in each farmer’s rational self-interest to purchase as many sheep as they can so they can sell milk, wool, and whatever else sheep are good for. As they buy more and more sheep, the commons become over-grazed, and all the grass dies. No one gets to graze their sheep, costing everyone money. Individuals acting in their own self-interest can hurt the collective in the long run.

We’re seeing that right now in Europe for England. It may be rational for teams to focus on the league and ignore the Champions League so they can focus on finishing in the top 4 and qualifying for next year’s Champions League. This is likely even more true for Europa League teams who need every edge they can get in the league to try and make the top 4 next year, so they’re more likely to tank the European fixtures. However, with UEFA coefficients (and Champions League spots being allocated to leagues based on their coefficient) being largely based on performance in continental competition, we’re seeing a tragedy of the commons.

It’s in each team’s interest to not worry about European fixtures and to focus on the league instead, but if every team does this then England could easily lose their 4th CL spot. When the individual good conflicts with the collective good, the collective good can easily disappear. Right now the EPL has too many sheep, not enough farmers maintaining the common area. Ignoring Europe, as rational as it may be for the individuals, is bad for the collective.
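
The structure of the argument can be shown with a toy payoff model. This is a sketch with invented numbers, not UEFA’s coefficient formula: it assumes each club that takes Europe seriously pays a private cost (tired legs in the league) and generates a smaller shared gain (coefficient points) that every English club enjoys.

```python
def club_payoff(competes, n_competing, private_cost, shared_gain):
    """Payoff to one club.

    n_competing counts every club taking Europe seriously, including this
    one if it competes. shared_gain is what each competing club adds to
    every club's payoff; private_cost is paid only by clubs that compete.
    All units are hypothetical.
    """
    collective_benefit = n_competing * shared_gain
    return collective_benefit - (private_cost if competes else 0.0)


N_CLUBS = 4          # e.g. England's four Champions League entrants
PRIVATE_COST = 3.0   # made-up cost of going full strength in Europe
SHARED_GAIN = 1.0    # made-up benefit each competing club gives everyone

everyone_competes = club_payoff(True, N_CLUBS, PRIVATE_COST, SHARED_GAIN)
lone_defector = club_payoff(False, N_CLUBS - 1, PRIVATE_COST, SHARED_GAIN)
everyone_rotates = club_payoff(False, 0, PRIVATE_COST, SHARED_GAIN)

print(everyone_competes, lone_defector, everyone_rotates)
```

With these numbers a club always does better by rotating whatever the others do (3.0 beats 1.0), yet if every club follows that logic each ends up with 0.0, worse than the 1.0 they would all get by competing. That is the tragedy-of-the-commons pattern described above.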



  1. I use the word “rational” in the economic sense of the word – acting in one’s self-interest.