If you follow me on Twitter, you’ve seen that I’ve been posting some preliminary Expected Goal (xG) data from the 2015 NWSL season. These data aren’t publicly available, so I’ve been collecting them by hand. To do this, I’ve been going on YouTube, watching every shot from every game, and coding a number of variables for each shot. As of today I’m up to around 500 shots, and I have built an xG model based on these shots, which I will detail in another post when it’s done.
One thing I’ve added to typical xG models is a variable for whether the shooter was under pressure; right now this is defined as whether the nearest defender was within a half yard of the shooter at the time of the shot. I was surprised to find that it didn’t reach statistical significance in my model, both because the lack of defensive pressure data has been considered a weakness of typical xG models for a while now and because it makes theoretical sense that defensive pressure would lower the probability of scoring. So I did what any responsible analyst would do and plotted my data.
I plotted xG values (derived from my model) as a function of distance from the goal. The red line is when the shooter isn’t pressured (the nearest relevant defender isn’t within 0.5 yards), and the red shaded area is the 95% confidence interval for that estimate. The blue line is when the shooter is under pressure (the nearest relevant defender is within 0.5 yards), and the blue shaded area is the 95% confidence interval for that estimate. As you can see, the two lines overlap substantially: from about 15 yards out, not only do the 95% confidence intervals overlap, but the lines are almost identical. However, for closer shots there is a difference, highlighted in the graph below.
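For anyone who wants to try a similar comparison on their own shot data, here is a minimal sketch of the idea. It is a simplified, binned stand-in for the smoothed lines in my plot, not the code behind it: the `Shot` fields, the 5-yard bins, the normal-approximation interval, and the four example shots are all assumptions made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import sqrt
from statistics import mean, stdev

PRESSURE_THRESHOLD_YARDS = 0.5  # "within a half yard", as defined in the post


@dataclass
class Shot:
    # Hypothetical field names, chosen for this sketch only.
    distance_yards: float      # shooter's distance from goal
    defender_gap_yards: float  # distance to the nearest relevant defender
    xg: float                  # scoring probability from an xG model


def is_pressured(shot):
    """Code the pressure variable: nearest defender within the threshold."""
    return shot.defender_gap_yards <= PRESSURE_THRESHOLD_YARDS


def summarize(shots, bin_width=5.0):
    """Mean xG and an approximate 95% CI per (distance bin, pressured) group."""
    groups = defaultdict(list)
    for s in shots:
        bin_start = int(s.distance_yards // bin_width) * bin_width
        groups[(bin_start, is_pressured(s))].append(s.xg)

    summary = {}
    for key, values in sorted(groups.items()):
        m = mean(values)
        # Normal approximation; undefined (nan) for a group with one shot.
        if len(values) > 1:
            half = 1.96 * stdev(values) / sqrt(len(values))
        else:
            half = float("nan")
        summary[key] = (m, m - half, m + half)
    return summary


# Invented example shots, NOT data from the 2015 NWSL season.
shots = [
    Shot(6.0, 0.3, 0.20), Shot(8.0, 0.4, 0.30),  # pressured
    Shot(7.0, 2.0, 0.40), Shot(9.0, 3.0, 0.50),  # unpressured
]

for (bin_start, pressured), (m, lo, hi) in summarize(shots).items():
    label = "pressured" if pressured else "unpressured"
    print(f"{bin_start:>4.0f}+ yd  {label:<11}  xG={m:.2f}  CI=[{lo:.2f}, {hi:.2f}]")
```

With a real data set you would want far more shots per group than this, and narrower bins or a smoother if the sample allows it.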
I’ve added a shaded area where the two lines are significantly different from each other. Basically, you’re looking for the region where each line falls outside the other line’s shaded area, which is where they are distinguishable from each other; that region runs from about 3 yards to 13 yards away from goal. That is the zone where defensive pressure matters: a shot from anywhere between 3 yards out and roughly the penalty spot is less likely to score if a defender is close, while for anything further out defensive pressure makes no detectable difference. If I had to come up with a post hoc explanation, presumably shots from further out are difficult regardless of whether you have a defender in the way, so distance is the limiting factor.
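The visual rule described above (each line sitting outside the other line’s band) can be sketched in a few lines. This is only an illustration of that rule: the function name `distinguishable_zone` is hypothetical, and the estimates and intervals below are invented placeholders, not values read off my plot.

```python
def distinguishable_zone(distances, unpressured, pressured):
    """Return the distances where each line lies outside the other's CI band.

    `unpressured` and `pressured` are lists of (estimate, ci_low, ci_high)
    tuples aligned with `distances`.
    """
    zone = []
    for d, (u, u_lo, u_hi), (p, p_lo, p_hi) in zip(distances, unpressured, pressured):
        u_outside_p_band = not (p_lo <= u <= p_hi)
        p_outside_u_band = not (u_lo <= p <= u_hi)
        if u_outside_p_band and p_outside_u_band:
            zone.append(d)
    return zone


# Invented numbers for illustration only.
distances = [2, 8, 20]
unpressured = [(0.60, 0.45, 0.75), (0.35, 0.30, 0.40), (0.05, 0.03, 0.07)]
pressured = [(0.55, 0.40, 0.70), (0.20, 0.15, 0.25), (0.05, 0.03, 0.07)]

print(distinguishable_zone(distances, unpressured, pressured))
```

This mirrors how I read the graph by eye; it is a descriptive check on the fitted lines rather than a formal test of the difference between them.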
This finding is obviously limited to the NWSL, and we may see differences between men and women here, but it’s a potentially interesting development given the paucity of defensive position data. It’s also an important methodological lesson: go beyond “star-gazing” at significance levels and look at the relationship between your variables. Defensive positioning matters most when the shooter is close to goal, even when controlling for a number of other factors (head vs. foot, angle, etc.).