Back in November, before the World Cup began, I dove into the treasure trove that is StatsPerform’s World Cup charting data archive to look at the trends that have changed and taken over the sport from 1966 to 2018. When did the possession game start to really take over? (1990 or so, and then for real in 2014.) When did teams begin sacrificing shot quantity for shot quality? (Very much 2014.) Was 2018 as strange an outlier for set piece efficiency as it seemed? (Yes.)
With the 2022 World Cup almost complete, we now have another data set to take in. What have we learned over the past month or so that tells us how the sport might be evolving? And were some of the results as funky as they seemed? (Yes.)
Lesson 1: (Predictive) stats are for losers
OK, not really. Stats are the language we use to speak about a game, and the existence of this StatsPerform data going back to England 1966 is an absolute blessing. But some years give us funkier results than others.
For each of the 15 World Cups in the data set, let’s look at the correlation between each team’s points per game and its expected goals (xG) differential. (Note: If you won a knockout match in a penalty shootout, we’re counting that as a draw for these purposes. Sorry, Croatia.)
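For anyone who wants to follow along at home, here is a rough sketch of that points-per-game bookkeeping. It assumes three points for a win and one for a draw (the actual point values used aren't spelled out here), and the function names and sample scores are made up for illustration. The key detail is that only the score after regulation or extra time counts, so a shootout win gets logged as a draw.

```python
def match_points(goals_for, goals_against, win_points=3):
    """Points from one match, using the score before any penalty shootout.

    win_points=3 is an assumption; the point values are not stated in the text.
    """
    if goals_for > goals_against:
        return win_points
    if goals_for == goals_against:
        # Level after 90 or 120 minutes: a draw, whoever won the shootout.
        return 1
    return 0


def points_per_game(scores, win_points=3):
    """Average points over a list of (goals_for, goals_against) tuples."""
    if not scores:
        raise ValueError("need at least one match")
    total = sum(match_points(gf, ga, win_points) for gf, ga in scores)
    return total / len(scores)


# Hypothetical team: a win, a draw, a loss, then a knockout tie
# that went to penalties (counted as a draw).
example_scores = [(2, 0), (1, 1), (0, 1), (1, 1)]
example_ppg = points_per_game(example_scores)
```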
Over a long enough period of time, your xG differential becomes very predictive of success moving forward. It’s basically looking at the quality of the shots you produce versus that of your opponent, and while finishing skill obviously matters, it only matters so much. Give all these teams 100 games, and the ranking of xG differential will look just about exactly like your points-per-game ranking. Guarantee teams just three matches, and you’re going to get some funky results. And the results in 2022 were particularly funky.
Correlation between points per game and xG differential:
1966: 0.532
1970: 0.770
1974: 0.845
1978: 0.571
1982: 0.554
1986: 0.765
1990: 0.694
1994: 0.476
1998: 0.488
2002: 0.552
2006: 0.713
2010: 0.446
2014: 0.603
2018: 0.598
2022: 0.464
(Note: A correlation falls on a scale from -1 to 1. The closer it is to 1, the stronger the “When one goes up/down, the other goes up/down” connection. The closer it is to -1, the stronger the opposite link: when one goes up, the other goes down. And the closer it is to 0, the weaker the overall link between the two.)
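If you'd like to see what goes into a number like that, here is a minimal sketch. It assumes the figures above are standard Pearson correlations, which the data doesn't confirm, and the five-team sample is invented purely for illustration. It is not actual World Cup data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two measures move together, and how much each moves on its own.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return covariance / sqrt(spread_x * spread_y)


# Invented sample: per-match xG differential and points per game for five teams.
sample_xg_diff = [1.0, 0.5, 0.0, -0.5, -1.0]
sample_ppg = [2.0, 1.0, 1.5, 1.0, 0.5]
sample_r = pearson(sample_xg_diff, sample_ppg)
```

Run that once per tournament, with one pair of numbers per team, and you get a list like the one above.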
What this tells us: While every stats teacher in the world will quickly point out that correlation doesn’t equal causation, we can say that, in this case, a high correlation between these two measures suggests that a given tournament had fewer strange, “Team A took three shots worth 0.1 xG, Team B took 17 shots worth 2.2, and Team A won, 1-0” results.
As you see above, the 1974 World Cup had the highest correlation on the board. That year, Netherlands and West Germany produced by far the strongest xG figures in the competition and met in the final. There were some reasonably funky results — a pretty mediocre Brazil team (xG differential: +0.1 per match) reached the semifinals, while Scotland (+0.3) produced the third-best differential and failed to advance from the group stage. (They were tied with Yugoslavia and Brazil in Group 2 but fell short because of goal differential.) But all in all, the stats and results mostly agreed on the hierarchy of teams.
That has really not been the case at Qatar 2022. The team with by far the best xG differential in the competition — Germany (+2.2 per match) — finished third in Group E thanks to a super funky 2-1 loss to Japan (xG: Germany 3.1, Japan 1.5) and the goal-differential wrecker that was Spain’s 7-0 win over Costa Rica. Stats: literally for losers in this case.
The second-best team in the tournament, per xG differential? Brazil, which lost in a quarterfinal penalty shootout against Croatia. The teams drew 1-1 over 120 minutes despite a lopsided xG margin (Brazil 2.6, Croatia 0.6).