Sampling and data tinkering
Last friday, Euro 2020, one of the biggest events in International soccer, was kicked off by the inaugural match between Italy and Turkey (Italy won it 3-0). Euros (short for European Championships) are usually held every 4 years, but because of he-who-must-not-be-named, last year’s edition was postponed to this summer, while keeping the name “Euro 2020” (much like the Tokyo Olympics ). 4 5 years ago, for Euro 2016, I basically wanted to try some cool methods based on splines on a real-world example. The mo
The model is built on about 12.500 international football matches played since 2006. The main part actually consists in a combination of two models, one that predicts the number of goals scored by the reference team in a given game, the other one that predicts the number of goals they will allow. This basically models offense and defense for each team. A number of models for soccer have used similar ideas in the past, including FiveThirtyEight’s brilliant Soccer Power Index .