There are a number of excellent sites that gather football statistics, such as Squawka, Whoscored, and Transfermarkt. From these sites, we can collect enormous amounts of data about players, ranging from shooting and passing data to players’ off-ball movements (to name a few). Let’s examine the shooting data in more detail. First, we can count the number of shots a player took. We can then break the shots down further into shots on target (SoT) and goals scored. While these counts only tell us about the quantity of shots taken, we can also measure their quality through xG models.
Once we have established the quantity and quality of shots taken by a player, we should standardise these figures so they can serve as benchmarks. The usual way of doing this is to express each figure per 90 minutes played, known as p90 (or per90). This allows us to compare players against each other on equal terms, regardless of how many minutes each played, on a game-by-game or even season-by-season basis, to see who is the better shooter. Benjamin Pugsley, one of the talented individuals writing for Statsbomb, wrote a great article a couple of years back explaining per90. If you are not familiar with p90, then please read that article before continuing below.
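As a quick illustration of the p90 idea, here is a minimal sketch in Python. The function name `per90` and the example numbers are my own assumptions for illustration; they are not taken from the data set or from Pugsley's article.

```python
def per90(total: float, minutes_played: float) -> float:
    """Scale a raw total (shots, SoT, goals, ...) to a per-90-minutes rate."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return total * 90 / minutes_played


# Made-up numbers, purely for illustration (not real player data):
# 10 goals in 1,800 minutes is the same as 20 full matches.
example_goals_p90 = per90(10, 1800)
print(f"Goals p90: {example_goals_p90:.2f}")
```

The same function works for shots and SoT; only the raw total changes.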
At the end of the 2015/16 Premier League (PL) season, I wrote about evaluating my xG model with data that I collected from the PL. At the end of that post, I included a picture of all players’ goals scored vs xG. However, as Pugsley mentioned, “there is a problem with these raw, basic numbers”. The problem is that raw totals take no account of how many minutes each player spent on the pitch, so they are not a fair basis for comparison. So below, I have adjusted each player’s shots, SoT and goals scored to p90 (with the help of Whoscored.com). Let’s have a look at the graph.
Unfortunately, I do not think the graphs tell us a lot, except for the relationships between a player’s Shots p90 and SoT p90 (R²: 0.55, p-value: <0.0001), Shots p90 and Goals scored p90 (R²: 0.31, p-value: <0.0001), and SoT p90 and Goals scored p90 (R²: 0.32, p-value: <0.0001).
There is one element within all of this that I do find surprising. Going from Shots p90 vs Goals scored p90 (R²: 0.31) to SoT p90 vs Goals scored p90 (R²: 0.32), the R² value only increases by 0.01. Based on these results, SoT p90 is only a very slightly (0.01) better predictor of goals scored than Shots p90. Intuitively, we would expect a bigger gap, as a shot on target is closer to being a goal than a shot in general. Lastly, as the R² values are low, I am certain that there are other, better factors that have a stronger relationship with goals scored than those described above.
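For readers who want to reproduce this kind of figure on their own data, here is a minimal sketch of how an R² value for two p90 columns can be computed. It assumes a simple linear regression with one predictor, where R² equals the squared Pearson correlation; the function name `r_squared` and the sample lists are hypothetical, and the sketch does not compute the p-value.

```python
def r_squared(xs: list[float], ys: list[float]) -> float:
    """R² of a simple linear regression of ys on xs (squared Pearson r)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("R² is undefined when one variable is constant")
    return sxy ** 2 / (sxx * syy)


# Made-up values, purely for illustration (not real player data).
shots_p90 = [1.0, 2.0, 3.0]
goals_p90 = [1.0, 3.0, 2.0]
print(f"R²: {r_squared(shots_p90, goals_p90):.2f}")
```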
2015/16 Top 20 p90 goalscorers in the PL
As each player’s p90 statistics have already been calculated above, I now just need to filter the players by minutes played. I decided that players needed to have played at least 750 minutes to be eligible for this next part. In the table below, I have sorted players by p90 goals scored for the 2015/16 season.
*I will provide the data here, so you can filter the list as you wish*.
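If you would like to do the filtering yourself, the steps can be sketched in Python as follows. This is only an illustration under my own assumptions: the record layout (`name`, `minutes`, `goals`), the function name `top_scorers_p90` and the sample players are hypothetical, and only the 750-minute threshold comes from the post.

```python
MIN_MINUTES = 750  # eligibility threshold used in this post


def goals_p90(player: dict) -> float:
    """Goals per 90 minutes for one player record."""
    return player["goals"] * 90 / player["minutes"]


def top_scorers_p90(players: list[dict], min_minutes: int = MIN_MINUTES,
                    limit: int = 20) -> list[dict]:
    """Keep players with enough minutes, then rank them by goals p90."""
    eligible = [p for p in players if p["minutes"] >= min_minutes]
    ranked = sorted(eligible, key=goals_p90, reverse=True)
    return ranked[:limit]


# Made-up records, purely for illustration (not real player data).
sample = [
    {"name": "Player A", "minutes": 900, "goals": 5},
    {"name": "Player B", "minutes": 700, "goals": 7},   # below the threshold
    {"name": "Player C", "minutes": 1800, "goals": 12},
]
for p in top_scorers_p90(sample):
    print(p["name"], round(goals_p90(p), 2))
```

Player B has the best raw rate in this toy sample but is dropped for playing fewer than 750 minutes, which is exactly the kind of small-sample case the threshold is meant to exclude.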
I also made the table more visually appealing through Tableau. The table is quite detailed, yet for the purposes of this blog post, I am more interested in the p90 statistics than in the players’ xG figures. The graph is divided into three parts: 1) Premier League Shots p90, 2) Premier League SoT p90, 3) Premier League Goals p90.
Having examined the graph, let’s take a closer look at the numbers, what they mean and how to read them. For example, Tottenham striker Harry Kane took on average 4.27 Shots p90 (3rd), of which 2.11 were on target (1st), and he scored 0.69 goals p90 (4th). Kane had another fantastic season with Spurs and he has been a very consistent player for them.
Now that we know how many shots each player took, what about the quality of the shots they took? To examine this, I have broken the pitch into 8 zones (picture below) and distributed the shots (taken by each of the 20 players) among these zones. Of course, it would be so much easier to map the shots using XY coordinates such as the one here. Unfortunately, I do not have access to this data, and if I were to collect the XY coordinates of all shots myself, it would take a very long time.
Not to worry though as I have been able to distribute the shots to marked zones (as mentioned above). The next graph may look a bit messy at first, yet I find the data is summed up very nicely.
Again, let’s examine the data on Harry Kane as an example. As the picture shows, the central area in front of goal is marked as Zones 1 & 3. From these zones, Harry Kane’s record is as follows: Zone 1 (Shots: 6; Goals: 3; Accuracy: 50%) and Zone 3 (Shots: 27; Goals: 11; Accuracy: 41%). While the sample size is small, these numbers are, in general, not surprising. They reflect what has already been discussed numerous times in football analytics and what also makes logical sense: the further away from goal you are, the lower your chance of scoring. Also note the number of shots taken and goals scored from more of an angle (Zones 2, 4, 5 & 7).
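The accuracy figures quoted for Kane match goals divided by shots in each zone. As a minimal sketch, the calculation can be written as below; the function name `conversion_rate` and the dictionary layout are my own assumptions, while the shot and goal counts are the ones quoted above.

```python
def conversion_rate(goals: int, shots: int) -> float:
    """Percentage of shots from a zone that ended up as goals."""
    if shots <= 0:
        raise ValueError("shots must be positive")
    return goals / shots * 100


# Harry Kane's central-zone figures as quoted in this post.
kane_zones = {
    1: {"shots": 6, "goals": 3},
    3: {"shots": 27, "goals": 11},
}

for zone, record in kane_zones.items():
    rate = conversion_rate(record["goals"], record["shots"])
    print(f"Zone {zone}: {rate:.0f}%")
```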
I hope you enjoyed this post and I look forward to any feedback/questions that you may have.