tldr: this got longer than I expected, so I’ve bolded my key points if you don’t want to read it all.
I was inspired a couple of days ago by a twitter thread by @MishraAbhiA that was posted on the sub, which applied statistics to identify U24 players with high performance in a few specific creative metrics. Since he only tweeted summaries of his findings and only listed a small number of players, it left me with many questions, so I decided to dig into a deeper statistical analysis myself. I started by looking at creative players, since I feel that’s something that Arsenal is in dire need of, and noticed many interesting outputs and names.
But as you can tell from the title, this post is mostly going to focus on Emi Buendia. He seems to be a very divisive player these days (here and on r/soccer) – some basic stats indicate that he has many creative qualities, but there are many others who claim he is more damaging to a team, based on anecdotal evidence and stats regarding his sheer number of possessions lost in dangerous areas. So after I finished my statistical analysis on creative players, I wanted to scrutinize Buendia and see if there were conclusions I could make to affirm or contrast this general perception of him. Depending on how this is received, I may post other takeaways I found interesting (like other high value, bargain players). Just, maybe a bit shorter…
Quick disclaimer: stats are never the whole picture – they can be inaccurate, can be made to be misleading, and are best used in conjunction with watching game tape. To the extent that I can, I will try to cover the stats in a holistic way. While I have been working towards getting my master’s in data analytics, my real world application of it is still small at this time, so if any of you have questions/comments about my methodology or inaccuracies in it, I will try to answer them or make fixes as needed (constructive criticism always appreciated!).
METHODOLOGY (skip if you don’t care):
Simply put, I compiled a database of all available statistics on players from the 5 big leagues (via fbref), then used the data to create a peer analysis. Once you determine your peer group (I used Wingers/Attacking Midfielders, as categorized by fbref, with over 810 mins logged this season), you normalize each player’s stats so that they’re relative to the peer set. By doing that, I can see what percentile a player is in relative to that data set by using the mean and standard deviation of that set (generally better than simply creating a % by saying “he ranks 20th out of 100 players in the set, so he is in the 80th percentile”).
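For anyone who wants to see the difference between the two approaches, here is a minimal sketch in Python. It assumes the stat is roughly normally distributed within the peer group; the function names and the peer values are made up for illustration and are not my actual database.

```python
from statistics import NormalDist, mean, stdev

def zscore_percentile(value, peer_values):
    """Percentile from the peer mean and standard deviation (assumes ~normal data)."""
    mu = mean(peer_values)
    sigma = stdev(peer_values)
    z = (value - mu) / sigma            # distance from the peer mean, in std devs
    return 100 * NormalDist().cdf(z)    # area of the standard normal curve below z

def rank_percentile(value, peer_values):
    """Naive percentile: share of peers with a strictly lower value."""
    below = sum(1 for v in peer_values if v < value)
    return 100 * below / len(peer_values)

# Hypothetical "key passes per 90" for a 10-player peer group
peers = [0.8, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.8, 3.1]
player = 3.1

print(round(zscore_percentile(player, peers), 1))
print(round(rank_percentile(player, peers), 1))
```

The rank version only knows that the player is ahead of everyone else, while the mean/standard deviation version also reflects how far ahead he is.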
It’s important to note that there are different ways to do this, with one of the simpler ways being standardizing it to a normal distribution (the twitter thread that I linked earlier does a good job of explaining this succinctly) – and this is what I generally did. But this only works when the underlying data is distributed like a normal bell curve, which isn’t the case for a lot of stats. I identified which stat categories fit well to a normal distribution and which don’t. Sometimes, I could correct it by using a log transformation (treating the stat as lognormal) to normalize it. Sometimes, it was random and fit no distribution, in which case I avoided that stat. Sometimes, it was due to a couple of players with outlier performance in one stat whose distribution would otherwise have been normal – I tried to avoid using these if there was another proxy stat, but in cases where I do make reference to one of those stats, I provide an * to indicate that there may be some skew here. Those metrics are typically directionally correct, but have room for error in terms of percentile (anecdotally ~5-10% variance).
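Here is a rough sketch of the log-transform idea, again in Python. Using sample skewness as the "does it look normal?" check is an assumption for this example only (a proper normality test would be better), and the peer values are invented. Note that the log only works for strictly positive values, so stats with zeros would need separate handling.

```python
import math
from statistics import NormalDist, mean, stdev

def skewness(values):
    """Simple sample skewness; values near 0 suggest a roughly symmetric distribution."""
    mu, sigma = mean(values), stdev(values)
    return sum(((v - mu) / sigma) ** 3 for v in values) / len(values)

def log_percentile(value, peer_values):
    """Percentile computed on log-transformed data (all values must be > 0)."""
    logs = [math.log(v) for v in peer_values]
    z = (math.log(value) - mean(logs)) / stdev(logs)
    return 100 * NormalDist().cdf(z)

# Hypothetical right-skewed stat: most players are low, a couple are very high
peers = [0.05, 0.08, 0.10, 0.12, 0.15, 0.20, 0.28, 0.40, 0.65, 1.10]
logs = [math.log(v) for v in peers]

print(round(skewness(peers), 2))   # skew of the raw values
print(round(skewness(logs), 2))    # skew after the log transform
print(round(log_percentile(0.40, peers), 1))
```

If the skew drops towards zero after the transform, the percentile from the transformed data is the more trustworthy one.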
CONCLUSION (i’m putting this first since the detail section gets quite long):
This specific ranking is not supposed to be my main takeaway since it’s a somewhat arbitrary way of combining stats, but I needed a succinct way of getting my message across. Taking the average percentile, not rank, across MANY stat categories, Buendia ranked 25th out of 270 players across all ages and 7th out of 124 players when looking at just U25 players. Again, this is for players categorized as wingers/attacking midfielders by fbref who have over 810 mins of game time this season in one of the big 5 leagues.
For transparency’s sake (skip if desired), the stats that went into this ranking were:
– Dribble attempts per 90*, Dribble Success %, # of players dribbled past per 90*, Miscontrols p90, Dispossessions p90, Goals p90, xG p90*, Shots on Target p90*, Total pass attempts p90, Pass Completion %, Assists p90, xA p90*, Key Passes p90, Passes Into Final 3rd p90, Passes Into Penalty Area p90, Crosses Into PA p90, Progressive Passes p90, Passes While Pressured p90, Shot Creating Chances p90, Goal Creating Chances p90*, Tackles p90*, Tackle Win %, # Dribbles Contested (as defender) p90, Dribbles Successfully Tackled %, # Pressures (as defender) p90, Successful Pressure %, Blocks p90*, Interceptions p90*, Team Success (+/-) On – Off p90, Team Success (xG) On – Off p90
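To make the "average percentile, not rank" idea concrete, here is a small Python sketch. The player names, the two stats and the values are placeholders, and flipping the sign for stats where lower is better (like dispossessions) is how I’d illustrate it here, so treat that detail as an assumption of the example.

```python
from statistics import NormalDist, mean, stdev

# Stats where a LOWER raw value is better (hypothetical column name)
LOWER_IS_BETTER = {"dispossessed_p90"}

def percentiles(stat_by_player, lower_is_better=False):
    """Normal-based percentile for every player in one stat category."""
    values = list(stat_by_player.values())
    mu, sigma = mean(values), stdev(values)
    result = {}
    for player, value in stat_by_player.items():
        z = (value - mu) / sigma
        if lower_is_better:
            z = -z  # flip so that a high percentile always means "good"
        result[player] = 100 * NormalDist().cdf(z)
    return result

def composite_ranking(table):
    """table: {stat: {player: value}} -> [(player, average percentile)], best first."""
    per_stat = {s: percentiles(v, s in LOWER_IS_BETTER) for s, v in table.items()}
    players = list(next(iter(table.values())))
    avg = {p: mean(per_stat[s][p] for s in table) for p in players}
    return sorted(avg.items(), key=lambda item: item[1], reverse=True)

# Made-up numbers for three made-up players
table = {
    "key_passes_p90": {"Player A": 3.0, "Player B": 1.5, "Player C": 1.0},
    "dispossessed_p90": {"Player A": 1.0, "Player B": 2.0, "Player C": 3.0},
}
ranking = composite_ranking(table)
for player, score in ranking:
    print(player, round(score, 1))
```

Every category gets equal weight here, which is exactly why I call the ranking somewhat arbitrary.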
Now, like I mentioned, the ranking isn’t what I want to push here since it’s tough to determine what combination of categories is best to estimate the overall quality of a creative player (although I was somewhat comforted by the other names that topped the lists: Sancho, Bernardo Silva, Gnabry, Nkunku, Foden in U25; Messi, Dybala, Neymar, Muller, Sancho, Coutinho over all ages). Even with some deviation and inaccuracies in the above calculation, the goal of this rank is to show the general overall value range of Buendia. Combine that with the fact that he’s among the cheapest for that value and already Premier League tested, and I think he would make a very smart buy.
From here on in my post, I break down in detail some of the metrics that I used in my ranking above. It’s quite long, so here are the key takeaways if you don’t want to read much more:
– Buendia has elite Chance Creation ability, primarily by completing passes into the danger areas at a much higher rate than his peers.
– He doesn’t shoot or score much, not his MO; potentially could stand to be more clinical with his chances as well.
– A very direct dribbler; tries and succeeds at taking on defenders at a high rate, but because he has the ball so much, he also gets tackled and dispossessed a lot overall. However, his rate of dispossession relative to how much he is on the ball is actually not bad.
– While he has stats that resemble those of a highly creative attacker, he has the possession and ball carrying ability of a Box to Box midfielder. He would come deep into Norwich’s defensive territory to claim the ball and was relied on to progress it up field and create shooting chances.
– A highly active defender, running back to help on defense and engaging ballcarriers at very high rates, albeit with an average rate of success on a per attempt basis. Regardless, the stats lead you to believe that he’d be willing to operate either in a press heavy team or in one that sits back without much possession, and that he wouldn’t be bad at either.
– And finally, Norwich was a bad team with him on the pitch, but they were even worse off without him. He statistically seemed to mean to Norwich what Grealish meant to Villa (although I don’t think he has that same ability to change a game with a moment of sheer brilliance like Grealish).
DETAILS (the meat of what I wanted to get across):
In this section, I want to look more closely at specific aspects of Buendia to break down strengths/weaknesses and see how the stats hold up to public perception. If you don’t believe my words from above, this is where I provide the statistical perspective to try to back it up. Note: the lower the percentile, the worse it is, and vice versa. The percentile is calculated based on the standard deviation of the distribution and the distance of the individual from the mean, whereas rank is relative to how they directly stack up with their peers.
– Shot Chance Creation p90: 93.3% (U25: 4th out of 124; All: 15th out of 270)
– Goal Chance Creation p90*: 29.0% (U25: 84th, All: 182nd)
The first stat documents the number of shots the player directly helps create (by making a pass, dribbling past an opponent, drawing a foul, or trying to take a shot which ricocheted off someone and led to someone else getting a chance to shoot), whereas the latter stat does the same but only tracks actions which actually resulted in goals (not “goal worthy” chances). What this tells me is that Buendia is in great territory when it comes to creating opportunities to take shots (specifically due to opportunities created from passes if you drill down more), but that these opportunities are not being converted into goals (if he was the one taking most of the shots, then you could attribute it to him, but that’s not the case). I’m not sure if we can conclude that this is due to his teammates letting him down by not converting efforts, but that is likely a factor.
– Assists p90: 82.5% (U25: 20th, All: 47th)
– xA p90*: 89.1% (U25: 10th, All: 21st)
– Passes Attempted p90*: 94.0% (U25: 8th, All: 18th)
– Pass Completion %: 51.1% (U25: 67th, All: 141st)
– Key Passes p90: 95.4% (U25: 4th, All: 10th)
– Completed Passes Into Final 3rd p90: 97.1% (U25: 5th, All: 8th)
– Completed Passes Into Penalty Area p90: 90.6% (U25: 4th, All: 19th)
– Completed Crosses Into Penalty Area p90: 36.7% (U25: 78th, All: 185th)
– Completed Progressive Passes p90: 98.8% (U25: 1st, All: 5th)
– Passes While Pressured: 99.3% (U25: 2nd, All: 3rd)
The above stats are a breakdown of his chance creation passing – again, he ranks incredibly in some of these very influential categories.
So what does the above tell us? Well, Buendia is on the high end for passes attempted and is pretty bang average on his overall completion rate (actually he’s above average on short and mid range pass completion, but below average on long range). However, he is at the extremely high end of the spectrum for passing the ball into dangerous places, which may itself lower his overall completion %, since these passes are likely tougher to make than a back pass or horizontal pass. Now, I can acknowledge that there is an argument to be made that those creative numbers are a reflection of his high pass attempt rate, but they generally do hold up even on a “% of pass attempts” basis (which isn’t the case for all players):
– Key Passes per pass attempt: 83.7% (U25: 17th, All: 45th)
– Completed Passes Into Final 3rd per pass attempt: 95.8% (U25: 5th, All: 11th)
– Completed Passes Into Penalty Area per pass attempt: 69.2% (U25: 36th, All: 76th)
– Completed Crosses Into Penalty Area per pass attempt*: 23.1% (U25: 94th, All: 216th)
– Completed Progressive Passes per pass attempt: 92.0% (U25: 13th, All: 26th)
Bottom line: Buendia gets the ball to dangerous areas at a very high level.
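The “per pass attempt” adjustment above is just a ratio taken before the percentile step. A quick Python sketch, with two invented players to show how volume can flatter the raw p90 number:

```python
def per_attempt(count_p90, attempts_p90):
    """Share of attempts that produced the event (e.g. key passes per pass attempt)."""
    if attempts_p90 == 0:
        return 0.0  # no attempts, so no rate to speak of
    return count_p90 / attempts_p90

# Hypothetical players: (key passes per 90, pass attempts per 90)
players = {
    "High volume passer": (3.0, 60.0),
    "Low volume passer": (2.0, 30.0),
}

rates = {name: per_attempt(kp, att) for name, (kp, att) in players.items()}
for name, rate in rates.items():
    print(name, f"{rate:.3f}")
```

In this made-up case the high volume passer leads on raw key passes p90 but trails on a per attempt basis, which is the effect I wanted to rule out for Buendia.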
– Touches p90: 97.5% (U25: 7th, All: 13th)
– Touches p90 in Def Pen/Def 3rd/Mid 3rd/Att 3rd/Att Pen*: 99/100/92/81/12% (U25: 3rd/1st/14th/16th/113th, All: 6th/1st/33rd/41st/245th)
– Dribbles Attempted p90*: 88.2% (U25: 20th, All: 30th)
– Dribble Success %: 96.6% (U25: 5th, All: 10th)
– # Players Dribbled Past p90*: 97.5% (U25: 5th, All: 9th)
– Dispossessions p90*: 3.7% (U25: 124th, All: 263rd)
– Dispossessions per Carry*: 42.4% (U25: 74th, All: 168th)
Buendia is on the ball A LOT, outpacing the other creative players in the cohorts by tracking back, taking the ball from deep, and being involved all the way until the opposing team’s penalty area. He has the presence of a Box-to-Box player, but is creative like an attacker. He also takes on a lot of defenders and is quite great at getting past them.
Then comes the bad part of his game – dispossessions (I didn’t see any stats showing where the dispossessions happen). Unfortunately, players who dwell on the ball and try to play around defenders as much as he does typically also have high numbers of dispossessions (I’m making this statement after looking at my database). Even the great players get dispossessed a ton (Neymar in the 0th percentile and Messi in the 15th percentile). However, just like I mentioned above for the passing metrics, raw p90 metrics don’t paint the entire picture when you have a ton of attempts. So when you look at his Dispossessions per Carry, he’s much more in line with the average – it’s just that he carries the ball a lot more than the average player. If you look at other very direct players who have similarly high raw dispossession numbers, very few have as high a per carry percentile (take a darling on this sub for example, Wilfried Zaha: 99th percentile for dribbles past a player, but 0th percentile for dispossessions and 2nd percentile for dispossessions per carry).
I’m not trying to say that dispossessions aren’t a weakness in his game – they are. But they’re a direct result of how much he is on the ball. The perception of him as someone who gives the ball away in dangerous positions is a function of how deep he has to play to progress the ball on a bad Norwich team – you’d imagine that someone with his creative profile wouldn’t be playing in his own defensive area so much. This is definitely projecting here, but you’d hope he’d likely be able to stay further up the pitch on a better team where his (average rate of) dispossessions wouldn’t be as much of a problem.
I’m going to shorten my detail on the other areas of his game because my primary point is to convey his chance creation and ball progression abilities, things that this Arsenal team sorely lacks right now. That’s not to say that I’m trying to hide other aspects of his game – I’ll just be a bit more brief since this has virtually turned into an essay…
– Goals p90: 1.8% (U25: 112th, All: 251st)
– xG p90*: 12.7% (U25: 109th, All: 244th)
– Shots p90* / Shots on Target p90*: 23/15% (U25: 97th/106th, All: 207th/229th)
– Tackles Attempted p90* / Success Rate: 100/54% (U25: 2nd/63rd, All: 3rd/132nd)
– Dribbles Contested p90 / Success Rate: 99/44% (U25: 1st/72nd, All: 3rd/151st)
– Dribbled Past p90* (as defender): 0% (U25: 124th, All: 270th)
– Pressed Ballcarrier p90 / Success Rate: 98/49% (U25: 5th/62nd, All: 7th/134th)
– Blocks p90* / Interceptions p90*: 99/75% (U25: 5th/27th, All: 6th/62nd)
Effectively, he’s not much of a goal threat. But it’s also a result of where he typically plays on the pitch – as you’ll recall from his touches above, he typically doesn’t get into the opponent’s penalty area very much, so he has fewer opportunities to take shots.
As a defender, he seems like a high effort player who engages ballcarriers a lot, but with average returns (note: it’s not saying that he’s a 50-50 tackler – he’s actually at 66% for successful tackle % – it just means that he sits close to the average of the peer set). I’m not sure if this is how fans feel when watching him, but he seems to be a solid defensive player for a creator and someone who is willing to get back on defense to provide support (as a lot of his tackle and press attempts come in the defensive 3rd). Nothing terrific, but good effort and active, with average returns.
– Team Net Goals (+/-) While On Pitch*: 6.9% (U25: 120th, All: 258th)
– Team Net Goals (+/-) While On Pitch vs Off: 76.6% (U25: 22nd, All: 66th)
– Team Net xG (+/-) While On Pitch*: 20.7% (U25: 104th, All: 221st)
– Team Net xG (+/-) While On Pitch vs Off: 74.1% (U25: 17th, All: 54th)
I think this is a nice metric to check impact / importance to a team, but I understand that there are a lot of other variables to consider when looking at on vs off (quality of competition, if a team is a man down, etc), so I take this part with a grain of salt. With that said, what this says for Buendia is that Norwich were not good when he was on the pitch, but they were much worse when he was off of it. Both net actual goals and net xG bear this out (which is true for most players, but not all). I’ve seen people mention that Norwich was better off against us once they benched Buendia – that may have been the case that game, but by and large through the season they were much better off with him on it and he was an integral piece for them.
You can argue that with the dearth of quality around him, his stats and metrics are inflated – sure, that could be the case, and you can imagine that there’s some regression to the mean if he joins a new club next season, but it’s the same case with Jack Grealish, and everyone would love to bring him onto their team at a much higher fee than Buendia would cost (he’s the only other non-Man City EPL player in the top 25 of my U25 list at #13, and #42 in my All ages list).
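For reference, this is how I understand the on vs off number to be built: the team’s net goals per 90 while the player is on the pitch, minus the team’s net goals per 90 while he is off it. A small Python sketch with invented figures (these are NOT Norwich’s real numbers):

```python
def net_per90(goals_for, goals_against, minutes):
    """Team goal difference per 90 minutes over a block of minutes."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return (goals_for - goals_against) * 90 / minutes

def on_off(on, off):
    """Net per 90 with the player on the pitch minus net per 90 with him off it."""
    return net_per90(*on) - net_per90(*off)

# Hypothetical season: (team goals for, team goals against, minutes)
on_pitch = (20, 45, 2700)   # bad team even when he plays
off_pitch = (6, 30, 720)    # but far worse when he doesn't

print(round(net_per90(*on_pitch), 2))
print(round(net_per90(*off_pitch), 2))
print(round(on_off(on_pitch, off_pitch), 2))
```

The same calculation works for xG by swapping goals for xG for and xG against. A negative on-pitch number with a positive on-off number is the "bad with him, worse without him" pattern.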
Since I started with my conclusion at the beginning of my post, I’ll end by thanking you for reading all the way to the end here and with a quick summary of my case for Emi Buendia: while stats aren’t everything, the data bears Buendia out to be a very high level creator with fantastic passing and ball-carrying ability who, while he has some weakness in terms of dispossessions and finishing, would represent a great bargain for the value he brings, especially when compared to the rates quoted on other high value players.