Since modern statistics began altering the landscape of mainstream analysis, metrics have steadily evolved, turning complex analysis into something easier to digest for everyone from seasoned data scientists to hobbyists.
Generally, shot attempts under the Corsi and Fenwick umbrella were recorded at 5v5, with 5v4 and other game situations following suit. As metrics evolved and gained sophistication to mimic game situations, they were refined to eliminate score effects: the pattern created when a team holding a lead falls into a defensive shell instead of attacking, trying to lock down the win, while the trailing team dictates the flow of play and piles up shot attempts.
Corsi/Fenwick ‘close’ was born with the intent of overcoming score effects. Since score effects mostly skew third-period play, the ‘close’ metric counted 5v5 shot attempts in the first two periods as usual, but only counted third-period events when teams were tied or within one goal. Eliminating the non-‘close’ data reduced how much score effects assert themselves in the numbers. But was this the right approach?
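The filter described above can be sketched in a few lines. This is a minimal illustration of the article's description of the ‘close’ rule (all events in periods one and two, within-one-goal events in the third); the function name and inputs are hypothetical, not from any published implementation.

```python
def is_close(period, goal_diff):
    """Return True if a 5v5 shot attempt counts toward Corsi/Fenwick 'close'.

    period: 1, 2, or 3; goal_diff: shooting team's goals minus opponent's.
    Per the article: periods 1-2 always count; the third period only
    counts when the game is tied or within one goal.
    """
    if period <= 2:
        return True
    return abs(goal_diff) <= 1
```

Events failing the filter are simply discarded, which is exactly the data-reduction problem discussed next.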
Reducing a dataset in order to form solid conclusions seems counterproductive and far from ideal. Discarding the data didn't really improve overall analysis, even if it effectively countered score effects. Even when the ‘close’ metric served its purpose, the data elimination made it less reliable overall.
The same logic applied to splitting data by game situation. Teams start tied and end up leading or trailing as the game progresses, and data was parsed into those states: tied, or leading/trailing by one, two, or three (or more) goals. Splitting by game state suffered the same data-reduction problem as ‘close’ metrics, making the end result less reliable even though it paralleled observed game situations.
Hockey analytics pioneer Eric Tulsky, now the Director of Analytics with the Carolina Hurricanes, had an answer that reduced score effects without sacrificing data: adjusting for score. Keeping every event and applying a weighting according to game score produces better overall accuracy and improves predictability, the ultimate hallmark of analytics.
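The idea is simple to sketch: instead of throwing events away, every shot attempt is kept and weighted by the score state it occurred in. The weights below are hypothetical placeholders chosen only to show the mechanics, not Tulsky's published coefficients.

```python
# Hypothetical score-state weights: leading teams attempt fewer shots,
# so their attempts are up-weighted; trailing teams inflate attempts,
# so theirs are down-weighted. Real coefficients come from fitted models.
SCORE_STATE_WEIGHTS = {
    "leading": 1.05,
    "tied": 1.00,
    "trailing": 0.95,
}

MIRROR = {"leading": "trailing", "trailing": "leading", "tied": "tied"}

def score_adjusted_cf_pct(events):
    """events: list of (score_state, attempts_for, attempts_against).

    Returns a score-adjusted Corsi-for percentage. Every event is kept;
    only its weight changes with the score state.
    """
    adj_for = adj_against = 0.0
    for state, cf, ca in events:
        adj_for += cf * SCORE_STATE_WEIGHTS[state]
        # the opponent is in the mirrored score state
        adj_against += ca * SCORE_STATE_WEIGHTS[MIRROR[state]]
    return 100 * adj_for / (adj_for + adj_against)
```

With even attempts while leading, the adjusted CF% lands above 50, reflecting that a leading team's attempts come at a discount to the trailing opponent's inflated total.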
After all, analytics is the study of data, but the end result is to improve future predictability. Better data and better methods vastly improve predictions.
Score-adjusted data quickly became the preferred method among hockey analysts, gaining traction and prominence during the 2014-15 season. Normalizing for game situation offered a more reliable alternative to reducing datasets with ‘close’ filters or individual score states.
Even now, score-adjusted stats have overtaken earlier shot metrics, and the evolution continues. Borrowing from the soccer analytics model, expected goals has become the next preference.
Virtual ink for hockey expected goals couldn't pave a small driveway, but there's enough of a foundation from dedicated analysts to build upon basic concepts for future use. Expected goals (xG) attempts to assign a measurable value to shooting locations (incorporating an analytics sore spot, shot quality) and then compares the expected values with the actual (observed) values to produce a positive (outperforming expected goals) or negative (underperforming) gap.
Expected goals measures a metric (goals, in this case) in relation to the actual/observed results. Data integrity is maintained because all shots are taken into consideration, with player- or team-level values derived from shooting location. In short, it tries to incorporate some elements to control for shot quality.
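A toy version of the comparison looks like this. The distance-decay formula below is a deliberately crude stand-in for a real fitted model (the Corsica model linked at the end uses many more features and proper coefficients); every number in it is a hypothetical assumption.

```python
import math

def shot_xg(distance_ft, shot_type="wrist"):
    """Hypothetical per-shot goal probability that decays with distance.

    A real model is fitted on historical data; this curve and the
    deflection bonus are illustrative placeholders only.
    """
    base = math.exp(-distance_ft / 20.0)        # closer shots score more often
    bonus = 1.2 if shot_type == "deflection" else 1.0
    return min(base * bonus, 0.95)              # cap: no shot is a sure thing

def xg_differential(shots, goals_scored):
    """shots: list of (distance_ft, shot_type).

    Returns (total xG, observed minus expected). A positive gap means
    outperforming the model; negative means underperforming.
    """
    total_xg = sum(shot_xg(d, t) for d, t in shots)
    return total_xg, goals_scored - total_xg
```

Summing per-shot probabilities and subtracting from actual goals is the core of every xG comparison, whatever the underlying model.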
Save percentage is influenced by many on-ice factors and can be a major contributor to measured results. As with any new theory and its practical examples, counter-arguments are prevalent and keep the discussion grounded. David Johnson of Hockey Analysis has been an ardent critic of the limitations of predictive analytics and expected goals, with some of the pushback owing to the influence of save percentage on whether a shot on goal is stopped. Weighting an unobstructed shot on goal from a prime shooting area should depend on more than shot location alone. Missed and blocked shots and the goaltender's readiness to make the save aren't fully accounted for, and the gap can be significant.
For instance, a player in the slot with a clear shot on goal who takes 10 shots will, on average, see a certain percentage go in. As a static measure, this holds up. Real game situations don't occur as simply as this vacuum example. The player in question could be receiving a pass from the wing or from behind the net, forcing goaltender movement and changing the likelihood of the shot becoming a goal. These motions and game-play details aren't fully accounted for in the expected goals model, and Johnson has been a vocal critic on some of these points.
As stated, predictability is the hallmark of analytics, and metrics are created to further aid analysts in predicting the future. This plays well for poolies looking for an advantage, and I suspect expected goals will become much more mainstream among fantasy hockey analysts in coming seasons.
The best part about a new metric is how a fantasy GM can use it to chase championships. Applicable strategies exist for daily fantasy play, but there's so much randomness in a single hockey game that even after identifying trends, GMs may not get the intended results.
Fantasy GMs can use this information to make assessments before the draft and for in-season waiver-wire additions or trades. Identifying underwhelming players who may be victims of bad percentages, or who aren't meeting their expected goals potential on other poolies' teams, could reap unexpected fortunes. Savvy GMs can exploit that knowledge and analysis to pluck gems from other lineups.
I plan to expand more in future writings and use xG more often in analysis. For the moment, I'll offer a brief introduction using the image below, which ranks the top 5v5 goal producers (as of Monday, Feb. 13, 2017).
[Chart: top 5v5 goal producers, expected vs. observed on-ice values]
Each colored area contains the expected goals component and the actual metric, with the table arranged to show the expected value and the observed value in consecutive columns. All data are on-ice values, with the exception of ixG (individual expected goals) and goals. The Sh% column is on-ice shooting percentage, adding some shooting efficiency to the mix to account for team-level scoring success.
For instance, T.J. Oshie has an expected GF% (on-ice goals-for percentage) of 54.06 while recording an observed GF% of 69.41, outperforming his expected goals for (xGF%) by a considerable margin. This constitutes a situation where there's room to regress back toward the expected model.
Individually, Oshie has an expected goals total (ixG) of 8.01. With 16 goals scored, he has doubled the expected goals benchmark, a pace that's likely not repeatable, but not out of the question.
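The arithmetic behind that read is straightforward. Plugging in the Oshie figures from the table (xGF% 54.06, observed GF% 69.41, ixG 8.01, 16 goals); the helper name is made up for illustration.

```python
def performance_gap(expected, observed):
    """Observed minus expected: positive means outperforming the model,
    negative means underperforming (and possible regression ahead)."""
    return observed - expected

# On-ice: Oshie's observed GF% vs his expected GF%
onice_gap = performance_gap(54.06, 69.41)

# Individual: actual goals vs individual expected goals (ixG)
goal_ratio = 16 / 8.01

print(f"On-ice gap: {onice_gap:+.2f} percentage points")
print(f"Goals vs ixG: {goal_ratio:.2f}x the expected model")
```

A gap of roughly +15 points and a goals total near double the ixG benchmark are both flags for regression, which is exactly the fantasy angle discussed above.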
This entire chart is an interesting look at some of the even-strength goals leaders. I'll break down more expected goals markers in the future, but if you really want an excellent primer, it would be beneficial to become familiar with the theory behind the models. The links below offer an incredible breadth of explanation of how the models are calculated and utilized.
Google – a simple search for online content
Corsica Part 1 – introductory information, but very beneficial.
Corsica Part 1.5 – building upon the introduction.
Hockey Graphs – improving predictability is a priority in analytics models.
Game Charts – some context about expected versus observed results.
Fanrag – a useful practical example of metric.
Hockeybuzz – another practical example of expected versus observed goals.