Quantifying the Value of an NHL Timeout Using Survival Analysis

This project was originally posted on Hockey Graphs.

Introduction

Hockey, by nature, is a fast-paced sport that can be difficult to represent by discrete situations. While most other professional sports can be viewed as combinations of distinct in-game events – at-bats in baseball, plays and series in football, and even possessions in basketball – hockey is extremely fluid and the game state is constantly changing. This difference in game flow means that there are far fewer opportunities for a hockey coach to make any decisions based on distinct game states. While, for example, a football coach has several opportunities per game to decide whether or not to attempt a fourth-down conversion, a hockey coach has very few chances to make any comparable choice that can affect the outcome of the game. However, there are a few tools available to a hockey coach that can be researched so as to optimize their effectiveness in helping a team to win a game.

The most-researched of these decisions (thus far) for an NHL coach is when to pull the goalie in an endgame situation. Several papers have been published on the optimal time to pull the goalie, including one by Beaudoin and Swartz in 2010 and one by Brown and Asness in 2018. (For even more great work on goalie pull times, you can check out Meghan Hall’s talk from the 2019 Seattle Hockey Analytics Conference and her Tableau dashboard, as well as the Goalie Pull Twitter Bot created by Rob Vollman and MoneyPuck.com.) All of this prior research has found that NHL teams should pull their goalies much sooner than conventional wisdom suggests, as teams are much more likely to score to tie the game if they pull their goalie earlier rather than later. However, beyond pulling the goalie, there are still a few more tools at a coach’s disposal. Teams are allowed to challenge goals for certain rule infractions, use a 30-second timeout during a stoppage in play, or switch goalies if the starter is having a bad game, in addition to personnel decisions regarding line combinations or matching up players against the other team. This article focuses on timeout usage, but I plan to explore the other tools in future work.

Before analyzing the effects of a called timeout, it will be helpful to go over the rules regarding an NHL timeout. Each team is provided with one 30-second timeout per game. Unlike the NFL and the NBA, however, this timeout must be used during a normal stoppage in play; whereas NFL and NBA teams can stop the clock with a timeout, an NHL team must wait until play is stopped to use a timeout. For the purposes of this analysis, there are only two other rules of concern. The first is that since the beginning of the 2017-18 season, a team that has just iced the puck cannot use its timeout. Prior to the 2017-18 season, teams occasionally used their timeout after icing the puck in order to rest their players, since a team cannot change its players after it ices the puck. The second is that, when the coach’s challenge was first introduced before the 2015-16 season, the penalty for a failed challenge of any kind was losing your timeout. Prior to the 2018-19 season, the challenge rule was altered to make the penalty for a failed offside challenge a minor penalty instead of losing a timeout (losing a timeout remained the penalty for a failed goaltender interference challenge). Prior to the 2019-20 season, the challenge rule was altered again to penalize all failed challenges with a minor penalty. Timeouts are no longer lost from unsuccessful challenges.

The goal of this project is to determine when NHL teams use their timeout, and to attempt to estimate the value of a timeout in any situation where teams commonly use a timeout. This part of the project focuses exclusively on endgame situations, as this is when most teams conventionally use their timeout. Timeouts will be quantified both in terms of win probability gained using survival analysis to model the end of regulation, and also by shot attempts gained by using a timeout with zero-inflated negative binomial regression to predict shot attempts at the end of the game. All results will be discussed at the end of this article.

Data

There were 3,908 called timeouts from the 2014-15 season through the 2018-19 season, in addition to 494 called timeouts from the (currently unfinished) 2019-20 season. These timeouts were obtained using EvolvingWild’s NHL play-by-play data scraper. These timeouts are broken down by season in the table below, which also displays the percentage of all available timeouts used per year.

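| Season | Number of Called Timeouts | Number of Games | Percent of Available Timeouts Used |
|---|---|---|---|
| 2014-15 | 1029 | 1230 | 41.8% |
| 2015-16 | 839 | 1230 | 34.1% |
| 2016-17 | 755 | 1230 | 30.7% |
| 2017-18 | 641 | 1271 | 25.2% |
| 2018-19 | 644 | 1271 | 25.3% |
| 2019-20 | 494 | 1082 | 22.8% |
| TOTAL | 4402 | 7314 | 30.1% |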
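The percentage column can be reproduced from the other two columns: each team has one timeout per game, so every game makes two timeouts available. The following is a minimal Python sketch of that arithmetic; the names `SEASONS` and `pct_timeouts_used` are illustrative and not part of the original analysis.

```python
# Called timeouts and games played per season, copied from the table above.
SEASONS = {
    "2014-15": (1029, 1230),
    "2015-16": (839, 1230),
    "2016-17": (755, 1230),
    "2017-18": (641, 1271),
    "2018-19": (644, 1271),
    "2019-20": (494, 1082),
}


def pct_timeouts_used(timeouts, games):
    """Percent of available timeouts used, with two available per game."""
    return 100.0 * timeouts / (2 * games)


for season, (timeouts, games) in SEASONS.items():
    print(f"{season}: {pct_timeouts_used(timeouts, games):.1f}%")

total_timeouts = sum(t for t, _ in SEASONS.values())
total_games = sum(g for _, g in SEASONS.values())
print(f"TOTAL: {pct_timeouts_used(total_timeouts, total_games):.1f}%")
```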

One of the first (and simplest) findings of this project is that NHL teams do not use their timeout as often as they likely should. Prashanth Iyer wrote about this phenomenon a few years ago in an article about the coach’s challenge (and its relation to timeout usage) and found that timeouts are used even less frequently when accounting for those lost to failed challenges. However, even if it provides no measurable advantage in any situation, sports convention would suggest that it is still suboptimal to not use every tool at one’s disposal to influence the outcome of a game. If an NFL head coach were to not use any of his timeouts, most people would probably call for him to be fired.

There are a few plausible reasons why timeout usage is decreasing over time. When the coach’s challenge was first introduced, the penalty for a failed challenge was the loss of the team’s timeout. Some coaches tended to save their timeout so that they could challenge if the situation arose (and thus viewed the timeout as practically worthless); other times, coaches might use a challenge, expecting it to fail, in order to get a longer timeout than the allowed 30 seconds. However, now that the rule has changed, it is possible that there are coaches who never used their timeout but frequently challenged goals and lost, resulting in a decline in apparent timeout use. Additionally, now that a timeout cannot be used after an icing, there are fewer instances where a coach might decide to use his timeout.

To get a better idea of precisely when NHL coaches tend to use their timeouts, the histograms below display the distribution of called timeouts by seconds elapsed and by score differential from the perspective of the team taking its timeout, broken down by season.

From the histograms above, it is clear that most timeouts are taken in the last 5 minutes of regulation, and that most timeouts are taken by the trailing team, with the most common score state being the team taking its timeout losing by one goal. With this in mind, this project will focus on attempting to quantify the value of a timeout used in the final five minutes of a game in which one team is trailing by a goal. There were 2,185 one-goal games with five minutes left in regulation between the 2014-15 and 2018-19 seasons. For this dataset, each observation represents one game in which one team is losing by a goal with five minutes left in regulation (this dataset was also obtained and generated by using EvolvingWild’s scraper). Each observation has the following features:

  • game_id: The unique NHL identifier for the game
  • season: The season in which the game takes place
  • leading_team: The team leading with five minutes left
  • trailing_team: The team trailing with five minutes left
  • home: A factor variable indicating whether the leading team is the home team
  • time: The time elapsed \(0 \leq t \leq 300\) in seconds since the five-minute mark of the third period at which one of the following events first occurs:
    • The leading team scores
    • The trailing team scores
    • The game ends \((t = 300)\)
  • status:
    • \(0\) if either the leading team scores or the game ends with no additional goals
    • \(1\) if the trailing team scores
  • to_time_trail:
    • The time elapsed \(0 \leq t \leq\) time in seconds since the five-minute mark at which the trailing team takes its timeout, if it uses its timeout
    • time if the trailing team does not take its timeout or uses its timeout after time
  • to_taken_trail: A factor variable indicating whether the trailing team takes its timeout before time
  • to_time_lead:
    • The time elapsed \(0 \leq t \leq\) time in seconds since the five-minute mark at which the leading team takes its timeout, if it uses its timeout
    • time if the leading team does not take its timeout or uses its timeout after time
  • to_taken_lead: A factor variable indicating whether the leading team takes its timeout before time
  • pull_time:
    • The time elapsed \(0 \leq t \leq\) time in seconds since the five-minute mark at which the trailing team first pulls its goalie, if it pulls its goalie
    • time if the trailing team does not pull its goalie or first pulls its goalie after time
  • goalie_pulled: A factor variable indicating whether the trailing team first pulls its goalie before time

Additionally, there are 12 features representing team-level per-60-minute rates for the two teams: 6 estimating the trailing team’s offensive ability, and 6 estimating the leading team’s defensive ability. All rates are obtained from Evolving-Hockey, and each team has one of each rate measure for each of the 5 seasons:

  • trail_EV_GF: The trailing team’s season-long 5-on-5 goals for per 60 minutes rate
  • trail_EV_CF: The trailing team’s season-long 5-on-5 Corsi for per 60 minutes rate
  • trail_EV_xGF: The trailing team’s season-long 5-on-5 expected goals for per 60 minutes rate
  • trail_PP_GF: The trailing team’s season-long power play goals for per 60 minutes rate
  • trail_PP_CF: The trailing team’s season-long power play Corsi for per 60 minutes rate
  • trail_PP_xGF: The trailing team’s season-long power play expected goals for per 60 minutes rate
  • lead_EV_GA: The leading team’s season-long 5-on-5 goals against per 60 minutes rate
  • lead_EV_CA: The leading team’s season-long 5-on-5 Corsi against per 60 minutes rate
  • lead_EV_xGA: The leading team’s season-long 5-on-5 expected goals against per 60 minutes rate
  • lead_SH_GA: The leading team’s season-long shorthanded goals against per 60 minutes rate
  • lead_SH_CA: The leading team’s season-long shorthanded Corsi against per 60 minutes rate
  • lead_SH_xGA: The leading team’s season-long shorthanded expected goals against per 60 minutes rate

The analyses performed on this dataset are described in the next sections.

Estimating Win Probability Gained by a Timeout

In the context of the final five minutes of regulation while trailing by a goal, a natural first step in attempting to quantify the value of a timeout is to estimate the win probability gained by using a timeout as compared to letting the game flow naturally. A timeout used late in a one-goal game by the trailing team is typically used to strategize and develop a set play off an offensive zone faceoff, as well as possibly resting the team’s top scorers for 30 seconds to prevent them from getting fatigued. The additional strategizing and rest could plausibly increase the team’s probability of scoring the tying goal, so the first model I used to estimate the value of a timeout was a logistic regression, with success defined as the trailing team earning standings points; that is, the trailing team at the five-minute mark either wins in regulation and earns two points (which is rare) or sends the game to overtime and earns at least one point. Using the features described above, I trained 6 logistic regression models, each using a different pair of complementary team ability metrics (such as the trailing team’s 5-on-5 goals for per 60 minutes and the leading team’s 5-on-5 goals against per 60 minutes) in conjunction with time remaining, whether the leading team was home, any timeouts called, and any goalie pulls. However, these models were not appropriate in trying to estimate the value of a timeout, since the logistic regression could not accurately represent the time-dependence of the timeouts or the goalie pulls, and thus the coefficients on the variables of interest were not accurate measures of the true effect of these coaching decisions. In order to capture the time-dependence of these variables, as well as more appropriately model the end of a close NHL game, I instead used survival analysis to estimate the value of a timeout.

Survival Analysis

Survival analysis is a tool typically used in a medical context and is most often used to estimate the probability that a given individual will survive for some amount of time after some event. (Many of the conventions I used for performing survival analysis in R came from this survival analysis primer from Emily Zabor, which includes a basic introduction to survival analysis.) This formulation can be translated to the last five minutes of an NHL hockey game fairly easily:

  • The leading team represents the individual in question; we would like to know the probability that they “survive” for a certain amount of time, beginning at the five-minute mark of the third period.
  • “Survival” represents the leading team holding its lead and winning the game.
  • If the trailing team scores the tying goal, we say that the leading team has effectively “died”; the event of the trailing team scoring has caused the leading team to fail to hold its one-goal lead, and while either team could plausibly win, the result of the game is likely relatively close to a coin flip.
  • When the leading team scores to increase its lead to 2, we censor the observation; in this context, this means that we say the leading team survived to the time that they scored and then they “left” the study. Once the leading team increases its lead, we assume that the trailing team will not overcome a two-goal deficit with fewer than five minutes remaining, and so we censor the observation and do not record any events following the censoring time.

This leads to the formulation of the time and status variables described in the Data section:

  • If neither team scores in the final five minutes, we set time to 300 (the number of seconds elapsed from the five-minute mark until the end of regulation) and set status to 0 (the trailing team did not score)
  • If the leading team scores to increase its lead in the final five minutes, we set time to the number of seconds elapsed from the five-minute mark until the leading team’s goal and censor the observation with status equal to 0
  • If the trailing team scores to tie the game in the final five minutes, we set time to the number of seconds elapsed from the five-minute mark until the trailing team’s goal and set status to 1 (the trailing team scored)
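The three cases above can be sketched as a small encoding function. This is only an illustration in Python: the input format (a list of `(seconds, team)` tuples) and the name `encode_game` are assumptions, not the structure of the scraped play-by-play data.

```python
REGULATION_END = 300  # seconds from the five-minute mark to the end of regulation


def encode_game(goals):
    """Map goals scored in the final five minutes to (time, status).

    `goals` is a hypothetical list of (seconds_since_five_minute_mark, team)
    tuples, where team is "lead" or "trail". Only the first goal matters.
    """
    in_window = sorted(g for g in goals if 0 <= g[0] <= REGULATION_END)
    if not in_window:
        return REGULATION_END, 0  # no goal: censored at the end of regulation
    seconds, team = in_window[0]
    if team == "trail":
        return seconds, 1  # tying goal: the event of interest
    return seconds, 0  # leading team extends its lead: censored at that time


print(encode_game([]))
print(encode_game([(212, "lead")]))
print(encode_game([(95, "trail"), (250, "lead")]))
```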

The Kaplan-Meier survival curve representing the probability that the leading team prevents the trailing team from scoring the tying goal in the final five minutes is shown below, with tick marks representing censored observations. The height of the curve at each time represents the estimated probability that the leading team survives until the given time. To get an estimate of the probability that the leading team survives with a lead until the end of the game, we get the height of the curve at \(t = 300\).
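For readers who have not seen the estimator before, the Kaplan-Meier curve is a running product: at each time where at least one event occurs, the current survival estimate is multiplied by the fraction of at-risk observations that did not experience the event. The sketch below is a plain Python illustration on made-up toy data, not the code used for this analysis (which was done in R).

```python
def kaplan_meier(times, statuses):
    """Return [(t, S(t))] at each distinct event time.

    `statuses` uses 1 for an event (tying goal) and 0 for a censored game.
    """
    data = sorted(zip(times, statuses))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Gather every observation that ends at this exact time.
        while i < len(data) and data[i][0] == t:
            events += data[i][1]
            leaving += 1
            i += 1
        if events > 0:
            survival *= 1 - events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving  # events and censored games both leave the risk set
    return curve


# Toy data (hypothetical): five games, two of which end with a tying goal.
toy_curve = kaplan_meier([50, 120, 300, 300, 200], [1, 0, 0, 0, 1])
print(toy_curve)
```

In the toy data, the game censored at 120 seconds shrinks the risk set without lowering the curve, which is why the second drop (at 200 seconds) is computed out of three remaining games rather than four.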

The estimated survival probability is summarized in the table below.

| Time | Games | Tied Games | Survival Probability | Standard Error | Lower 95% CI | Upper 95% CI |
|---|---|---|---|---|---|---|
| 300 | 2185 | 499 | 0.7213763 | 0.0109578 | 0.7002159 | 0.7431762 |

Thus the estimated probability that a given NHL team leading by one with five minutes remaining in regulation survives to the end of the game is \(\Pr(T > 300) = 0.721\), where \(T\) is the time at which the trailing team scores the tying goal, with a 95% confidence interval of \([0.700, 0.743]\).
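The interval is not symmetric around the estimate, which is consistent with it being built on the log scale (to my knowledge the default for Kaplan-Meier fits in R's survival package, though that is an assumption about how the table was produced). Under that assumption, the reported bounds can be reproduced from the estimate and its standard error with a short Python sketch:

```python
import math

S_HAT = 0.7213763  # Kaplan-Meier estimate at t = 300, from the table
SE = 0.0109578     # reported standard error of the estimate
Z = 1.959964       # 97.5th percentile of the standard normal


def log_scale_ci(s, se, z=Z):
    """Confidence interval built on log(S); se(log S) is roughly se / s."""
    half_width = z * se / s
    return s * math.exp(-half_width), s * math.exp(half_width)


lower, upper = log_scale_ci(S_HAT, SE)
print(f"[{lower:.3f}, {upper:.3f}]")  # prints [0.700, 0.743]
```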

We can also estimate separate Kaplan-Meier survival curves between groups. Below are the survival curves for the leading team, split by whether the leading team is the home team or the visiting team.

The estimated survival probabilities by location are summarized in the table below.

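| Location | Time | Games | Tied Games | Survival Probability | Standard Error | Lower 95% CI | Upper 95% CI |
|---|---|---|---|---|---|---|---|
| Away | 300 | 1064 | 268 | 0.6948159 | 0.0161857 | 0.6638059 | 0.7272746 |
| Home | 300 | 1121 | 231 | 0.7467437 | 0.0148074 | 0.7182785 | 0.7763371 |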

The estimated probability that a given NHL team leading by one at home with five minutes remaining in regulation survives to the end of the game is \(\Pr(T > 300 \mid \text{Home}) = 0.747\), with a 95% confidence interval of \([0.718, 0.776]\). The corresponding probability for a team leading by one on the road is \(\Pr(T > 300 \mid \text{Away}) = 0.695\), with a 95% confidence interval of \([0.664, 0.727]\). Note that the survival probability estimate is higher for teams leading by one at home, which seems to affirm the idea of home-ice advantage. That is, home teams are more likely than away teams to hold a lead in the final five minutes of a game.

Another useful tool for estimating the difference in survival probability between groups is the Cox regression model, which is used for datasets with survival outcomes. The model is given by \[ h \left( t \mid \mathbf{X}_{i} \right) = h_{0}(t) \exp \left( \mathbf{X}_{i}^{\top} \beta \right) \] where \(h \left( t \mid \mathbf{X}_{i} \right)\) represents the hazard for observation \(i\) with covariates \(\mathbf{X}_{i}\), that is, the instantaneous rate at which the event of interest occurs at time \(t\); \(\beta\) is the vector of coefficients; and \(h_{0}(t)\) represents the underlying baseline hazard at time \(t\). The model estimates a hazard ratio for each covariate, which in this case represents the ratio of hazards between two groups at any point in time. In other words, the hazard ratio HR can be interpreted as how many times as likely one group is to experience the event of interest at a given time \(t\) as compared to the other group. Below are the results of a Cox regression on only the home factor variable:

| Variable | Hazard Ratio | Standard Error | Statistic | p-value |
|---|---|---|---|---|
| Home | 0.8084813 | 0.0897924 | -2.367659 | 0.017901 |

Note that the hazard ratio is given by \(\exp(\beta)\), where \(\beta\) is the coefficient estimate on home. A hazard ratio less than 1 indicates a decreased hazard of giving up a tying goal, while a hazard ratio greater than 1 indicates an increased hazard of giving up a tying goal. In the context of this regression, the hazard ratio of \(0.808\) on home means that, at any given time, a home team leading by 1 is only \(0.808\) times as likely to lose its lead as an away team leading by 1.
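The columns of the table are linked by simple arithmetic: the coefficient is the log of the hazard ratio, the test statistic is the coefficient divided by its standard error, and the p-value is the two-sided tail probability of a standard normal. The Python sketch below recovers the statistic and p-value from the reported hazard ratio and standard error; the function name `wald_summary` is illustrative.

```python
import math


def wald_summary(hazard_ratio, std_error):
    """Recover the coefficient, z statistic, and two-sided p-value."""
    beta = math.log(hazard_ratio)             # HR = exp(beta)
    z = beta / std_error                      # Wald statistic
    p = math.erfc(abs(z) / math.sqrt(2))      # two-sided normal tail probability
    return beta, z, p


beta, z, p = wald_summary(0.8084813, 0.0897924)
print(f"beta = {beta:.4f}, z = {z:.3f}, p = {p:.4f}")
```

The same function applied to the other single-covariate tables in this section should reproduce their statistic and p-value columns in the same way.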

Now that we know what the hazard ratio output of the Cox model represents in the context of this analysis, we can see what other factors might influence the probability of the leading team losing its one-goal lead. In order to estimate the effect of using a timeout on the probability of the leading team losing its lead, we need to treat the use of a timeout as a time-dependent covariate, since the team’s status of having used a timeout depends on the time at which they use it. For our purposes, we will only consider two distinct game situations: either the team of interest has not yet used its timeout, or it has already used its timeout. The effects of the timeout may not actually last until the end of the game, but for now this is an appropriate way to treat timeouts. We train a Cox regression model exclusively on the time-dependent covariate of the trailing team using its timeout; the resulting estimated survival curves for the two groups are shown below.

The results of the Cox regression are shown below.

| Variable | Hazard Ratio | Standard Error | Statistic | p-value |
|---|---|---|---|---|
| Trailing Team Timeout | 1.042242 | 0.1468001 | 0.2818367 | 0.7780687 |

Note that while the hazard ratio is not statistically significant, the estimated hazard ratio of \(1.042\) is greater than 1, indicating that when the trailing team uses its timeout while down by a goal, the leading team is \(1.042\) times as likely to lose its lead as in the situation where the trailing team does not use its timeout.

Since these findings are not significant, we will also train a Cox regression on whether the leading team uses its timeout while leading by a goal. The estimated survival curves for these two groups are shown below.

The results of the Cox regression are shown below.

| Variable | Hazard Ratio | Standard Error | Statistic | p-value |
|---|---|---|---|---|
| Leading Team Timeout | 1.338596 | 0.1909766 | 1.526999 | 0.1267612 |

Again, while the hazard ratio estimate is not significant, the estimated hazard ratio indicates that when the leading team uses its timeout while up by a goal, it is \(1.339\) times as likely to lose its lead as in the situation where the leading team does not use its timeout.

Note that this estimate does not indicate causation; the use of the timeout may not be the factor causing the leading team to be more likely to lose its lead. It is possible that the leading team uses its timeout in this situation only if it is getting hemmed in its defensive zone, in which case it may just be that the team is outmatched and the increased hazard of losing the lead is due to team strength. We will explore these possibilities shortly.

Neither of these results is significant. For comparison, let’s look at the impact on survival probability of a decision that we strongly believe helps the trailing team: pulling the goalie. We train a Cox regression on whether or not the trailing team pulls its goalie (and since this variable is certainly time-dependent, the survival probability is also affected by when the team pulls its goalie). The estimated survival curves for these two groups are shown below.

The results of the Cox regression are summarized below.

| Variable | Hazard Ratio | Standard Error | Statistic | p-value |
|---|---|---|---|---|
| Goalie Pulled | 3.27537 | 0.1853451 | 6.4012 | 0 |

These findings are far more conclusive than those regarding either team taking a timeout; when the trailing team pulls its goalie, the leading team is \(3.275\) times as likely to lose its one-goal lead as when the trailing team does not pull its goalie. This serves to affirm the belief that pulling your goalie earlier increases your odds of tying the game and potentially earning standings points.

To get a better idea of what factors contribute to the survival probability of the leading team, we can now combine several covariates into one Cox regression. This will involve several time-independent covariates (such as location and team ability measures), as well as multiple time-dependent covariates (either team using its timeout and whether the goalie is pulled). To do this, we separate each game into time intervals defined by three state variables:

  • to_trail: a factor variable indicating whether the trailing team has already used its timeout
  • to_lead: a factor variable indicating whether the leading team has already used its timeout
  • g_pull: a factor variable indicating whether the trailing team has already pulled its goalie
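The interval construction described above can be sketched as follows. This is a hedged Python illustration of the general start/stop (counting-process) layout for time-dependent covariates, using the column names from the Data section; the function `split_game` and the dictionary layout are assumptions, not the original R code.

```python
def split_game(time, status, to_time_trail, to_time_lead, pull_time):
    """Split one game into intervals on which all three states are constant.

    Follows the Data section's convention that a decision time equal to
    `time` means the decision was not made before the game was censored.
    """
    decision_times = (to_time_trail, to_time_lead, pull_time)
    change_points = {c for c in decision_times if 0 < c < time}
    bounds = [0] + sorted(change_points) + [time]

    def already(decision_time, start):
        # The state is "on" once the decision has happened before this interval.
        return int(decision_time <= start and decision_time < time)

    rows = []
    for start, stop in zip(bounds, bounds[1:]):
        rows.append({
            "start": start,
            "stop": stop,
            "event": status if stop == time else 0,  # event only in last interval
            "to_trail": already(to_time_trail, start),
            "to_lead": already(to_time_lead, start),
            "g_pull": already(pull_time, start),
        })
    return rows


# Hypothetical game: goalie pulled at 180 s, trailing timeout at 200 s,
# tying goal at 250 s, leading team never uses its timeout.
for row in split_game(250, 1, 200, 250, 180):
    print(row)
```

Each game contributes one row per interval, and only the final row can carry the event, so a game's hazard contribution switches states exactly when a timeout is called or the goalie is pulled.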

We train 6 Cox regression models on this dataset (one for each complementary pair of team ability metrics). The hazard ratio estimates for each covariate in each regression model are summarized in the table below, along with significance codes. The column headings represent the rate metric used for both the trailing team rate for and the leading team rate against.

| Variable | 5-on-5 Goals/60 | 5-on-5 Corsi/60 | 5-on-5 xG/60 | Power Play Goals/60 | Power Play Corsi/60 | Power Play xG/60 |
|---|---|---|---|---|---|---|
| Trailing Team Timeout | 0.917 | 0.931 | 0.925 | 0.924 | 0.933 | 0.925 |
| Leading Team Timeout | 1.286 | 1.265 | 1.279 | 1.272 | 1.241 | 1.253 |
| Goalie Pulled | 3.108 *** | 3.177 *** | 3.156 *** | 3.215 *** | 3.237 *** | 3.202 *** |
| Home | 0.82 * | 0.822 * | 0.817 * | 0.822 * | 0.82 * | 0.818 * |
| Trailing Team Rate For | 1.447 * | 1.015 | 1.184 | 1.055 | 1.006 | 1.051 |
| Leading Team Rate Against | 1.497 * | 1.015 | 1.952 ** | 1.118 ** | 1.007 | 1.139 + |

Note:
Significance codes: *** p < 0.001 ** p < 0.01 * p < 0.05 + p < 0.10

There are some interesting results from these regressions. First of all, note that whether or not the trailing team’s goalie is pulled seems to be the most important covariate in estimating the probability of holding a one-goal lead. Whether or not the leading team is at home is also significant in each regression. The trailing team’s offensive ability is only significant when using 5-on-5 goals for per 60 minutes to represent their ability, but in each regression an increase in the trailing team’s offensive rate metric results in an increased hazard for the leading team, holding all other covariates constant. On the other hand, the leading team’s defensive ability is significant in the regressions not using Corsi against per 60 minutes as the rate metric, and in each case an increase in the leading team’s rate metric (meaning they are worse defensively) results in an increased hazard for the leading team, holding all else constant.

However, there are also some interesting results regarding timeout usage (even though none of those hazard ratios are significant). Note that when we take into account other covariates, the leading team is estimated to have a decreased hazard of losing its lead when the trailing team calls a timeout. Additionally, note that the hazard ratios for the leading team’s timeout usage are still right around \(1.25\), which is relatively close to the estimate of \(1.339\) from the regression on just the leading team’s timeout usage. While the hazard ratios are still insignificant, they are not moved much closer to 1 by accounting for other covariates, indicating that the leading team’s timeout usage itself may result in increased hazard for the leading team. This could possibly be an indication of momentum for the trailing team; if they successfully trap the leading team in their defensive zone and accumulate shots to the point that the leading team feels the need to use a timeout, the trailing team may gain some intangible benefit toward tying the game. Despite being insignificant, these hazard ratios may indicate an actual effect of using a timeout.

To further visualize the effect of each time-dependent covariate, note that there are eight possible states in which a game can be at any given time:

  • no timeouts used, goalie not pulled
  • trailing team timeout only, goalie not pulled
  • leading team timeout only, goalie not pulled
  • both timeouts used, goalie not pulled
  • no timeouts used, goalie pulled
  • trailing team timeout only, goalie pulled
  • leading team timeout only, goalie pulled
  • both timeouts used, goalie pulled
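Because the Cox model is multiplicative, the hazard in each of these eight states relative to the baseline state (no timeouts used, goalie not pulled) is the product of the hazard ratios of whichever states are active. The Python sketch below enumerates the states and computes that product using the point estimates from the 5-on-5 Goals/60 column; it assumes the fitted model has no interaction terms between the three state variables, and it ignores the uncertainty in the (mostly non-significant) timeout estimates.

```python
from itertools import product

# Point estimates from the 5-on-5 Goals/60 column of the table above.
HAZARD_RATIOS = {"to_trail": 0.917, "to_lead": 1.286, "g_pull": 3.108}


def relative_hazard(state, hazard_ratios=HAZARD_RATIOS):
    """Hazard relative to the baseline state, assuming no interactions."""
    result = 1.0
    for name, active in state.items():
        if active:
            result *= hazard_ratios[name]
    return result


# All 2 x 2 x 2 combinations of the three binary state variables.
STATES = [dict(zip(HAZARD_RATIOS, flags)) for flags in product((0, 1), repeat=3)]

for state in STATES:
    print(state, round(relative_hazard(state), 3))
```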

The two plots below display the survival curves for the four timeout situations, with one plot representing the goalie not being pulled and the other representing the goalie being pulled. The confidence intervals and censor ticks are not shown to make the plot easier to read; however, note that within the four timeout states, the curves are not significantly different from each other, while all curves where the goalie is pulled are significantly different from all curves where the goalie is not pulled.

Since the survival analyses performed do not indicate any statistically significant effect from timeout usage by either team in the final five minutes of a one-goal game, we will now try to determine if there is any shot value gained by the trailing team from using a timeout.

Shot Attempts Above Expectation

While called timeouts do not seem to provide any significant benefit in terms of win probability to the trailing team in a close game, it is possible that there is some value gained in shot attempt volume. That is, timeouts may “buy” additional shot attempts for the trailing team, and in a league like the NHL where almost any shot could potentially result in a goal, the trailing team could benefit greatly from additional shot attempts granted by the use of a timeout. To determine what the shot attempt value of a timeout is, we can use all situations where no timeout is taken to estimate how many shots a given team would be expected to take at the end of the game, and then compare that with the number of shots actually attempted. This model for quantifying timeout value using a shot attempts above expectation value was adapted from Luke Benz’s thesis on estimating the point value of a timeout in NCAA men’s basketball.

In order to do this, we first must slightly modify the dataset used for the survival analysis earlier. First of all, most timeouts called by the trailing team occur in the final two minutes; to ensure that we are comparing distributions of Corsi and Fenwick shots that are similar, we will restrict our time frame to the final two minutes of a one-goal game. For this exercise we are only concerned with one-goal games at the two-minute mark (that were also one-goal games at the five-minute mark) in which either no timeouts are called at all, or in which the trailing team uses its timeout. Games where no timeouts are used must have time greater than 180 in the initial dataset, and games where the trailing team uses its timeout must have to_time_trail at least 180 in the initial dataset. We exclude games in which only the leading team uses its timeout. We then split the dataset by whether or not the trailing team used a timeout; games in which no timeouts are used will serve as the training set for our regression models used to predict shot attempt volume in games where a timeout is called by the trailing team. We are also going to redefine the time variable to represent the length of the time interval of interest. There are two possibilities for time:

  • When no timeouts are called, time will represent the length of the time interval between the two-minute mark and the censoring time; that is, when either team first scores or the game ends.
  • When the trailing team uses its timeout, time will represent the length of the time interval between the time at which the timeout is used and the censoring time. To do this, we can simply subtract to_time_trail from time for games where the trailing team uses its timeout after the two-minute mark.
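The filtering rules and the redefinition of time can be sketched together. This is a minimal Python illustration using the column names from the Data section; the function name `shot_window` and the use of `None` for excluded games are assumptions made for the example.

```python
TWO_MINUTE_MARK = 180  # seconds elapsed since the five-minute mark


def shot_window(time, to_time_trail, to_taken_trail, to_taken_lead):
    """Length in seconds of the interval of interest, or None if excluded."""
    if to_taken_trail:
        if to_time_trail < TWO_MINUTE_MARK:
            return None  # timeout taken before the final two minutes
        return time - to_time_trail  # from the timeout to the censoring time
    if to_taken_lead:
        return None  # only the leading team used its timeout
    if time <= TWO_MINUTE_MARK:
        return None  # game was already censored by the two-minute mark
    return time - TWO_MINUTE_MARK  # from the two-minute mark to censoring


print(shot_window(300, 300, False, False))  # no timeouts, no goals
print(shot_window(280, 215, True, False))   # trailing timeout at 215 s
print(shot_window(300, 300, False, True))   # leading-team timeout only
```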

To simplify this model, we assume that shot volume is relatively constant across the final two minutes of a one-goal game. It is certainly possible that shot volume may increase dramatically as we approach the final seconds of the game, but for our purposes assuming consistent shot volume is fine.

Finally, we record the number of shots attempted by the trailing team in each of these games within each defined time interval. For those who are not familiar with the terms for shot attempts in hockey, we record the volume of two different types of shot attempts for these intervals:

  • Corsi shot attempts are all shot attempts (goals + saves + misses + blocked shots)
  • Fenwick shot attempts are all unblocked shot attempts (goals + saves + misses)
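As a quick illustration of the two definitions, the Python sketch below counts both types of shot attempts from a list of play-by-play event labels. The labels `GOAL`, `SHOT`, `MISS`, and `BLOCK` are assumed names for goals, saved shots, missed shots, and blocked shots; the actual labels depend on the play-by-play source.

```python
from collections import Counter

# Assumed event labels; adjust to match the play-by-play data being used.
FENWICK_EVENTS = {"GOAL", "SHOT", "MISS"}      # unblocked shot attempts
CORSI_EVENTS = FENWICK_EVENTS | {"BLOCK"}      # all shot attempts


def shot_attempts(events):
    """Return (corsi, fenwick) counts for one team's events in an interval."""
    counts = Counter(events)
    corsi = sum(counts[e] for e in CORSI_EVENTS)
    fenwick = sum(counts[e] for e in FENWICK_EVENTS)
    return corsi, fenwick


print(shot_attempts(["SHOT", "BLOCK", "MISS", "FACEOFF", "GOAL"]))  # prints (4, 3)
```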

Below we display the distribution of Corsi and Fenwick shot attempts by the trailing team in both games where no timeout is used and in games where the trailing team uses a timeout.

For each game, we record the volume of both of these types of shot attempts in the time interval defined above. To predict the number of shot attempts in each interval following a timeout, we can train a regression on all instances where there are no timeouts used with the shot attempt volume as the response variable. Since Corsi and Fenwick shot attempt counts must take on nonnegative integer values, we can use either Poisson regression or negative binomial regression models to predict these shot counts. In addition, note from the histograms above that there are several instances in both scenarios in which the trailing team records no shot attempts. While this may be due to the fact that some teams would have always taken zero shots (possibly due to a lack of offensive ability or a strong defensive opponent), it is also possible that there are some zeros that are due to the fact that the trailing team allowed a goal (or scored a goal themselves) shortly after the event that started the time interval of interest (or used a timeout shortly before the end of the game). In this case, the fact that the team had so little time (by this formulation of the situation) to accumulate shot attempts may have been the reason why they recorded no shot attempts. For this reason, in addition to using both normal Poisson and negative binomial regression models, we will also use zero-inflated Poisson and zero-inflated negative binomial models to predict shot attempt counts, with time as the predictor of excess zeros. To determine which one of these four models seems to be the best choice, we train each model on the following features, using Corsi shot counts as the response variable:

  • One of the 6 metrics measuring the trailing team’s offensive ability and its complementary metric measuring the leading team’s defensive ability. Since all of the metrics are fairly similar, we will exclusively use 5-on-5 goals per 60 minutes (trail_EV_GF and lead_EV_GA) to choose which model to use.
  • pull_time: the time at which the goalie is pulled. If the goalie is not pulled, it is equal to the time elapsed from the five-minute mark to the censoring event. Note that the goalie pull time serves as a numerical proxy for the coach’s aggressiveness.
  • home: whether or not the leading team is the home team
  • time: the length of the time interval as defined above
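
For reference, zero-inflated count models of this kind are usually written as a two-part mixture. The block below is a sketch of the standard textbook form, assuming a log link for the count mean and a logit link for the excess-zero probability; the article does not state which links were used, and the symbols $\pi_i$, $\mu_i$, $\beta$, and $\gamma$ are introduced here only for illustration. Here $f$ is the Poisson or negative binomial probability mass function.

```latex
\begin{aligned}
P(Y_i = 0) &= \pi_i + (1 - \pi_i)\, f(0 \mid \mu_i), \\
P(Y_i = y) &= (1 - \pi_i)\, f(y \mid \mu_i), \qquad y = 1, 2, \dots \\
\log \mu_i &= \beta_0 + \beta_1\,\text{trail\_EV\_GF}_i + \beta_2\,\text{lead\_EV\_GA}_i
              + \beta_3\,\text{pull\_time}_i + \beta_4\,\text{home}_i + \beta_5\,\text{time}_i, \\
\operatorname{logit}(\pi_i) &= \gamma_0 + \gamma_1\,\text{time}_i .
\end{aligned}
```

Setting $\pi_i = 0$ recovers the ordinary Poisson or negative binomial regression, so the zero-inflated versions differ only in the extra probability mass they place on zero for short intervals.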

The coefficient estimates from these four regressions are summarized below:

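| Variable                  | Poisson    | Negative Binomial | Zero-Inflated Poisson | Zero-Inflated Negative Binomial |
|---------------------------|------------|-------------------|-----------------------|---------------------------------|
| (Intercept)               | 0.706 *    | 0.719 *           | 0.915 **              | 0.915 *                         |
| Trailing Team Rate For    | 0.198 *    | 0.206 *           | 0.257 **              | 0.256 **                        |
| Leading Team Rate Against | 0.027      | 0.034             | -0.003                | -0.001                          |
| Goalie Pull Time          | -0.009 *** | -0.009 ***        | -0.009 ***            | -0.009 ***                      |
| Home                      | 0.054      | 0.055             | 0.059                 | 0.059                           |
| Time                      | 0.015 ***  | 0.015 ***         | 0.013 ***             | 0.013 ***                       |
| (Intercept) (zero)        |            |                   | -0.74 +               | -0.778 +                        |
| Time (zero)               |            |                   | -0.023 ***            | -0.023 ***                      |

Note: Significance codes: *** p < 0.001, ** p < 0.01, * p < 0.05, + p < 0.10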

To get an idea of which model might be the best choice, we can compare the log likelihood and dispersion statistic of each model:

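| Regression Model                | Log Likelihood | Dispersion Statistic |
|---------------------------------|----------------|----------------------|
| Poisson                         | -1342.6        | 1.190                |
| Negative Binomial               | -1337.7        | 1.031                |
| Zero-Inflated Poisson           | -1329.5        | 1.024                |
| Zero-Inflated Negative Binomial | -1329.4        | 1.006                |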
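
The article does not say how the dispersion statistic was computed. One common definition is the Pearson chi-square statistic divided by the residual degrees of freedom, sketched below under the assumption of a Poisson variance function (variance equal to the mean). The function name and the toy data are hypothetical and are not taken from the analysis.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by residual degrees of freedom.

    Assumes a Poisson variance function (variance equal to the fitted mean).
    """
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("need more observations than parameters")
    chi_sq = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi_sq / dof


# Toy data, not taken from the article: counts and fitted means from some model.
observed = [0, 2, 1, 4, 0, 3, 1, 5]
fitted = [1.0, 2.0, 1.5, 2.5, 1.0, 2.0, 1.5, 3.0]
print(round(pearson_dispersion(observed, fitted, n_params=2), 3))
```

Values near 1 indicate that the observed variability is roughly what the model expects, while values above 1 suggest overdispersion. For negative binomial and zero-inflated models the variance in the denominator would need to be replaced by that model's own variance.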

Note that these models are all similar in fit; however, the two zero-inflated models appear to fit slightly better than the regular Poisson and negative binomial models, with higher log likelihoods and dispersion statistics closer to 1. To visualize how appropriate the models may be, we can compare the predicted shot counts from each model with the actual shot counts. The histogram below represents the actual Corsi shot counts by the trailing team in time intervals following timeouts, while the density curves indicate how the predicted shot counts are distributed:

Note that none of these models predicts shot counts particularly well; while the zero-inflated models appear to be slightly closer to the actual distribution of shot counts, the distributions of predicted shot counts from each model are very different from the actual distribution. This may be because we are attempting to predict shot attempt counts over intervals no longer than 120 seconds of game time; while shot attempt volume is relatively consistent from season to season among individual teams, it is certainly plausible that shot counts vary almost randomly from game to game, especially when we restrict our counts to such a small time interval.

However, we can still use the best of these models to attempt to estimate the shot attempt value of a timeout. Since the best model appears to be the zero-inflated negative binomial model (in terms of log likelihood and the dispersion statistic), we will use this model to predict how many shot attempts of each type we would expect the trailing team to generate in the defined time interval after they use their timeout. We train 12 regressions on all game situations with no timeout used: one for each combination of the 6 complementary team ability rate metrics and the 2 types of shot attempts. The results of the 6 regressions with Corsi as the response variable are summarized below:

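| Variable                  | 5-on-5 Goals/60 | 5-on-5 Corsi/60 | 5-on-5 xG/60 | Power Play Goals/60 | Power Play Corsi/60 | Power Play xG/60 |
|---------------------------|-----------------|-----------------|--------------|---------------------|---------------------|------------------|
| (Intercept)               | 0.915 *         | 0.075           | 0.728 +      | 1.276 ***           | 0.703               | 0.758 +          |
| Trailing Team Rate For    | 0.256 **        | 0.027 ***       | 0.386 ***    | 0.05 *              | 0.01 ***            | 0.108 ***        |
| Leading Team Rate Against | -0.001          | -0.002          | -0.052       | -0.015              | -0.003              | -0.01            |
| Goalie Pull Time          | -0.009 ***      | -0.009 ***      | -0.009 ***   | -0.009 ***          | -0.009 ***          | -0.009 ***       |
| Home                      | 0.059           | 0.062           | 0.068        | 0.065               | 0.054               | 0.066            |
| Time                      | 0.013 ***       | 0.013 ***       | 0.013 ***    | 0.013 ***           | 0.013 ***           | 0.013 ***        |
| (Intercept) (zero)        | -0.778 +        | -0.875 +        | -0.81 +      | -0.858 +            | -0.947 +            | -0.933 +         |
| Time (zero)               | -0.023 ***      | -0.023 ***      | -0.023 ***   | -0.023 ***          | -0.024 ***          | -0.024 ***       |

Note: Significance codes: *** p < 0.001, ** p < 0.01, * p < 0.05, + p < 0.10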
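
To illustrate how coefficients like these turn into a predicted shot count, here is a rough sketch that plugs the 5-on-5 Goals/60 column into the usual zero-inflated mean, (1 - π) · μ. It assumes a log link for the count part and a logit link for the zero part, and it assumes pull_time and time are measured in seconds; neither is stated in this section. The function name and the example inputs are hypothetical.

```python
import math

# Coefficients copied from the 5-on-5 Goals/60 column of the Corsi table above.
COUNT_COEF = {
    "intercept": 0.915,
    "trail_EV_GF": 0.256,
    "lead_EV_GA": -0.001,
    "pull_time": -0.009,
    "home": 0.059,
    "time": 0.013,
}
ZERO_COEF = {"intercept": -0.778, "time": -0.023}


def expected_corsi(trail_ev_gf, lead_ev_ga, pull_time, home, time):
    """Expected count (1 - pi) * mu, assuming log and logit links."""
    log_mu = (
        COUNT_COEF["intercept"]
        + COUNT_COEF["trail_EV_GF"] * trail_ev_gf
        + COUNT_COEF["lead_EV_GA"] * lead_ev_ga
        + COUNT_COEF["pull_time"] * pull_time
        + COUNT_COEF["home"] * home
        + COUNT_COEF["time"] * time
    )
    mu = math.exp(log_mu)  # mean of the count component
    logit_pi = ZERO_COEF["intercept"] + ZERO_COEF["time"] * time
    pi = 1.0 / (1.0 + math.exp(-logit_pi))  # probability of an excess zero
    return (1.0 - pi) * mu


# Hypothetical situation: 2.5 goals/60 for and against, goalie pulled 200 seconds
# after the five-minute mark, leading team on the road, 60-second interval.
print(round(expected_corsi(2.5, 2.5, pull_time=200, home=0, time=60), 2))
```

Because the time coefficient is positive in the count part and negative in the zero part, a longer interval both raises the count mean and lowers the chance of an excess zero, so the expected count grows with interval length.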

After predicting the shot attempt counts for each game in which the trailing team uses its timeout, finding the difference between the actual number of shot attempts generated and the predicted number of shot attempts generated if no timeout had been used produces a shot attempts above expectation value. (For example, if the trailing team is estimated to generate 2 Corsi shots in a 60-second interval after a timeout, but they actually generate 4, then we say they generated 2 Corsi shots above expectation.) After estimating the expected number of Corsi shot attempts for all games with a trailing team timeout and calculating the number of Corsi shot attempts above expectation, we get the following results:

The mean and median Corsi shots above expectation values for each model are summarized in the table below.

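| Rate Metric     | Median CAE | Mean CAE | Standard Deviation |
|-----------------|------------|----------|--------------------|
| 5-on-5 Goals/60 | 0.255      | 0.482    | 1.537              |
| 5-on-5 Corsi/60 | 0.244      | 0.491    | 1.538              |
| 5-on-5 xG/60    | 0.228      | 0.475    | 1.540              |
| PP Goals/60     | 0.268      | 0.501    | 1.531              |
| PP Corsi/60     | 0.270      | 0.506    | 1.520              |
| PP xG/60        | 0.245      | 0.501    | 1.526              |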
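
The above-expectation calculation itself is just a per-game subtraction followed by summary statistics. The sketch below uses made-up numbers (not the article's data) and a hypothetical function name; the first game mirrors the worked example in the text, where 4 actual shots against 2 predicted gives 2 shots above expectation.

```python
from statistics import mean, median, stdev


def shots_above_expectation(actual, predicted):
    """Per-game difference between observed and model-predicted shot attempts."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return [a - p for a, p in zip(actual, predicted)]


# Hypothetical games with a trailing-team timeout (not the article's data).
actual = [4, 0, 2, 3, 1]
predicted = [2.0, 1.5, 1.8, 2.1, 1.6]  # counterfactual no-timeout predictions

cae = shots_above_expectation(actual, predicted)
summary = {"median": median(cae), "mean": mean(cae), "sd": stdev(cae)}
print(summary)
```

Reporting both the median and the mean is useful here because the differences are skewed: a few games with many shot attempts can pull the mean well above the median.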

Keeping in mind that the zero-inflated negative binomial regression model does not seem to fit the data particularly well, we still get a positive Corsi shot attempt value above expectation after a timeout by the trailing team. On average, a team that trails by one goal at the two-minute mark and uses its timeout is estimated to record approximately 0.25 (median) to 0.50 (mean) more Corsi shot attempts than an equivalent team that does not use its timeout and instead lets the game progress naturally.

When we perform the same procedure with Fenwick shot attempts, we get the following regression results:

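| Variable                  | 5-on-5 Goals/60 | 5-on-5 Corsi/60 | 5-on-5 xG/60 | Power Play Goals/60 | Power Play Corsi/60 | Power Play xG/60 |
|---------------------------|-----------------|-----------------|--------------|---------------------|---------------------|------------------|
| (Intercept)               | 0.401           | -0.269          | 0.259        | 1.277 ***           | 0.663               | 0.588            |
| Trailing Team Rate For    | 0.279 **        | 0.021 **        | 0.346 *      | 0.05 *              | 0.011 ***           | 0.134 ***        |
| Leading Team Rate Against | 0.145           | 0.009           | 0.149        | -0.029              | -0.004              | -0.031           |
| Goalie Pull Time          | -0.009 ***      | -0.009 ***      | -0.009 ***   | -0.009 ***          | -0.009 ***          | -0.009 ***       |
| Home                      | 0.127 *         | 0.126 *         | 0.129 *      | 0.13 *              | 0.117 +             | 0.129 *          |
| Time                      | 0.01 ***        | 0.01 ***        | 0.01 ***     | 0.01 ***            | 0.01 ***            | 0.01 ***         |
| (Intercept) (zero)        | -0.233          | -0.282          | -0.231       | -0.273              | -0.373              | -0.369           |
| Time (zero)               | -0.03 ***       | -0.032 ***      | -0.031 ***   | -0.031 ***          | -0.031 ***          | -0.03 ***        |

Note: Significance codes: *** p < 0.001, ** p < 0.01, * p < 0.05, + p < 0.10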

We then get the following results for Fenwick shot attempts above expectation:

The Fenwick shots above expectation values for each model are summarized in the table below.

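| Rate Metric     | Median FAE | Mean FAE | Standard Deviation |
|-----------------|------------|----------|--------------------|
| 5-on-5 Goals/60 | -0.054     | 0.175    | 1.184              |
| 5-on-5 Corsi/60 | -0.029     | 0.183    | 1.181              |
| 5-on-5 xG/60    | -0.046     | 0.171    | 1.187              |
| PP Goals/60     | -0.074     | 0.187    | 1.186              |
| PP Corsi/60     | -0.041     | 0.192    | 1.177              |
| PP xG/60        | -0.058     | 0.190    | 1.182              |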

Compared to Corsi shots above expectation, timeouts seem to be less valuable in terms of Fenwick shots above expectation. While we estimated a timeout to be worth approximately 0.25 to 0.50 Corsi shots above expectation, a timeout appears to be worth between about -0.05 (median) and 0.18 (mean) Fenwick shots above expectation. Note that the median Fenwick shots above expectation is negative for each model, while the means are all positive (likely due to the skewness of the predictions and the actual Fenwick shot counts). In context, this likely means that a timeout is not worth much in terms of unblocked shot attempts. Since a timeout by the trailing team “buys” about 0.25 to 0.50 total shot attempts but seemingly provides no additional unblocked shot attempts, it appears as though a timeout is worth 0.25 to 0.50 blocked shots to the trailing team. This doesn’t mean much while a game is in progress; without detailed tracking data, we cannot accurately predict whether a given shot will be blocked. While the value is small, and while the model used to predict shot attempt counts is not great, it still appears as though timeouts may have some shot attempt value to a team trailing by one goal in the last two minutes of a hockey game.
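
Since Corsi attempts are Fenwick attempts plus blocked shots, the blocked-shot reading above can be checked with simple arithmetic on the two summary tables. The sketch below subtracts the mean values only, on the assumption that both tables summarize the same set of games; medians do not subtract this way. The function name is hypothetical.

```python
# Mean shots above expectation copied from the two summary tables above.
MEAN_CAE = {
    "5-on-5 Goals/60": 0.482,
    "5-on-5 Corsi/60": 0.491,
    "5-on-5 xG/60": 0.475,
    "PP Goals/60": 0.501,
    "PP Corsi/60": 0.506,
    "PP xG/60": 0.501,
}
MEAN_FAE = {
    "5-on-5 Goals/60": 0.175,
    "5-on-5 Corsi/60": 0.183,
    "5-on-5 xG/60": 0.171,
    "PP Goals/60": 0.187,
    "PP Corsi/60": 0.192,
    "PP xG/60": 0.190,
}


def implied_blocked_above_expectation(cae, fae):
    """Corsi minus Fenwick leaves blocked attempts, so the means subtract."""
    return {metric: round(cae[metric] - fae[metric], 3) for metric in cae}


implied = implied_blocked_above_expectation(MEAN_CAE, MEAN_FAE)
for metric, value in implied.items():
    print(f"{metric}: {value:+.3f}")
```

Under that assumption, every rate metric implies roughly 0.30 to 0.31 additional blocked attempts on average, which sits inside the 0.25 to 0.50 range quoted above.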

Model Limitations

Naturally, there are some faults in these two models for estimating NHL timeout value. The survival model of the last five minutes of regulation does not estimate exactly what it is intended to estimate; because of the way the model was formulated, all survival curves and survival probabilities are actually estimating the probability that the trailing team scores a goal, and not the probability that they score the tying goal. We assumed that if the leading team scores and increases its lead to two goals, then the trailing team will not overcome a two-goal deficit in the final five minutes (we all know how safe that assumption is). We therefore censor every observation where the first event is the leading team scoring and set status to 0. In the survival model, a censored observation represents an at-risk individual, since the individual could still experience the event of interest. Because the event of interest is that the trailing team scores, the survival probability estimates more accurately represent the probability that the leading team gives up a goal. For example, it could happen that the leading team scores (we censor the observation), and then the trailing team scores a goal to cut the lead back to one. In our formulation of the situation, this would be the event of interest, even though we want the event of interest to be explicitly the tying goal. This could be improved by recording the events more accurately, but we simplified the model in order to make data recording easier. However, the probability estimates are still likely good approximations of the probability of allowing a game-tying goal.
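
The censoring rule described above can be sketched as a small encoding function. This is only an illustration: the `Observation` class, the function name, the use of seconds measured from the five-minute mark, and the treatment of goalless games as censored at the end of regulation are assumptions made here, not details confirmed by the article.

```python
from dataclasses import dataclass
from typing import Optional

REGULATION_END = 300.0  # seconds after the five-minute mark (assumed time scale)


@dataclass
class Observation:
    time: float  # seconds from the five-minute mark to the first event
    status: int  # 1 = trailing team scored (event), 0 = censored


def encode_observation(first_goal_time: Optional[float],
                       first_goal_by_trailing: Optional[bool]) -> Observation:
    """Encode one game as (time, status) following the censoring rule above."""
    if first_goal_time is None:
        # Assumed: no goal before the end of regulation means censoring at the horn.
        return Observation(REGULATION_END, 0)
    if first_goal_by_trailing:
        # Trailing team scored first: the event of interest occurred.
        return Observation(first_goal_time, 1)
    # Leading team scored first: censored at that time with status 0.
    return Observation(first_goal_time, 0)


print(encode_observation(120.0, True))   # event at 120 seconds
print(encode_observation(45.0, False))   # censored at 45 seconds
print(encode_observation(None, None))    # censored at the end of regulation
```

Recording every goal in the window rather than only the first event would be one way to separate true tying goals from goals that merely cut a two-goal lead to one.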

In addition, as discussed in the shot attempts above expectation framework, the regression models used to predict shot counts do not seem to fit the data very well. The resulting estimates of the shot attempt value of a timeout should therefore be treated with caution, although they seem reasonable as first estimates. Since we have exhausted the most suitable parametric regression models for count responses, we could try nonparametric models (such as random forest and XGBoost) to predict shot counts in the same game situations. Beyond this, however, it may be the case that we cannot accurately predict shot attempt counts from game state information over time intervals no longer than 120 seconds. The game of hockey is notoriously dependent on luck, so it is certainly possible that shot attempt volume within a given game cannot be predicted with much more accuracy than these models achieve.

Conclusion

The results from this project are not particularly staggering or game-changing, but it is worth discussing the implications of this project nonetheless. When modeling the end of a one-goal hockey game as a survival situation of the leading team, timeouts by either team do not seem to have any significant effect on the probability that the leading team holds its lead. When accounting for the major factors affecting the trailing team’s chances of scoring a goal to tie the game, a timeout by the trailing team slightly benefits the leading team, on average. The most interesting result from the survival model is that a timeout by the leading team appears to benefit the trailing team; when the leading team takes a timeout, the trailing team is estimated to be roughly 1.25 times as likely to score a goal as compared to an equivalent situation where the leading team does not take its timeout. Even when accounting for team strength, this result remains the same (despite being statistically insignificant), so it might be that a leading team timeout provides some benefit in terms of momentum for the team trying to tie the game.
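
To give a sense of scale for the "roughly 1.25 times as likely" figure, the sketch below treats it as a hazard ratio from a proportional hazards model, which is an assumption made here for illustration. Under proportional hazards, a baseline survival probability S0 becomes S0 raised to the power of the hazard ratio. The baseline value of 0.80 and the function name are hypothetical.

```python
import math


def survival_with_hazard_ratio(baseline_survival, hazard_ratio):
    """Under proportional hazards, S1(t) = S0(t) ** hazard_ratio."""
    if not 0.0 <= baseline_survival <= 1.0:
        raise ValueError("baseline_survival must be a probability")
    if hazard_ratio <= 0:
        raise ValueError("hazard_ratio must be positive")
    return baseline_survival ** hazard_ratio


baseline = 0.80      # hypothetical chance the leading team holds its lead
hazard_ratio = 1.25  # figure quoted above, treated here as a hazard ratio

with_timeout = survival_with_hazard_ratio(baseline, hazard_ratio)
print(round(math.log(hazard_ratio), 3))  # implied model coefficient
print(round(with_timeout, 3))            # survival after a leading-team timeout
```

With these illustrative numbers, the leading team's chance of holding the lead drops by only a few percentage points, which is consistent with the effect being hard to distinguish from noise.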

When trying to quantify a timeout by its shot attempt value, we conclude that a trailing team timeout may be worth 0.25 to 0.50 additional Corsi shot attempts, as compared to equivalent situations in which the trailing team allows the game to progress naturally. However, the timeout does not appear to provide any value in terms of Fenwick shot attempts. These estimates, while maybe not accurate, at least seem to be reasonable and in line with what may be expected to result from a timeout.

These results can likely be improved upon with more in-depth analysis. As immediate future work to supplement this preliminary analysis, I’m planning to write a follow-up to this article in the next couple of weeks that uses nonparametric models (as mentioned in the limitations of this project) to predict shot attempt volume. There are also other ways to quantify the value of a timeout. Instead of shot counts alone, we could estimate the value of a timeout in endgame situations in terms of expected goals, which are popular because they weight shots by location to better represent the probability of a shot resulting in a goal.

There are also other situations in which a timeout may be valuable. While timeouts can no longer be used by a team that has just iced the puck, we may want to know how valuable that decision was before the rule was changed, since it may be the case that the rule change took away the best use of a timeout. It could also be the case that using a timeout after a negative scoring run (the opponent scores multiple goals in a relatively short time span) provides some boost, as is the common use of timeouts in basketball.

Finally, while this project exclusively focuses on timeouts, other future work related to this project should be focused on other coaching decisions as discussed in the introduction. While goalie pulls have been extensively studied, the coach’s challenge and the ability to swap goaltenders if the starter is playing badly are relative unknowns. We don’t know the true value of either of these decisions, but it may be possible to estimate the probability of a challenge being successful (especially once we have accurate player-tracking data) or the probability that the starter is going to have a bad game before he gives up too many goals. These may be interesting paths to pursue in terms of coach’s decisions in hockey.

Acknowledgements

Since this is my first project dealing with hockey analytics, I’d like to take the opportunity to sincerely thank Luke Benz (via the Hockey Graphs Mentorship Program and Asmae Toumi) for his guidance through the development of this project and for the several ideas he suggested on both model selection and how to generally proceed in this analysis. I’d also like to thank EvolvingWild for their publicly available data scraper and all of their data on Evolving-Hockey.com.

References

Benz, L. (2019). An Examination of Timeout Value, Strategy, and Momentum in NCAA Division 1 Men’s Basketball. Yale University Senior Thesis.

Iyer, P. (2018). How Successful Is Your Coach With The Coach’s Challenge? Winging It In Motown. https://www.wingingitinmotown.com/2018/12/31/18160157/how-successful-is-your-coach-with-the-coachs-challenge

Zabor, E. (2018). Survival Analysis in R. https://www.emilyzabor.com/tutorials/survival_analysis_in_r_tutorial.html