By: Edward Egros

hockey

How Predictive Is Scoring Differential?

How important is an impenetrable goalie in the NHL? How much better is it to outscore opponents throughout the season, as opposed to dominating them defensively? And how important is point differential to overall success?

In an earlier blog post, I discussed playoff unpredictability when it comes to determining who will win a championship based upon how many games that team won. There, the NBA was the most predictable, followed by the NHL and the NFL, with MLB the most unpredictable (unless, of course, you are the 2016 Chicago Cubs). But how does point differential (or run differential in baseball or goal differential in hockey) translate to winning championships? And which league is most predictable when looking at that specific metric?

Once again, I am using logistic regressions with one explanatory variable each and whether the team won a championship as the dependent variable. This time, however, I am fitting three models per sport: one each for offensive output, defensive output and scoring differential. Also once again, here is what is noteworthy with our datasets:

- All data used begins with the 1989-90 season because the NFL had the biggest change to its playoff format at the turn of the new decade.

- Any season in any sport where a lockout shortened the number of games played considerably was removed (e.g., the 1998-99 NBA season, the 2012-13 NHL season, etc.)

- Though the NHL played 80 and 84 games in a few of these seasons, these numbers are not significantly different from the 82 played in the rest of the dataset, so they are still used.

Each explanatory variable has the appropriate and logical coefficient. In other words, scoring variables have a positive coefficient, defensive variables have a negative coefficient and scoring differential variables have a much larger positive coefficient. Each sign points the same way: scoring more, allowing less or posting a larger differential raises the estimated probability of winning a championship. Each variable is also statistically significant at the 95% confidence level, which is to be expected. A better offense, defense and scoring differential will obviously increase the likelihood of winning a championship. What is not clear is which of these indicators is most predictive.

A goodness-of-fit measure called AIC (Akaike Information Criterion) can shed some light. As this number gets smaller, the model has a better fit, explaining away more of the randomness of that sport.
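
As a rough sketch of the kind of model described above, the following fits a logistic regression with a single explanatory variable and computes its AIC as 2k - 2·ln(L), where k is the number of fitted parameters and L is the maximized likelihood. It is an illustration in plain Python, not the code behind the charts: the data are made up, and the names `fit_logistic` and `aic` are placeholders chosen for this example.

```python
import math


def fit_logistic(x, y, max_iter=50, tol=1e-10):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton-Raphson.

    Returns (b0, b1, log_likelihood) with coefficients in the units of x.
    """
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    # Standardizing x only helps the numerics; the fitted probabilities
    # and the AIC are unchanged by a linear rescaling of the predictor.
    z = [(v - mean) / sd for v in x]

    def probs(a, b):
        return [1.0 / (1.0 + math.exp(-(a + b * zi))) for zi in z]

    a = b = 0.0
    for _ in range(max_iter):
        p = probs(a, b)
        w = [pi * (1.0 - pi) for pi in p]
        # Gradient of the log-likelihood
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * zi for yi, pi, zi in zip(y, p, z))
        # Information matrix (negative Hessian), a symmetric 2x2 matrix
        h00 = sum(w)
        h01 = sum(wi * zi for wi, zi in zip(w, z))
        h11 = sum(wi * zi * zi for wi, zi in zip(w, z))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system directly
        step_a = (h11 * g0 - h01 * g1) / det
        step_b = (h00 * g1 - h01 * g0) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < tol:
            break

    log_lik = sum(
        yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
        for yi, pi in zip(y, probs(a, b))
    )
    # Convert the coefficients back to the original units of x
    return a - b * mean / sd, b / sd, log_lik


def aic(log_lik, n_params):
    """Akaike Information Criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * log_lik


# Made-up data for illustration only: scoring differential per team-season
# and whether that team won the title (1) or not (0).
differential = [-60, -45, -30, -20, -10, 0, 10, 20, 30, 45, 60, 80]
champion = [0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1]

intercept, slope, log_lik = fit_logistic(differential, champion)
model_aic = aic(log_lik, n_params=2)  # intercept + one slope
print(f"slope={slope:.4f}  AIC={model_aic:.2f}")
```

Fitting the same function to offense, defense and differential in turn, one sport at a time, would give the three AIC values per league that the charts compare.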

The first chart is points (or runs or goals) modeled against championships:

[Chart: AIC by league, points (or runs or goals) scored modeled against championships]

Before analyzing this chart, it is important to note the value of each point, goal and run, compared with the other sports. In 2016, the average MLB team scored 726 runs for the season. By comparison, the average NFL team scored 325 points in 2015, the average NBA team scored 8,419 points last season and the average NHL team scored 222 goals last season. Fortunately, the variation across each league is not so different as to make comparison impossible.
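
One way to check a claim like this, offered here as an assumption rather than the method used for the post, is to compare relative spread instead of raw totals. The coefficient of variation (standard deviation divided by the mean) does not depend on the size of the scoring unit. The team totals below are hypothetical and exist only to make the sketch runnable.

```python
from statistics import mean, pstdev


def coefficient_of_variation(season_totals):
    """Standard deviation divided by the mean: spread relative to scale."""
    return pstdev(season_totals) / mean(season_totals)


# Hypothetical team totals for one season, for illustration only.
hypothetical_runs = [650, 690, 726, 760, 804]
hypothetical_goals = [199, 211, 222, 233, 245]

runs_cv = coefficient_of_variation(hypothetical_runs)
goals_cv = coefficient_of_variation(hypothetical_goals)
print(f"runs CV={runs_cv:.3f}  goals CV={goals_cv:.3f}")
```

If the values come out in the same neighborhood across leagues, the raw difference between 726 runs and 222 goals is mostly a difference of units.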

In the chart, goals in hockey are the best predictor of winning a championship, with football slightly more random, then basketball, and baseball the most random. So far, these results are consistent with the previous study, where MLB's postseason was the toughest to predict based upon the number of wins during the regular season. Basketball's placement makes intuitive sense because teams play at different paces, and it is not conclusive that playing at a faster rate (which produces more points but not necessarily more points per possession) is the best way to win a title.

The next chart illustrates runs, points, and goals allowed, modeled against winning a championship:

[Chart: AIC by league, runs, points and goals allowed modeled against championships]

Comparatively, the trends are almost the same as they are with offensive output: Major League Baseball is the most random, followed by the NBA. However, an NFL scoring defense is now a better indicator than an NHL scoring defense, but only slightly so.

Now, let's combine these two charts into scoring differential, modeled against a championship:

[Chart: AIC by league, scoring differential modeled against championships]

Here, we learn point differential is more predictive in basketball than in any other sport. Remember how teams playing at different paces obscure the importance of points alone? Including the defensive component erases the effect of pace of play and gives a clearer predictor. This also matches the earlier finding that a win total is most predictive of a championship in basketball. Football and hockey are nearly equal in predictive ability and baseball is a distant fourth.
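
The pace argument can be illustrated with a small sketch. The two teams and their per-game averages are hypothetical: one plays fast and scores a lot while barely outscoring opponents, and the other plays slowly but wins by a wider margin.

```python
# Hypothetical per-game averages, for illustration only.
teams = {
    "fast_pace": {"scored": 112.0, "allowed": 111.0},
    "slow_pace": {"scored": 98.0, "allowed": 92.0},
}


def differential(team):
    """Points scored minus points allowed."""
    return team["scored"] - team["allowed"]


# Ranking by points alone rewards pace; ranking by differential does not.
rank_by_points = sorted(teams, key=lambda name: teams[name]["scored"], reverse=True)
rank_by_diff = sorted(teams, key=lambda name: differential(teams[name]), reverse=True)

print("by points scored:", rank_by_points)
print("by differential: ", rank_by_diff)
```

Points scored alone puts the fast team first, while differential puts the slow team first, which is the sense in which adding the defensive side strips pace out of the comparison.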

There are more trends to uncover if we combine all of these charts:

[Chart: AIC by league for offense, defense and scoring differential, combined]

In nearly every sport, scoring defense is more predictive than offense (with hockey being the lone exception). Scoring differential is predictably better for analysis than offense or defense by itself, but the degree to which it takes away the randomness is different for each sport. It is only a slight improvement in the NFL, but a drastic improvement for basketball.

Overall, these proportions could prove helpful when determining if a team is going in the right direction when devoting resources to offense and defense. Both are necessary, but perhaps more money should be proportionally allocated to the areas that best predict who will win a championship.

Playoff Unpredictability

Until recently, the Los Angeles Lakers were one of the fixtures of the NBA Playoffs, and in many seasons, the Finals. They have put together dynasties in different generations of the sport, from Magic Johnson's teams to the Shaq and Kobe era. When the Lakers were not winning titles, chances are another team was enjoying its own dynasty, like the Boston Celtics, Chicago Bulls or San Antonio Spurs. Dynasties are so commonplace in the NBA that 15 franchises in the sport's history do not have a championship (and seven of those still in existence never even made it to the Finals).

The NBA is unique in this regard: championships are won in bulk. Other leagues offer more parity, where there is a larger pool of contenders vying for a title. There may be dynasties in other sports, but there seem to be fewer of them, each is shorter in duration and there is a better chance that someone unexpected can claim the sport's top prize.

Which of the four top professional sports leagues (NFL, NBA, MLB and NHL) offers the most playoff unpredictability? Is the NBA truly the most predictable? Is it significantly more predictable or marginally so?

One approach to answering these questions is by using a statistical model for each sport. Here, we will use logistic regressions, where we will look at only wins (or points in hockey) and see how well they predict whether a team won a championship that year. Here are some other notes for setting up this project:

- All data used begins with the 1989-90 season because the NFL had the biggest change to its playoff format at the turn of the new decade.

- Any season in any sport where a lockout shortened the number of games played considerably was removed (e.g., the 1998-99 NBA season, the 2012-13 NHL season, etc.)

- Though the NHL played 80 and 84 games in a few of these seasons, these numbers are not significantly different from the 82 played in the rest of the dataset, so they are still used.

At first glance, every variable representing wins is statistically significant at the 99% confidence level, which should be obvious because a team needs so many wins just to make the playoffs. What matters is how well wins alone predict championships. In statistical parlance, we will use a goodness-of-fit measure called AIC (Akaike Information Criterion) to answer this question. As this number gets smaller, the model has a better fit. The following shows how well each model performs:

[Chart: AIC of the wins-only model for each league]

The larger the bar, the more unpredictable the league is. Again, as expected, the NBA is the most predictable, and by a considerable margin. This model also suggests Major League Baseball is the most unpredictable, with the NFL as a close second and the NHL as a close third.

There are a number of other variables that could be added to these models to help determine who will win a championship, but the simplicity of these models makes for an easier comparison across sports.