By: Edward Egros

Oct 2016

A Unique Cowboys Perspective

The Dallas Cowboys are constantly watching film and studying the playbook for that added edge. Their fans also want to know anything that can help explain why their favorite team won or lost, and if there is a way to forecast how they will do and where they need to improve. Our newest data visualizations hope to do all of the above.

Before and during every Cowboys game, I will post on my various social media accounts some analytics that explain what is going on and predict what will happen. After the game, I will have one summary detailing what happened, using explanatory variables that are the best indicators for the outcome of any football game. Here is some extra information for each highlighted variable:

  • Turnovers are perhaps self-explanatory: the team with the better turnover ratio has a significant advantage.
  • Scoring efficiency goes beyond just the scoreboard. It is the ratio (offensive yards/points). A team may have moved the ball but failed to score many points when near the end zone, so it was inefficient. Not only can each team's efficiency be compared, but each bar has a color: red for bad, blue for average and green for good. Respectively, these quality ranges are: 0-12, 12.01-18.5, and 18.51 and above. These ranges came from the last ten years of NFL data, provided by Pro Football Reference.
  • The ratio (time of possession/rushing yards) looks at who was controlling the game effectively. Time of possession by itself is not an effective indicator of success, but how well a team controls the ball while on offense is. The team with the better ratio earns the checkmark.
  • Overachiever/underachiever is a way to look at how well a team is doing for the season, relative to its point differential. In other words, if a team has a strong record but all of its wins are close, it is overachieving. If it has suffered a number of losses but they have been close, it is underachieving. This idea is calculated using a Pythagorean Expectation formula, something more commonly used in baseball: ((Points for^2.37)/(Points for^2.37 + Points against^2.37)). This winning percentage can then be multiplied by the number of games played to show where a team "should" be with its record.
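
For readers who want to try these calculations themselves, here is a minimal Python sketch of the scoring-efficiency ratio and the Pythagorean Expectation described above. The function names and the sample numbers are my own illustrations, not the code behind the visualizations, and the handling of a team with zero points is an assumption.

```python
def yards_per_point(offensive_yards: float, points: float) -> float:
    """Scoring efficiency as described above: offensive yards / points."""
    if points <= 0:
        # Assumption: with no points scored, treat the ratio as unbounded.
        return float("inf")
    return offensive_yards / points


def efficiency_band(ratio: float) -> str:
    """Return which of the quoted ranges a yards-per-point ratio falls in."""
    if ratio <= 12:
        return "0-12"
    if ratio <= 18.5:
        return "12.01-18.5"
    return "18.51 and above"


def pythagorean_win_pct(points_for: float, points_against: float,
                        exponent: float = 2.37) -> float:
    """Expected winning percentage from points scored and allowed."""
    if points_for == 0 and points_against == 0:
        # Assumption: with no scoring data at all, call it an even 50%.
        return 0.5
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)


def expected_wins(points_for: float, points_against: float,
                  games_played: int, exponent: float = 2.37) -> float:
    """Where a team "should" be: expected win percentage times games played."""
    return pythagorean_win_pct(points_for, points_against, exponent) * games_played


if __name__ == "__main__":
    # Hypothetical game: 400 offensive yards and 24 points.
    ratio = yards_per_point(400, 24)
    print(f"{ratio:.2f} yards per point, range {efficiency_band(ratio)}")

    # Hypothetical season so far: 200 points for, 150 against, 7 games played.
    print(f"Expected wins: {expected_wins(200, 150, 7):.1f}")
```

Comparing the expected wins to a team's actual wins gives the overachiever/underachiever reading: more actual wins than expected means overachieving, fewer means underachieving.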

Periodically there will be additional metrics to explain why the Cowboys won or lost, such as net passing yards/attempt, which takes into account sacks and incompletions as well as how many passing yards each quarterback is able to accrue. As more metrics become readily available, this summary will include them. To see these visualizations in real time, follow me on social media.


Special thanks to Fuzzy Red Panda for putting together these beautiful images and programs that advance sports analytics in such creative ways.


Go Cubs Go

In just a few days, Wrigley Field's iconic scoreboard will showcase a World Series for the first time in more than seven decades. A franchise with questionable management and horrible luck has finally come within four wins of its first world championship in more than a century.

The Cubs have fielded formidable teams that have made the postseason, but until this year they had never won the NLCS. Postseason baseball can be so unpredictable that it is difficult to explain why the Cubs could not reach the World Series until now. But there are some trends that predict success in playoff baseball and that do not have as great an impact in the regular season.

While I have written a paper about this and have applied those lessons to the Texas Rangers in a previous post, I would like to look at alternative research. In the book "Baseball Between the Numbers", three qualities are listed that best determine postseason success:

  • Pitcher Strikeout Rate
  • Fielding Runs Above Average (FRAA)
  • Closer Expected Wins Over Replacement Pitcher (WXRL)

The Cubs finished 3rd in the majors in strikeout percentage and strikeouts per nine innings (the Dodgers, the team Chicago beat in the NLCS, finished first in both categories). FanGraphs uses a metric called Ultimate Zone Rating to calculate fielding, and listed the Cubs as the best fielding team this season. Lastly, the Cubs finished 19th in reliever Wins Above Replacement, but keep in mind that the team traded for Aroldis Chapman late in the season.

It is also worth noting that the Indians had high rankings in all three of these categories as well (5th, 4th and 7th, respectively). While the matchup should make for a fantastic World Series, given how well the Cubs have built this team for a postseason run, it should not come as a surprise if they end this 108-year drought.

No Range for the Texas Rangers

It's hard not to catch shortstop Elvis Andrus smiling these days. His Texas Rangers go into the postseason with home-field advantage all the way through the World Series, having finished one victory shy of a franchise record for most wins in a season and boasting the most wins at home in the American League. Elvis himself finished the regular season as a .302/.362/.439 hitter. And yet, a few sabermetricians have spoken out, saying that not only should the Rangers not be one of the favorites to win the World Series, but their success is virtually fraudulent.

It involves Pythagorean Expectation. This is the often-cited formula baseball guru Bill James invented to estimate how many wins a team "should" have based upon how many runs it scored and allowed. Since it became commonplace, the formula has worked quite well at explaining why teams are thriving or struggling. Even this season, the formula explains all but a handful of wins or losses for every MLB team. The one team the formula has done the poorest job with is the Texas Rangers.

For much of the season, this team's Pythagorean W-L hovered around .500. The Rangers finished 13 games above what was expected, at 95-67. Why? The Rangers were 36-11 in one-run games (the .766 winning percentage is a record in modern baseball). They were also 18-24 in games decided by 5+ runs. In other words, the Rangers won a lot of close games and lost a lot of blowouts.
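
The arithmetic behind those figures can be checked with a short Python sketch. It uses only the records quoted above and assumes that "games above what was expected" means actual wins minus Pythagorean expected wins; the variable names are illustrative.

```python
def win_pct(wins: int, losses: int) -> float:
    """Winning percentage from a win-loss record."""
    return wins / (wins + losses)


# Records quoted in the paragraph above.
season = (95, 67)      # final regular-season record
one_run = (36, 11)     # record in one-run games
blowouts = (18, 24)    # record in games decided by 5+ runs
games_above_expected = 13

# Assumption: "13 games above expected" = actual wins - expected wins.
pythagorean_wins = season[0] - games_above_expected
pythagorean_pct = pythagorean_wins / sum(season)

print(f"Actual record:      {win_pct(*season):.3f}")
print(f"Pythagorean record: {pythagorean_wins}-{sum(season) - pythagorean_wins} "
      f"({pythagorean_pct:.3f})")
print(f"One-run games:      {win_pct(*one_run):.3f}")
print(f"Games decided by 5+: {win_pct(*blowouts):.3f}")
```

Under that assumption the expected record works out to 82-80, which lines up with a Pythagorean W-L that hovered around .500 for much of the season.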

A discrepancy this large is unprecedented for the Rangers in the last decade.

In most of those seasons, the Rangers performed roughly as expected, given their runs scored and allowed. But in the last two years this team has over-performed. It might be a coincidence that those were the two years Jeff Banister has managed the Rangers, but maybe not. Banister has a history of evaluating players and looking at skills during blowouts. He is certainly not the only manager to have this approach, but it is possible he takes it to the next level. Two years is not sufficient data to draw such a conclusion, but it is a noteworthy trend to consider.

So how accurate is this formula when predicting if the Rangers will win the World Series? Not very. Since 1969, only 11 of the 47 teams with the best Pythagorean expected record went on to win the World Series. In fact, the likelihood has decreased since the postseason expanded. Many conclude the postseason is almost impossible to predict, though there are helpful trends to consider. Most notably, "small ball" seems to be a more successful approach in the postseason than in the regular season. Among teams in the postseason, the Rangers rank 3rd in stolen bases, 5th in sacrifice flies and 3rd in hit by pitch (they are, however, last in walks and almost last in sacrifice hits).

If you believe the Rangers will eventually regress to the mean given this disparity, it has not happened through 162 games, so statistically nothing suggests this trend will automatically change after another 19 games. In a way, the Texas Rangers have just as good a chance to win the franchise's first world championship as anybody, and that smile from Elvis Andrus will be even wider.