By: Edward Egros

No Need to Establish the Run

David Johnson

Arizona Cardinals running back David Johnson (left) may understand the importance of balancing rushing and passing about as well as anybody. Last season, he finished with the most touches, all-purpose yards and combined rushing and receiving touchdowns of anyone in the NFL. For an encore, his head coach says he wants Johnson to average 30 touches per game.

It's one thing to strike the right balance between how to use Johnson as a rusher and as a receiver; it's another to make these decisions relative to the time of the game. Conventional wisdom in football has always championed the idea of "establishing the run": no matter how long it takes to create an effective run game, it should be a point of emphasis early in a contest. More recently, rushing plays have been called less frequently, regardless of what the clock reads. Given this trend, there is a way to explain, at least analytically, why attempting to establish the run is unnecessary.

I took NFL play-by-play data from the 2010 through 2015 seasons. This information included which team won and lost. Then, using only rushing plays, I summed the rushing yards each team had per quarter, per game. (I am not including overtime rushing yards in this analysis, both because they appear so infrequently and because they sway the results heavily: a long run in overtime will essentially end the game.) Using a logit regression with "win" as a binary dependent variable and rushing yards per quarter as my explanatory variables, here is the output:

=========================================
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.8447  -0.9786  -0.5544   1.0545   2.0701

Coefficients:
                 Estimate Std. Error z value Pr(>|z|)
(Intercept)     -1.747385   0.105946 -16.493  < 2e-16 ***
yards.gained.1   0.006508   0.001922   3.386 0.000708 ***
yards.gained.2   0.007091   0.001953   3.632 0.000282 ***
yards.gained.3   0.015546   0.001910   8.137 4.05e-16 ***
yards.gained.4   0.035783   0.002156  16.594  < 2e-16 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 4251.8 on 3066 degrees of freedom
Residual deviance: 3711.2 on 3062 degrees of freedom
AIC: 3721.2

Number of Fisher Scoring iterations: 4
==========================================
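
For anyone who wants to rebuild the inputs to this regression, the aggregation step described earlier (rushing plays only, summed per team, per quarter, per game, with overtime dropped) could be sketched in Python roughly as follows. The play records and field names here (`game_id`, `team`, `quarter`, `play_type`, `yards`) are made up for illustration and are not the schema of the actual dataset.

```python
from collections import defaultdict

# Hypothetical play-by-play records; the field names and values are
# illustrative assumptions, not the real dataset.
plays = [
    {"game_id": "G1", "team": "ARI", "quarter": 1, "play_type": "run", "yards": 7},
    {"game_id": "G1", "team": "ARI", "quarter": 1, "play_type": "pass", "yards": 12},
    {"game_id": "G1", "team": "ARI", "quarter": 1, "play_type": "run", "yards": -2},
    {"game_id": "G1", "team": "ARI", "quarter": 4, "play_type": "run", "yards": 15},
    {"game_id": "G1", "team": "ARI", "quarter": 5, "play_type": "run", "yards": 40},  # overtime
    {"game_id": "G1", "team": "DAL", "quarter": 2, "play_type": "run", "yards": 4},
    {"game_id": "G1", "team": "DAL", "quarter": 3, "play_type": "run", "yards": 9},
]


def rushing_yards_by_quarter(plays):
    """Return {(game_id, team): [q1, q2, q3, q4]} using rushing plays only."""
    totals = defaultdict(lambda: [0, 0, 0, 0])
    for play in plays:
        if play["play_type"] != "run":
            continue  # passing plays are ignored
        if not 1 <= play["quarter"] <= 4:
            continue  # overtime is excluded, as in the analysis
        key = (play["game_id"], play["team"])
        totals[key][play["quarter"] - 1] += play["yards"]
    return dict(totals)


features = rushing_yards_by_quarter(plays)
for key, yards in sorted(features.items()):
    print(key, yards)
```

Each resulting row of four quarterly totals, paired with a win/loss flag for that team and game, is the shape of data a logit model like the one above would take.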

First, all of these variables are statistically significant at the 99% level (every p-value is below 0.001), which makes logical sense. The more yards a team has, no matter the type, the likelier it is to win. Second, there is a direct relationship between the time of the game and the magnitude of the coefficient. In other words, the later in the game rushing yards are gained, the more strongly they are tied to the game's outcome. Having the largest coefficient for the fourth quarter makes sense because teams that are leading are trying to take time off the clock, and rushing makes that goal easier to achieve. However, the fact that the third quarter has a greater magnitude than either quarter of the first half could suggest there is no statistical advantage to "establishing the run".
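
To see what these coefficients imply, here is a small Python sketch that plugs them into the logistic function and compares two made-up rushing lines with the same 100 total yards. The yardage splits are hypothetical; only the intercept and coefficients come from the output above.

```python
import math

# Intercept and per-quarter coefficients copied from the regression output.
INTERCEPT = -1.747385
COEFS = [0.006508, 0.007091, 0.015546, 0.035783]  # quarters 1 through 4


def win_probability(yards_by_quarter):
    """Fitted win probability for a list of four quarterly rushing totals."""
    eta = INTERCEPT + sum(b * y for b, y in zip(COEFS, yards_by_quarter))
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical games: 100 rushing yards, all early versus all late.
early = win_probability([50, 50, 0, 0])
late = win_probability([0, 0, 50, 50])
print(f"early-heavy: {early:.3f}  late-heavy: {late:.3f}")
```

Under this model the early-heavy line comes out near 26% and the late-heavy line near 69%. That gap describes the fit, not cause and effect: as noted above, teams that are already leading tend to run more late.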

It is also important to convert these coefficients to odds ratios to know how important each rushing yard is to winning. Specifically, an extra first quarter yard increases the odds of winning by a factor of 1.0065. In the second quarter, it's 1.0071, a small difference. In the third quarter, it is 1.0157 and in the fourth, it is 1.0364.
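
The conversion is just exponentiating each coefficient. A minimal Python sketch, using the estimates from the output above, reproduces the per-yard figures and also shows how the effect compounds over a 10-yard gain (the 10-yard case is my own illustration, not part of the original analysis):

```python
import math

# Coefficients copied from the regression output.
COEFS = {
    "Q1": 0.006508,
    "Q2": 0.007091,
    "Q3": 0.015546,
    "Q4": 0.035783,
}


def odds_ratio(coef, yards=1):
    """Multiplicative change in the odds of winning for `yards` extra yards."""
    return math.exp(coef * yards)


for quarter, coef in COEFS.items():
    per_yard = odds_ratio(coef)
    per_ten = odds_ratio(coef, yards=10)
    print(f"{quarter}: per yard {per_yard:.4f}, per 10 yards {per_ten:.3f}")
```

Because odds ratios multiply, ten extra yards scale the odds by the per-yard ratio raised to the tenth power, which is why a seemingly tiny fourth quarter factor adds up quickly.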

There may be value in wearing down a defense by running the ball earlier in a game, but this data and regression do not capture it. It may also be possible that a running back needs several carries before knowing how to dissect a defense later in a game, but that idea is not captured in the aggregate either. In short, establishing the run may not be as crucial as originally thought.

However, one bit of conventional wisdom the results do reflect is the idea that a team controls the game more effectively by running the ball later in the contest. A study like this one offers a way to quantify that control. In fact, I plan to use this analysis in my weekly Cowboys postgame graphics that explain why Dallas either won or lost a particular contest. I will go over these upgraded graphics in a later blog post.

(Special thanks to Luke Stanke for providing the data and helping me with the code!)

Who Do You Trust in the 4th Quarter?

Since being named the starting quarterback for the Dallas Cowboys, Tony Romo has been in the NFL spotlight for ten seasons and 127 games. While he has put up some of the more prolific statistics of any quarterback during this time, many argue he is the most scrutinized veteran gunslinger in the 21st century. One reason is anti-analytical: blown opportunities to win games in the 4th quarter. While many of these games have been the most critical for his team's championship aspirations, it does bring up the bigger question of which quarterbacks have been the most reliable for winning a game in the 4th quarter.

In a later article we will apply analytics and look at what constitutes a "clutch" quarterback. But first, let's look at the raw statistics. The data features 42 quarterbacks spanning all eras of the NFL but who can be considered, at a minimum, marginally successful (e.g. Peyton Manning, Warren Moon, Roger Staubach, Colin Kaepernick, etc.). The 4th quarter variables are: comeback attempts, comeback wins, comeback rate and career blown leads by the QB's own defense.

First, here is a graph of the comeback success rates:

[Graph: 4th quarter comeback success rates by quarterback]

Of the quarterbacks analyzed, Andrew Luck has the best 4th quarter comeback rate of anyone (63%). However, he also had the fewest attempts, so it is too soon to call him the most clutch we have ever seen. In second place is Joe Montana (56%), who many might be more willing to admit is the best in close games. Peyton Manning had the most attempts of anyone (94), but his rate is 47%.
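
The caution about a small number of attempts can be made concrete with a confidence interval on the comeback rate. The Python sketch below uses a Wilson score interval, which is my choice of method and not something from the original data work, and the win and attempt counts are purely hypothetical, picked only so both quarterbacks share the same rate.

```python
import math


def wilson_interval(wins, attempts, z=1.96):
    """Approximate 95% Wilson score interval for a success rate."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    p = wins / attempts
    denom = 1 + z * z / attempts
    center = (p + z * z / (2 * attempts)) / denom
    half = z * math.sqrt(p * (1 - p) / attempts + z * z / (4 * attempts ** 2)) / denom
    return center - half, center + half


# Hypothetical counts, NOT real career numbers: both rates equal 62.5%.
few_attempts = wilson_interval(5, 8)
many_attempts = wilson_interval(50, 80)

for label, (low, high) in [("8 attempts", few_attempts), ("80 attempts", many_attempts)]:
    print(f"{label}: {low:.3f} to {high:.3f}")
```

With the same observed rate, the interval for the quarterback with few attempts is far wider, which is the reason a high rate on a handful of chances does not yet settle who is the most clutch.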

Then comes the aforementioned Tony Romo. His rate is only slightly worse than Manning's. While it is below half, only five of the 42 quarterbacks studied finished better than 50%. In fact, Romo's rate is 11th best out of 42. At the other end, the worst rate among active quarterbacks belongs to Aaron Rodgers (27%). Don Meredith has the lowest success rate of anyone at 25%.

Some of these rates can be explained by analyzing blown leads by that quarterback's defense:


[Graph: career 4th quarter leads blown by each quarterback's defense]

The quarterback dealt the least clutch defense is Drew Brees, whose "D" has blown a 4th quarter lead on 31 occasions. Fran Tarkenton ranks second with 27. Tony Romo is tied for 10th with 17, a mark slightly above the average among the 42 quarterbacks studied. As for those who have fewer reasons to be upset with their defense, there is Kurt Warner (6) and, as expected, Andrew Luck (2).

Visually, and as expected, there is already a direct correlation between 4th quarter comeback rates and blown leads by the defense. Still, it is worth discovering whether there are statistics for each quarterback that can help explain why some successful quarterbacks are better than others at the end of football games. I will report my findings in a future article.

Special thanks to Mark Lane for putting this data together. You can follow him on Twitter @therealmarklane.