Tuesday, July 31, 2007

Does Defense Matter More in the Postseason?

In Pro Football Prospectus 2006 and 2007, Football Outsiders showed that defensive performance (as measured by DVOA) correlates more strongly with playoff success than offensive performance does. Similar findings have been made in baseball. Since correlation is not causation, the question is why offensive performance seemingly matters less. Originally, I hypothesized that defensive performance was less consistent and therefore needed to be better overall so that even the worst defensive performances were still good enough to win. That didn't pan out particularly well, so it was back to the drawing board. Today's hypothesis is a little more straightforward.

If offensive efficiency correlates more strongly with regular-season wins than defensive efficiency does, then the offenses that make the playoffs are already very good. When everybody is good, everybody is effectively average relative to the field. If the playoff defenses aren't uniformly as good, however, then the teams with good defenses will be more successful in the playoffs.


(The test data covers the 1996-2006 seasons.)

The average efficiencies (yards per play) for playoff teams are as follows:


  • Run Off 4.1126 yards, 1.5356% VOLA (Value over League Average)
  • Run Def 4.0076 yards, 1.0906% VOLA
  • Pass Off 6.379 yards, 8.4316% VOLA
  • Pass Def 5.6086 yards, 4.6672% VOLA


So by VOLA standards, the defenses of playoff teams aren't as good as the offenses. The one caveat is that there are 20-some more instances of Pass Off. VOLA above 10% (and above 20%) than of Pass Def. VOLA. Take that as you will. It might just mean that there are a lot of crappy quarterbacks in the league weighing down the league averages, while the very good QBs are also very consistent (Brady, Manning, Culpepper, Warner of the Rams, Green, Elway, Favre).
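For concreteness, here's a minimal sketch of how a VOLA figure like the ones above can be computed, assuming VOLA is simply the percentage difference between a team's per-play average and the league-wide average. The 5.883-yard league pass average below is inferred from the quoted figures, and the function name is mine:

```python
def vola(team_value, league_avg, lower_is_better=False):
    """Value Over League Average: percent above/below the league mean.

    For defensive stats (yards allowed per play, etc.) a lower raw value is
    better, so the sign is flipped to keep "positive = good" for both sides.
    """
    diff = (team_value - league_avg) / league_avg
    return -diff if lower_is_better else diff

# Reproducing the playoff-team pass figures quoted above:
print(vola(6.379, 5.883))                         # ~0.084 (8.4% Pass Off VOLA)
print(vola(5.6086, 5.883, lower_is_better=True))  # ~0.047 (4.7% Pass Def VOLA)
```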

Let's take a look at the correlation coefficients with playoff seeding, specifically with 7 minus the seed number (so that a larger value means a better seed), for playoff teams only. By the hypothesis, we'd expect the defensive efficiency correlations to be higher, because the offenses are all very good across the seeds.

The correlation coefficients with 7-(Seed #) listed as Offense, Defense are as follows:

  • Run VOLA .10812, .096591
  • Pass VOLA .11949, .3477
  • Sack Rate VOLA .1075, .23407
  • Third Down Conversion Rate VOLA .11549, .24107
  • Interception Rate VOLA .16219, .2013
  • Fumble Rate VOLA .0189, .16761


In every category except run efficiency, defensive performance has the higher correlation with playoff seeding. So defense is what sets apart the contenders from the one-and-dones. Creating turnovers is important, and a quarterback who is poor at decision-making plays an important part in that as well. Stopping drives on 3rd down and forcing punts is also important. Interestingly, punt return averages have next to no correlation with seeding, but kick return averages have a .18278 correlation. And although none of these correlations is particularly strong in absolute terms, they are strong relative to what you typically see when working with football data. I suspect that automatically seeding division winners 1 through 3 (and now 1 through 4) cuts these coefficients down, however.
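For anyone who wants to reproduce the seeding test, here is a rough sketch of the calculation, assuming one VOLA value and one seed (1 through 6) per playoff team; the names are illustrative, not the actual script:

```python
import numpy as np

def seed_correlation(vola_values, seeds):
    """Pearson correlation between a VOLA metric and playoff seeding.

    Seeds run 1 (best) through 6, so 7 - seed makes a bigger number mean a
    better seed, matching the "higher = better" convention of the VOLA stats.
    """
    vola_values = np.asarray(vola_values, dtype=float)
    seeds = np.asarray(seeds, dtype=float)
    return np.corrcoef(vola_values, 7 - seeds)[0, 1]

# e.g. seed_correlation(pass_def_vola, playoff_seeds) -> ~0.35 per the list above
```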

Of course, seedings do not equal success, but seedings equal home field advantage, which does seem to play a significant part in playoff success. The following table shows the value of home field advantage for the intraconference playoff games (i.e. all games except the neutral-site Super Bowl). I've split it time-wise by the realignment of divisions, which cuts down on the sample size considerably, but there does seem to be an impact.

Avg. Result / Home Win %

Round     1996-2001           2002-2006
WC        9.667 / 79.167%     5.0 / 60%
DIV       11.917 / 79.167%    6.25 / 70%
CF        3.0833 / 50%        3.4 / 60%
OVERALL   8.0303 / 71.212%    5.1818 / 63.636%



Despite the seemingly larger home field advantage, it's important to note that a higher seed likely means a better team. It is the defenses, however, that are making those teams "better," according to my interpretation of the data.

Avg. VOLA for Playoff Teams, 1996-2006

Stat   Super Bowl Winners   Super Bowl Losers   Other Playoff Teams
RO     0.022576             0.028555            0.014699
RD     0.048716             0.048332            0.007469
PO     0.10837              0.11239             0.081933
PD     0.097542             0.045339            0.042048
SRM    0.043673             0.028613            0.066265
SRA    0.08411              0.11689             0.12584
PR     0.16781              -0.0078771          0.020933
PC     0.040876             0.010033            0.030548
KR     0.046957             -0.0024219          0.0095025
KC     -0.02826             -0.020431           0.0022995
3CM    0.12462              0.067547            0.068969
3CA    0.034159             0.044792            0.03235
PFD    0.0064548            -0.053755           0.023301
PY     0.02089              -0.021606           0.018493
IRG    0.1623               0.11482             0.087693
IRT    0.27941              0.095794            0.053809
FRG    0.091916             0.071552            0.11377
FRT    0.19085              0.095611            0.054136



The stats where the Super Bowl winners had a noticeably higher average than the rest of the playoff teams were: Run Defense, Pass Defense, Punt Return, Third Down Conversion Rate Made, Interception Rate Given, Interception Rate Taken, and Fumble Rate Taken. Tangential but interesting: 1996 Green Bay, 1997 Denver, 2000 Baltimore, 2001 New England, and 2005 Pittsburgh had 25%+ VOLA on punt returns, and aside from 2004 New England (-32.4%) and 1998 Denver (-2.16%), no Super Bowl winner had negative punt return VOLA. Back on point: run and pass defensive efficiency are much higher for Super Bowl winners than for other playoff teams (4.87% vs. 0.007% run, 9.75% vs. 4.20% pass), but the gap in offensive pass efficiency isn't as large (10.837% vs. 8.193%). The pass offenses of playoff teams are good to begin with. Their defenses aren't necessarily better.

Another way to look at it is who won with high VOLA and who won with low VOLA. Of the 11 Super Bowls examined, only the 2000-02 winners (BAL, NE, and TB) won with below-average pass offense efficiency. Only the 2001 Patriots won with below-average pass defense efficiency. Of the 22 Super Bowl teams, 12 had 10%+ VOLA in pass offense (6 winners and 6 losers). Over the same span, 73 teams league-wide had 10%+ pass offense VOLA, 59 of which made the playoffs. Eight Super Bowl teams had 10%+ VOLA in pass defense (4 winners and 4 losers), but only 47 teams league-wide had 10%+ pass defense VOLA over the same span, 34 of which made the playoffs. So 80.822% of teams with 10%+ VOLA in pass offense made the playoffs, and 16.438% of them made the Super Bowl. On the other hand, 72.34% of teams with 10%+ VOLA in pass defense made the playoffs, while 17.021% of those teams made the Super Bowl. The percentages suggest that a good offense will get you to the playoffs, but it needs to be balanced with a good defense to reach the Super Bowl.
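Those percentages are nothing more than the ratios of the counts above, spelled out:

```python
# 10%+ pass offense VOLA, 1996-2006
print(59 / 73)   # 0.80822 -> 80.822% made the playoffs
print(12 / 73)   # 0.16438 -> 16.438% made the Super Bowl

# 10%+ pass defense VOLA, 1996-2006
print(34 / 47)   # 0.72340 -> 72.34% made the playoffs
print(8 / 47)    # 0.17021 -> 17.021% made the Super Bowl
```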

That very good pass offenses are much more common than very good pass defenses is surprising and disconcerting. As a sanity check, I took a quick look at FO's 2006 DVOA standings. Eleven teams had 10%+ DVOA for pass offense, while only six teams had 10%+ DVOA for pass defense. The same is not true for rush efficiency. The surprising conclusion is that very good offenses are more common than very good defenses. The million-dollar question is why. Perhaps offenses are easier to build because one man (a QB or RB) can make such a large difference on offense, while one man cannot make the same difference on defense. Brady never had great receivers (nor a great running back), but his skills led the Patriots to three Super Bowl wins. Though you might argue that Bob Sanders had a large impact last postseason as well.

Actually, the Colts also faced imbalanced teams in the 2006 playoffs. The Chiefs had an above average offense but below average defense and a worn out running back. The Ravens had a great defense but a so-so offense (6.26% Pass Off. VOLA, -17.275% Rush Off. VOLA). The Bears had a great defense but a below average offense. The only team the Colts faced in the 2006 playoffs with some amount of balance was the New England Patriots (#7 Off DVOA, #8 Def DVOA).

In conclusion, very good offenses are more common and more important to regular-season wins, so playoff teams have good offenses to begin with. Consequently, teams with balance on both sides of the ball are more likely to win once they get there.

Mind you, 11 seasons is a small sample size, but the initial results merit more research. In the future, I hope to expand the sample back a couple of decades at the expense of dimensionality (restricting myself to simple rush/pass yards-per-play measurements) and to revisit the last decade using DVOA.


Thursday, July 26, 2007

Who's On the Rise and Who's On the Decline: Expanded and Corrected

Going over the 2007 Pro Football Prospectus, I thought I'd give a quick shot at predicting whose win totals would improve/decline next season.

I chose the linear regression model of season win totals based on Value Over League Average, trained it on 1996-2005, and looked at which teams won at least X more or fewer games than their expected win totals, where X is the mean absolute error on the 2006 season (1.233 games). So the rule of thumb: teams that won at least 1.233 games fewer than their expected win total are expected to improve in 2007, and teams that won at least 1.233 games more than their expected win total are expected to decline in 2007.
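Here's a minimal sketch of that rule of thumb, assuming you already have each team's expected wins from the VOLA regression (the threshold is the 1.233-game MAE quoted above; the names are mine):

```python
THRESHOLD = 1.233  # mean absolute error of the win-total model on 2006

def rise_or_fall(expected_wins, actual_wins):
    """Flag teams whose 2006 win total strayed far from the model's expectation."""
    residual = expected_wins - actual_wins
    if residual >= THRESHOLD:
        return "improve"   # won fewer games than expected -> should bounce back
    if residual <= -THRESHOLD:
        return "decline"   # won more games than expected -> should fall back
    return "hold"

print(rise_or_fall(10.432, 8))   # Jacksonville -> 'improve'
print(rise_or_fall(4.6593, 8))   # Tennessee    -> 'decline'
```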



IMPROVE


  • Miami Dolphins (7.9975 expected wins, 6 actual) Very good defense with gaps in the secondary and an aging core. Offense will hopefully improve with a good quarterback (Trent Green), but the receiving corps is weak. The offensive line is also being reshuffled. Any statistical improvement will likely be regression to the mean as much as actual increase in talent.
  • Pittsburgh Steelers (9.4697 expected wins, 8 actual) Big Ben will be healthy. That should make all the difference.
  • Jacksonville Jaguars (10.432 expected wins, 8 actual) They need to settle on a quarterback because otherwise, this is a very good team. Their unadjusted VOLA for run offense and defense are 20.0084% and 16.382% (percentage above league average for yards per carry). Their pass efficiency was below average on defense, but they got a lot of interceptions (their rate was 21% above league average). I will jump on the FO bandwagon and peg them as the big sleeper of 2007.
  • Oakland Raiders (4.1932 expected wins, 2 actual) There's nowhere to go but up, right? Well, actually, their pass defense was good (9.04%, 15.775% unadj VOLA in pass eff. and sack rate made). It's just that their pass offense was far more atrocious (-27.022% pass eff., -96% sack rate allowed, a sack rate of 11.6% compared to a 6.2% league average). Don't expect them to be playoff contenders, but if JaMarcus Russell and Michael Bush are simply mediocre, this team is at least a 5 or 6-win team.
  • Philadelphia Eagles (11.294 vs. 10) If not for a freak 62-yard game-winning field goal (Matt Bryant of the Bucs), the Eagles wouldn't be on this list. The Eagles might not improve their win total in 2007, depending on when McNabb is able to play effectively again. Despite the injury to McNabb, their pass efficiency was 20.58% above average in 2006. They went 1-3 against the AFC South and don't look to fare much better against the AFC East in 2007. They do, however, play the NFC North instead of the South. My intuition is that their interdivision and interconference games will be the difference between 10-6 and 13-3.
  • Detroit Lions (5.2434 vs. 3), Minnesota Vikings (8.0195 vs. 6) I don't expect them to be significantly better. I just expect the Bears to be worse.


DECLINE

  • New England Patriots (10.013 vs. 12) With a legitimate wide receiving corps (that is quickly becoming the most overrated without having played a down), I wouldn't buy into this. New England will do just fine.
  • New York Jets (8.3734 vs. 10) Their defense just isn't very good. And their passing game was average. This is a team I'd expect to regress to the mean. Thomas Jones is a good back, but I don't think that's enough to make this team a legit playoff contender.
  • Tennessee Titans (4.6593 vs. 8) When you're winning games with 62 yard field goals, you know your luck is good. But they've lost Pacman Jones, and how many times can they rely on Vince Young's scrambling ability?
  • San Diego Chargers (12.401 vs. 14) I think that this will happen with most 14 or 15-win teams. It's not that the Chargers weren't very good. They were at least 10% above average in rush offense, pass offense, and pass defense efficiency. It's just that with "luck" and scheduling, teams have to be really, really, super good for the system to guarantee at least 14 wins. That said, they play the AFC South and NFC North this year, as opposed to the NFC West and AFC North, so expect some drop off in win totals.
  • Chicago Bears (11.005 vs. 13) The Bears were lucky to play 4 games against the NFC West. Next season, their intraconference matchups will be against the NFC East, plus the Saints. A harder schedule will mean fewer victories. That said, Rex Grossman's passing efficiency was about average, and the pass defense was very good. Rex Grossman and the offensive line also did a very nice job of avoiding sacks. Thomas Jones, according to these numbers, was overrated, as his efficiency was 8.69% below average. So was my original perception of Rex Grossman being crappy wrong? I don't think so. I think the weak schedule inflated his numbers somewhat (adjusted VOLA is about 0% as opposed to about +2%). I'd peg them as a 9-7 or 8-8 team. Somebody has to win the NFC North, after all.
  • Atlanta Falcons (5.5363 vs. 7) Great rushing offense efficiency (31% VOLA) but a very bad pass offense (-14.45% VOLA). Bad pass defense too. With Joey Harrington under center, don't expect their pass efficiency to go up much, if at all. They might get some help from playing the NFC West.
  • San Francisco 49ers (5.2197 vs. 7), Seattle Seahawks (6.906 vs. 9) This is really just a result of the NFC West being a really weak division overall. The Cardinals apparently drew the shortest straw in 2006. Seattle also had health issues dragging down their season averages, and Shaun Alexander might not recover so easily from his overuse in 2005.


The numbers don't seem to add much to what we already know, but it will be interesting to see how the predictions play out. At the end of the season, I'll pick apart the article and methods to see what went right and what went wrong. The running theme of the article, however, is that the quarterback is the key issue for most of these teams. For some teams, other teams' quarterbacks seem to be the issue.

In short: The NFC North and NFC West are mediocre but wide-open divisions. Atlanta might miss Michael Vick, but the Falcons weren't very good before he got himself in legal trouble. The Steelers and Eagles can expect improvement with healthy quarterbacks at the helm. Jacksonville has great potential to get deep into the playoffs. Chicago is going to sink in 2007. And Tom Brady and Bill Belichick must be pretty dang good to perform so well with Reche Caldwell and Jabar Gaffney as their receivers.

Addendum: Accuracy of Rise and Fall Projections
I went back and tested these rise-and-fall predictions on the 1997-2005 seasons against each selected team's win total the following season. Over those 9 seasons, 53.125% of predicted risers saw at least a 1-game increase in win total the next season, while 62.295% of predicted fallers saw at least a 1-game decrease. Over 2002-2005 (since the last realignment), however, the accuracy numbers were 65.517% and 67.742%. So the divisional realignment and new scheduling method seem to expose more teams with abnormally high or low win totals as the bad/good teams they actually are.

For one more experiment, let's try projecting next season's win totals based on expected and actual wins as a simple regression model. The inputs are actualwins(Y) and expectedwins(Y)-actualwins(Y) and a bias input (always equal to 1). So the idea is that teams that outperform projections will be penalized and teams that underperform will be rewarded. The output will be actualwins(Y+1).

This model, unsurprisingly, isn't very good. The mean absolute error was 2.1299 games in predicting actual wins in 2006. The correlation coefficients with actualwins(Y+1) are 0.26804 for actualwins(Y) and -0.0052895 for expectedwins(Y)-actualwins(Y). I was surprised to see the second coefficient so close to zero. The difference between expected wins and actual wins is really the error of the win-total regression system, and that error goes both ways, creating "random" noise. The error doesn't necessarily mean that a team over- or underperformed that season; that is only the interpretation I ascribe to it. (As a side note: I think it's important to state that statistics are merely tools. They can be poorly designed or poorly implemented. I think people are too quick to dismiss stats as excluding intangibles. That is the interpretation they've ascribed to the stats. It may or may not be true.)

The regression coefficients make a little more sense, however. Each win in year Y is worth 0.29507 wins in year Y+1. Each expected win a team did not get in year Y is worth 0.17917 wins in year Y+1, and each win above the expected win total in year Y is worth -0.17917 wins in year Y+1. Each team is automatically "given" 5.6739 wins (the coefficient on the bias input). For next year's predictions, I swapped out the actual-wins input for expected wins in 2006. This was an inadvertent mistake on my part and not completely sound science. However, the predictions are more interesting this way. One of the weaknesses of many next-season win projections is that division rankings aren't shaken up very much, if at all. This slip-up turns out to be pretty useful because it does mix things up. All of the predicted win totals for 2007 fall between 6 and 10 games, so the exact predictions are not very useful. The rankings, however, are useful. A quick sketch of the formula comes first, followed by the expected final intradivision rankings for 2007.
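A minimal sketch of that projection, plugging in the fitted coefficients quoted above and with the expected-wins swap already applied (this is my reconstruction of the formula, not the original script):

```python
def project_next_season_wins(expected_wins, actual_wins):
    """Project year Y+1 wins from year Y's expected and actual wins.

    Coefficients are the fitted values quoted above. Note the swap described
    in the text: expected wins (not actual wins) feed the first term.
    """
    bias = 5.6739        # wins every team is "given"
    w_wins = 0.29507     # weight on the wins input
    w_resid = 0.17917    # weight on (expected - actual)
    return bias + w_wins * expected_wins + w_resid * (expected_wins - actual_wins)

# A team expected to win 10 games that actually won 8 projects to about 9 wins:
print(project_next_season_wins(10, 8))   # ~9.0
```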

AFC East

  1. Miami Dolphins
  2. New England Patriots
  3. Buffalo Bills
  4. New York Jets


AFC North

  1. Baltimore Ravens
  2. Pittsburgh Steelers
  3. Cincinnati Bengals
  4. Cleveland Browns


AFC South

  1. Jacksonville Jaguars
  2. Indianapolis Colts
  3. Houston Texans
  4. Tennessee Titans


AFC West

  1. San Diego Chargers
  2. Kansas City Chiefs
  3. Denver Broncos
  4. Oakland Raiders


NFC East

  1. Philadelphia Eagles
  2. Dallas Cowboys
  3. New York Giants
  4. Washington Redskins


NFC North

  1. Chicago Bears
  2. Green Bay Packers
  3. Minnesota Vikings (1,2, and 3 are very close)
  4. Detroit Lions


NFC South

  1. New Orleans Saints (by a mile)
  2. Carolina Panthers
  3. Tampa Bay Bucs
  4. Atlanta Falcons


NFC West

  1. St. Louis Rams
  2. Arizona Cardinals
  3. Seattle Seahawks (1, 2 and 3 are very close)
  4. San Francisco 49ers


Certainly, I disagree with some of these projections, but a lot of my assumptions at the beginning of the season don't pan out, so I'll see how the numbers do.

7-30-07: I found an error in my offensive pass efficiency stats. Results and article corrected.
7-30-07: Added section to article about accuracy of this projection system.
8-27-07: Corrected NFC North order and prediction method.


Thursday, July 19, 2007

Gone Fishin'

Just wanted to let you readers know that I'll be out of the country until the 25th, so stay tuned, and I'll have something new up by the end of the month.

Football question: Will this week-long break kill my momentum? Will I be rusty when I return, or will I be refreshed and less likely to pull a hammy?


Monday, July 16, 2007

Does Balance in Play Calling Matter?

If, on a given Sunday, your running game just can't get anything done, does it pay to continue calling running plays? Should you just abandon it for the passing game? Do you REALLY want Joey Harrington throwing the ball sixty-two times in a single game, Osaban Bin Lyin'? This really isn't about the running game setting up the passing game, per se. As it turns out, over 1994-2006, unadjusted pass and run (offensive) efficiency have a correlation coefficient of -0.0084705, essentially zero. They're "unrelated." Talent probably wins out. Adjusted efficiencies don't fare much better. Instead, I'm simply testing whether or not an imbalance in play calling negatively affects offensive efficiency. Ideally, on a graph where X is the ratio of pass plays called to run plays called and Y is the offensive efficiency, we'd expect a parabola peaking at 1-1, or at least something that looks like one.
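The test itself is simple; here is a rough sketch of the two balance measures and their correlations with efficiency, assuming per-team-game arrays of pass plays, run plays, and offensive yards per play (the names are mine):

```python
import numpy as np

def balance_correlations(pass_plays, run_plays, off_eff):
    """Correlate play-calling mix with offensive efficiency across team-games."""
    pass_plays = np.asarray(pass_plays, dtype=float)
    run_plays = np.asarray(run_plays, dtype=float)
    off_eff = np.asarray(off_eff, dtype=float)

    # X axis described above: pass plays per run play
    pass_run_ratio = pass_plays / run_plays
    # Imbalance stat used further down: |0.5 - share of plays that are passes|
    imbalance = np.abs(0.5 - pass_plays / (pass_plays + run_plays))

    return (np.corrcoef(pass_run_ratio, off_eff)[0, 1],
            np.corrcoef(imbalance, off_eff)[0, 1])
```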

Correlation of stat with percentage of plays run that are pass plays

Statistic                   Corr. Coef.
Run Eff, Unadj              -0.19428
Run Eff, Opp Adj            -0.16552
Run Eff, HF Adj             -0.19233
Pass Eff, Unadj             -0.24897
Pass Eff, Opp Adj           -0.28482
Pass Eff, HF Adj            -0.24415
Off Eff, Unadj              -0.12309
Sack Rate, Unadj            0.069636
Sack Rate, Opp Adj          0.052371
Sack Rate, HF Adj           0.066329
Int Rate, Unadj             0.13693
Int Rate, Opp Adj           0.11492
Int Rate, HF Adj            0.13295
Fum Rate, Unadj             0.19613
Fum Rate, Opp Adj           0.1806
Fum Rate, HF Adj            0.19536
Margin of Victory/Defeat    -0.59146



If I check the correlation with an imbalance stat instead (the absolute value of 0.5 minus the proportion of plays called that are pass plays), the numbers aren't much different, but here's a quick summary:

  • Run efficiencies: ~-0.11
  • Pass efficiencies: ~-0.14
  • Total offensive efficiency: ~-0.08
  • Margin of victory/defeat: -0.38188

So, play calling imbalance seems to have a negative effect on offensive efficiency, but it could just mean that losing teams, which are generally teams that aren't performing as well statistically, are forced to abandon the running game to try and catch up. And of course, teams that are winning will abandon the passing game to run out the clock. The margin of victory/defeat correlation coefficients seem to back up this theory. But I would hardly say that the numbers are conclusive. Take a look at the graphs. I've included only 2 of the charts for brevity's sake and just to get an idea of how much they do not look like parabolas.



They're essentially blobs with some vague direction. This is the face of low correlation. I'd like to revisit this article later on with finer-grained data. Once I collect some play-by-play data (hopefully the NFL.com gamebooks won't be too hard), I plan on breaking this data down by quarter, half, or scoring margin, to see how ability and game plans affect the results rather than how the results affect the game plan.


The Predictive Ability of Averages

The predictions made by my and others' models are simply an expected level of performance given the averages. In other words, if this game were played an infinite number of times, we'd expect the average result to converge towards this number. Players don't play at the mean, however. They play above and below it. On any given Sunday, numerous unpredictable factors can influence performance levels and thus the outcome. Weather, in-game injuries, the coaches' playcalls, bad seafood, a scorned woman (e.g. Mrs. Nick Harper), a player soliciting prostitution from a cop, a player soliciting prostitutes for opposing players. Over time, however, these factors should balance themselves out, and everything but ability should be filtered out of the averages. So how predictive of the next week's performance are the averages?


To find out, I took the to-date averages through weeks 3-16 for the years 1996-2006 and matched them up with the in-game averages from weeks 4-17 of the same years. This matches my testing setup for the prediction models, except that the models start making predictions in week 3; you can't really expect good predictions based off one or two weeks of data. So the test is: how well do averages based on weeks 1 through W-1 correlate with the results in week W?
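Concretely, each data point pairs a team's running average through week W-1 with its single-game figure in week W. A rough sketch of how those pairs are built for one team-season (byes ignored for simplicity; the names are mine):

```python
import numpy as np

def todate_vs_next_game(game_values):
    """Pair to-date averages (games 1..W-1) with the game-W performance.

    game_values: one team-season's per-game values for a stat, in order.
    Pairs start at the fourth game so the average rests on at least three games.
    """
    pairs = []
    for w in range(3, len(game_values)):           # index 3 == fourth game
        to_date_avg = np.mean(game_values[:w])     # games 1..W-1
        pairs.append((to_date_avg, game_values[w]))
    return pairs

# Stack the pairs from every team-season, then correlate the two columns:
# x, y = np.array(all_pairs).T
# print(np.corrcoef(x, y)[0, 1])
```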


Correlation coefficients of averages up to week W-1 with performances in week W

Stat   Unadj.       Adj. for Opponent
RO     0.14647      0.15258
RD     0.091264     0.11125
PO     0.16958      0.1335
PD     -0.0013236   0.069243
SRM    0.028981     0.036354
SRA    0.23358      0.22261
PR     0.026346     0.023044
PC     0.063592     0.066901
KR     0.048825     0.030787
KC     0.039768     0.0099349
3CM    0.12793      0.12153
3CA    0.066826     0.060486
PFD    0.048445     0.052096
PY     0.088783     0.083038
IRG    0.00046638   0.0038317
IRT    0.046033     0.025835
FRG    0.058733     0.039033
FRT    0.023268     0.002728



Here we see a fundamental problem with the prediction models: players don't play at their mean level, and thus the averages aren't terribly predictive of performance within a single game. Even adjusting for opponent doesn't really help. The yards per carry and yards per pass attempt stats are highly dependent on down-and-distance and the current scoring margin. Busted plays can cause huge shifts in "momentum," as Marcus Allen's 74-yard TD run and Joe Theismann's pre-halftime pick six did in Super Bowl XVIII. And if Don McNeal hadn't slipped as he stopped following the receiver in motion, maybe he would have tackled John Riggins for a 2-yard gain instead of letting him break off a 43-yard TD run, and the Dolphins would have won Super Bowl XVII. Little things like that greatly affect the in-game averages. Excluding that one play, Riggins averaged 3.324 yards on 37 carries, not a particularly good performance against a poor run defense (4.38 yards/carry allowed). Considering that he averaged a paltry 3.1 yards/carry that season, the odds of him getting that one long play were probably very small. But the play call by Gibbs put McNeal in the position to slip up. Was it luck? I wouldn't say that. Actually, reading up on the 2006 Cardinals in Baseball Prospectus, I came across this fitting quote: "Luck is the residue of design." Was it a repeatable event with predictive value? I have my doubts.

So happenstance can make the statistics more retrodictive than predictive, judging by the correlations with next-game scoring margins, next-game averages, and season win totals. Sixteen games doesn't seem to be a large enough sample to filter all of the happenstance out of the stats.


Friday, July 13, 2007

The Value of Home-Field Advantage 4.0 - Performance Enhancement

The regression models tend to put stronger coefficients on the home team performance variables than on the away team's variables. This contributes to the overly strong home-team bias of the regression models. But because of the training error-minimizing nature of the regression, you can't really tinker around with the coefficients and expect improved results. More mathematically sound methods of dampening the weights (e.g. ridge regression) have not really improved accuracy either. Given that, I tried to fix the bias in the coefficients by fixing the bias in the data.

If the home team is winning 58% of all games, then it's reasonable to assume that being at home influences a team's offensive, defensive, and special teams efficiencies. And when you try to judge a team's performance mid-season, it has not necessarily played as many home games as away games. So how much bias would that cause? Does adjusting performance for home-field advantage create more informative stats?


The table below shows the average of year-end league averages in several metrics split according to performance at home and on the road.


Avg. Performance of League, 1996-2006

Stat                    Home       Away
Run Eff.                4.103      3.9956
Pass Eff.               6.0151     5.7526
Sack Rate Allowed       6.5962%    6.9897%
Punt Ret.               9.5288     9.1585
Kick Ret.               22.084     21.479
3rd Down Conv.          38.656%    37.005%
Pen. 1st Downs Given    1.667      1.5091
Pen. Yds. Given         52.257     55.936
Int. Rate               2.8441%    3.1675%
Fum. Rate               3.1435%    3.1924%



As expected, home teams are consistently better, but the scale of those differences is small. The difference in run efficiency amounts to an extra 2 yards for every 30 attempts. For pass efficiency, it amounts to an extra 8 yards for every 30 attempts. The difference in interception rates doesn't even amount to a tenth of an interception for every 30 pass attempts. Yet these differences are somehow worth 2.7857 points to the home team and a 58.941%/41.059% split of games.

So to adjust stats for home field advantage, I used the same method as the opponent adjustments: instead of scaling by the ratio of league average to opponent average, I scale by the ratio of league average to the home/away average. In terms of season win totals, adjusting for home-field advantage increased the R² of the model (1996-2006 data) from 0.728 to 0.764, so adjusting the stats for home-field advantage does improve them. Unfortunately, it does not significantly alleviate the problem of regression models classifying too many games as home-team wins (a 1-2% decrease, still above 70%).
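The adjustment itself mirrors the opponent adjustment; here is a minimal sketch, treating the league average as the midpoint of the home and away figures in the table above (function and variable names are mine):

```python
def hfa_adjust(raw_value, league_avg, league_home_or_away_avg):
    """Scale a single-game stat by how much home/away context inflates it.

    Multiplying by (league average / league home-or-away average) nudges a
    home performance down and a road performance up, the same way the
    opponent adjustment scales by league average / opponent average.
    """
    return raw_value * (league_avg / league_home_or_away_avg)

# Pass efficiency example from the table above (league avg ~5.884 yds/att):
print(hfa_adjust(7.0, 5.8839, 6.0151))   # home game -> ~6.85
print(hfa_adjust(7.0, 5.8839, 5.7526))   # road game -> ~7.16
```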


Thursday, July 12, 2007

Consistency of Offensive and Defensive Efficiency Part I - Regular Season

Looking at the correlations of the stats to regular season win totals, offense seems to be more important than defense to winning. For example,


  • Rush offense efficiency: 0.2052
  • Rush defense efficiency: -0.13296


However, Football Outsiders did a study showing that defense was more important to postseason success. This seems to fall in line with the 2002-06 Colts. And the same pattern can be found in baseball, where pitching metrics (especially for relievers) become the most highly correlated with playoff success (Baseball Between the Numbers). But correlation does not equal causation. Why does offense suddenly drop in importance?

With single-elimination playoffs, the better team does not always win. Teams never perform at their absolute average, which makes predicting wins from teams' average performance fairly difficult. The Jaguars, by a fair share of metrics, perform really well on average, but it hasn't translated into a great record because of their inconsistency. Over one game, anything can happen. So the team that gets deep into the playoffs should be reliable and consistent.

If defense is the better predictor of postseason success, is defense more consistent from game to game? If it is, then over a 16-game sample, where game outcomes vary so much, the steadier defensive metrics would end up less correlated with win totals. The alternative hypothesis is that defensive performance is less consistent, so you need a very good defense, one whose rock bottom isn't that bad, to overcome the consistently excellent offenses in the playoffs. From year to year, defensive efficiency changes more than offensive efficiency (according to FO), so more week-to-week inconsistency would be reasonable to expect. I'll use two measures: the correlation of one week's performance with the next week's, and the average in-season standard deviation of each metric (sketched below). Let's take a look at what the numbers have to say.
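A rough sketch of both measures, assuming the data is grouped into per-game values by team-season (the names are mine, not the actual script):

```python
import numpy as np

def week_to_week_corr(team_seasons):
    """Correlate each game's value with the following game's value."""
    pairs = [(season[i], season[i + 1])
             for season in team_seasons
             for i in range(len(season) - 1)]
    x, y = np.array(pairs).T
    return np.corrcoef(x, y)[0, 1]

def avg_in_season_std(team_seasons):
    """Average, across team-seasons, of the standard deviation within a season."""
    return float(np.mean([np.std(season) for season in team_seasons]))
```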


Correlation of week's performance with next week's performance, unadjusted for opponent

Stat                  Corr. Coef.
RunOff                0.08295
RunDef                0.039186
PassOff               0.17349
PassDef               0.010152
SackRateMade(Def)     0.01733
SackRateAllow(Off)    0.18595
3rdDownOff            0.071573
3rdDownDef            0.011514
IntRateGiven(Off)     -0.012781
IntRateTaken(Def)     0.01926
FumRateGiven(Off)     0.025166
FumRateTaken(Def)     0.025957


Average in-season standard deviation of metric, unadjusted for opponent

Stat                  Avg. Std. Dev.
RunOff                1.1732 yds
RunDef                1.1885 yds
PassOff               1.8195 yds
PassDef               1.8962 yds
SackRateMade(Def)     4.919%
SackRateAllow(Off)    4.6111%
3rdDownOff            13.096%
3rdDownDef            13.405%
IntRateGiven(Off)     2.9114%
IntRateTaken(Def)     2.8889%
FumRateGiven(Off)     2.7177%
FumRateTaken(Def)     2.7259%



According to these methods using the unadjusted-for-opponent stats, defensive performance is less consistent than offensive performance (except maybe with turnovers). The differences between the average std. deviations of the offensive and defensive metrics are small, but they are at least consistent in favor of offense. The differences in correlation coefficients are more significant. Surprisingly, the week-to-week correlation of defensive performance is near zero. The performance of defense one week seemingly has no bearing on performance the following week whatsoever. And momentum is, at best, a small factor on offensive performance, which apparently doesn't matter as much in the playoffs anyway. Much of this variance likely has to do with the opponents, so now is the time when we adjust for opponents.


Correlation of week's performance with next week's performance, adjusted for opponent

Stat                  Corr. Coef.
RunOff                0.095318
RunDef                0.052455
PassOff               0.13248
PassDef               0.026777
SackRateMade(Def)     0.022664
SackRateAllow(Off)    0.17386
3rdDownOff            0.071013
3rdDownDef            0.017221
IntRateGiven(Off)     -0.0048146
IntRateTaken(Def)     0.010823
FumRateGiven(Off)     0.018135
FumRateTaken(Def)     0.024204


Average in-season standard deviation of metric, adjusted for opponent

Stat                  Avg. Std. Dev.
RunOff                1.1083 yds
RunDef                1.1073 yds
PassOff               1.9546 yds
PassDef               1.7601 yds
SackRateMade(Def)     4.6508%
SackRateAllow(Off)    4.5034%
3rdDownOff            12.682%
3rdDownDef            12.788%
IntRateGiven(Off)     2.9403%
IntRateTaken(Def)     2.9135%
FumRateGiven(Off)     2.6759%
FumRateTaken(Def)     2.6853%



As it turns out, adjusting for opponents closes the gap a little, but offense is still more consistent overall. Adjusted for opponent, rush and pass offensive efficiency actually show more variance than their defensive counterparts, but the week-to-week correlations are still well in favor of the offense.

Three out of the four tests I ran say that offensive performance is actually more consistent than defensive performance. So the conclusion seems to be that a defense needs to be very good so that it can match up okay with the consistently good offenses of playoff teams, even when it isn't performing at its best. Does this reasoning make sense to anyone else? Definitely a topic for further exploration/discussion.


Addendum: Reply to bettingman's post
Do defenses adapt over the season and improve? Makes sense as they have to react to the offense. I just threw together these graphs. It's by game rather than week, so the two rates won't exactly sync up.
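The curves themselves are just league-wide averages at each game number; roughly how they can be built (my reconstruction, not the exact script):

```python
import numpy as np

def by_game_trend(team_seasons):
    """League-average value of a stat for the 1st game, 2nd game, ..., 16th game."""
    n_games = max(len(season) for season in team_seasons)
    return [float(np.mean([season[g] for season in team_seasons if len(season) > g]))
            for g in range(n_games)]
```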







Rushing offense improves over the year by about .2 yards/attempt, but passing offense worsens over the year by about .2 yards/attempt. Sack rates don't trend either way, but interception rates increase by about .3%. So the passing game does indeed become less effective as the year progresses. The improvement in the rushing game might stem from defenses focusing more on the passing game. Also of note, kick and punt return averages decrease over the season but start to pick up again toward the end. Clearly, the coverage units do a good job of adapting. Do the return units start to adapt to the coverages as well?


