Monday, February 11, 2008

Accuracy of 2007 Preseason Predictions

For the next couple weeks, I'm going to be auditing the various statistics and systems I use to see what works, what doesn't. First up, the win total projections based on plugging in preseason stats into a linear regression system trained on regular season stats and win totals. The offensive stats used were those for projected starters only, and the defensive stats were total defensive stats. Only rushing and passing yards per play stats were used.
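For anyone who wants to see the mechanics, here is a minimal sketch of that kind of regression. It is not the actual system: the function names are mine, the training rows and win totals are made up for illustration (generated from a simple linear rule so the fit is easy to check), and the solver is a plain least-squares fit via the normal equations.

```python
def fit_linear(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    # Prepend 1.0 to every row so the first coefficient is the intercept.
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coefs, row):
    """Projected wins for one team: intercept plus weighted stats."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Hypothetical rows: (rush off ypc, pass off ypp, rush def ypc, pass def ypp).
train_x = [
    (4.0, 6.0, 4.0, 6.0),
    (4.5, 6.0, 4.0, 6.0),
    (4.0, 7.0, 4.0, 6.0),
    (4.0, 6.0, 3.5, 6.0),
    (4.0, 6.0, 4.0, 5.0),
    (4.4, 6.8, 3.8, 5.5),
]
# Hypothetical win totals, not real data.
train_y = [8.0, 8.5, 10.0, 8.5, 10.0, 11.2]

coefs = fit_linear(train_x, train_y)
# Plug one team's (made-up) preseason stats into the trained model.
projected = predict(coefs, (4.2, 6.5, 4.1, 5.8))
```

In the real thing the training rows would be regular season stats and win totals, and the row passed to `predict` would be a team's preseason numbers.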

Across the board, the preseason predictions were OK as far as preseason predictions go, which is to say not very good. The average absolute error was 2.66 games, the difference between 7-9 and 10-6. It called Miami's 1-15 season but was 7+ games short of New England's perfect regular season. Randy Moss, however, did not play in preseason. Hmm... I wonder what that bodes for next season...

The correlation coefficient comes out a little better at 0.40641, so the preseason win projections had a moderate positive correlation with actual win totals. The system predicted Cleveland to have a good season, as well as Tennessee and Washington. However, it also predicted New Orleans to be the best team in the league this season at 14-2. Again, as far as these things go, it was normal, having a mix of dead-on predictions and mile-wide misses. Looking at it from a divisional, rather than a league-wide, perspective shows pretty much the same thing.

Team    Expected Wins    Actual Wins
AFC East
AFC North
AFC South
AFC West
NFC East
NFC North
NFC South
NFC West

For 4 of the 8 divisions (AFC East, AFC North, AFC South and NFC West), the projected win totals had a strong correlation with regular season standings. The worst miss was the NFC North, where the order was almost exactly reversed. With the exception of Oakland, the AFC West predictions were also good. The NFC East and South predictions were essentially a wash, with weak correlations. So only 3 of 8 divisions were failures. The p-values, however, show that the correlation coefficients have a higher-than-desired probability of being achievable with random data. The small sample size (n=4) for each division plays a large role in this.

Division     Corr. Coef.   P-value   Mean Abs. Err.
AFC East        0.8070     0.1930       3.1516
AFC North       0.7424     0.2576       2.0331
AFC South       0.7025     0.2975       1.4520
AFC West        0.4754     0.5246       2.0079
NFC East       -0.0104     0.9896       3.1886
NFC North      -0.9553     0.0447       3.3724
NFC South      -0.2303     0.7697       4.5139
NFC West        0.8829     0.1171       1.5770
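The three columns above can be reproduced with a few lines of code. This is a sketch with function names of my own choosing, not the script that produced the table. One handy fact for four-team divisions: the usual t-test for a Pearson correlation has 2 degrees of freedom when n=4, and in that case the two-sided p-value works out to exactly 1 - |r|, which is the pattern visible in the table.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mean_abs_error(predicted, actual):
    """Average absolute gap between projected and actual wins."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def p_value_four_teams(r):
    """Two-sided p-value for Pearson r when n = 4 (t-test, 2 degrees of freedom).

    Not defined for r = +1 or r = -1 (division by zero).
    """
    t = r * math.sqrt(2) / math.sqrt(1 - r * r)
    # Closed-form Student t CDF for 2 degrees of freedom.
    cdf = 0.5 + t / (2 * math.sqrt(2 + t * t))
    return 2 * min(cdf, 1 - cdf)


# Hypothetical four-team division, not taken from the post.
expected_wins = [9.5, 8.0, 7.0, 5.5]
actual_wins = [11, 9, 6, 4]

r = pearson_r(expected_wins, actual_wins)
p = p_value_four_teams(r)
mae = mean_abs_error(expected_wins, actual_wins)
```

With only four teams, even a correlation of 0.8 leaves roughly a 1 in 5 chance of arising from random data, which is why the divisional p-values look so unimpressive.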

I think the lesson to take away from this, however, is that preseason does have some meaning. Most studies into the meaning of preseason have looked at win-loss records, which any football researcher would know is not the most accurate reflection of team ability, especially with a sample size of four. Skill is skill and should be visible even when the effort isn't 100%. Filtering out the play of benchwarmers from the statistics is important, and with some more development, it looks like reasonably good predictions can be made with preseason stats.


Sunday, February 3, 2008

Super Bowl XLII Win Probability

Home Team    Away Team    P(Home Team Won) (%)    Final Score Margin

Despite the home team bias of the system given the neutral site, the Giants came out with the higher win probability. Their 2 fumbles (both recovered) and the one interception weren't enough to cancel out Brady's fumble and the Patriots' poor performance in general.

As I pointed out in my preview, the two X factors that could play in the Giants' favor were the running game and pass protection. Both did play in their favor. The Giants gained 3.5 yards per carry compared to the Patriots' 2.8 ypc. And the Giants generated a ton of pressure on Brady the entire night, leading to 5 sacks. Eli also continued his hot streak, averaging 6.7 yards per pass, compared to 4.3 for Brady.

As a Dolfan, I'm ecstatic. As a blogger on predicting football games, I take pride in my ideas coming to fruition.


Tuesday, January 29, 2008

Super Bowl XLII Breakdown - Pats vs. Giants

Stats used in the preview are unadjusted for opponent and cover the regular season only.

Patriots Run Offense vs. Giants Run Defense
Pats' Run Off: 14th in league, 4.0998 yards per carry, 1.6663% VOLA (Value Over League Average)
Giants' Run Def: 9th, 3.8020 yards per carry, 5.7191% VOLA
Advantage: Giants, worth 0.4905 points

The interesting thing about the Patriots this season has been their seemingly mediocre running game. According to Football Outsiders, they have the best running offense in the league. Since Football Outsiders' stats are about gaining the yards you need, I'm thinking that the Patriots are put in an abnormally high proportion of short-yardage situations because their passing game is ridiculously good. Looking here, the Patriots are a paltry 26th in the league in running plays over 10 yards. With the threat of Randy Moss, it's quite possible that safeties cannot commit to run defense even in situations where they expect the run. My system does not see the Giants' advantage as being particularly large nor particularly valuable, but if the Patriots are stuck in numerous 3rd-and-4 or -5 situations, expect the passing game to bail them out.
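For reference, the VOLA figures in these listings look like a plain percent difference from the league average, with the sign flipped for defensive stats so that positive always means better than average. The sketch below is my reconstruction, not the original code: the function name is made up, and the league-average yards per carry (about 4.0326) is an assumption back-solved from the Patriots' listed numbers rather than a figure stated anywhere in the post.

```python
def vola(team_value, league_avg, higher_is_better=True):
    """Value Over League Average, as a percentage of the league average."""
    diff = team_value - league_avg
    if not higher_is_better:
        # For stats where lower is better (yards allowed), flip the sign
        # so that a positive VOLA still means "better than average".
        diff = -diff
    return 100.0 * diff / league_avg


# Assumption: back-solved from the listed figures, not given in the post.
LEAGUE_AVG_YPC = 4.0326

# Patriots' run offense: 4.0998 yards per carry (more is better).
pats_run_off = vola(4.0998, LEAGUE_AVG_YPC)

# Patriots' run defense: 4.3733 yards per carry allowed (fewer is better).
pats_run_def = vola(4.3733, LEAGUE_AVG_YPC, higher_is_better=False)
```

Under that assumed average, the two results land within rounding of the 1.6663% and -8.4481% listed for the Patriots' run offense and run defense.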

Giants Run Offense vs. Patriots Run Defense
Giants' Run Off: 3rd in league, 4.5940 yards per carry, roughly 13.9% VOLA
Pats' Run Def: 26th, 4.3733 yards per carry, -8.4481% VOLA
Advantage: Giants, worth 2.7902 points

My research into postseason success has shown that balanced teams (as in being good to very good at most things, not being equally good at everything) go the distance, which is why it's disconcerting to see the Patriots' run defense ranked so low. This might also be a function of their pass offense being so good. They can afford to give up big runs that eat up clock when ahead 20 points. Or their linebackers are old and just not very good. I'm inclined to go with the latter. A good game from Brandon Jacobs will go a long way towards keeping the Giants in it, since that will keep Tom Brady off the field. Jacobs' high success rate is a very good omen. The advantage here could very well be worth more than 3 points.

Patriots Pass Offense vs. Giants Pass Defense
Pats' Pass Off: 1st in league, 7.6294 yards per pass, 26.1090% VOLA
Giants' Pass Def: 8th, 6.8706 yards per pass, 5.6341% VOLA
Advantage: Patriots, worth 3.7375 points

How can the advantage be worth only 4 points, you ask? Well, keep in mind that the prediction system is trained not to output extreme values because quite often they don't come true. Look at some of the close games the Pats had against teams like the Ravens. At any rate, I think it's clear that His Excellency, Field Marshal, Al-Haji, Dr. Wes Welker, Life President of Wide Receivers, conqueror of the National Football League, distinguished service order of the Military Cross, Victoria Cross and Professor of Geography will just freakin' dominate this game. Might I also be the first to suggest that Randy Moss' comeback is thanks entirely to human growth hormone? How else does a broken-down receiver make such a miraculous comeback? Tainted perfect season! McCarthyism! Join the fun!

Giants Pass Offense vs. Patriots Pass Defense
Giants' Pass Off: 23rd in league, 5.5236 yards per pass, -8.6972% VOLA
Pats' Pass Def: 5th, 5.3106 yards per pass, 12.2180% VOLA
Advantage: Patriots, worth 4.0638 points

Eli Manning has been good in the playoffs, but in the regular season, he was crap. I would say go with the larger sample of data, but he played well against the Patriots the last time. The Giants can't win without a good game from Manning. The odds say that it won't happen. Momentum is nothing next to regression to the mean. If you're a mediocre quarterback, you'll eventually play like one.

Patriots Pass Rush vs. Giants Pass Protection
Pats' Sack Rate Made: 2nd in league, 8.2024% of pass plays, 35.9070% VOLA
Giants' Sack Rate Allowed: 8th, 4.9037% of pass plays, 18.7510% VOLA
Advantage: Patriots, 0.0232 points

I do not remember the Patriots' pass rush being this good, but the Giants' strong offensive line play has a good chance of neutralizing the pass rush.

Giants Pass Rush vs. Patriots Pass Protection
Giants' Sack Rate Made: 1st in league, 9.0592% of pass plays, 50.1030% VOLA
Patriots' Sack Rate Allowed: 4th, 3.4596% of pass plays, 42.6770% VOLA
Advantage: Giants, 0.0536 points

Similarly, the Pats' pass protection and Giants' pass rush cancel each other out. However, I will say that I saw Matt Light get abused in a few games against top-tier pass rushers (the Colts game comes to mind). Maybe Strahan has one last great game left in him.

Patriots 3rd Down Offense vs. Giants 3rd Down Defense
Pats' Conversion Rate Made: 2nd in league, 48.1674% conversion rate, 21.9020% VOLA
Giants' Conversion Rate Allowed: 5th, 34.5970% conversion rate, 12.4420% VOLA
Advantage: Patriots, 0.5455 points

Giants 3rd Down Offense vs. Patriots 3rd Down Defense
Giants' Conversion Rate Made: 12th in league, 41.7431% conversion rate, 5.6435% VOLA
Patriots' Conversion Rate Allowed: 4th, 33.6897% conversion rate, 14.7380% VOLA
Advantage: Patriots, 0.5241 points

The Patriots' success on third downs is a function of their passing game. Again, I think if the Giants can avoid 3rd-and-shorts, they can go toe-to-toe with the Patriots.

Patriots Interception Rate Given vs. Giants Interception Rate Taken
Pats' Interception Rate Given: 1st in league, 1.4827% of pass plays, 49.8100% VOLA
Giants' Interception Rate Taken: 26th, 2.4390% of pass plays, -17.4380% VOLA
Advantage: Patriots, 0.9639 points

Giants Interception Rate Given vs. Patriots Interception Rate Taken
Giants' Interception Rate Given: 26th, 3.5026% of pass plays, -18.5660% VOLA
Patriots' Interception Rate Taken: 8th, 3.3159% of pass plays, 12.2440% VOLA
Advantage: Patriots, 0.4416 points

Patriots Fumble Rate Given vs. Giants Fumble Rate Taken
Pats' Fumble Rate Given: 1st in league, 1.6000% of plays, 48.3080% VOLA
Giants' Fumble Rate Taken: 18th, 3.2595% of plays, 5.3054% VOLA
Advantage: Patriots, 1.0120 points

Giants Fumble Rate Given vs. Patriots Fumble Rate Taken
Giants' Fumble Rate Given: 20th in league, 3.2581% of plays, -5.2632% VOLA
Patriots' Fumble Rate Taken: 16th, 3.3333% of plays, 7.6923% VOLA
Advantage: Patriots, 0.3050 points

The Giants' defense isn't very good at creating turnovers, and their offense coughs up the ball too often. This game could very easily be a 30-point blowout on the basis of a few Giants turnovers.

Final Prediction: Patriots by 8 points (8.3633 to be exact)
This translates to about a 75% chance of winning the game. A lot of things have to go the Giants' way for them to even stay in it. That said, the '72 Dolphins did not have as many close calls as the Patriots have had. To me, this game is all about regression to the mean. Either Eli Manning regresses to the mean and the game ends 52-0 Patriots or the Patriots regress to the mean and finally lose a close game. Maybe the Patriots will experience some bad luck with injuries finally and lose somebody that matters.
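The post doesn't spell out how a predicted margin becomes a win probability, so take the following as a back-of-the-envelope check rather than the system's actual method. It assumes the real margin is normally distributed around the predicted one. The function name is mine, and the 12.4-point standard deviation is an assumption picked because it maps an 8.4-point edge to roughly 75%.

```python
import math


def win_probability(margin, sd=12.4):
    """P(favorite wins), assuming actual margin ~ Normal(predicted margin, sd).

    sd=12.4 is an assumed spread, not a value taken from the post.
    """
    # Standard normal CDF evaluated at margin / sd, written with erf.
    return 0.5 * (1.0 + math.erf(margin / (sd * math.sqrt(2.0))))


# Predicted Patriots margin from the breakdown above.
p_pats = win_probability(8.3633)
```

A predicted margin of zero gives a coin flip, and larger assumed spreads pull every probability back toward 50%.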


Sunday, January 20, 2008

Win Probabilities 2007 Conference Championships

These probabilities of victory are based on the box score stats, training on 1996-2006. Inputs include rushing and passing efficiency by yards per play, third down efficiency, sack rates, turnover rates and penalty yards. The difference between probability and result can stem from recovered fumbles, where the ball was turned over, special teams, among other things.

Home Team    Away Team    P(Home Team Won) (%)    Final Score Margin

SD@NE - The two teams were about equal in rushing efficiency. They weren't that far apart in passing efficiency. New England had 3 picks and a recovered fumble; San Diego had 2 picks and a recovered fumble. New England, however, was much better on third downs, converting 53.84%, compared to San Diego's 25% efficiency. If San Diego had converted half of their third downs, they would have had a 57% chance of winning the game.

NYG@GB - Similarly, in this game, the Packers were the more efficient team overall but converted only 1 of 10 third downs. Despite 5 fumbles by the Giants (though only one was lost), Green Bay still had less than a 1 in 3 shot of winning the game. Converting 3 third downs would have given them a 51% chance of winning the game. Without Favre's 2 interceptions, however, Green Bay would have had a 68% chance of winning.


Monday, January 14, 2008

Predictions 2007 Conference Championships

Predictions based on opponent-adjusted stats
Game        Predicted Final Score Margin    P(Home Team Wins) (%)
SD @ NE               7.3148                      72.2577
NYG @ GB              4.1436                      64.2149

Looks like another Pats-Packers Super Bowl coming up.


Win Probabilities 2007 Divisional Round

These probabilities of victory are based on the box score stats, training on 1996-2006. Inputs include rushing and passing efficiency by yards per play, third down efficiency, sack rates, turnover rates and penalty yards. The difference between probability and result can stem from recovered fumbles, where the ball was turned over, special teams, among other things.

As you can see, all of the victories were pretty decisive this week, though Sunday's games were closer than the box score stats would indicate.

Home Team    Away Team    P(Home Team Won) (%)    Final Score Margin

SD@IND - Peyton Manning played really well. 8.4 yards per pass is way above average, and San Diego's pass defense was top-10 this year to begin with. Keep in mind, however, that just because something is unlikely doesn't mean it won't happen. I would put the probability of Rivers averaging 13.6 yards per pass against a top-5 defense at next to zero. Play that game 1000 times and it might happen once. But it did happen. This game was not Manning's fault. If the interception at the end of the first hadn't happened, Indianapolis' probability of winning would have risen to only 2.2%. It made that little difference. You're just not going to win with a pass defense that porous.

NYG@DAL - Ignoring the recovered fumble boosts Dallas' win probability to 13.5%, but penalties and turnovers made a big difference. Even without those, however, Romo and the passing game just didn't perform up to snuff.


Monday, January 7, 2008

Predictions 2007 Divisional Round

Predictions based on opponent-adjusted stats
Game         Predicted Final Score Margin    P(Home Team Wins) (%)
JAX @ NE               5.2105                      65.7171
SD @ IND               7.5747                      74.7456
NYG @ DAL              6.4287                      70.6450
SEA @ GB               4.2500                      64.4811

Last week, my predictions and the system's predictions went 2 for 4.

Let's take a look at the matchups I didn't see coming.

Rush Off: NYG - 4.5940 yards per carry (3rd), DAL - 4.1799 ypc (9th)
Rush Def: NYG - 3.8020 ypc (9th), DAL - 3.9816 ypc (14th)
Pass Off: NYG - 5.5236 yards per pass (23rd), DAL - 7.3662 ypp (2nd)
Pass Def: NYG - 6.8706 ypp (8th), DAL - 5.4370 ypp (7th)

The Giants have the better running game, but the Cowboys have the better passing game, which is more important. That Eli Manning performed so well against the #2 pass defense in the wild card round is promising, and Terrell Owens' injury could severely limit Dallas' passing game. Tony Romo is not exactly a proven playoff quarterback either, but I'd expect some regression to the mean for Eli. The probability of him performing so well against two top-10 pass defenses is just really low. I'm still picking Dallas to advance.

Rush Off: SEA - 3.7837 ypc (24th), GB - 4.1160 ypc (13th)
Rush Def: SEA - 3.8957 ypc (13th), GB - 3.8732 ypc (12th)
Pass Off: SEA - 6.33233 ypp (13th), GB - 7.2752 ypp (3rd)
Pass Def: SEA - 5.7178 ypp (10th), GB - 5.9208 ypp (12th)

The Packers have a much better offense than the Seahawks, though their defenses are pretty much the same statistically. The Seahawks did plenty well against a superior pass defense in the Redskins game, and I could see an upset happening, but Brett Favre has just been too good this season. Packers win.
