Monday, February 11, 2008

Accuracy of 2007 Preseason Predictions

For the next couple of weeks, I'm going to be auditing the various statistics and systems I use to see what works and what doesn't. First up are the win total projections, made by plugging preseason stats into a linear regression system trained on regular season stats and win totals. The offensive stats used were those for projected starters only, and the defensive stats were total defensive stats. Only rushing and passing yards per play stats were used.
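To make the setup concrete, here is a minimal sketch of that kind of system: fit an ordinary least-squares model on regular season stats and wins, then feed it preseason stats. The function names, the four-column layout, and every number are placeholders for illustration, not the actual system or real team data.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: stats are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_win_model(stats, wins):
    """Least-squares fit of wins on stats plus an intercept (normal equations)."""
    rows = [[1.0] + list(map(float, s)) for s in stats]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * w for r, w in zip(rows, wins)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def project_wins(coef, stats):
    """Apply fitted coefficients (intercept first) to new stat rows."""
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], s)) for s in stats]


# Placeholder numbers, NOT real team data.
# Assumed columns: off. rush ypp, off. pass ypp, def. rush ypp, def. pass ypp
regular_season_stats = [
    [4.0, 6.0, 4.0, 6.0],
    [5.0, 6.0, 4.0, 6.0],
    [4.0, 7.0, 4.0, 6.0],
    [4.0, 6.0, 5.0, 6.0],
    [4.0, 6.0, 4.0, 7.0],
    [4.5, 6.5, 3.5, 5.5],
]
regular_season_wins = [8, 9, 11, 7, 5, 12]

coef = fit_win_model(regular_season_stats, regular_season_wins)
preseason_stats = [[4.4, 6.9, 3.9, 5.8]]  # starters-only offense, total defense
print(project_wins(coef, preseason_stats))
```

The key design point is that the model is trained only on regular season data; preseason numbers enter only at prediction time.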

Across the board, the preseason predictions were OK as far as preseason predictions go, which is to say not very good. The mean absolute error was 2.66 games, roughly the difference between 7-9 and 10-6. The system called Miami's 1-15 season but was 7+ games short of New England's perfect regular season. Randy Moss, however, did not play in preseason. Hmm... I wonder what that bodes for next season...

The correlation coefficient looks a little better at 0.40641, so the preseason win projections matched up moderately well with actual win totals. The system predicted good seasons for Cleveland, Tennessee, and Washington. However, it also predicted New Orleans to be the best team in the league this season at 14-2. Again, as far as these things go, it was normal, with a mix of dead-on predictions and mile-wide misses. Looking at it from a divisional, rather than a league-wide, perspective shows pretty much the same thing.
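For anyone who wants to run the same audit on their own projections, the two summary numbers quoted above (mean absolute error and the Pearson correlation coefficient) can be computed as in the sketch below. The function names and the sample lists are made up for illustration; they are not the 2007 projections.

```python
import math


def mean_abs_error(projected, actual):
    """Average of |projected - actual| over all teams."""
    return sum(abs(p - a) for p, a in zip(projected, actual)) / len(actual)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical projected and actual win totals for four teams.
projected_wins = [10.2, 6.5, 8.8, 4.1]
actual_wins = [11, 4, 9, 7]

print(mean_abs_error(projected_wins, actual_wins))
print(pearson_r(projected_wins, actual_wins))
```

The two measures answer different questions: the error says how far off the win totals were, while the correlation only says whether better projections went with better records.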

Team    Expected Wins    Actual Wins
AFC East
AFC North
AFC South
AFC West
NFC East
NFC North
NFC South
NFC West

For 4 of the 8 divisions (the AFC East, North, and South, and the NFC West), the projected win totals had a strong correlation with regular season standings. The worst miss was the NFC North, where the order was essentially reversed. With the exception of Oakland, the AFC West predictions were also good, although the division's overall correlation was only moderate. The NFC East and South predictions were essentially a wash, with weak correlations. So 3 of the 8 divisions showed no useful positive correlation. The p-values, however, show that the correlation coefficients have a higher-than-desired probability of being achievable with random data. The small sample size (n=4) for each division plays a large role in this.

Division     Corr. Coef.   P-value   Mean Abs. Err.
AFC East        0.8070      0.1930       3.1516
AFC North       0.7424      0.2576       2.0331
AFC South       0.7025      0.2975       1.4520
AFC West        0.4754      0.5246       2.0079
NFC East       -0.0104      0.9896       3.1886
NFC North      -0.9553      0.0447       3.3724
NFC South      -0.2303      0.7697       4.5139
NFC West        0.8829      0.1171       1.5770
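To see why n=4 is so punishing, note that the p-values in the table are consistent with the standard two-sided t-test for a Pearson correlation, which uses n - 2 degrees of freedom. With only 2 degrees of freedom the Student t distribution has a closed-form CDF, and the test collapses to p = 1 - |r|. The sketch below shows this; the function name is made up for illustration, and this is a reading of the numbers, not necessarily how they were originally computed.

```python
import math


def corr_p_value_n4(r):
    """Two-sided p-value for a Pearson r computed from n = 4 points."""
    if abs(r) >= 1.0:
        return 0.0  # perfect correlation: the t statistic is infinite
    n = 4
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    # Student t CDF with 2 degrees of freedom, in closed form.
    cdf = 0.5 + t / (2.0 * math.sqrt(2.0 + t * t))
    return 2.0 * min(cdf, 1.0 - cdf)


# Correlation coefficients taken from the table above.
for division, r in [("AFC East", 0.8070), ("NFC North", -0.9553)]:
    print(division, round(corr_p_value_n4(r), 4), round(1 - abs(r), 4))
```

In other words, with four teams per division a correlation has to exceed 0.95 in magnitude to clear the usual 0.05 threshold, which is why only the NFC North's reversed order does.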

I think the lesson to take away from this, however, is that preseason does have some meaning. Most studies into the meaning of preseason have looked at win-loss records, which any football researcher would know is not the most accurate reflection of team ability, especially with a sample size of four. Skill is skill and should be visible even when the effort isn't 100%. Filtering out the play of benchwarmers from the statistics is important, and with some more development, it looks like reasonably good predictions can be made with preseason stats.


Sunday, February 3, 2008

Super Bowl XLII Win Probability

Home Team    Away Team    P(Home Team Won) (%)    Final Score Margin

Despite the system's home team bias, which does not apply at a neutral site, the Giants came out with the higher win probability. Their two fumbles (both recovered) and one interception weren't enough to cancel out Brady's fumble and the Patriots' poor performance in general.

As I pointed out in my preview, the two X factors that could play in the Giants' favor were the running game and pass protection. Both did play in their favor. The Giants gained 3.5 yards per carry compared to the Patriots' 2.8 ypc. And the Giants generated a ton of pressure on Brady the entire night, leading to 5 sacks. Eli also continued his hot streak, averaging 6.7 yards per pass, compared to 4.3 for Brady.

As a Dolfan, I'm ecstatic. As a blogger on predicting football games, I take pride in my ideas coming to fruition.
