The Value of Home Field Advantage Part I
So let's go back to basics. Really basic stuff.
How much is home field advantage actually worth? It's a straightforward question, but there are several ways to tackle it. From 1994-2006, home field advantage was worth about 2.6362 points on average, but the standard deviation of the results was about 14.0780 points. 69.91% of the games fell within one standard deviation of the mean; in other words, those games were within 2 touchdowns either way of the average result. As a side note, the outcomes fall roughly in line with a normal distribution (where about 68% of values lie within one standard deviation), which makes linear regression a reasonable way to try predicting future outcomes. So home field advantage matters to a small extent on average. Home teams win 58.81% of games. Between two teams very close in talent, go for the home team, but if one team is clearly better than the other, you're better off picking the better team. That's pretty much the conventional wisdom, isn't it?
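For anyone who wants to reproduce these summary numbers on their own data, here is a minimal sketch. It assumes each game is reduced to a single home-minus-away margin; the function name and the toy margins are made up for illustration and are not the 1994-2006 data.

```python
from statistics import mean, pstdev

def home_field_summary(margins):
    """Summarize a list of home-minus-away point margins."""
    avg = mean(margins)
    sd = pstdev(margins)  # population standard deviation
    n = len(margins)
    # Share of games whose margin lies within one standard deviation of the mean.
    within_one_sd = sum(1 for m in margins if abs(m - avg) <= sd) / n
    # Share of games the home team won (ties count as non-wins here).
    home_win_rate = sum(1 for m in margins if m > 0) / n
    return {
        "mean": avg,
        "sd": sd,
        "within_one_sd": within_one_sd,
        "home_win_rate": home_win_rate,
    }

# Toy margins for illustration only.
toy_margins = [3, -7, 14, 10, -3, 21, -17, 6, 1, -10]
summary = home_field_summary(toy_margins)
print(summary)
```

Whether to use the population or the sample standard deviation is a judgment call; with a few thousand games the two are nearly identical.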
So how does a predictor like the spread do in terms of valuing home field advantage? The average spread from 1998-2006 was -2.5346 (the spread is negative when the home team is favored), very close in size to the actual average. The standard deviation, however, is only 5.7783. Extreme outcomes (games decided by 20+ points) occur 18.515% of the time. There's very little statistical incentive to predict such large wins, even though they happen more often than one might expect. In the original research, I ran a few experiments to classify games as big wins or close wins for each team, and the more classes I introduced, the worse classification accuracy became. For the 4-class problem, 28-32% accuracy was the best I could do. The prediction systems play the odds and thus have a tighter range of margins than the actual outcomes. Tightening the bounds of the actual range within the training data does not help the accuracy of the prediction systems I've implemented.
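To make the 4-class setup concrete, here is one way the labels could be built. This is only a sketch: the 20-point cutoff is borrowed from the "20+ points" definition of an extreme outcome and is an assumption, since the cutoff used in the original experiments isn't stated, and the class names and toy margins are made up.

```python
from collections import Counter

BIG_WIN = 20  # assumed cutoff in points; the original experiments may have used another

def classify(margin, big=BIG_WIN):
    """Map a home-minus-away margin to one of four classes."""
    if margin >= big:
        return "home_big"
    if margin > 0:
        return "home_close"
    if margin > -big:
        return "away_close"  # a tie (margin 0) lands here, an arbitrary choice
    return "away_big"

def blowout_rate(margins, big=BIG_WIN):
    """Share of games decided by `big` or more points, for either team."""
    return sum(1 for m in margins if abs(m) >= big) / len(margins)

# Toy margins for illustration only.
toy_margins = [3, -7, 14, 10, -3, 21, -17, 6, 1, -10, -24, 27]
class_counts = Counter(classify(m) for m in toy_margins)
toy_blowout_rate = blowout_rate(toy_margins)
print(class_counts, toy_blowout_rate)
```

For context, guessing uniformly among four classes would be right 25% of the time, although the classes here are not equally common, so always picking the most frequent class is the more honest baseline.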
What's interesting to note is that the value of home field advantage fluctuates a fair deal from year to year, but reached a peak in 2005 and a deep, deep valley in 2006, which caused the accuracy of the spread and my prediction systems to similarly fluctuate, particularly in 2006. The charts below lay out all the specific numbers.

Season      Avg. actual result (pts)   Avg. spread* (pts)   Games won by home team   Games with home team favored
OVERALL**   2.6362                     -2.5346              58.51%                   66.859%
1998        3.5042                     -2.4479              62.917%                  67.083%
1999        3.0645                     -2.3911              59.677%                  64.516%
2000***     2.8226                     -2.6149              57.447%                  68.511%
2001        2.0444                     -2.2641              55.645%                  65.323%
2002        2.2461                     -2.2637              57.813%                  64.844%
2003        3.5313                     -2.5586              61.328%                  70.313%
2004        2.5078                     -2.5371              56.641%                  66.406%
2005        3.6484                     -2.6309              58.984%                  67.969%
2006        0.84766                    -2.8027              53.125%                  66.797%

Season      Avg. actual margin of victory (pts)   Avg. margin predicted by spread (pts)   Games won by favorite   Games in which favorite beat the spread
OVERALL**   11.471                                5.3498                                  65.352%                 48.263%
1998        11.504                                5.7313                                  70.00%                  52.917%
1999        11.355                                5.5565                                  65.726%                 50.403%
2000***     11.798                                5.8234                                  64.682%                 45.957%
2001        11.077                                5.1593                                  65.323%                 49.194%
2002        11.105                                4.9355                                  62.109%                 49.219%
2003        11.914                                4.9792                                  67.13%                  51.389%
2004        11.367                                5.127                                   62.891%                 44.922%
2005        11.688                                5.416                                   72.656%                 48.047%
2006        11.426                                5.5215                                  58.594%                 46.094%
* Spread is negative when home team is favored.
** Spread covers 1998-2006, but the averages for actual outcomes are from 1994-2006.
*** Spreads for week 4 of 2000 could not be found and are not included in the 2000 spread stats.
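As a rough check on how closely the spread's accuracy tracks the size of home field advantage, one can correlate two of the per-season columns above: the average actual result and the proportion of games won by the favorite. The sketch below uses a hand-rolled Pearson correlation so it needs nothing beyond the standard library. With only nine seasons (and week 4 of 2000 missing from the spread data), the result is suggestive at best.

```python
from math import sqrt

# Per-season figures copied from the season rows above (1998-2006).
avg_home_margin = [3.5042, 3.0645, 2.8226, 2.0444, 2.2461,
                   3.5313, 2.5078, 3.6484, 0.84766]
favorite_win_pct = [70.00, 65.726, 64.682, 65.323, 62.109,
                    67.13, 62.891, 72.656, 58.594]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

r = pearson(avg_home_margin, favorite_win_pct)
print(round(r, 3))
```

A strongly positive value here is consistent with the idea that seasons with a weaker home edge are also seasons where favorites (who are mostly home teams) win less often.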
Curiously, there's a correlation with the predictive performance of Football Outsiders' DVOA stats as well. In 2005, two-thirds of games were won by the team with the higher DVOA. In 2006, that number fell to 55.80%. Without more years of data, it's hard to say whether this is just a natural aberration. But there was something unusual about 2006. Was it rule changes, a change in how rules are enforced, a change in stadiums or playing fields? Is there any way to account for the natural variance from year to year? For prediction systems like linear regression, one could alter the bias coefficient (the intercept) to reduce the bias towards home teams, but there's no guarantee that it'll improve accuracy.
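Here is a small sketch of what altering the bias coefficient could look like. It is not the prediction system used in the original research. It assumes a one-feature model where the home margin is predicted from a hypothetical `rating_diff` (home rating minus away rating), so the fitted intercept plays the role of home field advantage. The data and function names are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

def pick_accuracy(intercept, slope, xs, ys):
    """Share of games where the predicted winner matches the actual winner."""
    hits = sum(
        1 for x, y in zip(xs, ys)
        if (intercept + slope * x > 0) == (y > 0)
    )
    return hits / len(xs)

# Toy data: rating_diff is a hypothetical feature, margin is home minus away.
rating_diff = [-2, -1, 0, 1, 2]
margin = [-6, -3, 4, 7, 13]

intercept, slope = fit_line(rating_diff, margin)

# Shrink the home-field term by different factors and compare pick accuracy.
for factor in (1.0, 0.5, 0.0):
    acc = pick_accuracy(intercept * factor, slope, rating_diff, margin)
    print(factor, acc)
```

On this toy sample, removing the intercept entirely flips the evenly matched game to a wrong pick, which illustrates the point in miniature: shrinking the home-field term only helps if the season being predicted really does have a weaker home edge.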
In what ways are the spread and other prediction systems being inefficient in dealing with home-field advantage? One obvious place to start is the weather.