Football Prediction Network: The Value of Home Field Advantage Part I

So let's go back to basics. Really basic stuff.

How much is home field advantage actually worth? It's a pretty straightforward question, but there are several ways to tackle it. From 1994-2006, home field advantage was worth about 2.6362 points on average, but the standard deviation of the results was about 14.0780 points. 69.91% of the games fell within one standard deviation of the mean; in other words, those games were within about 2 touchdowns either way of the average result. As a side note, the outcomes are roughly in line with a normal distribution, which suggests that linear regression is a reasonable way to try predicting future outcomes. So home field advantage matters, on average, to a small extent. Home teams win 58.81% of games. Between two teams very close in talent, go for the home team, but if one team is clearly better than the other, you're better off picking the better team. That's pretty much the conventional wisdom, isn't it?
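
For anyone who wants to reproduce these summary numbers on their own data, here is a minimal sketch in Python. It assumes you have a list of home-minus-away point margins; the function name `home_field_summary` and the sample margins are made up for illustration and are not the 1994-2006 data. Ties are counted as non-wins for the home team.

```python
from statistics import mean, pstdev


def home_field_summary(margins):
    """Summarize a list of home-minus-away point margins."""
    avg = mean(margins)
    sd = pstdev(margins)  # population standard deviation
    n = len(margins)
    # Share of games whose margin lies within one standard deviation of the mean
    within = sum(1 for m in margins if abs(m - avg) <= sd) / n
    # Share of games the home team won outright (a tie is not a win)
    home_wins = sum(1 for m in margins if m > 0) / n
    return {
        "mean": avg,
        "std": sd,
        "within_one_sd": within,
        "home_win_rate": home_wins,
    }


# Made-up margins for illustration only (not real game results)
sample_margins = [3, -7, 10, 14, -3, 21, -17, 6, 1, -10]
summary = home_field_summary(sample_margins)
print(summary)
```

For a truly normal distribution, about 68% of games would land within one standard deviation, so comparing `within_one_sd` against that figure is a quick sanity check on the normality assumption.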

So how does a predictor like the spread do in terms of valuing home field advantage? The average spread from 1998-2006 was -2.5346, very close in size to the actual average. The standard deviation, however, is only 5.7783. The extreme outcomes (games decided by 20+ points) have an 18.515% chance of occurring. There's very little incentive statistically to predict such large wins, though the outcome is more frequent than is perhaps expected. In the original research, I ran a few experiments to classify games as big wins or close wins, and for which team, and the more classes I introduced, the worse classification accuracy became. For the 4-class problem, 28-32% accuracy was the best I could do. The prediction systems play the odds and thus have a tighter range of margins than the actual outcomes. Tightening the bounds of the actual range within the training data does not help the accuracy of the prediction systems I've implemented.
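
To make the 4-class setup concrete, here is a rough sketch of how games could be labeled and scored. The 20-point cutoff is borrowed from the "extreme outcomes" figure above and is only an assumption; the class names, the `label_game` and `accuracy` helpers, and the handling of ties are hypothetical, not the setup from the original experiments.

```python
def label_game(home_margin, big_win_threshold=20):
    """Label a game by winner and size of win, from the home-minus-away margin.

    big_win_threshold is an assumed cutoff, not one taken from the research.
    """
    if home_margin == 0:
        return "tie"  # rare; kept out of the four main classes
    side = "home" if home_margin > 0 else "away"
    size = "big" if abs(home_margin) >= big_win_threshold else "close"
    return f"{side}_{size}"


def accuracy(predicted, actual):
    """Fraction of games where the predicted label matches the actual label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Made-up margins for illustration only
actual_labels = [label_game(m) for m in (24, -3, 7, -20)]
predicted_labels = ["home_close", "away_close", "home_close", "away_close"]
print(actual_labels, accuracy(predicted_labels, actual_labels))
```

A predictor that always says "close win" will look fine on most games and miss every blowout, which is one way to see why adding classes drags accuracy down.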

What's interesting is that the value of home field advantage fluctuates a fair deal from year to year. It reached a peak in 2005 and a deep, deep valley in 2006, which caused the accuracy of the spread and my prediction systems to fluctuate similarly, particularly in 2006. The tables below lay out all the specific numbers.

	Average Actual Result	Average Spread*	Proportion of games won by home team	Proportion of games home team was favorite
OVERALL**	2.6362	-2.5346	58.51%	66.859%
1998	3.5042	-2.4479	62.917%	67.083%
1999	3.0645	-2.3911	59.677%	64.516%
2000***	2.8226	-2.6149	57.447%	68.511%
2001	2.0444	-2.2641	55.645%	65.323%
2002	2.2461	-2.2637	57.813%	64.844%
2003	3.5313	-2.5586	61.328%	70.313%
2004	2.5078	-2.5371	56.641%	66.406%
2005	3.6484	-2.6309	58.984%	67.969%
2006	0.84766	-2.8027	53.125%	66.797%

	Average Actual Margin of Victory	Average Margin of Victory Predicted by Spread	Proportion of games won by favorite	Proportion of games in which favorite beat the spread
OVERALL**	11.471	5.3498	65.352%	48.263%
1998	11.504	5.7313	70.00%	52.917%
1999	11.355	5.5565	65.726%	50.403%
2000***	11.798	5.8234	64.682%	45.957%
2001	11.077	5.1593	65.323%	49.194%
2002	11.105	4.9355	62.109%	49.219%
2003	11.914	4.9792	67.13%	51.389%
2004	11.367	5.127	62.891%	44.922%
2005	11.688	5.416	72.656%	48.047%
2006	11.426	5.5215	58.594%	46.094%

* Spread is negative when home team is favored.
** Spread covers 1998-2006, but the averages for actual outcomes are from 1994-2006.
*** Spreads for week 4 of 2000 could not be found and are not included in the 2000 spread stats.
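
The favorite columns in the second table can be computed along these lines. This is a sketch under the sign convention in the first footnote (spread is negative when the home team is favored); the function name `favorite_results`, the choice to skip pick'em games, and the sample games are assumptions for illustration, not real data.

```python
def favorite_results(games):
    """Return (share of games won by favorite, share where favorite beat the spread).

    games: iterable of (home_margin, spread) pairs, where home_margin is
    home score minus away score and spread < 0 means the home team is favored.
    Games with a spread of 0 have no favorite and are skipped.
    """
    favored = [(m, s) for m, s in games if s != 0]
    fav_wins = 0
    fav_covers = 0
    for margin, spread in favored:
        # Margin of victory from the favorite's point of view
        fav_margin = margin if spread < 0 else -margin
        if fav_margin > 0:
            fav_wins += 1
        # The favorite beats the spread only by winning by more than the line
        if fav_margin > abs(spread):
            fav_covers += 1
    n = len(favored)
    return fav_wins / n, fav_covers / n


# Made-up games for illustration only: (home_margin, spread)
sample_games = [(7, -3), (1, -3), (-4, -6.5), (-10, 2.5), (3, 4)]
win_rate, cover_rate = favorite_results(sample_games)
print(win_rate, cover_rate)
```

Note that a favorite landing exactly on the line (a push) is counted here as not beating the spread.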

Curiously, there's a correlation with the predictive performance of Football Outsiders' DVOA stats as well. In 2005, two-thirds of games were won by the team with the higher DVOA. In 2006, that number fell to 55.80%. Without more years of data, it's hard to say if this is just some natural aberration. But there was something unusual about 2006. Was it rule changes, a change in how rules are enforced, a change in stadiums or playing fields? Is there any way to account for the natural variance from year to year? For prediction systems like linear regression, one could alter the bias coefficient to reduce the bias towards home teams, but there's no guarantee that it'll improve accuracy.
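
Here is one way the bias-coefficient idea could look in practice. This is a sketch, not the prediction system described in this post: it assumes a single hypothetical feature, `rating_diff` (home team rating minus away team rating), so that the fitted intercept plays the role of the home field bias. The `home_scale` parameter and the sample numbers are made up for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_margin(slope, intercept, rating_diff, home_scale=1.0):
    """Predict the home margin, optionally shrinking the home field bias.

    home_scale = 1.0 keeps the fitted bias; values below 1.0 reduce it.
    """
    return slope * rating_diff + home_scale * intercept


# Made-up training data for illustration only
rating_diff = [-4, -2, 0, 2, 4]   # hypothetical home-minus-away rating
home_margin = [-5, -1, 3, 7, 11]  # home-minus-away points

slope, intercept = fit_line(rating_diff, home_margin)
full_bias = predict_margin(slope, intercept, 1)
half_bias = predict_margin(slope, intercept, 1, home_scale=0.5)
print(slope, intercept, full_bias, half_bias)
```

Whether any particular `home_scale` helps would have to be checked on held-out seasons; shrinking the bias after a year like 2006 could just as easily hurt in a year like 2005.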

In what ways are the spread and other prediction systems being inefficient in dealing with home-field advantage? One obvious place to start is the weather.

Saturday, June 16, 2007
