Friday, July 13, 2007

The Value of Home-Field Advantage 4.0 - Performance Enhancement

The regression models tend to put larger coefficients on the home team's performance variables than on the away team's. This contributes to the models' overly strong home-team bias. But because the regression is fit by minimizing training error, you can't just tinker with the coefficients by hand and expect better results. More mathematically sound ways of dampening the weights (e.g. ridge regression) haven't really improved accuracy either. Given that, I tried to fix the bias in the coefficients by fixing the bias in the data.

If the home team is winning 58% of all games, then it's reasonable to assume that being at home influences a team's offensive, defensive and special teams efficiencies. But when you judge a team's performance mid-season, it has not necessarily played as many home games as away games. So how much bias would that cause? Does adjusting performance for home-field advantage create more informative stats?

The table below shows the average of year-end league averages in several metrics split according to performance at home and on the road.

Avg. Performance of League, 1996-2006

Metric                  Home       Away
Run Eff.                4.103      3.9956
Pass Eff.               6.0151     5.7526
Sack Rate Allowed       6.5962%    6.9897%
Punt Ret.               9.5288     9.1585
Kick Ret.               22.084     21.479
3rd Down Conv.          38.656%    37.005%
Pen. 1st Downs Given    1.667      1.5091
Pen. Yds. Given         52.257     55.936
Int. Rate               2.8441%    3.1675%
Fum. Rate               3.1435%    3.1924%

As expected, home teams are consistently better, but the scale of those differences is small. The difference in run efficiency amounts to about 3 extra yards for every 30 attempts. For pass efficiency, it amounts to an extra 8 yards for every 30 attempts. The difference in interception rates doesn't even amount to a tenth of an interception for every 30 pass attempts. Yet these differences are somehow worth 2.7857 points to the home team and a 58.491%/41.509% split of games.
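To make the scale of those gaps concrete, here is a small sketch that turns the home and road averages from the table into a difference per 30 attempts. It assumes the efficiency stats are yards per attempt and the interception rate is per pass attempt. The dictionary keys and the gap_per_n helper are made up for illustration.

```python
# League-wide (home, road) averages copied from the table above.
# Assumption: efficiencies are yards per attempt, int. rate is per pass attempt.
HOME_AWAY = {
    "run_eff": (4.103, 3.9956),
    "pass_eff": (6.0151, 5.7526),
    "int_rate": (0.028441, 0.031675),
}


def gap_per_n(home, away, n=30):
    """Absolute home/road difference accumulated over n attempts."""
    return abs(home - away) * n


gaps = {name: gap_per_n(h, a) for name, (h, a) in HOME_AWAY.items()}
for name, gap in gaps.items():
    print(f"{name}: {gap:.3f} per 30 attempts")
```

That works out to roughly 3.2 rushing yards, 7.9 passing yards and 0.097 interceptions per 30 attempts.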

To adjust stats for home-field advantage, I used the same method as for opponent adjustments, except that instead of scaling by the ratio of league average to opponent average, I scale by the ratio of league average to home/away average. In terms of season win totals, the adjustment increased the model's R² (1996-2006 data) from 0.728 to 0.764, so adjusting for home-field advantage does make the stats more informative. Unfortunately, it does not significantly alleviate the problem of the regression models classifying too many games as home-team wins: the share drops by only 1-2% and remains above 70%.
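As a rough sketch of that adjustment, the snippet below scales each game's stat by league average over the home or away average, using the pass efficiency numbers from the table. Several things here are assumptions for illustration rather than the exact procedure behind the numbers above: the overall league average is taken as the simple mean of the home and away averages, games are averaged with equal weight, and the function names and the two 8-game schedules are invented.

```python
# League pass efficiency averages from the table above.
HOME_AVG = 6.0151
AWAY_AVG = 5.7526
# Assumption: overall league average is the simple mean of the two splits.
LEAGUE_AVG = (HOME_AVG + AWAY_AVG) / 2


def adjust_for_venue(raw, at_home):
    """Scale a single-game stat by league average / venue average."""
    venue_avg = HOME_AVG if at_home else AWAY_AVG
    return raw * LEAGUE_AVG / venue_avg


def season_average(games, adjust=False):
    """Average (stat, at_home) pairs, optionally venue-adjusted."""
    values = [adjust_for_venue(s, h) if adjust else s for s, h in games]
    return sum(values) / len(values)


# Hypothetical mid-season schedules for a team that performs exactly at the
# league's home and road averages: one home-heavy, one road-heavy.
home_heavy = [(HOME_AVG, True)] * 5 + [(AWAY_AVG, False)] * 3
road_heavy = [(HOME_AVG, True)] * 3 + [(AWAY_AVG, False)] * 5

raw_gap = season_average(home_heavy) - season_average(road_heavy)
adj_gap = (season_average(home_heavy, adjust=True)
           - season_average(road_heavy, adjust=True))
print(f"raw gap: {raw_gap:.4f}, adjusted gap: {adj_gap:.4f}")
```

In this toy case the same team looks about 0.066 yards per attempt better with the home-heavy schedule before adjustment, and the gap disappears once each game is scaled by its venue ratio.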
