Sunday, July 1, 2007

The Value of Home Field Advantage Part 33 1/3: Interconference Games

One question that's been bugging me for the last few months is why home teams did so well in 2005 and so poorly in 2006. When you consider the Saints' home games in 2005, the numbers seem even stranger. Is the variance spread across all types of matchups, or does one specific type of matchup have more inherent variance? Since 2002, when the league expanded to 32 teams, NFL scheduling has been much simpler, with 16 teams in each conference and 4 teams in each division. Now, each team in the AFC plays all four teams of one NFC division each season, rotating so that it sees each NFC division once every four years, and vice versa. So each team's interconference schedule changes drastically from year to year in terms of opponent quality and regions of the country visited (and thus climates experienced). The same holds true, though to a lesser extent, for intraconference, interdivision play, as teams still have to play at least one team from each other division within the conference every year.

So the hypothesis is as follows: The scheduling procedure implemented by the NFL since the league's expansion to 32 teams has made home field advantage less stable from year to year, increasing the variance of home team winning percentage and average result (home team points - away team points). Interconference games have the most year-to-year variance, followed by interdivisional games. Intradivisional games should have the least variance. The interconference games are largely responsible for the aberrant numbers for the 2005 and 2006 seasons.
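To make the bookkeeping behind this hypothesis concrete, here is a minimal sketch of how games could be bucketed by matchup type and summarized per season. The team lookup, the function names, and the tuple layout are all made up for illustration, and I'm assuming a population standard deviation across seasons and counting ties as non-wins for the home team; the numbers in this post were not necessarily computed exactly this way.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical lookup: team -> (conference, division). Only a few teams shown.
TEAM_DIVISION = {
    "Bills": ("AFC", "East"),
    "Dolphins": ("AFC", "East"),
    "Bengals": ("AFC", "North"),
    "Saints": ("NFC", "South"),
}


def matchup_type(home, away):
    """Classify a game as interconference, interdivision, or intradivision."""
    home_conf, home_div = TEAM_DIVISION[home]
    away_conf, away_div = TEAM_DIVISION[away]
    if home_conf != away_conf:
        return "interconference"
    if home_div != away_div:
        return "interdivision"
    return "intradivision"


def season_summary(games):
    """Map (matchup type, season) -> (home win %, average result).

    games: iterable of (season, home, away, home_pts, away_pts).
    Average result is home points minus away points, as defined above.
    """
    margins = defaultdict(list)
    for season, home, away, home_pts, away_pts in games:
        margins[(matchup_type(home, away), season)].append(home_pts - away_pts)
    summary = {}
    for key, ms in margins.items():
        # Assumption: a tie (margin 0) is not counted as a home win.
        home_win_pct = 100.0 * sum(m > 0 for m in ms) / len(ms)
        summary[key] = (home_win_pct, mean(ms))
    return summary


def across_seasons(summary, kind):
    """Return ((mean, std) of home win %, (mean, std) of avg result) for one type."""
    rows = [stats for (k, _season), stats in summary.items() if k == kind]
    win_pcts = [r[0] for r in rows]
    avg_results = [r[1] for r in rows]
    # Assumption: population standard deviation over the seasons observed.
    return (mean(win_pcts), pstdev(win_pcts)), (mean(avg_results), pstdev(avg_results))
```

Feeding `season_summary` every game from a range of seasons and then calling `across_seasons(summary, "interconference")` would give the mean/std. dev. pairs for that matchup type over that range.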

                              Interconference   Interdivision   Intradivision
Home Win % (mean/std. dev.)
Avg Result (mean/std. dev.)

As predicted, the standard deviation for home team winning percentage increased overall, but by category, only the standard deviation for interconference games increased, though it did so significantly. Before the realignment in 2002, each division had 4, 5, or 6 teams, so the number of intradivisional games played by each team was not always even. Even in 1995-1998, when each of the six divisions had five teams, interdivisional games were tougher to schedule. Of the 8 non-intradivisional games, 4 were interconference and 4 were interdivisional, meaning no team would play every team in any other division. So, for example, though the AFC East might be matched up with the AFC Central, the Bills might get an easier schedule than the Dolphins because the Bills get to play the Bengals. The scheduling might also have been tinkered with to give worse teams easier schedules, resulting in some teams not meeting each other for many years, whereas the new system guarantees that won't happen. The closer teams are in quality, the more variance one would expect in the outcome. Therefore, it makes sense that the new scheduling formula decreases variance for interdivision and intradivision games.

Nevertheless, in both time periods, variance was highest for interconference games, followed by interdivision games and then intradivision games. Surprisingly, home field advantage seems to have lost some value since the realignment. Fewer games are won by the home team, and by fewer points. With fewer teams in the division, intradivisional games might involve more parity and thus more variance in outcomes. The slight uptick in home team winning percentage for interconference games might have to do with the imbalance between the conferences. Though fewer interdivisional games are won by the home team, the average result has increased in favor of the home team by nearly a whole point. The converse is true for interconference games. In both cases, I'm not really sure why that happens with the average result. At any rate, home field advantage ain't what it used to be, so I might have to go back and rerun experiments training only on 2002 and beyond.

All Games / Interconference
Year   Games   Home Win%   Avg Result   Games   Home Win%   Avg Result
Year   Games   Home Win%   Avg Result   Games   Home Win%   Avg Result

Now, let's look at how the numbers break down by season. In 2005, although 58.98% of games were won by the home team, which is about average, the average result was very high at 3.6484. Only 1996 had a higher average result, so it's at the extremes of what's been observed before. The interconference games had an average result of 4.9219. On average, the home team won those games by nearly 5 points, which is very high, but it is still within what has been observed before: in 2000, the interconference average result was 6.48333. The average result of interdivisional games was the highest in 2005 at 3.8646, while the home field advantage in intradivisional games was slightly below average that year. In 2006, interconference games made all the difference. Only 50% of those games were won by the home team, and the average result was actually in favor of the away team at -1.5469. The numbers for the intraconference games, while well below average, did not set any record lows. Given this data, it is reasonably safe to say that the year-to-year variance in home field advantage is largely due to interconference games.

In theory, what's happening is that as interconference matchups are rotated, strong teams are getting matched up with weak opponents. So some of this variance should be predictable. In 2006, only 40.63% of home teams in interconference games had better records than their opponents in the previous season. In 2002-2005, the numbers were 46.88%, 48.44%, 43.75%, and 43.75% respectively. The correlation of this stat with the proportion of interconference games won by the home team in that year is very strong, 0.82427, though 2005 was still better than expected, given 2004. Five data points are too few to reasonably use linear regression, but we can still take a guess at how 2007 will turn out. It turns out that the 2007 stat matches 2003 at 48.44%, so expect home field advantage to return to at least normal levels in 2007.
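For anyone who wants to check a correlation like this on their own numbers, here is a minimal sketch of a Pearson correlation in plain Python. The function name is my own, the first series is the 2002-2006 "home team had the better prior-season record" percentages quoted above, and the second series is a loudly labeled placeholder, not the real home win percentages, so the output will not reproduce the 0.82427 figure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Percent of interconference home teams with the better prior-season record,
# 2002 through 2006, as quoted in the post.
better_record_pct = [46.88, 48.44, 43.75, 43.75, 40.63]

# PLACEHOLDER values only: swap in the real interconference home win
# percentages for 2002 through 2006 before reading anything into the result.
placeholder_home_win_pct = [55.0, 60.0, 52.0, 58.0, 50.0]

r = pearson(better_record_pct, placeholder_home_win_pct)
print(f"correlation with placeholder data: {r:.5f}")
```

With only five seasons, a single odd year can swing the coefficient a lot, which is the same reason a regression on these points isn't trustworthy.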

So how does a prediction system like the spread handle interconference and interdivision games? Since 2002, the spread has had 63.75%, 63.96%, and 66.25% accuracy on interconference, interdivision, and intradivision games respectively, while the home team was favored in 67.50%, 68.54%, and 65.83% of those games. The standard deviations of the percentage of favorites that were home teams are 4.65%, 2.16%, and 2.74%. So the spread does seem to be sensitive to the variance, but not sensitive enough. If similar numbers hold for my linear regression model, then perhaps better opponent adjustments are needed. Given that 75% of the season is played intraconference, I'm wondering if stats should be adjusted based on conference averages rather than league averages. I know they do similar things for baseball. It's something I'll tinker around with in the future. In short, I just traveled a long, long road for a maybe. As usual, answering one question led to several new questions popping up.
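Here is a rough sketch of how the two spread numbers above (accuracy and how often the home team is favored) could be tallied for one category of games. The function name and tuple layout are hypothetical. I'm assuming the usual betting convention that a negative home spread means the home team is favored, that "accuracy" means the favorite won outright, that pick'em games are skipped, and that a tied game counts as a miss.

```python
def spread_report(games):
    """Summarize how the spread did on a set of games.

    games: iterable of (home_spread, home_pts, away_pts), where a negative
    home_spread means the home team is favored (assumed convention).
    Returns a dict with the percent of games the favorite won outright and
    the percent of games in which the home team was the favorite.
    """
    counted = 0
    favorite_wins = 0
    home_favored = 0
    for home_spread, home_pts, away_pts in games:
        if home_spread == 0:
            continue  # pick'em: no favorite, so skip (assumption)
        counted += 1
        home_is_favorite = home_spread < 0
        if home_is_favorite:
            home_favored += 1
            favorite_won = home_pts > away_pts
        else:
            favorite_won = away_pts > home_pts
        if favorite_won:
            favorite_wins += 1
    if counted == 0:
        raise ValueError("no games with a favorite")
    return {
        "accuracy_pct": 100.0 * favorite_wins / counted,
        "home_favored_pct": 100.0 * home_favored / counted,
    }
```

Running this once per season and per matchup type, then taking the standard deviation of `home_favored_pct` across seasons, would give the kind of spread-sensitivity numbers quoted above.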
