Wednesday, January 11, 2012

Fair-enheit 27-9

Warning: nuclear-grade nerddom follows.

People in 49 of the 50 United States, as well as the District of Columbia and Puerto Rico, seem to be grousing about the "unfairness" of the BCS system this week. Fortunately for everyone, my multiple graduate-level courses in probability & statistics at Michigan taught me how to design a perfectly fair college football season. It's really easy! First, you make an infinite number of clones of all the players, and then have each set of clones play a 120-team round robin. Once everyone's done, we figure out which team had the largest average number of wins, and there ya go, there's your national champion! If two teams end up with the exact same average number of wins (an event that can occur, but with probability zero; that's the kind of silliness you deal with in graduate courses in probability), we can whip up two new batches of clones and have them play a championship game.

Of course this can't happen. If I could produce an infinite army of clones, they'd be divided between securing my benevolent dictatorship and playing far deadlier games than football for my amusement. But this is how probability works. Axiomatic probability theory is based on the understanding that if we play the same game multiple times, we'll end up with different outcomes, so we should figure out what would happen if we played the game an infinite number of times. The fact that this is the basic approach to the subject is why explaining probability and statistics to the Paul Finebaums and Michael Weinrebs of the world is a fool's errand. Don't even bother trying to explain it to people who can't even count to 85.

In theoretical computer science, there is a formal definition of fairness. Roughly, it says that if you visit a state x infinitely often, and an action a can occur from the state x, then a will occur infinitely often. In real life, we don't get to work with infinite amounts of time and space, so we can't design a system that's formally fair. Since we only get 12-14 games to work with, we'll have to make some compromises to be reasonably fair, like Tony Hoare suggested back in 1978.
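
For anyone who wants that condition spelled out, here is one way to write it. This is only a sketch: the symbols $s_t$ (the state at step $t$) and $\alpha_t$ (the action taken at step $t$) are notation introduced here for illustration, and the formula assumes $a$ is available whenever the system is in state $x$.

```latex
\[
\bigl(\forall n\ \exists t \ge n:\ s_t = x\bigr)
\;\Longrightarrow\;
\bigl(\forall n\ \exists t \ge n:\ s_t = x \,\wedge\, \alpha_t = a\bigr)
\]
```

In words: if the run is in state $x$ at arbitrarily late times, then it also takes action $a$ from $x$ at arbitrarily late times. Both sides talk about infinitely many steps, which is exactly why a 12-14 game season can't satisfy it.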

Tony Hoare. Sounds like he should be a wide receiver for the Steelers. Looks like the old British dude he is.

In my thought experiment, I don't have to put effort into designing a fair playoff system because my regular season is infinite in scope. In real life, we have to compromise between a finite regular season, which provides a lot of information about who the best teams are, and a playoff, which settles any outstanding questions. The constraint is that we have to decide on the playoff format before the season starts, so the question is: how many teams should be in a fair playoff system?

I believe it's reasonably fair (in an informal sense) that a team be excluded from the playoffs if they have demonstrated over the course of the season that they are almost certainly not the best team in the country. So we need to estimate the number of teams that could reasonably be the best after a 12-game season.

There's a straightforward way to do this, if you have the clone armies. First you have the clone armies play their seasons and then rank them from #1 to #120. Then you have the clones participate in playoffs of different sizes, and see how often the #1 ranked team wins, how often the #2 ranked team wins, and so on.
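
Without a cloning lab, the closest substitute is a Monte Carlo simulation. The sketch below is a toy version of that experiment, and nearly everything in it is an assumption rather than real football data: team strengths are drawn from a normal distribution, a single game is decided by a logistic function of the strength gap (`win_prob`), the 12-game schedule pairs teams at random each week, and teams are ranked by wins with ties broken at random. It counts how often each seed wins a single-elimination playoff of a given size.

```python
import math
import random


def win_prob(a, b):
    """Chance a team of strength a beats one of strength b (logistic model, an assumption)."""
    return 1.0 / (1.0 + math.exp(b - a))


def play_season(strengths, games, rng):
    """Pair teams at random each week (even team count assumed); return win totals."""
    n = len(strengths)
    wins = [0] * n
    order = list(range(n))
    for _ in range(games):
        rng.shuffle(order)
        for i in range(0, n - 1, 2):
            a, b = order[i], order[i + 1]
            winner = a if rng.random() < win_prob(strengths[a], strengths[b]) else b
            wins[winner] += 1
    return wins


def play_bracket(field, strengths, rng):
    """Single elimination; field lists team ids best seed first, top plays bottom."""
    field = list(field)
    while len(field) > 1:
        survivors = []
        for i in range(len(field) // 2):
            a, b = field[i], field[-1 - i]
            survivors.append(a if rng.random() < win_prob(strengths[a], strengths[b]) else b)
        field = survivors
    return field[0]


def champion_seed_counts(playoff_size, trials, n_teams=120, games=12, seed=0):
    """Count how often each seed (index 0 is the #1 seed) wins the playoff."""
    if playoff_size < 2 or playoff_size & (playoff_size - 1):
        raise ValueError("playoff_size must be a power of two")
    rng = random.Random(seed)
    counts = [0] * playoff_size
    for _ in range(trials):
        strengths = [rng.gauss(0.0, 1.0) for _ in range(n_teams)]
        wins = play_season(strengths, games, rng)
        # Rank by wins, breaking ties at random.
        ranked = sorted(range(n_teams), key=lambda t: (wins[t], rng.random()), reverse=True)
        field = ranked[:playoff_size]
        champ = play_bracket(field, strengths, rng)
        counts[field.index(champ)] += 1
    return counts


if __name__ == "__main__":
    trials = 300
    for size in (2, 4, 8, 16):
        counts = champion_seed_counts(size, trials)
        print(size, [round(c / trials, 2) for c in counts])
```

The numbers depend entirely on the assumed spread of team strengths, so treat the output as a shape to compare across playoff sizes, not as an estimate for real FBS teams.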

Now here's the twist. We don't have clone armies, but we have a clone of FBS that we call FCS. They played a 20-team playoff this year, and they've played 16-team playoffs in prior years, so we can use the data from FCS to estimate how highly a team needs to be ranked to have a reasonable chance of winning the title. Here are the 11 most recent FCS champions, and their seeds going into the playoffs:


Year  Team                Seed
2011  North Dakota State  2
2010  Eastern Washington  5
2009  Villanova           2
2008  Richmond            7
2007  Appalachian State   7
2006  Appalachian State   1
2005  Appalachian State   2
2004  James Madison       7
2003  Delaware            2
2002  Western Kentucky    15
2001  Montana             1

So, ten years out of eleven, the winner comes from the top seven seeds. In 2002, Western Kentucky got lucky and went on a run, upsetting #1 McNeese State in the championship. Based on eleven years of lab experiments in Chattanooga, TN and Frisco, TX, I think we can conclude that a playoff should include at least seven teams. There's no reason for seven other than, "the data says it's quite possible the team ranked #7 may actually win the playoffs."
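
The tally behind that claim is easy to reproduce from the table above. The sketch below hard-codes the eleven champions and counts how many would have been inside a seeded field of a given size. The helper name `champions_within` is made up for illustration, and the code treats the FCS seed as a stand-in for a ranking, which is the same simplification the argument itself makes.

```python
# FCS champions and their playoff seeds, copied from the table above.
champions = [
    (2011, "North Dakota State", 2),
    (2010, "Eastern Washington", 5),
    (2009, "Villanova", 2),
    (2008, "Richmond", 7),
    (2007, "Appalachian State", 7),
    (2006, "Appalachian State", 1),
    (2005, "Appalachian State", 2),
    (2004, "James Madison", 7),
    (2003, "Delaware", 2),
    (2002, "Western Kentucky", 15),
    (2001, "Montana", 1),
]


def champions_within(cutoff):
    """Number of listed champions whose seed was `cutoff` or better."""
    return sum(1 for _, _, seed in champions if seed <= cutoff)


for cutoff in (2, 4, 7, 8):
    print(f"top {cutoff}: {champions_within(cutoff)} of {len(champions)}")
# top 2: 6 of 11
# top 4: 6 of 11
# top 7: 10 of 11
# top 8: 10 of 11
```

Eleven seasons is a small sample, so these counts suggest a cutoff more than they prove one.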

The top seven teams in the BCS before the bowls were: LSU, Alabama, Oklahoma State, Stanford, Oregon, Arkansas, and Boise State. Based on the results of the regular season, it's extremely difficult to argue that any of those teams definitively demonstrated that they weren't the best team in the country. The easiest argument is against Arkansas because they lost to both LSU and Alabama, but maybe a different rating system with less human input would have ranked them lower.

I've written my piece about how the current BCS rankings are unfair, and now I'm combining it with an argument about why a two-team playoff is unfair. So my solution is twofold. First, replace the current human and six-computer ranking system with a crowd-sourced, open source, computer ranking system. By using only open source algorithms, all the voters put their biases up front. By crowd-sourcing, we can have enough voters that the biases of different voters cancel each other out. (Ideally, we'd have an infinite number of voters, but that's the axiomatic probabilist talking again.)

At the end of the season, we'll take the top eight teams and have a playoff. We use eight because it makes an easier playoff than seven, and we use seven because that's the number of teams FCS results tell us have a reasonable shot of winning. (Kansas State would have been the eighth team in 2011-2012.) The results from FCS suggest there's little difference between the top 4 and the next 4, but it seems fair that the top 4 get home games as a reward for those slight differences. Semi-finals and finals can be played at predetermined sites.

There's one last piece of unfairness I'd like to eliminate even though eliminating it causes logistical difficulties. Every team but the national champion should have to lose. If there's an undefeated team ranked #9 or worse and they played a reasonably strong schedule, the playoffs should be expanded to let them in. The #8 team could host the leftover undefeated team the week before the quarterfinals start. It just seems wrong to me that a team can win every game and still be declared a loser.
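
To make the format concrete, here is a minimal sketch of how the quarterfinal pairings could be generated. The function name `quarterfinals` and the "Undefeated U" entry are hypothetical. Deciding whether a team's schedule was "reasonably strong" is left to the caller, and the sketch handles only one leftover undefeated team, since the proposal doesn't say what happens with two.

```python
from typing import List, Optional, Tuple

Game = Tuple[str, str]  # (host, visitor)


def quarterfinals(top8: List[str], play_in_team: Optional[str] = None
                  ) -> Tuple[Optional[Game], List[Game]]:
    """Return (play-in game or None, four quarterfinals), higher seed hosting."""
    if len(top8) != 8:
        raise ValueError("expected exactly eight ranked teams, best first")
    play_in = None
    eighth = top8[7]
    if play_in_team is not None:
        # The #8 team hosts the leftover undefeated team a week early.
        play_in = (top8[7], play_in_team)
        eighth = f"winner of {play_in_team} at {top8[7]}"
    field = top8[:7] + [eighth]
    # 1 hosts 8, 2 hosts 7, 3 hosts 6, 4 hosts 5.
    games = [(field[i], field[7 - i]) for i in range(4)]
    return play_in, games


if __name__ == "__main__":
    bcs_top8 = ["LSU", "Alabama", "Oklahoma State", "Stanford",
                "Oregon", "Arkansas", "Boise State", "Kansas State"]
    play_in, games = quarterfinals(bcs_top8)
    for host, visitor in games:
        print(f"{visitor} at {host}")
```

Passing `play_in_team="Undefeated U"` (a made-up team) would add a Kansas State home game the week before and send its winner to face the #1 seed.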

So there's my proposal. Fairer rankings with silly human biases eliminated, combined with an 8-team playoff because I think the FCS data shows a 12-game season's not long enough to definitively eliminate the top 8 from consideration most years. I hope you found the probabilistic arguments sound and will leave a comment if you didn't, and I appreciate you taking the time to read a lot of URNBALL.
