Discussion about this post

Patrick Liscio

How much of this can be attributed to the addition of the First Four in 2011? UMBC wouldn’t have been a 16-seed in 2010, and every low seed essentially dropped half a seed between 2000 and 2010. This, plus the fact that two 11-ish seeds have to win games to get to round one, seems like it would produce stronger upset candidates.

Aaron C Brown

I dispute the 1-in-200 significance claim. I think you computed the probability of getting 77 or more heads in 334 flips of a coin with 104/600 probability of getting heads. 77 is the number of upsets since 2010, 334 is the number of opportunities since 2010, and 104/600 is the frequency of upsets before 2010. That number comes to 0.46%, or one in 218.

The right way to do it is to consider the probability of 77 or more heads if the flip probability is 181/934--in other words, if the flip probability is constant at the overall sample value. That's 5.38%, 1 in 19, which falls just short of the standard 5% threshold often used to declare statistical significance.

Moreover, since you could have written a column about how upsets are declining if you got an equally unlikely result on the other side, you should double the p-value to 10.77%.
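
The binomial figures above can be checked with a short script. This is a minimal sketch using only the counts quoted in this comment (77 of 334 since 2010, 104 of 600 before); the function name `binom_upper_tail` is made up for illustration, and it assumes every game is an independent flip with the same probability, which is exactly the simplification questioned further down.

```python
from math import comb

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), by exact summation."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Counts as quoted in the comment.
upsets_since, games_since = 77, 334
upsets_before, games_before = 104, 600

p_before = upsets_before / games_before  # pre-2010 upset frequency
p_pooled = (upsets_since + upsets_before) / (games_since + games_before)  # 181/934

tail_before = binom_upper_tail(upsets_since, games_since, p_before)
tail_pooled = binom_upper_tail(upsets_since, games_since, p_pooled)
two_sided = 2 * tail_pooled  # doubling for the "equally unlikely on the other side" case

print(f"tail at pre-2010 rate: {tail_before:.4%}")
print(f"tail at pooled rate:   {tail_pooled:.4%}")
print(f"doubled pooled tail:   {two_sided:.4%}")
```

Doubling the one-sided tail is the simple convention used in the comment; other two-sided definitions for a discrete distribution would give slightly different numbers.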

A superior approach is to use a Chi-Square test with one degree of freedom. That yields a lower p-value: 3.40% for the one-tailed calculation, 6.80% for the two-tailed.
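
One way to set up such a test is a Pearson chi-square on the 2x2 table of period (before or since 2010) against outcome (upset or not). The sketch below assumes no continuity correction, which the comment does not specify, and the helper name `chi_square_2x2` is hypothetical. With one degree of freedom the tail probability of the statistic can be written with the complementary error function, so no external library is needed.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its upper-tail probability with 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    tail = erfc(sqrt(stat / 2))
    return stat, tail

# Rows: since 2010, before 2010. Columns: upsets, non-upsets (counts from the comment).
stat, tail = chi_square_2x2(77, 334 - 77, 104, 600 - 104)
print(f"chi-square = {stat:.3f}, tail probability = {tail:.4%}")
```

Under these assumptions the tail probability comes out close to the 3.40% quoted above; a continuity-corrected version would give a somewhat larger value.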

Even these results should be further adjusted. You could have picked a year other than 2010, or counted more or fewer seeds. Moreover, you know the assumptions of both the binomial and Chi-Square tests are not met--there is not a constant probability of upset in all games, and upsets are not distributed identically among all games.

Moving from theory to practice, it's easy to come up with hundreds of observations supported by this level of evidence that have no basis beyond random chance. Using a 5% significance threshold only makes sense when you test one unique hypothesis determined independently of the data--ideally before you've seen the data--that is likely to be true.

If there is a real effect, my intuition would be to look for explanations in the seeding process rather than the nature of the game itself. Seeding has become more systematic and evidence-based. I don't think that matters a lot for which teams are in the top six seeds (the top 24 in the country), but I think the committee has been getting much better at selecting the 41st to 64th best teams.
