NCAA upsets are happening more often
Parity has arrived in college basketball, and even #16 seeds have better chances than they once did.
Don’t forget to check out our NCAA tournament predictions! The men’s tournament technically begins at 6:40 p.m. tonight with a First Four game in Dayton, though most office pools ignore these games and lock on Thursday morning instead.
We will update our forecast late tonight or (more likely) tomorrow morning with the results from Dayton, however. In the interim, we’ve made a few injury updates to the model.
One of my more traumatizing sports-related childhood memories from East Lansing is when teachers wheeled in TVs to my 6th-grade middle school classroom so we could watch the big game: #1 Michigan State taking on patsy Murray State, the 16-seed. Instead of a celebration, though, it nearly became an embarrassment as Murray State hit a last-minute 3-pointer to send the game to OT. Luckily, the Spartans prevailed 75-71 in the extra period, but it was an inauspicious way to begin the tournament.[1]
However, it preserved a phenomenal opening-round win streak for #1 seeds, which would continue for another 28 years until the first 1-vs.-16 upset, when UMBC beat Virginia by 20 points (!!) in 2018. Until then, top seeds had gone 135-0.
But then another #1 seed, Purdue, lost to 16-seeded Fairleigh Dickinson in 2023. Meanwhile, 2-versus-15 upsets are now almost routine: there’s been an average of about one every other tournament since 2010.
So, if it seems like this sort of thing is happening more often than in your childhood dreams/nightmares, you’re not wrong. In fact, the trend toward more first-round upsets is statistically significant.
Here’s the data. Using 1985 (the first year of the 64-team field) as the starting point and 2010 as the dividing line (I’ll explain why I chose that year in a moment), here are opening-round results for all matchups from 1-vs.-16 through 6-vs.-11.[2] Overall, the upset percentage has risen from 17 percent to 23 percent.
The superior seed’s margin of victory has also declined by about two points. #6 seeds actually have a negative point differential against #11s since 2010, in fact.
So if your office pool rewards upsets — or you just want to differentiate from the pack — you might want to consider that historical data on the frequency of upsets may no longer apply. (We also have one #12 seed, Colorado State, that’s an outright favorite against #5 Memphis. To see our probabilistic forecasts for every first-round game, see our tournament homepage.)
But might this just be a fluke? Probably not. From 1985 through 2009, top-6 seeds went 496-104 in the opening round, good for an 82.7 percent winning percentage. If that win rate persisted, you’d expect them to go 276-58 in tournament games from 2010 onward — but instead, they’re 257-77. That’s almost 20 more upsets than expected out of a sample of 334 games. My instinct was that this was a statistically significant difference, and it turns out that it is. Using a binomial distribution, I find there’s only a 0.5 percent chance (1 in 200) that there would be that many upsets or more if the top seeds were still as dominant as they once were.
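The binomial check described above can be reproduced in a few lines. This is a minimal sketch, not necessarily the exact calculation behind the 0.5 percent figure: it assumes every game is an independent trial with a fixed upset probability equal to the 1985-2009 rate, and the helper name `binomial_upper_tail` is made up for illustration.

```python
from math import comb


def binomial_upper_tail(k_min: int, n: int, p: float) -> float:
    """Exact P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


# Records quoted in the text (favorite wins, upsets).
pre_games, pre_upsets = 496 + 104, 104     # 1985 through 2009
post_games, post_upsets = 257 + 77, 77     # 2010 onward

# Assumption: the pre-2010 upset rate is treated as the "true" rate.
baseline_rate = pre_upsets / pre_games
expected_upsets = baseline_rate * post_games
p_value = binomial_upper_tail(post_upsets, post_games, baseline_rate)

print(f"baseline upset rate: {baseline_rate:.3f}")
print(f"expected upsets since 2010: {expected_upsets:.1f}")
print(f"P(at least {post_upsets} upsets): {p_value:.4f}")
```

The printed probability should land in the neighborhood of the 1-in-200 figure, and the expected count near the 58 upsets implied by a 276-58 record.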
So they probably aren’t. Moreover, the reason for the change is pretty simple to discern. Take a look at this:
This is how many of the top 10 NBA draft picks were college upperclassmen (juniors or seniors). Until 1990, it was incredibly rare for players to turn pro as freshmen or sophomores (and it wasn’t even allowed at all until 1971). The exceptions were some of the elites of the sport, like two guys we cared a lot about in East Lansing, Magic Johnson and future Detroit Piston Isiah Thomas. (Both left as sophomores.) Now, it’s incredibly rare to see an upperclassman be a top 10 pick. The only exception last year was Purdue’s Zach Edey, a senior, and the top 14 in this year’s consensus mock draft are all college freshmen. So although there’s no hard cutoff — as late as 2004, half the top 10 were upperclassmen — 2010 is a round-numbered year that probably works as well as anything.
There’s also some other evidence for the fading dominance of the top college programs I found when recalibrating our historical college Elo ratings, which we’ve now redubbed SBCB ratings after making various improvements. Our data runs all the way back to 1949-50, but we found that the system had become overconfident when applied to recent seasons: that is, it predicted too few upsets both in the regular season and the tournament. Most of the issue, though, was concentrated in the early part of the schedule; elite performance doesn’t carry over as much from season to season. Teams now revert more toward the mean than they have historically because if they’re lucky enough to have a top talent, they’ll almost invariably lose him after one or two seasons. (Nobody except certain grumpy washed-up ex-Sacramento Kings expects Cooper Flagg to return to Duke, for instance.)
This is much less of a problem for smaller schools that fill out the bottom rungs of the bracket, which rarely have top-10 talent. It also isn’t an issue in the women’s game, where players are barred from entering the WNBA until they turn 22. So, in designing our women’s SBCB ratings, we use a smaller mean-reversion factor that looks more like what the men’s game was like in the 1980s. Caitlin Clark still played four years in school, for instance, and she really had no choice.
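To make the idea of a mean-reversion factor concrete, here is a minimal sketch of how a rating might be carried from one season to the next. The actual SBCB formula and its parameters aren't given here, so everything in this block is an assumption: the function name, the 1500 average, and both reversion values are hypothetical and chosen only to show the direction of the effect.

```python
def carry_over_rating(end_rating: float, mean_rating: float, reversion: float) -> float:
    """Pull an end-of-season rating toward the mean before the next season.

    reversion = 0 keeps the rating as-is; reversion = 1 resets it to the mean.
    """
    return end_rating + reversion * (mean_rating - end_rating)


MEAN_RATING = 1500.0       # hypothetical average rating
MENS_REVERSION = 0.5       # hypothetical: elite talent leaves quickly
WOMENS_REVERSION = 0.3     # hypothetical: rosters stay together longer

elite_rating = 1900.0      # hypothetical end-of-season rating for a top team
mens_next = carry_over_rating(elite_rating, MEAN_RATING, MENS_REVERSION)
womens_next = carry_over_rating(elite_rating, MEAN_RATING, WOMENS_REVERSION)

print(f"men's preseason rating:   {mens_next:.0f}")
print(f"women's preseason rating: {womens_next:.0f}")
```

With a larger reversion factor, the same elite team starts the next season closer to average, which is why a model using too small a factor would be overconfident early in the schedule.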
Admittedly, this can be hard to take advantage of when making your picks. If a #1 seed now has a 96 or 97 percent chance of winning instead of 98 or 99 percent, the odds of an upset are still really low.[3] (As I was prescient enough to write two years before Virginia-UMBC, it was never truly 100 percent, as near-misses like MSU vs. Murray State attest.) But the best teams aren’t as good as they once were, and that ought to have implications as you think about your bracket.
[1] After another narrow win against UCSB in the Round of 32, Sparty succumbed in the Sweet 16. (My family actually traveled to Tennessee for the game, and I’m still mad at my dad for not letting us stay for the second game of the doubleheader, which featured a rising LSU star named Shaquille O’Neal.)
[2] The 8-vs.-9 and 7-vs.-10 games are usually closely matched enough so as not to really qualify as upsets.
[3] Nor is it usually fun to bet, say, +3300 moneylines when you’ll usually lose, plus you’re paying vig to the sportsbook. Point spreads are another matter (the Silver Bulletin model likes some big underdogs in the opening round), but I’m a little skittish about some of these too, just because it often comes down to factors like when teams rest their starters rather than the underlying quality of the teams. Although full disclosure: I do have some of those underdog bets down for low stakes myself.
How much of this can be attributed to the addition of the First Four in 2011? UMBC wouldn’t have been a 16-seed in 2010, and every low seed essentially dropped half a seed between 2000 and 2010. This, plus the fact that two 11-ish seeds have to win games to get to round one, seems like it would produce stronger upset candidates.
I dispute the 1-in-200 significance claim. I think you computed the probability of getting 77 or more heads in 334 flips of a coin with 104/600 probability of getting heads. 77 is the number of upsets since 2010, 334 is the number of opportunities since 2010, and 104/600 is the frequency of upsets before 2010. That number comes to 0.46%, or one in 218.
The right way to do it is to consider the probability of 77 or more heads if the flip probability is 181/934--in other words if the flip probability is constant at the overall sample value. That's 5.38%, about 1 in 19, which falls just short of the standard 5% level often used to declare statistical significance.
Moreover, since you could have written a column about how upsets are declining if you got an equally unlikely result on the other side, you should double the p-value to 10.77%.
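For anyone who wants to check these figures, here is a minimal sketch that runs both versions of the binomial calculation side by side: one using the pre-2010 rate (104/600) and one using the pooled rate (181/934). The function name is hypothetical, and doubling the one-sided tail is just the simple convention described above, not the only way to build a two-sided p-value.

```python
from math import comb


def upper_tail(k_min: int, n: int, p: float) -> float:
    """Exact P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


games_since_2010, upsets_since_2010 = 334, 77

rate_pre_2010 = 104 / 600                  # upset frequency before 2010
rate_pooled = (104 + 77) / (600 + 334)     # 181/934, constant rate over the whole sample

tail_pre = upper_tail(upsets_since_2010, games_since_2010, rate_pre_2010)
tail_pooled = upper_tail(upsets_since_2010, games_since_2010, rate_pooled)
two_sided_pooled = min(1.0, 2 * tail_pooled)   # simple doubling convention

print(f"one-sided, pre-2010 rate: {tail_pre:.4f}")
print(f"one-sided, pooled rate:   {tail_pooled:.4f}")
print(f"two-sided, pooled rate:   {two_sided_pooled:.4f}")
```

The first line should come out well under 1%, while the pooled-rate versions should sit on either side of the 5% line, which is the whole disagreement in a nutshell.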
A superior approach is to use a Chi-Square test with one degree of freedom. That yields a lower p-value: 3.40% for the one-tailed calculation, 6.80% for the two-tailed.
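Here is a rough sketch of that Chi-Square calculation on the 2x2 table of favorite wins and upsets before and after 2010. It assumes a plain Pearson statistic with no continuity correction (the comment above doesn't say whether one was used), and the function names are made up for illustration. One thing worth keeping in mind when labeling the result: with one degree of freedom the statistic is a squared z-score, so its upper tail already counts deviations in both directions.

```python
from math import erfc, sqrt


def pearson_chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic for the table [[a, b], [c, d]], no correction."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    cells = ((a, row1, col1), (b, row1, col2), (c, row2, col1), (d, row2, col2))
    stat = 0.0
    for observed, row_total, col_total in cells:
        expected = row_total * col_total / n
        stat += (observed - expected) ** 2 / expected
    return stat


def chi_square_tail_1df(x: float) -> float:
    """P(chi-square with 1 df > x), using P(Z^2 > x) = erfc(sqrt(x / 2))."""
    return erfc(sqrt(x / 2))


# Rows: 1985-2009 and 2010 onward; columns: favorite wins, upsets.
statistic = pearson_chi_square_2x2(496, 104, 257, 77)
tail_probability = chi_square_tail_1df(statistic)

print(f"chi-square statistic: {statistic:.2f}")
print(f"upper-tail probability: {tail_probability:.4f}")
```

The tail probability should come out close to the 3.40% quoted above; applying a continuity correction would push it somewhat higher.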
Even these results should be further adjusted. You could have picked a year other than 2010, or counted more or fewer seeds. Moreover you know the assumptions of both the binomial and Chi-Square tests are not met--there is not a constant probability of upset in all games, and upsets are not distributed identically among all games.
Moving from theory to practice, it's easy to come up with hundreds of observations supported by this level of evidence that have no basis beyond random chance. Using a 5% significance threshold only makes sense when you test one unique hypothesis determined independently of the data--ideally before you've seen the data--that is likely to be true.
If there is a real effect, my intuition would be to look for explanations in the seeding process rather than the nature of the game itself. It's become more systematic and evidence-based. I don't think that matters a lot for which teams are in the top six (the top 24 in the country), but I think the committee has been getting much better at selecting the 41st to 64th best teams.