How our SBCB ratings and NCAA tournament forecasts work
Plus, our all-time men's basketball Top 100 teams.
Last revised 3/16/2025
Men’s SBCB
If you’ve clicked on this page, you’re presumably a statistics and/or college basketball nerd. So, I will describe our methods with a minimum of fuss. As a bonus, though, I’ll start by showing you the all-time top 100 season-ending SBCB totals since the 1949-1950 NCAA season, which helps to illustrate the differences between the two parallel systems — “SBCB Pure Elo” and “SBCB Bayesian Elo” — that we use.1
The SBCB numbers are a souped-up version of the Elo rating system, a method with a long history in chess and other fields:
One of the core principles of Elo is that winning matters: a team never sees its Elo rating decline after a victory, no matter the circumstances.
Another is that the total number of points in the system is preserved. If Michigan State gains 11 Elo points after beating Wisconsin, then Wisconsin loses exactly 11 points.
Finally, strength of opponent matters considerably. A team with a 1900 rating will gain hardly any ground by beating an 1100, since it’s already expected to win this game around 99 percent of the time. But if a 1500 beats a 1700 — about a 15 percent chance — it will get a lot of credit.
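These principles can be sketched with a generic Elo implementation — a minimal reference, not SBCB’s actual code, assuming the conventional 400-point logistic scale (which reproduces the ~99 percent figure above):

```python
def expected_win_prob(rating_a: float, rating_b: float) -> float:
    """Probability that team A beats team B under textbook Elo."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def update(rating_a: float, rating_b: float, a_won: bool, k: float = 38.0):
    """Zero-sum Elo update: whatever A gains, B loses exactly."""
    exp_a = expected_win_prob(rating_a, rating_b)
    delta = k * ((1.0 if a_won else 0.0) - exp_a)
    return rating_a + delta, rating_b - delta

# A 1900 team is expected to beat an 1100 team ~99 percent of the
# time, so a win barely moves either rating.
print(round(expected_win_prob(1900, 1100), 2))  # 0.99
```

Note how the update preserves the total number of points in the system: the winner’s gain and the loser’s loss are the same `delta`.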
Beyond that, though, there are a lot of nuances:
Margin of victory matters. Although there is some degree of diminishing returns — in basketball particularly, teams often send in the scrubs in lopsided games, and free-throw strategy can turn narrow wins into games that were closer than the final score suggests — larger margins of victory get a higher multiplier. Specifically, the margin of victory factor is calculated as (3 + s)^0.85, where s is the scoring differential.
Home court advantage matters too — in fact, we calculate a separate home court rating for each team, based on how much it underperforms or exceeds its Elo projection in home games. Generally speaking, teams that are reputed to have larger home-court advantages based on difficult playing conditions or more enthusiastic fan bases actually do. Teams that play at high altitudes — here’s looking at you, schools based in Utah and Colorado — often do as well. These home-court ratings move very slowly, taking advantage of data from previous years, and they fully carry over from season to season. However, having a larger home-court advantage isn’t helpful in the NCAA tournament, which is played entirely at neutral sites. Teams like Purdue, whose home court is worth roughly an additional 2 points of victory margin compared with the NCAA D1 average, may be overrated by other systems that don’t account for this factor.
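As a quick illustration of the margin-of-victory factor defined above: the text doesn’t specify exactly how the factor enters the rating update — scaling the per-game Elo change by it is a common convention in public Elo sports models, but that part is an assumption.

```python
def mov_factor(s: float) -> float:
    """Margin-of-victory factor from the text: (3 + s) ** 0.85."""
    return (3.0 + s) ** 0.85

# The exponent below 1 compresses blowouts: a 30-point win earns far
# less than 30 times the credit of a 1-point win.
for margin in (1, 10, 30):
    print(margin, round(mov_factor(margin), 2))
```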
Travel distance also matters and, for the 2024-25 season, is worth approximately 8 * m^(1/3) Elo rating points, where m is the distance in miles from the visiting team’s campus. For home games, the travel distance factor is added to the team-specific home court rating to calculate a team’s overall advantage. Travel distance also matters in neutral-site games: a team flying across the country to play in a tournament game will be at a disadvantage. Note that the effect of travel distance is gradually becoming smaller, presumably reflecting improving travel accommodations; SBCB accounts for this.
SBCB ratings, like many other Elo systems applied to sports, carry over from season to season, with a discount factor that reverts the ratings toward the mean between seasons. Empirically, the degree of mean reversion from year to year is growing — in other words, teams are less likely to sustain their success — probably because the best players typically leave for the NBA after one or two years in college; even elite programs now rarely maintain dominance with the same core of talent. Currently, a team’s rating is reverted by 30 to 35 percent toward the mean at the start of each new season.
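As a toy sketch of the between-season reversion, assuming a 32.5 percent pull (the midpoint of the stated 30-35 percent range) toward a flat 1500 target — as explained next, the actual reversion target is conference-based rather than global:

```python
def offseason_revert(rating: float, target: float = 1500.0,
                     reversion: float = 0.325) -> float:
    """Pull a season-ending rating part of the way back toward a target."""
    return rating + reversion * (target - rating)

print(offseason_revert(1900.0))  # roughly 1770
```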
However, in the Pure Elo version of our ratings, this reversion is toward the average of season-ending Elo ratings in a team’s conference — not toward the global average of 1500 (with the exception of independent teams). Thus, interconference play matters a lot. In particular, a team that exceeds expectations in the NCAA Tournament will redistribute those gains to the rest of the teams in its conference in the off-season recalibration. Despite this, SBCB is generally more bullish on mid-major and “minor” conference teams than its competitors.
In the Bayesian Elo version of SBCB, conversely, mean reversion is instead based partly on preseason rankings in the AP (media) and Coaches polls.
This creates various complications because the polls provide only a truncated list: just 25 teams are ranked — and fewer than that in some previous seasons — although we also use teams in the “also receiving votes” category to create a longer list. The process for imputing human ratings quite literally applies Bayes’ Theorem, in the sense that it relies on a prior about how likely a team is to be ranked. For instance, a team with a 1900 Elo rating would typically expect to be ranked somewhere in the top 25 (or at least among the “also receiving votes” teams) in the next preseason poll — so if it isn’t ranked, that provides a lot of information, namely that its performance is expected to decline, usually because of a loss of key talent. However, a team that ended the previous season with a 1400 rating would rarely expect to be ranked, so an absence from the polls tells us little; SBCB accounts for this and basically defaults to the Pure Elo process for these squads. Teams ranked #1 overall receive special treatment if their Bayesian Elo rating lags behind their mean-reverted Pure Elo rating, to ensure that truly dominant clubs like the late-’60s/early-’70s UCLA Bruins are not punished — however, these instances are rare in the modern game.2
What we describe as Bayesian Elo is actually a 50/50 blend of an “undiluted” Bayes rating (where preseason priors are based as much as possible on human rankings) and Pure Elo; this blend is more predictive than either system alone. The “undiluted” Bayes version isn’t shown anywhere on the site, though it can be imputed. For instance, if a team is listed with an 1800 Bayesian Elo rating and a 1700 Pure Elo rating, that implies its undiluted Bayesian rating is 1900. Essentially, the system keeps two separate sets of books — Pure Elo and undiluted Bayes — and what we call Bayesian Elo is the average of the two. As you can see from the top teams list, Bayesian Elo usually matches the conventional wisdom better, rewarding historically well-regarded squads that were considered elite from wire to wire rather than teams like 2024 UConn, which admirably overperformed expectations but may have been playing over their heads.
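Since the published number is a 50/50 average, the unpublished “undiluted” rating can be backed out algebraically; a small sketch using the example from the text:

```python
def undiluted_bayes(blended: float, pure: float) -> float:
    """Invert the 50/50 blend: blended = (pure + undiluted) / 2."""
    return 2.0 * blended - pure

print(undiluted_bayes(1800.0, 1700.0))  # 1900.0, matching the example above
```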
I haven’t yet mentioned what is perhaps the most important parameter in any Elo system: the k-factor, which governs how much the ratings update after each game. A higher k-factor implies more sensitivity to recent play but also more volatility. Statistically speaking, the goal is generally to minimize autocorrelation. That is, you want to avoid both a too-high k-factor, where ratings zig-zag around (i.e., teams usually decline after gaining and vice versa), and a too-low one, where a team with a recent ratings gain can predictably be expected to follow it up with further gains because the system is too slow to account for what soccer fans call a change in “form.” Specifically, we use a k-factor of 38; this number has no intrinsic meaning and is derived empirically. Generally speaking, SBCB ratings are more aggressive than other college basketball systems about accounting for recent play: they tend to ride a winning hand while discounting ratings for teams on a downward trajectory.
However, the k-factor is up to 50 percent higher (a k-factor of up to 56) for early-season games, diminishing linearly back to 38 by roughly the 20th game of a team’s season. The intuition is that early-season games reveal a lot of information compared with the crude mean-reversion estimates described above. By the middle of the season, conversely, teams mostly are “who we thought they were,” and each subsequent game tells us less.
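One plausible reading of this early-season schedule (the exact interpolation isn’t specified, so treat the linear ramp as an assumption): start at 56 for a team’s first game and decline linearly to the base value of 38 by its 20th game.

```python
def k_factor(games_played: int) -> float:
    """Early-season k-factor ramp: 56 at game zero, 38 from game 20 on."""
    frac = max(0.0, (20 - games_played) / 20)
    return 38.0 + 18.0 * frac

print(k_factor(0), k_factor(10), k_factor(25))  # 56.0 47.0 38.0
```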
NCAA tournament games receive an additional multiplier of 1.25x, tantamount to a k-factor of 47.5. Tournament games generally reveal a lot of information — they’re high-stakes games played on neutral courts — and teams that outperform expectations in the early rounds of the tourney often continue to do so.
Furthermore, differences in team strength are typically more apparent in the tournament, and the model accounts for this, too: an additional multiplier of 1.07x is applied to the Elo ratings difference between the teams when forecasting margins of victory and win probabilities in the tournament. This implies that upsets are actually less common in the tournament than in an equivalent regular season game, despite the tourney’s reputation for them.
Teams that are new to Division I begin with a rating of 1300 at the start of their first D1 season, adjusted slightly upward or downward based on the strength of their new conference. That is to say, they are usually considerably below average, since the average Elo rating is 1500. Preexisting D1 teams’ ratings are adjusted slightly upward so that the global average remains at 1500 when new teams join.
Our database contains many games between Division I and Division II teams, especially in recent seasons. Rather than calculating a rating for individual D2 teams, however, we lump all D2 teams together into a single “divtwo” rating. Essentially, this makes them the equivalent of the Washington Generals, barnstorming around and usually getting obliterated. A separate “divtwohome” running tally is calculated for D2 teams that host home games, as opposed to playing on the road or at neutral sites — but this has become rare, as D1 teams generally don’t want to decline an opportunity to sell tickets. D2 teams that do host D1 opponents tend to be considerably stronger, and are perhaps candidates for an upgrade to D1. Overall, however, D2 teams are patsies, especially as more and more schools join D1, with D1 teams winning upwards of 99 percent of the time at home against D2 opponents in recent years. As of March 2025, the generic Division II rating is about 660 for road and neutral-site games — much worse than any D1 team’s — and about 900 for home games.3
For calculating margins of victory, one point in a basketball game equals approximately 27 Elo points. Thus, a team with a 100-point Elo advantage, after accounting for home court, travel distance, and the tournament multiplier, is roughly a 3.5- or 4-point favorite.
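Putting several of the pieces above together — the team-specific home court rating, the 8 * m^(1/3) travel term, and the 27:1 conversion — a pregame spread projection might look like the following sketch (the home-court and travel values in the example are illustrative, not any real team’s figures):

```python
def projected_spread(home_elo: float, away_elo: float,
                     home_court_elo: float, travel_miles: float) -> float:
    """Projected home-team margin of victory, in points."""
    advantage = (home_elo - away_elo) + home_court_elo
    advantage += 8.0 * travel_miles ** (1.0 / 3.0)  # visitor's travel penalty
    return advantage / 27.0

# A 100-point Elo favorite on a neutral court with no travel:
print(round(projected_spread(1700.0, 1600.0, 0.0, 0.0), 1))  # 3.7
```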
Differences in women’s SBCB
There tends to be more dominance in the women’s game, i.e. the top teams win more often against average ones, and by larger margins. This mostly emerges organically from the rating-generation process, although there are a few small tweaks. Division II teams and new Division I teams start with lower default ratings than in the men’s version, and there is more skewness introduced in the Bayesian version of the ratings.
There is less mean-reversion from season to season, probably because women are not eligible to join the WNBA until age 22, leading to greater team continuity.
Home court advantage tends to be slightly smaller in the women’s game. As with the men, it is calculated on a team-by-team basis. But note that Round of 64 and Round of 32 NCAA tournament games are often played on home-court sites in the women’s tournament.
Empirically, the ratio of Elo rating point differences to the point spread is about 25:1 for women, as opposed to 27:1 for men. By implication, the same projected win probability translates to a wider point spread for women. This probably reflects the fact that women’s games are lower-scoring on average, which introduces more variance since there are fewer scoring possessions. Lower-scoring sports in general tend to produce more unpredictable outcomes, controlling for differences in team quality.
Women’s ratings are based on the 2002-03 season onward, rather than 1949-50 for the men. Data is also somewhat less complete: for instance, more games against D2 opponents are missing, and data on game locations for neutral-site games is much less comprehensive.
NCAA Tournament forecasts
Our NCAA tournament forecasts for the men’s and women’s fields are similar in many respects. They account for differences in team strength, game location and — on the women’s side, since top seeds usually host games in the first two rounds — home court advantage. The parameters in the model have been calibrated based on our historical analysis of tournament games, including the fact that (as described above) differences in team strength are slightly magnified in the tourney.
The most obvious difference is that our ratings are essentially turned around to apply on a forward-looking basis, running through the various conditional probabilities for the rest of the tournament. This accounts for the fact that conditional upon winning, a team’s rating is likely to improve, i.e. if a #12 seed defeats a #5 seed, the 12-seed is probably going to be better than we expected in forthcoming games.
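This idea can be sketched as a small Monte Carlo simulation (illustrative only — the team ratings and four-team bracket are made up, and SBCB’s real pipeline is more involved). The key step is the Elo update applied to each simulated winner, so a team that pulls an upset carries a higher rating into the next round; the 1.07x ratings-gap multiplier and the tournament k-factor of 47.5 come from the parameters described earlier:

```python
import random

def win_prob(r_a: float, r_b: float) -> float:
    """Win probability with the tournament's 1.07x ratings-gap multiplier."""
    gap = 1.07 * (r_a - r_b)
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))

def simulate_bracket(ratings: dict, n_sims: int = 20000,
                     k: float = 47.5, seed: int = 0) -> dict:
    """Estimate each team's chance of winning a four-team bracket."""
    rng = random.Random(seed)
    teams = list(ratings)
    titles = {t: 0 for t in teams}
    for _ in range(n_sims):
        r = dict(ratings)
        finalists = []
        for a, b in ((teams[0], teams[3]), (teams[1], teams[2])):
            w, l = (a, b) if rng.random() < win_prob(r[a], r[b]) else (b, a)
            # Conditional on winning, the winner's rating improves --
            # a big jump if the result was an upset.
            delta = k * (1.0 - win_prob(r[w], r[l]))
            r[w] += delta
            r[l] -= delta
            finalists.append(w)
        a, b = finalists
        champ = a if rng.random() < win_prob(r[a], r[b]) else b
        titles[champ] += 1
    return {t: n / n_sims for t, n in titles.items()}
```

Running many simulations and averaging over the conditional rating paths is what turns a backward-looking rating into a forward-looking forecast.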
There are two other obvious differences between the tournament forecasts and SBCB.
First, our tournament forecasts account for injuries, whereas plain old SBCB does not. The injury information is entered manually, and is probabilistic; we have to make some judgment calls on translating sometimes subjective injury reports into probabilities that the player is sidelined for forthcoming rounds. The model also accounts for significant players who missed time during the regular season but are now available. Thus, some teams will have an upward injury adjustment for having gotten healthier.
Historically, data on women’s injuries is sparser and we haven’t accounted for it, but we’ll look to adjust for major injuries in the women’s tournament beginning in 2025.
The injury adjustment accounts for the importance of the player, as measured by sports-reference.com Win Shares, adjusted for strength of schedule. We project a player’s replacement based on an historical analysis of players who played 5 MPG or less, adjusted for team strength. In other words, even a scrub on Duke who fills in for an injured player is likely to be pretty good, whereas one on Southeast Missouri State might not be. Thus, the most devastating injuries are when a middling team loses one of its few stars.
Second, the tournament ratings are based on a 50/50 composite of Bayesian SBCB and other systems.4 For men, the composite employs the following external ratings:
Pomeroy (1.5x weight)
The NCAA tournament committee’s S-Curve rankings, i.e. how it seeds all tournament teams from 1 through 68.
On the women’s side, there are fewer high-quality published ratings, so we rely on this mix instead:
Sonny Moore
Massey
S-Curve with ties broken6 by the tournament committee’s quasi-official NET ratings.
On the men’s side, we give Pomeroy a little extra weight just because KenPom is widely regarded as the gold standard in college basketball. The Sagarin ratings were formerly used in the model but are no longer being published; we’ve also retired the LRMC ratings.7
In past years, we also accounted for preseason human poll rankings, specifically a composite of the AP and Coaches preseason polls. It is no longer necessary to do this because these rankings are now incorporated into Bayesian SBCB, which already gets half the weight.
Ratings systems other than SBCB are normalized to have the same mean and standard deviation in a given season as SBCB.
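A standard way to do this is a z-score transform; a minimal sketch (the function name and toy numbers are ours, and SBCB’s exact normalization procedure isn’t spelled out above):

```python
import statistics

def normalize_to(external: list, target_mean: float,
                 target_stdev: float) -> list:
    """Rescale external ratings to a target mean and standard deviation."""
    mu = statistics.mean(external)
    sigma = statistics.pstdev(external)
    return [target_mean + target_stdev * (x - mu) / sigma for x in external]

# e.g., efficiency-margin-style ratings rescaled onto an Elo-like scale:
print(normalize_to([30.0, 20.0, 10.0], 1500.0, 100.0))
```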
Like SBCB, the composite ratings used in our model adjust after every game using an Elo-like formula. In fact, the formula is identical to the SBCB formula, other than that our tournament version also accounts for injury status at the time of the game. In past years, this formula was based on undiscounted margin of victory, but it’s now been made more consistent with our Elo-based ratings, so runaway margins of victory are discounted to some degree.
In forecasting future rounds, we also account for the projected change in Elo conditional on a team winning. For example, if a #15 upsets a #2 seed, then conditional on advancing to the next round, it is likely to realize a large Elo gain since Elo substantially rewards upsets. Thus, it might be more plucky against its next opponent than our pre-tournament forecast suggested. Conversely, the #2 seed would barely gain any ratings points from this since its victory was expected — although it will get some credit for a lopsided win.
A note on how teams are named
For the tables that accompany SBCB, we’ve put a lot of effort into our style guide, operating on the principle that we generally call teams how they self-identify in their public communications: for instance, “IU Indy” rather than “IU Indianapolis” or “IUPUI.” We’ve also looked at which teams tend to prefer their abbreviations (e.g. “BYU” rather than “Brigham Young”) and which are more ambivalent about them (hence, “North Carolina” rather than “UNC”). However, there are some exceptions to avoid ambiguity or because of space constraints. If you see anything that looks funky — or for that matter, anything else that looks wrong with the SBCB ratings — please let us know.
The historical Bayesian ratings were updated on 3/14/2025 based on the slight change in translating rankings to ratings as described below.
A modification made on 3/13/2025 also slightly increases the dispersion of our Bayesian Elo ratings. Namely, we now assume a small bit of skewness in the distribution of ratings derived from polls, which tends to slightly improve the SBCB ratings for teams that rank very highly in the preseason poll.
SBCB ratings are designed to average 1500 across the 364 men’s D1 programs. But there may be some very small amount of “leakage” to or from D2 over the course of a season based on whether the D2 teams overperform or underperform our projections. Thus, the D1 average may not be exactly 1500 once the season gets underway (though it’s typically very close: it’s 1499.9 toward the end of the 2024-25 season, for instance.) These small differences are reconciled in the off-season recalibration.
Historically, our tournament forecasts didn’t give our Elo forecasts this much weight, but we think we’ve improved the system to the point where it accounts for more information than other ratings systems, especially preseason ratings, recent play, and performance in previous seasons.
Newly added for 2025; it’s a great site, by the way.
Historically, the selection committee reveals the top 4 overall seeds for women in order, but then doesn’t differentiate the teams after that other than through their seed lines.
It’s a good system, but it ranks teams (i.e., from 1 to 364) rather than giving them a rating (i.e., 94.52 versus 92.12), which reduces fidelity. Also, there’s some evidence of unreliability: it’s missing a team (363 men’s teams are listed rather than 364) and it’s updated less regularly.