2024 presidential election model methodology update
There aren’t many changes from 2020 — but here are the exceptions.
The Silver Bulletin presidential model is the same model that I developed and published when I worked for FiveThirtyEight from 2008 through 2023.[1] So with relatively few exceptions, the model is identical to the one I ran at FiveThirtyEight in 2020. There’s a highly detailed methodology breakdown of the 2020 model here; in this post, I’ll simply note what’s changed since then.
Without getting too philosophical, there’s a careful balance between fixing things that clearly need fixing (or that you’ve had time to think more deeply about) and leaving alone things that aren’t broken. In general, model-builders err too much toward “fighting the last war,” i.e. tinkering with things that didn’t work out great in the last election or that made people mad at them, but which were actually perfectly reasonable assumptions. It’s possible to lean too far in either direction, though. I tend to be conservative about making changes to the election models. With that said, a few things are different this year.
Removing COVID-specific assumptions
This is straightforward. The 2020 version of the model contained various one-off tweaks related to COVID, such as reducing the convention bounce adjustment for conventions that were largely held virtually, and increasing the overall amount of uncertainty in the model. I think these were defensible choices — COVID was the sort of once-in-a-century emergency that rose to the threshold of a “broken leg problem” — but the pandemic is over and these have been removed.
Adjusting for new turnout dynamics
For many years, Republicans tended to be more reliable voters, with a higher likelihood of turning out. The main effect of this assumption on the model was in how polls of registered voters (or polls of all adults) were handled compared to polls of likely voters. The model had a prior that Republican candidates were likely to benefit from the shift from registered to likely voter polls and that challengers (whose supporters tend to be more enthusiastic) rather than incumbents would also benefit from this shift.
This situation has now almost completely reversed: there is extremely robust evidence, such as Democrats’ strong performance in special elections, that Democrats now have the more engaged voters and may actually benefit from lower turnout. For now, I’ve simply reverted to a “zero prior,” meaning that the model assumes that the conversion from registered to likely voter polls is equally likely to help Democrats or Republicans. However, this prior is revised as polls directly comparing registered and likely voters enter the database. I’ve also lowered the weight on the prior to make the adjustment more strictly empirical. In fact, as of this writing in late June 2024, the prior has been almost entirely phased out, and it will be fully phased out soon.
For what it’s worth, weakening this prior helps Joe Biden, who so far this cycle has generally fared slightly better in polls of likely voters rather than registered voters; the old model would have been more stubborn in holding onto the premise that likely voter polls usually benefit Republicans. But I think the evidence for making this change is pretty unambiguous.
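One way to picture how a zero prior gets phased out as data arrives is a simple weighted blend. This is only a minimal sketch, not the model's actual code: the function name, the `prior_weight` parameter and the idea of counting the prior as "a number of polls' worth of evidence" are all assumptions made for illustration.

```python
def likely_voter_shift(observed_shifts, prior_mean=0.0, prior_weight=2.0):
    """Blend a prior with observed registered-to-likely-voter shifts.

    observed_shifts: change in the Democratic-minus-Republican margin (in
        points) from polls that report both registered- and likely-voter
        results.
    prior_mean: 0.0 encodes the "zero prior" (no assumed partisan benefit).
    prior_weight: hypothetical parameter; how many polls' worth of evidence
        the prior counts as. Lowering it makes the estimate more empirical.
    """
    n = len(observed_shifts)
    if prior_weight + n == 0:
        return prior_mean
    total = prior_mean * prior_weight + sum(observed_shifts)
    return total / (prior_weight + n)
```

With no polls, the estimate is just the prior (zero). As more paired polls arrive, the prior's share of the total weight shrinks toward nothing, which is the sense in which it is "phased out."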
I’ve also removed a component of the model that we introduced in 2020 based on the Cost of Voting Index, which assumed that Democrats benefited from changes that liberalized voting rights rules and Republicans benefited from changes that tightened them.
More robust assumptions about RFK and third-party votes
Robert F. Kennedy Jr. is running as an independent this year and is likely to be on the ballot in most states. He initially received around 10 percent of the vote in polls, although he faded to the mid-to-high single digits in more recent surveys, especially among likely voters.
Kennedy nevertheless clearly meets the threshold of what I call a “named” third-party candidate, meaning one who could plausibly have an effect on the overall dynamics of the race rather than one who can be treated as a rounding error. This distinction is admittedly somewhat subjective — but in our backtesting on elections since 1968, this designation also applies to George Wallace (1968), John B. Anderson (1980), Ross Perot (1992 and 1996), Ralph Nader (2000) and Gary Johnson (2016).
In the model, “named” third-party candidates are allowed to win states and electoral votes (as Kennedy does in about 5 percent of simulations as of June 2024) while candidates designated as “other” are not.
Some of the model’s logic for handling RFK Jr. is borrowed from the 2016 version that applied to Johnson, although some of the third-party code was rewritten between 2016 and 2020. (It never wound up getting applied because there were no “named” third-party candidates in 2020; the lack of testing on real-world data led to some amusing glitches when we vetted the logic this year.) In general, third-party candidates have a highly asymmetric probability distribution for their vote share. Most of the time, their vote fades down the stretch run as voters conclude they are non-viable and don’t want to waste their votes. But there is a right tail where they take off and become major players in the race, as Perot did in 1992. That’s why Kennedy occasionally wins a few electoral votes in the model even though his modal outcome is to decline to something like 4 percent of the vote.
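To illustrate what an asymmetric, right-tailed distribution for a third-party vote share looks like, here is a rough sketch using a log-normal multiplier. The log-normal form and the `median_factor` and `sigma` values are assumptions chosen for illustration; the article does not say which distribution the model uses.

```python
import math
import random
import statistics


def simulate_third_party_share(current_share, rng, median_factor=0.5, sigma=0.6):
    """Draw one simulated final vote share (in percent).

    The current polling share is multiplied by a log-normal factor whose
    median is `median_factor` (hypothetical), so the typical outcome is a
    fade while a minority of draws land well above the current share.
    """
    factor = rng.lognormvariate(math.log(median_factor), sigma)
    return min(current_share * factor, 100.0)


def simulate_many(current_share=8.0, n_sims=20000, seed=0):
    """Return a list of simulated final shares for a fixed random seed."""
    rng = random.Random(seed)
    return [simulate_third_party_share(current_share, rng) for _ in range(n_sims)]


if __name__ == "__main__":
    draws = simulate_many()
    print("median:", round(statistics.median(draws), 2))
    print("mean:  ", round(statistics.mean(draws), 2))
    print("share of sims above current polling:",
          round(sum(d > 8.0 for d in draws) / len(draws), 3))
```

Under these illustrative settings the median outcome sits below the current polling share while the mean is pulled upward by the right tail, which is the shape the paragraph above describes.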
There are also a few more detailed changes to handling third-party candidates that I’ve implemented since 2016:
Because most polls include RFK Jr. as an option but some do not, the model now incorporates a third-party adjustment in calculating its polling averages, similar to the likely voter adjustment described above. That is, the model guesses at what polls would say if they included RFK Jr. when they do not. Overall, the effect of this is neutral: RFK Jr. draws about equally from both Biden and Trump. The adjustment is based specifically on polls that include Biden, Trump and Kennedy, but not any other minor-party candidates. Biden is probably hurt, however, by the left-of-center third-party candidates Jill Stein and Cornel West.
The model treats RFK’s ability to be on the ballot in various states as probabilistic rather than deterministic. I err on the side of assuming he will be included; he has a well-funded ballot access campaign, and I take the Kennedy campaign at its word when it says which states it has qualified for already.[2] However, the model includes guesstimates of RFK Jr.’s likelihood of ultimately reaching the ballot in states where he has not yet qualified, based on the ratio of signatures required to the voting-eligible population. The model draws a random number to determine RFK ballot eligibility in states where it’s ambiguous whether he’ll qualify or not. In simulations where he’s not qualified in a particular state, the model backs out the third-party adjustment as described above and reassigns RFK’s votes to Biden, Trump and “other” candidates. Gradually, as we learn exactly which states RFK Jr. will be qualified for, these guesstimates will be phased out.
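The probabilistic ballot-access step described above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the model's code: the function name, the `p_on_ballot` input and the `reassign` fractions are hypothetical, and the sketch takes the qualification probability as given instead of deriving it from signature requirements.

```python
import random


def simulate_state(shares, p_on_ballot, reassign, rng):
    """Run one simulation of a state's vote shares.

    shares: dict of percentages for "Biden", "Trump", "RFK" and "Other".
    p_on_ballot: assumed probability that RFK Jr. qualifies in this state.
    reassign: hypothetical fractions of RFK's vote that go to each remaining
        option when he is off the ballot; they should sum to 1.
    rng: a random.Random instance.
    """
    if rng.random() < p_on_ballot:
        # He qualifies in this simulation, so the shares stand as they are.
        return dict(shares)
    # He does not qualify: hand his votes to the remaining options.
    result = dict(shares)
    freed = result["RFK"]
    result["RFK"] = 0.0
    for name, fraction in reassign.items():
        result[name] += freed * fraction
    return result
```

Running this many times per state, with a fresh draw each time, gives a mix of simulations with and without RFK Jr. on the ballot in proportion to the assumed probability.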
Handling a rematch and better assumptions about incumbency
Trump vs. Biden is the seventh presidential rematch of all time, after 1800, 1828, 1840, 1892, 1900 and 1956. The first three of those elections aren’t highly useful data points: they came before the popular vote was widely adopted, and 1828 and 1840 also had significant complications from third-party candidates. But the last three plausibly provide some information. In general, there was a high degree of persistence in these three elections. Dwight D. Eisenhower won with extremely similar maps against Adlai Stevenson in both 1952 and 1956, as did William McKinley against William Jennings Bryan in 1896 and 1900. Grover Cleveland won the Electoral College against Benjamin Harrison in 1892 after losing it in 1888, but he narrowly won the popular vote in both years.
I’ve slightly tweaked the formula that the model uses to account for incumbency. Previously, it included two components: an economic index based on six economic variables (incumbent parties generally do better when the economy is stronger) and a dummy variable indicating whether the incumbent party has nominated an elected incumbent president (like Barack Obama in 2012), an unelected incumbent president (like Gerald Ford in 1976) or a candidate who wasn’t an incumbent at all (like Hillary Clinton in 2016). Each variable was adjusted for the degree of political polarization in the country based on Voteview data, accounting for the fact that the incumbency advantage is smaller and that elections are generally tighter in highly polarized political environments like the one the U.S. has today.
These two variables remain, although there is now an additional “tier” of incumbency to account for presidential rematches like the one this year — the assumption baked into the model is that the results are likely to be more persistent in an election where the exact same two candidates are on the ballot. In addition, the model now includes a third variable indicating how much the incumbent party won (or lost) the popular vote by in the previous election, and a constant term. We now also weigh recent elections more heavily in running the regression associated with incumbency, which is derived from elections since 1880.
The effect of these changes is negligible in 2024 — under both the old and new versions of the model, Biden is assumed to have a 2 to 3 point advantage in the popular vote based on projected economic conditions as of June 2024. However, we think this new version is more robust going forward.
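As a rough picture of what weighting recent elections more heavily in a regression means, here is a weighted least-squares sketch. It is deliberately simplified to a single predictor, whereas the regression described above has several variables plus a constant term. The exponential decay and the `half_life` parameter are assumptions; the article does not specify the weighting scheme.

```python
def recency_weights(years, half_life=40.0):
    """Give the most recent election weight 1.0 and halve the weight for
    every `half_life` years further back (hypothetical decay scheme)."""
    latest = max(years)
    return [0.5 ** ((latest - year) / half_life) for year in years]


def weighted_linear_fit(x, y, w):
    """Fit y = intercept + slope * x by weighted least squares.

    Returns (intercept, slope). Single-predictor closed form only.
    """
    total_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    if s_xx == 0:
        raise ValueError("x has no weighted variation; slope is undefined")
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope
```

In use, `x` might be an economic index value per election and `y` the incumbent party's popular-vote margin, with `w = recency_weights(years)`. Older elections still inform the fit, but they pull on the line less than recent ones do.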
The model also now includes a ninth component in its uncertainty index, indicating whether one or both parties have nominated the same candidate as in the previous election. In general, elections featuring repeat candidates are more stable and less uncertain, since these candidates are better known. Since both candidates are repeaters this year, this slightly reduces the overall uncertainty in the model.
Odds and ends
Finally, there are a few changes that don’t fit neatly into any of the above categories:
The model now simulates the results of ranked-choice voting in Maine, using a two-step process. First, following the state’s process, in simulations where no candidate initially receives a majority of the vote, votes from “other” candidates are reassigned to Biden, Trump and RFK Jr. The split of the vote between Biden and Trump is determined randomly; empirically, in ranked-choice elections, there is often not an even split. Then, if there’s still no majority after the “other” votes are reassigned, the model reallocates the third-place candidate’s votes. This step also includes an uncertainty term, although the model uses the breakdown of RFK Jr. votes between Biden and Trump as calculated in the third-party adjustment described above as a prior in simulations where RFK Jr. is the third-place candidate. As a simplifying assumption, the model ignores the fact that there is typically wastage in elections featuring ranked-choice voting (i.e. some voters do not completely fill out their ballots). The model’s published forecasts for the popular vote in Maine reflect results after any ranked-choice reallocations are applied; this is why RFK Jr. and “other” are forecasted to have low vote shares in Maine, even though it is typically a good state for third-party candidates. Because Maine is one of two states that award electoral votes to the winner of each congressional district, the model also applies the ranked-choice reallocation in simulations where no candidate has a majority in one of Maine’s congressional districts. Because of this, it is possible that Maine’s official tally of the statewide popular vote will not match its combined tally of the vote between the two districts. The model uses the statewide result in its forecast of the national popular vote.
The model uses data on voters’ religious affiliation, along with other demographic, geographic and political variables, as part of its process in calculating the correlation in the vote between different states. It now uses more detailed religious categories, with separate breakouts for African-American evangelical voters (as opposed to congregants in predominantly white evangelical Christian churches) — as well as categories for Jewish voters and Muslim voters, who had been lumped with other groups previously.
In calculating the weight assigned to various polls, the model now puts less emphasis on a poll’s sample size — as empirically, sample size is a less reliable predictor of poll accuracy than it “should” be, partly because modern polling techniques often deviate from classical statistical assumptions about sampling error. This change was applied to the FiveThirtyEight midterm forecast in 2022 and is now being phased into the presidential model for the first time.
Finally, we have further re-examined the convention bounce adjustment described above. The convention bounce has been modest in recent elections, likely as a result of higher polarization and the 24/7 nature of political coverage. The model now assumes that the candidate who just had his convention will benefit by a net of only about 2-and-a-half percentage points even at the very peak of his bounce.
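The two-step ranked-choice reallocation for Maine described in the first item above can be sketched roughly like this. It is an illustration only: the function name, the uniform random split of “other” votes, the `rfk_to_biden_prior` default of 0.5 and the `noise_sd` uncertainty term are assumptions, and, like the model, the sketch ignores ballot wastage.

```python
import random

CANDIDATES = ("Biden", "Trump", "RFK")


def maine_rcv(shares, rng, rfk_to_biden_prior=0.5, noise_sd=0.1):
    """Apply a simplified two-step ranked-choice reallocation.

    shares: dict of percentages for "Biden", "Trump", "RFK" and "Other".
    rfk_to_biden_prior, noise_sd: hypothetical prior and uncertainty for how
        RFK's votes split when he finishes third.
    """
    tally = dict(shares)
    if max(tally.values()) > 50.0:
        return tally  # someone already has a majority; nothing to reallocate

    # Step 1: reassign "other" votes among the three named candidates,
    # using a random (not necessarily even) split.
    other = tally.pop("Other", 0.0)
    cuts = sorted((rng.random(), rng.random()))
    fractions = (cuts[0], cuts[1] - cuts[0], 1.0 - cuts[1])
    for name, fraction in zip(CANDIDATES, fractions):
        tally[name] += other * fraction
    if max(tally.values()) > 50.0:
        return tally

    # Step 2: eliminate the third-place candidate and reallocate those votes.
    loser = min(tally, key=tally.get)
    freed = tally.pop(loser)
    first, second = sorted(tally)  # alphabetical, so Biden comes first if present
    if loser == "RFK":
        frac_first = rfk_to_biden_prior + rng.gauss(0.0, noise_sd)
    else:
        frac_first = rng.random()
    frac_first = min(1.0, max(0.0, frac_first))
    tally[first] += freed * frac_first
    tally[second] += freed * (1.0 - frac_first)
    return tally
```

Because no votes are discarded along the way, the shares returned always add up to the same total as the input, which matches the no-wastage simplification noted above.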
[1] I retained the IP to these models as per my agreement with Disney.
[2] Other than in New York, where Democrats’ lawsuit against Kennedy is reported to be credible.