Which polls are biased toward Harris or Trump?
*Statistically* biased, we mean. A guide to house effects in the Silver Bulletin model.
We’re 39 days out from the election and people are starting to freak out about the polls — and unskew them using some less-than-rigorous methods. Could a bad poll for your favorite candidate just come down to sampling variation? Nope, the poll must be broken because recalled 2020 vote among Hispanic women with college degrees who live in rural areas and make more than $100,000 a year looks weird. Needless to say, you shouldn’t pay attention to this stuff.
Some election watchers also have strong opinions about how election forecasts weight each poll. A common critique of the Silver Bulletin model goes something like this: “poll X has more influence on the model than poll Y, but I think poll Y is better than poll X. Therefore, the model is bad and/or wrong.” Mostly this reasoning is used by people who are upset because we’re bearish on their favored candidate, but it’s sometimes used to justify outright conspiracy theories.
Now that’s not to say you can’t make a good-faith critique of how the Silver Bulletin model weights polls. For example, our polling averages are pretty aggressive — meaning we put a big premium on newer data. Other excellent models take a less aggressive approach.
The sample size is also a big factor in a given poll’s influence score. And when a firm frequently conducts national polls or polls of the same state — we’re looking at you, YouGov and Morning Consult — the model essentially spreads the total weight assigned to a polling firm across those different surveys; otherwise our averages would quickly be dominated by the more prolific pollsters. To see our assessment of pollster quality absent those other factors, just bookmark our pollster ratings. But those other factors matter too, and it generally isn’t a good use of your time to complain about a polling average’s weighting scheme: other than RealClearPolitics, they mostly wind up in highly similar places.
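For the curious, here's a toy sketch of what spreading a firm's total weight across its polls might look like. The square-root sample-size weighting, the cap value, and the example polls are all illustrative assumptions on our part rather than the model's actual formula.

```python
import math

def spread_firm_weights(polls, firm_cap=1.0):
    """Toy version of weight-spreading: each poll gets a base weight from its
    sample size, then a prolific firm's polls are rescaled so that the firm's
    total weight can't exceed firm_cap. All constants here are illustrative."""
    # Base weight with diminishing returns on sample size (our assumption).
    weights = {i: math.sqrt(p["n"]) / 40.0 for i, p in enumerate(polls)}

    # Total weight currently held by each firm.
    firm_totals = {}
    for i, p in enumerate(polls):
        firm_totals[p["firm"]] = firm_totals.get(p["firm"], 0.0) + weights[i]

    # Rescale so a prolific firm's polls share the cap instead of stacking up.
    for i, p in enumerate(polls):
        total = firm_totals[p["firm"]]
        if total > firm_cap:
            weights[i] *= firm_cap / total
    return weights

polls = [
    {"firm": "YouGov", "n": 1500},
    {"firm": "YouGov", "n": 1500},
    {"firm": "YouGov", "n": 1500},
    {"firm": "Siena", "n": 1000},
]
print(spread_firm_weights(polls))
# The three YouGov polls split one cap (~0.33 each); the lone Siena poll keeps its full weight.
```

The point is simply that a prolific firm's fifth survey of the week adds less new information than an independent firm's first.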
How our house effects adjustment works
But even more importantly, weighting polls is only half the battle. We also apply house effects adjustments to each poll.
In other words, we know that some polling firms consistently lean toward Democrats or Republicans and we adjust their results to compensate. The point here is to make our polling averages more stable and strip out the bias that some polling firms have toward certain candidates. (By “bias”, we mean statistical bias. Sometimes this correlates with political bias and sometimes this doesn’t: Fox News’s polling department has long played it straight down the fairway, even if their editorial coverage hasn’t.) So even if a poll that’s known to lean toward Donald Trump or Kamala Harris gets a large weight in our forecast, it’s often being adjusted heavily on the backend to correct for that house effect.
Here’s what our current house effects for Harris’s margin look like for pollsters with at least five Harris/Trump polls in our model. (We don’t use any leftover data from the Biden/Trump matchup.) Positive numbers mean Harris usually does better in a pollster’s surveys than in the average; negative numbers mean Harris does worse than average.
There’s one important methodological stipulation here. These are our mean-reverted house effect estimates — basically what we’d expect the numbers to look like over the long run if a pollster surveyed the race every day. If a pollster has only conducted a handful of polls, its house effects are discounted heavily toward the mean — whereas we can get a more precise estimate for firms like Morning Consult that poll constantly. For nonpartisan polls, house effects are reverted toward a mean of zero. For polls with partisan sponsors[1], we basically treat them as “guilty until proven innocent” or “biased until proven unbiased”. Presidential polls with partisan sponsors typically exaggerate their candidate’s margin by about 3 percentage points, and the model accounts for this.
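If it helps to see the shrinkage idea in code, here's a minimal Python sketch. The pseudo-count prior and all of the example numbers are our illustrative assumptions; the model's actual math is more involved, but the direction of the effect is the same.

```python
def mean_reverted_house_effect(raw_effect, n_polls, prior_mean=0.0, prior_strength=10):
    """Shrink a raw house-effect estimate toward a prior mean.
    With only a few polls, the prior dominates; with many polls, the data do.
    prior_strength is an assumed pseudo-count, not the model's real parameter."""
    return (n_polls * raw_effect + prior_strength * prior_mean) / (n_polls + prior_strength)

# Nonpartisan firm with only 5 polls: a raw +3.0 lean is discounted to about +1.0.
print(mean_reverted_house_effect(raw_effect=3.0, n_polls=5))
# Prolific nonpartisan firm with 60 polls: a raw +1.5 lean mostly survives (about +1.3).
print(mean_reverted_house_effect(raw_effect=1.5, n_polls=60))
# Partisan-sponsored pollster: reverted toward a roughly +3-point pro-sponsor mean instead of zero.
print(mean_reverted_house_effect(raw_effect=1.0, n_polls=3, prior_mean=3.0))
```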
But this chart shouldn’t surprise readers who follow the polls closely. For example, polls from Rasmussen Reports nearly always lean toward Republicans, so they have a large house effect of -2.6, meaning we expect Harris’s margin in their polls to come in 2.6 points lower than it does in the average poll. Trafalgar Group (-2.7), Spry Strategies (-2.3), and InsiderAdvantage (-0.8) also have Republican house effects.
On the other hand, Ipsos polls tend to be more Democratic than average (house effect of +1.9), as do Public Policy Polling (+1.4) and Morning Consult (+1.2).
And what about the frequently discussed (and frequently unskewed) New York Times/Siena College poll? It has a house effect of -0.06 — in other words, basically no house effect at all. Indeed, most of our highest-rated firms have little net house effect.
This is partly because of how we design our polling averages. In calculating house effects, we basically look at how a firm’s polls compare to the trendline of other polls from that state (or compared to other national polls). This involves an iterative process: we calculate the trendlines, then calculate the house effects based on the trendlines, then recalculate the trendlines with adjustments for house effects, then calculate a more refined version of house effects from the recalculated trendlines, and so on. In this process, the more highly rated, nonpartisan firms serve as essentially the center of gravity: the “true” values against which other firms are compared.
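Here's a stripped-down sketch of that iterative loop, assuming a flat average as the "trendline" and equal weight for every firm. Both are simplifications on our part; the real model fits a time trend per state and anchors on the highly rated, nonpartisan firms. The example polls are made up.

```python
def estimate_house_effects(polls, n_iter=25):
    """Alternate between a trendline estimate and per-firm house effects.
    Big simplifications: the 'trendline' is just a flat average of adjusted
    margins, and every firm counts equally; the real model fits a time trend
    per state and leans on the highly rated, nonpartisan firms as its anchor."""
    effects = {p["firm"]: 0.0 for p in polls}
    trend = 0.0
    for _ in range(n_iter):
        # Step 1: trendline from house-effect-adjusted margins.
        adjusted = [p["margin"] - effects[p["firm"]] for p in polls]
        trend = sum(adjusted) / len(adjusted)
        # Step 2: each firm's house effect is how far its raw polls sit from that trend.
        for firm in effects:
            resid = [p["margin"] - trend for p in polls if p["firm"] == firm]
            effects[firm] = sum(resid) / len(resid)
    return trend, effects

polls = [
    {"firm": "Rasmussen", "margin": -2.0},   # Harris margin, in points
    {"firm": "Rasmussen", "margin": -3.0},
    {"firm": "Ipsos", "margin": 2.5},
    {"firm": "Ipsos", "margin": 1.5},
    {"firm": "Siena", "margin": 0.0},
]
trend, effects = estimate_house_effects(polls)
print(round(trend, 2), {f: round(e, 2) for f, e in effects.items()})
# -0.2 {'Rasmussen': -2.3, 'Ipsos': 2.2, 'Siena': 0.2}
```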
What does this mean for the forecast? Not every R +1 poll that enters the model is actually treated like an R +1 poll. If we get an R +1 Rasmussen poll, for example, the model thinks about it more like a D +2 poll due to Rasmussen’s large house effect. But an R +1 NYT/Siena poll is treated pretty much like an R +1 result.
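In code, that backend adjustment is roughly this simple. The sign convention is our assumption; the house-effect figures are the ones quoted above.

```python
# House effects on Harris's margin, using the figures cited above.
HOUSE_EFFECTS = {"Rasmussen Reports": -2.6, "NYT/Siena": -0.06}

def adjusted_margin(firm, raw_harris_margin):
    """Subtract the firm's house effect so a firm that consistently leans one
    way is pulled back toward the field; unknown firms pass through unchanged."""
    return raw_harris_margin - HOUSE_EFFECTS.get(firm, 0.0)

print(adjusted_margin("Rasmussen Reports", -1.0))  # about +1.6: an R+1 poll reads closer to D+2
print(adjusted_margin("NYT/Siena", -1.0))          # about -0.94: treated essentially as-is
```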
Still, there are sometimes elections in which the more highly-rated polls tend to say one thing and the mediocre ones say another. Higher-quality polls were usually more favorable to Barack Obama in 2012 than lower-rated ones, for instance. But 2024 isn’t one of those elections. Among our two highest-rated pollsters, NYT/Siena has had some bad data for Harris recently, while Selzer just published some excellent numbers for her in Iowa (only down 4 to Trump). Polling averages that only took numbers from the top firms — and had a consistent process for this, not cherry-picking — would say pretty much the same thing as our averages do.
There’s one last wrinkle. The house effects shown above are for Harris’s margin, but we technically calculate separate house effects for Harris’s and Trump’s vote shares because some pollsters show more undecided voters than others. So for example, YouGov actually has negative house effects for both Harris (-0.6) and Trump (-0.5), meaning their polls show lower support for both candidates (and more undecided or third-party voters) than the average poll. But there’s little net house effect in YouGov polls. Other firms like Marist have positive house effects for both candidates.
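As a quick check on that arithmetic, using the YouGov figures above:

```python
def net_margin_effect(effects):
    """Net house effect on the Harris-minus-Trump margin, computed from the
    separate per-candidate vote-share house effects."""
    return effects["harris"] - effects["trump"]

# YouGov shows lower support for both candidates (more undecided/third party)...
print(net_margin_effect({"harris": -0.6, "trump": -0.5}))  # -0.1: essentially no net lean
```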
The takeaway here is that we’d encourage you to trust the process. Even seeming “outlier” polls can provide critical clues as to how the race is trending. But if a pollster consistently leans toward Trump or Harris, we adjust and account for that.
[1] Note that it’s the sponsor of a poll that matters in this calculation, not the polling firm — some firms conduct polls both for partisan sponsors and for nonpartisan ones. The exception is Trafalgar Group, which we treat as inherently Republican because they have a history of failing to disclose who they’re conducting polls for.