36 Comments
LD:

Here's a plea for the Silver Bulletin to directly address one question that Nate Cohn, who apparently oversaw the NYTimes polls, flagged as critical prior to the election -- the growing trend of pollsters tweaking their raw data based on recalled vote -- i.e., weighting responses based on whether people recalled voting for Trump or Biden in 2020 so that the sample matches the actual 2020 vote percentages in each state.

Given the relatively poor performance of the NYTimes, which decided not to join that trend, it would seem that the decision to weight on recalled vote was a good one, and the practice will likely be even more widespread in future election years. To the extent that the Silver Bulletin is able to determine which pollsters used the recalled-vote method: (1) is that a correct conclusion; and (2) what does the SB think of that trend?

Eli McKown-Dawson:

I'm very interested in this question! Definitely hoping to write more about it going forward. Food for thought right now: even in 2024 recalled vote wasn't a panacea. For example, YouGov (who have been weighting on recalled vote for a long time) also performed relatively poorly this cycle.

Thoughts About Stuff:

Note to Eli / Nate: your charts say “Combied” instead of “Combined”.

Eli McKown-Dawson:

Fixed! It wouldn't be a Silver Bulletin article without at least one obvious typo.

Thoughts About Stuff:

I've been reading Nate's work since the beginning and this is a true constant of his work! 😂

Eli McKown-Dawson:

I guess I'm a good fit here, because the Combied typo was my doing.

Thoughts About Stuff:

I guessed that; that's why I mentioned you first. But Nate is no better lol. Good article btw, ty.

aphyer:

Wouldn't "lower average error but a consistent systematic error" be what we would see if pollsters were doing huge amounts of herding?

Kyle Belcher:

When is Nate going to acknowledge that polls are not IID at the state level and should be using nonparametric assumptions instead of parametric assumptions?

Jay Arr Ess:

Nonparametric assumptions where and in like what, their calculations of standard errors? Standard errors where?

Their voting simulations are obviously built off this idea; they assume correlated polling errors.

If you're going to make a nerdy stats critique, be precise about what your critique is. There's folks who would love to talk nerdy stats with you, but you gots to gives us more to work with.

Kyle Belcher:

Polling at the state level should be using nonparametric assumptions. This is obvious if you try to prove that polls are independent, identically distributed random samples -- they aren't. This significantly increases the error range the polls report, which should have a significant effect on Nate's aggregator model. It also nicely explains why a highly rated pollster (the Selzer one) reported such an outlier right before the election. You would only expect an outlier that large (especially now that we have the election results) under nonparametric assumptions.

Does this clarify things for you?

Jay Arr Ess:

Thanks for explaining what you're taking issue with. Yes, the issue you're taking is now clear to me!

Now my remaining uncertainty: what brings you to the conclusion that Nate is assuming polling errors at the state level are IID?

I'm not trying to be a jerk here (I apologize that my reply looks a little snarky on reread; that hadn't been my intent), because it seems like we've come to different conclusions from the same available information.

It's often me who's wrong on things like this (and I will apologize if you point me to a footnote that lays it out somewhere), but I have been under the impression that the polling simulations that Nate runs assume correlated errors across states.

How exactly one assumes the errors are correlated is a modelling assumption, and at some point "nonparametric" is another word for "lots and lots and lots of parameters of how I'm binning stuff and how many bins I'm making," but I'd had the impression that there had likely been some stress testing of this.
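
For concreteness, here's a toy version of what "correlated errors across states" does in a simulation. The margins, error size, and correlation below are all made up; this is just an illustration of the structure, not Nate's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical polling-average margins (D minus R, in points) in three swing states.
margins = np.array([1.0, 0.5, -0.5])

n_sims = 100_000
sigma = 3.5   # assumed total polling error per state, in points
rho = 0.8     # assumed correlation of errors across states

# Correlated errors: one shared national component plus state-specific noise.
shared = rng.normal(0, sigma * np.sqrt(rho), size=(n_sims, 1))
idiosyncratic = rng.normal(0, sigma * np.sqrt(1 - rho), size=(n_sims, len(margins)))
correlated = margins + shared + idiosyncratic

# Independent (IID-style) errors with the same per-state spread, for comparison.
independent = margins + rng.normal(0, sigma, size=(n_sims, len(margins)))

def sweep_prob(outcomes):
    """Chance that one candidate carries all three states."""
    return ((outcomes > 0).all(axis=1) | (outcomes < 0).all(axis=1)).mean()

print("P(sweep), correlated errors: ", sweep_prob(correlated))
print("P(sweep), independent errors:", sweep_prob(independent))
```

With correlated errors, a sweep of all three states is far more likely than the independent-errors math implies, which is the whole reason for modelling the correlation.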

Slaw:

"When the polls are that close, their main utility is to tell you that the race is uncertain and that things could break either way."

That's what the polling says. Whether or not that's actually reflective of the underlying reality is much more complicated.

Oliver Moldenhauer:

Great work. But please refrain from clickbait headlines. Why do you write "By one important measure, they performed better than ever. By another, they had their worst year yet." when you could have written something like "low average error, high systematic error"? I find this habit of many news websites deeply annoying and would love for you to do better.

Nate:

He's written about the importance of having attention-grabbing headlines and pictures on Substack. I think the headline he used is significantly better in that regard than the alternative you proposed.

Jay Arr Ess:

Nicely done, Eli!

I totally get why you're writing "For all intents and purposes, such races [ones within 3%] should be treated as toss-ups," but the nerdy nerdy part of me is saying, "Well, betting is one purpose and one intent IS to make money on margins, and for that particular intent and purpose, a 3% race that gives you a 56% rather than a 50% probability of success, spread over a whole bunch of elections, is an edge that some people are likely very interested in."

But for the sake of the flow of your post and the point you're trying to get across, it's probably against your bigger purpose to actually be precise there. The nerds like me should know that you're making a point, and the non-mega-nerds should just go ahead and take you at your word.
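
(Back-of-the-envelope, assuming hypothetical even-money odds purely for illustration:)

```python
# Rough expected value of repeatedly betting a 56% proposition at even money.
p_win = 0.56
stake = 1.0

ev_per_bet = p_win * stake - (1 - p_win) * stake
print(f"expected profit per $1 bet: ${ev_per_bet:+.2f}")    # +$0.12

n_races = 100
print(f"expected profit over {n_races} such bets: ${ev_per_bet * n_races:+.2f}")
```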

PJ Cummings:

Thanks Nate and Eli. Your statistical explanations are very interesting. Helps cut through a lot of poor/biased journalism during election cycles.

Aaron C Brown:

I'm interested in the effect of sample size on poll accuracy. I broke the first table in the post down by below median sample size (657) and above median. You'd expect a given point margin to be a better predictor of the winner with a larger sample size, at least if statistical sampling noise were the main source of poll errors.

In fact we see a slight -- not statistically significant -- reverse effect for poll margins of less than 6 points. With a poll margin under three points, small polls predict the winner 56.4% of the time, while large polls are right only 55.7% of the time; from 3 to 6 points the figures are 70.7% for small polls and 67.7% for large polls. While it's plausible (p-value 13%) that small and large polls are equally predictive, there certainly seems to be no advantage to trusting larger-sample polls in the closer races. When the poll margin exceeds 6 points, the larger-sample polls do better, as expected.

Looking at this effect another way, suppose we treated each poll as having only pure statistical error -- as if it were a perfect random sample of the actual ballots cast, with no noise from counting people who don't end up voting, people who vote differently from what they tell the pollsters, and non-random sampling; but also no improvements due to techniques like weighting, stratified samples and importance sampling. In that case we can convert the poll margin into a theoretical chance of the leading candidate winning.

If the theoretical probability is less than 70%, then there is always less than a three point lead. But for 70% to 80% theoretical probabilities it could be a large poll showing a lead of less than three points, or a smaller poll showing a lead of 3 to 6 points. In the first case there's a 58.1% chance of the leading candidate winning, in the second case 63.2%. Of course both are substantially less than 70%, which we know, because poll errors are much larger than pure statistical noise. But my point is that for a given theoretical probability, you should put more trust in a small poll showing a bigger point lead than a large poll showing a smaller lead.

Similarly with theoretical probabilities from 80% to 90%, large polls showing 3 to 6 point leads get the result right 70.5% of the time, versus 73.3% for smaller polls showing 6 to 10 point leads.

All of the above results are for combinations with at least 100 polls. Only for 90% to 99% are there more than 100 examples for more than two point ranges. The largest polls with 90% to 99% theoretical probability on point leads of 3 to 6 points get the winner right 71.2% of the time. Somewhat smaller polls with 6 to 10 point leads are right 86.4%, 10 to 15 point leads are right 93.8%, 15 to 20 point 97.1% and over 20 points 100%.

Finally when theoretical probability is over 99%, 15 to 20 point leads show the winner 99.6% of the time and over 20 point leads 100%.
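
For anyone who wants to replicate this, the conversion I'm describing is roughly the following sketch, under the pure-sampling-noise assumption; the exact formula one uses can differ in the details:

```python
from math import erf, sqrt

def theoretical_win_prob(margin_pts, sample_size):
    """Chance the poll leader is really ahead if the only error were pure
    random-sampling noise in a two-candidate race."""
    m = margin_pts / 100.0
    # Variance of the estimated margin for a simple random sample with
    # two-party shares (1 + m)/2 and (1 - m)/2 is (1 - m**2) / n.
    se = sqrt((1.0 - m * m) / sample_size)
    z = m / se
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))   # standard normal CDF

# e.g. a 3-point lead in a median-sized (657-respondent) poll:
print(f"{theoretical_win_prob(3, 657):.1%}")   # roughly 78% under pure sampling noise
```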

Jay Arr Ess:

My gut goes with survivorship bias in firms that run small polls.

What do I mean: if small polls are more likely to be run by small firms, then the variance in FIRMS that are running small polls is going to be high. There are lots of small firms out there, and lots of variability in how they're run.

By random chance, small firms will sometimes be very successful and sometimes very unsuccessful. Not because their methods are so great, but because there's a lot of them. If their polls are basically a coin flip, say, then with a lot of small firms, there are going to be some small firms that always land Heads, which in this case means "their polls just happen to predict the result well."

Now, if those small firms that are luckily successful are more likely to survive in the polling world, that would give you your answer.

But it would also tell you to be skeptical in ranking small firms highly: some might actually have good methods, but some might just look good because they've been lucky for a long time.
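
A quick toy simulation of that survivorship story (every number made up): give a pile of zero-skill firms a coin-flip track record, keep only the ones that happened to look great, and watch what happens next.

```python
import random

random.seed(1)

N_FIRMS = 2_000
PAST_POLLS = 20     # track record we get to observe
FUTURE_POLLS = 20   # what happens after we "rank" the firms

def hit_rate(n_polls):
    """Share of polls that call the winner, for a firm with zero skill (pure coin flip)."""
    return sum(random.random() < 0.5 for _ in range(n_polls)) / n_polls

past = [hit_rate(PAST_POLLS) for _ in range(N_FIRMS)]
future = [hit_rate(FUTURE_POLLS) for _ in range(N_FIRMS)]

# "Survivors": firms that happened to look great on their past record.
survivors = [i for i in range(N_FIRMS) if past[i] >= 0.75]

print(f"{len(survivors)} of {N_FIRMS} zero-skill firms look great in hindsight")
print("their past hit rate:  ", round(sum(past[i] for i in survivors) / len(survivors), 3))
print("their future hit rate:", round(sum(future[i] for i in survivors) / len(survivors), 3))
```

Their past records look stellar; their future records drift right back to a coin flip.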

Open to other interpretations, but that's one that I find plausible.

Aaron C Brown:

That makes sense, certainly as a reason for skepticism, although I know of no evidence. I actually find it more plausible in reverse -- from my experience in polls for science. Some people do good jobs of stratifying, finding good respondents, getting accurate responses and exploiting prior information. They find small samples adequate, because pure statistical noise is a small part of the error, and money is better spent on things other than more samples.

Other people get bad results and increase sample size in a vain attempt to solve the problem.

But either way, it pays to be skeptical about inferences that could be affected by the universe of polls chosen.

Jay Arr Ess:

For polling firms in particular, I do not know. But I agree on the conclusion!

This overall question of "is smaller better?" comes up a lot in economics research, in both:

- Class sizes. Are smaller class sizes necessarily better, and if so by how much? The best research is a couple of studies that randomly assign class sizes, but those are hard to implement because engaged parents often manipulate things to get their kids into the small classes, even if they were assigned to the larger ones... and kids with engaged parents tend to be doing pretty well anyways. We tend to think the effects are positive, but probably small for some reasonable range of class sizes, and then we have to ask if the funds would be better spent elsewhere.

- Should we want to help "small businesses"? The small businesses that survive are sometimes just better run, that's true, but generally it's larger firms that are more productive. Yet there's all this political support (across the board) for small businesses that's really based on observing the small businesses that WERE successful. Many economists tend to think that these are more likely to be the lucky firms than the truly "productive" firms. But then that opens up the question of where the big firms come from...

The latter is an area of open research in developing economies: the question of whether you should subsidize small firms or big firms is a question of how you think economies grow. If you get it wrong, you're funnelling a very limited government budget into something that might not be helping your economy much.

Mr. Myzlpx:

Nate/Eli…

To be honest, I did not read the entire article, so you might have already written what I'm about to say -- or disproved it.

I’ve mentioned this in comments in the past. To me, it’s 100% obvious why the posters continually overestimated Democrats and underestimated Republicans. The answer is, it has been, for quite a long time, totally unacceptable in educated circles to say you were for Trump. it would cause social ostracism, you’d lose friends, and some parts of your life would be canceled. As a result, many Trump supporters either did not answer poles, or they lied. It wouldn’t take very many people who did that to skew the results quite a bit.

Let’s say a typical pole has 1500 people. If only 10 of them come out of 1500, lied and said they were for Biden/Harris instead of saying the truth-that they were for Trump that would be a swing of 20 people in a 1500 person sample. That all by itself is a 1.3% change in results. and I would bet the number of “liars“ is much bigger than that. Add to that my empirical belief that a lot more Republicans than Democrats simply would not answer poles. And, all of a sudden, the posters bias toward Democrats is clear.

Only a few more liars, added to the Republicans who wouldn't respond to polls, would clearly create that Democratic bias.

Please do not underestimate how many Republicans, especially Trump Republicans, were literally fearful of losing their friends and/or getting canceled in some other part of their lives as well.

And, as I said, it wouldn't take very many of those people to totally bias the poll results.

Robby Trail:

Lol read the article if you want someone to read your comment and answer your question

Mr. Myzlpx:

Robby,

I actually read most of it, not all of it. It was beginning to get a little confusing.

Pollsters worked their tails off to overcome the bias of 2016 and 2020.

But they failed because there was still bias.

OTOH, they were really good because their bias was less than it was in each of the past two elections.

Etc.

IMO, despite the pollsters' best efforts to overcome their past bias in favor of Democrats, they simply did not understand that many Republicans, for fear of retribution, would lie to their friends and pollsters about their true intent (to vote for Trump) and/or simply decline to participate at a higher rate than Democrats did.

With typically only 1,500-1,800 people in a poll, it takes a shockingly low number of liars and non-participants to create a significant distortion of reality.

Brian:

The shy Trump voter theory was popular after the 2016 election and is still discussed, though it was less popular after the 2020 election. Nate has addressed the shy Trump voter periodically over the years, and while Nate or Eli might correct my representation, the collective data from the polling industry argues strongly against a shy Trump voter effect in the 2024 election cycle.

Perhaps you live in an indigo blob bubble where your social circle would frown upon a public Trump supporter, but particularly in swing states, this social phenomenon has almost completely disappeared and the opposite has come to pass. A person coming out publicly in support of Trump is likely to gain new friends and acquaintances from fellow Trump supporters who were not as public. Sure, they may lose some social capital in some areas of their life, but they will also gain social capital.

I have a different theory to explain why fewer Trump supporters might be responding to surveys, making it difficult for pollsters to fix with voter-turnout-model weighting. The Republican Party, and even more so the MAGA movement, are profoundly skeptical of experts, elites, and public institutions, and this distrust is reflected in a reduced willingness to participate in surveys. The voters most likely to avoid pollsters are those who don't trust the integrity of poll results and likely still believe the 2020 election was stolen. These are the type of voters who would rather hang up on a pollster, or not click a text or online poll link, than risk their responses being fraudulently manipulated by a lamestream media company.

Mr. Myzlpx:

Brian...

I like your "model" of why Trumpers will not respond to pollsters. I'm sure there are some Dems who act the same way, but my instinct says that, per your model -- i.e., huge distrust of experts and institutions on the right -- Reps are significantly overweighted in the group that hangs up on pollsters.

I still believe a lot of people lie about their political affiliations. But that's my belief.

The startling thing is how even a relative handful of "liars", added to the non-responders or responders who won't disclose their party affiliation, can really create a significant difference in poll results among the 1,500 to 1,700 people in a typical poll.

Caleb Begly:

Question for Nate/Eli about the congressional race call percentage success rate -- is that 60% just among contested races (that have, for example, at least 5 polls), or is that among all races? The reason I ask is because around 80% of congressional races are uncompetitive -- so solidly red or blue that whoever wins the corresponding primary will win the general election with a high degree of certainty. So if polls are only correctly calling 60% of all races, that's substantially worse than just looking at which party won the seat the last three times in a row.

Tokyo Sex Whale:

How does poll accuracy across cycles compare with presidential primary polls excluded?

Tamritz:

Nate is Jewish but refused to learn from the Israeli experience. Pollsters have never overestimated Shas’s strength in Israeli elections. Why? Because it is the party of the uneducated lower classes—or, without political correctness, the party of low IQ. Since 2016, the Republicans have been the party of low IQ. I explained this in advance, even before the elections.

https://tamritz.substack.com/p/i-bet-on-trump

Matthew:

Where is Patriot Polling ranked?!

Bullfighter:

Errr...the Elephant in the Room is actually a Whale. I find it shocking that any serious examination of polling performance fails to mention The French Whale. This is the bettor who outsmarted every single professional pollster by figuring out a very simple method to overcome the Trump underestimation bias. Of course, as the authors point out, this was the single largest failing of the polls in 2024, 2020, and 2016.

https://www.thefp.com/p/french-whale-makes-85-million-on-polymarket-trump-win

One would think that if a gambler with absolutely zero polling experience found a way to outsmart me on the single most important variable of the last three elections, I would be shamed into a fundamental rethinking of everything that I do. But instead we get yet one more technocratic, self-serving soliloquy on how "some things were good, and some things were bad". I'm sorry, but no matter how often you choose to believe you did a good job, nothing can change the fact you missed, for the third consecutive election, the headline story.

Maybe fourth time is a charm?

Jay Arr Ess:

If there are millions of people out there using all sorts of idiosyncratic methods to make predictions, there are going to be some who just get lucky and get the answer right.

Just like if you flip millions of coins hundreds of times, some of those coins will land heads every time. It's not because they're superior coins, it's just because you're flipping a whole lot of coins.

If your coin is unbiased, the fact that one of those particular coins, call it Frank, flipped heads 299 times still doesn't tell you anything about the three hundredth flip.

Now, if you've got a very compelling reason for why this particular coin Frank was so good at landing Heads (i.e. you're taking issue with the assumption that the coin currently called Frank is unbiased), then that's a different story.

But you would need some backing up on that because the prior, that there are lots of people making lots of bets out there and sometimes someone will get lucky, should be pretty strong.

Bullfighter:

Understood and acknowledged. I would argue that the "Frank" here is anything but a random bettor out there flipping his coin. The French Whale was perhaps the single largest bettor across all the betting markets - he won $85 million off his election bet on Trump. If you read about his methodology, it was simple, but brilliant. The fact that hundreds/thousands of professional pollsters failed to replicate his methodology is a serious indictment of the polling industry writ large. It speaks to me of narrow thinking within the echo chamber of the industry - everyone seems to move in a pack. It took a complete outsider to break through and solve the one single problem that has vexed the pollsters for three consecutive elections - the Trump underestimation bias. The fact that a complete unknown, with zero experience in polling, solved the problem that no one in the industry could figure out speaks to me of a fundamental flaw in the industry.

What is particularly perplexing to me is the ongoing propensity of the pollsters, when engaged in self-reflection about their performance, to cling to statistical nuance (such as margins of error) to rationalize their performance in the past three election cycles. While they are technically correct, they are missing the forest for the trees. The reason that the public's faith in the polling industry is at an all-time low is because the polls have fundamentally missed the headline result in each of the last three elections:

2016: Polls say Hillary wins big WRONG

2020: Polls say Biden wins huge WRONG

2024: Polls say too close to call WRONG

In each of those elections, if you parse the statistical minutiae fine enough, you could probably conclude that the results were within a reasonable margin of error vis-a-vis the polls. The pollsters and data geeks can then, with smug satisfaction, declare that they did a pretty good job. But the general public, quite honestly, does not give a rat's ass about standard deviations and margins of error. They rely on the polls to give them a "headline" understanding of which way the wind is blowing. And in each of the last three presidential elections, the polls suggested the wind was blowing east, when in fact it was blowing west. So while a professional pollster may argue that they are not in the business of predicting outcomes -- only probabilities of outcomes -- that distinction is lost on 99% of the consuming public. I have yet to see even a sliver of recognition by Nate, or his ilk, of this reality.

E.O.R. (end of rant)

Jay Arr Ess:

I don't have access to the whole article, but what I read was actually a pretty common tactic in academic surveys.

Basically when there's a social stigma against something (e.g. the "shy Trump voter" theory), one thing you can do is ask people what they think other people are doing.

To take a non-US example, there are lots of campaigns by western researchers to reduce female genital mutilation in parts of western Africa. These folks realize that the western researchers obviously want to reduce this practice, so if you ask them, they're not going to be like "yeah I want that for my daughter."

If you have a nice random sample and all that, say you have 2000 people. Of those 2000, maybe you ask 500 what they did with their daughters. You ask another 500 what they think their neighbors would do with their daughters / what their neighbors did with their daughters. You ask another 250 both questions, changing the order between the two (just in case you're worried that you might be "priming" them to suspect your tricks). And you ask another 500 all your regular baseline-characteristics questions (what's your income, how many goats ya got), without these "sensitive questions."

And then you compare the balance between these groups, controlling for numbers of goats etc if you wish. If there's a serious difference, then your spidey sense says hey, hmm, maybe there's something making people nonresponsive.
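
A minimal sketch of that comparison with invented numbers, just to show the mechanics; the rates below are purely hypothetical:

```python
import random

random.seed(2)

# Toy version of the split-sample design described above (every number is invented).
TRUE_RATE = 0.40    # actual prevalence of the stigmatized behavior/preference
ADMIT_RATE = 0.60   # share of "yes" respondents willing to admit it when asked directly

direct_group = [random.random() < TRUE_RATE and random.random() < ADMIT_RATE
                for _ in range(500)]
# Toy assumption: people answer the "what about your neighbors?" question candidly,
# so the indirect question tracks the true rate.
neighbor_group = [random.random() < TRUE_RATE for _ in range(500)]

direct_est = sum(direct_group) / len(direct_group)
neighbor_est = sum(neighbor_group) / len(neighbor_group)

print(f"direct-question estimate:   {direct_est:.1%}")
print(f"neighbor-question estimate: {neighbor_est:.1%}")
print(f"gap (red flag for social-desirability bias): {neighbor_est - direct_est:.1%}")
```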

I don't think that's revolutionary as a concept. If pollsters aren't at least occasionally doing it for cases where they suspect there are stigmas against response, then, yeah, that's a knock.

I don't have the data on what pollsters are doing. But if some enterprising folks are not using a fairly common method to try to do this sort of correction, I'd be surprised.

Bullfighter:

Yes, you nailed exactly what the French Whale did. And I wholeheartedly agree that it is not a revolutionary idea. I don't know exactly what the pollsters are doing either, but from what I've been able to discern, none of the major polls used the methodology the French Whale did. At a minimum, he got it right and ALL of them got it wrong, so one has to assume that the pollsters missed it. Not once. Not twice. But THREE consecutive times. And it's not like the problem was undiagnosed -- every pollster knew that the "shy Trump voter" was the elusive problem to be solved. With that foreknowledge, they were still unable to get to the bottom of the problem, unlike the French Whale. For my money, this requires a deeper level of self-examination than what Eli and Nate have offered thus far.

Comment deleted

aphyer:

One thing you can do here is look at state-level figures.

For instance, even if you think there's been a lot of voter suppression going on (in either direction), it seems unlikely that NY/NJ/CA have been suppressing left-wing voters. This means that large red swings in those states probably indicate something real going on.

What you'd see if one side were engaging in lots of voter suppression would be that side running up extremely large leads in states it had control over.
