Hey Nate, could you clarify what got the banned pollsters banned in the first place? Not looking for an itemized list explaining the offense of each one, just trying to understand what results in a ban.
At least a couple of the banned firms were found to be simply and literally making up numbers. (Wow, the R2K scandal was more than a decade ago; I am too old.)
The policy at FiveThirtyEight from before he left is still available online.
TL;DR: betting on politics and data manipulation, among other issues.
https://fivethirtyeight.com/features/polls-policy-and-faqs/
One thing about the tables is that when sorting by grade, A+ gets sorted after A, A-, and A/B. Not sure if there’s anything that can be done about that. Maybe adding spaces between the A and the plus?
Yeah I noticed this as well. There are a ton of ways around that issue though.
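For instance, mapping grades to explicit ranks in the data before it reaches the table (a minimal Python sketch; the grade list is an assumption, since Datawrapper just sorts the displayed strings):

```python
# Hypothetical grade ordering -- extend with whatever grades the table uses.
GRADE_ORDER = ["A+", "A", "A-", "A/B", "B+", "B", "B-", "B/C",
               "C+", "C", "C-", "C/D", "D+", "D", "D-", "F"]
RANK = {g: i for i, g in enumerate(GRADE_ORDER)}

pollsters = [("Pollster X", "A/B"), ("Pollster Y", "A+"), ("Pollster Z", "A-")]
pollsters.sort(key=lambda p: RANK.get(p[1], len(GRADE_ORDER)))  # unknown grades last
print(pollsters)  # [('Pollster Y', 'A+'), ('Pollster Z', 'A-'), ('Pollster X', 'A/B')]
```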
Re: Datawrapper. There’s a third audience besides email and web; there’s the iOS/iPadOS app. I use all three. Totally understand that features can’t make it into email, but I hope the fancy stuff carries over to the app which I assume is more-or-less a dedicated browser.
You should be able to access the rating directly here, hopefully this works on iOS: https://datawrapper.dwcdn.net/qUTnp/2/
If I look at the pollsters rated some type of A (A+, A, A-, or A/B), I find that 800 of the 3,832 election results fell outside the margin of error. The MOE here is taken to be 1.96 divided by the square root of the sample size--the 95% error on the two-candidate margin at a 50/50 split, i.e., twice the 0.98/sqrt(n) error on a single candidate's share. That would give a 95% confidence interval if the poll were conducted by taking a perfect random sample of actual ballots cast, in which case we would expect about 192 of the 3,832 results (5%) to fall outside the poll MOE.
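To make that arithmetic concrete, a minimal sketch (the counts are the ones quoted above; the helper is just the definition restated):

```python
import math

def moe_95(n):
    # 95% MOE as defined above: 1.96 / sqrt(n), i.e. the error on the
    # two-candidate margin at a 50/50 split (twice the 0.98/sqrt(n)
    # error on a single candidate's share).
    return 1.96 / math.sqrt(n)

results = 3832             # poll results from A-rated pollsters (quoted above)
outside = 800              # observed results outside the MOE (quoted above)
expected = 0.05 * results  # ~192 if the MOE were a true 95% interval

print(f"MOE at the median n=647: +/-{moe_95(647):.1%}")
print(f"expected outside: {expected:.0f}, observed: {outside}")
```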
Of course, true polls suffer from three inherent difficulties that inflate errors--the people polled might not vote, they might vote differently from what they tell the pollster, and the sampling might not be perfectly random. On the other hand, sophisticated polls do many things--stratified sampling, importance sampling, weighting, etc.--to improve results over a simple random sample weighted equally.
So one observation is that the inherent problems far outweigh both the simple statistical noise and the sophisticated attempts at improvement.
More important, the inherent errors are probably independent of sample size, and the value of sophisticated improvements declines with sample size. Therefore quoting the MOE as a function of the inverse square root of the sample size likely exaggerates the accuracy of large-sample polls relative to small-sample ones.
That turns out to be true in the data, although to a lesser extent than I would have expected. Polls below the median sample size (647) exceeded the MOE 22% of the time; polls at or above the median exceeded it 29% of the time. My experience is with general science polling rather than political polling, and in that area larger sample sizes generally go along with lower-quality data and lower overall research quality. A medical study of 20 patients based on actual hospital records and interviews with patients and healthcare providers can be more reliable than a random telephone survey of 100,000 people.
Another point is that rating polls by actual error has some academic value, but for most practical purposes we care about whether or not the poll got the result right. In a bit over half the polls (6,661), the result was indeterminate using the MOE, meaning that if the MOE were a true 95% confidence interval, both candidates had better than a 5% chance of winning. In the other 5,645 polls, one candidate was better than a 95% favorite according to the MOE.
In the indeterminate races, if the MOE were a true 95% confidence interval, the candidate leading in the poll should have won 81% of the time. In fact, she won only 66% of the races. That still shows the polls are useful, but at least on average, users should apply a larger MOE than one computed mechanically from sample size.
If one candidate looks like a 95%-or-better favorite based on the MOE, he does in fact win 95% of the time. However, if you took the MOE literally, he should have won 99.9999% of the time.
My general take: if a good-quality poll shows one candidate as a 95% favorite based on the mechanical MOE, he probably is a very strong favorite, but don't bet on him at longer odds than 20:1, whatever the poll says (you might have additional information to justify longer odds). If the poll shows a result within the MOE, a rule of thumb of squaring the nominal probability (so an 80% chance becomes a 64% chance) might get you in the ballpark.
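Those two rules of thumb as code (a hypothetical helper, nothing more than the heuristics above restated):

```python
def adjusted_win_prob(nominal, inside_moe):
    # Rules of thumb from above: square the nominal probability when the
    # race is within the MOE; otherwise cap the favorite at 95% (20:1).
    return nominal ** 2 if inside_moe else min(nominal, 0.95)

print(f"{adjusted_win_prob(0.80, inside_moe=True):.2f}")   # 0.64
print(f"{adjusted_win_prob(0.999, inside_moe=False):.2f}") # 0.95
```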
I forget where I read this, but generally the real error is about double the poll's sampling error. And of course the sampling error refers to one candidate's vote share, not the margin between candidates, so you have to double again if talking about margins.
If a poll has Gen. Eric Democrat up 50-42 over Gen. Eric Republican with a listed MOE of +/- 4, the true 95% confidence interval spans anywhere from 58-34 *Dem* to 50-42 *Rep*. The Dem's 8-point lead is then about one sigma, so Gen. Eric Democrat wins about 84% of the time.
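Checking that arithmetic with Python's standard-library normal distribution:

```python
from statistics import NormalDist

lead = 8                    # Dem 50 - Rep 42
margin_moe = 4 * 4          # listed +/-4, doubled for real error, doubled for the margin
sigma = margin_moe / 1.96   # ~8.2 points
print(f"{NormalDist(0, sigma).cdf(lead):.0%}")  # ~84%: roughly a one-sigma lead
```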
I'd put it differently. There are non-sampling sources of error that might typically be about equal to the sampling error in polls of a few hundred voters--but they don't decline with sample size. So they could be twice the sampling error in polls with 1,000 or so voters, and four times the sampling error in polls with 4,000 voters.
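That scaling is easy to verify (a sketch; calibrating the two error sources as equal at n = 250 is my stand-in for "a few hundred voters"):

```python
import math

C = 1.0                        # sampling error = C / sqrt(n), arbitrary units
SIGMA_NS = C / math.sqrt(250)  # fixed non-sampling error, equal to sampling at n=250

for n in (250, 1000, 4000):
    sampling = C / math.sqrt(n)
    print(f"n={n:>4}: non-sampling is {SIGMA_NS / sampling:.0f}x sampling; "
          f"total error is {math.hypot(sampling, SIGMA_NS) / sampling:.2f}x sampling")
```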
The non-sampling sources are not the same for all polls--there are ways of reducing them. For example, polls that ask how the respondent's neighbor will vote can reduce the error from respondents misrepresenting their intentions. In-person polls by trained interviewers can be more accurate than polls that ask respondents to check boxes online. Polls that focus narrowly on carefully selected swing voters can do better than random-sample polls.
One unfortunate tendency is that the polls with the lowest non-sampling error are generally the most expensive to run per respondent, and often have small sample sizes even though they could benefit from larger ones. The cheapest-to-administer polls, with the largest non-sampling error, can afford large sample sizes, but that does them little good since non-sampling error dominates.
In most elections, the vote shares of the two candidates are strongly negatively correlated, in which case the margin--the difference between the candidates--will have an error of about twice the error in an individual candidate's share, as you say. But if third-party alternatives exist, it gets more complicated.
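A quick simulation of that point, assuming a pure two-candidate race with perfectly negatively correlated errors:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 2.0                      # error on each candidate's vote share

e_dem = rng.normal(0, sigma, 100_000)
e_rep = -e_dem                   # every point one candidate loses, the other gains
print(np.std(e_dem - e_rep))     # ~4.0: the margin error is twice the share error
```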
This is a bit of a tangent, but how does primary polling account for open/pseudo-open primaries? Like for example I voted for Nikki Haley in the Republican primary even though I'm not a Republican and I had no intention of voting for her in the general if she won. So it seems like subtle differences in the framing of the question (e.g. "Who is your preferred candidate" vs. "Who are you voting for") might change how people honestly answer polling questions with regards to open primaries.
BUG REPORT: When clicking on the 'Grade' column to sort, 'A+' is being sorted below 'A/B'. The sort order appears to be A, A-, A/B, A+. The expected result, of course, is that A+ would sort first.
However, the 'image in email, click to get to interactive' worked perfectly for me! Gmail client running in Chrome on Windows. Great feature!
The 2024 presidential election forecast page links to this pollster ratings page in the sentence that reads (paraphrasing): No, historically Fox News isn't biased towards Republican candidates. Alas, Fox News does not appear in your pollster ratings page (I downloaded the spreadsheet to double-check).
The Fox News-commissioned polls are conducted under the joint direction of Beacon Research (D) and Shaw & Company Research (R). You can find them under that name in the pollster database.
Was extremely surprised to see my high school in the rankings! Rated a B/C, it’s a fine start but let’s see some hustle kids. At least they’re already beating SurveyMonkey.
I'm confused -- if we're a subscriber to the Silver Bulletin does that also give us access to the detailed Election Forecast? Or is that a separate subscription?
I’m glad to hear you are narrating your own book; I pretty much just use Audible now. Looking forward to the audio release.
Thanks, Nate - great to see our polling operation at the University of North Florida continue to receive good grades!
Does this evaluation include the herding effect Nate covered in the Sheep article?
I.e., are pollsters also evaluated on whether their results fall outside the MOE the expected fraction of the time?
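For what it's worth, a check along those lines is simple: too few results outside the MOE is itself a red flag. A minimal sketch (the 2-of-100 pollster is hypothetical):

```python
from math import comb

def p_at_most(k, n, p=0.05):
    # P(X <= k) for X ~ Binomial(n, p): the chance a pollster with n results
    # would land k or fewer outside a true 95% MOE. Small values suggest herding.
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

print(f"{p_at_most(2, 100):.3f}")  # ~0.118 for 2 misses in 100 vs ~5 expected
```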
I love how Nate says he weights more reliable polls more heavily, but all the polls showing Harris winning are low A to B. LOL
Where are Fox News polls in this listing? They've always had a pretty good reputation (despite the bias of their "news" division...).
The Fox News-commissioned polls are conducted under the joint direction of Beacon Research (D) and Shaw & Company Research (R). You can find them under that name in the pollster database.
Would be nice to be able to sort pollsters by grade. Currently A/B shows ahead of A+.
Why does Big Data Poll have an F? Also, the ban from 538 has been lifted. Aside from a few misses, they’ve been more accurate than many pollsters ranked higher than them. Or do you have personal animosity against Rich Baris?
I thought Nate said banned pollsters were assigned an F? So it’s probably still banned.
Not banned any longer. Unsure when they lifted the ban: https://x.com/ding3rs/status/1790797608924934263
I see what you’re saying now, you’re right. Either it’s an error on Nate’s part, or maybe he’s only using bans from when he was still at 538? I know there was controversy over Rasmussen and GEM, etc., and they disagreed on which pollsters should be banned, so that could be part of it.
Could be footnote #6: "I’m also reverting back any banned/unbanned decisions to the ones that were in place in spring 2023. Not in the mood to re-litigate these, sorry." Disappointing.