Back in the day, I used to lead some research efforts on neural network speech recognition. One approach that looked promising (mostly done in labs other than ours) was to train a large model on many hours of speech, and then use that model to generate synthetic data that could be used to train a much smaller model. While the very "deep" and large models were better at learning, the smaller ones were just about as good, and could be pretty shallow and require much less computation for inference. In our lab we had also used such methods just to improve our models for speech recognition using the same networks. So I can see that it could be helpful. But there is no question that having more real data, especially in something like the political landscape that is so time variable, would be better than synthetic data, which probably just fills in a few holes in the models.
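For readers unfamiliar with the setup described here, a toy sketch of distillation: a big "teacher" model labels synthetic inputs, and a much smaller "student" is trained on those soft labels. Everything below is an invented stand-in, not an actual speech system, but the training loop has the same shape.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Stand-in teacher: imagine this is the large pretrained network.
def teacher(x):
    return sigmoid(3.0 * x - 1.0)

# Generate synthetic training data: inputs plus the teacher's soft outputs.
xs = [i / 10.0 for i in range(-20, 21)]
soft_labels = [teacher(x) for x in xs]

# Tiny student: one weight, one bias, trained by gradient descent
# on cross-entropy against the teacher's soft labels.
w, b = 0.0, 0.0
lr = 0.5
for _ in range(3000):
    gw = gb = 0.0
    for x, p in zip(xs, soft_labels):
        err = sigmoid(w * x + b) - p   # cross-entropy gradient term
        gw += err * x
        gb += err
    w -= lr * gw / len(xs)
    b -= lr * gb / len(xs)

print(w, b)  # converges toward the teacher's (3, -1)
```

The student ends up mimicking the teacher closely despite being trained only on generated data, which is the appeal; the comment's caveat is that this only works while the teacher's world still matches reality.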
Synthetic data has an innate flaw: it is historically bound, yet it gets used in contexts far outside the model's scope. A model built on 1998's data would be partially useless in 2000, and completely wrong by 2002, because it wouldn't take into account the massive changes in between.
This is obvious for most applications, but whereas voice synthesis is a use case with a relatively limited pool of data to encode and reproduce, human decision-making en masse is its worst-case scenario, given how many blind spots exist in it.
Of note, this is why you get AIs 'Habsburging' if they are not provided with enough fresh human interaction to use as 'DNA'.
(My apologies for the analogy, but I've realized recently that it's one that can actually reach thickheaded C-levels: "Forcing your employees to over-rely on AI causes your AI to degrade because it's effectively inbreeding, which is why it deleted your database" is imagery and logic they can understand, despite being completely mechanically wrong.)
Feels like the analogy of trying to navigate a car by looking in the rear-view mirror. All the data these models are trained on is based on past experience, not current or predicted feelings. Polls take a snapshot of the current situation, and models like those used by the Silver Bulletin try to predict what that means for a future outcome - like driving a car in pitch darkness where you can see directly in front of you but no further. Still better than looking backwards.
One way that these methodologies could exceed the performance of traditional polling not addressed here is in handling non-response bias. For example, while many demographics may be very hard to reach in a traditional poll, they may still have an observable footprint in other ways such as social media interaction, purchasing decisions, and news consumption. I find it quite plausible that a highly semantically aware model like an LLM could semi-reliably glean how people in the demographic might vote based on that data. It definitely won’t be perfect but I don’t find it hard to believe that it will be better than noise or no data at all.
So while I agree that claiming these models know people better than they know themselves is quite preposterous, and there will certainly be significant sources of error, I think it is entirely reasonable that these approaches work with a lot more data than traditional polling, which could result in superior performance. What are your thoughts on this?
People being hard to reach is not a problem for polling per se (other than adding to the cost).
Like if you expect to reach 30 young black men in a survey of 1,000 people, but you only reach 20, you can address that with weighting.
The problem is when the people you can’t reach (those 10 “missing” black men) would have given you a different answer from the 20 you did reach. But I don’t see how an LLM would figure that out just from looking at (eg) purchasing decisions.
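The weighting fix, and its limit, in the two comments above can be made concrete with a toy calculation. The numbers come from the comment; the 13/7 candidate split is an invented illustration:

```python
# Post-stratification weighting sketch. Target: young Black men should be
# 30 of a 1,000-person sample, but only 20 responded.

target_count = 30          # expected respondents from census-style benchmarks
achieved = 20              # respondents actually reached

weight = target_count / achieved   # 30 / 20 = 1.5 per respondent

# Suppose the 20 reached split 13 / 7 for candidates A and B.
# Weighting scales the group back up to 30 while keeping the same split:
weighted_a = 13 * weight   # 19.5 effective responses for A
weighted_b = 7 * weight    # 10.5 for B

# If the 10 missing respondents would instead have split 2 / 8, the true
# group totals are 15 / 15 -- weighting cannot recover that. This is the
# differential non-response problem the comment describes.
print(weight, weighted_a, weighted_b)
```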
I guess my claim is that those 10 "missing" black men probably still have some degree of online footprint correlated with their political leanings, and you could prompt an LLM with something along the lines of "here is what we know about this person, fill in the blanks on their political leanings". It's not quite as simple as just presenting demographic information to an LLM and having it guess how that demographic would vote, and I think that, even done well, it still needs a lot of traditional weighting and statistics to give reasonable results. It's just a way to augment useful but incomplete data.
My pet peeve again. A Silver Bulletin post that ignores the best source of information, its readers. Quoting from dueling commercial interests in a product is less useful than asking people who make consistent money betting on elections--many of whom in my experience read SB.
For myself and others I know:
1. All of us make very extensive use of AI
2. None of us try to forecast individual behavior and aggregate up--that problem has vastly more parameters than could ever be fit by data. That means we don't use AI in the way described as an AI-poll, but it also means we don't take reported real poll results at face value--the fact that 13% of soccer Moms said they would vote for Trump doesn't cause us to estimate that 13% of soccer Moms will vote for Trump. Poll results often move our forecasts in the opposite direction of straightforward interpretation.
3. Actual poll results are pretty small inputs, although part of that is the information from those polls is partially incorporated in prediction market prices and expert opinion. The filtered information of polls seems to extract most of the value.
I would describe modern scientific polls as using large amounts of general information--which today is processed by AI--to form a prior. Important swing-individual types are identified and polled to measure deviation from the prior. This leads to an updated posterior distribution. So AI forms the prior, and humans answering questions adjust it to the posterior. Most of the art is in guessing who will vote, rather than how they will vote if they do.
It's hard to say how much value polling humans adds because the prior already incorporates poll information. It might be possible to disentangle with an event study around the release of poll results but there are lots of polls and few move the needle much individually, most polls allow information to leak out slowly, and the poll signal is small relative to overall election noise. It would help if polls released not their posterior distribution, but the change in their posterior from their prior. That would make it easier to evaluate their value.
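The prior-to-posterior framing above can be sketched as a minimal conjugate (Beta-Binomial) update. All numbers here are illustrative stand-ins for the "AI prior" and a single poll:

```python
# Prior belief about a candidate's vote share: Beta(a, b).
# Beta(520, 480) has mean 52% and is fairly confident.
prior_a, prior_b = 520.0, 480.0

# New poll: 1,000 respondents, 490 support the candidate.
poll_yes, poll_n = 490, 1000

# Conjugate update: posterior is Beta(a + yes, b + no).
post_a = prior_a + poll_yes
post_b = prior_b + (poll_n - poll_yes)

prior_mean = prior_a / (prior_a + prior_b)   # 0.52
post_mean = post_a / (post_a + post_b)       # 1010 / 2000 = 0.505

# The "change from prior to posterior" the comment asks pollsters to report:
shift = post_mean - prior_mean               # about -0.015
print(prior_mean, post_mean, shift)
```

With a prior this strong, a 1,000-person poll moves the estimate only about 1.5 points, which is one way to read the comment's point that individual polls rarely move the needle much.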
The important point is that the election is itself a poll, and a bad one: mainly because respondents are self-selected, but also because eligibility, registration, and tabulation are complex and cumbersome. Traditional polling uses a good poll to find out what people really think, and then adjusts--mainly through estimating how likely different individuals are to vote--to predict the bad poll's result.
From either a Bayesian or AI standpoint, a good poll is just one more bit of information. It gets no special respect because it is a poll and the election is also a poll. Asking real people if they'll vote and if so how is the most straightforward way to predict an election, but it's not obvious that it's a particularly useful one when evaluated against a good AI-prior.
I think you're too short on AI here. (demo, news diet, maybe some personality training) + (survey question) = a novel calculation that's never been done before off data that's never existed in that configuration. I don't think you can quite dismiss that as only a model, not data.
Seems a bit semantics-ish to relegate anything that isn't asking a human a question into the category of "models" not "new data."
I can already smell that betting markets and AI polls sound anti-institutionalist and right-coded, which implies the two types of polls are necessarily partisan competitors. Maybe AI polls become the sole client of person polls for labeled data. That would be more like a vertical integration than a knockout competition.
My guess is Delta(demographic + news diet, vote) is extremely small... specifically for the people whose vote was obvious anyway. Is our rural old lady <insert other flyover-country stereotypes>, whose news consists of social chatter and passing CNN in the waiting room, going to have an easily modelable news diet? Maybe not. Ironically, and uncomfortably, the most "informed" voters may have eliminated most of the privacy of their thought and be the flattest objects of prediction come election time.
Conclusion, after contradicting myself: Nate/Eli is probably right because polls are better at less digital people, and those are the ones who matter when you're polling.
But in this instance how do you even find the person and their associated online footprint in the first place to give it to the LLM?
I read the title as “All polls are fake polls”. That would be a truly spicy take.
Great, another article about AI slop.