[00:00:00]

We were... so the story went from "I found shrimp tails in my Cinnamon Toast Crunch" to "he got canceled." I missed all of that, but the shrimp tails were real, right?

[00:00:21]

Hello and welcome to the FiveThirtyEight Politics podcast. I'm Galen Druke. And I'm Nate Silver. And this is Model Talk.

[00:00:33]

I ask with an inflection as if it's a question. Is it model time? Is it Model Talk?

[00:00:36]

I don't know. I don't know that it's quite model talk, but it felt right. I mean, it's definitely model adjacent. So today's a big day here.

[00:00:44]

At FiveThirtyEight, we launched our updated pollster ratings, and that is of course where we grade pollsters according to accuracy, transparency and, until today, methodology. So there's plenty to discuss in these updated pollster ratings. But the biggest headline is that what was once the gold-standard methodology, an actual live person making phone calls to landlines and cell phones, is not the gold standard anymore. So based on the numbers that you crunched, Nate, that method of polling isn't systematically more accurate than some of the other methods.

[00:01:22]

And in fact, in twenty twenty, the most accurate pollsters used a variety of methods, including online polling, text messaging and automated phone calls. I want to get into all of that. And for people who were excited about that model talk introduction, the place where the model comes into all of this is that these ratings aren't just a competition between pollsters, although I'm sure pollsters love getting an A plus rating. These also affect how much weight polls get in our election forecast models.

[00:01:51]

So lots to discuss here. I got to ask you out of the gate, how does it feel to untether ourselves from the old live caller gold standard?

[00:02:00]

It's a new world, man. Ann Selzer's out, Trafalgar Group is in. No, I'm just kidding. Ann Selzer still gets, I think, the highest overall grade in our ratings, although Trafalgar Group has moved from like a C-plus to an A-minus. So congratulations to them. There's like actual news value in this segment. So maybe I should start with naming very obscure pollsters. Right.

[00:02:22]

We'll get into those pollsters in a second. But can you just explain how you arrived at this conclusion? So the way we rate the pollsters is I just kind of sit there and look at a stack of polls, think about how I feel about it, go get a sandwich, and then, like a fine jeweler, appraise it and assign a grade. That's how it works.

[00:02:44]

I thought you just said that there is news value here and we should be sincere. OK, how we actually do it is we have a database with now more than ten thousand polls, which is basically every election poll in the final three weeks of an election campaign since nineteen ninety eight. So governor, US Senate, US House, and presidential general and primary elections. We look at various metrics. There's a simple metric, which is just how close the margin is to the actual result: if you have Biden winning by four and Trump wins by two in a particular state, that's a six-point polling error.

[00:03:16]

But we also adjust for when the poll was conducted; it's easier to be spot-on accurate on Election Day than three weeks beforehand. We adjust for the type of election: in general, House races are more difficult than presidential races, for example. We compare people directly to others in the same race. So maybe everyone's off in one race, but you're the least bad pollster. You get credit for that, basically, at least in part.
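To make the arithmetic Nate is describing concrete, here is a minimal Python sketch of the simple margin-error metric and a crude "relative to the field" comparison. The function names and the field adjustment are illustrative assumptions for this transcript, not FiveThirtyEight's actual formulas.

```python
# Minimal sketch of the simple error metric described above.
# Margins are "Democrat minus Republican" in percentage points.
# Illustrative only; not FiveThirtyEight's actual ratings math.

def polling_error(poll_margin: float, actual_margin: float) -> float:
    """Absolute miss on the margin: a Biden +4 poll vs. a Trump +2 result is 6.0."""
    return abs(poll_margin - actual_margin)

def error_relative_to_field(poll_margin: float, actual_margin: float,
                            all_poll_margins: list) -> float:
    """How much better (negative) or worse (positive) a poll did than the average
    poll of the same race, so the 'least bad' pollster gets some credit even when
    every poll in the race missed in the same direction."""
    field_avg_error = sum(abs(m - actual_margin) for m in all_poll_margins) / len(all_poll_margins)
    return polling_error(poll_margin, actual_margin) - field_avg_error

# Example from the transcript: a poll showing Biden +4 in a state Trump won by 2.
print(polling_error(+4.0, -2.0))                             # 6.0
print(error_relative_to_field(+4.0, -2.0, [4.0, 7.0, 1.0]))  # 6.0 - 6.0 = 0.0
```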

[00:03:39]

So you run through some fancy math, try to have a fair way of judging how well a poll does relative to its peers.

[00:03:47]

OK, after you've crunched all of these numbers, you can determine various different things, like in which kinds of elections the polls do best, etc. The headline here that we're discussing for the moment is which methodology is the most effective. And so you looked at the different methodologies, right? There's that old gold standard of live caller, basically meaning a real person calls other real people on a telephone, either a cell phone or a landline. There's also, of course, the recorded call, where there's an automated voice on the phone and you just respond to the automated voice.

[00:04:23]

There's text messaging, there's also online surveys, etc. So you looked at all of these different methods and you crunched the numbers on which are the most accurate. And so what did you find? Is it that none of them are more accurate than the others? Or is it now that text messaging is the best way to do a poll?

[00:04:38]

You're using that word crunch a lot. Is Cinnamon Toast Crunch still on your mind? It has been a week for Cinnamon

[00:04:45]

Toast Crunch. I don't eat Cinnamon Toast Crunch. It's just subliminal, you know, we make arbitrary choices of language, and, like, the word crunch is in the back of your head.

[00:04:53]

But I feel like crunching data. I mean, that's just what you do. Cinnamon data crunch.

[00:04:59]

I mean, the idiom is kind of an odd one if you think about it. Right.

[00:05:02]

Maybe it's more like churning data, like in an urn? I don't know. Should we sell a FiveThirtyEight urn on the fivethirtyeight.com store, which, by the way, is back up and running? People should head over to fivethirtyeight.com/store.

[00:05:14]

In all seriousness, data in all seriousness: which method is the most effective, or are all of them equally ineffective slash effective? Let me put this carefully.

[00:05:26]

I'd say that the method of a pollster alone doesn't inform you that much; there are better and more informed ways to judge how much you can trust a poll. Let me actually back up a little further. One issue is that, back in the day, pollsters had one consistent methodology that they would use all the time. Gallup is a phone poll. Rasmussen is an IVR poll. YouGov is an online poll. Now, it's not always so clear, right?

[00:05:58]

Like Pew, for example, a famous phone pollster, now does most, maybe almost all, of their polling online. A lot of pollsters use hybrid methods. So Rasmussen, for example, will use an IVR poll, which is a robo-poll, an automated poll where it's "Hello, I am calling from Rasmussen Reports, press one for Mitt Romney," that kind of thing. But those polls can only reach people on landlines, because you can't place an automated phone call to cell phones in most states.

[00:06:23]

They'll supplement that by using an online panel of some kind, or texting. Anyway, everyone's kind of being very promiscuous with their methods now, basically. So it no longer makes sense in the first place to assign one method to one pollster. And that would be at the pollster level. So, just to start with, if we assign Monmouth University a grade based on being a live-caller poll and then it turns out they're experimenting with doing online polls...

[00:06:50]

Then all of a sudden we kind of have an issue. The premise by which we graded them by methodology is not valid for that particular poll. So that's one reason to no longer rate pollsters on the basis of their methodology: because the methodology can change and be mixed and matched. But also, yeah, I mean, you don't really see it in the data, frankly, recently or really over the long term either. You don't really see a clear link between method and who is most accurate.

[00:07:15]

I mean, you might see some faint traces of things in recent elections, polls that are pure IVR, meaning polls that only call landlines have not done very well. It's not a huge sample. That might be a poll you probably shouldn't use at all. But in general, everyone is trying out different things. And the methodology alone, again, won't tell you everything.

[00:07:34]

Wait, so when was it the case that live caller polls were the gold standard and that there was a noticeable difference in accuracy? For how long has that not been the case?

[00:07:46]

What we found, when we originally added this criterion to the pollster ratings, is that live-caller polls that called cell phones were more accurate. For a period of time, kind of a mark of quality was: do you call cell phones? Why does that mark quality? Well, because it's expensive to call cell phones relative to landlines. And in a world where, let's say, twenty-five percent of the population has cell phones and 70 percent still relies on landlines, for a pollster to say "we're going to bear the extra expense because we think it's important to go the extra mile to reach these people on cell phones," that was kind of maybe more a proxy for the overall budget of a poll than it was about the methodology itself.

[00:08:28]

So when we decided it was more about cell phones than about live callers, what happened is that eventually everybody either abandoned phone calls, or called cell phones, or did some hybrid method like Rasmussen, where you find some other way to reach people who are not on landlines. So it kind of devolved from a cell-phone standard as a mark of quality to a live-caller standard. I don't think we ever necessarily had research showing that live-caller polls in general were better.

[00:08:56]

Right?

[00:08:57]

It was live callers who call both landlines and cell phones. Do we have a sense of why this has all changed? It's just is it simply, as you said, because everyone started iterating in different ways?

[00:09:10]

Does it have something actually to do with talking to a live caller or people not picking up their phones?

[00:09:18]

I am actually not sure it's changed, but here's kind of my, frankly, slightly pessimistic theory, which is that as people have become more reluctant to respond to telephone polls, live-caller polls have reverted to the mean. It's not so much that these IVR polls and these online polls are good as that the live-caller polls are not as good as they used to be, because it just becomes very hard in a world where response rates are low and there is bias in who responds to phone polls.

[00:09:54]

Because online polls have plenty of issues too, for example. Right. They're mostly using non-probability samples. They recruit a panel. They try to create a representative sample within that panel. But it's not randomly selected, like a random-digit-dial poll is supposed to be. You know, I mean, the pure IVR polls, as we've found, are actually probably pretty junky.

[00:10:12]

But if you're doing IVR and you're doing an online panel and you're kind of doing some creative weighting... And IVR, to reiterate, is like an automated person on the phone.

[00:10:21]

IVR. Yeah.

[00:10:22]

"Hello, I am Rasmussen Reports." But, like, you'd be good at that. You'd be good at that. That should be like a contest: who can do the best robot? It's like the reverse Turing test, right? Can you imitate a robot?

[00:10:36]

Stick around after the podcast is over, folks, we'll have a competition.

[00:10:40]

But that's kind of my theory is like there was this one expensive but gold standard way to do polls and like that may not be working as well anymore. So therefore, everybody has to get their hands a little bit dirty in the data cauldron, churn the data cauldron.

[00:10:57]

We're going to end up with so many metaphors by the end of this podcast. OK, so let's talk about what this means in the real world, which pollsters did best in twenty twenty and what methods were they using.

[00:11:09]

So we should keep in mind that these sample sizes are fairly small, and we're looking at polling firms that may have done 10 or 20 or 30 polls in twenty twenty, and those results are correlated; you might poll the same state several times. But the best pollster overall, based simply on average error, meaning how close was your poll to the actual result, was AtlasIntel, who had an error of only two point two points on average over 14 polls, which is pretty good when the polls had a pretty bad year overall.

[00:11:39]

Second, our friends at Trafalgar Group at two point six, then Rasmussen Reports, our other friends, we love them, at two point eight points. Harris at three point three points. OpinionSavvy slash InsiderAdvantage, three point five points. Emerson College, four point one points. Ipsos, who actually has been a FiveThirtyEight partner, four point six points. But that's the top-performing group.

[00:12:02]

OK, so none of those polls are live-caller polls. What kinds of methods do they use? Everything except live caller.

[00:12:10]

I mean, you have a mix, and again, a lot of these firms now are kind of mixing and matching based on the particular race they're surveying. But it's a lot of internet, some of it's text messaging. One thing that's probably worth mentioning here: if you're doing a complicated survey about racial attitudes in America, or how you feel about 30 different major issues, or you're revealing personal information about your health, for those polls you probably definitely do want live-caller polls.

[00:12:38]

You have to have interviewers who are trained to, like, interpret ambiguous responses and coax the respondent through a very long survey. I used to do this, actually; my first job, believe it or not, was as an interviewer. I would go to a lab at Michigan State University and call people on the phone. It was people who had had workers' comp claims. You'd have to ask them, like, on a scale of zero to 10, how bad is the pain in your left finger?

[00:13:05]

How about your right finger? Like, literally, you had to do this. And it was, like, auto workers who had fallen on hard times. Anyway, I digress. But, like, it requires real skill to do these kinds of long, complicated polls. You probably want a live person doing it, or online, where you can kind of curate the experience more carefully. These pollster ratings are meant for horse-race polls. I don't know how you go about evaluating more complicated polls if there's no way to test them, per se.

[00:13:29]

But for the horse race, if you just want to know, are you going to vote for Trump or Biden, maybe polling is not so complicated. You just want to have a high response rate. Make sure you're reaching the person you think you're reaching. Know their demographics so you can balance them in your data cauldron. But you don't necessarily need the investment that you would for, like, a high-quality survey about public opinion on many items in the news, for example.

[00:13:55]

So this is intended for an important but narrow application: horse-race polling. I would not necessarily translate these conclusions out to other domains.

[00:14:04]

OK, but you said that in terms of asking people their opinions about potentially sensitive topics like racial attitudes, in that case, you might want a live person on the phone. I've heard that one of the arguments about why text messaging or online polling might be more accurate sometimes is that people don't have as much shame or social pressure in expressing how they feel to their computer screen or to a robot or over text message, as they might when there's an actual person on the phone and they may feel like that person is judging their opinions.

[00:14:38]

I think the evidence for that is a little bit mixed, potentially. I mean, there is some evidence for it; there is a lot of academic literature on social desirability bias going back a long time. Maybe I shouldn't have picked the example of, like, racial attitudes, because those are sensitive and an example of where there could be some effects by mode. Well, actually, what things aren't sensitive these days? If you're doing a survey on how you feel about the health care system, and you're asking a lot of questions about different single-payer systems and whatnot, and your experience with the health care system, there, I think, you'd rather have the Kaiser Family Foundation do that poll than Rasmussen Reports or something.

[00:15:11]

Right. I don't think the reason that these polls did better in twenty twenty is that people were lying about their support for Trump. In fact, one thing that's a little interesting and surprising is that the polling was actually a little bit better in the presidential race than in races for Congress. So, I don't know, maybe there's, like, a shy generic-Republican-congressman vote, but, like, the errors were actually bigger in those races than in the presidential race. It's probably more a matter of not reaching certain types of voters in the first place.

[00:15:43]

If you're only getting a response rate of 10 percent or whatever, then it's pretty naive to assume that you're getting a representative sample, and that you can just use these big, major demographic categories, like race, age, gender or whatnot, and that that will cure all the problems that you have. So I think we actually have some unpacking to do. To go back to when you were saying which pollsters did the best in twenty twenty, you mentioned that Trafalgar Group and

[00:16:11]

Rasmussen Reports are noted friends of the podcast, which probably sounds like an inside joke to most people. So let's explain what's going on here. First of all, a bunch of these pollsters do generally have a Republican bias, the ones who did the best in twenty twenty, in an election where... let me, let me know, am I wrong?

[00:16:30]

Let's be careful about which terms we use in the context of the pollster ratings: Republican house effects.

[00:16:35]

Yeah, I know. Yes. House effects. Yeah.

[00:16:38]

Yeah. So all the pollsters that had big Republican house effects, that were more Trump-y or GOP-friendly relative to the average poll, in a year where the average poll is off by four and a half points or whatever, they're going to do well. So the Rasmussens, the Trafalgars, the Susquehannas. Now, some of those firms also either have a history of having an actual bias in polls (Rasmussen actually has had a bias, the way we define it, in its polls) and/or they have a history of being

[00:17:10]

a booster for Republican candidates in a way, or being kind of publicly cantankerous in some ways. Some don't; like AtlasIntel, I don't think there's any bias of any kind in their blood or whatever. Right. But, like, you know, obviously Trafalgar Group, if you go back and look, I mean, they're kind of doing polling and the guy's doing some degree of punditry. But, like, you have to give credit where it's due. Right.

[00:17:31]

Trafalgar Group has not been around that long. In twenty sixteen, they were more bullish on Trump than the average, correctly in most cases. They did not necessarily have a great twenty eighteen. In twenty twenty, they were more bullish on Trump than the consensus and they missed a few states, but they were closer on average than the average poll, and they deserve credit for that, frankly. And then, by the way, in the Georgia runoffs, they did not have a Republican house effect.

[00:17:55]

They were very much in line with the average and the averages were very good. So that's what you'd like to see. You'd like to see a pollster that, hey, if in one cycle you're way off to the side and you're right, then we're not going to critique you at all for that. If every cycle, like Rasmussen Reports, you're off in the same direction and then you happen to have some really good cycles and then some really bad cycles, then I don't tend to give as much credit.

[00:18:17]

But Trafalgar, as far as I'm concerned, they deserve their A-minus rating. I really, really, really wish they were more transparent. There are lots of issues with Trafalgar, about, like, who their sponsors are, that are problematic and that we'll have to consider. I mean, they should have much better disclosure and transparency. But if you're judging based on results, then they deserve credit for having good results, even though they did, in some cases, have Trump winning states he lost.

[00:18:42]

I mean, if you have Trump winning a state by a point and he wins the state by a point, that was a good poll. If you had Biden winning a state by 17 points, like ABC News (we love ABC News, and they actually had a decent year overall), but their Wisconsin poll had Biden winning Wisconsin by 17 points and Biden wins Wisconsin by one point. I mean, that poll is not going to help their grade, right?

[00:19:04]

It doesn't. So, yeah, giving some credit where it's due. Now, the question is like, how will that predict how they'll do going forward? I don't know. We'll have to see.

[00:19:11]

Yeah, that was going to be my next question. Well, first of all, do we owe Trafalgar an apology? I think they have been the butt of a couple of jokes here on this podcast.

[00:19:20]

Never apologize, Galen. Never apologize.

[00:19:23]

No, look, I think I have more respect for whatever it is they're doing. I'm not sure I know what they're doing, but I have more respect for whatever it is they're doing. I apologize in the following sense: we have our method, and we designed the method carefully, and the method gives a higher weight to pollsters that are more accurate. It's actually pretty ecumenical, is that a word? It's actually pretty even-handed, where it gives a little bit more weight to the highest-quality pollsters.

[00:19:52]

But if Trafalgar has a new poll, then that will still influence the polling average in a state. I think there were a lot of times we were kind of like, oh, grumble, grumble, here's another Trafalgar poll that knocked Biden's average down from plus two point six to plus one point nine, and, you know, I'm not really sure I believe that, but I guess we have to go with the method. Right. In some sense, that's kind of even more of a rationale for, like, trusting the process: use systematic rules that you turn into an algorithm and trust the algorithm, instead of trying to superimpose on top of it and saying, oh, Trafalgar, they made it in.

[00:20:28]

But I wish they hadn't. But, like, there's a certain other polling aggregator or forecaster that just has a list of polls that they've struck; they're like, these seven pollsters are evil and their polls shall not appear in our polling averages. Like, that's exactly what you don't want to do. You want to systematize and say, OK, we need a system of kind of like checks and balances. Right. Because there are plenty of years, like twenty twelve or whatever, or twenty eighteen.

[00:20:56]

There are plenty of years where, like, the high-quality pollsters kick butt and low-quality pollsters don't, but also years where that's not really true, like twenty twenty. And a good system of checks and balances, so to speak, means that you're hedged in the right way, so that Trafalgar Group, even when they had a C-plus rating, would have some influence on the average; and now they are rewarded by having an A-minus rating and they'll have more influence going forward.

[00:21:18]

So we think it's, like, a thoughtfully designed system, and sometimes we kind of editorialize on top of that or try to have it both ways: here's what our model says, but here's what, you know... I think that never serves anyone's purposes well in the first place. And it probably reveals, like, our grievances and our biases and whatever else. I apologize to Trafalgar Group in that context. Oh, wow.

[00:21:39]

That was an actual apology. But I want to nail down the going-forward part, because now Trafalgar will have more influence in our forecasts. The elections where they did well were twenty sixteen and twenty twenty, which seem to be, like, very particular years in terms of the way that polls underestimated Republicans, especially with Trump at the top of the ticket, because obviously in twenty eighteen and in Georgia and in the Alabama special, etc., we have examples of your old-school legacy pollsters doing very well.

[00:22:10]

So what if these pollsters, like Rasmussen, Trafalgar, etc., were kind of just dicking around in a way in twenty sixteen and twenty twenty that worked out well for them when Trump is at the top of the ticket, but that may not be the case if he's not at the top of the ticket in the future? Is that a concern that we have, or do you think that, at least with Trafalgar, their methods are more durable than just a Trump election?

[00:22:36]

Again, I'm kind of being subjective here, and I have heard the Trafalgar guys talk about how they do their polling; I don't want to apologize to them in one segment and then throw shade at them in the next. I mean, I don't know. I mean, here's the whole point. If I did, like, a sit-down interview with every pollster (you don't have time to do it, we have like four hundred pollsters we rate), but if I did a sit-down interview with every pollster and tried to get a qualitative assessment of their methodology and then wrote down a subjective grade...

[00:23:07]

Do I think that would help predict how well the polls would do, above and beyond the pollster ratings? I don't know, maybe a little bit. But I think in twenty twenty, if I'd done that, then I'd have put even more weight on some of these firms that did not have a good year, frankly. So the whole point about, like, an algorithm is you aren't necessarily solving the last-mile problem. You're always leaving a certain amount of detail out, but you're getting the big stuff right.

[00:23:33]

You are getting most of the way there. You're having a coherent and consistent rationale. You have to think systematically about a problem, right? If we make an exception for this pollster, then we have to make a different exception for that pollster. And in general, I think people who think in terms of systems do better than people for whom everything is ad hoc. So I do have ad hoc opinions about pollsters. I mean, another classic one is, like, Emerson College.

[00:24:01]

They get kind of a lot of criticism; they use, like, Mechanical Turk and stuff like that that people don't like, but they continue to have pretty good years. Now, again, I think it used to be more that, like, there was a correct, textbook way to do polling. And if you were willing to have the expertise for it and pay the expense for it (it was expensive), that would work. Unfortunately, under current conditions, I'm not sure that method would work anymore.

[00:24:27]

Actually, what I think might work, to be super honest, is maybe polls that involve mail or door-to-door components; federal government surveys like that are probably quite accurate and have very high response rates. But apart from that, if you're doing a telephone survey in a world where people don't answer phone calls from strangers very much, then that gold-standard method is no longer foolproof. It may still be better than the alternatives, but there's no longer a surefire, safe way to have a super-accurate poll.

[00:25:03]

You mentioned mail or door to door, which is obviously even more expensive than live caller polls that also target cell phones. Do you have any other ideas of how pollsters can or should be innovating to get over the problem of people not answering their phones?

[00:25:19]

Some pollsters have started to use email, which is interesting. I mean, you know, we live our lives mostly online. I think the texting is kind of promising, potentially. It has kind of become a fairly universal way to communicate. I mean, one thing that's true here is, like, I'm not sure that we ever really did live in that golden era. It's hard to know, right? I mean, it was never the case that every American had a telephone, period, right?

[00:25:43]

I mean, it's always been the case that not everybody had a telephone. For example, you have infamous polling skews, people trying to do polls through magazines in the nineteen thirties or whatnot. Right. But, like, I don't know that the golden era ever lasted that long, necessarily. The other perspective would be that, like, polling has always been a challenge, pollsters always have to innovate around it, it's never been as clean as the textbook says, and polls have better years and worse years.

[00:26:07]

And actually, statistically, it's not so clear that we've been on a downward trajectory. We had a very good twenty eighteen, for example. The polls were very good in Georgia. That's kind of the macro view. I mean, I think the macro view and the micro view are very different here. We're actually talking a bit more about the micro view, the actual mechanics of doing polling, on this podcast; the article on FiveThirtyEight is semi-provocatively taking the more macro view and saying, yeah, the polls weren't great, but it's kind of within normal bounds that you can model and whatnot.

[00:26:38]

I did want to give our listeners the provocative headline about the gold standard of polling, and also the wonkier headline, up front. But I also want to talk about some of the broader trends that you found in reassessing our pollster ratings. Before we do that, though: today's podcast is brought to you by C-SPAN. C-SPAN, in partnership with cable providers across the country, is awarding one hundred thousand dollars in prizes in their annual StudentCam competition. They ask students to create a documentary about an issue they want the president and Congress to address.

[00:27:09]

Watch the award-winning videos at studentcam.org and catch the best of the best ahead of Washington Journal every morning starting April 1st on C-SPAN. They'll air a new video each day at six fifty a.m. Eastern through April twenty-first.

[00:27:25]

C-SPAN is funded by America's cable television companies, their partners in StudentCam.

[00:27:32]

Again, check out studentcam.org or watch C-SPAN.

[00:27:38]

The last time we took stock of how the polls did in twenty twenty, we were working off of the polling averages that we developed for the twenty twenty election. But when you actually rate pollsters, you have a much more in-depth process, looking back over a span of weeks, looking at a bunch of different polls. So after all of that more rigorous tallying of how all the polls did in twenty twenty...

[00:28:04]

Has your view of the efficacy of polling in twenty twenty changed at all, or how has it evolved? No, it hasn't changed very much, because I wrote an article for the site like two days after the election was called for Biden, and it was kind of saying what I expressed a moment ago, which is, like, yeah, the polls were kind of within the normal bounds, and the media was giving them more crap than was probably deserved.

[00:28:28]

And that's probably what the deeper analysis reveals, too, I think, frankly.

[00:28:33]

So what is that deeper analysis like? How does twenty twenty compare with the other years that we rate pollsters in, going back to nineteen ninety eight?

[00:28:43]

If you go back to nineteen ninety eight and look at the average error (and by the way, this includes the entire twenty nineteen-twenty cycle, so it includes presidential primaries, it includes any special elections, it includes the Georgia runoffs, which were technically twenty twenty-one but effectively the resolution to a twenty twenty race, so they qualify), the average error across all those polls was six point three points, which is the third-worst cycle out of 12, behind six point eight points in twenty fifteen-sixteen and seven point seven points in nineteen ninety eight.

[00:29:12]

Still, though, across all years in our database, the average error is six points. So six point three is higher, but not that much higher. Judged by how many races you call correctly: seventy-nine percent of polls in the cycle got the right winner, which is actually pretty decent and matches the historical average of seventy-nine percent exactly. Again, there are a lot of states where, like, Joe Biden won Wisconsin, just not by nearly as much as the polls thought; he won the Electoral College; Democrats won the popular vote for the US House.

[00:29:40]

So the polls were, quote unquote, right in many cases, but with margins that were off. The bigger issue is with the bias, where on average polls overestimated how well the Democrat would do by four point eight points, including four point two points in the race for the presidency, five point six for governor, five for the Senate, and six point one points for the US House. That is the biggest bias we can find in either direction since nineteen ninety eight.
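As a quick illustration of the difference between the two statistics being quoted here (average error, which ignores direction, and bias, which is the signed lean), here is a small Python sketch with invented numbers; it is not FiveThirtyEight's data.

```python
# Average absolute error vs. signed bias, with made-up margins.
# Margins are Democrat minus Republican, in points.

def average_error(poll_margins, actual_margins):
    return sum(abs(p - a) for p, a in zip(poll_margins, actual_margins)) / len(poll_margins)

def average_bias(poll_margins, actual_margins):
    # Positive means the polls overestimated the Democrat on average.
    return sum(p - a for p, a in zip(poll_margins, actual_margins)) / len(poll_margins)

polls   = [8.0, 1.0, -7.0, 10.0]   # hypothetical final polls in four races
results = [0.6, -2.0, -6.0, 7.0]   # hypothetical actual margins

print(average_error(polls, results))  # 3.6 points off on average
print(average_bias(polls, results))   # +3.1: the misses mostly lean toward the Democrat
```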

[00:30:07]

There are years historically, before this database begins, like 1980 and probably 1994, where you had similar or larger biases, maybe like a seven-point bias in 1980, where the polls kind of missed the Reagan wave. But still, that's a pretty big bias.

[00:30:23]

And that's, of course, the second presidential election in a row where we saw that bias, although not the second election in a row, because the polls were unbiased, actually had a very slight GOP bias, in twenty seventeen-eighteen. That might seem like a subtle thing, but people cherry-pick a narrative a lot, and they kind of ignore the fact that you had a year, in twenty seventeen-eighteen, where the polling was both very accurate and very unbiased.

[00:30:49]

It also kind of ignores the Georgia runoffs, where the polling was both very accurate and very unbiased. Again, from the macro view, I get kind of... I mean, I'm not going to say I get defensive, I'm not a pollster, I'm someone who's, like, trying to evaluate how well polls do. But from a macro view it's kind of like, OK, you're putting an awful lot of eggs in this one bad year, or two bad years out of three or whatever, after a very good year the other year. Like, a coin landing heads two out of three times or three out of four times is statistically not really something that you should get that excited about.

[00:31:19]

Definitely.

[00:31:20]

I totally understand that the midterm years were good. We've talked about that many times on this podcast. But in terms of those presidential years. Right.

[00:31:27]

But yeah, but, like, if the average American listened to this podcast, then that would be different. That's not what the average mainstream media reporter thinks about the polls. They probably have no idea. I don't think even the average political reporter working for a mainstream media outlet has any idea how good the polling was in twenty eighteen. It's not widely known.

[00:31:48]

Do you know why the polling was so good in twenty seventeen, twenty eighteen and in Georgia?

[00:31:53]

I don't know. I mean, I think you had, like, a lot of high-quality people doing polling; like, the New York Times Upshot polls were really helpful. In twenty seventeen, twenty eighteen, it was a fairly stable and boring race. It was fairly high turnout; I think high turnout helps. I mean, again, the polls were dealing with a lot of weird circumstances this year, including COVID. Like, one theory for why the polls had a bad year is that Democrats were more likely to piously follow their social distancing requirements and stay at home.

[00:32:27]

When you're at home in the middle of a pandemic, you're so bored you might actually answer a poll, especially if you're really excited about trying to get Trump out of office. And so polls kind of navigated a very challenging year in a way that wasn't great, let's be honest. Right. But, I mean, in the primaries, we show the primary polls as having been not very good, but there was also a gigantic inflection point where all of a sudden Joe Biden gained like 30 points, one of the two largest swings ever in the primaries, along with John Kerry in 2004.

[00:32:57]

But the polls of the Iowa caucus were actually pretty good. The polls in New Hampshire were pretty good. Those were pretty open, I suppose. But, like, I think people just kind of by default went after pollsters when, like everyone else, pollsters were dealing with a lot of unusual circumstances in twenty twenty.

[00:33:12]

Pollsters need a good attorney, and they have one in you, Nate. OK, I agree with what you mean. Here's the asymmetry, though: the problem is these pollsters feel like these very honest academics who are like, oh no, we must do better. And, like, they have to be more assertive, pollsters.

[00:33:26]

OK, but people who listen to our podcast know that we're not just, like, harshly criticizing pollsters because we can, or because that's just what internet culture is about. We are being more rigorous and we're talking about this in a methodical way. And so there was something that went awry in twenty sixteen and twenty twenty, right? It may not be shocking, because things go awry in polling on a regular basis and have for decades.

[00:33:50]

But pollsters are still going to try to like, evaluate why was there a systemic bias that underestimated Republicans, etc..

[00:33:58]

And we've talked about that in podcasts in the past. I don't want to get into it too much right now, but we are going to be watching very carefully in the upcoming midterms and in twenty twenty-four (who knows where we'll all be in twenty twenty-four). But do you think, for example, if there's a systemic bias that underestimates Republicans in twenty twenty-four, that we can say, OK, there's something going on here, the polling industry just underestimates Republicans? Like, how many times would it have to happen

[00:34:24]

In a presidential election, before you can say this is a signal and not just noise, to quote a friend of mine.

[00:34:34]

Well, I would also look at midterm elections, right? I would not just look at presidential elections. I mean, this is not a perfect example, but, like, if you keep flipping a coin and it keeps coming up tails, at what point do you conclude that the coin is more likely to be biased or rigged than a fair coin? And the answer to that is you have to have some prior, and you revise that prior under Bayes' theorem.
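A toy version of that Bayesian point, in Python: start with a prior probability that the coin (or the polling industry) is systematically biased, and update it after each additional miss in the same direction. The specific probabilities are arbitrary assumptions chosen only to show the shape of the update.

```python
# Bayes' theorem on the "is the coin rigged?" question from the transcript.
# All numbers are illustrative assumptions.

def update(prior_rigged: float, p_tails_if_rigged: float = 0.8,
           p_tails_if_fair: float = 0.5) -> float:
    """Posterior probability of 'rigged' after observing one more tails."""
    numerator = prior_rigged * p_tails_if_rigged
    denominator = numerator + (1 - prior_rigged) * p_tails_if_fair
    return numerator / denominator

prob_rigged = 0.10  # prior: you start out mostly trusting the coin / the polls
for flip in range(1, 7):
    prob_rigged = update(prob_rigged)
    print(f"after {flip} tails in a row: P(rigged) = {prob_rigged:.2f}")
# Roughly 0.15, 0.22, 0.31, 0.42, 0.54, 0.65: suspicion grows with each repeat,
# but a handful of same-direction results still doesn't make "rigged" a certainty.
```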

[00:34:59]

I mean, look, is it noteworthy at all that the average person conducting a poll is probably a highly educated Democrat? Does that matter at all? To me, that would matter a little bit. Earlier we talked about this, and I talk about it in the article. But, like, in an environment where everything is polarized by education: to be a pollster, you probably have a Ph.D. or some advanced degree, you're upper middle class, you probably work for a university or for a newspaper or something like that.

[00:35:27]

Does that affect things at all? Maybe. Maybe. I don't think that's quite the question you're asking, by the way. But, like, that's worth thinking about, I think. I mean, I still think that, like, polls are usually released as part of a news operation. But, you know, if you kept missing on one side, and that side was the side that the average person conducting a poll tended to belong to, then, yeah, as a Bayesian, you have to give that some weight.

[00:35:56]

I don't think that's really what's going on here, because pollsters' much stronger incentive is that they get much more business if they are, and/or are perceived to be, accurate. One thing I'll criticize the industry for a bit, since I am inclined to be very forgiving and think, again, they do a vital service and do a great job for the most part: I do think there was a little bit too much faith in this, like, "we have detected the problem from twenty sixteen and solved it and put a bow around it."

[00:36:25]

Right. The problem being: OK, we discovered the problem was that we had too many college-educated voters in our sample, and now we weight by education, and now polling is perfect again. I think there was a little bit too much blind faith placed in that. I mean, it was correct, it is definitely better to do education weighting in polling, but there are issues from both a macro and a micro point of view. The micro issue is that even education is kind of a proxy for other conditions that can still influence the poll.

[00:36:59]

So, like, the real metric that might influence your support for Trump is your degree of, you know, what people call social connectedness, for example: people who have greater ties to their communities, as measured in various ways, tend to be more Democratic, and people who have fewer ties tend to be more Trump-y and Republican, even when you control for education (think of the person who got a degree from a four-year college and then fell off the grid). Demographic weighting can't solve every single problem that response bias is pertinent to.
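For readers who want to see what "weighting by education" means mechanically, here is a bare-bones post-stratification sketch in Python. The population shares and respondents are invented, and the final comment flags the limitation Nate describes: reweighting cells cannot recover people who never respond at all.

```python
# Bare-bones education weighting (post-stratification) with invented numbers.

population_share = {"college": 0.35, "non_college": 0.65}
sample_share     = {"college": 0.50, "non_college": 0.50}  # too many graduates responded

# Up- or down-weight each group so the sample's education mix matches the population's.
weights = {group: population_share[group] / sample_share[group] for group in population_share}
# college -> 0.7, non_college -> 1.3

respondents = [
    {"edu": "college",     "dem": 1},
    {"edu": "college",     "dem": 0},
    {"edu": "non_college", "dem": 1},
    {"edu": "non_college", "dem": 0},
]

weighted_dem_share = (
    sum(weights[r["edu"]] * r["dem"] for r in respondents)
    / sum(weights[r["edu"]] for r in respondents)
)
print(weighted_dem_share)  # 0.5 here: weighting fixes the education mix, nothing more.
# The catch from the transcript: if low-social-connectedness voters inside each
# education cell never answer the phone at all, no reweighting can bring them back.
```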

[00:37:28]

That's kind of the micro problem. The macro problem is pollsters not acknowledging that polling is messy under real-world conditions, and trying to live up to this gold-standard, textbook way of doing polling that simply isn't going to work as well in a world where most people don't respond to phone calls and there are big response biases of various sorts in who does. I understand why, from a marketing standpoint. I mean, again, polling is scientific to a degree. Again, any problem we talk about with the judgment calls pollsters have to make is like 20 times worse for any kind of news story that appears on any major website, including FiveThirtyEight, as well as for, we should say, designing models, which involves making far more complex judgment calls than doing a poll.

[00:38:19]

So we criticize ourselves there too. But not acknowledging that the real world is messy, that polling is an imperfect instrument, a good and necessary but imperfect instrument... I think pollsters have trouble communicating that. Like, hey, get used to us being wrong some of the time. We're going to be right most of the time, we're going to help inform you, but it's going to get a little messy and we're going to have a bad election now and then.

[00:38:44]

And that's OK. We don't have to self-flagellate every time we have a bad election. We don't have to promise that we've reinvented things and solved every problem every time we have a bad election. We don't have to buy the media narrative every time we have a bad election. Instead, say, you know what, actually give us some credit, because we do actually call the major things right. We did say Joe Biden was going to be president.

[00:39:03]

We did say Democrats would win Congress. Yeah, we were pretty nervous about Congress until Democrats won that runoff, and it got kind of hairy there on election night. But, like, if we don't get the big stuff right, then criticize us; don't criticize us when we're close on the margins but have the big stuff right, as if we got the big stuff wrong. So be consistent, media. But anyway, that's kind of a little rant. All right. Fair enough.

[00:39:20]

I posted the findings from our new pollster ratings on social media shortly before we jumped on here, so we did get a couple of questions that I want to pose to you before we wrap things up. And this one was a good question,

[00:39:33]

I thought, from Josh. He asks: low response rates to phone polls and the nature of online polling mean that it's basically impossible to get a truly random sample. Should this change the way that we think about polling, and have pollsters adequately updated their thinking and methods to reflect this? This kind of gets at maybe what you were just saying. But, like, if we're kind of accepting that all the methods we have for polling can't find us a truly random sample, what does that say about polling more broadly?

[00:40:03]

Well, I mean, I think, first of all, all pollsters are thinking about this, right? I think they're very thoughtful, and they have incentive to be thoughtful, frankly, more incentive than we have. If tomorrow Joe Biden banned polling and nobody could look at a poll again, right, I think that would be really bad for democracy, but I couldn't be happier to go find something else to do, right? If you're a pollster, though, that wouldn't be very good for business.

[00:40:25]

But, like, what would it be? What would you do, play poker or something? Write a book? Write a book? That sounds worse than polling.

[00:40:33]

No... you've got to zoom out, Galen. Everyone's going to, like, microblogging and social media. And, man, you've got to zoom out and zoom out.

[00:40:41]

All right. Well, we're waiting for a second Nate Silver book. What's it going to be about? I actually am not going to say anything right now, but there are things turning over in my mind.

[00:40:50]

So when the time comes, are we breaking news?

[00:40:55]

A second time on this podcast, when there is a book project. Frankly, it's the first time I haven't had a big major project to do, and I kind of got this medium-sized project of the pollster ratings off my desk. So my plan is, like, I'm going to go get some really good noodles for lunch and then maybe start thinking about my next big project. Maybe it would be a book or something, but nothing is immediately in the offing.

[00:41:16]

But, like, fair enough. But I interrupted you. Yeah. I mean, actually, when I think about it, I think text messaging is intriguing, right? Because people are probably... well, I don't know, actually, I don't know how people respond to spam texts. But that's maybe a way where you can randomize, sort of, if you have a list of numbers. So I guess you're now not getting people who only have landlines; it becomes a problem. I don't know.

[00:41:34]

There's no, like, perfect solution. I mean, I've said this before, but the difference between, like, polling and modeling is becoming murkier and murkier, and that will continue to be the case, I think. Yeah, I keep hedging back and forth between, like, saying it's always been like this, that it's always been a circuitous path and polls have had bad years before. I mean, what happened in 1980, where they underestimated Reagan by seven or eight points or whatever? Was polling broken then?

[00:42:05]

So I'm quite agnostic on whether this is something new or something that has always been how polling works in the real world. I mean, one thing, and I think I've said this before: there's a good argument for, like, pluralism. It's good that we have phone polls and text polls and IVR polls and online polls. It's good that we have some polls that are very rigorous and maybe some that are taking more unusual, ad hoc approaches.

[00:42:35]

It's an argument for like aggregation and averaging.

[00:42:37]

I think we've got a second question and then we'll wrap things up. It was: why doesn't everyone just use Ann Selzer's methods? Which is an interesting question, because Ann Selzer uses live-caller polls that call cell phones and landlines. But it's also important to note that she has the highest-ranked poll of any of the pollsters that we assessed, despite the fact that we found that which methodology you use may not be as important as we once thought. So what's up with that?

[00:43:08]

Why is Ann Selzer still in the lead when methodology seems not to be the driving force behind quality anymore?

[00:43:16]

Well, first of all, saying that you can't tell how much quality a poll has based on methodology isn't the same as saying that there isn't higher- and lower-quality polling. So one thing we find is that pollsters that abide by better transparency standards, pollsters that are, like, part of the AAPOR Transparency Initiative or contribute to the Roper archive, actually do quite a bit better. So it's not that all things are created equal. And pollsters that have been around longer...

[00:43:38]

Any pollster that has fewer than 20 polls in our database gets penalized, because you don't have an established track record. So if someone is very transparent and open about their methodology and has a long track record of success, then you should trust them more. And that definitely describes Ann Selzer. Like, what does she do that other pollsters don't? I mean, I think what she'd say (you should have her on again) is that, again, she kind of famously had a poll on the eve of the race that showed Democrats losing ground in Iowa.

[00:44:05]

That freaked a lot of Democrats out, I guess somewhat appropriately, about Iowa. Certainly, she trusts her data. I mean, she doesn't mind being an outlier. She doesn't mind being criticized by, like, people and parties who are going to be apt to criticize. That's part of it. Like, one thing I do worry about with pollsters is that in this environment, where everyone is online all the time, they're very attuned to the reaction to their polls. And people behave in very partisan ways in election campaigns.

[00:44:34]

Right. And so, you know, I worry a little bit more about groupthink. And again, we have apologized to Trafalgar; they've got an A-minus rating now. But I wish that they hadn't been so willing to engage in internet combat, and had just kind of quietly done their thing like Ann Selzer, right? And, like, you were going to... That's quite the thing for you to say.

[00:44:57]

I don't know, man. I don't know.

[00:44:59]

I think there's, like, narratives that... like, come on, you totally argue for standing your ground. I do. But I don't make it personal. I guess I don't know what Trafalgar was doing online.

[00:45:09]

And I mean, I guess they're all trying to get attention, but maybe not personal attention. Personal, right. But they're trying to, like, tag me in a lot of stuff and kind of get me involved and kind of build things up. I will defend my ground, but I try really, really hard not to make any kind of personal jabs at anybody. Well, also, I don't know, I mean, I guess I think my job's a little different than that of a pollster.

[00:45:27]

Maybe not. Maybe I should just... I mean, again, probably if polling and Twitter disappeared tomorrow, those are two things that I'd miss, but it probably would improve my quality of life anyway. All right.

[00:45:39]

Is that the note that we're going to end the show on? I do want to make the standard point that there are many reasons why it's very important to know what public opinion is, including, by the way, that we are now in a world where at least one of the two major political parties, the Republican Party, is threatening to, like, undermine the integrity of elections in different ways. I hope that America doesn't become one of these third-world countries where you have to have polls as a way to kind of audit the election results.

[00:46:11]

But, like, I don't know, if you have, in two years or four years or eight years, some secretary of state who's going to stop ballot counting in a certain state, it's nice to have accurate polling, to have some other measure of what public opinion thought about an election. It's nice for President Trump or President Biden to know what people think about an issue they're trying to carry out. I mean, I'm not, like, a constitutional scholar.

[00:46:37]

I don't want to get into interpretations of democracy. But, you know, to have some way to understand what the popular will is apart from it being interpreted by other vessels, I don't want to have to take a politician's word necessarily for what the popular will is in his or her district. I don't want to take the media's word for like we kind of helicoptered into this district and went to a diner. And here's what people think, right. I would much rather kind of see like an actual poll.

[00:47:06]

And by the way, there was one big success for polling that we haven't talked about in the twenty twenty cycle, which is that the polls for almost the entire race showed Joe Biden as the most popular candidate in the Democratic primary. And a lot of my friends, a lot of people in the media, thought they knew better. They thought, oh, people in my bubble really like Elizabeth Warren, and Pete Buttigieg, and people like that, and they think that Biden's support is just name recognition. And they were wrong.

[00:47:32]

That support for Biden was real and was actually concentrated among people in, like, underrepresented groups: older voters, minority voters, poorer voters. And the media kind of ignored that in favor of its own narrative. And then the polls were right that Joe Biden would go on to, eventually, a fairly commanding victory in the Democratic primary. So that seems like a pretty important example of a good use of polling.

[00:47:53]

All right. So polling: a love-hate relationship. Yes. Glad we could end on a more positive note after you were like, "if polling would just go away, my quality of life would be so much better." The other thing was Twitter; I feel like you have more control over that, and I don't think we need Twitter for democracy. But anyway, I think that's where we'll leave things. And also, tell people to go check out the article that you wrote, where you can see all of the pollster ratings.

[00:48:17]

A bunch of our colleagues worked very hard on it; every pollster now has their own page, and you can see all of the different polls they've conducted in comparison to how each race turned out, etc. It's really handy and it's beautiful, so folks should go check that out on fivethirtyeight.com. But anyway, thank you, Nate. Thank you, Galen. My name is Galen Druke. Tony Chow is in the virtual control room. Claire Bidigare-Curtis is on audio editing.

[00:48:40]

You can get in touch by emailing us at podcasts at fivethirtyeight dot com. You can also, of course, provide us with any questions or comments. If you're a fan of the show, give us a rating or review in the Apple Podcasts store. Go rate us, for real, or tell someone about us; that works too. Thanks for listening, and we'll see you soon. Like, it will become a metaphor or simile, like "getting stuck sideways in the Suez Canal." In 60 years, some old-timey person will be like, "it's like getting stuck sideways in the Suez Canal."

[00:49:17]

Like, Harry Enten will say that when he's, like, eighty-three, and people will be like, what? What does that mean? Like, the thing that you would say is, like, "getting stuck sideways in the Suez Canal," like "up the creek."

[00:49:31]

It's like you're in the Lincoln Tunnel and there's a traffic jam. Right. You're like stuck sideways in the Suez Canal, OK.

[00:49:38]

There was much talk of a big question, he wants to get out.

[00:49:43]

There is no way out. From Best Case Studios and ABC Audio, listen to In Plain Sight: Lady Bird Johnson, a new podcast about the power of a political partnership, one that somehow doesn't show up in the many, many accounts of Lyndon Johnson's presidency, told through Lady Bird's own audio diaries and available now wherever you listen to podcasts.