Model Talk: The 2020 Forecast Is Live!
FiveThirtyEight Politics
- 12 Aug 2020
FiveThirtyEight's 2020 presidential election forecast is live! In this edition of "Model Talk," Nate Silver and Galen Druke break down what is new in this year's forecast and where the uncertainty lies.
I'm Galen Druke. I'm Nate Silver, and this is Model... Model Talk.
Wow, you really, you really tried to unsynchronize that, right? I did, yeah.
I felt a little spunky for some reason.
Hello and welcome to the FiveThirtyEight Politics podcast. I'm Galen Druke.
I'm Nate Silver, and this is Model Talk.
All right. There we go. So if you are listening to this podcast, it means that we have officially launched the 2020 presidential forecast model, and you can go check it out at fivethirtyeight.com. It has been a while. It has been four years since we've had a presidential forecast model, but it is finally here. Thank you for being patient, all of our listeners. There is lots of information to explore on this new version of the forecast, but I will give you all the top-line number.
As of Tuesday, August 11th, Biden is favored to win the election with a 71 percent chance.
So, Nate, you know who else had a 71 percent chance of winning the election? You read my mind. So those odds probably sound familiar to our listeners. They definitely sound familiar to me. And I have a lot of questions for you. But I'm just going to go straight to the most obvious one, which is that those were basically Hillary Clinton's odds on Election Day 2016. We just discussed on the podcast on Monday how Biden has been in a stronger position than Clinton was so far this cycle.
So why isn't the model more bullish on Biden at this point?
Because it's August. Nothing has really happened yet. I mean, a lot of things have happened: 150,000 people have died. There are the most significant protests against racial inequity since the 1960s. A lot of stuff has happened, but we're still fairly early in the campaign, at a time when the world is pretty uncertain. I know that sounds very colloquial, but there are ways you can actually account for that empirically and include it in the model. We've had the most violently swinging economic data basically in American history.
Right. There's more news than ever. And yes, you can quantify these things. We haven't had the debates. We haven't had the conventions. Right. We haven't seen a whole ton of advertising yet. It's still pretty early. And although Biden has a pretty big lead now, there are reasons to think the election could tighten. If the election does tighten, then the Electoral College probably still favors Trump, as it did in 2016. So, you know, look, to me, in August, to say a candidate has a 70 or 72 or whatever percent chance of winning is fairly high in some sense.
It's much lower than other models. But, you know, I think other models have not been designed as carefully to think about uncertainty in the world.
Is this forecast model substantively different than the forecast model we put out in 2016? And if so, why?
So it's not different because of 2016. We think our model did well in 2016. It gave Trump a much better chance than other models did, than the conventional wisdom did, than the prediction markets did. So if you'd been following our model, you would have bet a lot of money on Trump, if you were inclined to do that kind of thing, and made a lot of money. So we didn't think there was anything fundamentally wrong about the math behind our model. We thought carefully about how it was presented.
However, we have this thing called SARS-CoV-2, right? COVID-19. And we have, like I said, economic data the likes of which we haven't seen maybe ever in this country, certainly not since the Great Depression, for some of this, like the GDP decline in the second quarter, for example. So we did a lot of thinking around that, right, around the additional uncertainties that covid creates, spending more time on the economic index that we've used, thinking about how the mechanics of voting could be altered.
You know, I mean, if this were a boring, ordinary re-election campaign, which I don't know is possible in the era of Trump, right?
But let's say it was, you know, Mitt Romney running for re-election and there was a mild recession and the polls are really stable. And Romney was eight points behind Joe Biden or something, and there was no covid pandemic then. OK, then maybe you could have fairly high confidence that Biden would win. Right. Although still not, I wouldn't think 90 or 95 percent confidence. It's too early, but it's not the world that we're in. Right.
We're in a world where there's an unprecedented amount of news, where voting is going to be very different than it was before. And we're not even accounting for what I call just like extracurricular shenanigans by President Trump, or extraconstitutional ones.
Right. We're not trying to account for, to put it a little dramatically, anyone trying to steal the election. Right. That's kind of beyond the scope of the model. Even assuming the election goes on as normal, it's still pretty early. And if you go back and look at elections for which we have data (there's robust state-level polling data going back to 1972, and there are national polls going back to 1936), having an eight point lead, and really it's more like the equivalent of a six point lead because of the Electoral College, having an eight point lead in August is not necessarily secure. There are elections where the polls swing by more than eight points between now and November.
So from what I understand in looking over the forecast model and reading your methodology, by including more data from history, which includes big events like the stock market crashing or a past pandemic in 1918 or other big events going back to the 19th century, it injects more uncertainty into the forecast. So I guess I'm curious, can you try to quantify how much more uncertainty this current forecast model includes as opposed to past ones, like the one in 2016?
I'm not sure that it intrinsically includes more uncertainty. You know, what we're saying is there are now more factors that we use to measure uncertainty, a couple of which are high because of covid. Right. The ones that are high because of covid are, first, this factor where we look at how much the economic variables that we use in our economic index change, and they've changed a lot.
And then one based on literally how much front page news there is. So how often does The New York Times have a full-width headline? Typically that might happen 10 times all year. This year we've already had like 30 of those, right, related to covid and the protests and President Trump's impeachment.
So in 2016, there was also decently high uncertainty, but for different reasons.
Right. In 2016, there were a lot of undecided voters. In 2016, the polls fluctuated more than they have this year. So, yeah, we're not even saying that this year is especially uncertain. We're saying it's probably about average relative to the entire data set that we use. And the entire data set includes cases like 1988, where Michael Dukakis was ahead at this point and lost in a landslide in the end. Right. It includes cases like 1992 where, you know, there were giant 10 or 20 point swings back and forth.
1980, right, where Reagan was way ahead and Carter made a big surge and Reagan won in a landslide. Right. People are used to this kind of period from 2004, 2008, 2012, where the polls were fairly stable. And that's actually more the exception than the rule.
Yeah. I mean, so as you said, this forecast model is a bit different from the one in 2016. I'm curious if you plugged in the data from 2016 into this forecast model, would it be much different from what we actually forecasted back then?
No, it wouldn't, because again, we think that in 2016 the model did great, pretty much. Let me put it like this, right? If you designed a model that's polling driven that would have predicted a Donald Trump win in 2016, then you made a terrible model. Then you're fighting the last war. You obviously manipulated the way you structured that model, and you probably don't know what you're doing. Right. Oftentimes when you're building a model, you can overfit to past data.
And it's not necessarily a good indication of a model that it retrofits well, right? Prediction and retrofitting are related up to a point. The first, you know, 80 percent of model building is saying, OK, how would you have optimized prediction on past elections? But the other 20 percent is where people actually differentiate themselves. Right. And that 20 percent is thinking about what are the things that would lead me to overestimate how certain I am about future data, and overestimate how much I know about unknown data, given this relatively limited sample size of past data. So I'm never impressed, I don't really care, you know, when someone tells me what their forecast would have said. Right. I do care a little bit about edge cases, where if you had some very strange election where something really weird happened, or a weird state, right, I would care a little bit more about that. I think it's usually more important for a model to give reasonable answers, quote unquote, to as many cases as possible than to give exact answers, if that makes sense.
Mm hmm.
But yeah, "what would the model have said in 2016," I think, is a really overrated question, because I could optimize the model in a way that would give an answer that people would like to that question, but it wouldn't necessarily make the model any better this year. Right.
I guess the question was more getting at, in general, is this forecast model one that just has a more open mind about how, three months away from an election, we don't actually know what's going to happen? Like, is there just more uncertainty in general? And it sounds like in some cases in this election, yes, but in others, no.
I mean, there are indicators that point to low uncertainty, like the low number of undecided voters, but there's covid, and there's 10 percent, or whatever it is now, unemployment. Right. And there's all the news.
And empirically, you know, polls change in response to news.
Now, there are fewer swing voters than there once were. Right. So you have a lot of stimulus and relatively few people. But still, I mean, even with the stable polls, the polls have gone from Biden being ahead by like four points to ahead by more like nine points, and maybe now more like eight points. Right. I mean, you know, going from four to nine is a decently large shift. At four points, you kind of get more into 2016-on-Election-Day territory, where just a slight polling miss, combined with the Trump Electoral College advantage, could be enough to allow him to win a second term.
If the current polling and fundamental data held on November 2nd, the day before Election Day, what do you think the forecast would show for Biden and Trump? You know, how much of what we're seeing is the result of being months away from Election Day?
Most of it is, right. I think if you tell our model the election is today, then it gives Biden something like a 90 or 92 percent chance of winning the Electoral College. Yeah, I mean, look, an eight point lead nationally, and it's, again, probably more like six or something in the swing states, six or seven. Right.
That is a pretty big lead. It's not out of the realm of possibility that a polling error that large could happen. Right. But that would be a genuine upset. But for now, the question is mostly that things could change a lot between now and November.
You've been working on this for months at this point, you know, combing through historical data and all kinds of stuff, working with our colleagues. And then at the end, you actually put it all together, run the forecast and you get a number yourself that you're seeing for the first time. Right. Were you surprised at all when you saw that?
No, I thought it was lower than I expected for Biden. You know, I mean, there have been other models out there, and I kind of assumed that they were being very careful about all of these things after 2016. And I think they weren't being careful. So, you know, it anchored my priors in a different place than maybe they should have been.
And to be fair, like, you know, kudos to other people for having other models, like, out first. Right. But no, I was surprised that it wasn't like Biden at 85 percent or something.
I want to talk a little bit more about the fundamentals that go into this forecast model, for new listeners or for people who want a refresher, since it's been a while since we've gotten into the nitty gritty on this stuff. To break it down for folks: what are the key components that go into a forecast model like this? We have polls, we have fundamentals.
What do we mean when we talk about fundamentals and what kinds of polls are we using?
We use pretty much every poll, but state polls are more important in the forecast. National polls have an important role in calculating what we call a trend line adjustment. Right. So national polls can tell you, oh, OK, well, Biden's gained two points since last week. We haven't had a poll of Nevada recently. So therefore, we can, you know, nudge upward a poll of Nevada to account for the shift in the race overall.
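The trend line adjustment described here can be sketched in a few lines. This is only an illustration of the idea of nudging a stale state poll by the national shift, not FiveThirtyEight's actual code. The function name, the `pass_through` parameter, and all the numbers are hypothetical.

```python
def trend_adjusted_margin(state_poll_margin, national_then, national_now,
                          pass_through=1.0):
    """Shift an older state poll by the national movement since it was taken.

    All margins are Biden-minus-Trump, in points. pass_through is an
    assumed knob for how much of the national shift to apply to the state.
    """
    national_shift = national_now - national_then
    return state_poll_margin + pass_through * national_shift


# Hypothetical numbers: a Nevada poll showed Biden +5 when the national
# average was Biden +6, and the national average has since moved to +8.
print(trend_adjusted_margin(5.0, 6.0, 8.0))
```

With full pass-through, the two point national gain moves the old Nevada poll from Biden +5 to Biden +7. A real model would presumably estimate how strongly each state tracks the national trend rather than assume a fixed value.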
But for the most part, it's an election where obviously the vote is tallied up state by state. And having a poll of a state is usually better than trying to make inferences about the standing of that state, although you can do a blend. And so, you know, it's driven by state polls. We also have a fundamental forecast, we call it, or a prior, which is based on an index of economic conditions, but also accounts for incumbency and polarization.
You know, perhaps surprisingly to people, that prior actually says that Trump is only a narrow underdog. The reason being that, number one, not all the economic data is all that bad. So income, for example: because of the CARES Act, the government has given tons of money to people. That may be running out now, and that's a big concern if I were Trump, right? But the government gave a lot of money to people, and that meant disposable income actually went way up.
And that's one of the six indicators that we use. Meanwhile, inflation has been low. The stock market, bizarrely I guess, has been pretty good. Those are other indicators that we use. And so, you know, between that and the fact that it's a highly polarized climate, so maybe things don't have as much effect anyway, and the fact that Trump is an elected incumbent, the prior only has him losing by a couple of points, even though he's behind by eight points or something in the polling average.
So it actually expects the election to tighten a bit.
And how much does the economic data weigh on the forecast? For example, if Congress doesn't come to some sort of agreement on a relief bill and that extra six hundred dollars in unemployment benefits goes away, how much could that potentially change the forecast?
It would depend. We use different ways of forecasting. So one thing I should mention, right, is that we actually are forecasting what the economy will look like in November. Because we're not economic forecasters ourselves, we use surveys of professional forecasters. We use the Wall Street Journal economic forecasting panel. We also use the stock market, which, even though this will annoy some people, actually does have some predictive power for macroeconomic conditions. So, yeah, we're currently assuming, based on those forecasting devices, that the economy will improve pretty meaningfully between now and November. If you don't have further stimulus passed, then I think that's a more questionable assumption. Right.
And so you'd probably initially see that in the stock market going down, or forecasters predicting less growth in the third quarter, and then it would eventually affect the variables themselves that we track, potentially. You know, you can imagine disposable income going way down this month, for example. That would have a negative effect on the economy and therefore Trump's chances.
You know, but the prior is not weighted that much, right, at this point. It's, you know, 75 or 80 percent based on the polls and 20 or 25 percent based on the prior, and the prior ramps down to zero by Election Day.
We mentioned that we're accounting for covid-19 in this forecast model. So how exactly does the forecast incorporate that data?
Frankly, in some ways that are not that important to the overall model design. There's not any assumption in the model that, oh, because Trump is screwing up on covid, he'll lose, or whatever. Right. You know, in fact, I think the reason to think Trump will lose is not because of the polls, but because of, like, hey, look, he screwed up on covid. People have had their lives kind of ruined for the past six months because of covid.
They won't get that much better before Election Day. And so therefore he'll lose. Right. That's a perfectly coherent hypothesis. It's not something that would be the result of the polls or a statistical model. So in some sense, I think the model should be cautious, and everyone else should maybe be cautious. Right. You know, I'm not going to demean other models, but Trump has more than a 10 percent chance of winning the Electoral College. Right. But OK, how is covid data used in the model?
In a couple of ways. One is that, as a way to enhance or as an alternative to the polls, we run a regression analysis that says, OK, which factors predict state-by-state polls? You can use that prediction, that model estimate, we call it, to fill in states where you have little polling or no polling. Right. And one of the things it tries to assess is whether any of the polling is driven by covid cases. Where states have a worse covid outbreak, is that affecting Trump's standing there, positively or negatively, for that matter?
Also, we allow error to be correlated based on covid cases.
Right. So it could be that Trump would underperform his polls in a bunch of states with covid, or that covid could affect the mechanics of voting. So you'd have error in ways that would be correlated with covid. So technically, yeah, the data gets used, but we're not making any particularly strong assumptions about it. We're just allowing covid to be like a vector, if you will, that affects how the error is distributed state by state.
Right. So in the way that in 2016 having the upper Midwest correlated meant that we gave Trump a better chance than other forecasts, in this case, states with high levels of covid cases are now also correlated. For example, I think New York and Arizona have both had bad outbreaks. So while they may not seem obviously correlated from an electoral standpoint, that's how they're correlated in this model. Is that correct?
That's right, yeah. And there are, you know, literally dozens of different correlations that we account for, ranging from how old the states are, in terms of how old the average citizen is, to what region of the country they're in, to latitude and longitude, to, you know, racial and religious categories. But we've added covid, and for that matter, also how much of the state's vote we expect to be cast by mail. Those are things which are potential sources of correlated error.
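To see why correlated error matters so much, here is a toy simulation. It is a sketch under stated assumptions, not the forecast's method: three hypothetical states where the leader is ahead by 4 points, a polling error with a 4 point standard deviation in each state, and a single shared factor standing in for the dozens of correlations mentioned above. The function name and every number are made up for illustration.

```python
import random


def sweep_probability(rho, n_sims=20000, margin=4.0, error_sd=4.0,
                      n_states=3, seed=538):
    """Estimate how often the trailing candidate wins every state.

    rho is the assumed correlation between any two states' polling errors.
    Each state's error is a shared draw plus a state-specific draw, scaled
    so the total error keeps the same standard deviation for any rho.
    """
    rng = random.Random(seed)
    shared_sd = error_sd * rho ** 0.5
    own_sd = error_sd * (1.0 - rho) ** 0.5
    sweeps = 0
    for _ in range(n_sims):
        shared = rng.gauss(0.0, shared_sd)
        # The underdog sweeps only if the leader's margin flips in all states.
        if all(margin + shared + rng.gauss(0.0, own_sd) < 0
               for _ in range(n_states)):
            sweeps += 1
    return sweeps / n_sims


print("independent errors:", sweep_probability(rho=0.0))
print("correlated errors: ", sweep_probability(rho=0.75))
```

Each state looks equally uncertain in both runs. The only difference is whether the errors move together, and that alone makes an underdog sweep many times more likely, which is the point about the upper Midwest in 2016.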
What considerations do you make for how different states are voting? Because in a lot of states, people are going to be able to cast a ballot by mail if they wish because of covid in some states, including states like Texas. That's not the case. Not anybody who wants to will be able to cast a ballot by mail. So how much does that change the fundamentals or assumptions about that state?
We use something very useful that a group of academics put together called the Cost of Voting Index, which looks at how easy or difficult it is to vote in each state. When there are changes in that, that can change the composition of the electorate in ways that are fairly predictable, meaning that when you make it easier to vote, Democrats tend to benefit, and when you raise barriers to voting, Republicans tend to benefit. So those academics graciously provided an update of their index for 2020.
It appears to be published through 2016.
I'm not entirely sure how much they're doing to account for covid or not, but Texas is a state where it's become harder to vote according to their index. And so that actually makes the model more bearish for Biden and more bullish for Trump in Texas than other models might be.
And that's actually been a point of contention, which we've talked about on this podcast: whether or not making it easier to vote, i.e., vote by mail or longer periods for early voting, actually boosts Democrats' odds.
It sounds like you're saying that they do. While we've heard from other academics that they don't that it doesn't really shape the outcome of an election. So is that like a new piece of data? What should we make of the conflicting messages?
I think the voting by mail piece of it is something where people have made stronger claims. And I think that's partly because voting by mail is something that both parties really did take pretty good advantage of. Right. And there are things like, if you have an absentee ballot, it helps if you're older, it helps if you have a stable household, right? So voting by mail may be one of the exceptions, actually, where it did have a neutral effect or maybe, who knows, even helped Republicans.
Right.
But overall, like, look, there is a reason that Republicans spend a lot of time trying to make it harder for people to vote, which is that Republicans benefit when fewer people vote, especially people who are poor, who are minorities. They benefit when those people have a tougher time voting. Right. It's very plain. And Democrats spend a lot of time fighting back against that. And so if any academics are saying that overall it's just kind of totally neutral, then, number one, I don't think that's what the data actually shows.
And number two, you're saying that all this apparatus that exists for both parties to spend huge amounts of legal fees and time and energy on, in the GOP's case making it harder for people to vote, in the Democrats' case making it easier, you're saying all these people are idiots for doing that. I mean, you know, again, the narrow claim about mail voting, I haven't looked at that. I have reason to believe the academics are right about that. But no, look, I mean, news flash, right?
Republicans don't want, like, black people to vote, basically. That's kind of one of the things you have to conclude about the American political system. I'm glad that we're finally incorporating some of that logic into our model, because it empirically shows up pretty strongly.
What other new data are we considering this time around? We have covid data. We have how easy it is to vote. What else is new?
I mean, that's really most of it. We have somewhat more sophisticated handling around events like debates. What can happen around debates, or conventions, which are a more classic example, is that a candidate will get a bounce in the polls and then that will dissipate over time. So previously the model had a convention bounce adjustment, but now it also has a second strategy, which is that when there's a major event, it will take a snapshot of the polls before that event happened and then hold on to some of that snapshot for a couple of weeks until we see if any change in the polls is permanent.
Right. Does that make sense? Yeah.
So instead of someone having a good debate and the forecast immediately peaking, it says, well, let's wait and see if the enthusiasm around this... Yeah.
It'll say, let's give you half credit for the apparent gain in the polls. Right. And then after a week or two, we'll go ahead and give you full credit, basically. It's a pretty minor adjustment, but it does mean that, you know, the model wouldn't have bounced as much toward Mitt Romney, for example, after the first debate in 2012, or I guess toward Clinton after the first and second debates in 2016.
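The snapshot and half credit idea can be illustrated with a small function. The half credit at the start and full credit after a week or two come from the description above. The linear ramp in between, the 14 day window, and the function and parameter names are assumptions made only for this sketch.

```python
def credited_margin(pre_event_margin, current_margin, days_since_event,
                    hold_days=14, initial_credit=0.5):
    """Blend the pre-event poll snapshot with the current polling average.

    The post-event change gets initial_credit right away and ramps
    linearly (an assumption) to full credit after hold_days.
    """
    if days_since_event >= hold_days:
        credit = 1.0
    else:
        ramp = days_since_event / hold_days
        credit = initial_credit + (1.0 - initial_credit) * ramp
    change = current_margin - pre_event_margin
    return pre_event_margin + credit * change


# Hypothetical: a candidate led by 4 before a debate and polls now show +8.
for day in (0, 7, 14):
    print(day, credited_margin(4.0, 8.0, day))
```

On the day of the event the model in this sketch would treat the race as +6 rather than +8, and it only accepts the full +8 if the polls still show it two weeks later.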
So it'll be a little bit more conservative about flopping around as a result of an event like that.
And we've also expanded our data.
That's right. Yes. So that economic index is now constructed on elections going all the way back to 1880. It's a big problem when you have a small sample size in these fundamentals models.
And so going back to 1880 lets us capture the Great Depression and other very strange economic conditions. And so I think that makes our model more robust. Also, if you go back to 1880, it's not as clear that the economy is so deterministically predictive of which party wins the election. So it's one of the reasons why we don't weight the fundamentals that much: if you actually expand the sample size to data that you didn't test the model on originally, well, it turns out maybe the economic fundamentals are not as fundamental. You know, certainly, obviously, Hoover did not do very well, right, if you have a really severe depression. But, yeah.
And by the way, I would say, too, that even if you're using an economic prior, it's really covid that voters are blaming Trump for, and not the economy. His approval ratings on the economy are actually break-even. They're fairly decent. And so if you rigged up a model that says, oh, you know, it's the economy, it's all about the economy, well, no, it's all about covid, you know what I mean?
That's what the polling data says if you actually look at it. And so maybe, coincidentally, you're kind of indirectly having two wrongs make a right. Right. The economy gets wrecked because of covid, and therefore you use an economic variable, and therefore you say people voted against Trump because of the economy. Right. But yet again, in our model, it kind of reverts toward a prior of zero or, I think, you know, Trump losing by two points or something, which would be very close in the Electoral College.
And so it's just kind of saying, hey, things might tighten down the stretch run here.
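The weighting between polls and prior mentioned earlier (roughly 75 to 80 percent polls now, with the prior ramping down to zero by Election Day) can be written out as a small blend. This is a sketch, not the real model: the linear ramp, the 0.22 starting weight, the 84 days between August 11 and November 3, and the function names are all assumptions, and the margins are the rough figures quoted in the conversation.

```python
def prior_weight(days_until_election, weight_now=0.22, days_now=84):
    """Weight on the fundamentals prior, shrinking to 0 on Election Day.

    A linear ramp is assumed here; the source only says it ramps to zero.
    """
    days = max(0, min(days_until_election, days_now))
    return weight_now * days / days_now


def blended_margin(poll_margin, prior_margin, days_until_election):
    """Weighted average of the polling margin and the prior's margin."""
    w = prior_weight(days_until_election)
    return (1.0 - w) * poll_margin + w * prior_margin


# Rough figures from the discussion: polls near Biden +8, prior near Biden +2.
print(round(blended_margin(8.0, 2.0, 84), 2))  # 84 days out
print(round(blended_margin(8.0, 2.0, 0), 2))   # on Election Day
```

With these illustrative inputs the blend sits a bit under Biden +7 in mid-August and converges to the polling average by Election Day, which is one way to read "things might tighten."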
All right. So let's do a little rapid fire before we close out this episode of Model Talk. And I'll say that in future episodes of Model Talk, we will answer questions from listeners. So listeners, if you have any questions, you can tweet at us. You can also email us at podcasts@fivethirtyeight.com. We're going to be doing this through the election, and we'll have lots of time to answer your questions. But all right. Here's some rapid fire.
Are you ready? Mm hmm. Yeah. At this point, according to the model, what is the tipping point state?
I don't know, actually, because there isn't one tipping point state ahead of time. The tipping point state is determined only after the fact.
What is the likeliest tipping point state according to our forecast?
Yeah, it's like a tie between Florida and Pennsylvania. So among the three kind of Midwestern states, well, you can debate whether Pennsylvania is Midwestern or not, right, but in that Pennsylvania, Wisconsin, Michigan group, the one that has polled the best for Biden is Michigan. The one that has the least electoral votes is Wisconsin. So you could afford to lose Wisconsin or Minnesota in some cases where you couldn't lose another state.
The way the math works right now, Pennsylvania has actually polled a little closer than you would think, and it has a lot of electoral votes. So Pennsylvania is super important, and then Florida has a ton of electoral votes. And also, if Biden wins Florida, then he can lose Midwestern states and still win the election.
So Florida and Pennsylvania are the most important states according to the model, not the sexier Wisconsin and Arizona that people talk about as much.
And again, it's partly just because the number of electoral votes matters a lot. Florida has, what, 29? Pennsylvania has 20. If you win 29 electoral votes, then it becomes very hard for Trump to win.
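For a single outcome, the tipping point state can be found by lining up the winner's states from safest to closest and adding electoral votes until the winner crosses the threshold. The sketch below uses a made-up six state map with 100 electoral votes, so the state names, vote counts, and margins are all hypothetical, and the function is an illustration rather than the forecast's code.

```python
def tipping_point(states, votes_to_win):
    """Return the state that puts the winner over the top, or None.

    states is a list of (name, electoral_votes, margin) tuples, where
    margin is the eventual winner's margin in points (negative if lost).
    """
    won = sorted((s for s in states if s[2] > 0),
                 key=lambda s: s[2], reverse=True)
    running_total = 0
    for name, votes, _margin in won:
        running_total += votes
        if running_total >= votes_to_win:
            return name
    return None  # the candidate never reaches the threshold


# Hypothetical toy map: 100 electoral votes, 51 needed to win.
toy_map = [
    ("A", 10, 12.0), ("B", 29, 3.0), ("C", 20, 5.0),
    ("D", 10, 7.0), ("E", 16, -2.0), ("F", 15, -9.0),
]
print(tipping_point(toy_map, 51))
```

Here the large, close state "B" is the tipping point even though "E" was closer, which echoes the point that electoral vote counts matter a lot. Since the tipping point is only known after the fact, a "likeliest tipping point" would presumably come from repeating this across many simulated maps and counting how often each state comes up.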
Next question is, what is the likelihood of an Electoral College popular vote split?
According to the model, at this point in time, it's in the neighborhood of 10 percent. So I think we have Biden, you know, 70 or 72 percent to win the Electoral College and 80 or 82 percent to win the popular vote.
There is also a very outside chance that the polls are really discombobulated and that Biden wins the Electoral College and loses the popular vote.
That's like a one in 500 chance or something. That's quite low. But the chance that Trump does what he did in 2016, you know, there's a 10 percent chance of a repeat of that, roughly.
And that is key, right? If we were not using the Electoral College, then we'd say, OK, look at Biden's current lead in the polls, and even if that might tighten, that would translate to an 80-something percent chance of winning. So the Electoral College is an important factor here.
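The figures quoted here are consistent with each other, which a little arithmetic shows. The inputs below are midpoints of the ranges mentioned in the conversation (about 71 percent for the Electoral College, about 81 percent for the popular vote, and one in 500 for the reverse split), so they are rounded assumptions rather than exact model output. The function name is made up for this sketch.

```python
def split_probabilities(p_ec, p_pv, p_ec_without_pv):
    """Derive split chances for one candidate from three probabilities.

    p_ec: chance of winning the Electoral College.
    p_pv: chance of winning the popular vote.
    p_ec_without_pv: chance of winning the EC while losing the popular vote.
    Returns (chance of winning the popular vote but losing the EC,
             chance of any split).
    """
    p_both = p_ec - p_ec_without_pv
    p_pv_without_ec = p_pv - p_both
    return p_pv_without_ec, p_pv_without_ec + p_ec_without_pv


biden_pv_only, any_split = split_probabilities(0.71, 0.81, 1 / 500)
print(round(biden_pv_only, 3), round(any_split, 3))
```

Biden winning the popular vote while losing the Electoral College comes out at about 10 percent, matching the "repeat of 2016" figure, and the total chance of any split is only slightly higher because the reverse scenario is so unlikely.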
And let's leave things with this.
Should listeners expect a Senate or House forecast this cycle?
Yes, OK, but don't make me think about it right now because it's just me.
All right. Fair enough. Well, let's leave it there. I know that you deserve a break. It's a lot of work, but thank you.
Great. Thank you.
And a couple of housekeeping notes before we close out this podcast. We're going to be doing nightly podcasts during the conventions. So for next week, the Democratic convention, and the week after, the Republican convention, you can expect near-daily podcasts, so Monday through Thursday. But that also means that our regular Monday podcast will come out a little bit later than usual. We're going to be recording our podcast on Monday after all of the speeches have concluded at the Democratic convention. And the same goes for the following week. So you get near-daily podcasts, but you have to wait a little bit longer for the Monday podcast. Also, on a related note, the next few months are going to be very exciting and busy here at FiveThirtyEight.
So we are hiring a producer in a temporary role through Inauguration Day 2021 to help out on the politics podcast, working two to three days a week. So check out that posting on fivethirtyeight.com or my Twitter feed, where it's also posted, or if you think you know somebody who might be interested, let them know. My name is Galen Druke. Tony Chow is in the virtual control room. You can get in touch by emailing us at podcasts@fivethirtyeight.com.
You can also, of course, tweet at us with questions or comments. If you're a fan of the show, leave us a rating or review in the Apple podcast store or tell someone about us. Thanks for listening and we will see you soon.