Editor's Note: This transcript was automatically transcribed, so mistakes are inevitable. You can contribute by proofreading the transcript or highlighting the mistakes. Sign up to be amongst the first contributors.
Welcome to the Knowledge Project. I'm your host, Shane Parrish, editor and chief curator of the Farnam Street blog, a website with over 70,000 readers dedicated to mastering the best of what other people have already figured out. The Knowledge Project allows me to interview amazing people from around the world to deconstruct why they're good at what they do. It's more conversation than prescription. On this episode, I'm happy to have Philip Tetlock, a professor at the University of Pennsylvania.
He's the leader of the Good Judgment Project, which is a multi-year forecasting study. And he's also the author of the recently released Superforecasting: The Art and Science of Prediction. How we can get better at prediction is the subject of this interview. We're going to dive into what makes some people better and what we can learn to improve our ability to guess the future. I hope you enjoy the conversation as much as I did. Before I get started, here's a quick word from our sponsor. Greenhaven Road Capital is a small hedge fund inspired by the early Warren Buffett partnerships.
We have a fair fee structure and our portfolio manager is the largest investor in the fund. Our minimum investment is one hundred thousand dollars. Accredited investors can learn more at Greenhaven Road dotcom.
OK, so I guess we'll get started, and I want to talk about your new book, Superforecasting: The Art and Science of Prediction, that you wrote with Dan Gardner, who, like me, I think is still based in Ottawa. In the book, you say that we're all forecasters. Can you elaborate on that a little?
Well, it's hard to make any decision in life, whether it's a consumer decision about whether to buy a car or a house or whether to marry a particular spouse or potential spouse or a candidate to vote for in an election. Very hard to make any decision without forming at least implicit expectations about what the consequences of that decision will be. So whenever you're making a decision, there are implied probabilities built into that. So the question becomes, are you better off with implicit probabilities that you don't recognize as probabilities or explicit ones?
And I think one of the major takeaways from the forecasting tournaments we've been running is that when people make explicit judgments and they're fully self-conscious about what they're doing, they can learn to do it better.
And you're talking about the Good Judgment Project. Can you maybe introduce us to that a little?
Sure. Well, the Good Judgment Project is a research program that my wife Barbara and I started several years ago. It was supported by a research and development branch of the US intelligence community known as IARPA, the Intelligence Advanced Research Projects Activity, which models itself after DARPA in the Defense Department. And their mandate is to support research that has the potential to revolutionize intelligence analysis. So working from that mandate, they decided in 2010 to support a series of forecasting tournaments in which major universities would compete.
Researchers at major universities would compete to generate accurate probability estimates of possible futures of national security relevance. And we were one of the five teams selected for the competition in 2010. The tournaments ran from 2011 to 2015. They ended in June of this year. And the Good Judgment Project, I am proud to say, was the winner of those forecasting tournaments. I can explain more about what winning a forecasting tournament means later, if you want.
Congratulations. Yeah, definitely. Is there a difference between forecasting and predicting?
I don't see one. I think that if you go to the thesaurus, you're going to find they're virtual synonyms. Some people may try to draw distinctions of one sort or another, but I see it essentially as a distinction without a difference.
And so were you using a representative subset of the Good Judgment Project, or were you using super forecasters from the project? How were you competing in that?
Well, different universities and different teams of researchers took different approaches to generating accurate probability estimates. We recruited thousands of forecasters and we explored a number of different techniques for eliciting the best possible probability estimates from those forecasters. We were continually running experiments, and one of the experiments we conducted was to identify top performers, the top two percent of performers each year, cream them off into elite teams, super teams of super forecasters, give them as much intellectual support as we could for their task, and see what would happen.
And they really went to town. They did a phenomenally good job. They blew the ceiling off all of the performance expectations that IARPA had for what was possible. And frankly, they certainly exceeded my expectations as well.
So some of us are good and some of us are bad and some of us seem like way off the chart at making predictions. Why are some people so good?
That is indeed the sixty-four-thousand-dollar question. Why are some people so good? So the skeptics argue that if you toss enough coins enough times, some of them are bound to come up heads. So the super forecasters are just super lucky. So let's treat that as one of the default skeptical hypotheses. There's nothing special about super forecasters. If we ran a tournament in which the task was to predict whether a fair coin would land heads or tails, some people would do better than others just by chance.
In a given year, we could anoint those people as super coin toss predictors and we could say, well, how are they going to do the next year? And what we would find is perfect regression toward the mean. The best prediction is that the super coin toss predictors in year one will be essentially around the average in year two, and the worst predictors will regress upward toward the mean, of course. So that's what a pure chance environment would look like.
Well, what we find in the tournament is that there certainly is an element of chance in predicting geopolitical and economic outcomes, but the skill-to-luck ratio seems to be about 70/30. So you're not observing a great deal of regression toward the mean among super forecasters, but there inevitably is some regression toward the mean among the top performers.
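A minimal sketch of the coin-toss thought experiment above, for readers who want to see regression toward the mean in numbers. It is not the Good Judgment Project's analysis. The function name `simulate`, the normally distributed scores, and the reading of "70/30" as the share of score variance due to skill are all assumptions made for illustration.

```python
import random
import statistics


def simulate(skill_share, n=5000, top_frac=0.02, seed=0):
    """Return (year-1 mean, year-2 mean) of the year-1 top performers.

    skill_share is the assumed fraction of score variance due to stable
    skill; the remainder is fresh luck drawn independently each year.
    """
    rng = random.Random(seed)
    skill = [rng.gauss(0, 1) for _ in range(n)]

    def year_scores():
        # Each yearly score mixes the same skill with new luck.
        return [
            (skill_share ** 0.5) * s + ((1 - skill_share) ** 0.5) * rng.gauss(0, 1)
            for s in skill
        ]

    year1, year2 = year_scores(), year_scores()
    k = max(1, int(n * top_frac))
    # "Anoint" the top 2 percent on year-1 scores only.
    top = sorted(range(n), key=lambda i: year1[i], reverse=True)[:k]
    return (
        statistics.mean(year1[i] for i in top),
        statistics.mean(year2[i] for i in top),
    )


if __name__ == "__main__":
    print("pure luck:  ", simulate(skill_share=0.0))
    print("70/30 skill:", simulate(skill_share=0.7))
```

With `skill_share=0.0` the year-one stars fall back to roughly the population average in year two. With `skill_share=0.7` they slip somewhat but stay well above average, which is the pattern described for the super forecasters.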
And so what makes those people so good?
Well, now that we've eliminated or rendered implausible the super-lucky hypothesis, the question becomes, what are the attributes these super forecasters have? You might think of them as having stable psychological attributes: they score higher on measures of fluid intelligence or crystallized intelligence or active open-mindedness, or they have certain attitudinal profiles, certain behavioral profiles. And the answer is all of the above. Super forecasters differ from ordinary mortals in a host of ways.
They're not radically different from ordinary mortals, but they are systematically different. They tend to score higher on measures of fluid intelligence. They tend to score higher on measures of active open-mindedness. But if I had to identify one factor that I think best distinguishes super forecasters from other forecasters who are equally intelligent and equally open-minded, it is that super forecasters believe that probability estimation of real-world events is a skill that can be cultivated and is worth cultivating.
And they're willing to make that commitment, that effort. So when people ask me how could the super forecasters have outperformed, say, intelligence analysts who do this full time and have access to classified information, I think the short answer is it's not because they're smarter and it's not even because they're more open-minded, although they are pretty open-minded. It's because they are willing to make this commitment, this desire and faith that there is a skill and it's worth cultivating.
So in the book, we quote Aaron Brown, who's the chief risk officer at AQR and also a great poker player, and his view is that you could distinguish great players from talented amateurs on the basis that great players are good at distinguishing 60/40 bets from 40/60 bets. And then he paused and said no, maybe more like 55/45 from 45/55. The greatest players tend to be extremely granular in their assessments of uncertainty.
One of the big questions I think that IARPA wanted us to answer, and that I think we have answered in the affirmative, is does granularity in assessments of uncertainty pay off not just in poker, but also when you're making messy, real-world judgments like whether Greece is going to leave the eurozone or what kind of mischief Putin might be up to in the Ukraine next, or what's going to happen with Sino-Japanese relations in the East China Sea, or is there going to be another outbreak of bird flu in a given region?
These are extremely idiosyncratic, one-shot historical events. It's not like poker, where you're sampling from a well-defined sampling universe, with repeated play and quick feedback. So there are a lot of people, very smart people, who have been skeptical for many decades that it's even possible to make probability estimates of these kinds of intelligence analytic problems. And I think what the tournament has proven beyond reasonable doubt, in my opinion, is that there is room for improvement. It's possible to make these probability estimates.
It's possible to get better at it. It's possible to identify the kinds of people who learn to do it better. It's possible to develop training modules to help people do it better. And the gains in accuracy are appreciable.
So what happened when you took average people and you started giving them, I think I remember this, a course in probability?
What we get for average forecasters who were randomly assigned to an experimental condition in which they get Kahneman-style debiasing exercises is an improvement in the vicinity of 10 percent. And that's a big effect when you consider that we're talking about improvement across the entire year of forecasting, and this training exercise takes about 50 minutes.
And what did that consist of?
This 50-minute training exercise covers some basic ideas about heuristics and biases and how to check biases. For example, one of the classic Kahneman arguments is that people don't give enough weight to statistical or base rate information in assessing the probabilities of events; they are too quick to take the inside view. So if you're attending a wedding and you see the happy couple and you're impressed by how much in love they are and the enthusiasm of the moment, and someone asks you how likely are they to get divorced, you're not likely to consult national divorce statistics for that subgroup.
You're likely to say they look really happy and compatible. I'm going to attach a very high probability to their not getting divorced. And the net result of making predictions in that way is that you're going to be somewhat less accurate than you would have been if you had at least started your estimation process by saying, what are the base rates of divorce? And now I'm going to adjust that based on whatever idiosyncratic factors are present in this particular relationship. So, starting with the outside view and working your way inside.
Start with the outside and work inside, that's one of our mantras.
But isn't Kahneman famous for saying that he studied biases his whole life and he feels like he's no better at avoiding them? So how does this 50-minute training exercise come in and help people?
Well, you know, Danny Kahneman was a colleague of ours at Berkeley, my wife's and mine. We know him well, and we know that he is more pessimistic about the prospects for debiasing than we are. He did give us advice on how to design the debiasing modules. I think he probably is more of a pessimist than we are, but I think he is persuaded that these improvements are real.
They certainly seem to be.
So one of the keys to keeping track of forecasting and your ability to predict is kind of keeping score. And do you think it takes a certain type of person to want to keep score? I mean, most of us are happy to kind of weasel out of or use uncertain wording or jargon when we're going about making decisions so that even if we're wrong, we can kind of say, well, that's not what I meant.
Absolutely. It does take a particular type of person. And there are many factors that come into play. I think it certainly helps to be open-minded, but there are other things that come into play that are a little more sociological. I've been doing forecasting tournaments for over 30 years now and I started when I was about 30, in 1984. I'm sixty-one years old now. So if I were an intelligence analyst, a sixty-year-old intelligence analyst, I would be a very senior analyst.
And let's just say for sake of argument that I am a senior analyst in the US intelligence community. I'm on the National Intelligence Council, say, just for sake of argument, and I'm the go-to guy on China. So when Xi Jinping comes into town and people say to me, you know, what's going on? I have inputs into the presidential daily briefing and help with national intelligence estimates. And I'm at the top of the status pecking order within the IC on China.
And someone comes along, like this upstart research and development branch of the Office of the Director of National Intelligence. And they say, hey, you know what we're going to do? We want to run forecasting tournaments now and everyone's going to compete on a level playing field. And twenty-five-year-old China analysts are going to compete against sixty-one-year-old analysts like Tetlock. And we're going to see who does better. Is the 61-year-old analyst going to welcome this development?
No. To ask is to answer. Even open-minded sixty-one-year-olds are not going to be very enthusiastic about this. They're going to argue that these tournaments don't really capture what makes my judgment special. And that is indeed a lot of the resistance we've run into for forecasting tournaments. I mean, in the book you may remember, we talk about the parable of two forecasters at the beginning, Tom Friedman and Bill Flack. Almost everybody who reads newspapers knows who Tom Friedman is: famous New York Times columnist, Middle East expert, often in the White House or Davos and God knows where.
And Bill Flack, nobody has the faintest idea who he is, because he's an anonymous retired irrigation specialist in Nebraska who happens to be a super forecaster. And we know a tremendous amount about Bill Flack's forecasting track record. We know almost nothing about Tom Friedman's forecasting track record. Right. And that's in substantial part because Tom Friedman's forecasts, and he does make forecasts, are embedded in vague verbiage. He says that this could happen or this might happen. And when you say something could or might happen, that could mean anything from 0.1 to 0.9 in probability terms.
And if it happens, I can say I told you it could. And if it doesn't happen, I can say, look, I merely said it could, right? You can't get paid to have it very nicely.
Yeah. Do you think that that's one of the problems with organizations? I mean, it seems like we're not getting better as organizations at making decisions, in part because our ability to keep score is hampered by the psychological kind of effects where, you know, if I keep score, I might be wrong.
So my incentive is not to. And if I use precise wording, I might be wrong. So my incentive is not to.
Yes, yeah. There's a real mixture, a powerful mixture of psychological and political forces that interact to create a lot of resistance to forecasting tournaments. So even though I think we have shown that forecasting tournaments can appreciably improve probability estimates, there are a lot of reasons why organizations don't adopt them. One is that the people at the top of the status hierarchy are not very enthusiastic. Bob, who's in the CEO suite, is not going to be enthusiastic about it being discovered that Bob in the mailroom is just as good as he is at anticipating trends relevant to the company's future.
So you have the status hierarchy problem. People at the top don't want to be second-guessed. They don't want their judgment process to be demystified. A large part of status in contemporary organizations is that there's something special about your judgment. So even open-minded, high-status people are going to be reluctant to do this because it's going to look like a career-damaging move. So there's certainly that, and there are a lot of other factors in play. There's, again, this kind of argument that people don't pay attention to the outside view.
In the book, we talk about a mistake made by a famous New York Times journalist, David Leonhardt; you may not know him. He runs the Upshot column in The New York Times. He's a quant-savvy journalist. And he made a mistake in 2012 that we talk about, that illustrates just how tenacious the misconceptions can be. He was commenting on the Supreme Court decision to uphold Obamacare in 2012. It was a narrow decision. It was five to four.
And he noted that the prediction markets had futures contracts on this Supreme Court decision, and they were pricing it at about a seventy-five percent probability of the law being overturned. OK, so they were way off. Well, how far off is way off? He said, well, they got it wrong. He just said, flat out, they got it wrong.
But that doesn't account for the complexity, right? That doesn't mean the bet itself was wrong.
Right. It certainly isn't good news for a prediction market that it was on the wrong side of maybe, and by that margin. But prediction markets have generated hundreds of forecasts over many years and they've proven to be pretty darn well calibrated, which is another way of saying when they say seventy-five percent probability of something happening, things happen about seventy-five percent of the time and they don't happen about twenty-five percent of the time.
So even if you have a perfectly calibrated prediction market system, when it says seventy-five percent, then twenty-five percent of the time smart observers, observers as smart as David Leonhardt, are going to be tempted to conclude that you're wrong and to dismiss you. So this creates a huge political incentive to stick with vague verbiage. If they simply said it could be overturned, they would be well positioned to explain it either way. But the prediction market was generating these precise probability estimates, and people don't take the outside view and say, well, we can't just look at that particular forecast, we have to put it in the context of all these other forecasts that the system is generating.
Take the outside view toward the system. People have a very hard time doing that. And David Leonhardt knows this is true. And he's even written later in The Upshot about situations that involve this fallacy. So if someone as smart as that, who doesn't have a grudge against prediction markets, can make a mistake like that, you can see why politically savvy intelligence analysts might be reluctant, in a blame-game culture like D.C., to do it. Right.
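The calibration idea described above can be checked with a few lines of code. This is a rough sketch, not the scoring used in the tournaments or by any prediction market. The function name `calibration_table` and the made-up track record are assumptions for illustration.

```python
from collections import defaultdict


def calibration_table(forecasts):
    """Map each stated probability to the observed frequency of the event.

    forecasts is an iterable of (stated_probability, happened) pairs.
    A well-calibrated forecaster has observed frequencies close to the
    stated probabilities.
    """
    buckets = defaultdict(list)
    for prob, happened in forecasts:
        buckets[prob].append(1 if happened else 0)
    return {p: sum(hits) / len(hits) for p, hits in sorted(buckets.items())}


if __name__ == "__main__":
    # Hypothetical track record: eight forecasts, two distinct probabilities.
    history = (
        [(0.75, True)] * 3 + [(0.75, False)]
        + [(0.25, False)] * 3 + [(0.25, True)]
    )
    print(calibration_table(history))
```

In this invented record the forecaster is perfectly calibrated, yet one of the four 75 percent calls still fails to happen. Judging that single call in isolation is the inside view; judging the whole table is the outside view toward the system.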
I think one of the most interesting parts of the book for me was when you started talking about Fermi-style thinking. Can you introduce us to that?
Well, Enrico Fermi was an Italian-American physicist who helped develop the first nuclear reactor at the University of Chicago. He was involved in the development of the atomic bomb in World War Two, and he was known for his rather flamboyant thinking style. He was continually coming up with innovative ways of estimating the seemingly unestimable. One of the famous examples of a Fermi problem, it sounds really weird, was to estimate the number of piano tuners in Chicago. Other examples might be estimating how much the Empire State Building weighs or estimating the likelihood of extraterrestrial civilizations elsewhere in the Milky Way.
Sounds like the brain teasers that Google used to ask to hire.
Exactly. Now, I don't know whether the legal department still allows Google to continue using those for screening potential personnel, but they are interesting tests of how people approach problems.
And what was so interesting about the way that Fermi approached that?
He really believed in flushing out your ignorance and decomposing the problem into as many tractable components as possible. So you would start by asking, how many stars are there in the Milky Way? Roughly about one hundred billion. You'd say, well, how many of these stars have planets orbiting around them? Look at the most recent data from Kepler, which has done some reconnaissance in our local area, about 60 year round and say, well, you know, it looks like a pretty high percentage of stars do seem to have planets going around them, let's say, could be as much as half or maybe slightly less, but I don't really know the answer to that question. But you make an initial guess. You flush out your ignorance, and then other people can come back and they can see that Tetlock said about half. And they say, Tetlock doesn't understand what Kepler's doing. It should have been 70 percent. No, it should have been 30 percent. But the point is not that Tetlock is getting it right.
It's that we're flushing out Tetlock's zone of ignorance and we're making it clear and it's all open and transparent. And then that process of inquiry would continue. How many planets are in the habitable zone? And you derive some further guesstimate from Kepler. A fairly small fraction of planets seem to qualify for that. But that still might leave you with, say, as many as five hundred million to a billion planets that are potentially in habitable zones.
And then you'd have to make an estimate about how likely is life to jump-start if you have a planet in a habitable zone, and how likely is intelligent life to emerge once you have life. And there are different evolutionary theories and different models that have at least somewhat different implications for answers to those questions, and what you would wind up with would be ranges of probabilities. Now, for this particular problem, the range of possible probabilities is going to be very large, and we know it's not impossible that there's another advanced extraterrestrial civilization in the Milky Way.
We also know it's not a sure thing. My best guesstimate, if I were to combine all the different steps we started to work through, would be probably more than one or two percent. But I don't think it would be as high as 90 percent, and probably it would be between two and 50 percent. That's a guesstimate. Now, there's nothing special about that number.
But what Tetlock has done, let me put this in the third person here, what the person using the Fermi method has done is he or she has flushed out all the different points of ignorance along the reasoning continuum. And you, the observer, can say, oh, look, Tetlock made a really stupid estimate here and you have to adjust that. But it's a basis for proceeding, and what would initially look like a hopelessly intractable problem becomes at least a little more tractable.
And that's what super forecasters are pretty good at doing: breaking down seemingly intractable problems into semi-tractable components and then just pushing. They're not afraid of looking stupid and making estimates that observers can see and look at and say, oh, my God, why did you say something that stupid about the capital budget?
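The decomposition described above can be written down as a chain of ranges multiplied together. This is only a sketch of the general Fermi approach under stated assumptions. The star count and the "about half, maybe 30 to 70 percent" range come from the conversation; the habitable-zone fraction of 1 to 2 percent is a hypothetical value chosen so the result lands near the "five hundred million to a billion" figure mentioned, and the function name `fermi_range` is invented here.

```python
def fermi_range(factors):
    """Multiply a chain of (low, high) guesses into an overall (low, high).

    Each factor is written down explicitly so that an observer can see,
    and challenge, any single step of the estimate.
    """
    low, high = 1.0, 1.0
    for factor_low, factor_high in factors:
        low *= factor_low
        high *= factor_high
    return low, high


if __name__ == "__main__":
    steps = [
        (1e11, 1e11),   # stars in the Milky Way, roughly 100 billion
        (0.3, 0.7),     # fraction of stars with planets ("about half")
        (0.01, 0.02),   # ASSUMED fraction in a habitable zone (hypothetical)
    ]
    low, high = fermi_range(steps)
    print(f"potentially habitable planets: {low:.1e} to {high:.1e}")
```

The point of the layout is transparency rather than precision: anyone who thinks a step is wrong can change that one line and see how much the final range moves.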
That's an incredible point, where you're taking this big, intractable kind of problem that's very hard to pin down, and you have some organized process for determining the subcomponents involved to get you there, and then you go through an estimate. So part of that would be highlighting your thinking, right? Yes. And then part of that would be like, I really don't know anything about this question.
So can I break that down further into subcomponents, or am I extrapolating too much?
No, that's exactly the spirit of the enterprise.
So why is that style of thinking? Why does it lend itself, do you think, to better forecasting? Is it just the nature of the changing, the framing of the problem itself, or do you think it's more the curiosity of the people who are willing to break it down and go through? It sounds like a lot of work. It sounds very demanding and mentally taxing to do that versus just throw an estimate with your, you know, your immediate response.
You're exactly right. It is demanding. And I think it works best if it's done in a team environment in which members of the team have mutual respect for each other, but they're also willing to push each other hard.
So if you were an organization and you wanted to set up a team environment, like a forecasting team within a large company, say IBM, how would you go about doing that with your knowledge?
That's a great question, and I'm a little bit wary about saying that organizations should try to construct super teams the way the Good Judgment Project did, because team construction has a lot of implications for other parts of the organization. That can be tricky. I mean, imagine that if you just did what we did in the tournament to win it, and you just identified the very best people and brought them together and nurtured them and pushed them hard, that would be a very elitist and somewhat divisive thing to do in many organizations.
Yeah, and it could cause a lot of political friction. Now, we didn't care a lot about that because we were in a forecasting tournament. We didn't really have an organization in the traditional sense of the term. We wanted a performance engine, right? We wanted to harness human ingenuity, individually and collectively, as rigorously as possible to generate probability estimates that were as accurate as possible for things that the intelligence community cared about. That was it. It was a pure accuracy game.
And we weren't that interested in the long-term viability of the organization. We were interested in pure accuracy. So I would be a little cautious about saying, you know, it's really easy. All you do is you recruit these super forecasters and you put them into these teams and you give them some training on how to do precision questioning and you give them some training on how to do constructive confrontation. And you've got these anti-groupthink norms enforced and you give them some training and guidance in probabilistic reasoning.
You encourage a certain self-critical structure and culture inside the teams and boom, magic, amazingly accurate forecasts emerge. It works pretty well in the forecasting tournament environment, but whether it would work well in an actual organization, I think the senior executives would want to think carefully about each step along the way.
So what would you say to people inside an organization?
How can they use your research to make better decisions inside their company?
Well, I think it's something you want to consider seriously, that when people make forecasts inside most organizations today, accuracy is only one of the goals that they're pursuing. They're also interested in making forecasts that are going to be difficult to falsify so they can't be embarrassed. So a lot of the forecasting inside organizations doesn't involve numbers. It involves a lot of vague verbiage. They're also interested in making forecasts that don't annoy other people in the organization. They don't want to tip the political applecart over.
So they're compromising accuracy in a whole host of ways that help promote their careers inside the organization and help to maintain political stability in the organization, but that aren't all that centrally focused on accuracy. Forecasting tournaments are really weird because they focus one hundred percent on accuracy. That's all that matters. So I guess the thing you'd want to consider as an executive would be, do I want to preserve part of my organization's analytical processing capacity for a pure accuracy game? Do I want to incentivize some small group of the people in my organization to play pure accuracy games in forecasting tournaments?
And those probability estimates would then filter up to senior executives to guide decision making. I think it's really an interesting experiment to consider doing. I think the intelligence community has been moving somewhat in that direction. I think it's a good idea, and I think it would probably be a good idea for many other entities as well, at least to consider. It's in the spirit of the whole IARPA enterprise to run experiments. And what I would propose would be that senior executives consider running experiments in which they see what they discover when they incentivize people to play pure accuracy games.
And what do you think transfers from your research into the decision-making process in a corporation? Not necessarily the forecasting, but how we go about organizing, unpacking, synthesizing multiple views.
How does that transfer, do you think, into a learnable skill that people can have inside of an organization?
There are many ways that could happen. We put a lot of emphasis in the Good Judgment Project on synthesizing diverse views and aggregating forecasts. And I think one of our major performance engines was the statistical algorithms that our statisticians developed for doing that. When IARPA started this whole exercise, they thought it would be really hard to do twenty or thirty or forty percent better than the unweighted average of the group, a control group of forecasters. And our super forecasters exceeded that performance benchmark quite substantially each year of the tournament.
They did so well that IARPA essentially suspended the tournament after two years, and we were able to absorb the other teams into our team in a substantial way and compete against the intelligence community and against the prediction market baselines instead of the other universities. Now, how did all that come to pass? I think the aggregation algorithms. If I had to credit two big things as responsible for the victory of the Good Judgment Project, one of them would be the super forecasters and the other would be what I'd call the super algorithms, the great algorithms that our statisticians developed.
Now, when I describe these algorithms, some of you are not going to be too surprised at first, but there is one aspect of them that does surprise most people. So the first thing: I don't know if your listeners are familiar with the James Surowiecki book The Wisdom of Crowds, but it's probably well known. It's been well known in the forecasting world that the average of a group of forecasters, the average forecast from those forecasters, is going to be more accurate than most of the individuals from whom the average was derived.
And this is the famous in story about the ox. You got hundreds of people trying to guess the weight of the ox. And the average of all those guesses was only about one or two pounds off from the original from the true weight of the ox. And that that means it was more accurate than all of the individuals from whom the average was derived. So averaging is a powerful way of synthesizing information from diverse perspectives. It's really is remarkably crude approach to doing it, but it works pretty darn well.
And that's why IARPA used it as a benchmark. Now, we were able to beat averaging by doing some simple things, like giving more weight to the better forecasters. As we got more and more data on who the good forecasters were, who the more intelligent forecasters were, who the more frequent belief updaters were, and various other attributes, we were able to give more weight to certain forecasters and create weighted averages. Weighted averaging beats simple averaging. That's not too surprising; it's not astonishing.
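As an illustration of the weighting idea just described, here is a minimal Python sketch. It is not the Good Judgment Project's actual algorithm; the function name, the forecasts, and the weights are all hypothetical, and the weights simply stand in for whatever track-record score an aggregator might assign to each forecaster.

```python
def weighted_average(forecasts, weights):
    """Combine probability forecasts, giving more influence to higher weights.

    Both arguments are illustrative: `forecasts` are probabilities in [0, 1],
    and `weights` are non-negative scores (for example, past accuracy).
    """
    if not forecasts or len(forecasts) != len(weights):
        raise ValueError("need one weight per forecast")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(p * w for p, w in zip(forecasts, weights)) / total


# Hypothetical forecasts from three forecasters on the same question.
forecasts = [0.6, 0.8, 0.7]

# Equal weights reproduce the simple (unweighted) average.
simple = weighted_average(forecasts, [1, 1, 1])

# Tripling the weight of the second forecaster pulls the aggregate toward 0.8.
skill_weighted = weighted_average(forecasts, [1, 3, 1])

print(round(simple, 3), round(skill_weighted, 3))
```

With equal weights the function reduces to the plain average used as the benchmark; unequal weights shift the aggregate toward the forecasters who are trusted more.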
Now, here's the interesting thing that the algorithms did. They did something called extremizing. And to illustrate extremizing, I want to take a little digression into a story that we talk about in the book, about the decision President Obama made to go after Osama bin Laden. In the movie Zero Dark Thirty, they have a scene in which senior analysts are being asked how likely they think it is that Osama bin Laden is in that compound.
And putting aside what Hollywood says about it, let's just do a little thought experiment and imagine that you're the President of the United States and you have these senior advisers around the table and you ask them, how likely is it that Osama is there? And each of the analysts around the table says to you, Mr. President, I think the answer is 0.7. It's 0.7, 0.7, 0.7; everybody around the table says 0.7. Then what should the president conclude is the likelihood that Osama bin Laden is in that compound?
And the short answer to that is, well, if the advisers are all clones of each other and they're drawing on exactly the same information and processing it in exactly the same way, the answer is 0.7, because there's no information added. Right. But imagine that the analysts say 0.7 all around the table, but the analysts don't know each other and they haven't been sharing information. And each analyst bases his or her 0.7 judgment on information that only he or she has.
So you have extreme diversity of perspectives. One person has satellite information, another has encryption-breaking stuff, and another one has human intelligence and so forth. But they're siloed, and they're coming together for the first time. And each one has independently arrived at this 0.7 estimate from very different sources of information. You've got true diversity here. Is the answer 0.7? Should the president shrug and say, well, I think it's 0.7, or should the president say, gee, each of you has very different reasons for believing 0.7?
This leads me to suppose that the answer is probably more extreme than 0.7, because if each of you knew the reasons the others had, you would probably become more extreme. And that's exactly what the best algorithm did: it extremized as a function of diversity. So 0.7 was turned into 0.85 or 0.9.
That's fascinating.
I mean, how did they go about doing that in terms of aggregating the data from the people or from the forecasters?
That's right. From the forecasters.
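The extremizing step described above can be sketched in Python. This is only an illustration under stated assumptions: the transform below, p^a / (p^a + (1 - p)^a), is one commonly used way to push an aggregate probability away from 0.5, and the interview does not say which formula or exponent the project's statisticians used. The function name and the exponent values are hypothetical.

```python
def extremize(p, a):
    """Push a probability away from 0.5 when a > 1.

    `p` is an aggregate forecast in [0, 1]; `a` is an assumed exponent.
    a == 1 leaves the forecast unchanged (the "clones" case in the story);
    larger values stand in for greater diversity of information sources.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    if a <= 0:
        raise ValueError("a must be positive")
    numerator = p ** a
    return numerator / (numerator + (1.0 - p) ** a)


# Advisers who all share the same information: no extremizing.
same_information = extremize(0.7, 1.0)

# Advisers with independent sources: an assumed exponent of 2.
independent_sources = extremize(0.7, 2.0)

print(round(same_information, 3), round(independent_sources, 3))
```

With the assumed exponent of 2, a consensus of 0.7 moves to roughly 0.85, which is in the range mentioned in the conversation. A forecast of exactly 0.5 is left where it is, since there is no direction in which to extremize.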
And what would happen if you had two forecasters who have great track records and then they're really divergent on an opinion or a forecast? Does that happen often?
No, it doesn't happen very often, actually. But if it did happen, it would be a real cautionary moment, if you had two super forecasters, one of whom says 0.9 and the other says 0.1. My inclination would be [unclear], knowing nothing else.
Are there certain types of questions to avoid if your desire is to have an accurate prediction?
Yes. Well, there are many questions in the tournament, and there are many questions in life, in which there is a massive amount of irreducible uncertainty. If you want to be a good forecaster, you don't spend very much time working on roulette-wheel-type problems. If you visit casinos, you'll find lots of people who think they can detect patterns in the wheel spins, and they even develop little algorithms to help them.
But what they're doing is essentially modeling randomness. So spending a lot of time modeling randomness is a good way not to become a good forecaster.
What other types of questions would you say don't lend themselves to prediction? Is it like a time duration?
Is it, what other kinds of questions are roulette-wheel-like?
Well, not roulette wheel. But what types of questions lend themselves to better predictions? Is it short time duration? I mean, I don't want to say very few variables, but short time duration versus long time duration, because you have to constantly update over a long period of time, right? I mean, that was one of the things that super forecasters did: they updated their forecasts.
Yes, that's true. Well, all other things equal, it's usually easier to predict questions over shorter time ranges than longer time ranges. But that's not always true. I mean, some short-range questions are extremely unpredictable. It's very hard to say whether the stock market is going to go up or down tomorrow. So that's a short-range question. In some ways, it's easier to predict whether the stock market is going to be up or down 10 years from now relative to now than it is tomorrow.
That's a good point.
So there are categories of problems in which you get a reversal of that. But yes, I think by and large it's true that the analogy to vision would be right. It's easier to see the Snellen chart if you're close to it than if you're far from it. Probabilistic foresight is better over shorter time ranges. And that's one of the things I talk about in the book. One of the reasons why my later work is different in emphasis from my earlier work, in which experts had a hard time, is arguably that they were making much longer-term predictions in the earlier work than in the IARPA work, where predictions were rarely much more than a year out.
You mentioned open-mindedness at the beginning. How do we go about fostering open-mindedness? Are there ways that we can improve that in ourselves or other people?
Well, that's another thing we do try to emphasize in the training. The trouble with simply exhorting people to be open-minded is that most people don't think they're closed-minded; most people think they're quite reasonable. Simply exhort people to be open-minded, and they shrug and say, well, yeah, I already am. I think you want to start in more specific ways. You want to start with very specific problems in which you assess whether people change their minds in an appropriate way.
So there are some normative models, like Bayes' theorem, that tell you how much you should change your mind in response to evidence that has a certain diagnostic value. And you can create simulated problems. They might be medical diagnosis problems, they might be economic problems, they might be military problems. But you can create simulated problems with simulated data, and you can see whether people learn, with practice, to update their beliefs the way they should. Now, there's always the question of whether those lessons are going to stick.
And we found that they do stick a little bit, because they can produce a 10 percent improvement throughout the year. But it's one of the great challenges. I don't think we've solved the problem of how to make people more open-minded. I think we can make people better belief updaters on problems where they don't have very strong ideological priors or preconceptions. But when people have really strong emotions and ideological convictions about presidential candidates or economic policy or whatnot, belief updating becomes quite problematic.
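The normative benchmark mentioned above, Bayes' theorem, can be made concrete with a simulated medical diagnosis problem of the kind described. The Python sketch below is only an illustration; the function name and all of the numbers (base rate, hit rate, false positive rate) are invented for the example and do not come from the training materials.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return the posterior probability of a hypothesis after seeing evidence.

    prior: belief in the hypothesis before the evidence.
    p_evidence_if_true: chance of seeing this evidence if the hypothesis holds.
    p_evidence_if_false: chance of seeing it if the hypothesis does not hold.
    """
    true_path = prior * p_evidence_if_true
    false_path = (1.0 - prior) * p_evidence_if_false
    if true_path + false_path == 0:
        raise ValueError("evidence is impossible under both hypotheses")
    return true_path / (true_path + false_path)


# Simulated problem (all numbers hypothetical): a condition with a 1% base
# rate, a test that catches 90% of true cases and flags 5% of healthy people.
after_one_positive = bayes_update(0.01, 0.90, 0.05)

# A second, independent positive test updates the belief again.
after_two_positives = bayes_update(after_one_positive, 0.90, 0.05)

print(round(after_one_positive, 3), round(after_two_positives, 3))
```

The diagnostic value of the evidence is the ratio of the two conditional probabilities. A single positive result moves the belief from 1 percent to about 15 percent, which is the sort of appropriately sized update that a simulated exercise could check a person's answer against.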
Yeah, I mean, I can see why that would be a problem.
It probably contradicts something that you hold very dear, and giving that up would take a lot of mental labor.
Yeah, we can make people a bit more open-minded, but making people perfect Bayesian belief updaters is something that no one has achieved yet and I think will be very difficult to achieve. I think we should keep working on it. I don't think we should give up.
Do you think the super forecasters were better at learning from the other super forecasters than the average forecaster? Like if somebody had a better approach, would they copy it? Would they just drop their own internal approach?
I think they listen to each other quite carefully. On the super forecaster teams, even when they disagree with each other, they disagree diplomatically, but they can disagree quite forcefully about what lessons they should draw from particular forecasting failures, and even forecasting successes. I mean, it's fairly common even for regular forecasters to say, well, what did we do wrong with this forecasting failure? And super forecasters do that, too. But they also second-guess their successes.
They say, well, were we lucky? We really nailed this question, but were we lucky? Could it have gone otherwise? Were we almost wrong? That's an unusual question for people to ask themselves. People don't normally look a gift horse in the mouth, and when they're right, they want to take credit for it. Super forecaster skepticism extends even to their forecasting successes.
I can't imagine a lot of the people who were average or below average in terms of forecasting ability went through their successes and evaluated them from that angle.
What would you say is the role of intuition in forecasting? Would you say that it's minimized, or would you say that it's...
This is one of the big debates in the field of judgment and decision making. Malcolm Gladwell wrote a book, Blink, and some psychologists wrote a book, much less widely read, called Think. There are different schools of thought about the value of intuition. And even Gladwell, of course, was divided in his book. He did point to some great successes of intuition. He also noted situations in which intuition could lead you seriously astray. I think the dominant emphasis in our work leans toward Think over Blink.
I'm not ruling out the possibility that there are super forecasters who do rely on intuition, but the problems that we're dealing with in the real world are different from the sorts of problems where brilliant intuition has been demonstrated pretty rigorously. So it's not like chess, where you're playing the same game with well-defined rules, right? There, really smart people can do extremely rapid forms of combinatorics and pattern recognition, and it's quite astonishing what they can do. The real world isn't quite like chess, is it?
And I think that it requires more subtlety and more willingness to second-guess yourself, because, as I think Mark Twain said, history doesn't repeat itself, but it does rhyme. And I think super forecasters sort of get that: there are patterns in history, but they're quite subtle and they're quite conditional, and you can easily overlearn from history.
That's a really good point.
What book would you say has had the most impact on your life?
On my life? It would have to be a book I read very early on in my life, possibly. Yeah, I don't know how far back we should go on this one. I mean, if I were to go back to graduate school, say, when I was making decisions about what I would do with my research career, I'd pick a book by Robert Jervis, who I think may be an emeritus professor now at Columbia, but he's a very senior political scientist.
He wrote a wonderful book in 1976, when I had just started graduate school. It's called Perception and Misperception in International Politics. And it is a wonderful synthesis of psychology and political science. And I think it is a synthesis of the sort that I aspire to. I've tried to be Jervisian in my work in many ways. He is not a quantitative researcher, he's qualitative, whereas I'm more quantitative. So we differ in a number of ways.
But I have a deep respect for how he tried to synthesize the psychological and the political. And I suppose if there's any theme running through my work, it's synthesizing the psychological and the political.
So the last question is, who would you like to see interviewed on the show and their thoughts articulated or explored with me?
Well, I've always been a fan of Michael Lewis's work. I think he would be a fun person to talk to. And I think he may be working on a biography of Daniel Kahneman and Amos Tversky. I think that would be an interesting conversation.
Well, excellent. Thank you so much for taking the time. I really appreciate it. It's been a great conversation.
Oh, it's a pleasure.
Hey, guys, this is Shane again, just a few more things before we wrap up. You can find show notes at farnamstreetblog.com/podcast. That's F-A-R-N-A-M-S-T-R-E-E-T blog dot com slash podcast. You can also find information there on how to get a transcript. And if you'd like to receive a weekly email from me filled with all sorts of brain food, go to farnamstreetblog.com/newsletter. This is all the good stuff I've found on the Web that week that I've read and shared with close friends, books I'm reading and so much more.
Thank you for listening.