Today's episode of Rationally Speaking is sponsored by GiveWell, one of my favorite nonprofits. They do rigorous research to quantify how much good a given charity does: how many lives does it save, or how much does it reduce poverty, per dollar donated? You can read all about their research or just check out their short list of top recommended, evidence-based charities to maximize the amount of good that your donations can do. Check them out at GiveWell.org.
Welcome to Rationally Speaking, the podcast where we explore the borderlands between reason and nonsense. I'm your host, Julia Galef. And with me is today's guest, David Roodman. David is a senior adviser for the Open Philanthropy Project. Before that, he served as a senior economic adviser at the Bill and Melinda Gates Foundation. He specializes in public policy and economics, especially economic development in poorer countries. And David was a guest on Rationally Speaking previously, about a year ago now, maybe a little more, when we were talking about microfinance and his book, Due Diligence.
This time we're going to be talking primarily about something called the Worm Wars, which sounds most of all like a science fiction epic trilogy. But no, it's even more exciting than that. It's a controversy in social science. So, David, welcome back to the show. Great to be here. All right, let's set the stage here. Starting in about 2004, the economic development world got really excited about deworming pills as a cheap and easy way to help people in poor countries.
The idea being that in a lot of countries, especially in Africa, it's quite common for people to be infected with intestinal worms, parasites which interfere with their nutrition and cause illness. And that in turn makes it hard for them to go to work in school and succeed in life. But we have pills that get rid of worms. And so the idea was that in a way, this is like an anti-poverty pill, right? That not just can we give people a pill and make them healthier, but that that has ripple effects for their earning potential down the road.
And the source of all this excitement was a study that came out in 2004 showing that deworming reduces poverty. To skip ahead in the story for a second, that paper was revisited a couple of years ago and its results called into question. And that controversy is what's been dubbed the Worm Wars. But before we get to the wars themselves, David, maybe you could kick things off by telling us about that original study that generated all of this excitement about deworming.
OK, that was a great summary. There are a few things I'd add, yes, but I could never have distilled it as nicely as you just did. The idea that intestinal worms are a bad thing and that it's worth spending money to get rid of them actually goes back more than a century, to the Rockefeller Foundation in its very early days, when it was sort of defining or helping to define modern American philanthropy. I guess there was also, you know, Carnegie and others doing that, too.
But they became persuaded that one of the best ways they could use their money was to deworm kids not in Kenya, but in especially the southern United States. And they launched a multi-year campaign from around 1910 to 1945 to do just that. Now that go on.
Yeah. So this goes this goes back far.
What happened in 2004 was the publication of a paper in a top economics journal by Ted Miguel and Michael Kremer, where they had carried out a randomized study of a project to give kids deworming pills in western Kenya. And so, as in any randomized study like this, some kids got the pills and some didn't. Or more precisely, some schools were visited for distribution of the pills. In other words, the way that they did this in order to keep the cost down was that workers would go to schools and instead of having kids, which would be more.
Exactly. Which would cost, which is take more time and cost more money. So you get kids who are school age and you just give the pills to everybody who's at the school. And actually wasn't just the pills. There was also education about the importance of washing your hands and things like that to help prevent to help people change behavior as well as biology.
And actually, I didn't mention this, or I'm not sure I know the stat, but what percentage of children tend to have intestinal worms?
It really varies a lot by location where if we're working in areas that we're concerned about, it can be as high as 20 percent, 40 percent in the place where the study took place.
I think the average number of worm infections, because there are different kinds that you have. So you can actually at any given time have more than one was I think about two, maybe two to two to three on average.
So some people had more than two types of worms in there. Yes.
Yes. So this is an area where the infections are really quite intensive. And so this study was actually most important, not for what it told us about deworming, but because it launched the movement to do randomized studies in development economics. It was the first big one. And I bet you've talked a lot about that on your podcast. Well, this is where it started. There had been randomized studies in other parts of economics, but they brought it to development economics.
So that's actually one thing I wanted to ask you. It's a bit of a tangent from deworming itself, but it's important. Do you have any sense of why this transition happened when it did, with this 2004 paper? Like, I mean, there doesn't necessarily have to be some explanation. Obviously, if a transition happens, it has to happen at some time. But, you know, it still seems useful to ask: why not earlier? Why not later? Like, was it that no one in economic development before Miguel and Kremer had thought of doing a field experiment or a randomized experiment, or was there something preventing previous researchers in the field from doing it?
You know, that's a great question and a lot like a lot of questions about history, as you know, multiple. Yeah. Hey, this is a safe space. Feel free to speculate wildly without any.
And I think a couple of factors that I can think of first there starting I think maybe in the 1970s, possibly earlier, there were some very important experiments run, you know, not in development economics in the United States and elsewhere within within the field of economics. And so there's a I guess there's a sense in which it was only a matter of time if these were seen as effective and they were, that they would take over all parts of economics, I think.
Something else that was going on at the same time is that around 1980, you had the arrival of faster and faster computers. And so there was this period in the 80s and 90s where a lot of the effort of empirical economists, as opposed to theoretical ones, went into developing fancier and fancier mathematical methods for trying to study patterns in data. And I think then there was a kind of a backlash against this, saying that all this fancy math was not actually working as well as it seemed.
And it was mostly just confusing people and hiding the problems in the data. And if you really want to know whether X causes Y, you've got to run an experiment. And then the math is really simple. Pretty much. You know, you look at the average breast cancer rate in the you know, in this group and you compare it to in that group and you just see if the averages are different. So it was part of a backlash against the trend towards fanciness in economics.
And so actually, I think it's been a good trend. There's a sense in which economics is simpler today, but also more empirical and grounded in reality.
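The "simple math" of a randomized experiment described above can be sketched in a few lines. This is a minimal illustration only: the attendance figures and the helper name `difference_in_means` are invented for the example and are not data from any study discussed here.

```python
from statistics import mean


def difference_in_means(treated, control):
    """Estimated effect: average outcome of the treated group minus the control group."""
    return mean(treated) - mean(control)


# Hypothetical attendance rates (fraction of school days attended), made up for illustration.
treated_attendance = [0.70, 0.65, 0.80, 0.75]
control_attendance = [0.60, 0.70, 0.65, 0.65]

effect = difference_in_means(treated_attendance, control_attendance)
print(f"Estimated effect on attendance: {effect:.3f}")
```

Because assignment is random, the two groups should be alike on average in everything except the treatment, which is why a plain comparison of averages can be read as a causal estimate.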
But then before 2004, in economic development, how did people decide which things to try to reduce poverty in developing countries? Was it just sort of common sense? This seems like it should work.
They would they would do non randomized studies. So a set of studies that I'm very familiar with were of the impacts of microcredit in Bangladesh.
Right. That was your. Yes, of course. Yeah. So that was based on data collected in 1992 and then second round, I think in 1999. And there was no randomization there. It's just that some villages there was more microcredit than in others. And so you try to use creative ideas.
OK, so this wasn't sort of like pre-evidence-based medicine, when doctors were just relying on common sense or common wisdom. There were serious attempts to be empirical, just not randomized. Yes.
And also there was the foundation being laid by people like Angus Deaton, who just got the Nobel Prize a couple of years ago for methods just for measuring stuff we care about, like what is called in economics consumption, which essentially means how much stuff you buy each month or each year. You know, in the West, we have all these statistics that we can tap. If you want to know what the GDP was five years ago in the United States, you can get that.
But in a developing country context, if you want to know how people are doing and then there's no statistician around who's going to hand you the statistics. So years of work went into just developing methods for running surveys and then figure out what questions to ask and then how and how to compile those answers into indexes to represent things like consumption. Yeah, and so that had to happen before you could do any kind of research on what raises consumption long term.
That makes sense. Well, so we've talked about Miguel and Kremer's paper in terms of its significance for the field, but we should also describe what it found in terms of.
Right. So what they found, they looked at two outcomes, as far as I can recall, at least two major ones, and there were probably others that were folded in. One was whether kids came to school more after they had received these deworming pills. And I should explain some of the basics here. The recommendation for these pills is that you give them once or maybe twice a year, more often in areas where there are more worms, and they're seen as having essentially no side effects.
Nothing serious, which means that if you give the pills to kids who don't need them, not much harm is done. And meanwhile, if you want to test kids to see if they've got worms before you actually give them the pills, you can spend more money doing that. That can be more costly than just giving the pills to everyone. So what was done in this experiment was what's called mass deworming. And that's the major kind of intervention that is carried out by some of the charities that GiveWell recommends that are involved in deworming.
You don't take the time to figure out who's got the worms and who doesn't. You just deworm everybody, give the pills to all kids in school. So what they found is that in the schools where deworming was done, kids came to school more. And how much more? Oh, gosh, I think it was something like the attendance rates went up by something like six percentage points. Don't quote me on that, as it were, but it was on that.
I'm just trying to get. I'm not good at remembering these specific numbers, but it's you know, we're. But it wasn't like 50. No, I was on the order of like five or six percentage points more where the average might have been, you know, roughly 50 or 60 percent attendance. So 50 or 60 percent. Oh, I start. Right. So say 50 or 60 or maybe 70 percent of the kids who are officially enrolled in school would show up on any given day.
But we got like a five or six point bump from deworming.
Why is it why is it so low? Why is it so low? Yeah, well, I don't. Is it things like it could be illness.
It could be having to work on the farm. It could be, you know, enrollment records not being reliable. You know, there could be a lot of different factors. OK, and the answer is, I don't know a lot about the concrete realities there.
They also looked at whether deworming improved test scores. You know, did these kids actually seem to improve in their knowledge? And the answer was no. So the big headline finding was about school attendance, and they also found some interesting patterns that I don't find super exciting, and I don't mean to be critical, because I think they're more exciting to economists than to GiveWell, which is that there were externalities, or what you might call spillovers.
So kids who were within a kilometre of a school that got the deworming also seemed to improve, even if they themselves didn't get the deworming. And I think they avoided giving the pills to girls over 13 because they're not recommended for pregnant women, and for a girl over 13 there's some probability that she's pregnant. So that then meant they could look at whether girls over 13 also saw an improvement, even though they weren't getting the pills.
And the answer was yes.
Uh, and these spillover effects, it's not just magic. It's it's about contagion theory.
I mean, you know, the data don't actually tell you the mechanism, but it makes a lot of sense. Yes. You know, if your friends and your siblings have been dewormed, then you're less likely to get worms, too, right?
Yes. So there are these spillovers. And that was seen as very central to the paper. And one thing that did mean is that if you want to look at the impacts of deworming, you can't randomize within schools. You can't, like, give deworming pills to half the kids, because you may be helping your control group almost as much as you're helping the treatment group. That's a great thing, but if you look at the difference, it's going to be very small and you're going to underestimate the real benefit. So that shows the necessity, when you're dealing with a highly infectious disease, for doing what they call cluster randomized studies, where you don't randomize at the individual level, but you randomize at the school level or the regional level, so some schools get the treatment and others don't.
And so actually, one critical question you can raise is about whether improving school attendance is really such a good thing. And my former colleague, Lant Pritchett, who's a brilliant development economist, has written a lot about how schooling ain't learning. I think maybe he's even got a book or a paper with that title, saying we shouldn't get too excited just because you've packed the kids into the schools unless they're actually learning more, and it seems like in a lot of cases they're not. Or unless completing your degree makes you more able to get a job.
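The reason spillovers push researchers toward cluster randomization can be shown with simple arithmetic. The sketch below is illustrative only: the 6-point true effect and the spillover shares are assumptions chosen for the example, and `measured_effect` is a hypothetical helper, not anything from the study.

```python
def measured_effect(true_effect, spillover_share):
    """Treated-minus-control gap when controls receive part of the benefit.

    spillover_share is the fraction of the true effect that leaks to the
    control group (0 = no spillover, 1 = controls benefit as much as treated).
    """
    treated_gain = true_effect
    control_gain = spillover_share * true_effect
    return treated_gain - control_gain


TRUE_EFFECT = 6.0  # hypothetical attendance gain, in percentage points

# Assumed spillover shares: classmates share an environment, distant schools mostly do not.
within_school = measured_effect(TRUE_EFFECT, spillover_share=0.8)
between_schools = measured_effect(TRUE_EFFECT, spillover_share=0.1)

print(f"Randomizing within schools measures:  {within_school:.1f} points")
print(f"Randomizing between schools measures: {between_schools:.1f} points")
```

Under these assumed numbers, the within-school comparison recovers only a small part of the true effect, while the school-level comparison recovers most of it.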
Well, sure, that's right. Yeah, but but I think if I don't think we can take a lot of consolation in a child, you know, finishing second grade or if he or she can't even read any words, you know, and that's that's the kind of level that we're talking about here in terms of the concern.
Actually, this is another brief tangent. I'll probably regret this as the end of the episode approaches. And I haven't gotten to all my other central questions. But I was having an interesting or reading an interesting debate on Facebook recently about this basic income study.
I forget in which country it was an African country, probably Kenya. Yeah. And the researchers or maybe it was just a journalist. I'm sorry, I can't remember the details, but basically they went around and asked people what they were spending this unconditional cash grants on. And one of maybe several of the respondents said, I'm spending it on sending my kids to school because I couldn't afford to send them to school before. And that seemed great. But my friend said, actually, this isn't great because schooling is kind of a positional good, like it means your children are going to be more likely to succeed relative to other people, but it doesn't actually make the whole country better off.
Does that make sense to you?
That is certainly potentially true. At the end of the day, it's a question of context. If the schooling is effective and its graduates are entering an economy that's fairly dynamic and creating more opportunities where their, you know, improved skills will be rewarded, then I think more education will be a good thing. If it's really just a waste of time, and they're not learning but they're positioning themselves relative to others, then yeah, the net benefit is going to be much lower than the gross benefit for the kids from getting ahead.
OK, so anyway, derailing. You know, this is all very interesting. So yes, the study came out and it was very influential within economics in several ways, as we've discussed. But it absolutely is true that it created a lot of excitement around mass deworming, and that it has been controversial, because it is not the only study that has been done of the impacts of these deworming pills.
There have been several dozen studies done over the last few decades.
And wasn't there a follow up study several years later looking at the earnings of people of the children who had received the deworming versus those who didn't?
That's right. So the same authors, plus some new guys and gals, did a follow-up study where they followed a random fraction of the kids in the original study into late adolescence and even early adulthood, and wanted to know how they were doing. Were there long-term benefits from getting a few years of deworming back in your youth? And they looked at a lot of different things. They looked at income and health and education levels, and there's an array of findings.
But the one that popped out most, certainly for GiveWell, was that former children who had received the extra deworming were earning more in adulthood. I think it was about 12 percent more overall. If you focused, you know, on wage income, which is one type of income, it was more like 27 or 30 percent.
I think 10 percent extra earnings seems like a lot to result from a six percent if it is roughly six percent increase in school attendance, maybe.
So, you know, I don't really know what to expect, but maybe not. That's not the only obvious.
As they say. I don't have priors about that. But what is? OK, but that raises a big question, which is, is this plausible? Right. Is it possible that you could have a relatively modest, arguably modest benefit in the short run and what looks like a larger benefit in the long run? Right.
And at some point you have to say, well, whatever the evidence says happened is the truth. We shouldn't go too far in imposing our priors.
But it depends on how strong the evidence is. That's right. Which is always a complicated question, but this is a very important finding for GiveWell, because that bump in income is huge. If it lasts many years, say, 10 or 20 years or more, and you compare it to the cost of achieving it, which was just a few dollars (these pills cost almost nothing, and mass administration is very cheap), you could get like a, I don't know, benefit-cost ratio easily of 100 to 1 or 1,000 to 1, depending on how you do the numbers.
And that is part of what's really drawn GiveWell to deworming charities.
And plus you have the additional, you know, non negligible benefit of and these people don't have to suffer from worms for years. That's also nice.
Absolutely right. So let's get to the wars themselves.
I really liked a line you had in one of your blog posts about your analysis of this issue, which went: "And so, on the dreary plains of academia, did the Great Worm Wars begin."
Which really captures the excitement of the topic, at least for a few, maybe for you, and we'll see if it captures it for our listeners.
So what about the wars? Like, where did the doubts about the findings begin?
Yeah, so there is an institution in the UK called Cochrane, I think it was formerly called the Cochrane Collaborative, and it supports a lot of work that is called meta-analysis. And what that is, is when you're wondering whether, you know, screening at age 45 for prostate cancer actually saves lives. And maybe there have been a dozen studies of this done over the years in different ways. And you want to draw them together and try to synthesize a single conclusion out of them, because the larger your study, the more precise your answer can be on any question.
And so if you can pool a bunch of studies, you're essentially making one big study with more statistical power. And this is what a meta-analysis does. So working under the Cochrane umbrella, people have been maintaining a meta-analysis of studies of the impact of deworming. And I think the latest iteration was released, I think it was 2015. Right. Is that what I said in the blog post?
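The idea that pooling studies yields "one big study" with more precision can be sketched with fixed-effect, inverse-variance pooling. This is one standard textbook approach and is an assumption here, not necessarily the method used in the Cochrane review discussed above. The study estimates and standard errors are invented for illustration.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance weighted average of study estimates (fixed-effect model).

    Each study is weighted by 1 / SE^2, so more precise studies count more.
    Returns the pooled estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical effect estimates and standard errors from three small studies.
study_estimates = [0.30, 0.10, 0.20]
study_std_errors = [0.20, 0.15, 0.25]

pooled_estimate, pooled_std_error = pool_fixed_effect(study_estimates, study_std_errors)
print(f"Pooled estimate: {pooled_estimate:.3f} (SE {pooled_std_error:.3f})")
```

The pooled standard error is smaller than that of any single study, which is the sense in which pooling adds statistical power.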
I was. Yes. And it found, you know, looking at a couple dozen studies of deworming, mass deworming as well as targeted deworming just for kids who had tested positive, that there was substantial evidence of no impact. And therefore the implication was that it was crazy to be spending millions or hundreds of millions of dollars deworming kids when all the evidence seems to say, well, there's no benefit here. Mm hmm. And so then that has led to a kind of a classic academic argument where there's back and forth. The authors of the original studies have replied, and the critics have, you know, added new arrows to their quiver. They've done replications of the study that we were just talking about and raised doubts about the methodology and so on. And it gets very complicated. I think one of the big divisions here is between epidemiology and economics. And to an extent, it seems to have worked out to be between Brits and Americans. There's certainly a tribal feel to it.
Do the British and American camps line up with the epidemiology and econ camps? Or are those two separate axes in this case?
Pretty much, yes. OK, they're pretty much lined up. Most of the critics who are really saying that you shouldn't believe these deworming impact studies, or at least that you shouldn't be funding deworming, are British epidemiologists, and most of the supporters are American economists.
It's a funny thing to say. OK, can you unpack why the battle lines got drawn that way? Like your earlier historical question, I've wondered about it, and I don't know. But I think one thing that's going on is, I guess, an epistemological question: what constitutes valid evidence of impact? It seems to me that what a lot of the epidemiologists are saying is, we run our statistical methods and we get some kind of estimate of the impact, which is usually positive.
But when we look at the 95 percent confidence interval around it, it includes zero. And I can expand on what that means. Therefore, there's no evidence of impact. They're saying that on the evidence that we have, if there were no impact, the actual data that we got don't seem terribly unlikely. When we reject something at .05, what we're saying is that if there is no effect, if the true answer is zero, there's a less than five percent chance of a result like this.
So if you reject the null at .05, then we think there's a less than five percent chance that we would have gotten results like this, or more extreme than this, you know, in the world in which there's actually no effect.
Yes. Yeah. It's surprisingly complicated to explain on the fly, even for people who are very familiar with it. But yes. Yes, yes.
Because they don't actually articulate it that often. I'd have more practice. Yes. If they're saying that they cannot reject the null of no impact, they're saying that we're not confident enough, that there's an impact here, that we can when forced to make a binary call, whether it's present or absent that we can say that it's present.
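The link between "the 95 percent confidence interval includes zero" and "cannot reject the null at .05" can be sketched with a normal approximation. The helper `z_test` and the numbers are hypothetical, and the normal approximation is an assumption; real analyses may use other test statistics.

```python
from statistics import NormalDist


def z_test(estimate, std_error, alpha=0.05):
    """Two-sided p-value and (1 - alpha) confidence interval, normal approximation.

    The p-value is the chance of a result at least this far from zero
    if the true effect were exactly zero.
    """
    standard_normal = NormalDist()
    z = estimate / std_error
    p_value = 2 * (1 - standard_normal.cdf(abs(z)))
    critical = standard_normal.inv_cdf(1 - alpha / 2)
    interval = (estimate - critical * std_error, estimate + critical * std_error)
    return p_value, interval


# Hypothetical: the same positive estimate measured with two different precisions.
noisy_p, noisy_ci = z_test(estimate=1.0, std_error=0.6)
precise_p, precise_ci = z_test(estimate=1.0, std_error=0.4)

print(f"Noisy study:   p = {noisy_p:.3f}, CI = ({noisy_ci[0]:.2f}, {noisy_ci[1]:.2f})")
print(f"Precise study: p = {precise_p:.3f}, CI = ({precise_ci[0]:.2f}, {precise_ci[1]:.2f})")
```

In this sketch both studies have the same positive best estimate. Only the noisier one has an interval that includes zero, and so only that one fails the binary .05 test.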
And what about the American economists? They say a couple of things. I think I might contrast that first, though, with how GiveWell thinks about things. GiveWell is trying to do a cost-benefit analysis of deworming. Right. And not just deworming, but deworming charities. How much does it cost to give a kid a pill? Is it 12 cents? Is it 27 cents? Whatever. And then what are the expected benefits from that?
Now if we do our best analysis, we're not going to know for sure what the benefits are, because all the studies have uncertainty in them. So we can imagine, at least informally, that there's some kind of probability distribution. Maybe for 27 cents we'll get 100 dollars of benefit. Maybe, somewhat less likely, it'll be a thousand dollars, or maybe it'll be ten dollars. There's a range. And over that there's a probability distribution.
And whether 95 percent of that distribution is to the right of zero, we don't really care about. We're just trying to make our best guess and then go with it. And of course, we can revise that best guess over time. And what I think is important to understand there is that we're going from questions about what is the state of the world, what is the impact, to questions about what should we do, or what should visitors to our website do?
So there's a jump there from from conclusions about the world to decisions about what should be done.
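The "best guess over a probability distribution" reasoning above can be sketched as an expected-value calculation. The cost of 27 cents and the benefit levels of 10, 100, and 1,000 dollars are the figures mentioned in the conversation. The probabilities, and the added zero-benefit scenario, are invented for illustration and are not GiveWell's estimates.

```python
COST_PER_CHILD = 0.27  # dollars, the illustrative figure from the conversation

# (benefit in dollars, probability); the probabilities are hypothetical and sum to 1.
scenarios = [
    (0.0, 0.4),     # deworming has no lasting benefit
    (10.0, 0.3),
    (100.0, 0.2),
    (1000.0, 0.1),
]

total_probability = sum(prob for _, prob in scenarios)
expected_benefit = sum(benefit * prob for benefit, prob in scenarios)
benefit_cost_ratio = expected_benefit / COST_PER_CHILD

print(f"Expected benefit per child: ${expected_benefit:.2f}")
print(f"Expected benefit-cost ratio: {benefit_cost_ratio:.0f} to 1")
```

Even with a 40 percent chance of no benefit at all in this made-up distribution, the expected benefit far exceeds the cost, which is the point about decisions not hinging on a significance threshold.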
Is this sort of like Spock saying, well, we don't have enough information, we don't have enough data to make a decision about which path we should take and the captain being like, well, we've got to take a path. So let's just go with one of these, whatever is your best guess. And Spock, like, no, we don't have enough data. We can't decide.
It is kind of like that. That's right. You have to make a decision, like the captain. Right, exactly. You have to make a decision based on the evidence that you have. And what your decision frame is, what your choices are, what the downsides are if you're wrong, should influence whether you want to impose .05 as your threshold, or .2 or .7, or get rid of a threshold altogether and just deal with the continuum of possibilities.
And so I think the one thing the epidemiologists are doing, because they're so steeped in the methodology of Cochrane and meta-analysis, is that they're thinking inside the world of conclusions. That is, what do we know about the world? And they're not thinking about the complexity that is decision making. And so they feel like their job is to set up a very high bar for demonstrating that something actually has an impact. You know, it reminds me of an example.
You know, if you and I are driving down the highway really fast and it's dark and rainy and you're driving and I'm riding and I say, you know what, Julia? I think we might be headed for a cliff.
But, you know, speaking in the language of the null, I'm only ninety-four percent sure, so I can't reject it.
And you can say, well, in that case, I'll keep going, which would be crazy, right?
Yeah. Although presumably, I mean, it sounds like you, and I think GiveWell too, though I don't want to speak for them, if you had to guess, would think the probability that deworming helps is more than five percent, like significantly more, right? That's right. Like you're giving this sort of conservative case. Yeah. Of like, let's say we were only five percent sure. Should we still go ahead and do it if we thought the benefit was great enough?
That is an important question. But in this particular case, it sounds like there was actually a reason to be more optimistic than that. Yes.
So, I mean, we try to get away from the question of whether it does or does not have an impact, because reality is not binary. And so we think about what is our best estimate for the expected or average impact. And my point is just that how you synthesize the evidence into a decision has to depend on the specifics of the decision you're making. It's one thing if you're thinking about trying an experimental new drug on a cancer patient. It's another if you're worried about whether you're heading for a cliff.
Right? You know, it could be that the epidemiologists are very conservative and they insist on that .05 as a high standard of evidence to meet because there are good reasons. Like maybe a lot of research is funded by a drug company, and so you should start from a standpoint of strong skepticism. Or maybe most new things have side effects that you need to worry about, and again, that argues for a higher standard.
But here we've got something that, apparently there's a consensus, has very little in the way of side effects, and it's very cheap to deliver. And so I just think that the .05 standard doesn't make sense there. Yeah. Now, the proponents also argue that if you do the meta-analysis right, you can meet the .05 standard. And that's something we can also talk about. Yes, it's complicated.
Well, maybe, you know, feel free to go only as far into the weeds as you think is appropriate here. But if we could just briefly touch on the nature of the criticisms of the original deworming study, maybe a good way to divide it up is actually in our last episode together, I think we talked about internal versus external validity, which is a distinction I find myself using a lot. And basically internal validity is about was the study well constructed?
Can we trust the results of the study in the context where it was conducted? And then external validity is basically about how much can we generalize the results of the study to other contexts? And so I find myself making this distinction colloquially when, for example, someone is giving me advice and I try to separately ask, like, do I think their interpretation of their own situation is sound from? Do I think their advice applies to my situation? So anyway, I find that concept useful both in evaluating studies and in life.
But getting back to the topic at hand, do you think the concerns about the deworming study in Kenya, the Miguel and Kremer study, were more about its internal validity, like it wasn't well done, or that its results didn't apply to other situations?
That's a great framing. I think a lot of the criticism was about the first whether the study was actually well done. And it's something you can believe. One can certainly also talk about the external validity, because we talked before about, you know, what percentage of kids have worms in general. And I mentioned that it's particularly high. It was particularly high where the study was done.
And so then you could say, well, that means maybe it doesn't generalize, which would be a valid concern, because in other areas, if you give everyone deworming pills and only a few kids needed them, then it's not very cost effective. Right.
Right. It turns out this is a funny thing. As I mentioned, this is the study that launched the randomization movement in development economics. And it wasn't randomized. It was pseudo-randomized.
And I guess you might we often use pseudo or quasi as a prefix.
And it was randomized-ish. Randomized-ish. Yes. So, you know, maybe there are certain people who use the word randomized to embrace what was done, but they did not flip a coin or roll dice or use a, you know, computer to generate random numbers. And I think they wanted to. But the economists themselves didn't actually run the experiment. They had to work with another organization, an NGO, a nonprofit that was operating there.
And apparently the nonprofit was resistant to the idea. Some people have moral concerns about randomization.
It feels like you're, you know, playing God with health and life, although you still have to divide up the schools somehow by whatever method you use, you're denying potentially useful treatment from half the students. Whether you made that decision by flipping a coin or by, you know, drawing a line in the alphabetical roster of schools, I don't see how that.
Right. I agree. And I would also argue that we should get the best evidence that we can so that we can help. You know, that, too. Right.
So anyway, what happened is they made a list of 75 schools that they were going to work in and then they sorted the list. And it turns out that they didn't even describe correctly how they sorted it. They originally said it was by the name of the school, which was not true. It was by, you know, I don't know, the equivalent of a county. And then within counties, they sorted by the number of kids in the school.
And then once they had this sorted list, they went down the list and they divided the schools into three groups, kind of like, I wouldn't say doctor groups, but I think that's not the right analogy. So the first school went into group one, the next school went into group two, the third one into group three, and the next one went back into group one. So they randomly, I'm sorry, not randomly but arbitrarily, split these 75 schools into three groups of 25 each.
And then group one immediately got deworming; this was at the beginning of 1998. Group two didn't get the deworming pills until the beginning of 1999, and group three was the full control group. It didn't get the pills for the duration of the experiment, which was two or three years, I guess three years. So there was that. And so, you know, you could say that this wasn't, in fact, randomized. And maybe there are stories that would explain the results that they got without relying on the idea that the pills made a difference, because of the somewhat non-random way that these schools were grouped together.
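Editor's sketch: the list-based assignment described above can be written out in a few lines of Python. This is only an illustration of the procedure as recounted in the conversation, not the study's actual code or data. The school names, zone labels, and enrollment figures are made up, and sorting enrollment in ascending order is an assumption, since the direction of the sort is not stated.

```python
def assign_groups(schools, n_groups=3):
    """Sort schools by (zone, enrollment), then deal them into groups 1..n in turn.

    `schools` is a list of (name, zone, enrollment) tuples. No randomness is
    involved: the same roster always produces the same assignment.
    """
    ordered = sorted(schools, key=lambda s: (s[1], s[2]))
    return {name: (i % n_groups) + 1 for i, (name, _, _) in enumerate(ordered)}


# Hypothetical roster: 75 schools spread over 5 invented zones.
schools = [(f"school_{i:02d}", f"zone_{i % 5}", 100 + 7 * i) for i in range(75)]
groups = assign_groups(schools)

# Phase-in as described: group 1 treated from 1998, group 2 from 1999,
# group 3 untreated for the duration of the experiment.
first_treated = {1: 1998, 2: 1999, 3: None}

sizes = {g: sum(1 for v in groups.values() if v == g) for g in (1, 2, 3)}
print(sizes)
```

Because the split depends only on the sort order, any pattern in that order, such as geography or school size, could in principle line up with the groups, which is the concern raised in the discussion.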
And I spent some time trying to make up that kind of story. I thought, well, maybe just by chance, the schools that got the deworming earlier, you know, were at a different elevation. This is a hilly area. And that would affect the prevalence of worms. There was a big El Niño event at the time, so there was a lot of flooding, which could have aggravated, you know, worm prevalence in some areas more than others. And maybe you could sort of construct a story, you know, and I even went to great lengths to try to come up with things like this and prove them, and ultimately failed.
I really tried to attack the premise here by, you know, bringing in new variables that hadn't been checked for statistical imbalance. And mostly I came away convinced that while, yes, there was this asterisk on the study, it didn't seem like a very compelling explanation for the results compared to the more straightforward explanation that the pills actually made a difference.
Right. Yeah, I really liked your explanation of how to approach the question of are these results real, or how real are these results, in that most people who are bothering to critique methodology, which is a small subset of people, they approach that question by, like, tallying up flaws in a study and asking themselves, like, is the study too flawed or not too flawed, or which of these studies is least flawed? And by contrast, you are asking what I think is a much more important or central question, which is: looking at all of this evidence together, which hypothesis about the world makes this entire body of evidence most likely? Like, just to oversimplify, is it the hypothesis that deworming, you know, does work, it does reduce poverty over time, or not? And I think you describe this process as kind of related to Occam's razor, right? That's right.
You know, when you're trying to make decisions about whether to recommend a charity or not, it can't be optimal to set such high standards for the quality of the research that you have no evidence and you just say, well, we don't know, because then it's like Captain Kirk and Spock, as you described. So what we have to do is say, you know what, here's what we've got. We've got to make the best of it. What's the best explanation for all the data?
And I define best as being a combination of two things. A good theory is something that explains most of the evidence before us and is simple. And that's what is called Occam's razor: the idea that if you have two theories that can explain the results before you, the simpler one is more likely to be right. That may reflect my biases. What I actually am trained in is not economics, but math, mathematics.
And in mathematics, the simpler statements are more beautiful and more true in some deep sense. But you raise an important question, which is, you know, how good does the study have to be for us to take it seriously? One of the things that the epidemiologists say about a lot of the research that economists have undertaken is that it doesn't meet good quality standards. So, for example, in Kenya, the children in the control group did not receive placebo pills.
They knew they were in the control group, whereas the kids who got the pills knew they were in the treatment group. And that is not considered best practice in medical research, because then you've got some people who are in the treatment group who know they're being observed and are being treated differently. And that can create what is it called, Hawthorne effects and placebo effects. Just by virtue of people knowing they're being observed, they can behave differently. The analysis was not preregistered, right?
You know, once you've got your data, you can test all sorts of hypotheses, and then maybe there's a tendency to only report the results that are significant, you know, and so you're data mining. And so one way to prevent that is to preregister what you're going to do with the data once you get it. And that was not done here, and so on. There are a bunch of these things. In epidemiology, or I should say in medical research, it is more common for there to be preregistration and blinding and these kinds of things. And so I intuitively reacted against this.
Again, this kind of binarization of knowledge, saying it either counts or it doesn't count, it's either there or not.
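Editor's sketch: the data-mining concern raised above can be made concrete with a small calculation. Assuming, purely for illustration, that a researcher runs several independent tests on outcomes where the true effect is zero, the chance that at least one comes out "significant" at the 0.05 level grows quickly with the number of tests. The function name and the independence assumption are the editor's, not anything from the study.

```python
def prob_any_false_positive(n_tests, alpha=0.05):
    """Chance of at least one false positive among n independent null tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# How the risk grows as more hypotheses are tried on the same data.
for k in (1, 5, 20):
    print(k, round(prob_any_false_positive(k), 3))
```

Preregistration limits this by fixing the number and choice of tests before the data are seen.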
Right. Yeah, it's a very P-value way to look at the world. I mean, there are two different thresholds here. One is, is it significant or is it not significant, as just a binary cutoff, and then another is, is the methodology good enough or is it not good enough? Exactly right.
But I realized that I also do the same thing. It's like, you know, accusing somebody else of being racist and then realizing that you are also superficial in how you judge people. There are some studies that I will hardly spend any time looking at because I just don't trust them. You know, if you told me that you've done a nonrandomized study where you just discover that, you know, healthier kids get higher grades, I'm not going to take that as proof that either causes the other.
But that seems correct to me. I mean, we can't examine everything. But if that was the only evidence that we had on something, and it was of that quality, then maybe we should, you know, still pay attention to it. So I filter, too. And the question is, you know, when is that appropriate and how should you do it? So there was a sort of holier-than-thou kind of thing coming from the epidemiologists.
And the economists responded very well.
On some of those, like, everybody agrees that blinding would be good. But the economists, you know, have good responses as to why, in this case, they couldn't do everything that the epidemiologists said would be ideal. I mean, to get a bit graphic, if you have worms and you take the pills, then your body will expel worms and you'll see it, and so you can't actually hide from the subjects who is in the treatment group and who's in the control group.
It's just not possible, that kind of thing. So blinding was not actually possible in this case. I forget how we got into this, but this is an important thing that I talk about in the question of whether the study itself is valid. I agree that ideally there are ways in which it could have been done better, but we still have to work with what we have.
Yeah, I think that's a message that's important, and it often gets lost in the overall very, very good and useful message that skeptics and science communicators and statistics promoters like me promote, which is that, you know, these are all really important elements of rigor and you should be less trusting of a study if it didn't use blinding or randomization, et cetera, et cetera. But the takeaway that people often get from that is, like, this is my excuse to ignore anything that doesn't meet every single one of these standards.
And honestly, like, even for, you know, the gold standard, randomized, controlled, blinded, long-term study in the field, there are always problems that you're going to be able to find. And especially, I mean, I'm not claiming the epidemiologists were motivated in this case to reject these results. But if they had been, it would have been like, even for the very best conducted study, I'm sure they would have been able to find reasons to reject it.
Oh, I think that's right. Yes, yeah, yeah. David, when you zoom out and you look at the way that this debate went down, how does it make you feel about the field's, or the cluster of fields', social epistemology? Like, do you think that people basically handled it well, that they discussed these various epistemological and methodological questions in an open and intellectually honest way? Or did things become tribal, with people, like, arguing for their side and just digging in their heels?
I think I am a bit biased. I have much more affinity with the American economists, even though technically I'm not one of them.
It feels to me like the economists have been more careful to be constructive in their tone and more sophisticated in how they reason, for example, not quite getting so caught in the trap of insisting on significance at 0.05 and only considering studies that meet certain superficial quality criteria. When I think of there being tribal elements to this, I feel that more in the British epidemiologists that I've talked with. But that may just reflect my bias.
I'm sure the epidemiologists say, I'm not tribal. You're tribal. You're tribal.
Yes, yes and yes. And they might have some validity in saying that. I think that the meta-analysis process, which is supported by Cochrane and also another group in the United States called Campbell, and which has been an important part of this whole debate, is still fundamentally constructive. I think there are conservative biases, as we've talked about, towards, you know, really imposing high standards for saying that there's an impact, which may not be appropriate here.
But I still think their general approach has been useful. Where I feel like the biggest problem lies is in people not being systematic in their reasoning. And this may be very familiar territory for you. There are times when I was debating with some of the British epidemiologists in email where I felt like we weren't quite talking past each other, but we were operating from different premises and we were not doing a good job of confronting those differences and working with them systematically.
Yeah, this is why I like to have a whiteboard when I'm debating something even remotely complex, so that we can map out, like, you know, what are the things that I'm assuming that make me say this? And can we, like, pinpoint where our models start to diverge?
Right. So I think what the epidemiologists are saying is that generally, when we run the meta-analysis, we do not get results, for the impact on children's weight, for example, that are statistically significant at 0.05. Therefore, there is no impact. Therefore, people who are, you know, engaging in deworming on the basis of the evidence are highly misguided and are wasting money, maybe even doing harm.
And so there's a set of assumptions there and, you know, established ways of reasoning that I don't think are universally shared, at least not with economists. And so what I'd like to do, I mean, this is a case where, you know, we're not talking about what is art, right? It's not where you can argue and never reach a conclusion. We're talking here about data, probabilistic reasoning, making concrete decisions.
And there actually are ways to be more systematic about that. And so I've been playing now with building a spreadsheet where you can engage in some more Bayesian and systematic decision making. So you could say, here's my prior for what I think the impact of mass deworming is on children's weight, which is an indicator of nutrition. Here's what the evidence says from the meta-analysis. And then I'll do the Bayesian analysis, which produces a new distribution for the estimated impacts that factors in the data, but also my preconceptions, which are unavoidable.
And then I can feed that into some kind of cost-benefit calculation where I say, here's a distribution of the likely benefits based on the evidence we have, and here's our best estimate of the cost of the pills. Here's a discount for the fact that where deworming is done today, fewer kids have worms, so expect less benefit. Here's other discounts, et cetera.
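Editor's sketch: one simple way to carry out the kind of Bayesian step described above is a normal-normal update, where both the prior belief and the meta-analysis estimate are treated as normal distributions. This is an assumption made for illustration; the conversation does not say which model the spreadsheet uses. All numbers below are hypothetical and are not taken from any meta-analysis.

```python
def normal_update(prior_mean, prior_sd, estimate, std_error):
    """Combine a normal prior with a normally distributed estimate.

    Returns the posterior mean and standard deviation. Each source is
    weighted by its precision (1 / variance), so a noisy estimate moves
    the prior less than a precise one.
    """
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = 1.0 / std_error ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * estimate)
    return post_mean, post_var ** 0.5


# Hypothetical inputs: a skeptical prior centered on zero weight gain (kg),
# and a made-up pooled estimate with its standard error.
post_mean, post_sd = normal_update(prior_mean=0.0, prior_sd=0.2,
                                   estimate=0.1, std_error=0.1)
print(round(post_mean, 3), round(post_sd, 3))
```

The posterior lands between the prior and the estimate, closer to whichever is more precise, and it is a full distribution rather than a yes-or-no verdict on significance.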
Oh, that's better than my whiteboard solution.
And then my ideal would be for this to be graphical, and you could adjust some knobs and play with your assumptions. Maybe a key assumption, for example, would be that there are no side effects. You know, maybe if you start to allow side effects, that would really shift your decision. And then at the end of that you would get a distribution for the cost-benefit ratio. And if GiveWell is right, the center of that distribution will be positive, and you can just decide, like, is enough of this distribution positive enough that it's worth taking action?
And I think, as a matter of math and formal analysis, that encompasses both sides here. I'm, like, making this up as I go, and I've never articulated this before, but I ought to be able to say to the epidemiologists or any economists: OK, do you agree with the structure here and the parameters? If you don't like the parameters, the spreadsheet lets you change them. If you don't like the structure, tell me and we can redo this. But I feel like it can provide common ground for a more systematic discussion of how we go from the evidence that we have to the decisions that we need to make.
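Editor's sketch: the adjustable cost-benefit calculation described above could look something like the following simulation. It is a rough illustration under stated assumptions, not the spreadsheet discussed in the conversation. It reports net benefit (benefit minus cost) rather than a ratio, so that "positive" has an obvious meaning, and every parameter name and value here is hypothetical.

```python
import random


def simulate_net_benefit(effect_mean, effect_sd, value_per_unit, cost_per_child,
                         prevalence_discount=1.0, side_effect_cost=0.0,
                         n_draws=10_000, seed=0):
    """Draw from a normal distribution of effects and convert to net benefit.

    prevalence_discount scales the effect down where fewer kids have worms;
    side_effect_cost is a knob for relaxing the no-side-effects assumption.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    draws = []
    for _ in range(n_draws):
        effect = rng.gauss(effect_mean, effect_sd)
        benefit = effect * prevalence_discount * value_per_unit
        draws.append(benefit - cost_per_child - side_effect_cost)
    return draws


def share_positive(draws):
    """Fraction of simulated outcomes in which benefits exceed costs."""
    return sum(d > 0 for d in draws) / len(draws)


# Hypothetical knob settings; change any of them and rerun.
baseline = simulate_net_benefit(effect_mean=0.08, effect_sd=0.09,
                                value_per_unit=20.0, cost_per_child=0.5,
                                prevalence_discount=0.5)
with_side_effects = simulate_net_benefit(effect_mean=0.08, effect_sd=0.09,
                                         value_per_unit=20.0, cost_per_child=0.5,
                                         prevalence_discount=0.5,
                                         side_effect_cost=0.3)
print(share_positive(baseline), share_positive(with_side_effects))
```

Two people who disagree can then argue about specific inputs, such as the discount or the side-effect cost, and see how much each one moves the share of outcomes that favor acting.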
Yeah, I mean, that's basically my ideal. It's funny. I think people are used to thinking of "well, let's just check the data and see what it says" as the end point, like, that's how we resolve a debate: we just look at the evidence. But in many cases, that's just the starting point.
That's right. And that's because in many cases the evidence is, you know, weak. There's a lot of uncertainty or it comes from one context and we're working in another.
And so you have different priors, and you have different epistemological priors about how you should interpret data. Yeah, yes.
When the evidence is really powerful, then most of this falls away. Like, we have enough evidence that smoking is bad for you or that if you run into a wall, it will hurt. But when you move into the more uncertain areas, then you have to be more careful in how you go from inference to decision.
Well, that seems like a good place to wrap up the conversation. Before we close the episode, David, I want to give you an opportunity to introduce the Rationally Speaking pick of the episode. This is a book or article or something that has influenced your thinking in some way. Do you have a pick for us? Yeah.
When I first got interested in public policy and economics and development and other things, I was just out of college, trying to figure out what I wanted to do with myself. And I was actually studying at Cambridge in England, and I lost interest in the mathematics that I had been studying there and ultimately actually failed my exams. I was oh no, I mean, just attending classes. But part of what pulled me away from the lectures was starting to read books that friends recommended to me or that I found on their shelves.
And one of the most exciting books for me in that period was called Steady-State Economics, by Herman Daly, which brought notions of economics, but also thermodynamics, to thinking about global environmental issues in a way that I found extremely compelling. And it helped put me on the course that my life eventually followed. That's great.
One of the categories of books that I love is a book that highlights a parallel or a similarity between two fields that you thought were unrelated and says, like, we can use these tools of analysis or these ways of thinking from one field, like thermodynamics, in this other, seemingly completely unrelated field, like economics.
Yes. Yes. Cool. Well, David, thank you so much for coming back on the show.
It's been a pleasure having you. Pleasure to be here. I look forward to doing it again.
Yes. This concludes another episode of Rationally Speaking. Join us next time for more explorations on the borderlands between reason and nonsense. If you enjoy listening to the Rationally Speaking podcast, consider donating a few dollars to help support us using the donate button on our website, rationallyspeakingpodcast.org. We're all volunteers here, but we do have a few monthly expenses, such as getting the podcast transcribed, and anything you can give to help would be greatly appreciated.