
Today's episode of Rationally Speaking is sponsored by GiveWell, a non-profit dedicated to finding outstanding charities and publishing their full analysis to help donors decide where to give. They do rigorous research to quantify how much good a given charity does: how many lives does it save, or how much does it reduce poverty, per dollar donated? You can read all about their research or just check out their short list of top recommended evidence-based charities to maximize the amount of good that your donations can do.


It's free and available to everyone online. Check them out at GiveWell.org. Welcome to Rationally Speaking, the podcast where we explore the borderland between reason and nonsense. I'm your host, Julia Galef, and my guest today is Jesse Singal. Jesse is a journalist, formerly the senior editor for New York magazine's website, where he ran the blog Science of Us, and is now a contributing writer for New York magazine. I started following Jesse on Twitter because from time to time, these articles would pop up in my news feed on Facebook or on Twitter.


And I'd be like, this is an unusually thoughtful, statistically literate example of science journalism. And then I would check the byline and it would be by Jesse every time. So I was like, I should really follow this person on Twitter. We're going to be talking today about the implicit association test, the most famous test, I would say, of unconscious bias, particularly racial or gender bias. And you may have heard of the IAT, not necessarily by that name, but you may have heard it referenced because in all of these conversations that our society has been having about sexism in the tech world or about racism among police officers in the last few years, people often reference the IAT as evidence that, look, you know, even if you don't feel consciously biased in your views of women or minorities, that doesn't mean you aren't.


And in fact, most people have these unconscious biases against these groups, which the IAT can reveal. So it's been in the public discourse a lot, probably more than almost any other social psychology instrument. However, as researchers and journalists have been scrutinizing the IAT a little more closely, and the evidence that it's measuring something real and meaningful, well, it looks like these results might not be as solid as they seemed, which is where Jesse comes in. He wrote a piece for New York magazine that was one of my favorite of his articles, going really in-depth into the evidence for the IAT and some of the problems with that evidence and the interpretations of it in popular discourse.


So that's what we're going to talk about today. Jesse, welcome to the show.


Thank you for having me. I disagree that I am statistically literate, but I appreciate the compliment.


Well, you do a convincing impression of statistical literacy. Thank you very much. You know, the closer that gets to being perfectly convincing, the less distinguishable it is from actually being statistically literate. Yeah, exactly. So, Jesse, let's just start at the beginning. Can you just describe what the implicit association test is like? You know, when you sit down to take it, what are you doing? How does it work?


Yeah, so in the most basic version of the IAT, you sit down at a computer and you're shown sort of a series of images and words that flash quickly. So some of the words will be negative, some will be positive. You might have, you know, illness as a negative word, happiness as a positive word. And those are interspersed with images of black faces and white faces. This is one version of the test and the most important version of the test.


And you're basically asked: if you see a good word or a white face, hit the letter, say, E on your keyboard. If you see a bad word or a black face, hit the letter I. And those two combinations are flipped at a certain point. And what's going on under the hood is the computer is tracking how easily you connect in your brain, this is the thought at least, good concepts with white faces versus good concepts with black faces, or bad concepts with white faces versus bad concepts with black faces.


The idea is, if it takes you longer to connect, say, black faces with good concepts, that means your brain is sort of struggling more and trying to overcome implicit bias. According to the test, that is a sign that you are implicitly biased in a way that would favor white people over black people. And of course, the test could also reveal that you have an unconscious preference for black people over white people. But the data they've collected suggests most Americans have an unconscious preference for white faces over black faces or, you know, in other versions of the test, it might be white-sounding names over black-sounding names and so on.
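

As a rough illustration of the timing comparison described above, here is a minimal Python sketch. It is not the scoring procedure the test's creators actually use, which the conversation does not describe; the function name and the reaction times are hypothetical, made up purely for illustration. It only shows the basic idea of comparing average response latency between the two key pairings.

```python
from statistics import mean

def latency_gap_ms(easier_block_ms, harder_block_ms):
    """Mean reaction time in one pairing block minus the mean in the other.

    A positive gap means responses were slower, on average, in the
    second block. This is a simplified stand-in, not the real IAT score.
    """
    return mean(harder_block_ms) - mean(easier_block_ms)

# Hypothetical reaction times in milliseconds (invented numbers).
white_good_pairing = [620, 580, 640, 600]   # "good word or white face" share a key
black_good_pairing = [780, 820, 760, 800]   # "good word or black face" share a key

gap = latency_gap_ms(white_good_pairing, black_good_pairing)
print(f"Average slowdown: {gap} ms")
```

Whether a gap like this reflects bias, mere awareness of stereotypes, or measurement noise is exactly what the rest of the conversation questions.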


Right. Right. And what have the results generally shown? Like how do people tend to perform on the test?


The average American is implicitly biased against black people. It's funny, even just describing these results, you need to be very careful because, as we'll get into... We'll definitely get into that.


Yeah. So right.


That is the sort of layperson version, or what the proponents of the test would say, is that the average American is implicitly biased against black people, and that is more common among white test takers than black test takers. But a solid minority of even black test takers are also at least slightly biased against their own race.


Cool, so let's just start by digging into this, the interpretation of the IAT. So reading your article and some of the things I've been reading after your article has been really helpful.


But I have to admit that when I first heard that the IAT might not actually be evidence of bias, my thought was like, yeah, I mean, I agree that just because someone gets a score on this test, that isn't identical to them actually being racist. It's just their score on a test. So it's not proof, but it was really pretty hard for me to come up with a plausible story in which someone would have this unconscious association between white faces and good concepts or between black faces and bad concepts and not actually be biased.


So maybe you could just talk about other ways to interpret the results.


Yeah. So the most common alternative explanation, and this has been posed in the literature a lot by several different researchers, excuse me, is basically that if you're aware of negative stereotypes about black people, you might be quicker to associate words like crime with black faces or, you know, victim or violence with black faces. If that's the case, that could generate a, you know, a high score, meaning a biased score on the IAT, even though you're not implicitly biased against black people, you're just aware of these negative stereotypes against them.


And there was one particularly ingenious experiment in which researchers actually created a new, non-existent, you could call it either a race or a species, called Noffians, N-O-F-F-I-A-N-S. And by sort of inducing in the experiment's participants this idea that Noffians are the sort of downtrodden group that society doesn't like, they were able to... How, just like reading them paragraphs about Noffians being bad or...?


Yeah. So basically what they did was, there are two groups, Noffians and Fasites, and so sometimes they would basically tell people Noffians are privileged and Fasites are oppressed. Other times it would be the reverse. And when Noffians were oppressed, for example, people would score higher on an IAT about Noffians, meaning more bias, more of what we would have naively assumed was bias against Noffians.


Yeah, it's not, but, you know, obviously they can't actually have any reason to think Noffians are bad because they're a made-up group and all we know is that they've been oppressed.


Right. Exactly. So that was what was so ingenious about the experiment: in this case, there was no other explanation but that, you know, if you view a group as downtrodden, it might boost your IAT score regarding that group.


Right. I guess in theory, it's kind of convoluted, but you could have a story in which you have some belief that oppressed groups are oppressed for good reason. And so if you know a group is oppressed, then there must be something bad about them or something.


Yeah, well, I mean, there's all kinds of ossify that. Yeah, definitely.


And well, so that's the thing. We're talking about, in some cases, you know, a difference in reaction time of a couple hundred milliseconds. So if you have a pie that is two hundred milliseconds, how big a slice of the pie is actually implicit bias? How big is associations? How big is just problems with the way the test is designed, or with error? And the basic problem is that people have assumed the whole pie, or most of it, is something that we can genuinely call implicit bias against the group, and the studies trying to connect IAT scores to actual behavior in a lab setting simply haven't really shown that.


How do they try to connect it to... do you mean to actual behavior that we would call biased behavior? They're trying to connect that to people's scores on the IAT?


Yeah. So there have been by now a lot of studies where you basically give people the IAT and then, you know, it sounds weird, but you put them in a lab and you give them an opportunity to be a little bit discriminatory, and that can take a lot of different forms.


One of them is you sort of give people a chance to interact with a white or a black experiment confederate, and you might find that people with high scores are ruder to someone who's black than someone who's white, as judged by a third-party observer. That's a third-party observer who doesn't know their score on the IAT?


Yeah, third parties are blinded to the whole sort of experiment. Interesting. So, you know, for a while... the test first came out in '98. The first study showing this sort of result, showing that it was linked to behavior, came out in 2001. And then for, you know, a decade and a half, there were all these other studies that appeared to show this link between IAT scores and behavior, and the test's two creators and main proponents,


Mahzarin Banaji, she's the head of psychology at Harvard, and Anthony Greenwald, who's another big name in social psychology, they said, look, these are incredible results that show this test can predict behavior better than just about anything else, including, you know, situations where you explicitly ask people about their levels of racism.


And what researchers have found is their claim doesn't hold up. When you actually take all these studies together and do a rigorous meta-analysis, IAT scores really only explain a tiny chunk of the variation in how racist people act in these experiments.


Do people's results on the IAT correlate with their explicit self-reported bias? Like if you just ask them questions about their beliefs about black and white people, does that connect to their score on the IAT?


My understanding is that there's, I think, a fairly weak correlation there. But now they've correlated IAT scores with things like political party, for example. So I think there's a fairly strong connection where the higher you score on the IAT, the more likely you are to be politically conservative.


It seems like kind of indirect evidence for, you know, whether it's measuring the thing that it claims to be measuring.


There are so many sort of potential confounds here. And part of the problem is a lot of these experiments were not, at least according to Hart Blanton, who's sort of one of the smartest critics of the literature... he doesn't think even the experiments that do show a correlation were designed in a sufficiently rigorous way. And that's a view that was echoed by a couple of Scandinavian researchers who looked at the evidence and basically said, whatever correlation we find in the landscape of IAT-behavior literature, it's just so statistically weak, we really can't assume anything really true here.


Got it.


So probably we should separate these two distinct questions about whether the IAT is a good test that actually demonstrates implicit bias. The first question is like, were these studies well conducted? Like if you were to try and replicate this exact phenomenon of people, you know, showing a stronger association between white faces and good concepts than between black faces and good concepts, would those studies replicate, as opposed to just being noise or, you know, a poorly designed study, etc.


But then second, let's say that the answer to the first question is yes, they were well-designed studies, it's a real phenomenon. Then the question is sort of, what does that tell us? Does that have any relation to the things we actually care about, like people making biased choices or behaving in biased ways? I guess these two different questions kind of map onto internal validity and external validity. Does that sound right to you, that those are the two questions?


Yeah, I think those are the questions. And I think where we're at now is that there are consistent patterns in the data where, for example, white people score higher than black people on the black-white IAT, meaning higher scores, more bias.


And that has replicated? Yeah. That the association exists and that it is higher among white people?


Yeah, my sense is that stuff is fairly solid, in that even the critics will say, OK, this is a genuine pattern and this is worth investigating. What has not held up is the external validity. And we should go back and talk about internal validity, because there are other issues there. But in terms of the basic external validity question of does the IAT predict behavior meaningfully, at this point the answer really appears to be no, to the point where even, you know, Brian Nosek, who's a big reproducibility guy and who is involved in a lot of this research, his name is on a paper.


I'm not sure if it's out yet or if it's in press. But the most recent sophisticated meta-analysis showed that IAT scores account for less than one percent of the variance in racist behavior in lab settings. So. Oh, wow.


In what other situation, if someone came to you with this fancy new instrument and said, check this out, this is really important, we should spend millions of dollars on it, and you said, OK, how good a job does it do explaining what we're trying to measure, and they said under one percent... there's just no situation in which you would find that impressive. So to me, the external validity question really doesn't look good at this point, or the answer to that question doesn't look good.
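

To put "less than one percent of the variance" in more familiar terms: for a simple linear relationship, the share of variance explained is the square of the correlation coefficient, so one percent corresponds to a correlation of about 0.1. The short Python sketch below just does that arithmetic; the function name is a hypothetical chosen for illustration, and it assumes a single predictor (the IAT score) and a single behavioral outcome.

```python
import math

def correlation_ceiling(variance_explained):
    """Largest |r| consistent with a given share of variance explained (r squared)."""
    return math.sqrt(variance_explained)

variance_explained = 0.01                      # "one percent of the variance"
r_ceiling = correlation_ceiling(variance_explained)
unexplained = 1 - variance_explained           # share left to everything else

print(f"|r| is at most about {r_ceiling:.2f}")
print(f"{unexplained:.0%} of the variance is left unexplained")
```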


Yeah, I did indeed want to talk a bit about the internal validity problems, too. So, yeah, why don't you tell us what some of the problems are with actually measuring the pattern.


Yeah. So you do get these consistent patterns at the big zoomed-out level, especially in terms of how white people versus black people perform on the test. But the test is quite poor when it comes to what's known as test-retest reliability.


Meaning, if the same person takes the test multiple times, do they get similar scores?


Yeah, their score will jump around quite a bit. And I think if they're white, it's likely that most of those scores will be positive, meaning in theory it indicates they're biased.


But, you know, my article has sort of the nitty-gritty statistical stuff, but basically this test does not come close to the level of test-retest reliability we would expect for any instrument professionally used to measure anxiety, depression or anything else. And the reason why it sort of blew up the way it did and became this viral sensation, despite performing so poorly in terms of its psychometric attributes, is an important question. And I wish we had a better sense of why that is, but it just doesn't come close to the level of performance one would expect.
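

Test-retest reliability is commonly summarized as the correlation between the scores the same people get on two separate sittings. The Python sketch below is a minimal illustration of that idea, not a reproduction of any analysis from the article; the scores are invented, and `pearson_r` is a hand-written helper for illustration only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Hypothetical scores for the same five people on two occasions (invented).
first_sitting = [0.2, 0.5, 0.8, 0.4, 0.6]
second_sitting = [0.6, 0.3, 0.5, 0.7, 0.4]

retest_r = pearson_r(first_sitting, second_sitting)
print(f"test-retest correlation: {retest_r:.2f}")
```

A value near 1 would mean people keep roughly the same rank from one sitting to the next; scores that "jump around quite a bit" pull the value toward 0.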


Yeah, you know, thinking more about how it could be the case that the pattern is real, but it doesn't actually translate into biased behaviour. Someone, I think it was a philosopher, but I'm blanking on his name now, said that it seems to them what matters isn't whether people have an association between, say, the concept of black people and the concept of, I don't know, bad or dangerous. What matters is how central the concept "dangerous" is to their conception of blackness, because if it's not central, then it can easily be swamped by other contextual associations.


Like, I think the example they gave was, you know, if dangerousness is central to your concept of blackness, then you're going to associate that with a black person no matter what context you view them in. Almost like whether you see a black person in a church or a university or a street corner. But if dangerousness is just very kind of secondary or peripheral to your concept of blackness, then maybe you'll associate a black person with dangerousness in some contexts like a street corner.


But as soon as the context switches to, say, a church or a university, there's no longer that association for you at all. And so the IAT, you know, it's this very abstract, kind of contrived... well, it's really contextless. So it doesn't really measure the centrality or the context dependence of the associations that it's testing for. What do you think about that?


Yeah, I mean, that makes sense to me.


And I think what's striking is the way that, even though all we're measuring is this very slight, in many cases fraction-of-a-second, reaction time difference, the folks who created this test and then helped publicize it and wrote a book about it simply stated, in my opinion without evidence, that this test could explain everything from police shootings to, you know, problems in terms of who gets to rent what home.


And they've basically said this test can explain a big chunk of racism in the US. And I just don't think that claim is merited by the evidence at all. And I think all the focus on the IAT has potentially sucked a little bit of the oxygen out of the room and diverted funding and attention from other, you know, better or more rigorous ways of understanding race from a social psychological perspective.


Yeah, you know, one really important point that I want to make sure comes across in this conversation is that even if, let's just assume, the IAT is completely bullshit, that would not imply that there's no such thing as implicit bias. Right. It just means this one test that was designed to measure implicit bias is not a good test. And in fact, you know, personally, I would be shocked if there is no such thing as implicit bias. Like, all of my priors and a bunch of anecdotal evidence suggest to me that there is.


And by priors, I mean, you know, we know that a ton of cognition is unconscious. We know we're influenced by patterns that we're unconsciously picking up, or that our brain is picking up without our being aware of it. And there has been plenty of stereotyping by race or gender in our culture historically. So I would expect that to influence our unconscious pattern-matching algorithms. And maybe the fact that it feels like such a common-sense thing to exist is why people have sort of overplayed the evidence for this test, because on some level they're like, oh, well, this makes so much sense, it must be true. And they, like, don't worry too much about the fine details of the evidence. Yeah.


And well, I mean, you're touching on one of the frustrations about trying to write about this in a rigorous way, which is when you point out the weaknesses of the test, you sort of get two annoying responses, one from each side of the political aisle.


To oversimplify a little bit. That's like the story of your life on Twitter. I'm really impressed by the equal amounts of vitriol you get from both sides.


Yeah, which I find weird because I really do consider myself pretty far to the left at least.


But yeah, I also consider yourself. Yeah, in an American context, you seem pretty clearly on the left to me, but.


Yeah, but so, you know, you write an article like this and people on the left will say, oh, so you're saying implicit bias isn't real. You're saying racism isn't an important thing to study? No, I'm not saying that. And then people on the right will say, well, this clearly shows that the, you know, the focus on implicit bias is misplaced because this shows implicit bias isn't real, when, in fact, as you just said, this is just one instrument.


If I gave you a thermometer and you show that the thermometer doesn't do a good job measuring temperature, you would not infer from that that temperature isn't an important thing to measure or temperature doesn't exist.


And the evidence for implicit bias, I think there is a fair amount of solid empirical evidence for it. When you look at, for example, studies where they send out a bunch of job applications with stereotypically black versus white names, there is this consistent pattern where having a white name grants you sort of a bonus in the probability you'll be granted an interview. And that, to me, sending resumes out in the twenty-first century when we know people claim not to have explicit bias, that to me is fairly solid evidence for implicit bias.


Yeah, it's also a little more... I mean, it is, I know it's a test, not... Well, no, sorry, it's actually a real test in the field. Like, people think they're actually deciding who to hire, right? Yeah. Yeah. So you're right. That seems like much more direct evidence of what we care about than the IAT. I would say so.


I mean, people should look up the work of... I took a class with her, her name's Devah Pager. She's done some of these resume studies, as have a lot of other academics. And yeah, I mean, you could tell some sort of pretzel-like story where, oh, that's actually explicit bias, but they're hiding it and no one can observe them, so they act on it. But to me, as you said, it's just implicit bias.


Given everything we know about sort of System 1 versus System 2 processing and the way humans really carve people up into categories, I absolutely think implicit bias is real, and that in some settings it probably matters a lot.


I also think what's happened over the last two decades in social psychology is that this one very faulty test has convinced people that implicit bias is probably the thing to study to understand racial discrepancies, when in reality there's all sorts of deeply rooted structural stuff going on that doesn't really rely on implicit bias to reproduce these inequalities, to use the sort of lefty sociological language.


Right. Right. Um, I guess now that we're talking about these more meaningful and fruitful tests of implicit bias like the resume studies, I guess I'm wondering what is the value add of the IAT? Like, I guess, OK, what if we want to test people for implicit bias?


Why don't we just give them resume tests, for example? And I know that, you know, we couldn't just give them a single resume and infer their level of implicit bias from that, because it's it's the bias there would be measured over a set of resumes. You'd have to look at, you know, whether on average they show bias against black named resumes. But you could do that. You could just give people, you know, twenty resumes to evaluate, some of which have black names and some which have white names.


And you would randomize, you know, the name and see how they evaluate them. And if they show a correlation or you could give them, you know, a pretend court case with a defendant who's either black or white and ask, you know, do you think this person is guilty or what kind of sentence do you think they should get? And over twenty of these, you could then test, you could then see if people are harsher or less forgiving of black defendants than white defendants.


Why aren't we just doing that? That's so much more direct, I think.
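

The design floated here, randomizing names across a stack of resumes and looking at the average difference in ratings, can be sketched in a few lines of Python. This is only an illustration of the randomize-then-compare logic under stated assumptions: the function names are hypothetical, and the "evaluator" is a made-up rule that docks one point for a black-sounding name, so that the expected gap is known in advance.

```python
import random
from statistics import mean

def assign_names(n_resumes, rng):
    """Randomly label half the resumes with black names and half with white names."""
    labels = ["black", "white"] * (n_resumes // 2)
    rng.shuffle(labels)
    return labels

def rating_gap(labels, ratings):
    """Average rating of white-named resumes minus that of black-named resumes."""
    white = [r for name, r in zip(labels, ratings) if name == "white"]
    black = [r for name, r in zip(labels, ratings) if name == "black"]
    return mean(white) - mean(black)

rng = random.Random(0)            # fixed seed so the shuffle is repeatable
labels = assign_names(20, rng)    # twenty resumes, as in the conversation

# Hypothetical evaluator: every resume is worth 5, minus 1 for a black name.
ratings = [5 - (1 if name == "black" else 0) for name in labels]

gap = rating_gap(labels, ratings)
print(f"average rating gap: {gap}")
```

Because the name is assigned at random, any systematic gap in ratings can be attributed to the name rather than to the content of the resume, which is what makes this kind of design more direct than a reaction-time measure.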


Well, for one thing, the IAT, you can sit down and take it in about ten minutes, like it's very simple and straightforward to take. I also think that the proponents of the test would claim it's measuring something a bit more sort of primal, or at a really fundamental level, that isn't captured by the more intellectual System 2 process of sitting down and evaluating resumes and evidence.


See, this is where stuff gets really complicated, because if, you know, if you talk about a hiring manager looking carefully at different resumes, how much are they influenced by System 1 thinking versus System 2 thinking? How can you even answer that question empirically? The nice thing about the IAT, if you think it works, is it really boils things down to gut-impulse System 1 thinking, and it doesn't let other factors creep in, again, in theory.


Mm hmm. All right. So let's say it's true that to some extent, implicit associations predict bias. And I hear you that the evidence is pretty shaky on that. But let's just assume it were true. Would that mean that if we wanted to get people to act in a less biased way, the right approach would be to change those associations, like get them to be more likely to associate black people with good instead of bad?


And that that would cause them then to behave in a less biased way? What do you think of that connection? It's like jumping from prediction, using this as prediction, to, you know, intervention. Exactly.


I would say, again, we're now in a fantasy world where the test does predict something meaningful. There's still no evidence that you can actually attack these associations and change them. I'm less versed in this research, but there have now been a number of studies attempting to build interventions that basically do what you just said, which is sort of try to rewire people's associations. And that big meta-analysis I mentioned that Nosek is a co-author on, the main thing it was looking at was the effectiveness of these interventions. And I have a quote in my article here, basically, from Calvin Lai, who's a Harvard, I think he's a postdoc, who worked on that study.


And he basically said we found no reason to believe that these interventions work, that they do what they're advertised to do. And, you know, I don't find that surprising, because the test isn't really measuring anything in the first place. But even if you assume it is, there's still no evidence that this is the most useful use of our inevitably limited resources to try to fight racism.


Right. Are there versions of the IAT on topics other than race that hold up better, like that are more stable or... What, on test-retest? Well, I was just going to ask if there are versions of it that score better on that metric and are also more predictive of real-world behavior. Like, what about the IAT for gender instead of race? Right.


The gender one is, if anything, weirder, because it fairly consistently finds that women are more implicitly biased against women than men are. It gives all these weird results that we're supposed to take at face value.


So I'm looking at this chart and it basically shows that white women have the highest level of implicit bias, and any kind of man, white men, black men, Hispanic men, is less implicitly biased when it comes to gender.


So they seem to have less of an implicit association between female faces or names and concepts like good, or what is the actual association they were testing for? Yeah, this is basically the compilation of everything they ask on all the tests that were taken on Project Implicit. I'd have to actually go back and see how they frame this. It's basically just reported as implicit gender bias, which we can safely take to mean just that men are better than women, or better qualified for certain positions than women.


Right. And so in this view, it varies a little bit depending on how liberal or conservative you are. But generally speaking, all women are more biased than all men, more implicitly biased.


Now, do the men show any bias or is it just not distinguishable from zero? They show some bias. Yeah, like it looks like a moderate amount of bias. So what the IAT has us do is accept this version of reality where women are just consistently and significantly more implicitly biased against women than men are. And, like, implicit bias is this sort of sometimes fuzzy and woolly concept where, you know, you could have bias against yourself, you can have all these weird societal factors.


But I'm just not sure why we should accept this idea that women are significantly, across the political spectrum, more biased against women than men. And what's funny is, I'm looking at this now, even strongly liberal women are more implicitly biased against women than far, far-right men.


So think about that. Yeah, I was sort of skeptical when you were saying that women showed more bias against women than men, because I was like, well, you know, if there's a culturally dominant stereotype about what women are good at or bad at, I wouldn't necessarily expect women to buy into that less than men. And so it's not that shocking to me that women would show more of this bias. But I mean, the strong liberal versus strong conservative, that just goes against all of my common sense about, you know, gender bias.


Yeah. So this is what I'm sort of saying. When I wrote this article, this is a second article, I was responding to a FiveThirtyEight piece by a very talented reporter who reported these associations at face value. And I don't think he was skeptical enough. To me, these tests aren't magical. Researchers aren't infallible. You're telling me that a far-left woman is more implicitly biased against women than Jerry Falwell Jr.? I mean, I guess there's some version, and maybe I'd be more likely to believe it if this test had shown, you know, that it was ready for primetime in terms of all the psychometric properties.


But I just find that to be a very hard result to believe.


Yeah, this almost makes me wonder, how easy is the IAT to game? Like, could it be the case, for example, that far-right men are like, "Nah, these liberal researchers are trying to, like, prove that people are sexist. I'll show them, I'll just, like, you know, associate women with all the, you know, career-minded words or whatever"? Right.


So one of the big selling points of the test, especially, I think, in its earlier days, was that you can't really game it, that it's sort of this, like, computer algorithm truth serum that reveals your deepest beliefs.


I really do see why this became so beloved by the public. Like it's such a sexy idea.


It's so compelling.


And yeah, no, I want to circle back to some of the sort of political ramifications for liberals, but the test's proponents have not told a consistent story over whether or not these are impulses we can control or whether or not it can be gamed. And you can see that a little bit. I'm not sure these two ideas are entirely contradictory, maybe you tell me. On one hand, you have the idea this test is revealing something about ourselves we can't control, it can't be gamed.


Then on the other hand, you have: but there are these interventions that can reduce it, that can fix it.


I guess you could sort of see those two ideas and say... those don't seem contradictory to me. Gotcha.


So you're saying, like, maybe if you make a conscious effort to fix them, then you can. Yeah, I mean, you can reconcile them with each other by just saying, well, you know, taking this test, you can't hide your current level of bias. That's going to come out. But that doesn't mean you can't change your level of bias, you know, over time. Gotcha. Yeah, yeah.


That's how I read it, anyway. Yeah, that part's not that contradictory. I guess where they have been contradictory is that different researchers at different times have made different claims over to what extent, in one test-taking session, you can control your results or game it.


Yeah, that makes sense. Great. So let's talk about... you were alluding to the political implications of these results for liberals in particular. Yeah.


So, you know, I'm basically working on a book about sort of shoddy ideas in social science and why they catch on and what they can tell us about society.


And my chapter is going to be in some regards similar to the article I wrote, just talking about the methodological issues. But I'm also interested in the way that, and this might sound strange, I think this test tells white liberals a story they want to hear. And I'm including myself in that, because I am very much white and very much a liberal.


There's something about the experience of taking this test that can make white liberals feel like they are taking part in the fight against racism, despite the fact that, and I'll try to be gentle, oftentimes white liberals don't really do much but take the IAT or tweet about racial injustice.


And oftentimes, because of the crappy system that we have set up, we are perpetuating many of these forms of inequality. I think the IAT might give us an easy way out, to feel like we're fighting the good fight without actually doing anything.


That's my... Yeah. I really appreciated this part in your article, I think it was your article, where you were talking about the kind of performative nature of people reporting their scores on the IAT. There's just sort of this ritual that people seem to go through of reporting their score with this, like, really troubled kind of emotional tone, almost like a confessional. This wasn't your phrasing, but it reminded me of some of the circles in the early '70s, like at Esalen or something, where people were sort of confessing their sins to the group in this really emotional way, almost like exorcising their, you know, bigoted demons or something.


Yeah, for some people who take the test, not just white people, but mostly white people, for understandable reasons, there is a sense of, like, this sin within me has been discovered, and I can expurgate it by telling people about it and by telling people about how I responded to this test emotionally.


And it's just striking because the test doesn't really do anything.


I mean, as I would argue we now know. And it's just sort of another way, in my view, for white liberals to talk about caring about this stuff without actually putting their money where their mouth is.


And, well, I want to stick up a little bit for the white liberals. You know, if I imagine this test actually measuring a real thing and being reliable and so on, then it does seem, you know, it's not the most high-effort or costly thing you can do, but it still seems pretty valuable for people to post their results and say, like, "Turns out, actually, I'm biased, too," because, you know, it kind of helps with the story of, like, racism is real.


And just because you think you're not a racist doesn't mean you're not acting in a way that hurts minorities or something. So that does seem... Well, I guess I could also tell the opposite story. This is something that I talked about in my previous episode with Seth Stephens-Davidowitz, which is that, like, it's not clear that making it common knowledge that everyone's racist is actually helpful. It certainly seems plausible that it could be. But you could also tell this other story where, like, it just normalizes racism, and then everyone's like, "Oh, I guess if everyone's racist, then it's fine for me to be racist," and it makes it seem more acceptable.


So I don't actually know. But at least I can tell a plausible story in which this would be a valuable thing for white liberals to do, in a world in which the IAT was measuring something real and meaningful. A lot of qualifications, but.


Sure. Yeah, no, no. And I appreciate your impulse, because I do think overall anything you can do to sort of gauge your own role in propping up racism or propping up any bad institution is good. I just think, A, the performative aspect of this and, B, the extent to which it just isn't anchored to anything in the real world, which, again, is something the average test-taker doesn't know... It just seems to me that it doesn't get us anywhere.


It's just more white people talking about how dismayed they are at their own racism without... I just don't find IAT conversations tend to lead to discussions about what to do. I think it leaves people maybe feeling a little bit helpless. And, you know, I'm not an activist, but I think it's safe to say that, psychologically, if you make people feel helpless or mad at themselves, that isn't necessarily the best way to then get them to actually do something about it.


Yeah, and this is another good point you raise in the article. It also strikes me as kind of unethical to keep giving people this test that tells them that they're biased without necessarily good evidence for that, because, as you say, it can be kind of psychologically harmful, or, like, it's an unpleasant thing to have to believe about yourself. And it's ironic, because, you know, the IRB, the, what's it called, Institutional Review Board, that's, you know, the board that's supposed to, like, prevent unethical studies from being done and make sure that, you know, participants are being treated well and they have informed consent and they're debriefed and all these things.


And, like, in so many cases, the IRB, you know, dramatically overreaches and bogs down completely harmless studies in, like, months of red tape. And it's maddening. And yet, at the same time, stuff like this happens that actually does seem harmful and somehow slips through the cracks. I don't really understand how this works. Yeah, I'm, um.


Did you see the Scott Alexander piece? Yes. I think we could potentially link to that. Yeah. Please. Slate Star Codex, his blog, has a great piece about navigating the incredibly Byzantine IRB process. But yeah, this is another point I hope I can go deeper into in the chapter.


Brian Nosek, who I have a lot of respect for, when I asked him about this, he said, you know, "If my fifth-grade daughter gets a bad grade on a math test, it's just one piece of information she should fit into her broader understanding of how good she is at math." And he compared that to the IAT. I don't think that comparison works, because since the test was created, so many people who have taken it, including the co-founders, have talked about the searing nature of the emotional experience of getting your test result and finding out that you're biased.


So it's very clear that taking this test has a big emotional impact on some people.


So what's another situation in which we would be OK with a psychological test being promoted by Harvard eliciting this sort of reaction in people, despite the fact that it isn't clear what exactly, if anything, it's measuring? If Harvard was offering any anxiety or depression tests that had these psychometric properties and that caused people to get really upset, there would rightfully be some serious ethical questions about that test.


And, I don't know, I guess summing it all up, racism is just such a tricky, emotional subject, and we so desperately want easy answers about how to address it, that I think our desire for those answers swamps some of our other, more thoughtful impulses, maybe.


Yeah. You know, defenders of the scientific process, which, you know, frequently includes me, will often say, like, "OK, look. So, yes, we were wrong about this phenomenon before. We've had to revise our model of it. Some of these results were wrong and they got corrected. But that's not a flaw in science. That's how science is supposed to work." But, of course, on the other hand, you know, some errors are just part of doing science, but other errors are more a result of sloppiness or intellectual fudging or even intellectual dishonesty.


And that's actually a bug, not a feature, in the scientific process. So I'm curious whether, in your opinion, this sort of, you know, huge excitement over the IAT and then having to walk a ton of that back, whether that is a case of science functioning healthily and self-correcting, or was there a flaw in how we, you know, science, went about promoting the IAT? "We" meaning me and you, right?


Over the last 20 years. Right.


I think I have complicated views on that. I think there's no way to look at this whole thing and not consider it at least a minor disaster. For 20 years, you've had a huge amount of resources flowing toward a test that is just so psychometrically weak and, to me, really misleading to a lot of people. That is a failure of science. In terms of how to divvy up the blame or who should have done what, I can't put my finger on it exactly.


You know, I blame white liberals.


No, I think folks like Greenwald and Banaji, who really were the cheerleaders for this test, I think they're trying to be good scientists, and I think they take these issues seriously. I also think they made claims that significantly ran ahead of the evidence. And that's most striking in some of the text of their book, where they really say, like, this test does a great job predicting behavior, better than anything else we have, better than explicit measures. And, you know, two years later, they themselves are admitting, in an academic journal no one's going to read, that this test should not be used to diagnose individual levels of implicit bias.


They themselves have admitted that. So why was there all this excitement over a test that, 20 years in, the architects of the test had to say you shouldn't use on individuals? And if you can't use it on individuals, are we really going to now use it to make these big maps of racism in America, as some people have done? Are we going to use it to compare big populations?


I don't know. I'm torn. I don't want to be too hard on them. I just think every step of the way they got excited and they saw the attention it was getting. And I think they made the mistake a lot of scientists make, which is they sensed they had, like, a really exciting idea on their hands and they let the excitement get the best of them.


Yeah. There's also this thing that I've seen, you know, not from all proponents of the IAT, but from some of them, where, when they encounter criticism or pushback on the claims, they use the defense, like, "If you're criticizing it, then maybe it's because you're racist," or, like, "You don't care about racism," or something like that. Yeah, that is such bad form. It's so toxic to being able to talk about evidence and, you know, converge on the truth in the long run.


Like, I have very little sympathy for that. Yeah.


I mean, look, I put it in the article, it's not a secret. But Banaji, who's the head of psych at Harvard, she said in emails, like, she basically made that argument, that people who criticize the IAT, you know, want to go back to a world of segregation. She didn't say that exactly, but she strongly implied they're racist.


And I just think, like, that's the sole plausible motivation for why someone would doubt the IAT? Yeah.


Which, if you're going to make that claim, like, what's the point of debating anything if you're just going to immediately jump to that? And, you know, there are obviously a lot of people who are motivated by racism. But in this case, she's talking about a lot of researchers publishing in top journals about a test that she herself, in writing, has said is too weak to tell individuals how implicitly biased they are, even as Harvard University continues to do exactly that every day.


So I think she should be in a bit more of a position of defending why she's made these statements about her test and not accusing what I see to be good faith critics of it, of being racist or whatever.


Yeah, yeah. I mean, I was trying to be charitable, like, you know, science is not going to be perfect on its first try, you know, with every investigation it does or every conclusion it reaches. And I still stand behind that. But, you know, the more I think about the discourse around this test, both early on and, you know, now that there's been a lot more questioning of it, the discourse seems really bad. Like, even, you know, this is less toxic than, like, accusing your critics of racism.


But even the, like, response of, "Well, you know, maybe it's not perfect, or maybe it doesn't actually show conclusively what we said it did, but it's still educational. It still, right, helps teach people about race." That's such a weak... Yeah. That's such a rationalization. And it's not... anyway.


Oh, no, that's exactly right. And again, all you have to do to understand the weakness of these arguments is swap in any other thing that social psychologists, or any psychologists, might want to measure. "So this test won't actually tell you how suicidal you are, but it's a good opportunity to teach you about suicide." Right. Right.


"We gave you a suicidality score, but that score isn't meant as a diagnosis. And anyway, the whole point was just to educate people about suicide."


Right. And that's what drives me crazy, and you see this in so many different areas. "And if you oppose our test, then you think people should quit."


Right, because, again, there are certain concepts that seize our brains and turn off our usual ways of thinking and our usual standards. And I do think racism is one of them, because it is so pervasive and offers up so many salient examples of how horribly people in this country are treated for no reason. But that shouldn't short-circuit us and cause us to make arguments that we wouldn't make in other cases. The other one I hear a lot is, like, "Sure, this test doesn't really predict anything, but it's better than anything else we have."


I mean, OK, if I have a thermometer that is consistently off by 50 degrees Fahrenheit, that's a little bit better than random. But would I ever use that for professional purposes? Would I consider its readings to be accurate? Of course not. So it's just weird to see, like, a totally different set of standards applied to it than are applied to basically any other scientific project. All right, well, we're just about out of time. Did we...


I think we've fixed racism in the lab. Yeah, that's right. It took us less than an hour. I don't know what the rest of society was, you know, doing with their time the last 100 years. Exactly. Very efficient. So before we close, Jesse, I want to give you the opportunity to recommend the Rationally Speaking pick of the episode. This is a book or a blog post or article, something that has influenced your thinking in some way.


What would your pick be?


Everyone should read Galileo's Middle Finger by Alice Dreger. Oh, that's also on my reading list. Yeah.


It's so good. And, launching into this whole... this is like another podcast for another day, but I've written a little bit about gender dysphoria, and especially the question of, you know, what you do when a five-year-old has gender dysphoria.


And gender dysphoria is identifying with a gender that doesn't match your biological sex? Yeah, sometimes.


Sorry, did I just open up a huge can of... I find more of this on Twitter.


So basically it just means a feeling of discomfort with your own sex in one way or another. And some people interpret that as "I am the other sex," some as "I have no gender." It gets complicated.


But Dreger's book sort of really opened my eyes to the way good lefty social justice can sometimes be at loggerheads with good, rigorous science. She offers a number of case studies, including one about transgender issues, and she has this way that I really admire of taking these complicated academic fights and turning them almost into sort of mysteries you want to solve, these really compelling narratives, in a way no other book has done.


Her book just sort of nudged my writing career in a different direction and got me involved in stuff I didn't think I'd get involved with. And I just think it's a wonderful book everyone should read.


Oh, fantastic. Well, I'm grateful to her for doing that, although I feel bad for your blood pressure as you wade into the swamp on Twitter every day, getting attacked from both sides. I sort of see you fighting the good fight.


It's the price of admission. And like I should say, a lot of the times people who are mad at me on Twitter like they have legitimate grievances. Yeah. And then, I don't know, people have their own stuff going on. So I try not to take Twitter too too seriously.


Well, easier said than done, but. Yeah. Something to strive for. Yeah. All right. Well, we'll link to Dreger's book as well as to your article. And we should also link to Scott's post about the IRB, which everyone should read. Definitely. Jesse, thanks so much for being on the show. It's been a pleasure talking to you. Thanks, Julia. This was a lot of fun. This concludes another episode of Rationally Speaking.


Join us next time for more explorations on the borderlands between reason and nonsense.