This episode of Rationally Speaking is brought to you by Stripe. Stripe builds economic infrastructure for the Internet. Their tools help online businesses with everything from incorporation and getting started to handling marketplace payments to preventing fraud. Stripe's culture puts a special emphasis on rigorous thinking and intellectual curiosity, so if you enjoy podcasts like this one and you're interested in what Stripe does, I'd recommend you check them out. They're always hiring. Learn more at stripe.com.


Welcome to Rationally Speaking, the podcast where we explore the borderlands between reason and nonsense.


I'm your host, Julia Galef, and I'm here with today's guest, Simine Vazire. Simine is a professor of psychology at the University of California, Davis. She's the author of the blog Sometimes I'm Wrong and the co-host of The Black Goat, a podcast about doing science. Simine's research is really interesting. It's about how accurately we understand ourselves, our personalities and our behavior, and why that matters. So we're going to talk about that topic. But the way that I first encountered Simine was in a different role that she plays.


She has been a central participant in the conversation about methodology in the social sciences and where the field needs to shape up and how. Just to give you an example and a taste of Simine's style, she teaches a seminar that's titled Oh, you like that finding, do you? Well, it's probably false. So we'll be talking about that as well. Simine, welcome to Rationally Speaking. Thank you. So that's actually not really the title of my class.


That's my joke title, but it's basically the theme of the class. I'm not really calibrated yet on when you're joking, but it's obvious to others. Yeah. Yeah. So I've talked a fair bit on Rationally Speaking already about the replication crisis and the reasons why studies don't replicate, with some previous guests like Brian Nosek and Uri Simonsohn. But there's one argument that I've been thinking about recently, against the idea of increasing rigor in the social sciences, that I wanted to pose to you.


And the argument is, look, false positives are bad, like thinking that we've found something cool in our field that isn't actually there and it's just an artifact of a badly done study plus confirmation bias, et cetera. That's bad. We don't want to go down a bunch of blind alleys. But false negatives are even worse. We don't want to fail to discover real phenomena. And so maybe there's a tradeoff where if we increase the standards of rigor, like we make it harder to publish things, we increase the standard of evidence.


Basically, we're reducing false positives. But at the same time, we're maybe increasing false negatives, and that's a bad tradeoff. This is an argument some people have made to me. I'm not sure exactly how I feel about it, but what do you think? Do you think that there's a real tradeoff there?


I think in principle there could be if we were doing everything right. But I think in reality, false negatives aren't an issue. I actually have a blog post called Why I'm Not That Worried About False Negatives, and I lay out some reasons why. Given the way we're doing things, at least in psychology, but I don't think it's specific to psychology, there's actually very little risk of people abandoning an actually true hypothesis because of a false negative. And one reason for that is that I think with p-hacking and researcher degrees of freedom, the kinds of practices that we've until recently allowed and even encouraged, you could turn almost anything into a significant result.


So if you have a hypothesis and you test it even just once or twice and you allow yourself to look at it from a lot of different angles, you're probably going to find something that you can interpret as evidence for your hypothesis. So if you're willing to do those things, you don't understand that they're harmful, and you're motivated to find something to support your hypothesis, so basically, if you're human, you probably will find evidence for your hypothesis, even if it's not true.
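
The point above, that looking at a null effect "from a lot of different angles" will usually turn up something significant, can be illustrated with a small simulation. This is a minimal sketch and not something from the episode. It assumes two groups of 20 drawn from the same distribution, a z-test with known unit variance, and independent outcome measures; the function names and all parameter values are hypothetical choices made for illustration.

```python
import random
from statistics import NormalDist, fmean

STANDARD_NORMAL = NormalDist()

def null_p_value(n_per_group, rng):
    """Two-sided z-test p-value comparing two groups with NO true difference."""
    a = [rng.gauss(0, 1) for _ in range(n_per_group)]
    b = [rng.gauss(0, 1) for _ in range(n_per_group)]
    # Both groups have known unit variance, so the standard error is sqrt(2/n).
    z = (fmean(a) - fmean(b)) / (2 / n_per_group) ** 0.5
    return 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))

def false_positive_rate(n_outcomes, n_per_group=20, n_studies=2000,
                        alpha=0.05, seed=0):
    """Share of null studies in which at least one outcome is 'significant'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        # Assumption: the researcher measures several independent outcomes
        # and reports whichever one happens to cross the threshold.
        p_values = [null_p_value(n_per_group, rng) for _ in range(n_outcomes)]
        if min(p_values) < alpha:
            hits += 1
    return hits / n_studies

if __name__ == "__main__":
    print("one preplanned outcome:", false_positive_rate(1))
    print("best of five outcomes: ", false_positive_rate(5))
```

With one preplanned outcome, the expected false positive rate is the nominal 5 percent. With the best of five independent outcomes it is 1 - 0.95^5, or about 23 percent, even though nothing real is there. Real researcher degrees of freedom (optional stopping, covariates, subgroup splits) are correlated rather than independent, so this is only a rough picture of the mechanism.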


So there's very little risk that if it is true, you're going to miss the evidence, unless we start getting a lot stricter. And so let's say we do. Let's say we crack down on p-hacking and researcher degrees of freedom and so on. I do think then we have to start worrying a little bit more about false negatives, which basically means we need to increase our sample sizes, which is something that those of us pushing for reforms are pushing for, too.


They go hand in hand. If we're going to stop p-hacking but keep our sample sizes the way they have been, then everything's going to be uninformative. So in order to not end up failing to pursue promising things, we need to increase our sample sizes. But there are other, more sociology-of-science reasons, I guess, why I'm not that worried about false negatives. One is that even now, even with the reforms, very few journals will actually publish null results.


So if you fail to get something, it might discourage you, but it won't discourage others, because others won't find out that you failed to get it. So it doesn't have the same ripple effects that false positives have. It also seems like, even if we did publish some negatives, like now we're publishing replication studies once in a while, still not very much, but some are getting out there. Many of them are null results, and some of them are probably false negatives.


And even those are not getting very much attention. So often, the original study continues to get way more attention than even a much more rigorous, preregistered, large-sample replication study. So it seems that the null results, first, tend not to get published. And second, when they do get published, people don't pay attention to them. The things that are going to make good headlines and be good clickbait are usually the significant results. The things that are going to make it into textbooks are usually the significant results.


So I don't think there's as much potential for a false negative to change a lot of people's minds and convince a lot of people that there's nothing there and it's not worth pursuing and so on. And this might be different in other fields. Like maybe when it comes to cancer treatments or things like that, maybe given the competitiveness and so on, people will use other people's negative results to avoid going down a dead end or something like that.


I don't know. I don't know what the culture is in those fields, but in psychology, I don't see very much evidence that, like everybody concludes, oh, well, there must be no effect because this one lab didn't get it. So let's abandon that altogether. Oh, interesting.


I'm now realizing that there are two different things I was conflating here when I talked about this tradeoff, I think. Tell me what you think. One mechanism by which reducing false positives could increase false negatives is the thing that you're describing, which is people publish results showing actually this effect isn't there, or we failed to find this thing, and that spreads and takes root. And maybe there was a real thing there. But the mechanism I think I was originally thinking of is that there is a real phenomenon.


And the study shows that the phenomenon was real, and the study has flaws. So it's an informative but flawed study. And whatever reasons the researchers had to think that the phenomenon was real, the evidence of the phenomenon itself just wasn't actually strong enough. Right. It didn't meet the standard of evidence of the journal. And so then we go back to our priors.


Right. So I think if the paper is so flawed that we're basically no better off after seeing the results than we were before, which I think many of us feel about some chunk of the published literature from the past, that basically we just don't know whether to put any stock in it, again, not the whole literature, but some pieces of it, then I think we definitely need to be careful that absence of evidence is not the same thing as evidence of absence.


So if I think that a study was flawed to the point where it doesn't add to my certainty in the effect, I shouldn't conclude there's no effect. I should go back to my baseline level of certainty in the effect, which is whatever my prior beliefs were before reading that study. I do think sometimes we make that mistake of thinking, if they had to p-hack to get this result, then it must not be real. But that's not true.


Right. The effect could be real, but the study was so poorly designed that they couldn't detect it without p-hacking, or they just p-hacked because that's what they were taught to do. And so it doesn't mean that they had to p-hack to get the effect, and so on. So, yeah, I think that's important. I'm not sure how big of a problem it is in reality. I do think I've seen that happen, people slipping into, if it was p-hacked, then the effect might not be real.


And some of that I think is legitimate, because I think our priors on some effects should probably be low, because we've been pushed more and more to study counter-intuitive things. So I think the plausibility of some of our hypotheses is low to begin with. So then if the study was poorly designed, then we go back to our prior, which is, that didn't sound likely before the evidence, so I still don't think it's likely. Right.


There's actually a bunch of things like this in the social sciences broadly, where I believe there's a real effect there, and there's a bunch of research showing that there's a real effect, but the research has nothing to do with my belief in the effect. Yeah, yeah, yeah. So just to get super controversial right away, one area where I think that might be the case, and I don't know the literature super well, but I hear people express skepticism about the studies.


And let's assume that skepticism is at least sometimes valid, is stereotype threat. But when you think about the more abstract version of it... Yeah, so the most abstract version of that literature, and I'm going to butcher this, but it's basically the idea that if there's a negative stereotype about a group that you're a member of, your gender, your ethnicity, something like that, being reminded of that stereotype is going to make you anxious. It's going to make you perform worse in that domain.


And I think on some level that must be true, and my prior on that is quite high. And then if there are some studies that are flawed or that used practices that were acceptable at the time, but that now I wouldn't consider strong, that doesn't really change my prior. But it doesn't increase my confidence either. Right.


I mean, another question I've been wondering about recently with regard to increasing rigor in the social sciences, or in science in general, is how much is it true that new research builds on past research? Well, let's just stick to your field of psychology. So, for example, in theory, I could just go run studies today on Mechanical Turk testing various hypotheses in psychology without, you know, being all that knowledgeable about the literature. And if my methodology isn't very good, then the results aren't going to be great.


But at least I'm not relying on previous work, trusting that the previous work was good and basing my investigation on that. And so it seems to me that the degree to which we really need increased rigor depends in part on how much we think psychology is kind of this pyramid, where new research is building on previous generations of research and just trusting that it's solid. And this is just a fact I don't know about the structure of the field that's related to whether or not we need rigor.


So in psychology, I think there's a common joke that people say, I don't remember who said it first, but it's a joke which is not really a joke, it's basically true, that in psychology, theories are like toothbrushes.


No self-respecting person would use somebody else's. And in fact, gross. Yes. That said, when I was an assistant professor trying to get tenure, everybody told me, and I think it was true, although I never actually got the word explicitly, but the rumor was that to get a grant from social psychology at NSF, which is like the main place that we get grants, you had to have your own theory. So I came up with my own, quote unquote, theory, which was really, really simple and obvious.


But you couldn't just build on someone else's theory. I was at a meeting once where a dean from Stanford said that they would not give someone tenure for incrementally improving somebody else's theory. And I think that's a common view about incremental work, and this is different in different subdisciplines of psych. I think in cognitive psych, there's more valuing of incremental work than in social and personality psych. I don't know for sure, but that's my impression.


Just a brief tangent.


I'm curious, do you think these professors, you know, who wouldn't give tenure to someone who didn't have their own theory, do you think they agree that incrementalism is valuable? That double-checking other people's work and modifying other people's work to make it better is valuable?


I think they think it's less valuable. So I think they think somebody needs to do it, but the less skilled people should do it, not the Stanford professors. That's my guess. I don't know. That's the only way I can reconcile it. So I see a lot of people talking about the importance of creativity and novelty and so on. But when you say, but isn't correction also important? They say, yeah, yeah, of course.


So then what I take away from that is they think, well, the really smart people are doing the creative, novel stuff. And then the people that can't do that, they can do the correction.


The second stringers, as some people would put it. Those are really unpleasant incentives, if we've just decided that the correction stuff is valuable but low status. Right, right.


And I mean, that's actually a rosy view. I think many people think it's not valuable, or they in principle think it's valuable, but when it actually gets done, they think, why are those people being so mean? With the specific instances of correction, when you see what it actually has to look and feel like, they don't like it. So I'm not even sure that it's valued even as a second-stringer kind of activity.


But I would say that even if that's the case and this is I mean, I have a pretty bleak view of this and maybe I'm too pessimistic. But even if it's the case that psychology is not as incremental and not as much of a pyramid as it maybe should be, or maybe that's appropriate for young science, I don't know. But in any case, even if that's true, I would say it's still really important that the past literature be relatively solid because we still use cumulative thinking.


So we use meta-analysis. We use it to decide what gets put in textbooks, who should get awards, etc. So things still accumulate, even if not theoretically, even if the theories don't necessarily build off each other. But I mean, I think meta-analysis is one example of where we've gotten into really big trouble. And there's kind of a phase I've seen myself and other people go through when you start questioning whether we're doing things right in science. One of the early phases is you start putting all your faith in meta-analysis, because you realize that single studies aren't very trustworthy, and that actually turns out to be worse.


So there's this later disillusionment where you realize that actually the cure is worse than the disease. And that's because, yeah, meta-analysis really assumes that the set of things being put into the analysis has more signal than noise. And it's not just noise, it's bias. Right. And so when you aggregate biased things together, the bias amplifies.
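
One way to read "the bias amplifies" is that pooling selectively published studies does not average the bias away. It only shrinks the uncertainty around the biased value. The following is a minimal sketch under strong assumptions that are not from the episode: the true effect is exactly zero, every study has 20 participants per group with known unit variance, only significant positive results get published, and the pooled estimate is a simple equal-weight average. All function names and parameter values are hypothetical.

```python
import random
from statistics import fmean

def published_effects(n_published, n_per_group=20, true_effect=0.0, seed=1):
    """Run null studies until n_published of them pass the publication filter."""
    rng = random.Random(seed)
    se = (2 / n_per_group) ** 0.5  # standard error of a mean difference, unit variance
    published = []
    while len(published) < n_published:
        treated = [rng.gauss(true_effect, 1) for _ in range(n_per_group)]
        control = [rng.gauss(0, 1) for _ in range(n_per_group)]
        diff = fmean(treated) - fmean(control)
        # Assumed file-drawer filter: only significant positive results appear.
        if diff / se > 1.96:
            published.append(diff)
    return published, se

def pool(effects, se):
    """Equal-weight pooled estimate and its standard error (all studies same size)."""
    pooled = fmean(effects)
    pooled_se = se / len(effects) ** 0.5
    return pooled, pooled_se

if __name__ == "__main__":
    for k in (10, 100):
        effects, se = published_effects(k)
        pooled, pooled_se = pool(effects, se)
        print(f"{k} published studies: pooled effect {pooled:.2f}, "
              f"standard error {pooled_se:.3f}")
```

In this setup every published estimate is above roughly 0.62 by construction, so the pooled estimate stays far from the true value of zero no matter how many studies are added, while its standard error keeps shrinking. The meta-analysis becomes more confident in the wrong answer. Real literatures are less extreme than this all-or-nothing filter, so the sketch shows the direction of the problem, not its size.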


I'll just try to clarify why I thought there was a relationship there, and you can still disagree with me, but just to make sure I was clear. In a world in which there's not this pyramid structure, someone who just decides, like, you know, darn it, I know that a lot of my peers are using shoddy methods and that's unfortunate, but I'm going to be a really good researcher, I'm going to use good methods. They can just do that and do good science and get trustworthy results in this non-pyramidal world.


But in a world where they have to trust other people's work, then they're kind of screwed even if they personally want to use good methods. Yeah, I think that's true. I think that it's easier to wipe the slate clean and kind of start over if things weren't building on each other to begin with or weren't really too much. On the other hand, if we had had a more pyramid structure, it's possible that we wouldn't have let the problem slide so much.


If many, many other people were building on your work, it's possible that they would have detected the problems at some point. Somebody would have come along and said, no, this way of doing it isn't good. And if I'm constrained to do it the same way as everyone else, then I'm going to critique the way that I'm being pushed to do it. So I think that it is easier going forward to just do things a different way. It doesn't disrupt some long chain of things.


On the other hand, it means that the older stuff is never going to get cleaned up. It just sits there. Kind of related to this point, you had this great argument a while ago about how the standards that people apply to replications, like when a team of scientists attempts to replicate someone else's study to double-check that the phenomenon they wanted to demonstrate was real, that the standards people apply to replications are really different from the standards that they apply to regular original papers.


What's going on there? Yeah, I mean, I think that actually it's a broader phenomenon. I didn't really make this connection until very recently, that it's maybe not so much about replications. It's just about findings we don't like, and maybe we'd do the same thing if it was an original study. So I've now been an author, a co-author, a collaborator on a number of papers where we had an idea, we planned a study or we found data sets that we could use to test it.


And we didn't find what we expected. We got a null result. And it's often really interesting to watch me and my collaborators go back and question whether the data really were a good way to test the hypothesis, whether the design was really adequate, and so on. And I've also seen this as an editor and reviewer. I had one case where the reviewer explicitly said, I thought the design was fine, but then I saw that the result was null.


And so I went back and looked at the design more closely to see what was wrong with it. And I found these problems. And in a way, that's great, right? But we should do that independent of the results. And it's really interesting to me to see how willing we are to throw our methods under the bus when we don't like the result. And I think in the case of replications, often the people who don't like the result are not necessarily the authors themselves, but maybe the authors of the original study or people who were fans of the original study.


And so they can nitpick the methods and so on. And again, my view is, let's do that. But let's do that to all studies, including the ones that find exciting things that we want to believe. But it's really fascinating to watch the level of critical thinking, how people step up their critical thinking when they don't want to believe the result. It's kind of nice to see that we're capable of it, clearly. There's a quote by, I think it was Tom Gilovich, who wrote How We Know What Isn't So.


He said, and I'm going to mangle the quote, but it was basically, when there's something that we want to believe, we ask ourselves, can I believe this? And when there's something we don't want to believe, we ask ourselves, must I believe this? Implying the difference. Yeah, yeah. I mean, I think, with the evidence we're using, I catch myself doing that and I see reviewers doing that. Yeah. Yeah. Like when we're evaluating science, we should always be asking, must I believe this? Or, I mean, maybe not quite that far, but like, would it be foolish to believe this, something like that.


Like, how much do I have to believe this? I don't know.


So I'm pretty familiar with the phenomenon of must I believe this. And it makes sense to me that people would be applying that to failed replications of studies that they liked or that they were invested in. But the phenomenon that I thought you were pointing out was different and kind of interesting in its own right. It was more like a status quo bias. Even if you have no investment in the original study or the failed replication, there's something that happens, I think.


Like, we accept something as true, and then the replication is sort of on a different plane. I'm not doing a good job of explaining.


Yeah, it reminds me of what Andrew Gelman calls the time-reversal heuristic, where he says, imagine that the replication had happened first and then the original. But that implies that it's about order, and I don't know that it's so much about order. I think it's about prestige in many ways. And I think the original often has a lot more prestige. Part of it is because it came first, and people think that replicators are less intellectually innovative because they're just copying something.


So it's less prestigious, as we talked about earlier, for that reason. Also, the people doing replications are very often less famous than the people who did the original work. Often replications are published in less prestigious journals than the original work, etc. So I think that there is a bias. Yeah, it ends up being a status quo bias because the original came first. But I think a lot of it has to do with people being skeptical of the ability and the motives of replicators, being skeptical of people they've never heard of, being skeptical of journals where replications often get published, things like that.


Is it your intuition that that's the more likely mechanism, or do you have any way to distinguish that from the status quo bias model? Um, I guess I would predict very strongly that if a null result got published in a lower-tier journal as an original study, and then someone came along and replicated it and said, no, see, there is an effect, and they were more famous and they got into a better journal, I don't think there would be status quo bias.


I don't think people would be like, no, no, no, we're going to stick with the original conclusion. So I think it's a bias towards significant rather than null findings, a bias towards more famous, more splashy people and journals and conclusions and so on. But I don't think that order is a big part of it.


That's kind of a nice segue into another question I wanted to ask you, which is, you've made the case in several venues that the scientific community is too reliant on eminence as a marker of quality. Like older, you know, prestigious scientists from more prestigious institutions, et cetera, they're more likely to get attention and awards and publications, et cetera. How can we tell that eminence isn't just a marker of quality? Like, you know what?


Yeah, how do we falsify the null hypothesis that there isn't any bias toward eminence going on? It's just that talented people become eminent and then they get more attention and awards because they're talented?


Yeah, it's really hard, because part of the criticism of eminence is the criticism of metrics. A lot of us feel like the way people get recognition is partly by just having many, many papers that are cited many, many times and so on. And so if we're arguing that we shouldn't just be using those metrics, that we shouldn't decide who's successful just based on these numbers that can be gamed or could just not reflect quality.


Then how do we disprove that hypothesis you just described? Because you have to do it quantitatively to convince people. So you have to find other metrics. But every metric can be gamed or can reflect something other than what it's supposed to reflect. I mean, I think that sometimes this is what's behind wanting to test the validity, through really rigorous large-scale replications, of some of the most classic, most celebrated findings. You know, some people think that this is to take down famous people and to try to become famous by taking down a famous person yourself.


But I think part of it is to test this hypothesis that there's some calibration between the recognition and fame and attention that a finding or its authors get and how solid it is. And I think that's what's behind some of that drive to be like, well, let's see, let's take the most solid, most celebrated things that we teach our undergrads and so on. And if that's not solid, then the correlation between prestige and rigor can't be that high.


If the most prestigious things turn out not to be rigorous, I don't know how we would test the whole spectrum. But I think a pretty good way to start is to test the things at the top.


That makes a lot of sense to me, although it also makes it hard. Like, a lot of the discourse around how to respond to criticism or a failed replication of your work is like, look, this isn't a judgement on you as a scientist, right? This is just, you know, the process of science. We should be correcting each other's findings. It's not personal. Right.


But as you say, it also kind of is personal, just in the sense that it's sort of a referendum on whether you deserve your prestige. Yeah. I mean, I remember in the earlier days of the replicability crisis in psychology, which has been going on now for like six or seven years, so maybe two years ago, not that long ago, and I'm sure it could still happen today, I heard someone say that people shouldn't do replications because they're skeptical of a finding, that that's not a good reason to do a replication.


And I think that was a very widely held view at the time and maybe still is today. And I think that's crazy. Of course it's a legitimate reason to do a replication. And, yes, that introduces bias. But original authors are biased in the sense that they hope that there is an effect. So someone thinking that there's not an effect, that doesn't make them more biased than someone who thinks there is.


So we just need mechanisms in place to rein in that bias, like preregistration and transparency and so on. So in that sense, it can be personal, although I still think that case is more about the effect. So when I'm talking about testing the most prestigious findings, I'm talking about the findings still, rather than the people. But obviously those are very hard to separate. And another kind of paradox of the replicability debate is that, you know, the critics are asked to not target people, to not name names.


Why do you even have to use the reference with the authors' names? Just talk about the effect. But then when we're giving praise, when we're naming an effect or citing something as supporting whatever in a positive way, we do use the names. And it would be really weird not to. And so there's this double standard, that it's counted as personal if you say so-and-so's effect in a negative way, but people do want it to be called after their name when it's in a positive way.


So I think it is a little bit personal. I think we shouldn't deny that there's some aspect of choosing things that are held up in really high regard to see if those stand up. And I think that's a good way to test how deep the problem is. But, yeah, it's not a neutral way.


One of the many things I like about your blog, Sometimes I'm Wrong, is that you do address this kind of human side of improving rigor and correcting findings, which is something that I find lacking in a lot of discussions of the replication crisis that I otherwise agree with. People will say, look, you've got to take criticism. You've got to let people critique your work. You should be happy when people criticize you. The field is progressing. I agree with the spirit of that.


And I think the paramount virtue in science has to be criticism and transparency, even if it hurts people's feelings. But I still think people tend to be pretty glib about it. Like, I've really heard people say, oh, you know, I'm always happy when I get criticism, and I'm like, they're not. I just don't believe that. Or they're not very self-aware, which is a thing we should talk about. Do you have any suggestions for listeners on handling...


Actually, I was going to say handling fair criticism, but maybe I want to broaden it to handling any kind of criticism with aplomb.


Um, I don't know. I mean, it's funny, because I thought you were going to ask how to deliver criticism in a more sensitive way. You can answer that question. Start with that one. I mean, my answer to that is actually to try to remember a time when you were wrong and had to admit it. For me, I try to pay attention to those times, because I think you can learn a lot from them and have a lot more compassion when you're the one criticizing someone else.


I was recently at our annual conference, which is a big conference of social and personality psych, and I always room with my friend Alexa. I was flying from California, and the conference was in Atlanta, so there's a three-hour time difference. And she wanted to get up early to work out. And I was really grumpy about the fact that she wanted to set an alarm for six forty-five. And I gave her a really hard time and I was just a jerk about it.


And so then the next day I was like, oh man, I was such a jerk. It's totally reasonable for someone to want to wake up at six forty-five at a conference, that's just normal. And so I texted her, and then I found her and gave her a hug. But it was hard. Even though I was sure that I was in the wrong, I had to swallow something. It almost physically felt like I had to swallow something.


But it was kind of nice, even though it's not at all an intellectual thing. It was a nice reminder that even something silly and easy like that, like, she forgave me right away, it wasn't hard, even that was probably the most unpleasant thing that happened to me over a span of a couple of days. So, yeah, if you're asking someone to accept that they were wrong about something where there is something at stake, I think having some compassion is good.


But I do think if I had to say what's overdone, the lack of compassion and meanness in the criticism versus the attacking people for not criticizing nicely enough, I would say the latter is overdone more. There's just way too much complaining about the way people are criticizing, when I think the much bigger problem is that there is so much to criticize. That should worry us. But yeah, I think that if you're going to call someone out or point out an error, I'm not sure that there's any way to do it without it hurting the other person.


So that's not the right standard. But just remembering how much that sucks, even for little things, is important. And then in terms of how to respond, I mean: taking a beat. That's such an obvious answer, but, you know, as an editor, I've had more experiences now with people criticizing my decisions or criticizing things we publish or things like that. And even if I think I'm, like, totally calm, like I could respond right away and it would be totally fine.


It's almost always a good idea to wait a day. Like, the response I thought would sound totally calm right away, a day can do a lot for it. It's got passive aggression in it, and you're like, no, I'm not defensive. I'm really not. There is an entertaining footnote on your blog, which, by the way, listeners, is peppered with entertaining footnotes. This one said emotion suppression gets a bad rap. And I think I agree with this, but I wanted to ask you about it.


Yeah. So I'm not an emotion researcher, but my understanding of the emotion research in psychology is that there are different emotion regulation strategies. There's reappraisal, which is where you try to reframe what happened in a more positive way. And reappraisal, I think, is often considered the most adaptive way to regulate your emotions. And then suppression, which is like trying not to think about it, is considered a less adaptive way. And I think there are other emotion regulation strategies, but those are the two that come to mind.


And I always thought suppression gets such a bad rap. Like, for me, if something's bugging me and I can't do anything about it, I literally try to find something in my visual field and just mentally describe it, and I'll feel better. You know, I think trying to distract yourself often is a really good strategy. I mean, I think it depends. If it's something you need to respond to or do something about, then suppression might not be good.


And I mean, the other argument against suppression is that it makes it harder for others to know you. And I think that is probably true of me: the fact that I'm a fan of suppression is not unrelated to the fact that people find me hard to get to know. So it has a good intrapsychic effect, but maybe the interpersonal effects are not as good. I don't know.


I was actually always kind of suspicious of the, you know, "you need to process your emotions, you can't just suppress them," just because it sounded too pretty, or it felt like it fit into this general narrative that our culture happens to have at this point in history about emotion and authenticity and so on. And that didn't necessarily by itself mean that the claim is wrong, but it made me more suspicious of it. Yeah, I feel like there's this idea that if you suppress emotions, they're going to bounce back.


But that's not always true. Surely sometimes they just go away. Yeah, I mean, I think the picture is more of a spring or something: you press it down and it springs back. But yeah, yeah.


OK, well, I definitely want to make sure we have time to talk about your more sort of object-level research on self-awareness. So this is a good time to segue into that. A recent paper of yours had a pretty clever, novel way of measuring how well people know themselves. Can you tell us about that?


Sure. I think I know what you're talking about.


So multiple times during this... I'm thinking of the one with the recorders. Yeah. So this is something I saw when I was in grad school in the early 2000s. There was another lab in the same department, Jamie Pennebaker's lab, and his grad student, Matthias Mehl, had developed this technique called the electronically activated recorder, which at the time was just literally a tape recorder. And they used it to look at language, so what people talk about, especially during traumatic events and so on.


And it occurred to me that this might be a really cool tool to look at actual behavior. So one problem in personality research, which is my subdiscipline, is that we tend to rely on questionnaires, and what people say they're like in a questionnaire could be right or wrong. It's not so much that we've taken it at face value. We have tested the validity of people's self-reports and they are quite valid. But if you want to study where people might be wrong, where they might have blind spots, there wasn't really a great way to do that, because you could use self-report questionnaires or you could use peer reports, which is where you ask people's friends and family what they think.


But then if those two disagree, which they do a little bit, it's kind of a glass half empty, glass half full thing. So they agree substantially, but there are areas of disagreement, and there wasn't really a good way to resolve, well, who's right when they disagree? Right. Yeah.


Yeah, you know, my friends say I'm arrogant or something, but what do they know? Like, right, they're just jealous. Yeah, right. Exactly. And sometimes it's like, well, by definition, if your friends say you're not funny, then it doesn't matter what you say. That was actually the first example I thought of, and I was like, wait, no. So there are a few where, by definition, you're right.


So, like, your self-esteem: you're almost by definition right. I mean, you could be, I guess, really deeply deluded about your self-esteem. And then, like, how charming you are: others are by definition right. But there's a lot of stuff in the area where, like, if you say you're friendly and others say you're not or whatever, I mean, it can be ambiguous. I always kind of have it in my head that there's one that I count as more valid than the other.


But it would still be nice to have empirical evidence. And the problem with most behavioral measures that we had up to that point was that they had to be administered in the lab. People had to be pulled into the lab, put in a situation, and have their behavior videotaped. And there are a lot of downsides to that. For one thing, you might put people in a situation that they actually never would have chosen to be in.


And so then you're eliciting behavior that's not typical for them. So if you want to know what they're typically like, which is what we mean by personality, you need to catch them in their own natural environments. And this technique that Matthias Mehl and Jamie Pennebaker had developed, the electronically activated recorder, or EAR, would allow us to do that, at least for some behavior. So because it's an audio recorder, we only get behaviors that you can tell from audio recordings.


So we have our participants wear that for a period of time. Like, in the last study, we did it for six days, and it comes on and off. So it records 30-second snippets at regular intervals. And so then we had the self-report questionnaires, so we knew what they said they were like, we knew what their friends said they were like, and then we had this week of their actual behavior with their friends, with their classmates, et cetera.


And so we have a huge team of coders, undergraduate research assistants working in our lab. And we develop coding protocols and they listen to the recordings and code behavior from those recordings. So then if you say that you're really warm and your friends say you're not, or vice versa, we can listen to the recordings and try to adjudicate and say who's right or wrong. That's a kind of overly simplistic way of thinking about it, though, because the way I described it, we're treating the coders' ratings as the truth.


But they could be wrong, too, because they're only getting audio. They're not seeing your face, they're not seeing your movements, et cetera. They're only getting 30-second snippets, so they're missing some context. They're only getting five percent of the time. It's not recording all the time. So it adds noise. Right, it adds noise. And for some behaviors, we basically can't get them from the EAR. Some behaviors are either too rare or they're just not acoustically detectable.


So yeah, one of the things we're trying to do is figure out which behaviors you can reliably get from the EAR and which ones you can't.
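The sampling described above (30-second snippets at regular intervals, covering about five percent of the time) can be sketched in a few lines. This is only an illustration under stated assumptions, not the lab's actual recorder configuration: the ten-minute cycle is derived from the two figures mentioned in the conversation, and the `waking_hours` parameter and the function name are hypothetical.

```python
SNIPPET_SECONDS = 30       # snippet length mentioned in the conversation
RECORDED_FRACTION = 0.05   # "five percent of the time", as mentioned


def snippet_start_times(waking_hours: float = 16.0) -> list[int]:
    """Return snippet start times (seconds from the start of the day).

    Assumption: one snippet per fixed cycle, where the cycle length is
    chosen so that snippets cover RECORDED_FRACTION of the time.
    `waking_hours` is a hypothetical parameter, not from the source.
    """
    cycle = round(SNIPPET_SECONDS / RECORDED_FRACTION)  # 600 s, i.e. 10 minutes
    total_seconds = int(waking_hours * 3600)
    return list(range(0, total_seconds, cycle))


starts = snippet_start_times()
recorded_seconds = len(starts) * SNIPPET_SECONDS
print(len(starts), "snippets,", recorded_seconds / 60, "minutes of audio")
```

Under these assumptions a coder hears well under an hour of audio per participant per day, which is one way to see why rare behaviors are hard to pick up.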


Interesting. So what did you find about people's level of self knowledge?


Um, so it's, again, kind of a glass half empty, glass half full thing. So in one study we found that the self-reports and the friend reports were about equally accurate, but they were accurate about different behaviors. And then we kind of came up with an explanation, post hoc, for what kinds of behaviors the self is probably more accurate for. And that's behaviors that are more private, basically, which is not too surprising. And then what kinds of behaviors others are more accurate for.


And that's behaviors that are more public or overt, and also things that are more evaluative. So things that would be hard to admit about yourself: really desirable or undesirable things. And it's not actually what you'd think, because so often people jump from that to, like, oh, everybody loves themselves and thinks they're great and so on. And that's not actually the case. So overall, your friends' ratings of you tend to be more positive than your self-ratings.


And some of that is probably a little bit artificial. But the problem with the self-ratings is not that everybody loves themselves. The problem with the self-ratings is that there are individual differences in how much people love themselves, and that plays into how they rate their personality. So the higher people's self-esteem, the more they describe themselves in a positive way. Some of that is valid. There's probably a correlation between having a good personality and having high self-esteem, but some of it is just a positive self-view.


So like a halo effect. And so people with negative self-esteem actually underrate themselves.


And so then that messes up everything. Are there some traits that everyone, to varying degrees, tends to be positively biased about, and some traits that everyone tends to be negatively biased about?


There's individual variation on every trait. So on every trait, if you have people's self-reports, there's going to be a range. And that range, I think in every case that we've looked at, correlates with self-esteem. So people are more likely to overestimate in a positive direction the more they have high self-esteem, and more likely to underrate themselves the more they have low self-esteem. There are definitely mean differences. So there are some traits where people on average tend to rate themselves more positively, and that's the more desirable traits and also the ones that are kind of more vague and easier to define in whatever way you want.


So if I ask about your intelligence, people are going to tend to give a higher rating and it's going to be more influenced by their self esteem than if I ask how good are you at math or what's your verbal like? How good is your vocabulary? Those are going to be more accurate and less influenced by self-esteem.


And what have you found about the effects of one's level of self-knowledge, like how much their view of themselves correlates with other people's views or with the ratings?


That's the million-dollar question. Like, what we really want to know is who has more self-knowledge and what's different about those people. Do they have better relationships? Are they happier? Are they better off in some way? Do they make better career choices?


The problem is... we are interested in this because I have this strong suspicion that accuracy makes you better able to navigate the world. Yeah. And I want to believe that, but there are interesting challenges to that.


I think we just don't know. So at this point, we've measured accuracy for the whole sample, so we can come up with one number that tells you, on average, the correlation between self-reports and their actual behavior is, like, 0.4. So there's some accuracy, but it's not perfect and so on. But we haven't yet come up with a way to say, here is this person's level of accuracy and here is that person's level of accuracy. So we have accuracy for people as a whole, or for a sample as a whole.


But we haven't come up with a way to look at individual differences in self-knowledge or accuracy, which that's what you would need in order to correlate that that variable, like each person's level of self knowledge with outcomes like, you know, relationships and work and so on, there are ways to do it. I just don't think any of them are very good.
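To make the distinction above concrete, here is a minimal sketch of a sample-level accuracy number: a Pearson correlation computed across people between self-reports and coded behavior on one trait. The data and variable names are made up for illustration and are not from the study; the point is only that this calculation yields one number for the whole sample and nothing per person.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical data: one trait, six participants (not real study values).
self_report = [4, 2, 5, 3, 1, 4]      # what each person says they are like
coded_behavior = [3, 3, 4, 2, 2, 5]   # what coders rated from the recordings

# One number for the whole sample; no participant has an accuracy score.
sample_accuracy = pearson(self_report, coded_behavior)
print(round(sample_accuracy, 2))
```

Relating self-knowledge to outcomes such as relationship quality would need a separate accuracy score for each participant, which this calculation does not provide.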


So, am I misremembering? I thought you had a paper where you looked at relationship quality. Was that your own paper? Yes. So, you know, there are three authors on that paper and we disagree about how confident we are in the findings. And I don't really want to, like... OK, that's fine. But I mean, I think we would all say, look, it was a single study with 80 people and a high p-value.


So if it's true, we got kind of lucky, and going further, we would all want to see it replicated. I intended to replicate it. And then, stupidly, in my last EAR study... which, an EAR study takes many, many years. So for the one that we're currently analyzing, data collection started in 2012. We'll finish coding the EAR files maybe in two or three years. So that was a research assistant.


I can only imagine how much work that must be. Yeah, no, our research assistants really just work so hard. They're great, and it still takes forever. So, yeah. So I had planned to include all the variables we would need to replicate that effect that you referred to. But I stupidly didn't. Like, I thought I had. And then I went to start planning the analysis for when we have the data, and I was like, we didn't measure the right variables.


I have no idea why. And that's one of the problems with these studies that take forever: you go back and you're like, yeah, why didn't we measure that? And you don't know. Well, so I don't know if the next couple of questions can be answered, given the lack of individual measures of self-awareness, but I'll ask them anyway. Um, what's your impression of how deceived people really are? Like, to the extent that we have inaccurate views of our traits, do we really, deep down, believe those inaccurate views? Like, the argument that we don't would go something like, you know, I might insist I'm great at fighting because that makes me look good to other people.


And maybe in that moment that I'm insisting, I kind of believe it, because, you know, that makes it easier for me to be convincing to other people that I'm a good fighter. But if a fight actually breaks out, maybe I'm just going to run away, because deep down I actually know I'm not. Yeah, I think so. And so I picture my brain doing these two different things at different times, depending on what's appropriate.


There's the signaling system, and then there's the "actually make decisions that won't get you killed" system. Call it the navigating-the-world system. Yeah. And it uses different models of me depending on which system is appropriate.


Yeah, that's my intuition too. And that's one evolutionary theory about self-deception: that the function of self-deception is to deceive others, because you'll be better at convincing other people if you really believe what you're saying. So at least on some level, it helps if you believe it. We do have one study that gets at that. So we asked people to rate themselves on a bunch of characteristics. We recorded those ratings. They couldn't change them. Then we showed them the ratings again and said, do you think that you overestimated or underestimated, or is it just right?


And we found a lot of accuracy. So people who overestimated said they overestimated. People who underestimated said they underestimated. It turns out they were just using the heuristic that if I rated myself high, then I probably overestimated, and if I rated myself low, I probably underestimated. So it was not a sophisticated judgment, but it was accurate and it worked. Yeah. Yeah. So I think that supports this idea that on some level you believe it and on some level you don't. And I've definitely had that experience, and it's something I've talked a little bit with philosophers about. Like, isn't it possible to kind of believe something but also kind of not? They don't really like that. They're weirdly stubborn about that.
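One way to see why the "if I rated myself high, I probably overestimated" heuristic mentioned above can work is a small simulation. This is a sketch under assumptions that are not from the study: each self-rating is modeled as a true level plus independent rating error, both drawn from a standard normal distribution, and the function name is hypothetical. Under that model, high self-ratings are disproportionately ones where the error happened to be positive.

```python
import random


def share_overestimated_among_high_raters(n: int = 10_000, seed: int = 0) -> float:
    """Among simulated people with above-average self-ratings, return the
    share whose self-rating exceeds their true level.

    Assumed model (illustrative only): self_rating = true_level + error,
    with true_level and error independent standard normal draws.
    """
    rng = random.Random(seed)
    high_raters = 0
    high_and_overestimated = 0
    for _ in range(n):
        true_level = rng.gauss(0.0, 1.0)
        error = rng.gauss(0.0, 1.0)
        self_rating = true_level + error
        if self_rating > 0:          # rated themselves above average
            high_raters += 1
            if error > 0:            # the rating overshot the true level
                high_and_overestimated += 1
    return high_and_overestimated / high_raters


print(share_overestimated_among_high_raters())
```

In this toy model the share comes out well above one half, so guessing "I overestimated" after a high self-rating is right more often than not without any deeper insight. Whether this is what drove the participants' accuracy is not something the conversation establishes.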


It seems obvious, just from having a brain, that this is a real phenomenon. But yeah. Yeah, and there are books written about, like, the paradox of why people do things that aren't in their self-interest.


Right. Different selves. And who was it, was it W.H. Auden or Walt Whitman? I can never remember which one, but at some point: "Do I contradict myself? Very well then I contradict myself. I contain multitudes." Exactly. And I relate to that all the time. I feel like I contradict myself all the time, including about myself. Right. And I think that that's adaptive to some extent. Right. Yeah. It's nice to be able to bring up different self-views depending on what's going to be functional and what context I'm in.


And maybe one way to think about that is in terms of confidence intervals. Like, maybe for any given characteristic, I have a confidence interval in my head. You know, maybe for some characteristics it's really narrow, like I'm pretty sure that I'm around the 80th percentile. And for other characteristics, I think my best guess is the 80th percentile, but it could be anywhere from the 20th to the 99th or something like that.


There's also, just to complicate the picture even more: I might acknowledge and genuinely believe that I'm probably wrong about, this is low, but let's just say 20 percent of the things that I believe. I can genuinely believe that. But if you point to any specific thing that I believe, I feel much more than 80 percent confident in that thing. Yeah. And this goes back to the replicability issue. Like, let's say that we think that at least 30 or 40 percent of our published studies are false, which I think shouldn't be too controversial. Let's say 30 percent.


I don't think that should be controversial, but we don't know which 30 percent. And so some of us take that to mean, let's be skeptical of everything, and other people take that to mean, I'm not going to worry unless I have a reason to doubt a specific paper. And yeah, I think when it comes to our own beliefs, we're more like the latter. Like, I'm not going to worry about which ones are wrong unless I get feedback that tells me I'm wrong.


I'm going to assume I'm in that 80 percent. Can I run my own pet theory about self-knowledge by you? Sure. My pet theory, and this is specifically about whether self-knowledge, or accurate knowledge about yourself, is helpful, is that there's an interaction effect. So it's helpful if you have some other trait or important cluster of traits. And an example of a trait I could imagine filling that role is low neuroticism. So accurate knowledge of yourself is helpful.


If you are sort of able to not freak out over the flaws that you see in yourself and not dwell on them and like you react constructively to them.


So if you're good at emotion suppression. But, you know, if you have high neuroticism, then maybe you're better off just, like, not fixing your flaws and, yeah, going on.


Yeah, I think there are a number of what we call moderators of when self-knowledge is good, and one of them has to be whether you can do anything about it. But "do anything about it" is kind of a broad category, because I have plenty of characteristics that I know I can't do anything about. But I actually wore the EAR and listened to myself, and I learned how, like, flat I am. And I didn't know that about myself. I mean, people had told me, but listening, I just don't express very much.


I'm not emotionally expressive. I sound uninterested in things. I think I've gotten a little bit better, but not much. And so listening to myself, I realized, like, I come across as if I don't like anyone and I don't like anything they're saying and I'm not interested in anything. I also say much less than I thought I did. So I would listen to these conversations and I'd remember the thoughts I had and be like, I'm sure I expressed that thought. And I would wait, and I never expressed anything.


And so I can't really change that. Or I mean, maybe this is also me being a personality psychologist about it. I'm pretty pessimistic about people's ability to change. But now that I know that I come across that way, I will sometimes go out of my way to tell people like you probably think I don't like you, but actually I do. Well, I have a very good friend who who is like that. For years, he never really smiled.


I always was kind of confused about why he kept wanting to hang out with me, because he always seemed miserable when he did hang out with me. And he is very low in, well, I don't know if you would call this neuroticism, but he's very good at sort of acting on the knowledge about his flaws. When he found out that he was giving people that impression, he just learned how to smile and made that a habit. And now he gives off this very warm vibe.


That's kind of funny. That's impressive. There's hope. Yeah. I felt like you were very engaged in this conversation.


So I think I've gotten better, and having a podcast myself helps a little bit, too, because I have to force myself to sound more expressive. And the thing is, it's not that I don't feel it, it's that I assumed it was coming across when it wasn't. And so I still find it hard to do in person, but I've learned to do it more in writing. Like, I'll follow up with an email and be like, I really enjoyed our conversation or whatever. So I'm a little bit better at that.


So I think that it's hard to think of a case where there's nothing you can do with the self-knowledge. But I think there are some cases where maybe it does more harm than good. And I think neuroticism is probably a good predictor of what might make that difference, but also just other resources. There's more you can do to fix your flaws the more resources you have, whether that's low neuroticism or money or time or people to support you or things like that.


Well, Simine, before I let you go, I wanted to ask you for your pick for this episode. And for you, I think my question is: is there a book or article or blog post or something else you've consumed over the course of your career that you don't agree with, but that you nevertheless think is a valuable thing to read or, you know, well argued or worth engaging with in some way?




So what comes to mind is actually something I did when I was thinking about self-knowledge a lot. In the early days of my research, I taught a class on self-knowledge. And one of the things we talked a lot about was differences between how people see themselves and how others see them. And actually, I didn't do this in my class, but I had the idea of eventually developing a seminar where all we did was read autobiographies and then biographies of the same people.


So interesting. Yes, I never got around to doing it with students, but I started it on my own. And one of the first ones I did, I don't even remember how far I got beyond this, but the one I remember doing was Clarence Thomas. And so I read his autobiography and then I read a biography of him, and it was really fascinating. So I would recommend that in particular, I guess. I think that's an interesting person to see through their own eyes and then through someone else's eyes.


And I think I read a pretty critical biography. But in general, I guess I would recommend, when there are people where you really don't understand where they're coming from, read autobiographies. That seems like a really unique opportunity. And obviously, especially with politicians or people involved in the political arena, there's probably a lot of filter. But I still think it can be really fascinating to hear what they want the world to see about themselves versus what a biographer would write.


Do you remember the autobiography and the biography of Clarence Thomas, so that we can link to them? I don't remember, but I can... yeah, I'll try to find them. I mean, I think the autobiography, there's probably only one. I'll try to find, yeah, that and the other one.


Well, Simine, thank you so much for coming on the show. It's been a pleasure. Thanks. It was a lot of fun. This concludes another episode of Rationally Speaking. Join us next time for more explorations on the borderlands between reason and nonsense.