[00:00:00]

This episode of Rationally Speaking is brought to you by Stripe. Stripe builds economic infrastructure for the Internet. Their tools help online businesses with everything from incorporation and getting started, to handling marketplace payments, to preventing fraud. Stripe's culture puts a special emphasis on rigorous thinking and intellectual curiosity, so if you enjoy podcasts like this one and you're interested in what Stripe does, I'd recommend you check them out. They're always hiring. Learn more at stripe.com.

[00:00:43]

Welcome to Rationally Speaking, the podcast where we explore the borderlands between reason and nonsense. I'm your host, Julia Galef, and I'm here today with Christopher Chabris.

[00:00:54]

Chris is a cognitive psychologist and professor at Geisinger Health System in Pennsylvania. He writes about social science for publications like The New York Times and The Wall Street Journal. And he's the author of the book The Invisible Gorilla: How Our Intuitions Deceive Us. Chris, welcome to Rationally Speaking.

[00:01:09]

Thanks for having me. Great to talk to you.

[00:01:12]

I have a bunch of things lined up that I want to ask you about. Maybe let's start with some of your recent research on collective intelligence. Can you tell us how you define collective intelligence, and how do we know it's a thing and that it matters?

[00:01:27]

Sure. So this work I'm going to talk about is all done in collaboration with Tom Malone from MIT and Anita Woolley from Carnegie Mellon. I should say that right off the bat. And the second thing I should say is that, as a researcher, I try not to get hung up on defining concepts, because I find that defining them precisely, even though that's generally a good idea, can often get us hung up on whether we agree on the definition and distract us from the empirical phenomenon.

[00:01:58]

So I'm going to define it a little bit by describing an empirical phenomenon. And that phenomenon comes from studies of individuals. So we have the concept of intelligence, the psychological concept of intelligence as a measurable thing about people, because when you give a bunch of people a bunch of different cognitive tasks, it just turns out empirically that, for whatever reason, people who do well on one of the tasks also tend to do well on the other tasks. They're not perfectly correlated.

[00:02:31]

So it's not as though the person who gets the highest score on task one necessarily gets the highest score on all the other tasks, and so on. But there's a general tendency for performance on different kinds of cognitive tests to be positively correlated. And we call that inferred capacity, the one that leads people to do well on a variety of tasks, intelligence. So in our research, we basically just tried to take that simple concept of intelligence, the way it works with individuals — some people being, colloquially speaking, smarter than others — and apply it to small groups or teams, that is, groups of two, three, five, or six people working together to achieve common goals.

[00:03:13]

And it turns out, as we hypothesized, some teams just seem to be generally smarter than others. They generally do better on different kinds of tasks: teams that do well on one kind of task tend to do well on other kinds of tasks also. Sort of a spoiler — that's what we found in our research, and we can go into more detail now. And that's really the phenomenon of collective intelligence. It's some capacity that is reflected in the fact that some teams tend to do better than other teams on a wide variety of tasks that they might have to perform.
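As a very rough illustration of the kind of analysis being described — invented data, not the study's actual code — positively correlated team scores across several tasks can be summarized by a first general factor, analogous to g for individuals:

```python
import numpy as np

# Hypothetical data: rows are teams, columns are standardized scores on
# three tasks (e.g., puzzles, brainstorming, typing coordination).
rng = np.random.default_rng(0)
n_teams = 40
latent_c = rng.normal(size=n_teams)  # a latent "collective intelligence"
scores = np.column_stack([
    0.7 * latent_c + rng.normal(scale=0.7, size=n_teams) for _ in range(3)
])

# 1. The raw phenomenon: task scores are positively correlated across teams.
corr = np.corrcoef(scores, rowvar=False)
print(corr.round(2))

# 2. The first principal component of that correlation matrix plays the role
#    of a general "c" factor; each team gets a single collective score.
eigenvalues, eigenvectors = np.linalg.eigh(corr)   # eigenvalues in ascending order
first_component = eigenvectors[:, -1]
team_c_scores = scores @ first_component
print("share of variance on the first factor:",
      round(eigenvalues[-1] / eigenvalues.sum(), 2))
```

(The 0.7 loadings and the principal-component shortcut are assumptions made for this sketch; the actual studies used proper factor analysis on real task batteries.)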

[00:03:44]

And what is the scope of tasks that we think collective intelligence predicts performance on? Because with IQ, you know, it predicts performance on a lot of things, but certainly not all things — or it predicts performance differentially on different tasks.

[00:03:58]

So what have you looked at, and what's your sense of what it might not predict?

[00:04:05]

So you're absolutely right about individual intelligence, or IQ. It's more important for some things and less important for others. And it is kind of hard to find something that individuals do, that has a measurably good and bad side to the continuum, that is not related somehow to IQ. It could be related very tenuously to IQ. So, for example, being good at recognizing faces is pretty unrelated to IQ, at least as far as we can tell.

[00:04:33]

And there've been several studies on this. When it comes to collective intelligence, the picture is kind of similar. Actually, we found that a couple of the tasks that really are the best measures of collective intelligence are solving abstract puzzles. That is, let's say three people sit around a table and they literally get one piece of paper with a matrix reasoning puzzle on it, say, which is a common kind of IQ test item. It's abstract, it's non-verbal.

[00:05:00]

It's just a bunch of lines and shapes. It turns out that groups that do well on that kind of task tend to do well on the others. Another one is a test that measures sort of the speed and coordination of the group members. This is my favorite one, because it's kind of the funniest in some ways. We gave them printouts of a Wikipedia article, and they had to type as much of it as they could into a shared Google doc in a limited amount of time, without duplicating or leaving gaps or making typos and mistakes and so on.

[00:05:33]

So it's kind of a speed, accuracy, and especially coordination task for team members, which doesn't really seem very intellectual on its surface. But just as the speed with which you can, let's say, respond to blinking lights is correlated with individual intelligence, this kind of speed and coordination task seems to be an indicator of collective intelligence. The farther you get away from those kinds of things, the lower the relationship seems to be with a general collective intelligence factor.

[00:06:04]

So, you know, moral reasoning might have less relationship than abstract, logical reasoning, let's say. How do you measure performance on moral reasoning?

[00:06:15]

So that's, yeah, that's a difficult one also, because there's not necessarily a correct judgment. So that's more of a process measurement — like how many things groups take into account when they discuss their decisions, and so on. We don't know what's in the individual members' minds, but we can keep track of what they mention and what they're discussing and so on. Got it. Brainstorming is another one we use, which is also a common thing that groups do together, even though it's not necessarily the best way for groups to generate good ideas.

[00:06:49]

But it's often done. And again, there you have a little bit of a problem of measuring the outcomes. So it could be that a group that generates only one idea still comes up with a great idea. But the usual ways of measuring the output are the number of different ideas and things like that. Right. Now, for IQ — I'm far from an expert on IQ —

[00:07:11]

but my impression is that we think we have some understanding of some of the underlying mechanisms that would make someone good at this whole wide variety of things. For example, I think working memory is probably part of IQ, but that statement may not be correct. But I'm wondering what the equivalent would be for collective intelligence. What is the theory for why — like, what is causing groups to perform well or poorly on this wide variety of tasks?

[00:07:41]

So we have done some work on that from the very beginning, and I guess I should say that it's mainly correlational work, as is most of the work on what causes IQ, because it's hard to randomly assign people to conditions that actually make them smarter or less smart — to do randomized experiments. Although the evidence so far suggests you're right that things like working memory and processing speed and even brain volume are all related to IQ. For collective intelligence, I think the story is in many ways much more interesting, because since we're talking about a capacity of a group of people, it's always possible to try to decide who should be in the group as a way of increasing or decreasing the collective intelligence of the group.

[00:08:24]

And it's also possible to arrange different kinds of environments or systems for the group to use for interaction, which you can't really do with the different parts of your own brain. Right. It's hard to tinker with those and engineer them. But since we have the parts of the group, in essence, right in front of us, we can start to play with that kind of stuff. Correlationally, we found in our initial studies that one important thing for collective intelligence was how well the group took turns.

[00:08:57]

So we recorded every interaction that all the groups in our studies had, and we quantified things like how evenly distributed the amount of speaking was. So if one person did most of the talking, that seemed to be bad for the group. If each person spoke about an equal number of times, that seemed to be better — those groups tended to score higher on the tests. Another one was having group members who score higher on tests of social intelligence. And we used a very common measure of the capacity called theory of mind.

[00:09:29]

And the test we use is called the Reading the Mind in the Eyes test. So it's kind of an advanced test of social perception — detecting complex mental states in people just by looking at their eyes — and having members of your team who score higher on that test seems to be associated with having a more intelligent team. Again, it's hard to say whether that's causal, because while we sort of randomly assigned people to be on teams, we just measure these capacities and then do correlations after the fact.

[00:10:01]

And the third thing in our initial studies that popped out was having more women on a team — teams with more women tended to score higher. And we've replicated these findings several times now. So even though they were first published in our initial studies, our group, including some studies that I was not part of and some that I was, has replicated those basic effects more than once.

[00:10:26]

On the first of those — are you talking about, like, literally a monotonic relationship between the number or percentage of women in the group and performance? Or is it that you want, you know, at least half women? Like, if you had all women, would that be better than half?

[00:10:44]

So, yeah. So it's interesting. Sometimes people interpret the results that we published as evidence in favor of diversity.

[00:10:51]

That's probably because they're thinking of a baseline of mostly men. So they think more women equals more diverse, which it does from that baseline.

[00:10:58]

Yes — if your baseline is zero women, then adding women would be great, as far as what our results say about collective intelligence. But in the statistical analysis we actually found a significant linear relationship, meaning the more women, the higher the collective intelligence, but no quadratic relationship. So although if you look at our graphs — as I've looked at them many times myself — it does look like there's a slight drop when you get to 100 percent women, it's not clear that that's statistically significant.

[00:11:24]

Maybe with a lot more data it would be, and it would turn out to be a benefit to have a mix. But certainly we don't have evidence that, let's say, a 50-50 balance is best. One thing we do have some evidence for, though — although this is not necessarily universally correct, and we could get into the weeds about the characteristics of the tests we use and so on — is that in our studies, some of the effect of having more women on your team seemed to be due to women also being higher in social intelligence, at least according to our measures of social intelligence.
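To make the linear-versus-quadratic point concrete, here is a minimal sketch — with invented numbers, not the study's data — of the kind of regression that distinguishes "more women is monotonically better" from "a mixed team is best":

```python
import numpy as np
import statsmodels.api as sm

# Invented example: proportion of women on each team and the team's
# collective-intelligence score, generated with a purely linear effect.
rng = np.random.default_rng(1)
prop_women = rng.uniform(0, 1, size=80)
ci_score = 0.5 * prop_women + rng.normal(scale=0.3, size=80)

# Fit CI ~ proportion + proportion^2. A significant linear coefficient with a
# non-significant quadratic coefficient is the pattern described here; a
# negative, significant quadratic term would instead suggest a mid-range peak.
X = sm.add_constant(np.column_stack([prop_women, prop_women ** 2]))
model = sm.OLS(ci_score, X).fit()
print(model.params.round(3))   # intercept, linear, quadratic coefficients
print(model.pvalues.round(3))
```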

[00:12:01]

So it's not necessarily purely women per se. It could be that having more people who are more socially intelligent would also be a benefit, and that just goes along with a slight sex difference between men and women on that social intelligence measure. How does the effectiveness or the importance of social sensitivity trade off against things like individual members' skill at the tasks that the group is working on? Like, if you were putting a team together and you had to choose between people who are really high in social sensitivity but just average in their skill at the task, versus people who are average in social sensitivity but 90th percentile in skill at the task — which is better?

[00:12:41]

That's a really good question, and I wish I could say that we had better data on that. One thing we tried to do in this research was to study, from the beginning, a fairly wide array of tasks, in a deliberate attempt to capture what's common to all of those tasks. We are far from the first people to study group performance, or what the characteristics of effective groups are, or what things might make groups more effective. The novelty was having each group do a bunch of different tasks and looking for the commonalities among them.

[00:13:16]

So in order to do that, we had to pick tasks that did not really require specialized expertise, because otherwise we might not even be getting group performance — we might just have one person doing everything and everybody else sitting back and watching. And that's not the kind of group interaction we wanted to simulate. Of course, that would be a good strategy in some cases. Right. Like, if one of you is a surgeon and the rest of you aren't, well, that guy should do the surgery and the other people should watch, rather than everybody sticking their hands in the patient's body and messing around.

[00:13:45]

And what goes on in the real world exists on that continuum. Now, previous research — not by us, but really pretty good quality research, I think — has found that often groups do not access the expertise and the ability and the knowledge of the most expert people in the group; that factors like personality factors and social factors, people expressing confidence, people being the first to speak, and things like that can often override substantial differences in expertise or knowledge; and that people also often conceal — maybe not deliberately, but they sort of fail to surface — their own special knowledge and expertise about the things the groups are working on together.

[00:14:29]

So we think that collective intelligence is a phenomenon at the group level, which is quite influenced by these kinds of social interactions. And even when you have experts — people who are clearly superior performers — they might not get to express all of that in the group. The group's efforts might be even worse than the best individual's.

[00:14:51]

So the groups that you were studying were, I think, basically strangers working together for the first time. Do you think that the effect of social sensitivity might wash out once group members get to know each other and are sort of better able to read cues about, like, who has a thing to say but isn't talking because the conversation is too chaotic for them?

[00:15:12]

Or, you know, who do I expect would have good input here?

[00:15:16]

Because we've worked together a bunch of times, et cetera. How much do you expect this to hold up over repeated work?

[00:15:22]

Yeah, I think what you describe is kind of the ideal of how we would like groups to evolve over time — that ideally, people should start to pick up on those things, and they should arrive at patterns of interaction that, you know, maybe aren't optimal, but improve their ability to use individuals' expertise and knowledge and so on. We haven't really done a lot of long-term studies of groups, but we do have some. There was a study that I wasn't involved in, but that I really liked, that a bunch of my colleagues did, which was a study of the Riot Games game —

[00:16:06]

That's League of Legends. Yeah, yeah. So League of Legends teams — with the collaboration of the company, which I heard was wonderful to work with — took our collective intelligence test, which had nothing to do with expertise in League of Legends or anything like that. And teams with higher ratings in the game and higher levels of achievement did better on our test, and also did better in the future. So it wasn't just retrospective; they also continued to perform well in the future.

[00:16:40]

So one might imagine that being really good at League of Legends is kind of like a very specialized thing, and you learn a lot about your teammates and so on. But it still seems to make a difference to have a team that does well on this sort of generic collective intelligence test.

[00:16:52]

So that sounds very plausible to me. But the thing that's surprising about it is, if I'm understanding how gaming works, it's remote. So people don't have the ability to read social cues off of each other's faces, which is what I thought social sensitivity was capturing. Why would that still hold?

[00:17:11]

Yeah, that's another great question. I'm sure you're tired of hearing "great question."

[00:17:15]

No, no. So in our original studies, we used this Mind in the Eyes test, which we initially interpreted in a sort of narrow way, as perhaps just a test of what it seems like on its face, so to speak — the ability to read subtle cues from facial expressions. And therefore you would wonder, if we can't see the other people in our group, what difference does being good at that make?

[00:17:44]

And we did a study to test exactly this. In this study, people came to the lab and were put on a team, and then the team was randomly assigned either to sit so that they were all facing each other and could see each other's faces, or to sit in cubicles facing the wall, not even really knowing which other people in the room were on the same team as them.

[00:18:14]

And then they did the collective intelligence battery in an online form, so that groups in either condition — face to face or cubicles — did the exact same battery.

[00:18:23]

And there was a chat room that recorded everything they typed. And in either case — purely online, or online plus being able to see each other's faces — the Mind in the Eyes test was still correlated with the collective intelligence of the group. And as a reminder, this is the Mind in the Eyes performance of the individual. So every individual does the test by themselves, we average the scores of the team members, and that average score is still positively correlated with the team's collective intelligence, even when they never look at each other during the collective intelligence test and don't even know who in the room is on their team.

[00:19:01]

So it seems to be measuring something deeper than just perceiving facial expressions, even though that's the medium that it uses for the test items. It seems to be measuring some deeper capacity of social intelligence — theory of mind ability, the ability to understand and represent what other people are thinking, what they know, what their emotions are — which may come through in text or in other subtle behaviors that get expressed online. Right.

[00:19:26]

Going back to the percentage-of-women factor — are you aware of any correlational studies of real-world teams and whether teams with more women tend to perform better? Like, for example, has anyone looked at startups that were founded by more than one person — two or more people? You could measure the percentage of women in the founding team, and then you could look at the probability of being profitable five years later, or something like that. Is there any kind of real-world correlation that would back up the experimental finding?

[00:20:00]

So there are some studies that I'm aware of. And I should say that the startup study that you mentioned was an idea that we also had a few years ago, that I wanted to pursue. And I thought the ideal environment for doing that kind of study was to look at a bunch of founding teams that were all at the same stage of starting up — say, everybody who applied to Y Combinator and all the other incubators and was accepted into a batch — and then you can follow how they go along and so on.

[00:20:28]

So I contacted Paul Graham and other people and tried to drum up some interest in having everybody do, like, a forty-five-minute collective intelligence test at the beginning of their incubator time, and then we'd just sort of passively gather information on what happens to their companies over the next months or year and so on. I never got that off the ground. I think it's a good idea. I'm not sure startups really want to be studied, or that the people who are founding the startups want to be participating in studies, as opposed to inventing stuff and marketing it and so on.

[00:21:00]

But I like your idea of looking at observable characteristics, like the number of women and other factors. There is some data from studies of boards of directors, where there's some argument that companies whose boards of directors have more women do better. And I would like to think that that's because the collective intelligence of the board is increased, and therefore whatever influence the board has on the company generates positive results, and so on.

[00:21:38]

But I think all those connections are probably a bit tenuous, and the causality could actually be reversed. It could be that companies that are doing really well can, in a sense, afford to attend to questions of diversity and representation and so on, which struggling companies may just not have the attention or other capacity to pay attention to. Right. There's one other study, by the way, that I'll just mention very briefly, that I find a little bit more convincing.

[00:22:02]

There's a guy at Harvard Business School whose name escapes me right now, because I don't have the book in front of me. But he did a study of equity research analysts on Wall Street in the 90s. These are the people who analyze companies and say buy, hold, or sell, and set price targets and things like that. And their performance is measured — in sort of a bit of a fuzzy metric — by how highly their customers rate them.

[00:22:30]

But still, it turned out that when these analysts got very highly rated, they tended to be poached — competing banks would hire them. And when that happened, their performance tended to go down. But if they were women, their performance recovered faster than if they were men.

[00:22:51]

And one interpretation of that is that what you're really measuring here is not the performance of this one person, but of the entire team that they're part of. And so there could be some effect of women adding more to team collective intelligence, or something that leads to a better outcome. But again, these are correlational studies. The data is not as good as we would want.

[00:23:18]

So maybe someday those startups will want us to study them. I want to raise a general concern that I have about studies that find that women are better at something than men. And my general concern is, I worry that the opposite result would be unlikely to be published. Like, if a researcher did a study that seemed to show, "oh, hey, when you add women to a team, the team does worse" — is that going to get published?

[00:23:45]

It just seems so inflammatory that my suspicion is that either a journal would be reluctant to publish it, or would subject it to much more stringent standards to make sure it's a real result, you know, to avoid publishing a false, inflammatory study. Which — stringent standards are good.

[00:24:05]

But if you're applying them unequally, then that affects the ratio of findings that you end up seeing. Or maybe the researcher himself or herself wouldn't try to pursue that finding because of the potential fallout.

[00:24:16]

So I just don't know how to interpret the findings that I see published and shared that show that women are better at a thing, because I don't know what the denominator is. I don't know what the other potential studies found, or would have found if they had been allowed to be conducted. How do you think about this? Do you think my concern is a real one?

[00:24:39]

Well, it's a very sensible concern. We should always be concerned about publication biases of all kinds. Right. And there are so many filters that occur between the conceptualization of a study and not only what gets published in journals, but maybe especially what gets publicized after it's in journals or conferences or something like that. So you're even more likely to hear about some kinds of studies than they are likely to get published in journals, and so on.

[00:25:05]

So I think that's always sensible. I mean, going back in time, you could say that probably there was a time in the past when the opposite publication bias existed — where something in your data showing that women were better than men at something might have been less likely to get published.

[00:25:19]

So, yeah, no, that makes a lot of sense to me. But it doesn't get rid of the concern that any research about which gender performs better than the other is hard to interpret, because, you know, the denominator is especially unknown.

[00:25:34]

Yeah, it's harder to interpret. I agree with that. And we never know what was not published — it's very hard to know what was not published, and even more so what's not being done, which hypotheses are not being tested. I can say in our case that we didn't have a hypothesis about that from the outset. You did or didn't? We did not. I did not.

[00:25:58]

So it was kind of surprising. And it replicates — at least in our data, that replicates. It's not really a huge effect. I mean, it's not the biggest effect you would ever see, but it does tend to replicate. But I share your general concern that the social desirability of the research outcome, to whoever is the gatekeeper somewhere along the process, can definitely affect that.

[00:26:29]

And I would be somewhat concerned. I think, you know, one solution is increasingly open data. So there are more and more large data sets being made open and available, so people can look at sex differences and other differences in the original data. And if articles about sex differences in those data have not been published, people can download the data for themselves and look at it and start to point that out.

[00:27:01]

I'm not sure of a really good universal mechanism for fixing that, I should say. I've looked at sex differences in other contexts, such as spatial reasoning — like performing mental rotation tasks — and found both the usual result of better performance by men on some of those tasks, but also some indications of why that might be that don't necessarily have to do with sort of absolute better performance in spatial cognition.

[00:27:39]

I mean, we're getting into a little bit of the weeds of some of this stuff. But I think it's a fair concern, and it's a good thing to think about when you read about these kinds of results. Cool. Well, let's shift tracks at this point.

[00:27:52]

I wanted to ask you about an op ed that you wrote.

[00:27:57]

Actually, it's a topic that you've touched on in several pieces that you've written, about companies experimenting on their customers or on the public. Your argument was basically about how people get upset when they find out that, for example, Facebook was doing A/B testing, where some users were subjected to more emotional content than others, and Facebook was studying how this affects people's posting habits and things like that. People were really upset about this, and your argument was that they shouldn't be upset.

[00:28:29]

You want to lay out the case?

[00:28:31]

Sure. So this work was done in collaboration with my wife, Michelle Meyer, who's a bioethicist and legal scholar. And she's actually done more of this than I have, so she should get the majority of the credit for this line of thinking. But the basic idea is that there have been many high-profile cases, especially in the world of people who focus on research ethics and things like that, of randomized experiments that have been run either by companies or by other kinds of organizations, or even by medical researchers, where people objected to the idea of the experiment.

[00:29:16]

They say the experiment itself was unethical, shouldn't have been run, and is really, really bad for a variety of reasons. And what we noticed — what Michelle noticed especially — is that people complain about A/B tests, that is, an experiment where people are assigned to either an A or a B condition, but they rarely complain about plain changes in policy or practice, which affect everybody without comparing them to anything else. So if Facebook changes its algorithm one day, they're in a way just running a really bad experiment where we're all in the A condition and there's no B condition to compare to.

[00:29:59]

And we don't object to that. We don't object to doctors deciding to practice medicine one way and not another. But sometimes when people do randomized experiments, even in medicine, there's objection to that. So Michelle and I pointed out a few cases of this, and we call it the A/B illusion — when people object to an A/B experiment, but they would not object to just imposing A or B on everybody as a matter of practice.
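As a toy illustration of the contrast — the metric and numbers here are hypothetical, not anything Facebook or the authors actually ran — an A/B test differs from a policy change only in that a random portion of users keeps the old condition for comparison:

```python
import random
from statistics import mean

def outcome(version: str) -> float:
    """Stand-in for some observed user metric; the numbers are made up."""
    return random.gauss(1.1 if version == "B" else 1.0, 0.5)

random.seed(0)

# Policy change: everyone silently gets B; there is nothing to compare against.
policy_results = [outcome("B") for _ in range(1000)]

# A/B test: randomize users, then compare the two arms directly.
arms = {"A": [], "B": []}
for _ in range(1000):
    arm = random.choice("AB")
    arms[arm].append(outcome(arm))
print("A mean:", round(mean(arms["A"]), 3), " B mean:", round(mean(arms["B"]), 3))
```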

[00:30:31]

And one example we gave in one of our pieces was companies being reluctant to run beneficial experiments, from which they could learn a lot, because they don't want people to find out that they've been running an experiment. So instead they don't run an experiment; they just go with lesser-quality data, or no data at all, or just intuition — or, as someone said, the HiPPO, the highest-paid person's opinion, just governs the outcome.

[00:31:01]

And that's, to us — and I think probably to a lot of people — not the most enlightened way to figure out what policies and treatments and practices are likely to work best, either for the company's bottom line or for its customers. And companies very often are legitimately concerned about the welfare of their customers; they are honestly trying to make products that improve people's lives. So everybody has a stake in this, and in this illusion.

[00:31:26]

I think this reminds me of an old anecdote.

[00:31:29]

I have no idea if this is real or not. Some, you know, prestigious, esteemed doctor — I think it was a surgeon — was reporting some change in surgical methodology and arguing that this would be a good thing.

[00:31:47]

And some student in the back raised his hand, like, "Well, why don't you try it on only half your patients to see if it works?" And the presenting surgeon, you know, took umbrage at this. He was like, "You're seriously telling me that we should subject half of our patients to worse treatment just for the sake of experimentation? I'm not going to subject half the people to worse treatment." And the student just replied, "Which half?" Anyway.

[00:32:14]

Yeah, well, yes, exactly. So we can talk about all the kinds of cognitive biases and thinking traps that might lead us to believe, after an experiment has been run, that we knew all along what the results would be, or that we should have known what the results would be, or that we should have known that people would be harmed. We can make up a lot of explanations for that and so on. To me, it's kind of interesting that randomized experiments, kind of like randomness in general, tend to trip up our intuitive thinking processes.

[00:32:48]

And I think part of the explanation for that is that the first randomized experiment was done something like two hundred years ago, and it wasn't even really followed up on very much. It was less than one hundred years ago that the proper theory and statistical tools for doing randomized experiments were invented. And they're still a little bit unintuitive for people to think about. And I think it partly has to do with the fact that they're a brilliant social invention, but a very recent one — one that maybe we should spend more time teaching people about, really, in schools or something like that, or try to make people understand more, because they are really powerful.

[00:33:29]

I think you would probably agree, and people listening to this podcast would tell me, that they're a really powerful evidence-generating, knowledge-generating mechanism for human society and for all of us. So maybe we should really try to get people to understand them much better than they do, and not react emotionally, with sort of fallacious reasoning, in cases like this.

[00:33:53]

By the way, I should add, this is not to say that there's nothing ever wrong with any A/B test. Right. So the principles of ethics say that there are various situations when it would be unethical to do an experiment. For example, if you know that one of the treatments is clearly superior — if the evidence, properly construed, shows that one of them is clearly superior — well, it's probably not right to give people an inferior treatment, especially when health is involved.

[00:34:19]

And so there are lots of reasons why you can't just willy-nilly experiment, and why not every experiment is OK. And there also could be reasons why one might be unhappy, let's say, with what Facebook is doing with its platform, but those probably don't have much to do with the fact that they're doing A/B tests. And we shouldn't let our dissatisfaction with whatever is going on with Facebook spill over into just disapproving of running A/B tests online in general.

[00:34:44]

That would be a mistake.

[00:34:47]

Yeah.

[00:34:47]

You also made a great point recently when Starbucks announced that they were going to start giving implicit bias training to all of their employees, to reduce the incidence of unfortunate things like — I guess it was two weeks ago — when two black men in a Starbucks had the police called on them for loitering, even though everyone loiters in Starbucks. And you pointed out, like, we really don't know if implicit bias training works. But if you're going to do it anyway, you might as well test it.

[00:35:20]

So at least we'll get some more real world data on it. And it's kind of a waste to do all this training without, you know, information collection to boot. Yeah.

[00:35:29]

My colleague here at Geisinger, Matt Brown, and I wrote a piece in The Wall Street Journal about this. It seemed like Starbucks reacted with a very proactive response to this whole incident in Philadelphia. They announced they were going to close their doors for a whole afternoon, and everybody in all these 8,000 stores was going to get training. And my first thought was, what a great opportunity to run an experiment and see if any of this training actually works. I don't think they were committed to implicit bias training per se, which has a very checkered, let's say, evidence base. Even some of the leading scholars on the topic of implicit bias are not convinced that training people to reduce implicit bias actually reduces discriminatory behavior.

[00:36:13]

That's a big open question. I think it's a great example of how we sort of just rush to assume that something works, and that we know what's going to work, even when we don't. Starbucks could have run any number of experiments. They could have delayed this by a month. They could have — even today, they could decide to hold back some of their stores, and just delay the training in those stores, to see whether the training they're going to give everybody actually works.

[00:36:35]

That would help social scientists, and it would help other companies that want to actually do effective training, to have one of the biggest retail organizations in the world do a real experiment. And it's not really that much harder to do that than it is to train two hundred thousand employees.
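For what it's worth, the holdout design being described is simple to express — the store counts and outcome metric below are hypothetical, just to show the shape of it:

```python
import random

# Hypothetical store-level holdout: randomly delay training in a subset of
# stores so they serve as a temporary control group.
random.seed(42)
store_ids = list(range(8000))           # roughly the number of stores mentioned
random.shuffle(store_ids)
trained_now = store_ids[:7200]          # 90% get the training immediately
trained_later = store_ids[7200:]        # 10% get it a month later (controls)

def incident_rate(store_id: int) -> float:
    """Placeholder for a real per-store outcome measure; invented here."""
    return random.random() * 0.05

now_rate = sum(incident_rate(s) for s in trained_now) / len(trained_now)
later_rate = sum(incident_rate(s) for s in trained_later) / len(trained_later)
print(f"trained now: {now_rate:.4f}   delayed (control): {later_rate:.4f}")
```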

[00:36:50]

I know — that's the maddening thing, that it would be so easy and so valuable, and we're just not doing it. Exactly. How expensive the experiment is, is really not that much compared to what they're already investing. For a company to go and invest precious resources in an experiment — it's kind of hard to judge from the outside, not knowing what all of their considerations are and so on. But having seen what they're already putting into this, it's easy to say, well, if you would just add the experimental component, it would be great. But it doesn't look like they're going to do it, unfortunately.

[00:37:16]

Yeah. I did want to push back a little bit on your defense of Facebook — and you also defended OkCupid's A/B testing on its users in the same op-ed. And it seems to me that there are two separate things that we're talking about when we talk about the public's negative reaction to experiments.

[00:37:34]

One is A/B testing, or, you know, random experiments, when the participants know that they're in an experiment. There's — as you've correctly pointed out — an aversion people have to the idea that if we even have a suspicion that one thing might be better than the other, we should just give that thing to everyone, even though in practice we're often wrong about which thing is better, and it's much better to know than just to guess. That's one thing.

[00:37:58]

But then the separate concern is: is there a problem with experimenting on people when they don't know they're in an experiment? And one argument you could make — that you, in fact, did make — is, well, people are in experiments anyway; not A/B tests, but tests where a company tries a thing on everyone.

[00:38:19]

But it feels to me like there's a real difference when

[00:38:25]

the company is doing a thing that doesn't feel like it was part of the bargain when you signed up to use their service.

[00:38:31]

So the Facebook example feels kind of borderline to me — although I would lean on the side of, I guess it's OK for Facebook to do this A/B testing — but the OkCupid example feels more over the line, into not-OK territory, to me.

[00:38:47]

So that was: OkCupid told some of its users that its matching algorithm had determined they were a good match with someone else, when in fact they weren't a good match according to the algorithm. And OkCupid discovered that actually people hit it off just fine with those who, you know, they were secretly not a good match with, according to the algorithm — which was interesting and useful to know. But it still felt like a violation to me, because the company was being actively deceptive, instead of just giving some service or feature to some people and not to others.

[00:39:20]

What do you think? I agree with you that that act of deception is ethically problematic. I won't disagree with that.

[00:39:29]

Oh, I should also add that I think it's strategically unwise, just from a scientific perspective — like, forget the PR issues. Because let's say you conduct an experiment without letting people know that you're experimenting on them, and you find a result. What are you going to do with that result?

[00:39:46]

Like, presumably — let's say you're a medical scientist and you want to give someone a placebo without telling them it's a placebo. Great. So you find out that if people don't know they're getting a placebo, and don't even know there's a chance of getting a placebo, it helps them.

[00:40:02]

Well, what do you do with that in the future? Presumably at some point you're going to have to tell patients they might be getting a placebo, unless you want to lie to everyone for the rest of time. So, you know, you have this result that shows that the placebo works if patients have no idea there's even a chance of getting a placebo. But now, in the real world, they know there's a chance of getting a placebo.

[00:40:21]

And so the result that you got in your experiment is no longer that applicable. And I think the same thing probably applies to research like OkCupid's, where people don't know that there's even a chance that they're getting a random result.

[00:40:35]

Yeah. So first of all, let me say, you know, informed consent is obviously a desirable thing whenever possible. However, a lot of experiments lose their validity when that's involved, or it's just not practical to do, and they don't involve significant risk. So in the case of OkCupid, it's funny that the assumption that there's deception going on is in a way based on an assumption of the validity of the algorithm.

[00:41:05]

So if the algorithm really does match us up well, then telling us that we're a good match for someone we're really a bad match with is deceptive.

[00:41:14]

But you don't really know how good the algorithm is — the company thinks they're matching people well, and they may not be, but they think they are.

[00:41:24]

Yeah, or they don't know. They've created an algorithm which seems sensible on its surface; it seems like a sensible policy, right? But they don't actually know. I would certainly not advise companies to go around doing exactly what OkCupid did all the time. I actually have the impression that they sort of enjoyed having the reputation of being the dating site that did those kinds of things. They publicized it a lot in their blogs.

[00:41:47]

Yeah, Christian Rudder wrote a book and so on. But I do think that, you know, there are many cases where informed consent isn't possible, and there are many cases where we want people to experiment. Like, I think we sort of want the chef to try out different things in the restaurant, you know, and change the menu around and so on. We don't sort of want there to be, like, one menu that's set and fixed

[00:42:12]

and is never experimented with. We don't necessarily want to sign a consent form when we go into the restaurant just because the chef may vary the ingredients a little bit from night to night.

[00:42:20]

Yeah, but if the chef told us that we were getting, you know, the best or most expensive fish, and he actually gave us the cheaper fish — that might be a useful experiment to do, but I think people would still object.

[00:42:33]

Well, yeah — I mean, you could actually do those things under pretty normal informed consent standards. Right. It turns out expensive wines don't taste that much better than cheap wines when people don't know. Right. So that kind of work has been done. So I'm not saying that every sort of undisclosed manipulation and so on is a good and OK one.

[00:42:56]

I think companies should certainly weigh that. But at the same time, you know, we should realize that, as our friend Duncan Watts says, the world is just one condition of an un-run experiment, and people do things to us all the time that they don't ask consent for. You know, like when OkCupid made their dating algorithm, they didn't disclose the algorithm in all its details and say, do you consent

[00:43:24]

to be matched up under the rules of this algorithm? Instead, they said — I don't know exactly what they said, but it must have been something like — we have an algorithm that will match you up. And you sort of assume that it's a good one. But that's a somewhat unwarranted assumption, just like we assume a lot of medical practices, for example, are evidence-based and built on good information, when often they're not. I mean, think how long it took for doctors to start washing their hands in hospitals and so on.

[00:43:54]

There's a lot of stuff that still goes on that is not as evidence-based as we might think. I think one thing Michelle would say if she were here is that probably we should do more — in general, companies, hospitals, and institutions should do more — to explain to their users, customers, employees, whatever, that they are organizations that try to learn and improve over time. And one of the ways they do that is by doing low-risk experiments that are not meant to put you in danger, but are meant to compare different policies, ideas, and so on, and figure out what works best so that everyone can benefit.

[00:44:32]

I think if that were communicated a lot more clearly and continuously, that might help.

[00:44:36]

Yeah, well, I can't argue with that.

[00:44:40]

I wanted to say something that can't be argued with.

[00:44:45]

Chris, before I let you go, I wanted to ask you for a recommendation of a book or article or blog or something like that — something you don't agree with, or have substantial disagreements with, but that you still think is worthwhile and worth engaging with, because it makes interesting arguments or advances an interesting hypothesis, something like that. What would you recommend in that vein?

[00:45:09]

Well, I've heard you ask this question before, so I did my preparation, and I came up with not one but four answers. Oh, fantastic. I will rattle them off a little bit quickly. I'll start with Nassim Taleb, who probably has been discussed on your podcast before and is familiar to listeners. I know. Yes. He wrote The Black Swan, Fooled by Randomness, Antifragile, Skin in the Game — a variety of books with wonderful titles.

[00:45:42]

He's a very interesting guy. He says a lot of things that one can disagree with. But I think that it's still very rewarding to consider his ideas and try to get beyond some of the rhetoric and occasional bombast and drama and so on and think about what he's actually saying. And it probably will change people's worldviews a little bit if they haven't been exposed to it.

[00:46:08]

Yeah, he's a good test case for me, because I find his style, especially on Twitter, so abrasive and obnoxious that I have to really work to consider his claims on the merits and not let my judgment be colored by my impressions of him.

[00:46:25]

Yeah, you can use it as training — training for exactly that kind of open-mindedness, and for separating out the person from the arguments. There are so many different uses of his work. And I'm sure it's extremely unpleasant to come under attack from him — I know people who have. But I also know him, and I just think there's something to be gained from thinking about his thought. He's somewhat of a significant thinker, you know, and it's worth reading his stuff and thinking about it.

[00:46:58]

So the second one I'll mention is Malcolm Gladwell.

[00:47:02]

Interesting, because you wrote a serious critique of Malcolm Gladwell a few years ago. I think that's how I first encountered you, actually.

[00:47:09]

Yeah. I did write a couple of pieces about Malcolm Gladwell and one of his books, which was called David and Goliath, which came out a few years ago. And I have had a lot of objections to claims that Gladwell makes, and the way he thinks about evidence, and the way he sort of implicitly leads his own readers to think about evidence and to think about how the mind works and so on. But I do read all of his stuff.

[00:47:39]

I've read every single article he's written — I do try to make it a point of reading all of his stuff — because he always manages to find interesting things that nobody knows about, or to show well-known things in a different light. So it's worth thinking about what he has to say and looking at the items he's picked to talk about and write about. And it's also, in a sense, good training to look and see whether there are reasons to disagree with it.

[00:48:03]

Don't accept him at face value. He's also a great writer, so it's good to have examples just of engaging prose, and of how to write so that people will actually want to read you, and so on. So I've used Gladwell's stuff in teaching. For example, when I used to teach seminars on writing and so on, we would use his stuff and try to dissect what makes it effective writing, on the one hand —


[00:48:24]

but also, what makes it potentially not correct, or even a misleading account of human behavior and how the world works. Maybe one of the keys to finding useful people you disagree with is just finding different ways to read their stuff — like, with different purposes in mind. So if your purpose in reading Gladwell's stuff is to find interesting stories or data or anecdotes that you can then add to your own world models and interpret yourself, rather than taking his interpretations of those data, that might be a more useful way to use Gladwell.

[00:49:00]

Exactly. And he performs a great service of going out and gathering some of the best of that stuff. And sometimes he's right, sometimes he's wrong. So if you go into it not assuming that he's right, but trying to think about how he might be wrong, then you're in good shape. And I guess, very quickly, I'll mention Ray Dalio and his book — he's the founder and CEO of Bridgewater, the hedge fund.

[00:49:22]

And he wrote a book — I think it's called Principles, the first volume of the Principles books. I don't know if you know it or not, but he describes the very unique way that his business has operated and the way that he thinks businesses should operate. And I found a lot to disagree with in it, because I think a lot of it is based on fairly simplistic models — again, models of the mind and the brain, and of what personality differences between people really mean and which ones are important and so on.

[00:49:54]

But to hear the way a very successful person has thought about this — and he's clearly thought about it a lot — was a very interesting counterpoint to how academics and researchers think about the same stuff. Here's a guy who has tried to put it into practice: what does he come out with on the other end? How does he try to use this stuff? That is really thought-provoking. Cool.

[00:50:16]

Wait, was that four, or three? Did you have a fourth?

[00:50:19]

That was three. I guess the last one I'll mention — because I happened to write down four — was Edward Tufte, who's written these beautiful books on information design and graphics. I think the first one was called The Visual Display of Quantitative Information, and there's one called Envisioning Information. He gives seminars and so on. I think he's kind of wrong about a lot of the visual perception psychology — about what kinds of charts convey their message most effectively, and how graphics should be designed to be easy for people to understand.

[00:50:53]

But he produces incredibly beautiful graphics, and he tries to visualize information in unusual ways. So, again, it's sort of the same kind of thing: look at it and marvel at the beauty and the eloquence and the interesting stuff that's in there, and then think about whether this would really work as a way of communicating a message to an audience — and how could you do it differently?

[00:51:21]

Fantastic. Well, I always love getting four for the price of one. That's a good day. And Chris, thank you so much for coming on the show. It's been a pleasure having you. Oh, well, thanks for having me.

[00:51:32]

It was a really fun conversation. This concludes another episode of Rationally Speaking. Join us next time for more explorations on the borderlands between reason and nonsense.