Rationally Speaking is a presentation of New York City Skeptics, dedicated to promoting critical thinking, skeptical inquiry, and science education. For more information, please visit us at NYCSkeptics.org. Welcome to Rationally Speaking, the podcast where we explore the borderlands between reason and nonsense. I'm your host, Massimo Pigliucci, and with me, as always, is my co-host, Julia Galef. Julia, what are we going to talk about today?
Well, Massimo, today we're going to turn a rational eye on the world of standardized testing. In particular, we're going to look at the intelligence quotient test, the IQ test, and the Myers-Briggs and other sorts of personality type tests. Oh, OK. So have you taken the Myers-Briggs test? Do you know what your personality type is?
You know what? At some point or another I have taken both the Myers-Briggs and the Big Five test, but I have no recollection whatsoever of my personality type.
I can probably typecast you, assign you to their categories.
Yeah, but that is exactly the problem, right? That is, you know, how reliable would that casting be? Have you taken either one of these tests?
I took the Myers-Briggs test a couple of times, once in high school and then once, I think, in college or shortly after. And I recall that the last time I took it I was an INTP, so introverted, intuitive (or intuiting), thinking, and perceiving. Mm hmm. I was probably similar the first time I took it, too. But, you know, one of the many problems with this test, which I'm sure we'll get into, is that it's all self-reported; it's self-perception at best.
Right. This is how you perceive yourself to be. And in my case, I remember noticing as I was taking the test that it was worse than that for me. It wasn't even just self-perception, it was self-aspiration. I was answering in terms of how I wanted to be, not how I actually saw myself, which in itself would not be a very reliable metric.
I actually had a similar experience. By the way, I should say that I never took these tests in any formal setting; I took them a couple of times in, you know, basically simplified online versions, just out of curiosity. That's one of the things that is very different, by the way, between the American cultural use of these tests and the European setting. In Europe there is almost no use of these kinds of tests.
As far as I can tell, I never actually encountered any of these tests there, even though the IQ test, of course, originated in Europe; it was invented by Alfred Binet, who was a Frenchman. But the personality tests are really something that is very popular in the United States, where millions of people take them every year for a variety of purposes. Anyway, the point I was making was that, yes, I remember having exactly that sort of subjective experience that you just mentioned.
That is, I was responding the way I would like to be perceived, but that doesn't necessarily mean that it's the way I actually am. Yeah.
In fact, now I'm remembering a little more. When I took it in high school, I was an extrovert, or rather the test said that I was an extrovert, but I was actually much more introverted in high school than I am now. And the reason that I came out as an extrovert was that at the time, in high school, I thought that extroverts were better than introverts and I wanted to be an extrovert. So for all of the questions like, do you feel comfortable in long conversations with strangers at parties?
I would say yes, because I really wanted to be the kind of person who could have long conversations with strangers and be comfortable.
And now you can. So, you know, it worked. I can.
Although, you know, I still get sort of drained by being in contact with a lot of people that I don't know, where I have to be socially on for extended periods of time. Right. So I think that still makes me, at the core, an introvert. Yeah.
Well, what piqued my curiosity originally about this topic was actually a recent article about the Myers-Briggs personality test in particular. You know, I didn't know much about the history of this thing, but it turns out that the history is sort of, almost, pseudoscientific. The original idea of the test was put together by Katharine Briggs and her daughter, Isabel Myers; we're talking about the early part of the twentieth century.
The two women in question were highly influenced by the psychology of Carl Jung, who was, of course, initially a protégé of Freud and then, you know, just like everybody else who was a protégé of Freud, sort of fell out with the master.
But the thing is that as soon as I started reading Jung, the baloney detector went to a fairly high level. Because Jung is famous for all sorts of interesting things. One is his idea of unconscious collective archetypes that all humanity shares, for which, of course, there is not a shred of empirical evidence. He's also famous for his antipathy toward quantification and statistical analysis. He thought that with statistics you can lie and pretty much tell any story you want.
There is a kernel of truth there, of course, but that's if you're trying to lie, not if you're trying to discover the truth. Exactly. And a very, very brief tangential quote: have you ever heard that great line from, what's his name, Chesterton, I think it was? He was referring to someone, I don't remember who, and he said he uses statistics the way a drunkard uses a lamppost: for support, not for illumination.
Yes, exactly. If that's your goal, you can use statistics to lie. But that doesn't mean that you can figure out the truth without statistics. That's exactly right.
Instead, Jung was very much into anecdotal evidence and, you know, personal interpretation of stories, which is, of course, the whole problem with psychoanalysis more broadly, which we talked about in other episodes. That is one of the reasons why Karl Popper famously thought of psychoanalysis as a pseudoscience, as, you know, technically being unfalsifiable. Jung was also very much into extraterrestrial visitations.
He wrote a book about UFOs; he was interested in astrology. So whenever his name comes up in the conversation, he's not exactly, you know, a flag for rationality and evidence-based claims. So the fact that the Briggs Myers test originated from the insights of Jung already should be a pretty good reason to be, at the least, very careful. But do you remember, Massimo,
In our episode on Freud, we were talking about how to weigh the evidence for or against Freud's theories. And I said, you know, it almost seems silly to be working too hard to figure out what evidence there is for or against them if we know that what produced the theories in the first place was not the sort of process that tends to produce true theories. You know, it was just him making up something that he thought sounded cool. I mean, that's being a little uncharitable, but, you know, yeah, it was a good story.
It was an interesting story. But that doesn't make it anywhere remotely a good basis for a scientific analysis of anything. Now, that said, I want to be a little bit more charitable just for a moment, because then we'll go back to being uncharitable, I suppose, or being critical at least. But wanting to be a little bit more charitable about these tests, in particular the Myers-Briggs: there is another possibility there, right?
There are situations like, I don't know, acupuncture, for instance, which is still a highly debated technique and all that. But there is increasing evidence that it does have some effects. Certainly nothing like what its supporters actually claim, you know, it doesn't cure cancer or anything like that, but it seems to be somewhat useful beyond the placebo effect. Now, that claim is still debatable, but even if it is true, the point is that it's definitely not effective because of the theory, you know, the meridians of energy that go through your body.
That's all a lot of baloney. But that doesn't mean that the practice itself doesn't work. Another, even clearer example, I think, is the efficacy of certain herbal remedies. It's an efficacy that has nothing to do with folk theories of why these things work. But, you know, people have tried a bunch of things and presumably retained, by trial and error, those that worked. And therefore it's no surprise that something may, in fact, work today, even though the theory from which it originated was a lot of nonsense.
Unfortunately for the Myers-Briggs, we don't even know if that is true, because apparently there is very little, if any, empirical evidence supporting the reliability of the test. I mean, this stuff was made up out of whole cloth to begin with. And there doesn't seem to be much research, unlike the case for other personality tests, so there really doesn't seem to be much of a reason to keep using it. And yet it's actually very popular.
It is the second or third most popular personality test used in the United States, and the statistics are pretty impressive. About 10,000 companies, 2,500 colleges or universities, and a couple of hundred governmental agencies use it in the United States, which makes for a handy profit of about 20 million dollars a year for the company that produces the test, CPP. And to me, that is the really interesting stuff, that actual decisions in the real world
are being made by people who think they're making those decisions based on evidence, because, you know, after all, it's a test and it's measuring something, presumably. And yet, in fact, there is neither a theoretical nor much of an empirical basis for what's going on with the Myers-Briggs.
So I think that we need to look harder at what kind of a thing the Myers-Briggs test is before we can ask whether it has been or could be validated, because it seems to belong to a different category than things like acupuncture. Acupuncture is a method that is trying to produce an effect, and it also has a theory, a causal explanation of why that effect is supposed to occur. Right. And then there are other theories, scientific theories, that are supposedly providing causal explanations of why something happens.
The Myers-Briggs, though, is more of a descriptive construct. I mean, it's a way of just classifying people according to these various attributes. So in what ways could it be wrong?
Well, that's an interesting question, actually. Maybe one way to tackle that question is to look at some of the other personality tests that have been widely used. One of the other big ones is the so-called Big Five personality traits.
Yeah, tell me about that one, because I've heard about it and I can't remember what it is.
So that one, again, as you were saying, classifies people according to a certain number of dimensions, in this case five personality traits. These traits are openness, conscientiousness, extraversion, agreeableness, and neuroticism. Now, it does suffer from some of the same problems that you were mentioning earlier for the Myers-Briggs, which are common to pretty much any personality test: the test is essentially based on people subjectively answering a set of questions about themselves.
And therefore there is the issue that people could be biased, either consciously or unconsciously. First of all, since people know that these tests are being used for placement in the workplace or in colleges or within a governmental agency, clearly people want to come across as well as possible. So obviously there is a certain bias there. Now, psychologists are aware of this, of course, and there is internal cross-validation, for instance, that the five-factor tests use to try to correct as much as possible for these things.
But it's always a potential problem.
So that's one way the test could be wrong, then: it could be wrong in that it's claiming that the scores on this test are an accurate representation of that person's personality, when, in fact, that's not the case because of these self-report issues. Right?
Yes, that's definitely one way. Now, there is, of course, a more fundamental issue, which I think is where you were going a minute ago, and that is: even if the tests are accurate, what exactly is it that they're measuring? The problem is that in the 1950s and 60s in particular, a lot of psychologists were interested in a behaviorist approach to human behavior, and therefore they tended to emphasize the effects of environmental stimuli as opposed to anything innate or genetic that might determine or influence human behavior.
So if you believe that a person is largely, you know, a tabula rasa, well, then you believe that there is no such thing as personality; our behavior is determined by whatever it is that we're reacting to. On the other hand, if you believe that genetic influences have much more of a role, then you'd be more sympathetic to the general idea that there is going to be, you know, a personality.
Now, the current view, from what I understand, is, of course, somewhere in between, a combination of the two: there is such a thing as a personality, and traits tend to be somewhat stable, although they do change during developmental time. It's interesting that a lot of even the better personality tests are typically applied to adults, and they run into trouble if you try to apply them early in individual development.
There are children's versions of them, but it's known that personality traits actually do change over time to some extent; you know, not necessarily radically, but to some extent. Anyway, the whole point was, going back to the evidence and the theory: we said that the Myers-Briggs is bad on both counts. The five-factor tests, on the other hand, do significantly better in terms of evidence.
I mean, there are a number of studies that cross-validate these tests. What are they cross-validating? Well, for instance, first of all, you can use different versions of the same tests and see if you get similar results. But I'll get to that in a second. You can also do it cross-culturally, and there is some cross-cultural validation of these tests, although there are some interesting differences.
For instance, several Asian cultures don't seem to have all five personality traits, or at least not all as developed as the Caucasian, sort of Western, samples do.
But that doesn't seem that damning to me, though, if a test is a good classification of personality traits in one culture but not a good classification in another.
No, it's not that damning, although it does tell you that the environmental influence must be important. And by environmental influence in this case, I mean, you know, the cultural milieu, the cultural environment in which a person grows up. Right.
I see. So it makes the test less reliable as an indicator of like genetic, of innate personality, but still reliable as a descriptor of the personalities that people end up with.
Right. Now, there are still two problems, as I understand it, with the Big Five tests. One of them is that there's no theory; these are atheoretical things. That's true also for the Myers-Briggs, and pretty much for almost any personality test, and I think it's true also for the IQ test: there is no theoretical basis for these things. But is that really so bad?
Well, isn't it just saying that here are some dimensions on which you can classify people? Yes.
But, I mean, this goes back to some extent to the general discussion of what kind of science psychology is, right?
Most people tend to think, and by most people I mean most philosophers of science and most scientists, and that's most people as far as we're concerned, they would argue that a mature science is a theoretically based science.
Right. I mean, the quintessential examples of mature sciences are physics and chemistry, which are highly theoretically based.
Yes, followed by things like biology; biology became essentially a modern science after Darwin, which means after it got a theoretical grounding of some sort. Right.
Although I think psychologists would probably resent your use of the Myers-Briggs as the thing by which to judge whether they are a real science or not.
Well, I didn't say real science; I think of psychology as a soft science. And I think the problem is not just the Myers-Briggs. I mean, if that were the case, yes, this would be a sort of silly objection; I think it actually applies more broadly to psychology in general. There have been, of course, historical attempts to ground psychology in some kind of theoretical framework, the most obvious one being Freud's, but pretty much all failed.
Freudianism failed, psychoanalysis in general failed, and behaviorism failed as well. Now the latest attempt is essentially evolutionary psychology, which, as you know, I don't have a lot of sympathy for either. So I think it's fair to say that, at the very least, psychology as a discipline is significantly behind the other, harder sciences. That's not to say that psychologists are, you know, less smart than physicists.
I think it actually has much more to do with the nature of the discipline, the kind of phenomena that people study. But that's a whole different discussion. The point is, none of these tests, including the Big Five, seems to have theoretical grounding. Now, most philosophers of science and scientists, I think, would consider that a problem. Not a fatal problem, because something can be useful even if it doesn't have theoretical grounding, but it is a problem that at some level needs to be addressed.
Then the other question remains, as far as these tests are concerned: what about the empirical grounding? We've seen that the Myers-Briggs doesn't really have any. Well, for the Big Five, as I said, there are a number of studies that validate and cross-validate them. But the fundamentally interesting thing there is how they constructed the test to begin with. How did they come up with five factors?
OK, and the word "factor" there gives you a clue. These tests were built originally using a very powerful, very common multivariate statistical analysis called factor analysis. Factor analysis, incidentally, is the same kind of thing that gave us the idea of an underlying intelligence factor that is correlated with all IQ tests, the idea that there is such a thing as a general intelligence, basically, which, of course, is very controversial. Yeah, I was hoping to talk about that.
Yes, we will. So, to finish about factor analysis: I've actually used it a lot when I was a practicing biologist. It's a very common tool, and it's essentially an exploratory analysis, although I should say, for the listeners who are statistically inclined, that yes, there is such a thing as confirmatory factor analysis, but let's set that aside for a minute. It's largely an exploratory analysis. Basically, it's what you do when you have a large amount of data.
You want to figure out if there are any non-random patterns in the data, and you want to organize them. Let's say that you measure something like 20 different variables, or 50 variables, or 100 variables. You know, you could plot, say, 100 variables by themselves or all together in a sort of 100-dimensional space, but good luck making any sense of that sort of thing. So factor analysis, like principal components analysis, which is related to it, is a way to simplify the problem.
It reduces the dimensionality of the problem from the original number of variables to a small number of factors. These factors are found, or identified, statistically, essentially by correlations between variables. You can reduce the number of original variables because several of them are going to be correlated. The technique is essentially a multidimensional rotation in the space defined by the data, and what that mathematical rotation does is identify directions in this multidimensional space that explain the majority of the variation in the data.
So it's a way to simplify large datasets.
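To make that concrete, here is a minimal Python sketch, with entirely simulated data and invented trait names, nothing from any real test: two hidden traits each drive a cluster of questionnaire answers, and the resulting correlation pattern is exactly what factor analysis exploits to collapse many questions into a few factors.

```python
# Sketch: why correlated questions collapse into a factor.
# Two simulated latent traits each drive a cluster of items; answers
# within a cluster correlate strongly, answers across clusters barely
# at all. All data and names here are invented for illustration.
import random

random.seed(42)

def pearson(x, y):
    # plain Pearson correlation coefficient
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

N = 2000
openness = [random.gauss(0, 1) for _ in range(N)]      # latent trait 1
extraversion = [random.gauss(0, 1) for _ in range(N)]  # latent trait 2

def item(trait):
    # one questionnaire answer = latent trait + individual noise
    return [t + random.gauss(0, 0.7) for t in trait]

q1, q2 = item(openness), item(openness)  # two "openness" questions
q3 = item(extraversion)                  # one "extraversion" question

within = pearson(q1, q2)   # same cluster: high
between = pearson(q1, q3)  # different clusters: near zero
print(round(within, 2), round(between, 2))
```

Run on simulated data like this, the within-cluster correlation comes out high and the between-cluster one near zero, which is the structure the rotation then picks up as separate factors.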
So is there a way to explain what that means in this context, where the variables we're talking about are personality traits? Would it be fair to say that the openness factor, that's one of the five, isn't it? Openness, yes, openness to experience. Would it be fair to say that openness represents sort of a collapsed version of many characteristics that you could, in theory, have talked about separately?
Right, but because they're kind of clustered with each other, we collapse them into this one thing we call openness.
That's right. That's exactly right. And the same goes for the other factors. So the idea is that essentially you have these hundreds of variables that you can measure, and the variables in this case are actually the responses to individual questions. The typical test could include hundreds, sometimes several hundred, questions, and each one of those questions is a variable, essentially the response to each question.
And what you're trying to do is to figure out if there are common factors that sort of summarize the information about all those responses. The idea is, therefore, that empirically it turns out there are five of these factors, five major directions in this multivariate space that summarize the available data pretty well. And then you interpret these factors, so you give them names like openness to experience or conscientiousness or extroversion or whatever, based on which original questions, which original variables, are highly correlated with each factor.
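That naming step can also be sketched in code. In this toy version (simulated data again, with made-up question wordings), the factor score is approximated by the average of a question cluster, and the "loadings" are just the correlations of each question with that score; you name the factor after the questions that load highly.

```python
# Sketch: naming a factor by its "loadings." The factor score is
# approximated here as the average of a cluster of questions; its
# correlation with each question is that question's loading.
# All data and question wordings are invented for illustration.
import random

random.seed(1)

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

N = 2000
latent = [random.gauss(0, 1) for _ in range(N)]  # one underlying trait

questions = {
    "enjoys new ideas":      [t + random.gauss(0, 0.6) for t in latent],
    "likes art and poetry":  [t + random.gauss(0, 0.6) for t in latent],
    "prefers fixed routine": [random.gauss(0, 1) for _ in range(N)],  # unrelated
}

# Approximate factor score: average of the two trait-driven items.
factor = [(a + b) / 2 for a, b in zip(questions["enjoys new ideas"],
                                      questions["likes art and poetry"])]

loadings = {q: round(pearson(ans, factor), 2) for q, ans in questions.items()}
print(loadings)
```

The two trait-driven questions load highly while the unrelated one does not, so a researcher might label this factor something like "openness"; the label is an interpretation layered on top of the correlations.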
Right.
So this is another way that a personality test could be wrong: the factors it's measuring could not be very good or very significant explainers of the data that we've got. That's right. Now, I have to wonder: it seems like there's an almost arbitrarily large number of questions you could ask people, or things you could measure about people, which you would then be able to do a factor analysis on.
That's right. And, like, I'm sure there are questions like, how happy are you among strangers? Right. But there are probably not questions about, do you prefer sitting at a square table or a round table? Exactly. And there are probably many, many questions that we could have asked people that we didn't, and if we had, our factors would have come out different. Yes.
And in fact, that is one of the objections in the technical literature to these kinds of personality tests: at best, they measure a subset of the things that are relevant to human behavior. Now, the counterclaim, of course, is that yes, but these tests have actually been done in a number of different ways, asking a number of different questions, and yet the most consistent result you get is these five factors.
Now, that's OK, except for another crucial objection to the whole use of factor analysis. As I said, having used it, I'm actually familiar with the technique, and I know how subjective the interpretation is. The number of... Can you elaborate? Yeah.
The number of factors that you identify is, you know, a qualitative call. You have to establish a more or less arbitrary line for stopping at a given number of factors, because from a mathematical perspective, the way the technique actually works, the number of factors is exactly the same as the number of original variables, except that you then drop a bunch of these factors from your conclusions because they tend to be statistically non-significant.
That's how you simplify your system. So if you start with, let's say, 300 variables, you really have 300 factors there, and what you're trying to do is identify the most important ones, the ones that explain the most variation in the data. But where exactly do you stop? It's controversial, because you have to establish some kind of threshold of variation explained. And of course, you can argue, well, why did you stop there as opposed to a little before or a little later?
And sure enough, in the literature on the Big Five, some of the critics point out that other researchers have done similar studies and they found only two factors or three or 10 or 20. And so there's quite a bit of variation actually, in the number of factors that people have found.
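The arbitrariness of that stopping rule can be shown with a toy computation. This sketch hand-rolls power iteration (no external libraries) on an idealized six-question correlation matrix with two built-in clusters; the numbers are invented for illustration, and the point is only that the factor count changes with the variance-explained cutoff you pick.

```python
# Sketch: the number of "factors" depends on an arbitrary cutoff.
# Six variables form two clusters (within-cluster correlation 0.6,
# across clusters 0.0). We extract eigenvalues of the correlation
# matrix by power iteration with deflation, then count how many
# factors survive a given variance-explained threshold.
import random

random.seed(0)

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def normalize(v):
    n = sum(x * x for x in v) ** 0.5
    return [x / n for x in v]

def top_eigen(A, iters=500):
    # power iteration: converges to the dominant eigenvector
    v = normalize([random.random() + 0.1 for _ in A])
    for _ in range(iters):
        v = normalize(mat_vec(A, v))
    lam = sum(x * y for x, y in zip(v, mat_vec(A, v)))  # Rayleigh quotient
    return lam, v

r = 0.6
C = [[1.0 if i == j else (r if i // 3 == j // 3 else 0.0)
      for j in range(6)] for i in range(6)]

eigs, A = [], C
for _ in range(5):                      # extract the top five eigenvalues
    lam, v = top_eigen(A)
    eigs.append(lam)
    A = [[A[i][j] - lam * v[i] * v[j]   # deflate: remove that direction
          for j in range(6)] for i in range(6)]
eigs.append(6.0 - sum(eigs))            # last one from the trace

total = 6.0  # eigenvalues of a correlation matrix sum to the variable count

def n_factors(threshold):
    # keep factors until they explain at least `threshold` of the variance
    cum = 0.0
    for k, lam in enumerate(sorted(eigs, reverse=True), 1):
        cum += lam / total
        if cum >= threshold:
            return k

print([round(e, 2) for e in eigs])
print(n_factors(0.70), n_factors(0.90))
```

With this matrix, a 70% cutoff keeps two factors while a 90% cutoff keeps five, from the very same data, which is the "where do you stop?" problem in miniature.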
I remember thinking about this. I've been interested for a while in stories and storytelling and the construction of a good story, and every now and then a book or an article will come out saying there are really only X number of stories in the world. Right. And that X varies wildly. Like, there was one book that said there are twenty-one different stories in the world, and the stories will have types like, I don't know, the quest, or boy meets girl, boy loses girl, or, I don't know, the fracturing of a family, or something like that.
And the author will show how all of these popular famous myths and novels and short stories fall into one of these twenty-one categories. But you could narrow it down even more. In fact, sometimes the articles will say there are only three kinds of stories. Right. And at its simplest, there's only one kind of story: some person or group of people try to do something, and they either succeed or fail. Exactly. So it's really just about how fine you make the distinctions.
By the way, I should qualify one thing that I said earlier, which is that the Big Five tests are atheoretical. That's true in their current incarnation, but they didn't start out that way. They started out following something that was called the lexical hypothesis. The lexical hypothesis is the idea that essentially people's personalities are encoded in the way we use language. And interestingly, by the way, one of the first people who actually worked on this thing and developed the first test was Francis Galton, Darwin's cousin, whom you find almost everywhere
when we're talking about statistics early on, in the late 19th and early 20th century; you find Galton's fingerprints everywhere, including, for instance, most famously, in the first studies on heritability. Now, the idea was that the original researchers started out by simply putting together a bunch of words describing personality, taken from dictionaries; the early results originated in the 1930s. Now, that's not a bad starting point.
Absolutely. But, you know, the original studies were looking at something like almost 18,000 words, and then they reduced them to about 4,500, and then further still. And they found an initial cluster of about thirty-five major personality traits. You can see that the numbers here are all over the place; they vary all over the place. And then somehow we got from the thirty-five original clusters down to five. But, as I said earlier, it's controversial.
So the interesting thing about the Big Five is that there's certainly more empirical evidence for the use of the Big Five than there is for the use of the Briggs Myers. But that's, you know, somewhat faint praise, shall we say? Yeah.
I was wondering, when I took the Myers-Briggs, why these four dimensions were the ones that they thought important enough to include. Like extroversion versus introversion, OK, that seems somewhat fundamental. And I think it's the case that that's the only part of the Myers-Briggs that predicts significant things in the rest of one's life, like job performance. But the judging versus perceiving, you know, at the end of the Myers-Briggs score, whether you get the J or the P, I don't know.
Judging represents whether you prefer your life to be more structured and decided or more flexible, like whether you like deadlines or not. Why is that one of the top four most important things to know about a person, you know?
It's a good question. We should also mention the third of these very popular tests; the other two we've already talked about to a large extent. The third one is the Minnesota Multiphasic Personality Inventory, and that one is used particularly for top-secret security clearances for United States governmental agencies.
So it's very important. It's supposed also to measure personality traits, although it uses a different system from both the Briggs Myers and the Big Five, which right there, of course, raises an interesting question: why are there so many different ways of measuring personality, if personality is supposed to be such a fundamental aspect of what we do? But more interestingly, the Minnesota inventory went through several different iterations. And again, this is one for which there is a certain amount of empirical evidence in terms of cross-validation and all that.
But the way it started out was interesting. The original test was developed in the United States, of course, and originally was based on, let me see, I don't have the actual number, but a very small number of individuals, who were essentially young, white, married people from rural Minnesota.
OK, who else is there, right? Exactly. That's right.
And sure enough, no big surprise there, when they tried to apply the same test to, oh, I don't know, women, or people from different ethnic minorities or backgrounds, the results were not exactly very encouraging. Now, they have since standardized the test; they've enlarged the sample, you know, so it's gotten better. But, of course, that again shows you that there is quite a bit more subjectivity to this thing, and quite a bit more dependency on the specific samples and procedures, than it may seem at first glance.
And again, these are important tests, because they actually have real consequences in people's lives in terms of hiring and, you know, education placement and all that sort of stuff.
So, yep, makes sense. Now, we want to talk briefly about the IQ thing, but yeah.
Yeah, we do. Especially about the G factor. Yes. So you started to explain what the G factor is. More formally, what would be the best way to define it? It's a construct allegedly representing general intelligence. How would you describe it?
Well, that's exactly what it's supposed to represent. And it is arrived at by doing a principal factor analysis, the same one we've talked about for the Big Five personality traits, on a variety of IQ tests. People have developed a number of different IQ tests, and you can administer those tests to the same group of people. Then you crunch the data through a factor analysis that tries to see if there is any correlation, basically, in the way people respond to one test as opposed to the way they respond to another.
So if I give you two IQ tests, are you going to score high on both, or low on both? Or are the scores, in fact, uncorrelated?
And the idea being, if there's some sort of linear combination of the scores people get on these different tests, that represents their general intelligence, which is causing them to do well on all these different tests?
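The procedure just described can be sketched in code. This is a hypothetical simulation, not real IQ data: we invent a latent factor, generate three made-up test scores that all depend on it, and then recover an estimate of the factor as the first principal component of the tests' correlation matrix.

```python
# Hedged sketch: a latent "g" driving scores on three hypothetical tests,
# recovered as the first principal component of the correlation matrix.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
g = rng.normal(size=n)                       # latent general factor (simulated)
loadings = np.array([0.8, 0.7, 0.6])         # each test = loading * g + noise
scores = g[:, None] * loadings + rng.normal(scale=0.5, size=(n, 3))

R = np.corrcoef(scores, rowvar=False)        # test-by-test correlation matrix
eigvals, eigvecs = np.linalg.eigh(R)         # eigh returns ascending eigenvalues
first = eigvecs[:, -1]                       # component with largest eigenvalue
g_estimate = scores @ first                  # a linear combination of the scores

# the estimate correlates strongly with the latent factor we put in
r = abs(np.corrcoef(g_estimate, g)[0, 1])
print(round(r, 2))
```

Note that the "g" recovered here is strong only because the simulation was built with a single common factor; the analysis by itself cannot tell you whether the real world works that way, which is exactly the point being made.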
That's right. But you can see that there are many problems with that sort of interpretation. First of all, all you may be measuring is the similarity between the tests, as opposed to whatever native intelligence might actually mean. That's one thing. The other thing is, remember, it's factor analysis, so we're still talking about additional factors. There are a number of additional factors [inaudible]. You know, there's a secondary factor, a tertiary factor, and so on and so forth.
And these are simply ignored, because the original researchers interested in IQ already had in mind this idea that it must be a single factor, a single general intelligence. But if you abandon that idea, then all of a sudden you can start looking at the second factor, the third factor, the fourth factor. And then you run into the same problem we discussed a minute ago, which is: where are you going to stop?
How many different types of intelligence are there? Right.
Yeah. So what I had wanted to ask you is, I've been so confused about why so many smart people seem to think that this thing we're measuring, that we call g, represents a real thing, in the sense of having some instantiation in the brain, other than just a linear combination of these other things the tests have measured. It seems almost like reification. Yes. Which means falsely assuming that there's some real thing behind the measure you're looking at.
Exactly right. That's a pretty good description. Well, let me give an example to make it clear. If you took five tests and got different scores on all of them, you could take the average of your scores. That's just a very simple mathematical construct. And so it would be false reification to say, aha, I've discovered what my average intelligence is. You haven't discovered anything beyond what those five separate tests individually told you.
You've just recombined the data and given it a new name.
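The averaging example made concrete, with invented scores:

```python
# Minimal illustration of the reification worry: an average of five test
# scores is just arithmetic on the scores; it contains no information
# beyond them. These numbers are made up for illustration.
scores = [104, 97, 110, 101, 95]
average = sum(scores) / len(scores)
print(average)  # a summary statistic, not a newly discovered trait
```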
That's right. Now, as I said, the power of factor analysis is supposed to be that if you have good reasons to believe there are one, two, three, four, or five underlying factors, then that is one way you can quantify them. But now you're moving from a purely empirical approach, which is what most of these tests are based on, to a theoretically grounded approach, and that theoretical grounding is the very thing that is missing.
I mean, there is no particular reason to think there is a general intelligence factor, other than the fact that IQ tests tend, according to certain types of analysis, to cluster along one linear combination. By the way, you mentioned the word linear several times. That's another one of the major criticisms of these kinds of techniques in general. That is, who the heck said that these things are actually linear combinations?
Because linear is easier?
It's easier to do, exactly. But there are, by the way, multivariate techniques that take non-linearity into account. One of them is called multidimensional scaling, and it's a very sophisticated multivariate analysis that actually allows for the possibility that the underlying factors grouping the variation in a certain number of responses are combined in a nonlinear way. But that opens up a really interesting, large number of possibilities, because there are a bunch of different ways in which things can be combined non-linearly, and that makes the whole situation much more complicated.
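As a rough sketch of the idea, here is the classical (metric) version of multidimensional scaling on made-up points; the non-metric variants alluded to above go further and allow arbitrary monotone, nonlinear relationships between distances and dissimilarities.

```python
# Hedged sketch of classical multidimensional scaling (MDS): recover a
# low-dimensional configuration of points from their pairwise distances
# alone. The input points here are simulated, purely for illustration.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 2))                           # "true" 2-D configuration
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)   # pairwise distance matrix

n = D.shape[0]
J = np.eye(n) - np.ones((n, n)) / n                    # centering matrix
B = -0.5 * J @ (D ** 2) @ J                            # double-centered Gram matrix
eigvals, eigvecs = np.linalg.eigh(B)
idx = np.argsort(eigvals)[::-1][:2]                    # two largest eigenvalues
coords = eigvecs[:, idx] * np.sqrt(eigvals[idx])       # recovered configuration

# recovered distances match the originals (up to rotation/reflection)
D_hat = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
err = np.max(np.abs(D - D_hat))
print(err)
```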
Incidentally, there is evidence that some of the dimensions in the Big Five, going back to one of the tests we discussed earlier, are not actually orthogonal to each other, meaning that they're not mathematically independent. Now, is that a bad thing or a good thing?
Well, from a purely descriptive perspective, it's neither bad nor good; that's just the way it is. But from the point of view of making claims about personality factors, it's a bad thing, because it means that, all of a sudden, two or three of the personality dimensions may actually be correlated with each other. So whether you are an extrovert is not independent of whether you score high on something else in the Big Five.
So there could be, for instance, a correlation between openness to experience and extroversion. Now, that would be empirically interesting. But what it does is undermine the idea that there are five independent traits, because if two of them, let's say, are correlated, then one could say, well, why not collapse those into one underlying factor and call it the openness-extroversion dimension? And that reopens the discussion of subjectivity: how far do you want to go with grouping or splitting these things?
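The orthogonality worry can be checked directly: given scores on two supposedly independent trait dimensions (the numbers below are invented for illustration), compute Pearson's r; anything far from zero means the dimensions are not independent.

```python
# Hypothetical illustration with made-up trait scores: if "openness" and
# "extroversion" scores track each other, the dimensions are correlated,
# i.e. not orthogonal. Pearson's r computed by hand.
openness =     [3.1, 4.0, 2.5, 3.8, 4.4, 2.9, 3.5, 4.1]
extroversion = [2.8, 3.9, 2.2, 3.5, 4.6, 2.7, 3.3, 4.0]

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

r = pearson_r(openness, extroversion)
print(round(r, 2))  # far from zero here, by construction
```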
Mm hmm. We have a few minutes left; I want to talk a little bit more about what it is that IQ tests are actually measuring. Good question. I know that IQ tests are correlated reasonably strongly with various things, like performance in school, salary, and so on. Yeah. But that alone doesn't seem like very good evidence that IQ tests are measuring some property of your mind which is then, in turn, causing all of those good outcomes. That's right.
I know there's a lot of research on this, and I've only read some of it. But so far it doesn't seem clear to me, from what I've read, that we can rule out explanations like: people who are good at focusing and are patient do well on IQ tests, and those are also the same people who then earn a lot of money.
Yeah, no, that's right.
So there is the usual problem with correlation and causation. Particularly when it comes to such high-level human cognitive traits as intelligence and all its aspects, it's very difficult to tell which way the causal arrow actually goes.
For one thing, some of these things can be self-fulfilling prophecies. The interesting story about the IQ test is that when Binet originally came up with it, he meant it for a socially progressive purpose. He basically wanted to identify children who were falling behind in elementary school, very early on, so that they could be helped and brought back up to the level of the rest of the class. And he didn't think there was anything innate about this thing.
It was just a matter of: for whatever reason, people fall behind on certain things, and we can identify them early on and bring them back up to speed. Right. Ironically, IQ tests, broadly speaking, have been used for exactly the opposite purpose. They've been used for classifying people and claiming that this race or that gender or this ethnic group, whatever it is, is inherently inferior to another, usually to us.
And there's no point in devoting more resources to educating or helping that group, because, see, the test has shown that it's an innate trait of theirs. Exactly. Which is really ironic, because it's the opposite of the original purpose of the test. Exactly.
And, you know, the famous, or infamous, depending on who you ask, Bell Curve book that came out a number of years ago: that was its fundamental message, that it doesn't matter what you try to do in terms of social policies or education, because if people are deficient in intelligence, there's nothing you can do about it. Now, we could have a whole different conversation here, of course, about heritability and the pitfalls of that kind of analysis.
But as far as IQ tests are concerned, one needs to be really careful about what we're measuring. IQ tests, of course, have changed over time, and there are some very amusing examples that tell you how easy it is to get an IQ test wrong. I don't know if there is a way to get it right, but there certainly are many, many ways of getting it wrong. For instance, a classic example, which comes from a book by Stephen Jay Gould from several years ago, is a question in an IQ test that was administered early on by immigration authorities in the United States.
The question was in the form of a picture of a tennis court, and the tennis court was missing the net. And the question was: what is missing from this picture? Well, if you've never seen a tennis court, you have no idea. But that has absolutely nothing to do with intelligence.
The implication there was that knowledge of tennis courts was an innate trait this test was measuring. Some babies are just born with a better understanding of what tennis courts look like.
Now, of course, the modern versions have taken care of those kinds of obvious faults, and people have actually tried to produce tests that are a little more cross-culturally stable, a little less dependent on specific knowledge of a particular cultural environment. But still, that possibility is always lurking.
There's still the problem that socioeconomic status and other environmental factors are highly correlated with scores on IQ tests. And as countries develop, their average IQ score goes up. So it seems really problematic if your goal is to infer innate differences from differences in IQ scores, because surely you would expect there to be some difference in average IQ score between two groups whose socioeconomic status or other environmental factors differ.
And then there's the question of how big that expected difference is, and how it compares to the observed difference in average IQ scores. It's just very unclear to me how you would confidently say that the observed difference is bigger or smaller than what you would have expected from environmental factors alone.
Yes, and not only that, but the results are pretty clear evidence that IQ scores have actually gone up within populations over the decades.
Yeah, that's called the Flynn effect. Right. And that clearly can't be the result of genetic changes; that's just too short a time period for that to be the case. So, right. Well, we are just over time, and have just barely managed to avoid wading into the highly explosive field of IQ and cultural differences. So let's stop while we're ahead, wrap up this section of the podcast, and move on to the rationally speaking picks.
I'd like to take this moment to remind our listeners that if you're a fan of the Rationally Speaking podcast, you'll definitely enjoy this year's Northeast Conference on Science and Skepticism, which will be held in New York, New York, the weekend of April 5th through 7th, 2013. Go to necss.org now to get your tickets; they're on sale. In addition to Massimo and me, you'll also find a lineup of great speakers, including the SGU, Simon Singh, Michael Shermer, and our keynote speaker, physicist Leonard Mlodinow, author of The Drunkard's Walk.
That's necss.org, n-e-c-s-s dot org. Go get your tickets now.
Welcome back. Every episode, Julia and I pick a couple of our favorite books, movies, websites, or whatever tickles our rational fancy. Let's start, as usual, with Julia's pick.
Thanks, Massimo. My pick is an organization, or a website, depending on how you measure things. It's an organization I'm a big fan of, called the Center for Effective Altruism. There's a cluster of related organizations under that umbrella, but they're all focused on how to do the most good possible with whatever you have: your time, your money, etc. So they're focused to some degree on cost-benefit analysis of different charities, or different ways of helping the world, really trying to measure what kind of outcomes you get for a given investment of money.
But they're also focused on questions about what careers you can go into to have the greatest marginal positive impact on the world. And so there are some interesting philosophical questions there that they deal with, and most of the founders are philosophers, based out of Oxford. I like them already. So they deal with questions like: what baseline should you measure your contribution against? Like, if you're a doctor and you're saving lives, do you measure your positive impact in terms of the number of lives you've saved, or relative to the number of lives that would have been saved by the person who would have been the doctor if you weren't?
Anyway, I'm going to post a link to their site and especially to the blog that they run, which I think is a particularly thoughtful blog and a great example of philosophy being used to have a positive impact on the world. Sounds good.
My pick, on the other hand, is about a book, but it's not the book itself, because I haven't read it yet. It's a book review, which piqued my interest so much that I'm going to actually read the book. The book is called How to Talk About Books You Haven't Read. You're kidding.
No, that is the title. This is so meta, my brain just exploded. Right.
And it is by a University of Paris literature professor, Pierre Bayard. The book review is by Maria Popova, who is the curator of the Brain Pickings website. And the basic idea, from what I understand (of course, if I followed the suggestion of the book itself, I shouldn't really need to read the book in order to talk about it, which is what I'm doing), is more serious and a little less provocative than the title would imply.
This is not a how-to guide for bluffing your way through conversations at a cocktail party. The point is that there is a real problem with the huge number of books, not to mention other kinds of writing, articles, blogs, and so on, that are produced every year. Of course, there is no way any sensible human being is going to be able to read more than a fraction of them.
And that's not only the ones coming out now, but also, of course, everything from the classics on. Nonetheless, the author of this book, Bayard, claims that we need some kind of map to navigate the cultural territory, the cultural landscape, that is actually shaped by these books. And so, this is a quote from the book: Not reading is not just the absence of reading.
It is a genuine activity, one that consists of adopting a stance in relation to the immense tide of books that protects you from drowning. On that basis, it deserves to be defended and even taught. So that was stimulating enough to convince me to get the book and see what the substance actually is. And if it turns out to be substantive, maybe we can have the author on the podcast in a future episode. That would be great.
I would love to discuss other ways to reframe my own laziness or negligence as a positive stance. Something like: I didn't go to the gym, not because I'm lazy, but because I'm taking a stand against society's unfair beauty standards.
Absolutely. And I'm sure you could find something. I would agree with you.
All right, we are all out of time. So this concludes another episode of Rationally Speaking. Join us next time for more explorations on the borderlands between reason and nonsense. The Rationally Speaking podcast is presented by New York City Skeptics. For program notes, links, and to get involved in an online conversation about this and other episodes, please visit rationallyspeakingpodcast.org. This podcast is produced by Benny Pollack and recorded in the heart of Greenwich Village, New York.
Our theme, Truth by Todd Rundgren, is used by permission. Thank you for listening.