Rationally Speaking is a presentation of New York City Skeptics, dedicated to promoting critical thinking, skeptical inquiry and science education. For more information, please visit us at nycskeptics.org. Welcome to Rationally Speaking, the podcast where we explore the borderlands between reason and nonsense. I'm your host, Massimo Pigliucci, and with me, as always, is my co-host, Julia Galef. Julia, what are we going to talk about today?
Today's topic is crowdsourcing, which is using a large group of people, of strangers, often over the Internet, to solve some problem or answer some question. And there are a ton of examples of crowdsourcing. People are doing it more and more to design parks, or to solve research problems, or to create comprehensive online encyclopedias of all of the world's knowledge. And some of the approaches to crowdsourcing work better than others. So we're going to talk today about the relationship between the consensus opinion of a crowd and truth, and what kinds of methods of crowdsourcing are more likely to generate truth and useful answers.
So let me get this straight. Crowdsourcing in general is sort of a distributed problem-solving approach, or a distributed approach to problem solving, whichever it is.
Sure. I suppose we could crowdsource the answer from the two of us. We pick the best one, and it goes right up to the top.
It's a small crowd. But yes.
Now, one of the things that I read is that some people tend to think of crowdsourcing as a phenomenon that arose with the Internet. In fact, the term crowdsourcing is pretty recent. It comes from a Wired magazine article published in 2006.
It's that recent, really. It was coined by Jeff Howe for Wired. But the thing is, there are historical examples of things that, even without the Internet, we could call crowdsourcing. I mean, the one that I came across in doing the readings for this episode is the Oxford English Dictionary, for instance.
Why don't you explain a little bit about that?
Well, so the OED was actually started by the editors asking for submissions of words, definitions and word usage. And the whole thing received something like six million submissions over a period of seventy years.
So do you know what kinds of people? This is one thing that fascinates me about crowdsourcing: what motivates people to contribute their time and effort to this big project where they're just an anonymous piece of the puzzle?
Well, that is a good question.
So there are some suggestions, I don't know about research, but suggestions, about what the motivations are. And just like pretty much anything else that human beings do, I suppose, there can be intrinsic motivations and there can be extrinsic motivations. Right. So an intrinsic motivation would be something like, well, I just like doing it because it makes me feel good, for whatever reason. Like, for instance, in the case of the Oxford English Dictionary, presumably, or the modern version, which is Wikipedia, somebody might do it.
I'm not a contributor to Wikipedia, for instance, although I think I'm registered. I feel like a freeloader.
Yes, I feel like that a little too. But that's OK. Not everybody has to be a contributor to use it, as long as you use it properly. But the thing is, for both the Oxford English Dictionary and Wikipedia, I assume that part of the reason people contribute without getting any monetary or financial advantage, in some cases without even being recognized officially, is that they want to be part of this bigger thing, you know.
And it's one of the things that Aristotle would say adds meaning to your life.
Being part of a project. You know, now that you mention it, I hadn't made this connection between Aristotle and Wikipedia, but I know this feeling of deriving meaning or satisfaction from being a part of this thing.
Even though you're anonymous. Are you familiar with this phenomenon? I guess it's kind of crowdsourcing.
It's like grassroots, spontaneous crowdsourcing of humor.
So on Amazon, sometimes I think spontaneously there will arise this like trend or this pattern of people writing reviews for some random, like, stupid little object for sale.
And the reviews, you know, only tangentially relate to the object. They're really just excuses to do creative writing. So one of the most famous, I think, originators of this trend was milk. There was a gallon of milk for sale on Amazon and people just started writing hilarious reviews, and the reviews got more and more elaborate. There are thousands and thousands of reviews now, if you Google "Tuscan milk Amazon". And some people wrote long epic poems about milk.
Other people wrote detective stories about milk, other people just wrote satires. I also contributed.
I first saw a detective story. Right. Satires as opposed to. Yeah, yeah.
And it just became like this massive jam session, essentially, between thousands and thousands of strangers on the Internet, all with just this one sort of random theme to anchor it. And it was such fun to be part of it. Of course, mine was anonymous. Most were anonymous. So no one's going to recognize me from my writing. But it's such fun, yeah, to participate.
And actually, I mean, that is an interesting or amusing example. But even Amazon reviews in general, or reviews on websites in general, are often either anonymous or quasi-anonymous, where people use a handle that doesn't necessarily tell you who the actual person is. So we use it all the time. Right. So obviously, the Rationally Speaking blog has a lot of people who comment on it, and most of them are, in fact, anonymous or quasi-anonymous.
And yet they get involved in all sorts of discussions. Sometimes the cumulative responses that we get are actually longer than the posts themselves.
You know, some of the best discussions are not short. I mean, typically a Rationally Speaking post is at least fifteen hundred, sometimes 2500 words, which is almost three times the length of an editorial or op-ed piece in a newspaper. So to get people, sometimes dozens of people, to actually contribute back and forth, sometimes for weeks, with no reward whatsoever other than the pleasure of having shared their opinion.
Well, I think the pleasure of sharing their opinion seems like an explanation in itself, that people really feel compelled to give their opinion about things, right? And then it's the other kind of crowdsourcing that's a little harder for me to explain, where people are doing work. Doing work to contribute to a large task is sort of less inherently satisfying, I think, than, you know, voicing your opinion or judging something or writing about it.
So another example of crowdsourcing where people seem genuinely, really motivated to be a part of it is Linux and other open source projects, like open source projects in Java.
And my cousin actually ran one of the Debian conferences, which is another open source...
I don't even know which noun to use to describe what Debian is, because I don't know anything about it. But I know the people. And we can crowdsource that. Yeah, thank you.
The people in his community of Debian developers are sort of ideologically passionate about open source, about the idea that things in the world should be open source.
The world would be a better place if things were open source. And so that I actually do understand. That's, you know, like a principle that they're working towards.
That's right. So so far we've found the following motivations. There's the intrinsic pleasure of contributing to something that actually makes use of your skills. Right. So you get to use your skills to do something. Another intrinsic motivation is the pleasure of saying, basically, what you think about something. I'm thinking not only of comments on blogs and all that, but things like, you know, I use Yelp a lot, for instance, when it comes down to picking a restaurant to go to.
And then I feel like I have to contribute. And so I do contribute mini reviews. So part of my motivation is, I suppose, ethical, as in: I use the system, and therefore I am using other people's opinions on these things. So why shouldn't I contribute to it, since it doesn't take that much time? But the other thing is, yes, there is sort of a compulsion, at least in certain kinds of personalities, to want to give your opinion about something, you know, regardless of whether it's positive or not.
And then there's just the fun, like the jam session and the fun reviews of Tuscan milk.
Now, there are also, of course, extrinsic motivations. The most obvious one is money. Some of these people actually are paid. There are a lot of sites like, what is it, Amazon Mechanical Turk, that apparently pay very low amounts of money, very little.
And in fact, that brings up... well, let's break off for a second, because that actually brings up an interesting ethical point.
Oh, yeah. I also want to talk about that. But let's go back to that one in a minute. Still, that's an extrinsic motivation, however little you're being paid to do something. There's also a less tangible sort of extrinsic motivation, such as social recognition, especially if you're not anonymous, obviously. But even if you are anonymous, you still have this feeling that you're recognized; if not you, then your handle or whatever it is.
That's right. So there is a sense in which, OK, you do it because we're social animals. We'd like to be recognized by a social group, particularly a social group that we value for whatever reason, you know, the social group of people who contribute to a particular type of crowdsourcing problem, for instance, or open source software and that sort of stuff. So I think, as it turns out, I mean, I was wondering about motivations myself when I started reading about crowdsourcing.
And it turns out, actually, that's probably one of the easiest things to figure out: there are plenty of motivations, both intrinsic and extrinsic, why people might want to do that, even anonymously. So now we go back to the Mechanical Turk thing, or similar. Similar because it's not the only one. I don't want to pick on Amazon in particular, and I don't know exactly how it works.
I never participated in Mechanical Turk, but basically the crowdsourcers set up a problem that can be, you know, either subdivided into different tasks, or people compete.
Essentially, you bid for doing a particular task. And then if people complete it within a certain period of time, they get, you know, monetary remuneration. So clearly that gives, as we just said, an extrinsic motivation. The problem is that the statistics seem to be pretty clear that most of the time the people who are paid to do this kind of work are paid below minimum wage in the country from which they originate.
Wow. That's very low. Right.
And in fact, it's not just because so many of them come from India. A large number apparently come from India, Indonesia, Bangladesh, but also the ones that come from the United States tend to be paid lower than the minimum wage in the United States, which, of course, raises legal issues, particularly in countries where there is a minimum wage, but it also raises ethical issues. A lot of these projects, for instance, are done by researchers.
University researchers, who have grants that allow them to set up a project like this and get input from thousands or tens of thousands of people, potentially. But the problem is that then these projects are, for all effective purposes, exploiting people's labor without recognition.
So I can just give an example of a project I know of that used Mechanical Turk. It was trying to train a machine learning, like an artificial intelligence, algorithm. I think this is the project, and I think it had to recognize what was pornography and what wasn't.
And so, you know, there are various, like, I don't know, colors and compositions of pictures that are more correlated with pornography than not.
But it's hard to give those instructions to a computer in a way that the computer could use to recognize pornography reliably. So what you have to do is just, you know, get a huge data set of things that are both porn and not porn, and get humans to go through and just click porn, not porn, porn, not porn. And gradually the computer can learn from their data what things are associated with, you know, something being porn.
So that would be a thing that Mechanical Turk would be used for. You just look at a picture, click porn or not porn, and you just do that for hours and you're paid.
Yes. I thought this wasn't the best example of unethical... well, actually, maybe ethically it's even more interesting. But yeah, OK.
That sort of project does raise these kinds of questions. There is another question, which again, especially for research projects, is brought up by crowdsourcing, which is the composition of the crowd itself. Yeah. These are clearly not random samples of people. In fact, there are some statistics, there are some studies, on the demographics of crowdsourcing. And interestingly, the results of some of these studies, partly, I would guess, because the phenomenon is still new and therefore very dynamic, seem to be changing very rapidly.
So, for instance, there was a survey done in 2008 where the researchers found that the majority of the contributors were American, young, female, fairly well-educated and with fairly large incomes. But only a year later, another survey found very different results. This one, focusing in particular on Mechanical Turk, found that actually about one third, I guess 36 percent, of the surveyed population was Indian, and that two thirds of these were male.
And, you know, the level of education was largely lower. The income was definitely lower. And so there are differences in gender, there are differences in geographical distribution, there are differences in education. So especially, again, if you use this for a research project, then you might want to be cognizant of the fact that the kind of results you get may, in fact, depend, sometimes to a large extent, presumably, on the fact that you do not have a random sample of the population.
And in some cases you don't even know what your population is compared to what it should be. And you may want to target a particular population. In the case of pornography, for instance, I would think that if you present the same set of images to people with significantly different cultural backgrounds, or even just gender biases, you would get very different results in terms of outcomes.
So the study I was thinking of might have been, is there nudity or not? That might have been the question, not pornography. Because you're right, pornography is famously hard to define. Nudity is probably easier. But even so, right? I don't know, my guess would be that males would be more tolerant of nude images than females, especially in certain cultures, but in other cultures it may go the other way around.
Or it may be that there are certain differences between, let's say, an American or Western audience versus, you know, a Middle Eastern or Asian audience and so on. And we don't necessarily know ahead of time what these are going to be, nor how you could possibly counter that sort of sampling bias.
In fact, I guess I'm a little confused about how to think about what harm is being done by offering people online from any country the ability to or the opportunity to, you know, take up and do these tasks for very small amounts of money.
Well, the obvious one, and this is a discussion we might not necessarily want to go into, is the same kind of harm you do any time you manage to pay somebody less than the minimum wage.
Right. So, yeah, maybe that becomes a political discussion.
But it's interesting, because the listeners shouldn't get the idea that these are people who necessarily just do it as a hobby. Because if you do something just as a hobby and, you know, it's in your spare time, the fact that you get paid at all could, I suppose, be considered an additional incentive to do it. But it's not a big deal. But as it turns out, especially for some of the Indian workers that work on Amazon Mechanical Turk, that actually turns out to be a major component of their income, either supplemental or, in fact, a large component of income, in which case you really are talking about exploitation.
So that sort of stuff.
I actually was thinking of another ethical issue with crowdsourcing that has occurred to me in the past.
So competitions are a really good way for a company, organization, whatever, to generate really promising ideas for, I don't know, a plan for the park that they want to build, or an idea for a logo, or, you know, really anything.
And this tactic is used a lot. And the reason that it's so effective is, you see, the company or the organization gets dozens or hundreds or sometimes thousands of submissions.
And all you have to do is pay for one of them, maybe, you know, the large prize.
So, you know, it ends up being a really good deal for you. If you actually hired a thousand people to each, you know, do a draft proposal for you and paid for their time, that would be incredibly expensive, far, far more than the amount of prize money you're giving out for the competition. And in some cases I've seen, especially in architecture and design, the prize you get is just being hired. So, you know, you have thousands of people spending a lot of time creating submissions for the contest just for the chance of being hired.
And I think that the reason that contests work out to be such a good deal for the company sponsoring them is that people overestimate their likelihood of being chosen. Right. I mean, I don't know that for sure. That's an empirical question, but that's my guess as to why they do it.
And so it is kind of an exploitation of this cognitive bias that people have. Although that's no different from the way we sell lottery tickets, in some sense.
Right. It's true, with the difference that in the case of lottery tickets, it's entirely up to randomness.
You know, I'm also not defending that.
It's just smaller. Well, I don't know, maybe over the course of someone's life it's a large expense they spend on lottery tickets. I was just thinking that in a contest, people spend a lot of time. I used 99designs to come up with the logo for my organization.
And I felt pained at the hundreds of people who were submitting designs. Almost all of them just got a thank you note.
I'm not sure that is good. But it's not just companies, as it turns out.
The interesting thing, if you look at the history of these things, again, pre-Internet, is that it was largely governments that started this, in particular the French government and, to some extent, the British government. So, you know, famous examples: the French government, for a couple of centuries actually, has run these national competitions to solve problems that up to that point were not resolved. So as it turns out, the first commercial turbine was developed that way, through a competition of this kind.
And the French government also got somebody to develop a new way of preserving food in airtight containers. And of course, one of the most famous ones is the British government's Longitude Prize. Longitude. Yes. So this was a prize that was announced to help determine the longitude of a ship at sea, which was an unsolved problem until fairly recently. And in fact, there is a book called Longitude that is about the discovery of this method.
And so that is, again, an interesting situation where it was actually governments, in this case the French and the British governments, that used the same system, and successfully. A lot of these prizes actually did get claimed. And society at large, you might say, benefited from this thing. But, yes, probably there were a lot of unsent thank you notes by both governments.
I kind of want to talk about the kind of crowdsourcing problems where you're not just choosing your favorite option from among the many submissions, or just hoping that one of the submissions turns out to work, but you're trying to sort of collectively reach truth, like trying to collectively generate the best answer. And, you know, I see huge differences between, say, Wikipedia on the one hand and Yahoo! Answers on the other, both of which try to, you know, use this crowdsourcing approach to come up with true answers to questions, but do dramatically different quality jobs at it.
So, you know, Wikipedia, on the one hand, I've found to be frequently more accurate than, well, maybe not published books and journal articles, but certainly far more convenient. And I would say far more accurate than, I don't know, maybe reading a pop science article.
There was this famous article, a number of years ago now, in Nature magazine that actually compared the accuracy of certain kinds of science entries between Wikipedia and the Encyclopedia Britannica. Oh, really? And Wikipedia turned out to be about as accurate as the Britannica.
Wow. I didn't know that. Now, that's not true in general. For instance, as you know, there are certain areas of Wikipedia that are regularly disputed, especially when it comes down to political figures and topics, which is why, in fact, Wikipedia, contrary to what most people seem to think, is not an entirely unstructured thing. I mean, they do have filters, and they have people who actually monitor certain content pages and so on.
Nonetheless, the result was interesting. In fact, on Wikipedia you can look up the references about Wikipedia's accuracy. There is a page on Wikipedia about Wikipedia's accuracy. And you don't need to trust that. You just need to go and check the original sources. Of course, the editors of Encyclopedia Britannica, for instance, disputed the findings of the paper.
And so the editors of Nature had to actually go back and check the data and adjudicate that dispute. And it turns out that the authors of the original paper were correct, that Wikipedia, for those entries in certain science topics, was in fact about as accurate as the Britannica. So it's interesting.
Yeah, and I think you're right about Yahoo! Answers. The few times that I've looked at it, some of the answers were horrible. Yeah, OK.
The classic Yahoo! Answers question is, "How is babby formed?"
Oh, yes, "How is babby formed."
And the answers, you know, ranged from... you know what, I can't talk about it.
It just occurred to me. Never mind any kind of censorship. OK, fine. Point being, they were not accurate and they were not, you know.
Right. The, you know, children of our society being given sex ed by Yahoo! Answers is my point here. That's a bad idea.
But here is another example that I use on a regular basis, which is whenever I have a computer problem. Typically, I don't look at manuals at all, first of all. And I don't try, you know, the usual sort of, I don't know, customer support or something like that.
I just Google the question. I phrase a question as in, you know, why does my iPod not do this?
And typically, almost invariably, within the first page of answers, I will get the right answer. And more often than not, the right answer comes out of a user forum for that particular, you know, computer software or whatever it is.
Now, there is quite a variety in the quality of answers. Some of them, in fact most of them, are actually useless. But you will get pretty quickly, within the first page or two, something that you can actually use to solve the problem.
So that's a nice example of a case in which you can recognize the true answer when you see it. You know, you can find out whether it works. Whereas on Wikipedia or Yahoo! Answers, a lot of the time you can't. The tricky problem really is in finding a way to let the correct answers rise to the top, because the reader is not able to tell which the correct answers are. And I mean, I'm only speculating, but I think it helps, of course, that Wikipedia has an active and smart moderating community.
But it also helps, I think, that Wikipedia allows you to edit or, you know, submit for deletion other people's answers. Whereas on Yahoo! Answers, you can submit your own, but you can't edit or delete other people's. And so you just end up with this huge proliferation of answers, most of which are horribly, amusingly wrong.
And that makes it harder to separate.
That brings me to another type of crowdsourcing, which is sometimes referred to as crowd voting or crowd ranking.
So this is used, obviously, by websites such as Google itself. But it's also used increasingly by professional outlets. There is an increasing number of open access scientific journals that do that. So, for instance, the Public Library of Science, which is a collection of different kinds of journals that deal with biology, chemistry, physics and so on, has essentially abandoned the standard peer review process.
They modified it. What they do is they still send out articles for review. But the reviewers are asked only to apply minimum standards. They're only asked to verify that the article is coherent, that it makes sense, that there are no obvious mistakes, nothing obviously wrong with it. But they're not asked the other major thing that a professional editor for a scientific journal usually does ask, which is: what is the worth of this in the field? Is it worth the space of publication?
Now, since PLoS is online, they don't have a problem with space of publication, so they publish everything that meets these minimum standards of readability and coherence, sort of a minimum standard of publishability. And then after that, it's the users of the site, who, I believe, have to be registered, so you actually know who they are. And presumably they are professionals, not members of the public at large.
But nonetheless, the users essentially get to vote on the papers, and papers that are read or commented on more often get to the top of the list. And so they become more visible. Now, it's an interesting system. It does save a lot of time. If you are somebody who wants to read the latest in this particular large field of science, then you're going to start with the articles that float to the top.
Now, of course, there are several problems. One is that if the community is actually fairly small, you can have drift, random drift. Essentially, things come floating up and down without that necessarily having anything to do with the quality of the piece. And some scientific fields of research are, in fact, very small. The actual community is very small. Second, of course, the system is open to abuse from the outside, to trolling.
Essentially, you have to register for the website, but anybody can presumably register. And I'm going to give you an example in a minute of something that may be even more open to outsiders essentially coming in and hijacking the system. And then there is the other issue, that the judgment of even a community of experts at a particular point in time may not be the best long-term judgment.
And it tends to be self-reinforcing.
You know, if one particular paper, or a group of papers, floats to the top of the list, then obviously many more people will read those papers than anything else that is on the site. And they will tend to vote on a subset of the papers, which will sort of keep reinforcing the system. And then you might have some gems that are completely lost.
There is a great experiment, whose name I'm now blanking on, that took a large group of people, divided them up into several groups, and gave each group access to a set of new songs, and they could download the songs and rate them and so on. And at the end of the period of time, whatever it was, the researchers looked at which songs were the most popular in each group. There were a few songs that were the most popular by far.
There were, you know, like the equivalent of the top 40, but in that small experimental group. Small? I don't know, at least 10,000, I think.
And that's small? Yeah, I mean, small compared to the world. But the most popular songs, the ones that were judged to be the best, were very different in each group. And so the system of judging the popularity of songs was just incredibly sensitive to initial conditions.
So, you know, a few songs happened to get more positive votes early on, and then other people would listen to them and then rate them positively. And so there's just this snowball effect. And it sounds like, you know, a similar thing can happen with, you know, exciting academic papers. Or going back to Amazon reviews.
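The snowball effect described here can be sketched in a few lines of Python. This is a toy model, not the experiment the speakers are recalling: the rule that a listener picks a song with probability proportional to one plus its current download count is an assumption made for illustration, as are the ten songs, the thousand listeners, the eight "worlds", and the helper names `run_world` and `top_share`. Every song is equally good by construction, so any runaway hit is due to early luck alone.

```python
import random

def run_world(n_songs, n_listeners, social_influence, seed):
    """Simulate one 'world' of listeners downloading equally good songs.

    With social_influence=True, each listener picks a song with probability
    proportional to 1 + its current download count (an assumed
    rich-get-richer rule). Otherwise every song is equally likely.
    """
    rng = random.Random(seed)
    downloads = [0] * n_songs
    for _ in range(n_listeners):
        if social_influence:
            weights = [1 + d for d in downloads]
        else:
            weights = [1] * n_songs
        song = rng.choices(range(n_songs), weights=weights)[0]
        downloads[song] += 1
    return downloads

def top_share(downloads):
    """Fraction of all downloads captured by the most popular song."""
    return max(downloads) / sum(downloads)

N_SONGS, N_LISTENERS, N_WORLDS = 10, 1000, 8  # illustrative values only

influenced = [run_world(N_SONGS, N_LISTENERS, True, seed) for seed in range(N_WORLDS)]
independent = [run_world(N_SONGS, N_LISTENERS, False, seed) for seed in range(N_WORLDS)]

mean_top_influenced = sum(top_share(w) for w in influenced) / N_WORLDS
mean_top_independent = sum(top_share(w) for w in independent) / N_WORLDS

# Which song "won" in each world where listeners could see the counts.
winners = [w.index(max(w)) for w in influenced]

print("winning song per world:", winners)
print(f"mean top-song share with influence:    {mean_top_influenced:.2f}")
print(f"mean top-song share without influence: {mean_top_independent:.2f}")
```

In this sketch the only difference between worlds is the random seed, so comparing the winners across worlds shows how much of a ranking can come from early chance votes rather than from quality.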
I know that, unfortunately, because I think it's unethical, a number of publishers actually encourage their authors to round up a bunch of friends when a book comes out to put out reviews, and that generates buzz. But you can argue that that defeats and undermines the whole system.
Yeah, I think this this problem is a good example of one of the ways that the wisdom of crowds principle can fail.
Which is one type of crowdsourcing, of course. Right.
Right. The idea is that the collective judgment of a crowd of people, in aggregate, results in more accurate decisions than any single member of the group could have made, or than an expert could have made. And, you know, classic examples of this are like if you get a group of people to try to estimate how many jelly beans are in a jar. Right. The average of everyone's answers is going to be much more accurate than any individual person's answer, or most.
The first recorded example of that, at least that I could find, goes back to 1906. It was a crowd of 800 people that was asked to estimate the weight of a slaughtered and dressed ox. No.
And he tries not to be to tell me. That's right now. But interestingly, first of all, they got it right within one percent of the true weight. Wow. And we're talking about more than a thousand pounds. So it's a pretty substantial amount of mass there. But interestingly, the person who actually carried out the experiment, organized the experiment, was none other than Francis Galton, Darwin's cousin, and one of the early statisticians. Oh, wow.
So, yeah. So he was the first one to figure out that, yeah, maybe we can use this kind of thing. Yeah.
And of course, the basic idea there, from a statistical perspective, is that it works because the errors that are going to be associated with any particular estimate are going to cancel each other out. An error, like the distance of that person's estimate from the true answer, is going to be canceled, especially if you get a sufficiently large crowd.
In fact, there are people who have proposed theorems about the behavior of the error in crowdsourcing. But anyway, that was the first example.
But you were saying? Well, I brought that up as sort of the classic example of the wisdom of crowds working. I mean, there are other examples, but this is one example. And as you were saying, one of the reasons this works is because the mistakes that people tend to make when estimating the number of jelly beans in a jar or the weight of an ox are sort of random and uncorrelated with each other.
And so, you know, you have some people guessing too low and some people guessing too high. But you don't have, you know, most people systematically guessing high or most people systematically guessing low, which is why they average out to close to the true answer.
So in cases when people are sort of systematically biased, the wisdom of crowds is not going to work for you.
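As a rough illustration of the point about random versus systematic errors, here is a minimal Python sketch. It is not Galton's data: the ox weight, the spread of the guesses, and the size of the shared bias are all made-up numbers, and the function names are hypothetical. It simulates one crowd whose errors are independent and one crowd whose members all lean the same way.

```python
import random
import statistics

def simulate_crowd(n_people, true_value, noise_sd, shared_bias=0.0, seed=0):
    """Each guess = truth + a bias shared by everyone + that person's own random error."""
    rng = random.Random(seed)
    return [true_value + shared_bias + rng.gauss(0, noise_sd) for _ in range(n_people)]

def crowd_error(guesses, true_value):
    """How far the crowd's average lands from the truth."""
    return abs(statistics.mean(guesses) - true_value)

def typical_individual_error(guesses, true_value):
    """Average distance of a single guess from the truth."""
    return statistics.mean([abs(g - true_value) for g in guesses])

TRUE_WEIGHT = 1200.0  # assumed ox weight in pounds, for illustration only

unbiased_crowd = simulate_crowd(800, TRUE_WEIGHT, noise_sd=100)
biased_crowd = simulate_crowd(800, TRUE_WEIGHT, noise_sd=100, shared_bias=150)

individual_error = typical_individual_error(unbiased_crowd, TRUE_WEIGHT)
unbiased_error = crowd_error(unbiased_crowd, TRUE_WEIGHT)
biased_error = crowd_error(biased_crowd, TRUE_WEIGHT)

print(f"typical individual error:  {individual_error:.1f} lb")
print(f"independent crowd's error: {unbiased_error:.1f} lb")
print(f"biased crowd's error:      {biased_error:.1f} lb")
```

Under these assumptions the independent crowd's average lands far closer to the truth than a typical individual does, while the biased crowd's average stays off by roughly the shared bias no matter how many people are added.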
Like, you know, if you have people estimating which candidate is going to be the better president, even if we all agreed on what better president meant.
Right. There are still going to be systematic biases, because everyone is, or most people are, subject to, you know, thinking more highly of a candidate if he smiles more or is taller or has better marketing, you know.
So these are going to detract from the wisdom of crowds. And then the other reason I brought this up was that another reason the wisdom of crowds doesn't work is if people hear each other's answers and are able to change their own opinion based on other people's answers. So you get the sort of echo chamber effect where, you know, I know that other people think this paper is good. So I'm going to take it more seriously and then I'm going to say it's good and other people hear me say it's good and so on.
And I've run into this problem personally, and I've had to, like, train myself to be more mindful of this non-independence of opinions problem.
So, for example, when I first moved out to the Bay Area, I noticed that I met a lot of people, a lot of really smart, well-educated people who seemed quite competent.
Those I know, I try to avoid them. So a lot of smart, educated people who thought that the paleo diet was the sort of best nutritional choice you could make for yourself.
That's the diet where you try to eat as closely as possible to what our ancestors ate in, you know, the Pleistocene.
So, you know, meats and vegetables and, you know, some fruits, no processed food or grains. And there's like people go back and forth when they get their mammoth meat.
Well, so, you know, as close as possible, there's some wiggle room there. Right.
But anyway, so I was pretty struck by this at first, like, wow, all of these people, you know, have come to the conclusion that the paleo diet is the way to go.
And then I started digging a little deeper and asking them so, you know, how did you come to this conclusion?
You know, so I asked John and John was like, well, you know, I know this guy Will who has done a lot of research on the paleo diet. And, you know, he's really smart and, you know, I trust him.
And then I ask, you know, Mary, and she's like, well, I know a lot of people really buy into it. And, you know, Will has done a lot of research on it.
Then I ask, you know, Max, and he was like, well, I talked to Mary and she said, you know, this really... and so on.
And so, you know, you can, like, chart all of the causes of people's belief in the paleo diet. And they mostly boil down to, like, this one guy doing research on it, and then other people listening to him and listening to other people who listened to him. And it's not just equivalent to one person's opinion, because there's some additional evidentiary weight added by the fact that all these other people trusted his judgment enough to adopt his belief.
But it's still far less than whatever 12 independent... Yeah, exactly... would be. Well, what you end up having in that case is a violation of the principle of independence we were talking about. So you obviously have data points that tend to be biased in a particular direction. Now, that direction may be correct, if the initial estimate was in fact correct. But you definitely lose what is supposed to be the advantage of crowdsourcing. Right.
You begin with the connection between the number of people in your sample and the confidence that you can have in, you know, their consensus. That connection is severed if those people are not giving their independent opinions.
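To make the "12 people are not 12 independent data points" idea concrete, here is a small Python sketch under invented assumptions: every person's error is the sum of an error inherited from one common source (the "Will" of the story) and a small error of their own. The standard deviations and the function name are hypothetical, chosen only for illustration.

```python
import random
import statistics

def rms_crowd_error(n_people, shared_sd, own_sd, trials=2000, seed=1):
    """Root-mean-square error of the crowd average over many simulated crowds."""
    rng = random.Random(seed)
    squared_errors = []
    for _ in range(trials):
        shared = rng.gauss(0, shared_sd)  # error everyone inherits from the same source
        errors = [shared + rng.gauss(0, own_sd) for _ in range(n_people)]
        squared_errors.append(statistics.mean(errors) ** 2)
    return statistics.mean(squared_errors) ** 0.5

# Twelve people who each formed their opinion independently.
independent_12 = rms_crowd_error(12, shared_sd=0.0, own_sd=1.0)
# Twelve people who mostly adopted one person's view, plus a little noise of their own.
copied_12 = rms_crowd_error(12, shared_sd=1.0, own_sd=0.3)
# Ten times as many people copying the same source.
copied_120 = rms_crowd_error(120, shared_sd=1.0, own_sd=0.3)

print(f"12 independent opinions: {independent_12:.2f}")
print(f"12 copied opinions:      {copied_12:.2f}")
print(f"120 copied opinions:     {copied_120:.2f}")
```

In this toy model the error of the independent crowd shrinks as people are added, while the error of the copying crowd stays pinned near the error of the single source, which is the sense in which the link between sample size and confidence is severed.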
Now, we should also make clear what people mean when they talk about the wisdom of the crowd. So the first time... let me back up for a second here. The first time I heard the term wisdom of the crowd, I thought, oh, here we go, another new age, mystical mumbo jumbo thing. Like, what do you mean, wisdom of the crowds? But, you know, as we know, it does work.
There's pretty good empirical evidence of the cases in which it does work and the cases in which it doesn't work, such as the one you just mentioned.
But the crowd doesn't necessarily mean that.
It doesn't mean a random sample, as we were saying earlier with other examples. But it also doesn't necessarily mean a crowd of nonexperts, because sometimes the phenomenon is presented that way: oh, you know, a bunch of people know better than the experts. Right. When it does happen, it's very striking.
But in fact, a lot of the time the crowd itself is a crowd of experts. You know, the group of people that participate in the crowdsourcing situation or experiment is in fact made up of experts, either on purpose, because it's designed that way, you know, because the crowdsourcing is limited to a particular community, or sort of more or less automatically.
So I'm sure that if we crowdsourced, for instance, solutions to Fermat's Last Theorem, we'd probably get an overwhelming majority of mathematicians, because most other people wouldn't even know what Fermat's Last Theorem is or where to begin to propose a solution for the theorem.
And so you automatically have a sort of selection toward experts. That doesn't mean that the entire crowd or even the majority of the crowd will be made up of, let's say, top mathematicians, for instance. But it certainly would be made up of people who know enough about mathematics to even understand what the question is and to try to make a contribution. Right.
So that's the selection bias or the selection effect working, I guess, towards your advantage if you're trying to get the right answer. It can also totally work against you, which is actually another thing that I've, you know, trained myself or tried to train myself to be more mindful of.
I think I might have given this example in some podcast earlier, but in case I didn't, or in case people forgot: when I was thinking of applying to grad schools back when I was in college, I asked a bunch of professors if they enjoyed academia, if they enjoyed being professors, and they were generally very positive and encouraging, which I took to be a great sign.
And then only later did it occur to me that I had only been asking the opinions of people who liked academia enough to pursue it and stick with it. The people who knew ahead of time that they wouldn't like academia, or who tried it and realized they hated it, were not in my sample of people I was talking to.
So that's like asking Plato what's the best profession that one could have in life? And the answer to that is philosopher.
To be in charge of the country. Right. Now, there's one more thing before we move on. We're already wrapping up this topic.
But there is another issue here that came out in my reading, and it's interesting. It's sometimes referred to as the crowd within. The crowd within?
That also sounds new agey. It does sound new agey, doesn't it? But this is basically the idea that individual cognition also has a probabilistic component to it. And therefore, it has, again, these errors that cause distance between the estimate and the true response that you're looking for. And therefore, by sampling the individual himself or herself multiple times, you actually reduce the errors.
That's cool. Yeah, it is cool. And it actually does seem to work in some contexts as well.
There is some research that shows this: for instance, if you ask somebody the same question two or three times over a period of time and you get different estimates as an answer, the average of those estimates is in fact statistically significantly better than each one of the point estimates. However, the interesting thing is that there seem to be some limitations. First of all, apparently it does depend on your memory, on the subject's memory.
That is, the people who remember their first answer don't do as well, because the answers are not independent. That's right.
They're not independent. And apparently the idea is also that you don't want to do it over too long... I mean, too many times. So there seems to be an optimal level. There is also what is sometimes referred to as dialectical bootstrapping, which is a fancy term for just saying that you can get a first estimate and then ask the person in question to entertain a counter opinion about the estimate that he has just given.
And the research seems to show that if you do ask people to entertain counter opinions, they do come up with a better estimate overall.
Yeah. So it's interesting. Now, this research is not entirely, sort of, clear cut.
There are other papers that seem to show that the effect is either not as strong as some other people claim, or, as I said, that there are some other circumstances that need to be taken into account, including the memory of the individual.
But still, it's an interesting concept that you could sample yourself over time and come up with a better average than if you asked yourself the question only once.
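One way to picture the "crowd within" is the toy Python model below. It is only a sketch under made-up assumptions, not the model used in the research mentioned here: each person has a stable personal bias that does not change between askings, plus fresh random noise each time they answer. The numbers and the function name are hypothetical.

```python
import random
import statistics

def simulate_crowd_within(trials=20000, bias_sd=1.0, noise_sd=1.0, seed=2):
    """Compare one guess, the average of two guesses by the same person,
    and the average of one guess each from two different people."""
    rng = random.Random(seed)
    single, within, between = [], [], []
    for _ in range(trials):
        bias_a = rng.gauss(0, bias_sd)  # person A's stable personal bias
        bias_b = rng.gauss(0, bias_sd)  # person B's stable personal bias
        a_first = bias_a + rng.gauss(0, noise_sd)   # A's first answer (error from truth)
        a_second = bias_a + rng.gauss(0, noise_sd)  # A asked again later
        b_first = bias_b + rng.gauss(0, noise_sd)   # a different person
        single.append(a_first ** 2)
        within.append(((a_first + a_second) / 2) ** 2)
        between.append(((a_first + b_first) / 2) ** 2)

    def rms(values):
        return statistics.mean(values) ** 0.5

    return rms(single), rms(within), rms(between)

single_error, within_error, between_error = simulate_crowd_within()
print(f"one guess:                 {single_error:.2f}")
print(f"same person, two guesses:  {within_error:.2f}")
print(f"two people, one guess each: {between_error:.2f}")
```

In this model, averaging your own repeated guesses helps because the occasion-to-occasion noise partly cancels, but it helps less than adding a second person, because your own stable bias is present in both of your answers. That is the same non-independence issue the hosts raise about memory.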
Very cool. Well, it looks like we are now out of time. So let's wrap up this section of the podcast and move on to the Rationally Speaking picks.
I'd like to take this moment to remind our listeners that if you're a fan of the Rationally Speaking podcast, you'll definitely enjoy this year's Northeast Conference on Science and Skepticism, which will be held in New York, New York the weekend of April 5th through 7th, 2013. Go to necss.org now to get your tickets. They're on sale. In addition to Massimo and me, you'll also find a lineup of great speakers, including the SGU, Simon Singh, Michael Shermer and our keynote speaker, physicist Leonard Mlodinow, author of The Drunkard's Walk.
necss.org. That's N-E-C-S-S dot org. Go get your tickets now.
Welcome back. Every episode, Julia and I pick a couple of our favorite books, movies, websites or whatever tickles our irrational fancy. Let's start, as usual, with Julia's pick.
Well, Massimo, my pick is a book by Nate Silver. You've probably heard.
Oh, yeah, I've heard. It's called The Signal and the Noise. I'm in the middle of it right now. Of course, it had to be a pick. It was just a question of when.
So it's my pick now, and it is as good as people say it is. Nate Silver, I'm just so delighted with him. It's like hearing my own thoughts spoken by someone, you know, even more eloquently than I could. It, like, warms my heart.
Do you want to remind the two or three people who don't know who he is, who he is? Yeah.
So, you know, Nate Silver predicted the 2008 election within a hair's breadth and also essentially perfectly predicted the outcome of the 2012 election as well.
And not only the election itself, but actually state by state. Yeah, the presidency, but not just yes or no, which would not be as impressive. No, state by state.
Yeah, just far more accurate than almost any other pundit or pollster. And he uses really solid statistics.
And he talks in his book about the methods he uses for making predictions and about why they work and why other sort of common methods of making predictions don't work.
And the thing that I personally really like about the book is the shout-outs that he gives to Bayes' rule and Bayesian inference, which is basically a philosophy of statistics, or a way of doing statistical inference, that I think is basically correct. But it's not in as wide use as I wish it were. And in this one passage in the book on, I think, page 331, he describes essentially a game that people could play that I now really want to play.
And that would be like a game to train people to make predictions like rational agents.
So let me find the description. Yeah. Here.
So the event would involve people going around at this party or whatever, carrying signs showing the odds that they assigned to various outcomes, like the stock market crashing in the next month or life being discovered on Mars or, you know, so-and-so being crowned the next American Idol or whatever.
And as people mingle, when they pass each other and they see on each other's signs that they've assigned different odds to the same outcome, they are obliged to discuss and either share their evidence, come to a consensus and revise their odds to be the same, or they have to place a bet on their respective forecasts. You know, if I think that you're wrong, then I should be willing to wager some amount of money.
As they say, a bet is a tax on bullshit.
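The reason two people who disagree should both be willing to bet can be shown with a little arithmetic. The Python sketch below is an illustration only; the probabilities, the ticket framing, and the function name are assumptions and do not come from the book. One person buys, at an agreed price, a ticket that pays 1 unit if the event happens, and the other person sells it.

```python
def bet_expected_values(p_buyer, p_seller, price):
    """Expected gain of each side, judged by that side's own probability.

    The buyer pays `price` for a ticket worth 1 if the event happens.
    The seller collects `price` and pays out 1 if the event happens.
    """
    buyer_ev = p_buyer * 1.0 - price    # what the buyer expects to gain
    seller_ev = price - p_seller * 1.0  # what the seller expects to gain
    return buyer_ev, seller_ev

# Hypothetical signs at the party: one guest says 70%, the other says 40%.
p_high, p_low = 0.70, 0.40
fair_price = (p_high + p_low) / 2  # split the difference between the two beliefs

buyer_ev, seller_ev = bet_expected_values(p_high, p_low, fair_price)
print(f"agreed price: {fair_price:.2f}")
print(f"buyer's expected gain by their own odds:  {buyer_ev:.2f}")
print(f"seller's expected gain by their own odds: {seller_ev:.2f}")
```

Any price strictly between the two stated probabilities makes the bet look favorable to both sides by their own lights, so refusing to bet amounts to admitting that the number on your sign is not really your belief.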
So my pick is an app for the iPhone and iPad that was released recently. It's called Phi2Phi: phi, 2 as in the number, phi. In fact, I've actually written about it for the Rationally Speaking blog. It's put out by Jonathan Weisberg, who is a philosopher, and it is of interest to professional philosophers, grad students, or just people who want to play a little philosopher. And basically what it is, is you open the app and you have a list of questions that other users have asked, and the questions are organized by branch of philosophy.
So there are lists of questions about epistemology, ethics, general questions in philosophy, history of philosophy and so on and so forth.
And then what you can do is you can answer any of the questions. As soon as you've answered a question, the system gives you the current status of the answers from the crowd (we just talked about the wisdom of the crowd)
as far as that particular question is concerned. The more time passes, the more the answers are updated. And so you can see what happens over time. The whole thing is anonymous. So nobody knows who is actually putting out the answers. And then you can ask your own questions, if you will.
So I tried it. I answered a few questions to see what happened, and then I asked my own. So I'll give you a couple of examples.
One of the questions that I answered concerned Star Trek's transporter devices. Go on.
So, you know, what happens when Kirk is about to be teleported to another place? And the answers that were available are: (a) Kirk dies, a Trekkie cries; or (b) Kirk is Kirk 1 or Kirk 2, whichever you like; or (c) Kirk is, in parentheses, Kirk 1 and Kirk 2, before and after teleportation, so there's a potential Kirk and then there's an actual one; or (d) Kirk is Kirk 1 and 2, but only after transport; or Kirk is Kirk 1 and Kirk is Kirk 2, adieu transitivity; and...
Go on, there's another two or three. And the interesting thing is that you can look at the responses, and the responses by anybody who has answered the question, at the moment, as I'm looking at the graph, give (a) as winning, which is Kirk dies, a Trekkie cries.
OK, so you'd better not step in the teleporter. You biased them, though, because, you know, research has shown that people trust statements more when they rhyme.
Aha. I didn't prepare this particular question. Somebody else did, but that was it.
Now the other interesting thing is that you can filter the answers, if there are enough of them, by the person's background and/or discipline.
So as a user, you're supposed to put in information about who you are.
And so if you filter, for instance, this question by philosophers, people who have either a degree or a professional career in philosophy, then answer (a), Kirk dies, actually goes up even more clearly. And you can filter further, not only among philosophers, but among metaphysicians. And it turns out that the overwhelming majority of metaphysicians who answered this particular question again agree that, yes, Kirk in fact does die. So it's kind of interesting.
One of my questions, the questions that I asked, let's see, is the question of the relationship between logic and mathematics. And so the way I put the question was, what is the relationship between logic and mathematics? And I gave the following five possible answers to users: math is a branch of logic; or logic is a branch of math; or they are related but distinct types of deductive reasoning; or it depends on what you mean by logic; or other.
And users can add their own answers. And so far, at the moment, as we're looking at it, the responses are still few, so much so that I can't really do any filtering. But so far, apparently (c) is the winner: they are related but distinct types of deductive reasoning. So, yeah, it's fun.
I highly recommend Phi2Phi. Phi2Phi is the name of the app. Jonathan Weisberg is the producer and philosopher.
Cool. And what an appropriate pick for our crowdsourcing episode.
Exactly. Well this concludes another episode of Rationally Speaking. Join us next time for more explorations on the border between reason and nonsense.
The Rationally Speaking podcast is presented by New York City Skeptics. For program notes, links, and to get involved in an online conversation about this and other episodes, please visit rationallyspeakingpodcast.org. This podcast is produced by Benny Pollack and recorded in the heart of Greenwich Village, New York. Our theme, Truth by Todd Rundgren, is used by permission. Thank you for listening.