The following is a conversation with François Chollet, his second time on the podcast. He's both a world-class engineer and a philosopher in the realm of deep learning and artificial intelligence. This time we talk a lot about his paper titled On the Measure of Intelligence, which discusses how and why to define and measure general intelligence in our computing machinery. Quick summary of the sponsors: Babbel, MasterClass, and Cash App. Click the sponsor links in the description to get a discount and to support this podcast.
As a side note, let me say that the serious, rigorous scientific study of artificial general intelligence is a rare thing. The mainstream machine learning community works on very narrow AI with very narrow benchmarks. This is very good for incremental, and sometimes big incremental, progress. On the other hand, the outside-the-mainstream, renegade, you could say, AGI community works on approaches that verge on the philosophical and even the literary, without big public benchmarks.
Walking the line between the two worlds is a rare breed, but it doesn't have to be. I ran the AGI series at MIT as an attempt to inspire more people to walk this line. DeepMind and OpenAI for a time, and still on occasion, walk this line. François Chollet does as well. I hope to also. It's a beautiful dream to work towards and to make real one day. If you enjoy this thing, subscribe on YouTube, review it with five stars on Apple Podcasts, follow on Spotify, support on Patreon, or connect with me on Twitter at lexfridman.
As usual, I'll do a few minutes of ads now and no ads in the middle. I try to make these interesting, but I'll give you timestamps so you can skip. But still, please do check out the sponsors by clicking the links in the description. It's the best way to support this podcast. This show is sponsored by Babbel, an app and website that gets you speaking a new language within weeks. Go to babbel.com and use code Lex to get three months free.
They offer 14 languages, including Spanish, French, Italian, German and yes, Russian. Daily lessons are 10 to 15 minutes, super easy, effective, designed by over 100 language experts. Let me read a few lines from the Russian poem "Noch, ulitsa, fonar, apteka" by Alexander Blok that you'll start to understand if you sign up to Babbel. "Noch, ulitsa, fonar, apteka, bessmyslennyy i tusklyy svet. Zhivi yeshcho khot chetvert veka, vsyo budet tak. Iskhoda net." Now, I say that you'll start to understand this poem because Russian starts with the language and ends with the vodka.
Now, the latter part is definitely not endorsed or provided by Babbel and will probably lose me this sponsorship, although it hasn't yet. But once you graduate with Babbel, you can enroll in my advanced course of late night Russian conversation over vodka.
No app for that yet. So get started by visiting babbel.com and use code Lex to get three months free. This show is also sponsored by MasterClass. Sign up at masterclass.com/lex to get a discount and to support this podcast. When I first heard about MasterClass, I thought it was too good to be true. I still think it's too good to be true. For one hundred eighty dollars a year, you get an all-access pass to watch courses from, to list some of my favorites:
Chris Hadfield on space exploration, hope to have him on this podcast one day.
Neil deGrasse Tyson on scientific thinking and communication, Neil too. Will Wright, creator of SimCity and The Sims, on game design, Carlos Santana on guitar, Garry Kasparov on chess, Daniel Negreanu on poker, and many more. Chris Hadfield explaining how rockets work and the experience of being launched into space alone is worth the money. By the way, you can watch it on basically any device. Once again, sign up at masterclass.com/lex to get a discount and to support this podcast.
This show, finally, is presented by Cash App, the number one finance app in the App Store. When you get it, use code LexPodcast. Cash App lets you send money to friends, buy Bitcoin, and invest in the stock market with as little as one dollar. Since Cash App allows you to send and receive money digitally, let me mention a surprising fact related to physical money. Of all the currency in the world, roughly eight percent of it is actually physical money. The other 92 percent of the money only exists digitally, and that's only going to increase.
So again, if you get Cash App from the App Store or Google Play and use code LexPodcast, you get ten dollars, and Cash App will also donate ten dollars to FIRST, an organization that is helping to advance robotics and STEM education for young people around the world. And now here's my conversation with François Chollet.
What philosophers, thinkers, or ideas had a big impact on you growing up and today?
So one author that had a big impact on me when I read his books as a teenager was Jean Piaget, who is a Swiss psychologist, is considered to be the father of developmental psychology, and has a large body of work about basically how intelligence develops in children.
And so it's very old work, like most of it is from the 1930s and 40s. So it's not quite up to date. It's actually superseded by many new developments in developmental psychology.
But to me, it was very, very interesting, very striking, and really shaped the early ways in which I started thinking about the mind and the development of intelligence as a teenager.
His actual ideas, or the way he thought about it, or just the fact that you could think about the developing mind at all?
I guess both. Jean Piaget is the author that introduced me to the notion that intelligence and the mind is something that you construct throughout your life, and that children construct it in stages. And I thought that was a very interesting idea, which is, of course, very relevant to AI, to building artificial minds. Another book that I read around the same time that had a big impact on me, and there was actually a little bit of overlap with Jean Piaget as well,
and I read it around the same time, is Jeff Hawkins' On Intelligence, which is a classic. He has this vision of the mind as a multi-scale hierarchy of temporal prediction modules. And these ideas really resonated with me, like the notion of a modular hierarchy of potentially compression functions or prediction functions. I thought it was really interesting, and it shaped the way I started thinking about how to build minds.
The hierarchical nature, which aspect? Also, he's a neuroscientist, so he was thinking...
Yes, actually, he was basically talking about how our mind works. The notion that cognition is prediction was an idea that was kind of new to me at the time and that I really loved at the time. And yeah, the notion that there are multiple scales of processing in the brain.
The hierarchy.
Yes. This was before deep learning. These ideas of hierarchies in AI have been around for a long time, even before On Intelligence. I mean, they've been around since the 1980s. And yeah, that was before deep learning. But of course, I think these ideas really found their practical implementation in deep learning.
What about the memory side of things? I think he was talking about knowledge representation. Do you think about memory a lot? One way you could think of neural networks is as a kind of memory. You're memorizing things, but it doesn't seem to be the kind of memory that's in our brains, or it doesn't have the same rich complexity, long-term nature that's in our brains.
Yes, the brain is more of a sparse access memory, so that you can actually retrieve very precisely, like, bits of your experience.
The retrieval aspect. You can introspect, you can ask yourself questions, I guess.
Yes, you can program your own memory, and language is actually the tool you use to do that. I think language is a kind of operating system for the mind. One of the uses of language is as a query that you run over your own memory. You use words as keys to retrieve specific experiences, specific concepts, specific thoughts. Language is the way you store thoughts, not just in writing in the physical world, but also in your own mind.
And it's also how you retrieve them. Imagine if you didn't have language. Then you would not really have a self-internally-triggered way of retrieving past thoughts. You would have to rely on external experiences. For instance, you see a specific sight, you smell a specific smell, and that brings up memories. But you would not really have a way to deliberately access these memories without language.
Well, the interesting thing you mentioned is you can also program the memory. You can change it, probably with language.
Yeah, using language.
Well, let me ask you a Chomsky question, which is, first of all, do you think language is fundamental? Like, there's turtles. What's at the bottom of the turtles? It can't be turtles all the way down. Is language at the bottom of cognition, of everything? Is language the fundamental aspect of what it means to be a thinking thing?
No, I don't think so. I think language...
You disagree with Noam Chomsky?
Yes, I think language is a layer on top of cognition. So it is fundamental to cognition in the sense that, to use a computing metaphor, I see language as the operating system of the brain, of the human mind. And the operating system, you know, is a layer on top of the computer. The computer exists before the operating system, but the operating system is how you make it truly useful.
And the operating system is most likely Windows and not Linux, because language is messy.
Yeah, it's messy, and it's pretty difficult to inspect it, to introspect it.
How do you think about language? Like, we use this human-interpretable language, but is there something deeper, that's closer to, like, logical types of statements? What is the nature of language, do you think? Is there something deeper than the syntactic rules we construct? Is there something that doesn't require utterances or writing and so on?
Are you asking about the possibility that there could exist languages for thinking that are not made of words?
Yeah.
Yeah, I think so. The mind is layers, right? And language is almost like the outermost, the uppermost layer. But before we think in words, I think we think in terms of motion in space, and we think in terms of physical actions. And I think babies in particular probably express their thoughts in terms of the actions that they've seen or that they can perform, and in terms of motions of objects in the environment, before they start thinking in terms of words.
It's amazing to think about that as the building blocks of language. So the kind of actions and ways the babies see the world are more fundamental than the beautiful Shakespearean language you construct on top of it. And we probably don't have any idea what that looks like, right? Because it's important for trying to engineer it into AI systems.
I think visual analogies and motion are a fundamental building block of the mind, and you actually see it reflected in language. Language is full of spatial metaphors. And when you think about things, I consider myself very much a visual thinker, you often express your thoughts by using things like visualizing concepts in 2D space, or you solve problems by imagining yourself navigating a concept space. I don't know if you have this sort of experience.
You said visualizing concept space.
So I certainly visualize mathematical concepts, but you mean in concept space, visually, you're embedding ideas into a three-dimensional space you can explore with your mind, essentially?
It's more like 2D.
2D? You're a flatlander. OK.
No, I do not. Before I jump from concept to concept, I always have to put it back down on paper. It has to be on paper. I can only travel on 2D paper, not inside my mind. You're able to move inside your mind?
But even if you're writing a paper, for instance, don't you have a spatial representation of your paper? Like, you visualize where ideas lie topologically in relationship to other ideas, kind of like a subway map of the ideas in your paper.
Yeah, that's true. I mean, in papers, I don't know about you, but it feels like there's a destination. There's a key idea that you want to arrive at, and a lot of it is in the fog, and you're trying to kind of... What's that called when you do a path planning search from both directions, from the start and from the end, and then you do, like, shortest path? In game playing, you do this with, like, A* from both sides.
And you see where they join.
Yeah. So you kind of do this, at least for me. First of all, just exploring from the start, from first principles: what do I know, what can I start proving from that? And then from the destination, you start backtracking: if I want to show some kind of set of ideas, what would it take to show them? And you kind of backtrack. But, yeah, I don't think I'm doing all that in my mind, though. I'm putting it down on paper.
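[Editor's sketch: the search "from both directions" mentioned above is usually called bidirectional search. The following is a minimal illustration using plain breadth-first search rather than A*, which is a simplification chosen here, not something stated in the conversation. The function name `bidirectional_bfs` and the toy graph of ideas are hypothetical.]

```python
from collections import deque

def bidirectional_bfs(graph, start, goal):
    """Return a path from start to goal in an undirected graph, or None.

    graph is a dict mapping each node to a list of neighboring nodes.
    """
    if start == goal:
        return [start]
    # Each dict maps a visited node to the node it was reached from.
    parents_fwd = {start: None}
    parents_bwd = {goal: None}
    frontier_fwd = deque([start])
    frontier_bwd = deque([goal])

    def expand(frontier, parents, other_parents):
        # Expand one full level; return a node where the two searches meet.
        for _ in range(len(frontier)):
            node = frontier.popleft()
            for neighbor in graph.get(node, ()):
                if neighbor not in parents:
                    parents[neighbor] = node
                    if neighbor in other_parents:
                        return neighbor
                    frontier.append(neighbor)
        return None

    while frontier_fwd and frontier_bwd:
        meet = expand(frontier_fwd, parents_fwd, parents_bwd)
        if meet is None:
            meet = expand(frontier_bwd, parents_bwd, parents_fwd)
        if meet is not None:
            # Walk back to the start, then forward to the goal.
            path = []
            node = meet
            while node is not None:
                path.append(node)
                node = parents_fwd[node]
            path.reverse()
            node = parents_bwd[meet]
            while node is not None:
                path.append(node)
                node = parents_bwd[node]
            return path
    return None

# Hypothetical "paper" graph: what I know on one end, the key idea on the other.
ideas = {
    "first_principles": ["lemma_a", "lemma_b"],
    "lemma_a": ["first_principles", "lemma_c"],
    "lemma_b": ["first_principles"],
    "lemma_c": ["lemma_a", "key_idea"],
    "key_idea": ["lemma_c"],
}
print(bidirectional_bfs(ideas, "first_principles", "key_idea"))
```

[The forward search plays the role of "what can I start proving from what I know", the backward search plays the role of backtracking from the destination, and the path is assembled at the node where the two meet.]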
Do you use mind maps to organize your ideas?
Yeah, I like mind maps.
Let's get into this, because I've been so jealous of people. I haven't really tried it. I've been jealous of people that seem to get this fire of passion in their eyes because everything starts making sense. It's like Tom Cruise in the movie, moving stuff around. Some of the most brilliant people I know use mind maps. I haven't tried, really. Can you explain what the hell a mind map is?
I guess a mind map is a way to take the mess inside your mind and just put it on paper so that you gain more control over it. It's a way to organize things on paper, and as a consequence of organizing things on paper, they start being more organized inside your own mind.
What does that look like? Do you have an example? What's the first thing you write on paper? What's the second thing you write?
I mean, typically you draw a mind map to organize the way you think about a topic. So you would start by writing down the key concept about that topic, like you would write "intelligence" or something, and then you would start adding associative connections. What do you think about when you think about intelligence? What do you think are the key elements of intelligence? So maybe you would have language and motion. And so you would start drawing nodes with these things, and then you would see what you think about when you think about motion, and so on. And you would go like that, like a tree.
Is it mostly a tree, or is it a graph too?
Oh, it's more of a graph than a tree. And it's not limited to just writing down words. You can also draw things. And it's not supposed to be purely hierarchical, right? The point is that once you start writing it down, you can start reorganizing it so that it makes more sense, so that it's connected in a more effective way.
See, but I'm so OCD that when you just mentioned intelligence, then language and motion, I would start becoming paranoid that the categorization is imperfect. I would become paralyzed with the mind map. Even though you're just doing associative kinds of connections, there's an implied hierarchy that's emerging, and I would start becoming paranoid that it's not the proper hierarchy. So one way to see mind maps is you're putting thoughts on paper, it's like a stream of consciousness, but then you can also start getting paranoid: well, is this the right hierarchy?
Sure, but it's your mind map. You're free to draw anything you want. You're free to draw any connection you want, and you can just make a different mind map if you think the central node is not the right node.
Yeah, I suppose there's a fear of being wrong.
If you want to organize your ideas by writing down what you think, which I think is effective, like, how do you know what you think about something if you don't write it down, right? If you do that, the thing is that it imposes much more syntactic structure over your ideas, which is not required with a mind map. So a mind map is kind of like a lower-level, more freehand way of organizing your thoughts. And once you've drawn it, then you can start actually voicing your thoughts in terms of, you know, paragraphs.
It's the two-dimensional aspect of layout too, right?
Yeah.
It's a kind of flower, I guess. You usually want to start with a central concept?
Yes. Typically it ends up more like a subway map. So it ends up more like a graph, a topological graph.
Without a root node.
Yeah. So, like in a subway map, there are some nodes that are more connected than others, and there are some nodes that are more important than others, right? So there are destinations, but it's not going to be purely like a tree, for instance.
Yeah. It's fascinating to think that there is something to that about the way our mind thinks.
By the way, I just kind of remembered an obvious thing: I have probably thousands of documents in Google Docs at this point that are bullet point lists, and you can probably map a mind map to a bullet point list. It's the same. No, it's not. It's a tree.
It's a tree, yeah.
So I create trees, but they don't have the visual element. I guess I'm comfortable with the structure. It feels like the narrowness, the constraints feel more comforting.
If you have thousands of documents with your own thoughts in Google Docs, why don't you write some kind of search engine, like maybe a mind map, a piece of mind mapping software where you write down a concept and then it gives you sentences or paragraphs from your thousands of Google Docs documents that match this concept?
The problem is it's so deeply, unlike mind maps, rooted in natural language. So it's not semantically searchable, I would say, because the categories are very... You kind of mentioned intelligence, language and motion. They're very strong, semantic. It feels like the mind map forces you to be semantically clear and specific. The bullet point lists I have are sparse, disparate thoughts that poetically represent a category like motion, as opposed to saying "motion". So unfortunately, that's the same problem with the Internet. That's why the idea of the semantic web is difficult to get.
Most language on the Internet is a giant mess of natural language that's hard to interpret. So do you think there's something to mind maps? You actually originally brought them up as we were talking about cognition and language. Do you think there's something to mind maps about how our brain actually thinks, reasons about things?
It's possible. I think it's reasonable to assume that there is some level of topological processing in the brain, that the brain is very associative in nature. And I also believe that a topological space is a better medium to encode thoughts than a geometric space.
What's the difference between a topological and a geometric space?
Well, if you're talking about topologies, then points are either connected or not. So a topology is more like a subway map, and geometry is when you're interested in the distance between things. In a subway map, you don't really have the concept of distance. You only have the concept of whether there is a train going from station A to station B. And what we do in deep learning is that we are actually dealing with geometric spaces. We are dealing with concept vectors, word vectors, that have a distance between them expressed in terms of dot product. We are not really building topological models, usually.
I think you're right. Distance is of fundamental importance in deep learning. I mean, it's the continuous aspect of it.
Yes. Because everything is a vector, and everything has to be a vector because everything has to be differentiable. If your space is discrete, it's no longer differentiable. You cannot do deep learning in it anymore. Well, you could, but you can only do it by embedding it in a bigger continuous space. So if you do topology in the context of deep learning, you have to do it by embedding your topology in a geometry.
Right.
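[Editor's sketch: a small illustration of the contrast drawn above between a geometric space, where a dot product gives a graded similarity between vectors, and a topological one, where two points are simply connected or not. The vectors, their values, and the station names are made up for illustration and do not come from any real model or map.]

```python
import math

# Geometric view: hypothetical 3-d "concept vectors" (values are invented).
vectors = {
    "language": [0.9, 0.1, 0.3],
    "motion": [0.2, 0.8, 0.4],
    "intelligence": [0.7, 0.5, 0.6],
}

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def cosine_similarity(u, v):
    """Dot product normalized by vector lengths: a graded notion of closeness."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

# Topological view: a subway map only records which stations are directly linked.
subway = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B"},
}

def connected(graph, a, b):
    """True if there is a direct link from a to b; there is no notion of distance."""
    return b in graph.get(a, set())

# Geometry answers "how close?", topology answers "linked or not?".
print(cosine_similarity(vectors["language"], vectors["intelligence"]))
print(cosine_similarity(vectors["language"], vectors["motion"]))
print(connected(subway, "A", "B"), connected(subway, "A", "C"))
```

[The vector side varies smoothly, so small changes to a vector give small changes in similarity, which is what makes it differentiable. The subway side is discrete: a link either exists or it does not.]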
Yeah. Well, let me zoom out for a second. Let's get into your paper, On the Measure of Intelligence, that you put out in 2019?
Yes.
OK. November?
November.
Yeah, remember 2019? That was a different time.
Yeah, I remember. I still remember.
It feels like a different world. You could travel, you could actually go outside and see friends.
Yeah.
Let me ask the most absurd question. I think there's some non-zero probability there will be a textbook one day, like two hundred years from now, on artificial intelligence, or it'll be called just intelligence, because humans will already be gone. And it'll be your picture with a quote: you know, one of the early biological systems that considered the nature of intelligence. And there'll be a definition of how they thought about intelligence, which is one of the things you do in your paper on measuring intelligence: to ask, well, what is intelligence, and how to test for intelligence, and so on. So is there a spiffy quote about what is intelligence? What is the definition of intelligence, according to François Chollet?
Yes. Do you think the superintelligent AIs of the future will want to remember us the way we remember humans from the past? And do you think they won't be ashamed of having a biological origin?
No, I think it would be a niche topic. It won't be that interesting, but it'll be like the people that study certain historical civilizations that no longer exist, the Aztecs and so on. That's how it'll be seen. And it'll be studied also in the context of social media. There will be hashtags about the atrocity committed to human beings when the robots finally got rid of them. Like, it was a mistake. It'll be seen as a giant mistake, but ultimately in the name of progress, and it created a better world, because humans were overconsuming the resources, and they were not very rational, and were destructive in the end in terms of productivity and putting more love in the world. And so within that context, there'll be a chapter about these biological systems.
You seem to have a very detailed vision of that future. You should write a novel about it.
I'm working on a sci-fi novel right now, yes. Yeah, self-published, yeah.
The definition of intelligence. So intelligence is the efficiency with which you acquire new skills at tasks that you did not previously know about, that you did not prepare for. So intelligence is not skill itself. It's not what you know, it's not what you can do. It's how well and how efficiently you can learn new things.
New things.
Yes.
The idea of newness there seems to be fundamentally important.
Yes.
So you would see intelligence on display, for instance, whenever you see a human being or an AI creature adapt to a new environment that it has not seen before, that its creators did not anticipate. When you see adaptation, when you see improvisation, when you see generalization, that's intelligence. In reverse, if you have a system that, when you put it in a slightly new environment, cannot adapt, cannot improvise, cannot deviate from what it's hardcoded to do or what it has been trained to do,
that is a system that is not intelligent. There's actually a quote from Einstein that captures this idea, which is: the measure of intelligence is the ability to change. I like that quote. I think it captures at least part of this idea.
You know, there might be something interesting about the difference between your definition and Einstein's. I mean, he's just being Einstein and clever, but acquisition of new ability to deal with new things versus ability to just change: what's the difference between those two things? So just change in itself. Do you think there's something to that, just being able to change?
Yes, being able to adapt. So not change, but certainly a change of direction. Being able to adapt yourself to your environment, whatever the environment. That's a big part of intelligence, yes.
And intelligence is, more specifically, how efficiently you're able to adapt, how efficiently you're able to basically master your environment, how efficiently you can acquire new skills. And I think there's a big distinction to be drawn between intelligence, which is a process, and the output of that process, which is skill. So, for instance, if you have a very smart human programmer that considers the game of chess and that writes down a static program that can play chess, then the intelligence is the process of developing that program.
But the program itself is just encoding the output artifact of that process. The program itself is not intelligent. And the way you tell it's not intelligent is that if you put it in a different context, you get it to play Go or something, it's not going to be able to perform well without human involvement, because the source of intelligence, the entity that is capable of that process, is the human programmer. So we should be able to tell the difference between the process and its output.
We should not confuse the output and the process. It's the same as, you know, do not confuse a road building company and one specific road, because one specific road takes you from point A to point B, but a road building company can make a path from anywhere to anywhere else.
Yeah, that's beautifully put. But to play devil's advocate a little bit, it's possible that there is something more fundamental than us humans. So you kind of said the programmer creates the difference between acquiring the skill and the skill itself. There could be something, like you could argue the universe is more intelligent. The base intelligence that we should be trying to measure is something that created humans. We should be measuring God, or the source of the universe, as opposed to... like, there could be a deeper intelligence.
Sure, there's always deeper intelligence. You can argue that. But that does not take anything away from the fact that humans are intelligent. And you can tell that because they are capable of adaptation and generality. And you see that in the fact that humans are capable of handling situations and tasks that are quite different from anything that any of our evolutionary ancestors has ever encountered. So we are capable of generalizing very much out of distribution, if you consider our evolutionary history as being, in a way, our training data.
Of course, evolutionary biologists would argue that we're not going too far out of the distribution. We're, like, mapping the skills we've learned previously, desperately trying to jam them into these new situations.
I mean, there's definitely a little bit of that, but it's pretty clear to me that most of the things we do on any given day in our modern civilization are things that are very, very different from what our ancestors a million years ago would have been doing on any given day. And our environment is very different. So I agree that everything we do, we do it with cognitive building blocks that we acquired over the course of evolution, right? And that anchors our cognition to a certain context, which is the human condition, very much. But still, our mind is capable of a pretty remarkable degree of generality, far beyond anything we can create in artificial systems today. The degree to which the mind can generalize away from its evolutionary history is much greater than the degree to which a deep learning system today can generalize away from its training data.
And the key point you're making, which I think is quite beautiful, is that we shouldn't measure, if we talk about measurement, we shouldn't measure the skill. We should measure the creation of the new skill, the ability to create a new skill.
Yes.
But it's tempting. It's weird, because the skill is a little bit of a small window into the system. So whenever you have a lot of skills, it's tempting to measure the skills.
Yes. I mean, skill is the only thing you can objectively measure. But yeah, the thing to keep in mind is that when you see skill in a human, it gives you a strong signal that that human is intelligent, because you know they weren't born with that skill.
Typically, like, if you see a very strong chess player, maybe you're a very strong chess player yourself...
I think you're saying that because I'm Russian, and now you're prejudiced. You assume...
I'm biased.
Well, you're biased.
So if you see a very strong chess player, you know, they weren't born knowing how to play chess, so they had to acquire that skill with their limited resources, with their limited lifetime. And, you know, they did that because they are generally intelligent. And so they may as well have acquired any other skills. You know, they have this potential. And on the other hand, if you see a computer playing chess, you cannot make the same assumptions because you cannot just assume the computer is intelligent.
The computer may be born knowing how to play chess in the sense that it may have been programmed by a human that has understood chess for the computer and that has just encoded the output of that understanding in a static program.
And that program is not intelligent.
So let's zoom out just for a second and say, what is the goal of the On the Measure of Intelligence paper? What do you hope to achieve with it?
So the goal of the paper is to clear up some longstanding misunderstandings about the way we've been conceptualizing intelligence in the AI community, and in the way we've been evaluating progress in AI. There's been a lot of progress recently in machine learning, and people are, you know, extrapolating from that progress that we are about to solve general intelligence. And if you want to be able to evaluate these statements, you need to precisely define what you're talking about when you're talking about general intelligence.
And you need a formal way, a reliable way, to measure how much intelligence a system possesses. And ideally, this measure of intelligence should be actionable. So it should not just describe what intelligence is. It should not just be a binary indicator that tells you the system is intelligent or it isn't. It should be actionable. It should have explanatory power, right? So you could use it as a feedback signal. It would show you the way towards building more intelligent systems.
So at the first level, you draw a distinction between two divergent views of intelligence, as we just talked about: intelligence as a collection of task-specific skills, and intelligence as a general learning ability.
So what's the difference between this memorization of skills and a general learning ability? We talked about it a little bit, but we can try to linger on this topic for a bit.
Yes.
So the first part of the paper is an assessment of the different ways we've been thinking about intelligence and the different ways we've been evaluating progress in AI. And the history of cognitive sciences has been shaped by two views of the human mind. One view is the evolutionary psychology view, in which the mind is a collection of fairly static, special-purpose, ad hoc mechanisms that have been hardcoded by evolution over our history as a species, over a very long time. And early AI researchers, people like Marvin Minsky, for instance, clearly subscribed to this view, and they saw the mind as a kind of collection of static programs similar to the programs they would run on mainframe computers.
And in fact, I think they very much understood the mind through the metaphor of the mainframe computer, because it was the tool that they were working with, right? And so you had these static programs, this collection of very different static programs operating over a database-like memory. And in this picture, learning was not very important. Learning was considered to be just memorization. And in fact, learning is basically not featured in AI textbooks until the 1980s, with the rise of machine learning.
It's kind of fun to think that learning was the outcast. The weird people were working on learning, and the mainstream AI world was, I mean, I don't know what the best term is, but it was non-learning. It was seen as: reasoning would not be learning-based. Yes, it was seen that the mind was a collection of programs that were primarily logical in nature. And all you needed to do to create a mind was to write down these programs.
And they would operate with knowledge which would be stored in some kind of database. And as long as your database would encompass, you know, everything about the world and your logical rules were comprehensive, then you would have a mind. So the other view of the mind is the brain as a sort of blank slate. Right. This is a very old idea. You find it in John Locke's writings. This is the tabula rasa. And this is the idea that the mind is some kind of information sponge that starts empty, that starts blank, and that absorbs knowledge and skills from experience.
Right. So it's a sponge that reflects the complexity of the world, the complexity of your life experience, essentially. Everything you know and everything you can do is a reflection of something you found in the outside world.
So this is an idea that's very old, and it was not very popular, for instance, in the 1970s, but it has gained a lot of vitality recently with the rise of connectionism, in particular deep learning. And so today deep learning is the dominant paradigm in AI. And I feel like lots of AI researchers are conceptualizing the mind via a deep learning metaphor. They see the mind as a kind of randomly initialized neural network that starts blank when you're born and that then gets trained via exposure to training data, that acquires knowledge and skills via exposure to training data.
Though it's a small tangent, I feel like people who are thinking about intelligence are not conceptualizing it that way. I actually haven't met too many people who believe that a neural network will be able to reason, who seriously think that, rigorously, because I think it's actually an interesting worldview, and we'll talk about it more.
But it's been impressive what neural networks have been able to accomplish. And to me, I don't know, you might disagree, but it's an open question whether scaling size eventually might lead to incredible results that, to us mere humans, will appear as if it's general.
I mean, if you ask people who are seriously thinking about intelligence, they will definitely not say that all you need to do is, like, the mind is just a neural network. However, it's actually a view that's very popular, I think, in the deep learning community, where many people are kind of conceptually, you know, intellectually lazy about it.
But I guess what I'm saying is I haven't met many people, and I think it would be interesting to meet a person, who is not intellectually lazy about this particular topic and still believes that neural networks will go all the way.
I think what's probably closest to that is some people who argue that current deep learning techniques are already the way to general artificial intelligence and that all you need to do is scale them up to all the available training data. And if you look at the waves that OpenAI's GPT-3 model has made, you see echoes of this idea. So on that topic, GPT-3, similar to GPT-2, has actually captivated some part of the imagination of the public. There's just a bunch of hype of a different kind that I would say is emergent. It's not artificially manufactured.
It's just that people get excited for some strange reason. In the case of GPT-3, it's funny that there was, I believe, a couple of months' delay from release to hype. Maybe I'm not historically correct on that, but it feels like there was a little bit of a lack of hype and then there was a phase shift into hype. But nevertheless, there's a bunch of cool applications that seem to captivate the imagination of the public about what this language model, trained in an unsupervised way without any fine-tuning, is able to achieve.
So what do you make of that? What are your thoughts about GPT-3?
Yeah, so I think what's interesting about GPT-3 is the idea that it may be able to learn new tasks after just being shown a few examples. So if it's actually capable of doing that, that's novel and that's interesting and that's something we should investigate. That said, I must say I'm not entirely convinced that we have shown it's capable of doing that. It's very likely, given the amount of data that the model is trained on, that what it's actually doing is pattern matching the new task you give it with a task that it's been exposed to in its training data.
And it's just recognizing the task instead of developing a model of the task. Right.
But, sorry to interrupt, there are parallels to what you said before, which is that it's possible to see GPT-3 as: the prompt it's given is a kind of query into this thing that it's learned, similar to what you said before, which is that language is used to query the memory. Yes. So is it possible that a neural network is a giant memorization thing, but if it gets sufficiently giant, it'll memorize sufficiently large amounts of things about the world that intelligence becomes a querying machine?
I think it's possible that a significant chunk of intelligence is this giant associative memory. I definitely don't believe that intelligence is just a giant associative memory, but it may well be a big component. So do you think GPT-3, 4, 5, GPT-10 will eventually... like, what do you think is the ceiling? Do you think it'll be able to reason? No, that's a bad question. What is the ceiling is the better question. Well, how is it going to scale?
How good is GPT-N going to be? Yeah. So I believe GPT-N is going to improve on the strength of GPT-2 and 3, which is that it will be able to generate, you know, ever more plausible text in context. Just monotonically increasing performance?
Yes. If you train a bigger model on more data, then your text will be increasingly more context-aware and increasingly more plausible, in the same way that GPT-3 is much better at generating plausible text compared to GPT-2. But that said, I don't think just scaling up the model to more transformer layers and more training data is going to address the flaws of GPT-3, which is that it can generate plausible text, but that text is not constrained by anything other than plausibility.
So in particular, it's not constrained by factualness or even consistency, which is why it's very easy to get GPT-3 to generate statements that are factually untrue, or to generate statements that are even self-contradictory, right? Because its only goal is plausibility and it doesn't have any other constraints. It's not constrained to be self-consistent.
Right. And so for this reason, one thing that I thought was very interesting with GPT-3 is that you can predetermine the answer it will give you by asking the question in a specific way, because it's very responsive to the way you ask the question, since it has no understanding of the content of the question. Right.
And if you ask the same question in two different ways that are basically adversarially engineered to produce a certain answer, you will get two different answers, two contradictory answers. It's very susceptible to adversarial attacks, essentially.
Potentially, yes. So in general, the problem with these models, these generative models, is that they are very good at generating plausible text, but that's just not enough. Right. I think one avenue that would be very interesting for making progress is to make it possible to write programs over the latent space that these models operate on, so that you would rely on these self-supervised models to generate a sort of pool of knowledge and concepts and common sense.
And then you would be able to write explicit reasoning programs over it. Because the current problem with GPT-3 is that it can be quite difficult to get it to do what you want it to do. If you want to turn GPT-3 into products, you need to put constraints on it. You need to force it to obey certain rules. So you need a way to program it explicitly.
So if you look at its ability to do program synthesis, it generates, like you said, something that's plausible.
Yeah. So if you try to make it generate programs, it will perform well for any program that it has seen in its training data. But because program space is not interpolative, right, it's not going to be able to generalize to problems it hasn't seen before. Now, that's currently. Sort of an absurd but, I think, useful intuition builder is, you know, GPT-3 has 175 billion parameters.
The human brain has about a thousand times that or more in terms of number of synapses. Obviously these are very different kinds of things, but there is some degree of similarity. What do you think GPT will look like when it has 100 trillion parameters? Do you think our conversation might be different in nature? Because you've criticized GPT-3 very effectively now. No, I don't think so.
So the bottleneck with scaling up these models to achieve better models is not going to be the size of the model or how long it takes to train it. The bottleneck is going to be the training data, because OpenAI is already training GPT-3 on a crawl of basically the entire web. Right. And that's a lot of data. So you could imagine training on more data than that.
Google could train on more data than that, but it would still be only incrementally more data. And I don't recall exactly how much more data GPT-3 was trained on compared to GPT-2, but it's probably at least a hundred, maybe even a thousand times more. I don't have the exact number. You're not going to be able to train a model on a hundred times more data than what you're already doing. So that's brilliant. So, you know, it's easier to think of compute as a bottleneck and then argue that we can remove that bottleneck. We can remove the compute bottleneck.
I don't think it's a big problem. If you look at the pace at which we've improved the efficiency of deep learning models in the past few years, I'm not worried about training time bottlenecks or model size bottlenecks. The bottleneck in the case of these generative transformer models is absolutely the training data. What about the quality of the data? So, yeah, the quality of the data is an interesting point. The thing is, if you're going to want to use these models in real products, then you want to feed them data that's as high quality, as factual, I would say as unbiased as possible, though there's not really such a thing as unbiased data in the first place.
But you probably don't want to train on Reddit, for instance. Yeah, that sounds like a bad plan.
So from my personal experience working with large-scale deep learning models: at some point I was working on a model at Google that was trained on something like 350 million labeled images. It was an image classification model. That's a lot of images. That's probably most publicly available images on the web at the time. And it was a very noisy dataset because the labels were not originally annotated by hand, by humans. They were automatically derived from tags on social media or just keywords on the same page where the image was found, and so on. So it was very noisy, and it turned out that you could easily get a better model, and not just by training more. If you train on more of the noisy data, you get an incrementally better model,
but you very quickly hit diminishing returns. On the other hand, if you train on a smaller dataset with higher-quality annotations, annotations that are actually made by humans, you get a better model, and it also takes less time to train it. Yeah, that's fascinating. With self-supervised learning, is there a way to get better at doing the automated labeling? Yeah, so you can enrich or refine your labels in an automated way. That's correct.
Do you have a hope for...
I don't know if you're familiar with the idea of a semantic web. The semantic web, just for people who are not familiar, is the idea of being able to attach semantic meaning to the words on the Internet, the sentences, the paragraphs, to be able to convert information on the Internet, or some fraction of the Internet, into something that's interpretable by machines. It was kind of a dream from, I think, the semantic web papers in the 90s. It's kind of the dream that, you know, the Internet is full of rich, exciting information, and even just looking at Wikipedia, we should be able to use that as data for machines.
And so far the information is not really in a format that's available to machines. So, no, I don't think the semantic web will ever work, simply because it would be a lot of work, right, to provide that information in structured form, and there is not really any incentive for anyone to do that work. So I think the way forward to make the knowledge on the web available to machines is actually something closer to unsupervised deep learning.
Yeah, so GPT-3 is actually a bigger step in the direction of making the knowledge of the web available to machines than the semantic web was.
Yeah, perhaps in a human-centric sense, it feels like GPT-3 hasn't learned anything that could be used to reason. But that might be just the early days. Yeah, I think that's correct. I think the forms of reasoning that you see it perform are basically just reproducing patterns that it has seen in its training data. So, of course, if you're trained on the entire web, then you can produce an illusion of reasoning in many different situations, but it will break down if it's presented with a novel situation.
That's the open question between the illusion of reasoning and actual reasoning.
Yes, the power to adapt to something that is genuinely new. Because the thing is, even imagine you could train on every bit of data ever generated in the history of humanity. That model would be capable of anticipating many different possible situations, but it remains that the future is going to be something different. For instance, if you train a GPT-3 model on data from the year 2002 and then use it today, it's going to be missing many things.
It's going to be missing many common sense facts about the world. It's even going to be missing vocabulary and so on.
It's interesting that GPT-3 doesn't even have, I think, any information about the coronavirus.
Yes. Which is why, you know, you can tell that a system is intelligent when it's capable of adapting. So intelligence is going to require a certain amount of continuous learning. It's also going to require some amount of improvisation. It's not enough to assume that what you're going to be asked to do is something that you've seen before or something that is a simple interpolation of things you've seen before. Yeah, in fact, that model breaks down even for tasks that look relatively simple from a distance, like L5 self-driving, for instance.
Google had a paper a couple of years back showing that something like 30 million different road situations were actually completely insufficient to train a driving model. It wasn't even L2. Right. And that's a lot of data. That's a lot more data than the 20 or 30 hours of driving that a human needs to be able to drive, given the knowledge they've already accumulated.
Well, let me ask you on that topic. Elon Musk, Tesla Autopilot, one of the only companies, I believe, that is really pushing for a learning-based approach.
Are you skeptical that that kind of network can achieve level four? L4 is probably achievable.
L5 is probably not. What's the distinction there? Is L5 where you can just completely fall asleep? Yeah, L5 is basically human level.
Well, with driving, you have to be careful saying human level, because there are all kinds of drivers. Yeah, that's the clearest example of, you know, cars will most likely be much safer than humans in many situations where humans fail. And vice versa.
So, I'll tell you, you know, the thing is, the amount of training data you would need to anticipate pretty much every possible situation you'll encounter in the real world is such that it's not entirely unrealistic to think that at some point in the future we'll develop a system that's trained on enough data, especially provided that we can simulate a lot of that data. We don't necessarily need actual cars on the road for everything. But it's a massive effort, and it turns out you can create a system that's much more adaptive, that can generalize much better, if you just add explicit models of the surroundings of the car and if you use deep learning for what it's good at, which is to provide perceptual information.
So deep learning is a way to encode perception and a way to encode intuition. But it is not a good medium for any sort of explicit reasoning. And in AI systems today, strong generalization tends to come from explicit models, tends to come from abstractions in the human mind that are encoded in program form by a human engineer. Right.
These are the abstractions that can actually generalize, not the sort of weak abstraction that is learned by a neural network.
Yeah, and the question is how much reasoning, how many strong abstractions are required to solve particular tasks like driving. That's the question. Or human life, existence: how many strong abstractions does existence require? But more specifically on driving, that seems to be a coupled question about intelligence: how much intelligence, like, how do you build an intelligent system, and the coupled problem, how hard is this problem?
How much intelligence does this problem actually require? So we get to cheat, right?
Because we get to look at the problem. It's not like we have to close our eyes and come completely new to driving. We get to do what we do as human beings, which is, for the majority of our life before we ever learn, quote unquote, to drive, we get to watch other cars and other people drive. We get to be in cars. We get to watch, we get to go and see movies about cars. We get to, you know, observe all this stuff.
And that's similar to what neural networks are doing: getting a lot of data. And the question is, yeah, how many leaps of reasoning genius are required to be able to actually effectively drive. Take driving, for example.
I mean, sure, you've seen a lot of cars in your life before you learn to drive. But let's say you've learned to drive in Silicon Valley and now you rent a car in Tokyo. Well, now everyone is driving on the other side of the road and the signs are different and the roads are more narrow and so on. So it's a very different environment.
And a smart human, even an average human, should be able to just zero-shot it, to just be operational in this very different environment right away, despite having had no contact with the novel complexity that is contained in this environment. Right. And that is novel complexity. It is not just interpolation over the situations that you've encountered previously in learning to drive in the US. Right.
I would say the reason I ask is that one of the most interesting tests of intelligence we have today, actively, is driving, in terms of having an impact on the world. When do you think we'll pass that test of intelligence? So I don't think driving is that much of a test of intelligence because, again, there is no task for which skill at that task demonstrates intelligence, unless it's a kind of meta-task that involves acquiring new skills. So I think you can actually solve driving without having any real amount of intelligence.
For instance, if you really did have infinite training data, you could just literally train an end-to-end deep learning model that does driving, provided infinite training data. The only problem with the whole idea is collecting a dataset that's sufficiently comprehensive, that covers the very long tail of possible situations you might encounter. And it's really just a scale problem.
So I think there's nothing fundamentally wrong with this plan, with this idea. It's just that it strikes me as a fairly inefficient thing to do, because you run into this scaling issue with diminishing returns, whereas if instead you took a more manual engineering approach, where you use deep learning modules in combination with engineering an explicit model of the surroundings of the car, and you bridge the two in a clever way, your model will actually start generalizing much earlier and more effectively than the end-to-end deep learning model.
So why would you not go with the more manual, engineering-oriented approach? And even if you created that system, either the end-to-end deep learning model trained on infinite data or the slightly more human system, I don't think achieving L5 would demonstrate general intelligence or intelligence of any generality at all. Again, the only possible test of generality in AI would be a test that looks at skill acquisition over unknown tasks. For instance, you could take your L5 driver and ask it to learn to pilot a commercial airplane.
And then you would look at how much human involvement is required and how much training data is required for the system to learn to pilot an airplane. And that gives you a measure of how intelligent the system really is.
Yeah, I mean, that's a big leap. I get you. But I'm more interested in driving as a problem. I would say, to me, driving is a black box that can generate novel situations at some rate, what people call edge cases. So it does have newness that we keep being confronted with, let's say, once a month.
It is a very long tail. Yes, it's a long tail. But that doesn't mean you cannot solve it just by training a model on a lot of data, a huge amount of data. It's really a matter of scale. But I guess what I'm saying is, if you have a vehicle that achieves level five, it is going to be able to deal with new situations. Or, I mean, the data is so large that the rate of new situations is very low. Yes. That's not intelligence.
So if we go back to your definition of intelligence, it's the efficiency with which you can adapt to new situations, to truly new situations, not situations you've seen before, not situations that could be anticipated by your creators, by the creators of the system, but truly new situations. The efficiency with which you acquire new skills. If, in order to pick up a new skill, you require a very extensive training dataset of most possible situations that can occur in the practice of that skill,
then the system is not intelligent. It is mostly just a lookup table. Yeah. Well, likewise, if in order to acquire a skill you need a human engineer to write down a bunch of rules that cover most or every possible situation, likewise, the system is not intelligent. The system is merely the output artifact of a process that happens in the minds of the engineers that are creating it. Right. It is encoding an abstraction that's produced by the human mind. And intelligence would actually be
the process of autonomously producing this abstraction.
Yeah. If you take an abstraction and you encode it on a piece of paper or in a computer program, the abstraction itself is not intelligent. What's intelligent is the agent that's capable of producing these abstractions. Right. Yeah, it feels like there's a little bit of a gray area, because you're basically saying that deep learning forms abstractions too. But those abstractions do not seem to be effective for generalizing far outside of the things that it's already seen. But it generalizes a little bit.
Absolutely. No, deep learning does generalize a little bit. Generalization is not binary. It's more like a spectrum.
Yeah. And there's a certain point, it's a gray area, but there's a certain point where there's an impressive degree of generalization that happens. I guess exactly what you were saying is that intelligence is how efficiently you're able to generalize far outside the distribution of things you've seen already.
Yes, both in the sense of how far you can go, how new, how radically new something is, and how efficiently you're able to deal with it. So you can think of intelligence as a measure of an information conversion ratio. Imagine a space of possible situations, and you've covered some of them. So you have some amount of information about your space of possible situations that's provided by the situations you already know.
And that's, on the other hand, also provided by the prior knowledge that the system brings to the table, the prior knowledge that's embedded in the system. So the system starts with some information about the problem, about the task, and it's about going from that information to a program, what we would call a skill program, a behavioral program, that can cover a large area of possible situation space. And essentially the ratio between that area and the amount of information you start with is intelligence.
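To make the ratio described here concrete, the following is a deliberately simplified toy sketch. It is not the paper's formal definition, and the function name, the units, and the numbers are all hypothetical, chosen only for illustration. It treats "intelligence" as the area of situation space covered by the resulting skill program, divided by the information the system started with (priors plus experience).

```python
def conversion_ratio(covered_situations: float,
                     prior_information: float,
                     experience_information: float) -> float:
    """Toy ratio: situations covered per unit of starting information.

    All three quantities are in arbitrary, hypothetical units.
    """
    starting_information = prior_information + experience_information
    if starting_information <= 0:
        raise ValueError("starting information must be positive")
    return covered_situations / starting_information


# A system that needs one training example per situation it can handle
# behaves like a lookup table: it covers exactly what it was given.
lookup_table = conversion_ratio(
    covered_situations=1_000, prior_information=0, experience_information=1_000
)

# A more adaptive system covers the same area from far less information.
adaptive_agent = conversion_ratio(
    covered_situations=1_000, prior_information=10, experience_information=10
)

print(lookup_table, adaptive_agent)
```

Both systems reach the same skill level here. Only the amount of information they needed to get there differs, which is the point of controlling for priors and experience.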
So a very smart agent can make efficient use of very little information about a new problem and very little prior knowledge as well to cover a very large area of potential situations in that problem, without knowing what these future new situations are going to be. So one of the other big things you talk about in the paper, and we've talked about it a little bit already, but let's talk about it some more, is actual tests of intelligence. So if you look at human and machine intelligence, do you think tests of intelligence should be different for humans and machines, or how should we think about testing intelligence?
Are these fundamentally the same kind of intelligences that we're after, and therefore the tests should be similar? So if your goal is to create AIs that are more humanlike, then it will be super valuable, obviously, to have a test that's universal, that applies to both AIs and humans, so that you could establish a comparison between the two, so that you could tell exactly how intelligent, in terms of human intelligence, a given system is.
So that said, the constraints that apply to artificial intelligence and to human intelligence are very different. And your tests should account for this difference, because if you look at artificial systems, it's always possible for an experimenter to buy arbitrary levels of skill at arbitrary tasks, either by injecting hard-coded prior knowledge into the system via rules and so on that come from the human mind, from the minds of the programmers, or by buying higher levels of skill just by training on more data.
For instance, you could generate an infinity of different Go games and you could train a Go-playing system that way. But you could not directly compare it to human Go-playing skills, because a human that plays Go had to develop that skill in a very constrained environment. They had a limited amount of time. They had a limited amount of energy. And of course, they started from a different set of priors, from, you know, innate human priors. So I think if you want to compare the intelligence of two systems, like the intelligence of an AI and the intelligence of a human, you have to control for priors. You have to start from the same set of knowledge priors about the task, and you have to control for experience, that is to say, for training data.
So what are priors?
So a prior is whatever information you have about a given task before you start learning about this task. And how is that different from experience? Well, experience is acquired. Right. So, for instance, if you're trying to play Go, your experience with Go is all the Go games you've played or you've seen or you've simulated in your mind, let's say, and your priors are things like, well, Go is a game on a 2D grid.
And we have lots of hardcoded priors about the organization of 2D space. And the rules of how the dynamics, the physics, of this game work in this 2D space. Yes. And the idea of what winning is. Yes, exactly. And other board games can also share some similarities with Go. And if you've played these board games, then with respect to the game of Go, that would be part of your priors about the game.
Well, it's interesting to think about the game of Go, how many priors are actually brought to the table.
When you look at self-play, reinforcement-learning-based mechanisms that do learning, it seems like the number of priors is pretty low. Yes. But you're saying you should be... There are 2D spatial priors in the convnet. Right. But you should be clear about making those priors explicit. Yes. So in particular, I think if your goal is to measure a human-like form of intelligence, then you should clearly establish that you want the AI you're testing to start from the same set of priors that humans start with.
So, I mean, to me personally, and I think to a lot of people, the human side of things is very interesting: testing intelligence for humans. What do you think is a good test of human intelligence? Well, that's the question that psychometrics is interested in. That's an entire subfield of psychology that deals with this question. So what's psychometrics? Psychometrics is the subfield of psychology that tries to measure and quantify aspects of the human mind.
So in particular, cognitive abilities, intelligence, and personality traits as well.
So maybe a weird question, but what are the first principles that psychometrics operates on? You know, what are the priors it brings to the table?
So it's a field with a very long history.
So, you know, psychology sometimes gets a bad reputation for not having very reproducible results. But in psychometrics, there are actually some fairly solidly reproducible results. So the ideal goals of the field are, you know, that a test should be reliable, which is a notion tied to reproducibility. It should be valid, meaning that it should actually measure what you say it measures. So, for instance, if you're saying that you're measuring intelligence, then your test results should be correlated with things that you expect to be correlated with intelligence, like success in school or success in the workplace and so on. It should be standardized, meaning that you can administer your test to many different people in the same conditions. And it should be free from bias.
Meaning that, for instance, if your test involves the English language, then you have to be aware that this creates a bias against people who have English as a second language or people who can't speak English at all. So, of course, these principles for creating psychometric tests are very much an ideal. I don't think every psychometric test is really either reliable, valid, or free from bias. But at least the field is aware of these weaknesses and is trying to address them.
So it's kind of interesting. Ultimately, you're only able to measure, like you said previously, the skill, but you're trying to do a bunch of measures of different skills that correlate strongly with some general concept of cognitive ability. Yes. Yes. So what's the G factor? So right.
There are many different kinds of tests of intelligence, and each of them is interested in different aspects of intelligence. You know, some of them will deal with language, some of them will deal with spatial vision, maybe mental rotations, numbers and so on. When you run these very different tests at scale, what you start seeing is that there are clusters of correlations among test results. So, for instance, if you look at homework at school, you will see that people who do well at math are also likely, statistically, to do well in physics.
And what's more, people who do well at math and physics are also statistically likely to do well in things that sound completely unrelated, like writing an English essay, for instance. And so when you see clusters of correlations, in statistical terms, you would explain them with a latent variable. And the latent variable that would, for instance, explain the relationship between being good at math and being good at physics would be a cognitive ability. Right. And the g factor is the latent variable that explains the fact that, for every test of intelligence that you can come up with, results on this test end up being correlated.
So there is some single, unique variable that explains these correlations. That's the g factor. So it's a statistical construct. It's not really something you can directly measure, for instance, in a person. But it's there. It's there at scale. And that's also one thing I want to mention about psychometrics. Like, you know, when you talk about measuring intelligence in humans, for instance, some people get a little bit worried. They will say, you know, that sounds dangerous.
Maybe that's potentially discriminatory and so on, and they are not wrong. And the thing is, so personally, I'm not interested in psychometrics as a way to characterize one individual person. Like if I get your psychometric personality assessment or your IQ, I don't think that actually tells me much about you as a person. I think psychometrics is most useful as a statistical tool. So it's most useful at scale. It's most useful when you start getting test results for a large number of people and you start cross-correlating these test results, because that gives you information about the structure of the human mind, particularly about the structure of human cognitive abilities.
So at scale, psychometrics paints a certain picture of the human mind, and that's interesting. And that's what's relevant to AI: the structure of human cognitive abilities.
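The latent-variable idea described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the test names, the loadings (0.8, 0.7, 0.5), and the one-factor model are made up for illustration and are not taken from the paper or from any real psychometric data. Each simulated person gets a hidden score `g`, and every test score is that hidden score times a loading plus independent noise. Nothing observes `g` directly, yet all pairs of tests end up positively correlated.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_scores(n_people=2000, loadings=None, seed=0):
    """Simulate test scores driven by one shared latent factor plus noise.

    The loadings are hypothetical numbers chosen for illustration only.
    """
    if loadings is None:
        loadings = {"math": 0.8, "physics": 0.7, "essay": 0.5}
    rng = random.Random(seed)
    scores = {test: [] for test in loadings}
    for _ in range(n_people):
        g = rng.gauss(0.0, 1.0)  # hidden common factor, never observed directly
        for test, loading in loadings.items():
            # Scale the noise so each test score has unit variance.
            noise = rng.gauss(0.0, 1.0) * (1.0 - loading ** 2) ** 0.5
            scores[test].append(loading * g + noise)
    return scores


scores = simulate_scores()
tests = list(scores)
correlations = {
    (a, b): pearson(scores[a], scores[b])
    for i, a in enumerate(tests)
    for b in tests[i + 1:]
}
for pair, r in correlations.items():
    print(pair, round(r, 2))
```

Under this toy model the expected correlation between two tests is the product of their loadings, so tests that depend more on the shared factor correlate more strongly with each other. Real factor analysis runs in the other direction: it starts from the observed correlations and infers the loadings.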
Yeah, it gives you an insight into... I mean, to me, I remember when I learned about the g factor, it seemed like it would be impossible for it to be real, even as a statistical variable. Like it felt kind of like astrology, I guess, like wishful thinking among psychologists. But the more I learned, I realized that there is some... I mean, I'm not sure what to make about human beings, the fact that the g factor is a thing, that there is a commonality across all of the human species, that there does seem to be a strong correlation between cognitive abilities.
That's kind of fascinating. Yeah.
So cognitive abilities have a structure. Like, the most mainstream theory of the structure of cognitive abilities is called CHC theory, for Cattell, Horn, and Carroll, the three psychologists who contributed key pieces of it. And it describes cognitive abilities as a hierarchy with three levels. And at the top you have the g factor. Then you have broad cognitive abilities, for instance, fluid intelligence. Right. And these encompass a broad set of possible kinds of tasks that are all related. And then you have narrow cognitive abilities at the last level, which is closer to task-specific skill.
And there are different theories of the structure of cognitive abilities that just emerge from different statistical analyses of IQ test results, but they all describe a hierarchy with a kind of g factor at the top. And you're right that the g factor is not quite real in the sense that it's not something you can observe and measure, like your height, for instance. But it's real in the sense that you see it in a statistical analysis of the data. Right. One thing I want to mention is that the fact that there is a g factor does not really mean that human intelligence is general in a strong sense. It does not mean human intelligence can be applied to any problem at all, and that someone who has a high IQ is going to be able to solve any problem at all.
That's not quite what it means. And one popular analogy to understand it is the sports analogy. If you consider the concept of physical fitness, it's a concept that's very similar to intelligence because it's a useful concept. It's something you can intuitively understand. Some people are fit, maybe like you. Some people are not as fit, maybe like me. But none of us can fly. Absolutely. Even if you're very fit,
that doesn't mean you can do anything at all in any environment. You obviously cannot fly. You cannot survive at the bottom of the ocean and so on. And if you were a scientist and you wanted to precisely define and measure physical fitness in humans, then you would come up with a battery of tests. Like you would have running a hundred meters, playing soccer, playing table tennis, swimming and so on. And if you ran these tests over many different people, you would start seeing correlations in test results.
For instance, people who are good at soccer are also good at sprinting. Right. And you would explain these correlations with physical abilities that are strictly analogous to cognitive abilities. Right. And then you would start also observing correlations between biological characteristics, like maybe lung volume is correlated with being a fast runner, for instance, in the same way that there are neurophysiological correlates of cognitive abilities. Right. And at the top of the hierarchy of physical abilities that you would be able to observe, you would have a g factor, a physical g factor, which would map to physical fitness.
Right. And as you just said, that doesn't mean that people with high physical fitness can fly. It doesn't mean human morphology and human physiology are universal. It's actually super specialized. We can only do the things that we evolved to do. Right. Like, you could not exist on Venus or Mars or in the void of space or at the bottom of the ocean. So that said, what's really striking and remarkable is that our morphology generalizes far beyond the environments that we evolved for. Like, in a way, you could say we evolved to run after prey in the savanna.
Right. That's very much where our human morphology comes from. And that said, we can do a lot of things that are completely unrelated to that. We can climb mountains. We can swim across lakes. We can play table tennis. I mean, table tennis is very different from what we evolved to do, right? So our morphology, our bodies, our sensorimotor affordances have a degree of generality that is absolutely remarkable.
Right. And I think cognition is very similar to that. Our cognitive abilities have a degree of generality that goes far beyond what the mind was initially supposed to do, which is why we can play music and write novels and go to Mars and do all kinds of crazy things. But it's not universal. In the same way that human morphology, our body, is not appropriate for actually most of the universe by volume, you could say that the human mind is not really appropriate for most of problem space, potential problem space, by volume.
So we have very strong cognitive biases, actually, that mean that there are certain types of problems that we handle very well and certain types of problems that we are completely inadequate for. So that's how we should interpret the g factor. It's not a sign of strong generality. It's really just the broadest cognitive ability. But our abilities, whether we are talking about sensorimotor abilities or cognitive abilities, remain very specialized to the human condition. Right.
Within the constraints of human cognition, they're general. Yes, absolutely.
But the constraints, as you're saying, are very limited. What I think what's limiting.
So our cognition and our body evolved in very specific environments. Because our environment was so variable, fast-changing and so unpredictable,
part of the constraints that drove our evolution is generality itself.
So we were, in a way, evolved to be able to improvise in all kinds of environments.
Right. Yeah. And for this reason, it turns out that the minds and bodies that we ended up with can be applied to a much, much broader scope than what they were evolved for. Right. And that's truly remarkable. And that's a degree of generalization that is far beyond anything you can see in artificial systems today. Right.
That said, it does not mean that human intelligence is anywhere near universal.
Yes. In general. You know, it's a kind of exciting topic for people even outside of artificial intelligence: IQ tests. I think it's Mensa, whatever, there's different degrees of difficulty for questions. We talked about this offline a little bit too, about sort of difficult questions. You know, what makes a question on an IQ test more difficult or less difficult, do you think? So the thing to keep in mind is that there's no such thing as a question that's intrinsically difficult.
It has to be difficult with respect to the things you already know and the things you can already do. Right. So in terms of an IQ test question, typically it would be structured, for instance, as a set of demonstration input and output pairs. Right. And then you would be given a test input, a prompt, and you would need to recognize or produce the corresponding output. And in that narrow context, you could say a difficult question is a question where the input prompt is very surprising and unexpected given the training examples, just given the nature of the patterns that you're observing in the input. For instance, let's say you have a rotation problem. You must rotate the shape by 90 degrees.
If I give you two examples and then I give you one prompt, which is actually one of the two training examples, then there is zero generalization difficulty for the task. It's actually a trivial task. You just recognize that it's one of the training examples and you produce the same answer. Now, if it's a more complex shape, there is a little bit more generalization, but it remains that you are still doing the same thing at test time as you were being demonstrated at training time. A difficult task
is one that requires some amount of test-time adaptation, some amount of improvisation. Right. So consider, I don't know, you're teaching a class on like quantum physics or something. If you wanted to test the understanding the students have of the material, you would come up with an exam that's very different from anything they've seen, like on the Internet when they were cramming. On the other hand, if you wanted to make it easy, you would just give them something that's very similar to the mock exams that they've taken, something that's just a simple interpolation of questions that they've already seen.
And so that would be an easy exam. It's very similar to what you've been trained on. And a difficult exam is one that really probes your understanding because it forces you to improvise. It forces you to do things that are different from what you've been exposed to before. So that said, it doesn't mean that the exam that requires improvisation is intrinsically hard. Right. Because maybe you're a quantum physics expert. So when you take the exam, this is actually stuff that, despite being new to the students, is not new to you.
Right. So it can only be difficult with respect to what the test taker already knows and with respect to the information that the test taker has about the task. So that's what I mean by controlling for priors, what you bring to the table, and for experience, which is the training data. So in the case of the quantum physics exam, that would be all the course material itself and all the mock exams that students might have taken online.
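The rotation example above can be made concrete with a toy sketch. Everything here is hypothetical and for illustration only: the shapes, the `memorizing_solver` helper, and the choice of a clockwise 90-degree rotation are assumptions, not anything from the paper. A solver that only replays the training pairs handles a prompt it has already seen, which is the zero-generalization case, but has nothing to say about a new shape, even though the underlying rule is the same.

```python
def rotate_90(grid):
    """Rotate a 2D grid (a list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def memorizing_solver(train_pairs):
    """Build a solver that can only replay outputs it has already seen."""
    table = {tuple(map(tuple, x)): y for x, y in train_pairs}

    def solve(grid):
        # Returns None for any input that was not in the training pairs.
        return table.get(tuple(map(tuple, grid)))

    return solve


# Two hypothetical demonstration pairs for the "rotate by 90 degrees" task.
shape_a = [[1, 0], [1, 1]]
shape_b = [[0, 2, 2], [0, 2, 0]]
train_pairs = [(shape_a, rotate_90(shape_a)), (shape_b, rotate_90(shape_b))]

solve = memorizing_solver(train_pairs)
seen_prompt = shape_a  # identical to a training example
novel_prompt = [[3, 3, 3], [0, 0, 3], [0, 0, 3]]  # never demonstrated

print(solve(seen_prompt) == rotate_90(seen_prompt))
print(solve(novel_prompt) == rotate_90(novel_prompt))
```

The point of the sketch is that the same solver looks perfect or useless depending only on how far the test prompt sits from the training examples, which is why difficulty has to be stated relative to priors and experience.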
Yeah, it's interesting, because I've also, I sent you an email and asked you, like, there's just this curious question of,
you know, what's a really hard IQ test question? And I've been talking also to people who have designed IQ tests. There's a few folks on the Internet. Like, I think people are really curious about it. First of all, most of the IQ tests they design, they like religiously protect the correct answers. Like, you can't find the correct answers anywhere. In fact, the question is ruined once you know even, like, the approach you're supposed to take.
So they're very... That said, the approach is implicit in the training examples. So if you release the training examples, it's over. Which is why in ARC, for instance, there is a test set that is private and no one has seen it. No, for really tough IQ questions, it's not obvious. It's not because of the ambiguity. Like, you have to look at them, but like some number sequences and so on, it's not completely clear.
So, like, you can get a sense. But there's like some, you know, when you look at a number sequence, I don't know, like the Fibonacci number sequence,
if you look at the first few numbers, that sequence could be completed in a lot of different ways. Mm hmm. And, you know, some are, if you think deeply, more correct than others. Like there's a kind of intuitive simplicity and elegance to the correct solution.
Yes. I am personally not a fan of ambiguity in test questions, actually, but I think you can have difficulty without requiring ambiguity, simply by making the test require a lot of extrapolation over the training examples. But the beautiful question is difficult, but gives away everything when you give the training examples. Basically, yes. Meaning that, so the tests I'm interested in creating are not necessarily difficult for humans, because human intelligence is the benchmark. They're supposed to be difficult for machines in ways that are easy for humans.
Like, I think an ideal test of human and machine intelligence is a test that is actionable, that highlights the need for progress, and it highlights the direction in which you should be making progress.
I think we'll talk about the challenge and the test you've constructed and these elegant examples, I think, that highlight, like, this is really easy for us humans, but it's really hard for machines. But on the, you know, designing an IQ test for IQs higher than 160 and so on, you have to take that and put it on steroids. Right. You have to think, like, what is hard for humans? And that's a fascinating exercise in itself, I think.
And it was an interesting question of what it takes to create a really hard question for humans, because you, again, have to do the same process as you mentioned, which is, you know, something basically where, given the experience that you're likely to have encountered throughout your whole life, even if you've prepared for IQ tests, which is a big challenge, this will still be novel for you.
Yeah, I mean, novelty is a requirement. You should not be able to practice for the questions that you're going to be tested on. That's important, because otherwise what you're doing is not exhibiting intelligence. What you're doing is just retrieving what you've been exposed to before. It's the same thing as a deep learning model. If you train a deep learning model on all the possible answers, then it will ace your test, in the same way that, you know, a stupid student can still ace the test
if they cram for it. They memorize, you know, a hundred different possible mock exams, and then they hope that the actual exam will be a very simple interpolation of the mock exams. And that student could just be a deep learning model at that point. But you can actually do that without any understanding of the material. And in fact, many students pass their exams in exactly this way. And if you want to avoid that, you need an exam that's unlike anything they've seen, that really probes their understanding.
So how do we design an IQ test for machines, an intelligence test for machines? All right.
So in the paper, I outline a number of requirements that you should expect of such a test. And in particular, we should start by acknowledging the priors that we expect to be required in order to perform the test. So we should be explicit about the priors. Right. And if the goal is to compare machine intelligence and human intelligence, then we should assume human cognitive priors. Right. And secondly, we should make sure that we are testing for skill-acquisition ability, skill-acquisition efficiency in particular, and not for skill itself, meaning that every task featured in your test should be novel and should not be something that you can anticipate.
So, for instance, it should not be possible to brute-force the space of possible questions, right, to pre-generate every possible question and answer. So it should be tasks that cannot be anticipated, not just by the system itself, but by the creators of the system.
Right. Yeah, you know, it's fascinating. I mean, one of my favorite aspects of the paper and the work you do with the ARC challenge is the process of making priors explicit. Just even that act alone is a really powerful one. Like, it's a really powerful question to ask of us humans: what are the priors that we bring to the table?
So the next step is like, once you have those priors, how do you use them to solve a novel task? But like, just even making the priors explicit is a really difficult and really powerful step. And that's like the visually beautiful and conceptually, philosophically beautiful part of the work you did with, and I guess continue to do probably with, the paper and the ARC challenge. Can you talk about some of the priors that we're talking about here?
Yes. So a researcher who has done a lot of work on what exactly are the knowledge priors that are innate to humans is Elizabeth Spelke from Harvard. So she developed the core knowledge theory, which outlines four different core knowledge systems, systems of knowledge that we are basically either born with or that we are hardwired to acquire very early on in our development. And there's no strong distinction between the two. Like if you are primed to acquire a certain type of knowledge in just a few weeks, you might as well just be born with it.
It's just part of who you are. And so there are four different core knowledge systems. Like, the first one is the notion of objectness and basic physics. Like, you recognize that something that moves coherently, for instance, is an object. So we intuitively, naturally, innately divide the world into objects based on this notion of coherence, physical coherence. And in terms of elementary physics, there's the fact that, you know, objects can bump against each other and the fact that they can occlude each other.
These are things that we are essentially born with, or at least that we are going to acquire extremely early, because we're really hardwired to acquire them.
A bunch of points, pixels that move together are part of the same object. Yes. I mean, like, I don't smoke weed, but if I did, that's something I could sit all night and just think about. I remember when I read your paper, just objectness.
I wasn't self-aware, I guess, of that particular prior. That's such a fascinating prior. Like, that's the most basic one.
But objectness is just identity, just objectness. Yes. It's very basic, I suppose. But it's so fundamental.
It is fundamental to human cognition. Yeah. And the second prior that's also fundamental is agentness, which is not a real word, but so, agentness. The fact that some of these objects that you segment your environment into, some of these objects are agents. So what's an agent? Basically, it's an object that has goals. That has what? That has goals, that is capable of pursuing goals. So, for instance, if you see two dots moving in a roughly synchronized fashion, you will intuitively infer that one of the dots is pursuing the other.
So one of the dots is an agent, and its goal is to avoid the other dot. And the other dot is also an agent, and its goal is to catch the first dot. Spelke has shown that babies as young as three months identify agentness and goal-directedness in their environment. Another prior is basic geometry and topology, like the notion of distance, the ability to navigate in your environment and so on.
This is something that is fundamentally hardwired into our brain. It's in fact backed by very specific neural mechanisms, like, for instance, grid cells and place cells. So it's something that's literally hardcoded at the neural level in our hippocampus. And the last prior would be the notion of numbers. Like, numbers are not actually a cultural construct. We are intuitively, innately able to do some basic counting and to compare quantities. So it doesn't mean we can do arbitrary arithmetic. Counting, like counting one, two, three-ish, then maybe more than three.
You can also compare quantities. If I give you three dots and five dots, you can tell that the side with five dots has more dots. So this is actually an innate prior. So that said, the list may not be exhaustive.
So Spelke is still, you know, pursuing the potential existence of new knowledge systems, for instance, knowledge systems that would deal with social relationships. Yeah, I mean, and there could be... which is much, much less relevant to something like ARC or IQ tests. And you're right, there could be stuff that's, like you said, rotation, symmetry. It's really interesting.
It's very likely that there is, speaking about rotation, that there is in the brain a hardcoded system that is capable of performing rotations. One famous experiment that people did, and I don't remember who it was exactly, but in the 70s, was that people found that if you give people two different shapes, and one of the shapes is a rotated version of the first shape, and you ask them, is that shape a rotated version of the first shape or not,
what you see is that the time it takes people to answer is linearly proportional, right, to the angle of rotation. So it's almost like you have somewhere in your brain like a turntable with a fixed speed. And if you want to know if two objects are rotated versions of each other, you put the object on the turntable, you let it move around a little bit, and then you stop when you have a match.
And that's really interesting.
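The turntable picture can be sketched as a toy procedure. This is a simplification made for illustration, not a model of the experiment mentioned above: it assumes shapes on a grid and a turntable that only moves in fixed 90-degree steps, and the function names are hypothetical. The number of steps before a match stands in for response time, so it grows in proportion to the rotation angle.

```python
def rotate_90(grid):
    """Rotate a 2D grid (a list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def quarter_turns_to_match(shape, candidate):
    """Turn `shape` in fixed 90-degree steps until it equals `candidate`.

    Returns the number of steps taken (0 to 3), or None if no rotation
    of `shape` matches `candidate`.
    """
    current = [list(row) for row in shape]
    for steps in range(4):
        if current == candidate:
            return steps
        current = rotate_90(current)
    return None


l_shape = [[1, 0], [1, 0], [1, 1]]
half_turn = rotate_90(rotate_90(l_shape))
mirrored = [row[::-1] for row in l_shape]  # a reflection, not a rotation

print(quarter_turns_to_match(l_shape, half_turn))
print(quarter_turns_to_match(l_shape, mirrored))
```

A mirrored shape never matches, no matter how long the turntable runs, which is why the sketch needs an explicit "no match" result.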
So what's the ARC challenge?
So in the paper, I outline, you know, all these principles that a good test of machine intelligence and human intelligence should follow. And the ARC challenge is one attempt to embody as many of these principles as possible. So I don't think it's anywhere near a perfect attempt. Right. It does not actually follow every principle, but it is what I was able to do, given the constraints. So the format of ARC is very similar to classic IQ tests, in particular Raven's progressive matrices.
Raven's? Yeah, Raven's progressive matrices. I mean, if you've done IQ tests in the past, you know what that is, probably, or at least you've seen it, even if you don't know what it's called. And so you have a set of tasks, that's what they're called. And for each task, you have training data, which is a set of input and output pairs. So an input or output is a grid of colors, basically.
The size of the grid is variable. And you're given an input and you must transform it into the proper output. Right. And so you're shown a few demonstrations of a task in the form of existing input-output pairs, and then you're given a new input and you must produce the correct output. And the assumption in ARC is that every task should only require core knowledge priors and should not require any outside knowledge. So, for instance, no language, no English, nothing like this.
No concepts taken from our human experience, like trees, dogs, cats and so on. So only reasoning tasks that are built on top of core knowledge priors. And some of the tasks are actually explicitly trying to probe specific forms of abstraction. Right. Part of the reason why I wanted to create ARC is, I'm a big believer in, you know, when you're faced with a problem as murky as understanding how to autonomously generate abstraction in a machine, you have to co-evolve the solution and the problem.
And so part of the reason why I designed ARC was to clarify my ideas about the nature of abstraction. Right. And some of the tasks are actually designed to probe bits of that theory. And these are things that turn out to be very easy for humans to perform, including young kids. Right. But turn out to be near impossible for machines. What have you learned about the nature of abstraction from designing that? Can you clarify what you mean? One of the things you wanted to try to understand was this idea of abstraction.
Yes, so clarifying my own ideas about abstraction by forcing myself to produce tasks that would require the ability to produce that form of abstraction in order to solve them.
Got it. OK.
So, and by the way, I mean, people should check it out. I'll probably overlay it if you're watching the video part. But the grid input-output with the different colors on the grid, that's it. I mean, it's a very simple world, but it's kind of beautiful.
It's very similar to classic IQ tests. Like, it's not very original in that sense. The main difference with IQ tests is that we make the priors explicit, which is not usually the case in IQ tests. So we make it explicit that everything should only be built out of core knowledge priors. I also think it's generally more diverse than IQ tests in general. And it perhaps requires a bit more manual work to produce solutions, because you have to click around on the grid for a while.
Sometimes the grids can be as large as 30 by 30 cells.
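The task format described above, a few demonstration input-output grids followed by a new input, can be sketched as a small data structure. Treat the details as assumptions made for illustration: the `train` and `test` keys, the integer color codes 0 to 9, the 30-cell size cap, and the toy "mirror left to right" task are all hypothetical here and should be checked against the actual ARC data before relying on them.

```python
MAX_SIDE = 30  # assumed upper bound on grid height and width


def is_valid_grid(grid, num_colors=10):
    """Check that a grid is a non-empty rectangle of small integer color codes."""
    if not grid or not grid[0]:
        return False
    width = len(grid[0])
    if len(grid) > MAX_SIDE or width > MAX_SIDE:
        return False
    return all(
        len(row) == width
        and all(isinstance(c, int) and 0 <= c < num_colors for c in row)
        for row in grid
    )


# Hypothetical toy task: mirror each grid left to right.
task = {
    "train": [
        {"input": [[1, 0, 0], [2, 0, 0]], "output": [[0, 0, 1], [0, 0, 2]]},
        {"input": [[0, 3], [4, 0]], "output": [[3, 0], [0, 4]]},
    ],
    "test": [{"input": [[5, 0, 0, 0]], "output": [[0, 0, 0, 5]]}],
}


def mirror(grid):
    """Candidate solution: reverse every row."""
    return [row[::-1] for row in grid]


def score(solver, task):
    """Fraction of test inputs for which the solver output matches exactly."""
    pairs = task["test"]
    correct = sum(solver(pair["input"]) == pair["output"] for pair in pairs)
    return correct / len(pairs)


print(score(mirror, task))
```

Note that grid sizes differ between pairs within the same task, and that scoring is exact match on the whole output grid, so a solver gets no partial credit for a nearly right answer in this sketch.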
So how did you come up, if you can reveal, with the questions? Like, what's the process of the questions? Was it mostly you that came up with the questions? How difficult is it to come up with a question? Like, is this scalable to a much larger number? If you think, you know, with IQ tests, you might not necessarily want it to or need it to be scalable. With machines, it's possible you could argue that it needs to be scalable so that no one questions about tasks.
Yes, well, including the test set, the private test set. I think it's fairly difficult, in the sense that a big requirement is that every task should be novel and unique and unpredictable. Like, you don't want to create your own little world that is simple enough that it would be possible for a human to reverse engineer it and write down an algorithm that could generate every possible ARC task and its solution, for instance. That would completely invalidate the test. So you're constantly coming up with new stuff.
Yeah, you need a source of novelty, of unfakeable novelty. And one thing I found is that, as a human, you are not a very good source of unfakeable novelty.
And so you have to pace the creation of these tasks quite a bit. There are only so many unique tasks that you can do in a given day.
So that means coming up with truly original new ideas. Did psychedelics help you at all? But I mean, that's fascinating to think about. Like, so you would be, like, walking or something like that. Are you constantly thinking of something totally new?
And this is hard. This is hard. I mean, I'm not saying I've done anywhere near a perfect job at it. There is some amount of redundancy, and there are many imperfections in ARC. So that said, you should consider ARC as a work in progress. It is not the definitive state. The ARC tasks today are not the definitive state of the test. I want to keep refining it in the future. I also think it should be possible to open up the creation of tasks to a broad audience, to do crowdsourcing.
That would involve several levels of filtering, obviously. But I think it's possible to apply crowdsourcing to develop a much bigger and much more diverse ARC dataset that would also be free of potentially, you know, some of my own personal biases.
But does there always need to be a part of ARC that's, like, hidden, the test? Yes, absolutely. It is imperative that the test that you're using to actually benchmark algorithms is not accessible to the people developing these algorithms, because otherwise what's going to happen is that the human engineers are just going to solve the tasks themselves and encode their solution in program form. But then, again, what you're seeing here is the process of intelligence happening in the mind of the human, and then you're just capturing its crystallized output.
But that crystallized output is not the same thing as the process that generated it. It's not intelligence. So, by the way, the idea of crowdsourcing it is fascinating. I think the creation of questions is really exciting for people. I think there's a lot of really brilliant people out there that love to create these kinds of stuff. Yeah.
One thing that kind of surprised me, that I wasn't expecting, is that lots of people seem to actually enjoy ARC as a kind of game. And I was really seeing it as a test, as a benchmark of fluid general intelligence. And lots of people, including kids, are just enjoying it as a game. So I think that's encouraging.
Yeah, I'm fascinated by there's a world of people who create IQ questions. I think I think that's a cool.
There's a activity for machines, for humans and people, humans are themselves fascinated by taking the questions like, you know, measuring their own intelligence. I mean, that's just really compelling. It's really interesting to me, too. It helps one of the cool things about Iraq, he said it's kind of inspired by IQ tests or whatever follows a similar process. But because of its nature, because of the context in which it lives, it immediately forces you to think about the nature of intelligence as opposed to just the test of your like it forces you to really think there's I don't know if it's if it's within the question inherent in the question or just the fact that it lives in the test.
That's supposed to be a test of machine intelligence. Absolutely. As you as you solve all tasks as a human, you will be forced to basically introspect. Yeah. Hi. How you come up with solutions and it forces you to reflect on the human problem-solving process and the way your own mind generates abstract representations of the problems it's exposed to. I think it's due to the fact that the set of core knowledge PIOs that the Ark is built upon is so small, it's all recombination of a very, very small set of assumptions.
OK, so what's the future of ARC? So you held ARC as a challenge as part of a Kaggle competition. Yes. A Kaggle competition, and...
What do you think? Do you think this is something that continues for five years, 10 years, like just continues growing? Yes, absolutely. So ARC itself will keep evolving. We talked about crowdsourcing; I think it's a good avenue. Another thing is that I'll be collaborating with folks from the psychology department at NYU to do human testing on ARC. Nice. And I think there are lots of interesting questions you can start asking, especially as you start correlating machine solutions to tasks with the human characteristics of solutions.
Like, for instance, you can try to see if there's a relationship between the human-perceived difficulty of a task and the machine... Yes, exactly, some measure of machine-perceived difficulty. And there's a playground in which to explore this very difference. It's the same thing we talked about with autonomous vehicles. The things that could be difficult for humans might be very different than the things that... Absolutely. And formalizing or making explicit that difference in difficulty may teach us something fundamental about intelligence.
So one thing I think we did well with ARC is that it's proving to be a very actionable test, in the sense that machine performance on ARC started at very much zero initially, while humans found the tasks actually very easy. And that alone was like a big red flashing light saying that something is going on and that we are missing something. At the same time, machine performance did not stay at zero for very long. Actually, within two weeks of the Kaggle competition, we started having a non-zero number.
And now the state of the art is around 20 percent of the test set solved. And so ARC is actually a challenge where our capabilities start at zero, which indicates the need for progress. But it's also not an impossible challenge. It's not inaccessible. You can start making progress basically right away. At the same time, we are still very far from having solved it. And that's actually a very positive outcome of the competition: it has proven that there were no obvious shortcuts to solve these tasks.
Yeah, so the test held up. Yeah, exactly.
That was the primary reason to do the Kaggle competition: to check if some clever person was going to hack the benchmark. And that did not happen. Right. Like, the people who are solving the tasks are essentially doing it... well, in a way, they are actually exploiting some flaws of ARC that we will need to address in the future. In particular, they're essentially anticipating what sort of tasks may be contained in the test set. Right, right.
Which is kind of, yeah. That's the kind of hacking. That's human hacking of the test. Yes. That said, you know, with the state of the art at like 20 percent, we're still very, very far from human level, which is closer to 100 percent. And I do believe that it will take a while until we reach human parity on ARC, and that by the time we have human parity, we will have AI systems that are probably pretty close to human level in terms of general fluid intelligence. I mean, they are not going to be necessarily human-like. You would not necessarily recognize them as being an AGI, but they would be capable of a degree of generalization that matches the generalization performed by human fluid intelligence.
I mean, this is a good point, in terms of general intelligence, to mention that in your paper you describe different kinds of generalization: local, broad, extreme. And there's a kind of hierarchy that you form. So when we say generalization, what are we talking about? What kinds are there? Right.
So generalization is a very old idea. I mean, it's even older than machine learning. In the context of machine learning, you say a system generalizes if it can make sense of an input it has not yet seen. And that's what I would call system-centric generalization: generalization with respect to novelty for the specific system you're considering. So I think a good test of intelligence should actually deal with developer-aware generalization, which is slightly stronger than system-centric generalization. Developer-aware generalization would be the ability to generalize to novelty or uncertainty that not only the system itself has not had access to, but that the developer of the system could not have had access to either.
That's a fascinating meta definition. So, like, the system... it's basically the edge case thing we were talking about with autonomous vehicles. Yes. Neither the developer nor the system knows about the edge cases. So the system should be able to generalize to the thing that nobody expected, neither the designer of the training data nor, obviously, the contents of the training data. That's a fascinating definition.
You can see degrees of generalization as a spectrum. And the lowest level is what machine learning is trying to do, which rests on the assumption that any new situation is going to be sampled from a static distribution of possible situations, and that you already have a representative sample of that distribution: that's your training data. And so in machine learning, you generalize to a new sample from a known distribution. And the ways in which your new sample will be new or different are ways that are already understood by the developers of the system.
So you are generalizing to known unknowns for one specific task. That's what you would call robustness.
You are robust to things like noise, small variations and so on, for one fixed known distribution that you know through your training data. And a higher degree would be flexibility. In machine intelligence, flexibility would be something like an L5 self-driving car, or maybe a robot that can pass the coffee cup test, which is the notion that you would be given a random kitchen somewhere in the country and you would have to, you know, go make a cup of coffee in that kitchen.
So flexibility would be the ability to deal with unknown unknowns, so dimensions of variability that could not possibly have been foreseen by the creators of the system, within one specific task. So generalizing to the long tail of situations in self-driving, for instance, would be flexibility. So you have robustness, flexibility, and finally you would have extreme generalization, which is basically flexibility, but instead of just considering one specific domain, like driving or domestic robotics, you're considering an open-ended range of possible domains.
So a robot would be capable of extreme generalization if, say, it's designed and trained for cooking, and if I buy the robot and it's able to teach itself gardening in a couple of weeks. That would be extreme generalization, for instance.
So the ultimate goal is extreme generalization. Yes. So creating a system that is so general that it could essentially achieve human skill parity over arbitrary tasks and arbitrary domains, with the same level of improvisation and adaptation power as humans when it encounters new situations. And it would do so over basically the same range of possible domains and tasks as humans, using essentially the same amount of training experience or practice as humans would require. That would be human-level extreme generalization. So I don't actually think humans are anywhere near the optimal intelligence bound, if there is such a thing.
So I think... For humans or in general? In general. I think it's quite likely that there is a hard limit to how intelligent any system can be. But at the same time, I don't think humans are anywhere near that limit. Yeah, last time we talked, I think you had this idea that we're only as intelligent as the problems we face. Sort of, yes. We are bounded by the problems. In a way, yes, we are.
We are bounded by our environments and we are bounded by the problems we try to solve.
Yeah, yeah. What do you make of Neuralink and outsourcing some of the brain power, like brain-computer interfaces? Do you think we can expand or augment our intelligence?
I am fairly skeptical of neural interfaces because they're trying to fix one specific bottleneck in human cognition, which is the bandwidth bottleneck, the input and output of information in the brain. And my perception of the problem is that bandwidth is not at this time a bottleneck at all, meaning that we already have senses that enable us to take in far more information than we can actually process.
Well, to push back on that a little bit, to sort of play devil's advocate a little bit: if you look at the Internet, let's say Wikipedia, I would say that humans after the advent of Wikipedia are much more intelligent.
Yes, I think that's a good one. But that's about externalizing our intelligence via information processing systems, external information processing systems, which is very different from brain-computer interfaces.
Right. But the question is whether, if we have direct access, if our brain has direct access to Wikipedia without... Your brain already has direct access to Wikipedia. It's on your phone, and you have your hands and your eyes and your ears and so on to access that information. And the speed at which you can access it is bottlenecked by the... It's already fairly close to optimal, which is why speed reading, for instance, does not work. The faster you read, the less you understand.
But maybe it's because it uses the eyes. So maybe... I don't believe so.
I think the brain is very slow. The fastest things that happen in the brain are at the level of 50 milliseconds. Forming a conscious thought can potentially take entire seconds. Right. And you can already read pretty fast. So I think the speed at which you can take information in, and even the speed at which you can output information, can only be very incrementally improved. Maybe. I think that if you're a very fast typer, if you're a very trained typer, the speed at which you can express your thoughts is already the speed at which you can form your thoughts.
Right. So that's kind of the idea that there are fundamental bottlenecks to the human mind. But it's possible that everything we have in the human mind is just there to be able to survive in the environment, and there's a lot more to expand. Maybe, you know, as you said, the speed of thought.
So, yeah, I think augmenting human intelligence is a very valid and very powerful avenue. Right. And that's what computers are about. In fact, that's what all of culture and civilization is about.
Culture is externalized cognition, and we rely on culture to think, constantly. Yeah, yeah.
I mean, that's another way. Yeah.
Not just computers, not just phones and the Internet. I mean, all of culture, like language, for instance, is a form of externalized cognition. Books are obviously externalized cognition. Yeah. And you can scale that externalized cognition far beyond the capability of the human brain. And you could see civilization itself as having capabilities that are far beyond any individual brain, and it will keep scaling, because it's not really bound by individual brains.
It's a different kind of system. Yeah.
And that system includes non-humans. First of all, it includes all the other biological systems, which are probably contributing to the overall intelligence of the organism.
And then computers are part of it. Non-human systems are probably not contributing much, but AIs are definitely contributing to that. Like Google search, for instance, is a big part of it. Yeah, yeah, a huge part. A part that we can't probably introspect. Like how the world has changed in the past 20 years: it's probably very difficult for us to be able to understand until... Of course, whoever created the simulation we're in is probably doing metrics, measuring the progress.
There was probably a big spike in performance. They're enjoying this. So what are your thoughts on the Turing test and the Loebner Prize, which is, you know, one of the most famous attempts at the test of human intelligence, sorry, of artificial intelligence, by doing a natural language, open dialogue test that's judged by humans as far as how well the machine did?
So I'm not a fan of the Turing test itself or any of its variants, for two reasons. First of all, it's really copping out of trying to define and measure intelligence, because it's entirely outsourcing that to a panel of human judges. And these human judges may not themselves have any proper methodology. They may not themselves have any proper definition of intelligence. They may not be reliable. So the Turing test is already failing one of the core psychometrics principles, which is reliability, because you have biased human judges. It's also violating the standardization requirement and the freedom from bias requirement.
And so it's really a cop-out, because you are outsourcing everything that matters, which is precisely describing intelligence and finding a standalone test to measure it. You're outsourcing everything to people. And by the way, we should keep in mind that when Turing proposed the imitation game, he was not meaning for the imitation game to be an actual goal for the field of AI or an actual test of intelligence. He was using the imitation game as a thought experiment in a philosophical discussion in his 1950 paper.
He was trying to argue that theoretically it should be possible for something very much like the human mind, indistinguishable from the human mind, to be encoded in a Turing machine. And at the time that was, you know, a very daring idea. It was stretching credulity. But nowadays I think it's fairly well accepted that the mind is an information processing system and that you could probably encode it into a computer.
So another reason why I'm not a fan of this type of test is that the incentives it creates are not conducive to proper scientific research. If your goal is to trick, to convince a panel of human judges that they are talking to a human, then you have an incentive to rely on tricks and prestidigitation. In the same way, let's say you're doing physics and you want to solve teleportation. What if the test that you set out to pass is that you need to convince a panel of judges that teleportation took place, and they're just sitting there watching what you're doing? That is something that, you know, David Copperfield could achieve in his show in Vegas.
Right. And what he's doing is very elaborate. But it's not actually physics.
It's not making any progress in our understanding of the universe.
To push back on that: is it possible that the hope with these kinds of subjective evaluations is that it's easier to solve it generally than it is to come up with tricks that convince a large number of people?
That's the hope. In practice, it turns out that it's very easy to deceive people, in the same way that you can do magic in Vegas. You can actually very easily convince people that they are talking to a human when they are actually talking to an algorithm. I disagree. I disagree with that. I think it's easy. I would push back: it's not easy. It's doable. It's very easy because... I wouldn't say it's very easy, though.
We are biased. We have a theory of mind. We are constantly projecting emotions, intentions, agentness. Yes. This is one of our core innate priors. Right. We are projecting these things on everything around us. Like, if you paint a smiley on a rock, the rock becomes happy in our eyes. And because we have this extreme bias that permeates everything we see around us, it's actually pretty easy to trick people. I so totally disagree with you.
Brilliantly put. It's a huge... it's the anthropomorphization that we naturally do, the agentness of that. Is that a real word?
No, it's not a real word. I like it, but it's a useful word. Let's make it real. It's a huge help. But I still think it's really difficult to convince people if you do, like, the Alexa Prize formulation, where, you know, you talk for an hour. Like, there are formulations of the test you can create where it's very difficult.
So I like the Alexa Prize better because it's more pragmatic, it's more practical. It's actually incentivizing developers to create something that's useful. Yeah. As a human-machine interface. So that's slightly better than just imitation.
So I like your idea that a test will hopefully help us in creating intelligent systems as a result. Like, if you create a system that passes it, it'll be useful for creating further intelligent systems. Yes, at least. Yeah.
I mean, just to kind of comment, I'm a little bit surprised how little inspiration people draw from the Turing test today.
You know, the media and the popular press might write about it every once in a while. The philosophers might talk about it, but like most engineers are not really inspired by it.
And I know I know you don't like the Turing test, but we'll have this argument another time. You know, there is something inspiring about it.
I think that as a philosophical device, in a philosophical discussion, there's something very interesting about it. In practical terms, I don't think it's conducive to progress. And one of the reasons why is that, you know, I think being very human-like, being indistinguishable from a human, is actually the very last step in the creation of machine intelligence. The first AIs that will show strong generalization, that will actually implement human-like broad cognitive abilities, will not actually behave or look anything like humans.
Human likeness is the very last step in that process, and so a good test is a test that points you towards the first step on the ladder, not towards the top of the ladder. So to push back on that... I guess I usually agree with you on most things. I remember you, I think, at some point tweeting something about the Turing test being counterproductive or something like that. And I think a lot of very smart people agree with that.
I, a, you know, computationally speaking, not very smart person, disagree with that, because I think there's some magic to the interactivity with other humans. So to play devil's advocate on your statement, it's possible that in order to demonstrate the generalization abilities of a system, you have to show, in conversation, your ability to adjust, to adapt to the conversation, not just as a standalone system, but through the process of the interaction, which is game-theoretic, where you really are changing the environment by your actions. So in the ARC challenge, for example, you're an observer. You can't scare the test into changing. You can't talk to the test. You can't play with it. So there's some aspect of that interactivity that becomes highly subjective, but it feels like it could be conducive to generalization.
Yeah, you make a great point. Interactivity is a very good setting to force a system to show adaptation, to show generalization.
That said, at the same time, it's not something very scalable, because you rely on human judges. It's not something reliable, because the human judges may not... So you don't like human judges, basically? Yes. And I think so. I love the idea of interactivity. I initially wanted an ARC test that had some amount of interactivity, where your score on a task would not be one or zero, whether you can solve it or not, but would be the number of attempts that you make before you hit the right solution. That means that now you can start applying the scientific method as you solve ARC tasks: you can start formulating hypotheses and probing the system to see whether the observation matches the hypothesis or not.
It would be amazing if you could also, at an even higher level than that, measure the quality of your attempts, which of course is impossible. Again, that's subjective. Like, how good was your thinking?
Like, how efficient was... So one thing that's interesting about this notion of scoring you by how many attempts you need is that you can start producing tasks that are way more ambiguous. Right. Right, because with the different attempts, you can actually probe that ambiguity, right? Right. So that's, in a sense, measuring how well you can adapt to the uncertainty and reduce the uncertainty.
Yes, it's how fast... it's the efficiency with which you reduce uncertainty in program space.
Exactly. Very difficult to come up with that kind of test, though. Yeah. So I would love to be able to create something like this. In practice, it would be very, very difficult. But yes.
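The attempt-based scoring idea discussed above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `attempts_needed` and `attempt_score`, the cap of 10 attempts, and the linear mapping from attempts to a score are hypothetical and are not part of ARC or of any scoring scheme described in the conversation.

```python
from typing import Callable, Iterable, Optional

Grid = list[list[int]]  # a small grid of integer color codes (illustrative type)


def attempts_needed(
    candidates: Iterable[Grid],
    is_correct: Callable[[Grid], bool],
    max_attempts: int = 10,
) -> Optional[int]:
    """Return the 1-based attempt at which the first correct guess appears.

    Returns None if no guess within max_attempts is correct.
    """
    for attempt, guess in enumerate(candidates, start=1):
        if attempt > max_attempts:
            break
        if is_correct(guess):
            return attempt
    return None


def attempt_score(attempts: Optional[int], max_attempts: int = 10) -> float:
    """Map an attempt count to [0, 1]: 1.0 for a first-try solve, 0.0 for no solve.

    The linear mapping is an arbitrary choice made for this sketch.
    """
    if attempts is None:
        return 0.0
    return (max_attempts - attempts + 1) / max_attempts


# Hypothetical task: the hidden target grid plays the role of the test.
target = [[0, 1], [1, 0]]
guesses = [[[1, 1], [1, 1]], [[0, 1], [1, 0]]]  # second hypothesis is right
n = attempts_needed(guesses, lambda g: g == target)
print(n, attempt_score(n))
```

A graded score of this kind is what would allow more ambiguous tasks: a solver that needs two probes to resolve an ambiguity still earns most of the credit instead of an all-or-nothing zero.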
But I mean, what you're doing, what you've done with the challenge is brilliant. I'm also not surprised that it's not more popular, but I think it's picking up its niche.
What are your thoughts about another test that I talked about with Marcus Hutter? He has the Hutter Prize for compression of human knowledge. And the idea is really to sort of quantify and reduce the test of intelligence purely to just the ability to compress. What are your thoughts about this, intelligence as compression? I mean, it's a very fun test because it's such a simple idea. Like, you're given Wikipedia, basically English Wikipedia, and you must compress it. And so it stems from the idea that cognition is compression, that the brain is basically a compression algorithm.
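To make the "score by compression" framing concrete, here is a minimal sketch that measures how well a general-purpose compressor shrinks a piece of text. It is only an illustration: it uses Python's standard `zlib` module on a toy string, the helper name `compression_ratio` is hypothetical, and it does not reproduce the actual rules, data, or scoring of the Hutter Prize.

```python
import zlib


def compression_ratio(text: str, level: int = 9) -> float:
    """Return compressed size divided by original size (lower means better compression)."""
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("text must be non-empty")
    packed = zlib.compress(raw, level)
    # Sanity check: the compression must be lossless.
    assert zlib.decompress(packed) == raw
    return len(packed) / len(raw)


# Highly regular text compresses very well; this toy string stands in
# for a real corpus and is an assumption of the sketch.
regular = "the cat sat on the mat. " * 200
ratio = compression_ratio(regular)
print(f"compressed to {ratio:.1%} of the original size")
```

Under this framing, a better model of the regularities in the text yields a smaller ratio, which is the sense in which compression is being proposed as a proxy for intelligence.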
This is a very old idea. It's a very, I think, striking and beautiful idea. I used to believe it. I eventually had to realize that it was very much a flawed idea. So I no longer believe that cognition is compression. But I can tell you what the difference is. It's very easy to believe that cognition and compression are the same thing. Jeff Hawkins, for instance, says that cognition is prediction. And of course, prediction is basically the same thing as compression.
Right. It's just including the temporal axis. And it's very easy to believe it because compression is something that we do all the time, very naturally. We are constantly, you know, compressing information. We have a bias towards simplicity.
We are constantly trying to organize things in our mind and around us to be more regular. Right. So it's a beautiful idea. It's very easy to believe. But there is a big difference between what we do with our brains and compression. Compression is actually kind of a tool in the human cognitive toolkit that is used in many ways. But it's just a tool. It is a tool for cognition. It is not cognition itself. And the big fundamental difference is that cognition is about being able to operate in future situations that include fundamental uncertainty and novelty.
So, for instance, consider a child at age 10 and so they have 10 years of life experience.
They've gotten, you know, pain, pleasure, rewards and punishments over that period of time. If you were to generate the shortest behavioral program that would have basically run that child over these ten years in an optimal way, right, the shortest optimal behavioral program given the experience of that child so far, well, that program, that compressed program, which is what you would get if the mind of the child was a compression algorithm, essentially, would be utterly unable, inappropriate, to process the next 70 years in the life of that child. So in the models we build of the world, we are not trying to make them actually optimally compressed. We are using compression as a tool to promote simplicity and efficiency in our models, but they are not perfectly compressed, because they need to include things that are seemingly useless today, that have seemingly been useless so far, but that may turn out to be useful in the future, because you just don't know the future.
And that's the fundamental principle that cognition, that intelligence, arises from: you need to be able to run appropriate behavioral programs, except you have absolutely no idea what sort of context, environment and situation they're going to be running in. And you have to deal with that uncertainty, with that future novelty. So an analogy that you can make is with investing, for instance.
If I look at the past, you know, 20 years of stock market data and I use a compression algorithm to figure out the best trading strategy, it's going to be, you know, you buy Apple stock, then maybe for the past few years you buy Tesla stock or something.
But is that strategy still going to be true for the next 20 years? Well, actually, probably not, which is why, if you're a smart investor, you're not just going to be following the strategy that corresponds to compression of the past. You're going to have a balanced portfolio. Yeah, right.
Because you just don't know what's in store. I mean, I guess in that same sense, compression is analogous to what you talked about, which is like local or robust generalization versus extreme generalization. It's much closer to that side of being able to generalize in the local sense. That's why, you know, as humans, when we are children, in our education, a lot of it is driven by play, driven by curiosity.
We are not efficiently compressing things. We're actually exploring. We are retaining all kinds of things from our environment that seem to be completely useless, because they might turn out to be eventually useful. Right. And that's what cognition is really about. And that's what makes it antagonistic to compression: it is about hedging for future uncertainty, and that's antagonistic to compression.
Yes, efficient cognition leverages compression as a tool to promote efficiency. Right. And so in that sense, in our models... It's like Einstein said: make it simple, but not too simple. So compression simplifies things, but you don't want to make it too simple. Yes. So a good model of the world is going to include all kinds of things that are completely useless, actually, just in case. Because you need diversity, in the same way that in your portfolio you need all kinds of stocks that may not have performed well so far. You need diversity, and the reason you need diversity is that fundamentally you don't know what you're doing.
And the same is true of the human mind: it needs to behave appropriately in the future, and it has no idea what the future is going to be like, but it's not going to be like the past. So compressing the past is not appropriate, because the past is not predictive of the future.
Yeah, history repeats itself, but not perfectly. I don't think I asked you last time the most inappropriately absurd question. We've talked a lot about intelligence, but, you know, the bigger question than intelligence is that of meaning. You know, intelligent systems are kind of goal-oriented; they're always optimizing for a goal. You look at the Hutter Prize, actually. I mean, there's always a clean formulation of a goal. But the natural question for us humans, since we don't know our objective function, is: what is the meaning of it all?
So the absurd question is: what, François Chollet, do you think is the meaning of life?
What's the meaning of life? Yeah, that's a big question. And I think I can give you my answer, at least one of my answers. And so, you know, the one thing that's very important in understanding who we are is that everything that makes up ourselves, that makes up who we are, even your most personal thoughts, is not actually your own, right? Like, even your most personal thoughts are expressed in words that you did not invent and are built on concepts and images that you did not invent.
We are very much cultural beings, right? We are made of culture. That's what makes us different from animals, for instance. Right.
So everything about ourselves is an echo of the past, an echo of people who lived before us. Right. That's who we are. And in the same way, if we manage to contribute something to the collective edifice of culture, a new idea, maybe a beautiful piece of music, a work of art, a grand theory, a new word, maybe, that something is going to become a part of the minds of future humans, essentially forever.
So everything we do creates ripples that propagate into the future. And that's, in a way, our path to immortality: as we contribute things to culture, culture in turn becomes future humans. And we keep influencing people, you know, thousands of years from now. So our actions today create ripples, and these ripples, I think, basically sum up the meaning of life. In the same way that we are the sum of the interactions between many different ripples that came from our past, we are ourselves creating ripples that propagate into the future.
And that's why, you know, this seems like perhaps a naive thing to say, but we should be kind to others during our time on Earth, because every act of kindness creates ripples. And in reverse, every act of violence also creates ripples. And you want to carefully choose which kind of ripples you want to create and propagate into the future.
And in your case, first of all, beautifully put. In your case, creating ripples into the future of humans and future AGI systems. Yes. It's fascinating. Our successors.
I don't think there's a better way to end it, François. As always, for a second time, and I'm sure many times in the future, it's been a huge honor. You're one of the most brilliant people in the machine learning, computer science world. Again, it's a huge honor. Thanks for talking today. It's been a pleasure. Thanks a lot for having me. I really appreciate it. Thanks for listening to this conversation with François Chollet, and thank you to our sponsors: Babbel, MasterClass and Cash App. Click the sponsor links in the description to get a discount and to support this podcast.
If you enjoy this thing, subscribe on YouTube, review it with five stars on Apple Podcasts, follow on Spotify, support on Patreon, or connect with me on Twitter at lexfridman. And now let me leave you with some words from René Descartes in 1668, an excerpt of which François includes in his On the Measure of Intelligence paper. "If there were machines which bore a resemblance to our bodies and imitated our actions as closely as possible for all practical purposes, we should still have two very certain means of recognizing that they were not real men. The first is that they could never use words or put together signs, as we do in order to declare our thoughts to others.
For we can certainly conceive of a machine so constructed that it utters words, and even utters words that correspond to bodily actions causing a change in its organs. But it is not conceivable that such a machine should produce different arrangements of words so as to give an appropriately meaningful answer to whatever is said in its presence, as the dullest of men can do." Here, Descartes is anticipating the Turing test, and the argument still continues to this day. "Secondly," he continues, "even though some machines might do some things as well as we do them, or perhaps even better, they would inevitably fail in others, which would reveal that they are acting not from understanding, but only from the disposition of their organs."
This is an incredible quote. "For whereas reason is a universal instrument which can be used in all kinds of situations, these organs need some particular action. Hence it is, for all practical purposes, impossible for a machine to have enough different organs to make it act in all the contingencies of life in the way in which our reason makes us act." That's the debate between mimicry and memorization versus understanding. So thank you for listening, and hope to see you next time.