[00:00:00]

The following is a conversation with Tomaso Poggio. He's a professor at MIT and is a director of the Center for Brains, Minds and Machines. Cited over 100,000 times, his work has had a profound impact on our understanding of the nature of intelligence in both biological and artificial neural networks. He has been an adviser to many highly impactful researchers and entrepreneurs in AI, including Demis Hassabis of DeepMind, Amnon Shashua of Mobileye, and Christof Koch of the Allen Institute for Brain Science.

[00:00:34]

This conversation is part of the MIT course on artificial general intelligence and the Artificial Intelligence Podcast. If you enjoy it, subscribe on YouTube, iTunes, or simply connect with me on Twitter at Lex Fridman, spelled F-R-I-D. And now, here's my conversation with Tomaso Poggio.

[00:01:09]

You've mentioned that in your childhood you've developed a fascination with physics, especially the theory of relativity, and that Einstein was also a childhood hero to you.

[00:01:21]

What aspect of Einstein's genius, the nature of his genius, do you think was essential for discovering the theory of relativity?

[00:01:29]

You know, Einstein was a hero to me, and I'm sure to many people, because he was able to make, of course, a major, major contribution to physics with, simplifying a bit, just a Gedankenexperiment, a thought experiment. You know, imagining communication with light between a stationary observer and somebody on a train. And I thought, you know, the fact that just with the force of his thought, of his thinking, of his mind, he could get to something so deep in terms of physical reality, how time depends on space and speed.

[00:02:18]

It was something absolutely fascinating. It was the power of intelligence, the power of the mind.

[00:02:25]

Do you think the ability to imagine, to visualize, as he did as a lot of great physicists do, do you think that's in all of us human beings or is there something special to that one particular human being?

[00:02:39]

I think, you know, all of us can learn and have, in principle, similar breakthroughs. There are lessons to be learned from Einstein. He was one of five Ph.D. students at ETH, the Eidgenössische Technische Hochschule in Zurich, in physics. And he was the worst of the five, the only one who did not get an academic position when he graduated, when he finished his Ph.D. And he went to work, as everybody knows, for the patent office. And so it's not so much that he worked for the patent office, but the fact that, obviously, he was smart, but he was not the top student. Obviously he was the anti-conformist. He was not thinking in the traditional way that probably his teachers and the other students were.

[00:03:37]

So there is a lot to be said about, you know, trying to do the opposite, or something quite different, from what other people are doing. That's certainly true for the stock market.

[00:03:50]

Never, never buy if everybody's buying. And it's also true for science. Yes. So you've also mentioned, staying on the theme of physics, that you were excited at a young age by the mysteries of the universe that physics could uncover, such as, I saw mentioned, the possibility of time travel.

[00:04:13]

So the most out of the box question I think I'll get to ask today, do you think time travel is possible?

[00:04:20]

Well, it would be nice if it were possible right now. You know, in science, you never say no.

[00:04:29]

But your understanding of the nature of time? Yeah, it's very likely that it's not possible to travel in time. You may be able to travel forward in time if we can, for instance, freeze ourselves or, you know, go on some spacecraft traveling close to the speed of light. But in terms of actively traveling, for instance, back in time, I find that probably very unlikely. So do you still hold the underlying dream of engineering intelligence that will build systems that are able to do such huge leaps, like discovering the kind of mechanism that would be required to travel through time? Do you still hold that dream, or echoes of it, from your childhood?

[00:05:24]

Yeah, you know, I think there are certain problems that probably cannot be solved, depending on what you believe about physical reality. Like, you know, it may be totally impossible to create energy from nothing or to travel back in time. But about making machines that can think as well as we do, or better, or, more likely, especially in the short and mid-term, help us think better, which in a sense is happening already with the computers we have.

[00:06:03]

And it will happen more and more. But that I certainly believe and I don't see in principle why computers at some point could not become more intelligent than we are, although the word intelligence is a tricky one and one you should discuss first.

[00:06:23]

What I mean with that: intelligence, consciousness.

[00:06:26]

Yeah, words like love. All of these need to be disentangled. So you've mentioned also that you believe the problem of intelligence is the greatest problem in science, greater than the origin of life and the origin of the universe.

[00:06:44]

You've also, in the talk I've listened to, said that you're open to arguments against you.

[00:06:51]

So what do you think is the most captivating aspect of this problem, of understanding the nature of intelligence?

[00:07:00]

Why does it captivate you as it does?

[00:07:04]

Well, originally, I think one of the motivations that I had as, I guess, a teenager, when I was infatuated with the theory of relativity, was really that I found that there was the problem of time and space and general relativity, but there were so many other problems of the same level of difficulty and importance that, even if I were Einstein, it was difficult to hope to solve all of them. So what about solving a problem whose solution would allow me to solve all the problems?

[00:07:43]

And this was: what if we could find the key to an intelligence, you know, ten times better or faster than Einstein?

[00:07:54]

So that's sort of seeing artificial intelligence as a tool to expand our capabilities. But is there just an inherent curiosity in you in just understanding what is in here that makes it all work?

[00:08:11]

Yes, absolutely. You're right. So I started saying this was the motivation when I was a teenager. But, you know, soon after, I think the problem of human intelligence became a real focus of my science and my research, because

[00:08:34]

I think it is, for me, the most interesting problem. It is really asking

[00:08:42]

who we are, right? It is asking not only a question about science, but even about the very tool we are using to do science, which is our brain. How does our brain work? Where does it come from? What are its limitations? Can we make it better? And that, in many ways, is the ultimate question that underlies this whole effort of science.

[00:09:11]

So you've made significant contributions in both the science of intelligence and the engineering of intelligence. In a hypothetical way, let me ask: how far do you think we can get in creating intelligent systems without understanding how the human brain creates intelligence?

[00:09:33]

Put another way, do you think we can build a strong AI system without really getting at the core, understanding the functional nature of the brain?

[00:09:42]

Well, this is a really difficult question. You know, we did solve problems like flying

[00:09:52]

without really using too much of our knowledge about how birds fly. It was important, I guess, to know that you could have things heavier than air being able to fly, like birds. But beyond that, probably we did not learn very much. You know, the Wright brothers did learn a lot from observation of birds in designing their aircraft. But, you know, you can argue we did not use much of biology in that particular case.

[00:10:35]

Now, in the case of intelligence, I think that it's a bit of a bet right now.

[00:10:46]

If you ask, OK, we all agree we'll get at some point, maybe soon, maybe later, to a machine that is indistinguishable from my secretary, say, in terms of what I can ask the machine to do.

[00:11:04]

I think we'll get there. And now the question is, and you can ask people: do you think we'll get there without any knowledge about, you know, the human brain, or is the best way to get there to understand the human brain better? OK, this is, I think, an educated bet that different people with different backgrounds will decide in different ways. The recent history of the progress in AI in the last, I would say, five or ten years has been that the main breakthroughs, the main recent breakthroughs,

[00:11:44]

really start from neuroscience. I can mention reinforcement learning as one. It is one of the algorithms at the core of AlphaGo, which is the system that beat the kind of unofficial world champion of Go, Lee Sedol, two or three years ago in Seoul. That's one. And that started really with the work of Pavlov in 1900, Marvin Minsky in the 60s, and many other neuroscientists later on. And deep learning, which is at the core again of AlphaGo and of systems like autonomous driving systems for cars, like the systems that Mobileye, which is a company started by one of my ex-postdocs, Amnon Shashua,

[00:12:43]

did. So that is at the core of those things. And for deep learning, really, the initial ideas in terms of the architecture of these layered hierarchical networks started with the work of Torsten Wiesel and David Hubel at Harvard, up the river, in the 60s. So recent history suggests that neuroscience played a big role in these breakthroughs. My personal bet is that there is a good chance it continues to play a big role, maybe not in all the future breakthroughs, but in some of them. At least in inspiration.

[00:13:21]

So at least in inspiration. Absolutely, yes.

[00:13:24]

So you've studied both artificial and biological neural networks. You've described these mechanisms that underlie deep learning and reinforcement learning.

[00:13:36]

But there are nevertheless significant differences between biological and artificial neural networks as they stand now. Between the two, what do you find is the most interesting, mysterious, maybe even beautiful difference as it currently stands in our understanding?

[00:13:54]

I must confess that until recently I found the artificial networks too simplistic relative to real neural networks. But, you know, recently I've started to think that, yes, they are a very big simplification of what you find in the brain. But on the other hand, they are much closer in terms of architecture to the brain than other models that we had, that computer science used as models of thinking, which were mathematical logic, you know, Lisp, Prolog, and those kinds of things.

[00:14:36]

Yeah. So in comparison to those, they are much closer to the brain. You have networks of neurons, which is what the brain is about. The artificial neurons in the models are, as I said, a caricature of the biological neurons. But they are still neurons, single units communicating with other units, something that is absent in, you know, the traditional computer-type models of mathematics, reasoning, and so on.

[00:15:07]

So what aspect would you like to see in artificial neural networks added over time as we try to figure out ways to improve them?

[00:15:17]

So one of the main differences and, you know, problems in terms of deep learning today, and it's not only deep learning, compared with the brain, is the need for deep learning techniques to have a lot of labeled examples. You know, for ImageNet you have a training set which is one million images, each one labeled by some human in terms of which object is there. And it's clear that in biology a baby may be able to see millions of images in the first years of life, but will not have millions of labels given to him or her by parents or caretakers.

[00:16:13]

So how do we solve that? You know, I think there is this interesting challenge that today deep learning and related techniques are all about big data, big data meaning a lot of examples labeled by humans. So this big data is n going to infinity. That's the best, you know, n meaning labeled data. But I think the biological world is more n going to one. A child can learn

[00:16:53]

from a very small number of, you know, labeled examples. Like, you tell a child, this is a car. You don't need to say, like in ImageNet, you know, this is a car, this is a car, this is not a car, this is not a car, one million times.

[00:17:10]

And of course, with AlphaGo, or at least the AlphaZero variants, because the world of Go is so simplistic, you can actually learn by yourself through self-play. You can play against each other. And the real world, I mean, the visual system that you've studied extensively, is a lot more complicated than the game of Go. Now, on the comment about children, which are fascinatingly good at learning new stuff: how much of it do you think is hardware?

[00:17:41]

How much of it is software?

[00:17:43]

Yeah, that's a good, deep question. It is, in a sense, the old question of nurture and nature. How much is in the genes, and how much is in the experience of an individual? Obviously, it's both. And I believe that the way evolution puts in prior information, so to speak, hardwired, is not really hardwired. But that's essentially a hypothesis. I think what's going on is that evolution, as you know, almost necessarily, if you believe in Darwin, is very opportunistic.

[00:18:33]

And think about our DNA and the DNA of Drosophila. Our DNA does not have many more genes than Drosophila. The fly? The fruit fly, yeah. Now, we know that the fruit fly does not learn very much during its individual existence. It looks like one of those machines that is really mostly, not a hundred percent, but, you know, 95 percent hardcoded by the genes. But since we don't have many more genes than Drosophila, evolution could encode in us a kind of general learning machinery and then had to give very weak priors. Like, for instance, let me give a specific example, which is recent work by a member of our Center for Brains, Minds and Machines.

[00:19:37]

We know, because of work of other people in our group and other groups, that there are cells in a part of our brain, neurons, that are tuned to faces. They seem to be involved in face recognition. Now, this face area seems to be present in young children and adults. And one question is: is it there from the beginning, hardwired by evolution, or, you know, is it somehow learned very quickly?

[00:20:12]

By the way, for a lot of the questions I'm asking, the answer is we don't really know. But as a person who has contributed some profound ideas in these fields, you're a good person to guess at some of these. So, of course, there's a caveat before a lot of the stuff we talk about.

[00:20:28]

But what is your hunch? The part of the brain that seems to be concentrated on face recognition: are you born with that, or is it just designed to learn that quickly, like the face of the mother?

[00:20:43]

My hunch, my bias, was the second one: learned very quickly. And it turns out that Marge Livingstone at Harvard has done some amazing experiments in which she raised baby monkeys, depriving them of faces during the first weeks of life. So they see technicians, but the technicians have a mask.

[00:21:10]

Yes. And so when they looked at the area in the brain of these monkeys where you usually find faces, they found no face preference. So my guess is that what evolution does in this case is that there is a plastic area, which is kind of predetermined to be imprinted very easily. But the command from the genes is not a detailed circuitry for a face template. It could be, but this would probably require a lot of bits.

[00:21:53]

You have to specify a lot of connections of a lot of neurons. Instead, the command from the genes is something like: imprint, memorize what you see most often in the first two weeks of life, especially in connection with food. And maybe nipples, I don't know. Right, a source of food. And so that area is very plastic at first and then solidifies.

[00:22:17]

It'd be interesting if a variant of that experiment would show a different kind of pattern associated with food than a face pattern.

[00:22:25]

Yes, that could be. There are indications that, during that experiment,

[00:22:32]

what the monkeys saw quite often were the blue gloves of the technicians that were giving the baby monkeys the milk, and some of the cells in that area, instead of being face-sensitive, are hand-sensitive. That's fascinating.

[00:22:53]

Can you talk about what the different parts of the brain are, in your view, sort of loosely, and how they contribute to intelligence? Do you see the brain as a bunch of different modules that come together in the human brain to create intelligence? Or is it all one mush of the same kind of fundamental architecture? Yeah, that's, you know, an important question. And there was a phase in neuroscience back in the 1950s or so in which it was believed for a while that the brain was equipotential.

[00:23:39]

This was the term. You could cut out a piece and nothing special happened, apart from a little bit less performance. There was a surgeon, Lashley, who did a lot of experiments of this type with mice and rats, and concluded that every part of the brain was essentially equivalent to any other one. It turns out that that's really not true. There are very specific modules in the brain, as you said, and, you know, people may lose the ability to speak if they have a stroke in a certain region, or may lose control of their legs with a stroke in another region. So they are very specific.

[00:24:31]

The brain is also quite flexible and redundant. So often it can correct things and, you know, kind of take over functions from one part of the brain to the other. But really, there are specific modules. That is the answer that we know from this old work, which was basically based on lesions, either on animals, or, very often, there was a mine of very interesting data coming from the war, from different types of injuries that soldiers had in the brain. And more recently, functional MRI, which allows you to check

[00:25:29]

which parts of the brain are active when you are doing different tasks, can replace some of this. You can see that certain parts of the brain are involved, are active, in certain tasks. Language? Yeah, that's right.

[00:25:49]

But sort of taking a step back to that part of the brain that specializes in faces and how that might be learned.

[00:25:58]

What's your intuition behind?

[00:26:02]

You know, is it possible that, sort of from a physicist's perspective, when you get lower and lower, it's all the same stuff, and just, when you're born, it's plastic and quickly figures out: this part is going to be about vision, this is going to be about language, this is about common sense reasoning? Do you have an intuition that that kind of learning is going on really quickly, or is it really kind of solidified in hardware?

[00:26:26]

That's a great question. So there are parts of the brain, like the cerebellum or the hippocampus, that are quite different from each other. They clearly have different anatomy, different connectivity. Then there is the cortex, which is the most developed part of the brain in humans. And in the cortex, you have different regions that are responsible for vision, for audition, for motor control, for language. Now, one of the big puzzles of all this is that the cortex is the cortex: it looks like it is the same in terms of hardware, in terms of types of neurons and connectivity, across these different modalities.

[00:27:25]

So for the cortex, setting aside these other parts of the brain, like spinal cord, hippocampus, cerebellum and so on, I think your question about hardware and software and learning and so on is rather open. And, you know, I find it very interesting for us to think about an architecture, a computer architecture, that is good for vision and at the same time is good for language. They seem to be, you know, such different problem areas that you have to solve.

[00:28:06]

But the underlying mechanism might be the same. And that's really instructive for, maybe, artificial neural networks.

[00:28:12]

So you've done a lot of great work in vision, in human vision and computer vision. And you mentioned that the problem of human vision is really as difficult as the problem of general intelligence, and maybe that connects to the cortex discussion. Can you describe the human visual cortex and how humans begin to understand the world through the raw sensory information? For folks who aren't familiar, especially on the computer vision side,

[00:28:47]

we don't often actually take a step back, except saying with a sentence or two that one is inspired by the other. What is it that we know about the human visual cortex that's interesting?

[00:28:57]

So we know quite a bit. At the same time, we don't know a lot. But the bit we know... You know, in a sense, we know a lot of the details, and many we don't know. And we know a lot of the top level, the answer to the top-level question, but we don't know some basic ones. Even in terms of general neuroscience, forgetting vision: you know, why do we sleep?

[00:29:26]

It's such a basic question, and we really don't have an answer to that. So, taking a step back on that: sleep, for example, is fascinating. Do you think that's a neuroscience question? Or, if we talk about abstractions, what do you think is an interesting way to study intelligence, or the most effective, in terms of levels of abstraction? Is it chemical, is it biological, is it electrophysical, mathematical, as you've done a lot of work on that side, or psychological? At which level of abstraction, do you think?

[00:30:00]

Well, in terms of levels of abstraction, I think we need all of them.

[00:30:05]

You know, it's like if you ask me what it means to understand a computer. That's much simpler. But for a computer, I could say, well, I understand how to use PowerPoint. That's my level of understanding a computer. It is reasonable, you know. It gives me some power to produce slides, beautiful slides.

[00:30:30]

And now you can have somebody else who says, well, I know how the transistors work that are inside the computer. I can write the equations for, you know, transistors and diodes and circuits, logical circuits. And I can ask this guy, do you know how to operate PowerPoint? No idea.

[00:30:51]

So do you think, if we discovered computers walking amongst us, full of these transistors, that are also operating under Windows and have PowerPoint, do you think it's...

[00:31:04]

Digging in a little bit more: how useful is it to understand the transistor in order to be able to understand PowerPoint and these higher-level intelligent processes? Very good. So I think in the case of computers, because they were made by engineers, by us, these different levels of understanding are rather separate on purpose. You know, they are separate modules, so that the engineer who designed the circuits for the chips does not need to know what is inside PowerPoint.

[00:31:40]

And somebody can write the software translating from one to the other. And so in that case, I don't think understanding the transistor helps you understand PowerPoint, or very little. Right. If you want to understand the computer, on this question, you know, I would say you have to understand it at different levels, if you really want to build one.

[00:32:07]

Right. But for the brain, I think these levels of understanding, so the algorithms, which kind of computation, you know, the equivalent of PowerPoint, and the circuits, you know, the transistors, I think they are much more intertwined with each other. There is not, you know, a neat level of the software separate from the hardware. And so that's why I think in the case of the brain the problem is more difficult and, more than for computers, requires the interaction, the collaboration, between different types of expertise.

[00:32:46]

So the brain's a big mess that you can't just disentangle. Well, you can.

[00:32:53]

But this is much more difficult. And it's not, you know, completely obvious. And as I said, I think it's, personally, the greatest problem in science. So, you know, I think it is fair that it's a difficult one.

[00:33:10]

That said, you do talk about compositionality and why it might be useful.

[00:33:15]

And when you discuss why these neural networks, in the artificial or biological sense, learn anything, you talk about compositionality. So there's a sense that nature can be disentangled, or, well, all aspects of our cognition could be disentangled, to some degree.

[00:33:39]

So, first of all, how do you see compositionality, and why do you think it exists at all in nature?

[00:33:51]

I spoke about, I used the term compositionality when we looked at deep neural networks, multiple layers, and tried to understand when and why they are more powerful than more classical one-layer networks, like linear classifiers or so-called kernel machines. And what we found is that, in terms of approximating or learning or representing a function, a mapping from an input to an output, like from an image to the label of the image, if this function has a particular structure, then deep networks are much more powerful than shallow networks at approximating the underlying function.

[00:34:45]

And the particular structure is a structure of compositionality: the function is made up of functions of functions, so that when you are interpreting an image, classifying an image, you don't need to look at all pixels at once. You can compute something from small groups of pixels, and then you can compute something on the output of this local computation, and so on. That is similar to what you do when you read a sentence. You don't need to read the first and the last letter at once, but you can read syllables, combine them into words, and combine the words into sentences.

[00:35:35]

So this is this kind of structure.
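As an illustration of the "functions of functions" structure described above, here is a minimal Python sketch. The two-input constituent `local_unit` and the eight "pixel" values are hypothetical choices made only for this example; the point is that the final value is computed by combining small local groups level by level, never by one function looking at all inputs at once.

```python
import math


def local_unit(a, b):
    # Hypothetical two-input constituent function (an assumption for
    # illustration); any smooth local function would serve the same role.
    return math.tanh(a + 2.0 * b)


def compositional(inputs):
    """Evaluate a binary-tree composition of local_unit over the inputs.

    Assumes len(inputs) is a power of two (an illustrative restriction).
    Returns the final value and the number of local evaluations used.
    """
    level = list(inputs)
    evaluations = 0
    while len(level) > 1:
        # Each unit only sees two neighbouring outputs of the level below.
        level = [local_unit(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
        evaluations += len(level)
    return level[0], evaluations


pixels = [0.1, 0.4, -0.3, 0.8, 0.0, 0.5, -0.7, 0.2]
value, n_units = compositional(pixels)
print(value, n_units)
```

With eight inputs the tree has three levels and uses seven local evaluations, which mirrors the syllables, words, sentences analogy: each stage works only on the outputs of the stage below it.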

[00:35:38]

So that's part of a discussion of why deep neural networks may be more effective than shallow methods. And is it your sense that most things you can use neural networks for, those problems are going to be compositional in nature, like language, like vision? How far can we get in this kind of...

[00:36:04]

Right. So here it is almost philosophy. Well, let's go there. Yeah, let's go there.

[00:36:11]

So this friend of mine, Max Tegmark, who is a physicist at M.I.T., I've talked to him on this thing and he disagrees with you a little bit.

[00:36:21]

You know, we agree on most of it, but the conclusion is a bit different.

[00:36:26]

His conclusion is that, for images, for instance, the compositional structure of the function that we have to learn to solve these problems comes from physics, comes from the fact that you have local interactions in physics between atoms and other atoms, between particles of matter and other particles, between planets and other planets, between stars. It's all local.

[00:37:02]

Yeah, and that's true. But you could push this argument a bit further. Not this argument, actually. You could argue that, you know, maybe that's part of the truth. But maybe what happens is kind of the opposite: our brain is wired up as a deep network, so it can learn to understand and solve problems that have this compositional structure, and it cannot solve problems that don't have this compositional structure.

[00:37:46]

So the problems we are accustomed to, the ones we think about, the ones we test our algorithms on, have this compositional structure because our brain is made up that way.

[00:37:59]

And that's, in a sense, an evolutionary perspective that we have. So the ones that weren't dealing with the compositional nature of reality died off?

[00:38:13]

Yes. It also could be, maybe, the reason why we have this local connectivity in the brain, like simple cells in cortex, each one of them looking only at a small part of the image, and then other cells looking at a small number of these simple cells, and so on. The reason for this may be purely that it was difficult to grow long-range connectivity. So suppose that, for biology, it's possible to grow short-range connectivity but not long-range, also because there is a limited number of long-range connections.

[00:38:56]

And so you have this limitation from the biology. And this means you build a deep convolutional net. This would be something like a deep convolutional network. And this is great for solving a certain class of problems. These are the ones we find easy and important for our life. And yes, they were enough for us to survive.

[00:39:24]

And you can start a successful business on solving those problems, right? Like Mobileye. Driving is a compositional problem. Right. Now, on the learning task:

[00:39:37]

I mean, we don't know much about how the brain learns in terms of optimization, but the thing that artificial neural networks use for the most part is stochastic gradient descent, to adjust the parameters in such a way that, based on the labeled data, the network is able to solve the problem.

[00:39:59]

So what's your intuition about why it works at all?

[00:40:07]

How hard of a problem is it to optimize a neural network, an artificial neural network? Are there other alternatives? Just in general, what is your intuition behind this very simplistic algorithm that seems to do pretty well, surprisingly? Yes.
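For readers unfamiliar with the algorithm named in the question, the following is a minimal sketch of stochastic gradient descent in plain Python. It is not a neural network: the model is a hypothetical two-parameter line `w * x + b`, the labeled data are generated from `y = 2x + 1` purely for illustration, and the learning rate, step count, and seed are assumed values. What it shows is the core loop: pick one labeled example at random, measure the error, and nudge the parameters against the gradient.

```python
import random


def sgd_linear_fit(data, learning_rate=0.1, steps=2000, seed=0):
    """Fit y ~ w * x + b by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        # "Stochastic": use a single randomly chosen labeled example per step.
        x, y = rng.choice(data)
        error = (w * x + b) - y
        # Gradient of 0.5 * error**2 with respect to w and b.
        w -= learning_rate * error * x
        b -= learning_rate * error
    return w, b


# Hypothetical labeled data: inputs in [-1, 1], labels from y = 2x + 1.
data = [(k / 10.0, 2.0 * (k / 10.0) + 1.0) for k in range(-10, 11)]
w, b = sgd_linear_fit(data)
print(w, b)
```

On this noise-free toy problem the parameters should approach the generating values 2 and 1. Deep networks apply the same update rule to millions of parameters, with the gradient computed by backpropagation.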

[00:40:23]

Yes. So I find that, from neuroscience, the architecture of cortex is really similar to the architecture of deep networks, so that there is a nice correspondence there between the biology and this kind of local connectivity, hierarchical architecture. The stochastic gradient descent, as you say, is a very simple technique. It seems pretty unlikely that biology could do that, from what we know right now about, you know, cortex and neurons and synapses. So it's a big open question whether there are other

[00:41:12]

optimization or learning algorithms that can replace stochastic gradient descent. And my guess is yes, but nobody has found a real answer yet. I mean, people are trying, still trying, and there are some interesting ideas. The fact that stochastic gradient descent is so successful, it has become clear, is not so mysterious. And the reason is an interesting fact that, you know, is a change, in a sense, in how people think about statistics. And this is the following. Typically, when you had data and you had, say, a model with parameters, you were trying to fit the model to the data, you know, to fit the parameters. Typically, the kind of crowd-wisdom type of idea was that you should have at least, you know, twice as many data points as parameters.

[00:42:28]

Maybe 10 times is better. Now, the way you train neural networks these days is that they have 10 or 100 times more parameters than data, exactly the opposite. And, you know, it has been one of the puzzles about neural networks: how can you get something that really works when you have so much freedom? From that little data, it can generalize somehow.

[00:43:00]

Right, exactly. Do you think the stochastic nature of it is essential, the randomness?

[00:43:05]

So I think we have some initial understanding of why this happens. But one nice side effect of having this overparameterization, more parameters than data, is that when you look for the minima of a loss function, like stochastic gradient descent is doing, you find, I made some calculations based on

[00:43:31]

some old basic theorem of algebra called Bézout's theorem, and that gives you an estimate of the number of solutions of a system of polynomial equations. Anyway, the bottom line is that there are probably more minima for a typical deep network than atoms in the universe. Just to say there are a lot, because of the overparameterization. Yes. More global minima, zero minima, good minima.

[00:44:06]

More global minima. Yeah, a lot of them.

[00:44:10]

So we have a lot of solutions. So it's not so surprising that you can find them relatively easily. And this is because of the overparameterization.

[00:44:21]

The overparameterization sprinkles that entire space with solutions that are pretty good.

[00:44:26]

And so it's not so surprising. It is like, you know, if you have a system of linear equations and you have more unknowns than equations, then we know you have an infinite number of solutions. And the question is to pick one. That's another story, but you have an infinite number of solutions.
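
The point about more unknowns than equations can be sketched with a small example. The two equations below are made up for illustration and are not from the conversation; the function names are hypothetical. With three unknowns and two equations, every choice of z gives a valid solution, and "picking one" needs an extra criterion, for instance the smallest norm.

```python
def solve_with_free_z(z):
    """Solve x + y + z = 6 and x - y + 2z = 4 for (x, y), given any value of z.

    The system is an illustrative assumption: two equations, three unknowns.
    """
    # Adding the two equations gives 2x + 3z = 10.
    x = (10 - 3 * z) / 2
    # Subtracting the second from the first gives 2y - z = 2.
    y = (2 + z) / 2
    return x, y, z


def residuals(x, y, z):
    """How far (x, y, z) is from satisfying each equation (0 means exact)."""
    return (x + y + z - 6, x - y + 2 * z - 4)


def squared_norm(solution):
    """One possible criterion for choosing among the infinitely many solutions."""
    return sum(v * v for v in solution)


# Every value of z yields a different exact solution.
candidates = [solve_with_free_z(z) for z in (-1.0, 0.0, 1.0, 2.0, 3.0)]
# Choosing the candidate with the smallest norm is one way to "pick one".
smallest = min(candidates, key=squared_norm)
```

Among these candidates the smallest-norm choice is the one with z = 2, but that criterion is only one option; which solution generalizes is, as said in the conversation, a separate question.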

[00:44:44]

So there are a lot of values of your unknowns that satisfy the equations, but it's possible that there's a lot of those solutions that aren't very good.

[00:44:54]

Well, so that's the question: why can you pick one that generalizes well? Yeah, but that's a separate question with separate answers.

[00:45:04]

One theorem that people like to talk about, that kind of inspires imagination of the power of neural networks, is the universal approximation theorem: that you can approximate any computable function with just a finite number of neurons in a single hidden layer. Do you find this theorem surprising? Do you find it useful, interesting, inspiring?

[00:45:31]

No, this one, you know, I never found it very surprising. It was known since the 80s, since I entered the field, because it's basically the same as the Weierstrass theorem, which says that I can approximate any continuous function with a polynomial with a sufficient number of terms, monomials. Yeah, it's basically the same, and the proofs are very similar.
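
The Weierstrass statement can be illustrated numerically. The sketch below uses Bernstein polynomials, which are one standard constructive route to the theorem on the interval [0, 1]; this choice, the target function |x - 0.5|, and the function names are assumptions made for illustration, not something mentioned in the conversation.

```python
import math


def bernstein_approx(f, n, x):
    """Evaluate the degree-n Bernstein polynomial of f at x, for x in [0, 1]."""
    return sum(
        f(k / n) * math.comb(n, k) * x**k * (1 - x) ** (n - k)
        for k in range(n + 1)
    )


def max_error(f, n, grid_points=201):
    """Largest absolute error of the degree-n approximation on a uniform grid."""
    xs = [i / (grid_points - 1) for i in range(grid_points)]
    return max(abs(f(x) - bernstein_approx(f, n, x)) for x in xs)


def target(x):
    """A continuous but non-smooth function, chosen only as an example."""
    return abs(x - 0.5)


# More terms give a better approximation, as the theorem promises.
errors = {n: max_error(target, n) for n in (4, 16, 64)}
```

The error shrinks as the degree grows, but slowly, which already hints at the follow-up question of how many terms are needed.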

[00:45:58]

So your intuition was, there was never any doubt in your mind that neural networks, in theory, could be very strong approximators?

[00:46:05]

The interesting question is that this theorem says you can approximate, fine. But when you ask how many neurons, for instance, or in the case of polynomials, how many monomials, I need to get a good approximation, then it turns out that that depends on the dimensionality of your function, how many variables you have. But it depends on the dimensionality of your function in a bad way. For instance, suppose you want an error which is no worse than 10 percent in your approximation.

[00:46:52]

You come up with a network that approximates your function within 10 percent. Then it turns out that the number of units you need is on the order of 10 to the dimensionality, d, how many variables. So if you have, you know, two variables, d is two, and you would have 100 units, and OK. But if you have, say, 200 by 200 pixel images, now this is, you know, 40,000 or whatever, and we again go to the size of the universe pretty quickly.

[00:47:26]

Exactly, ten to the 40,000 or something. Yeah.

[00:47:31]

And so this is called the curse of dimensionality, you know, quite appropriately.
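
The arithmetic behind the curse of dimensionality can be sketched directly. The rule used below, roughly (1 / error) raised to the number of variables, is the order-of-magnitude estimate quoted in the conversation; the function names and the figure of about 10 to the 80 atoms in the observable universe are illustrative assumptions.

```python
import math


def units_needed(dimension, error=0.1):
    """Order-of-magnitude unit count for a given error: (1 / error) ** dimension."""
    return round(1 / error) ** dimension


def log10_units(dimension, error=0.1):
    """Base-10 logarithm of the unit count, usable when the count itself is huge."""
    return dimension * math.log10(1 / error)


# Two variables at 10 percent error: about 100 units.
small_case = units_needed(2)

# A 200 by 200 pixel image has 40,000 variables.
pixels = 200 * 200
exponent = log10_units(pixels)

# Rough, commonly quoted estimate for atoms in the observable universe (assumption).
ATOMS_IN_UNIVERSE = 10**80
exceeds_atoms = units_needed(pixels) > ATOMS_IN_UNIVERSE
```

Here `small_case` is 100 and `exponent` is 40,000, matching the numbers mentioned above.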

[00:47:39]

And the hope is with the extra layers, you can remove the curse.

[00:47:45]

What we proved is that if you have a deep hierarchical architecture with local connectivity, of the type of convolutional deep learning, and if you're dealing with a function that has this kind of hierarchical architecture, then you avoid the curse completely.

[00:48:07]

You've spoken a lot about supervised deep learning. Yeah.

[00:48:11]

What are your thoughts, hopes, views on the challenges of unsupervised learning with GANs, with generative adversarial networks? Do you see the power of GANs as distinct from the supervised methods in neural networks, or are they really all in the same representation ballpark?

[00:48:32]

GANs are one way to get an estimation of probability densities, which is a somewhat new way that people have not done before.

[00:48:47]

I don't know whether this will really play an important role in, you know, intelligence. It's interesting. I'm less enthusiastic about it than many people in the field. I have the feeling that many people in the field are really impressed by the ability of producing realistic looking images in this generative way, which explains the popularity of the methods.

[00:49:20]

But you're saying that while that's exciting and cool to look at, it may not be the tool that's useful for.

[00:49:27]

Yeah. So you described it kind of beautifully: current supervised methods go n to infinity in terms of the number of labeled points, and we really have to figure out how to go to n to one.

[00:49:38]

Yeah, and you're thinking GANs might help, but they might not be the right... I don't think so, for that problem, which I really think is important. I think they may help. They certainly have applications, for instance, in computer graphics. And, you know, I did work long ago which was a little bit similar, in terms of saying, OK, I have a network and I present images, so the input is images and the output is, for instance, the pose of the image, you know, a face, how much it is smiling, whether it is rotated 45 degrees or not.

[00:50:20]

What about having a network that I train with the same data set, but now I invert input and output? Now the input is the pose or the expression, a set of numbers, and the output is the image. And I train it. And we did get pretty good, interesting results in terms of producing very realistic looking images. It was, you know, a much less sophisticated mechanism than GANs, but the output was pretty much of the same quality.

[00:50:55]

So I think for computer graphics type applications, yeah, definitely, GANs can be quite useful. And not only for that, but for, you know, helping, for instance, on this problem of unsupervised learning, of reducing the number of labelled examples.

[00:51:19]

I think people, it's like they think they can get out more than they put in. You know, there's no free lunch.

[00:51:29]

Yeah, right. What do you think? What's your intuition? How can we slow the growth of n to infinity in supervised learning? So, for example, Mobileye has very successfully, I mean, essentially annotated large amounts of data to be able to drive a car. Now, one thought is, so we're trying to teach machines, a school of AI, and we're trying to see how can we become better teachers?

[00:52:02]

Maybe that's one way.

[00:52:04]

No, you know what?

[00:52:07]

I like that because, again, one caricature of the history of computer science, you could say, is that it begins with programmers, expensive. Yeah. It continues with labellers, cheap. Yeah. And the future would be schools, like we have for kids. Yeah.

[00:52:33]

Currently, with the labeling methods, we're not selective about which examples we teach networks with.

[00:52:42]

So I think the focus of making networks learn much faster is often on the architecture side. But how can we pick better examples with which to learn? Do you have intuitions about that?

[00:52:56]

Well, that's part of the problem. But the other one is, you know, if we look at biology, a reasonable assumption, I think, is in the same spirit as what I said: evolution is opportunistic and has weak priors. You know, the way I think the intelligence of a child, a baby, may develop is by bootstrapping weak priors from evolution. For instance, you can assume that most organisms, including human babies, have built in some basic machinery to detect motion and relative motion.

[00:53:55]

And in fact, you know, we know all insects, from fruit flies to other animals, they have this. Even in the retinas, in the very peripheral part. It's very conserved across species, something that evolution discovered early. It may be the reason why babies tend to look, in the first few days, at moving objects and not at non-moving ones. Now, moving objects means, OK, they're attracted by motion. But motion also gives automatic segmentation from the background.

[00:54:37]

Mm hmm. So because of motion boundaries, you know, either the object is moving, or the eye of the baby is tracking the moving object and the background is moving, right? Yeah.

[00:54:50]

So just purely on the visual characteristics of the scene, that seems to be the most useful. Right.

[00:54:55]

So it's like looking at an object without background. It's ideal for learning the object. Otherwise it's really difficult because you have so much stuff. So suppose you do this at the beginning, the first weeks. Then after that you can recognize objects. Now they are imprinted, a number of them, even in the background, even without motion.

[00:55:22]

So, by the way, I just want to ask about the object recognition problem. So there is this being responsive to movement and edge detection, essentially. What's the gap between being effective at visually recognising stuff, detecting where it is, and understanding the scene? Is this a huge gap with many layers, or are we close?

[00:55:49]

No, I think that's a huge gap. I think present algorithms, with all the success that we have and the fact that a lot of them are very useful, I think we are in a golden age for applications of low-level vision and low-level speech recognition and so on, you know, Alexa and so on. There are many more things of a similar level to be done, including medical diagnosis and so on. But we are far from what we call understanding of a scene, of language, of actions, of people.

[00:56:28]

And that is despite the claims. I think we are very far. We're a little bit off.

[00:56:36]

So in popular culture and among many researchers, some of whom I've spoken with, Stuart Russell and, you know, Elon Musk, in and out of the AI field, there's a concern about the existential threat of AI.

[00:56:50]

Yeah.

[00:56:51]

And how do you think about this concern, and is it valuable to think about large scale, long term unintended consequences of the intelligent systems we try to build? I always think it's better to worry first, you know, early rather than late.

[00:57:14]

So the worry is good. Yeah, yeah. I'm not against worrying at all. Personally, I think that, you know, it will take a long time before there is real reason to be worried. But as I said, I think it is good to put in place and think about possible safety measures. What I find a bit misleading are things that have been said by people I know, like Elon Musk and, what is it, Bostrom in particular, what is his first name? Nick Bostrom.

[00:57:55]

Right. You know, and there are a couple of other people saying that, for instance, AI is more dangerous than nuclear weapons, right? Yeah, I think that's really wrong. And that can be misleading, because in terms of priority, we should still be more worried about nuclear weapons and, you know, what people are doing about it and so on, than AI. Yeah. And you've spoken about Demis Hassabis and yourself, saying that you think it would be about 100 years out

[00:58:32]

before we have a general intelligence system that's on par with a human being. Do you have any updates for those predictions? Well, I think he said 20. I think he said 20. Right. This was a couple of years ago. I have not asked him again. So, you should have your own prediction.

[00:58:52]

What's your prediction about when you'll be truly surprised and what's the confidence interval on that?

[00:58:59]

You know, it's so difficult to predict the future, and even the present is pretty hard to predict. But as I said, this is completely... I would be more like Rod Brooks.

[00:59:14]

I think he's about 200 years. When we have this kind of AGI system, artificial general intelligence system, and you're sitting in a room with her, him, it,

[00:59:29]

do you think the underlying design of such a system is something we'll be able to understand? Will it be simple? Do you think it'll be explainable, understandable by us? Your intuition. Again, we're in the realm of philosophy a little bit. Probably no.

[00:59:53]

But again, it depends on what you really mean by understanding. So I think, you know, we don't understand how deep networks work. I think we are beginning to have a theory now. But in the case of deep networks, or even in the case of the simpler kernel machines or linear classifiers, we really don't understand the individual units. But we understand, you know, what the computation and the limitations and the properties of it are.

[01:00:37]

It's similar to many things in a way. What does it mean to understand how a fusion bomb works?

[01:00:46]

How many of us, you know, many of us understand the basic principle, and some of us may understand deeper details. In that sense, understanding is, as a community, as a civilization, can we build another copy of it?

[01:01:03]

OK. And in that sense, do you think there will need to be some evolutionary component where it runs away from our understanding, or do you think it could be engineered from the ground up, the same way you go from the transistor to PowerPoint? Right. So many years ago,

[01:01:22]

This was actually 40, 41 years ago.

[01:01:26]

I wrote a paper with David Marr, who was one of the founding fathers of computer vision, computational vision. I wrote a paper about levels of understanding, which is related to the question we discussed earlier about understanding PowerPoint, understanding transistors and so on. And, you know, in that kind of framework, we had the level of the hardware and the top level of the algorithms. We did not have learning. Recently, I updated it, adding levels, and one level I added to those three was learning.

[01:02:11]

So you can imagine you could have a good understanding of how you construct a learning machine, like we do, but be unable to describe in detail what the learning machines will discover. Right? Now, that would still be a powerful understanding,

[01:02:34]

if I can build a learning machine, even if I don't understand in detail every time it learns something. Just like our children, if they start listening to a certain type of music, I don't know, Miley Cyrus or something, you don't understand why they came to that particular preference, but you understand the learning process. That's very interesting. Yeah. Yeah. So, uh,

[01:03:01]

for learning systems to be part of our world, one of the challenging things that you've spoken about is learning ethics, learning morals.

[01:03:16]

And how hard do you think is the problem of, first of all, humans understanding our ethics? What is the origin, on the neural, low level, of ethics? What is it at the higher level?

[01:03:29]

Is it something that's learnable for machines, in your intuition?

[01:03:34]

I think, yeah, ethics is learnable, very likely. I think it's one of these problems where I think understanding the neuroscience of ethics... You know, people discuss the ethics of neuroscience. Yeah. Yes, you know, how a neuroscientist should or should not behave. Think of a neurosurgeon and the ethical rules he or she has to follow.

[01:04:11]

But I'm more interested in the neuroscience.

[01:04:14]

You're blowing my mind right now. The neuroscience of ethics is very meta. Yeah. And, you know, I think that would be important to understand also for being able to design machines that are ethical machines in our sense of ethics.

[01:04:32]

And you think there is something in neuroscience, there are patterns, tools in neuroscience that could help us shed some light on ethics? Or is it more on the psychology or sociology side, a much higher level?

[01:04:46]

No, there is psychology, but in the meantime there is also evidence, fMRI, of specific areas of the brain that are involved in certain ethical judgments. And not only this, you can stimulate those areas with magnetic fields and change the ethical decisions. Yeah. Wow. So that's work by a colleague of mine, Rebecca Saxe, and there are other researchers doing similar work. And I think, you know, this is the beginning, but ideally at some point we'll have an understanding of how this works and why it evolved. Right, the big why question.

[01:05:36]

Yeah, it must have some some purpose.

[01:05:39]

Yeah, obviously it has, you know, some social purposes, probably. If neuroscience holds the key to at least illuminate some aspect of ethics, that means it could be a learnable problem. Yeah, exactly.

[01:05:55]

And as we're getting into harder and harder questions, let's go to the hard problem of consciousness. Yeah.

[01:06:02]

Is this an important problem for us to think about and solve on the engineering of intelligence side of your work, of our dream?

[01:06:13]

You know, it's unclear. So, you know, this is a deep problem, partly because it's very difficult to define consciousness. And there is a debate among neuroscientists, and philosophers of course, about whether consciousness is something that requires flesh and blood, so to speak, or whether, you know, we could have silicon devices that are conscious, or up to statements like everything has some degree of consciousness and some more than others.

[01:07:05]

Yeah, this is like Giulio Tononi. And we just recently talked to Christof Koch.

[01:07:14]

Yeah, Christof was my first graduate student.

[01:07:17]

Do you think it's important to illuminate aspects of consciousness in order to engineer intelligent systems? Do you think an intelligent system would ultimately have consciousness? Are they interlinked? You know, most of the people working in artificial intelligence, I think, would answer that we don't strictly need consciousness to have an intelligent system. That's sort of the easier question, because it's a very engineering answer to the question. Yes. Pass the Turing test, we don't need consciousness.

[01:07:55]

But if you were to go, do you think it's possible that we need to have that kind of self-awareness, if we may?

[01:08:05]

Yes. So, for instance, I personally think that when you test a machine or a person in a Turing test, in an extended Turing test, I think consciousness is part of what we require in that test, you know, implicitly, to say that this is intelligent. Christof disagrees. So, yes, he does.

[01:08:35]

Yeah. And despite many other romantic notions he holds, he disagrees with that one.

[01:08:41]

Yes, that's right. So, you know, we'll see. Do you think, as a quick question, Ernest Becker's fear of death, do you think mortality and those kinds of things are important for consciousness and for intelligence? The finiteness of life, finiteness of existence, or is that just a side effect of evolution, a side effect that's useful for natural selection?

[01:09:18]

Do you think this kind of thing that this interview is going to run out of time soon? Our life will run out of time soon. Do you think that's needed to make this conversation good and and life good?

[01:09:29]

You know, I never thought about it. It is a very interesting question. I think Steve Jobs, in his commencement speech at Stanford, argued that, you know, having a finite life was important for stimulating achievements.

[01:09:47]

And a different one: you live every day like it's your last, right? Yeah. Yeah. So rationally, I don't think you strictly need mortality for consciousness, but who knows? They seem to go together in our biological systems. Yeah, yeah. You've mentioned before, and your students are associated with, AlphaGo and Mobileye, the big recent success stories in AI, and I think it's captivated the entire world of what AI can do.

[01:10:23]

So what do you think will be the next breakthrough, and what's your intuition about the next breakthrough? Of course, I don't know where the next breakthrough is. I think that there is a good chance, as I said before, that the next breakthrough would also be inspired by, you know, neuroscience. But which one, I don't know. So MIT has this Quest for Intelligence now, and there's a few moonshots. In that spirit, which ones are you excited about?

[01:10:58]

Which projects, kind of?

[01:11:01]

Well, of course I'm excited about one of the moonshots, which is our Center for Brains, Minds and Machines, which is the one which is fully funded by NSF.

[01:11:15]

And it is about visual intelligence. That one is particularly about understanding visual intelligence, the visual cortex, and visual intelligence in the sense of how we look around ourselves and understand the world around ourselves. You know, meaning what is going on, how we could go from here to there without hitting obstacles, you know, whether there are other agents, people in the environment. These are all things that we perceive very quickly. And it's something actually quite close to being conscious.

[01:12:04]

Not quite.

[01:12:04]

But, you know, there is this interesting experiment that was run at Google X, which in a sense is just a virtual reality experiment, but in which the subject is sitting in a chair with goggles, like Oculus and so on,

[01:12:26]

earphones, and they were seeing through the eyes of a robot nearby: two cameras, microphones. So their sensory system was there. And the impression of all the subjects, very strong, they could not shake it off, was that they were where the robot was. They could look at themselves from the robot and still feel they were where the robot is. They were looking at their body. Their self had moved. So some aspect of scene understanding has to have the ability to place yourself, have a self-awareness about your position in the world and what the world is like.

[01:13:16]

So, yeah, so we may have to solve the hard problem of consciousness on the way.

[01:13:22]

Yes, it's quite, quite a.

[01:13:24]

So you've been an adviser to some incredible minds, including Demis Hassabis, Christof Koch, Amnon Shashua, like you said. All went on to become seminal figures in their respective fields. From your own success as a researcher and from your perspective as a mentor of these researchers, having guided them

[01:13:49]

in the way of advice, what does it take to be successful in science and engineering careers? Whether you're talking to somebody in their teens, 20s or 30s, what does that path look like? It's curiosity and having fun. And I think it is important also to have fun with other curious minds. It's the people you surround yourself with, too. So yeah, fun and curiosity. You mentioned Steve Jobs.

[01:14:27]

Is there also an underlying ambition that's unique that you saw, or does it really boil down to insatiable curiosity and fun?

[01:14:35]

Well, of course, you know, it's being curious in an active and ambitious way. Yes, definitely. But I think sometimes in science, there are friends of mine who are like this, you know, there are some scientists who like to work by themselves and kind of communicate only when they have completed their work or discovered something.

[01:15:10]

Um, I think I always found that the actual process of, you know, discovering something is more fun if it's together with other intelligent and curious and fun people.

[01:15:26]

So if you see the fun in that process, the side effect of that process will be the elation of discovering something. Yeah, yes.

[01:15:33]

So as you've led many incredible efforts here, what's the secret to being a good adviser, mentor, leader in a research setting? Is it a similar spirit, or

[01:15:47]

yeah, what advice can you give to people, young faculty and so on?

[01:15:52]

It's partly repeating what I said about an environment that should be friendly and fun and ambitious. And, you know, I think I learned a lot from some of my advisors and friends, and some of them were physicists.

[01:16:12]

And there was, for instance, this behavior that was encouraged: when somebody comes with a new idea in the group, unless it's really stupid, you are always enthusiastic. And you are enthusiastic for a few minutes, for a few hours. Then you start, you know, asking critically a few questions, testing this.

[01:16:40]

But, you know, this is a process that I think is very, very good. You have to be enthusiastic.

[01:16:47]

Sometimes people are very critical from the beginning. That's not...

[01:16:53]

Yes, you have to give it a chance. Yes. That seed, to grow. That said, with some of your ideas, which are quite revolutionary, as I've witnessed, especially on the human vision side and neuroscience side, there could be some pretty heated arguments. Do you enjoy these? Is that a part of science and your academic pursuits that you enjoy? Yeah. Is that something that happens in your group as well?

[01:17:18]

Yeah, absolutely. I also spent some time in Germany. Again, there is this tradition in which people are more forthright, less kind than here. So, you know, in the U.S., when you write a bad letter, you still say this guy is nice, you know? Yes, yes. So here in America, it's degrees of nice.

[01:17:45]

Yes. So it's all just degrees of nice. Yeah, right. Right.

[01:17:48]

So as long as this does not become personal, and it's really like, you know, a football game with its rules, that's great and fun.

[01:18:03]

So if you somehow find yourself in a position to ask one question of an oracle, like a genie, maybe a god, and you're guaranteed to get a clear answer, what kind of question would you ask?

[01:18:18]

What would be the question you would ask in the spirit of our discussion?

[01:18:23]

It could be: how could I become 10 times more intelligent?

[01:18:28]

But you only get a clear, short answer. So do you think there's a clear short answer to that? No. And that's the answer you'll get. OK, so you've mentioned Flowers for Algernon.

[01:18:44]

Oh, yeah. This is a story that inspired you in your childhood, this story of a mouse and a human achieving genius-level intelligence and then, understanding what was happening, slowly becoming not intelligent again, this tragedy of gaining intelligence and losing intelligence.

[01:19:05]

Do you think in that spirit, in that story? Do you think intelligence is a gift or curse from the perspective of happiness and meaning of life? You try to create an intelligence system that understands the universe, but on an individual level, the meaning of life. Do you think intelligence is a gift?

[01:19:27]

It's a good question. I don't know. As one of the people considered among the smartest people in the world, in some dimension at the very least,

[01:19:48]

what do you think? I don't know. It may be invariant to intelligence, the degree of happiness. It would be nice if it were. That's the hope. Yeah, you could be smart and happy, and clueless and happy. Yeah. As always, the discussion of the meaning of life is probably a good place to end. Tomasso, thank you so much for talking today. Thank you.

[01:20:16]

This was great.