Transcript of Vladimir Vapnik: ...

[00:00:00]

The following is a conversation with Vladimir Vapnik. He's the co-inventor of support vector machines, support vector clustering, VC theory, and many foundational ideas in statistical learning. He was born in the Soviet Union and worked at the Institute of Control Sciences in Moscow. Then, in the United States, he worked at AT&T Labs and Facebook Research, and he is now a professor at Columbia University. His work has been cited over 170,000 times. He has some very interesting ideas about artificial intelligence and the nature of learning, especially on the limits of our current approaches and the open problems in the field.

[00:00:40]

This conversation is part of an MIT course on artificial general intelligence and the Artificial Intelligence podcast. If you enjoy it, please subscribe on YouTube, rate it on iTunes or your podcast provider of choice, or simply connect with me on Twitter or other social networks at Lex Fridman, spelled F-R-I-D. And now, here's my conversation with Vladimir Vapnik. Einstein famously said that God doesn't play dice. You have studied the world through the eyes of statistics. So let me ask you, in terms of the nature of reality, the fundamental nature of reality: does God play dice?

[00:01:36]

We don't know some factors, and because we don't know some factors which could be important, it looks like God plays dice. But we should describe it. In philosophy, they distinguish between two positions: the position of instrumentalism, where you're creating a theory for prediction, and the position of realism, where you're trying to understand what God did. Can you describe instrumentalism and realism a little bit?

[00:02:13]

For example, if you have some mechanical laws, what is that? Is it a law which is true always and everywhere, or is it a law which allows you to predict the position of a moving element? What do you believe? Do you believe that it is God's law, that God created a world which obeys this physical law, or is it just a law for predictions? And which one is instrumentalism? For predictions.

[00:02:54]

If you believe that this is a law of God, and it is always true everywhere, that means that you're a realist. So you're trying to really understand God's thought. So the way you see the world is as an instrumentalist?

[00:03:15]

You know, I am working on some models, models of machine learning. In this model, we consider a setting, and we try to resolve the setting, to solve the problem. And you can do it in two different ways. One is from the point of view of instrumentalists, and that's what everybody does now, because they say that the goal of machine learning is to find the rule for classification. That is true, but it is an instrument for prediction. But I can say the goal of machine learning is to learn about conditional probability.

[00:04:05]

So how God plays dice, and whether he plays: what is the probability for one, what is the probability for another, given the situation? But for prediction, I don't need this. I need the rule. But for understanding, I need conditional probability.
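The distinction drawn above, between learning a rule and learning a conditional probability, can be illustrated with a minimal sketch. The data, the situation names, and the 0.5 threshold below are all hypothetical; they are not from the conversation.

```python
from collections import Counter

# Hypothetical toy data: (situation, outcome) pairs, outcome in {0, 1}.
observations = [
    ("cloudy", 1), ("cloudy", 1), ("cloudy", 0), ("cloudy", 1),
    ("clear", 0), ("clear", 0), ("clear", 1), ("clear", 0), ("clear", 0),
]

def conditional_probability(data):
    """Estimate P(outcome = 1 | situation) by relative frequency."""
    totals, positives = Counter(), Counter()
    for situation, outcome in data:
        totals[situation] += 1
        positives[situation] += outcome
    return {s: positives[s] / totals[s] for s in totals}

def decision_rule(probabilities, threshold=0.5):
    """Collapse each conditional probability into a 0/1 prediction."""
    return {s: int(p > threshold) for s, p in probabilities.items()}

probs = conditional_probability(observations)
rule = decision_rule(probs)
print(probs)  # {'cloudy': 0.75, 'clear': 0.2}
print(rule)   # {'cloudy': 1, 'clear': 0}
```

The rule is enough to predict, but it discards how confident the estimate was; the conditional probabilities keep that information.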

[00:04:23]

So let me just step back a little bit first to talk about something you mentioned, which I read last night: parts of the 1960 paper by Eugene Wigner, "The Unreasonable Effectiveness of Mathematics in the Natural Sciences."

[00:04:39]

It's such a beautiful paper, by the way. It made me feel,

[00:04:46]

to be honest, to confess, that in my own work in the past few years on deep learning, heavily applied, I was missing out on some of the beauty of nature in the way that math can uncover.

[00:05:00]

So let me just step away from the poetry of that for a second. How do you see the role of math in your life?

[00:05:08]

Is it a tool, or is it poetry? Where does it sit? And does math, for you, have limits on what it can describe? Some people say that math is the language which God uses.

[00:05:23]

Speak to God, or used by God? Used by God. Yeah. So I believe that this article about the unreasonable effectiveness of math is saying that if you look at mathematical structures, they know something about reality. And most scientists from natural science are looking at equations and trying to understand reality. It is the same in machine learning: if you try to look very carefully at all the equations which define conditional probability, you can understand something about reality, more than from your fantasy. So math can reveal the simple underlying principles of reality.

[00:06:28]

Perhaps. You know what simple means? It is very hard to discover them. But then, when you discover them and look at them, you see how beautiful they are. And it is surprising that people did not see that before. You are looking at equations and deriving it from equations. For example, I talked yesterday about the least squares method, and people had a lot of fantasies about how to improve it. But if you go step by step, solving some equations, you suddenly get some term which, after thinking, you understand describes the position of the observation points. In the least squares method, we throw out a lot of information.

[00:07:23]

We don't look at the composition of the points of observation. We look only at residuals. But when you understand that, it's a very simple idea, though it's not too simple to understand. And you can derive this just from equations.

[00:07:40]

So some simple algebra, a few steps, will take you to something surprising when you think about it.

[00:07:49]

And that is proof that human intuition is not too rich and is very primitive, and it does not see very simple situations. So let me take a step back. In general, yes, right.

[00:08:09]

But what about human intuition? Ingenuity.

[00:08:16]

Moments of brilliance. Do you have to be so hard on human intuition? Are there moments of brilliance in human intuition that can leap ahead of math, and then the math will catch up? I don't think so. I think that the best human intuition is putting in axioms, and then it is technical work. See where the axioms take you. Yeah, but only if they take the axioms correctly. But the axioms are polished over generations of scientists, and this is integral wisdom. That's beautifully put.

[00:09:02]

But maybe look at it this way: when you think of Einstein and special relativity, what is the role of imagination coming first there, in the moment of discovery of an idea?

[00:09:19]

There's obviously a mix of math and out-of-the-box imagination there. That I don't know. Whatever I did, I excluded any imagination. Because whatever I saw in machine learning that comes from imagination, like features, like deep learning, is not relevant to the problem. When you look very carefully at the mathematical equations, you derive a very simple theory, which goes theoretically far beyond whatever people can imagine, because it is not a good fantasy. Yeah, it is just interpretation.

[00:10:01]

It is just fantasy. But it is not what you need. You don't need any imagination to derive the main principle of machine learning.

[00:10:15]

When you think about learning and intelligence, maybe thinking about the human brain and trying to describe mathematically the process of learning, that is something like what happens in the human brain.

[00:10:28]

Do you think we have the tools currently? Do you think we will ever have the tools to try to describe that process of learning? It is not a description of what's going on. It is interpretation. It is your interpretation. Your vision can be wrong. You know, when one guy, Leeuwenhoek, invented the microscope, at first only he had this instrument, and nobody else.

[00:10:56]

He kept the microscope secret, but he wrote a report to the London Academy of Science. In his report, he looked at everything: at the water, at the blood, at the sperm.

[00:11:11]

But he described blood as a fight between a queen and a king. He saw blood cells, red cells, and he imagined that they were armies fighting each other.

[00:11:27]

And it was his interpretation of the situation. And he sent this report to the Academy of Science. They looked very carefully, because they believed that he was right. He saw something, yes, but he gave the wrong interpretation. And I believe the same can happen with the brain. You know, I believe in human language; in some proverbs there is so much wisdom. For example, people say that better than a thousand days of diligent study

[00:12:06]

is one day with a great teacher. But if you ask what a teacher does, nobody knows. And that is intelligence. But we know from history, and now from the math of machine learning, that a teacher can do a lot. So what, from a mathematical point of view, is a great teacher?

[00:12:31]

I don't know. That's an open question. But we can say what a teacher can do. He can introduce some invariants, some predicates for creating invariants. How is he doing it? I don't know, because the teacher knows reality and can describe from this reality a predicate, invariants. But we know that when you're using invariants, you can decrease the number of observations a hundred times.

[00:13:03]

So maybe let's try to pull that apart a little bit.

[00:13:08]

I think you mentioned like a piano teacher saying to the student, play like a butterfly.

[00:13:14]

I played piano, and I've played guitar for a long time.

[00:13:18]

Yeah, maybe it's romantic, poetic, but it feels like there's a lot of truth in that statement, like there is a lot of instruction in that statement.

[00:13:30]

And so, can you pull that apart? What is that?

[00:13:34]

The language itself may not contain this information. It's not blah, blah, blah, because it affects you. It's what? It affects you and affects your playing. Yes, it does.

[00:13:45]

But it's not the language. It feels like, what is the information being exchanged there?

[00:13:53]

What is the nature of information, what is the representation of that information?

[00:13:56]

I believe that it is a sort of predicate, but I don't know. That is exactly what intelligence and machine learning should be. Yes, because the rest is just mathematical technique. I think that what was discovered recently is that there are two types of learning mechanism: one called the strong convergence mechanism, and the other the weak convergence mechanism. Before, people used only one convergence mechanism. In the weak convergence mechanism, you can use predicates. That's what "play like a butterfly" is, and it immediately affects your playing. You know, there is an English proverb.

[00:14:41]

Great. If it looks like a duck, swims like a duck, and quacks like a duck, then it is probably a duck. Yes, but this is exactly about predicates. "Looks like a duck": what does it mean? You saw many ducks; that's your training data. So you have a description of how, integrally, ducks look.

[00:15:11]

You have the visual characteristics of a duck. Yeah. Yeah.

[00:15:14]

But you also have a model for recognition. So you would like the theoretical description from the model to coincide with the empirical description, which you saw. So "looks like a duck" is general. But what about "swims like a duck"? You should know that a duck swims. You can say "plays chess like a duck." A duck doesn't play chess, and it is a completely legal predicate, but it is useless. So how can a teacher recognize a predicate that is not useless? Up to now, we don't use such predicates in existing machine learning, and that's why we need zillions of data. But in this English proverb, they use only three predicates.

[00:16:10]

Looks like a duck, swims like a duck, and quacks like a duck.

[00:16:14]

So you can't deny the fact that "swims like a duck" and "quacks like a duck" has humor in it, has ambiguity. Let's talk about "swims like a duck." It does not say "jumps like a duck." Why?

[00:16:32]

Because it's not relevant. But that means that you know ducks, you know different birds, you know animals. And you derive from this that it is relevant to say "swims like a duck."

[00:16:47]

So underneath, in order for us to understand "swims like a duck," it feels like we need to know millions of other little pieces of information that we pick up along the way.

[00:16:59]

You don't think so? There doesn't need to be

[00:17:01]

this knowledge base? Those statements carry some rich information that helps us understand the essence of a duck.

[00:17:10]

Yeah. How far are we from integrating predicates? You know, when you consider the complete theory of machine learning, what it does is this: you have a lot of functions. And then you say, "it looks like a duck." You see your training data. From the training data, you recognize how

[00:17:41]

the expected duck should look. Then you remove all functions which do not look like you think it should look from the training data. So you decrease the number of functions from which you pick one. Then you give a second predicate, and again it decreases the set of functions. And after that, you pick the best function you can find. It is standard machine learning. So that is why you do not need too many examples. Because your predicates are very good? Yes, that is what it means.

[00:18:23]

Yeah, because every predicate is invented to decrease the admissible set of functions. So you talk about the admissible set of functions, and you talk about good functions.
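The procedure described above (apply one predicate, shrink the set of functions, apply a second, then pick the best of what remains) can be sketched as follows. This is a toy illustration under stated assumptions, not the method from the talk: the threshold rules, the training data, the two stand-in predicates, and the tolerances are all hypothetical.

```python
# Hypothetical candidate set: threshold rules f_t(x) = 1 if x >= t.
thresholds = [i / 10 for i in range(10)]
candidates = {t: (lambda x, t=t: int(x >= t)) for t in thresholds}

# Hypothetical training data: (feature value, label).
train = [(0.15, 0), (0.35, 0), (0.55, 1), (0.75, 1), (0.95, 1)]

def satisfies(f, predicate, data, tol):
    """Keep f only if its predicate-weighted average matches the labels'."""
    n = len(data)
    model_side = sum(predicate(x) * f(x) for x, _ in data) / n
    data_side = sum(predicate(x) * y for x, y in data) / n
    return abs(model_side - data_side) <= tol

def training_errors(f, data):
    """Count the training examples that f labels incorrectly."""
    return sum(f(x) != y for x, y in data)

# Two stand-in predicates, each with an illustrative tolerance.
predicates = [(lambda x: 1.0, 0.25), (lambda x: x, 0.1)]

admissible = dict(candidates)
sizes = [len(admissible)]
for predicate, tol in predicates:
    admissible = {t: f for t, f in admissible.items()
                  if satisfies(f, predicate, train, tol)}
    sizes.append(len(admissible))

# Standard step: pick the best remaining function on the training data.
best_t = min(admissible, key=lambda t: training_errors(admissible[t], train))
print(sizes)   # [10, 6, 4]
print(best_t)  # 0.4
```

Each predicate removes candidates before any function is selected, so the final choice is made from a much smaller set.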

[00:18:37]

So what makes a good function? The admissible set of functions is a set of functions which has

[00:18:45]

small capacity, or small diversity, small VC dimension, and which contains a good function. And so, by the way, for people who don't know VC, you're the V in VC.

[00:18:59]

So how would you describe to a layperson what VC theory is?

[00:19:05]

How would I describe it? When you have a machine, the machine is capable of picking one function from the admissible set of functions. But the set of admissible functions can be big. Say it contains all continuous functions; then it is useless. You don't have enough examples to pick a function. But it can be small. We call it capacity, but maybe it is better to call it diversity: the functions in the set are not very different. It is an infinite set of functions, but not very diverse. So it has small VC dimension. When the VC dimension is small, you need a small amount of training data.
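The notion of "diversity" above can be made concrete with the standard definition of shattering: a set of functions shatters a set of points if it can produce every possible 0/1 labeling of them, and the VC dimension is the size of the largest set it can shatter. A minimal sketch follows; the two function classes and the finite grid of parameters are hypothetical choices made for illustration.

```python
from itertools import product

def shatters(hypotheses, points):
    """True if the hypotheses realize every 0/1 labeling of the points."""
    realized = {tuple(h(x) for x in points) for h in hypotheses}
    return len(realized) == 2 ** len(points)

# Finite parameter grid standing in for the real line (an assumption).
grid = [i / 10 for i in range(-10, 21)]

# Hypothetical class 1: thresholds, h_t(x) = 1 if x >= t.
thresholds_class = [lambda x, t=t: int(x >= t) for t in grid]

# Hypothetical class 2: intervals, h_{a,b}(x) = 1 if a <= x <= b.
intervals_class = [lambda x, a=a, b=b: int(a <= x <= b)
                   for a, b in product(grid, grid)]

print(shatters(thresholds_class, [0.5]))           # True
print(shatters(thresholds_class, [0.3, 0.7]))      # False
print(shatters(intervals_class, [0.3, 0.7]))       # True
print(shatters(intervals_class, [0.2, 0.5, 0.8]))  # False
```

On these examples, thresholds shatter one point but not two, and intervals shatter two points but not three, which matches VC dimensions of 1 and 2. Both classes are infinite over the real line, yet neither is very diverse.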

[00:19:56]

So the goal is to create an admissible set of functions which has small VC dimension and contains a good function. Then you will be able to pick the function using a small number of observations. So that is the task of learning? It is creating a set of admissible functions

[00:20:26]

that has a small VC dimension, and then you figure out a clever way of picking the function from it? That is the goal of learning which I formulated yesterday. Statistical learning theory is not involved in creating the admissible set of functions. In classical learning theory, everywhere, in 100 percent of textbooks, the admissible set of functions is given. But this is a science about nothing, because the most difficult problem is to create the admissible set of functions: given, say, a lot of functions, a continuum set of functions, create an admissible set of functions.

[00:21:10]

That means that it has finite VC dimension, small VC dimension, and contains a good function. So this was out of consideration. So what's the process of doing that?

[00:21:22]

I mean, it's fascinating. What is the process of creating this admissible set of functions? That is invariants. That's invariants.

[00:21:30]

Can you describe invariants? Yes. You look at properties of the training data. And properties means that you have some function, and you just count: what is the average value of the function on the training data? You have a model, and what is the expectation of this function on the model? And they should coincide. So the problem is how to pick the functions. It can be any function; in fact, it is true for all functions. But when we say a duck does not jump, you don't ask the question "jumps like a duck," because it is trivial: it does not jump, and that doesn't help you to recognize a duck.
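One way to write down the invariant described above, as a sketch. The notation is an assumption and is not taken from the conversation: let (x_i, y_i), i = 1, ..., ℓ, be the training pairs with labels y_i in {0, 1}, let f be the model's estimate of the conditional probability, and let ψ be a predicate. The average of ψ weighted by the model should then coincide with the average of ψ weighted by the observed labels:

```latex
\frac{1}{\ell}\sum_{i=1}^{\ell} \psi(x_i)\, f(x_i)
\;\approx\;
\frac{1}{\ell}\sum_{i=1}^{\ell} \psi(x_i)\, y_i
```

Each predicate ψ contributes one such equality, and every function f that violates it is removed from the admissible set.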

[00:22:31]

But you know something about which questions to ask. You ask "swims like a duck." But "looks like a duck" is the general situation. "Looks like," say, a guy who has this illness, this disease: it is legal. So there is a general type of predicate, "looks like," and a special type of predicate which is related to the specific problem. And that is the intelligence part of all this business, and that is where the teacher is involved. Incorporating the specialized predicates.

[00:23:15]

OK, what do you think about deep learning, neural networks, these arbitrary architectures, as helping accomplish some of the tasks you're thinking about? Their effectiveness or lack thereof: what are the weaknesses, and what are the possible strengths?

[00:23:35]

You know, I think that this is fantasy, everything which is like deep learning, like features. Let me give you this example.

[00:23:49]

One of the greatest books is Churchill's book about the history of the Second World War. He starts this book by describing that in old times, when a war was over, the great kings gathered together (almost all of them were relatives), and they discussed what should be done, how to create peace. And they came to an agreement. And when the First World War happened, the general public came into power, and they were so greedy that they robbed Germany, and it was clear to everybody that it was not peace, that the peace would last only 20 years, because they were not professionals. It is the same that I see in machine learning. There are mathematicians who are looking at the problem

[00:24:50]

from a very deep point of view, a mathematical point of view, and there are computer scientists, who mostly do not know mathematics. They just have an interpretation of that. And they invented a lot of blah, blah, blah interpretations, like deep learning. Why do you need deep learning? Mathematics does not know deep learning. Mathematics does not know neurons. It is just a function. If you like to say piecewise linear function, say that, and do it in the class of piecewise linear functions. But they invent something, and then they try to prove the advantage of that through interpretations, which are mostly wrong.

[00:25:37]

And then they appeal to the brain, which they know nothing about. Nobody knows what is going on in the brain.

[00:25:45]

So I think it is more reliable to work on the math. This is a mathematical problem; do your best to solve this problem. Try to understand that there is not only one way of convergence, which is the strong way of convergence.

[00:26:01]

There is a weak way of convergence, which requires predicates.

[00:26:04]

And if you go through all this stuff, you will see that you don't need deep learning. Even more, I would say: one of the theorems, which is called the representer theorem, says that the optimal solution of the mathematical problem which describes learning is on a shallow network, not on deep learning. And a shallow network, yeah. The ultimate problem is there. Absolutely. So in the end, what you're saying is exactly right.

[00:26:44]

The question is: do you see no value in throwing something on the table and playing with it, not math?

[00:26:53]

It's like a neural network, where you said throwing something in the bucket, or the biological example of looking at kings and queens, or the cells with a microscope. You don't see value in imagining the cells as kings and queens and using that as inspiration and imagination for where the math will eventually lead you? You think that interpretation basically deceives you in a way that's not productive?

[00:27:21]

I think that if you try to analyze this business of learning, and especially the discussion about deep learning, it is a discussion about interpretation, not about things; about what you can say about things. That's right. But aren't you surprised by the beauty of it?

[00:27:43]

So not mathematical beauty, but the fact that it works at all?

[00:27:50]

Or are you criticizing that very beauty, our human desire to interpret, to find our silly interpretations in these constructs? Let me ask you this.

[00:28:06]

Are you surprised, and does it inspire you?

[00:28:12]

How do you feel about the success of a system like AlphaGo at beating the game of Go, using neural networks to estimate the quality of a board and the quality of the...

[00:28:26]

That is your interpretation, "quality of the board." Yeah, yes. Yeah.

[00:28:32]

But it's not interpretation. The fact is that a neural network system, it doesn't matter, a learning system that I think we don't mathematically understand that well, beats the best human player, does something that was thought impossible.

[00:28:47]

That means it's not a very difficult problem.

[00:28:50]

So we've empirically discovered that this is not a very difficult problem.

[00:28:55]

Yeah, yeah, it's true. So maybe, I can't argue.

[00:29:02]

Even more, I would say that if they use deep learning, it is not the most effective way of learning. And usually when people use deep learning, they're using zillions of training data. Yes, but you don't need this. So I described a challenge: can we do some problems which deep learning methods, these deep nets, do well, using a hundred times less training data? Even more, there are some problems deep learning cannot solve, because it does not necessarily create an admissible set of functions. To create a deep architecture means to create an admissible set of functions, but you cannot say that you are creating a good admissible set of functions.

[00:30:05]

It's just your fantasy. It does not come from math. But it is possible to create an admissible set of functions, because you have your training data. Actually, for mathematicians: when you consider invariants, you need to use the law of large numbers. When you're doing training in existing algorithms, you need the uniform law of large numbers, which is much more difficult; it requires VC dimension and all that stuff.
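The difference between the two laws mentioned above can be stated in standard textbook form. The notation is an assumption: z_1, ..., z_ℓ are independent observations, f is a function, and F is a set of functions. The ordinary law concerns one fixed function; the uniform law demands convergence simultaneously for every function in the set, which is why it needs a capacity condition such as finite VC dimension.

```latex
% Law of large numbers: convergence for one fixed function f
\lim_{\ell\to\infty} P\left\{ \left| \frac{1}{\ell}\sum_{i=1}^{\ell} f(z_i) - \mathbb{E}\,f(z) \right| > \varepsilon \right\} = 0

% Uniform law of large numbers: convergence over the whole set F at once
\lim_{\ell\to\infty} P\left\{ \sup_{f\in F} \left| \frac{1}{\ell}\sum_{i=1}^{\ell} f(z_i) - \mathbb{E}\,f(z) \right| > \varepsilon \right\} = 0
```

Checking an invariant for a single predicate only requires the first statement, while selecting a function from a large set by minimizing training error requires the second.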

[00:30:40]

But nevertheless, if you use both the weak and the strong way of convergence, you can decrease the amount of training data a lot.

[00:30:50]

You could do the three: looks like a duck, swims like a duck, and quacks like a duck. So let's step back and

[00:31:00]

think about human intelligence in general. Clearly, that has evolved in a non-mathematical way. As far as we know, God or whoever didn't come up with a model of admissible functions and place it in our brain. It kind of evolved. I don't know, maybe you have a view on this. Alan Turing, in the 50s, in his paper asked and rejected the question, "Can machines think?" It's not a very useful question, but can you briefly entertain this useless question?

[00:31:38]

Can machines think? So talk about intelligence and your view of it.

[00:31:43]

I don't know that. I know that Turing described imitation: if a computer can imitate a human being, let's call it intelligent. And he understood that it is not a thinking computer. Yes, he completely understood what he was doing, but he set up the problem of imitation. So now we understand that the problem is not in imitation. I'm not sure that intelligence is just inside of us; it may also be outside of us. I have several observations. When I prove some theorem, a very difficult theorem,

[00:32:28]

then within a couple of years, in several places, people prove the same theorem. Say, Sauer's lemma: after ours was done, other guys proved the same theorem. In the history of science, it has happened all the time. For example, geometry: it happened simultaneously. First Lobachevsky did it, and then Gauss and Bolyai and other guys, and it was approximately within a ten-year period of time. Mm hmm.

[00:33:03]

And I saw a lot of examples like that. And many mathematicians think that when they develop something, they develop something in general which affects everybody. So maybe our model, that intelligence is only inside of us, is incorrect.

[00:33:22]

It's our interpretation. Maybe there exists some connection with world intelligence. I don't know.

[00:33:31]

You're almost like plugging into one. Yeah, exactly. And contributing to this network, into a big, maybe, neural network model.

[00:33:43]

On the flip side of that, maybe you can comment on big O complexity and how you see classifying algorithms by worst-case running time in relation to their input.

[00:33:57]

So that way of thinking about functions.

[00:34:00]

Do you think P equals NP? Do you think that's an interesting question? It is an interesting question, but let me talk about complexity and about the worst-case scenario. There is a mathematical setting. When I came to the United States in 1990, people did not know this theory; they did not know statistical learning theory.

[00:34:27]

In Russia, two monographs were published, our monographs. But in America, they did not know them. Then they learned.

[00:34:37]

And somebody told me that it is a worst-case theory, and that they would create a real-case theory.

[00:34:42]

But until now, they have not. Because it is a mathematical tool: you can do only what you can do using mathematics, which has a clear understanding and a clear description. And for this reason, we introduced complexity, and we need this. Because using, actually, it is diversity, I like this word more, the VC dimension, you can prove some theorems. But we also created a theory for the case when you know the probability measure, and that is the best case which can happen. It is entropy theory.

[00:35:32]

So from a mathematical point of view, you know the best possible case and the worst possible case. You can derive different models in the middle,

[00:35:42]

but it's not so interesting. You think that the edges are interesting? The edges are interesting, because it is not so easy to get a good bound, an exact bound. There are not many cases where the bound is exact. But there are interesting principles which the math discovers.

[00:36:09]

Do you think it's interesting because it's challenging and reveals interesting principles that allow you to get those bounds? Or do you think it's interesting because it's actually very useful for understanding the essence of a function, of an algorithm?

[00:36:25]

So it's like me judging your life as a human being by the worst thing you did and the best thing you did, versus all the stuff in the middle. It seems not productive.

[00:36:40]

I don't think so, because you cannot describe the situation in the middle, or it will not be general. So you can describe the edge cases, and it is clear that they have some model, but you cannot describe a model for every new case.

[00:37:02]

So you will never be accurate when you're using a model. But from a statistical point of view, the way you've studied functions and the nature of learning in the world, don't you think that the real world has a very long tail, that the edge cases are very far away from the mean, the stuff in the middle? Or no? I don't know that. I think that, from my point of view, if you use formal statistics, you need the uniform law of large numbers.

[00:37:55]

If you use this invariance business, you need just the law of large numbers. And there's a huge difference between the uniform law of large numbers and the law of large numbers.

[00:38:11]

Is it useful to describe that a little more, or shall we just take it... For example, when I was talking about the duck, I gave three predicates, and that was enough. But if you try to do formal distinguishing, you will need a lot of observations. So that means that the information about "looks like a duck" contains a lot of bits of information, formal bits of information. We don't know how many bits of information such things from artificial intelligence contain, and that is the subject of analysis. Until now,

[00:39:01]

in this whole business, I don't like how people consider artificial intelligence. They consider it as some code which imitates the activity of human beings. It is not science; it is applications. If you would like to imitate, go ahead; it is very useful and a good problem. But you need to learn something more: how people can develop, say, predicates like

[00:39:39]

"swims like a duck" or "play like a butterfly" or something like that. Not what the teacher says to you, but how it came into his mind, how he chooses this image. That is the problem of intelligence. That is the problem of intelligence. And do you see that connected to the problem of learning?

[00:40:00]

Absolutely. Because you immediately give this predicate, a specific predicate, like "swims like a duck" or "quacks like a duck."

[00:40:09]

It was chosen somehow. So what is the line of work, would you say, if you were to formulate it as a set of open problems?

[00:40:21]

That will take us there, to "play like a butterfly." We'll get a system to be able to... Let's separate two stories. One is a mathematical story: if you have a predicate, you can do something. And another story is how to get the predicate. It is the intelligence problem, and people have not even started to understand intelligence. Because to understand intelligence, first of all, try to understand what teachers do. How does a teacher teach? Why is one teacher better than another one?

[00:40:59]

Yeah, so you think we really haven't even started on the journey of generating the predicates? No. We don't understand; we don't even understand that this problem exists. Because did you hear of it? No, I just know the name. I want to understand why one teacher is better than another, and how a teacher affects a student. It is not because he repeats the problem which is in the textbook. He makes some remarks. He makes some philosophy of what he's saying. Yeah, that's beautiful.

[00:41:39]

So it is a formulation of a question that is the open problem: why is one teacher better than another? Right. What does he do better? Yeah. At every level: how do they get better, what does it mean to be better, the whole...

[00:42:04]

Yeah, yeah. From whatever model I have, one teacher can give a very good predicate. One teacher can say "swims like a duck," and another can say "jumps like a duck." And "jumps like a duck" carries zero information. Yeah. So what is the most exciting problem in statistical learning you've ever worked on, or are working on now? Oh, I just finished this invariance story, and I very much hope, I believe, that it is the ultimate

[00:42:43]

learning story. At least, I can show that there is no other mechanism, only two mechanisms. But they separate the statistical part from the intelligent part, and I know nothing about the intelligent part. And if we come to know this intelligent part, it will help us a lot in teaching and in learning. Yeah. Will we know it when we see it? So, for example, in my talk, the last slide was a challenge.

[00:43:21]

So you have, say, the MNIST digit recognition problem, and deep learning claims that they did it very well, say 99.5 percent correct answers, but they used 60,000 observations. Yeah. Can you do the same using a hundred times less, but incorporating invariants? What does it mean? You know the digits one, two, three. Just looking at them, explain to me which invariants I should keep in order to use a hundred examples, or say a hundred times fewer examples, to do the same job.

[00:44:03]

Yeah, that last slide, unfortunately your talk ended quickly, but that last slide was a powerful open challenge and a formulation of the essence of this exact problem of intelligence.

[00:44:20]

Because everybody, when machine learning started and it was developed by mathematicians, immediately recognized that we use much more training data than humans need. But now, again, we have come to the same story: we have to decrease it. That is the problem of learning.

[00:44:45]

It is not like in deep learning, where they use zillions of training data. Because maybe zillions are not enough if you have good invariants; maybe you will never collect that number of observations. But now it is a question to intelligence, how to do that, because the statistical part is ready: as soon as you supply us with a predicate, we can do a good job with a small number of observations. And the very first challenge is the well-known digit recognition. You know the digits, so please tell me the invariants. I think about that.

[00:45:31]

I can say, for digit three, I would introduce the concept of horizontal symmetry. So the digit three has horizontal symmetry, say, more than, say, digit two or something like that.

[00:45:49]

But as soon as I get the idea of horizontal symmetry, I can mathematically invent a lot of measures of horizontal symmetry, or vertical symmetry, or diagonal symmetry, or whatever, if I have the idea of symmetry. But what else? Looking at digits, I see that it is a meta-predicate, which is not shape. It is something like symmetry, like how dark this whole picture is, something like that, which can itself raise a predicate. You think such a predicate could rise out of

[00:46:37]

something that's not general? Meaning, it feels like for me to be able to understand the difference between a two and a three, I would need to have had a childhood of 10 to 15 years playing with kids, going to school, being yelled at by parents. All of that: walking, jumping, looking at ducks. And only then would I be able to generate the right predicate for telling the difference between a two and a three. Or do you think there's a more efficient way?

[00:47:22]

I know for sure that you must know something more than digits.

[00:47:27]

Yes. That's a powerful statement.
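
The symmetry and darkness predicates discussed above can be made concrete with a small sketch. This is only an illustration and not the formulation used in the speaker's work: the function names, the tiny 5x3 "digit" grids, and the reading of "horizontal symmetry" as similarity between an image and its top-to-bottom mirror are all assumptions made here.

```python
# Toy predicates on a grayscale image stored as a list of rows,
# with pixel values in [0, 1]. All names here are hypothetical.

def flip_top_bottom(image):
    """Mirror the image across its horizontal axis (assumed meaning of 'horizontal symmetry')."""
    return [list(row) for row in reversed(image)]

def flip_left_right(image):
    """Mirror the image across its vertical axis."""
    return [list(reversed(row)) for row in image]

def symmetry_score(image, flip):
    """Return 1.0 for a perfectly symmetric image, lower values otherwise."""
    mirrored = flip(image)
    total, count = 0.0, 0
    for row, mirrored_row in zip(image, mirrored):
        for a, b in zip(row, mirrored_row):
            total += abs(a - b)
            count += 1
    return 1.0 - total / count

def darkness(image):
    """Average pixel intensity: 'how dark the whole picture is'."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)

# Crude 5x3 sketches of the digits three and two (illustrative only).
three_like = [[1, 1, 1], [0, 0, 1], [1, 1, 1], [0, 0, 1], [1, 1, 1]]
two_like = [[1, 1, 1], [0, 0, 1], [1, 1, 1], [1, 0, 0], [1, 1, 1]]

print(symmetry_score(three_like, flip_top_bottom))
print(symmetry_score(two_like, flip_top_bottom))
print(darkness(three_like))
```

Under these assumptions the three-like grid scores higher on top-to-bottom symmetry than the two-like grid, which matches the remark that the digit three has more horizontal symmetry than the digit two.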

[00:47:29]

Yeah, but maybe there are several languages of description of the elements of digits. So I'm talking about symmetry, about some properties of geometry. I'm talking about something abstract. I don't know yet, but this is a problem of intelligence. So in one of our articles, it is trivial to show that every example can carry not more than one bit of information in reality, because when you show an example and you say "this is a one," you can remove, say, the functions which do not tell you "one." The best strategy, if you can do it perfectly, is to remove half of the functions.

[00:48:25]

But when you use one predicate, like "looks like a duck," you can remove much more than half of the functions.
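
The counting argument here can be illustrated with a toy sketch. It is not the construction from the article the speaker mentions: the hypothesis set (all Boolean functions on three input bits) and the "predicate" (the function must give the same answer when the input bits are reversed) are assumptions chosen only so the numbers are easy to check by hand.

```python
from itertools import product
from math import log2

# All 8 possible inputs of three bits, and all 256 Boolean functions on them.
# Each candidate function is stored as a tuple of 8 output labels.
INPUTS = list(product([0, 1], repeat=3))
FUNCTIONS = list(product([0, 1], repeat=len(INPUTS)))

def label(f, x):
    """Output of candidate function f on input x."""
    return f[INPUTS.index(x)]

def consistent_with_example(f, x, y):
    """A labeled example (x, y) keeps only functions with f(x) == y."""
    return label(f, x) == y

def satisfies_symmetry(f):
    """Hypothetical predicate: f is unchanged when the input bits are reversed."""
    return all(label(f, x) == label(f, x[::-1]) for x in INPUTS)

def bits_gained(before, after):
    """Information gained, in bits, from shrinking the candidate set."""
    return log2(before / after)

after_example = [f for f in FUNCTIONS if consistent_with_example(f, (0, 0, 1), 1)]
after_predicate = [f for f in FUNCTIONS if satisfies_symmetry(f)]

print(len(after_example), bits_gained(len(FUNCTIONS), len(after_example)))      # 128 1.0
print(len(after_predicate), bits_gained(len(FUNCTIONS), len(after_predicate)))  # 64 2.0
```

In this toy setting one labeled example removes exactly half of the candidate functions (one bit), while the single symmetry constraint removes three quarters of them (two bits).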

[00:48:33]

And that means that it contains a lot of information, from the formal point of view. But when you have a general picture of what you want to recognize, and a general picture of the world, can you invent this predicate? And that predicate carries a lot of information. Beautifully put. Maybe it's just me, but in all the material in your work, which is some of the most profound mathematical work in the field of learning, AI, and just math in general, I hear a lot of poetry and philosophy.

[00:49:19]

You really kind of talk about philosophy of science. There's a poetry and music to a lot of the work you're doing and the way you're thinking about it. So,

[00:49:30]

Where does that come from? Do you escape to poetry? Do you escape to music? I think that there exists ground truth. There exists ground truth? Yeah. And that can be seen everywhere. The smart guy, the philosopher: sometimes I am surprised how deep they see.

[00:49:53]

Sometimes I see that some of them are completely out of subject. But the ground truth I see in music. Music is the ground truth? Yeah, and in poetry. Many poets believe that they take dictation. So what piece of music, as a piece of empirical evidence, gave you a sense that they are touching something in the ground truth? It is structure. The structure, the math of music.

[00:50:34]

Because when you're listening to Bach, you see this structure: very clear, very classic, very simple. And it is the same in math when you have axioms in geometry: you have the same feeling. And in poetry, sometimes you see the same.

[00:50:50]

Yeah. And if you look back at your childhood, you grew up in Russia, you maybe were born as a researcher in Russia, you developed as a researcher in Russia, and you came to the United States and worked in a few places.

[00:51:06]

If you look back, what were some of your happiest moments as a researcher? Some of the most profound moments, not in terms of their impact on society, but in terms of their impact on how damn good you felt that day, so that you remember that moment?

[00:51:30]

You know, every time you find something,

[00:51:36]

It is great. It's a life. Every simple thing. But my general feeling is that most of my time I was wrong. You should go again and again and again, and try to be honest in front of yourself: not to make an interpretation, but try to understand that it is related to ground truth. It is not my blah, blah, blah interpretation or something like that.

[00:52:07]

But you're allowed to get excited at the possibility of discovery. Oh yeah. You have to double check it.

[00:52:14]

No, but how is it related to the ground truth? Is it just temporary,

[00:52:22]

or is it forever? You know, you always have a feeling, when you find something, of how big it is.

[00:52:34]

So 20 years ago, when we discovered statistical learning theory, nobody believed it, except for one guy, Dudley, from MIT. And then in 20 years, it became fashion. And it was the same with support vector machines, the kernel machines.

[00:52:56]

So with support vector machines and learning theory, when you were working on it, you had a sense of the profundity of it, that this seems to be right?

[00:53:14]

This seems to be powerful? Right, absolutely, immediately. I recognized that it will last forever. And now, when I found this invariance story, I have a feeling that it is complete, because I have proof that there are no different mechanisms. You can have some cosmetic improvement, but in terms of invariants, you need both invariants and statistical learning, and they should work together. But also, you can formulate what intelligence is from that

[00:54:06]

and separate it from the technical part. That is completely different. Well, Vladimir, thank you so much for talking today. Thank you. It's an honor.