#82 – Dr. JOSCHA BACH – Digital Physics, DL and Consciousness [UNPLUGGED]

Welcome back to Street Talk, just a little bit of housekeeping before we kick off today. Paulina Salivadove is one of the organizers of a charity AI conference called AI Helps Ukraine. Their main goal is to raise funds for Ukraine, both from the folks attending the conference and from companies sponsoring it, and it's not too late to sponsor the conference and support it, so please do if you possibly can. All of the funds they raise will go to Ukraine Medical Support, which is a Canadian nonprofit organization specializing in humanitarian aid for Ukraine. They have some of the world's leading AI experts keynoting at this event: Yoshua Bengio, Timnit Gebru, Max Welling, Regina Barzilay, Alexei Efros, and also one of our own personal favorites here on MLST, Professor Michael Bronstein, the one and only. The conference is online pretty much from now until the 6th of December, and it's being hosted from Mila in Montreal. Their goal is to raise $100,000, and they really, really need the support of the AI community to club together and just donate anything that you can. We'll link to the conference in the video and the podcast description. Please donate if you can, and also share the links on your socials. Now, I'm at NeurIPS this week in New Orleans, so I'll be walking around with a camera. Please just bump into me, and if you want to record any spicy takes on artificial general intelligence, then let's do it. Today is a conversation with Joscha Bach, who's one of our most requested guests ever. We recorded this conversation back in April, which gives you a bit of an indication of our backlog. I can only apologise about the backlog. Keith and I recently started a new venture called XRAI, and as you can imagine we've been working around the clock coding, just trying to get that business off the ground, but when we've made our millions we'll devote all of our time to producing amazing content on MLST.
So yeah, please bear with us; loads and loads of cool content coming your way soon. I hope you enjoy the show today. Peace out.

Dr. Joscha Bach is our most requested guest ever. Joscha is a cognitive scientist focusing on cognitive architectures, models of mental representation, emotion, motivation and sociality. His two interviews on Lex Fridman's podcast have been watched over two million times so far, which is just absolutely unreal. Now, Joscha, I've watched many of your interviews and I still don't feel that I have a firm grasp on some of your views. So today, if you don't mind, I hope we can do a tour de force over some of your most important views in our shared space, to the extent that we can keep up with you, of course. For example, we'd like to discuss Gödel and computation, consciousness, digital physics, free will and determinism, and large statistical models, and indeed whether they're AGI, or a parlor trick, or something more esoteric. Now, when people talk about God or consciousness or any other complex phenomenon, it relates to everyone and it means something different to everyone. It's ineffable, and every conversation sounds like a typical post-ketamine discussion, which is to say extremely low information content. The topics we're discussing today are very complex, and often we're reaching for the best language to use to conduct the conversation, so I hope we do well today. Anyway, Dr. Joscha Bach, it's an absolute honor to finally welcome you to MLST.

Thank you very much, it's great to be on the show.

Amazing. Well, when I started doing computer science many years ago, interestingly, the theory of computation wasn't even on the curriculum, and I was wondering whether you thought it should be; I mean, presumably you think it's extremely relevant for AGI. Now, we want this to be as pedagogical as possible, so please explain everything like we're five. What does computation mean to you?
I think that computation is far simpler than most people think. It means that you have a causal structure where every transition can be decomposed into individual steps, and when we build computational models we decompose the world into states and transitions between those states. It then turns out that there are certain minimal systems that are able to execute everything, and these can be described in many ways. The most famous one is probably the Turing machine, and there are many other ways in which you can describe the same thing; for instance, you can emulate the Turing machine by doing search and replace on strings, which is how the lambda calculus is defined. All the programming languages and the lambda calculus and the Turing machine turn out to have the same power: if you can compute something with one of these paradigms, you can compute it with the others, as long as you don't run into resource constraints. So as long as it still fits into memory and you don't care about speed, they all have the same power, but in practice every system is limited. We don't run things forever; we want them to give us a result after a certain time, so what matters is what can be efficiently computed, not what is reachable at all.

Awesome. And we're going to get into this a bit, but there are some possible loopholes, at least, as to whether or not the universe is limited in the ways that the definitions of things like Turing machines are, and one I wanted to ask you about specifically is Penrose's claims. He claims that what Gödel's work in fact proves is that the human mind can understand truths that are not provable. Specifically, one can show that a given Gödel sentence is necessarily true, given the mathematical analysis, even though it can't be proven within the formal system in which it's defined, and Penrose claims that this capability to understand, to mathematically understand if you will, is in fact non-computational, at least in part. So if he's right, then our brains might be what Turing referred to as oracle machines: computers that have access to a non-computable oracle, or function, which they can then utilize in order to perform hypercomputation, essentially. So I'm curious, are you open to this possibility, and if not, what is your response to Penrose's arguments?

I suspect that Gödel has been misunderstood by a lot of philosophers. Gödel was a realist about truth; that is, he thought that truth really exists out there, that it is eternal in some sense. He had this very strong intuition, and mathematics is classically also formalized in this way. The difference between mathematics and computation, at least in the standard sense in which we normally teach mathematics at school, is that mathematics has no state. Everything in mathematics just is, eternally; it's a single state, and if you want to go through a sequence of states, you put an index into the formula, but still everything is there at the same time; the index is just a way to access this thing. This way of having mathematics stateless is very elegant, because it allows us to define functions that have infinitely many arguments. If you had a state machine that tried to consume infinitely many arguments, it would never finish before it could go to the next step. The same thing holds in the middle of the function: if it's stateless, you can just compute all the indices at once, even if there are infinitely many, as in classical mathematics; in a computational system you would have to do this one after the other, and if you did it in parallel, you would need lots of CPUs running in parallel, so you run into limits. And the same thing with the output. So in classical mathematics you can chain infinitely many steps and functions and exchange infinitely many arguments, but of course mathematicians never did this in practice; it's just a specification. This is how they like to write things down, and when they want to calculate it, they still have to sit down and do it sequentially, step by step. It's just that mathematics is defined in such a way as if you could offload this to some supernatural being, or some grad student who is going to do the infinitely many calculations. Gödel took this specification of mathematics, and he found out that when you have this stateless mathematics, you can, for instance, define self-referential statements that change their truth value depending on the statement itself, and this recurrence can lead to a contradiction in the state itself. You basically get statements which say "I am wrong", and by referring to itself, the statement changes its own truth value. So if mathematics is stateless, you now run into a conflict. In a computational system that's not a big problem: your computer is not going to crash if you write it down the right way; it just happens that your truth value fluctuates in every execution step. It's not going to converge, but then this is not real truth, because truth is something that doesn't change when you call the function again. That's what's going on here, and I think what Gödel discovered is that classical mathematics doesn't work: you cannot build, in any kind of mathematics, a machine, a hypothetical abstract machine, any kind of universe, that runs the semantics of classical mathematics without crashing.

Yeah, but it kind of seems like, okay, we're going to believe Gödel's use of mathematics to prove that mathematics is flawed. There seems to be almost an inherent contradiction in there. You either believe mathematics, and thus you believe Gödel's proof of some specific limitations on computational systems, or you believe that somehow mathematics is flawed, in which case you can't trust the proof that mathematics is flawed.

I think Gödel's conclusion was that there is something fundamentally going wrong, that there might be an incompleteness in mathematics itself, and if you believe that truth is real and exists independently of the procedure by which you calculate it, then this seems to be plausible. It was also the conclusion a lot of philosophers drew from this, who basically read Gödel's proof and concluded that mathematicians had admitted that their arcane techniques are impotent to describe reality, and that therefore philosophers who don't understand mathematics have a clear advantage. Of course, this is not the conclusion. Instead, what turns out is that if you drop the original, classical understanding of mathematics and replace it by computation, where we basically say truth is what you calculate with the following procedure, and you can define any kind of procedure that you want, you just have to make sure that it converges to some kind of value in the way that you want, then you resolve your problem. It's just that you lose the notion that truth is independent of that procedure. So in some sense, classical mathematics is a specification that cannot be computed. From the perspective of computer scientists, that happens all the time: some customer wants you to build something that cannot be built, and you have just proven that it cannot be built, but that doesn't mean that you cannot build something useful. And I think that Penrose believes that our brain is actually doing these infinite things, and it's not. When we reason about infinity, we're not actually reasoning over infinitely many steps; what we do is create a symbol, and then we do very finite computations over that symbol. But we cannot construct infinity; we cannot build it, we cannot get there from scratch; you cannot write down some clever automaton that produces an infinity for you.
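The point about a self-referential statement whose truth value fluctuates in every execution step can be made concrete in a few lines. This is purely an illustrative sketch (the function name and step count are my own): the liar sentence "I am wrong", treated as a computation, just oscillates and never converges to a truth value.

```python
# "This statement is false", run as a computation: each evaluation step
# negates the previous truth value, so the sequence never converges.
def liar_iteration(steps, initial=True):
    value, history = initial, []
    for _ in range(steps):
        value = not value          # the sentence asserts its own negation
        history.append(value)
    return history

print(liar_iteration(6))  # [False, True, False, True, False, True]
```

As Bach says, the machine doesn't crash; the "truth value" simply keeps flipping, which is exactly why it fails to be truth in the stateless, classical sense.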
I was recently browsing Penrose's book The Road to Reality, and I would say, I mean, I don't know that much about physics, but the chapters were really interesting. They're talking about surfaces and manifolds and symmetries and fibre bundles and gauges and wave functions, calculus, matrix theory, and even computation. Almost all of the discussion was on mathematical modelling at different levels of description, or emergence if you will. And in machine learning and AI, we are forever challenged by trying to get machines to model physical reality at different levels of description using an interoperable set of tools. So it seems increasingly true that we need machines that can learn descriptions and concepts at multiple levels if we're ever going to have AI capable of understanding the world and learning novel semantic models. All of the machine learning models today work by chopping up a Euclidean space into what is effectively a locality-sensitive lookup table, a very big one, and we need AIs that can go far beyond this. They've got to be able to learn novel geometries, beyond even what humans could have come up with, and to reason topologically and algebraically over those geometries, something which I think you would agree is not happening with the current deep learning systems.
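One way to picture the "locality-sensitive lookup table" claim above: a one-hidden-layer ReLU network partitions its input space into cells, one per hidden activation pattern, and is purely affine inside each cell, so it behaves like a lookup from cell to linear rule. A toy sketch, with random, untrained weights chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 2)), rng.normal(size=8)   # 8 hidden units, 2-D input
W2, b2 = rng.normal(size=(1, 8)), rng.normal(size=1)   # scalar output

def forward(x):
    h = np.maximum(W1 @ x + b1, 0.0)       # ReLU hidden layer
    pattern = tuple((h > 0).astype(int))   # which "cell" of input space x falls in
    return (W2 @ h + b2).item(), pattern   # affine output, plus the cell index

# Sample the plane and count the distinct cells actually visited: within each
# cell the network is one fixed affine function, i.e. one row of the "table".
xs = [rng.uniform(-3, 3, size=2) for _ in range(2000)]
cells = {forward(x)[1] for x in xs}
print(f"distinct linear regions hit: {len(cells)}")
```

The number of reachable cells is bounded by 2^8 here; scaling up width and depth makes the table astronomically larger, but it is still a partition of Euclidean space rather than a learned geometry in the sense Keith is asking about.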
Well, let's start with the novel geometries first. If you read Penrose's book, what you find is that his entire universe is geometric, which means it's made of continuous spaces in which things are happening. But if we actually look into the world deeply, quantum mechanics is not a geometric theory; the geometry only emerges approximately at the level of the space-time description, and it seems that geometry is really the domain of too many parts to count. All the objects that we describe as surfaces are, if you zoom in, made of discrete parts, like atoms and particles and so on, and these in turn are made of things that have a finite resolution. And if we look into our computer programs, we can create stuff that looks continuous to us, but there is nothing continuous inside our computer programs, and it turns out that the assumption of continuity requires that you partition space into infinitely many parts. So now we are again running against the thing which Gödel has shown runs into difficulties. It's not a big problem in practice, because in practice we never need to do these infinitely many things: we can produce a computer game with arbitrary fidelity; we can make something that looks like space. But the space that we think in, and so on, is an approximation that our brain has discovered. It's a set of operators that converge in the limit, but the limit doesn't exist. It just happens that when you live in a world that is made of too many parts to count, almost everywhere you look you need to find these operators that converge in the limit, and the set of operators that happens to converge in the limit is still computable: this is what we call geometry. Using these geometric approximations for macroscopic physics, like Newtonian mechanics, is completely fine; you're just going to compute it up to a certain digit, and then you're done. But it's a problem for foundational physics, because it turns out that you cannot have a language that actually computes infinities; you cannot construct such a language, and so you cannot write a universe in it. Our universe is not written in a continuous language, but the Penrose universe is. So this does mean that geometry, as a foundation, is flawed: we need it to describe the world of too many parts to count, but we do this via computational approximations, and our brain does the same.

So let me ask you this, then, because we've come across the infinities a couple of times, and I know that you place an emphasis on constructive mathematics. Of course you, and all of us, accept, let's say, the existence of potential infinities: algorithms where you can sit there and just keep calculating for as long as you want and get more digits. It's really around actual infinities that we seem to be running into problems. So let me ask this first question, really leading up to some computational questions: can the universe, can our actual universe that we're in right now, be actually infinite in spatial extent?

The thing is, you can have unboundedness, in the sense that you have a computation that doesn't stop giving you results, but you cannot take the last result of such a computation and go to the next step. You cannot have a computation that relies on knowing the last digit of pi before it goes to the next step. There you don't have an infinity: the infinities are about the conclusion of such a function; it means that you actually run this function to the end and then do something with the result. Unboundedness is different, in the sense that you will always get something that you didn't expect, that you cannot predict, but it just goes on and on without an end. And I think it's completely conceivable that our universe is in this class of systems, in the sense that it doesn't end, but that doesn't mean that there is anything that gives you the result of an infinite computation, because if that were the case, then it could not be expressed in any language. It also means that if something cannot be expressed in any language, you cannot actually properly think about it, because when you think, you need to think in some kind of language; not in English, but in some kind of language of thought, or in a mathematical language that doesn't have contradictions. And what Gödel has shown is that the language that you hope to reason in about infinities breaks, that it has contradictions in it, that at some point it blows itself apart. The languages that you can build are only those in which we have to assume that infinities cannot be built. So infinity in this sense is meaningless, because you cannot make it in any kind of language.

So the thing is, though, I'm not limiting what the universe is capable of based on human mental and linguistic limitations, or even mathematical limitations. I'm asking you if it's possible for this universe that we're in to ontically be, right now, actually infinite in spatial extent.
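The distinction drawn above between unboundedness and actual infinity can be illustrated with a digit stream for pi, the very example Bach uses: the generator below (Gibbons' unbounded spigot algorithm, a standard published construction) will emit digits for as long as you care to run it, yet every digit arrives after finitely many steps, and no step can ever consult a "last digit" or a completed infinite result.

```python
from itertools import islice

def pi_digits():
    """Gibbons' unbounded spigot: yields decimal digits of pi forever.

    Unbounded rather than infinite: each digit is produced by a finite
    amount of work, and the stream has no final result to hand over.
    """
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            yield n
            q, r, t, k, n, l = (10 * q, 10 * (r - n * t), t, k,
                                (10 * (3 * q + r)) // t - 10 * n, l)
        else:
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)

print(''.join(str(d) for d in islice(pi_digits(), 10)))  # 3141592653
```

You can take any finite prefix, but there is no way to "run it to the end" and then do something with the result; that is exactly the class of systems Bach suggests our universe might belong to.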
The thing is that you're trying to make a reference to something that you cannot observe, that you cannot conceive of other than by making a model in some kind of language, and for that model to extend the language, the language needs to work; otherwise you are just in some kind of delusion. And we can construct delusional things; we can construct languages that have bugs that we cannot see. But if you use a language that has bugs in it that we cannot see and cannot repair, then this means that the stuff we express in that language is not meaningful, and we have to use a different language, one that maybe has the same expressive power but doesn't have these bugs. Now, if you try to think about the universe in a language that allows you to imagine that the universe is literally infinite, rather than very, very, very big, much bigger than you can imagine and not ending, which for all practical purposes is almost the same thing, then your thought doesn't mean anything. You basically cannot properly express, in your own mind, without running into contradictions, the idea that the universe is infinite, in the sense that such a universe could exist.

Okay, so you're basically...

I cannot think that the universe is infinite; I cannot express this. That's my issue.

Fine, so you're basically saying that the English that I used just a minute or so ago just is, you know, not coherent, or not conceivable; there's just, you know, it's not something.
The underlying thing is, we have English, right? English is not designed to be coherent; it's designed to be disambiguated in context, it's designed to be unprincipled, to allow us to express things vaguely without breaking. But if you were to think really, really deeply and really exactly, then the question is: what kind of model is your mind building, and at which point is there just some kind of noisy mess that you're pointing at without actually decomposing it into anything that would make sense?

Okay. And so the lack of, really, the ability to conceive, or for, you know, actual infinities to ontically exist in some sense; if we just deny all that, then we're really just stuck with: alright, we've got finite everything, discrete everything; there's no such thing as a continuum, no such thing as actual infinite spatial extent, et cetera. That's really the world that you're proposing here, right? That everything is constructed, at the end of the day, from finite, discrete elements.

So imagine again that the mind is a library of functions, in a way, and these functions are doing jobs, and on the outside of the box you write down what these functions are doing. And you construct a box, and on this box it says that inside there is an infinity, for instance a continuum between two points. Then you open up the box and look at what's actually inside, and you realize it's just a lot of small steps, and it's designed in such a way that if you want more steps, it's going to give you more steps as you zoom in. It's apparently doing exactly what's written on the box, but if you look very closely, you realize: oh no, the thing that is written on the box cannot actually be in there; you can prove that it cannot be in there. It must be something else in there that is doing most of the work of what you've written down. So what you should actually be writing on the box, I think, if you are interested in how things actually work, is what it's actually doing, which means: it's going to subdivide any interval to any resolution you want, as long as you can afford it.

Okay, one mystery, if you will, for me, and I'm hoping you can help me understand this, is that all of the standard models of physics that we have today do have in them these continuous symmetries, for example rotational symmetry, and things like that. They're built off of positing continuums, with continuous waves; lots and lots of continuities and infinities, at least in the mathematical descriptions. And I think, based on what you've been saying, you would say that those are artifacts or properties of our mathematical descriptions of reality, but they're not actually extant in reality. And my mystery there is: why do those continuous, and maybe flawed and inconsistent, with infinities all over the place, descriptions work so well for describing phenomena at different levels? If everything, at the end of the day, if we just looked at high enough energy and small enough resolution, we'd see the grid, and all the discrete effects, and rotation happening in tiny, very small but not infinitesimal, degrees. Why does all this continuous, infinity-based mathematics work so well? What is the explanation for the unreasonable effectiveness of that kind of mathematics?

The easiest answer is that the world in which we live is made of extremely small parts, and we could not exist if that world was not made of that many small parts. For instance, you want to have momentum for particles that is almost continuous, so you can address space with high resolution, because the momentum is what tells you where information comes from in the universe, basically the direction from which information reaches you, and so on. If that were very coarse, then the complexity that you could build would probably be far lower. And we consist of so many parts that, if you look down, it's uncountably many for all practical purposes. So the mathematics that we need to describe the world that we are in, the world that we need to model, is mostly not in the realm of countable numbers. The countable numbers only play a role when we are looking at very few macroscopic things; as soon as you leave this domain of a few apples on a table, we almost instantly drop into the realm where we just need to switch to a continuous description of things. And this is completely fine: for most of our history, when we did physics, we never zoomed in that hard. Even now, when we really need to zoom in to the level where the Planck length matters, the resolution of the universe becomes visible, and it's of course not some Euclidean lattice, some grid that you can see; it's just that at this level you no longer have space.

And I wanted to move matters back over to some of the happenings in the world of large language models and deep learning and so on. First, a quick-fire question, and honestly, you're a bit of an enigma to me, Joscha, because obviously, and I've read some of your research, you seem like a hybrid guy to me. You know Ben Goertzel very well, for example, but you're also hugely into the hype train on connectionism; you criticized Gary Marcus's article, for example. So the first question is: are you a symbolist or a connectionist?
I’m neither the thing is that I hate deep learning is the best of us it’s just a deep learning is ugly it’s brutalist it’s a pure very simple algorithm that are blown up to the max but I cannot prove that these algorithms do not converge to what if you want them to converge to right it’s it’s maybe not elegant but it works and the solution to problems with deep learning so far has always been to use more deep learning not less so what upsets me about Gary Marcus argument is not that I’m not sympathetic to what he’s trying to push it I like to build models that are more elegant more sparse and so on but are in in the past all these elegant bars models have been left in the dust by just using more deep learning and we can also see when we zoom out a little bit that there is not an obvious limit to deep learning itself because deep learning is not just three algorithms deep learning is a programming paradigm it’s differentiable programming basically means that you express everything with approximately continuous numbers and you use algorithms that converge business certain ranges and when it doesn’t converge then you just tweak it and you introduce a different architecture which is some kind of discrete operations that you do on these continuous numbers and so on but you just patch it it best you write your programs slightly differently and you can automate the search for the program and the people who do deep learning are not also dogs in the sense that they say oh my god symbolic structures are not allowed I cannot use a pison script in here rather than just a tensile flow this is not what’s happening it’s also not that they are constrained to any kind of thing that will use whatever is working and what we see is that the entry and train systems are going more more powerful and rather than sitting there by hand and tinkering and finding a solution we can just use a system that is tinkering automatically through a dramatic larger space than we would ever be able to 
explore by trying all sorts of algorithms so when we look at curry markers articles like this deep learning is hitting a wall and so on and you look what he’s actually giving us arguments the arguments are not very good he gives us an example the netheck challenge netheck is a game which is in very large horizon because you basically have only one life you need to explore a very deep librarians need to plan pretty firehead with what you’re doing and so it’s something that is difficult to discover this right solution with a deep learning model that has no prior ideas about what it’s doing because that takes us very very long until you get the necessary feedback to learn about your actions and people are relatively good at learning this because they have so many ideas about what the situation is that they’re in there’s so many priors from our world interaction and from other games that we have played that we can bring to the tasks so the current winner of this is a symbolic solution and this symbolic solution that curry markers gives us a proof that symbolic methods are still ahead of deep learning things in the single case right not like he has a big array of tasks where there are superior there’s just two students who have written a program that is made of lots and lots of events this is just a big hack this is not some symbolic learning of reason that does something novel hybrid or whatever no this is that’s the script and deep is curry markers seriously proposing oh my god deep learning models are limited if we need to replace them with more scripts this is not a good argument yeah so I think it maybe and look I I get that there are these kind of two competing camps and they maybe go after each other with some no they don’t this is only on Twitter there is there are no competing camps it’s Yande Kuhn is not also docs in the sense that you believe you need to use this algorithm all the other arguments are impure and plot this brand is built systems that work and 
if one of his people comes up with some of the works better than what he came up with you probably praise him for that and that didn’t go on yeah sure but there’s absolutely however there is you know there is let’s say momentum and hardware lotteries and and paradigms that kind of reinforce themselves and to an extent they can strangle off you know resources that maybe we like we shouldn’t be investing all our eggs in one basket we shouldn’t be pouring you know the 99% of research funding necessarily down down deep learning and I think that’s kind of the problem that that these paradigms cause but I want to get back to something you said which is it it’s a good point it’s I think that’s an important point I think that in absolute terms the other approaches get more money than that did before it’s not that we have a funding stop as we had at some point a research funding stop when you will network the merwin mincey wrote a book where he saw the approval that the new network cannot converge over multiple layers press a prompt cannot turn x or and so on right mincey was wrong people found a way around this but at this time there was so little funding that this cutoff mattered and at the moment if you want to do something that has AI and headline the chance that you get at fund that well about paradigm you’re doing is greater than ever so the absolute amount of funds that goes into any kind of paradigm that you want to work on is greater than ever and the reason why the majority of funds goes into very few paradigms is because these are the things that work in industrial applications there is no other algorithm that is able to learn from scratch how to translate between arbitrary languages and generate stories and draw pre pictures for you this is the only game and talent at the moment the only class of algorithm that converges over all these many domains and people are looking for better alternatives and yes we are in a bubble because of course they’re looking mostly 
where things already work that you have hardware that works you have libraries that work and so on it’s hard to get out of that but what that is true and it’s always good to push for alternatives and so on but I don’t think that we should be in a panic and say oh my god there is something politically wrong it’s expected by and large the forces of the markets and the forces of the academic researchers that want to explore alternative are pushing in the right direction already yeah I mean fair enough and you know you could be right and there may not be that much of an imbalance but I want to get back to to one technical thing you said yes I it seems apparent that let’s say what deep learning is doing is this this differentiable program search if you will and a question I have about that is if we imagine the space of all possible programs that you know requiring that we’re doing a differentiable search is certainly going to skew that sample space it may even cut off programs in that space that can’t be discovered easily by differentiable search so I’m wondering doesn’t that leave open the possibility that other algorithms that are more discrete in nature say evolutionary algorithms or discrete program search or whatever they may have access to a different subspace of the space of all all programs that aren’t easily accessible by differentiable paradigms is that true now the question is how do you find it of how you do find this algorithms to manipulate the discrete things I agree that when you have a perceptual model that is modeling everything with chains or sums over your numbers and a few non-added archfrains thrown and you get characteristic artefact for instance in the generative models you often have the problem and we try to model a person with glasses or without glasses that because the models think that these features are somewhat continuous you often run into the situation you get areas in the generative model where the glasses are half materialized and 
looks very weird. You have these strange things where reality has a discontinuity but your model has permissible states in the middle of the discontinuity, and you try to generate something that cannot exist. Ideally you want your model to be structured in such a way that every model configuration corresponds to a world configuration, and this is not necessarily the case with many of the deep learning models. What the deep learning models typically do, as you train them harder, is squeeze these impermissible areas until you are very unlikely to end up in them, and it's probably possible to get them to implement filters and all sorts of tricks. But what you can also do is combine this with some kind of discrete machine, and then what you do is learn how to use it. So the deep learning network does not interact with the world directly; it learns how to use an architecture that does. For instance, instead of training a neural network to do numerical calculations, you can train it to use a numerical calculator, and in this way it can become very sparse again. So there's no obvious limit that I can see, no point where I can prove to the deep learning people, here is where you should stop deep learning, because they can just combine the deep learning approach with other approaches and use the deep learning system to remote-control them. And it turns out that when we reason, even when we do discrete reasoning, the steps that we assemble after each other are heuristics that require some kind of probabilistic element. So when we form a thought, even when the thought is made of very discrete elements, the search for that thought is some kind of deep-learning-like process, and when we construct a proof, we emulate discrete reasoning. Of course we can combine these: we can get a neural network to learn how to perform the discrete operations. There's a certain thing that I would like
to see, which is something like a more sparse language of thought. When we are looking at deep learning models, there's a phenomenon that people sometimes observe which they call grokking. You train the model and the model gets better and better, and then it overfits, which means it gets very good at the training data but very bad in the real world, at things it hasn't seen before. Like a person on psychedelics, it is able to explain everything in the past but no longer able to perform well in the future, because it's overfitting: it has basically fit the curve too closely to the data it has seen. There are many tricks in deep learning to get around this overfitting, to make sure that it doesn't happen, and people try to avoid it. And then what they discovered is that when you take this overfit model and train it more and more and more, at some point it sometimes clicks, and it gets much better than ever before. So there is a question of whether there's something that we're doing wrong in deep learning. For instance, when you think about how people learn, they learn very differently from GPT-3. People first learn by pointing at things that are relevant to them, things that they can eat, that comfort them, or that they find pleasant, things that they can feel, where there are contrasts that are salient to them. So you start out by learning this semantic space of saliency and relevance, and then when you learn language, you learn basic syntax, how to put things together, and in the long tail of the syntax you learn style, how to express things with nuance and so on. In GPT-3 it's the opposite: you first learn style, then you learn syntax as the regularities in the style, and the semantics is the long tail of that. And to make that happen you need to learn much, much more; you need to have more training data and so on. Maybe there is a way in which we can reverse the order and
basically get it to start out with relevance, to build a curriculum where you first get very sparse regularities to click into place, where you always make sure that the model can handle things with very limited resources, and only see the style and the niceties and the nuances as later extensions of these very sparse, concise models that have very big predictive power. Yeah, on that, the grokking paper was very interesting, and a lot of these large language model fans always cite it very quickly when you have a conversation with them. But there is a problem with machine learning in general, which is that, as you said, there's a spectrum of correlations and almost all of them are spurious. On one side of that spectrum you have the idealized features you actually want it to learn, which will generalize out of distribution, and as you go down that spectrum you pick up all sorts of spurious correlations that just happen to work very well on the data, and if you tell the models not to use those spurious correlations, the performance of the model will go down. But I want to move over to Yasaman Razeghi's paper, I don't know whether you saw it, but she showed that the performance of large language models on arithmetic tasks is linearly correlated with the term frequency in the training corpus, suggesting that they are memorizing the data set, which presumably you would agree with. And Google has recently released this 540-billion-parameter language model called PaLM, which interestingly does extremely well on, for example, some of the BIG-bench tasks, such as the conceptual combinations task, which tests for compositionality, which we'll talk about in a minute. Compositionality is when you can take constituents from the prompt and compose them together to form the answer. Now, it's tempting to jump to the conclusion that these models are starting to magically reason at
scale, along the lines that you were just discussing, but I still think there are plenty of opportunities for shortcut learning, by which I mean these spurious correlations, given the brittle interface of an autoregressive GPT-style language model with these human-designed benchmarks. Would you agree with that? Yeah. When I started my own career in computer science in the 90s, I was in New Zealand, and the prof, Ian Witten, realized that I was bored in class, so he took me out of the class and put me in his lab, and he gave me the task of discovering grammatical structure in an unknown language from scratch, leaving me pretty much to my own devices on how to do this. The unknown language I picked was English; it was just unknown to the computer, but it was the easiest one to get a corpus for. They gave me the largest computer they had, it had two gigabytes of RAM, and I did in-memory compression in C and tried to do statistics, and I quickly realized that n-gram statistics don't work, because there are too many words in between. Unlike vision tasks, where convnets can use the prior that adjacent pixels relate to semantically related information (adjacency in images is a very good predictor of semantic relatedness), this doesn't really work in NLP. Attention was discovered in natural language processing for that reason, because you cannot use direct adjacency very well. So I realized I could not use n-grams, which depend on direct adjacency between words, and so I first of all used ordered pairs of words and tried to find correlations between pairs, and then find a mutual information tree that would give me the best prediction of the structure of the sentence, for all the sentences in my corpus. And indeed this correlated with structure, and I realized this was going to give me not just grammar but also semantics, if I could do deeper statistics. But I would need something beyond ordered pairs; I would need to have something
like higher-order models. But even with the clever in-memory compression and the many tricks that I used, I could not do the full statistics. So what I realized I had to do was multiple passes: at first I discard almost all the information and only pick out the most salient things. And then my time in this lab was over, I went back to Germany, and I never revisited this area of research. But what I had realized is that to make progress I would need to do statistics over what I do the statistics over, in a very principled way; I would need to learn what I have to learn. I didn't pay attention to this domain afterwards, and I also missed the 2017 transformer paper and its relevance. It was only when GPT-2 came out that I realized, oh my god, they did this, they did statistics over the statistics. And it's still not the right solution, I think; it's not the way in which our brain is doing it. It's some brute-force shortcut where, for instance, the individual attention heads are not correlated with each other, but in reality they are: in our own minds, the attention heads are integrated into one model of what's going on. And it's not that we have an attention head on every layer that just pays attention to what happened in the layer below; it's much more clever in our own mind. The process is active: we single out things in reality, we work out which book we need to take off the shelf to update our working memory context so we can interpret the current sentence that we don't understand. So we always go for saliency, and when we read something that doesn't make sense, at least my mind works like this, I will not stop until it makes sense, or I will keep something preliminary; I will not accept some kind of vague statistical approximation of what I read, except as an intermediate stage in my mind, filed with the hope that it eventually converges. It's a completely different
learning paradigm. When we teach our children arithmetic, it's not that we show them lots of very large math textbooks and hope that, while initially none of it makes sense to them, as they reread them again and again, with many samples, eventually it will click and they will converge on arithmetic. No, this is not how it works. You start by giving them extremely simple things and say: in these extremely simple things there is structure that you can fully understand, now go and find the structure. Once they've done it, you make it a little bit more complicated, and so on. This is probably the paradigm that we should be exploring. I know the problem is that it's incredibly deceptive when you have something which appears intelligent, and of course the boundary of our perception of intelligence is a receding one. But I wanted to get on to the incredible generative visual models like DALL-E and Disco Diffusion. These models, I think, are going to revolutionize the creative professions. I've been playing with Disco Diffusion all day today; I've already ordered some prints to go on my wall, it's incredible. The two obvious settings where large language models might be successful are coding and information retrieval, in my opinion, but let's pause for thought. I've played with Codex and I'm resolutely sure that I wouldn't want to use it. I think code and knowledge are a different ballgame to art, which I think will be amazing. With Codex there's an impedance mismatch between the process of generating the code and then debugging and running the code, which is euphemistically being framed as prompt engineering, or another term which I've just invented, retrospective development. I think it's easier to start again from scratch than to fix broken code from a large language model. And it's quite interesting that at Google it already takes months to get any code checked into their monorepo, because it's basically a
bureaucracy, because they needed to have gatekeeping after they decided to use a monorepo. So can you imagine how much bureaucracy there would be if they allowed people to start checking in code generated by an algorithm? Anyway, I think there's an exciting possible future for using these systems for information retrieval: rather than the way we currently go through and prune the results of a Google search, these models might just answer directly. But I shudder to think what that search UI would look like. Would its output be sclerotic or unadaptable? Would it be relevant to the query that I put in? Would its output even be true? Perhaps it will ask you to select what kind of truth you are looking for. So do you think these models will spoil our society, or do you think they will actually enrich it? Very hard to say. I think that from some perspective our society is already maximally spoiled; humans, the way they live today, are basically locusts with opposable thumbs. This is not going to go on forever. This technological society seems to me to be on some kind of Titanic that is going to hit the iceberg no matter what. But what should basically make us content is that the Titanic is the only place in the universe that has internet, and we were born on it. We wouldn't have been born if there was no Titanic; we would not have been born in the sustainable ancestral society. So in some sense our society needs to reinvent itself. It's not really working right now, and we don't know what the future is going to look like, whether it's going to be very technological or limited in certain ways; no idea what's going to happen. But if we think about how our current approaches work: if you just want to make programming better, I suspect that these tools can help, but they will be much more useful if you do not have this battle with a machine that doesn't really understand what you want, and instead you
have something that is working next to you. It's like, imagine you were working for some corporation and the corporation introduces some kind of planning tool that requires you to jump through all sorts of hoops, and it turns out that the planning tool makes you more productive, but it makes work much less fun. It's still rational to use it, and everybody will hate it, but by and large it will be used if it makes people slightly more productive, and everybody will feel there must be a better solution, something that feels more organic. So it could be that Codex is in this category: it makes mediocre programmers much more productive at producing boilerplate. And it's not just this; it's often able to find solutions very quickly where you would otherwise need a lot of Stack Overflow before you understand the new language, or before you tease apart this new algorithm that you want to understand. It just probably doesn't turn you into a better programmer, if that is your goal. But for your employer, maybe they don't care whether you're a better programmer; they just want you to turn out these pages of code, and then they run it against the verifier and against the unit tests, and when it's done, on to the next thing. So maybe it's not that important. But the systems that you would want, what would they look like? I think they need to know what they're doing. You want to have a program that is not just able to reproduce something very well in a given context; you want it to understand the context as deeply as you do, or better. So it needs to understand what kind of world it's operating in at the moment, and what it itself is, what it can do, what it still needs to learn. In some sense you want systems that are sentient, and this sounds so woo, oh my god, but it just means you have a learning system that's general enough to model, in principle, the entire universe. And this is not as outrageous as it sounds, because DALL-E is already dealing
with two modalities, language and images, and we will get to video and we will get audio connected, and you see the early steps in this direction, with the Socratic Models people for instance. So I think it's almost inevitable that this generality will happen, and you will want the system to work in real time so it can discover things by itself. Yeah, and I think there's something really magic, though, about the creative process here, and the prompt engineering is another thing we can talk about. Kenneth Stanley once made this thing called Picbreeder, where you could essentially distribute the selection of images created with CPPNs, compositional pattern-producing networks, and you would just get these beautiful images; they were so incredibly diverse and interesting. So it's not that the algorithms were intelligent; there was something magic about the externalized process. And what's really interesting about these models like DALL-E, for example, is that creativity has been distilled down to a raw idea in your head. So for example I might decide to mix the style of two artists and combine them with a new subject: I want a black cat in front of Royal Holloway University in the style of cyberpunk. I've been doing that all day, and the technical process is now done for you; the only limit is your imagination. So just like Kenneth Stanley's Picbreeder, creativity itself has now become this uber-efficient, externalized process. I think it's unreal. But the thing is, the reason I never thought GPT-3 was intelligent is because it can't be used non-interactively; the magic only happens when it's used by humans interactively. Well, you can basically build a machine that generates prompts for GPT-3. So in principle you can build a robot that has a vision-to-text module that is used to prompt GPT-3 into generating a story about a robot who sees these things and interacts with them, and then you take the output of
the generative model and translate it, using a text-to-motor module, and in this way you close the loop. I used this as a thought experiment to think about the limitations of embodiment for such systems; SayCan is essentially doing that. So somebody has made this happen, and even with a language model it works to some degree. And we know that we don't really want to do this with natural language, because natural language is a crutch; these systems use the natural language path so that people can use them. But there is some language of thought that we are using, not learned but discovered by our own mind, that is much more efficient. This language of thought seems to be able to bottom out in perceptual, distributed representations that are unprincipled, like these neural networks are in a sense unprincipled but don't break when there is something vague and ambiguous with small contradictions in it. But at some level it is also able to emulate very principled logic very well, and it becomes very sparse and very powerful at expressing things concisely, and this very concise language of thought I don't yet see in our models. What DALL-E, and DALL-E 2, is doing is combining the language model and the vision model using embedding spaces, and these embedding spaces basically project all the concepts into some high-dimensional manifold and find similarities between them. And Gary Marcus points out that there is an issue with composition in this. You need to find the semantic structure of a sentence that is made of a hierarchy of concepts, and this is easy to do with a grammar and much harder to do with a deep learning system that needs to discover this and structure the space in the right way. It's not impossible, so when Gary Marcus says these models cannot do this and cannot learn it, he is probably wrong. But I think he is right
in the sense that this is something that is much, much harder for the current approaches: they need dramatically more training data than a human being, and the algorithms are not doing this naturally. So there are probably ways in which you could make this happen much more elegantly and quickly converge, for instance, on models for arithmetic. That's right. I remember I read a really good Twitter thread, I think it was by Raphaël Millière, about the compositionality of these large generative vision models, because usually compositionality is discussed in respect of language models. I think Raphaël said that the assessment of the claim is complicated by the fact that people differ in their understanding of what compositionality means, but if language is compositional, as you say, and thought is language-like, as argued by the proponents of the language of thought hypothesis, then thought itself should be compositional in a similar sense, and perhaps by extension visual imagery should be compositional. So I think Gary was arguing, in a nutshell, that it's hard to go from the image, or let's say the utterance if it's NLP, to the structure or the grammar or the constituents; it's much easier to go the other way around. Would you agree? And the issue is that the language of thought is executable and natural language is not. We execute natural language by translating it, in our mind, into something that we can execute. When you reason about code, you might use natural language to support your reasoning, but the code that you build in your mind is held in some kind of abstract syntax tree that you can actually execute in your mind to some degree, and then you get a sense of the output. So you entrain your own brain with an executable structure, and this executable structure has properties that are quite similar to the ones that the compiler produces in your computer, so you can anticipate what
the compiler is going to do with your code. You might not go into it with all the depth that your compiler does, and you might still have to run your code, but you will find that when you are an experienced programmer, your stuff will usually run. So our language of thought can do this: it can execute stuff. It's not just some neural network that guesses what the outcome is going to be and is right some of the time; it gets pretty good at figuring this out, and this means that it has to build a compositional structure that has some verifiable properties. And we can observe ourselves operating this verification process. When we do introspection, we observe how we direct our attention when making proofs, and this attention algorithm works in real time: it makes changes to your mental models, predicts the outcome of these changes, compares this with what your mental computations actually give you, and then fixes your model of how your own thinking process in this domain works. You can observe yourself doing that. It's not that I would say a given approach, or the given approaches that we have, will never get there, but there seem to be ways in which we have just barely scratched the surface of what you need to be doing to make these models simple, efficient, sparse, and more adequate to model the domains you're interested in.
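The "executable abstract syntax tree" analogy above can be made concrete with a toy example. The following is a minimal, hypothetical sketch (not from the conversation; the node types and names are invented for illustration): a tiny compositional "thought" is represented as an explicit tree whose structure is inspectable and verifiable, and then executed, the way Bach suggests the mind runs its own compositional structures rather than merely guessing outcomes.

```python
from dataclasses import dataclass
from typing import Union

# A tiny "language of thought": expressions are explicit trees,
# so their structure is compositional and verifiable before execution.

@dataclass
class Num:
    value: float

@dataclass
class BinOp:
    op: str              # one of '+', '-', '*'
    left: "Expr"
    right: "Expr"

Expr = Union[Num, BinOp]

def execute(e: Expr) -> float:
    """Walk the tree and 'run' it, like mentally simulating code."""
    if isinstance(e, Num):
        return e.value
    fns = {"+": lambda a, b: a + b,
           "-": lambda a, b: a - b,
           "*": lambda a, b: a * b}
    return fns[e.op](execute(e.left), execute(e.right))

# "(2 + 3) * 4" held as an explicit compositional structure:
thought = BinOp("*", BinOp("+", Num(2), Num(3)), Num(4))
print(execute(thought))  # prints 20
```

The point of the sketch is the contrast drawn in the conversation: the tree's output is fully determined by its structure, whereas a network that merely guesses the result of "(2 + 3) * 4" is only right some of the time.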
Cool, okay. Well, just to finish the discussion about the OpenAI stuff: I agree with the prognosticators, and I do think that these large language models and these generative vision models will be revolutionary for some domains, but I think you really need to have a human guiding the creative process, which is a huge limitation. But I think it could also potentially hint at what intelligence actually is. I think intelligence might be this externalized process, in a cybernetics sense if you like; this idea of intelligence being fully embedded in an algorithm, in a single agent, might be the wrong way to think about it. I think that humans, by and large, are very confused very often; you need a human to guide a human, right? And then you ask yourself, if you do this recursively, does the society know where it's going, or is it confusion at all levels, balancing itself out? There seem to be very few people with a plan right now, and it's quite apparent; we see this in the sciences, we see this in politics, we see this in art. Humans have a higher degree of sentience, but by and large very few people have a principled plan for how to build a sustainable, harmonious world. If you set an AI system to this task, it might make more progress on it. It's just that DALL-E is not operating in real time on the universe that it's entangled with, and neither is GPT-3.
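The "statistics over ordered word pairs" Bach described from his early NLP work, and the "fancy autocomplete" framing of language models that comes up in this conversation, can both be grounded in a toy example. This is a deliberately minimal, hypothetical sketch (the corpus and function names are invented): it counts ordered word pairs and predicts the most frequent continuation, which is autocomplete in its barest form.

```python
from collections import Counter, defaultdict

def train_bigrams(corpus: str):
    """Count ordered word pairs: the simplest 'statistics over pairs'."""
    words = corpus.lower().split()
    counts = defaultdict(Counter)
    for a, b in zip(words, words[1:]):
        counts[a][b] += 1
    return counts

def autocomplete(counts, word: str) -> str:
    """Predict the most frequent continuation: autocomplete, nothing more."""
    if word not in counts:
        return "<unknown>"
    return counts[word].most_common(1)[0][0]

corpus = ("the cat sat on the mat "
          "the cat chased the dog "
          "the cat sat on the rug")
model = train_bigrams(corpus)
print(autocomplete(model, "the"))  # 'cat' is the most frequent continuation
print(autocomplete(model, "sat"))  # 'on'
```

As Bach noted, direct adjacency statistics like these break down quickly in language, which is exactly why he moved to non-adjacent pairs and mutual information, and why transformers do "statistics over the statistics" instead.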
Most of them are in some sense fancy autocomplete algorithms, but this fancy autocomplete is able to do completions that are far beyond the autocompletion abilities of humans in almost every context. And I don't see DALL-E yet as an artist. It was a very strange sensation: when someone at OpenAI let me throw prompts at DALL-E 2 and I got images back, I had a sense of ownership, I had the sense that I was doing it, even though it was clearly using skills that I didn't have. And I suppose you had the same impression when you were generating the things with your diffusion model that you're going to put up on your walls, right? You did that using this amazing tool that was empowering you to do things you otherwise never could do, but you are the creative nexus. To make an artist, a digital artist, you would need to create an autonomous creative nexus, a creative entity, something that reflects on the world, because art is about capturing conscious states. So we would need to build a system that has a story about itself and that is reacting to its own interaction with the world. That thing would not need to be human, but it would need to be consistent, an intelligent entity that is creatively interacting with the world. I think we could totally build an AI artist franchise right now that would have a huge following, but what it would need to have is an identity that is not fake, one that actually emerges from its interactions with the world in real time. Well, I think we've got a lovely segue there, because you said that art is about representing our conscious states. In a way I disagree, because you could say, well, it's very reductionist, I've just put a prompt in there and I've created art. I think it is art, but how much of a representation of my conscious state is it? I think Douglas Hofstadter would say it wasn't. But over to the matter of consciousness, because we're a bit low on time. You said actually that you've
spent much of your life thinking about what consciousness is, and you said that you thought it was very mysterious but you now think it's a riddle that can be solved. On your recent Theory of Everything interview with Donald Hoffman, you said that it was virtual, not a physical thing, that brains are mechanistic and that the elements of consciousness seem magical somehow, but that it has a causal structure, not the way physics is built, but as a story which the physical system tells to itself. You said that the organism is a coherent and consistent pattern, stable at least at some level of analysis, and that consciousness allows organisms to coordinate their cells to succeed in their niches, and then you spoke of information processing over cells. Now, what model, or I should say what measure, of consciousness do you think you most align with? There is only one theory that offers a measure of consciousness, and that is integrated information theory, where you actually put a number on it, and it's not clear what that number means. It's not that there is some kind of scalar that measures this, and people tend to think of consciousness as something that is more qualitative than quantitative: either somebody is conscious or somebody is unconscious, and when you are conscious you can have a lack of accuracy, you can be addled in your brain, and you can be drifting in and out of consciousness, but it's still a qualitative thing whether you have it or not. And this qualitative thing seems to be simple, probably much simpler than people expect it to be. The hard thing might be perception, and consciousness, on top of perception, is a certain way of dealing with our attention. So I think a very important aspect of consciousness is reflexive attention: we notice ourselves attending to something, and we reflect on that and integrate it into our model. The conundrum with understanding consciousness, if you go right into
the history of everything, starting with Leibniz and many others: Leibniz had this idea, imagine you could have a mill, and this mill is your mind. The mill is made of lots of mechanical parts, and somehow the thing is feeling and perceiving things. We blow the mill up so large that we can walk into it, or today we would maybe zoom into it until we see all the parts, and we just see these pushing and pulling parts, and none of them can ever explain a perception or a feeling. It is a very strong intuition, one that also drives the Chinese Room argument, and many other thinkers are attracted to it, and it seems pretty obvious that these mechanical phenomena are insufficient to explain what's going on; there is no obvious connection. So people become dualists: there must be two completely separate domains. And I think in a way this dualism is correct, but not in the sense that the mental states are ontologically existing; they exist as if. There is no organism, there is only this collection of cells, and this collection of cells is acting in a coherent way, which means we can compress it: we can model it using a very low-dimensional function rather than looking at all the cells individually, and the organism is only approximating this function. But what makes the organism more powerful than the collection of cells is exactly that function, this structure that we project into it. And the interesting thing is that, through the information processing within the organism, the organism can discover that function by itself and use it to drive its own behavior. So while the collection of cells is not a person, not even really an organism, it is very useful for it to behave as if it were an organism, and also to have an idea of what it would be like to be a person that interacts with the social world. So it creates a simulation of that; often it's not even a simulation, it's just a simulacrum, and that's what makes it acausal. The difference between a simulation and reality is that the
simulation is modeling some aspects of the dynamics of a domain on a different causal substrate, a different causal footing. Think of a computer game in which you can shoot a gun: there is no proper physics in the game that would recreate what happens in the real world when you shoot a gun; instead it uses the different causal structure of a software program to give you good-enough dynamics, so you can interact with the world and experience this causal structure. You can make a different decision, make a different move in the game, and as a result the game behaves as you would expect, because it's imitating the same causal structure on a different substrate. In the simulacrum you don't have the causal structure: a movie doesn't have causal structure, it only gives you a sequence of observables. And our own mental model of ourselves is a mixture of simulation and simulacrum. Sometimes it creates a sequence without causal structure, so it looks like it does things magically, and sometimes we have a causal model, but this causal model is not the real deal; it's just a simplified geometric simulation of how the world works, the game engine that our brain is producing to anticipate what happens in the physical world. Yes, that very strongly resonates with me, and another person I very much respect, whose opinion resonates with me regarding consciousness, is Karl Friston. I'm not sure how familiar you are with his free energy principle and his thoughts on consciousness, but I'd like to put to you one of his more recent definitions, or proposals, to explain consciousness, and get your opinion on it. This is from his 2018 article titled "Am I Self-Conscious? (Or Does Self-Organization Entail Self-Consciousness?)", and what he says is: the proposal on offer here is that the mind comes into being when self-evidencing has a temporal thickness or counterfactual depth, which grounds inferences about the
consequences of my action. On this view, consciousness is nothing more than inference about my future; namely, the self-evidencing consequences of what I could do." Does that align pretty closely with your view?

You know, I don't think it's sufficient, and I also don't think it's necessary. I like Friston's ideas, but most of the free energy principle comes down to predictive coding, which is in some sense being radically tested with GPT-3. GPT-3 is trained, in some sense, entirely on predictive coding: it is only trying to predict the future from the past, and the future is the next token, based on the tokens it has seen so far. GPT-3 empirically tests how far you can go with this, and you can go very far, but you need far, far more samples than an organism does. So there are priors in us that go beyond predictive coding. Maybe we converge towards this over many generations in the evolutionary process, so I don't think it's a stupid idea that Karl Friston proposes, but we are born with additional loss functions that let us converge much, much faster on something that is useful to the organism.

And if we think about consciousness, he has a point about agency in there. Agency means that you have a controller that is able to control the future. It took me quite a while to understand this. When I grew up, we talked about BDI agents, and they seemed quite complicated and convoluted; you had to put a lot of code in there to make a BDI agent, with the beliefs, desires, intentions and so on. But think about what actually is a minimal agent: a thermostat is not a minimal agent; the thermostat has no agency, it doesn't want anything, it just acts on the present frame and does the obvious thing. But imagine that you give the thermostat the ability to integrate the expected temperature over branches, over branches of the future: does it do X now, or Y now, or does it act a moment later? So suddenly you have a branching reality, and in this branching reality you can make decisions, and you will have preferences
based on this integrated expected reward. So just by giving the thermostat the ability to model the future, you turn it into an agent. This is sufficient, and if you make this model deeper and deeper, it's going to get better and better at it, and at a certain depth the thermostat is going to discover itself. It's going to discover the idiosyncrasies of its sensors and notice that a sensor operates differently when it's closer to the heating element, and so on. So it becomes aware of how it functions. It might even become aware of the way in which its modeling and reasoning process works, and improve it, or account for its inefficiencies in certain ways. And this is also what we do with our own self. But this model of the self is not identical to our consciousness. Our consciousness is a feeling of what it's like in the moment; it's the experience of the now; there is an experience of a perspective that we are having. And this is what's absent in Friston's description. He is missing the core point of what it means for something to be conscious. It doesn't mean that it has a self, it doesn't even just mean it has a first-person perspective; it means that it experiences a reality.

Yeah, on that point in Friston's account: we explicitly asked him this question, actually, when we talked to him last time, and his response was that this concept of "feels like" is really something that would need to be encoded into the generative model that this agent has about the world. Number one, it has to have a generative model, as we've just been discussing; it has to be able to entertain counterfactual possibilities for the predictive coding. And he's saying that these "feels like" concepts would literally be encoded in that generative model as hypotheses that we recognize, so things like "I'm feeling pain," for example, would be a concept within that model. And he
says there's actually evidence from treating patients with chronic pain and this sort of thing that that's exactly what's happening in the mind: that the feeling of pain is actually a concept that's built in, as a slot if you will, into this generative model. What do you think about that proposal?

The semantics of pain are given by avoidance: you don't want to experience the pain, usually. It could be that you cultivate the pain and use it to make something happen on the next level, but that requires that you build a multi-level control structure if you want to use pain productively, which some artists are maybe doing. But you cannot have pain, I think, without an action tendency, without something that modulates what you are doing. So your cognition is embedded into this engine, and to build such an engine that causally changes how you operate is not that hard. But when you live inside such an engine, it feels very strange that there is something happening that somehow depends on what you are thinking, but you cannot control it; it controls you. It's upstream from you; you are downstream from it. And when you get upstream of your own pain, the pain stops being pain; it becomes a representation that you can now control. We are able to get there, but it's not easy, and you're not meant to get there, because it means that we could immunize ourselves to pain and sacrifice the organism to our intellectual interests.

What's crucial about feelings, when you look at them introspectively, is that feelings are essentially geometric. I don't know if you've noticed that. For instance, we notice feelings typically in our body, and that's because, I think, the feelings play out in a space, and the only space that we always have instantiated in our mind is the body map, so they're being projected into that space to make them distinct. And when we look at the semantics of a feeling, we notice that they are
contracting or expanding, or they are light, or they're heavy, and so on. It's all movement of stuff in space; it's all geometry plus valence, the stuff that is going to push your behavior in a certain direction. So these are basically the interactions of some deep learning system that is producing continuous geometric representations, as perceived from an analytic engine. It's an interface between two parts of your mind: between the analytic attention control that is reflecting on the operations your mind is doing while it's optimizing its attention, and the underlying system that represents the state of the organism, tells you where you should be going, and makes this visible to you. It is a system that is not able to speak to you; it uses geometry, and this is what we call feelings.

That's a very interesting connection, and I think Jeff Hawkins of Numenta would be quite interested in that as well, because some of what he discussed with us was that, in his view, the evolution of, let's say, abstract thinking actually came from systems that evolved to operate in simple three-dimensional motion, and that eventually those were reutilized by the evolutionary process to start engaging in abstract thinking, which he views as movement through an abstract space. So I think there's a lot of connection here to what you're saying about feeling, which is that, in a sense, our mind has reutilized this three-plus-one-D movement-mapping capability that it needed in order to survive in a three-plus-one-D physical environment, and it's reutilized it for mapping feelings, and it's reutilized it for abstract thinking as a form of motion in an abstract space. Is that a fair connection?

Yes, but I don't think it's because it's imported from the world we interact with; it's because it's the only game in town, it's the only mathematics that can
deal with multi-dimensional numbers. So when we talk about spaces, we actually talk about multi-dimensional numbers, about things that are not just a scalar in a single dimension, but features that are related. Sometimes you take features that you measure continuously, because they have too many steps for you to discretize them, and what you sometimes discover is that you can rotate something, and this is when you get a space in the sense that we have a space through which we are moving. And these spaces in which you can rotate things only exist in 2D and 4D and 8D. So the geometry that we're talking about is constrained to certain mathematical paradigms, which you can derive from number theory, from first principles. And our brain is basically discovering a useful set of functions to model anything, a set of useful computational primitives. We can probably give our deep learning systems a library of predefined primitives to speed up their convergence. That's also the reason why there is useful transfer learning between different domains: you can train a vision model and use it as pre-training for audio, and it's not because it's the same thing, but because it has learned useful computational primitives that it can apply across domains, and there is geometry in the audio signal.

This is very interesting territory. I hope you'll come back to dive into this a bit more deeply when we have more time and a better connection, because I agree with you, I think there's some very fascinating math here. Fantastic. Well, Dr. Joscha Bach, as I said, you're by far the most requested guest that we've had, right from the very beginning, so it's an honor to get you on the show, and I hope we can get you back soon for a longer conversation. Thank you so much.

Likewise, I enjoyed this very much. Let's meet again soon.
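[Editor's note] Bach's remark that GPT-3 is "only trying to predict the future from the past, and the future is the next token" can be shrunk to a toy sketch. The bigram counter below is an illustrative assumption, not how GPT-3 is actually implemented; the corpus and function names are invented for the example.

```python
from collections import Counter, defaultdict

# Toy version of the next-token objective: "training" is pure predictive
# coding, i.e. counting which token follows which in the data seen so far.
corpus = "the cat sat on the mat the cat ate".split()

following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(token):
    """Return the most likely continuation given only the past."""
    return following[token].most_common(1)[0][0]

print(predict_next("the"))  # "cat": it follows "the" most often in this corpus
```

The point of the sketch is that nothing beyond the prediction objective is built in, which is exactly why, as Bach notes, such systems need far more samples than an organism born with additional loss functions.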
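[Editor's note] The thermostat example, a reactive controller becoming a minimal agent once it integrates expected reward over branching futures, can be sketched in a few lines. The dynamics, reward, and action names here are toy assumptions for illustration, not anything from the episode.

```python
def step(temp, action):
    """Toy room dynamics: heating raises the temperature, idling lets it drift down."""
    return temp + (1.5 if action == "heat" else -0.5)

def reward(temp, target=21.0):
    """Preference: penalize squared distance from the target temperature."""
    return -(temp - target) ** 2

def expected_return(temp, action, depth):
    """Integrate reward over the branching tree of futures, keeping the best branch."""
    nxt = step(temp, action)
    if depth == 0:
        return reward(nxt)
    return reward(nxt) + max(
        expected_return(nxt, a, depth - 1) for a in ("heat", "idle")
    )

def act(temp, depth=3):
    """depth=0 is the reactive thermostat; any depth > 0 makes it a minimal agent."""
    return max(("heat", "idle"), key=lambda a: expected_return(temp, a, depth))

print(act(18.0))  # cold room: planning over branches favors "heat"
print(act(24.0))  # warm room: planning over branches favors "idle"
```

Making `depth` larger corresponds to Bach's "make this model deeper and deeper"; self-discovery would then amount to the model also covering the controller's own sensors and inference.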
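[Editor's note] The transfer-learning point, that a primitive learned in one domain carries to another because both signals have geometry, can be illustrated with a single shared filter. The derivative kernel and both toy signals are assumptions for the sketch; no real pretrained model is involved.

```python
import numpy as np

# A classic "vision" primitive: a 1-D derivative (edge) filter.
edge_filter = np.array([-1.0, 0.0, 1.0])

image_scanline = np.array([0, 0, 0, 1, 1, 1, 0, 0], dtype=float)  # a visual edge
audio_envelope = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)  # a note onset

def detect(signal):
    """Apply the shared primitive and report where it responds most strongly."""
    response = np.convolve(signal, edge_filter, mode="valid")
    return int(np.argmax(np.abs(response)))

print(detect(image_scanline))  # position of the edge in the image row
print(detect(audio_envelope))  # position of the onset in the audio envelope
```

The same computational primitive localizes the transition in both domains, which is the (much simplified) sense in which vision pre-training can help an audio model.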
