#81 JULIAN TOGELIUS, Prof. KEN STANLEY – AGI, Games, Diversity & Creativity [UNPLUGGED]
Amazing. Welcome back to Street Talk Unplugged. Today I'm here with Professor Julian Togelius from New York University and also Professor Ken Stanley from OpenAI on a joint interview. Can you believe it? This is the first time that we've done anything like this. I'm really, really excited. It's a good time to be alive, actually. So Julian, can you introduce yourself first, and then we'll hand over to Ken to do the same? Yeah, hi, everyone. I'm Julian Togelius at New York University, where I'm an associate professor in one of our several computer science departments. I'm also a co-founder of modl.ai, which does AI for games. And well, I do a lot of AI for games and open-ended learning and things like this; I guess we'll dig deeper into that. Indeed, and Ken? Ken Stanley, and right now I'm leading the Open-Endedness team at OpenAI. I know that Julian and I have a lot of common interests, so I'm very much looking forward to this chance to talk. Amazing. Well, let's start proceedings here then. So Julian, you've said that you think that games are a good testbed for AI models, and by extension that if an AGI were capable of performing well on games, then it would be job done, to a certain extent. So what's your take, Julian? Right. Yeah, I said this thing that video games are great for training general AI back when people didn't really believe in it. People thought that board games were great, that board games capture what's essential about human intelligence: chess, Go and so on. And then it turned out that it wasn't really enough, because people went on to achieve superhuman performance in these board games with amazing systems like Deep Blue, AlphaGo, and so on. And still, these systems weren't really capable of doing anything else than playing those board games. And I was saying, well, we have all these video games, and in fact they have this very, very wide design space. They test a lot of different cognitive skills.
In fact, game designers, when they design games, are essentially exploring the space of cognitive adaptations that we humans have, sort of mapping out cognitive skills by finding designs that test our ways of thinking in new ways. And so I was saying that if we could create an agent that could play not just one simple game but games in general, video games in particular, then we'd probably be pretty close to what we could call AGI. Fascinating. But this gets into a really interesting philosophical discussion, because task-specific skill is quite crystallized, so I don't think anyone thinks that that's intelligence. I think you're right: when you put the question that way, task-specific skill is not intelligence, and people will agree with you. But I also think this is implicitly what people think. They see someone doing a lot of things that look intelligent, and then they're like, yeah, this person is intelligent. Yes, but at its core, intelligence is about being able to rapidly build on what you know, learning to do something more on top of it, something that builds on it and is a new capability. And this is why, if I fire up Steam and download a random game and start to play it, I would probably learn to play it at a decent level relatively fast, because I've played a bunch of games. I used to say that I play a lot of games, but not as many as I should, given that I lead a game innovation lab. But someone who does not have experience playing a lot of video games will not be able to do that. So in a sense, my somewhat general intelligence in the sphere of certain video game design conventions has become higher by playing all these games. Now, of course, you could ask: what about my intelligence in general? My answer is that I don't believe in that.
Every notion of intelligence is always relative to some domain. Well, yeah, I mean, games are such a wide spectrum of possibilities that, in my view, it probably encompasses a lot of what we think of as outside of games. I mean, there are games where you have social aspects, where you're building a life inside the game. It's not necessarily everything you would do in your whole life, but it captures a lot of what intelligence applies to. And so it does seem to me, and I know that I overlap with Julian a lot here, that games have a lot of potential for learning high-level intelligence, even near the human level. But I'm curious, Julian, about your current thinking on how well this is playing out with the game industry. Because maybe we had a dream that there's such a great synergy between these two opportunities, gaming and AI, that there was going to be this great partnership going forward. It would be really cool for the games, probably, if they had really good AI, right? And it's good for the AI if there are really interesting, sophisticated games. And I think it turned out to be maybe a lot more complicated, but I'm curious what you think the current status of that is. So it's a beautiful dream, wasn't it? Yeah, it's a dream we shared, about being able to develop state-of-the-art AI that would somehow improve games. I've spent the last 15 years partly trying to do that in different ways, most recently through co-founding modl.ai, which is an attempt to bring things from research to where they would actually fit into the game development pipeline. And it's hard. The parts where it's helping most are in game testing.
So, agents that can help you play through aspects of a game to test them, and certain parts of procedural content generation, generating levels being a specific case. But the dream that some of us, some of us on this call, had — that you'd be able to train generally intelligent agents and these would make the game better — that turned out to be way more complicated, because games are not designed for that. I like to say that most game designs we have build on templates that were made before we had any kind of useful AI. So we designed RPGs, role-playing games, first-person shooters, MMOs, puzzle games, racing games, anything, around the lack of artificial intelligence. And now those are the game design conventions. So I think it's really interesting, and both Ken and I have worked on this, to try to design new games around the availability of artificial intelligence. It's very fascinating, and it's very hard. It's the kind of work that's hard to fund: academic funders don't quite see it as research, and the game industry doesn't necessarily support it because they don't support research easily — these are just very strange games. But it's true that just taking an existing game design and putting some kind of really clever agent in there is not going to make the game better. Most likely it's just going to make it worse. Yeah, I mean, the game industry clearly has some risk aversion towards exotic, experimental things. And it's understandable; it's not some kind of scandal or something like that. But yet, what is the way around that risk aversion? I'm curious, because it's frustrating from the opportunity side. You see, oh, these things could really work together well and do something interesting.
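The game-testing use case Julian mentions can be made concrete with a toy sketch. This is purely illustrative and not modl.ai's actual tooling: the level format and the agent below are invented here. A random-walk agent probes a tile-based level and reports whether the exit is reachable, which is the kind of question automated playtesting answers.

```python
import random

# Toy playtesting sketch: 'S' is the start, 'E' the exit, '#' a wall.
LEVEL = [
    "S..#",
    ".#..",
    "...E",
]

def playtest(level, steps=10_000, seed=0):
    """Random-walk agent: returns (exit_reached, tiles_visited)."""
    rng = random.Random(seed)
    rows, cols = len(level), len(level[0])
    start = next((r, c) for r in range(rows) for c in range(cols)
                 if level[r][c] == "S")
    r, c = start
    visited = {start}
    for _ in range(steps):
        dr, dc = rng.choice([(-1, 0), (1, 0), (0, -1), (0, 1)])
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols and level[nr][nc] != "#":
            r, c = nr, nc
            visited.add((r, c))
            if level[r][c] == "E":
                return True, visited   # level is beatable
    return False, visited              # exit never found in the budget

ok, seen = playtest(LEVEL)
```

Even an agent this dumb catches the most basic design bug, an unreachable exit, without a human playing the level.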
Like, why is there no room to have a little playground where we could try things like that? I think this will have to come from individual AI practitioners slash game designers who are well versed enough in what modern AI can do, and have the game design sensibilities, to do these things and put them out on Steam without any expectation of actually making money. And that will point to a way forward. And if this sounds rough and unfair — well, that is how indie game development is. It is kind of rough and unfair, and most people don't make money out of it. But if I were running a large video game developer and I had to put a hundred million dollars into my next game production, knowing that if it failed we would be out of business, I would also be kind of conservative. So I get it. And there's also the problem that building a modern game, with all the modern production values, is simply extremely expensive. But these big tech companies — not game industry companies, but other big tech companies — they have research wings and things like this, right? Why is that so hard for the game industry? You know, dev labs or something; it seems like it'd be worth the investment. Let's see what could happen; make some little games that might point to the future in some way. Some have, but on a very small scale. Ubisoft has La Forge, EA has SEED. These are smaller outfits, and they tend to be focused on how they can support existing game productions that build on existing game design conventions, in specific roles such as game testing — which we also do at modl.ai — procedural animation, and so on. It's interesting: the folks at EA SEED actually did do some interesting procedural generation work, and they published it, and it's pretty nice.
I like it, but it's very much supporting the existing paradigms. And I agree: if you want interesting AI to come out of there, AI that makes progress towards AGI, whatever AGI is, then you really need to rethink these conventions and build things around it. Take a very concrete example: dialogue trees in role-playing games. No one thought that the way we should have conversations with agents is by navigating down a dialogue tree. That was basically a design necessity, because we didn't have any useful NLP technology. Back in the eighties we didn't have the hardware to run it on — we didn't have the processing, we didn't have the RAM or whatever. And that became a design convention, and this is now how you do it. So people like Nick Walton, who made AI Dungeon — I forget what his company is called, but anyway, the AI Dungeon company — they are courageously trying to define new paradigms for that, and I hope they succeed. But that's one of these indie things; it doesn't come out of a big game studio, because they're still too timid, I think. So, I want to come back a little bit to this notion of intelligence, because I've got a few questions around this, and I want to get to your ideas on superintelligence and the intelligence explosion as well. Folks like Pei Wang — and by the way, the idea of having a formalism at all is something that Ken and I have spoken about — you could formalize it in respect of an agent taking in percepts, having an internal state, and performing actions in an environment. I don't know if that's too much of a formalism or not.
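To make the dialogue-tree convention concrete, here is a minimal sketch. All the lines and structure are invented for illustration; the point is just that every conversational path has to be authored up front, which is exactly the constraint Julian describes.

```python
# Minimal sketch of the dialogue-tree convention; names are illustrative,
# not taken from any particular game.

class DialogueNode:
    def __init__(self, npc_line, choices=None):
        self.npc_line = npc_line        # what the NPC says at this node
        self.choices = choices or {}    # pre-authored player reply -> next node

    def respond(self, player_reply):
        """Navigate one step down the tree; unplanned replies go nowhere."""
        return self.choices.get(player_reply)

# The whole 'conversation space' is enumerated by the designer in advance.
leaf_yes = DialogueNode("Then take this sword.")
leaf_no = DialogueNode("Come back when you're ready.")
root = DialogueNode("Will you accept the quest?",
                    {"yes": leaf_yes, "no": leaf_no})

print(root.respond("yes").npc_line)   # -> Then take this sword.
```

Anything the designer didn't anticipate (`root.respond("maybe")`) simply returns nothing, which is why the convention exists: it needs no language understanding at all.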
And then you could assess what an intelligence is in respect of whether it does the right things given an input, or whether it does it in the right way because it has the right state, and stuff like that. So with that framing, what would an intelligence be for you? For example, do you think AlphaZero is intelligent? My intuitive answer is no, because it's way too domain-specific. Then I remind myself that I just said that everything is kind of domain-specific, which I deeply believe. So my revised answer would be: not particularly. AlphaZero is extremely limited to acting on one particular kind of problem. It can be retrained, but even the domains in which it can be retrained are very small, and the retraining itself is very, very expensive. So I don't think it can be said to be intelligent in any meaningful sense, no. I do think that formalisms can be inspirational sometimes. I like Shane Legg and Marcus Hutter's universal intelligence definition, partly because I like to point out all the ways you can disagree with it. But it has inspired us — it inspired us to build the General Video Game AI framework, for example. But I don't think that any formalism will ever capture everything we mean by intelligence. In general, I think there are too many physicists in this field. Physicists think strangely, because they live in a world where formalisms make sense, where the mathematics in a sense is the reality. And I think that kind of thinking is currently pretty much overvalued in artificial intelligence, because these are not the kinds of entities we deal with, and the things we're trying to capture would somehow vanish if we formalized them that much. So, back to your question: what do I call intelligence?
I think it's kind of a misposed question. Intelligence is a word we apply to various organisms in various circumstances, and it's always domain-specific, and even the boundaries of the organism are rather fuzzy. This being said, I do use the word intelligence a lot. I would associate it more with being able to survive in a large variety of environments, or being able to behave seemingly rationally in a large variety of environments, with only a short training time or short learning time. Yeah, I agree that the efficiency thing is important, because even though AIXI or the universal intelligence idea is interesting, it's one of these appeals to infinity, in the sense that it's kind of saying, well, if you could do absolutely anything, then this is what it might look like. And there's the AI effect as well, which Pamela McCorduck described: every time we produce something which seems intelligent, there's a chorus of people who say that's not really intelligence. And there's the parable of the blind men and the elephant too: we're all describing this thing in a particular way, but we're excluding many important aspects of it. And I think one of the problems here is that, like many complex phenomena, intelligence is an emergent thing, and that makes it impossible to describe succinctly. It's also a folk psychological concept, and it doesn't necessarily have a true definition. I mean, we could come up with a very limited definition that would be mathematically beautiful and logically consistent, but that definition might not be useful. How do you like the term AGI, or artificial general intelligence? Is that helpful? It's bullshit. Well, at least partly bullshit.
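For reference, the Legg–Hutter universal intelligence measure mentioned above scores a policy $\pi$ by its expected value $V_\mu^{\pi}$ in every computable environment $\mu$, weighted by each environment's Kolmogorov complexity $K(\mu)$:

```latex
\Upsilon(\pi) \;=\; \sum_{\mu \in E} 2^{-K(\mu)} \, V_\mu^{\pi}
```

Both objections raised in the conversation map directly onto the formula: the sum ranges over the infinite set $E$ of all computable environments (the "appeal to infinity", and uncomputable in general), and $V_\mu^{\pi}$ measures only final performance, saying nothing about how quickly the agent learned it — which is why Julian treats it as inspiration rather than a working definition.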
No, because the main reason I dislike artificial general intelligence as a term — and especially taking its existence as the premise of a discussion — is that it gets tied up with the superintelligence argument, which I think is extreme bullshit, harmful bullshit. But let's start with artificial general intelligence. What I do like about it is the spirit that we need to go beyond coming up with solutions to individual tasks. That's inspirational, again, and it has guided a lot of the interesting work on domain adaptation and generalization in reinforcement learning, something I've been working on quite a bit recently, for example, and I know you have contributions there as well. So I sympathize with the spirit behind it, and I like that it puts a spotlight on going beyond the very narrow things that have dominated AI for so long. But the "general" in general intelligence — you always need to put it in quotes, basically: artificial, quote, general, unquote, intelligence. Because how general can it really be? It's always domain-specific in substance, and this is often what you need to figure out. It helps, basically, when you turn the question around: what is natural general intelligence? Most people would basically say that, well, that's what humans have. But then humans are also very domain-specific — individual humans, I mean. I live here in New York City in a nice, cushy Western environment, where I do complicated things like taking the subway to work, talking to my PhD students all day, and writing papers. Most of the tasks that humans solve around the world, I would have no idea how to do.
I'm completely worthless at things like smelting iron or harvesting wheat, or even adjudicating conflicts about land use or taxes. And I'm not saying I'm particularly stupid — I'm a little bit stupid, but not particularly stupid. It's just that I know a very small number of things. Could I learn to do the other things? Yes, I could probably become a tax lawyer, but that would take me five years or something. So how could you reasonably say that I have general intelligence, as such? And then, of course, even all of humanity — everything we know and every environment we can survive in — that's extremely small compared to all the tasks you could possibly do in the universe, and all the environments that could possibly exist, or even do exist, in the universe. So we lack generality in many, many ways. So I don't think artificial general intelligence can ever exist, because there is no natural general intelligence, and I don't see a reason why there would be any. I agree. So, a few things to unpack there. First of all, I think Ben Goertzel invented the term AGI; he was exasperated with the notion that all the AI we're working on now is narrow AI, and I think basically he meant it wasn't flexible. But I agree with you. Yeah, I think it was Shane Legg. Oh, okay, maybe I'm wrong on that. But anyway — I agree with you that all intelligence is in the context of something, in a particular domain, in a particular environment. There's no such thing as completely general intelligence; I agree with that. There's also this notion — I know you worked a bit with Jürgen Schmidhuber, and he had this notion of a Gödel machine, a self-referential universal problem solver making provably optimal self-improvements.
And I know you've spoken about I. J. Good's informal remarks on an intelligence explosion through self-improving machines. And this is actually what a lot of the Nick Bostrom-style people talk about: this notion that intelligence will improve itself. I personally don't think AGI means self-improving intelligence, but François Chollet wrote a wonderful article about this, talking about the rate-limiting steps in the environment that would actually stop, or at least limit, any self-improvement. What are your thoughts on that? So, I agree — I think François Chollet's article is really good there. And my thought is that it sort of follows, from there not being any natural general intelligence, that it's hard to imagine what a self-improving artificial general intelligence would even be. Not as in "I can't imagine it because I have no imagination" — I'm not sure it's even self-consistent. So if you try to think out what it would mean to be a self-improving artificial general intelligence, you would need to be able to improve yourself by rewriting your own code. That's complicated, because how would something understand its own code well enough while still keeping that entire representation inside itself? But let's say, for instance, you could do that. Well, you only get so far by rewriting your own code. You still need to improve all the code that it's built on, because you have network code, operating system code, and probably the hardware it's built on. To improve the hardware, you need an extremely, extremely huge global supply chain. You need to design better chips, so you need better machines that can build them, so you need better lasers, and then you need better mining for the components that go into these, and then you need better logistics to transport things, and so on.
Basically, what being self-improving at an unlimited scale means is that you need to improve the whole world economy. So I would say the only reasonable way you could think about a self-improving organism would be if you saw civilization as we know it as an organism that is self-improving. Now, this I could buy; this might actually be the case. So there is a version of the superintelligence argument which I believe may be true to some extent, but in that version, all seven billion of us humans are part of the machinery of the self-improving thing. And then you can see people saying, aha, well, there is an AGI in there somehow that controls all of this — it sits in a box somewhere in a server room and controls all of us like marionettes or something. To which I would say: that sounds suspiciously like all kinds of disproven and unlikely conspiracy theories, and that's just not how the world, and its fundamentally distributed decision-making, actually works. So the superintelligence argument, in almost all versions, is almost ridiculous, except the version where you talk about it on a civilization scale. But even then, it's really hard to figure out what the agent is. What do you think, though — if you put aside the jargon and these different terminologies, what do you think we should be aspiring towards? What is the North Star, the Holy Grail thing? Now that's a really interesting question. I think we should aspire towards gradual generalization: gradually more general capabilities. And it's fine to come up with somewhat contrived and weird benchmarks for this, or maybe benchmarks coming from actual use cases.
I like these ideas of developing embodied robots that could basically be home nurses and take care of a person who is bedridden, in every conceivable way: cooking dinner, cleaning the carpets, watering the plants, taking care of the pets, and whatever else you may need to do in a home. That's great, because that is a very, very wide range of skills you need there. Though note that all the skills here are the kind of skills that we selected or created as a human civilization. Why is opening a door a hard skill for a robot? It's the skill it needs because we made all the doors so that we humans could easily open them. So it's still very domain-limited, but it's a reasonable challenge, a good challenge, and there are many challenges like that. Personally, I want an AI that can play every game on the App Store or the Steam library reasonably well. I think that would be a very great achievement, and it would teach us a lot. This is not how we write things in grant funding proposals, because other people would roll their eyes: why would anyone want to do that? How is this helping humanity? I think it's helping humanity by teaching us a lot about how to build, quote, intelligent, unquote, machines. Another big mission is that I want to build environments — we can call them games, entertainment environments, experience environments — that are self-creating, where you can go into a simulated open world and walk in any direction and the game will present you with experiences, present you with things. It's like playing Grand Theft Auto, and as you walk in any direction, you get new cities, new people there driving cars you've never seen before, and they have relationships you never heard about before, new narratives.
And this is all there to serve you, in the sense that the system is trying to probe what you like, what you find interesting, what experiences you have not yet had that it thinks you would actually like to have. And this would be interesting because it would advance the state of the art in building worlds, building narratives, building game mechanics, all these things. But it would also very much advance the state of the art in systems that understand you, that look at what you do and what you say and try to model what's going on inside your mind. Yeah, it's interesting, because there are kind of two sides to what you're describing. One is the more mechanical operations, like the nurse who takes care of the patient — a piece of that is having certain articulation skills that it needs to learn, like getting the food or giving a bath or something like that. And then there's the very creative side, like creating a world, where that's not just good for games. I mean, you could have artificial intelligence literally participating in culture, and, I guess, making things for us in the real world too. I'm curious about your feeling about those two things. Do you think they're related to each other, or is one more important to you than the other? Yeah, I mean, there's a difference between important for humanity and important for me, and as a human in society I have to recognize that distinction. Personally — I mean, I come from a family of artists, and this is how I grew up. I was never told to always think about what's best for humanity.
I grew up with people who literally painted for a living, and when they couldn't make a living out of painting, they would take a day job and make money from that somehow, all the while grumbling about needing to get back to their real work. So I'm very interested in machines and methods that can create, and can create together with us: both autonomously, in response to us and how we behave, and also together with us. I think the history of art is very much a history of technological development, because art history has seen so much development in ways of manipulating colors and materials, and of representing reality, and so on. And new art styles have been driven not only by trying to depict or narrate what people see around themselves, but also by new technologies that make this art possible — from new methods for painting and sculpting and new pigments, to new methods of writing, and of course digital art like video games and so on. So I'm very interested in that, because I think it's intrinsically interesting, and this is what contributes to our culture in a sense. It helps us understand ourselves and gives us better and richer lives. Now I forget — what was the original question? Well, I think you answered it pretty well, but I guess I was curious about this seeming dichotomy between the very practical, like the nurse robot, and then creating a world, which is so open-ended and has so many creative aspects — just how those fit together. I think there are several ways they fit together. At a somewhat abstract level, all problem solving is creative to some degree, and I think I can say that everyone practices some creativity in their life.
Even if what you're doing is being a home care nurse, you will find problems that you need to solve creatively. How am I going to deal with the cat sleeping on the medicine, or whatever? You have to figure out something about that. But there's also a more down-to-earth take, closer to the actual current kind of things you can research your way through, which is — and I know you've been working a lot with this kind of example — that we can use creativity to create problems that our agents can learn to solve. So generating levels is great in itself, but it's also great because it can help us test and retrain reinforcement learning agents, for example. Something that has been discovered by people over the last few years is that reinforcement learning tends to overfit in a pretty bad way. Reinforcement learning agents tend to be overfitted not only to a particular game — they can play that game and nothing else — but also to a particular level, or a particular set of levels in a particular sequence, and a particular screen resolution, angle, frame rate, everything. If you change any part of this, you get random behavior, and all the thoughts you just had — oh, I saw these deep neural networks playing the Atari games, it's really smart, look, it can play Pong or Montezuma's Revenge or something — then you change just a tiny bit of a thing and you see it collapse. Now you wonder: how could it learn something slightly more general, slightly more general gameplay skills? And I think a very important part of that is generating parts of the game as you go along — maybe not only the levels, but also the rules of the game, and the graphical representation; you can vary a ton of different things.
So in a sense, this is like data augmentation, which people have been talking about and practicing in machine learning for a very, very long time, but taken to new limits. And one thing that I am very interested in at the moment — and I know you've been working on this before, Ken — is exactly what kind of levels we need to generate. What is interesting to generate here? For example, do we want to generate levels that discriminate well between different agents? Do we want to generate levels that are hard, and how hard? Do we want to generate levels that have a particular set of challenges in a particular order? What are the interesting dimensions to vary, and so on? These are things I'm working on pretty much as we speak. And this is one way — to get back to the question — that very mundane concerns of solving particular tasks intersect with creativity, or rather with automated creativity. I really want to unpack exactly what we mean by things like creativity and open-endedness and subjectivity and ambiguity and all of that, but let's just quickly do the reinforcement learning thing now, because I want to eventually contrast that with some of the open-ended methods that Ken introduced me to — which, by the way, absolutely opened my eyes when I discovered Ken's book and research. You've spoken about the overfitting of deep reinforcement learning. I actually made a video on my other channel about Alex Irpan's article, "Deep Reinforcement Learning Doesn't Work Yet". You've probably read that one. Oh, yeah. So you've asked: is it because we're training on the testing set? Is it because the training set is too small? Is it because our models are too large? Is it because the input representation favors learning specific strategies, or something else? And you were just saying, well, if only we could generate data as we go along.
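The "data augmentation taken to new limits" idea can be sketched in a few lines: instead of training on one fixed level, resample the level parameters every episode so no single layout can be memorized. The parameter names below are hypothetical stand-ins for whatever a real level generator would expose.

```python
import random

# Sketch of procedural generation as augmentation: regenerate the
# environment every episode so the agent cannot use one layout as a crutch.

def generate_level(rng):
    """Sample level parameters instead of fixing them (illustrative ranges)."""
    return {
        "width": rng.randint(10, 40),
        "n_enemies": rng.randint(0, 5),
        "wall_density": rng.uniform(0.0, 0.3),
    }

def sample_training_levels(n_episodes, seed=0):
    """One fresh level per episode; only policies that generalize across
    these parameters can score well over the whole run."""
    rng = random.Random(seed)
    return [generate_level(rng) for _ in range(n_episodes)]

levels = sample_training_levels(3)
```

Julian's open questions then become choices about `generate_level` itself: whether to sample for difficulty, for discriminating between agents, or for a curriculum of challenges in a particular order.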
And I’m thinking it’s because neural networks are locality-sensitive hashing tables. So why do you think these reinforcement learning algorithms are, as you said, memorizing the environment? Because it’s the easy thing to do. I agree with you that neural networks in many instances, and specifically when we talk about reinforcement learning, often work more or less like locality-sensitive hashing. And why do they do this? Because that’s the easy thing to learn; it’s the path of least resistance, in a sense. This actually plays into a bigger critique: is it good that gradient descent is such a dominant paradigm? Because currently, gradient descent is dominant. I’ve written about this before, though I’ve had some trouble articulating it as well as I’d want to, about how I think the implied greediness is not very good for learning the kind of complex representations that we probably need to learn. I’m not saying it’s impossible, and I use a lot of modern deep learning based on gradient descent in what I do. But I come originally more from the evolutionary computation side of things, and I still think that the path of making larger stochastic changes and letting them play out, to be evaluated over a longer time, has the advantage that you can learn things that gradient descent would struggle to learn, because gradient descent is always being pushed around by the data points, essentially. So this is one reason. In general, the reinforcement learning methods that we know and love, well, that we know and use currently, are extremely prone to just taking the simplest way out, the simplest way to get good performance. And this is not a controversial point of view; this is generally accepted.
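The “locality-sensitive hashing” view of an overfit policy can be illustrated with an explicit nearest-neighbour lookup: it reproduces training behaviour exactly, but an off-distribution state just snaps to whatever memorized state happens to be closest. All the data here is made up for illustration.

```python
def nearest_neighbor_policy(memory):
    # memory: list of (state_vector, action) pairs "memorized" in training.
    def act(state):
        def dist(entry):
            s, _ = entry
            # Squared Euclidean distance in state space.
            return sum((a - b) ** 2 for a, b in zip(s, state))
        return min(memory, key=dist)[1]
    return act

memory = [((0.0, 0.0), "left"), ((1.0, 1.0), "right")]
policy = nearest_neighbor_policy(memory)

on_dist = policy((0.0, 0.0))   # exactly a training state
off_dist = policy((0.9, 1.2))  # unseen state snaps to the nearest memory
```

Nothing about the lookup generalizes in any meaningful sense; it only interpolates by proximity, which is the failure mode Julian is describing when any aspect of the environment shifts.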
And knowing this, the reason for the extreme variation of conditions that you get through procedural generation is basically that you need to explicitly vary everything you don’t want reinforcement learning to use as a crutch. In a sense, the relevant question is not really why does it overfit, why does it learn these very specific solutions, but how can you make it ever do anything else? Because that doesn’t come naturally to a gradient descent algorithm. It will naturally move towards the simplest solution in sight, and that is almost always some extreme form of overfitting. Your question was also about whether there are other reasons for this, right? Well, maybe we’ll just quickly meditate on that. I’m trying to tease it apart in my mind, because one aspect of it is shortcut learning, which is that you get exactly what you optimize for, at the cost of everything else. These models are trained monolithically with a single objective, and Ken opened my eyes to this idea that a system with a panoply of objectives has no objective at all. But I’m still thinking that the way all neural network architectures work is that you’ve got the MLP backend, which is the hash table, and then a whole bunch of architectures for diffusing information with successive equivariant layers. So you’re kind of spreading the information out into those buckets. With that in mind, how could it do anything other than memorize the environments? Maybe we should bring in the NetHack Challenge. Apparently the reason it didn’t work for reinforcement learning and did work for symbolic methods is that it wasn’t possible to memorize the environment, because the environment was different every time. Yeah. When I saw the NetHack Challenge, I thought, this is a great challenge. One of my PhD students was really interested in working on it.
Though we haven’t really put any real effort into it yet. We may still do, but I immediately thought that there’s no way we’re going to see reinforcement learning doing well on this, not anytime soon. The environment is also not represented in a way that a neural network would like; it doesn’t naturally come in such a representation. And of course, the variation makes it really hard. So yes, this is very much a symbolics-friendly environment, which is good. I hope people keep working on it. Well, actually, I just wanted to ask you guys, because I’ve only read a little bit about it, but my understanding was that neither approach did well. The symbolic approach was just better, but it wasn’t exactly good; it was far from good. Is that pretty much accurate? There’s really not a good method yet. Well, I think it comes back to what we were saying at the beginning about what intelligence is. The symbolic approach basically was a representation of the game, as if you’d coded it up with classes and methods and so on. So is that really intelligence? If you can take a game and represent it so well that you can keep it in memory and try executing things, you basically have the game with a solid forward model, and searching over that is almost always the best way to play it. It’s just that many games are so computationally demanding that you can’t easily do that. But yeah, when we put the Mario AI challenge online back in 2009, we thought this was going to be a great challenge and people were going to work hard on it. And then Robin Baumgarten basically solved it within the first few weeks, by doing exactly this: using the game as its own representation and then searching cleverly.
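The “game as its own forward model” idea Julian credits for that solution can be sketched very compactly: if you can copy the game state and step the copy, even greedy one-step simulation plays well. The tiny game below is hypothetical, chosen only to make the planning loop visible.

```python
import copy

class ToyGame:
    # A hypothetical one-dimensional game: walk from 0 to the goal.
    def __init__(self):
        self.pos, self.goal = 0, 5

    def step(self, action):  # action in {-1, +1}
        self.pos += action

    def score(self):
        return -abs(self.goal - self.pos)  # closer to goal is better

def plan_one_step(game, actions):
    # Simulate each action on a copy of the game (the "forward model")
    # and return the action whose simulated outcome scores best.
    def simulate(action):
        sim = copy.deepcopy(game)
        sim.step(action)
        return sim.score()
    return max(actions, key=simulate)

game = ToyGame()
trajectory = []
for _ in range(5):
    a = plan_one_step(game, [-1, +1])
    game.step(a)
    trajectory.append(a)
```

Real solutions like the A*-based Mario agent search many steps deep rather than one, but the principle is the same: the simulator does the work that a learned model would otherwise have to approximate.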
And, well, classic AI methods like planning, combined with this sort of forward simulation, are just extremely good in many cases. The question is, can you always get the right representation? So could we contrast this, because we were just getting to the contrast between the monolithic reinforcement learning approaches, where there’s a single reward, and Ken would say that’s keeping all of your eggs in one basket, versus approaches that utilize diversity preservation. What do we get there? I think the quality diversity paradigm is extremely powerful and will eventually eat all the others, more or less. In quality diversity, you may be doing an optimization task, or you may not, but crucially, you are exploring the space and trying to find diversity along some measures. In the original novelty search, you simply have a distance measure. In MAP-Elites, you have several behavioral descriptors; we often call them measures or metrics, there’s some terminological confusion here. And you want to find solutions that are evenly spread out along these measures, and you may also want to find solutions that are good at every point along those measures. In almost every problem where you’re trying to optimize something, you’re not actually looking for the one best solution. You’re looking for a large number of different solutions that differ along some dimensions you care about, and then you want to pick. Even in cases where you just want one solution, you probably want a large amount of diversity in the process to make sure you actually find the best one. And I think this realization has yet to diffuse widely into the AI and machine learning community. It’s not there yet. But we’ve seen use cases for it everywhere. If you’re trying to learn a policy to play a game, you probably want many different policies.
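MAP-Elites, as Julian describes it, keeps a grid of cells indexed by behavioral descriptors and stores only the best solution found per cell. This is a toy sketch under stated assumptions: solutions are 2-D vectors, and both the fitness and the descriptor are simple functions chosen only for illustration.

```python
import random

random.seed(0)

def fitness(x):
    return -(x[0] ** 2 + x[1] ** 2)  # maximize: best near the origin

def descriptor(x, bins=5):
    # Map behavior into a discrete grid cell along two measures.
    clip = lambda v: max(-1.0, min(1.0, v))
    return (int((clip(x[0]) + 1) / 2 * (bins - 1)),
            int((clip(x[1]) + 1) / 2 * (bins - 1)))

archive = {}  # cell -> (fitness, solution): one elite per cell

def try_insert(x):
    cell = descriptor(x)
    if cell not in archive or fitness(x) > archive[cell][0]:
        archive[cell] = (fitness(x), x)

# Seed with random solutions, then repeatedly mutate existing elites.
for _ in range(20):
    try_insert([random.uniform(-1, 1), random.uniform(-1, 1)])
for _ in range(500):
    _, parent = random.choice(list(archive.values()))
    child = [v + random.gauss(0, 0.2) for v in parent]
    try_insert(child)

# The result is not one best solution but a grid of diverse elites.
n_cells = len(archive)
```

The key property is exactly what Julian emphasizes: the output is a spread of good-and-different solutions across the descriptor space, not a single optimum.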
And you want to be able to choose among them, or combine them in some sense, and to adapt rapidly and switch from one to another. If you want content generators, you probably want several different ones, so you can get a wide variety of content styles. If you want a model that helps you predict who to hire, you probably want different ones so you can explore the trade-offs, and so on. For almost every case where you can imagine wanting machine learning optimization, you probably want to use a quality diversity method instead. Luckily, there’s a lot of development going on in taking the core ideas behind things like novelty search and MAP-Elites, which at their core are essentially classic evolutionary algorithms where working with diversity is the main thing, and combining them with modern gradient-based methods to get the best of both worlds. We developed something called Covariance Matrix Adaptation MAP-Elites, which is a modern evolution strategy inside the MAP-Elites structure, and we’re seeing that this is like the default thing we throw at basically everything. Wow. You previously commented on the contrast between gradient-based methods and, it’s not necessarily what you’d call evolutionary, but it’s kind of like knocking something out of its zone intentionally, which tends to be not what gradients do. In light of what you just said about quality diversity, is that still a relevant distinction, do you think? Does it really matter what the underlying optimization method is? You can just use a different gradient, let’s say, to push it one way or another, because you’re talking about using gradient methods with these quality diversity algorithms. Yeah.
So gradient methods obviously have their advantages; in particular, they are efficient. And I think you can combine them with quality diversity thinking, in ways that may be partially gradient-based or completely gradient-based. But I do think there is still an advantage to doing undirected mutation at various depths. The learning algorithms of the future will almost certainly operate at multiple different scales: some scale where you’re doing undirected mutation, some scale where you’re doing gradient descent search. And throughout this, you’re trying to maintain some kind of diversity. You may have one or zero or many objectives, and the number of dimensions of variation might matter: dimensions may collapse as you search, and new dimensions may appear. And this may be partially directed by human users, where we point at something and say, the difference between this thing here and this thing over there is an interesting difference, keep that in mind as you continue searching, for example. But I think undirected mutation definitely has a role to play in there. So I’m waiting for all these insights to coalesce. I think there’s also movement from the other side, so to speak, from the side of people who haven’t yet converted to the quality diversity approach. In particular, when it comes to reinforcement learning, finding leagues of agents, teams of collaborators or adversaries. Like AlphaStar’s leagues, I think they’re called: you have leagues of different collaborators and different adversaries, and new agents need to be tested against all of these, for example.
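The “multiple scales” idea Julian sketches, large undirected jumps on the outside and fine-grained gradient descent on the inside, can be shown on a deliberately deceptive toy objective. Everything here is a made-up illustration: the function, the learning rate, and the jump size are arbitrary choices, not a recommended recipe.

```python
import random

random.seed(1)

def f(x):
    # Global minimum at x = 3; a shallower, deceptive local minimum near x = -2.
    return min((x - 3) ** 2, (x + 2) ** 2 + 1.0)

def local_descent(x, lr=0.05, steps=200, eps=1e-4):
    # Finite-difference gradient descent: the small-scale, directed search.
    for _ in range(steps):
        grad = (f(x + eps) - f(x - eps)) / (2 * eps)
        x -= lr * grad
    return x

best_x, best_val = 0.0, f(0.0)
for _ in range(100):
    # Large-scale, undirected jump: knock the solution out of its basin...
    candidate = best_x + random.gauss(0, 4.0)
    # ...then let the fine-scale search play out before judging it.
    candidate = local_descent(candidate)
    if f(candidate) < best_val:
        best_x, best_val = candidate, f(candidate)
```

Pure gradient descent from the start point falls into the deceptive basin and stays there; the occasional big stochastic change, evaluated after it has played out, is what finds the better basin.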
These, by the way, are classic ideas from competitive and cooperative coevolution that have been rediscovered over time. So I think many streams of thought are pushing towards this realization. I recommend folks read Ken’s book as well. When I read Ken’s book and learned about this notion of deception in search spaces and the false compass, it absolutely changed my perception of everything, because I now realize that you need to keep your options open. But I wanted to touch on one other thing as well. I’m quite interested in this notion of empiricism versus rationalism. One of our guests, Walid Saba, rails against deep learning because he cites Chomsky’s poverty of the stimulus argument, which is basically that kids must have innate cognitive templates because they can’t possibly learn enough by the time they’re four years old. But not only that, there’s also a notion that you’ve raised: if you look at Einstein’s theory of relativity, for example, it’s arguably impossible to derive it from data alone; you need some kind of higher-level cognition. And then you made a really interesting move, which I didn’t quite understand, so I hope you can explain it now. You said that these evolutionary-type algorithms are akin to rationalist-type thinking, as opposed to empiricist-type thinking. I don’t understand that. No, I’m not sure I understand it myself, but I still think there’s something there. I think that in gradient descent, you are fundamentally pushed around by the data. Data points appear, and the causality is that a data point pushes your hypothesis, if you want to use that terminology, or your model, in some direction. That’s very similar to how, in classic empiricism, the sense data are supposed to literally make impressions on your mind. This is what’s happening in gradient descent-based deep learning.
Evolutionary computation is based on random change, in its most basic form. Now, that random change happens at another level: it happens at the hypothesis level. You’re randomly changing a hypothesis, and then you’re testing it; the fitness evaluation is the testing of your hypothesis. Now, what people have raised against this, I think Jeff Clune at some point basically said, just as you did, I don’t understand this, because hypothesis formation isn’t random. And that is correct: hypothesis formation is not typically random. Sometimes it is. Sometimes people say, you know, this doesn’t make sense, but let’s try it anyway. That’s my favorite kind of science: this is a stupid idea, so let’s try it. But it points to the fact that there are other options. In evolutionary computation, you can have very many different forms of mutation, and throughout the history of the field, people have tried all kinds of things. For example, what very often happens in real-life hypothesis formation is pattern matching. I have seen another theory that looks like this, so I’m going to take the pattern of that theory and overlay it on what I think about my own field. Now I have a new hypothesis, and then I’ll go out and test it. So this could be a form of mutation that does pattern matching, and that is entirely viable. And I think this is a more rationalist conception of search. Okay, this gets harder to apply intuitively to neural network search spaces, because network weights don’t really mean anything individually. But imagine you’re evolving a decision tree. You may know that there’s a particular kind of tree structure, because you’ve seen another decision tree where a hypothesis formed this way, and you may want to apply that structure to some part of your own tree.
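Julian’s pattern-matching mutation can be sketched on exactly his decision-tree example: instead of a purely random change, graft a substructure seen in another (donor) tree onto the candidate. Trees here are nested tuples, and the whole thing is an illustrative toy, not a real genetic programming system.

```python
import random

random.seed(2)

def subtrees(tree):
    # Enumerate all internal nodes of a nested-tuple tree
    # of the form (feature, left, right); leaves are strings.
    if isinstance(tree, tuple):
        yield tree
        _, left, right = tree
        yield from subtrees(left)
        yield from subtrees(right)

def pattern_mutate(candidate, donor):
    # "I have seen another theory that looks like this": pick a pattern
    # from the donor and overlay it on a random position in the candidate.
    pattern = random.choice(list(subtrees(donor)))
    def replace_random(tree):
        if not isinstance(tree, tuple) or random.random() < 0.5:
            return pattern
        f, left, right = tree
        return (f, replace_random(left), right)
    return replace_random(candidate)

candidate = ("x>0", "A", "B")
donor = ("y>1", ("z>2", "C", "D"), "E")
mutant = pattern_mutate(candidate, donor)
```

The mutation is still stochastic in where it applies, but the content of the change is borrowed structure rather than noise, which is the rationalist flavor Julian is gesturing at.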
And that becomes a hypothesis, which you then try by fitness-testing it. It’s not a perfect analogy, but I do think there’s something there, and it helps underscore the limitations of pure gradient descent search. I am much more certain of the connection between classic empiricism and gradient descent than I am of the connection between rationalism, or critical rationalism, and evolution. But I think there’s a way of making them meet. You started off by invoking randomness, and it’s not random, is it? This is something in Ken’s work that fascinates me as well: Ken has this notion of interestingness. And the problem is, when we define creativity, and we keep coming back to that definitional problem, there is no definition. Right, what do we mean when we talk about open-endedness and creativity? I think Ken was trying to formalize it in the sense of accumulating information, like in natural evolution, where there’s an arrow of complexity. Would you agree with a formalism like that, or how would you define it? Is there an arrow of complexity? In your abandoning-objectives paper, I think you were one of the first researchers in the ML world to validate, or at least take seriously, this idea from biology that there’s an arrow of complexity, of increasing information, in natural evolution. Yeah, I think the argument is about this idea that, as opposed to following a gradient towards improving performance, which is what you normally do in machine learning, we could be trying instead to increase the amount of information in the system, which is a different kind of, I guess you could call it a gradient, but it’s a different kind of thing to be searching for.
And so I guess you could look at divergent searches, novelty-driven searches, or interestingness-driven searches as effectively doing that: they basically increase information in some way. And if you look at evolution over eons, you could see it as uncovering more information about, well, the question is about what? About nature, in some sense, about all the things you can do: you can fly, you can walk, you can amble, you can move through water, you can do photosynthesis, you can think. It’s like an index of possibilities opened up by physics, and they’re being unveiled because of this kind of search that isn’t necessarily objectively driven. And so that’s an arrow of complexity, in the sense of getting more and more abilities catalogued in this archive, in some way. And it’s both population-level and individual. As individuals, we know all the planets in the solar system and things like this; that’s definitely an accumulation of information relative to our ancestors. So it’s something like that. Yeah, that sounds about right. And I do think that as we work on open-ended systems, one thing that’s really crucial is that we think of the thing that is developing in an open-ended system as not just the agent, which is a somewhat arbitrarily defined entity, but the whole ecosystem. Because the reason we can do things that seem so intelligent is that we built this whole civilization that enables us to do all these things that seem so intelligent. A human is extremely free in their actions; the branching factor of a human living in the modern world is insane. And the reason we could build this complicated civilization is the extreme complexity of life on Earth, which in turn depends on previous life on Earth, and so on.
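The “archive of abilities” framing above is essentially what novelty search implements: rather than rewarding performance, reward being far, in behavior space, from everything seen so far. This is a minimal toy sketch; behaviors are 2-D points and all the specifics are assumptions made for illustration.

```python
import random

random.seed(3)

def novelty(behavior, archive, k=3):
    # Mean distance to the k nearest behaviors already in the archive:
    # high novelty means this behavior fills a gap in the catalogue.
    if not archive:
        return float("inf")
    dists = sorted(
        sum((a - b) ** 2 for a, b in zip(behavior, other)) ** 0.5
        for other in archive
    )
    return sum(dists[:k]) / min(k, len(dists))

archive = []
population = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(10)]

for _ in range(50):
    # Select the most novel individual, archive it, and mutate it.
    best = max(population, key=lambda b: novelty(b, archive))
    archive.append(best)
    population = [(best[0] + random.gauss(0, 0.3),
                   best[1] + random.gauss(0, 0.3)) for _ in range(10)]

n_found = len(archive)
```

The archive here plays the role of the accumulating index of possibilities: it only grows, and selection pressure always points away from what has already been catalogued.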
So in that sense, that you’re sort of implicitly creating an archive of complexity through all the living things, that makes a lot of sense. But I guess the question then is, is there also an arrow of complexity in terms of individual organisms, having acknowledged that the organism has a somewhat arbitrary boundary? That’s less certain. Are we more complex than other creatures? Are we more complex than our ancestors, our grandfathers, or whatever? That’s not obvious. Is it necessarily true? Probably not. Maybe what this points to is how important it is to think of intelligence and capability at a population level. I’m curious, getting to individuals, just to understand your viewpoint. I agree that in some lineages it’s pretty ambiguous whether complexity increases or not; we could debate it. But in the human lineage, are you saying that it’s not clear, for example, that humans are more complex than flatworms? Because to me it seems pretty reasonable to think that. Would you contest that? Well, I’m sure I could find some measure by which it is false. There’s yeast, for example, which is extremely complex, and then you look at it, and it’s just yeast, for ale and bread and everything, but that’s kind of a stupid measure of complexity. So would I contest it? I would qualify it. I would say that we are certainly more complex, in the dimensions we care about, than basically all our ancestors. Are there other measures of complexity by which we are not more complex, or even less? Probably. Can I produce one right now? No. Maybe because I’m interested in inhuman kinds of things, but mainly this is a silly objection. To me, it just makes more sense that there is an arrow of complexity at the population or ecosystem level, which I think there definitely is.
And I’m a bit more confused at the individual organism level. Yeah, I mean, somehow it relates to the issue of the definition of complexity, because it’s not well defined here, and so I’m just going with intuition: I feel more complex than a flatworm. But yes, if you play with the definition, it gets confusing, though I don’t know if it’s that useful to play with it. No, true. And to be clear, I’m also looking for complexity. When we create an agent that plays a game, or generates a game level, the kinds of things I typically do, I’m also looking for complexity. I want it to do things that I couldn’t have predicted it would do. Wow, yeah, that’s the kind of result we’re looking for. But in many cases, when we’re looking at general intelligence, I feel, on an intellectual level, though I’m not sure I practice it, that we should be looking more at the population or ecosystem level. How can you find a wide variety of organisms that work together to create something complex? I think that’s really interesting, because it is true that a lot of AI today, mainstream machine learning, is focused on an individual: there’s just one thing that’s getting better. But this notion of a population, there’s a lot of power in it, I think. And I wonder if you could say a little more: why do you think a population adds something important, rather than just optimizing one thing? I think a population, in particular when it’s heterogeneous, when it’s not all the same, is interesting because the different members of a population or an ecosystem create the conditions for each other to be intelligent. So intelligence is always relative to a domain.
And all organisms we know of are adapted to a particular domain; we are intelligent because of the world we built. Now, if you have an ecosystem of different organisms behaving in some environment, they can create that for each other. I don’t know how to do this, and I don’t know that anyone really has good ideas of how to do this. There’s a long history of artificial life experiments, like Tierra, Avida, and so on, and there’s a bunch of more graphical alife worlds and so on. These were all kind of limited in what they were actually able to do, and kind of unsophisticated in their methods. Now, someone will probably come and hit me over the head and say, actually, here is this fantastic experiment that you didn’t take into account, but I think this is a big unexplored territory. In particular, as we work on open-ended learning, which is definitely one of my main interests, we’re still looking at, you know, one single agent in one single environment. Or there’s a population of agents, but they’re all trying to solve the same task, or they all have the same affordances, and so on. I think there’s a wide-open field of things out there that we have not explored in any depth. One thing that really fascinates me is that a lot of the interesting phenomena happen at a different level, a different rung of the emergence ladder, if that makes sense. And I’m starting to see this everywhere. Even at work, I’m building a code review platform, and at the low level, the metrics are obvious. I know Ken talks about the tyranny of metrics, by the way, but, you know, it’s how many code reviews has an engineer done, how many customers do I have? It’s easy. And then I start going up the levels of abstraction, talking to the senior leaders.
And now I’m starting to use much more abstract language, like vertical information flows and trust and engineering culture, and all of a sudden it’s impossible for me to quantify it. And if I do, I’m making it up. And it’s the same thing with these population-scale phenomena you’re talking about. I’ve got all of these intelligent agents doing things, and now I’ve got a meta-optimization problem, right? I want to encourage interesting phenomena at the emergent scale. So I might say, well, this type of thing is interesting, I want more of that. But I’m reaching, because I don’t know how to describe it. I don’t know either. I mean, one way to phrase this is that the kind of research we’re outlining here might be hard to get into NeurIPS, because you don’t necessarily have a graph where the number goes up and a benchmark to beat. I’m not saying that every paper in NeurIPS is like this, but let’s say that this is the norm, right? I think one of the characteristic things that separates a population from just an individual is specialization versus generalization. I think population-driven algorithms are implicitly more about specialization a lot of the time, because you want each member of the population to be doing some different thing, so they’re kind of becoming specialists. But I think there’s a huge amount of generalization snobbery within machine learning. We’re looking for the ultimate generalist all the time: you get it to do all the tasks you can possibly do, and then throw more in, and the dataset just gets bigger, and we’re all very impressed with that.
And a population implies, I feel, something different in spirit, because it’s more like, actually, I want to see a lot of different things, hyper-specializations to all kinds of exotic things that the generalist probably won’t do, basically because it cares about being general. Yeah, what do you think? Do you think that’s also part of why populations are appealing? That’s true. And this comes down to the focus on a particular level of abstraction, or level of agency: this is the level of the agent, and this is what we care about. If you want a generalist at the level of the agent, maybe you should want a generalist at the level of the ecosystem, with a population. One interesting thing is, if you look at very boring, so-called boring-but-useful machine learning, the stuff that keeps winning all the Kaggle competitions, especially with structured data, it’s ensemble methods such as XGBoost. And interestingly, they solve problems by splitting things up: not having a single agent or single model, but a bunch of different models, enforcing diversity in a very different way than quality diversity methods, by varying the dataset, and then combining them together. So in a sense, it is a population that is solving the problem. So there’s a convergent line of thinking. Julian, it’s been such an honor; I’ve really, really enjoyed this conversation. Thank you so much, and Ken as well. For me too, this was great. Thank you for giving me the opportunity to be here.
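The ensemble point Julian closes on, diversity enforced by varying the data each member sees, is easiest to see with bagging: many simple models, each trained on a bootstrap resample, combined by majority vote. This is a toy sketch of bagging in general, not of XGBoost specifically (boosting varies the data by reweighting rather than resampling).

```python
import random

random.seed(4)

def train_stump(data):
    # "Model" = a single threshold on x, chosen to best match the labels.
    best_t, best_acc = 0.0, -1.0
    for t in [x for x, _ in data]:
        acc = sum((x > t) == y for x, y in data) / len(data)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def bagged_ensemble(data, n_models=25):
    # Each member sees a different bootstrap resample: diversity by data.
    return [train_stump(random.choices(data, k=len(data)))
            for _ in range(n_models)]

def predict(ensemble, x):
    votes = sum(x > t for t in ensemble)
    return votes > len(ensemble) / 2  # majority vote

data = [(x / 10, x / 10 > 0.5) for x in range(11)]  # true rule: x > 0.5
ensemble = bagged_ensemble(data)
pred = predict(ensemble, 0.9)
```

No single stump is trusted; the prediction comes from the population, which is exactly the sense in which Julian says the ensemble, not the individual, is solving the problem.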