Mayalen Etcheverry is a Machine Learning researcher at Inria, University of Bordeaux. A second-year PhD candidate, she is part of a long-term research program working on autonomous developmental learning at the frontiers of AI and cognitive sciences. Together with her team, the FLOWERS group, she leverages ML techniques along with developmental psychology and neuroscience to find applications in robotics, human-computer interaction, automated discovery and educational technologies.
Etcheverry speaks with GoodAI Research Scientist, Nicholas Guttenberg, about developing curiosity-driven models aimed at enabling agents to generate their own self-organizing learning curricula. Her upcoming talk at the ICLR 2022 workshop will address how metadiversity search, curriculum learning, and external guidance (environmental or preference-based) can be key ingredients for shaping the search process.
This interview has been edited for clarity.
All right, well, welcome. This is an interview with Mayalen Etcheverry, who is going to be giving an invited talk at the Cells to Societies workshop at ICLR. Mayalen is a second-year PhD candidate at Inria with the FLOWERS group, if I have that right.
The topic of the research involves things such as curious AI, open-endedness, metadiversity search and automated goal generation, and intrinsic motivation. So I’m curious — you talk about the potential for open-endedness in your workshop abstract. Do you want to say what sort of things you intend by that or where you see that happening?
Oh, yeah, sure. Well, first, thank you for inviting me to this workshop. When we talk about an open-ended system, in general, we mean a system whose behavior does not converge over time. It’s a system which keeps surprising you and producing new and interesting outcomes. From a computer science perspective, if we were able to design a computer program or an AI that fit this description, well, I guess it would be very desirable, of course, across many practical domains.
One sort of open-endedness that I’m really interested in in my work and for which I believe that AI can play a big role is with respect to our scientific discovery practice in complex systems. There are many complex systems that are studied by scientists at the bench – chemists are trying to discover new drugs or new materials. Biologists are trying to bio-engineer functional tissues or organs.
Then you also have computer scientists, like my team, who are exploring complex numerical systems such as cellular automata models – for instance, to inform theories about the origin of life or intelligence. I’m talking about complex systems because I think that they are great candidates for open-endedness, in the sense that they offer us unbounded possibilities for emergence. But at the same time, they’re very hard for us humans to explore and navigate.
I always take the Game of Life as a good example. It’s a very simple numerical system where we know all the rules. It’s fully deterministic and it was created more than 50 years ago. But players are still discovering new and increasingly complex patterns in it, running it for hours, for instance, on supercomputers. It shows how difficult our discovery practice really is in such a system.
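To give a sense of how small the rule set is, here is a minimal Python sketch of one Game of Life update. The function name `life_step` and the set-of-coordinates representation are choices made for this illustration, not something taken from the interview.

```python
from collections import Counter

def life_step(live):
    """Advance Conway's Game of Life by one generation.

    `live` is a set of (row, col) coordinates of living cells on an unbounded grid.
    """
    # For every cell adjacent to a live cell, count its live neighbours.
    neighbour_counts = Counter(
        (r + dr, c + dc)
        for (r, c) in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # A cell is alive next step if it has exactly 3 live neighbours,
    # or if it is currently alive and has exactly 2.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }

# The classic "glider" pattern.
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}
state = glider
for _ in range(4):
    state = life_step(state)
print(sorted(state))
```

After four generations the glider reappears with the same shape, shifted one cell diagonally. The entire system fits in a dozen lines, yet the space of patterns it can produce is still being explored.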
I think that the same trend holds for chemical and synthetic biology systems. There is even this sort of trend or law, which is actually interesting. I think it’s the reverse of Moore’s law. Even though we have this exponential increase in technology, research, and computation – we are getting better and better at doing chemical tests and running them massively in parallel – paradoxically, instead of an exponential increase making discovery much faster and cheaper, discovery is actually getting much slower and much harder.
So that’s, I think, a very interesting tendency. You could even say that as a society, we’re starting to converge to this extreme. I’m not saying that scientific discovery is not open-ended. But I think that maybe we are reaching a point where the range of problems or scientific challenges that we are trying to tackle is so complex that we really need new tools to help us tackle them, to avoid this convergence point and to unlock new possibilities. So I think that that’s what I mean when I talk about open-endedness – discovering the system, unlocking the possibility for discovery in complex systems.
You had some recent work with Lenia where you were able to use this method to discover all sorts of behaviors that would be quite hard to hand-design.
Exactly. We are working mainly on numerical systems so far, especially Lenia. In our latest work, we were able with machine learning tools to discover new forms of self-organized agencies – which really look like small life forms – that are moving around in the cellular automaton. And that’s something that has been really hard to find by hand or by random exploration or by optimization methods. It’s really not an easy problem.
I wonder if I should show some of the videos from your recent work – we can include a link. One of the things I’m curious about with regards to open-endedness is that in a lot of the examples you gave, you have a purpose in mind, like drug discovery; there’s a demand on the human side for what we want the system to find. But at the same time, there’s something more innate to the system that it wants to discover, some path it’s going to have the most success following, or even cases where it has to do something we wouldn’t recognize as important in order to get to interesting states. Do you find any kind of tension between those things in your work? Or do you have any way of resolving that?
Sure, so indeed, there’s this problem that, especially when we are using machine learning programs, at the moment we tend to strongly define the tasks that we want our system to solve. So we are trying to get away from this overly strong task specification that you typically have in optimization, where you only define, for instance, a scalar reward and you want your system to converge to the peak of the fitness landscape.
So we have been proposing – in previous works – some ways to escape these strong constraints. But at the same time, as you said, we don’t want the system to wander randomly and do things that are not interesting for us. Because in the end, we as humans are the ones evaluating the discoveries. And so we want something that fits our preferences. It’s not an easy balance to manage: humans have preferences, but at the same time, we don’t know how to define them and compute a quantity out of them.
And so in one paper we introduced, this metadiversity paper, for instance, we have this kind of interactive setup where the agent should boldly explore the space, but at the same time show its discoveries periodically to a human, and then the human can give some feedback to guide the exploration back.
So you mentioned metadiversity. There’s flat diversity, quality diversity, where it’s just trying to explore the space. But with this metadiversity idea, do you want to explain that? What sort of difference is there between that and novelty search?
Yeah, sure. So this metadiversity idea is something that we came up with when we were trying, in practice, to formulate exactly this problem of building an open-ended form of discovery assistance in complex systems. As you mentioned, the first idea, or first family of machine learning algorithms, that came to mind was this novelty search type of algorithm: algorithms that come from evolutionary or developmental robotics.
That seemed to be a good fit compared with optimization methods because instead of trying to optimize the fitness, it would try to optimize the novelty of the discoveries. So we started with that, but a critical, and well-known, limitation is that these algorithms still assume – at least in their standard definition – that you have some sort of oracle representation: one that will basically take your system’s raw observations and describe the interesting degrees of behavioral variation in your outcome.
So I like to see this representation as taking a lens from the system. It’s like putting on a pair of glasses and only looking at those few factors of variation that can emerge in your system and then trying to find the most novelty and diversity within that lens.
And so I guess a single lens makes sense, or has been more successful, for simpler physical systems like robotics. We have some experiments in the FLOWERS team where, for instance, you would put a torso robot with an arm at a table, and it could manipulate, for instance, a discrete set of objects in the room. And so here it makes sense, instead of looking at the raw image, to look at the position of the objects. And then if you find novelty in this behavioral space, it means that your robot is learning to grasp objects and move them around.
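As a rough sketch of what scoring novelty through a single lens can look like, the Python below uses one common formulation: the mean distance to the k nearest past outcomes in the lens space. The names `position_lens` and `novelty`, the observation fields, and the choice of k are all assumptions made for this illustration, not details from the FLOWERS experiments.

```python
import math

def position_lens(observation):
    """Hypothetical lens: keep only the (x, y) position of one object."""
    return (observation["x"], observation["y"])

def novelty(candidate, archive, lens, k=3):
    """Mean distance, measured through `lens`, to the k nearest archive entries."""
    if not archive:
        return float("inf")  # anything is novel when nothing has been seen yet
    z = lens(candidate)
    distances = sorted(math.dist(z, lens(past)) for past in archive)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)

# Past outcomes: the object stayed close to one corner of the table.
archive = [
    {"x": 0.0, "y": 0.0, "brightness": 0.9},
    {"x": 0.1, "y": 0.0, "brightness": 0.1},
    {"x": 0.0, "y": 0.1, "brightness": 0.5},
]
near = {"x": 0.05, "y": 0.05, "brightness": 0.0}
far = {"x": 1.0, "y": 1.0, "brightness": 0.5}

print(novelty(near, archive, position_lens))  # small: close to past outcomes
print(novelty(far, archive, position_lens))   # large: the object moved somewhere new
```

Note that the `brightness` field never influences the score: whatever the lens discards is invisible to the search.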
So it would make sense to look at this only through this lens of the system. But in the context of Lenia where we have this raw video of pixels, there is no one unique lens that you should use. There are tons of lenses that you can use to describe the system. And so it’s not trivial first to define one lens and not trivial also to know the impact that using this lens will have on your flat novelty-search like discovery. To test that, we did this interesting experiment where we tried to run the same equivalent of a novelty search algorithm, but with different lenses.
So one lens was hand-defined with statistics from the original Lenia paper, and for another lens we used Fourier descriptors. Then we also used an unsupervised, learned lens, in the sense that we pre-trained a VAE on a big database of Lenia patterns. And we even tried a VAE that was trained online, so the lens would not be static anymore, but it would still be fixed in capacity.
And so we ran the algorithm and we had two striking results coming out. If you run the algorithm using lens A and you project the final set of discoveries into the space of lens A, indeed, it’s going to be very diverse. So it shows that novelty search is very efficient at finding new solutions, at least in a system like Lenia, but only within a given state space.
But if you take the same discoveries and project them through lens B, then they achieve very poor diversity, because obviously the variation that makes them diverse in space A disappears. So here the search will have missed other potentially interesting types of diversity. And that was the point of this research experiment. The metadiversity idea came very naturally from it: your open-ended system should not only try to find new solutions within its existing view of the system, but should continuously try to extend that view and incorporate other degrees of variation that could emerge in the system.
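The lens A versus lens B effect can be reproduced on toy data. The sketch below measures diversity as the fraction of occupied bins in a one-dimensional lens space; this bin-coverage measure, the field names `frequency` and `size`, and the synthetic discoveries are assumptions for illustration rather than the metric or data used in the paper.

```python
def coverage(discoveries, lens, n_bins=10):
    """Fraction of equal-width bins on [0, 1] occupied after projecting through `lens`."""
    occupied = {min(int(lens(d) * n_bins), n_bins - 1) for d in discoveries}
    return len(occupied) / n_bins

# Synthetic discoveries from a search that only looked through lens A:
# they spread over every frequency but all share the same size.
discoveries = [{"frequency": i / 10 + 0.05, "size": 0.5} for i in range(10)]

def lens_a(d):
    return d["frequency"]

def lens_b(d):
    return d["size"]

print(coverage(discoveries, lens_a))  # 1.0: every bin is occupied
print(coverage(discoveries, lens_b))  # 0.1: all discoveries fall in a single bin
```

The same set of discoveries looks maximally diverse through one lens and almost uniform through the other, which is the observation that motivates searching over lenses as well as within them.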
That was the intuitive idea behind metadiversity. In this paper, we formulated it as a bi-level exploration loop where you have an agent that, in an outer loop, is trying to learn divergent feature spaces and extend this modular set of feature spaces used to characterize the system, which would be the lenses. And then, in an inner loop, it runs some novelty search in each of those spaces. And something that we emphasize as very important, as you say, is to put a human in the loop to prioritize the state spaces that would be interesting for the human. So that’s how we formulated metadiversity.
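The bi-level structure can be sketched in a few lines of Python. This is a heavily simplified toy under stated assumptions: `toy_system` is a made-up stand-in for a system like Lenia, the outer loop simply activates lenses from a fixed hypothetical list (`CANDIDATE_LENSES`) instead of learning them, the inner loop is a basic goal-directed search, and the human-in-the-loop step is left out.

```python
import random

def toy_system(params):
    """Hypothetical stand-in for a complex system: 2 parameters -> 3 features in [0, 1]."""
    a, b = params
    return (a, b, a * b)

# Hypothetical lenses, each a one-dimensional view of the observation.
CANDIDATE_LENSES = [lambda obs: obs[0], lambda obs: obs[1], lambda obs: obs[2]]

def inner_search(lens, history, rng, n_steps):
    """Inner loop: goal-directed exploration within a single lens."""
    for _ in range(n_steps):
        if history:
            goal = rng.random()  # uniform goal in the lens space
            # Start from the past run whose outcome is closest to the goal...
            params, _obs = min(history, key=lambda run: abs(lens(run[1]) - goal))
            # ...and perturb its parameters, clipped to [0, 1].
            params = tuple(min(1.0, max(0.0, p + rng.gauss(0.0, 0.1))) for p in params)
        else:
            params = (rng.random(), rng.random())
        history.append((params, toy_system(params)))

def bilevel_explore(n_outer=3, n_inner=20, seed=0):
    rng = random.Random(seed)
    active_lenses = [CANDIDATE_LENSES[0]]  # start with a single lens
    history = []                           # (parameters, observation) pairs
    for _ in range(n_outer):
        for lens in active_lenses:
            inner_search(lens, history, rng, n_inner)
        # Outer loop: extend the set of lenses (here, from a fixed list).
        if len(active_lenses) < len(CANDIDATE_LENSES):
            active_lenses.append(CANDIDATE_LENSES[len(active_lenses)])
    return active_lenses, history

lenses, history = bilevel_explore()
print(len(lenses), len(history))
```

In the approach described above, the outer loop would learn new feature spaces from the discoveries made so far, and human feedback would weight which spaces get explored; both are replaced by placeholders here.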
So the relationship between the levels — it’s like if I imagined, let’s say, you just took every single pixel, you’d have this huge dimensional space. You would never really fill it. Or everything would be diverse automatically because everything would be different in some pixels and they wouldn’t necessarily be meaningful differences. But when you construct the space, hierarchically, do you find that the kind of lenses you get are special in some way you wouldn’t have been able to get if you tried to jump straight to them? Do the lenses themselves tell you something about the system?
So first of all, you could ask, why not work right away in the huge-dimensional space of pixels? But that’s known not to be efficient for a novelty search type of algorithm. You need lower-dimensional spaces. And then indeed, in this paper, we built them hierarchically. But that was one possible implementation of metadiversity search, and I guess there are many other ways that you can do so.
But the intuition about doing it hierarchically was that, first you don’t know anything about the system. So you actually want some kind of VAE-like, some coarse compression of it that right away gives you the main factors of variation. Maybe the average intensity or the coarse form of the pattern or something like that. And then you probably want to cluster the discoveries, to start separating them into niches that seem to be more related between them.
And then you want to instantiate a new lens per niche. So probably you will have this niche of texture-looking patterns with very high frequencies, and in another niche you won’t want just the coarse description, but you will want to go into more detail and start seeing that you have some zebra patterns, or wave-like patterns. And then again, you want to cluster them and instantiate new lenses for them. This coarse-to-fine view seems meaningful, but there are maybe other good ways to construct lenses progressively.
So it’s like each niche you’re trying to find the thing about that niche that wants to vary, and then you just ask it to vary more or to vary completely?
Yeah, that’s it.
Do you think that this resolves the lazy way that a lot of intrinsically motivated systems will just fill up entropy from the bottom? Because you’re saying, here’s the direction that you already want to vary and you can explore this, but I’m not going to give you every direction. It’s going to be the top two directions at a time. And then when those are done, you have to find another split before you get to have another direction of entropy to fill up?
That’s a good question. So I guess it depends on the system. In Lenia, as I told you, we had very strong results that generating diversity in some direction was not driving diversity along other dimensions – for instance, we have a very striking example where we use this Fourier descriptor. Basically, we were characterizing frequencies in the image.
At the end of the exploration, we would only have discoveries in Lenia that were purely vertical stripes. So that was interesting. The intrinsically-motivated system had exploited the fact that you’re searching for diverse frequencies, but it would only show you vertical stripes. So indeed, you get all kinds of frequencies, but intuitively it’s not what you expected of diversity. Even if you have all kinds of frequencies, you wanted to have some other types, maybe wavy forms or localized patterns. So you could say it is indeed some sort of laziness problem for intrinsically-motivated systems. And once again, it shows the importance of having good goal spaces, a good way to characterize the system.
Along those lines, do you have some idea? Is there a standard by which you could say this is a good goal for a system to have, or this is a bad goal for a system to have? Is there some kind of objective measure? Or does it really have to be constructed from within the system’s point of view?
So what is a good goal once you’re in this task space that you define? I guess so at least in the FLOWERS team where we’re working a lot in this intrinsically-motivated goal setting system and we generally say that good goals should be around two dimensions. First, as we discussed, a good goal should be novel and interesting. So it should be this kind of creative goal. And the other dimension is that it should be feasible, it should be learnable. So it should not be too easy or too hard.
Personally, in my work I’ve used very basic strategies for all of these dimensions. For novelty, I was simply uniformly sampling in the goal space, with the intuition that if you manage to uniformly cover the space then you should reach maximum diversity. Then for the interesting part, well, we use this basic scoring system where a human can score the state spaces that they are interested in, and so you prioritize goal sampling in those spaces.
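A minimal sketch of those two basic strategies combined: pick a goal space in proportion to a human-assigned score, then sample a goal uniformly inside it. The space names, the scores, and the unit-square goal space are hypothetical values chosen for this example.

```python
import random

def sample_goal(space_scores, rng):
    """Pick a goal space in proportion to its score, then a uniform goal inside it."""
    names = list(space_scores)
    weights = [space_scores[name] for name in names]
    space = rng.choices(names, weights=weights, k=1)[0]
    goal = (rng.random(), rng.random())  # uniform goal in a unit square
    return space, goal

rng = random.Random(0)
# Hypothetical human feedback: localized patterns are three times as
# interesting as textures.
scores = {"textures": 1.0, "localized": 3.0}
picks = [sample_goal(scores, rng)[0] for _ in range(1000)]
print(picks.count("localized"), picks.count("textures"))
```

Uniform sampling within a space aims at coverage, while the scores only shift how often each space is visited, so a low-scored space is still explored occasionally.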
And for the feasible part, in most of my work I didn’t implement anything smart. But in our latest work, which you mentioned at the beginning, we have this basic curriculum idea where we sample goals not too far from the set of goals that we have already reached. But there are, for the three dimensions, many other and more advanced strategies that have been proposed, especially in the FLOWERS team.
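One simple way to read "not too far from the goals already reached" is to perturb a previously reached goal by a bounded amount. The sketch below does that; the function name `sample_nearby_goal`, the `radius` parameter, and the unit-square goal space are assumptions for illustration and may differ from the curriculum actually used in that work.

```python
import random

def sample_nearby_goal(reached, rng, radius=0.1):
    """Sample a goal within `radius` (per dimension) of a randomly chosen reached goal."""
    anchor = rng.choice(reached)
    # Perturb each coordinate, then clip to the [0, 1] goal space.
    return tuple(min(1.0, max(0.0, x + rng.uniform(-radius, radius))) for x in anchor)

rng = random.Random(0)
reached = [(0.2, 0.2), (0.25, 0.3)]
new_goals = [sample_nearby_goal(reached, rng) for _ in range(5)]
print(new_goals)
```

As new goals are reached and added to `reached`, the region from which goals are drawn grows outward step by step.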
For novelty, you could have count-based estimation, or you could train some generative model of the distribution and try to sample goals with probability inversely proportional to their likelihood under that distribution. There are colleagues in my team who are trying to incorporate language in the goal sampling strategy such that a human could somehow communicate goals. That would be very cool. And for the curriculum, you also have more advanced strategies of automatic curriculum learning where, for instance, there is this idea of learning progress. So you can empirically estimate how well you’re doing across different goal regions, and then you sample toward the goals of intermediate difficulty that are neither too easy nor too hard.
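The learning-progress idea can be sketched as follows, assuming competence is logged as a number in [0, 1] per trial and per goal region. Progress is estimated here as the absolute change in mean competence between two consecutive windows, and regions are sampled in proportion to it with a small uniform exploration term; the window size, the `epsilon` value, and the region names are assumptions for this example.

```python
def learning_progress(competences, window=5):
    """Absolute change in mean competence between the last two windows of trials."""
    if len(competences) < 2 * window:
        return 0.0
    recent = competences[-window:]
    older = competences[-2 * window:-window]
    return abs(sum(recent) / window - sum(older) / window)

def sampling_probabilities(history, epsilon=0.2):
    """Mix uniform exploration with sampling proportional to learning progress."""
    progress = {region: learning_progress(c) for region, c in history.items()}
    total = sum(progress.values())
    n = len(history)
    if total == 0:
        return {region: 1.0 / n for region in history}
    return {
        region: epsilon / n + (1 - epsilon) * lp / total
        for region, lp in progress.items()
    }

history = {
    "too_easy": [1.0] * 10,                        # always solved: no progress
    "too_hard": [0.0] * 10,                        # never solved: no progress
    "intermediate": [0.1 * i for i in range(10)],  # steadily improving
}
probs = sampling_probabilities(history)
print(probs)
```

Regions that are already mastered or currently out of reach show no change in competence, so most of the sampling budget goes to the region where the agent is improving.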
Oh, great. Well, thank you for your answers and for giving us the time to have this interview and I look forward to your talk.
Thank you Nicholas. It was a pleasure.
Cells to Societies: Collective Learning Across Scales
April 29 2022