Policy Learning with Neural Cellular Automata

Sebastian Risi, IT University of Copenhagen.

Summary

Sebastian Risi of IT University of Copenhagen receives a GoodAI grant for his work with neural cellular automata (NCA) to grow neural network policies.
Open-ended search methods that will be developed in this project are aligned with methods desirable in GoodAI’s Badger architecture.
Risi and GoodAI share the important research goal of scaling to more complex tasks.

GoodAI’s Grants program has awarded Sebastian Risi funding for his research project on neural cellular automata (NCA). Extending previous work that used NCA to generate 3D structures of varying complexity [1], Risi now sets out to apply the approach to grow neural network policies that adapt, in a self-organized way, to different circumstances in complex reinforcement learning (RL) tasks.

Once a policy is developed for the single cell (the update rule), adaptation during the agent’s lifetime will not rely on the more general (but slower) RL algorithms or evolutionary algorithms (EA); instead, the agent will learn by itself through the update rule. Neural cellular automata will be trained to evolve and represent policies instead of rigid structures or bodies. These policies should grow to adapt to an arbitrary given task, without the need to encounter the task in the NCA training phase.

Neural Networks as CA Rules

Cellular automata (CA) are complex computational systems governed by a few simple rules. Used to understand how complexity can emerge from a network of identical nodes, they serve as test beds for more advanced models.

A well-known example of a CA is the Game of Life. Invented by British mathematician John Conway in 1970, the Game of Life is a zero-player game: its evolution is determined by its initial state, without further input from human players. One interacts with the game by creating an initial configuration and observing how it evolves.
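To make the idea of "a few simple rules" concrete, here is a minimal sketch of one Game of Life generation. The function name `life_step` and the choice to store live cells as a set of `(row, col)` tuples on an unbounded grid are illustrative assumptions, not something taken from the project described here.

```python
from collections import Counter


def life_step(live):
    """Advance Conway's Game of Life by one generation.

    `live` is a set of (row, col) tuples marking live cells on an
    unbounded grid (an illustrative representation).
    """
    # For every cell adjacent to a live cell, count its live neighbours.
    neighbour_counts = Counter(
        (r + dr, c + dc)
        for (r, c) in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # A cell is alive in the next generation if it has exactly three live
    # neighbours, or exactly two and it is already alive.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


# A "blinker": three cells in a row that oscillate with period 2.
blinker = {(1, 0), (1, 1), (1, 2)}
```

Calling `life_step(blinker)` turns the horizontal row into a vertical one, and a second call restores the original pattern. The whole behaviour follows from the initial configuration and the local rule.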

CA whose rules are represented by neural networks can be seen to implement ontogenetic dynamics analogous to the developmental process of living organisms. Starting from a single cell, living organisms develop complex functional bodies with specialized organs without the need to explicitly encode all the information in the genome.
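The following is a minimal sketch of what "a CA whose rule is a neural network" can look like, under several simplifying assumptions: a 1D ring of cells rather than a 2D or 3D grid, a tiny two-layer network with random (untrained) weights, and hypothetical names (`make_rule`, `nca_step`, `grow`). It is meant only to illustrate the mechanism of one shared rule applied locally from a single seed cell, not the architecture used in the project.

```python
import math
import random


def make_rule(n_channels, hidden, rng):
    """Random weights for a tiny network shared by every cell (illustrative sizes)."""
    n_in = 3 * n_channels  # left neighbour, the cell itself, right neighbour
    w1 = [[rng.gauss(0, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    w2 = [[rng.gauss(0, 0.5) for _ in range(hidden)] for _ in range(n_channels)]
    return w1, w2


def nca_step(states, rule):
    """Apply the same neural update rule to every cell of a 1D ring."""
    w1, w2 = rule
    n = len(states)
    new_states = []
    for i in range(n):
        # Perception: a cell sees only itself and its two neighbours.
        percept = states[(i - 1) % n] + states[i] + states[(i + 1) % n]
        hidden = [math.tanh(sum(w * x for w, x in zip(row, percept))) for row in w1]
        delta = [sum(w * h for w, h in zip(row, hidden)) for row in w2]
        # The rule outputs a change that is added to the cell's current state.
        new_states.append([s + d for s, d in zip(states[i], delta)])
    return new_states


def grow(rule, n_cells=9, n_channels=4, steps=5):
    """Start from a single non-zero seed cell and iterate the shared rule."""
    states = [[0.0] * n_channels for _ in range(n_cells)]
    states[n_cells // 2] = [1.0] * n_channels  # the seed
    for _ in range(steps):
        states = nca_step(states, rule)
    return states
```

Because the network has no bias terms, a cell whose neighbourhood is all zeros stays at zero, so activity spreads outward from the seed by one cell per step. In a trained NCA the weights would be optimized so that this growth process produces a useful structure.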

Risi proposes to train the parameters of the NCA with evolutionary algorithms and gradient descent-based approaches. He points out that optimizing an NCA to grow a neural network can involve a very deceptive search space, although early results have been promising, with the NCA agent able to solve several OpenAI Gym [2] and PyBullet [3] tasks. One of the grant goals is to evaluate the learned learning policies on continual learning tasks.
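As a rough illustration of the evolutionary side of this training, the sketch below shows a simple (1+lambda) evolution strategy acting on a flat parameter vector. Everything in it is an assumption made for illustration: the names `evolve` and `toy_fitness`, the population size and mutation scale, and above all the fitness function, which is a toy stand-in for the return a grown policy would obtain in an RL environment. It is not the algorithm used in the project.

```python
import random


def evolve(fitness, n_params, generations=200, pop_size=20, sigma=0.1, seed=0):
    """Minimal (1+lambda) evolution strategy.

    Each generation mutates the parent `pop_size` times with Gaussian noise
    and keeps the best child only if it beats the parent.
    """
    rng = random.Random(seed)
    parent = [0.0] * n_params
    parent_fit = fitness(parent)
    for _ in range(generations):
        scored = []
        for _ in range(pop_size):
            child = [p + rng.gauss(0, sigma) for p in parent]
            scored.append((fitness(child), child))
        best_fit, best = max(scored, key=lambda fc: fc[0])
        if best_fit > parent_fit:
            parent, parent_fit = best, best_fit
    return parent, parent_fit


def toy_fitness(params):
    """Hypothetical stand-in for an episode return: closeness to a fixed target."""
    target = [0.5, -0.3, 0.8]
    return -sum((p - t) ** 2 for p, t in zip(params, target))
```

For example, `evolve(toy_fitness, n_params=3)` returns a parameter vector and its fitness. In the NCA setting the parameters would instead be the weights of the update rule, and evaluating fitness would mean growing a policy network and running it on a task. That evaluation is far more expensive and, as noted above, can present a deceptive search landscape.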

Scalability and Open-Ended Search

The NCA approach fits very well into GoodAI’s Badger architecture. Like the experts within a Badger agent, the cells of the growing neural network act as sub-agents that share a policy and communicate only locally. Complementing the CA approach where agents connect on a 2D grid, the agents in Risi’s system will operate directly on a neural network graph representation, trying to learn both how to grow a policy neural network and how to adapt it.

Illustration of a ‘Badger’ agent. A single agent comprises a number of experts (blue/pink circle) that operate according to the same fixed and shared policy (blue circle). Each expert has its own unique internal state (pink circle).

By changing the seed from which the neural network grows, there’s potential to grow neural networks that perform different tasks, all from the same NCA. Integrating this ability within the Badger architecture could bring new directions for continual and lifelong learning, as well as aid the system to scale to more complex tasks.

Elements of active research such as open-ended search methods, self-play, and auto curricula are critical for Risi’s system to develop NCAs that generalize well. These same problems are key to the development of the Badger architecture, which makes this grant project a good match for GoodAI’s Badger architecture research.

To date, most of what we consider general AI research is done in academia and inside big corporations. We believe that humanity’s ultimate frontier, the creation of general AI, calls for a novel research paradigm that fits a mission-driven moonshot endeavor. GoodAI Grants is part of our effort to combine the best of both cultures: academic rigor and fast-paced innovation. We aim to create the right conditions to collaborate and cooperate across boundaries. Our goal is to accelerate progress towards general AI in a safe manner by putting emphasis on community-driven research, which in the future might play a key role in preventing the monopolization of AI technology (see AI race).

If you are interested in the Badger architecture and the work GoodAI does and would like to collaborate, check out our GoodAI Grants opportunities or our Jobs page for open positions!


References

[1] Shyam Sudhakaran, Djordje Grbic, Siyan Li, Adam Katona, Elias Najarro, Claire Glanois, Sebastian Risi. “Growing 3D Artefacts and Functional Machines with Neural Cellular Automata”. Proceedings of the 2021 Conference on Artificial Life (ALIFE 2021).

[2] OpenAI. (n.d.). Getting Started with Gym. gym.openai. https://gym.openai.com/docs/

[3] Maggiolino, G. (2019, October 22). Creating OpenAI Gym Environments with PyBullet Part 1. gerardmaggiolino.medium.com. https://gerardmaggiolino.medium.com/creating-openai-gym-environments-with-pybullet-part-1-13895a622b24
