Badger Architecture

The Badger architecture is the unifying framework for our research, defined by its key principle: modular life-long learning.

The modular aspect is expressed in the architecture through a network of identical agents. Life-long learning means that the network will be capable of adapting to a growing (open-ended) range of new and unseen tasks while reusing knowledge acquired in previous tasks. The algorithm run by individual Badger experts will be discovered through meta-learning.

We expect the design principles of the Badger architecture to be its key advantages. The modular approach should enable scaling beyond what is possible for a monolithic system, and the focus on life-long learning will allow for incremental, piece-wise learning, driving down the demand for training data.

State of Badger

Below you can find a taster of some of our latest work.

Most modern AI scales with the availability of data – improving its performance or behavior requires ever more data taken from the world. The promise of Badger is to scale with computational resources instead, as increasing numbers of experts can be added at runtime. So how do we turn that into an advantage if data is the limit? One kind of method that does scale with compute is AI that integrates a search component over a model of the world – for example, the successes of Monte Carlo Tree Search and AlphaGo. However, in those games a perfect model of the world can be provided. For AI that will generalize and scale to real-world, previously unseen problems, we need to become good at learning such models.

The videos above are examples from a model learned from recordings of a particle simulation. In particular, this model can generate not just one future but an entire distribution of possible futures – a distribution that could be searched in parallel by multiple nodes, with their discoveries then networked together via communication in order to formulate a decision.

The animation above shows 10 layers of a cellular automaton evolving over time. The task is to copy the pattern from the middle to one of the dots – not both. Thus, the distributed agent has to learn a coordination strategy to determine which dot will be active. The second animation demonstrates a “transform” task, where the pattern has to be modified during the copying process. Additionally, regularization in the form of cell-activation zeroing is present. The policies unfolding above were found by SGD.

The cellular automata displayed in the earlier section demonstrated the ability to capture and simulate a model of the world in a distributed fashion. A more direct approach, often fruitful in deep learning applications, is learning a task solver in an end-to-end manner: provided with inputs and the desired outputs, the solution (copy pattern, transform pattern, etc.) is found by backpropagation. In the case of cellular automata with a shared cell policy, this naturally makes generalization to different tasks easier, because the solution needs to be modular and distributed over varying network sizes.
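To make the shared-cell-policy idea concrete, here is a rough sketch (not the actual GoodAI implementation) of one synchronous update of a one-dimensional cellular automaton in NumPy. The grid size, state dimension, and the simple tanh policy are all hypothetical stand-ins; the point is only that every cell applies the same weights to its local neighbourhood, so the same policy runs unchanged on grids of any size.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions: a 1-D ring of cells, each holding a small
# state vector. Every cell runs the SAME policy (shared weights W); only
# its own state and its neighbours' states differ.
N_CELLS, STATE_DIM = 16, 4
W = rng.normal(scale=0.1, size=(3 * STATE_DIM, STATE_DIM))  # shared cell policy

def ca_step(states, W):
    """One synchronous update: each cell sees (left, self, right) and
    applies the shared policy to produce its next state."""
    left = np.roll(states, 1, axis=0)
    right = np.roll(states, -1, axis=0)
    neighbourhood = np.concatenate([left, states, right], axis=1)
    return np.tanh(neighbourhood @ W)

states = rng.normal(size=(N_CELLS, STATE_DIM))
for _ in range(10):  # unroll 10 layers, as in the animation described above
    states = ca_step(states, W)
```

In an end-to-end setting, the unrolled layers would be treated as one differentiable computation and `W` trained by backpropagation against the desired output pattern; because `W` is independent of `N_CELLS`, the learned policy transfers across network sizes.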

Having a modular policy that can be distributed on a large scale and solve different tasks is helpful, but is only the first step towards general AI. The next and necessary step is to allow the policy to expand the range of tasks that it can solve through accumulation and reuse of experience. These principles together form the foundation of Badger architecture.

Principles of Badger

Badger is an architecture and a learning procedure where:

  • An agent is made up of many experts
  • All experts share the same communication policy (expert policy), but have different internal memory states
  • There are two levels of learning, an inner loop (with a communication stage) and an outer loop
  • Inner loop – The agent’s behavior and adaptation emerge as a result of experts communicating with each other. Experts send messages (of any complexity) to each other and update their internal memories/states based on observations/messages and their internal state from the previous time-step. The expert policy is fixed and does not change during the inner loop.
  • Inner loop loss need not even be a proper loss function. It can be any kind of structured feedback so long as it eventually relates to the outer loop performance.
  • Outer loop – An expert policy is discovered over generations of agents, ensuring that strategies that find solutions to problems in diverse environments can quickly emerge in the inner loop.
  • The agent’s objective is to adapt quickly to novel tasks
  • Open-ended inner loop learning needs to be enabled by a suitable design of the outer loop, for instance through the support of agent self-reference and by using curiosity as an implicit agent goal creation mechanism. An open-ended agent should be able to come up with novel and creative solutions to problems it faces. The environment it operates in needs to be open-ended too – it must enable creation of novel and unforeseen tasks that match the current skill level of the agent, to support its further improvement.

Exhibiting the following novel properties:

  • Roles of experts and connectivity among them assigned dynamically at inference time
  • Learned communication protocol with context-dependent messages of varied complexity
  • Generalizes to different numbers and types of inputs/outputs
  • Can be trained to handle variations in architecture during both training and testing
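The two-level learning scheme above can be sketched in a minimal, hypothetical form. In this sketch, a toy random search stands in for the outer-loop meta-learning (which could equally be evolution or SGD), mean-pooled messages stand in for the learned communication protocol, and the mean-squared state serves as the structured inner-loop feedback (which, as noted above, need not be a proper loss). Every name, dimension, and design choice below is an assumption for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, MSG_DIM, N_EXPERTS = 8, 4, 5

def init_policy():
    # One set of weights shared by ALL experts (the expert policy).
    return {
        "msg": rng.normal(scale=0.1, size=(STATE_DIM, MSG_DIM)),
        "upd": rng.normal(scale=0.1, size=(STATE_DIM + MSG_DIM, STATE_DIM)),
    }

def inner_loop(policy, n_steps=20):
    """Inner loop: the expert policy is FIXED; adaptation happens only
    through messages and each expert's private internal state."""
    states = rng.normal(size=(N_EXPERTS, STATE_DIM))  # different states, same policy
    for _ in range(n_steps):
        msgs = np.tanh(states @ policy["msg"])            # each expert emits a message
        inbox = np.repeat(msgs.mean(axis=0, keepdims=True), N_EXPERTS, axis=0)
        states = np.tanh(np.concatenate([states, inbox], axis=1) @ policy["upd"])
    # Stand-in feedback signal: any structured feedback that eventually
    # relates to outer-loop performance will do.
    return float(np.mean(states ** 2))

def mutate(policy, sigma=0.05):
    return {k: v + rng.normal(scale=sigma, size=v.shape) for k, v in policy.items()}

def outer_loop(generations=20):
    """Outer loop: search over generations of expert policies, keeping
    those whose inner loop adapts better."""
    policy, score = init_policy(), None
    score = inner_loop(policy)
    for _ in range(generations):
        candidate = mutate(policy)
        candidate_score = inner_loop(candidate)
        if candidate_score < score:
            policy, score = candidate, candidate_score
    return policy, score
```

Note the division of labour: nothing inside `inner_loop` updates the policy weights, so all fast adaptation must flow through communication and per-expert state, while the outer loop alone shapes the shared policy across generations.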

Badger paper

For the motivation behind Badger, further details, preliminary experiments, and related literature, please see the full paper via the button below.


Badger workshops

GoodAI runs regular workshops with external collaborators in order to advance the Badger Architecture. You can read summaries of past workshops and find information about upcoming workshops below:

Past workshops 

If you would like to join one of these workshops in the future, please contact us.

Join our team

We are growing our team and looking for people interested in collaborating on the Badger Architecture to join us in our office in Prague or remotely. Please see our jobs page for open positions.

From our blog

Read the latest technical blogs from GoodAI.

Bayesian Online Meta-Learning (BOML) for continual & gradual learning

April 19, 2021 · Research, Technical blogs

New project aims to create AI that can continually acquire knowledge in different domains as well as utilize past experiences to quickly adapt to new unseen tasks. 

Read more

GoodAI enters into research collaboration to progress meta-learning and combinatorial generalization

April 14, 2021 · Research, Technical blogs

Self-improving artificial intelligence that can learn new tasks from small amounts of data is a crucial step for the advancement of strong artificial intelligence.

Read more

Creating a new framework for multi-agent AI systems

April 12, 2021 · Research, Technical blogs

Current artificial intelligence is limited in its scope and is far from human-level intelligence. One of the key components missing is learning to pursue multiple goals, ones that are dynamic, changing, and that depend on knowledge acquired from previous tasks.

Read more