Badger Architecture is the culmination of five years of work at GoodAI. While researching general AI, we have tried many different architectures, from biologically inspired systems to artificial neural networks and deep learning.
We came to understand that what we want is an agent that can quickly adapt to novel tasks. The follow-up questions were: what are the building blocks of an agent that can adapt its internal structure as it learns new tasks, and can we meta-learn these building blocks?
To answer these questions, we started the Badger Architecture.
Badger is an architecture and a learning procedure where:
- An agent is made up of many experts
- All experts share the same communication policy (expert policy), but have different internal memory states
- There are two levels of learning, an inner loop (with a communication stage) and an outer loop
- Inner loop – The agent’s behavior and adaptation emerge from experts communicating with each other. Experts send messages (of any complexity) to one another and update their internal memories/states based on observations/messages and on their internal state from the previous time step. The expert policy is fixed and does not change during the inner loop.
- The inner-loop loss need not even be a proper loss function; it can be any kind of structured feedback, so long as it eventually relates to outer-loop performance.
- Outer loop – The expert policy is discovered over generations of agents, ensuring that strategies that solve problems in diverse environments can quickly emerge in the inner loop.
- The agent’s objective is to adapt quickly to novel tasks
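The two-level procedure above can be sketched in a few lines: every expert runs the same update function (the shared expert policy) but carries its own hidden state, and adaptation happens purely through message passing while the policy weights stay frozen. This is a minimal illustrative sketch, not the actual Badger implementation; the dimensions, the tanh update, and the mean-pooled all-to-all messaging are assumptions made here for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): hidden state, message, expert count.
HIDDEN, MSG, N_EXPERTS = 8, 4, 5

# Shared expert-policy parameters, fixed during the inner loop.
# In the outer loop these would be meta-learned; here they are just random.
W_h = rng.normal(0, 0.1, (HIDDEN, HIDDEN + MSG))  # state-update weights
W_m = rng.normal(0, 0.1, (MSG, HIDDEN))           # message-head weights

def expert_step(hidden, incoming):
    """One inner-loop step of the shared expert policy.

    Every expert runs the SAME function but keeps its OWN hidden state."""
    new_hidden = np.tanh(W_h @ np.concatenate([hidden, incoming]))
    outgoing = np.tanh(W_m @ new_hidden)
    return new_hidden, outgoing

# Each expert starts with a different internal state.
states = [rng.normal(size=HIDDEN) for _ in range(N_EXPERTS)]
messages = [np.zeros(MSG) for _ in range(N_EXPERTS)]

for t in range(10):  # inner-loop communication stage
    # Toy all-to-all connectivity: each expert receives the pooled mean message.
    pooled = np.mean(messages, axis=0)
    states, messages = map(list, zip(*(expert_step(s, pooled) for s in states)))
```

Because the policy weights are shared, nothing distinguishes one expert from another except its state, which is what lets roles and connectivity emerge at inference time rather than being wired in.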
The architecture exhibits the following novel properties:
- Roles of experts and the connectivity among them are assigned dynamically at inference time
- Learned communication protocol with context-dependent messages of varied complexity
- Generalizes to different numbers and types of inputs/outputs
- Can be trained to handle variations in architecture during both training and testing
Initial empirical results show generalization and scalability along the spectrum of learning types.
A Badger agent is made up of many experts. All experts share the same expert policy but have different internal memory states. Experts communicate and coordinate in order to adapt the agent to novel tasks.
The crucial question is: how do we get this expert policy? One approach is to handcraft it. The other is to meta-learn it, automating the search for it. This approach can be framed as “multi-agent learning” (in our case, multi-expert learning).
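One minimal way to picture the meta-learning route is as black-box search over the shared policy’s parameters: the inner loop (experts communicating under a fixed policy) produces a fitness score, and the outer loop mutates the policy across generations, keeping improvements. The sketch below uses a toy two-parameter policy and a (1+1) evolutionary outer loop; the fitness function, target, and update rule are all assumptions made here for illustration, not GoodAI’s actual procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

def inner_loop_fitness(theta, n_experts=4, steps=5):
    """Toy stand-in for inner-loop performance. Experts share the policy
    `theta`, keep distinct scalar states, and exchange a broadcast message.
    The agent is scored on how close its mean final state gets to a target."""
    a, b = theta                                # state decay and message gain
    states = np.linspace(0.0, 1.0, n_experts)   # distinct initial states
    for _ in range(steps):
        msg = states.mean()                     # broadcast message to all
        states = np.tanh(a * states + b * msg)  # same update rule everywhere
    return -abs(states.mean() - 0.5)            # higher is better

# Outer loop: (1+1) evolution over generations of agents. The expert policy
# is frozen inside inner_loop_fitness; only the outer loop changes theta.
theta = np.zeros(2)
best = inner_loop_fitness(theta)
for generation in range(200):
    candidate = theta + rng.normal(0.0, 0.1, size=2)
    score = inner_loop_fitness(candidate)
    if score > best:                            # keep the better policy
        theta, best = candidate, score
```

Swapping the hill-climb for gradients through the inner loop, or for a population-based evolution strategy, preserves the same two-level structure: the inner loop adapts states, the outer loop shapes the shared policy.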
For the motivation behind Badger, more details, preliminary experiments, and related literature, please see the full paper.
Join our team
We are growing our team and looking for people interested in collaborating on the Badger Architecture, either in our office in Prague or remotely. Please see our jobs page for open positions.