We run the Meta-Learning & Multi-Agent Learning (MLMA) workshops to bring together world-renowned experts and advance the fields of meta-learning and multi-agent learning. The 2020 MLMA workshop was held in hybrid form, at the GoodAI headquarters in Prague and online. Dates for the 2021 edition are coming soon.

Oranžérie: the office of GoodAI and location of the workshop
MLMA 2020 Workshop
In this workshop, we aimed to answer some critical questions:
- Can we build a general AI agent composed of many (homogeneous) units that are meta-learned to communicate, so that the agent can learn new tasks in a continuous and open-ended manner? (A minimal sketch of this idea follows this list.)
- Can these units be meta-learned to communicate learned credit, form new topologies, modulate other units, grow new units, learn new motivations, and more?
- Can this lead to an agent that adapts its internal structure to better meet future challenges?
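To make the first question concrete, here is a minimal sketch, not GoodAI's actual Badger implementation, of an agent built from homogeneous units: every unit reuses one shared set of weights, differs only in its hidden state, and adapts at test time purely through the messages the units exchange. All names here (ExpertUnit, n_units, msg_dim, and so on) are illustrative assumptions.

```python
# A minimal sketch of a homogeneous multi-unit agent (illustrative only).
import torch
import torch.nn as nn

class ExpertUnit(nn.Module):
    """One unit of the agent; every unit reuses these same weights."""
    def __init__(self, obs_dim, msg_dim, hidden_dim):
        super().__init__()
        self.cell = nn.GRUCell(obs_dim + msg_dim, hidden_dim)
        self.to_msg = nn.Linear(hidden_dim, msg_dim)

    def forward(self, obs, incoming_msg, h):
        h = self.cell(torch.cat([obs, incoming_msg], dim=-1), h)
        return self.to_msg(h), h

n_units, obs_dim, msg_dim, hidden_dim = 8, 4, 16, 32
unit = ExpertUnit(obs_dim, msg_dim, hidden_dim)  # one weight set shared by all units
h = torch.zeros(n_units, hidden_dim)             # per-unit hidden state (the only per-unit difference)
msgs = torch.zeros(n_units, msg_dim)

for step in range(10):  # inner loop: "learning" happens via communication dynamics
    obs = torch.randn(n_units, obs_dim)  # stand-in for per-unit observations
    incoming = msgs.mean(dim=0, keepdim=True).expand(n_units, -1)  # simple broadcast channel
    msgs, h = unit(obs, incoming, h)
```

In a meta-learning setup, the shared weights would be trained in an outer loop so that this inner communication loop itself implements a learning algorithm.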
The workshop took an interdisciplinary approach, drawing on machine learning, meta-learning, artificial life, network science, dynamical systems, complexity science, collective computation, social intelligence, creativity and communication, and more. The idea for the workshop grew out of the kinds of questions we are tackling while working on our Badger Architecture.
Core areas of focus
- Can we frame policy search as a multi-agent learning problem? (Learn how units can coordinate to learn new tasks together.)
- Can we frame it as a meta-learning problem?
- Can we frame it as just another Deep Learning architecture, e.g. RNN with shared weights?
- Minimum Viable Environment: what is the most minimal environment that supports learning a general learning algorithm, one that generalizes to more complex environments of interest to humans?
- How can we add intrinsic motivation, so that the agent drives itself towards open-ended development without explicit goals? (A toy sketch follows this list.)
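As a toy illustration of the last question, the sketch below shows one common intrinsic-motivation scheme: a curiosity bonus equal to the prediction error of a learned forward model, so the agent is rewarded for visiting states it cannot yet predict. This is not GoodAI's method, just one well-known approach from the curiosity-driven exploration literature; all names (ForwardModel, beta) are illustrative assumptions.

```python
# A toy curiosity bonus from forward-model prediction error (illustrative only).
import torch
import torch.nn as nn

class ForwardModel(nn.Module):
    """Predicts the next state from the current state and action."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
            nn.Linear(64, state_dim))

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

state_dim, action_dim, beta = 8, 2, 0.1
model = ForwardModel(state_dim, action_dim)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# One transition (random stand-ins for real environment data):
s, a, s_next = torch.randn(1, state_dim), torch.randn(1, action_dim), torch.randn(1, state_dim)

pred_error = ((model(s, a) - s_next) ** 2).mean()
intrinsic_reward = beta * pred_error.detach()  # bonus that would be added to the task reward

opt.zero_grad()
pred_error.backward()  # as the model improves, the bonus fades for familiar states
opt.step()
```

Because the forward model keeps improving, the bonus naturally decays for well-explored states, pushing the agent towards novelty without any explicit goal.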
When and where?
The workshop was split into two parts:
- First part (August 10–14): one week for all participants, with presentations, working groups, and discussions
- Second part (August 17 – September 4): three additional weeks of focused work (experiments) for those who could stay
Schedule and videos
(All times are CEST, Prague time.)
Monday 10 August
16:30 – 17:45: Workshop & Badger Introduction and Principia Badgerica (VIDEO) (Jan Feyereisl, GoodAI & Marek Rosa, GoodAI)
18:00 – 18:45: Related Research 1 – End-to-End Differentiable Self-Organising Systems (Ettore Randazzo, Google Research & Alexander Mordvintsev, Google Research)
18:45 – 19:15: Discussion on presented topic
20:00 – 20:45: Related Research 2 – Discovering Reinforcement Learning Algorithms (Junhyuk Oh, DeepMind)
20:45 – 21:15: Discussion on presented topic
21:15 – 22:00: Breakout Session
22:15 – 23:00: Related Research 3 – Compositional Control: Intelligence Without a Brain (VIDEO) (Deepak Pathak, CMU)
23:00 – 23:30: Discussion on presented topic
Tuesday 11 August
10:30 – 11:15: A multi-agent perspective to AI (VIDEO) (Anuj Mahajan, University of Oxford)
11:15 – 11:45: Discussion on Multi-Agent learning
12:00 – 12:45: Multi-Agent Learning 2 (Jun Wang, UCL)
12:45 – 13:15: Discussion on Multi-Agent Learning
13:15 – 14:00: Breakout Session
15:00 – 15:45: Learned Communication (Angeliki Lazaridou, DeepMind)
15:45 – 16:15: Discussion on Communication
16:30 – 17:45: Big Picture Discussion: Benefits of Communication & Multi-Agentness
18:00 – 18:45: Self-Play and Zero-Shot Human AI Coordination in Hanabi (VIDEO) (Jakob Foerster, The Vector Institute)
18:45 – 19:15: Discussion on Multi-Agent Learning
Wednesday 12 August
15:00 – 15:45: Relative Overgeneralization in Distributed Control (VIDEO) (Wendelin Boehmer, University of Oxford)
15:45 – 16:15: Discussion on NMP/Graph NNs
16:30 – 17:15: Modular Meta-Learning: Learning to build up knowledge through modularity (VIDEO) (Ferran Alet, MIT)
17:15 – 17:45: Discussion on Modular Meta-Learning
18:00 – 19:15: Big Picture Discussion: Benefits of Modularity & Internal Structure
20:00 – 20:45: Modular & Compositional Computation (VIDEO) (Clemens Rosenbaum, ASAPP Inc)
20:45 – 21:15: Discussion on Modular & Compositional Computation
21:15 – 22:00: Breakout Session
22:15 – 23:00: Complexity: Concepts, Abstraction, and Analogy in Natural and Artificial Intelligence (VIDEO) (Melanie Mitchell, The Santa Fe Institute)
23:00 – 23:30: Discussion on Complexity
Thursday 13 August
10:30 – 11:15: Open-Endedness 1: Measuring growth of complexity (Tomas Mikolov, CIIRC)
11:15 – 11:45: Discussion on Open-Endedness
12:00 – 12:45: Why Think? (VIDEO) (Nicholas Guttenberg, GoodAI & Cross Compass)
12:45 – 13:15: Discussion on Deliberation & Accessible Information
13:15 – 14:00: Breakout Session
15:00 – 15:45: Minimum Viable Environments (VIDEO) (Julian Togelius, New York University)
15:45 – 16:15: Discussion on Minimum Viable Environments
16:30 – 17:15: Open-Endedness 2: The Importance of Open-Endedness in AI and Machine Learning (VIDEO) (Kenneth Stanley, OpenAI)
17:15 – 17:45: Discussion on Open-Endedness
18:00 – 19:15: Big Picture Discussion: Open-Endedness, Deliberation, and Discovery of Algorithms
Friday 14 August
12:00 – 12:45: The Science of Deep Learning 1 (VIDEO) (Stanislav Fort, Stanford)
12:45 – 13:15: Discussion on the Science of Deep Learning
13:15 – 14:00: Breakout Session
15:00 – 15:45: Learned Learning Algorithms (VIDEO) (Luke Metz, Google Brain)
15:45 – 16:15: Discussion on Meta-Learning
16:30 – 17:15: The Science of Deep Learning 2: Understanding Neural Networks via Pruning (Jonathan Frankle, MIT)
17:15 – 17:45: Discussion on the Science of Deep Learning
18:00 – 19:15: Big Picture Discussion: Generalization & Scalability
20:00 – 20:45: Breakout Session
20:45 – 21:15: Working Groups – Experiments & Hypotheses
21:15 – 22:00: Workshop Conclusion
Code of conduct for the workshop
All attendees and speakers must abide by GoodAI’s Code of Conduct during the Workshop.
Recommended reading
The following is a list of recommended reading for the workshop. Items in bold mark the most representative related work within each topic. Items with an asterisk (*) denote articles by an author who presented at or participated in the workshop. Items in italics are articles that give an informative summary of a topic or a comprehensive, unified view of the corresponding sub-area. Some items appear in more than one section because they are relevant to more than one sub-topic.
Badger
- Badger paper*: BADGER: Learning to (Learn [Learning Algorithms] through Multi-Agent Communication)
- Recent Badger blog posts*: here, here and here.
- Recent Badger workshops*: here and here.
Meta-learning
- Discovering Reinforcement Learning Algorithms*: https://arxiv.org/abs/2007.08794
- Improving Generalization in Meta Reinforcement Learning using Learned Objectives*: https://arxiv.org/abs/1910.04098
- Meta-learning of Sequential Strategies: https://arxiv.org/abs/1905.03030
- Understanding and correcting pathologies in the training of learned optimizers*: https://arxiv.org/abs/1810.10180
- Meta-learners’ learning dynamics are unlike learners’: https://arxiv.org/abs/1905.01320
- Meta-Learning in Neural Networks: A Survey: https://arxiv.org/abs/2004.05439
- Learning to Learn with Feedback and Local Plasticity: https://arxiv.org/abs/2006.09549
- Finding online neural update rules by learning to remember: https://arxiv.org/abs/2003.03124
Multi-Agent Learning & Emergent Communication
- Learning Multiagent Communication with Backpropagation: https://arxiv.org/pdf/1605.07736.pdf
- Learning Attentional Communication for Multi-Agent Cooperation*: https://arxiv.org/abs/1805.07733
- Learning to communicate with deep multi-agent reinforcement learning*: http://papers.nips.cc/paper/6042-learning-to-communicate-with-deep-multi-agent-reinforcement-learning.pdf
- Emergent Multi-Agent Communication in the Deep Learning Era*: https://arxiv.org/abs/2006.02419
- Emergence of Grounded Compositional Language in Multi-Agent Populations: https://arxiv.org/abs/1703.04908
- Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments: https://arxiv.org/abs/1706.02275
- Decentralized Multi-Agent Actor-Critic with Generative Inference*: https://arxiv.org/abs/1910.03058
- AI-QMIX: Attention and Imagination for Dynamic Multi-Agent Reinforcement Learning*: https://arxiv.org/abs/2006.04222
- MAVEN: Multi-Agent Variational Exploration*: https://deepai.org/publication/maven-multi-agent-variational-exploration
- A Survey and Critique of Multiagent Deep Reinforcement Learning: https://arxiv.org/abs/1810.05587
Modularity & Generalized Deep Learning
- One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control*: https://wenlong.page/modular-rl/
- Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity*: https://pathak22.github.io/modular-assemblies/
- MPLP: Learning a Message Passing Learning Protocol*: https://arxiv.org/abs/2007.00970
- Neural Relational Inference with Fast Modular Meta-learning*: https://papers.nips.cc/paper/9353-neural-relational-inference-with-fast-modular-meta-learning.pdf
- Modular meta-learning*: https://arxiv.org/abs/1806.10166
- Graph Element Networks: adaptive, structured computation and memory*: https://arxiv.org/abs/1904.09019
- Routing Networks and the Challenges of Modular and Compositional Computation*: https://arxiv.org/abs/1904.12774
- Are Neural Nets Modular? Inspecting Their Functionality Through Differentiable Weight Masks: http://people.idsia.ch/~csordas/are_neural_networks_modular.pdf
- Graph Structure of Neural Networks: https://arxiv.org/abs/2007.06559
- Recurrent Independent Mechanisms: https://arxiv.org/abs/1909.10893
- Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules: https://arxiv.org/abs/2006.16981
- S2RMs: Spatially Structured Recurrent Modules: https://arxiv.org/abs/2007.06533
Gradual and Continual Learning
- Learning to Continually Learn*: https://arxiv.org/abs/2002.09571
- Bilevel Continual Learning: https://arxiv.org/abs/2007.15553
Inductive Biases, Symmetries and Invariances
- Learning Invariance vs. Model Invariance: Implications for Meta-Learning*: tbd
- The Holy Grail of Deep Learning: Modelling Invariances: https://www.inference.vc/the-holy-gr/
- Natural Graph Networks: https://arxiv.org/abs/2007.08349
- Meta-Learning Symmetries by Reparameterization: https://arxiv.org/abs/2007.02933
Meta-control and Deliberation
- Automatically Composing Representation Transformations as a Means for Generalization: https://arxiv.org/abs/1807.04640
- Doing more with less: meta-reasoning and meta-learning in humans and machines: https://cocosci.princeton.edu/papers/doing-more-with-less.pdf
- Metacontrol for Adaptive Imagination-Based Optimization: https://arxiv.org/abs/1705.02670
- A Theory of Usable Information under Computational Constraints: https://openreview.net/forum?id=r1eBeyHFDH
Neural Message Passing & Graph Neural Networks
- Relational inductive biases, deep learning, and graph networks: https://arxiv.org/abs/1806.01261
- NerveNet: Learning Structured Policy with Graph Neural Networks: https://openreview.net/forum?id=S1sqHMZCb
- Pointer Graph Networks: https://arxiv.org/abs/2006.06380
- Natural Graph Networks: https://arxiv.org/abs/2007.08349
- Learning TSP Requires Rethinking Generalization: https://arxiv.org/abs/2006.07054
- Neural Message Passing for Quantum Chemistry: https://arxiv.org/abs/1704.01212
- Benchmarking Graph Neural Networks: https://arxiv.org/abs/2003.00982
- Graph Neural Networks Meet Neural-Symbolic Computing: A Survey and Perspective: https://arxiv.org/abs/2003.00330
Neural Optimizers
- Learning to Learn without Gradient Descent by Gradient Descent: https://arxiv.org/abs/1611.03824
- Understanding and correcting pathologies in the training of learned optimizers*: https://arxiv.org/abs/1810.10180
- Learned Optimizers that Scale and Generalize: https://arxiv.org/abs/1703.04813
Optimization, Adaptation and Generalization
- Rapid Task-Solving in Novel Environments: https://arxiv.org/abs/2006.03662
- Generalized Planning With Deep Reinforcement Learning: https://arxiv.org/abs/2005.02305
- Learning TSP Requires Rethinking Generalization: https://arxiv.org/abs/2006.07054
- A Brief Look at Generalization in Visual Meta-Reinforcement Learning: https://arxiv.org/abs/2006.07262
- Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm: https://arxiv.org/abs/1710.11622
- The Early Phase of Neural Network Training*: https://arxiv.org/abs/2002.10365
- The Break-Even Point on the Optimization Trajectories of Deep Neural Networks*: https://openreview.net/pdf?id=r1g87C4KwB
Program Induction & Synthesis
- Strong Generalization and Efficiency in Neural Programs: https://arxiv.org/abs/2007.03629
- Neural Execution Engines: Learning to Execute Subroutines: https://arxiv.org/abs/2006.08084
Curriculum learning, open-endedness & Minimum Viable Environments
- Evolving Structures in Complex Systems*: https://arxiv.org/abs/1911.01086
- Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions*: https://arxiv.org/abs/2003.08536
- Procedural Content Generation via Machine Learning (PCGML)*: https://arxiv.org/abs/1702.00539
- Object-Oriented Curriculum Generation for Reinforcement Learning*: https://www.researchgate.net/publication/323738119_Object-Oriented_Curriculum_Generation_for_Reinforcement_Learning
- Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data*: https://arxiv.org/abs/1912.07768
- CommAI: Evaluating the first steps towards a useful general AI*: https://arxiv.org/abs/1701.08954
- Pommerman: A multi-agent playground*: https://arxiv.org/abs/1809.07124
Further reading
- Growing Neural Cellular Automata*: https://distill.pub/2020/growing-ca/
- AI-GAs: AI-generating algorithms, an alternate paradigm for producing general artificial intelligence: https://arxiv.org/abs/1905.10985
- Network of Evolvable Neural Units: Evolving to Learn at a Synaptic Level*: https://arxiv.org/abs/1912.07589
- Multiagent Bidirectionally-Coordinated Nets: Emergence of Human-level Coordination in Learning to Play StarCraft Combat Games*: https://arxiv.org/abs/1703.10069
- Learning Structured Communication for Multi-agent Reinforcement Learning*: https://arxiv.org/pdf/2002.04235.pdf
- Graph Convolutional Reinforcement Learning: https://arxiv.org/pdf/1810.09202.pdf
- TarMAC: Targeted Multi-Agent Communication: https://arxiv.org/pdf/1810.11187.pdf