Learning to Learn with Video Games

Summary

NYU’s Julian Togelius receives a GoodAI grant for his work on game-based benchmarks of reinforcement learning (RL) algorithms.
His project, to build a scalable 2D/3D generative environment, will contribute to an evaluation system that measures and improves open-ended learning, task transfer, and generalization for game AI.

Julian Togelius works on game-based testing of artificial intelligence methods to improve AI’s ability to self-learn. His grant project focuses on developing game-based benchmarks to measure and improve open-ended learning, task transfer, and generalization for game AI. Key to the research goals is a scalable 2D/3D generative environment that would support the training of RL agents.

The project is motivated by the limits of current approaches to open-ended learning and generalization. A generative environment would yield levels of ever-increasing complexity, enabling agents to learn quickly and generalize well to novel situations, and would lend itself to stronger evaluation metrics.

Togelius’ grant work builds on existing game-based AI benchmarks that he has co-designed and developed, specifically, the General Video Game AI framework (GVGAI) [1] and its Video Game Description Language [2]. Informing his research are game adaptations of the POET algorithm [3], evolving environments for the Neural MMO environment, and quality-diversity methods for level generation.

Video games and AI benchmarks

Games have historically served as test-beds for AI research and techniques. Used to test whether computers can solve tasks requiring “intelligence,” they provide the objective means (structure, repetition and reinforcement) to measure performance. Video games have proven especially useful for testing AI given the range of multi-cognitive and multi-sensory challenges offered.

Their entertainment value and inherent learning curve are what make video games an ideal environment for training AI. The drawback is that even when an AI gets good at a specific task, its skills do not transfer to other tasks. As Togelius points out, it’s important to test algorithms not just on a single game, but on a large number of games to focus on general intelligence and not just solving a single problem. The goal is to learn not only how to play a game as well as possible, but also how to reproduce a particular playing style.

Self-recursive learning through self-generated curricula

Open-ended algorithms quickly stall in fixed environments once the challenges contained within the environment are solved. A generative environment would continually invent both problems and solutions of increasing complexity and diversity. As an agent learns to master the tasks posed to it by the environment, the environment would gradually complexify and pose further challenges.

For example, if the environment previously rewarded the agent for collecting a certain item, it might now reward the agent for collecting that item under certain conditions but punish the agent under other conditions. The self-complexifying tasks would act as a form of natural curriculum generation. Choosing how and where to complexify will be a key research question.
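To make the idea concrete, here is a minimal Python sketch of such a curriculum loop. It is only a toy under stated assumptions, not the project's environment or method: ItemTask, ToyAgent, success_rate and run_curriculum are hypothetical names, the "agent" is a stand-in that simply learns to check one more condition each time it trains, and the mastery threshold is arbitrary. The task starts by always rewarding item collection, and each complexification adds one more condition under which collecting is punished.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemTask:
    """Hypothetical task: collecting an item pays off only if no hazard flag is set."""
    level: int = 0  # number of hazard conditions attached to each item

    def sample_flags(self, rng):
        # Each condition is independently active half of the time (arbitrary choice).
        return [rng.random() < 0.5 for _ in range(self.level)]

    def reward(self, collected, flags):
        if not collected:
            return 0.0
        return -1.0 if any(flags) else 1.0

    def complexify(self):
        # Add one more condition under which collecting is punished.
        return ItemTask(self.level + 1)


class ToyAgent:
    """Stand-in learner that checks only the first `known` conditions."""

    def __init__(self):
        self.known = 0

    def act(self, flags):
        # Collect whenever none of the conditions it checks is active.
        return not any(flags[: self.known])

    def train(self):
        self.known += 1  # placeholder for a real RL update


def success_rate(agent, task, rng, episodes=200):
    """Fraction of episodes in which the agent earns the best available reward."""
    optimal = 0
    for _ in range(episodes):
        flags = task.sample_flags(rng)
        best = 0.0 if any(flags) else 1.0
        optimal += task.reward(agent.act(flags), flags) == best
    return optimal / episodes


def run_curriculum(agent, task, rounds=8, threshold=0.99, seed=0):
    rng = random.Random(seed)
    history = []
    for _ in range(rounds):
        rate = success_rate(agent, task, rng)
        history.append((task.level, rate))
        if rate >= threshold:
            task = task.complexify()  # mastered: make the task harder
        else:
            agent.train()  # not mastered: keep learning at this level
    return task, history
```

Calling run_curriculum(ToyAgent(), ItemTask()) alternates between the two branches: the task gains a condition whenever the agent masters the current one, and the agent trains otherwise. The open question raised above, how and where to complexify, corresponds to everything this sketch hard-codes in complexify and in the threshold.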

A generative environment would complement Badger’s modular framework, which is composed of multiple agents collaborating with each other in order to adapt to a changing environment. It would allow the Badger architecture to realize as much as possible of its potential for learning generalizable skills.

In particular, given the organization of an agent into multiple experts, one way the generative environment could be used is to create new tasks and task variations specifically designed to incentivize the formation of new expert modules. Togelius’ research could help Badger agents acquire knowledge continually and build ever-expanding skill sets.

References

[1] D. Perez-Liebana, J. Liu, A. Khalifa, R. D. Gaina, J. Togelius and S. M. Lucas, “General Video Game AI: A Multitrack Framework for Evaluating Agents, Games, and Content Generation Algorithms,” in IEEE Transactions on Games, vol. 11, no. 3, pp. 195-214, Sept. 2019, doi: 10.1109/TG.2019.2901021.

[2] Perez-Liebana, D. (n.d.). Chapter 2 - VGDL and the GVGAI Framework. Retrieved January 13, 2022, from https://gaigresearch.github.io/gvgaibook/PDF/chapters/ch02.pdf?raw=true

[3] Wang, R., Lehman, J., Clune, J., & Stanley, K. O. (2019). Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions. ArXiv:1901.01753 [Cs]. https://arxiv.org/abs/1901.01753

To date, most of what we consider general AI research is done in academia and inside big corporations. We believe that humanity’s ultimate frontier, creation of general AI, calls for a novel research paradigm, to fit a mission-driven moonshot endeavor. If you are interested in Badger Architecture and the work GoodAI does and would like to collaborate, check out our GoodAI Grants opportunities or our Jobs page for open positions!
