In recent years, artificial intelligence agents have succeeded in a range of complex game environments. For instance, AlphaZero beat world-champion programs in chess, shogi, and Go after starting out knowing no more than the basic rules of how to play. Through reinforcement learning (RL), this single system learnt by playing round after round of games through a repetitive process of trial and error. But AlphaZero still trained separately on each game, unable to simply learn another game or task without repeating the RL process from scratch. The same is true for other successes of RL, such as Atari, Capture the Flag, StarCraft II, Dota 2, and Hide-and-Seek.

DeepMind's mission of solving intelligence to advance science and humanity led us to explore how we could overcome this limitation to create AI agents with more general and adaptive behaviour. Instead of learning one game at a time, these agents would be able to react to completely new conditions and play a whole universe of games and tasks, including ones never seen before.

Today, we published "Open-Ended Learning Leads to Generally Capable Agents," a preprint detailing our first steps to train an agent capable of playing many different games without needing human interaction data. We created a vast game environment we call XLand, which includes many multiplayer games within consistent, human-relatable 3D worlds. This environment makes it possible to formulate new learning algorithms, which dynamically control how an agent trains and the games on which it trains.