Google unveils multi-game decision transformers for training generalist agents.
Current deep reinforcement learning (RL) approaches can train expert artificial agents that excel at decision-making on various individual tasks in specific contexts, such as Go or StarCraft.
Multi-game decision transformers aim to train generalist agents. However, extending these successes to generalist agents that can carry out a variety of tasks across visually and mechanically distinct environments has seen far less progress.
Here, a single multi-game decision transformer learns to play 41 Atari games. It can also be fine-tuned on new games, and it outperforms the few existing approaches for training agents on more than one game.
Natural language processing, vision, and generative models (like PaLM and Flamingo) have made significant progress in recent years, driven by scaling up Transformer-based models and training them on large, semantically diverse datasets, such as those used to train Flamingo and Imagen.
In this work, the researchers’ primary method for training an RL agent is the decision transformer. A decision transformer predicts an agent’s next action by conditioning on the agent’s past interactions with its environment and, most importantly, on the return it should achieve in future interactions. Instead of learning a policy to maximize return, as in traditional reinforcement learning, decision transformers map diverse experiences, from expert-level to beginner-level, to their corresponding return magnitudes during training.
Training on data spanning beginner to expert play exposes the model to a broader range of gameplay variations, helping it learn general rules that allow it to win in many settings. During inference, the decision transformer can therefore be prompted with any return value in the range seen during training, including the best return.
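The return-conditioning idea described above can be illustrated without a neural network at all. The sketch below is a toy, tabular stand-in for a decision transformer: it maps (state, return-to-go) pairs to actions over a hypothetical mixed-quality dataset, so prompting with a high target return recovers expert behavior. The game, states, and trajectories are invented for illustration and are not from the paper.

```python
# Toy illustration of return-conditioned action prediction, the core idea
# behind decision transformers, using a lookup table instead of a transformer.
# The environment and trajectories below are hypothetical.

# Offline dataset: trajectories of (state, action, reward) triples of mixed
# quality (beginner and expert play in a toy two-action game).
trajectories = [
    [("s0", "left", 0), ("s1", "left", 1)],    # beginner play: return 1
    [("s0", "right", 0), ("s1", "right", 5)],  # expert play:   return 5
]

# "Training": map (state, return-to-go) -> action, mirroring how a decision
# transformer conditions its action prediction on the desired future return.
model = {}
for traj in trajectories:
    rtg = sum(r for _, _, r in traj)  # return-to-go starts at the full return
    for state, action, reward in traj:
        model[(state, rtg)] = action
        rtg -= reward  # remaining return shrinks as rewards are collected

def act(state, target_return):
    """Inference: prompt with a target return to select behavior."""
    return model.get((state, target_return))

# Prompting with the best return seen in training recovers expert behavior;
# prompting with a low return recovers beginner behavior.
print(act("s0", 5))  # right (expert)
print(act("s0", 1))  # left  (beginner)
```

In the actual work, the lookup table is replaced by a large Transformer over sequences of returns, states, and actions, which lets the model interpolate across games and experience levels rather than memorize exact pairs.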
This work is an essential step toward showing that general-purpose agents can operate across many different environments, embodiments, and behaviors. In addition, the researchers have shown that performance improves with scale. These results suggest a generalization story similar to those seen in vision and language. Finally, this approach suggests that scaling data and learning from diverse experiences can be very useful.
The researchers look forward to further work on agents that perform well across multiple environments and embodiments.
Source: indiaai.gov.in