Universal robots are no longer far away: Google demonstrates the world’s first multi-tasking AI agent

Recently, Google’s AI team DeepMind launched RoboCat, a self-improving robot AI agent.

RoboCat is essentially an AI-powered software program that serves as the “brain” of a robot. Unlike a traditional robot, a robot driven by RoboCat is more “universal” and can improve itself.

According to DeepMind, RoboCat is the world’s first robot AI agent that can solve and adapt to multiple tasks, and it can complete these tasks on a variety of real robots.

DeepMind says RoboCat can learn to operate a robotic arm to complete a variety of tasks after only about 100 demonstrations, and can then iteratively improve using self-generated data.

One important reason progress on general-purpose robots has been slow is that collecting real-world training data takes time. RoboCat’s ability to learn quickly reduces the need for human-supervised training, which makes it an important step towards creating general-purpose robots.

As can be seen from the released video, RoboCat can already control a robotic arm through self-learning and complete tasks such as “ringing”, “building blocks” and “grabbing fruit”. These tasks seem simple, but they test the robotic arm’s operating accuracy, its comprehension, and its ability to solve shape-matching puzzles.

Most importantly, RoboCat had never before seen either the robotic arm it controls or the task to be completed. Today, RoboCat’s success rate in completing a new task has increased from 36% in the initial stage to 74%.

One of the key technologies used by RoboCat is Gato, a multimodal model. Gato means “cat” in Spanish, which is one of the origins of the name “RoboCat”.

The Gato model can process language, images, and actions in simulated and physical environments. The researchers combined Gato’s architecture with a large training dataset consisting of 100-1,000 demonstrations of various robotic arms performing tasks.

Combining the original dataset with the data generated by new training, RoboCat’s dataset contains millions of training trajectories. The more new tasks it learns, the better it becomes at learning and solving additional new tasks.
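The cycle described here, in which demonstrations and self-generated data are folded back into a growing dataset, can be illustrated with a toy sketch. This is not DeepMind’s code: the `Trajectory` and `Agent` classes, the `improvement_cycle` function, and the episode counts are all hypothetical, and the “training” steps are placeholders that only track which data each agent would have been trained on.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Trajectory:
    """One recorded attempt at a task (hypothetical structure)."""
    task: str
    source: str  # "demonstration" or "self_generated"


@dataclass
class Agent:
    """Toy stand-in for an agent; it only remembers its training data."""
    dataset: List[Trajectory] = field(default_factory=list)

    def fine_tune(self, demos: List[Trajectory]) -> "Agent":
        # Placeholder for fine-tuning: return a task-specific copy
        # whose training data also includes the new demonstrations.
        return Agent(dataset=self.dataset + demos)

    def practice(self, task: str, episodes: int) -> List[Trajectory]:
        # Placeholder for practising on a robot: each episode
        # yields one self-generated trajectory.
        return [Trajectory(task, "self_generated") for _ in range(episodes)]


def improvement_cycle(agent: Agent, task: str,
                      demos: List[Trajectory], episodes: int) -> Agent:
    """One assumed self-improvement cycle: demonstrations, practice, retrain."""
    specialist = agent.fine_tune(demos)               # adapt to the new task
    self_data = specialist.practice(task, episodes)   # generate more data
    # The next agent is trained on the old data plus both new sources.
    return Agent(dataset=agent.dataset + demos + self_data)


# Usage: about 100 demonstrations of a new task, then 500 practice episodes
# (the practice count is an arbitrary illustrative value).
demos = [Trajectory("stack_blocks", "demonstration") for _ in range(100)]
next_agent = improvement_cycle(Agent(), "stack_blocks", demos, episodes=500)
print(len(next_agent.dataset))
```

Each pass through `improvement_cycle` leaves the next agent with a larger and more varied dataset, which is the mechanism the article credits for RoboCat getting better at learning new tasks.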

Previously, researchers have explored large-scale multi-task learning for robots and have combined the understanding of language models with the capabilities of real-world robots. RoboCat’s advance is that it is the first robot AI agent to solve and adapt to multiple tasks.

The DeepMind paper shows that the large increase in task success comes from RoboCat’s growing experience, just as people develop more diverse skills as they deepen their learning in a specific domain. RoboCat’s success rate on real-world training tasks is also much higher than that of traditional vision-based models, which is another important contribution of DeepMind’s research.

RoboCat’s “universal learning ability” is of great significance for accelerating research in robotics. DeepMind believes that RoboCat’s independent learning, rapid self-improvement, and rapid adaptation to different hardware will play an important role in advancing a new generation of general-purpose robot AI agents.
