AI Is Becoming More Human: New Algorithm Lets Computers Learn From Their Mistakes

Artificial Intelligence (AI) is becoming ever more human by the day.

OpenAI—a San Francisco-based, non-profit research company—has developed a new algorithm which enables AI to learn from its own mistakes, in much the same way as people do.

The new technology, known as Hindsight Experience Replay (HER), lets AI review its previous actions when trying to complete a specific task or goal, according to the company's blog.

The company has released the technology as an open-source software package which includes a number of virtual environments in which two simulated robots—a robotic arm, similar to those used in manufacturing, and a robotic hand—can be put through their paces in a number of mini tasks.


These tasks include sliding a disc across a table until it hits a given target and manipulating a pen until it achieves a desired position and rotation.

At first, the simulated robots are unsuccessful at completing their given tasks, but as the new algorithm kicks in, they begin to train themselves to become more effective by reframing each failure as a success.

"The key insight that HER formalizes is what humans do intuitively," the researchers wrote in the blog. "Even though we have not succeeded at a specific goal, we have at least achieved a different one. So why not just pretend that we wanted to achieve this goal to begin with, instead of the one that we set out to achieve originally?"

Image: The simulated robotic arms learn through failure using the new algorithm. Credit: OpenAI

"By doing this substitution, the reinforcement learning algorithm can obtain a learning signal since it has achieved some goal; even if it wasn't the one that we meant to achieve originally," they added. "If we repeat this process, we will eventually learn how to achieve arbitrary goals, including the goals that we really want to achieve."
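The substitution the researchers describe can be sketched in a few lines of Python. This is only an illustrative toy, not OpenAI's released code: the grid world, the `Transition` record, the `rollout` helper and the 0 / -1 reward values are all assumptions made for the example, and the sketch substitutes only the position reached at the end of the attempt.

```python
from typing import List, NamedTuple, Tuple

Position = Tuple[int, int]


class Transition(NamedTuple):
    """One step of experience (hypothetical record layout)."""
    state: Position
    action: Position
    next_state: Position
    goal: Position
    reward: float


def sparse_reward(achieved: Position, goal: Position) -> float:
    # Assumed reward scheme: 0.0 on reaching the goal, -1.0 otherwise.
    return 0.0 if achieved == goal else -1.0


def rollout(start: Position, actions: List[Position], goal: Position) -> List[Transition]:
    """Apply a fixed list of (dx, dy) moves and record each step."""
    episode, state = [], start
    for dx, dy in actions:
        next_state = (state[0] + dx, state[1] + dy)
        reward = sparse_reward(next_state, goal)
        episode.append(Transition(state, (dx, dy), next_state, goal, reward))
        state = next_state
    return episode


def relabel_with_achieved_goal(episode: List[Transition]) -> List[Transition]:
    """Pretend the position actually reached was the goal all along."""
    achieved = episode[-1].next_state
    return [
        t._replace(goal=achieved, reward=sparse_reward(t.next_state, achieved))
        for t in episode
    ]


# The agent was asked to reach (5, 5) but only got as far as (2, 1).
failed = rollout((0, 0), [(1, 0), (1, 0), (0, 1)], goal=(5, 5))
hindsight = relabel_with_achieved_goal(failed)

# Both versions are stored, so the learner sees the failure and the "success".
replay_buffer = failed + hindsight
```

Every step in `failed` carries the failure reward, while the last step in `hindsight` carries the success reward for the substituted goal. That relabeled step is the learning signal the quote refers to.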

This process mimics the way that we learn when it comes to trying to master new skills. When you're first learning to drive a car, for example, each mistake you make on the road will contribute to you becoming a better driver because it teaches you what not to do in a given situation.

Usually, when teaching an AI agent to reach a specific goal, existing algorithms give out a virtual reward when the AI achieves the task and no reward when it fails. HER, on the other hand, treats each failed attempt as a successful attempt at whatever outcome the AI actually reached, so even failures produce a reward signal, helping it learn faster and more effectively.
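To give a rough sense of how much more feedback this produces, the toy simulation below counts how often an attempt yields a success signal with and without the hindsight substitution. Everything in it is an assumption made for illustration: the agent is a random walker on a number line, the goal position, episode length and the function names are hypothetical, and no actual learning takes place.

```python
import random
from typing import List


def run_episode(rng: random.Random, steps: int = 10) -> List[int]:
    """Random walk on a number line starting at 0; returns visited positions."""
    position, visited = 0, []
    for _ in range(steps):
        position += rng.choice((-1, 1))
        visited.append(position)
    return visited


def success_fraction(use_hindsight: bool, episodes: int = 1000,
                     goal: int = 8, seed: int = 0) -> float:
    """Fraction of episodes that contain at least one success signal."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(episodes):
        visited = run_episode(rng)
        # With hindsight, the goal is swapped for the position actually reached.
        target = visited[-1] if use_hindsight else goal
        if target in visited:
            successes += 1
    return successes / episodes


without_her = success_fraction(use_hindsight=False)
with_her = success_fraction(use_hindsight=True)
print(f"episodes with a success signal, original goal:  {without_her:.1%}")
print(f"episodes with a success signal, hindsight goal: {with_her:.1%}")
```

With the substituted goal every episode contains a success by construction, whereas the random walker only rarely stumbles onto the original goal.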

The researchers have already been using their new software to help train real physical robots, but these kinds of self-learning algorithms could prove useful in a huge range of applications.

In addition to the developments at OpenAI, researchers at the University of Texas at San Antonio (UTSA) have outlined a new cloud-based platform at the 51st Hawaii International Conference on System Sciences, which also teaches AI to learn more like humans.

"Cognitive learning is all about teaching computers to learn without having to explicitly program them," Paul Rad, assistant director of the UTSA Open Cloud Institute, said in a statement. "We're presenting an entirely new platform for machine learning to teach computers to learn the way we do."

"Our goal here is to teach the machine to become smarter, so that it can help us. That's what they're here to do. So how do we become better? We learn from experience."