Robot Learns To Cook From Watching YouTube

Researchers have developed a preliminary system that could allow robots to independently learn actions from web videos

Robotics researchers at the University of Maryland and Australia’s national ICT research centre have devised a system that could allow robots to learn activities such as cooking from videos freely available on YouTube.

Their techniques, to be presented later this month at an artificial intelligence conference in Texas, build on earlier research, but reduce errors and use “unconstrained” videos – meaning they have been filmed for ordinary human viewers.


While the system is at an early stage, the goal is for robots to be able to independently analyse videos and learn actions from them, according to the paper.

“We believe this preliminary integrated system raises hopes towards a fully intelligent robot for manipulation tasks that can automatically enrich its own knowledge resource by ‘watching’ recordings from the world wide web,” it said.

The system consists of two recognition modules, one for classifying types of hand grasps and the other for object recognition, and a parsing module that generates visual “sentences” that can be followed by the robot to learn an action.
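As a rough illustration of that three-part structure, the Python sketch below wires two recognisers into a simple parsing step. It is not the researchers' implementation: the names `Frame`, `VisualSentence`, `parse_clip` and the two toy classifiers are hypothetical, the grasp and object labels are invented for the example, and lookup functions stand in for the trained recognition modules described in the paper.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Frame:
    """Stand-in for one video frame; a real system would hold pixel data."""
    hand_patch: str   # hypothetical placeholder for the cropped hand region
    scene_patch: str  # hypothetical placeholder for the object region


@dataclass(frozen=True)
class VisualSentence:
    """One grasp/object pairing the robot could follow as a step."""
    grasp: str
    obj: str

    def __str__(self) -> str:
        return f"hand --{self.grasp}--> {self.obj}"


def toy_grasp_classifier(hand_patch: str) -> str:
    # Assumption: a lookup table replaces the trained grasp recogniser.
    table = {
        "fist around handle": "power grasp",
        "fingertips on stem": "precision grasp",
    }
    return table.get(hand_patch, "unknown grasp")


def toy_object_recogniser(scene_patch: str) -> str:
    # Assumption: the patch description already names the object.
    return scene_patch.strip().lower()


def parse_clip(
    frames: List[Frame],
    classify_grasp: Callable[[str], str],
    recognise_object: Callable[[str], str],
) -> List[VisualSentence]:
    """Combine both recognisers and collapse consecutive repeats."""
    sentences: List[VisualSentence] = []
    for frame in frames:
        sentence = VisualSentence(
            grasp=classify_grasp(frame.hand_patch),
            obj=recognise_object(frame.scene_patch),
        )
        # Only record a new step when the grasp or the object changes.
        if not sentences or sentences[-1] != sentence:
            sentences.append(sentence)
    return sentences


clip = [
    Frame("fist around handle", "Knife"),
    Frame("fist around handle", "Knife"),
    Frame("fingertips on stem", "Tomato"),
]
plan = parse_clip(clip, toy_grasp_classifier, toy_object_recogniser)
for step in plan:
    print(step)
```

Keeping the recognisers as plain functions passed into `parse_clip` mirrors the modular split the article describes: either module could be swapped for a trained model without changing the parsing step.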

The researchers wanted to use publicly available material for the experiment and drew on 88 YouTube videos. From these they extracted 12 clips, each showing one typical cooking action, and reserved the rest of the video material for training and validation.

From the 12 clips, researchers further extracted 1,525 image patches showing hands carrying out particular actions. From these the system was able to recognise manipulation actions with a relatively low error rate, according to the paper.

“Experiments conducted on a publicly available unconstrained video dataset show that the system is able to learn manipulation actions by ‘watching’ unconstrained videos with high accuracy,” the researchers wrote.

Deep neural networks

This accuracy is in part due to the use of recently developed “deep neural networks”, which have “revolutionized visual recognition research”, according to the scientists.

The researchers said they focused upon cooking because it is a field of human activity ripe for automation.

“Cooking is an activity… that future service robots most likely need to learn,” they wrote.

The paper, “Robot Learning Manipulation Action Plans by ‘Watching’ Unconstrained Videos from the World Wide Web”, is based on research funded in part by grants from the EU, the US National Science Foundation and the US Army. It is to be presented at the 29th annual Association for the Advancement of Artificial Intelligence conference later this month in Austin, Texas.

Large companies such as Google and Facebook have recently made significant investments in the development of artificial intelligence, while IBM has put its “Watson” AI system to work in fields such as research and large-scale analytics.

Meanwhile, Stephen Hawking and Elon Musk are among those who have recently called artificial intelligence a threat to humanity’s existence.
