How Do Robots Learn? They Watch How-To Videos

24 December 2015

Much as people watch tutorial videos to learn new skills, Cornell University researchers are teaching robots to watch instructional videos and then derive step-by-step instructions for performing a particular task. According to Cornell, this method will become so effective that you won’t even have to turn on the DVD player: the robot can look up what it needs on YouTube.

Robots watch videos on how-to topics and a computer finds instructions they have in common. (Source: Cornell University)

The Cornell researchers’ project, RoboWatch, envisions a future in which personal robots perform daily household chores like cooking, doing laundry, and feeding pets. Part of what makes this concept possible is that most how-to videos share a common underlying structure and that plenty of source material is available. For example, YouTube offers 180,000 videos on “How to make an omelet” and 281,000 on “How to tie a bowtie.” So if a computer were to scan numerous videos on the same task, it would be able to find what they all have in common and come up with step-by-step instructions.

According to graduate student Ozan Sener, a key aspect of the system is that it is “unsupervised.” In most previous work, robot learning has relied on humans explaining the observations, for example by teaching a robot to recognize objects by showing it pictures of the objects while a human labels them by name. Here, a robot with a job to do can look up the instructions and figure them out for itself.

How it works

If the robot doesn’t know how to perform a task, its computer brain sends a query to YouTube to find a collection of how-to videos on the topic. The algorithm includes routines to omit “outliers,” or videos that fit the keywords but are not instructional. The computer then scans the videos frame by frame, looking for objects that appear often, and reads the accompanying narration (using subtitles) in search of frequently repeated words. Using these markers, it matches similar segments in the various videos and orders them into a single sequence.
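The subtitle side of this process can be illustrated with a toy example. The sketch below is not the RoboWatch algorithm, which also analyzes video frames and is not described here in enough detail to reproduce. It only shows, under simplifying assumptions, how words repeated across many transcripts could be found and how a non-instructional “outlier” could be dropped. The function names, the stopword list, the sample transcripts, and the 0.5 thresholds are all hypothetical.

```python
from collections import Counter

# Hypothetical, deliberately tiny stopword list for this illustration.
STOPWORDS = {"the", "a", "an", "and", "to", "of", "in", "it",
             "you", "then", "your", "is", "on", "with"}


def tokenize(text):
    """Return the set of lowercase, non-stopword words in one transcript."""
    words = (w.strip(".,!?;:\"'()").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def common_words(transcripts, min_fraction=0.5):
    """Words that occur in at least min_fraction of the transcripts."""
    counts = Counter()
    for t in transcripts:
        counts.update(tokenize(t))  # each word counts once per transcript
    threshold = min_fraction * len(transcripts)
    return {w for w, c in counts.items() if c >= threshold}


def drop_outliers(transcripts, min_overlap=0.5):
    """Keep transcripts that share at least min_overlap of the common words."""
    vocab = common_words(transcripts)
    if not vocab:
        return list(transcripts)
    kept = []
    for t in transcripts:
        overlap = len(tokenize(t) & vocab) / len(vocab)
        if overlap >= min_overlap:
            kept.append(t)
    return kept


# Made-up subtitle snippets: three how-to videos and one off-topic review.
transcripts = [
    "Crack the eggs into a bowl and whisk. Heat the pan with butter. "
    "Pour the eggs and fold.",
    "Whisk eggs in a bowl. Melt butter in the pan. Pour eggs, then fold the omelet.",
    "Heat butter in a pan. Whisk the eggs, pour into the pan, and fold.",
    "My omelet restaurant review: the service was slow and the coffee was cold.",
]

kept = drop_outliers(transcripts)
print(len(kept))  # prints 3: the restaurant review is filtered out
print(sorted(common_words(kept)))
```

Counting each word once per transcript, rather than by raw frequency, keeps a single long video from dominating the shared vocabulary. A real system would still need to align the matching segments in time to recover the order of the steps, which this sketch does not attempt.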

Using the subtitles, it can even create written instructions. In the future, information from other sources such as Wikipedia might be added. The learned knowledge from the YouTube videos is made available via RoboBrain, an online knowledge base that robots can consult to help them do their jobs.
