While robots have made great strides in learning, they are still very limited in what they can do.
Repetitive tasks are manageable, but robots' limited ability to understand human language makes them mostly useless for more complicated jobs. For example, if you ask a robot to pick up a specific tool from a toolbox, it has to recognize that tool, distinguishing it from others of similar shape and size.
Now, researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have developed a new system that uses Amazon Alexa to allow robots to understand a range of commands that require knowledge about objects and their environment.
The system, called ComText, allows a robot to learn about specific objects, to be updated with information about other objects, and to execute a range of tasks, such as picking up different sets of objects in response to different commands.
“Where humans understand the world as a collection of objects and people and abstract concepts, machines view it as pixels, point-clouds and 3-D maps generated from sensors,” said Rohan Paul, a CSAIL postdoc. “This semantic gap means that, for robots to understand what we want them to do, they need a much richer representation of what we do and say.”
MIT tested ComText on Baxter, a two-armed humanoid robot developed by Rethink Robotics. In the experiment, Baxter executed the right command about 90 percent of the time. Going forward, the team hopes to enable robots to understand more complicated information, such as multi-step commands and the intent of actions, and to use properties of objects to interact with them more naturally.
How It Works
ComText observes a range of visuals and natural language to build an episodic memory of an object's size, shape, position, type and whether it belongs to someone. From there it can reason, infer meaning and respond to commands.
“The main contribution is this idea that robots should have different kinds of memory, just like people,” said Andrei Barbu, a research scientist at MIT. “We have the first mathematical formulation to address this issue, and we’re exploring how these two types of memory play and work off of each other.”
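The two kinds of memory Barbu describes, general knowledge about object types versus specific observed events, can be loosely sketched in code. The following is a toy illustration under hypothetical names, not the team's actual probabilistic formulation: a command like "pick up Dana's wrench" is resolved by checking recent observations (episodic memory) against general facts (semantic memory).

```python
# Toy sketch (hypothetical names, not the ComText implementation):
# semantic memory holds general facts about object types, while
# episodic memory records specific observed events in time order.

semantic_memory = {
    "wrench": {"category": "tool"},
    "apple": {"category": "food"},
}

episodic_memory = []  # time-ordered list of observations


def observe(obj, owner=None, position=None):
    """Record an event: a specific object seen at a position, possibly owned."""
    episodic_memory.append({"object": obj, "owner": owner, "position": position})


def resolve(command_obj, owner=None):
    """Return the most recent observation matching the command's constraints."""
    for event in reversed(episodic_memory):
        if event["object"] == command_obj and (owner is None or event["owner"] == owner):
            return event
    return None


observe("wrench", owner="Dana", position=(0.2, 0.5))
observe("wrench", owner=None, position=(0.4, 0.1))

# "Pick up Dana's wrench" resolves to the earlier, owned observation.
print(resolve("wrench", owner="Dana")["position"])  # (0.2, 0.5)
```

The point of the sketch is the division of labor: the episodic store grounds references like "Dana's wrench" in what the robot has actually seen, while the semantic store supplies background knowledge the observations alone don't carry.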
This line of research could enable better communication for a range of robotic systems, from self-driving cars to household helpers.