Recognizing and understanding others’ emotional states can be difficult for children with autism spectrum conditions. Some therapists use a kid-friendly robot to demonstrate emotions, and engage the children in imitating them and responding to them in appropriate ways.
Of course, the child needs to be interested and paying attention for that kind of therapy to succeed, and robots can’t interpret a child’s level of engagement.
Or can they?
Researchers at the MIT Media Lab have developed a new type of “deep learning” network that can help robots gauge the quality of their interactions with children, using data unique to each child.
"The long-term goal is not to create robots that will replace human therapists,” he said, “but to augment them with key information that the therapists can use to personalize the therapy content.” Another aim, according to Oggi Rudovic, a postdoctoral student and first author of a study on the research, is to create more engaging and naturalistic interactions between the robots and children with autism.
Creating machine learning and AI that works for autism is particularly challenging, because the condition presents so differently in each individual. A famous adage, according to study co-author Rosalind Picard, states that “if you have met one person with autism, you have met one person with autism.”
The researchers used SoftBank Robotics’ NAO humanoid robots, which stand nearly two feet tall and resemble armored superheroes or droids. NAO can convey different emotions by changing the color of its eyes, the tone of its voice and the motion of its limbs. The robots proved effective at attracting the attention of the children studied, which can itself be a struggle for therapists.
"Therapists say that engaging the child for even a few seconds can be a big challenge for them," said Rudovic. “Also, humans change their expressions in many different ways, but the robots always do it in the same way, and this is less frustrating for the child because the child learns in a very structured way how the expressions will be shown."
The team chose a deep learning approach because it processes data through multiple hierarchical layers, with each successive layer producing a slightly more abstract representation of the original raw data. It has previously been used in automatic speech- and object-recognition programs. But while the concept has been around since the 1980s, Rudovic added, it’s only recently that there has been enough computing power to implement this kind of artificial intelligence.
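The layered processing Rudovic describes can be illustrated with a toy forward pass. This is a minimal sketch, not the study’s actual model: the weights are hand-picked for illustration, and the “engagement score” output is purely hypothetical.

```python
import math

def relu(v):
    # Non-linearity applied between layers
    return [max(0.0, x) for x in v]

def dense(inputs, weights, biases):
    # One fully connected layer: each output unit is a weighted
    # sum of all inputs plus a bias term
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

# Hand-picked weights, purely for illustration
W1 = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
b1 = [0.0, 0.1]
W2 = [[1.0, -1.0], [0.5, 0.5]]
b2 = [0.0, 0.0]
W3 = [[0.7, 0.3]]
b3 = [0.0]

def forward(raw):
    # Layer 1: raw signal -> low-level features
    h1 = relu(dense(raw, W1, b1))
    # Layer 2: low-level features -> a more abstract representation
    h2 = relu(dense(h1, W2, b2))
    # Output: squash to a single score in (0, 1)
    z = dense(h2, W3, b3)[0]
    return 1.0 / (1.0 + math.exp(-z))

print(forward([1.0, 0.5, -0.2]))
```

Each call to `dense` plus `relu` is one layer of the hierarchy; stacking more of them (and learning the weights from data, rather than fixing them by hand) is what makes the network “deep.”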
The team took this concept a step further by building a personalized framework that could learn from data collected on each individual child. They captured video of each child's facial expressions, head and body movements, and poses and gestures, as well as audio recordings; using a wrist monitor, they also collected data on heart rate, body temperature and skin sweat response. The personalized deep learning networks were then built from layers of these video, audio and physiological data; information about the child's autism diagnosis and abilities; their culture; and their gender.
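The multimodal, per-child setup described above can be sketched as a simple feature-fusion step: each data stream is summarized into a small vector, then concatenated with the child’s static context before being fed to a model trained on that child’s data. All field names and encodings here are hypothetical, not the study’s actual schema.

```python
def summarize(stream):
    # Reduce a raw time series to two simple statistics:
    # its mean and its range
    mean = sum(stream) / len(stream)
    return [mean, max(stream) - min(stream)]

def fuse_features(video, audio, physio, context):
    # Concatenate per-modality summaries with static context
    # (diagnosis, culture, gender encoded as numbers)
    return summarize(video) + summarize(audio) + summarize(physio) + context

# Hypothetical readings for one child
video = [0.2, 0.4, 0.3]      # e.g. head-movement magnitude per frame
audio = [0.6, 0.5, 0.7]      # e.g. vocal energy per window
physio = [72.0, 75.0, 74.0]  # e.g. heart-rate samples from a wrist monitor
context = [1.0, 0.0, 1.0]    # e.g. encoded diagnosis / culture / gender

features = fuse_features(video, audio, physio, context)
print(len(features))  # 2 values per modality + 3 context values = 9
```

Personalization then amounts to training (or fine-tuning) a separate network on each child’s own fused feature vectors, rather than pooling all children into one generic model.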
The results were promising: the networks’ engagement estimates correlated with human experts’ assessments more strongly than the assessments of human observers did with one another — observers who often disagree about what they are seeing.
Interesting cultural differences among the children were also observed. “For instance, children from Japan showed more body movements during episodes of high engagement, while in Serbian children large body movements were associated with episodes of disengagement,” said Rudovic.