If you want your human-like robot to authentically imitate facial expressions, timing is crucial. Engineers at Columbia University’s Creative Machines Lab have spent five years refining their robot’s reflexes down to the millisecond. Their findings, outlined in a new study in Science Robotics, are now publicly available.
Meet Emo, a robot head that can predict and mirror human facial expressions, including smiles, within 840 milliseconds. Whether the demonstration video will leave you smiling, however, is another matter.
AI is becoming quite adept at mimicking human conversation, with a heavy emphasis on “mimicking.” But when it comes to visibly replicating emotions, its physical robot counterparts still have a lot of catching up to do. A machine that mistimes a smile isn’t just uncomfortable; it calls attention to its own artificiality.
Human brains, in contrast, are incredibly skilled at interpreting a huge range of visual cues in real time and responding appropriately with all manner of facial movements. That makes it extremely challenging to teach AI-powered robots the subtleties of expression, and it is just as difficult to build a mechanical face capable of realistic muscle movements that don’t tip into the unsettling.
[Related: Please consider carefully before allowing AI to scan your penis for STIs.]
Emo’s creators aim to address some of these challenges, or at least help close the gap between human and robot expressiveness. To create their new robot, a team led by AI and robotics expert Hod Lipson first designed a lifelike robotic human head with 26 separate actuators to enable subtle facial expressions. Each of Emo’s eyes also contains a high-resolution camera to track the eyes of its human conversation partner, another crucial nonverbal visual signal for people. Finally, Lipson’s team covered Emo’s mechanical components with a silicone “skin” to make it less… you know, unsettling.
After that, the researchers developed two separate AI models to work in tandem: one to anticipate an upcoming human expression from the subtle changes in a target face, and another to rapidly generate the corresponding motor commands for the robot’s own face. Using sample videos of human facial expressions, Emo’s AI then learned emotional details frame by frame. In just a few hours, Emo was able to observe, interpret, and respond to the slight facial movements people tend to make as they start to smile. Moreover, it can now do so within about 840 milliseconds.
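For readers curious how that two-model division of labor might look in code, here is a minimal, hypothetical sketch. None of these class or function names come from the Columbia team’s actual software; they simply illustrate the “predict the human’s expression, then actuate the robot’s face” pipeline the study describes.

```python
# Hypothetical sketch of a two-stage predict-then-actuate pipeline.
# The real Emo system uses trained neural networks; these placeholders
# only show how the two models hand data to each other.

import numpy as np

class ExpressionPredictor:
    """Anticipates the expression a person is about to make from recent frames."""
    def predict(self, recent_landmarks: np.ndarray) -> np.ndarray:
        # recent_landmarks: (n_frames, n_landmarks, 2) tracked face points.
        # Naively extrapolate the latest motion as a stand-in for a learned model.
        velocity = recent_landmarks[-1] - recent_landmarks[-2]
        return recent_landmarks[-1] + velocity  # projected near-future landmarks

class FaceController:
    """Maps a target facial expression onto the robot's actuators."""
    def __init__(self, n_actuators: int = 26):  # Emo's head has 26 actuators
        self.n_actuators = n_actuators

    def motor_commands(self, target_landmarks: np.ndarray) -> np.ndarray:
        # A real inverse model learns landmark-to-actuator mappings from
        # self-observation; this placeholder just returns a zero command vector.
        return np.zeros(self.n_actuators)

# Per-frame loop: anticipate the human's expression, then move to match it,
# so the robot's smile can land at roughly the same moment as the human's.
predictor, controller = ExpressionPredictor(), FaceController()
frames = np.random.rand(5, 68, 2)                   # placeholder tracked landmarks
anticipated = predictor.predict(frames)             # what the person is about to do
commands = controller.motor_commands(anticipated)   # how the robot should respond
```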
“I believe accurately predicting human facial expressions is a game-changer in [human-robot interactions],” Yuhang Hu, Columbia Engineering PhD student and study lead author, stated earlier this week. “Traditionally, robots have not been designed to consider humans’ expressions during interactions. Now, the robot can incorporate human facial expressions as feedback.”
At present, Emo does not possess any verbal comprehension skills, so it can only interact by analyzing human facial expressions. Lipson, Hu, and the rest of their collaborators hope to soon merge those physical capabilities with a large language model system like ChatGPT. If they succeed, Emo will be even closer to natural(ish) human interactions. Of course, there’s more to being relatable than smiling, smirking, and grinning, which is what the scientists seem to be concentrating on for now. At some point, though, the robots may also need to learn how to react to our frowns and scowls, even if mimicking expressions like pouting or frowning will need to be handled carefully, since they can be misread or convey unintended emotions.