A team at MIT has created an imaging technique called Interactive Dynamic Video (IDV) that lets users reach in and “touch” objects in videos. IDV uses conventional cameras and algorithms to detect the tiny, almost invisible vibrations of an object and create video simulations that users can interact with virtually. The new technology could also reduce the need for the green screens used in CGI.
"This technique lets us capture the physical behavior of objects, which gives us a way to play with them in virtual space,” said Abe Davis, CSAIL Ph.D. student, who will be publishing the work this month for his final dissertation. “By making videos interactive, we can predict how objects will respond to unknown forces and explore new ways to engage with videos.”
According to Davis, IDV has many potential applications, ranging from visual effects in filmmaking to architecture, where it could be used to determine whether buildings are structurally sound.
To explain the technology, Davis compares it to the recent Pokémon Go application, which can drop virtual characters into real-world environments. IDV goes a step further by enabling virtual objects to interact with their environments in realistic ways, such as bouncing off the leaves of a nearby bush.
How It Works
One of the most common ways to simulate objects’ motions is 3-D modeling. But because 3-D modeling is expensive, and can be nearly impossible for many objects, Davis instead employed algorithms that track motion in video and magnify the tiny vibrations they find.
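To make the idea concrete, here is a minimal sketch of that first step in Python. It treats each pixel’s intensity over time as a one-dimensional signal and computes its temporal spectrum; the function name, the array layout, and the use of raw intensities (rather than the subpixel motion signals the actual method extracts) are all simplifying assumptions for illustration, not the team’s code.

```python
import numpy as np

def pixel_spectra(frames, fps):
    """Per-pixel temporal spectra for a stack of grayscale frames.

    frames: float array of shape (T, H, W) -- T frames of H x W pixels.
    Returns the FFT frequencies (in Hz) and the magnitude spectrum at
    every pixel. Strong peaks at a given frequency mark pixels that
    vibrate at that frequency.
    """
    signal = frames - frames.mean(axis=0)           # remove the static scene
    spectrum = np.abs(np.fft.rfft(signal, axis=0))  # temporal FFT per pixel
    freqs = np.fft.rfftfreq(frames.shape[0], d=1.0 / fps)
    return freqs, spectrum
```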
Davis’ work shows that even five seconds of video can contain enough information to create realistic simulations.
The team analyzed video clips to find “vibration modes” at different frequencies, each of which represents a distinct way that an object can move. By identifying these mode shapes, the researchers were able to predict how the objects would move in new situations.
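As a rough illustration of this modal idea, the sketch below picks peaks in the per-pixel spectrum from the previous snippet as candidate modes, then drives each one as a damped harmonic oscillator. This is textbook modal superposition with hypothetical names and a fixed, assumed damping value, not the paper’s actual image-space formulation.

```python
import numpy as np

def find_modes(freqs, spectrum, n_modes=3):
    """Pick the strongest temporal-frequency peaks as candidate modes.

    The spatial slice of the spectrum at each peak frequency acts as
    that mode's "shape": where, and how strongly, the object moves at
    that frequency. Peak picking here is deliberately naive.
    """
    power = spectrum.reshape(spectrum.shape[0], -1).sum(axis=1)
    peaks = np.argsort(power[1:])[-n_modes:] + 1    # skip the DC bin
    return [(freqs[i], spectrum[i]) for i in sorted(peaks)]

def modal_response(modes, force, t, damping=0.05):
    """Superpose one damped harmonic oscillator per mode to estimate
    how the object deforms at time t after an impulsive force."""
    response = np.zeros_like(modes[0][1])
    for freq, shape in modes:
        w = 2.0 * np.pi * freq
        # impulse response of a lightly damped oscillator at this mode
        response += force * np.exp(-damping * w * t) * np.sin(w * t) * shape
    return response
```

Summing the per-mode responses to a simulated poke is, in spirit, what lets a user push on the image and watch it react.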
“Computer graphics allows us to use 3-D models to build interactive simulations, but the techniques can be complicated,” said Doug James, a professor of computer science at Stanford University who was not involved in the research. “Davis and his colleagues have provided a simple and clever way to extract a useful dynamics model from very tiny vibrations in video, and shown how to use it to animate an image.”
Davis used the new technique on videos of a bridge, a jungle gym and a ukulele. With a few clicks of the mouse, he showed how he could push and pull the image, as well as bend it and move it in different directions. He even made his own hand appear to move the leaves of a bush.
“If you want to model how an object behaves and responds to different forces, we show that you can observe the object respond to existing forces and assume that it will respond in a consistent way to new ones,” said Davis, who also found that the technique even works on some existing videos on YouTube.
According to the researchers, the tool has potential uses in engineering and entertainment.
For example, in movies it could be used to take video of an existing environment and make some minor edits, such as masking, matting and shading, to achieve a special effect in less time than current methods, and at a fraction of the cost.
Engineers could also use the system to simulate how an old building or bridge would respond to strong winds or an earthquake.
Davis also says there are other futuristic applications in sports films and new forms of virtual reality (VR).
“When you look at VR companies like Oculus, they are often simulating virtual objects in real spaces,” said Davis. “This sort of work turns that on its head, allowing us to see how far we can go in terms of capturing and manipulating real objects in virtual space.”