ThreeDWorld

ThreeDWorld is a new AI-powered virtual platform that simulates a rich, interactive audio-visual environment for human and robotic learning, training, and experimental studies, governed by real-world physics.

Developed by researchers at MIT, the MIT-IBM Watson AI Lab, Harvard University, and Stanford University, the platform demonstrates how to create a rich virtual world akin to the one shown in The Matrix. ThreeDWorld (TDW) simulates high-fidelity audio and visual environments, both indoor and outdoor, and enables users, objects, and mobile agents to interact as they would in real life, governed by the laws of physics. As interactions occur, the orientations, physical properties, and velocities of fluids, soft bodies, and rigid objects are computed and updated, producing realistic collisions and impact sounds.

What is ThreeDWorld

Unlike other simulators, TDW is designed to be flexible and generalisable. Images and sounds are rendered in real time; they can be combined into audio-visual datasets, modified through interactions within a scene, and used to train both humans and neural networks to learn and make predictions. Many types of robots and avatars can also be created to plan and perform tasks in the controlled simulation. In addition, by using virtual reality (VR), researchers can collect real-world data on attention and play behaviour within the environment.

“We want to build a general-purpose simulation platform that can be used for a wide range of AI applications,” says lead researcher Chuang Gan, a research scientist at the MIT-IBM Watson AI Lab.

“At the moment, the majority of AI is based on supervised learning,” says Josh McDermott, an associate professor of Brain and Cognitive Sciences (BCS) and a project lead at the MIT-IBM Watson AI Lab. This type of AI depends on huge datasets of images or sounds annotated by humans, which are not readily available and consume many resources to produce. For physical characteristics of objects, such as mass, which are not always obvious to human observers, labels may not exist at all. A simulator like TDW sidesteps this problem by generating scenes with precisely specified parameters and annotations. Currently available tools tend to be task-specific; TDW, by contrast, can be used for many different tasks that are not possible on other platforms.

TDW is a controlled environment that makes it easier to study how AI robots learn and how to improve them: robot systems can be trained in a safe setting. “Many of us are excited about the possibilities that these types of virtual worlds offer for testing people’s senses and minds. There is the ability to make these very rich sensory experiences while still having complete control and awareness of the environment,” McDermott added.

How it is done

The researchers built TDW on the Unity3D game engine, which allowed them to render both visual and audio data without hand-authored animation. The simulation is made up of two components: the build, which runs the visual, audio, and physics simulations; and the controller, a Python-based interface through which the user sends commands to the build.
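To make the controller-to-build pattern concrete, the sketch below assembles a batch of TDW-style commands (JSON objects keyed by `$type`, which is how the platform's controller communicates with the build) and serialises it as a controller would before sending it over a socket. The specific command names and fields here are illustrative assumptions, not a guaranteed match for the current TDW command API; a real session would use the `tdw` package's `Controller` with a running build.

```python
import json

# A TDW controller batches commands into a list; each command is a JSON
# object whose "$type" field names the operation. The command names and
# fields below are illustrative assumptions, not the verified TDW API.
commands = [
    {"$type": "create_empty_environment"},
    {"$type": "add_object", "name": "chair",
     "position": {"x": 0, "y": 0, "z": 1}},
    {"$type": "set_mass", "id": 1, "mass": 4.5},
]

# The controller serialises the batch and sends it to the build; here we
# only serialise it to show the wire format.
payload = json.dumps(commands)

# With the real library, the equivalent call would be roughly:
#   from tdw.controller import Controller
#   c = Controller()
#   c.communicate(commands)
```

Because the build applies the whole batch in one simulation step, grouping related commands this way keeps scene setup deterministic and reproducible.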

Researchers used 3D models of objects such as furniture, animals, and vehicles to build and populate scenes. These models respond realistically to changes in lighting, material composition, and orientation in space. The team also created virtual floor plans that researchers can populate with agents and avatars. TDW uses generative models of impact sounds to synthesise audio for collisions and other interactions between objects in the simulation. In addition, TDW models sound attenuation and reverberation based on the shape of the space and the objects within it.
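The article does not spell out TDW's acoustic model, but the core idea of distance-based attenuation can be sketched with the textbook free-field (inverse-square) relation, under which sound level drops about 6 dB per doubling of distance. This is purely illustrative; TDW's actual attenuation and reverberation model is derived from room geometry and materials and is more sophisticated than this.

```python
import math

def attenuate(source_db: float, distance_m: float, ref_m: float = 1.0) -> float:
    """Free-field attenuation: level falls ~6 dB per doubling of distance.

    Textbook inverse-square relation, used here only to illustrate
    distance-based attenuation; this is not TDW's actual model.
    """
    # 20 * log10(d / d_ref) is the level drop relative to the reference
    # distance; clamp so a listener inside ref_m hears the source level.
    return source_db - 20.0 * math.log10(max(distance_m, ref_m) / ref_m)

# A 90 dB impact heard from 2 m is about 6 dB quieter than at 1 m.
level = attenuate(90.0, 2.0)
```

Reverberation would then be layered on top of this direct-path level, which is where room shape and surface materials enter the computation.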

Read the full paper here.