INTRODUCING

Aria Synthetic Environments Dataset

WHAT IS IT?

A large-scale, fully simulated dataset of procedurally-generated interior scenes

Aria Synthetic Environments is a synthetic dataset, created from procedurally-generated interior layouts filled with 3D objects, simulated with the sensor characteristics of Aria glasses.

Unlike prior datasets for 3D scene understanding, which are typically too small for ML training, the Aria Synthetic Environments Dataset sets a new precedent for the scale of indoor environment datasets. This opens up new research opportunities in 3D scene reconstruction and in object detection and tracking.

DOWNLOAD THE DATASET BELOW

Dataset Content

  • 100,000 unique multi-room interior scenes
  • Simulated with realistic device trajectories
  • Across ~2-minute trajectories
  • Populated with ~8,000 3D objects
  • With semi-dense map representations

Simulated sensor data per sequence

  • 1 x outward-facing RGB camera stream
  • Simulated Aria camera & lens characteristics

Ground Truth Annotations

  • 6DoF camera trajectory
  • 3D floor plan
  • 2D instance segmentation
  • 2D depth map

Tens of 3D models of interior environments in a grid on black background, illustrating the scale of the Aria Synthetic Environments dataset.

HOW WAS IT CREATED?

100,000 unique scenes, procedurally generated

A fully simulated dataset lets researchers explore new methods for scene understanding tasks.

Bird's-eye view of one of the scenes contained within the Aria Synthetic Environments dataset.

100,000 unique multi-room interior scenes

The Aria Synthetic Environments Dataset has been procedurally generated to produce a diverse set of interior scenes. Each scene has a unique room graph connecting multiple rooms, and unique placement of architectural features, such as windows, doors, and pillars.

Ground-level view of one of the scenes contained within the Aria Synthetic Environments dataset.

Populated with high-quality 3D objects

Each of the 100,000 unique scenes is filled with objects from a digital library, each with high-quality materials and geometry. Objects are diverse and placed according to a simple set of rules that result in a physically valid location for each object.

A visualization of the depth and object instance annotations included with the Aria Synthetic Environments dataset.

Rendered with precisely simulated sensor characteristics

Each simulated sequence is rendered to reflect the lens and sensor characteristics of Project Aria glasses. Inertial data is also simulated, using a noise model that reflects Project Aria’s IMU sensors.
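To give a feel for what an inertial noise model can look like, here is a minimal sketch of a common textbook formulation: additive white noise plus a slowly drifting random-walk bias on one gyroscope axis. It is not Project Aria's actual IMU model. The function name, parameter names, and numeric values are all assumptions made for illustration.

```python
import math
import random


def simulate_gyro_axis(true_rate, dt, noise_density, bias_walk, seed=0):
    """Corrupt a list of true angular rates (rad/s) with generic IMU noise.

    Assumed model (illustrative, not Project Aria's):
      measured = true + bias + white_noise
    where the bias follows a random walk between samples.
    """
    rng = random.Random(seed)
    bias = 0.0
    measured = []
    for rate in true_rate:
        # Random-walk bias: per-step standard deviation grows with sqrt(dt).
        bias += bias_walk * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # White noise: discrete standard deviation is density / sqrt(dt).
        white = noise_density / math.sqrt(dt) * rng.gauss(0.0, 1.0)
        measured.append(rate + bias + white)
    return measured


# Hypothetical values: 1 second of a constant 0.1 rad/s turn sampled at 1 kHz.
clean = [0.1] * 1000
noisy = simulate_gyro_axis(clean, dt=0.001, noise_density=1e-4, bias_walk=1e-5)
print(len(noisy), min(noisy), max(noisy))
```

Setting both noise parameters to zero returns the input unchanged, which is a quick sanity check when experimenting with the sketch.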

Screenshot showing the CAD-like language used by the Aria Synthetic Environments dataset to describe doors, walls, and windows.

Each described with a CAD-like language for architectural entities

Architectural features, such as doors, windows, and pillars, are described with a CAD-like language that records each feature's type, location, and dimensions. This opens up exciting new ways to tackle research challenges related to reconstruction and detection tasks.
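As an illustration of how a description made of feature type, location, and dimensions could be consumed in code, the sketch below parses a hypothetical line format of the form `kind, key=value, ...`. This syntax, the entity names, and the parameter names are assumptions made for the example. The dataset's actual grammar is defined by its accompanying tools.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One architectural entity: a feature type plus named numeric parameters."""
    kind: str
    params: dict = field(default_factory=dict)


def parse_entity(line: str) -> Entity:
    """Parse one line of the assumed form 'kind, key=value, key=value, ...'."""
    parts = [p.strip() for p in line.split(",") if p.strip()]
    if not parts:
        raise ValueError("empty entity line")
    kind, raw_params = parts[0], parts[1:]
    params = {}
    for item in raw_params:
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        params[key.strip()] = float(value)
    return Entity(kind, params)


def parse_scene(text: str) -> list:
    """Parse every non-blank line of a scene description."""
    return [parse_entity(line) for line in text.splitlines() if line.strip()]


# Hypothetical sample, not taken from the dataset.
SAMPLE = """\
wall, x0=0.0, y0=0.0, x1=4.0, y1=0.0, height=2.7
door, x=1.0, y=0.0, width=0.9, height=2.0
window, x=3.0, y=0.0, width=1.2, height=1.0
"""

entities = parse_scene(SAMPLE)
for entity in entities:
    print(entity.kind, entity.params)
```

Keeping each entity as a type plus a flat parameter dictionary makes it straightforward to filter by feature type or to convert the parameters into geometry later.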

Visualization of one of the scenes contained within the Aria Synthetic Environments dataset. The simulated trajectory of an Aria device is included within the scene.

Realistic simulated trajectory within each environment

Before rendering each sequence, device trajectories are simulated within each environment according to a set of rules that mirror how users walk while wearing Project Aria glasses. Trajectories are created automatically and ensure a full traversal of each virtual scene.

A visualization of the semi-dense point cloud annotations included with the Aria Synthetic Environments dataset.

Semi-dense map representation for each scene

In addition to per-frame depth and instance maps for each sequence, semi-dense point cloud representations are also made available for each environment. These additional representations open up new ways for researchers to tackle reconstruction and detection tasks.
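To show how a per-frame depth map relates to 3D points, here is a minimal back-projection sketch. It assumes an ideal pinhole camera and depth measured along the optical axis. Both are simplifying assumptions, since the sequences are rendered with simulated Aria lens characteristics, so real data should be handled with the dataset's own calibration and tools. The function name and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical.

```python
def backproject_depth(depth, fx, fy, cx, cy):
    """Convert a depth map (list of rows) into camera-frame 3D points.

    Assumes a pinhole model: pixel (u, v) with depth z maps to
    x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


# Hypothetical 3x3 depth map in meters, with one invalid pixel.
depth_map = [
    [2.0, 2.0, 2.0],
    [2.0, 2.0, 2.0],
    [0.0, 2.0, 2.0],
]
points = backproject_depth(depth_map, fx=100.0, fy=100.0, cx=1.0, cy=1.0)
print(len(points), points[0])
```

Transforming these camera-frame points by the 6DoF camera pose of each frame would place them in a common scene frame.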

ARIA SYNTHETIC ENVIRONMENTS DATASET TOOLS

Comprehensive tools to load and visualize data easily

Tools accompanying the Aria Synthetic Environments Dataset allow researchers to interpret the dataset’s CAD-like language and to explore the data in an interactive 3D floorplan viewer.

Additionally, since Fall 2024, Aria Synthetic Environments has supported ATEK, an end-to-end framework for training and evaluating deep learning models on Aria data, for both 3D egocentric-specific and general machine perception tasks.

VIEW ARIA SYNTHETIC ENVIRONMENTS TOOLS ON GITHUB
LEARN MORE ABOUT ATEK
ATEK Diagram. PyTorch Compatible. Processing, Evaluation, Data Store

Access Aria Synthetic Environments Dataset and accompanying Tools

If you work in AI or ML research, access the Aria Synthetic Environments Dataset and accompanying tools here.

By submitting your email and accessing the Aria Synthetic Environments Dataset, you agree to abide by the dataset license agreement and to receive emails in relation to the dataset.
