INTRODUCING

Aria Digital Twin Dataset

A real-world dataset with a hyper-accurate digital counterpart and comprehensive ground-truth annotation

WHAT IS IT?

An egocentric dataset with extensive and accurate ground-truth

Aria Digital Twin is an egocentric dataset captured using Aria glasses, with extensive simulated ground truth for the devices, objects, and environment.

This dataset sets a new standard for egocentric machine perception research, accelerating work on challenges including 3D object detection and tracking, scene reconstruction and understanding, sim-to-real learning, and human pose prediction, while also inspiring new machine perception tasks for augmented reality (AR) applications.

DOWNLOAD THE DATASET BELOW
EXPLORE THE DATA IN ARIA DATASET EXPLORER

Dataset Content

  • 200 sequences (~400 mins)
  • 398 objects (324 stationary, 74 dynamic)
  • 2 real indoor scenes
  • Single + multi-user activities

Sensor Data Per Device

  • 2 x outward-facing monochrome camera streams
  • 1 x outward-facing RGB camera stream
  • 2 x IMU streams
  • 2 x inward-facing eye-tracking camera streams
  • Complete sensor calibrations

Annotations

  • 6DoF device trajectory
  • 3D object pose
  • 3D human skeleton
  • 3D eye gaze
  • 2D photo-realistic synthetic rendering
  • 2D bounding box
  • 2D instance segmentation
  • 2D depth map

A visualization of the data contained within the Project Aria Digital Twin Dataset, including raw footage, object annotations, 3D depth estimation, and object instances.

HOW WAS IT CREATED?

A digital twin for a physical world

The Aria Digital Twin Dataset was captured in 2 different locations within Meta offices in North America, each with an extensive ground-truth survey.

Two side-by-side images of a birdhouse, one which is from real-world footage, the other which is simulated. The image shows very little difference between the two.

Photo-realistic object reconstruction

Each object within the Aria Digital Twin is laser scanned to reconstruct highly precise geometry. Object materials are modeled using a photogrammetry pipeline and fine-tuned to ensure that images rendered from the models accurately match real images of the object.

Two side-by-side images of a scene from the Aria Digital Twin Dataset, one which is from real-world footage, the other which is simulated. The image shows very little difference between the two.

Hyper-accurate scene digitization

Each of the dataset's two rooms was laser scanned and modeled to ensure high-quality ground truth for the environment.

HOW IS THE DATASET ANNOTATED?

Comprehensive ground-truth of the real-world environment

For every frame of motion in the real-world footage, the Aria Digital Twin Dataset has a complete set of ground-truth data at the human, object, and scene level.

A visualization of the trajectory from Project Aria Glasses, contained within the Aria Digital Twin dataset.

High-quality device and object 6DoF poses

Camera and object trajectories are provided for every sequence, aligned to the same reference frame as the scene geometry, so all annotations can be interpreted in a common context.
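
Because everything shares one reference frame, relating an object to the camera reduces to composing two ground-truth transforms. Here is a minimal sketch of that arithmetic in plain NumPy; the 4x4 matrices and variable names are illustrative, not the dataset's API.

    # Minimal sketch (not the ADT API): with device and object poses expressed
    # in the same scene frame as 4x4 homogeneous transforms, the object's pose
    # in the camera frame is the inverse device pose composed with the object pose.
    import numpy as np

    T_scene_camera = np.eye(4)  # ground-truth device (camera) pose in the scene frame
    T_scene_object = np.eye(4)  # ground-truth object pose in the scene frame

    # Object pose relative to the camera at this frame.
    T_camera_object = np.linalg.inv(T_scene_camera) @ T_scene_object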

A visualization of the depth map and object instances contained within the Aria Digital Twin dataset.

High quality depth-maps and object segmentation

Aria Digital Twin derives depth maps and object segmentation by leveraging the complete scene reconstruction and dynamic object tracking, giving researchers additional knowledge of the objects and the scene.

A visualization of the 3D human pose rig annotations contained within the Aria Digital Twin dataset.

3D human poses

In addition to camera poses, each Aria wearer is outfitted with a full-body motion-capture suit used to estimate the wearer's joint positions. This allows dataset users to explore methods for full-body pose estimation.

Two side-by-side images of a scene from the Aria Digital Twin Dataset, one which is from real-world footage, the other which is simulated. The image shows very little difference between the two. Optitrack markers can be seen on the real-world image.

Faithfully-simulated synthetic sensor data

Each real-world sequence is accompanied by a synthetic sequence rendered at photo-realistic quality and matched to the characteristics of the RGB and monochrome sensors on Aria glasses.

A visualization of the eye gaze vector annotations contained within the Aria Digital Twin dataset.

3D eye gaze vectors

Using data from Project Aria's eye-tracking cameras, Aria Digital Twin includes an estimate of the wearer's eye gaze as a 3D vector with depth information. This captures a form of user-object interaction beyond hand movements.
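
As a rough illustration of how such an annotation can be used (the variable names below are hypothetical, not the dataset's schema), a gaze ray combined with a depth estimate pins down a single 3D fixation point:

    # Hypothetical sketch: combine a gaze ray (origin + unit direction in the
    # device frame) with an estimated depth to get the 3D fixation point.
    import numpy as np

    gaze_origin = np.zeros(3)                   # ray origin in the device frame
    gaze_direction = np.array([0.0, 0.0, 1.0])  # unit direction, e.g. straight ahead
    gaze_depth_m = 1.5                          # estimated distance to the gazed point

    # The fixated 3D point lies along the ray at the estimated depth.
    gaze_point = gaze_origin + gaze_depth_m * gaze_direction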

NEW SINCE FALL 2024

Publicly Available 3D Object Models

Nearly all objects seen in the dataset have a high-quality 3D object model that can be downloaded.

Each object has a carefully crafted 3D model with a geometric accuracy of 5 mm and 4K-resolution PBR textures. These models were used to generate the synthetic renderings available in the sequence data. By combining these models with the ground-truth object poses, users can generate their own synthetic data for training ML models. For more information on downloading and visualizing the models, refer to the ADT Docs.
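
As a rough sketch of that workflow, using the generic trimesh library rather than the official ADT tooling and a hypothetical file path, a downloaded model can be placed into the scene frame by applying its ground-truth pose:

    # Minimal sketch, not the official ADT pipeline: load a downloaded object
    # model and move it into the scene frame using its ground-truth 6DoF pose.
    import numpy as np
    import trimesh  # generic mesh library; see the ADT Docs for the official tools

    mesh = trimesh.load("models/birdhouse.glb")  # hypothetical model path

    # Ground-truth object pose: rotation R (3x3) and translation t (3,) taken
    # from the sequence annotations; placeholder values used here.
    R = np.eye(3)
    t = np.array([0.5, 0.0, 1.2])

    T = np.eye(4)  # assemble the 4x4 homogeneous transform
    T[:3, :3] = R
    T[:3, 3] = t

    mesh.apply_transform(T)  # vertices are now in the scene frame, ready to render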

ARIA DIGITAL TWIN DATASET TOOLS

Comprehensive tools to load and visualize data easily

Tools for working with Aria Digital Twin allow researchers to access, interact with, and visualize all raw data and annotations available in the dataset.

We provide both C++ and Python interfaces to load data, so that researchers can access data in the way best suited to their needs. We also provide tools for querying dataset contents, so that specific types of data can be surfaced.
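
For example, opening a sequence from Python might look like the sketch below. Class and method names follow the projectaria_tools ADT tutorials but can change between releases, so treat this as illustrative and verify against the ADT Docs.

    # Illustrative sketch of loading an ADT sequence with the projectaria_tools
    # Python bindings; names follow the ADT tutorials and may vary by release.
    from projectaria_tools.core.stream_id import StreamId
    from projectaria_tools.projects.adt import (
        AriaDigitalTwinDataPathsProvider,
        AriaDigitalTwinDataProvider,
    )

    sequence_path = "/path/to/adt_sequence"  # hypothetical local path

    # Resolve the sequence's file layout, then build the ground-truth provider.
    paths_provider = AriaDigitalTwinDataPathsProvider(sequence_path)
    gt_provider = AriaDigitalTwinDataProvider(paths_provider.get_datapaths())

    # Query RGB frames by device capture timestamp.
    rgb_stream_id = StreamId("214-1")  # Aria RGB camera stream
    timestamps_ns = gt_provider.get_aria_device_capture_timestamps_ns(rgb_stream_id)
    image = gt_provider.get_aria_image_by_timestamp_ns(timestamps_ns[0], rgb_stream_id)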

Additionally, since Fall 2024, Aria Digital Twin also supports ATEK, an end-to-end framework for training and evaluating deep learning models on Aria data, covering both egocentric-specific 3D tasks and general machine perception tasks.

VIEW ARIA DIGITAL TWIN TOOLS ON GITHUB
LEARN MORE ABOUT ATEK
ATEK Diagram. PyTorch Compatible. Processing, Evaluation, Data Store

Enabling innovation, responsibly

All sequences within the Aria Digital Twin Dataset were captured by fully consenting researchers in controlled environments within Meta offices.

RESPONSIBLE INNOVATION PRINCIPLES
A researcher wearing a mocap suit acts out a scene, contained within the Aria Digital Twin dataset.

Read the accompanying ADT Research Paper

For more information about the Aria Digital Twin Dataset, read our paper here.

ARIA DIGITAL TWIN RESEARCH PAPER
A screenshot from the Aria Digital Twin research paper.

Access Aria Digital Twin Dataset

If you are an AI or ML researcher, access the Aria Digital Twin Dataset and accompanying tools here.

By submitting your email and accessing the Aria Digital Twin Dataset, you agree to abide by the dataset license agreement and to receive emails in relation to the dataset.

Subscribe to Project Aria Updates

Stay in the loop with the latest news from Project Aria.

By providing your email, you agree to receive marketing-related electronic communications from Meta, including news, events, updates, and promotional emails related to Project Aria. You may withdraw your consent and unsubscribe from these at any time, for example, by clicking the unsubscribe link included in our emails. For more information about how Meta handles your data, please read our Data Policy.