EFM3D: a new benchmark for 3D egocentric perception tasks

Meta Reality Labs Research


EFM3D accelerates a novel class of foundation models rooted in 3D space.
A new model, EVL, establishes the first baseline for the benchmark.

Why are foundation models important?

Foundation models such as DINOv2 offer versatile solutions for a range of 2D computer vision tasks, including object detection and tracking. However, these models often do not leverage the detailed 3D information available from the latest AR and VR devices, such as camera poses, calibration, and semi-dense point data.

EFM3D introduces the first benchmark for foundation models that leverage the robust 3D priors available on egocentric devices. This initiative is designed to adapt foundation models to better meet the specific needs of AR and VR technologies and to take advantage of their capabilities.
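
A quick sketch makes these 3D priors concrete. The names below are purely illustrative (this is not the EFM3D or Project Aria API); it shows the kind of data an egocentric device supplies alongside its images and how world-space points relate to pixels:

```python
# Illustrative sketch of egocentric 3D priors; all names are hypothetical.
from dataclasses import dataclass

import numpy as np


@dataclass
class EgocentricFrame:
    image: np.ndarray           # (H, W, 3) RGB image
    T_world_camera: np.ndarray  # (4, 4) camera-to-world pose from on-device SLAM
    K: np.ndarray               # (3, 3) intrinsics from factory calibration


@dataclass
class EgocentricSequence:
    frames: list[EgocentricFrame]
    semi_dense_points: np.ndarray  # (N, 3) world-space points from SLAM


def project(points_world: np.ndarray, frame: EgocentricFrame) -> np.ndarray:
    """Project world-space points into a frame's pixel coordinates."""
    T_camera_world = np.linalg.inv(frame.T_world_camera)
    p_cam = (T_camera_world[:3, :3] @ points_world.T + T_camera_world[:3, 3:4]).T
    p_pix = (frame.K @ p_cam.T).T
    return p_pix[:, :2] / p_pix[:, 2:3]
```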

READ THE PAPER
CHECK OUT THE CODE
Task-specific model

Task-specific models, such as Mask R-CNN, are designed to solve a single problem using 2D data.

Foundation models

Foundation models, like DINOv2, are highly versatile and serve as a base for various applications across different domains.

3D location data

EFM3D is the first benchmark to measure the performance of foundation models that utilize fine-grained 3D location data.

EFM3D Benchmark tasks

EFM3D tracks performance on two separate computer vision tasks:

3D Surface Regression

Understanding how an environment is shaped is crucial for various computer vision applications, such as self-driving cars, AR, and robotics.

The EFM3D benchmark tracks the performance of 3D surface regression, helping ensure that spatial computing devices are equipped to understand and interact with the physical world around them.
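
As a rough illustration of how surface predictions can be scored, the sketch below computes common accuracy and completeness metrics over points sampled from the predicted and ground-truth meshes. This shows the general idea only; see the paper for the exact EFM3D protocol:

```python
# Hedged sketch of common surface metrics, not the exact EFM3D protocol.
import numpy as np
from scipy.spatial import cKDTree


def surface_metrics(pred_pts: np.ndarray, gt_pts: np.ndarray, thresh: float = 0.05):
    """pred_pts, gt_pts: (N, 3) points sampled from each mesh, in meters."""
    d_pred_to_gt = cKDTree(gt_pts).query(pred_pts)[0]  # accuracy distances
    d_gt_to_pred = cKDTree(pred_pts).query(gt_pts)[0]  # completeness distances
    return {
        "accuracy": d_pred_to_gt.mean(),      # lower is better
        "completeness": d_gt_to_pred.mean(),  # lower is better
        "precision": (d_pred_to_gt < thresh).mean(),
        "recall": (d_gt_to_pred < thresh).mean(),
    }
```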

3D Bounding Box Detection

Spatial computing devices need to understand what they are looking at in order to react appropriately to the context around them.

The EFM3D benchmark tracks the performance of 3D object detection, ensuring that devices and users can interact seamlessly with their real-world surroundings.
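
For illustration, here is a simplified sketch of how predicted boxes can be matched against ground truth during evaluation. EFM3D uses oriented 3D boxes; computing axis-aligned 3D IoU, as below, is a deliberate simplification to show the idea:

```python
# Simplified sketch: axis-aligned 3D IoU (EFM3D's boxes are oriented).
import numpy as np


def aabb_iou(box_a: np.ndarray, box_b: np.ndarray) -> float:
    """Each box is a (2, 3) array holding (min_xyz, max_xyz) corners."""
    lo = np.maximum(box_a[0], box_b[0])
    hi = np.minimum(box_a[1], box_b[1])
    inter = np.prod(np.clip(hi - lo, 0.0, None))  # overlap volume
    vol_a = np.prod(box_a[1] - box_a[0])
    vol_b = np.prod(box_b[1] - box_b[0])
    return float(inter / (vol_a + vol_b - inter))
```

Greedy matching of predictions to ground truth above an IoU threshold then yields precision and recall for mAP-style scoring.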

EVL 3D Bounding Box Predictions
Accelerating open development of 3D egocentric foundation models

Three open datasets to evaluate and train egocentric foundation models

To facilitate novel research in 3D Egocentric Foundation Models (EFMs), we are releasing three open datasets for training and evaluation: one new dataset and new annotations for two existing ones.


Aria Everyday Objects

A new real-world validation dataset with high-quality 3D oriented bounding box (OBB) annotations, containing 25 diverse scenes with 1,037 OBB instances across 17 object classes.

VIEW AEO DATASET PAGE


Aria Synthetic Environments (new ATEK update)

New annotations for the Aria Synthetic Environments dataset include 3 million 3D OBBs across 43 object classes, along with 100 3D ground-truth meshes for scenes in the evaluation set.

VIEW ASE DATASET PAGE


Aria Digital Twin (new ATEK update)

New annotations for the Aria Digital Twin dataset include 3D ground-truth meshes for all scenes, enabling the benchmarking of 3D reconstruction methods.

VIEW ADT DATASET PAGE

The first baseline model in the EFM3D Benchmark

Egocentric Voxel Lifting (EVL): the first baseline for 3D egocentric foundation models

EVL is the first baseline egocentric foundation model that solves both EFM3D tasks by leveraging the strong 3D priors in egocentric data.
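
As a conceptual illustration of the voxel-lifting idea (not the actual EVL implementation; see the GitHub repository for that), the sketch below projects world-space voxel centers into each posed frame and averages the sampled 2D features per voxel:

```python
# Conceptual sketch of 2D-to-3D feature lifting; not the actual EVL code.
import torch
import torch.nn.functional as F


def lift_features(feats, K, T_cam_world, voxel_centers):
    """
    feats:         (V, C, Hf, Wf) per-view 2D feature maps
    K:             (V, 3, 3) intrinsics, scaled to the feature resolution
    T_cam_world:   (V, 4, 4) world-to-camera extrinsics
    voxel_centers: (N, 3) world-space voxel centers
    """
    V, C, Hf, Wf = feats.shape
    N = voxel_centers.shape[0]
    pts_h = torch.cat([voxel_centers, torch.ones(N, 1)], dim=1)  # (N, 4)
    accum = torch.zeros(N, C)
    count = torch.zeros(N, 1)
    for v in range(V):
        p_cam = (T_cam_world[v] @ pts_h.T).T[:, :3]              # (N, 3)
        in_front = p_cam[:, 2] > 0.1                             # ahead of camera
        pix = (K[v] @ p_cam.T).T
        pix = pix[:, :2] / pix[:, 2:3]
        # Normalize pixel coordinates to [-1, 1] for grid_sample.
        grid = torch.stack([pix[:, 0] / (Wf - 1),
                            pix[:, 1] / (Hf - 1)], dim=-1) * 2 - 1
        sampled = F.grid_sample(feats[v:v + 1], grid.view(1, -1, 1, 2),
                                align_corners=True)
        sampled = sampled.squeeze(0).squeeze(-1).T               # (N, C)
        in_view = in_front & (grid.abs().max(dim=1).values <= 1)
        accum[in_view] += sampled[in_view]
        count[in_view] += 1
    return accum / count.clamp(min=1)  # (N, C) fused voxel features
```

A 3D network can then consume the fused voxel grid to address both benchmark tasks.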

CHECK OUT THE CODE ON GITHUB

Access the EVL Model Weights

If you are an AI or ML researcher, you can access the EVL model weights here.

By submitting your email and accessing the EVL Model, you agree to abide by the dataset license agreement and to receive emails in relation to EFM3D.

Read the accompanying EFM3D Paper

For more information about EFM3D, read our paper here.


Acknowledgements

Research Authors

Julian Straub, Daniel DeTone, Tianwei Shen, Nan Yang,
Chris Sweeney, Richard Newcombe

with contribution from

Suvam Patra, Armen Avetisyan, Samir Aroudj, Chris Xie, Henry Howard-Jenkins, Yang Lou, Kang Zheng, Shangyi Cheng, Xiaqing Pan, Thomas Whelan, Numair Khan, Campbell Orme, Dan Barnes, Raul Mur Artal, Lingni Ma, Austin Kukay, Rowan Postyeni, Abha Arora, Luis Pesqueira, and Edward Miller.
