Meta Reality Labs Research
EFM3D: a new benchmark for 3D egocentric perception tasks
EFM3D accelerates a novel class of foundation models rooted in 3D space.
A new model, EVL, establishes the first baseline for the benchmark.
Foundation models such as DINOv2 offer versatile solutions for a range of 2D computer vision tasks, including object detection and tracking. However, these models often do not leverage the detailed 3D information available from the latest AR and VR devices, such as camera poses, calibration, and semi-dense point data.
EFM3D introduces the first benchmark for foundation models that exploit the robust 3D priors available from egocentric devices. This initiative is designed to adapt foundation models to better meet the specific needs of AR and VR technologies and to leverage their capabilities.
Task-specific models, such as Mask R-CNN, are designed to solve a single, well-defined problem using 2D data.
Foundation models, like DINOv2, are highly versatile and serve as a base for various applications across different domains.
EFM3D is the first benchmark to measure the performance of foundation models that utilize fine-grained 3D location data.
EFM3D tracks performance on two 3D computer vision tasks
Understanding how an environment is shaped is crucial for various computer vision applications, such as self-driving cars, AR, and robotics.
The EFM3D benchmark tracks performance on 3D surface prediction, helping ensure that spatial computing devices are equipped to understand and interact with the physical world around them.
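As a rough illustration of how surface prediction is commonly scored, the sketch below samples points from a predicted and a ground-truth mesh and measures accuracy (predicted-to-ground-truth distance) and completeness (ground-truth-to-predicted distance). The function, variable names, and threshold are illustrative assumptions, not necessarily the exact metric definition used by the benchmark.

```python
# Minimal sketch of common surface-reconstruction metrics between point samples
# of a predicted mesh and a ground-truth mesh. Illustrative only.
import numpy as np
from scipy.spatial import cKDTree

def surface_metrics(pred_points: np.ndarray, gt_points: np.ndarray, thresh: float = 0.05):
    """pred_points: (N, 3), gt_points: (M, 3) point samples in meters."""
    d_pred_to_gt, _ = cKDTree(gt_points).query(pred_points)   # accuracy distances
    d_gt_to_pred, _ = cKDTree(pred_points).query(gt_points)   # completeness distances
    return {
        "accuracy": d_pred_to_gt.mean(),
        "completeness": d_gt_to_pred.mean(),
        # fraction of points within `thresh` in each direction (an F-score-style summary)
        "precision@t": (d_pred_to_gt < thresh).mean(),
        "recall@t": (d_gt_to_pred < thresh).mean(),
    }
```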
Spatial computing devices need to understand what they are looking at in order to react appropriately to the context around them.
The EFM3D benchmark tracks performance on 3D object detection, ensuring that both devices and users can interact seamlessly with their real-world surroundings.
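3D object detection in this setting predicts oriented bounding boxes (OBBs) in world coordinates. The sketch below shows one common OBB parameterization, a center, per-axis extents, and a rotation, and how to recover its eight corners. The names and conventions here are illustrative assumptions, not the benchmark's exact annotation format.

```python
# Minimal sketch of a 3D oriented bounding box (OBB) and its corner points.
# Conventions (full extents, world-from-box rotation) are assumptions for illustration.
import numpy as np

def obb_corners(center: np.ndarray, dims: np.ndarray, R_world_box: np.ndarray) -> np.ndarray:
    """Return the 8 corner points (8, 3) of an OBB in world coordinates.

    center: (3,) box center; dims: (3,) full extents along the box axes;
    R_world_box: (3, 3) rotation mapping box-frame directions to the world frame.
    """
    # Unit-cube corner signs, scaled to half-extents in the box frame.
    signs = np.array([[sx, sy, sz] for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)])
    corners_box = signs * (dims / 2.0)
    return corners_box @ R_world_box.T + center
```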
Three open datasets to evaluate and train egocentric foundation models
To facilitate novel research in 3D Egocentric Foundation Models (EFMs), we release three datasets for both training and evaluation: a new real-world validation dataset and new annotations for two existing Aria datasets.
Aria Everyday Objects
A new real-world validation dataset with high-quality 3D oriented bounding box (OBB) annotations, containing 25 diverse scenes and 1037 OBB instances across 17 object classes.
VIEW AEO DATASET PAGE
New annotations for the Aria Synthetic Environments dataset include 3 million 3D oriented bounding boxes (OBBs) across 43 object classes, along with 100 ground-truth 3D meshes for scenes in the evaluation set.
VIEW ASE DATASET PAGE
New annotations for the Aria Digital Twin dataset include a ground-truth 3D mesh for every scene, enabling the benchmarking of 3D reconstruction methods.
VIEW ADT DATASET PAGE
Egocentric Voxel Lifting (EVL): the first baseline for 3D egocentric foundation models
EVL is the first baseline egocentric foundation model that solves both EFM3D tasks by leveraging the strong priors in egocentric data.
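The name suggests lifting 2D image features into a 3D voxel grid using the camera poses and calibration that egocentric devices provide. The sketch below is a minimal, generic illustration of that idea, assuming a simple pinhole camera and nearest-pixel sampling; it is a sketch of the general technique, not the EVL implementation. In practice, features from many frames would be fused per voxel, for example by averaging.

```python
# Minimal sketch of "voxel lifting": project voxel centers into a posed,
# calibrated image and gather 2D features at those pixels. Assumes a pinhole
# camera and float32 inputs; illustrative only, not the EVL implementation.
import torch

def lift_features_to_voxels(feat_2d, K, T_cam_world, voxel_centers_world):
    """
    feat_2d: (C, H, W) image feature map.
    K: (3, 3) pinhole intrinsics; T_cam_world: (4, 4) world-to-camera transform.
    voxel_centers_world: (N, 3) voxel centers in world coordinates.
    Returns (N, C) features; zero for voxels that project outside the image.
    """
    C, H, W = feat_2d.shape
    n = voxel_centers_world.shape[0]
    homog = torch.cat([voxel_centers_world, torch.ones(n, 1)], dim=1)  # (N, 4)
    p_cam = (T_cam_world @ homog.T).T[:, :3]                 # points in camera frame
    valid = p_cam[:, 2] > 1e-3                               # keep points in front of camera
    uv = (K @ p_cam.T).T
    uv = uv[:, :2] / uv[:, 2:3].clamp(min=1e-3)              # perspective divide
    u, v = uv[:, 0].round().long(), uv[:, 1].round().long()
    valid &= (u >= 0) & (u < W) & (v >= 0) & (v < H)
    out = torch.zeros(n, C)
    out[valid] = feat_2d[:, v[valid], u[valid]].T            # nearest-pixel sampling
    return out
```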
Access the EVL Model Weights
If you are an AI or ML researcher, you can access the EVL model weights here.
By submitting your email and accessing the EVL Model, you agree to abide by the dataset license agreement and to receive emails in relation to EFM3D.
Read the accompanying EFM3D Paper
For more information about EFM3D, read our paper here.
Acknowledgements
Research Authors
Julian Straub, Daniel DeTone, Tianwei Shen, Nan Yang,
Chris Sweeney, Richard Newcombe
with contributions from
Suvam Patra, Armen Avetisyan, Samir Aroudj, Chris Xie, Henry Howard-Jenkins, Yang Lou, Kang Zheng, Shangyi Cheng, Xiaqing Pan, Thomas Whelan, Numair Khan, Campbell Orme, Dan Barnes, Raul Mur Artal, Lingni Ma, Austin Kukay, Rowan Postyeni, Abha Arora, Luis Pesqueira, and Edward Miller.