Building on the work that two years ago led to the release of Ego4D (Egocentric 4D Live Perception), the world’s most diverse egocentric dataset, the Ego4D consortium has dramatically expanded the reach and ambition of its research with the newly published Ego-Exo4D – a foundational dataset to support research on video learning and multimodal perception.
A University of Bristol research team led by Professor Dima Damen at the School of Computer Science is part of an international consortium of 13 universities, in partnership with Meta, that is driving research in computer vision by collecting joint egocentric and exocentric datasets of skilled human activities.
The result of a two-year effort by Meta’s FAIR (Fundamental Artificial Intelligence Research), Project Aria, and the Ego4D consortium of 13 university partners, Ego-Exo4D is a first-of-its-kind large-scale multimodal multiview dataset and benchmark suite. Its defining feature is its simultaneous capture of both first-person ‘egocentric’ views, from a participant’s wearable camera, and multiple ‘exocentric’ views, from cameras surrounding the participant. Together, these two perspectives will give AI models a new window into complex skilled human activity, enabling approaches that learn how skilled participants perform tasks such as dancing and playing music, as well as procedures such as maintaining a bicycle.