FIRST PERSON (EGOCENTRIC) VISION FOR HUMAN-CENTRIC ASSISTANCE: HISTORY, BUILDING BLOCKS, AND APPLICATIONS

ABSTRACT

Wearable devices equipped with a camera and computing abilities are attracting the attention of both the market and society, with commercial devices increasingly available and many companies announcing the upcoming release of new devices. The main appeal of wearable devices lies in their mobility and their ability to enable user-machine interaction through Augmented Reality. Thanks to these characteristics, wearable devices provide an ideal platform for developing intelligent assistants able to assist humans and augment their abilities, a goal in which Artificial Intelligence and Computer Vision play a major role.

Unlike classic computer vision (the so-called “third person vision”), which analyses images collected from a static point of view, first person (egocentric) vision assumes that images are collected from the point of view of the user, which gives privileged information on the user’s activities and the way they perceive and interact with the world. Indeed, the visual data acquired with wearable cameras usually provides useful information about the users, their intentions, and how they interact with the world.

This tutorial will discuss the challenges and opportunities offered by first person (egocentric) vision, covering the historical background and seminal works, presenting the main technological tools and building blocks, and discussing applications.

Tutorial organizers

Antonino Furnari, University of Catania and Next Vision s.r.l., Italy

Francesco Ragusa, University of Catania and Next Vision s.r.l., Italy

Antonino Furnari is a research fellow at the University of Catania. He received his PhD in Mathematics and Computer Science in 2017 from the University of Catania and has authored one patent and more than 50 papers in international book chapters, journals, and conference proceedings. Antonino Furnari is involved in the organization of several international events, such as the Assistive Computer Vision and Robotics (ACVR) workshop series (since 2016), the International Computer Vision Summer School (ICVSS) (since 2017), and the Egocentric Perception Interaction and Computing (EPIC) workshop series (since 2018). In 2020, he was guest editor for IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) for a special issue on “Egocentric Perception”. Since 2018, he has been involved in the collection, release, and maintenance of the EPIC-KITCHENS dataset series, and in particular in the egocentric action anticipation and action detection challenges. Since 2021, he has been involved in the collection and benchmarking of the EGO4D dataset. He is co-founder of NEXT VISION s.r.l., an academic spin-off of the University of Catania since 2021. His research interests concern Computer Vision, Pattern Recognition, and Machine Learning, with a focus on First Person Vision. More information is available at http://www.antoninofurnari.it/.

Francesco Ragusa is a postdoc at the University of Catania. He has been a member of the IPLAB (University of Catania) research group since 2015. He completed an Industrial Doctorate in Computer Science in 2021. During his PhD studies, he spent a period as a Research Student at the University of Hertfordshire, UK. He received his master’s degree in computer science (cum laude) in 2017 from the University of Catania. Francesco has authored one patent and more than 10 papers in international journals and international conference proceedings. He serves as a reviewer for several international conferences in the fields of computer vision and multimedia, such as CVPR, BMVC, WACV, ACM Multimedia, ICPR, and ICIAP, and for international journals, including Pattern Recognition Letters and IET Computer Vision. Francesco Ragusa is a member of IEEE, CVF, and CVPL. He has been involved in several research projects and has focused on human-object interaction anticipation from egocentric videos as a key to analyzing and understanding human behavior in industrial workplaces. He is co-founder and CEO of NEXT VISION s.r.l., an academic spin-off of the University of Catania since 2021. His research interests concern Computer Vision, Pattern Recognition, and Machine Learning, with a focus on First Person Vision. More information is available at https://iplab.dmi.unict.it/ragusa/.