Aditya Malik, Nalini Ratha, et al.
CAI 2024
Despite significant advancements in modern artificial intelligence, human visual object recognition remains markedly superior in generalizability and adaptability. For example, when encountering a novel object, humans can readily distinguish it from dissimilar ones, a task that remains challenging for modern AI systems. Active vision exploits visual exploration over time to associate visual characteristics with sensorimotor contingencies, enabling object discrimination. Recently, Kolner et al. [1] introduced Glimpse-based Active Perception (GAP), a model that selectively attends to salient areas of an image guided by visual attention mechanisms. By merging the spatial and visual information of the attended regions of interest (ROIs), GAP learns representations that generalize to out-of-distribution (OOD) visual content. Although this approach achieves state-of-the-art OOD generalization on several visual tasks, it has not been tested in real-world applications. This work extends GAP with bio-inspired, event-driven processing, enabling the model to handle dynamic visual inputs efficiently. Unlike traditional frame-based sensors that capture full images at fixed intervals, event-based cameras asynchronously detect changes in brightness with microsecond precision, producing a highly efficient, low-latency stream of events with superior temporal resolution and reduced redundancy.
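To make the two ingredients of the abstract concrete, the following is a minimal sketch, not the authors' implementation: it accumulates an asynchronous event stream into a crude activity map (a stand-in for the saliency signal that guides attention) and extracts a glimpse that pairs ROI pixel content with its normalized location, mirroring the spatial-plus-visual ("what/where") merging described above. The function names `events_to_frame` and `extract_glimpse`, the event-count saliency heuristic, and all parameters are assumptions for illustration.

```python
import numpy as np

def events_to_frame(events, height, width):
    # Hypothetical helper: accumulate asynchronous (t, x, y, polarity)
    # events into a 2D activity map. Pixels with many recent brightness
    # changes score high, giving a crude saliency cue for glimpse selection.
    frame = np.zeros((height, width), dtype=np.float32)
    for t, x, y, p in events:
        frame[y, x] += 1.0  # count events regardless of polarity
    return frame

def extract_glimpse(frame, center, size=16):
    # Crop a square ROI around `center` and pair its pixel content
    # ("what") with its normalized location ("where").
    h, w = frame.shape
    cy, cx = center
    half = size // 2
    y0, x0 = max(cy - half, 0), max(cx - half, 0)
    patch = frame[y0:y0 + size, x0:x0 + size]
    # Pad edge glimpses so every patch has a fixed shape.
    patch = np.pad(patch, ((0, size - patch.shape[0]),
                           (0, size - patch.shape[1])))
    where = np.array([cy / h, cx / w], dtype=np.float32)
    return patch, where

# Toy usage: random synthetic events, attend to the most active pixel.
rng = np.random.default_rng(0)
events = [(i, rng.integers(64), rng.integers(64), 1) for i in range(500)]
frame = events_to_frame(events, 64, 64)
center = np.unravel_index(frame.argmax(), frame.shape)
patch, where = extract_glimpse(frame, center)
print(patch.shape, where)
```

In the actual GAP model the attended location is chosen by learned attention rather than a simple argmax, but the same what/where pairing is what lets the downstream representation generalize across positions.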