[ECCV'20 demo] Spike-FlowNet: Event-based Optical Flow Estimation

ECCV 2020

Demo video for the extension of the ECCV'20 paper "Spike-FlowNet: Event-based Optical Flow Estimation with Energy-Efficient Hybrid Neural Networks"

Abstract

Over the past years, the majority of optical flow estimation techniques have relied on images from traditional frame-based cameras, where the input data is obtained by sampling intensities over the entire frame at fixed time intervals, irrespective of the scene dynamics. Although sufficient for certain computer vision applications, frame-based cameras suffer from motion blur during high-speed motion, an inability to capture information in low-light conditions, and over- or under-saturation in high dynamic range environments.

Event-based sensors such as the Dynamic Vision Sensor (DVS), on the other hand, show great potential for a variety of real-world tasks, such as high-speed motion detection and navigation on edge devices, even in such adverse environments. This is attributed to their high temporal resolution, high dynamic range, and low power consumption. However, conventional computer vision methods, as well as deep Analog Neural Networks (ANNs), are not well suited to the asynchronous and discrete nature of event sensor outputs. This is mainly because these methods are typically designed for pixel-based images and rely on photo-consistency constraints, which assume that the color and brightness of an object remain the same across all images in a sequence.

Spiking Neural Networks (SNNs) are a natural fit for the outputs of an event sensor, because they compute asynchronously and can exploit the inherent sparsity of spatio-temporal events (spikes). Here, we present the optical flow evaluation results of our paper "Spike-FlowNet", accepted at ECCV 2020, which proposes a deep hybrid neural network architecture that integrates SNNs and ANNs to efficiently estimate optical flow from sparse event camera outputs without sacrificing performance.
The network is trained end-to-end with self-supervised learning on the Multi-Vehicle Stereo Event Camera (MVSEC) dataset, which contains indoor-flying and outdoor-driving event sequences recorded using DAVIS346 event sensors. It employs Integrate and Fire (IF) spiking neurons, a novel input encoding technique, and a U-Net based network architecture to estimate optical flow. The work outperforms its corresponding ANN-based method in optical flow prediction while providing significant improvements in computational efficiency.

This demo presents a brief description of the paper along with optical flow results on the MVSEC dataset. It also shows real-time optical flow estimation running on a standard laptop with a DAVIS346 event sensor, for a variety of actions markedly different from those present in the dataset. This highlights the generalization capability of the proposed architecture and its potential for deployment on edge devices such as autonomous drones. We hope this demo complements the paper and draws the attention of the deep learning and computer vision community to the enormous potential that event sensors hold for accomplishing vision tasks such as optical flow efficiently on edge devices.
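
For readers unfamiliar with the Integrate and Fire (IF) neurons mentioned above, the following is a minimal NumPy sketch of the general IF mechanism, not the Spike-FlowNet implementation. The function name `if_neuron_run`, the threshold value of 1.0, the reset-to-zero rule, and the toy input are all assumptions made for illustration; the paper's actual neuron parameters and input encoding are not reproduced here.

```python
import numpy as np


def if_neuron_run(input_current, threshold=1.0):
    """Simulate N integrate-and-fire neurons over T discrete time steps.

    input_current: array-like of shape (T, N), the input at each step.
    threshold:     firing threshold (illustrative value, an assumption).
    Returns an array of shape (T, N) with 1.0 where a neuron spiked.
    """
    input_current = np.asarray(input_current, dtype=float)
    membrane = np.zeros(input_current.shape[1])  # membrane potentials
    spikes = np.zeros_like(input_current)

    for t, current in enumerate(input_current):
        membrane += current              # integrate the input (no leak)
        fired = membrane >= threshold    # neurons that reach the threshold
        spikes[t] = fired                # emit a binary spike
        membrane[fired] = 0.0            # hard reset (an assumption)
    return spikes


# Toy input: 3 time steps, 2 neurons (hypothetical values).
currents = np.array([[0.4, 1.2],
                     [0.4, 0.0],
                     [0.4, 0.5]])
spikes = if_neuron_run(currents)
print(spikes)
```

In this toy run the first neuron accumulates small inputs and fires only at the third step, while the second fires immediately and then stays silent. The output is sparse and binary, which is the property the description credits for the computational efficiency of the spiking layers.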