Animals and humans seem able to learn perception and control tasks extremely quickly: learning to drive a car or land an airplane takes 30 hours of practice. In contrast, popular machine learning paradigms require large amounts of human-labeled data for supervised learning, or enormous numbers of trials for reinforcement learning. Humans and animals learn vast amounts of background knowledge about how the world works through mere observation, in a task-independent manner. One hypothesis is that their ability to learn good representations and predictive models of the perceptual and motor worlds is what allows them to learn new tasks efficiently. How do we reproduce this ability in machines? One promising avenue is self-supervised learning (SSL), where the machine predicts parts of its input from other parts of its input. SSL has already brought about great progress in discrete domains, such as language understanding. The challenge is to devise SSL methods that can handle the stochasticity and multimodality of prediction in high-dimensional continuous domains such as video. Such a paradigm would allow robots to learn world models and to use them for Model-Predictive Control or policy learning. An approach will be presented that handles uncertainty through an energy function rather than a probability distribution, along with an application to driving autonomous vehicles in dense traffic.
By: Yann LeCun (NYU Courant Inst. & Center for Data Science, Facebook AI Research)