Yann LeCun - Self-Supervised Learning: The Dark Matter of Intelligence (FAIR Blog Post Explained)
#selfsupervisedlearning #yannlecun #facebookai Deep Learning systems can achieve remarkable, even super-human performance through supervised learning on large, labeled datasets. However, there are two problems: First, collecting ever more labeled data is expensive in both time and money. Second, these deep neural networks will be high performers on their task, but cannot easily generalize to other, related tasks, or they need large amounts of data to do so. In this blog post, Yann LeCun and Ishan Misra of Facebook AI Research (FAIR) describe the current state of Self-Supervised Learning (SSL) and argue that it is the next step in the development of AI that uses fewer labels and can transfer knowledge faster than current systems. They suggest as a promising direction to build non-contrastive latent-variable predictive models, like VAEs, but ones that also provide high-quality latent representations for downstream tasks. OUTLINE: 0:00 - Intro & Overview 1:15 - Supervised Learning, Self-Supervised Learning, and Common Sense 7:35 - Predicting Hidden Parts from Observed Parts 17:50 - Self-Supervised Learning for Language vs Vision 26:50 - Energy-Based Models 30:15 - Joint-Embedding Models 35:45 - Contrastive Methods 43:45 - Latent-Variable Predictive Models and GANs 55:00 - Summary & Conclusion Paper (Blog Post): https://ai.facebook.com/blog/self-supervised-learning-the-dark-matter-of-intelligence My Video on BYOL: https://www.youtube.com/watch?v=YPfUiOMYOEE ERRATA: - The difference between loss and energy: Energy is for inference, loss is for training. - The R(z) term is a regularizer that restricts the capacity of the latent variable. I think I said both of those things, but never together. - The way I explain why BERT is contrastive is wrong. I haven't figured out why just yet, though :) Video approved by Antonio. Abstract: We believe that self-supervised learning (SSL) is one of the most promising ways to build such background knowledge and approximate a form of common sense in AI systems. Authors: Yann LeCun, Ishan Misra Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yannic-kilcher Minds: https://www.minds.com/ykilcher Parler: https://parler.com/profile/YannicKilcher LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/ BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannickilcher Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

0:00​ - Intro & Overview 1:15​ - Supervised Learning, Self-Supervised Learning, and Common Sense 7:35​ - Predicting Hidden Parts from Observed Parts 17:50​ - Self-Supervised Learning for Language vs Vision 26:50​ - Energy-Based Models 30:15​ - Joint-Embedding Models 35:45​ - Contrastive Methods 43:45​ - Latent-Variable Predictive Models and GANs 55:00​ - Summary & Conclusion