SimVLM explained | What the paper doesn’t tell you

Feb 18, 2022
📜 SimVLM explained. What the authors tell us, what they don’t tell us and how this all works. Enjoy with coffee!

📺 Vision & Language Transformer explained (ViLBERT): https://youtu.be/dd7nE4nbxN0
📺 ViT explained: https://youtu.be/DVoHvmww2lQ

Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏 donor, Dres. Trost GbR, Yannik Schneider

Paper:
📜 Wang, Zirui, Jiahui Yu, Adams Wei Yu, Zihang Dai, Yulia Tsvetkov, and Yuan Cao. "SimVLM: Simple Visual Language Model Pretraining with Weak Supervision." arXiv preprint arXiv:2108.10904 (2021). https://arxiv.org/abs/2108.10904
🔗 SimVLM AI Google Blog post: https://ai.googleblog.com/2021/10/simvlm-simple-visual-language-model-pre.html
📜 Jia, Chao, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yunhsuan Sung, Zhen Li, and Tom Duerig. "Scaling up visual and vision-language representation learning with noisy text supervision." arXiv preprint arXiv:2102.05918 (2021). https://arxiv.org/abs/2102.05918
📜 GPT-3 paper: Brown, Tom B., Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan et al. "Language models are few-shot learners." arXiv preprint arXiv:2005.14165 (2020). https://arxiv.org/abs/2005.14165
📺 GPT-3 video: https://youtu.be/5fqxPOaaqi0

Outline:
00:00 SimVLM
01:15 End-to-end image processing
03:01 Objective: Prefix Language Modelling
06:38 The secret ingredient

▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, buy us a coffee to help with our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

🔗 Links:
AICoffeeBreakQuiz: https://www.youtube.com/c/AICoffeeBreak/community
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/
YouTube: https://www.youtube.com/AICoffeeBreak

#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research

Video and thumbnail contain emojis designed by OpenMoji – the open-source emoji and icon project. License: CC BY-SA 4.0
