“Transformer in Transformer” paper explained by Ms. Coffee Bean. In this video you will find out why modelling image patches with a Transformer inside a Transformer is a good idea. The approach goes beyond ViT and DeiT by modelling both the global structure across patches and the local structure within each patch.
📺 ViT Transformer: https://youtu.be/DVoHvmww2lQ
📺 DeiT explained: https://youtu.be/-FbV2KgRM8A
📄 “Transformer in Transformer” paper by Han et al. 2021: https://arxiv.org/pdf/2103.00112.pdf
Outline:
* 00:00 Transformer’s love affair with image recognition
* 01:34 Vision Transformers so far
* 02:00 Transformer in Transformer
* 05:04 Goodies in the paper
Music 🎵 : Forever by Anno Domini Beats
--------------
🔗 Links:
YouTube: https://www.youtube.com/AICoffeeBreak
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/
#AICoffeeBreak #MsCoffeeBean #MachineLearning #ai #research