Size does not matter | The efficiency misnomer | What does the number of parameters mean?


Dec 06, 2021 | 35 views
How important is the number of parameters in deep learning models? And what about other measures like FLOPs or speed/throughput?

► Check out our sponsor Aleph Alpha 👉 https://www.aleph-alpha.de/ ! Follow them on Twitter: Aleph__Alpha

Paper 📜: Dehghani, Mostafa, Anurag Arnab, Lucas Beyer, Ashish Vaswani, and Yi Tay. "The Efficiency Misnomer." arXiv preprint arXiv:2110.12894 (2021). https://arxiv.org/abs/2110.12894
🔗 Megatron-Turing NLG 530B: https://www.microsoft.com/en-us/research/blog/using-deepspeed-and-megatron-to-train-megatron-turing-nlg-530b-the-worlds-largest-and-most-powerful-generative-language-model/

Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏 donor, Dres. Trost GbR, Yannik Schneider

Outline:
00:00 Model efficiency comparison
02:51 FLOPs
03:55 Number of parameters: means what?
06:31 Speed / throughput
09:39 Aleph Alpha (Sponsor)

▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, buy us a coffee to help with our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

🔗 Links:
AICoffeeBreakQuiz: https://www.youtube.com/c/AICoffeeBreak/community
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/
YouTube: https://www.youtube.com/AICoffeeBreak

#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research
