Pattern Exploiting Training explained!
Few-shot learning for “normal-sized” language models like BERT or ALBERT with pattern-exploiting training (PET). Learn all about PET, iPET, and ADAPET here. Choose your favorite!
GPT-3 is not the only few-shot learner, at least not on SuperGLUE.
📺 Ms. Coffee Bean explains the Transformer: https://youtu.be/FWFA4DGuzSc
📺 Ms. Coffee Bean on GPT-3: https://youtu.be/5fqxPOaaqi0
📄 Schick, T., & Schütze, H. (2020). Exploiting cloze questions for few-shot text classification and natural language inference. arXiv preprint arXiv:2001.07676. https://arxiv.org/abs/2001.07676
📄 Schick, T., & Schütze, H. (2020). It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners. arXiv preprint arXiv:2009.07118. https://arxiv.org/abs/2009.07118
📄 Tam, D., Menon, R. R., Bansal, M., Srivastava, S., & Raffel, C. (2021). Improving and Simplifying Pattern Exploiting Training. arXiv preprint arXiv:2103.11955. https://arxiv.org/abs/2103.11955
📄 Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., ... & Amodei, D. (2020). Language models are few-shot learners. arXiv preprint arXiv:2005.14165. https://arxiv.org/abs/2005.14165
* 00:00 Small language models are also few-shot learners
* 01:30 Few-shot learning for GPT-3
* 02:58 Few-shot learning for everyone: PET
* 07:29 iPET
* 08:00 The gist of PET
* 08:20 ADAPET
* 11:53 Wrap-up
Music 🎵 : The Truth by Anno Domini Beats
#AICoffeeBreak #MsCoffeeBean #few-shot-learning #gpt3 #MachineLearning #AI #research