It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners | PET, iPET, ADAPET explained!
Few-shot learning for “normal-sized” language models like BERT or ALBERT, explained through pattern-exploiting training (PET). The video covers PET, iPET, and ADAPET. Choose your favorite! GPT-3 is not the only few-shot learner, at least not on SuperGLUE.

Schick, T., & Schütze, H. (2020). Exploiting cloze questions for few-shot text classification and natural language inference. arXiv preprint arXiv:2001.07676. https://arxiv.org/abs/2001.07676
Schick, T., & Schütze, H. (2020). It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners. arXiv preprint arXiv:2009.07118. https://arxiv.org/abs/2009.07118
Tam, D., Menon, R. R., Bansal, M., Srivastava, S., & Raffel, C. (2021). Improving and Simplifying Pattern Exploiting Training. arXiv preprint arXiv:2103.11955. https://arxiv.org/abs/2103.11955

00:00 Small language models are also few-shot learners
01:30 Few-shot learning for GPT-3
02:58 Few-shot learning for everyone: PET
07:29 iPET
08:00 The gist of PET
08:20 ADAPET
11:53 Wrap-up