Generalization through Memorization: Nearest Neighbor Language Models (Research Paper Walkthrough)

Aug 28, 2021
#languagemodels #knn #nlp

Bigger Models, Better Results? This research extends a pre-trained neural language model by linearly interpolating it with a k-nearest neighbors model, achieving new state-of-the-art results on Wikitext-103 with no additional training.

⏩ Abstract: We introduce kNN-LMs, which extend a pre-trained neural language model (LM) by linearly interpolating it with a k-nearest neighbors (kNN) model. The nearest neighbors are computed according to distance in the pre-trained LM embedding space, and can be drawn from any text collection, including the original LM training data. Applying this transformation to a strong Wikitext-103 LM, with neighbors drawn from the original training set, our kNN-LM achieves a new state-of-the-art perplexity of 15.79, a 2.9 point improvement with no additional training. We also show that this approach has implications for efficiently scaling up to larger training sets and allows for effective domain adaptation, by simply varying the nearest neighbor datastore, again without further training. Qualitatively, the model is particularly helpful in predicting rare patterns, such as factual knowledge. Together, these results strongly suggest that learning similarity between sequences of text is easier than predicting the next word, and that nearest neighbor search is an effective approach for language modeling in the long tail.
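
The interpolation described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `knn_lm_probs`, the toy datastore, the squared L2 distance, the softmax over negative distances, and the values `k=2` and `lam=0.25` are all assumptions chosen for the example, and a brute-force search stands in for whatever index a real system would use.

```python
import numpy as np

def knn_lm_probs(query, keys, values, p_lm, k=2, lam=0.25):
    """Interpolate an LM distribution with a kNN distribution (illustrative sketch).

    query  : (d,)   context embedding from the pre-trained LM
    keys   : (n, d) stored context embeddings (the datastore keys)
    values : (n,)   integer id of the token that followed each stored context
    p_lm   : (V,)   next-token distribution from the LM
    """
    # Brute-force squared L2 distance from the query to every stored key.
    dists = np.sum((keys - query) ** 2, axis=1)
    nn = np.argsort(dists)[:k]

    # Turn negative distances into weights with a numerically stable softmax.
    logits = -dists[nn]
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()

    # Aggregate neighbor weights onto the tokens they point to.
    p_knn = np.zeros_like(p_lm)
    np.add.at(p_knn, values[nn], weights)

    # Linear interpolation of the two distributions.
    return lam * p_knn + (1.0 - lam) * p_lm

# Toy datastore: three stored contexts over a 3-token vocabulary.
keys = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
values = np.array([2, 2, 0])
p_lm = np.array([0.5, 0.3, 0.2])
query = np.array([0.1, 0.0])

p = knn_lm_probs(query, keys, values, p_lm)
print(p, p.sum())
```

In this toy case both retrieved neighbors point to token 2, so the kNN term shifts probability mass toward a token the LM alone ranked last. Swapping `keys` and `values` for a different text collection changes the predictions without touching the LM, which is the domain adaptation idea the abstract mentions.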
Please feel free to share the content and subscribe to my channel :)
⏩ Subscribe - https://youtube.com/channel/UCoz8NrwgL7U9535VNc0mRPA?sub_confirmation=1

⏩ OUTLINE:
0:00 - Background and Abstract
04:10 - Illustration of kNN-LM - Algorithm
07:21 - Experiment - Results

⏩ Paper Title: Generalization through Memorization: Nearest Neighbor Language Models
⏩ Paper: https://openreview.net/attachment?id=HklBjCEKvH&name=original_pdf
⏩ Author: Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis
⏩ Organisation: Stanford University, Facebook AI Research

⏩ IMPORTANT LINKS
Full Playlist on BERT usecases in NLP: https://www.youtube.com/watch?v=kC5kP1dPAzc&list=PLsAqq9lZFOtV8jYq3JlkqPQUN5QxcWq0f
Full Playlist on Text Data Augmentation Techniques: https://www.youtube.com/watch?v=9O9scQb4sNo&list=PLsAqq9lZFOtUg63g_95OuV-R2GhV1UiIZ
Full Playlist on Text Summarization: https://www.youtube.com/watch?v=kC5kP1dPAzc&list=PLsAqq9lZFOtV8jYq3JlkqPQUN5QxcWq0f
Full Playlist on Machine Learning with Graphs: https://www.youtube.com/watch?v=-uJL_ANy1jc&list=PLsAqq9lZFOtU7tT6mDXX_fhv1R1-jGiYf
Full Playlist on Evaluating NLG Systems: https://www.youtube.com/watch?v=-CIlz-5um7U&list=PLsAqq9lZFOtXlzg5RNyV00ueE89PwnCbu

**********************************************
If you want to support me financially, which is totally optional and voluntary ❤️
You can consider buying me chai (because I don't drink coffee :) ) at https://www.buymeacoffee.com/TechvizCoffee
**********************************************

⏩ Youtube - https://www.youtube.com/c/TechVizTheDataScienceGuy
⏩ LinkedIn - https://linkedin.com/in/prakhar21
⏩ Medium - https://medium.com/@prakhar.mishra
⏩ GitHub - https://github.com/prakhar21
⏩ Twitter - https://twitter.com/rattller

**********************************************
Tools I use for making videos :)
⏩ iPad - https://tinyurl.com/y39p6pwc
⏩ Apple Pencil - https://tinyurl.com/y5rk8txn
⏩ GoodNotes - https://tinyurl.com/y627cfsa

#techviz #datascienceguy #machinelearning
#ai
