Automatic Title Generation for Text with Transformer Language Model (Research Paper Walkthrough)

May 01, 2021 | 41 views
#gpt2 #transformers #naturallanguageprocessing

This paper presents a novel approach to title/headline generation based on the pre-trained transformer language model GPT-2. A representative and interesting title is invariably the most important aspect of any document. The title is the first, and sometimes the only, part of an article or document that potential readers will see. While the title has to be catchy enough to grab the readers' attention and entice them to read the full article, it should also accurately portray the content of the document, so that readers are not misled by a catchy but not-quite-accurate title.

⏩ Abstract: In this paper, we propose a novel approach to Automatic Title Generation for a given text using a pre-trained Transformer Language Model, GPT-2. The model takes a unique approach: it generates a pool of candidate titles and selects an appropriate title among them, which is then refined or de-noised to get the final title. The approach consists of a pipeline of three modules, namely Generation, Selection and Refinement, followed by a Scoring function. The Generation and Refinement modules are based on GPT-2, while the Selection module uses a heuristic-based approach. The model is able to generate accurate titles despite having a smaller corpus of relevant training data, because the natural language generation capabilities come from the pre-training, while the model primarily has to learn task- and corpus-specific nuances. Additionally, the Selection and Refinement modules ensure that the titles are representative of the given text and are semantically and syntactically accurate. We train our model on research paper abstracts from arXiv and evaluate it on three different test sets. Our pipeline shows promising results when evaluated on ROUGE and BLEU metrics against the test sets. In addition, we also perform human evaluation to validate the results generated by our proposed approach.
Note: The title of this paper was generated automatically by our proposed algorithm from the abstract.

Please feel free to share the content and subscribe to my channel :)
⏩ Subscribe - https://youtube.com/channel/UCoz8NrwgL7U9535VNc0mRPA?sub_confirmation=1

⏩ OUTLINE:
0:00 - Intro
01:19 - Generation Module
02:47 - Top-k Top-p Sampling Strategy
04:54 - Selection Module
06:13 - Refinement Module
07:30 - Title Generation Pipeline flow with example
10:02 - Relevancy Score
11:34 - Sample Results

⏩ Paper Title: Automatic Title Generation for Text with Pre-trained Transformer Language Model
⏩ Paper: https://ieeexplore.ieee.org/document/9364613
⏩ Authors: Prakhar Mishra; Chaitali Diwan; Srinath Srinivasa; G. Srinivasaraghavan
⏩ Organisation: IIIT Bangalore, India

⏩ IMPORTANT LINKS
Full Playlist on BERT use cases in NLP: https://www.youtube.com/watch?v=kC5kP1dPAzc&list=PLsAqq9lZFOtV8jYq3JlkqPQUN5QxcWq0f
Full Playlist on Text Data Augmentation Techniques: https://www.youtube.com/watch?v=9O9scQb4sNo&list=PLsAqq9lZFOtUg63g_95OuV-R2GhV1UiIZ
Full Playlist on Text Summarization: https://www.youtube.com/watch?v=kC5kP1dPAzc&list=PLsAqq9lZFOtV8jYq3JlkqPQUN5QxcWq0f
Full Playlist on Machine Learning with Graphs: https://www.youtube.com/watch?v=-uJL_ANy1jc&list=PLsAqq9lZFOtU7tT6mDXX_fhv1R1-jGiYf
Full Playlist on Evaluating NLG Systems: https://www.youtube.com/watch?v=-CIlz-5um7U&list=PLsAqq9lZFOtXlzg5RNyV00ueE89PwnCbu
Full Playlist on Query Expansion for Information Retrieval using NLP: https://www.youtube.com/watch?v=QpTZ_-6uio8&list=PLsAqq9lZFOtXsJ_S_lB9pPz2cbz-i2DD0
Full Playlist on Text Generation Evaluation Techniques: https://www.youtube.com/watch?v=-CIlz-5um7U&list=PLsAqq9lZFOtXlzg5RNyV00ueE89PwnCbu

*********************************************
⏩ YouTube - https://www.youtube.com/c/TechVizTheDataScienceGuy
⏩ Blog - https://prakhartechviz.blogspot.com
⏩ LinkedIn - https://linkedin.com/in/prakhar21
⏩ Medium - https://medium.com/@prakhar.mishra
⏩ GitHub - https://github.com/prakhar21
⏩ Twitter - https://twitter.com/rattller
*********************************************

Tools I use for making videos :)
⏩ iPad - https://tinyurl.com/y39p6pwc
⏩ Apple Pencil - https://tinyurl.com/y5rk8txn
⏩ GoodNotes - https://tinyurl.com/y627cfsa

#techviz #datascienceguy #research #finetuning #titlegeneration
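
The outline above lists a top-k/top-p sampling strategy for the Generation module. As a rough illustration of what that kind of sampling does in general, here is a minimal sketch in plain Python. It is not the paper's code: the function names (top_k_top_p_filter, sample_token), the toy probability list and the values of top_k and top_p are all assumptions made for this example, and the settings actually used in the paper are not stated in this description.

```python
import random
from typing import List, Optional


def top_k_top_p_filter(probs: List[float], top_k: int = 0, top_p: float = 1.0) -> List[float]:
    """Keep only the most probable tokens, then renormalize.

    probs is a next-token probability distribution (hypothetical input).
    top_k = 0 disables the top-k cut; top_p = 1.0 disables the nucleus cut.
    """
    # Token indices ordered from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep at most k candidates.
    if top_k > 0:
        order = order[:top_k]

    # Top-p (nucleus): keep the smallest prefix whose mass reaches top_p.
    kept = []
    cumulative = 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize over the surviving tokens; everything else gets 0.
    total = sum(probs[i] for i in kept)
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / total
    return filtered


def sample_token(probs: List[float], top_k: int = 0, top_p: float = 1.0,
                 rng: Optional[random.Random] = None) -> int:
    """Draw one token index from the filtered distribution."""
    rng = rng or random.Random()
    filtered = top_k_top_p_filter(probs, top_k, top_p)
    return rng.choices(range(len(filtered)), weights=filtered, k=1)[0]


# Toy distribution over a 5-token vocabulary (illustrative values only).
toy_probs = [0.5, 0.3, 0.1, 0.05, 0.05]
filtered_probs = top_k_top_p_filter(toy_probs, top_k=3, top_p=0.75)
token_id = sample_token(toy_probs, top_k=3, top_p=0.75, rng=random.Random(0))
```

With these toy values, only the two most probable tokens survive the filter, so every sampled title token comes from that reduced set. Sampling repeatedly from a filtered distribution like this is one common way to obtain a varied pool of candidates, which is the kind of pool the Selection module then chooses from.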
