Evaluation of Text Generation: A Survey | Human-Centric Evaluations | Research Paper Walkthrough


Dec 11, 2020
#textgeneration #naturallanguageprocessing #researchpaperwalkthrough
Natural language generation (NLG) is the natural language processing task of generating natural language from a machine representation such as a knowledge base or a logical form. Evaluating such systems is not straightforward, because there are no robust automatic metrics that correlate well with human judgments. Human evaluations, in turn, are subjective. This survey paper reviews the research on evaluation metrics for text generation systems, covering human-centric, automatic, and learned metrics.
⏩ Support the channel by subscribing so you don't miss any video that I upload next - https://youtube.com/channel/UCoz8NrwgL7U9535VNc0mRPA?sub_confirmation=1
⏩ Abstract: The paper surveys evaluation methods of natural language generation (NLG) systems that have been developed in the last few years. We group NLG evaluation methods into three categories: (1) human-centric evaluation metrics, (2) automatic metrics that require no training, and (3) machine-learned metrics. For each category, we discuss the progress that has been made and the challenges still being faced, with a focus on the evaluation of recently proposed NLG tasks and neural NLG models. We then present two case studies of automatic text summarization and long text generation, and conclude the paper by proposing future research directions.
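Since human evaluations are subjective, a common check is how much two evaluators agree with each other. The snippet below is a minimal sketch of two such measures, percent agreement and Cohen's kappa, written from their standard textbook definitions rather than taken from the paper or the video. The function names and the example labels are hypothetical.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Fraction of items on which two evaluators gave the same label."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(ratings_a)
    p_observed = percent_agreement(ratings_a, ratings_b)

    # Chance agreement: probability that both evaluators pick the same label
    # if each labels independently according to their own label frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    labels = counts_a.keys() | counts_b.keys()
    p_expected = sum(counts_a[l] * counts_b[l] for l in labels) / (n * n)

    if p_expected == 1.0:
        # Both evaluators used one identical label everywhere; kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical example: two evaluators judging four generated texts.
evaluator_1 = ["good", "good", "bad", "bad"]
evaluator_2 = ["good", "bad", "bad", "bad"]
print(percent_agreement(evaluator_1, evaluator_2))
print(cohens_kappa(evaluator_1, evaluator_2))
```

In this example the evaluators agree on 3 of 4 items, but half of that agreement is expected by chance, so kappa comes out lower than the raw percentage. Cohen's kappa as sketched here only handles two evaluators.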
⏩ OUTLINE:
0:00 - Background and Abstract
2:23 - Introduction
3:58 - Intrinsic Evaluation
8:24 - Extrinsic Evaluation
9:20 - Inter-evaluator agreement
10:15 - Percent Agreement (Inter-evaluator agreement)
11:09 - Cohen's Kappa (Inter-evaluator agreement)
12:34 - Fleiss' Kappa (Inter-evaluator agreement)
13:35 - Krippendorff's Alpha (Inter-evaluator agreement)
15:03 - Wrapping up

⏩ Title: Evaluation of Text Generation: A Survey
⏩ Link: https://arxiv.org/abs/2006.14799
⏩ Authors: Asli Celikyilmaz, Elizabeth Clark, Jianfeng Gao
⏩ Organisation: Microsoft Research, University of Washington

⏩ IMPORTANT LINKS
BLEURT: Learning Robust Metric for Text Generation - https://www.youtube.com/watch?v=9lWxwfMKAdM
Text Style Transfer - https://www.youtube.com/watch?v=cjnk3PJljDs

*********************************************
⏩ Youtube - https://youtube.com/channel/UCoz8NrwgL7U9535VNc0mRPA
⏩ Blog - https://prakhartechviz.blogspot.com
⏩ LinkedIn - https://linkedin.com/in/prakhar21
⏩ Medium - https://medium.com/@prakhar.mishra
⏩ GitHub - https://github.com/prakhar21
*********************************************

Please feel free to share the content and subscribe to my channel :)
⏩ Subscribe - https://youtube.com/channel/UCoz8NrwgL7U9535VNc0mRPA?sub_confirmation=1

Tools I use for making videos :)
⏩ iPad - https://amzn.to/3kA3vuo
⏩ Apple Pencil - https://amzn.to/3kFZFA2
⏩ GoodNotes - https://tinyurl.com/y627cfsa
⏩ Microphone - https://amzn.to/2UEyCuh

About Me: I am Prakhar Mishra and this channel is my passion project. I am currently pursuing my MS (by research) in Data Science. I have 3 years of industry experience in Data Science and Machine Learning, with a particular focus on Natural Language Processing (NLP).

#techviz #datascienceguy #nlp #machinelearning #research #evaluation #survey #nlg