Language Models are Open Knowledge Graphs (Paper Explained)
#ai #research #nlp

Knowledge Graphs are structured databases that capture real-world entities and their relations to each other. KGs are usually built by human experts, which costs considerable amounts of time and money. This paper hypothesizes that language models, which have increased their performance dramatically in the last few years, contain enough knowledge to use them to construct a knowledge graph from a given corpus, without any fine-tuning of the language model itself. The resulting system can uncover new, unknown relations and outperforms all baselines in automated KG construction, even trained ones!

PROMO - 3 Months Free TabNine Pro (sign up until 72 Hours after Video Launch): (the site is a bit slow :) )

OUTLINE:
0:00 - Intro & Overview
1:40 - TabNine Promotion
4:20 - Title Misnomer
6:45 - From Corpus To Knowledge Graph
13:40 - Paper Contributions
15:50 - Candidate Fact Finding Algorithm
25:50 - Causal Attention Confusion
31:25 - More Constraints
35:00 - Mapping Facts To Schemas
38:40 - Example Constructed Knowledge Graph
40:10 - Experimental Results
47:25 - Example Discovered Facts
50:40 - Conclusion & My Comments

Paper:

Abstract:
This paper shows how to construct knowledge graphs (KGs) from pre-trained language models (e.g., BERT, GPT-2/3), without human supervision. Popular KGs (e.g, Wikidata, NELL) are built in either a supervised or semi-supervised manner, requiring humans to create knowledge. Recent deep language models automatically acquire knowledge from large-scale corpora via pre-training. The stored knowledge has enabled the language models to improve downstream NLP tasks, e.g., answering questions, and writing code and articles. In this paper, we propose an unsupervised method to cast the knowledge contained within language models into KGs. We show that KGs are constructed with a single forward pass of the pre-trained language models (without fine-tuning) over the corpora.
We demonstrate the quality of the constructed KGs by comparing to two KGs (Wikidata, TAC KBP) created by humans. Our KGs also provide open factual knowledge that is new in the existing KGs. Our code and KGs will be made publicly available.

Authors: Chenguang Wang, Xiao Liu, Dawn Song

Links:
YouTube:
Twitter:
Discord:
BitChute:
Minds:
Parler:
LinkedIn:

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar:
Patreon:
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n