Abstract: Extraction from raw text to a knowledge base of entities and fine-grained types is often cast as prediction into a flat set of entity and type labels, neglecting the rich hierarchies over types and entities contained in curated ontologies. Previous attempts to incorporate hierarchical structure have yielded little benefit and are restricted to shallow ontologies. This paper presents new methods using real and complex bilinear mappings for integrating hierarchical information, yielding substantial improvement over flat predictions in entity linking and fine-grained entity typing, and achieving new state-of-the-art results for end-to-end models on the benchmark FIGER dataset. We also present two new human-annotated datasets containing wide and deep hierarchies which we will release to the community to encourage further research in this direction: MedMentions, a collection of PubMed abstracts in which 246k mentions have been mapped to the massive UMLS ontology; and TypeNet, which aligns Freebase types with the WordNet hierarchy to obtain nearly 2k entity types. In experiments on all three datasets we show substantial gains from hierarchy-aware training.
Authors: Shikhar Murty, Patrick Verga, Luke Vilnis, Irena Radovanovic, Andrew McCallum (UMass Amherst, Chan Zuckerberg Initiative)
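To make the abstract's central modeling idea concrete, the sketch below shows what real and complex bilinear scoring functions between an entity (or mention) embedding and a type embedding look like. This is a minimal illustration under assumed toy embeddings, not the paper's actual model: the embedding dimension, random vectors, and variable names here are all hypothetical, and the paper learns its embeddings and interaction parameters from data with hierarchy-aware losses.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8  # toy embedding dimension (assumption, not from the paper)

# Toy real-valued embeddings for one entity and one candidate type.
entity_emb = rng.normal(size=dim)
type_emb = rng.normal(size=dim)
W = rng.normal(size=(dim, dim))  # bilinear interaction matrix

# Real bilinear score: s(e, t) = e^T W t
real_score = entity_emb @ W @ type_emb

# Complex bilinear score, ComplEx-style: Re(<e, conj(t)>) over complex vectors.
entity_c = rng.normal(size=dim) + 1j * rng.normal(size=dim)
type_c = rng.normal(size=dim) + 1j * rng.normal(size=dim)
complex_score = np.real(np.sum(entity_c * np.conj(type_c)))

print(real_score, complex_score)
```

In hierarchy-aware training, scores like these are typically encouraged to be high not only for a mention's gold type but also for that type's ancestors in the ontology, which is one way the structure of a deep type hierarchy can inform the learned embeddings.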