On the Limitations of Unsupervised Bilingual Dictionary Induction

ACL 2018

Jan 27, 2021
Abstract: Recent work has managed to learn cross-lingual word embeddings without parallel data by mapping monolingual embeddings to a shared space through adversarial training. However, their evaluation has focused on favorable conditions, using comparable corpora or closely-related languages, and we show that they often fail in more realistic scenarios. This work proposes an alternative approach based on a fully unsupervised initialization that explicitly exploits the structural similarity of the embeddings, and a robust self-learning algorithm that iteratively improves this solution. Our method succeeds in all tested scenarios and obtains the best published results in standard datasets, even surpassing previous supervised systems.

Authors: Anders Søgaard, Sebastian Ruder, Ivan Vulić (University of Copenhagen, National University of Ireland, University of Cambridge)
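
The abstract's idea of mapping one embedding space onto another and refining the mapping by self-learning can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an orthogonal Procrustes mapping and a small seed dictionary (`init_pairs`, hypothetical) in place of the fully unsupervised initialization, and all function names are illustrative.

```python
import numpy as np

def procrustes(X, Y):
    """Orthogonal W minimizing ||X @ W - Y||_F (assumed mapping; not from the source)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def induce_dictionary(XW, Y):
    """For each mapped source row, return the index of the nearest target row (cosine)."""
    XW = XW / np.linalg.norm(XW, axis=1, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    return np.argmax(XW @ Yn.T, axis=1)

def self_learning(X, Y, init_pairs, n_iter=5):
    """Alternate between fitting W on the current dictionary and re-inducing the dictionary.

    init_pairs is a hypothetical (source_indices, target_indices) seed dictionary.
    """
    src, tgt = init_pairs
    for _ in range(n_iter):
        W = procrustes(X[src], Y[tgt])        # fit the mapping on current word pairs
        tgt = induce_dictionary(X @ W, Y)     # re-induce a dictionary for all source words
        src = np.arange(X.shape[0])
    return W, tgt
```

Given source embeddings `X` and target embeddings `Y` as row matrices of equal dimension, `self_learning(X, Y, (src_idx, tgt_idx))` returns the learned mapping and, for each source row, the index of its induced translation. The sketch omits the robustness measures the abstract alludes to, so it should be read only as an outline of the alternating structure.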
