Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics

ACL 2020