Distill, Adapt, Distill: Training Small, In-Domain Models for Neural Machine Translation

ACL 2020