XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization