"Improving Vision-and-Language Navigation with Image-Text Pairs from the Web" is work conducted by Arjun Majumdar, Ayush Shrivastava, Stefan Lee, Peter Anderson, Devi Parikh, and Dhruv Batra at Georgia Tech, Facebook AI Research (FAIR), and Oregon State University.
This work was accepted to the 2020 European Conference on Computer Vision (ECCV).
Full paper: https://arxiv.org/pdf/2004.14973.pdf
Website: www.ml.gatech.edu
Twitter: @mlatgt
Instagram: @mlatgeorgiatech