Spatially Aware Multimodal Transformers for TextVQA

ECCV 2020

"Spatially Aware Multimodal Transformers for TextVQA" is work by Yash Kant, Dhruv Batra, Peter Anderson, Alex Schwing, Devi Parikh, Jiasen Lu, and Harsh Agrawal of Georgia Tech, Facebook AI Research (FAIR), and the University of Illinois at Urbana-Champaign. It has been accepted to the European Conference on Computer Vision (ECCV) 2020.

Twitter: @mlatgt
Instagram: @mlatgeorgiatech