Spatially Aware Multimodal Transformers for TextVQA

ECCV 2020

Details
"Spatially Aware Multimodal Transformers for TextVQA" is work by Yash Kant, Dhruv Batra, Peter Anderson, Alex Schwing, Devi Parikh, Jiasen Lu, and Harsh Agrawal of Georgia Tech, Facebook AI Research (FAIR), and the University of Illinois at Urbana-Champaign. It has been accepted to the European Conference on Computer Vision (ECCV) 2020.

Full paper: https://arxiv.org/pdf/2007.12146.pdf
Website: www.ml.gatech.edu
Twitter: @mlatgt
Instagram: @mlatgeorgiatech
