An Image is Worth 16 × 16 Tokens: Visual Priors for Efficient Image Synthesis with Transformers

NeurIPS 2020