Invited Talk: When is Grounding Helpful for Language and Vision Tasks?