To address language models' lack of grounded common sense, researchers at the University of North Carolina, Chapel Hill have proposed Vokenization, an approach that visually supervises language-model pretraining and reports performance gains on the GLUE benchmark and SQuAD. The method first maps each token in a sentence to a related image, producing token-image pairs called "vokens," and then pretrains the language model with a voken-classification task that uses these weakly supervised alignments as labels.
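The voken-classification objective described above can be sketched as a per-token cross-entropy loss over voken (image) indices. The following is a minimal NumPy illustration, not the authors' released code; the array shapes, vocabulary size, and the single linear head are assumptions for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
voken_vocab, hidden, seq_len = 500, 64, 8

# Toy token representations (stand-in for Transformer hidden states).
hidden_states = rng.standard_normal((seq_len, hidden))
# Weakly supervised voken labels: index of the image matched to each token.
vokens = rng.integers(0, voken_vocab, size=seq_len)

# Hypothetical linear voken-classification head on top of the encoder.
W = rng.standard_normal((hidden, voken_vocab)) * 0.01

logits = hidden_states @ W
# Softmax cross-entropy over voken ids, averaged over all tokens.
logits -= logits.max(axis=1, keepdims=True)
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
loss = -np.log(probs[np.arange(seq_len), vokens]).mean()
print(loss)
```

In practice this loss would be optimized alongside the usual masked-language-modeling objective, so the model learns both textual and visually grounded signals.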