Long-Tail Predictions with Continuous-Output Language Models