Wasserstein Information Geometry in Generative and Discriminative Learning