In this work, we develop a new approximation method for the analytically intractable Bayesian inference problem in Gaussian process models with factorizable non-Gaussian likelihoods and single-output latent functions. Our method, dubbed QP, is similar to expectation propagation (EP) but minimizes the $L^2$ Wasserstein distance instead of the Kullback-Leibler (KL) divergence. We consider the specific case in which the non-Gaussian likelihood is approximated by a Gaussian likelihood. We show that QP has the following properties: (1) QP matches quantile functions, whereas EP matches moments; (2) QP and EP have the same local update for the mean of the approximate Gaussian likelihood; (3) the local variance estimate for the approximate likelihood is smaller for QP than for EP, addressing EP's over-estimation of the variance; (4) the optimal approximate Gaussian likelihood enjoys a univariate parameterization, reducing memory consumption and computation time. Furthermore, we provide a unified interpretation of EP and QP: both are coordinate descent algorithms, for a KL and an $L^2$ Wasserstein global objective function respectively, under the same assumptions. In experiments on eight real-world datasets, QP outperforms EP on Gaussian process binary classification.
Speakers: Rui Zhang, Christian Walder, Edwin Bonilla, Marian-Andrei Rizoiu, Lexing Xie
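
Properties (1) to (3) of the abstract can be illustrated in one dimension. The following is a minimal sketch, not the authors' implementation: it assumes a probit likelihood and a hypothetical Gaussian cavity $\mathcal{N}(m, v)$, builds the tilted distribution $q(f) \propto \mathcal{N}(f; m, v)\,\Phi(yf)$ on a grid, and projects it onto a Gaussian in two ways. Moment matching (the EP-style projection) keeps the mean and standard deviation of $q$. Minimizing the $L^2$ Wasserstein distance (the QP-style projection) keeps the same mean but uses $\sigma = \int_0^1 F_q^{-1}(u)\,\Phi^{-1}(u)\,du$, which by the Cauchy-Schwarz inequality never exceeds the standard deviation of $q$. The function name, default parameters, and grid settings are illustrative choices only.

```python
import math
from statistics import NormalDist


def tilted_gaussian_projections(m=0.0, v=4.0, y=1.0, n=20001, width=10.0):
    """Project q(f) ∝ N(f; m, v) * Phi(y * f) onto a Gaussian in two ways.

    Returns (mean, sd_moment, sd_w2): the shared mean, the moment-matched
    standard deviation, and the L2-Wasserstein-optimal standard deviation.
    All parameters are hypothetical illustration values.
    """
    std_normal = NormalDist()
    cavity = NormalDist(m, math.sqrt(v))

    # Uniform grid covering the cavity mean +/- `width` cavity std devs.
    lo = m - width * math.sqrt(v)
    hi = m + width * math.sqrt(v)
    dx = (hi - lo) / (n - 1)
    xs = [lo + i * dx for i in range(n)]

    # Normalized probability mass of the tilted distribution in each cell.
    w = [cavity.pdf(x) * std_normal.cdf(y * x) * dx for x in xs]
    z = sum(w)
    w = [wi / z for wi in w]

    # Moment matching: mean and standard deviation of q.
    mean = sum(wi * x for wi, x in zip(w, xs))
    var = sum(wi * (x - mean) ** 2 for wi, x in zip(w, xs))

    # W2 projection: sigma = integral of F_q^{-1}(u) * Phi^{-1}(u) du,
    # rewritten as E_q[x * Phi^{-1}(F_q(x))] and evaluated at cell midpoints.
    cum = 0.0
    sd_w2 = 0.0
    for x, wi in zip(xs, w):
        u = min(max(cum + 0.5 * wi, 1e-12), 1.0 - 1e-12)  # keep u inside (0, 1)
        sd_w2 += wi * x * std_normal.inv_cdf(u)
        cum += wi

    return mean, math.sqrt(var), sd_w2


if __name__ == "__main__":
    mean, sd_moment, sd_w2 = tilted_gaussian_projections()
    print(f"mean={mean:.4f}  sd (moments)={sd_moment:.4f}  sd (W2)={sd_w2:.4f}")
```

In this sketch both projections share the mean, and the Wasserstein projection gives the smaller standard deviation; dividing out the cavity then yields a more precise (lower-variance) Gaussian site, in line with property (3). How the paper itself computes and parameterizes these updates is not specified in the abstract.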