Understanding the dependencies among features of a dataset is at the core of most unsupervised learning tasks. However, a majority of generative modeling approaches are focused solely on the joint distribution $p(x)$ and utilize models where it is intractable to obtain the conditional distribution of some arbitrary subset of features $x_u$ given the rest of the observed covariates $x_o$: $p(x_u \mid x_o)$. Traditional conditional approaches provide a model for a fixed set of covariates conditioned on another fixed set of observed covariates. Instead, in this work we develop a model that is capable of yielding all conditional distributions $p(x_u \mid x_o)$ (for arbitrary $x_u$) via tractable conditional likelihoods. We propose a novel extension of (change of variables based) flow generative models, arbitrary conditioning flow models (AC-Flow), that can be conditioned on arbitrary subsets of observed covariates, which was previously infeasible. We apply AC-Flow to the imputation of features, and also develop a unified platform for both multiple and single imputation by introducing an auxiliary objective that provides a principled single "best guess" for flow models. Extensive empirical evaluations show that our models achieve state-of-the-art performance in both single and multiple imputation across image inpainting and feature imputation in synthetic and real-world datasets. Code is available at https://github.com/lupalab/ACFlow.
Speakers: Yang Li, Shoaib Akbar, Junier B. Oliva