Space-filling designs such as scrambled-Hammersley, Latin Hypercube Sampling and Jittered Sampling have been proposed for fully parallel hyperparameter search, and were shown to be more effective than random or grid search. In this paper, we show that these designs only improve over random search by a constant factor. In contrast, we introduce a new approach based on reshaping the search distribution, which leads to substantial gains over random search, both theoretically and empirically. We propose two flavors of reshaping. First, when the distribution of the optimum is some known $P_0$, we propose Recentering, which uses as search distribution a modified version of $P_0$ tightened closer to the center of the domain, in a dimension-dependent and budget-dependent manner. Second, we show that in a wide range of experiments with $P_0$ unknown, using a proposed Cauchy transformation, which simultaneously has a heavier tail (for unbounded hyperparameters) and is closer to the boundaries (for bounded hyperparameters), leads to improved performances. Besides artificial experiments and simple real world tests on clustering or Salmon mappings, we check our proposed methods on expensive artificial intelligence tasks such as attend/infer/repeat, video next frame segmentation forecasting and progressive generative adversarial networks.
Speakers: Marie-Liesse Cauwet, Camille Couprie, Julien Dehos, Pauline Luc, Jeremy Rapin, Morgane Riviere, Fabien Teytaud, Olivier Teytaud, Nicolas Usunier