An evaluation metric for generative models using hierarchical clustering