A Trainable Optimal Transport Embedding for Feature Aggregation