Is normalization indispensable for training deep neural network?