This paper proposes an improved design of the perceptron unit to mitigate the vanishing gradient problem, which arises when training deep multilayer perceptron networks with bounded activation functions. The new neuron design, named the auto-rotating perceptron (ARP), includes a mechanism that keeps the node operating in the dynamic region of the activation function by preventing the perceptron from saturating. The proposed method does not change the inference structure learned at each neuron. We test the effect of using ARP units in several network architectures that use the sigmoid activation function. The results support our hypothesis that neural networks with ARP units can achieve better learning performance than equivalent models with classic perceptrons.
Speakers: Daniel Saromo, Leonardo Bravo, Elizabeth Villota
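The abstract does not spell out the ARP formula; the NumPy sketch below is only one plausible reading of the mechanism described above, namely rescaling each neuron's pre-activation so it stays inside a dynamic region [-L, L] where the sigmoid still has a usable gradient. The function name arp_forward, the boundary L, the reference point xQ, and the numerical guard are illustrative assumptions, not definitions taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def arp_forward(x, w, b, L=4.0, xQ=1.0):
    """Hypothetical sketch of an auto-rotating pre-activation.

    A scaling factor rho is chosen so that a reference point whose
    components all equal xQ (assumed to lie at or beyond the edge of
    the normalized input domain) maps onto the dynamic-region
    boundary L. Inputs inside the domain then produce scaled
    pre-activations within (-L, L), avoiding sigmoid saturation.
    All parameter choices here are illustrative, not the paper's.
    """
    z = np.dot(w, x) + b                      # classic pre-activation
    zQ = np.dot(w, np.full_like(x, xQ)) + b   # pre-activation at the reference point
    rho = L / (abs(zQ) + 1e-12)               # scaling factor; guard avoids division by zero
    return sigmoid(rho * z)

# Example usage with inputs assumed normalized to [0, 1]:
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=8)
w = rng.normal(size=8)
print(arp_forward(x, w, b=0.5))
```

Note that rho rescales only the argument passed to the activation; the learned weights w and bias b are untouched, which is consistent with the claim that the inference structure learned at each neuron does not change.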