Learning Efficient Dialogue Policy from Demonstrations through Shaping