The Mean-Squared Error of Double Q-Learning