On the importance of initialization and momentum in deep learning, Ilya Sutskever, James Martens, George Dahl, and Geoffrey Hinton, 2013. Proceedings of the 30th International Conference on Machine Learning, Vol. 28 (PMLR) - Discusses the practical application and advantages of momentum methods (including Nesterov accelerated gradient) in deep learning.