Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - A fundamental textbook that provides explanations of various gradient descent algorithms and their batching approaches for training neural networks.
Optimization: Stochastic Gradient Descent, Andrej Karpathy, Justin Johnson, and Fei-Fei Li, 2023 (Stanford University CS231n Course Notes) - Lecture notes providing clear explanations of various optimization techniques, including the use of batching strategies for neural networks, from a respected course.