Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - Provides comprehensive theoretical foundations for neural network training, including gradient descent, backpropagation, loss functions, optimizers, epochs, and mini-batching.
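For a concrete picture of the concepts the book formalizes, the sketch below hand-rolls them in plain Julia: a squared-error loss, its analytic gradient, and mini-batch gradient descent over several epochs. The toy linear model, data, and hyperparameters are illustrative assumptions, not material from the book.

```julia
using Random, Statistics

# Hand-rolled mini-batch gradient descent on a toy linear model y ≈ w*x + b.
function fit_line(x, y; lr = 0.1, epochs = 100, batchsize = 20)
    w, b = 0.0, 0.0                                   # parameters to learn
    for _ in 1:epochs                                 # one epoch = one full pass over the data
        for idx in Iterators.partition(shuffle(eachindex(x)), batchsize)  # mini-batches
            err = w .* x[idx] .+ b .- y[idx]          # residuals of the squared-error loss
            w -= lr * 2mean(err .* x[idx])            # gradient-descent step for ∂L/∂w
            b -= lr * 2mean(err)                      # gradient-descent step for ∂L/∂b
        end
    end
    return w, b
end

x = rand(100)
y = 2 .* x .+ 1 .+ 0.01 .* randn(100)                 # noisy samples of y = 2x + 1
fit_line(x, y)                                        # recovers roughly (2.0, 1.0)
```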
Flux.jl Documentation, The Flux.jl Contributors, 2025 - Official documentation providing practical guidance on building models, defining loss functions, using optimizers, and implementing training loops in Julia.
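As a minimal sketch of the workflow the documentation covers, the loop below uses Flux's explicit-gradient API (`Flux.setup`, `Flux.gradient`, `Flux.update!`). The two-layer model, mean-squared-error loss, Adam optimiser, and random data are assumptions chosen only for illustration.

```julia
using Flux

model = Chain(Dense(2 => 16, relu), Dense(16 => 1))    # build the model (hypothetical shape)
opt_state = Flux.setup(Adam(0.01), model)              # attach optimiser state to the model

x = rand(Float32, 2, 128)                              # 128 two-feature samples (made-up data)
y = rand(Float32, 1, 128)                              # matching targets

loss(m, x, y) = Flux.mse(m(x), y)                      # define the loss function

for epoch in 1:100
    grads = Flux.gradient(m -> loss(m, x, y), model)   # backward pass, powered by Zygote
    Flux.update!(opt_state, model, grads[1])           # optimiser step on the parameters
end
```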
Zygote.jl Documentation, The Zygote.jl Contributors, 2024 - Official documentation explaining the automatic differentiation engine that powers gradient computation (the backward pass) in the Flux.jl training ecosystem.
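A small example of what Zygote provides: `gradient` runs the forward pass and then plays it back in reverse to produce derivatives of ordinary Julia code. The functions differentiated here are arbitrary examples, not taken from the documentation.

```julia
using Zygote

f(x) = 3x^2 + 2x + 1        # an arbitrary scalar function

gradient(f, 2.0)            # (14.0,) since f'(x) = 6x + 2

# The same machinery differentiates through plain Julia code on arrays:
gradient(x -> sum(abs2, x), [1.0, 2.0, 3.0])   # ([2.0, 4.0, 6.0],)
```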
CS231n: Deep Learning for Computer Vision, Lecture Notes, Fei-Fei Li, Ehsan Adeli, Justin Johnson, Zane Durante, 2023 (Stanford University) - Provides clear, accessible explanations of the core components of neural network training, including the forward pass, backpropagation, loss functions, and optimization algorithms.
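To make the forward/backward distinction concrete, the sketch below pushes one input through a single sigmoid neuron and then applies the chain rule by hand, in the spirit of the course notes. All values are hypothetical.

```julia
σ(z) = 1 / (1 + exp(-z))          # sigmoid activation

x, w, b, y = 1.5, 0.8, 0.1, 1.0   # input, weight, bias, target (made-up values)

# Forward pass: compute the prediction and the squared-error loss.
z = w * x + b                     # pre-activation
a = σ(z)                          # activation (the prediction)
L = (a - y)^2                     # loss

# Backward pass: chain rule from the loss back to the parameters.
dL_da = 2 * (a - y)               # ∂L/∂a
da_dz = a * (1 - a)               # ∂a/∂z for the sigmoid
dL_dw = dL_da * da_dz * x         # ∂L/∂w = ∂L/∂a · ∂a/∂z · ∂z/∂w
dL_db = dL_da * da_dz             # ∂L/∂b = ∂L/∂a · ∂a/∂z
```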