Training Models, Flux.jl Contributors, 2025 (FluxML) - Describes the core training functionality and callback integration in the Flux.jl deep learning framework, including the use of Flux.train! and its supporting utilities.
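A minimal sketch of the callback hook those docs describe, using the classic implicit-parameters form of Flux.train! with its cb keyword (newer Flux releases deprecate this in favor of explicit training loops); the model, data, and throttle interval below are illustrative placeholders:

```julia
using Flux

# Toy setup: a single dense layer regressed against random targets.
model = Dense(10 => 1)
loss(x, y) = Flux.Losses.mse(model(x), y)
data = [(rand(Float32, 10, 32), rand(Float32, 1, 32)) for _ in 1:100]
opt = Descent(0.01)

# Callback: print the loss on the first batch, at most once every 5 seconds.
evalcb() = @show loss(first(data)...)
Flux.train!(loss, Flux.params(model), data, opt; cb = Flux.throttle(evalcb, 5))
```

Flux.throttle is the utility the docs pair with callbacks: it wraps a function so that repeated calls inside the training loop fire at most once per given interval.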
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - Provides foundational theoretical background on early stopping (a regularization technique) and learning rate scheduling, two of the primary applications of callbacks in deep learning training.
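As a concrete illustration of the early-stopping criterion discussed there, the following framework-agnostic sketch stops training once the validation loss has not improved for a fixed number of epochs; the function names, patience value, and epoch cap are all assumptions for illustration:

```julia
# Minimal early-stopping loop: stop when the validation loss fails to improve
# for `patience` consecutive epochs. `train_epoch!` and `valid_loss` are
# stand-ins for whatever the surrounding training script provides.
function train_with_early_stopping!(train_epoch!, valid_loss; patience = 5, max_epochs = 100)
    best = Inf
    stale = 0
    for epoch in 1:max_epochs
        train_epoch!()              # one pass over the training data
        l = valid_loss()
        if l < best
            best = l                # improvement: remember it and reset the counter
            stale = 0
        else
            stale += 1              # no improvement this epoch
            if stale >= patience
                @info "Early stopping" epoch best
                break
            end
        end
    end
    return best
end

# Usage with dummy closures, just to show the calling shape:
train_with_early_stopping!(() -> nothing, () -> rand())
```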
FluxTraining.jl: A Training Library for Flux, FluxTraining.jl Contributors, 2020 - Offers a higher-level, comprehensive callback system for Flux.jl, demonstrating advanced patterns and best practices for structured training oversight and automation.
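FluxTraining.jl structures training as a series of events (epoch begin/end, batch begin/end, and so on) that callbacks subscribe to. The standalone sketch below mimics that event-driven shape without using FluxTraining's actual API; every type and function name here is illustrative, not the library's:

```julia
# Event-driven callback pattern, in miniature. Real FluxTraining.jl callbacks
# hook into a much richer event set; the names here are illustrative only.
abstract type Callback end

# Default no-op, so callbacks only implement the hooks they care about.
on_epoch_end(::Callback, epoch, state) = nothing

struct LoggerCB <: Callback end
on_epoch_end(::LoggerCB, epoch, state) = @info "epoch done" epoch state.loss

struct Checkpointer <: Callback end
on_epoch_end(::Checkpointer, epoch, state) =
    state.loss < state.best && @info "would save checkpoint" epoch

function run_training(callbacks; epochs = 3)
    best = Inf
    for epoch in 1:epochs
        loss = 1.0 / epoch                     # stand-in for a real training pass
        state = (loss = loss, best = best)
        foreach(cb -> on_epoch_end(cb, epoch, state), callbacks)
        best = min(best, loss)
    end
end

run_training([LoggerCB(), Checkpointer()])
```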
SGDR: Stochastic Gradient Descent with Warm Restarts, Ilya Loshchilov and Frank Hutter, 2017 (ICLR 2017; DOI: 10.48550/arXiv.1608.03983) - Introduces a learning rate schedule that cosine-anneals the rate between two bounds and periodically restarts it at the maximum, improving both convergence speed and final performance; directly relevant to dynamic learning rate callbacks.
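The paper's schedule sets the rate to eta_min + 0.5 * (eta_max - eta_min) * (1 + cos(pi * T_cur / T_i)), where T_cur counts epochs since the last restart and T_i is the current period. A minimal sketch of that schedule follows; the bound values and period lengths are illustrative defaults, not the paper's experimental settings:

```julia
# Cosine-annealed learning rate with warm restarts (Loshchilov & Hutter, 2017):
#   eta_t = eta_min + 0.5 * (eta_max - eta_min) * (1 + cos(pi * t_cur / T_i))
# where t_cur counts epochs since the last restart and T_i is the current
# period, multiplied by t_mult after each restart.
function sgdr_schedule(epochs; eta_min = 1e-5, eta_max = 0.1, T_0 = 10, t_mult = 2)
    lrs = Float64[]
    T_i, t_cur = T_0, 0
    for _ in 1:epochs
        push!(lrs, eta_min + 0.5 * (eta_max - eta_min) * (1 + cos(pi * t_cur / T_i)))
        t_cur += 1
        if t_cur >= T_i        # warm restart: jump back to eta_max, lengthen the period
            t_cur = 0
            T_i *= t_mult
        end
    end
    return lrs
end

# Example: precompute 70 epochs of rates, then apply one per epoch inside the
# training loop (e.g. by assigning it to the optimizer's step size).
lrs = sgdr_schedule(70)
```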