Generative Adversarial Networks, Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio, 2014. Advances in Neural Information Processing Systems, Vol. 27 (NeurIPS). - The foundational paper introducing GANs; it defines the minimax objective and establishes the connection to the Jensen-Shannon divergence, the two ingredients at the root of the non-convergence discussed in this section.
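For quick reference, the two results the annotation points to can be restated compactly (a sketch in the paper's standard notation, with $p_{\text{data}}$ the data distribution, $p_g$ the generator's distribution, and $p_z$ the latent prior):

```latex
% Minimax objective: the discriminator D maximizes V, the generator G minimizes it.
\min_G \max_D V(D, G) =
    \mathbb{E}_{x \sim p_{\text{data}}}\!\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z}\!\left[\log\bigl(1 - D(G(z))\bigr)\right]

% For fixed G the optimal discriminator is
%   D^*(x) = p_data(x) / (p_data(x) + p_g(x));
% substituting it back gives the generator's effective loss, which is
% the Jensen-Shannon divergence up to constants:
C(G) = -\log 4 + 2 \cdot \mathrm{JSD}\!\left(p_{\text{data}} \,\middle\|\, p_g\right)
```

Because the JSD saturates at $\log 2$ when the two distributions barely overlap, this objective gives the generator vanishing gradients in exactly the regime the next entry addresses.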
Wasserstein GAN, Martin Arjovsky, Soumith Chintala, Léon Bottou, 2017. Proceedings of the 34th International Conference on Machine Learning (ICML), Vol. 70. - Introduces the Wasserstein distance as a replacement for the JSD, directly addressing the vanishing gradients that occur when the two distributions have negligible overlap, one of the main challenges detailed in this section.
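The quantity WGAN substitutes for the JSD is the Wasserstein-1 (earth mover's) distance in its Kantorovich-Rubinstein dual form, shown below as a reference sketch; $p_r$ and $p_g$ denote the real and generated distributions, and the supremum ranges over 1-Lipschitz "critic" functions $f$:

```latex
% Kantorovich-Rubinstein duality: W_1 as a supremum over 1-Lipschitz
% functions, which WGAN approximates with a weight-clipped neural
% network critic.
W(p_r, p_g) = \sup_{\|f\|_L \le 1}
    \mathbb{E}_{x \sim p_r}[f(x)] - \mathbb{E}_{x \sim p_g}[f(x)]
```

Unlike the JSD, this distance varies smoothly even when the supports of $p_r$ and $p_g$ are disjoint, which is why it yields usable gradients where the original objective does not.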
Improved Techniques for Training GANs, Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, Xi Chen, 2016. Advances in Neural Information Processing Systems, Vol. 29. DOI: 10.48550/arXiv.1606.03498. - Discusses the training instabilities commonly seen in GANs and proposes several practical heuristics for stabilizing training, indirectly underscoring the difficulty of non-convergence and mode collapse.
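One of the paper's simpler heuristics, one-sided label smoothing, fits in a few lines. The sketch below assumes a PyTorch setup; the function name and the 0.9 smoothing value are illustrative choices, not fixed by the paper.

```python
import torch
import torch.nn.functional as F

def discriminator_loss(d_real_logits, d_fake_logits, smooth=0.9):
    """Discriminator loss with one-sided label smoothing.

    Targets for real samples are softened from 1.0 to `smooth`,
    while fake targets stay at exactly 0.0. Smoothing only the
    real side keeps the optimal discriminator from being pulled
    toward the generator's samples, which is why the paper
    recommends one-sided rather than two-sided smoothing.
    """
    real_targets = torch.full_like(d_real_logits, smooth)
    fake_targets = torch.zeros_like(d_fake_logits)
    loss_real = F.binary_cross_entropy_with_logits(d_real_logits, real_targets)
    loss_fake = F.binary_cross_entropy_with_logits(d_fake_logits, fake_targets)
    return loss_real + loss_fake
```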
Which Training Methods for GANs Do Actually Converge?, Lars Mescheder, Andreas Geiger, Sebastian Nowozin, 2018. Proceedings of the 35th International Conference on Machine Learning (ICML), Vol. 80 (PMLR). - Provides a theoretical analysis of GAN training dynamics, clarifying the difficulty of locating saddle points and the oscillatory behavior that arises under alternating gradient updates.
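The oscillation this analysis describes already shows up in the simplest possible saddle-point problem. The toy script below (an illustration under standard assumptions, not an example taken from the paper) runs simultaneous gradient descent-ascent on f(x, y) = x * y, whose unique equilibrium is (0, 0), and shows the iterates spiraling outward instead of converging:

```python
import numpy as np

# Toy saddle-point problem: min_x max_y f(x, y) = x * y.
# Simultaneous gradient descent-ascent has iteration matrix
# [[1, -lr], [lr, 1]] with eigenvalues 1 +/- i*lr, whose modulus
# sqrt(1 + lr^2) > 1, so the iterates spiral away from the
# equilibrium at (0, 0). Step size and iteration count are
# arbitrary illustrative choices.
x, y, lr = 1.0, 1.0, 0.1
for step in range(201):
    gx, gy = y, x                        # df/dx = y, df/dy = x
    x, y = x - lr * gx, y + lr * gy      # descend in x, ascend in y
    if step % 50 == 0:
        print(f"step {step:3d}: x = {x:+.3f}, y = {y:+.3f}, "
              f"distance from origin = {np.hypot(x, y):.3f}")
```

The distance from the origin grows monotonically, mirroring the divergent oscillations the paper establishes for unregularized GAN training on its Dirac-GAN example.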