Site Reliability Engineering: How Google Runs Production Systems, Betsy Beyer, Chris Jones, Jennifer Petoff, and Niall Richard Murphy, 2017 (O'Reilly Media) - A foundational guide to monitoring, alerting, and operational practices, including setting thresholds and managing deployments like canary releases.
GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium, Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, Günter Klambauer, Andreas Maier, Sepp Hochreiter, and Michael Widmann, 2017Advances in Neural Information Processing Systems (NeurIPS 2017), Vol. 30 (NeurIPS)DOI: 10.48550/arXiv.1706.08500 - Introduces the Fréchet Inception Distance (FID) metric, a widely used measure for evaluating the quality of generated images.