In-Datacenter Performance Analysis of a Tensor Processing Unit, Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson, et al., 2017, Proceedings of the 44th Annual International Symposium on Computer Architecture (ISCA) (ACM), DOI: 10.1145/3079856.3080246 - The original paper describing the architecture and datacenter performance of Google's first-generation TPU, a custom ASIC built to accelerate neural network inference.
A Domain-Specific Supercomputer for Training Deep Neural Networks, Norman P. Jouppi, Doe Hyun Yoon, George Kurian, Sheng Li, Nishant Patil, James Laudon, Cliff Young, and David Patterson, 2020, Communications of the ACM, Vol. 63, No. 7 (ACM), DOI: 10.1145/3360307 - Discusses the evolution of the TPU architecture, focusing on the design and performance of the second- and third-generation TPUs, including their scalability and power efficiency.
What are TPUs?, Google Cloud Documentation, 2024 (Google Cloud) - Provides an official overview of Google Cloud TPUs, their capabilities, available generations, and how to use them within the GCP ecosystem.
Use TPUs, TensorFlow Documentation, 2023 (TensorFlow) - Explains how to use tf.distribute.TPUStrategy for training models on TPUs with TensorFlow, covering setup, code adjustments, and best practices.