A Domain-Specific Supercomputer for Training Deep Neural Networks, Norman P. Jouppi, Doe Hyun Yoon, George Kurian, Sheng Li, Nishant Patil, James Laudon, Cliff Young, David Patterson, 2020, Communications of the ACM, Vol. 63 (Association for Computing Machinery (ACM)), DOI: 10.1145/3360307 - Describes the architecture of Google's TPU v2 and v3 supercomputers, including the 2D torus inter-chip interconnect that links the chips directly, in place of a conventional switched cluster network, for training very large ML models.