In-Datacenter Performance Analysis of a Tensor Processing Unit, Norman P. Jouppi, Cliff Young, Nishant Agrawal, Mike Baker, Gaurav Bates, Kelly Cao, Raymond M. Chiu, George Chou, Jeremy Clark, Brad Conrad, John N. Cook, Phoebe Coplon, Pat Costello, Anna Cuningham, Nathan Eifrig, Jeremy Kaiser, Paul Kallman, Alan Lee, Jason Li, Alex Lukefahr, David Mullis, Alex Nagurney, LaMDA Tran, Trevor Norris, Grant Ortega, Lawrence Ortega, Rahul Pandit, Daniel Smith, Kevin Tarolli, Greg Tassa, Anant Thazhuthaveetil, Rajat Verma, Dean Way, David Welch, Jennifer Wen, Paul N. Williams, William Wolf, Scott Wong, Tim Xu, and David Zhabel, 2017, Proceedings of the 44th Annual International Symposium on Computer Architecture (ISCA '17) (ACM). DOI: 10.1145/3079737.3079803 - This foundational paper introduces Google's Tensor Processing Unit (TPU) and details its architecture and its performance characteristics on neural network workloads.