Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, 2017. Advances in Neural Information Processing Systems (NeurIPS), Vol. 30. DOI: 10.48550/arXiv.1706.03762 - Introduces the Transformer architecture and multi-head self-attention, the basic building blocks of modern attention mechanisms.
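Non-local Neural Networks, Xiaolong Wang, Ross Girshick, Abhinav Gupta, Kaiming He, 2018. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). DOI: 10.48550/arXiv.1711.07971 - Introduces non-local operations, a generalization of self-attention for capturing long-range dependencies in convolutional neural networks, which is important for strengthening a U-Net's contextual understanding.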