Zero-Shot Text-to-Image Generation, Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever, 2021. Proceedings of the 38th International Conference on Machine Learning, Vol. 139 (PMLR). DOI: 10.32473/pmlr.v139.ramesh21a - Demonstrates an important application of VQ-VAE in DALL-E, emphasizing its role in learning discrete visual tokens, which a Transformer model then uses for text-to-image generation.