Text-to-Image Synthesis: Creating Visuals from Descriptions (Introduction)
Zero-Shot Text-to-Image Generation, Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever, 2021. DOI: 10.48550/arXiv.2102.12092 - Introduces DALL-E, a pioneering model that generates diverse images from text descriptions, highlighting the effectiveness of large-scale text-image pre-training.
High-Resolution Image Synthesis with Latent Diffusion Models, Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer, 2022. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (IEEE). DOI: 10.1109/CVPR52688.2022.01042 - Presents Latent Diffusion Models, the architecture behind Stable Diffusion, for efficient and high-quality text-to-image synthesis, making advanced generation accessible.