Image Captioning Systems: Generating Text from Images
Was this section helpful?
Show and Tell: A Neural Image Caption Generator, Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan, 2015Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)DOI: 10.1109/CVPR.2015.7298935 - Introduces the pioneering encoder-decoder neural network architecture for image captioning, combining a CNN for image feature extraction and an RNN for sequence generation.
Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville, 2016 (MIT Press) - A textbook covering the mathematical and conceptual background of deep learning, including neural networks, CNNs, and RNNs, which form the basis for image captioning systems.