Image-Text-Papers

Papers related to image captioning and image generation.

This list is a work in progress.

Image Caption (Image --> Text)

Survey

Bernardi, Raffaella, et al. Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures. J. Artif. Intell. Res. (JAIR) 55 (2016): 409-442. [pdf]
Karpathy, Andrej. Connecting Images and Natural Language. Diss. Stanford University, 2016. [pdf]

Visual-semantic Embedding Based

Kiros R, Salakhutdinov R, Zemel R S. Unifying visual-semantic embeddings with multimodal neural language models. arXiv preprint arXiv:1411.2539, 2014. [pdf]
Karpathy A, Fei-Fei L. Deep visual-semantic alignments for generating image descriptions. CVPR, 2015: 3128-3137. [pdf]

Encoder-Decoder

Vinyals, Oriol, et al. Show and tell: A neural image caption generator. CVPR, 2015. [pdf]
Xu, Kelvin, et al. Show, attend and tell: Neural image caption generation with visual attention. ICML, 2015. [pdf]
Karpathy, Andrej, and Li Fei-Fei. Deep visual-semantic alignments for generating image descriptions. CVPR, 2015. [pdf]
Anderson, Peter, et al. Bottom-up and top-down attention for image captioning and VQA. arXiv preprint arXiv:1707.07998 (2017). [pdf]

Reinforcement Learning

Rennie, Steven J., et al. Self-critical Sequence Training for Image Captioning. CVPR, 2017. [pdf]
Liu, Siqi, et al. Improved Image Captioning via Policy Gradient optimization of SPIDEr. ICCV, 2017. [pdf] [video]
Zhou Ren, Xiaoyu Wang, Ning Zhang, et al. Deep Reinforcement Learning-based Image Captioning with Embedding Reward. CVPR, 2017. [pdf] [video]
Chen T H, Liao Y H, Chuang C Y, et al. Show, Adapt and Tell: Adversarial Training of Cross-domain Image Captioner. ICCV, 2017. [pdf] [Supplementary]
Dai B, Lin D, Urtasun R, et al. Towards Diverse and Natural Image Descriptions via a Conditional GAN. ICCV, 2017. [pdf] [video]

Image Generation (Text --> Image)

RNN

Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. Pixel recurrent neural networks. arXiv preprint arXiv:1601.06759 (2016). [pdf]
Gregor, Karol, et al. DRAW: A recurrent neural network for image generation. arXiv preprint arXiv:1502.04623 (2015). [pdf]
Mansimov, Elman, et al. Generating images from captions with attention. arXiv preprint arXiv:1511.02793 (2015). [pdf]

GAN

Gauthier, Jon. Conditional generative adversarial nets for convolutional face generation. Class Project for Stanford CS231N: Convolutional Neural Networks for Visual Recognition, Winter semester 2014.5 (2014): 2. [pdf]
Reed S, Akata Z, Yan X, et al. Generative adversarial text to image synthesis. ICML, 2016. [pdf] [Supplementary]
Reed, Scott E., et al. Learning what and where to draw. NIPS, 2016. [pdf]
Zhang H, Xu T, Li H, et al. StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks. ICCV, 2017. [pdf] [video]
Zhang H, Xu T, Li H, et al. StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks. arXiv preprint arXiv:1710.10916, 2017. [pdf]
Xu, Tao, et al. AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks. arXiv preprint arXiv:1711.10485 (2017). [pdf]
Hao Dong, Simiao Yu, Chao Wu, Yike Guo. Semantic Image Synthesis via Adversarial Learning. ICCV, 2017. [pdf] [Supplementary]
Dash, Ayushman, et al. TAC-GAN - Text Conditioned Auxiliary Classifier Generative Adversarial Network. arXiv preprint arXiv:1703.06412, 2017. [pdf]
Nguyen, Anh, et al. Plug & play generative networks: Conditional iterative generation of images in latent space. CVPR, 2017. [pdf] [Supplementary]
