Computer Vision (CV)

Courses

Stanford's CS231n is a recommended resource for deep learning in computer vision
Introduction to Computer Vision on Udacity, a Georgia Tech master's course, has assignments in Octave/MATLAB

Books

Pattern Recognition and Machine Learning [$], Christopher Bishop, 2006, Springer, 27k citations
Computer Vision: Algorithms and Applications, Richard Szeliski, 2011, Springer, 3k citations
Learning OpenCV [$], Gary Bradski, Adrian Kaehler, 2008, O'Reilly

Papers

The papers below are drawn from various sources, including Awesome Deep Learning and Adit Deshpande's "The 9 Deep Learning Papers You Need To Know About"

AlexNet: ImageNet Classification with Deep Convolutional Neural Networks, A. Krizhevsky et al., 2012
ZFNet: Visualizing and Understanding Convolutional Networks, Matthew D. Zeiler and Rob Fergus, 2013
VGGNet: Very Deep Convolutional Networks for Large-Scale Image Recognition, Karen Simonyan and Andrew Zisserman, 2015
GoogLeNet: Going Deeper with Convolutions, Christian Szegedy et al., 2015
ResNet: Deep Residual Learning for Image Recognition, K. He et al., 2016
OverFeat: Integrated recognition, localization and detection using convolutional networks, P. Sermanet et al., 2013, 1700+ citations
Return of the devil in the details: delving deep into convolutional nets, K. Chatfield et al., 2014, 1200+ citations
Network in Network or 1x1 convolution, M. Lin et al., 2013, 1000+ citations

Rich feature hierarchies for accurate object detection and semantic segmentation, R. Girshick et al., 2014
Fully convolutional networks for semantic segmentation, J. Long et al., 2015
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, S. Ren et al., 2015
Fast R-CNN, R. Girshick, 2015
Learning hierarchical features for scene labeling, C. Farabet et al., 2013
Semantic image segmentation with deep convolutional nets and fully connected CRFs, L. Chen et al., 2015

Other Interesting Papers

- [Spatial pyramid pooling in deep convolutional networks for visual recognition](http://arxiv.org/pdf/1406.4729), K. He et al., 2014
- [You only look once: Unified, real-time object detection](http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Redmon_You_Only_Look_CVPR_2016_paper.pdf), J. Redmon et al., 2016

DeepFace: Closing the gap to human-level performance in face verification, Y. Taigman et al., 2014
Large-scale video classification with convolutional neural networks, A. Karpathy et al., 2014
Show and tell: A neural image caption generator, O. Vinyals et al., 2015
Show, attend and tell: Neural image caption generation with visual attention, K. Xu et al., 2015
Deep visual-semantic alignments for generating image descriptions, A. Karpathy and L. Fei-Fei, 2015
Long-term recurrent convolutional networks for visual recognition and description, J. Donahue et al., 2015
3D convolutional neural networks for human action recognition, S. Ji et al., 2013
Two-stream convolutional networks for action recognition in videos, K. Simonyan et al., 2014

Other Interesting Papers

- [Image Super-Resolution Using Deep Convolutional Networks](https://arxiv.org/pdf/1501.00092v3.pdf), C. Dong et al., 2016
- [VQA: Visual question answering](http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Antol_VQA_Visual_Question_ICCV_2015_paper.pdf), S. Antol et al., 2015
- [A neural algorithm of artistic style](https://arxiv.org/pdf/1508.06576), L. Gatys et al., 2015