* Collection of ready-made models
Network in Network model
This model is described in detail in the following ICLR-2014 paper:
Network In Network
M. Lin, Q. Chen, S. Yan
International Conference on Learning Representations, 2014 (arXiv:1312.4400)
Please cite the paper if you use the models.
Models from the BMVC-2014 paper "Return of the Devil in the Details: Delving Deep into Convolutional Nets"
These models were trained on the ILSVRC-2012 dataset. For details, see the project page and the BMVC-2014 paper:
Return of the Devil in the Details: Delving Deep into Convolutional Nets
K. Chatfield, K. Simonyan, A. Vedaldi, A. Zisserman
British Machine Vision Conference, 2014 (arXiv:1405.3531)
Please cite the paper if you use the models.
Models used by the VGG team in ILSVRC-2014
These models are improved versions of the models used by the VGG team in ILSVRC-2014. See the project page and the arXiv paper:
Very Deep Convolutional Networks for Large-Scale Image Recognition
K. Simonyan, A. Zisserman
arXiv:1409.1556
Please cite the paper if you use the models.
Places-CNN model from MIT.
Places CNN is described in the following NIPS 2014 paper:
B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva
Learning Deep Features for Scene Recognition using Places Database.
Advances in Neural Information Processing Systems 27 (NIPS) spotlight, 2014.
The project page is here
GoogLeNet GPU implementation from Princeton.
We implemented GoogLeNet using a single GPU. Our main contribution is an effective way to initialize the network and a trick to overcome the GPU memory constraint by accumulating gradients over two training iterations.
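The gradient accumulation trick mentioned above can be illustrated with a minimal sketch. This is not the Princeton implementation: the linear least-squares model, the function name `mse_grad`, and the toy data are all assumptions, chosen only to show that averaging the gradients of two half-size batches reproduces the gradient of one full batch. The weight update can therefore be applied once every two iterations while only half a batch is held in memory at a time.

```python
import math

def mse_grad(w, xs, ys):
    """Gradient of 0.5 * mean((x . w - y)^2) with respect to w."""
    grad = [0.0] * len(w)
    for x, y in zip(xs, ys):
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            grad[j] += residual * xj / len(ys)
    return grad

# Hypothetical toy data: 4 samples with 2 features each.
xs = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0], [-2.0, 1.5]]
ys = [1.0, 0.0, 2.0, -1.0]
w = [0.1, -0.2]

# Gradient from one pass over the full batch.
full_grad = mse_grad(w, xs, ys)

# Same gradient accumulated over two half-size batches.
accumulated = [0.0] * len(w)
for start in (0, 2):
    half_grad = mse_grad(w, xs[start:start + 2], ys[start:start + 2])
    accumulated = [a + g for a, g in zip(accumulated, half_grad)]
accumulated = [a / 2 for a in accumulated]  # average before the single update

grads_match = all(
    math.isclose(f, a, abs_tol=1e-12) for f, a in zip(full_grad, accumulated)
)
```

The equivalence holds here because both half-batches have the same size; with unequal sizes the accumulated gradients would need to be weighted by sample count.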
Fully Convolutional Semantic Segmentation Models (FCN-Xs)
These models are described in the paper:
Fully Convolutional Networks for Semantic Segmentation
Jonathan Long, Evan Shelhamer, Trevor Darrell
CVPR 2015
arXiv:1411.4038
CaffeNet fine-tuned for Oxford flowers dataset
https://gist.github.com/jimgoo/0179e52305ca768a601f
This is the reference CaffeNet (modified AlexNet) fine-tuned for the Oxford 102 category flower dataset. The number of outputs in the final inner product layer has been set to 102 to match the number of flower categories. Hyperparameter choices follow those in Fine-tuning CaffeNet for Style Recognition on “Flickr Style” Data. The global learning rate is reduced, while the learning rate for the final fully connected layer is increased relative to the other layers.
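The learning-rate arrangement described above can be sketched as a per-layer multiplier applied to a global base rate. The base rate of 0.001, the multiplier of 10 for the final layer, and the layer names below are illustrative assumptions, not values taken from the gist.

```python
def effective_learning_rates(base_lr, lr_mults):
    """Return the per-layer rate base_lr * multiplier for each layer."""
    return {layer: base_lr * mult for layer, mult in lr_mults.items()}

# Hypothetical layer names and multipliers, for illustration only.
lr_mults = {"conv1": 1.0, "fc7": 1.0, "fc8_flowers": 10.0}
rates = effective_learning_rates(base_lr=0.001, lr_mults=lr_mults)

for layer, rate in rates.items():
    print(f"{layer}: {rate:g}")
```

With this arrangement the pre-trained layers change slowly, while the newly initialized 102-way output layer, which starts from scratch, learns faster.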
CNN Models for Salient Object Subitizing.
The CNN models are described in the following CVPR'15 paper:
Salient Object Subitizing
J. Zhang, S. Ma, M. Sameki, S. Sclaroff, M. Betke, Z. Lin, X. Shen, B. Price and R. Mech.
CVPR, 2015.
Deep Learning of Binary Hash Codes for Fast Image Retrieval
We present an effective deep learning framework to create hash-like binary codes for fast image retrieval. The details can be found in the following CVPRW'15 paper:
Deep Learning of Binary Hash Codes for Fast Image Retrieval
K. Lin, H.-F. Yang, J.-H. Hsiao, C.-S. Chen
CVPR 2015, DeepVision workshop
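As a rough illustration of why hash-like binary codes make retrieval fast, the sketch below binarizes real-valued activations and ranks database items by Hamming distance. The 0.5 threshold, the function names, and the toy activations are assumptions for illustration only; see the paper for the actual method.

```python
def binarize(activations, threshold=0.5):
    """Pack activations into an int, one bit per value (1 if value >= threshold)."""
    code = 0
    for value in activations:
        code = (code << 1) | (1 if value >= threshold else 0)
    return code

def hamming_distance(a, b):
    """Number of differing bits between two binary codes."""
    return bin(a ^ b).count("1")

def rank_by_hamming(query_code, database_codes):
    """Return database indices sorted by Hamming distance to the query."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )

# Hypothetical activations in [0, 1], e.g. outputs of a sigmoid layer.
database = [[0.9, 0.1, 0.8, 0.2], [0.2, 0.7, 0.1, 0.9], [0.8, 0.2, 0.9, 0.6]]
query = [0.95, 0.05, 0.7, 0.1]

database_codes = [binarize(a) for a in database]
ranking = rank_by_hamming(binarize(query), database_codes)
```

Comparing two codes takes one XOR and a bit count, which is much cheaper than a distance computation over the full real-valued feature vectors.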
Places_CNDS_models on Scene Recognition
The details of training this model are described in the following report. Please cite this work if the model is useful for you.
Training Deeper Convolutional Networks with Deep Supervision
L.Wang, C.Lee, Z.Tu, S. Lazebnik, arXiv:1505.02496, 2015
Models for Age and Gender Classification.
Age/Gender.net are models for age and gender classification trained on the Adience-OUI dataset. See the Project page.
GoogLeNet_cars on car model classification
GoogLeNet_cars is the GoogLeNet model pre-trained on the ImageNet classification task and fine-tuned on 431 car models in the CompCars dataset. It is described in the technical report. Please cite the following work if the model is useful for you.
A Large-Scale Car Dataset for Fine-Grained Categorization and Verification
L. Yang, P. Luo, C. C. Loy, X. Tang, arXiv:1506.08959, 2015
ParseNet: Looking wider to see better
These models are described in the paper:
ParseNet: Looking Wider to See Better
Wei Liu, Andrew Rabinovich, Alexander C. Berg
arXiv:1506.04579
SegNet and Bayesian SegNet
SegNet is a real-time semantic segmentation architecture for scene understanding. Code and trained models for SegNet and Bayesian SegNet are available.
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
Vijay Badrinarayanan, Alex Kendall and Roberto Cipolla
arXiv preprint arXiv:1511.00561, 2015
Conditional Random Fields as Recurrent Neural Networks
The code (with a Matlab/Python API) and model are described in the ICCV 2015 paper:
Conditional Random Fields as Recurrent Neural Networks
S. Zheng, S. Jayasumana, B. Romera-Paredes, V. Vineet, Z. Su, D. Du, C. Huang, P. Torr
ICCV 2015.
Holistically-Nested Edge Detection
The model and code provided are described in the ICCV 2015 paper:
Holistically-Nested Edge Detection
Saining Xie and Zhuowen Tu
ICCV 2015
Translating Videos to Natural Language
These models are described in this NAACL-HLT 2015 paper.
Translating Videos to Natural Language Using Deep Recurrent Neural Networks
S. Venugopalan, H. Xu, J. Donahue, M. Rohrbach, R. Mooney, K. Saenko
NAACL-HLT 2015
More details can be found on this project page.
VGG Face CNN descriptor
These models are described in this BMVC 2015 paper.
Deep Face Recognition
Omkar M. Parkhi, Andrea Vedaldi, Andrew Zisserman
BMVC 2015
More details can be found on this project page.
Yearbook Photo Dating
Model from the ICCV 2015 Extreme Imaging Workshop paper:
A Century of Portraits: Exploring the Visual Historical Record of American High School Yearbooks
Shiry Ginosar, Kate Rakelly, Brian Yin, Sarah Sachs, Alyosha Efros
ICCV Workshop 2015
Model and prototxt files: Yearbook
CCNN: Constrained Convolutional Neural Networks for Weakly Supervised Segmentation
These models are described in the ICCV 2015 paper.
Constrained Convolutional Neural Networks for Weakly Supervised Segmentation
Deepak Pathak, Philipp Krähenbühl, Trevor Darrell
ICCV 2015
arXiv:1506.03648
These are pre-release models. They do not run in any current version of BVLC/caffe, as they require unmerged PRs. Full details, source code, models, prototxts are available here: CCNN.
Emotion Recognition in the Wild via Convolutional Neural Networks and Mapped Binary Patterns
We provide models for facial emotion classification for different image representations obtained using mapped binary patterns. See the Project page for more details.
The models are described in the following paper:
Emotion Recognition in the Wild via Convolutional Neural Networks and Mapped Binary Patterns
Gil Levi and Tal Hassner
Proc. ACM International Conference on Multimodal Interaction (ICMI), Seattle, Nov. 2015
If you find our models useful, please add a suitable reference to our paper in your work.
Facial Landmark Detection with Tweaked Convolutional Neural Networks
We provide source code and a model for the article: Yue Wu and Tal Hassner, "Facial Landmark Detection with Tweaked Convolutional Neural Networks", arXiv preprint arXiv:1511.04031, 12 Nov. 2015. See the project page for more information about this project.
Written by Ishay Tubi
This software is provided as is, without any warranty, with no legal constraints. If you find our models useful, please add a suitable reference to our paper in your work.
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Download pre-computed Faster R-CNN detectors:
    cd $FRCN_ROOT
    ./data/scripts/fetch_faster_rcnn_models.sh
This will populate the $FRCN_ROOT/data folder with faster_rcnn_models. See data/README.md for details. These models were trained on VOC 2007 trainval.
Sequence to Sequence - Video to Text
These models are described in this ICCV 2015 paper.
Sequence to Sequence - Video to Text
S. Venugopalan, M. Rohrbach, J. Donahue, T. Darrell, R. Mooney, K. Saenko
The IEEE International Conference on Computer Vision (ICCV) 2015
More details can be found on this project page.
Model:
S2VT_VGG_RGB:
This is the S2VT (RGB) model described in the ICCV 2015 paper. It uses video frame features from the 16-layer VGG model. It is trained only on the YouTube video dataset.
Compatibility:
These are pre-release models. They do not run in any current version of BVLC/caffe, as they require unmerged PRs. The models are currently supported by the recurrent branch of the Caffe forks provided at https://github.com/jeffdonahue/caffe/tree/recurrent and https://github.com/vsubhashini/caffe/tree/recurrent.
ResNets: Deep Residual Networks from MSRA at ImageNet and COCO 2015
This repository contains the original models (ResNet-50, ResNet-101, and ResNet-152) described in the paper "Deep Residual Learning for Image Recognition" (http://arxiv.org/abs/1512.03385). These are the models used in the ILSVRC and COCO 2015 competitions, where they won 1st place in ImageNet classification, ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
More instructions, with prototxt and binary weight files, are at: https://github.com/KaimingHe/deep-residual-networks