├── LICENSE
├── README.md
├── models
    ├── deploy.prototxt
    └── deploy_lab.prototxt
└── resources
    ├── fetch_caffe.sh
    └── fetch_models.sh

/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2016 Richard Zhang
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | ## Split-Brain Autoencoders: Unsupervised Learning by Cross-Channel Prediction [[Project Page]](http://richzhang.github.io/splitbrainauto/) ##
2 | [Richard Zhang](https://richzhang.github.io/), [Phillip Isola](http://web.mit.edu/phillipi/), [Alexei A. Efros](http://www.eecs.berkeley.edu/~efros/). In CVPR, 2017.
(hosted on [ArXiv](https://arxiv.org/abs/1611.09842))
3 |
4 |
5 |
6 | ### Overview ###
7 | This repository contains a pre-trained Split-Brain Autoencoder network. The network achieves state-of-the-art results on several large-scale unsupervised representation learning benchmarks.
8 |
9 | ### Clone this repository ###
10 | Clone the master branch of the repository using `git clone -b master --single-branch https://github.com/richzhang/splitbrainauto.git`
11 |
12 | ### Dependencies ###
13 | This code requires a working installation of [Caffe](http://caffe.berkeleyvision.org/). For guidelines and help with installation of Caffe, consult the [installation guide](http://caffe.berkeleyvision.org/) and [Caffe users group](https://groups.google.com/forum/#!forum/caffe-users).
14 |
15 | ### Test-Time Usage ###
16 | **(1)** Run `./resources/fetch_models.sh`. This downloads two models into the `models` directory: `model_splitbrainauto_clcl.caffemodel`, and `model_splitbrainauto_clcl_rs.caffemodel`, which is the same model with the rescaling method from [Krähenbühl et al. ICLR 2016](https://github.com/philkr/magic_init) applied. The rescaling method has been shown to improve fine-tuning performance in some models, and we use it for the PASCAL tests in Table 4 in the paper. Alternatively, download the models from [here](https://people.eecs.berkeley.edu/~rich.zhang/projects/2017_splitbrain/files/models/) and put them in the `models` directory.
17 |
18 | **(2)** To extract features, you can (a) use the main branch of Caffe and do color conversion outside of the network or (b) download and install a modified Caffe and not worry about color conversion.
19 |
20 | **(a)** **Color conversion outside of prototxt** To extract features with the main branch of [Caffe](http://caffe.berkeleyvision.org/):
21 | **(i)** Load the downloaded weights with model definition file `deploy_lab.prototxt` in the `models` directory. The input is blob `img_lab` (the top of the `Input` layer in `deploy_lab.prototxt`), which is an ***image in Lab colorspace***. You will have to do the Lab color conversion pre-processing outside of the network.
22 |
23 | **(b)** **Color conversion in prototxt** You can also extract features with in-prototxt color conversion, using a modified Caffe.
24 | **(i)** Run `./resources/fetch_caffe.sh`. This will download and unpack a modified Caffe into directory `./caffe-colorization`.
25 | **(ii)** Install the modified Caffe. For guidelines and help with installation of Caffe, consult the [installation guide](http://caffe.berkeleyvision.org/) and [Caffe users group](https://groups.google.com/forum/#!forum/caffe-users).
26 | **(iii)** Load the downloaded weights with model definition file `deploy.prototxt` in the `models` directory. The input is blob `data`, which is a ***non mean-centered BGR image***.
27 |
28 | ### Citation ###
29 | If you find this model useful for your research, please use this [bibtex](http://richzhang.github.io/index_files/bibtex_cvpr2017_splitbrain.txt) to cite.
30 |
31 |
--------------------------------------------------------------------------------
/models/deploy.prototxt:
--------------------------------------------------------------------------------
1 |
2 |
3 | layer {
4 |   name: "input"
5 |   type: "Input"
6 |   top: "data" # BGR image from [0,255] ***NOT MEAN CENTERED***
7 |   input_param { shape { dim: 1 dim: 3 dim: 227 dim: 227 } }
8 | }
9 | layer { # Convert to lab
10 |   name: "img_lab"
11 |   type: "ColorConv"
12 |   bottom: "data"
13 |   top: "img_lab"
14 |   propagate_down: false
15 |   color_conv_param {
16 |     input: 0 # BGR
17 |     output: 3 # Lab
18 |   }
19 | }
20 | layer { # 0-center lightness channel
21 |   name: "data_lab"
22 |   type: "Convolution"
23 |   bottom: "img_lab"
24 |   top: "data_lab" # [-50,50]
25 |   propagate_down: false
26 |   param {lr_mult: 0 decay_mult: 0}
27 |   param {lr_mult: 0 decay_mult: 0}
28 |   convolution_param {
29 |     kernel_size: 1
30 |     num_output: 3
31 |     group: 3
32 |   }
33 | }
34 | layer {
35 |   name: "conv1"
36 |   type: "Convolution"
37 |   # bottom: "img"
38 |   bottom: "data_lab"
39 |   # bottom: "img_bn"
40 |   top: "conv1"
41 |   param {lr_mult: 0 decay_mult: 0}
42 |   param {lr_mult: 0 decay_mult: 0}
43 |   convolution_param {
44 |     num_output: 96
45 |     kernel_size: 11
46 |     stride: 4
47 |     weight_filler {
48 |       type: "gaussian"
49 |       std: 0.01
50 |     }
51 |     bias_filler {
52 |       type: "constant"
53 |       value: 0
54 |     }
55 |   }
56 | }
57 | layer {
58 |   name: "relu1"
59 |   type: "ReLU"
60 |   bottom: "conv1"
61 |   top: "conv1"
62 | }
63 | layer {
64 |   name: "pool1"
65 |   type: "Pooling"
66 |   bottom: "conv1"
67 |   top: "pool1"
68 |   pooling_param {
69 |     pool: MAX
70 |     kernel_size: 3
71 |     stride: 2
72 |   }
73 | }
74 | layer {
75 |   name: "conv2"
76 |   type: "Convolution"
77 |   bottom: "pool1"
78 |   top: "conv2"
79 |   param { lr_mult: 0 decay_mult: 0 }
80 |   param { lr_mult: 0 decay_mult: 0 }
81 |   convolution_param {
82 |     num_output: 256
83 |     pad: 2
84 |     kernel_size: 5
85 |     group: 2
86 |   }
87 | }
88 | layer {
89 |   name: "relu2"
90 |   type: "ReLU"
91 |   bottom: "conv2"
92 |   top: "conv2"
93 | }
94 | layer {
95 |   name: "pool2"
96 |   type: "Pooling"
97 |   # bottom: "conv2"
98 |   bottom: "conv2"
99 |   top: "pool2"
100 |   pooling_param {
101 |     pool: MAX
102 |     kernel_size: 3
103 |     stride: 2
104 |     # pad: 1
105 |   }
106 | }
107 | layer {
108 |   name: "conv3"
109 |   type: "Convolution"
110 |   bottom: "pool2"
111 |   top: "conv3"
112 |   # propagate_down: false
113 |   param { lr_mult: 0 decay_mult: 0 }
114 |   param { lr_mult: 0 decay_mult: 0 }
115 |   convolution_param {
116 |     num_output: 384
117 |     pad: 1
118 |     kernel_size: 3
119 |     weight_filler {
120 |       type: "gaussian"
121 |       std: 0.01
122 |     }
123 |     bias_filler {
124 |       type: "constant"
125 |       value: 0
126 |     }
127 |   }
128 | }
129 | layer {
130 |   name: "relu3"
131 |   type: "ReLU"
132 |   bottom: "conv3"
133 |   top: "conv3"
134 | }
135 | layer {
136 |   name: "conv4"
137 |   type: "Convolution"
138 |   bottom: "conv3"
139 |   top: "conv4"
140 |   param { lr_mult: 0 decay_mult: 0 }
141 |   param { lr_mult: 0 decay_mult: 0 }
142 |   convolution_param {
143 |     num_output: 384
144 |     pad: 1
145 |     kernel_size: 3
146 |     group: 2
147 |   }
148 | }
149 | layer {
150 |   name: "relu4"
151 |   type: "ReLU"
152 |   bottom: "conv4"
153 |   top: "conv4"
154 | }
155 | layer {
156 |   name: "conv5"
157 |   type: "Convolution"
158 |   bottom: "conv4"
159 |   top: "conv5"
160 |   param { lr_mult: 0 decay_mult: 0 }
161 |   param { lr_mult: 0 decay_mult: 0 }
162 |   convolution_param {
163 |     num_output: 256
164 |     pad: 1
165 |     kernel_size: 3
166 |     group: 2
167 |   }
168 | }
169 | layer {
170 |   name: "relu5"
171 |   type: "ReLU"
172 |   bottom: "conv5"
173 |   top: "conv5"
174 | }
175 | layer {
176 |   name: "pool5"
177 |   type: "Pooling"
178 |   bottom: "conv5"
179 |   top: "pool5"
180 |   pooling_param {
181 |     pool: MAX
182 |     kernel_size: 3
183 |     stride: 2
184 |   }
185 | }
186 | layer {
187 |   name: "fc6"
188 |   type: "Convolution"
189 |   bottom: "pool5"
190 |   top: "fc6"
191 |   param { lr_mult: 0 decay_mult: 0 }
192 |   param { lr_mult: 0 decay_mult: 0 }
193 |   convolution_param {
194 |     kernel_size: 6
195 |     dilation: 2
196 |     pad: 5
197 |     stride: 1
198 |     num_output: 4096
199 |     weight_filler {
200 |       type: "gaussian"
201 |       std: 0.005
202 |     }
203 |     bias_filler {
204 |       type: "constant"
205 |       value: 1
206 |     }
207 |   }
208 | }
209 | layer {
210 |   name: "relu6"
211 |   type: "ReLU"
212 |   bottom: "fc6"
213 |   top: "fc6"
214 | }
215 | layer {
216 |   name: "fc7"
217 |   type: "Convolution"
218 |   bottom: "fc6"
219 |   top: "fc7"
220 |   param { lr_mult: 0 decay_mult: 0 }
221 |   param { lr_mult: 0 decay_mult: 0 }
222 |   convolution_param {
223 |     kernel_size: 1
224 |     stride: 1
225 |     num_output: 4096
226 |     weight_filler {
227 |       type: "gaussian"
228 |       std: 0.005
229 |     }
230 |     bias_filler {
231 |       type: "constant"
232 |       value: 1
233 |     }
234 |   }
235 | }
236 | layer {
237 |   name: "relu7"
238 |   type: "ReLU"
239 |   bottom: "fc7"
240 |   top: "fc7"
241 | }
242 |
--------------------------------------------------------------------------------
/models/deploy_lab.prototxt:
--------------------------------------------------------------------------------
1 |
2 |
3 | layer {
4 |   name: "input"
5 |   type: "Input"
6 |   top: "img_lab" # image in Lab color space
7 |   input_param { shape { dim: 1 dim: 3 dim: 227 dim: 227 } }
8 | }
9 | layer { # 0-center lightness channel
10 |   name: "data_lab"
11 |   type: "Convolution"
12 |   bottom: "img_lab"
13 |   top: "data_lab" # [-50,50]
14 |   propagate_down: false
15 |   param {lr_mult: 0 decay_mult: 0}
16 |   param {lr_mult: 0 decay_mult: 0}
17 |   convolution_param {
18 |     kernel_size: 1
19 |     num_output: 3
20 |     group: 3
21 |   }
22 | }
23 | layer {
24 |   name: "conv1"
25 |   type: "Convolution"
26 |   # bottom: "img"
27 |   bottom: "data_lab"
28 |   # bottom: "img_bn"
29 |   top: "conv1"
30 |   param {lr_mult: 0 decay_mult: 0}
31 |   param {lr_mult: 0 decay_mult: 0}
32 |   convolution_param {
33 |     num_output: 96
34 |     kernel_size: 11
35 |     stride: 4
36 |     weight_filler {
37 |       type: "gaussian"
38 |       std: 0.01
39 |     }
40 |     bias_filler {
41 |       type: "constant"
42 |       value: 0
43 |     }
44 |   }
45 | }
46 | layer {
47 |   name: "relu1"
48 |   type: "ReLU"
49 |   bottom: "conv1"
50 |   top: "conv1"
51 | }
52 | layer {
53 |   name: "pool1"
54 |   type: "Pooling"
55 |   bottom: "conv1"
56 |   top: "pool1"
57 |   pooling_param {
58 |     pool: MAX
59 |     kernel_size: 3
60 |     stride: 2
61 |   }
62 | }
63 | layer {
64 |   name: "conv2"
65 |   type: "Convolution"
66 |   bottom: "pool1"
67 |   top: "conv2"
68 |   param { lr_mult: 0 decay_mult: 0 }
69 |   param { lr_mult: 0 decay_mult: 0 }
70 |   convolution_param {
71 |     num_output: 256
72 |     pad: 2
73 |     kernel_size: 5
74 |     group: 2
75 |   }
76 | }
77 | layer {
78 |   name: "relu2"
79 |   type: "ReLU"
80 |   bottom: "conv2"
81 |   top: "conv2"
82 | }
83 | layer {
84 |   name: "pool2"
85 |   type: "Pooling"
86 |   # bottom: "conv2"
87 |   bottom: "conv2"
88 |   top: "pool2"
89 |   pooling_param {
90 |     pool: MAX
91 |     kernel_size: 3
92 |     stride: 2
93 |     # pad: 1
94 |   }
95 | }
96 | layer {
97 |   name: "conv3"
98 |   type: "Convolution"
99 |   bottom: "pool2"
100 |   top: "conv3"
101 |   # propagate_down: false
102 |   param { lr_mult: 0 decay_mult: 0 }
103 |   param { lr_mult: 0 decay_mult: 0 }
104 |   convolution_param {
105 |     num_output: 384
106 |     pad: 1
107 |     kernel_size: 3
108 |     weight_filler {
109 |       type: "gaussian"
110 |       std: 0.01
111 |     }
112 |     bias_filler {
113 |       type: "constant"
114 |       value: 0
115 |     }
116 |   }
117 | }
118 | layer {
119 |   name: "relu3"
120 |   type: "ReLU"
121 |   bottom: "conv3"
122 |   top: "conv3"
123 | }
124 | layer {
125 |   name: "conv4"
126 |   type: "Convolution"
127 |   bottom: "conv3"
128 |   top: "conv4"
129 |   param { lr_mult: 0 decay_mult: 0 }
130 |   param { lr_mult: 0 decay_mult: 0 }
131 |   convolution_param {
132 |     num_output: 384
133 |     pad: 1
134 |     kernel_size: 3
135 |     group: 2
136 |   }
137 | }
138 | layer {
139 |   name: "relu4"
140 |   type: "ReLU"
141 |   bottom: "conv4"
142 |   top: "conv4"
143 | }
144 | layer {
145 |   name: "conv5"
146 |   type: "Convolution"
147 |   bottom: "conv4"
148 |   top: "conv5"
149 |   param { lr_mult: 0 decay_mult: 0 }
150 |   param { lr_mult: 0 decay_mult: 0 }
151 |   convolution_param {
152 |     num_output: 256
153 |     pad: 1
154 |     kernel_size: 3
155 |     group: 2
156 |   }
157 | }
158 | layer {
159 |   name: "relu5"
160 |   type: "ReLU"
161 |   bottom: "conv5"
162 |   top: "conv5"
163 | }
164 | layer {
165 |   name: "pool5"
166 |   type: "Pooling"
167 |   bottom: "conv5"
168 |   top: "pool5"
169 |   pooling_param {
170 |     pool: MAX
171 |     kernel_size: 3
172 |     stride: 2
173 |   }
174 | }
175 | layer {
176 |   name: "fc6"
177 |   type: "Convolution"
178 |   bottom: "pool5"
179 |   top: "fc6"
180 |   param { lr_mult: 0 decay_mult: 0 }
181 |   param { lr_mult: 0 decay_mult: 0 }
182 |   convolution_param {
183 |     kernel_size: 6
184 |     dilation: 2
185 |     pad: 5
186 |     stride: 1
187 |     num_output: 4096
188 |     weight_filler {
189 |       type: "gaussian"
190 |       std: 0.005
191 |     }
192 |     bias_filler {
193 |       type: "constant"
194 |       value: 1
195 |     }
196 |   }
197 | }
198 | layer {
199 |   name: "relu6"
200 |   type: "ReLU"
201 |   bottom: "fc6"
202 |   top: "fc6"
203 | }
204 | layer {
205 |   name: "fc7"
206 |   type: "Convolution"
207 |   bottom: "fc6"
208 |   top: "fc7"
209 |   param { lr_mult: 0 decay_mult: 0 }
210 |   param { lr_mult: 0 decay_mult: 0 }
211 |   convolution_param {
212 |     kernel_size: 1
213 |     stride: 1
214 |     num_output: 4096
215 |     weight_filler {
216 |       type: "gaussian"
217 |       std: 0.005
218 |     }
219 |     bias_filler {
220 |       type: "constant"
221 |       value: 1
222 |     }
223 |   }
224 | }
225 | layer {
226 |   name: "relu7"
227 |   type: "ReLU"
228 |   bottom: "fc7"
229 |   top: "fc7"
230 | }
231 |
--------------------------------------------------------------------------------
/resources/fetch_caffe.sh:
--------------------------------------------------------------------------------
1 |
2 | wget eecs.berkeley.edu/~rich.zhang/projects/2016_colorization/files/train/caffe-colorization.tar.gz -O ./caffe-colorization.tar.gz
3 | tar -xvf ./caffe-colorization.tar.gz
4 |
--------------------------------------------------------------------------------
/resources/fetch_models.sh:
--------------------------------------------------------------------------------
1 |
2 | wget eecs.berkeley.edu/~rich.zhang/projects/2017_splitbrain/files/models/model_splitbrainauto_clcl.caffemodel -O ./models/model_splitbrainauto_clcl.caffemodel
3 | wget eecs.berkeley.edu/~rich.zhang/projects/2017_splitbrain/files/models/model_splitbrainauto_clcl_rs.caffemodel -O ./models/model_splitbrainauto_clcl_rs.caffemodel
4 |
--------------------------------------------------------------------------------
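
Option (a) in the README leaves the Lab conversion to the user. As a rough illustration of what that pre-processing involves, here is a minimal pure-Python sketch of the standard sRGB to CIELAB (D65) conversion for one BGR pixel. Two things are assumptions rather than facts from the repository: that the network expects conventional CIELAB values with L in [0, 100], and that the `data_lab` layer centers lightness to [-50, 50] inside the network, as its prototxt comment suggests. `bgr_to_lab` is a hypothetical helper name.

```python
# Sketch only: standard CIELAB (D65 white point) from an 8-bit sRGB pixel.
# Assumption: the network was trained on Lab values in this convention;
# the repository does not state which Lab implementation was used.


def _srgb_to_linear(c):
    """Undo the sRGB gamma curve for one channel value in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _lab_f(t):
    """CIELAB companding function (cube root with a linear toe)."""
    delta = 6.0 / 29.0
    if t > delta ** 3:
        return t ** (1.0 / 3.0)
    return t / (3.0 * delta ** 2) + 4.0 / 29.0


def bgr_to_lab(b, g, r):
    """Convert one 8-bit BGR pixel to (L, a, b), with L in [0, 100]."""
    rl, gl, bl = (_srgb_to_linear(v / 255.0) for v in (r, g, b))
    # Linear sRGB -> CIE XYZ (D65 primaries).
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    # Normalize by the D65 reference white, then apply the Lab formulas.
    fx, fy, fz = _lab_f(x / 0.95047), _lab_f(y / 1.0), _lab_f(z / 1.08883)
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)


white = bgr_to_lab(255, 255, 255)  # L close to 100, a and b close to 0
black = bgr_to_lab(0, 0, 0)        # L close to 0
red = bgr_to_lab(0, 0, 255)        # pure red, given in BGR order
print(white, black, red)
```

In practice a vectorized library routine would be applied to the whole 227x227 image, and the result arranged as a 1x3x227x227 array before it is copied into the network's Lab input blob; the L, a, b channel order used here is also an assumption.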
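
The layer parameters in the two prototxt files fix the spatial size of every feature map. The sketch below traces a 227x227 input through the layers using Caffe's usual output-size rules (round down for convolution, round up for pooling). The helper names `conv_out`, `pool_out`, and `trace_sizes` are illustrative and not part of the repository; the kernel, stride, pad, and dilation values are copied from the prototxt.

```python
import math


def conv_out(size, kernel, stride=1, pad=0, dilation=1):
    """Convolution output size: dilation enlarges the effective kernel."""
    effective = dilation * (kernel - 1) + 1
    return (size + 2 * pad - effective) // stride + 1


def pool_out(size, kernel, stride=1, pad=0):
    """Pooling output size: Caffe rounds up instead of down."""
    return math.ceil((size + 2 * pad - kernel) / stride) + 1


# (name, kind, parameters) in network order; ReLU layers keep the size.
LAYERS = [
    ("conv1", "conv", dict(kernel=11, stride=4)),
    ("pool1", "pool", dict(kernel=3, stride=2)),
    ("conv2", "conv", dict(kernel=5, pad=2)),
    ("pool2", "pool", dict(kernel=3, stride=2)),
    ("conv3", "conv", dict(kernel=3, pad=1)),
    ("conv4", "conv", dict(kernel=3, pad=1)),
    ("conv5", "conv", dict(kernel=3, pad=1)),
    ("pool5", "pool", dict(kernel=3, stride=2)),
    ("fc6", "conv", dict(kernel=6, pad=5, dilation=2)),
    ("fc7", "conv", dict(kernel=1)),
]


def trace_sizes(size=227):
    """Return {layer name: spatial size} for a square input."""
    sizes = {}
    for name, kind, params in LAYERS:
        step = conv_out if kind == "conv" else pool_out
        size = step(size, **params)
        sizes[name] = size
    return sizes


for name, size in trace_sizes().items():
    print(f"{name}: {size}x{size}")
```

Because `fc6` and `fc7` are defined as `Convolution` layers, their outputs under these rules are spatial maps (4096 channels each) rather than flat vectors, so a pooling or cropping step may be needed if a single feature vector per image is wanted.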