├── LICENSE
├── README.md
├── models
│   ├── deploy.prototxt
│   └── deploy_lab.prototxt
└── resources
    ├── fetch_caffe.sh
    └── fetch_models.sh


/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2016 Richard Zhang
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | ## Split-Brain Autoencoders: Unsupervised Learning by Cross-Channel Prediction [[Project Page]](http://richzhang.github.io/splitbrainauto/) ##
 2 | [Richard Zhang](https://richzhang.github.io/), [Phillip Isola](http://web.mit.edu/phillipi/), [Alexei A. Efros](http://www.eecs.berkeley.edu/~efros/). In CVPR, 2017. (hosted on [ArXiv](https://arxiv.org/abs/1611.09842))
 3 | 
 4 | <img src="http://richzhang.github.io/index_files/cvpr2017_splitbrain.png" height="180" />
 5 | 
 6 | ### Overview ###
 7 | This repository contains a pre-trained Split-Brain Autoencoder network. The network achieves state-of-the-art results on several large-scale unsupervised representation learning benchmarks.
 8 | 
 9 | ### Clone this repository ###
10 | Clone the master branch of the repository using `git clone -b master --single-branch https://github.com/richzhang/splitbrainauto.git`.
11 | 
12 | ### Dependencies ###
13 | This code requires a working installation of [Caffe](http://caffe.berkeleyvision.org/). For guidelines and help with installation of Caffe, consult the [installation guide](http://caffe.berkeleyvision.org/) and [Caffe users group](https://groups.google.com/forum/#!forum/caffe-users).
14 | 
15 | ### Test-Time Usage ###
16 | **(1)** Run `./resources/fetch_models.sh` from the repository root. This downloads two models into the `models` directory: `model_splitbrainauto_clcl.caffemodel`, and `model_splitbrainauto_clcl_rs.caffemodel`, which is the same model with the rescaling method from [Kr&auml;henb&uuml;hl et al. ICLR 2016](https://github.com/philkr/magic_init) applied. The rescaling method has been shown to improve fine-tuning performance in some models, and we use it for the PASCAL tests in Table 4 in the paper. Alternatively, download the models from [here](https://people.eecs.berkeley.edu/~rich.zhang/projects/2017_splitbrain/files/models/) and put them in the `models` directory.
17 | 
18 | **(2)** To extract features, you can (a) use the main branch of Caffe and do color conversion outside of the network or (b) download and install a modified Caffe and not worry about color conversion.
19 | 
20 | **(a)** **Color conversion outside of prototxt** To extract features with the main branch of [Caffe](http://caffe.berkeleyvision.org/): <br>
21 | **(i)** Load the downloaded weights with model definition file `deploy_lab.prototxt` in the `models` directory. The input is blob `img_lab`, which is an ***image in Lab colorspace***; the first layer of the network zero-centers the lightness channel and writes the result to blob `data_lab`. You will have to do the Lab color conversion pre-processing outside of the network.
22 | 
23 | **(b)** **Color conversion in prototxt** You can also extract features with the color conversion done inside the prototxt, using a modified Caffe. <br>
24 | **(i)** Run `./resources/fetch_caffe.sh`. This will download and unpack a modified Caffe into directory `./caffe-colorization`. <br>
25 | **(ii)** Install the modified Caffe. For guidelines and help with installation of Caffe, consult the [installation guide](http://caffe.berkeleyvision.org/) and [Caffe users group](https://groups.google.com/forum/#!forum/caffe-users). <br>
26 | **(iii)** Load the downloaded weights with model definition file `deploy.prototxt` in the `models` directory. The input is blob `data`, which is a ***non mean-centered BGR image***.
27 | 
28 | ### Citation ###
29 | If you find this model useful for your research, please use this [bibtex](http://richzhang.github.io/index_files/bibtex_cvpr2017_splitbrain.txt) to cite.
30 |  
31 | 
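
Option (a) in the README leaves the Lab conversion to the caller. The following is a minimal sketch of that pre-processing, not the authors' code. It assumes the network expects ordinary CIE Lab computed from sRGB with a D65 white point (L in [0, 100]), which is consistent with the `[-50,50]` comment in the prototxt but is not stated in the README. The helper names `bgr_to_lab`, `lab_to_blob`, and `extract_fc7` are hypothetical; the pycaffe calls in `extract_fc7` need a working Caffe installation and the downloaded weights.

```python
import numpy as np

# Standard sRGB (D65) -> XYZ matrix and reference white.
# Assumption: the network expects ordinary CIE Lab with L in [0, 100].
RGB_TO_XYZ = np.array([[0.412453, 0.357580, 0.180423],
                       [0.212671, 0.715160, 0.072169],
                       [0.019334, 0.119193, 0.950227]])
WHITE_D65 = np.array([0.95047, 1.0, 1.08883])


def bgr_to_lab(img_bgr):
    """Convert an HxWx3 BGR image with values in [0, 255] to HxWx3 Lab."""
    rgb = np.asarray(img_bgr, dtype=np.float64)[..., ::-1] / 255.0
    # Undo the sRGB gamma curve to get linear RGB.
    linear = np.where(rgb > 0.04045, ((rgb + 0.055) / 1.055) ** 2.4, rgb / 12.92)
    # Linear RGB -> XYZ, normalized by the reference white.
    xyz = (linear @ RGB_TO_XYZ.T) / WHITE_D65
    f = np.where(xyz > 0.008856, np.cbrt(xyz), 7.787 * xyz + 16.0 / 116.0)
    lightness = 116.0 * f[..., 1] - 16.0
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([lightness, a, b], axis=-1)


def lab_to_blob(lab):
    """Reorder HxWx3 Lab to the 1x3xHxW float32 layout of the input blob."""
    return np.ascontiguousarray(lab.transpose(2, 0, 1)[np.newaxis], dtype=np.float32)


def extract_fc7(blob, prototxt="models/deploy_lab.prototxt",
                weights="models/model_splitbrainauto_clcl.caffemodel"):
    """Run the network with pycaffe and return a copy of the `fc7` activations."""
    import caffe  # imported lazily so the helpers above work without Caffe
    net = caffe.Net(prototxt, weights, caffe.TEST)
    net.blobs["img_lab"].data[...] = blob  # input blob name from deploy_lab.prototxt
    net.forward()
    return net.blobs["fc7"].data.copy()
```

A typical call would be `extract_fc7(lab_to_blob(bgr_to_lab(img)))`, where `img` is a 227x227 BGR image. Resizing or cropping to 227x227 is left to the caller.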


--------------------------------------------------------------------------------
/models/deploy.prototxt:
--------------------------------------------------------------------------------
  1 | 
  2 | 
  3 | layer {
  4 | 	name: "input"
  5 | 	type: "Input"
  6 | 	top: "data" # BGR image from [0,255] ***NOT MEAN CENTERED***
  7 | 	input_param { shape { dim: 1 dim: 3 dim: 227 dim: 227 } }
  8 | }
  9 | layer { # Convert to lab
 10 |   name: "img_lab"
 11 |   type: "ColorConv"
 12 |   bottom: "data"
 13 |   top: "img_lab"
 14 |   propagate_down: false
 15 |   color_conv_param {
 16 |     input: 0 # BGR
 17 |     output: 3 # Lab
 18 |   }
 19 | }
 20 | layer { # 0-center lightness channel
 21 |   name: "data_lab"
 22 |   type: "Convolution"
 23 |   bottom: "img_lab"
 24 |   top: "data_lab" # [-50,50]
 25 |   propagate_down: false
 26 |   param {lr_mult: 0 decay_mult: 0}
 27 |   param {lr_mult: 0 decay_mult: 0}
 28 |   convolution_param {
 29 |     kernel_size: 1
 30 |     num_output: 3
 31 |     group: 3
 32 |   }
 33 | }
 34 | layer {
 35 |   name: "conv1"
 36 |   type: "Convolution"
 37 |   # bottom: "img"
 38 |   bottom: "data_lab"
 39 |   # bottom: "img_bn"
 40 |   top: "conv1"
 41 |   param {lr_mult: 0 decay_mult: 0}
 42 |   param {lr_mult: 0 decay_mult: 0}
 43 |   convolution_param {
 44 |     num_output: 96
 45 |     kernel_size: 11
 46 |     stride: 4
 47 |     weight_filler {
 48 |       type: "gaussian"
 49 |       std: 0.01
 50 |     }
 51 |     bias_filler {
 52 |       type: "constant"
 53 |       value: 0
 54 |     }
 55 |   }
 56 | }
 57 | layer {
 58 |   name: "relu1"
 59 |   type: "ReLU"
 60 |   bottom: "conv1"
 61 |   top: "conv1"
 62 | }
 63 | layer {
 64 |   name: "pool1"
 65 |   type: "Pooling"
 66 |   bottom: "conv1"
 67 |   top: "pool1"
 68 |   pooling_param {
 69 |     pool: MAX
 70 |     kernel_size: 3
 71 |     stride: 2
 72 |   }
 73 | }
 74 | layer {
 75 |   name: "conv2"
 76 |   type: "Convolution"
 77 |   bottom: "pool1"
 78 |   top: "conv2"
 79 |   param {    lr_mult: 0    decay_mult: 0  }
 80 |   param {    lr_mult: 0    decay_mult: 0  }
 81 |   convolution_param {
 82 |     num_output: 256
 83 |     pad: 2
 84 |     kernel_size: 5
 85 |     group: 2
 86 |   }
 87 | }
 88 | layer {
 89 |   name: "relu2"
 90 |   type: "ReLU"
 91 |   bottom: "conv2"
 92 |   top: "conv2"
 93 | }
 94 | layer {
 95 |   name: "pool2"
 96 |   type: "Pooling"
 97 |   # bottom: "conv2"
 98 |   bottom: "conv2"
 99 |   top: "pool2"
100 |   pooling_param {
101 |     pool: MAX
102 |     kernel_size: 3
103 |     stride: 2
104 |     # pad: 1
105 |   }
106 | }
107 | layer {
108 |   name: "conv3"
109 |   type: "Convolution"
110 |   bottom: "pool2"
111 |   top: "conv3"
112 |   # propagate_down: false
113 |   param {    lr_mult: 0    decay_mult: 0  }
114 |   param {    lr_mult: 0    decay_mult: 0  }
115 |   convolution_param {
116 |     num_output: 384
117 |     pad: 1
118 |     kernel_size: 3
119 |     weight_filler {
120 |       type: "gaussian"
121 |       std: 0.01
122 |     }
123 |     bias_filler {
124 |       type: "constant"
125 |       value: 0
126 |     }
127 |   }
128 | }
129 | layer {
130 |   name: "relu3"
131 |   type: "ReLU"
132 |   bottom: "conv3"
133 |   top: "conv3"
134 | }
135 | layer {
136 |   name: "conv4"
137 |   type: "Convolution"
138 |   bottom: "conv3"
139 |   top: "conv4"
140 |   param {    lr_mult: 0    decay_mult: 0  }
141 |   param {    lr_mult: 0    decay_mult: 0  }
142 |   convolution_param {
143 |     num_output: 384
144 |     pad: 1
145 |     kernel_size: 3
146 |     group: 2
147 |   }
148 | }
149 | layer {
150 |   name: "relu4"
151 |   type: "ReLU"
152 |   bottom: "conv4"
153 |   top: "conv4"
154 | }
155 | layer {
156 |   name: "conv5"
157 |   type: "Convolution"
158 |   bottom: "conv4"
159 |   top: "conv5"
160 |   param {    lr_mult: 0    decay_mult: 0  }
161 |   param {    lr_mult: 0    decay_mult: 0  }
162 |   convolution_param {
163 |     num_output: 256
164 |     pad: 1
165 |     kernel_size: 3
166 |     group: 2
167 |   }
168 | }
169 | layer {
170 |   name: "relu5"
171 |   type: "ReLU"
172 |   bottom: "conv5"
173 |   top: "conv5"
174 | }
175 | layer {
176 |   name: "pool5"
177 |   type: "Pooling"
178 |   bottom: "conv5"
179 |   top: "pool5"
180 |   pooling_param {
181 |     pool: MAX
182 |     kernel_size: 3
183 |     stride: 2
184 |   }
185 | }
186 | layer {
187 |   name: "fc6"
188 |   type: "Convolution"
189 |   bottom: "pool5"
190 |   top: "fc6"
191 |   param {    lr_mult: 0    decay_mult: 0  }
192 |   param {    lr_mult: 0    decay_mult: 0  }
193 |   convolution_param {
194 |     kernel_size: 6
195 |     dilation: 2
196 |     pad: 5
197 |     stride: 1
198 |     num_output: 4096
199 |     weight_filler {
200 |       type: "gaussian"
201 |       std: 0.005
202 |     }
203 |     bias_filler {
204 |       type: "constant"
205 |       value: 1
206 |     }
207 |   }
208 | }
209 | layer {
210 |   name: "relu6"
211 |   type: "ReLU"
212 |   bottom: "fc6"
213 |   top: "fc6"
214 | }
215 | layer {
216 |   name: "fc7"
217 |   type: "Convolution"
218 |   bottom: "fc6"
219 |   top: "fc7"
220 |   param {    lr_mult: 0    decay_mult: 0  }
221 |   param {    lr_mult: 0    decay_mult: 0  }
222 |   convolution_param {
223 |     kernel_size: 1
224 |     stride: 1
225 |     num_output: 4096
226 |     weight_filler {
227 |       type: "gaussian"
228 |       std: 0.005
229 |     }
230 |     bias_filler {
231 |       type: "constant"
232 |       value: 1
233 |     }
234 |   }
235 | }
236 | layer {
237 |   name: "relu7"
238 |   type: "ReLU"
239 |   bottom: "fc7"
240 |   top: "fc7"
241 | }
242 | 
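
The layer parameters above determine the spatial size of every blob. As a rough check, the sketch below applies the usual Caffe output-size rules (floor for convolution, ceil for pooling) to a 227x227 input. The helper names and the `LAYERS` table are illustrative; the table is transcribed by hand from the prototxt rather than parsed from it.

```python
import math


def conv_out(size, kernel, stride=1, pad=0, dilation=1):
    """Output size of a convolution; a dilated kernel spans dilation*(kernel-1)+1."""
    effective = dilation * (kernel - 1) + 1
    return (size + 2 * pad - effective) // stride + 1


def pool_out(size, kernel, stride=1, pad=0):
    """Output size of a pooling layer, which rounds up instead of down."""
    return math.ceil((size + 2 * pad - kernel) / stride) + 1


# (name, kind, parameters) transcribed from deploy.prototxt.
LAYERS = [
    ("conv1", "conv", dict(kernel=11, stride=4)),
    ("pool1", "pool", dict(kernel=3, stride=2)),
    ("conv2", "conv", dict(kernel=5, pad=2)),
    ("pool2", "pool", dict(kernel=3, stride=2)),
    ("conv3", "conv", dict(kernel=3, pad=1)),
    ("conv4", "conv", dict(kernel=3, pad=1)),
    ("conv5", "conv", dict(kernel=3, pad=1)),
    ("pool5", "pool", dict(kernel=3, stride=2)),
    ("fc6", "conv", dict(kernel=6, pad=5, dilation=2)),
    ("fc7", "conv", dict(kernel=1)),
]


def spatial_sizes(size=227):
    """Return the spatial side length of each layer's output, in order."""
    sizes = {}
    for name, kind, params in LAYERS:
        size = conv_out(size, **params) if kind == "conv" else pool_out(size, **params)
        sizes[name] = size
    return sizes


if __name__ == "__main__":
    for name, side in spatial_sizes().items():
        print(f"{name}: {side}x{side}")
```

Under these rules `fc6` and `fc7` come out as 6x6 maps with 4096 channels, because both are written as convolutions and `fc6` pads by 5 around its dilated 6x6 kernel.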


--------------------------------------------------------------------------------
/models/deploy_lab.prototxt:
--------------------------------------------------------------------------------
  1 | 
  2 | 
  3 | layer {
  4 | 	name: "input"
  5 | 	type: "Input"
  6 | 	top: "img_lab" # image in Lab color space
  7 | 	input_param { shape { dim: 1 dim: 3 dim: 227 dim: 227 } }
  8 | }
  9 | layer { # 0-center lightness channel
 10 |   name: "data_lab"
 11 |   type: "Convolution"
 12 |   bottom: "img_lab"
 13 |   top: "data_lab" # [-50,50]
 14 |   propagate_down: false
 15 |   param {lr_mult: 0 decay_mult: 0}
 16 |   param {lr_mult: 0 decay_mult: 0}
 17 |   convolution_param {
 18 |     kernel_size: 1
 19 |     num_output: 3
 20 |     group: 3
 21 |   }
 22 | }
 23 | layer {
 24 |   name: "conv1"
 25 |   type: "Convolution"
 26 |   # bottom: "img"
 27 |   bottom: "data_lab"
 28 |   # bottom: "img_bn"
 29 |   top: "conv1"
 30 |   param {lr_mult: 0 decay_mult: 0}
 31 |   param {lr_mult: 0 decay_mult: 0}
 32 |   convolution_param {
 33 |     num_output: 96
 34 |     kernel_size: 11
 35 |     stride: 4
 36 |     weight_filler {
 37 |       type: "gaussian"
 38 |       std: 0.01
 39 |     }
 40 |     bias_filler {
 41 |       type: "constant"
 42 |       value: 0
 43 |     }
 44 |   }
 45 | }
 46 | layer {
 47 |   name: "relu1"
 48 |   type: "ReLU"
 49 |   bottom: "conv1"
 50 |   top: "conv1"
 51 | }
 52 | layer {
 53 |   name: "pool1"
 54 |   type: "Pooling"
 55 |   bottom: "conv1"
 56 |   top: "pool1"
 57 |   pooling_param {
 58 |     pool: MAX
 59 |     kernel_size: 3
 60 |     stride: 2
 61 |   }
 62 | }
 63 | layer {
 64 |   name: "conv2"
 65 |   type: "Convolution"
 66 |   bottom: "pool1"
 67 |   top: "conv2"
 68 |   param {    lr_mult: 0    decay_mult: 0  }
 69 |   param {    lr_mult: 0    decay_mult: 0  }
 70 |   convolution_param {
 71 |     num_output: 256
 72 |     pad: 2
 73 |     kernel_size: 5
 74 |     group: 2
 75 |   }
 76 | }
 77 | layer {
 78 |   name: "relu2"
 79 |   type: "ReLU"
 80 |   bottom: "conv2"
 81 |   top: "conv2"
 82 | }
 83 | layer {
 84 |   name: "pool2"
 85 |   type: "Pooling"
 86 |   # bottom: "conv2"
 87 |   bottom: "conv2"
 88 |   top: "pool2"
 89 |   pooling_param {
 90 |     pool: MAX
 91 |     kernel_size: 3
 92 |     stride: 2
 93 |     # pad: 1
 94 |   }
 95 | }
 96 | layer {
 97 |   name: "conv3"
 98 |   type: "Convolution"
 99 |   bottom: "pool2"
100 |   top: "conv3"
101 |   # propagate_down: false
102 |   param {    lr_mult: 0    decay_mult: 0  }
103 |   param {    lr_mult: 0    decay_mult: 0  }
104 |   convolution_param {
105 |     num_output: 384
106 |     pad: 1
107 |     kernel_size: 3
108 |     weight_filler {
109 |       type: "gaussian"
110 |       std: 0.01
111 |     }
112 |     bias_filler {
113 |       type: "constant"
114 |       value: 0
115 |     }
116 |   }
117 | }
118 | layer {
119 |   name: "relu3"
120 |   type: "ReLU"
121 |   bottom: "conv3"
122 |   top: "conv3"
123 | }
124 | layer {
125 |   name: "conv4"
126 |   type: "Convolution"
127 |   bottom: "conv3"
128 |   top: "conv4"
129 |   param {    lr_mult: 0    decay_mult: 0  }
130 |   param {    lr_mult: 0    decay_mult: 0  }
131 |   convolution_param {
132 |     num_output: 384
133 |     pad: 1
134 |     kernel_size: 3
135 |     group: 2
136 |   }
137 | }
138 | layer {
139 |   name: "relu4"
140 |   type: "ReLU"
141 |   bottom: "conv4"
142 |   top: "conv4"
143 | }
144 | layer {
145 |   name: "conv5"
146 |   type: "Convolution"
147 |   bottom: "conv4"
148 |   top: "conv5"
149 |   param {    lr_mult: 0    decay_mult: 0  }
150 |   param {    lr_mult: 0    decay_mult: 0  }
151 |   convolution_param {
152 |     num_output: 256
153 |     pad: 1
154 |     kernel_size: 3
155 |     group: 2
156 |   }
157 | }
158 | layer {
159 |   name: "relu5"
160 |   type: "ReLU"
161 |   bottom: "conv5"
162 |   top: "conv5"
163 | }
164 | layer {
165 |   name: "pool5"
166 |   type: "Pooling"
167 |   bottom: "conv5"
168 |   top: "pool5"
169 |   pooling_param {
170 |     pool: MAX
171 |     kernel_size: 3
172 |     stride: 2
173 |   }
174 | }
175 | layer {
176 |   name: "fc6"
177 |   type: "Convolution"
178 |   bottom: "pool5"
179 |   top: "fc6"
180 |   param {    lr_mult: 0    decay_mult: 0  }
181 |   param {    lr_mult: 0    decay_mult: 0  }
182 |   convolution_param {
183 |     kernel_size: 6
184 |     dilation: 2
185 |     pad: 5
186 |     stride: 1
187 |     num_output: 4096
188 |     weight_filler {
189 |       type: "gaussian"
190 |       std: 0.005
191 |     }
192 |     bias_filler {
193 |       type: "constant"
194 |       value: 1
195 |     }
196 |   }
197 | }
198 | layer {
199 |   name: "relu6"
200 |   type: "ReLU"
201 |   bottom: "fc6"
202 |   top: "fc6"
203 | }
204 | layer {
205 |   name: "fc7"
206 |   type: "Convolution"
207 |   bottom: "fc6"
208 |   top: "fc7"
209 |   param {    lr_mult: 0    decay_mult: 0  }
210 |   param {    lr_mult: 0    decay_mult: 0  }
211 |   convolution_param {
212 |     kernel_size: 1
213 |     stride: 1
214 |     num_output: 4096
215 |     weight_filler {
216 |       type: "gaussian"
217 |       std: 0.005
218 |     }
219 |     bias_filler {
220 |       type: "constant"
221 |       value: 1
222 |     }
223 |   }
224 | }
225 | layer {
226 |   name: "relu7"
227 |   type: "ReLU"
228 |   bottom: "fc7"
229 |   top: "fc7"
230 | }
231 | 
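
The `data_lab` layer is a 1x1 convolution with `group: 3` and `num_output: 3`, so each output channel sees only its own input channel: it is a per-channel scale and shift. The sketch below illustrates that equivalence in NumPy. The weight and bias values are assumptions chosen to match the "0-center lightness channel" and `[-50,50]` comments; the real values are stored in the `.caffemodel` and may differ, particularly for the a and b channels.

```python
import numpy as np


def grouped_1x1_conv(x, weight, bias):
    """Apply a 1x1 convolution with one group per channel.

    x has shape (N, C, H, W); weight and bias have shape (C,).
    Each channel is scaled and shifted independently.
    """
    return x * weight[None, :, None, None] + bias[None, :, None, None]


# Hypothetical parameters: leave a and b untouched, shift L from [0, 100] to [-50, 50].
weight = np.array([1.0, 1.0, 1.0], dtype=np.float32)
bias = np.array([-50.0, 0.0, 0.0], dtype=np.float32)

# A dummy Lab input: maximum lightness, neutral color.
img_lab = np.zeros((1, 3, 227, 227), dtype=np.float32)
img_lab[:, 0] = 100.0

data_lab = grouped_1x1_conv(img_lab, weight, bias)
```

With these assumed parameters, a lightness of 100 maps to 50 and the color channels pass through unchanged.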


--------------------------------------------------------------------------------
/resources/fetch_caffe.sh:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env bash
2 | wget eecs.berkeley.edu/~rich.zhang/projects/2016_colorization/files/train/caffe-colorization.tar.gz -O ./caffe-colorization.tar.gz
3 | tar -xvf ./caffe-colorization.tar.gz
4 | 


--------------------------------------------------------------------------------
/resources/fetch_models.sh:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env bash
2 | wget eecs.berkeley.edu/~rich.zhang/projects/2017_splitbrain/files/models/model_splitbrainauto_clcl.caffemodel -O ./models/model_splitbrainauto_clcl.caffemodel
3 | wget eecs.berkeley.edu/~rich.zhang/projects/2017_splitbrain/files/models/model_splitbrainauto_clcl_rs.caffemodel -O ./models/model_splitbrainauto_clcl_rs.caffemodel
4 | 


--------------------------------------------------------------------------------