├── LICENSE
├── README.md
├── models
│   ├── deploy.prototxt
│   └── deploy_lab.prototxt
└── resources
    ├── fetch_caffe.sh
    └── fetch_models.sh
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2016 Richard Zhang
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | ## Split-Brain Autoencoders: Unsupervised Learning by Cross-Channel Prediction [[Project Page]](http://richzhang.github.io/splitbrainauto/) ##
2 | [Richard Zhang](https://richzhang.github.io/), [Phillip Isola](http://web.mit.edu/phillipi/), [Alexei A. Efros](http://www.eecs.berkeley.edu/~efros/). In CVPR, 2017. (hosted on [arXiv](https://arxiv.org/abs/1611.09842))
3 |
4 |
5 |
6 | ### Overview ###
7 | This repository contains a pre-trained Split-Brain Autoencoder network. The network achieves state-of-the-art results on several large-scale unsupervised representation learning benchmarks.
8 |
9 | ### Clone this repository ###
10 | Clone the master branch of the repository using `git clone -b master --single-branch https://github.com/richzhang/splitbrainauto.git`
11 |
12 | ### Dependencies ###
13 | This code requires a working installation of [Caffe](http://caffe.berkeleyvision.org/). For guidelines and help with installation of Caffe, consult the [installation guide](http://caffe.berkeleyvision.org/) and [Caffe users group](https://groups.google.com/forum/#!forum/caffe-users).
14 |
15 | ### Test-Time Usage ###
16 | **(1)** Run `./resources/fetch_models.sh`. This downloads model `model_splitbrainauto_clcl.caffemodel`. It also downloads model `model_splitbrainauto_clcl_rs.caffemodel`, which is the same model with the rescaling method from [Krähenbühl et al. ICLR 2016](https://github.com/philkr/magic_init) applied. The rescaling method has been shown to improve fine-tuning performance in some models, and we use it for the PASCAL tests in Table 4 of the paper. Alternatively, download the models from [here](https://people.eecs.berkeley.edu/~rich.zhang/projects/2017_splitbrain/files/models/) and put them in the `models` directory.
17 |
18 | **(2)** To extract features, you can either (a) use the main branch of Caffe and do the color conversion outside of the network, or (b) download and install a modified Caffe that performs the color conversion in-network.
19 |
20 | **(a)** **Color conversion outside of the prototxt** To extract features with the main branch of [Caffe](http://caffe.berkeleyvision.org/):
21 | **(i)** Load the downloaded weights with model definition file `deploy_lab.prototxt` in the `models` directory. The input is blob `img_lab`, an ***image in Lab colorspace***. You will have to do the Lab color conversion pre-processing outside of the network, as sketched below.
22 |
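For reference, a minimal pycaffe sketch of option (a) might look like the following (the example image path, the use of `skimage` for the Lab conversion, and the choice of `fc7` as the feature layer are illustrative assumptions, not part of this repository):

```python
import caffe
from skimage import io, color, transform

# Load the Lab-input model definition with the downloaded weights.
net = caffe.Net('models/deploy_lab.prototxt',
                'models/model_splitbrainauto_clcl.caffemodel',
                caffe.TEST)

# Read an RGB image, resize to the 227x227 input size, and convert to Lab.
img_rgb = transform.resize(io.imread('example.jpg'), (227, 227))  # floats in [0,1]
img_lab = color.rgb2lab(img_rgb)  # L in [0,100]; the net zero-centers L internally

# HWC -> CHW, feed blob "img_lab", and run a forward pass.
net.blobs['img_lab'].data[0] = img_lab.transpose((2, 0, 1))
net.forward()

feat = net.blobs['fc7'].data[0].copy()  # features from fc7, for example
```
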
23 | **(b)** **Color conversion in prototxt** You can also extract features with in-prototxt color conversion using a modified Caffe.
24 | **(i)** Run `./resources/fetch_caffe.sh`. This downloads a modified Caffe into the directory `./caffe-colorization`.
25 | **(ii)** Install the modified Caffe. For guidelines and help with installation of Caffe, consult the [installation guide](http://caffe.berkeleyvision.org/) and [Caffe users group](https://groups.google.com/forum/#!forum/caffe-users).
26 | **(iii)** Load the downloaded weights with model definition file `deploy.prototxt` in the `models` directory. The input is blob `data`, a ***non-mean-centered BGR image*** with values in [0,255], as sketched below.
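
Similarly, a rough sketch of option (b) with pycaffe (assuming the modified Caffe build is on your `PYTHONPATH`; the image path and the `fc7` feature choice are again illustrative):

```python
import caffe  # must be the modified caffe-colorization build (has the ColorConv layer)

net = caffe.Net('models/deploy.prototxt',
                'models/model_splitbrainauto_clcl.caffemodel',
                caffe.TEST)

# caffe.io loads images as RGB floats in [0,1]; this network instead wants
# raw BGR in [0,255] with NO mean subtraction, so flip channels and rescale.
img = caffe.io.load_image('example.jpg')      # HxWx3, RGB, [0,1]
img = caffe.io.resize_image(img, (227, 227))  # match the 227x227 input shape
img_bgr = img[:, :, ::-1] * 255.0             # RGB -> BGR, scale to [0,255]

# HWC -> CHW, feed blob "data", and run a forward pass.
net.blobs['data'].data[0] = img_bgr.transpose((2, 0, 1))
net.forward()

feat = net.blobs['fc7'].data[0].copy()  # features from fc7, for example
```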
27 |
28 | ### Citation ###
29 | If you find this model useful for your research, please use this [bibtex](http://richzhang.github.io/index_files/bibtex_cvpr2017_splitbrain.txt) to cite.
30 |
31 |
--------------------------------------------------------------------------------
/models/deploy.prototxt:
--------------------------------------------------------------------------------
1 | # Split-Brain Autoencoder feature-extraction network (in-network color conversion).
2 | # Input blob "data" is a non-mean-centered BGR image in [0,255]; requires the modified Caffe with the ColorConv layer.
3 | layer {
4 | name: "input"
5 | type: "Input"
6 | top: "data" # BGR image from [0,255] ***NOT MEAN CENTERED***
7 | input_param { shape { dim: 1 dim: 3 dim: 227 dim: 227 } }
8 | }
9 | layer { # Convert to lab
10 | name: "img_lab"
11 | type: "ColorConv"
12 | bottom: "data"
13 | top: "img_lab"
14 | propagate_down: false
15 | color_conv_param {
16 | input: 0 # BGR
17 | output: 3 # Lab
18 | }
19 | }
20 | layer { # 0-center lightness channel
21 | name: "data_lab"
22 | type: "Convolution"
23 | bottom: "img_lab"
24 | top: "data_lab" # [-50,50]
25 | propagate_down: false
26 | param {lr_mult: 0 decay_mult: 0}
27 | param {lr_mult: 0 decay_mult: 0}
28 | convolution_param {
29 | kernel_size: 1
30 | num_output: 3
31 | group: 3
32 | }
33 | }
34 | layer {
35 | name: "conv1"
36 | type: "Convolution"
37 | # bottom: "img"
38 | bottom: "data_lab"
39 | # bottom: "img_bn"
40 | top: "conv1"
41 | param {lr_mult: 0 decay_mult: 0}
42 | param {lr_mult: 0 decay_mult: 0}
43 | convolution_param {
44 | num_output: 96
45 | kernel_size: 11
46 | stride: 4
47 | weight_filler {
48 | type: "gaussian"
49 | std: 0.01
50 | }
51 | bias_filler {
52 | type: "constant"
53 | value: 0
54 | }
55 | }
56 | }
57 | layer {
58 | name: "relu1"
59 | type: "ReLU"
60 | bottom: "conv1"
61 | top: "conv1"
62 | }
63 | layer {
64 | name: "pool1"
65 | type: "Pooling"
66 | bottom: "conv1"
67 | top: "pool1"
68 | pooling_param {
69 | pool: MAX
70 | kernel_size: 3
71 | stride: 2
72 | }
73 | }
74 | layer {
75 | name: "conv2"
76 | type: "Convolution"
77 | bottom: "pool1"
78 | top: "conv2"
79 | param { lr_mult: 0 decay_mult: 0 }
80 | param { lr_mult: 0 decay_mult: 0 }
81 | convolution_param {
82 | num_output: 256
83 | pad: 2
84 | kernel_size: 5
85 | group: 2
86 | }
87 | }
88 | layer {
89 | name: "relu2"
90 | type: "ReLU"
91 | bottom: "conv2"
92 | top: "conv2"
93 | }
94 | layer {
95 | name: "pool2"
96 | type: "Pooling"
98 | bottom: "conv2"
99 | top: "pool2"
100 | pooling_param {
101 | pool: MAX
102 | kernel_size: 3
103 | stride: 2
104 | # pad: 1
105 | }
106 | }
107 | layer {
108 | name: "conv3"
109 | type: "Convolution"
110 | bottom: "pool2"
111 | top: "conv3"
112 | # propagate_down: false
113 | param { lr_mult: 0 decay_mult: 0 }
114 | param { lr_mult: 0 decay_mult: 0 }
115 | convolution_param {
116 | num_output: 384
117 | pad: 1
118 | kernel_size: 3
119 | weight_filler {
120 | type: "gaussian"
121 | std: 0.01
122 | }
123 | bias_filler {
124 | type: "constant"
125 | value: 0
126 | }
127 | }
128 | }
129 | layer {
130 | name: "relu3"
131 | type: "ReLU"
132 | bottom: "conv3"
133 | top: "conv3"
134 | }
135 | layer {
136 | name: "conv4"
137 | type: "Convolution"
138 | bottom: "conv3"
139 | top: "conv4"
140 | param { lr_mult: 0 decay_mult: 0 }
141 | param { lr_mult: 0 decay_mult: 0 }
142 | convolution_param {
143 | num_output: 384
144 | pad: 1
145 | kernel_size: 3
146 | group: 2
147 | }
148 | }
149 | layer {
150 | name: "relu4"
151 | type: "ReLU"
152 | bottom: "conv4"
153 | top: "conv4"
154 | }
155 | layer {
156 | name: "conv5"
157 | type: "Convolution"
158 | bottom: "conv4"
159 | top: "conv5"
160 | param { lr_mult: 0 decay_mult: 0 }
161 | param { lr_mult: 0 decay_mult: 0 }
162 | convolution_param {
163 | num_output: 256
164 | pad: 1
165 | kernel_size: 3
166 | group: 2
167 | }
168 | }
169 | layer {
170 | name: "relu5"
171 | type: "ReLU"
172 | bottom: "conv5"
173 | top: "conv5"
174 | }
175 | layer {
176 | name: "pool5"
177 | type: "Pooling"
178 | bottom: "conv5"
179 | top: "pool5"
180 | pooling_param {
181 | pool: MAX
182 | kernel_size: 3
183 | stride: 2
184 | }
185 | }
186 | layer {
187 | name: "fc6"
188 | type: "Convolution"
189 | bottom: "pool5"
190 | top: "fc6"
191 | param { lr_mult: 0 decay_mult: 0 }
192 | param { lr_mult: 0 decay_mult: 0 }
193 | convolution_param {
194 | kernel_size: 6
195 | dilation: 2
196 | pad: 5
197 | stride: 1
198 | num_output: 4096
199 | weight_filler {
200 | type: "gaussian"
201 | std: 0.005
202 | }
203 | bias_filler {
204 | type: "constant"
205 | value: 1
206 | }
207 | }
208 | }
209 | layer {
210 | name: "relu6"
211 | type: "ReLU"
212 | bottom: "fc6"
213 | top: "fc6"
214 | }
215 | layer {
216 | name: "fc7"
217 | type: "Convolution"
218 | bottom: "fc6"
219 | top: "fc7"
220 | param { lr_mult: 0 decay_mult: 0 }
221 | param { lr_mult: 0 decay_mult: 0 }
222 | convolution_param {
223 | kernel_size: 1
224 | stride: 1
225 | num_output: 4096
226 | weight_filler {
227 | type: "gaussian"
228 | std: 0.005
229 | }
230 | bias_filler {
231 | type: "constant"
232 | value: 1
233 | }
234 | }
235 | }
236 | layer {
237 | name: "relu7"
238 | type: "ReLU"
239 | bottom: "fc7"
240 | top: "fc7"
241 | }
242 |
--------------------------------------------------------------------------------
/models/deploy_lab.prototxt:
--------------------------------------------------------------------------------
1 | # Split-Brain Autoencoder feature-extraction network (Lab-input variant).
2 | # Input blob "img_lab" is an image already converted to Lab colorspace; works with the main branch of Caffe.
3 | layer {
4 | name: "input"
5 | type: "Input"
6 | top: "img_lab" # image in Lab color space
7 | input_param { shape { dim: 1 dim: 3 dim: 227 dim: 227 } }
8 | }
9 | layer { # 0-center lightness channel
10 | name: "data_lab"
11 | type: "Convolution"
12 | bottom: "img_lab"
13 | top: "data_lab" # [-50,50]
14 | propagate_down: false
15 | param {lr_mult: 0 decay_mult: 0}
16 | param {lr_mult: 0 decay_mult: 0}
17 | convolution_param {
18 | kernel_size: 1
19 | num_output: 3
20 | group: 3
21 | }
22 | }
23 | layer {
24 | name: "conv1"
25 | type: "Convolution"
26 | # bottom: "img"
27 | bottom: "data_lab"
28 | # bottom: "img_bn"
29 | top: "conv1"
30 | param {lr_mult: 0 decay_mult: 0}
31 | param {lr_mult: 0 decay_mult: 0}
32 | convolution_param {
33 | num_output: 96
34 | kernel_size: 11
35 | stride: 4
36 | weight_filler {
37 | type: "gaussian"
38 | std: 0.01
39 | }
40 | bias_filler {
41 | type: "constant"
42 | value: 0
43 | }
44 | }
45 | }
46 | layer {
47 | name: "relu1"
48 | type: "ReLU"
49 | bottom: "conv1"
50 | top: "conv1"
51 | }
52 | layer {
53 | name: "pool1"
54 | type: "Pooling"
55 | bottom: "conv1"
56 | top: "pool1"
57 | pooling_param {
58 | pool: MAX
59 | kernel_size: 3
60 | stride: 2
61 | }
62 | }
63 | layer {
64 | name: "conv2"
65 | type: "Convolution"
66 | bottom: "pool1"
67 | top: "conv2"
68 | param { lr_mult: 0 decay_mult: 0 }
69 | param { lr_mult: 0 decay_mult: 0 }
70 | convolution_param {
71 | num_output: 256
72 | pad: 2
73 | kernel_size: 5
74 | group: 2
75 | }
76 | }
77 | layer {
78 | name: "relu2"
79 | type: "ReLU"
80 | bottom: "conv2"
81 | top: "conv2"
82 | }
83 | layer {
84 | name: "pool2"
85 | type: "Pooling"
87 | bottom: "conv2"
88 | top: "pool2"
89 | pooling_param {
90 | pool: MAX
91 | kernel_size: 3
92 | stride: 2
93 | # pad: 1
94 | }
95 | }
96 | layer {
97 | name: "conv3"
98 | type: "Convolution"
99 | bottom: "pool2"
100 | top: "conv3"
101 | # propagate_down: false
102 | param { lr_mult: 0 decay_mult: 0 }
103 | param { lr_mult: 0 decay_mult: 0 }
104 | convolution_param {
105 | num_output: 384
106 | pad: 1
107 | kernel_size: 3
108 | weight_filler {
109 | type: "gaussian"
110 | std: 0.01
111 | }
112 | bias_filler {
113 | type: "constant"
114 | value: 0
115 | }
116 | }
117 | }
118 | layer {
119 | name: "relu3"
120 | type: "ReLU"
121 | bottom: "conv3"
122 | top: "conv3"
123 | }
124 | layer {
125 | name: "conv4"
126 | type: "Convolution"
127 | bottom: "conv3"
128 | top: "conv4"
129 | param { lr_mult: 0 decay_mult: 0 }
130 | param { lr_mult: 0 decay_mult: 0 }
131 | convolution_param {
132 | num_output: 384
133 | pad: 1
134 | kernel_size: 3
135 | group: 2
136 | }
137 | }
138 | layer {
139 | name: "relu4"
140 | type: "ReLU"
141 | bottom: "conv4"
142 | top: "conv4"
143 | }
144 | layer {
145 | name: "conv5"
146 | type: "Convolution"
147 | bottom: "conv4"
148 | top: "conv5"
149 | param { lr_mult: 0 decay_mult: 0 }
150 | param { lr_mult: 0 decay_mult: 0 }
151 | convolution_param {
152 | num_output: 256
153 | pad: 1
154 | kernel_size: 3
155 | group: 2
156 | }
157 | }
158 | layer {
159 | name: "relu5"
160 | type: "ReLU"
161 | bottom: "conv5"
162 | top: "conv5"
163 | }
164 | layer {
165 | name: "pool5"
166 | type: "Pooling"
167 | bottom: "conv5"
168 | top: "pool5"
169 | pooling_param {
170 | pool: MAX
171 | kernel_size: 3
172 | stride: 2
173 | }
174 | }
175 | layer {
176 | name: "fc6"
177 | type: "Convolution"
178 | bottom: "pool5"
179 | top: "fc6"
180 | param { lr_mult: 0 decay_mult: 0 }
181 | param { lr_mult: 0 decay_mult: 0 }
182 | convolution_param {
183 | kernel_size: 6
184 | dilation: 2
185 | pad: 5
186 | stride: 1
187 | num_output: 4096
188 | weight_filler {
189 | type: "gaussian"
190 | std: 0.005
191 | }
192 | bias_filler {
193 | type: "constant"
194 | value: 1
195 | }
196 | }
197 | }
198 | layer {
199 | name: "relu6"
200 | type: "ReLU"
201 | bottom: "fc6"
202 | top: "fc6"
203 | }
204 | layer {
205 | name: "fc7"
206 | type: "Convolution"
207 | bottom: "fc6"
208 | top: "fc7"
209 | param { lr_mult: 0 decay_mult: 0 }
210 | param { lr_mult: 0 decay_mult: 0 }
211 | convolution_param {
212 | kernel_size: 1
213 | stride: 1
214 | num_output: 4096
215 | weight_filler {
216 | type: "gaussian"
217 | std: 0.005
218 | }
219 | bias_filler {
220 | type: "constant"
221 | value: 1
222 | }
223 | }
224 | }
225 | layer {
226 | name: "relu7"
227 | type: "ReLU"
228 | bottom: "fc7"
229 | top: "fc7"
230 | }
231 |
--------------------------------------------------------------------------------
/resources/fetch_caffe.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | # Download and unpack the modified Caffe (with the ColorConv layer) into ./caffe-colorization.
3 | wget http://eecs.berkeley.edu/~rich.zhang/projects/2016_colorization/files/train/caffe-colorization.tar.gz -O ./caffe-colorization.tar.gz
4 | tar -xvf ./caffe-colorization.tar.gz
--------------------------------------------------------------------------------
/resources/fetch_models.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | # Download the pre-trained weights into ./models (run this from the repository root).
3 | wget http://eecs.berkeley.edu/~rich.zhang/projects/2017_splitbrain/files/models/model_splitbrainauto_clcl.caffemodel -O ./models/model_splitbrainauto_clcl.caffemodel
4 | wget http://eecs.berkeley.edu/~rich.zhang/projects/2017_splitbrain/files/models/model_splitbrainauto_clcl_rs.caffemodel -O ./models/model_splitbrainauto_clcl_rs.caffemodel
--------------------------------------------------------------------------------