├── readme.md
├── render
│   ├── gen_script.py
│   ├── readme.md
│   ├── template-a.xml
│   ├── template-d.xml
│   ├── template-i.xml
│   ├── template-r.xml
│   └── template-s.xml
└── train
    ├── Criterion.lua
    ├── Dataset.lua
    ├── Network.lua
    ├── Patch.lua
    ├── Test.lua
    ├── Train.lua
    └── readme.md
/readme.md:
--------------------------------------------------------------------------------
1 | # ShapeNet-Intrinsics
2 |
3 | This is the implementation of [Learning Non-Lambertian Object Intrinsics across ShapeNet Categories](https://arxiv.org/abs/1612.08510).
4 |
5 | You might be interested in the synthetic dataset we used in the paper. The entire dataset takes more than 1 TB for the HDR images, and 240 GB even for compressed .jpg images, so it is hard to share online; we are still working on it ;)
6 |
7 | However, you can still check the [rendering scripts](render), which can regenerate the dataset and produce more for your own needs, e.g. depth and normal images. The [training and testing scripts](train) are implemented in [torch](http://torch.ch/).
8 |
9 | #### Downloads
10 |
11 | The trained torch models and the HDR environment maps can be accessed [here](https://1drv.ms/f/s!ApfQp_rip6el-X-neX32NGAE_aiC).
12 |
13 | Note: there are two models (model.t7 and model_old.t7). Torch updated its API for SpatialUpSamplingBilinear, which is used in the model. The old model was trained with the old API (probably before Oct 2016), while model.t7 works with the current version of torch.
14 |
15 | model.t7 was trained for 1M steps and shows some overfitting on the synthetic data.
16 | model_old.t7 was trained for 450k steps. It performs worse than model.t7 on synthetic data, but on some real data it might produce better results. To try the old model, simply uncomment line 10 in Test.lua (-- require 'Patch').
17 |
18 | #### Torch is outdated...
19 |
20 | Recently I found that people have switched to tensorflow and pytorch, while torch is no longer active. I spent some time trying to migrate this work to tensorflow, but the exact same network structure did not work there. The main problem seems to be the ReLU in the network: in tensorflow the network produces an all-black image, with pixel values and gradients clamped at zero. I tried different learning rates and optimizers but could not make it work, whereas in torch everything is fine. I could use LeakyReLU in tensorflow, but that differs from the original version, so I suspect some difference in the internal implementations of the two frameworks. I did not have enough time to sort this out, but I managed to compile torch and run the model on a relatively new platform: **ubuntu 18.04, cuda 9.0 and cudnn 7**, which should be acceptable for most people.
21 |
22 | #### Running the code...
23 |
24 | Here are the steps for running the code with torch under ubuntu 18.04, cuda 9.0 and cudnn 7.
25 |
26 | 1. Clone the torch repo as usual
27 | ```
28 | git clone https://github.com/torch/distro.git ~/torch --recursive
29 | ```
30 | 2. Modify install-deps: line 178 runs 'sudo apt-get install -y python-software-properties', but in ubuntu 18.04 this package is replaced by 'software-properties-common'. You can edit the line (see below) or install the package manually.
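For example, the package name can be swapped with a one-line edit (run inside ~/torch; this simply rewrites the package name in install-deps):
```
sed -i 's/python-software-properties/software-properties-common/' install-deps
```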
31 | 3. Set the C compiler to gcc-6, which is the maximum version supported by cuda-9.0. First install gcc-6 with apt, then change the default gcc to gcc-6.
32 | ```
33 | sudo apt install gcc-6
34 | sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-6 10
35 | sudo update-alternatives --config gcc
36 | ```
37 | 4. Install.
38 | ```
39 | export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"
40 | ./install.sh
41 | ```
42 | 5. If everything is OK, torch will run, but still on cudnn 5. Now we need to switch to cudnn 7.
43 | ```
44 | cd extra/cudnn
45 | git fetch
46 | git checkout R7
47 | luarocks make cudnn-scm-1.rockspec
48 | ```
49 | Now you can probably run torch.
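As a quick sanity check of the build, the CUDA and cuDNN bindings should load and report the cuDNN version the bindings were built against:
```
th -e "require 'cutorch'; require 'cudnn'; print(cudnn.version)"
```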
--------------------------------------------------------------------------------
/render/gen_script.py:
--------------------------------------------------------------------------------
1 | # generate rendering scripts
2 | import os
3 | import platform
4 | import random
5 | import math
6 | import urllib.request
7 |
8 | ## for testing...
9 | #os.environ["MTISUBA"] = "D:/Develope/Project/mitsuba_plugins/build/mitsuba/binaries/MinSizeRel/mitsuba.exe"
10 | #os.environ["SHAPENET_ROOT"] = "E:/ShapeNet/ShapeNetCore.v1/models"
11 | #os.environ["RENDER_ROOT"] = "E:/ShapeNet/Render"
12 | #os.environ["ENVMAP_ROOT"] = "E:/ShapeNet/Envmap"
13 |
14 | # fix random seed
15 | random.seed(0)
16 | # generate a random viewpoint on the upper hemisphere (radius 2)
17 | def RandomView():
18 | theta = random.random() * math.pi * 0.5 # theta should be in [0, 0.5PI)
19 | phi = random.random() * math.pi * 2.0 # [0, 2PI)
20 | x = math.sin(theta) * math.cos(phi)
21 | y = math.cos(theta)
22 | z = math.sin(theta) * math.sin(phi)
23 |
24 | return "\"%f,%f,%f\"" % (x * 2, y * 2, z * 2)
25 |
26 | # rendering options
27 | # useful mitsuba options: -q quiet, -x skip existing output
28 | options = "-q"
29 |
30 | # templates
31 | script_dir = os.path.dirname(os.path.abspath(__file__))
32 | template_i = os.path.join(script_dir, "template-i.xml")
33 | template_a = os.path.join(script_dir, "template-a.xml")
34 | template_s = os.path.join(script_dir, "template-s.xml")
35 | template_r = os.path.join(script_dir, "template-r.xml")
36 | template_d = os.path.join(script_dir, "template-d.xml")
37 |
38 | # the mitsuba executable binary
39 | MITSUBA = os.environ["MITSUBA"]
40 | # the ShapeNet model repository directory
41 | SHAPENET_ROOT = os.environ["SHAPENET_ROOT"]
42 | # the environment map root folder
43 | ENVMAP_ROOT = os.environ["ENVMAP_ROOT"]
44 | # where to put rendering output
45 | RENDER_ROOT = os.environ["RENDER_ROOT"]
46 |
47 | # first we create a rendering output directory
48 | if not os.path.exists(RENDER_ROOT):
49 | os.makedirs(RENDER_ROOT)
50 |
51 | # download all.csv from ShapeNet, which contains the official dataset split
52 | list_url = "http://shapenet.cs.stanford.edu/shapenet/obj-zip/SHREC16/all.csv"
53 | list_file = os.path.join(RENDER_ROOT, "dataset.csv")
54 | if not os.path.exists(list_file):
55 | print("Download model list from ShapeNet website...")
56 | u = urllib.request.urlretrieve(list_url, list_file)
57 |
58 | # load environment map list
59 | # assume there is a list.txt under ENVMAP_ROOT, one environment map filename per line
60 | envmaps = []
61 | for line in open(os.path.join(ENVMAP_ROOT, "list.txt")):
62 | env_file = os.path.join(ENVMAP_ROOT, line.strip())
63 | #print(env_file)
64 | if os.path.exists(env_file):
65 | envmaps.append(env_file)
66 |
67 | model_list = open(list_file, 'r')
68 |
69 | first_row = True
70 | for row in model_list:
71 | # skip the header row
72 | if first_row:
73 | first_row = False
74 | continue
75 |
76 | cols = row.split(",")
77 |
78 | idx = cols[0] # model index
79 | category = cols[1] # model category
80 | uuid = cols[3] # model ID
81 |
82 | # the model .obj file
83 | model_file = os.path.join(SHAPENET_ROOT, category, uuid, "model.obj")
84 | if not os.path.exists(model_file):
85 | print("Model %s: file does not exist!" % idx)
86 | print(model_file)
87 | else:
88 | print("Model %s, %s, %s" % (idx, category, uuid))
89 |
90 | # make output directory
91 | output_dir = os.path.join(RENDER_ROOT, idx)
92 | if not os.path.exists(output_dir):
93 | os.makedirs(output_dir)
94 |
95 | # we do not do rendering in this script
96 | # instead, we write rendering scripts for each model
97 | # and a synthesize script that calls ImageMagick to compose the final images
98 | if platform.system() == "Windows":
99 | render_script = open(os.path.join(output_dir, "render.bat"), 'w')
100 | syn_script = open(os.path.join(output_dir, "synthesize.bat"), 'w')
101 | else: # assume linux or macos
102 | render_script = open(os.path.join(output_dir, "render.sh"), 'w')
103 | syn_script = open(os.path.join(output_dir, "synthesize.sh"), 'w')
104 |
105 | # random specular properties
106 | ks = random.random() * 0.2
107 | ns = random.random() * 1000
108 | # diffuse term
109 | kd = 1.0 - ks
110 |
111 | for env_file in envmaps:
112 | output_name = os.path.splitext(os.path.basename(env_file))[0]
113 |
114 | # generate a random viewpoint
115 | view = RandomView()
116 |
117 | # variables passed to the rendering templates
118 | var = "-Dmodel=\"%s\" -Denv=\"%s\" -Dview=%s -Dkd=%f -Dks=%f -Dns=%f" % (model_file, env_file, view, kd, ks, ns)
119 |
120 | # a base command
121 |
122 | # render command for image, albedo, shading, specular and depth
123 | # we do not need to render the image directly since we can synthesize it by I = A*S + R
124 | #script.write("%s %s %s %s -o %s-i\n" % (MITSUBA, template_i, options, var, output_name))
125 |
126 | # rendering albedo
127 | render_script.write("%s %s %s %s -o %s_a\n" % (MITSUBA, template_a, options, var, output_name))
128 |
129 | # rendering shading
130 | render_script.write("%s %s %s %s -o %s_s\n" % (MITSUBA, template_s, options, var, output_name))
131 |
132 | # rendering specular
133 | render_script.write("%s %s %s %s -o %s_r\n" % (MITSUBA, template_r, options, var, output_name))
134 |
135 | # rendering depth, which can be used to generate the object mask
136 | render_script.write("%s %s %s %s -o %s_d\n" % (MITSUBA, template_d, options, var, output_name))
137 |
138 |
139 | # synthesis commands
140 | # we use .jpg LDR images instead of .exr HDR images to save disk space
141 |
142 | # first generate mask from depth
143 | syn_script.write("magick convert -format png -alpha off -depth 1 %s_d.exr %s_m.png\n" % (output_name,output_name))
144 |
145 | # albedo, shading and specular
146 | syn_script.write("magick mogrify -format jpg %s_a.exr\n" % output_name)
147 | syn_script.write("magick mogrify -format jpg %s_s.exr\n" % output_name)
148 | syn_script.write("magick mogrify -format jpg %s_r.exr\n" % output_name)
149 |
150 | # synthesize image
151 | syn_script.write("magick composite -compose Multiply %s_a.exr %s_s.exr temp.exr\n" % (output_name,output_name))
152 | syn_script.write("magick composite -compose Plus %s_r.exr temp.exr -format jpg %s_i.jpg\n" % (output_name,output_name))
153 |
154 |
155 | # clean
156 | if platform.system() == "Windows":
157 | syn_script.write("del temp.exr\n")
158 | else:
159 | syn_script.write("rm temp.exr\n")
160 |
161 | render_script.close()
162 | syn_script.close()
163 |
164 | model_list.close()
165 |
--------------------------------------------------------------------------------
/render/readme.md:
--------------------------------------------------------------------------------
1 | # Rendering Script
2 |
3 | This directory provides rendering configuration templates for [mitsuba-shapenet](https://github.com/shi-jian/mitsuba-shapenet) to render the different components of a ShapeNet model.
4 |
5 | ### Configuration templates
6 |
7 | * [template-i.xml](template-i.xml) configuration for rendering the image.
8 | * [template-a.xml](template-a.xml) for albedo.
9 | * [template-s.xml](template-s.xml) for shading.
10 | * [template-r.xml](template-r.xml) for specular. We use homogeneous specular for the entire model.
11 | * [template-d.xml](template-d.xml) for depth, which is used to generate mask.
12 |
13 | Note: in the experiments, we rendered albedo/shading/specular and then synthesized the image by I=A*S+R. Depth is used to generate the object mask.
14 |
15 | It might be useful to look into the albedo/depth configuration files if you want to render other 'fields' in mitsuba, such as normals.
16 |
17 | ### Script for generating rendering scripts
18 |
19 | [gen_script.py](gen_script.py) is used to generate the rendering and synthesis scripts. Please set the following environment variables (see the example at the bottom of this page):
20 |
21 | * **MITSUBA**
22 | points to the mitsuba renderer executable (e.g. mitsuba.exe on Windows).
23 |
24 | * **SHAPENET_ROOT**
25 | the directory containing the extracted ShapeNet models.
26 |
27 | * **ENVMAP_ROOT**
28 | the directory containing the environment maps, with a 'list.txt' file. Each line of the list file contains one environment map filename.
29 |
30 | * **RENDER_ROOT**
31 | the directory where rendering scripts and results will be placed.
32 |
33 |
34 | Recently ShapeNet released an official dataset split. The script automatically downloads the model [list](http://shapenet.cs.stanford.edu/shapenet/obj-zip/SHREC16/all.csv) from ShapeNet, which contains model indices, categories, uuids and the train/test/val split. It then creates an output directory for each model under RENDER_ROOT, together with two scripts: render.bat/render.sh and synthesize.bat/synthesize.sh.
35 |
36 | * **render.bat / render.sh**: render albedo/shading/specular/depth as HDR images.
37 | * **synthesize.bat / synthesize.sh**: generate the mask image from depth, convert albedo/shading/specular from HDR to LDR (to save disk space), and synthesize the image by I=A*S+R. [ImageMagick](http://www.imagemagick.org) is required for this step.
38 |
39 | Then you can run these scripts from inside each model's directory, as in the example below. We strongly recommend rendering on a cluster: rendering a single model under 92 environment maps takes about 45 minutes on an old i7-2600 PC.
40 |
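A minimal end-to-end example (all paths below are illustrative, adjust them to your setup):

```
# tell gen_script.py where everything is
export MITSUBA=/usr/local/bin/mitsuba
export SHAPENET_ROOT=/data/ShapeNetCore.v1/models
export ENVMAP_ROOT=/data/envmaps      # must contain list.txt
export RENDER_ROOT=/data/render

# generate per-model render.sh / synthesize.sh scripts
python3 gen_script.py

# render one model and compose the final images (ImageMagick required)
cd "$RENDER_ROOT"/<model-index>
sh render.sh
sh synthesize.sh
```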
--------------------------------------------------------------------------------
/render/template-a.xml:
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
/render/template-d.xml:
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
/render/template-i.xml:
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
/render/template-r.xml:
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
/render/template-s.xml:
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
/train/Criterion.lua:
--------------------------------------------------------------------------------
1 | require 'torch'
2 | require 'nn'
3 | require 'cunn'
4 |
5 | -- weighted MSE criterion, optionally scale-invariant, with a per-pixel weight (mask)
6 | local WeightedMSE, parent = torch.class('nn.WeightedMSE', 'nn.Criterion')
7 |
8 | function WeightedMSE:__init(scale_invariant)
9 | parent.__init(self)
10 | -- we use a standard MSE criterion internally
11 | self.criterion = nn.MSECriterion()
12 | self.criterion.sizeAverage = false
13 |
14 | -- whether to use the scale-invariant loss
15 | self.scale_invariant = scale_invariant or false
16 | end
17 |
18 | -- targets should contain {target, weight}
19 | function WeightedMSE:updateOutput(pred, targets)
20 |
21 | local target = targets[1]
22 | local weight = targets[2]
23 |
24 | -- scale-invariant: rescale the pred to target scale
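-- the per-sample (and per-channel) scale computed below is the least-squares fit
-- s = sum(pred * target) / sum(pred * pred), taken over the spatial dimensions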
25 | if self.scale_invariant then
26 |
27 | -- get the dimension and size
28 | local dim = target:dim()
29 | local size = target:size()
30 | for i=1,dim-2 do
31 | size[i] = 1
32 | end
33 |
34 | -- scale invariant
35 | local tensor1 = torch.cmul(pred, target)
36 | local tensor2 = torch.cmul(pred, pred)
37 |
38 | -- get the scale
39 | self.scale = torch.cdiv(tensor1:sum(dim):sum(dim-1),tensor2:sum(dim):sum(dim-1))
40 | -- patch NaN
41 | self.scale[self.scale:ne(self.scale)] = 1
42 |
43 | -- constrain the scale in [0.1, 10]
44 | self.scale:cmin(10)
45 | self.scale:cmax(0.1)
46 |
47 | -- expand the scale
48 | self.scale = self.scale:repeatTensor(size)
49 |
50 | -- re-scale the pred
51 | pred:cmul(self.scale)
52 | end
53 |
54 | -- sum of squared weights, used to normalize the loss
55 | self.alpha = torch.cmul(weight, weight):sum()
56 | if self.alpha ~= 0 then
57 | self.alpha = 1 / self.alpha
58 | end
59 |
60 | -- apply the weight to pred and target, and keep them so the backward pass does not need to recompute
61 | self.weighted_pred = torch.cmul(pred, weight)
62 | self.weighted_target = torch.cmul(target, weight)
63 |
64 | return self.criterion:forward(self.weighted_pred, self.weighted_target) * self.alpha
65 | end
66 |
67 | function WeightedMSE:updateGradInput(input, target)
68 |
69 | self.grad = self.criterion:backward(self.weighted_pred, self.weighted_target)
70 |
71 | if self.scale then
72 | self.grad:cdiv(self.scale)
73 | -- patch NaN
74 | self.grad[self.grad:ne(self.grad)] = 0
75 | end
76 |
77 | return self.grad * self.alpha
78 |
79 | end
80 |
81 | function WeightedMSE:cuda()
82 | self.criterion:cuda()
83 | end
84 |
85 |
86 | -- build Sobel convolution kernels for computing the image gradient
87 | local x = nn.SpatialConvolution(3,3,3,3,1,1,1,1)
88 | x.weight:zero()
89 | x.bias:zero()
90 |
91 | local y = nn.SpatialConvolution(3,3,3,3,1,1,1,1)
92 | y.weight:zero()
93 | y.bias:zero()
94 |
95 | for i = 1, 3 do
96 | x.weight[{i, i, {}, {}}] = torch.Tensor({{-1, 0, 1},{-2, 0, 2},{-1, 0, 1}})
97 | y.weight[{i, i, {}, {}}] = torch.Tensor({{-1, -2, -1},{0, 0, 0},{1, 2, 1}})
98 | end
99 |
100 | -- gradient
101 | local function Gradient(image)
102 | local ix = x:forward(image)
103 | local iy = y:forward(image)
104 | return torch.sqrt(torch.add(torch.cmul(ix, ix), torch.cmul(iy, iy)))
105 | end
106 |
107 | -- criterion for the multiple outputs (albedo/shading/specular)
108 | IntrinsicCriterion, parent = torch.class('nn.IntrinsicCriterion', 'nn.Criterion')
109 |
110 | function IntrinsicCriterion:__init()
111 |
112 | -- criterion for albedo: a mix of scale-invariant (0.95) and absolute (0.05) weighted MSE
113 | self.criterion_a = nn.MultiCriterion()
114 | self.criterion_a:add(nn.WeightedMSE(true), 0.95)
115 | self.criterion_a:add(nn.WeightedMSE(false), 0.05)
116 |
117 | -- for shading
118 | self.criterion_s = nn.MultiCriterion()
119 | self.criterion_s:add(nn.WeightedMSE(true), 0.95)
120 | self.criterion_s:add(nn.WeightedMSE(false), 0.05)
121 |
122 | -- for specular
123 | self.criterion_r = nn.MultiCriterion()
124 | self.criterion_r:add(nn.WeightedMSE(false), 1)
125 |
126 | end
127 |
128 | -- pred contains a table of {pred_albedo, pred_shading, pred_specular}
129 | -- target contains a table of {input, mask, albedo, shading, specular}
130 | function IntrinsicCriterion:updateOutput(pred, target)
131 | -- the input might be useful for computing the weight (see the gradient-based weighting below)
132 | local input = target[1]
133 | local mask = target[2]
134 |
135 | -- mask out background pixels of both the ground truth and the prediction
136 | local gt_a = torch.cmul(mask, target[3])
137 | local gt_s = torch.cmul(mask, target[4])
138 | local gt_r = torch.cmul(mask, target[5])
139 |
140 | local pd_a = torch.cmul(mask, pred[1])
141 | local pd_s = torch.cmul(mask, pred[2])
142 | local pd_r = torch.cmul(mask, pred[3])
143 |
144 | -- calculate weight
145 | local weight
146 | local useGradient = false
147 | if useGradient then
148 | local gradient = torch.exp(Gradient(input))
149 | weight = torch.cmul(mask, gradient)
150 | else
151 | weight = mask
152 | end
153 |
154 | self.loss_a = self.criterion_a:forward(pd_a, {gt_a, weight})
155 | self.loss_s = self.criterion_s:forward(pd_s, {gt_s, weight})
156 | self.loss_r = self.criterion_r:forward(pd_r, {gt_r, weight})
157 |
158 | return {self.loss_a, self.loss_s, self.loss_r}
159 | end
160 |
161 | function IntrinsicCriterion:updateGradInput(pred, target)
162 |
163 | local input = target[1]
164 | local mask = target[2]
165 | local gt_a = target[3]
166 | local gt_s = target[4]
167 | local gt_r = target[5]
168 |
169 | local pd_a = pred[1]
170 | local pd_s = pred[2]
171 | local pd_r = pred[3]
172 |
173 | self.grad_a = self.criterion_a:backward(pd_a, {gt_a, mask})
174 | self.grad_s = self.criterion_s:backward(pd_s, {gt_s, mask})
175 | self.grad_r = self.criterion_r:backward(pd_r, {gt_r, mask})
176 |
177 | return {self.grad_a, self.grad_s, self.grad_r}
178 | end
179 |
180 | function IntrinsicCriterion:cuda()
181 | -- convert to cuda
182 | self.criterion_a:cuda()
183 | self.criterion_s:cuda()
184 | self.criterion_r:cuda()
185 | end
186 |
187 |
188 |
189 |
190 |
191 |
192 |
193 |
--------------------------------------------------------------------------------
/train/Dataset.lua:
--------------------------------------------------------------------------------
1 | Dataset = {}
2 | Dataset.__index = Dataset
3 |
4 | -- a simple dataset class: the Cartesian product of a model list and an environment map list
5 | function Dataset.load(list1, list2)
6 | local dataset = {}
7 | setmetatable(dataset, Dataset)
8 |
9 | dataset.size1 = #list1
10 | dataset.size2 = #list2
11 | dataset.size = dataset.size1 * dataset.size2
12 |
13 | dataset.list1 = list1
14 | dataset.list2 = list2
15 |
16 | -- build an index table; we use a LongStorage instead of a Lua table to avoid GC pressure
17 | dataset.idx = torch.LongStorage(dataset.size)
18 | for i=1,dataset.size do
19 | dataset.idx[i] = i
20 | end
21 |
22 | return dataset
23 | end
24 |
25 | function Dataset:shuffle(seed)
26 | math.randomseed(seed or 0)
27 | for i=self.size, 2, -1 do
28 | -- get a random index for shuffle
29 | local j = math.random(i)
30 | -- shuffle indices
31 | self.idx[i],self.idx[j] = self.idx[j],self.idx[i]
32 | end
33 | end
34 |
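-- map a (shuffled) flat index to a (model, envmap) pair; the index wraps around,
-- so the dataset can be iterated indefinitely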
35 | function Dataset:get(index)
36 | -- keep index in [1, self.size]
37 | index = (index - 1) % self.size + 1
38 |
39 | local idx1 = (self.idx[index] - 1) % self.size1 + 1
40 | local idx2 = math.floor((self.idx[index] - 1) / self.size1 + 1)
41 | return self.list1[idx1], self.list2[idx2]
42 | end
43 |
44 |
45 |
46 |
--------------------------------------------------------------------------------
/train/Network.lua:
--------------------------------------------------------------------------------
1 | require 'nn'
2 | require 'nngraph'
3 | require 'cunn'
4 | require 'cudnn'
5 |
6 | -- the network
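-- encoder-decoder with skip links: a shared encoder downsamples 256x256 -> 8x8, then three
-- decoder branches (albedo / shading / specular) upsample back to 256x256. At each decoder
-- stage, every branch sees the features of all three branches from the previous stage plus
-- the matching encoder feature map (kept as the 4th table entry), joined via nn.JoinTable(2).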
7 | function IntrinsicNetwork()
8 |
9 | local SConv = nn.SpatialConvolution
10 | local SBatchNorm = nn.SpatialBatchNormalization
11 | local SUpSamp = nn.SpatialUpSamplingBilinear
12 |
13 | -- input image
14 | local image = nn.Identity()():annotate{
15 | name = 'Input'
16 | }
17 |
18 | -- encoder layers
19 | local conv0 = nn.Sequential()
20 | conv0:add(SConv(3,16,3,3,1,1,1,1))
21 | conv0:add(SBatchNorm(16))
22 | conv0:add(nn.ReLU(true))
23 | local conv00 = conv0(image) -- still 256x256
24 |
25 | local conv1 = nn.Sequential()
26 | conv1:add(SConv(16,32,3,3,2,2,1,1))
27 | conv1:add(SBatchNorm(32))
28 | conv1:add(nn.ReLU(true))
29 | local conv10 = conv1(conv00) -- 128x128
30 |
31 | local conv2 = nn.Sequential()
32 | conv2:add(SConv(32,64,3,3,2,2,1,1))
33 | conv2:add(SBatchNorm(64))
34 | conv2:add(nn.ReLU(true))
35 | local conv20 = conv2(conv10) -- 64x64
36 |
37 | local conv3 = nn.Sequential()
38 | conv3:add(SConv(64,128,3,3,2,2,1,1))
39 | conv3:add(SBatchNorm(128))
40 | conv3:add(nn.ReLU(true)) -- 32x32
41 | local conv30 = conv3(conv20)
42 |
43 | local conv4 = nn.Sequential()
44 | conv4:add(SConv(128,256,3,3,2,2,1,1))
45 | conv4:add(SBatchNorm(256))
46 | conv4:add(nn.ReLU(true)) -- 16x16
47 | local conv40 = conv4(conv30)
48 |
49 | local conv5 = nn.Sequential()
50 | conv5:add(SConv(256,256,3,3,2,2,1,1))
51 | conv5:add(SBatchNorm(256))
52 | conv5:add(nn.ReLU(true)) -- 8x8
53 | local conv50 = conv5(conv40)
54 |
55 | -- start decoder
56 | local mid = {}
57 | local deconvs0 = {}
58 | local deconvs1 = {}
59 | local deconvs2 = {}
60 | local deconvs3 = {}
61 | local deconvs4 = {}
62 | local outputs = {}
63 |
64 | for i=1,3 do
65 | local fc = nn.Sequential()
66 | fc:add(SConv(256,256,3,3,1,1,1,1))
67 | fc:add(SBatchNorm(256))
68 | fc:add(nn.ReLU(true)) -- 8x8
69 | fc:add(SConv(256,256,3,3,1,1,1,1))
70 | fc:add(SBatchNorm(256))
71 | fc:add(nn.ReLU(true)) -- 8x8
72 | fc:add(SConv(256,256,3,3,1,1,1,1))
73 | fc:add(SBatchNorm(256))
74 | fc:add(nn.ReLU(true)) -- 8x8
75 | fc:add(SConv(256,256,3,3,1,1,1,1))
76 | fc:add(SBatchNorm(256))
77 | fc:add(nn.ReLU(true)) -- 8x8
78 | mid[i] = fc(conv50)
79 | end
80 | mid[4] = conv50
81 |
82 | for i=1,3 do
83 | -- deconv and upsampling
84 | local deconv0 = nn.Sequential()
85 | deconv0:add(nn.JoinTable(2))
86 | deconv0:add(SConv(1024,256,3,3,1,1,1,1))
87 | deconv0:add(SBatchNorm(256))
88 | deconv0:add(nn.ReLU(true))
89 | deconv0:add(SUpSamp(2))
90 | deconvs0[i] = deconv0(mid) -- 16x16
91 | end
92 | deconvs0[4] = conv40
93 |
94 | for i=1,3 do
95 | local deconv1 = nn.Sequential()
96 | deconv1:add(nn.JoinTable(2))
97 | deconv1:add(SConv(1024,128,3,3,1,1,1,1))
98 | deconv1:add(SBatchNorm(128))
99 | deconv1:add(nn.ReLU(true))
100 | deconv1:add(SUpSamp(2))
101 | deconvs1[i] = deconv1(deconvs0) -- 32x32
102 | end
103 | deconvs1[4] = conv30
104 |
105 | for i=1,3 do
106 | local deconv2 = nn.Sequential()
107 | deconv2:add(nn.JoinTable(2))
108 | deconv2:add(SConv(512,64,3,3,1,1,1,1))
109 | deconv2:add(SBatchNorm(64))
110 | deconv2:add(nn.ReLU(true))
111 | deconv2:add(SUpSamp(2))
112 | deconvs2[i] = deconv2(deconvs1) -- 64x64
113 | end
114 | deconvs2[4] = conv20
115 |
116 | for i=1,3 do
117 | local deconv3 = nn.Sequential()
118 | deconv3:add(nn.JoinTable(2))
119 | deconv3:add(SConv(256,32,3,3,1,1,1,1))
120 | deconv3:add(SBatchNorm(32))
121 | deconv3:add(nn.ReLU(true))
122 | deconv3:add(SUpSamp(2))
123 | deconvs3[i] = deconv3(deconvs2) -- 128x128
124 | end
125 | deconvs3[4] = conv10
126 |
127 | for i=1,3 do
128 | local deconv4 = nn.Sequential()
129 | deconv4:add(nn.JoinTable(2))
130 | deconv4:add(SConv(128,16,3,3,1,1,1,1))
131 | deconv4:add(SBatchNorm(16))
132 | deconv4:add(nn.ReLU(true))
133 | deconv4:add(SUpSamp(2))
134 | deconvs4[i] = deconv4(deconvs3) -- 256x256
135 | end
136 | deconvs4[4] = conv00
137 |
138 | for i=1,3 do
139 | -- output
140 | local output4 = nn.Sequential()
141 | output4:add(nn.JoinTable(2))
142 | output4:add(SConv(64,16,3,3,1,1,1,1))
143 | output4:add(SBatchNorm(16))
144 | output4:add(nn.ReLU(true))
145 |
146 | -- image resolution
147 | output4:add(SConv(16,3,3,3,1,1,1,1))
148 | output4:add(SBatchNorm(3))
149 | output4:add(nn.ReLU(true))
150 | outputs[i] = output4(deconvs4) -- 3x256x256
151 | end
152 |
153 | return nn.gModule({image}, outputs)
154 | end
155 |
156 |
--------------------------------------------------------------------------------
/train/Patch.lua:
--------------------------------------------------------------------------------
1 | require 'nn'
2 |
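-- compatibility patch for nn.SpatialUpSamplingBilinear:setSize; as noted in the top-level
-- readme, require this file (line 10 of Test.lua) only when running the old model_old.t7
-- checkpoint, which was trained before the torch API change.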
3 | local SpatialUpSamplingBilinear = nn.SpatialUpSamplingBilinear
4 |
5 | function SpatialUpSamplingBilinear:setSize(input)
6 | local xdim = input:dim()
7 | local ydim = xdim - 1
8 | for i = 1, input:dim() do
9 | self.inputSize[i] = input:size(i)
10 | self.outputSize[i] = input:size(i)
11 | end
12 | if self.scale_factor ~= nil then
13 | self.outputSize[ydim] = (self.outputSize[ydim]-1) * (self.scale_factor-1)
14 | + self.outputSize[ydim]
15 | self.outputSize[xdim] = (self.outputSize[xdim]-1) * (self.scale_factor -1)
16 | + self.outputSize[xdim]
17 | else
18 | self.outputSize[ydim] = self.oheight
19 | self.outputSize[xdim] = self.owidth
20 | end
21 | end
22 |
23 |
--------------------------------------------------------------------------------
/train/Test.lua:
--------------------------------------------------------------------------------
1 | require 'torch'
2 | require 'paths'
3 | require 'image'
4 | require 'nn'
5 | require 'nngraph'
6 | require 'cunn'
7 | require 'cudnn'
8 |
9 | -- for the old model (model_old.t7), uncomment the next line
10 | -- require 'Patch'
11 |
12 | local cmd = torch.CmdLine()
13 |
14 | cmd:option('-input', '', 'input image')
15 | cmd:option('-mask', '', 'input mask')
16 | cmd:option('-model', '', 'model file')
17 | cmd:option('-outdir', '.', 'output directory')
18 | cmd:option('-gpu', 1, 'use GPU')
19 |
20 | local options = cmd:parse(arg)
21 |
22 | local input = torch.FloatTensor(1, 3, 256, 256)
23 | local mask = torch.FloatTensor(1, 3, 256, 256)
24 |
25 | -- load model
26 | local model = torch.load(options.model)
27 |
28 | -- load input image and mask
29 | input[{1, {}, {}, {}}] = image.scale(image.load(options.input, 3), 256, 256)
30 | mask[{1, {}, {}, {}}] = image.scale(image.load(options.mask, 3), 256, 256)
31 |
32 | if options.gpu == 0 then
33 | model:float()
34 | else
35 | model:cuda()
36 | input = input:cuda()
37 | mask = mask:cuda()
38 | end
39 |
40 | local pred = model:forward(input)
41 |
42 | -- save output
43 | image.save(paths.concat(options.outdir, 'albedo.png'), pred[1]:cmul(mask):squeeze())
44 | image.save(paths.concat(options.outdir, 'shading.png'), pred[2]:cmul(mask):squeeze())
45 | image.save(paths.concat(options.outdir, 'specular.png'), pred[3]:cmul(mask):squeeze())
46 | -- save a copy of input
47 | image.save(paths.concat(options.outdir, 'input.png'), input:squeeze())
48 | image.save(paths.concat(options.outdir, 'mask.png'), mask:squeeze())
49 |
50 | --EOF
51 |
52 |
--------------------------------------------------------------------------------
/train/Train.lua:
--------------------------------------------------------------------------------
1 | require 'torch'
2 | require 'paths'
3 | require 'image'
4 |
5 | --
6 | require 'Dataset'
7 | require 'Network'
8 | require 'Criterion'
9 |
10 | -- parse command line parameters
11 | local cmd = torch.CmdLine()
12 |
13 | cmd:option('-data_root', os.getenv('RENDER_ROOT') or '', 'the dataset root directory')
14 | cmd:option('-model_list', paths.concat(os.getenv('RENDER_ROOT') or '','dataset.csv'))
15 | cmd:option('-env_list', paths.concat(os.getenv('ENVMAP_ROOT') or '','list.txt'))
16 | cmd:option('-outdir', '.', 'output directory')
17 |
18 | cmd:option('-max_iter', 1000000, 'max training iteration')
19 | cmd:option('-save_iter', 10000, 'save snapshot')
20 | cmd:option('-test_iter', 10000, 'test')
21 | cmd:option('-disp_iter', 10, 'display error and training result')
22 |
23 | cmd:option('-snapshot', '', 'snapshot file')
24 |
25 | cmd:option('-batch_size', 4, 'batch size')
26 | cmd:option('-cuda', true, 'use cuda')
27 | cmd:option('-cudnn', true, 'use cudnn')
28 | cmd:option('-devid', 1, 'cuda device index')
29 | cmd:option('-seed', 666, 'random seed')
30 |
31 | local options = cmd:parse(arg)
32 |
33 | -- check input files
34 | if not paths.filep(options.model_list) then
35 | print('Please check model_list file!')
36 | os.exit(1)
37 | end
38 |
39 | if not paths.filep(options.env_list) then
40 | print('Please check env_list file!')
41 | os.exit(1)
42 | end
43 |
44 | -- load ShapeNet model list
45 | local model_train = {}
46 | local model_test = {}
47 | local model_val = {}
48 |
49 | -- read model list
50 | for line in io.lines(options.model_list) do
51 | local split = {}
52 | for token in line:gmatch('([^,]+)') do
53 | table.insert(split, token)
54 | end
55 |
56 | -- assign the model to its train/test/val split
57 | if split[5] == 'train' then
58 | table.insert(model_train, split[1])
59 | elseif split[5] == 'test' then
60 | table.insert(model_test, split[1])
61 | elseif split[5] == 'val' then
62 | table.insert(model_val, split[1])
63 | end
64 | end
65 |
66 | -- print(#model_train, #model_test, #model_val)
67 |
68 | -- load environment map list
69 | local env_train = {}
70 | local env_test = {}
71 | local env_val = {}
72 |
73 | for line in io.lines(options.env_list) do
74 | local split = {}
75 | for token in paths.basename(line):gmatch('([^.]+)') do
76 | table.insert(split, token)
77 | end
78 |
79 | -- we use all environment maps for both training and testing
80 | -- since the environment map dataset is relatively small, it is not a good idea to split it randomly.
81 | -- experiments showed that a reasonable split of the environment maps gives results very close to the no-split setting
82 | table.insert(env_train, split[1])
83 | table.insert(env_test, split[1])
84 | table.insert(env_val, split[1])
85 | end
86 |
87 | -- create training dataset
88 | local dataset = Dataset.load(model_train, env_train)
89 | dataset:shuffle()
90 | print('Training dataset size:', dataset.size)
91 |
92 | -- pre-allocate memory; the 256x256 image resolution is hard-coded
93 | local input = torch.Tensor(options.batch_size, 3, 256, 256)
94 | local mask = torch.Tensor(options.batch_size, 3, 256, 256)
95 | local target_albedo = torch.Tensor(options.batch_size, 3, 256, 256)
96 | local target_shading = torch.Tensor(options.batch_size, 3, 256, 256)
97 | local target_specular = torch.Tensor(options.batch_size, 3, 256, 256)
98 |
99 | torch.setdefaulttensortype('torch.FloatTensor')
100 | torch.manualSeed(options.seed)
101 | if options.cuda then
102 | cutorch.setDevice(options.devid)
103 | cutorch.manualSeed(options.seed)
104 | end
105 |
106 | -- network and criterion
107 | local network
108 |
109 | if options.snapshot and paths.filep(options.snapshot) then
110 | -- load from snapshot
111 | print('Load snapshot file...')
112 | network = torch.load(options.snapshot)
113 | else
114 | network = IntrinsicNetwork()
115 | end
116 |
117 | local criterion = nn.IntrinsicCriterion()
118 |
119 | -- cuda
120 | if options.cuda then
121 | require 'cunn'
122 |
123 | input = input:cuda()
124 | mask = mask:cuda()
125 | target_albedo = target_albedo:cuda()
126 | target_shading = target_shading:cuda()
127 | target_specular = target_specular:cuda()
128 |
129 | network:cuda()
130 | criterion:cuda()
131 |
132 | if options.cudnn then
133 | require 'cudnn'
134 | --cudnn.fastest = true
135 | cudnn.convert(network, cudnn)
136 | end
137 | end
138 |
139 | -- for solver
140 | local x, dl_dx = network:getParameters()
141 |
142 | local optim = require('optim')
143 | --local solver = optim['adam']
144 | local solver = optim['adadelta']
145 | local state = {
146 | learningRate = 0.01,
147 | learningRateDecay = 1e-5
148 | }
149 |
150 | -- for displaying images and plotting errors (requires the 'display' package)
151 | local display = require 'display'
152 |
153 | -- timer
154 | local timer = torch.Timer()
155 | timer:reset()
156 |
157 | local loss_albedo = 0
158 | local loss_shading = 0
159 | local loss_specular = 0
160 |
161 | local loss_table_train = {}
162 | local loss_table_test = {}
163 |
164 | local plot_config_train = {
165 | title = "Training Loss",
166 | labels = {"iter", "albedo", "shading", "specular"},
167 | ylabel = "Weighted MSE",
168 | }
169 |
170 | local plot_config_test = {
171 | title = "Testing Loss",
172 | labels = {"iter", "albedo", "shading", "specular"},
173 | ylabel = "Weighted MSE",
174 | }
175 |
176 | local iter = 0
177 | local curr = 0 -- current sample index
178 | while iter < options.max_iter do
179 | iter = iter + 1
180 |
181 | network:training()
182 |
183 | for i=1,options.batch_size do
184 | curr = curr + 1
185 | local model, env = dataset:get(curr)
186 | local prefix = paths.concat(options.data_root, model, env)
187 |
188 | -- load data
189 | input[{i,{},{},{}}] = image.load(prefix..'_i.jpg', 3)
190 | mask[{i,{},{},{}}] = image.load(prefix..'_m.png', 3)
191 |
192 | target_albedo[{i,{},{},{}}] = image.load(prefix..'_a.jpg', 3)
193 | target_shading[{i,{},{},{}}] = image.load(prefix..'_s.jpg', 3)
194 | target_specular[{i,{},{},{}}] = image.load(prefix..'_r.jpg', 3)
195 | end
196 |
197 | -- mask out input background?
198 | -- input:cmul(mask)
199 |
200 | -- forward
201 | local pred = network:forward(input)
202 | local pred_albedo = pred[1]:cmul(mask)
203 | local pred_shading = pred[2]:cmul(mask)
204 | local pred_specular = pred[3]:cmul(mask)
205 |
206 | -- get loss
207 | local loss = criterion:forward(pred, {input, mask, target_albedo, target_shading, target_specular})
208 | loss_albedo = loss_albedo + loss[1]
209 | loss_shading = loss_shading + loss[2]
210 | loss_specular = loss_specular + loss[3]
211 |
212 | -- get gradient
213 | local grad = criterion:backward(pred, {input, mask, target_albedo, target_shading, target_specular})
214 |
215 | -- optimize
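-- note: the criterion forward/backward were already computed above; feval only
-- backpropagates the gradient through the network and returns the pre-computed loss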
216 | local function feval()
217 | -- compute gradients of the loss w.r.t. the network parameters
218 | network:zeroGradParameters()
219 | network:backward(input,grad)
220 | return loss, dl_dx
221 | end
222 | solver(feval, x, state)
223 |
224 |
225 | if iter % options.disp_iter == 0 then
226 | -- display training images and plot error
227 | win_image = display.image(input, {win=win_image,title='Input Image, iter:'..iter})
228 | win_a0 = display.image(torch.cat(target_albedo,pred_albedo, 4), {win=win_a0,title='Albedo, iter:'..iter})
229 | win_s0 = display.image(torch.cat(target_shading,pred_shading, 4), {win=win_s0,title='Shading, iter:'..iter})
230 | win_r0 = display.image(torch.cat(target_specular,pred_specular, 4), {win=win_r0,title='Specular, iter:'..iter})
231 |
232 | loss_albedo = loss_albedo / options.disp_iter
233 | loss_shading = loss_shading / options.disp_iter
234 | loss_specular = loss_specular / options.disp_iter
235 |
236 | table.insert(loss_table_train, {iter, loss_albedo, loss_shading, loss_specular})
237 | plot_config_train.win = display.plot(loss_table_train, plot_config_train)
238 |
239 | -- print to console
240 | print(string.format("Iteration %d, %d/%d samples, %.2f seconds passed",
241 | iter, curr, dataset.size, timer:time().real))
242 | print(string.format("\t Loss-a: %.4f, Loss-s: %.4f, Loss-r: %.4f",
243 | loss_albedo, loss_shading, loss_specular))
244 |
245 | loss_albedo = 0
246 | loss_shading = 0
247 | loss_specular = 0
248 | end
249 |
250 | if iter % options.test_iter == 0 then
251 | -- testing...
252 | network:evaluate()
253 |
254 | local test_loss_a = 0
255 | local test_loss_s = 0
256 | local test_loss_r = 0
257 |
258 | for k = 1,#model_test,options.batch_size do
259 |
260 | for i=1, options.batch_size do
261 | local idx0 = (k + i - 2) % #model_test + 1
262 | local idx1 = (k + i - 2) % #env_test + 1
263 |
264 | -- go through models
265 | local model = model_test[idx0]
266 | -- we only use single envmap to save testing time
267 | local env = env_test[idx1]
268 |
269 | local prefix = paths.concat(options.data_root, model, env)
270 |
271 | -- load data
272 | input[{i,{},{},{}}] = image.load(prefix..'_i.jpg', 3)
273 | mask[{i,{},{},{}}] = image.load(prefix..'_m.png', 3)
274 |
275 | target_albedo[{i,{},{},{}}] = image.load(prefix..'_a.jpg', 3)
276 | target_shading[{i,{},{},{}}] = image.load(prefix..'_s.jpg', 3)
277 | target_specular[{i,{},{},{}}] = image.load(prefix..'_r.jpg', 3)
278 | end
279 |
280 | -- forward
281 | local pred = network:forward(input)
282 |
283 | -- get loss
284 | local loss = criterion:forward(pred, {input, mask, target_albedo, target_shading, target_specular})
285 | test_loss_a = test_loss_a + loss[1]
286 | test_loss_s = test_loss_s + loss[2]
287 | test_loss_r = test_loss_r + loss[3]
288 |
289 | end
290 |
291 | test_loss_a = test_loss_a / #model_test
292 | test_loss_s = test_loss_s / #model_test
293 | test_loss_r = test_loss_r / #model_test
294 |
295 | print(string.format("Evaluation, Loss-a: %.4f, Loss-s: %.4f, Loss-r: %.4f",
296 | test_loss_a, test_loss_s, test_loss_r))
297 |
298 | table.insert(loss_table_test, {iter, test_loss_a, test_loss_s, test_loss_r})
299 | plot_config_test.win = display.plot(loss_table_test, plot_config_test)
300 | end
301 |
302 | if iter % options.save_iter == 0 then
303 | -- save model
304 | print("Save model on iteration", iter)
305 | network:clearState()
306 | torch.save(paths.concat(options.outdir, 'snapshot_'..iter..'.t7'), network)
307 | end
308 |
309 | -- manually GC
310 | collectgarbage()
311 |
312 | end
313 |
314 |
315 |
316 |
317 |
--------------------------------------------------------------------------------
/train/readme.md:
--------------------------------------------------------------------------------
1 | # Training/Testing scripts
2 |
3 | This directory provides the network definition, the loss criterion, and the training and testing scripts.
4 |
5 | ## Train
6 |
7 | Some parameters (see the example command after this list):
8 | * **-data_root** specifies the root directory of the rendered ShapeNet images.
9 | * **-model_list** specifies the .csv file with model IDs and the dataset split, downloaded from the [ShapeNet](http://shapenet.cs.stanford.edu/shapenet/obj-zip/SHREC16/all.csv) website.
10 | * **-env_list** specifies the environment map list file.
11 | * **-outdir** specifies the output directory for saving snapshots. Default is the current directory.
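For example, reusing the environment variables from the rendering step (the snapshot directory name is just an example):

```
mkdir -p snapshots
th Train.lua -data_root "$RENDER_ROOT" -outdir snapshots -batch_size 4
```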
12 |
13 | ## Test
14 |
15 | The testing script is quite simple. It accepts 5 parameters (see the example at the end of this page).
16 |
17 | * **-input** specifies the input image file.
18 | * **-mask** specifies the mask file.
19 | * **-model** specifies the trained model file.
20 | * **-outdir** specifies the output directory; default is the current directory.
21 | * **-gpu** set to 0 to run on the CPU. Default is to use the GPU.
22 |
23 |
24 | The script outputs 5 images under outdir: albedo.png, shading.png and specular.png, plus copies of the input and mask (input.png and mask.png).
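For example, with the pretrained model.t7 from the Downloads section (the image file names here are placeholders):

```
mkdir -p results
th Test.lua -input photo.png -mask photo_mask.png -model model.t7 -outdir results
```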
25 |
--------------------------------------------------------------------------------