├── readme.md
├── render
│   ├── gen_script.py
│   ├── readme.md
│   ├── template-a.xml
│   ├── template-d.xml
│   ├── template-i.xml
│   ├── template-r.xml
│   └── template-s.xml
└── train
    ├── Criterion.lua
    ├── Dataset.lua
    ├── Network.lua
    ├── Patch.lua
    ├── Test.lua
    ├── Train.lua
    └── readme.md

/readme.md:
--------------------------------------------------------------------------------
1 | # ShapeNet-Intrinsics
2 | 
3 | This is the implementation of [Learning Non-Lambertian Object Intrinsics across ShapeNet Categories](https://arxiv.org/abs/1612.08510).
4 | 
5 | You might be interested in the synthetic dataset we used in the paper. The entire dataset takes more than 1 TB for the HDR images, and 240 GB even for compressed .jpg images, so it is hard to share online; we are still working on it ;)
6 | 
7 | However, you can still check the [rendering scripts](render), which can regenerate the dataset and produce additional outputs for your own needs, e.g. depth and normal images. [Training and testing scripts](train) are implemented in [torch](http://torch.ch/).
8 | 
9 | #### Downloads
10 | 
11 | The trained torch models and HDR environment maps can be accessed [here](https://1drv.ms/f/s!ApfQp_rip6el-X-neX32NGAE_aiC).
12 | 
13 | Note: there are two models (model.t7 and model_old.t7). Torch updated its API for SpatialUpSamplingBilinear, which is used in the model. The old one was trained with the old API (from before Oct 2016 or so), while model.t7 works with the current version of torch.
14 | 
15 | model.t7 was trained for 1M steps and was observed to overfit on the synthetic data.
16 | model_old.t7 was trained for 450k steps. It performs worse than model.t7 on synthetic data, but on some real data it may produce better results. To try the old model, simply uncomment line 10 in Test.lua (-- require 'Patch').
17 | 
18 | #### Torch is outdated...
19 | 
20 | Recently I found that people have switched to tensorflow and pytorch, while torch is no longer actively developed. I did spend some time trying to migrate this work to tensorflow, but the exact same network structure did not work there. The major problem seems to be the ReLU in the network: in tensorflow the network produces an all-black image, with pixel values and gradients clamped. I tried different learning rates and optimizers but could not make it work; in torch everything is OK. Although I could use LeakyReLU in tensorflow, that differs from the original version, so I suspect there is some difference in the internal implementation of these frameworks. I don't have enough time to make it work, but I managed to compile torch and run the model on a relatively new platform: **ubuntu 18.04, cuda 9.0 and cudnn 7**, which I think is acceptable for most people.
21 | 
22 | #### Running the code...
23 | 
24 | Here are the notes for running the code with torch under ubuntu 18.04, cuda 9.0 and cudnn 7.
25 | 
26 | 1. Clone the torch repo as usual:
27 | ```
28 | git clone https://github.com/torch/distro.git ~/torch --recursive
29 | ```
30 | 2. Modify install-deps: line 178 runs 'sudo apt-get install -y python-software-properties', but in ubuntu 18.04 that package is replaced by 'software-properties-common'. Either replace it in the script or install the package manually.
31 | 3. Set the compiler to gcc-6, the maximum version supported by cuda-9.0. First install gcc-6 with apt, then make it the default gcc:
32 | ```
33 | sudo apt install gcc-6
34 | sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-6 10
35 | sudo update-alternatives --config gcc
36 | ```
37 | 4. Install:
38 | ```
39 | export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"
40 | ./install.sh
41 | ```
42 | 5. If everything is OK, torch will now run... on cudnn5. Next we need to switch to cudnn7:
43 | ```
44 | cd extra/cudnn
45 | git fetch
46 | git checkout R7
47 | luarocks make cudnn-scm-1.rockspec
48 | ```
49 | Now you can probably run torch.
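If the build succeeds, a quick sanity check from the torch REPL confirms that the GPU is visible and the cudnn R7 bindings (rather than the cudnn5 ones) are the ones being loaded. This is a minimal sketch, not part of the original instructions; it assumes a working CUDA device, and the exact value printed depends on your cudnn install.

```lua
-- quick check that cutorch sees the GPU and the cudnn 7 bindings are loaded
require 'cutorch'
require 'cudnn'
print(cutorch.getDeviceProperties(cutorch.getDevice()).name)
print(cudnn.version)  -- should report a 7xxx value once the R7 branch is installed
```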
--------------------------------------------------------------------------------
/render/gen_script.py:
--------------------------------------------------------------------------------
1 | # generate rendering scripts
2 | import os
3 | import platform
4 | import random
5 | import math
6 | import urllib.request
7 | 
8 | ## for testing...
9 | #os.environ["MITSUBA"] = "D:/Develope/Project/mitsuba_plugins/build/mitsuba/binaries/MinSizeRel/mitsuba.exe"
10 | #os.environ["SHAPENET_ROOT"] = "E:/ShapeNet/ShapeNetCore.v1/models"
11 | #os.environ["RENDER_ROOT"] = "E:/ShapeNet/Render"
12 | #os.environ["ENVMAP_ROOT"] = "E:/ShapeNet/Envmap"
13 | 
14 | # fix random seed
15 | random.seed(0)
16 | # generate a random viewpoint
17 | def RandomView():
18 |     theta = random.random() * math.pi * 0.5 # theta should be in [0, 0.5PI)
19 |     phi = random.random() * math.pi * 2.0 # [0, 2PI)
20 |     x = math.sin(theta) * math.cos(phi)
21 |     y = math.cos(theta)
22 |     z = math.sin(theta) * math.sin(phi)
23 | 
24 |     return "\"%f,%f,%f\"" % (x * 2, y * 2, z * 2)
25 | 
26 | # rendering options
27 | # useful options: -q quiet, -x skip existing output
28 | options = "-q"
29 | 
30 | # templates
31 | script_dir = os.path.dirname(os.path.abspath(__file__))
32 | template_i = os.path.join(script_dir, "template-i.xml")
33 | template_a = os.path.join(script_dir, "template-a.xml")
34 | template_s = os.path.join(script_dir, "template-s.xml")
35 | template_r = os.path.join(script_dir, "template-r.xml")
36 | template_d = os.path.join(script_dir, "template-d.xml")
37 | 
38 | # the mitsuba executable binary
39 | MITSUBA = os.environ["MITSUBA"]
40 | # the ShapeNet model repository directory
41 | SHAPENET_ROOT = os.environ["SHAPENET_ROOT"]
42 | # the environment map root folder
43 | ENVMAP_ROOT = os.environ["ENVMAP_ROOT"]
44 | # where to put rendering output
45 | RENDER_ROOT = os.environ["RENDER_ROOT"]
46 | 
47 | # first we create a rendering output directory
48 | if not os.path.exists(RENDER_ROOT):
49 |     os.makedirs(RENDER_ROOT)
50 | 
51 | # download all.csv from shapenet, which contains the dataset split
52 | list_url = "http://shapenet.cs.stanford.edu/shapenet/obj-zip/SHREC16/all.csv"
53 | list_file = os.path.join(RENDER_ROOT, "dataset.csv")
54 | if not os.path.exists(list_file):
55 |     print("Download model list from ShapeNet website...")
56 |     u = urllib.request.urlretrieve(list_url, list_file)
57 | 
58 | # load environment map list
59 | # suppose there is a list.txt under ENVMAP_ROOT
60 | envmaps = []
61 | for line in open(os.path.join(ENVMAP_ROOT, "list.txt")):
62 |     env_file = os.path.join(ENVMAP_ROOT, line[:-1])
63 |     #print(env_file)
64 |     if os.path.exists(env_file):
65 |         envmaps.append(env_file)
66 | 
67 | model_list = open(list_file, 'r')
68 | 
69 | first_row = True
70 | for row in model_list:
71 |     # skip the header row
72 |     if first_row:
73 |         first_row = False
74 |         continue
75 | 
76 |     cols = row.split(",")
77 | 
78 |     idx = cols[0] # model index
79 |     category = cols[1] # model category
80 |     uuid = cols[3] # model ID
81 | 
82 |     # the model .obj file
83 |     model_file = os.path.join(SHAPENET_ROOT, category, uuid, "model.obj")
84 |     if not os.path.exists(model_file):
85 |         print("Model %s: file does not exist!" % idx)
86 |         print(model_file)
87 |     else:
88 |         print("Model %s, %s, %s" % (idx, category, uuid))
89 | 
90 |         # make output directory
91 |         output_dir = os.path.join(RENDER_ROOT, idx)
92 |         if not os.path.exists(output_dir):
93 |             os.makedirs(output_dir)
94 | 
95 |         # we do not do any rendering in this script
96 |         # instead, we write rendering scripts for each model
97 |         # and use a synthesize script to call ImageMagick to synthesize the image
98 |         if platform.system() == "Windows":
99 |             render_script = open(os.path.join(output_dir, "render.bat"), 'w')
100 |             syn_script = open(os.path.join(output_dir, "synthesize.bat"), 'w')
101 |         else: # assume linux or macos
102 |             render_script = open(os.path.join(output_dir, "render.sh"), 'w')
103 |             syn_script = open(os.path.join(output_dir, "synthesize.sh"), 'w')
104 | 
105 |         # random specular properties
106 |         ks = random.random() * 0.2
107 |         ns = random.random() * 1000
108 |         # diffuse term
109 |         kd = 1.0 - ks
110 | 
111 |         for env_file in envmaps:
112 |             output_name = os.path.splitext(os.path.basename(env_file))[0]
113 | 
114 |             # generate a random viewpoint
115 |             view = RandomView()
116 | 
117 |             # variables for rendering
118 |             var = "-Dmodel=\"%s\" -Denv=\"%s\" -Dview=%s -Dkd=%f -Dks=%f -Dns=%f" % (model_file, env_file, view, kd, ks, ns)
119 | 
120 |             # a base command
121 | 
122 |             # render commands for image, albedo, shading, specular and depth
123 |             # we do not need to render the image directly since we can synthesize it by I = A*S + R
124 |             #script.write("%s %s %s %s -o %s-i\n" % (MITSUBA, template_i, options, var, output_name))
125 | 
126 |             # rendering albedo
127 |             render_script.write("%s %s %s %s -o %s_a\n" % (MITSUBA, template_a, options, var, output_name))
128 | 
129 |             # rendering shading
130 |             render_script.write("%s %s %s %s -o %s_s\n" % (MITSUBA, template_s, options, var, output_name))
131 | 
132 |             # rendering specular
133 |             render_script.write("%s %s %s %s -o %s_r\n" % (MITSUBA, template_r, options, var, output_name))
134 | 
135 |             # rendering depth, which can be used to generate the object mask
136 |             render_script.write("%s %s %s %s -o %s_d\n" % (MITSUBA, template_d, options, var, output_name))
137 | 
138 | 
139 |             # synthesize script commands
140 |             # we use .jpg LDR images instead of .exr HDR images to save disk space
141 | 
142 |             # first generate the mask from depth
143 |             syn_script.write("magick convert -format png -alpha off -depth 1 %s_d.exr %s_m.png\n" % (output_name,output_name))
144 | 
145 |             # albedo, shading and specular
146 |             syn_script.write("magick mogrify -format jpg %s_a.exr\n" % output_name)
147 |             syn_script.write("magick mogrify -format jpg %s_s.exr\n" % output_name)
148 |             syn_script.write("magick mogrify -format jpg %s_r.exr\n" % output_name)
149 | 
150 |             # synthesize the image
151 |             syn_script.write("magick composite -compose Multiply %s_a.exr %s_s.exr temp.exr\n" % (output_name,output_name))
152 |             syn_script.write("magick composite -compose Plus %s_r.exr temp.exr -format jpg %s_i.jpg\n" % (output_name,output_name))
153 | 
154 | 
155 |         # clean up
156 |         if platform.system() == "Windows":
157 |             syn_script.write("del temp.exr\n")
158 |         else:
159 |             syn_script.write("rm temp.exr\n")
160 | 
161 |         render_script.close()
162 |         syn_script.close()
163 | 
164 | model_list.close()
165 | 
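The synthesize scripts above rely on ImageMagick, but the same I = A*S + R composite can also be done from torch once the per-component images exist. The sketch below is not part of the original pipeline: it operates on the LDR .jpg outputs rather than the HDR .exr files (so it is not bit-identical to the ImageMagick result), and the file prefix is a made-up example.

```lua
-- compose image = albedo * shading + specular from already-converted LDR components
require 'image'

local prefix = '000001/grace'                    -- hypothetical <model index>/<envmap name> prefix
local a = image.load(prefix .. '_a.jpg', 3, 'float')
local s = image.load(prefix .. '_s.jpg', 3, 'float')
local r = image.load(prefix .. '_r.jpg', 3, 'float')

local i = torch.cmul(a, s):add(r):clamp(0, 1)    -- I = A*S + R, clamped to the valid LDR range
image.save(prefix .. '_i.jpg', i)
```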
--------------------------------------------------------------------------------
/render/readme.md:
--------------------------------------------------------------------------------
1 | # Rendering Script
2 | 
3 | In this directory we provide rendering configuration templates for [mitsuba-shapenet](https://github.com/shi-jian/mitsuba-shapenet) to render the different components of a ShapeNet model.
4 | 
5 | ### Configuration templates
6 | 
7 | * [template-i.xml](template-i.xml) configuration for rendering the image.
8 | * [template-a.xml](template-a.xml) for albedo.
9 | * [template-s.xml](template-s.xml) for shading.
10 | * [template-r.xml](template-r.xml) for specular. We use a homogeneous specular term for the entire model.
11 | * [template-d.xml](template-d.xml) for depth, which is used to generate the object mask.
12 | 
13 | Note: in the experiments we rendered albedo/shading/specular and then synthesized the image as I = A*S + R. Depth is used to generate the object mask.
14 | 
15 | It might be useful to look into the albedo/depth configuration files if you want to render other 'fields' in mitsuba, such as normals.
16 | 
17 | ### Script for generating rendering scripts
18 | 
19 | [gen_script.py](gen_script.py) generates the rendering and synthesis scripts. Please set the following environment variables:
20 | 
21 | * **MITSUBA**
22 | points to the mitsuba renderer executable (e.g. mitsuba.exe on windows).
23 | 
24 | * **SHAPENET_ROOT**
25 | the directory containing the extracted ShapeNet models.
26 | 
27 | * **ENVMAP_ROOT**
28 | the directory containing the environment maps, with a 'list.txt' file. Each line of the list file is an environment map filename.
29 | 
30 | * **RENDER_ROOT**
31 | the directory where rendering scripts and results are written.
32 | 
33 | 
34 | Recently ShapeNet released an official dataset split. The script automatically downloads the model [list](http://shapenet.cs.stanford.edu/shapenet/obj-zip/SHREC16/all.csv) from ShapeNet, which contains the model indices, categories, uuids and the train/test/val split. It then creates an output directory for each model under RENDER_ROOT, together with two scripts: render.bat/render.sh and synthesize.bat/synthesize.sh.
35 | 
36 | * **render.bat**: renders albedo/shading/specular/depth as HDR images.
37 | * **synthesize.bat**: generates the mask image from depth, converts albedo/shading/specular from HDR to LDR (to save disk space), and synthesizes the image as I = A*S + R. [ImageMagick](http://www.imagemagick.org) is required for the synthesis step.
38 | 
39 | Then you can run these scripts from their directories. We strongly recommend rendering on a cluster: rendering a single model under 92 environment maps takes about 45 minutes on an old i7-2600 PC.
40 | 
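Once render.sh and synthesize.sh have finished for a model, its directory under RENDER_ROOT should contain, per environment map, exactly the five files that the training code (Train.lua in the train directory) later loads. A minimal torch-side check is sketched below; the model index '000001' and envmap name 'grace' are made-up examples.

```lua
-- check that one rendered sample has every file Train.lua expects to load
require 'paths'

local render_root = os.getenv('RENDER_ROOT') or '.'
local prefix = paths.concat(render_root, '000001', 'grace')   -- <model index>/<envmap name>

for _, suffix in ipairs({'_i.jpg', '_m.png', '_a.jpg', '_s.jpg', '_r.jpg'}) do
    local f = prefix .. suffix
    print(f, paths.filep(f) and 'ok' or 'missing')
end
```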
--------------------------------------------------------------------------------
/render/template-a.xml:
--------------------------------------------------------------------------------
(XML scene template; the file contents were not preserved in this dump.)
--------------------------------------------------------------------------------
/render/template-d.xml:
--------------------------------------------------------------------------------
(XML scene template; the file contents were not preserved in this dump.)
--------------------------------------------------------------------------------
/render/template-i.xml:
--------------------------------------------------------------------------------
(XML scene template; the file contents were not preserved in this dump.)
--------------------------------------------------------------------------------
/render/template-r.xml:
--------------------------------------------------------------------------------
(XML scene template; the file contents were not preserved in this dump.)
--------------------------------------------------------------------------------
/render/template-s.xml:
--------------------------------------------------------------------------------
(XML scene template; the file contents were not preserved in this dump.)
--------------------------------------------------------------------------------
/train/Criterion.lua:
--------------------------------------------------------------------------------
1 | require 'torch'
2 | require 'nn'
3 | require 'cunn'
4 | 
5 | -- weighted MSE criterion, scale invariant, and with mask
6 | local WeightedMSE, parent = torch.class('nn.WeightedMSE', 'nn.Criterion')
7 | 
8 | function WeightedMSE:__init(scale_invariant)
9 |     parent.__init(self)
10 |     -- we use a standard MSE criterion internally
11 |     self.criterion = nn.MSECriterion()
12 |     self.criterion.sizeAverage = false
13 | 
14 |     -- whether to use the scale-invariant loss
15 |     self.scale_invariant = scale_invariant or false
16 | end
17 | 
18 | -- targets should contain {target, weight}
19 | function WeightedMSE:updateOutput(pred, targets)
20 | 
21 |     local target = targets[1]
22 |     local weight = targets[2]
23 | 
24 |     -- scale-invariant: rescale the pred to the target scale
25 |     if self.scale_invariant then
26 | 
27 |         -- get the dimension and size
28 |         local dim = target:dim()
29 |         local size = target:size()
30 |         for i=1,dim-2 do
31 |             size[i] = 1
32 |         end
33 | 
34 |         -- scale invariant
35 |         local tensor1 = torch.cmul(pred, target)
36 |         local tensor2 = torch.cmul(pred, pred)
37 | 
38 |         -- get the least-squares scale between pred and target
39 |         self.scale = torch.cdiv(tensor1:sum(dim):sum(dim-1),tensor2:sum(dim):sum(dim-1))
40 |         -- patch NaN
41 |         self.scale[self.scale:ne(self.scale)] = 1
42 | 
43 |         -- constrain the scale to [0.1, 10]
44 |         self.scale:cmin(10)
45 |         self.scale:cmax(0.1)
46 | 
47 |         -- expand the scale
48 |         self.scale = self.scale:repeatTensor(size)
49 | 
50 |         -- re-scale the pred
51 |         pred:cmul(self.scale)
52 |     end
53 | 
54 |     -- sum of squared weights, used for normalization
55 |     self.alpha = torch.cmul(weight, weight):sum()
56 |     if self.alpha ~= 0 then
57 |         self.alpha = 1 / self.alpha
58 | end 59 | 60 | -- apply weight to pred and target, and keep a record for them so that we do not need to re-calculate 61 | self.weighted_pred = torch.cmul(pred, weight) 62 | self.weighted_target = torch.cmul(target, weight) 63 | 64 | return self.criterion:forward(self.weighted_pred, self.weighted_target) * self.alpha 65 | end 66 | 67 | function WeightedMSE:updateGradInput(input, target) 68 | 69 | self.grad = self.criterion:backward(self.weighted_pred, self.weighted_target) 70 | 71 | if self.scale then 72 | self.grad:cdiv(self.scale) 73 | -- patch NaN 74 | self.grad[self.grad:ne(self.grad)] = 0 75 | end 76 | 77 | return self.grad * self.alpha 78 | 79 | end 80 | 81 | function WeightedMSE:cuda() 82 | self.criterion:cuda() 83 | end 84 | 85 | 86 | -- build convolutional kernel for calculate image gradient 87 | local x = nn.SpatialConvolution(3,3,3,3,1,1,1,1) 88 | x.weight:zero() 89 | x.bias:zero() 90 | 91 | local y = nn.SpatialConvolution(3,3,3,3,1,1,1,1) 92 | y.weight:zero() 93 | y.bias:zero() 94 | 95 | for i = 1, 3 do 96 | x.weight[{i, i, {}, {}}] = torch.Tensor({{-1, 0, 1},{-2, 0, 2},{-1, 0, 1}}) 97 | y.weight[{i, i, {}, {}}] = torch.Tensor({{-1, -2, -1},{0, 0, 0},{1, 2, 1}}) 98 | end 99 | 100 | -- gradient 101 | local function Gradient(image) 102 | local ix = x:forward(image) 103 | local iy = y:forward(image) 104 | return torch.sqrt(torch.add(torch.cmul(ix, ix), torch.cmul(iy, iy))) 105 | end 106 | 107 | -- criterion for multiple output(albedo/shading/specular) 108 | IntrinsicCriterion, parent = torch.class('nn.IntrinsicCriterion', 'nn.Criterion') 109 | 110 | function IntrinsicCriterion:__init() 111 | 112 | -- criterions 113 | self.criterion_a = nn.MultiCriterion() 114 | self.criterion_a:add(nn.WeightedMSE(true), 0.95) 115 | self.criterion_a:add(nn.WeightedMSE(false), 0.05) 116 | 117 | -- for shading 118 | self.criterion_s = nn.MultiCriterion() 119 | self.criterion_s:add(nn.WeightedMSE(true), 0.95) 120 | self.criterion_s:add(nn.WeightedMSE(false), 0.05) 121 | 122 | -- for specular 123 | self.criterion_r = nn.MultiCriterion() 124 | self.criterion_r:add(nn.WeightedMSE(false), 1) 125 | 126 | end 127 | 128 | -- pred contains a table of {pred_albedo, pred_shading, pred_specular} 129 | -- target contains a table of {input, mask, albedo, shading, specular} 130 | function IntrinsicCriterion:updateOutput(pred, target) 131 | -- input might be useful to calculate weight 132 | local input = target[1] 133 | local mask = target[2] 134 | 135 | -- we can mask out background pixel of prediction here 136 | local gt_a = torch.cmul(mask, target[3]) 137 | local gt_s = torch.cmul(mask, target[4]) 138 | local gt_r = torch.cmul(mask, target[5]) 139 | 140 | local pd_a = torch.cmul(mask, pred[1]) 141 | local pd_s = torch.cmul(mask, pred[2]) 142 | local pd_r = torch.cmul(mask, pred[3]) 143 | 144 | -- calculate weight 145 | local weight 146 | local useGradient = false 147 | if useGradient then 148 | local gradient = torch.exp(Gradient(input)) 149 | weight = torch.cmul(mask, gradient) 150 | else 151 | weight = mask 152 | end 153 | 154 | self.loss_a = self.criterion_a:forward(pd_a, {gt_a, weight}) 155 | self.loss_s = self.criterion_s:forward(pd_s, {gt_s, weight}) 156 | self.loss_r = self.criterion_r:forward(pd_r, {gt_r, weight}) 157 | 158 | return {self.loss_a, self.loss_s, self.loss_r} 159 | end 160 | 161 | function IntrinsicCriterion:updateGradInput(pred, target) 162 | 163 | local input = target[1] 164 | local mask = target[2] 165 | local gt_a = target[3] 166 | local gt_s = target[4] 167 | local gt_r = target[5] 
168 | 169 | local pd_a = pred[1] 170 | local pd_s = pred[2] 171 | local pd_r = pred[3] 172 | 173 | self.grad_a = self.criterion_a:backward(pd_a, {gt_a, mask}) 174 | self.grad_s = self.criterion_s:backward(pd_s, {gt_s, mask}) 175 | self.grad_r = self.criterion_r:backward(pd_r, {gt_r, mask}) 176 | 177 | return {self.grad_a, self.grad_s, self.grad_r} 178 | end 179 | 180 | function IntrinsicCriterion:cuda() 181 | -- convert to cuda 182 | self.criterion_a:cuda() 183 | self.criterion_s:cuda() 184 | self.criterion_r:cuda() 185 | end 186 | 187 | 188 | 189 | 190 | 191 | 192 | 193 | -------------------------------------------------------------------------------- /train/Dataset.lua: -------------------------------------------------------------------------------- 1 | Dataset = {} 2 | Dataset.__index = Dataset 3 | 4 | -- a simple dataset class 5 | function Dataset.load(list1, list2) 6 | local dataset = {} 7 | setmetatable(dataset, Dataset) 8 | 9 | dataset.size1 = #list1 10 | dataset.size2 = #list2 11 | dataset.size = dataset.size1 * dataset.size2 12 | 13 | dataset.list1 = list1 14 | dataset.list2 = list2 15 | 16 | -- build a index table, we use tensor to get avoid gc 17 | dataset.idx = torch.LongStorage(dataset.size) 18 | for i=1,dataset.size do 19 | dataset.idx[i] = i 20 | end 21 | 22 | return dataset 23 | end 24 | 25 | function Dataset:shuffle(seed) 26 | math.randomseed(seed or 0) 27 | for i=self.size, 2, -1 do 28 | -- get a random index for shuffle 29 | local j = math.random(i) 30 | -- shuffle indices 31 | self.idx[i],self.idx[j] = self.idx[j],self.idx[i] 32 | end 33 | end 34 | 35 | function Dataset:get(index) 36 | -- keep index in [1, self.size] 37 | index = (index - 1) % self.size + 1 38 | 39 | local idx1 = (self.idx[index] - 1) % self.size1 + 1 40 | local idx2 = math.floor((self.idx[index] - 1) / self.size1 + 1) 41 | return self.list1[idx1], self.list2[idx2] 42 | end 43 | 44 | 45 | 46 | -------------------------------------------------------------------------------- /train/Network.lua: -------------------------------------------------------------------------------- 1 | require 'nn' 2 | require 'nngraph' 3 | require 'cunn' 4 | require 'cudnn' 5 | 6 | -- the network 7 | function IntrinsicNetwork() 8 | 9 | local SConv = nn.SpatialConvolution 10 | local SBatchNorm = nn.SpatialBatchNormalization 11 | local SUpSamp = nn.SpatialUpSamplingBilinear 12 | 13 | -- input image 14 | local image = nn.Identity()():annotate{ 15 | name = 'Input' 16 | } 17 | 18 | -- encoder layers 19 | local conv0 = nn.Sequential() 20 | conv0:add(SConv(3,16,3,3,1,1,1,1)) 21 | conv0:add(SBatchNorm(16)) 22 | conv0:add(nn.ReLU(true)) 23 | local conv00 = conv0(image) -- still 256x256 24 | 25 | local conv1 = nn.Sequential() 26 | conv1:add(SConv(16,32,3,3,2,2,1,1)) 27 | conv1:add(SBatchNorm(32)) 28 | conv1:add(nn.ReLU(true)) 29 | local conv10 = conv1(conv00) -- 128x128 30 | 31 | local conv2 = nn.Sequential() 32 | conv2:add(SConv(32,64,3,3,2,2,1,1)) 33 | conv2:add(SBatchNorm(64)) 34 | conv2:add(nn.ReLU(true)) 35 | local conv20 = conv2(conv10) -- 64x64 36 | 37 | local conv3 = nn.Sequential() 38 | conv3:add(SConv(64,128,3,3,2,2,1,1)) 39 | conv3:add(SBatchNorm(128)) 40 | conv3:add(nn.ReLU(true)) -- 32x32 41 | local conv30 = conv3(conv20) 42 | 43 | local conv4 = nn.Sequential() 44 | conv4:add(SConv(128,256,3,3,2,2,1,1)) 45 | conv4:add(SBatchNorm(256)) 46 | conv4:add(nn.ReLU(true)) -- 16x16 47 | local conv40 = conv4(conv30) 48 | 49 | local conv5 = nn.Sequential() 50 | conv5:add(SConv(256,256,3,3,2,2,1,1)) 51 | conv5:add(SBatchNorm(256)) 52 | 
conv5:add(nn.ReLU(true)) -- 8x8 53 | local conv50 = conv5(conv40) 54 | 55 | -- start decoder 56 | local mid = {} 57 | local deconvs0 = {} 58 | local deconvs1 = {} 59 | local deconvs2 = {} 60 | local deconvs3 = {} 61 | local deconvs4 = {} 62 | local outputs = {} 63 | 64 | for i=1,3 do 65 | local fc = nn.Sequential() 66 | fc:add(SConv(256,256,3,3,1,1,1,1)) 67 | fc:add(SBatchNorm(256)) 68 | fc:add(nn.ReLU(true)) -- 8x8 69 | fc:add(SConv(256,256,3,3,1,1,1,1)) 70 | fc:add(SBatchNorm(256)) 71 | fc:add(nn.ReLU(true)) -- 8x8 72 | fc:add(SConv(256,256,3,3,1,1,1,1)) 73 | fc:add(SBatchNorm(256)) 74 | fc:add(nn.ReLU(true)) -- 8x8 75 | fc:add(SConv(256,256,3,3,1,1,1,1)) 76 | fc:add(SBatchNorm(256)) 77 | fc:add(nn.ReLU(true)) -- 8x8 78 | mid[i] = fc(conv50) 79 | end 80 | mid[4] = conv50 81 | 82 | for i=1,3 do 83 | -- deconv and upsampling 84 | local deconv0 = nn.Sequential() 85 | deconv0:add(nn.JoinTable(2)) 86 | deconv0:add(SConv(1024,256,3,3,1,1,1,1)) 87 | deconv0:add(SBatchNorm(256)) 88 | deconv0:add(nn.ReLU(true)) 89 | deconv0:add(SUpSamp(2)) 90 | deconvs0[i] = deconv0(mid) -- 16x16 91 | end 92 | deconvs0[4] = conv40 93 | 94 | for i=1,3 do 95 | local deconv1 = nn.Sequential() 96 | deconv1:add(nn.JoinTable(2)) 97 | deconv1:add(SConv(1024,128,3,3,1,1,1,1)) 98 | deconv1:add(SBatchNorm(128)) 99 | deconv1:add(nn.ReLU(true)) 100 | deconv1:add(SUpSamp(2)) 101 | deconvs1[i] = deconv1(deconvs0) -- 32x32 102 | end 103 | deconvs1[4] = conv30 104 | 105 | for i=1,3 do 106 | local deconv2 = nn.Sequential() 107 | deconv2:add(nn.JoinTable(2)) 108 | deconv2:add(SConv(512,64,3,3,1,1,1,1)) 109 | deconv2:add(SBatchNorm(64)) 110 | deconv2:add(nn.ReLU(true)) 111 | deconv2:add(SUpSamp(2)) 112 | deconvs2[i] = deconv2(deconvs1) -- 64x64 113 | end 114 | deconvs2[4] = conv20 115 | 116 | for i=1,3 do 117 | local deconv3 = nn.Sequential() 118 | deconv3:add(nn.JoinTable(2)) 119 | deconv3:add(SConv(256,32,3,3,1,1,1,1)) 120 | deconv3:add(SBatchNorm(32)) 121 | deconv3:add(nn.ReLU(true)) 122 | deconv3:add(SUpSamp(2)) 123 | deconvs3[i] = deconv3(deconvs2) -- 128x128 124 | end 125 | deconvs3[4] = conv10 126 | 127 | for i=1,3 do 128 | local deconv4 = nn.Sequential() 129 | deconv4:add(nn.JoinTable(2)) 130 | deconv4:add(SConv(128,16,3,3,1,1,1,1)) 131 | deconv4:add(SBatchNorm(16)) 132 | deconv4:add(nn.ReLU(true)) 133 | deconv4:add(SUpSamp(2)) 134 | deconvs4[i] = deconv4(deconvs3) -- 256x256 135 | end 136 | deconvs4[4] = conv00 137 | 138 | for i=1,3 do 139 | -- output 140 | local output4 = nn.Sequential() 141 | output4:add(nn.JoinTable(2)) 142 | output4:add(SConv(64,16,3,3,1,1,1,1)) 143 | output4:add(SBatchNorm(16)) 144 | output4:add(nn.ReLU(true)) 145 | 146 | -- image resolution 147 | output4:add(SConv(16,3,3,3,1,1,1,1)) 148 | output4:add(SBatchNorm(3)) 149 | output4:add(nn.ReLU(true)) 150 | outputs[i] = output4(deconvs4) -- 3x256x256 151 | end 152 | 153 | return nn.gModule({image}, outputs) 154 | end 155 | 156 | -------------------------------------------------------------------------------- /train/Patch.lua: -------------------------------------------------------------------------------- 1 | require 'nn' 2 | 3 | local SpatialUpSamplingBilinear = nn.SpatialUpSamplingBilinear 4 | 5 | function SpatialUpSamplingBilinear:setSize(input) 6 | local xdim = input:dim() 7 | local ydim = xdim - 1 8 | for i = 1, input:dim() do 9 | self.inputSize[i] = input:size(i) 10 | self.outputSize[i] = input:size(i) 11 | end 12 | if self.scale_factor ~= nil then 13 | self.outputSize[ydim] = (self.outputSize[ydim]-1) * (self.scale_factor-1) 14 | + 
self.outputSize[ydim] 15 | self.outputSize[xdim] = (self.outputSize[xdim]-1) * (self.scale_factor -1) 16 | + self.outputSize[xdim] 17 | else 18 | self.outputSize[ydim] = self.oheight 19 | self.outputSize[xdim] = self.owidth 20 | end 21 | end 22 | 23 | -------------------------------------------------------------------------------- /train/Test.lua: -------------------------------------------------------------------------------- 1 | require 'torch' 2 | require 'paths' 3 | require 'image' 4 | require 'nn' 5 | require 'nngraph' 6 | require 'cunn' 7 | require 'cudnn' 8 | 9 | -- for old model 10 | -- require 'Patch' 11 | 12 | local cmd = torch.CmdLine() 13 | 14 | cmd:option('-input', '', 'input image') 15 | cmd:option('-mask', '', 'input mask') 16 | cmd:option('-model', '', 'model file') 17 | cmd:option('-outdir', '.', 'output directory') 18 | cmd:option('-gpu', 1, 'use GPU') 19 | 20 | local options = cmd:parse(arg) 21 | 22 | local input = torch.FloatTensor(1, 3, 256, 256) 23 | local mask = torch.FloatTensor(1, 3, 256, 256) 24 | 25 | -- load model 26 | local model = torch.load(options.model) 27 | 28 | -- load input image and mask 29 | input[{1, {}, {}, {}}] = image.scale(image.load(options.input, 3), 256, 256) 30 | mask[{1, {}, {}, {}}] = image.scale(image.load(options.mask, 3), 256, 256) 31 | 32 | if options.gpu == 0 then 33 | model:float() 34 | else 35 | model:cuda() 36 | input = input:cuda() 37 | mask = mask:cuda() 38 | end 39 | 40 | local pred = model:forward(input) 41 | 42 | -- save output 43 | image.save(paths.concat(options.outdir, 'albedo.png'), pred[1]:cmul(mask):squeeze()) 44 | image.save(paths.concat(options.outdir, 'shading.png'), pred[2]:cmul(mask):squeeze()) 45 | image.save(paths.concat(options.outdir, 'specular.png'), pred[3]:cmul(mask):squeeze()) 46 | -- save a copy of input 47 | image.save(paths.concat(options.outdir, 'input.png'), input:squeeze()) 48 | image.save(paths.concat(options.outdir, 'mask.png'), mask:squeeze()) 49 | 50 | --EOF 51 | 52 | -------------------------------------------------------------------------------- /train/Train.lua: -------------------------------------------------------------------------------- 1 | require 'torch' 2 | require 'paths' 3 | require 'image' 4 | 5 | -- 6 | require 'Dataset' 7 | require 'Network' 8 | require 'Criterion' 9 | 10 | -- parse command line parameters 11 | local cmd = torch.CmdLine() 12 | 13 | cmd:option('-data_root', os.getenv('RENDER_ROOT') or '', 'the dataset root directory') 14 | cmd:option('-model_list', paths.concat(os.getenv('RENDER_ROOT') or '','dataset.csv')) 15 | cmd:option('-env_list', paths.concat(os.getenv('ENVMAP_ROOT') or '','list.txt')) 16 | cmd:option('-outdir', '.', 'output directory') 17 | 18 | cmd:option('-max_iter', 1000000, 'max training iteration') 19 | cmd:option('-save_iter', 10000, 'save snapshot') 20 | cmd:option('-test_iter', 10000, 'test') 21 | cmd:option('-disp_iter', 10, 'display error and training result') 22 | 23 | cmd:option('-snapshot', '', 'snapshot file') 24 | 25 | cmd:option('-batch_size', 4, 'batch size') 26 | cmd:option('-cuda', true, 'use cuda') 27 | cmd:option('-cudnn', true, 'use cudnn') 28 | cmd:option('-devid', 1, 'cuda device index') 29 | cmd:option('-seed', 666, 'random seed') 30 | 31 | local options = cmd:parse(arg) 32 | 33 | -- check input files 34 | if not paths.filep(options.model_list) then 35 | print('Please check model_list file!') 36 | os.exit(1) 37 | end 38 | 39 | if not paths.filep(options.env_list) then 40 | print('Please check env_list file!') 41 | os.exit(1) 42 | end 43 
| 44 | -- load ShapeNet model list 45 | local model_train = {} 46 | local model_test = {} 47 | local model_val = {} 48 | 49 | -- read model list 50 | for line in io.lines(options.model_list) do 51 | local split = {} 52 | for token in line:gmatch('([^,]+)') do 53 | table.insert(split, token) 54 | end 55 | 56 | -- split shapenet model for training 57 | if split[5] == 'train' then 58 | table.insert(model_train, split[1]) 59 | elseif split[5] == 'test' then 60 | table.insert(model_test, split[1]) 61 | elseif split[5] == 'val' then 62 | table.insert(model_val, split[1]) 63 | end 64 | end 65 | 66 | -- print(#model_train, #model_test, #model_val) 67 | 68 | -- load environment map list 69 | local env_train = {} 70 | local env_test = {} 71 | local env_val = {} 72 | 73 | for line in io.lines(options.env_list) do 74 | local split = {} 75 | for token in paths.basename(line):gmatch('([^.]+)') do 76 | table.insert(split, token) 77 | end 78 | 79 | -- we use all environment maps for both training and testing 80 | -- since we have relative small size environment map dataset, it is not a good idea to random split it. 81 | -- experiments showed that reasonably split environment maps would produce results very close to no-split setting 82 | table.insert(env_train, split[1]) 83 | table.insert(env_test, split[1]) 84 | table.insert(env_val, split[1]) 85 | end 86 | 87 | -- create training dataset 88 | local dataset = Dataset.load(model_train, env_train) 89 | dataset:shuffle() 90 | print('Training dataset size:', dataset.size) 91 | 92 | -- pre-allocate memory, hard code for image resolution 93 | local input = torch.Tensor(options.batch_size, 3, 256, 256) 94 | local mask = torch.Tensor(options.batch_size, 3, 256, 256) 95 | local target_albedo = torch.Tensor(options.batch_size, 3, 256, 256) 96 | local target_shading = torch.Tensor(options.batch_size, 3, 256, 256) 97 | local target_specular = torch.Tensor(options.batch_size, 3, 256, 256) 98 | 99 | torch.setdefaulttensortype('torch.FloatTensor') 100 | torch.manualSeed(options.seed) 101 | if options.cuda then 102 | cutorch.setDevice(options.devid) 103 | cutorch.manualSeed(options.seed) 104 | end 105 | 106 | -- network and criterion 107 | local network 108 | 109 | if options.snapshot and paths.filep(options.snapshot) then 110 | -- load from snapshot 111 | print('Load snapshot file...') 112 | network = torch.load(options.snapshot) 113 | else 114 | network = IntrinsicNetwork() 115 | end 116 | 117 | local criterion = nn.IntrinsicCriterion() 118 | 119 | -- cuda 120 | if options.cuda then 121 | require 'cunn' 122 | 123 | input = input:cuda() 124 | mask = mask:cuda() 125 | target_albedo = target_albedo:cuda() 126 | target_shading = target_shading:cuda() 127 | target_specular = target_specular:cuda() 128 | 129 | network:cuda() 130 | criterion:cuda() 131 | 132 | if options.cudnn then 133 | require 'cudnn' 134 | --cudnn.fastest = true 135 | cudnn.convert(network, cudnn) 136 | end 137 | end 138 | 139 | -- for solver 140 | local x, dl_dx = network:getParameters() 141 | 142 | local optim = require('optim') 143 | --local solver = optim['adam'] 144 | local solver = optim['adadelta'] 145 | local state = { 146 | learningRate = 0.01, 147 | learningRateDecay = 1e-5 148 | } 149 | 150 | -- for display images and error 151 | local display = require 'display' 152 | 153 | -- timer 154 | local timer = torch.Timer() 155 | timer:reset() 156 | 157 | local loss_albedo = 0 158 | local loss_shading = 0 159 | local loss_specular = 0 160 | 161 | local loss_table_train = {} 162 | local loss_table_test 
= {} 163 | 164 | local plot_config_train = { 165 | title = "Training Loss", 166 | labels = {"iter", "albedo", "shading", "specular"}, 167 | ylabel = "Weighted MSE", 168 | } 169 | 170 | local plot_config_test = { 171 | title = "Testing Loss", 172 | labels = {"iter", "albedo", "shading", "specular"}, 173 | ylabel = "Weighted MSE", 174 | } 175 | 176 | local iter = 0 177 | local curr = 0 -- current sample index 178 | while iter < options.max_iter do 179 | iter = iter + 1 180 | 181 | network:training() 182 | 183 | for i=1,options.batch_size do 184 | curr = curr + 1 185 | local model, env = dataset:get(curr) 186 | local prefix = paths.concat(options.data_root, model, env) 187 | 188 | -- load data 189 | input[{i,{},{},{}}] = image.load(prefix..'_i.jpg', 3) 190 | mask[{i,{},{},{}}] = image.load(prefix..'_m.png', 3) 191 | 192 | target_albedo[{i,{},{},{}}] = image.load(prefix..'_a.jpg', 3) 193 | target_shading[{i,{},{},{}}] = image.load(prefix..'_s.jpg', 3) 194 | target_specular[{i,{},{},{}}] = image.load(prefix..'_r.jpg', 3) 195 | end 196 | 197 | -- mask out input background? 198 | -- input:cmul(mask) 199 | 200 | -- forward 201 | local pred = network:forward(input) 202 | local pred_albedo = pred[1]:cmul(mask) 203 | local pred_shading = pred[2]:cmul(mask) 204 | local pred_specular = pred[3]:cmul(mask) 205 | 206 | -- get loss 207 | local loss = criterion:forward(pred, {input, mask, target_albedo, target_shading, target_specular}) 208 | loss_albedo = loss_albedo + loss[1] 209 | loss_shading = loss_shading + loss[2] 210 | loss_specular = loss_specular + loss[3] 211 | 212 | -- get gradient 213 | local grad = criterion:backward(pred, {input, mask, target_albedo, target_shading, target_specular}) 214 | 215 | -- optimize 216 | local function feval() 217 | -- update parameters 218 | network:zeroGradParameters() 219 | network:backward(input,grad) 220 | return loss, dl_dx 221 | end 222 | solver(feval, x, state) 223 | 224 | 225 | if iter % options.disp_iter == 0 then 226 | -- display training images and plot error 227 | win_image = display.image(input, {win=win_image,title='Input Image, iter:'..iter}) 228 | win_a0 = display.image(torch.cat(target_albedo,pred_albedo, 4), {win=win_a0,title='Albedo, iter:'..iter}) 229 | win_s0 = display.image(torch.cat(target_shading,pred_shading, 4), {win=win_s0,title='Shading, iter:'..iter}) 230 | win_r0 = display.image(torch.cat(target_specular,pred_specular, 4), {win=win_r0,title='Specular, iter:'..iter}) 231 | 232 | loss_albedo = loss_albedo / options.disp_iter 233 | loss_shading = loss_shading / options.disp_iter 234 | loss_specular = loss_specular / options.disp_iter 235 | 236 | table.insert(loss_table_train, {iter, loss_albedo, loss_shading, loss_specular}) 237 | plot_config_train.win = display.plot(loss_table_train, plot_config_train) 238 | 239 | -- print to console 240 | print(string.format("Iteration %d, %d/%d samples, %.2f seconds passed", 241 | iter, curr, dataset.size, timer:time().real)) 242 | print(string.format("\t Loss-a: %.4f, Loss-s: %.4f, Loss-r: %.4f", 243 | loss_albedo, loss_shading, loss_specular)) 244 | 245 | loss_albedo = 0 246 | loss_shading = 0 247 | loss_specular = 0 248 | end 249 | 250 | if iter % options.test_iter == 0 then 251 | -- testing... 
252 |         network:evaluate()
253 | 
254 |         local test_loss_a = 0
255 |         local test_loss_s = 0
256 |         local test_loss_r = 0
257 | 
258 |         for k = 1,#model_test,options.batch_size do
259 | 
260 |             for i=1, options.batch_size do
261 |                 local idx0 = (k + i - 2) % #model_test + 1
262 |                 local idx1 = (k + i - 2) % #env_test + 1
263 | 
264 |                 -- go through the test models
265 |                 local model = model_test[idx0]
266 |                 -- we only use a single envmap per model to save testing time
267 |                 local env = env_test[idx1]
268 | 
269 |                 local prefix = paths.concat(options.data_root, model, env)
270 | 
271 |                 -- load data
272 |                 input[{i,{},{},{}}] = image.load(prefix..'_i.jpg', 3)
273 |                 mask[{i,{},{},{}}] = image.load(prefix..'_m.png', 3)
274 | 
275 |                 target_albedo[{i,{},{},{}}] = image.load(prefix..'_a.jpg', 3)
276 |                 target_shading[{i,{},{},{}}] = image.load(prefix..'_s.jpg', 3)
277 |                 target_specular[{i,{},{},{}}] = image.load(prefix..'_r.jpg', 3)
278 |             end
279 | 
280 |             -- forward
281 |             local pred = network:forward(input)
282 | 
283 |             -- get loss
284 |             local loss = criterion:forward(pred, {input, mask, target_albedo, target_shading, target_specular})
285 |             test_loss_a = test_loss_a + loss[1]
286 |             test_loss_s = test_loss_s + loss[2]
287 |             test_loss_r = test_loss_r + loss[3]
288 | 
289 |         end
290 | 
291 |         test_loss_a = test_loss_a / #model_test
292 |         test_loss_s = test_loss_s / #model_test
293 |         test_loss_r = test_loss_r / #model_test
294 | 
295 |         print(string.format("Evaluation, Loss-a: %.4f, Loss-s: %.4f, Loss-r: %.4f",
296 |             test_loss_a, test_loss_s, test_loss_r))
297 | 
298 |         table.insert(loss_table_test, {iter, test_loss_a, test_loss_s, test_loss_r})
299 |         plot_config_test.win = display.plot(loss_table_test, plot_config_test)
300 |     end
301 | 
302 |     if iter % options.save_iter == 0 then
303 |         -- save model
304 |         print("Save model on iteration", iter)
305 |         network:clearState()
306 |         torch.save(paths.concat(options.outdir, 'snapshot_'..iter..'.t7'), network)
307 |     end
308 | 
309 |     -- manually trigger GC
310 |     collectgarbage()
311 | 
312 | end
313 | 
314 | 
315 | 
316 | 
317 | 
--------------------------------------------------------------------------------
/train/readme.md:
--------------------------------------------------------------------------------
1 | # Training/Testing scripts
2 | 
3 | This directory provides the network structure, criterion, and training and testing scripts.
4 | 
5 | ## Train
6 | 
7 | Some parameters:
8 | * **-data_root** specifies the root directory of the ShapeNet rendering images.
9 | * **-model_list** specifies the .csv file containing model ids and the dataset split, downloaded from the [ShapeNet](http://shapenet.cs.stanford.edu/shapenet/obj-zip/SHREC16/all.csv) website.
10 | * **-env_list** specifies the environment map list file.
11 | * **-outdir** specifies the output directory for saving snapshots. Default is the current directory.
12 | 
13 | ## Test
14 | 
15 | The testing script is quite simple. It accepts 5 parameters:
16 | 
17 | * **-input** specifies the input image file.
18 | * **-mask** specifies the mask file.
19 | * **-model** specifies the trained model file.
20 | * **-outdir** specifies the output directory; default is the current directory.
21 | * **-gpu** set to 0 to run on the CPU; the default is to use the GPU.
22 | 
23 | 
24 | The script outputs 5 images under outdir: albedo.png, shading.png and specular.png, as well as copies of input.png and mask.png.
25 | 
--------------------------------------------------------------------------------
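As a final note on how -data_root, -model_list and -env_list fit together: Train.lua pairs every training model id with every environment map name through Dataset.lua, so the effective number of training samples is the number of models times the number of envmaps. Below is a minimal sketch of that pairing, run from the train directory; the model and envmap ids are made up for illustration.

```lua
-- illustrate the (model, envmap) pairing performed by Dataset.lua for Train.lua
require 'torch'
dofile('Dataset.lua')

local models = {'000001', '000002', '000003'}   -- model indices from dataset.csv (made up here)
local envs   = {'grace', 'uffizi'}              -- envmap names from list.txt (made up here)

local ds = Dataset.load(models, envs)
print(ds.size)          -- 3 * 2 = 6 samples
ds:shuffle()            -- deterministic shuffle (seed defaults to 0 in Dataset.lua)
print(ds:get(1))        -- one (model, envmap) pair, used to build the image file prefix
```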