├── .gitignore ├── DSL_ImgProcess ├── README.MD ├── batch_test_script_leftpart.sh ├── batch_test_script_mainbody.sh ├── cifar_input.py ├── data │ ├── __init__.py │ ├── cifar10_data.py │ ├── cifar10_plotdata.py │ ├── imagenet_data.py │ └── pixelcnn_samples.png ├── data2 │ ├── __init__.py │ ├── cifar10_data.py │ ├── cifar10_plotdata.py │ ├── imagenet_data.py │ └── pixelcnn_samples.png ├── data4 │ ├── __init__.py │ ├── cifar10_data.py │ ├── cifar10_plotdata.py │ └── imagenet_data.py ├── example.sh ├── monitor.py ├── monitor_GivenCLM.py ├── monitor_checkleft.py ├── pixel_cnn_pp │ ├── __init__.py │ ├── model.py │ ├── nn.py │ └── plotting.py ├── resnet_model_basic.py ├── test_grab_train_loss.py ├── train.py ├── worker_I2L.py └── worker_L2I.py ├── DSL_SentimentAnalysis ├── CLM │ ├── CLM.py │ ├── Data.py │ ├── __init__.py │ ├── ipdb │ │ ├── __init__.py │ │ └── __main__.py │ ├── multiverso │ │ ├── Multiverso.dll │ │ ├── __init__.py │ │ ├── api.py │ │ ├── tables.py │ │ ├── tests │ │ │ └── test_multiverso.py │ │ ├── theano_ext │ │ │ ├── __init__.py │ │ │ ├── lasagne_ext │ │ │ │ ├── __init__.py │ │ │ │ └── param_manager.py │ │ │ └── sharedvar.py │ │ └── utils.py │ ├── nmt_base.py │ ├── training_scripts │ │ ├── train_clm_WithDropout_lr0.5.py │ │ ├── train_clm_nodr.py │ │ └── train_clm_nodr_lr0.5.py │ └── worker.bat ├── Classifier │ ├── Data.py │ ├── Layers.py │ ├── Models.py │ ├── Util.py │ ├── __init__.py │ └── worker.bat ├── Data.py ├── README.md ├── Util_basic.py ├── config.py ├── data │ └── readme.txt ├── inference.py ├── ipdb │ ├── __init__.py │ └── __main__.py ├── model.npz.pkl ├── monitor.py ├── train.bat ├── train_linux.sh ├── valid.bat └── valid_linux.sh ├── LICENSE ├── README.md └── SECURITY.md /.gitignore: -------------------------------------------------------------------------------- 1 | ## Ignore Visual Studio temporary files, build results, and 2 | ## files generated by popular Visual Studio add-ons. 
3 | ## 4 | ## Get latest from https://github.com/github/gitignore/blob/master/VisualStudio.gitignore 5 | 6 | # User-specific files 7 | *.suo 8 | *.user 9 | *.userosscache 10 | *.sln.docstates 11 | 12 | # User-specific files (MonoDevelop/Xamarin Studio) 13 | *.userprefs 14 | 15 | # Build results 16 | [Dd]ebug/ 17 | [Dd]ebugPublic/ 18 | [Rr]elease/ 19 | [Rr]eleases/ 20 | x64/ 21 | x86/ 22 | bld/ 23 | [Bb]in/ 24 | [Oo]bj/ 25 | [Ll]og/ 26 | 27 | # Visual Studio 2015 cache/options directory 28 | .vs/ 29 | # Uncomment if you have tasks that create the project's static files in wwwroot 30 | #wwwroot/ 31 | 32 | # MSTest test Results 33 | [Tt]est[Rr]esult*/ 34 | [Bb]uild[Ll]og.* 35 | 36 | # NUNIT 37 | *.VisualState.xml 38 | TestResult.xml 39 | 40 | # Build Results of an ATL Project 41 | [Dd]ebugPS/ 42 | [Rr]eleasePS/ 43 | dlldata.c 44 | 45 | # .NET Core 46 | project.lock.json 47 | project.fragment.lock.json 48 | artifacts/ 49 | **/Properties/launchSettings.json 50 | 51 | *_i.c 52 | *_p.c 53 | *_i.h 54 | *.ilk 55 | *.meta 56 | *.obj 57 | *.pch 58 | *.pdb 59 | *.pgc 60 | *.pgd 61 | *.rsp 62 | *.sbr 63 | *.tlb 64 | *.tli 65 | *.tlh 66 | *.tmp 67 | *.tmp_proj 68 | *.log 69 | *.vspscc 70 | *.vssscc 71 | .builds 72 | *.pidb 73 | *.svclog 74 | *.scc 75 | 76 | # Chutzpah Test files 77 | _Chutzpah* 78 | 79 | # Visual C++ cache files 80 | ipch/ 81 | *.aps 82 | *.ncb 83 | *.opendb 84 | *.opensdf 85 | *.sdf 86 | *.cachefile 87 | *.VC.db 88 | *.VC.VC.opendb 89 | 90 | # Visual Studio profiler 91 | *.psess 92 | *.vsp 93 | *.vspx 94 | *.sap 95 | 96 | # TFS 2012 Local Workspace 97 | $tf/ 98 | 99 | # Guidance Automation Toolkit 100 | *.gpState 101 | 102 | # ReSharper is a .NET coding add-in 103 | _ReSharper*/ 104 | *.[Rr]e[Ss]harper 105 | *.DotSettings.user 106 | 107 | # JustCode is a .NET coding add-in 108 | .JustCode 109 | 110 | # TeamCity is a build add-in 111 | _TeamCity* 112 | 113 | # DotCover is a Code Coverage Tool 114 | *.dotCover 115 | 116 | # Visual Studio code coverage results 117 | *.coverage 118 | *.coveragexml 119 | 120 | # NCrunch 121 | _NCrunch_* 122 | .*crunch*.local.xml 123 | nCrunchTemp_* 124 | 125 | # MightyMoose 126 | *.mm.* 127 | AutoTest.Net/ 128 | 129 | # Web workbench (sass) 130 | .sass-cache/ 131 | 132 | # Installshield output folder 133 | [Ee]xpress/ 134 | 135 | # DocProject is a documentation generator add-in 136 | DocProject/buildhelp/ 137 | DocProject/Help/*.HxT 138 | DocProject/Help/*.HxC 139 | DocProject/Help/*.hhc 140 | DocProject/Help/*.hhk 141 | DocProject/Help/*.hhp 142 | DocProject/Help/Html2 143 | DocProject/Help/html 144 | 145 | # Click-Once directory 146 | publish/ 147 | 148 | # Publish Web Output 149 | *.[Pp]ublish.xml 150 | *.azurePubxml 151 | # TODO: Comment the next line if you want to checkin your web deploy settings 152 | # but database connection strings (with potential passwords) will be unencrypted 153 | *.pubxml 154 | *.publishproj 155 | 156 | # Microsoft Azure Web App publish settings. Comment the next line if you want to 157 | # checkin your Azure Web App publish settings, but sensitive information contained 158 | # in these scripts will be unencrypted 159 | PublishScripts/ 160 | 161 | # NuGet Packages 162 | *.nupkg 163 | # The packages folder can be ignored because of Package Restore 164 | **/packages/* 165 | # except build/, which is used as an MSBuild target. 
166 | !**/packages/build/ 167 | # Uncomment if necessary however generally it will be regenerated when needed 168 | #!**/packages/repositories.config 169 | # NuGet v3's project.json files produces more ignorable files 170 | *.nuget.props 171 | *.nuget.targets 172 | 173 | # Microsoft Azure Build Output 174 | csx/ 175 | *.build.csdef 176 | 177 | # Microsoft Azure Emulator 178 | ecf/ 179 | rcf/ 180 | 181 | # Windows Store app package directories and files 182 | AppPackages/ 183 | BundleArtifacts/ 184 | Package.StoreAssociation.xml 185 | _pkginfo.txt 186 | 187 | # Visual Studio cache files 188 | # files ending in .cache can be ignored 189 | *.[Cc]ache 190 | # but keep track of directories ending in .cache 191 | !*.[Cc]ache/ 192 | 193 | # Others 194 | ClientBin/ 195 | ~$* 196 | *~ 197 | *.dbmdl 198 | *.dbproj.schemaview 199 | *.jfm 200 | *.pfx 201 | *.publishsettings 202 | orleans.codegen.cs 203 | 204 | # Since there are multiple workflows, uncomment next line to ignore bower_components 205 | # (https://github.com/github/gitignore/pull/1529#issuecomment-104372622) 206 | #bower_components/ 207 | 208 | # RIA/Silverlight projects 209 | Generated_Code/ 210 | 211 | # Backup & report files from converting an old project file 212 | # to a newer Visual Studio version. Backup files are not needed, 213 | # because we have git ;-) 214 | _UpgradeReport_Files/ 215 | Backup*/ 216 | UpgradeLog*.XML 217 | UpgradeLog*.htm 218 | 219 | # SQL Server files 220 | *.mdf 221 | *.ldf 222 | *.ndf 223 | 224 | # Business Intelligence projects 225 | *.rdl.data 226 | *.bim.layout 227 | *.bim_*.settings 228 | 229 | # Microsoft Fakes 230 | FakesAssemblies/ 231 | 232 | # GhostDoc plugin setting file 233 | *.GhostDoc.xml 234 | 235 | # Node.js Tools for Visual Studio 236 | .ntvs_analysis.dat 237 | node_modules/ 238 | 239 | # Typescript v1 declaration files 240 | typings/ 241 | 242 | # Visual Studio 6 build log 243 | *.plg 244 | 245 | # Visual Studio 6 workspace options file 246 | *.opt 247 | 248 | # Visual Studio 6 auto-generated workspace file (contains which files were open etc.) 249 | *.vbw 250 | 251 | # Visual Studio LightSwitch build output 252 | **/*.HTMLClient/GeneratedArtifacts 253 | **/*.DesktopClient/GeneratedArtifacts 254 | **/*.DesktopClient/ModelManifest.xml 255 | **/*.Server/GeneratedArtifacts 256 | **/*.Server/ModelManifest.xml 257 | _Pvt_Extensions 258 | 259 | # Paket dependency manager 260 | .paket/paket.exe 261 | paket-files/ 262 | 263 | # FAKE - F# Make 264 | .fake/ 265 | 266 | # JetBrains Rider 267 | .idea/ 268 | *.sln.iml 269 | 270 | # CodeRush 271 | .cr/ 272 | 273 | # Python Tools for Visual Studio (PTVS) 274 | __pycache__/ 275 | *.pyc 276 | 277 | # Cake - Uncomment if you are using it 278 | # tools/** 279 | # !tools/packages.config 280 | 281 | # Telerik's JustMock configuration file 282 | *.jmconfig 283 | 284 | # BizTalk build output 285 | *.btp.cs 286 | *.btm.cs 287 | *.odx.cs 288 | *.xsd.cs 289 | -------------------------------------------------------------------------------- /DSL_ImgProcess/README.MD: -------------------------------------------------------------------------------- 1 | Dual supervised learning for image processing. 2 | ========== 3 | 4 | This is a TensorFlow-based implementation of DSL for image processing. The primal task is image classification (image-to-label) and the dual task is image generation given a label (label-to-image). It leverages the dual signal of the two tasks to improve the performance of both.
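For context, DSL [1] couples the two directions through the probabilistic duality P(x)P(y|x) = P(y)P(x|y) and penalizes the gap between the two factorizations during training. A minimal sketch of that regularizer is shown below; the function and variable names are illustrative and are not this repository's API:

```python
import numpy as np

def duality_gap(log_p_x, log_p_y_given_x, log_p_y, log_p_x_given_y):
    """Squared gap between the two factorizations of log P(x, y).

    The marginals log_p_x / log_p_y come from fixed models (e.g. a
    PixelCNN-style prior over images and the empirical label distribution),
    while the two conditionals come from the trainable primal/dual models.
    """
    gap = (log_p_x + log_p_y_given_x) - (log_p_y + log_p_x_given_y)
    return np.square(gap)

# Each task then adds the weighted gap to its own loss, e.g.
#   loss_I2L = ce_I2L + trade_off_I2L * duality_gap(...)
#   loss_L2I = nll_L2I + trade_off_L2I * duality_gap(...)
# which mirrors the --trade_off_I2L / --trade_off_L2I flags used in example.sh.
```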
5 | 6 | Download 7 | ---------- 8 | $ git clone https://github.com/Microsoft/DualLearning 9 | 10 | Build 11 | ---------- 12 | 13 | **Prerequisite** 14 | 15 | The code is built on TensorFlow. 16 | 17 | 18 | Run 19 | ---------- 20 | Training: refer to `example.sh` 21 | 22 | Evaluation: refer to `batch_test_script_mainbody.sh` 23 | 24 | Checkpoint 25 | ---------- 26 | The data and a pretrained checkpoint can be downloaded at 27 | https://www.dropbox.com/sh/fpnvtcmyj4mul2s/AAB4wvsxoS8pf7ExnZYe4VV1a?dl=0 28 | 29 | Reference 30 | ---------- 31 | [1] Xia, Y., Qin, T., Chen, W., Bian, J., Yu, N., & Liu, T.-Y. Dual Supervised Learning. ICML 2017. 32 | -------------------------------------------------------------------------------- /DSL_ImgProcess/batch_test_script_leftpart.sh: -------------------------------------------------------------------------------- 1 | export PATH=/usr/anaconda2/bin:$PATH 2 | #export LD_LIBRARY_PATH=~/Downloads/cuda/lib64:"$LD_LIBRARY_PATH" 3 | export CUDA_VISIBLE_DEVICES=6 4 | 5 | model_dir=checkpoints 6 | 7 | for (( e=345;e<=345;e+=2 ));do 8 | filename=$(ls "$model_dir" | grep -o 'params_'${e}'uidx[^\.]*\.ckpt\.index') 9 | filename=${filename:0:-6} 10 | 11 | python monitor_checkleft.py --data_dir=./cifar10_data --save_dir=$model_dir --batch_size=12 --show_interval=100 --load_params=${filename} --mode=I2L --useSoftLabel=0 12 | 13 | done 14 | 15 | for (( e=345;e<=345;e+=2 ));do 16 | filename=$(ls "$model_dir" | grep -o 'params_'${e}'uidx[^\.]*\.ckpt\.index') 17 | filename=${filename:0:-6} 18 | 19 | python monitor_checkleft.py --data_dir=./cifar10_data --save_dir=$model_dir --batch_size=12 --show_interval=100 --load_params=${filename} --mode=L2I --useSoftLabel=0 20 | 21 | done 22 | 23 | 24 | 25 | 26 | : <<'VIRTUAL_ENV' 27 | source ~/virtual_py/bin/activate 28 | export CUDA_VISIBLE_DEVICES=0 29 | 30 | model_dir=debug_room 31 | 32 | python monitor.py --data_dir=./cifar10_data --save_dir=$model_dir --batch_size=12 --show_interval=100 --load_params=params_9uidx13880.ckpt --mode=L2I --useSoftLabel=0 33 | 34 | deactivate 35 | 36 | VIRTUAL_ENV 37 | 38 | 39 | 40 | 41 | -------------------------------------------------------------------------------- /DSL_ImgProcess/batch_test_script_mainbody.sh: -------------------------------------------------------------------------------- 1 | export PATH=/usr/anaconda2/bin:$PATH 2 | #export LD_LIBRARY_PATH=~/Downloads/cuda/lib64:"$LD_LIBRARY_PATH" 3 | export CUDA_VISIBLE_DEVICES=6 4 | 5 | model_dir=checkpoints 6 | 7 | for (( e=345;e<=345;e+=2 ));do 8 | filename=$(ls "$model_dir" | grep -o 'params_'${e}'uidx[^\.]*\.ckpt\.index') 9 | filename=${filename:0:-6} 10 | 11 | python monitor.py --data_dir=./cifar10_data --save_dir=$model_dir --batch_size=12 --show_interval=100 --load_params=${filename} --mode=I2L --useSoftLabel=0 12 | 13 | done 14 | 15 | for (( e=345;e<=345;e+=2 ));do 16 | filename=$(ls "$model_dir" | grep -o 'params_'${e}'uidx[^\.]*\.ckpt\.index') 17 | filename=${filename:0:-6} 18 | 19 | python monitor.py --data_dir=./cifar10_data --save_dir=$model_dir --batch_size=12 --show_interval=100 --load_params=${filename} --mode=L2I --useSoftLabel=0 20 | 21 | done 22 | 23 | 24 | # Reminder: When using "--oneside" in training mode, you should also pass the 25 | # corresponding "--oneside" flag in the inference phase. 26 | -------------------------------------------------------------------------------- /DSL_ImgProcess/cifar_input.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 The TensorFlow Authors.
All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================== 15 | 16 | """CIFAR dataset input module. 17 | """ 18 | 19 | import tensorflow as tf 20 | 21 | 22 | def build_input(dataset, data_path, batch_size, mode): 23 | """Build CIFAR image and labels. 24 | 25 | Args: 26 | dataset: Either 'cifar10' or 'cifar100'. 27 | data_path: Filename for data. 28 | batch_size: Input batch size. 29 | mode: Either 'train' or 'eval'. 30 | Returns: 31 | images: Batches of images. [batch_size, image_size, image_size, 3] 32 | labels: Batches of labels. [batch_size, num_classes] 33 | Raises: 34 | ValueError: when the specified dataset is not supported. 35 | """ 36 | image_size = 32 37 | if dataset == 'cifar10': 38 | label_bytes = 1 39 | label_offset = 0 40 | num_classes = 10 41 | elif dataset == 'cifar100': 42 | label_bytes = 1 43 | label_offset = 1 44 | num_classes = 100 45 | else: 46 | raise ValueError('Not supported dataset %s', dataset) 47 | 48 | depth = 3 49 | image_bytes = image_size * image_size * depth 50 | record_bytes = label_bytes + label_offset + image_bytes 51 | 52 | data_files = tf.gfile.Glob(data_path) 53 | file_queue = tf.train.string_input_producer(data_files, shuffle=True) 54 | # Read examples from files in the filename queue. 55 | reader = tf.FixedLengthRecordReader(record_bytes=record_bytes) 56 | _, value = reader.read(file_queue) 57 | 58 | # Convert these examples to dense labels and processed images. 59 | record = tf.reshape(tf.decode_raw(value, tf.uint8), [record_bytes]) 60 | 61 | label = tf.cast(tf.slice(record, [label_offset], [label_bytes]), tf.int32) 62 | # Convert from string to [depth * height * width] to [depth, height, width]. 63 | depth_major = tf.reshape(tf.slice(record, [label_bytes], [image_bytes]), 64 | [depth, image_size, image_size]) 65 | # Convert from [depth, height, width] to [height, width, depth]. 66 | image = tf.cast(tf.transpose(depth_major, [1, 2, 0]), tf.float32) 67 | 68 | if mode == 'train': 69 | image = tf.image.resize_image_with_crop_or_pad( 70 | image, image_size+4, image_size+4) 71 | image = tf.random_crop(image, [image_size, image_size, 3]) 72 | image = tf.image.random_flip_left_right(image) 73 | # Brightness/saturation/constrast provides small gains .2%~.5% on cifar. 74 | # image = tf.image.random_brightness(image, max_delta=63. / 255.) 
75 | # image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 76 | # image = tf.image.random_contrast(image, lower=0.2, upper=1.8) 77 | image = tf.image.per_image_standardization(image) 78 | 79 | example_queue = tf.RandomShuffleQueue( 80 | capacity=16 * batch_size, 81 | min_after_dequeue=8 * batch_size, 82 | dtypes=[tf.float32, tf.int32], 83 | shapes=[[image_size, image_size, depth], [1]]) 84 | num_threads = 16 85 | else: 86 | image = tf.image.resize_image_with_crop_or_pad( 87 | image, image_size, image_size) 88 | image = tf.image.per_image_standardization(image) 89 | 90 | example_queue = tf.FIFOQueue( 91 | 3 * batch_size, 92 | dtypes=[tf.float32, tf.int32], 93 | shapes=[[image_size, image_size, depth], [1]]) 94 | num_threads = 1 95 | 96 | example_enqueue_op = example_queue.enqueue([image, label]) 97 | tf.train.add_queue_runner(tf.train.queue_runner.QueueRunner( 98 | example_queue, [example_enqueue_op] * num_threads)) 99 | 100 | # Read 'batch' labels + images from the example queue. 101 | images, labels = example_queue.dequeue_many(batch_size) 102 | labels = tf.reshape(labels, [batch_size, 1]) 103 | indices = tf.reshape(tf.range(0, batch_size, 1), [batch_size, 1]) 104 | labels = tf.sparse_to_dense( 105 | tf.concat([indices, labels], 1), 106 | [batch_size, num_classes], 1.0, 0.0) 107 | 108 | assert len(images.get_shape()) == 4 109 | assert images.get_shape()[0] == batch_size 110 | assert images.get_shape()[-1] == 3 111 | assert len(labels.get_shape()) == 2 112 | assert labels.get_shape()[0] == batch_size 113 | assert labels.get_shape()[1] == num_classes 114 | 115 | # Display the training images in the visualizer. 116 | tf.summary.image('images', images) 117 | return images, labels 118 | -------------------------------------------------------------------------------- /DSL_ImgProcess/data/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/DualLearning/14f8e76b3c47e1f00bef4254606e1f549cc8c9ea/DSL_ImgProcess/data/__init__.py -------------------------------------------------------------------------------- /DSL_ImgProcess/data/cifar10_data.py: -------------------------------------------------------------------------------- 1 | """ 2 | Utilities for downloading and unpacking the CIFAR-10 dataset, originally published 3 | by Krizhevsky et al.
and hosted here: https://www.cs.toronto.edu/~kriz/cifar.html 4 | """ 5 | 6 | import os 7 | import sys 8 | import tarfile 9 | from six.moves import urllib 10 | import numpy as np 11 | 12 | def maybe_download_and_extract(data_dir, url='http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz'): 13 | if not os.path.exists(os.path.join(data_dir, 'cifar-10-batches-py')): 14 | if not os.path.exists(data_dir): 15 | os.makedirs(data_dir) 16 | filename = url.split('/')[-1] 17 | filepath = os.path.join(data_dir, filename) 18 | if not os.path.exists(filepath): 19 | def _progress(count, block_size, total_size): 20 | sys.stdout.write('\r>> Downloading %s %.1f%%' % (filename, 21 | float(count * block_size) / float(total_size) * 100.0)) 22 | sys.stdout.flush() 23 | filepath, _ = urllib.request.urlretrieve(url, filepath, _progress) 24 | print() 25 | statinfo = os.stat(filepath) 26 | print('Successfully downloaded', filename, statinfo.st_size, 'bytes.') 27 | tarfile.open(filepath, 'r:gz').extractall(data_dir) 28 | 29 | def unpickle(file): 30 | fo = open(file, 'rb') 31 | if (sys.version_info >= (3, 0)): 32 | import pickle 33 | d = pickle.load(fo, encoding='latin1') 34 | else: 35 | import cPickle 36 | d = cPickle.load(fo) 37 | fo.close() 38 | return {'x': d['data'].reshape((10000,3,32,32)), 'y': np.array(d['labels']).astype(np.uint8)} 39 | 40 | def load(data_dir, subset='train'): 41 | maybe_download_and_extract(data_dir) 42 | if subset=='train': 43 | train_data = [unpickle(os.path.join(data_dir,'cifar-10-batches-py','data_batch_' + str(i))) for i in range(1,6)] 44 | trainx = np.concatenate([d['x'] for d in train_data],axis=0) 45 | trainy = np.concatenate([d['y'] for d in train_data],axis=0) 46 | return trainx, trainy 47 | elif subset=='test': 48 | test_data = unpickle(os.path.join(data_dir,'cifar-10-batches-py','test_batch')) 49 | testx = test_data['x'] 50 | testy = test_data['y'] 51 | return testx, testy 52 | else: 53 | raise NotImplementedError('subset should be either train or test') 54 | 55 | class DataLoader(object): 56 | """ an object that generates batches of CIFAR-10 data for training """ 57 | 58 | def __init__(self, data_dir, subset, batch_size, rng=None, shuffle=False, return_labels=False, filter_labels=None): 59 | """ 60 | - data_dir is location where to store files 61 | - subset is train|test 62 | - batch_size is int, of #examples to load at once 63 | - rng is np.random.RandomState object for reproducibility 64 | """ 65 | 66 | self.data_dir = data_dir 67 | self.batch_size = batch_size 68 | self.shuffle = shuffle 69 | self.return_labels = return_labels 70 | 71 | # create temporary storage for the data, if not yet created 72 | if not os.path.exists(data_dir): 73 | print('creating folder', data_dir) 74 | os.makedirs(data_dir) 75 | 76 | # load CIFAR-10 training data to RAM 77 | self.data, self.labels = load(os.path.join(data_dir,'cifar-10-python'), subset=subset) 78 | 79 | if filter_labels is not None: 80 | selected_idx = self.labels == filter_labels 81 | self.data = self.data[selected_idx] 82 | self.labels = self.labels[selected_idx] 83 | print('There are %d samples left' % self.labels.size) 84 | 85 | self.data = np.transpose(self.data, (0,2,3,1)) # (N,3,32,32) -> (N,32,32,3) 86 | 87 | self.p = 0 # pointer to where we are in iteration 88 | self.rng = np.random.RandomState(1) if rng is None else rng 89 | 90 | def get_observation_size(self): 91 | return self.data.shape[1:] 92 | 93 | def get_num_labels(self): 94 | return np.amax(self.labels) + 1 95 | 96 | def reset(self): 97 | self.p = 0 98 | 99 | def 
__iter__(self): 100 | return self 101 | 102 | def __next__(self, n=None): 103 | """ n is the number of examples to fetch """ 104 | if n is None: n = self.batch_size 105 | 106 | # on first iteration lazily permute all data 107 | if self.p == 0 and self.shuffle: 108 | inds = self.rng.permutation(self.data.shape[0]) 109 | self.data = self.data[inds] 110 | self.labels = self.labels[inds] 111 | 112 | # on last iteration reset the counter and raise StopIteration 113 | if self.p + n > self.data.shape[0]: 114 | self.reset() # reset for next time we get called 115 | raise StopIteration 116 | 117 | # on intermediate iterations fetch the next batch 118 | x = self.data[self.p : self.p + n] 119 | y = self.labels[self.p : self.p + n] 120 | self.p += self.batch_size 121 | 122 | if self.return_labels: 123 | return x,y 124 | else: 125 | return x 126 | 127 | next = __next__ # Python 2 compatibility (https://stackoverflow.com/questions/29578469/how-to-make-an-object-both-a-python2-and-python3-iterator) 128 | 129 | 130 | -------------------------------------------------------------------------------- /DSL_ImgProcess/data/cifar10_plotdata.py: -------------------------------------------------------------------------------- 1 | import cifar10_data 2 | import argparse 3 | import plotting 4 | import numpy as np 5 | 6 | data_dir = '/home/tim/data' 7 | 8 | parser = argparse.ArgumentParser() 9 | parser.add_argument('--save_dir', type=str, default='./log') 10 | parser.add_argument('--data_dir', type=str, default='/home/tim/data') 11 | parser.add_argument('--plot_title', type=str, default=None) 12 | args = parser.parse_args() 13 | print(args) 14 | 15 | data_dir = args.data_dir 16 | 17 | trainx, trainy = cifar10_data.load(data_dir) 18 | 19 | ids = [[] for i in range(10)] 20 | for i in range(trainx.shape[0]): 21 | if len(ids[trainy[i]]) < 10: 22 | ids[trainy[i]].append(i) 23 | if np.alltrue(np.asarray([len(_ids) >= 10 for _ids in ids])): 24 | break 25 | 26 | images = np.zeros((10*10,32,32,3),dtype='uint8') 27 | for i in range(len(ids)): 28 | for j in range(len(ids[i])): 29 | images[10*j+i] = trainx[ids[i][j]].transpose([1,2,0]) 30 | print(ids) 31 | 32 | img_tile = plotting.img_tile(images, aspect_ratio=1.0, border_color=1.0, stretch=True) 33 | img = plotting.plot_img(img_tile, title=args.plot_title if args.plot_title != 'None' else None) 34 | plotting.plt.savefig(args.save_dir + '/cifar10_orig_images.png') 35 | plotting.plt.close('all') 36 | 37 | -------------------------------------------------------------------------------- /DSL_ImgProcess/data/imagenet_data.py: -------------------------------------------------------------------------------- 1 | """ 2 | Utilities for loading the small ImageNet dataset used in Oord et al. 3 | use scripts/png_to_npz.py to create the npz files 4 | 5 | The code here currently assumes that the preprocessing was done manually. 
6 | TODO: make automatic and painless 7 | """ 8 | 9 | import os 10 | import sys 11 | import tarfile 12 | from six.moves import urllib 13 | 14 | import numpy as np 15 | from scipy.misc import imread 16 | 17 | def fetch(url, filepath): 18 | filename = url.split('/')[-1] 19 | def _progress(count, block_size, total_size): 20 | sys.stdout.write('\r>> Downloading %s %.1f%%' % (filename, 21 | float(count * block_size) / float(total_size) * 100.0)) 22 | sys.stdout.flush() 23 | print(url) 24 | filepath, headers = urllib.request.urlretrieve(url, filepath, _progress) 25 | print() 26 | statinfo = os.stat(filepath) 27 | print('Successfully downloaded', filename, statinfo.st_size, 'bytes.') 28 | 29 | def maybe_download_and_extract(data_dir): 30 | # more info on the dataset at http://image-net.org/small/download.php 31 | # downloads and extracts the two tar files for train/val 32 | 33 | train_dir = os.path.join(data_dir, 'train_32x32') 34 | if not os.path.exists(train_dir): 35 | train_url = 'http://image-net.org/small/train_32x32.tar' # 4GB 36 | filepath = os.path.join(data_dir, 'train_32x32.tar') 37 | fetch(train_url, filepath) 38 | print('unpacking the tar file', filepath) 39 | tarfile.open(filepath, 'r').extractall(data_dir) # creates the train_32x32 folder 40 | 41 | test_dir = os.path.join(data_dir, 'valid_32x32') 42 | if not os.path.exists(test_dir): 43 | test_url = 'http://image-net.org/small/valid_32x32.tar' # 154MB 44 | filepath = os.path.join(data_dir, 'valid_32x32.tar') 45 | fetch(test_url, filepath) 46 | print('unpacking the tar file', filepath) 47 | tarfile.open(filepath, 'r').extractall(data_dir) # creates the valid_32x32 folder 48 | 49 | def maybe_preprocess(data_dir): 50 | 51 | npz_file = os.path.join(data_dir, 'imgnet_32x32.npz') 52 | if os.path.exists(npz_file): 53 | return # all good 54 | 55 | trainx = [] 56 | train_dir = os.path.join(data_dir, 'train_32x32') 57 | for f in os.listdir(train_dir): 58 | if f.endswith('.png'): 59 | print('reading', f) 60 | filepath = os.path.join(train_dir, f) 61 | trainx.append(imread(filepath).reshape((1,32,32,3))) 62 | trainx = np.concatenate(trainx, axis=0) 63 | 64 | testx = [] 65 | test_dir = os.path.join(data_dir, 'valid_32x32') 66 | for f in os.listdir(test_dir): 67 | if f.endswith('.png'): 68 | print('reading', f) 69 | filepath = os.path.join(test_dir, f) 70 | testx.append(imread(filepath).reshape((1,32,32,3))) 71 | testx = np.concatenate(testx, axis=0) 72 | 73 | np.savez(npz_file, trainx=trainx, testx=testx) 74 | 75 | 76 | def load(data_dir, subset='train'): 77 | if not os.path.exists(data_dir): 78 | print('creating folder', data_dir) 79 | os.makedirs(data_dir) 80 | maybe_download_and_extract(data_dir) 81 | maybe_preprocess(data_dir) 82 | imagenet_data = np.load(os.path.join(data_dir,'imgnet_32x32.npz')) 83 | return imagenet_data['trainx'] if subset == 'train' else imagenet_data['testx'] 84 | 85 | 86 | 87 | class DataLoader(object): 88 | """ an object that generates batches of CIFAR-10 data for training """ 89 | 90 | def __init__(self, data_dir, subset, batch_size, rng=None, shuffle=False): 91 | """ 92 | - data_dir is location where the files are stored 93 | - subset is train|test 94 | - batch_size is int, of #examples to load at once 95 | - rng is np.random.RandomState object for reproducibility 96 | """ 97 | 98 | self.data_dir = data_dir 99 | self.batch_size = batch_size 100 | self.shuffle = shuffle 101 | 102 | self.data = load(os.path.join(data_dir,'small_imagenet'), subset=subset) 103 | 104 | self.p = 0 # pointer to where we are in iteration 
105 | self.rng = np.random.RandomState(1) if rng is None else rng 106 | 107 | def get_observation_size(self): 108 | return self.data.shape[1:] 109 | 110 | def reset(self): 111 | self.p = 0 112 | 113 | def __iter__(self): 114 | return self 115 | 116 | def __next__(self, n=None): 117 | """ n is the number of examples to fetch """ 118 | if n is None: n = self.batch_size 119 | 120 | # on first iteration lazily permute all data 121 | if self.p == 0 and self.shuffle: 122 | inds = self.rng.permutation(self.data.shape[0]) 123 | self.data = self.data[inds] 124 | 125 | # on last iteration reset the counter and raise StopIteration 126 | if self.p + n > self.data.shape[0]: 127 | self.reset() # reset for next time we get called 128 | raise StopIteration 129 | 130 | # on intermediate iterations fetch the next batch 131 | x = self.data[self.p : self.p + n] 132 | self.p += self.batch_size 133 | 134 | return x 135 | 136 | next = __next__ # Python 2 compatibility (https://stackoverflow.com/questions/29578469/how-to-make-an-object-both-a-python2-and-python3-iterator) 137 | 138 | -------------------------------------------------------------------------------- /DSL_ImgProcess/data/pixelcnn_samples.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/DualLearning/14f8e76b3c47e1f00bef4254606e1f549cc8c9ea/DSL_ImgProcess/data/pixelcnn_samples.png -------------------------------------------------------------------------------- /DSL_ImgProcess/data2/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/DualLearning/14f8e76b3c47e1f00bef4254606e1f549cc8c9ea/DSL_ImgProcess/data2/__init__.py -------------------------------------------------------------------------------- /DSL_ImgProcess/data2/cifar10_data.py: -------------------------------------------------------------------------------- 1 | """ 2 | Utilities for downloading and unpacking the CIFAR-10 dataset, originally published 3 | by Krizhevsky et al. 
and hosted here: https://www.cs.toronto.edu/~kriz/cifar.html 4 | """ 5 | 6 | import os 7 | import sys 8 | import tarfile 9 | from six.moves import urllib 10 | import numpy as np 11 | 12 | def maybe_download_and_extract(data_dir, url='http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz'): 13 | if not os.path.exists(os.path.join(data_dir, 'cifar-10-batches-py')): 14 | if not os.path.exists(data_dir): 15 | os.makedirs(data_dir) 16 | filename = url.split('/')[-1] 17 | filepath = os.path.join(data_dir, filename) 18 | if not os.path.exists(filepath): 19 | def _progress(count, block_size, total_size): 20 | sys.stdout.write('\r>> Downloading %s %.1f%%' % (filename, 21 | float(count * block_size) / float(total_size) * 100.0)) 22 | sys.stdout.flush() 23 | filepath, _ = urllib.request.urlretrieve(url, filepath, _progress) 24 | print() 25 | statinfo = os.stat(filepath) 26 | print('Successfully downloaded', filename, statinfo.st_size, 'bytes.') 27 | tarfile.open(filepath, 'r:gz').extractall(data_dir) 28 | 29 | def unpickle(file): 30 | fo = open(file, 'rb') 31 | if (sys.version_info >= (3, 0)): 32 | import pickle 33 | d = pickle.load(fo, encoding='latin1') 34 | else: 35 | import cPickle 36 | d = cPickle.load(fo) 37 | fo.close() 38 | return {'x': d['data'].reshape((10000,3,32,32)), 'y': np.array(d['labels']).astype(np.uint8)} 39 | 40 | def load(data_dir, subset='train'): 41 | maybe_download_and_extract(data_dir) 42 | if subset=='train': 43 | train_data = [unpickle(os.path.join(data_dir,'cifar-10-batches-py','data_batch_' + str(i))) for i in range(1,6)] 44 | trainx = np.concatenate([d['x'] for d in train_data],axis=0) 45 | trainy = np.concatenate([d['y'] for d in train_data],axis=0) 46 | return trainx, trainy 47 | elif subset=='test': 48 | test_data = unpickle(os.path.join(data_dir,'cifar-10-batches-py','test_batch')) 49 | testx = test_data['x'] 50 | testy = test_data['y'] 51 | return testx, testy 52 | else: 53 | raise NotImplementedError('subset should be either train or test') 54 | 55 | class DataLoader(object): 56 | """ an object that generates batches of CIFAR-10 data for training """ 57 | 58 | def __init__(self, data_dir, subset, batch_size, LMscore=None, rng=None, shuffle=False, return_labels=False): 59 | """ 60 | - data_dir is location where to store files 61 | - subset is train|test 62 | - batch_size is int, of #examples to load at once 63 | - rng is np.random.RandomState object for reproducibility 64 | """ 65 | 66 | self.data_dir = data_dir 67 | self.batch_size = batch_size 68 | self.shuffle = shuffle 69 | self.return_labels = return_labels 70 | 71 | # create temporary storage for the data, if not yet created 72 | if not os.path.exists(data_dir): 73 | print('creating folder', data_dir) 74 | os.makedirs(data_dir) 75 | 76 | # load CIFAR-10 training data to RAM 77 | self.data, self.labels = load(os.path.join(data_dir,'cifar-10-python'), subset=subset) 78 | self.data = np.transpose(self.data, (0,2,3,1)) # (N,3,32,32) -> (N,32,32,3) 79 | 80 | if subset == 'train': 81 | self.LM = np.load(LMscore + '.train.npz')['arr_0'] 82 | elif subset == 'test': 83 | self.LM = np.load(LMscore + '.test.npz') 84 | else: 85 | raise 'Not found proper LMscore folder' 86 | 87 | self.p = 0 # pointer to where we are in iteration 88 | self.rng = np.random.RandomState(1) if rng is None else rng 89 | 90 | def get_observation_size(self): 91 | return self.data.shape[1:] 92 | 93 | def get_num_labels(self): 94 | return np.amax(self.labels) + 1 95 | 96 | def reset(self): 97 | self.p = 0 98 | 99 | def __iter__(self): 100 | return 
self 101 | 102 | def __next__(self, n=None): 103 | """ n is the number of examples to fetch """ 104 | if n is None: n = self.batch_size 105 | 106 | # on first iteration lazily permute all data 107 | if self.p == 0 and self.shuffle: 108 | inds = self.rng.permutation(self.data.shape[0]) 109 | self.data = self.data[inds] 110 | self.labels = self.labels[inds] 111 | self.LM = self.LM[inds] 112 | 113 | # on last iteration reset the counter and raise StopIteration 114 | if self.p + n > self.data.shape[0]: 115 | self.reset() # reset for next time we get called 116 | raise StopIteration 117 | 118 | # on intermediate iterations fetch the next batch 119 | x = self.data[self.p : self.p + n] 120 | y = self.labels[self.p : self.p + n] 121 | lmscore = self.LM[self.p : self.p + n] 122 | self.p += self.batch_size 123 | 124 | if self.return_labels: 125 | return x,y, lmscore 126 | else: 127 | return x, lmscore 128 | 129 | next = __next__ # Python 2 compatibility (https://stackoverflow.com/questions/29578469/how-to-make-an-object-both-a-python2-and-python3-iterator) 130 | 131 | 132 | -------------------------------------------------------------------------------- /DSL_ImgProcess/data2/cifar10_plotdata.py: -------------------------------------------------------------------------------- 1 | import cifar10_data 2 | import argparse 3 | import plotting 4 | import numpy as np 5 | 6 | data_dir = '/home/tim/data' 7 | 8 | parser = argparse.ArgumentParser() 9 | parser.add_argument('--save_dir', type=str, default='./log') 10 | parser.add_argument('--data_dir', type=str, default='/home/tim/data') 11 | parser.add_argument('--plot_title', type=str, default=None) 12 | args = parser.parse_args() 13 | print(args) 14 | 15 | data_dir = args.data_dir 16 | 17 | trainx, trainy = cifar10_data.load(data_dir) 18 | 19 | ids = [[] for i in range(10)] 20 | for i in range(trainx.shape[0]): 21 | if len(ids[trainy[i]]) < 10: 22 | ids[trainy[i]].append(i) 23 | if np.alltrue(np.asarray([len(_ids) >= 10 for _ids in ids])): 24 | break 25 | 26 | images = np.zeros((10*10,32,32,3),dtype='uint8') 27 | for i in range(len(ids)): 28 | for j in range(len(ids[i])): 29 | images[10*j+i] = trainx[ids[i][j]].transpose([1,2,0]) 30 | print(ids) 31 | 32 | img_tile = plotting.img_tile(images, aspect_ratio=1.0, border_color=1.0, stretch=True) 33 | img = plotting.plot_img(img_tile, title=args.plot_title if args.plot_title != 'None' else None) 34 | plotting.plt.savefig(args.save_dir + '/cifar10_orig_images.png') 35 | plotting.plt.close('all') 36 | 37 | -------------------------------------------------------------------------------- /DSL_ImgProcess/data2/imagenet_data.py: -------------------------------------------------------------------------------- 1 | """ 2 | Utilities for loading the small ImageNet dataset used in Oord et al. 3 | use scripts/png_to_npz.py to create the npz files 4 | 5 | The code here currently assumes that the preprocessing was done manually. 
6 | TODO: make automatic and painless 7 | """ 8 | 9 | import os 10 | import sys 11 | import tarfile 12 | from six.moves import urllib 13 | 14 | import numpy as np 15 | from scipy.misc import imread 16 | 17 | def fetch(url, filepath): 18 | filename = url.split('/')[-1] 19 | def _progress(count, block_size, total_size): 20 | sys.stdout.write('\r>> Downloading %s %.1f%%' % (filename, 21 | float(count * block_size) / float(total_size) * 100.0)) 22 | sys.stdout.flush() 23 | print(url) 24 | filepath, headers = urllib.request.urlretrieve(url, filepath, _progress) 25 | print() 26 | statinfo = os.stat(filepath) 27 | print('Successfully downloaded', filename, statinfo.st_size, 'bytes.') 28 | 29 | def maybe_download_and_extract(data_dir): 30 | # more info on the dataset at http://image-net.org/small/download.php 31 | # downloads and extracts the two tar files for train/val 32 | 33 | train_dir = os.path.join(data_dir, 'train_32x32') 34 | if not os.path.exists(train_dir): 35 | train_url = 'http://image-net.org/small/train_32x32.tar' # 4GB 36 | filepath = os.path.join(data_dir, 'train_32x32.tar') 37 | fetch(train_url, filepath) 38 | print('unpacking the tar file', filepath) 39 | tarfile.open(filepath, 'r').extractall(data_dir) # creates the train_32x32 folder 40 | 41 | test_dir = os.path.join(data_dir, 'valid_32x32') 42 | if not os.path.exists(test_dir): 43 | test_url = 'http://image-net.org/small/valid_32x32.tar' # 154MB 44 | filepath = os.path.join(data_dir, 'valid_32x32.tar') 45 | fetch(test_url, filepath) 46 | print('unpacking the tar file', filepath) 47 | tarfile.open(filepath, 'r').extractall(data_dir) # creates the valid_32x32 folder 48 | 49 | def maybe_preprocess(data_dir): 50 | 51 | npz_file = os.path.join(data_dir, 'imgnet_32x32.npz') 52 | if os.path.exists(npz_file): 53 | return # all good 54 | 55 | trainx = [] 56 | train_dir = os.path.join(data_dir, 'train_32x32') 57 | for f in os.listdir(train_dir): 58 | if f.endswith('.png'): 59 | print('reading', f) 60 | filepath = os.path.join(train_dir, f) 61 | trainx.append(imread(filepath).reshape((1,32,32,3))) 62 | trainx = np.concatenate(trainx, axis=0) 63 | 64 | testx = [] 65 | test_dir = os.path.join(data_dir, 'valid_32x32') 66 | for f in os.listdir(test_dir): 67 | if f.endswith('.png'): 68 | print('reading', f) 69 | filepath = os.path.join(test_dir, f) 70 | testx.append(imread(filepath).reshape((1,32,32,3))) 71 | testx = np.concatenate(testx, axis=0) 72 | 73 | np.savez(npz_file, trainx=trainx, testx=testx) 74 | 75 | 76 | def load(data_dir, subset='train'): 77 | if not os.path.exists(data_dir): 78 | print('creating folder', data_dir) 79 | os.makedirs(data_dir) 80 | maybe_download_and_extract(data_dir) 81 | maybe_preprocess(data_dir) 82 | imagenet_data = np.load(os.path.join(data_dir,'imgnet_32x32.npz')) 83 | return imagenet_data['trainx'] if subset == 'train' else imagenet_data['testx'] 84 | 85 | 86 | 87 | class DataLoader(object): 88 | """ an object that generates batches of CIFAR-10 data for training """ 89 | 90 | def __init__(self, data_dir, subset, batch_size, rng=None, shuffle=False): 91 | """ 92 | - data_dir is location where the files are stored 93 | - subset is train|test 94 | - batch_size is int, of #examples to load at once 95 | - rng is np.random.RandomState object for reproducibility 96 | """ 97 | 98 | self.data_dir = data_dir 99 | self.batch_size = batch_size 100 | self.shuffle = shuffle 101 | 102 | self.data = load(os.path.join(data_dir,'small_imagenet'), subset=subset) 103 | 104 | self.p = 0 # pointer to where we are in iteration 
105 | self.rng = np.random.RandomState(1) if rng is None else rng 106 | 107 | def get_observation_size(self): 108 | return self.data.shape[1:] 109 | 110 | def reset(self): 111 | self.p = 0 112 | 113 | def __iter__(self): 114 | return self 115 | 116 | def __next__(self, n=None): 117 | """ n is the number of examples to fetch """ 118 | if n is None: n = self.batch_size 119 | 120 | # on first iteration lazily permute all data 121 | if self.p == 0 and self.shuffle: 122 | inds = self.rng.permutation(self.data.shape[0]) 123 | self.data = self.data[inds] 124 | 125 | # on last iteration reset the counter and raise StopIteration 126 | if self.p + n > self.data.shape[0]: 127 | self.reset() # reset for next time we get called 128 | raise StopIteration 129 | 130 | # on intermediate iterations fetch the next batch 131 | x = self.data[self.p : self.p + n] 132 | self.p += self.batch_size 133 | 134 | return x 135 | 136 | next = __next__ # Python 2 compatibility (https://stackoverflow.com/questions/29578469/how-to-make-an-object-both-a-python2-and-python3-iterator) 137 | 138 | -------------------------------------------------------------------------------- /DSL_ImgProcess/data2/pixelcnn_samples.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/DualLearning/14f8e76b3c47e1f00bef4254606e1f549cc8c9ea/DSL_ImgProcess/data2/pixelcnn_samples.png -------------------------------------------------------------------------------- /DSL_ImgProcess/data4/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/DualLearning/14f8e76b3c47e1f00bef4254606e1f549cc8c9ea/DSL_ImgProcess/data4/__init__.py -------------------------------------------------------------------------------- /DSL_ImgProcess/data4/cifar10_data.py: -------------------------------------------------------------------------------- 1 | """ 2 | Utilities for downloading and unpacking the CIFAR-10 dataset, originally published 3 | by Krizhevsky et al. 
and hosted here: https://www.cs.toronto.edu/~kriz/cifar.html 4 | """ 5 | 6 | import os 7 | import sys 8 | import tarfile 9 | from six.moves import urllib 10 | import numpy as np 11 | 12 | def maybe_download_and_extract(data_dir, url='http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz'): 13 | if not os.path.exists(os.path.join(data_dir, 'cifar-10-batches-py')): 14 | if not os.path.exists(data_dir): 15 | os.makedirs(data_dir) 16 | filename = url.split('/')[-1] 17 | filepath = os.path.join(data_dir, filename) 18 | if not os.path.exists(filepath): 19 | def _progress(count, block_size, total_size): 20 | sys.stdout.write('\r>> Downloading %s %.1f%%' % (filename, 21 | float(count * block_size) / float(total_size) * 100.0)) 22 | sys.stdout.flush() 23 | filepath, _ = urllib.request.urlretrieve(url, filepath, _progress) 24 | print() 25 | statinfo = os.stat(filepath) 26 | print('Successfully downloaded', filename, statinfo.st_size, 'bytes.') 27 | tarfile.open(filepath, 'r:gz').extractall(data_dir) 28 | 29 | def unpickle(file): 30 | fo = open(file, 'rb') 31 | if (sys.version_info >= (3, 0)): 32 | import pickle 33 | d = pickle.load(fo, encoding='latin1') 34 | else: 35 | import cPickle 36 | d = cPickle.load(fo) 37 | fo.close() 38 | return {'x': d['data'].reshape((10000,3,32,32)), 'y': np.array(d['labels']).astype(np.uint8)} 39 | 40 | def load(data_dir, subset='train'): 41 | maybe_download_and_extract(data_dir) 42 | if subset=='train': 43 | train_data = [unpickle(os.path.join(data_dir,'cifar-10-batches-py','data_batch_' + str(i))) for i in range(1,6)] 44 | trainx = np.concatenate([d['x'] for d in train_data],axis=0) 45 | trainy = np.concatenate([d['y'] for d in train_data],axis=0) 46 | return trainx, trainy 47 | elif subset=='test': 48 | test_data = unpickle(os.path.join(data_dir,'cifar-10-batches-py','test_batch')) 49 | testx = test_data['x'] 50 | testy = test_data['y'] 51 | return testx, testy 52 | else: 53 | raise NotImplementedError('subset should be either train or test') 54 | 55 | class DataLoader(object): 56 | """ an object that generates batches of CIFAR-10 data for training """ 57 | 58 | def __init__(self, data_dir, subset, batch_size, rng=None, shuffle=False, return_labels=False, filter_labels=None,final=8): 59 | """ 60 | - data_dir is location where to store files 61 | - subset is train|test 62 | - batch_size is int, of #examples to load at once 63 | - rng is np.random.RandomState object for reproducibility 64 | """ 65 | 66 | self.data_dir = data_dir 67 | self.batch_size = batch_size 68 | self.shuffle = shuffle 69 | self.return_labels = return_labels 70 | 71 | # create temporary storage for the data, if not yet created 72 | if not os.path.exists(data_dir): 73 | print('creating folder', data_dir) 74 | os.makedirs(data_dir) 75 | 76 | # load CIFAR-10 training data to RAM 77 | self.data, self.labels = load(os.path.join(data_dir,'cifar-10-python'), subset=subset) 78 | if final > 0: 79 | self.data = np.tile(self.data[-final:],[3,1,1,1]) 80 | self.labels = np.tile(self.labels[-final:],[3]) 81 | 82 | 83 | if filter_labels is not None: 84 | selected_idx = self.labels == filter_labels 85 | self.data = self.data[selected_idx] 86 | self.labels = self.labels[selected_idx] 87 | print('There are %d samples left' % self.labels.size) 88 | 89 | self.data = np.transpose(self.data, (0,2,3,1)) # (N,3,32,32) -> (N,32,32,3) 90 | 91 | self.p = 0 # pointer to where we are in iteration 92 | self.rng = np.random.RandomState(1) if rng is None else rng 93 | 94 | def get_observation_size(self): 95 | return 
self.data.shape[1:] 96 | 97 | def get_num_labels(self): 98 | return np.amax(self.labels) + 1 99 | 100 | def reset(self): 101 | self.p = 0 102 | 103 | def __iter__(self): 104 | return self 105 | 106 | def __next__(self, n=None): 107 | """ n is the number of examples to fetch """ 108 | if n is None: n = self.batch_size 109 | 110 | # on first iteration lazily permute all data 111 | if self.p == 0 and self.shuffle: 112 | inds = self.rng.permutation(self.data.shape[0]) 113 | self.data = self.data[inds] 114 | self.labels = self.labels[inds] 115 | 116 | # on last iteration reset the counter and raise StopIteration 117 | if self.p + n > self.data.shape[0]: 118 | self.reset() # reset for next time we get called 119 | raise StopIteration 120 | 121 | # on intermediate iterations fetch the next batch 122 | x = self.data[self.p : self.p + n] 123 | y = self.labels[self.p : self.p + n] 124 | self.p += self.batch_size 125 | 126 | if self.return_labels: 127 | return x,y 128 | else: 129 | return x 130 | 131 | next = __next__ # Python 2 compatibility (https://stackoverflow.com/questions/29578469/how-to-make-an-object-both-a-python2-and-python3-iterator) 132 | 133 | 134 | -------------------------------------------------------------------------------- /DSL_ImgProcess/data4/cifar10_plotdata.py: -------------------------------------------------------------------------------- 1 | import cifar10_data 2 | import argparse 3 | import plotting 4 | import numpy as np 5 | 6 | data_dir = '/home/tim/data' 7 | 8 | parser = argparse.ArgumentParser() 9 | parser.add_argument('--save_dir', type=str, default='./log') 10 | parser.add_argument('--data_dir', type=str, default='/home/tim/data') 11 | parser.add_argument('--plot_title', type=str, default=None) 12 | args = parser.parse_args() 13 | print(args) 14 | 15 | data_dir = args.data_dir 16 | 17 | trainx, trainy = cifar10_data.load(data_dir) 18 | 19 | ids = [[] for i in range(10)] 20 | for i in range(trainx.shape[0]): 21 | if len(ids[trainy[i]]) < 10: 22 | ids[trainy[i]].append(i) 23 | if np.alltrue(np.asarray([len(_ids) >= 10 for _ids in ids])): 24 | break 25 | 26 | images = np.zeros((10*10,32,32,3),dtype='uint8') 27 | for i in range(len(ids)): 28 | for j in range(len(ids[i])): 29 | images[10*j+i] = trainx[ids[i][j]].transpose([1,2,0]) 30 | print(ids) 31 | 32 | img_tile = plotting.img_tile(images, aspect_ratio=1.0, border_color=1.0, stretch=True) 33 | img = plotting.plot_img(img_tile, title=args.plot_title if args.plot_title != 'None' else None) 34 | plotting.plt.savefig(args.save_dir + '/cifar10_orig_images.png') 35 | plotting.plt.close('all') 36 | 37 | -------------------------------------------------------------------------------- /DSL_ImgProcess/data4/imagenet_data.py: -------------------------------------------------------------------------------- 1 | """ 2 | Utilities for loading the small ImageNet dataset used in Oord et al. 3 | use scripts/png_to_npz.py to create the npz files 4 | 5 | The code here currently assumes that the preprocessing was done manually. 
6 | TODO: make automatic and painless 7 | """ 8 | 9 | import os 10 | import sys 11 | import tarfile 12 | from six.moves import urllib 13 | 14 | import numpy as np 15 | from scipy.misc import imread 16 | 17 | def fetch(url, filepath): 18 | filename = url.split('/')[-1] 19 | def _progress(count, block_size, total_size): 20 | sys.stdout.write('\r>> Downloading %s %.1f%%' % (filename, 21 | float(count * block_size) / float(total_size) * 100.0)) 22 | sys.stdout.flush() 23 | print(url) 24 | filepath, headers = urllib.request.urlretrieve(url, filepath, _progress) 25 | print() 26 | statinfo = os.stat(filepath) 27 | print('Successfully downloaded', filename, statinfo.st_size, 'bytes.') 28 | 29 | def maybe_download_and_extract(data_dir): 30 | # more info on the dataset at http://image-net.org/small/download.php 31 | # downloads and extracts the two tar files for train/val 32 | 33 | train_dir = os.path.join(data_dir, 'train_32x32') 34 | if not os.path.exists(train_dir): 35 | train_url = 'http://image-net.org/small/train_32x32.tar' # 4GB 36 | filepath = os.path.join(data_dir, 'train_32x32.tar') 37 | fetch(train_url, filepath) 38 | print('unpacking the tar file', filepath) 39 | tarfile.open(filepath, 'r').extractall(data_dir) # creates the train_32x32 folder 40 | 41 | test_dir = os.path.join(data_dir, 'valid_32x32') 42 | if not os.path.exists(test_dir): 43 | test_url = 'http://image-net.org/small/valid_32x32.tar' # 154MB 44 | filepath = os.path.join(data_dir, 'valid_32x32.tar') 45 | fetch(test_url, filepath) 46 | print('unpacking the tar file', filepath) 47 | tarfile.open(filepath, 'r').extractall(data_dir) # creates the valid_32x32 folder 48 | 49 | def maybe_preprocess(data_dir): 50 | 51 | npz_file = os.path.join(data_dir, 'imgnet_32x32.npz') 52 | if os.path.exists(npz_file): 53 | return # all good 54 | 55 | trainx = [] 56 | train_dir = os.path.join(data_dir, 'train_32x32') 57 | for f in os.listdir(train_dir): 58 | if f.endswith('.png'): 59 | print('reading', f) 60 | filepath = os.path.join(train_dir, f) 61 | trainx.append(imread(filepath).reshape((1,32,32,3))) 62 | trainx = np.concatenate(trainx, axis=0) 63 | 64 | testx = [] 65 | test_dir = os.path.join(data_dir, 'valid_32x32') 66 | for f in os.listdir(test_dir): 67 | if f.endswith('.png'): 68 | print('reading', f) 69 | filepath = os.path.join(test_dir, f) 70 | testx.append(imread(filepath).reshape((1,32,32,3))) 71 | testx = np.concatenate(testx, axis=0) 72 | 73 | np.savez(npz_file, trainx=trainx, testx=testx) 74 | 75 | 76 | def load(data_dir, subset='train'): 77 | if not os.path.exists(data_dir): 78 | print('creating folder', data_dir) 79 | os.makedirs(data_dir) 80 | maybe_download_and_extract(data_dir) 81 | maybe_preprocess(data_dir) 82 | imagenet_data = np.load(os.path.join(data_dir,'imgnet_32x32.npz')) 83 | return imagenet_data['trainx'] if subset == 'train' else imagenet_data['testx'] 84 | 85 | 86 | 87 | class DataLoader(object): 88 | """ an object that generates batches of CIFAR-10 data for training """ 89 | 90 | def __init__(self, data_dir, subset, batch_size, rng=None, shuffle=False): 91 | """ 92 | - data_dir is location where the files are stored 93 | - subset is train|test 94 | - batch_size is int, of #examples to load at once 95 | - rng is np.random.RandomState object for reproducibility 96 | """ 97 | 98 | self.data_dir = data_dir 99 | self.batch_size = batch_size 100 | self.shuffle = shuffle 101 | 102 | self.data = load(os.path.join(data_dir,'small_imagenet'), subset=subset) 103 | 104 | self.p = 0 # pointer to where we are in iteration 
105 | self.rng = np.random.RandomState(1) if rng is None else rng 106 | 107 | def get_observation_size(self): 108 | return self.data.shape[1:] 109 | 110 | def reset(self): 111 | self.p = 0 112 | 113 | def __iter__(self): 114 | return self 115 | 116 | def __next__(self, n=None): 117 | """ n is the number of examples to fetch """ 118 | if n is None: n = self.batch_size 119 | 120 | # on first iteration lazily permute all data 121 | if self.p == 0 and self.shuffle: 122 | inds = self.rng.permutation(self.data.shape[0]) 123 | self.data = self.data[inds] 124 | 125 | # on last iteration reset the counter and raise StopIteration 126 | if self.p + n > self.data.shape[0]: 127 | self.reset() # reset for next time we get called 128 | raise StopIteration 129 | 130 | # on intermediate iterations fetch the next batch 131 | x = self.data[self.p : self.p + n] 132 | self.p += self.batch_size 133 | 134 | return x 135 | 136 | next = __next__ # Python 2 compatibility (https://stackoverflow.com/questions/29578469/how-to-make-an-object-both-a-python2-and-python3-iterator) 137 | 138 | -------------------------------------------------------------------------------- /DSL_ImgProcess/example.sh: -------------------------------------------------------------------------------- 1 | export PATH=/usr/anaconda2/bin:$PATH 2 | export CUDA_VISIBLE_DEVICES=0,1,2,3 3 | 4 | # train two models (test 4 gpu) 5 | python monitor.py --data_dir=./cifar10_data --save_dir=./checkpoints_All --batch_size=12 --show_interval=10 --learning_rate=1e-4 --load_params=params_345uidx480248.ckpt --learning_rate_I2L=2e-4 --trade_off_I2L=30. --trade_off_L2I=1.5 --save_interval=1 --bias=0.02 --valid_interval=8 --lr_decay=1. --nr_gpu=4 6 | 7 | # train image classifier only (test single gpu) 8 | # python monitor.py --data_dir=./cifar10_data --save_dir=./checkpoints_I2L --batch_size=12 --show_interval=10 --learning_rate=1e-4 --load_params=params_345uidx480248.ckpt --learning_rate_I2L=2e-4 --trade_off_I2L=30. --trade_off_L2I=1.5 --save_interval=1 --bias=0.02 --valid_interval=8 --lr_decay=1. --nr_gpu=1 --oneside=I2L 9 | 10 | # train image generator only (test 2 gpu) 11 | # python monitor.py --data_dir=./cifar10_data --save_dir=./checkpoints_L2I --batch_size=12 --show_interval=10 --learning_rate=1e-4 --load_params=params_345uidx480248.ckpt --learning_rate_I2L=2e-4 --trade_off_I2L=30. --trade_off_L2I=1.5 --save_interval=1 --bias=0.02 --valid_interval=8 --lr_decay=1. --nr_gpu=2 --oneside=L2I 12 | -------------------------------------------------------------------------------- /DSL_ImgProcess/pixel_cnn_pp/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/DualLearning/14f8e76b3c47e1f00bef4254606e1f549cc8c9ea/DSL_ImgProcess/pixel_cnn_pp/__init__.py -------------------------------------------------------------------------------- /DSL_ImgProcess/pixel_cnn_pp/model.py: -------------------------------------------------------------------------------- 1 | """ 2 | The core Pixel-CNN model 3 | """ 4 | 5 | import tensorflow as tf 6 | from tensorflow.contrib.framework.python.ops import arg_scope 7 | import pixel_cnn_pp.nn as nn 8 | 9 | def model_spec(x, h=None, init=False, ema=None, dropout_p=0.5, nr_resnet=5, nr_filters=160, nr_logistic_mix=10, resnet_nonlinearity='concat_elu'): 10 | """ 11 | We receive a Tensor x of shape (N,H,W,D1) (e.g. (12,32,32,3)) and produce 12 | a Tensor x_out of shape (N,H,W,D2) (e.g. 
(12,32,32,100)), where each fiber 13 | of the x_out tensor describes the predictive distribution for the RGB at 14 | that position. 15 | 'h' is an optional N x K matrix of values to condition our generative model on 16 | """ 17 | 18 | counters = {} 19 | with arg_scope([nn.conv2d, nn.deconv2d, nn.gated_resnet, nn.dense], counters=counters, init=init, ema=ema, dropout_p=dropout_p): 20 | 21 | # parse resnet nonlinearity argument 22 | if resnet_nonlinearity == 'concat_elu': 23 | resnet_nonlinearity = nn.concat_elu 24 | elif resnet_nonlinearity == 'elu': 25 | resnet_nonlinearity = tf.nn.elu 26 | elif resnet_nonlinearity == 'relu': 27 | resnet_nonlinearity = tf.nn.relu 28 | else: 29 | raise('resnet nonlinearity ' + resnet_nonlinearity + ' is not supported') 30 | 31 | with arg_scope([nn.gated_resnet], nonlinearity=resnet_nonlinearity, h=h): 32 | 33 | # ////////// up pass through pixelCNN //////// 34 | xs = nn.int_shape(x) 35 | x_pad = tf.concat([x,tf.ones(xs[:-1]+[1])],3) # add channel of ones to distinguish image from padding later on 36 | u_list = [nn.down_shift(nn.down_shifted_conv2d(x_pad, num_filters=nr_filters, filter_size=[2, 3]))] # stream for pixels above 37 | ul_list = [nn.down_shift(nn.down_shifted_conv2d(x_pad, num_filters=nr_filters, filter_size=[1,3])) + \ 38 | nn.right_shift(nn.down_right_shifted_conv2d(x_pad, num_filters=nr_filters, filter_size=[2,1]))] # stream for up and to the left 39 | 40 | for rep in range(nr_resnet): 41 | u_list.append(nn.gated_resnet(u_list[-1], conv=nn.down_shifted_conv2d)) 42 | ul_list.append(nn.gated_resnet(ul_list[-1], u_list[-1], conv=nn.down_right_shifted_conv2d)) 43 | 44 | u_list.append(nn.down_shifted_conv2d(u_list[-1], num_filters=nr_filters, stride=[2, 2])) 45 | ul_list.append(nn.down_right_shifted_conv2d(ul_list[-1], num_filters=nr_filters, stride=[2, 2])) 46 | 47 | for rep in range(nr_resnet): 48 | u_list.append(nn.gated_resnet(u_list[-1], conv=nn.down_shifted_conv2d)) 49 | ul_list.append(nn.gated_resnet(ul_list[-1], u_list[-1], conv=nn.down_right_shifted_conv2d)) 50 | 51 | u_list.append(nn.down_shifted_conv2d(u_list[-1], num_filters=nr_filters, stride=[2, 2])) 52 | ul_list.append(nn.down_right_shifted_conv2d(ul_list[-1], num_filters=nr_filters, stride=[2, 2])) 53 | 54 | for rep in range(nr_resnet): 55 | u_list.append(nn.gated_resnet(u_list[-1], conv=nn.down_shifted_conv2d)) 56 | ul_list.append(nn.gated_resnet(ul_list[-1], u_list[-1], conv=nn.down_right_shifted_conv2d)) 57 | 58 | # /////// down pass //////// 59 | u = u_list.pop() 60 | ul = ul_list.pop() 61 | for rep in range(nr_resnet): 62 | u = nn.gated_resnet(u, u_list.pop(), conv=nn.down_shifted_conv2d) 63 | ul = nn.gated_resnet(ul, tf.concat([u, ul_list.pop()],3), conv=nn.down_right_shifted_conv2d) 64 | 65 | u = nn.down_shifted_deconv2d(u, num_filters=nr_filters, stride=[2, 2]) 66 | ul = nn.down_right_shifted_deconv2d(ul, num_filters=nr_filters, stride=[2, 2]) 67 | 68 | for rep in range(nr_resnet+1): 69 | u = nn.gated_resnet(u, u_list.pop(), conv=nn.down_shifted_conv2d) 70 | ul = nn.gated_resnet(ul, tf.concat([u, ul_list.pop()],3), conv=nn.down_right_shifted_conv2d) 71 | 72 | u = nn.down_shifted_deconv2d(u, num_filters=nr_filters, stride=[2, 2]) 73 | ul = nn.down_right_shifted_deconv2d(ul, num_filters=nr_filters, stride=[2, 2]) 74 | 75 | for rep in range(nr_resnet+1): 76 | u = nn.gated_resnet(u, u_list.pop(), conv=nn.down_shifted_conv2d) 77 | ul = nn.gated_resnet(ul, tf.concat([u, ul_list.pop()],3), conv=nn.down_right_shifted_conv2d) 78 | 79 | x_out = nn.nin(tf.nn.elu(ul),10*nr_logistic_mix) 
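        # NOTE: for 3-channel images, each of the nr_logistic_mix components of the
        # discretized logistic mixture uses 10 parameters per pixel (1 mixture weight,
        # 3 means, 3 log-scales, 3 RGB coupling coefficients), hence 10*nr_logistic_mix.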
80 | 81 | assert len(u_list) == 0 82 | assert len(ul_list) == 0 83 | 84 | return x_out 85 | 86 | -------------------------------------------------------------------------------- /DSL_ImgProcess/pixel_cnn_pp/plotting.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import matplotlib 3 | matplotlib.use('Agg') 4 | from matplotlib import pyplot as plt 5 | 6 | # Plot image examples. 7 | def plot_img(img, title=None): 8 | plt.figure() 9 | plt.imshow(img, interpolation='nearest') 10 | if title is not None: 11 | plt.title(title) 12 | plt.axis('off') 13 | plt.tight_layout() 14 | 15 | def img_stretch(img): 16 | img = img.astype(float) 17 | img -= np.min(img) 18 | img /= np.max(img)+1e-12 19 | return img 20 | 21 | def img_tile(imgs, aspect_ratio=1.0, tile_shape=None, border=1, 22 | border_color=0, stretch=False): 23 | ''' Tile images in a grid. 24 | If tile_shape is provided only as many images as specified in tile_shape 25 | will be included in the output. 26 | ''' 27 | 28 | # Prepare images 29 | if stretch: 30 | imgs = img_stretch(imgs) 31 | imgs = np.array(imgs) 32 | if imgs.ndim != 3 and imgs.ndim != 4: 33 | raise ValueError('imgs has wrong number of dimensions.') 34 | n_imgs = imgs.shape[0] 35 | 36 | # Grid shape 37 | img_shape = np.array(imgs.shape[1:3]) 38 | if tile_shape is None: 39 | img_aspect_ratio = img_shape[1] / float(img_shape[0]) 40 | aspect_ratio *= img_aspect_ratio 41 | tile_height = int(np.ceil(np.sqrt(n_imgs * aspect_ratio))) 42 | tile_width = int(np.ceil(np.sqrt(n_imgs / aspect_ratio))) 43 | grid_shape = np.array((tile_height, tile_width)) 44 | else: 45 | assert len(tile_shape) == 2 46 | grid_shape = np.array(tile_shape) 47 | 48 | # Tile image shape 49 | tile_img_shape = np.array(imgs.shape[1:]) 50 | tile_img_shape[:2] = (img_shape[:2] + border) * grid_shape[:2] - border 51 | 52 | # Assemble tile image 53 | tile_img = np.empty(tile_img_shape) 54 | tile_img[:] = border_color 55 | for i in range(grid_shape[0]): 56 | for j in range(grid_shape[1]): 57 | img_idx = j + i*grid_shape[1] 58 | if img_idx >= n_imgs: 59 | # No more images - stop filling out the grid. 60 | break 61 | img = imgs[img_idx] 62 | yoff = (img_shape[0] + border) * i 63 | xoff = (img_shape[1] + border) * j 64 | tile_img[yoff:yoff+img_shape[0], xoff:xoff+img_shape[1], ...] = img 65 | 66 | return tile_img 67 | 68 | def conv_filter_tile(filters): 69 | n_filters, n_channels, height, width = filters.shape 70 | tile_shape = None 71 | if n_channels == 3: 72 | # Interpret 3 color channels as RGB 73 | filters = np.transpose(filters, (0, 2, 3, 1)) 74 | else: 75 | # Organize tile such that each row corresponds to a filter and the 76 | # columns are the filter channels 77 | tile_shape = (n_channels, n_filters) 78 | filters = np.transpose(filters, (1, 0, 2, 3)) 79 | filters = np.resize(filters, (n_filters*n_channels, height, width)) 80 | filters = img_stretch(filters) 81 | return img_tile(filters, tile_shape=tile_shape) 82 | 83 | def scale_to_unit_interval(ndar, eps=1e-8): 84 | """ Scales all values in the ndarray ndar to be between 0 and 1 """ 85 | ndar = ndar.copy() 86 | ndar -= ndar.min() 87 | ndar *= 1.0 / (ndar.max() + eps) 88 | return ndar 89 | 90 | 91 | def tile_raster_images(X, img_shape, tile_shape, tile_spacing=(0, 0), 92 | scale_rows_to_unit_interval=True, 93 | output_pixel_vals=True): 94 | """ 95 | Transform an array with one flattened image per row, into an array in 96 | which images are reshaped and layed out like tiles on a floor. 
97 | 98 | This function is useful for visualizing datasets whose rows are images, 99 | and also columns of matrices for transforming those rows 100 | (such as the first layer of a neural net). 101 | 102 | :type X: a 2-D ndarray or a tuple of 4 channels, elements of which can 103 | be 2-D ndarrays or None; 104 | :param X: a 2-D array in which every row is a flattened image. 105 | 106 | :type img_shape: tuple; (height, width) 107 | :param img_shape: the original shape of each image 108 | 109 | :type tile_shape: tuple; (rows, cols) 110 | :param tile_shape: the number of images to tile (rows, cols) 111 | 112 | :param output_pixel_vals: if output should be pixel values (i.e. int8 113 | values) or floats 114 | 115 | :param scale_rows_to_unit_interval: if the values need to be scaled before 116 | being plotted to [0,1] or not 117 | 118 | 119 | :returns: array suitable for viewing as an image. 120 | (See:`PIL.Image.fromarray`.) 121 | :rtype: a 2-d array with same dtype as X. 122 | 123 | """ 124 | 125 | assert len(img_shape) == 2 126 | assert len(tile_shape) == 2 127 | assert len(tile_spacing) == 2 128 | 129 | # The expression below can be re-written in a more C style as 130 | # follows : 131 | # 132 | # out_shape = [0,0] 133 | # out_shape[0] = (img_shape[0] + tile_spacing[0]) * tile_shape[0] - 134 | # tile_spacing[0] 135 | # out_shape[1] = (img_shape[1] + tile_spacing[1]) * tile_shape[1] - 136 | # tile_spacing[1] 137 | out_shape = [(ishp + tsp) * tshp - tsp for ishp, tshp, tsp 138 | in zip(img_shape, tile_shape, tile_spacing)] 139 | 140 | if isinstance(X, tuple): 141 | assert len(X) == 4 142 | # Create an output numpy ndarray to store the image 143 | if output_pixel_vals: 144 | out_array = np.zeros((out_shape[0], out_shape[1], 4), dtype='uint8') 145 | else: 146 | out_array = np.zeros((out_shape[0], out_shape[1], 4), dtype=X.dtype) 147 | 148 | #colors default to 0, alpha defaults to 1 (opaque) 149 | if output_pixel_vals: 150 | channel_defaults = [0, 0, 0, 255] 151 | else: 152 | channel_defaults = [0., 0., 0., 1.] 
153 | 154 | for i in range(4): 155 | if X[i] is None: 156 | # if channel is None, fill it with zeros of the correct 157 | # dtype 158 | out_array[:, :, i] = np.zeros(out_shape, 159 | dtype='uint8' if output_pixel_vals else out_array.dtype 160 | ) + channel_defaults[i] 161 | else: 162 | # use a recurrent call to compute the channel and store it 163 | # in the output 164 | out_array[:, :, i] = tile_raster_images(X[i], img_shape, tile_shape, tile_spacing, scale_rows_to_unit_interval, output_pixel_vals) 165 | return out_array 166 | 167 | else: 168 | # if we are dealing with only one channel 169 | H, W = img_shape 170 | Hs, Ws = tile_spacing 171 | 172 | # generate a matrix to store the output 173 | out_array = np.zeros(out_shape, dtype='uint8' if output_pixel_vals else X.dtype) 174 | 175 | 176 | for tile_row in range(tile_shape[0]): 177 | for tile_col in range(tile_shape[1]): 178 | if tile_row * tile_shape[1] + tile_col < X.shape[0]: 179 | if scale_rows_to_unit_interval: 180 | # if we should scale values to be between 0 and 1 181 | # do this by calling the `scale_to_unit_interval` 182 | # function 183 | this_img = scale_to_unit_interval(X[tile_row * tile_shape[1] + tile_col].reshape(img_shape)) 184 | else: 185 | this_img = X[tile_row * tile_shape[1] + tile_col].reshape(img_shape) 186 | # add the slice to the corresponding position in the 187 | # output array 188 | out_array[ 189 | tile_row * (H+Hs): tile_row * (H + Hs) + H, 190 | tile_col * (W+Ws): tile_col * (W + Ws) + W 191 | ] \ 192 | = this_img * (255 if output_pixel_vals else 1) 193 | return out_array 194 | 195 | -------------------------------------------------------------------------------- /DSL_ImgProcess/test_grab_train_loss.py: -------------------------------------------------------------------------------- 1 | """ 2 | Trains a Pixel-CNN++ generative model on CIFAR-10 or Tiny ImageNet data. 3 | Uses multiple GPUs, indicated by the flag --nr-gpu 4 | 5 | Example usage: 6 | CUDA_VISIBLE_DEVICES=0,1,2,3 python train_double_cnn.py --nr_gpu 4 7 | """ 8 | 9 | import os 10 | import sys 11 | import time 12 | import json 13 | import argparse 14 | 15 | import numpy as np 16 | import tensorflow as tf 17 | 18 | import pixel_cnn_pp.nn as nn 19 | import pixel_cnn_pp.plotting as plotting 20 | from pixel_cnn_pp.model import model_spec 21 | import data.cifar10_data as cifar10_data 22 | import data.imagenet_data as imagenet_data 23 | 24 | # ----------------------------------------------------------------------------- 25 | parser = argparse.ArgumentParser() 26 | # data I/O 27 | parser.add_argument('-i', '--data_dir', type=str, default='/tmp/pxpp/data', help='Location for the dataset') 28 | parser.add_argument('-o', '--save_dir', type=str, default='/tmp/pxpp/save', help='Location for parameter checkpoints and samples') 29 | parser.add_argument('-d', '--data_set', type=str, default='cifar', help='Can be either cifar|imagenet') 30 | parser.add_argument('-t', '--save_interval', type=int, default=20, help='Every how many epochs to write checkpoint/samples?') 31 | parser.add_argument('-r', '--load_params', dest='load_params', action='store_true', help='Restore training from previous model checkpoint?') 32 | # model 33 | parser.add_argument('-q', '--nr_resnet', type=int, default=5, help='Number of residual blocks per stage of the model') 34 | parser.add_argument('-n', '--nr_filters', type=int, default=160, help='Number of filters to use across the model. 
Higher = larger model.') 35 | parser.add_argument('-m', '--nr_logistic_mix', type=int, default=10, help='Number of logistic components in the mixture. Higher = more flexible model') 36 | parser.add_argument('-z', '--resnet_nonlinearity', type=str, default='concat_elu', help='Which nonlinearity to use in the ResNet layers. One of "concat_elu", "elu", "relu" ') 37 | parser.add_argument('-c', '--class_conditional', dest='class_conditional', action='store_true', help='Condition generative model on labels?') 38 | # optimization 39 | parser.add_argument('-l', '--learning_rate', type=float, default=0.001, help='Base learning rate') 40 | parser.add_argument('-e', '--lr_decay', type=float, default=0.999995, help='Learning rate decay, applied every step of the optimization') 41 | parser.add_argument('-b', '--batch_size', type=int, default=12, help='Batch size during training per GPU') 42 | parser.add_argument('-a', '--init_batch_size', type=int, default=100, help='How much data to use for data-dependent initialization.') 43 | parser.add_argument('-p', '--dropout_p', type=float, default=0.5, help='Dropout strength (i.e. 1 - keep_prob). 0 = No dropout, higher = more dropout.') 44 | parser.add_argument('-x', '--max_epochs', type=int, default=5000, help='How many epochs to run in total?') 45 | parser.add_argument('-g', '--nr_gpu', type=int, default=8, help='How many GPUs to distribute the training across?') 46 | # evaluation 47 | parser.add_argument('--polyak_decay', type=float, default=0.9995, help='Exponential decay rate of the sum of previous model iterates during Polyak averaging') 48 | # reproducibility 49 | parser.add_argument('-s', '--seed', type=int, default=1, help='Random seed to use') 50 | args = parser.parse_args() 51 | print('input args:\n', json.dumps(vars(args), indent=4, separators=(',',':'))) # pretty print args 52 | 53 | # ----------------------------------------------------------------------------- 54 | # fix random seed for reproducibility 55 | rng = np.random.RandomState(args.seed) 56 | tf.set_random_seed(args.seed) 57 | 58 | # initialize data loaders for train/test splits 59 | if args.data_set == 'imagenet' and args.class_conditional: 60 | raise("We currently don't have labels for the small imagenet data set") 61 | DataLoader = {'cifar':cifar10_data.DataLoader, 'imagenet':imagenet_data.DataLoader}[args.data_set] 62 | train_data = DataLoader(args.data_dir, 'train', args.batch_size * args.nr_gpu, rng=rng, shuffle=False, return_labels=args.class_conditional) 63 | test_data = DataLoader(args.data_dir, 'test', args.batch_size * args.nr_gpu, shuffle=False, return_labels=args.class_conditional) 64 | obs_shape = train_data.get_observation_size() # e.g. 
a tuple (32,32,3) 65 | assert len(obs_shape) == 3, 'assumed right now' 66 | 67 | # data place holders 68 | x_init = tf.placeholder(tf.float32, shape=(args.init_batch_size,) + obs_shape) 69 | xs = [tf.placeholder(tf.float32, shape=(args.batch_size, ) + obs_shape) for i in range(args.nr_gpu)] 70 | 71 | # if the model is class-conditional we'll set up label placeholders + one-hot encodings 'h' to condition on 72 | if args.class_conditional: 73 | num_labels = train_data.get_num_labels() 74 | y_init = tf.placeholder(tf.int32, shape=(args.init_batch_size,)) 75 | h_init = tf.one_hot(y_init, num_labels) 76 | y_sample = np.split(np.mod(np.arange(args.batch_size*args.nr_gpu), num_labels), args.nr_gpu) 77 | h_sample = [tf.one_hot(tf.Variable(y_sample[i], trainable=False), num_labels) for i in range(args.nr_gpu)] 78 | ys = [tf.placeholder(tf.int32, shape=(args.batch_size,)) for i in range(args.nr_gpu)] 79 | hs = [tf.one_hot(ys[i], num_labels) for i in range(args.nr_gpu)] 80 | else: 81 | h_init = None 82 | h_sample = [None] * args.nr_gpu 83 | hs = h_sample 84 | 85 | # create the model 86 | model_opt = { 'nr_resnet': args.nr_resnet, 'nr_filters': args.nr_filters, 'nr_logistic_mix': args.nr_logistic_mix, 'resnet_nonlinearity': args.resnet_nonlinearity } 87 | model = tf.make_template('model', model_spec) 88 | 89 | # run once for data dependent initialization of parameters 90 | gen_par = model(x_init, h_init, init=True, dropout_p=args.dropout_p, **model_opt) 91 | 92 | # keep track of moving average 93 | all_params = tf.trainable_variables() 94 | ema = tf.train.ExponentialMovingAverage(decay=args.polyak_decay) 95 | maintain_averages_op = tf.group(ema.apply(all_params)) 96 | 97 | # get loss gradients over multiple GPUs 98 | grads = [] 99 | loss_gen = [] 100 | loss_gen_test = [] 101 | for i in range(args.nr_gpu): 102 | with tf.device('/gpu:%d' % i): 103 | # train 104 | gen_par = model(xs[i], hs[i], ema=None, dropout_p=args.dropout_p, **model_opt) 105 | loss_gen.append(nn.discretized_mix_logistic_loss(xs[i], gen_par)) 106 | # gradients 107 | grads.append(tf.gradients(loss_gen[i], all_params)) 108 | # test 109 | gen_par = model(xs[i], hs[i], ema=ema, dropout_p=0., **model_opt) 110 | loss_gen_test.append(nn.discretized_mix_logistic_loss(xs[i], gen_par)) 111 | 112 | # add losses and gradients together and get training updates 113 | tf_lr = tf.placeholder(tf.float32, shape=[]) 114 | with tf.device('/gpu:0'): 115 | for i in range(1,args.nr_gpu): 116 | loss_gen[0] += loss_gen[i] 117 | loss_gen_test[0] += loss_gen_test[i] 118 | for j in range(len(grads[0])): 119 | grads[0][j] += grads[i][j] 120 | # training op 121 | optimizer = tf.group(nn.adam_updates(all_params, grads[0], lr=tf_lr, mom1=0.95, mom2=0.9995), maintain_averages_op) 122 | 123 | # convert loss to bits/dim 124 | bits_per_dim = loss_gen[0]/(args.nr_gpu*np.log(2.)*np.prod(obs_shape)*args.batch_size) 125 | bits_per_dim_test = loss_gen_test[0]/(args.nr_gpu*np.log(2.)*np.prod(obs_shape)*args.batch_size) 126 | 127 | # sample from the model 128 | new_x_gen = [] 129 | for i in range(args.nr_gpu): 130 | with tf.device('/gpu:%d' % i): 131 | gen_par = model(xs[i], h_sample[i], ema=ema, dropout_p=0, **model_opt) 132 | new_x_gen.append(nn.sample_from_discretized_mix_logistic(gen_par, args.nr_logistic_mix)) 133 | def sample_from_model(sess): 134 | x_gen = [np.zeros((args.batch_size,) + obs_shape, dtype=np.float32) for i in range(args.nr_gpu)] 135 | for yi in range(obs_shape[0]): 136 | for xi in range(obs_shape[1]): 137 | new_x_gen_np = sess.run(new_x_gen, {xs[i]: 
x_gen[i] for i in range(args.nr_gpu)}) 138 | for i in range(args.nr_gpu): 139 | x_gen[i][:,yi,xi,:] = new_x_gen_np[i][:,yi,xi,:] 140 | return np.concatenate(x_gen, axis=0) 141 | 142 | # init & save 143 | initializer = tf.initialize_all_variables() 144 | saver = tf.train.Saver() 145 | 146 | # turn numpy inputs into feed_dict for use with tensorflow 147 | def make_feed_dict(data, init=False): 148 | if type(data) is tuple: 149 | x,y = data 150 | else: 151 | x = data 152 | y = None 153 | x = np.cast[np.float32]((x - 127.5) / 127.5) # input to pixelCNN is scaled from uint8 [0,255] to float in range [-1,1] 154 | if init: 155 | feed_dict = {x_init: x} 156 | if y is not None: 157 | feed_dict.update({y_init: y}) 158 | else: 159 | x = np.split(x, args.nr_gpu) 160 | feed_dict = {xs[i]: x[i] for i in range(args.nr_gpu)} 161 | if y is not None: 162 | y = np.split(y, args.nr_gpu) 163 | feed_dict.update({ys[i]: y[i] for i in range(args.nr_gpu)}) 164 | return feed_dict 165 | 166 | # //////////// perform testing ////////////// 167 | 168 | print('starting testing') 169 | test_bpd = [] 170 | lr = args.learning_rate 171 | 172 | with tf.Session() as sess: 173 | # compute likelihood over test data 174 | ckpt_file = args.save_dir + '/params_' + args.data_set + '.ckpt' 175 | print('restoring parameters from', ckpt_file) 176 | saver.restore(sess, ckpt_file) 177 | 178 | test_losses = [] 179 | uidx = 0 180 | for d in train_data: 181 | feed_dict = make_feed_dict(d) 182 | l = sess.run(bits_per_dim_test, feed_dict) 183 | test_losses.append(l) 184 | uidx += 1 185 | if uidx % 100 == 0: 186 | print(uidx, l) 187 | test_loss_gen = np.mean(test_losses) 188 | print(uidx, ' -- ', test_loss_gen) 189 | test_bpd.append(test_loss_gen) 190 | print('Test nll=%.2f' % test_loss_gen) 191 | 192 | np.savez('./TMD', np.array(test_losses)) 193 | 194 | 195 | 196 | 197 | -------------------------------------------------------------------------------- /DSL_ImgProcess/train.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Microsoft. All rights reserved. 2 | # Licensed under the MIT license. See LICENSE file in the project root for full license information. 
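# Illustrative invocation (not part of the original file; the flag names match the
# argparse definitions below, and the paths are placeholders):
#   CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --nr_gpu 4 --data_set cifar \
#       --data_dir /tmp/pxpp/data --save_dir /tmp/pxpp/save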
3 | 4 | import os 5 | import sys 6 | import time 7 | import json 8 | import argparse 9 | 10 | import numpy as np 11 | import tensorflow as tf 12 | 13 | import pixel_cnn_pp.nn as nn 14 | import pixel_cnn_pp.plotting as plotting 15 | from pixel_cnn_pp.model import model_spec 16 | import data.cifar10_data as cifar10_data 17 | import data.imagenet_data as imagenet_data 18 | 19 | # ----------------------------------------------------------------------------- 20 | parser = argparse.ArgumentParser() 21 | # data I/O 22 | parser.add_argument('-i', '--data_dir', type=str, default='/tmp/pxpp/data', help='Location for the dataset') 23 | parser.add_argument('-o', '--save_dir', type=str, default='/tmp/pxpp/save', help='Location for parameter checkpoints and samples') 24 | parser.add_argument('-d', '--data_set', type=str, default='cifar', help='Can be either cifar|imagenet') 25 | parser.add_argument('-t', '--save_interval', type=int, default=20, help='Every how many epochs to write checkpoint/samples?') 26 | parser.add_argument('-r', '--load_params', dest='load_params', action='store_true', help='Restore training from previous model checkpoint?') 27 | # model 28 | parser.add_argument('-q', '--nr_resnet', type=int, default=5, help='Number of residual blocks per stage of the model') 29 | parser.add_argument('-n', '--nr_filters', type=int, default=160, help='Number of filters to use across the model. Higher = larger model.') 30 | parser.add_argument('-m', '--nr_logistic_mix', type=int, default=10, help='Number of logistic components in the mixture. Higher = more flexible model') 31 | parser.add_argument('-z', '--resnet_nonlinearity', type=str, default='concat_elu', help='Which nonlinearity to use in the ResNet layers. One of "concat_elu", "elu", "relu" ') 32 | parser.add_argument('-c', '--class_conditional', dest='class_conditional', action='store_true', help='Condition generative model on labels?') 33 | # optimization 34 | parser.add_argument('-l', '--learning_rate', type=float, default=0.001, help='Base learning rate') 35 | parser.add_argument('-e', '--lr_decay', type=float, default=0.999995, help='Learning rate decay, applied every step of the optimization') 36 | parser.add_argument('-b', '--batch_size', type=int, default=12, help='Batch size during training per GPU') 37 | parser.add_argument('-a', '--init_batch_size', type=int, default=100, help='How much data to use for data-dependent initialization.') 38 | parser.add_argument('-p', '--dropout_p', type=float, default=0.5, help='Dropout strength (i.e. 1 - keep_prob). 
0 = No dropout, higher = more dropout.') 39 | parser.add_argument('-x', '--max_epochs', type=int, default=5000, help='How many epochs to run in total?') 40 | parser.add_argument('-g', '--nr_gpu', type=int, default=8, help='How many GPUs to distribute the training across?') 41 | # evaluation 42 | parser.add_argument('--polyak_decay', type=float, default=0.9995, help='Exponential decay rate of the sum of previous model iterates during Polyak averaging') 43 | # reproducibility 44 | parser.add_argument('-s', '--seed', type=int, default=1, help='Random seed to use') 45 | args = parser.parse_args() 46 | print('input args:\n', json.dumps(vars(args), indent=4, separators=(',',':'))) # pretty print args 47 | 48 | # ----------------------------------------------------------------------------- 49 | # fix random seed for reproducibility 50 | rng = np.random.RandomState(args.seed) 51 | tf.set_random_seed(args.seed) 52 | 53 | # initialize data loaders for train/test splits 54 | if args.data_set == 'imagenet' and args.class_conditional: 55 | raise("We currently don't have labels for the small imagenet data set") 56 | DataLoader = {'cifar':cifar10_data.DataLoader, 'imagenet':imagenet_data.DataLoader}[args.data_set] 57 | train_data = DataLoader(args.data_dir, 'train', args.batch_size * args.nr_gpu, rng=rng, shuffle=True, return_labels=args.class_conditional) 58 | test_data = DataLoader(args.data_dir, 'test', args.batch_size * args.nr_gpu, shuffle=False, return_labels=args.class_conditional) 59 | obs_shape = train_data.get_observation_size() # e.g. a tuple (32,32,3) 60 | assert len(obs_shape) == 3, 'assumed right now' 61 | 62 | # data place holders 63 | x_init = tf.placeholder(tf.float32, shape=(args.init_batch_size,) + obs_shape) 64 | xs = [tf.placeholder(tf.float32, shape=(args.batch_size, ) + obs_shape) for i in range(args.nr_gpu)] 65 | 66 | # if the model is class-conditional we'll set up label placeholders + one-hot encodings 'h' to condition on 67 | if args.class_conditional: 68 | num_labels = train_data.get_num_labels() 69 | y_init = tf.placeholder(tf.int32, shape=(args.init_batch_size,)) 70 | h_init = tf.one_hot(y_init, num_labels) 71 | y_sample = np.split(np.mod(np.arange(args.batch_size*args.nr_gpu), num_labels), args.nr_gpu) 72 | h_sample = [tf.one_hot(tf.Variable(y_sample[i], trainable=False), num_labels) for i in range(args.nr_gpu)] 73 | ys = [tf.placeholder(tf.int32, shape=(args.batch_size,)) for i in range(args.nr_gpu)] 74 | hs = [tf.one_hot(ys[i], num_labels) for i in range(args.nr_gpu)] 75 | else: 76 | h_init = None 77 | h_sample = [None] * args.nr_gpu 78 | hs = h_sample 79 | 80 | # create the model 81 | model_opt = { 'nr_resnet': args.nr_resnet, 'nr_filters': args.nr_filters, 'nr_logistic_mix': args.nr_logistic_mix, 'resnet_nonlinearity': args.resnet_nonlinearity } 82 | model = tf.make_template('model', model_spec) 83 | 84 | # run once for data dependent initialization of parameters 85 | gen_par = model(x_init, h_init, init=True, dropout_p=args.dropout_p, **model_opt) 86 | 87 | # keep track of moving average 88 | all_params = tf.trainable_variables() 89 | ema = tf.train.ExponentialMovingAverage(decay=args.polyak_decay) 90 | maintain_averages_op = tf.group(ema.apply(all_params)) 91 | 92 | # get loss gradients over multiple GPUs 93 | grads = [] 94 | loss_gen = [] 95 | loss_gen_test = [] 96 | for i in range(args.nr_gpu): 97 | with tf.device('/gpu:%d' % i): 98 | # train 99 | gen_par = model(xs[i], hs[i], ema=None, dropout_p=args.dropout_p, **model_opt) 100 | 
loss_gen.append(nn.discretized_mix_logistic_loss(xs[i], gen_par)) 101 | # gradients 102 | grads.append(tf.gradients(loss_gen[i], all_params)) 103 | # test 104 | gen_par = model(xs[i], hs[i], ema=ema, dropout_p=0., **model_opt) 105 | loss_gen_test.append(nn.discretized_mix_logistic_loss(xs[i], gen_par)) 106 | 107 | # add losses and gradients together and get training updates 108 | tf_lr = tf.placeholder(tf.float32, shape=[]) 109 | with tf.device('/gpu:0'): 110 | for i in range(1,args.nr_gpu): 111 | loss_gen[0] += loss_gen[i] 112 | loss_gen_test[0] += loss_gen_test[i] 113 | for j in range(len(grads[0])): 114 | grads[0][j] += grads[i][j] 115 | # training op 116 | optimizer = tf.group(nn.adam_updates(all_params, grads[0], lr=tf_lr, mom1=0.95, mom2=0.9995), maintain_averages_op) 117 | 118 | # convert loss to bits/dim 119 | bits_per_dim = loss_gen[0]/(args.nr_gpu*np.log(2.)*np.prod(obs_shape)*args.batch_size) 120 | bits_per_dim_test = loss_gen_test[0]/(args.nr_gpu*np.log(2.)*np.prod(obs_shape)*args.batch_size) 121 | 122 | # sample from the model 123 | new_x_gen = [] 124 | for i in range(args.nr_gpu): 125 | with tf.device('/gpu:%d' % i): 126 | gen_par = model(xs[i], h_sample[i], ema=ema, dropout_p=0, **model_opt) 127 | new_x_gen.append(nn.sample_from_discretized_mix_logistic(gen_par, args.nr_logistic_mix)) 128 | def sample_from_model(sess): 129 | x_gen = [np.zeros((args.batch_size,) + obs_shape, dtype=np.float32) for i in range(args.nr_gpu)] 130 | for yi in range(obs_shape[0]): 131 | for xi in range(obs_shape[1]): 132 | new_x_gen_np = sess.run(new_x_gen, {xs[i]: x_gen[i] for i in range(args.nr_gpu)}) 133 | for i in range(args.nr_gpu): 134 | x_gen[i][:,yi,xi,:] = new_x_gen_np[i][:,yi,xi,:] 135 | return np.concatenate(x_gen, axis=0) 136 | 137 | # init & save 138 | initializer = tf.initialize_all_variables() 139 | saver = tf.train.Saver() 140 | 141 | # turn numpy inputs into feed_dict for use with tensorflow 142 | def make_feed_dict(data, init=False): 143 | if type(data) is tuple: 144 | x,y = data 145 | else: 146 | x = data 147 | y = None 148 | x = np.cast[np.float32]((x - 127.5) / 127.5) # input to pixelCNN is scaled from uint8 [0,255] to float in range [-1,1] 149 | if init: 150 | feed_dict = {x_init: x} 151 | if y is not None: 152 | feed_dict.update({y_init: y}) 153 | else: 154 | x = np.split(x, args.nr_gpu) 155 | feed_dict = {xs[i]: x[i] for i in range(args.nr_gpu)} 156 | if y is not None: 157 | y = np.split(y, args.nr_gpu) 158 | feed_dict.update({ys[i]: y[i] for i in range(args.nr_gpu)}) 159 | return feed_dict 160 | 161 | # //////////// perform training ////////////// 162 | if not os.path.exists(args.save_dir): 163 | os.makedirs(args.save_dir) 164 | print('starting training') 165 | test_bpd = [] 166 | lr = args.learning_rate 167 | with tf.Session() as sess: 168 | for epoch in range(args.max_epochs): 169 | begin = time.time() 170 | 171 | # init 172 | if epoch == 0: 173 | feed_dict = make_feed_dict(train_data.next(args.init_batch_size), init=True) # manually retrieve exactly init_batch_size examples 174 | train_data.reset() # rewind the iterator back to 0 to do one full epoch 175 | sess.run(initializer, feed_dict) 176 | print('initializing the model...') 177 | if args.load_params: 178 | ckpt_file = args.save_dir + '/params_' + args.data_set + '.ckpt' 179 | print('restoring parameters from', ckpt_file) 180 | saver.restore(sess, ckpt_file) 181 | 182 | # train for one epoch 183 | train_losses = [] 184 | for d in train_data: 185 | feed_dict = make_feed_dict(d) 186 | # forward/backward/update model on 
each gpu 187 | lr *= args.lr_decay 188 | feed_dict.update({ tf_lr: lr }) 189 | l,_ = sess.run([bits_per_dim, optimizer], feed_dict) 190 | train_losses.append(l) 191 | train_loss_gen = np.mean(train_losses) 192 | 193 | # compute likelihood over test data 194 | test_losses = [] 195 | for d in test_data: 196 | feed_dict = make_feed_dict(d) 197 | l = sess.run(bits_per_dim_test, feed_dict) 198 | test_losses.append(l) 199 | test_loss_gen = np.mean(test_losses) 200 | test_bpd.append(test_loss_gen) 201 | 202 | # log progress to console 203 | print("Iteration %d, time = %ds, train bits_per_dim = %.4f, test bits_per_dim = %.4f" % (epoch, time.time()-begin, train_loss_gen, test_loss_gen)) 204 | sys.stdout.flush() 205 | 206 | if epoch % args.save_interval == 0: 207 | 208 | # generate samples from the model 209 | sample_x = sample_from_model(sess) 210 | img_tile = plotting.img_tile(sample_x[:int(np.floor(np.sqrt(args.batch_size*args.nr_gpu))**2)], aspect_ratio=1.0, border_color=1.0, stretch=True) 211 | img = plotting.plot_img(img_tile, title=args.data_set + ' samples') 212 | plotting.plt.savefig(os.path.join(args.save_dir,'%s_sample%d.png' % (args.data_set, epoch))) 213 | plotting.plt.close('all') 214 | 215 | # save params 216 | saver.save(sess, args.save_dir + '/params_' + args.data_set + '.ckpt') 217 | np.savez(args.save_dir + '/test_bpd_' + args.data_set + '.npz', test_bpd=np.array(test_bpd)) 218 | -------------------------------------------------------------------------------- /DSL_ImgProcess/worker_I2L.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Microsoft. All rights reserved. 2 | # Licensed under the MIT license. See LICENSE file in the project root for full license information. 3 | 4 | import time 5 | import sys 6 | import os 7 | 8 | import cifar_input 9 | import numpy as np 10 | import resnet_model_basic as resnet_model 11 | import tensorflow as tf 12 | import data.cifar10_data as cifar10_data 13 | 14 | 15 | 16 | def lr_I2L(train_step): 17 | #step_wise = [40000,60000,80000] # this is the one for original 18 | step_wise = [51000,76000,102000] 19 | if train_step < step_wise[0]: 20 | return 0.1 21 | elif train_step < step_wise[1]: 22 | return 0.01 23 | elif train_step < step_wise[2]: 24 | return 0.001 25 | else: 26 | return 0.0001 27 | 28 | class worker_I2L(object): 29 | def __init__(self, args): 30 | 31 | hps = resnet_model.HParams(batch_size=args.batch_size, 32 | num_classes=10, 33 | min_lrn_rate=0.0001, 34 | lrn_rate=0.1, 35 | num_residual_units=18, 36 | use_bottleneck=False, 37 | weight_decay_rate=0.0002, 38 | relu_leakiness=0.1, 39 | optimizer='mom') 40 | self.args = args 41 | self.model = resnet_model.ResNet(hps, args.mode, use_wide_resnet=args.use_wide_resnet, nr_gpu=args.nr_gpu) 42 | self.model.build_graph() 43 | 44 | truth = tf.argmax(tf.concat(self.model.labels, axis=0), axis=1) 45 | predictions = tf.argmax(tf.concat(self.model.predictions,axis=0), axis=1) 46 | self.right_decision = tf.reduce_sum(tf.to_float(tf.equal(predictions, truth))) 47 | 48 | def GetLoss(self): 49 | return self.model.nlls, self.model.GetWeightDecay() 50 | 51 | def Valid(self, test_data, sess): 52 | with tf.device('/gpu:0'): 53 | cost_all = self.model.nlls[0] 54 | for i in range(1, self.args.nr_gpu): 55 | cost_all += self.model.nlls[i] 56 | 57 | m_sample = 0 58 | m_correct = 0. 59 | costs = 0. 
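# Note (explanatory, not from the original file): the loop below splits every test batch
# evenly across the nr_gpu GPUs, accumulates the summed NLL (costs) and the number of
# correct predictions (m_correct), and then reports per-sample test NLL and accuracy.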
60 | for test_image, test_label in test_data: 61 | m_sample += test_image.shape[0] 62 | 63 | splitted_image = np.split(test_image.astype('float32'), self.args.nr_gpu) 64 | splitted_label = np.split(test_label, self.args.nr_gpu) 65 | 66 | feed_dict = {self.model.needImgAug: False} 67 | feed_dict.update({self.model.input_image[i]: splitted_image[i] for i in range(self.args.nr_gpu)}) 68 | feed_dict.update({self.model.input_label[i]: splitted_label[i][:, None] for i in range(self.args.nr_gpu)}) 69 | 70 | _cost, _right_decision = sess.run([cost_all, self.right_decision], feed_dict) 71 | costs += np.sum(_cost) 72 | m_correct += _right_decision 73 | test_loss = costs / m_sample 74 | test_acc = m_correct * 1. / m_sample 75 | print('[I2L] test_nll={},test_acc={}'.format( 76 | '{0:.4f}'.format(test_loss), '{0:.6f}'.format(test_acc) ) 77 | ) 78 | -------------------------------------------------------------------------------- /DSL_ImgProcess/worker_L2I.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Microsoft. All rights reserved. 2 | # Licensed under the MIT license. See LICENSE file in the project root for full license information. 3 | 4 | import os 5 | import sys 6 | import time 7 | import json 8 | import argparse 9 | 10 | import numpy as np 11 | import tensorflow as tf 12 | 13 | import pixel_cnn_pp.nn as nn 14 | import pixel_cnn_pp.plotting as plotting 15 | from pixel_cnn_pp.model import model_spec 16 | import data.cifar10_data as cifar10_data 17 | 18 | class worker_L2I(object): 19 | def __init__(self, args, num_labels, image_shape): 20 | # Default parameters 21 | self.num_labels = num_labels 22 | self.image_shape=image_shape 23 | self.args = args 24 | 25 | # Data userd for data-dependent parameter initialization 26 | self.x_init = tf.placeholder(tf.float32, shape=(args.init_batch_size,) + self.image_shape) 27 | self.xs = [tf.placeholder(tf.float32, shape=(args.batch_size, ) + self.image_shape) for _ in range(args.nr_gpu)] 28 | self.y_init = tf.placeholder(tf.int32, shape=(args.init_batch_size,)) 29 | self.h_init = tf.one_hot(self.y_init, self.num_labels) 30 | 31 | # parameters used for sampling 32 | self.y_sample = np.split(np.mod(np.arange(args.batch_size*args.nr_gpu), self.num_labels), args.nr_gpu) 33 | # self.h_sample = [tf.one_hot(tf.Variable(self.y_sample[i], trainable=False), self.num_labels) for i in range(args.nr_gpu)] 34 | # the above line is the version used for icml paper. 
I revise it as follows 35 | self.h_sample = [tf.one_hot(self.y_sample[i], self.num_labels) for i in range(args.nr_gpu)] 36 | self.ys = [tf.placeholder(tf.int32, shape=(args.batch_size,)) for i in range(args.nr_gpu)] 37 | self.hs = [tf.one_hot(self.ys[i], self.num_labels) for i in range(args.nr_gpu)] 38 | # create the model 39 | self.model_opt = { 'nr_resnet': args.nr_resnet, 'nr_filters': args.nr_filters, 'nr_logistic_mix': args.nr_logistic_mix, 'resnet_nonlinearity': args.resnet_nonlinearity } 40 | self.model = tf.make_template('model', model_spec) 41 | 42 | # run once for data dependent initialization of parameters 43 | # in the original code, it is " gen_par = self.model(...)"; when init=True, it will run initilization automatically 44 | self.model(self.x_init, self.h_init, init=True, dropout_p=args.dropout_p, **self.model_opt) 45 | 46 | # keep track of moving average 47 | self.all_params = tf.trainable_variables() 48 | self.ema = tf.train.ExponentialMovingAverage(decay=args.polyak_decay) 49 | self.maintain_averages_op = tf.group(self.ema.apply(self.all_params)) 50 | 51 | # parameters for optimization 52 | self.tf_lr = tf.placeholder(tf.float32, shape=()) 53 | 54 | def GetLoss(self): 55 | # get loss gradients over multiple GPUs 56 | loss_gen = [] 57 | loss_gen_test = [] 58 | for i in range(self.args.nr_gpu): 59 | with tf.device('/gpu:%d' % i): 60 | # train 61 | gen_par = self.model(self.xs[i], self.hs[i], ema=None, dropout_p=self.args.dropout_p, **self.model_opt) 62 | loss_gen.append(nn.discretized_mix_logistic_loss(self.xs[i], gen_par, sum_all=False)) 63 | 64 | # test 65 | gen_par = self.model(self.xs[i], self.hs[i], ema=self.ema, dropout_p=0., **self.model_opt) 66 | loss_gen_test.append(nn.discretized_mix_logistic_loss(self.xs[i], gen_par)) 67 | 68 | return loss_gen, loss_gen_test 69 | 70 | def GetOverallLoss(self): 71 | # get loss gradients over multiple GPUs 72 | loss_gen = [] 73 | loss_gen_test = [] 74 | for i in range(self.args.nr_gpu): 75 | with tf.device('/gpu:%d' % i): 76 | # train 77 | gen_par = self.model(self.xs[i], self.hs[i], ema=None, dropout_p=self.args.dropout_p, **self.model_opt) 78 | loss_gen.append(nn.discretized_mix_logistic_loss(self.xs[i], gen_par, sum_all=False)) 79 | 80 | # test 81 | gen_par = self.model(self.xs[i], self.hs[i], ema=self.ema, dropout_p=0., **self.model_opt) 82 | loss_gen_test.append(nn.discretized_mix_logistic_loss(self.xs[i], gen_par)) 83 | 84 | # add the lossx to /gpu:0 85 | with tf.device('/gpu:0'): 86 | for i in range(1,self.args.nr_gpu): 87 | loss_gen[0] += loss_gen[i] 88 | loss_gen_test[0] += loss_gen_test[i] 89 | 90 | # training op 91 | #optimizer = tf.group(nn.adam_updates(self.all_params, grads[0], lr=self.tf_lr, mom1=0.95, mom2=0.9995), self.maintain_averages_op) 92 | 93 | # convert loss to bits/dim 94 | self.bits_per_dim = loss_gen[0]/(self.args.nr_gpu*np.log(2.)*np.prod(self.image_shape)*self.args.batch_size) 95 | self.bits_per_dim_test = loss_gen_test[0]/(self.args.nr_gpu*np.log(2.)*np.prod(self.image_shape)*self.args.batch_size) 96 | 97 | def Update(self, grads, useSGD=False): 98 | if useSGD: 99 | print('Use pure SGD for Label-->Image tasks') 100 | optimizer = tf.train.GradientDescentOptimizer(learning_rate=self.tf_lr) 101 | apply_op = optimizer.apply_gradients(zip(grads, self.all_params)) 102 | self.update_ops = tf.group(apply_op) 103 | else: 104 | self.update_ops = tf.group(nn.adam_updates(self.all_params, grads, lr=self.tf_lr, mom1=0.95, mom2=0.9995), self.maintain_averages_op) 105 | 106 | def build_sample_from_model(self): 107 | 
# sample from the model 108 | self.new_x_gen = [] 109 | for i in range(self.args.nr_gpu): 110 | with tf.device('/gpu:%d' % i): 111 | gen_par = self.model(self.xs[i], self.h_sample[i], ema=self.ema, dropout_p=0, **self.model_opt) 112 | self.new_x_gen.append(nn.sample_from_discretized_mix_logistic(gen_par, self.args.nr_logistic_mix)) 113 | 114 | def _sample_from_model(self, sess): 115 | x_gen = [np.zeros((self.args.batch_size,) + self.image_shape, dtype=np.float32) for _ in range(self.args.nr_gpu)] 116 | for yi in range(self.image_shape[0]): 117 | for xi in range(self.image_shape[1]): 118 | new_x_gen_np = sess.run(self.new_x_gen, {self.xs[i]: x_gen[i] for i in range(self.args.nr_gpu)}) 119 | for i in range(self.args.nr_gpu): 120 | x_gen[i][:,yi,xi,:] = new_x_gen_np[i][:,yi,xi,:] 121 | return np.concatenate(x_gen, axis=0) 122 | 123 | 124 | def Gen_Images(self, sess, epoch): 125 | sample_x = self._sample_from_model(sess) 126 | img_tile = plotting.img_tile(sample_x[:int(np.floor(np.sqrt(self.args.batch_size*self.args.nr_gpu))**2)], aspect_ratio=1.0, border_color=1.0, stretch=True) 127 | img = plotting.plot_img(img_tile, title=self.args.data_set + ' samples') 128 | plotting.plt.savefig(os.path.join(self.args.save_dir,'%s_sample%d.png' % (self.args.data_set, epoch))) 129 | plotting.plt.close('all') 130 | -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/CLM/Data.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Microsoft. All rights reserved. 2 | # Licensed under the MIT license. See LICENSE file in the project root for full license information. 3 | 4 | import cPickle as pkl 5 | import gzip 6 | import os 7 | import numpy 8 | from theano import config 9 | 10 | def get_dataset_file(dataset, default_dataset, origin): 11 | ''' 12 | Look for it as if it was a full path, if not, try local file, 13 | if not try in the data directory. 14 | 15 | Download dataset if it is not present 16 | ''' 17 | data_dir, data_file = os.path.split(dataset) 18 | if data_dir == "" and not os.path.isfile(dataset): 19 | # Check if dataset is in the data directory. 20 | new_path = os.path.join( 21 | os.path.split(__file__)[0], 22 | "..", 23 | "data", 24 | dataset 25 | ) 26 | if os.path.isfile(new_path) or data_file == default_dataset: 27 | dataset = new_path 28 | 29 | if (not os.path.isfile(dataset)) and data_file == default_dataset: 30 | from six.moves import urllib 31 | print('Downloading data from %s' % origin) 32 | urllib.request.urlretrieve(origin, dataset) 33 | 34 | return dataset 35 | 36 | def load_data(path="imdb.pkl", n_words=100000, maxlen=None, 37 | sort_by_len=True, fixed_valid=True, valid_portion=0.1): 38 | ''' 39 | Loads the dataset 40 | :type path: String 41 | :param path: The path to the dataset (here IMDB) 42 | :type n_words: int 43 | :param n_words: The number of word to keep in the vocabulary. 44 | All extra words are set to unknow (1). 45 | :type maxlen: None or positive int 46 | :param maxlen: the max sequence length we use in the train/valid set. 47 | :type sort_by_len: bool 48 | :name sort_by_len: Sort by the sequence lenght for the train, 49 | valid and test set. This allow faster execution as it cause 50 | less padding per minibatch. Another mechanism must be used to 51 | shuffle the train set at each epoch. 
52 | :type fixed_valid: bool 53 | :param fixed_valid: load fixed validation set from the corpus file, 54 | which would otherwise be picked randomly from the training set with 55 | proportion [valid_portion] 56 | :type valid_portion: float 57 | :param valid_portion: The proportion of the full train set used for 58 | the validation set. 59 | 60 | ''' 61 | 62 | # Load the dataset 63 | path = get_dataset_file( 64 | path, "imdb.pkl", 65 | "http://www.iro.umontreal.ca/~lisa/deep/data/imdb.pkl") 66 | if path.endswith(".gz"): 67 | f = gzip.open(path, 'rb') 68 | else: 69 | f = open(path, 'rb') 70 | 71 | train_set = pkl.load(f) 72 | if fixed_valid: 73 | valid_set = pkl.load(f) 74 | test_set = pkl.load(f) 75 | f.close() 76 | 77 | def _truncate_data(train_set): 78 | ''' 79 | truncate sequences with lengths exceed max-len threshold 80 | :param train_set: a list of sequences list and corresponding labels list 81 | :return: truncated train_set 82 | ''' 83 | new_train_set_x = [] 84 | new_train_set_y = [] 85 | for x, y in zip(train_set[0], train_set[1]): 86 | if len(x) < maxlen: 87 | new_train_set_x.append(x) 88 | new_train_set_y.append(y) 89 | train_set = (new_train_set_x, new_train_set_y) 90 | del new_train_set_x, new_train_set_y 91 | return train_set 92 | 93 | def _set_valid(train_set, valid_portion): 94 | ''' 95 | set validation with [valid_portion] proportion of training set 96 | ''' 97 | train_set_x, train_set_y = train_set 98 | n_samples = len(train_set_x) 99 | sidx = numpy.random.permutation(n_samples) # shuffle data 100 | n_train = int(numpy.round(n_samples * (1. - valid_portion))) 101 | valid_set_x = [train_set_x[s] for s in sidx[n_train:]] 102 | valid_set_y = [train_set_y[s] for s in sidx[n_train:]] 103 | train_set_x = [train_set_x[s] for s in sidx[:n_train]] 104 | train_set_y = [train_set_y[s] for s in sidx[:n_train]] 105 | train_set = (train_set_x, train_set_y) 106 | valid_set = (valid_set_x, valid_set_y) 107 | del train_set_x, train_set_y, valid_set_x, valid_set_y 108 | return train_set, valid_set 109 | 110 | if maxlen: 111 | train_set = _truncate_data(train_set) 112 | if fixed_valid: 113 | print 'Loading with fixed validation set...', 114 | valid_set = _truncate_data(valid_set) 115 | else: 116 | print 'Setting validation set with proportion:', valid_portion, '...', 117 | train_set, valid_set = _set_valid(train_set, valid_portion) 118 | test_set = _truncate_data(test_set) 119 | 120 | if maxlen is None and not fixed_valid: 121 | train_set, valid_set = _set_valid(train_set, valid_portion) 122 | 123 | def remove_unk(x): 124 | return [[1 if w >= n_words else w for w in sen] for sen in x] 125 | 126 | test_set_x, test_set_y = test_set 127 | valid_set_x, valid_set_y = valid_set 128 | train_set_x, train_set_y = train_set 129 | 130 | # remove unk from dataset 131 | train_set_x = remove_unk(train_set_x) # use 1 if unk 132 | valid_set_x = remove_unk(valid_set_x) 133 | test_set_x = remove_unk(test_set_x) 134 | 135 | def len_argsort(seq): 136 | return sorted(range(len(seq)), key=lambda x: len(seq[x])) 137 | 138 | if sort_by_len: 139 | sorted_index = len_argsort(test_set_x) 140 | # ranked from shortest to longest 141 | test_set_x = [test_set_x[i] for i in sorted_index] 142 | test_set_y = [test_set_y[i] for i in sorted_index] 143 | 144 | sorted_index = len_argsort(valid_set_x) 145 | valid_set_x = [valid_set_x[i] for i in sorted_index] 146 | valid_set_y = [valid_set_y[i] for i in sorted_index] 147 | 148 | sorted_index = len_argsort(train_set_x) 149 | train_set_x = [train_set_x[i] for i in sorted_index] 150 | 
train_set_y = [train_set_y[i] for i in sorted_index] 151 | 152 | train = (train_set_x, train_set_y) 153 | valid = (valid_set_x, valid_set_y) 154 | test = (test_set_x, test_set_y) 155 | 156 | return train, valid, test 157 | 158 | def load_mnist(path='mnist.pkl', fixed_permute=True, rand_permute=False): 159 | f = open(path, 'rb') 160 | train = pkl.load(f) 161 | valid = pkl.load(f) 162 | test = pkl.load(f) 163 | f.close() 164 | 165 | def _permute(data, perm): 166 | x, y = data 167 | x_new = [] 168 | for xx in x: 169 | xx_new = [xx[pp] for pp in perm] 170 | x_new.append(xx_new) 171 | return (x_new, y) 172 | 173 | def _trans2list(data): 174 | x, y = data 175 | x = [list(xx) for xx in x] 176 | return (x, y) 177 | 178 | if rand_permute: 179 | print 'Using a fixed random permutation of pixels...', 180 | perm = numpy.random.permutation(range(784)) 181 | train = _permute(train, perm) 182 | valid = _permute(valid, perm) 183 | test = _permute(test, perm) 184 | elif fixed_permute: 185 | print 'Using permuted dataset...', 186 | 187 | _trans2list(train) 188 | _trans2list(valid) 189 | _trans2list(test) 190 | 191 | return train, valid, test 192 | 193 | def get_minibatches_idx(n, minibatch_size, shuffle=False): 194 | """ 195 | Used to shuffle the dataset at each iteration. 196 | """ 197 | 198 | idx_list = numpy.arange(n, dtype="int32") 199 | 200 | if shuffle: 201 | numpy.random.shuffle(idx_list) 202 | 203 | minibatches = [] 204 | minibatch_start = 0 205 | for i in range(n // minibatch_size): 206 | minibatches.append(idx_list[minibatch_start: 207 | minibatch_start + minibatch_size]) 208 | minibatch_start += minibatch_size 209 | 210 | if (minibatch_start != n): 211 | # Make a minibatch out of what is left 212 | minibatches.append(idx_list[minibatch_start:]) 213 | 214 | return zip(range(len(minibatches)), minibatches) 215 | 216 | def get_minibatches_idx_bucket(dataset, minibatch_size, shuffle=False): 217 | """ 218 | divide into different buckets according to sequence lengths 219 | dynamic batch size 220 | """ 221 | # divide into buckets 222 | slen = [len(ss) for ss in dataset] 223 | bucket1000 = [sidx for sidx in xrange(len(dataset)) 224 | if slen[sidx] > 0 and slen[sidx] <= 1000] 225 | bucket3000 = [sidx for sidx in xrange(len(dataset)) 226 | if slen[sidx] > 1000 and slen[sidx] <= 3000] 227 | bucket_long = [sidx for sidx in xrange(len(dataset)) 228 | if slen[sidx] > 3000] 229 | 230 | # shuffle each bucket 231 | if shuffle: 232 | numpy.random.shuffle(bucket1000) 233 | numpy.random.shuffle(bucket3000) 234 | numpy.random.shuffle(bucket_long) 235 | 236 | # make minibatches 237 | def _make_batch(minibatches, bucket, minibatch_size): 238 | minibatch_start = 0 239 | n = len(bucket) 240 | for i in range(n // minibatch_size): 241 | minibatches.append(bucket[minibatch_start : minibatch_start + minibatch_size]) 242 | minibatch_start += minibatch_size 243 | if (minibatch_start != n): 244 | # Make a minibatch out of what is left 245 | minibatches.append(bucket[minibatch_start:]) 246 | return minibatches 247 | 248 | minibatches = [] 249 | _make_batch(minibatches, bucket1000, minibatch_size=minibatch_size) 250 | _make_batch(minibatches, bucket3000, minibatch_size=minibatch_size//2) 251 | _make_batch(minibatches, bucket_long, minibatch_size=minibatch_size//8) 252 | 253 | # shuffle minibatches 254 | numpy.random.shuffle(minibatches) 255 | 256 | return zip(range(len(minibatches)), minibatches) 257 | 258 | def prepare_data(seqs, labels, maxlen=None, dataset='text'): 259 | """Create the matrices from the datasets. 
260 | 261 | This pad each sequence to the same lenght: the lenght of the 262 | longuest sequence or maxlen. 263 | 264 | if maxlen is set, we will cut all sequence to this maximum 265 | lenght. 266 | 267 | This swap the axis! 268 | """ 269 | # x: a list of sentences 270 | lengths = [len(s) for s in seqs] 271 | 272 | if maxlen is not None: 273 | new_seqs = [] 274 | new_labels = [] 275 | new_lengths = [] 276 | for l, s, y in zip(lengths, seqs, labels): 277 | if l < maxlen: 278 | new_seqs.append(s) 279 | new_labels.append(y) 280 | new_lengths.append(l) 281 | lengths = new_lengths 282 | labels = new_labels 283 | seqs = new_seqs 284 | 285 | if len(lengths) < 1: 286 | return None, None, None 287 | 288 | n_samples = len(seqs) 289 | maxlen = numpy.max(lengths) 290 | 291 | if dataset == 'mnist': 292 | x = numpy.zeros((maxlen, n_samples)).astype('float32') 293 | else: 294 | x = numpy.zeros((maxlen, n_samples)).astype('int64') 295 | x_mask = numpy.zeros((maxlen, n_samples)).astype(config.floatX) 296 | for idx, s in enumerate(seqs): 297 | x[:lengths[idx], idx] = s 298 | x_mask[:lengths[idx], idx] = 1. 299 | 300 | return x, x_mask, labels 301 | 302 | def prepare_data_hier(seqs, labels, hier_len, maxlen=None, dataset='text'): 303 | ''' 304 | prepare minibatch for hierarchical model 305 | ''' 306 | # sort (long->short) 307 | sorted_idx = sorted(range(len(seqs)), key=lambda x: len(seqs[x]), reverse=True) 308 | seqs = [seqs[i] for i in sorted_idx] 309 | labels = [labels[i] for i in sorted_idx] 310 | 311 | # truncate data 312 | lengths = [len(s) for s in seqs] 313 | if maxlen is not None: 314 | new_seqs = [] 315 | new_labels = [] 316 | new_lengths = [] 317 | for l, s, y in zip(lengths, seqs, labels): 318 | if l '0.10.2': 30 | from IPython.core.debugger import Pdb, BdbQuit_excepthook 31 | try: 32 | get_ipython 33 | except NameError: 34 | # Make it more resilient to different versions of IPython and try to 35 | # find a module. 
36 | possible_modules = ['IPython.terminal.embed', # Newer IPython 37 | 'IPython.frontend.terminal.embed'] # Older IPython 38 | 39 | count = len(possible_modules) 40 | for module in possible_modules: 41 | try: 42 | embed = __import__(module, fromlist=["InteractiveShellEmbed"]) 43 | InteractiveShellEmbed = embed.InteractiveShellEmbed 44 | except ImportError: 45 | count -= 1 46 | if count == 0: 47 | raise 48 | else: 49 | break 50 | 51 | ipshell = InteractiveShellEmbed() 52 | def_colors = ipshell.colors 53 | else: 54 | def_colors = get_ipython.im_self.colors 55 | 56 | from IPython.utils import io 57 | 58 | if 'nose' in sys.modules.keys(): 59 | def update_stdout(): 60 | # setup stdout to ensure output is available with nose 61 | io.stdout = sys.stdout = sys.__stdout__ 62 | else: 63 | def update_stdout(): 64 | pass 65 | else: 66 | from IPython.Debugger import Pdb, BdbQuit_excepthook 67 | from IPython.Shell import IPShell 68 | from IPython import ipapi 69 | 70 | ip = ipapi.get() 71 | if ip is None: 72 | IPShell(argv=['']) 73 | ip = ipapi.get() 74 | def_colors = ip.options.colors 75 | 76 | from IPython.Shell import Term 77 | 78 | if 'nose' in sys.modules.keys(): 79 | def update_stdout(): 80 | # setup stdout to ensure output is available with nose 81 | Term.cout = sys.stdout = sys.__stdout__ 82 | else: 83 | def update_stdout(): 84 | pass 85 | 86 | 87 | def wrap_sys_excepthook(): 88 | # make sure we wrap it only once or we would end up with a cycle 89 | # BdbQuit_excepthook.excepthook_ori == BdbQuit_excepthook 90 | if sys.excepthook != BdbQuit_excepthook: 91 | BdbQuit_excepthook.excepthook_ori = sys.excepthook 92 | sys.excepthook = BdbQuit_excepthook 93 | 94 | 95 | def set_trace(frame=None): 96 | update_stdout() 97 | wrap_sys_excepthook() 98 | if frame is None: 99 | frame = sys._getframe().f_back 100 | Pdb(def_colors).set_trace(frame) 101 | 102 | 103 | def post_mortem(tb): 104 | update_stdout() 105 | wrap_sys_excepthook() 106 | p = Pdb(def_colors) 107 | p.reset() 108 | if tb is None: 109 | return 110 | p.interaction(None, tb) 111 | 112 | 113 | def pm(): 114 | post_mortem(sys.last_traceback) 115 | 116 | 117 | def run(statement, globals=None, locals=None): 118 | Pdb(def_colors).run(statement, globals, locals) 119 | 120 | 121 | def runcall(*args, **kwargs): 122 | return Pdb(def_colors).runcall(*args, **kwargs) 123 | 124 | 125 | def runeval(expression, globals=None, locals=None): 126 | return Pdb(def_colors).runeval(expression, globals, locals) 127 | 128 | 129 | @contextmanager 130 | def launch_ipdb_on_exception(): 131 | try: 132 | yield 133 | except Exception: 134 | e, m, tb = sys.exc_info() 135 | print(m.__repr__(), file=sys.stderr) 136 | post_mortem(tb) 137 | finally: 138 | pass 139 | 140 | 141 | def main(): 142 | if not sys.argv[1:] or sys.argv[1] in ("--help", "-h"): 143 | print("usage: ipdb.py scriptfile [arg] ...") 144 | sys.exit(2) 145 | 146 | mainpyfile = sys.argv[1] # Get script filename 147 | if not os.path.exists(mainpyfile): 148 | print('Error:', mainpyfile, 'does not exist') 149 | sys.exit(1) 150 | 151 | del sys.argv[0] # Hide "pdb.py" from argument list 152 | 153 | # Replace pdb's dir with script's dir in front of module search path. 154 | sys.path[0] = os.path.dirname(mainpyfile) 155 | 156 | # Note on saving/restoring sys.argv: it's a good idea when sys.argv was 157 | # modified by the script being debugged. It's a bad idea when it was 158 | # changed by the user from the command line. There is a "restart" command 159 | # which allows explicit specification of command line arguments. 
160 | pdb = Pdb(def_colors) 161 | while 1: 162 | try: 163 | pdb._runscript(mainpyfile) 164 | if pdb._user_requested_quit: 165 | break 166 | print("The program finished and will be restarted") 167 | except Restart: 168 | print("Restarting", mainpyfile, "with arguments:") 169 | print("\t" + " ".join(sys.argv[1:])) 170 | except SystemExit: 171 | # In most cases SystemExit does not warrant a post-mortem session. 172 | print("The program exited via sys.exit(). Exit status: ", end='') 173 | print(sys.exc_info()[1]) 174 | except: 175 | traceback.print_exc() 176 | print("Uncaught exception. Entering post mortem debugging") 177 | print("Running 'cont' or 'step' will restart the program") 178 | t = sys.exc_info()[2] 179 | pdb.interaction(None, t) 180 | print("Post mortem debugger finished. The " + mainpyfile + 181 | " will be restarted") 182 | 183 | if __name__ == '__main__': 184 | main() 185 | -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/CLM/multiverso/Multiverso.dll: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/DualLearning/14f8e76b3c47e1f00bef4254606e1f549cc8c9ea/DSL_SentimentAnalysis/CLM/multiverso/Multiverso.dll -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/CLM/multiverso/__init__.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding:utf8 3 | 4 | from api import init, shutdown, barrier, workers_num, worker_id, server_id, is_master_worker 5 | from tables import ArrayTableHandler, MatrixTableHandler 6 | -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/CLM/multiverso/api.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding:utf8 3 | 4 | import ctypes 5 | from utils import Loader 6 | import numpy as np 7 | 8 | 9 | mv_lib = Loader.get_lib() 10 | 11 | 12 | def init(sync=False): 13 | '''Initialize mutliverso. 14 | 15 | This should be called only once before training at the beginning of the 16 | whole project. 17 | If sync is True, a sync server will be created. Otherwise an async server 18 | will be created. 19 | ''' 20 | args = [""] # the first argument will be ignored. So we put a placeholder here 21 | if sync: 22 | args.append("-sync=true") 23 | n = len(args) 24 | args_type = ctypes.c_char_p * n 25 | mv_lib.MV_Init(ctypes.pointer(ctypes.c_int(n)), args_type(*[ctypes.c_char_p(arg) for arg in args])) 26 | 27 | 28 | def shutdown(): 29 | '''Set a barrier for all workers to wait. 30 | 31 | Workers will wait until all workers reach a specific barrier. 32 | ''' 33 | mv_lib.MV_ShutDown() 34 | 35 | 36 | def barrier(): 37 | '''Shutdown multiverso. 38 | 39 | This should be called only once after finishing training at the end of the 40 | whole project. 41 | ''' 42 | mv_lib.MV_Barrier() 43 | 44 | 45 | def workers_num(): 46 | '''Return the total number of workers.''' 47 | return mv_lib.MV_NumWorkers() 48 | 49 | 50 | def worker_id(): 51 | '''Return the id (zero-based index) for current worker.''' 52 | return mv_lib.MV_WorkerId() 53 | 54 | 55 | def server_id(): 56 | return mv_lib.MV_ServerId() 57 | 58 | 59 | def is_master_worker(): 60 | '''If the worker is master worker 61 | 62 | Some things only need one worker process, such as validation, outputing the 63 | result, initializing the parameters and so on. 
So we mark the worker 0 as 64 | the master worker to finish these things. 65 | ''' 66 | return worker_id() == 0 67 | -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/CLM/multiverso/tables.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding:utf8 3 | 4 | import ctypes 5 | from utils import Loader 6 | from utils import convert_data 7 | import numpy as np 8 | import api 9 | 10 | 11 | mv_lib = Loader.get_lib() 12 | 13 | 14 | class TableHandler(object): 15 | '''`TableHandler` is an interface to sync different kinds of values. 16 | 17 | If you are not writing python code based on theano or lasagne, you are 18 | supposed to sync models (for initialization) and gradients (during 19 | training) so as to let multiverso help you manage the models in distributed 20 | environments. 21 | Otherwise, you'd better use the classes in `multiverso.theano_ext` or 22 | `multiverso.theano_ext.lasagne_ext` 23 | ''' 24 | def __init__(self, size, init_value=None): 25 | raise NotImplementedError("You must implement the __init__ method.") 26 | 27 | def get(self, size): 28 | raise NotImplementedError("You must implement the get method.") 29 | 30 | def add(self, data, sync=False): 31 | raise NotImplementedError("You must implement the add method.") 32 | 33 | 34 | # types 35 | C_FLOAT_P = ctypes.POINTER(ctypes.c_float) 36 | 37 | 38 | class ArrayTableHandler(TableHandler): 39 | '''`ArrayTableHandler` is used to sync array-like (one-dimensional) value.''' 40 | def __init__(self, size, init_value=None): 41 | '''Constructor for syncing array-like (one-dimensional) value. 42 | 43 | The `size` should be a int equal to the size of value we want to sync. 44 | If init_value is None, zeros will be used to initialize the tables, 45 | otherwise the table will be initialized as the init_value. 46 | Notice: if the init_value is different in different processes, the 47 | average of them will be used. 48 | ''' 49 | self._handler = ctypes.c_void_p() 50 | self._size = size 51 | mv_lib.MV_NewArrayTable(size, ctypes.byref(self._handler)) 52 | if init_value is not None: 53 | init_value = convert_data(init_value) 54 | # sync add is used because we want to make sure that the initial 55 | # value has taken effect when the call returns. 56 | self.add(init_value / api.workers_num(), sync=True) 57 | 58 | def get(self): 59 | '''get the latest value from multiverso ArrayTable 60 | 61 | Data type of return value is numpy.ndarray with one-dimensional 62 | ''' 63 | data = np.zeros((self._size, ), dtype=np.dtype("float32")) 64 | mv_lib.MV_GetArrayTable(self._handler, data.ctypes.data_as(C_FLOAT_P), self._size) 65 | return data 66 | 67 | def add(self, data, sync=False): 68 | '''add the data to the multiverso ArrayTable 69 | 70 | Data type of `data` is numpy.ndarray with one-dimensional 71 | 72 | If sync is True, this call will blocked by IO until the call finish. 73 | Otherwise it will return immediately 74 | ''' 75 | data = convert_data(data) 76 | assert(data.size == self._size) 77 | if sync: 78 | mv_lib.MV_AddArrayTable(self._handler, data.ctypes.data_as(C_FLOAT_P), self._size) 79 | else: 80 | mv_lib.MV_AddAsyncArrayTable(self._handler, data.ctypes.data_as(C_FLOAT_P), self._size) 81 | 82 | 83 | class MatrixTableHandler(TableHandler): 84 | def __init__(self, num_row, num_col, init_value=None): 85 | '''Constructor for syncing matrix-like (two-dimensional) value. 
86 | 87 | The `num_row` should be the number of rows and the `num_col` should be 88 | the number of columns. 89 | 90 | If init_value is None, zeros will be used to initialize the tables, 91 | otherwise the table will be initialized as the init_value. 92 | Notice: if the init_value is different in different processes, the 93 | average of them will be used. 94 | ''' 95 | self._handler = ctypes.c_void_p() 96 | self._num_row = num_row 97 | self._num_col = num_col 98 | self._size = num_col * num_row 99 | mv_lib.MV_NewMatrixTable(num_row, num_col, ctypes.byref(self._handler)) 100 | if init_value is not None: 101 | init_value = convert_data(init_value) 102 | # sync add is used because we want to make sure that the initial 103 | # value has taken effect when the call returns. 104 | self.add(init_value / api.workers_num(), sync=True) 105 | 106 | def get(self, row_ids=None): 107 | '''get the latest value from multiverso MatrixTable 108 | 109 | If row_ids is None, we will return all rows as numpy.narray , e.g. 110 | array([[1, 3], [3, 4]]). 111 | Otherwise we will return the data according to the row_ids(e.g. you can 112 | pass [1] to row_ids to get only the first row, it will return a 113 | two-dimensional numpy.ndarray with one row) 114 | 115 | Data type of return value is numpy.ndarray with two-dimensional 116 | ''' 117 | if row_ids is None: 118 | data = np.zeros((self._num_row, self._num_col), dtype=np.dtype("float32")) 119 | mv_lib.MV_GetMatrixTableAll(self._handler, data.ctypes.data_as(C_FLOAT_P), self._size) 120 | return data 121 | else: 122 | row_ids_n = len(row_ids) 123 | int_array_type = ctypes.c_int * row_ids_n 124 | data = np.zeros((row_ids_n, self._num_col), dtype=np.dtype("float32")) 125 | mv_lib.MV_GetMatrixTableByRows(self._handler, data.ctypes.data_as(C_FLOAT_P), 126 | row_ids_n * self._num_col, 127 | int_array_type(*row_ids), row_ids_n) 128 | return data 129 | 130 | def add(self, data=None, row_ids=None, sync=False): 131 | '''add the data to the multiverso MatrixTable 132 | 133 | If row_ids is None, we will add all data, and the data 134 | should be a list, e.g. [1, 2, 3, ...] 135 | 136 | Otherwise we will add the data according to the row_ids 137 | 138 | Data type of `data` is numpy.ndarray with two-dimensional 139 | 140 | If sync is True, this call will blocked by IO until the call finish. 
141 | Otherwise it will return immediately 142 | ''' 143 | assert(data is not None) 144 | data = convert_data(data) 145 | 146 | if row_ids is None: 147 | assert(data.size == self._size) 148 | if sync: 149 | mv_lib.MV_AddMatrixTableAll(self._handler, data.ctypes.data_as(C_FLOAT_P), self._size) 150 | else: 151 | mv_lib.MV_AddAsyncMatrixTableAll(self._handler, data.ctypes.data_as(C_FLOAT_P), self._size) 152 | else: 153 | row_ids_n = len(row_ids) 154 | assert(data.size == row_ids_n * self._num_col) 155 | int_array_type = ctypes.c_int * row_ids_n 156 | if sync: 157 | mv_lib.MV_AddMatrixTableByRows(self._handler, data.ctypes.data_as(C_FLOAT_P), 158 | row_ids_n * self._num_col, 159 | int_array_type(*row_ids), row_ids_n) 160 | else: 161 | mv_lib.MV_AddAsyncMatrixTableByRows(self._handler, data.ctypes.data_as(C_FLOAT_P), 162 | row_ids_n * self._num_col, 163 | int_array_type(*row_ids), row_ids_n) 164 | -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/CLM/multiverso/tests/test_multiverso.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding:utf8 3 | import multiverso as mv 4 | import unittest 5 | import numpy as np 6 | import theano 7 | from multiverso.theano_ext import sharedvar 8 | 9 | 10 | def setUpModule(): 11 | mv.init() 12 | 13 | 14 | def tearDownModule(): 15 | mv.shutdown() 16 | 17 | 18 | class TestMultiversoTables(unittest.TestCase): 19 | ''' 20 | Use the commands below to run test 21 | $ nosetests 22 | ''' 23 | 24 | def _test_array(self, size): 25 | tbh = mv.ArrayTableHandler(size) 26 | mv.barrier() 27 | 28 | for i in xrange(100): 29 | tbh.add(range(1, size + 1)) 30 | tbh.add(range(1, size + 1)) 31 | mv.barrier() 32 | for j, actual in enumerate(tbh.get()): 33 | self.assertEqual((j + 1) * (i + 1) * 2 * mv.workers_num(), actual) 34 | mv.barrier() 35 | 36 | def test_small_array(self): 37 | # TODO : this is not supported by multiverso because of the size 38 | # limited. 
Waiting for the solution of this issue 39 | # https://github.com/Microsoft/multiverso/issues/69 40 | 41 | # self._test_array(1) 42 | pass 43 | 44 | def test_array(self): 45 | self._test_array(10000) 46 | 47 | def test_matrix(self): 48 | num_row = 11 49 | num_col = 10 50 | size = num_col * num_row 51 | workers_num = mv.workers_num() 52 | tbh = mv.MatrixTableHandler(num_row, num_col) 53 | mv.barrier() 54 | for count in xrange(1, 21): 55 | row_ids = [0, 1, 5, 10] 56 | tbh.add(range(size)) 57 | tbh.add([range(rid * num_col, (1 + rid) * num_col) for rid in row_ids], row_ids) 58 | mv.barrier() 59 | data = tbh.get() 60 | mv.barrier() 61 | for i, row in enumerate(data): 62 | for j, actual in enumerate(row): 63 | expected = (i * num_col + j) * count * workers_num 64 | if i in row_ids: 65 | expected += (i * num_col + j) * count * workers_num 66 | self.assertEqual(expected, actual) 67 | data = tbh.get(row_ids) 68 | mv.barrier() 69 | for i, row in enumerate(data): 70 | for j, actual in enumerate(row): 71 | expected = (row_ids[i] * num_col + j) * count * workers_num * 2 72 | self.assertEqual(expected, actual) 73 | 74 | 75 | class TestMultiversoSharedVariable(unittest.TestCase): 76 | ''' 77 | Use the commands below to run test 78 | $ nosetests 79 | ''' 80 | 81 | def _test_sharedvar(self, row, col): 82 | W = sharedvar.mv_shared( 83 | value=np.zeros( 84 | (row, col), 85 | dtype=theano.config.floatX 86 | ), 87 | name='W', 88 | borrow=True 89 | ) 90 | delta = np.array(range(1, row * col + 1), 91 | dtype=theano.config.floatX).reshape((row, col)) 92 | train_model = theano.function([], updates=[(W, W + delta)]) 93 | mv.barrier() 94 | 95 | for i in xrange(100): 96 | train_model() 97 | train_model() 98 | sharedvar.sync_all_mv_shared_vars() 99 | mv.barrier() 100 | # to get the newest value, we must sync again 101 | sharedvar.sync_all_mv_shared_vars() 102 | for j, actual in enumerate(W.get_value().reshape(-1)): 103 | self.assertEqual((j + 1) * (i + 1) * 2 * mv.workers_num(), actual) 104 | mv.barrier() 105 | 106 | def test_sharedvar(self): 107 | self._test_sharedvar(200, 200) 108 | 109 | 110 | if __name__ == '__main__': 111 | unittest.main() 112 | -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/CLM/multiverso/theano_ext/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/DualLearning/14f8e76b3c47e1f00bef4254606e1f549cc8c9ea/DSL_SentimentAnalysis/CLM/multiverso/theano_ext/__init__.py -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/CLM/multiverso/theano_ext/lasagne_ext/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/DualLearning/14f8e76b3c47e1f00bef4254606e1f549cc8c9ea/DSL_SentimentAnalysis/CLM/multiverso/theano_ext/lasagne_ext/__init__.py -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/CLM/multiverso/theano_ext/lasagne_ext/param_manager.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding:utf8 3 | 4 | import lasagne 5 | import numpy as np 6 | import multiverso as mv 7 | 8 | 9 | class MVNetParamManager(object): 10 | ''' 11 | MVNetParamManager is manager to make managing and synchronizing the 12 | variables in lasagne more easily 13 | ''' 14 | def __init__(self, network): 15 | ''' The constructor of 
MVNetParamManager 16 | 17 | The constructor will associate the parameter with multiverso array 18 | table. The initial value of ArrayTableHandler will be same as the 19 | parameters of network. If different parameters are used in different 20 | processes, the average of them will be used as the initial value 21 | ''' 22 | self.shapes = [] 23 | self.dtypes = [] 24 | self.sizes = [] 25 | self.all_param_list = [] 26 | self.network = network 27 | for arr in lasagne.layers.get_all_param_values(self.network): 28 | self.shapes.append(arr.shape) 29 | # TODO: Now only float32 is supported in multiverso. So I store all 30 | # the parameters in a float32 array. This place need modification 31 | # after other types are supported 32 | assert(np.dtype("float32") == arr.dtype) 33 | self.dtypes.append(arr.dtype) 34 | self.sizes.append(arr.size) 35 | self.all_param_list.extend([i for i in np.nditer(arr)]) 36 | self.all_param_list = np.array(self.all_param_list) 37 | 38 | self.tbh = mv.ArrayTableHandler(len(self.all_param_list), init_value=self.all_param_list) 39 | mv.barrier() # add barrier to make sure the initial values have token effect 40 | self.all_param_list = self.tbh.get() 41 | self._set_all_param_to_net() 42 | 43 | def _set_all_param_to_net(self): 44 | n = 0 45 | params = [] 46 | for i, size in enumerate(self.sizes): 47 | params.append(self.all_param_list[n:n + size].reshape(self.shapes[i])) 48 | n += size 49 | lasagne.layers.set_all_param_values(self.network, params) 50 | 51 | def sync_all_param(self): 52 | '''sync all parameters with multiverso server 53 | 54 | This function will 55 | 1) calc all the delta of params in the network and add the delta to multiverso server 56 | 2) get the latest value from the multiverso server 57 | ''' 58 | cur_network_params = np.concatenate([ 59 | arr.reshape(-1) for arr in lasagne.layers.get_all_param_values(self.network)]) 60 | 61 | params_delta = cur_network_params - self.all_param_list 62 | self.tbh.add(params_delta) 63 | self.all_param_list = self.tbh.get() 64 | self._set_all_param_to_net() 65 | -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/CLM/multiverso/theano_ext/sharedvar.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding:utf8 3 | 4 | from theano.tensor.basic import TensorType, _tensor_py_operators 5 | from theano.compile import SharedVariable 6 | from theano.compile.sharedvalue import shared 7 | from theano.gof import Variable, utils 8 | import numpy 9 | import multiverso as mv 10 | 11 | 12 | class MVSharedVariable(object): 13 | '''MVSharedVariable is an wrapper of SharedVariable 14 | 15 | It will act same as SharedVariable. The only difference is a multiverso 16 | ArrayTable is addded to make it easier to sync values. 17 | ''' 18 | def __init__(self, svobj): 19 | '''Constructor of the MVSharedVariable 20 | 21 | The constructor will create ArrayTableHandler and associate the shared 22 | variable with it. The initial value of ArrayTableHandler will be same 23 | as the value of SharedVariable. 
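        The `MVNetParamManager` defined above is typically driven like this
        (a hedged sketch: `network` is assumed to be an already-built lasagne
        network with float32 parameters, `train_fn` a compiled Theano update
        function, and `minibatches` an iterator; all three are placeholders):

            import multiverso as mv
            from multiverso.theano_ext.lasagne_ext.param_manager import MVNetParamManager

            mv.init()
            manager = MVNetParamManager(network)   # initial parameters are averaged across workers
            for batch_x, batch_y in minibatches:
                train_fn(batch_x, batch_y)         # local parameter update
                manager.sync_all_param()           # push the local delta, pull the latest global parameters
            mv.barrier()
            mv.shutdown()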
If different initial value is used in 24 | different processes, the average of them will be used as the initial 25 | value 26 | ''' 27 | assert(isinstance(svobj, SharedVariable)) 28 | self._svobj = svobj 29 | self._mv_array = mv.ArrayTableHandler(self._svobj.get_value().size, 30 | init_value=self._svobj.get_value().reshape((-1,))) 31 | 32 | mv.barrier() # add barrier to make sure the initial values have token effect 33 | # _last_mv_data restore a copy of value. It will be used for calculate 34 | # the update for multiverso when calling mv_sync 35 | self._last_mv_data = self._mv_array.get().reshape(self._svobj.get_value().shape) 36 | self._svobj.set_value(self._last_mv_data, borrow=False) 37 | 38 | def mv_sync(self): 39 | ''' sync values with multiverso server 40 | 41 | mv_sync will add the delta of SharedVariable, which is usually the 42 | gradients in typical examples, to parameter server and then get the 43 | latest value in multiverso. 44 | ''' 45 | # because multiverso always use add method to sync value, the delta 46 | # will be the difference of the current value of last synced value 47 | self._mv_array.add(self._svobj.get_value() - self._last_mv_data) 48 | 49 | self._svobj.set_value(self._mv_array.get().reshape(self._svobj.get_value().shape)) 50 | self._last_mv_data = self._svobj.get_value(borrow=False) 51 | 52 | def __getstate__(self): 53 | '''This is for cPickle to store state. 54 | 55 | It is usually called when you want to dump the model to file with 56 | cPickle 57 | ''' 58 | odict = self.__dict__.copy() # copy the dict since we change it 59 | del odict['_mv_array'] # remove mv_array, because we can't pickle it 60 | return odict 61 | 62 | def __getattribute__(self, attr): 63 | '''This function make MVSharedVariable act same as SharedVariable''' 64 | if attr in ['_svobj', '_mv_array', '_last_mv_data']: 65 | # If get the attribute of self, use parent __getattribute__ to get 66 | # attribute from the object, otherwise it will fall into infinite 67 | # loop 68 | return object.__getattribute__(self, attr) 69 | elif attr in ['mv_sync', "__getstate__"]: 70 | # If call method of MVSharedVariable, then call the method directly 71 | # and bound the method to self object 72 | return getattr(MVSharedVariable, attr).__get__(self) 73 | else: 74 | # Otherwise I will get attribute from the wrapped object 75 | return getattr(self._svobj, attr) 76 | 77 | 78 | def mv_shared(*args, **kwargs): 79 | '''mv_shared works same as `theano.shared` 80 | 81 | It calls `theano.shared` to create the SharedVariable and use 82 | MVSharedVariable to wrap it. 83 | ''' 84 | var = shared(*args, **kwargs) 85 | mv_shared.shared_vars.append(MVSharedVariable(var)) 86 | return var 87 | 88 | 89 | mv_shared.shared_vars = [] # all shared_vars in multiverso will be recorded here 90 | 91 | 92 | def sync_all_mv_shared_vars(): 93 | '''Sync shared value created by `mv_shared` with multiverso 94 | 95 | It is often used when you are training model, and it will add the gradients 96 | (delta value) to the server and update the latest value from the server. 
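    For example (a hedged sketch that follows the pattern used in the unit
    tests; the shape and the constant `delta` are illustrative):

        import numpy as np
        import theano
        import multiverso as mv
        from multiverso.theano_ext import sharedvar

        mv.init()
        W = sharedvar.mv_shared(value=np.zeros((2, 2), dtype=theano.config.floatX),
                                name='W', borrow=True)
        delta = np.ones((2, 2), dtype=theano.config.floatX)
        step = theano.function([], updates=[(W, W + delta)])
        mv.barrier()

        step()                                  # local update of the shared variable
        sharedvar.sync_all_mv_shared_vars()     # add the local delta, fetch the global value
        mv.barrier()
        mv.shutdown()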
97 | Notice: It will **only** sync shared value created by `mv_shared` 98 | ''' 99 | for sv in mv_shared.shared_vars: 100 | sv.mv_sync() 101 | -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/CLM/multiverso/utils.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding:utf8 3 | 4 | import ctypes 5 | import os 6 | import platform 7 | from ctypes.util import find_library 8 | import numpy as np 9 | 10 | PACKAGE_PATH = os.path.abspath(os.path.dirname(__file__)) 11 | 12 | 13 | class Loader(object): 14 | ''' 15 | This loader is responsible for loading multiverso dynamic library in both 16 | *nux and windows 17 | ''' 18 | 19 | LIB = None 20 | 21 | @classmethod 22 | def _find_mv_path(cls): 23 | if platform.system() == "Windows": 24 | mv_lib_path = find_library("Multiverso") 25 | if mv_lib_path is None: 26 | print "* Fail to load Multiverso.dll from the windows $PATH."\ 27 | "Because Multiverso.dll can not be found in the $PATH "\ 28 | "directories. Go on loading Multiverso from the package." 29 | else: 30 | return mv_lib_path 31 | 32 | mv_lib_path = os.path.join(PACKAGE_PATH, "Multiverso.dll") 33 | if not os.path.exists(mv_lib_path): 34 | print "* Fail to load Multiverso.dll from the package. Because"\ 35 | " the file " + mv_lib_path + " can not be found." 36 | else: 37 | return mv_lib_path 38 | else: 39 | mv_lib_path = find_library("multiverso") 40 | if mv_lib_path is None: 41 | print "* Fail to load libmultiverso.so from the system"\ 42 | "libraries. Because libmultiverso.so can't be found in"\ 43 | "library paths. Go on loading Multiverso from the package." 44 | else: 45 | return mv_lib_path 46 | 47 | mv_lib_path = os.path.join(PACKAGE_PATH, "libmultiverso.so") 48 | if not os.path.exists(mv_lib_path): 49 | print "* Fail to load libmultiverso.so from the package. Because"\ 50 | " the file " + mv_lib_path + " can not be found." 51 | else: 52 | return mv_lib_path 53 | return None 54 | 55 | @classmethod 56 | def load_lib(cls): 57 | mv_lib_path = cls._find_mv_path() 58 | if mv_lib_path is None: 59 | print "Fail to load the multiverso library. 
Please make sure you"\ 60 | " have installed multiverso successfully" 61 | else: 62 | print "Find the multiverso library successfully(%s)" % mv_lib_path 63 | return ctypes.cdll.LoadLibrary(mv_lib_path) 64 | 65 | @classmethod 66 | def get_lib(cls): 67 | if not cls.LIB: 68 | cls.LIB = cls.load_lib() 69 | cls.LIB.MV_NumWorkers.restype = ctypes.c_int 70 | return cls.LIB 71 | 72 | 73 | def convert_data(data): 74 | '''convert the data to float32 ndarray''' 75 | if not isinstance(data, np.ndarray): 76 | data = np.array(data) 77 | return data.astype(np.float32) 78 | -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/CLM/training_scripts/train_clm_WithDropout_lr0.5.py: -------------------------------------------------------------------------------- 1 | from CLM import train 2 | 3 | def log_with_print(log, context): 4 | print >>log, context 5 | print context 6 | 7 | 8 | logfile = __file__ + 'log' 9 | log = open(logfile, 'w') 10 | 11 | round = 0 12 | log_with_print(log, 'round ' + str(round) + 'begin ------------------------------- !!') 13 | # change some for round 14 | 15 | 16 | max_epochs = 100000 17 | 18 | obj_directory = r'..\Sentiment_CLM_WithDropout' 19 | reload_model = obj_directory + r'\T.npz' 20 | 21 | 22 | train(round = round, 23 | saveto = obj_directory + '\\round%d_model_lstm.npz'%(round), 24 | reload_model = reload_model, 25 | reload_option = reload_model + '.pkl', 26 | dataset = r'../data/imdb.pkl', #%(work_id + 1), 27 | encoder = 'lstm', 28 | dropout_input = 0.5, 29 | dropout_output= 0.5, 30 | clip_c = 5., 31 | dim_word = 500, 32 | dim_proj = 1024, 33 | n_words = 10000, 34 | #n_words_sqrt = n_words_sqrt, 35 | optimizer = 'adadelta', 36 | lrate = 0.5, 37 | maxlen = None, 38 | minlen = 1, 39 | start_iter = 0, 40 | start_epoch = 0, 41 | max_epochs = max_epochs, #!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
42 | batch_size = 16, 43 | patience = 100, 44 | validFreq = 5000, 45 | saveFreq = 50000000, 46 | dispFreq = 1, 47 | sampleFreq = 20000000, 48 | newDumpFreq = 20000, 49 | syncFreq = 5000000000, 50 | sampleNum = 25, 51 | decay_c = 0., 52 | log = logfile, 53 | monitor_grad = False, 54 | sampleFileName= obj_directory + '\\round%d_sample.txt'%(round), 55 | pad_sos = False, 56 | embedding = '../data/embedding500.npz' 57 | ) 58 | 59 | -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/CLM/training_scripts/train_clm_nodr.py: -------------------------------------------------------------------------------- 1 | from CLM import train 2 | 3 | def log_with_print(log, context): 4 | print >>log, context 5 | print context 6 | 7 | 8 | logfile = __file__ + 'log' 9 | log = open(logfile, 'w') 10 | 11 | round = 0 12 | log_with_print(log, 'round ' + str(round) + 'begin ------------------------------- !!') 13 | # change some for round 14 | 15 | 16 | max_epochs = 100000 17 | 18 | obj_directory = r'..\Sentiment_CLM_nodrop' 19 | reload_model = obj_directory + r'\de.npz' 20 | 21 | 22 | train(round = round, 23 | saveto = obj_directory + '\\round%d_model_lstm.npz'%(round), 24 | reload_model = None, #reload_model, 25 | reload_option = None, #reload_model + '.pkl', 26 | dataset = r'../data/imdb.pkl', #%(work_id + 1), 27 | encoder = 'lstm', 28 | dropout_input = None, 29 | dropout_output= None, 30 | clip_c = 5., 31 | dim_word = 500, 32 | dim_proj = 1024, 33 | n_words = 10000, 34 | #n_words_sqrt = n_words_sqrt, 35 | optimizer = 'adadelta', 36 | lrate = 1.0, 37 | maxlen = None, 38 | minlen = 1, 39 | start_iter = 0, 40 | start_epoch = 0, 41 | max_epochs = max_epochs, #!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 42 | batch_size = 16, 43 | patience = 100, 44 | validFreq = 10000, 45 | saveFreq = 50000000, 46 | dispFreq = 1, 47 | sampleFreq = 20000000, 48 | newDumpFreq = 20000, 49 | syncFreq = 5000000000, 50 | sampleNum = 25, 51 | decay_c = 0., 52 | log = logfile, 53 | monitor_grad = False, 54 | sampleFileName= obj_directory + '\\round%d_sample.txt'%(round), 55 | pad_sos = False, 56 | embedding = '../data/embedding500.npz' 57 | ) 58 | 59 | -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/CLM/training_scripts/train_clm_nodr_lr0.5.py: -------------------------------------------------------------------------------- 1 | from CLM import train 2 | 3 | def log_with_print(log, context): 4 | print >>log, context 5 | print context 6 | 7 | 8 | logfile = __file__ + 'log' 9 | log = open(logfile, 'w') 10 | 11 | round = 0 12 | log_with_print(log, 'round ' + str(round) + 'begin ------------------------------- !!') 13 | # change some for round 14 | 15 | 16 | max_epochs = 100000 17 | 18 | obj_directory = r'..\Sentiment_CLM_nodrop_lr0.5' 19 | reload_model = obj_directory + r'\T.npz' 20 | 21 | 22 | train(round = round, 23 | saveto = obj_directory + '\\round%d_model_lstm.npz'%(round), 24 | reload_model = reload_model, 25 | reload_option = reload_model + '.pkl', 26 | dataset = r'../data/imdb.pkl', #%(work_id + 1), 27 | encoder = 'lstm', 28 | dropout_input = None, 29 | dropout_output= None, 30 | clip_c = 5., 31 | dim_word = 500, 32 | dim_proj = 1024, 33 | n_words = 10000, 34 | #n_words_sqrt = n_words_sqrt, 35 | optimizer = 'adadelta', 36 | lrate = 0.5, 37 | maxlen = None, 38 | minlen = 1, 39 | start_iter = 0, 40 | start_epoch = 0, 41 | max_epochs = max_epochs, #!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
42 | batch_size = 16, 43 | patience = 100, 44 | validFreq = 5000, 45 | saveFreq = 50000000, 46 | dispFreq = 1, 47 | sampleFreq = 20000000, 48 | newDumpFreq = 20000, 49 | syncFreq = 5000000000, 50 | sampleNum = 25, 51 | decay_c = 0., 52 | log = logfile, 53 | monitor_grad = False, 54 | sampleFileName= obj_directory + '\\round%d_sample.txt'%(round), 55 | pad_sos = False, 56 | embedding = '../data/embedding500.npz' 57 | ) 58 | 59 | -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/CLM/worker.bat: -------------------------------------------------------------------------------- 1 | @echo off 2 | setlocal ENABLEDELAYEDEXPANSION 3 | set THEANO_FLAGS=device=gpu1 4 | python train_clm_WithDropout_lr0.5.py -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/Classifier/Data.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Microsoft. All rights reserved. 2 | # Licensed under the MIT license. See LICENSE file in the project root for full license information. 3 | 4 | import cPickle as pkl 5 | import gzip 6 | import os 7 | import numpy 8 | from theano import config 9 | 10 | def get_dataset_file(dataset, default_dataset, origin): 11 | ''' 12 | Look for it as if it was a full path, if not, try local file, 13 | if not try in the data directory. 14 | 15 | Download dataset if it is not present 16 | ''' 17 | data_dir, data_file = os.path.split(dataset) 18 | if data_dir == "" and not os.path.isfile(dataset): 19 | # Check if dataset is in the data directory. 20 | new_path = os.path.join( 21 | os.path.split(__file__)[0], 22 | "..", 23 | "data", 24 | dataset 25 | ) 26 | if os.path.isfile(new_path) or data_file == default_dataset: 27 | dataset = new_path 28 | 29 | if (not os.path.isfile(dataset)) and data_file == default_dataset: 30 | from six.moves import urllib 31 | print('Downloading data from %s' % origin) 32 | urllib.request.urlretrieve(origin, dataset) 33 | 34 | return dataset 35 | 36 | def load_data(path="imdb.pkl", n_words=100000, maxlen=None, 37 | sort_by_len=True, fixed_valid=True, valid_portion=0.1): 38 | ''' 39 | Loads the dataset 40 | :type path: String 41 | :param path: The path to the dataset (here IMDB) 42 | :type n_words: int 43 | :param n_words: The number of word to keep in the vocabulary. 44 | All extra words are set to unknow (1). 45 | :type maxlen: None or positive int 46 | :param maxlen: the max sequence length we use in the train/valid set. 47 | :type sort_by_len: bool 48 | :name sort_by_len: Sort by the sequence lenght for the train, 49 | valid and test set. This allow faster execution as it cause 50 | less padding per minibatch. Another mechanism must be used to 51 | shuffle the train set at each epoch. 52 | :type fixed_valid: bool 53 | :param fixed_valid: load fixed validation set from the corpus file, 54 | which would otherwise be picked randomly from the training set with 55 | proportion [valid_portion] 56 | :type valid_portion: float 57 | :param valid_portion: The proportion of the full train set used for 58 | the validation set. 
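    For example (a hedged sketch; 'imdb.pkl' is the repository's default
    corpus and 10000 matches the vocabulary size used elsewhere in this
    project):

        train, valid, test = load_data(path='imdb.pkl', n_words=10000,
                                       maxlen=None, sort_by_len=True,
                                       fixed_valid=True)
        train_x, train_y = train    # lists of word-index sequences and integer labels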
59 | 60 | ''' 61 | 62 | # Load the dataset 63 | path = get_dataset_file( 64 | path, "imdb.pkl", 65 | "http://www.iro.umontreal.ca/~lisa/deep/data/imdb.pkl") 66 | if path.endswith(".gz"): 67 | f = gzip.open(path, 'rb') 68 | else: 69 | f = open(path, 'rb') 70 | 71 | train_set = pkl.load(f) 72 | if fixed_valid: 73 | valid_set = pkl.load(f) 74 | test_set = pkl.load(f) 75 | f.close() 76 | 77 | def _truncate_data(train_set): 78 | ''' 79 | truncate sequences with lengths exceed max-len threshold 80 | :param train_set: a list of sequences list and corresponding labels list 81 | :return: truncated train_set 82 | ''' 83 | new_train_set_x = [] 84 | new_train_set_y = [] 85 | for x, y in zip(train_set[0], train_set[1]): 86 | if len(x) < maxlen: 87 | new_train_set_x.append(x) 88 | new_train_set_y.append(y) 89 | train_set = (new_train_set_x, new_train_set_y) 90 | del new_train_set_x, new_train_set_y 91 | return train_set 92 | 93 | def _set_valid(train_set, valid_portion): 94 | ''' 95 | set validation with [valid_portion] proportion of training set 96 | ''' 97 | train_set_x, train_set_y = train_set 98 | n_samples = len(train_set_x) 99 | sidx = numpy.random.permutation(n_samples) # shuffle data 100 | n_train = int(numpy.round(n_samples * (1. - valid_portion))) 101 | valid_set_x = [train_set_x[s] for s in sidx[n_train:]] 102 | valid_set_y = [train_set_y[s] for s in sidx[n_train:]] 103 | train_set_x = [train_set_x[s] for s in sidx[:n_train]] 104 | train_set_y = [train_set_y[s] for s in sidx[:n_train]] 105 | train_set = (train_set_x, train_set_y) 106 | valid_set = (valid_set_x, valid_set_y) 107 | del train_set_x, train_set_y, valid_set_x, valid_set_y 108 | return train_set, valid_set 109 | 110 | if maxlen: 111 | train_set = _truncate_data(train_set) 112 | if fixed_valid: 113 | print 'Loading with fixed validation set...', 114 | valid_set = _truncate_data(valid_set) 115 | else: 116 | print 'Setting validation set with proportion:', valid_portion, '...', 117 | train_set, valid_set = _set_valid(train_set, valid_portion) 118 | test_set = _truncate_data(test_set) 119 | 120 | if maxlen is None and not fixed_valid: 121 | train_set, valid_set = _set_valid(train_set, valid_portion) 122 | 123 | def remove_unk(x): 124 | return [[1 if w >= n_words else w for w in sen] for sen in x] 125 | 126 | test_set_x, test_set_y = test_set 127 | valid_set_x, valid_set_y = valid_set 128 | train_set_x, train_set_y = train_set 129 | 130 | # remove unk from dataset 131 | train_set_x = remove_unk(train_set_x) # use 1 if unk 132 | valid_set_x = remove_unk(valid_set_x) 133 | test_set_x = remove_unk(test_set_x) 134 | 135 | def len_argsort(seq): 136 | return sorted(range(len(seq)), key=lambda x: len(seq[x])) 137 | 138 | if sort_by_len: 139 | sorted_index = len_argsort(test_set_x) 140 | # ranked from shortest to longest 141 | test_set_x = [test_set_x[i] for i in sorted_index] 142 | test_set_y = [test_set_y[i] for i in sorted_index] 143 | 144 | sorted_index = len_argsort(valid_set_x) 145 | valid_set_x = [valid_set_x[i] for i in sorted_index] 146 | valid_set_y = [valid_set_y[i] for i in sorted_index] 147 | 148 | sorted_index = len_argsort(train_set_x) 149 | train_set_x = [train_set_x[i] for i in sorted_index] 150 | train_set_y = [train_set_y[i] for i in sorted_index] 151 | 152 | train = (train_set_x, train_set_y) 153 | valid = (valid_set_x, valid_set_y) 154 | test = (test_set_x, test_set_y) 155 | 156 | return train, valid, test 157 | 158 | def load_mnist(path='mnist.pkl', fixed_permute=True, rand_permute=False): 159 | f = open(path, 'rb') 160 | 
train = pkl.load(f) 161 | valid = pkl.load(f) 162 | test = pkl.load(f) 163 | f.close() 164 | 165 | def _permute(data, perm): 166 | x, y = data 167 | x_new = [] 168 | for xx in x: 169 | xx_new = [xx[pp] for pp in perm] 170 | x_new.append(xx_new) 171 | return (x_new, y) 172 | 173 | def _trans2list(data): 174 | x, y = data 175 | x = [list(xx) for xx in x] 176 | return (x, y) 177 | 178 | if rand_permute: 179 | print 'Using a fixed random permutation of pixels...', 180 | perm = numpy.random.permutation(range(784)) 181 | train = _permute(train, perm) 182 | valid = _permute(valid, perm) 183 | test = _permute(test, perm) 184 | elif fixed_permute: 185 | print 'Using permuted dataset...', 186 | 187 | _trans2list(train) 188 | _trans2list(valid) 189 | _trans2list(test) 190 | 191 | return train, valid, test 192 | 193 | def get_minibatches_idx(n, minibatch_size, shuffle=False): 194 | """ 195 | Used to shuffle the dataset at each iteration. 196 | """ 197 | 198 | idx_list = numpy.arange(n, dtype="int32") 199 | 200 | if shuffle: 201 | numpy.random.shuffle(idx_list) 202 | 203 | minibatches = [] 204 | minibatch_start = 0 205 | for i in range(n // minibatch_size): 206 | minibatches.append(idx_list[minibatch_start: 207 | minibatch_start + minibatch_size]) 208 | minibatch_start += minibatch_size 209 | 210 | if (minibatch_start != n): 211 | # Make a minibatch out of what is left 212 | minibatches.append(idx_list[minibatch_start:]) 213 | 214 | return zip(range(len(minibatches)), minibatches) 215 | 216 | def get_minibatches_idx_bucket(dataset, minibatch_size, shuffle=False): 217 | """ 218 | divide into different buckets according to sequence lengths 219 | dynamic batch size 220 | """ 221 | # divide into buckets 222 | slen = [len(ss) for ss in dataset] 223 | bucket1000 = [sidx for sidx in xrange(len(dataset)) 224 | if slen[sidx] > 0 and slen[sidx] <= 1000] 225 | bucket3000 = [sidx for sidx in xrange(len(dataset)) 226 | if slen[sidx] > 1000 and slen[sidx] <= 3000] 227 | bucket_long = [sidx for sidx in xrange(len(dataset)) 228 | if slen[sidx] > 3000] 229 | 230 | # shuffle each bucket 231 | if shuffle: 232 | numpy.random.shuffle(bucket1000) 233 | numpy.random.shuffle(bucket3000) 234 | numpy.random.shuffle(bucket_long) 235 | 236 | # make minibatches 237 | def _make_batch(minibatches, bucket, minibatch_size): 238 | minibatch_start = 0 239 | n = len(bucket) 240 | for i in range(n // minibatch_size): 241 | minibatches.append(bucket[minibatch_start : minibatch_start + minibatch_size]) 242 | minibatch_start += minibatch_size 243 | if (minibatch_start != n): 244 | # Make a minibatch out of what is left 245 | minibatches.append(bucket[minibatch_start:]) 246 | return minibatches 247 | 248 | minibatches = [] 249 | _make_batch(minibatches, bucket1000, minibatch_size=minibatch_size) 250 | _make_batch(minibatches, bucket3000, minibatch_size=minibatch_size//2) 251 | _make_batch(minibatches, bucket_long, minibatch_size=minibatch_size//8) 252 | 253 | # shuffle minibatches 254 | numpy.random.shuffle(minibatches) 255 | 256 | return zip(range(len(minibatches)), minibatches) 257 | 258 | def prepare_data(seqs, labels, maxlen=None, dataset='text'): 259 | """Create the matrices from the datasets. 260 | 261 | This pad each sequence to the same lenght: the lenght of the 262 | longuest sequence or maxlen. 263 | 264 | if maxlen is set, we will cut all sequence to this maximum 265 | lenght. 266 | 267 | This swap the axis! 
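    For example (a hedged sketch; the two toy sequences are illustrative):

        seqs = [[3, 5, 7], [2, 4]]
        labels = [1, 0]
        x, x_mask, y = prepare_data(seqs, labels)
        # x has shape (3, 2): maxlen x n_samples, zero-padded
        # x_mask has shape (3, 2), with 1. marking the real tokens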
268 | """ 269 | # x: a list of sentences 270 | lengths = [len(s) for s in seqs] 271 | 272 | if maxlen is not None: 273 | new_seqs = [] 274 | new_labels = [] 275 | new_lengths = [] 276 | for l, s, y in zip(lengths, seqs, labels): 277 | if l < maxlen: 278 | new_seqs.append(s) 279 | new_labels.append(y) 280 | new_lengths.append(l) 281 | lengths = new_lengths 282 | labels = new_labels 283 | seqs = new_seqs 284 | 285 | if len(lengths) < 1: 286 | return None, None, None 287 | 288 | n_samples = len(seqs) 289 | maxlen = numpy.max(lengths) 290 | 291 | if dataset == 'mnist': 292 | x = numpy.zeros((maxlen, n_samples)).astype('float32') 293 | else: 294 | x = numpy.zeros((maxlen, n_samples)).astype('int64') 295 | x_mask = numpy.zeros((maxlen, n_samples)).astype(config.floatX) 296 | for idx, s in enumerate(seqs): 297 | x[:lengths[idx], idx] = s 298 | x_mask[:lengths[idx], idx] = 1. 299 | 300 | return x, x_mask, labels 301 | 302 | def prepare_data_hier(seqs, labels, hier_len, maxlen=None, dataset='text'): 303 | ''' 304 | prepare minibatch for hierarchical model 305 | ''' 306 | # sort (long->short) 307 | sorted_idx = sorted(range(len(seqs)), key=lambda x: len(seqs[x]), reverse=True) 308 | seqs = [seqs[i] for i in sorted_idx] 309 | labels = [labels[i] for i in sorted_idx] 310 | 311 | # truncate data 312 | lengths = [len(s) for s in seqs] 313 | if maxlen is not None: 314 | new_seqs = [] 315 | new_labels = [] 316 | new_lengths = [] 317 | for l, s, y in zip(lengths, seqs, labels): 318 | if l (clip_c**2), g/tensor.sqrt(g2) * clip_c, g)) 31 | grads = new_grads 32 | return grads 33 | 34 | def dropout_layer(state_before, dropout, use_noise, trng): 35 | proj = tensor.switch(use_noise, 36 | (state_before * 37 | trng.binomial(state_before.shape, 38 | p=(1-dropout), n=1, 39 | dtype=state_before.dtype)), 40 | state_before * (1-dropout)) 41 | return proj 42 | 43 | # ========================== 44 | # optimizers 45 | # supports sgd, adadelta and rmsprop 46 | # only adadelta supports hierarchical structure 47 | # ========================== 48 | 49 | def sgd(lr, tparams, grads, x, mask, y, cost): 50 | """ Stochastic Gradient Descent 51 | 52 | :note: A more complicated version of sgd then needed. This is 53 | done like that for adadelta and rmsprop. 54 | 55 | """ 56 | # New set of shared variable that will contain the gradient 57 | # for a mini-batch. 58 | gshared = [theano.shared(p.get_value() * 0., name='%s_grad' % k) 59 | for k, p in tparams.items()] 60 | gsup = [(gs, g) for gs, g in zip(gshared, grads)] 61 | 62 | # Function that computes gradients for a mini-batch, but do not 63 | # updates the weights. 64 | f_grad_shared = theano.function([x, mask, y], cost, updates=gsup, 65 | name='sgd_f_grad_shared') 66 | 67 | pup = [(p, p - lr * g) for p, g in zip(tparams.values(), gshared)] 68 | 69 | # Function that updates the weights from the previously computed 70 | # gradient. 
71 | f_update = theano.function([lr], [], updates=pup, 72 | name='sgd_f_update') 73 | 74 | return f_grad_shared, f_update 75 | 76 | def adadelta(lr, tparams, grads, x, mask, y, cost, mask_hier=None): 77 | """ 78 | An adaptive learning rate optimizer 79 | # modified to support hierarchical mode 80 | 81 | Parameters 82 | ---------- 83 | lr : Theano SharedVariable 84 | Initial learning rate 85 | tpramas: Theano SharedVariable 86 | Model parameters 87 | grads: Theano variable 88 | Gradients of cost w.r.t to parameres 89 | x: Theano variable 90 | Model inputs 91 | mask: Theano variable 92 | Sequence mask 93 | y: Theano variable 94 | Targets 95 | cost: Theano variable 96 | Objective fucntion to minimize 97 | 98 | Notes 99 | ----- 100 | For more information, see [ADADELTA]_. 101 | 102 | .. [ADADELTA] Matthew D. Zeiler, *ADADELTA: An Adaptive Learning 103 | Rate Method*, arXiv:1212.5701. 104 | """ 105 | 106 | zipped_grads = [theano.shared(p.get_value() * numpy_floatX(0.), 107 | name='%s_grad' % k) 108 | for k, p in tparams.items()] 109 | running_up2 = [theano.shared(p.get_value() * numpy_floatX(0.), 110 | name='%s_rup2' % k) 111 | for k, p in tparams.items()] 112 | running_grads2 = [theano.shared(p.get_value() * numpy_floatX(0.), 113 | name='%s_rgrad2' % k) 114 | for k, p in tparams.items()] 115 | 116 | zgup = [(zg, g) for zg, g in zip(zipped_grads, grads)] 117 | rg2up = [(rg2, 0.95 * rg2 + 0.05 * (g ** 2)) 118 | for rg2, g in zip(running_grads2, grads)] 119 | if mask_hier is not None: 120 | f_grad_shared = theano.function([x, mask, mask_hier, y], cost, updates=zgup + rg2up, 121 | name='adadelta_f_grad_shared') 122 | else: 123 | f_grad_shared = theano.function([x, mask, y], cost, updates=zgup + rg2up, 124 | name='adadelta_f_grad_shared') 125 | 126 | updir = [-tensor.sqrt(ru2 + 1e-6) / tensor.sqrt(rg2 + 1e-6) * zg 127 | for zg, ru2, rg2 in zip(zipped_grads, 128 | running_up2, 129 | running_grads2)] 130 | ru2up = [(ru2, 0.95 * ru2 + 0.05 * (ud ** 2)) 131 | for ru2, ud in zip(running_up2, updir)] 132 | param_up = [(p, p + ud) for p, ud in zip(tparams.values(), updir)] 133 | 134 | f_update = theano.function([lr], [], updates=ru2up + param_up, 135 | on_unused_input='ignore', 136 | name='adadelta_f_update') 137 | 138 | return f_grad_shared, f_update 139 | 140 | def rmsprop(lr, tparams, grads, x, mask, y, cost): 141 | """ 142 | A variant of SGD that scales the step size by running average of the 143 | recent step norms. 144 | 145 | Parameters 146 | ---------- 147 | lr : Theano SharedVariable 148 | Initial learning rate 149 | tpramas: Theano SharedVariable 150 | Model parameters 151 | grads: Theano variable 152 | Gradients of cost w.r.t to parameres 153 | x: Theano variable 154 | Model inputs 155 | mask: Theano variable 156 | Sequence mask 157 | y: Theano variable 158 | Targets 159 | cost: Theano variable 160 | Objective fucntion to minimize 161 | 162 | Notes 163 | ----- 164 | For more information, see [Hint2014]_. 165 | 166 | .. 
[Hint2014] Geoff Hinton, *Neural Networks for Machine Learning*, 167 | lecture 6a, 168 | http://cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf 169 | """ 170 | 171 | zipped_grads = [theano.shared(p.get_value() * numpy_floatX(0.), 172 | name='%s_grad' % k) 173 | for k, p in tparams.items()] 174 | running_grads = [theano.shared(p.get_value() * numpy_floatX(0.), 175 | name='%s_rgrad' % k) 176 | for k, p in tparams.items()] 177 | running_grads2 = [theano.shared(p.get_value() * numpy_floatX(0.), 178 | name='%s_rgrad2' % k) 179 | for k, p in tparams.items()] 180 | 181 | zgup = [(zg, g) for zg, g in zip(zipped_grads, grads)] 182 | rgup = [(rg, 0.95 * rg + 0.05 * g) for rg, g in zip(running_grads, grads)] 183 | rg2up = [(rg2, 0.95 * rg2 + 0.05 * (g ** 2)) 184 | for rg2, g in zip(running_grads2, grads)] 185 | 186 | f_grad_shared = theano.function([x, mask, y], cost, 187 | updates=zgup + rgup + rg2up, 188 | name='rmsprop_f_grad_shared') 189 | 190 | updir = [theano.shared(p.get_value() * numpy_floatX(0.), 191 | name='%s_updir' % k) 192 | for k, p in tparams.items()] 193 | updir_new = [(ud, 0.9 * ud - 1e-4 * zg / tensor.sqrt(rg2 - rg ** 2 + 1e-4)) 194 | for ud, zg, rg, rg2 in zip(updir, zipped_grads, running_grads, 195 | running_grads2)] 196 | param_up = [(p, p + udn[1]) 197 | for p, udn in zip(tparams.values(), updir_new)] 198 | f_update = theano.function([lr], [], updates=updir_new + param_up, 199 | on_unused_input='ignore', 200 | name='rmsprop_f_update') 201 | 202 | return f_grad_shared, f_update 203 | 204 | # ========================== 205 | # matrix initializations 206 | # supports normalized, orthogonal and randomized 207 | # ========================== 208 | 209 | def ortho_weight(ndim): 210 | W = numpy.random.randn(ndim, ndim) 211 | u, s, v = numpy.linalg.svd(W) 212 | return u.astype(config.floatX) 213 | 214 | def norm_weight(nin, nout=None, scale=0.01, ortho=True): 215 | if nout is None: 216 | nout = nin 217 | if nout == nin and ortho: 218 | W = ortho_weight(nin) 219 | else: 220 | # bug fixed: set to be ortho_init 221 | # W = scale * numpy.random.randn(nin, nout) 222 | W = numpy.random.randn(nin, nout) 223 | u, s, v = numpy.linalg.svd(W) 224 | if nin > nout: 225 | W = u[:, :nout] 226 | else: 227 | W = v[:nin, :] 228 | return W.astype('float32') 229 | 230 | def rand_weight(nin, nout=None, scale=0.01, ortho=True): 231 | if nout is None: 232 | nout = nin 233 | if nout == nin and ortho: 234 | W = ortho_weight(nin) 235 | else: 236 | W = scale * numpy.random.randn(nin, nout) 237 | return W.astype('float32') 238 | 239 | # ========================== 240 | # some utility functions 241 | # ========================== 242 | 243 | def zipp(params, tparams): 244 | """ 245 | When we reload the model. Needed for the GPU stuff. 246 | """ 247 | for kk, vv in params.items(): 248 | tparams[kk].set_value(vv) 249 | 250 | def unzip(zipped): 251 | """ 252 | When we pickle the model. Needed for the GPU stuff. 
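    For example, the usual save/restore round trip looks like this (a hedged
    sketch; 'model.npz' is an illustrative path and `tparams` is the
    OrderedDict of Theano shared variables used throughout this module):

        params = unzip(tparams)                 # plain ndarrays, detached from the GPU
        numpy.savez('model.npz', **params)
        # ... later, to restore:
        loaded = numpy.load('model.npz')
        zipp({k: loaded[k] for k in loaded.files}, tparams)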
253 | """ 254 | new_params = OrderedDict() 255 | for kk, vv in zipped.items(): 256 | new_params[kk] = vv.get_value() 257 | return new_params 258 | 259 | def numpy_floatX(data): 260 | return numpy.asarray(data, dtype=config.floatX) -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/Classifier/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/DualLearning/14f8e76b3c47e1f00bef4254606e1f549cc8c9ea/DSL_SentimentAnalysis/Classifier/__init__.py -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/Classifier/worker.bat: -------------------------------------------------------------------------------- 1 | @echo off 2 | setlocal ENABLEDELAYEDEXPANSION 3 | set THEANO_FLAGS=device=gpu5 4 | python train_classifier_LM_NoDrop_google_sgd0.2.py -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/Data.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Microsoft. All rights reserved. 2 | # Licensed under the MIT license. See LICENSE file in the project root for full license information. 3 | 4 | import cPickle as pkl 5 | import gzip 6 | import os 7 | import numpy 8 | from theano import config 9 | 10 | def get_dataset_file(dataset, default_dataset, origin): 11 | ''' 12 | Look for it as if it was a full path, if not, try local file, 13 | if not try in the data directory. 14 | 15 | Download dataset if it is not present 16 | ''' 17 | data_dir, data_file = os.path.split(dataset) 18 | if data_dir == "" and not os.path.isfile(dataset): 19 | # Check if dataset is in the data directory. 20 | new_path = os.path.join( 21 | os.path.split(__file__)[0], 22 | "..", 23 | "data", 24 | dataset 25 | ) 26 | if os.path.isfile(new_path) or data_file == default_dataset: 27 | dataset = new_path 28 | 29 | if (not os.path.isfile(dataset)) and data_file == default_dataset: 30 | from six.moves import urllib 31 | print('Downloading data from %s' % origin) 32 | urllib.request.urlretrieve(origin, dataset) 33 | 34 | return dataset 35 | 36 | def load_data(path="imdb.pkl", n_words=100000, maxlen=None, 37 | sort_by_len=True, fixed_valid=True, valid_portion=0.1): 38 | ''' 39 | Loads the dataset 40 | :type path: String 41 | :param path: The path to the dataset (here IMDB) 42 | :type n_words: int 43 | :param n_words: The number of word to keep in the vocabulary. 44 | All extra words are set to unknow (1). 45 | :type maxlen: None or positive int 46 | :param maxlen: the max sequence length we use in the train/valid set. 47 | :type sort_by_len: bool 48 | :name sort_by_len: Sort by the sequence lenght for the train, 49 | valid and test set. This allow faster execution as it cause 50 | less padding per minibatch. Another mechanism must be used to 51 | shuffle the train set at each epoch. 52 | :type fixed_valid: bool 53 | :param fixed_valid: load fixed validation set from the corpus file, 54 | which would otherwise be picked randomly from the training set with 55 | proportion [valid_portion] 56 | :type valid_portion: float 57 | :param valid_portion: The proportion of the full train set used for 58 | the validation set. 
59 | 60 | ''' 61 | 62 | # Load the dataset 63 | path = get_dataset_file( 64 | path, "imdb.pkl", 65 | "http://www.iro.umontreal.ca/~lisa/deep/data/imdb.pkl") 66 | if path.endswith(".gz"): 67 | f = gzip.open(path, 'rb') 68 | else: 69 | f = open(path, 'rb') 70 | 71 | train_set = pkl.load(f) 72 | if fixed_valid: 73 | valid_set = pkl.load(f) 74 | test_set = pkl.load(f) 75 | f.close() 76 | 77 | def _truncate_data(train_set): 78 | ''' 79 | truncate sequences with lengths exceed max-len threshold 80 | :param train_set: a list of sequences list and corresponding labels list 81 | :return: truncated train_set 82 | ''' 83 | new_train_set_x = [] 84 | new_train_set_y = [] 85 | for x, y in zip(train_set[0], train_set[1]): 86 | if len(x) < maxlen: 87 | new_train_set_x.append(x) 88 | new_train_set_y.append(y) 89 | train_set = (new_train_set_x, new_train_set_y) 90 | del new_train_set_x, new_train_set_y 91 | return train_set 92 | 93 | def _set_valid(train_set, valid_portion): 94 | ''' 95 | set validation with [valid_portion] proportion of training set 96 | ''' 97 | train_set_x, train_set_y = train_set 98 | n_samples = len(train_set_x) 99 | sidx = numpy.random.permutation(n_samples) # shuffle data 100 | n_train = int(numpy.round(n_samples * (1. - valid_portion))) 101 | valid_set_x = [train_set_x[s] for s in sidx[n_train:]] 102 | valid_set_y = [train_set_y[s] for s in sidx[n_train:]] 103 | train_set_x = [train_set_x[s] for s in sidx[:n_train]] 104 | train_set_y = [train_set_y[s] for s in sidx[:n_train]] 105 | train_set = (train_set_x, train_set_y) 106 | valid_set = (valid_set_x, valid_set_y) 107 | del train_set_x, train_set_y, valid_set_x, valid_set_y 108 | return train_set, valid_set 109 | 110 | if maxlen: 111 | train_set = _truncate_data(train_set) 112 | if fixed_valid: 113 | print 'Loading with fixed validation set...', 114 | valid_set = _truncate_data(valid_set) 115 | else: 116 | print 'Setting validation set with proportion:', valid_portion, '...', 117 | train_set, valid_set = _set_valid(train_set, valid_portion) 118 | test_set = _truncate_data(test_set) 119 | 120 | if maxlen is None and not fixed_valid: 121 | train_set, valid_set = _set_valid(train_set, valid_portion) 122 | 123 | def remove_unk(x): 124 | return [[1 if w >= n_words else w for w in sen] for sen in x] 125 | 126 | test_set_x, test_set_y = test_set 127 | valid_set_x, valid_set_y = valid_set 128 | train_set_x, train_set_y = train_set 129 | 130 | # remove unk from dataset 131 | train_set_x = remove_unk(train_set_x) # use 1 if unk 132 | valid_set_x = remove_unk(valid_set_x) 133 | test_set_x = remove_unk(test_set_x) 134 | 135 | def len_argsort(seq): 136 | return sorted(range(len(seq)), key=lambda x: len(seq[x])) 137 | 138 | if sort_by_len: 139 | sorted_index = len_argsort(test_set_x) 140 | # ranked from shortest to longest 141 | test_set_x = [test_set_x[i] for i in sorted_index] 142 | test_set_y = [test_set_y[i] for i in sorted_index] 143 | 144 | sorted_index = len_argsort(valid_set_x) 145 | valid_set_x = [valid_set_x[i] for i in sorted_index] 146 | valid_set_y = [valid_set_y[i] for i in sorted_index] 147 | 148 | sorted_index = len_argsort(train_set_x) 149 | train_set_x = [train_set_x[i] for i in sorted_index] 150 | train_set_y = [train_set_y[i] for i in sorted_index] 151 | 152 | train = (train_set_x, train_set_y) 153 | valid = (valid_set_x, valid_set_y) 154 | test = (test_set_x, test_set_y) 155 | 156 | return train, valid, test 157 | 158 | def load_mnist(path='mnist.pkl', fixed_permute=True, rand_permute=False): 159 | f = open(path, 'rb') 160 | 
train = pkl.load(f) 161 | valid = pkl.load(f) 162 | test = pkl.load(f) 163 | f.close() 164 | 165 | def _permute(data, perm): 166 | x, y = data 167 | x_new = [] 168 | for xx in x: 169 | xx_new = [xx[pp] for pp in perm] 170 | x_new.append(xx_new) 171 | return (x_new, y) 172 | 173 | def _trans2list(data): 174 | x, y = data 175 | x = [list(xx) for xx in x] 176 | return (x, y) 177 | 178 | if rand_permute: 179 | print 'Using a fixed random permutation of pixels...', 180 | perm = numpy.random.permutation(range(784)) 181 | train = _permute(train, perm) 182 | valid = _permute(valid, perm) 183 | test = _permute(test, perm) 184 | elif fixed_permute: 185 | print 'Using permuted dataset...', 186 | 187 | _trans2list(train) 188 | _trans2list(valid) 189 | _trans2list(test) 190 | 191 | return train, valid, test 192 | 193 | def get_minibatches_idx(n, minibatch_size, shuffle=False): 194 | """ 195 | Used to shuffle the dataset at each iteration. 196 | """ 197 | 198 | idx_list = numpy.arange(n, dtype="int32") 199 | 200 | if shuffle: 201 | numpy.random.shuffle(idx_list) 202 | 203 | minibatches = [] 204 | minibatch_start = 0 205 | for i in range(n // minibatch_size): 206 | minibatches.append(idx_list[minibatch_start: 207 | minibatch_start + minibatch_size]) 208 | minibatch_start += minibatch_size 209 | 210 | if (minibatch_start != n): 211 | # Make a minibatch out of what is left 212 | minibatches.append(idx_list[minibatch_start:]) 213 | 214 | return zip(range(len(minibatches)), minibatches) 215 | 216 | def get_minibatches_idx_bucket(dataset, minibatch_size, shuffle=False): 217 | """ 218 | divide into different buckets according to sequence lengths 219 | dynamic batch size 220 | """ 221 | # divide into buckets 222 | slen = [len(ss) for ss in dataset] 223 | bucket1000 = [sidx for sidx in xrange(len(dataset)) 224 | if slen[sidx] > 0 and slen[sidx] <= 1000] 225 | bucket3000 = [sidx for sidx in xrange(len(dataset)) 226 | if slen[sidx] > 1000 and slen[sidx] <= 3000] 227 | bucket_long = [sidx for sidx in xrange(len(dataset)) 228 | if slen[sidx] > 3000] 229 | 230 | # shuffle each bucket 231 | if shuffle: 232 | numpy.random.shuffle(bucket1000) 233 | numpy.random.shuffle(bucket3000) 234 | numpy.random.shuffle(bucket_long) 235 | 236 | # make minibatches 237 | def _make_batch(minibatches, bucket, minibatch_size): 238 | minibatch_start = 0 239 | n = len(bucket) 240 | for i in range(n // minibatch_size): 241 | minibatches.append(bucket[minibatch_start : minibatch_start + minibatch_size]) 242 | minibatch_start += minibatch_size 243 | if (minibatch_start != n): 244 | # Make a minibatch out of what is left 245 | minibatches.append(bucket[minibatch_start:]) 246 | return minibatches 247 | 248 | minibatches = [] 249 | _make_batch(minibatches, bucket1000, minibatch_size=minibatch_size) 250 | _make_batch(minibatches, bucket3000, minibatch_size=minibatch_size//2) 251 | _make_batch(minibatches, bucket_long, minibatch_size=minibatch_size//8) 252 | 253 | # shuffle minibatches 254 | numpy.random.shuffle(minibatches) 255 | 256 | return zip(range(len(minibatches)), minibatches) 257 | 258 | def prepare_data(seqs, labels, maxlen=None, dataset='text'): 259 | """Create the matrices from the datasets. 260 | 261 | This pad each sequence to the same lenght: the lenght of the 262 | longuest sequence or maxlen. 263 | 264 | if maxlen is set, we will cut all sequence to this maximum 265 | lenght. 266 | 267 | This swap the axis! 
268 | """ 269 | # x: a list of sentences 270 | lengths = [len(s) for s in seqs] 271 | 272 | if maxlen is not None: 273 | new_seqs = [] 274 | new_labels = [] 275 | new_lengths = [] 276 | for l, s, y in zip(lengths, seqs, labels): 277 | if l < maxlen: 278 | new_seqs.append(s) 279 | new_labels.append(y) 280 | new_lengths.append(l) 281 | lengths = new_lengths 282 | labels = new_labels 283 | seqs = new_seqs 284 | 285 | if len(lengths) < 1: 286 | return None, None, None 287 | 288 | n_samples = len(seqs) 289 | maxlen = numpy.max(lengths) 290 | 291 | if dataset == 'mnist': 292 | x = numpy.zeros((maxlen, n_samples)).astype('float32') 293 | else: 294 | x = numpy.zeros((maxlen, n_samples)).astype('int64') 295 | x_mask = numpy.zeros((maxlen, n_samples)).astype(config.floatX) 296 | for idx, s in enumerate(seqs): 297 | x[:lengths[idx], idx] = s 298 | x_mask[:lengths[idx], idx] = 1. 299 | 300 | return x, x_mask, labels 301 | 302 | def prepare_data_hier(seqs, labels, hier_len, maxlen=None, dataset='text'): 303 | ''' 304 | prepare minibatch for hierarchical model 305 | ''' 306 | # sort (long->short) 307 | sorted_idx = sorted(range(len(seqs)), key=lambda x: len(seqs[x]), reverse=True) 308 | seqs = [seqs[i] for i in sorted_idx] 309 | labels = [labels[i] for i in sorted_idx] 310 | 311 | # truncate data 312 | lengths = [len(s) for s in seqs] 313 | if maxlen is not None: 314 | new_seqs = [] 315 | new_labels = [] 316 | new_lengths = [] 317 | for l, s, y in zip(lengths, seqs, labels): 318 | if l '0.10.2': 30 | from IPython.core.debugger import Pdb, BdbQuit_excepthook 31 | try: 32 | get_ipython 33 | except NameError: 34 | # Make it more resilient to different versions of IPython and try to 35 | # find a module. 36 | possible_modules = ['IPython.terminal.embed', # Newer IPython 37 | 'IPython.frontend.terminal.embed'] # Older IPython 38 | 39 | count = len(possible_modules) 40 | for module in possible_modules: 41 | try: 42 | embed = __import__(module, fromlist=["InteractiveShellEmbed"]) 43 | InteractiveShellEmbed = embed.InteractiveShellEmbed 44 | except ImportError: 45 | count -= 1 46 | if count == 0: 47 | raise 48 | else: 49 | break 50 | 51 | ipshell = InteractiveShellEmbed() 52 | def_colors = ipshell.colors 53 | else: 54 | def_colors = get_ipython.im_self.colors 55 | 56 | from IPython.utils import io 57 | 58 | if 'nose' in sys.modules.keys(): 59 | def update_stdout(): 60 | # setup stdout to ensure output is available with nose 61 | io.stdout = sys.stdout = sys.__stdout__ 62 | else: 63 | def update_stdout(): 64 | pass 65 | else: 66 | from IPython.Debugger import Pdb, BdbQuit_excepthook 67 | from IPython.Shell import IPShell 68 | from IPython import ipapi 69 | 70 | ip = ipapi.get() 71 | if ip is None: 72 | IPShell(argv=['']) 73 | ip = ipapi.get() 74 | def_colors = ip.options.colors 75 | 76 | from IPython.Shell import Term 77 | 78 | if 'nose' in sys.modules.keys(): 79 | def update_stdout(): 80 | # setup stdout to ensure output is available with nose 81 | Term.cout = sys.stdout = sys.__stdout__ 82 | else: 83 | def update_stdout(): 84 | pass 85 | 86 | 87 | def wrap_sys_excepthook(): 88 | # make sure we wrap it only once or we would end up with a cycle 89 | # BdbQuit_excepthook.excepthook_ori == BdbQuit_excepthook 90 | if sys.excepthook != BdbQuit_excepthook: 91 | BdbQuit_excepthook.excepthook_ori = sys.excepthook 92 | sys.excepthook = BdbQuit_excepthook 93 | 94 | 95 | def set_trace(frame=None): 96 | update_stdout() 97 | wrap_sys_excepthook() 98 | if frame is None: 99 | frame = sys._getframe().f_back 100 | 
Pdb(def_colors).set_trace(frame) 101 | 102 | 103 | def post_mortem(tb): 104 | update_stdout() 105 | wrap_sys_excepthook() 106 | p = Pdb(def_colors) 107 | p.reset() 108 | if tb is None: 109 | return 110 | p.interaction(None, tb) 111 | 112 | 113 | def pm(): 114 | post_mortem(sys.last_traceback) 115 | 116 | 117 | def run(statement, globals=None, locals=None): 118 | Pdb(def_colors).run(statement, globals, locals) 119 | 120 | 121 | def runcall(*args, **kwargs): 122 | return Pdb(def_colors).runcall(*args, **kwargs) 123 | 124 | 125 | def runeval(expression, globals=None, locals=None): 126 | return Pdb(def_colors).runeval(expression, globals, locals) 127 | 128 | 129 | @contextmanager 130 | def launch_ipdb_on_exception(): 131 | try: 132 | yield 133 | except Exception: 134 | e, m, tb = sys.exc_info() 135 | print(m.__repr__(), file=sys.stderr) 136 | post_mortem(tb) 137 | finally: 138 | pass 139 | 140 | 141 | def main(): 142 | if not sys.argv[1:] or sys.argv[1] in ("--help", "-h"): 143 | print("usage: ipdb.py scriptfile [arg] ...") 144 | sys.exit(2) 145 | 146 | mainpyfile = sys.argv[1] # Get script filename 147 | if not os.path.exists(mainpyfile): 148 | print('Error:', mainpyfile, 'does not exist') 149 | sys.exit(1) 150 | 151 | del sys.argv[0] # Hide "pdb.py" from argument list 152 | 153 | # Replace pdb's dir with script's dir in front of module search path. 154 | sys.path[0] = os.path.dirname(mainpyfile) 155 | 156 | # Note on saving/restoring sys.argv: it's a good idea when sys.argv was 157 | # modified by the script being debugged. It's a bad idea when it was 158 | # changed by the user from the command line. There is a "restart" command 159 | # which allows explicit specification of command line arguments. 160 | pdb = Pdb(def_colors) 161 | while 1: 162 | try: 163 | pdb._runscript(mainpyfile) 164 | if pdb._user_requested_quit: 165 | break 166 | print("The program finished and will be restarted") 167 | except Restart: 168 | print("Restarting", mainpyfile, "with arguments:") 169 | print("\t" + " ".join(sys.argv[1:])) 170 | except SystemExit: 171 | # In most cases SystemExit does not warrant a post-mortem session. 172 | print("The program exited via sys.exit(). Exit status: ", end='') 173 | print(sys.exc_info()[1]) 174 | except: 175 | traceback.print_exc() 176 | print("Uncaught exception. Entering post mortem debugging") 177 | print("Running 'cont' or 'step' will restart the program") 178 | t = sys.exc_info()[2] 179 | pdb.interaction(None, t) 180 | print("Post mortem debugger finished. 
The " + mainpyfile + 181 | " will be restarted") 182 | 183 | if __name__ == '__main__': 184 | main() 185 | -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/model.npz.pkl: -------------------------------------------------------------------------------- 1 | (dp1 2 | S'monitor_grad' 3 | p2 4 | I00 5 | sS'dropout_output' 6 | p3 7 | F0.5 8 | sS'n_words' 9 | p4 10 | I10000 11 | sS'start_epoch' 12 | p5 13 | I0 14 | sS'dataset' 15 | p6 16 | S'text' 17 | p7 18 | sS'patience' 19 | p8 20 | I10 21 | sS'skip_steps2' 22 | p9 23 | I-1 24 | sS'hier_len' 25 | p10 26 | NsS'max_epochs' 27 | p11 28 | I5000 29 | sS'dispFreq' 30 | p12 31 | I50 32 | sS'newDumpFreq' 33 | p13 34 | I5000000 35 | sS'self' 36 | p14 37 | NsS'hybrid' 38 | p15 39 | I00 40 | sS'clip_c' 41 | p16 42 | F-1 43 | sS'dim_proj' 44 | p17 45 | I1024 46 | sS'saveto' 47 | p18 48 | S'model.npz' 49 | p19 50 | sS'start_iter' 51 | p20 52 | I0 53 | sS'lastHiddenLayer' 54 | p21 55 | NsS'noise_std' 56 | p22 57 | F0 58 | sS'batch_len_threshold' 59 | p23 60 | NsS'valid_batch_size' 61 | p24 62 | I16 63 | sS'corpus' 64 | p25 65 | S'imdb.pkl' 66 | p26 67 | sS'reload_options' 68 | p27 69 | NsS'optimizer' 70 | p28 71 | S'adadelta' 72 | p29 73 | sS'validFreq' 74 | p30 75 | I2000 76 | sS'dropout_input' 77 | p31 78 | F0.80000000000000004 79 | sS'warm_LM' 80 | p32 81 | NsS'batch_size' 82 | p33 83 | I16 84 | sS'encoder' 85 | p34 86 | S'lstm' 87 | p35 88 | sS'hierarchical' 89 | p36 90 | I00 91 | sS'reload_model' 92 | p37 93 | S'winner/warmClassifier.npz' 94 | p38 95 | sS'lrate' 96 | p39 97 | F1 98 | sS'truncate_grad' 99 | p40 100 | I-1 101 | sS'decay_c' 102 | p41 103 | F-1 104 | sS'encoder2' 105 | p42 106 | NsS'test_size' 107 | p43 108 | NsS'dim_word' 109 | p44 110 | I500 111 | sS'unit_depth' 112 | p45 113 | I-1 114 | sS'maxlen' 115 | p46 116 | NsS'skip_steps' 117 | p47 118 | I-1 119 | sS'embedding' 120 | p48 121 | NsS'logFile' 122 | p49 123 | S'log2' 124 | p50 125 | sS'mean_pooling' 126 | p51 127 | I00 128 | s. -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/monitor.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Microsoft. All rights reserved. 2 | # Licensed under the MIT license. See LICENSE file in the project root for full license information. 3 | 4 | from config import config_params 5 | import os 6 | os.environ['THEANO_FLAGS']='floatX=float32,device=cuda%d' % (config_params.gpu) 7 | if os.name == 'nt': 8 | cmdstr = '\"C:\\Program Files\\NVIDIA Corporation\\NVSMI\\nvidia-smi.exe\" ' 9 | os.system(cmdstr) 10 | else: 11 | os.system(r'nvidia-smi') 12 | 13 | from CLM.CLM import CLM_worker 14 | from Classifier.Models import Model as Classifier 15 | import theano 16 | import theano.tensor as tensor 17 | import numpy 18 | from Util_basic import sgd_joint, prepare_data_x, unzip, itemlist_NoEmb, adadelta_joint, Optim 19 | from Data import load_data, get_minibatches_idx, get_minibatches_idx_bucket 20 | from collections import OrderedDict 21 | 22 | def grad_clipping(grads, clip_c): 23 | g2 = 0. 
24 | for g in grads: 25 | g2 += (g**2).sum() 26 | new_grads = [] 27 | for g in grads: 28 | new_grads.append(tensor.switch(g2 > (clip_c**2), g/tensor.sqrt(g2) * clip_c, g)) 29 | return new_grads, tensor.sqrt(g2) 30 | 31 | class monitor(object): 32 | def __init__(self): 33 | print config_params 34 | self.CLM = CLM_worker(lrate=1., 35 | optimizer='adadelta', 36 | batch_size=config_params.minibatch, 37 | saveto='model.npz', 38 | validFreq=2000, 39 | dispFreq=100, 40 | dropout_input=config_params.CLM_drop_in, 41 | dropout_output=config_params.CLM_drop_out, 42 | reload_model=config_params.model_dir + '/' + config_params.model_L2S, 43 | reload_option=None, 44 | log='log1' 45 | ) 46 | self.classifier = Classifier(lrate=1., # Learning rate for sgd (not used for adadelta and rmsprop) 47 | optimizer='adadelta', 48 | saveto='model.npz', # The best model will be saved there 49 | dispFreq=50, # Display the training progress after this number of updates 50 | validFreq=2000, # Compute the validation error after this number of updates 51 | batch_size=config_params.minibatch, # The batch size during training. 52 | batch_len_threshold=None, # use dynamic batch size if sequence lengths exceed this threshold 53 | valid_batch_size=config_params.minibatch, # The batch size used for validation/test set. 54 | lastHiddenLayer=None, 55 | dropout_output=config_params.classifier_drop_out, 56 | dropout_input=config_params.classifier_drop_in, 57 | reload_options=None, # Path to a saved model options we want to start from 58 | reload_model=config_params.model_dir + '/' + config_params.model_S2L, 59 | embedding=None, # Path to the word embedding file (otherwise randomized) 60 | warm_LM=None, 61 | logFile='log2' # Path to log file 62 | ) 63 | self.trainSet, self.validSet, self.testSet = \ 64 | load_data(path=config_params.data_dir, n_words=10000, maxlen=None, sort_by_len=True, fixed_valid=True) 65 | self.LMscore = numpy.load(config_params.LMScoreFile) 66 | self.LMscore = self.LMscore[self.LMscore.files[0]] 67 | self.build() 68 | 69 | def build(self): 70 | LMsores = tensor.vector('LMScore', dtype='float32') 71 | lrate = tensor.scalar(dtype='float32') 72 | 73 | CLM_srcx, CLM_srcx_mask, CLM_ctx_, CLM_cost, CLM_sentenceLen = self.CLM.GetNll() 74 | classifier_x, classifier_mask, classifier_y, classifier_nlls = self.classifier.GetNll() 75 | consistent_loss = (((classifier_nlls + numpy.log(0.5))/CLM_sentenceLen + LMsores - CLM_cost) ** 2).mean() 76 | CLM_cost_avg = CLM_cost.mean() 77 | overall_L2S = CLM_cost_avg + config_params.trade_off_L2S * config_params.trade_off_L2S * consistent_loss 78 | classifier_nlls_avg = classifier_nlls.mean() 79 | overall_S2L = classifier_nlls_avg + config_params.trade_off_S2L * config_params.trade_off_S2L * consistent_loss 80 | 81 | if config_params.FreezeEmb: 82 | grads_L2S = tensor.grad(overall_L2S, wrt=itemlist_NoEmb(self.CLM.tparams)) 83 | else: 84 | grads_L2S = tensor.grad(overall_L2S, wrt=self.CLM.tparams.values()) 85 | if config_params.clip_L2S > 0.: 86 | grads_L2S, norm_grads_L2S = grad_clipping(grads_L2S, config_params.clip_L2S) 87 | else: 88 | norm_grads_L2S = tensor.alloc(-1.) 89 | 90 | if config_params.FreezeEmb: 91 | grads_S2L = tensor.grad(overall_S2L, wrt=itemlist_NoEmb(self.classifier.tparams)) 92 | else: 93 | grads_S2L = tensor.grad(overall_S2L, wrt=self.classifier.tparams.values()) 94 | if config_params.clip_S2L > 0.: 95 | grads_S2L, norm_grads_S2L = grad_clipping(grads_S2L, config_params.clip_S2L) 96 | else: 97 | norm_grads_S2L = tensor.alloc(-1.) 
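        # consistent_loss above is the dual-supervised-learning duality penalty: ideally
        #     log P(x) + log P(y|x) = log P(y) + log P(x|y),
        # with a uniform label prior P(y) = 0.5 (hence numpy.log(0.5)). Assuming LMsores and
        # CLM_cost hold per-token negative log-likelihoods of the sentence under the unconditional
        # and the label-conditioned language model (an interpretation, not stated in the code),
        # the term is the mean squared violation of this identity; it enters both overall_L2S and
        # overall_S2L, scaled by the squared trade-off coefficients.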
98 | 99 | if config_params.dual_style == 'all': 100 | merged_var_dic = OrderedDict() 101 | if config_params.FreezeEmb: 102 | merged_var_dic.update(OrderedDict((k + '_S2L',v) for (k,v) in self.classifier.tparams.iteritems() if 'Wemb' not in k )) 103 | merged_var_dic.update(OrderedDict((k + '_L2S',v) for (k,v) in self.CLM.tparams.iteritems() if 'Wemb' not in k )) 104 | else: 105 | merged_var_dic.update(OrderedDict((k + '_S2L',v) for (k,v) in self.classifier.tparams.iteritems())) 106 | merged_var_dic.update(OrderedDict((k + '_L2S',v) for (k,v) in self.CLM.tparams.iteritems())) 107 | 108 | inps = [CLM_srcx, CLM_srcx_mask, CLM_ctx_, classifier_x, classifier_mask, classifier_y, LMsores] 109 | outs = [CLM_cost_avg, classifier_nlls_avg, consistent_loss, overall_L2S, overall_S2L, norm_grads_L2S, norm_grads_S2L] 110 | self.f_grad_shared, self.f_update = Optim[config_params.optim](lrate, merged_var_dic, grads_S2L + grads_L2S, inps, outs) 111 | elif config_params.dual_style == 'S2L': 112 | if config_params.FreezeEmb: 113 | merged_var_dic = OrderedDict((k + '_S2L',v) for (k,v) in self.classifier.tparams.iteritems() if 'Wemb' not in k ) 114 | else: 115 | merged_var_dic = OrderedDict((k + '_S2L',v) for (k,v) in self.classifier.tparams.iteritems()) 116 | 117 | inps = [CLM_srcx, CLM_srcx_mask, CLM_ctx_, classifier_x, classifier_mask, classifier_y, LMsores] 118 | norm_grads_L2S = tensor.alloc(-1.) 119 | outs = [CLM_cost_avg, classifier_nlls_avg, consistent_loss, overall_L2S, overall_S2L, norm_grads_L2S, norm_grads_S2L] 120 | self.f_grad_shared, self.f_update = Optim[config_params.optim](lrate, merged_var_dic, grads_S2L, inps, outs) 121 | elif config_params.dual_style == 'L2S': 122 | if config_params.FreezeEmb: 123 | merged_var_dic = OrderedDict((k + '_L2S',v) for (k,v) in self.CLM.tparams.iteritems() if 'Wemb' not in k ) 124 | else: 125 | merged_var_dic = OrderedDict((k + '_L2S',v) for (k,v) in self.CLM.tparams.iteritems()) 126 | 127 | inps = [CLM_srcx, CLM_srcx_mask, CLM_ctx_, classifier_x, classifier_mask, classifier_y, LMsores] 128 | norm_grads_S2L = tensor.alloc(-1.) 129 | outs = [CLM_cost_avg, classifier_nlls_avg, consistent_loss, overall_L2S, overall_S2L, norm_grads_L2S, norm_grads_S2L] 130 | self.f_grad_shared, self.f_update = Optim[config_params.optim](lrate, merged_var_dic, grads_L2S, inps, outs) 131 | else: 132 | raise Exception('Not support {} in dual_style'.format(config_params.dual_style)) 133 | 134 | def train_one_minibatch(self, seqx, seqy, LMscore): 135 | CLM_x, CLM_xmask = prepare_data_x(seqx, pad_eos=True) 136 | labels = numpy.array(seqy).astype('int64') 137 | classifier_x, classifier_xmask = prepare_data_x(seqx, pad_eos=False) 138 | CLM_cost_avg, classifier_nlls_avg, consistent_loss, overall_L2S, overall_S2L, norm_grads_L2S, norm_grads_S2L = self.f_grad_shared( 139 | CLM_x, CLM_xmask, labels, classifier_x, classifier_xmask, labels, LMscore 140 | ) 141 | print 'CLM_cost_avg=%f, classifier_nlls_avg=%f, norm_grads_L2S=%f, norm_grads_S2L=%f, consistent_loss=%f,' \ 142 | ' overall_L2S=%f, overall_S2L=%f' % ( 143 | CLM_cost_avg, classifier_nlls_avg, norm_grads_L2S, norm_grads_S2L, consistent_loss, overall_L2S, overall_S2L ) 144 | self.f_update(config_params.lrate) 145 | 146 | def train(self): 147 | uidx = 0 148 | for eidx in xrange(0, config_params.maxEpoch): 149 | n_samples = 0 150 | self.kf_train = get_minibatches_idx_bucket(self.trainSet[0],config_params.minibatch,shuffle=True) 151 | 152 | for _, train_index in self.kf_train: 153 | uidx += 1 154 | self.classifier.use_noise.set_value(1.) 
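                # use_noise = 1. switches dropout on for this update; both flags are reset to 0. before validation.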
155 | self.CLM.use_noise.set_value(1.) 156 | 157 | # Select the random examples for this minibatch 158 | seqx = [self.trainSet[0][t] for t in train_index] 159 | seqy = [self.trainSet[1][t] for t in train_index] 160 | LMscore = [self.LMscore[t] for t in train_index] 161 | self.train_one_minibatch(seqx, seqy, numpy.array(LMscore).astype('float32')) 162 | 163 | if uidx % config_params.validFreq == 0: 164 | self.classifier.use_noise.set_value(0.) 165 | self.CLM.use_noise.set_value(0.) 166 | 167 | if config_params.dual_style == 'all': 168 | suffix_S2L = self.valid_S2L() 169 | suffix_L2S = self.valid_L2S() 170 | 171 | S2Lpath = config_params.model_dir + '/model_S2L_' + suffix_S2L + '_uidx' + str(uidx) 172 | L2Spath = config_params.model_dir + '/model_L2S_' + suffix_L2S + '_uidx' + str(uidx) 173 | 174 | numpy.savez(S2Lpath, history_errs=[], **unzip(self.classifier.tparams) ) 175 | numpy.savez(L2Spath, history_errs=[], **unzip(self.CLM.tparams) ) 176 | elif config_params.dual_style == 'S2L': 177 | suffix_S2L = self.valid_S2L() 178 | S2Lpath = config_params.model_dir + '/model_S2L_' + suffix_S2L + '_uidx' + str(uidx) 179 | numpy.savez(S2Lpath, history_errs=[], **unzip(self.classifier.tparams) ) 180 | elif config_params.dual_style == 'L2S': 181 | suffix_L2S = self.valid_L2S() 182 | L2Spath = config_params.model_dir + '/model_L2S_' + suffix_L2S + '_uidx' + str(uidx) 183 | numpy.savez(L2Spath, history_errs=[], **unzip(self.CLM.tparams) ) 184 | 185 | 186 | def valid_S2L(self): 187 | acc = self.classifier.evaluate(self.trainSet, self.validSet, self.testSet) 188 | print 'TrainAcc=%f, ValidAcc=%f, TestAcc=%f' % (acc[0], acc[1], acc[2]) 189 | return 'train_{}_valid_{}_test_{}'.format(acc[0], acc[1], acc[2]) 190 | 191 | def valid_L2S(self): 192 | valid_ppl, test_ppl = self.CLM.evaluate(self.validSet, self.testSet) 193 | print 'Valid_PPL=%f, Test_PPL=%f' % (valid_ppl, test_ppl) 194 | return 'valid_{}_test_{}'.format(valid_ppl, test_ppl) 195 | 196 | 197 | if __name__ == '__main__': 198 | runner = monitor() 199 | runner.train() 200 | 201 | 202 | 203 | 204 | 205 | 206 | 207 | 208 | 209 | 210 | 211 | 212 | 213 | -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/train.bat: -------------------------------------------------------------------------------- 1 | python monitor.py --classifier_drop_in=0.8 --classifier_drop_out=0.5 --clip_L2S=5.0 --trade_off_S2L=5.0 --trade_off_L2S=5.0 --validFreq=5000 --model_dir=your_model_folder --lrS2L=0.1 --lrL2S=0.1 --lrate=0.1 --bias=0.0 --dual_style=all --optim=adadelta -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/train_linux.sh: -------------------------------------------------------------------------------- 1 | python monitor.py --classifier_drop_in=0.8 --classifier_drop_out=0.5 --clip_L2S=5.0 --trade_off_S2L=5.0 --trade_off_L2S=5.0 --validFreq=5000 --model_dir=Sentiment_model --lrS2L=0.1 --lrL2S=0.1 --lrate=0.1 --bias=0.0 --dual_style=all --optim=adadelta -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/valid.bat: -------------------------------------------------------------------------------- 1 | python inference.py --model_dir=winner --model_S2L=warmClassifier.npz --model_L2S=warmCLM.npz -------------------------------------------------------------------------------- /DSL_SentimentAnalysis/valid_linux.sh: -------------------------------------------------------------------------------- 1 | python inference.py 
--model_dir=winner --model_S2L=warmClassifier.npz --model_L2S=warmCLM.npz --gpu=3 -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) Microsoft Corporation. All rights reserved. 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | # Contributing 3 | 4 | This project welcomes contributions and suggestions. Most contributions require you to agree to a 5 | Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us 6 | the rights to use your contribution. For details, visit https://cla.microsoft.com. 7 | 8 | When you submit a pull request, a CLA-bot will automatically determine whether you need to provide 9 | a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions 10 | provided by the bot. You will only need to do this once across all repos using our CLA. 11 | 12 | This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). 13 | For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or 14 | contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments. 15 | 16 | The code consists of two parts: 17 | (1) dual supervised learning for image processing: DSL_ImgProcess 18 | (2) dual supervised learning for sentiment analysis: DSL_SentimentAnalysis -------------------------------------------------------------------------------- /SECURITY.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | ## Security 4 | 5 | Microsoft takes the security of our software products and services seriously, which includes all source code repositories managed through our GitHub organizations, which include [Microsoft](https://github.com/microsoft), [Azure](https://github.com/Azure), [DotNet](https://github.com/dotnet), [AspNet](https://github.com/aspnet), [Xamarin](https://github.com/xamarin), and [our GitHub organizations](https://opensource.microsoft.com/). 
6 | 7 | If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](https://aka.ms/opensource/security/definition), please report it to us as described below. 8 | 9 | ## Reporting Security Issues 10 | 11 | **Please do not report security vulnerabilities through public GitHub issues.** 12 | 13 | Instead, please report them to the Microsoft Security Response Center (MSRC) at [https://msrc.microsoft.com/create-report](https://aka.ms/opensource/security/create-report). 14 | 15 | If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com). If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://aka.ms/opensource/security/pgpkey). 16 | 17 | You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://aka.ms/opensource/security/msrc). 18 | 19 | Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue: 20 | 21 | * Type of issue (e.g. buffer overflow, SQL injection, cross-site scripting, etc.) 22 | * Full paths of source file(s) related to the manifestation of the issue 23 | * The location of the affected source code (tag/branch/commit or direct URL) 24 | * Any special configuration required to reproduce the issue 25 | * Step-by-step instructions to reproduce the issue 26 | * Proof-of-concept or exploit code (if possible) 27 | * Impact of the issue, including how an attacker might exploit the issue 28 | 29 | This information will help us triage your report more quickly. 30 | 31 | If you are reporting for a bug bounty, more complete reports can contribute to a higher bounty award. Please visit our [Microsoft Bug Bounty Program](https://aka.ms/opensource/security/bounty) page for more details about our active programs. 32 | 33 | ## Preferred Languages 34 | 35 | We prefer all communications to be in English. 36 | 37 | ## Policy 38 | 39 | Microsoft follows the principle of [Coordinated Vulnerability Disclosure](https://aka.ms/opensource/security/cvd). 40 | 41 | 42 | --------------------------------------------------------------------------------
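A minimal NumPy sketch of the duality gap penalized by consistent_loss in DSL_SentimentAnalysis/monitor.py. It assumes lm_score and clm_nll are per-token negative log-likelihoods of a sentence under the unconditional and the label-conditioned language model; that reading of LMsores and CLM_cost is an assumption, not something stated in the repository.

import numpy as np

def duality_gap(classifier_nll, clm_nll, lm_score, sent_len, label_prior=0.5):
    """Mean squared violation of log P(x) + log P(y|x) = log P(y) + log P(x|y)."""
    gap = (classifier_nll + np.log(label_prior)) / sent_len + lm_score - clm_nll
    return np.mean(gap ** 2)

# toy minibatch of three sentences
classifier_nll = np.array([0.7, 0.4, 1.2])   # -log P(y|x), per sentence
clm_nll        = np.array([4.1, 3.8, 5.0])   # -log P(x|y), per token (assumed)
lm_score       = np.array([4.0, 3.9, 4.8])   # -log P(x),   per token (assumed)
sent_len       = np.array([12., 9., 20.])
print(duality_gap(classifier_nll, clm_nll, lm_score, sent_len))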