├── dandelion ├── ext │ ├── __init__.py │ ├── CV │ │ ├── __init__.py │ │ └── CV.py │ └── visual │ │ ├── __init__.py │ │ └── visual.py ├── model │ ├── __init__.py │ ├── resnet.py │ ├── vgg.py │ ├── feature_pyramid_net.py │ ├── AlternateLSTM2D.py │ ├── unet.py │ └── ctpn.py ├── __init__.py └── activation.py ├── MANIFEST.in ├── CHANGES.md ├── docs ├── SC_model.png ├── center_1.png ├── center_2.png ├── model_summary.png ├── dandelion_initialization.md ├── dandelion_activation.md ├── dandelion_util.md ├── dandelion_ext_visual.md ├── dandelion_ext_CV.md ├── howtos.md ├── index.md ├── dandelion_update.md ├── dandelion_objective.md ├── dandelion_functional.md ├── tutorial II - Write Your Own Module.md ├── history.md └── dandelion_model.md ├── .gitignore ├── .travis.yml ├── mkdocs.yml ├── test ├── test_Unet.py ├── test_vgg.py ├── test_resnet.py ├── test_feature_pyramid_net.py ├── test_Sequential.py ├── test_LSTM2D.py ├── test_AlternateLSTM2D.py ├── test_CTPN.py ├── test_GroupNorm.py ├── test_pooling.py ├── test_softmax_ndim.py ├── test_Dense.py ├── test_upsample_2d.py ├── test_categorical_crossentropy_log.py ├── test_Conv2D.py ├── test_ConvTransposed2D.py ├── test_GRU.py ├── test_im2col.py ├── test_BatchNorm.py ├── test_todevice.py ├── test_LSTM.py ├── test_VGG16_weights.py ├── test_spatial_pyramid_pooling.py └── test_shufflenet.py ├── setup.py ├── debug ├── shuffleseg_train_debug.py └── speed_im2col.py └── README.md /dandelion/ext/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /dandelion/ext/CV/__init__.py: -------------------------------------------------------------------------------- 1 | from .CV import * -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | recursive-include dandelion LICENSE 2 | 
-------------------------------------------------------------------------------- /dandelion/ext/visual/__init__.py: -------------------------------------------------------------------------------- 1 | from .visual import * -------------------------------------------------------------------------------- /CHANGES.md: -------------------------------------------------------------------------------- 1 | Changelog 2 | --------- 3 | Refer to docs/history.md for details. 4 | -------------------------------------------------------------------------------- /docs/SC_model.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/david-leon/Dandelion/HEAD/docs/SC_model.png -------------------------------------------------------------------------------- /docs/center_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/david-leon/Dandelion/HEAD/docs/center_1.png -------------------------------------------------------------------------------- /docs/center_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/david-leon/Dandelion/HEAD/docs/center_2.png -------------------------------------------------------------------------------- /docs/model_summary.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/david-leon/Dandelion/HEAD/docs/model_summary.png -------------------------------------------------------------------------------- /dandelion/model/__init__.py: -------------------------------------------------------------------------------- 1 | from .unet import * 2 | from .vgg import * 3 | from .resnet import * 4 | from .feature_pyramid_net import * 5 | from .shufflenet import * 6 | from .AlternateLSTM2D import * 7 | from .ctpn import * -------------------------------------------------------------------------------- 
/.gitignore: -------------------------------------------------------------------------------- 1 | # Add any directories, files, or patterns you don't want to be tracked by version control 2 | .idea 3 | .pyc 4 | .cache 5 | __pycache__ 6 | build 7 | dist 8 | Dandelion.egg-info 9 | Cython/*.whl 10 | site 11 | weights -------------------------------------------------------------------------------- /docs/dandelion_initialization.md: -------------------------------------------------------------------------------- 1 | Dandelion's `initialization` module is mostly inherited from [Lasagne](https://github.com/Lasagne/Lasagne). 2 | Refer to the [`Lasagne.init` documentation](http://lasagne.readthedocs.io/en/latest/modules/init.html) for details. -------------------------------------------------------------------------------- /dandelion/__init__.py: -------------------------------------------------------------------------------- 1 | from . import module 2 | from . import util 3 | from . import initialization 4 | from . import update 5 | from . import activation 6 | from . import objective 7 | from . import functional 8 | from . import model 9 | from . 
import ext 10 | 11 | __version__ = "0.17.26" 12 | __author__ = "David Leon (Dawei Leng)" 13 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | language: python 2 | sudo: false 3 | python: 4 | - "3.6" 5 | 6 | before_install: 7 | - pip install -U pip 8 | install: 9 | - travis_wait travis_retry pip install pytest 10 | - travis_wait travis_retry pip install pydot 11 | - travis_wait travis_retry pip install psutil 12 | - travis_wait travis_retry pip install keras 13 | - travis_wait travis_retry pip install tensorflow 14 | - travis_wait travis_retry pip install git+https://github.com/david-leon/Lasagne_CTC.git 15 | - travis_wait travis_retry pip install git+https://github.com/david-leon/Lasagne_Ext.git 16 | - travis_wait travis_retry pip install git+https://github.com/Theano/Theano.git#egg=Theano 17 | - travis_retry python setup.py install 18 | script: pytest test 19 | 20 | cache: 21 | - directories: 22 | - $HOME/.theano 23 | -------------------------------------------------------------------------------- /docs/dandelion_activation.md: -------------------------------------------------------------------------------- 1 | Dandelion's `activation` module is mostly inherited from [Lasagne](https://github.com/Lasagne/Lasagne) except for the `softmax()` and `log_softmax()` activations. 
2 | 3 | Refer to the [`Lasagne.nonlinearities` documentation](http://lasagne.readthedocs.io/en/latest/modules/nonlinearities.html) for the following activations: 4 | 5 | * sigmoid 6 | * tanh 7 | * relu 8 | * softplus 9 | * ultra_fast_sigmoid 10 | * ScaledTanH 11 | * leaky_rectify 12 | * very_leaky_rectify 13 | * elu 14 | * SELU 15 | * linear 16 | * identity 17 | 18 | _______________________________________________________________________ 19 | ## softmax 20 | Apply softmax to the last dimension of input `x` 21 | ```python 22 | softmax(x) 23 | ``` 24 | * **x**: theano tensor of any shape 25 | 26 | _______________________________________________________________________ 27 | ## log_softmax 28 | Apply softmax to the last dimension of input `x`, in log domain 29 | ```python 30 | log_softmax(x) 31 | ``` 32 | * **x**: theano tensor of any shape -------------------------------------------------------------------------------- /docs/dandelion_util.md: -------------------------------------------------------------------------------- 1 | ## gpickle 2 | Pickle with gzip enabled. 3 | ```python 4 | .dump(data, filename, compresslevel=9) 5 | ``` 6 | * **data**: data to be dumped to file 7 | * **filename**: file path 8 | * **compresslevel**: gzip compression level, default = 9. 9 | 10 | ```python 11 | .load(filename) 12 | ``` 13 | * **filename**: file to be loaded 14 | 15 | _______________________________________________________________________ 16 | ## theano_safe_run 17 | Catch theano memory exceptions raised while running a theano function 18 | ```python 19 | theano_safe_run(fn, input_list) 20 | ``` 21 | * **fn**: theano function to run 22 | * **input_list**: list of input arguments 23 | * **return**: errcode and function execution result 24 | 25 | `theano_safe_run()` catches the following 4 memory exceptions (covering theano 0.x through 1.x): 26 | 27 | * MemoryError. errcode=1 28 | * CudaNdarray_ZEROS: allocation failed. 
errcode=2 29 | * gpudata_alloc: cuMemAlloc: CUDA_ERROR_OUT_OF_MEMORY: out of memory. errcode=3 30 | * cuMemAlloc: CUDA_ERROR_OUT_OF_MEMORY: out of memory. errcode=4 31 | 32 | -------------------------------------------------------------------------------- /mkdocs.yml: -------------------------------------------------------------------------------- 1 | site_name: Dandelion 2 | site_author: David Leon (Dawei Leng) 3 | docs_dir: docs 4 | 5 | markdown_extensions: 6 | - pymdownx.arithmatex 7 | 8 | extra_javascript: 9 | - https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML 10 | 11 | nav: 12 | - Home: index.md 13 | - Tutorials: 14 | - I - Sentence Topic Classification: tutorial I - Sentence Topic Classification.md 15 | - II - Write Your Own Module: tutorial II - Write Your Own Module.md 16 | - III - Howtos: howtos.md 17 | # - Document Image Classification: tutorial II - Document Image Classification.md 18 | - Framework Interface: 19 | - dandelion.module: dandelion_module.md 20 | - dandelion.functional: dandelion_functional.md 21 | - dandelion.activation: dandelion_activation.md 22 | - dandelion.objective: dandelion_objective.md 23 | - dandelion.update: dandelion_update.md 24 | - dandelion.initialization: dandelion_initialization.md 25 | - dandelion.util: dandelion_util.md 26 | - dandelion.model: dandelion_model.md 27 | - Extensions: 28 | - dandelion.ext.CV: dandelion_ext_CV.md 29 | - dandelion.ext.visual: dandelion_ext_visual.md 30 | - History: history.md 31 | 32 | 33 | theme: readthedocs 34 | -------------------------------------------------------------------------------- /test/test_Unet.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | # Test for unet, partial. 
3 | # Created : 5, 25, 2018 4 | # Revised : 5, 25, 2018 5 | # All rights reserved 6 | #------------------------------------------------------------------------------------------------ 7 | __author__ = 'dawei.leng' 8 | import os, sys, psutil 9 | os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'" 10 | 11 | import theano 12 | from theano import tensor 13 | from dandelion.module import * 14 | from dandelion.activation import * 15 | from dandelion.model.unet import model_Unet 16 | 17 | import dandelion 18 | dandelion_path = os.path.split(dandelion.__file__)[0] 19 | print('dandelion path = %s\n' % dandelion_path) 20 | 21 | def test_case_0(): 22 | im_height, im_width = 65, 63 23 | model = model_Unet(im_height=im_height, im_width=im_width) 24 | x = tensor.ftensor4('x') 25 | y = model.forward(x) 26 | print('compiling fn...') 27 | fn = theano.function([x], y, no_default_updates=False) 28 | print('run fn...') 29 | input = np.random.rand(7, 1, im_height, im_width).astype(np.float32) 30 | output = fn(input) 31 | print(output) 32 | print(output.shape) 33 | 34 | if __name__ == '__main__': 35 | 36 | test_case_0() 37 | 38 | print('Test passed') 39 | 40 | 41 | 42 | -------------------------------------------------------------------------------- /test/test_vgg.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | # Test for VGG net, partial. 
3 | # Created : 7, 6, 2018 4 | # Revised : 7, 6, 2018 5 | # All rights reserved 6 | #------------------------------------------------------------------------------------------------ 7 | __author__ = 'dawei.leng' 8 | import os, sys 9 | os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'" 10 | 11 | import theano 12 | from theano import tensor 13 | from dandelion.module import * 14 | from dandelion.activation import * 15 | from dandelion.model.vgg import model_VGG16 16 | 17 | import dandelion 18 | dandelion_path = os.path.split(dandelion.__file__)[0] 19 | print('dandelion path = %s\n' % dandelion_path) 20 | 21 | def test_case_0(): 22 | im_height, im_width = 224, 224 23 | model = model_VGG16(channel=1, im_height=im_height, im_width=im_width) 24 | x = tensor.ftensor4('x') 25 | y = model.forward(x) 26 | print('compiling fn...') 27 | fn = theano.function([x], y, no_default_updates=False) 28 | print('run fn...') 29 | input = np.random.rand(8, 1, im_height, im_width).astype(np.float32) 30 | output = fn(input) 31 | print(output) 32 | print(output.shape) 33 | 34 | if __name__ == '__main__': 35 | 36 | test_case_0() 37 | 38 | print('Test passed') 39 | 40 | 41 | 42 | -------------------------------------------------------------------------------- /test/test_resnet.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | # Test for ResNet, partial. 
3 | # Created : 7, 6, 2018 4 | # Revised : 7, 6, 2018 5 | # All rights reserved 6 | #------------------------------------------------------------------------------------------------ 7 | __author__ = 'dawei.leng' 8 | import os, sys 9 | os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'" 10 | 11 | import theano 12 | from theano import tensor 13 | from dandelion.module import * 14 | from dandelion.activation import * 15 | from dandelion.model.resnet import ResNet_bottleneck 16 | 17 | import dandelion 18 | dandelion_path = os.path.split(dandelion.__file__)[0] 19 | print('dandelion path = %s\n' % dandelion_path) 20 | 21 | def test_case_0(): 22 | model = ResNet_bottleneck(outer_channel=16, inner_channel=4, border_mode='same', batchnorm_mode=0, activation=relu) 23 | x = tensor.ftensor4('x') 24 | y = model.forward(x) 25 | print('compiling fn...') 26 | fn = theano.function([x], y, no_default_updates=False) 27 | print('run fn...') 28 | input = np.random.rand(4, 16, 32, 33).astype(np.float32) 29 | output = fn(input) 30 | print(output) 31 | print(output.shape) 32 | 33 | if __name__ == '__main__': 34 | 35 | test_case_0() 36 | 37 | print('Test passed') 38 | 39 | 40 | 41 | -------------------------------------------------------------------------------- /test/test_feature_pyramid_net.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | # Test for feature pyramid net, partial. 
3 | # Created : 7, 6, 2018 4 | # Revised : 7, 6, 2018 5 | # All rights reserved 6 | #------------------------------------------------------------------------------------------------ 7 | __author__ = 'dawei.leng' 8 | import os, sys 9 | os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'" 10 | 11 | import theano 12 | from theano import tensor 13 | from dandelion.module import * 14 | from dandelion.activation import * 15 | from dandelion.model.feature_pyramid_net import model_FPN 16 | 17 | import dandelion 18 | dandelion_path = os.path.split(dandelion.__file__)[0] 19 | print('dandelion path = %s\n' % dandelion_path) 20 | 21 | def test_case_0(): 22 | model = model_FPN(input_channel=1, batchnorm_mode=2, base_n_filters=32) 23 | x = tensor.ftensor4('x') 24 | y = model.forward(x) 25 | print('compiling fn...') 26 | fn = theano.function([x], y, no_default_updates=False) 27 | print('run fn...') 28 | input = np.random.rand(2, 1, 64, 63).astype(np.float32) 29 | output = fn(input) 30 | for r in output: 31 | print(r.shape) 32 | # print(output.shape) 33 | print(output) 34 | 35 | if __name__ == '__main__': 36 | 37 | test_case_0() 38 | 39 | print('Test passed') 40 | 41 | 42 | 43 | -------------------------------------------------------------------------------- /docs/dandelion_ext_visual.md: -------------------------------------------------------------------------------- 1 | **Model Summary and Visualization Toolkits** 2 | _______________________________________________________________________ 3 | 4 | 5 | ## get_model_size 6 | Calculate model parameter size, return result in bytes. 7 | ```python 8 | get_model_size(model) 9 | ``` 10 | * **model**: model defined by Dandelion 11 | 12 | _______________________________________________________________________ 13 | ## get_model_summary 14 | Produce model parameter summary. 
15 | ```python 16 | get_model_summary(model, size_unit='M') 17 | ``` 18 | * **model**: model defined by Dandelion 19 | * **size_unit**: {*'M'*|'K'|'B'|int}, unit for calculating parameter size. 20 | * **return**: `OrderedDict` instance. You can use `json.dumps()` to get a formatted JSON report. For example (with `get_model_summary` imported from `dandelion.ext.visual`): 21 | 22 | ```python 23 | import json 24 | from dandelion.model import Alternate_2D_LSTM 25 | input_dim, hidden_dim, B, H, W = 8, 8, 2, 32, 32 26 | model = Alternate_2D_LSTM(input_dims=[input_dim],hidden_dim=hidden_dim, peephole=True, mode=2) 27 | model_summary = get_model_summary(model, size_unit=1) 28 | rpt = json.dumps(model_summary, ensure_ascii=False, indent=2) 29 | print(rpt) 30 | ``` 31 | The following is a snapshot of a complex model's summary, in which: 32 | 33 | * the `size` attribute is in MB. 34 | * the `percent` attribute is level-wise, and already in the 0 to 100 range. 35 | 36 | For example, the snapshot shows that *model_6* (weights) is 21.36 MB in total, and the first convolution layer *stage1* accounts for 0.011% of all the weights. 
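To make the level-wise `percent` idea concrete, here is a minimal standard-library sketch. It does not call Dandelion: the nested layout, the `children` key and the helper name `add_levelwise_percent` are assumptions made for illustration, and reading "level-wise" as "relative to the enclosing level" is also an assumption, not something this document confirms.

```python
import json
from collections import OrderedDict

def add_levelwise_percent(node):
    """Annotate every child with its share (0 to 100) of its parent's size.

    Hypothetical helper: the key names below are illustrative only.
    """
    for child in node.get("children", OrderedDict()).values():
        child["percent"] = 100.0 * child["size"] / node["size"] if node["size"] else 0.0
        add_levelwise_percent(child)
    return node

# A made-up summary tree; sizes are in an arbitrary unit.
summary = OrderedDict(size=20.0, children=OrderedDict(
    stage1=OrderedDict(size=5.0),
    stage2=OrderedDict(size=15.0, children=OrderedDict(
        conv=OrderedDict(size=12.0),
        bn=OrderedDict(size=3.0),
    )),
))

add_levelwise_percent(summary)
report = json.dumps(summary, indent=2)
print(report)
```

Under this reading, the percentages of the children at each level add up to 100 within that level rather than across the whole model.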
37 | 38 | ![model_summary](model_summary.png) -------------------------------------------------------------------------------- /test/test_Sequential.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | # Test for Sequential container 3 | # Created : 7, 12, 2018 4 | # Revised : 7, 12, 2018 5 | # All rights reserved 6 | #------------------------------------------------------------------------------------------------ 7 | __author__ = 'dawei.leng' 8 | import os 9 | os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'" 10 | 11 | import theano 12 | from theano import tensor 13 | from dandelion.module import * 14 | from dandelion.activation import * 15 | 16 | import dandelion 17 | dandelion_path = os.path.split(dandelion.__file__)[0] 18 | print('dandelion path = %s\n' % dandelion_path) 19 | 20 | def test_case_0(): 21 | conv1 = Conv2D(in_channels=3, out_channels=3, stride=(2, 2)) 22 | bn1 = BatchNorm(input_shape=(None, 3, None, None)) 23 | conv2 = Conv2D(in_channels=3, out_channels=5) 24 | conv3 = Conv2D(in_channels=5, out_channels=8) 25 | model = Sequential([conv1, bn1, conv2, conv3], activation=relu, name='seq') 26 | model_weights = model.get_weights() 27 | for value, w_name in model_weights: 28 | print('name = %s, shape=' % w_name, value.shape) 29 | 30 | x = tensor.ftensor4('x') 31 | y = model.forward(x) 32 | print('compiling fn...') 33 | fn = theano.function([x], y, no_default_updates=False) 34 | print('run fn...') 35 | input = np.random.rand(4, 3, 32, 33).astype(np.float32) 36 | output = fn(input) 37 | print(output) 38 | print(output.shape) 39 | 40 | if __name__ == '__main__': 41 | 42 | test_case_0() 43 | 44 | print('Test passed') 45 | 46 | 47 | 48 | -------------------------------------------------------------------------------- /test/test_LSTM2D.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | ''' 3 | Unit test for LSTM2D class 
4 | Created : 11, 6, 2018 5 | Revised : 11, 6, 2018 6 | All rights reserved 7 | ''' 8 | # ------------------------------------------------------------------------------------------------ 9 | __author__ = 'dawei.leng' 10 | 11 | import os 12 | os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'" 13 | 14 | import theano 15 | import theano.tensor as tensor 16 | from dandelion.module import LSTM2D 17 | from dandelion.objective import * 18 | from dandelion.update import * 19 | import numpy as np, time 20 | 21 | import dandelion 22 | dandelion_path = os.path.split(dandelion.__file__)[0] 23 | print('dandelion path = %s\n' % dandelion_path) 24 | 25 | def test_case_0(): 26 | # input_dim, hidden_dim, B, H, W = 5, 4, 2, 5, 7 27 | input_dim, hidden_dim, B, H, W = 8, 8, 2, 32, 32 28 | input = tensor.ftensor4() 29 | gt = tensor.ftensor4() 30 | model = LSTM2D(input_dims=[input_dim],hidden_dim=hidden_dim, peephole=True) 31 | output = model.forward(input, backward=False) 32 | loss = aggregate(squared_error(output, gt)) 33 | params = model.collect_params() 34 | updates = sgd(loss, params, 1e-4) 35 | updates.update(model.collect_self_updates()) 36 | print('compiling function ...') 37 | f = theano.function([input, gt], [output, loss], updates=updates, no_default_updates=False) 38 | 39 | print('run function ...') 40 | X = np.random.rand(H, W, B, input_dim).astype('float32') 41 | GT = np.random.rand(H, W, B, hidden_dim).astype('float32') 42 | time0 = time.time() 43 | Y, loss = f(X, GT) 44 | time_used = time.time() - time0 45 | print('time_used = ', time_used) 46 | print('Y=', Y) 47 | print('Y.shape=', Y.shape) 48 | print('loss=', loss) 49 | 50 | if __name__ == '__main__': 51 | 52 | test_case_0() 53 | 54 | print('Test passed') -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | import os 2 | import re 3 | from setuptools import 
find_packages 4 | from setuptools import setup 5 | import io, shutil 6 | from distutils.extension import Extension 7 | 8 | cmdclass = {} 9 | ext_modules = [] 10 | 11 | here = os.path.abspath(os.path.dirname(__file__)) 12 | 13 | with open(os.path.join(here, 'dandelion', '__init__.py'), 'r') as f: 14 | init_py = f.read() 15 | version = re.search('__version__ = "(.*)"', init_py).groups()[0] 16 | 17 | with io.open(os.path.join(here, 'README.md'), 'r', encoding='utf-8') as f: 18 | README = f.read() 19 | 20 | # install_requires = [ 21 | # 'numpy', 22 | # 'Theano', 23 | # ] 24 | # 25 | # tests_require = [ 26 | # 'mock', 27 | # 'pytest', 28 | # 'pytest-cov', 29 | # ] 30 | setup( 31 | name="Dandelion", 32 | version=version, 33 | description="A light weight deep learning framework", 34 | long_description=README, 35 | long_description_content_type="text/markdown", 36 | classifiers=[ 37 | "Intended Audience :: Developers", 38 | "Intended Audience :: Science/Research", 39 | "License :: OSI Approved :: Mozilla Public License 2.0 (MPL 2.0)", 40 | "Programming Language :: Python :: 3", 41 | "Programming Language :: Python :: 3.5", 42 | "Programming Language :: Python :: 3.6", 43 | "Topic :: Scientific/Engineering :: Artificial Intelligence", 44 | ], 45 | keywords="DL framework, Theano", 46 | author="David Leon (Dawei Leng)", 47 | author_email="daweileng@outlook.com", 48 | license="Mozilla Public License v2.0", 49 | url="https://github.com/david-leon/Dandelion", 50 | packages=find_packages(), 51 | include_package_data=True, 52 | zip_safe=False, 53 | # install_requires=install_requires, 54 | # extras_require={ 55 | # 'testing': tests_require, 56 | # }, 57 | cmdclass=cmdclass, 58 | ext_modules=ext_modules, 59 | ) 60 | -------------------------------------------------------------------------------- /test/test_AlternateLSTM2D.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | ''' 3 | Unit test for AlternateLSTM2D model 4 | Created 
: 11, 13, 2018 5 | Revised : 11, 13, 2018 6 | All rights reserved 7 | ''' 8 | # ------------------------------------------------------------------------------------------------ 9 | __author__ = 'dawei.leng' 10 | 11 | import os 12 | os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'" 13 | 14 | import theano 15 | import theano.tensor as tensor 16 | from dandelion.model import Alternate_2D_LSTM 17 | from dandelion.objective import * 18 | from dandelion.update import * 19 | import numpy as np, time 20 | 21 | import dandelion 22 | dandelion_path = os.path.split(dandelion.__file__)[0] 23 | print('dandelion path = %s\n' % dandelion_path) 24 | 25 | def test_case_0(): 26 | # input_dim, hidden_dim, B, H, W = 5, 4, 2, 5, 7 27 | input_dim, hidden_dim, B, H, W = 8, 8, 2, 32, 32 28 | input = tensor.ftensor4() 29 | gt = tensor.ftensor4() 30 | model = Alternate_2D_LSTM(input_dims=[input_dim],hidden_dim=hidden_dim, peephole=True, mode=2) 31 | output = model.forward(input, backward=False) 32 | loss = aggregate(squared_error(output, gt)) 33 | params = model.collect_params() 34 | updates = sgd(loss, params, 1e-4) 35 | updates.update(model.collect_self_updates()) 36 | print('compiling function ...') 37 | f = theano.function([input, gt], [output, loss], updates=updates, no_default_updates=False) 38 | 39 | print('run function ...') 40 | X = np.random.rand(H, W, B, input_dim).astype('float32') 41 | GT = np.random.rand(H, W, B, hidden_dim).astype('float32') 42 | time0 = time.time() 43 | Y, loss = f(X, GT) 44 | time_used = time.time() - time0 45 | print('time_used = ', time_used) 46 | print('Y=', Y) 47 | print('Y.shape=', Y.shape) 48 | print('loss=', loss) 49 | 50 | if __name__ == '__main__': 51 | 52 | test_case_0() 53 | 54 | print('Test passed') -------------------------------------------------------------------------------- /test/test_CTPN.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | ''' 3 | Test 
for CTPN implementation 4 | Created : 7, 27, 2018 5 | Revised : 7, 27, 2018 6 | All rights reserved 7 | ''' 8 | __author__ = 'dawei.leng' 9 | 10 | import os 11 | os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'" 12 | # os.environ['THEANO_FLAGS'] = "floatX=float32, mode=DEBUG_MODE, warn_float64='raise', exception_verbosity=high" 13 | 14 | import theano 15 | from theano import tensor 16 | from dandelion.module import * 17 | from dandelion.activation import * 18 | from dandelion.model.ctpn import model_CTPN 19 | from dandelion.objective import * 20 | 21 | import dandelion 22 | dandelion_path = os.path.split(dandelion.__file__)[0] 23 | print('dandelion path = %s\n' % dandelion_path) 24 | 25 | def test_case_0(): 26 | model = model_CTPN(k=10, do_side_refinement_regress=False, 27 | batchnorm_mode=1, channel=3, im_height=None, im_width=None, 28 | kernel_size=3, border_mode=(1, 1), VGG_flip_filters=False, 29 | im2col=None) 30 | x = tensor.ftensor4('x') 31 | y1 = tensor.ftensor5('y1') 32 | y2 = tensor.ftensor4('y2') 33 | class_score, bboxs = model.forward(x) 34 | #--- check back-prop ---# 35 | loss = aggregate(squared_error(y1, class_score)) + aggregate(squared_error(y2, bboxs)) 36 | grad = theano.grad(loss, model.collect_params()) 37 | print('back-prop test pass') 38 | 39 | 40 | print('compiling fn...') 41 | fn = theano.function([x], [class_score, bboxs], no_default_updates=False, on_unused_input='ignore') 42 | print('run fn...') 43 | input = np.random.rand(4, 3, 256, 256).astype(np.float32) 44 | class_score, bboxs = fn(input) 45 | assert class_score.shape == (4, 16, 16, 10, 2), 'class_score shape not correct' 46 | assert bboxs.shape == (4, 16, 16, 10, 2), 'bboxs shape not correct' 47 | 48 | # print(class_score.shape) 49 | # print(bboxs.shape) 50 | 51 | 52 | 53 | if __name__ == '__main__': 54 | 55 | test_case_0() 56 | 57 | print('Test passed') 58 | 59 | 60 | 61 | 
-------------------------------------------------------------------------------- /test/test_GroupNorm.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | # Unit test for GroupNorm module 3 | # Created : 11, 22, 2018 4 | # Revised : 11, 22, 2018 5 | # All rights reserved 6 | #------------------------------------------------------------------------------------------------ 7 | __author__ = 'dawei.leng' 8 | import os, sys 9 | os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'" 10 | 11 | import theano 12 | from theano import tensor 13 | from dandelion.module import * 14 | from dandelion.activation import * 15 | from dandelion.objective import * 16 | from dandelion.update import * 17 | import dandelion 18 | dandelion_path = os.path.split(dandelion.__file__)[0] 19 | print('dandelion path = %s\n' % dandelion_path) 20 | 21 | def fix_update_bcasts(updates): 22 | for param, update in updates.items(): 23 | if param.broadcastable != update.broadcastable: 24 | updates[param] = tensor.patternbroadcast(update, param.broadcastable) 25 | return updates 26 | 27 | def test_case_0(): 28 | B, C, H, W = 4, 128, 256, 256 29 | x = tensor.ftensor4('x') 30 | z = tensor.ftensor4('gt') 31 | conv0 = Conv2D(in_channels=2*C, out_channels=C, kernel_size=(3,3), pad='same') 32 | # gn0 = GroupNorm(channel_num=C, group_num=32, beta=None, gamma=None) 33 | gn0 = GroupNorm(channel_num=C, group_num=32) 34 | model = Sequential([conv0, gn0], activation=relu) 35 | y = model.forward(x) 36 | loss = aggregate(squared_error(y, z)) 37 | updates = adadelta(loss, model.collect_params()) 38 | updates.update(model.collect_self_updates()) 39 | # f = theano.function([x], y, no_default_updates=False, updates=fix_update_bcasts(bn.collect_self_updates())) 40 | f = theano.function([x, z], [y, loss], no_default_updates=False, updates=updates) 41 | x = np.random.rand(B, 2*C, H, W).astype('float32') 42 | z = np.random.rand(B, C, H, 
W).astype('float32') 43 | y, loss = f(x, z) 44 | assert y.shape == (B, C, H, W) 45 | print('test_case_0 passed') 46 | 47 | if __name__ == '__main__': 48 | 49 | # test_case_0() 50 | 51 | test_case_0() 52 | 53 | print('Test passed') -------------------------------------------------------------------------------- /docs/dandelion_ext_CV.md: -------------------------------------------------------------------------------- 1 | **Image Processing and Computer Vision Toolkits** 2 | _______________________________________________________________________ 3 | 4 | ## imread 5 | Read an image file and return it as a numpy `ndarray`, using Pillow as backend. Supports the EXIF rotation specification. 6 | ```python 7 | imread(f, flatten=False, dtype='float32') 8 | ``` 9 | * **f**: str or file object. The file name or file object to be read from. 10 | * **flatten**: bool. If `True`, flattens the color channels into a single gray-scale channel. 11 | * **dtype**: returned data type 12 | 13 | _______________________________________________________________________ 14 | ## imsave 15 | Save an image `ndarray` into file, using Pillow as backend 16 | ```python 17 | imsave(f, I, **params) 18 | ``` 19 | * **f**: str or file object. The file name or file object to be written into. 20 | * **I**: Image `ndarray`. Note that for `jpeg` format, `I` should be of `uint8` type. 21 | * **params**: other parameters passed directly to Pillow's `Image.save()` 22 | 23 | _______________________________________________________________________ 24 | ## imresize 25 | Resize image, using scipy as backend 26 | ```python 27 | imresize(I, size, interp='bilinear', mode=None) 28 | ``` 29 | * **I**: Image `ndarray` 30 | * **size**: target size 31 | * **interp**: Interpolation to use for resizing, {'nearest', 'lanczos', 'bilinear', 'bicubic' or 'cubic'}. 32 | * **mode**: The PIL image mode ('P', 'L', etc.) to convert `I` to before resizing, optional. 
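To illustrate what the `interp` choice means, the sketch below implements nearest-neighbour resizing in plain Python on a nested list. It is not Dandelion's implementation (`imresize` delegates to scipy, as stated above); the function name `resize_nearest` and the floor-based index mapping are assumptions made for illustration, and real libraries may round source indices differently.

```python
def resize_nearest(image, size):
    """Resize a 2-D list of rows to (new_h, new_w) by nearest-neighbour sampling.

    Hypothetical helper: each output pixel copies the source pixel whose
    index is the output index scaled back to the source grid (floored).
    """
    old_h, old_w = len(image), len(image[0])
    new_h, new_w = size
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]

small = [[1, 2],
         [3, 4]]
big = resize_nearest(small, (4, 4))
for row in big:
    print(row)
```

Nearest-neighbour sampling only copies existing values, which is why it suits label maps and masks, whereas 'bilinear' or 'bicubic' blend neighbouring pixels and suit natural images.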
33 | 34 | _______________________________________________________________________ 35 | ## imrotate 36 | Rotate image, using opencv as backend 37 | ```python 38 | imrotate(I, angle, padvalue=0.0, interpolation='linear', target_size=None, border_mode='reflect_101') 39 | ``` 40 | * **I**: Image `ndarray` 41 | * **angle**: in degrees, positive for counter-clockwise 42 | * **interpolation**: image interpolation method, {'linear'|'nearest'|'cubic'|'LANCZOS4'|'area'}, refer to opencv:INTER_* constants for details 43 | * **border_mode**: image boundary handling method, {'reflect_101'|'reflect'|'wrap'|'constant'|'replicate'}, refer to opencv:BORDER_* constants for details 44 | * **padvalue**: used when `border_mode` = 'constant' 45 | * **target_size**: target size of output image, optional. -------------------------------------------------------------------------------- /test/test_pooling.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | # Unit test for pooling functions 3 | # Created : 2, 27, 2018 4 | # Revised : 2, 27, 2018 5 | # All rights reserved 6 | #------------------------------------------------------------------------------------------------ 7 | __author__ = 'dawei.leng' 8 | import os, sys 9 | os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'" 10 | 11 | import theano 12 | from theano import tensor 13 | from dandelion.module import * 14 | from dandelion.functional import * 15 | 16 | import dandelion 17 | dandelion_path = os.path.split(dandelion.__file__)[0] 18 | print('dandelion path = %s\n' % dandelion_path) 19 | 20 | def pool_1d_Lasagne(x, axis=1, mode='max'): 21 | """ 22 | Lasagne requires x to be 3D, and pooling is done on the last dimension 23 | :param x: 24 | :param axis: 25 | :return: 26 | """ 27 | input_4d = tensor.shape_padright(x, 1) 28 | if axis == 1: 29 | input_4d = input_4d.dimshuffle((0, 2, 1, 3)) 30 | pooled = pool_2d(input_4d, 31 | ws=(2, 1), 32 | stride=(2, 1), 33 
| ignore_border=True, 34 | pad=(0, 0), 35 | mode=mode, 36 | ) 37 | if axis == 1: # [DV] add support for 'axis' para 38 | pooled = pooled.dimshuffle((0, 2, 1, 3)) 39 | return pooled[:, :, :, 0] 40 | 41 | def test_case_0(): 42 | import numpy as np 43 | 44 | x_3d = tensor.ftensor3('x') 45 | y_3d_D = pool_1d(x_3d, axis=1) 46 | 47 | y_3d_L = pool_1d_Lasagne(x_3d, axis=1) 48 | 49 | fn_D = theano.function([x_3d], y_3d_D, no_default_updates=True, on_unused_input='ignore') 50 | fn_L = theano.function([x_3d], y_3d_L, no_default_updates=True, on_unused_input='ignore') 51 | 52 | 53 | for i in range(20): 54 | x = np.random.rand(7, 117, 27).astype(np.float32) 55 | y_D = fn_D(x) 56 | y_L = fn_L(x) 57 | diff = np.max(np.abs(y_D - y_L)) 58 | print('i=%d, diff=%0.6f' % (i, diff)) 59 | if diff>1e-4: 60 | print('y_D=\n', y_D) 61 | print('y_L=\n', y_L) 62 | raise ValueError('diff is too big') 63 | 64 | if __name__ == '__main__': 65 | 66 | test_case_0() 67 | 68 | print('Test passed') 69 | 70 | 71 | 72 | -------------------------------------------------------------------------------- /test/test_softmax_ndim.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | # Test for softmax/log_softmax for ndim > 2. 3 | # This test was added because a Theano-compiled function would occasionally raise a shape mismatch error for the reshape op inside softmax/log_softmax when ndim > 2. 
4 | # Created : 7, 23, 2018 5 | # Revised : 7, 23, 2018 6 | # All rights reserved 7 | #------------------------------------------------------------------------------------------------ 8 | __author__ = 'dawei.leng' 9 | import os, numpy as np 10 | os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'" 11 | 12 | import theano 13 | from theano import tensor 14 | from dandelion.module import * 15 | from dandelion.activation import softmax, log_softmax 16 | from dandelion.objective import categorical_crossentropy, categorical_crossentropy_log 17 | from dandelion.util import one_hot 18 | 19 | import dandelion 20 | dandelion_path = os.path.split(dandelion.__file__)[0] 21 | print('dandelion path = %s\n' % dandelion_path) 22 | 23 | # ndim = 2 24 | def test_case_0(): 25 | x = tensor.fmatrix() # (B, N) 26 | y = softmax(x) # (B, N) 27 | f = theano.function([x] , y) 28 | 29 | for i in range(10): 30 | B = 1 31 | N = 5 32 | x = np.random.rand(B, N).astype(np.float32) 33 | y = f(x) 34 | print(y.shape) 35 | 36 | # ndim = 3 37 | def test_case_1(): 38 | x = tensor.ftensor3() # (B, T, N) 39 | y = softmax(x) # (B, T, N) 40 | f = theano.function([x] , y) 41 | 42 | for i in range(20): 43 | B = np.random.randint(1, 10) 44 | T = np.random.randint(1, 10) 45 | N = np.random.randint(2, 8) 46 | x = np.random.rand(B, T, N).astype(np.float32) 47 | y = f(x) 48 | print(y.shape) 49 | 50 | # ndim = 4 51 | def test_case_2(): 52 | x = tensor.ftensor4() # (B, H, W, N) 53 | y = softmax(x) # (B, H, W, N) 54 | f = theano.function([x], y) 55 | 56 | for i in range(20): 57 | B = np.random.randint(1, 10) 58 | H = np.random.randint(1, 100) 59 | W = np.random.randint(1, 100) 60 | N = np.random.randint(2, 8) 61 | x = np.random.rand(B, H, W, N).astype(np.float32) 62 | y = f(x) 63 | print(y.shape) 64 | 65 | if __name__ == '__main__': 66 | 67 | test_case_0() 68 | 69 | test_case_1() 70 | 71 | test_case_2() 72 | 73 | print('Test passed') 74 | 75 | 76 | 77 | 
-------------------------------------------------------------------------------- /docs/howtos.md: -------------------------------------------------------------------------------- 1 | # Tutorial III: Howtos 2 | 3 | ### 1) How to freeze a module during training? 4 | To *freeze* a module during training, use the `include` and `exclude` arguments of the module's `.collect_params()` and `.collect_self_updates()` methods. 5 | #### Example 6 | ```python 7 | class Foo(Module): 8 | def __init__(self): 9 | self.cnn0 = Conv2D(...) 10 | self.cnn1 = Conv2D(...) 11 | self.cnn2 = Conv2D(...) 12 | .... 13 | 14 | # Now we will freeze cnn0 and cnn1 submodules during training 15 | model = Foo() 16 | loss = ... 17 | params = model.collect_params(exclude=['cnn0', 'cnn1']) 18 | updates = optimizer(loss, params) 19 | updates.update(model.collect_self_updates(exclude=['cnn0', 'cnn1'])) 20 | train_fn = theano.function([...], [...], updates=updates, no_default_updates=False) 21 | ``` 22 | 23 | ### 2) How to initialize a partially modified model with previously trained weights? 24 | A frequently encountered scenario in research is re-using the trained weights of a previous model to initialize a new, usually partially modified, model. The most convenient way is to use `Module`'s `set_weights_by_name()` method with the `unmatched` argument set to `warn` or `ignore`. This assumes that the names of the variables to be initialized are unchanged; otherwise, you can pass the corresponding *weight->variable* mapping via the `name_map` argument or, as the most primitive option, call the tensors' `get_value()` and `set_value()` methods explicitly. 25 | #### Example 26 | ```python 27 | from dandelion.util import gpickle 28 | old_model_file = ... 29 | old_module_weights, old_userdata = gpickle.load(old_model_file) 30 | new_model = ... 31 | new_model.set_weights_by_name(old_module_weights, unmatched='warn') 32 | ``` 33 | 34 | ### 3) How to add random noise to a tensor? 
35 | Just use Theano's `MRG_RandomStreams` module. 36 | #### Example 37 | ```python 38 | from theano.sandbox.rng_mrg import MRG_RandomStreams as RandomStreams 39 | srng = RandomStreams(np.random.randint(1, 2147462579)) 40 | .... 41 | y = x + srng.normal(x.shape, avg=0.0, std=0.1) # add Gaussian noise to x 42 | ``` 43 | Keep in mind that whenever you use Theano's `MRG_RandomStreams` module, you need to set `no_default_updates=False` when compiling functions. 44 | 45 | ### 4) How to do model-parallel training? 46 | According to [issue 6655](https://github.com/Theano/Theano/issues/6655), model-parallel multi-GPU support of Theano will never be finished, so model-parallel training is not possible with Theano and, consequently, not with Dandelion either. 47 | For data-parallel training, refer to [platoon](https://github.com/mila-udem/platoon) for a possible solution. We may implement our own multi-GPU data-parallel training scheme later; stay tuned. -------------------------------------------------------------------------------- /test/test_Dense.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | # Unit test for Dense class 3 | # Created : 1, 30, 2018 4 | # Revised : 1, 30, 2018 5 | # All rights reserved 6 | #------------------------------------------------------------------------------------------------ 7 | __author__ = 'dawei.leng' 8 | import os, sys 9 | os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'" 10 | 11 | import theano 12 | from theano import tensor 13 | from dandelion.module import * 14 | from dandelion.activation import * 15 | from lasagne.layers import InputLayer, DenseLayer, get_output 16 | import lasagne.nonlinearities as LACT 17 | 18 | import dandelion 19 | dandelion_path = os.path.split(dandelion.__file__)[0] 20 | print('dandelion path = %s\n' % dandelion_path) 21 | 22 | class build_model_D(Module): 23 | def __init__(self, in_dim=3, out_dim=3): 24 | super().__init__() 25 
| self.in_dim = in_dim 26 | self.out_dim = out_dim 27 | self.dense = Dense(input_dims=self.in_dim, output_dim=self.out_dim) 28 | self.predict = self.forward 29 | 30 | def forward(self, x): 31 | x = self.dense.forward(x) 32 | x = relu(x) 33 | return x 34 | 35 | def build_model_L(in_dim=3, out_dim=3): 36 | input_var = tensor.fmatrix('x') 37 | input0 = InputLayer(shape=(None, in_dim), input_var=input_var, name='input0') 38 | dense0 = DenseLayer(input0, num_units=out_dim, nonlinearity=LACT.rectify, name='dense0') 39 | return dense0 40 | 41 | def test_case_0(in_dim=1, out_dim=1): 42 | import numpy as np 43 | from lasagne_ext.utils import get_layer_by_name 44 | 45 | 46 | model_D = build_model_D(in_dim=in_dim, out_dim=out_dim) 47 | model_L = build_model_L(in_dim=in_dim, out_dim=out_dim) 48 | 49 | W = np.random.rand(in_dim, out_dim).astype(np.float32) 50 | b = np.random.rand(out_dim).astype(np.float32) 51 | model_D.dense.W.set_value(W) 52 | model_D.dense.b.set_value(b) 53 | get_layer_by_name(model_L, 'dense0').W.set_value(W) 54 | get_layer_by_name(model_L, 'dense0').b.set_value(b) 55 | 56 | X = get_layer_by_name(model_L, 'input0').input_var 57 | y_D = model_D.forward(X) 58 | y_L = get_output(model_L) 59 | 60 | fn_D = theano.function([X], y_D, no_default_updates=True) 61 | fn_L = theano.function([X], y_L, no_default_updates=True) 62 | 63 | for i in range(20): 64 | x = np.random.rand(16, in_dim).astype(np.float32) 65 | y_D = fn_D(x) 66 | y_L = fn_L(x) 67 | diff = np.sum(np.abs(y_D - y_L)) 68 | print('i=%d, diff=%0.6f' % (i, diff)) 69 | if diff>1e-4: 70 | raise ValueError('diff is too big') 71 | 72 | if __name__ == '__main__': 73 | 74 | test_case_0(3, 2) 75 | 76 | print('Test passed') 77 | 78 | 79 | 80 | -------------------------------------------------------------------------------- /test/test_upsample_2d.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | # Test for upsample_2d 3 | # Created : 7, 5, 2018 4 | # Revised 
: 7, 5, 2018 5 | # All rights reserved 6 | #------------------------------------------------------------------------------------------------ 7 | __author__ = 'dawei.leng' 8 | import os, sys 9 | os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'" 10 | 11 | import theano 12 | from theano import tensor 13 | from dandelion.module import * 14 | from dandelion.functional import upsample_2d, upsample_2d_bilinear 15 | from lasagne.layers import InputLayer, get_output, Upscale2DLayer 16 | import dandelion 17 | dandelion_path = os.path.split(dandelion.__file__)[0] 18 | print('dandelion path = %s\n' % dandelion_path) 19 | 20 | class build_model_D(Module): 21 | def __init__(self, ratio=[2, 3], mode='repeat'): 22 | super().__init__() 23 | self.ratio = ratio 24 | self.mode = mode 25 | self.predict = self.forward 26 | 27 | def forward(self, x): 28 | """ 29 | 30 | :param x: (B, C, H, W) 31 | :return: 32 | """ 33 | x = upsample_2d(x, ratio=self.ratio, mode=self.mode) 34 | # x = relu(x) 35 | return x 36 | 37 | def build_model_L(ratio=[2,3], mode='repeat'): 38 | input_var = tensor.ftensor4('x') # (B, C, H, W) 39 | input0 = InputLayer(shape=(None, None, None, None), input_var=input_var, name='input0') 40 | x = Upscale2DLayer(input0, scale_factor=ratio, mode=mode) 41 | return x 42 | 43 | def test_case_0(): 44 | import numpy as np 45 | from lasagne_ext.utils import get_layer_by_name 46 | 47 | ratio = [1, 2] 48 | mode = 'dilate' 49 | 50 | model_D = build_model_D(ratio=ratio, mode=mode) 51 | model_L = build_model_L(ratio=ratio, mode=mode) 52 | 53 | 54 | X = get_layer_by_name(model_L, 'input0').input_var 55 | y_D = model_D.forward(X) 56 | y_L = get_output(model_L) 57 | 58 | fn_D = theano.function([X], y_D, no_default_updates=True, on_unused_input='ignore') 59 | fn_L = theano.function([X], y_L, no_default_updates=True, on_unused_input='ignore') 60 | 61 | for i in range(20): 62 | B = np.random.randint(low=1, high=16) 63 | C = np.random.randint(low=1, high=32) 
64 | H = np.random.randint(low=5, high=256) 65 | W = np.random.randint(low=5, high=255) 66 | x = np.random.rand(B, C, H, W).astype(np.float32) - 0.5 67 | y_D = fn_D(x) 68 | y_L = fn_L(x) 69 | # print(y_D) 70 | diff = np.max(np.abs(y_D - y_L)) 71 | print('i=%d, diff=%0.6f' % (i, diff)) 72 | if diff>1e-4: 73 | print('y_D=\n', y_D) 74 | print('y_L=\n', y_L) 75 | raise ValueError('diff is too big') 76 | 77 | if __name__ == '__main__': 78 | 79 | test_case_0() 80 | 81 | print('Test passed') 82 | 83 | 84 | 85 | -------------------------------------------------------------------------------- /docs/index.md: -------------------------------------------------------------------------------- 1 | # Dandelion 2 | A lightweight deep learning framework built on top of Theano, offering a better balance between flexibility and abstraction. 3 | 4 | ## Targeted Users 5 | Researchers who need both flexibility and convenience to experiment with all kinds of *nonstandard* network structures, along with the stability of Theano. 6 | 7 | ## Why Another DL Framework 8 | * The main reason is the lack of flexibility in existing DL frameworks such as Keras, Lasagne, Blocks, etc. 9 | * By **“flexibility”**, we mean how easy it is to modify or extend the framework. 10 | * The well-known DL framework Keras is designed to be beginner-friendly, at the cost of being quite hard to modify. 11 | * Compared to Keras, the less well-known framework Lasagne provides more flexibility. Writing your own layer with Lasagne is easy for small neural networks; for complex networks, however, it still takes a lot of manual work because, like other existing frameworks, Lasagne operates on an abstract ‘Layer’ class instead of raw tensor variables. 12 | 13 | ## Featuring 14 | * **Aiming to offer better balance between flexibility and abstraction.** 15 | * Easy to use and extend, support for any neural network structure. 16 | * Loose coupling, each part of the framework can be modified independently. 
17 | * **More like a handy library of deep learning modules.** 18 | * Common modules such as CNN, LSTM, GRU, Dense, Dropout, Batch Normalization, and common optimization methods such as SGD, Adam, Adadelta, RMSprop are ready out-of-the-box. 19 | * **Plug & play, operating directly on Theano tensors, no upper abstraction applied.** 20 | * Unlike earlier frameworks such as Keras and Lasagne, Dandelion operates directly on tensors instead of layer abstractions, making it quite easy to plug in third-party deep learning modules (e.g. layers defined by Keras/Lasagne), or vice versa. 21 | 22 | ## Project Layout 23 | Python Module | Explanation 24 | ----------------- | ---------------- 25 | module | all neural network module definitions 26 | functional | operations on tensors with no parameters to be learned 27 | initialization | initialization methods for neural network modules 28 | activation | definition of all activation functions 29 | objective | definition of all loss objectives 30 | update | definition of all optimizers 31 | util | utility functions 32 | model | model implementations out-of-the-box 33 | ext | extensions 34 | 35 | ## Credits 36 | The design of Dandelion draws heavily on [Lasagne](https://github.com/Lasagne/Lasagne) and [PyTorch](http://pytorch.org/), both my favorite DL libraries. 37 | 38 | ## Special Thanks 39 | To **Radomir Dopieralski**, who transferred the `dandelion` project name on PyPI to us. You can now install the package simply with `pip install dandelion`. 
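## Plug & Play Sketch
As a rough illustration of the plug & play point under *Featuring*: because each module takes a tensor and returns a tensor, any function with the same contract can be dropped into the chain without a wrapper class. The sketch below is a toy stand-in in pure Python, with lists of floats playing the role of Theano tensors. `ToyDense`, `toy_relu` and `third_party_scale` are hypothetical names invented for this illustration; they are not part of Dandelion's API and say nothing about how Dandelion is implemented.

```python
# Toy illustration only: plain Python lists stand in for Theano tensors.
# ToyDense, toy_relu and third_party_scale are hypothetical names, not Dandelion APIs.


class ToyDense:
    """A tiny dense 'module' computing y = x . W + b on plain lists."""

    def __init__(self, W, b):
        self.W = W  # list of rows, shape (in_dim, out_dim)
        self.b = b  # list of length out_dim

    def forward(self, x):
        out_dim = len(self.b)
        return [sum(x[i] * self.W[i][j] for i in range(len(x))) + self.b[j]
                for j in range(out_dim)]


def toy_relu(x):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in x]


def third_party_scale(x, factor=2.0):
    """Stands in for an operation defined outside the framework."""
    return [factor * v for v in x]


dense = ToyDense(W=[[1.0, -1.0], [0.5, 0.5]], b=[0.0, -1.0])

# No layer graph is built: every step takes a "tensor" and returns a "tensor".
x = [2.0, 4.0]
h = dense.forward(x)      # module defined "inside" the framework
h = third_party_scale(h)  # externally defined operation, plugged in as-is
y = toy_relu(h)
print(y)
```

With real Theano tensors the chaining would look the same in spirit: the output of one `forward()` call is an ordinary symbolic variable that any other Theano-based code can consume.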
40 | -------------------------------------------------------------------------------- /test/test_categorical_crossentropy_log.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | # Test for categorical_crossentropy_log 3 | # Created : 7, 3, 2018 4 | # Revised : 7, 3, 2018 5 | # All rights reserved 6 | #------------------------------------------------------------------------------------------------ 7 | __author__ = 'dawei.leng' 8 | import os, numpy as np 9 | os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'" 10 | 11 | import theano 12 | from theano import tensor 13 | from dandelion.module import * 14 | from dandelion.activation import softmax, log_softmax 15 | from dandelion.objective import categorical_crossentropy, categorical_crossentropy_log 16 | from dandelion.util import one_hot 17 | 18 | import dandelion 19 | dandelion_path = os.path.split(dandelion.__file__)[0] 20 | print('dandelion path = %s\n' % dandelion_path) 21 | 22 | def test_case_0(): 23 | x = tensor.fmatrix() # (B, N) 24 | # y = tensor.ivector() # (B,) 25 | y = tensor.fmatrix() # (B, N) 26 | 27 | x1 = softmax(x) # (B, N) 28 | r1 = categorical_crossentropy(x1, y, eps=0.0) 29 | 30 | x2 = log_softmax(x) 31 | r2 = categorical_crossentropy_log(x2, y) 32 | 33 | f1 = theano.function([x, y], r1) 34 | f2 = theano.function([x, y], r2) 35 | 36 | for i in range(100): 37 | B, N = np.random.randint(low=1, high=32), np.random.randint(low=2, high=100) 38 | X = np.random.rand(B, N) 39 | Y = np.random.randint(low=0, high=N, size=(B,)) 40 | Y = np.eye(N)[Y] 41 | 42 | X = X.astype('float32') 43 | Y = Y.astype('float32') 44 | 45 | r1 = f1(X, Y) 46 | r2 = f2(X, Y) 47 | dif = np.sum(np.abs(r1 - r2)) 48 | if dif > 1e-7: 49 | print(r1) 50 | print(r2) 51 | raise ValueError('r1 != r2') 52 | 53 | def test_case_1(): 54 | N = np.random.randint(low=2, high=100) 55 | x = tensor.fmatrix() # (B, N) 56 | y = tensor.ivector() # (B,) 57 | 58 | x1 = 
softmax(x) # (B, N) 59 | r1 = categorical_crossentropy(x1, y, eps=0.0) 60 | 61 | x2 = log_softmax(x) 62 | r2 = categorical_crossentropy_log(x2, y, m=N) 63 | 64 | f1 = theano.function([x, y], r1) 65 | f2 = theano.function([x, y], r2) 66 | 67 | for i in range(100): 68 | # B, N = np.random.randint(low=1, high=32), np.random.randint(low=2, high=100) 69 | B = np.random.randint(low=1, high=32) 70 | X = np.random.rand(B, N) 71 | Y = np.random.randint(low=0, high=N, size=(B,)) 72 | 73 | X = X.astype('float32') 74 | Y = Y.astype('int32') 75 | 76 | r1 = f1(X, Y) 77 | # print('r1=', r1) 78 | r2 = f2(X, Y) 79 | # print('r2=', r2) 80 | dif = np.max(np.abs(r1 - r2)) 81 | if dif > 1e-6: 82 | print('r1=', r1) 83 | print('r2=', r2) 84 | raise ValueError('r1 != r2') 85 | 86 | if __name__ == '__main__': 87 | 88 | test_case_0() 89 | 90 | test_case_1() 91 | 92 | print('Test passed') 93 | 94 | 95 | 96 | -------------------------------------------------------------------------------- /dandelion/ext/visual/visual.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | ''' 3 | Model summary and visualization toolkits 4 | Created : 11, 15, 2018 5 | Revised : 11, 16, 2018 add `size_unit` arg for `get_model_summary()` 6 | All rights reserved 7 | ''' 8 | __author__ = 'dawei.leng' 9 | 10 | from collections import OrderedDict 11 | 12 | def get_model_size(model): 13 | """ 14 | Calculate model parameter size 15 | :param model: 16 | :return: total size of model, in bytes 17 | """ 18 | weights = model.get_weights() 19 | model_size = 0 20 | for weight, _ in weights: 21 | model_size += weight.size * weight.itemsize 22 | return model_size 23 | 24 | def get_model_summary(model, size_unit='M'): 25 | """ 26 | Produce model parameter summary 27 | :param model: 28 | :param size_unit: {'M'|'K'|'B'|int} 29 | :return: OrderedDict 30 | """ 31 | if isinstance(size_unit, str): 32 | size_unit = size_unit.upper() 33 | assert size_unit in {'M', 'K', 'B'} 34 | if 
size_unit == 'M': 35 | size_unit = 1048576 36 | elif size_unit == 'K': 37 | size_unit = 1024 38 | else: 39 | size_unit = 1 40 | model_summary = OrderedDict() 41 | model_summary['name'] = model.name 42 | model_size = get_model_size(model) / size_unit 43 | model_summary['size'] = model_size 44 | if len(model.params) > 0: 45 | model_summary['params'] = [] 46 | for param in model.params: 47 | v = param.get_value() 48 | model_summary['params'].append(OrderedDict([('name',param.name), ('shape',str(v.shape)), ('dtype',v.dtype.name), ('size',v.size * v.itemsize/size_unit), ('percent',v.size * v.itemsize/(model_size*size_unit)*100)])) 49 | if len(model.self_updating_variables) > 0: 50 | model_summary['self_updating_variables'] = [] 51 | for param in model.self_updating_variables: 52 | v = param.get_value() 53 | model_summary['self_updating_variables'].append(OrderedDict([('name',param.name), ('shape',str(v.shape)), ('dtype',v.dtype.name), ('size',v.size * v.itemsize/size_unit), ('percent',v.size * v.itemsize/(model_size*size_unit)*100)])) 54 | if len(model.sub_modules) > 0: 55 | sub_module_summaries = [] 56 | for tag, child in (model.sub_modules.items()): 57 | sub_module_summaries.append(get_model_summary(child, size_unit=size_unit)) 58 | sub_module_summaries[-1]['percent'] = sub_module_summaries[-1]['size'] / model_size * 100 59 | model_summary.update({'sub_modules': sub_module_summaries}) 60 | return model_summary 61 | 62 | 63 | if __name__ == '__main__': 64 | import json 65 | from dandelion.model import Alternate_2D_LSTM 66 | input_dim, hidden_dim, B, H, W = 8, 8, 2, 32, 32 67 | model = Alternate_2D_LSTM(input_dims=[input_dim],hidden_dim=hidden_dim, peephole=True, mode=2) 68 | model_summary = get_model_summary(model, size_unit=1) 69 | rpt = json.dumps(model_summary, ensure_ascii=False, indent=2) 70 | print(rpt) 71 | 72 | -------------------------------------------------------------------------------- /debug/shuffleseg_train_debug.py: 
-------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | # GPU DeBUG for ShuffleSeg: model_ShuffleSeg can be trained on CPU, but when trained on GPU, error raised as "images and kernel must have the same stack size" 3 | # BUG fixed with PR: https://github.com/Theano/Theano/pull/6624 4 | # Created : 7, 18, 2018 5 | # Revised : 7, 18, 2018 6 | # All rights reserved 7 | #------------------------------------------------------------------------------------------------ 8 | __author__ = 'dawei.leng' 9 | 10 | import os, sys 11 | # os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise', exception_verbosity=high" 12 | os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'" 13 | sys.setrecursionlimit(40000) 14 | 15 | import theano 16 | from theano import tensor 17 | from dandelion.module import * 18 | from dandelion.activation import * 19 | from dandelion.model.shufflenet import * 20 | from dandelion.objective import * 21 | from dandelion.update import adadelta 22 | 23 | import dandelion 24 | dandelion_path = os.path.split(dandelion.__file__)[0] 25 | print('dandelion path = %s\n' % dandelion_path) 26 | 27 | if __name__ == '__main__': 28 | import argparse 29 | import pygpu 30 | 31 | argparser = argparse.ArgumentParser() 32 | argparser.add_argument('-device', default='cuda5', type=str) 33 | argparser.add_argument('-gn', default=1, type=int) 34 | arg = argparser.parse_args() 35 | 36 | # --- public paras ---# 37 | device = arg.device 38 | group_num = arg.gn 39 | 40 | 41 | #--- (1) setup device ---# 42 | if sys.platform.startswith('linux'): 43 | if device.startswith('cuda'): 44 | print('Using NEW backend for training') 45 | import theano.gpuarray 46 | theano.gpuarray.use(device, force=True) 47 | elif device.startswith('gpu'): 48 | print('Using OLD backend for training') 49 | import theano.sandbox.cuda 50 | theano.sandbox.cuda.use(device, force=True) 51 | else: 52 | print('Using CPU 
for training') 53 | 54 | 55 | 56 | Nclass = 6 57 | in_channels = 1 58 | out_channels = 128 59 | model = model_ShuffleSeg(in_channels=in_channels, Nclass=Nclass, SF_group_num=group_num) 60 | # model = ShuffleUnit(in_channels=in_channels, group_num=4, stride=2) 61 | # model = ShuffleUnit_Stack(in_channels=in_channels, out_channels=out_channels, group_num=group_num, stack_size=4) 62 | print(model.__class__.__name__) 63 | x = tensor.ftensor4('x') 64 | y = model.forward(x) 65 | y_gt = tensor.ftensor4('y') 66 | loss = aggregate(squared_error(y, y_gt), mode='mean') 67 | params = model.collect_params() 68 | updates = adadelta(loss, params) 69 | updates.update(model.collect_self_updates()) 70 | print('compiling train fn') 71 | train_fn = theano.function([x, y_gt], loss, updates=updates, no_default_updates=False) 72 | 73 | print('training...') 74 | for i in range(10): 75 | print('batch %d' % i) 76 | x = np.random.rand(4, in_channels, 256, 256).astype(np.float32) 77 | # y = np.random.rand(4, out_channels, 128, 128).astype(np.float32) 78 | y = np.random.rand(4, Nclass, 256, 256).astype(np.float32) 79 | loss = train_fn(x, y) 80 | print('loss = ', loss) 81 | 82 | -------------------------------------------------------------------------------- /docs/dandelion_update.md: -------------------------------------------------------------------------------- 1 | Dandelion's `update` module is mostly inherited from [Lasagne](https://github.com/Lasagne/Lasagne). We recommend referring to the [`Lasagne.updates` documentation](http://lasagne.readthedocs.io/en/latest/modules/updates.html) for the following optimizers & helper functions: 2 | 3 | * apply_momentum 4 | * momentum 5 | * apply_nesterov_momentum 6 | * nesterov_momentum 7 | * adagrad 8 | * rmsprop 9 | * adamax 10 | * norm_constraint 11 | * total_norm_constraint 12 | 13 | _______________________________________________________________________ 14 | ## sgd 15 | Stochastic gradient descent optimizer. 
16 | 17 | ```python 18 | sgd(loss_or_grads, params, learning_rate=1e-4, clear_nan=False) 19 | ``` 20 | 21 | * **loss_or_grads**: a scalar loss expression, or a list of gradient expressions 22 | * **params**: list of shared variables to generate update expressions for 23 | * **learning_rate**: float or symbolic scalar, learning rate controlling the size of update steps 24 | * **clear_nan**: boolean flag, if `True`, `nan` in gradients will be replaced with 0 25 | 26 | _______________________________________________________________________ 27 | ## adam 28 | Adam optimizer implemented as described in *"Kingma, Diederik, and Jimmy Ba (2014): Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980."* 29 | 30 | ```python 31 | adam(loss_or_grads, params, learning_rate=0.001, beta1=0.9, 32 | beta2=0.999, epsilon=1e-8, clear_nan=False) 33 | ``` 34 | 35 | * **loss_or_grads**: a scalar loss expression, or a list of gradient expressions 36 | * **params**: list of shared variables to generate update expressions for 37 | * **learning_rate**: float or symbolic scalar, learning rate controlling the size of update steps 38 | * **clear_nan**: boolean flag, if `True`, `nan` in gradients will be replaced with 0 39 | * **beta1**: float or symbolic scalar, exponential decay rate for the first moment estimates 40 | * **beta2**: float or symbolic scalar, exponential decay rate for the second moment estimates 41 | * **epsilon**: float or symbolic scalar, constant for numerical stability 42 | 43 | _______________________________________________________________________ 44 | ## adadelta 45 | Adadelta optimizer implemented as described in *"Zeiler, M. D. (2012): ADADELTA: An Adaptive Learning Rate Method. 
arXiv Preprint arXiv:1212.5701."* 46 | 47 | ```python 48 | adadelta(loss_or_grads, params, learning_rate=1.0, 49 | rho=0.95, epsilon=1e-6, clear_nan=False) 50 | ``` 51 | 52 | * **loss_or_grads**: a scalar loss expression, or a list of gradient expressions 53 | * **params**: list of shared variables to generate update expressions for 54 | * **learning_rate**: float or symbolic scalar, learning rate controlling the size of update steps 55 | * **clear_nan**: boolean flag, if `True`, `nan` in gradients will be replaced with 0 56 | * **rho**: float or symbolic scalar, squared gradient moving average decay factor 57 | * **epsilon**: float or symbolic scalar, constant for numerical stability 58 | 59 | `rho` should be between 0 and 1. A value of `rho` close to 1 will decay the moving average slowly and a value close to 0 will decay the moving average fast. 60 | `rho` = 0.95 and `epsilon`=1e-6 are suggested in the paper and reported to work for multiple datasets (MNIST, speech). 61 | In the paper, no learning rate is considered (so `learning_rate`=1.0). Probably best to keep it at this value. `epsilon` is important for the very first update (so the numerator does not become 0). 62 | 63 | 64 | 65 | 66 | -------------------------------------------------------------------------------- /dandelion/model/resnet.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | ''' 3 | Model definition of ResNet 4 | Created : 7, 6, 2018 5 | Revised : 7, 6, 2018 6 | All rights reserved 7 | ''' 8 | # ------------------------------------------------------------------------------------------------ 9 | __author__ = 'dawei.leng' 10 | 11 | import theano.tensor as tensor 12 | from ..module import * 13 | from ..functional import * 14 | from ..activation import * 15 | 16 | 17 | class ResNet_bottleneck(Module): 18 | """ 19 | [ResNet bottleneck block] (https://arxiv.org/abs/1512.03385). 
20 | """ 21 | 22 | def __init__(self, outer_channel=256, inner_channel=64, border_mode='same', batchnorm_mode=1, activation=relu): 23 | """ 24 | 25 | :param outer_channel: channel number of block input 26 | :param inner_channel: channel number inside the block 27 | :param border_mode: 28 | :param batchnorm_mode: {0 | 1 | 2}. 0 means no batch normalization applied; 1 means batch normalization applied to each cnn; 29 | 2 means batch normalization only applied to the last cnn 30 | :param activation: default = relu. Note no activation applied to the last element-wise sum output. 31 | """ 32 | super().__init__() 33 | self.activation = activation 34 | self.batchnorm_mode = batchnorm_mode 35 | self.conv1 = Conv2D(in_channels=outer_channel, out_channels=inner_channel, kernel_size=1, pad=border_mode) 36 | self.conv2 = Conv2D(in_channels=inner_channel, out_channels=inner_channel, kernel_size=3, pad=border_mode) 37 | self.conv3 = Conv2D(in_channels=inner_channel, out_channels=outer_channel, kernel_size=1, pad=border_mode) 38 | if batchnorm_mode == 0: # no batch normalization 39 | pass 40 | elif batchnorm_mode == 1: # batch normalization per convolution 41 | self.bn1 = BatchNorm(input_shape=(None, inner_channel, None, None)) 42 | self.bn2 = BatchNorm(input_shape=(None, inner_channel, None, None)) 43 | self.bn3 = BatchNorm(input_shape=(None, outer_channel, None, None)) 44 | elif batchnorm_mode == 2: # only one batch normalization at the end 45 | self.bn3 = BatchNorm(input_shape=(None, outer_channel, None, None)) 46 | else: 47 | raise ValueError('batchnorm_mode should be {0 | 1 | 2}') 48 | 49 | def forward(self, x): 50 | """ 51 | :param x: (B, C, H, W) 52 | :return: 53 | """ 54 | self.work_mode = 'train' 55 | 56 | x0 = x 57 | x = self.conv1.forward(x) 58 | if self.batchnorm_mode == 1: 59 | x = self.bn1.forward(x) 60 | x = self.activation(x) 61 | x = self.conv2.forward(x) 62 | if self.batchnorm_mode == 1: 63 | x = self.bn2.forward(x) 64 | x = self.activation(x) 65 | x = 
self.conv3.forward(x)
        if self.batchnorm_mode in {1, 2}:
            x = self.bn3.forward(x)
        x = self.activation(x)
        x = x + x0
        return x

    def predict(self, x):
        self.work_mode = 'inference'

        x0 = x
        x = self.conv1.predict(x)
        if self.batchnorm_mode == 1:
            x = self.bn1.predict(x)
        x = self.activation(x)
        x = self.conv2.predict(x)
        if self.batchnorm_mode == 1:
            x = self.bn2.predict(x)
        x = self.activation(x)
        x = self.conv3.predict(x)
        if self.batchnorm_mode in {1, 2}:
            x = self.bn3.predict(x)
        x = self.activation(x)
        x = x + x0
        return x

--------------------------------------------------------------------------------
/test/test_Conv2D.py:
--------------------------------------------------------------------------------
# coding:utf-8
# Test for Conv2D class
# Created : 1, 31, 2018
# Revised : 1, 31, 2018
# All rights reserved
#------------------------------------------------------------------------------------------------
__author__ = 'dawei.leng'
import os, sys
os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'"

import theano
from theano import tensor
from dandelion.module import *
from dandelion.activation import *
from lasagne.layers import InputLayer, Conv2DLayer, get_output
import lasagne.nonlinearities as LACT
import dandelion
dandelion_path = os.path.split(dandelion.__file__)[0]
print('dandelion path = %s\n' % dandelion_path)

class build_model_D(Module):
    def __init__(self, in_channel=3, out_channel=3, kernel_size=(3,3), stride=(1,1), pad='valid', dilation=(1,1), num_groups=1):
        super().__init__()
        self.conv2d = Conv2D(in_channels=in_channel, out_channels=out_channel, kernel_size=kernel_size, stride=stride,
                             pad=pad, dilation=dilation, num_groups=num_groups)
        self.predict = self.forward

    def forward(self, x):
        """

        :param x: (B, C, H, W)
        :return:
        """
        x = self.conv2d.forward(x)
        x = relu(x)
        return x

def build_model_L(in_channel=3, out_channel=3, kernel_size=(3,3), stride=(1,1), pad='valid', dilation=(1,1), num_groups=1):
    input_var = tensor.ftensor4('x')  # (B, C, H, W)
    input0 = InputLayer(shape=(None, in_channel, None, None), input_var=input_var, name='input0')
    conv0 = Conv2DLayer(input0, num_filters=out_channel, filter_size=kernel_size, stride=stride, pad=pad, nonlinearity=LACT.rectify,
                        name='conv0')
    return conv0

def test_case_0():
    import numpy as np
    from lasagne_ext.utils import get_layer_by_name

    in_channel = 1; out_channel = 3; kernel_size = (3, 3); stride = (1, 1); pad = 'valid'; dilation = (1,1); num_groups = 1
    model_D = build_model_D(in_channel=in_channel, out_channel=out_channel, kernel_size=kernel_size, stride=stride,
                            pad=pad, dilation=dilation, num_groups=num_groups)
    model_L = build_model_L(in_channel=in_channel, out_channel=out_channel, kernel_size=kernel_size, stride=stride,
                            pad=pad)

    W = np.random.rand(out_channel, in_channel, kernel_size[0], kernel_size[1]).astype(np.float32)
    b = np.random.rand(out_channel).astype(np.float32)

    model_D.conv2d.W.set_value(W)
    model_D.conv2d.b.set_value(b)

    conv_L = get_layer_by_name(model_L, 'conv0')
    conv_L.W.set_value(W)
    conv_L.b.set_value(b)

    X = get_layer_by_name(model_L, 'input0').input_var
    y_D = model_D.forward(X)
    y_L = get_output(model_L)

    fn_D = theano.function([X], y_D, no_default_updates=True, on_unused_input='ignore')
    fn_L = theano.function([X], y_L, no_default_updates=True, on_unused_input='ignore')

    for i in range(20):
        x = np.random.rand(3, in_channel, 32, 32).astype(np.float32) - 0.5
        y_D = fn_D(x)
        y_L = fn_L(x)
        diff = np.max(np.abs(y_D - y_L))
        print('i=%d, diff=%0.6f' % (i, diff))
        if diff>1e-4:
            print('y_D=\n', y_D)
            print('y_L=\n', y_L)
            raise ValueError('diff is too big')


if __name__ == '__main__':

    test_case_0()

    print('Test passed')

--------------------------------------------------------------------------------
/test/test_ConvTransposed2D.py:
--------------------------------------------------------------------------------
# coding:utf-8
# Test for ConvTransposed2D class
# Created : 3, 2, 2018
# Revised : 3, 2, 2018
# All rights reserved
#------------------------------------------------------------------------------------------------
__author__ = 'dawei.leng'
import os, sys
os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'"

import theano
from theano import tensor
from dandelion.module import *
from dandelion.activation import *
from lasagne.layers import InputLayer, TransposedConv2DLayer, get_output
import lasagne.nonlinearities as LACT
import dandelion
dandelion_path = os.path.split(dandelion.__file__)[0]
print('dandelion path = %s\n' % dandelion_path)

class build_model_D(Module):
    def __init__(self, in_channel=3, out_channel=3, kernel_size=(3,3), stride=(1,1), pad='valid', dilation=(1,1), num_groups=1):
        super().__init__()
        self.tconv2d = ConvTransposed2D(in_channels=in_channel, out_channels=out_channel, kernel_size=kernel_size, stride=stride,
                                        pad=pad, dilation=dilation, num_groups=num_groups)
        self.predict = self.forward

    def forward(self, x):
        """

        :param x: (B, C, H, W)
        :return:
        """
        x = self.tconv2d.forward(x)
        # x = relu(x)
        return x

def build_model_L(in_channel=3, out_channel=3, kernel_size=(3,3), stride=(1,1), pad='valid', dilation=(1,1), num_groups=1):
    input_var = tensor.ftensor4('x')  # (B, C, H, W)
    input0 = InputLayer(shape=(None, in_channel, None, None), input_var=input_var, name='input0')
    tconv0 = TransposedConv2DLayer(input0, num_filters=out_channel, filter_size=kernel_size, stride=stride, crop=pad, nonlinearity=LACT.linear,
                                   name='tconv0')
    return tconv0

def test_case_0():
    import numpy as np
    from lasagne_ext.utils import get_layer_by_name

    in_channel = 8; out_channel = 4; kernel_size = (3, 3); stride = (1, 1); pad = 'valid'; dilation = (1,1); num_groups = 2
    model_D = build_model_D(in_channel=in_channel, out_channel=out_channel, kernel_size=kernel_size, stride=stride,
                            pad=pad, dilation=dilation, num_groups=num_groups)
    model_L = build_model_L(in_channel=in_channel, out_channel=out_channel, kernel_size=kernel_size, stride=stride,
                            pad=pad)

    W = np.random.rand(in_channel, out_channel//num_groups, kernel_size[0], kernel_size[1]).astype(np.float32)
    b = np.random.rand(out_channel).astype(np.float32)

    model_D.tconv2d.W.set_value(W)
    model_D.tconv2d.b.set_value(b)

    conv_L = get_layer_by_name(model_L, 'tconv0')
    conv_L.W.set_value(W)
    conv_L.b.set_value(b)

    X = get_layer_by_name(model_L, 'input0').input_var
    y_D = model_D.forward(X)
    y_L = get_output(model_L)

    fn_D = theano.function([X], y_D, no_default_updates=True, on_unused_input='ignore')
    fn_L = theano.function([X], y_L, no_default_updates=True, on_unused_input='ignore')

    for i in range(20):
        x = np.random.rand(8, in_channel, 33, 32).astype(np.float32) - 0.5
        y_D = fn_D(x)
        # NOTE: the Lasagne reference call is disabled, so y_L is just y_D and
        # the check below only verifies that the Dandelion graph runs.
        # y_L = fn_L(x)
        y_L = y_D
        diff = np.max(np.abs(y_D - y_L))
        print('i=%d, diff=%0.6f' % (i, diff))
        if diff>1e-4:
            print('y_D=\n', y_D)
            print('y_L=\n', y_L)
            raise ValueError('diff is too big')

if __name__ == '__main__':

    test_case_0()

    print('Test passed')

--------------------------------------------------------------------------------
/test/test_GRU.py:
--------------------------------------------------------------------------------
# coding:utf-8
# Unit test for GRU class
# Created : 1, 31, 2018
# Revised : 1, 31, 2018
# All rights reserved
#------------------------------------------------------------------------------------------------
__author__ = 'dawei.leng'
import os, sys
os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'"

import theano
from theano import tensor
from dandelion.module import *
from dandelion.activation import *
from lasagne.layers import InputLayer, GRULayer, get_output
import lasagne.nonlinearities as LACT

import dandelion
dandelion_path = os.path.split(dandelion.__file__)[0]
print('dandelion path = %s\n' % dandelion_path)

class build_model_D(Module):
    def __init__(self, in_dim=3, out_dim=3):
        super().__init__()
        self.in_dim = in_dim
        self.out_dim = out_dim
        self.gru = GRU(input_dims=self.in_dim, hidden_dim=self.out_dim, learn_ini=True)
        self.predict = self.forward

    def forward(self, x):
        """

        :param x: (B, T, D)
        :return:
        """
        x = x.dimshuffle((1, 0, 2))  # ->(T, B, D)
        x = self.gru.forward(x, backward=False, only_return_final=False)
        x = x.dimshuffle((1, 0, 2))  # ->(B, T, D)
        # x = tanh(x)
        return x

def build_model_L(in_dim=3, out_dim=3):
    input_var = tensor.ftensor3('x')  # (B, T, D)
    input0 = InputLayer(shape=(None, None, in_dim), input_var=input_var, name='input0')
    gru0 = GRULayer(input0, num_units=out_dim, precompute_input=True,
                    backwards=False, only_return_final=False, learn_init=True,
                    name='gru0')
    return gru0

def test_case_0():
    import numpy as np
    from lasagne_ext.utils import get_layer_by_name

    in_dim, out_dim = 6, 5
    model_D = build_model_D(in_dim=in_dim, out_dim=out_dim)
    model_L = build_model_L(in_dim=in_dim, out_dim=out_dim)

    W_in = np.random.rand(in_dim, 3 * out_dim).astype(np.float32)
    b_in = np.random.rand(3 * out_dim).astype(np.float32)
    W_hid = np.random.rand(out_dim, 3 * out_dim).astype(np.float32)
    h_ini = np.random.rand(out_dim).astype(np.float32)

    model_D.gru.W_in.set_value(W_in)
    model_D.gru.b_in.set_value(b_in)
    model_D.gru.W_hid.set_value(W_hid)
    model_D.gru.h_ini.set_value(h_ini)

    # Dandelion stores the gate weights fused as [reset | update | hidden update];
    # Lasagne keeps one parameter per gate, so the fused arrays are sliced here.
    gru_L = get_layer_by_name(model_L, 'gru0')
    gru_L.W_in_to_resetgate.set_value(W_in[:, :out_dim])
    gru_L.W_in_to_updategate.set_value(W_in[:, out_dim:2 * out_dim])
    gru_L.W_in_to_hidden_update.set_value(W_in[:, 2 * out_dim:3 * out_dim])

    gru_L.W_hid_to_resetgate.set_value(W_hid[:, :out_dim])
    gru_L.W_hid_to_updategate.set_value(W_hid[:, out_dim:2 * out_dim])
    gru_L.W_hid_to_hidden_update.set_value(W_hid[:, 2 * out_dim:3 * out_dim])

    gru_L.b_resetgate.set_value(b_in[:out_dim])
    gru_L.b_updategate.set_value(b_in[out_dim:2 * out_dim])
    gru_L.b_hidden_update.set_value(b_in[2 * out_dim:3 * out_dim])

    gru_L.hid_init.set_value(h_ini.reshape((1, out_dim)))

    X = get_layer_by_name(model_L, 'input0').input_var
    y_D = model_D.forward(X)
    y_L = get_output(model_L)

    fn_D = theano.function([X], y_D, no_default_updates=True, on_unused_input='ignore')
    fn_L = theano.function([X], y_L, no_default_updates=True, on_unused_input='ignore')

    for i in range(20):
        x = np.random.rand(2, 5, in_dim).astype(np.float32) - 0.5
        y_D = fn_D(x)
        y_L = fn_L(x)
        diff = np.max(np.abs(y_D - y_L))
        print('i=%d, diff=%0.6f' % (i, diff))
        if diff > 1e-4:
            print('y_D=\n', y_D)
            print('y_L=\n', y_L)
            raise ValueError('diff is too big')


if __name__ == '__main__':

    test_case_0()

    print('Test passed')

--------------------------------------------------------------------------------
/debug/speed_im2col.py:
--------------------------------------------------------------------------------
import numpy as np
import theano as th
import theano.tensor as T
import theano.tensor.nnet.neighbours as N

import time


def im_to_col(im, psize, n_channels=3):
    """Similar to MATLAB's im2col function.

    Args:
      im - a Theano tensor3, of the form .
      psize - an int specifying the (square) block size to use
      n_channels - the number of channels in im

    Returns: a 5-tensor of the form .
    """
    assert im.ndim == 3, "im must have dimension 3."
    im = im[:, ::-1, ::-1]
    res = T.zeros((n_channels, psize * psize, im.shape[1] - psize + 1,
                   im.shape[2] - psize + 1))
    filts = T.reshape(T.eye(psize * psize, psize * psize),
                      (psize * psize, psize, psize))
    filts = T.shape_padleft(filts).dimshuffle((1, 0, 2, 3))

    for i in range(n_channels):
        cur_slice = T.shape_padleft(im[i], n_ones=2)
        res = T.set_subtensor(res[i], T.nnet.conv.conv2d(cur_slice, filts)[0])

    return res.dimshuffle((0, 2, 3, 1)).reshape(
        (n_channels, im.shape[1] - psize + 1, im.shape[2] - psize + 1,
         psize, psize)).dimshuffle((1, 2, 0, 3, 4))


def main():
    # Turn these knobs if you wish to work with larger/smaller data
    img_dims = (500, 500)
    fsize = 2
    n_channels = 3

    # Create a random image
    img = np.asarray(np.random.rand(*((n_channels,) + img_dims)),
                     dtype=th.config.floatX)
    img = np.arange(n_channels * img_dims[0] * img_dims[1],
                    dtype=th.config.floatX).reshape(n_channels, *img_dims)

    # Adapt the code to use the CPU/GPU. In the GPU case, do NOT transfer the
    # results back to memory.
    wrap = ((lambda x: x) if th.config.device == "cpu" else
            (lambda x: th.Out(th.sandbox.cuda.basic_ops.gpu_from_host(x),
                              borrow=True)))

    # Convolution method
    x = th.shared(img)
    f = th.function(
        inputs=[],
        outputs=wrap(im_to_col(x, fsize, n_channels=n_channels)),
        name='im_to_col')

    # Time the convolution method
    tic = time.time()
    out_conv = f()
    conv_time = time.time() - tic
    print("Convolution-based method: {0}".format(conv_time))

    # Time the neighbors method
    neighs = N.NeighbourhoodsFromImages(1, (fsize, fsize), strides=(1, 1),
                                        ignore_border=True)(x)
    f = th.function([], outputs=wrap(neighs),
                    name='old neighs')
    tic = time.time()
    out_old = f()
    neigh_time = time.time() - tic
    print("Neighbors-based method: {0}".format(neigh_time))

    # Time the new neighbours method ignore border
    neighs = N.images2neibs(x.dimshuffle('x', 0, 1, 2),
                            (fsize, fsize), (1, 1),
                            mode='ignore_borders')
    f = th.function([], outputs=wrap(neighs),
                    name='new neighs ignore border')
    tic = time.time()
    out_new = f()
    neigh_time = time.time() - tic
    print("New Neighbors-based ignore border method: {0}".format(neigh_time))

    # Time the new neighbours method
    neighs = N.images2neibs(x.dimshuffle('x', 0, 1, 2),
                            (fsize, fsize), (1, 1),
                            mode='valid')
    f = th.function([], outputs=wrap(neighs),
                    name='new neighs valid')
    tic = time.time()
    out_new = f()
    neigh_time = time.time() - tic
    print("New Neighbors-based valid method: {0}".format(neigh_time))

    # Print speedup results (neigh_time holds the last timing, i.e. mode='valid')
    if conv_time < neigh_time:
        print("Conv faster than neigh. Speedup: {0}x".format(neigh_time / conv_time))
    else:
        print("Neigh faster than conv. Speedup: {0}x".format(conv_time / neigh_time))

if __name__ == "__main__":
    main()

--------------------------------------------------------------------------------
/test/test_im2col.py:
--------------------------------------------------------------------------------
# coding:utf-8
'''
Test for functional.im2col
Created : 7, 26, 2018
Revised : 7, 26, 2018
All rights reserved
'''
__author__ = 'dawei.leng'

import os, sys
os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'"

import theano
from theano import tensor
from dandelion.functional import _im2col as im2col
import dandelion
dandelion_path = os.path.split(dandelion.__file__)[0]
print('dandelion path = %s\n' % dandelion_path)


def _test_case_0():
    print('test case 0')
    import numpy as np
    x = tensor.ftensor4()
    y = im2col(x, nb_size=(1,1), border_mode='valid', merge_channel=False)
    f = theano.function([x], y, no_default_updates=True)

    # B, C, H, W = 1, 1, 3, 3
    for i in range(10):
        B, C, H, W = np.random.randint(1, 8), np.random.randint(1, 4), np.random.randint(3, 128), np.random.randint(3, 128)
        X = np.random.rand(B, C, H, W).astype('float32')
        Y = f(X)
        print('X.shape=', X.shape)
        print('Y.shape=', Y.shape)
        if np.all(X==Y[:,:,:,:,0]):
            pass
        else:
            diff = np.max(abs(X-Y[:,:,:,:,0]))
            print('max diff=', diff)
            print('X=', X.flatten())
            print('Y=', Y[:,:,:,:,0].flatten())
            raise ValueError('X!=Y')

def _test_case_1():
    print('test case 1')
    import numpy as np
    x = tensor.ftensor4()
    y = im2col(x, nb_size=(1,1), border_mode='valid', merge_channel=True)
    f = theano.function([x], y, no_default_updates=True)

    # B, C, H, W = 1, 1, 3, 3
    for i in range(10):
        B, C, H, W = np.random.randint(1, 8), np.random.randint(1, 4), np.random.randint(3, 128), np.random.randint(3, 128)
        X = np.random.rand(B, C, H, W).astype('float32')
        Y = f(X)
        Y2 = X.transpose((0, 2, 3, 1))
        # Y2 = np.reshape(Y2, (B, H, W, -1))
        print('Y.shape=', Y.shape)
        print('Y2.shape=', Y2.shape)
        if np.all(Y==Y2):
            pass
        else:
            diff = np.max(abs(Y-Y2))
            print('max diff=', diff)
            print('Y=', Y.flatten())
            print('Y2=', Y2.flatten())
            raise ValueError('Y!=Y2')

def _test_case_2():
    print('test case 2')
    import numpy as np
    # B, C, H, W, nb_size, nb_step, border_mode = 2, 4, 6, 6, (3,3), (1,1), 'valid'  # failed
    # B, C, H, W, nb_size, nb_step, border_mode = 2, 4, 6, 6, (2,2), (1,1), 'valid'  # failed
    # B, C, H, W, nb_size, nb_step, border_mode = 2, 4, 5, 6, (3,3), (1,1), 'half'  # pass, mode half need neighbour with odd shapes
    # B, C, H, W, nb_size, nb_step, border_mode = 2, 4, 6, 6, (2,2), (1,1), 'half'  # failed
    # B, C, H, W, nb_size, nb_step, border_mode = 2, 4, 9, 9, (3,3), (1,1), 'ignore_borders'  # failed
    # B, C, H, W, nb_size, nb_step, border_mode = 2, 4, 6, 6, (3,3), (1,1), 'full'  # failed
    # B, C, H, W, nb_size, nb_step, border_mode = 2, 4, 5, 6, (2,2), (1,1), 'wrap_centered'  # failed, mode wrap_centered need neighbour with odd shapes
    # B, C, H, W, nb_size, nb_step, border_mode = 2, 4, 5, 6, (3,3), (1,1), 'wrap_centered'  # pass, mode wrap_centered need neighbour with odd shapes
    B, C, H, W, nb_size, nb_step, border_mode = 2, 4, 10, 9, (3,3), (1,1), 'wrap_centered'  # pass, mode wrap_centered need neighbour with odd shapes

    x = tensor.ftensor4()
    y = im2col(x, nb_size=nb_size, border_mode=border_mode, merge_channel=False)
    f = theano.function([x], y, no_default_updates=True)

    X = np.random.rand(B, C, H, W).astype('float32')
    Y = f(X)
    print('X.shape=', X.shape)
    print('Y.shape=', Y.shape)
    if X.shape[0] == Y.shape[0] and X.shape[1] == Y.shape[1] and X.shape[2] == Y.shape[2] and X.shape[3] == Y.shape[3] and Y.shape[-1] == nb_size[0] * nb_size[1]:
        pass
    else:
        raise ValueError('Shape not consistent')



if __name__ == '__main__':

    _test_case_0()
    _test_case_1()
    _test_case_2()

    print('Test passed')

--------------------------------------------------------------------------------
/test/test_BatchNorm.py:
--------------------------------------------------------------------------------
# coding:utf-8
# Test for BatchNorm class
# Created : 2, 27, 2018
# Revised : 2, 27, 2018
# All rights reserved
#------------------------------------------------------------------------------------------------
__author__ = 'dawei.leng'
import os, sys
os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'"

import theano
from theano import tensor
from dandelion.module import *
from dandelion.activation import *
from dandelion.objective import *
from dandelion.update import *
from lasagne.layers import InputLayer, BatchNormLayer_DV, get_output, get_all_updates
import lasagne.nonlinearities as LACT
import dandelion
dandelion_path = os.path.split(dandelion.__file__)[0]
print('dandelion path = %s\n' % dandelion_path)

class build_model_D(Module):
    def __init__(self, input_shape=None, axes='auto'):
        super().__init__()
        self.input_shape = input_shape
        self.axes = axes
        self.bn = BatchNorm(input_shape=self.input_shape, axes=self.axes)

    def forward(self, x):
        x = self.bn.forward(x)
        return x

    def predict(self, x):
        return self.bn.predict(x)

def build_model_L(input_shape=None, axes='auto'):
    input_var = tensor.ftensor4('x')
    input0 = InputLayer(shape=input_shape, input_var=input_var, name='input0')
    result = BatchNormLayer_DV(input0, axes=axes, name='bn0')
    return result

def fix_update_bcasts(updates):
    for param, update in updates.items():
        if param.broadcastable != update.broadcastable:
            updates[param] = tensor.patternbroadcast(update, param.broadcastable)
    return updates

def test_case_0():
    import numpy as np
    from lasagne_ext.utils import get_layer_by_name

    B, C, H, W = 2, 1, 8, 8
    input_shape = (None, C, H, W)
    axes = 'auto'

    model_D = build_model_D(input_shape=input_shape, axes=axes)
    model_L = build_model_L(input_shape=input_shape, axes=axes)

    X = get_layer_by_name(model_L, 'input0').input_var
    #--- predict ---#
    if 0:
        y_D = model_D.predict(X)
        y_L = get_output(model_L, deterministic=True)
        fn_L = theano.function([X], y_L, no_default_updates=True)
        fn_D = theano.function([X], y_D, no_default_updates=True)

    #--- train ---#
    if 1:
        y_D = model_D.forward(X)
        y_L = get_output(model_L, deterministic=False)

        update_L = fix_update_bcasts(get_all_updates(model_L))
        update_D = fix_update_bcasts(model_D.collect_self_updates())

        fn_L = theano.function([X], y_L, updates=update_L, no_default_updates=True)
        fn_D = theano.function([X], y_D, updates=update_D, no_default_updates=False)
        # fn_L = theano.function([X], y_L, no_default_updates=True)
        # fn_D = theano.function([X], y_D, no_default_updates=True)


    for i in range(20):
        x = np.random.rand(B, C, H, W).astype(np.float32)
        y_D = fn_D(x)
        y_L = fn_L(x)
        diff = np.sum(np.abs(y_D - y_L))
        print('i=%d, diff=%0.6f' % (i, diff))
        if diff>1e-4:
            print(y_D)
            print(y_L)
            raise ValueError('diff is too big')
    print('test_case_0 passed')

def test_case_1():
    import numpy as np  # imported locally, as in test_case_0

    B, C, H, W = 4, 1, 256, 256
    x = tensor.ftensor4('x')
    z = tensor.ftensor4('gt')
    bn = BatchNorm(input_shape=(B, C, H, W), beta=None, gamma=None)
    conv = Conv2D(in_channels=C, out_channels=C, kernel_size=(3,3), pad='same')
    model = Sequential([bn, conv], activation=relu)
    y = model.forward(x)
    loss = aggregate(squared_error(y, z))
    updates = adadelta(loss, model.collect_params())
    updates.update(model.collect_self_updates())
    # f = theano.function([x], y, no_default_updates=False, updates=fix_update_bcasts(bn.collect_self_updates()))
    f = theano.function([x, z], [y, loss], no_default_updates=False, updates=updates)
    x = np.random.rand(B, C, H, W).astype('float32')
    z = np.random.rand(B, C, H, W).astype('float32')
    y, loss = f(x, z)
    assert y.shape == (B, C, H, W)
    print('test_case_1 passed')

if __name__ == '__main__':

    # test_case_0()

    test_case_1()

    print('Test passed')
--------------------------------------------------------------------------------
/dandelion/ext/CV/CV.py:
--------------------------------------------------------------------------------
# coding:utf-8
'''
Function set for image processing and computer vision
Created : 10, 19, 2015
Revised : 4, 4, 2018 total rewrite of `imread()` to add support of EXIF rotation handling
          8, 16, 2018 add `border_mode` arg to `imrotate()`, the `interpolation` arg type is changed to string.
All rights reserved
'''
__author__ = 'dawei.leng'

import numpy as np
from PIL import Image, ImageFile, ExifTags
ImageFile.LOAD_TRUNCATED_IMAGES = True

def imread(f, flatten=False, dtype='float32'):
    """
    Read image from file, backend=pillow
    f : str or file object -> The file name or file object to be read.
    flatten : bool, optional -> If True, flattens the color channels into a single gray-scale channel.
    dtype : {'uint8', 'float32'} -> dtype of the returned array; any value other than 'uint8' gives float32.
    Returns ndarray
    """
    # return sp.misc.imread(f, flatten)

    image=Image.open(f)

    try:
        # --- handle EXIF rotation ---#
        for orientation in ExifTags.TAGS.keys() :
            if ExifTags.TAGS[orientation]=='Orientation' : break
        exif=dict(image._getexif().items())
        if exif[orientation] == 3 :
            image=image.rotate(180, expand=True)
        elif exif[orientation] == 6 :
            image=image.rotate(270, expand=True)
        elif exif[orientation] == 8 :
            image=image.rotate(90, expand=True)
    finally:
        # NOTE: returning from `finally` discards any exception raised above
        # (e.g. for images without EXIF data), so the image is returned unrotated.
        #--- return proper numpy array ---#
        if dtype == 'uint8':
            if flatten:
                return np.array(image.convert('L'))
            else:
                return np.array(image.convert('RGB'))
        else:  # dtype == 'float32'
            if flatten:
                return np.array(image.convert('F'))
            else:
                return np.array(image.convert('RGB'), dtype='float32')

def imsave(f, I, **params):
    """
    Save image into file, backend=pillow
    For jpeg, image should be in uint8 type
    :param f: str or file object
    :param I:
    :return:
    """
    image = Image.fromarray(I)
    return image.save(fp=f, format=None, **params)

def imresize(I, size, interp='bilinear', mode=None):
    import scipy.misc
    return scipy.misc.imresize(I, size, interp, mode)

def imrotate(I, angle, padvalue=0.0, interpolation='linear', target_size=None, border_mode='reflect_101'):
    """
    Return a rotated image, backend=opencv
    :param I: N-D np array, dtype not limited
    :param angle: in degree, positive for counter-clockwise
    :param border_mode: image boundary handling method, {'reflect_101'|'reflect'|'wrap'|'constant'|'replicate'}, refer to opencv:BORDER_* constants for details
    :param padvalue: used when `border_mode` = 'constant'
    :param interpolation: image interpolation method, {'linear'|'nearest'|'cubic'|'LANCZOS4'|'area'}, refer to opencv:INTER_* constants for details
    :return: rotated image, dtype same with `I`
    """
    import cv2
    assert border_mode.lower() in {'reflect_101', 'reflect', 'wrap', 'constant', 'replicate'}
    assert interpolation.lower() in {'linear', 'nearest', 'cubic', 'lanczos4', 'area'}
    if abs(angle) < 0.01:
        return I
    rows, cols = I.shape[:2]
    rmatrix = cv2.getRotationMatrix2D((cols/2, rows/2), angle, 1.0)  # 2 * 3
    vertex_matrix = np.array([[0, 0, 1], [cols, 0, 1], [0, rows, 1], [cols, rows, 1]]).T  # 3 * 4
    vertex_matrix_target = np.dot(rmatrix, vertex_matrix)  # 2 * 4
    rowmin, colmin = vertex_matrix_target.min(axis=1)
    rowmax, colmax = vertex_matrix_target.max(axis=1)
    newshape = (rowmax-rowmin, colmax-colmin)
    rmatrix[0, 2] += newshape[0]/2 - cols/2
    rmatrix[1, 2] += newshape[1]/2 - rows/2
    if target_size is not None:
        newshape = target_size
    else:
        newshape = (np.round(newshape[0]).astype('int'), np.round(newshape[1]).astype('int'))
    I2 = cv2.warpAffine(I, rmatrix, newshape,
                        flags=getattr(cv2, 'INTER_%s' % interpolation.upper()),
                        borderMode=getattr(cv2, 'BORDER_%s' % border_mode.upper()),
                        borderValue=padvalue)
    return I2


if __name__ == '__main__':
    from matplotlib import pyplot as plt
    file = r"C:\Users\dawei\Work\Data\Robust_Reading\ICDAR2013_robust_reading_challenge2\trainset\images\107.jpg"
    I = imread(file, flatten=False)
    I2 = imrotate(I, 5, border_mode='reflect_101', interpolation='linear')
    plt.imshow(I2.astype('uint8'), 'gray')
    plt.show()

--------------------------------------------------------------------------------
/test/test_todevice.py:
--------------------------------------------------------------------------------
# coding:utf-8
'''
Test Module's `.todevice` interface
According to [Issue](https://github.com/Theano/Theano/issues/6655), this feature of Theano is never finished.
Created : 11, 2, 2018
Revised : 11, 2, 2018
All rights reserved
'''
# ------------------------------------------------------------------------------------------------
__author__ = 'dawei.leng'
import os
os.environ['THEANO_FLAGS'] = "floatX=float32,mode=FAST_RUN, warn_float64='raise',contexts=dev0->cuda3;dev1->cuda6"
import theano
theano_path = os.path.split(theano.__file__)[0]
print('theano path = %s\n' % theano_path)
import theano.tensor as tensor
import dandelion
dandelion_path = os.path.split(dandelion.__file__)[0]
print('dandelion path = %s\n' % dandelion_path)
from dandelion.module import *


class build_model_on_single_device(Module):
    def __init__(self, in_dim=1024, out_dim=512, device_context=None):
        super().__init__()
        self.in_dim = in_dim
        self.out_dim = out_dim
        self.dense1 = Dense(input_dims=self.in_dim, output_dim=self.out_dim)
        self.dense2 = Dense(input_dims=self.out_dim, output_dim=self.out_dim)
        if device_context is not None:
            self.dense1.todevice(device_context)
            self.dense2.todevice(device_context)

    def forward(self, x):
        x = self.dense1.forward(x)
        x = relu(x)
        x = self.dense2.forward(x)
        x = relu(x)
        return x

    def predict(self, x):
        x = self.dense1.predict(x)
        x = relu(x)
        x = self.dense2.predict(x)
        x = relu(x)
        return x

class build_model_on_multiple_device(Module):
    def __init__(self, in_dim=1024, out_dim=512, device_context=None):
        super().__init__()
        self.in_dim = in_dim
        self.out_dim = out_dim
        self.dense1 = Dense(input_dims=self.in_dim, output_dim=self.out_dim)
        self.dense2 = Dense(input_dims=self.out_dim, output_dim=self.out_dim)
        if device_context is not None:
            self.dense1.todevice(device_context[0])
            self.dense2.todevice(device_context[1])
        self.device_context = device_context

    def forward(self, x):
        x = self.dense1.forward(x)
        x = relu(x)
        if self.device_context is not None:
            x = x.transfer(self.device_context[1])
        x = self.dense2.forward(x)
        x = relu(x)
        return x

    def predict(self, x):
        x = self.dense1.predict(x)
        x = relu(x)
        if self.device_context is not None:
            x = x.transfer(self.device_context[1])
        x = self.dense2.predict(x)
        x = relu(x)
        return x

def test_case_0(batch=256, in_dim=1024, out_dim=512):
    import numpy as np
    import time
    try:
        model_on_single_device = build_model_on_single_device(in_dim=in_dim, out_dim=out_dim, device_context='dev0')
        model_on_multiple_device = build_model_on_multiple_device(in_dim=in_dim, out_dim=out_dim, device_context=['dev0', 'dev1'])
    except ValueError as e:
        if str(e).startswith("Can't transfer to target"):
            print('GPU not present, test skipped')
            return

    model_on_multiple_device.dense1.W.set_value(model_on_single_device.dense1.W.get_value())
    model_on_multiple_device.dense1.b.set_value(model_on_single_device.dense1.b.get_value())
    model_on_multiple_device.dense2.W.set_value(model_on_single_device.dense2.W.get_value())
    model_on_multiple_device.dense2.b.set_value(model_on_single_device.dense2.b.get_value())

    x = tensor.fmatrix()
    x1 = x.transfer('dev0')
    y0 = model_on_single_device.forward(x1)
    y1 = model_on_multiple_device.forward(x1)

    f0 = theano.function([x], y0, no_default_updates=True)
    f1 = theano.function([x], y1, no_default_updates=True)

    for i in range(20):
        x = np.random.rand(batch, in_dim).astype(np.float32)
        time00 = time.time()
        y0 = f0(x)
        time01 = time.time()
        y1 = f1(x)
        time02 = time.time()
        time0 = time01 - time00
        time1 = time02 - time01
        diff = np.sum(np.abs(y0 - y1))
        print('i=%d, diff=%0.6f, time0=%0.6fs, time1=%0.6fs, time0/time1=%0.4f' % (i, diff, time0, time1, time0/time1))
        if diff>1e-4:
            raise ValueError('diff is too big')

if __name__ == '__main__':
    try:
        test_case_0(batch=512, in_dim=1024, out_dim=512)
    except ValueError as e:
        if str(e).startswith("Can't transfer to target"):
            print('GPU not present, test skipped')
    print('Test passed')
--------------------------------------------------------------------------------
/docs/dandelion_objective.md:
--------------------------------------------------------------------------------
## ctc_cost_logscale
CTC cost calculated in `log` scale. This CTC objective is written purely in Theano, so it runs on both Windows and Linux. Theano itself also has a [wrapper](http://deeplearning.net/software/theano/library/tensor/nnet/ctc.html) for Baidu's `warp-ctc` library, which requires a separate install and only runs on Linux.
```python
ctc_cost_logscale(seq, sm, seq_mask=None, sm_mask=None, blank_symbol=None, align='pre')
```
* **seq**: query sequence, shape of `(L, B)`, `float32`-typed
* **sm**: score matrix, shape of `(T, C+1, B)`, `float32`-typed
* **seq_mask**: mask for query sequence, shape of `(L, B)`, `float32`-typed
* **sm_mask**: mask for score matrix, shape of `(T, B)`, `float32`-typed
* **blank_symbol**: scalar, = `C` by default
* **align**: string, {'pre'/'post'}, indicating how input samples are aligned in one batch
* **return**: negative log likelihood averaged over a batch

_______________________________________________________________________
## ctc_best_path_decode
Decode the network output scorematrix by best-path-decoding scheme.
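As a rough illustration of the scheme (not Dandelion's Theano implementation), best-path decoding takes the arg-max class at each time step, collapses consecutive repeats, and then drops the blanks. The NumPy helper `best_path_decode` below is hypothetical: it handles a single unmasked sequence of shape `(T, C+1)` and assumes the blank symbol is the last class index.

```python
import numpy as np

def best_path_decode(scores, blank):
    """Greedy (best-path) CTC decoding of one sequence.

    scores: array of shape (T, C+1), per-frame class scores
    blank:  index of the blank symbol
    """
    path = scores.argmax(axis=1)            # most likely class per frame
    keep = np.ones(len(path), dtype=bool)   # keep a frame only if it differs
    keep[1:] = path[1:] != path[:-1]        # from the previous one (collapse repeats)
    collapsed = path[keep]
    return collapsed[collapsed != blank].tolist()  # finally drop the blanks

# Toy example: C = 2 real classes (0, 1), blank = 2
scores = np.array([
    [0.90, 0.05, 0.05],  # 0
    [0.80, 0.10, 0.10],  # 0 (repeat, collapsed)
    [0.10, 0.10, 0.80],  # blank
    [0.70, 0.20, 0.10],  # 0 (a new symbol, because a blank separates it)
    [0.10, 0.80, 0.10],  # 1
])
decoded = best_path_decode(scores, blank=2)
print(decoded)  # [0, 0, 1]
```

Repeats are collapsed before blanks are removed; this order is what lets a blank separate two genuine occurrences of the same symbol.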
17 | ```python 18 | ctc_best_path_decode(Y, Y_mask=None, blank_symbol=None) 19 | ``` 20 | * **Y**: output of a network, with shape `(B, T, C+1)` 21 | * **Y_mask**: mask of Y, with shape `(B, T)` 22 | * **return**: result sequence of shape `(T, B)`, and result sequence mask of shape `(T, B)` 23 | 24 | _______________________________________________________________________ 25 | ## ctc_CER 26 | Calculate the character error rate (CER) given ground truth `targetseq` and CTC decoding output `resultseq` 27 | ```python 28 | ctc_CER(resultseq, targetseq, resultseq_mask=None, targetseq_mask=None) 29 | ``` 30 | * **resultseq**: CTC decoding output, with shape `(T1, B)` 31 | * **targetseq**: sequence ground truth, with shape `(T2, B)` 32 | * **return**: tuple of `(CER, TE, TG)`, in which `TE` is the batch-wise total edit distance, `TG` is the batch-wise total ground truth sequence length, and `CER` equals `TE/TG` 33 | 34 | _______________________________________________________________________ 35 | ## categorical_crossentropy 36 | Computes the categorical cross-entropy between predictions and targets 37 | ```python 38 | categorical_crossentropy(predictions, targets, eps=1e-7, m=None, class_weight=None) 39 | ``` 40 | * **predictions**: Theano 2D tensor, predictions in (0, 1), such as softmax output of a neural network, with data points in rows and class probabilities in columns. 41 | * **targets**: Theano 2D tensor or 1D tensor, either targets in [0, 1] (float32 type) matching the layout of `predictions`, or a vector of int giving the correct class index per data point. In the case of an integer vector argument, each element represents the position of the '1' in a one-hot encoding. 42 | * **eps**: epsilon added to `predictions` to prevent numerical instability when used with softmax activation 43 | * **m**: maximum possible value of an element of `targets`, required when `targets` is a 1D tensor and `class_weight` is not None. 
44 | * **class_weight**: tensor vector with shape (Nclass,), used for class weighting, optional. 45 | * **return**: Theano 1D tensor, an expression for the item-wise categorical cross-entropy. 46 | 47 | _______________________________________________________________________ 48 | ## categorical_crossentropy_log 49 | Computes the categorical cross-entropy between predictions and targets, in log-domain. 50 | ```python 51 | categorical_crossentropy_log(log_predictions, targets, m=None, class_weight=None) 52 | ``` 53 | * **log_predictions**: Theano 2D tensor, predictions in log of (0, 1), such as log_softmax output of a neural network, with data points in rows and class probabilities in columns. 54 | * **targets**: Theano 2D tensor or 1D tensor, either targets in [0, 1] (float32 type) matching the layout of `log_predictions`, or a vector of int giving the correct class index per data point. In the case of an integer vector argument, each element represents the position of the '1' in a one-hot encoding. 55 | * **m**: maximum possible value of an element of `targets`, only used when `targets` is a 1D vector. When `targets` is an integer vector, `categorical_crossentropy_log` is implemented differently from `categorical_crossentropy`: the latter relies on `theano.tensor.nnet.categorical_crossentropy`, whereas the former simply transforms the integer vector `targets` into a one-hot encoded matrix, which is why the `m` argument is needed here. A possible limitation is that this implementation does not allow `m` to change on the fly. 56 | * **class_weight**: tensor vector with shape (Nclass,), used for class weighting, optional. 
57 | * **return**: Theano 1D tensor, an expression for the item-wise categorical cross-entropy in log-domain 58 | 59 | _______________________________________________________________________ 60 | 61 | You're recommended to refer to [`Lasagne.objectives` document](http://lasagne.readthedocs.io/en/latest/modules/objectives.html) for the following objectives: 62 | 63 | * binary_crossentropy 64 | * squared_error 65 | * binary_hinge_loss 66 | * multiclass_hinge_loss 67 | * binary_accuracy 68 | * categorical_accuracy -------------------------------------------------------------------------------- /test/test_LSTM.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | # Unit test for LSTM class 3 | # Created : 1, 30, 2018 4 | # Revised : 1, 30, 2018 5 | # All rights reserved 6 | #------------------------------------------------------------------------------------------------ 7 | __author__ = 'dawei.leng' 8 | import os, sys 9 | os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'" 10 | 11 | import theano 12 | from theano import tensor 13 | from dandelion.module import * 14 | from dandelion.activation import * 15 | from lasagne.layers import InputLayer, DenseLayer, LSTMLayer, get_output, Upscale2DLayer, TransposedConv2DLayer 16 | import lasagne.nonlinearities as LACT 17 | 18 | import dandelion 19 | dandelion_path = os.path.split(dandelion.__file__)[0] 20 | print('dandelion path = %s\n' % dandelion_path) 21 | 22 | class build_model_D(Module): 23 | def __init__(self, in_dim=3, out_dim=3): 24 | super().__init__() 25 | self.in_dim = in_dim 26 | self.out_dim = out_dim 27 | self.lstm = LSTM(input_dims=self.in_dim, hidden_dim=self.out_dim, peephole=True, learn_ini=True) 28 | self.predict = self.forward 29 | 30 | def forward(self, x): 31 | """ 32 | 33 | :param x: (B, T, D) 34 | :return: 35 | """ 36 | x = x.dimshuffle((1, 0, 2)) # ->(T, B, D) 37 | x = self.lstm.forward(x, backward=True, 
only_return_final=True) 38 | # x = x.dimshuffle((1, 0, 2)) # ->(B, T, D) 39 | # x = tanh(x) 40 | return x 41 | 42 | def build_model_L(in_dim=3, out_dim=3): 43 | input_var = tensor.ftensor3('x') # (B, T, D) 44 | input0 = InputLayer(shape=(None, None, in_dim), input_var=input_var, name='input0') 45 | lstm0 = LSTMLayer(input0, num_units=out_dim, precompute_input=True, nonlinearity=LACT.tanh, 46 | backwards=True, only_return_final=True, learn_init=True, consume_less='None', 47 | name='lstm0') 48 | return lstm0 49 | 50 | def test_case_0(): 51 | import numpy as np 52 | from lasagne_ext.utils import get_layer_by_name 53 | 54 | in_dim, out_dim = 32, 3 55 | model_D = build_model_D(in_dim=in_dim, out_dim=out_dim) 56 | model_L = build_model_L(in_dim=in_dim, out_dim=out_dim) 57 | 58 | W_in = np.random.rand(in_dim, 4*out_dim).astype(np.float32) 59 | b_in = np.random.rand(4*out_dim).astype(np.float32) 60 | W_hid = np.random.rand(out_dim, 4*out_dim).astype(np.float32) 61 | h_ini = np.random.rand(out_dim).astype(np.float32) 62 | c_ini = np.random.rand(out_dim).astype(np.float32) 63 | w_cell_to_igate = np.random.rand(out_dim).astype(np.float32) 64 | w_cell_to_fgate = np.random.rand(out_dim).astype(np.float32) 65 | w_cell_to_ogate = np.random.rand(out_dim).astype(np.float32) 66 | 67 | model_D.lstm.W_in.set_value(W_in) 68 | model_D.lstm.b_in.set_value(b_in) 69 | model_D.lstm.W_hid.set_value(W_hid) 70 | model_D.lstm.h_ini.set_value(h_ini) 71 | model_D.lstm.c_ini.set_value(c_ini) 72 | model_D.lstm.w_cell_to_igate.set_value(w_cell_to_igate) 73 | model_D.lstm.w_cell_to_fgate.set_value(w_cell_to_fgate) 74 | model_D.lstm.w_cell_to_ogate.set_value(w_cell_to_ogate) 75 | 76 | lstm_L = get_layer_by_name(model_L, 'lstm0') 77 | lstm_L.W_in_to_ingate.set_value(W_in[:, :out_dim]) 78 | lstm_L.W_in_to_forgetgate.set_value(W_in[:, out_dim:2*out_dim]) 79 | lstm_L.W_in_to_cell.set_value(W_in[:, 2*out_dim:3*out_dim]) 80 | lstm_L.W_in_to_outgate.set_value(W_in[:, 3*out_dim:]) 81 | 82 | 
lstm_L.W_hid_to_ingate.set_value(W_hid[:, :out_dim]) 83 | lstm_L.W_hid_to_forgetgate.set_value(W_hid[:, out_dim:2*out_dim]) 84 | lstm_L.W_hid_to_cell.set_value(W_hid[:, 2*out_dim:3*out_dim]) 85 | lstm_L.W_hid_to_outgate.set_value(W_hid[:, 3*out_dim:]) 86 | 87 | lstm_L.b_ingate.set_value(b_in[:out_dim]) 88 | lstm_L.b_forgetgate.set_value(b_in[out_dim:2*out_dim]) 89 | lstm_L.b_cell.set_value(b_in[2*out_dim:3*out_dim]) 90 | lstm_L.b_outgate.set_value(b_in[3*out_dim:]) 91 | 92 | lstm_L.hid_init.set_value(h_ini.reshape((1, out_dim))) 93 | lstm_L.cell_init.set_value(c_ini.reshape((1, out_dim))) 94 | 95 | lstm_L.W_cell_to_ingate.set_value(w_cell_to_igate) 96 | lstm_L.W_cell_to_forgetgate.set_value(w_cell_to_fgate) 97 | lstm_L.W_cell_to_outgate.set_value(w_cell_to_ogate) 98 | 99 | X = get_layer_by_name(model_L, 'input0').input_var 100 | y_D = model_D.forward(X) 101 | y_L = get_output(model_L) 102 | 103 | fn_D = theano.function([X], y_D, no_default_updates=True, on_unused_input='ignore') 104 | fn_L = theano.function([X], y_L, no_default_updates=True, on_unused_input='ignore') 105 | 106 | for i in range(20): 107 | x = np.random.rand(4, 16, in_dim).astype(np.float32) 108 | y_D = fn_D(x) 109 | y_L = fn_L(x) 110 | diff = np.max(np.abs(y_D - y_L)) 111 | print('i=%d, diff=%0.6f' % (i, diff)) 112 | if diff>1e-4: 113 | print('y_D=\n', y_D) 114 | print('y_L=\n', y_L) 115 | raise ValueError('diff is too big') 116 | 117 | if __name__ == '__main__': 118 | 119 | test_case_0() 120 | 121 | print('Test passed') 122 | 123 | 124 | 125 | -------------------------------------------------------------------------------- /dandelion/model/vgg.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | ''' 3 | Model definition of ancient VGG CNN nets 4 | Created : 7, 6, 2018 5 | Revised : 7, 6, 2018 6 | All rights reserved 7 | ''' 8 | # ------------------------------------------------------------------------------------------------ 9 | __author__ = 
'dawei.leng' 10 | 11 | import theano.tensor as tensor 12 | from ..module import * 13 | from ..functional import * 14 | from ..activation import * 15 | 16 | 17 | class model_VGG16(Module): 18 | """ 19 | VGG16-net reference implementation: 13 cnn + 3 dense + 2 dropout. 20 | """ 21 | 22 | def __init__(self, channel=3, im_height=224, im_width=224, Nclass=1000, kernel_size=3, border_mode=(1, 1), flip_filters=False): 23 | super().__init__() 24 | self.conv1_1 = Conv2D(in_channels=channel, out_channels=64, kernel_size=kernel_size, pad=border_mode, input_shape=(im_height, im_width), flip_filters=flip_filters) 25 | self.conv1_2 = Conv2D(in_channels=64, out_channels=64, kernel_size=kernel_size, pad=border_mode, flip_filters=flip_filters) 26 | self.conv2_1 = Conv2D(in_channels=64, out_channels=128, kernel_size=kernel_size, pad=border_mode, flip_filters=flip_filters) 27 | self.conv2_2 = Conv2D(in_channels=128, out_channels=128, kernel_size=kernel_size, pad=border_mode, flip_filters=flip_filters) 28 | self.conv3_1 = Conv2D(in_channels=128, out_channels=256, kernel_size=kernel_size, pad=border_mode, flip_filters=flip_filters) 29 | self.conv3_2 = Conv2D(in_channels=256, out_channels=256, kernel_size=kernel_size, pad=border_mode, flip_filters=flip_filters) 30 | self.conv3_3 = Conv2D(in_channels=256, out_channels=256, kernel_size=kernel_size, pad=border_mode, flip_filters=flip_filters) 31 | self.conv4_1 = Conv2D(in_channels=256, out_channels=512, kernel_size=kernel_size, pad=border_mode, flip_filters=flip_filters) 32 | self.conv4_2 = Conv2D(in_channels=512, out_channels=512, kernel_size=kernel_size, pad=border_mode, flip_filters=flip_filters) 33 | self.conv4_3 = Conv2D(in_channels=512, out_channels=512, kernel_size=kernel_size, pad=border_mode, flip_filters=flip_filters) 34 | self.conv5_1 = Conv2D(in_channels=512, out_channels=512, kernel_size=kernel_size, pad=border_mode, flip_filters=flip_filters) 35 | self.conv5_2 = Conv2D(in_channels=512, out_channels=512, 
kernel_size=kernel_size, pad=border_mode, flip_filters=flip_filters) 36 | self.conv5_3 = Conv2D(in_channels=512, out_channels=512, kernel_size=kernel_size, pad=border_mode, flip_filters=flip_filters) 37 | self.fc6 = Dense(input_dims=512 * (im_height//32) * (im_width//32), output_dim=4096) # 5 poolings of stride 2 -> each spatial size is floor-divided by 32 38 | self.fc7 = Dense(input_dims=4096, output_dim=4096) 39 | self.fc8 = Dense(input_dims=4096, output_dim=Nclass) 40 | self.fc6_dropout = Dropout() 41 | self.fc7_dropout = Dropout() 42 | 43 | 44 | def forward(self, x): 45 | """ 46 | :param x: (B, C, H, W) 47 | :return: 48 | """ 49 | self.work_mode = 'train' 50 | 51 | x = relu(self.conv1_1.forward(x)) # (B, 64, 224, 224) 52 | x = relu(self.conv1_2.forward(x)) 53 | x = pool_2d(x, ws=(2, 2)) # (B, 64, 112, 112) 54 | x = relu(self.conv2_1.forward(x)) 55 | x = relu(self.conv2_2.forward(x)) 56 | x = pool_2d(x, ws=(2, 2)) # (B, 128, 56, 56) 57 | x = relu(self.conv3_1.forward(x)) 58 | x = relu(self.conv3_2.forward(x)) 59 | x = relu(self.conv3_3.forward(x)) 60 | x = pool_2d(x, ws=(2, 2)) # (B, 256, 28, 28) 61 | x = relu(self.conv4_1.forward(x)) 62 | x = relu(self.conv4_2.forward(x)) 63 | x = relu(self.conv4_3.forward(x)) 64 | x = pool_2d(x, ws=(2, 2)) # (B, 512, 14, 14) 65 | x = relu(self.conv5_1.forward(x)) 66 | x = relu(self.conv5_2.forward(x)) 67 | x = relu(self.conv5_3.forward(x)) 68 | x = pool_2d(x, ws=(2, 2)) # (B, 512, 7, 7) 69 | x = tensor.flatten(x, ndim=2) # (B, 512 * 7 * 7) 70 | x = relu(self.fc6.forward(x)) 71 | x = self.fc6_dropout.forward(x, p=0.5) 72 | x = relu(self.fc7.forward(x)) 73 | x = self.fc7_dropout.forward(x, p=0.5) 74 | x = softmax(self.fc8.forward(x)) 75 | 76 | return x 77 | 78 | def predict(self, x): 79 | self.work_mode = 'inference' 80 | 81 | x = relu(self.conv1_1.predict(x)) # (B, 64, 224, 224) 82 | x = relu(self.conv1_2.predict(x)) 83 | x = pool_2d(x, ws=(2, 2)) # (B, 64, 112, 112) 84 | x = relu(self.conv2_1.predict(x)) 85 | x = relu(self.conv2_2.predict(x)) 86 | x = pool_2d(x, ws=(2, 2)) # (B, 128, 56, 56) 87 | x 
= relu(self.conv3_1.predict(x)) 88 | x = relu(self.conv3_2.predict(x)) 89 | x = relu(self.conv3_3.predict(x)) 90 | x = pool_2d(x, ws=(2, 2)) # (B, 256, 28, 28) 91 | x = relu(self.conv4_1.predict(x)) 92 | x = relu(self.conv4_2.predict(x)) 93 | x = relu(self.conv4_3.predict(x)) 94 | x = pool_2d(x, ws=(2, 2)) # (B, 512, 14, 14) 95 | x = relu(self.conv5_1.predict(x)) 96 | x = relu(self.conv5_2.predict(x)) 97 | x = relu(self.conv5_3.predict(x)) 98 | x = pool_2d(x, ws=(2, 2)) # (B, 512, 7, 7) 99 | x = tensor.flatten(x, ndim=2) # (B, 512 * 7 * 7) 100 | x = relu(self.fc6.predict(x)) 101 | x = relu(self.fc7.predict(x)) 102 | x = softmax(self.fc8.predict(x)) 103 | 104 | return x 105 | -------------------------------------------------------------------------------- /dandelion/model/feature_pyramid_net.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | ''' 3 | Model definition of [feature pyramid network](https://arxiv.org/abs/1612.03144) 4 | Created : 7, 6, 2018 5 | Revised : 7, 6, 2018 6 | All rights reserved 7 | ''' 8 | # ------------------------------------------------------------------------------------------------ 9 | __author__ = 'dawei.leng' 10 | 11 | import theano.tensor as tensor 12 | from ..module import * 13 | from ..functional import * 14 | from ..activation import * 15 | from .resnet import ResNet_bottleneck 16 | 17 | 18 | class model_FPN(Module): 19 | """ 20 | Reference implementation of [feature pyramid network](https://arxiv.org/abs/1612.03144) 21 | return tuple of (p2, p3, p4, p5), cnn pyramid features at different scales 22 | """ 23 | 24 | def __init__(self, input_channel=3, base_n_filters=64, batchnorm_mode=1): 25 | """ 26 | 8 effective conv in depth 27 | :param input_channel: 28 | :param base_n_filters: 29 | :param batchnorm_mode: passed to `ResNet_bottleneck()` 30 | """ 31 | super().__init__() 32 | self.conv1 = Conv2D(in_channels=input_channel, out_channels=base_n_filters, kernel_size=3, pad='same', 
stride=(2, 2)) 33 | self.bn1 = BatchNorm(input_shape=(None, base_n_filters, None, None)) 34 | self.conv2 = Conv2D(in_channels=base_n_filters, out_channels=base_n_filters * 2, kernel_size=3, pad='same', stride=(2, 2)) 35 | self.bn2 = BatchNorm(input_shape=(None, base_n_filters * 2, None, None)) 36 | self.conv3 = Conv2D(in_channels=base_n_filters * 2, out_channels=base_n_filters * 4, kernel_size=3, pad='same', stride=(2, 2)) 37 | self.bn3 = BatchNorm(input_shape=(None, base_n_filters * 4, None, None)) 38 | self.conv4 = Conv2D(in_channels=base_n_filters * 4, out_channels=base_n_filters * 8, kernel_size=3, pad='same', stride=(2, 2)) 39 | self.bn4 = BatchNorm(input_shape=(None, base_n_filters * 8, None, None)) 40 | 41 | self.res_block1 = ResNet_bottleneck(outer_channel=base_n_filters, inner_channel=base_n_filters // 2, batchnorm_mode=batchnorm_mode) 42 | self.res_block2 = ResNet_bottleneck(outer_channel=base_n_filters * 2, inner_channel=base_n_filters, batchnorm_mode=batchnorm_mode) 43 | self.res_block3 = ResNet_bottleneck(outer_channel=base_n_filters * 4, inner_channel=base_n_filters * 2, batchnorm_mode=batchnorm_mode) 44 | self.res_block4 = ResNet_bottleneck(outer_channel=base_n_filters * 8, inner_channel=base_n_filters * 4, batchnorm_mode=batchnorm_mode) 45 | 46 | self.conv5 = Conv2D(in_channels=base_n_filters * 8, out_channels=base_n_filters * 4, kernel_size=1, pad='same') 47 | self.conv6 = Conv2D(in_channels=base_n_filters * 4, out_channels=base_n_filters * 4, kernel_size=1, pad='same') 48 | self.conv7 = Conv2D(in_channels=base_n_filters * 2, out_channels=base_n_filters * 4, kernel_size=1, pad='same') 49 | self.conv8 = Conv2D(in_channels=base_n_filters, out_channels=base_n_filters * 4, kernel_size=1, pad='same') 50 | 51 | self.conv9 = Conv2D(in_channels=base_n_filters * 4, out_channels=base_n_filters * 4, kernel_size=1, pad='same') 52 | self.conv10 = Conv2D(in_channels=base_n_filters * 4, out_channels=base_n_filters * 4, kernel_size=1, pad='same') 53 | self.conv11 
= Conv2D(in_channels=base_n_filters * 4, out_channels=base_n_filters * 4, kernel_size=1, pad='same') 54 | 55 | def forward(self, x): 56 | """ 57 | :param x: (B, C, H, W) 58 | :return: 59 | """ 60 | self.work_mode = 'train' 61 | 62 | c1 = relu(self.bn1.forward(self.conv1.forward(x))) # stride = 2 63 | c2 = relu(self.res_block1.forward(c1)) 64 | 65 | c3 = relu(self.bn2.forward(self.conv2.forward(c2))) # stride = 4 66 | c3 = relu(self.res_block2.forward(c3)) 67 | 68 | c4 = relu(self.bn3.forward(self.conv3.forward(c3))) # stride = 8 69 | c4 = relu(self.res_block3.forward(c4)) 70 | 71 | c5 = relu(self.bn4.forward(self.conv4.forward(c4))) # stride = 16 72 | c5 = relu(self.res_block4.forward(c5)) 73 | 74 | p5 = relu(self.conv5.forward(c5)) 75 | p4 = relu(self.conv6.forward(c4)) + upsample_2d(p5, ratio=2, mode='repeat') 76 | p3 = relu(self.conv7.forward(c3)) + upsample_2d(p4, ratio=2, mode='repeat') 77 | p2 = relu(self.conv8.forward(c2)) + upsample_2d(p3, ratio=2, mode='repeat') 78 | 79 | p4 = relu(self.conv9.forward(p4)) 80 | p3 = relu(self.conv10.forward(p3)) 81 | p2 = relu(self.conv11.forward(p2)) 82 | 83 | return p2, p3, p4, p5 84 | 85 | def predict(self, x): 86 | self.work_mode = 'inference' 87 | 88 | c1 = relu(self.bn1.predict(self.conv1.predict(x))) # stride = 2 89 | c2 = relu(self.res_block1.predict(c1)) 90 | 91 | c3 = relu(self.bn2.predict(self.conv2.predict(c2))) # stride = 4 92 | c3 = relu(self.res_block2.predict(c3)) 93 | 94 | c4 = relu(self.bn3.predict(self.conv3.predict(c3))) # stride = 8 95 | c4 = relu(self.res_block3.predict(c4)) 96 | 97 | c5 = relu(self.bn4.predict(self.conv4.predict(c4))) # stride = 16 98 | c5 = relu(self.res_block4.predict(c5)) 99 | 100 | p5 = relu(self.conv5.predict(c5)) 101 | p4 = relu(self.conv6.predict(c4)) + upsample_2d(p5, ratio=2, mode='repeat') 102 | p3 = relu(self.conv7.predict(c3)) + upsample_2d(p4, ratio=2, mode='repeat') 103 | p2 = relu(self.conv8.predict(c2)) + upsample_2d(p3, ratio=2, mode='repeat') 104 | 105 | p4 = 
relu(self.conv9.predict(p4)) 106 | p3 = relu(self.conv10.predict(p3)) 107 | p2 = relu(self.conv11.predict(p2)) 108 | 109 | return p2, p3, p4, p5 110 | -------------------------------------------------------------------------------- /test/test_VGG16_weights.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | # Test VGG16 weights transferred from Lasagne 3 | # Created : mm, dd, yyyy 4 | # Revised : mm, dd, yyyy 5 | # All rights reserved 6 | #------------------------------------------------------------------------------------------------ 7 | __author__ = 'dawei.leng' 8 | 9 | import os 10 | os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'" 11 | import theano 12 | import lasagne 13 | from lasagne.layers import InputLayer, get_output 14 | from lasagne.layers import DenseLayer 15 | from lasagne.layers import NonlinearityLayer 16 | from lasagne.layers import DropoutLayer 17 | from lasagne.layers import Pool2DLayer as PoolLayer 18 | from lasagne.layers import Conv2DLayer as ConvLayer 19 | from lasagne.nonlinearities import softmax 20 | 21 | from dandelion.model.vgg import model_VGG16 22 | from dandelion.util import gpickle 23 | import dandelion 24 | dandelion_path = os.path.split(dandelion.__file__)[0] 25 | print('dandelion path = %s\n' % dandelion_path) 26 | 27 | 28 | def build_model_L(): 29 | net = {} 30 | net['input'] = InputLayer((None, 3, 224, 224)) 31 | net['conv1_1'] = ConvLayer( 32 | net['input'], 64, 3, pad=1, flip_filters=False) 33 | net['conv1_2'] = ConvLayer( 34 | net['conv1_1'], 64, 3, pad=1, flip_filters=False) 35 | net['pool1'] = PoolLayer(net['conv1_2'], 2) 36 | net['conv2_1'] = ConvLayer( 37 | net['pool1'], 128, 3, pad=1, flip_filters=False) 38 | net['conv2_2'] = ConvLayer( 39 | net['conv2_1'], 128, 3, pad=1, flip_filters=False) 40 | net['pool2'] = PoolLayer(net['conv2_2'], 2) 41 | net['conv3_1'] = ConvLayer( 42 | net['pool2'], 256, 3, pad=1, flip_filters=False) 43 | 
net['conv3_2'] = ConvLayer( 44 | net['conv3_1'], 256, 3, pad=1, flip_filters=False) 45 | net['conv3_3'] = ConvLayer( 46 | net['conv3_2'], 256, 3, pad=1, flip_filters=False) 47 | net['pool3'] = PoolLayer(net['conv3_3'], 2) 48 | net['conv4_1'] = ConvLayer( 49 | net['pool3'], 512, 3, pad=1, flip_filters=False) 50 | net['conv4_2'] = ConvLayer( 51 | net['conv4_1'], 512, 3, pad=1, flip_filters=False) 52 | net['conv4_3'] = ConvLayer( 53 | net['conv4_2'], 512, 3, pad=1, flip_filters=False) 54 | net['pool4'] = PoolLayer(net['conv4_3'], 2) 55 | net['conv5_1'] = ConvLayer( 56 | net['pool4'], 512, 3, pad=1, flip_filters=False) 57 | net['conv5_2'] = ConvLayer( 58 | net['conv5_1'], 512, 3, pad=1, flip_filters=False) 59 | net['conv5_3'] = ConvLayer( 60 | net['conv5_2'], 512, 3, pad=1, flip_filters=False) 61 | net['pool5'] = PoolLayer(net['conv5_3'], 2) 62 | net['fc6'] = DenseLayer(net['pool5'], num_units=4096) 63 | net['fc6_dropout'] = DropoutLayer(net['fc6'], p=0.5) 64 | net['fc7'] = DenseLayer(net['fc6_dropout'], num_units=4096) 65 | net['fc7_dropout'] = DropoutLayer(net['fc7'], p=0.5) 66 | net['fc8'] = DenseLayer( 67 | net['fc7_dropout'], num_units=1000, nonlinearity=None) 68 | net['prob'] = NonlinearityLayer(net['fc8'], softmax) 69 | 70 | return net 71 | 72 | build_model_D = model_VGG16 73 | 74 | # '_' prefix means no need for unit test 75 | def _test_case_0(): 76 | import numpy as np, pickle 77 | from lasagne_ext.utils import get_layer_by_name, set_weights, get_all_layers 78 | 79 | model_D = build_model_D() 80 | model_L = build_model_L() 81 | 82 | weight_file = r"C:\Users\dawei\Work\Code\Git\Reference Codes\Lasagne_Recipes\modelzoo\vgg16.pkl" 83 | with open(weight_file, mode='rb') as f: 84 | weights = pickle.load(f, encoding='latin1') 85 | lasagne.layers.set_all_param_values(model_L['prob'], weights['param values']) 86 | for layer_name in [ 87 | 'conv1_1', 88 | 'conv1_2', 89 | 'conv2_1', 90 | 'conv2_2', 91 | 'conv3_1', 92 | 'conv3_2', 93 | 'conv3_3', 94 | 'conv4_1', 95 | 
'conv4_2', 96 | 'conv4_3', 97 | 'conv5_1', 98 | 'conv5_2', 99 | 'conv5_3', 100 | ]: 101 | print('processing layer = ', layer_name) 102 | W = model_L[layer_name].W.get_value() 103 | b = model_L[layer_name].b.get_value() 104 | print('W.shape=', W.shape) 105 | print('b.shape=', b.shape) 106 | model_D.__getattribute__(layer_name).W.set_value(W) 107 | model_D.__getattribute__(layer_name).b.set_value(b) 108 | 109 | for layer_name in ['fc6', 'fc7', 'fc8']: 110 | print('processing layer = ', layer_name) 111 | W = model_L[layer_name].W.get_value() 112 | b = model_L[layer_name].b.get_value() 113 | print('W.shape=', W.shape) 114 | print('b.shape=', b.shape) 115 | model_D.__getattribute__(layer_name).W.set_value(W) 116 | model_D.__getattribute__(layer_name).b.set_value(b) 117 | 118 | gpickle.dump((model_D.get_weights(), None), 'VGG16_weights.gpkl') 119 | print('compiling...') 120 | X = model_L['input'].input_var 121 | y_D = model_D.predict(X) 122 | y_L = get_output(model_L['prob'], deterministic=True) 123 | 124 | fn_D = theano.function([X], y_D, no_default_updates=True) 125 | fn_L = theano.function([X], y_L, no_default_updates=True) 126 | print('run test...') 127 | for i in range(20): 128 | B, C, H, W = 4, 3, 224, 224 129 | x = np.random.rand(B, C, H, W).astype('float32') 130 | y_D = fn_D(x) 131 | y_L = fn_L(x) 132 | diff = np.sum(np.abs(y_D - y_L)) 133 | print('i=%d, diff=%0.6f' % (i, diff)) 134 | if diff > 1e-4: 135 | raise ValueError('diff is too big') 136 | 137 | 138 | 139 | 140 | 141 | 142 | 143 | 144 | 145 | if __name__ == '__main__': 146 | 147 | _test_case_0() 148 | 149 | print('Test pass ~') 150 | -------------------------------------------------------------------------------- /docs/dandelion_functional.md: -------------------------------------------------------------------------------- 1 | ## pool_1d 2 | Pooling 1 dimension along the given axis, support for any dimensional input. 
3 | ```python 4 | pool_1d(x, ws=2, ignore_border=True, stride=None, pad=0, mode='max', axis=-1) 5 | ``` 6 | * **ws**: scalar int. Factor by which to downsample the input 7 | * **ignore_border**: bool. When `True`, a dimension of size 5 with `ws`=2 produces an output of size 2; otherwise the output size is 3. 8 | * **stride**: scalar int. The number of shifts over rows/cols to get the next pool region. If `stride` is None, it is taken to be equal to `ws` (no overlap between pooling regions). E.g., `stride`=1 shifts the window by one element per step. 9 | * **pad**: pad zeros to extend beyond border of the input 10 | * **mode**: {`max`, `sum`, `average_inc_pad`, `average_exc_pad`}. Operation executed on each window. `max` and `sum` always exclude the padding in the computation. The two `average_*` modes let you choose whether to include or exclude it. 11 | * **axis**: scalar int. Specify along which axis the pooling will be done 12 | 13 | _______________________________________________________________________ 14 | ## pool_2d 15 | 2D pooling over the last 2 dimensions of the input. Supports input of any dimensionality with `ndim`>=2. 16 | ```python 17 | pool_2d(x, ws=(2,2), ignore_border=True, stride=None, pad=(0,0), mode='max') 18 | ``` 19 | * **ws**: scalar tuple. Factor by which to downsample the input 20 | * **ignore_border**: bool. When `True`, a (5,5) input with `ws`=(2,2) produces a (2,2) output; otherwise the output is (3,3). 21 | * **stride**: scalar tuple. The number of shifts over rows/cols to get the next pool region. If `stride` is None, it is taken to be equal to `ws` (no overlap between pooling regions). E.g., `stride`=(1,1) shifts the window by one row and one column per step. 22 | * **pad**: pad zeros to extend beyond border of the input 23 | * **mode**: {`max`, `sum`, `average_inc_pad`, `average_exc_pad`}. Operation executed on each window. `max` and `sum` always exclude the padding in the computation. The two `average_*` modes let you choose whether to include or exclude it. 
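The effect of `ws`, `stride` and `ignore_border` on the output size can be checked with a few lines of arithmetic. The sketch below is plain Python, not part of Dandelion; the helper `pooled_size` is a hypothetical name, and the formula assumes the usual Theano pooling shape rule (padding is simply added to both ends here, which is an additional simplification).

```python
def pooled_size(n, ws=2, stride=None, pad=0, ignore_border=True):
    """Hypothetical helper: output length along one pooled axis of length n."""
    if stride is None:
        stride = ws                  # default: non-overlapping windows
    n = n + 2 * pad                  # zero padding extends both ends
    if ignore_border:
        # only windows lying fully inside the (padded) input are kept
        return max((n - ws) // stride + 1, 0)
    # otherwise a trailing partial window yields one extra output element
    if stride >= ws:
        return (n - 1) // stride + 1
    return max(0, (n - 1 - ws + stride) // stride) + 1


print(pooled_size(5, ws=2))                          # 2
print(pooled_size(5, ws=2, ignore_border=False))     # 3
print(tuple(pooled_size(s, 2) for s in (224, 224)))  # (112, 112)
```

For `pool_2d` and `pool_3d` the same rule applies independently to each pooled axis, which is how a (5,5) input with `ws`=(2,2) ends up as (2,2) or (3,3).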
24 | 25 | _______________________________________________________________________ 26 | ## pool_3d 27 | 3D pooling over the last 3 dimensions of the input. Supports input of any dimensionality with `ndim`>=3. 28 | ```python 29 | pool_3d(x, ws=(2,2,2), ignore_border=True, stride=None, pad=(0,0,0), mode='max') 30 | ``` 31 | * **ws**: scalar tuple. Factor by which to downsample the input 32 | * **ignore_border**: bool. When `True`, a (5,5,5) input with `ws`=(2,2,2) produces a (2,2,2) output; otherwise the output is (3,3,3). 33 | * **stride**: scalar tuple. The number of shifts over rows/cols to get the next pool region. If `stride` is None, it is taken to be equal to `ws` (no overlap between pooling regions). 34 | * **pad**: pad zeros to extend beyond border of the input 35 | * **mode**: {`max`, `sum`, `average_inc_pad`, `average_exc_pad`}. Operation executed on each window. `max` and `sum` always exclude the padding in the computation. The two `average_*` modes let you choose whether to include or exclude it. 36 | 37 | _______________________________________________________________________ 38 | ## align_crop 39 | Align a list of tensors at each axis by specified rules and crop them to make axis concatenation possible. 40 | ```python 41 | align_crop(tensor_list, cropping) 42 | ``` 43 | * **tensor_list**: list of tensors to be processed; they must all have the same `ndim`. 44 | * **cropping**: list of cropping rules for each dimension. Acceptable rules include {`None`|`lower`|`upper`|`center`}. 
45 | * `None`: this axis is not cropped, tensors are unchanged in this axis 46 | * `lower`: tensors are cropped choosing the lower portion in this axis as `a[:crop_size, ...]` 47 | * `upper`: tensors are cropped choosing the upper portion in this axis as `a[-crop_size:, ...]` 48 | * `center`: tensors are cropped choosing the central portion in this axis as ``a[offset:offset+crop_size, ...]`` where ``offset = (a.shape[0]-crop_size)//2`` 49 | 50 | _______________________________________________________________________ 51 | ## spatial_pyramid_pooling 52 | Spatial pyramid pooling. This function pools the input at several pyramid scales to produce an output of fixed size regardless of the spatial size of the input. This is useful when CNN+FC is used for image classification or detection with variable-sized samples. 53 | ```python 54 | spatial_pyramid_pooling(x, pyramid_dims=(6, 4, 2, 1), mode='max', implementation='fast') 55 | ``` 56 | * **x**: 4D tensor with shape (B, C, H, W) 57 | * **pyramid_dims**: list or tuple of integers. Refer to Ref[1] for details. 58 | * **mode**: {`max`, `sum`, `average_inc_pad`, `average_exc_pad`}. Operation executed on each window. `max` and `sum` always exclude the padding in the computation. The two `average_*` modes let you choose whether to include or exclude it. 59 | * **implementation**: {`fast`, `fast_ls`, `stretch`}. 60 | * `fast`: The 'fast' implementation is fast and pads with zeros when the input size is too small. 61 | * `fast_ls`: The 'fast_ls' implementation is the same as Lasagne's fast implementation. The size of the input map MUST be larger than the output map size. 62 | * `stretch`: The 'stretch' implementation is slower. When the input size is smaller than the output size, the same feature is reused at several output positions, much like nearest neighbor interpolation. 63 | 64 | 65 | Ref [1]: He, Kaiming et al (2015), Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. 
[http://arxiv.org/pdf/1406.4729.pdf](http://arxiv.org/pdf/1406.4729.pdf) 66 | 67 | 68 | _______________________________________________________________________ 69 | ## upsample_2d 70 | 2D upsampling over the last 2 dimensions of the input. Supports input of any dimensionality with `ndim`>=2. Only integer upsampling ratios are supported. 71 | ```python 72 | upsample_2d(x, ratio, mode='repeat') 73 | ``` 74 | * **ratio**: must be an integer or tuple of integers >=1 75 | * **mode**: {`repeat`, `dilate`}. Repeat element values or upsample leaving zeroes between upsampled elements. Default `repeat`. 76 | 77 | _______________________________________________________________________ 78 | ## upsample_2d_bilinear 79 | Upsample 2D with bilinear interpolation. Supports fractional ratios, and applies only to 4D tensors. 80 | ```python 81 | upsample_2d_bilinear(x, ratio=None, frac_ratio=None, use_1D_kernel=True) 82 | ``` 83 | * **ratio**: must be an integer or tuple of integers >=1. You can only specify either `ratio` or `frac_ratio`, not both. 84 | * **frac_ratio**: None, tuple of int or tuple of tuples of int. A fractional upsampling scale is described by (numerator, denominator). 85 | * **use_1D_kernel**: only affects speed. 86 | 87 | Note: due to Theano's implementation, when the upsampling ratio is even, the last row and column are repeated one extra time compared to the first row and column, which makes the upsampled tensor asymmetrical on both sides. This does not happen when the upsampling ratio is odd. 88 | 89 | _______________________________________________________________________ 90 | ## channel_shuffle 91 | Pseudo channel shuffling by dimshuffle & reshape, first introduced in [ShuffleNet](https://arxiv.org/abs/1707.01083) 92 | ```python 93 | channel_shuffle(x, group_num) 94 | ``` 95 | * **x**: 4D tensor, with shape `(B, C, H, W)`, usually output of a 2D convolution. 
96 | * **group_num**: int scalar, and `C` must be divisible by `group_num` 97 | -------------------------------------------------------------------------------- /dandelion/model/AlternateLSTM2D.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | """ 3 | LSTM2D implementation by alternating LSTM along different dimensions 4 | Created : 11, 9, 2018 5 | Revised : 11, 9, 2018 6 | All rights reserved 7 | """ 8 | # ------------------------------------------------------------------------------------------------ 9 | __author__ = 'dawei.leng' 10 | 11 | import theano.tensor as tensor 12 | from ..module import * 13 | from ..functional import * 14 | from ..activation import * 15 | 16 | class Alternate_2D_LSTM(Module): 17 | """ 18 | 2D LSTM implementaton by alternating 1D LSTM along different input dimensions 19 | Input shape = (H, W, B, C) 20 | 21 | """ 22 | def __init__(self, input_dims, hidden_dim, peephole=True, initializer=init.Normal(0.1), grad_clipping=0, hidden_activation=tanh, 23 | learn_ini=False, truncate_gradient=-1, mode=2): 24 | """ 25 | :param input_dims: integer or list of integers, dimension of the input for different part, allows to have different 26 | initializations for different parts of the input. 27 | :param hidden_dim: 28 | :param initializer: 29 | :param peephole: whether add peephole connection 30 | :param grad_clipping: float. Hard clip the gradients at each time step. Only the gradient values 31 | above this threshold are clipped to the threshold. This is done during backprop. 32 | :param hidden_activation: nonlinearity applied to hidden variable, i.e., h = out_gate * hidden_activation(cell). It's recommended to use `tanh` as default. 33 | :param learn_ini: If True, initial hidden values will be learned. 34 | :param truncate_gradient: if not -1, BPTT will be used, gradient back-propagation will be performed at most `truncate_gradient` steps 35 | :param mode: {0|1|2}. 
0: 1D LSTM results from horizontal and vertical dimensions are concatenated along the `C` dimension; 1: horizontal and vertical dimensions are processed 36 | sequentially, i.e., the final result = horizontal_LSTM(vertical_LSTM(input)); 2: mixed mode: final result = horizontal_LSTM(concat(input, vertical_LSTM(input))) 37 | 38 | """ 39 | super().__init__() 40 | self.hidden_dim = hidden_dim 41 | self.mode = mode 42 | if not isinstance(input_dims, (tuple, list)): 43 | input_dims = [input_dims] 44 | self.lstm_h = LSTM(input_dims=input_dims, hidden_dim=hidden_dim, peephole=peephole, initializer=initializer, grad_clipping=grad_clipping, hidden_activation=hidden_activation, 45 | learn_ini=learn_ini, truncate_gradient=truncate_gradient) 46 | if self.mode == 0: 47 | lstm_w_inputdim = input_dims 48 | elif self.mode == 1: 49 | lstm_w_inputdim = hidden_dim 50 | elif self.mode == 2: 51 | lstm_w_inputdim = input_dims + [hidden_dim] 52 | else: 53 | raise ValueError('Invalid mode input: should be among {0, 1, 2}') 54 | self.lstm_w = LSTM(input_dims=lstm_w_inputdim, hidden_dim=hidden_dim, peephole=peephole, initializer=initializer, grad_clipping=grad_clipping, hidden_activation=hidden_activation, 55 | learn_ini=learn_ini, truncate_gradient=truncate_gradient) 56 | self.predict = self.forward 57 | 58 | def forward(self, seq_input, h_ini=(None, None), c_ini=(None, None), seq_mask=None, backward=(False, False), return_final_state=False): 59 | """ 60 | :param seq_input: (H, W, B, input_dim) 61 | :param h_ini: tuple of matrix (B, hidden_dim) or None, if None, then learned self.h_ini will be used 62 | :param c_ini: tuple of matrix (B, hidden_dim) or None, if None, then learned self.c_ini will be used 63 | :param seq_mask: (H, W, B) 64 | :param backward: tuple of False/True 65 | :param only_return_final: If True, only return the final sequential output 66 | :param return_final_state: If True, the final state of `hidden` and `cell` will be returned, both (B, hidden_dim) 67 | :return: (H, W, 
B, hidden_dim) if mode= 1 or 2, (H, W, B, 2*hidden_dim) if mode = 0 68 | """ 69 | H, W, B, C = seq_input.shape 70 | h_ini, c_ini, backward = as_tuple(h_ini, 2), as_tuple(c_ini, 2), as_tuple(backward, 2) 71 | x = tensor.reshape(seq_input, (H, W*B, C)) 72 | output_h = self.lstm_h.forward(x, h_ini=h_ini[0], c_ini=c_ini[0], seq_mask=seq_mask, backward=backward[0], return_final_state=return_final_state) 73 | 74 | if self.mode == 0: 75 | x = seq_input.dimshuffle(1, 0, 2, 3) # (W, H, B, C) 76 | x = tensor.reshape(x, (W, H*B, C)) 77 | output_w = self.lstm_w.forward(x, h_ini=h_ini[1], c_ini=c_ini[1], seq_mask=seq_mask, backward=backward[1], return_final_state=return_final_state) 78 | if return_final_state: 79 | x_h, x_w = output_h[0], output_w[0] 80 | else: 81 | x_h, x_w = output_h, output_w 82 | x_h, x_w = tensor.reshape(x_h, (H, W, B, -1)), tensor.reshape(x_w, (W, H, B, -1)) 83 | x_w = x_w.dimshuffle(1, 0, 2, 3) 84 | x = tensor.concatenate([x_h, x_w], axis=3) 85 | if return_final_state: 86 | return x, output_h[1:], output_w[1:] 87 | else: 88 | return x 89 | 90 | elif self.mode == 1: 91 | if return_final_state: 92 | x = output_h[0] # (H, W*B, hidden_dim) 93 | else: 94 | x = output_h 95 | x = tensor.reshape(x, (H, W, B, self.hidden_dim)) 96 | x = x.dimshuffle(1, 0, 2, 3) 97 | x = tensor.reshape(x, (W, H*B, self.hidden_dim)) 98 | output_w = self.lstm_w.forward(x, h_ini=h_ini[1], c_ini=c_ini[1], seq_mask=seq_mask, backward=backward[1], return_final_state=return_final_state) 99 | if return_final_state: 100 | x = tensor.reshape(output_w[0], (W, H, B, -1)) 101 | x = x.dimshuffle(1, 0, 2, 3) # (H, W, B, hidden_dim) 102 | return x, output_h[1:], output_w[1:] 103 | else: 104 | x = tensor.reshape(output_w, (W, H, B, -1)) 105 | x = x.dimshuffle(1, 0, 2, 3) # (H, W, B, hidden_dim) 106 | return x 107 | 108 | else: 109 | if return_final_state: 110 | x = output_h[0] # (H, W*B, hidden_dim) 111 | else: 112 | x = output_h 113 | x = tensor.reshape(x, (H, W, B, self.hidden_dim)) # (H, W, B, 
hidden_dim) 114 | x = tensor.concatenate([seq_input, x], axis=3) 115 | x = x.dimshuffle(1, 0, 2, 3) 116 | x = tensor.reshape(x, (W, H*B, -1)) 117 | output_w = self.lstm_w.forward(x, h_ini=h_ini[1], c_ini=c_ini[1], seq_mask=seq_mask, backward=backward[1], return_final_state=return_final_state) 118 | if return_final_state: 119 | x = tensor.reshape(output_w[0], (W, H, B, -1)) 120 | x = x.dimshuffle(1, 0, 2, 3) # (H, W, B, hidden_dim) 121 | return x, output_h[1:], output_w[1:] 122 | else: 123 | x = tensor.reshape(output_w, (W, H, B, -1)) 124 | x = x.dimshuffle(1, 0, 2, 3) # (H, W, B, hidden_dim) 125 | return x 126 | 127 | 128 | -------------------------------------------------------------------------------- /test/test_spatial_pyramid_pooling.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | # Test for spatial_pyramid pooling 3 | # Created : 7, 5, 2018 4 | # Revised : 7, 5, 2018 5 | # All rights reserved 6 | #------------------------------------------------------------------------------------------------ 7 | __author__ = 'dawei.leng' 8 | import os, sys 9 | # os.environ['THEANO_FLAGS'] = "floatX=float64, mode=FAST_RUN" 10 | os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'" 11 | 12 | import theano 13 | from theano import tensor 14 | from dandelion.module import * 15 | from dandelion.functional import spatial_pyramid_pooling 16 | from lasagne.layers import InputLayer, get_output, SpatialPyramidPoolingLayer 17 | import dandelion 18 | dandelion_path = os.path.split(dandelion.__file__)[0] 19 | print('dandelion path = %s\n' % dandelion_path) 20 | 21 | class build_model_D(Module): 22 | def __init__(self, pyramid_dims=[6, 4, 2, 1]): 23 | super().__init__() 24 | self.pyramid_dims = pyramid_dims 25 | self.predict = self.forward 26 | 27 | def forward(self, x): 28 | """ 29 | 30 | :param x: (B, C, H, W) 31 | :return: 32 | """ 33 | x = spatial_pyramid_pooling(x, pyramid_dims=self.pyramid_dims, 
implementation='fast_ls') 34 | return x 35 | 36 | def build_model_L(pyramid_dims=[6, 4, 2, 1]): 37 | input_var = tensor.ftensor4('x') # (B, C, H, W) 38 | input0 = InputLayer(shape=(None, None, None, None), input_var=input_var, name='input0') 39 | x = SpatialPyramidPoolingLayer(input0, pool_dims=pyramid_dims, implementation='fast') 40 | return x 41 | 42 | def test_case_0(): 43 | import numpy as np 44 | from lasagne_ext.utils import get_layer_by_name 45 | np.random.seed(0) 46 | 47 | pyramid_dims = [3, 2, 1] 48 | 49 | model_D = build_model_D(pyramid_dims=pyramid_dims) 50 | model_L = build_model_L(pyramid_dims=pyramid_dims) 51 | 52 | X = get_layer_by_name(model_L, 'input0').input_var 53 | y_D = model_D.forward(X) 54 | y_L = get_output(model_L) 55 | 56 | fn_D = theano.function([X], y_D, no_default_updates=True, on_unused_input='ignore') 57 | fn_L = theano.function([X], y_L, no_default_updates=True, on_unused_input='ignore') 58 | 59 | for i in range(20): 60 | B = np.random.randint(low=1, high=33) 61 | C = np.random.randint(low=1, high=32) 62 | H = np.random.randint(low=5, high=512) 63 | W = np.random.randint(low=5, high=513) 64 | x = np.random.rand(B, C, H, W).astype(np.float32) - 0.5 65 | y_D= fn_D(x) 66 | y_L = fn_L(x) 67 | # print(y_D) 68 | diff = np.max(np.abs(y_D - y_L)) 69 | print('i=%d, diff=%0.6f' % (i, diff)) 70 | if diff>1e-4: 71 | print(y_D.shape) 72 | print(y_L.shape) 73 | print('B=%d, C=%d, H=%d, W=%d\n' % (B, C, H, W)) 74 | print('y_D=\n', y_D) 75 | print('y_L=\n', y_L) 76 | raise ValueError('diff is too big') 77 | 78 | 79 | class build_model_D_1(Module): 80 | def __init__(self, pyramid_dims=[6, 4, 2, 1]): 81 | super().__init__() 82 | self.pyramid_dims = pyramid_dims 83 | self.predict = self.forward 84 | 85 | def forward(self, x): 86 | """ 87 | 88 | :param x: (B, C, H, W) 89 | :return: 90 | """ 91 | x = spatial_pyramid_pooling(x, pyramid_dims=self.pyramid_dims, implementation='fast') 92 | return x 93 | 94 | def build_model_L_1(pyramid_dims=[6, 4, 2, 1]): 95 
| input_var = tensor.ftensor4('x') # (B, C, H, W) 96 | input0 = InputLayer(shape=(None, None, None, None), input_var=input_var, name='input0') 97 | x = SpatialPyramidPoolingLayer(input0, pool_dims=pyramid_dims, implementation='fast') 98 | return x 99 | 100 | 101 | def test_case_1(): 102 | import numpy as np 103 | from lasagne_ext.utils import get_layer_by_name 104 | np.random.seed(0) 105 | 106 | pyramid_dims = [4] 107 | 108 | model_D = build_model_D_1(pyramid_dims=pyramid_dims) 109 | model_L = build_model_L_1(pyramid_dims=pyramid_dims) 110 | 111 | X = get_layer_by_name(model_L, 'input0').input_var 112 | y_D = model_D.forward(X) 113 | # y_L = get_output(model_L) 114 | 115 | fn_D = theano.function([X], y_D, no_default_updates=True, on_unused_input='ignore') 116 | # fn_L = theano.function([X], y_L, no_default_updates=True, on_unused_input='ignore') 117 | 118 | x = np.array([[[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]]]) 119 | x = x.astype(np.float32) 120 | # y = fn_L(x) # Lasagne will crash 121 | y = fn_D(x) 122 | y_expect = np.array([[[[0.1, 0.2, 0.3, 0.0], [0.4, 0.5, 0.6, 0.0], [0.7, 0.8, 0.9, 0.0], [0.0, 0.0, 0.0, 0.0]]]]) 123 | y_expect = y_expect.astype(np.float32) 124 | y_expect = y_expect.reshape((1, 1, -1)) 125 | diff = np.max(np.abs(y - y_expect)) 126 | print('diff=%0.6f' % diff) 127 | if diff > 1e-4: 128 | raise ValueError('diff is too big') 129 | 130 | 131 | class build_model_D_2(Module): 132 | def __init__(self, pyramid_dims=[6, 4, 2, 1]): 133 | super().__init__() 134 | self.pyramid_dims = pyramid_dims 135 | self.predict = self.forward 136 | 137 | def forward(self, x): 138 | """ 139 | 140 | :param x: (B, C, H, W) 141 | :return: 142 | """ 143 | x = spatial_pyramid_pooling(x, pyramid_dims=self.pyramid_dims, implementation='stretch') 144 | return x 145 | 146 | def build_model_L_2(pyramid_dims=[6, 4, 2, 1]): 147 | input_var = tensor.ftensor4('x') # (B, C, H, W) 148 | input0 = InputLayer(shape=(None, None, None, None), input_var=input_var, 
name='input0') 149 | x = SpatialPyramidPoolingLayer(input0, pool_dims=pyramid_dims, implementation='kaiming') 150 | return x 151 | 152 | 153 | def test_case_2(): 154 | import numpy as np 155 | from lasagne_ext.utils import get_layer_by_name 156 | np.random.seed(0) 157 | 158 | pyramid_dims = [4] 159 | 160 | model_D = build_model_D_2(pyramid_dims=pyramid_dims) 161 | model_L = build_model_L_2(pyramid_dims=pyramid_dims) 162 | 163 | X = get_layer_by_name(model_L, 'input0').input_var 164 | y_D = model_D.forward(X) 165 | # y_L = get_output(model_L) 166 | 167 | fn_D = theano.function([X], y_D, no_default_updates=True, on_unused_input='ignore') 168 | # fn_L = theano.function([X], y_L, no_default_updates=True, on_unused_input='ignore') 169 | 170 | x = np.array([[[[0.1, 0.2], [0.3, 0.4]]]]) 171 | x = x.astype(np.float32) 172 | # y = fn_L(x) # Lasagne will use float64 173 | y = fn_D(x) 174 | y_expect = np.array([[[[0.1, 0.1, 0.2, 0.2], [0.1, 0.1, 0.2, 0.2], [0.3, 0.3, 0.4, 0.4], [0.3, 0.3, 0.4, 0.4]]]]) 175 | y_expect = y_expect.astype(np.float32) 176 | y_expect = y_expect.reshape((1, 1, -1)) 177 | diff = np.max(np.abs(y - y_expect)) 178 | print('diff=%0.6f' % diff) 179 | if diff > 1e-4: 180 | raise ValueError('diff is too big') 181 | 182 | x = np.array([[[[0.7, 0.2, 0.3], [0.5, 0.6, 0.4], [0.1, 0.9, 0.8]]]]) 183 | x = x.astype(np.float32) 184 | y = fn_D(x) 185 | y_expect = np.array([[[[0.7, 0.7, 0.3, 0.3], [0.7, 0.7, 0.6, 0.4], [0.5, 0.9, 0.9, 0.8], [0.1, 0.9, 0.9, 0.8]]]]) 186 | y_expect = y_expect.astype(np.float32) 187 | y_expect = y_expect.reshape((1, 1, -1)) 188 | diff = np.max(np.abs(y - y_expect)) 189 | print('diff=%0.6f' % diff) 190 | if diff > 1e-4: 191 | raise ValueError('diff is too big') 192 | 193 | 194 | if __name__ == '__main__': 195 | 196 | test_case_0() 197 | 198 | test_case_1() 199 | 200 | test_case_2() 201 | 202 | print('Test passed') 203 | 204 | 205 | 206 | -------------------------------------------------------------------------------- /docs/tutorial II - 
Write Your Own Module.md: -------------------------------------------------------------------------------- 1 | # Tutorial II: Write Your Own Module 2 | 3 | In this tutorial, you’ll learn how to write your own neural network module with the help of Dandelion. Here we’ll design a module which gives the class centers for classification output. It’s a simple case for Dandelion yet not so intuitive for Lasagne or Keras users. 4 | 5 | In image classification tasks, such as face recognition, document image classification, ImageNet contests, etc., we usually consider only the “positive” samples, i.e., we assume that any input sample is associated with at least one of the known class labels. However, in actual applications, we often also want the trained neural network model to be able to tell whether an input sample is an “outsider” or not. 6 | 7 | To accomplish this task, we can add an extra “negative” class to the final layer of the network, and then train this augmented network by feeding it with all kinds of “negative” samples you can collect. It’s purely data-driven, so the bottleneck is how many “negative” samples can be collected. 8 | 9 | Another way is algorithm-driven: we design a new network module to explore the intrinsic properties of the data, and use these “properties” to reject or accept a sample as “positive”. This way we do not need to collect negative samples, and the model is more general and, most importantly, explainable. 10 | 11 | The intrinsic property of the data to explore here is the class center of each positive class. The intuition is that if we can get the center of each class, then we can use the sample-center distance to reject or accept a sample as “positive”. 12 | 13 | Now assume that the last layer of the neural network is a `Dense` module followed by a `softmax` activation which produces `N` class decisions.
We’ll refer to the input of this `Dense` module as the feature of the input sample (extracted by the preceding part of the neural network). For a plain network trained with only positive samples, the feature distribution can typically be visualized as 14 | 15 | ![fig1](center_1.png) 16 | 17 | * *A Discriminative Feature Learning Approach for Deep Face Recognition*. Yandong Wen, Kaipeng Zhang, Zhifeng Li and Yu Qiao. European Conference on Computer Vision (ECCV) 2016 18 | 19 | ## Center Loss 20 | 21 | Apparently the feature extracted by the plain model is not well centered; in other words, the feature distribution is not well-formed. 22 | 23 | Ideally, to reject or accept one sample as a certain class, we can set a probability threshold so that any sample whose feature satisfies 24 | $p(f_j \mid C_i) < T_i$ will be rejected as an “outsider” for this class with certainty $1 - T_i$ 25 | 26 | But before we can do this, the distribution $p(f \mid C_i)$ must be known. To get this conditional distribution, we can either traverse all the training samples and use any probability estimation / modelling method to approximate the true distribution, or we can resort to the DL method by directly requiring the neural network to produce features satisfying predefined distributions. 27 | 28 | We can do this because a neural network can be trained to emulate any nonlinear function, and we can always transform a compact distribution into a Gaussian by a certain function. 29 | 30 | To constrain the neural network to extract Gaussian distributed features, we assume each class has a mean feature vector (i.e., center) $f_{\mu_i}$ and require the model to minimize the distance between the extracted feature and its corresponding center vector, i.e., 31 | 32 | $\min \|f_j - f_{\mu_i}\|^2$ if sample $j$ belongs to class $i$ 33 | 34 | We refer to this objective as “center loss”; the details can be found in Ref. [A Discriminative Feature Learning Approach for Deep Face Recognition.
Yandong Wen, Kaipeng Zhang, Zhifeng Li and Yu Qiao. European Conference on Computer Vision (ECCV) 2016]. The model is now trained with both the categorical cross entropy loss and the center loss as 35 | 36 | $\min \; \mathrm{CategoricalCrossEntropy} + \lambda \cdot \mathrm{CenterLoss}$ 37 | 38 | ![fig2](center_2.png) 39 | 40 | ## `Center` Module 41 | Now we’ll go through the code to illustrate how the center loss can actually be computed. To compute the center loss, we first need an estimate of the center of each class. This is done through a new module referred to as `Center`. See the following code snippet. 42 | 43 | ```python 44 | class Center(Module): 45 | """ 46 | Compute the class centers during training 47 | Ref. to "Discriminative feature learning approach for deep face recognition (2016)" 48 | """ 49 | def __init__(self, feature_dim, center_num, alpha=0.9, center=init.GlorotUniform(), name=None): 50 | """ 51 | :param alpha: moving averaging coefficient 52 | :param center: initial value of center 53 | """ 54 | super().__init__(name=name) 55 | self.center = self.register_self_updating_variable(center, shape=[center_num, feature_dim], name="center") 56 | self.alpha = alpha 57 | 58 | def forward(self, features, labels): 59 | """ 60 | :param features: (B, D) 61 | :param labels: (B,) 62 | :return: categorical centers 63 | """ 64 | center_batch = self.center[labels, :] 65 | diff = (self.alpha - 1.0) * (center_batch - features) 66 | center_updated = tensor.inc_subtensor(self.center[labels, :], diff) 67 | self.center.default_update = center_updated 68 | return self.center 69 | 70 | def predict(self): 71 | return self.center 72 | 73 | ``` 74 | 75 | First, all our NN modules should subclass the root `Module` class, then we can use class methods and attributes to manipulate network parameters conveniently. 76 | 77 | Second, define the module initialization in `.__init__()` part.
Here we do two things: we register a `center` tensor as a network parameter and initialize it with a Glorot uniform random numpy array. The `center` tensor is of shape `(center_num, feature_dim)`, in which `center_num` should be equal to the number of classes, and `feature_dim` is the dimension of the features extracted by the network. 78 | 79 | In Dandelion, the network parameters are divided into two categories: 80 | 81 | * 1) parameters to be updated by the optimizer, 82 | * 2) parameters to be updated by a user-defined expression. 83 | 84 | The former parameters should be registered with class method `.register_param()`, and the latter parameters should be registered with class method `.register_self_updating_variable()`. 85 | 86 | Now that we have registered the `center` tensor as a self-updating variable, its updating expression is given in the `.forward()` function as `self.center.default_update = center_updated`. In Dandelion we use a specially named attribute `.default_update` to tell the framework that this parameter has an updating expression defined, and the updating expression will be collected during the Theano function compiling phase. 87 | 88 | The `.forward()` function will be used for training, and the `.predict()` function will be used for inference. 89 | 90 | Basically, during training, the `.forward()` function computes a moving-average estimate of the class centers; during inference, we just use the stored center values as the final estimated class centers. This is very similar to how `BatchNorm`’s mean and std are estimated and used. 91 | 92 | ## Summary 93 | 94 | To summarize, to write your own module, you only need to do the following three steps: 95 | 96 | * 1) subclass `Module` class 97 | * 2) register your module’s parameters by `.register_param()` or `.register_self_updating_variable()` and initialize them 98 | * 3) define the `.forward()` function for training and `.predict()` function for inference 99 | 100 | and that’s it!
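As a rough illustration of what the `Center` module computes, the sketch below re-implements the moving-average update from `.forward()` and a center loss using plain Python lists instead of Theano tensors. The function names `update_centers` and `center_loss`, and the choice to average the loss over the batch, are assumptions made for this example; they are not part of Dandelion.

```python
def update_centers(centers, features, labels, alpha=0.9):
    """Return new centers after one moving-average step (plain-Python sketch).

    centers:  list of center_num lists, each of length feature_dim
    features: list of B feature vectors
    labels:   list of B integer class indices
    """
    new_centers = [row[:] for row in centers]  # copy, the input stays untouched
    for f, k in zip(features, labels):
        old = centers[k]  # center as it was before this batch
        for d in range(len(f)):
            # same expression as in the tutorial: diff = (alpha - 1) * (center - feature)
            new_centers[k][d] += (alpha - 1.0) * (old[d] - f[d])
    return new_centers


def center_loss(centers, features, labels):
    """Squared feature-to-center distance, averaged over the batch (an assumption)."""
    total = 0.0
    for f, k in zip(features, labels):
        total += sum((fd - cd) ** 2 for fd, cd in zip(f, centers[k]))
    return total / len(features)


# Hypothetical toy data: 2 classes, 2-dimensional features, one sample per class
centers = [[0.0, 0.0], [1.0, 1.0]]
features = [[1.0, 1.0], [3.0, 1.0]]
labels = [0, 1]

loss_before = center_loss(centers, features, labels)
new_centers = update_centers(centers, features, labels, alpha=0.9)
loss_after = center_loss(new_centers, features, labels)
```

With `alpha=0.9`, each center moves 10% of the way toward every sample of its class in the batch, so `loss_after` is smaller than `loss_before` on this toy data. In this sketch, several samples with the same label each contribute their own increment.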
101 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Dandelion 2 | [![PyPI version](https://badge.fury.io/py/Dandelion.svg)](https://badge.fury.io/py/Dandelion) 3 | [![License: MPL 2.0](https://img.shields.io/badge/license-MPL%202.0-brightgreen.svg)](https://github.com/david-leon/Dandelion/blob/master/LICENSE) 4 | [![Python 3.x](https://img.shields.io/badge/python-3.x-brightgreen.svg)](https://www.python.org/downloads/release) 5 | [![Travis CI](https://app.travis-ci.com/david-leon/Dandelion.svg?branch=master)](https://travis-ci.org/david-leon/Dandelion) 6 | 7 | A lightweight deep learning framework built on top of Theano, offering a better balance between flexibility and abstraction 8 | 9 | ## Targeted Users 10 | Researchers who need flexibility as well as convenience to experiment with all kinds of *nonstandard* network structures, and also the stability of Theano. 11 | 12 | ## Featuring 13 | * **Aiming to offer better balance between flexibility and abstraction.** 14 | * Easy to use and extend, support for any neural network structure. 15 | * Loose coupling, each part of the framework can be modified independently. 16 | * **More like a handy library of deep learning modules.** 17 | * Common modules such as CNN, LSTM, GRU, Dense, Dropout, Batch Normalization, and common optimization methods such as SGD, Adam, Adadelta, RMSprop are ready out-of-the-box. 18 | * **Plug & play, operating directly on Theano tensors, no upper abstraction applied.** 19 | * Unlike previous frameworks like Keras, Lasagne, etc., Dandelion operates directly on tensors instead of layer abstractions, making it quite easy to plug in third-party deep learning modules (layers defined by Keras/Lasagne) or vice versa.
20 | 21 | ## Documentation 22 | Documentation is available online: [https://david-leon.github.io/Dandelion/](https://david-leon.github.io/Dandelion/) 23 | 24 | ## Install 25 | Use pip channel for stable release 26 | ``` 27 | pip install dandelion --upgrade 28 | ``` 29 | or install from source to get the up-to-date version: 30 | ``` 31 | pip install git+https://github.com/david-leon/Dandelion.git 32 | ``` 33 | 34 | Dependency 35 | * Theano >=1.0 36 | * Scipy (required by `dandelion.ext.CV`) 37 | * Pillow (required by `dandelion.ext.CV`) 38 | * OpenCV (required by `dandelion.ext.CV`) 39 | 40 | ## Quick Tour 41 | ```python 42 | import theano 43 | import theano.tensor as tensor 44 | from dandelion.module import * 45 | from dandelion.update import * 46 | from dandelion.functional import * 47 | from dandelion.util import gpickle 48 | 49 | class model(Module): 50 | def __init__(self, batchsize=None, input_length=None, Nclass=6, noise=(0.5, 0.2, 0.7, 0.7, 0.7)): 51 | super().__init__() 52 | self.batchsize = batchsize 53 | self.input_length = input_length 54 | self.Nclass = Nclass 55 | self.noise = noise 56 | 57 | self.dropout0 = Dropout() 58 | self.dropout1 = Dropout() 59 | self.dropout2 = Dropout() 60 | self.dropout3 = Dropout() 61 | self.dropout4 = Dropout() 62 | W = gpickle.load('word_embedding(6336, 256).gpkl') 63 | self.embedding = Embedding(num_embeddings=6336, embedding_dim=256, W=W) 64 | self.lstm0 = LSTM(input_dims=256, hidden_dim=100) 65 | self.lstm1 = LSTM(input_dims=256, hidden_dim=100) 66 | self.lstm2 = LSTM(input_dims=200, hidden_dim=100) 67 | self.lstm3 = LSTM(input_dims=200, hidden_dim=100) 68 | self.lstm4 = LSTM(input_dims=200, hidden_dim=100) 69 | self.lstm5 = LSTM(input_dims=200, hidden_dim=100) 70 | self.dense = Dense(input_dims=200, output_dim=Nclass) 71 | 72 | def forward(self, x): 73 | self.work_mode = 'train' 74 | x = self.dropout0.forward(x, p=self.noise[0], rescale=False) 75 | x = self.embedding.forward(x) # (B, T, D) 76 | 77 | x = 
self.dropout1.forward(x, p=self.noise[1], rescale=True) 78 | x = x.dimshuffle((1, 0, 2)) # (B, T, D) -> (T, B, D) 79 | x_f = self.lstm0.forward(x, None, None, None) 80 | x_b = self.lstm1.forward(x, None, None, None, backward=True) 81 | x = tensor.concatenate([x_f, x_b], axis=2) 82 | 83 | x = pool_1d(x, ws=2, ignore_border=True, mode='average_exc_pad', axis=0) 84 | 85 | x = self.dropout2.forward(x, p=self.noise[2], rescale=True) 86 | x_f = self.lstm2.forward(x, None, None, None) 87 | x_b = self.lstm3.forward(x, None, None, None, backward=True) 88 | x = tensor.concatenate([x_f, x_b], axis=2) 89 | 90 | x = self.dropout3.forward(x, p=self.noise[3], rescale=True) 91 | x_f = self.lstm4.forward(x, None, None, None, only_return_final=True) 92 | x_b = self.lstm5.forward(x, None, None, None, only_return_final=True, backward=True) 93 | x = tensor.concatenate([x_f, x_b], axis=1) 94 | 95 | x = self.dropout4.forward(x, p=self.noise[4], rescale=True) 96 | y = sigmoid(self.dense.forward(x)) 97 | return y 98 | 99 | def predict(self, x): 100 | self.work_mode = 'inference' 101 | x = self.embedding.predict(x) 102 | 103 | x = x.dimshuffle((1, 0, 2)) # (B, T, D) -> (T, B, D) 104 | x_f = self.lstm0.predict(x, None, None, None) 105 | x_b = self.lstm1.predict(x, None, None, None, backward=True) 106 | x = tensor.concatenate([x_f, x_b], axis=2) 107 | 108 | x = pool_1d(x, ws=2, ignore_border=True, mode='average_exc_pad', axis=0) 109 | 110 | x_f = self.lstm2.predict(x, None, None, None) 111 | x_b = self.lstm3.predict(x, None, None, None, backward=True) 112 | x = tensor.concatenate([x_f, x_b], axis=2) 113 | 114 | x_f = self.lstm4.predict(x, None, None, None, only_return_final=True) 115 | x_b = self.lstm5.predict(x, None, None, None, only_return_final=True, backward=True) 116 | x = tensor.concatenate([x_f, x_b], axis=1) 117 | 118 | y = sigmoid(self.dense.predict(x)) 119 | return y 120 | ``` 121 | 122 | ## Why Another DL Framework 123 | * The reason is more about the lack of flexibility for 
existing DL frameworks, such as Keras, Lasagne, Blocks, etc. 124 | * By **“flexibility”**, we mean whether it is easy to modify or extend the framework. 125 | * The famous DL framework Keras is designed to be beginner-friendly, at the cost of being quite hard to modify. 126 | * Compared to Keras, another less-famous framework Lasagne provides more flexibility. It’s easier to write your own layer with Lasagne for small neural networks; however, for complex neural networks it still requires considerable manual work because, like other existing frameworks, Lasagne operates on an abstracted ‘Layer’ class instead of raw tensor variables. 127 | 128 | ## Project Layout 129 | Python Module | Explanation 130 | ----------------- | ---------------- 131 | module | all neural network module definitions 132 | functional | operations on tensor with no parameter to be learned 133 | initialization | initialization methods for neural network modules 134 | activation | definition of all activation functions 135 | objective | definition of all loss objectives 136 | update | definition of all optimizers 137 | util | utility functions 138 | model | model implementations out-of-the-box 139 | ext | extensions 140 | 141 | ## Credits 142 | The design of Dandelion heavily draws on [Lasagne](https://github.com/Lasagne/Lasagne) and [PyTorch](http://pytorch.org/), both my favorite DL libraries. 143 | Special thanks to **Radomir Dopieralski**, who transferred the `dandelion` project name on PyPI to us. Now you can install the package by simply running `pip install dandelion`. 144 | -------------------------------------------------------------------------------- /test/test_shufflenet.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | # Test for Shuffle-net, partial.
3 | # Created : 7, 9, 2018 4 | # Revised : 7, 9, 2018 5 | # All rights reserved 6 | #------------------------------------------------------------------------------------------------ 7 | __author__ = 'dawei.leng' 8 | import os 9 | os.environ['THEANO_FLAGS'] = "floatX=float32, mode=FAST_RUN, warn_float64='raise'" 10 | # os.environ['THEANO_FLAGS'] = "floatX=float32, mode=DEBUG_MODE, warn_float64='raise', exception_verbosity=high" 11 | 12 | import theano, sys 13 | sys.setrecursionlimit(40000) 14 | from theano import tensor 15 | from dandelion.module import * 16 | from dandelion.activation import * 17 | from dandelion.model.shufflenet import * 18 | from dandelion.objective import * 19 | from dandelion.update import * 20 | 21 | import dandelion 22 | dandelion_path = os.path.split(dandelion.__file__)[0] 23 | print('dandelion path = %s\n' % dandelion_path) 24 | 25 | def test_case_0(): 26 | print('test_case_0: ShuffleUnit') 27 | model = ShuffleUnit(in_channels=16, inner_channels=4, border_mode='same', batchnorm_mode=0, activation=relu, group_num=4, fusion_mode='add', dilation=2, stride=2) 28 | x = tensor.ftensor4('x') 29 | y = model.forward(x) 30 | # y = tensor.nnet.conv2d() 31 | print('compiling fn...') 32 | fn = theano.function([x], y, no_default_updates=False) 33 | print('run fn...') 34 | input = np.random.rand(4, 16, 256, 256).astype(np.float32) 35 | output = fn(input) 36 | assert output.shape == (4, 16, 128, 128), 'incorrect output shape = %s' % str(output.shape) 37 | 38 | def test_case_1(): 39 | print('test_case_1: ShuffleUnit_Stack') 40 | model = ShuffleUnit_Stack(in_channels=16, out_channels=32) 41 | x = tensor.ftensor4('x') 42 | y = model.forward(x) 43 | # y = tensor.nnet.conv2d() 44 | print('compiling fn...') 45 | fn = theano.function([x], y, no_default_updates=False) 46 | print('run fn...') 47 | input = np.random.rand(4, 16, 256, 256).astype(np.float32) 48 | output = fn(input) 49 | assert output.shape == (4, 32, 128, 128), 'incorrect output shape = %s' % 
str(output.shape) 50 | 51 | def test_case_2(): 52 | print('test_case_2: model_ShuffleNet') 53 | model = model_ShuffleNet(in_channels=1, group_num=4, stage_channels=(24, 272, 544, 1088), stack_size=(3, 7, 3), batchnorm_mode=1, activation=relu) 54 | # model_weights = model.get_weights() 55 | # for value, w_name in model_weights: 56 | # print('name = %s, shape='%w_name, value.shape) 57 | 58 | x = tensor.ftensor4('x') 59 | y = model.forward(x) 60 | # y = tensor.nnet.conv2d() 61 | print('compiling fn...') 62 | fn = theano.function([x], y, no_default_updates=False) 63 | print('run fn...') 64 | input = np.random.rand(4, 1, 224, 224).astype(np.float32) 65 | output = fn(input) 66 | # print(output) 67 | assert output.shape == (4, 1088, 7, 7), 'incorrect output shape = %s' % str(output.shape) 68 | 69 | def test_case_3(): 70 | print('test_case_3: model_ShuffleSeg.ShuffleNet') 71 | model = model_ShuffleSeg.ShuffleNet(in_channels=1, out_channels=6) 72 | x = tensor.ftensor4('x') 73 | y = model.forward(x) 74 | print('compiling fn...') 75 | fn = theano.function([x], y, no_default_updates=False) 76 | print('run fn...') 77 | input = np.random.rand(4, 1, 224, 224).astype(np.float32) 78 | output = fn(input) 79 | assert output[0].shape == (4, 6, 7, 7), 'incorrect output[0] shape = %s' % str(output[0].shape) 80 | assert output[1].shape == (4, 544, 14, 14), 'incorrect output[1] shape = %s' % str(output[1].shape) 81 | assert output[2].shape == (4, 272, 28, 28), 'incorrect output[2] shape = %s' % str(output[2].shape) 82 | 83 | def test_case_4(): 84 | print('test_case_4: model_ShuffleSeg') 85 | model = model_ShuffleSeg() 86 | # from dandelion.util import gpickle 87 | # gpickle.dump((model.get_weights(), None), 'shuffleseg.gpkl') 88 | x = tensor.ftensor4('x') 89 | y = model.forward(x) 90 | print('compiling fn...') 91 | fn = theano.function([x], y, no_default_updates=False, on_unused_input='ignore') 92 | print('run fn...') 93 | input = np.random.rand(4, 1, 256, 256).astype(np.float32) 94 | 
output = fn(input) 95 | assert output.shape == (4, 6, 256, 256), 'incorrect output shape = %s' % str(output.shape) 96 | 97 | def test_case_5(): 98 | print('test_case_5: ShuffleUnit_v2') 99 | args = [ [256, 128, 'same', 1, relu, 2, 2], 100 | [256, 256, 'same', 2, relu, 1, 1], ] 101 | shape_gt = [ (4, 128, 128, 128), 102 | (4, 256, 256, 256)] 103 | for i, arg in enumerate(args): 104 | in_channels, out_channels, border_mode, batchnorm_mode, activation, stride, dilation = arg 105 | model = ShuffleUnit_v2(in_channels=in_channels, out_channels=out_channels, border_mode=border_mode, batchnorm_mode=batchnorm_mode, 106 | activation=activation, dilation=dilation, stride=stride) 107 | x = tensor.ftensor4('x') 108 | y = model.forward(x) 109 | print('compiling fn...') 110 | fn = theano.function([x], y, no_default_updates=False) 111 | print('run fn...') 112 | input = np.random.rand(4, in_channels, 256, 256).astype(np.float32) 113 | output = fn(input) 114 | assert output.shape == shape_gt[i], 'incorrect output shape = %s' % str(output.shape) 115 | 116 | def test_case_6(): 117 | print('test_case_6: ShuffleUnit_v2_Stack') 118 | args = [ [16, 32, 1, relu, 3], 119 | [16, 16, 0, relu, 4], 120 | [32, 16, 2, relu, 2]] 121 | shape_gt = [ (4, 32, 128, 128), 122 | (4, 16, 128, 128), 123 | (4, 16, 128, 128)] 124 | for i, arg in enumerate(args): 125 | in_channels, out_channels, batchnorm_mode, activation, stack_size = arg 126 | model = ShuffleUnit_v2_Stack(in_channels=in_channels, out_channels=out_channels, batchnorm_mode=batchnorm_mode, activation=activation, stack_size=stack_size) 127 | x = tensor.ftensor4('x') 128 | y = model.forward(x) 129 | print('compiling fn...') 130 | fn = theano.function([x], y, no_default_updates=False) 131 | print('run fn...') 132 | input = np.random.rand(4, in_channels, 256, 256).astype(np.float32) 133 | output = fn(input) 134 | # print(output.shape) 135 | assert output.shape == shape_gt[i], 'incorrect output shape = %s' % str(output.shape) 136 | 137 | 138 | def 
test_case_7(): 139 | print('test_case_7: model_ShuffleNet_v2') 140 | args = [ [3, (24, 116, 232, 464, 1024), (3, 7, 3), 1, relu], 141 | [1, (24, 48, 96, 192, 1088), (2, 5, 2), 0, relu], 142 | ] 143 | shape_gt = [ (4, 1024, 7, 7), 144 | (4, 1088, 7, 7), 145 | ] 146 | for i, arg in enumerate(args): 147 | in_channels, stage_channels, stack_size, batchnorm_mode, activation = arg 148 | 149 | model = model_ShuffleNet_v2(in_channels=in_channels, stage_channels=stage_channels, stack_size=stack_size, batchnorm_mode=batchnorm_mode, activation=activation) 150 | x = tensor.ftensor4('x') 151 | y = model.forward(x) 152 | # y = tensor.nnet.conv2d() 153 | print('compiling fn...') 154 | fn = theano.function([x], y, no_default_updates=False) 155 | print('run fn...') 156 | input = np.random.rand(4, in_channels, 224, 224).astype(np.float32) 157 | output = fn(input) 158 | # print(output.shape) 159 | assert output.shape == shape_gt[i], 'incorrect output shape = %s' % str(output.shape) 160 | 161 | def test_case_8(): 162 | print('test_case_8: grad of model_ShuffleNet_v2') 163 | args = [ [3, (24, 116, 232, 464, 1024), (3, 7, 3), 1, relu], 164 | [1, (24, 48, 96, 192, 1088), (2, 5, 2), 2, relu], 165 | ] 166 | shape_gt = [ (4, 1024, 7, 7), 167 | (4, 1088, 7, 7), 168 | ] 169 | for i, arg in enumerate(args): 170 | in_channels, stage_channels, stack_size, batchnorm_mode, activation = arg 171 | 172 | model = model_ShuffleNet_v2(in_channels=in_channels, stage_channels=stage_channels, stack_size=stack_size, batchnorm_mode=batchnorm_mode, activation=activation) 173 | x = tensor.ftensor4('x') 174 | gt = tensor.ftensor4('gt') 175 | y = model.forward(x) 176 | loss = aggregate(squared_error(gt, y)) 177 | params = model.collect_params() 178 | updates = sgd(loss, params, 1e-4) 179 | updates.update(model.collect_self_updates()) 180 | print('compiling fn...') 181 | fn = theano.function([x, gt], [y, loss], updates=updates, no_default_updates=False) 182 | print('run fn...') 183 | input = np.random.rand(4, 
in_channels, 224, 224).astype(np.float32) 184 | gt = np.random.rand(4, stage_channels[4], 7, 7).astype(np.float32) 185 | y, loss = fn(input, gt) 186 | # print(output.shape) 187 | print('loss = ', loss) 188 | assert y.shape == shape_gt[i], 'incorrect output shape = %s' % str(y.shape) 189 | 190 | 191 | 192 | if __name__ == '__main__': 193 | 194 | # test_case_0() 195 | # test_case_1() 196 | # test_case_2() 197 | # test_case_3() 198 | # test_case_4() 199 | # test_case_5() 200 | # test_case_6() 201 | # test_case_7() 202 | test_case_8() 203 | 204 | print('Test passed') 205 | 206 | 207 | 208 | -------------------------------------------------------------------------------- /docs/history.md: -------------------------------------------------------------------------------- 1 | # History 2 | 3 | ## version 0.17.25 [7-19-2019] 4 | * **NEW**: Add `only_output_to_file` arg to `util.sys_output_tap` class. You can disable screen output by setting this flag to `True`, which makes the script run *silently*. 5 | 6 | ## version 0.17.24 [4-3-2019] 7 | * **NEW**: Add [`Anti-996 License`](https://github.com/996icu/996.ICU/blob/master/LICENSE) as auxiliary license 8 | 9 | ## version 0.17.23 [3-5-2019] 10 | * **NEW**: now `BatchNorm`'s `mean` can be set to `None` to disable mean subtraction 11 | 12 | ## version 0.17.22 [2-28-2019] 13 | * **FIXED**: `self.b` should be set to `None` if not specified for `Conv2D` module 14 | 15 | ## version 0.17.21 [2-13-2019] 16 | * **NEW**: now `BatchNorm`'s `inv_std` can be set to `None` to disable variance scaling 17 | 18 | ## version 0.17.20 [2-11-2019] 19 | * **FIXED**: `self.b` should be set to `None` if not specified for `Dense` module 20 | 21 | ## version 0.17.19 [1-23-2019] 22 | * **NEW**: add `clear_nan` argument for `sgd`, `adam` and `adadelta` optimizers. 23 | * **MODIFIED**: add default value 1e-4 for `sgd`'s `learning_rate` arg.
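
As a rough illustration of the 0.17.19 entries, the sketch below shows one plausible reading of `clear_nan`: NaN gradient entries are zeroed before a plain SGD step, using the 1e-4 default learning rate mentioned in the entry. This is an assumption about the behavior, not Dandelion's actual code. `sgd_step` and its arguments are hypothetical names, and plain Python floats stand in for Theano shared variables.

```python
import math

def sgd_step(params, grads, learning_rate=1e-4, clear_nan=False):
    """Return updated parameters after one plain SGD step (hypothetical helper)."""
    new_params = []
    for p, g in zip(params, grads):
        if clear_nan and math.isnan(g):
            g = 0.0  # assumed behavior: a NaN gradient contributes no update
        new_params.append(p - learning_rate * g)
    return new_params

params = [1.0, 2.0]
grads = [0.5, float('nan')]
updated = sgd_step(params, grads, clear_nan=True)
```

With `clear_nan=True` the second parameter is left unchanged; without it the NaN would propagate into the parameter value.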
24 | 25 | ## version 0.17.17 [11-22-2018] 26 | * **NEW**: add `GroupNorm` module for group normalization implementation; 27 | * **MODIFIED**: expose `dim_broadcast` arg for `Module.register_param()` method; 28 | * **MODIFIED**: replace `spec = tensor.patternbroadcast(spec, dim_broadcast)` with `spec = theano.shared(spec, broadcastable=dim_broadcast)` for `util.create_param()`, because the former would change the tensor's type. 29 | 30 | ## version 0.17.16 [11-19-2018] 31 | * **FIXED**: wrong scale of `model_size` for `ext.visual.get_model_summary()`; 32 | * **MODIFIED**: add `stride` arg for `ShuffleUnit_Stack` and `ShuffleUnit_v2_Stack`; add `fusion_mode` arg for `ShuffleUnit_Stack`; improve their documentation. 33 | 34 | ## version 0.17.15 [11-16-2018] 35 | * **NEW**: add `ext.visual` sub-module, containing model summarization & visualization toolkits. 36 | 37 | ## version 0.17.14 [11-15-2018] 38 | * **FIXED**: remove redundant `bn5` layer of `ShuffleUnit_v2` when `stride` = 1 and `batchnorm_mode` = 2. 39 | 40 | ## version 0.17.13 [11-13-2018] 41 | * **NEW**: add `model.Alternate_2D_LSTM` module for 2D LSTM implementation by alternating 1D LSTM along different input dimensions. 42 | 43 | ## version 0.17.12 [11-6-2018] 44 | * **NEW**: add `LSTM2D` module for 2D LSTM implementation 45 | * **NEW**: add `.todevice()` interface to `Module` class for possible support of model-parallel multi-GPU training. However due to [Theano issue 6655](https://github.com/Theano/Theano/issues/6655), this feature won't be finished, so use it at your own risk. 46 | * **MODIFIED**: `activation` param of `Sequential` class now supports list input. 47 | * **MODIFIED**: merge pull request [#1](https://github.com/david-leon/Dandelion/pull/1), [#2](https://github.com/david-leon/Dandelion/pull/2), now `functional.spatial_pyramid_pooling()` supports 3 different implementations.
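
For readers unfamiliar with spatial pyramid pooling, the idea behind the last entry can be sketched in plain Python: a feature map of any size is max-pooled into a fixed number of bins per pyramid level, so the output length does not depend on the input size. This is only an illustration of the general technique on a single 2D map; it is not any of the three Dandelion implementations, and `spatial_pyramid_max_pool` and `pyramid_dims` are hypothetical names.

```python
def spatial_pyramid_max_pool(feature_map, pyramid_dims=(1, 2)):
    """Max-pool a 2D map (list of rows) into n x n bins per level and concatenate."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for n in pyramid_dims:
        for i in range(n):
            # Bin edges use floor for the start and ceil for the end, so the bins
            # cover the whole map even when its size is not divisible by n.
            r0, r1 = (i * height) // n, -(-((i + 1) * height) // n)
            for j in range(n):
                c0, c1 = (j * width) // n, -(-((j + 1) * width) // n)
                pooled.append(max(feature_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled

small = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4 x 4 map
large = [[r * 7 + c for c in range(7)] for r in range(6)]  # 6 x 7 map
```

Both `small` and `large` pool to 1 + 4 = 5 values for `pyramid_dims=(1, 2)`, which is what lets a dense layer follow a convolutional stack fed with variable-sized inputs.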
48 | 49 | ## version 0.17.11 [9-3-2018] 50 | * **MODIFIED**: returned `bbox`'s shape is changed to (B, H, W, k, n) for `model_CTPN` 51 | 52 | ## version 0.17.10 [8-16-2018] 53 | * **MODIFIED**: add `border_mode` arg to `dandelion.ext.CV.imrotate()`, the `interpolation` arg type is changed to string. 54 | * **MODIFIED**: returned `bbox`'s shape is changed to (B, H, W, n*k) for `model_CTPN` 55 | 56 | ## version 0.17.9 [8-7-2018] 57 | * **NEW**: add `theano_safe_run, Finite_Memory_Array, pad_sequence_into_array, get_local_ip, get_time_stamp` into `dandelion.util` 58 | * **NEW**: add documentation of `dandelion.util`, unfinished. 59 | 60 | ## version 0.17.8 [8-3-2018] 61 | * **MODIFIED**: disable the auto-broadcasting in `create_param()`. This auto-broadcasting would result in a Theano exception for parameters with shape = [1]. 62 | * **NEW**: add `model.shufflenet.ShuffleUnit_v2_Stack` and `model.shufflenet.ShuffleNet_v2` for ShuffleNet_v2 reference implementation. 63 | * **NEW**: move `channel_shuffle()` from `model.shufflenet.py` into `functional.py` 64 | 65 | ## version 0.17.7 [8-2-2018] 66 | From this version the documentation officially supports LaTeX math. 67 | * **MODIFIED**: move arg `alpha` of `Module.Center` from class declaration to its `.forward()` interface. 68 | 69 | ## version 0.17.6 [7-25-2018] 70 | * **MODIFIED**: change default value of `Module.set_weights_by_name()`'s arg `unmatched` from `ignore` to `raise` 71 | * **MODIFIED**: change default value of `model.vgg.model_VGG16()`'s arg `flip_filters` from `True` to `False` 72 | 73 | ## version 0.17.5 [7-20-2018] 74 | * **FIXED**: fixed typo in `objective.categorical_crossentropy()` 75 | 76 | ## version 0.17.4 [7-20-2018] 77 | * **NEW**: add class weighting support for `objective.categorical_crossentropy()` and `objective.categorical_crossentropy_log()` 78 | * **NEW**: add `util.theano_safe_run()` to help catch memory exceptions when running theano functions.
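
The class weighting mentioned in 0.17.4 can be pictured with a small pure-Python sketch: each sample's cross-entropy is scaled by the weight of its target class. This shows the general technique only, not the signature or internals of `objective.categorical_crossentropy()`. The function name, the `class_weights` argument, and the way `eps` is added inside the logarithm are all assumptions made for illustration.

```python
import math

def weighted_categorical_crossentropy(probs, targets, class_weights=None, eps=1e-7):
    """Per-sample cross-entropy, scaled by the weight of each sample's target class."""
    losses = []
    for p, t in zip(probs, targets):
        w = 1.0 if class_weights is None else class_weights[t]
        # eps guards against log(0); how Dandelion applies eps is not shown here
        losses.append(-w * math.log(p[t] + eps))
    return losses

probs = [[0.5, 0.5], [0.25, 0.75]]  # predicted class probabilities per sample
targets = [0, 1]                    # integer class labels
losses = weighted_categorical_crossentropy(probs, targets,
                                           class_weights=[2.0, 1.0], eps=0.0)
```

Here the first sample belongs to class 0, whose weight of 2.0 doubles its loss relative to the unweighted case; this is a common way to compensate for under-represented classes.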
79 | 80 | ## version 0.17.3 [7-18-2018] 81 | * **FIXED**: pooling mode in `model.shufflenet.ShuffleUnit` changed to `average_inc_pad` for correct gradient. 82 | 83 | ## version 0.17.2 [7-17-2018] 84 | * **NEW**: add `model.shufflenet.model_ShuffleSeg` for Shuffle-Seg model reference implementation. 85 | 86 | ## version 0.17.1 [7-12-2018] 87 | * **MODIFIED**: modify all `Test/Test_*.py` to be compatible with pytest. 88 | * **NEW**: add Travis CI for automatic unit test. 89 | 90 | ## version 0.17.0 [7-12-2018] 91 | In this version the `Module`'s parameter and sub-module naming conventions are changed to ensure a unique name for each variable/module in a complex network. 92 | It's **incompatible** with previous versions if your work accessed their names, otherwise there is no impact. 93 | Note: to set weights saved by previous dandelion(>=version 0.14.0), use `.set_weights()` instead of `.set_weights_by_name()`. For weights saved by dandelion of version < 0.14.0, the quick way is to set the model's submodule weight explicitly as `model_new_dandelion.conv1.W.set_value(model_old_dandelion.conv1.W.get_value())`. 94 | From this version, it's recommended to let the framework auto-name the module parameters when you define your own module with `register_param()` and `register_self_updating_variable()`. 95 | 96 | * **MODIFIED**: module's variable name convention changed to `variable_name@parent_module_name` to ensure a unique name for each variable in a complex network 97 | * **MODIFIED**: module's name convention changed to `class_name|instance_name@parent_module_name` to ensure a unique name for each module in a complex network 98 | * **MODIFIED**: remove all specified names for `register_param()` and `register_self_updating_variable()`. Leave the variables to be named automatically by their parent module. 99 | * **MODIFIED**: improve `model.shufflenet.ShuffleUnit`. 100 | * **NEW**: add `Sequential` container in `dandelion.module` for usage convenience.
101 | * **NEW**: add `model.shufflenet.ShuffleUnit_Stack` and `model.shufflenet.ShuffleNet` for ShuffleNet reference implementation. 102 | 103 | ## version 0.16.10 [7-10-2018] 104 | * **MODIFIED**: disable all install requirements to prevent possible conflict of pip and conda channel. 105 | 106 | ## version 0.16.9 [7-10-2018] 107 | * **MODIFIED**: import all model reference implementations into `dandelion.model`'s namespace 108 | * **FIXED**: `ConvTransposed2D`'s `W_shape` should use `in_channels` as first dimension; incorrect `W_shape` when `num_groups` > 1. 109 | 110 | ## version 0.16.8 [7-9-2018] 111 | * **NEW**: add `model.shufflenet.DSConv2D` for Depthwise Separable Convolution reference implementation. 112 | * **NEW**: add `model.shufflenet.ShuffleUnit` for ShuffleNet reference implementation 113 | * **FIXED**: `W_shape` of `module.Conv2D` should account for `num_groups` 114 | 115 | ## version 0.16.7 [7-6-2018] 116 | * **NEW**: add `model.vgg.model_VGG16` for VGG-16 reference implementation. 117 | * **NEW**: add `model.resnet.ResNet_bottleneck` for ResNet reference implementation 118 | * **NEW**: add `model.feature_pyramid_net.model_FPN` for Feature Pyramid Network reference implementation 119 | 120 | ## version 0.16.6 [7-5-2018] 121 | * **NEW**: add `functional.upsample_2d()` for 2D upsampling 122 | * **NEW**: add `functional.upsample_2d_bilinear()` for bilinear 2D upsampling 123 | 124 | ## version 0.16.5 [7-5-2018] 125 | * **NEW**: add `functional.spatial_pyramid_pooling()` for SPP-net implementation. 126 | 127 | ## version 0.16.4 [7-4-2018] 128 | * **FIXED**: wrong indexing when `targets` is int vector for `objective.categorical_crossentropy_log()`. 129 | 130 | 131 | ## version 0.16.3 [7-3-2018] 132 | * **NEW**: add `activation.log_softmax()` for more numerically stable softmax.
133 | * **NEW**: add `objective.categorical_crossentropy_log()` for more numerically stable categorical cross-entropy 134 | * **MODIFIED**: add `eps` argument to `objective.categorical_crossentropy()` for numerical stability. Note 1e-7 is set as default value of `eps`. You can set it to 0 to get the old `categorical_crossentropy()` back. 135 | 136 | ## version 0.16.0 [6-13-2018] 137 | * **NEW**: add `ext` module into master branch of Dandelion. All the miscellaneous extensions will be organized here. 138 | * **NEW**: add `ext.CV` sub-module, containing image I/O functions and basic image processing functions commonly used in model training. 139 | 140 | ## version 0.15.2 [5-28-2018] 141 | * **FIXED**: `convTOP` should be constructed each time the `forward()` function of `ConvTransposed2D` is called. 142 | 143 | ## version 0.15.1 [5-25-2018] 144 | * **NEW**: add `model` module into master branch of Dandelion 145 | * **NEW**: add U-net FCN implementation into `model` module 146 | * **NEW**: add `align_crop()` into `functional` module 147 | 148 | ## version 0.14.4 [4-17-2018] 149 | Rename `updates.py` to `update.py` 150 | 151 | ## version 0.14.0 [4-10-2018] 152 | In this version the `Module`'s parameter interfaces are mostly redesigned, so it's **incompatible** with previous versions. 153 | Now `self.params` and `self.self_updating_variables` do not include sub-modules' parameters any more; to get all the parameters to be 154 | trained by optimizer, including sub-modules' during training, you'll need to call the new interface function `.collect_params()`. 155 | To collect self-defined updates for training, still call `.collect_self_updates()`. 156 | 157 | * **MODIFIED**: `.get_weights()` and `.set_weights()` traverse the parameters in the same order of sub-modules, so they're **incompatible** with previous versions.
158 | * **MODIFIED**: Rewind all `trainable` flags, you're now expected to use the `include` and `exclude` arguments in `.collect_params()` and 159 | `.collect_self_updates()` to enable/disable training for certain module's parameters. 160 | * **MODIFIED**: to define self-update expression for `self_updating_variable`, use `.update` attribute instead of previous `.default_update` 161 | * **NEW**: add auto-naming feature to root class `Module`: if a sub-module is unnamed yet, it'll be auto-named by its instance name, 162 | from now on you don't need to name a sub-module manually any more. 163 | * **NEW**: add `.set_weights_by_name()` to `Module` class, you can use this function to set module weights saved by previous version of Dandelion 164 | -------------------------------------------------------------------------------- /dandelion/model/unet.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | ''' 3 | Model definition of U-net FCN 4 | Created : 5, 24, 2018 5 | Revised : 5, 24, 2018 6 | All rights reserved 7 | ''' 8 | #------------------------------------------------------------------------------------------------ 9 | __author__ = 'dawei.leng' 10 | 11 | import theano.tensor as tensor 12 | from ..module import * 13 | from ..initialization import HeNormal 14 | from ..functional import * 15 | from ..activation import * 16 | 17 | class model_Unet(Module): 18 | """ 19 | Unet reference implementation, same structure with Lasagne's [implementation](https://github.com/Lasagne/Recipes/blob/master/modelzoo/Unet.py) 20 | Note that Dandelion's `BatchNorm` implementation is different from Lasagne's counter-part. 
21 | `contr` := contract 22 | """ 23 | def __init__(self, channel=1, im_height=128, im_width=128, Nclass=2, kernel_size=3, border_mode='same', base_n_filters=64, output_activation=softmax): 24 | super().__init__() 25 | self.Nclass = Nclass 26 | self.output_activation = output_activation 27 | 28 | self.contr1_1 = Conv2D(in_channels=channel, out_channels=base_n_filters, kernel_size=kernel_size, pad=border_mode, input_shape=(im_height, im_width), W=HeNormal(gain="relu")) 29 | self.bn1_1 = BatchNorm(input_shape=(None, base_n_filters, None, None)) 30 | self.contr1_2 = Conv2D(in_channels=base_n_filters, out_channels=base_n_filters, kernel_size=kernel_size, pad=border_mode, W=HeNormal(gain="relu")) 31 | self.bn1_2 = BatchNorm(input_shape=(None, base_n_filters, None, None)) 32 | 33 | self.contr2_1 = Conv2D(in_channels=base_n_filters, out_channels=base_n_filters*2, kernel_size=kernel_size, pad=border_mode, W=HeNormal(gain="relu")) 34 | self.bn2_1 = BatchNorm(input_shape=(None, base_n_filters*2, None, None)) 35 | self.contr2_2 = Conv2D(in_channels=base_n_filters*2, out_channels=base_n_filters*2, kernel_size=kernel_size, pad=border_mode, W=HeNormal(gain="relu")) 36 | self.bn2_2 = BatchNorm(input_shape=(None, base_n_filters*2, None, None)) 37 | 38 | self.contr3_1 = Conv2D(in_channels=base_n_filters*2, out_channels=base_n_filters*4, kernel_size=kernel_size, pad=border_mode, W=HeNormal(gain="relu")) 39 | self.bn3_1 = BatchNorm(input_shape=(None, base_n_filters*4, None, None)) 40 | self.contr3_2 = Conv2D(in_channels=base_n_filters*4, out_channels=base_n_filters*4, kernel_size=kernel_size, pad=border_mode, W=HeNormal(gain="relu")) 41 | self.bn3_2 = BatchNorm(input_shape=(None, base_n_filters*4, None, None)) 42 | 43 | self.contr4_1 = Conv2D(in_channels=base_n_filters*4, out_channels=base_n_filters*8, kernel_size=kernel_size, pad=border_mode, W=HeNormal(gain="relu")) 44 | self.bn4_1 = BatchNorm(input_shape=(None, base_n_filters*8, None, None)) 45 | self.contr4_2 = 
Conv2D(in_channels=base_n_filters*8, out_channels=base_n_filters*8, kernel_size=kernel_size, pad=border_mode, W=HeNormal(gain="relu")) 46 | self.bn4_2 = BatchNorm(input_shape=(None, base_n_filters*8, None, None)) 47 | 48 | self.encode_1 = Conv2D(in_channels=base_n_filters*8, out_channels=base_n_filters*16, kernel_size=kernel_size, pad=border_mode, W=HeNormal(gain="relu")) 49 | self.bn_enc_1 = BatchNorm(input_shape=(None, base_n_filters*16, None, None)) 50 | self.encode_2 = Conv2D(in_channels=base_n_filters*16, out_channels=base_n_filters*16, kernel_size=kernel_size, pad=border_mode, W=HeNormal(gain="relu")) 51 | self.bn_enc_2 = BatchNorm(input_shape=(None, base_n_filters*16, None, None)) 52 | 53 | self.upscale_1 = ConvTransposed2D(in_channels=base_n_filters*16, out_channels=base_n_filters*16, kernel_size=2, stride=2, pad='valid', W=HeNormal(gain="relu")) 54 | self.bn_ups_1 = BatchNorm(input_shape=(None, base_n_filters * 16, None, None)) 55 | 56 | self.expand1_1 = Conv2D(in_channels=base_n_filters*24, out_channels=base_n_filters*8, kernel_size=kernel_size, pad=border_mode, W=HeNormal(gain="relu")) 57 | self.bn_epd1_1 = BatchNorm(input_shape=(None, base_n_filters*8, None, None)) 58 | self.expand1_2 = Conv2D(in_channels=base_n_filters*8, out_channels=base_n_filters*8, kernel_size=kernel_size, pad=border_mode, W=HeNormal(gain="relu")) 59 | self.bn_epd1_2 = BatchNorm(input_shape=(None, base_n_filters*8, None, None)) 60 | 61 | self.upscale_2 = ConvTransposed2D(in_channels=base_n_filters*8, out_channels=base_n_filters*8, kernel_size=2, stride=2, pad='valid', W=HeNormal(gain="relu")) 62 | self.bn_ups_2 = BatchNorm(input_shape=(None, base_n_filters * 8, None, None)) 63 | 64 | self.expand2_1 = Conv2D(in_channels=base_n_filters*12, out_channels=base_n_filters*4, kernel_size=kernel_size, pad=border_mode, W=HeNormal(gain="relu")) 65 | self.bn_epd2_1 = BatchNorm(input_shape=(None, base_n_filters*4, None, None)) 66 | self.expand2_2 = Conv2D(in_channels=base_n_filters*4, 
out_channels=base_n_filters*4, kernel_size=kernel_size, pad=border_mode, W=HeNormal(gain="relu")) 67 | self.bn_epd2_2 = BatchNorm(input_shape=(None, base_n_filters*4, None, None)) 68 | 69 | self.upscale_3 = ConvTransposed2D(in_channels=base_n_filters*4, out_channels=base_n_filters*4, kernel_size=2, stride=2, pad='valid', W=HeNormal(gain="relu")) 70 | self.bn_ups_3 = BatchNorm(input_shape=(None, base_n_filters * 4, None, None)) 71 | 72 | self.expand3_1 = Conv2D(in_channels=base_n_filters*6, out_channels=base_n_filters*2, kernel_size=kernel_size, pad=border_mode, W=HeNormal(gain="relu")) 73 | self.bn_epd3_1 = BatchNorm(input_shape=(None, base_n_filters*2, None, None)) 74 | self.expand3_2 = Conv2D(in_channels=base_n_filters*2, out_channels=base_n_filters*2, kernel_size=kernel_size, pad=border_mode, W=HeNormal(gain="relu")) 75 | self.bn_epd3_2 = BatchNorm(input_shape=(None, base_n_filters*2, None, None)) 76 | 77 | self.upscale_4 = ConvTransposed2D(in_channels=base_n_filters*2, out_channels=base_n_filters*2, kernel_size=2, stride=2, pad='valid', W=HeNormal(gain="relu")) 78 | self.bn_ups_4 = BatchNorm(input_shape=(None, base_n_filters * 2, None, None)) 79 | 80 | self.expand4_1 = Conv2D(in_channels=base_n_filters*3, out_channels=base_n_filters, kernel_size=kernel_size, pad=border_mode, W=HeNormal(gain="relu")) 81 | self.bn_epd4_1 = BatchNorm(input_shape=(None, base_n_filters, None, None)) 82 | self.expand4_2 = Conv2D(in_channels=base_n_filters, out_channels=base_n_filters, kernel_size=kernel_size, pad=border_mode, W=HeNormal(gain="relu")) 83 | self.bn_epd4_2 = BatchNorm(input_shape=(None, base_n_filters, None, None)) 84 | 85 | self.output = Conv2D(in_channels=base_n_filters, out_channels=Nclass, kernel_size=1, pad='valid') # (B, C, H, W) 86 | 87 | def forward(self, x): 88 | """ 89 | :param x: (B, C, H, W) 90 | :return: 91 | """ 92 | self.work_mode = 'train' 93 | 94 | x = elu(self.bn1_1.forward(self.contr1_1.forward(x))) 95 | x_bn1_2 = 
elu(self.bn1_2.forward(self.contr1_2.forward(x))) 96 | x = pool_2d(x_bn1_2, ws=(2,2)) 97 | 98 | x = elu(self.bn2_1.forward(self.contr2_1.forward(x))) 99 | x_bn2_2 = elu(self.bn2_2.forward(self.contr2_2.forward(x))) 100 | x = pool_2d(x_bn2_2, ws=(2,2)) 101 | 102 | x = elu(self.bn3_1.forward(self.contr3_1.forward(x))) 103 | x_bn3_2 = elu(self.bn3_2.forward(self.contr3_2.forward(x))) 104 | x = pool_2d(x_bn3_2, ws=(2,2)) 105 | 106 | x = elu(self.bn4_1.forward(self.contr4_1.forward(x))) 107 | x_bn4_2 = elu(self.bn4_2.forward(self.contr4_2.forward(x))) 108 | x = pool_2d(x_bn4_2, ws=(2,2)) 109 | 110 | x = elu(self.bn_enc_1.forward(self.encode_1.forward(x))) 111 | x = elu(self.bn_enc_2.forward(self.encode_2.forward(x))) 112 | 113 | x = elu(self.bn_ups_1.forward(self.upscale_1.forward(x))) 114 | x1, x2 = align_crop([x, x_bn4_2], cropping=(None, None, 'center', 'center')) 115 | x = tensor.concatenate([x1, x2], axis=1) 116 | 117 | x = elu(self.bn_epd1_1.forward(self.expand1_1.forward(x))) 118 | x = elu(self.bn_epd1_2.forward(self.expand1_2.forward(x))) 119 | 120 | x = elu(self.bn_ups_2.forward(self.upscale_2.forward(x))) 121 | x1, x2 = align_crop([x, x_bn3_2], cropping=(None, None, 'center', 'center')) 122 | x = tensor.concatenate([x1, x2], axis=1) 123 | 124 | x = elu(self.bn_epd2_1.forward(self.expand2_1.forward(x))) 125 | x = elu(self.bn_epd2_2.forward(self.expand2_2.forward(x))) 126 | 127 | x = elu(self.bn_ups_3.forward(self.upscale_3.forward(x))) 128 | x1, x2 = align_crop([x, x_bn2_2], cropping=(None, None, 'center', 'center')) 129 | x = tensor.concatenate([x1, x2], axis=1) 130 | 131 | x = elu(self.bn_epd3_1.forward(self.expand3_1.forward(x))) 132 | x = elu(self.bn_epd3_2.forward(self.expand3_2.forward(x))) 133 | 134 | x = elu(self.bn_ups_4.forward(self.upscale_4.forward(x))) 135 | x1, x2 = align_crop([x, x_bn1_2], cropping=(None, None, 'center', 'center')) 136 | x = tensor.concatenate([x1, x2], axis=1) 137 | 138 | x = 
elu(self.bn_epd4_1.forward(self.expand4_1.forward(x))) 139 | x = elu(self.bn_epd4_2.forward(self.expand4_2.forward(x))) 140 | 141 | x = self.output.forward(x) # (B, C, H, W) 142 | x = x.dimshuffle((0, 2, 3, 1)) # (B, H, W, C) 143 | x = self.output_activation(x) 144 | return x 145 | 146 | def predict(self, x): 147 | self.work_mode = 'inference' 148 | 149 | x = elu(self.bn1_1.predict(self.contr1_1.predict(x))) 150 | x_bn1_2 = elu(self.bn1_2.predict(self.contr1_2.predict(x))) 151 | x = pool_2d(x_bn1_2, ws=(2,2)) 152 | 153 | x = elu(self.bn2_1.predict(self.contr2_1.predict(x))) 154 | x_bn2_2 = elu(self.bn2_2.predict(self.contr2_2.predict(x))) 155 | x = pool_2d(x_bn2_2, ws=(2,2)) 156 | 157 | x = elu(self.bn3_1.predict(self.contr3_1.predict(x))) 158 | x_bn3_2 = elu(self.bn3_2.predict(self.contr3_2.predict(x))) 159 | x = pool_2d(x_bn3_2, ws=(2,2)) 160 | 161 | x = elu(self.bn4_1.predict(self.contr4_1.predict(x))) 162 | x_bn4_2 = elu(self.bn4_2.predict(self.contr4_2.predict(x))) 163 | x = pool_2d(x_bn4_2, ws=(2,2)) 164 | 165 | x = elu(self.bn_enc_1.predict(self.encode_1.predict(x))) 166 | x = elu(self.bn_enc_2.predict(self.encode_2.predict(x))) 167 | 168 | x = elu(self.bn_ups_1.predict(self.upscale_1.predict(x))) 169 | x1, x2 = align_crop([x, x_bn4_2], cropping=(None, None, 'center', 'center')) 170 | x = tensor.concatenate([x1, x2], axis=1) 171 | 172 | x = elu(self.bn_epd1_1.predict(self.expand1_1.predict(x))) 173 | x = elu(self.bn_epd1_2.predict(self.expand1_2.predict(x))) 174 | 175 | x = elu(self.bn_ups_2.predict(self.upscale_2.predict(x))) 176 | x1, x2 = align_crop([x, x_bn3_2], cropping=(None, None, 'center', 'center')) 177 | x = tensor.concatenate([x1, x2], axis=1) 178 | 179 | x = elu(self.bn_epd2_1.predict(self.expand2_1.predict(x))) 180 | x = elu(self.bn_epd2_2.predict(self.expand2_2.predict(x))) 181 | 182 | x = elu(self.bn_ups_3.predict(self.upscale_3.predict(x))) 183 | x1, x2 = align_crop([x, x_bn2_2], cropping=(None, None, 'center', 'center')) 184 | x = 
tensor.concatenate([x1, x2], axis=1) 185 | 186 | x = elu(self.bn_epd3_1.predict(self.expand3_1.predict(x))) 187 | x = elu(self.bn_epd3_2.predict(self.expand3_2.predict(x))) 188 | 189 | x = elu(self.bn_ups_4.predict(self.upscale_4.predict(x))) 190 | x1, x2 = align_crop([x, x_bn1_2], cropping=(None, None, 'center', 'center')) 191 | x = tensor.concatenate([x1, x2], axis=1) 192 | 193 | x = elu(self.bn_epd4_1.predict(self.expand4_1.predict(x))) 194 | x = elu(self.bn_epd4_2.predict(self.expand4_2.predict(x))) 195 | 196 | x = self.output.predict(x) # (B, C, H, W) 197 | x = x.dimshuffle((0, 2, 3, 1)) # (B, H, W, C) 198 | x = self.output_activation(x) 199 | return x 200 | 201 | -------------------------------------------------------------------------------- /dandelion/activation.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Non-linear activation functions for artificial neurons. 4 | """ 5 | 6 | import theano.tensor as tensor 7 | 8 | #--- element-wise activations ---# 9 | sigmoid = tensor.nnet.sigmoid 10 | tanh = tensor.tanh 11 | relu = tensor.nnet.relu 12 | softplus = tensor.nnet.softplus 13 | ultra_fast_sigmoid = tensor.nnet.ultra_fast_sigmoid 14 | 15 | # def _log_softmax(x): 16 | # xdev = x - x.max(1, keepdims=True) 17 | # return xdev - T.log(T.sum(T.exp(xdev), axis=1, keepdims=True)) 18 | 19 | #--- row-wise activation ---# 20 | def log_softmax(x): 21 | """ 22 | Apply log_softmax to the last dimension of input x 23 | :param x: tensor 24 | :return: 25 | """ 26 | ndim = x.ndim 27 | if ndim <= 2: 28 | return tensor.nnet.logsoftmax(x) 29 | else: 30 | original_shape = x.shape 31 | M, N = 1, original_shape[-1] 32 | for i in range(x.ndim - 1): 33 | M = M * original_shape[i] 34 | x = tensor.reshape(x, (M, N)) 35 | x = tensor.nnet.logsoftmax(x) 36 | x = tensor.reshape(x, original_shape) 37 | return x 38 | 39 | #--- row-wise activation ---# 40 | def softmax(x): 41 | """ 42 | Apply softmax to the last 
dimension of input x 43 | :param x: tensor 44 | :return: 45 | """ 46 | ndim = x.ndim 47 | if ndim <= 2: 48 | return tensor.nnet.softmax(x) 49 | else: 50 | original_shape = x.shape 51 | M, N = 1, original_shape[-1] 52 | for i in range(x.ndim - 1): 53 | M = M * original_shape[i] 54 | x = tensor.reshape(x, (M, N)) 55 | x = tensor.nnet.softmax(x) 56 | x = tensor.reshape(x, original_shape) 57 | return x 58 | 59 | # scaled tanh 60 | class ScaledTanH(object): 61 | """Scaled tanh :math:`\\varphi(x) = \\tanh(\\alpha \\cdot x) \\cdot \\beta` 62 | 63 | This is a modified tanh function which allows to rescale both the input and 64 | the output of the activation. 65 | 66 | Scaling the input down will result in decreasing the maximum slope of the 67 | tanh and as a result it will be in the linear regime in a larger interval 68 | of the input space. Scaling the input up will increase the maximum slope 69 | of the tanh and thus bring it closer to a step function. 70 | 71 | Scaling the output changes the output interval to :math:`[-\\beta,\\beta]`. 72 | 73 | Parameters 74 | ---------- 75 | scale_in : float32 76 | The scale parameter :math:`\\alpha` for the input 77 | 78 | scale_out : float32 79 | The scale parameter :math:`\\beta` for the output 80 | 81 | Methods 82 | ------- 83 | __call__(x) 84 | Apply the scaled tanh function to the activation `x`. 85 | 86 | Examples 87 | -------- 88 | In contrast to other activation functions in this module, this is 89 | a class that needs to be instantiated to obtain a callable: 90 | 91 | >>> from lasagne.layers import InputLayer, DenseLayer 92 | >>> l_in = InputLayer((None, 100)) 93 | >>> from lasagne.nonlinearities import ScaledTanH 94 | >>> scaled_tanh = ScaledTanH(scale_in=0.5, scale_out=2.27) 95 | >>> l1 = DenseLayer(l_in, num_units=200, nonlinearity=scaled_tanh) 96 | 97 | Notes 98 | ----- 99 | LeCun et al. 
(in [1]_, Section 4.4) suggest ``scale_in=2./3`` and 100 | ``scale_out=1.7159``, which has :math:`\\varphi(\\pm 1) = \\pm 1`, 101 | maximum second derivative at 1, and an effective gain close to 1. 102 | 103 | By carefully matching :math:`\\alpha` and :math:`\\beta`, the nonlinearity 104 | can also be tuned to preserve the mean and variance of its input: 105 | 106 | * ``scale_in=0.5``, ``scale_out=2.4``: If the input is a random normal 107 | variable, the output will have zero mean and unit variance. 108 | * ``scale_in=1``, ``scale_out=1.6``: Same property, but with a smaller 109 | linear regime in input space. 110 | * ``scale_in=0.5``, ``scale_out=2.27``: If the input is a uniform normal 111 | variable, the output will have zero mean and unit variance. 112 | * ``scale_in=1``, ``scale_out=1.48``: Same property, but with a smaller 113 | linear regime in input space. 114 | 115 | References 116 | ---------- 117 | .. [1] LeCun, Yann A., et al. (1998): 118 | Efficient BackProp, 119 | http://link.springer.com/chapter/10.1007/3-540-49430-8_2, 120 | http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf 121 | .. [2] Masci, Jonathan, et al. (2011): 122 | Stacked Convolutional Auto-Encoders for Hierarchical Feature Extraction, 123 | http://link.springer.com/chapter/10.1007/978-3-642-21735-7_7, 124 | http://people.idsia.ch/~ciresan/data/icann2011.pdf 125 | """ 126 | 127 | def __init__(self, scale_in=1, scale_out=1): 128 | self.scale_in = scale_in 129 | self.scale_out = scale_out 130 | 131 | def __call__(self, x): 132 | return tensor.tanh(x * self.scale_in) * self.scale_out 133 | 134 | 135 | ScaledTanh = ScaledTanH # alias with alternative capitalization 136 | 137 | # leaky rectify 138 | class LeakyRectify(object): 139 | """Leaky rectifier :math:`\\varphi(x) = (x > 0)? x : \\alpha \\cdot x` 140 | 141 | The leaky rectifier was introduced in [1]_. 
Compared to the standard 142 | rectifier :func:`rectify`, it has a nonzero gradient for negative input, 143 | which often helps convergence. 144 | 145 | Parameters 146 | ---------- 147 | leakiness : float 148 | Slope for negative input, usually between 0 and 1. 149 | A leakiness of 0 will lead to the standard rectifier, 150 | a leakiness of 1 will lead to a linear activation function, 151 | and any value in between will give a leaky rectifier. 152 | 153 | Methods 154 | ------- 155 | __call__(x) 156 | Apply the leaky rectify function to the activation `x`. 157 | 158 | Examples 159 | -------- 160 | In contrast to other activation functions in this module, this is 161 | a class that needs to be instantiated to obtain a callable: 162 | 163 | >>> from lasagne.layers import InputLayer, DenseLayer 164 | >>> l_in = InputLayer((None, 100)) 165 | >>> from lasagne.nonlinearities import LeakyRectify 166 | >>> custom_rectify = LeakyRectify(0.1) 167 | >>> l1 = DenseLayer(l_in, num_units=200, nonlinearity=custom_rectify) 168 | 169 | Alternatively, you can use the provided instance for leakiness=0.01: 170 | 171 | >>> from lasagne.nonlinearities import leaky_rectify 172 | >>> l2 = DenseLayer(l_in, num_units=200, nonlinearity=leaky_rectify) 173 | 174 | Or the one for a high leakiness of 1/3: 175 | 176 | >>> from lasagne.nonlinearities import very_leaky_rectify 177 | >>> l3 = DenseLayer(l_in, num_units=200, nonlinearity=very_leaky_rectify) 178 | 179 | See Also 180 | -------- 181 | leaky_rectify: Instance with default leakiness of 0.01, as in [1]_. 182 | very_leaky_rectify: Instance with high leakiness of 1/3, as in [2]_. 183 | 184 | References 185 | ---------- 186 | .. [1] Maas et al. (2013): 187 | Rectifier Nonlinearities Improve Neural Network Acoustic Models, 188 | http://web.stanford.edu/~awni/papers/relu_hybrid_icml2013_final.pdf 189 | .. 
[2] Graham, Benjamin (2014): 190 | Spatially-sparse convolutional neural networks, 191 | http://arxiv.org/abs/1409.6070 192 | """ 193 | def __init__(self, leakiness=0.01): 194 | self.leakiness = leakiness 195 | 196 | def __call__(self, x): 197 | return tensor.nnet.relu(x, self.leakiness) 198 | 199 | 200 | leaky_rectify = LeakyRectify() # shortcut with default leakiness 201 | leaky_rectify.__doc__ = """leaky_rectify(x) 202 | 203 | Instance of :class:`LeakyRectify` with leakiness :math:`\\alpha=0.01` 204 | """ 205 | 206 | 207 | very_leaky_rectify = LeakyRectify(1./3) # shortcut with high leakiness 208 | very_leaky_rectify.__doc__ = """very_leaky_rectify(x) 209 | 210 | Instance of :class:`LeakyRectify` with leakiness :math:`\\alpha=1/3` 211 | """ 212 | 213 | 214 | # elu 215 | def elu(x): 216 | """Exponential Linear Unit :math:`\\varphi(x) = (x > 0) ? x : e^x - 1` 217 | 218 | The Exponential Linear Unit (ELU) was introduced in [1]_. Compared to the 219 | linear rectifier :func:`rectify`, it has a mean activation closer to zero 220 | and nonzero gradient for negative input, which can help convergence. 221 | Compared to the leaky rectifier :class:`LeakyRectify`, it saturates for 222 | highly negative inputs. 223 | 224 | Parameters 225 | ---------- 226 | x : float32 227 | The activation (the summed, weighted input of a neuron). 228 | 229 | Returns 230 | ------- 231 | float32 232 | The output of the exponential linear unit for the activation. 233 | 234 | Notes 235 | ----- 236 | In [1]_, an additional parameter :math:`\\alpha` controls the (negative) 237 | saturation value for negative inputs, but is set to 1 for all experiments. 238 | It is omitted here. 239 | 240 | References 241 | ---------- 242 | ..
[1] Djork-Arné Clevert, Thomas Unterthiner, Sepp Hochreiter (2015): 243 | Fast and Accurate Deep Network Learning by Exponential Linear Units 244 | (ELUs), http://arxiv.org/abs/1511.07289 245 | """ 246 | return tensor.switch(x > 0, x, tensor.expm1(x)) 247 | 248 | 249 | # selu 250 | class SELU(object): 251 | """ 252 | Scaled Exponential Linear Unit 253 | :math:`\\varphi(x)=\\lambda \\left[(x>0) ? x : \\alpha(e^x-1)\\right]` 254 | 255 | The Scaled Exponential Linear Unit (SELU) was introduced in [1]_ 256 | as an activation function that allows the construction of 257 | self-normalizing neural networks. 258 | 259 | Parameters 260 | ---------- 261 | scale : float32 262 | The scale parameter :math:`\\lambda` for scaling all output. 263 | 264 | scale_neg : float32 265 | The scale parameter :math:`\\alpha` 266 | for scaling output for nonpositive argument values. 267 | 268 | Methods 269 | ------- 270 | __call__(x) 271 | Apply the SELU function to the activation `x`. 272 | 273 | Examples 274 | -------- 275 | In contrast to other activation functions in this module, this is 276 | a class that needs to be instantiated to obtain a callable: 277 | 278 | >>> from lasagne.layers import InputLayer, DenseLayer 279 | >>> l_in = InputLayer((None, 100)) 280 | >>> from lasagne.nonlinearities import SELU 281 | >>> selu = SELU(2, 3) 282 | >>> l1 = DenseLayer(l_in, num_units=200, nonlinearity=selu) 283 | 284 | See Also 285 | -------- 286 | selu: Instance with :math:`\\alpha\\approx1.6733,\\lambda\\approx1.0507` 287 | as used in [1]_. 288 | 289 | References 290 | ---------- 291 | .. [1] Günter Klambauer et al. 
(2017): 292 | Self-Normalizing Neural Networks, 293 | https://arxiv.org/abs/1706.02515 294 | """ 295 | def __init__(self, scale=1, scale_neg=1): 296 | self.scale = scale 297 | self.scale_neg = scale_neg 298 | 299 | def __call__(self, x): 300 | return self.scale * tensor.switch( 301 | x > 0.0, 302 | x, 303 | self.scale_neg * (tensor.expm1(x))) 304 | 305 | 306 | selu = SELU(scale=1.0507009873554804934193349852946, 307 | scale_neg=1.6732632423543772848170429916717) 308 | selu.__doc__ = """selu(x) 309 | 310 | Instance of :class:`SELU` with :math:`\\alpha\\approx 1.6733, 311 | \\lambda\\approx 1.0507` 312 | 313 | This has a stable and attracting fixed point of :math:`\\mu=0`, 314 | :math:`\\sigma=1` under the assumptions of the 315 | original paper on self-normalizing neural networks. 316 | """ 317 | 318 | def linear(x): 319 | """Linear activation function :math:`\\varphi(x) = x` 320 | 321 | Parameters 322 | ---------- 323 | x : float32 324 | The activation (the summed, weighted input of a neuron). 325 | 326 | Returns 327 | ------- 328 | float32 329 | The output of the identity applied to the activation. 330 | """ 331 | return x 332 | 333 | identity = linear -------------------------------------------------------------------------------- /dandelion/model/ctpn.py: -------------------------------------------------------------------------------- 1 | # coding:utf-8 2 | ''' 3 | Reference implementation of [CTPN](https://arxiv.org/abs/1609.03605) for text line detection 4 | Created : 7, 26, 2018 5 | Revised : 8, 2, 2018 add alternative implementation to bypass the absence of `im2col` in Theano. 6 | 9, 3, 2018 modified: change output bbox's shape to (B, H, W, k, n) (from (B, H, W, k*n)).
7 | All rights reserved 8 | ''' 9 | # ------------------------------------------------------------------------------------------------ 10 | __author__ = 'dawei.leng' 11 | 12 | import theano.tensor as tensor 13 | from ..module import * 14 | from ..functional import * 15 | from ..activation import * 16 | 17 | 18 | class model_CTPN(Module): 19 | """ 20 | Reference implementation of [CTPN](https://arxiv.org/abs/1609.03605) for text line detection 21 | """ 22 | 23 | def __init__(self, k=10, do_side_refinement_regress=False, 24 | batchnorm_mode=1, channel=3, im_height=None, im_width=None, 25 | kernel_size=3, border_mode=(1, 1), VGG_flip_filters=False, 26 | im2col=None): 27 | """ 28 | 29 | :param k: anchor box number 30 | :param do_side_refinement_regress: whether implement side refinement regression 31 | :param batchnorm_mode: 1|0, whether insert batch normalization into the end of each convolution stage of VGG-16 net 32 | :param channel: input channel number 33 | :param im_height: input image height, optional 34 | :param im_width: input image width, optional 35 | :param kernel_size: convolution kernel size of VGG-16 net 36 | :param border_mode: border mode of VGG-16 net 37 | :param VGG_flip_filters: whether flip convolution kernels for VGG-16 net 38 | :param im2col: function corresponding to Caffe's `im2col()`. If None, the CTPN implementation will not strictly follow the original paper. 
39 | """ 40 | super().__init__() 41 | self.batchnorm_mode = batchnorm_mode 42 | self.k = k 43 | self.im2col = im2col 44 | #--- encoding part, VGG16, 13 conv layers ---# 45 | self.conv1_1 = Conv2D(in_channels=channel, out_channels=64, kernel_size=kernel_size, pad=border_mode, input_shape=(im_height, im_width), flip_filters=VGG_flip_filters) 46 | self.conv1_2 = Conv2D(in_channels=64, out_channels=64, kernel_size=kernel_size, pad=border_mode, flip_filters=VGG_flip_filters) 47 | self.conv2_1 = Conv2D(in_channels=64, out_channels=128, kernel_size=kernel_size, pad=border_mode, flip_filters=VGG_flip_filters) 48 | self.conv2_2 = Conv2D(in_channels=128, out_channels=128, kernel_size=kernel_size, pad=border_mode, flip_filters=VGG_flip_filters) 49 | self.conv3_1 = Conv2D(in_channels=128, out_channels=256, kernel_size=kernel_size, pad=border_mode, flip_filters=VGG_flip_filters) 50 | self.conv3_2 = Conv2D(in_channels=256, out_channels=256, kernel_size=kernel_size, pad=border_mode, flip_filters=VGG_flip_filters) 51 | self.conv3_3 = Conv2D(in_channels=256, out_channels=256, kernel_size=kernel_size, pad=border_mode, flip_filters=VGG_flip_filters) 52 | self.conv4_1 = Conv2D(in_channels=256, out_channels=512, kernel_size=kernel_size, pad=border_mode, flip_filters=VGG_flip_filters) 53 | self.conv4_2 = Conv2D(in_channels=512, out_channels=512, kernel_size=kernel_size, pad=border_mode, flip_filters=VGG_flip_filters) 54 | self.conv4_3 = Conv2D(in_channels=512, out_channels=512, kernel_size=kernel_size, pad=border_mode, flip_filters=VGG_flip_filters) 55 | self.conv5_1 = Conv2D(in_channels=512, out_channels=512, kernel_size=kernel_size, pad=border_mode, flip_filters=VGG_flip_filters) 56 | self.conv5_2 = Conv2D(in_channels=512, out_channels=512, kernel_size=kernel_size, pad=border_mode, flip_filters=VGG_flip_filters) 57 | self.conv5_3 = Conv2D(in_channels=512, out_channels=512, kernel_size=kernel_size, pad=border_mode, flip_filters=VGG_flip_filters) # (B, C, H, W) 58 | if batchnorm_mode == 
0: 59 | pass 60 | elif batchnorm_mode == 1: 61 | self.bn1 = BatchNorm(input_shape=(None, 64, None, None)) 62 | self.bn2 = BatchNorm(input_shape=(None, 128, None, None)) 63 | self.bn3 = BatchNorm(input_shape=(None, 256, None, None)) 64 | self.bn4 = BatchNorm(input_shape=(None, 512, None, None)) 65 | self.bn5 = BatchNorm(input_shape=(None, 512, None, None)) 66 | else: 67 | raise ValueError('batchnorm_mode should = 0 or 1') 68 | #--- detection part ---# 69 | if im2col is not None: # implementation strictly follow the reference paper 70 | self.lstm_f = LSTM(input_dims=512*9, hidden_dim=128, grad_clipping=100) # (W, B*H, C) 71 | self.lstm_b = LSTM(input_dims=512*9, hidden_dim=128, grad_clipping=100) # (W, B*H, C) 72 | self.conv_rpn = Conv2D(in_channels=256, out_channels=512, kernel_size=1, pad='same') # (B, C, H, W) 73 | self.conv_cls_score = Conv2D(in_channels=512, out_channels=2 * k, kernel_size=1, pad='same') # (B, 2*k, H, W) 74 | if do_side_refinement_regress: 75 | self.conv_bbox_pred = Conv2D(in_channels=512, out_channels=3 * k, kernel_size=1, pad='same') # (B, 3*k, H, W), include side-refinement 76 | else: 77 | self.conv_bbox_pred = Conv2D(in_channels=512, out_channels=2 * k, kernel_size=1, pad='same') # (B, 2*k, H, W), no side-refinement 78 | 79 | else: # implementation putting convolution before RNN, doesn't follow the reference paper 80 | self.conv_rpn = Conv2D(in_channels=512, out_channels=512, kernel_size=(3,3), pad=(1,1)) # (B, C, H, W) 81 | self.lstm_f = LSTM(input_dims=512, hidden_dim=128, grad_clipping=100) # (W, B*H, C) 82 | self.lstm_b = LSTM(input_dims=512, hidden_dim=128, grad_clipping=100) # (W, B*H, C) 83 | self.conv_cls_score = Conv2D(in_channels=256, out_channels=2*k, kernel_size=1, pad='same') # (B, 2*k, H, W) 84 | if do_side_refinement_regress: 85 | self.conv_bbox_pred = Conv2D(in_channels=256, out_channels=3*k, kernel_size=1, pad='same') # (B, 3*k, H, W), include side-refinement 86 | else: 87 | self.conv_bbox_pred = Conv2D(in_channels=256, 
out_channels=2*k, kernel_size=1, pad='same') # (B, 2*k, H, W), no side-refinement 88 | 89 | def forward(self, x): 90 | """ 91 | :param x: (B, C, H, W) 92 | :return: cls_score, bbox # (B, H, W, k, 2), (B, H, W, k, n) n = 2 or 3 93 | """ 94 | self.work_mode = 'train' 95 | 96 | #--- encoding part, VGG16, 13 conv layers ---# 97 | x = relu(self.conv1_1.forward(x)) # (B, 64, 224, 224) 98 | x = relu(self.conv1_2.forward(x)) 99 | if self.batchnorm_mode == 1: 100 | x = self.bn1.forward(x) 101 | x = pool_2d(x, ws=(2, 2)) # (B, 64, 112, 112) 102 | x = relu(self.conv2_1.forward(x)) 103 | x = relu(self.conv2_2.forward(x)) 104 | if self.batchnorm_mode == 1: 105 | x = self.bn2.forward(x) 106 | x = pool_2d(x, ws=(2, 2)) # (B, 128, 56, 56) 107 | x = relu(self.conv3_1.forward(x)) 108 | x = relu(self.conv3_2.forward(x)) 109 | x = relu(self.conv3_3.forward(x)) 110 | if self.batchnorm_mode == 1: 111 | x = self.bn3.forward(x) 112 | x = pool_2d(x, ws=(2, 2)) # (B, 256, 28, 28) 113 | x = relu(self.conv4_1.forward(x)) 114 | x = relu(self.conv4_2.forward(x)) 115 | x = relu(self.conv4_3.forward(x)) 116 | if self.batchnorm_mode == 1: 117 | x = self.bn4.forward(x) 118 | x = pool_2d(x, ws=(2, 2)) # (B, 512, 14, 14) 119 | x = relu(self.conv5_1.forward(x)) 120 | x = relu(self.conv5_2.forward(x)) 121 | x = relu(self.conv5_3.forward(x)) 122 | if self.batchnorm_mode == 1: 123 | x = self.bn5.forward(x) 124 | #--- detection part ---# 125 | if self.im2col is not None: 126 | x = self.im2col(x, nb_size=(3,3), merge_channel=True) # (B, H, W, C*h*w) 127 | B, H, W, Chw = x.shape 128 | x = x.dimshuffle((2, 0, 1, 3)) 129 | x = x.reshape((W, -1, Chw)) # (W, B*H, C*h*w) 130 | else: 131 | x = relu(self.conv_rpn.forward(x)) 132 | B, C, H, W = x.shape 133 | x = x.dimshuffle((3, 0, 2, 1)) # (W, B, H, C) 134 | x = x.reshape((W, -1, C)) # (W, B*H, C) 135 | 136 | x_f = self.lstm_f.forward(x) 137 | x_b = self.lstm_b.forward(x, backward=True) 138 | x = tensor.concatenate([x_f, x_b], axis=2) # (W, B*H, 256) 139 | x =
x.reshape((W, B, H, -1)) 140 | x = x.dimshuffle((1, 3, 2, 0)) # (B, 256, H, W) 141 | if self.im2col is not None: 142 | x = relu(self.conv_rpn.forward(x)) # (B, 512, H, W) 143 | 144 | cls_score = self.conv_cls_score.forward(x) # (B, 2*k, H, W) 145 | cls_score = cls_score.reshape((B, 2, self.k, H, W)) # (B, 2, k, H, W) 146 | cls_score = softmax(cls_score.dimshuffle((0, 3, 4, 2, 1))) # (B, H, W, k, 2) 147 | bbox = self.conv_bbox_pred.forward(x) # (B, 3*k, H, W), no activation applied 148 | bbox = bbox.dimshuffle((0, 2, 3, 1, 'x')) # (B, H, W, 3*k, 1) 149 | bbox = bbox.reshape((B, H, W, self.k, -1)) # (B, H, W, k, n) 150 | 151 | return cls_score, bbox # (B, H, W, k, 2), (B, H, W, k, n) n = 2 or 3 152 | 153 | def predict(self, x): 154 | self.work_mode = 'inference' 155 | 156 | #--- encoding part, VGG16, 13 conv layers ---# 157 | x = relu(self.conv1_1.predict(x)) # (B, 64, 224, 224) 158 | x = relu(self.conv1_2.predict(x)) 159 | if self.batchnorm_mode == 1: 160 | x = self.bn1.predict(x) 161 | x = pool_2d(x, ws=(2, 2)) # (B, 64, 112, 112) 162 | x = relu(self.conv2_1.predict(x)) 163 | x = relu(self.conv2_2.predict(x)) 164 | if self.batchnorm_mode == 1: 165 | x = self.bn2.predict(x) 166 | x = pool_2d(x, ws=(2, 2)) # (B, 128, 56, 56) 167 | x = relu(self.conv3_1.predict(x)) 168 | x = relu(self.conv3_2.predict(x)) 169 | x = relu(self.conv3_3.predict(x)) 170 | if self.batchnorm_mode == 1: 171 | x = self.bn3.predict(x) 172 | x = pool_2d(x, ws=(2, 2)) # (B, 256, 28, 28) 173 | x = relu(self.conv4_1.predict(x)) 174 | x = relu(self.conv4_2.predict(x)) 175 | x = relu(self.conv4_3.predict(x)) 176 | if self.batchnorm_mode == 1: 177 | x = self.bn4.predict(x) 178 | x = pool_2d(x, ws=(2, 2)) # (B, 512, 14, 14) 179 | x = relu(self.conv5_1.predict(x)) 180 | x = relu(self.conv5_2.predict(x)) 181 | x = relu(self.conv5_3.predict(x)) 182 | if self.batchnorm_mode == 1: 183 | x = self.bn5.predict(x) 184 | #--- detection part ---# 185 | if self.im2col is not None: 186 | x = self.im2col(x, 
nb_size=(3,3), merge_channel=True) # (B, H, W, C*h*w) 187 | B, H, W, Chw = x.shape 188 | x = x.dimshuffle((2, 0, 1, 3)) 189 | x = x.reshape((W, -1, Chw)) # (W, B*H, C*h*w) 190 | else: 191 | x = relu(self.conv_rpn.predict(x)) 192 | B, C, H, W = x.shape 193 | x = x.dimshuffle((3, 0, 2, 1)) # (W, B, H, C) 194 | x = x.reshape((W, -1, C)) # (W, B*H, C) 195 | 196 | x_f = self.lstm_f.predict(x) 197 | x_b = self.lstm_b.predict(x, backward=True) 198 | x = tensor.concatenate([x_f, x_b], axis=2) # (W, B*H, 256) 199 | x = x.reshape((W, B, H, -1)) 200 | x = x.dimshuffle((1, 3, 2, 0)) # (B, 256, H, W) 201 | if self.im2col is not None: 202 | x = relu(self.conv_rpn.predict(x)) # (B, 512, H, W) 203 | 204 | cls_score = self.conv_cls_score.predict(x) # (B, 2*k, H, W) 205 | cls_score = cls_score.reshape((B, 2, self.k, H, W)) # (B, 2, k, H, W) 206 | cls_score = softmax(cls_score.dimshuffle((0, 3, 4, 2, 1))) # (B, H, W, k, 2) 207 | bbox = self.conv_bbox_pred.predict(x) # (B, 3*k, H, W), no activation applied 208 | bbox = bbox.dimshuffle((0, 2, 3, 1, 'x')) # (B, H, W, 3*k, 1) 209 | bbox = bbox.reshape((B, H, W, self.k, -1)) # (B, H, W, k, n) 210 | 211 | return cls_score, bbox # (B, H, W, k, 2), (B, H, W, k, n) n = 2 or 3 -------------------------------------------------------------------------------- /docs/dandelion_model.md: -------------------------------------------------------------------------------- 1 | ## VGG-16 network 2 | Reference implementation of the classic [VGG-16](https://arxiv.org/abs/1409.1556) network 3 | 4 | ```python 5 | class model_VGG16(channel=3, im_height=224, im_width=224, Nclass=1000, 6 | kernel_size=3, border_mode=(1, 1), flip_filters=False) 7 | ``` 8 | * **channel**: input channel number 9 | * **Nclass**: output class number 10 | 11 | The model accepts input of shape in the order of (B, C, H, W), and outputs with shape (B, N). 
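
As a rough illustration of the shape contract above, the following sketch traces the spatial size of a feature map through a VGG-16 style encoder. It assumes the standard VGG-16 layout (five convolution stages, each followed by a 2x2 pooling with stride 2) together with the defaults `kernel_size=3` and `border_mode=(1, 1)`. The helper names `conv_out_size` and `vgg16_feature_map_size` are hypothetical and are not part of Dandelion.

```python
def conv_out_size(size, kernel_size=3, pad=1, stride=1):
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * pad - kernel_size) // stride + 1


def vgg16_feature_map_size(im_height=224, im_width=224, kernel_size=3, pad=1):
    """Trace (H, W) through 5 conv stages, assuming the standard VGG-16 layout."""
    convs_per_stage = (2, 2, 3, 3, 3)  # 13 convolution layers in total
    h, w = im_height, im_width
    for n_conv in convs_per_stage:
        for _ in range(n_conv):
            h = conv_out_size(h, kernel_size, pad)
            w = conv_out_size(w, kernel_size, pad)
        h, w = h // 2, w // 2  # 2x2 pooling with stride 2 halves each axis
    return h, w


print(vgg16_feature_map_size(224, 224))  # (7, 7)
```

With a 3x3 kernel and padding of 1, each convolution keeps the spatial size, so only the pooling stages shrink the map. Under these assumptions a 224x224 input ends at 7x7 before the dense layers that produce the (B, N) output.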
12 | 13 | _______________________________________________________________________ 14 | ## Depthwise Separable Convolution 15 | Reference implementation of [Depthwise Separable Convolution](https://arxiv.org/abs/1610.02357) 16 | 17 | ```python 18 | class DSConv2D(in_channels, out_channels, kernel_size=(3,3), stride=(1,1), 19 | dilation=(1,1), pad='valid') 20 | ``` 21 | * **in_channels**: int. Input shape is (B, in_channels, H_in, W_in) 22 | * **out_channels**: int. Output shape is (B, out_channels, H_out, W_out) 23 | * **kernel_size**: int scalar or tuple of int. Convolution kernel size 24 | * **stride**: factor by which to subsample the output 25 | * **pad**: `same`/`valid`/`full` or 2-element tuple of int. Controls image border padding. 26 | * **dilation**: factor by which to subsample (stride) the input. 27 | 28 | The module first applies a depthwise 2D convolution to each input channel separately, then maps the result to `out_channels` channels with a pointwise 1*1 convolution. No activation is applied inside. 29 | 30 | _______________________________________________________________________ 31 | ## ResNet bottleneck 32 | Reference implementation of the bottleneck building block of the [ResNet](https://arxiv.org/abs/1512.03385) network 33 | 34 | ```python 35 | class ResNet_bottleneck(outer_channel=256, inner_channel=64, border_mode='same', 36 | batchnorm_mode=1, activation=relu) 37 | ``` 38 | * **outer_channel**: channel number of block input 39 | * **inner_channel**: channel number inside the block 40 | * **batchnorm_mode**: {0 | *1* | 2}. 0 means no batch normalization is applied; 1 means batch normalization is applied to each convolution layer; 2 means batch normalization is applied only to the last convolution layer 41 | * **activation**: default = relu. **Note that no activation is applied to the final element-wise sum output.** 42 | 43 | The model accepts input of shape (B, C, H, W) and produces output of the same shape.
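
The bottleneck block described above can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not Dandelion's implementation: it assumes the usual 1x1 reduce, 3x3, 1x1 expand layout from the ResNet paper, omits batch normalization (comparable to `batchnorm_mode=0`), and applies no activation to the last convolution or to the sum. The names `conv2d_same` and `bottleneck` are hypothetical.

```python
import numpy as np


def conv2d_same(x, w):
    """Naive 'same' cross-correlation. x: (B, C_in, H, W), w: (C_out, C_in, kh, kw), odd kernel."""
    kh, kw = w.shape[2:]
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((0, 0), (0, 0), (ph, ph), (pw, pw)))  # zero padding
    H, W = x.shape[2:]
    out = np.zeros((x.shape[0], w.shape[0], H, W), dtype=np.result_type(x, w))
    for i in range(kh):
        for j in range(kw):
            # mix channels for one kernel offset, then accumulate
            out += np.einsum('bchw,oc->bohw', xp[:, :, i:i + H, j:j + W], w[:, :, i, j])
    return out


def relu(x):
    return np.maximum(x, 0)


def bottleneck(x, w_reduce, w_mid, w_expand):
    """Residual bottleneck sketch: output has the same shape as the input."""
    y = relu(conv2d_same(x, w_reduce))  # 1x1, outer_channel -> inner_channel
    y = relu(conv2d_same(y, w_mid))     # 3x3, inner_channel -> inner_channel
    y = conv2d_same(y, w_expand)        # 1x1, inner_channel -> outer_channel
    return x + y                        # element-wise sum, no activation afterwards


rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8, 5, 5))           # (B, outer_channel, H, W)
w_reduce = rng.standard_normal((2, 8, 1, 1))    # inner_channel = 2
w_mid = rng.standard_normal((2, 2, 3, 3))
w_expand = rng.standard_normal((8, 2, 1, 1))
print(bottleneck(x, w_reduce, w_mid, w_expand).shape)  # (2, 8, 5, 5)
```

Because the sum is returned without an activation, negative values of the input pass through the skip connection unchanged; a caller that wants the original paper's post-sum ReLU would apply it outside the block.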
44 | 45 | _______________________________________________________________________ 46 | ## Feature Pyramid Network 47 | Reference implementation of [feature pyramid network](https://arxiv.org/abs/1612.03144) 48 | 49 | ```python 50 | class model_FPN(input_channel=3, base_n_filters=64, batchnorm_mode=1) 51 | ``` 52 | * **batchnorm_mode**: same as in `ResNet_bottleneck` 53 | * **return**: 4-element tuple `(p2, p3, p4, p5)`, CNN pyramid features at different scales, each with #channel = 4 * `base_n_filters` 54 | 55 | _______________________________________________________________________ 56 | ## ShuffleUnit 57 | Reference implementation of [shuffle-net](https://arxiv.org/abs/1707.01083) unit 58 | 59 | ```python 60 | class ShuffleUnit(in_channels=256, inner_channels=None, out_channels=None, group_num=4, border_mode='same', 61 | batchnorm_mode=1, activation=relu, stride=(1,1), dilation=(1,1), fusion_mode='add') 62 | ``` 63 | * **in_channels**: channel number of unit input 64 | * **inner_channels**: optional, channel number inside the unit, default = `in_channels//4` 65 | * **out_channels**: channel number of unit output, only used when `fusion_mode` = 'concat', and must be greater than `in_channels` 66 | * **group_num**: number of convolution groups 67 | * **border_mode**: only `same` allowed 68 | * **batchnorm_mode**: {0 | *1* | 2}. 0 means no batch normalization is applied; 1 means batch normalization is applied to each convolution layer; 2 means batch normalization is applied only to the last convolution layer 69 | * **activation**: default = relu. **Note that no activation is applied to the last output.** 70 | * **stride, dilation**: only used for the depthwise separable convolution module inside 71 | * **fusion_mode**: {'add' | 'concat'}. When 'concat', `out_channels` must be greater than `in_channels`.
72 | * **return**: convolution result with #channel = `in_channels` when `fusion_mode`='add', #channel = `out_channels` when `fusion_mode`='concat' 73 | 74 | ## ShuffleUnit_Stack 75 | Reference implementation of shuffle-net unit stack 76 | 77 | ```python 78 | class ShuffleUnit_Stack(in_channels, inner_channels=None, out_channels=None, group_num=4, batchnorm_mode=1, 79 | activation=relu, stack_size=3, stride=2, fusion_mode='concat') 80 | ``` 81 | * **in_channels**: channel number of input 82 | * **inner_channels**: optional, channel number inside the shuffle-unit, default = `in_channels//4` 83 | * **out_channels**: channel number of stack output, must be greater than `in_channels` 84 | * **group_num**: number of convolution groups 85 | * **batchnorm_mode**: {0 | *1* | 2}. 0 means no batch normalization is applied; 1 means batch normalization is applied to each convolution layer; 2 means batch normalization is applied only to the last convolution layer 86 | * **activation**: default = relu. **Note that no activation is applied to the last output.** 87 | * **stack_size**: number of shuffle-units in the stack 88 | * **stride**: int or tuple of int, convolution stride for the first unit, default=2 89 | * **fusion_mode**: fusion_mode for the first unit. 90 | 91 | ## ShuffleNet 92 | Reference implementation of [shuffle-net](https://arxiv.org/abs/1707.01083), without the final pooling & Dense layer. 93 | 94 | ```python 95 | class model_ShuffleNet(in_channels, group_num=4, stage_channels=(24, 272, 544, 1088), stack_size=(3, 7, 3), 96 | batchnorm_mode=1, activation=relu) 97 | ``` 98 | * **in_channels**: channel number of input 99 | * **group_num**: number of convolution groups 100 | * **stage_channels**: channel number of each stage output. 101 | * **stack_size**: size of each stack. 102 | * **batchnorm_mode**: {0 | *1* | 2}. 0 means no batch normalization is applied; 1 means batch normalization is applied to each convolution layer; 2 means batch normalization is applied only to the last convolution layer 103 | * **activation**: default = relu.
**Note that no activation is applied to the last output.** 104 | 105 | _______________________________________________________________________ 106 | ## ShuffleUnit_v2 107 | Reference implementation of [shufflenet_v2](https://arxiv.org/abs/1807.11164) unit 108 | 109 | ```python 110 | class ShuffleUnit_v2(in_channels=256, out_channels=None, border_mode='same', batchnorm_mode=1, 111 | activation=relu, stride=1, dilation=1) 112 | ``` 113 | * **in_channels**: channel number of unit input 114 | * **out_channels**: channel number of unit output, only used when `stride`>1; when `stride`=1, `out_channels` is fixed to `in_channels`. 115 | * **border_mode**: only `same` allowed 116 | * **batchnorm_mode**: {0 | *1* | 2}. 0 means no batch normalization is applied; 1 means batch normalization is applied to each convolution layer; 2 means batch normalization is applied only to the last convolution layer 117 | * **activation**: default = relu. **Note that no activation is applied to the last output.** 118 | * **stride, dilation**: only used for the depthwise separable convolution module inside, must be integer scalars or tuples of integers. 119 | 120 | ## ShuffleUnit_v2_Stack 121 | Reference implementation of shufflenet_v2 unit stack 122 | 123 | ```python 124 | class ShuffleUnit_v2_Stack(in_channels, out_channels, batchnorm_mode=1, activation=relu, stack_size=3, stride=2) 125 | ``` 126 | * **in_channels**: channel number of input 127 | * **out_channels**: channel number of stack output 128 | * **batchnorm_mode**: {0 | *1* | 2}. 0 means no batch normalization is applied; 1 means batch normalization is applied to each convolution layer; 2 means batch normalization is applied only to the last convolution layer 129 | * **activation**: default = relu.
**Note that no activation is applied to the last output.** 130 | * **stack_size**: number of shuffle-units in the stack 131 | * **stride**: int or tuple of int, convolution stride for the first unit, default=2 132 | 133 | 134 | ## ShuffleNet_v2 135 | Reference implementation of [shufflenet_v2](https://arxiv.org/abs/1807.11164), without the final pooling & Dense layer. 136 | 137 | ```python 138 | class model_ShuffleNet_v2(in_channels, stage_channels=(24, 116, 232, 464, 1024), stack_size=(3, 7, 3), 139 | batchnorm_mode=1, activation=relu) 140 | ``` 141 | * **in_channels**: channel number of input 142 | * **stage_channels**: channel number of each stage output. 143 | * **stack_size**: size of each stack. 144 | * **batchnorm_mode**: {0 | *1* | 2}. 0 means no batch normalization is applied; 1 means batch normalization is applied to each convolution layer; 2 means batch normalization is applied only to the last convolution layer 145 | * **activation**: default = relu. **Note that no activation is applied to the last output.** 146 | 147 | _______________________________________________________________________ 148 | ## CTPN 149 | Model reference implementation of [CTPN](https://arxiv.org/abs/1609.03605) 150 | 151 | ```python 152 | class model_CTPN(k=10, do_side_refinement_regress=False, 153 | batchnorm_mode=1, channel=3, im_height=None, im_width=None, 154 | kernel_size=3, border_mode=(1, 1), VGG_flip_filters=False, 155 | im2col=None) 156 | ``` 157 | * **k**: anchor box number 158 | * **do_side_refinement_regress**: whether to implement side refinement regression 159 | * **batchnorm_mode**: {0|*1*}, whether to insert batch normalization at the end of each convolution stage of the VGG-16 net, useful for cold start.
160 | * **channel**: input channel number 161 | * **im_height, im_width**: input image height/width, optional 162 | * **kernel_size**: convolution kernel size of the VGG-16 net 163 | * **border_mode**: border mode of the VGG-16 net 164 | * **VGG_flip_filters**: whether to flip convolution kernels for the VGG-16 net 165 | * **im2col**: function corresponding to Caffe's `im2col()`. If `None`, the CTPN implementation will not strictly follow the original paper. 166 | 167 | _______________________________________________________________________ 168 | ## U-net FCN 169 | Reference implementation of [U-net](https://arxiv.org/abs/1505.04597) FCN 170 | 171 | ```python 172 | class model_Unet(channel=1, im_height=128, im_width=128, Nclass=2, kernel_size=3, 173 | border_mode='same', base_n_filters=64, output_activation=softmax) 174 | ``` 175 | * **channel**: input channel number 176 | * **Nclass**: output channel number 177 | 178 | The model accepts input of shape (B, C, H, W) and produces output of shape (B, H, W, C). 179 | 180 | _______________________________________________________________________ 181 | ## Shuffle-Seg network 182 | Model reference implementation of [ShuffleSeg](https://arxiv.org/abs/1803.03816) 183 | 184 | ```python 185 | class model_ShuffleSeg(in_channels=1, Nclass=6, SF_group_num=4, SF_stage_channels=(24, 272, 544, 1088), 186 | SF_stack_size=(3, 7, 3), SF_batchnorm_mode=1, SF_activation=relu) 187 | ``` 188 | * **in_channels**: channel number of input 189 | * **Nclass**: output class number 190 | * **SF_group_num**: number of convolution groups for the internal ShuffleNet encoder. 191 | * **SF_stage_channels**: channel number of each stage output for the internal ShuffleNet encoder. 192 | * **SF_stack_size**: size of each stack for the internal ShuffleNet encoder. 193 | * **SF_batchnorm_mode**: {0 | *1* | 2}. 0 means no batch normalization is applied; 1 means batch normalization is applied to each convolution layer; 2 means batch normalization is applied only to the last convolution layer.
For the internal ShuffleNet encoder. 194 | * **SF_activation**: default = relu. For the internal ShuffleNet encoder. 195 | 196 | _______________________________________________________________________ 197 | ## Alternate 2D LSTM 198 | LSTM2D implementation by alternating LSTM along different dimensions. 199 | Input shape = `(H, W, B, C)` 200 | 201 | ```python 202 | class Alternate_2D_LSTM(input_dims, hidden_dim, peephole=True, initializer=init.Normal(0.1), grad_clipping=0, 203 | hidden_activation=tanh, learn_ini=False, truncate_gradient=-1, mode=2) 204 | ``` 205 | All the arguments are the same as those of the `LSTM` module, except for `mode`. 206 | 207 | * **mode**: {0 | 1 | *2*}. 208 | 0: concat mode, 1D LSTM results from horizontal and vertical dimensions are concatenated along the `C` dimension, i.e., 209 | $result = concat(horizontal\_LSTM(input), vertical\_LSTM(input))$; 210 | 1: sequential mode, horizontal and vertical dimensions are processed sequentially, i.e., $result = horizontal\_LSTM(vertical\_LSTM(input))$; 211 | 2: mixed mode, i.e., 212 | $result = horizontal\_LSTM(concat(input, vertical\_LSTM(input)))$ 213 | 214 | ```python 215 | .forward(seq_input, h_ini=(None, None), c_ini=(None, None), seq_mask=None, backward=(False, False), return_final_state=False) 216 | ``` 217 | All the arguments are the same as those of the `LSTM` module 218 | 219 | 220 | ```python 221 | .predict = .forward 222 | ``` --------------------------------------------------------------------------------