├── .gitignore
├── README.md
├── __init__.py
├── adam.py
├── constants.py
├── custom_cmap.py
├── generate_trade_helper.py
├── input_parts
│   ├── __init__.py
│   ├── abs_position_part.py
│   ├── base_input_part.py
│   ├── beat_part.py
│   ├── chord_part.py
│   └── passthrough_part.py
├── instructions
│   ├── generate_trade_helper.md
│   ├── lscat.md
│   ├── lssplit.md
│   ├── main.md
│   ├── param_cvt.md
│   ├── plot_data.md
│   └── plot_internal_state.md
├── leadsheet.py
├── lscat.py
├── lssplit.py
├── main.py
├── models
│   ├── __init__.py
│   ├── compressive_autoencoder_model.py
│   ├── product_model.py
│   └── simple_rel_model.py
├── nametrain
│   └── name_model.py
├── note_encodings
│   ├── __init__.py
│   ├── abs_seq_encoding.py
│   ├── base_encoding.py
│   ├── chord_relative.py
│   ├── circle_of_thirds_encoding.py
│   ├── relative_jump.py
│   └── rhythm_only.py
├── param_cvt.py
├── param_keys
│   ├── ae_abs_keys.txt
│   ├── ae_poex_keys.txt
│   ├── corn_keys.txt
│   ├── poex_keys.txt
│   └── poex_sep_rhythm_keys.txt
├── plot_data.py
├── plot_internal_state.py
├── queue_managers
│   ├── __init__.py
│   ├── nearness_standard_manager.py
│   ├── noise_wrapper.py
│   ├── queue_base.py
│   ├── queueless_standard_manager.py
│   ├── queueless_variational_manager.py
│   ├── sampling_variational_manager.py
│   ├── standard_manager.py
│   └── variational_manager.py
├── relshift_lstm.py
├── training.py
└── util.py

/.gitignore:
--------------------------------------------------------------------------------
1 | *.pyc
2 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Using LSTMprovisor
2 |
3 | This project allows you to train LSTM-based neural network models of jazz music in the .ls format, as well as convert the models into connectome files for use in Impro-Visor.
4 |
5 | ## Dependencies and Setup
6 |
7 | In order to run this project, you will need to install [Python 3.5][] (or later). You will also need the libraries `theano`, `theano-lstm`, `sexpdata`, and `matplotlib`, which you can install with
8 |
9 | ```
10 | pip3 install theano theano-lstm sexpdata matplotlib
11 | ```
12 |
13 | [Python 3.5]: https://www.python.org/downloads/
14 |
15 | Note that Theano depends on SciPy. If you do not already have SciPy, `pip3` should install SciPy automatically when you install Theano, but if that fails, you can download SciPy from [their website][scipy]. Alternately, you can install a Python 3.5 distribution that already has SciPy installed, such as [Anaconda][].
16 |
17 | [scipy]: http://scipy.org/install.html
18 | [Anaconda]: https://www.continuum.io/downloads
19 |
20 | Before using the scripts, you will also need to make a file called `.theanorc` in your home directory, with the following contents:
21 |
22 | ```
23 | [global]
24 | floatX=float32
25 |
26 | mode=FAST_RUN
27 | ```
28 |
29 | For additional Theano configuration options, including instructions on how to use the GPU, see the [theano config documentation][configdoc].
30 |
31 | [configdoc]: http://deeplearning.net/software/theano/library/config.html
32 |
33 | ## Scripts
34 |
35 | Python scripts are provided to accomplish certain tasks. Each script can be invoked using the `python3` executable, for example
36 |
37 | ```
38 | python3 SCRIPTNAME.py [arguments]
39 | ```
40 |
41 | - [main.py](instructions/main.md): The main entry point for the project. Trains different types of models, and generates samples from them.
42 | - [param_cvt.py](instructions/param_cvt.md): Converts trained connectomes from pickle format (.p) to Impro-Visor connectome format (.ctome).
43 | - [plot_internal_state.py](instructions/plot_internal_state.md): Plots the internal state of a network, produced by main.py in generation mode.
44 | - [plot_data.py](instructions/plot_data.md): Plots a .csv file as a graph, allowing you to visualize the training loss of a network.
45 | - [lscat.py](instructions/lscat.md): Concatenates leadsheets together for easier viewing.
46 | - [lssplit.py](instructions/lssplit.md): Splits leadsheets into multiple pieces.
47 | - [generate_trade_helper.py](instructions/generate_trade_helper.md): Interleaves generated output with the original input in a single leadsheet, for use in autoencoder models.
48 |
49 | Detailed instruction pages for each script are available in the instructions subdirectory, and each script will display a help message if given the `-h` argument.
50 |
51 | ## Examples
52 |
53 | Some general examples follow. See the detailed instruction pages for more in-depth examples.
54 |
55 | Train a product-of-experts generative model on a directory of leadsheet files with path `datasets/my_dataset`, automatically resuming training if previously interrupted. By default, each leadsheet file will be split into 4-bar chunks starting at each bar.
56 |
57 | ```
58 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset --resume_auto poex
59 | ```
60 |
61 | Generate some leadsheets using the trained product-of-experts model, sampling from the dataset:
62 |
63 | ```
64 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset/generated --resume 0 output_my_dataset/final_params.p --generate poex
65 | ```
66 |
67 | Visualize the internal state of the network for the first generated leadsheet:
68 |
69 | ```
70 | $ python3 plot_internal_state.py output_my_dataset/generated 0
71 | ```
72 |
73 | Plot the training progress of the network:
74 |
75 | ```
76 | $ python3 plot_data.py output_my_dataset/data.csv
77 | ```
78 |
79 | Convert the trained network into a connectome file:
80 |
81 | ```
82 | $ python3 param_cvt.py --keys param_keys/poex_keys.txt output_my_dataset/final_params.p
83 | ```
84 |
85 | Train a compressing autoencoder on the same dataset, using fixed features and product-of-experts for compatibility with Impro-Visor:
86 |
87 | ```
88 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset_compae --resume_auto compae poex queueless_std --feature_period 24 --add_loss
89 | ```
90 |
91 | Run the autoencoder on some leadsheets from the dataset, and then combine the input and output into a trading summary leadsheet:
92 |
93 | ```
94 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset_compae/generated --resume 0 output_my_dataset_compae/final_params.p --generate compae poex queueless_std --feature_period 24 --add_loss
95 | $ python3 generate_trade_helper.py output_my_dataset_compae/generated
96 | ```
97 |
98 | Convert the trained autoencoder into a connectome file:
99 |
100 | ```
101 | $ python3 param_cvt.py --keys param_keys/ae_poex_keys.txt output_my_dataset_compae/final_params.p
102 | ```
103 |
--------------------------------------------------------------------------------
/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Impro-Visor/lstmprovisor-python/027e394dbf43d31900fe293a22fadfed683256f9/__init__.py
-------------------------------------------------------------------------------- /adam.py: -------------------------------------------------------------------------------- 1 | """ 2 | The MIT License (MIT) 3 | Copyright (c) 2015 Alec Radford 4 | Permission is hereby granted, free of charge, to any person obtaining a copy 5 | of this software and associated documentation files (the "Software"), to deal 6 | in the Software without restriction, including without limitation the rights 7 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 | copies of the Software, and to permit persons to whom the Software is 9 | furnished to do so, subject to the following conditions: 10 | The above copyright notice and this permission notice shall be included in all 11 | copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 13 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 14 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 15 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 16 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 17 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 18 | SOFTWARE. 19 | """ 20 | 21 | import theano 22 | import theano.tensor as T 23 | import numpy as np 24 | 25 | def Adam(cost, params, lr=0.0002, b1=0.1, b2=0.001, e=1e-8): 26 | updates = [] 27 | grads = T.grad(cost, params) 28 | i = theano.shared(np.array(0., theano.config.floatX)) 29 | i_t = i + 1. 30 | fix1 = 1. - (1. - b1)**i_t 31 | fix2 = 1. - (1. - b2)**i_t 32 | lr_t = lr * (T.sqrt(fix2) / fix1) 33 | for p, g in zip(params, grads): 34 | m = theano.shared(p.get_value() * 0.) 35 | v = theano.shared(p.get_value() * 0.) 36 | m_t = (b1 * g) + ((1. - b1) * m) 37 | v_t = (b2 * T.sqr(g)) + ((1. 
- b2) * v) 38 | g_t = m_t / (T.sqrt(v_t) + e) 39 | p_t = p - (lr_t * g_t) 40 | updates.append((m, m_t)) 41 | updates.append((v, v_t)) 42 | updates.append((p, p_t)) 43 | updates.append((i, i_t)) 44 | return updates -------------------------------------------------------------------------------- /constants.py: -------------------------------------------------------------------------------- 1 | from collections import OrderedDict, namedtuple 2 | import numpy as np 3 | 4 | # [ c - d - e f - g - a - b] 5 | CHORD_TYPES = OrderedDict([ 6 | ('', [ 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]), 7 | ('+', [ 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]), 8 | ('6', [ 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0]), 9 | ('7', [ 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0]), 10 | ('9', [ 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0]), 11 | ('M', [ 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]), 12 | ('m', [ 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0]), 13 | ('o', [ 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0]), 14 | ('+7', [ 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0]), 15 | ('11', [ 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0]), 16 | ('13', [ 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0]), 17 | ('69', [ 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0]), 18 | ('7+', [ 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0]), 19 | ('9+', [ 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0]), 20 | ('M6', [ 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0]), 21 | ('M7', [ 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1]), 22 | ('M9', [ 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1]), 23 | ('NC', [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]), 24 | ('h7', [ 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0]), 25 | ('m+', [ 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]), 26 | ('m6', [ 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0]), 27 | ('m7', [ 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0]), 28 | ('m9', [ 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0]), 29 | ('o7', [ 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0]), 30 | ('7#5', [ 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0]), 31 | ('7#9', [ 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0]), 32 | ('7b5', [ 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0]), 33 | ('7b6', [ 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0]), 34 | ('7b9', [ 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0]), 35 | ('9#5', [ 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0]), 36 | ('9b5', [ 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0]), 37 | ('M#5', [ 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]), 38 | ('M69', [ 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0]), 39 | ('M7+', [ 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1]), 40 | ('aug', [ 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]), 41 | ('m#5', [ 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]), 42 | ('m11', [ 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0]), 43 | ('m13', [ 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0]), 44 | ('m69', [ 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0]), 45 | ('mM7', [ 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1]), 46 | ('mM9', [ 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1]), 47 | ('mb6', [ 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]), 48 | ('13#9', [ 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0]), 49 | ('13b5', [ 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0]), 50 | ('13b9', [ 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0]), 51 | ('7#11', [ 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0]), 52 | ('7alt', [ 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0]), 53 | ('7aug', [ 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0]), 54 | ('7b13', [ 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0]), 55 | ('7no5', [ 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0]), 56 | ('7sus', [ 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0]), 57 | ('9#11', [ 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0]), 58 | ('9b13', [ 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0]), 59 | ('9no5', [ 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0]), 60 | ('9sus', [ 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0]), 61 | ('Bass', [ 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]), 62 | ('M7#5', [ 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1]), 63 | ('M7b5', [ 1, 
0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1]), 64 | ('M9#5', [ 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1]), 65 | ('aug7', [ 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0]), 66 | ('m7#5', [ 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0]), 67 | ('m7b5', [ 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0]), 68 | ('m9#5', [ 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0]), 69 | ('m9b5', [ 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0]), 70 | ('sus2', [ 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0]), 71 | ('sus4', [ 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0]), 72 | ('+add9', [ 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0]), 73 | ('13#11', [ 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0]), 74 | ('13sus', [ 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0]), 75 | ('7#5#9', [ 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0]), 76 | ('7#5b9', [ 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0]), 77 | ('7b5#9', [ 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0]), 78 | ('7b5b9', [ 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0]), 79 | ('7sus4', [ 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0]), 80 | ('9sus4', [ 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0]), 81 | ('Blues', [ 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0]), 82 | ('M7#11', [ 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1]), 83 | ('M9#11', [ 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1]), 84 | ('Msus2', [ 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0]), 85 | ('Msus4', [ 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0]), 86 | ('m11#5', [ 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0]), 87 | ('m11b5', [ 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0]), 88 | ('mM7b6', [ 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1]), 89 | ('madd9', [ 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0]), 90 | ('mb6M7', [ 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1]), 91 | ('mb6b9', [ 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]), 92 | ('susb9', [ 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0]), 93 | ('13sus4', [ 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0]), 94 | ('7#9#11', [ 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0]), 95 | ('7#9b13', [ 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0]), 96 | ('7b5b13', [ 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0]), 97 | ('7b9#11', [ 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0]), 98 | ('7b9b13', [ 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0]), 99 | ('7b9sus', [ 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0]), 100 | ('7susb9', [ 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0]), 101 | ('9#5#11', [ 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]), 102 | ('9b5b13', [ 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0]), 103 | ('13#9#11', [ 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0]), 104 | ('13b9#11', [ 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0]), 105 | ('7#11b13', [ 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0]), 106 | ('7b9sus4', [ 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0]), 107 | ('7sus4b9', [ 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0]), 108 | ('9#11b13', [ 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0]), 109 | ('M#5add9', [ 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0]), 110 | ('7#5b9#11', [ 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0]), 111 | ('7b5b9b13', [ 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0]), 112 | ('7#9#11b13', [ 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0]), 113 | ('7b9#11b13', [ 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0]), 114 | ('7b9b13#11', [ 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0]), 115 | ('7b9b13sus4', [ 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0]), 116 | ('7sus4b9b13', [ 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0]), 117 | ]) 118 | 119 | NOTE_OFFSETS = OrderedDict([ 120 | ("c", 0), 121 | ("db", 1), 122 | ("d", 2), 123 | ("eb", 3), 124 | ("e", 4), 125 | ("f", 5), 126 | ("gb", 6), 127 | ("g", 7), 128 | ("ab", 8), 129 | ("a", 9), 130 | ("bb", 10), 131 | ("b", 11), 132 | 133 | ("c#", 1), 134 | ("d#", 3), 135 | ("f#", 6), 136 | ("g#", 8), 137 | ("a#", 10), 138 | 139 | ("cb", 11), 140 | ("fb", 4), 141 | ("e#", 5), 142 | ("b#", 0), 143 | ]) 144 | 145 | CHORD_NOTE_OFFSETS = OrderedDict((k[:1].upper()+k[1:],v%12) for k,v in NOTE_OFFSETS.items()) 146 | 147 | 
WHOLE = 480; # slots in a whole note 148 | HALF = WHOLE//2; # 240 149 | QUARTER = WHOLE//4; # 120 150 | EIGHTH = WHOLE//8; # 60 151 | SIXTEENTH = WHOLE//16; # 30 152 | THIRTYSECOND = WHOLE//32; # 15 153 | 154 | HALF_TRIPLET = 2*HALF//3; # 160 155 | QUARTER_TRIPLET = 2*QUARTER//3; # 80 156 | EIGHTH_TRIPLET = 2*EIGHTH//3; # 40 157 | SIXTEENTH_TRIPLET = 2*SIXTEENTH//3; # 20 158 | THIRTYSECOND_TRIPLET = 2*THIRTYSECOND//3; # 10 159 | 160 | QUARTER_QUINTUPLET = 4*QUARTER//5; # 96 161 | EIGHTH_QUINTUPLET = 4*EIGHTH//5; # 48 162 | SIXTEENTH_QUINTUPLET = 4*SIXTEENTH//5; # 24 163 | THIRTYSECOND_QUINTUPLET = 4*THIRTYSECOND//5; # 12 164 | 165 | DOTTED_HALF = 3*HALF//2; # 360 166 | DOTTED_QUARTER = 3*QUARTER//2; # 180 167 | DOTTED_EIGHTH = 3*EIGHTH//2; # 90 168 | DOTTED_SIXTEENTH = 3*SIXTEENTH//2; # 45 169 | 170 | FOUREIGHTIETH = 1; # WHOLE/WHOLE # 1 171 | TWOFORTIETH = 2; # 2 172 | ONETWENTIETH = 4; # 4 173 | SIXTIETH = 8; # 8 174 | 175 | RESOLUTION_SCALAR = 10; 176 | 177 | MIDDLE_C_MIDI = 60 178 | OCTAVE = 12 179 | 180 | 181 | EPSILON = np.finfo(np.float32).eps 182 | 183 | NoteBounds = namedtuple("NoteBounds",["lowbound","highbound"]) 184 | 185 | BOUNDS = NoteBounds(48, 84+1) 186 | -------------------------------------------------------------------------------- /custom_cmap.py: -------------------------------------------------------------------------------- 1 | 2 | from matplotlib.colors import ListedColormap 3 | from numpy import nan, inf 4 | 5 | # Used to reconstruct the colormap in viscm 6 | parameters = {'xp': [24.397622053872112, 22.329064134250359, -3.5279098610215271, -7.3202660469947318, -3.5279098610215271], 7 | 'yp': [10.224586288416106, 29.531126871552431, 21.60165484633572, 1.6055949566588197, -1.4972419227738101], 8 | 'min_Jp': 18.730158730158728, 9 | 'max_Jp': 99.6190476190476} 10 | 11 | cm_data = [[ 3.19188006e-01, 5.08411011e-04, 4.27847166e-02], 12 | [ 3.23452429e-01, 2.94004376e-03, 4.29498405e-02], 13 | [ 3.27697959e-01, 5.48550076e-03, 4.30937800e-02], 14 | [ 3.31925014e-01, 8.14729136e-03, 4.32108710e-02], 15 | [ 3.36134026e-01, 1.09282090e-02, 4.32932080e-02], 16 | [ 3.40324261e-01, 1.38309580e-02, 4.33544174e-02], 17 | [ 3.44496109e-01, 1.68585499e-02, 4.33863092e-02], 18 | [ 3.48649544e-01, 2.00141344e-02, 4.33873393e-02], 19 | [ 3.52784075e-01, 2.33004930e-02, 4.33674127e-02], 20 | [ 3.56899901e-01, 2.67211674e-02, 4.33188700e-02], 21 | [ 3.60996892e-01, 3.02796868e-02, 4.32405671e-02], 22 | [ 3.65074618e-01, 3.39789376e-02, 4.31416825e-02], 23 | [ 3.69133039e-01, 3.78227502e-02, 4.30181093e-02], 24 | [ 3.73172062e-01, 4.17796063e-02, 4.28631975e-02], 25 | [ 3.77191242e-01, 4.56665666e-02, 4.26883879e-02], 26 | [ 3.81190385e-01, 4.94889072e-02, 4.24937807e-02], 27 | [ 3.85169297e-01, 5.32578705e-02, 4.22682848e-02], 28 | [ 3.89127602e-01, 5.69807657e-02, 4.20217552e-02], 29 | [ 3.93065023e-01, 6.06644181e-02, 4.17567112e-02], 30 | [ 3.96981266e-01, 6.43150594e-02, 4.14725462e-02], 31 | [ 4.00875841e-01, 6.79394110e-02, 4.11600480e-02], 32 | [ 4.04748512e-01, 7.15406015e-02, 4.08309528e-02], 33 | [ 4.08598937e-01, 7.51227174e-02, 4.04858381e-02], 34 | [ 4.12426752e-01, 7.86893890e-02, 4.01206455e-02], 35 | [ 4.16231263e-01, 8.22452517e-02, 3.97338038e-02], 36 | [ 4.20012268e-01, 8.57920330e-02, 3.93348179e-02], 37 | [ 4.23769406e-01, 8.93321242e-02, 3.89262925e-02], 38 | [ 4.27502256e-01, 9.28678889e-02, 3.85095772e-02], 39 | [ 4.31210387e-01, 9.64014725e-02, 3.80861373e-02], 40 | [ 4.34892901e-01, 9.99361677e-02, 3.76509851e-02], 41 | [ 4.38549738e-01, 1.03472432e-01, 
3.72127744e-02], 42 | [ 4.42180470e-01, 1.07011798e-01, 3.67737224e-02], 43 | [ 4.45784643e-01, 1.10555736e-01, 3.63357812e-02], 44 | [ 4.49361798e-01, 1.14105584e-01, 3.59010342e-02], 45 | [ 4.52911477e-01, 1.17662562e-01, 3.54716989e-02], 46 | [ 4.56432847e-01, 1.21228614e-01, 3.50467949e-02], 47 | [ 4.59925736e-01, 1.24804054e-01, 3.46317940e-02], 48 | [ 4.63389808e-01, 1.28389545e-01, 3.42302550e-02], 49 | [ 4.66824637e-01, 1.31985895e-01, 3.38449160e-02], 50 | [ 4.70229810e-01, 1.35593819e-01, 3.34786526e-02], 51 | [ 4.73604927e-01, 1.39213951e-01, 3.31344783e-02], 52 | [ 4.76949608e-01, 1.42846843e-01, 3.28155440e-02], 53 | [ 4.80263491e-01, 1.46492970e-01, 3.25251366e-02], 54 | [ 4.83546237e-01, 1.50152732e-01, 3.22666779e-02], 55 | [ 4.86797340e-01, 1.53826781e-01, 3.20427670e-02], 56 | [ 4.90016620e-01, 1.57515177e-01, 3.18577613e-02], 57 | [ 4.93203933e-01, 1.61217925e-01, 3.17160407e-02], 58 | [ 4.96359052e-01, 1.64935164e-01, 3.16215131e-02], 59 | [ 4.99481779e-01, 1.68666973e-01, 3.15782102e-02], 60 | [ 5.02571949e-01, 1.72413379e-01, 3.15902848e-02], 61 | [ 5.05629430e-01, 1.76174354e-01, 3.16620080e-02], 62 | [ 5.08654123e-01, 1.79949823e-01, 3.17977662e-02], 63 | [ 5.11645963e-01, 1.83739667e-01, 3.20020585e-02], 64 | [ 5.14604920e-01, 1.87543723e-01, 3.22794932e-02], 65 | [ 5.17530997e-01, 1.91361790e-01, 3.26347858e-02], 66 | [ 5.20424232e-01, 1.95193632e-01, 3.30727557e-02], 67 | [ 5.23284694e-01, 1.99038983e-01, 3.35983244e-02], 68 | [ 5.26112485e-01, 2.02897548e-01, 3.42165124e-02], 69 | [ 5.28907738e-01, 2.06769007e-01, 3.49324380e-02], 70 | [ 5.31670618e-01, 2.10653019e-01, 3.57513149e-02], 71 | [ 5.34401314e-01, 2.14549225e-01, 3.66784508e-02], 72 | [ 5.37100046e-01, 2.18457250e-01, 3.77192462e-02], 73 | [ 5.39767058e-01, 2.22376711e-01, 3.88791931e-02], 74 | [ 5.42402617e-01, 2.26307210e-01, 4.01638742e-02], 75 | [ 5.45007013e-01, 2.30248347e-01, 4.15510718e-02], 76 | [ 5.47580556e-01, 2.34199717e-01, 4.30359691e-02], 77 | [ 5.50123575e-01, 2.38160913e-01, 4.46216703e-02], 78 | [ 5.52636413e-01, 2.42131527e-01, 4.63067669e-02], 79 | [ 5.55119432e-01, 2.46111156e-01, 4.80895724e-02], 80 | [ 5.57573004e-01, 2.50099400e-01, 4.99681664e-02], 81 | [ 5.59997514e-01, 2.54095864e-01, 5.19404378e-02], 82 | [ 5.62393356e-01, 2.58100161e-01, 5.40041260e-02], 83 | [ 5.64760934e-01, 2.62111911e-01, 5.61568596e-02], 84 | [ 5.67100658e-01, 2.66130744e-01, 5.83961918e-02], 85 | [ 5.69412966e-01, 2.70156278e-01, 6.07196404e-02], 86 | [ 5.71698290e-01, 2.74188154e-01, 6.31246983e-02], 87 | [ 5.73957035e-01, 2.78226051e-01, 6.56088561e-02], 88 | [ 5.76189628e-01, 2.82269644e-01, 6.81696432e-02], 89 | [ 5.78396498e-01, 2.86318616e-01, 7.08046350e-02], 90 | [ 5.80578073e-01, 2.90372663e-01, 7.35114667e-02], 91 | [ 5.82734784e-01, 2.94431495e-01, 7.62878435e-02], 92 | [ 5.84867059e-01, 2.98494830e-01, 7.91315492e-02], 93 | [ 5.86975325e-01, 3.02562403e-01, 8.20404518e-02], 94 | [ 5.89060008e-01, 3.06633957e-01, 8.50125072e-02], 95 | [ 5.91121531e-01, 3.10709248e-01, 8.80457618e-02], 96 | [ 5.93160315e-01, 3.14788043e-01, 9.11383525e-02], 97 | [ 5.95176778e-01, 3.18870120e-01, 9.42885072e-02], 98 | [ 5.97171334e-01, 3.22955269e-01, 9.74945428e-02], 99 | [ 5.99144397e-01, 3.27043288e-01, 1.00754863e-01], 100 | [ 6.01096374e-01, 3.31133987e-01, 1.04067958e-01], 101 | [ 6.03027671e-01, 3.35227184e-01, 1.07432397e-01], 102 | [ 6.04938690e-01, 3.39322707e-01, 1.10846829e-01], 103 | [ 6.06829831e-01, 3.43420394e-01, 1.14309978e-01], 104 | [ 6.08701490e-01, 3.47520088e-01, 1.17820639e-01], 105 
| [ 6.10554058e-01, 3.51621644e-01, 1.21377673e-01], 106 | [ 6.12387926e-01, 3.55724921e-01, 1.24980008e-01], 107 | [ 6.14203480e-01, 3.59829787e-01, 1.28626629e-01], 108 | [ 6.16001408e-01, 3.63935903e-01, 1.32316392e-01], 109 | [ 6.17781818e-01, 3.68043347e-01, 1.36048549e-01], 110 | [ 6.19545070e-01, 3.72152022e-01, 1.39822260e-01], 111 | [ 6.21291540e-01, 3.76261820e-01, 1.43636721e-01], 112 | [ 6.23021600e-01, 3.80372642e-01, 1.47491171e-01], 113 | [ 6.24735623e-01, 3.84484392e-01, 1.51384889e-01], 114 | [ 6.26433979e-01, 3.88596978e-01, 1.55317195e-01], 115 | [ 6.28117036e-01, 3.92710314e-01, 1.59287442e-01], 116 | [ 6.29785163e-01, 3.96824318e-01, 1.63295020e-01], 117 | [ 6.31438727e-01, 4.00938911e-01, 1.67339345e-01], 118 | [ 6.33078094e-01, 4.05054018e-01, 1.71419867e-01], 119 | [ 6.34703629e-01, 4.09169567e-01, 1.75536060e-01], 120 | [ 6.36315954e-01, 4.13285331e-01, 1.79687212e-01], 121 | [ 6.37915317e-01, 4.17401322e-01, 1.83872936e-01], 122 | [ 6.39501936e-01, 4.21517570e-01, 1.88092897e-01], 123 | [ 6.41076174e-01, 4.25634014e-01, 1.92346660e-01], 124 | [ 6.42638397e-01, 4.29750599e-01, 1.96633813e-01], 125 | [ 6.44188969e-01, 4.33867270e-01, 2.00953958e-01], 126 | [ 6.45728257e-01, 4.37983973e-01, 2.05306713e-01], 127 | [ 6.47256627e-01, 4.42100659e-01, 2.09691714e-01], 128 | [ 6.48774449e-01, 4.46217278e-01, 2.14108606e-01], 129 | [ 6.50282093e-01, 4.50333782e-01, 2.18557051e-01], 130 | [ 6.51780194e-01, 4.54449979e-01, 2.23036476e-01], 131 | [ 6.53268943e-01, 4.58565931e-01, 2.27546728e-01], 132 | [ 6.54748608e-01, 4.62681653e-01, 2.32087599e-01], 133 | [ 6.56219567e-01, 4.66797103e-01, 2.36658794e-01], 134 | [ 6.57682196e-01, 4.70912243e-01, 2.41260026e-01], 135 | [ 6.59136877e-01, 4.75027032e-01, 2.45891017e-01], 136 | [ 6.60583993e-01, 4.79141432e-01, 2.50551496e-01], 137 | [ 6.62023931e-01, 4.83255406e-01, 2.55241197e-01], 138 | [ 6.63457080e-01, 4.87368916e-01, 2.59959863e-01], 139 | [ 6.64884050e-01, 4.91481816e-01, 2.64707020e-01], 140 | [ 6.66305021e-01, 4.95594182e-01, 2.69482635e-01], 141 | [ 6.67720349e-01, 4.99706002e-01, 2.74286507e-01], 142 | [ 6.69130435e-01, 5.03817240e-01, 2.79118400e-01], 143 | [ 6.70535682e-01, 5.07927863e-01, 2.83978083e-01], 144 | [ 6.71936499e-01, 5.12037840e-01, 2.88865326e-01], 145 | [ 6.73333295e-01, 5.16147137e-01, 2.93779904e-01], 146 | [ 6.74726487e-01, 5.20255723e-01, 2.98721596e-01], 147 | [ 6.76116551e-01, 5.24363538e-01, 3.03690119e-01], 148 | [ 6.77503877e-01, 5.28470568e-01, 3.08685290e-01], 149 | [ 6.78888821e-01, 5.32576816e-01, 3.13706971e-01], 150 | [ 6.80271812e-01, 5.36682249e-01, 3.18754950e-01], 151 | [ 6.81653287e-01, 5.40786839e-01, 3.23829018e-01], 152 | [ 6.83033683e-01, 5.44890554e-01, 3.28928966e-01], 153 | [ 6.84413445e-01, 5.48993365e-01, 3.34054587e-01], 154 | [ 6.85793019e-01, 5.53095241e-01, 3.39205675e-01], 155 | [ 6.87172842e-01, 5.57196161e-01, 3.44382044e-01], 156 | [ 6.88553325e-01, 5.61296115e-01, 3.49583542e-01], 157 | [ 6.89934944e-01, 5.65395067e-01, 3.54809950e-01], 158 | [ 6.91318165e-01, 5.69492986e-01, 3.60061066e-01], 159 | [ 6.92703460e-01, 5.73589844e-01, 3.65336688e-01], 160 | [ 6.94091305e-01, 5.77685610e-01, 3.70636614e-01], 161 | [ 6.95482183e-01, 5.81780256e-01, 3.75960642e-01], 162 | [ 6.96876579e-01, 5.85873751e-01, 3.81308568e-01], 163 | [ 6.98274958e-01, 5.89966078e-01, 3.86680224e-01], 164 | [ 6.99677633e-01, 5.94057281e-01, 3.92075624e-01], 165 | [ 7.01085271e-01, 5.98147261e-01, 3.97494373e-01], 166 | [ 7.02498381e-01, 6.02235988e-01, 4.02936266e-01], 167 | [ 7.03917477e-01, 
6.06323433e-01, 4.08401097e-01], 168 | [ 7.05343079e-01, 6.10409565e-01, 4.13888660e-01], 169 | [ 7.06775712e-01, 6.14494353e-01, 4.19398747e-01], 170 | [ 7.08215906e-01, 6.18577766e-01, 4.24931150e-01], 171 | [ 7.09664198e-01, 6.22659774e-01, 4.30485657e-01], 172 | [ 7.11120918e-01, 6.26740423e-01, 4.36062323e-01], 173 | [ 7.12586632e-01, 6.30819674e-01, 4.41660917e-01], 174 | [ 7.14062051e-01, 6.34897436e-01, 4.47281028e-01], 175 | [ 7.15547738e-01, 6.38973677e-01, 4.52922438e-01], 176 | [ 7.17044260e-01, 6.43048362e-01, 4.58584925e-01], 177 | [ 7.18552191e-01, 6.47121460e-01, 4.64268267e-01], 178 | [ 7.20072113e-01, 6.51192935e-01, 4.69972235e-01], 179 | [ 7.21604612e-01, 6.55262754e-01, 4.75696601e-01], 180 | [ 7.23150282e-01, 6.59330882e-01, 4.81441132e-01], 181 | [ 7.24709638e-01, 6.63397313e-01, 4.87205706e-01], 182 | [ 7.26282965e-01, 6.67462115e-01, 4.92990519e-01], 183 | [ 7.27871260e-01, 6.71525124e-01, 4.98794821e-01], 184 | [ 7.29475145e-01, 6.75586301e-01, 5.04618361e-01], 185 | [ 7.31095251e-01, 6.79645609e-01, 5.10460886e-01], 186 | [ 7.32732213e-01, 6.83703007e-01, 5.16322138e-01], 187 | [ 7.34386675e-01, 6.87758456e-01, 5.22201851e-01], 188 | [ 7.36059289e-01, 6.91811916e-01, 5.28099756e-01], 189 | [ 7.37750713e-01, 6.95863345e-01, 5.34015578e-01], 190 | [ 7.39461613e-01, 6.99912703e-01, 5.39949035e-01], 191 | [ 7.41192662e-01, 7.03959945e-01, 5.45899838e-01], 192 | [ 7.42944541e-01, 7.08005028e-01, 5.51867692e-01], 193 | [ 7.44717602e-01, 7.12048006e-01, 5.57852798e-01], 194 | [ 7.46512751e-01, 7.16088771e-01, 5.63854543e-01], 195 | [ 7.48330822e-01, 7.20127236e-01, 5.69872417e-01], 196 | [ 7.50172531e-01, 7.24163354e-01, 5.75906084e-01], 197 | [ 7.52038600e-01, 7.28197076e-01, 5.81955204e-01], 198 | [ 7.53929761e-01, 7.32228351e-01, 5.88019423e-01], 199 | [ 7.55846753e-01, 7.36257131e-01, 5.94098377e-01], 200 | [ 7.57790321e-01, 7.40283361e-01, 6.00191688e-01], 201 | [ 7.59761221e-01, 7.44306992e-01, 6.06298969e-01], 202 | [ 7.61760216e-01, 7.48327968e-01, 6.12419817e-01], 203 | [ 7.63788076e-01, 7.52346237e-01, 6.18553814e-01], 204 | [ 7.65845580e-01, 7.56361743e-01, 6.24700530e-01], 205 | [ 7.67933513e-01, 7.60374432e-01, 6.30859514e-01], 206 | [ 7.70052669e-01, 7.64384247e-01, 6.37030303e-01], 207 | [ 7.72203849e-01, 7.68391133e-01, 6.43212411e-01], 208 | [ 7.74387858e-01, 7.72395033e-01, 6.49405338e-01], 209 | [ 7.76605512e-01, 7.76395891e-01, 6.55608559e-01], 210 | [ 7.78857629e-01, 7.80393651e-01, 6.61821529e-01], 211 | [ 7.81145034e-01, 7.84388257e-01, 6.68043683e-01], 212 | [ 7.83468558e-01, 7.88379654e-01, 6.74274427e-01], 213 | [ 7.85829033e-01, 7.92367789e-01, 6.80513147e-01], 214 | [ 7.88227297e-01, 7.96352608e-01, 6.86759199e-01], 215 | [ 7.90664188e-01, 8.00334062e-01, 6.93011914e-01], 216 | [ 7.93140546e-01, 8.04312100e-01, 6.99270593e-01], 217 | [ 7.95657211e-01, 8.08286679e-01, 7.05534505e-01], 218 | [ 7.98215019e-01, 8.12257756e-01, 7.11802892e-01], 219 | [ 8.00814805e-01, 8.16225293e-01, 7.18074960e-01], 220 | [ 8.03457395e-01, 8.20189256e-01, 7.24349883e-01], 221 | [ 8.06143611e-01, 8.24149618e-01, 7.30626801e-01], 222 | [ 8.08874262e-01, 8.28106357e-01, 7.36904818e-01], 223 | [ 8.11650145e-01, 8.32059461e-01, 7.43183002e-01], 224 | [ 8.14472043e-01, 8.36008923e-01, 7.49460385e-01], 225 | [ 8.17340716e-01, 8.39954747e-01, 7.55735965e-01], 226 | [ 8.20256906e-01, 8.43896948e-01, 7.62008700e-01], 227 | [ 8.23221044e-01, 8.47835583e-01, 7.68278333e-01], 228 | [ 8.26233704e-01, 8.51770691e-01, 7.74544189e-01], 229 | [ 8.29295924e-01, 8.55702279e-01, 
7.80804058e-01], 230 | [ 8.32408327e-01, 8.59630412e-01, 7.87056769e-01], 231 | [ 8.35571491e-01, 8.63555173e-01, 7.93301118e-01], 232 | [ 8.38785950e-01, 8.67476662e-01, 7.99535878e-01], 233 | [ 8.42052188e-01, 8.71395001e-01, 8.05759795e-01], 234 | [ 8.45370193e-01, 8.75310323e-01, 8.11973366e-01], 235 | [ 8.48740412e-01, 8.79222758e-01, 8.18175322e-01], 236 | [ 8.52163579e-01, 8.83132479e-01, 8.24362799e-01], 237 | [ 8.55639953e-01, 8.87039686e-01, 8.30534495e-01], 238 | [ 8.59169726e-01, 8.90944601e-01, 8.36689115e-01], 239 | [ 8.62752326e-01, 8.94847329e-01, 8.42829467e-01], 240 | [ 8.66388535e-01, 8.98748189e-01, 8.48950924e-01], 241 | [ 8.70078463e-01, 9.02647473e-01, 8.55051709e-01], 242 | [ 8.73821857e-01, 9.06545384e-01, 8.61132327e-01], 243 | [ 8.77618473e-01, 9.10442054e-01, 8.67193908e-01], 244 | [ 8.81468712e-01, 9.14337976e-01, 8.73231456e-01], 245 | [ 8.85372253e-01, 9.18233284e-01, 8.79246333e-01], 246 | [ 8.89328979e-01, 9.22128095e-01, 8.85239267e-01], 247 | [ 8.93339074e-01, 9.26022993e-01, 8.91205135e-01], 248 | [ 8.97402366e-01, 9.29917617e-01, 8.97149605e-01], 249 | [ 9.01519093e-01, 9.33812678e-01, 9.03066039e-01], 250 | [ 9.05689518e-01, 9.37707793e-01, 9.08958623e-01], 251 | [ 9.09913938e-01, 9.41603448e-01, 9.14822730e-01], 252 | [ 9.14193252e-01, 9.45499026e-01, 9.20662416e-01], 253 | [ 9.18527956e-01, 9.49395047e-01, 9.26471928e-01], 254 | [ 9.22919852e-01, 9.53290490e-01, 9.32255874e-01], 255 | [ 9.27370222e-01, 9.57185424e-01, 9.38009990e-01], 256 | [ 9.31881046e-01, 9.61079371e-01, 9.43732754e-01], 257 | [ 9.36455353e-01, 9.64971274e-01, 9.49424410e-01], 258 | [ 9.41096680e-01, 9.68860078e-01, 9.55083193e-01], 259 | [ 9.45808721e-01, 9.72744936e-01, 9.60704680e-01], 260 | [ 9.50596398e-01, 9.76624444e-01, 9.66285224e-01], 261 | [ 9.55465459e-01, 9.80497024e-01, 9.71819696e-01], 262 | [ 9.60422407e-01, 9.84360987e-01, 9.77301129e-01], 263 | [ 9.65474320e-01, 9.88214632e-01, 9.82720362e-01], 264 | [ 9.70629792e-01, 9.92055757e-01, 9.88067257e-01], 265 | [ 9.75895168e-01, 9.95883564e-01, 9.93326172e-01], 266 | [ 9.81275361e-01, 9.99697973e-01, 9.98479623e-01]] 267 | 268 | cm_data = [x for i,x in enumerate(cm_data) for _ in range((len(cm_data)-i))] 269 | 270 | test_cm = ListedColormap(cm_data, name=__file__) 271 | 272 | 273 | if __name__ == "__main__": 274 | import matplotlib.pyplot as plt 275 | import numpy as np 276 | 277 | try: 278 | from viscm import viscm 279 | viscm(test_cm) 280 | except ImportError: 281 | print("viscm not found, falling back on simple display") 282 | plt.imshow(np.linspace(0, 100, 256)[None, :], aspect='auto', 283 | cmap=test_cm) 284 | plt.show() 285 | -------------------------------------------------------------------------------- /generate_trade_helper.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import os 3 | import leadsheet 4 | import argparse 5 | import pickle 6 | import numpy as np 7 | import lscat 8 | 9 | def main(filedir): 10 | files = [] 11 | with open(os.path.join(filedir,'generated_sources.txt'),'r') as f: 12 | for i,line in enumerate(f): 13 | files.append(line.split(":")[0]) 14 | files.append(os.path.join(filedir,'generated_{}.ls'.format(i))) 15 | lscat.main(files, output=os.path.join(filedir,'generated_trades.ls'), verbose=False) 16 | 17 | parser = argparse.ArgumentParser(description='Helper to concatenate trades into single leadsheet') 18 | parser.add_argument('filedir', help='Directory to process') 19 | 20 | if __name__ == '__main__': 21 | args = 
parser.parse_args()
22 |     main(**vars(args))
23 |
--------------------------------------------------------------------------------
/input_parts/__init__.py:
--------------------------------------------------------------------------------
1 | from .base_input_part import InputPart
2 | from .beat_part import BeatInputPart
3 | from .abs_position_part import PositionInputPart
4 | from .chord_part import ChordShiftInputPart
5 | from .passthrough_part import PassthroughInputPart
--------------------------------------------------------------------------------
/input_parts/abs_position_part.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import theano
3 | import theano.tensor as T
4 |
5 | import constants
6 |
7 | from .base_input_part import InputPart
8 |
9 | class PositionInputPart( InputPart ):
10 |     """
11 |     An input part that constructs a position
12 |     """
13 |
14 |     def __init__(self, low_bound, up_bound, num_divisions):
15 |         """
16 |         Build an input part with num_divisions ranging between low_bound and up_bound.
17 |         This part will activate each division depending on how close the relative_position
18 |         is to it.
19 |         """
20 |         assert num_divisions >= 2, "Must have at least 2 divisions!"
21 |         self.low_bound = low_bound
22 |         self.up_bound = up_bound
23 |         self.PART_WIDTH = num_divisions
24 |
25 |     def generate(self, relative_position, **kwargs):
26 |         """
27 |         Generate a position input for a given timestep.
28 |
29 |         Parameters:
30 |             relative_position: A theano tensor (int32) of shape (n_parallel), giving the
31 |                 current relative position for this timestep
32 |
33 |         Returns:
34 |             piece: A theano tensor (float32) of shape (n_parallel, PART_WIDTH)
35 |         """
36 |         delta = (self.up_bound-self.low_bound) / (self.PART_WIDTH-1)
37 |         indicator_pos = np.array([[i*delta + self.low_bound for i in range(self.PART_WIDTH)]], np.float32)
38 |
39 |         # differences[i][j] is the difference between relative_position[i] and indicator_pos[j]
40 |         differences = T.cast(T.shape_padright(relative_position),'float32') - indicator_pos
41 |
42 |         # We want each indicator to activate at its division point, and fall off linearly around it,
43 |         # capped from 0 to 1.
44 |         activities = T.maximum(0, 1-abs(differences/delta))
45 |
46 |         # activities = theano.printing.Print("PositionInputPart")(activities)
47 |         # activities = T.opt.Assert()(activities, T.eq(activities.shape[1], self.PART_WIDTH))
48 |
49 |         return activities
50 |
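As an illustrative aside (not part of the repository): the triangular activation in PositionInputPart.generate can be checked numerically with plain numpy. The bounds and division count below are invented values for illustration.

```
import numpy as np

# Hypothetical parameters: low_bound=0, up_bound=8, num_divisions=5.
low, up, n = 0.0, 8.0, 5
delta = (up - low) / (n - 1)               # spacing between division points
indicators = low + delta * np.arange(n)    # division points: [0, 2, 4, 6, 8]
pos = 3.0                                  # a position halfway between two points
activities = np.maximum(0.0, 1.0 - np.abs((pos - indicators) / delta))
print(activities)                          # [0.  0.5 0.5 0.  0. ]
```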
--------------------------------------------------------------------------------
/input_parts/base_input_part.py:
--------------------------------------------------------------------------------
1 |
2 | class InputPart( object ):
3 |     """
4 |     Base class for input parts
5 |     """
6 |
7 |     PART_WIDTH = 0
8 |
9 |     def generate(self, **kwargs):
10 |         """
11 |         Generate the appropriate input.
12 |
13 |         Parameters:
14 |             **kwargs: Depending on the particular class, may take in different values.
15 |                 To allow flexibility, all subclasses should ignore any kwargs that they do not need,
16 |                 so this method can be called with all relevant parameters.
17 |
18 |         Returns:
19 |             part: A theano tensor (float32) of shape (n_parallel, PART_WIDTH), where n_parallel
20 |                 is either an explicit parameter or determined by shapes of input
21 |         """
22 |         raise NotImplementedError("generate not implemented")
23 |
--------------------------------------------------------------------------------
/input_parts/beat_part.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import theano
3 | import theano.tensor as T
4 |
5 | import constants
6 |
7 | from .base_input_part import InputPart
8 |
9 | class BeatInputPart( InputPart ):
10 |     """
11 |     An input part that builds a beat
12 |     """
13 |
14 |     BEAT_PERIODS = np.array([x//constants.RESOLUTION_SCALAR for x in [
15 |         constants.WHOLE,
16 |         constants.HALF,
17 |         constants.QUARTER,
18 |         constants.EIGHTH,
19 |         constants.SIXTEENTH,
20 |         constants.HALF_TRIPLET,
21 |         constants.QUARTER_TRIPLET,
22 |         constants.EIGHTH_TRIPLET,
23 |         constants.SIXTEENTH_TRIPLET,
24 |     ]], np.int32)
25 |     PART_WIDTH = len(BEAT_PERIODS)
26 |
27 |     def generate(self, timestep, **kwargs):
28 |         """
29 |         Generate a beat input for a given timestep.
30 |
31 |         Parameters:
32 |             timestep: A theano int of shape (n_parallel). The current timestep to generate the beat input for.
33 |
34 |         Returns:
35 |             piece: A theano tensor (float32) of shape (n_parallel, PART_WIDTH)
36 |         """
37 |
38 |         result = T.eq(T.shape_padright(timestep) % np.expand_dims(self.BEAT_PERIODS,0), 0)
39 |
40 |         # result = theano.printing.Print("BeatInputPart")(result)
41 |         # result = T.opt.Assert()(result, T.eq(result.shape[1], self.PART_WIDTH))
42 |         return result
43 |
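Another illustrative aside (not part of the repository): given the values defined in constants.py (WHOLE = 480 slots, RESOLUTION_SCALAR = 10), BEAT_PERIODS works out to [48, 24, 12, 6, 3, 16, 8, 4, 2] timesteps, and the generated beat input simply marks which of these periods divide the current timestep.

```
import numpy as np

# BEAT_PERIODS as computed from constants.py (WHOLE=480, RESOLUTION_SCALAR=10):
beat_periods = np.array([48, 24, 12, 6, 3, 16, 8, 4, 2])
t = 12  # an arbitrary example timestep: the start of a quarter-note beat
print((t % beat_periods == 0).astype(np.int32))  # [0 0 1 1 1 0 0 1 1]
```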
--------------------------------------------------------------------------------
/input_parts/chord_part.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import theano
3 | import theano.tensor as T
4 |
5 | import constants
6 |
7 | from .base_input_part import InputPart
8 |
9 | class ChordShiftInputPart( InputPart ):
10 |     """
11 |     An input part that builds a shifted chord representation
12 |     """
13 |
14 |     CHORD_WIDTH = 12
15 |     PART_WIDTH = 12
16 |
17 |     def generate(self, relative_position, cur_chord_root, cur_chord_type, **kwargs):
18 |         """
19 |         Generate a chord input for a given timestep.
20 |
21 |         Parameters:
22 |             relative_position: A theano tensor (int32) of shape (n_parallel), giving the
23 |                 current relative position for this timestep
24 |             cur_chord_root: A theano tensor (int32) of shape (n_parallel) giving the unshifted chord root
25 |             cur_chord_type: A theano tensor (int32) of shape (n_parallel, CHORD_WIDTH), giving the unshifted chord
26 |                 type representation, parsed from the leadsheet
27 |
28 |         Returns:
29 |             piece: A theano tensor (float32) of shape (n_parallel, PART_WIDTH)
30 |         """
31 |         def _map_fn(pos, chord):
32 |             # Now pos is scalar and chord is of shape (CHORD_WIDTH), so we can roll
33 |             return T.roll(chord, (-pos)%12, 0)
34 |
35 |         shifted_chords, _ = theano.map(_map_fn, sequences=[relative_position-cur_chord_root, cur_chord_type])
36 |
37 |         # shifted_chords = theano.printing.Print("ChordShiftInputPart")(shifted_chords)
38 |         # shifted_chords = T.opt.Assert()(shifted_chords, T.eq(shifted_chords.shape[1], self.PART_WIDTH))
39 |         return shifted_chords
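A quick numeric illustration of the shifting logic above (values invented, not part of the repository): rolling the chord vector by (-pos) % 12 re-expresses the chord tones relative to the current melody position rather than the chord root.

```
import numpy as np

# The C7 template from constants.py: chord tones 0, 4, 7, 10 semitones above the root.
c7 = np.array([1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0])
pos_minus_root = 4  # e.g. the melody sits a major third above the chord root
print(np.roll(c7, (-pos_minus_root) % 12))
# [1 0 0 1 0 0 1 0 1 0 0 0]: chord tones now lie 0, 3, 6, 8 semitones above the melody note
```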
--------------------------------------------------------------------------------
/input_parts/passthrough_part.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import theano
3 | import theano.tensor as T
4 |
5 | import constants
6 |
7 | from .base_input_part import InputPart
8 |
9 | class PassthroughInputPart( InputPart ):
10 |     """
11 |     An input part that passes through one of its parameters unchanged
12 |     """
13 |
14 |     def __init__(self, keyword, width):
15 |         """
16 |         Initialize the input part to passthrough the input given by keyword
17 |         """
18 |         self.keyword = keyword
19 |         self.PART_WIDTH = width
20 |
21 |     def __repr__(self):
22 |         return '<PassthroughInputPart {}>'.format(self.keyword)
23 |
24 |     def generate(self, **kwargs):
25 |         """
26 |         Pass through the configured keyword input for a given timestep.
27 |
28 |         Parameters:
29 |             kwargs[keyword]: A theano tensor (float32) of shape (n_parallel, PART_WIDTH).
30 |                 This method will extract this keyword argument and return it unchanged
31 |
32 |         Returns:
33 |             piece: A theano tensor (float32) of shape (n_parallel, PART_WIDTH)
34 |         """
35 |         result = kwargs[self.keyword]
36 |         # result = theano.printing.Print("PassthroughInputPart")(result)
37 |         # result = T.opt.Assert()(result, T.eq(result.shape[1], self.PART_WIDTH))
38 |         return result
39 |
--------------------------------------------------------------------------------
/instructions/generate_trade_helper.md:
--------------------------------------------------------------------------------
1 | # generate_trade_helper.py: Generate a trading summary leadsheet
2 |
3 | ```
4 | usage: generate_trade_helper.py [-h] filedir
5 |
6 | Helper to concatenate trades into single leadsheet
7 |
8 | positional arguments:
9 |   filedir     Directory to process
10 |
11 | optional arguments:
12 |   -h, --help  show this help message and exit
13 | ```
14 |
15 | To use this script, pass the path to a generation output directory produced by `main.py`. This script will read the generated files and produce a new file `generated_trades.ls` which alternates between the source pieces and the generated output produced by the network.
--------------------------------------------------------------------------------
/instructions/lscat.md:
--------------------------------------------------------------------------------
1 | # lscat.py: Concatenate leadsheets
2 |
3 | ```
4 | usage: lscat.py [-h] [--output OUTPUT] [--verbose] files [files ...]
5 |
6 | Concatenate leadsheet files.
7 |
8 | positional arguments:
9 |   files            Files to process
10 |
11 | optional arguments:
12 |   -h, --help       show this help message and exit
13 |   --output OUTPUT  Name of the output file
14 |   --verbose        Be verbose about processing
15 | ```
16 |
17 | To use this script, pass a list of leadsheet files. These files will then be concatenated together into a new leadsheet file. It is recommended that you also use the `--output` argument to specify an output filename; otherwise, a filename is generated based on the first concatenated file.
18 |
19 | For example, to concatenate `a.ls`, `b.ls`, and `c.ls`, you can run
20 |
21 | ```
22 | $ python3 lscat.py a.ls b.ls c.ls --output combined.ls
23 | ```
24 |
--------------------------------------------------------------------------------
/instructions/lssplit.md:
--------------------------------------------------------------------------------
1 | # lssplit.py: Split leadsheets
2 |
3 | ```
4 | usage: lssplit.py [-h] [--output OUTPUT] file split
5 |
6 | Split a leadsheet file.
7 |
8 | positional arguments:
9 |   file             File to process
10 |   split            Bars to split at
11 |
12 | optional arguments:
13 |   -h, --help       show this help message and exit
14 |   --output OUTPUT  Base name of the output files
15 | ```
16 |
17 | To use this script, pass a single leadsheet file, and a number of bars. This script will chop up the input into chunks of length `split`.
18 |
19 | For instance, to split up `large.ls` into chunks of 4 bars, use
20 | ```
21 | $ python3 lssplit.py large.ls 4 --output smaller
22 | ```
23 |
24 | which will produce files `smaller_0.ls`, `smaller_1.ls`, etc.
--------------------------------------------------------------------------------
/instructions/main.md:
--------------------------------------------------------------------------------
1 | # main.py: Train or generate from a neural network model
2 |
3 | This script allows you to actually interact with a neural network model. You can run the script with
4 |
5 | ```
6 | $ python3 main.py [general arguments] MODELTYPE [model-specific arguments]
7 | ```
8 |
9 | ## General Arguments
10 | ```
11 |   -h, --help            show this help message and exit
12 |   --dataset DATASET [DATASET ...]
13 |                         Path(s) to dataset folder (with .ls files). If
14 |                         multiple are passed, samples randomly from each
15 |                         (default: ['dataset'])
16 |   --validation VALIDATION
17 |                         Path to validation dataset folder (with .ls files)
18 |                         (default: None)
19 |   --validation_generate_ct VALIDATION_GENERATE_CT
20 |                         Number of samples to generate at each validation time.
21 |                         (default: 1)
22 |   --outputdir OUTPUTDIR
23 |                         Path to output folder (default: output)
24 |   --check_nan           Check for nans during execution (default: False)
25 |   --batch_size BATCH_SIZE
26 |                         Size of batch (default: 10)
27 |   --iterations ITERATIONS
28 |                         How many iterations to train (default: 50000)
29 |   --learning_rate LEARNING_RATE
30 |                         Learning rate for the ADAM gradient descent method
31 |                         (default: 0.0002)
32 |   --segment_len SEGMENT_LEN
33 |                         Length of segment to train on (default: 4bar)
34 |   --segment_step SEGMENT_STEP
35 |                         Period at which segments may begin (default: 1bar)
36 |   --save-params-interval TRAIN_SAVE_PARAMS
37 |                         Save parameters after this many iterations (default:
38 |                         5000)
39 |   --final-params-only   Don't save parameters while training, only at the end.
40 |                         (default: None)
41 |   --auto_connectome_keys AUTO_CONNECTOME_KEYS
42 |                         Path to keys for running param_cvt. If given, will run
43 |                         param_cvt automatically for each saved parameters
44 |                         file. (default: None)
45 |   --resume TIMESTEP PARAMFILE
46 |                         Where to restore from: timestep, and file to load
47 |                         (default: None)
48 |   --resume_auto         Automatically restore from a previous run using output
49 |                         directory (default: False)
50 |   --generate            Don't train, just generate. Should be used with
51 |                         restore. (default: False)
52 |   --generate_over SOURCE DIV_WIDTH
53 |                         Don't train, just generate, and generate over SOURCE
54 |                         chord changes divided into chunks of length DIV_WIDTH
55 |                         (or one contiguous chunk if DIV_WIDTH is 'full'). Can
56 |                         use 'bar' as a unit. Should be used with restore.
57 |                         (default: None)
58 | ```
59 |
60 | If you pass a validation folder to `--validation`, then each time it saves parameters, it will also generate samples over the pieces in the validation directory (similar to `--generate_over`). By default, it generates once over each piece in the validation folder, but this can be changed using `--validation_generate_ct`.
61 |
62 | ## Model Types
63 |
64 |
65 | ### `simple`: A simple model
66 | This is a simple class of model, not available in Impro-Visor.
67 |
68 | `simple`-specific arguments:
69 | ```
70 | usage: main.py simple [-h] [--per_note] {abs,cot,rel}
71 |
72 | positional arguments:
73 |   {abs,cot,rel}  Type of encoding to use
74 |
75 | optional arguments:
76 |   -h, --help     show this help message and exit
77 |   --per_note     Enable note memory cells
78 | ```
79 | The three types of encoding in this mode are
80 |
81 | - `abs`: An absolute encoding, where each distinct pitch has a distinct bit in the note representation
82 | - `cot`: The circles-of-thirds representation, where pitches are determined by which circles of major and minor thirds that pitch is in. This encoding was originally described by Judy A. Franklin in [Recurrent Neural Networks and Pitch Representations for Music Tasks](http://cs.smith.edu/~jfrankli/papers/FLAIRS04FranklinJ.pdf).
83 | - `rel`: An interval-relative encoding, where each note is encoded as the size and direction of the interval between this note and the previous one.
84 |
85 | ### `poex`: A product-of-experts generative model
86 | This is the type of generative model available in Impro-Visor.
87 |
88 | `poex`-specific arguments:
89 | ```
90 |   -h, --help            show this help message and exit
91 |   --per_note            Enable note memory cells
92 |   --separate_rhythm     Use a separate rhythm expert. Only works without note
93 |                         memory cells
94 |   --layer_size LAYER_SIZE [LAYER_SIZE ...]
95 |                         Layer size of the LSTMs. Either pass a single number
96 |                         to be used for all experts, or a sequence of numbers,
97 |                         one for each expert. Only works without note memory
98 |                         cells.
99 |   --num_layers NUM_LAYERS [NUM_LAYERS ...]
100 |                         Number of LSTM layers. Either pass a single number to
101 |                         be used for all experts, or a sequence of numbers, one
102 |                         for each expert. Only works without note memory cells
103 |   --skip_training_experts EXPERT_INDEX [EXPERT_INDEX ...]
104 |                         Skip training these experts
105 | ```
106 | `--per_note` enables memory cells which are fixed to particular notes and do not shift with the rest of the network. Although these were investigated as being potentially useful, they did not give a significant advantage, and were not implemented in Impro-Visor for simplicity. If you wish to train a model for Impro-Visor, do not use the `--per_note` flag.
107 |
108 | `--separate_rhythm` configures the network to delegate rhythm to a third expert, and use the other two experts for pitch choices only.
109 |
110 | The `--layer_size` and `--num_layers` parameters give the layer sizes and number of layers used for each expert. They can be passed either a single number, used for all experts, or a list of numbers, one per expert. The order of experts is interval, chord, (optionally) rhythm.
111 |
112 | Using `--skip_training_experts`, you can specify particular experts not to train. This is useful if you are resuming from an already-partially-trained parameters file. Experts are specified by index.
113 |
114 | ### `compae`: A compressing autoencoder model
115 | This is the type of trading model available in Impro-Visor.
116 |
117 | `compae`-specific arguments:
118 | ```
119 | usage: main.py [general arguments] compae ENCODING MANAGER [compae optional arguments]
120 |
121 | positional arguments:
122 |   {abs,cot,rel,poex}    Type of encoding to use
123 |   {std,var,sample_var,queueless_var,queueless_std,nearness_std}
124 |                         Type of queue manager to use
125 |
126 | optional arguments:
127 |   -h, --help            show this help message and exit
128 |   --per_note            Enable note memory cells
129 |   --hide_output         Hide previous outputs from the decoder
130 |   --sparsity_loss_scale SPARSITY_LOSS_SCALE
131 |                         How much to scale the sparsity loss by
132 |   --variational_loss_scale VARIATIONAL_LOSS_SCALE
133 |                         How much to scale the variational loss by
134 |   --feature_size FEATURE_SIZE
135 |                         Size of feature vectors
136 |   --feature_period FEATURE_PERIOD
137 |                         If in queueless mode, period of features in timesteps
138 |   --add_pre_noise [ADD_PRE_NOISE]
139 |                         Add Gaussian noise to the feature values before
140 |                         applying the activation function
141 |   --add_post_noise [ADD_POST_NOISE]
142 |                         Add Gaussian noise to the feature values after
143 |                         applying the activation function
144 |   --train_decoder_only  Only modify the decoder parameters
145 |   --layer_size LAYER_SIZE
146 |                         Layer size of the LSTMs. Only works without note
147 |                         memory cells
148 |   --num_layers NUM_LAYERS
149 |                         Number of LSTM layers. Only works without note memory
150 |                         cells
151 |   --priority_loss [LOSS_MODE_PRIORITY]
152 |                         Use priority loss scaling mode (with the specified
153 |                         curviness)
154 |   --add_loss            Use adding loss scaling mode
155 |   --cutoff_loss CUTOFF  Use cutoff loss scaling mode with the specified per-
156 |                         batch cutoff
157 |   --trigger_loss TRIGGER RAMP_TIME
158 |                         Use trigger loss scaling mode with the specified per-
159 |                         batch trigger value and desired ramp-up time
160 | ```
161 |
162 | In order to be compatible with Impro-Visor, you must use the encoding `poex` and the queue manager `queueless_std` or `queueless_var`, and you must not use the flags `--per_note` or `--hide_output`. Other modes are available for training and generation within Python, but were not implemented in Impro-Visor. Additionally, `--feature_period` should be 24, corresponding to the duration of a half note (at the default resolution, a whole note spans 48 timesteps).
163 |
164 | Other than `poex`, which matches the `poex` generative model, the other encoding modes correspond to the encodings for the `simple` model.
165 |
166 | There are a variety of queue managers available:
167 |
168 | - `std`: The standard, variable-sized feature model, where the network decides where to output features, and is encouraged to output few features.
169 | - `var`: A variational version of the variable-sized feature model, where the features are latent variables that are sampled from a normal distribution, and that distribution is regularized to be similar to a unit normal distribution.
170 | - `sample_var`: Like `var`, but instead of allowing a feature to be output with fractional strength, the network samples from the feature strength to decide where the features are, and then is trained using a variant of reinforcement learning.
171 | - `queueless_std`: A fixed-size feature model, with features repeating at a fixed interval.
172 | - `queueless_var`: A variational version of the fixed-size feature model.
173 | - `nearness_std`: A version of the variable-sized feature model where features that are close together are penalized more than features that are far away.
174 |
175 | For some queue managers, there are multiple loss values: the reconstruction loss, as well as some extra loss. The `--sparsity_loss_scale` and `--variational_loss_scale` options allow you to scale the importance of these losses, and the loss modes `--add_loss`, `--priority_loss`, `--cutoff_loss`, and `--trigger_loss` determine how the losses are balanced:
176 |
177 | - `--add_loss` simply adds the losses
178 | - `--priority_loss` attempts to scale the losses so that the largest loss is most important
179 | - `--cutoff_loss` ignores the extra loss unless the reconstruction loss is small enough
180 | - `--trigger_loss` waits until the reconstruction loss becomes small enough, and then interpolates the extra loss from having no influence to having full influence (as in `--add_loss` mode)
181 |
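The sketch below restates three of these modes as code. It is a simplification for illustration only (invented names, not the actual logic in training.py); `--priority_loss` is omitted because it reweights the losses nonlinearly according to its curviness parameter.

```
# Illustrative only: combining a reconstruction loss r with an extra loss e
# (e is assumed to already include the --*_loss_scale factor).
def combine_losses(r, e, mode, cutoff=None, ramp_frac=1.0):
    if mode == 'add':       # --add_loss: simply add the losses
        return r + e
    if mode == 'cutoff':    # --cutoff_loss CUTOFF: ignore e until r is small enough
        return r + (e if r < cutoff else 0.0)
    if mode == 'trigger':   # --trigger_loss TRIGGER RAMP_TIME: once r has dropped
        # below TRIGGER, ramp_frac rises linearly from 0 to 1 over RAMP_TIME
        # iterations, fading e in until this matches --add_loss
        return r + ramp_frac * e
    raise ValueError(mode)
```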
182 | ## Examples
183 |
184 | Train a product-of-experts generative model on a directory of leadsheet files with path `datasets/my_dataset`, automatically resuming training if previously interrupted. Each leadsheet file will be split into 4-bar chunks starting at each bar.
185 |
186 | ```
187 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset --resume_auto poex
188 | ```
189 |
190 | Train a product-of-experts generative model as before, but split each leadsheet into 8-bar chunks, starting at each multiple of 4 bars:
191 |
192 | ```
193 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset --resume_auto --segment_len 8bar --segment_step 4bar poex
194 | ```
195 |
196 | Train a product-of-experts generative model as before, but save parameters every 100 iterations, and generate samples over a validation dataset:
197 |
198 | ```
199 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset --resume_auto --save-params-interval 100 --validation datasets/validation_dataset poex
200 | ```
201 |
202 | Generate some leadsheets using a trained product-of-experts model, sampling from the dataset:
203 |
204 | ```
205 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset/generated --resume 0 output_my_dataset/final_params.p --generate poex
206 | ```
207 |
208 | As above, but generate over a particular piece `my_generate_target.ls` in 4-bar chunks:
209 |
210 | ```
211 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset/generated --resume 0 output_my_dataset/final_params.p --generate_over my_generate_target.ls 4bar poex
212 | ```
213 |
214 | As above, but generate over the whole piece in a single run of the network (without breaking it into chunks):
215 |
216 | ```
217 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset/generated --resume 0 output_my_dataset/final_params.p --generate_over my_generate_target.ls full poex
218 | ```
219 |
220 | Train a compressing autoencoder on the same dataset, using fixed features and product-of-experts for compatibility with Impro-Visor:
221 |
222 | ```
223 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset_compae --resume_auto compae poex queueless_std --feature_period 24 --add_loss
224 | ```
225 |
226 | As above, but train a variational autoencoder instead of a standard one, scaling the variational loss by 0.01 and only enforcing the variational loss after the reconstruction loss drops below 4 per sample:
227 |
228 | ```
229 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset_compae --resume_auto compae poex queueless_var --feature_period 24 --variational_loss_scale 0.01 --trigger_loss 4 2000
230 | ```
231 |
232 | As above, but with a standard autoencoder and variable-size features:
233 |
234 | ```
235 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset_compae --resume_auto compae poex std --trigger_loss 4 2000
236 | ```
237 |
238 | Run the autoencoder and generate some output:
239 |
240 | ```
241 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset_compae/generated --resume 0 output_my_dataset_compae/final_params.p --generate compae poex queueless_std --feature_period 24 --add_loss
242 | ```
243 |
244 |
--------------------------------------------------------------------------------
/instructions/param_cvt.md:
--------------------------------------------------------------------------------
1 | # param_cvt.py: Convert a Python parameters file into a connectome file
2 |
3 | ```
4 | usage: param_cvt.py [-h] --keys KEYS [--output OUTPUT]
5 |                     [--precision PRECISION] [--raw]
6 |                     file
7 |
8 | Convert a python parameters file into an Impro-Visor connectome file
9 |
10 | positional arguments:
11 |   file                  File to process
12 |
13 | optional arguments:
14 |   -h, --help            show this help message and exit
15 |   --keys KEYS           File to load parameter names from
16 |   --output OUTPUT       Base name of the output files
17 |   --precision PRECISION
18 |                         Decimal points of precision to use (default 18)
19 |   --raw                 Create individual csv files instead of a connectome
20 |                         file
21 | ```
22 |
23 | In Python, trained parameters are saved as pickle files containing a list of the model parameter matrices. To convert this to a format that Impro-Visor can read, we encode each model parameter matrix as a .csv file, and name it according to a key file, which describes the order in which the parameters appear in the list. These .csv files are zipped together and given the extension .ctome, which can be loaded by Impro-Visor. (A sketch of this layout appears after the examples below.)
24 |
25 | ## Examples
26 |
27 | Convert a product-of-experts network into a connectome file:
28 |
29 | ```
30 | $ python3 param_cvt.py --keys param_keys/poex_keys.txt output_poex/final_params.p
31 | ```
32 |
33 | Convert a compressing autoencoder network into a connectome file:
34 |
35 | ```
36 | $ python3 param_cvt.py --keys param_keys/ae_poex_keys.txt output_compae/final_params.p
37 | ```
38 |
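To make the layout concrete, here is a minimal sketch of the conversion just described. This is not the actual param_cvt.py implementation; only the format (one .csv per parameter matrix, named from the key file and zipped into a .ctome archive) is taken from the description above, and the file paths are the ones used in the examples.

```
import io
import pickle
import zipfile

import numpy as np

# Sketch only: write each parameter matrix to a named .csv inside a .ctome zip.
with open('output_poex/final_params.p', 'rb') as f:
    params = pickle.load(f)  # a list of model parameter matrices
with open('param_keys/poex_keys.txt') as f:
    names = [line.strip() for line in f if line.strip()]
with zipfile.ZipFile('output_poex/final_params.ctome', 'w') as ctome:
    for name, matrix in zip(names, params):
        buf = io.StringIO()
        # 18 digits to mirror the --precision default documented above
        np.savetxt(buf, np.atleast_2d(matrix), delimiter=',', fmt='%.18g')
        ctome.writestr(name + '.csv', buf.getvalue())
```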
You can pass this script the path to the `data.csv` file to visualize it:
16 | 
17 | ```
18 | $ python3 plot_data.py output_my_dataset/data.csv
19 | ```
--------------------------------------------------------------------------------
/instructions/plot_internal_state.md:
--------------------------------------------------------------------------------
1 | # plot_internal_state.py: Plot the internal state of a network
2 | 
3 | ```
4 | usage: plot_internal_state.py [-h] folder idx
5 | 
6 | Plot the internal state of a network
7 | 
8 | positional arguments:
9 |   folder      Directory with the generated files
10 |   idx         Zero-based index of the output to visualize
11 | 
12 | optional arguments:
13 |   -h, --help  show this help message and exit
14 | ```
15 | 
16 | Using matplotlib, this script plots the internal state of the network while generating a particular piece. To use the script, first run `main.py` with the `--generate` or `--generate_over` arguments. This will output a series of files to the desired output directory. You must then pass that directory to this utility, along with the index of the piece to view. For instance, if running main.py gives you a directory `generated_stuff`, with files
17 | 
18 | ```
19 | generated_stuff/generated_0.ls
20 | generated_stuff/generated_1.ls
21 | generated_stuff/generated_10.ls
22 | generated_stuff/generated_11.ls
23 | generated_stuff/generated_2.ls
24 | generated_stuff/generated_3.ls
25 | generated_stuff/generated_4.ls
26 | generated_stuff/generated_5.ls
27 | generated_stuff/generated_6.ls
28 | generated_stuff/generated_7.ls
29 | generated_stuff/generated_8.ls
30 | generated_stuff/generated_9.ls
31 | generated_stuff/generated_chosen.npy
32 | generated_stuff/generated_info_0.npy
33 | generated_stuff/generated_info_1.npy
34 | generated_stuff/generated_probs.npy
35 | generated_stuff/generated_sources.txt
36 | ```
37 | 
38 | and you want to view the state of the network while generating `generated_6.ls`, you can run
39 | 
40 | ```
41 | $ python3 plot_internal_state.py generated_stuff 6
42 | ```
--------------------------------------------------------------------------------
/leadsheet.py:
--------------------------------------------------------------------------------
1 | import sexpdata
2 | import re
3 | from pprint import pprint
4 | import fractions
5 | import itertools
6 | 
7 | import constants
8 | from functools import reduce
9 | 
10 | import numpy as np
11 | 
12 | def rotate(li, x):
13 |     """
14 |     Rotate list li by x spaces to the right, i.e.
15 |     rotate([1,2,3,4],1) -> [4,1,2,3]
16 |     """
17 |     return li[-x % len(li):] + li[:-x % len(li)]
18 | 
19 | def chunkwise(t, size=2):
20 |     """
21 |     Return an iterator over consecutive tuples of the given size
22 |     """
23 |     it = iter(t)
24 |     return zip(*[it]*size)
25 | 
26 | def gcd(it):
27 |     def _gcd_helper(a,b):
28 |         if a==0:
29 |             return b
30 |         else:
31 |             return _gcd_helper(b%a, a)
32 |     return reduce(_gcd_helper, it)
33 | 
34 | def repeat_print(li):
35 | 
36 |     last = None
37 |     lastct = 0
38 |     for c in li+[None]:
39 |         if c == last:
40 |             lastct += 1
41 |         else:
42 |             if last is not None:
43 |                 print(last, "*", lastct)
44 |             last = c
45 |             lastct = 1
46 | 
47 | 
48 | def parse_chord(cstr,verbose=False):
49 |     """
50 |     Given a string representation of a chord, return a tuple of
51 |     (root offset in semitones above C, binary chord-type vector of length 12).
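    For example, parse_chord("NC") returns (0, constants.CHORD_TYPES["NC"]).
    For a slash chord, the offset of the slashed bass note is returned instead,
    with the chord tones shifted to be relative to that bass.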
52 | """ 53 | if cstr == "NC": 54 | return 0, constants.CHORD_TYPES["NC"] 55 | chord_match = re.match(r"([A-G](?:#|b)?)([^/]*)(?:/(.+))?", cstr) 56 | root_note, ctype, slash_note = chord_match.groups() 57 | 58 | try: 59 | ctype_vec = constants.CHORD_TYPES[ctype] 60 | except KeyError: 61 | if(verbose): 62 | print("WARNING: Could not find chord {}, substituting NC".format(cstr)) 63 | ctype_vec = constants.CHORD_TYPES['NC'] 64 | 65 | root_offset = constants.CHORD_NOTE_OFFSETS[root_note] 66 | if slash_note is None: 67 | return root_offset, ctype_vec 68 | else: 69 | # For a slash chord, we need to add the slashed note to the chord, 70 | # and also make it the bass note 71 | slash_offset = constants.CHORD_NOTE_OFFSETS[slash_note] 72 | shifted_ctype_vec = rotate(ctype_vec, root_offset-slash_offset) 73 | shifted_ctype_vec[0] = 1 74 | return slash_offset, shifted_ctype_vec 75 | 76 | 77 | def parse_duration(durstr): 78 | accum_dur = 0 79 | 80 | parts = durstr.split("+") 81 | for part in parts: 82 | dot_match = re.match(r"([^\.]*)(\.*)", part) 83 | part = dot_match.group(1) 84 | num_dots = len(dot_match.group(2)) 85 | 86 | tupl_parts = part.split("/") 87 | if len(tupl_parts) == 1: 88 | # Not a tuplet 89 | [dur_frac_str] = tupl_parts 90 | dur_frac = int(dur_frac_str) 91 | assert constants.WHOLE % dur_frac == 0, "Bad duration {} -> {} / {}".format(durstr, constants.WHOLE, dur_frac) 92 | slots = constants.WHOLE // dur_frac 93 | else: 94 | [dur_frac_str, tuplet_str] = tupl_parts 95 | dur_frac = int(dur_frac_str) 96 | dur_tupl = int(tuplet_str) 97 | assert (constants.WHOLE * (dur_tupl-1)) % (dur_frac * dur_tupl) == 0, "Bad duration {} -> {} / {}".format(durstr, (constants.WHOLE * (dur_tupl-1)), (dur_frac * dur_tupl)) 98 | slots = constants.WHOLE * (dur_tupl-1) // (dur_frac * dur_tupl) 99 | 100 | for i in range(num_dots): 101 | assert (slots * 3) % 2 == 0, "Bad duration {} -> {} / {}".format(durstr, (slots * 3), 2) 102 | slots = slots * 3 // 2 103 | 104 | accum_dur += slots 105 | 106 | assert accum_dur % constants.RESOLUTION_SCALAR == 0, "Bad duration {}: {} not a multiple of resolution {}".format(durstr, accum_dur, constants.RESOLUTION_SCALAR) 107 | return accum_dur//constants.RESOLUTION_SCALAR 108 | 109 | def parse_note(nstr): 110 | """ 111 | Given a string representation of a note, return (midiOrNone, duration) 112 | """ 113 | note_match = re.match(r"((?:[a-g]|r)(?:[#b]?))([\+\-]*)(.*)", nstr) 114 | note = note_match.group(1) 115 | octaveshift_str = note_match.group(2) 116 | duration_str = note_match.group(3) 117 | 118 | octaveshift = sum({"+":1,"-":-1}[x] for x in octaveshift_str) 119 | if nstr[0] == 'r': 120 | midival = None 121 | else: 122 | midival = constants.MIDDLE_C_MIDI + (constants.OCTAVE * octaveshift) + constants.NOTE_OFFSETS[note] 123 | 124 | duration = parse_duration(duration_str) 125 | 126 | return (midival, duration) 127 | 128 | 129 | def parse_leadsheet(fn,verbose=False): 130 | with open(fn,'r') as f: 131 | contents = "\n".join(f.readlines()) 132 | parsed = sexpdata.loads("({})".format(contents.replace("'",""))) 133 | 134 | parts = [('default','',[])] 135 | for p in parsed: 136 | if not isinstance(p, list): 137 | parts[-1][2].append(p.value()) 138 | elif not isinstance(p[0], list) and p[0].value() == 'part': 139 | def strval(x): 140 | return x.value() if isinstance(x,sexpdata.Symbol) else str(x) 141 | part_type = next((' '.join(strval(x) for x in l[1:]) for l in p if isinstance(l,list) and l[0].value() == "type"), None) 142 | title = next((' '.join(strval(x) for x in l[1:]) for l in p if 
isinstance(l,list) and l[0].value() == "title"), '') 143 | parts.append((part_type, title, [])) 144 | 145 | chord_parts = [x for x in parts if x[0]=='chords'] 146 | if len(chord_parts) == 0: 147 | chord_parts = [x for x in parts if x[0]=='default'] 148 | assert len(chord_parts) == 1, 'Wrong number of chord parts!' 149 | 150 | chords_raw = [x for x in chord_parts[0][2] if x[0].isupper() or x in ("|", "/")] 151 | chords = [] 152 | partial_measure = [] 153 | last_chord = None 154 | for c in chords_raw: 155 | if c == "|": 156 | length_each = constants.WHOLE//(len(partial_measure)*constants.RESOLUTION_SCALAR) 157 | for chord in partial_measure: 158 | for x in range(length_each): 159 | chords.append(chord) 160 | partial_measure = [] 161 | else: 162 | if c != "/": 163 | last_chord = parse_chord(c,verbose) 164 | partial_measure.append(last_chord) 165 | 166 | melody = [] 167 | for part_type, title, part_data in parts: 168 | if part_type == 'melody': 169 | melody_raw = [x for x in part_data if x[0].islower()] 170 | melody_proc = [parse_note(x) for x in melody_raw] 171 | mlen = sum(dur for n,dur in melody_proc) 172 | if mlen < len(chords): 173 | melody_proc.append((None, len(chords)-mlen)) 174 | melody.extend(melody_proc) 175 | 176 | # print "Raw Chords: " + " ".join(chords_raw) 177 | # print "Raw Melody: " + " ".join(melody_raw) 178 | 179 | # print "Parsed chords: " 180 | # repeat_print(chords) 181 | # print "Parsed melody: " 182 | # pprint(melody) 183 | 184 | clen = len(chords) 185 | mlen = sum(dur for n,dur in melody) 186 | # Might have multiple melodies over the same chords 187 | assert mlen % clen == 0, "Notes and chords don't match in {}: {}, {}".format(fn, clen,mlen) 188 | 189 | return chords, melody 190 | 191 | def constrain_melody(melody,bounds): 192 | new_melody = [] 193 | for n,dur in melody: 194 | if n is None: 195 | new_melody.append((n,dur)) 196 | else: 197 | while n >= bounds.highbound: 198 | n -= 12 199 | while n < bounds.lowbound: 200 | n += 12 201 | new_melody.append((n,dur)) 202 | return new_melody 203 | 204 | def get_leadsheet_length(chords, melody): 205 | return sum(dur for n,dur in melody) 206 | 207 | def slice_leadsheet(chords, melody, start, end): 208 | 209 | sliced_melody_start = [] 210 | sliced_melody_full = [] 211 | 212 | timestep = 0 213 | for n,dur in melody: 214 | if start-dur < timestep <= start: 215 | sliced_melody_start.append((n,timestep+dur-start)) 216 | elif start < timestep: 217 | sliced_melody_start.append((n,dur)) 218 | timestep += dur 219 | 220 | timestep = start 221 | for n,dur in sliced_melody_start: 222 | if timestep < end-dur: 223 | sliced_melody_full.append((n,dur)) 224 | elif end-dur <= timestep < end: 225 | sliced_melody_full.append((n,end-timestep)) 226 | timestep += dur 227 | 228 | sliced_chords = [chords[i%len(chords)] for i in range(start,end)] 229 | 230 | clen = len(sliced_chords) 231 | mlen = sum(dur for n,dur in sliced_melody_full) 232 | assert clen == mlen, "clen {} and mlen {} do not match".format(clen,mlen) 233 | 234 | return sliced_chords, sliced_melody_full 235 | 236 | def write_duration(duration): 237 | """ 238 | Convert a number of slots to a duration string 239 | """ 240 | q_dir = constants.QUARTER//constants.RESOLUTION_SCALAR 241 | whole_dir = constants.WHOLE//constants.RESOLUTION_SCALAR 242 | 243 | if duration > whole_dir: 244 | # Longer than a measure 245 | return "1+{}".format(write_duration(duration - whole_dir)) 246 | elif q_dir % duration == 0: 247 | # Simple, shorter than a quarter note 248 | return { 249 | 12:"32/3", 250 | 
            6:"16/3",
251 |             4:"16",
252 |             3:"8/3",
253 |             2:"8",
254 |             1:"4"
255 |         }[ q_dir//duration ]
256 |     elif duration % q_dir == 0:
257 |         # Simple, longer than a quarter note
258 |         return {
259 |             1:"4",
260 |             2:"2",
261 |             3:"2.",
262 |             4:"1"
263 |         }[ duration//q_dir ]
264 |     elif duration > q_dir:
265 |         # Longer than a quarter note, but not evenly divisible.
266 |         # Break up long and short parts
267 |         q_parts = duration % q_dir
268 |         return "{}+{}".format(write_duration(duration-q_parts), write_duration(q_parts))
269 |     else:
270 |         # Find the shortest representation
271 |         best = None
272 |         for i in range(1, duration//2 + 1):
273 |             cur_try = "{}+{}".format(write_duration(duration-i),write_duration(i))
274 |             if best is None or len(cur_try) < len(best):
275 |                 best = cur_try
276 |         return best
277 | 
278 | def write_melody(melody):
279 |     """
280 |     Convert a melody, as a list of (midi pitch, duration) pairs, to a string
281 |     """
282 |     notes = []
283 |     for midi, dur in melody:
284 |         if midi is None:
285 |             notename = "r"
286 |             octave_adj = ""
287 |         else:
288 |             delta_from_middle = midi - constants.MIDDLE_C_MIDI
289 |             octaves = delta_from_middle // 12
290 |             pitchclass = delta_from_middle % 12
291 |             notename = list(constants.NOTE_OFFSETS.keys())[list(constants.NOTE_OFFSETS.values()).index(pitchclass)]
292 | 
293 |             if octaves < 0:
294 |                 octave_adj = "-"*(-octaves)
295 |             else:
296 |                 octave_adj = "+"*octaves
297 | 
298 |         duration_str = write_duration(dur)
299 | 
300 |         notes.append(notename + octave_adj + duration_str)
301 | 
302 |     return " ".join(notes)
303 | 
304 | def write_chords(chords):
305 |     """
306 |     Convert a list of chords to a string
307 |     """
308 |     whole_dir = constants.WHOLE//constants.RESOLUTION_SCALAR
309 | 
310 |     parts = []
311 |     for measure in chunkwise(chords, whole_dir):
312 |         partial_measure = []
313 |         last_seen = None
314 |         for chord in measure:
315 |             if chord == last_seen:
316 |                 partial_measure[-1][1] += 1
317 |             else:
318 |                 last_seen = chord
319 |                 root,ctype = chord
320 |                 if ctype == constants.CHORD_TYPES["NC"]:
321 |                     chord_str = "NC"
322 |                 else:
323 |                     if ctype in list(constants.CHORD_TYPES.values()):
324 |                         t_idx = list(constants.CHORD_TYPES.values()).index(ctype)
325 |                         ctype_s = list(constants.CHORD_TYPES.keys())[t_idx]
326 | 
327 |                         r_idx = list(constants.CHORD_NOTE_OFFSETS.values()).index(root)
328 |                         root_s = list(constants.CHORD_NOTE_OFFSETS.keys())[r_idx]
329 | 
330 |                         chord_str = root_s + ctype_s
331 |                     else:
332 |                         # Try slash chords: "root" is bass, look for true root
333 |                         bass = root
334 |                         mod_ctype = [0] + ctype[1:]
335 |                         for offset in range(1,12):
336 |                             true_root = (bass + offset) % 12
337 |                             shifted_chord = rotate(ctype, -offset)
338 |                             mod_shifted_chord = rotate(mod_ctype, -offset)
339 |                             if shifted_chord in list(constants.CHORD_TYPES.values()):
340 |                                 t_idx = list(constants.CHORD_TYPES.values()).index(shifted_chord)
341 |                             elif mod_shifted_chord in list(constants.CHORD_TYPES.values()):
342 |                                 t_idx = list(constants.CHORD_TYPES.values()).index(mod_shifted_chord)
343 |                             else:
344 |                                 continue
345 | 
346 |                             ctype_s = list(constants.CHORD_TYPES.keys())[t_idx]
347 | 
348 |                             r_idx = list(constants.CHORD_NOTE_OFFSETS.values()).index(true_root)
349 |                             root_s = list(constants.CHORD_NOTE_OFFSETS.keys())[r_idx]
350 | 
351 |                             slash_idx = list(constants.CHORD_NOTE_OFFSETS.values()).index(bass)
352 |                             slash_s = list(constants.CHORD_NOTE_OFFSETS.keys())[slash_idx]
353 | 
354 |                             chord_str = root_s + ctype_s + '/' + slash_s
355 |                             break
356 |                         else:
357 |                             print("Not a valid chord!")
358 |                             chord_str = "NC"
359 | 
360 |                 partial_measure.append([chord_str, 1])
361 | 
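        # Reduce each chord's repeat count by the gcd of all counts in the
        # measure, so that the measure is written with the fewest possible
        # symbols; the parser divides a measure's length evenly among them.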
362 | divisor = gcd(x[1] for x in partial_measure) 363 | for chord_str, ct in partial_measure: 364 | for _ in range(ct//divisor): 365 | parts.append(chord_str) 366 | parts.append("|") 367 | 368 | return " ".join(parts) 369 | 370 | def write_leadsheet(chords, melody, filename=None): 371 | """ 372 | Convert chords and a melody to a leadsheet file 373 | """ 374 | full_leadsheet = """ 375 | (section (style swing)) 376 | 377 | (part (type chords)) 378 | {} 379 | (part (type melody)) 380 | {} 381 | """.format(write_chords(chords), write_melody(melody)) 382 | 383 | if filename is not None: 384 | with open(filename,'w') as f: 385 | f.write(full_leadsheet) 386 | else: 387 | return full_leadsheet 388 | 389 | -------------------------------------------------------------------------------- /lscat.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import leadsheet 3 | import argparse 4 | 5 | def main(files, output=None, verbose=False): 6 | melody = [] 7 | chords = [] 8 | for f in files: 9 | if verbose: 10 | print(f) 11 | nc,nm = leadsheet.parse_leadsheet(f, verbose) 12 | melody.extend(nm) 13 | chords.extend(nc) 14 | if output is None: 15 | output = files[0] + "-cat.ls" 16 | leadsheet.write_leadsheet(chords, melody, output) 17 | 18 | parser = argparse.ArgumentParser(description='Concatenate leadsheet files.') 19 | parser.add_argument('files', nargs='+', help='Files to process') 20 | parser.add_argument('--output', help='Name of the output file') 21 | parser.add_argument('--verbose', action="store_true", help='Be verbose about processing') 22 | 23 | if __name__ == '__main__': 24 | args = parser.parse_args() 25 | main(**vars(args)) -------------------------------------------------------------------------------- /lssplit.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import leadsheet 3 | import argparse 4 | import constants 5 | import os 6 | 7 | def main(file, split, output=None): 8 | c,m = leadsheet.parse_leadsheet(file) 9 | lslen = leadsheet.get_leadsheet_length(c,m) 10 | divwidth = int(split) * (constants.WHOLE//constants.RESOLUTION_SCALAR) 11 | slices = [leadsheet.slice_leadsheet(c,m,s,s+divwidth) for s in range(0,lslen,divwidth)] 12 | if output is None: 13 | output = file + "-split" 14 | for i, (chords, melody) in enumerate(slices): 15 | leadsheet.write_leadsheet(chords, melody, '{}_{}.ls'.format(output,i)) 16 | 17 | parser = argparse.ArgumentParser(description='Split a leadsheet file.') 18 | parser.add_argument('file',help='File to process') 19 | parser.add_argument('split',help='Bars to split at') 20 | parser.add_argument('--output', help='Base name of the output files') 21 | 22 | if __name__ == '__main__': 23 | args = parser.parse_args() 24 | main(**vars(args)) -------------------------------------------------------------------------------- /main.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import time 3 | import sys 4 | import os 5 | import collections 6 | 7 | from models import SimpleModel, ProductOfExpertsModel, CompressiveAutoencoderModel 8 | from note_encodings import AbsoluteSequentialEncoding, RelativeJumpEncoding, ChordRelativeEncoding, CircleOfThirdsEncoding, RhythmOnlyEncoding 9 | from queue_managers import StandardQueueManager, VariationalQueueManager, SamplingVariationalQueueManager, QueuelessVariationalQueueManager, QueuelessStandardQueueManager, NearnessStandardQueueManager, NoiseWrapper 10 | import input_parts 11 
| import leadsheet 12 | import training 13 | import pickle 14 | import theano 15 | import theano.tensor as T 16 | 17 | import numpy as np 18 | import constants 19 | from util import sliceMaker 20 | 21 | ModelBuilder = collections.namedtuple('ModelBuilder',['name', 'build', 'config_args', 'desc']) 22 | builders = {} 23 | 24 | def build_simple(should_setup, check_nan, unroll_batch_num, encode_key, no_per_note): 25 | if encode_key == "abs": 26 | enc = AbsoluteSequentialEncoding(constants.BOUNDS.lowbound, constants.BOUNDS.highbound) 27 | inputs = [input_parts.BeatInputPart(),input_parts.ChordShiftInputPart()] 28 | elif encode_key == "cot": 29 | enc = CircleOfThirdsEncoding(constants.BOUNDS.lowbound, (constants.BOUNDS.highbound-constants.BOUNDS.lowbound)//12) 30 | inputs = [input_parts.BeatInputPart(),input_parts.ChordShiftInputPart()] 31 | elif encode_key == "rel": 32 | enc = RelativeJumpEncoding() 33 | inputs = None 34 | sizes = [(200,10),(200,10)] if (encode_key == "rel" and not no_per_note) else [(300,0),(300,0)] 35 | bounds = constants.NoteBounds(48, 84) if encode_key == "cot" else constants.BOUNDS 36 | return SimpleModel(enc, sizes, bounds=bounds, inputs=inputs, dropout=0.5, setup=should_setup, nanguard=check_nan, unroll_batch_num=unroll_batch_num) 37 | 38 | def config_simple(parser): 39 | parser.add_argument('encode_key', choices=["abs","cot","rel"], help='Type of encoding to use') 40 | parser.add_argument('--per_note', dest="no_per_note", action="store_false", help='Enable note memory cells') 41 | 42 | builders['simple'] = ModelBuilder('simple', build_simple, config_simple, 'A simple single-LSTM-stack sequential model') 43 | 44 | ####################### 45 | 46 | def build_poex(should_setup, check_nan, unroll_batch_num, no_per_note, layer_size, num_layers, separate_rhythm, skip_training_experts): 47 | encs = [RelativeJumpEncoding(), ChordRelativeEncoding()] 48 | if separate_rhythm: 49 | encs = [RelativeJumpEncoding(with_artic=False), ChordRelativeEncoding(with_artic=False), RhythmOnlyEncoding()] 50 | shift_modes = ["drop","roll","drop"] 51 | else: 52 | encs = [RelativeJumpEncoding(), ChordRelativeEncoding()] 53 | shift_modes = ["drop","roll"] 54 | 55 | if no_per_note: 56 | if len(layer_size) == 1: 57 | layer_size = layer_size*len(encs) 58 | assert len(layer_size) == len(encs) 59 | if len(num_layers) == 1: 60 | num_layers = num_layers*len(encs) 61 | assert len(num_layers) == len(encs) 62 | 63 | sizes = [[(ls,0)]*nl for ls, nl in zip(layer_size, num_layers)] 64 | else: 65 | sizes = [[(200,10),(200,10)]]*len(encs) 66 | 67 | return ProductOfExpertsModel(encs, sizes, shift_modes=shift_modes, 68 | dropout=0.5, setup=should_setup, nanguard=check_nan, unroll_batch_num=unroll_batch_num, 69 | normalize_artic_only=separate_rhythm, skip_training_experts=skip_training_experts) 70 | 71 | def config_poex(parser): 72 | parser.add_argument('--per_note', dest="no_per_note", action="store_false", help='Enable note memory cells') 73 | parser.add_argument('--separate_rhythm', action="store_true", help='Use a separate rhythm expert. Only works without note memory cells') 74 | parser.add_argument('--layer_size', nargs="+", type=int, default=[300], help='Layer size of the LSTMs. Either pass a single number to be used for all experts, or a sequence of numbers, one for each expert. Only works without note memory cells.') 75 | parser.add_argument('--num_layers', nargs="+", type=int, default=[2], help='Number of LSTM layers. 
Either pass a single number to be used for all experts, or a sequence of numbers, one for each expert. Only works without note memory cells')
76 |     parser.add_argument('--skip_training_experts', nargs="+", type=int, default=[], metavar="EXPERT_INDEX", help='Skip training these experts')
77 | 
78 | builders['poex'] = ModelBuilder('poex', build_poex, config_poex, 'A product-of-experts LSTM sequential model, using note and chord relative encodings.')
79 | 
80 | #######################
81 | 
82 | def build_compae(should_setup, check_nan, unroll_batch_num, encode_key, queue_key, no_per_note, layer_size, num_layers, feature_size, hide_output, sparsity_loss_scale, variational_loss_scale, train_decoder_only, feature_period=None, add_pre_noise=None, add_post_noise=None, loss_mode_priority=False, loss_mode_add=False, loss_mode_cutoff=None, loss_mode_trigger=None):
83 |     bounds = constants.NoteBounds(48, 84) if encode_key == "cot" else constants.BOUNDS
84 |     shift_modes = None
85 |     if encode_key == "abs":
86 |         enc = [AbsoluteSequentialEncoding(constants.BOUNDS.lowbound, constants.BOUNDS.highbound)]
87 |         sizes = [[(layer_size,0)]*num_layers]
88 |         inputs = [[input_parts.BeatInputPart(), input_parts.ChordShiftInputPart()]]
89 |     elif encode_key == "cot":
90 |         enc = [CircleOfThirdsEncoding(bounds.lowbound, (bounds.highbound-bounds.lowbound)//12)]
91 |         sizes = [[(layer_size,0)]*num_layers]
92 |         inputs = [[input_parts.BeatInputPart(), input_parts.ChordShiftInputPart()]]
93 |     elif encode_key == "rel":
94 |         enc = [RelativeJumpEncoding()]
95 |         sizes = [[(200,10),(200,10)] if (not no_per_note) else [(layer_size,0)]*num_layers]
96 |         shift_modes=["drop"]
97 |         inputs = None
98 |     elif encode_key == "poex":
99 |         enc = [RelativeJumpEncoding(), ChordRelativeEncoding()]
100 |         sizes = [ [(200,10),(200,10)] if (not no_per_note) else [(layer_size,0)]*num_layers ]*2
101 |         shift_modes=["drop","roll"]
102 |         inputs = None
103 | 
104 |     unscaled_loss_fun = lambda x: T.log(1+99*x)/T.log(100)
105 |     lossfun = lambda x: np.array(sparsity_loss_scale, np.float32) * unscaled_loss_fun(x)
106 |     if queue_key == "std":
107 |         qman = StandardQueueManager(feature_size, loss_fun=lossfun)
108 |     elif queue_key == "var":
109 |         qman = VariationalQueueManager(feature_size, loss_fun=lossfun, variational_loss_scale=variational_loss_scale)
110 |     elif queue_key == "sample_var":
111 |         qman = SamplingVariationalQueueManager(feature_size, loss_fun=lossfun, variational_loss_scale=variational_loss_scale)
112 |     elif queue_key == "queueless_var":
113 |         qman = QueuelessVariationalQueueManager(feature_size, period=feature_period, variational_loss_scale=variational_loss_scale)
114 |     elif queue_key == "queueless_std":
115 |         qman = QueuelessStandardQueueManager(feature_size, period=feature_period)
116 |     elif queue_key == "nearness_std":
117 |         qman = NearnessStandardQueueManager(feature_size, sparsity_loss_scale*10, sparsity_loss_scale, 0.97, loss_fun=unscaled_loss_fun)
118 | 
119 |     if add_pre_noise is not None or add_post_noise is not None:
120 |         if "queueless" in queue_key:
121 |             pre_mask = sliceMaker[:]
122 |         else:
123 |             pre_mask = sliceMaker[1:]
124 |         qman = NoiseWrapper(qman, add_pre_noise, add_post_noise, pre_mask)
125 | 
126 |     loss_mode = "add" if loss_mode_add else \
127 |         ("cutoff", loss_mode_cutoff) if loss_mode_cutoff is not None else \
128 |         ("trigger",)+tuple(loss_mode_trigger) if loss_mode_trigger is not None else \
129 |         ("priority", loss_mode_priority if loss_mode_priority is not None else 50)
130 | 
131 |     return CompressiveAutoencoderModel(qman, enc,
sizes, sizes, shift_modes=shift_modes, bounds=bounds, hide_output=hide_output, inputs=inputs, 132 | dropout=0.5, setup=should_setup, nanguard=check_nan, unroll_batch_num=unroll_batch_num, loss_mode=loss_mode, train_decoder_only=train_decoder_only) 133 | 134 | def config_compae(parser): 135 | parser.add_argument('encode_key', choices=["abs","cot","rel","poex"], help='Type of encoding to use') 136 | parser.add_argument('queue_key', choices=["std","var","sample_var","queueless_var","queueless_std","nearness_std"], help='Type of queue manager to use') 137 | parser.add_argument('--per_note', dest="no_per_note", action="store_false", help='Enable note memory cells') 138 | parser.add_argument('--hide_output', action="store_true", help='Hide previous outputs from the decoder') 139 | parser.add_argument('--sparsity_loss_scale', type=float, default="1", help='How much to scale the sparsity loss by') 140 | parser.add_argument('--variational_loss_scale', type=float, default="1", help='How much to scale the variational loss by') 141 | parser.add_argument('--feature_size', type=int, default="100", help='Size of feature vectors') 142 | parser.add_argument('--feature_period', type=int, help='If in queueless mode, period of features in timesteps') 143 | parser.add_argument('--add_pre_noise', type=float, nargs="?", const=1.0, help='Add Gaussian noise to the feature values before applying the activation function') 144 | parser.add_argument('--add_post_noise', type=float, nargs="?", const=1.0, help='Add Gaussian noise to the feature values after applying the activation function') 145 | parser.add_argument('--train_decoder_only', action="store_true", help='Only modify the decoder parameters') 146 | parser.add_argument('--layer_size', type=int, default=300, help='Layer size of the LSTMs. Only works without note memory cells') 147 | parser.add_argument('--num_layers', type=int, default=2, help='Number of LSTM layers. 
Only works without note memory cells') 148 | lossgroup = parser.add_mutually_exclusive_group() 149 | lossgroup.add_argument('--priority_loss', nargs='?', const=50, dest='loss_mode_priority', type=float, help='Use priority loss scaling mode (with the specified curviness)') 150 | lossgroup.add_argument('--add_loss', dest='loss_mode_add', action='store_true', help='Use adding loss scaling mode') 151 | lossgroup.add_argument('--cutoff_loss', dest='loss_mode_cutoff', type=float, metavar="CUTOFF", help='Use cutoff loss scaling mode with the specified per-batch cutoff') 152 | lossgroup.add_argument('--trigger_loss', dest='loss_mode_trigger', nargs=2, type=float, metavar=("TRIGGER", "RAMP_TIME"), help='Use trigger loss scaling mode with the specified per-batch trigger value and desired ramp-up time') 153 | 154 | builders['compae'] = ModelBuilder('compae', build_compae, config_compae, 'A compressive autoencoder model.') 155 | 156 | ################################################################################################################### 157 | 158 | def main(modeltype, batch_size, iterations, learning_rate, segment_len, segment_step, train_save_params, dataset=["dataset"], outputdir="output", validation=None, validation_generate_ct=1, resume=None, resume_auto=False, check_nan=False, generate=False, generate_over=None, auto_connectome_keys=None, **model_kwargs): 159 | generate = generate or (generate_over is not None) 160 | should_setup = not generate 161 | unroll_batch_num = None if generate else training.BATCH_SIZE 162 | 163 | for dataset_dir in dataset: 164 | if os.path.samefile(dataset_dir,outputdir): 165 | print("WARNING: Directory {} passed as both dataset and output directory!".format(outputdir)) 166 | print("This may cause problems by adding generated samples to the dataset directory.") 167 | while True: 168 | result = input("Continue anyway? (y/n)") 169 | if result == "y": 170 | break 171 | elif result == "n": 172 | sys.exit(0) 173 | else: 174 | print("Please type y or n") 175 | 176 | if generate_over is None: 177 | training.set_params(batch_size, segment_step, segment_len) 178 | leadsheets = [training.filter_leadsheets(training.find_leadsheets(d)) for d in dataset] 179 | else: 180 | # Don't bother loading leadsheets, we don't need them 181 | leadsheets = [] 182 | 183 | if validation is not None: 184 | validation_leadsheets = training.filter_leadsheets(training.find_leadsheets(validation)) 185 | else: 186 | validation_leadsheets = None 187 | 188 | m = builders[modeltype].build(should_setup, check_nan, unroll_batch_num, **model_kwargs) 189 | m.set_learning_rate(learning_rate) 190 | 191 | if resume_auto: 192 | paramfile = os.path.join(outputdir,'final_params.p') 193 | if os.path.isfile(paramfile): 194 | with open(os.path.join(outputdir,'data.csv'), 'r') as f: 195 | for line in f: 196 | pass 197 | lastline = line 198 | start_idx = lastline.split(',')[0] 199 | print("Automatically resuming from {} after iteration {}.".format(paramfile, start_idx)) 200 | resume = (start_idx, paramfile) 201 | else: 202 | print("Didn't find anything to resume. 
Starting from the beginning...") 203 | 204 | if resume is not None: 205 | start_idx, paramfile = resume 206 | start_idx = int(start_idx) 207 | m.params = pickle.load( open(paramfile, "rb" ) ) 208 | else: 209 | start_idx = 0 210 | 211 | if not os.path.exists(outputdir): 212 | os.makedirs(outputdir) 213 | 214 | if generate: 215 | print("Setting up generation") 216 | m.setup_produce() 217 | print("Starting to generate") 218 | start_time = time.process_time() 219 | if generate_over is not None: 220 | source, divwidth = generate_over 221 | if divwidth == 'full': 222 | divwidth = 0 223 | elif divwidth == 'debug_firststep': 224 | divwidth = -1 225 | elif len(divwidth)>3 and divwidth[-3:] == 'bar': 226 | divwidth = int(divwidth[:-3])*(constants.WHOLE//constants.RESOLUTION_SCALAR) 227 | else: 228 | divwidth = int(divwidth) 229 | ch,mel = leadsheet.parse_leadsheet(source) 230 | lslen = leadsheet.get_leadsheet_length(ch,mel) 231 | if divwidth == 0: 232 | batch = ([ch],[mel]), [source] 233 | elif divwidth == -1: 234 | slices = [leadsheet.slice_leadsheet(ch,mel,0,1)] 235 | batch = list(zip(*slices)), [source] 236 | else: 237 | slices = [leadsheet.slice_leadsheet(ch,mel,s,s+divwidth) for s in range(0,lslen,divwidth)] 238 | batch = list(zip(*slices)), [source] 239 | training.generate(m, leadsheets, os.path.join(outputdir, "generated"), with_vis=True, batch=batch) 240 | else: 241 | training.generate(m, leadsheets, os.path.join(outputdir, "generated"), with_vis=True) 242 | end_time = time.process_time() 243 | print("Generation took {} seconds.".format(end_time-start_time)) 244 | else: 245 | training.train(m, leadsheets, iterations, outputdir, start_idx, train_save_params, validation_leadsheets=validation_leadsheets, validation_generate_ct=validation_generate_ct, auto_connectome_keys=auto_connectome_keys) 246 | pickle.dump( m.params, open( os.path.join(outputdir, "final_params.p"), "wb" ) ) 247 | 248 | def cvt_time(s): 249 | if len(s)>3 and s[-3:] == "bar": 250 | return int(s[:-3])*(constants.WHOLE//constants.RESOLUTION_SCALAR) 251 | else: 252 | return int(s) 253 | 254 | parser = argparse.ArgumentParser(description='Train a neural network model.', formatter_class=argparse.ArgumentDefaultsHelpFormatter) 255 | parser.add_argument('--dataset', nargs="+", default=['dataset'], help='Path(s) to dataset folder (with .ls files). 
If multiple are passed, samples randomly from each') 256 | parser.add_argument('--validation', help='Path to validation dataset folder (with .ls files)') 257 | parser.add_argument('--validation_generate_ct', type=int, default=1, help='Number of samples to generate at each validation time.') 258 | parser.add_argument('--outputdir', default='output', help='Path to output folder') 259 | parser.add_argument('--check_nan', action='store_true', help='Check for nans during execution') 260 | parser.add_argument('--batch_size', type=int, default=10, help='Size of batch') 261 | parser.add_argument('--iterations', type=int, default=50000, help='How many iterations to train') 262 | parser.add_argument('--learning_rate', type=float, default=0.0002, help='Learning rate for the ADAM gradient descent method') 263 | parser.add_argument('--segment_len', type=cvt_time, default="4bar", help='Length of segment to train on') 264 | parser.add_argument('--segment_step', type=cvt_time, default="1bar", help='Period at which segments may begin') 265 | parser.add_argument('--save-params-interval', type=int, default=5000, dest="train_save_params", help="Save parameters after this many iterations") 266 | parser.add_argument('--final-params-only', action="store_const", const=None, dest="train_save_params", help="Don't save parameters while training, only at the end.") 267 | parser.add_argument('--auto_connectome_keys', help='Path to keys for running param_cvt. If given, will run param_cvt automatically for each saved parameters file.') 268 | resume_group = parser.add_mutually_exclusive_group() 269 | resume_group.add_argument('--resume', nargs=2, metavar=('TIMESTEP', 'PARAMFILE'), default=None, help='Where to restore from: timestep, and file to load') 270 | resume_group.add_argument('--resume_auto', action='store_true', help='Automatically restore from a previous run using output directory') 271 | gen_group = parser.add_mutually_exclusive_group() 272 | gen_group.add_argument('--generate', action='store_true', help="Don't train, just generate. Should be used with restore.") 273 | gen_group.add_argument('--generate_over', nargs=2, metavar=('SOURCE', 'DIV_WIDTH'), default=None, help="Don't train, just generate, and generate over SOURCE chord changes divided into chunks of length DIV_WIDTH (or one contiguous chunk if DIV_WIDTH is 'full'). Can use 'bar' as a unit. Should be used with restore.") 274 | 275 | subparsers = parser.add_subparsers(title='Model Types', dest='modeltype', help='Type of model to use. 
(Note that each model type has additional parameters.)') 276 | for k,b in builders.items(): 277 | cur_parser = subparsers.add_parser(k, help=b.desc) 278 | b.config_args(cur_parser) 279 | 280 | if __name__ == '__main__': 281 | np.set_printoptions(linewidth=200) 282 | args = vars(parser.parse_args()) 283 | if args["modeltype"] is None: 284 | parser.print_usage() 285 | else: 286 | main(**args) 287 | -------------------------------------------------------------------------------- /models/__init__.py: -------------------------------------------------------------------------------- 1 | from .simple_rel_model import SimpleModel 2 | from .product_model import ProductOfExpertsModel 3 | from .compressive_autoencoder_model import CompressiveAutoencoderModel -------------------------------------------------------------------------------- /models/product_model.py: -------------------------------------------------------------------------------- 1 | 2 | import theano 3 | import theano.tensor as T 4 | from theano.sandbox.rng_mrg import MRG_RandomStreams 5 | 6 | import numpy as np 7 | 8 | import constants 9 | import input_parts 10 | from relshift_lstm import RelativeShiftLSTMStack 11 | from adam import Adam 12 | from note_encodings import Encoding 13 | import leadsheet 14 | 15 | import itertools 16 | import functools 17 | 18 | from theano.compile.nanguardmode import NanGuardMode 19 | 20 | 21 | class ProductOfExpertsModel(object): 22 | def __init__(self, encodings, all_layer_sizes, inputs=None, shift_modes=None, dropout=0, setup=False, nanguard=False, unroll_batch_num=None, bounds=constants.BOUNDS, normalize_artic_only=False, skip_training_experts=[]): 23 | self.encodings = encodings 24 | 25 | self.bounds = bounds 26 | self.normalize_artic_only = normalize_artic_only 27 | self.skip_training_experts = skip_training_experts 28 | 29 | if shift_modes is None: 30 | shift_modes = ["drop"]*len(encodings) 31 | 32 | if inputs is None: 33 | inputs = [[ 34 | input_parts.BeatInputPart(), 35 | input_parts.PositionInputPart(self.bounds.lowbound, self.bounds.highbound, 2), 36 | input_parts.ChordShiftInputPart()]]*len(self.encodings) 37 | 38 | self.all_layer_sizes = all_layer_sizes 39 | self.lstmstacks = [] 40 | for layer_sizes, encoding, shift_mode, ipt in zip(all_layer_sizes,encodings,shift_modes, inputs): 41 | parts = ipt + [ 42 | input_parts.PassthroughInputPart("last_output", encoding.ENCODING_WIDTH) 43 | ] 44 | lstmstack = RelativeShiftLSTMStack(parts, layer_sizes, encoding.RAW_ENCODING_WIDTH, encoding.WINDOW_SIZE, dropout, mode=shift_mode, unroll_batch_num=unroll_batch_num) 45 | self.lstmstacks.append(lstmstack) 46 | 47 | self.srng = MRG_RandomStreams(np.random.randint(1, 1024)) 48 | 49 | self.learning_rate_var = theano.shared(np.array(0.0002, theano.config.floatX)) 50 | 51 | self.update_fun = None 52 | self.eval_fun = None 53 | self.gen_fun = None 54 | 55 | self.nanguard = nanguard 56 | 57 | if setup: 58 | print("Setting up train") 59 | self.setup_train() 60 | print("Setting up gen") 61 | self.setup_generate() 62 | print("Done setting up") 63 | 64 | @property 65 | def params(self): 66 | return list(itertools.chain(*(lstmstack.params for lstmstack in self.lstmstacks))) 67 | 68 | @params.setter 69 | def params(self, paramlist): 70 | mycopy = list(paramlist) 71 | for lstmstack in self.lstmstacks: 72 | lstmstack.params = mycopy[:len(lstmstack.params)] 73 | del mycopy[:len(lstmstack.params)] 74 | assert len(mycopy) == 0 75 | 76 | def get_optimize_params(self): 77 | return list(itertools.chain(*(lstmstack.params for 
i,lstmstack in enumerate(self.lstmstacks) if i not in self.skip_training_experts)))
78 | 
79 |     def set_learning_rate(self, lr):
80 |         self.learning_rate_var.set_value(np.array(lr, theano.config.floatX))
81 | 
82 |     def setup_train(self):
83 | 
84 |         # dimensions: (batch, time, 12)
85 |         chord_types = T.btensor3()
86 | 
87 |         # dimensions: (batch, time)
88 |         chord_roots = T.imatrix()
89 | 
90 |         # dimensions: (batch, time)
91 |         relative_posns = [T.imatrix() for _ in self.encodings]
92 | 
93 |         # dimensions: (batch, time, output_data)
94 |         encoded_melodies = [T.btensor3() for _ in self.encodings]
95 | 
96 |         # dimensions: (batch, time)
97 |         correct_notes = T.imatrix()
98 | 
99 |         n_batch, n_time = chord_roots.shape
100 | 
101 |         def _build(det_dropout):
102 |             all_out_probs = []
103 |             for encoding, lstmstack, encoded_melody, relative_pos in zip(self.encodings, self.lstmstacks, encoded_melodies, relative_posns):
104 |                 activations = lstmstack.do_preprocess_scan( timestep=T.tile(T.arange(n_time), (n_batch,1)) ,
105 |                                                             relative_position=relative_pos,
106 |                                                             cur_chord_type=chord_types,
107 |                                                             cur_chord_root=chord_roots,
108 |                                                             last_output=T.concatenate([T.tile(encoding.initial_encoded_form(), (n_batch,1,1)),
109 |                                                                                        encoded_melody[:,:-1,:] ], 1),
110 |                                                             deterministic_dropout=det_dropout)
111 | 
112 |                 out_probs = encoding.decode_to_probs(activations, relative_pos, self.bounds.lowbound, self.bounds.highbound)
113 |                 all_out_probs.append(out_probs)
114 |             reduced_out_probs = functools.reduce((lambda x,y: x*y), all_out_probs)
115 |             if self.normalize_artic_only:
116 |                 non_artic_probs = reduced_out_probs[:,:,:2]
117 |                 artic_probs = reduced_out_probs[:,:,2:]
118 |                 non_artic_sum = T.sum(non_artic_probs, 2, keepdims=True)
119 |                 artic_sum = T.sum(artic_probs, 2, keepdims=True)
120 |                 norm_artic_probs = artic_probs*(1-non_artic_sum)/artic_sum
121 |                 norm_out_probs = T.concatenate([non_artic_probs, norm_artic_probs], 2)
122 |             else:
123 |                 normsum = T.sum(reduced_out_probs, 2, keepdims=True)
124 |                 normsum = T.maximum(normsum, constants.EPSILON)
125 |                 norm_out_probs = reduced_out_probs/normsum
126 |             return Encoding.compute_loss(norm_out_probs, correct_notes, True)
127 | 
128 |         train_loss, train_info = _build(False)
129 |         updates = Adam(train_loss, self.get_optimize_params(), lr=self.learning_rate_var)
130 | 
131 |         eval_loss, eval_info = _build(True)
132 | 
133 |         self.loss_info_keys = list(train_info.keys())
134 | 
135 |         self.update_fun = theano.function(
136 |             inputs=[chord_types, chord_roots, correct_notes] + relative_posns + encoded_melodies,
137 |             outputs=[train_loss]+list(train_info.values()),
138 |             updates=updates,
139 |             allow_input_downcast=True,
140 |             on_unused_input='ignore',
141 |             mode=(NanGuardMode(nan_is_error=True, inf_is_error=True, big_is_error=True) if self.nanguard else None))
142 | 
143 |         self.eval_fun = theano.function(
144 |             inputs=[chord_types, chord_roots, correct_notes] + relative_posns + encoded_melodies,
145 |             outputs=[eval_loss]+list(eval_info.values()),
146 |             allow_input_downcast=True,
147 |             on_unused_input='ignore',
148 |             mode=(NanGuardMode(nan_is_error=True, inf_is_error=True, big_is_error=True) if self.nanguard else None))
149 | 
150 |     def _assemble_batch(self, melody, chords):
151 |         encoded_melodies = [[] for _ in self.encodings]
152 |         relative_posns = [[] for _ in self.encodings]
153 |         correct_notes = []
154 |         chord_roots = []
155 |         chord_types = []
156 |         for m,c in zip(melody,chords):
157 |             m = leadsheet.constrain_melody(m, self.bounds)
158 |             for i,encoding in enumerate(self.encodings):
159 |                 e_m, r_p =
encoding.encode_melody_and_position(m,c) 160 | encoded_melodies[i].append(e_m) 161 | relative_posns[i].append(r_p) 162 | correct_notes.append(Encoding.encode_absolute_melody(m, self.bounds.lowbound, self.bounds.highbound)) 163 | c_roots, c_types = zip(*c) 164 | chord_roots.append(c_roots) 165 | chord_types.append(c_types) 166 | return ([np.array(chord_types, np.float32), 167 | np.array(chord_roots, np.int32), 168 | np.array(correct_notes, np.int32)] 169 | + [np.array(x, np.int32) for x in relative_posns] 170 | + [np.array(x, np.int32) for x in encoded_melodies]) 171 | 172 | def train(self, chords, melody): 173 | assert self.update_fun is not None, "Need to call setup_train before train" 174 | res = self.update_fun(*self._assemble_batch(melody,chords)) 175 | loss = res[0] 176 | info = dict(zip(self.loss_info_keys, res[1:])) 177 | return loss, info 178 | 179 | def eval(self, chords, melody): 180 | assert self.update_fun is not None, "Need to call setup_train before eval" 181 | res = self.eval_fun(*self._assemble_batch(melody,chords)) 182 | loss = res[0] 183 | info = dict(zip(self.loss_info_keys, res[1:])) 184 | return loss, info 185 | 186 | def setup_generate(self): 187 | 188 | # dimensions: (batch, time, 12) 189 | chord_types = T.btensor3() 190 | 191 | # dimensions: (batch, time) 192 | chord_roots = T.imatrix() 193 | 194 | n_batch, n_time = chord_roots.shape 195 | 196 | specs = [lstmstack.prepare_sample_scan( start_pos=T.alloc(np.array(encoding.STARTING_POSITION, np.int32), (n_batch)), 197 | start_out=T.tile(encoding.initial_encoded_form(), (n_batch,1)), 198 | timestep=T.tile(T.arange(n_time), (n_batch,1)), 199 | cur_chord_type=chord_types, 200 | cur_chord_root=chord_roots, 201 | deterministic_dropout=True ) 202 | for lstmstack, encoding in zip(self.lstmstacks, self.encodings)] 203 | 204 | updates, all_chosen, all_probs, indiv_probs = helper_generate_from_spec(specs, self.lstmstacks, self.encodings, self.srng, n_batch, n_time, self.bounds, self.normalize_artic_only) 205 | 206 | self.generate_fun = theano.function( 207 | inputs=[chord_roots, chord_types], 208 | updates=updates, 209 | outputs=all_chosen, 210 | allow_input_downcast=True, 211 | mode=(NanGuardMode(nan_is_error=True, inf_is_error=True, big_is_error=True) if self.nanguard else None)) 212 | 213 | self.generate_visualize_fun = theano.function( 214 | inputs=[chord_roots, chord_types], 215 | updates=updates, 216 | outputs=[all_chosen, all_probs] + indiv_probs, 217 | allow_input_downcast=True, 218 | mode=(NanGuardMode(nan_is_error=True, inf_is_error=True, big_is_error=True) if self.nanguard else None)) 219 | 220 | def generate(self, chords): 221 | assert self.generate_fun is not None, "Need to call setup_generate before generate" 222 | 223 | chord_roots = [] 224 | chord_types = [] 225 | for c in chords: 226 | c_roots, c_types = zip(*c) 227 | chord_roots.append(c_roots) 228 | chord_types.append(c_types) 229 | chosen = self.generate_fun(np.array(chord_roots, np.int32),np.array(chord_types, np.float32)) 230 | return [Encoding.decode_absolute_melody(c, self.bounds.lowbound, self.bounds.highbound) for c in chosen] 231 | 232 | def generate_visualize(self, chords): 233 | assert self.generate_fun is not None, "Need to call setup_generate before generate" 234 | chord_roots = [] 235 | chord_types = [] 236 | for c in chords: 237 | c_roots, c_types = zip(*c) 238 | chord_roots.append(c_roots) 239 | chord_types.append(c_types) 240 | stuff = self.generate_visualize_fun(chord_roots, chord_types) 241 | chosen, all_probs = stuff[:2] 242 | 243 | melody = 
[Encoding.decode_absolute_melody(c, self.bounds.lowbound, self.bounds.highbound) for c in chosen] 244 | return melody, chosen, all_probs, stuff[2:] 245 | 246 | def setup_produce(self): 247 | self.setup_generate() 248 | 249 | def produce(self, chords, melody): 250 | return self.generate_visualize(chords) 251 | 252 | def helper_generate_from_spec(specs, lstmstacks, encodings, srng, n_batch, n_time, bounds, normalize_artic_only=False): 253 | """Helper function to generate through a product LSTM model""" 254 | def _scan_fn(*inputs): 255 | # inputs is [ spec_sequences..., last_absolute_position, spec_taps..., spec_non_sequences... ] 256 | inputs = list(inputs) 257 | 258 | partitioned_inputs = [[] for _ in specs] 259 | for cur_part, spec in zip(partitioned_inputs, specs): 260 | cur_part.extend(inputs[:len(spec.sequences)]) 261 | del inputs[:len(spec.sequences)] 262 | last_absolute_chosen = inputs.pop(0) 263 | for cur_part, spec in zip(partitioned_inputs, specs): 264 | cur_part.extend(inputs[:spec.num_taps]) 265 | del inputs[:spec.num_taps] 266 | for cur_part, spec in zip(partitioned_inputs, specs): 267 | cur_part.extend(inputs[:len(spec.non_sequences)]) 268 | del inputs[:len(spec.non_sequences)] 269 | 270 | scan_routs = [ lstmstack.sample_scan_routine(spec, *p_input) for lstmstack,spec,p_input in zip(lstmstacks, specs, partitioned_inputs) ] 271 | new_posns = [] 272 | all_out_probs = [] 273 | for scan_rout, encoding in zip(scan_routs, encodings): 274 | last_rel_pos, last_out, cur_kwargs = scan_rout.send(None) 275 | 276 | new_pos = encoding.get_new_relative_position(last_absolute_chosen, last_rel_pos, last_out, bounds.lowbound, bounds.highbound, **cur_kwargs) 277 | new_posns.append(new_pos) 278 | addtl_kwargs = { 279 | "last_output": last_out 280 | } 281 | 282 | out_activations = scan_rout.send((new_pos, addtl_kwargs)) 283 | out_probs = encoding.decode_to_probs(out_activations,new_pos,bounds.lowbound, bounds.highbound) 284 | all_out_probs.append(out_probs) 285 | 286 | reduced_out_probs = functools.reduce((lambda x,y: x*y), all_out_probs) 287 | if normalize_artic_only: 288 | non_artic_probs = reduced_out_probs[:,:2] 289 | artic_probs = reduced_out_probs[:,2:] 290 | non_artic_sum = T.sum(non_artic_probs, 1, keepdims=True) 291 | artic_sum = T.sum(artic_probs, 1, keepdims=True) 292 | norm_artic_probs = artic_probs*(1-non_artic_sum)/artic_sum 293 | norm_out_probs = T.concatenate([non_artic_probs, norm_artic_probs], 1) 294 | else: 295 | normsum = T.sum(reduced_out_probs, 1, keepdims=True) 296 | normsum = T.maximum(normsum, constants.EPSILON) 297 | norm_out_probs = reduced_out_probs/normsum 298 | 299 | sampled_note = Encoding.sample_absolute_probs(srng, norm_out_probs) 300 | 301 | outputs = [] 302 | for scan_rout, encoding, new_pos in zip(scan_routs, encodings, new_posns): 303 | encoded_output = encoding.note_to_encoding(sampled_note, new_pos, bounds.lowbound, bounds.highbound) 304 | scan_outputs = scan_rout.send(encoded_output) 305 | scan_rout.close() 306 | outputs.extend(scan_outputs) 307 | 308 | return [sampled_note, norm_out_probs] + all_out_probs + outputs 309 | 310 | sequences = [] 311 | non_sequences = [] 312 | outputs_info = [{"initial":T.zeros((n_batch,),'int32'), "taps":[-1]}, None] + [None]*len(specs) 313 | for spec in specs: 314 | sequences.extend(spec.sequences) 315 | non_sequences.extend(spec.non_sequences) 316 | outputs_info.extend(spec.outputs_info) 317 | 318 | result, updates = theano.scan(fn=_scan_fn, sequences=sequences, non_sequences=non_sequences, outputs_info=outputs_info) 319 | 
all_chosen = result[0].dimshuffle((1,0))
320 |     all_probs = result[1].dimshuffle((1,0,2))
321 |     indiv_probs = [r.dimshuffle((1,0,2)) for r in result[2:2+len(specs)]]
322 | 
323 |     return updates, all_chosen, all_probs, indiv_probs
324 | 
--------------------------------------------------------------------------------
/models/simple_rel_model.py:
--------------------------------------------------------------------------------
1 | 
2 | import theano
3 | import theano.tensor as T
4 | from theano.sandbox.rng_mrg import MRG_RandomStreams
5 | from theano.compile.nanguardmode import NanGuardMode
6 | import numpy as np
7 | 
8 | import constants
9 | import input_parts
10 | from relshift_lstm import RelativeShiftLSTMStack
11 | from adam import Adam
12 | from note_encodings import Encoding
13 | import leadsheet
14 | 
15 | class SimpleModel(object):
16 |     def __init__(self, encoding, layer_sizes, inputs=None, shift_mode="drop", dropout=0, setup=False, nanguard=False, unroll_batch_num=None, bounds=constants.BOUNDS):
17 | 
18 |         self.encoding = encoding
19 | 
20 |         self.bounds = bounds
21 | 
22 |         if inputs is None:
23 |             inputs = [
24 |                 input_parts.BeatInputPart(),
25 |                 input_parts.PositionInputPart(self.bounds.lowbound, self.bounds.highbound, 2),
26 |                 input_parts.ChordShiftInputPart()]
27 | 
28 |         parts = inputs + [
29 |             input_parts.PassthroughInputPart("last_output", encoding.ENCODING_WIDTH)
30 |         ]
31 |         self.lstmstack = RelativeShiftLSTMStack(parts, layer_sizes, encoding.RAW_ENCODING_WIDTH, encoding.WINDOW_SIZE, dropout, mode=shift_mode, unroll_batch_num=unroll_batch_num)
32 | 
33 |         self.srng = MRG_RandomStreams(np.random.randint(1, 1024))
34 | 
35 |         self.learning_rate_var = theano.shared(np.array(0.0002, theano.config.floatX))
36 | 
37 |         self.update_fun = None
38 |         self.eval_fun = None
39 |         self.gen_fun = None
40 | 
41 |         self.nanguard = nanguard
42 | 
43 |         if setup:
44 |             print("Setting up train")
45 |             self.setup_train()
46 |             print("Setting up gen")
47 |             self.setup_generate()
48 |             print("Done setting up")
49 | 
50 |     @property
51 |     def params(self):
52 |         return self.lstmstack.params
53 | 
54 |     @params.setter
55 |     def params(self, paramlist):
56 |         self.lstmstack.params = paramlist
57 | 
58 |     def set_learning_rate(self, lr):
59 |         self.learning_rate_var.set_value(np.array(lr, theano.config.floatX))
60 | 
61 |     def setup_train(self):
62 | 
63 |         # dimensions: (batch, time, 12)
64 |         chord_types = T.btensor3()
65 | 
66 |         # dimensions: (batch, time)
67 |         chord_roots = T.imatrix()
68 | 
69 |         # dimensions: (batch, time)
70 |         relative_pos = T.imatrix()
71 | 
72 |         # dimensions: (batch, time, output_data)
73 |         encoded_melody = T.btensor3()
74 | 
75 |         # dimensions: (batch, time)
76 |         correct_notes = T.imatrix()
77 | 
78 |         n_batch, n_time = relative_pos.shape
79 | 
80 |         def _build(det_dropout):
81 |             activations = self.lstmstack.do_preprocess_scan( timestep=T.tile(T.arange(n_time), (n_batch,1)) ,
82 |                                                              relative_position=relative_pos,
83 |                                                              cur_chord_type=chord_types,
84 |                                                              cur_chord_root=chord_roots,
85 |                                                              last_output=T.concatenate([T.tile(self.encoding.initial_encoded_form(), (n_batch,1,1)),
86 |                                                                                         encoded_melody[:,:-1,:] ], 1),
87 |                                                              deterministic_dropout=det_dropout)
88 | 
89 |             out_probs = self.encoding.decode_to_probs(activations, relative_pos, self.bounds.lowbound, self.bounds.highbound)
90 |             return Encoding.compute_loss(out_probs, correct_notes, True)
91 | 
92 |         train_loss, train_info = _build(False)
93 |         updates = Adam(train_loss, self.params, lr=self.learning_rate_var)
94 | 
95 |         eval_loss, eval_info = _build(True)
96 | 
97 |         self.loss_info_keys = list(train_info.keys())
98 | 
99 |         self.update_fun =
theano.function( 100 | inputs=[chord_types, chord_roots, relative_pos, encoded_melody, correct_notes], 101 | outputs=[train_loss]+list(train_info.values()), 102 | updates=updates, 103 | allow_input_downcast=True, 104 | mode=(NanGuardMode(nan_is_error=True, inf_is_error=True, big_is_error=True) if self.nanguard else None)) 105 | 106 | self.eval_fun = theano.function( 107 | inputs=[chord_types, chord_roots, relative_pos, encoded_melody, correct_notes], 108 | outputs=[eval_loss]+list(eval_info.values()), 109 | allow_input_downcast=True, 110 | mode=(NanGuardMode(nan_is_error=True, inf_is_error=True, big_is_error=True) if self.nanguard else None)) 111 | 112 | def _assemble_batch(self, melody, chords): 113 | encoded_melody = [] 114 | relative_pos = [] 115 | correct_notes = [] 116 | chord_roots = [] 117 | chord_types = [] 118 | for m,c in zip(melody,chords): 119 | m = leadsheet.constrain_melody(m, self.bounds) 120 | e_m, r_p = self.encoding.encode_melody_and_position(m,c) 121 | encoded_melody.append(e_m) 122 | relative_pos.append(r_p) 123 | correct_notes.append(Encoding.encode_absolute_melody(m, self.bounds.lowbound, self.bounds.highbound)) 124 | c_roots, c_types = zip(*c) 125 | chord_roots.append(c_roots) 126 | chord_types.append(c_types) 127 | return (np.array(chord_types, np.float32), 128 | np.array(chord_roots, np.int32), 129 | np.array(relative_pos, np.int32), 130 | np.array(encoded_melody, np.float32), 131 | np.array(correct_notes, np.int32)) 132 | 133 | def train(self, chords, melody): 134 | assert self.update_fun is not None, "Need to call setup_train before train" 135 | res = self.update_fun(*self._assemble_batch(melody,chords)) 136 | loss = res[0] 137 | info = dict(zip(self.loss_info_keys, res[1:])) 138 | return loss, info 139 | 140 | def eval(self, chords, melody): 141 | assert self.update_fun is not None, "Need to call setup_train before eval" 142 | res = self.eval_fun(*self._assemble_batch(melody,chords)) 143 | loss = res[0] 144 | info = dict(zip(self.loss_info_keys, res[1:])) 145 | return loss, info 146 | 147 | def setup_generate(self): 148 | 149 | # dimensions: (batch, time, 12) 150 | chord_types = T.btensor3() 151 | 152 | # dimensions: (batch, time) 153 | chord_roots = T.imatrix() 154 | 155 | n_batch, n_time = chord_roots.shape 156 | 157 | spec = self.lstmstack.prepare_sample_scan( start_pos=T.alloc(np.array(self.encoding.STARTING_POSITION, np.int32), (n_batch)), 158 | start_out=T.tile(self.encoding.initial_encoded_form(), (n_batch,1)), 159 | timestep=T.tile(T.arange(n_time), (n_batch,1)), 160 | cur_chord_type=chord_types, 161 | cur_chord_root=chord_roots, 162 | deterministic_dropout=True ) 163 | 164 | def _scan_fn(*inputs): 165 | # inputs is [ spec_sequences..., last_absolute_position, spec_taps..., spec_non_sequences... 
] 166 | inputs = list(inputs) 167 | last_absolute_chosen = inputs.pop(len(spec.sequences)) 168 | scan_rout = self.lstmstack.sample_scan_routine(spec, *inputs) 169 | 170 | last_rel_pos, last_out, cur_kwargs = scan_rout.send(None) 171 | 172 | new_pos = self.encoding.get_new_relative_position(last_absolute_chosen, last_rel_pos, last_out, self.bounds.lowbound, self.bounds.highbound, **cur_kwargs) 173 | addtl_kwargs = { 174 | "last_output": last_out 175 | } 176 | 177 | out_activations = scan_rout.send((new_pos, addtl_kwargs)) 178 | out_probs = self.encoding.decode_to_probs(out_activations,new_pos,self.bounds.lowbound, self.bounds.highbound) 179 | sampled_note = Encoding.sample_absolute_probs(self.srng, out_probs) 180 | encoded_output = self.encoding.note_to_encoding(sampled_note, new_pos, self.bounds.lowbound, self.bounds.highbound) 181 | scan_outputs = scan_rout.send(encoded_output) 182 | scan_rout.close() 183 | 184 | return [sampled_note, out_probs] + scan_outputs 185 | 186 | outputs_info = [{"initial":T.zeros((n_batch,),'int32'), "taps":[-1]}, None] + spec.outputs_info 187 | result, updates = theano.scan(fn=_scan_fn, sequences=spec.sequences, non_sequences=spec.non_sequences, outputs_info=outputs_info) 188 | all_chosen = result[0].dimshuffle((1,0)) 189 | all_probs = result[1].dimshuffle((1,0,2)) 190 | 191 | self.generate_fun = theano.function( 192 | inputs=[chord_roots, chord_types], 193 | updates=updates, 194 | outputs=all_chosen, 195 | allow_input_downcast=True, 196 | mode=(NanGuardMode(nan_is_error=True, inf_is_error=True, big_is_error=True) if self.nanguard else None)) 197 | 198 | self.generate_visualize_fun = theano.function( 199 | inputs=[chord_roots, chord_types], 200 | updates=updates, 201 | outputs=[all_chosen, all_probs], 202 | allow_input_downcast=True, 203 | mode=(NanGuardMode(nan_is_error=True, inf_is_error=True, big_is_error=True) if self.nanguard else None)) 204 | 205 | def generate(self, chords): 206 | assert self.generate_fun is not None, "Need to call setup_generate before generate" 207 | 208 | chord_roots = [] 209 | chord_types = [] 210 | for c in chords: 211 | c_roots, c_types = zip(*c) 212 | chord_roots.append(c_roots) 213 | chord_types.append(c_types) 214 | chosen = self.generate_fun(np.array(chord_roots, np.int32),np.array(chord_types, np.float32)) 215 | return [Encoding.decode_absolute_melody(c, self.bounds.lowbound, self.bounds.highbound) for c in chosen] 216 | 217 | def generate_visualize(self, chords): 218 | assert self.generate_fun is not None, "Need to call setup_generate before generate" 219 | chord_roots = [] 220 | chord_types = [] 221 | for c in chords: 222 | c_roots, c_types = zip(*c) 223 | chord_roots.append(c_roots) 224 | chord_types.append(c_types) 225 | chosen, all_probs = self.generate_visualize_fun(chord_roots, chord_types) 226 | 227 | melody = [Encoding.decode_absolute_melody(c, self.bounds.lowbound, self.bounds.highbound) for c in chosen] 228 | return melody, chosen, all_probs 229 | 230 | def setup_produce(self): 231 | self.setup_generate() 232 | 233 | def produce(self, chords, melody): 234 | return self.generate_visualize(chords) + ([],) 235 | -------------------------------------------------------------------------------- /nametrain/name_model.py: -------------------------------------------------------------------------------- 1 | import theano 2 | import theano.tensor as T 3 | from theano.sandbox.rng_mrg import MRG_RandomStreams 4 | 5 | import numpy as np 6 | 7 | import constants 8 | import input_parts 9 | from relshift_lstm import 
RelativeShiftLSTMStack 10 | from queue_managers import QueueManager 11 | from adam import Adam 12 | from note_encodings import Encoding 13 | import leadsheet 14 | 15 | import itertools 16 | import functools 17 | from theano_lstm import LSTM, StackedCells, Layer 18 | from util import * 19 | import random 20 | 21 | import pickle 22 | 23 | CHARKEY = " !\"'(),-.01245679:?ABCDEFGHIJKLMNOPQRSTUVWYZabcdefghijklmnopqrstuvwxyz" 24 | 25 | def name_model(): 26 | 27 | LSTM_SIZE = 300 28 | layer1 = LSTM(len(CHARKEY), LSTM_SIZE, activation=T.tanh) 29 | layer2 = Layer(LSTM_SIZE, len(CHARKEY), activation=lambda x:x) 30 | params = layer1.params + [layer1.initial_hidden_state] + layer2.params 31 | 32 | ################# Train ################# 33 | train_data = T.ftensor3() 34 | n_batch = train_data.shape[0] 35 | train_input = T.concatenate([T.zeros([n_batch,1,len(CHARKEY)]),train_data[:,:-1,:]],1) 36 | train_output = train_data 37 | 38 | def _scan_train(last_out, last_state): 39 | new_state = layer1.activate(last_out, last_state) 40 | layer_out = layer1.postprocess_activation(new_state) 41 | layer2_out = layer2.activate(layer_out) 42 | new_out = T.nnet.softmax(layer2_out) 43 | return new_out, new_state 44 | 45 | outputs_info = [None, initial_state(layer1, n_batch)] 46 | (scan_outputs, scan_states), _ = theano.scan(_scan_train, sequences=[train_input.dimshuffle([1,0,2])], outputs_info=outputs_info) 47 | 48 | flat_scan_outputs = scan_outputs.dimshuffle([1,0,2]).reshape([-1,len(CHARKEY)]) 49 | flat_train_output = train_output.reshape([-1,len(CHARKEY)]) 50 | crossentropy = T.nnet.categorical_crossentropy(flat_scan_outputs, flat_train_output) 51 | loss = T.sum(crossentropy)/T.cast(n_batch,'float32') 52 | 53 | adam_updates = Adam(loss, params) 54 | 55 | train_fn = theano.function([train_data],loss,updates=adam_updates) 56 | 57 | ################# Eval ################# 58 | 59 | length = T.iscalar() 60 | srng = MRG_RandomStreams(np.random.randint(1, 1024)) 61 | 62 | def _scan_gen(last_out, last_state): 63 | new_state = layer1.activate(last_out, last_state) 64 | layer_out = layer1.postprocess_activation(new_state) 65 | layer2_out = layer2.activate(layer_out) 66 | new_out = T.nnet.softmax(T.shape_padleft(layer2_out)) 67 | sample = srng.multinomial(n=1,pvals=new_out)[0,:] 68 | sample = T.cast(sample,'float32') 69 | return sample, new_state 70 | 71 | initial_input = np.zeros([len(CHARKEY)], np.float32) 72 | outputs_info = [initial_input, layer1.initial_hidden_state] 73 | (scan_outputs, scan_states), updates = theano.scan(_scan_gen, n_steps=length, outputs_info=outputs_info) 74 | 75 | gen_fn = theano.function([length],scan_outputs,updates=updates) 76 | 77 | return layer1, layer2, train_fn, gen_fn 78 | 79 | def train_name(dataset_file): 80 | with open(dataset_file,'r') as f: 81 | dataset = [x.strip() for x in f] 82 | maxlen = max(len(x) for x in dataset) 83 | dataset = [x+" "*(maxlen-len(x)) for x in dataset] 84 | 85 | layer1, layer2, train_fn, gen_fn = name_model() 86 | params = layer1.params + [layer1.initial_hidden_state] + layer2.params 87 | 88 | print("Starting train...") 89 | 90 | BATCH_SIZE = 20 91 | for iteration in range(10000): 92 | sample = [random.choice(dataset) for _ in range(BATCH_SIZE)] 93 | sample_encoded = np.zeros([BATCH_SIZE, maxlen, len(CHARKEY)], np.float32) 94 | for i,train_string in enumerate(sample): 95 | for j,c in enumerate(train_string): 96 | try: 97 | idx = CHARKEY.index(c) 98 | except ValueError: 99 | print("Couldn't find character <{}>, replacing with space".format(c)) 100 | idx = 
CHARKEY.index(" ") 101 | sample_encoded[i,j,idx] = 1.0 102 | 103 | loss = train_fn(sample_encoded) 104 | if iteration % 100 == 0: 105 | print("Iter",iteration,"has loss",loss) 106 | for _ in range(10): 107 | generate_name(maxlen, layer1, layer2, train_fn, gen_fn) 108 | 109 | 110 | pickle.dump([p.get_value() for p in params], open("name_params.p", 'wb')) 111 | 112 | def generate_name(length, layer1=None, layer2=None, train_fn=None, gen_fn=None, num_names=1): 113 | if layer1 is None: 114 | layer1, layer2, train_fn, gen_fn = name_model() 115 | params = layer1.params + [layer1.initial_hidden_state] + layer2.params 116 | loaded_params = pickle.load(open("name_params.p", 'rb')) 117 | for p,v in zip(params, loaded_params): 118 | p.set_value(v) 119 | for i in range(num_names): 120 | scan_outputs = gen_fn(length) 121 | outval = [] 122 | for output in scan_outputs: 123 | idx = np.nonzero(output)[0][0] 124 | outval.append(CHARKEY[idx]) 125 | print(''.join(outval)) -------------------------------------------------------------------------------- /note_encodings/__init__.py: -------------------------------------------------------------------------------- 1 | from .base_encoding import Encoding 2 | from .relative_jump import RelativeJumpEncoding 3 | from .chord_relative import ChordRelativeEncoding 4 | from .abs_seq_encoding import AbsoluteSequentialEncoding 5 | from .circle_of_thirds_encoding import CircleOfThirdsEncoding 6 | from .rhythm_only import RhythmOnlyEncoding -------------------------------------------------------------------------------- /note_encodings/abs_seq_encoding.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import theano 3 | import theano.tensor as T 4 | import random 5 | 6 | from .base_encoding import Encoding 7 | 8 | import constants 9 | import leadsheet 10 | import math 11 | 12 | class AbsoluteSequentialEncoding( Encoding ): 13 | 14 | STARTING_POSITION = 0 15 | WINDOW_SIZE = 1 16 | 17 | def __init__(self, low_bound, high_bound): 18 | self.low_bound = low_bound 19 | self.high_bound = high_bound 20 | self.ENCODING_WIDTH = high_bound-low_bound+2 21 | self.RAW_ENCODING_WIDTH = self.ENCODING_WIDTH 22 | 23 | def encode_melody_and_position(self, melody, chords): 24 | abs_encoded_idxs = Encoding.encode_absolute_melody(melody, self.low_bound, self.high_bound) 25 | encoded_form = np.eye(self.ENCODING_WIDTH)[abs_encoded_idxs] 26 | position = np.zeros([abs_encoded_idxs.shape[0]]) 27 | return encoded_form, position 28 | 29 | def decode_to_probs(self, activations, relative_position, low_bound, high_bound): 30 | squashed = T.reshape(activations, (-1,self.RAW_ENCODING_WIDTH)) 31 | probs = T.nnet.softmax(squashed) 32 | fixed = T.reshape(probs, activations.shape) 33 | return fixed 34 | 35 | def note_to_encoding(self, chosen_note, relative_position, low_bound, high_bound): 36 | encoded_form = T.extra_ops.to_one_hot(chosen_note, self.ENCODING_WIDTH) 37 | return encoded_form 38 | 39 | def get_new_relative_position(self, last_chosen_note, last_rel_pos, last_out, low_bound, high_bound, **cur_kwargs): 40 | return T.zeros_like(last_chosen_note) 41 | 42 | def initial_encoded_form(self): 43 | return np.array([1]+[0]*(self.ENCODING_WIDTH-1), np.float32) 44 | -------------------------------------------------------------------------------- /note_encodings/base_encoding.py: -------------------------------------------------------------------------------- 1 | import theano 2 | import theano.tensor as T 3 | from theano.ifelse import ifelse 4 | import 
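The `np.eye(width)[indices]` idiom used in `encode_melody_and_position` above is a compact one-hot conversion: indexing the identity matrix with an index vector picks out one identity row per timestep. A self-contained toy example:

```
import numpy as np

ENCODING_WIDTH = 5                      # toy width: rest, continue, 3 notes
abs_encoded_idxs = np.array([3, 1, 0])  # attack note 1, continue, rest
print(np.eye(ENCODING_WIDTH)[abs_encoded_idxs])
# [[0. 0. 0. 1. 0.]
#  [0. 1. 0. 0. 0.]
#  [1. 0. 0. 0. 0.]]
```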
numpy as np 5 | import constants 6 | 7 | class Encoding( object ): 8 | """ 9 | Base class for note encodings 10 | """ 11 | ENCODING_WIDTH = 0 12 | RAW_ENCODING_WIDTH = 0 13 | STARTING_POSITION = 0 14 | 15 | def encode_melody_and_position(self, melody, chords): 16 | """ 17 | Encode a melody in the correct format 18 | 19 | Parameters: 20 | melody: A melody object, of the form [(note_or_none, dur), ... ], 21 | where note_or_none is either a MIDI note value or None if this is a rest, 22 | and dur is the duration, relative to constants.RESOLUTION_SCALAR. 23 | chords: A chord object, of the form [(root, typevec), ...], 24 | where root is a MIDI note value in 0-12, typevec is a boolean list of length 12 25 | 26 | Returns: 27 | encoded_form: A numpy ndarray (float32) of shape (timestep, ENCODING_WIDTH) representing 28 | the encoded form of the melody 29 | relative_positions: A numpy ndarray (int32) of shape (timestep), where relative_positions[t] 30 | gives the note position that the given timestep encoding is relative to. 31 | """ 32 | raise NotImplementedError("encode_melody_and_position not implemented") 33 | 34 | def decode_to_probs(self, activations, relative_position, low_bound, high_bound): 35 | """ 36 | Convert a set of activations to a probability form across notes. 37 | 38 | Parameters: 39 | activations: A theano tensor (float32) of shape (..., RAW_ENCODING_WIDTH) giving 40 | raw activations from a standard neural network layer 41 | relative_position: A theano tensor of shape (...) giving the current relative position 42 | low_bound: The MIDI index of the lowest note to return 43 | high_bound: The MIDI index of one past the highest note to return 44 | 45 | Returns: 46 | encoded_probs: A theano tensor (float32) of shape (..., 2+high_bound-low_bound) giving a 47 | probability distribution for chosen notes, where 48 | [0]: rest 49 | [1]: continue 50 | [2+x]: play note (low_bound + x) 51 | """ 52 | raise NotImplementedError("decode_to_probs not implemented") 53 | 54 | def note_to_encoding(self, chosen_note, relative_position, low_bound, high_bound): 55 | """ 56 | Convert a chosen note back into an encoded form 57 | 58 | Parameters: 59 | relative_position: A theano tensor of shape (...) giving the current relative position 60 | chosen_note: A theano tensor of shape (...) giving an index into encoded_probs 61 | low_bound: The MIDI index of the lowest note to return 62 | high_bound: The MIDI index of one past the highest note to return 63 | 64 | Returns: 65 | sampled_output: A theano tensor (float32) of shape (..., ENCODING_WIDTH) that is 66 | sampled from encoded_probs. 
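Concretely, the melody and chord objects described in these docstrings look like the following (toy values; durations are counted in slices of `constants.RESOLUTION_SCALAR`):

```
# A note at MIDI 60 for 6 slices, a 6-slice rest, then MIDI 62 for 12 slices.
melody = [(60, 6), (None, 6), (62, 12)]

# One chord entry per timestep: (root pitch class, 12-entry boolean type vector).
cmaj = (0, [1,0,0,0,1,0,0,1,0,0,0,0])  # C major triad pitch classes
chords = [cmaj] * 24                   # matches the melody's 24 total timesteps
```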
Should have the same representation as encoded_form from 67 | encode_melody 68 | """ 69 | raise NotImplementedError("note_to_output not implemented") 70 | 71 | def get_new_relative_position(self, last_chosen_note, last_rel_pos, last_out, low_bound, high_bound, **cur_kwargs): 72 | """ 73 | Get the new relative position for this timestep 74 | 75 | Parameters: 76 | last_chosen_note is a theano tensor of shape (n_batch) indexing into 2+high_bound-low_bound 77 | last_rel_pos is a theano tensor of shape (n_batch) 78 | last_out will be a theano tensor of shape (n_batch, output_size) 79 | cur_kwargs[k] is a theano tensor of shape (n_batch, ...), from kwargs 80 | low_bound: The MIDI index of the lowest note to return 81 | high_bound: The MIDI index of one past the highest note to return 82 | 83 | Returns: 84 | new_pos, a theano tensor of shape (n_batch), giving the new relative position 85 | """ 86 | raise NotImplementedError("get_new_relative_position not implemented") 87 | 88 | 89 | def initial_encoded_form(self): 90 | """ 91 | Returns: A numpy ndarray (float32) of shape (ENCODING_WIDTH) for an initial encoding of 92 | the "previous note" when there is no previous data. Generally should be a representation 93 | of nothing, i.e. of a rest. 94 | """ 95 | raise NotImplementedError("initial_encoded_form not implemented") 96 | 97 | @staticmethod 98 | def encode_absolute_melody(melody, low_bound, high_bound): 99 | """ 100 | Encode an absolute melody 101 | 102 | Parameters: 103 | melody: A melody object, of the form [(note_or_none, dur), ... ], 104 | where note_or_none is either a MIDI note value or None if this is a rest, 105 | and dur is the duration, relative to constants.RESOLUTION_SCALAR. 106 | low_bound: The MIDI index of the lowest note to return 107 | high_bound: The MIDI index of one past the highest note to return 108 | 109 | Returns 110 | A numpy matrix of shape (timestep) giving the int index (in 2+high_bound-low_bound) of the correct note 111 | """ 112 | positions = [] 113 | 114 | for note, dur in melody: 115 | positions.append(0 if note is None else (note-low_bound+2)) 116 | 117 | for _ in range(dur-1): 118 | positions.append(0 if note is None else 1) 119 | 120 | return np.array(positions, np.int32) 121 | 122 | 123 | @staticmethod 124 | def decode_absolute_melody(positions, low_bound, high_bound): 125 | """ 126 | Decode an absolute melody 127 | 128 | Parameters: 129 | A numpy matrix of shape (timestep) giving the int index (in 2+high_bound-low_bound) of the correct note 130 | low_bound: The MIDI index of the lowest note to return 131 | high_bound: The MIDI index of one past the highest note to return 132 | 133 | Returns 134 | melody: A melody object, of the form [(note_or_none, dur), ... ], 135 | where note_or_none is either a MIDI note value or None if this is a rest, 136 | and dur is the duration, relative to constants.RESOLUTION_SCALAR. 137 | """ 138 | melody = [] 139 | 140 | for out in positions.tolist(): 141 | if out==1: 142 | # Continue a note 143 | if len(melody) == 0: 144 | print("ERROR: Can't continue from nothing! 
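A worked example of `encode_absolute_melody`'s index scheme (toy bounds): index 0 is a rest, 1 continues the previous note, and `2+x` attacks MIDI note `low_bound + x`.

```
import numpy as np

low_bound = 48
melody = [(60, 2), (None, 2), (62, 1)]   # toy melody

positions = []
for note, dur in melody:
    positions.append(0 if note is None else (note - low_bound + 2))
    for _ in range(dur - 1):
        positions.append(0 if note is None else 1)

print(np.array(positions, np.int32))     # [14  1  0  0 16]
```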
Inserting rest") 145 | melody.append([None, 0]) 146 | melody[-1][1] += 1 147 | elif out==0: 148 | # Rest 149 | if len(melody)>0 and melody[-1][0] is None: 150 | # More rest 151 | melody[-1][1] += 1 152 | else: 153 | melody.append([None, 1]) 154 | else: 155 | note = out-2 + low_bound 156 | melody.append([note, 1]) 157 | 158 | return [tuple(x) for x in melody] 159 | 160 | @staticmethod 161 | def sample_absolute_probs(srng, probs): 162 | """ 163 | Sample from a probability distribution 164 | 165 | Parameters: 166 | srng: A RandomStreams instance 167 | probs: A matrix of probabilities of shape (n_batch, sample_from) 168 | 169 | Returns: 170 | Sampled output, an index in [0,sample_from) of shape (n_batch) 171 | One-hot encoding of that output of shape (n_batch, sample_from) 172 | """ 173 | n_batch,sample_from = probs.shape 174 | 175 | sample = srng.multinomial(n=1,pvals=probs) 176 | idx = T.cast(T.argmax(sample,axis=1),'int32') 177 | 178 | return idx 179 | 180 | @staticmethod 181 | def compute_loss(probs, absolute_melody, extra_info=False): 182 | """ 183 | Compute loss between probs and an absolute melody 184 | 185 | Parameters: 186 | probs: A theano tensor of shape (batch, time, 2+high_bound-low_bound) 187 | absolute_melody: A tensor of shape (batch, time) with correct indices 188 | extra_info: If True, return extra info 189 | 190 | Returns 191 | A theano tensor loss value. 192 | Also, if extra_info is true, an additional info dict. 193 | """ 194 | n_batch, n_time, prob_width = probs.shape 195 | correct_encoded_form = T.reshape(T.extra_ops.to_one_hot(T.flatten(absolute_melody), prob_width), probs.shape) 196 | loglikelihoods = T.log( probs + constants.EPSILON )*correct_encoded_form 197 | full_loss = T.neg(T.sum(loglikelihoods)) 198 | 199 | if extra_info: 200 | loss_per_timestep = full_loss/T.cast(n_batch*n_time, theano.config.floatX) 201 | accuracy_per_timestep = T.exp(-loss_per_timestep) 202 | 203 | loss_per_batch = full_loss/T.cast(n_batch, theano.config.floatX) 204 | accuracy_per_batch = T.exp(-loss_per_batch) 205 | 206 | num_jumps = T.sum(correct_encoded_form[:,:,2:]) 207 | loss_per_jump = full_loss/T.cast(num_jumps, theano.config.floatX) 208 | accuracy_per_jump = T.exp(-loss_per_jump) 209 | 210 | return full_loss, { 211 | "loss_per_timestep":loss_per_timestep, 212 | "accuracy_per_timestep":accuracy_per_timestep, 213 | "loss_per_batch":loss_per_batch, 214 | "accuracy_per_batch":accuracy_per_batch, 215 | "loss_per_jump":loss_per_jump, 216 | "accuracy_per_jump":accuracy_per_jump 217 | } 218 | else: 219 | return full_loss 220 | -------------------------------------------------------------------------------- /note_encodings/chord_relative.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import theano 3 | import theano.tensor as T 4 | import random 5 | 6 | from .base_encoding import Encoding 7 | 8 | import constants 9 | import leadsheet 10 | import math 11 | 12 | class ChordRelativeEncoding( Encoding ): 13 | """ 14 | An encoding based on the chord. 
Encoding format is a one-hot 15 | 16 | [ 17 | 18 | rest, \ 19 | continue, } (a softmax set of excl probs) 20 | play x 12, / 21 | 22 | ] 23 | 24 | where play is relative to the chord root 25 | """ 26 | 27 | ENCODING_WIDTH = 1 + 1 + 12 28 | WINDOW_SIZE = 12 29 | 30 | def __init__(self, with_artic=True): 31 | self.with_artic = with_artic 32 | self.RAW_ENCODING_WIDTH = self.WINDOW_SIZE + (2 if with_artic else 0) 33 | 34 | def encode_melody_and_position(self, melody, chords): 35 | 36 | time = 0 37 | positions = [] 38 | encoded_form = [] 39 | 40 | for note, dur in melody: 41 | root, ctype = chords[time] 42 | if note is None: 43 | encoded_form.append([1]+[0]+[0]*self.WINDOW_SIZE) 44 | else: 45 | index = (note - root)%self.WINDOW_SIZE 46 | encoded_form.append([0]+[0]+[1 if i==index else 0 for i in range(self.WINDOW_SIZE)]) 47 | 48 | for _ in range(dur-1): 49 | rcp = [1 if note is None else 0] + [0 if note is None else 1] + [0]*self.WINDOW_SIZE 50 | encoded_form.append(rcp) 51 | time += dur 52 | 53 | positions = [root for root,ctype in chords] 54 | 55 | return np.array(encoded_form, np.float32), np.array(positions, np.int32) 56 | 57 | def decode_to_probs(self, activations, relative_position, low_bound, high_bound): 58 | squashed = T.reshape(activations, (-1,self.RAW_ENCODING_WIDTH)) 59 | n_parallel = squashed.shape[0] 60 | probs = T.nnet.softmax(squashed) 61 | 62 | 63 | def _scan_fn(cprobs, cpos): 64 | 65 | if self.with_artic: 66 | abs_probs = cprobs[:2] 67 | rel_probs = cprobs[2:] 68 | else: 69 | rel_probs = cprobs 70 | abs_probs = T.ones((2,)) 71 | 72 | aligned = T.roll(rel_probs, (cpos-low_bound)%12) 73 | 74 | num_tile = int(math.ceil((high_bound-low_bound)/self.WINDOW_SIZE)) 75 | 76 | tiled = T.tile(aligned, (num_tile,))[:(high_bound-low_bound)] 77 | 78 | full = T.concatenate([abs_probs, tiled], 0) 79 | return full 80 | 81 | # probs = theano.printing.Print("probs",['shape'])(probs) 82 | # relative_position = theano.printing.Print("relative_position",['shape'])(relative_position) 83 | from_scan, _ = theano.map(fn=_scan_fn, sequences=[probs, T.flatten(relative_position)]) 84 | # from_scan = theano.printing.Print("from_scan",['shape'])(from_scan) 85 | newshape = T.concatenate([activations.shape[:-1],[2+high_bound-low_bound]],0) 86 | fixed = T.reshape(from_scan, newshape, ndim=activations.ndim) 87 | return fixed 88 | 89 | def note_to_encoding(self, chosen_note, relative_position, low_bound, high_bound): 90 | """ 91 | Convert a chosen note back into an encoded form 92 | 93 | Parameters: 94 | relative_position: A theano tensor of shape (...) giving the current relative position 95 | chosen_note: A theano tensor of shape (...) giving an index into encoded_probs 96 | 97 | Returns: 98 | sampled_output: A theano tensor (float32) of shape (..., ENCODING_WIDTH) that is 99 | sampled from encoded_probs. 
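The `T.roll`/`T.tile` combination in `decode_to_probs` above rotates the 12 chord-relative probabilities so that index 0 lands on the chord root in absolute space, then repeats that octave pattern across the full note range. The same alignment in NumPy (toy values):

```
import numpy as np

rel_probs = np.zeros(12); rel_probs[0] = 1.0   # all mass on the chord root
cpos, low_bound, high_bound = 7, 48, 72        # toy G root, two-octave range

aligned = np.roll(rel_probs, (cpos - low_bound) % 12)
num_tile = int(np.ceil((high_bound - low_bound) / 12))
tiled = np.tile(aligned, num_tile)[:high_bound - low_bound]
print(np.nonzero(tiled)[0] + low_bound)        # [55 67] -- the Gs in range
```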
Should have the same representation as encoded_form from 100 | encode_melody 101 | """ 102 | new_idx = T.switch(chosen_note<2, chosen_note, (chosen_note-2+low_bound-relative_position)%self.WINDOW_SIZE + 2) 103 | sampled_output = T.extra_ops.to_one_hot(new_idx, self.ENCODING_WIDTH) 104 | return sampled_output 105 | 106 | def get_new_relative_position(self, last_chosen_note, last_rel_pos, last_out, low_bound, high_bound, cur_chord_root, **cur_kwargs): 107 | return cur_chord_root 108 | 109 | def initial_encoded_form(self): 110 | return np.array([1]+[0]+[0]*self.WINDOW_SIZE, np.float32) 111 | -------------------------------------------------------------------------------- /note_encodings/circle_of_thirds_encoding.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import theano 3 | import theano.tensor as T 4 | import random 5 | 6 | from .base_encoding import Encoding 7 | 8 | import constants 9 | import leadsheet 10 | import math 11 | 12 | class CircleOfThirdsEncoding( Encoding ): 13 | """ 14 | [ rest sustain play ] 15 | [ () () () () ] 16 | [ () () () ] 17 | [ octave0 ... ] 18 | """ 19 | 20 | STARTING_POSITION = 0 21 | WINDOW_SIZE = 12 22 | 23 | def __init__(self, octave_start, num_octaves): 24 | self.octave_start = octave_start 25 | self.num_octaves = num_octaves 26 | self.ENCODING_WIDTH = 3 + 4 + 3 + num_octaves 27 | self.RAW_ENCODING_WIDTH = self.ENCODING_WIDTH 28 | 29 | def encode_melody_and_position(self, melody, chords): 30 | encoded_form = [] 31 | 32 | for note, dur in melody: 33 | if note is None: 34 | for _ in range(dur): 35 | encoded_form.append([1] + [0]*(self.ENCODING_WIDTH-1)) 36 | else: 37 | pitchclass = note % 12 38 | octave = (note - self.octave_start)//12 39 | 40 | first_circle = [(1 if ((pitchclass-i)%4 == 0) else 0) for i in range(4)] 41 | second_circle = [(1 if ((pitchclass-i)%3 == 0) else 0) for i in range(3)] 42 | octave_enc = [1 if (i==octave) else 0 for i in range(self.num_octaves)] 43 | 44 | enc_timestep = [0,0,1] + first_circle + second_circle + octave_enc 45 | 46 | encoded_form.append(enc_timestep) 47 | 48 | for _ in range(dur-1): 49 | encoded_form.append([0, 1] + [0]*(self.ENCODING_WIDTH-2)) 50 | 51 | encoded_form = np.array(encoded_form, np.float32) 52 | position = np.zeros([encoded_form.shape[0]]) 53 | return encoded_form, position 54 | 55 | def decode_to_probs(self, activations, relative_position, low_bound, high_bound): 56 | assert (low_bound%12==0) and (high_bound-low_bound == self.num_octaves*12), "Circle of thirds must evenly divide into octaves" 57 | squashed = T.reshape(activations, (-1,self.RAW_ENCODING_WIDTH)) 58 | 59 | rsp = T.nnet.softmax(squashed[:,:3]) 60 | c1 = T.nnet.softmax(squashed[:,3:7]) 61 | c2 = T.nnet.softmax(squashed[:,7:10]) 62 | octave_choice = T.nnet.softmax(squashed[:,10:]) 63 | octave_notes = T.tile(c1,(1,3)) * T.tile(c2,(1,4)) 64 | full_notes = T.reshape(T.shape_padright(octave_choice) * T.shape_padaxis(octave_notes, 1), (-1,12*self.num_octaves)) 65 | full_probs = T.concatenate([rsp[:,:2], T.shape_padright(rsp[:,2])*full_notes], 1) 66 | 67 | newshape = T.concatenate([activations.shape[:-1],[2+high_bound-low_bound]],0) 68 | fixed = T.reshape(full_probs, newshape, ndim=activations.ndim) 69 | return fixed 70 | 71 | def note_to_encoding(self, chosen_note, relative_position, low_bound, high_bound): 72 | assert chosen_note.ndim == 1 73 | n_batch = chosen_note.shape[0] 74 | 75 | dont_play_version = T.switch( T.shape_padright(T.eq(chosen_note, 0)), 76 | T.tile(np.array([[1,0] + 
[0]*(self.ENCODING_WIDTH-2)], dtype=np.float32), (n_batch, 1)), 77 | T.tile(np.array([[0,1] + [0]*(self.ENCODING_WIDTH-2)], dtype=np.float32), (n_batch, 1))) 78 | 79 | rcp = T.tile(np.array([0,0,1],dtype=np.float32), (n_batch, 1)) 80 | circle_1 = T.eye(4)[(chosen_note-2)%4] 81 | circle_2 = T.eye(3)[(chosen_note-2)%3] 82 | octave = T.eye(self.num_octaves)[(chosen_note-2+low_bound-self.octave_start)//12] 83 | 84 | play_version = T.concatenate([rcp, circle_1, circle_2, octave], 1) 85 | 86 | encoded_form = T.switch( T.shape_padright(T.lt(chosen_note, 2)), dont_play_version, play_version ) 87 | return encoded_form 88 | 89 | def get_new_relative_position(self, last_chosen_note, last_rel_pos, last_out, low_bound, high_bound, **cur_kwargs): 90 | return T.zeros_like(last_chosen_note) 91 | 92 | def initial_encoded_form(self): 93 | return np.array([1]+[0]*(self.ENCODING_WIDTH-1), np.float32) 94 | -------------------------------------------------------------------------------- /note_encodings/relative_jump.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import theano 3 | import theano.tensor as T 4 | import random 5 | 6 | from .base_encoding import Encoding 7 | 8 | import constants 9 | import leadsheet 10 | 11 | def rotate(li, x): 12 | """ 13 | Rotate list li by x spaces to the right, i.e. 14 | rotate([1,2,3,4],1) -> [4,1,2,3] 15 | """ 16 | if len(li) == 0: return [] 17 | return li[-x % len(li):] + li[:-x % len(li)] 18 | 19 | class RelativeJumpEncoding( Encoding ): 20 | """ 21 | An encoding based on relative jumps. Encoding format is a one-hot 22 | 23 | [ 24 | 25 | rest, \ 26 | continue, } (a softmax set of excl probs) 27 | play x WINDOW_SIZE, / 28 | 29 | ] 30 | 31 | where WINDOW_SIZE gives the number of places to which we can jump. 32 | """ 33 | 34 | WINDOW_RADIUS = 12 35 | WINDOW_SIZE = WINDOW_RADIUS*2+1 36 | 37 | STARTING_POSITION = 72 38 | 39 | ENCODING_WIDTH = 1 + 1 + WINDOW_SIZE 40 | 41 | def __init__(self, with_artic=True): 42 | self.with_artic = with_artic 43 | self.RAW_ENCODING_WIDTH = self.WINDOW_SIZE + (2 if with_artic else 0) 44 | 45 | def encode_melody_and_position(self, melody, chords): 46 | 47 | positions = [] 48 | encoded_form = [] 49 | 50 | cur_pos = next((n for n,d in melody if n is not None), self.STARTING_POSITION) + random.randrange(-self.WINDOW_RADIUS, self.WINDOW_RADIUS+1) 51 | 52 | positions.append(cur_pos) 53 | 54 | for note, dur in melody: 55 | if note is None: 56 | delta = 0 57 | else: 58 | delta = note - cur_pos 59 | cur_pos = note 60 | if not (-self.WINDOW_RADIUS <= delta <= self.WINDOW_RADIUS): 61 | olddelta = delta 62 | if delta>0: 63 | delta = delta % self.WINDOW_RADIUS 64 | else: 65 | delta = -(-delta % self.WINDOW_RADIUS) 66 | # print("WARNING: Jump of size {} from {} to {} not allowed. 
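The two circles in `CircleOfThirdsEncoding` above store a pitch class as its residues mod 4 and mod 3; since 4 and 3 are coprime, that pair determines the pitch class uniquely (the Chinese remainder theorem), which is why `decode_to_probs` can rebuild all 12 pitch classes by multiplying a 4-way and a 3-way softmax. A quick check:

```
# Every pitch class 0-11 maps to a distinct (mod 4, mod 3) residue pair.
pairs = [(pc % 4, pc % 3) for pc in range(12)]
assert len(set(pairs)) == 12
print(pairs[7])  # pitch class 7 (G) -> (3, 1)
```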
Substituting jump of size {}".format(olddelta, note-olddelta, note, delta)) 67 | 68 | rcp = ([1 if note is None else 0] + 69 | [0] + 70 | [(1 if i==delta and note is not None else 0) for i in range(-self.WINDOW_RADIUS, self.WINDOW_RADIUS+1)]) 71 | 72 | encoded_form.append(rcp) # for this timestep 73 | positions.append(cur_pos) # for next timestep 74 | 75 | for _ in range(dur-1): 76 | rcp = [1 if note is None else 0] + [0 if note is None else 1] + [0]*self.WINDOW_SIZE 77 | encoded_form.append(rcp) 78 | positions.append(cur_pos) 79 | 80 | # Remove last position, since nothing is relative to it 81 | positions = positions[:-1] 82 | 83 | return np.array(encoded_form, np.float32), np.array(positions, np.int32) 84 | 85 | def decode_to_probs(self, activations, relative_position, low_bound, high_bound): 86 | squashed = T.reshape(activations, (-1,self.RAW_ENCODING_WIDTH)) 87 | n_parallel = squashed.shape[0] 88 | probs = T.nnet.softmax(squashed) 89 | 90 | def _scan_fn(cprobs, cpos): 91 | # cprobs = theano.printing.Print("cprobs",['shape'])(cprobs) 92 | # cpos = theano.printing.Print("cpos",['shape'])(cpos) 93 | 94 | if self.with_artic: 95 | abs_probs = cprobs[:2] 96 | rel_probs = cprobs[2:] 97 | else: 98 | rel_probs = cprobs 99 | 100 | # abs_probs = theano.printing.Print("abs_probs",['shape'])(abs_probs) 101 | # rel_probs = theano.printing.Print("rel_probs",['shape'])(rel_probs) 102 | 103 | # Start index: 104 | # *****[-----------------------------] 105 | # [****{-|------] 106 | # 107 | # [-----------------------------] 108 | # ~~~~~{------|------] 109 | start_diff = low_bound - (cpos-self.WINDOW_RADIUS) 110 | startidx = T.maximum(0, start_diff) 111 | startpadding = T.maximum(0, -start_diff) 112 | # End index: 113 | # [-----------------------------] 114 | # [******|**}---] 115 | # 116 | # [-----------------------------] 117 | # [******|******}~~~~~~~~~~~ 118 | endidx = T.minimum(self.WINDOW_SIZE, high_bound - (cpos-self.WINDOW_RADIUS)) 119 | endpadding = T.maximum(0, high_bound-(cpos+self.WINDOW_RADIUS+1)) 120 | 121 | # start_diff = theano.printing.Print("start_diff",['shape','__str__'])(start_diff) 122 | # startidx = theano.printing.Print("startidx",['shape','__str__'])(startidx) 123 | # startpadding = theano.printing.Print("startpadding",['shape','__str__'])(startpadding) 124 | # endidx = theano.printing.Print("endidx",['shape','__str__'])(endidx) 125 | # endpadding = theano.printing.Print("endpadding",['shape','__str__'])(endpadding) 126 | 127 | cropped = rel_probs[startidx:endidx] 128 | 129 | if self.with_artic: 130 | normalize_sum = T.sum(cropped) + T.sum(abs_probs) 131 | normalize_sum = T.maximum(normalize_sum, constants.EPSILON) 132 | padded = T.concatenate([abs_probs/normalize_sum, T.zeros((startpadding,)), cropped/normalize_sum, T.zeros((endpadding,))], 0) 133 | else: 134 | normalize_sum = T.sum(cropped) 135 | normalize_sum = T.maximum(normalize_sum, constants.EPSILON) 136 | padded = T.concatenate([T.ones((2,)), T.zeros((startpadding,)), cropped/normalize_sum, T.zeros((endpadding,))], 0) 137 | 138 | # padded = theano.printing.Print("padded",['shape'])(padded) 139 | return padded 140 | 141 | # probs = theano.printing.Print("probs",['shape'])(probs) 142 | # relative_position = theano.printing.Print("relative_position",['shape'])(relative_position) 143 | from_scan, _ = theano.map(fn=_scan_fn, sequences=[probs, T.flatten(relative_position)]) 144 | # from_scan = theano.printing.Print("from_scan",['shape'])(from_scan) 145 | newshape = 
T.concatenate([activations.shape[:-1],[2+high_bound-low_bound]],0) 146 | fixed = T.reshape(from_scan, newshape, ndim=activations.ndim) 147 | return fixed 148 | 149 | def note_to_encoding(self, chosen_note, relative_position, low_bound, high_bound): 150 | """ 151 | Convert a chosen note back into an encoded form 152 | 153 | Parameters: 154 | relative_position: A theano tensor of shape (...) giving the current relative position 155 | chosen_note: A theano tensor of shape (...) giving an index into encoded_probs 156 | 157 | Returns: 158 | sampled_output: A theano tensor (float32) of shape (..., ENCODING_WIDTH) that is 159 | sampled from encoded_probs. Should have the same representation as encoded_form from 160 | encode_melody 161 | """ 162 | new_idx = T.switch(chosen_note<2, chosen_note, chosen_note+low_bound-relative_position+self.WINDOW_RADIUS) 163 | new_idx = T.opt.Assert("new_idx should be less than {}".format(self.ENCODING_WIDTH))(new_idx, T.all(new_idx < self.ENCODING_WIDTH)) 164 | sampled_output = T.extra_ops.to_one_hot(new_idx, self.ENCODING_WIDTH) 165 | return sampled_output 166 | 167 | def get_new_relative_position(self, last_chosen_note, last_rel_pos, last_out, low_bound, high_bound, **cur_kwargs): 168 | return T.switch(last_chosen_note<2, last_rel_pos, last_chosen_note+low_bound-2) 169 | 170 | def initial_encoded_form(self): 171 | return np.array([1]+[0]+[0]*self.WINDOW_SIZE, np.float32) 172 | -------------------------------------------------------------------------------- /note_encodings/rhythm_only.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import theano 3 | import theano.tensor as T 4 | import random 5 | 6 | from .base_encoding import Encoding 7 | 8 | import constants 9 | import leadsheet 10 | import math 11 | 12 | class RhythmOnlyEncoding( Encoding ): 13 | """ 14 | An encoding that only encodes rhythm, of the form 15 | 16 | [ 17 | 18 | rest, \ 19 | continue, } (a softmax set of excl probs) 20 | articulate, / 21 | 22 | ] 23 | """ 24 | 25 | ENCODING_WIDTH = 3 26 | WINDOW_SIZE = 12 27 | RAW_ENCODING_WIDTH = 3 28 | 29 | def encode_melody_and_position(self, melody, chords): 30 | 31 | time = 0 32 | positions = [] 33 | encoded_form = [] 34 | 35 | for note, dur in melody: 36 | root, ctype = chords[time] 37 | if note is None: 38 | encoded_form.append([1,0,0]) 39 | else: 40 | encoded_form.append([0,0,1]) 41 | 42 | for _ in range(dur-1): 43 | encoded_form.append([1,0,0] if note is None else [0,1,0]) 44 | time += dur 45 | 46 | positions = [root for root,ctype in chords] 47 | 48 | return np.array(encoded_form, np.float32), np.array(positions, np.int32) 49 | 50 | def decode_to_probs(self, activations, relative_position, low_bound, high_bound): 51 | squashed = T.reshape(activations, (-1,self.RAW_ENCODING_WIDTH)) 52 | n_parallel = squashed.shape[0] 53 | probs = T.nnet.softmax(squashed) 54 | 55 | abs_probs = probs[:,:2] 56 | artic_prob = probs[:,2:] 57 | repeated_artic_probs = T.tile(artic_prob, (1,high_bound-low_bound)) 58 | 59 | full_probs = T.concatenate([abs_probs,repeated_artic_probs],1) 60 | 61 | newshape = T.concatenate([activations.shape[:-1],[2+high_bound-low_bound]],0) 62 | fixed = T.reshape(full_probs, newshape, ndim=activations.ndim) 63 | return fixed 64 | 65 | def note_to_encoding(self, chosen_note, relative_position, low_bound, high_bound): 66 | """ 67 | Convert a chosen note back into an encoded form 68 | 69 | Parameters: 70 | relative_position: A theano tensor of shape (...) 
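To make the index arithmetic in `note_to_encoding` and `get_new_relative_position` above concrete: a chosen index `c >= 2` means absolute MIDI note `low_bound + c - 2`, and its slot in the relative encoding is the jump from the previous position offset by `WINDOW_RADIUS`, plus the two rest/continue slots. In plain Python (toy values):

```
WINDOW_RADIUS = 12
low_bound, last_position = 48, 55   # toy bounds and previous position
chosen_note = 2 + 12                # index that means absolute MIDI note 60

note = low_bound + chosen_note - 2
new_idx = chosen_note + low_bound - last_position + WINDOW_RADIUS
print(note, new_idx)                # 60 19 -- slot 2 + (60 - 55 + 12)
```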
giving the current relative position 71 | chosen_note: A theano tensor of shape (...) giving an index into encoded_probs 72 | 73 | Returns: 74 | sampled_output: A theano tensor (float32) of shape (..., ENCODING_WIDTH) that is 75 | sampled from encoded_probs. Should have the same representation as encoded_form from 76 | encode_melody 77 | """ 78 | new_idx = T.switch(chosen_note<2, chosen_note, 2) 79 | sampled_output = T.extra_ops.to_one_hot(new_idx, self.ENCODING_WIDTH) 80 | return sampled_output 81 | 82 | def get_new_relative_position(self, last_chosen_note, last_rel_pos, last_out, low_bound, high_bound, cur_chord_root, **cur_kwargs): 83 | return cur_chord_root 84 | 85 | def initial_encoded_form(self): 86 | return np.array([1,0,0], np.float32) 87 | -------------------------------------------------------------------------------- /param_cvt.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import os 3 | import leadsheet 4 | import argparse 5 | import pickle 6 | import numpy as np 7 | import zipfile 8 | import io 9 | 10 | def main(file, precision, keys=None, output=None, make_zip=False): 11 | params = pickle.load(open(file, 'rb')) 12 | param_vals = [x if isinstance(x,np.ndarray) else x.get_value() for x in params] 13 | if output is None: 14 | output = os.path.splitext(file)[0] + (".ctome" if make_zip else "-raw") 15 | 16 | with open(keys,'r') as f: 17 | config_info = f.readline() 18 | key_names = f.readlines() 19 | assert len(key_names) == len(params), "Wrong number of keys for params! {} keys, {} params".format(len(key_names), len(params)) 20 | 21 | fmt = '%.{}e'.format(precision) 22 | if make_zip: 23 | with zipfile.ZipFile(output, 'w', zipfile.ZIP_DEFLATED) as zfile: 24 | for name,val in zip(key_names, param_vals): 25 | with io.BytesIO() as str_capture: 26 | np.savetxt(str_capture, val, fmt=fmt, delimiter=",") 27 | zfile.writestr("param_{}.csv".format(name.strip()), str_capture.getvalue()) 28 | zfile.writestr("config.txt", config_info) 29 | else: 30 | for name,val in zip(key_names, param_vals): 31 | np.savetxt("{}_{}.csv".format(output,name.strip()), val, fmt=fmt, delimiter=",") 32 | with open("{}_config.txt".format(output), 'w') as f: 33 | f.write(config_info) 34 | 35 | parser = argparse.ArgumentParser(description='Convert a python parameters file into an Impro-Visor connectome file') 36 | parser.add_argument('file', help='File to process') 37 | parser.add_argument('--keys', help='File to load parameter names from', required=True) 38 | parser.add_argument('--output', help='Base name of the output files') 39 | parser.add_argument('--precision', default=18, type=int, help='Decimal points of precision to use (default 18)') 40 | parser.add_argument('--raw', dest='make_zip', action='store_false', help='Create individual csv files instead of a connectome file') 41 | 42 | if __name__ == '__main__': 43 | args = parser.parse_args() 44 | main(**vars(args)) 45 | -------------------------------------------------------------------------------- /param_keys/ae_abs_keys.txt: -------------------------------------------------------------------------------- 1 | autoencoder_absolute 2 | enc_lstm1_input_w 3 | enc_lstm1_input_b 4 | enc_lstm1_forget_w 5 | enc_lstm1_forget_b 6 | enc_lstm1_activate_w 7 | enc_lstm1_activate_b 8 | enc_lstm1_out_w 9 | enc_lstm1_out_b 10 | enc_lstm2_input_w 11 | enc_lstm2_input_b 12 | enc_lstm2_forget_w 13 | enc_lstm2_forget_b 14 | enc_lstm2_activate_w 15 | enc_lstm2_activate_b 16 | enc_lstm2_out_w 17 | enc_lstm2_out_b 18 | 
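Since `param_cvt.py` writes a `.ctome` as a plain DEFLATE zip containing one `param_*.csv` per parameter plus a `config.txt`, the result can be inspected with the standard library alone. A sketch (the path is a placeholder):

```
import zipfile
import numpy as np

with zipfile.ZipFile("output_my_dataset/final_params.ctome") as zfile:
    print(zfile.read("config.txt").decode())
    for name in zfile.namelist():
        if name.startswith("param_"):
            with zfile.open(name) as f:
                print(name, np.loadtxt(f, delimiter=",").shape)
```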
enc_full_w 19 | enc_full_b 20 | enc_lstm1_initialstate 21 | enc_lstm2_initialstate 22 | dec_lstm1_input_w 23 | dec_lstm1_input_b 24 | dec_lstm1_forget_w 25 | dec_lstm1_forget_b 26 | dec_lstm1_activate_w 27 | dec_lstm1_activate_b 28 | dec_lstm1_out_w 29 | dec_lstm1_out_b 30 | dec_lstm2_input_w 31 | dec_lstm2_input_b 32 | dec_lstm2_forget_w 33 | dec_lstm2_forget_b 34 | dec_lstm2_activate_w 35 | dec_lstm2_activate_b 36 | dec_lstm2_out_w 37 | dec_lstm2_out_b 38 | dec_full_w 39 | dec_full_b 40 | dec_lstm1_initialstate 41 | dec_lstm2_initialstate 42 | -------------------------------------------------------------------------------- /param_keys/ae_poex_keys.txt: -------------------------------------------------------------------------------- 1 | autoencoder_product_interval_chords 2 | enc_0_lstm1_input_w 3 | enc_0_lstm1_input_b 4 | enc_0_lstm1_forget_w 5 | enc_0_lstm1_forget_b 6 | enc_0_lstm1_activate_w 7 | enc_0_lstm1_activate_b 8 | enc_0_lstm1_out_w 9 | enc_0_lstm1_out_b 10 | enc_0_lstm2_input_w 11 | enc_0_lstm2_input_b 12 | enc_0_lstm2_forget_w 13 | enc_0_lstm2_forget_b 14 | enc_0_lstm2_activate_w 15 | enc_0_lstm2_activate_b 16 | enc_0_lstm2_out_w 17 | enc_0_lstm2_out_b 18 | enc_0_full_w 19 | enc_0_full_b 20 | enc_0_lstm1_initialstate 21 | enc_0_lstm2_initialstate 22 | enc_1_lstm1_input_w 23 | enc_1_lstm1_input_b 24 | enc_1_lstm1_forget_w 25 | enc_1_lstm1_forget_b 26 | enc_1_lstm1_activate_w 27 | enc_1_lstm1_activate_b 28 | enc_1_lstm1_out_w 29 | enc_1_lstm1_out_b 30 | enc_1_lstm2_input_w 31 | enc_1_lstm2_input_b 32 | enc_1_lstm2_forget_w 33 | enc_1_lstm2_forget_b 34 | enc_1_lstm2_activate_w 35 | enc_1_lstm2_activate_b 36 | enc_1_lstm2_out_w 37 | enc_1_lstm2_out_b 38 | enc_1_full_w 39 | enc_1_full_b 40 | enc_1_lstm1_initialstate 41 | enc_1_lstm2_initialstate 42 | dec_0_lstm1_input_w 43 | dec_0_lstm1_input_b 44 | dec_0_lstm1_forget_w 45 | dec_0_lstm1_forget_b 46 | dec_0_lstm1_activate_w 47 | dec_0_lstm1_activate_b 48 | dec_0_lstm1_out_w 49 | dec_0_lstm1_out_b 50 | dec_0_lstm2_input_w 51 | dec_0_lstm2_input_b 52 | dec_0_lstm2_forget_w 53 | dec_0_lstm2_forget_b 54 | dec_0_lstm2_activate_w 55 | dec_0_lstm2_activate_b 56 | dec_0_lstm2_out_w 57 | dec_0_lstm2_out_b 58 | dec_0_full_w 59 | dec_0_full_b 60 | dec_0_lstm1_initialstate 61 | dec_0_lstm2_initialstate 62 | dec_1_lstm1_input_w 63 | dec_1_lstm1_input_b 64 | dec_1_lstm1_forget_w 65 | dec_1_lstm1_forget_b 66 | dec_1_lstm1_activate_w 67 | dec_1_lstm1_activate_b 68 | dec_1_lstm1_out_w 69 | dec_1_lstm1_out_b 70 | dec_1_lstm2_input_w 71 | dec_1_lstm2_input_b 72 | dec_1_lstm2_forget_w 73 | dec_1_lstm2_forget_b 74 | dec_1_lstm2_activate_w 75 | dec_1_lstm2_activate_b 76 | dec_1_lstm2_out_w 77 | dec_1_lstm2_out_b 78 | dec_1_full_w 79 | dec_1_full_b 80 | dec_1_lstm1_initialstate 81 | dec_1_lstm2_initialstate 82 | -------------------------------------------------------------------------------- /param_keys/corn_keys.txt: -------------------------------------------------------------------------------- 1 | name_generator 2 | lstm_input_w 3 | lstm_input_b 4 | lstm_forget_w 5 | lstm_forget_b 6 | lstm_activate_w 7 | lstm_activate_b 8 | lstm_out_w 9 | lstm_out_b 10 | lstm_initialstate 11 | full_w 12 | full_b -------------------------------------------------------------------------------- /param_keys/poex_keys.txt: -------------------------------------------------------------------------------- 1 | generative_product_interval_chords 2 | 0_lstm1_input_w 3 | 0_lstm1_input_b 4 | 0_lstm1_forget_w 5 | 0_lstm1_forget_b 6 | 0_lstm1_activate_w 7 | 0_lstm1_activate_b 8 | 
0_lstm1_out_w 9 | 0_lstm1_out_b 10 | 0_lstm2_input_w 11 | 0_lstm2_input_b 12 | 0_lstm2_forget_w 13 | 0_lstm2_forget_b 14 | 0_lstm2_activate_w 15 | 0_lstm2_activate_b 16 | 0_lstm2_out_w 17 | 0_lstm2_out_b 18 | 0_full_w 19 | 0_full_b 20 | 0_lstm1_initialstate 21 | 0_lstm2_initialstate 22 | 1_lstm1_input_w 23 | 1_lstm1_input_b 24 | 1_lstm1_forget_w 25 | 1_lstm1_forget_b 26 | 1_lstm1_activate_w 27 | 1_lstm1_activate_b 28 | 1_lstm1_out_w 29 | 1_lstm1_out_b 30 | 1_lstm2_input_w 31 | 1_lstm2_input_b 32 | 1_lstm2_forget_w 33 | 1_lstm2_forget_b 34 | 1_lstm2_activate_w 35 | 1_lstm2_activate_b 36 | 1_lstm2_out_w 37 | 1_lstm2_out_b 38 | 1_full_w 39 | 1_full_b 40 | 1_lstm1_initialstate 41 | 1_lstm2_initialstate -------------------------------------------------------------------------------- /param_keys/poex_sep_rhythm_keys.txt: -------------------------------------------------------------------------------- 1 | generative_product_interval_chords_rhythm 2 | 0_lstm1_input_w 3 | 0_lstm1_input_b 4 | 0_lstm1_forget_w 5 | 0_lstm1_forget_b 6 | 0_lstm1_activate_w 7 | 0_lstm1_activate_b 8 | 0_lstm1_out_w 9 | 0_lstm1_out_b 10 | 0_lstm2_input_w 11 | 0_lstm2_input_b 12 | 0_lstm2_forget_w 13 | 0_lstm2_forget_b 14 | 0_lstm2_activate_w 15 | 0_lstm2_activate_b 16 | 0_lstm2_out_w 17 | 0_lstm2_out_b 18 | 0_full_w 19 | 0_full_b 20 | 0_lstm1_initialstate 21 | 0_lstm2_initialstate 22 | 1_lstm1_input_w 23 | 1_lstm1_input_b 24 | 1_lstm1_forget_w 25 | 1_lstm1_forget_b 26 | 1_lstm1_activate_w 27 | 1_lstm1_activate_b 28 | 1_lstm1_out_w 29 | 1_lstm1_out_b 30 | 1_lstm2_input_w 31 | 1_lstm2_input_b 32 | 1_lstm2_forget_w 33 | 1_lstm2_forget_b 34 | 1_lstm2_activate_w 35 | 1_lstm2_activate_b 36 | 1_lstm2_out_w 37 | 1_lstm2_out_b 38 | 1_full_w 39 | 1_full_b 40 | 1_lstm1_initialstate 41 | 1_lstm2_initialstate 42 | 2_lstm1_input_w 43 | 2_lstm1_input_b 44 | 2_lstm1_forget_w 45 | 2_lstm1_forget_b 46 | 2_lstm1_activate_w 47 | 2_lstm1_activate_b 48 | 2_lstm1_out_w 49 | 2_lstm1_out_b 50 | 2_lstm2_input_w 51 | 2_lstm2_input_b 52 | 2_lstm2_forget_w 53 | 2_lstm2_forget_b 54 | 2_lstm2_activate_w 55 | 2_lstm2_activate_b 56 | 2_lstm2_out_w 57 | 2_lstm2_out_b 58 | 2_full_w 59 | 2_full_b 60 | 2_lstm1_initialstate 61 | 2_lstm2_initialstate -------------------------------------------------------------------------------- /plot_data.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import matplotlib.pyplot as plt 3 | import matplotlib 4 | import sys 5 | import argparse 6 | 7 | def plot_file(fn): 8 | with open(fn,'r') as f: 9 | legend = f.readline() 10 | 11 | colnames = legend.strip().split(', ') 12 | data = np.loadtxt(fn, skiprows=1, delimiter=',') 13 | 14 | markers = ".ov^<>12348sp*hH+xDd" 15 | colors = "bgrcmyk" 16 | 17 | skip=100 18 | timestep = data[::skip,0] 19 | handles = [] 20 | for i,colname in enumerate(colnames[1:]): 21 | val = data[::skip,1+i] 22 | handles.append(plt.scatter(timestep, val, marker=markers[i%len(markers)], color=colors[i%len(colors)])) 23 | 24 | plt.legend(handles, colnames[1:]) 25 | plt.show() 26 | 27 | parser = argparse.ArgumentParser(description='Plot a .csv file') 28 | parser.add_argument('fn', help='File to plot') 29 | 30 | if __name__ == '__main__': 31 | args = parser.parse_args() 32 | plot_file(**vars(args)) 33 | -------------------------------------------------------------------------------- /plot_internal_state.py: -------------------------------------------------------------------------------- 1 | import constants 2 | import matplotlib 3 | import 
matplotlib.pyplot as plt 4 | import itertools 5 | import sys 6 | import numpy as np 7 | import os 8 | import argparse 9 | plt.ion() 10 | 11 | import custom_cmap 12 | my_cmap = matplotlib.colors.ListedColormap(custom_cmap.test_cm.colors[::-1]) 13 | 14 | # probs = np.load('generation/dataset_10000_probs.npy') 15 | # probs_jump = np.load('generation/dataset_10000_info_0.npy') 16 | # probs_chord = np.load('generation/dataset_10000_info_1.npy') 17 | # chosen = np.load('generation/dataset_10000_chosen.npy') 18 | # chosen_map = np.eye(probs.shape[-1])[chosen] 19 | 20 | def plot_note_dist(mat, name="", show_octaves=True): 21 | f = plt.figure(figsize=(20,5)) 22 | f.canvas.set_window_title(name) 23 | plt.imshow(mat.T, origin="lower", interpolation="nearest", cmap=my_cmap) 24 | plt.xticks( np.arange(0,4*(constants.WHOLE//constants.RESOLUTION_SCALAR),(constants.QUARTER//constants.RESOLUTION_SCALAR)) ) 25 | plt.xlabel('Time (beat/12)') 26 | plt.ylabel('Note') 27 | plt.colorbar() 28 | if show_octaves: 29 | for y in range(0,36,12): 30 | plt.axhline(y + 1.5, color='c') 31 | for x in range(0,4*(constants.WHOLE//constants.RESOLUTION_SCALAR),(constants.QUARTER//constants.RESOLUTION_SCALAR)): 32 | plt.axvline(x-0.5, color='k') 33 | for x in range(0,4*(constants.WHOLE//constants.RESOLUTION_SCALAR),(constants.WHOLE//constants.RESOLUTION_SCALAR)): 34 | plt.axvline(x-0.5, color='c') 35 | plt.show() 36 | 37 | def plot_scalar(mat, name=""): 38 | f = plt.figure(figsize=(20,5)) 39 | f.canvas.set_window_title(name) 40 | plt.bar(range(mat.shape[0]),mat,1) 41 | plt.xticks( np.arange(0,4*(constants.WHOLE//constants.RESOLUTION_SCALAR),(constants.QUARTER//constants.RESOLUTION_SCALAR)) ) 42 | plt.xlabel('Time (beat/12)') 43 | plt.ylabel('Strength') 44 | for x in range(0,4*(constants.WHOLE//constants.RESOLUTION_SCALAR),(constants.QUARTER//constants.RESOLUTION_SCALAR)): 45 | plt.axvline(x, color='k') 46 | for x in range(0,4*(constants.WHOLE//constants.RESOLUTION_SCALAR),(constants.WHOLE//constants.RESOLUTION_SCALAR)): 47 | plt.axvline(x, color='c') 48 | plt.show() 49 | 50 | 51 | def plot_all(folder, idx=0): 52 | probs = np.load(os.path.join(folder,'generated_probs.npy')) 53 | chosen_raw = np.load(os.path.join(folder,'generated_chosen.npy')) 54 | chosen = np.eye(probs.shape[-1])[chosen_raw] 55 | plot_note_dist(probs[idx], 'Probabilities') 56 | plot_note_dist(chosen[idx], 'Chosen') 57 | try: 58 | for i in itertools.count(): 59 | probs_info = np.load(os.path.join(folder,'generated_info_{}.npy'.format(i))) 60 | if len(probs_info.shape) == 3: 61 | show_octaves = probs_info.shape[2] < 40 62 | plot_note_dist(probs_info[idx], 'Info {}'.format(i), show_octaves) 63 | else: 64 | plot_scalar(probs_info[idx], 'Info {}'.format(i)) 65 | except FileNotFoundError: 66 | pass 67 | 68 | parser = argparse.ArgumentParser(description='Plot the internal state of a network') 69 | parser.add_argument('folder', help='Directory with the generated files') 70 | parser.add_argument('idx', type=int, help='Zero-based index of the output to visualize') 71 | 72 | if __name__ == '__main__': 73 | args = parser.parse_args() 74 | plot_all(**vars(args)) 75 | input("Press enter to close.") 76 | -------------------------------------------------------------------------------- /queue_managers/__init__.py: -------------------------------------------------------------------------------- 1 | from .queue_base import QueueManager 2 | from .standard_manager import StandardQueueManager 3 | from .variational_manager import VariationalQueueManager 4 | from 
.sampling_variational_manager import SamplingVariationalQueueManager 5 | from .queueless_variational_manager import QueuelessVariationalQueueManager 6 | from .queueless_standard_manager import QueuelessStandardQueueManager 7 | from .nearness_standard_manager import NearnessStandardQueueManager 8 | from .noise_wrapper import NoiseWrapper 9 | -------------------------------------------------------------------------------- /queue_managers/nearness_standard_manager.py: -------------------------------------------------------------------------------- 1 | from .queue_base import QueueManager 2 | import theano 3 | import theano.tensor as T 4 | import numpy as np 5 | from .standard_manager import StandardQueueManager 6 | 7 | class NearnessStandardQueueManager( StandardQueueManager ): 8 | """ 9 | A standard queue manager, using a configurable set of functions, with an exponential different loss 10 | """ 11 | 12 | def __init__(self, feature_size, penalty_shock, penalty_base, falloff_rate, vector_activation_fun=T.nnet.sigmoid, loss_fun=(lambda x:x)): 13 | super().__init__(feature_size, vector_activation_fun, loss_fun) 14 | self._penalty_shock = penalty_shock 15 | self._penalty_base = penalty_base 16 | self._falloff_rate = falloff_rate 17 | 18 | def get_loss(self, raw_feature_strengths, raw_feature_vects, extra_info=False): 19 | raw_losses = self._loss_fun(raw_feature_strengths) 20 | raw_sum = T.sum(raw_losses) 21 | 22 | n_parallel, n_timestep = raw_feature_strengths.shape 23 | 24 | falloff_arr = np.array(self._falloff_rate, np.float32) ** T.cast(T.arange(n_timestep), 'float32') 25 | falloff_mat = T.shape_padright(falloff_arr) / T.shape_padleft(falloff_arr) 26 | falloff_scaling = T.switch(T.ge(falloff_mat,1), 0, falloff_mat)/self._falloff_rate 27 | # falloff_scaling is of shape (n_timestep, n_timestep) with 0 along diagonal, and jump to 1 falling off along dimension 1 28 | # now we want to multiply through on both dimensions 29 | first_multiply = T.dot(raw_feature_strengths, falloff_scaling) # shape (n_parallel, n_timestep) 30 | second_multiply = raw_feature_strengths * first_multiply 31 | unscaled_falloff_penalty = T.sum(second_multiply) 32 | 33 | full_loss = self._penalty_base * raw_sum + self._penalty_shock * unscaled_falloff_penalty 34 | 35 | if extra_info: 36 | return full_loss, {"raw_loss_sum":raw_sum} 37 | else: 38 | return full_loss 39 | -------------------------------------------------------------------------------- /queue_managers/noise_wrapper.py: -------------------------------------------------------------------------------- 1 | import theano 2 | import theano.tensor as T 3 | import numpy as np 4 | from theano.sandbox.rng_mrg import MRG_RandomStreams 5 | from .queue_base import QueueManager 6 | from util import sliceMaker 7 | 8 | class NoiseWrapper( QueueManager ): 9 | """ 10 | Queue manager that wraps another queue manager and adds noise to it 11 | """ 12 | def __init__(self, inner_manager, pre_std=None, post_std=None, pre_mask=sliceMaker[:]): 13 | """ 14 | Initialize the manager. 15 | 16 | Parameters: 17 | pre_std: Standard deviation of noise to apply to input activations 18 | post_std: Standard deviation of noise to apply to output vector 19 | pre_mask: A slice that determines which part of input activations 20 | to apply noise to (i.e. 
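The penalty in `NearnessStandardQueueManager.get_loss` above charges every ordered pair of pushes, decaying geometrically with the time between them; dividing by `falloff_rate` makes adjacent pushes cost their full product. A NumPy transcription of the same computation (toy strengths):

```
import numpy as np

falloff_rate = 0.5
s = np.array([[1.0, 1.0, 0.0, 1.0]])  # (n_parallel, n_timestep): pushes at 0, 1, 3

falloff_arr = falloff_rate ** np.arange(4)
falloff_mat = falloff_arr[:, None] / falloff_arr[None, :]
falloff_scaling = np.where(falloff_mat >= 1, 0, falloff_mat) / falloff_rate
print(np.sum(s * (s @ falloff_scaling)))  # pairs (0,1),(1,3),(0,3): 1 + 0.5 + 0.25
```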
activations[:,:,pre_mask]) 21 | """ 22 | self._srng = MRG_RandomStreams(np.random.randint(0, 1024)) 23 | self._inner_manager = inner_manager 24 | self._pre_std = pre_std 25 | self._pre_mask = pre_mask 26 | self._post_std = post_std 27 | 28 | @property 29 | def activation_width(self): 30 | return self._inner_manager.activation_width 31 | 32 | @property 33 | def feature_size(self): 34 | return self._inner_manager.feature_size 35 | 36 | def get_strengths_and_vects(self, input_activations): 37 | if self._pre_std is not None: 38 | input_activations = T.inc_subtensor(input_activations[:,:,self._pre_mask], self._srng.normal(input_activations.shape, self._pre_std)) 39 | strengths, vects = self._inner_manager.get_strengths_and_vects(input_activations) 40 | if self._post_std is not None: 41 | vects = vects + self._srng.normal(vects.shape, self._post_std) 42 | return strengths, vects 43 | 44 | def get_loss(self, input_activations, raw_feature_strengths, raw_feature_vects, extra_info=False): 45 | return self._inner_manager.get_loss(input_activations, raw_feature_strengths, raw_feature_vects, extra_info) 46 | 47 | def process(self, input_activations, extra_info=False): 48 | if self._pre_std is not None: 49 | input_activations = T.inc_subtensor(input_activations[:,:,self._pre_mask], self._srng.normal(input_activations.shape, self._pre_std)) 50 | stuff = self._inner_manager.process(input_activations, extra_info) 51 | vects = stuff[2] 52 | if self._post_std is not None: 53 | vects = vects + self._srng.normal(vects.shape, self._post_std) 54 | return stuff[:2] + (vects,) + stuff[3:] 55 | -------------------------------------------------------------------------------- /queue_managers/queue_base.py: -------------------------------------------------------------------------------- 1 | import theano 2 | import theano.tensor as T 3 | import numpy as np 4 | 5 | class QueueManager( object ): 6 | """ 7 | Manages the queue transformation 8 | """ 9 | 10 | @property 11 | def activation_width(self): 12 | """ 13 | The activation width of the queue manager, determining the dimensions of the 14 | input_activations 15 | """ 16 | raise NotImplementedError("activation_width not implemented") 17 | 18 | @property 19 | def feature_size(self): 20 | """ 21 | The feature width of the queue manager, determining the dimensions of the transformed output 22 | """ 23 | raise NotImplementedError("feature_size not implemented") 24 | 25 | def get_strengths_and_vects(self, input_activations): 26 | """ 27 | Prepare a set of input activations, returning the feature strengths and vectors. 
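A minimal sketch of wrapping a manager in `NoiseWrapper` (the choice of inner manager here is illustrative): Gaussian noise with std `pre_std` is added to the input activations, and noise with std `post_std` to the output feature vectors.

```
from queue_managers import NoiseWrapper, QueuelessStandardQueueManager

# Illustrative wiring; any QueueManager subclass can be wrapped.
inner = QueuelessStandardQueueManager(feature_size=24, period=24)
noisy = NoiseWrapper(inner, pre_std=0.1, post_std=0.05)
```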
28 | 29 | Parameters: 30 | input_activations: a theano tensor (float32) of shape (batch, timestep, activation_width) 31 | 32 | Returns: 33 | raw_feature_strengths: A theano tensor (float32) of shape (batch, timestep) giving the 34 | raw push strength for each timestep 35 | raw_feature_vects: A theano tensor (float32) of shape (batch, timestep, feature_size) 36 | giving the raw vector of the input at each timestep 37 | """ 38 | raise NotImplementedError("get_strengths_and_vects not implemented") 39 | 40 | def get_loss(self, input_activations, raw_feature_strengths, raw_feature_vects, extra_info=False): 41 | """ 42 | Calculate the loss for the given vects and strengths 43 | 44 | Parameters: 45 | raw_feature_strengths: A theano tensor (float32) of shape (batch, timestep) giving the 46 | raw push strength for each timestep 47 | raw_feature_vects: A theano tensor (float32) of shape (batch, timestep, feature_size) 48 | giving the raw vector of the input at each timestep 49 | extra_info: If True, return extra info 50 | 51 | Returns: 52 | sparsity_loss: a theano scalar (float32) giving the sparsity loss 53 | Also, if extra_info is true, an additional info dict. 54 | """ 55 | raise NotImplementedError("get_loss not implemented") 56 | 57 | def process(self, input_activations, extra_info=False): 58 | """ 59 | Process a set of input activations, returning the transformed output and sparsity loss. 60 | 61 | Parameters: 62 | input_activations: a theano tensor (float32) of shape (batch, timestep, activation_width) 63 | extra_info: If True, return extra info 64 | 65 | Returns: 66 | sparsity_loss: a theano scalar (float32) giving the sparsity loss 67 | raw_feature_strengths: A theano tensor (float32) of shape (batch, timestep) giving the 68 | raw push strength for each timestep 69 | raw_feature_vects: A theano tensor (float32) of shape (batch, timestep, feature_size) 70 | giving the raw vector of the input at each timestep 71 | Also, if extra_info is true, an additional info dict. 72 | """ 73 | raw_feature_strengths, raw_feature_vects = self.get_strengths_and_vects(input_activations) 74 | sparsity_loss = self.get_loss(raw_feature_strengths, raw_feature_vects, extra_info=extra_info) 75 | 76 | if extra_info: 77 | sparsity_loss, info = sparsity_loss 78 | return sparsity_loss, raw_feature_strengths, raw_feature_vects, info 79 | else: 80 | return sparsity_loss, raw_feature_strengths, raw_feature_vects 81 | 82 | def surrogate_loss(self, reconstruction_cost, extra_info): 83 | """ 84 | Get a "surrogate loss" to estimate gradients of any stochastic choices that factor into the reconstruction 85 | cost 86 | 87 | Parameters: 88 | reconstruction_cost: The reconstruction cost for this batch 89 | extra_info: A dictionary, of the form returned by process 90 | 91 | Returns: 92 | Either: 93 | - None, which means no loss will be added 94 | - (surrogate_loss, updates): where surrogate_loss is a value which will be added to the final cost 95 | and differentiated to get the gradient estimator, and updates is a list of updates in 96 | [(variable, value), ...] form. 97 | """ 98 | return None 99 | 100 | @staticmethod 101 | def queue_transform(feature_strengths, feature_vects, return_strengths=False): 102 | """ 103 | Process features according to a "fragmented queue", where each timestep 104 | gets a size-1 window onto a feature queue. 
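As a minimal sketch of the `QueueManager` interface above (illustrative, not one of the repo's managers): a trivial subclass only needs the two width properties, `get_strengths_and_vects`, and `get_loss`; the base class's `process` composes them. Note that `process` calls `get_loss` with only the strengths and vects, so the subclass signature follows that call.

```
import numpy as np
import theano.tensor as T
from queue_managers import QueueManager

class ConstantPushManager(QueueManager):
    """Toy manager: constant push strength, identity feature vectors."""
    def __init__(self, feature_size):
        self._feature_size = feature_size

    @property
    def activation_width(self):
        return self._feature_size

    @property
    def feature_size(self):
        return self._feature_size

    def get_strengths_and_vects(self, input_activations):
        n_batch, n_time, _ = input_activations.shape
        return 0.5 * T.ones((n_batch, n_time)), input_activations

    def get_loss(self, raw_feature_strengths, raw_feature_vects, extra_info=False):
        loss = T.as_tensor_variable(np.float32(0.0))
        return (loss, {}) if extra_info else loss
```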
Effectively, 105 | feature_strengths gives how much to push onto queue 106 | feature_vects gives what to push on 107 | pop weights are tied to feature_strengths 108 | output is a size-1 peek (without popping) 109 | 110 | Parameters: 111 | - feature_strengths: float32 tensor of shape (batch, push_timestep) in [0,1] 112 | - feature_vects: float32 tensor of shape (batch, push_timestep, feature_dim) 113 | 114 | Returns: 115 | - peek_vects: float32 tensor of shape (batch, timestep, feature_dim) 116 | """ 117 | n_batch, n_time, n_feature = feature_vects.shape 118 | 119 | cum_sum_str = T.extra_ops.cumsum(feature_strengths, 1) 120 | 121 | # We will be working in (batch, timestep, push_timestep) 122 | # For each timestep, if we subtract out the sum of pushes before that timestep 123 | # and then cap to 0-1 we get the cumsums for just the features active in that 124 | # timestep 125 | timestep_adjustments = T.shape_padright(cum_sum_str - feature_strengths) 126 | push_time_cumsum = T.shape_padaxis(cum_sum_str, 1) 127 | relative_cumsum = push_time_cumsum - timestep_adjustments 128 | capped_cumsum = T.minimum(T.maximum(relative_cumsum, 0), 1) 129 | 130 | # Now we can recover the peek strengths by taking a diff 131 | shifted = T.concatenate([T.zeros((n_batch, n_time, 1)), capped_cumsum[:,:,:-1]],2) 132 | peek_strengths = capped_cumsum-shifted 133 | # Peek strengths is now (batch, timestep, push_timestep) 134 | 135 | result = T.batched_dot(peek_strengths, feature_vects) 136 | 137 | if return_strengths: 138 | return peek_strengths, result 139 | else: 140 | return result 141 | 142 | 143 | def test_frag_queue(): 144 | feature_strengths = T.fmatrix() 145 | feature_vects = T.ftensor3() 146 | peek_strengths, res = QueueManager.queue_transform(feature_strengths, feature_vects, True) 147 | grad_s, grad_v = theano.gradient.grad(T.sum(res[:,:,1]), [feature_strengths,feature_vects]) 148 | 149 | fun = theano.function([feature_strengths, feature_vects], [peek_strengths, res, grad_s, grad_v], allow_input_downcast=True) 150 | 151 | mystrengths = np.array([[0.3,0.3,0.2,0.6,0.3,0.7,0.2,1], [0.3,0.3,0.2,0.6,0.3,0.7,0.2,1]], np.float32) 152 | myvects = np.tile(np.eye(8, dtype=np.float32), (2,1,1)) 153 | mypeek, myres, mygs, mygv = fun(mystrengths, myvects) 154 | 155 | print(mypeek) 156 | print(myres) 157 | print(mygs) 158 | print(mygv) 159 | return mypeek, myres, mygs, mygv -------------------------------------------------------------------------------- /queue_managers/queueless_standard_manager.py: -------------------------------------------------------------------------------- 1 | from .queue_base import QueueManager 2 | import theano 3 | import theano.tensor as T 4 | from theano.sandbox.rng_mrg import MRG_RandomStreams 5 | import numpy as np 6 | 7 | class QueuelessStandardQueueManager( QueueManager ): 8 | """ 9 | A standard (non-variational) manager which does not use the queue, with a configurable vector activation 10 | """ 11 | 12 | def __init__(self, feature_size, period=None, vector_activation_fun=T.nnet.sigmoid): 13 | """ 14 | Initialize the manager. 15 | 16 | Parameters: 17 | feature_size: The width of a feature 18 | period: Period for queue activations 19 | vector_activation_fun: The activation function to apply to the vectors. 
--------------------------------------------------------------------------------
/queue_managers/queueless_variational_manager.py:
--------------------------------------------------------------------------------
1 | from .queue_base import QueueManager
2 | import theano
3 | import theano.tensor as T
4 | from theano.sandbox.rng_mrg import MRG_RandomStreams
5 | import numpy as np
6 | import constants
7 | 
8 | class QueuelessVariationalQueueManager( QueueManager ):
9 |     """
10 |     A variational-autoencoder-based manager which does not use the queue, using a KL-divergence (variational) loss
11 |     """
12 | 
13 |     def __init__(self, feature_size, period=None, variational_loss_scale=1):
14 |         """
15 |         Initialize the manager.
16 | 17 | Parameters: 18 | feature_size: The width of a feature 19 | period: Period for queue activations 20 | variational_loss_scale: Factor by which to scale variational loss 21 | """ 22 | self._feature_size = feature_size 23 | self._period = period 24 | self._srng = MRG_RandomStreams(np.random.randint(0, 1024)) 25 | self._variational_loss_scale = np.array(variational_loss_scale, np.float32) 26 | 27 | @property 28 | def activation_width(self): 29 | return self.feature_size*2 30 | 31 | @property 32 | def feature_size(self): 33 | return self._feature_size 34 | 35 | def helper_sample(self, input_activations): 36 | n_batch, n_time, _ = input_activations.shape 37 | means = input_activations[:,:,:self.feature_size] 38 | stdevs = abs(input_activations[:,:,self.feature_size:]) + constants.EPSILON 39 | wiggle = self._srng.normal(means.shape) 40 | 41 | vects = means + (stdevs * wiggle) 42 | 43 | strengths = T.zeros((n_batch, n_time)) 44 | if self._period is None: 45 | strengths = T.set_subtensor(strengths[:,-1],1) 46 | else: 47 | strengths = T.set_subtensor(strengths[:,self._period-1::self._period],1) 48 | 49 | return strengths, vects, means, stdevs, {} 50 | 51 | def get_strengths_and_vects(self, input_activations): 52 | strengths, vects, means, stdevs, _ = self.helper_sample(input_activations) 53 | return strengths, vects 54 | 55 | def process(self, input_activations, extra_info=False): 56 | 57 | strengths, vects, means, stdevs, sample_info = self.helper_sample(input_activations) 58 | 59 | means_sq = means**2 60 | variance = stdevs**2 61 | loss_parts = 1 + T.log(variance) - means_sq - variance 62 | if self._period is None: 63 | loss_parts = loss_parts[:,-1] 64 | else: 65 | loss_parts = loss_parts[:,self._period-1::self._period] 66 | variational_loss = -0.5 * T.sum(loss_parts) * self._variational_loss_scale 67 | 68 | info = {"variational_loss":variational_loss} 69 | info.update(sample_info) 70 | if extra_info: 71 | return variational_loss, strengths, vects, info 72 | else: 73 | return variational_loss, strengths, vects 74 | -------------------------------------------------------------------------------- /queue_managers/sampling_variational_manager.py: -------------------------------------------------------------------------------- 1 | from .queue_base import QueueManager 2 | from .variational_manager import VariationalQueueManager 3 | import theano 4 | import theano.tensor as T 5 | from theano.sandbox.rng_mrg import MRG_RandomStreams 6 | import numpy as np 7 | import constants 8 | 9 | class SamplingVariationalQueueManager( VariationalQueueManager ): 10 | """ 11 | A variational-autoencoder-based queue manager, using a configurable loss, with sampled pushes and pops 12 | """ 13 | 14 | def __init__(self, feature_size, loss_fun=(lambda x:x), variational_loss_scale=1, baseline_scale=0.9): 15 | """ 16 | Initialize the manager. 17 | 18 | Parameters: 19 | feature_size: The width of a feature 20 | loss_fun: A function which computes the loss for each timestep. Should be an elementwise 21 | operation. 
            variational_loss_scale: Factor by which to scale the variational (KL) loss; passed through to VariationalQueueManager
22 |             baseline_scale: Decay rate of the exponential moving average of past reconstruction costs that serves as the baseline b in surrogate_loss
23 |         """
24 |         super().__init__(feature_size, loss_fun, variational_loss_scale)
25 |         self._baseline_scale = baseline_scale
26 | 
27 |     def helper_sample(self, input_activations):
28 |         strength_probs, vects, means, stdevs, old_info = super().helper_sample(input_activations)
29 |         samples = self._srng.uniform(strength_probs.shape)
30 |         strengths = T.cast(strength_probs>samples, 'float32')
31 |         return strengths, vects, means, stdevs, {"sample_strength_probs":strength_probs, "sample_strength_choices":strengths}
32 | 
33 |     def surrogate_loss(self, reconstruction_cost, extra_info):
34 |         """
35 |         Based on "Gradient Estimation Using Stochastic Computation Graphs", we can compute the gradient estimate
36 |         as
37 | 
38 |             grad(E[cost]) ~= E[ sum(grad(log p(w|...)) * (Q - b)) + grad(cost(w))]
39 | 
40 |         where
41 |         - w is the current thing we sampled (so p(w|...) is the probability we would do what we sampled doing)
42 |         - Q is the cost "downstream" of w
43 |         - b is an arbitrary baseline, which must not be downstream of w
44 | 
45 |         In this case, each w is a particular choice we made in sampling the strengths, and Q is just the
46 |         reconstruction cost (since the final output can depend on strengths both in the past and the future).
47 |         We let b be an exponential average of previous values of Q.
48 | 
49 |         We can construct our surrogate loss function as
50 | 
51 |             L = sum(log p(w|...)*(Q - b)) + actual costs
52 |               = (Q - b)*sum(log p(w|...)) + actual costs
53 | 
54 |         as long as we consider Q and b constant wrt any derivative operation. This function thus returns
55 | 
56 |             S = (Q - b)*sum(log p(w|...))
57 |         """
58 |         s_probs = extra_info["sample_strength_probs"]
59 |         s_choices = extra_info["sample_strength_choices"]
60 |         prob_do_sampled = s_probs * s_choices + (1-s_probs)*(1-s_choices)
61 |         logprobsum = T.sum(T.log(prob_do_sampled))
62 | 
63 |         accum_prev_Q = theano.shared(np.array(0, np.float32))
64 |         accum_divisor = theano.shared(np.array(constants.EPSILON, np.float32))
65 |         baseline = accum_prev_Q / accum_divisor
66 | 
67 |         Q = theano.gradient.disconnected_grad(reconstruction_cost)
68 | 
69 |         surrogate_loss_component = logprobsum * (Q - baseline)
70 | 
71 |         new_prev_Q = (self._baseline_scale)*accum_prev_Q + (1-self._baseline_scale)*Q
72 |         new_divisor = (self._baseline_scale)*accum_divisor + (1-self._baseline_scale)
73 | 
74 |         updates = [(accum_prev_Q, new_prev_Q), (accum_divisor, new_divisor)]
75 | 
76 |         return surrogate_loss_component, updates
77 | 
78 | 
79 | 
80 | 
81 | 
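# A plain-Python sketch of the baseline bookkeeping above, added here for
# documentation only (the helper name is ours and nothing imports it): the
# baseline b is a bias-corrected exponential moving average of past
# reconstruction costs Q, maintained by the shared variables accum_prev_Q and
# accum_divisor across training updates.
def _demo_baseline_update(costs, baseline_scale=0.9, epsilon=1e-8):
    # epsilon stands in for constants.EPSILON, which seeds the divisor.
    accum_q, divisor = 0.0, epsilon
    scales = []
    for q in costs:
        baseline = accum_q / divisor  # b in the derivation above
        scales.append(q - baseline)   # (Q - b), which scales sum(log p(w|...))
        accum_q = baseline_scale * accum_q + (1 - baseline_scale) * q
        divisor = baseline_scale * divisor + (1 - baseline_scale)
    return scales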
21 | """ 22 | self._feature_size = feature_size 23 | self._vector_activation_fun = vector_activation_fun 24 | self._loss_fun = loss_fun 25 | 26 | @property 27 | def activation_width(self): 28 | return 1 + self.feature_size 29 | 30 | @property 31 | def feature_size(self): 32 | return self._feature_size 33 | 34 | def get_strengths_and_vects(self, input_activations): 35 | pre_strengths = input_activations[:,:,0] 36 | pre_vects = input_activations[:,:,1:] 37 | 38 | strengths = T.nnet.sigmoid(pre_strengths) 39 | strengths = T.set_subtensor(strengths[:,-1],1) 40 | 41 | flat_pre_vects = T.reshape(pre_vects,(-1,self.feature_size)) 42 | flat_vects = self._vector_activation_fun( flat_pre_vects ) 43 | vects = T.reshape(flat_vects, pre_vects.shape) 44 | 45 | return strengths, vects 46 | 47 | def get_loss(self, raw_feature_strengths, raw_feature_vects, extra_info=False): 48 | losses = self._loss_fun(raw_feature_strengths) 49 | full_loss = T.sum(losses) 50 | if extra_info: 51 | return full_loss, {} 52 | else: 53 | return full_loss -------------------------------------------------------------------------------- /queue_managers/variational_manager.py: -------------------------------------------------------------------------------- 1 | from .queue_base import QueueManager 2 | import theano 3 | import theano.tensor as T 4 | from theano.sandbox.rng_mrg import MRG_RandomStreams 5 | import numpy as np 6 | import constants 7 | 8 | class VariationalQueueManager( QueueManager ): 9 | """ 10 | A variational-autoencoder-based queue manager, using a configurable loss 11 | """ 12 | 13 | def __init__(self, feature_size, loss_fun=(lambda x:x), variational_loss_scale=1): 14 | """ 15 | Initialize the manager. 16 | 17 | Parameters: 18 | feature_size: The width of a feature 19 | loss_fun: A function which computes the loss for each timestep. Should be an elementwise 20 | operation. 21 | variational_loss_scale: Factor by which to scale variational loss 22 | """ 23 | self._feature_size = feature_size 24 | self._srng = MRG_RandomStreams(np.random.randint(0, 1024)) 25 | self._loss_fun = loss_fun 26 | self._variational_loss_scale = np.array(variational_loss_scale, np.float32) 27 | 28 | @property 29 | def activation_width(self): 30 | return 1 + self.feature_size*2 31 | 32 | @property 33 | def feature_size(self): 34 | return self._feature_size 35 | 36 | def helper_sample(self, input_activations): 37 | """Helper method to sample from the input_activations. 
36 |     def helper_sample(self, input_activations):
37 |         """Helper method to sample from the input_activations. Also returns an (empty) info dict for child class use"""
38 |         pre_strengths = input_activations[:,:,0]
39 |         strengths = T.nnet.sigmoid(pre_strengths)
40 |         strengths = T.set_subtensor(strengths[:,-1],1)
41 | 
42 |         means = input_activations[:,:,1:1+self.feature_size]
43 |         stdevs = abs(input_activations[:,:,1+self.feature_size:]) + constants.EPSILON
44 |         wiggle = self._srng.normal(means.shape)
45 | 
46 |         vects = means + (stdevs * wiggle)
47 | 
48 |         return strengths, vects, means, stdevs, {}
49 | 
50 |     def get_strengths_and_vects(self, input_activations):
51 |         strengths, vects, means, stdevs, _ = self.helper_sample(input_activations)
52 |         return strengths, vects
53 | 
54 |     def process(self, input_activations, extra_info=False):
55 | 
56 |         strengths, vects, means, stdevs, sample_info = self.helper_sample(input_activations)
57 | 
58 |         sparsity_losses = self._loss_fun(strengths)
59 |         full_sparsity_loss = T.sum(sparsity_losses)
60 | 
61 |         means_sq = means**2
62 |         variance = stdevs**2
63 |         variational_loss = -0.5 * T.sum(1 + T.log(variance) - means_sq - variance) * self._variational_loss_scale
64 | 
65 |         full_loss = full_sparsity_loss + variational_loss
66 | 
67 |         info = {"sparsity_loss": full_sparsity_loss, "variational_loss":variational_loss}
68 |         info.update(sample_info)
69 |         if extra_info:
70 |             return full_loss, strengths, vects, info
71 |         else:
72 |             return full_loss, strengths, vects
73 | 
--------------------------------------------------------------------------------
/relshift_lstm.py:
--------------------------------------------------------------------------------
1 | import theano
2 | import theano.tensor as T
3 | import numpy as np
4 | 
5 | 
6 | from theano_lstm import LSTM, StackedCells, Layer
7 | from util import *
8 | 
9 | from collections import namedtuple
10 | 
11 | SampleScanSpec = namedtuple('SampleScanSpec', ['sequences', 'non_sequences', 'outputs_info', 'num_taps', 'kwargs_keys', 'deterministic_dropout', 'start_pos'])
12 | 
13 | class RelativeShiftLSTMStack( object ):
14 |     """
15 |     Manages a stack of LSTM cells with potentially a relative shift applied
16 |     """
17 | 
18 |     def __init__(self, input_parts, layer_sizes, output_size, window_size=0, dropout=0, mode="drop", unroll_batch_num=None):
19 |         """
20 |         Parameters:
21 |             input_parts: A list of InputParts
22 |             layer_sizes: A list of the form [ (indep, per_note), ... ] where
23 |                 indep is the number of non-shifted cells to have, and
24 |                 per_note is the number of cells to have per window note, which shift as the
25 |                 network moves
26 |                 Alternately can just be [ indep, ... ]
27 |             output_size: An integer, the width of the desired output
28 |             dropout: How much dropout to apply.
29 |             mode: Either "drop" or "roll". If drop, discard memory that goes out of range. If roll, roll it instead
            window_size: The number of notes spanned by the relative window; 0 (the default) allocates no per-note cells
            unroll_batch_num: If not None, unroll the per-batch shift computation in perform_step into this many explicit steps instead of using theano.map
30 |         """
31 | 
32 |         self.input_parts = input_parts
33 |         self.window_size = window_size
34 | 
35 |         layer_sizes = [x if isinstance(x,tuple) else (x,0) for x in layer_sizes]
36 |         self.layer_sizes = layer_sizes
37 |         self.tot_layer_sizes = [(indep + per_note*self.window_size) for indep, per_note in layer_sizes]
38 | 
39 |         self.output_size = output_size
40 |         self.dropout = dropout
41 | 
42 |         self.input_size = sum(part.PART_WIDTH for part in input_parts)
43 | 
44 |         self.cells = StackedCells( self.input_size, celltype=LSTM, activation=T.tanh, layers = self.tot_layer_sizes )
45 |         self.cells.layers.append(Layer(self.tot_layer_sizes[-1], self.output_size, activation = lambda x:x))
46 | 
47 |         assert mode in ("drop", "roll"), "Must specify either drop or roll mode"
48 |         self.mode = mode
49 | 
50 |         self.unroll_batch_num = unroll_batch_num
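    # For example (illustrative sizes, not from the original code):
    # layer_sizes=[(200, 10), 300] with window_size=12 gives tot_layer_sizes of
    # [200 + 10*12, 300] = [320, 300]. The first layer keeps 200 independent
    # cells plus a 120-cell per-note block that perform_step shifts whenever
    # the relative position moves; the second layer is 300 ordinary cells.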
51 | 
52 |     @property
53 |     def params(self):
54 |         return self.cells.params + list(l.initial_hidden_state for l in self.cells.layers if has_hidden(l))
55 | 
56 |     @params.setter
57 |     def params(self, paramlist):
58 |         self.cells.params = paramlist[:len(self.cells.params)]
59 |         for l, val in zip((l for l in self.cells.layers if has_hidden(l)), paramlist[len(self.cells.params):]):
60 |             l.initial_hidden_state.set_value(val.get_value())
61 | 
62 |     def perform_step(self, in_data, shifts, hiddens, dropout_masks=[]):
63 |         """
64 |         Perform a step through the LSTM network.
65 | 
66 |         in_data: A theano tensor (float32) of shape (batch, input_size)
67 |         shifts: A theano tensor (int32) of shape (batch), giving the relative
68 |             shifts to apply to the last hiddens
69 |         hiddens: A list of hiddens [layer](batch, hidden_idx)
70 |         dropout_masks: If [], apply dropout deterministically. Otherwise, should
71 |             be a set of masks returned by get_dropout_masks, generally passed through
72 |             a scan as a non-sequence.
73 |         """
74 | 
75 |         # hiddens is of shape [layer](batch, hidden_idx)
76 |         # We want to permute the hidden_idx values according to shifts,
77 |         # which are ints of shape (batch)
78 | 
79 |         n_batch = in_data.shape[0]
80 |         new_hiddens = []
81 |         for layer_i, (indep, per_note) in enumerate(self.layer_sizes):
82 |             if per_note == 0:
83 |                 # Don't bother with this layer
84 |                 new_hiddens.append(hiddens[layer_i])
85 |                 continue
86 |             # The theano_lstm code puts [memory_cells... , old_activations...]
87 |             # We want to slide the memory cells only.
88 |             lstm_hsplit = self.cells.layers[layer_i].hidden_size
89 |             indep_mem = hiddens[layer_i][:,:indep]
90 |             per_note_mem = hiddens[layer_i][:,indep:lstm_hsplit]
91 |             remaining_values = hiddens[layer_i][:,lstm_hsplit:]
92 |             # per_note_mem is (batch, per_note_mem)
93 |             separated_mem = per_note_mem.reshape((n_batch, self.window_size, per_note))
94 |             # separated_mem is (batch, note, mem)
95 |             # [a b c ... x y z] shifted up 1 (+1) goes to [b c ... x y z 0]
96 |             # [a b c ... x y z] shifted down 1 (-1) goes to [0 a b c ... x y]
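            # Concrete example: with a 4-note window holding rows [a b c d],
            # a shift of +1 in "drop" mode yields [b c d 0] and a shift of -1
            # yields [0 a b c]; in "roll" mode the rows that fall off one end
            # wrap around to the other instead of being zeroed.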
97 |             def _shift_step(c_mem, c_shift):
98 |                 # c_mem is (note, mem)
99 |                 # c_shift is an int
100 |                 if self.mode=="drop":
101 |                     def _clamp_w(x):
102 |                         return T.maximum(0,T.minimum(x,self.window_size))
103 |                     ins_at_front = T.zeros((_clamp_w(-c_shift),per_note))
104 |                     ins_at_back = T.zeros((_clamp_w(c_shift),per_note))
105 |                     take_part = c_mem[_clamp_w(c_shift):self.window_size-_clamp_w(-c_shift),:]
106 |                     return T.concatenate([ins_at_front, take_part, ins_at_back], 0)
107 |                 elif self.mode=="roll":
108 |                     return T.roll(c_mem, (-c_shift) % self.window_size, axis=0)
109 | 
110 |             if self.unroll_batch_num is None:
111 |                 shifted_mem, _ = theano.map(_shift_step, [separated_mem, shifts])
112 |             else:
113 |                 shifted_mem_parts = []
114 |                 for i in range(self.unroll_batch_num):
115 |                     shifted_mem_parts.append(_shift_step(separated_mem[i], shifts[i]))
116 |                 shifted_mem = T.stack(shifted_mem_parts)
117 | 
118 |             new_per_note_mem = shifted_mem.reshape((n_batch, self.window_size * per_note))
119 |             new_layer_hiddens = T.concatenate([indep_mem, new_per_note_mem, remaining_values], 1)
120 |             new_hiddens.append(new_layer_hiddens)
121 | 
122 |         if dropout_masks == [] or not self.dropout:
123 |             masks = []
124 |         else:
125 |             masks = [None] + dropout_masks
126 |         new_states = self.cells.forward(in_data, prev_hiddens=new_hiddens, dropout=masks)
127 |         return new_states
128 | 
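    # The two entry points below drive the stack differently: do_preprocess_scan
    # computes every timestep's input up front, which suits training, where the
    # whole sequence is known in advance; prepare_sample_scan and
    # sample_scan_routine instead let the caller feed each timestep's sampled
    # output back in as the next timestep's input, as generation requires.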
129 |     def do_preprocess_scan(self, deterministic_dropout=False, **kwargs):
130 |         """
131 |         Run a scan using this LSTM, preprocessing all inputs before the scan.
132 | 
133 |         Parameters:
134 |             kwargs[k]: should be a theano tensor of shape (n_batch, n_time, ... )
135 |                 Note that "relative_position" should be a keyword argument given here if there are relative
136 |                 shifts.
137 |             deterministic_dropout: If True, apply dropout deterministically, scaling everything. If false,
138 |                 sample dropout
139 | 
140 |         Returns:
141 |             A theano tensor of shape (n_batch, n_time, output_size) of activations
142 |         """
143 | 
144 |         assert len(kwargs)>0, "Need at least one input argument!"
145 |         n_batch, n_time = list(kwargs.values())[0].shape[:2]
146 | 
147 |         squashed_kwargs = {
148 |             k: v.reshape([n_batch*n_time] + [x for x in v.shape[2:]]) for k,v in kwargs.items()
149 |         }
150 | 
151 |         full_input = T.concatenate([ part.generate(**squashed_kwargs) for part in self.input_parts ], 1)
152 |         adjusted_input = full_input.reshape([n_batch, n_time, self.input_size]).dimshuffle((1,0,2))
153 | 
154 |         if "relative_position" in kwargs:
155 |             relative_position = kwargs["relative_position"]
156 |             diff_shifts = T.extra_ops.diff(relative_position, axis=1)
157 |             cat_shifts = T.concatenate([T.zeros((n_batch, 1), 'int32'), diff_shifts], 1)
158 |             shifts = cat_shifts.dimshuffle((1,0))
159 |         else:
160 |             shifts = T.zeros((n_time, n_batch), 'int32')
161 | 
162 |         def _scan_fn(in_data, shifts, *other):
163 |             other = list(other)
164 |             if self.dropout and not deterministic_dropout:
165 |                 split = -len(self.tot_layer_sizes)
166 |                 hiddens = other[:split]
167 |                 masks = [None] + other[split:]
168 |             else:
169 |                 masks = []
170 |                 hiddens = other
171 | 
172 |             return self.perform_step(in_data, shifts, hiddens, dropout_masks=masks)
173 | 
174 |         if self.dropout and not deterministic_dropout:
175 |             dropout_masks = UpscaleMultiDropout( [(n_batch, shape) for shape in self.tot_layer_sizes], self.dropout)
176 |         else:
177 |             dropout_masks = []
178 | 
179 |         outputs_info = [initial_state_with_taps(layer, n_batch) for layer in self.cells.layers]
180 |         result, _ = theano.scan(fn=_scan_fn, sequences=[adjusted_input, shifts], non_sequences=dropout_masks, outputs_info=outputs_info)
181 | 
182 |         final_out = get_last_layer(result).transpose((1,0,2))
183 | 
184 |         return final_out
185 | 
186 |     def prepare_sample_scan(self, start_pos, start_out, deterministic_dropout=False, **kwargs):
187 |         """
188 |         Prepare a sample scan
189 | 
190 |         Parameters:
191 |             kwargs[k]: should be a theano tensor of shape (n_batch, n_time, ... )
192 |                 Note that "relative_position" should be a keyword argument given here if there are relative
193 |                 shifts, as should "timestep"
194 |             start_pos: a theano tensor of shape (n_batch) giving the initial position passed to the
195 |                 out_to_in function
196 |             start_out: a theano tensor of shape (n_batch, X) giving the initial "output" passed
197 |                 to the out_to_in_fn
198 |             deterministic_dropout: If True, apply dropout deterministically, scaling everything. If false,
199 |                 sample dropout
200 | 
201 |         Returns:
202 |             A namedtuple, where
203 |                 sequences: a list of sequences to input into scan
204 |                 non_sequences: a list of non_sequences into scan
205 |                 outputs_info: a list of outputs_info for scan
206 |                 num_taps: the number of outputs with taps for this
207 |                 (other values): for internal use
208 |         """
209 |         assert len(kwargs)>0, "Need at least one input argument!"
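        # The scan state threads [previous position, previous output, per-layer
        # hidden states...], each with a -1 tap, so every step can condition on
        # what was emitted one timestep earlier.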
210 | n_batch, n_time = list(kwargs.values())[0].shape[:2] 211 | 212 | transp_kwargs = { 213 | k: v.dimshuffle((1,0) + tuple(range(2,v.ndim))) for k,v in kwargs.items() 214 | } 215 | 216 | if self.dropout and not deterministic_dropout: 217 | dropout_masks = UpscaleMultiDropout( [(n_batch, shape) for shape in self.tot_layer_sizes], self.dropout) 218 | else: 219 | dropout_masks = [] 220 | 221 | outputs_info = [{"initial":start_pos, "taps":[-1]}, {"initial":start_out, "taps":[-1]}] + [initial_state_with_taps(layer, n_batch) for layer in self.cells.layers] 222 | sequences = list(transp_kwargs.values()) 223 | non_sequences = dropout_masks 224 | num_taps = len([True for x in outputs_info if x is not None]) 225 | return SampleScanSpec(sequences=sequences, non_sequences=non_sequences, outputs_info=outputs_info, num_taps=num_taps, kwargs_keys=list(transp_kwargs.keys()), deterministic_dropout=deterministic_dropout, start_pos=start_pos) 226 | 227 | 228 | def sample_scan_routine(self, spec, *inputs): 229 | """ 230 | Start a scan routine. This is implemented as a generator, since we may need to interrupt the state in the 231 | middle of iteration. How to use: 232 | 233 | scan_rout = x.sample_scan_routine(spec, *inputs) 234 | - spec: The SampleScanSpec returned by prepare_sample_scan 235 | - *inputs: The scan inputs, in [ sequences..., taps..., non_sequences... ] order 236 | 237 | last_rel_pos, last_out, cur_kwargs = scan_rout.send(None) 238 | - last_rel_pos is a theano tensor of shape (n_batch) 239 | - last_out will be a theano tensor of shape (n_batch, output_size) 240 | - cur_kwargs[k] is a theano tensor of shape (n_batch, ...), from kwargs 241 | 242 | out_activations = scan_rout.send((new_pos, addtl_kwargs)) 243 | - new_pos is a theano tensor of shape (n_batch), giving the new relative position 244 | - addtl_kwargs[k] is a theano tensor of shape (n_batch, ...) to be added to cur kwargs 245 | Note that "relative_position" will be added automatically. 246 | 247 | scan_outputs = scan_rout.send(new_out) 248 | - new_out is a tensor of shape (n_batch, X) to be output 249 | 250 | scan_rout.close() 251 | 252 | -> scan_outputs should be returned back to scan 253 | """ 254 | stuff = list(inputs) 255 | I = len(spec.kwargs_keys) 256 | kwarg_seq_vals = stuff[:I] 257 | cur_kwargs = {k:v for k,v in zip(spec.kwargs_keys, kwarg_seq_vals)} 258 | last_pos, last_out = stuff[I:I+2] 259 | other = stuff[I+2:] 260 | 261 | if self.dropout and not spec.deterministic_dropout: 262 | split = -len(self.tot_layer_sizes) 263 | hiddens = other[:split] 264 | masks = [None] + other[split:] 265 | else: 266 | masks = [] 267 | hiddens = other 268 | 269 | cur_pos, addtl_kwargs = yield(last_pos, last_out, cur_kwargs) 270 | all_kwargs = { 271 | "relative_position": cur_pos 272 | } 273 | all_kwargs.update(cur_kwargs) 274 | all_kwargs.update(addtl_kwargs) 275 | 276 | shift = T.switch(T.eq(all_kwargs["timestep"],0), 0, cur_pos - last_pos) 277 | 278 | full_input = T.concatenate([ part.generate(**all_kwargs) for part in self.input_parts ], 1) 279 | 280 | step_stuff = self.perform_step(full_input, shift, hiddens, dropout_masks=masks) 281 | new_hiddens = step_stuff[:-1] 282 | raw_output = step_stuff[-1] 283 | sampled_output = yield(raw_output) 284 | 285 | yield [cur_pos, sampled_output] + step_stuff 286 | 287 | def extract_sample_scan_results(self, spec, outputs): 288 | """ 289 | Extract outputs from the scan results. 
290 | 291 | Parameters: 292 | outputs: The outputs from the scan associated with this stack 293 | 294 | Returns: 295 | positions, raw_output, sampled_output 296 | """ 297 | positions = T.concatenate([T.shape_padright(spec.start_pos), outputs[0].transpose((1,0))[:,:-1]], 1) 298 | sampled_output = outputs[2].transpose((1,0,2)) 299 | raw_output = outputs[-1].transpose((1,0,2)) 300 | 301 | return positions, raw_output, sampled_output 302 | 303 | 304 | def do_sample_scan(self, start_pos, start_out, sample_fn, out_to_in_fn, deterministic_dropout=True, **kwargs): 305 | """ 306 | Run a scan using this LSTM, sampling and processing as we go. 307 | 308 | Parameters: 309 | kwargs[k]: should be a theano tensor of shape (n_batch, n_time, ... ) 310 | Note that "relative_position" should be a keyword argument given here if there are relative 311 | shifts. 312 | start_pos: a theano tensor of shape (n_batch) giving the initial position passed to the 313 | out_to_in function 314 | start_out: a theano tensor of shape (n_batch, X) giving the initial "output" passed 315 | to the out_to_in_fn 316 | sample_fn: a function with signature 317 | sample_fn(out_activations, rel_pos) -> new_out, new_rel_pos 318 | where 319 | - rel_pos is a theano tensor of shape (n_batch) 320 | - out_activations is a tensor of shape (n_batch, output_size) 321 | and 322 | - new_out is a tensor of shape (n_batch, X) to be output 323 | - new_rel_pos should be a theano tensor of shape (n_batch) 324 | out_to_in_fn: a function with signature 325 | out_to_in_fn(rel_pos, last_out, **cur_kwargs) -> addtl_kwargs 326 | where 327 | - rel_pos is a theano tensor of shape (n_batch) 328 | - last_out will be a theano tensor of shape (n_batch, output_size) 329 | - cur_kwargs[k] is a theano tensor of shape (n_batch, ...), from kwargs 330 | and 331 | - addtl_kwargs[k] is a theano tensor of shape (n_batch, ...) to be added to cur kwargs 332 | Note that "relative_position" will be added automatically. 333 | deterministic_dropout: If True, apply dropout deterministically, scaling everything. 
If false,
334 |                 sample dropout
335 | 
336 |         Returns: positions, raw_output, sampled_output, updates
337 |         """
338 |         raise NotImplementedError()
        # The driver below is unreachable reference code: sample_scan_routine
        # needs the new position before it can produce activations, but
        # sample_fn (as documented above) derives the new position from those
        # activations, so this generic wrapper cannot be used as-is and the
        # previous position is reused at the send() below.
339 |         spec = self.prepare_sample_scan(start_pos, start_out, deterministic_dropout, **kwargs)
340 | 
341 |         def _scan_fn(*stuff):
342 |             scan_rout = self.sample_scan_routine(spec, *stuff)
343 |             rel_pos, last_out, cur_kwargs = scan_rout.send(None)
344 |             addtl_kwargs = out_to_in_fn(rel_pos, last_out, **cur_kwargs)
345 |             out_activations = scan_rout.send((rel_pos, addtl_kwargs))
346 |             sampled_output, new_pos = sample_fn(out_activations, rel_pos)
347 |             scan_outputs = scan_rout.send(sampled_output)
348 |             scan_rout.close()
349 |             return scan_outputs
350 | 
351 |         result, updates = theano.scan(fn=_scan_fn, sequences=spec.sequences, non_sequences=spec.non_sequences, outputs_info=spec.outputs_info)
352 |         positions, raw_output, sampled_output = self.extract_sample_scan_results(spec, result)
353 |         return positions, raw_output, sampled_output, updates
354 | 
--------------------------------------------------------------------------------
/training.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import os
3 | import random
4 | import signal
5 | 
6 | import leadsheet
7 | import constants
8 | 
9 | import param_cvt
10 | 
11 | import pickle
12 | 
13 | import traceback
14 | 
15 | from pprint import pformat
16 | 
17 | BATCH_SIZE = 10
18 | SEGMENT_STEP = constants.WHOLE//constants.RESOLUTION_SCALAR
19 | SEGMENT_LEN = 4*SEGMENT_STEP
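# constants.WHOLE//constants.RESOLUTION_SCALAR is the number of time slots in
# a whole note (one bar in the common 4/4 case), so SEGMENT_STEP advances one
# bar at a time and SEGMENT_LEN covers a four-bar training segment, matching
# the 4-bar chunks described in the README. get_batch below slices each
# sampled leadsheet at a random bar-aligned offset of this length.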
Skipping...".format(lsfn)) 42 | else: 43 | new_leadsheets.append(lsfn) 44 | print("Found {} leadsheets.".format(len(leadsheets))) 45 | return new_leadsheets 46 | 47 | def get_batch(leadsheets, with_sample=False): 48 | """ 49 | Get a batch 50 | 51 | leadsheets should be a list of dataset lists of (chord, melody) tuples, or just a dataset list of tuples 52 | 53 | returns: chords, melodies 54 | """ 55 | if not isinstance(leadsheets[0], list): 56 | leadsheets = [leadsheets] 57 | 58 | sample_datasets = [random.randrange(len(leadsheets)) for _ in range(BATCH_SIZE)] 59 | sample_fns = [random.choice(leadsheets[i]) for i in sample_datasets] 60 | loaded_samples = [leadsheet.parse_leadsheet(lsfn) for lsfn in sample_fns] 61 | sample_lengths = [leadsheet.get_leadsheet_length(c,m) for c,m in loaded_samples] 62 | 63 | starts = [(0 if l==SEGMENT_LEN else random.randrange(0,l-SEGMENT_LEN,SEGMENT_STEP)) for l in sample_lengths] 64 | sliced = [leadsheet.slice_leadsheet(c,m,s,s+SEGMENT_LEN) for (c,m),s in zip(loaded_samples, starts)] 65 | 66 | res = list(zip(*sliced)) 67 | 68 | sample_sources = ["{}: starting at {} = bar {}".format(fn, start, start/(constants.WHOLE//constants.RESOLUTION_SCALAR)) for fn,start in zip(sample_fns, starts)] 69 | 70 | if with_sample: 71 | return res, sample_sources 72 | else: 73 | return res 74 | 75 | def generate(model, leadsheets, filename, with_vis=False, batch=None): 76 | if batch is None: 77 | batch = get_batch(leadsheets, True) 78 | (chords, melody), sample_sources = batch 79 | generated_out, chosen, vis_probs, vis_info = model.produce(chords, melody) 80 | 81 | if with_vis: 82 | with open("{}_sources.txt".format(filename), "w") as f: 83 | f.write('\n'.join(sample_sources)) 84 | np.save('{}_chosen.npy'.format(filename), chosen) 85 | np.save('{}_probs.npy'.format(filename), vis_probs) 86 | for i,v in enumerate(vis_info): 87 | np.save('{}_info_{}.npy'.format(filename,i), v) 88 | for samplenum, (melody, chords) in enumerate(zip(generated_out, chords)): 89 | leadsheet.write_leadsheet(chords, melody, '{}_{}.ls'.format(filename, samplenum)) 90 | 91 | def validate(model, validation_leadsheets): 92 | accum_loss = None 93 | accum_infos = None 94 | for i in range(VALIDATION_CT): 95 | loss, infos = model.eval(*get_batch(validation_leadsheets)) 96 | if accum_loss is None: 97 | accum_loss = loss 98 | accum_infos = infos 99 | else: 100 | accum_loss += loss 101 | for k in accum_info.keys(): 102 | accum_loss[k] += accum_infos[k] 103 | accum_loss /= VALIDATION_CT 104 | for k in accum_info.keys(): 105 | accum_loss[k] /= VALIDATION_CT 106 | return accum_loss, accum_info 107 | 108 | def validate_generate(model, validation_leadsheets, generated_dir): 109 | for lsfn in validation_leadsheets: 110 | ch,mel = leadsheet.parse_leadsheet(lsfn) 111 | batch = ([ch],[mel]), [lsfn] 112 | curdir = os.path.join(generated_dir, os.path.splitext(os.path.basename(lsfn))[0]) 113 | os.makedirs(curdir) 114 | generate(model, None, os.path.join(curdir, "generated"), with_vis=True, batch=batch) 115 | 116 | def train(model,leadsheets,num_updates,outputdir,start=0,save_params_interval=5000,validation_leadsheets=None,validation_generate_ct=1,auto_connectome_keys=None): 117 | stopflag = [False] 118 | def signal_handler(signame, sf): 119 | stopflag[0] = True 120 | print("Caught interrupt, waiting until safe. 
Press again to force terminate") 121 | signal.signal(signal.SIGINT, old_handler) 122 | old_handler = signal.signal(signal.SIGINT, signal_handler) 123 | for i in range(start+1,start+num_updates+1): 124 | if stopflag[0]: 125 | break 126 | loss, infos = model.train(*get_batch(leadsheets)) 127 | with open(os.path.join(outputdir,'data.csv'),'a') as f: 128 | if i == 1: 129 | f.seek(0) 130 | f.truncate() 131 | f.write("iter, loss, " + ", ".join(k for k,v in sorted(infos.items())) + "\n") 132 | f.write("{}, {}, ".format(i,loss) + ", ".join(str(v) for k,v in sorted(infos.items())) + "\n") 133 | if i % 10 == 0: 134 | print("update {}: {}, info {}".format(i,loss,pformat(infos))) 135 | if save_params_interval is not None and i % save_params_interval == 0: 136 | paramfile = os.path.join(outputdir, 'params{}.p'.format(i)) 137 | pickle.dump(model.params,open(paramfile, 'wb')) 138 | if auto_connectome_keys is not None: 139 | param_cvt.main(paramfile, 18, auto_connectome_keys, make_zip=True) 140 | if validation_leadsheets is None: 141 | generate(model, leadsheets, os.path.join(outputdir,'sample{}'.format(i))) 142 | else: 143 | for gen_num in range(validation_generate_ct): 144 | validate_generate(model, validation_leadsheets, os.path.join(outputdir, "validation_{}_sample_{}".format(i,gen_num))) 145 | if not stopflag[0]: 146 | signal.signal(signal.SIGINT, old_handler) -------------------------------------------------------------------------------- /util.py: -------------------------------------------------------------------------------- 1 | import theano 2 | import os 3 | import theano.tensor as T 4 | import numpy as np 5 | from theano_lstm import MultiDropout 6 | 7 | def has_hidden(layer): 8 | """ 9 | Whether a layer has a trainable 10 | initial hidden state. 11 | """ 12 | return hasattr(layer, 'initial_hidden_state') 13 | 14 | def matrixify(vector, n): 15 | return T.repeat(T.shape_padleft(vector), n, axis=0) 16 | 17 | def initial_state(layer, dimensions = None): 18 | """ 19 | Initalizes the recurrence relation with an initial hidden state 20 | if needed, else replaces with a "None" to tell Theano that 21 | the network **will** return something, but it does not need 22 | to send it to the next step of the recurrence 23 | """ 24 | if dimensions is None: 25 | return layer.initial_hidden_state if has_hidden(layer) else None 26 | else: 27 | return matrixify(layer.initial_hidden_state, dimensions) if has_hidden(layer) else None 28 | 29 | def initial_state_with_taps(layer, dimensions = None): 30 | """Optionally wrap tensor variable into a dict with taps=[-1]""" 31 | state = initial_state(layer, dimensions) 32 | if state is not None: 33 | return dict(initial=state, taps=[-1]) 34 | else: 35 | return None 36 | 37 | 38 | def get_last_layer(result): 39 | if isinstance(result, list): 40 | return result[-1] 41 | else: 42 | return result 43 | 44 | def ensure_list(result): 45 | if isinstance(result, list): 46 | return result 47 | else: 48 | return [result] 49 | 50 | def UpscaleMultiDropout(shapes, dropout = 0.): 51 | """ 52 | Return all the masks needed for dropout outside of a scan loop. 
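    The masks are pre-scaled by 1/(1-dropout), so the expected value of a
    masked activation matches the unmasked activation used on the
    deterministic path.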
53 | """ 54 | orig_masks = MultiDropout(shapes, dropout) 55 | fixed_masks = [m / (1-dropout) for m in orig_masks] 56 | return fixed_masks 57 | 58 | class _SliceHelperObj(object): 59 | """ 60 | Helper object that exposes the slice from __getitem__ directly 61 | """ 62 | def __getitem__(self, key): 63 | return key 64 | 65 | sliceMaker = _SliceHelperObj() 66 | 67 | def _better_print_fn(op, xin): 68 | for item in op.attrs: 69 | if callable(item): 70 | pmsg = item(xin) 71 | else: 72 | temp = getattr(xin, item) 73 | if callable(temp): 74 | pmsg = temp() 75 | else: 76 | pmsg = temp 77 | print(op.message, attr, '=', pmsg) 78 | 79 | def FnPrint(name, items=['__str__']): 80 | return theano.printing.Print(name, items, _better_print_fn) 81 | 82 | def Save(path="", preprocess=lambda x:x, text=False): 83 | def _save_fn(op, xin): 84 | val = preprocess(xin) 85 | if text: 86 | np.savetxt(path + ".csv", val, delimiter=",") 87 | else: 88 | np.save(path + ".npy", val) 89 | return theano.printing.Print(path, [], _save_fn) 90 | --------------------------------------------------------------------------------