├── .gitignore
├── README.md
├── __init__.py
├── adam.py
├── constants.py
├── custom_cmap.py
├── generate_trade_helper.py
├── input_parts
│   ├── __init__.py
│   ├── abs_position_part.py
│   ├── base_input_part.py
│   ├── beat_part.py
│   ├── chord_part.py
│   └── passthrough_part.py
├── instructions
│   ├── generate_trade_helper.md
│   ├── lscat.md
│   ├── lssplit.md
│   ├── main.md
│   ├── param_cvt.md
│   ├── plot_data.md
│   └── plot_internal_state.md
├── leadsheet.py
├── lscat.py
├── lssplit.py
├── main.py
├── models
│   ├── __init__.py
│   ├── compressive_autoencoder_model.py
│   ├── product_model.py
│   └── simple_rel_model.py
├── nametrain
│   └── name_model.py
├── note_encodings
│   ├── __init__.py
│   ├── abs_seq_encoding.py
│   ├── base_encoding.py
│   ├── chord_relative.py
│   ├── circle_of_thirds_encoding.py
│   ├── relative_jump.py
│   └── rhythm_only.py
├── param_cvt.py
├── param_keys
│   ├── ae_abs_keys.txt
│   ├── ae_poex_keys.txt
│   ├── corn_keys.txt
│   ├── poex_keys.txt
│   └── poex_sep_rhythm_keys.txt
├── plot_data.py
├── plot_internal_state.py
├── queue_managers
│   ├── __init__.py
│   ├── nearness_standard_manager.py
│   ├── noise_wrapper.py
│   ├── queue_base.py
│   ├── queueless_standard_manager.py
│   ├── queueless_variational_manager.py
│   ├── sampling_variational_manager.py
│   ├── standard_manager.py
│   └── variational_manager.py
├── relshift_lstm.py
├── training.py
└── util.py

/.gitignore:
--------------------------------------------------------------------------------
1 | *.pyc
2 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Using LSTMprovisor
2 |
3 | This project allows you to train LSTM-based neural network models of jazz music in the .ls format, as well as convert the models into connectome files for use in Impro-Visor.
4 |
5 | ## Dependencies and Setup
6 |
7 | In order to run this project, you will need to install [Python 3.5][] (or later). You will also need the libraries `theano`, `theano-lstm`, `sexpdata`, and `matplotlib`, which you can install with
8 |
9 | ```
10 | pip3 install theano theano-lstm sexpdata matplotlib
11 | ```
12 |
13 | [Python 3.5]: https://www.python.org/downloads/
14 |
15 | Note that Theano depends on SciPy. If you do not already have SciPy, `pip3` should install SciPy automatically when you install Theano, but if that fails, you can download SciPy from [their website][scipy]. Alternately, you can install a Python 3.5 distribution that already has SciPy installed, such as [Anaconda][].
16 |
17 | [scipy]: http://scipy.org/install.html
18 | [Anaconda]: https://www.continuum.io/downloads
19 |
20 | Before using the scripts, you will also need to make a file called `.theanorc` in your home directory, with the following contents:
21 |
22 | ```
23 | [global]
24 | floatX=float32
25 |
26 | mode=FAST_RUN
27 | ```
28 |
29 | For additional Theano configuration options, including instructions on how to use the GPU, see the [theano config documentation][configdoc].
30 |
31 | [configdoc]: http://deeplearning.net/software/theano/library/config.html
32 |
33 | ## Scripts
34 |
35 | Python scripts are provided to accomplish certain tasks. Each script can be invoked using the `python3` executable, for example
36 |
37 | ```
38 | python3 SCRIPTNAME.py [arguments]
39 | ```
40 |
41 | - [main.py](instructions/main.md): The main entry point for the project. Trains different types of models, and generates samples from them.
42 | - [param_cvt.py](instructions/param_cvt.md): Converts trained connectomes from pickle format (.p) to Impro-Visor connectome format (.ctome).
43 | - [plot_internal_state.py](instructions/plot_internal_state.md): Plots the internal state of a network, produced by main.py in generation mode.
44 | - [plot_data.py](instructions/plot_data.md): Plots a .csv file as a graph, allowing you to visualize the training loss of a network.
45 | - [lscat.py](instructions/lscat.md): Concatenates leadsheets together for easier viewing.
46 | - [lssplit.py](instructions/lssplit.md): Splits leadsheets into multiple pieces.
47 | - [generate_trade_helper.py](instructions/generate_trade_helper.md): Interleaves generated output with the original input in a single leadsheet, for use in autoencoder models.
48 |
49 | Detailed instruction pages for each script are available in the instructions subdirectory, and each script will display a help message if given the `-h` argument.
50 |
51 | ## Examples
52 |
53 | Some general examples follow. See the detailed instruction pages for more in-depth examples.
54 |
55 | Train a product-of-experts generative model on a directory of leadsheet files with path `datasets/my_dataset`, automatically resuming training if previously interrupted. By default, each leadsheet file will be split into 4-bar chunks starting at each bar.
56 |
57 | ```
58 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset --resume_auto poex
59 | ```
60 |
61 | Generate some leadsheets using the trained product-of-experts model, sampling from the dataset:
62 |
63 | ```
64 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset/generated --resume 0 output_my_dataset/final_params.p --generate poex
65 | ```
66 |
67 | Visualize the internal state of the network for the first generated leadsheet:
68 |
69 | ```
70 | $ python3 plot_internal_state.py output_my_dataset/generated 0
71 | ```
72 |
73 | Plot the training progress of the network:
74 |
75 | ```
76 | $ python3 plot_data.py output_my_dataset/data.csv
77 | ```
78 |
79 | Convert the trained network into a connectome file:
80 |
81 | ```
82 | $ python3 param_cvt.py --keys param_keys/poex_keys.txt output_my_dataset/final_params.p
83 | ```
84 |
85 | Train a compressing autoencoder on the same dataset, using fixed features and product-of-experts for compatibility with Impro-Visor:
86 |
87 | ```
88 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset_compae --resume_auto compae poex queueless_std --feature_period 24 --add_loss
89 | ```
90 |
91 | Run the autoencoder on some leadsheets from the dataset, and then combine the input and output into a trading summary leadsheet:
92 |
93 | ```
94 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset_compae/generated --resume 0 output_my_dataset_compae/final_params.p --generate compae poex queueless_std --feature_period 24 --add_loss
95 | $ python3 generate_trade_helper.py output_my_dataset_compae/generated
96 | ```
97 |
98 | Convert the trained autoencoder into a connectome file:
99 |
100 | ```
101 | $ python3 param_cvt.py --keys param_keys/ae_poex_keys.txt output_my_dataset_compae/final_params.p
102 | ```
103 |
--------------------------------------------------------------------------------
/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Impro-Visor/lstmprovisor-python/027e394dbf43d31900fe293a22fadfed683256f9/__init__.py
-------------------------------------------------------------------------------- /adam.py: -------------------------------------------------------------------------------- 1 | """ 2 | The MIT License (MIT) 3 | Copyright (c) 2015 Alec Radford 4 | Permission is hereby granted, free of charge, to any person obtaining a copy 5 | of this software and associated documentation files (the "Software"), to deal 6 | in the Software without restriction, including without limitation the rights 7 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 | copies of the Software, and to permit persons to whom the Software is 9 | furnished to do so, subject to the following conditions: 10 | The above copyright notice and this permission notice shall be included in all 11 | copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 13 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 14 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 15 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 16 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 17 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 18 | SOFTWARE. 19 | """ 20 | 21 | import theano 22 | import theano.tensor as T 23 | import numpy as np 24 | 25 | def Adam(cost, params, lr=0.0002, b1=0.1, b2=0.001, e=1e-8): 26 | updates = [] 27 | grads = T.grad(cost, params) 28 | i = theano.shared(np.array(0., theano.config.floatX)) 29 | i_t = i + 1. 30 | fix1 = 1. - (1. - b1)**i_t 31 | fix2 = 1. - (1. - b2)**i_t 32 | lr_t = lr * (T.sqrt(fix2) / fix1) 33 | for p, g in zip(params, grads): 34 | m = theano.shared(p.get_value() * 0.) 35 | v = theano.shared(p.get_value() * 0.) 36 | m_t = (b1 * g) + ((1. - b1) * m) 37 | v_t = (b2 * T.sqr(g)) + ((1. 
- b2) * v) 38 | g_t = m_t / (T.sqrt(v_t) + e) 39 | p_t = p - (lr_t * g_t) 40 | updates.append((m, m_t)) 41 | updates.append((v, v_t)) 42 | updates.append((p, p_t)) 43 | updates.append((i, i_t)) 44 | return updates -------------------------------------------------------------------------------- /constants.py: -------------------------------------------------------------------------------- 1 | from collections import OrderedDict, namedtuple 2 | import numpy as np 3 | 4 | # [ c - d - e f - g - a - b] 5 | CHORD_TYPES = OrderedDict([ 6 | ('', [ 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]), 7 | ('+', [ 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]), 8 | ('6', [ 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0]), 9 | ('7', [ 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0]), 10 | ('9', [ 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0]), 11 | ('M', [ 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]), 12 | ('m', [ 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0]), 13 | ('o', [ 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0]), 14 | ('+7', [ 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0]), 15 | ('11', [ 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0]), 16 | ('13', [ 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0]), 17 | ('69', [ 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0]), 18 | ('7+', [ 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0]), 19 | ('9+', [ 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0]), 20 | ('M6', [ 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0]), 21 | ('M7', [ 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1]), 22 | ('M9', [ 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1]), 23 | ('NC', [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]), 24 | ('h7', [ 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0]), 25 | ('m+', [ 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]), 26 | ('m6', [ 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0]), 27 | ('m7', [ 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0]), 28 | ('m9', [ 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0]), 29 | ('o7', [ 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0]), 30 | ('7#5', [ 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0]), 31 | ('7#9', [ 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0]), 32 | ('7b5', [ 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0]), 33 | ('7b6', [ 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0]), 34 | ('7b9', [ 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0]), 35 | ('9#5', [ 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0]), 36 | ('9b5', [ 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0]), 37 | ('M#5', [ 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]), 38 | ('M69', [ 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0]), 39 | ('M7+', [ 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1]), 40 | ('aug', [ 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]), 41 | ('m#5', [ 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]), 42 | ('m11', [ 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0]), 43 | ('m13', [ 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0]), 44 | ('m69', [ 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0]), 45 | ('mM7', [ 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1]), 46 | ('mM9', [ 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1]), 47 | ('mb6', [ 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]), 48 | ('13#9', [ 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0]), 49 | ('13b5', [ 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0]), 50 | ('13b9', [ 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0]), 51 | ('7#11', [ 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0]), 52 | ('7alt', [ 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0]), 53 | ('7aug', [ 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0]), 54 | ('7b13', [ 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0]), 55 | ('7no5', [ 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0]), 56 | ('7sus', [ 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0]), 57 | ('9#11', [ 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0]), 58 | ('9b13', [ 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0]), 59 | ('9no5', [ 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0]), 60 | ('9sus', [ 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0]), 61 | ('Bass', [ 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]), 62 | ('M7#5', [ 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1]), 63 | ('M7b5', [ 1, 
0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1]), 64 | ('M9#5', [ 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1]), 65 | ('aug7', [ 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0]), 66 | ('m7#5', [ 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0]), 67 | ('m7b5', [ 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0]), 68 | ('m9#5', [ 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0]), 69 | ('m9b5', [ 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0]), 70 | ('sus2', [ 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0]), 71 | ('sus4', [ 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0]), 72 | ('+add9', [ 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0]), 73 | ('13#11', [ 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0]), 74 | ('13sus', [ 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0]), 75 | ('7#5#9', [ 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0]), 76 | ('7#5b9', [ 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0]), 77 | ('7b5#9', [ 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0]), 78 | ('7b5b9', [ 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0]), 79 | ('7sus4', [ 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0]), 80 | ('9sus4', [ 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0]), 81 | ('Blues', [ 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0]), 82 | ('M7#11', [ 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1]), 83 | ('M9#11', [ 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1]), 84 | ('Msus2', [ 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0]), 85 | ('Msus4', [ 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0]), 86 | ('m11#5', [ 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0]), 87 | ('m11b5', [ 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0]), 88 | ('mM7b6', [ 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1]), 89 | ('madd9', [ 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0]), 90 | ('mb6M7', [ 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1]), 91 | ('mb6b9', [ 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]), 92 | ('susb9', [ 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0]), 93 | ('13sus4', [ 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0]), 94 | ('7#9#11', [ 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0]), 95 | ('7#9b13', [ 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0]), 96 | ('7b5b13', [ 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0]), 97 | ('7b9#11', [ 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0]), 98 | ('7b9b13', [ 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0]), 99 | ('7b9sus', [ 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0]), 100 | ('7susb9', [ 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0]), 101 | ('9#5#11', [ 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]), 102 | ('9b5b13', [ 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0]), 103 | ('13#9#11', [ 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0]), 104 | ('13b9#11', [ 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0]), 105 | ('7#11b13', [ 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0]), 106 | ('7b9sus4', [ 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0]), 107 | ('7sus4b9', [ 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0]), 108 | ('9#11b13', [ 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0]), 109 | ('M#5add9', [ 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0]), 110 | ('7#5b9#11', [ 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0]), 111 | ('7b5b9b13', [ 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0]), 112 | ('7#9#11b13', [ 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0]), 113 | ('7b9#11b13', [ 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0]), 114 | ('7b9b13#11', [ 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0]), 115 | ('7b9b13sus4', [ 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0]), 116 | ('7sus4b9b13', [ 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0]), 117 | ]) 118 | 119 | NOTE_OFFSETS = OrderedDict([ 120 | ("c", 0), 121 | ("db", 1), 122 | ("d", 2), 123 | ("eb", 3), 124 | ("e", 4), 125 | ("f", 5), 126 | ("gb", 6), 127 | ("g", 7), 128 | ("ab", 8), 129 | ("a", 9), 130 | ("bb", 10), 131 | ("b", 11), 132 | 133 | ("c#", 1), 134 | ("d#", 3), 135 | ("f#", 6), 136 | ("g#", 8), 137 | ("a#", 10), 138 | 139 | ("cb", 11), 140 | ("fb", 4), 141 | ("e#", 5), 142 | ("b#", 0), 143 | ]) 144 | 145 | CHORD_NOTE_OFFSETS = OrderedDict((k[:1].upper()+k[1:],v%12) for k,v in NOTE_OFFSETS.items()) 146 | 147 | 
WHOLE = 480; # slots in a whole note 148 | HALF = WHOLE//2; # 240 149 | QUARTER = WHOLE//4; # 120 150 | EIGHTH = WHOLE//8; # 60 151 | SIXTEENTH = WHOLE//16; # 30 152 | THIRTYSECOND = WHOLE//32; # 15 153 | 154 | HALF_TRIPLET = 2*HALF//3; # 160 155 | QUARTER_TRIPLET = 2*QUARTER//3; # 80 156 | EIGHTH_TRIPLET = 2*EIGHTH//3; # 40 157 | SIXTEENTH_TRIPLET = 2*SIXTEENTH//3; # 20 158 | THIRTYSECOND_TRIPLET = 2*THIRTYSECOND//3; # 10 159 | 160 | QUARTER_QUINTUPLET = 4*QUARTER//5; # 96 161 | EIGHTH_QUINTUPLET = 4*EIGHTH//5; # 48 162 | SIXTEENTH_QUINTUPLET = 4*SIXTEENTH//5; # 24 163 | THIRTYSECOND_QUINTUPLET = 4*THIRTYSECOND//5; # 12 164 | 165 | DOTTED_HALF = 3*HALF//2; # 360 166 | DOTTED_QUARTER = 3*QUARTER//2; # 180 167 | DOTTED_EIGHTH = 3*EIGHTH//2; # 90 168 | DOTTED_SIXTEENTH = 3*SIXTEENTH//2; # 45 169 | 170 | FOUREIGHTIETH = 1; # WHOLE/WHOLE # 1 171 | TWOFORTIETH = 2; # 2 172 | ONETWENTIETH = 4; # 4 173 | SIXTIETH = 8; # 8 174 | 175 | RESOLUTION_SCALAR = 10; 176 | 177 | MIDDLE_C_MIDI = 60 178 | OCTAVE = 12 179 | 180 | 181 | EPSILON = np.finfo(np.float32).eps 182 | 183 | NoteBounds = namedtuple("NoteBounds",["lowbound","highbound"]) 184 | 185 | BOUNDS = NoteBounds(48, 84+1) 186 | -------------------------------------------------------------------------------- /custom_cmap.py: -------------------------------------------------------------------------------- 1 | 2 | from matplotlib.colors import ListedColormap 3 | from numpy import nan, inf 4 | 5 | # Used to reconstruct the colormap in viscm 6 | parameters = {'xp': [24.397622053872112, 22.329064134250359, -3.5279098610215271, -7.3202660469947318, -3.5279098610215271], 7 | 'yp': [10.224586288416106, 29.531126871552431, 21.60165484633572, 1.6055949566588197, -1.4972419227738101], 8 | 'min_Jp': 18.730158730158728, 9 | 'max_Jp': 99.6190476190476} 10 | 11 | cm_data = [[ 3.19188006e-01, 5.08411011e-04, 4.27847166e-02], 12 | [ 3.23452429e-01, 2.94004376e-03, 4.29498405e-02], 13 | [ 3.27697959e-01, 5.48550076e-03, 4.30937800e-02], 14 | [ 3.31925014e-01, 8.14729136e-03, 4.32108710e-02], 15 | [ 3.36134026e-01, 1.09282090e-02, 4.32932080e-02], 16 | [ 3.40324261e-01, 1.38309580e-02, 4.33544174e-02], 17 | [ 3.44496109e-01, 1.68585499e-02, 4.33863092e-02], 18 | [ 3.48649544e-01, 2.00141344e-02, 4.33873393e-02], 19 | [ 3.52784075e-01, 2.33004930e-02, 4.33674127e-02], 20 | [ 3.56899901e-01, 2.67211674e-02, 4.33188700e-02], 21 | [ 3.60996892e-01, 3.02796868e-02, 4.32405671e-02], 22 | [ 3.65074618e-01, 3.39789376e-02, 4.31416825e-02], 23 | [ 3.69133039e-01, 3.78227502e-02, 4.30181093e-02], 24 | [ 3.73172062e-01, 4.17796063e-02, 4.28631975e-02], 25 | [ 3.77191242e-01, 4.56665666e-02, 4.26883879e-02], 26 | [ 3.81190385e-01, 4.94889072e-02, 4.24937807e-02], 27 | [ 3.85169297e-01, 5.32578705e-02, 4.22682848e-02], 28 | [ 3.89127602e-01, 5.69807657e-02, 4.20217552e-02], 29 | [ 3.93065023e-01, 6.06644181e-02, 4.17567112e-02], 30 | [ 3.96981266e-01, 6.43150594e-02, 4.14725462e-02], 31 | [ 4.00875841e-01, 6.79394110e-02, 4.11600480e-02], 32 | [ 4.04748512e-01, 7.15406015e-02, 4.08309528e-02], 33 | [ 4.08598937e-01, 7.51227174e-02, 4.04858381e-02], 34 | [ 4.12426752e-01, 7.86893890e-02, 4.01206455e-02], 35 | [ 4.16231263e-01, 8.22452517e-02, 3.97338038e-02], 36 | [ 4.20012268e-01, 8.57920330e-02, 3.93348179e-02], 37 | [ 4.23769406e-01, 8.93321242e-02, 3.89262925e-02], 38 | [ 4.27502256e-01, 9.28678889e-02, 3.85095772e-02], 39 | [ 4.31210387e-01, 9.64014725e-02, 3.80861373e-02], 40 | [ 4.34892901e-01, 9.99361677e-02, 3.76509851e-02], 41 | [ 4.38549738e-01, 1.03472432e-01, 
3.72127744e-02], 42 | [ 4.42180470e-01, 1.07011798e-01, 3.67737224e-02], 43 | [ 4.45784643e-01, 1.10555736e-01, 3.63357812e-02], 44 | [ 4.49361798e-01, 1.14105584e-01, 3.59010342e-02], 45 | [ 4.52911477e-01, 1.17662562e-01, 3.54716989e-02], 46 | [ 4.56432847e-01, 1.21228614e-01, 3.50467949e-02], 47 | [ 4.59925736e-01, 1.24804054e-01, 3.46317940e-02], 48 | [ 4.63389808e-01, 1.28389545e-01, 3.42302550e-02], 49 | [ 4.66824637e-01, 1.31985895e-01, 3.38449160e-02], 50 | [ 4.70229810e-01, 1.35593819e-01, 3.34786526e-02], 51 | [ 4.73604927e-01, 1.39213951e-01, 3.31344783e-02], 52 | [ 4.76949608e-01, 1.42846843e-01, 3.28155440e-02], 53 | [ 4.80263491e-01, 1.46492970e-01, 3.25251366e-02], 54 | [ 4.83546237e-01, 1.50152732e-01, 3.22666779e-02], 55 | [ 4.86797340e-01, 1.53826781e-01, 3.20427670e-02], 56 | [ 4.90016620e-01, 1.57515177e-01, 3.18577613e-02], 57 | [ 4.93203933e-01, 1.61217925e-01, 3.17160407e-02], 58 | [ 4.96359052e-01, 1.64935164e-01, 3.16215131e-02], 59 | [ 4.99481779e-01, 1.68666973e-01, 3.15782102e-02], 60 | [ 5.02571949e-01, 1.72413379e-01, 3.15902848e-02], 61 | [ 5.05629430e-01, 1.76174354e-01, 3.16620080e-02], 62 | [ 5.08654123e-01, 1.79949823e-01, 3.17977662e-02], 63 | [ 5.11645963e-01, 1.83739667e-01, 3.20020585e-02], 64 | [ 5.14604920e-01, 1.87543723e-01, 3.22794932e-02], 65 | [ 5.17530997e-01, 1.91361790e-01, 3.26347858e-02], 66 | [ 5.20424232e-01, 1.95193632e-01, 3.30727557e-02], 67 | [ 5.23284694e-01, 1.99038983e-01, 3.35983244e-02], 68 | [ 5.26112485e-01, 2.02897548e-01, 3.42165124e-02], 69 | [ 5.28907738e-01, 2.06769007e-01, 3.49324380e-02], 70 | [ 5.31670618e-01, 2.10653019e-01, 3.57513149e-02], 71 | [ 5.34401314e-01, 2.14549225e-01, 3.66784508e-02], 72 | [ 5.37100046e-01, 2.18457250e-01, 3.77192462e-02], 73 | [ 5.39767058e-01, 2.22376711e-01, 3.88791931e-02], 74 | [ 5.42402617e-01, 2.26307210e-01, 4.01638742e-02], 75 | [ 5.45007013e-01, 2.30248347e-01, 4.15510718e-02], 76 | [ 5.47580556e-01, 2.34199717e-01, 4.30359691e-02], 77 | [ 5.50123575e-01, 2.38160913e-01, 4.46216703e-02], 78 | [ 5.52636413e-01, 2.42131527e-01, 4.63067669e-02], 79 | [ 5.55119432e-01, 2.46111156e-01, 4.80895724e-02], 80 | [ 5.57573004e-01, 2.50099400e-01, 4.99681664e-02], 81 | [ 5.59997514e-01, 2.54095864e-01, 5.19404378e-02], 82 | [ 5.62393356e-01, 2.58100161e-01, 5.40041260e-02], 83 | [ 5.64760934e-01, 2.62111911e-01, 5.61568596e-02], 84 | [ 5.67100658e-01, 2.66130744e-01, 5.83961918e-02], 85 | [ 5.69412966e-01, 2.70156278e-01, 6.07196404e-02], 86 | [ 5.71698290e-01, 2.74188154e-01, 6.31246983e-02], 87 | [ 5.73957035e-01, 2.78226051e-01, 6.56088561e-02], 88 | [ 5.76189628e-01, 2.82269644e-01, 6.81696432e-02], 89 | [ 5.78396498e-01, 2.86318616e-01, 7.08046350e-02], 90 | [ 5.80578073e-01, 2.90372663e-01, 7.35114667e-02], 91 | [ 5.82734784e-01, 2.94431495e-01, 7.62878435e-02], 92 | [ 5.84867059e-01, 2.98494830e-01, 7.91315492e-02], 93 | [ 5.86975325e-01, 3.02562403e-01, 8.20404518e-02], 94 | [ 5.89060008e-01, 3.06633957e-01, 8.50125072e-02], 95 | [ 5.91121531e-01, 3.10709248e-01, 8.80457618e-02], 96 | [ 5.93160315e-01, 3.14788043e-01, 9.11383525e-02], 97 | [ 5.95176778e-01, 3.18870120e-01, 9.42885072e-02], 98 | [ 5.97171334e-01, 3.22955269e-01, 9.74945428e-02], 99 | [ 5.99144397e-01, 3.27043288e-01, 1.00754863e-01], 100 | [ 6.01096374e-01, 3.31133987e-01, 1.04067958e-01], 101 | [ 6.03027671e-01, 3.35227184e-01, 1.07432397e-01], 102 | [ 6.04938690e-01, 3.39322707e-01, 1.10846829e-01], 103 | [ 6.06829831e-01, 3.43420394e-01, 1.14309978e-01], 104 | [ 6.08701490e-01, 3.47520088e-01, 1.17820639e-01], 105 
| [ 6.10554058e-01, 3.51621644e-01, 1.21377673e-01], 106 | [ 6.12387926e-01, 3.55724921e-01, 1.24980008e-01], 107 | [ 6.14203480e-01, 3.59829787e-01, 1.28626629e-01], 108 | [ 6.16001408e-01, 3.63935903e-01, 1.32316392e-01], 109 | [ 6.17781818e-01, 3.68043347e-01, 1.36048549e-01], 110 | [ 6.19545070e-01, 3.72152022e-01, 1.39822260e-01], 111 | [ 6.21291540e-01, 3.76261820e-01, 1.43636721e-01], 112 | [ 6.23021600e-01, 3.80372642e-01, 1.47491171e-01], 113 | [ 6.24735623e-01, 3.84484392e-01, 1.51384889e-01], 114 | [ 6.26433979e-01, 3.88596978e-01, 1.55317195e-01], 115 | [ 6.28117036e-01, 3.92710314e-01, 1.59287442e-01], 116 | [ 6.29785163e-01, 3.96824318e-01, 1.63295020e-01], 117 | [ 6.31438727e-01, 4.00938911e-01, 1.67339345e-01], 118 | [ 6.33078094e-01, 4.05054018e-01, 1.71419867e-01], 119 | [ 6.34703629e-01, 4.09169567e-01, 1.75536060e-01], 120 | [ 6.36315954e-01, 4.13285331e-01, 1.79687212e-01], 121 | [ 6.37915317e-01, 4.17401322e-01, 1.83872936e-01], 122 | [ 6.39501936e-01, 4.21517570e-01, 1.88092897e-01], 123 | [ 6.41076174e-01, 4.25634014e-01, 1.92346660e-01], 124 | [ 6.42638397e-01, 4.29750599e-01, 1.96633813e-01], 125 | [ 6.44188969e-01, 4.33867270e-01, 2.00953958e-01], 126 | [ 6.45728257e-01, 4.37983973e-01, 2.05306713e-01], 127 | [ 6.47256627e-01, 4.42100659e-01, 2.09691714e-01], 128 | [ 6.48774449e-01, 4.46217278e-01, 2.14108606e-01], 129 | [ 6.50282093e-01, 4.50333782e-01, 2.18557051e-01], 130 | [ 6.51780194e-01, 4.54449979e-01, 2.23036476e-01], 131 | [ 6.53268943e-01, 4.58565931e-01, 2.27546728e-01], 132 | [ 6.54748608e-01, 4.62681653e-01, 2.32087599e-01], 133 | [ 6.56219567e-01, 4.66797103e-01, 2.36658794e-01], 134 | [ 6.57682196e-01, 4.70912243e-01, 2.41260026e-01], 135 | [ 6.59136877e-01, 4.75027032e-01, 2.45891017e-01], 136 | [ 6.60583993e-01, 4.79141432e-01, 2.50551496e-01], 137 | [ 6.62023931e-01, 4.83255406e-01, 2.55241197e-01], 138 | [ 6.63457080e-01, 4.87368916e-01, 2.59959863e-01], 139 | [ 6.64884050e-01, 4.91481816e-01, 2.64707020e-01], 140 | [ 6.66305021e-01, 4.95594182e-01, 2.69482635e-01], 141 | [ 6.67720349e-01, 4.99706002e-01, 2.74286507e-01], 142 | [ 6.69130435e-01, 5.03817240e-01, 2.79118400e-01], 143 | [ 6.70535682e-01, 5.07927863e-01, 2.83978083e-01], 144 | [ 6.71936499e-01, 5.12037840e-01, 2.88865326e-01], 145 | [ 6.73333295e-01, 5.16147137e-01, 2.93779904e-01], 146 | [ 6.74726487e-01, 5.20255723e-01, 2.98721596e-01], 147 | [ 6.76116551e-01, 5.24363538e-01, 3.03690119e-01], 148 | [ 6.77503877e-01, 5.28470568e-01, 3.08685290e-01], 149 | [ 6.78888821e-01, 5.32576816e-01, 3.13706971e-01], 150 | [ 6.80271812e-01, 5.36682249e-01, 3.18754950e-01], 151 | [ 6.81653287e-01, 5.40786839e-01, 3.23829018e-01], 152 | [ 6.83033683e-01, 5.44890554e-01, 3.28928966e-01], 153 | [ 6.84413445e-01, 5.48993365e-01, 3.34054587e-01], 154 | [ 6.85793019e-01, 5.53095241e-01, 3.39205675e-01], 155 | [ 6.87172842e-01, 5.57196161e-01, 3.44382044e-01], 156 | [ 6.88553325e-01, 5.61296115e-01, 3.49583542e-01], 157 | [ 6.89934944e-01, 5.65395067e-01, 3.54809950e-01], 158 | [ 6.91318165e-01, 5.69492986e-01, 3.60061066e-01], 159 | [ 6.92703460e-01, 5.73589844e-01, 3.65336688e-01], 160 | [ 6.94091305e-01, 5.77685610e-01, 3.70636614e-01], 161 | [ 6.95482183e-01, 5.81780256e-01, 3.75960642e-01], 162 | [ 6.96876579e-01, 5.85873751e-01, 3.81308568e-01], 163 | [ 6.98274958e-01, 5.89966078e-01, 3.86680224e-01], 164 | [ 6.99677633e-01, 5.94057281e-01, 3.92075624e-01], 165 | [ 7.01085271e-01, 5.98147261e-01, 3.97494373e-01], 166 | [ 7.02498381e-01, 6.02235988e-01, 4.02936266e-01], 167 | [ 7.03917477e-01, 
6.06323433e-01, 4.08401097e-01], 168 | [ 7.05343079e-01, 6.10409565e-01, 4.13888660e-01], 169 | [ 7.06775712e-01, 6.14494353e-01, 4.19398747e-01], 170 | [ 7.08215906e-01, 6.18577766e-01, 4.24931150e-01], 171 | [ 7.09664198e-01, 6.22659774e-01, 4.30485657e-01], 172 | [ 7.11120918e-01, 6.26740423e-01, 4.36062323e-01], 173 | [ 7.12586632e-01, 6.30819674e-01, 4.41660917e-01], 174 | [ 7.14062051e-01, 6.34897436e-01, 4.47281028e-01], 175 | [ 7.15547738e-01, 6.38973677e-01, 4.52922438e-01], 176 | [ 7.17044260e-01, 6.43048362e-01, 4.58584925e-01], 177 | [ 7.18552191e-01, 6.47121460e-01, 4.64268267e-01], 178 | [ 7.20072113e-01, 6.51192935e-01, 4.69972235e-01], 179 | [ 7.21604612e-01, 6.55262754e-01, 4.75696601e-01], 180 | [ 7.23150282e-01, 6.59330882e-01, 4.81441132e-01], 181 | [ 7.24709638e-01, 6.63397313e-01, 4.87205706e-01], 182 | [ 7.26282965e-01, 6.67462115e-01, 4.92990519e-01], 183 | [ 7.27871260e-01, 6.71525124e-01, 4.98794821e-01], 184 | [ 7.29475145e-01, 6.75586301e-01, 5.04618361e-01], 185 | [ 7.31095251e-01, 6.79645609e-01, 5.10460886e-01], 186 | [ 7.32732213e-01, 6.83703007e-01, 5.16322138e-01], 187 | [ 7.34386675e-01, 6.87758456e-01, 5.22201851e-01], 188 | [ 7.36059289e-01, 6.91811916e-01, 5.28099756e-01], 189 | [ 7.37750713e-01, 6.95863345e-01, 5.34015578e-01], 190 | [ 7.39461613e-01, 6.99912703e-01, 5.39949035e-01], 191 | [ 7.41192662e-01, 7.03959945e-01, 5.45899838e-01], 192 | [ 7.42944541e-01, 7.08005028e-01, 5.51867692e-01], 193 | [ 7.44717602e-01, 7.12048006e-01, 5.57852798e-01], 194 | [ 7.46512751e-01, 7.16088771e-01, 5.63854543e-01], 195 | [ 7.48330822e-01, 7.20127236e-01, 5.69872417e-01], 196 | [ 7.50172531e-01, 7.24163354e-01, 5.75906084e-01], 197 | [ 7.52038600e-01, 7.28197076e-01, 5.81955204e-01], 198 | [ 7.53929761e-01, 7.32228351e-01, 5.88019423e-01], 199 | [ 7.55846753e-01, 7.36257131e-01, 5.94098377e-01], 200 | [ 7.57790321e-01, 7.40283361e-01, 6.00191688e-01], 201 | [ 7.59761221e-01, 7.44306992e-01, 6.06298969e-01], 202 | [ 7.61760216e-01, 7.48327968e-01, 6.12419817e-01], 203 | [ 7.63788076e-01, 7.52346237e-01, 6.18553814e-01], 204 | [ 7.65845580e-01, 7.56361743e-01, 6.24700530e-01], 205 | [ 7.67933513e-01, 7.60374432e-01, 6.30859514e-01], 206 | [ 7.70052669e-01, 7.64384247e-01, 6.37030303e-01], 207 | [ 7.72203849e-01, 7.68391133e-01, 6.43212411e-01], 208 | [ 7.74387858e-01, 7.72395033e-01, 6.49405338e-01], 209 | [ 7.76605512e-01, 7.76395891e-01, 6.55608559e-01], 210 | [ 7.78857629e-01, 7.80393651e-01, 6.61821529e-01], 211 | [ 7.81145034e-01, 7.84388257e-01, 6.68043683e-01], 212 | [ 7.83468558e-01, 7.88379654e-01, 6.74274427e-01], 213 | [ 7.85829033e-01, 7.92367789e-01, 6.80513147e-01], 214 | [ 7.88227297e-01, 7.96352608e-01, 6.86759199e-01], 215 | [ 7.90664188e-01, 8.00334062e-01, 6.93011914e-01], 216 | [ 7.93140546e-01, 8.04312100e-01, 6.99270593e-01], 217 | [ 7.95657211e-01, 8.08286679e-01, 7.05534505e-01], 218 | [ 7.98215019e-01, 8.12257756e-01, 7.11802892e-01], 219 | [ 8.00814805e-01, 8.16225293e-01, 7.18074960e-01], 220 | [ 8.03457395e-01, 8.20189256e-01, 7.24349883e-01], 221 | [ 8.06143611e-01, 8.24149618e-01, 7.30626801e-01], 222 | [ 8.08874262e-01, 8.28106357e-01, 7.36904818e-01], 223 | [ 8.11650145e-01, 8.32059461e-01, 7.43183002e-01], 224 | [ 8.14472043e-01, 8.36008923e-01, 7.49460385e-01], 225 | [ 8.17340716e-01, 8.39954747e-01, 7.55735965e-01], 226 | [ 8.20256906e-01, 8.43896948e-01, 7.62008700e-01], 227 | [ 8.23221044e-01, 8.47835583e-01, 7.68278333e-01], 228 | [ 8.26233704e-01, 8.51770691e-01, 7.74544189e-01], 229 | [ 8.29295924e-01, 8.55702279e-01, 
7.80804058e-01], 230 | [ 8.32408327e-01, 8.59630412e-01, 7.87056769e-01], 231 | [ 8.35571491e-01, 8.63555173e-01, 7.93301118e-01], 232 | [ 8.38785950e-01, 8.67476662e-01, 7.99535878e-01], 233 | [ 8.42052188e-01, 8.71395001e-01, 8.05759795e-01], 234 | [ 8.45370193e-01, 8.75310323e-01, 8.11973366e-01], 235 | [ 8.48740412e-01, 8.79222758e-01, 8.18175322e-01], 236 | [ 8.52163579e-01, 8.83132479e-01, 8.24362799e-01], 237 | [ 8.55639953e-01, 8.87039686e-01, 8.30534495e-01], 238 | [ 8.59169726e-01, 8.90944601e-01, 8.36689115e-01], 239 | [ 8.62752326e-01, 8.94847329e-01, 8.42829467e-01], 240 | [ 8.66388535e-01, 8.98748189e-01, 8.48950924e-01], 241 | [ 8.70078463e-01, 9.02647473e-01, 8.55051709e-01], 242 | [ 8.73821857e-01, 9.06545384e-01, 8.61132327e-01], 243 | [ 8.77618473e-01, 9.10442054e-01, 8.67193908e-01], 244 | [ 8.81468712e-01, 9.14337976e-01, 8.73231456e-01], 245 | [ 8.85372253e-01, 9.18233284e-01, 8.79246333e-01], 246 | [ 8.89328979e-01, 9.22128095e-01, 8.85239267e-01], 247 | [ 8.93339074e-01, 9.26022993e-01, 8.91205135e-01], 248 | [ 8.97402366e-01, 9.29917617e-01, 8.97149605e-01], 249 | [ 9.01519093e-01, 9.33812678e-01, 9.03066039e-01], 250 | [ 9.05689518e-01, 9.37707793e-01, 9.08958623e-01], 251 | [ 9.09913938e-01, 9.41603448e-01, 9.14822730e-01], 252 | [ 9.14193252e-01, 9.45499026e-01, 9.20662416e-01], 253 | [ 9.18527956e-01, 9.49395047e-01, 9.26471928e-01], 254 | [ 9.22919852e-01, 9.53290490e-01, 9.32255874e-01], 255 | [ 9.27370222e-01, 9.57185424e-01, 9.38009990e-01], 256 | [ 9.31881046e-01, 9.61079371e-01, 9.43732754e-01], 257 | [ 9.36455353e-01, 9.64971274e-01, 9.49424410e-01], 258 | [ 9.41096680e-01, 9.68860078e-01, 9.55083193e-01], 259 | [ 9.45808721e-01, 9.72744936e-01, 9.60704680e-01], 260 | [ 9.50596398e-01, 9.76624444e-01, 9.66285224e-01], 261 | [ 9.55465459e-01, 9.80497024e-01, 9.71819696e-01], 262 | [ 9.60422407e-01, 9.84360987e-01, 9.77301129e-01], 263 | [ 9.65474320e-01, 9.88214632e-01, 9.82720362e-01], 264 | [ 9.70629792e-01, 9.92055757e-01, 9.88067257e-01], 265 | [ 9.75895168e-01, 9.95883564e-01, 9.93326172e-01], 266 | [ 9.81275361e-01, 9.99697973e-01, 9.98479623e-01]] 267 | 268 | cm_data = [x for i,x in enumerate(cm_data) for _ in range((len(cm_data)-i))] 269 | 270 | test_cm = ListedColormap(cm_data, name=__file__) 271 | 272 | 273 | if __name__ == "__main__": 274 | import matplotlib.pyplot as plt 275 | import numpy as np 276 | 277 | try: 278 | from viscm import viscm 279 | viscm(test_cm) 280 | except ImportError: 281 | print("viscm not found, falling back on simple display") 282 | plt.imshow(np.linspace(0, 100, 256)[None, :], aspect='auto', 283 | cmap=test_cm) 284 | plt.show() 285 | -------------------------------------------------------------------------------- /generate_trade_helper.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import os 3 | import leadsheet 4 | import argparse 5 | import pickle 6 | import numpy as np 7 | import lscat 8 | 9 | def main(filedir): 10 | files = [] 11 | with open(os.path.join(filedir,'generated_sources.txt'),'r') as f: 12 | for i,line in enumerate(f): 13 | files.append(line.split(":")[0]) 14 | files.append(os.path.join(filedir,'generated_{}.ls'.format(i))) 15 | lscat.main(files, output=os.path.join(filedir,'generated_trades.ls'), verbose=False) 16 | 17 | parser = argparse.ArgumentParser(description='Helper to concatenate trades into single leadsheet') 18 | parser.add_argument('filedir', help='Directory to process') 19 | 20 | if __name__ == '__main__': 21 | args = 
parser.parse_args()
22 |     main(**vars(args))
23 |
--------------------------------------------------------------------------------
/input_parts/__init__.py:
--------------------------------------------------------------------------------
1 | from .base_input_part import InputPart
2 | from .beat_part import BeatInputPart
3 | from .abs_position_part import PositionInputPart
4 | from .chord_part import ChordShiftInputPart
5 | from .passthrough_part import PassthroughInputPart
--------------------------------------------------------------------------------
/input_parts/abs_position_part.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import theano
3 | import theano.tensor as T
4 |
5 | import constants
6 |
7 | from .base_input_part import InputPart
8 |
9 | class PositionInputPart( InputPart ):
10 |     """
11 |     An input part that constructs a position
12 |     """
13 |
14 |     def __init__(self, low_bound, up_bound, num_divisions):
15 |         """
16 |         Build an input part with num_divisions ranging between low_bound and up_bound.
17 |         This part will activate each division depending on how close the relative_position
18 |         is to it.
19 |         """
20 |         assert num_divisions >= 2, "Must have at least 2 divisions!"
21 |         self.low_bound = low_bound
22 |         self.up_bound = up_bound
23 |         self.PART_WIDTH = num_divisions
24 |
25 |     def generate(self, relative_position, **kwargs):
26 |         """
27 |         Generate a position input for a given timestep.
28 |
29 |         Parameters:
30 |             relative_position: A theano tensor (int32) of shape (n_parallel), giving the
31 |                 current relative position for this timestep
32 |
33 |         Returns:
34 |             piece: A theano tensor (float32) of shape (n_parallel, PART_WIDTH)
35 |         """
36 |         delta = (self.up_bound-self.low_bound) / (self.PART_WIDTH-1)
37 |         indicator_pos = np.array([[i*delta + self.low_bound for i in range(self.PART_WIDTH)]], np.float32)
38 |
39 |         # differences[i][j] is the difference between relative_position[i] and indicator_pos[j]
40 |         differences = T.cast(T.shape_padright(relative_position),'float32') - indicator_pos
41 |
42 |         # We want each indicator to activate at its division point, and fall off linearly around it,
43 |         # capped from 0 to 1.
44 |         activities = T.maximum(0, 1-abs(differences/delta))
45 |
46 |         # activities = theano.printing.Print("PositionInputPart")(activities)
47 |         # activities = T.opt.Assert()(activities, T.eq(activities.shape[1], self.PART_WIDTH))
48 |
49 |         return activities
50 |
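As an illustrative aside (not part of the repository): the triangular activation in PositionInputPart.generate can be checked numerically with plain numpy. The bounds and division count below are invented values for illustration.

```
import numpy as np

# Hypothetical parameters: low_bound=0, up_bound=8, num_divisions=5.
low, up, n = 0.0, 8.0, 5
delta = (up - low) / (n - 1)               # spacing between division points
indicators = low + delta * np.arange(n)    # division points: [0, 2, 4, 6, 8]
pos = 3.0                                  # a position halfway between two points
activities = np.maximum(0.0, 1.0 - np.abs((pos - indicators) / delta))
print(activities)                          # [0.  0.5 0.5 0.  0. ]
```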
--------------------------------------------------------------------------------
/input_parts/base_input_part.py:
--------------------------------------------------------------------------------
1 |
2 | class InputPart( object ):
3 |     """
4 |     Base class for input parts
5 |     """
6 |
7 |     PART_WIDTH = 0
8 |
9 |     def generate(self, **kwargs):
10 |         """
11 |         Generate the appropriate input.
12 |
13 |         Parameters:
14 |             **kwargs: Depending on the particular class, may take in different values.
15 |                 To allow flexibility, all subclasses should ignore any kwargs that they do not need,
16 |                 so this method can be called with all relevant parameters.
17 |
18 |         Returns:
19 |             part: A theano tensor (float32) of shape (n_parallel, PART_WIDTH), where n_parallel
20 |                 is either an explicit parameter or determined by shapes of input
21 |         """
22 |         raise NotImplementedError("generate not implemented")
23 |
--------------------------------------------------------------------------------
/input_parts/beat_part.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import theano
3 | import theano.tensor as T
4 |
5 | import constants
6 |
7 | from .base_input_part import InputPart
8 |
9 | class BeatInputPart( InputPart ):
10 |     """
11 |     An input part that builds a beat
12 |     """
13 |
14 |     BEAT_PERIODS = np.array([x//constants.RESOLUTION_SCALAR for x in [
15 |         constants.WHOLE,
16 |         constants.HALF,
17 |         constants.QUARTER,
18 |         constants.EIGHTH,
19 |         constants.SIXTEENTH,
20 |         constants.HALF_TRIPLET,
21 |         constants.QUARTER_TRIPLET,
22 |         constants.EIGHTH_TRIPLET,
23 |         constants.SIXTEENTH_TRIPLET,
24 |     ]], np.int32)
25 |     PART_WIDTH = len(BEAT_PERIODS)
26 |
27 |     def generate(self, timestep, **kwargs):
28 |         """
29 |         Generate a beat input for a given timestep.
30 |
31 |         Parameters:
32 |             timestep: A theano int of shape (n_parallel). The current timestep to generate the beat input for.
33 |
34 |         Returns:
35 |             piece: A theano tensor (float32) of shape (n_parallel, PART_WIDTH)
36 |         """
37 |
38 |         result = T.eq(T.shape_padright(timestep) % np.expand_dims(self.BEAT_PERIODS,0), 0)
39 |
40 |         # result = theano.printing.Print("BeatInputPart")(result)
41 |         # result = T.opt.Assert()(result, T.eq(result.shape[1], self.PART_WIDTH))
42 |         return result
43 |
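Another illustrative aside (not part of the repository): given the values defined in constants.py (WHOLE = 480 slots, RESOLUTION_SCALAR = 10), BEAT_PERIODS works out to [48, 24, 12, 6, 3, 16, 8, 4, 2] timesteps, and the generated beat input simply marks which of these periods divide the current timestep.

```
import numpy as np

# BEAT_PERIODS as computed from constants.py (WHOLE=480, RESOLUTION_SCALAR=10):
beat_periods = np.array([48, 24, 12, 6, 3, 16, 8, 4, 2])
t = 12  # an arbitrary example timestep: the start of a quarter-note beat
print((t % beat_periods == 0).astype(np.int32))  # [0 0 1 1 1 0 0 1 1]
```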
--------------------------------------------------------------------------------
/input_parts/chord_part.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import theano
3 | import theano.tensor as T
4 |
5 | import constants
6 |
7 | from .base_input_part import InputPart
8 |
9 | class ChordShiftInputPart( InputPart ):
10 |     """
11 |     An input part that builds a shifted chord representation
12 |     """
13 |
14 |     CHORD_WIDTH = 12
15 |     PART_WIDTH = 12
16 |
17 |     def generate(self, relative_position, cur_chord_root, cur_chord_type, **kwargs):
18 |         """
19 |         Generate a chord input for a given timestep.
20 |
21 |         Parameters:
22 |             relative_position: A theano tensor (int32) of shape (n_parallel), giving the
23 |                 current relative position for this timestep
24 |             cur_chord_root: A theano tensor (int32) of shape (n_parallel) giving the unshifted chord root
25 |             cur_chord_type: A theano tensor (int32) of shape (n_parallel, CHORD_WIDTH), giving the unshifted chord
26 |                 type representation, parsed from the leadsheet
27 |
28 |         Returns:
29 |             piece: A theano tensor (float32) of shape (n_parallel, PART_WIDTH)
30 |         """
31 |         def _map_fn(pos, chord):
32 |             # Now pos is scalar and chord is of shape (CHORD_WIDTH), so we can roll
33 |             return T.roll(chord, (-pos)%12, 0)
34 |
35 |         shifted_chords, _ = theano.map(_map_fn, sequences=[relative_position-cur_chord_root, cur_chord_type])
36 |
37 |         # shifted_chords = theano.printing.Print("ChordShiftInputPart")(shifted_chords)
38 |         # shifted_chords = T.opt.Assert()(shifted_chords, T.eq(shifted_chords.shape[1], self.PART_WIDTH))
39 |         return shifted_chords
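A quick numeric illustration of the shifting logic above (values invented, not part of the repository): rolling the chord vector by (-pos) % 12 re-expresses the chord tones relative to the current melody position rather than the chord root.

```
import numpy as np

# The C7 template from constants.py: chord tones 0, 4, 7, 10 semitones above the root.
c7 = np.array([1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0])
pos_minus_root = 4  # e.g. the melody sits a major third above the chord root
print(np.roll(c7, (-pos_minus_root) % 12))
# [1 0 0 1 0 0 1 0 1 0 0 0]: chord tones now lie 0, 3, 6, 8 semitones above the melody note
```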
--------------------------------------------------------------------------------
/input_parts/passthrough_part.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import theano
3 | import theano.tensor as T
4 |
5 | import constants
6 |
7 | from .base_input_part import InputPart
8 |
9 | class PassthroughInputPart( InputPart ):
10 |     """
11 |     An input part that passes through one of its parameters unchanged
12 |     """
13 |
14 |     def __init__(self, keyword, width):
15 |         """
16 |         Initialize the input part to passthrough the input given by keyword
17 |         """
18 |         self.keyword = keyword
19 |         self.PART_WIDTH = width
20 |
21 |     def __repr__(self):
22 |         return '<PassthroughInputPart {}>'.format(self.keyword)
23 |
24 |     def generate(self, **kwargs):
25 |         """
26 |         Pass through the configured keyword input for a given timestep.
27 |
28 |         Parameters:
29 |             kwargs[keyword]: A theano tensor (float32) of shape (n_parallel, PART_WIDTH).
30 |                 This method will extract this keyword argument and return it unchanged
31 |
32 |         Returns:
33 |             piece: A theano tensor (float32) of shape (n_parallel, PART_WIDTH)
34 |         """
35 |         result = kwargs[self.keyword]
36 |         # result = theano.printing.Print("PassthroughInputPart")(result)
37 |         # result = T.opt.Assert()(result, T.eq(result.shape[1], self.PART_WIDTH))
38 |         return result
39 |
--------------------------------------------------------------------------------
/instructions/generate_trade_helper.md:
--------------------------------------------------------------------------------
1 | # generate_trade_helper.py: Generate a trading summary leadsheet
2 |
3 | ```
4 | usage: generate_trade_helper.py [-h] filedir
5 |
6 | Helper to concatenate trades into single leadsheet
7 |
8 | positional arguments:
9 |   filedir     Directory to process
10 |
11 | optional arguments:
12 |   -h, --help  show this help message and exit
13 | ```
14 |
15 | To use this script, pass the path to a generation output directory produced by `main.py`. This script will read the generated files and produce a new file `generated_trades.ls` which alternates between the source pieces and the generated output produced by the network.
--------------------------------------------------------------------------------
/instructions/lscat.md:
--------------------------------------------------------------------------------
1 | # lscat.py: Concatenate leadsheets
2 |
3 | ```
4 | usage: lscat.py [-h] [--output OUTPUT] [--verbose] files [files ...]
5 |
6 | Concatenate leadsheet files.
7 |
8 | positional arguments:
9 |   files            Files to process
10 |
11 | optional arguments:
12 |   -h, --help       show this help message and exit
13 |   --output OUTPUT  Name of the output file
14 |   --verbose        Be verbose about processing
15 | ```
16 |
17 | To use this script, pass a list of leadsheet files. These files will then be concatenated together into a new leadsheet file. It is recommended that you also use the `--output` argument to specify an output filename; otherwise, a filename is generated based on the first concatenated file.
18 |
19 | For example, to concatenate `a.ls`, `b.ls`, and `c.ls`, you can run
20 |
21 | ```
22 | $ python3 lscat.py a.ls b.ls c.ls --output combined.ls
23 | ```
24 |
--------------------------------------------------------------------------------
/instructions/lssplit.md:
--------------------------------------------------------------------------------
1 | # lssplit.py: Split leadsheets
2 |
3 | ```
4 | usage: lssplit.py [-h] [--output OUTPUT] file split
5 |
6 | Split a leadsheet file.
7 |
8 | positional arguments:
9 |   file             File to process
10 |   split            Bars to split at
11 |
12 | optional arguments:
13 |   -h, --help       show this help message and exit
14 |   --output OUTPUT  Base name of the output files
15 | ```
16 |
17 | To use this script, pass a single leadsheet file, and a number of bars. This script will chop up the input into chunks of length `split`.
18 |
19 | For instance, to split up `large.ls` into chunks of 4 bars, use
20 | ```
21 | $ python3 lssplit.py large.ls 4 --output smaller
22 | ```
23 |
24 | which will produce files `smaller_0.ls`, `smaller_1.ls`, etc.
--------------------------------------------------------------------------------
/instructions/main.md:
--------------------------------------------------------------------------------
1 | # main.py: Train or generate from a neural network model
2 |
3 | This script allows you to actually interact with a neural network model. You can run the script with
4 |
5 | ```
6 | $ python3 main.py [general arguments] MODELTYPE [model-specific arguments]
7 | ```
8 |
9 | ## General Arguments
10 | ```
11 |   -h, --help            show this help message and exit
12 |   --dataset DATASET [DATASET ...]
13 |                         Path(s) to dataset folder (with .ls files). If
14 |                         multiple are passed, samples randomly from each
15 |                         (default: ['dataset'])
16 |   --validation VALIDATION
17 |                         Path to validation dataset folder (with .ls files)
18 |                         (default: None)
19 |   --validation_generate_ct VALIDATION_GENERATE_CT
20 |                         Number of samples to generate at each validation time.
21 |                         (default: 1)
22 |   --outputdir OUTPUTDIR
23 |                         Path to output folder (default: output)
24 |   --check_nan           Check for nans during execution (default: False)
25 |   --batch_size BATCH_SIZE
26 |                         Size of batch (default: 10)
27 |   --iterations ITERATIONS
28 |                         How many iterations to train (default: 50000)
29 |   --learning_rate LEARNING_RATE
30 |                         Learning rate for the ADAM gradient descent method
31 |                         (default: 0.0002)
32 |   --segment_len SEGMENT_LEN
33 |                         Length of segment to train on (default: 4bar)
34 |   --segment_step SEGMENT_STEP
35 |                         Period at which segments may begin (default: 1bar)
36 |   --save-params-interval TRAIN_SAVE_PARAMS
37 |                         Save parameters after this many iterations (default:
38 |                         5000)
39 |   --final-params-only   Don't save parameters while training, only at the end.
40 |                         (default: None)
41 |   --auto_connectome_keys AUTO_CONNECTOME_KEYS
42 |                         Path to keys for running param_cvt. If given, will run
43 |                         param_cvt automatically for each saved parameters
44 |                         file. (default: None)
45 |   --resume TIMESTEP PARAMFILE
46 |                         Where to restore from: timestep, and file to load
47 |                         (default: None)
48 |   --resume_auto         Automatically restore from a previous run using output
49 |                         directory (default: False)
50 |   --generate            Don't train, just generate. Should be used with
51 |                         restore. (default: False)
52 |   --generate_over SOURCE DIV_WIDTH
53 |                         Don't train, just generate, and generate over SOURCE
54 |                         chord changes divided into chunks of length DIV_WIDTH
55 |                         (or one contiguous chunk if DIV_WIDTH is 'full'). Can
56 |                         use 'bar' as a unit. Should be used with restore.
57 |                         (default: None)
58 | ```
59 |
60 | If you pass a validation folder to `--validation`, then each time it saves parameters, it will also generate samples over the pieces in the validation directory (similar to `--generate_over`). By default, it generates once over each piece in the validation folder, but this can be changed using `--validation_generate_ct`.
61 |
62 | ## Model Types
63 |
64 |
65 | ### `simple`: A simple model
66 | This is a simple class of model, not available in Impro-Visor.
67 |
68 | `simple`-specific arguments:
69 | ```
70 | usage: main.py simple [-h] [--per_note] {abs,cot,rel}
71 |
72 | positional arguments:
73 |   {abs,cot,rel}  Type of encoding to use
74 |
75 | optional arguments:
76 |   -h, --help     show this help message and exit
77 |   --per_note     Enable note memory cells
78 | ```
79 | The three types of encoding in this mode are
80 |
81 | - `abs`: An absolute encoding, where each distinct pitch has a distinct bit in the note representation
82 | - `cot`: The circles-of-thirds representation, where pitches are determined by which circles of major and minor thirds that pitch is in. This encoding was originally described by Judy A. Franklin in [Recurrent Neural Networks and Pitch Representations for Music Tasks](http://cs.smith.edu/~jfrankli/papers/FLAIRS04FranklinJ.pdf).
83 | - `rel`: An interval-relative encoding, where each note is encoded as the size and direction of the interval between this note and the previous one.
84 |
85 | ### `poex`: A product-of-experts generative model
86 | This is the type of generative model available in Impro-Visor.
87 |
88 | `poex`-specific arguments:
89 | ```
90 |   -h, --help            show this help message and exit
91 |   --per_note            Enable note memory cells
92 |   --separate_rhythm     Use a separate rhythm expert. Only works without note
93 |                         memory cells
94 |   --layer_size LAYER_SIZE [LAYER_SIZE ...]
95 |                         Layer size of the LSTMs. Either pass a single number
96 |                         to be used for all experts, or a sequence of numbers,
97 |                         one for each expert. Only works without note memory
98 |                         cells.
99 |   --num_layers NUM_LAYERS [NUM_LAYERS ...]
100 |                         Number of LSTM layers. Either pass a single number to
101 |                         be used for all experts, or a sequence of numbers, one
102 |                         for each expert. Only works without note memory cells
103 |   --skip_training_experts EXPERT_INDEX [EXPERT_INDEX ...]
104 |                         Skip training these experts
105 | ```
106 | `--per_note` enables memory cells which are fixed to particular notes and do not shift with the rest of the network. Although these were investigated as being potentially useful, they did not give a significant advantage, and were not implemented in Impro-Visor for simplicity. If you wish to train a model for Impro-Visor, do not use the `--per_note` flag.
107 |
108 | `--separate_rhythm` configures the network to delegate rhythm to a third expert, and use the other two experts for pitch choices only.
109 |
110 | The `--layer_size` and `--num_layers` parameters give the layer sizes and number of layers used for each expert. They can be passed either a single number, used for all experts, or a list of numbers, one per expert. The order of experts is interval, chord, (optionally) rhythm.
111 |
112 | Using `--skip_training_experts`, you can specify particular experts not to train. This is useful if you are resuming from an already-partially-trained parameters file. Experts are specified by index.
113 |
114 | ### `compae`: A compressing autoencoder model
115 | This is the type of trading model available in Impro-Visor.
116 |
117 | `compae`-specific arguments:
118 | ```
119 | usage: main.py [general arguments] compae ENCODING MANAGER [compae optional arguments]
120 |
121 | positional arguments:
122 |   {abs,cot,rel,poex}    Type of encoding to use
123 |   {std,var,sample_var,queueless_var,queueless_std,nearness_std}
124 |                         Type of queue manager to use
125 |
126 | optional arguments:
127 |   -h, --help            show this help message and exit
128 |   --per_note            Enable note memory cells
129 |   --hide_output         Hide previous outputs from the decoder
130 |   --sparsity_loss_scale SPARSITY_LOSS_SCALE
131 |                         How much to scale the sparsity loss by
132 |   --variational_loss_scale VARIATIONAL_LOSS_SCALE
133 |                         How much to scale the variational loss by
134 |   --feature_size FEATURE_SIZE
135 |                         Size of feature vectors
136 |   --feature_period FEATURE_PERIOD
137 |                         If in queueless mode, period of features in timesteps
138 |   --add_pre_noise [ADD_PRE_NOISE]
139 |                         Add Gaussian noise to the feature values before
140 |                         applying the activation function
141 |   --add_post_noise [ADD_POST_NOISE]
142 |                         Add Gaussian noise to the feature values after
143 |                         applying the activation function
144 |   --train_decoder_only  Only modify the decoder parameters
145 |   --layer_size LAYER_SIZE
146 |                         Layer size of the LSTMs. Only works without note
147 |                         memory cells
148 |   --num_layers NUM_LAYERS
149 |                         Number of LSTM layers. Only works without note memory
150 |                         cells
151 |   --priority_loss [LOSS_MODE_PRIORITY]
152 |                         Use priority loss scaling mode (with the specified
153 |                         curviness)
154 |   --add_loss            Use adding loss scaling mode
155 |   --cutoff_loss CUTOFF  Use cutoff loss scaling mode with the specified per-
156 |                         batch cutoff
157 |   --trigger_loss TRIGGER RAMP_TIME
158 |                         Use trigger loss scaling mode with the specified per-
159 |                         batch trigger value and desired ramp-up time
160 | ```
161 |
162 | In order to be compatible with Impro-Visor, you must use the encoding `poex` and the queue manager `queueless_std` or `queueless_var`, and you must not use the flags `--per_note` or `--hide_output`. Other modes are available for training and generation within Python, but were not implemented in Impro-Visor. Additionally, `--feature_period` should be 24, corresponding to the duration of a half note (at the default resolution, a whole note spans 48 timesteps).
163 |
164 | Other than `poex`, which matches the `poex` generative model, the other encoding modes correspond to the encodings for the `simple` model.
165 |
166 | There are a variety of queue managers available:
167 |
168 | - `std`: The standard, variable-sized feature model, where the network decides where to output features, and is encouraged to output few features.
169 | - `var`: A variational version of the variable-sized feature model, where the features are latent variables that are sampled from a normal distribution, and that distribution is regularized to be similar to a unit normal distribution.
170 | - `sample_var`: Like `var`, but instead of allowing a feature to be output with fractional strength, the network samples from the feature strength to decide where the features are, and then is trained using a variant of reinforcement learning.
171 | - `queueless_std`: A fixed-size feature model, with features repeating at a fixed interval.
172 | - `queueless_var`: A variational version of the fixed-size feature model.
173 | - `nearness_std`: A version of the variable-sized feature model where features that are close together are penalized more than features that are far away.
174 |
175 | For some queue managers, there are multiple loss values: the reconstruction loss, as well as some extra loss. The `--sparsity_loss_scale` and `--variational_loss_scale` options allow you to scale the importance of these losses, and the loss modes `--add_loss`, `--priority_loss`, `--cutoff_loss`, and `--trigger_loss` determine how the losses are balanced:
176 |
177 | - `--add_loss` simply adds the losses
178 | - `--priority_loss` attempts to scale the losses so that the largest loss is most important
179 | - `--cutoff_loss` ignores the extra loss unless the reconstruction loss is small enough
180 | - `--trigger_loss` waits until the reconstruction loss becomes small enough, and then interpolates the extra loss from having no influence to having full influence (as in `--add_loss` mode)
181 |
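The sketch below restates three of these modes as code. It is a simplification for illustration only (invented names, not the actual logic in training.py); `--priority_loss` is omitted because it reweights the losses nonlinearly according to its curviness parameter.

```
# Illustrative only: combining a reconstruction loss r with an extra loss e
# (e is assumed to already include the --*_loss_scale factor).
def combine_losses(r, e, mode, cutoff=None, ramp_frac=1.0):
    if mode == 'add':       # --add_loss: simply add the losses
        return r + e
    if mode == 'cutoff':    # --cutoff_loss CUTOFF: ignore e until r is small enough
        return r + (e if r < cutoff else 0.0)
    if mode == 'trigger':   # --trigger_loss TRIGGER RAMP_TIME: once r has dropped
        # below TRIGGER, ramp_frac rises linearly from 0 to 1 over RAMP_TIME
        # iterations, fading e in until this matches --add_loss
        return r + ramp_frac * e
    raise ValueError(mode)
```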
182 | ## Examples
183 |
184 | Train a product-of-experts generative model on a directory of leadsheet files with path `datasets/my_dataset`, automatically resuming training if previously interrupted. Each leadsheet file will be split into 4-bar chunks starting at each bar.
185 |
186 | ```
187 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset --resume_auto poex
188 | ```
189 |
190 | Train a product-of-experts generative model as before, but split each leadsheet into 8-bar chunks, starting at each multiple of 4 bars:
191 |
192 | ```
193 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset --resume_auto --segment_len 8bar --segment_step 4bar poex
194 | ```
195 |
196 | Train a product-of-experts generative model as before, but save parameters every 100 iterations, and generate samples over a validation dataset:
197 |
198 | ```
199 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset --resume_auto --save-params-interval 100 --validation datasets/validation_dataset poex
200 | ```
201 |
202 | Generate some leadsheets using a trained product-of-experts model, sampling from the dataset:
203 |
204 | ```
205 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset/generated --resume 0 output_my_dataset/final_params.p --generate poex
206 | ```
207 |
208 | As above, but generate over a particular piece `my_generate_target.ls` in 4-bar chunks:
209 |
210 | ```
211 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset/generated --resume 0 output_my_dataset/final_params.p --generate_over my_generate_target.ls 4bar poex
212 | ```
213 |
214 | As above, but generate over the whole piece in a single run of the network (without breaking it into chunks):
215 |
216 | ```
217 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset/generated --resume 0 output_my_dataset/final_params.p --generate_over my_generate_target.ls full poex
218 | ```
219 |
220 | Train a compressing autoencoder on the same dataset, using fixed features and product-of-experts for compatibility with Impro-Visor:
221 |
222 | ```
223 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset_compae --resume_auto compae poex queueless_std --feature_period 24 --add_loss
224 | ```
225 |
226 | As above, but train a variational autoencoder instead of a standard one, scaling the variational loss by 0.01 and only enforcing the variational loss after the reconstruction loss drops below 4 per sample:
227 |
228 | ```
229 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset_compae --resume_auto compae poex queueless_var --feature_period 24 --variational_loss_scale 0.01 --trigger_loss 4 2000
230 | ```
231 |
232 | As above, but with a standard autoencoder and variable-size features:
233 |
234 | ```
235 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset_compae --resume_auto compae poex std --trigger_loss 4 2000
236 | ```
237 |
238 | Run the autoencoder and generate some output:
239 |
240 | ```
241 | $ python3 main.py --dataset datasets/my_dataset --outputdir output_my_dataset_compae/generated --resume 0 output_my_dataset_compae/final_params.p --generate compae poex queueless_std --feature_period 24 --add_loss
242 | ```
243 |
244 |
--------------------------------------------------------------------------------
/instructions/param_cvt.md:
--------------------------------------------------------------------------------
1 | # param_cvt.py: Convert a Python parameters file into a connectome file
2 |
3 | ```
4 | usage: param_cvt.py [-h] --keys KEYS [--output OUTPUT]
5 |                     [--precision PRECISION] [--raw]
6 |                     file
7 |
8 | Convert a python parameters file into an Impro-Visor connectome file
9 |
10 | positional arguments:
11 |   file                  File to process
12 |
13 | optional arguments:
14 |   -h, --help            show this help message and exit
15 |   --keys KEYS           File to load parameter names from
16 |   --output OUTPUT       Base name of the output files
17 |   --precision PRECISION
18 |                         Decimal points of precision to use (default 18)
19 |   --raw                 Create individual csv files instead of a connectome
20 |                         file
21 | ```
22 |
23 | In Python, trained parameters are saved as pickle files containing a list of the model parameter matrices. To convert this to a format that Impro-Visor can read, we encode each model parameter matrix as a .csv file, and name it according to a key file, which describes the order in which the parameters appear in the list. These .csv files are zipped together and given the extension .ctome, which can be loaded by Impro-Visor. (A sketch of this layout appears after the examples below.)
24 |
25 | ## Examples
26 |
27 | Convert a product-of-experts network into a connectome file:
28 |
29 | ```
30 | $ python3 param_cvt.py --keys param_keys/poex_keys.txt output_poex/final_params.p
31 | ```
32 |
33 | Convert a compressing autoencoder network into a connectome file:
34 |
35 | ```
36 | $ python3 param_cvt.py --keys param_keys/ae_poex_keys.txt output_compae/final_params.p
37 | ```
38 |
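To make the layout concrete, here is a minimal sketch of the conversion just described. This is not the actual param_cvt.py implementation; only the format (one .csv per parameter matrix, named from the key file and zipped into a .ctome archive) is taken from the description above, and the file paths are the ones used in the examples.

```
import io
import pickle
import zipfile

import numpy as np

# Sketch only: write each parameter matrix to a named .csv inside a .ctome zip.
with open('output_poex/final_params.p', 'rb') as f:
    params = pickle.load(f)  # a list of model parameter matrices
with open('param_keys/poex_keys.txt') as f:
    names = [line.strip() for line in f if line.strip()]
with zipfile.ZipFile('output_poex/final_params.ctome', 'w') as ctome:
    for name, matrix in zip(names, params):
        buf = io.StringIO()
        # 18 digits to mirror the --precision default documented above
        np.savetxt(buf, np.atleast_2d(matrix), delimiter=',', fmt='%.18g')
        ctome.writestr(name + '.csv', buf.getvalue())
```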
You can pass this script the path to the `data.csv` file to visualize it:
16 | 
17 | ```
18 | $ python3 plot_data.py output_my_dataset/data.csv
19 | ```
--------------------------------------------------------------------------------
/instructions/plot_internal_state.md:
--------------------------------------------------------------------------------
1 | # plot_internal_state.py: Plot the internal state of a network
2 | 
3 | ```
4 | usage: plot_internal_state.py [-h] folder idx
5 | 
6 | Plot the internal state of a network
7 | 
8 | positional arguments:
9 |   folder      Directory with the generated files
10 |   idx         Zero-based index of the output to visualize
11 | 
12 | optional arguments:
13 |   -h, --help  show this help message and exit
14 | ```
15 | 
16 | Using matplotlib, this script plots the internal state of the network while generating a particular piece. To use the script, first run `main.py` with the `--generate` or `--generate_over` arguments. This will output a series of files to the desired output directory. You must then pass that directory to this utility, along with the index of the piece to view. For instance, if running main.py gives you a directory `generated_stuff`, with files
17 | 
18 | ```
19 | generated_stuff/generated_0.ls
20 | generated_stuff/generated_1.ls
21 | generated_stuff/generated_10.ls
22 | generated_stuff/generated_11.ls
23 | generated_stuff/generated_2.ls
24 | generated_stuff/generated_3.ls
25 | generated_stuff/generated_4.ls
26 | generated_stuff/generated_5.ls
27 | generated_stuff/generated_6.ls
28 | generated_stuff/generated_7.ls
29 | generated_stuff/generated_8.ls
30 | generated_stuff/generated_9.ls
31 | generated_stuff/generated_chosen.npy
32 | generated_stuff/generated_info_0.npy
33 | generated_stuff/generated_info_1.npy
34 | generated_stuff/generated_probs.npy
35 | generated_stuff/generated_sources.txt
36 | ```
37 | 
38 | and you want to view the state of the network while generating `generated_6.ls`, you can run
39 | 
40 | ```
41 | $ python3 plot_internal_state.py generated_stuff 6
42 | ```
--------------------------------------------------------------------------------
/leadsheet.py:
--------------------------------------------------------------------------------
1 | import sexpdata
2 | import re
3 | from pprint import pprint
4 | import fractions
5 | import itertools
6 | 
7 | import constants
8 | from functools import reduce
9 | 
10 | import numpy as np
11 | 
12 | def rotate(li, x):
13 |     """
14 |     Rotate list li by x spaces to the right, i.e.
15 |     rotate([1,2,3,4],1) -> [4,1,2,3]
16 |     """
17 |     return li[-x % len(li):] + li[:-x % len(li)]
18 | 
19 | def chunkwise(t, size=2):
20 |     """
21 |     Return an iterator over consecutive tuples of the given size
22 |     """
23 |     it = iter(t)
24 |     return zip(*[it]*size)
25 | 
26 | def gcd(it):
27 |     def _gcd_helper(a,b):
28 |         if a==0:
29 |             return b
30 |         else:
31 |             return _gcd_helper(b%a, a)
32 |     return reduce(_gcd_helper, it)
33 | 
34 | def repeat_print(li):
35 | 
36 |     last = None
37 |     lastct = 0
38 |     for c in li+[None]:
39 |         if c == last:
40 |             lastct += 1
41 |         else:
42 |             if last is not None:
43 |                 print(last, "*", lastct)
44 |             last = c
45 |             lastct = 1
46 | 
47 | 
48 | def parse_chord(cstr,verbose=False):
49 |     """
50 |     Given a string representation of a chord, return a tuple of
51 |     (root offset in semitones above C, binary chord-type vector of length 12).
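    For example, parse_chord("NC") returns (0, constants.CHORD_TYPES["NC"]).
    For a slash chord, the offset of the slashed bass note is returned instead,
    with the chord tones shifted to be relative to that bass.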
52 | """ 53 | if cstr == "NC": 54 | return 0, constants.CHORD_TYPES["NC"] 55 | chord_match = re.match(r"([A-G](?:#|b)?)([^/]*)(?:/(.+))?", cstr) 56 | root_note, ctype, slash_note = chord_match.groups() 57 | 58 | try: 59 | ctype_vec = constants.CHORD_TYPES[ctype] 60 | except KeyError: 61 | if(verbose): 62 | print("WARNING: Could not find chord {}, substituting NC".format(cstr)) 63 | ctype_vec = constants.CHORD_TYPES['NC'] 64 | 65 | root_offset = constants.CHORD_NOTE_OFFSETS[root_note] 66 | if slash_note is None: 67 | return root_offset, ctype_vec 68 | else: 69 | # For a slash chord, we need to add the slashed note to the chord, 70 | # and also make it the bass note 71 | slash_offset = constants.CHORD_NOTE_OFFSETS[slash_note] 72 | shifted_ctype_vec = rotate(ctype_vec, root_offset-slash_offset) 73 | shifted_ctype_vec[0] = 1 74 | return slash_offset, shifted_ctype_vec 75 | 76 | 77 | def parse_duration(durstr): 78 | accum_dur = 0 79 | 80 | parts = durstr.split("+") 81 | for part in parts: 82 | dot_match = re.match(r"([^\.]*)(\.*)", part) 83 | part = dot_match.group(1) 84 | num_dots = len(dot_match.group(2)) 85 | 86 | tupl_parts = part.split("/") 87 | if len(tupl_parts) == 1: 88 | # Not a tuplet 89 | [dur_frac_str] = tupl_parts 90 | dur_frac = int(dur_frac_str) 91 | assert constants.WHOLE % dur_frac == 0, "Bad duration {} -> {} / {}".format(durstr, constants.WHOLE, dur_frac) 92 | slots = constants.WHOLE // dur_frac 93 | else: 94 | [dur_frac_str, tuplet_str] = tupl_parts 95 | dur_frac = int(dur_frac_str) 96 | dur_tupl = int(tuplet_str) 97 | assert (constants.WHOLE * (dur_tupl-1)) % (dur_frac * dur_tupl) == 0, "Bad duration {} -> {} / {}".format(durstr, (constants.WHOLE * (dur_tupl-1)), (dur_frac * dur_tupl)) 98 | slots = constants.WHOLE * (dur_tupl-1) // (dur_frac * dur_tupl) 99 | 100 | for i in range(num_dots): 101 | assert (slots * 3) % 2 == 0, "Bad duration {} -> {} / {}".format(durstr, (slots * 3), 2) 102 | slots = slots * 3 // 2 103 | 104 | accum_dur += slots 105 | 106 | assert accum_dur % constants.RESOLUTION_SCALAR == 0, "Bad duration {}: {} not a multiple of resolution {}".format(durstr, accum_dur, constants.RESOLUTION_SCALAR) 107 | return accum_dur//constants.RESOLUTION_SCALAR 108 | 109 | def parse_note(nstr): 110 | """ 111 | Given a string representation of a note, return (midiOrNone, duration) 112 | """ 113 | note_match = re.match(r"((?:[a-g]|r)(?:[#b]?))([\+\-]*)(.*)", nstr) 114 | note = note_match.group(1) 115 | octaveshift_str = note_match.group(2) 116 | duration_str = note_match.group(3) 117 | 118 | octaveshift = sum({"+":1,"-":-1}[x] for x in octaveshift_str) 119 | if nstr[0] == 'r': 120 | midival = None 121 | else: 122 | midival = constants.MIDDLE_C_MIDI + (constants.OCTAVE * octaveshift) + constants.NOTE_OFFSETS[note] 123 | 124 | duration = parse_duration(duration_str) 125 | 126 | return (midival, duration) 127 | 128 | 129 | def parse_leadsheet(fn,verbose=False): 130 | with open(fn,'r') as f: 131 | contents = "\n".join(f.readlines()) 132 | parsed = sexpdata.loads("({})".format(contents.replace("'",""))) 133 | 134 | parts = [('default','',[])] 135 | for p in parsed: 136 | if not isinstance(p, list): 137 | parts[-1][2].append(p.value()) 138 | elif not isinstance(p[0], list) and p[0].value() == 'part': 139 | def strval(x): 140 | return x.value() if isinstance(x,sexpdata.Symbol) else str(x) 141 | part_type = next((' '.join(strval(x) for x in l[1:]) for l in p if isinstance(l,list) and l[0].value() == "type"), None) 142 | title = next((' '.join(strval(x) for x in l[1:]) for l in p if 
isinstance(l,list) and l[0].value() == "title"), '') 143 | parts.append((part_type, title, [])) 144 | 145 | chord_parts = [x for x in parts if x[0]=='chords'] 146 | if len(chord_parts) == 0: 147 | chord_parts = [x for x in parts if x[0]=='default'] 148 | assert len(chord_parts) == 1, 'Wrong number of chord parts!' 149 | 150 | chords_raw = [x for x in chord_parts[0][2] if x[0].isupper() or x in ("|", "/")] 151 | chords = [] 152 | partial_measure = [] 153 | last_chord = None 154 | for c in chords_raw: 155 | if c == "|": 156 | length_each = constants.WHOLE//(len(partial_measure)*constants.RESOLUTION_SCALAR) 157 | for chord in partial_measure: 158 | for x in range(length_each): 159 | chords.append(chord) 160 | partial_measure = [] 161 | else: 162 | if c != "/": 163 | last_chord = parse_chord(c,verbose) 164 | partial_measure.append(last_chord) 165 | 166 | melody = [] 167 | for part_type, title, part_data in parts: 168 | if part_type == 'melody': 169 | melody_raw = [x for x in part_data if x[0].islower()] 170 | melody_proc = [parse_note(x) for x in melody_raw] 171 | mlen = sum(dur for n,dur in melody_proc) 172 | if mlen < len(chords): 173 | melody_proc.append((None, len(chords)-mlen)) 174 | melody.extend(melody_proc) 175 | 176 | # print "Raw Chords: " + " ".join(chords_raw) 177 | # print "Raw Melody: " + " ".join(melody_raw) 178 | 179 | # print "Parsed chords: " 180 | # repeat_print(chords) 181 | # print "Parsed melody: " 182 | # pprint(melody) 183 | 184 | clen = len(chords) 185 | mlen = sum(dur for n,dur in melody) 186 | # Might have multiple melodies over the same chords 187 | assert mlen % clen == 0, "Notes and chords don't match in {}: {}, {}".format(fn, clen,mlen) 188 | 189 | return chords, melody 190 | 191 | def constrain_melody(melody,bounds): 192 | new_melody = [] 193 | for n,dur in melody: 194 | if n is None: 195 | new_melody.append((n,dur)) 196 | else: 197 | while n >= bounds.highbound: 198 | n -= 12 199 | while n < bounds.lowbound: 200 | n += 12 201 | new_melody.append((n,dur)) 202 | return new_melody 203 | 204 | def get_leadsheet_length(chords, melody): 205 | return sum(dur for n,dur in melody) 206 | 207 | def slice_leadsheet(chords, melody, start, end): 208 | 209 | sliced_melody_start = [] 210 | sliced_melody_full = [] 211 | 212 | timestep = 0 213 | for n,dur in melody: 214 | if start-dur < timestep <= start: 215 | sliced_melody_start.append((n,timestep+dur-start)) 216 | elif start < timestep: 217 | sliced_melody_start.append((n,dur)) 218 | timestep += dur 219 | 220 | timestep = start 221 | for n,dur in sliced_melody_start: 222 | if timestep < end-dur: 223 | sliced_melody_full.append((n,dur)) 224 | elif end-dur <= timestep < end: 225 | sliced_melody_full.append((n,end-timestep)) 226 | timestep += dur 227 | 228 | sliced_chords = [chords[i%len(chords)] for i in range(start,end)] 229 | 230 | clen = len(sliced_chords) 231 | mlen = sum(dur for n,dur in sliced_melody_full) 232 | assert clen == mlen, "clen {} and mlen {} do not match".format(clen,mlen) 233 | 234 | return sliced_chords, sliced_melody_full 235 | 236 | def write_duration(duration): 237 | """ 238 | Convert a number of slots to a duration string 239 | """ 240 | q_dir = constants.QUARTER//constants.RESOLUTION_SCALAR 241 | whole_dir = constants.WHOLE//constants.RESOLUTION_SCALAR 242 | 243 | if duration > whole_dir: 244 | # Longer than a measure 245 | return "1+{}".format(write_duration(duration - whole_dir)) 246 | elif q_dir % duration == 0: 247 | # Simple, shorter than a quarter note 248 | return { 249 | 12:"32/3", 250 | 
            6:"16/3",
251 |             4:"16",
252 |             3:"8/3",
253 |             2:"8",
254 |             1:"4"
255 |         }[ q_dir//duration ]
256 |     elif duration % q_dir == 0:
257 |         # Simple, longer than a quarter note
258 |         return {
259 |             1:"4",
260 |             2:"2",
261 |             3:"2.",
262 |             4:"1"
263 |         }[ duration//q_dir ]
264 |     elif duration > q_dir:
265 |         # Longer than a quarter note, but not evenly divisible.
266 |         # Break up long and short parts
267 |         q_parts = duration % q_dir
268 |         return "{}+{}".format(write_duration(duration-q_parts), write_duration(q_parts))
269 |     else:
270 |         # Find the shortest representation
271 |         best = None
272 |         for i in range(1, duration//2 + 1):
273 |             cur_try = "{}+{}".format(write_duration(duration-i),write_duration(i))
274 |             if best is None or len(cur_try) < len(best):
275 |                 best = cur_try
276 |         return best
277 | 
278 | def write_melody(melody):
279 |     """
280 |     Convert a melody, as a list of (midi pitch, duration) pairs, to a string
281 |     """
282 |     notes = []
283 |     for midi, dur in melody:
284 |         if midi is None:
285 |             notename = "r"
286 |             octave_adj = ""
287 |         else:
288 |             delta_from_middle = midi - constants.MIDDLE_C_MIDI
289 |             octaves = delta_from_middle // 12
290 |             pitchclass = delta_from_middle % 12
291 |             notename = list(constants.NOTE_OFFSETS.keys())[list(constants.NOTE_OFFSETS.values()).index(pitchclass)]
292 | 
293 |             if octaves < 0:
294 |                 octave_adj = "-"*(-octaves)
295 |             else:
296 |                 octave_adj = "+"*octaves
297 | 
298 |         duration_str = write_duration(dur)
299 | 
300 |         notes.append(notename + octave_adj + duration_str)
301 | 
302 |     return " ".join(notes)
303 | 
304 | def write_chords(chords):
305 |     """
306 |     Convert a list of chords to a string
307 |     """
308 |     whole_dir = constants.WHOLE//constants.RESOLUTION_SCALAR
309 | 
310 |     parts = []
311 |     for measure in chunkwise(chords, whole_dir):
312 |         partial_measure = []
313 |         last_seen = None
314 |         for chord in measure:
315 |             if chord == last_seen:
316 |                 partial_measure[-1][1] += 1
317 |             else:
318 |                 last_seen = chord
319 |                 root,ctype = chord
320 |                 if ctype == constants.CHORD_TYPES["NC"]:
321 |                     chord_str = "NC"
322 |                 else:
323 |                     if ctype in list(constants.CHORD_TYPES.values()):
324 |                         t_idx = list(constants.CHORD_TYPES.values()).index(ctype)
325 |                         ctype_s = list(constants.CHORD_TYPES.keys())[t_idx]
326 | 
327 |                         r_idx = list(constants.CHORD_NOTE_OFFSETS.values()).index(root)
328 |                         root_s = list(constants.CHORD_NOTE_OFFSETS.keys())[r_idx]
329 | 
330 |                         chord_str = root_s + ctype_s
331 |                     else:
332 |                         # Try slash chords: "root" is bass, look for true root
333 |                         bass = root
334 |                         mod_ctype = [0] + ctype[1:]
335 |                         for offset in range(1,12):
336 |                             true_root = (bass + offset) % 12
337 |                             shifted_chord = rotate(ctype, -offset)
338 |                             mod_shifted_chord = rotate(mod_ctype, -offset)
339 |                             if shifted_chord in list(constants.CHORD_TYPES.values()):
340 |                                 t_idx = list(constants.CHORD_TYPES.values()).index(shifted_chord)
341 |                             elif mod_shifted_chord in list(constants.CHORD_TYPES.values()):
342 |                                 t_idx = list(constants.CHORD_TYPES.values()).index(mod_shifted_chord)
343 |                             else:
344 |                                 continue
345 | 
346 |                             ctype_s = list(constants.CHORD_TYPES.keys())[t_idx]
347 | 
348 |                             r_idx = list(constants.CHORD_NOTE_OFFSETS.values()).index(true_root)
349 |                             root_s = list(constants.CHORD_NOTE_OFFSETS.keys())[r_idx]
350 | 
351 |                             slash_idx = list(constants.CHORD_NOTE_OFFSETS.values()).index(bass)
352 |                             slash_s = list(constants.CHORD_NOTE_OFFSETS.keys())[slash_idx]
353 | 
354 |                             chord_str = root_s + ctype_s + '/' + slash_s
355 |                             break
356 |                         else:
357 |                             print("Not a valid chord!")
358 |                             chord_str = "NC"
359 | 
360 |                 partial_measure.append([chord_str, 1])
361 | 
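        # Reduce each chord's repeat count by the gcd of all counts in the
        # measure, so that the measure is written with the fewest possible
        # symbols; the parser divides a measure's length evenly among them.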
362 | divisor = gcd(x[1] for x in partial_measure) 363 | for chord_str, ct in partial_measure: 364 | for _ in range(ct//divisor): 365 | parts.append(chord_str) 366 | parts.append("|") 367 | 368 | return " ".join(parts) 369 | 370 | def write_leadsheet(chords, melody, filename=None): 371 | """ 372 | Convert chords and a melody to a leadsheet file 373 | """ 374 | full_leadsheet = """ 375 | (section (style swing)) 376 | 377 | (part (type chords)) 378 | {} 379 | (part (type melody)) 380 | {} 381 | """.format(write_chords(chords), write_melody(melody)) 382 | 383 | if filename is not None: 384 | with open(filename,'w') as f: 385 | f.write(full_leadsheet) 386 | else: 387 | return full_leadsheet 388 | 389 | -------------------------------------------------------------------------------- /lscat.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import leadsheet 3 | import argparse 4 | 5 | def main(files, output=None, verbose=False): 6 | melody = [] 7 | chords = [] 8 | for f in files: 9 | if verbose: 10 | print(f) 11 | nc,nm = leadsheet.parse_leadsheet(f, verbose) 12 | melody.extend(nm) 13 | chords.extend(nc) 14 | if output is None: 15 | output = files[0] + "-cat.ls" 16 | leadsheet.write_leadsheet(chords, melody, output) 17 | 18 | parser = argparse.ArgumentParser(description='Concatenate leadsheet files.') 19 | parser.add_argument('files', nargs='+', help='Files to process') 20 | parser.add_argument('--output', help='Name of the output file') 21 | parser.add_argument('--verbose', action="store_true", help='Be verbose about processing') 22 | 23 | if __name__ == '__main__': 24 | args = parser.parse_args() 25 | main(**vars(args)) -------------------------------------------------------------------------------- /lssplit.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import leadsheet 3 | import argparse 4 | import constants 5 | import os 6 | 7 | def main(file, split, output=None): 8 | c,m = leadsheet.parse_leadsheet(file) 9 | lslen = leadsheet.get_leadsheet_length(c,m) 10 | divwidth = int(split) * (constants.WHOLE//constants.RESOLUTION_SCALAR) 11 | slices = [leadsheet.slice_leadsheet(c,m,s,s+divwidth) for s in range(0,lslen,divwidth)] 12 | if output is None: 13 | output = file + "-split" 14 | for i, (chords, melody) in enumerate(slices): 15 | leadsheet.write_leadsheet(chords, melody, '{}_{}.ls'.format(output,i)) 16 | 17 | parser = argparse.ArgumentParser(description='Split a leadsheet file.') 18 | parser.add_argument('file',help='File to process') 19 | parser.add_argument('split',help='Bars to split at') 20 | parser.add_argument('--output', help='Base name of the output files') 21 | 22 | if __name__ == '__main__': 23 | args = parser.parse_args() 24 | main(**vars(args)) -------------------------------------------------------------------------------- /main.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import time 3 | import sys 4 | import os 5 | import collections 6 | 7 | from models import SimpleModel, ProductOfExpertsModel, CompressiveAutoencoderModel 8 | from note_encodings import AbsoluteSequentialEncoding, RelativeJumpEncoding, ChordRelativeEncoding, CircleOfThirdsEncoding, RhythmOnlyEncoding 9 | from queue_managers import StandardQueueManager, VariationalQueueManager, SamplingVariationalQueueManager, QueuelessVariationalQueueManager, QueuelessStandardQueueManager, NearnessStandardQueueManager, NoiseWrapper 10 | import input_parts 11 
| import leadsheet 12 | import training 13 | import pickle 14 | import theano 15 | import theano.tensor as T 16 | 17 | import numpy as np 18 | import constants 19 | from util import sliceMaker 20 | 21 | ModelBuilder = collections.namedtuple('ModelBuilder',['name', 'build', 'config_args', 'desc']) 22 | builders = {} 23 | 24 | def build_simple(should_setup, check_nan, unroll_batch_num, encode_key, no_per_note): 25 | if encode_key == "abs": 26 | enc = AbsoluteSequentialEncoding(constants.BOUNDS.lowbound, constants.BOUNDS.highbound) 27 | inputs = [input_parts.BeatInputPart(),input_parts.ChordShiftInputPart()] 28 | elif encode_key == "cot": 29 | enc = CircleOfThirdsEncoding(constants.BOUNDS.lowbound, (constants.BOUNDS.highbound-constants.BOUNDS.lowbound)//12) 30 | inputs = [input_parts.BeatInputPart(),input_parts.ChordShiftInputPart()] 31 | elif encode_key == "rel": 32 | enc = RelativeJumpEncoding() 33 | inputs = None 34 | sizes = [(200,10),(200,10)] if (encode_key == "rel" and not no_per_note) else [(300,0),(300,0)] 35 | bounds = constants.NoteBounds(48, 84) if encode_key == "cot" else constants.BOUNDS 36 | return SimpleModel(enc, sizes, bounds=bounds, inputs=inputs, dropout=0.5, setup=should_setup, nanguard=check_nan, unroll_batch_num=unroll_batch_num) 37 | 38 | def config_simple(parser): 39 | parser.add_argument('encode_key', choices=["abs","cot","rel"], help='Type of encoding to use') 40 | parser.add_argument('--per_note', dest="no_per_note", action="store_false", help='Enable note memory cells') 41 | 42 | builders['simple'] = ModelBuilder('simple', build_simple, config_simple, 'A simple single-LSTM-stack sequential model') 43 | 44 | ####################### 45 | 46 | def build_poex(should_setup, check_nan, unroll_batch_num, no_per_note, layer_size, num_layers, separate_rhythm, skip_training_experts): 47 | encs = [RelativeJumpEncoding(), ChordRelativeEncoding()] 48 | if separate_rhythm: 49 | encs = [RelativeJumpEncoding(with_artic=False), ChordRelativeEncoding(with_artic=False), RhythmOnlyEncoding()] 50 | shift_modes = ["drop","roll","drop"] 51 | else: 52 | encs = [RelativeJumpEncoding(), ChordRelativeEncoding()] 53 | shift_modes = ["drop","roll"] 54 | 55 | if no_per_note: 56 | if len(layer_size) == 1: 57 | layer_size = layer_size*len(encs) 58 | assert len(layer_size) == len(encs) 59 | if len(num_layers) == 1: 60 | num_layers = num_layers*len(encs) 61 | assert len(num_layers) == len(encs) 62 | 63 | sizes = [[(ls,0)]*nl for ls, nl in zip(layer_size, num_layers)] 64 | else: 65 | sizes = [[(200,10),(200,10)]]*len(encs) 66 | 67 | return ProductOfExpertsModel(encs, sizes, shift_modes=shift_modes, 68 | dropout=0.5, setup=should_setup, nanguard=check_nan, unroll_batch_num=unroll_batch_num, 69 | normalize_artic_only=separate_rhythm, skip_training_experts=skip_training_experts) 70 | 71 | def config_poex(parser): 72 | parser.add_argument('--per_note', dest="no_per_note", action="store_false", help='Enable note memory cells') 73 | parser.add_argument('--separate_rhythm', action="store_true", help='Use a separate rhythm expert. Only works without note memory cells') 74 | parser.add_argument('--layer_size', nargs="+", type=int, default=[300], help='Layer size of the LSTMs. Either pass a single number to be used for all experts, or a sequence of numbers, one for each expert. Only works without note memory cells.') 75 | parser.add_argument('--num_layers', nargs="+", type=int, default=[2], help='Number of LSTM layers. 
Either pass a single number to be used for all experts, or a sequence of numbers, one for each expert. Only works without note memory cells')
76 |     parser.add_argument('--skip_training_experts', nargs="+", type=int, default=[], metavar="EXPERT_INDEX", help='Skip training these experts')
77 | 
78 | builders['poex'] = ModelBuilder('poex', build_poex, config_poex, 'A product-of-experts LSTM sequential model, using note and chord relative encodings.')
79 | 
80 | #######################
81 | 
82 | def build_compae(should_setup, check_nan, unroll_batch_num, encode_key, queue_key, no_per_note, layer_size, num_layers, feature_size, hide_output, sparsity_loss_scale, variational_loss_scale, train_decoder_only, feature_period=None, add_pre_noise=None, add_post_noise=None, loss_mode_priority=False, loss_mode_add=False, loss_mode_cutoff=None, loss_mode_trigger=None):
83 |     bounds = constants.NoteBounds(48, 84) if encode_key == "cot" else constants.BOUNDS
84 |     shift_modes = None
85 |     if encode_key == "abs":
86 |         enc = [AbsoluteSequentialEncoding(constants.BOUNDS.lowbound, constants.BOUNDS.highbound)]
87 |         sizes = [[(layer_size,0)]*num_layers]
88 |         inputs = [[input_parts.BeatInputPart(), input_parts.ChordShiftInputPart()]]
89 |     elif encode_key == "cot":
90 |         enc = [CircleOfThirdsEncoding(bounds.lowbound, (bounds.highbound-bounds.lowbound)//12)]
91 |         sizes = [[(layer_size,0)]*num_layers]
92 |         inputs = [[input_parts.BeatInputPart(), input_parts.ChordShiftInputPart()]]
93 |     elif encode_key == "rel":
94 |         enc = [RelativeJumpEncoding()]
95 |         sizes = [[(200,10),(200,10)] if (not no_per_note) else [(layer_size,0)]*num_layers]
96 |         shift_modes=["drop"]
97 |         inputs = None
98 |     elif encode_key == "poex":
99 |         enc = [RelativeJumpEncoding(), ChordRelativeEncoding()]
100 |         sizes = [ [(200,10),(200,10)] if (not no_per_note) else [(layer_size,0)]*num_layers ]*2
101 |         shift_modes=["drop","roll"]
102 |         inputs = None
103 | 
104 |     unscaled_loss_fun = lambda x: T.log(1+99*x)/T.log(100)
105 |     lossfun = lambda x: np.array(sparsity_loss_scale, np.float32) * unscaled_loss_fun(x)
106 |     if queue_key == "std":
107 |         qman = StandardQueueManager(feature_size, loss_fun=lossfun)
108 |     elif queue_key == "var":
109 |         qman = VariationalQueueManager(feature_size, loss_fun=lossfun, variational_loss_scale=variational_loss_scale)
110 |     elif queue_key == "sample_var":
111 |         qman = SamplingVariationalQueueManager(feature_size, loss_fun=lossfun, variational_loss_scale=variational_loss_scale)
112 |     elif queue_key == "queueless_var":
113 |         qman = QueuelessVariationalQueueManager(feature_size, period=feature_period, variational_loss_scale=variational_loss_scale)
114 |     elif queue_key == "queueless_std":
115 |         qman = QueuelessStandardQueueManager(feature_size, period=feature_period)
116 |     elif queue_key == "nearness_std":
117 |         qman = NearnessStandardQueueManager(feature_size, sparsity_loss_scale*10, sparsity_loss_scale, 0.97, loss_fun=unscaled_loss_fun)
118 | 
119 |     if add_pre_noise is not None or add_post_noise is not None:
120 |         if "queueless" in queue_key:
121 |             pre_mask = sliceMaker[:]
122 |         else:
123 |             pre_mask = sliceMaker[1:]
124 |         qman = NoiseWrapper(qman, add_pre_noise, add_post_noise, pre_mask)
125 | 
126 |     loss_mode = "add" if loss_mode_add else \
127 |         ("cutoff", loss_mode_cutoff) if loss_mode_cutoff is not None else \
128 |         ("trigger",)+tuple(loss_mode_trigger) if loss_mode_trigger is not None else \
129 |         ("priority", loss_mode_priority if loss_mode_priority is not None else 50)
130 | 
131 |     return CompressiveAutoencoderModel(qman, enc,
sizes, sizes, shift_modes=shift_modes, bounds=bounds, hide_output=hide_output, inputs=inputs, 132 | dropout=0.5, setup=should_setup, nanguard=check_nan, unroll_batch_num=unroll_batch_num, loss_mode=loss_mode, train_decoder_only=train_decoder_only) 133 | 134 | def config_compae(parser): 135 | parser.add_argument('encode_key', choices=["abs","cot","rel","poex"], help='Type of encoding to use') 136 | parser.add_argument('queue_key', choices=["std","var","sample_var","queueless_var","queueless_std","nearness_std"], help='Type of queue manager to use') 137 | parser.add_argument('--per_note', dest="no_per_note", action="store_false", help='Enable note memory cells') 138 | parser.add_argument('--hide_output', action="store_true", help='Hide previous outputs from the decoder') 139 | parser.add_argument('--sparsity_loss_scale', type=float, default="1", help='How much to scale the sparsity loss by') 140 | parser.add_argument('--variational_loss_scale', type=float, default="1", help='How much to scale the variational loss by') 141 | parser.add_argument('--feature_size', type=int, default="100", help='Size of feature vectors') 142 | parser.add_argument('--feature_period', type=int, help='If in queueless mode, period of features in timesteps') 143 | parser.add_argument('--add_pre_noise', type=float, nargs="?", const=1.0, help='Add Gaussian noise to the feature values before applying the activation function') 144 | parser.add_argument('--add_post_noise', type=float, nargs="?", const=1.0, help='Add Gaussian noise to the feature values after applying the activation function') 145 | parser.add_argument('--train_decoder_only', action="store_true", help='Only modify the decoder parameters') 146 | parser.add_argument('--layer_size', type=int, default=300, help='Layer size of the LSTMs. Only works without note memory cells') 147 | parser.add_argument('--num_layers', type=int, default=2, help='Number of LSTM layers. 
Only works without note memory cells') 148 | lossgroup = parser.add_mutually_exclusive_group() 149 | lossgroup.add_argument('--priority_loss', nargs='?', const=50, dest='loss_mode_priority', type=float, help='Use priority loss scaling mode (with the specified curviness)') 150 | lossgroup.add_argument('--add_loss', dest='loss_mode_add', action='store_true', help='Use adding loss scaling mode') 151 | lossgroup.add_argument('--cutoff_loss', dest='loss_mode_cutoff', type=float, metavar="CUTOFF", help='Use cutoff loss scaling mode with the specified per-batch cutoff') 152 | lossgroup.add_argument('--trigger_loss', dest='loss_mode_trigger', nargs=2, type=float, metavar=("TRIGGER", "RAMP_TIME"), help='Use trigger loss scaling mode with the specified per-batch trigger value and desired ramp-up time') 153 | 154 | builders['compae'] = ModelBuilder('compae', build_compae, config_compae, 'A compressive autoencoder model.') 155 | 156 | ################################################################################################################### 157 | 158 | def main(modeltype, batch_size, iterations, learning_rate, segment_len, segment_step, train_save_params, dataset=["dataset"], outputdir="output", validation=None, validation_generate_ct=1, resume=None, resume_auto=False, check_nan=False, generate=False, generate_over=None, auto_connectome_keys=None, **model_kwargs): 159 | generate = generate or (generate_over is not None) 160 | should_setup = not generate 161 | unroll_batch_num = None if generate else training.BATCH_SIZE 162 | 163 | for dataset_dir in dataset: 164 | if os.path.samefile(dataset_dir,outputdir): 165 | print("WARNING: Directory {} passed as both dataset and output directory!".format(outputdir)) 166 | print("This may cause problems by adding generated samples to the dataset directory.") 167 | while True: 168 | result = input("Continue anyway? (y/n)") 169 | if result == "y": 170 | break 171 | elif result == "n": 172 | sys.exit(0) 173 | else: 174 | print("Please type y or n") 175 | 176 | if generate_over is None: 177 | training.set_params(batch_size, segment_step, segment_len) 178 | leadsheets = [training.filter_leadsheets(training.find_leadsheets(d)) for d in dataset] 179 | else: 180 | # Don't bother loading leadsheets, we don't need them 181 | leadsheets = [] 182 | 183 | if validation is not None: 184 | validation_leadsheets = training.filter_leadsheets(training.find_leadsheets(validation)) 185 | else: 186 | validation_leadsheets = None 187 | 188 | m = builders[modeltype].build(should_setup, check_nan, unroll_batch_num, **model_kwargs) 189 | m.set_learning_rate(learning_rate) 190 | 191 | if resume_auto: 192 | paramfile = os.path.join(outputdir,'final_params.p') 193 | if os.path.isfile(paramfile): 194 | with open(os.path.join(outputdir,'data.csv'), 'r') as f: 195 | for line in f: 196 | pass 197 | lastline = line 198 | start_idx = lastline.split(',')[0] 199 | print("Automatically resuming from {} after iteration {}.".format(paramfile, start_idx)) 200 | resume = (start_idx, paramfile) 201 | else: 202 | print("Didn't find anything to resume. 
Starting from the beginning...") 203 | 204 | if resume is not None: 205 | start_idx, paramfile = resume 206 | start_idx = int(start_idx) 207 | m.params = pickle.load( open(paramfile, "rb" ) ) 208 | else: 209 | start_idx = 0 210 | 211 | if not os.path.exists(outputdir): 212 | os.makedirs(outputdir) 213 | 214 | if generate: 215 | print("Setting up generation") 216 | m.setup_produce() 217 | print("Starting to generate") 218 | start_time = time.process_time() 219 | if generate_over is not None: 220 | source, divwidth = generate_over 221 | if divwidth == 'full': 222 | divwidth = 0 223 | elif divwidth == 'debug_firststep': 224 | divwidth = -1 225 | elif len(divwidth)>3 and divwidth[-3:] == 'bar': 226 | divwidth = int(divwidth[:-3])*(constants.WHOLE//constants.RESOLUTION_SCALAR) 227 | else: 228 | divwidth = int(divwidth) 229 | ch,mel = leadsheet.parse_leadsheet(source) 230 | lslen = leadsheet.get_leadsheet_length(ch,mel) 231 | if divwidth == 0: 232 | batch = ([ch],[mel]), [source] 233 | elif divwidth == -1: 234 | slices = [leadsheet.slice_leadsheet(ch,mel,0,1)] 235 | batch = list(zip(*slices)), [source] 236 | else: 237 | slices = [leadsheet.slice_leadsheet(ch,mel,s,s+divwidth) for s in range(0,lslen,divwidth)] 238 | batch = list(zip(*slices)), [source] 239 | training.generate(m, leadsheets, os.path.join(outputdir, "generated"), with_vis=True, batch=batch) 240 | else: 241 | training.generate(m, leadsheets, os.path.join(outputdir, "generated"), with_vis=True) 242 | end_time = time.process_time() 243 | print("Generation took {} seconds.".format(end_time-start_time)) 244 | else: 245 | training.train(m, leadsheets, iterations, outputdir, start_idx, train_save_params, validation_leadsheets=validation_leadsheets, validation_generate_ct=validation_generate_ct, auto_connectome_keys=auto_connectome_keys) 246 | pickle.dump( m.params, open( os.path.join(outputdir, "final_params.p"), "wb" ) ) 247 | 248 | def cvt_time(s): 249 | if len(s)>3 and s[-3:] == "bar": 250 | return int(s[:-3])*(constants.WHOLE//constants.RESOLUTION_SCALAR) 251 | else: 252 | return int(s) 253 | 254 | parser = argparse.ArgumentParser(description='Train a neural network model.', formatter_class=argparse.ArgumentDefaultsHelpFormatter) 255 | parser.add_argument('--dataset', nargs="+", default=['dataset'], help='Path(s) to dataset folder (with .ls files). 
If multiple are passed, samples randomly from each') 256 | parser.add_argument('--validation', help='Path to validation dataset folder (with .ls files)') 257 | parser.add_argument('--validation_generate_ct', type=int, default=1, help='Number of samples to generate at each validation time.') 258 | parser.add_argument('--outputdir', default='output', help='Path to output folder') 259 | parser.add_argument('--check_nan', action='store_true', help='Check for nans during execution') 260 | parser.add_argument('--batch_size', type=int, default=10, help='Size of batch') 261 | parser.add_argument('--iterations', type=int, default=50000, help='How many iterations to train') 262 | parser.add_argument('--learning_rate', type=float, default=0.0002, help='Learning rate for the ADAM gradient descent method') 263 | parser.add_argument('--segment_len', type=cvt_time, default="4bar", help='Length of segment to train on') 264 | parser.add_argument('--segment_step', type=cvt_time, default="1bar", help='Period at which segments may begin') 265 | parser.add_argument('--save-params-interval', type=int, default=5000, dest="train_save_params", help="Save parameters after this many iterations") 266 | parser.add_argument('--final-params-only', action="store_const", const=None, dest="train_save_params", help="Don't save parameters while training, only at the end.") 267 | parser.add_argument('--auto_connectome_keys', help='Path to keys for running param_cvt. If given, will run param_cvt automatically for each saved parameters file.') 268 | resume_group = parser.add_mutually_exclusive_group() 269 | resume_group.add_argument('--resume', nargs=2, metavar=('TIMESTEP', 'PARAMFILE'), default=None, help='Where to restore from: timestep, and file to load') 270 | resume_group.add_argument('--resume_auto', action='store_true', help='Automatically restore from a previous run using output directory') 271 | gen_group = parser.add_mutually_exclusive_group() 272 | gen_group.add_argument('--generate', action='store_true', help="Don't train, just generate. Should be used with restore.") 273 | gen_group.add_argument('--generate_over', nargs=2, metavar=('SOURCE', 'DIV_WIDTH'), default=None, help="Don't train, just generate, and generate over SOURCE chord changes divided into chunks of length DIV_WIDTH (or one contiguous chunk if DIV_WIDTH is 'full'). Can use 'bar' as a unit. Should be used with restore.") 274 | 275 | subparsers = parser.add_subparsers(title='Model Types', dest='modeltype', help='Type of model to use. 
(Note that each model type has additional parameters.)') 276 | for k,b in builders.items(): 277 | cur_parser = subparsers.add_parser(k, help=b.desc) 278 | b.config_args(cur_parser) 279 | 280 | if __name__ == '__main__': 281 | np.set_printoptions(linewidth=200) 282 | args = vars(parser.parse_args()) 283 | if args["modeltype"] is None: 284 | parser.print_usage() 285 | else: 286 | main(**args) 287 | -------------------------------------------------------------------------------- /models/__init__.py: -------------------------------------------------------------------------------- 1 | from .simple_rel_model import SimpleModel 2 | from .product_model import ProductOfExpertsModel 3 | from .compressive_autoencoder_model import CompressiveAutoencoderModel -------------------------------------------------------------------------------- /models/product_model.py: -------------------------------------------------------------------------------- 1 | 2 | import theano 3 | import theano.tensor as T 4 | from theano.sandbox.rng_mrg import MRG_RandomStreams 5 | 6 | import numpy as np 7 | 8 | import constants 9 | import input_parts 10 | from relshift_lstm import RelativeShiftLSTMStack 11 | from adam import Adam 12 | from note_encodings import Encoding 13 | import leadsheet 14 | 15 | import itertools 16 | import functools 17 | 18 | from theano.compile.nanguardmode import NanGuardMode 19 | 20 | 21 | class ProductOfExpertsModel(object): 22 | def __init__(self, encodings, all_layer_sizes, inputs=None, shift_modes=None, dropout=0, setup=False, nanguard=False, unroll_batch_num=None, bounds=constants.BOUNDS, normalize_artic_only=False, skip_training_experts=[]): 23 | self.encodings = encodings 24 | 25 | self.bounds = bounds 26 | self.normalize_artic_only = normalize_artic_only 27 | self.skip_training_experts = skip_training_experts 28 | 29 | if shift_modes is None: 30 | shift_modes = ["drop"]*len(encodings) 31 | 32 | if inputs is None: 33 | inputs = [[ 34 | input_parts.BeatInputPart(), 35 | input_parts.PositionInputPart(self.bounds.lowbound, self.bounds.highbound, 2), 36 | input_parts.ChordShiftInputPart()]]*len(self.encodings) 37 | 38 | self.all_layer_sizes = all_layer_sizes 39 | self.lstmstacks = [] 40 | for layer_sizes, encoding, shift_mode, ipt in zip(all_layer_sizes,encodings,shift_modes, inputs): 41 | parts = ipt + [ 42 | input_parts.PassthroughInputPart("last_output", encoding.ENCODING_WIDTH) 43 | ] 44 | lstmstack = RelativeShiftLSTMStack(parts, layer_sizes, encoding.RAW_ENCODING_WIDTH, encoding.WINDOW_SIZE, dropout, mode=shift_mode, unroll_batch_num=unroll_batch_num) 45 | self.lstmstacks.append(lstmstack) 46 | 47 | self.srng = MRG_RandomStreams(np.random.randint(1, 1024)) 48 | 49 | self.learning_rate_var = theano.shared(np.array(0.0002, theano.config.floatX)) 50 | 51 | self.update_fun = None 52 | self.eval_fun = None 53 | self.gen_fun = None 54 | 55 | self.nanguard = nanguard 56 | 57 | if setup: 58 | print("Setting up train") 59 | self.setup_train() 60 | print("Setting up gen") 61 | self.setup_generate() 62 | print("Done setting up") 63 | 64 | @property 65 | def params(self): 66 | return list(itertools.chain(*(lstmstack.params for lstmstack in self.lstmstacks))) 67 | 68 | @params.setter 69 | def params(self, paramlist): 70 | mycopy = list(paramlist) 71 | for lstmstack in self.lstmstacks: 72 | lstmstack.params = mycopy[:len(lstmstack.params)] 73 | del mycopy[:len(lstmstack.params)] 74 | assert len(mycopy) == 0 75 | 76 | def get_optimize_params(self): 77 | return list(itertools.chain(*(lstmstack.params for 
i,lstmstack in enumerate(self.lstmstacks) if i not in self.skip_training_experts)))
78 | 
79 |     def set_learning_rate(self, lr):
80 |         self.learning_rate_var.set_value(np.array(lr, theano.config.floatX))
81 | 
82 |     def setup_train(self):
83 | 
84 |         # dimensions: (batch, time, 12)
85 |         chord_types = T.btensor3()
86 | 
87 |         # dimensions: (batch, time)
88 |         chord_roots = T.imatrix()
89 | 
90 |         # dimensions: (batch, time)
91 |         relative_posns = [T.imatrix() for _ in self.encodings]
92 | 
93 |         # dimensions: (batch, time, output_data)
94 |         encoded_melodies = [T.btensor3() for _ in self.encodings]
95 | 
96 |         # dimensions: (batch, time)
97 |         correct_notes = T.imatrix()
98 | 
99 |         n_batch, n_time = chord_roots.shape
100 | 
101 |         def _build(det_dropout):
102 |             all_out_probs = []
103 |             for encoding, lstmstack, encoded_melody, relative_pos in zip(self.encodings, self.lstmstacks, encoded_melodies, relative_posns):
104 |                 activations = lstmstack.do_preprocess_scan( timestep=T.tile(T.arange(n_time), (n_batch,1)) ,
105 |                                                             relative_position=relative_pos,
106 |                                                             cur_chord_type=chord_types,
107 |                                                             cur_chord_root=chord_roots,
108 |                                                             last_output=T.concatenate([T.tile(encoding.initial_encoded_form(), (n_batch,1,1)),
109 |                                                                                        encoded_melody[:,:-1,:] ], 1),
110 |                                                             deterministic_dropout=det_dropout)
111 | 
112 |                 out_probs = encoding.decode_to_probs(activations, relative_pos, self.bounds.lowbound, self.bounds.highbound)
113 |                 all_out_probs.append(out_probs)
114 |             reduced_out_probs = functools.reduce((lambda x,y: x*y), all_out_probs)
115 |             if self.normalize_artic_only:
116 |                 non_artic_probs = reduced_out_probs[:,:,:2]
117 |                 artic_probs = reduced_out_probs[:,:,2:]
118 |                 non_artic_sum = T.sum(non_artic_probs, 2, keepdims=True)
119 |                 artic_sum = T.sum(artic_probs, 2, keepdims=True)
120 |                 norm_artic_probs = artic_probs*(1-non_artic_sum)/artic_sum
121 |                 norm_out_probs = T.concatenate([non_artic_probs, norm_artic_probs], 2)
122 |             else:
123 |                 normsum = T.sum(reduced_out_probs, 2, keepdims=True)
124 |                 normsum = T.maximum(normsum, constants.EPSILON)
125 |                 norm_out_probs = reduced_out_probs/normsum
126 |             return Encoding.compute_loss(norm_out_probs, correct_notes, True)
127 | 
128 |         train_loss, train_info = _build(False)
129 |         updates = Adam(train_loss, self.get_optimize_params(), lr=self.learning_rate_var)
130 | 
131 |         eval_loss, eval_info = _build(True)
132 | 
133 |         self.loss_info_keys = list(train_info.keys())
134 | 
135 |         self.update_fun = theano.function(
136 |             inputs=[chord_types, chord_roots, correct_notes] + relative_posns + encoded_melodies,
137 |             outputs=[train_loss]+list(train_info.values()),
138 |             updates=updates,
139 |             allow_input_downcast=True,
140 |             on_unused_input='ignore',
141 |             mode=(NanGuardMode(nan_is_error=True, inf_is_error=True, big_is_error=True) if self.nanguard else None))
142 | 
143 |         self.eval_fun = theano.function(
144 |             inputs=[chord_types, chord_roots, correct_notes] + relative_posns + encoded_melodies,
145 |             outputs=[eval_loss]+list(eval_info.values()),
146 |             allow_input_downcast=True,
147 |             on_unused_input='ignore',
148 |             mode=(NanGuardMode(nan_is_error=True, inf_is_error=True, big_is_error=True) if self.nanguard else None))
149 | 
150 |     def _assemble_batch(self, melody, chords):
151 |         encoded_melodies = [[] for _ in self.encodings]
152 |         relative_posns = [[] for _ in self.encodings]
153 |         correct_notes = []
154 |         chord_roots = []
155 |         chord_types = []
156 |         for m,c in zip(melody,chords):
157 |             m = leadsheet.constrain_melody(m, self.bounds)
158 |             for i,encoding in enumerate(self.encodings):
159 |                 e_m, r_p =
encoding.encode_melody_and_position(m,c) 160 | encoded_melodies[i].append(e_m) 161 | relative_posns[i].append(r_p) 162 | correct_notes.append(Encoding.encode_absolute_melody(m, self.bounds.lowbound, self.bounds.highbound)) 163 | c_roots, c_types = zip(*c) 164 | chord_roots.append(c_roots) 165 | chord_types.append(c_types) 166 | return ([np.array(chord_types, np.float32), 167 | np.array(chord_roots, np.int32), 168 | np.array(correct_notes, np.int32)] 169 | + [np.array(x, np.int32) for x in relative_posns] 170 | + [np.array(x, np.int32) for x in encoded_melodies]) 171 | 172 | def train(self, chords, melody): 173 | assert self.update_fun is not None, "Need to call setup_train before train" 174 | res = self.update_fun(*self._assemble_batch(melody,chords)) 175 | loss = res[0] 176 | info = dict(zip(self.loss_info_keys, res[1:])) 177 | return loss, info 178 | 179 | def eval(self, chords, melody): 180 | assert self.update_fun is not None, "Need to call setup_train before eval" 181 | res = self.eval_fun(*self._assemble_batch(melody,chords)) 182 | loss = res[0] 183 | info = dict(zip(self.loss_info_keys, res[1:])) 184 | return loss, info 185 | 186 | def setup_generate(self): 187 | 188 | # dimensions: (batch, time, 12) 189 | chord_types = T.btensor3() 190 | 191 | # dimensions: (batch, time) 192 | chord_roots = T.imatrix() 193 | 194 | n_batch, n_time = chord_roots.shape 195 | 196 | specs = [lstmstack.prepare_sample_scan( start_pos=T.alloc(np.array(encoding.STARTING_POSITION, np.int32), (n_batch)), 197 | start_out=T.tile(encoding.initial_encoded_form(), (n_batch,1)), 198 | timestep=T.tile(T.arange(n_time), (n_batch,1)), 199 | cur_chord_type=chord_types, 200 | cur_chord_root=chord_roots, 201 | deterministic_dropout=True ) 202 | for lstmstack, encoding in zip(self.lstmstacks, self.encodings)] 203 | 204 | updates, all_chosen, all_probs, indiv_probs = helper_generate_from_spec(specs, self.lstmstacks, self.encodings, self.srng, n_batch, n_time, self.bounds, self.normalize_artic_only) 205 | 206 | self.generate_fun = theano.function( 207 | inputs=[chord_roots, chord_types], 208 | updates=updates, 209 | outputs=all_chosen, 210 | allow_input_downcast=True, 211 | mode=(NanGuardMode(nan_is_error=True, inf_is_error=True, big_is_error=True) if self.nanguard else None)) 212 | 213 | self.generate_visualize_fun = theano.function( 214 | inputs=[chord_roots, chord_types], 215 | updates=updates, 216 | outputs=[all_chosen, all_probs] + indiv_probs, 217 | allow_input_downcast=True, 218 | mode=(NanGuardMode(nan_is_error=True, inf_is_error=True, big_is_error=True) if self.nanguard else None)) 219 | 220 | def generate(self, chords): 221 | assert self.generate_fun is not None, "Need to call setup_generate before generate" 222 | 223 | chord_roots = [] 224 | chord_types = [] 225 | for c in chords: 226 | c_roots, c_types = zip(*c) 227 | chord_roots.append(c_roots) 228 | chord_types.append(c_types) 229 | chosen = self.generate_fun(np.array(chord_roots, np.int32),np.array(chord_types, np.float32)) 230 | return [Encoding.decode_absolute_melody(c, self.bounds.lowbound, self.bounds.highbound) for c in chosen] 231 | 232 | def generate_visualize(self, chords): 233 | assert self.generate_fun is not None, "Need to call setup_generate before generate" 234 | chord_roots = [] 235 | chord_types = [] 236 | for c in chords: 237 | c_roots, c_types = zip(*c) 238 | chord_roots.append(c_roots) 239 | chord_types.append(c_types) 240 | stuff = self.generate_visualize_fun(chord_roots, chord_types) 241 | chosen, all_probs = stuff[:2] 242 | 243 | melody = 
[Encoding.decode_absolute_melody(c, self.bounds.lowbound, self.bounds.highbound) for c in chosen] 244 | return melody, chosen, all_probs, stuff[2:] 245 | 246 | def setup_produce(self): 247 | self.setup_generate() 248 | 249 | def produce(self, chords, melody): 250 | return self.generate_visualize(chords) 251 | 252 | def helper_generate_from_spec(specs, lstmstacks, encodings, srng, n_batch, n_time, bounds, normalize_artic_only=False): 253 | """Helper function to generate through a product LSTM model""" 254 | def _scan_fn(*inputs): 255 | # inputs is [ spec_sequences..., last_absolute_position, spec_taps..., spec_non_sequences... ] 256 | inputs = list(inputs) 257 | 258 | partitioned_inputs = [[] for _ in specs] 259 | for cur_part, spec in zip(partitioned_inputs, specs): 260 | cur_part.extend(inputs[:len(spec.sequences)]) 261 | del inputs[:len(spec.sequences)] 262 | last_absolute_chosen = inputs.pop(0) 263 | for cur_part, spec in zip(partitioned_inputs, specs): 264 | cur_part.extend(inputs[:spec.num_taps]) 265 | del inputs[:spec.num_taps] 266 | for cur_part, spec in zip(partitioned_inputs, specs): 267 | cur_part.extend(inputs[:len(spec.non_sequences)]) 268 | del inputs[:len(spec.non_sequences)] 269 | 270 | scan_routs = [ lstmstack.sample_scan_routine(spec, *p_input) for lstmstack,spec,p_input in zip(lstmstacks, specs, partitioned_inputs) ] 271 | new_posns = [] 272 | all_out_probs = [] 273 | for scan_rout, encoding in zip(scan_routs, encodings): 274 | last_rel_pos, last_out, cur_kwargs = scan_rout.send(None) 275 | 276 | new_pos = encoding.get_new_relative_position(last_absolute_chosen, last_rel_pos, last_out, bounds.lowbound, bounds.highbound, **cur_kwargs) 277 | new_posns.append(new_pos) 278 | addtl_kwargs = { 279 | "last_output": last_out 280 | } 281 | 282 | out_activations = scan_rout.send((new_pos, addtl_kwargs)) 283 | out_probs = encoding.decode_to_probs(out_activations,new_pos,bounds.lowbound, bounds.highbound) 284 | all_out_probs.append(out_probs) 285 | 286 | reduced_out_probs = functools.reduce((lambda x,y: x*y), all_out_probs) 287 | if normalize_artic_only: 288 | non_artic_probs = reduced_out_probs[:,:2] 289 | artic_probs = reduced_out_probs[:,2:] 290 | non_artic_sum = T.sum(non_artic_probs, 1, keepdims=True) 291 | artic_sum = T.sum(artic_probs, 1, keepdims=True) 292 | norm_artic_probs = artic_probs*(1-non_artic_sum)/artic_sum 293 | norm_out_probs = T.concatenate([non_artic_probs, norm_artic_probs], 1) 294 | else: 295 | normsum = T.sum(reduced_out_probs, 1, keepdims=True) 296 | normsum = T.maximum(normsum, constants.EPSILON) 297 | norm_out_probs = reduced_out_probs/normsum 298 | 299 | sampled_note = Encoding.sample_absolute_probs(srng, norm_out_probs) 300 | 301 | outputs = [] 302 | for scan_rout, encoding, new_pos in zip(scan_routs, encodings, new_posns): 303 | encoded_output = encoding.note_to_encoding(sampled_note, new_pos, bounds.lowbound, bounds.highbound) 304 | scan_outputs = scan_rout.send(encoded_output) 305 | scan_rout.close() 306 | outputs.extend(scan_outputs) 307 | 308 | return [sampled_note, norm_out_probs] + all_out_probs + outputs 309 | 310 | sequences = [] 311 | non_sequences = [] 312 | outputs_info = [{"initial":T.zeros((n_batch,),'int32'), "taps":[-1]}, None] + [None]*len(specs) 313 | for spec in specs: 314 | sequences.extend(spec.sequences) 315 | non_sequences.extend(spec.non_sequences) 316 | outputs_info.extend(spec.outputs_info) 317 | 318 | result, updates = theano.scan(fn=_scan_fn, sequences=sequences, non_sequences=non_sequences, outputs_info=outputs_info) 319 | 
all_chosen = result[0].dimshuffle((1,0))
320 |     all_probs = result[1].dimshuffle((1,0,2))
321 |     indiv_probs = [r.dimshuffle((1,0,2)) for r in result[2:2+len(specs)]]
322 | 
323 |     return updates, all_chosen, all_probs, indiv_probs
324 | 
--------------------------------------------------------------------------------
/models/simple_rel_model.py:
--------------------------------------------------------------------------------
1 | 
2 | import theano
3 | import theano.tensor as T
4 | from theano.sandbox.rng_mrg import MRG_RandomStreams
5 | from theano.compile.nanguardmode import NanGuardMode
6 | import numpy as np
7 | 
8 | import constants
9 | import input_parts
10 | from relshift_lstm import RelativeShiftLSTMStack
11 | from adam import Adam
12 | from note_encodings import Encoding
13 | import leadsheet
14 | 
15 | class SimpleModel(object):
16 |     def __init__(self, encoding, layer_sizes, inputs=None, shift_mode="drop", dropout=0, setup=False, nanguard=False, unroll_batch_num=None, bounds=constants.BOUNDS):
17 | 
18 |         self.encoding = encoding
19 | 
20 |         self.bounds = bounds
21 | 
22 |         if inputs is None:
23 |             inputs = [
24 |                 input_parts.BeatInputPart(),
25 |                 input_parts.PositionInputPart(self.bounds.lowbound, self.bounds.highbound, 2),
26 |                 input_parts.ChordShiftInputPart()]
27 | 
28 |         parts = inputs + [
29 |             input_parts.PassthroughInputPart("last_output", encoding.ENCODING_WIDTH)
30 |         ]
31 |         self.lstmstack = RelativeShiftLSTMStack(parts, layer_sizes, encoding.RAW_ENCODING_WIDTH, encoding.WINDOW_SIZE, dropout, mode=shift_mode, unroll_batch_num=unroll_batch_num)
32 | 
33 |         self.srng = MRG_RandomStreams(np.random.randint(1, 1024))
34 | 
35 |         self.learning_rate_var = theano.shared(np.array(0.0002, theano.config.floatX))
36 | 
37 |         self.update_fun = None
38 |         self.eval_fun = None
39 |         self.gen_fun = None
40 | 
41 |         self.nanguard = nanguard
42 | 
43 |         if setup:
44 |             print("Setting up train")
45 |             self.setup_train()
46 |             print("Setting up gen")
47 |             self.setup_generate()
48 |             print("Done setting up")
49 | 
50 |     @property
51 |     def params(self):
52 |         return self.lstmstack.params
53 | 
54 |     @params.setter
55 |     def params(self, paramlist):
56 |         self.lstmstack.params = paramlist
57 | 
58 |     def set_learning_rate(self, lr):
59 |         self.learning_rate_var.set_value(np.array(lr, theano.config.floatX))
60 | 
61 |     def setup_train(self):
62 | 
63 |         # dimensions: (batch, time, 12)
64 |         chord_types = T.btensor3()
65 | 
66 |         # dimensions: (batch, time)
67 |         chord_roots = T.imatrix()
68 | 
69 |         # dimensions: (batch, time)
70 |         relative_pos = T.imatrix()
71 | 
72 |         # dimensions: (batch, time, output_data)
73 |         encoded_melody = T.btensor3()
74 | 
75 |         # dimensions: (batch, time)
76 |         correct_notes = T.imatrix()
77 | 
78 |         n_batch, n_time = relative_pos.shape
79 | 
80 |         def _build(det_dropout):
81 |             activations = self.lstmstack.do_preprocess_scan( timestep=T.tile(T.arange(n_time), (n_batch,1)) ,
82 |                                                              relative_position=relative_pos,
83 |                                                              cur_chord_type=chord_types,
84 |                                                              cur_chord_root=chord_roots,
85 |                                                              last_output=T.concatenate([T.tile(self.encoding.initial_encoded_form(), (n_batch,1,1)),
86 |                                                                                         encoded_melody[:,:-1,:] ], 1),
87 |                                                              deterministic_dropout=det_dropout)
88 | 
89 |             out_probs = self.encoding.decode_to_probs(activations, relative_pos, self.bounds.lowbound, self.bounds.highbound)
90 |             return Encoding.compute_loss(out_probs, correct_notes, True)
91 | 
92 |         train_loss, train_info = _build(False)
93 |         updates = Adam(train_loss, self.params, lr=self.learning_rate_var)
94 | 
95 |         eval_loss, eval_info = _build(True)
96 | 
97 |         self.loss_info_keys = list(train_info.keys())
98 | 
99 |         self.update_fun =
theano.function( 100 | inputs=[chord_types, chord_roots, relative_pos, encoded_melody, correct_notes], 101 | outputs=[train_loss]+list(train_info.values()), 102 | updates=updates, 103 | allow_input_downcast=True, 104 | mode=(NanGuardMode(nan_is_error=True, inf_is_error=True, big_is_error=True) if self.nanguard else None)) 105 | 106 | self.eval_fun = theano.function( 107 | inputs=[chord_types, chord_roots, relative_pos, encoded_melody, correct_notes], 108 | outputs=[eval_loss]+list(eval_info.values()), 109 | allow_input_downcast=True, 110 | mode=(NanGuardMode(nan_is_error=True, inf_is_error=True, big_is_error=True) if self.nanguard else None)) 111 | 112 | def _assemble_batch(self, melody, chords): 113 | encoded_melody = [] 114 | relative_pos = [] 115 | correct_notes = [] 116 | chord_roots = [] 117 | chord_types = [] 118 | for m,c in zip(melody,chords): 119 | m = leadsheet.constrain_melody(m, self.bounds) 120 | e_m, r_p = self.encoding.encode_melody_and_position(m,c) 121 | encoded_melody.append(e_m) 122 | relative_pos.append(r_p) 123 | correct_notes.append(Encoding.encode_absolute_melody(m, self.bounds.lowbound, self.bounds.highbound)) 124 | c_roots, c_types = zip(*c) 125 | chord_roots.append(c_roots) 126 | chord_types.append(c_types) 127 | return (np.array(chord_types, np.float32), 128 | np.array(chord_roots, np.int32), 129 | np.array(relative_pos, np.int32), 130 | np.array(encoded_melody, np.float32), 131 | np.array(correct_notes, np.int32)) 132 | 133 | def train(self, chords, melody): 134 | assert self.update_fun is not None, "Need to call setup_train before train" 135 | res = self.update_fun(*self._assemble_batch(melody,chords)) 136 | loss = res[0] 137 | info = dict(zip(self.loss_info_keys, res[1:])) 138 | return loss, info 139 | 140 | def eval(self, chords, melody): 141 | assert self.update_fun is not None, "Need to call setup_train before eval" 142 | res = self.eval_fun(*self._assemble_batch(melody,chords)) 143 | loss = res[0] 144 | info = dict(zip(self.loss_info_keys, res[1:])) 145 | return loss, info 146 | 147 | def setup_generate(self): 148 | 149 | # dimensions: (batch, time, 12) 150 | chord_types = T.btensor3() 151 | 152 | # dimensions: (batch, time) 153 | chord_roots = T.imatrix() 154 | 155 | n_batch, n_time = chord_roots.shape 156 | 157 | spec = self.lstmstack.prepare_sample_scan( start_pos=T.alloc(np.array(self.encoding.STARTING_POSITION, np.int32), (n_batch)), 158 | start_out=T.tile(self.encoding.initial_encoded_form(), (n_batch,1)), 159 | timestep=T.tile(T.arange(n_time), (n_batch,1)), 160 | cur_chord_type=chord_types, 161 | cur_chord_root=chord_roots, 162 | deterministic_dropout=True ) 163 | 164 | def _scan_fn(*inputs): 165 | # inputs is [ spec_sequences..., last_absolute_position, spec_taps..., spec_non_sequences... 
] 166 | inputs = list(inputs) 167 | last_absolute_chosen = inputs.pop(len(spec.sequences)) 168 | scan_rout = self.lstmstack.sample_scan_routine(spec, *inputs) 169 | 170 | last_rel_pos, last_out, cur_kwargs = scan_rout.send(None) 171 | 172 | new_pos = self.encoding.get_new_relative_position(last_absolute_chosen, last_rel_pos, last_out, self.bounds.lowbound, self.bounds.highbound, **cur_kwargs) 173 | addtl_kwargs = { 174 | "last_output": last_out 175 | } 176 | 177 | out_activations = scan_rout.send((new_pos, addtl_kwargs)) 178 | out_probs = self.encoding.decode_to_probs(out_activations,new_pos,self.bounds.lowbound, self.bounds.highbound) 179 | sampled_note = Encoding.sample_absolute_probs(self.srng, out_probs) 180 | encoded_output = self.encoding.note_to_encoding(sampled_note, new_pos, self.bounds.lowbound, self.bounds.highbound) 181 | scan_outputs = scan_rout.send(encoded_output) 182 | scan_rout.close() 183 | 184 | return [sampled_note, out_probs] + scan_outputs 185 | 186 | outputs_info = [{"initial":T.zeros((n_batch,),'int32'), "taps":[-1]}, None] + spec.outputs_info 187 | result, updates = theano.scan(fn=_scan_fn, sequences=spec.sequences, non_sequences=spec.non_sequences, outputs_info=outputs_info) 188 | all_chosen = result[0].dimshuffle((1,0)) 189 | all_probs = result[1].dimshuffle((1,0,2)) 190 | 191 | self.generate_fun = theano.function( 192 | inputs=[chord_roots, chord_types], 193 | updates=updates, 194 | outputs=all_chosen, 195 | allow_input_downcast=True, 196 | mode=(NanGuardMode(nan_is_error=True, inf_is_error=True, big_is_error=True) if self.nanguard else None)) 197 | 198 | self.generate_visualize_fun = theano.function( 199 | inputs=[chord_roots, chord_types], 200 | updates=updates, 201 | outputs=[all_chosen, all_probs], 202 | allow_input_downcast=True, 203 | mode=(NanGuardMode(nan_is_error=True, inf_is_error=True, big_is_error=True) if self.nanguard else None)) 204 | 205 | def generate(self, chords): 206 | assert self.generate_fun is not None, "Need to call setup_generate before generate" 207 | 208 | chord_roots = [] 209 | chord_types = [] 210 | for c in chords: 211 | c_roots, c_types = zip(*c) 212 | chord_roots.append(c_roots) 213 | chord_types.append(c_types) 214 | chosen = self.generate_fun(np.array(chord_roots, np.int32),np.array(chord_types, np.float32)) 215 | return [Encoding.decode_absolute_melody(c, self.bounds.lowbound, self.bounds.highbound) for c in chosen] 216 | 217 | def generate_visualize(self, chords): 218 | assert self.generate_fun is not None, "Need to call setup_generate before generate" 219 | chord_roots = [] 220 | chord_types = [] 221 | for c in chords: 222 | c_roots, c_types = zip(*c) 223 | chord_roots.append(c_roots) 224 | chord_types.append(c_types) 225 | chosen, all_probs = self.generate_visualize_fun(chord_roots, chord_types) 226 | 227 | melody = [Encoding.decode_absolute_melody(c, self.bounds.lowbound, self.bounds.highbound) for c in chosen] 228 | return melody, chosen, all_probs 229 | 230 | def setup_produce(self): 231 | self.setup_generate() 232 | 233 | def produce(self, chords, melody): 234 | return self.generate_visualize(chords) + ([],) 235 | -------------------------------------------------------------------------------- /nametrain/name_model.py: -------------------------------------------------------------------------------- 1 | import theano 2 | import theano.tensor as T 3 | from theano.sandbox.rng_mrg import MRG_RandomStreams 4 | 5 | import numpy as np 6 | 7 | import constants 8 | import input_parts 9 | from relshift_lstm import 
RelativeShiftLSTMStack 10 | from queue_managers import QueueManager 11 | from adam import Adam 12 | from note_encodings import Encoding 13 | import leadsheet 14 | 15 | import itertools 16 | import functools 17 | from theano_lstm import LSTM, StackedCells, Layer 18 | from util import * 19 | import random 20 | 21 | import pickle 22 | 23 | CHARKEY = " !\"'(),-.01245679:?ABCDEFGHIJKLMNOPQRSTUVWYZabcdefghijklmnopqrstuvwxyz" 24 | 25 | def name_model(): 26 | 27 | LSTM_SIZE = 300 28 | layer1 = LSTM(len(CHARKEY), LSTM_SIZE, activation=T.tanh) 29 | layer2 = Layer(LSTM_SIZE, len(CHARKEY), activation=lambda x:x) 30 | params = layer1.params + [layer1.initial_hidden_state] + layer2.params 31 | 32 | ################# Train ################# 33 | train_data = T.ftensor3() 34 | n_batch = train_data.shape[0] 35 | train_input = T.concatenate([T.zeros([n_batch,1,len(CHARKEY)]),train_data[:,:-1,:]],1) 36 | train_output = train_data 37 | 38 | def _scan_train(last_out, last_state): 39 | new_state = layer1.activate(last_out, last_state) 40 | layer_out = layer1.postprocess_activation(new_state) 41 | layer2_out = layer2.activate(layer_out) 42 | new_out = T.nnet.softmax(layer2_out) 43 | return new_out, new_state 44 | 45 | outputs_info = [None, initial_state(layer1, n_batch)] 46 | (scan_outputs, scan_states), _ = theano.scan(_scan_train, sequences=[train_input.dimshuffle([1,0,2])], outputs_info=outputs_info) 47 | 48 | flat_scan_outputs = scan_outputs.dimshuffle([1,0,2]).reshape([-1,len(CHARKEY)]) 49 | flat_train_output = train_output.reshape([-1,len(CHARKEY)]) 50 | crossentropy = T.nnet.categorical_crossentropy(flat_scan_outputs, flat_train_output) 51 | loss = T.sum(crossentropy)/T.cast(n_batch,'float32') 52 | 53 | adam_updates = Adam(loss, params) 54 | 55 | train_fn = theano.function([train_data],loss,updates=adam_updates) 56 | 57 | ################# Eval ################# 58 | 59 | length = T.iscalar() 60 | srng = MRG_RandomStreams(np.random.randint(1, 1024)) 61 | 62 | def _scan_gen(last_out, last_state): 63 | new_state = layer1.activate(last_out, last_state) 64 | layer_out = layer1.postprocess_activation(new_state) 65 | layer2_out = layer2.activate(layer_out) 66 | new_out = T.nnet.softmax(T.shape_padleft(layer2_out)) 67 | sample = srng.multinomial(n=1,pvals=new_out)[0,:] 68 | sample = T.cast(sample,'float32') 69 | return sample, new_state 70 | 71 | initial_input = np.zeros([len(CHARKEY)], np.float32) 72 | outputs_info = [initial_input, layer1.initial_hidden_state] 73 | (scan_outputs, scan_states), updates = theano.scan(_scan_gen, n_steps=length, outputs_info=outputs_info) 74 | 75 | gen_fn = theano.function([length],scan_outputs,updates=updates) 76 | 77 | return layer1, layer2, train_fn, gen_fn 78 | 79 | def train_name(dataset_file): 80 | with open(dataset_file,'r') as f: 81 | dataset = [x.strip() for x in f] 82 | maxlen = max(len(x) for x in dataset) 83 | dataset = [x+" "*(maxlen-len(x)) for x in dataset] 84 | 85 | layer1, layer2, train_fn, gen_fn = name_model() 86 | params = layer1.params + [layer1.initial_hidden_state] + layer2.params 87 | 88 | print("Starting train...") 89 | 90 | BATCH_SIZE = 20 91 | for iteration in range(10000): 92 | sample = [random.choice(dataset) for _ in range(BATCH_SIZE)] 93 | sample_encoded = np.zeros([BATCH_SIZE, maxlen, len(CHARKEY)], np.float32) 94 | for i,train_string in enumerate(sample): 95 | for j,c in enumerate(train_string): 96 | try: 97 | idx = CHARKEY.index(c) 98 | except ValueError: 99 | print("Couldn't find character <{}>, replacing with space".format(c)) 100 | idx = 
CHARKEY.index(" ") 101 | sample_encoded[i,j,idx] = 1.0 102 | 103 | loss = train_fn(sample_encoded) 104 | if iteration % 100 == 0: 105 | print("Iter",iteration,"has loss",loss) 106 | for _ in range(10): 107 | generate_name(maxlen, layer1, layer2, train_fn, gen_fn) 108 | 109 | 110 | pickle.dump([p.get_value() for p in params], open("name_params.p", 'wb')) 111 | 112 | def generate_name(length, layer1=None, layer2=None, train_fn=None, gen_fn=None, num_names=1): 113 | if layer1 is None: 114 | layer1, layer2, train_fn, gen_fn = name_model() 115 | params = layer1.params + [layer1.initial_hidden_state] + layer2.params 116 | loaded_params = pickle.load(open("name_params.p", 'rb')) 117 | for p,v in zip(params, loaded_params): 118 | p.set_value(v) 119 | for i in range(num_names): 120 | scan_outputs = gen_fn(length) 121 | outval = [] 122 | for output in scan_outputs: 123 | idx = np.nonzero(output)[0][0] 124 | outval.append(CHARKEY[idx]) 125 | print(''.join(outval)) -------------------------------------------------------------------------------- /note_encodings/__init__.py: -------------------------------------------------------------------------------- 1 | from .base_encoding import Encoding 2 | from .relative_jump import RelativeJumpEncoding 3 | from .chord_relative import ChordRelativeEncoding 4 | from .abs_seq_encoding import AbsoluteSequentialEncoding 5 | from .circle_of_thirds_encoding import CircleOfThirdsEncoding 6 | from .rhythm_only import RhythmOnlyEncoding -------------------------------------------------------------------------------- /note_encodings/abs_seq_encoding.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import theano 3 | import theano.tensor as T 4 | import random 5 | 6 | from .base_encoding import Encoding 7 | 8 | import constants 9 | import leadsheet 10 | import math 11 | 12 | class AbsoluteSequentialEncoding( Encoding ): 13 | 14 | STARTING_POSITION = 0 15 | WINDOW_SIZE = 1 16 | 17 | def __init__(self, low_bound, high_bound): 18 | self.low_bound = low_bound 19 | self.high_bound = high_bound 20 | self.ENCODING_WIDTH = high_bound-low_bound+2 21 | self.RAW_ENCODING_WIDTH = self.ENCODING_WIDTH 22 | 23 | def encode_melody_and_position(self, melody, chords): 24 | abs_encoded_idxs = Encoding.encode_absolute_melody(melody, self.low_bound, self.high_bound) 25 | encoded_form = np.eye(self.ENCODING_WIDTH)[abs_encoded_idxs] 26 | position = np.zeros([abs_encoded_idxs.shape[0]]) 27 | return encoded_form, position 28 | 29 | def decode_to_probs(self, activations, relative_position, low_bound, high_bound): 30 | squashed = T.reshape(activations, (-1,self.RAW_ENCODING_WIDTH)) 31 | probs = T.nnet.softmax(squashed) 32 | fixed = T.reshape(probs, activations.shape) 33 | return fixed 34 | 35 | def note_to_encoding(self, chosen_note, relative_position, low_bound, high_bound): 36 | encoded_form = T.extra_ops.to_one_hot(chosen_note, self.ENCODING_WIDTH) 37 | return encoded_form 38 | 39 | def get_new_relative_position(self, last_chosen_note, last_rel_pos, last_out, low_bound, high_bound, **cur_kwargs): 40 | return T.zeros_like(last_chosen_note) 41 | 42 | def initial_encoded_form(self): 43 | return np.array([1]+[0]*(self.ENCODING_WIDTH-1), np.float32) 44 | -------------------------------------------------------------------------------- /note_encodings/base_encoding.py: -------------------------------------------------------------------------------- 1 | import theano 2 | import theano.tensor as T 3 | from theano.ifelse import ifelse 4 | import 
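The `np.eye(width)[indices]` idiom used in `encode_melody_and_position` above is a compact one-hot conversion: indexing the identity matrix with an index vector picks out one identity row per timestep. A self-contained toy example:

```
import numpy as np

ENCODING_WIDTH = 5                      # toy width: rest, continue, 3 notes
abs_encoded_idxs = np.array([3, 1, 0])  # attack note 1, continue, rest
print(np.eye(ENCODING_WIDTH)[abs_encoded_idxs])
# [[0. 0. 0. 1. 0.]
#  [0. 1. 0. 0. 0.]
#  [1. 0. 0. 0. 0.]]
```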
numpy as np 5 | import constants 6 | 7 | class Encoding( object ): 8 | """ 9 | Base class for note encodings 10 | """ 11 | ENCODING_WIDTH = 0 12 | RAW_ENCODING_WIDTH = 0 13 | STARTING_POSITION = 0 14 | 15 | def encode_melody_and_position(self, melody, chords): 16 | """ 17 | Encode a melody in the correct format 18 | 19 | Parameters: 20 | melody: A melody object, of the form [(note_or_none, dur), ... ], 21 | where note_or_none is either a MIDI note value or None if this is a rest, 22 | and dur is the duration, relative to constants.RESOLUTION_SCALAR. 23 | chords: A chord object, of the form [(root, typevec), ...], 24 | where root is a MIDI note value in 0-12, typevec is a boolean list of length 12 25 | 26 | Returns: 27 | encoded_form: A numpy ndarray (float32) of shape (timestep, ENCODING_WIDTH) representing 28 | the encoded form of the melody 29 | relative_positions: A numpy ndarray (int32) of shape (timestep), where relative_positions[t] 30 | gives the note position that the given timestep encoding is relative to. 31 | """ 32 | raise NotImplementedError("encode_melody_and_position not implemented") 33 | 34 | def decode_to_probs(self, activations, relative_position, low_bound, high_bound): 35 | """ 36 | Convert a set of activations to a probability form across notes. 37 | 38 | Parameters: 39 | activations: A theano tensor (float32) of shape (..., RAW_ENCODING_WIDTH) giving 40 | raw activations from a standard neural network layer 41 | relative_position: A theano tensor of shape (...) giving the current relative position 42 | low_bound: The MIDI index of the lowest note to return 43 | high_bound: The MIDI index of one past the highest note to return 44 | 45 | Returns: 46 | encoded_probs: A theano tensor (float32) of shape (..., 2+high_bound-low_bound) giving a 47 | probability distribution for chosen notes, where 48 | [0]: rest 49 | [1]: continue 50 | [2+x]: play note (low_bound + x) 51 | """ 52 | raise NotImplementedError("decode_to_probs not implemented") 53 | 54 | def note_to_encoding(self, chosen_note, relative_position, low_bound, high_bound): 55 | """ 56 | Convert a chosen note back into an encoded form 57 | 58 | Parameters: 59 | relative_position: A theano tensor of shape (...) giving the current relative position 60 | chosen_note: A theano tensor of shape (...) giving an index into encoded_probs 61 | low_bound: The MIDI index of the lowest note to return 62 | high_bound: The MIDI index of one past the highest note to return 63 | 64 | Returns: 65 | sampled_output: A theano tensor (float32) of shape (..., ENCODING_WIDTH) that is 66 | sampled from encoded_probs. 
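Concretely, the melody and chord objects described in these docstrings look like the following (toy values; durations are counted in slices of `constants.RESOLUTION_SCALAR`):

```
# A note at MIDI 60 for 6 slices, a 6-slice rest, then MIDI 62 for 12 slices.
melody = [(60, 6), (None, 6), (62, 12)]

# One chord entry per timestep: (root pitch class, 12-entry boolean type vector).
cmaj = (0, [1,0,0,0,1,0,0,1,0,0,0,0])  # C major triad pitch classes
chords = [cmaj] * 24                   # matches the melody's 24 total timesteps
```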
Should have the same representation as encoded_form from 67 | encode_melody 68 | """ 69 | raise NotImplementedError("note_to_output not implemented") 70 | 71 | def get_new_relative_position(self, last_chosen_note, last_rel_pos, last_out, low_bound, high_bound, **cur_kwargs): 72 | """ 73 | Get the new relative position for this timestep 74 | 75 | Parameters: 76 | last_chosen_note is a theano tensor of shape (n_batch) indexing into 2+high_bound-low_bound 77 | last_rel_pos is a theano tensor of shape (n_batch) 78 | last_out will be a theano tensor of shape (n_batch, output_size) 79 | cur_kwargs[k] is a theano tensor of shape (n_batch, ...), from kwargs 80 | low_bound: The MIDI index of the lowest note to return 81 | high_bound: The MIDI index of one past the highest note to return 82 | 83 | Returns: 84 | new_pos, a theano tensor of shape (n_batch), giving the new relative position 85 | """ 86 | raise NotImplementedError("get_new_relative_position not implemented") 87 | 88 | 89 | def initial_encoded_form(self): 90 | """ 91 | Returns: A numpy ndarray (float32) of shape (ENCODING_WIDTH) for an initial encoding of 92 | the "previous note" when there is no previous data. Generally should be a representation 93 | of nothing, i.e. of a rest. 94 | """ 95 | raise NotImplementedError("initial_encoded_form not implemented") 96 | 97 | @staticmethod 98 | def encode_absolute_melody(melody, low_bound, high_bound): 99 | """ 100 | Encode an absolute melody 101 | 102 | Parameters: 103 | melody: A melody object, of the form [(note_or_none, dur), ... ], 104 | where note_or_none is either a MIDI note value or None if this is a rest, 105 | and dur is the duration, relative to constants.RESOLUTION_SCALAR. 106 | low_bound: The MIDI index of the lowest note to return 107 | high_bound: The MIDI index of one past the highest note to return 108 | 109 | Returns 110 | A numpy matrix of shape (timestep) giving the int index (in 2+high_bound-low_bound) of the correct note 111 | """ 112 | positions = [] 113 | 114 | for note, dur in melody: 115 | positions.append(0 if note is None else (note-low_bound+2)) 116 | 117 | for _ in range(dur-1): 118 | positions.append(0 if note is None else 1) 119 | 120 | return np.array(positions, np.int32) 121 | 122 | 123 | @staticmethod 124 | def decode_absolute_melody(positions, low_bound, high_bound): 125 | """ 126 | Decode an absolute melody 127 | 128 | Parameters: 129 | A numpy matrix of shape (timestep) giving the int index (in 2+high_bound-low_bound) of the correct note 130 | low_bound: The MIDI index of the lowest note to return 131 | high_bound: The MIDI index of one past the highest note to return 132 | 133 | Returns 134 | melody: A melody object, of the form [(note_or_none, dur), ... ], 135 | where note_or_none is either a MIDI note value or None if this is a rest, 136 | and dur is the duration, relative to constants.RESOLUTION_SCALAR. 137 | """ 138 | melody = [] 139 | 140 | for out in positions.tolist(): 141 | if out==1: 142 | # Continue a note 143 | if len(melody) == 0: 144 | print("ERROR: Can't continue from nothing! 
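A worked example of `encode_absolute_melody`'s index scheme (toy bounds): index 0 is a rest, 1 continues the previous note, and `2+x` attacks MIDI note `low_bound + x`.

```
import numpy as np

low_bound = 48
melody = [(60, 2), (None, 2), (62, 1)]   # toy melody

positions = []
for note, dur in melody:
    positions.append(0 if note is None else (note - low_bound + 2))
    for _ in range(dur - 1):
        positions.append(0 if note is None else 1)

print(np.array(positions, np.int32))     # [14  1  0  0 16]
```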
Inserting rest") 145 | melody.append([None, 0]) 146 | melody[-1][1] += 1 147 | elif out==0: 148 | # Rest 149 | if len(melody)>0 and melody[-1][0] is None: 150 | # More rest 151 | melody[-1][1] += 1 152 | else: 153 | melody.append([None, 1]) 154 | else: 155 | note = out-2 + low_bound 156 | melody.append([note, 1]) 157 | 158 | return [tuple(x) for x in melody] 159 | 160 | @staticmethod 161 | def sample_absolute_probs(srng, probs): 162 | """ 163 | Sample from a probability distribution 164 | 165 | Parameters: 166 | srng: A RandomStreams instance 167 | probs: A matrix of probabilities of shape (n_batch, sample_from) 168 | 169 | Returns: 170 | Sampled output, an index in [0,sample_from) of shape (n_batch) 171 | One-hot encoding of that output of shape (n_batch, sample_from) 172 | """ 173 | n_batch,sample_from = probs.shape 174 | 175 | sample = srng.multinomial(n=1,pvals=probs) 176 | idx = T.cast(T.argmax(sample,axis=1),'int32') 177 | 178 | return idx 179 | 180 | @staticmethod 181 | def compute_loss(probs, absolute_melody, extra_info=False): 182 | """ 183 | Compute loss between probs and an absolute melody 184 | 185 | Parameters: 186 | probs: A theano tensor of shape (batch, time, 2+high_bound-low_bound) 187 | absolute_melody: A tensor of shape (batch, time) with correct indices 188 | extra_info: If True, return extra info 189 | 190 | Returns 191 | A theano tensor loss value. 192 | Also, if extra_info is true, an additional info dict. 193 | """ 194 | n_batch, n_time, prob_width = probs.shape 195 | correct_encoded_form = T.reshape(T.extra_ops.to_one_hot(T.flatten(absolute_melody), prob_width), probs.shape) 196 | loglikelihoods = T.log( probs + constants.EPSILON )*correct_encoded_form 197 | full_loss = T.neg(T.sum(loglikelihoods)) 198 | 199 | if extra_info: 200 | loss_per_timestep = full_loss/T.cast(n_batch*n_time, theano.config.floatX) 201 | accuracy_per_timestep = T.exp(-loss_per_timestep) 202 | 203 | loss_per_batch = full_loss/T.cast(n_batch, theano.config.floatX) 204 | accuracy_per_batch = T.exp(-loss_per_batch) 205 | 206 | num_jumps = T.sum(correct_encoded_form[:,:,2:]) 207 | loss_per_jump = full_loss/T.cast(num_jumps, theano.config.floatX) 208 | accuracy_per_jump = T.exp(-loss_per_jump) 209 | 210 | return full_loss, { 211 | "loss_per_timestep":loss_per_timestep, 212 | "accuracy_per_timestep":accuracy_per_timestep, 213 | "loss_per_batch":loss_per_batch, 214 | "accuracy_per_batch":accuracy_per_batch, 215 | "loss_per_jump":loss_per_jump, 216 | "accuracy_per_jump":accuracy_per_jump 217 | } 218 | else: 219 | return full_loss 220 | -------------------------------------------------------------------------------- /note_encodings/chord_relative.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import theano 3 | import theano.tensor as T 4 | import random 5 | 6 | from .base_encoding import Encoding 7 | 8 | import constants 9 | import leadsheet 10 | import math 11 | 12 | class ChordRelativeEncoding( Encoding ): 13 | """ 14 | An encoding based on the chord. 
Encoding format is a one-hot 15 | 16 | [ 17 | 18 | rest, \ 19 | continue, } (a softmax set of excl probs) 20 | play x 12, / 21 | 22 | ] 23 | 24 | where play is relative to the chord root 25 | """ 26 | 27 | ENCODING_WIDTH = 1 + 1 + 12 28 | WINDOW_SIZE = 12 29 | 30 | def __init__(self, with_artic=True): 31 | self.with_artic = with_artic 32 | self.RAW_ENCODING_WIDTH = self.WINDOW_SIZE + (2 if with_artic else 0) 33 | 34 | def encode_melody_and_position(self, melody, chords): 35 | 36 | time = 0 37 | positions = [] 38 | encoded_form = [] 39 | 40 | for note, dur in melody: 41 | root, ctype = chords[time] 42 | if note is None: 43 | encoded_form.append([1]+[0]+[0]*self.WINDOW_SIZE) 44 | else: 45 | index = (note - root)%self.WINDOW_SIZE 46 | encoded_form.append([0]+[0]+[1 if i==index else 0 for i in range(self.WINDOW_SIZE)]) 47 | 48 | for _ in range(dur-1): 49 | rcp = [1 if note is None else 0] + [0 if note is None else 1] + [0]*self.WINDOW_SIZE 50 | encoded_form.append(rcp) 51 | time += dur 52 | 53 | positions = [root for root,ctype in chords] 54 | 55 | return np.array(encoded_form, np.float32), np.array(positions, np.int32) 56 | 57 | def decode_to_probs(self, activations, relative_position, low_bound, high_bound): 58 | squashed = T.reshape(activations, (-1,self.RAW_ENCODING_WIDTH)) 59 | n_parallel = squashed.shape[0] 60 | probs = T.nnet.softmax(squashed) 61 | 62 | 63 | def _scan_fn(cprobs, cpos): 64 | 65 | if self.with_artic: 66 | abs_probs = cprobs[:2] 67 | rel_probs = cprobs[2:] 68 | else: 69 | rel_probs = cprobs 70 | abs_probs = T.ones((2,)) 71 | 72 | aligned = T.roll(rel_probs, (cpos-low_bound)%12) 73 | 74 | num_tile = int(math.ceil((high_bound-low_bound)/self.WINDOW_SIZE)) 75 | 76 | tiled = T.tile(aligned, (num_tile,))[:(high_bound-low_bound)] 77 | 78 | full = T.concatenate([abs_probs, tiled], 0) 79 | return full 80 | 81 | # probs = theano.printing.Print("probs",['shape'])(probs) 82 | # relative_position = theano.printing.Print("relative_position",['shape'])(relative_position) 83 | from_scan, _ = theano.map(fn=_scan_fn, sequences=[probs, T.flatten(relative_position)]) 84 | # from_scan = theano.printing.Print("from_scan",['shape'])(from_scan) 85 | newshape = T.concatenate([activations.shape[:-1],[2+high_bound-low_bound]],0) 86 | fixed = T.reshape(from_scan, newshape, ndim=activations.ndim) 87 | return fixed 88 | 89 | def note_to_encoding(self, chosen_note, relative_position, low_bound, high_bound): 90 | """ 91 | Convert a chosen note back into an encoded form 92 | 93 | Parameters: 94 | relative_position: A theano tensor of shape (...) giving the current relative position 95 | chosen_note: A theano tensor of shape (...) giving an index into encoded_probs 96 | 97 | Returns: 98 | sampled_output: A theano tensor (float32) of shape (..., ENCODING_WIDTH) that is 99 | sampled from encoded_probs. 
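The `T.roll`/`T.tile` combination in `decode_to_probs` above rotates the 12 chord-relative probabilities so that index 0 lands on the chord root in absolute space, then repeats that octave pattern across the full note range. The same alignment in NumPy (toy values):

```
import numpy as np

rel_probs = np.zeros(12); rel_probs[0] = 1.0   # all mass on the chord root
cpos, low_bound, high_bound = 7, 48, 72        # toy G root, two-octave range

aligned = np.roll(rel_probs, (cpos - low_bound) % 12)
num_tile = int(np.ceil((high_bound - low_bound) / 12))
tiled = np.tile(aligned, num_tile)[:high_bound - low_bound]
print(np.nonzero(tiled)[0] + low_bound)        # [55 67] -- the Gs in range
```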
Should have the same representation as encoded_form from 100 | encode_melody 101 | """ 102 | new_idx = T.switch(chosen_note<2, chosen_note, (chosen_note-2+low_bound-relative_position)%self.WINDOW_SIZE + 2) 103 | sampled_output = T.extra_ops.to_one_hot(new_idx, self.ENCODING_WIDTH) 104 | return sampled_output 105 | 106 | def get_new_relative_position(self, last_chosen_note, last_rel_pos, last_out, low_bound, high_bound, cur_chord_root, **cur_kwargs): 107 | return cur_chord_root 108 | 109 | def initial_encoded_form(self): 110 | return np.array([1]+[0]+[0]*self.WINDOW_SIZE, np.float32) 111 | -------------------------------------------------------------------------------- /note_encodings/circle_of_thirds_encoding.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import theano 3 | import theano.tensor as T 4 | import random 5 | 6 | from .base_encoding import Encoding 7 | 8 | import constants 9 | import leadsheet 10 | import math 11 | 12 | class CircleOfThirdsEncoding( Encoding ): 13 | """ 14 | [ rest sustain play ] 15 | [ () () () () ] 16 | [ () () () ] 17 | [ octave0 ... ] 18 | """ 19 | 20 | STARTING_POSITION = 0 21 | WINDOW_SIZE = 12 22 | 23 | def __init__(self, octave_start, num_octaves): 24 | self.octave_start = octave_start 25 | self.num_octaves = num_octaves 26 | self.ENCODING_WIDTH = 3 + 4 + 3 + num_octaves 27 | self.RAW_ENCODING_WIDTH = self.ENCODING_WIDTH 28 | 29 | def encode_melody_and_position(self, melody, chords): 30 | encoded_form = [] 31 | 32 | for note, dur in melody: 33 | if note is None: 34 | for _ in range(dur): 35 | encoded_form.append([1] + [0]*(self.ENCODING_WIDTH-1)) 36 | else: 37 | pitchclass = note % 12 38 | octave = (note - self.octave_start)//12 39 | 40 | first_circle = [(1 if ((pitchclass-i)%4 == 0) else 0) for i in range(4)] 41 | second_circle = [(1 if ((pitchclass-i)%3 == 0) else 0) for i in range(3)] 42 | octave_enc = [1 if (i==octave) else 0 for i in range(self.num_octaves)] 43 | 44 | enc_timestep = [0,0,1] + first_circle + second_circle + octave_enc 45 | 46 | encoded_form.append(enc_timestep) 47 | 48 | for _ in range(dur-1): 49 | encoded_form.append([0, 1] + [0]*(self.ENCODING_WIDTH-2)) 50 | 51 | encoded_form = np.array(encoded_form, np.float32) 52 | position = np.zeros([encoded_form.shape[0]]) 53 | return encoded_form, position 54 | 55 | def decode_to_probs(self, activations, relative_position, low_bound, high_bound): 56 | assert (low_bound%12==0) and (high_bound-low_bound == self.num_octaves*12), "Circle of thirds must evenly divide into octaves" 57 | squashed = T.reshape(activations, (-1,self.RAW_ENCODING_WIDTH)) 58 | 59 | rsp = T.nnet.softmax(squashed[:,:3]) 60 | c1 = T.nnet.softmax(squashed[:,3:7]) 61 | c2 = T.nnet.softmax(squashed[:,7:10]) 62 | octave_choice = T.nnet.softmax(squashed[:,10:]) 63 | octave_notes = T.tile(c1,(1,3)) * T.tile(c2,(1,4)) 64 | full_notes = T.reshape(T.shape_padright(octave_choice) * T.shape_padaxis(octave_notes, 1), (-1,12*self.num_octaves)) 65 | full_probs = T.concatenate([rsp[:,:2], T.shape_padright(rsp[:,2])*full_notes], 1) 66 | 67 | newshape = T.concatenate([activations.shape[:-1],[2+high_bound-low_bound]],0) 68 | fixed = T.reshape(full_probs, newshape, ndim=activations.ndim) 69 | return fixed 70 | 71 | def note_to_encoding(self, chosen_note, relative_position, low_bound, high_bound): 72 | assert chosen_note.ndim == 1 73 | n_batch = chosen_note.shape[0] 74 | 75 | dont_play_version = T.switch( T.shape_padright(T.eq(chosen_note, 0)), 76 | T.tile(np.array([[1,0] + 
[0]*(self.ENCODING_WIDTH-2)], dtype=np.float32), (n_batch, 1)), 77 | T.tile(np.array([[0,1] + [0]*(self.ENCODING_WIDTH-2)], dtype=np.float32), (n_batch, 1))) 78 | 79 | rcp = T.tile(np.array([0,0,1],dtype=np.float32), (n_batch, 1)) 80 | circle_1 = T.eye(4)[(chosen_note-2)%4] 81 | circle_2 = T.eye(3)[(chosen_note-2)%3] 82 | octave = T.eye(self.num_octaves)[(chosen_note-2+low_bound-self.octave_start)//12] 83 | 84 | play_version = T.concatenate([rcp, circle_1, circle_2, octave], 1) 85 | 86 | encoded_form = T.switch( T.shape_padright(T.lt(chosen_note, 2)), dont_play_version, play_version ) 87 | return encoded_form 88 | 89 | def get_new_relative_position(self, last_chosen_note, last_rel_pos, last_out, low_bound, high_bound, **cur_kwargs): 90 | return T.zeros_like(last_chosen_note) 91 | 92 | def initial_encoded_form(self): 93 | return np.array([1]+[0]*(self.ENCODING_WIDTH-1), np.float32) 94 | -------------------------------------------------------------------------------- /note_encodings/relative_jump.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import theano 3 | import theano.tensor as T 4 | import random 5 | 6 | from .base_encoding import Encoding 7 | 8 | import constants 9 | import leadsheet 10 | 11 | def rotate(li, x): 12 | """ 13 | Rotate list li by x spaces to the right, i.e. 14 | rotate([1,2,3,4],1) -> [4,1,2,3] 15 | """ 16 | if len(li) == 0: return [] 17 | return li[-x % len(li):] + li[:-x % len(li)] 18 | 19 | class RelativeJumpEncoding( Encoding ): 20 | """ 21 | An encoding based on relative jumps. Encoding format is a one-hot 22 | 23 | [ 24 | 25 | rest, \ 26 | continue, } (a softmax set of excl probs) 27 | play x WINDOW_SIZE, / 28 | 29 | ] 30 | 31 | where WINDOW_SIZE gives the number of places to which we can jump. 32 | """ 33 | 34 | WINDOW_RADIUS = 12 35 | WINDOW_SIZE = WINDOW_RADIUS*2+1 36 | 37 | STARTING_POSITION = 72 38 | 39 | ENCODING_WIDTH = 1 + 1 + WINDOW_SIZE 40 | 41 | def __init__(self, with_artic=True): 42 | self.with_artic = with_artic 43 | self.RAW_ENCODING_WIDTH = self.WINDOW_SIZE + (2 if with_artic else 0) 44 | 45 | def encode_melody_and_position(self, melody, chords): 46 | 47 | positions = [] 48 | encoded_form = [] 49 | 50 | cur_pos = next((n for n,d in melody if n is not None), self.STARTING_POSITION) + random.randrange(-self.WINDOW_RADIUS, self.WINDOW_RADIUS+1) 51 | 52 | positions.append(cur_pos) 53 | 54 | for note, dur in melody: 55 | if note is None: 56 | delta = 0 57 | else: 58 | delta = note - cur_pos 59 | cur_pos = note 60 | if not (-self.WINDOW_RADIUS <= delta <= self.WINDOW_RADIUS): 61 | olddelta = delta 62 | if delta>0: 63 | delta = delta % self.WINDOW_RADIUS 64 | else: 65 | delta = -(-delta % self.WINDOW_RADIUS) 66 | # print("WARNING: Jump of size {} from {} to {} not allowed. 
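The two circles in `CircleOfThirdsEncoding` above store a pitch class as its residues mod 4 and mod 3; since 4 and 3 are coprime, that pair determines the pitch class uniquely (the Chinese remainder theorem), which is why `decode_to_probs` can rebuild all 12 pitch classes by multiplying a 4-way and a 3-way softmax. A quick check:

```
# Every pitch class 0-11 maps to a distinct (mod 4, mod 3) residue pair.
pairs = [(pc % 4, pc % 3) for pc in range(12)]
assert len(set(pairs)) == 12
print(pairs[7])  # pitch class 7 (G) -> (3, 1)
```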
Substituting jump of size {}".format(olddelta, note-olddelta, note, delta)) 67 | 68 | rcp = ([1 if note is None else 0] + 69 | [0] + 70 | [(1 if i==delta and note is not None else 0) for i in range(-self.WINDOW_RADIUS, self.WINDOW_RADIUS+1)]) 71 | 72 | encoded_form.append(rcp) # for this timestep 73 | positions.append(cur_pos) # for next timestep 74 | 75 | for _ in range(dur-1): 76 | rcp = [1 if note is None else 0] + [0 if note is None else 1] + [0]*self.WINDOW_SIZE 77 | encoded_form.append(rcp) 78 | positions.append(cur_pos) 79 | 80 | # Remove last position, since nothing is relative to it 81 | positions = positions[:-1] 82 | 83 | return np.array(encoded_form, np.float32), np.array(positions, np.int32) 84 | 85 | def decode_to_probs(self, activations, relative_position, low_bound, high_bound): 86 | squashed = T.reshape(activations, (-1,self.RAW_ENCODING_WIDTH)) 87 | n_parallel = squashed.shape[0] 88 | probs = T.nnet.softmax(squashed) 89 | 90 | def _scan_fn(cprobs, cpos): 91 | # cprobs = theano.printing.Print("cprobs",['shape'])(cprobs) 92 | # cpos = theano.printing.Print("cpos",['shape'])(cpos) 93 | 94 | if self.with_artic: 95 | abs_probs = cprobs[:2] 96 | rel_probs = cprobs[2:] 97 | else: 98 | rel_probs = cprobs 99 | 100 | # abs_probs = theano.printing.Print("abs_probs",['shape'])(abs_probs) 101 | # rel_probs = theano.printing.Print("rel_probs",['shape'])(rel_probs) 102 | 103 | # Start index: 104 | # *****[-----------------------------] 105 | # [****{-|------] 106 | # 107 | # [-----------------------------] 108 | # ~~~~~{------|------] 109 | start_diff = low_bound - (cpos-self.WINDOW_RADIUS) 110 | startidx = T.maximum(0, start_diff) 111 | startpadding = T.maximum(0, -start_diff) 112 | # End index: 113 | # [-----------------------------] 114 | # [******|**}---] 115 | # 116 | # [-----------------------------] 117 | # [******|******}~~~~~~~~~~~ 118 | endidx = T.minimum(self.WINDOW_SIZE, high_bound - (cpos-self.WINDOW_RADIUS)) 119 | endpadding = T.maximum(0, high_bound-(cpos+self.WINDOW_RADIUS+1)) 120 | 121 | # start_diff = theano.printing.Print("start_diff",['shape','__str__'])(start_diff) 122 | # startidx = theano.printing.Print("startidx",['shape','__str__'])(startidx) 123 | # startpadding = theano.printing.Print("startpadding",['shape','__str__'])(startpadding) 124 | # endidx = theano.printing.Print("endidx",['shape','__str__'])(endidx) 125 | # endpadding = theano.printing.Print("endpadding",['shape','__str__'])(endpadding) 126 | 127 | cropped = rel_probs[startidx:endidx] 128 | 129 | if self.with_artic: 130 | normalize_sum = T.sum(cropped) + T.sum(abs_probs) 131 | normalize_sum = T.maximum(normalize_sum, constants.EPSILON) 132 | padded = T.concatenate([abs_probs/normalize_sum, T.zeros((startpadding,)), cropped/normalize_sum, T.zeros((endpadding,))], 0) 133 | else: 134 | normalize_sum = T.sum(cropped) 135 | normalize_sum = T.maximum(normalize_sum, constants.EPSILON) 136 | padded = T.concatenate([T.ones((2,)), T.zeros((startpadding,)), cropped/normalize_sum, T.zeros((endpadding,))], 0) 137 | 138 | # padded = theano.printing.Print("padded",['shape'])(padded) 139 | return padded 140 | 141 | # probs = theano.printing.Print("probs",['shape'])(probs) 142 | # relative_position = theano.printing.Print("relative_position",['shape'])(relative_position) 143 | from_scan, _ = theano.map(fn=_scan_fn, sequences=[probs, T.flatten(relative_position)]) 144 | # from_scan = theano.printing.Print("from_scan",['shape'])(from_scan) 145 | newshape = 
T.concatenate([activations.shape[:-1],[2+high_bound-low_bound]],0) 146 | fixed = T.reshape(from_scan, newshape, ndim=activations.ndim) 147 | return fixed 148 | 149 | def note_to_encoding(self, chosen_note, relative_position, low_bound, high_bound): 150 | """ 151 | Convert a chosen note back into an encoded form 152 | 153 | Parameters: 154 | relative_position: A theano tensor of shape (...) giving the current relative position 155 | chosen_note: A theano tensor of shape (...) giving an index into encoded_probs 156 | 157 | Returns: 158 | sampled_output: A theano tensor (float32) of shape (..., ENCODING_WIDTH) that is 159 | sampled from encoded_probs. Should have the same representation as encoded_form from 160 | encode_melody 161 | """ 162 | new_idx = T.switch(chosen_note<2, chosen_note, chosen_note+low_bound-relative_position+self.WINDOW_RADIUS) 163 | new_idx = T.opt.Assert("new_idx should be less than {}".format(self.ENCODING_WIDTH))(new_idx, T.all(new_idx < self.ENCODING_WIDTH)) 164 | sampled_output = T.extra_ops.to_one_hot(new_idx, self.ENCODING_WIDTH) 165 | return sampled_output 166 | 167 | def get_new_relative_position(self, last_chosen_note, last_rel_pos, last_out, low_bound, high_bound, **cur_kwargs): 168 | return T.switch(last_chosen_note<2, last_rel_pos, last_chosen_note+low_bound-2) 169 | 170 | def initial_encoded_form(self): 171 | return np.array([1]+[0]+[0]*self.WINDOW_SIZE, np.float32) 172 | -------------------------------------------------------------------------------- /note_encodings/rhythm_only.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import theano 3 | import theano.tensor as T 4 | import random 5 | 6 | from .base_encoding import Encoding 7 | 8 | import constants 9 | import leadsheet 10 | import math 11 | 12 | class RhythmOnlyEncoding( Encoding ): 13 | """ 14 | An encoding that only encodes rhythm, of the form 15 | 16 | [ 17 | 18 | rest, \ 19 | continue, } (a softmax set of excl probs) 20 | articulate, / 21 | 22 | ] 23 | """ 24 | 25 | ENCODING_WIDTH = 3 26 | WINDOW_SIZE = 12 27 | RAW_ENCODING_WIDTH = 3 28 | 29 | def encode_melody_and_position(self, melody, chords): 30 | 31 | time = 0 32 | positions = [] 33 | encoded_form = [] 34 | 35 | for note, dur in melody: 36 | root, ctype = chords[time] 37 | if note is None: 38 | encoded_form.append([1,0,0]) 39 | else: 40 | encoded_form.append([0,0,1]) 41 | 42 | for _ in range(dur-1): 43 | encoded_form.append([1,0,0] if note is None else [0,1,0]) 44 | time += dur 45 | 46 | positions = [root for root,ctype in chords] 47 | 48 | return np.array(encoded_form, np.float32), np.array(positions, np.int32) 49 | 50 | def decode_to_probs(self, activations, relative_position, low_bound, high_bound): 51 | squashed = T.reshape(activations, (-1,self.RAW_ENCODING_WIDTH)) 52 | n_parallel = squashed.shape[0] 53 | probs = T.nnet.softmax(squashed) 54 | 55 | abs_probs = probs[:,:2] 56 | artic_prob = probs[:,2:] 57 | repeated_artic_probs = T.tile(artic_prob, (1,high_bound-low_bound)) 58 | 59 | full_probs = T.concatenate([abs_probs,repeated_artic_probs],1) 60 | 61 | newshape = T.concatenate([activations.shape[:-1],[2+high_bound-low_bound]],0) 62 | fixed = T.reshape(full_probs, newshape, ndim=activations.ndim) 63 | return fixed 64 | 65 | def note_to_encoding(self, chosen_note, relative_position, low_bound, high_bound): 66 | """ 67 | Convert a chosen note back into an encoded form 68 | 69 | Parameters: 70 | relative_position: A theano tensor of shape (...) 
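To make the index arithmetic in `note_to_encoding` and `get_new_relative_position` above concrete: a chosen index `c >= 2` means absolute MIDI note `low_bound + c - 2`, and its slot in the relative encoding is the jump from the previous position offset by `WINDOW_RADIUS`, plus the two rest/continue slots. In plain Python (toy values):

```
WINDOW_RADIUS = 12
low_bound, last_position = 48, 55   # toy bounds and previous position
chosen_note = 2 + 12                # index that means absolute MIDI note 60

note = low_bound + chosen_note - 2
new_idx = chosen_note + low_bound - last_position + WINDOW_RADIUS
print(note, new_idx)                # 60 19 -- slot 2 + (60 - 55 + 12)
```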
giving the current relative position 71 | chosen_note: A theano tensor of shape (...) giving an index into encoded_probs 72 | 73 | Returns: 74 | sampled_output: A theano tensor (float32) of shape (..., ENCODING_WIDTH) that is 75 | sampled from encoded_probs. Should have the same representation as encoded_form from 76 | encode_melody 77 | """ 78 | new_idx = T.switch(chosen_note<2, chosen_note, 2) 79 | sampled_output = T.extra_ops.to_one_hot(new_idx, self.ENCODING_WIDTH) 80 | return sampled_output 81 | 82 | def get_new_relative_position(self, last_chosen_note, last_rel_pos, last_out, low_bound, high_bound, cur_chord_root, **cur_kwargs): 83 | return cur_chord_root 84 | 85 | def initial_encoded_form(self): 86 | return np.array([1,0,0], np.float32) 87 | -------------------------------------------------------------------------------- /param_cvt.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import os 3 | import leadsheet 4 | import argparse 5 | import pickle 6 | import numpy as np 7 | import zipfile 8 | import io 9 | 10 | def main(file, precision, keys=None, output=None, make_zip=False): 11 | params = pickle.load(open(file, 'rb')) 12 | param_vals = [x if isinstance(x,np.ndarray) else x.get_value() for x in params] 13 | if output is None: 14 | output = os.path.splitext(file)[0] + (".ctome" if make_zip else "-raw") 15 | 16 | with open(keys,'r') as f: 17 | config_info = f.readline() 18 | key_names = f.readlines() 19 | assert len(key_names) == len(params), "Wrong number of keys for params! {} keys, {} params".format(len(key_names), len(params)) 20 | 21 | fmt = '%.{}e'.format(precision) 22 | if make_zip: 23 | with zipfile.ZipFile(output, 'w', zipfile.ZIP_DEFLATED) as zfile: 24 | for name,val in zip(key_names, param_vals): 25 | with io.BytesIO() as str_capture: 26 | np.savetxt(str_capture, val, fmt=fmt, delimiter=",") 27 | zfile.writestr("param_{}.csv".format(name.strip()), str_capture.getvalue()) 28 | zfile.writestr("config.txt", config_info) 29 | else: 30 | for name,val in zip(key_names, param_vals): 31 | np.savetxt("{}_{}.csv".format(output,name.strip()), val, fmt=fmt, delimiter=",") 32 | with open("{}_config.txt".format(output), 'w') as f: 33 | f.write(config_info) 34 | 35 | parser = argparse.ArgumentParser(description='Convert a python parameters file into an Impro-Visor connectome file') 36 | parser.add_argument('file', help='File to process') 37 | parser.add_argument('--keys', help='File to load parameter names from', required=True) 38 | parser.add_argument('--output', help='Base name of the output files') 39 | parser.add_argument('--precision', default=18, type=int, help='Decimal points of precision to use (default 18)') 40 | parser.add_argument('--raw', dest='make_zip', action='store_false', help='Create individual csv files instead of a connectome file') 41 | 42 | if __name__ == '__main__': 43 | args = parser.parse_args() 44 | main(**vars(args)) 45 | -------------------------------------------------------------------------------- /param_keys/ae_abs_keys.txt: -------------------------------------------------------------------------------- 1 | autoencoder_absolute 2 | enc_lstm1_input_w 3 | enc_lstm1_input_b 4 | enc_lstm1_forget_w 5 | enc_lstm1_forget_b 6 | enc_lstm1_activate_w 7 | enc_lstm1_activate_b 8 | enc_lstm1_out_w 9 | enc_lstm1_out_b 10 | enc_lstm2_input_w 11 | enc_lstm2_input_b 12 | enc_lstm2_forget_w 13 | enc_lstm2_forget_b 14 | enc_lstm2_activate_w 15 | enc_lstm2_activate_b 16 | enc_lstm2_out_w 17 | enc_lstm2_out_b 18 | 
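Since `param_cvt.py` writes a `.ctome` as a plain DEFLATE zip containing one `param_*.csv` per parameter plus a `config.txt`, the result can be inspected with the standard library alone. A sketch (the path is a placeholder):

```
import zipfile
import numpy as np

with zipfile.ZipFile("output_my_dataset/final_params.ctome") as zfile:
    print(zfile.read("config.txt").decode())
    for name in zfile.namelist():
        if name.startswith("param_"):
            with zfile.open(name) as f:
                print(name, np.loadtxt(f, delimiter=",").shape)
```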
enc_full_w 19 | enc_full_b 20 | enc_lstm1_initialstate 21 | enc_lstm2_initialstate 22 | dec_lstm1_input_w 23 | dec_lstm1_input_b 24 | dec_lstm1_forget_w 25 | dec_lstm1_forget_b 26 | dec_lstm1_activate_w 27 | dec_lstm1_activate_b 28 | dec_lstm1_out_w 29 | dec_lstm1_out_b 30 | dec_lstm2_input_w 31 | dec_lstm2_input_b 32 | dec_lstm2_forget_w 33 | dec_lstm2_forget_b 34 | dec_lstm2_activate_w 35 | dec_lstm2_activate_b 36 | dec_lstm2_out_w 37 | dec_lstm2_out_b 38 | dec_full_w 39 | dec_full_b 40 | dec_lstm1_initialstate 41 | dec_lstm2_initialstate 42 | -------------------------------------------------------------------------------- /param_keys/ae_poex_keys.txt: -------------------------------------------------------------------------------- 1 | autoencoder_product_interval_chords 2 | enc_0_lstm1_input_w 3 | enc_0_lstm1_input_b 4 | enc_0_lstm1_forget_w 5 | enc_0_lstm1_forget_b 6 | enc_0_lstm1_activate_w 7 | enc_0_lstm1_activate_b 8 | enc_0_lstm1_out_w 9 | enc_0_lstm1_out_b 10 | enc_0_lstm2_input_w 11 | enc_0_lstm2_input_b 12 | enc_0_lstm2_forget_w 13 | enc_0_lstm2_forget_b 14 | enc_0_lstm2_activate_w 15 | enc_0_lstm2_activate_b 16 | enc_0_lstm2_out_w 17 | enc_0_lstm2_out_b 18 | enc_0_full_w 19 | enc_0_full_b 20 | enc_0_lstm1_initialstate 21 | enc_0_lstm2_initialstate 22 | enc_1_lstm1_input_w 23 | enc_1_lstm1_input_b 24 | enc_1_lstm1_forget_w 25 | enc_1_lstm1_forget_b 26 | enc_1_lstm1_activate_w 27 | enc_1_lstm1_activate_b 28 | enc_1_lstm1_out_w 29 | enc_1_lstm1_out_b 30 | enc_1_lstm2_input_w 31 | enc_1_lstm2_input_b 32 | enc_1_lstm2_forget_w 33 | enc_1_lstm2_forget_b 34 | enc_1_lstm2_activate_w 35 | enc_1_lstm2_activate_b 36 | enc_1_lstm2_out_w 37 | enc_1_lstm2_out_b 38 | enc_1_full_w 39 | enc_1_full_b 40 | enc_1_lstm1_initialstate 41 | enc_1_lstm2_initialstate 42 | dec_0_lstm1_input_w 43 | dec_0_lstm1_input_b 44 | dec_0_lstm1_forget_w 45 | dec_0_lstm1_forget_b 46 | dec_0_lstm1_activate_w 47 | dec_0_lstm1_activate_b 48 | dec_0_lstm1_out_w 49 | dec_0_lstm1_out_b 50 | dec_0_lstm2_input_w 51 | dec_0_lstm2_input_b 52 | dec_0_lstm2_forget_w 53 | dec_0_lstm2_forget_b 54 | dec_0_lstm2_activate_w 55 | dec_0_lstm2_activate_b 56 | dec_0_lstm2_out_w 57 | dec_0_lstm2_out_b 58 | dec_0_full_w 59 | dec_0_full_b 60 | dec_0_lstm1_initialstate 61 | dec_0_lstm2_initialstate 62 | dec_1_lstm1_input_w 63 | dec_1_lstm1_input_b 64 | dec_1_lstm1_forget_w 65 | dec_1_lstm1_forget_b 66 | dec_1_lstm1_activate_w 67 | dec_1_lstm1_activate_b 68 | dec_1_lstm1_out_w 69 | dec_1_lstm1_out_b 70 | dec_1_lstm2_input_w 71 | dec_1_lstm2_input_b 72 | dec_1_lstm2_forget_w 73 | dec_1_lstm2_forget_b 74 | dec_1_lstm2_activate_w 75 | dec_1_lstm2_activate_b 76 | dec_1_lstm2_out_w 77 | dec_1_lstm2_out_b 78 | dec_1_full_w 79 | dec_1_full_b 80 | dec_1_lstm1_initialstate 81 | dec_1_lstm2_initialstate 82 | -------------------------------------------------------------------------------- /param_keys/corn_keys.txt: -------------------------------------------------------------------------------- 1 | name_generator 2 | lstm_input_w 3 | lstm_input_b 4 | lstm_forget_w 5 | lstm_forget_b 6 | lstm_activate_w 7 | lstm_activate_b 8 | lstm_out_w 9 | lstm_out_b 10 | lstm_initialstate 11 | full_w 12 | full_b -------------------------------------------------------------------------------- /param_keys/poex_keys.txt: -------------------------------------------------------------------------------- 1 | generative_product_interval_chords 2 | 0_lstm1_input_w 3 | 0_lstm1_input_b 4 | 0_lstm1_forget_w 5 | 0_lstm1_forget_b 6 | 0_lstm1_activate_w 7 | 0_lstm1_activate_b 8 | 
0_lstm1_out_w 9 | 0_lstm1_out_b 10 | 0_lstm2_input_w 11 | 0_lstm2_input_b 12 | 0_lstm2_forget_w 13 | 0_lstm2_forget_b 14 | 0_lstm2_activate_w 15 | 0_lstm2_activate_b 16 | 0_lstm2_out_w 17 | 0_lstm2_out_b 18 | 0_full_w 19 | 0_full_b 20 | 0_lstm1_initialstate 21 | 0_lstm2_initialstate 22 | 1_lstm1_input_w 23 | 1_lstm1_input_b 24 | 1_lstm1_forget_w 25 | 1_lstm1_forget_b 26 | 1_lstm1_activate_w 27 | 1_lstm1_activate_b 28 | 1_lstm1_out_w 29 | 1_lstm1_out_b 30 | 1_lstm2_input_w 31 | 1_lstm2_input_b 32 | 1_lstm2_forget_w 33 | 1_lstm2_forget_b 34 | 1_lstm2_activate_w 35 | 1_lstm2_activate_b 36 | 1_lstm2_out_w 37 | 1_lstm2_out_b 38 | 1_full_w 39 | 1_full_b 40 | 1_lstm1_initialstate 41 | 1_lstm2_initialstate -------------------------------------------------------------------------------- /param_keys/poex_sep_rhythm_keys.txt: -------------------------------------------------------------------------------- 1 | generative_product_interval_chords_rhythm 2 | 0_lstm1_input_w 3 | 0_lstm1_input_b 4 | 0_lstm1_forget_w 5 | 0_lstm1_forget_b 6 | 0_lstm1_activate_w 7 | 0_lstm1_activate_b 8 | 0_lstm1_out_w 9 | 0_lstm1_out_b 10 | 0_lstm2_input_w 11 | 0_lstm2_input_b 12 | 0_lstm2_forget_w 13 | 0_lstm2_forget_b 14 | 0_lstm2_activate_w 15 | 0_lstm2_activate_b 16 | 0_lstm2_out_w 17 | 0_lstm2_out_b 18 | 0_full_w 19 | 0_full_b 20 | 0_lstm1_initialstate 21 | 0_lstm2_initialstate 22 | 1_lstm1_input_w 23 | 1_lstm1_input_b 24 | 1_lstm1_forget_w 25 | 1_lstm1_forget_b 26 | 1_lstm1_activate_w 27 | 1_lstm1_activate_b 28 | 1_lstm1_out_w 29 | 1_lstm1_out_b 30 | 1_lstm2_input_w 31 | 1_lstm2_input_b 32 | 1_lstm2_forget_w 33 | 1_lstm2_forget_b 34 | 1_lstm2_activate_w 35 | 1_lstm2_activate_b 36 | 1_lstm2_out_w 37 | 1_lstm2_out_b 38 | 1_full_w 39 | 1_full_b 40 | 1_lstm1_initialstate 41 | 1_lstm2_initialstate 42 | 2_lstm1_input_w 43 | 2_lstm1_input_b 44 | 2_lstm1_forget_w 45 | 2_lstm1_forget_b 46 | 2_lstm1_activate_w 47 | 2_lstm1_activate_b 48 | 2_lstm1_out_w 49 | 2_lstm1_out_b 50 | 2_lstm2_input_w 51 | 2_lstm2_input_b 52 | 2_lstm2_forget_w 53 | 2_lstm2_forget_b 54 | 2_lstm2_activate_w 55 | 2_lstm2_activate_b 56 | 2_lstm2_out_w 57 | 2_lstm2_out_b 58 | 2_full_w 59 | 2_full_b 60 | 2_lstm1_initialstate 61 | 2_lstm2_initialstate -------------------------------------------------------------------------------- /plot_data.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import matplotlib.pyplot as plt 3 | import matplotlib 4 | import sys 5 | import argparse 6 | 7 | def plot_file(fn): 8 | with open(fn,'r') as f: 9 | legend = f.readline() 10 | 11 | colnames = legend.strip().split(', ') 12 | data = np.loadtxt(fn, skiprows=1, delimiter=',') 13 | 14 | markers = ".ov^<>12348sp*hH+xDd" 15 | colors = "bgrcmyk" 16 | 17 | skip=100 18 | timestep = data[::skip,0] 19 | handles = [] 20 | for i,colname in enumerate(colnames[1:]): 21 | val = data[::skip,1+i] 22 | handles.append(plt.scatter(timestep, val, marker=markers[i%len(markers)], color=colors[i%len(colors)])) 23 | 24 | plt.legend(handles, colnames[1:]) 25 | plt.show() 26 | 27 | parser = argparse.ArgumentParser(description='Plot a .csv file') 28 | parser.add_argument('fn', help='File to plot') 29 | 30 | if __name__ == '__main__': 31 | args = parser.parse_args() 32 | plot_file(**vars(args)) 33 | -------------------------------------------------------------------------------- /plot_internal_state.py: -------------------------------------------------------------------------------- 1 | import constants 2 | import matplotlib 3 | import 
matplotlib.pyplot as plt 4 | import itertools 5 | import sys 6 | import numpy as np 7 | import os 8 | import argparse 9 | plt.ion() 10 | 11 | import custom_cmap 12 | my_cmap = matplotlib.colors.ListedColormap(custom_cmap.test_cm.colors[::-1]) 13 | 14 | # probs = np.load('generation/dataset_10000_probs.npy') 15 | # probs_jump = np.load('generation/dataset_10000_info_0.npy') 16 | # probs_chord = np.load('generation/dataset_10000_info_1.npy') 17 | # chosen = np.load('generation/dataset_10000_chosen.npy') 18 | # chosen_map = np.eye(probs.shape[-1])[chosen] 19 | 20 | def plot_note_dist(mat, name="", show_octaves=True): 21 | f = plt.figure(figsize=(20,5)) 22 | f.canvas.set_window_title(name) 23 | plt.imshow(mat.T, origin="lower", interpolation="nearest", cmap=my_cmap) 24 | plt.xticks( np.arange(0,4*(constants.WHOLE//constants.RESOLUTION_SCALAR),(constants.QUARTER//constants.RESOLUTION_SCALAR)) ) 25 | plt.xlabel('Time (beat/12)') 26 | plt.ylabel('Note') 27 | plt.colorbar() 28 | if show_octaves: 29 | for y in range(0,36,12): 30 | plt.axhline(y + 1.5, color='c') 31 | for x in range(0,4*(constants.WHOLE//constants.RESOLUTION_SCALAR),(constants.QUARTER//constants.RESOLUTION_SCALAR)): 32 | plt.axvline(x-0.5, color='k') 33 | for x in range(0,4*(constants.WHOLE//constants.RESOLUTION_SCALAR),(constants.WHOLE//constants.RESOLUTION_SCALAR)): 34 | plt.axvline(x-0.5, color='c') 35 | plt.show() 36 | 37 | def plot_scalar(mat, name=""): 38 | f = plt.figure(figsize=(20,5)) 39 | f.canvas.set_window_title(name) 40 | plt.bar(range(mat.shape[0]),mat,1) 41 | plt.xticks( np.arange(0,4*(constants.WHOLE//constants.RESOLUTION_SCALAR),(constants.QUARTER//constants.RESOLUTION_SCALAR)) ) 42 | plt.xlabel('Time (beat/12)') 43 | plt.ylabel('Strength') 44 | for x in range(0,4*(constants.WHOLE//constants.RESOLUTION_SCALAR),(constants.QUARTER//constants.RESOLUTION_SCALAR)): 45 | plt.axvline(x, color='k') 46 | for x in range(0,4*(constants.WHOLE//constants.RESOLUTION_SCALAR),(constants.WHOLE//constants.RESOLUTION_SCALAR)): 47 | plt.axvline(x, color='c') 48 | plt.show() 49 | 50 | 51 | def plot_all(folder, idx=0): 52 | probs = np.load(os.path.join(folder,'generated_probs.npy')) 53 | chosen_raw = np.load(os.path.join(folder,'generated_chosen.npy')) 54 | chosen = np.eye(probs.shape[-1])[chosen_raw] 55 | plot_note_dist(probs[idx], 'Probabilities') 56 | plot_note_dist(chosen[idx], 'Chosen') 57 | try: 58 | for i in itertools.count(): 59 | probs_info = np.load(os.path.join(folder,'generated_info_{}.npy'.format(i))) 60 | if len(probs_info.shape) == 3: 61 | show_octaves = probs_info.shape[2] < 40 62 | plot_note_dist(probs_info[idx], 'Info {}'.format(i), show_octaves) 63 | else: 64 | plot_scalar(probs_info[idx], 'Info {}'.format(i)) 65 | except FileNotFoundError: 66 | pass 67 | 68 | parser = argparse.ArgumentParser(description='Plot the internal state of a network') 69 | parser.add_argument('folder', help='Directory with the generated files') 70 | parser.add_argument('idx', type=int, help='Zero-based index of the output to visualize') 71 | 72 | if __name__ == '__main__': 73 | args = parser.parse_args() 74 | plot_all(**vars(args)) 75 | input("Press enter to close.") 76 | -------------------------------------------------------------------------------- /queue_managers/__init__.py: -------------------------------------------------------------------------------- 1 | from .queue_base import QueueManager 2 | from .standard_manager import StandardQueueManager 3 | from .variational_manager import VariationalQueueManager 4 | from 
.sampling_variational_manager import SamplingVariationalQueueManager 5 | from .queueless_variational_manager import QueuelessVariationalQueueManager 6 | from .queueless_standard_manager import QueuelessStandardQueueManager 7 | from .nearness_standard_manager import NearnessStandardQueueManager 8 | from .noise_wrapper import NoiseWrapper 9 | -------------------------------------------------------------------------------- /queue_managers/nearness_standard_manager.py: -------------------------------------------------------------------------------- 1 | from .queue_base import QueueManager 2 | import theano 3 | import theano.tensor as T 4 | import numpy as np 5 | from .standard_manager import StandardQueueManager 6 | 7 | class NearnessStandardQueueManager( StandardQueueManager ): 8 | """ 9 | A standard queue manager, using a configurable set of functions, with an exponential different loss 10 | """ 11 | 12 | def __init__(self, feature_size, penalty_shock, penalty_base, falloff_rate, vector_activation_fun=T.nnet.sigmoid, loss_fun=(lambda x:x)): 13 | super().__init__(feature_size, vector_activation_fun, loss_fun) 14 | self._penalty_shock = penalty_shock 15 | self._penalty_base = penalty_base 16 | self._falloff_rate = falloff_rate 17 | 18 | def get_loss(self, raw_feature_strengths, raw_feature_vects, extra_info=False): 19 | raw_losses = self._loss_fun(raw_feature_strengths) 20 | raw_sum = T.sum(raw_losses) 21 | 22 | n_parallel, n_timestep = raw_feature_strengths.shape 23 | 24 | falloff_arr = np.array(self._falloff_rate, np.float32) ** T.cast(T.arange(n_timestep), 'float32') 25 | falloff_mat = T.shape_padright(falloff_arr) / T.shape_padleft(falloff_arr) 26 | falloff_scaling = T.switch(T.ge(falloff_mat,1), 0, falloff_mat)/self._falloff_rate 27 | # falloff_scaling is of shape (n_timestep, n_timestep) with 0 along diagonal, and jump to 1 falling off along dimension 1 28 | # now we want to multiply through on both dimensions 29 | first_multiply = T.dot(raw_feature_strengths, falloff_scaling) # shape (n_parallel, n_timestep) 30 | second_multiply = raw_feature_strengths * first_multiply 31 | unscaled_falloff_penalty = T.sum(second_multiply) 32 | 33 | full_loss = self._penalty_base * raw_sum + self._penalty_shock * unscaled_falloff_penalty 34 | 35 | if extra_info: 36 | return full_loss, {"raw_loss_sum":raw_sum} 37 | else: 38 | return full_loss 39 | -------------------------------------------------------------------------------- /queue_managers/noise_wrapper.py: -------------------------------------------------------------------------------- 1 | import theano 2 | import theano.tensor as T 3 | import numpy as np 4 | from theano.sandbox.rng_mrg import MRG_RandomStreams 5 | from .queue_base import QueueManager 6 | from util import sliceMaker 7 | 8 | class NoiseWrapper( QueueManager ): 9 | """ 10 | Queue manager that wraps another queue manager and adds noise to it 11 | """ 12 | def __init__(self, inner_manager, pre_std=None, post_std=None, pre_mask=sliceMaker[:]): 13 | """ 14 | Initialize the manager. 15 | 16 | Parameters: 17 | pre_std: Standard deviation of noise to apply to input activations 18 | post_std: Standard deviation of noise to apply to output vector 19 | pre_mask: A slice that determines which part of input activations 20 | to apply noise to (i.e. 
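The penalty in `NearnessStandardQueueManager.get_loss` above charges every ordered pair of pushes, decaying geometrically with the time between them; dividing by `falloff_rate` makes adjacent pushes cost their full product. A NumPy transcription of the same computation (toy strengths):

```
import numpy as np

falloff_rate = 0.5
s = np.array([[1.0, 1.0, 0.0, 1.0]])  # (n_parallel, n_timestep): pushes at 0, 1, 3

falloff_arr = falloff_rate ** np.arange(4)
falloff_mat = falloff_arr[:, None] / falloff_arr[None, :]
falloff_scaling = np.where(falloff_mat >= 1, 0, falloff_mat) / falloff_rate
print(np.sum(s * (s @ falloff_scaling)))  # pairs (0,1),(1,3),(0,3): 1 + 0.5 + 0.25
```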
activations[:,:,pre_mask]) 21 | """ 22 | self._srng = MRG_RandomStreams(np.random.randint(0, 1024)) 23 | self._inner_manager = inner_manager 24 | self._pre_std = pre_std 25 | self._pre_mask = pre_mask 26 | self._post_std = post_std 27 | 28 | @property 29 | def activation_width(self): 30 | return self._inner_manager.activation_width 31 | 32 | @property 33 | def feature_size(self): 34 | return self._inner_manager.feature_size 35 | 36 | def get_strengths_and_vects(self, input_activations): 37 | if self._pre_std is not None: 38 | input_activations = T.inc_subtensor(input_activations[:,:,self._pre_mask], self._srng.normal(input_activations.shape, self._pre_std)) 39 | strengths, vects = self._inner_manager.get_strengths_and_vects(input_activations) 40 | if self._post_std is not None: 41 | vects = vects + self._srng.normal(vects.shape, self._post_std) 42 | return strengths, vects 43 | 44 | def get_loss(self, input_activations, raw_feature_strengths, raw_feature_vects, extra_info=False): 45 | return self._inner_manager.get_loss(input_activations, raw_feature_strengths, raw_feature_vects, extra_info) 46 | 47 | def process(self, input_activations, extra_info=False): 48 | if self._pre_std is not None: 49 | input_activations = T.inc_subtensor(input_activations[:,:,self._pre_mask], self._srng.normal(input_activations.shape, self._pre_std)) 50 | stuff = self._inner_manager.process(input_activations, extra_info) 51 | vects = stuff[2] 52 | if self._post_std is not None: 53 | vects = vects + self._srng.normal(vects.shape, self._post_std) 54 | return stuff[:2] + (vects,) + stuff[3:] 55 | -------------------------------------------------------------------------------- /queue_managers/queue_base.py: -------------------------------------------------------------------------------- 1 | import theano 2 | import theano.tensor as T 3 | import numpy as np 4 | 5 | class QueueManager( object ): 6 | """ 7 | Manages the queue transformation 8 | """ 9 | 10 | @property 11 | def activation_width(self): 12 | """ 13 | The activation width of the queue manager, determining the dimensions of the 14 | input_activations 15 | """ 16 | raise NotImplementedError("activation_width not implemented") 17 | 18 | @property 19 | def feature_size(self): 20 | """ 21 | The feature width of the queue manager, determining the dimensions of the transformed output 22 | """ 23 | raise NotImplementedError("feature_size not implemented") 24 | 25 | def get_strengths_and_vects(self, input_activations): 26 | """ 27 | Prepare a set of input activations, returning the feature strengths and vectors. 
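A minimal sketch of wrapping a manager in `NoiseWrapper` (the choice of inner manager here is illustrative): Gaussian noise with std `pre_std` is added to the input activations, and noise with std `post_std` to the output feature vectors.

```
from queue_managers import NoiseWrapper, QueuelessStandardQueueManager

# Illustrative wiring; any QueueManager subclass can be wrapped.
inner = QueuelessStandardQueueManager(feature_size=24, period=24)
noisy = NoiseWrapper(inner, pre_std=0.1, post_std=0.05)
```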
28 | 29 | Parameters: 30 | input_activations: a theano tensor (float32) of shape (batch, timestep, activation_width) 31 | 32 | Returns: 33 | raw_feature_strengths: A theano tensor (float32) of shape (batch, timestep) giving the 34 | raw push strength for each timestep 35 | raw_feature_vects: A theano tensor (float32) of shape (batch, timestep, feature_size) 36 | giving the raw vector of the input at each timestep 37 | """ 38 | raise NotImplementedError("get_strengths_and_vects not implemented") 39 | 40 | def get_loss(self, input_activations, raw_feature_strengths, raw_feature_vects, extra_info=False): 41 | """ 42 | Calculate the loss for the given vects and strengths 43 | 44 | Parameters: 45 | raw_feature_strengths: A theano tensor (float32) of shape (batch, timestep) giving the 46 | raw push strength for each timestep 47 | raw_feature_vects: A theano tensor (float32) of shape (batch, timestep, feature_size) 48 | giving the raw vector of the input at each timestep 49 | extra_info: If True, return extra info 50 | 51 | Returns: 52 | sparsity_loss: a theano scalar (float32) giving the sparsity loss 53 | Also, if extra_info is true, an additional info dict. 54 | """ 55 | raise NotImplementedError("get_loss not implemented") 56 | 57 | def process(self, input_activations, extra_info=False): 58 | """ 59 | Process a set of input activations, returning the transformed output and sparsity loss. 60 | 61 | Parameters: 62 | input_activations: a theano tensor (float32) of shape (batch, timestep, activation_width) 63 | extra_info: If True, return extra info 64 | 65 | Returns: 66 | sparsity_loss: a theano scalar (float32) giving the sparsity loss 67 | raw_feature_strengths: A theano tensor (float32) of shape (batch, timestep) giving the 68 | raw push strength for each timestep 69 | raw_feature_vects: A theano tensor (float32) of shape (batch, timestep, feature_size) 70 | giving the raw vector of the input at each timestep 71 | Also, if extra_info is true, an additional info dict. 72 | """ 73 | raw_feature_strengths, raw_feature_vects = self.get_strengths_and_vects(input_activations) 74 | sparsity_loss = self.get_loss(raw_feature_strengths, raw_feature_vects, extra_info=extra_info) 75 | 76 | if extra_info: 77 | sparsity_loss, info = sparsity_loss 78 | return sparsity_loss, raw_feature_strengths, raw_feature_vects, info 79 | else: 80 | return sparsity_loss, raw_feature_strengths, raw_feature_vects 81 | 82 | def surrogate_loss(self, reconstruction_cost, extra_info): 83 | """ 84 | Get a "surrogate loss" to estimate gradients of any stochastic choices that factor into the reconstruction 85 | cost 86 | 87 | Parameters: 88 | reconstruction_cost: The reconstruction cost for this batch 89 | extra_info: A dictionary, of the form returned by process 90 | 91 | Returns: 92 | Either: 93 | - None, which means no loss will be added 94 | - (surrogate_loss, updates): where surrogate_loss is a value which will be added to the final cost 95 | and differentiated to get the gradient estimator, and updates is a list of updates in 96 | [(variable, value), ...] form. 97 | """ 98 | return None 99 | 100 | @staticmethod 101 | def queue_transform(feature_strengths, feature_vects, return_strengths=False): 102 | """ 103 | Process features according to a "fragmented queue", where each timestep 104 | gets a size-1 window onto a feature queue. 
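As a minimal sketch of the `QueueManager` interface above (illustrative, not one of the repo's managers): a trivial subclass only needs the two width properties, `get_strengths_and_vects`, and `get_loss`; the base class's `process` composes them. Note that `process` calls `get_loss` with only the strengths and vects, so the subclass signature follows that call.

```
import numpy as np
import theano.tensor as T
from queue_managers import QueueManager

class ConstantPushManager(QueueManager):
    """Toy manager: constant push strength, identity feature vectors."""
    def __init__(self, feature_size):
        self._feature_size = feature_size

    @property
    def activation_width(self):
        return self._feature_size

    @property
    def feature_size(self):
        return self._feature_size

    def get_strengths_and_vects(self, input_activations):
        n_batch, n_time, _ = input_activations.shape
        return 0.5 * T.ones((n_batch, n_time)), input_activations

    def get_loss(self, raw_feature_strengths, raw_feature_vects, extra_info=False):
        loss = T.as_tensor_variable(np.float32(0.0))
        return (loss, {}) if extra_info else loss
```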
Effectively, 105 | feature_strengths gives how much to push onto queue 106 | feature_vects gives what to push on 107 | pop weights are tied to feature_strengths 108 | output is a size-1 peek (without popping) 109 | 110 | Parameters: 111 | - feature_strengths: float32 tensor of shape (batch, push_timestep) in [0,1] 112 | - feature_vects: float32 tensor of shape (batch, push_timestep, feature_dim) 113 | 114 | Returns: 115 | - peek_vects: float32 tensor of shape (batch, timestep, feature_dim) 116 | """ 117 | n_batch, n_time, n_feature = feature_vects.shape 118 | 119 | cum_sum_str = T.extra_ops.cumsum(feature_strengths, 1) 120 | 121 | # We will be working in (batch, timestep, push_timestep) 122 | # For each timestep, if we subtract out the sum of pushes before that timestep 123 | # and then cap to 0-1 we get the cumsums for just the features active in that 124 | # timestep 125 | timestep_adjustments = T.shape_padright(cum_sum_str - feature_strengths) 126 | push_time_cumsum = T.shape_padaxis(cum_sum_str, 1) 127 | relative_cumsum = push_time_cumsum - timestep_adjustments 128 | capped_cumsum = T.minimum(T.maximum(relative_cumsum, 0), 1) 129 | 130 | # Now we can recover the peek strengths by taking a diff 131 | shifted = T.concatenate([T.zeros((n_batch, n_time, 1)), capped_cumsum[:,:,:-1]],2) 132 | peek_strengths = capped_cumsum-shifted 133 | # Peek strengths is now (batch, timestep, push_timestep) 134 | 135 | result = T.batched_dot(peek_strengths, feature_vects) 136 | 137 | if return_strengths: 138 | return peek_strengths, result 139 | else: 140 | return result 141 | 142 | 143 | def test_frag_queue(): 144 | feature_strengths = T.fmatrix() 145 | feature_vects = T.ftensor3() 146 | peek_strengths, res = QueueManager.queue_transform(feature_strengths, feature_vects, True) 147 | grad_s, grad_v = theano.gradient.grad(T.sum(res[:,:,1]), [feature_strengths,feature_vects]) 148 | 149 | fun = theano.function([feature_strengths, feature_vects], [peek_strengths, res, grad_s, grad_v], allow_input_downcast=True) 150 | 151 | mystrengths = np.array([[0.3,0.3,0.2,0.6,0.3,0.7,0.2,1], [0.3,0.3,0.2,0.6,0.3,0.7,0.2,1]], np.float32) 152 | myvects = np.tile(np.eye(8, dtype=np.float32), (2,1,1)) 153 | mypeek, myres, mygs, mygv = fun(mystrengths, myvects) 154 | 155 | print(mypeek) 156 | print(myres) 157 | print(mygs) 158 | print(mygv) 159 | return mypeek, myres, mygs, mygv -------------------------------------------------------------------------------- /queue_managers/queueless_standard_manager.py: -------------------------------------------------------------------------------- 1 | from .queue_base import QueueManager 2 | import theano 3 | import theano.tensor as T 4 | from theano.sandbox.rng_mrg import MRG_RandomStreams 5 | import numpy as np 6 | 7 | class QueuelessStandardQueueManager( QueueManager ): 8 | """ 9 | A standard (non-variational) manager which does not use the queue, with a configurable vector activation 10 | """ 11 | 12 | def __init__(self, feature_size, period=None, vector_activation_fun=T.nnet.sigmoid): 13 | """ 14 | Initialize the manager. 15 | 16 | Parameters: 17 | feature_size: The width of a feature 18 | period: Period for queue activations 19 | vector_activation_fun: The activation function to apply to the vectors. 
--------------------------------------------------------------------------------
/queue_managers/queueless_variational_manager.py:
--------------------------------------------------------------------------------
1 | from .queue_base import QueueManager
2 | import theano
3 | import theano.tensor as T
4 | from theano.sandbox.rng_mrg import MRG_RandomStreams
5 | import numpy as np
6 | import constants
7 | 
8 | class QueuelessVariationalQueueManager( QueueManager ):
9 |     """
10 |     A variational-autoencoder-based manager which does not use the queue, using a KL-divergence (variational) loss
11 |     """
12 | 
13 |     def __init__(self, feature_size, period=None, variational_loss_scale=1):
14 |         """
15 |         Initialize the manager.
16 | 17 | Parameters: 18 | feature_size: The width of a feature 19 | period: Period for queue activations 20 | variational_loss_scale: Factor by which to scale variational loss 21 | """ 22 | self._feature_size = feature_size 23 | self._period = period 24 | self._srng = MRG_RandomStreams(np.random.randint(0, 1024)) 25 | self._variational_loss_scale = np.array(variational_loss_scale, np.float32) 26 | 27 | @property 28 | def activation_width(self): 29 | return self.feature_size*2 30 | 31 | @property 32 | def feature_size(self): 33 | return self._feature_size 34 | 35 | def helper_sample(self, input_activations): 36 | n_batch, n_time, _ = input_activations.shape 37 | means = input_activations[:,:,:self.feature_size] 38 | stdevs = abs(input_activations[:,:,self.feature_size:]) + constants.EPSILON 39 | wiggle = self._srng.normal(means.shape) 40 | 41 | vects = means + (stdevs * wiggle) 42 | 43 | strengths = T.zeros((n_batch, n_time)) 44 | if self._period is None: 45 | strengths = T.set_subtensor(strengths[:,-1],1) 46 | else: 47 | strengths = T.set_subtensor(strengths[:,self._period-1::self._period],1) 48 | 49 | return strengths, vects, means, stdevs, {} 50 | 51 | def get_strengths_and_vects(self, input_activations): 52 | strengths, vects, means, stdevs, _ = self.helper_sample(input_activations) 53 | return strengths, vects 54 | 55 | def process(self, input_activations, extra_info=False): 56 | 57 | strengths, vects, means, stdevs, sample_info = self.helper_sample(input_activations) 58 | 59 | means_sq = means**2 60 | variance = stdevs**2 61 | loss_parts = 1 + T.log(variance) - means_sq - variance 62 | if self._period is None: 63 | loss_parts = loss_parts[:,-1] 64 | else: 65 | loss_parts = loss_parts[:,self._period-1::self._period] 66 | variational_loss = -0.5 * T.sum(loss_parts) * self._variational_loss_scale 67 | 68 | info = {"variational_loss":variational_loss} 69 | info.update(sample_info) 70 | if extra_info: 71 | return variational_loss, strengths, vects, info 72 | else: 73 | return variational_loss, strengths, vects 74 | -------------------------------------------------------------------------------- /queue_managers/sampling_variational_manager.py: -------------------------------------------------------------------------------- 1 | from .queue_base import QueueManager 2 | from .variational_manager import VariationalQueueManager 3 | import theano 4 | import theano.tensor as T 5 | from theano.sandbox.rng_mrg import MRG_RandomStreams 6 | import numpy as np 7 | import constants 8 | 9 | class SamplingVariationalQueueManager( VariationalQueueManager ): 10 | """ 11 | A variational-autoencoder-based queue manager, using a configurable loss, with sampled pushes and pops 12 | """ 13 | 14 | def __init__(self, feature_size, loss_fun=(lambda x:x), variational_loss_scale=1, baseline_scale=0.9): 15 | """ 16 | Initialize the manager. 17 | 18 | Parameters: 19 | feature_size: The width of a feature 20 | loss_fun: A function which computes the loss for each timestep. Should be an elementwise 21 | operation. 
            variational_loss_scale: Factor by which to scale the variational (KL) loss; passed through to VariationalQueueManager
22 |             baseline_scale: Decay rate of the exponential moving average of past reconstruction costs that serves as the baseline b in surrogate_loss
23 |         """
24 |         super().__init__(feature_size, loss_fun, variational_loss_scale)
25 |         self._baseline_scale = baseline_scale
26 | 
27 |     def helper_sample(self, input_activations):
28 |         strength_probs, vects, means, stdevs, old_info = super().helper_sample(input_activations)
29 |         samples = self._srng.uniform(strength_probs.shape)
30 |         strengths = T.cast(strength_probs>samples, 'float32')
31 |         return strengths, vects, means, stdevs, {"sample_strength_probs":strength_probs, "sample_strength_choices":strengths}
32 | 
33 |     def surrogate_loss(self, reconstruction_cost, extra_info):
34 |         """
35 |         Based on "Gradient Estimation Using Stochastic Computation Graphs", we can compute the gradient estimate
36 |         as
37 | 
38 |             grad(E[cost]) ~= E[ sum(grad(log p(w|...)) * (Q - b)) + grad(cost(w))]
39 | 
40 |         where
41 |         - w is the current thing we sampled (so p(w|...) is the probability we would do what we sampled doing)
42 |         - Q is the cost "downstream" of w
43 |         - b is an arbitrary baseline, which must not be downstream of w
44 | 
45 |         In this case, each w is a particular choice we made in sampling the strengths, and Q is just the
46 |         reconstruction cost (since the final output can depend on strengths both in the past and the future).
47 |         We let b be an exponential average of previous values of Q.
48 | 
49 |         We can construct our surrogate loss function as
50 | 
51 |             L = sum(log p(w|...)*(Q - b)) + actual costs
52 |               = (Q - b)*sum(log p(w|...)) + actual costs
53 | 
54 |         as long as we consider Q and b constant wrt any derivative operation. This function thus returns
55 | 
56 |             S = (Q - b)*sum(log p(w|...))
57 |         """
58 |         s_probs = extra_info["sample_strength_probs"]
59 |         s_choices = extra_info["sample_strength_choices"]
60 |         prob_do_sampled = s_probs * s_choices + (1-s_probs)*(1-s_choices)
61 |         logprobsum = T.sum(T.log(prob_do_sampled))
62 | 
63 |         accum_prev_Q = theano.shared(np.array(0, np.float32))
64 |         accum_divisor = theano.shared(np.array(constants.EPSILON, np.float32))
65 |         baseline = accum_prev_Q / accum_divisor
66 | 
67 |         Q = theano.gradient.disconnected_grad(reconstruction_cost)
68 | 
69 |         surrogate_loss_component = logprobsum * (Q - baseline)
70 | 
71 |         new_prev_Q = (self._baseline_scale)*accum_prev_Q + (1-self._baseline_scale)*Q
72 |         new_divisor = (self._baseline_scale)*accum_divisor + (1-self._baseline_scale)
73 | 
74 |         updates = [(accum_prev_Q, new_prev_Q), (accum_divisor, new_divisor)]
75 | 
76 |         return surrogate_loss_component, updates
77 | 
78 | 
79 | 
80 | 
81 | 
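# A plain-Python sketch of the baseline bookkeeping above, added here for
# documentation only (the helper name is ours and nothing imports it): the
# baseline b is a bias-corrected exponential moving average of past
# reconstruction costs Q, maintained by the shared variables accum_prev_Q and
# accum_divisor across training updates.
def _demo_baseline_update(costs, baseline_scale=0.9, epsilon=1e-8):
    # epsilon stands in for constants.EPSILON, which seeds the divisor.
    accum_q, divisor = 0.0, epsilon
    scales = []
    for q in costs:
        baseline = accum_q / divisor  # b in the derivation above
        scales.append(q - baseline)   # (Q - b), which scales sum(log p(w|...))
        accum_q = baseline_scale * accum_q + (1 - baseline_scale) * q
        divisor = baseline_scale * divisor + (1 - baseline_scale)
    return scales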
21 | """ 22 | self._feature_size = feature_size 23 | self._vector_activation_fun = vector_activation_fun 24 | self._loss_fun = loss_fun 25 | 26 | @property 27 | def activation_width(self): 28 | return 1 + self.feature_size 29 | 30 | @property 31 | def feature_size(self): 32 | return self._feature_size 33 | 34 | def get_strengths_and_vects(self, input_activations): 35 | pre_strengths = input_activations[:,:,0] 36 | pre_vects = input_activations[:,:,1:] 37 | 38 | strengths = T.nnet.sigmoid(pre_strengths) 39 | strengths = T.set_subtensor(strengths[:,-1],1) 40 | 41 | flat_pre_vects = T.reshape(pre_vects,(-1,self.feature_size)) 42 | flat_vects = self._vector_activation_fun( flat_pre_vects ) 43 | vects = T.reshape(flat_vects, pre_vects.shape) 44 | 45 | return strengths, vects 46 | 47 | def get_loss(self, raw_feature_strengths, raw_feature_vects, extra_info=False): 48 | losses = self._loss_fun(raw_feature_strengths) 49 | full_loss = T.sum(losses) 50 | if extra_info: 51 | return full_loss, {} 52 | else: 53 | return full_loss -------------------------------------------------------------------------------- /queue_managers/variational_manager.py: -------------------------------------------------------------------------------- 1 | from .queue_base import QueueManager 2 | import theano 3 | import theano.tensor as T 4 | from theano.sandbox.rng_mrg import MRG_RandomStreams 5 | import numpy as np 6 | import constants 7 | 8 | class VariationalQueueManager( QueueManager ): 9 | """ 10 | A variational-autoencoder-based queue manager, using a configurable loss 11 | """ 12 | 13 | def __init__(self, feature_size, loss_fun=(lambda x:x), variational_loss_scale=1): 14 | """ 15 | Initialize the manager. 16 | 17 | Parameters: 18 | feature_size: The width of a feature 19 | loss_fun: A function which computes the loss for each timestep. Should be an elementwise 20 | operation. 21 | variational_loss_scale: Factor by which to scale variational loss 22 | """ 23 | self._feature_size = feature_size 24 | self._srng = MRG_RandomStreams(np.random.randint(0, 1024)) 25 | self._loss_fun = loss_fun 26 | self._variational_loss_scale = np.array(variational_loss_scale, np.float32) 27 | 28 | @property 29 | def activation_width(self): 30 | return 1 + self.feature_size*2 31 | 32 | @property 33 | def feature_size(self): 34 | return self._feature_size 35 | 36 | def helper_sample(self, input_activations): 37 | """Helper method to sample from the input_activations. 
36 |     def helper_sample(self, input_activations):
37 |         """Helper method to sample from the input_activations. Also returns an (empty) info dict for child class use"""
38 |         pre_strengths = input_activations[:,:,0]
39 |         strengths = T.nnet.sigmoid(pre_strengths)
40 |         strengths = T.set_subtensor(strengths[:,-1],1)
41 | 
42 |         means = input_activations[:,:,1:1+self.feature_size]
43 |         stdevs = abs(input_activations[:,:,1+self.feature_size:]) + constants.EPSILON
44 |         wiggle = self._srng.normal(means.shape)
45 | 
46 |         vects = means + (stdevs * wiggle)
47 | 
48 |         return strengths, vects, means, stdevs, {}
49 | 
50 |     def get_strengths_and_vects(self, input_activations):
51 |         strengths, vects, means, stdevs, _ = self.helper_sample(input_activations)
52 |         return strengths, vects
53 | 
54 |     def process(self, input_activations, extra_info=False):
55 | 
56 |         strengths, vects, means, stdevs, sample_info = self.helper_sample(input_activations)
57 | 
58 |         sparsity_losses = self._loss_fun(strengths)
59 |         full_sparsity_loss = T.sum(sparsity_losses)
60 | 
61 |         means_sq = means**2
62 |         variance = stdevs**2
63 |         variational_loss = -0.5 * T.sum(1 + T.log(variance) - means_sq - variance) * self._variational_loss_scale
64 | 
65 |         full_loss = full_sparsity_loss + variational_loss
66 | 
67 |         info = {"sparsity_loss": full_sparsity_loss, "variational_loss":variational_loss}
68 |         info.update(sample_info)
69 |         if extra_info:
70 |             return full_loss, strengths, vects, info
71 |         else:
72 |             return full_loss, strengths, vects
73 | 
--------------------------------------------------------------------------------
/relshift_lstm.py:
--------------------------------------------------------------------------------
1 | import theano
2 | import theano.tensor as T
3 | import numpy as np
4 | 
5 | 
6 | from theano_lstm import LSTM, StackedCells, Layer
7 | from util import *
8 | 
9 | from collections import namedtuple
10 | 
11 | SampleScanSpec = namedtuple('SampleScanSpec', ['sequences', 'non_sequences', 'outputs_info', 'num_taps', 'kwargs_keys', 'deterministic_dropout', 'start_pos'])
12 | 
13 | class RelativeShiftLSTMStack( object ):
14 |     """
15 |     Manages a stack of LSTM cells with potentially a relative shift applied
16 |     """
17 | 
18 |     def __init__(self, input_parts, layer_sizes, output_size, window_size=0, dropout=0, mode="drop", unroll_batch_num=None):
19 |         """
20 |         Parameters:
21 |             input_parts: A list of InputParts
22 |             layer_sizes: A list of the form [ (indep, per_note), ... ] where
23 |                 indep is the number of non-shifted cells to have, and
24 |                 per_note is the number of cells to have per window note, which shift as the
25 |                 network moves
26 |                 Alternately can just be [ indep, ... ]
27 |             output_size: An integer, the width of the desired output
28 |             dropout: How much dropout to apply.
29 |             mode: Either "drop" or "roll". If drop, discard memory that goes out of range. If roll, roll it instead
            window_size: The number of notes spanned by the relative window; 0 (the default) allocates no per-note cells
            unroll_batch_num: If not None, unroll the per-batch shift computation in perform_step into this many explicit steps instead of using theano.map
30 |         """
31 | 
32 |         self.input_parts = input_parts
33 |         self.window_size = window_size
34 | 
35 |         layer_sizes = [x if isinstance(x,tuple) else (x,0) for x in layer_sizes]
36 |         self.layer_sizes = layer_sizes
37 |         self.tot_layer_sizes = [(indep + per_note*self.window_size) for indep, per_note in layer_sizes]
38 | 
39 |         self.output_size = output_size
40 |         self.dropout = dropout
41 | 
42 |         self.input_size = sum(part.PART_WIDTH for part in input_parts)
43 | 
44 |         self.cells = StackedCells( self.input_size, celltype=LSTM, activation=T.tanh, layers = self.tot_layer_sizes )
45 |         self.cells.layers.append(Layer(self.tot_layer_sizes[-1], self.output_size, activation = lambda x:x))
46 | 
47 |         assert mode in ("drop", "roll"), "Must specify either drop or roll mode"
48 |         self.mode = mode
49 | 
50 |         self.unroll_batch_num = unroll_batch_num
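    # For example (illustrative sizes, not from the original code):
    # layer_sizes=[(200, 10), 300] with window_size=12 gives tot_layer_sizes of
    # [200 + 10*12, 300] = [320, 300]. The first layer keeps 200 independent
    # cells plus a 120-cell per-note block that perform_step shifts whenever
    # the relative position moves; the second layer is 300 ordinary cells.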
51 | 
52 |     @property
53 |     def params(self):
54 |         return self.cells.params + list(l.initial_hidden_state for l in self.cells.layers if has_hidden(l))
55 | 
56 |     @params.setter
57 |     def params(self, paramlist):
58 |         self.cells.params = paramlist[:len(self.cells.params)]
59 |         for l, val in zip((l for l in self.cells.layers if has_hidden(l)), paramlist[len(self.cells.params):]):
60 |             l.initial_hidden_state.set_value(val.get_value())
61 | 
62 |     def perform_step(self, in_data, shifts, hiddens, dropout_masks=[]):
63 |         """
64 |         Perform a step through the LSTM network.
65 | 
66 |         in_data: A theano tensor (float32) of shape (batch, input_size)
67 |         shifts: A theano tensor (int32) of shape (batch), giving the relative
68 |             shifts to apply to the last hiddens
69 |         hiddens: A list of hiddens [layer](batch, hidden_idx)
70 |         dropout_masks: If [], apply dropout deterministically. Otherwise, should
71 |             be a set of masks returned by get_dropout_masks, generally passed through
72 |             a scan as a non-sequence.
73 |         """
74 | 
75 |         # hiddens is of shape [layer](batch, hidden_idx)
76 |         # We want to permute the hidden_idx values according to shifts,
77 |         # which are ints of shape (batch)
78 | 
79 |         n_batch = in_data.shape[0]
80 |         new_hiddens = []
81 |         for layer_i, (indep, per_note) in enumerate(self.layer_sizes):
82 |             if per_note == 0:
83 |                 # Don't bother with this layer
84 |                 new_hiddens.append(hiddens[layer_i])
85 |                 continue
86 |             # The theano_lstm code puts [memory_cells... , old_activations...]
87 |             # We want to slide the memory cells only.
88 |             lstm_hsplit = self.cells.layers[layer_i].hidden_size
89 |             indep_mem = hiddens[layer_i][:,:indep]
90 |             per_note_mem = hiddens[layer_i][:,indep:lstm_hsplit]
91 |             remaining_values = hiddens[layer_i][:,lstm_hsplit:]
92 |             # per_note_mem is (batch, per_note_mem)
93 |             separated_mem = per_note_mem.reshape((n_batch, self.window_size, per_note))
94 |             # separated_mem is (batch, note, mem)
95 |             # [a b c ... x y z] shifted up 1 (+1) goes to [b c ... x y z 0]
96 |             # [a b c ... x y z] shifted down 1 (-1) goes to [0 a b c ... x y]
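            # Concrete example: with a 4-note window holding rows [a b c d],
            # a shift of +1 in "drop" mode yields [b c d 0] and a shift of -1
            # yields [0 a b c]; in "roll" mode the rows that fall off one end
            # wrap around to the other instead of being zeroed.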
97 |             def _shift_step(c_mem, c_shift):
98 |                 # c_mem is (note, mem)
99 |                 # c_shift is an int
100 |                 if self.mode=="drop":
101 |                     def _clamp_w(x):
102 |                         return T.maximum(0,T.minimum(x,self.window_size))
103 |                     ins_at_front = T.zeros((_clamp_w(-c_shift),per_note))
104 |                     ins_at_back = T.zeros((_clamp_w(c_shift),per_note))
105 |                     take_part = c_mem[_clamp_w(c_shift):self.window_size-_clamp_w(-c_shift),:]
106 |                     return T.concatenate([ins_at_front, take_part, ins_at_back], 0)
107 |                 elif self.mode=="roll":
108 |                     return T.roll(c_mem, (-c_shift) % self.window_size, axis=0)
109 | 
110 |             if self.unroll_batch_num is None:
111 |                 shifted_mem, _ = theano.map(_shift_step, [separated_mem, shifts])
112 |             else:
113 |                 shifted_mem_parts = []
114 |                 for i in range(self.unroll_batch_num):
115 |                     shifted_mem_parts.append(_shift_step(separated_mem[i], shifts[i]))
116 |                 shifted_mem = T.stack(shifted_mem_parts)
117 | 
118 |             new_per_note_mem = shifted_mem.reshape((n_batch, self.window_size * per_note))
119 |             new_layer_hiddens = T.concatenate([indep_mem, new_per_note_mem, remaining_values], 1)
120 |             new_hiddens.append(new_layer_hiddens)
121 | 
122 |         if dropout_masks == [] or not self.dropout:
123 |             masks = []
124 |         else:
125 |             masks = [None] + dropout_masks
126 |         new_states = self.cells.forward(in_data, prev_hiddens=new_hiddens, dropout=masks)
127 |         return new_states
128 | 
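    # The two entry points below drive the stack differently: do_preprocess_scan
    # computes every timestep's input up front, which suits training, where the
    # whole sequence is known in advance; prepare_sample_scan and
    # sample_scan_routine instead let the caller feed each timestep's sampled
    # output back in as the next timestep's input, as generation requires.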
129 |     def do_preprocess_scan(self, deterministic_dropout=False, **kwargs):
130 |         """
131 |         Run a scan using this LSTM, preprocessing all inputs before the scan.
132 | 
133 |         Parameters:
134 |             kwargs[k]: should be a theano tensor of shape (n_batch, n_time, ... )
135 |                 Note that "relative_position" should be a keyword argument given here if there are relative
136 |                 shifts.
137 |             deterministic_dropout: If True, apply dropout deterministically, scaling everything. If false,
138 |                 sample dropout
139 | 
140 |         Returns:
141 |             A theano tensor of shape (n_batch, n_time, output_size) of activations
142 |         """
143 | 
144 |         assert len(kwargs)>0, "Need at least one input argument!"
145 |         n_batch, n_time = list(kwargs.values())[0].shape[:2]
146 | 
147 |         squashed_kwargs = {
148 |             k: v.reshape([n_batch*n_time] + [x for x in v.shape[2:]]) for k,v in kwargs.items()
149 |         }
150 | 
151 |         full_input = T.concatenate([ part.generate(**squashed_kwargs) for part in self.input_parts ], 1)
152 |         adjusted_input = full_input.reshape([n_batch, n_time, self.input_size]).dimshuffle((1,0,2))
153 | 
154 |         if "relative_position" in kwargs:
155 |             relative_position = kwargs["relative_position"]
156 |             diff_shifts = T.extra_ops.diff(relative_position, axis=1)
157 |             cat_shifts = T.concatenate([T.zeros((n_batch, 1), 'int32'), diff_shifts], 1)
158 |             shifts = cat_shifts.dimshuffle((1,0))
159 |         else:
160 |             shifts = T.zeros((n_time, n_batch), 'int32')
161 | 
162 |         def _scan_fn(in_data, shifts, *other):
163 |             other = list(other)
164 |             if self.dropout and not deterministic_dropout:
165 |                 split = -len(self.tot_layer_sizes)
166 |                 hiddens = other[:split]
167 |                 masks = [None] + other[split:]
168 |             else:
169 |                 masks = []
170 |                 hiddens = other
171 | 
172 |             return self.perform_step(in_data, shifts, hiddens, dropout_masks=masks)
173 | 
174 |         if self.dropout and not deterministic_dropout:
175 |             dropout_masks = UpscaleMultiDropout( [(n_batch, shape) for shape in self.tot_layer_sizes], self.dropout)
176 |         else:
177 |             dropout_masks = []
178 | 
179 |         outputs_info = [initial_state_with_taps(layer, n_batch) for layer in self.cells.layers]
180 |         result, _ = theano.scan(fn=_scan_fn, sequences=[adjusted_input, shifts], non_sequences=dropout_masks, outputs_info=outputs_info)
181 | 
182 |         final_out = get_last_layer(result).transpose((1,0,2))
183 | 
184 |         return final_out
185 | 
186 |     def prepare_sample_scan(self, start_pos, start_out, deterministic_dropout=False, **kwargs):
187 |         """
188 |         Prepare a sample scan
189 | 
190 |         Parameters:
191 |             kwargs[k]: should be a theano tensor of shape (n_batch, n_time, ... )
192 |                 Note that "relative_position" should be a keyword argument given here if there are relative
193 |                 shifts, as should "timestep"
194 |             start_pos: a theano tensor of shape (n_batch) giving the initial position passed to the
195 |                 out_to_in function
196 |             start_out: a theano tensor of shape (n_batch, X) giving the initial "output" passed
197 |                 to the out_to_in_fn
198 |             deterministic_dropout: If True, apply dropout deterministically, scaling everything. If false,
199 |                 sample dropout
200 | 
201 |         Returns:
202 |             A namedtuple, where
203 |                 sequences: a list of sequences to input into scan
204 |                 non_sequences: a list of non_sequences into scan
205 |                 outputs_info: a list of outputs_info for scan
206 |                 num_taps: the number of outputs with taps for this
207 |                 (other values): for internal use
208 |         """
209 |         assert len(kwargs)>0, "Need at least one input argument!"
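        # The scan state threads [previous position, previous output, per-layer
        # hidden states...], each with a -1 tap, so every step can condition on
        # what was emitted one timestep earlier.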
210 | n_batch, n_time = list(kwargs.values())[0].shape[:2] 211 | 212 | transp_kwargs = { 213 | k: v.dimshuffle((1,0) + tuple(range(2,v.ndim))) for k,v in kwargs.items() 214 | } 215 | 216 | if self.dropout and not deterministic_dropout: 217 | dropout_masks = UpscaleMultiDropout( [(n_batch, shape) for shape in self.tot_layer_sizes], self.dropout) 218 | else: 219 | dropout_masks = [] 220 | 221 | outputs_info = [{"initial":start_pos, "taps":[-1]}, {"initial":start_out, "taps":[-1]}] + [initial_state_with_taps(layer, n_batch) for layer in self.cells.layers] 222 | sequences = list(transp_kwargs.values()) 223 | non_sequences = dropout_masks 224 | num_taps = len([True for x in outputs_info if x is not None]) 225 | return SampleScanSpec(sequences=sequences, non_sequences=non_sequences, outputs_info=outputs_info, num_taps=num_taps, kwargs_keys=list(transp_kwargs.keys()), deterministic_dropout=deterministic_dropout, start_pos=start_pos) 226 | 227 | 228 | def sample_scan_routine(self, spec, *inputs): 229 | """ 230 | Start a scan routine. This is implemented as a generator, since we may need to interrupt the state in the 231 | middle of iteration. How to use: 232 | 233 | scan_rout = x.sample_scan_routine(spec, *inputs) 234 | - spec: The SampleScanSpec returned by prepare_sample_scan 235 | - *inputs: The scan inputs, in [ sequences..., taps..., non_sequences... ] order 236 | 237 | last_rel_pos, last_out, cur_kwargs = scan_rout.send(None) 238 | - last_rel_pos is a theano tensor of shape (n_batch) 239 | - last_out will be a theano tensor of shape (n_batch, output_size) 240 | - cur_kwargs[k] is a theano tensor of shape (n_batch, ...), from kwargs 241 | 242 | out_activations = scan_rout.send((new_pos, addtl_kwargs)) 243 | - new_pos is a theano tensor of shape (n_batch), giving the new relative position 244 | - addtl_kwargs[k] is a theano tensor of shape (n_batch, ...) to be added to cur kwargs 245 | Note that "relative_position" will be added automatically. 246 | 247 | scan_outputs = scan_rout.send(new_out) 248 | - new_out is a tensor of shape (n_batch, X) to be output 249 | 250 | scan_rout.close() 251 | 252 | -> scan_outputs should be returned back to scan 253 | """ 254 | stuff = list(inputs) 255 | I = len(spec.kwargs_keys) 256 | kwarg_seq_vals = stuff[:I] 257 | cur_kwargs = {k:v for k,v in zip(spec.kwargs_keys, kwarg_seq_vals)} 258 | last_pos, last_out = stuff[I:I+2] 259 | other = stuff[I+2:] 260 | 261 | if self.dropout and not spec.deterministic_dropout: 262 | split = -len(self.tot_layer_sizes) 263 | hiddens = other[:split] 264 | masks = [None] + other[split:] 265 | else: 266 | masks = [] 267 | hiddens = other 268 | 269 | cur_pos, addtl_kwargs = yield(last_pos, last_out, cur_kwargs) 270 | all_kwargs = { 271 | "relative_position": cur_pos 272 | } 273 | all_kwargs.update(cur_kwargs) 274 | all_kwargs.update(addtl_kwargs) 275 | 276 | shift = T.switch(T.eq(all_kwargs["timestep"],0), 0, cur_pos - last_pos) 277 | 278 | full_input = T.concatenate([ part.generate(**all_kwargs) for part in self.input_parts ], 1) 279 | 280 | step_stuff = self.perform_step(full_input, shift, hiddens, dropout_masks=masks) 281 | new_hiddens = step_stuff[:-1] 282 | raw_output = step_stuff[-1] 283 | sampled_output = yield(raw_output) 284 | 285 | yield [cur_pos, sampled_output] + step_stuff 286 | 287 | def extract_sample_scan_results(self, spec, outputs): 288 | """ 289 | Extract outputs from the scan results. 
290 | 291 | Parameters: 292 | outputs: The outputs from the scan associated with this stack 293 | 294 | Returns: 295 | positions, raw_output, sampled_output 296 | """ 297 | positions = T.concatenate([T.shape_padright(spec.start_pos), outputs[0].transpose((1,0))[:,:-1]], 1) 298 | sampled_output = outputs[2].transpose((1,0,2)) 299 | raw_output = outputs[-1].transpose((1,0,2)) 300 | 301 | return positions, raw_output, sampled_output 302 | 303 | 304 | def do_sample_scan(self, start_pos, start_out, sample_fn, out_to_in_fn, deterministic_dropout=True, **kwargs): 305 | """ 306 | Run a scan using this LSTM, sampling and processing as we go. 307 | 308 | Parameters: 309 | kwargs[k]: should be a theano tensor of shape (n_batch, n_time, ... ) 310 | Note that "relative_position" should be a keyword argument given here if there are relative 311 | shifts. 312 | start_pos: a theano tensor of shape (n_batch) giving the initial position passed to the 313 | out_to_in function 314 | start_out: a theano tensor of shape (n_batch, X) giving the initial "output" passed 315 | to the out_to_in_fn 316 | sample_fn: a function with signature 317 | sample_fn(out_activations, rel_pos) -> new_out, new_rel_pos 318 | where 319 | - rel_pos is a theano tensor of shape (n_batch) 320 | - out_activations is a tensor of shape (n_batch, output_size) 321 | and 322 | - new_out is a tensor of shape (n_batch, X) to be output 323 | - new_rel_pos should be a theano tensor of shape (n_batch) 324 | out_to_in_fn: a function with signature 325 | out_to_in_fn(rel_pos, last_out, **cur_kwargs) -> addtl_kwargs 326 | where 327 | - rel_pos is a theano tensor of shape (n_batch) 328 | - last_out will be a theano tensor of shape (n_batch, output_size) 329 | - cur_kwargs[k] is a theano tensor of shape (n_batch, ...), from kwargs 330 | and 331 | - addtl_kwargs[k] is a theano tensor of shape (n_batch, ...) to be added to cur kwargs 332 | Note that "relative_position" will be added automatically. 333 | deterministic_dropout: If True, apply dropout deterministically, scaling everything. 
If false,
334 |                 sample dropout
335 | 
336 |         Returns: positions, raw_output, sampled_output, updates
337 |         """
338 |         raise NotImplementedError()
        # The driver below is unreachable reference code: sample_scan_routine
        # needs the new position before it can produce activations, but
        # sample_fn (as documented above) derives the new position from those
        # activations, so this generic wrapper cannot be used as-is and the
        # previous position is reused at the send() below.
339 |         spec = self.prepare_sample_scan(start_pos, start_out, deterministic_dropout, **kwargs)
340 | 
341 |         def _scan_fn(*stuff):
342 |             scan_rout = self.sample_scan_routine(spec, *stuff)
343 |             rel_pos, last_out, cur_kwargs = scan_rout.send(None)
344 |             addtl_kwargs = out_to_in_fn(rel_pos, last_out, **cur_kwargs)
345 |             out_activations = scan_rout.send((rel_pos, addtl_kwargs))
346 |             sampled_output, new_pos = sample_fn(out_activations, rel_pos)
347 |             scan_outputs = scan_rout.send(sampled_output)
348 |             scan_rout.close()
349 |             return scan_outputs
350 | 
351 |         result, updates = theano.scan(fn=_scan_fn, sequences=spec.sequences, non_sequences=spec.non_sequences, outputs_info=spec.outputs_info)
352 |         positions, raw_output, sampled_output = self.extract_sample_scan_results(spec, result)
353 |         return positions, raw_output, sampled_output, updates
354 | 
--------------------------------------------------------------------------------
/training.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import os
3 | import random
4 | import signal
5 | 
6 | import leadsheet
7 | import constants
8 | 
9 | import param_cvt
10 | 
11 | import pickle
12 | 
13 | import traceback
14 | 
15 | from pprint import pformat
16 | 
17 | BATCH_SIZE = 10
18 | SEGMENT_STEP = constants.WHOLE//constants.RESOLUTION_SCALAR
19 | SEGMENT_LEN = 4*SEGMENT_STEP
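# constants.WHOLE//constants.RESOLUTION_SCALAR is the number of time slots in
# a whole note (one bar in the common 4/4 case), so SEGMENT_STEP advances one
# bar at a time and SEGMENT_LEN covers a four-bar training segment, matching
# the 4-bar chunks described in the README. get_batch below slices each
# sampled leadsheet at a random bar-aligned offset of this length.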
Skipping...".format(lsfn)) 42 | else: 43 | new_leadsheets.append(lsfn) 44 | print("Found {} leadsheets.".format(len(leadsheets))) 45 | return new_leadsheets 46 | 47 | def get_batch(leadsheets, with_sample=False): 48 | """ 49 | Get a batch 50 | 51 | leadsheets should be a list of dataset lists of (chord, melody) tuples, or just a dataset list of tuples 52 | 53 | returns: chords, melodies 54 | """ 55 | if not isinstance(leadsheets[0], list): 56 | leadsheets = [leadsheets] 57 | 58 | sample_datasets = [random.randrange(len(leadsheets)) for _ in range(BATCH_SIZE)] 59 | sample_fns = [random.choice(leadsheets[i]) for i in sample_datasets] 60 | loaded_samples = [leadsheet.parse_leadsheet(lsfn) for lsfn in sample_fns] 61 | sample_lengths = [leadsheet.get_leadsheet_length(c,m) for c,m in loaded_samples] 62 | 63 | starts = [(0 if l==SEGMENT_LEN else random.randrange(0,l-SEGMENT_LEN,SEGMENT_STEP)) for l in sample_lengths] 64 | sliced = [leadsheet.slice_leadsheet(c,m,s,s+SEGMENT_LEN) for (c,m),s in zip(loaded_samples, starts)] 65 | 66 | res = list(zip(*sliced)) 67 | 68 | sample_sources = ["{}: starting at {} = bar {}".format(fn, start, start/(constants.WHOLE//constants.RESOLUTION_SCALAR)) for fn,start in zip(sample_fns, starts)] 69 | 70 | if with_sample: 71 | return res, sample_sources 72 | else: 73 | return res 74 | 75 | def generate(model, leadsheets, filename, with_vis=False, batch=None): 76 | if batch is None: 77 | batch = get_batch(leadsheets, True) 78 | (chords, melody), sample_sources = batch 79 | generated_out, chosen, vis_probs, vis_info = model.produce(chords, melody) 80 | 81 | if with_vis: 82 | with open("{}_sources.txt".format(filename), "w") as f: 83 | f.write('\n'.join(sample_sources)) 84 | np.save('{}_chosen.npy'.format(filename), chosen) 85 | np.save('{}_probs.npy'.format(filename), vis_probs) 86 | for i,v in enumerate(vis_info): 87 | np.save('{}_info_{}.npy'.format(filename,i), v) 88 | for samplenum, (melody, chords) in enumerate(zip(generated_out, chords)): 89 | leadsheet.write_leadsheet(chords, melody, '{}_{}.ls'.format(filename, samplenum)) 90 | 91 | def validate(model, validation_leadsheets): 92 | accum_loss = None 93 | accum_infos = None 94 | for i in range(VALIDATION_CT): 95 | loss, infos = model.eval(*get_batch(validation_leadsheets)) 96 | if accum_loss is None: 97 | accum_loss = loss 98 | accum_infos = infos 99 | else: 100 | accum_loss += loss 101 | for k in accum_info.keys(): 102 | accum_loss[k] += accum_infos[k] 103 | accum_loss /= VALIDATION_CT 104 | for k in accum_info.keys(): 105 | accum_loss[k] /= VALIDATION_CT 106 | return accum_loss, accum_info 107 | 108 | def validate_generate(model, validation_leadsheets, generated_dir): 109 | for lsfn in validation_leadsheets: 110 | ch,mel = leadsheet.parse_leadsheet(lsfn) 111 | batch = ([ch],[mel]), [lsfn] 112 | curdir = os.path.join(generated_dir, os.path.splitext(os.path.basename(lsfn))[0]) 113 | os.makedirs(curdir) 114 | generate(model, None, os.path.join(curdir, "generated"), with_vis=True, batch=batch) 115 | 116 | def train(model,leadsheets,num_updates,outputdir,start=0,save_params_interval=5000,validation_leadsheets=None,validation_generate_ct=1,auto_connectome_keys=None): 117 | stopflag = [False] 118 | def signal_handler(signame, sf): 119 | stopflag[0] = True 120 | print("Caught interrupt, waiting until safe. 
Press again to force terminate") 121 | signal.signal(signal.SIGINT, old_handler) 122 | old_handler = signal.signal(signal.SIGINT, signal_handler) 123 | for i in range(start+1,start+num_updates+1): 124 | if stopflag[0]: 125 | break 126 | loss, infos = model.train(*get_batch(leadsheets)) 127 | with open(os.path.join(outputdir,'data.csv'),'a') as f: 128 | if i == 1: 129 | f.seek(0) 130 | f.truncate() 131 | f.write("iter, loss, " + ", ".join(k for k,v in sorted(infos.items())) + "\n") 132 | f.write("{}, {}, ".format(i,loss) + ", ".join(str(v) for k,v in sorted(infos.items())) + "\n") 133 | if i % 10 == 0: 134 | print("update {}: {}, info {}".format(i,loss,pformat(infos))) 135 | if save_params_interval is not None and i % save_params_interval == 0: 136 | paramfile = os.path.join(outputdir, 'params{}.p'.format(i)) 137 | pickle.dump(model.params,open(paramfile, 'wb')) 138 | if auto_connectome_keys is not None: 139 | param_cvt.main(paramfile, 18, auto_connectome_keys, make_zip=True) 140 | if validation_leadsheets is None: 141 | generate(model, leadsheets, os.path.join(outputdir,'sample{}'.format(i))) 142 | else: 143 | for gen_num in range(validation_generate_ct): 144 | validate_generate(model, validation_leadsheets, os.path.join(outputdir, "validation_{}_sample_{}".format(i,gen_num))) 145 | if not stopflag[0]: 146 | signal.signal(signal.SIGINT, old_handler) -------------------------------------------------------------------------------- /util.py: -------------------------------------------------------------------------------- 1 | import theano 2 | import os 3 | import theano.tensor as T 4 | import numpy as np 5 | from theano_lstm import MultiDropout 6 | 7 | def has_hidden(layer): 8 | """ 9 | Whether a layer has a trainable 10 | initial hidden state. 11 | """ 12 | return hasattr(layer, 'initial_hidden_state') 13 | 14 | def matrixify(vector, n): 15 | return T.repeat(T.shape_padleft(vector), n, axis=0) 16 | 17 | def initial_state(layer, dimensions = None): 18 | """ 19 | Initalizes the recurrence relation with an initial hidden state 20 | if needed, else replaces with a "None" to tell Theano that 21 | the network **will** return something, but it does not need 22 | to send it to the next step of the recurrence 23 | """ 24 | if dimensions is None: 25 | return layer.initial_hidden_state if has_hidden(layer) else None 26 | else: 27 | return matrixify(layer.initial_hidden_state, dimensions) if has_hidden(layer) else None 28 | 29 | def initial_state_with_taps(layer, dimensions = None): 30 | """Optionally wrap tensor variable into a dict with taps=[-1]""" 31 | state = initial_state(layer, dimensions) 32 | if state is not None: 33 | return dict(initial=state, taps=[-1]) 34 | else: 35 | return None 36 | 37 | 38 | def get_last_layer(result): 39 | if isinstance(result, list): 40 | return result[-1] 41 | else: 42 | return result 43 | 44 | def ensure_list(result): 45 | if isinstance(result, list): 46 | return result 47 | else: 48 | return [result] 49 | 50 | def UpscaleMultiDropout(shapes, dropout = 0.): 51 | """ 52 | Return all the masks needed for dropout outside of a scan loop. 
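    The masks are pre-scaled by 1/(1-dropout), so the expected value of a
    masked activation matches the unmasked activation used on the
    deterministic path.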
53 | """ 54 | orig_masks = MultiDropout(shapes, dropout) 55 | fixed_masks = [m / (1-dropout) for m in orig_masks] 56 | return fixed_masks 57 | 58 | class _SliceHelperObj(object): 59 | """ 60 | Helper object that exposes the slice from __getitem__ directly 61 | """ 62 | def __getitem__(self, key): 63 | return key 64 | 65 | sliceMaker = _SliceHelperObj() 66 | 67 | def _better_print_fn(op, xin): 68 | for item in op.attrs: 69 | if callable(item): 70 | pmsg = item(xin) 71 | else: 72 | temp = getattr(xin, item) 73 | if callable(temp): 74 | pmsg = temp() 75 | else: 76 | pmsg = temp 77 | print(op.message, attr, '=', pmsg) 78 | 79 | def FnPrint(name, items=['__str__']): 80 | return theano.printing.Print(name, items, _better_print_fn) 81 | 82 | def Save(path="", preprocess=lambda x:x, text=False): 83 | def _save_fn(op, xin): 84 | val = preprocess(xin) 85 | if text: 86 | np.savetxt(path + ".csv", val, delimiter=",") 87 | else: 88 | np.save(path + ".npy", val) 89 | return theano.printing.Print(path, [], _save_fn) 90 | --------------------------------------------------------------------------------