├── .gitignore
├── README.md
├── benchmark_functions.py
├── log
│   └── .gitkeep
├── logging.yaml
├── methods
│   ├── __init__.py
│   ├── bo.py
│   ├── oei.py
│   ├── random.py
│   ├── sdp.py
│   └── solvers.py
├── out
│   └── .gitkeep
├── plot.py
├── results
│   └── .gitkeep
├── run.py
└── tests
    ├── __init__.py
    ├── create_model.py
    ├── test_derivatives.py
    └── test_sdp.py

/.gitignore:
--------------------------------------------------------------------------------
1 | # Saved numpy and binary data
2 | *.npz
3 | *.npy
4 | *.dat
5 | # Saved figures
6 | *.pdf
7 | # Pycache
8 | __pycache__
9 | # Output logs
10 | out.txt
11 | *.swp
12 | *.png
13 | .Rhistory
14 | log/**/
15 | out/**/
16 | *.zip
17 | *.tgz
18 | **.pkl
19 | get_results.sh
20 | **/.cache
21 | **/output*
22 | log**/
23 | evals**/
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Bayesian-Optimization
2 | This is the implementation of a new acquisition function for **Batch Bayesian Optimization**, named **Optimistic Expected Improvement (OEI)**. For details, results and theoretical analysis, refer to the paper titled [*Distributionally Ambiguous Optimization Techniques for Batch Bayesian Optimization*](https://arxiv.org/abs/1707.04191) by Nikitas Rontsis, Michael A. Osborne, Paul J. Goulart.
3 | 
4 | This branch is a **cleaned** and **updated** implementation of `OEI`. The branch [GPflow-based](https://github.com/oxfordcontrol/Bayesian-Optimization/tree/GPflow-based) includes code for testing against the following batch acquisition functions:
5 | * Multipoint expected improvement (`qei.py`)
6 | * Multipoint expected improvement, Constant Liar Heuristic (`qei_cl.py`)
7 | * Batch Lower Confidence Bound (`blcb.py`)
8 | * Local Penalization of Expected Improvement (`lp_ei.py`)
9 | 
10 | Refer to the above-mentioned paper for references and more detailed descriptions.
11 | 
12 | ***Please use this branch for `OEI`, as the one in the [GPflow-based](https://github.com/oxfordcontrol/Bayesian-Optimization/tree/GPflow-based) branch is outdated and only intended for the other acquisition functions (QEI, QEI-Cl, BLCB and LP_EI).***
13 | 
14 | ## Installation
15 | This package was written in `Python 3.6`. It is recommended to use the `Anaconda >=5.0.1` distribution in an empty environment. The package is built on top of `gpflow 0.5`, which has to be installed from [source](https://github.com/GPflow/GPflow/releases/tag/0.5.0).
16 | Then, proceed to install the following:
17 | ```
18 | conda install numpy scipy matplotlib pyyaml
19 | ```
20 | You should also install [`SCS`](https://github.com/cvxgrp/scs), a convex solver. Make sure to compile the package (the `--no-binary :all:` flag below), as this can bring a significant speedup.
21 | ```
22 | pip install 'scs>=2.0.2' --no-binary :all:
23 | ```
24 | Moreover, you should install a Python interface to the `Intel MKL Pardiso` linear solver (freely distributed by the `Anaconda` distribution):
25 | ```
26 | conda install -c haasad pypardiso
27 | ```
28 | Finally, you have to download the second-order non-linear solver [`KNITRO`](https://www.artelys.com/en/optimization-tools/knitro), install it and copy its contents into a folder named `knitro` inside this package. `KNITRO` is proprietary, but time-limited academic and commercial licenses are provided for free by `Artelys`.
29 | 
30 | ## Testing
31 | Unit tests are included in the `tests` folder. You can invoke them using `pytest`.
To do so, the following additional packages are required:
32 | ```
33 | conda install pytest
34 | pip install 'cvxpy<1' numdifftools
35 | conda install -c mosek mosek
36 | ```
37 | 
38 | ## Usage example
39 | Running Batch Bayesian Optimization (BO) for the Hartmann-6d function with the `OEI` acquisition function can be invoked as follows:
40 | ```
41 | python run.py --seed=123 --algorithm=OEI --function=hart6 --batch_size=20 --initial_size=10 --iterations=15 --noise=1e-6
42 | ```
43 | The above will save its output in the folder named `out`. Detailed logging is saved in the `log` folder, the verbosity of which can be controlled by the `logging.yaml` configuration file.
44 | 
45 | To compare against random search and plot the results run:
46 | ```
47 | python run.py --seed=123 --algorithm=Random --function=hart6 --batch_size=20 --initial_size=10 --iterations=15 --noise=1e-6 --nl_solver=knitro --hessian=1
48 | 
49 | python plot.py hart6 out/hart6_OEI out/hart6_Random
50 | ```
51 | This will save a `pdf` plot in the `results` folder.
52 | 
53 | The user can define a different function for optimization by modifying `benchmark_functions.py`, or run **any of the ones presented in the [paper](https://arxiv.org/abs/1707.04191)**, the definitions of which can be found [here](https://github.com/oxfordcontrol/Bayesian-Optimization/blob/GPflow-based/test_functions/benchmark_functions.py).
54 | 
55 | ## Basic files
56 | 
57 | | File | Description |
58 | |----------------|-----------------------------------------------------------------|
59 | | `run.py` | Script that runs BO on test functions defined in `benchmark_functions.py`. |
60 | | `plot.py` | Plots results of `run.py`. |
61 | | `methods/oei.py` | Definition of `OEI`, its gradient and hessian. |
62 | | `methods/bo.py` | A parent class that implements a simple parametrizable BO setup. |
63 | 
64 | ## Timing results
65 | The main operations for calculating the value of `OEI`, its gradient and its hessian are listed below:
66 | * Value and gradient
67 |   * Solving the Semidefinite Optimization Problem [(relevant line)](https://github.com/oxfordcontrol/Bayesian-Optimization/blob/master/methods/sdp.py#L51-L52).
68 | * Hessian (given the solution of the SDP):
69 |   * Calculating the derivatives of M [(dominant operation)](https://github.com/oxfordcontrol/Bayesian-Optimization/blob/master/methods/sdp.py#L184) called from [here](https://github.com/oxfordcontrol/Bayesian-Optimization/blob/master/methods/oei.py#L61).
70 |   * Chain rules of tensorflow [(relevant line #1](https://github.com/oxfordcontrol/Bayesian-Optimization/blob/master/methods/oei.py#L60), [relevant line #2)](https://github.com/oxfordcontrol/Bayesian-Optimization/blob/master/methods/oei.py#L64).
71 | 
72 | In the current implementation a considerable amount of time is spent in Python to create, reshape and move matrices, all of which could be avoided in a more efficient implementation.
73 | 
--------------------------------------------------------------------------------
/benchmark_functions.py:
--------------------------------------------------------------------------------
1 | import random
2 | import numpy as np
3 | 
4 | 
5 | class scale_function():
6 |     '''
7 |     This wrapper takes another objective and scales its input domain to [-0.5, 0.5]^n.
8 |     It is useful when placing priors on the lengthscales.
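    A minimal usage sketch (hypothetical values, assuming the hart6 objective
    defined below):

        f = scale_function(hart6())
        f.bounds               # now [-0.5, 0.5] in every dimension
        X = np.zeros((1, 6))   # centre of the scaled domain
        y = f.f(X)             # evaluates hart6 at the centre of [0, 1]^6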
9 | ''' 10 | def __init__(self, function): 11 | self.bounds = function.bounds.astype(float) 12 | self.function = function 13 | self.bounds[:, 0] = -1/2 14 | self.bounds[:, 1] = 1/2 15 | if hasattr(function, 'fmin'): 16 | self.fmin = function.fmin 17 | 18 | def restore(self, X): 19 | means = (self.function.bounds[:, 1] + self.function.bounds[:, 0])/2 20 | lengths = self.function.bounds[:, 1] - self.function.bounds[:, 0] 21 | 22 | Xnorm = np.zeros(X.shape) 23 | for i in range(X.shape[0]): 24 | Xnorm[i, :] = X[i, :] * lengths + means 25 | 26 | return Xnorm 27 | 28 | def scale(self, X): 29 | means = (self.function.bounds[:, 1] + self.function.bounds[:, 0])/2 30 | lengths = self.function.bounds[:, 1] - self.function.bounds[:, 0] 31 | 32 | Xorig = np.zeros(X.shape) 33 | for i in range(X.shape[0]): 34 | Xorig[i, :] = (X[i, :] - means) / lengths 35 | 36 | return Xorig 37 | 38 | def f(self, X): 39 | Xorig = self.restore(X) 40 | X_ret = () 41 | y_ret = () 42 | # Split the evaluations into a number of batches 43 | N = 1 44 | for i in range(0, len(Xorig), N): 45 | ret = self.function.f(Xorig[i:i+N]) 46 | if isinstance(ret, tuple): 47 | y_ret = y_ret + (ret[0],) 48 | X_ret = X_ret + (ret[1],) 49 | else: 50 | y_ret = y_ret + (ret,) 51 | 52 | if isinstance(ret, tuple): 53 | y_ret = np.concatenate(y_ret) 54 | X_ret = np.concatenate(X_ret) 55 | # np.testing.assert_array_almost_equal(self.scale(X_ret), X) 56 | return y_ret, self.scale(X_ret) 57 | else: 58 | y_ret = np.concatenate(y_ret) 59 | return y_ret 60 | 61 | 62 | class hart6: 63 | ''' 64 | Hartmann 6-Dimensional function 65 | Based on the following MATLAB code: 66 | https://www.sfu.ca/~ssurjano/hart6.html 67 | ''' 68 | def __init__(self, sd=0): 69 | self.sd = sd 70 | self.bounds = np.array([[0, 1], [0, 1], [0, 1], 71 | [0, 1], [0, 1], [0, 1]]) 72 | self.min = np.array([0.20169, 0.150011, 0.476874, 73 | 0.275332, 0.311652, 0.6573]) 74 | self.fmin = -3.32237 75 | 76 | def f(self, xx): 77 | if len(xx.shape) == 1: 78 | xx = xx.reshape((1, 6)) 79 | 80 | assert xx.shape[1] == 6 81 | 82 | n = xx.shape[0] 83 | y = np.zeros(n) 84 | for i in range(n): 85 | alpha = np.array([1.0, 1.2, 3.0, 3.2]) 86 | A = np.array([[10, 3, 17, 3.5, 1.7, 8], 87 | [0.05, 10, 17, 0.1, 8, 14], 88 | [3, 3.5, 1.7, 10, 17, 8], 89 | [17, 8, 0.05, 10, 0.1, 14]]) 90 | P = 1e-4 * np.array([[1312, 1696, 5569, 124, 8283, 5886], 91 | [2329, 4135, 8307, 3736, 1004, 9991], 92 | [2348, 1451, 3522, 2883, 3047, 6650], 93 | [4047, 8828, 8732, 5743, 1091, 381]]) 94 | 95 | outer = 0 96 | for ii in range(4): 97 | inner = 0 98 | for jj in range(6): 99 | xj = xx[i, jj] 100 | Aij = A[ii, jj] 101 | Pij = P[ii, jj] 102 | inner = inner + Aij*(xj-Pij)**2 103 | 104 | new = alpha[ii] * np.exp(-inner) 105 | outer = outer + new 106 | 107 | y[i] = -outer 108 | 109 | if self.sd == 0: 110 | noise = np.zeros(n) 111 | else: 112 | noise = np.random.normal(0, self.sd, n) 113 | 114 | return (y + noise).reshape((n, 1)) 115 | 116 | -------------------------------------------------------------------------------- /log/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/oxfordcontrol/Bayesian-Optimization/980a0edb978710fe29529526977d0c14649b3088/log/.gitkeep -------------------------------------------------------------------------------- /logging.yaml: -------------------------------------------------------------------------------- 1 | version: 1 2 | formatters: 3 | verbose: 4 | format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s' 5 | simple: 6 
| format: '%(name)s - %(levelname)s - %(message)s' 7 | bare: 8 | format: '%(message)s' 9 | handlers: 10 | console: 11 | class: logging.StreamHandler 12 | level: CRITICAL 13 | formatter: simple 14 | stream: ext://sys.stdout 15 | evals_file: 16 | class: logging.FileHandler 17 | level: DEBUG 18 | formatter: bare 19 | filename: PATH/evals.log 20 | opt_file: 21 | class: logging.FileHandler 22 | level: DEBUG 23 | formatter: bare 24 | filename: PATH/opt.log 25 | model_file: 26 | class: logging.FileHandler 27 | level: DEBUG 28 | formatter: bare 29 | filename: PATH/model.log 30 | loggers: 31 | evals: 32 | level: INFO 33 | handlers: [evals_file] 34 | propagate: False 35 | opt: 36 | level: INFO 37 | handlers: [opt_file] 38 | propagate: False 39 | model: 40 | level: INFO 41 | handlers: [model_file] 42 | propagate: False 43 | root: 44 | level: DEBUG 45 | handlers: [console, evals_file, model_file, opt_file] -------------------------------------------------------------------------------- /methods/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/oxfordcontrol/Bayesian-Optimization/980a0edb978710fe29529526977d0c14649b3088/methods/__init__.py -------------------------------------------------------------------------------- /methods/bo.py: -------------------------------------------------------------------------------- 1 | from gpflow.gpr import GPR 2 | import numpy as np 3 | import tensorflow as tf 4 | import random 5 | import copy 6 | from .solvers import solve 7 | from .sdp import reset_warm_starting 8 | import logging.config 9 | import shutil 10 | import os 11 | import re 12 | import yaml 13 | 14 | 15 | class BO(GPR): 16 | ''' 17 | This is a simple (abstract) implementation of Bayesian Optimization. 18 | It extends gpflow's GPR. 19 | ''' 20 | def __init__(self, options): 21 | self.bounds = options['objective'].bounds.copy() 22 | self.dim = self.bounds.shape[0] 23 | 24 | if 'mean_function' not in options: 25 | options['mean_function'] = None 26 | super(BO, self).__init__(X=np.zeros((0, self.dim)), 27 | Y=np.zeros((0, self.dim)), 28 | kern=options['kernel'], 29 | mean_function=options['mean_function'] 30 | ) 31 | self.options = options.copy() 32 | 33 | # Fix the noise, if its value is provided 34 | if self.options['noise'] is not None: 35 | self.likelihood.variance = self.options['noise'] 36 | self.likelihood.variance.fixed = True 37 | 38 | def bayesian_optimization(self): 39 | ''' 40 | This function implements the main loop of the Bayesian Optimization 41 | ''' 42 | # Copy the objective. This is essential when testing draws from GPs 43 | objective = copy.copy(self.options['objective']) 44 | # Initialize the dataset at some random locations 45 | X0 = self.random_sample(self.bounds, self.options['initial_size']) 46 | # Evaluate the objective funtion there 47 | ret = objective.f(X0) 48 | # The objective might alter (e.g. discretize) X0. 49 | # In this case it returns two matrices: 50 | # the objective values (y0) and the altered locations (X0) 51 | if isinstance(ret, tuple): 52 | y0, X0 = ret 53 | else: 54 | y0 = ret 55 | 56 | # Set the data to the GP model 57 | # Careful, we might normalize the function evaluations 58 | # Provide only the first column of y0. 59 | # The others columns contain auxiliary data. 
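        # (When the 'normalize_Y' option is set, self.normalize() rescales this
        #  column to zero mean and unit standard deviation; see normalize() below.)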
60 | self.X = X0 61 | self.Y = self.normalize(y0[:, 0:1]) 62 | 63 | # X_all stores all the points where f was evaluated and 64 | # y_all the respective function values 65 | X_all = X0 66 | y_all = y0 67 | 68 | #------------------------------------------------------------------ 69 | # Logging - Information about the function and initial evaluations. 70 | self.setup_logging() 71 | logger = logging.getLogger('evals') 72 | logger.info('----------------------------') 73 | logger.info('Bounds:\n' + str(self.bounds)) 74 | if hasattr(objective, 'fmin'): 75 | logger.info('Minimum value:' + str(objective.fmin)) 76 | logger.info('----------------------------') 77 | for i in range(len(X0)): 78 | logger.info( 79 | 'X:' + str(X0[i, :]) + ' y: ' + str(y0[i, :]) 80 | ) 81 | #------------------------------------------------------------------ 82 | 83 | for i in range(self.options['iterations']): 84 | # ML (or MAP if there are priors) for the GP model 85 | self.optimize_restarts(restarts=self.options['model_restarts']) 86 | 87 | #------------------------------------------------------------------ 88 | # Logging - Print hyperparameters of the model 89 | logging.getLogger('').info('#Iteration:' + str(i + 1)) 90 | # Remove non-printable characters (bold identifiers) 91 | ansi_escape = re.compile(r'\x1b[^m]*m') 92 | logging.getLogger('model').info(ansi_escape.sub('', str(self))) 93 | #------------------------------------------------------------------ 94 | 95 | # Get the suggested points for optimising the acquisition function 96 | X_new = self.get_suggestion(self.options['batch_size']) 97 | # Evaluate the black-box function at the suggested points 98 | ret = objective.f(X_new) 99 | # The objective might alter (e.g. discretize) X_new. 100 | # In this case it returns two matrices: 101 | # the objective values (y_new) and the altered locations (X_new) 102 | if isinstance(ret, tuple): 103 | y_new, X_new = ret 104 | else: 105 | y_new = ret 106 | 107 | # Append the algorithm's choice X_new to X_all 108 | # Add the function evaluations f(X_new) to y_all 109 | X_all = np.concatenate((X_all, X_new)) 110 | y_all = np.concatenate((y_all, y_new)) 111 | 112 | # Update the GP model 113 | # Careful, the model might normalize the function evaluations 114 | # Provide only the first column of y_all. 115 | # The others columns contain auxiliary data. 
116 | self.X = X_all 117 | self.Y = self.normalize(y_all[:, 0:1]) 118 | 119 | #------------------------------------------------------------------ 120 | # Logging - Print new evaluations 121 | for j in range(len(X_new)): 122 | logging.getLogger('evals').info( 123 | 'X:' + str(X_new[j, :]) + ' y: ' + str(y_new[j, :]) 124 | ) 125 | #------------------------------------------------------------------ 126 | 127 | return X_all, y_all 128 | 129 | def get_suggestion(self, batch_size): 130 | ''' 131 | This function runs a descent-based optimization on the acquisition 132 | function to get a batch of bath_size evaluation points 133 | Inputs: batch_size: Number of evalution points 134 | Output: X: [batch_size x self.dim] ndarray with the evaluation points 135 | ''' 136 | X = None # Will hold the final decision 137 | y = None # Will hold the acquisition function value at X 138 | # Tile bounds to match batch size 139 | bounds_tiled = np.tile(self.bounds, (batch_size, 1)) 140 | # Run local gradient-descent optimizer multiple times 141 | # to avoid getting stuck in a poor local optimum 142 | for j in range(self.options['opt_restarts']): 143 | try: 144 | # Resets the warm starting of the SDP (sdp.py) as we start from 145 | # a random point that is not necessarily close to the previous ones. 146 | reset_warm_starting() 147 | 148 | # Initial point of the optimization 149 | X_init = self.random_sample(self.bounds, batch_size) 150 | y_init = self.acquisition(X_init)[0] 151 | 152 | X0, y0, status = solve(X_init=X_init, 153 | bounds=bounds_tiled, 154 | hessian=self.options['hessian'], 155 | bo=self, 156 | solver=self.options['nl_solver']) 157 | 158 | # Update X if the current local minimum is 159 | # the best one found so far 160 | if X is None or y0 < y: 161 | X, y = X0, y0 162 | 163 | 164 | #-------------------------------------------------------- 165 | # Logging - Print statistics for the non-linear solver 166 | logging.getLogger('opt').info( 167 | '##Opt_it:' + str(j + 1) + ' Val:' + '%.2e' % y0 + 168 | ' Diff:' + '%.2e' % (y_init - y0) + 169 | ' It:' + str(status.nit) 170 | ) 171 | if not status.success: 172 | logging.getLogger('opt').warning( 173 | 'NLP Warning:' + str(status.message) 174 | ) 175 | #-------------------------------------------------------- 176 | except AssertionError:# KeyboardInterrupt: 177 | raise 178 | ''' 179 | except Exception as e: 180 | #-------------------------------------------------------- 181 | # Logging - Report failure of the nonlinear solver 182 | logging.getLogger('info').critical( 183 | 'Optimization #' + str(j + 1) + 184 | 'of the acquisition function failed!\n Error:' + str(e) 185 | ) 186 | #-------------------------------------------------------- 187 | ''' 188 | 189 | # Assert that at least one optimization run succesfully 190 | assert X is not None 191 | 192 | return X 193 | 194 | def optimize_restarts(self, restarts=1, **kwargs): 195 | ''' 196 | Wrapper of gpflow's self._objective to allow for multiple 197 | random restarts on the maximization of the likelihood 198 | of the GP's hyperparameters. 
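        Each restart randomizes the free hyperparameters, calls gpflow's optimize()
        and evaluates the resulting objective (the negative log-likelihood, plus any
        prior terms); the best state found across all restarts is restored at the end.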
199 |         '''
200 |         if self._needs_recompile:
201 |             self.compile()
202 |         obj = self._objective
203 | 
204 |         par_min = self.get_free_state().copy()
205 |         val_min = obj(par_min)[0]
206 |         for i in range(restarts):
207 |             try:
208 |                 self.randomize()
209 |                 self.optimize(**kwargs)
210 |                 x = self.get_free_state().copy()
211 |                 val = obj(x)[0]
212 |             except KeyboardInterrupt:
213 |                 raise
214 |             except:
215 |                 val = float("inf")
216 | 
217 |             if val < val_min:
218 |                 par_min = x
219 |                 val_min = val
220 | 
221 |         self.set_state(par_min)
222 | 
223 |     @staticmethod
224 |     def random_sample(bounds, k):
225 |         '''
226 |         Generate a set of k n-dimensional points sampled uniformly at random
227 |         Inputs:
228 |             bounds: n x 2 dimensional array containing upper/lower bounds
229 |                     for each dimension
230 |             k: number of points
231 |         Output: k x n array containing the sampled points
232 |         '''
233 |         # k: Number of points
234 |         n = bounds.shape[0]  # Dimensionality of each point
235 |         X = np.zeros((k, n))
236 |         for i in range(n):
237 |             X[:, i] = np.random.uniform(bounds[i, 0], bounds[i, 1], k)
238 | 
239 |         return X
240 | 
241 |     def normalize(self, Y):
242 |         '''
243 |         When normalization is enabled, this function normalizes the first
244 |         column of Y to have zero mean and standard deviation one.
245 | 
246 |         Recall that the first column contains the output of the function under
247 |         minimization. The rest of the columns (if any) contain auxiliary data
248 |         that are only used for inspection purposes.
249 |         '''
250 | 
251 |         Y_ = Y.copy()
252 |         if self.options['normalize_Y'] and np.std(Y[:, 0]) > 0:
253 |             Y_[:, 0] = (Y[:, 0] - np.mean(Y[:, 0]))/np.std(Y[:, 0])
254 | 
255 |         return Y_
256 | 
257 |     def setup_logging(self):
258 |         '''
259 |         This function initialises the logging mechanism
260 |         '''
261 |         if 'seed' in self.options:
262 |             self.log_folder = 'log/' + self.options['job_name'] + '/' + str(self.options['seed']) + '/'
263 |         else:
264 |             self.log_folder = 'log/' + self.options['job_name'] + '/'
265 | 
266 |         try:
267 |             os.makedirs(self.log_folder)
268 |         except OSError:
269 |             pass
270 |         # Load config file
271 |         with open('logging.yaml', 'r') as f:
272 |             config = f.read()
273 |         # Prepend logging folder
274 |         config = config.replace('PATH/', self.log_folder)
275 |         # Setup logging
276 |         logging.config.dictConfig(yaml.load(config))
277 |         #------------------------------------------------------------------
278 | 
--------------------------------------------------------------------------------
/methods/oei.py:
--------------------------------------------------------------------------------
1 | from .bo import BO
2 | import numpy as np
3 | import scipy.linalg as la
4 | import logging
5 | from .sdp import sdp, solution_derivative
6 | import tensorflow as tf
7 | from gpflow.param import AutoFlow
8 | from gpflow._settings import settings
9 | float_type = settings.dtypes.float_type
10 | 
11 | 
12 | class OEI(BO):
13 |     '''
14 |     This class implements the Optimistic Expected Improvement acquisition function.
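    The value of the acquisition at a batch X is the optimal value of a small
    semidefinite program built from the posterior second-order moment matrix
    Omega(X) (see methods/sdp.py). The gradient is recovered from the SDP
    solution M in tensorflow (acquisition_tf), and the hessian additionally uses
    the sensitivities of M computed by solution_derivative.

    A rough sketch of the entry points used by the nonlinear solvers
    (hypothetical shapes, for a batch of q points in d dimensions):

        x = X.flatten()                     # X is a (q, d) ndarray of candidate points
        value, grad = self.acquisition(x)   # SDP value and gradient; grad has shape (q*d,)
        hess = self.acquisition_hessian(x)  # (q*d, q*d) ndarray, used when hessian=1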
15 | ''' 16 | def __init__(self, options): 17 | super(OEI, self).__init__(options) 18 | 19 | def acquisition(self, x): 20 | ''' 21 | The acquisition function and its gradient 22 | Input: x: flattened ndarray [batch_size x self.dim] containing the evaluation points 23 | Output[0]: Value of the acquisition function 24 | Output[1]: Gradient of the acquisition function 25 | ''' 26 | # Calculate the minimum value so far 27 | fmin = np.min(self.predict_f(self.X.value)[0]) 28 | X = x.reshape((-1, self.dim)) 29 | 30 | X_, V = self.project(X) # See comments in self.project 31 | if len(X_) > 0: 32 | # Solve the SDP 33 | value, M_, _, _ = sdp(self.omega(X_), fmin, warm_start=(len(X)==len(X_))) 34 | _, gradient_ = self.acquisition_tf(X_, M_) 35 | gradient = V.T.dot(gradient_) 36 | else: 37 | value = 0; gradient = np.random.rand(len(x)) 38 | 39 | return np.array([value]), gradient.flatten() 40 | 41 | def acquisition_hessian(self, x): 42 | ''' 43 | The acquisition function and its gradient 44 | Input: x: flattened ndarray [batch_size x self.dim] containing the evaluation points 45 | Output[0]: Value of the acquisition function 46 | Output[1]: Gradient of the acquisition function 47 | ''' 48 | # Calculate the minimum value so far 49 | fmin = np.min(self.predict_f(self.X.value)[0]) 50 | X = x.reshape((-1, self.dim)) 51 | 52 | # See comments on self.project 53 | X_, _ = self.project(X) 54 | if len(X) != len(X_): 55 | return np.zeros((x.shape[0], x.shape[0])) 56 | 57 | # Solve SDP 58 | _, M, Y, C = sdp(self.omega(X), fmin) 59 | # Calculate domega/dx and dM/dx 60 | domega = self.domega(X.flatten()) 61 | dM = solution_derivative(M, Y, C, domega) 62 | 63 | # Perform the chain rules in Tensorflow 64 | return self.acquisition_hessian_tf(X.flatten(), M, dM, domega) 65 | 66 | @AutoFlow((float_type, [None, None]), (float_type, [None, None])) 67 | def acquisition_tf(self, X, M): 68 | ''' 69 | Calculates the acquisition function, given M the optimizer of the SDP. 70 | The calculation is simply a matrix inner product. 71 | Input: X: ndarray [batch_size x self.dim] containing the evaluation points 72 | Output[0]: Value of the acquisition function 73 | Output[1]: Gradient of the acquisition function 74 | ''' 75 | f = tf.tensordot(self.omega_tf(X), M, axes=2) 76 | df = tf.gradients(f, X)[0] 77 | return f, df 78 | 79 | def omega_tf(self, X): 80 | ''' 81 | Calculates the second order moment matrix in tensorflow. 82 | Input: X: ndarray [batch_size x self.dim] containing the evaluation points 83 | Output: [Sigma(X) + mu(X)*mu(X).T mu(X); mu(X).T 1] 84 | where mu(X) and Sigma(X) are the mean and variance of the GP posterior at X. 
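        Note that the likelihood (noise) variance is added to the diagonal of
        Sigma(X), and that for a batch of q points the returned matrix has
        size (q+1) x (q+1).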
85 | ''' 86 | mean, var = self.build_predict(X, full_cov=True) 87 | var = var[:, :, 0] + tf.eye(tf.shape(var)[0], dtype=float_type)*self.likelihood.variance 88 | 89 | # Create omega 90 | omega = var + tf.matmul(mean, mean, transpose_b=True) 91 | omega = tf.concat([omega, mean], axis=1) 92 | omega = tf.concat([omega, 93 | tf.concat([tf.transpose(mean), [[1]]], axis=1)], 94 | axis=0) 95 | 96 | return omega 97 | 98 | @AutoFlow((float_type, [None, None])) 99 | def omega(self, X): 100 | ''' 101 | Just an autoflow wrapper for omega_tf 102 | ''' 103 | return self.omega_tf(X) 104 | 105 | @AutoFlow((float_type, [None]), (float_type, [None, None]), 106 | (float_type, [None, None, None]), (float_type, [None, None, None])) 107 | def acquisition_hessian_tf(self, x, M, dM, domega): 108 | ''' 109 | Input: x: flattened ndarray [batch_size x self.dim] containing the evaluation points 110 | M: [(batch_size + 1) x (batch_size + 1)] ndarray 111 | containing the solution of the SDP 112 | dM: [(batch_size + 1) x (batch_size + 1) x (batch_size * self.dim)] ndarray 113 | containing the ''jacobian'' of M w.r.t x. 114 | domega: [(batch_size + 1) x (batch_size + 1) x (batch_size * self.dim)] ndarray 115 | containing the ''jacobian'' of the second order moment Omega with respect to x. 116 | Output: hessian of the acquisition function: [(batch_size * self.dim) x (batch_size * self.dim)] ndarray 117 | ''' 118 | X = tf.reshape(x, [self.options['batch_size'], -1]) 119 | f = tf.tensordot(self.omega_tf(X), M, axes=2) 120 | d2f_a = tf.hessians(f, x)[0] 121 | d2f_b = tf.tensordot(tf.transpose(dM), domega, axes=2) 122 | 123 | return d2f_a + d2f_b 124 | 125 | @AutoFlow((float_type, [None])) 126 | def domega(self, x): 127 | ''' 128 | Input: x: flattened ndarray [batch_size x self.dim] containing the evaluation points 129 | Output: domega: [(batch_size + 1) x (batch_size + 1) x (batch_size * self.dim)] ndarray 130 | containing the ''jacobian'' of the second order moment Omega with respect to x. 131 | ''' 132 | X = tf.reshape(x, [self.options['batch_size'], -1]) 133 | omega = self.omega_tf(X) 134 | domega = self.jacobian(tf.reshape(omega, [-1]), x) 135 | return tf.reshape(domega,[omega.shape[0], omega.shape[1], -1]) 136 | 137 | def jacobian(self, y_flat, x): 138 | ''' 139 | Calculates the jacobian of y_flat with respect to x 140 | Code taken from jeisses' comment on 141 | https://github.com/tensorflow/tensorflow/issues/675 142 | ''' 143 | n = y_flat.shape[0] 144 | 145 | loop_vars = [ 146 | tf.constant(0, tf.int32), 147 | tf.TensorArray(float_type, size=n), 148 | ] 149 | 150 | _, jacobian = tf.while_loop( 151 | lambda j, _: j < n, 152 | lambda j, result: (j+1, result.write(j, tf.gradients(y_flat[j], x))), 153 | loop_vars) 154 | 155 | return jacobian.stack() 156 | 157 | def project(self, X): 158 | ''' 159 | According to Proposition 8, OEI and QEI are not differentiable when the kernel is noiseless and 160 | there are duplicates in X. In these cases calculating M, SDP's optimizer, becomes increasing hard, as 161 | the SDP problem becomes ill conditioned. 162 | 163 | In this cases, we get around this problem by solving a smaller (projected) SDP problem, 164 | and providing a descent direction that causes the duplicates to separate and, if requested, a hessian equal to zero. 
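        Duplicates are detected in lengthscale-normalized coordinates with an absolute
        tolerance of roughly 1e-2, and this projection is only used when the likelihood
        variance is at most 1e-4, i.e. when the model is effectively noiseless.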
165 |         Input: X: ndarray [batch_size x self.dim] containing the evaluation points
166 |         Output:
167 |             If self.likelihood.variance > 1e-4:
168 |                 Returns X (same as input) and V, an identity matrix
169 |             If the kernel is noiseless:
170 |                 X_u: ndarray [q x self.dim] containing the unique evaluation points (it also removes duplicates of the dataset)
171 |                 V: ndarray [q x batch_size] projection matrix that, given the solution M of the smaller sdp problem,
172 |                    can be used to calculate an appropriate descent direction
173 |         '''
174 |         if self.likelihood.variance.value > 1e-4:
175 |             return X, np.eye(X.shape[0])
176 | 
177 |         l = self.kern.lengthscales.value
178 | 
179 |         # Finding duplicates of dataset
180 |         # See https://stackoverflow.com/questions/11903083/find-the-set-difference-between-two-large-arrays-matrices-in-python
181 |         idx = np.where(np.all(np.abs((X/l)[:, None, :] - self.X.value/l) < 1e-2, axis=2))[0]
182 |         idx_ = np.setdiff1d(np.arange(X.shape[0]), idx)
183 |         # Remove duplicates of the dataset
184 |         X_u = X[idx_]
185 |         V = np.eye(X.shape[0])[idx_]
186 |         random_direction = np.random.rand(V.shape[0], len(idx))
187 |         V[:, idx] = random_direction/np.linalg.norm(random_direction, axis=0, keepdims=True)
188 | 
189 |         if X_u.shape[0] > 0:
190 |             _, idx = np.unique(np.round(X_u / l, decimals=2), return_index=True, axis=0)
191 |             X_u = X_u[idx]
192 |             V = V[idx]
193 | 
194 |         # If no points were removed, then just return the original X, to preserve the order of its rows
195 |         if len(X_u) == len(X):
196 |             X_u = X
197 |             V = np.eye(X.shape[0])
198 | 
199 |         return X_u, V
200 | 
--------------------------------------------------------------------------------
/methods/random.py:
--------------------------------------------------------------------------------
1 | from __future__ import print_function
2 | from .bo import BO
3 | 
4 | 
5 | class Random(BO):
6 |     '''
7 |     Random strategy
8 |     '''
9 |     def __init__(self, options):
10 |         super(Random, self).__init__(options)
11 | 
12 |     def get_suggestion(self, batch_size):
13 |         return self.random_sample(self.bounds, batch_size)
14 | 
--------------------------------------------------------------------------------
/methods/sdp.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import scs
3 | import collections
4 | import logging
5 | import scipy.sparse as sp
6 | from pypardiso import spsolve, factorized
7 | import pypardiso
8 | 
9 | OUTPUT_LEVEL = 0
10 | 
11 | def sdp(omega, fmin, warm_start=True):
12 |     '''
13 |     Solves the following SDP with the first-order primal-dual solver SCS:
14 |     ---------Primal Form----------
15 |     maximize   <omega, M>
16 |     s.t.       C_i - M positive semidefinite for all i = 0...k
17 | 
18 |     ---------Dual Form------------
19 |     minimize   \sum_{i=1}^{k} <C_i, Y_i>
20 |     s.t.       Y_i positive semidefinite for all i = 0...k
21 |                \sum_{i=0}^{k} Y_i = omega
22 |     ------------------------------
23 |     in order to get the value and the gradient of the acquisition function.
24 |     Inputs:
25 |         omega: Second order moment matrix
26 |         fmin: min value achieved so far, i.e. min(y0)
27 |         warm_start: Whether or not to warm-start the solution
28 |     Outputs:
29 |         opt_val: Optimal value of the SDP
30 |         M: Optimizer of the SDP
31 |         Y: List of the Optimal Lagrange Multipliers (Dual Optimizers) of the SDP
32 |         C: List of C_i, i = 0 ...
k 33 | ''' 34 | omega = (omega + omega.T)/2 35 | assert not np.isnan(omega).any() 36 | 37 | # Express the problem in the format required by scs 38 | data = create_scs_data(omega, fmin) 39 | k_ = omega.shape[0] 40 | cone = {'s': [k_]*k_} 41 | if not 'past_omegas' in globals() or len(past_omegas) == 0 or \ 42 | past_omegas[0].shape[0] != k_: 43 | # Clear the saved solutions, as they are of different size 44 | # than the one we are currently trying to solve. 45 | reset_warm_starting() 46 | elif warm_start: 47 | # Update data with warm-started solution 48 | data = get_warm_start(omega, data) 49 | 50 | # Call SCS 51 | sol = scs.solve(data, cone, eps=1e-5, use_indirect=False, verbose=OUTPUT_LEVEL==1) 52 | # print(sol['info']['solveTime']) # Prints solution time for SDP. 53 | if sol['info']['status'] != 'Solved': 54 | logging.getLogger('opt').warning( 55 | 'SCS solution status:' + sol['info']['status'] 56 | ) 57 | 58 | # Extract solution from SCS' structures 59 | M, Y = unpack_solution(sol['x'], sol['y'], k_) 60 | objective = -sol['info']['pobj'] 61 | sol['C'] = data['C'] 62 | 63 | if warm_start: 64 | # Save solution for warm starting 65 | past_solutions.append(sol); past_omegas.append(omega) 66 | 67 | return objective, M, Y, data['C'] 68 | 69 | def reset_warm_starting(): 70 | ''' 71 | Clears list of previous solutions that are used for warm-starting SCS 72 | ''' 73 | global past_omegas, past_solutions 74 | past_omegas = collections.deque(maxlen=20) 75 | past_solutions = collections.deque(maxlen=20) 76 | 77 | def get_warm_start(omega, data): 78 | ''' 79 | Updates SCS' data structure with a warm-started solution 80 | ''' 81 | k_ = omega.shape[0] 82 | 83 | # Search on the list of past solved problems for the one 84 | # that had the most similar omega matrix as the one we 85 | # are trying to solve now 86 | def sort_func(X): 87 | return np.linalg.norm(X - omega) 88 | min_score = float('inf') 89 | idx = -1 90 | for i in range(len(past_omegas)): 91 | score = sort_func(past_omegas[i]) 92 | if min_score > score: 93 | min_score = score 94 | idx = i 95 | 96 | # Copy the solution from the closest match 97 | x = past_solutions[idx]['x'].copy() 98 | y = past_solutions[idx]['y'].copy() 99 | s = past_solutions[idx]['s'].copy() 100 | 101 | # Improve warm starting by moving the solution in the direction of dM, dy 102 | try: 103 | domega = -(past_omegas[idx] - omega) 104 | M, Y = unpack_solution(x, y, k_) 105 | C = past_solutions[idx]['C'] 106 | dM, dY = solution_derivative(M, Y, C, domega, return_dY=True) 107 | 108 | x += pack(dM[:,:,0], k_) 109 | n = len(x) 110 | for i in range(0, k_): 111 | y[i*n:(i+1)*n] += pack(dY[i], k_) 112 | s[i*n:(i+1)*n]-= pack(dM[:,:,0], k_) 113 | except pypardiso.pardiso_wrapper.PyPardisoError: 114 | # In the (very) case where pardiso fails due to badly conditioned data 115 | logging.getLogger('opt').warning('Call to pardiso failed. Switching to zero-order warm starting.') 116 | pass 117 | 118 | # Add warm starting to the problem data 119 | data['x'] = x; data['y'] = y; data['s'] = s 120 | 121 | return data 122 | 123 | def create_scs_data(omega, fmin): 124 | ''' 125 | Returns a data structure that containts the problem data as required by SCS. 
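    For omega of size k_ x k_ the packed dimension is n = k_*(k_+1)/2; A is a
    (k_*n) x n stack of identity blocks, b stacks the packed C_i matrices and
    c = -pack(omega).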
126 | See github.com/cvxgrp/scs 127 | ''' 128 | k_ = omega.shape[0] 129 | 130 | # Create A 131 | n = k_ * (k_ + 1) // 2 132 | A = np.zeros((k_*n, n)) 133 | row_ind = np.array([]) 134 | col_ind = np.array([]) 135 | z = np.array([]) 136 | for i in range(n): 137 | row_ind = np.append(row_ind, np.arange(i, k_*n, n)) 138 | col_ind = np.append(col_ind, np.repeat(i, k_)) 139 | z = np.append(z, np.repeat(1, k_)) 140 | A = sp.csc_matrix((z, (row_ind, col_ind)), shape=(k_*n, n)) 141 | 142 | # Create b and C 143 | b = np.array([]) 144 | C = [] 145 | C.append(np.zeros((k_, k_))) 146 | b = np.append(b, pack(C[0], k_)) 147 | for i in range(1, k_): 148 | C.append(np.zeros((k_, k_))) 149 | C[i][-1, i - 1] = 1/2 150 | C[i][i - 1, -1] = 1/2 151 | C[i][-1, -1] = -fmin 152 | b = np.append(b, pack(C[i], k_)) 153 | 154 | # Create c 155 | c = -pack(omega, k_) 156 | 157 | return {'A': A, 'b': b, 'c': c, 'C': C} 158 | 159 | def solution_derivative(M, Y, C, domegas, return_dY=False): 160 | ''' 161 | Calculates the derivatives of M and Y across perturbations of the cost matrix 162 | defined in domegas. 163 | Inputs: M: Primal Solution of the SDP 164 | Y: List of Dual Solutions of the SDP 165 | C: List of the auxiliary matrices in the definition of the SDP 166 | domegas: 3D-ndarray with each domegas[:, :, i] defining a perturbation 167 | Outputs: dM: 3D-ndarray with each dM[:, :, i] being the derivative of M when 168 | perturbing omega across domega[:, :, i] 169 | dY: list of 3D-ndarrays having, similarly to dM, 170 | the derivatives of Y when perturbing omega 171 | ''' 172 | 173 | H = create_matrix(M, Y, C) 174 | assert len(domegas.shape) == 2 or len(domegas.shape) == 3 175 | if len(domegas.shape) == 2: 176 | domegas = domegas[:, :, None] 177 | 178 | n = domegas.shape[0] 179 | k = domegas.shape[-1] 180 | 181 | # z a vectorized copy of the upper triangular part of domegas 182 | z = np.zeros((n**2 + n*(n+1)//2, k)) 183 | z[-n*(n+1)//2:, :] = domegas[np.triu_indices(n)] 184 | x = H(z) 185 | 186 | def convert_to_matrix(x): 187 | G = np.zeros((n, n, k)) 188 | G[np.triu_indices(n)] = x[-n*(n+1)//2:, :] 189 | G = G + np.transpose(G, [1, 0, 2]) 190 | G[np.diag_indices(n)] = G[np.diag_indices(n)]/2 191 | return G 192 | 193 | dM = convert_to_matrix(x[-n*(n+1)//2:, :]) 194 | 195 | if return_dY: 196 | assert k == 1 # This part has been only implemented for the case where k = 1 197 | dy = np.reshape(x[0:n**2], (n, n, k)) 198 | dY = [] 199 | for i in range(n): 200 | eig_Y_i = np.linalg.eigh(Y[i]) 201 | y_i = (eig_Y_i[0][-1]**.5 * eig_Y_i[1][:, -1])[:, None] 202 | dY.append(y_i.dot(dy[i].T) + dy[i].dot(y_i.T)) 203 | 204 | return dM, dY 205 | else: 206 | return dM 207 | 208 | def create_matrix(M, Y, C): 209 | ''' 210 | Creates the left-hand side matrix of the linear system that gives dM, dy 211 | ''' 212 | k = len(Y) 213 | y = [] 214 | for i in range(len(Y)): 215 | eig_Y_i = np.linalg.eigh(Y[i]) 216 | y_i = (eig_Y_i[0][-1]**.5 * eig_Y_i[1][:, -1])[:, None] 217 | y.append(y_i) 218 | 219 | S = M - C 220 | P = get_P(k) 221 | P_ = get_P_(k) 222 | H1 = sp.block_diag(S, format='csc') 223 | I = sp.eye(k, format='csc') 224 | for i in range(len(y)): 225 | h2 = sp.kron(y[i].T, I).dot(P_) 226 | h3 = P.dot(sp.kron(y[i], I) + sp.kron(I, y[i])) 227 | if i == 0: 228 | H2 = h2 229 | H3 = h3 230 | else: 231 | H2 = sp.vstack((H2, h2)) 232 | H3 = sp.hstack((H3, h3)) 233 | H = sp.bmat([[H1, H2], [H3, None]]) 234 | return factorized(H) 235 | 236 | def unpack_solution(x, y, k_): 237 | ''' 238 | Creates matrices M and Y from scs solution structure 239 | 
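    x holds the packed primal matrix M, while y stacks the k_ packed dual
    blocks Y_0 ... Y_{k_-1}, each of length k_*(k_+1)/2.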
''' 240 | M = unpack(x, k_) 241 | Y = [] 242 | for i in range(k_): 243 | n = k_ * (k_ + 1) // 2 244 | Y.append( 245 | unpack(y[i*n:(i+1)*n], k_) 246 | ) 247 | return M, Y 248 | 249 | def pack(Z, n): 250 | ''' 251 | Auxiliary function for 'packing' a matrix into 252 | a vector format, as required by scs. 253 | See github.com/cvxgrp/scs 254 | ''' 255 | Z = np.copy(Z) 256 | tidx = np.triu_indices(n) 257 | tidx = (tidx[1], tidx[0]) 258 | didx = np.diag_indices(n) 259 | 260 | Z = Z * np.sqrt(2.) 261 | Z[didx] = Z[didx] / np.sqrt(2.) 262 | z = Z[tidx] 263 | 264 | return z 265 | 266 | def unpack(z, n): 267 | ''' 268 | Auxiliary function for 'unpacking' a packed matrix of a vector format, 269 | as required by scs. See github.com/cvxgrp/scs 270 | ''' 271 | z = np.copy(z) 272 | tidx = np.triu_indices(n) 273 | tidx = (tidx[1], tidx[0]) 274 | didx = np.diag_indices(n) 275 | 276 | Z = np.zeros((n, n)) 277 | Z[tidx] = z 278 | Z = (Z + np.transpose(Z)) / np.sqrt(2.) 279 | Z[didx] = Z[didx] / np.sqrt(2.) 280 | 281 | return Z 282 | 283 | def get_P_(k): 284 | A = sp.csc_matrix((0, k*(k+1)//2)) 285 | for i in range(k): 286 | B = sp.csc_matrix((0, 0)) 287 | for j in range(i): 288 | tmp = sp.csc_matrix(([1], ([0], [i-j])), shape=(1, k - j)) 289 | B = sp.block_diag((B, tmp)) 290 | 291 | B = sp.block_diag((B, sp.eye(k - i, format='csc'))) 292 | B = sp.hstack([B, sp.csc_matrix((k, k*(k+1)//2 - B.shape[1]))]) 293 | A = sp.vstack([A, B]) 294 | 295 | return A 296 | 297 | def get_P(k): 298 | A = sp.csc_matrix((0, 0)) 299 | I = sp.eye(k, format='csc') 300 | for i in range(k): 301 | A = sp.block_diag((A, I[i:, :])) 302 | 303 | return A 304 | -------------------------------------------------------------------------------- /methods/solvers.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import logging 3 | OUTPUT_LEVEL = 0 4 | 5 | ''' 6 | L-BFGS-B wrapper 7 | ''' 8 | import scipy as sp 9 | 10 | def bfgs_solve(x_init, bounds, hessian, bo): 11 | res = sp.optimize.minimize(fun=bo.acquisition, 12 | x0=x_init, 13 | method='L-BFGS-B', 14 | jac=True, 15 | bounds=bounds, 16 | options={'disp': OUTPUT_LEVEL} 17 | ) 18 | x = res.x 19 | y = res.fun if isinstance(res.fun, float) else res.fun[0] 20 | 21 | return x, y, res 22 | 23 | try: 24 | ''' 25 | Knitro wrapper 26 | ''' 27 | import sys 28 | sys.path.insert(0, './knitro/examples/Python') 29 | from knitro import * 30 | from knitroNumPy import * 31 | from collections import namedtuple 32 | ''' 33 | See Knitro's callback library reference 34 | https://www.artelys.com/tools/knitro_doc/3_referenceManual/callableLibrary/API.html 35 | The code here is based on the example file 36 | examples/Python/exampleHS15NumPy.py 37 | ''' 38 | def callbackEvalFC(bo, evalRequestCode, n, m, nnzJ, nnzH, x, 39 | lambda_, obj, c, objGrad, jac, hessian, hessVector, 40 | userParams): 41 | if evalRequestCode == KTR_RC_EVALFC: 42 | np.copyto(obj, bo.acquisition(x)[0]) 43 | return 0 44 | else: 45 | return KTR_RC_CALLBACK_ERR 46 | 47 | def callbackEvalGA(bo, evalRequestCode, n, m, nnzJ, nnzH, x, 48 | lambda_, obj, c, objGrad, jac, hessian, hessVector, 49 | userParams): 50 | if evalRequestCode == KTR_RC_EVALGA: 51 | np.copyto(objGrad, bo.acquisition(x)[1]) 52 | return 0 53 | else: 54 | return KTR_RC_CALLBACK_ERR 55 | 56 | def callbackEvalH(bo, evalRequestCode, n, m, nnzJ, nnzH, x, 57 | lambda_, obj, c, objGrad, jac, hessian, hessVector, 58 | userParams): 59 | indices = np.triu_indices(n) 60 | np.copyto(hessian, bo.acquisition_hessian(x)[indices]) 61 | 
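        # Knitro stores the hessian as a dense upper-triangular array:
        # hessRow/hessCol are declared with np.triu_indices(n) in knitro_solve,
        # so only those entries of the full hessian are copied above.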
if evalRequestCode == KTR_RC_EVALH: 62 | # In this case we want the full hessian 63 | return 0 64 | elif evalRequestCode == KTR_RC_EVALH_NO_F: 65 | # In this case we only want the constraint part of the lagrangian 66 | np.copyto(hessian, np.zeros(hessian.shape)) 67 | return 0 68 | else: 69 | return KTR_RC_CALLBACK_ERR 70 | 71 | def knitro_solve(x_init, bounds, hessian, bo): 72 | # We have to implement the case of non hessian 73 | n = bounds.shape[0] 74 | objGoal = KTR_OBJGOAL_MINIMIZE 75 | objType = KTR_OBJTYPE_GENERAL 76 | bndsLo = bounds[:, 0].copy() 77 | bndsUp = bounds[:, 1].copy() 78 | # Hessian is dense 79 | hessRow, hessCol = np.triu_indices(n) 80 | # No constraints 81 | m = 0 82 | cType = np.array([]) 83 | cBndsLo = np.array([]) 84 | cBndsUp = np.array([]) 85 | jacIxConstr = np.array([]) 86 | jacIxVar = np.array([]) 87 | kc = KTR_new() 88 | assert kc is not None, "Failed to initialize knitro. Check license." 89 | # Set knitro parameters 90 | # Derivative Checker 91 | # assert not KTR_set_int_param_by_name(kc, "derivcheck", 2) 92 | # Verbosity 93 | if OUTPUT_LEVEL == 0: 94 | assert not KTR_set_int_param_by_name(kc, "outlev", 0) 95 | # verbose: 96 | # assert not KTR_set_int_param_by_name(kc, "outlev", 2) 97 | # Super verbose: prints iterates information 98 | # assert not KTR_set_int_param_by_name(kc, "outlev", 4) 99 | if hessian: 100 | # Exact hessian 101 | assert not KTR_set_int_param_by_name(kc, "hessopt", 1) 102 | assert not KTR_set_int_param_by_name(kc, "hessian_no_f", 1) 103 | else: 104 | assert not KTR_set_int_param_by_name(kc, "hessopt", 3) # SR1 105 | # Compare performance of all the algorithms 106 | assert not KTR_set_int_param_by_name(kc, "algorithm", 4) # SQP 107 | # assert not KTR_set_int_param_by_name(kc, "linsolver", 3) # Non-sparse 108 | # assert not KTR_set_char_param_by_name(kc, "outlev", "all") 109 | 110 | # set callbacks 111 | assert not KTR_set_func_callback( 112 | kc, lambda *args: callbackEvalFC(bo, *args)) 113 | assert not KTR_set_grad_callback( 114 | kc, lambda *args: callbackEvalGA(bo, *args)) 115 | if hessian: 116 | assert not KTR_set_hess_callback( 117 | kc, lambda *args: callbackEvalH(bo, *args)) 118 | ret = KTR_init_problem(kc, n, objGoal, objType, bndsLo, bndsUp, cType, 119 | cBndsLo, cBndsUp, jacIxVar, jacIxConstr, hessRow, 120 | hessCol, x_init, None) 121 | if ret: 122 | raise RuntimeError( 123 | "Error initializing the problem, Knitro status = %d" % ret) 124 | # These will hold the solutions 125 | x = x_init.copy() 126 | lambda_ = np.zeros(m + n) 127 | obj = bo.acquisition(x)[0] 128 | try: 129 | nStatus = KTR_solve(kc, x, lambda_, 0, obj, None, None, None, None, None, 130 | None) 131 | nIter = KTR_get_number_iters(kc) 132 | fcEvals = KTR_get_number_FC_evals(kc) 133 | gaEvals = KTR_get_number_GA_evals(kc) 134 | hEvals = KTR_get_number_H_evals(kc) 135 | finally: 136 | KTR_free(kc) 137 | Result = namedtuple('Result', 'status nit fEvals gEvals hEvals success message') 138 | return x, obj, Result(status=nStatus, nit=nIter, 139 | fEvals=fcEvals, gEvals=gaEvals, hEvals=hEvals, 140 | success=(nStatus==0), message=str(nStatus) 141 | ) 142 | except OSError: 143 | logging.getLogger('').debug( 144 | "Couldn't load knitro. Make sure it is properly installed." 145 | ) 146 | except ModuleNotFoundError: 147 | logging.getLogger('').warning( 148 | "Couldn't find knitro. Make sure it is installed and visible." 
149 | ) 150 | 151 | def solve(X_init, bounds, hessian, bo, solver): 152 | ''' 153 | Perform non-linear optimization on bo.acquisition_function with either L-BFGS-B (Pseudo-Newton) 154 | or Knitro (Sequential Quadratic Programming) 155 | ''' 156 | x_init = X_init.flatten() 157 | 158 | if solver == 'bfgs': 159 | x, y, status = bfgs_solve(x_init=x_init, bounds=bounds, 160 | hessian=hessian, bo=bo) 161 | elif solver == 'knitro': 162 | try: 163 | x, y, status = knitro_solve(x_init=x_init, bounds=bounds, 164 | hessian=hessian, bo=bo) 165 | except NameError: 166 | logging.getLogger('').critical( 167 | 'Please install knitro to use it.' 168 | ) 169 | raise 170 | else: 171 | assert False, 'Invalid nonlinear solver choice!' 172 | 173 | k = x.size // bo.dim 174 | X = x.reshape(k, bo.dim) 175 | return X, y, status 176 | -------------------------------------------------------------------------------- /out/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/oxfordcontrol/Bayesian-Optimization/980a0edb978710fe29529526977d0c14649b3088/out/.gitkeep -------------------------------------------------------------------------------- /plot.py: -------------------------------------------------------------------------------- 1 | ''' 2 | This script plots the results of various algorithms at the same plot, and 3 | saves it as a pdf in the results/folder. 4 | 5 | Call it as: 6 | python plot_experiments fig_name folder1 ... folderN 7 | where folder* are the folders of the saved experiments. 8 | ''' 9 | 10 | import os.path 11 | import glob 12 | import numpy as np 13 | import pickle 14 | import argparse 15 | import matplotlib 16 | 17 | SMALL_SIZE = 8 18 | MEDIUM_SIZE = 10 19 | BIGGER_SIZE = 12 20 | 21 | matplotlib.rc('font', size=SMALL_SIZE) # controls default text sizes 22 | matplotlib.rc('axes', titlesize=SMALL_SIZE) # fontsize of the axes title 23 | matplotlib.rc('xtick', labelsize=SMALL_SIZE) # fontsize of the tick labels 24 | matplotlib.rc('ytick', labelsize=SMALL_SIZE) # fontsize of the tick labels 25 | matplotlib.rc('legend', fontsize=SMALL_SIZE) # legend fontsize 26 | matplotlib.rc('figure', titlesize=SMALL_SIZE) # fontsize of the figure title 27 | 28 | import matplotlib.pyplot as plt 29 | from matplotlib.ticker import MaxNLocator 30 | # When no X server is present 31 | plt.switch_backend('agg') 32 | 33 | 34 | if __name__ == '__main__': 35 | parser = argparse.ArgumentParser() 36 | parser.add_argument("name", nargs=1) 37 | parser.add_argument("folders", nargs="+") 38 | parser.add_argument('--linewidth', type=float, default=1) 39 | parser.add_argument('--capsize', type=float, default=1.5) 40 | parser.add_argument('--offset_start', type=float, default=-0.2) 41 | parser.add_argument('--offset_delta', type=float, default=0.1) 42 | parser.add_argument('--sizex', type=float, default=5) 43 | parser.add_argument('--sizey', type=float, default=2.5) 44 | parser.add_argument('--regret', type=int, default=1) 45 | parser.add_argument('--max_iters', type=int) 46 | parser.add_argument('--step', type=int, default=1) 47 | args = parser.parse_args() 48 | 49 | plot_experiments(args) 50 | 51 | 52 | def plot_experiments(options): 53 | ''' 54 | Plots the results of a number of different algorithms on the same plot 55 | ''' 56 | # Hopefully we won't plot more than 6 different algorithms at the same plot 57 | colors = ['r', 'b', 'g', 'y', 'b', 'm'] 58 | fig = None 59 | ax = None 60 | offset = options.offset_start 61 | for k in range(len(options.folders)): 62 | 
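        # Each experiment folder is expected to contain arguments.pkl (the run's
        # command-line arguments), fmin.txt (the objective's minimum) and one
        # .npz file of evaluations per seed, as written by run.py.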
folder = options.folders[k] 63 | # Load command line arguments 64 | with open(folder + '/arguments.pkl', 'rb') as file: 65 | args = pickle.load(file) 66 | # print(args) 67 | fmin = np.loadtxt(folder + '/fmin.txt') 68 | 69 | fails = 0 70 | outputs = [] 71 | files = glob.glob(folder + '/*.npz') 72 | for file in files: 73 | npzfile = np.load(file) 74 | 75 | if npzfile['Y'].shape != (): 76 | outputs.append(npzfile['Y']) 77 | else: 78 | fails = fails + 1 79 | print(file) 80 | 81 | label = os.path.basename(folder).split('_')[1] 82 | 83 | if fails > 0: 84 | print(label, 'Fails:', fails, 85 | 'Successes: ', len(files) - fails) 86 | 87 | if k == len(options.folders) - 1: 88 | color = 'k' 89 | else: 90 | color = colors[k] 91 | 92 | fig, ax = plot(outputs=outputs, 93 | fmin=fmin, 94 | iterations=args.iterations, 95 | initial_size=args.initial_size + args.init_replicates, 96 | batch_size=args.batch_size, 97 | color=color, fig=fig, 98 | ax=ax, label=label, 99 | offset=offset, options=options) 100 | offset += options.offset_delta 101 | 102 | ax.set_xlabel('Number of Batches') 103 | if options.regret: 104 | ax.set_ylabel('Regret') 105 | else: 106 | ax.set_ylabel('Loss') 107 | ax.set_title(options.name[0]) 108 | figure = ax.get_figure() 109 | figure.set_size_inches((options.sizex, options.sizey)) 110 | ax.spines['right'].set_visible(False) 111 | ax.spines['top'].set_visible(False) 112 | ax.locator_params(nbins=4, axis='y') 113 | 114 | # Save plot 115 | plt.tight_layout() 116 | plt.savefig('results/' + options.name[0] + '.pdf') 117 | 118 | def plot_mins(mins, options, color='b', fig=None, ax=None, label=None, offset=0): 119 | # Auxiliary function used by plot() 120 | if fig is None or ax is None: 121 | fig = plt.figure() 122 | ax = fig.add_subplot(111) 123 | 124 | ax.ticklabel_format(style='sci', scilimits=(0, 1.5)) 125 | if options.max_iters: 126 | iterations = options.max_iters 127 | else: 128 | iterations = mins.shape[1] 129 | for i in range(0, iterations, options.step): 130 | ax.scatter(i + 0*mins[:, i] + offset, mins[:, i], 131 | s=50, marker='.', color=color, edgecolor='none', 132 | alpha=0.3) 133 | ax.scatter(i + offset, np.median(mins[:, i]), 134 | s=20, marker='d', color=color, edgecolor=(0, 0, 0)) 135 | 136 | # ax.legend().get_frame().set_facecolor('none') 137 | # plt.legend(frameon=False) 138 | max_plotted_value = np.max(np.percentile(mins, 100, axis=0)) 139 | if options.regret: 140 | ax.set_ylim(0, 1.05*max_plotted_value) 141 | ax.xaxis.set_major_locator(MaxNLocator(integer=True)) 142 | 143 | return fig, ax 144 | 145 | def plot(outputs, fmin, iterations, initial_size, batch_size, label, options, 146 | color='b', fig=None, ax=None, output_idx=0, offset=0): 147 | ''' 148 | Plot a single algorithm 149 | ''' 150 | n = len(outputs) 151 | mins = np.zeros((n, iterations + 1)) 152 | for i in range(n): 153 | for j in range(iterations + 1): 154 | idx = np.argmin(outputs[i][0:initial_size + j*batch_size, 0]) 155 | mins[i, j] = outputs[i][idx, output_idx] - fmin 156 | 157 | fig, ax = plot_mins(mins, options, color, fig, ax, label, offset) 158 | 159 | return fig, ax 160 | -------------------------------------------------------------------------------- /results/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/oxfordcontrol/Bayesian-Optimization/980a0edb978710fe29529526977d0c14649b3088/results/.gitkeep -------------------------------------------------------------------------------- /run.py: 
-------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import tensorflow as tf 4 | import random 5 | import argparse 6 | import gpflow 7 | from methods.oei import OEI 8 | from methods.random import Random 9 | import time 10 | import pickle 11 | from benchmark_functions import scale_function, hart6 12 | import copy 13 | 14 | algorithms = { 15 | 'OEI': OEI, 16 | 'Random': Random 17 | } 18 | 19 | class SafeMatern32(gpflow.kernels.Matern32): 20 | # See https://github.com/GPflow/GPflow/pull/727 21 | def euclid_dist(self, X, X2): 22 | r2 = self.square_dist(X, X2) 23 | return tf.sqrt(tf.maximum(r2, 1e-40)) 24 | 25 | 26 | def run(options, seed, robust=False, save=False): 27 | ''' 28 | Runs bayesian optimization on the setup defined in the options dictionary 29 | starting from a predefined seed. Saves results on the folder named 'out' while logging 30 | is saved on the folder 'log'. 31 | ''' 32 | options['seed'] = seed 33 | # Set random seed: Numpy, Tensorflow, Python 34 | tf.reset_default_graph() 35 | tf.set_random_seed(seed) 36 | np.random.seed(seed) 37 | random.seed(seed) 38 | 39 | # Create bo object which will be called later to perform Bayesian Optimization. 40 | bo = algorithms[options['algorithm']](options) 41 | 42 | try: 43 | start = time.time() 44 | # Run BO 45 | X, Y = bo.bayesian_optimization() 46 | end = time.time() 47 | print('Done with:', bo.options['job_name'], 'seed:', seed, 48 | 'Time:', '%.2f' % ((end - start)/60), 'min') 49 | except KeyboardInterrupt: 50 | print("Caught KeyboardInterrupt, stopping.") 51 | raise 52 | except: 53 | print('Experiment of', bo.options['job_name'], 54 | 'with seed', seed, 'failed') 55 | X, Y = None, None 56 | if not robust: 57 | raise 58 | 59 | if save: 60 | save_folder = 'out/' + bo.options['job_name'] + '/' 61 | filepath = save_folder + str(seed) + '.npz' 62 | try: 63 | os.makedirs(save_folder) 64 | except OSError: 65 | pass 66 | try: 67 | os.remove(filepath) 68 | except OSError: 69 | pass 70 | 71 | np.savez(filepath, X=X, Y=Y) 72 | 73 | 74 | def create_options(args): 75 | functions = { 76 | 'hart6': hart6() 77 | } 78 | 79 | kernels_gpflow = { 80 | 'RBF': gpflow.kernels.RBF, 81 | 'Matern32': SafeMatern32, 82 | } 83 | 84 | options = vars(copy.copy(args)) 85 | options['objective'] = functions[options['function']] 86 | options['objective'].bounds = np.asarray(options['objective'].bounds) 87 | # This scales the input domain of the function to [-0.5, 0.5]^n. It's different to the 88 | # normalize option, which scales the output of the function. 
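    # (scale_function is defined in benchmark_functions.py; it maps points from
    #  the scaled domain back to the objective's native bounds before every
    #  evaluation.)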
89 | options['objective'] = scale_function(options['objective']) 90 | 91 | input_dim = options['objective'].bounds.shape[0] 92 | if options['algorithm'] != 'LP_EI': 93 | k = kernels_gpflow[options['kernel']]( 94 | input_dim=input_dim, ARD=options['ard']) 95 | if options['priors']: 96 | k.lengthscales.prior = gpflow.priors.Gamma(shape=2, scale=0.5) 97 | k.variance.prior = gpflow.priors.Gaussian(mu=1, var=2) 98 | options['kernel'] = k 99 | 100 | options['job_name'] = options['function'] + '_' + options['algorithm'] 101 | 102 | return options 103 | 104 | 105 | def main(args): 106 | options = create_options(args) 107 | 108 | save_folder = 'out/' + options['job_name'] + '/' 109 | filepath = save_folder + 'arguments.pkl' 110 | try: 111 | os.makedirs(save_folder) 112 | except OSError: 113 | pass 114 | 115 | try: 116 | os.remove(filepath) 117 | except OSError: 118 | pass 119 | try: 120 | with open(filepath, 'wb') as file: 121 | pickle.dump(args, file, pickle.HIGHEST_PROTOCOL) 122 | except OSError: 123 | pass 124 | 125 | filepath = save_folder + 'fmin.txt' 126 | try: 127 | fmin = options['objective'].fmin 128 | except AttributeError: 129 | fmin = 0 130 | np.savetxt(filepath, np.array([fmin])) 131 | 132 | for seed in range(args.seed, args.seed + args.num_seeds): 133 | run(options, seed=seed, save=options['save']) 134 | 135 | 136 | def create_parser(): 137 | parser = argparse.ArgumentParser() 138 | parser.add_argument('--function', default='hart6') 139 | parser.add_argument('--algorithm', default='OEI') 140 | parser.add_argument('--seed', type=int, default=123) 141 | parser.add_argument('--num_seeds', type=int, default=1) 142 | parser.add_argument('--save', type=int, default=1) 143 | 144 | parser.add_argument('--batch_size', type=int, default=5) 145 | parser.add_argument('--iterations', type=int, default=10) 146 | parser.add_argument('--initial_size', type=int, default=10) 147 | parser.add_argument('--model_restarts', type=int, default=20, 148 | help='Random restarts when optimizing the Likelihood of the GP.') 149 | parser.add_argument('--opt_restarts', type=int, default=20, 150 | help='Random restarts when optimizing the acquisition function.') 151 | parser.add_argument('--normalize_Y', type=int, default=1, 152 | help='If set to 1, then the outputs of the function under optimization is normalized to have variance 1 and mean 0') 153 | parser.add_argument('--noise', type=float, 154 | help='Used to set the likelihood to a fixed value') 155 | parser.add_argument('--kernel', default='Matern32') 156 | parser.add_argument('--ard', type=int, default=0) 157 | parser.add_argument('--nl_solver', default='knitro') 158 | parser.add_argument('--hessian', type=int, default=1) 159 | 160 | parser.add_argument('--priors', type=int, default=0) 161 | 162 | return parser 163 | 164 | 165 | if __name__ == '__main__': 166 | parser = create_parser() 167 | args = parser.parse_args() 168 | 169 | main(args) 170 | 171 | -------------------------------------------------------------------------------- /tests/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/oxfordcontrol/Bayesian-Optimization/980a0edb978710fe29529526977d0c14649b3088/tests/__init__.py -------------------------------------------------------------------------------- /tests/create_model.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from methods.oei import OEI 3 | import gpflow 4 | import sys 5 | sys.path.append('..') 6 | from 


def create_model(batch_size=2):
    options = {}
    options['samples'] = 0
    options['priors'] = 0
    options['batch_size'] = batch_size
    options['iterations'] = 5
    options['opt_restarts'] = 2
    options['initial_size'] = 10
    options['model_restarts'] = 10
    options['normalize_Y'] = 1
    options['noise'] = 1e-6
    options['nl_solver'] = 'bfgs'
    options['hessian'] = True

    options['objective'] = hart6()
    options['objective'].bounds = np.asarray(options['objective'].bounds)
    options['objective'] = scale_function(options['objective'])

    input_dim = options['objective'].bounds.shape[0]
    options['kernel'] = gpflow.kernels.Matern32(
        input_dim=input_dim, ARD=False
    )

    options['job_name'] = 'tmp'

    bo = OEI(options)
    # Initialize
    bo.bayesian_optimization()

    return bo
--------------------------------------------------------------------------------
/tests/test_derivatives.py:
--------------------------------------------------------------------------------
import numpy as np
import numdifftools as nd
from .create_model import create_model
from methods.sdp import sdp, solution_derivative

# Batch size
K = 3


def derivatives_numerical(x, model):
    '''
    Returns numerical estimates of the gradient and hessian of the
    optimal value of the SDP with respect to x, computed with
    numdifftools.
    '''
    def opt_val(y):
        return model.acquisition(y)[0]

    def gradient(y):
        # Analytical gradient of the acquisition function
        # (kept for reference; the estimates below are purely numerical).
        return model.acquisition(y)[1]

    gradient_numerical = nd.Gradient(opt_val)(x)
    hessian_numerical = nd.Hessian(opt_val)(x)

    return gradient_numerical, hessian_numerical


def derivatives_analytical(x, model):
    '''
    Returns the gradient and hessian of the optimal value of
    the SDP with respect to x.
    '''
    return model.acquisition(x)[1], model.acquisition_hessian(x)


def sensitivity_numerical(omega, direction, model):
    '''
    Returns the numerical derivatives of
    1) the optimal value of the SDP
    2) the primal and dual solution, i.e.
       d/dx([vec(M); vec(Y_0); ...; vec(Y_k)])
    when the second moment matrix omega is perturbed along the given
    direction.
    '''
    fmin = np.min(model.predict_f(model.X.value)[0])

    def solution(om):
        '''
        Return [vec(M); vec(Y_0); ...; vec(Y_k)] at omega=om
        '''
        M, Y = sdp(om, fmin)[1:3]
        solution = M.flatten()
        for i in range(len(Y)):
            solution = np.concatenate((solution, Y[i].flatten()))
        return solution

    d_opt_val = nd.Derivative(
        lambda x: sdp(omega + x*direction, fmin)[0]
    )(0)

    d_solution = nd.Derivative(
        lambda x: solution(omega + x*direction)
    )(0)

    return d_opt_val, d_solution


def sensitivity_analytical(omega, direction, model):
    fmin = np.min(model.predict_f(model.X.value)[0])
    _, M, Y, C = sdp(omega, fmin)
    d_opt_val = np.trace(M.T.dot(direction))

    dM, dY = solution_derivative(M, Y, C, direction, return_dY=True)

    d_solution = dM.flatten()
    for i in range(len(dY)):
        d_solution = np.concatenate((d_solution, dY[i].flatten()))

    return d_opt_val, d_solution


def test_sensitivity():
    '''
    Checks the sensitivity results of the SDP when perturbing
    omega(X) along a direction D.
    X and D are chosen uniformly at random.
    '''
    np.random.seed(0)

    bo = create_model(batch_size=K)
    X = bo.random_sample(bo.bounds, K)

    omega = bo.omega(X)
    mu = omega[0:K, -1][:, None]

    D_s = np.random.rand(K, K)
    D_s = D_s.dot(D_s.T)
    D_m = np.random.rand(K, 1)
    D = np.zeros((K + 1, K + 1))

    D[0:K, 0:K] = D_s + mu.dot(D_m.T) + D_m.dot(mu.T)
    D[-1, 0:K] = D_m.flatten()
    D[0:K, -1] = D_m.flatten()
    D = (D + D.T)/2
    D = 1e-3*D

    d_opt_val, d_solution = sensitivity_analytical(omega, D, bo)
    d_opt_val_n, d_solution_n = sensitivity_numerical(omega, D, bo)

    # Check the derivative of the optimal value
    np.testing.assert_allclose(d_opt_val, d_opt_val_n, rtol=1e-2)
    # Check the derivative of the primal and dual solution
    np.testing.assert_allclose(d_solution, d_solution_n, rtol=3e-1)


def test_derivatives():
    '''
    Tests the gradient and hessian of the acquisition function.
    '''
    np.random.seed(0)

    bo = create_model(batch_size=K)
    X = bo.random_sample(bo.bounds, K)

    gradient, hessian = derivatives_analytical(X.flatten(), bo)
    gradient_n, hessian_n = derivatives_numerical(X.flatten(), bo)

    # The following checks work better when no warm starting is present;
    # warm starting appears to confuse the numerical differentiation.
    np.testing.assert_allclose(gradient, gradient_n, rtol=5e-1)
    assert np.linalg.norm(gradient - gradient_n) / np.linalg.norm(gradient) < 1e-2
    assert np.linalg.norm(hessian - hessian_n) / np.linalg.norm(hessian) < 2e-2
    # An elementwise comparison of the hessians is too strict:
    # np.testing.assert_allclose(hessian, hessian_n, rtol=1)
--------------------------------------------------------------------------------
/tests/test_sdp.py:
--------------------------------------------------------------------------------
from methods import sdp
import numpy as np
import cvxpy as cvx


def sdp_mosek(omega, fmin):
    '''
    Reference solution of the SDP, computed with MOSEK via CVXPY.
    '''
    k_ = omega.shape[0]

    Y = []
    C = []

    C.append(np.zeros((k_, k_)))
    C[0][-1, -1] = 0
    Y.append(cvx.Semidef(k_))
    cost_sum = Y[0]*C[0]

    for i in range(1, k_):
        Y.append(cvx.Semidef(k_))
        C.append(np.zeros((k_, k_)))
        C[i][-1, i - 1] = 1/2
        C[i][i - 1, -1] = 1/2
        C[i][-1, -1] = -fmin
        cost_sum += Y[i]*C[i]

    constraints = [sum(Y) == omega]

    objective = cvx.Minimize(cvx.trace(cost_sum))

    prob = cvx.Problem(objective, constraints)
    opt_val = prob.solve(solver=cvx.MOSEK, verbose=0)

    # Assert that a valid (finite) solution is returned
    assert (isinstance(opt_val, np.ndarray) or isinstance(opt_val, float))\
        and np.isfinite(opt_val)

    M = -constraints[0].dual_value
    M = np.asarray((M + M.T)/2)  # From matrix to array

    Y_return = []
    for y in Y:
        Y_return.append(np.asarray(y.value))

    return opt_val, M, Y_return, C


def test_sdp():
    np.random.seed(0)
    k = 5  # Batch size

    for i in range(10):
        # Generate problem data
        tmp = np.random.randn(k, k)
        # TODO: Change the way sigma is constructed;
        # creating it as below results in a very bad condition number.
        sigma = tmp.dot(tmp.T) + 0.01*np.eye(tmp.shape[0], tmp.shape[1])
        mu = np.random.randn(k, 1)

        omega = np.zeros((k + 1, k + 1))
        omega[0:k, 0:k] = sigma + mu.dot(mu.T)
        omega[-1, 0:k] = mu.flatten()
        omega[0:k, -1] = mu.flatten()
        omega[-1, -1] = 1

        fmin = np.random.randn(1)

        # Solve the SDP with SCS and compare against MOSEK
        opt_val_scs, M_scs = sdp.sdp(omega, fmin, warm_start=False)[0:2]
        opt_val_mosek, M_mosek = sdp_mosek(omega, fmin)[0:2]

        # The optimal values of the two formulations should agree closely
        np.testing.assert_allclose(opt_val_scs, opt_val_mosek, rtol=1e-4)

        # This check is coarse, hence it should not fail
        assert np.linalg.norm(M_scs - M_mosek)/np.linalg.norm(M_mosek) < 1e-2
        # This might fail occasionally, as SCS is of limited accuracy
        np.testing.assert_allclose(M_scs, M_mosek, rtol=5e-2)
--------------------------------------------------------------------------------
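The tests above exercise the SCS-based solver in `methods/sdp.py` directly. For quick experimentation, the sketch below shows how that solver can be called on its own, mirroring the construction of the second-moment matrix in `test_sdp`. The numerical values and the placeholder incumbent `fmin` are illustrative only, and the snippet assumes it is run from the repository root so that `methods` is importable.

```python
# Minimal usage sketch (illustrative values): evaluate the SDP behind OEI
# for a random Gaussian posterior over a batch of k points.
import numpy as np
from methods import sdp

np.random.seed(0)
k = 3                                    # batch size
tmp = np.random.randn(k, k)
sigma = tmp.dot(tmp.T) + 0.01*np.eye(k)  # posterior covariance of the batch
mu = np.random.randn(k, 1)               # posterior mean of the batch
fmin = 0.0                               # placeholder incumbent (best value so far)

# Second-moment matrix omega = [[sigma + mu mu', mu], [mu', 1]]
omega = np.zeros((k + 1, k + 1))
omega[0:k, 0:k] = sigma + mu.dot(mu.T)
omega[-1, 0:k] = mu.flatten()
omega[0:k, -1] = mu.flatten()
omega[-1, -1] = 1

# sdp.sdp returns (opt_val, M, Y, C), as unpacked in the tests above.
opt_val, M, Y, C = sdp.sdp(omega, fmin, warm_start=False)
print('SDP optimal value (OEI):', opt_val)
```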