├── .gitignore
├── README.md
├── benchmark_functions.py
├── log
│   └── .gitkeep
├── logging.yaml
├── methods
│   ├── __init__.py
│   ├── bo.py
│   ├── oei.py
│   ├── random.py
│   ├── sdp.py
│   └── solvers.py
├── out
│   └── .gitkeep
├── plot.py
├── results
│   └── .gitkeep
├── run.py
└── tests
    ├── __init__.py
    ├── create_model.py
    ├── test_derivatives.py
    └── test_sdp.py

/.gitignore:
--------------------------------------------------------------------------------
1 | # Saved numpy and binary data
2 | *.npz
3 | *.npy
4 | *.dat
5 | # Saved figures
6 | *.pdf
7 | # Pycache
8 | __pycache__
9 | # Output logs
10 | out.txt
11 | *.swp
12 | *.png
13 | .Rhistory
14 | log/**/
15 | out/**/
16 | *.zip
17 | *.tgz
18 | **.pkl
19 | get_results.sh
20 | **/.cache
21 | **/output*
22 | log**/
23 | evals**/
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Bayesian-Optimization
2 | This is the implementation of a new acquisition function for **Batch Bayesian Optimization**, named **Optimistic Expected Improvement (OEI)**. For details, results and theoretical analysis, refer to the paper titled [*Distributionally Ambiguous Optimization Techniques for Batch Bayesian Optimization*](https://arxiv.org/abs/1707.04191) by Nikitas Rontsis, Michael A. Osborne, Paul J. Goulart.
3 | 
4 | This branch is a **cleaned** and **updated** implementation of `OEI`. The branch [GPflow-based](https://github.com/oxfordcontrol/Bayesian-Optimization/tree/GPflow-based) includes code for testing against the following batch acquisition functions:
5 | * Multipoint expected improvement (`qei.py`)
6 | * Multipoint expected improvement, Constant Liar Heuristic (`qei_cl.py`)
7 | * Batch Lower Confidence Bound (`blcb.py`)
8 | * Local Penalization of Expected Improvement (`lp_ei.py`)
9 | 
10 | Refer to the above-mentioned paper for references and more detailed descriptions.
11 | 
12 | ***Please use this branch for `OEI`, as the one in the [GPflow-based](https://github.com/oxfordcontrol/Bayesian-Optimization/tree/GPflow-based) branch is outdated and only intended for the other acquisition functions (QEI, QEI-Cl, BLCB and LP_EI).***
13 | 
14 | ## Installation
15 | This package was written in `Python 3.6`. It is recommended to use the `Anaconda >=5.0.1` distribution in an empty environment. The package is built on top of `gpflow 0.5`, which has to be installed from [source](https://github.com/GPflow/GPflow/releases/tag/0.5.0).
16 | Then, proceed to install the following:
17 | ```
18 | conda install numpy scipy matplotlib pyyaml
19 | ```
20 | You should also install [`SCS`](https://github.com/cvxgrp/scs), a convex solver. Make sure to compile the package (the `--no-binary :all:` flag below), as this can bring a significant speedup.
21 | ```
22 | pip install 'scs>=2.0.2' --no-binary :all:
23 | ```
24 | Moreover, you should install a Python interface to the `Intel MKL Pardiso` linear solver (freely distributed by the `Anaconda` distribution):
25 | ```
26 | conda install -c haasad pypardiso
27 | ```
28 | Finally, you have to download the second-order non-linear solver [`KNITRO`](https://www.artelys.com/en/optimization-tools/knitro), install it and copy its contents into a folder named `knitro` inside this package. `KNITRO` is proprietary, but time-limited academic and commercial licenses are provided for free by `Artelys`.
29 | 
30 | ## Testing
31 | Unit tests are included in the `tests` folder. You can invoke them using `pytest`.
To do so, the following additional packages are required:
32 | ```
33 | conda install pytest
34 | pip install 'cvxpy<1' numdifftools
35 | conda install -c mosek mosek
36 | ```
37 | 
38 | ## Usage example
39 | Running Batch Bayesian Optimization (BO) for the Hartmann-6d function with the `OEI` acquisition function can be invoked as follows:
40 | ```
41 | python run.py --seed=123 --algorithm=OEI --function=hart6 --batch_size=20 --initial_size=10 --iterations=15 --noise=1e-6
42 | ```
43 | The above will save its output in the folder named `out`. Detailed logging is saved in the `log` folder, the verbosity of which can be controlled by the `logging.yaml` configuration file.
44 | 
45 | To compare against random search and plot the results run:
46 | ```
47 | python run.py --seed=123 --algorithm=Random --function=hart6 --batch_size=20 --initial_size=10 --iterations=15 --noise=1e-6 --nl_solver=knitro --hessian=1
48 | 
49 | python plot.py hart6 out/hart6_OEI out/hart6_Random
50 | ```
51 | This will save a `pdf` plot in the `results` folder.
52 | 
53 | The user can define a different function for optimization by modifying `benchmark_functions.py`, or run **any of the ones presented in the [paper](https://arxiv.org/abs/1707.04191)**, the definitions of which can be found [here](https://github.com/oxfordcontrol/Bayesian-Optimization/blob/GPflow-based/test_functions/benchmark_functions.py).
54 | 
55 | ## Basic files
56 | 
57 | | File | Description |
58 | |----------------|-----------------------------------------------------------------|
59 | | `run.py` | Script that runs BO on test functions defined in `benchmark_functions.py`. |
60 | | `plot.py` | Plots results of `run.py`. |
61 | | `methods/oei.py` | Definition of `OEI`, its gradient and hessian. |
62 | | `methods/bo.py` | A parent class that implements a simple parametrizable BO setup. |
63 | 
64 | ## Timing results
65 | The main operations for calculating the value of `OEI`, its gradient and its hessian are listed below:
66 | * Value and gradient
67 |   * Solving the Semidefinite Optimization Problem [(relevant line)](https://github.com/oxfordcontrol/Bayesian-Optimization/blob/master/methods/sdp.py#L51-L52).
68 | * Hessian (given the solution of the SDP):
69 |   * Calculating the derivatives of M [(dominant operation)](https://github.com/oxfordcontrol/Bayesian-Optimization/blob/master/methods/sdp.py#L184) called from [here](https://github.com/oxfordcontrol/Bayesian-Optimization/blob/master/methods/oei.py#L61).
70 |   * Chain rules of tensorflow [(relevant line #1](https://github.com/oxfordcontrol/Bayesian-Optimization/blob/master/methods/oei.py#L60), [relevant line #2)](https://github.com/oxfordcontrol/Bayesian-Optimization/blob/master/methods/oei.py#L64).
71 | 
72 | In the current implementation a considerable amount of time is spent in Python to create, reshape and move matrices, all of which could be avoided in a more efficient implementation.
73 | 
--------------------------------------------------------------------------------
/benchmark_functions.py:
--------------------------------------------------------------------------------
1 | import random
2 | import numpy as np
3 | 
4 | 
5 | class scale_function():
6 |     '''
7 |     This wrapper takes another objective and scales its input domain to [-0.5, 0.5]^n.
8 |     It is useful when placing priors on the lengthscales.
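    A minimal usage sketch (hypothetical values, assuming the hart6 objective
    defined below):

        f = scale_function(hart6())
        f.bounds               # now [-0.5, 0.5] in every dimension
        X = np.zeros((1, 6))   # centre of the scaled domain
        y = f.f(X)             # evaluates hart6 at the centre of [0, 1]^6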
9 | ''' 10 | def __init__(self, function): 11 | self.bounds = function.bounds.astype(float) 12 | self.function = function 13 | self.bounds[:, 0] = -1/2 14 | self.bounds[:, 1] = 1/2 15 | if hasattr(function, 'fmin'): 16 | self.fmin = function.fmin 17 | 18 | def restore(self, X): 19 | means = (self.function.bounds[:, 1] + self.function.bounds[:, 0])/2 20 | lengths = self.function.bounds[:, 1] - self.function.bounds[:, 0] 21 | 22 | Xnorm = np.zeros(X.shape) 23 | for i in range(X.shape[0]): 24 | Xnorm[i, :] = X[i, :] * lengths + means 25 | 26 | return Xnorm 27 | 28 | def scale(self, X): 29 | means = (self.function.bounds[:, 1] + self.function.bounds[:, 0])/2 30 | lengths = self.function.bounds[:, 1] - self.function.bounds[:, 0] 31 | 32 | Xorig = np.zeros(X.shape) 33 | for i in range(X.shape[0]): 34 | Xorig[i, :] = (X[i, :] - means) / lengths 35 | 36 | return Xorig 37 | 38 | def f(self, X): 39 | Xorig = self.restore(X) 40 | X_ret = () 41 | y_ret = () 42 | # Split the evaluations into a number of batches 43 | N = 1 44 | for i in range(0, len(Xorig), N): 45 | ret = self.function.f(Xorig[i:i+N]) 46 | if isinstance(ret, tuple): 47 | y_ret = y_ret + (ret[0],) 48 | X_ret = X_ret + (ret[1],) 49 | else: 50 | y_ret = y_ret + (ret,) 51 | 52 | if isinstance(ret, tuple): 53 | y_ret = np.concatenate(y_ret) 54 | X_ret = np.concatenate(X_ret) 55 | # np.testing.assert_array_almost_equal(self.scale(X_ret), X) 56 | return y_ret, self.scale(X_ret) 57 | else: 58 | y_ret = np.concatenate(y_ret) 59 | return y_ret 60 | 61 | 62 | class hart6: 63 | ''' 64 | Hartmann 6-Dimensional function 65 | Based on the following MATLAB code: 66 | https://www.sfu.ca/~ssurjano/hart6.html 67 | ''' 68 | def __init__(self, sd=0): 69 | self.sd = sd 70 | self.bounds = np.array([[0, 1], [0, 1], [0, 1], 71 | [0, 1], [0, 1], [0, 1]]) 72 | self.min = np.array([0.20169, 0.150011, 0.476874, 73 | 0.275332, 0.311652, 0.6573]) 74 | self.fmin = -3.32237 75 | 76 | def f(self, xx): 77 | if len(xx.shape) == 1: 78 | xx = xx.reshape((1, 6)) 79 | 80 | assert xx.shape[1] == 6 81 | 82 | n = xx.shape[0] 83 | y = np.zeros(n) 84 | for i in range(n): 85 | alpha = np.array([1.0, 1.2, 3.0, 3.2]) 86 | A = np.array([[10, 3, 17, 3.5, 1.7, 8], 87 | [0.05, 10, 17, 0.1, 8, 14], 88 | [3, 3.5, 1.7, 10, 17, 8], 89 | [17, 8, 0.05, 10, 0.1, 14]]) 90 | P = 1e-4 * np.array([[1312, 1696, 5569, 124, 8283, 5886], 91 | [2329, 4135, 8307, 3736, 1004, 9991], 92 | [2348, 1451, 3522, 2883, 3047, 6650], 93 | [4047, 8828, 8732, 5743, 1091, 381]]) 94 | 95 | outer = 0 96 | for ii in range(4): 97 | inner = 0 98 | for jj in range(6): 99 | xj = xx[i, jj] 100 | Aij = A[ii, jj] 101 | Pij = P[ii, jj] 102 | inner = inner + Aij*(xj-Pij)**2 103 | 104 | new = alpha[ii] * np.exp(-inner) 105 | outer = outer + new 106 | 107 | y[i] = -outer 108 | 109 | if self.sd == 0: 110 | noise = np.zeros(n) 111 | else: 112 | noise = np.random.normal(0, self.sd, n) 113 | 114 | return (y + noise).reshape((n, 1)) 115 | 116 | -------------------------------------------------------------------------------- /log/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/oxfordcontrol/Bayesian-Optimization/980a0edb978710fe29529526977d0c14649b3088/log/.gitkeep -------------------------------------------------------------------------------- /logging.yaml: -------------------------------------------------------------------------------- 1 | version: 1 2 | formatters: 3 | verbose: 4 | format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s' 5 | simple: 6 
| format: '%(name)s - %(levelname)s - %(message)s' 7 | bare: 8 | format: '%(message)s' 9 | handlers: 10 | console: 11 | class: logging.StreamHandler 12 | level: CRITICAL 13 | formatter: simple 14 | stream: ext://sys.stdout 15 | evals_file: 16 | class: logging.FileHandler 17 | level: DEBUG 18 | formatter: bare 19 | filename: PATH/evals.log 20 | opt_file: 21 | class: logging.FileHandler 22 | level: DEBUG 23 | formatter: bare 24 | filename: PATH/opt.log 25 | model_file: 26 | class: logging.FileHandler 27 | level: DEBUG 28 | formatter: bare 29 | filename: PATH/model.log 30 | loggers: 31 | evals: 32 | level: INFO 33 | handlers: [evals_file] 34 | propagate: False 35 | opt: 36 | level: INFO 37 | handlers: [opt_file] 38 | propagate: False 39 | model: 40 | level: INFO 41 | handlers: [model_file] 42 | propagate: False 43 | root: 44 | level: DEBUG 45 | handlers: [console, evals_file, model_file, opt_file] -------------------------------------------------------------------------------- /methods/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/oxfordcontrol/Bayesian-Optimization/980a0edb978710fe29529526977d0c14649b3088/methods/__init__.py -------------------------------------------------------------------------------- /methods/bo.py: -------------------------------------------------------------------------------- 1 | from gpflow.gpr import GPR 2 | import numpy as np 3 | import tensorflow as tf 4 | import random 5 | import copy 6 | from .solvers import solve 7 | from .sdp import reset_warm_starting 8 | import logging.config 9 | import shutil 10 | import os 11 | import re 12 | import yaml 13 | 14 | 15 | class BO(GPR): 16 | ''' 17 | This is a simple (abstract) implementation of Bayesian Optimization. 18 | It extends gpflow's GPR. 19 | ''' 20 | def __init__(self, options): 21 | self.bounds = options['objective'].bounds.copy() 22 | self.dim = self.bounds.shape[0] 23 | 24 | if 'mean_function' not in options: 25 | options['mean_function'] = None 26 | super(BO, self).__init__(X=np.zeros((0, self.dim)), 27 | Y=np.zeros((0, self.dim)), 28 | kern=options['kernel'], 29 | mean_function=options['mean_function'] 30 | ) 31 | self.options = options.copy() 32 | 33 | # Fix the noise, if its value is provided 34 | if self.options['noise'] is not None: 35 | self.likelihood.variance = self.options['noise'] 36 | self.likelihood.variance.fixed = True 37 | 38 | def bayesian_optimization(self): 39 | ''' 40 | This function implements the main loop of the Bayesian Optimization 41 | ''' 42 | # Copy the objective. This is essential when testing draws from GPs 43 | objective = copy.copy(self.options['objective']) 44 | # Initialize the dataset at some random locations 45 | X0 = self.random_sample(self.bounds, self.options['initial_size']) 46 | # Evaluate the objective funtion there 47 | ret = objective.f(X0) 48 | # The objective might alter (e.g. discretize) X0. 49 | # In this case it returns two matrices: 50 | # the objective values (y0) and the altered locations (X0) 51 | if isinstance(ret, tuple): 52 | y0, X0 = ret 53 | else: 54 | y0 = ret 55 | 56 | # Set the data to the GP model 57 | # Careful, we might normalize the function evaluations 58 | # Provide only the first column of y0. 59 | # The others columns contain auxiliary data. 
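        # (When the 'normalize_Y' option is set, self.normalize() rescales this
        #  column to zero mean and unit standard deviation; see normalize() below.)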
60 | self.X = X0 61 | self.Y = self.normalize(y0[:, 0:1]) 62 | 63 | # X_all stores all the points where f was evaluated and 64 | # y_all the respective function values 65 | X_all = X0 66 | y_all = y0 67 | 68 | #------------------------------------------------------------------ 69 | # Logging - Information about the function and initial evaluations. 70 | self.setup_logging() 71 | logger = logging.getLogger('evals') 72 | logger.info('----------------------------') 73 | logger.info('Bounds:\n' + str(self.bounds)) 74 | if hasattr(objective, 'fmin'): 75 | logger.info('Minimum value:' + str(objective.fmin)) 76 | logger.info('----------------------------') 77 | for i in range(len(X0)): 78 | logger.info( 79 | 'X:' + str(X0[i, :]) + ' y: ' + str(y0[i, :]) 80 | ) 81 | #------------------------------------------------------------------ 82 | 83 | for i in range(self.options['iterations']): 84 | # ML (or MAP if there are priors) for the GP model 85 | self.optimize_restarts(restarts=self.options['model_restarts']) 86 | 87 | #------------------------------------------------------------------ 88 | # Logging - Print hyperparameters of the model 89 | logging.getLogger('').info('#Iteration:' + str(i + 1)) 90 | # Remove non-printable characters (bold identifiers) 91 | ansi_escape = re.compile(r'\x1b[^m]*m') 92 | logging.getLogger('model').info(ansi_escape.sub('', str(self))) 93 | #------------------------------------------------------------------ 94 | 95 | # Get the suggested points for optimising the acquisition function 96 | X_new = self.get_suggestion(self.options['batch_size']) 97 | # Evaluate the black-box function at the suggested points 98 | ret = objective.f(X_new) 99 | # The objective might alter (e.g. discretize) X_new. 100 | # In this case it returns two matrices: 101 | # the objective values (y_new) and the altered locations (X_new) 102 | if isinstance(ret, tuple): 103 | y_new, X_new = ret 104 | else: 105 | y_new = ret 106 | 107 | # Append the algorithm's choice X_new to X_all 108 | # Add the function evaluations f(X_new) to y_all 109 | X_all = np.concatenate((X_all, X_new)) 110 | y_all = np.concatenate((y_all, y_new)) 111 | 112 | # Update the GP model 113 | # Careful, the model might normalize the function evaluations 114 | # Provide only the first column of y_all. 115 | # The others columns contain auxiliary data. 
116 | self.X = X_all 117 | self.Y = self.normalize(y_all[:, 0:1]) 118 | 119 | #------------------------------------------------------------------ 120 | # Logging - Print new evaluations 121 | for j in range(len(X_new)): 122 | logging.getLogger('evals').info( 123 | 'X:' + str(X_new[j, :]) + ' y: ' + str(y_new[j, :]) 124 | ) 125 | #------------------------------------------------------------------ 126 | 127 | return X_all, y_all 128 | 129 | def get_suggestion(self, batch_size): 130 | ''' 131 | This function runs a descent-based optimization on the acquisition 132 | function to get a batch of bath_size evaluation points 133 | Inputs: batch_size: Number of evalution points 134 | Output: X: [batch_size x self.dim] ndarray with the evaluation points 135 | ''' 136 | X = None # Will hold the final decision 137 | y = None # Will hold the acquisition function value at X 138 | # Tile bounds to match batch size 139 | bounds_tiled = np.tile(self.bounds, (batch_size, 1)) 140 | # Run local gradient-descent optimizer multiple times 141 | # to avoid getting stuck in a poor local optimum 142 | for j in range(self.options['opt_restarts']): 143 | try: 144 | # Resets the warm starting of the SDP (sdp.py) as we start from 145 | # a random point that is not necessarily close to the previous ones. 146 | reset_warm_starting() 147 | 148 | # Initial point of the optimization 149 | X_init = self.random_sample(self.bounds, batch_size) 150 | y_init = self.acquisition(X_init)[0] 151 | 152 | X0, y0, status = solve(X_init=X_init, 153 | bounds=bounds_tiled, 154 | hessian=self.options['hessian'], 155 | bo=self, 156 | solver=self.options['nl_solver']) 157 | 158 | # Update X if the current local minimum is 159 | # the best one found so far 160 | if X is None or y0 < y: 161 | X, y = X0, y0 162 | 163 | 164 | #-------------------------------------------------------- 165 | # Logging - Print statistics for the non-linear solver 166 | logging.getLogger('opt').info( 167 | '##Opt_it:' + str(j + 1) + ' Val:' + '%.2e' % y0 + 168 | ' Diff:' + '%.2e' % (y_init - y0) + 169 | ' It:' + str(status.nit) 170 | ) 171 | if not status.success: 172 | logging.getLogger('opt').warning( 173 | 'NLP Warning:' + str(status.message) 174 | ) 175 | #-------------------------------------------------------- 176 | except AssertionError:# KeyboardInterrupt: 177 | raise 178 | ''' 179 | except Exception as e: 180 | #-------------------------------------------------------- 181 | # Logging - Report failure of the nonlinear solver 182 | logging.getLogger('info').critical( 183 | 'Optimization #' + str(j + 1) + 184 | 'of the acquisition function failed!\n Error:' + str(e) 185 | ) 186 | #-------------------------------------------------------- 187 | ''' 188 | 189 | # Assert that at least one optimization run succesfully 190 | assert X is not None 191 | 192 | return X 193 | 194 | def optimize_restarts(self, restarts=1, **kwargs): 195 | ''' 196 | Wrapper of gpflow's self._objective to allow for multiple 197 | random restarts on the maximization of the likelihood 198 | of the GP's hyperparameters. 
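        Each restart randomizes the free hyperparameters, calls gpflow's optimize()
        and evaluates the resulting objective (the negative log-likelihood, plus any
        prior terms); the best state found across all restarts is restored at the end.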
199 |         '''
200 |         if self._needs_recompile:
201 |             self.compile()
202 |         obj = self._objective
203 | 
204 |         par_min = self.get_free_state().copy()
205 |         val_min = obj(par_min)[0]
206 |         for i in range(restarts):
207 |             try:
208 |                 self.randomize()
209 |                 self.optimize(**kwargs)
210 |                 x = self.get_free_state().copy()
211 |                 val = obj(x)[0]
212 |             except KeyboardInterrupt:
213 |                 raise
214 |             except:
215 |                 val = float("inf")
216 | 
217 |             if val < val_min:
218 |                 par_min = x
219 |                 val_min = val
220 | 
221 |         self.set_state(par_min)
222 | 
223 |     @staticmethod
224 |     def random_sample(bounds, k):
225 |         '''
226 |         Generate a set of k n-dimensional points sampled uniformly at random
227 |         Inputs:
228 |             bounds: n x 2 dimensional array containing upper/lower bounds
229 |                     for each dimension
230 |             k: number of points
231 |         Output: k x n array containing the sampled points
232 |         '''
233 |         # k: Number of points
234 |         n = bounds.shape[0]  # Dimensionality of each point
235 |         X = np.zeros((k, n))
236 |         for i in range(n):
237 |             X[:, i] = np.random.uniform(bounds[i, 0], bounds[i, 1], k)
238 | 
239 |         return X
240 | 
241 |     def normalize(self, Y):
242 |         '''
243 |         When normalization is enabled, this function normalizes the first
244 |         column of Y to have zero mean and standard deviation one.
245 | 
246 |         Recall that the first column contains the output of the function under
247 |         minimization. The rest of the columns (if any) contain auxiliary data
248 |         that are only used for inspection purposes.
249 |         '''
250 | 
251 |         Y_ = Y.copy()
252 |         if self.options['normalize_Y'] and np.std(Y[:, 0]) > 0:
253 |             Y_[:, 0] = (Y[:, 0] - np.mean(Y[:, 0]))/np.std(Y[:, 0])
254 | 
255 |         return Y_
256 | 
257 |     def setup_logging(self):
258 |         '''
259 |         This function initialises the logging mechanism
260 |         '''
261 |         if 'seed' in self.options:
262 |             self.log_folder = 'log/' + self.options['job_name'] + '/' + str(self.options['seed']) + '/'
263 |         else:
264 |             self.log_folder = 'log/' + self.options['job_name'] + '/'
265 | 
266 |         try:
267 |             os.makedirs(self.log_folder)
268 |         except OSError:
269 |             pass
270 |         # Load config file
271 |         with open('logging.yaml', 'r') as f:
272 |             config = f.read()
273 |         # Prepend logging folder
274 |         config = config.replace('PATH/', self.log_folder)
275 |         # Setup logging
276 |         logging.config.dictConfig(yaml.load(config))
277 |         #------------------------------------------------------------------
278 | 
--------------------------------------------------------------------------------
/methods/oei.py:
--------------------------------------------------------------------------------
1 | from .bo import BO
2 | import numpy as np
3 | import scipy.linalg as la
4 | import logging
5 | from .sdp import sdp, solution_derivative
6 | import tensorflow as tf
7 | from gpflow.param import AutoFlow
8 | from gpflow._settings import settings
9 | float_type = settings.dtypes.float_type
10 | 
11 | 
12 | class OEI(BO):
13 |     '''
14 |     This class implements the Optimistic Expected Improvement acquisition function.
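    The value of the acquisition at a batch X is the optimal value of a small
    semidefinite program built from the posterior second-order moment matrix
    Omega(X) (see methods/sdp.py). The gradient is recovered from the SDP
    solution M in tensorflow (acquisition_tf), and the hessian additionally uses
    the sensitivities of M computed by solution_derivative.

    A rough sketch of the entry points used by the nonlinear solvers
    (hypothetical shapes, for a batch of q points in d dimensions):

        x = X.flatten()                     # X is a (q, d) ndarray of candidate points
        value, grad = self.acquisition(x)   # SDP value and gradient; grad has shape (q*d,)
        hess = self.acquisition_hessian(x)  # (q*d, q*d) ndarray, used when hessian=1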
15 | ''' 16 | def __init__(self, options): 17 | super(OEI, self).__init__(options) 18 | 19 | def acquisition(self, x): 20 | ''' 21 | The acquisition function and its gradient 22 | Input: x: flattened ndarray [batch_size x self.dim] containing the evaluation points 23 | Output[0]: Value of the acquisition function 24 | Output[1]: Gradient of the acquisition function 25 | ''' 26 | # Calculate the minimum value so far 27 | fmin = np.min(self.predict_f(self.X.value)[0]) 28 | X = x.reshape((-1, self.dim)) 29 | 30 | X_, V = self.project(X) # See comments in self.project 31 | if len(X_) > 0: 32 | # Solve the SDP 33 | value, M_, _, _ = sdp(self.omega(X_), fmin, warm_start=(len(X)==len(X_))) 34 | _, gradient_ = self.acquisition_tf(X_, M_) 35 | gradient = V.T.dot(gradient_) 36 | else: 37 | value = 0; gradient = np.random.rand(len(x)) 38 | 39 | return np.array([value]), gradient.flatten() 40 | 41 | def acquisition_hessian(self, x): 42 | ''' 43 | The acquisition function and its gradient 44 | Input: x: flattened ndarray [batch_size x self.dim] containing the evaluation points 45 | Output[0]: Value of the acquisition function 46 | Output[1]: Gradient of the acquisition function 47 | ''' 48 | # Calculate the minimum value so far 49 | fmin = np.min(self.predict_f(self.X.value)[0]) 50 | X = x.reshape((-1, self.dim)) 51 | 52 | # See comments on self.project 53 | X_, _ = self.project(X) 54 | if len(X) != len(X_): 55 | return np.zeros((x.shape[0], x.shape[0])) 56 | 57 | # Solve SDP 58 | _, M, Y, C = sdp(self.omega(X), fmin) 59 | # Calculate domega/dx and dM/dx 60 | domega = self.domega(X.flatten()) 61 | dM = solution_derivative(M, Y, C, domega) 62 | 63 | # Perform the chain rules in Tensorflow 64 | return self.acquisition_hessian_tf(X.flatten(), M, dM, domega) 65 | 66 | @AutoFlow((float_type, [None, None]), (float_type, [None, None])) 67 | def acquisition_tf(self, X, M): 68 | ''' 69 | Calculates the acquisition function, given M the optimizer of the SDP. 70 | The calculation is simply a matrix inner product. 71 | Input: X: ndarray [batch_size x self.dim] containing the evaluation points 72 | Output[0]: Value of the acquisition function 73 | Output[1]: Gradient of the acquisition function 74 | ''' 75 | f = tf.tensordot(self.omega_tf(X), M, axes=2) 76 | df = tf.gradients(f, X)[0] 77 | return f, df 78 | 79 | def omega_tf(self, X): 80 | ''' 81 | Calculates the second order moment matrix in tensorflow. 82 | Input: X: ndarray [batch_size x self.dim] containing the evaluation points 83 | Output: [Sigma(X) + mu(X)*mu(X).T mu(X); mu(X).T 1] 84 | where mu(X) and Sigma(X) are the mean and variance of the GP posterior at X. 
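        Note that the likelihood (noise) variance is added to the diagonal of
        Sigma(X), and that for a batch of q points the returned matrix has
        size (q+1) x (q+1).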
85 | ''' 86 | mean, var = self.build_predict(X, full_cov=True) 87 | var = var[:, :, 0] + tf.eye(tf.shape(var)[0], dtype=float_type)*self.likelihood.variance 88 | 89 | # Create omega 90 | omega = var + tf.matmul(mean, mean, transpose_b=True) 91 | omega = tf.concat([omega, mean], axis=1) 92 | omega = tf.concat([omega, 93 | tf.concat([tf.transpose(mean), [[1]]], axis=1)], 94 | axis=0) 95 | 96 | return omega 97 | 98 | @AutoFlow((float_type, [None, None])) 99 | def omega(self, X): 100 | ''' 101 | Just an autoflow wrapper for omega_tf 102 | ''' 103 | return self.omega_tf(X) 104 | 105 | @AutoFlow((float_type, [None]), (float_type, [None, None]), 106 | (float_type, [None, None, None]), (float_type, [None, None, None])) 107 | def acquisition_hessian_tf(self, x, M, dM, domega): 108 | ''' 109 | Input: x: flattened ndarray [batch_size x self.dim] containing the evaluation points 110 | M: [(batch_size + 1) x (batch_size + 1)] ndarray 111 | containing the solution of the SDP 112 | dM: [(batch_size + 1) x (batch_size + 1) x (batch_size * self.dim)] ndarray 113 | containing the ''jacobian'' of M w.r.t x. 114 | domega: [(batch_size + 1) x (batch_size + 1) x (batch_size * self.dim)] ndarray 115 | containing the ''jacobian'' of the second order moment Omega with respect to x. 116 | Output: hessian of the acquisition function: [(batch_size * self.dim) x (batch_size * self.dim)] ndarray 117 | ''' 118 | X = tf.reshape(x, [self.options['batch_size'], -1]) 119 | f = tf.tensordot(self.omega_tf(X), M, axes=2) 120 | d2f_a = tf.hessians(f, x)[0] 121 | d2f_b = tf.tensordot(tf.transpose(dM), domega, axes=2) 122 | 123 | return d2f_a + d2f_b 124 | 125 | @AutoFlow((float_type, [None])) 126 | def domega(self, x): 127 | ''' 128 | Input: x: flattened ndarray [batch_size x self.dim] containing the evaluation points 129 | Output: domega: [(batch_size + 1) x (batch_size + 1) x (batch_size * self.dim)] ndarray 130 | containing the ''jacobian'' of the second order moment Omega with respect to x. 131 | ''' 132 | X = tf.reshape(x, [self.options['batch_size'], -1]) 133 | omega = self.omega_tf(X) 134 | domega = self.jacobian(tf.reshape(omega, [-1]), x) 135 | return tf.reshape(domega,[omega.shape[0], omega.shape[1], -1]) 136 | 137 | def jacobian(self, y_flat, x): 138 | ''' 139 | Calculates the jacobian of y_flat with respect to x 140 | Code taken from jeisses' comment on 141 | https://github.com/tensorflow/tensorflow/issues/675 142 | ''' 143 | n = y_flat.shape[0] 144 | 145 | loop_vars = [ 146 | tf.constant(0, tf.int32), 147 | tf.TensorArray(float_type, size=n), 148 | ] 149 | 150 | _, jacobian = tf.while_loop( 151 | lambda j, _: j < n, 152 | lambda j, result: (j+1, result.write(j, tf.gradients(y_flat[j], x))), 153 | loop_vars) 154 | 155 | return jacobian.stack() 156 | 157 | def project(self, X): 158 | ''' 159 | According to Proposition 8, OEI and QEI are not differentiable when the kernel is noiseless and 160 | there are duplicates in X. In these cases calculating M, SDP's optimizer, becomes increasing hard, as 161 | the SDP problem becomes ill conditioned. 162 | 163 | In this cases, we get around this problem by solving a smaller (projected) SDP problem, 164 | and providing a descent direction that causes the duplicates to separate and, if requested, a hessian equal to zero. 
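        Duplicates are detected in lengthscale-normalized coordinates with an absolute
        tolerance of roughly 1e-2, and this projection is only used when the likelihood
        variance is at most 1e-4, i.e. when the model is effectively noiseless.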
165 |         Input: X: ndarray [batch_size x self.dim] containing the evaluation points
166 |         Output:
167 |             If self.likelihood.variance > 1e-4:
168 |                 Returns X (same as input) and V, an identity matrix
169 |             If the kernel is noiseless:
170 |                 X_u: ndarray [q x self.dim] containing the unique evaluation points (it also removes duplicates of the dataset)
171 |                 V: ndarray [q x batch_size] projection matrix that, given the solution M of the smaller sdp problem,
172 |                    can be used to calculate an appropriate descent direction
173 |         '''
174 |         if self.likelihood.variance.value > 1e-4:
175 |             return X, np.eye(X.shape[0])
176 | 
177 |         l = self.kern.lengthscales.value
178 | 
179 |         # Finding duplicates of dataset
180 |         # See https://stackoverflow.com/questions/11903083/find-the-set-difference-between-two-large-arrays-matrices-in-python
181 |         idx = np.where(np.all(np.abs((X/l)[:, None, :] - self.X.value/l) < 1e-2, axis=2))[0]
182 |         idx_ = np.setdiff1d(np.arange(X.shape[0]), idx)
183 |         # Remove duplicates of the dataset
184 |         X_u = X[idx_]
185 |         V = np.eye(X.shape[0])[idx_]
186 |         random_direction = np.random.rand(V.shape[0], len(idx))
187 |         V[:, idx] = random_direction/np.linalg.norm(random_direction, axis=0, keepdims=True)
188 | 
189 |         if X_u.shape[0] > 0:
190 |             _, idx = np.unique(np.round(X_u / l, decimals=2), return_index=True, axis=0)
191 |             X_u = X_u[idx]
192 |             V = V[idx]
193 | 
194 |         # If no points were removed, then just return the original X, to preserve the order of its rows
195 |         if len(X_u) == len(X):
196 |             X_u = X
197 |             V = np.eye(X.shape[0])
198 | 
199 |         return X_u, V
200 | 
--------------------------------------------------------------------------------
/methods/random.py:
--------------------------------------------------------------------------------
1 | from __future__ import print_function
2 | from .bo import BO
3 | 
4 | 
5 | class Random(BO):
6 |     '''
7 |     Random strategy
8 |     '''
9 |     def __init__(self, options):
10 |         super(Random, self).__init__(options)
11 | 
12 |     def get_suggestion(self, batch_size):
13 |         return self.random_sample(self.bounds, batch_size)
14 | 
--------------------------------------------------------------------------------
/methods/sdp.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import scs
3 | import collections
4 | import logging
5 | import scipy.sparse as sp
6 | from pypardiso import spsolve, factorized
7 | import pypardiso
8 | 
9 | OUTPUT_LEVEL = 0
10 | 
11 | def sdp(omega, fmin, warm_start=True):
12 |     '''
13 |     Solves the following SDP with the first-order primal-dual solver SCS:
14 |     ---------Primal Form----------
15 |     maximize   <omega, M>
16 |     s.t.       C_i - M positive semidefinite for all i = 0...k
17 | 
18 |     ---------Dual Form------------
19 |     minimize   \sum_{i=1}^{k} <C_i, Y_i>
20 |     s.t.       Y_i positive semidefinite for all i = 0...k
21 |                \sum_{i=0}^{k} Y_i = omega
22 |     ------------------------------
23 |     in order to get the value and the gradient of the acquisition function.
24 |     Inputs:
25 |         omega: Second order moment matrix
26 |         fmin: min value achieved so far, i.e. min(y0)
27 |         warm_start: Whether or not to warm-start the solution
28 |     Outputs:
29 |         opt_val: Optimal value of the SDP
30 |         M: Optimizer of the SDP
31 |         Y: List of the Optimal Lagrange Multipliers (Dual Optimizers) of the SDP
32 |         C: List of C_i, i = 0 ...
k 33 | ''' 34 | omega = (omega + omega.T)/2 35 | assert not np.isnan(omega).any() 36 | 37 | # Express the problem in the format required by scs 38 | data = create_scs_data(omega, fmin) 39 | k_ = omega.shape[0] 40 | cone = {'s': [k_]*k_} 41 | if not 'past_omegas' in globals() or len(past_omegas) == 0 or \ 42 | past_omegas[0].shape[0] != k_: 43 | # Clear the saved solutions, as they are of different size 44 | # than the one we are currently trying to solve. 45 | reset_warm_starting() 46 | elif warm_start: 47 | # Update data with warm-started solution 48 | data = get_warm_start(omega, data) 49 | 50 | # Call SCS 51 | sol = scs.solve(data, cone, eps=1e-5, use_indirect=False, verbose=OUTPUT_LEVEL==1) 52 | # print(sol['info']['solveTime']) # Prints solution time for SDP. 53 | if sol['info']['status'] != 'Solved': 54 | logging.getLogger('opt').warning( 55 | 'SCS solution status:' + sol['info']['status'] 56 | ) 57 | 58 | # Extract solution from SCS' structures 59 | M, Y = unpack_solution(sol['x'], sol['y'], k_) 60 | objective = -sol['info']['pobj'] 61 | sol['C'] = data['C'] 62 | 63 | if warm_start: 64 | # Save solution for warm starting 65 | past_solutions.append(sol); past_omegas.append(omega) 66 | 67 | return objective, M, Y, data['C'] 68 | 69 | def reset_warm_starting(): 70 | ''' 71 | Clears list of previous solutions that are used for warm-starting SCS 72 | ''' 73 | global past_omegas, past_solutions 74 | past_omegas = collections.deque(maxlen=20) 75 | past_solutions = collections.deque(maxlen=20) 76 | 77 | def get_warm_start(omega, data): 78 | ''' 79 | Updates SCS' data structure with a warm-started solution 80 | ''' 81 | k_ = omega.shape[0] 82 | 83 | # Search on the list of past solved problems for the one 84 | # that had the most similar omega matrix as the one we 85 | # are trying to solve now 86 | def sort_func(X): 87 | return np.linalg.norm(X - omega) 88 | min_score = float('inf') 89 | idx = -1 90 | for i in range(len(past_omegas)): 91 | score = sort_func(past_omegas[i]) 92 | if min_score > score: 93 | min_score = score 94 | idx = i 95 | 96 | # Copy the solution from the closest match 97 | x = past_solutions[idx]['x'].copy() 98 | y = past_solutions[idx]['y'].copy() 99 | s = past_solutions[idx]['s'].copy() 100 | 101 | # Improve warm starting by moving the solution in the direction of dM, dy 102 | try: 103 | domega = -(past_omegas[idx] - omega) 104 | M, Y = unpack_solution(x, y, k_) 105 | C = past_solutions[idx]['C'] 106 | dM, dY = solution_derivative(M, Y, C, domega, return_dY=True) 107 | 108 | x += pack(dM[:,:,0], k_) 109 | n = len(x) 110 | for i in range(0, k_): 111 | y[i*n:(i+1)*n] += pack(dY[i], k_) 112 | s[i*n:(i+1)*n]-= pack(dM[:,:,0], k_) 113 | except pypardiso.pardiso_wrapper.PyPardisoError: 114 | # In the (very) case where pardiso fails due to badly conditioned data 115 | logging.getLogger('opt').warning('Call to pardiso failed. Switching to zero-order warm starting.') 116 | pass 117 | 118 | # Add warm starting to the problem data 119 | data['x'] = x; data['y'] = y; data['s'] = s 120 | 121 | return data 122 | 123 | def create_scs_data(omega, fmin): 124 | ''' 125 | Returns a data structure that containts the problem data as required by SCS. 
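    For omega of size k_ x k_ the packed dimension is n = k_*(k_+1)/2; A is a
    (k_*n) x n stack of identity blocks, b stacks the packed C_i matrices and
    c = -pack(omega).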
126 | See github.com/cvxgrp/scs 127 | ''' 128 | k_ = omega.shape[0] 129 | 130 | # Create A 131 | n = k_ * (k_ + 1) // 2 132 | A = np.zeros((k_*n, n)) 133 | row_ind = np.array([]) 134 | col_ind = np.array([]) 135 | z = np.array([]) 136 | for i in range(n): 137 | row_ind = np.append(row_ind, np.arange(i, k_*n, n)) 138 | col_ind = np.append(col_ind, np.repeat(i, k_)) 139 | z = np.append(z, np.repeat(1, k_)) 140 | A = sp.csc_matrix((z, (row_ind, col_ind)), shape=(k_*n, n)) 141 | 142 | # Create b and C 143 | b = np.array([]) 144 | C = [] 145 | C.append(np.zeros((k_, k_))) 146 | b = np.append(b, pack(C[0], k_)) 147 | for i in range(1, k_): 148 | C.append(np.zeros((k_, k_))) 149 | C[i][-1, i - 1] = 1/2 150 | C[i][i - 1, -1] = 1/2 151 | C[i][-1, -1] = -fmin 152 | b = np.append(b, pack(C[i], k_)) 153 | 154 | # Create c 155 | c = -pack(omega, k_) 156 | 157 | return {'A': A, 'b': b, 'c': c, 'C': C} 158 | 159 | def solution_derivative(M, Y, C, domegas, return_dY=False): 160 | ''' 161 | Calculates the derivatives of M and Y across perturbations of the cost matrix 162 | defined in domegas. 163 | Inputs: M: Primal Solution of the SDP 164 | Y: List of Dual Solutions of the SDP 165 | C: List of the auxiliary matrices in the definition of the SDP 166 | domegas: 3D-ndarray with each domegas[:, :, i] defining a perturbation 167 | Outputs: dM: 3D-ndarray with each dM[:, :, i] being the derivative of M when 168 | perturbing omega across domega[:, :, i] 169 | dY: list of 3D-ndarrays having, similarly to dM, 170 | the derivatives of Y when perturbing omega 171 | ''' 172 | 173 | H = create_matrix(M, Y, C) 174 | assert len(domegas.shape) == 2 or len(domegas.shape) == 3 175 | if len(domegas.shape) == 2: 176 | domegas = domegas[:, :, None] 177 | 178 | n = domegas.shape[0] 179 | k = domegas.shape[-1] 180 | 181 | # z a vectorized copy of the upper triangular part of domegas 182 | z = np.zeros((n**2 + n*(n+1)//2, k)) 183 | z[-n*(n+1)//2:, :] = domegas[np.triu_indices(n)] 184 | x = H(z) 185 | 186 | def convert_to_matrix(x): 187 | G = np.zeros((n, n, k)) 188 | G[np.triu_indices(n)] = x[-n*(n+1)//2:, :] 189 | G = G + np.transpose(G, [1, 0, 2]) 190 | G[np.diag_indices(n)] = G[np.diag_indices(n)]/2 191 | return G 192 | 193 | dM = convert_to_matrix(x[-n*(n+1)//2:, :]) 194 | 195 | if return_dY: 196 | assert k == 1 # This part has been only implemented for the case where k = 1 197 | dy = np.reshape(x[0:n**2], (n, n, k)) 198 | dY = [] 199 | for i in range(n): 200 | eig_Y_i = np.linalg.eigh(Y[i]) 201 | y_i = (eig_Y_i[0][-1]**.5 * eig_Y_i[1][:, -1])[:, None] 202 | dY.append(y_i.dot(dy[i].T) + dy[i].dot(y_i.T)) 203 | 204 | return dM, dY 205 | else: 206 | return dM 207 | 208 | def create_matrix(M, Y, C): 209 | ''' 210 | Creates the left-hand side matrix of the linear system that gives dM, dy 211 | ''' 212 | k = len(Y) 213 | y = [] 214 | for i in range(len(Y)): 215 | eig_Y_i = np.linalg.eigh(Y[i]) 216 | y_i = (eig_Y_i[0][-1]**.5 * eig_Y_i[1][:, -1])[:, None] 217 | y.append(y_i) 218 | 219 | S = M - C 220 | P = get_P(k) 221 | P_ = get_P_(k) 222 | H1 = sp.block_diag(S, format='csc') 223 | I = sp.eye(k, format='csc') 224 | for i in range(len(y)): 225 | h2 = sp.kron(y[i].T, I).dot(P_) 226 | h3 = P.dot(sp.kron(y[i], I) + sp.kron(I, y[i])) 227 | if i == 0: 228 | H2 = h2 229 | H3 = h3 230 | else: 231 | H2 = sp.vstack((H2, h2)) 232 | H3 = sp.hstack((H3, h3)) 233 | H = sp.bmat([[H1, H2], [H3, None]]) 234 | return factorized(H) 235 | 236 | def unpack_solution(x, y, k_): 237 | ''' 238 | Creates matrices M and Y from scs solution structure 239 | 
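    x holds the packed primal matrix M, while y stacks the k_ packed dual
    blocks Y_0 ... Y_{k_-1}, each of length k_*(k_+1)/2.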
''' 240 | M = unpack(x, k_) 241 | Y = [] 242 | for i in range(k_): 243 | n = k_ * (k_ + 1) // 2 244 | Y.append( 245 | unpack(y[i*n:(i+1)*n], k_) 246 | ) 247 | return M, Y 248 | 249 | def pack(Z, n): 250 | ''' 251 | Auxiliary function for 'packing' a matrix into 252 | a vector format, as required by scs. 253 | See github.com/cvxgrp/scs 254 | ''' 255 | Z = np.copy(Z) 256 | tidx = np.triu_indices(n) 257 | tidx = (tidx[1], tidx[0]) 258 | didx = np.diag_indices(n) 259 | 260 | Z = Z * np.sqrt(2.) 261 | Z[didx] = Z[didx] / np.sqrt(2.) 262 | z = Z[tidx] 263 | 264 | return z 265 | 266 | def unpack(z, n): 267 | ''' 268 | Auxiliary function for 'unpacking' a packed matrix of a vector format, 269 | as required by scs. See github.com/cvxgrp/scs 270 | ''' 271 | z = np.copy(z) 272 | tidx = np.triu_indices(n) 273 | tidx = (tidx[1], tidx[0]) 274 | didx = np.diag_indices(n) 275 | 276 | Z = np.zeros((n, n)) 277 | Z[tidx] = z 278 | Z = (Z + np.transpose(Z)) / np.sqrt(2.) 279 | Z[didx] = Z[didx] / np.sqrt(2.) 280 | 281 | return Z 282 | 283 | def get_P_(k): 284 | A = sp.csc_matrix((0, k*(k+1)//2)) 285 | for i in range(k): 286 | B = sp.csc_matrix((0, 0)) 287 | for j in range(i): 288 | tmp = sp.csc_matrix(([1], ([0], [i-j])), shape=(1, k - j)) 289 | B = sp.block_diag((B, tmp)) 290 | 291 | B = sp.block_diag((B, sp.eye(k - i, format='csc'))) 292 | B = sp.hstack([B, sp.csc_matrix((k, k*(k+1)//2 - B.shape[1]))]) 293 | A = sp.vstack([A, B]) 294 | 295 | return A 296 | 297 | def get_P(k): 298 | A = sp.csc_matrix((0, 0)) 299 | I = sp.eye(k, format='csc') 300 | for i in range(k): 301 | A = sp.block_diag((A, I[i:, :])) 302 | 303 | return A 304 | -------------------------------------------------------------------------------- /methods/solvers.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import logging 3 | OUTPUT_LEVEL = 0 4 | 5 | ''' 6 | L-BFGS-B wrapper 7 | ''' 8 | import scipy as sp 9 | 10 | def bfgs_solve(x_init, bounds, hessian, bo): 11 | res = sp.optimize.minimize(fun=bo.acquisition, 12 | x0=x_init, 13 | method='L-BFGS-B', 14 | jac=True, 15 | bounds=bounds, 16 | options={'disp': OUTPUT_LEVEL} 17 | ) 18 | x = res.x 19 | y = res.fun if isinstance(res.fun, float) else res.fun[0] 20 | 21 | return x, y, res 22 | 23 | try: 24 | ''' 25 | Knitro wrapper 26 | ''' 27 | import sys 28 | sys.path.insert(0, './knitro/examples/Python') 29 | from knitro import * 30 | from knitroNumPy import * 31 | from collections import namedtuple 32 | ''' 33 | See Knitro's callback library reference 34 | https://www.artelys.com/tools/knitro_doc/3_referenceManual/callableLibrary/API.html 35 | The code here is based on the example file 36 | examples/Python/exampleHS15NumPy.py 37 | ''' 38 | def callbackEvalFC(bo, evalRequestCode, n, m, nnzJ, nnzH, x, 39 | lambda_, obj, c, objGrad, jac, hessian, hessVector, 40 | userParams): 41 | if evalRequestCode == KTR_RC_EVALFC: 42 | np.copyto(obj, bo.acquisition(x)[0]) 43 | return 0 44 | else: 45 | return KTR_RC_CALLBACK_ERR 46 | 47 | def callbackEvalGA(bo, evalRequestCode, n, m, nnzJ, nnzH, x, 48 | lambda_, obj, c, objGrad, jac, hessian, hessVector, 49 | userParams): 50 | if evalRequestCode == KTR_RC_EVALGA: 51 | np.copyto(objGrad, bo.acquisition(x)[1]) 52 | return 0 53 | else: 54 | return KTR_RC_CALLBACK_ERR 55 | 56 | def callbackEvalH(bo, evalRequestCode, n, m, nnzJ, nnzH, x, 57 | lambda_, obj, c, objGrad, jac, hessian, hessVector, 58 | userParams): 59 | indices = np.triu_indices(n) 60 | np.copyto(hessian, bo.acquisition_hessian(x)[indices]) 61 | 
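        # Knitro stores the hessian as a dense upper-triangular array:
        # hessRow/hessCol are declared with np.triu_indices(n) in knitro_solve,
        # so only those entries of the full hessian are copied above.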
if evalRequestCode == KTR_RC_EVALH: 62 | # In this case we want the full hessian 63 | return 0 64 | elif evalRequestCode == KTR_RC_EVALH_NO_F: 65 | # In this case we only want the constraint part of the lagrangian 66 | np.copyto(hessian, np.zeros(hessian.shape)) 67 | return 0 68 | else: 69 | return KTR_RC_CALLBACK_ERR 70 | 71 | def knitro_solve(x_init, bounds, hessian, bo): 72 | # We have to implement the case of non hessian 73 | n = bounds.shape[0] 74 | objGoal = KTR_OBJGOAL_MINIMIZE 75 | objType = KTR_OBJTYPE_GENERAL 76 | bndsLo = bounds[:, 0].copy() 77 | bndsUp = bounds[:, 1].copy() 78 | # Hessian is dense 79 | hessRow, hessCol = np.triu_indices(n) 80 | # No constraints 81 | m = 0 82 | cType = np.array([]) 83 | cBndsLo = np.array([]) 84 | cBndsUp = np.array([]) 85 | jacIxConstr = np.array([]) 86 | jacIxVar = np.array([]) 87 | kc = KTR_new() 88 | assert kc is not None, "Failed to initialize knitro. Check license." 89 | # Set knitro parameters 90 | # Derivative Checker 91 | # assert not KTR_set_int_param_by_name(kc, "derivcheck", 2) 92 | # Verbosity 93 | if OUTPUT_LEVEL == 0: 94 | assert not KTR_set_int_param_by_name(kc, "outlev", 0) 95 | # verbose: 96 | # assert not KTR_set_int_param_by_name(kc, "outlev", 2) 97 | # Super verbose: prints iterates information 98 | # assert not KTR_set_int_param_by_name(kc, "outlev", 4) 99 | if hessian: 100 | # Exact hessian 101 | assert not KTR_set_int_param_by_name(kc, "hessopt", 1) 102 | assert not KTR_set_int_param_by_name(kc, "hessian_no_f", 1) 103 | else: 104 | assert not KTR_set_int_param_by_name(kc, "hessopt", 3) # SR1 105 | # Compare performance of all the algorithms 106 | assert not KTR_set_int_param_by_name(kc, "algorithm", 4) # SQP 107 | # assert not KTR_set_int_param_by_name(kc, "linsolver", 3) # Non-sparse 108 | # assert not KTR_set_char_param_by_name(kc, "outlev", "all") 109 | 110 | # set callbacks 111 | assert not KTR_set_func_callback( 112 | kc, lambda *args: callbackEvalFC(bo, *args)) 113 | assert not KTR_set_grad_callback( 114 | kc, lambda *args: callbackEvalGA(bo, *args)) 115 | if hessian: 116 | assert not KTR_set_hess_callback( 117 | kc, lambda *args: callbackEvalH(bo, *args)) 118 | ret = KTR_init_problem(kc, n, objGoal, objType, bndsLo, bndsUp, cType, 119 | cBndsLo, cBndsUp, jacIxVar, jacIxConstr, hessRow, 120 | hessCol, x_init, None) 121 | if ret: 122 | raise RuntimeError( 123 | "Error initializing the problem, Knitro status = %d" % ret) 124 | # These will hold the solutions 125 | x = x_init.copy() 126 | lambda_ = np.zeros(m + n) 127 | obj = bo.acquisition(x)[0] 128 | try: 129 | nStatus = KTR_solve(kc, x, lambda_, 0, obj, None, None, None, None, None, 130 | None) 131 | nIter = KTR_get_number_iters(kc) 132 | fcEvals = KTR_get_number_FC_evals(kc) 133 | gaEvals = KTR_get_number_GA_evals(kc) 134 | hEvals = KTR_get_number_H_evals(kc) 135 | finally: 136 | KTR_free(kc) 137 | Result = namedtuple('Result', 'status nit fEvals gEvals hEvals success message') 138 | return x, obj, Result(status=nStatus, nit=nIter, 139 | fEvals=fcEvals, gEvals=gaEvals, hEvals=hEvals, 140 | success=(nStatus==0), message=str(nStatus) 141 | ) 142 | except OSError: 143 | logging.getLogger('').debug( 144 | "Couldn't load knitro. Make sure it is properly installed." 145 | ) 146 | except ModuleNotFoundError: 147 | logging.getLogger('').warning( 148 | "Couldn't find knitro. Make sure it is installed and visible." 
149 | ) 150 | 151 | def solve(X_init, bounds, hessian, bo, solver): 152 | ''' 153 | Perform non-linear optimization on bo.acquisition_function with either L-BFGS-B (Pseudo-Newton) 154 | or Knitro (Sequential Quadratic Programming) 155 | ''' 156 | x_init = X_init.flatten() 157 | 158 | if solver == 'bfgs': 159 | x, y, status = bfgs_solve(x_init=x_init, bounds=bounds, 160 | hessian=hessian, bo=bo) 161 | elif solver == 'knitro': 162 | try: 163 | x, y, status = knitro_solve(x_init=x_init, bounds=bounds, 164 | hessian=hessian, bo=bo) 165 | except NameError: 166 | logging.getLogger('').critical( 167 | 'Please install knitro to use it.' 168 | ) 169 | raise 170 | else: 171 | assert False, 'Invalid nonlinear solver choice!' 172 | 173 | k = x.size // bo.dim 174 | X = x.reshape(k, bo.dim) 175 | return X, y, status 176 | -------------------------------------------------------------------------------- /out/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/oxfordcontrol/Bayesian-Optimization/980a0edb978710fe29529526977d0c14649b3088/out/.gitkeep -------------------------------------------------------------------------------- /plot.py: -------------------------------------------------------------------------------- 1 | ''' 2 | This script plots the results of various algorithms at the same plot, and 3 | saves it as a pdf in the results/folder. 4 | 5 | Call it as: 6 | python plot_experiments fig_name folder1 ... folderN 7 | where folder* are the folders of the saved experiments. 8 | ''' 9 | 10 | import os.path 11 | import glob 12 | import numpy as np 13 | import pickle 14 | import argparse 15 | import matplotlib 16 | 17 | SMALL_SIZE = 8 18 | MEDIUM_SIZE = 10 19 | BIGGER_SIZE = 12 20 | 21 | matplotlib.rc('font', size=SMALL_SIZE) # controls default text sizes 22 | matplotlib.rc('axes', titlesize=SMALL_SIZE) # fontsize of the axes title 23 | matplotlib.rc('xtick', labelsize=SMALL_SIZE) # fontsize of the tick labels 24 | matplotlib.rc('ytick', labelsize=SMALL_SIZE) # fontsize of the tick labels 25 | matplotlib.rc('legend', fontsize=SMALL_SIZE) # legend fontsize 26 | matplotlib.rc('figure', titlesize=SMALL_SIZE) # fontsize of the figure title 27 | 28 | import matplotlib.pyplot as plt 29 | from matplotlib.ticker import MaxNLocator 30 | # When no X server is present 31 | plt.switch_backend('agg') 32 | 33 | 34 | if __name__ == '__main__': 35 | parser = argparse.ArgumentParser() 36 | parser.add_argument("name", nargs=1) 37 | parser.add_argument("folders", nargs="+") 38 | parser.add_argument('--linewidth', type=float, default=1) 39 | parser.add_argument('--capsize', type=float, default=1.5) 40 | parser.add_argument('--offset_start', type=float, default=-0.2) 41 | parser.add_argument('--offset_delta', type=float, default=0.1) 42 | parser.add_argument('--sizex', type=float, default=5) 43 | parser.add_argument('--sizey', type=float, default=2.5) 44 | parser.add_argument('--regret', type=int, default=1) 45 | parser.add_argument('--max_iters', type=int) 46 | parser.add_argument('--step', type=int, default=1) 47 | args = parser.parse_args() 48 | 49 | plot_experiments(args) 50 | 51 | 52 | def plot_experiments(options): 53 | ''' 54 | Plots the results of a number of different algorithms on the same plot 55 | ''' 56 | # Hopefully we won't plot more than 6 different algorithms at the same plot 57 | colors = ['r', 'b', 'g', 'y', 'b', 'm'] 58 | fig = None 59 | ax = None 60 | offset = options.offset_start 61 | for k in range(len(options.folders)): 62 | 
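        # Each experiment folder is expected to contain arguments.pkl (the run's
        # command-line arguments), fmin.txt (the objective's minimum) and one
        # .npz file of evaluations per seed, as written by run.py.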
folder = options.folders[k] 63 | # Load command line arguments 64 | with open(folder + '/arguments.pkl', 'rb') as file: 65 | args = pickle.load(file) 66 | # print(args) 67 | fmin = np.loadtxt(folder + '/fmin.txt') 68 | 69 | fails = 0 70 | outputs = [] 71 | files = glob.glob(folder + '/*.npz') 72 | for file in files: 73 | npzfile = np.load(file) 74 | 75 | if npzfile['Y'].shape != (): 76 | outputs.append(npzfile['Y']) 77 | else: 78 | fails = fails + 1 79 | print(file) 80 | 81 | label = os.path.basename(folder).split('_')[1] 82 | 83 | if fails > 0: 84 | print(label, 'Fails:', fails, 85 | 'Successes: ', len(files) - fails) 86 | 87 | if k == len(options.folders) - 1: 88 | color = 'k' 89 | else: 90 | color = colors[k] 91 | 92 | fig, ax = plot(outputs=outputs, 93 | fmin=fmin, 94 | iterations=args.iterations, 95 | initial_size=args.initial_size + args.init_replicates, 96 | batch_size=args.batch_size, 97 | color=color, fig=fig, 98 | ax=ax, label=label, 99 | offset=offset, options=options) 100 | offset += options.offset_delta 101 | 102 | ax.set_xlabel('Number of Batches') 103 | if options.regret: 104 | ax.set_ylabel('Regret') 105 | else: 106 | ax.set_ylabel('Loss') 107 | ax.set_title(options.name[0]) 108 | figure = ax.get_figure() 109 | figure.set_size_inches((options.sizex, options.sizey)) 110 | ax.spines['right'].set_visible(False) 111 | ax.spines['top'].set_visible(False) 112 | ax.locator_params(nbins=4, axis='y') 113 | 114 | # Save plot 115 | plt.tight_layout() 116 | plt.savefig('results/' + options.name[0] + '.pdf') 117 | 118 | def plot_mins(mins, options, color='b', fig=None, ax=None, label=None, offset=0): 119 | # Auxiliary function used by plot() 120 | if fig is None or ax is None: 121 | fig = plt.figure() 122 | ax = fig.add_subplot(111) 123 | 124 | ax.ticklabel_format(style='sci', scilimits=(0, 1.5)) 125 | if options.max_iters: 126 | iterations = options.max_iters 127 | else: 128 | iterations = mins.shape[1] 129 | for i in range(0, iterations, options.step): 130 | ax.scatter(i + 0*mins[:, i] + offset, mins[:, i], 131 | s=50, marker='.', color=color, edgecolor='none', 132 | alpha=0.3) 133 | ax.scatter(i + offset, np.median(mins[:, i]), 134 | s=20, marker='d', color=color, edgecolor=(0, 0, 0)) 135 | 136 | # ax.legend().get_frame().set_facecolor('none') 137 | # plt.legend(frameon=False) 138 | max_plotted_value = np.max(np.percentile(mins, 100, axis=0)) 139 | if options.regret: 140 | ax.set_ylim(0, 1.05*max_plotted_value) 141 | ax.xaxis.set_major_locator(MaxNLocator(integer=True)) 142 | 143 | return fig, ax 144 | 145 | def plot(outputs, fmin, iterations, initial_size, batch_size, label, options, 146 | color='b', fig=None, ax=None, output_idx=0, offset=0): 147 | ''' 148 | Plot a single algorithm 149 | ''' 150 | n = len(outputs) 151 | mins = np.zeros((n, iterations + 1)) 152 | for i in range(n): 153 | for j in range(iterations + 1): 154 | idx = np.argmin(outputs[i][0:initial_size + j*batch_size, 0]) 155 | mins[i, j] = outputs[i][idx, output_idx] - fmin 156 | 157 | fig, ax = plot_mins(mins, options, color, fig, ax, label, offset) 158 | 159 | return fig, ax 160 | -------------------------------------------------------------------------------- /results/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/oxfordcontrol/Bayesian-Optimization/980a0edb978710fe29529526977d0c14649b3088/results/.gitkeep -------------------------------------------------------------------------------- /run.py: 
-------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import tensorflow as tf 4 | import random 5 | import argparse 6 | import gpflow 7 | from methods.oei import OEI 8 | from methods.random import Random 9 | import time 10 | import pickle 11 | from benchmark_functions import scale_function, hart6 12 | import copy 13 | 14 | algorithms = { 15 | 'OEI': OEI, 16 | 'Random': Random 17 | } 18 | 19 | class SafeMatern32(gpflow.kernels.Matern32): 20 | # See https://github.com/GPflow/GPflow/pull/727 21 | def euclid_dist(self, X, X2): 22 | r2 = self.square_dist(X, X2) 23 | return tf.sqrt(tf.maximum(r2, 1e-40)) 24 | 25 | 26 | def run(options, seed, robust=False, save=False): 27 | ''' 28 | Runs bayesian optimization on the setup defined in the options dictionary 29 | starting from a predefined seed. Saves results on the folder named 'out' while logging 30 | is saved on the folder 'log'. 31 | ''' 32 | options['seed'] = seed 33 | # Set random seed: Numpy, Tensorflow, Python 34 | tf.reset_default_graph() 35 | tf.set_random_seed(seed) 36 | np.random.seed(seed) 37 | random.seed(seed) 38 | 39 | # Create bo object which will be called later to perform Bayesian Optimization. 40 | bo = algorithms[options['algorithm']](options) 41 | 42 | try: 43 | start = time.time() 44 | # Run BO 45 | X, Y = bo.bayesian_optimization() 46 | end = time.time() 47 | print('Done with:', bo.options['job_name'], 'seed:', seed, 48 | 'Time:', '%.2f' % ((end - start)/60), 'min') 49 | except KeyboardInterrupt: 50 | print("Caught KeyboardInterrupt, stopping.") 51 | raise 52 | except: 53 | print('Experiment of', bo.options['job_name'], 54 | 'with seed', seed, 'failed') 55 | X, Y = None, None 56 | if not robust: 57 | raise 58 | 59 | if save: 60 | save_folder = 'out/' + bo.options['job_name'] + '/' 61 | filepath = save_folder + str(seed) + '.npz' 62 | try: 63 | os.makedirs(save_folder) 64 | except OSError: 65 | pass 66 | try: 67 | os.remove(filepath) 68 | except OSError: 69 | pass 70 | 71 | np.savez(filepath, X=X, Y=Y) 72 | 73 | 74 | def create_options(args): 75 | functions = { 76 | 'hart6': hart6() 77 | } 78 | 79 | kernels_gpflow = { 80 | 'RBF': gpflow.kernels.RBF, 81 | 'Matern32': SafeMatern32, 82 | } 83 | 84 | options = vars(copy.copy(args)) 85 | options['objective'] = functions[options['function']] 86 | options['objective'].bounds = np.asarray(options['objective'].bounds) 87 | # This scales the input domain of the function to [-0.5, 0.5]^n. It's different to the 88 | # normalize option, which scales the output of the function. 
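    # (scale_function is defined in benchmark_functions.py; it maps points from
    #  the scaled domain back to the objective's native bounds before every
    #  evaluation.)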
89 | options['objective'] = scale_function(options['objective']) 90 | 91 | input_dim = options['objective'].bounds.shape[0] 92 | if options['algorithm'] != 'LP_EI': 93 | k = kernels_gpflow[options['kernel']]( 94 | input_dim=input_dim, ARD=options['ard']) 95 | if options['priors']: 96 | k.lengthscales.prior = gpflow.priors.Gamma(shape=2, scale=0.5) 97 | k.variance.prior = gpflow.priors.Gaussian(mu=1, var=2) 98 | options['kernel'] = k 99 | 100 | options['job_name'] = options['function'] + '_' + options['algorithm'] 101 | 102 | return options 103 | 104 | 105 | def main(args): 106 | options = create_options(args) 107 | 108 | save_folder = 'out/' + options['job_name'] + '/' 109 | filepath = save_folder + 'arguments.pkl' 110 | try: 111 | os.makedirs(save_folder) 112 | except OSError: 113 | pass 114 | 115 | try: 116 | os.remove(filepath) 117 | except OSError: 118 | pass 119 | try: 120 | with open(filepath, 'wb') as file: 121 | pickle.dump(args, file, pickle.HIGHEST_PROTOCOL) 122 | except OSError: 123 | pass 124 | 125 | filepath = save_folder + 'fmin.txt' 126 | try: 127 | fmin = options['objective'].fmin 128 | except AttributeError: 129 | fmin = 0 130 | np.savetxt(filepath, np.array([fmin])) 131 | 132 | for seed in range(args.seed, args.seed + args.num_seeds): 133 | run(options, seed=seed, save=options['save']) 134 | 135 | 136 | def create_parser(): 137 | parser = argparse.ArgumentParser() 138 | parser.add_argument('--function', default='hart6') 139 | parser.add_argument('--algorithm', default='OEI') 140 | parser.add_argument('--seed', type=int, default=123) 141 | parser.add_argument('--num_seeds', type=int, default=1) 142 | parser.add_argument('--save', type=int, default=1) 143 | 144 | parser.add_argument('--batch_size', type=int, default=5) 145 | parser.add_argument('--iterations', type=int, default=10) 146 | parser.add_argument('--initial_size', type=int, default=10) 147 | parser.add_argument('--model_restarts', type=int, default=20, 148 | help='Random restarts when optimizing the Likelihood of the GP.') 149 | parser.add_argument('--opt_restarts', type=int, default=20, 150 | help='Random restarts when optimizing the acquisition function.') 151 | parser.add_argument('--normalize_Y', type=int, default=1, 152 | help='If set to 1, then the outputs of the function under optimization is normalized to have variance 1 and mean 0') 153 | parser.add_argument('--noise', type=float, 154 | help='Used to set the likelihood to a fixed value') 155 | parser.add_argument('--kernel', default='Matern32') 156 | parser.add_argument('--ard', type=int, default=0) 157 | parser.add_argument('--nl_solver', default='knitro') 158 | parser.add_argument('--hessian', type=int, default=1) 159 | 160 | parser.add_argument('--priors', type=int, default=0) 161 | 162 | return parser 163 | 164 | 165 | if __name__ == '__main__': 166 | parser = create_parser() 167 | args = parser.parse_args() 168 | 169 | main(args) 170 | 171 | -------------------------------------------------------------------------------- /tests/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/oxfordcontrol/Bayesian-Optimization/980a0edb978710fe29529526977d0c14649b3088/tests/__init__.py -------------------------------------------------------------------------------- /tests/create_model.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from methods.oei import OEI 3 | import gpflow 4 | import sys 5 | sys.path.append('..') 6 | from 


def create_model(batch_size=2):
    options = {}
    options['samples'] = 0
    options['priors'] = 0
    options['batch_size'] = batch_size
    options['iterations'] = 5
    options['opt_restarts'] = 2
    options['initial_size'] = 10
    options['model_restarts'] = 10
    options['normalize_Y'] = 1
    options['noise'] = 1e-6
    options['nl_solver'] = 'bfgs'
    options['hessian'] = True

    options['objective'] = hart6()
    options['objective'].bounds = np.asarray(options['objective'].bounds)
    options['objective'] = scale_function(options['objective'])

    input_dim = options['objective'].bounds.shape[0]
    options['kernel'] = gpflow.kernels.Matern32(
        input_dim=input_dim, ARD=False
    )

    options['job_name'] = 'tmp'

    bo = OEI(options)
    # Initialize
    bo.bayesian_optimization()

    return bo
--------------------------------------------------------------------------------
/tests/test_derivatives.py:
--------------------------------------------------------------------------------
import numpy as np
import numdifftools as nd
from .create_model import create_model
from methods.sdp import sdp, solution_derivative

# Batch size
K = 3


def derivatives_numerical(x, model):
    '''
    Returns numerical estimates of the gradient and hessian of the
    optimal value of the SDP with respect to x, computed with
    numdifftools.
    '''
    def opt_val(y):
        return model.acquisition(y)[0]

    def gradient(y):
        # Analytical gradient of the acquisition function
        # (kept for reference; the estimates below are purely numerical).
        return model.acquisition(y)[1]

    gradient_numerical = nd.Gradient(opt_val)(x)
    hessian_numerical = nd.Hessian(opt_val)(x)

    return gradient_numerical, hessian_numerical


def derivatives_analytical(x, model):
    '''
    Returns the gradient and hessian of the optimal value of
    the SDP with respect to x.
    '''
    return model.acquisition(x)[1], model.acquisition_hessian(x)


def sensitivity_numerical(omega, direction, model):
    '''
    Returns the numerical derivatives of
    1) the optimal value of the SDP
    2) the primal and dual solution, i.e.
       d/dx([vec(M); vec(Y_0); ...; vec(Y_k)])
    when the second moment matrix omega is perturbed along the given
    direction.
    '''
    fmin = np.min(model.predict_f(model.X.value)[0])

    def solution(om):
        '''
        Return [vec(M); vec(Y_0); ...; vec(Y_k)] at omega=om
        '''
        M, Y = sdp(om, fmin)[1:3]
        solution = M.flatten()
        for i in range(len(Y)):
            solution = np.concatenate((solution, Y[i].flatten()))
        return solution

    d_opt_val = nd.Derivative(
        lambda x: sdp(omega + x*direction, fmin)[0]
    )(0)

    d_solution = nd.Derivative(
        lambda x: solution(omega + x*direction)
    )(0)

    return d_opt_val, d_solution


def sensitivity_analytical(omega, direction, model):
    fmin = np.min(model.predict_f(model.X.value)[0])
    _, M, Y, C = sdp(omega, fmin)
    d_opt_val = np.trace(M.T.dot(direction))

    dM, dY = solution_derivative(M, Y, C, direction, return_dY=True)

    d_solution = dM.flatten()
    for i in range(len(dY)):
        d_solution = np.concatenate((d_solution, dY[i].flatten()))

    return d_opt_val, d_solution


def test_sensitivity():
    '''
    Checks the sensitivity results of the SDP when perturbing
    omega(X) along a direction D.
    X and D are chosen uniformly at random.
    '''
    np.random.seed(0)

    bo = create_model(batch_size=K)
    X = bo.random_sample(bo.bounds, K)

    omega = bo.omega(X)
    mu = omega[0:K, -1][:, None]

    D_s = np.random.rand(K, K)
    D_s = D_s.dot(D_s.T)
    D_m = np.random.rand(K, 1)
    D = np.zeros((K + 1, K + 1))

    D[0:K, 0:K] = D_s + mu.dot(D_m.T) + D_m.dot(mu.T)
    D[-1, 0:K] = D_m.flatten()
    D[0:K, -1] = D_m.flatten()
    D = (D + D.T)/2
    D = 1e-3*D

    d_opt_val, d_solution = sensitivity_analytical(omega, D, bo)
    d_opt_val_n, d_solution_n = sensitivity_numerical(omega, D, bo)

    # Check the derivative of the optimal value
    np.testing.assert_allclose(d_opt_val, d_opt_val_n, rtol=1e-2)
    # Check the derivative of the primal and dual solution
    np.testing.assert_allclose(d_solution, d_solution_n, rtol=3e-1)


def test_derivatives():
    '''
    Tests the gradient and hessian of the acquisition function.
    '''
    np.random.seed(0)

    bo = create_model(batch_size=K)
    X = bo.random_sample(bo.bounds, K)

    gradient, hessian = derivatives_analytical(X.flatten(), bo)
    gradient_n, hessian_n = derivatives_numerical(X.flatten(), bo)

    # The following checks work better when no warm starting is present;
    # warm starting appears to confuse the numerical differentiation.
    np.testing.assert_allclose(gradient, gradient_n, rtol=5e-1)
    assert np.linalg.norm(gradient - gradient_n) / np.linalg.norm(gradient) < 1e-2
    assert np.linalg.norm(hessian - hessian_n) / np.linalg.norm(hessian) < 2e-2
    # An elementwise comparison of the hessians is too strict:
    # np.testing.assert_allclose(hessian, hessian_n, rtol=1)
--------------------------------------------------------------------------------
/tests/test_sdp.py:
--------------------------------------------------------------------------------
from methods import sdp
import numpy as np
import cvxpy as cvx


def sdp_mosek(omega, fmin):
    '''
    Reference solution of the SDP, computed with MOSEK via CVXPY.
    '''
    k_ = omega.shape[0]

    Y = []
    C = []

    C.append(np.zeros((k_, k_)))
    C[0][-1, -1] = 0
    Y.append(cvx.Semidef(k_))
    cost_sum = Y[0]*C[0]

    for i in range(1, k_):
        Y.append(cvx.Semidef(k_))
        C.append(np.zeros((k_, k_)))
        C[i][-1, i - 1] = 1/2
        C[i][i - 1, -1] = 1/2
        C[i][-1, -1] = -fmin
        cost_sum += Y[i]*C[i]

    constraints = [sum(Y) == omega]

    objective = cvx.Minimize(cvx.trace(cost_sum))

    prob = cvx.Problem(objective, constraints)
    opt_val = prob.solve(solver=cvx.MOSEK, verbose=0)

    # Assert that a valid (finite) solution is returned
    assert (isinstance(opt_val, np.ndarray) or isinstance(opt_val, float))\
        and np.isfinite(opt_val)

    M = -constraints[0].dual_value
    M = np.asarray((M + M.T)/2)  # From matrix to array

    Y_return = []
    for y in Y:
        Y_return.append(np.asarray(y.value))

    return opt_val, M, Y_return, C


def test_sdp():
    np.random.seed(0)
    k = 5  # Batch size

    for i in range(10):
        # Generate problem data
        tmp = np.random.randn(k, k)
        # TODO: Change the way sigma is constructed;
        # creating it as below results in a very bad condition number.
        sigma = tmp.dot(tmp.T) + 0.01*np.eye(tmp.shape[0], tmp.shape[1])
        mu = np.random.randn(k, 1)

        omega = np.zeros((k + 1, k + 1))
        omega[0:k, 0:k] = sigma + mu.dot(mu.T)
        omega[-1, 0:k] = mu.flatten()
        omega[0:k, -1] = mu.flatten()
        omega[-1, -1] = 1

        fmin = np.random.randn(1)

        # Solve the SDP with SCS and compare against MOSEK
        opt_val_scs, M_scs = sdp.sdp(omega, fmin, warm_start=False)[0:2]
        opt_val_mosek, M_mosek = sdp_mosek(omega, fmin)[0:2]

        # The optimal values of the two formulations should agree closely
        np.testing.assert_allclose(opt_val_scs, opt_val_mosek, rtol=1e-4)

        # This check is coarse, hence it should not fail
        assert np.linalg.norm(M_scs - M_mosek)/np.linalg.norm(M_mosek) < 1e-2
        # This might fail occasionally, as SCS is of limited accuracy
        np.testing.assert_allclose(M_scs, M_mosek, rtol=5e-2)
--------------------------------------------------------------------------------
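The tests above exercise the SCS-based solver in `methods/sdp.py` directly. For quick experimentation, the sketch below shows how that solver can be called on its own, mirroring the construction of the second-moment matrix in `test_sdp`. The numerical values and the placeholder incumbent `fmin` are illustrative only, and the snippet assumes it is run from the repository root so that `methods` is importable.

```python
# Minimal usage sketch (illustrative values): evaluate the SDP behind OEI
# for a random Gaussian posterior over a batch of k points.
import numpy as np
from methods import sdp

np.random.seed(0)
k = 3                                    # batch size
tmp = np.random.randn(k, k)
sigma = tmp.dot(tmp.T) + 0.01*np.eye(k)  # posterior covariance of the batch
mu = np.random.randn(k, 1)               # posterior mean of the batch
fmin = 0.0                               # placeholder incumbent (best value so far)

# Second-moment matrix omega = [[sigma + mu mu', mu], [mu', 1]]
omega = np.zeros((k + 1, k + 1))
omega[0:k, 0:k] = sigma + mu.dot(mu.T)
omega[-1, 0:k] = mu.flatten()
omega[0:k, -1] = mu.flatten()
omega[-1, -1] = 1

# sdp.sdp returns (opt_val, M, Y, C), as unpacked in the tests above.
opt_val, M, Y, C = sdp.sdp(omega, fmin, warm_start=False)
print('SDP optimal value (OEI):', opt_val)
```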