├── LICENSE
├── README.md
├── active_learning.m
├── label_oracles
│   ├── bernoulli_oracle.m
│   ├── label_oracles.m
│   ├── lookup_oracle.m
│   ├── multinomial_oracle.m
│   └── probabilistic_oracle.m
├── models
│   ├── cheating_model.m
│   ├── ensemble.m
│   ├── gaussian_process_model.m
│   ├── knn_model.m
│   ├── label_propagation_model.m
│   ├── model_memory_wrapper.m
│   ├── models.m
│   └── random_forest_model.m
├── other
│   ├── get_label_oracle.m
│   ├── get_model.m
│   ├── get_query_strategy.m
│   ├── get_score_function.m
│   └── get_selector.m
├── query_strategies
│   ├── argmax.m
│   ├── argmin.m
│   ├── expected_error_reduction.m
│   ├── margin_sampling.m
│   ├── query_by_committee.m
│   ├── query_strategies.m
│   └── uncertainty_sampling.m
├── score_functions
│   ├── calculate_entropies.m
│   ├── expected_loss_lookahead.m
│   ├── expected_loss_naive.m
│   ├── expected_utility_lookahead.m
│   ├── expected_utility_naive.m
│   ├── loss_functions
│   │   ├── expected_01_loss.m
│   │   ├── expected_log_loss.m
│   │   └── loss_functions.m
│   ├── margin.m
│   ├── marginal_entropy.m
│   └── score_functions.m
└── selectors
    ├── fixed_test_set_selector.m
    ├── graph_walk_selector.m
    ├── identity_selector.m
    ├── meta_selectors
    │   ├── complement_selector.m
    │   ├── intersection_selector.m
    │   └── union_selector.m
    ├── probability_treshold_selector.m
    ├── random_selector.m
    ├── selectors.m
    └── unlabeled_selector.m
/LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2011--2014 Roman Garnett 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining 6 | a copy of this software and associated documentation files (the 7 | "Software"), to deal in the Software without restriction, including 8 | without limitation the rights to use, copy, modify, merge, publish, 9 | distribute, sublicense, and/or sell copies of the Software, and to 10 | permit persons to whom the Software is furnished to do so, subject to 11 | the following conditions: 12 | 13 | The above copyright notice and this permission notice shall be 14 | included in all copies or 
substantial portions of the Software. 15 | 16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 17 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 18 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 19 | NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE 20 | LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION 21 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION 22 | WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 23 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Active Learning Toolbox for MATLAB 2 | ================================== 3 | 4 | This software package provides a toolbox for testing pool-based 5 | active-learning algorithms in MATLAB. 6 | 7 | Active Learning 8 | --------------- 9 | 10 | Specifically, we consider the following scenario. There is a pool of 11 | datapoints ![X][1]. We may successively select a set of points 12 | ![x in X][2] to observe. Each observation reveals a discrete, 13 | integer-valued label ![y in L][3] for ![x][4]. This labeling process 14 | might be nondeterministic; we might choose the same point ![x][4] 15 | twice and observe different labels each time. In active learning, we 16 | typically assume we have a budget ![B][5] that limits the number of 17 | points we may observe. 18 | 19 | Our goal is to iteratively build a set of observations 20 | 21 | ![D = (X, Y)][6] 22 | 23 | that achieves some goal in an efficient manner. One typical goal is 24 | that this training set allows us to accurately predict the labels on 25 | the unobserved points. Assume we have a probabilistic model 26 | 27 | ![p(y | x, D)][7] 28 | 29 | and let ![U = X \ X][8] represent the set of unobserved points. 
We 30 | might wish to minimize either the 0/1 loss on the unlabeled points 31 | 32 | ![\sum_{x in U} (\hat{y} \neq y)][9], 33 | 34 | where ![\hat{y} = \argmax p(y | x, D)][10], or the log loss: 35 | 36 | ![\sum_{x in U} -\log p(y | x, D)][11]. 37 | 38 | We could sample a random set of ![B][5] points, but by careful 39 | consideration of our observation locations, we hope we can do 40 | significantly better than this. One common active learning strategy, 41 | known as _uncertainty sampling_, iteratively chooses to make an 42 | observation at the point with the largest marginal entropy given the 43 | current data: 44 | 45 | ![x* = \argmax H(y | x, D)][12], 46 | 47 | with the hope that these queries can better map out the boundaries 48 | between classes. 49 | 50 | Of course, there are countless goals besides minimizing generalization 51 | error and numerous other strategies besides the highly myopic 52 | uncertainty sampling. Indeed, many active learning scenarios might not 53 | involve probability models at all. Providing a highly adaptable and 54 | extensible toolbox for conducting arbitrary pool-based active learning 55 | experiments is the goal of this project. 56 | 57 | Using this Toolbox 58 | ------------------ 59 | 60 | The most important function is `active_learning`, which simulates an 61 | active learning experiment using the following procedure: 62 | 63 | Given: initially labeled points X, 64 | corresponding labels Y, 65 | budget B 66 | 67 | for i = 1:B 68 | % find points available for labeling 69 | eligible_points = selector(X, Y) 70 | 71 | % decide on point(s) to observe 72 | x_star = query_strategy(X, Y, eligible_points) 73 | 74 | % observe point(s) 75 | y_star = label_oracle(x_star) 76 | 77 | % add observation(s) to training set 78 | X = [X, x_star] 79 | Y = [Y, y_star] 80 | end 81 | 82 | The implementation supports user-specified: 83 | 84 | * _Selectors,_ which given the current training set, return a set of 85 | points currently eligible for labeling. 
See `selectors.m` for usage 86 | and available implementations. 87 | 88 | * _Query strategies,_ which given a training set and the selected 89 | eligible points, decide which point(s) to observe next. Note that a 90 | query strategy can return multiple points, allowing for batch 91 | observations. See `query_strategies.m` for usage and available 92 | implementations. 93 | 94 | * _Label oracles,_ which given a set of points, return a set of 95 | corresponding labels. Label oracles may optionally be 96 | nondeterministic (see, for example, `bernoulli_oracle`). See 97 | `label_oracles.m` for usage and available implementations. 98 | 99 | Each of these is provided as a function handle satisfying a desired 100 | API, described below. 101 | 102 | This function also supports arbitrary user-specified callbacks 103 | called after each round of the experiment. This can be useful, for 104 | example, for plotting the progress of the algorithm and/or printing 105 | statistics such as test error online. 106 | 107 | Selectors 108 | --------- 109 | 110 | A _selector_ considers the current labeled dataset and indicates which 111 | of the unlabeled points should be considered for observation at this 112 | time. 
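As a concrete sketch of the selector interface documented below, a custom selector might look like the following. The function name and its coin-flip rule are purely illustrative assumptions, not part of the toolbox:

```matlab
% Illustrative custom selector (not part of the toolbox): considers
% each currently unlabeled point independently with probability 1/2.
% Follows the selector interface documented below; the third input
% (observed_labels) is unused here.
function test_ind = coin_flip_selector(problem, train_ind, ~)

  % indices of all points not yet observed
  test_ind = setdiff((1:size(problem.points, 1))', train_ind);

  % keep each unlabeled point independently with probability 1/2
  test_ind = test_ind(rand(size(test_ind)) < 0.5);

end
```

In practice the provided `unlabeled_selector` and `random_selector` cover this kind of behavior; a handwritten selector is only needed for problem-specific eligibility rules.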
113 | 114 | Selectors must satisfy the following interface: 115 | 116 | test_ind = selector(problem, train_ind, observed_labels) 117 | 118 | ### Inputs: ### 119 | 120 | * `problem`: a struct describing the problem, containing fields: 121 | 122 | * `points`: an ![(n x d)][13] data matrix for the available points 123 | * `num_classes`: the number of classes 124 | * `num_queries`: the number of queries to make 125 | 126 | * `train_ind`: a list of indices into `problem.points` indicating the 127 | thus-far observed points 128 | 129 | * `observed_labels`: a list of labels corresponding to the 130 | observations in `train_ind` 131 | 132 | ### Output: ### 133 | 134 | * `test_ind`: a list of indices into `problem.points` indicating the 135 | points to consider for labeling 136 | 137 | The following general-purpose selectors are provided in this toolbox: 138 | 139 | * `fixed_test_set_selector`: selects all points besides a given test 140 | set 141 | * `graph_walk_selector`: confines an experiment to follow a path on a 142 | graph 143 | * `identity_selector`: selects all points 144 | * `random_selector`: selects a random subset of points 145 | * `unlabeled_selector`: selects points not yet observed 146 | 147 | In addition, the following "meta" selectors are provided, which 148 | combine or modify the outputs of other selectors: 149 | 150 | * `complement_selector`: takes the complement of a selector's output 151 | * `intersection_selector`: takes the intersection of the outputs of selectors 152 | * `union_selector`: takes the union of the outputs of selectors 153 | 154 | Query Strategies 155 | ---------------- 156 | 157 | _Query strategies_ select which of the points currently eligible for 158 | labeling (returned by a selector) should be observed next. 
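Because `argmax` (listed below) accepts an arbitrary score function, familiar strategies arise by composition. As a hedged sketch, uncertainty sampling can be recovered by pairing `argmax` with the `marginal_entropy` score; the binding of extra arguments via anonymous functions, and the exact trailing parameters of `marginal_entropy` and `argmax`, are assumptions here:

```matlab
% Illustrative composition (argument binding is an assumption):
% uncertainty sampling as the argmax of the marginal-entropy score.
% "model" is assumed to be a handle to a probability model.
score_function = @(problem, train_ind, observed_labels, test_ind) ...
    marginal_entropy(problem, train_ind, observed_labels, test_ind, model);

query_strategy = @(problem, train_ind, observed_labels, test_ind) ...
    argmax(problem, train_ind, observed_labels, test_ind, score_function);
```

The provided `uncertainty_sampling` packages this behavior directly; the point of the sketch is that new strategies can be assembled the same way from new score functions.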
159 | 160 | Query strategies must satisfy the following interface: 161 | 162 | query_ind = query_strategy(problem, train_ind, observed_labels, test_ind) 163 | 164 | ### Inputs: ### 165 | 166 | * `problem`: a struct describing the problem, containing fields: 167 | 168 | * `points`: an ![(n x d)][13] data matrix for the available points 169 | * `num_classes`: the number of classes 170 | * `num_queries`: the number of queries to make 171 | 172 | * `train_ind`: a list of indices into `problem.points` indicating the 173 | thus-far observed points 174 | 175 | * `observed_labels`: a list of labels corresponding to the 176 | observations in `train_ind` 177 | * `test_ind`: a list of indices into `problem.points` indicating the 178 | points eligible for observation 179 | 180 | ### Output: ### 181 | 182 | * `query_ind`: an index into `problem.points` indicating the point(s) 183 | to query next (every entry in `query_ind` will always be a member 184 | of the set of points in `test_ind`) 185 | 186 | The following query strategies are provided in this toolbox: 187 | 188 | * `argmax`: samples the point(s) maximizing a given score function 189 | * `argmin`: samples the point(s) minimizing a given score function 190 | * `expected_error_reduction`: samples the point giving the lowest 191 | expected loss on unlabeled points 192 | * `margin_sampling`: samples the point with the smallest margin 193 | * `query_by_committee`: samples the point with the highest disagreement 194 | between models 195 | * `uncertainty_sampling`: samples the most uncertain point 196 | 197 | Label Oracles 198 | ------------- 199 | 200 | _Label oracles_ are functions that, given a set of points chosen to be 201 | queried, return a set of corresponding labels. In general, they need 202 | not be deterministic, which is especially interesting when points can 203 | be queried multiple times. 
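Oracles that take extra parameters can be adapted to the interface below with an anonymous function. For example, using the documented signature of `bernoulli_oracle`, and assuming `probabilities` is a length-n vector of class-1 ("success") probabilities you have constructed:

```matlab
% Bind the extra "probabilities" argument of bernoulli_oracle so the
% resulting handle matches the standard label oracle interface;
% "probabilities" is an assumed length-n vector of class-1 probabilities.
label_oracle = @(problem, train_ind, observed_labels, query_ind) ...
    bernoulli_oracle(problem, train_ind, observed_labels, ...
                     query_ind, probabilities);
```

The helper `get_label_oracle` (in `other/`) exists to build such handles concisely.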
204 | 205 | Label oracles must satisfy the following interface: 206 | 207 | label = label_oracle(problem, query_ind) 208 | 209 | ### Inputs: ### 210 | 211 | * `problem`: a struct describing the problem, containing fields: 212 | 213 | * `points`: an ![(n x d)][13] data matrix for the available points 214 | * `num_classes`: the number of classes 215 | 216 | * `query_ind`: an index into `problem.points` specifying the point(s) to be 217 | queried 218 | 219 | ### Output: ### 220 | 221 | * `label`: a list of integers between 1 and `problem.num_classes` 222 | indicating the observed label(s) 223 | 224 | The following general-purpose label oracles are provided in this 225 | toolbox: 226 | 227 | * `lookup_oracle`: a trivial lookup-table label oracle given a fixed 228 | list of ground-truth labels 229 | * `bernoulli_oracle`: a label oracle that, conditioned on the queried 230 | point(s), samples labels independently from a Bernoulli distribution 231 | with given success probability 232 | * `multinomial_oracle`: a label oracle that, conditioned on the 233 | queried point(s), samples labels independently from a multinomial 234 | distribution with given success probabilities 235 | 236 | [1]: http://latex.codecogs.com/svg.latex?%5Cmathcal%7BX%7D 237 | [2]: http://latex.codecogs.com/svg.latex?x%20%5Cin%20%5Cmathcal%7BX%7D 238 | [3]: http://latex.codecogs.com/svg.latex?y%20%5Cin%20%5BL%5D 239 | [4]: http://latex.codecogs.com/svg.latex?x 240 | [5]: http://latex.codecogs.com/svg.latex?B 241 | [6]: http://latex.codecogs.com/svg.latex?%5Cmathcal%7BD%7D%20%3D%20%5Cbigl%5Clbrace%20(x_i%2C%20y_i)%20%5Cbigr%20%5Crbrace_%7Bi%3D1%7D%5EB%20%3D%20(X%2C%20Y) 242 | [7]: http://latex.codecogs.com/svg.latex?p(y%20%5Cmid%20x%2C%20%5Cmathcal%7BD%7D), 243 | [8]: http://latex.codecogs.com/svg.latex?%5Cmathcal%7BU%7D%20%3D%20%5Cmathcal%7BX%7D%20%5Csetminus%20X 244 | [9]: http://latex.codecogs.com/svg.latex?%5Csum_%7Bx%20%5Cin%20%5Cmathcal%7BU%7D%7D%20%5B%5Chat%7By%7D%20%5Cneq%20y%5D 245 | [10]: 
http://latex.codecogs.com/svg.latex?%5Chat%7By%7D%20%3D%20%5Coperatorname%7Barg%5C%2Cmax%7D%20p(y%20%5Cmid%20x%2C%20%5Cmathcal%7BD%7D) 246 | [11]: http://latex.codecogs.com/svg.latex?%5Csum_%7Bx%20%5Cin%20%5Cmathcal%7BU%7D%7D%20-%5Clog%20p(y%20%5Cmid%20x%2C%20%5Cmathcal%7BD%7D) 247 | [12]: http://latex.codecogs.com/svg.latex?x%5E%5Cast%20%3D%20%5Coperatorname%7Barg%5C%2Cmax%7D_x%20H%5By%20%5Cmid%20x%2C%20%5Cmathcal%7BD%7D%5D 248 | [13]: http://latex.codecogs.com/svg.latex?(n%20%5Ctimes%20d) 249 | -------------------------------------------------------------------------------- /active_learning.m: -------------------------------------------------------------------------------- 1 | % ACTIVE_LEARNING simulates an active learning experiment. 2 | % 3 | % This function performs active learning on a set of discrete points 4 | % using a given query strategy. An active-learning experiment is 5 | % simulated using the following procedure: 6 | % 7 | % Given: initially labeled points x, 8 | % corresponding labels y, 9 | % budget B 10 | % 11 | % for i = 1:B 12 | % % find points available for labeling 13 | % eligible_points = selector(x, y) 14 | % 15 | % % decide on point(s) to observe 16 | % x_star = query_strategy(x, y, eligible_points) 17 | % 18 | % % observe point(s) 19 | % y_star = label_oracle(x_star) 20 | % 21 | % % add observation(s) to training set 22 | % x = [x, x_star] 23 | % y = [y, y_star] 24 | % end 25 | % 26 | % This function supports user-specified: 27 | % 28 | % * _Selectors,_ which given the current training set, return a set of 29 | % points currently eligible for labeling. See selectors.m for usage 30 | % and available implementations. 31 | % 32 | % * _Query strategies,_ which given a training set and the selected 33 | % eligible points, decide which point(s) to observe next. Note that 34 | % a query strategy can return multiple points, allowing for batch 35 | % observations. See query_strategies.m for usage and available 36 | % implementations. 
37 | % 38 | % * _Label oracles,_ which given a set of points, return a set of 39 | % corresponding labels. Label oracles may optionally be 40 | % nondeterministic (see, for example, bernoulli_oracle). See 41 | % label_oracles.m for usage and available implementations. 42 | % 43 | % This function also supports arbitrary user-specified callbacks 44 | % called after each round of the experiment. This can be useful, for 45 | % example, for plotting the progress of the algorithm and/or printing 46 | % statistics such as test error online. 47 | % 48 | % Usage: 49 | % 50 | % [chosen_ind, chosen_labels] = ... 51 | % active_learning(problem, train_ind, observed_labels, label_oracle, ... 52 | % selector, query_strategy, callback) 53 | % 54 | % Inputs: 55 | % 56 | % problem: a struct describing the problem, containing fields: 57 | % 58 | % points: an (n x d) data matrix for the available points 59 | % num_classes: the number of classes 60 | % num_queries: the number of queries to make 61 | % verbose: whether to print information regarding 62 | % each query (default: false) 63 | % 64 | % train_ind: a (possibly empty) list of indices into 65 | % problem.points indicating the labeled points at 66 | % start 67 | % observed_labels: a (possibly empty) list of labels corresponding 68 | % to the observations in train_ind 69 | % label_oracle: a handle to a label oracle, which takes an index 70 | % into problem.points and returns a label 71 | % selector: a handle to a point selector, which specifies 72 | % which points are eligible to query at a given time 73 | % query_strategy: a handle to a query strategy 74 | % callback: (optional) a handle to an arbitrary user-defined 75 | % callback called after each new point is queried. 76 | % The callback will be called as 77 | % 78 | % callback(problem, train_ind, observed_labels) 79 | % 80 | % and anything returned will be ignored. 
81 | % 82 | % Outputs: 83 | % 84 | % chosen_ind: a list of indices of the chosen datapoints, in order 85 | % chosen_labels: a list of the corresponding observed labels 86 | % 87 | % See also LABEL_ORACLES, SELECTORS, QUERY_STRATEGIES. 88 | 89 | % Copyright (c) 2011--2014 Roman Garnett. 90 | 91 | function [chosen_ind, chosen_labels] = ... 92 | active_learning(problem, train_ind, observed_labels, label_oracle, ... 93 | selector, query_strategy, callback) 94 | 95 | % set verbose to false if not defined 96 | verbose = isfield(problem, 'verbose') && problem.verbose; 97 | 98 | chosen_ind = []; 99 | chosen_labels = []; 100 | 101 | % store number of initial training points (this can be used to track 102 | % the number of points selected thus far) 103 | problem.num_initial = numel(train_ind); 104 | 105 | for i = 1:problem.num_queries 106 | if (verbose) 107 | tic; 108 | fprintf('point %i:', i); 109 | end 110 | 111 | % get list of points to consider for querying this round 112 | test_ind = selector(problem, train_ind, observed_labels); 113 | if (verbose) 114 | fprintf(' %i points for consideration ... ', numel(test_ind)); 115 | end 116 | 117 | % end early if no points returned from selector 118 | if (isempty(test_ind)) 119 | if (verbose) 120 | fprintf('\n'); 121 | end 122 | warning('active_learning:no_points_selected', ... 123 | ['after %i steps, no points were selected. ' ... 124 | 'Ending run early!'], i); 125 | 126 | return; 127 | end 128 | 129 | % shortcut if only one point available 130 | if (numel(test_ind) == 1) 131 | this_chosen_ind = test_ind; 132 | else 133 | % select location(s) of next observation(s) from the given list 134 | this_chosen_ind = ... 135 | query_strategy(problem, train_ind, observed_labels, test_ind); 136 | end 137 | 138 | % observe label(s) at chosen location(s) 139 | this_chosen_labels = ... 
140 | label_oracle(problem, train_ind, observed_labels, this_chosen_ind); 141 | 142 | % update lists with new observation(s) 143 | chosen_ind = [chosen_ind; this_chosen_ind]; 144 | train_ind = [train_ind; this_chosen_ind]; 145 | 146 | chosen_labels = [chosen_labels; this_chosen_labels]; 147 | observed_labels = [observed_labels; this_chosen_labels]; 148 | if (verbose) 149 | num_observations = numel(this_chosen_ind); 150 | observation_format_string = repmat('%i ', [1, num_observations]); 151 | observation_format_string = observation_format_string(1:(end - 1)); 152 | 153 | label_format_string = repmat('%i/', [1, problem.num_classes]); 154 | label_format_string = label_format_string(1:(end - 1)); 155 | 156 | fprintf(sprintf('done. Point chosen: %s (label: %s), took: %%.2fs. Cumulative label totals: [%s].\n', ... 157 | observation_format_string, ... 158 | observation_format_string, ... 159 | label_format_string), ... 160 | this_chosen_ind, ... 161 | this_chosen_labels, ... 162 | toc, ... 163 | accumarray(chosen_labels, 1, [problem.num_classes, 1])); 164 | end 165 | 166 | % call callback, if defined 167 | if (nargin > 6) 168 | callback(problem, train_ind, observed_labels); 169 | end 170 | end 171 | 172 | end -------------------------------------------------------------------------------- /label_oracles/bernoulli_oracle.m: -------------------------------------------------------------------------------- 1 | % BERNOULLI_ORACLE Bernoulli oracle with given success probabilities. 2 | % 3 | % This provides a label oracle that, conditioned on queried point(s), 4 | % samples labels independently from a Bernoulli with given success 5 | % probability. Here membership to class 1 is treated as "success." 6 | % 7 | % Usage: 8 | % 9 | % label = bernoulli_oracle(problem, train_ind, observed_labels, ... 
10 | % query_ind, probabilities) 11 | % 12 | % Inputs: 13 | % 14 | % problem: a struct describing the problem, containing the 15 | % field: 16 | % 17 | % points: an (n x d) data matrix for the available points 18 | % 19 | % train_ind: a list of indices into problem.points indicating 20 | % the thus-far observed points 21 | % observed_labels: a list of labels corresponding to the 22 | % observations in train_ind 23 | % 24 | % Note: the above inputs, part of the standard 25 | % label oracle API, are ignored by 26 | % bernoulli_oracle. If desired, for standalone use 27 | % they can be replaced by empty matrices. 28 | % 29 | % query_ind: an index into problem.points specifying the 30 | % point(s) to be queried 31 | % probabilities: a length-n vector of success probabilities 32 | % corresponding to the points in problem.points 33 | % 34 | % Output: 35 | % 36 | % label: a list of integers between 1 and problem.num_classes 37 | % indicating the observed label(s) 38 | % 39 | % See also LABEL_ORACLES, MULTINOMIAL_ORACLE. 40 | 41 | % Copyright (c) 2013--2016 Roman Garnett. 42 | 43 | function label = bernoulli_oracle(~, ~, ~, query_ind, probabilities) 44 | 45 | label = 1 + (rand(size(query_ind(:))) > probabilities(query_ind)); 46 | 47 | end -------------------------------------------------------------------------------- /label_oracles/label_oracles.m: -------------------------------------------------------------------------------- 1 | % Label oracles are functions that, given a set of point(s) chosen to 2 | % be queried, return a list of corresponding label(s). In general, 3 | % they need not be deterministic, which is especially interesting when 4 | % points can be queried multiple times. 
5 | % 6 | % Label oracles must satisfy the following interface: 7 | % 8 | % label = label_oracle(problem, train_ind, observed_labels, query_ind) 9 | % 10 | % Inputs: 11 | % 12 | % problem: a struct describing the problem, containing the 13 | % fields: 14 | % 15 | % points: an (n x d) data matrix for the available points 16 | % num_classes: the number of classes 17 | % 18 | % train_ind: a list of indices into problem.points indicating 19 | % the thus-far observed points 20 | % observed_labels: a list of labels corresponding to the 21 | % observations in train_ind 22 | % query_ind: an index into problem.points specifying the 23 | % point(s) to be queried 24 | % 25 | % Output: 26 | % 27 | % label: a list of integers between 1 and problem.num_classes 28 | % indicating the observed label(s) 29 | % 30 | % The following general-purpose label oracles are provided in this 31 | % toolbox: 32 | % 33 | % lookup_oracle: a trivial lookup-table label oracle given a 34 | % fixed list of ground-truth labels 35 | % bernoulli_oracle: a label oracle that, conditioned on the 36 | % queried point(s), samples labels independently 37 | % from a Bernoulli distribution with given 38 | % success probability 39 | % multinomial_oracle: a label oracle that, conditioned on the 40 | % queried point(s), samples labels independently 41 | % from a multinomial distribution with given 42 | % success probabilities 43 | % 44 | % For convenience, the function get_label_oracle is provided for 45 | % easily and concisely constructing function handles to label oracles 46 | % for use, e.g., in active_learning.m. 47 | 48 | % Copyright (c) 2014--2016 Roman Garnett. 49 | -------------------------------------------------------------------------------- /label_oracles/lookup_oracle.m: -------------------------------------------------------------------------------- 1 | % LOOKUP_ORACLE trivial lookup-table oracle with fixed labels. 2 | % 3 | % This provides a trivial lookup-table label oracle. 
Given query 4 | % point(s), returns the corresponding label(s) from a given list of 5 | % fixed ground-truth labels. 6 | % 7 | % Usage: 8 | % 9 | % label = lookup_oracle(problem, train_ind, observed_labels, ... 10 | % query_ind, labels) 11 | % 12 | % Inputs: 13 | % 14 | % problem: a struct describing the problem, containing the 15 | % fields: 16 | % 17 | % points: an (n x d) data matrix for the available points 18 | % num_classes: the number of classes 19 | % 20 | % train_ind: a list of indices into problem.points indicating 21 | % the thus-far observed points 22 | % observed_labels: a list of labels corresponding to the 23 | % observations in train_ind 24 | % 25 | % Note: the above inputs, part of the standard 26 | % label oracle API, are ignored by lookup_oracle. If 27 | % desired, for standalone use they can be replaced by 28 | % empty matrices. 29 | % 30 | % query_ind: an index into problem.points specifying the 31 | % point(s) to be queried 32 | % labels: a length-n vector of ground-truth class labels 33 | % for each point in problem.points 34 | % 35 | % Output: 36 | % 37 | % label: a list of integers between 1 and problem.num_classes 38 | % indicating the observed label(s) 39 | % 40 | % See also LABEL_ORACLES. 41 | 42 | % Copyright (c) 2013--2016 Roman Garnett. 43 | 44 | function label = lookup_oracle(~, ~, ~, query_ind, labels) 45 | 46 | label = labels(query_ind); 47 | 48 | end -------------------------------------------------------------------------------- /label_oracles/multinomial_oracle.m: -------------------------------------------------------------------------------- 1 | % MULTINOMIAL_ORACLE multinomial oracle with given probabilities. 2 | % 3 | % This provides a label oracle that, conditioned on queried point(s), 4 | % samples labels independently from a multinomial with given marginal 5 | % probabilities. 
6 | % 7 | % Usage: 8 | % 9 | % label = multinomial_oracle(problem, train_ind, observed_labels, 10 | % query_ind, probabilities) 11 | % 12 | % Inputs: 13 | % 14 | % problem: a struct describing the problem, containing the 15 | % fields: 16 | % 17 | % points: an (n x d) data matrix for the available points 18 | % num_classes: the number of classes 19 | % 20 | % train_ind: a list of indices into problem.points indicating 21 | % the thus-far observed points 22 | % observed_labels: a list of labels corresponding to the 23 | % observations in train_ind 24 | % 25 | % Note: the above inputs, part of the standard 26 | % label oracle API, are ignored by 27 | % multinomial_oracle. If desired, for standalone 28 | % use they can be replaced by empty matrices. 29 | % 30 | % query_ind: an index into problem.points specifying the 31 | % point(s) to be queried 32 | % probabilities: an (n x problem.num_classes) matrix of 33 | % class-membership probabilities corresponding to 34 | % the points in problem.points 35 | % 36 | % Output: 37 | % 38 | % label: a list of integers between 1 and problem.num_classes 39 | % indicating the observed label(s) 40 | % 41 | % See also LABEL_ORACLES, BERNOULLI_ORACLE. 42 | 43 | % Copyright (c) 2013--2016 Roman Garnett. 44 | 45 | function label = multinomial_oracle(~, ~, ~, query_ind, probabilities) 46 | 47 | label = 1 + sum(bsxfun(@gt, rand(size(query_ind(:))), ... 48 | cumsum(probabilities(query_ind, :), 2)), 2); 49 | 50 | end -------------------------------------------------------------------------------- /label_oracles/probabilistic_oracle.m: -------------------------------------------------------------------------------- 1 | % PROBABILISTIC_ORACLE multinomial oracle with model probabilities. 2 | % 3 | % This provides a label oracle that, conditioned on queried point(s), 4 | % samples labels independently from a multinomial with marginal 5 | % probabilities computed from a given model. 
6 | % 7 | % Usage: 8 | % 9 | % label = probabilistic_oracle(problem, train_ind, observed_labels, 10 | % query_ind, model) 11 | % 12 | % Inputs: 13 | % 14 | % problem: a struct describing the problem, containing the 15 | % fields: 16 | % 17 | % points: an (n x d) data matrix for the available points 18 | % num_classes: the number of classes 19 | % 20 | % train_ind: a list of indices into problem.points indicating 21 | % the thus-far observed points 22 | % observed_labels: a list of labels corresponding to the 23 | % observations in train_ind 24 | % query_ind: an index into problem.points specifying the 25 | % point(s) to be queried 26 | % model: a function handle to a model to use 27 | % 28 | % Output: 29 | % 30 | % label: a list of integers between 1 and problem.num_classes 31 | % indicating the observed label(s) 32 | % 33 | % See also LABEL_ORACLES, MULTINOMIAL_ORACLE, MODELS. 34 | 35 | % Copyright (c) 2016 Roman Garnett. 36 | 37 | function label = probabilistic_oracle(problem, train_ind, observed_labels, ... 38 | query_ind, model) 39 | 40 | probabilities = model(problem, train_ind, observed_labels, query_ind); 41 | 42 | label = 1 + sum(bsxfun(@gt, rand(size(query_ind(:))), ... 43 | cumsum(probabilities, 2)), 2); 44 | 45 | end -------------------------------------------------------------------------------- /models/cheating_model.m: -------------------------------------------------------------------------------- 1 | % CHEATING_MODEL a "cheating" model that queries a label oracle. 2 | % 3 | % This model always predicts a delta distribution for each test point 4 | % with mass on the output of a given label oracle. This can be useful 5 | % for comparing against a theoretically optimal algorithm. 6 | % 7 | % Usage: 8 | % 9 | % probabilities = cheating_model(problem, train_ind, observed_labels, ... 
10 | % test_ind, label_oracle) 11 | % 12 | % Inputs: 13 | % 14 | % problem: a struct describing the problem, which must at 15 | % least contain the field: 16 | % 17 | % num_classes: the number of classes 18 | % 19 | % as well as any fields that may be required by 20 | % the label oracle below. 21 | % 22 | % train_ind: a list of indices into problem.points indicating 23 | % the thus-far observed points 24 | % 25 | % Note: this input, part of the standard 26 | % probability model API, is ignored by 27 | % cheating_model. If desired, for standalone use it 28 | % can be replaced by an empty matrix. 29 | % 30 | % observed_labels: a list of labels corresponding to the 31 | % observations in train_ind 32 | % 33 | % Note: this input, part of the standard 34 | % probability model API, is ignored by 35 | % cheating_model. If desired, for standalone use it 36 | % can be replaced by an empty matrix. 37 | % 38 | % test_ind: a list of indices into problem.points indicating 39 | % the test points 40 | % label_oracle: a handle to a label oracle, which takes an index 41 | % into problem.points and returns a label 42 | % 43 | % Output: 44 | % 45 | % probabilities: a matrix of posterior probabilities. The ith column 46 | % gives the posterior probabilities p(y = i | x, D) 47 | % for each of the indicated test points; here 48 | % p(y = i | x, D) = 1 if the label oracle outputs i 49 | % for y; otherwise 0. 50 | % 51 | % See also MODELS, LABEL_ORACLES. 52 | 53 | % Copyright (c) 2012--2014 Roman Garnett. 
54 | 55 | function probabilities = cheating_model(problem, ~, ~, test_ind, label_oracle) 56 | 57 | num_test = numel(test_ind); 58 | 59 | probabilities = zeros(num_test, problem.num_classes); 60 | for i = 1:num_test 61 | probabilities(i, label_oracle(problem, test_ind(i))) = 1; 62 | end 63 | 64 | end -------------------------------------------------------------------------------- /models/ensemble.m: -------------------------------------------------------------------------------- 1 | % ENSEMBLE makes predictions using a weighted ensemble of models. 2 | % 3 | % This is an implementation of a weighted ensemble of models. Let M = 4 | % {M_j} be a set of probability models, and let a point x and a set of 5 | % observations D = (X, Y) be given. The ensemble probabilities are 6 | % given by 7 | % 8 | % p(y = i | x, D) = \sum_j w_j(D) p(y = i | x, D, M_j) 9 | % / \sum_j w_j(D), 10 | % 11 | % where w(D) is a (possibly data-dependent) weight vector of length 12 | % |M|. 13 | % 14 | % This implementation also supports so-called "hard" voting by the 15 | % ensemble members, where in the above the posterior probabilities 16 | % 17 | % p(y | x, D, M_j) 18 | % 19 | % are replaced by a Kronecker \delta distribution on the 20 | % most-confident label according to model M_j: 21 | % 22 | % \delta[ \argmax_i p(y = i | x, D, M_j) ]. 23 | % 24 | % Usage: 25 | % 26 | % probabilities = ensemble(problem, train_ind, observed_labels, ... 
27 | % test_ind, models, weights, hard_votes) 28 | % 29 | % Required Inputs: 30 | % 31 | % problem: a struct describing the problem, containing fields: 32 | % 33 | % points: an (n x d) data matrix for the available points 34 | % num_classes: the number of classes 35 | % 36 | % train_ind: a list of indices into problem.points indicating 37 | % the thus-far observed points 38 | % observed_labels: a list of labels corresponding to the 39 | % observations in train_ind 40 | % test_ind: a list of indices into problem.points indicating 41 | % the points eligible for observation 42 | % models: a cell array of handles to probability models 43 | % 44 | % Optional Inputs: 45 | % 46 | % weights: either a length-|M| vector of model weights or a 47 | % function handle returning such a vector (see note 48 | % below for details). 49 | % (default: ones(1, |M|) / |M|) 50 | % hard_votes: a boolean indicating whether to use "hard" voting 51 | % (default: false) 52 | % 53 | % Output: 54 | % 55 | % probabilities: a matrix of posterior probabilities. The ith 56 | % column gives p(y = i | x, D) for each of the 57 | % indicated test points. 58 | % 59 | % Note on Model Weights: 60 | % 61 | % This implementation supports both fixed and data-dependent model 62 | % weights w(D). The latter might be useful to, for example, weight 63 | % ensemble members by an estimate of their accuracy or by an estimate 64 | % of their posterior probabilities in a Bayesian fashion. 65 | % Data-dependent weights are implemented by providing a function 66 | % handle to a weight function which will be called as 67 | % 68 | % weights = weight_function(problem, train_ind, observed_labels, models), 69 | % 70 | % and must return a length-|M| vector of weights corresponding to the 71 | % models in models. 72 | % 73 | % See also MODELS, QUERY_BY_COMMITTEE. 74 | 75 | % Copyright (c) 2014 Roman Garnett. 76 | 77 | function probabilities = ensemble(problem, train_ind, observed_labels, ... 
78 | test_ind, models, weights, hard_votes) 79 | 80 | num_test = numel(test_ind); 81 | num_models = numel(models); 82 | 83 | % default to uniform model weights 84 | if ((nargin < 6) || isempty(weights)) 85 | weights = (1 / num_models) + zeros(1, num_models); 86 | end 87 | 88 | % default to "soft" votes 89 | if (nargin < 7) 90 | hard_votes = false; 91 | end 92 | 93 | % determine weight vector if weight function is provided 94 | if (isa(weights, 'function_handle')) 95 | weights = weights(problem, train_ind, observed_labels, models); 96 | end 97 | 98 | votes = zeros(num_test, problem.num_classes); 99 | 100 | for i = 1:num_models 101 | probabilities = models{i}(problem, train_ind, observed_labels, test_ind); 102 | 103 | if (hard_votes) 104 | % "hard" votes: each model votes only for its most-confident 105 | % prediction 106 | [~, this_votes] = max(probabilities, [], 2); 107 | 108 | votes = votes + weights(i) * ... 109 | accumarray([(1:num_test)', this_votes], 1, ... 110 | [num_test, problem.num_classes]); 111 | 112 | else 113 | % "soft" votes: each model votes for each label with a weight equal to 114 | % its posterior probability 115 | votes = votes + weights(i) * probabilities; 116 | end 117 | end 118 | 119 | % normalize probabilities 120 | probabilities = bsxfun(@times, votes, 1 ./ sum(votes, 2)); 121 | 122 | end -------------------------------------------------------------------------------- /models/gaussian_process_model.m: -------------------------------------------------------------------------------- 1 | % GAUSSIAN_PROCESS_MODEL a binary Gaussian process classifier. 2 | % 3 | % This is an implementation of a Gaussian process (binary) 4 | % classifier. Requires the GPML toolkit available here: 5 | % 6 | % http://www.gaussianprocess.org/gpml/code/matlab/doc 7 | % 8 | % Usage: 9 | % 10 | % probabilities = gaussian_process_model(problem, train_ind, ... 11 | % observed_labels, test_ind, hyperparameters, inference_method, ... 
12 | % mean_function, covariance_function, likelihood) 13 | % 14 | % Inputs: 15 | % 16 | % problem: a struct describing the problem, which must 17 | % at least contain the field: 18 | % 19 | % points: an (n x d) data matrix for the available 20 | % points 21 | % 22 | % train_ind: a list of indices into problem.points 23 | % indicating the thus-far observed points 24 | % observed_labels: a list of labels corresponding to the 25 | % observations in train_ind 26 | % test_ind: a list of indices into problem.points indicating 27 | % the test points 28 | % hyperparameters: a GPML hyperparameter structure 29 | % inference_method: a GPML inference method 30 | % mean_function: a GPML mean function 31 | % covariance_function: a GPML covariance function 32 | % likelihood: a GPML likelihood 33 | % 34 | % Output: 35 | % 36 | % probabilities: a matrix of posterior probabilities. The first 37 | % column gives p(y = 1 | x, D) for each of the 38 | % indicated test points; the second column gives 39 | % p(y \neq 1 | x, D). 40 | % 41 | % See also MODELS, GP. 42 | 43 | % Copyright (c) 2011--2016 Roman Garnett. 44 | 45 | function probabilities = gaussian_process_model(problem, train_ind, ... 46 | observed_labels, test_ind, hyperparameters, inference_method, ... 47 | mean_function, covariance_function, likelihood) 48 | 49 | % transform labels to match what GPML expects 50 | observed_labels(observed_labels ~= 1) = -1; 51 | 52 | num_test = numel(test_ind); 53 | 54 | [~, ~, ~, ~, log_probabilities] = gp(hyperparameters, inference_method, ... 55 | mean_function, covariance_function, likelihood, ... 56 | problem.points(train_ind, :), observed_labels, ... 
57 | problem.points(test_ind, :), ones(num_test, 1)); 58 | 59 | probabilities = exp(log_probabilities); 60 | 61 | % return probabilities for "class 1" and "not class 1" 62 | probabilities = [probabilities, (1 - probabilities)]; 63 | 64 | end 65 | -------------------------------------------------------------------------------- /models/knn_model.m: -------------------------------------------------------------------------------- 1 | % KNN_MODEL weighted k-NN classifier. 2 | % 3 | % Suppose the problem has n points, and W is an (n x n) matrix of 4 | % pairwise weights. We assume the marginal label distribution at a 5 | % point x is a categorical distribution with probability vector p(x): 6 | % 7 | % p(y | x) = Categorical(p(x)). 8 | % 9 | % We place identical Dirichlet priors on the p(x) vectors with 10 | % hyperparameter vector \alpha: 11 | % 12 | % p(p(x) | x, \alpha) = Dirichlet(\alpha). 13 | % 14 | % Finally, given observations D = {(X, Y)} and a point x, we update 15 | % the posterior probability vector p(x) by accumulating weighted 16 | % counts of the observations near x (where "near" is defined by the 17 | % weight matrix W): 18 | % 19 | % p(p(x) | x, D, \alpha) = Dirichlet(\alpha + C(x)), 20 | % 21 | % where 22 | % 23 | % C_i(x) = \sum_{x' \in D, y' = i} W(x, x'). 24 | % 25 | % Now, given x and D, we output the Categorical distribution with 26 | % the posterior mean of p(x) given D: 27 | % 28 | % p(y | x, D, \alpha) = Categorical( E[p(x) | x, D, \alpha] ). 29 | % 30 | % Usage: 31 | % 32 | % probabilities = knn_model(problem, train_ind, observed_labels, ... 
33 | % test_ind, weights, alpha) 34 | % 35 | % Inputs: 36 | % 37 | % problem: a struct describing the problem, containing the 38 | % fields: 39 | % 40 | % points: an (n x d) data matrix for the available points 41 | % num_classes: the number of classes 42 | % 43 | % train_ind: a list of indices into problem.points indicating 44 | % the thus-far observed points 45 | % observed_labels: a list of labels corresponding to the 46 | % observations in train_ind 47 | % test_ind: a list of indices into problem.points indicating 48 | % the test points 49 | % weights: an (n x n) matrix of weights 50 | % alpha: the hyperparameter vector \alpha 51 | % (1 x problem.num_classes) 52 | % 53 | % Output: 54 | % 55 | % probabilities: a matrix of posterior probabilities. The ith 56 | % column gives p(y = i | x, D) for each of the 57 | % indicated test points. 58 | % 59 | % See also MODELS. 60 | 61 | % Copyright (c) 2011--2014 Roman Garnett. 62 | 63 | function probabilities = knn_model(problem, train_ind, observed_labels, ... 64 | test_ind, weights, alpha) 65 | 66 | num_test = numel(test_ind); 67 | probabilities = zeros(num_test, problem.num_classes); 68 | 69 | % accumulate weighted number of successes for each class 70 | for i = 1:problem.num_classes 71 | probabilities(:, i) = alpha(i) + ... 72 | sum(weights(test_ind, train_ind(observed_labels == i)), 2); 73 | end 74 | 75 | % normalize probabilities 76 | probabilities = bsxfun(@times, probabilities, 1 ./ sum(probabilities, 2)); 77 | 78 | end 79 | -------------------------------------------------------------------------------- /models/label_propagation_model.m: -------------------------------------------------------------------------------- 1 | % LABEL_PROPAGATION_MODEL partially absorbing label propagation. 2 | % 3 | % This is an implementation of the partially absorbing label 4 | % propagation algorithm described in: 5 | % 6 | % Neumann, M., Garnett, R., and Kersting, K. 
Coinciding Walk 7 | % Kernels: Parallel Absorbing Random Walks for Learning with Graphs 8 | % and Few Labels. Proceedings of the 5th Annual Asian 9 | % Conference on Machine Learning (ACML 2013). 10 | % 11 | % Usage: 12 | % 13 | % probabilities = label_propagation_model(problem, train_ind, ... 14 | % observed_labels, test_ind, A, varargin) 15 | % 16 | % Required inputs: 17 | % 18 | % problem: a struct describing the problem, which must at 19 | % least contain the field: 20 | % 21 | % num_classes: the number of classes 22 | % 23 | % train_ind: a list of indices into A indicating the thus-far 24 | % observed nodes 25 | % observed_labels: a list of labels corresponding to the 26 | % observations in train_ind 27 | % test_ind: a list of indices into A indicating the test 28 | % nodes 29 | % A: a weighted adjacency matrix for the desired graph 30 | % containing transition probabilities. A should be 31 | % row-normalized. 32 | % 33 | % Optional name/value-pair arguments specified after required inputs: 34 | % 35 | % 'num_iterations': the number of label propagation iterations to 36 | % perform (default: 200) 37 | % 'alpha': the absorption parameter to use in [0, 1] 38 | % (default: 1, corresponds to standard label 39 | % propagation) 40 | % 'use_prior': a boolean indicating whether to use the 41 | % empirical distribution on the training points 42 | % as the prior (true) or a uniform prior 43 | % (false) (default: false) 44 | % 'pseudocount': if use_prior is set to true, a per-class 45 | % pseudocount can also be specified (default: 0.1) 46 | % 47 | % Output: 48 | % 49 | % probabilities: a matrix of posterior probabilities. The ith 50 | % column gives p(y = i | x, D) for each of the 51 | % indicated test points. 52 | % 53 | % See also MODELS, GRAPH_WALK_SELECTOR. 54 | 55 | % Copyright (c) 2014 Roman Garnett. 56 | 57 | function probabilities = label_propagation_model(problem, train_ind, ... 
58 | observed_labels, test_ind, A, varargin) 59 | 60 | % parse optional inputs 61 | options = inputParser; 62 | 63 | options.addParamValue('num_iterations', 200, ... 64 | @(x) (isscalar(x) && (x >= 0))); 65 | options.addParamValue('alpha', 1, ... 66 | @(x) (isscalar(x) && (x >= 0) && (x <= 1))); 67 | options.addParamValue('use_prior', false, ... 68 | @(x) (islogical(x) && (numel(x) == 1))); 69 | options.addParamValue('pseudocount', 0.1, ... 70 | @(x) (isscalar(x) && (x > 0))); 71 | 72 | options.parse(varargin{:}); 73 | options = options.Results; 74 | 75 | % row-normalize A if it is not already row-normalized 76 | if (any(sum(A, 2) ~= 1)) 77 | A = bsxfun(@times, A, 1 ./ sum(A, 2)); 78 | end 79 | 80 | num_nodes = size(A, 1); 81 | num_classes = problem.num_classes; 82 | num_train = numel(train_ind); 83 | 84 | if (options.use_prior) 85 | prior = options.pseudocount + ... 86 | accumarray([ones(num_train, 1), observed_labels(:)], 1, [1, num_classes]); 87 | prior = prior * (1 ./ sum(prior)); 88 | else 89 | prior = ones(1, num_classes) * (1 / num_classes); 90 | end 91 | 92 | % expand graph with pseudonodes corresponding to the classes 93 | num_expanded_nodes = num_nodes + num_classes; 94 | 95 | A = [A, sparse(num_nodes, num_classes); ... 96 | sparse(num_classes, num_expanded_nodes)]; 97 | 98 | % reduce weight of edges leaving training nodes by a factor of 99 | % (1 - alpha) 100 | A(train_ind, :) = (1 - options.alpha) * A(train_ind, :); 101 | 102 | % add edges from training nodes to label nodes with weight alpha 103 | A = A + sparse(train_ind, num_nodes + observed_labels, options.alpha, ... 104 | num_expanded_nodes, num_expanded_nodes); 105 | 106 | % add self loops on label nodes 107 | pseudo_train_ind = (num_nodes + 1):num_expanded_nodes; 108 | A(pseudo_train_ind, pseudo_train_ind) = speye(num_classes); 109 | 110 | % begin with prior on all nodes 111 | probabilities = repmat(prior, [num_nodes + num_classes, 1]); 112 | 113 | % fill in known training labels 114 | probabilities(train_ind, :) = ... 
115 | accumarray([(1:num_train)', observed_labels], 1, [num_train, num_classes]); 116 | 117 | % add known labels for label nodes 118 | probabilities(pseudo_train_ind, :) = eye(num_classes); 119 | 120 | for i = 1:options.num_iterations 121 | % propagate labels 122 | probabilities = A * probabilities; 123 | end 124 | 125 | probabilities = probabilities(test_ind, :); 126 | end -------------------------------------------------------------------------------- /models/model_memory_wrapper.m: -------------------------------------------------------------------------------- 1 | % MODEL_MEMORY_WRAPPER memoizes a model: if called again with the same 2 | % (train_ind, observed_labels, test_ind), the cached probabilities are 3 | % returned rather than recomputed. 4 | 5 | function probabilities = model_memory_wrapper(problem, train_ind, ... 6 | observed_labels, test_ind, model) 7 | 8 | persistent last_train_ind last_observed_labels last_test_ind last_probabilities; 9 | 10 | if (isequal(train_ind, last_train_ind) && ... 11 | isequal(observed_labels, last_observed_labels) && ... 12 | isequal(test_ind, last_test_ind)) 13 | 14 | probabilities = last_probabilities; 15 | return; 16 | end 17 | 18 | probabilities = model(problem, train_ind, observed_labels, test_ind); 19 | 20 | last_train_ind = train_ind; 21 | last_observed_labels = observed_labels; 22 | last_test_ind = test_ind; 23 | last_probabilities = probabilities; 24 | 25 | end -------------------------------------------------------------------------------- /models/models.m: -------------------------------------------------------------------------------- 1 | % A model calculates the posterior class-membership probabilities for a 2 | % selected set of test points given the current labeled training 3 | % data. 
4 | % 5 | % Models must satisfy the following interface: 6 | % 7 | % probabilities = model(problem, train_ind, observed_labels, test_ind) 8 | % 9 | % Inputs: 10 | % 11 | % problem: a struct describing the problem, containing fields: 12 | % 13 | % points: an (n x d) data matrix for the available points 14 | % num_classes: the number of classes 15 | % 16 | % train_ind: a list of indices into problem.points indicating 17 | % the thus-far observed points 18 | % observed_labels: a list of labels corresponding to the 19 | % observations in train_ind 20 | % test_ind: a list of indices into problem.points 21 | % indicating the test points 22 | % 23 | % Output: 24 | % 25 | % probabilities: a matrix of posterior probabilities. The ith 26 | % column gives p(y = i | x, D) for each of the 27 | % indicated test points. 28 | % 29 | % The following models are provided in this toolbox: 30 | % 31 | % cheating_model: a "cheating" model that queries a 32 | % label oracle 33 | % gaussian_process_model: a binary Gaussian process classifier 34 | % knn_model: a weighted k-NN model 35 | % label_propagation_model: partially absorbing label propagation 36 | % random_forest_model: a random forest model 37 | 38 | % Copyright (c) 2011--2016 Roman Garnett. 39 | -------------------------------------------------------------------------------- /models/random_forest_model.m: -------------------------------------------------------------------------------- 1 | % RANDOM_FOREST_MODEL a random forest classifier. 2 | % 3 | % Requires the TreeBagger class in the MATLAB Statistics Toolbox. 4 | % 5 | % Usage: 6 | % 7 | % probabilities = random_forest_model(problem, train_ind, ... 
8 | % observed_labels, test_ind, num_trees, options) 9 | % 10 | % Inputs: 11 | % 12 | % problem: a struct describing the problem, which must at 13 | % least contain the field: 14 | % 15 | % points: an (n x d) data matrix for the available points 16 | % 17 | % train_ind: a list of indices into problem.points indicating 18 | % the thus-far observed points 19 | % observed_labels: a list of labels corresponding to the 20 | % observations in train_ind 21 | % test_ind: a list of indices into problem.points indicating 22 | % the test points 23 | % num_trees: the number of trees to build in the random forest 24 | % options: (optional) additional options to pass into 25 | % TreeBagger for training (default: []) 26 | % 27 | % Output: 28 | % 29 | % probabilities: a matrix of posterior probabilities. The ith 30 | % column gives p(y = i | x, D) for each of the 31 | % indicated test points. 32 | % 33 | % See also TREEBAGGER, MODELS. 34 | 35 | % Copyright (c) 2011--2016 Roman Garnett. 36 | 37 | function probabilities = random_forest_model(problem, train_ind, ... 38 | observed_labels, test_ind, num_trees, options) 39 | 40 | if (nargin < 6) 41 | options = []; 42 | end 43 | 44 | model = TreeBagger(num_trees, problem.points(train_ind, :), observed_labels, ... 45 | 'method', 'classification', ... 46 | 'options', options); 47 | 48 | [~, probabilities] = predict(model, problem.points(test_ind, :)); 49 | 50 | end 51 | -------------------------------------------------------------------------------- /other/get_label_oracle.m: -------------------------------------------------------------------------------- 1 | % GET_LABEL_ORACLE creates a function handle to a label oracle. 2 | % 3 | % This is a convenience function for easily creating a function handle 4 | % to a label oracle. Given a handle to a label oracle and its 5 | % additional arguments (if any), returns a function handle for use in, 6 | % e.g., active_learning.m. 
7 | % 8 | % Example: 9 | % 10 | % label_oracle = get_label_oracle(@lookup_oracle, labels); 11 | % 12 | % returns the following function handle: 13 | % 14 | % @(problem, train_ind, observed_labels, query_ind) ... 15 | % lookup_oracle(problem, train_ind, observed_labels, query_ind, ... 16 | % labels) 17 | % 18 | % This is primarily for improving code readability by avoiding 19 | % repeated verbose function handle declarations. 20 | % 21 | % Usage: 22 | % 23 | % label_oracle = get_label_oracle(label_oracle, varargin) 24 | % 25 | % Inputs: 26 | % 27 | % label_oracle: a function handle to the desired label oracle 28 | % varargin: any additional inputs to be bound to the label 29 | % oracle beyond those required by the standard 30 | % interface (problem, train_ind, observed_labels, 31 | % query_ind) 32 | % 33 | % Output: 34 | % 35 | % label_oracle: a function handle to the desired label oracle for 36 | % use in active_learning 37 | % 38 | % See also LABEL_ORACLES. 39 | 40 | % Copyright (c) 2013--2016 Roman Garnett. 41 | 42 | function label_oracle = get_label_oracle(label_oracle, varargin) 43 | 44 | label_oracle = @(problem, train_ind, observed_labels, query_ind) ... 45 | label_oracle(problem, train_ind, observed_labels, query_ind, ... 46 | varargin{:}); 47 | 48 | end -------------------------------------------------------------------------------- /other/get_model.m: -------------------------------------------------------------------------------- 1 | % GET_MODEL creates a function handle to a probability model. 2 | % 3 | % This is a convenience function for easily creating a function handle 4 | % to a model. Given a handle to a model and its additional arguments 5 | % (if any), returns a function handle for use in, e.g., 6 | % active_learning.m. 7 | % 8 | % Example: 9 | % 10 | % model = get_model(@knn_model, weights, prior_alpha, prior_beta); 11 | % 12 | % returns the following function handle: 13 | % 14 | % @(problem, train_ind, observed_labels, test_ind) ... 15 | % knn_model(problem, train_ind, observed_labels, test_ind, ... 
16 | % weights, prior_alpha, prior_beta) 17 | % 18 | % This is primarily for improving code readability by avoiding 19 | % repeated verbose function handle declarations. 20 | % 21 | % Usage: 22 | % 23 | % model = get_model(model, varargin) 24 | % 25 | % Inputs: 26 | % 27 | % model: a function handle to the desired model 28 | % varargin: any additional inputs to be bound to the model beyond 29 | % those required by the standard interface (problem, 30 | % train_ind, observed_labels, test_ind) 31 | % 32 | % Output: 33 | % 34 | % model: a function handle to the desired model for use in 35 | % active_learning 36 | % 37 | % See also MODELS. 38 | 39 | % Copyright (c) 2013--2014 Roman Garnett. 40 | 41 | function model = get_model(model, varargin) 42 | 43 | model = @(problem, train_ind, observed_labels, test_ind) ... 44 | model(problem, train_ind, observed_labels, test_ind, varargin{:}); 45 | 46 | end -------------------------------------------------------------------------------- /other/get_query_strategy.m: -------------------------------------------------------------------------------- 1 | % GET_QUERY_STRATEGY creates a function handle to a query strategy. 2 | % 3 | % This is a convenience function for easily creating a function handle 4 | % to a query strategy. Given a handle to a query strategy and its 5 | % additional arguments (if any), returns a function handle for use in, 6 | % e.g., active_learning.m. 7 | % 8 | % Example: 9 | % 10 | % query_strategy = get_query_strategy(@maximum_score, score_function); 11 | % 12 | % returns the following function handle: 13 | % 14 | % @(problem, train_ind, observed_labels, test_ind) ... 15 | % maximum_score(problem, train_ind, observed_labels, 16 | % test_ind, score_function) 17 | % 18 | % This is primarily for improving code readability by avoiding 19 | % repeated verbose function handle declarations. 
20 | % 21 | % Usage: 22 | % 23 | % query_strategy = get_query_strategy(query_strategy, varargin) 24 | % 25 | % Inputs: 26 | % 27 | % query_strategy: a function handle to the desired query strategy 28 | % varargin: any additional inputs to be bound to the query 29 | % strategy beyond those required by the standard 30 | % interface (problem, train_ind, observed_labels, 31 | % test_ind) 32 | % 33 | % Output: 34 | % 35 | % query_strategy: a function handle to the desired query strategy 36 | % for use in active_learning 37 | % 38 | % See also QUERY_STRATEGIES. 39 | 40 | % Copyright (c) 2013--2014 Roman Garnett. 41 | 42 | function query_strategy = get_query_strategy(query_strategy, varargin) 43 | 44 | query_strategy = @(problem, train_ind, observed_labels, test_ind) ... 45 | query_strategy(problem, train_ind, observed_labels, test_ind, ... 46 | varargin{:}); 47 | 48 | end -------------------------------------------------------------------------------- /other/get_score_function.m: -------------------------------------------------------------------------------- 1 | % GET_SCORE_FUNCTION creates a function handle to a score function. 2 | % 3 | % This is a convenience function for easily creating a function handle 4 | % to a score function. Given a handle to a score function and its 5 | % additional arguments (if any), returns a function handle for use in, 6 | % e.g., active_learning.m. 7 | % 8 | % Example: 9 | % 10 | % score_function = get_score_function(@expected_accuracy, model); 11 | % 12 | % returns the following function handle: 13 | % 14 | % @(problem, train_ind, observed_labels, test_ind) ... 15 | % expected_accuracy(problem, train_ind, observed_labels, 16 | % test_ind, model) 17 | % 18 | % This is primarily for improving code readability by avoiding 19 | % repeated verbose function handle declarations. 
20 | % 21 | % Usage: 22 | % 23 | % score_function = get_score_function(score_function, varargin) 24 | % 25 | % Inputs: 26 | % 27 | % score_function: a function handle to the desired score function 28 | % varargin: any additional inputs to be bound to the score 29 | % function beyond those required by the standard 30 | % interface (problem, train_ind, observed_labels, 31 | % test_ind) 32 | % 33 | % Output: 34 | % 35 | % score_function: a function handle to the desired score function 36 | % for use in active_learning 37 | % 38 | % See also SCORE_FUNCTIONS. 39 | 40 | % Copyright (c) 2013--2014 Roman Garnett. 41 | 42 | function score_function = get_score_function(score_function, varargin) 43 | 44 | score_function = @(problem, train_ind, observed_labels, test_ind) ... 45 | score_function(problem, train_ind, observed_labels, test_ind, ... 46 | varargin{:}); 47 | 48 | end -------------------------------------------------------------------------------- /other/get_selector.m: -------------------------------------------------------------------------------- 1 | % GET_SELECTOR creates a function handle to a selector. 2 | % 3 | % This is a convenience function for easily creating a function handle 4 | % to a selector. Given a handle to a selector and its additional 5 | % arguments (if any), returns a function handle for use in, e.g., 6 | % active_learning.m. 7 | % 8 | % Example: 9 | % 10 | % selector = get_selector(@random_selector, num_test); 11 | % 12 | % returns the following function handle: 13 | % 14 | % @(problem, train_ind, observed_labels) ... 15 | % random_selector(problem, train_ind, observed_labels, num_test) 16 | % 17 | % This is primarily for improving code readability by avoiding 18 | % repeated verbose function handle declarations. 
19 | % 20 | % Usage: 21 | % 22 | % selector = get_selector(selector, varargin) 23 | % 24 | % Inputs: 25 | % 26 | % selector: a function handle to the desired selector 27 | % varargin: any additional inputs to be bound to the selector beyond 28 | % those required by the standard interface (problem, 29 | % train_ind, observed_labels) 30 | % 31 | % Output: 32 | % 33 | % selector: a function handle to the desired selector for use in 34 | % active_learning 35 | % 36 | % See also SELECTORS. 37 | 38 | % Copyright (c) 2014 Roman Garnett. 39 | 40 | function selector = get_selector(selector, varargin) 41 | 42 | selector = @(problem, train_ind, observed_labels) ... 43 | selector(problem, train_ind, observed_labels, varargin{:}); 44 | 45 | end -------------------------------------------------------------------------------- /query_strategies/argmax.m: -------------------------------------------------------------------------------- 1 | % ARGMAX queries the point(s) that maximizes a score function. 2 | % 3 | % This is a trivial query strategy that calls a user-provided score 4 | % function on each of the points available for labeling and selects 5 | % the point(s) with the maximum score. 6 | % 7 | % Several popular score functions are included in this software 8 | % package; see score_functions.m for more information. 9 | % 10 | % Usage: 11 | % 12 | % query_ind = argmax(problem, train_ind, observed_labels, test_ind, ... 
13 | % score_function, num_points) 14 | % 15 | % Inputs: 16 | % 17 | % problem: a struct describing the problem, containing fields: 18 | % 19 | % points: an (n x d) data matrix for the available points 20 | % num_classes: the number of classes 21 | % num_queries: the number of queries to make 22 | % 23 | % train_ind: a list of indices into problem.points indicating 24 | % the thus-far observed points 25 | % observed_labels: a list of labels corresponding to the 26 | % observations in train_ind 27 | % test_ind: a list of indices into problem.points indicating 28 | % the points eligible for observation 29 | % score_function: a handle to a score function (see 30 | % score_functions.m for interface) 31 | % num_points: (optional) the number of points to return 32 | % (default: 1) 33 | % 34 | % Output: 35 | % 36 | % query_ind: an index into problem.points indicating the point(s) to 37 | % query next 38 | % 39 | % See also ARGMIN, SCORE_FUNCTIONS, QUERY_STRATEGIES. 40 | 41 | % Copyright (c) 2013--2014 Roman Garnett. 42 | 43 | function query_ind = argmax(problem, train_ind, observed_labels, ... 44 | test_ind, score_function, num_points) 45 | 46 | % by default query a single point 47 | if (nargin < 6) 48 | num_points = 1; 49 | end 50 | 51 | scores = score_function(problem, train_ind, observed_labels, test_ind); 52 | 53 | % only call sort if needed 54 | if (num_points == 1) 55 | [~, best_ind] = max(scores); 56 | else 57 | [~, best_ind] = sort(scores, 'descend'); 58 | best_ind = best_ind(1:num_points); 59 | end 60 | 61 | query_ind = test_ind(best_ind); 62 | 63 | end -------------------------------------------------------------------------------- /query_strategies/argmin.m: -------------------------------------------------------------------------------- 1 | % ARGMIN queries the point(s) that minimizes a score function. 
2 | % 3 | % This is a trivial query strategy that calls a user-provided score 4 | % function on each of the points available for labeling and selects 5 | % the point(s) with the minimum score. 6 | % 7 | % Several popular score functions are included in this software 8 | % package; see score_functions.m for more information. 9 | % 10 | % Usage: 11 | % 12 | % query_ind = argmin(problem, train_ind, observed_labels, test_ind, ... 13 | % score_function, num_points) 14 | % 15 | % Inputs: 16 | % 17 | % problem: a struct describing the problem, containing fields: 18 | % 19 | % points: an (n x d) data matrix for the available points 20 | % num_classes: the number of classes 21 | % num_queries: the number of queries to make 22 | % 23 | % train_ind: a list of indices into problem.points indicating 24 | % the thus-far observed points 25 | % observed_labels: a list of labels corresponding to the 26 | % observations in train_ind 27 | % test_ind: a list of indices into problem.points indicating 28 | % the points eligible for observation 29 | % score_function: a handle to a score function (see 30 | % score_functions.m for interface) 31 | % num_points: (optional) the number of points to return 32 | % (default: 1) 33 | % 34 | % Output: 35 | % 36 | % query_ind: an index into problem.points indicating the point(s) to 37 | % query next 38 | % 39 | % See also ARGMAX, SCORE_FUNCTIONS, QUERY_STRATEGIES. 40 | 41 | % Copyright (c) 2014 Roman Garnett. 42 | 43 | function query_ind = argmin(problem, train_ind, observed_labels, ... 
44 | test_ind, score_function, num_points) 45 | 46 | % by default query a single point 47 | if (nargin < 6) 48 | num_points = 1; 49 | end 50 | 51 | scores = score_function(problem, train_ind, observed_labels, test_ind); 52 | 53 | % only call sort if needed 54 | if (num_points == 1) 55 | [~, best_ind] = min(scores); 56 | else 57 | [~, best_ind] = sort(scores, 'ascend'); 58 | best_ind = best_ind(1:num_points); 59 | end 60 | 61 | query_ind = test_ind(best_ind); 62 | 63 | end -------------------------------------------------------------------------------- /query_strategies/expected_error_reduction.m: -------------------------------------------------------------------------------- 1 | % EXPECTED_ERROR_REDUCTION queries the point giving lowest expected error. 2 | % 3 | % This is an implementation of expected error reduction, a simple and 4 | % popular query strategy. Expected error reduction queries the point 5 | % that would result in the lowest expected error on the remaining 6 | % unlabeled points. Let a point x and a dataset D = (X, Y) be given. 7 | % Let \ell(D) be a loss function for the unlabeled points U given D; 8 | % here we support either the total 0/1 loss: 9 | % 10 | % \ell(D) = \sum_{x \in U} [ y != \hat{y} ], 11 | % 12 | % where \hat{y} = argmax p(y | x, D) is the predicted label for x, or 13 | % the total log loss: 14 | % 15 | % \ell(D) = \sum_{x \in U} -\log p(y = \hat{y} | x, D). 16 | % 17 | % Then expected error reduction queries the point x resulting in 18 | % the lowest expected loss on U after adding the observation (x, y) 19 | % to D: 20 | % 21 | % x* = \argmin_x \sum_y p(y | x, D) \ell(D U (x, y)). 22 | % 23 | % Note that the set of unlabeled points U depends on x! 24 | % 25 | % Usage: 26 | % 27 | % query_ind = expected_error_reduction(problem, train_ind, ... 
28 | % observed_labels, test_ind, model, loss) 29 | % 30 | % Required Inputs: 31 | % 32 | % problem: a struct describing the problem, containing fields: 33 | % 34 | % points: an (n x d) data matrix for the available points 35 | % num_classes: the number of classes 36 | % 37 | % train_ind: a list of indices into problem.points indicating 38 | % the thus-far observed points 39 | % observed_labels: a list of labels corresponding to the 40 | % observations in train_ind 41 | % test_ind: a list of indices into problem.points indicating 42 | % the points eligible for observation 43 | % model: a function handle to a probability model 44 | % 45 | % Optional Input: 46 | % 47 | % loss: a string specifying the desired loss function; 48 | % the following are supported: '01', '0/1', 'log'. 49 | % (case insensitive; default: 'log') 50 | % 51 | % Output: 52 | % 53 | % query_ind: an index into problem.points indicating the point to query 54 | % next 55 | % 56 | % See also MODELS, QUERY_STRATEGIES. 57 | 58 | % Copyright (c) 2014 Roman Garnett. 59 | 60 | function query_ind = expected_error_reduction(problem, train_ind, ... 61 | observed_labels, test_ind, model, loss) 62 | 63 | if (nargin < 6) 64 | loss = 'log'; 65 | end 66 | 67 | % create handle to appropriate loss function 68 | switch (lower(loss)) 69 | case {'01', '0/1'} 70 | loss = @expected_01_loss; 71 | 72 | case 'log' 73 | loss = @expected_log_loss; 74 | 75 | otherwise 76 | error('active_learning:unknown_loss', ... 77 | 'unknown loss function: %s', loss); 78 | end 79 | 80 | % expected error reduction is one-step lookahead minimization of 81 | % the chosen expected loss on the set of unlabeled points. 82 | 83 | loss = @(problem, train_ind, observed_labels) ... 84 | loss(problem, train_ind, observed_labels, ... 85 | unlabeled_selector(problem, train_ind, []), ...
86 | model); 87 | 88 | score_function = get_score_function(@expected_loss_naive, model, loss); 89 | 90 | query_ind = argmin(problem, train_ind, observed_labels, test_ind, ... 91 | score_function); 92 | 93 | end -------------------------------------------------------------------------------- /query_strategies/margin_sampling.m: -------------------------------------------------------------------------------- 1 | % MARGIN_SAMPLING queries the point with the smallest margin. 2 | % 3 | % This is an implementation of margin sampling, a simple and popular 4 | % query strategy. Margin sampling successively queries the point with 5 | % the smallest margin: 6 | % 7 | % x* = argmin margin(x | D), 8 | % 9 | % where margin(x | D) is the predictive margin of x given the 10 | % observations in D: 11 | % 12 | % margin(x | D) = p(y = y_1 | x, D) - p(y = y_2 | x, D), 13 | % 14 | % where y_1 and y_2 are the most and second-most probable class 15 | % labels for x, respectively. 16 | % 17 | % For binary problems, this coincides with uncertainty sampling. 18 | % 19 | % Usage: 20 | % 21 | % query_ind = margin_sampling(problem, train_ind, observed_labels, ... 22 | % test_ind, model) 23 | % 24 | % Inputs: 25 | % 26 | % problem: a struct describing the problem, containing fields: 27 | % 28 | % points: an (n x d) data matrix for the available points 29 | % num_classes: the number of classes 30 | % 31 | % train_ind: a list of indices into problem.points indicating 32 | % the thus-far observed points 33 | % observed_labels: a list of labels corresponding to the 34 | % observations in train_ind 35 | % test_ind: a list of indices into problem.points indicating 36 | % the points eligible for observation 37 | % model: a function handle to a probability model 38 | % 39 | % Output: 40 | % 41 | % query_ind: an index into problem.points indicating the point to query 42 | % next 43 | % 44 | % See also MODELS, MARGIN, QUERY_STRATEGIES. 45 | 46 | % Copyright (c) 2014 Roman Garnett.
47 | 48 | function query_ind = margin_sampling(problem, train_ind, observed_labels, ... 49 | test_ind, model) 50 | 51 | score_function = get_score_function(@margin, model); 52 | 53 | query_ind = argmin(problem, train_ind, observed_labels, test_ind, ... 54 | score_function); 55 | 56 | end -------------------------------------------------------------------------------- /query_strategies/query_by_committee.m: -------------------------------------------------------------------------------- 1 | % QUERY_BY_COMMITTEE queries the point with highest disagreement. 2 | % 3 | % This is an implementation of "query by committee" using vote entropy 4 | % to measure disagreement among models. The query by committee query 5 | % strategy maintains an ensemble of models M and successively queries 6 | % the point about which the ensemble members disagree the most. The 7 | % idea is to greedily cut down the version space as quickly as 8 | % possible. 9 | % 10 | % The disagreement between ensemble members can be measured in various 11 | % ways, but the most popular method (and the one we implement here) is 12 | % the so-called "vote entropy." Let M = {M_j} be a set of probability 13 | % models, and let a point x and a set of observations D = (X, Y) be 14 | % given. The ensemble probabilities are given by 15 | % 16 | % p(y = i | x, D) = \sum_j w_j(D) p(y = i | x, D, M_j) 17 | % / \sum_j w_j(D), 18 | % 19 | % where w(D) is a (possibly data-dependent) weight vector of length 20 | % |M|. 21 | % 22 | % We may alternatively use so-called "hard" voting by the ensemble 23 | % members, where in the above the posterior probabilities 24 | % 25 | % p(y | x, D, M_j) 26 | % 27 | % are replaced by a Kronecker \delta distribution on the 28 | % most-confident label according to model M_j: 29 | % 30 | % \delta[ \argmax_i p(y = i | x, D, M_j) ].
31 | % 32 | % Finally, the vote entropy of M on x is the entropy of this marginal 33 | % distribution: 34 | % 35 | % H[y | x, D] = -\sum_i p(y = i | x, D) \log p(y = i | x, D). 36 | % 37 | % Traditionally, query by committee uses the "hard" voting strategy, 38 | % but we support either here. 39 | % 40 | % Usage: 41 | % 42 | % query_ind = query_by_committee(problem, train_ind, observed_labels, ... 43 | % test_ind, models, weights, hard_votes) 44 | % 45 | % Required Inputs: 46 | % 47 | % problem: a struct describing the problem, containing fields: 48 | % 49 | % points: an (n x d) data matrix for the available points 50 | % num_classes: the number of classes 51 | % 52 | % train_ind: a list of indices into problem.points indicating 53 | % the thus-far observed points 54 | % observed_labels: a list of labels corresponding to the 55 | % observations in train_ind 56 | % test_ind: a list of indices into problem.points indicating 57 | % the points eligible for observation 58 | % models: a cell array of handles to probability models 59 | % 60 | % Optional Inputs: 61 | % 62 | % weights: either a length-|M| vector of model weights or a 63 | % function handle returning such a vector (see note 64 | % in ensemble.m for details). 65 | % (default: ones(1, |M|) / |M|) 66 | % hard_votes: a boolean indicating whether to use "hard" voting 67 | % (default: false) 68 | % 69 | % Output: 70 | % 71 | % query_ind: an index into problem.points indicating the point to 72 | % query next 73 | % 74 | % See also ENSEMBLE, MODELS. 75 | 76 | % Copyright (c) 2014 Roman Garnett. 77 | 78 | function query_ind = query_by_committee(problem, train_ind, ... 79 | observed_labels, test_ind, models, varargin) 80 | 81 | % query by committee is simply uncertainty sampling using an 82 | % ensemble prediction 83 | model = get_model(@ensemble, models, varargin{:}); 84 | 85 | query_ind = uncertainty_sampling(problem, train_ind, observed_labels, ... 
86 | test_ind, model); 87 | 88 | end -------------------------------------------------------------------------------- /query_strategies/query_strategies.m: -------------------------------------------------------------------------------- 1 | % Query strategies select which of the points currently eligible for 2 | % labeling (returned by a selector) should be observed next. 3 | % 4 | % Query strategies must satisfy the following interface: 5 | % 6 | % query_ind = query_strategy(problem, train_ind, observed_labels, test_ind) 7 | % 8 | % Inputs: 9 | % 10 | % problem: a struct describing the problem, containing fields: 11 | % 12 | % points: an (n x d) data matrix for the available points 13 | % num_classes: the number of classes 14 | % num_queries: the number of queries to make 15 | % 16 | % train_ind: a list of indices into problem.points indicating 17 | % the thus-far observed points 18 | % observed_labels: a list of labels corresponding to the 19 | % observations in train_ind 20 | % test_ind: a list of indices into problem.points indicating 21 | % the points eligible for observation 22 | % 23 | % Output: 24 | % 25 | % query_ind: an index into problem.points indicating the point(s) to 26 | % query next (every entry in query_ind will always be 27 | % a member of the set of points in test_ind) 28 | % 29 | % The following query strategies are provided in this toolbox: 30 | % 31 | % argmax: samples the point(s) maximizing a given 32 | % score function 33 | % argmin: samples the point(s) minimizing a given 34 | % score function 35 | % expected_error_reduction: samples the point giving lowest 36 | % expected loss on unlabeled points 37 | % margin_sampling: samples the point with the smallest 38 | % margin 39 | % query_by_committee: samples the point with the highest 40 | % disagreement between models 41 | % uncertainty_sampling: samples the most uncertain point 42 | 43 | % Copyright (c) 2014 Roman Garnett. 
44 | -------------------------------------------------------------------------------- /query_strategies/uncertainty_sampling.m: -------------------------------------------------------------------------------- 1 | % UNCERTAINTY_SAMPLING queries the most uncertain point. 2 | % 3 | % This is an implementation of uncertainty sampling, a simple and 4 | % popular query strategy. Uncertainty sampling successively queries 5 | % the point with the highest marginal entropy: 6 | % 7 | % x* = argmax H[y | x, D], 8 | % 9 | % where H[y | x, D] is the entropy of the marginal label 10 | % distribution p(y | x, D): 11 | % 12 | % H[y | x, D] = -\sum_i p(y = i | x, D) \log(p(y = i | x, D)). 13 | % 14 | % Usage: 15 | % 16 | % query_ind = uncertainty_sampling(problem, train_ind, observed_labels, ... 17 | % test_ind, model) 18 | % 19 | % Inputs: 20 | % 21 | % problem: a struct describing the problem, containing fields: 22 | % 23 | % points: an (n x d) data matrix for the available points 24 | % num_classes: the number of classes 25 | % 26 | % train_ind: a list of indices into problem.points indicating 27 | % the thus-far observed points 28 | % observed_labels: a list of labels corresponding to the 29 | % observations in train_ind 30 | % test_ind: a list of indices into problem.points indicating 31 | % the points eligible for observation 32 | % model: a function handle to a probability model 33 | % 34 | % Output: 35 | % 36 | % query_ind: an index into problem.points indicating the point to query 37 | % next 38 | % 39 | % See also MODELS, MARGINAL_ENTROPY, QUERY_STRATEGIES. 40 | 41 | % Copyright (c) 2014 Roman Garnett. 42 | 43 | function query_ind = uncertainty_sampling(problem, train_ind, ... 44 | observed_labels, test_ind, model) 45 | 46 | score_function = get_score_function(@marginal_entropy, model); 47 | 48 | query_ind = argmax(problem, train_ind, observed_labels, test_ind, ...
49 | score_function); 50 | 51 | end -------------------------------------------------------------------------------- /score_functions/calculate_entropies.m: -------------------------------------------------------------------------------- 1 | function entropies = calculate_entropies(data, labels, train_ind, test_ind, ... 2 | probability_function) 3 | 4 | probabilities = probability_function(data, labels, train_ind, test_ind); 5 | entropies = -sum(probabilities .* log(max(probabilities, 1e-100)), 2); % guard against 0 * log(0) = NaN 6 | 7 | end -------------------------------------------------------------------------------- /score_functions/expected_loss_lookahead.m: -------------------------------------------------------------------------------- 1 | % EXPECTED_LOSS_LOOKAHEAD calculates "lookahead" expected losses. 2 | % 3 | % This is an implementation of a score function that calculates the 4 | % k-step-lookahead expected losses after adding each of a given set of 5 | % points to a dataset for a particular loss function and lookahead 6 | % horizon k. 7 | % 8 | % This function supports user-specified: 9 | % 10 | % * _Loss functions,_ which calculate the loss associated with a 11 | % selected training set, 12 | % 13 | % * _Selectors,_ which given the current training set, specify which 14 | % points should have their expected losses evaluated. This 15 | % implementation allows multiple selectors to be used, should 16 | % different ones be desired for different lookaheads. 17 | % 18 | % Note on Expected Loss Functions: 19 | % 20 | % This function requires as an input a function, expected_loss, that 21 | % will return the one-step-lookahead expected losses after adding each 22 | % of a given set of points to a dataset for the chosen loss 23 | % function. That is, given a point x and a loss function \ell(D) for a 24 | % dataset D = (X, Y), this function should return 25 | % 26 | % E_y[ \ell(D U {(x, y)}) | x, D] = ...
27 | % \sum_i p(y = i | x, D) \ell(D U {(x, i)}), 28 | % 29 | % where i ranges over the possible labels. 30 | % 31 | % The API for this expected loss function is the same as for any 32 | % score function: 33 | % 34 | % expected_losses = expected_loss(problem, train_ind, observed_labels, ... 35 | % test_ind) 36 | % 37 | % Sometimes this expectation over y may be calculated directly without 38 | % enumerating all cases for y. If that is not possible, the function 39 | % expected_loss_naive may be used with any arbitrary loss function to 40 | % calculate this expectation naively (by augmenting the dataset D with 41 | % (x, i) for each class i and weighting the resulting losses by the 42 | % probability that y = i). 43 | % 44 | % Usage: 45 | % 46 | % expected_losses = expected_loss_lookahead(problem, train_ind, ... 47 | % observed_labels, test_ind, model, expected_loss, selectors, ... 48 | % lookahead) 49 | % 50 | % Inputs: 51 | % 52 | % problem: a struct describing the problem, containing fields: 53 | % 54 | % points: an n x d matrix describing the available points 55 | % num_classes: the number of classes 56 | % 57 | % train_ind: a list of indices into problem.points 58 | % indicating the thus-far observed points 59 | % observed_labels: a list of labels corresponding to the 60 | % observations in train_ind 61 | % test_ind: a list of indices into problem.points indicating 62 | % the points eligible for observation 63 | % model: a handle to the probability model to use 64 | % expected_loss: a handle to the one-step expected loss function 65 | % to use (see note above) 66 | % selectors: a cell array of selectors to use. If lookahead == k, 67 | % then the min(k, numel(selectors))th 68 | % element of this array will be used. 69 | % lookahead: the number of steps to look ahead. If 70 | % lookahead == 0, then random expected losses are
72 | % 73 | % Output: 74 | % 75 | % expected_losses: the lookahead-step expected losses for the points 76 | % in test_ind 77 | % 78 | % See also LOSS_FUNCTIONS, EXPECTED_LOSS_NAIVE, SELECTORS, MODELS, SCORE_FUNCTIONS. 79 | 80 | % Copyright (c) 2011--2014 Roman Garnett. 81 | 82 | function expected_losses = expected_loss_lookahead(problem, train_ind, ... 83 | observed_labels, test_ind, model, expected_loss, selectors, ... 84 | lookahead) 85 | 86 | num_test = numel(test_ind); 87 | 88 | % for zero-step lookahead, return random expected losses 89 | if (lookahead == 0) 90 | expected_losses = rand(num_test, 1); 91 | return; 92 | 93 | % for one-step lookahead, return values from base expected loss 94 | elseif (lookahead == 1) 95 | expected_losses = expected_loss(problem, train_ind, observed_labels, ... 96 | test_ind); 97 | return; 98 | end 99 | 100 | % We will calculate the expected loss after adding each point to the 101 | % training set by sampling over labels to create fictitious datasets 102 | % and measuring the expected loss of each. We accomplish lookahead 103 | % by calling this function recursively to calculate the expected 104 | % loss at later levels. 105 | 106 | % Used to recursively select test points. Allow array of selectors and 107 | % fall back if no entry for current lookahead. 108 | selector = selectors{min(lookahead - 1, numel(selectors))}; 109 | 110 | % Given one additional new point, we will always select the remaining 111 | % points by minimizing the (lookahead - 1) expected loss. 112 | lookahead_loss = @(problem, train_ind, observed_labels) ... 113 | min(expected_loss_lookahead(problem, train_ind, observed_labels, ... 114 | selector(problem, train_ind, observed_labels), model, ... 115 | expected_loss, selectors, lookahead - 1)); 116 | 117 | % Use expected_loss_naive to evaluate the expected loss of the given 118 | % points. 119 | expected_losses = expected_loss_naive(problem, train_ind, ...
120 | observed_labels, test_ind, model, lookahead_loss); 121 | 122 | end -------------------------------------------------------------------------------- /score_functions/expected_loss_naive.m: -------------------------------------------------------------------------------- 1 | % EXPECTED_LOSS_NAIVE calculates one-step-lookahead expected losses. 2 | % 3 | % This is an implementation of a score function that calculates the 4 | % one-step-lookahead expected losses after adding each of a given set of 5 | % points to a dataset for a particular loss function. 6 | % 7 | % Given a loss function \ell(D), this function computes the 8 | % expected loss after adding each identified point x to the current 9 | % dataset D: 10 | % 11 | % E_y[ \ell(D U {(x, y)}) | x, D] = ... 12 | % \sum_i p(y = i | x, D) \ell(D U {(x, i)}), 13 | % 14 | % where i ranges over the possible labels. 15 | % 16 | % Here this expectation is computed naively by augmenting the dataset 17 | % D with (x, i) for each class i and weighting the resulting losses by 18 | % the probability that y = i. 19 | % 20 | % Usage: 21 | % 22 | % expected_losses = expected_loss_naive(problem, train_ind, ... 
23 | % observed_labels, test_ind, model, loss) 24 | % 25 | % Inputs: 26 | % 27 | % problem: a struct describing the problem, containing fields: 28 | % 29 | % points: an n x d matrix describing the available points 30 | % num_classes: the number of classes 31 | % 32 | % train_ind: a list of indices into problem.points 33 | % indicating the thus-far observed points 34 | % observed_labels: a list of labels corresponding to the 35 | % observations in train_ind 36 | % test_ind: a list of indices into problem.points indicating 37 | % the points eligible for observation 38 | % model: a handle to the probability model to use 39 | % loss: a handle to the loss function to use 40 | % 41 | % Output: 42 | % 43 | % expected_losses: the one-step lookahead expected losses for the 44 | % points in test_ind 45 | % 46 | % See also LOSS_FUNCTIONS, EXPECTED_LOSS_LOOKAHEAD, MODELS, SCORE_FUNCTIONS. 47 | 48 | % Copyright (c) 2011--2014 Roman Garnett. 49 | 50 | function expected_losses = expected_loss_naive(problem, train_ind, ... 51 | observed_labels, test_ind, model, loss) 52 | 53 | num_test = numel(test_ind); 54 | 55 | % calculate the current posterior probabilities 56 | probabilities = model(problem, train_ind, observed_labels, test_ind); 57 | 58 | expected_losses = zeros(num_test, 1); 59 | for i = 1:num_test 60 | fake_train_ind = [train_ind; test_ind(i)]; 61 | 62 | % sample over labels 63 | fake_losses = zeros(problem.num_classes, 1); 64 | for fake_label = 1:problem.num_classes 65 | fake_observed_labels = [observed_labels; fake_label]; 66 | fake_losses(fake_label) = ... 
67 | loss(problem, fake_train_ind, fake_observed_labels); 68 | end 69 | 70 | % calculate expectation using current probabilities 71 | expected_losses(i) = probabilities(i, :) * fake_losses; 72 | end 73 | 74 | end -------------------------------------------------------------------------------- /score_functions/expected_utility_lookahead.m: -------------------------------------------------------------------------------- 1 | % EXPECTED_UTILITY_LOOKAHEAD calculates "lookahead" expected utilities. 2 | % 3 | % This is an implementation of a score function that calculates the 4 | % k-step-lookahead expected utilities after adding each of a given set 5 | % of points to a dataset for a particular utility function and 6 | % lookahead horizon k. 7 | % 8 | % This is implemented as a wrapper around expected_loss_lookahead that 9 | % simply transforms the provided utility into a loss (via negation), 10 | % calls that function, and again negates the outputs. The API is the 11 | % same as for expected_loss_lookahead, modulo the replacement of 12 | % losses by utilities. 13 | % 14 | % See also EXPECTED_LOSS_LOOKAHEAD, LOSS_FUNCTIONS, SCORE_FUNCTIONS. 15 | 16 | % Copyright (c) 2014 Roman Garnett. 17 | 18 | function expected_utilities = expected_utility_lookahead(problem, ... 19 | train_ind, observed_labels, test_ind, model, expected_utility, ... 20 | selectors, lookahead) 21 | 22 | % transform utility into a loss via negation 23 | expected_loss = @(problem, train_ind, observed_labels, test_ind) ... 24 | -expected_utility(problem, train_ind, observed_labels, ... 25 | test_ind); 26 | 27 | % calculate expected losses and transform back to expected utilities 28 | % by negation 29 | expected_utilities = -expected_loss_lookahead(problem, train_ind, ... 30 | observed_labels, test_ind, model, expected_loss, selectors, ...
31 | lookahead); 32 | 33 | end -------------------------------------------------------------------------------- /score_functions/expected_utility_naive.m: -------------------------------------------------------------------------------- 1 | % EXPECTED_UTILITY_NAIVE calculates one-step-lookahead expected utilities. 2 | % 3 | % This is an implementation of a score function that calculates the 4 | % one-step-lookahead expected utilities after adding each of a given 5 | % set of points to a dataset for a particular utility function. 6 | % 7 | % This is implemented as a wrapper around expected_loss_naive that 8 | % simply transforms the provided utility into a loss (via negation), 9 | % calls that function, and again negates the outputs. The API is the 10 | % same as for expected_loss_naive, modulo the replacement of losses by 11 | % utilities. 12 | % 13 | % See also EXPECTED_LOSS_NAIVE, LOSS_FUNCTIONS, SCORE_FUNCTIONS. 14 | 15 | % Copyright (c) 2014 Roman Garnett. 16 | 17 | function expected_utilities = expected_utility_naive(problem, train_ind, ... 18 | observed_labels, test_ind, model, utility) 19 | 20 | % transform utility into a loss via negation 21 | loss = @(problem, train_ind, observed_labels) ... 22 | -utility(problem, train_ind, observed_labels); 23 | 24 | % calculate expected losses and transform back to expected utilities 25 | % by negation 26 | expected_utilities = -expected_loss_naive(problem, train_ind, ... 27 | observed_labels, test_ind, model, loss); 28 | 29 | end -------------------------------------------------------------------------------- /score_functions/loss_functions/expected_01_loss.m: -------------------------------------------------------------------------------- 1 | % EXPECTED_01_LOSS calculates expected 0/1 loss given a training set.
2 | % 3 | % This function computes the expected total 0/1 loss on a set of 4 | % points given a training set D = (X, Y): 5 | % 6 | % \sum_{x \in U} (1 - max p(y | x, D)), 7 | % 8 | % where U is the set of points whose labels are to be predicted. 9 | % 10 | % Usage: 11 | % 12 | % loss = expected_01_loss(problem, train_ind, observed_labels, ... 13 | % test_ind, model) 14 | % 15 | % Inputs: 16 | % 17 | % problem: a struct describing the problem, containing fields: 18 | % 19 | % points: an (n x d) data matrix for the available points 20 | % num_classes: the number of classes 21 | % 22 | % train_ind: a list of indices into problem.points indicating 23 | % the thus-far observed points 24 | % observed_labels: a list of labels corresponding to the 25 | % observations in train_ind 26 | % test_ind: a list of indices into problem.points indicating 27 | % the test points 28 | % model: a handle to a probability model 29 | % 30 | % Output: 31 | % 32 | % loss: the expected total 0/1 loss on the points in test_ind 33 | % 34 | % See also EXPECTED_ERROR_REDUCTION, MARGINAL_ENTROPY. 35 | 36 | % Copyright (c) 2014 Roman Garnett. 37 | 38 | function loss = expected_01_loss(problem, train_ind, observed_labels, ... 39 | test_ind, model) 40 | 41 | probabilities = model(problem, train_ind, observed_labels, test_ind); 42 | 43 | loss = sum(1 - max(probabilities, [], 2)); 44 | 45 | end -------------------------------------------------------------------------------- /score_functions/loss_functions/expected_log_loss.m: -------------------------------------------------------------------------------- 1 | % EXPECTED_LOG_LOSS calculates expected log loss given a training set.
2 | % 3 | % This function computes the expected total log loss on a set of 4 | % points given a training set D = (X, Y): 5 | % 6 | % \sum_{x \in U} H[y | x, D], 7 | % 8 | % where H[y | x, D] is the marginal entropy of the predictive 9 | % distribution p(y | x, D) and U is the set of points whose labels 10 | % are to be predicted. 11 | % 12 | % Usage: 13 | % 14 | % loss = expected_log_loss(problem, train_ind, observed_labels, ... 15 | % test_ind, model) 16 | % 17 | % Inputs: 18 | % 19 | % problem: a struct describing the problem, containing fields: 20 | % 21 | % points: an (n x d) data matrix for the available points 22 | % num_classes: the number of classes 23 | % 24 | % train_ind: a list of indices into problem.points indicating 25 | % the thus-far observed points 26 | % observed_labels: a list of labels corresponding to the 27 | % observations in train_ind 28 | % test_ind: a list of indices into problem.points indicating 29 | % the test points 30 | % model: a handle to a probability model 31 | % 32 | % Output: 33 | % 34 | % loss: the expected total log loss on the points in test_ind 35 | % 36 | % See also EXPECTED_ERROR_REDUCTION, MARGINAL_ENTROPY. 37 | 38 | % Copyright (c) 2014 Roman Garnett. 39 | 40 | function loss = expected_log_loss(problem, train_ind, observed_labels, ... 41 | test_ind, model) 42 | 43 | marginal_entropies = marginal_entropy(problem, train_ind, ... 44 | observed_labels, test_ind, model); 45 | 46 | loss = sum(marginal_entropies); 47 | 48 | end -------------------------------------------------------------------------------- /score_functions/loss_functions/loss_functions.m: -------------------------------------------------------------------------------- 1 | % Loss functions (also utility functions) compute the loss (or 2 | % utility) associated with a given set of observations. 
These are 3 | % typically used in active learning to, e.g., sample the point that, 4 | % after being incorporated into the current set of observations, 5 | % minimizes the expected final loss. Loss functions will typically not 6 | % be used directly but rather by a score function computing expected 7 | % losses (e.g., expected_loss_naive, expected_loss_lookahead) or 8 | % expected utilities (e.g., expected_utility_naive, 9 | % expected_utility_lookahead). 10 | % 11 | % Loss and utility functions must satisfy the following interface: 12 | % 13 | % loss = loss_function(problem, train_ind, observed_labels) 14 | % 15 | % or 16 | % 17 | % utility = utility_function(problem, train_ind, observed_labels) 18 | % 19 | % The only difference between the two is the semantic interpretation 20 | % of the output: losses are typically to be minimized (e.g., with 21 | % argmin) and utilities are typically to be maximized (e.g., with 22 | % argmax). 23 | % 24 | % Inputs: 25 | % 26 | % problem: a struct describing the problem, containing fields: 27 | % 28 | % points: an (n x d) data matrix for the available points 29 | % num_classes: the number of classes 30 | % 31 | % train_ind: a list of indices into problem.points indicating 32 | % the thus-far observed points 33 | % observed_labels: a list of labels corresponding to the 34 | % observations in train_ind 35 | % 36 | % Output: 37 | % 38 | % loss: the loss associated with the given set of observations 39 | % 40 | % or 41 | % 42 | % utility: the utility associated with the given set of observations 43 | % 44 | % See also EXPECTED_LOSS_NAIVE, EXPECTED_LOSS_LOOKAHEAD, 45 | % EXPECTED_UTILITY_NAIVE, EXPECTED_UTILITY_LOOKAHEAD. 46 | % 47 | % Copyright (c) 2014 Roman Garnett. 48 | -------------------------------------------------------------------------------- /score_functions/margin.m: -------------------------------------------------------------------------------- 1 | % MARGIN calculates predictive margin on given test points.
2 | % 3 | % The predictive margin for a point x is the difference between the 4 | % probability assigned to the most probable class and the 5 | % second-most probable class: 6 | % 7 | % margin(x | D) = p(y = y_1 | x, D) - p(y = y_2 | x, D), 8 | % 9 | % where y_1 and y_2 are the most and second-most probable class 10 | % labels for x given the observations in D, respectively. 11 | % 12 | % Minimizing margin gives rise to a popular query strategy known as 13 | % margin sampling. 14 | % 15 | % Usage: 16 | % 17 | % scores = margin(problem, train_ind, observed_labels, test_ind, model) 18 | % 19 | % Inputs: 20 | % 21 | % problem: a struct describing the problem, containing fields: 22 | % 23 | % points: an (n x d) data matrix for the available points 24 | % num_classes: the number of classes 25 | % 26 | % train_ind: a list of indices into problem.points indicating 27 | % the thus-far observed points 28 | % observed_labels: a list of labels corresponding to the 29 | % observations in train_ind 30 | % test_ind: a list of indices into problem.points indicating 31 | % the points eligible for observation 32 | % model: a handle to a probability model 33 | % 34 | % Output: 35 | % 36 | % scores: a vector of margins for each point specified by test_ind 37 | % 38 | % See also SCORE_FUNCTIONS, MODELS, MARGIN_SAMPLING. 39 | 40 | % Copyright (c) 2014 Roman Garnett. 41 | 42 | function scores = margin(problem, train_ind, observed_labels, ... 43 | test_ind, model) 44 | 45 | probabilities = model(problem, train_ind, observed_labels, test_ind); 46 | probabilities = sort(probabilities, 2, 'descend'); 47 | 48 | scores = probabilities(:, 1) - probabilities(:, 2); 49 | 50 | end -------------------------------------------------------------------------------- /score_functions/marginal_entropy.m: -------------------------------------------------------------------------------- 1 | % MARGINAL_ENTROPY calculates predictive entropy on given test points. 
2 | % 3 | % The predictive marginal entropy H for a point x given a set of 4 | % observations D = (X, Y) is given by 5 | % 6 | % H[y | x, D] = -\sum_i p(y = i | x, D) \log p(y = i | x, D). 7 | % 8 | % Maximizing the marginal entropy gives rise to a common query 9 | % strategy known as uncertainty sampling. 10 | % 11 | % Usage: 12 | % 13 | % scores = marginal_entropy(problem, train_ind, observed_labels, ... 14 | % test_ind, model) 15 | % 16 | % Inputs: 17 | % 18 | % problem: a struct describing the problem, containing fields: 19 | % 20 | % points: an (n x d) data matrix for the available points 21 | % num_classes: the number of classes 22 | % 23 | % train_ind: a list of indices into problem.points indicating 24 | % the thus-far observed points 25 | % observed_labels: a list of labels corresponding to the 26 | % observations in train_ind 27 | % test_ind: a list of indices into problem.points indicating 28 | % the points eligible for observation 29 | % model: a handle to a probability model 30 | % 31 | % Output: 32 | % 33 | % scores: a vector of marginal entropies for each point specified by 34 | % test_ind 35 | % 36 | % See also SCORE_FUNCTIONS, MODELS, UNCERTAINTY_SAMPLING. 37 | 38 | % Copyright (c) 2013--2014 Roman Garnett. 39 | 40 | function scores = marginal_entropy(problem, train_ind, observed_labels, ... 
41 | test_ind, model) 42 | 43 | probabilities = model(problem, train_ind, observed_labels, test_ind); 44 | 45 | % remove any zeros from probabilities to approximate 0 * -inf = 0 46 | probabilities = max(probabilities, 1e-100); 47 | 48 | scores = -sum(probabilities .* log(probabilities), 2); 49 | 50 | end -------------------------------------------------------------------------------- /score_functions/score_functions.m: -------------------------------------------------------------------------------- 1 | % A score function computes an arbitrary score for each given test 2 | % point that is in some way related to its influence or suitability 3 | % for making an observation there. Score functions are typically 4 | % converted into query strategies by either maximization (e.g. argmax) 5 | % or minimization (e.g. argmin) over the points eligible for 6 | % observation. 7 | % 8 | % Score functions must satisfy the following interface: 9 | % 10 | % scores = score_function(problem, train_ind, observed_labels, test_ind) 11 | % 12 | % Inputs: 13 | % 14 | % problem: a struct describing the problem, containing fields: 15 | % 16 | % points: an (n x d) data matrix for the available points 17 | % num_classes: the number of classes 18 | % 19 | % train_ind: a list of indices into problem.points indicating 20 | % the thus-far observed points 21 | % observed_labels: a list of labels corresponding to the 22 | % observations in train_ind 23 | % test_ind: a list of indices into problem.points indicating 24 | % the points eligible for observation 25 | % 26 | % Output: 27 | % 28 | % scores: a vector of real-valued scores; one for each point 29 | % specified by test_ind 30 | % 31 | % The following score functions are provided in this toolbox: 32 | % 33 | % expected_loss_lookahead: multiple-step lookahead expected loss for 34 | % arbitrary loss functions 35 | % expected_loss_naive: one-step lookahead expected loss for 36 | % arbitrary loss functions 37 | % margin: the predictive margin 38 |
marginal_entropy: the predictive entropy 39 | % 40 | % See also ARGMIN, ARGMAX. 41 | 42 | % Copyright (c) 2014 Roman Garnett. 43 | -------------------------------------------------------------------------------- /selectors/fixed_test_set_selector.m: -------------------------------------------------------------------------------- 1 | % FIXED_TEST_SET_SELECTOR selects all points besides a given test set. 2 | % 3 | % Usage: 4 | % 5 | % test_ind = fixed_test_set_selector(problem, train_ind, observed_labels, ... 6 | % test_set_ind) 7 | % 8 | % Inputs: 9 | % 10 | % problem: a struct describing the problem, which must at 11 | % least contain the field: 12 | % 13 | % points: an (n x d) data matrix for the available points 14 | % 15 | % train_ind: a list of indices into problem.points indicating 16 | % the thus-far observed points 17 | % 18 | % Note: this input, part of the standard selector 19 | % API, is ignored by fixed_test_set_selector. If 20 | % desired, for standalone use it can be replaced by 21 | % an empty matrix. 22 | % 23 | % observed_labels: a list of labels corresponding to the 24 | % observations in train_ind 25 | % 26 | % Note: this input, part of the standard selector 27 | % API, is ignored by fixed_test_set_selector. If 28 | % desired, for standalone use it can be replaced by 29 | % an empty matrix. 30 | % 31 | % test_set_ind: a list of indices into problem.points 32 | % indicating the test set 33 | % 34 | % Output: 35 | % 36 | % test_ind: a list of indices into problem.points indicating the 37 | % points to consider for labeling 38 | % 39 | % See also SELECTORS. 40 | 41 | % Copyright (c) 2011--2014 Roman Garnett.
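The selector documented above reduces to a single set difference: start from every point index and remove the held-out test set. A minimal sketch of the same operation — written in Python/NumPy purely for illustration (0-based indices; the function name mirrors the MATLAB one but is not part of the toolbox):

```python
import numpy as np

def fixed_test_set_selector(num_points, test_set_ind):
    # start from all point indices (what identity_selector returns)
    test_ind = np.arange(num_points)
    # drop the held-out test set; whatever remains is eligible for labeling
    return np.setdiff1d(test_ind, test_set_ind)

# with 6 points and test set {1, 3}, points 0, 2, 4, 5 remain eligible
remaining = fixed_test_set_selector(6, np.array([1, 3]))
```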
42 | 43 | function test_ind = fixed_test_set_selector(problem, ~, ~, test_set_ind) 44 | 45 | test_ind = identity_selector(problem, [], []); 46 | test_ind(test_set_ind) = []; 47 | 48 | end 49 | -------------------------------------------------------------------------------- /selectors/graph_walk_selector.m: -------------------------------------------------------------------------------- 1 | % GRAPH_WALK_SELECTOR confines an experiment to follow a path on a graph. 2 | % 3 | % This provides a selector that compels observations to be taken along 4 | % a connected path in a specified (possibly directed) graph. The nodes 5 | % adjacent to the previously queried node are selected. 6 | % 7 | % Usage: 8 | % 9 | % test_ind = graph_walk_selector(problem, train_ind, observed_labels, A) 10 | % 11 | % Inputs: 12 | % 13 | % problem: a struct describing the problem, containing fields: 14 | % 15 | % points: an (n x d) data matrix for the available points 16 | % num_classes: the number of classes 17 | % num_queries: the number of queries to make 18 | % 19 | % Note: this input, part of the standard selector 20 | % API, is ignored by graph_walk_selector. If 21 | % desired, for standalone use it can be replaced by 22 | % an empty matrix. 23 | % 24 | % train_ind: a list of indices into problem.points indicating 25 | % the thus-far observed points 26 | % observed_labels: a list of labels corresponding to the 27 | % observations in train_ind 28 | % 29 | % Note: this input, part of the standard selector 30 | % API, is ignored by graph_walk_selector. If 31 | % desired, for standalone use it can be replaced by 32 | % an empty matrix. 33 | % 34 | % A: the (n x n) adjacency matrix for the desired 35 | % graph. A nonzero entry for A(i, j) is interpreted 36 | % as the presence of the (possibly directed) edge 37 | % [i -> j]. 38 | % 39 | % Output: 40 | % 41 | % test_ind: a list of indices into problem.points indicating the 42 | % points to consider for labeling. 
Each index in test_ind 43 | can be reached from the last observed point via an 44 | outgoing edge in the given graph. 45 | 46 | % See also SELECTORS. 47 | 48 | % Copyright (c) 2013--2014 Roman Garnett. 49 | 50 | function test_ind = graph_walk_selector(~, train_ind, ~, A) 51 | 52 | test_ind = find(A(train_ind(end), :))'; 53 | 54 | end -------------------------------------------------------------------------------- /selectors/identity_selector.m: -------------------------------------------------------------------------------- 1 | % IDENTITY_SELECTOR selects all points. 2 | % 3 | % Usage: 4 | % 5 | % test_ind = identity_selector(problem, train_ind, observed_labels) 6 | % 7 | % Inputs: 8 | % 9 | % problem: a struct describing the problem, which must at 10 | % least contain the field: 11 | % 12 | % points: an (n x d) data matrix for the available points 13 | % 14 | % train_ind: a list of indices into problem.points indicating 15 | % the thus-far observed points 16 | % 17 | % Note: this input, part of the standard selector 18 | % API, is ignored by identity_selector. If desired, 19 | % for standalone use it can be replaced by an empty 20 | % matrix. 21 | % 22 | % observed_labels: a list of labels corresponding to the 23 | % observations in train_ind 24 | % 25 | % Note: this input, part of the standard selector 26 | % API, is ignored by identity_selector. If desired, 27 | % for standalone use it can be replaced by an empty 28 | % matrix. 29 | % 30 | % Output: 31 | % 32 | % test_ind: a list of indices into problem.points indicating the 33 | % points to consider for labeling 34 | % 35 | % See also SELECTORS. 36 | 37 | % Copyright (c) 2011--2014 Roman Garnett.
38 | 39 | function test_ind = identity_selector(problem, ~, ~) 40 | 41 | test_ind = (1:size(problem.points, 1))'; 42 | 43 | end -------------------------------------------------------------------------------- /selectors/meta_selectors/complement_selector.m: -------------------------------------------------------------------------------- 1 | % COMPLEMENT_SELECTOR takes the complement of a selector's output. 2 | % 3 | % This provides a meta-selector that returns the complement of 4 | % the test points returned by another selector. Note: this set can 5 | % be empty! 6 | % 7 | % Usage: 8 | % 9 | % test_ind = complement_selector(problem, train_ind, observed_labels, selector) 10 | % 11 | % Inputs: 12 | % 13 | % problem: a struct describing the problem, containing fields: 14 | % 15 | % points: an (n x d) data matrix for the available points 16 | % num_classes: the number of classes 17 | % num_queries: the number of queries to make 18 | % 19 | % train_ind: a list of indices into problem.points indicating 20 | % the thus-far observed points 21 | % observed_labels: a list of labels corresponding to the 22 | % observations in train_ind 23 | % selector: a function handle to a selector 24 | % 25 | % Output: 26 | % 27 | % test_ind: a list of indices into problem.points indicating the 28 | % points to consider for labeling. Each index in test_ind 29 | % was not selected by the given selector. 30 | % 31 | % See also SELECTORS. 32 | 33 | % Copyright (c) 2014 Roman Garnett. 34 | 35 | function test_ind = complement_selector(problem, train_ind, observed_labels, ... 
36 | selector) 37 | 38 | test_ind = identity_selector(problem, [], []); 39 | test_ind(selector(problem, train_ind, observed_labels)) = []; 40 | 41 | end -------------------------------------------------------------------------------- /selectors/meta_selectors/intersection_selector.m: -------------------------------------------------------------------------------- 1 | % INTERSECTION_SELECTOR takes the intersection of the outputs of selectors. 2 | % 3 | % This provides a meta-selector that returns the intersection of the 4 | % test points returned from each of a set of selectors. Note that this 5 | % intersection may be empty! 6 | % 7 | % Usage: 8 | % 9 | % test_ind = intersection_selector(problem, train_ind, observed_labels, ... 10 | % selectors) 11 | % 12 | % Inputs: 13 | % 14 | % problem: a struct describing the problem, containing fields: 15 | % 16 | % points: an (n x d) data matrix for the available points 17 | % num_classes: the number of classes 18 | % num_queries: the number of queries to make 19 | % 20 | % train_ind: a list of indices into problem.points indicating 21 | % the thus-far observed points 22 | % observed_labels: a list of labels corresponding to the 23 | % observations in train_ind 24 | % selectors: a cell array of function handles to selectors 25 | % to intersect 26 | % 27 | % Output: 28 | % 29 | % test_ind: a list of indices into problem.points indicating the 30 | % points to consider for labeling. Each index in test_ind 31 | % was selected by every provided selector. 32 | % 33 | % See also SELECTORS. 34 | 35 | % Copyright (c) 2013--2014 Roman Garnett. 36 | 37 | function test_ind = intersection_selector(problem, train_ind, ... 
38 | observed_labels, selectors) 39 | 40 | test_ind = selectors{1}(problem, train_ind, observed_labels); 41 | for i = 2:numel(selectors) 42 | test_ind = intersect(test_ind, selectors{i}(problem, train_ind, observed_labels)); 43 | end 44 | 45 | end -------------------------------------------------------------------------------- /selectors/meta_selectors/union_selector.m: -------------------------------------------------------------------------------- 1 | % UNION_SELECTOR takes the union of the output of selectors. 2 | % 3 | % This provides a meta-selector that returns the union of the test 4 | % points returned from each of a set of selectors. 5 | % 6 | % Usage: 7 | % 8 | % test_ind = union_selector(problem, train_ind, observed_labels, selectors) 9 | % 10 | % Inputs: 11 | % 12 | % problem: a struct describing the problem, containing fields: 13 | % 14 | % points: an (n x d) data matrix for the available points 15 | % num_classes: the number of classes 16 | % num_queries: the number of queries to make 17 | % 18 | % train_ind: a list of indices into problem.points indicating 19 | % the thus-far observed points 20 | % observed_labels: a list of labels corresponding to the 21 | % observations in train_ind 22 | % selectors: a cell array of function handles to selectors 23 | % to combine 24 | % 25 | % Output: 26 | % 27 | % test_ind: a list of indices into problem.points indicating the 28 | % points to consider for labeling. Each index in test_ind 29 | % was selected by at least one of the provided selectors. 30 | % 31 | % See also SELECTORS. 32 | 33 | % Copyright (c) 2014 Roman Garnett. 34 | 35 | function test_ind = union_selector(problem, train_ind, observed_labels, ... 
36 | selectors) 37 | 38 | test_ind = selectors{1}(problem, train_ind, observed_labels); 39 | for i = 2:numel(selectors) 40 | test_ind = union(test_ind, selectors{i}(problem, train_ind, observed_labels)); 41 | end 42 | 43 | end -------------------------------------------------------------------------------- /selectors/probability_treshold_selector.m: -------------------------------------------------------------------------------- 1 | % PROBABILITY_THRESHOLD_SELECTOR selects confident points. 2 | % 3 | % This provides a selector that selects points with at least one 4 | % class-membership probability at or above a specified threshold 5 | % according to a given model. 6 | % 7 | % Usage: 8 | % 9 | % test_ind = probability_treshold_selector(problem, train_ind, ... 10 | % observed_labels, model, threshold) 11 | % 12 | % Inputs: 13 | % problem: a struct describing the problem, containing fields: 14 | % 15 | % points: an (n x d) data matrix for the available points 16 | % num_classes: the number of classes 17 | % 18 | % train_ind: a list of indices into problem.points indicating 19 | % the thus-far observed points 20 | % observed_labels: a list of labels corresponding to the 21 | % observations in train_ind 22 | % model: a handle to a probability model 23 | % threshold: the (inclusive) probability threshold for selection 24 | % Output: 25 | % 26 | % test_ind: a list of indices into problem.points indicating the 27 | % points to consider for labeling. Each index in test_ind 28 | % has at least one class-membership probability greater 29 | % than or equal to the provided threshold. 30 | % 31 | % See also SELECTORS, MODELS. 32 | 33 | % Copyright (c) 2011--2014 Roman Garnett. 34 | 35 | function test_ind = probability_treshold_selector(problem, train_ind, ...
36 | observed_labels, model, threshold) 37 | 38 | test_ind = identity_selector(problem, [], []); 39 | 40 | probabilities = model(problem, train_ind, observed_labels, test_ind); 41 | 42 | test_ind = find(any(probabilities >= threshold, 2)); 43 | 44 | end 45 | -------------------------------------------------------------------------------- /selectors/random_selector.m: -------------------------------------------------------------------------------- 1 | % RANDOM_SELECTOR selects a random subset of points. 2 | % 3 | % Usage: 4 | % 5 | % test_ind = random_selector(problem, train_ind, observed_labels, ... 6 | % num_test) 7 | % 8 | % Inputs: 9 | % 10 | % problem: a struct describing the problem, which must at 11 | % least contain the field: 12 | % 13 | % points: an (n x d) data matrix for the available points 14 | % 15 | % train_ind: a list of indices into problem.points indicating 16 | % the thus-far observed points 17 | % 18 | % Note: this input, part of the standard selector 19 | % API, is ignored by random_selector. If desired, 20 | % for standalone use it can be replaced by an empty 21 | % matrix. 22 | % 23 | % observed_labels: a list of labels corresponding to the 24 | % observations in train_ind 25 | % 26 | % Note: this input, part of the standard selector 27 | % API, is ignored by random_selector. If desired, 28 | % for standalone use it can be replaced by an empty 29 | % matrix. 30 | % 31 | % num_test: the number of test points to select 32 | % 33 | % Output: 34 | % 35 | % test_ind: a list of indices into problem.points indicating the 36 | % points to consider for labeling 37 | % 38 | % See also SELECTORS. 39 | 40 | % Copyright (c) 2011--2014 Roman Garnett.
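random_selector draws num_test distinct point indices uniformly at random, which is what MATLAB's randperm(n, k) provides. The same behavior can be sketched in Python/NumPy — purely for illustration, the names below are not part of the toolbox:

```python
import numpy as np

def random_selector(num_points, num_test, seed=None):
    # draw num_test distinct point indices uniformly at random,
    # mirroring MATLAB's randperm(num_points, num_test)
    rng = np.random.default_rng(seed)
    return rng.choice(num_points, size=num_test, replace=False)

ind = random_selector(100, 10, seed=0)
```

Sampling without replacement matters here: a selector must return a set of candidate points, so duplicate indices would be meaningless.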
41 | 42 | function test_ind = random_selector(problem, ~, ~, num_test) 43 | 44 | test_ind = randperm(size(problem.points, 1), num_test); 45 | 46 | end -------------------------------------------------------------------------------- /selectors/selectors.m: -------------------------------------------------------------------------------- 1 | % A selector considers the current labeled dataset and indicates which 2 | % of the unlabeled points should be considered for observation at this 3 | % time. 4 | % 5 | % Selectors must satisfy the following interface: 6 | % 7 | % test_ind = selector(problem, train_ind, observed_labels) 8 | % 9 | % Inputs: 10 | % 11 | % problem: a struct describing the problem, containing fields: 12 | % 13 | % points: an (n x d) data matrix for the available points 14 | % num_classes: the number of classes 15 | % num_queries: the number of queries to make 16 | % 17 | % train_ind: a list of indices into problem.points indicating 18 | % the thus-far observed points 19 | % observed_labels: a list of labels corresponding to the 20 | % observations in train_ind 21 | % 22 | % Output: 23 | % 24 | % test_ind: a list of indices into problem.points indicating the 25 | % points to consider for labeling 26 | % 27 | % The following general-purpose selectors are provided in this 28 | % toolbox: 29 | % 30 | % fixed_test_set_selector: selects all points besides a given test 31 | % set 32 | % graph_walk_selector: confines an experiment to follow a path 33 | % on a graph 34 | % identity_selector: selects all points 35 | % random_selector: selects a random subset of points 36 | % unlabeled_selector: selects points not yet observed 37 | % 38 | % In addition, the following "meta" selectors are provided, which 39 | % combine or modify the outputs of other selectors: 40 | % 41 | % complement_selector: takes the complement of a selector's output 42 | % intersection_selector: takes the intersection of the outputs of selectors 43 | % union_selector: takes the union of the 
outputs of selectors 44 | 45 | % Copyright (c) 2011--2014 Roman Garnett. -------------------------------------------------------------------------------- /selectors/unlabeled_selector.m: -------------------------------------------------------------------------------- 1 | % UNLABELED_SELECTOR selects points not yet observed. 2 | % 3 | % Usage: 4 | % 5 | % test_ind = unlabeled_selector(problem, train_ind, observed_labels) 6 | % 7 | % Inputs: 8 | % 9 | % problem: a struct describing the problem, which must at 10 | % least contain the field: 11 | % 12 | % points: an (n x d) data matrix for the available points 13 | % 14 | % train_ind: a list of indices into problem.points indicating 15 | % the thus-far observed points 16 | % observed_labels: a list of labels corresponding to the 17 | % observations in train_ind 18 | % 19 | % Note: this input, part of the standard selector 20 | % API, is ignored by unlabeled_selector. If 21 | % desired, for standalone use it can be replaced by 22 | % an empty matrix. 23 | % 24 | % Output: 25 | % 26 | % test_ind: a list of indices into problem.points indicating the 27 | % points to consider for labeling 28 | % 29 | % See also SELECTORS. 30 | 31 | % Copyright (c) 2013--2014 Roman Garnett. 32 | 33 | function test_ind = unlabeled_selector(problem, train_ind, ~) 34 | 35 | test_ind = identity_selector(problem, [], []); 36 | test_ind(train_ind) = []; 37 | 38 | end --------------------------------------------------------------------------------
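As a closing illustration, the three probability-based computations documented in this section — the predictive margin, the marginal entropy, and probability-threshold selection — applied to a small hand-made class-probability matrix. This is a Python/NumPy sketch of the formulas only, not toolbox code:

```python
import numpy as np

# toy class-probability matrix: one row per candidate point, one column per class
p = np.array([[0.70, 0.20, 0.10],     # fairly confident
              [0.40, 0.35, 0.25],     # nearly uniform -> uncertain
              [0.99, 0.005, 0.005]])  # almost deterministic

# margin: difference between the largest and second-largest class probability
sorted_p = np.sort(p, axis=1)[:, ::-1]
margins = sorted_p[:, 0] - sorted_p[:, 1]

# marginal entropy, clamping zeros the same way marginal_entropy.m does
q = np.maximum(p, 1e-100)
entropies = -np.sum(q * np.log(q), axis=1)

# probability-threshold selection: keep points with any class probability >= 0.6
selected = np.flatnonzero(np.any(p >= 0.6, axis=1))
```

Margin sampling queries the point with the smallest margin and uncertainty sampling the point with the largest entropy — the near-uniform second row under both criteria here, though the two can disagree when there are more classes. The threshold selector keeps only the first and third rows, whose top probabilities clear 0.6.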