├── LICENSE
├── README.md
├── active_learning.m
├── label_oracles
│   ├── bernoulli_oracle.m
│   ├── label_oracles.m
│   ├── lookup_oracle.m
│   ├── multinomial_oracle.m
│   └── probabilistic_oracle.m
├── models
│   ├── cheating_model.m
│   ├── ensemble.m
│   ├── gaussian_process_model.m
│   ├── knn_model.m
│   ├── label_propagation_model.m
│   ├── model_memory_wrapper.m
│   ├── models.m
│   └── random_forest_model.m
├── other
│   ├── get_label_oracle.m
│   ├── get_model.m
│   ├── get_query_strategy.m
│   ├── get_score_function.m
│   └── get_selector.m
├── query_strategies
│   ├── argmax.m
│   ├── argmin.m
│   ├── expected_error_reduction.m
│   ├── margin_sampling.m
│   ├── query_by_committee.m
│   ├── query_strategies.m
│   └── uncertainty_sampling.m
├── score_functions
│   ├── calculate_entropies.m
│   ├── expected_loss_lookahead.m
│   ├── expected_loss_naive.m
│   ├── expected_utility_lookahead.m
│   ├── expected_utility_naive.m
│   ├── loss_functions
│   │   ├── expected_01_loss.m
│   │   ├── expected_log_loss.m
│   │   └── loss_functions.m
│   ├── margin.m
│   ├── marginal_entropy.m
│   └── score_functions.m
└── selectors
    ├── fixed_test_set_selector.m
    ├── graph_walk_selector.m
    ├── identity_selector.m
    ├── meta_selectors
    │   ├── complement_selector.m
    │   ├── intersection_selector.m
    │   └── union_selector.m
    ├── probability_treshold_selector.m
    ├── random_selector.m
    ├── selectors.m
    └── unlabeled_selector.m
/LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2011--2014 Roman Garnett 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining 6 | a copy of this software and associated documentation files (the 7 | "Software"), to deal in the Software without restriction, including 8 | without limitation the rights to use, copy, modify, merge, publish, 9 | distribute, sublicense, and/or sell copies of the Software, and to 10 | permit persons to whom the Software is furnished to do so, subject to 11 | the following conditions: 12 | 13 | The above copyright notice and this permission notice shall be 14 | included in all copies or 
substantial portions of the Software. 15 | 16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 17 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 18 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 19 | NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE 20 | LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION 21 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION 22 | WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 23 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Active Learning Toolbox for MATLAB 2 | ================================== 3 | 4 | This software package provides a toolbox for testing pool-based 5 | active-learning algorithms in MATLAB. 6 | 7 | Active Learning 8 | --------------- 9 | 10 | Specifically, we consider the following scenario. There is a pool of 11 | datapoints ![X][1]. We may successively select a set of points 12 | ![x in X][2] to observe. Each observation reveals a discrete, 13 | integer-valued label ![y in L][3] for ![x][4]. This labeling process 14 | might be nondeterministic; we might choose the same point ![x][4] 15 | twice and observe different labels each time. In active learning, we 16 | typically assume we have a budget ![B][5] that limits the number of 17 | points we may observe. 18 | 19 | Our goal is to iteratively build a set of observations 20 | 21 | ![D = (X, Y)][6] 22 | 23 | that achieves some goal in an efficient manner. One typical goal is 24 | that this training set allows us to accurately predict the labels on 25 | the unobserved points. Assume we have a probabilistic model 26 | 27 | ![p(y | x, D)][7] 28 | 29 | and let ![U = X \ X][8] represent the set of unobserved points. 
We 30 | might wish to minimize either the 0/1 loss on the unlabeled points 31 | 32 | ![\sum_{x in U} (\hat{y} \neq y)][9], 33 | 34 | where ![\hat{y} = \argmax p(y | x, D)][10], or the log loss: 35 | 36 | ![\sum_{x in U} -\log p(y | x, D)][11]. 37 | 38 | We could sample a random set of ![B][5] points, but by careful 39 | consideration of our observation locations, we hope we can do 40 | significantly better than this. One common active learning strategy, 41 | known as _uncertainty sampling_, iteratively chooses to make an 42 | observation at the point with the largest marginal entropy given the 43 | current data: 44 | 45 | ![x* = \argmax H(y | x, D)][12], 46 | 47 | with the hope that these queries can better map out the boundaries 48 | between classes. 49 | 50 | Of course, there are countless goals besides minimizing generalization 51 | error and numerous other strategies besides the highly myopic 52 | uncertainty sampling. Indeed, many active learning scenarios might not 53 | involve probability models at all. Providing a highly adaptable and 54 | extensible toolbox for conducting arbitrary pool-based active learning 55 | experiments is the goal of this project. 56 | 57 | Using this Toolbox 58 | ------------------ 59 | 60 | The most important function is `active_learning`, which simulates an 61 | active learning experiment using the following procedure: 62 | 63 | Given: initially labeled points X, 64 | corresponding labels Y, 65 | budget B 66 | 67 | for i = 1:B 68 | % find points available for labeling 69 | eligible_points = selector(X, Y) 70 | 71 | % decide on point(s) to observe 72 | x_star = query_strategy(X, Y, eligible_points) 73 | 74 | % observe point(s) 75 | y_star = label_oracle(x_star) 76 | 77 | % add observation(s) to training set 78 | X = [X, x_star] 79 | Y = [Y, y_star] 80 | end 81 | 82 | The implementation supports user-specified: 83 | 84 | * _Selectors,_ which given the current training set, return a set of 85 | points currently eligible for labeling. 
See `selectors.m` for usage 86 | and available implementations. 87 | 88 | * _Query strategies,_ which given a training set and the selected 89 | eligible points, decide which point(s) to observe next. Note that a 90 | query strategy can return multiple points, allowing for batch 91 | observations. See `query_strategies.m` for usage and available 92 | implementations. 93 | 94 | * _Label oracles,_ which given a set of points, return a set of 95 | corresponding labels. Label oracles may optionally be 96 | nondeterministic (see, for example, `bernoulli_oracle`). See 97 | `label_oracles.m` for usage and available implementations. 98 | 99 | Each of these is provided as a function handle satisfying a desired 100 | API, described below. 101 | 102 | This function also supports arbitrary user-specified callbacks 103 | called after each round of the experiment. This can be useful, for 104 | example, for plotting the progress of the algorithm and/or printing 105 | statistics such as test error online. 106 | 107 | Selectors 108 | --------- 109 | 110 | A _selector_ considers the current labeled dataset and indicates which 111 | of the unlabeled points should be considered for observation at this 112 | time. 
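As a concrete sketch of the selector interface documented below, a custom selector might look like the following. The function name and its coin-flip rule are purely illustrative assumptions, not part of the toolbox:

```matlab
% Illustrative custom selector (not part of the toolbox): considers
% each currently unlabeled point independently with probability 1/2.
% Follows the selector interface documented below; the third input
% (observed_labels) is unused here.
function test_ind = coin_flip_selector(problem, train_ind, ~)

  % indices of all points not yet observed
  test_ind = setdiff((1:size(problem.points, 1))', train_ind);

  % keep each unlabeled point independently with probability 1/2
  test_ind = test_ind(rand(size(test_ind)) < 0.5);

end
```

In practice the provided `unlabeled_selector` and `random_selector` cover this kind of behavior; a handwritten selector is only needed for problem-specific eligibility rules.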
113 | 114 | Selectors must satisfy the following interface: 115 | 116 | test_ind = selector(problem, train_ind, observed_labels) 117 | 118 | ### Inputs: ### 119 | 120 | * `problem`: a struct describing the problem, containing fields: 121 | 122 | * `points`: an ![(n x d)][13] data matrix for the available points 123 | * `num_classes`: the number of classes 124 | * `num_queries`: the number of queries to make 125 | 126 | * `train_ind`: a list of indices into `problem.points` indicating the 127 | thus-far observed points 128 | 129 | * `observed_labels`: a list of labels corresponding to the 130 | observations in `train_ind` 131 | 132 | ### Output: ### 133 | 134 | * `test_ind`: a list of indices into `problem.points` indicating the 135 | points to consider for labeling 136 | 137 | The following general-purpose selectors are provided in this toolbox: 138 | 139 | * `fixed_test_set_selector`: selects all points besides a given test 140 | set 141 | * `graph_walk_selector`: confines an experiment to follow a path on a 142 | graph 143 | * `identity_selector`: selects all points 144 | * `random_selector`: selects a random subset of points 145 | * `unlabeled_selector`: selects points not yet observed 146 | 147 | In addition, the following "meta" selectors are provided, which 148 | combine or modify the outputs of other selectors: 149 | 150 | * `complement_selector`: takes the complement of a selector's output 151 | * `intersection_selector`: takes the intersection of the outputs of selectors 152 | * `union_selector`: takes the union of the outputs of selectors 153 | 154 | Query Strategies 155 | ---------------- 156 | 157 | _Query strategies_ select which of the points currently eligible for 158 | labeling (returned by a selector) should be observed next. 
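Because `argmax` (listed below) accepts an arbitrary score function, familiar strategies arise by composition. As a hedged sketch, uncertainty sampling can be recovered by pairing `argmax` with the `marginal_entropy` score; the binding of extra arguments via anonymous functions, and the exact trailing parameters of `marginal_entropy` and `argmax`, are assumptions here:

```matlab
% Illustrative composition (argument binding is an assumption):
% uncertainty sampling as the argmax of the marginal-entropy score.
% "model" is assumed to be a handle to a probability model.
score_function = @(problem, train_ind, observed_labels, test_ind) ...
    marginal_entropy(problem, train_ind, observed_labels, test_ind, model);

query_strategy = @(problem, train_ind, observed_labels, test_ind) ...
    argmax(problem, train_ind, observed_labels, test_ind, score_function);
```

The provided `uncertainty_sampling` packages this behavior directly; the point of the sketch is that new strategies can be assembled the same way from new score functions.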
159 | 160 | Query strategies must satisfy the following interface: 161 | 162 | query_ind = query_strategy(problem, train_ind, observed_labels, test_ind) 163 | 164 | ### Inputs: ### 165 | 166 | * `problem`: a struct describing the problem, containing fields: 167 | 168 | * `points`: an ![(n x d)][13] data matrix for the available points 169 | * `num_classes`: the number of classes 170 | * `num_queries`: the number of queries to make 171 | 172 | * `train_ind`: a list of indices into `problem.points` indicating the 173 | thus-far observed points 174 | 175 | * `observed_labels`: a list of labels corresponding to the 176 | observations in `train_ind` 177 | * `test_ind`: a list of indices into `problem.points` indicating the 178 | points eligible for observation 179 | 180 | ### Output: ### 181 | 182 | * `query_ind`: an index into `problem.points` indicating the point(s) 183 | to query next (every entry in `query_ind` will always be a member 184 | of the set of points in `test_ind`) 185 | 186 | The following query strategies are provided in this toolbox: 187 | 188 | * `argmax`: samples the point(s) maximizing a given score function 189 | * `argmin`: samples the point(s) minimizing a given score function 190 | * `expected_error_reduction`: samples the point giving the lowest 191 | expected loss on unlabeled points 192 | * `margin_sampling`: samples the point with the smallest margin 193 | * `query_by_committee`: samples the point with the highest disagreement 194 | between models 195 | * `uncertainty_sampling`: samples the most uncertain point 196 | 197 | Label Oracles 198 | ------------- 199 | 200 | _Label oracles_ are functions that, given a set of points chosen to be 201 | queried, return a set of corresponding labels. In general, they need 202 | not be deterministic, which is especially interesting when points can 203 | be queried multiple times. 
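Oracles that take extra parameters can be adapted to the interface below with an anonymous function. For example, using the documented signature of `bernoulli_oracle`, and assuming `probabilities` is a length-n vector of class-1 ("success") probabilities you have constructed:

```matlab
% Bind the extra "probabilities" argument of bernoulli_oracle so the
% resulting handle matches the standard label oracle interface;
% "probabilities" is an assumed length-n vector of class-1 probabilities.
label_oracle = @(problem, train_ind, observed_labels, query_ind) ...
    bernoulli_oracle(problem, train_ind, observed_labels, ...
                     query_ind, probabilities);
```

The helper `get_label_oracle` (in `other/`) exists to build such handles concisely.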
204 | 205 | Label oracles must satisfy the following interface: 206 | 207 | label = label_oracle(problem, query_ind) 208 | 209 | ### Inputs: ### 210 | 211 | * `problem`: a struct describing the problem, containing fields: 212 | 213 | * `points`: an ![(n x d)][13] data matrix for the available points 214 | * `num_classes`: the number of classes 215 | 216 | * `query_ind`: an index into `problem.points` specifying the point(s) to be 217 | queried 218 | 219 | ### Output: ### 220 | 221 | * `label`: a list of integers between 1 and `problem.num_classes` 222 | indicating the observed label(s) 223 | 224 | The following general-purpose label oracles are provided in this 225 | toolbox: 226 | 227 | * `lookup_oracle`: a trivial lookup-table label oracle given a fixed 228 | list of ground-truth labels 229 | * `bernoulli_oracle`: a label oracle that, conditioned on the queried 230 | point(s), samples labels independently from a Bernoulli distribution 231 | with given success probability 232 | * `multinomial_oracle`: a label oracle that, conditioned on the 233 | queried point(s), samples labels independently from a multinomial 234 | distribution with given success probabilities 235 | 236 | [1]: http://latex.codecogs.com/svg.latex?%5Cmathcal%7BX%7D 237 | [2]: http://latex.codecogs.com/svg.latex?x%20%5Cin%20%5Cmathcal%7BX%7D 238 | [3]: http://latex.codecogs.com/svg.latex?y%20%5Cin%20%5BL%5D 239 | [4]: http://latex.codecogs.com/svg.latex?x 240 | [5]: http://latex.codecogs.com/svg.latex?B 241 | [6]: http://latex.codecogs.com/svg.latex?%5Cmathcal%7BD%7D%20%3D%20%5Cbigl%5Clbrace%20(x_i%2C%20y_i)%20%5Cbigr%20%5Crbrace_%7Bi%3D1%7D%5EB%20%3D%20(X%2C%20Y) 242 | [7]: http://latex.codecogs.com/svg.latex?p(y%20%5Cmid%20x%2C%20%5Cmathcal%7BD%7D), 243 | [8]: http://latex.codecogs.com/svg.latex?%5Cmathcal%7BU%7D%20%3D%20%5Cmathcal%7BX%7D%20%5Csetminus%20X 244 | [9]: http://latex.codecogs.com/svg.latex?%5Csum_%7Bx%20%5Cin%20%5Cmathcal%7BU%7D%7D%20%5B%5Chat%7By%7D%20%5Cneq%20y%5D 245 | [10]: 
http://latex.codecogs.com/svg.latex?%5Chat%7By%7D%20%3D%20%5Coperatorname%7Barg%5C%2Cmax%7D%20p(y%20%5Cmid%20x%2C%20%5Cmathcal%7BD%7D) 246 | [11]: http://latex.codecogs.com/svg.latex?%5Csum_%7Bx%20%5Cin%20%5Cmathcal%7BU%7D%7D%20-%5Clog%20p(y%20%5Cmid%20x%2C%20%5Cmathcal%7BD%7D) 247 | [12]: http://latex.codecogs.com/svg.latex?x%5E%5Cast%20%3D%20%5Coperatorname%7Barg%5C%2Cmax%7D_x%20H%5By%20%5Cmid%20x%2C%20%5Cmathcal%7BD%7D%5D 248 | [13]: http://latex.codecogs.com/svg.latex?(n%20%5Ctimes%20d) 249 | -------------------------------------------------------------------------------- /active_learning.m: -------------------------------------------------------------------------------- 1 | % ACTIVE_LEARNING simulates an active learning experiment. 2 | % 3 | % This function performs active learning on a set of discrete points 4 | % using a given query strategy. An active-learning experiment is 5 | % simulated using the following procedure: 6 | % 7 | % Given: initially labeled points x, 8 | % corresponding labels y, 9 | % budget B 10 | % 11 | % for i = 1:B 12 | % % find points available for labeling 13 | % eligible_points = selector(x, y) 14 | % 15 | % % decide on point(s) to observe 16 | % x_star = query_strategy(x, y, eligible_points) 17 | % 18 | % % observe point(s) 19 | % y_star = label_oracle(x_star) 20 | % 21 | % % add observation(s) to training set 22 | % x = [x, x_star] 23 | % y = [y, y_star] 24 | % end 25 | % 26 | % This function supports user-specified: 27 | % 28 | % * _Selectors,_ which given the current training set, return a set of 29 | % points currently eligible for labeling. See selectors.m for usage 30 | % and available implementations. 31 | % 32 | % * _Query strategies,_ which given a training set and the selected 33 | % eligible points, decide which point(s) to observe next. Note that 34 | % a query strategy can return multiple points, allowing for batch 35 | % observations. See query_strategies.m for usage and available 36 | % implementations. 
37 | % 38 | % * _Label oracles,_ which given a set of points, return a set of 39 | % corresponding labels. Label oracles may optionally be 40 | % nondeterministic (see, for example, bernoulli_oracle). See 41 | % label_oracles.m for usage and available implementations. 42 | % 43 | % This function also supports arbitrary user-specified callbacks 44 | % called after each round of the experiment. This can be useful, for 45 | % example, for plotting the progress of the algorithm and/or printing 46 | % statistics such as test error online. 47 | % 48 | % Usage: 49 | % 50 | % [chosen_ind, chosen_labels] = ... 51 | % active_learning(problem, train_ind, observed_labels, label_oracle, ... 52 | % selector, query_strategy, callback) 53 | % 54 | % Inputs: 55 | % 56 | % problem: a struct describing the problem, containing fields: 57 | % 58 | % points: an (n x d) data matrix for the available points 59 | % num_classes: the number of classes 60 | % num_queries: the number of queries to make 61 | % verbose: whether to print information regarding 62 | % each query (default: false) 63 | % 64 | % train_ind: a (possibly empty) list of indices into 65 | % problem.points indicating the labeled points at 66 | % start 67 | % observed_labels: a (possibly empty) list of labels corresponding 68 | % to the observations in train_ind 69 | % label_oracle: a handle to a label oracle, which takes an index 70 | % into problem.points and returns a label 71 | % selector: a handle to a point selector, which specifies 72 | % which points are eligible to query at a given time 73 | % query_strategy: a handle to a query strategy 74 | % callback: (optional) a handle to an arbitrary user-defined 75 | % callback called after each new point is queried. 76 | % The callback will be called as 77 | % 78 | % callback(problem, train_ind, observed_labels) 79 | % 80 | % and anything returned will be ignored. 
81 | % 82 | % Outputs: 83 | % 84 | % chosen_ind: a list of indices of the chosen datapoints, in order 85 | % chosen_labels: a list of the corresponding observed labels 86 | % 87 | % See also LABEL_ORACLES, SELECTORS, QUERY_STRATEGIES. 88 | 89 | % Copyright (c) 2011--2014 Roman Garnett. 90 | 91 | function [chosen_ind, chosen_labels] = ... 92 | active_learning(problem, train_ind, observed_labels, label_oracle, ... 93 | selector, query_strategy, callback) 94 | 95 | % set verbose to false if not defined 96 | verbose = isfield(problem, 'verbose') && problem.verbose; 97 | 98 | chosen_ind = []; 99 | chosen_labels = []; 100 | 101 | % store number of initial training points (this can be used to track 102 | % the number of points selected thus far) 103 | problem.num_initial = numel(train_ind); 104 | 105 | for i = 1:problem.num_queries 106 | if (verbose) 107 | tic; 108 | fprintf('point %i:', i); 109 | end 110 | 111 | % get list of points to consider for querying this round 112 | test_ind = selector(problem, train_ind, observed_labels); 113 | if (verbose) 114 | fprintf(' %i points for consideration ... ', numel(test_ind)); 115 | end 116 | 117 | % end early if no points returned from selector 118 | if (isempty(test_ind)) 119 | if (verbose) 120 | fprintf('\n'); 121 | end 122 | warning('active_learning:no_points_selected', ... 123 | ['after %i steps, no points were selected. ' ... 124 | 'Ending run early!'], i); 125 | 126 | return; 127 | end 128 | 129 | % shortcut if only one point available 130 | if (numel(test_ind) == 1) 131 | this_chosen_ind = test_ind; 132 | else 133 | % select location(s) of next observation(s) from the given list 134 | this_chosen_ind = ... 135 | query_strategy(problem, train_ind, observed_labels, test_ind); 136 | end 137 | 138 | % observe label(s) at chosen location(s) 139 | this_chosen_labels = ... 
140 | label_oracle(problem, train_ind, observed_labels, this_chosen_ind); 141 | 142 | % update lists with new observation(s) 143 | chosen_ind = [chosen_ind; this_chosen_ind]; 144 | train_ind = [train_ind; this_chosen_ind]; 145 | 146 | chosen_labels = [chosen_labels; this_chosen_labels]; 147 | observed_labels = [observed_labels; this_chosen_labels]; 148 | if (verbose) 149 | num_observations = numel(this_chosen_ind); 150 | observation_format_string = repmat('%i ', [1, num_observations]); 151 | observation_format_string = observation_format_string(1:(end - 1)); 152 | 153 | label_format_string = repmat('%i/', [1, problem.num_classes]); 154 | label_format_string = label_format_string(1:(end - 1)); 155 | 156 | fprintf(sprintf('done. Point chosen: %s (label: %s), took: %%.2fs. Cumulative label totals: [%s].\n', ... 157 | observation_format_string, ... 158 | observation_format_string, ... 159 | label_format_string), ... 160 | this_chosen_ind, ... 161 | this_chosen_labels, ... 162 | toc, ... 163 | accumarray(chosen_labels, 1, [problem.num_classes, 1])); 164 | end 165 | 166 | % call callback, if defined 167 | if (nargin > 6) 168 | callback(problem, train_ind, observed_labels); 169 | end 170 | end 171 | 172 | end -------------------------------------------------------------------------------- /label_oracles/bernoulli_oracle.m: -------------------------------------------------------------------------------- 1 | % BERNOULLI_ORACLE Bernoulli oracle with given success probabilities. 2 | % 3 | % This provides a label oracle that, conditioned on queried point(s), 4 | % samples labels independently from a Bernoulli with given success 5 | % probability. Here membership to class 1 is treated as "success." 6 | % 7 | % Usage: 8 | % 9 | % label = bernoulli_oracle(problem, train_ind, observed_labels, ... 
10 | % query_ind, probabilities) 11 | % 12 | % Inputs: 13 | % 14 | % problem: a struct describing the problem, containing the 15 | % field: 16 | % 17 | % points: an (n x d) data matrix for the available points 18 | % 19 | % train_ind: a list of indices into problem.points indicating 20 | % the thus-far observed points 21 | % observed_labels: a list of labels corresponding to the 22 | % observations in train_ind 23 | % 24 | % Note: the above inputs, part of the standard 25 | % label oracle API, are ignored by 26 | % bernoulli_oracle. If desired, for standalone use 27 | % they can be replaced by empty matrices. 28 | % 29 | % query_ind: an index into problem.points specifying the 30 | % point(s) to be queried 31 | % probabilities: a length-n vector of success probabilities 32 | % corresponding to the points in problem.points 33 | % 34 | % Output: 35 | % 36 | % label: a list of integers between 1 and problem.num_classes 37 | % indicating the observed label(s) 38 | % 39 | % See also LABEL_ORACLES, MULTINOMIAL_ORACLE. 40 | 41 | % Copyright (c) 2013--2016 Roman Garnett. 42 | 43 | function label = bernoulli_oracle(~, ~, ~, query_ind, probabilities) 44 | 45 | label = 1 + (rand(size(query_ind(:))) > probabilities(query_ind)); 46 | 47 | end -------------------------------------------------------------------------------- /label_oracles/label_oracles.m: -------------------------------------------------------------------------------- 1 | % Label oracles are functions that, given a set of point(s) chosen to 2 | % be queried, return a list of corresponding label(s). In general, 3 | % they need not be deterministic, which is especially interesting when 4 | % points can be queried multiple times. 
5 | % 6 | % Label oracles must satisfy the following interface: 7 | % 8 | % label = label_oracle(problem, train_ind, observed_labels, query_ind) 9 | % 10 | % Inputs: 11 | % 12 | % problem: a struct describing the problem, containing the 13 | % fields: 14 | % 15 | % points: an (n x d) data matrix for the available points 16 | % num_classes: the number of classes 17 | % 18 | % train_ind: a list of indices into problem.points indicating 19 | % the thus-far observed points 20 | % observed_labels: a list of labels corresponding to the 21 | % observations in train_ind 22 | % query_ind: an index into problem.points specifying the 23 | % point(s) to be queried 24 | % 25 | % Output: 26 | % 27 | % label: a list of integers between 1 and problem.num_classes 28 | % indicating the observed label(s) 29 | % 30 | % The following general-purpose label oracles are provided in this 31 | % toolbox: 32 | % 33 | % lookup_oracle: a trivial lookup-table label oracle given a 34 | % fixed list of ground-truth labels 35 | % bernoulli_oracle: a label oracle that, conditioned on the 36 | % queried point(s), samples labels independently 37 | % from a Bernoulli distribution with given 38 | % success probability 39 | % multinomial_oracle: a label oracle that, conditioned on the 40 | % queried point(s), samples labels independently 41 | % from a multinomial distribution with given 42 | % success probabilities 43 | % 44 | % For convenience, the function get_label_oracle is provided for 45 | % easily and concisely constructing function handles to label oracles 46 | % for use, e.g., in active_learning.m. 47 | 48 | % Copyright (c) 2014--2016 Roman Garnett. 49 | -------------------------------------------------------------------------------- /label_oracles/lookup_oracle.m: -------------------------------------------------------------------------------- 1 | % LOOKUP_ORACLE trivial lookup-table oracle with fixed labels. 2 | % 3 | % This provides a trivial lookup-table label oracle. 
Given query 4 | % point(s), returns the corresponding label(s) from a given list of 5 | % fixed ground-truth labels. 6 | % 7 | % Usage: 8 | % 9 | % label = lookup_oracle(problem, train_ind, observed_labels, ... 10 | % query_ind, labels) 11 | % 12 | % Inputs: 13 | % 14 | % problem: a struct describing the problem, containing the 15 | % fields: 16 | % 17 | % points: an (n x d) data matrix for the available points 18 | % num_classes: the number of classes 19 | % 20 | % train_ind: a list of indices into problem.points indicating 21 | % the thus-far observed points 22 | % observed_labels: a list of labels corresponding to the 23 | % observations in train_ind 24 | % 25 | % Note: the above inputs, part of the standard 26 | % label oracle API, are ignored by lookup_oracle. If 27 | % desired, for standalone use they can be replaced by 28 | % empty matrices. 29 | % 30 | % query_ind: an index into problem.points specifying the 31 | % point(s) to be queried 32 | % labels: a length-n vector of ground-truth class labels 33 | % for each point in problem.points 34 | % 35 | % Output: 36 | % 37 | % label: a list of integers between 1 and problem.num_classes 38 | % indicating the observed label(s) 39 | % 40 | % See also LABEL_ORACLES. 41 | 42 | % Copyright (c) 2013--2016 Roman Garnett. 43 | 44 | function label = lookup_oracle(~, ~, ~, query_ind, labels) 45 | 46 | label = labels(query_ind); 47 | 48 | end -------------------------------------------------------------------------------- /label_oracles/multinomial_oracle.m: -------------------------------------------------------------------------------- 1 | % MULTINOMIAL_ORACLE multinomial oracle with given probabilities. 2 | % 3 | % This provides a label oracle that, conditioned on queried point(s), 4 | % samples labels independently from a multinomial with given marginal 5 | % probabilities. 
6 | % 7 | % Usage: 8 | % 9 | % label = multinomial_oracle(problem, train_ind, observed_labels, 10 | % query_ind, probabilities) 11 | % 12 | % Inputs: 13 | % 14 | % problem: a struct describing the problem, containing the 15 | % fields: 16 | % 17 | % points: an (n x d) data matrix for the available points 18 | % num_classes: the number of classes 19 | % 20 | % train_ind: a list of indices into problem.points indicating 21 | % the thus-far observed points 22 | % observed_labels: a list of labels corresponding to the 23 | % observations in train_ind 24 | % 25 | % Note: the above inputs, part of the standard 26 | % label oracle API, are ignored by 27 | % multinomial_oracle. If desired, for standalone 28 | % use they can be replaced by empty matrices. 29 | % 30 | % query_ind: an index into problem.points specifying the 31 | % point(s) to be queried 32 | % probabilities: an (n x problem.num_classes) matrix of 33 | % class-membership probabilities corresponding to 34 | % the points in problem.points 35 | % 36 | % Output: 37 | % 38 | % label: a list of integers between 1 and problem.num_classes 39 | % indicating the observed label(s) 40 | % 41 | % See also LABEL_ORACLES, BERNOULLI_ORACLE. 42 | 43 | % Copyright (c) 2013--2016 Roman Garnett. 44 | 45 | function label = multinomial_oracle(~, ~, ~, query_ind, probabilities) 46 | 47 | label = 1 + sum(bsxfun(@gt, rand(size(query_ind(:))), ... 48 | cumsum(probabilities(query_ind, :), 2)), 2); 49 | 50 | end -------------------------------------------------------------------------------- /label_oracles/probabilistic_oracle.m: -------------------------------------------------------------------------------- 1 | % PROBABILISTIC_ORACLE multinomial oracle with model probabilities. 2 | % 3 | % This provides a label oracle that, conditioned on queried point(s), 4 | % samples labels independently from a multinomial with marginal 5 | % probabilities computed from a given model. 
6 | % 7 | % Usage: 8 | % 9 | % label = probabilistic_oracle(problem, train_ind, observed_labels, 10 | % query_ind, model) 11 | % 12 | % Inputs: 13 | % 14 | % problem: a struct describing the problem, containing the 15 | % fields: 16 | % 17 | % points: an (n x d) data matrix for the available points 18 | % num_classes: the number of classes 19 | % 20 | % train_ind: a list of indices into problem.points indicating 21 | % the thus-far observed points 22 | % observed_labels: a list of labels corresponding to the 23 | % observations in train_ind 24 | % query_ind: an index into problem.points specifying the 25 | % point(s) to be queried 26 | % model: a function handle to a model to use 27 | % 28 | % Output: 29 | % 30 | % label: a list of integers between 1 and problem.num_classes 31 | % indicating the observed label(s) 32 | % 33 | % See also LABEL_ORACLES, MULTINOMIAL_ORACLE, MODELS. 34 | 35 | % Copyright (c) 2016 Roman Garnett. 36 | 37 | function label = probabilistic_oracle(problem, train_ind, observed_labels, ... 38 | query_ind, model) 39 | 40 | probabilities = model(problem, train_ind, observed_labels, query_ind); 41 | 42 | label = 1 + sum(bsxfun(@gt, rand(size(query_ind(:))), ... 43 | cumsum(probabilities, 2)), 2); 44 | 45 | end -------------------------------------------------------------------------------- /models/cheating_model.m: -------------------------------------------------------------------------------- 1 | % CHEATING_MODEL a "cheating" model that queries a label oracle. 2 | % 3 | % This model always predicts a delta distribution for each test point 4 | % with mass on the output of a given label oracle. This can be useful 5 | % for comparing against a theoretically optimal algorithm. 6 | % 7 | % Usage: 8 | % 9 | % probabilities = cheating_model(problem, train_ind, observed_labels, ... 
10 | % test_ind, label_oracle) 11 | % 12 | % Inputs: 13 | % 14 | % problem: a struct describing the problem, which must at 15 | % least contain the field: 16 | % 17 | % num_classes: the number of classes 18 | % 19 | % as well as any fields that may be required by 20 | % the label oracle below. 21 | % 22 | % train_ind: a list of indices into problem.points indicating 23 | % the thus-far observed points 24 | % 25 | % Note: this input, part of the standard 26 | % probability model API, is ignored by 27 | % cheating_model. If desired, for standalone use it 28 | % can be replaced by an empty matrix. 29 | % 30 | % observed_labels: a list of labels corresponding to the 31 | % observations in train_ind 32 | % 33 | % Note: this input, part of the standard 34 | % probability model API, is ignored by 35 | % cheating_model. If desired, for standalone use it 36 | % can be replaced by an empty matrix. 37 | % 38 | % test_ind: a list of indices into problem.points indicating 39 | % the test points 40 | % label_oracle: a handle to a label oracle, which takes an index 41 | % into problem.points and returns a label 42 | % 43 | % Output: 44 | % 45 | % probabilities: a matrix of posterior probabilities. The ith column 46 | % gives the posterior probabilities p(y = i | x, D) 47 | % for each of the indicated test points; here 48 | % p(y = i | x, D) = 1 if the label oracle outputs i 49 | % for y; otherwise 0. 50 | % 51 | % See also MODELS, LABEL_ORACLES. 52 | 53 | % Copyright (c) 2012--2014 Roman Garnett. 
54 | 55 | function probabilities = cheating_model(problem, ~, ~, test_ind, label_oracle) 56 | 57 | num_test = numel(test_ind); 58 | 59 | probabilities = zeros(num_test, problem.num_classes); 60 | for i = 1:num_test 61 | probabilities(i, label_oracle(problem, test_ind(i))) = 1; 62 | end 63 | 64 | end -------------------------------------------------------------------------------- /models/ensemble.m: -------------------------------------------------------------------------------- 1 | % ENSEMBLE makes predictions using a weighted ensemble of models. 2 | % 3 | % This is an implementation of a weighted ensemble of models. Let M = 4 | % {M_j} be a set of probability models, and let a point x and a set of 5 | % observations D = (X, Y) be given. The ensemble probabilities are 6 | % given by 7 | % 8 | % p(y = i | x, D) = \sum_j w_j(D) p(y = i | x, D, M_j) 9 | % / \sum_j w_j(D), 10 | % 11 | % where w(D) is a (possibly data-dependent) weight vector of length 12 | % |M|. 13 | % 14 | % This implementation also supports so-called "hard" voting by the 15 | % ensemble members, where in the above the posterior probabilities 16 | % 17 | % p(y | x, D, M_j) 18 | % 19 | % are replaced by a Kronecker \delta distribution on the 20 | % most-confident label according to model M_j: 21 | % 22 | % \delta[ \argmax_i p(y = i | x, D, M_j) ]. 23 | % 24 | % Usage: 25 | % 26 | % probabilities = ensemble(problem, train_ind, observed_labels, ... 
27 | % test_ind, models, weights, hard_votes) 28 | % 29 | % Required Inputs: 30 | % 31 | % problem: a struct describing the problem, containing fields: 32 | % 33 | % points: an (n x d) data matrix for the available points 34 | % num_classes: the number of classes 35 | % 36 | % train_ind: a list of indices into problem.points indicating 37 | % the thus-far observed points 38 | % observed_labels: a list of labels corresponding to the 39 | % observations in train_ind 40 | % test_ind: a list of indices into problem.points indicating 41 | % the points eligible for observation 42 | % models: a cell array of handles to probability models 43 | % 44 | % Optional Inputs: 45 | % 46 | % weights: either a length-|M| vector of model weights or a 47 | % function handle returning such a vector (see note 48 | % below for details). 49 | % (default: ones(1, |M|) / |M|) 50 | % hard_votes: a boolean indicating whether to use "hard" voting 51 | % (default: false) 52 | % 53 | % Output: 54 | % 55 | % probabilities: a matrix of posterior probabilities. The ith 56 | % column gives p(y = i | x, D) for each of the 57 | % indicated test points. 58 | % 59 | % Note on Model Weights: 60 | % 61 | % This implementation supports both fixed and data-dependent model 62 | % weights w(D). The latter might be useful to, for example, weight 63 | % ensemble members by an estimate of their accuracy or by an estimate 64 | % of their posterior probabilities in a Bayesian fashion. 65 | % Data-dependent weights are implemented by providing a function 66 | % handle to a weight function which will be called as 67 | % 68 | % weights = weight_function(problem, train_ind, observed_labels, models), 69 | % 70 | % and must return a length-|M| vector of weights corresponding to the 71 | % models in models. 72 | % 73 | % See also MODELS, QUERY_BY_COMMITTEE. 74 | 75 | % Copyright (c) 2014 Roman Garnett. 76 | 77 | function probabilities = ensemble(problem, train_ind, observed_labels, ... 
78 | test_ind, models, weights, hard_votes) 79 | 80 | num_test = numel(test_ind); 81 | num_models = numel(models); 82 | 83 | % default to uniform model weights 84 | if ((nargin < 6) || isempty(weights)) 85 | weights = (1 / num_models) + zeros(1, num_models); 86 | end 87 | 88 | % default to "soft" votes 89 | if (nargin < 7) 90 | hard_votes = false; 91 | end 92 | 93 | % determine weight vector if weight function is provided 94 | if (isa(weights, 'function_handle')) 95 | weights = weights(problem, train_ind, observed_labels, models); 96 | end 97 | 98 | votes = zeros(num_test, problem.num_classes); 99 | 100 | for i = 1:num_models 101 | probabilities = models{i}(problem, train_ind, observed_labels, test_ind); 102 | 103 | if (hard_votes) 104 | % "hard" votes: each model votes only for its most-confident 105 | % prediction 106 | [~, this_votes] = max(probabilities, [], 2); 107 | 108 | votes = votes + weights(i) * ... 109 | accumarray([(1:num_test)', this_votes], 1, ... 110 | [num_test, problem.num_classes]); 111 | 112 | else 113 | % "soft" votes: each model votes for each label with a weight equal to 114 | % its posterior probability 115 | votes = votes + weights(i) * probabilities; 116 | end 117 | end 118 | 119 | % normalize probabilities 120 | probabilities = bsxfun(@times, votes, 1 ./ sum(votes, 2)); 121 | 122 | end -------------------------------------------------------------------------------- /models/gaussian_process_model.m: -------------------------------------------------------------------------------- 1 | % GAUSSIAN_PROCESS_MODEL a binary Gaussian process classifier. 2 | % 3 | % This is an implementation of a Gaussian process (binary) 4 | % classifier. Requires the GPML toolkit available here: 5 | % 6 | % http://www.gaussianprocess.org/gpml/code/matlab/doc 7 | % 8 | % Usage: 9 | % 10 | % probabilities = gaussian_process_model(problem, train_ind, ... 11 | % observed_labels, test_ind, hyperparameters, inference_method, ... 
12 | % mean_function, covariance_function, likelihood) 13 | % 14 | % Inputs: 15 | % 16 | % problem: a struct describing the problem, which must 17 | % at least contain the field: 18 | % 19 | % points: an (n x d) data matrix for the available 20 | % points 21 | % 22 | % train_ind: a list of indices into problem.points 23 | % indicating the thus-far observed points 24 | % observed_labels: a list of labels corresponding to the 25 | % observations in train_ind 26 | % test_ind: a list of indices into problem.points indicating 27 | % the test points 28 | % hyperparameters: a GPML hyperparameter structure 29 | % inference_method: a GPML inference method 30 | % mean_function: a GPML mean function 31 | % covariance_function: a GPML covariance function 32 | % likelihood: a GPML likelihood 33 | % 34 | % Output: 35 | % 36 | % probabilities: a matrix of posterior probabilities. The first 37 | % column gives p(y = 1 | x, D) for each of the 38 | % indicated test points; the second column gives 39 | % p(y \neq 1 | x, D). 40 | % 41 | % See also MODELS, GP. 42 | 43 | % Copyright (c) 2011--2016 Roman Garnett. 44 | 45 | function probabilities = gaussian_process_model(problem, train_ind, ... 46 | observed_labels, test_ind, hyperparameters, inference_method, ... 47 | mean_function, covariance_function, likelihood) 48 | 49 | % transform labels to match what GPML expects 50 | observed_labels(observed_labels ~= 1) = -1; 51 | 52 | num_test = numel(test_ind); 53 | 54 | [~, ~, ~, ~, log_probabilities] = gp(hyperparameters, inference_method, ... 55 | mean_function, covariance_function, likelihood, ... 56 | problem.points(train_ind, :), observed_labels, ... 
57 | problem.points(test_ind, :), ones(num_test, 1)); 58 | 59 | probabilities = exp(log_probabilities); 60 | 61 | % return probabilities for "class 1" and "not class 1" 62 | probabilities = [probabilities, (1 - probabilities)]; 63 | 64 | end 65 | -------------------------------------------------------------------------------- /models/knn_model.m: -------------------------------------------------------------------------------- 1 | % KNN_MODEL weighted k-NN classifier. 2 | % 3 | % Suppose the problem has n points, and W is an (n x n) matrix of 4 | % pairwise weights. We assume the marginal label distribution at a 5 | % point x is a categorical distribution with probability vector p(x): 6 | % 7 | % p(y | x) = Categorical(p(x)). 8 | % 9 | % We place identical Dirichlet priors on the p(x) vectors with 10 | % hyperparameter vector \alpha: 11 | % 12 | % p(p(x) | x, \alpha) = Dirichlet(\alpha). 13 | % 14 | % Finally, given observations D = {(X, Y)} and a point x, we update 15 | % the posterior probability vector p(x) by accumulating weighted 16 | % counts of the observations near x (where "near" is defined by the 17 | % weight matrix W): 18 | % 19 | % p(p(x) | x, D, \alpha) = Dirichlet(\alpha + C(x)), 20 | % 21 | % where 22 | % 23 | % C_i(x) = \sum_{x' \in D, y' = i} W(x, x'). 24 | % 25 | % Now, given x and D, we output the Categorical distribution with 26 | % the posterior mean of p(x) given D: 27 | % 28 | % p(y | x, D, \alpha) = Categorical( E[p(x) | x, D, \alpha] ). 29 | % 30 | % Usage: 31 | % 32 | % probabilities = knn_model(problem, train_ind, observed_labels, ... 
33 | % test_ind, weights, alpha) 34 | % 35 | % Inputs: 36 | % 37 | % problem: a struct describing the problem, containing the 38 | % fields: 39 | % 40 | % points: an (n x d) data matrix for the available points 41 | % num_classes: the number of classes 42 | % 43 | % train_ind: a list of indices into problem.points indicating 44 | % the thus-far observed points 45 | % observed_labels: a list of labels corresponding to the 46 | % observations in train_ind 47 | % test_ind: a list of indices into problem.points indicating 48 | % the test points 49 | % weights: an (n x n) matrix of weights 50 | % alpha: the hyperparameter vector \alpha 51 | % (1 x problem.num_classes) 52 | % 53 | % Output: 54 | % 55 | % probabilities: a matrix of posterior probabilities. The ith 56 | % column gives p(y = i | x, D) for each of the 57 | % indicated test points. 58 | % 59 | % See also MODELS. 60 | 61 | % Copyright (c) 2011--2014 Roman Garnett. 62 | 63 | function probabilities = knn_model(problem, train_ind, observed_labels, ... 64 | test_ind, weights, alpha) 65 | 66 | num_test = numel(test_ind); 67 | probabilities = zeros(num_test, problem.num_classes); 68 | 69 | % accumulate weighted number of successes for each class 70 | for i = 1:problem.num_classes 71 | probabilities(:, i) = alpha(i) + ... 72 | sum(weights(test_ind, train_ind(observed_labels == i)), 2); 73 | end 74 | 75 | % normalize probabilities 76 | probabilities = bsxfun(@times, probabilities, 1 ./ sum(probabilities, 2)); 77 | 78 | end 79 | -------------------------------------------------------------------------------- /models/label_propagation_model.m: -------------------------------------------------------------------------------- 1 | % LABEL_PROPAGATION_MODEL partially absorbing label propagation. 2 | % 3 | % This is an implementation of the partially absorbing label 4 | % propagation algorithm described in: 5 | % 6 | % Neumann, M., Garnett, R., and Kersting, K. 
Coinciding Walk 7 | % Kernels: Parallel Absorbing Random Walks for Learning with Graphs 8 | % and Few Labels. Proceedings of the 5th Annual Asian 9 | % Conference on Machine Learning (ACML 2013). 10 | % 11 | % Usage: 12 | % 13 | % probabilities = label_propagation_model(problem, train_ind, ... 14 | % observed_labels, test_ind, A, varargin) 15 | % 16 | % Required inputs: 17 | % 18 | % problem: a struct describing the problem, which must at 19 | % least contain the field: 20 | % 21 | % num_classes: the number of classes 22 | % 23 | % train_ind: a list of indices into A indicating the thus-far 24 | % observed nodes 25 | % observed_labels: a list of labels corresponding to the 26 | % observations in train_ind 27 | % test_ind: a list of indices into A indicating the test 28 | % nodes 29 | % A: a weighted adjacency matrix for the desired graph 30 | % containing transition probabilities. A should be 31 | % row-normalized. 32 | % 33 | % Optional name/value-pair arguments specified after required inputs: 34 | % 35 | % 'num_iterations': the number of label propagation iterations to 36 | % perform (default: 200) 37 | % 'alpha': the absorption parameter to use in [0, 1] 38 | % (default: 1, corresponds to standard label 39 | % propagation) 40 | % 'use_prior': a boolean indicating whether to use the 41 | % empirical distribution on the training points 42 | % as the prior (true) or a uniform prior 43 | % (false) (default: false) 44 | % 'pseudocount': if use_prior is set to true, a per-class 45 | % pseudocount can also be specified (default: 0.1) 46 | % 47 | % Output: 48 | % 49 | % probabilities: a matrix of posterior probabilities. The ith 50 | % column gives p(y = i | x, D) for each of the 51 | % indicated test points. 52 | % 53 | % See also MODELS, GRAPH_WALK_SELECTOR. 54 | 55 | % Copyright (c) 2014 Roman Garnett. 56 | 57 | function probabilities = label_propagation_model(problem, train_ind, ... 
58 | observed_labels, test_ind, A, varargin) 59 | 60 | % parse optional inputs 61 | options = inputParser; 62 | 63 | options.addParamValue('num_iterations', 200, ... 64 | @(x) (isscalar(x) && (x >= 0))); 65 | options.addParamValue('alpha', 1, ... 66 | @(x) (isscalar(x) && (x >= 0) && (x <= 1))); 67 | options.addParamValue('use_prior', false, ... 68 | @(x) (islogical(x) && (numel(x) == 1))); 69 | options.addParamValue('pseudocount', 0.1, ... 70 | @(x) (isscalar(x) && (x > 0))); 71 | 72 | options.parse(varargin{:}); 73 | options = options.Results; 74 | 75 | % row-normalize A if it is not already row-normalized 76 | if (any(sum(A, 2) ~= 1)) 77 | A = bsxfun(@times, A, 1 ./ sum(A, 2)); 78 | end 79 | 80 | num_nodes = size(A, 1); 81 | num_classes = problem.num_classes; 82 | num_train = numel(train_ind); 83 | 84 | if (options.use_prior) 85 | prior = options.pseudocount + ... 86 | accumarray([ones(num_train, 1), observed_labels(:)], 1, [1, num_classes]); 87 | prior = prior * (1 ./ sum(prior)); 88 | else 89 | prior = ones(1, num_classes) * (1 / num_classes); 90 | end 91 | 92 | % expand graph with pseudonodes corresponding to the classes 93 | num_expanded_nodes = num_nodes + num_classes; 94 | 95 | A = [A, sparse(num_nodes, num_classes); ... 96 | sparse(num_classes, num_expanded_nodes)]; 97 | 98 | % reduce weight of edges leaving training nodes by a factor of 99 | % (1 - alpha) 100 | A(train_ind, :) = (1 - options.alpha) * A(train_ind, :); 101 | 102 | % add edges from training nodes to label nodes with weight alpha 103 | A = A + sparse(train_ind, num_nodes + observed_labels, options.alpha, ... 104 | num_expanded_nodes, num_expanded_nodes); 105 | 106 | % add self loops on label nodes 107 | pseudo_train_ind = (num_nodes + 1):num_expanded_nodes; 108 | A(pseudo_train_ind, pseudo_train_ind) = speye(num_classes); 109 | 110 | % begin with prior on all nodes 111 | probabilities = repmat(prior, [num_nodes + num_classes, 1]); 112 | 113 | % fill in known training labels 114 | probabilities(train_ind, :) = ... 
115 | accumarray([(1:num_train)', observed_labels], 1, [num_train, num_classes]); 116 | 117 | % add known labels for label nodes 118 | probabilities(pseudo_train_ind, :) = eye(num_classes); 119 | 120 | for i = 1:options.num_iterations 121 | % propagate labels 122 | probabilities = A * probabilities; 123 | end 124 | 125 | probabilities = probabilities(test_ind, :); 126 | end -------------------------------------------------------------------------------- /models/model_memory_wrapper.m: -------------------------------------------------------------------------------- 1 | % MODEL_MEMORY_WRAPPER memoizes a model: if called again with the same 2 | % (train_ind, observed_labels, test_ind), the cached probabilities are 3 | % returned rather than recomputed. 4 | 5 | function probabilities = model_memory_wrapper(problem, train_ind, ... 6 | observed_labels, test_ind, model) 7 | 8 | persistent last_train_ind last_observed_labels last_test_ind last_probabilities; 9 | 10 | if (isequal(train_ind, last_train_ind) && ... 11 | isequal(observed_labels, last_observed_labels) && ... 12 | isequal(test_ind, last_test_ind)) 13 | 14 | probabilities = last_probabilities; 15 | return; 16 | end 17 | 18 | probabilities = model(problem, train_ind, observed_labels, test_ind); 19 | 20 | last_train_ind = train_ind; 21 | last_observed_labels = observed_labels; 22 | last_test_ind = test_ind; 23 | last_probabilities = probabilities; 24 | 25 | end -------------------------------------------------------------------------------- /models/models.m: -------------------------------------------------------------------------------- 1 | % A model calculates the posterior class-membership probabilities for a 2 | % selected set of test points given the current labeled training 3 | % data. 
4 | % 5 | % Models must satisfy the following interface: 6 | % 7 | % probabilities = model(problem, train_ind, observed_labels, test_ind) 8 | % 9 | % Inputs: 10 | % 11 | % problem: a struct describing the problem, containing fields: 12 | % 13 | % points: an (n x d) data matrix for the available points 14 | % num_classes: the number of classes 15 | % 16 | % train_ind: a list of indices into problem.points indicating 17 | % the thus-far observed points 18 | % observed_labels: a list of labels corresponding to the 19 | % observations in train_ind 20 | % test_ind: a list of indices into problem.points 21 | % indicating the test points 22 | % 23 | % Output: 24 | % 25 | % probabilities: a matrix of posterior probabilities. The ith 26 | % column gives p(y = i | x, D) for each of the 27 | % indicated test points. 28 | % 29 | % The following models are provided in this toolbox: 30 | % 31 | % cheating_model: a "cheating" model that queries a 32 | % label oracle 33 | % gaussian_process_model: a binary Gaussian process classifier 34 | % knn_model: a weighted k-NN model 35 | % label_propagation_model: partially absorbing label propagation 36 | % random_forest_model: a random forest model 37 | 38 | % Copyright (c) 2011--2016 Roman Garnett. 39 | -------------------------------------------------------------------------------- /models/random_forest_model.m: -------------------------------------------------------------------------------- 1 | % RANDOM_FOREST_MODEL a random forest classifier. 2 | % 3 | % Requires the TreeBagger class in the MATLAB Statistics Toolbox. 4 | % 5 | % Usage: 6 | % 7 | % probabilities = random_forest_model(problem, train_ind, ... 
8 | % observed_labels, test_ind, num_trees, options) 9 | % 10 | % Inputs: 11 | % 12 | % problem: a struct describing the problem, which must at 13 | % least contain the field: 14 | % 15 | % points: an (n x d) data matrix for the available points 16 | % 17 | % train_ind: a list of indices into problem.points indicating 18 | % the thus-far observed points 19 | % observed_labels: a list of labels corresponding to the 20 | % observations in train_ind 21 | % test_ind: a list of indices into problem.points indicating 22 | % the test points 23 | % num_trees: the number of trees to build in the random forest 24 | % options: (optional) additional options to pass into 25 | % TreeBagger for training (default: []) 26 | % 27 | % Output: 28 | % 29 | % probabilities: a matrix of posterior probabilities. The ith 30 | % column gives p(y = i | x, D) for each of the 31 | % indicated test points. 32 | % 33 | % See also TREEBAGGER, MODELS. 34 | 35 | % Copyright (c) 2011--2016 Roman Garnett. 36 | 37 | function probabilities = random_forest_model(problem, train_ind, ... 38 | observed_labels, test_ind, num_trees, options) 39 | 40 | if (nargin < 6) 41 | options = []; 42 | end 43 | 44 | model = TreeBagger(num_trees, problem.points(train_ind, :), observed_labels, ... 45 | 'method', 'classification', ... 46 | 'options', options); 47 | 48 | [~, probabilities] = predict(model, problem.points(test_ind, :)); 49 | 50 | end 51 | -------------------------------------------------------------------------------- /other/get_label_oracle.m: -------------------------------------------------------------------------------- 1 | % GET_LABEL_ORACLE creates a function handle to a label oracle. 2 | % 3 | % This is a convenience function for easily creating a function handle 4 | % to a label oracle. Given a handle to a label oracle and its 5 | % additional arguments (if any), returns a function handle for use in, 6 | % e.g., active_learning.m. 
7 | % 8 | % Example: 9 | % 10 | % label_oracle = get_label_oracle(@lookup_oracle, labels); 11 | % 12 | % returns the following function handle: 13 | % 14 | % @(problem, train_ind, observed_labels, query_ind) ... 15 | % lookup_oracle(problem, train_ind, observed_labels, query_ind, ... 16 | % labels) 17 | % 18 | % This is primarily for improving code readability by avoiding 19 | % repeated verbose function handle declarations. 20 | % 21 | % Usage: 22 | % 23 | % label_oracle = get_label_oracle(label_oracle, varargin) 24 | % 25 | % Inputs: 26 | % 27 | % label_oracle: a function handle to the desired label oracle 28 | % varargin: any additional inputs to be bound to the label 29 | % oracle beyond those required by the standard 30 | % interface (problem, train_ind, observed_labels, 31 | % query_ind) 32 | % 33 | % Output: 34 | % 35 | % label_oracle: a function handle to the desired label oracle for 36 | % use in active_learning 37 | % 38 | % See also LABEL_ORACLES. 39 | 40 | % Copyright (c) 2013--2016 Roman Garnett. 41 | 42 | function label_oracle = get_label_oracle(label_oracle, varargin) 43 | 44 | label_oracle = @(problem, train_ind, observed_labels, query_ind) ... 45 | label_oracle(problem, train_ind, observed_labels, query_ind, ... 46 | varargin{:}); 47 | 48 | end -------------------------------------------------------------------------------- /other/get_model.m: -------------------------------------------------------------------------------- 1 | % GET_MODEL creates a function handle to a probability model. 2 | % 3 | % This is a convenience function for easily creating a function handle 4 | % to a model. Given a handle to a model and its additional arguments 5 | % (if any), returns a function handle for use in, e.g., 6 | % active_learning.m. 7 | % 8 | % Example: 9 | % 10 | % model = get_model(@knn_model, weights, prior_alpha, prior_beta); 11 | % 12 | % returns the following function handle: 13 | % 14 | % @(problem, train_ind, observed_labels, test_ind) ... 15 | % knn_model(problem, train_ind, observed_labels, test_ind, ... 
16 | % weights, prior_alpha, prior_beta) 17 | % 18 | % This is primarily for improving code readability by avoiding 19 | % repeated verbose function handle declarations. 20 | % 21 | % Usage: 22 | % 23 | % model = get_model(model, varargin) 24 | % 25 | % Inputs: 26 | % 27 | % model: a function handle to the desired model 28 | % varargin: any additional inputs to be bound to the model beyond 29 | % those required by the standard interface (problem, 30 | % train_ind, observed_labels, test_ind) 31 | % 32 | % Output: 33 | % 34 | % model: a function handle to the desired model for use in 35 | % active_learning 36 | % 37 | % See also MODELS. 38 | 39 | % Copyright (c) 2013--2014 Roman Garnett. 40 | 41 | function model = get_model(model, varargin) 42 | 43 | model = @(problem, train_ind, observed_labels, test_ind) ... 44 | model(problem, train_ind, observed_labels, test_ind, varargin{:}); 45 | 46 | end -------------------------------------------------------------------------------- /other/get_query_strategy.m: -------------------------------------------------------------------------------- 1 | % GET_QUERY_STRATEGY creates a function handle to a query strategy. 2 | % 3 | % This is a convenience function for easily creating a function handle 4 | % to a query strategy. Given a handle to a query strategy and its 5 | % additional arguments (if any), returns a function handle for use in, 6 | % e.g., active_learning.m. 7 | % 8 | % Example: 9 | % 10 | % query_strategy = get_query_strategy(@maximum_score, score_function); 11 | % 12 | % returns the following function handle: 13 | % 14 | % @(problem, train_ind, observed_labels, test_ind) ... 15 | % maximum_score(problem, train_ind, observed_labels, 16 | % test_ind, score_function) 17 | % 18 | % This is primarily for improving code readability by avoiding 19 | % repeated verbose function handle declarations. 
20 | % 21 | % Usage: 22 | % 23 | % query_strategy = get_query_strategy(query_strategy, varargin) 24 | % 25 | % Inputs: 26 | % 27 | % query_strategy: a function handle to the desired query strategy 28 | % varargin: any additional inputs to be bound to the query 29 | % strategy beyond those required by the standard 30 | % interface (problem, train_ind, observed_labels, 31 | % test_ind) 32 | % 33 | % Output: 34 | % 35 | % query_strategy: a function handle to the desired query strategy 36 | % for use in active_learning 37 | % 38 | % See also QUERY_STRATEGIES. 39 | 40 | % Copyright (c) 2013--2014 Roman Garnett. 41 | 42 | function query_strategy = get_query_strategy(query_strategy, varargin) 43 | 44 | query_strategy = @(problem, train_ind, observed_labels, test_ind) ... 45 | query_strategy(problem, train_ind, observed_labels, test_ind, ... 46 | varargin{:}); 47 | 48 | end -------------------------------------------------------------------------------- /other/get_score_function.m: -------------------------------------------------------------------------------- 1 | % GET_SCORE_FUNCTION creates a function handle to a score function. 2 | % 3 | % This is a convenience function for easily creating a function handle 4 | % to a score function. Given a handle to a score function and its 5 | % additional arguments (if any), returns a function handle for use in, 6 | % e.g., active_learning.m. 7 | % 8 | % Example: 9 | % 10 | % score_function = get_score_function(@expected_accuracy, model); 11 | % 12 | % returns the following function handle: 13 | % 14 | % @(problem, train_ind, observed_labels, test_ind) ... 15 | % expected_accuracy(problem, train_ind, observed_labels, 16 | % test_ind, model) 17 | % 18 | % This is primarily for improving code readability by avoiding 19 | % repeated verbose function handle declarations. 
20 | % 21 | % Usage: 22 | % 23 | % score_function = get_score_function(score_function, varargin) 24 | % 25 | % Inputs: 26 | % 27 | % score_function: a function handle to the desired score function 28 | % varargin: any additional inputs to be bound to the score 29 | % function beyond those required by the standard 30 | % interface (problem, train_ind, observed_labels, 31 | % test_ind) 32 | % 33 | % Output: 34 | % 35 | % score_function: a function handle to the desired score function 36 | % for use in active_learning 37 | % 38 | % See also SCORE_FUNCTIONS. 39 | 40 | % Copyright (c) 2013--2014 Roman Garnett. 41 | 42 | function score_function = get_score_function(score_function, varargin) 43 | 44 | score_function = @(problem, train_ind, observed_labels, test_ind) ... 45 | score_function(problem, train_ind, observed_labels, test_ind, ... 46 | varargin{:}); 47 | 48 | end -------------------------------------------------------------------------------- /other/get_selector.m: -------------------------------------------------------------------------------- 1 | % GET_SELECTOR creates a function handle to a selector. 2 | % 3 | % This is a convenience function for easily creating a function handle 4 | % to a selector. Given a handle to a selector and its additional 5 | % arguments (if any), returns a function handle for use in, e.g., 6 | % active_learning.m. 7 | % 8 | % Example: 9 | % 10 | % selector = get_selector(@random_selector, num_test); 11 | % 12 | % returns the following function handle: 13 | % 14 | % @(problem, train_ind, observed_labels) ... 15 | % random_selector(problem, train_ind, observed_labels, num_test) 16 | % 17 | % This is primarily for improving code readability by avoiding 18 | % repeated verbose function handle declarations. 
19 | % 20 | % Usage: 21 | % 22 | % selector = get_selector(selector, varargin) 23 | % 24 | % Inputs: 25 | % 26 | % selector: a function handle to the desired selector 27 | % varargin: any additional inputs to be bound to the selector beyond 28 | % those required by the standard interface (problem, 29 | % train_ind, observed_labels) 30 | % 31 | % Output: 32 | % 33 | % selector: a function handle to the desired selector for use in 34 | % active_learning 35 | % 36 | % See also SELECTORS. 37 | 38 | % Copyright (c) 2014 Roman Garnett. 39 | 40 | function selector = get_selector(selector, varargin) 41 | 42 | selector = @(problem, train_ind, observed_labels) ... 43 | selector(problem, train_ind, observed_labels, varargin{:}); 44 | 45 | end -------------------------------------------------------------------------------- /query_strategies/argmax.m: -------------------------------------------------------------------------------- 1 | % ARGMAX queries the point(s) that maximizes a score function. 2 | % 3 | % This is a trivial query strategy that calls a user-provided score 4 | % function on each of the points available for labeling and selects 5 | % the point(s) with the maximum score. 6 | % 7 | % Several popular score functions are included in this software 8 | % package; see score_functions.m for more information. 9 | % 10 | % Usage: 11 | % 12 | % query_ind = argmax(problem, train_ind, observed_labels, test_ind, ... 
13 | % score_function, num_points) 14 | % 15 | % Inputs: 16 | % 17 | % problem: a struct describing the problem, containing fields: 18 | % 19 | % points: an (n x d) data matrix for the available points 20 | % num_classes: the number of classes 21 | % num_queries: the number of queries to make 22 | % 23 | % train_ind: a list of indices into problem.points indicating 24 | % the thus-far observed points 25 | % observed_labels: a list of labels corresponding to the 26 | % observations in train_ind 27 | % test_ind: a list of indices into problem.points indicating 28 | % the points eligible for observation 29 | % score_function: a handle to a score function (see 30 | % score_functions.m for interface) 31 | % num_points: (optional) the number of points to return 32 | % (default: 1) 33 | % 34 | % Output: 35 | % 36 | % query_ind: an index into problem.points indicating the point(s) to 37 | % query next 38 | % 39 | % See also ARGMIN, SCORE_FUNCTIONS, QUERY_STRATEGIES. 40 | 41 | % Copyright (c) 2013--2014 Roman Garnett. 42 | 43 | function query_ind = argmax(problem, train_ind, observed_labels, ... 44 | test_ind, score_function, num_points) 45 | 46 | % by default query a single point 47 | if (nargin < 6) 48 | num_points = 1; 49 | end 50 | 51 | scores = score_function(problem, train_ind, observed_labels, test_ind); 52 | 53 | % only call sort if needed 54 | if (num_points == 1) 55 | [~, best_ind] = max(scores); 56 | else 57 | [~, best_ind] = sort(scores, 'descend'); 58 | best_ind = best_ind(1:num_points); 59 | end 60 | 61 | query_ind = test_ind(best_ind); 62 | 63 | end -------------------------------------------------------------------------------- /query_strategies/argmin.m: -------------------------------------------------------------------------------- 1 | % ARGMIN queries the point(s) that minimizes a score function. 
2 | % 3 | % This is a trivial query strategy that calls a user-provided score 4 | % function on each of the points available for labeling and selects 5 | % the point(s) with the minimum score. 6 | % 7 | % Several popular score functions are included in this software 8 | % package; see score_functions.m for more information. 9 | % 10 | % Usage: 11 | % 12 | % query_ind = argmin(problem, train_ind, observed_labels, test_ind, ... 13 | % score_function, num_points) 14 | % 15 | % Inputs: 16 | % 17 | % problem: a struct describing the problem, containing fields: 18 | % 19 | % points: an (n x d) data matrix for the available points 20 | % num_classes: the number of classes 21 | % num_queries: the number of queries to make 22 | % 23 | % train_ind: a list of indices into problem.points indicating 24 | % the thus-far observed points 25 | % observed_labels: a list of labels corresponding to the 26 | % observations in train_ind 27 | % test_ind: a list of indices into problem.points indicating 28 | % the points eligible for observation 29 | % score_function: a handle to a score function (see 30 | % score_functions.m for interface) 31 | % num_points: (optional) the number of points to return 32 | % (default: 1) 33 | % 34 | % Output: 35 | % 36 | % query_ind: an index into problem.points indicating the point(s) to 37 | % query next 38 | % 39 | % See also ARGMAX, SCORE_FUNCTIONS, QUERY_STRATEGIES. 40 | 41 | % Copyright (c) 2014 Roman Garnett. 42 | 43 | function query_ind = argmin(problem, train_ind, observed_labels, ... 
44 | test_ind, score_function, num_points) 45 | 46 | % by default query a single point 47 | if (nargin < 6) 48 | num_points = 1; 49 | end 50 | 51 | scores = score_function(problem, train_ind, observed_labels, test_ind); 52 | 53 | % only call sort if needed 54 | if (num_points == 1) 55 | [~, best_ind] = min(scores); 56 | else 57 | [~, best_ind] = sort(scores, 'ascend'); 58 | best_ind = best_ind(1:num_points); 59 | end 60 | 61 | query_ind = test_ind(best_ind); 62 | 63 | end -------------------------------------------------------------------------------- /query_strategies/expected_error_reduction.m: -------------------------------------------------------------------------------- 1 | % EXPECTED_ERROR_REDUCTION queries the point giving lowest expected error. 2 | % 3 | % This is an implementation of expected error reduction, a simple and 4 | % popular query strategy. Expected error reduction queries the point 5 | % that would result in the lowest expected error on the remaining 6 | % unlabeled points. Let a point x and a dataset D = (X, Y) be given. 7 | % Let \ell(D) be a loss function for the unlabeled points U given D; 8 | % here we support either the total 0/1 loss: 9 | % 10 | % \ell(D) = \sum_{x \in U} [ y != \hat{y} ], 11 | % 12 | % where \hat{y} = argmax p(y | x, D) is the predicted label for x, or 13 | % the total log loss: 14 | % 15 | % \ell(D) = \sum_{x \in U} -\log p(y = \hat{y} | x, D). 16 | % 17 | % Then expected error reduction queries the point x resulting in 18 | % the lowest expected loss on U after adding the observation (x, y) 19 | % to D: 20 | % 21 | % x* = \argmin_x \sum_y p(y | x, D) \ell(D U (x, y)). 22 | % 23 | % Note that the set of unlabeled points U depends on x! 24 | % 25 | % Usage: 26 | % 27 | % query_ind = expected_error_reduction(problem, train_ind, ... 
28 | % observed_labels, test_ind, model, loss) 29 | % 30 | % Required Inputs: 31 | % 32 | % problem: a struct describing the problem, containing fields: 33 | % 34 | % points: an (n x d) data matrix for the available points 35 | % num_classes: the number of classes 36 | % 37 | % train_ind: a list of indices into problem.points indicating 38 | % the thus-far observed points 39 | % observed_labels: a list of labels corresponding to the 40 | % observations in train_ind 41 | % test_ind: a list of indices into problem.points indicating 42 | % the points eligible for observation 43 | % model: a function handle to a probability model 44 | % 45 | % Optional Input: 46 | % 47 | % loss: a string specifying the desired loss function; 48 | % the following are supported: '01', '0/1', 'log'. 49 | % (case insensitive; default: 'log') 50 | % 51 | % Output: 52 | % 53 | % query_ind: an index into problem.points indicating the point to query 54 | % next 55 | % 56 | % See also MODELS, QUERY_STRATEGIES. 57 | 58 | % Copyright (c) 2014 Roman Garnett. 59 | 60 | function query_ind = expected_error_reduction(problem, train_ind, ... 61 | observed_labels, test_ind, model, loss) 62 | 63 | if (nargin < 6) 64 | loss = 'log'; 65 | end 66 | 67 | % create handle to appropriate loss function 68 | switch (lower(loss)) 69 | case {'01', '0/1'} 70 | loss = @expected_01_loss; 71 | 72 | case 'log' 73 | loss = @expected_log_loss; 74 | 75 | otherwise 76 | error('active_learning:unknown_loss', ... 77 | 'unknown loss function: %s', loss); 78 | end 79 | 80 | % expected error reduction is one-step lookahead minimization of 81 | % the chosen expected loss on the set of unlabeled points. 82 | 83 | loss = @(problem, train_ind, observed_labels) ... 84 | loss(problem, train_ind, observed_labels, ... 85 | unlabeled_selector(problem, train_ind, []), ...
86 | model); 87 | 88 | score_function = get_score_function(@expected_loss_naive, model, loss); 89 | 90 | query_ind = argmin(problem, train_ind, observed_labels, test_ind, ... 91 | score_function); 92 | 93 | end -------------------------------------------------------------------------------- /query_strategies/margin_sampling.m: -------------------------------------------------------------------------------- 1 | % MARGIN_SAMPLING queries the point with the smallest margin. 2 | % 3 | % This is an implementation of margin sampling, a simple and popular 4 | % query strategy. Margin sampling successively queries the point with 5 | % the smallest margin: 6 | % 7 | % x* = argmin margin(x | D), 8 | % 9 | % where margin(x | D) is the predictive margin of x given the 10 | % observations in D: 11 | % 12 | % margin(x | D) = p(y = y_1 | x, D) - p(y = y_2 | x, D), 13 | % 14 | % where y_1 and y_2 are the most and second-most probable class 15 | % labels for x, respectively. 16 | % 17 | % For binary problems, this coincides with uncertainty sampling. 18 | % 19 | % Usage: 20 | % 21 | % query_ind = margin_sampling(problem, train_ind, observed_labels, ... 22 | % test_ind, model) 23 | % 24 | % Inputs: 25 | % 26 | % problem: a struct describing the problem, containing fields: 27 | % 28 | % points: an (n x d) data matrix for the available points 29 | % num_classes: the number of classes 30 | % 31 | % train_ind: a list of indices into problem.points indicating 32 | % the thus-far observed points 33 | % observed_labels: a list of labels corresponding to the 34 | % observations in train_ind 35 | % test_ind: a list of indices into problem.points indicating 36 | % the points eligible for observation 37 | % model: a function handle to a probability model 38 | % 39 | % Output: 40 | % 41 | % query_ind: an index into problem.points indicating the point to query 42 | % next 43 | % 44 | % See also MODELS, MARGIN, QUERY_STRATEGIES. 45 | 46 | % Copyright (c) 2014 Roman Garnett.
47 | 48 | function query_ind = margin_sampling(problem, train_ind, observed_labels, ... 49 | test_ind, model) 50 | 51 | score_function = get_score_function(@margin, model); 52 | 53 | query_ind = argmin(problem, train_ind, observed_labels, test_ind, ... 54 | score_function); 55 | 56 | end -------------------------------------------------------------------------------- /query_strategies/query_by_committee.m: -------------------------------------------------------------------------------- 1 | % QUERY_BY_COMMITTEE queries the point with highest disagreement. 2 | % 3 | % This is an implementation of "query by committee" using vote entropy 4 | % to measure disagreement among models. The query by committee query 5 | % strategy maintains an ensemble of models M and successively queries 6 | % the point about which the ensemble members disagree the most. The 7 | % idea is to greedily cut down the version space as quickly as 8 | % possible. 9 | % 10 | % The disagreement between ensemble members can be measured in various 11 | % ways, but the most popular method (and the one we implement here) is 12 | % the so-called "vote entropy." Let M = {M_j} be a set of probability 13 | % models, and let a point x and a set of observations D = (X, Y) be 14 | % given. The ensemble probabilities are given by 15 | % 16 | % p(y = i | x, D) = \sum_j w_j(D) p(y = i | x, D, M_j) 17 | % / \sum_j w_j(D), 18 | % 19 | % where w(D) is a (possibly data-dependent) weight vector of length 20 | % |M|. 21 | % 22 | % We may alternatively use so-called "hard" voting by the ensemble 23 | % members, where in the above the posterior probabilities 24 | % 25 | % p(y | x, D, M_j) 26 | % 27 | % are replaced by a Kronecker \delta distribution on the 28 | % most-confident label according to model M_j: 29 | % 30 | % \delta[ \argmax_i p(y = i | x, D, M_j) ].
31 | % 32 | % Finally, the vote entropy of M on x is the entropy of this marginal 33 | % distribution: 34 | % 35 | % H[y | x, D] = -\sum_i p(y = i | x, D) \log p(y = i | x, D). 36 | % 37 | % Traditionally, query by committee uses the "hard" voting strategy, 38 | % but we support either here. 39 | % 40 | % Usage: 41 | % 42 | % query_ind = query_by_committee(problem, train_ind, observed_labels, ... 43 | % test_ind, models, weights, hard_votes) 44 | % 45 | % Required Inputs: 46 | % 47 | % problem: a struct describing the problem, containing fields: 48 | % 49 | % points: an (n x d) data matrix for the available points 50 | % num_classes: the number of classes 51 | % 52 | % train_ind: a list of indices into problem.points indicating 53 | % the thus-far observed points 54 | % observed_labels: a list of labels corresponding to the 55 | % observations in train_ind 56 | % test_ind: a list of indices into problem.points indicating 57 | % the points eligible for observation 58 | % models: a cell array of handles to probability models 59 | % 60 | % Optional Inputs: 61 | % 62 | % weights: either a length-|M| vector of model weights or a 63 | % function handle returning such a vector (see note 64 | % in ensemble.m for details). 65 | % (default: ones(1, |M|) / |M|) 66 | % hard_votes: a boolean indicating whether to use "hard" voting 67 | % (default: false) 68 | % 69 | % Output: 70 | % 71 | % query_ind: an index into problem.points indicating the point to 72 | % query next 73 | % 74 | % See also ENSEMBLE, MODELS. 75 | 76 | % Copyright (c) 2014 Roman Garnett. 77 | 78 | function query_ind = query_by_committee(problem, train_ind, ... 79 | observed_labels, test_ind, models, varargin) 80 | 81 | % query by committee is simply uncertainty sampling using an 82 | % ensemble prediction 83 | model = get_model(@ensemble, models, varargin{:}); 84 | 85 | query_ind = uncertainty_sampling(problem, train_ind, observed_labels, ... 
86 | test_ind, model); 87 | 88 | end -------------------------------------------------------------------------------- /query_strategies/query_strategies.m: -------------------------------------------------------------------------------- 1 | % Query strategies select which of the points currently eligible for 2 | % labeling (returned by a selector) should be observed next. 3 | % 4 | % Query strategies must satisfy the following interface: 5 | % 6 | % query_ind = query_strategy(problem, train_ind, observed_labels, test_ind) 7 | % 8 | % Inputs: 9 | % 10 | % problem: a struct describing the problem, containing fields: 11 | % 12 | % points: an (n x d) data matrix for the available points 13 | % num_classes: the number of classes 14 | % num_queries: the number of queries to make 15 | % 16 | % train_ind: a list of indices into problem.points indicating 17 | % the thus-far observed points 18 | % observed_labels: a list of labels corresponding to the 19 | % observations in train_ind 20 | % test_ind: a list of indices into problem.points indicating 21 | % the points eligible for observation 22 | % 23 | % Output: 24 | % 25 | % query_ind: an index into problem.points indicating the point(s) to 26 | % query next (every entry in query_ind will always be 27 | % a member of the set of points in test_ind) 28 | % 29 | % The following query strategies are provided in this toolbox: 30 | % 31 | % argmax: samples the point(s) maximizing a given 32 | % score function 33 | % argmin: samples the point(s) minimizing a given 34 | % score function 35 | % expected_error_reduction: samples the point giving lowest 36 | % expected loss on unlabeled points 37 | % margin_sampling: samples the point with the smallest 38 | % margin 39 | % query_by_committee: samples the point with the highest 40 | % disagreement between models 41 | % uncertainty_sampling: samples the most uncertain point 42 | 43 | % Copyright (c) 2014 Roman Garnett. 
44 | -------------------------------------------------------------------------------- /query_strategies/uncertainty_sampling.m: -------------------------------------------------------------------------------- 1 | % UNCERTAINTY_SAMPLING queries the most uncertain point. 2 | % 3 | % This is an implementation of uncertainty sampling, a simple and 4 | % popular query strategy. Uncertainty sampling successively queries 5 | % the point with the highest marginal entropy: 6 | % 7 | % x* = argmax H[y | x, D], 8 | % 9 | % where H[y | x, D] is the entropy of the marginal label 10 | % distribution p(y | x, D): 11 | % 12 | % H[y | x, D] = -\sum_i p(y = i | x, D) \log(p(y = i | x, D)). 13 | % 14 | % Usage: 15 | % 16 | % query_ind = uncertainty_sampling(problem, train_ind, observed_labels, ... 17 | % test_ind, model) 18 | % 19 | % Inputs: 20 | % 21 | % problem: a struct describing the problem, containing fields: 22 | % 23 | % points: an (n x d) data matrix for the available points 24 | % num_classes: the number of classes 25 | % 26 | % train_ind: a list of indices into problem.points indicating 27 | % the thus-far observed points 28 | % observed_labels: a list of labels corresponding to the 29 | % observations in train_ind 30 | % test_ind: a list of indices into problem.points indicating 31 | % the points eligible for observation 32 | % model: a function handle to a probability model 33 | % 34 | % Output: 35 | % 36 | % query_ind: an index into problem.points indicating the point to query 37 | % next 38 | % 39 | % See also MODELS, MARGINAL_ENTROPY, QUERY_STRATEGIES. 40 | 41 | % Copyright (c) 2014 Roman Garnett. 42 | 43 | function query_ind = uncertainty_sampling(problem, train_ind, ... 44 | observed_labels, test_ind, model) 45 | 46 | score_function = get_score_function(@marginal_entropy, model); 47 | 48 | query_ind = argmax(problem, train_ind, observed_labels, test_ind, ...
49 | score_function); 50 | 51 | end -------------------------------------------------------------------------------- /score_functions/calculate_entropies.m: -------------------------------------------------------------------------------- 1 | function entropies = calculate_entropies(data, labels, train_ind, test_ind, ... 2 | probability_function) 3 | 4 | probabilities = probability_function(data, labels, train_ind, test_ind); 5 | entropies = -sum(probabilities .* log(max(probabilities, 1e-100)), 2); % guard against 0 * log(0) = NaN 6 | 7 | end -------------------------------------------------------------------------------- /score_functions/expected_loss_lookahead.m: -------------------------------------------------------------------------------- 1 | % EXPECTED_LOSS_LOOKAHEAD calculates "lookahead" expected losses. 2 | % 3 | % This is an implementation of a score function that calculates the 4 | % k-step-lookahead expected losses after adding each of a given set of 5 | % points to a dataset for a particular loss function and lookahead 6 | % horizon k. 7 | % 8 | % This function supports user-specified: 9 | % 10 | % * _Loss functions,_ which calculate the loss associated with a 11 | % selected training set, 12 | % 13 | % * _Selectors,_ which given the current training set, specify which 14 | % points should have their expected losses evaluated. This 15 | % implementation allows multiple selectors to be used, should 16 | % different ones be desired for different lookaheads. 17 | % 18 | % Note on Expected Loss Functions: 19 | % 20 | % This function requires as an input a function, expected_loss, that 21 | % will return the one-step-lookahead expected losses after adding each 22 | % of a given set of points to a dataset for the chosen loss 23 | % function. That is, given a point x and a loss function \ell(D) for a 24 | % dataset D = (X, Y), this function should return 25 | % 26 | % E_y[ \ell(D U {(x, y)}) | x, D] = ...
27 | % \sum_i p(y = i | x, D) \ell(D U {(x, i)}), 28 | % 29 | % where i ranges over the possible labels. 30 | % 31 | % The API for this expected loss function is the same as for any 32 | % score function: 33 | % 34 | % expected_losses = expected_loss(problem, train_ind, observed_labels, ... 35 | % test_ind) 36 | % 37 | % Sometimes this expectation over y may be calculated directly without 38 | % enumerating all cases for y. If that is not possible, the function 39 | % expected_loss_naive may be used with any arbitrary loss function to 40 | % calculate this expectation naively (by augmenting the dataset D with 41 | % (x, i) for each class i and weighting the resulting losses by the 42 | % probability that y = i). 43 | % 44 | % Usage: 45 | % 46 | % expected_losses = expected_loss_lookahead(problem, train_ind, ... 47 | % observed_labels, test_ind, model, expected_loss, selectors, ... 48 | % lookahead) 49 | % 50 | % Inputs: 51 | % 52 | % problem: a struct describing the problem, containing fields: 53 | % 54 | % points: an n x d matrix describing the available points 55 | % num_classes: the number of classes 56 | % 57 | % train_ind: a list of indices into problem.points 58 | % indicating the thus-far observed points 59 | % observed_labels: a list of labels corresponding to the 60 | % observations in train_ind 61 | % test_ind: a list of indices into problem.points indicating 62 | % the points eligible for observation 63 | % model: a handle to the probability model to use 64 | % expected_loss: a handle to the one-step expected loss function 65 | % to use (see note above) 66 | % selectors: a cell array of selectors to use. If lookahead == k, 67 | % then the min(k, numel(selectors))th 68 | % element of this array will be used. 69 | % lookahead: the number of steps to look ahead. If 70 | % lookahead == 0, then random expected losses are
72 | % 73 | % Output: 74 | % 75 | % expected_losses: the lookahead-step expected losses for the points 76 | % in test_ind 77 | % 78 | % See also LOSS_FUNCTIONS, EXPECTED_LOSS_NAIVE, SELECTORS, MODELS, SCORE_FUNCTIONS. 79 | 80 | % Copyright (c) 2011--2014 Roman Garnett. 81 | 82 | function expected_losses = expected_loss_lookahead(problem, train_ind, ... 83 | observed_labels, test_ind, model, expected_loss, selectors, ... 84 | lookahead) 85 | 86 | num_test = numel(test_ind); 87 | 88 | % for zero-step lookahead, return random expected losses 89 | if (lookahead == 0) 90 | expected_losses = rand(num_test, 1); 91 | return; 92 | 93 | % for one-step lookahead, return values from base expected loss 94 | elseif (lookahead == 1) 95 | expected_losses = expected_loss(problem, train_ind, observed_labels, ... 96 | test_ind); 97 | return; 98 | end 99 | 100 | % We will calculate the expected loss after adding each point to the 101 | % training set by sampling over labels to create fictitious datasets 102 | % and measuring the expected loss of each. We accomplish lookahead 103 | % by calling this function recursively to calculate the expected 104 | % loss at later levels. 105 | 106 | % Used to recursively select test points. Allow array of selectors and 107 | % fall back if no entry for current lookahead. 108 | selector = selectors{min(lookahead - 1, numel(selectors))}; 109 | 110 | % Given one additional new point, we will always select the remaining 111 | % points by minimizing the (lookahead - 1) expected loss. 112 | lookahead_loss = @(problem, train_ind, observed_labels) ... 113 | min(expected_loss_lookahead(problem, train_ind, observed_labels, ... 114 | selector(problem, train_ind, observed_labels), model, ... 115 | expected_loss, selectors, lookahead - 1)); 116 | 117 | % Use expected_loss_naive to evaluate the expected loss of the given 118 | % points. 119 | expected_losses = expected_loss_naive(problem, train_ind, ...
120 | observed_labels, test_ind, model, lookahead_loss); 121 | 122 | end -------------------------------------------------------------------------------- /score_functions/expected_loss_naive.m: -------------------------------------------------------------------------------- 1 | % EXPECTED_LOSS_NAIVE calculates one-step-lookahead expected losses. 2 | % 3 | % This is an implementation of a score function that calculates the 4 | % one-step-lookahead expected losses after adding each of a given set of 5 | % points to a dataset for a particular loss function. 6 | % 7 | % Given a loss function \ell(D), this function computes the 8 | % expected loss after adding each identified point x to the current 9 | % dataset D: 10 | % 11 | % E_y[ \ell(D U {(x, y)}) | x, D] = ... 12 | % \sum_i p(y = i | x, D) \ell(D U {(x, i)}), 13 | % 14 | % where i ranges over the possible labels. 15 | % 16 | % Here this expectation is computed naively by augmenting the dataset 17 | % D with (x, i) for each class i and weighting the resulting losses by 18 | % the probability that y = i. 19 | % 20 | % Usage: 21 | % 22 | % expected_losses = expected_loss_naive(problem, train_ind, ... 
23 | % observed_labels, test_ind, model, loss) 24 | % 25 | % Inputs: 26 | % 27 | % problem: a struct describing the problem, containing fields: 28 | % 29 | % points: an n x d matrix describing the available points 30 | % num_classes: the number of classes 31 | % 32 | % train_ind: a list of indices into problem.points 33 | % indicating the thus-far observed points 34 | % observed_labels: a list of labels corresponding to the 35 | % observations in train_ind 36 | % test_ind: a list of indices into problem.points indicating 37 | % the points eligible for observation 38 | % model: a handle to the probability model to use 39 | % loss: a handle to the loss function to use 40 | % 41 | % Output: 42 | % 43 | % expected_losses: the one-step lookahead expected losses for the 44 | % points in test_ind 45 | % 46 | % See also LOSS_FUNCTIONS, EXPECTED_LOSS_LOOKAHEAD, MODELS, SCORE_FUNCTIONS. 47 | 48 | % Copyright (c) 2011--2014 Roman Garnett. 49 | 50 | function expected_losses = expected_loss_naive(problem, train_ind, ... 51 | observed_labels, test_ind, model, loss) 52 | 53 | num_test = numel(test_ind); 54 | 55 | % calculate the current posterior probabilities 56 | probabilities = model(problem, train_ind, observed_labels, test_ind); 57 | 58 | expected_losses = zeros(num_test, 1); 59 | for i = 1:num_test 60 | fake_train_ind = [train_ind; test_ind(i)]; 61 | 62 | % sample over labels 63 | fake_losses = zeros(problem.num_classes, 1); 64 | for fake_label = 1:problem.num_classes 65 | fake_observed_labels = [observed_labels; fake_label]; 66 | fake_losses(fake_label) = ... 
67 | loss(problem, fake_train_ind, fake_observed_labels); 68 | end 69 | 70 | % calculate expectation using current probabilities 71 | expected_losses(i) = probabilities(i, :) * fake_losses; 72 | end 73 | 74 | end -------------------------------------------------------------------------------- /score_functions/expected_utility_lookahead.m: -------------------------------------------------------------------------------- 1 | % EXPECTED_UTILITY_LOOKAHEAD calculates "lookahead" expected utilities. 2 | % 3 | % This is an implementation of a score function that calculates the 4 | % k-step-lookahead expected utilities after adding each of a given set 5 | % of points to a dataset for a particular utility function and 6 | % lookahead horizon k. 7 | % 8 | % This is implemented as a wrapper around expected_loss_lookahead that 9 | % simply transforms the provided utility into a loss (via negation), 10 | % calls that function, and again negates the outputs. The API is the 11 | % same as for expected_loss_lookahead, modulo the replacement of 12 | % losses by utilities. 13 | % 14 | % See also EXPECTED_LOSS_LOOKAHEAD, LOSS_FUNCTIONS, SCORE_FUNCTIONS. 15 | 16 | % Copyright (c) 2014 Roman Garnett. 17 | 18 | function expected_utilities = expected_utility_lookahead(problem, ... 19 | train_ind, observed_labels, test_ind, model, expected_utility, ... 20 | selectors, lookahead) 21 | 22 | % transform utility into a loss via negation 23 | expected_loss = @(problem, train_ind, observed_labels, test_ind) ... 24 | -expected_utility(problem, train_ind, observed_labels, ... 25 | test_ind); 26 | 27 | % calculate expected losses and transform back to expected utilities 28 | % by negation 29 | expected_utilities = -expected_loss_lookahead(problem, train_ind, ... 30 | observed_labels, test_ind, model, expected_loss, selectors, ...
31 | lookahead); 32 | 33 | end -------------------------------------------------------------------------------- /score_functions/expected_utility_naive.m: -------------------------------------------------------------------------------- 1 | % EXPECTED_UTILITY_NAIVE calculates one-step-lookahead expected utilities. 2 | % 3 | % This is an implementation of a score function that calculates the 4 | % one-step-lookahead expected utilities after adding each of a given 5 | % set of points to a dataset for a particular utility function. 6 | % 7 | % This is implemented as a wrapper around expected_loss_naive that 8 | % simply transforms the provided utility into a loss (via negation), 9 | % calls that function, and again negates the outputs. The API is the 10 | % same as for expected_loss_naive, modulo the replacement of losses by 11 | % utilities. 12 | % 13 | % See also EXPECTED_LOSS_NAIVE, LOSS_FUNCTIONS, SCORE_FUNCTIONS. 14 | 15 | % Copyright (c) 2014 Roman Garnett. 16 | 17 | function expected_utilities = expected_utility_naive(problem, train_ind, ... 18 | observed_labels, test_ind, model, utility) 19 | 20 | % transform utility into a loss via negation 21 | loss = @(problem, train_ind, observed_labels) ... 22 | -utility(problem, train_ind, observed_labels); 23 | 24 | % calculate expected losses and transform back to expected utilities 25 | % by negation 26 | expected_utilities = -expected_loss_naive(problem, train_ind, ... 27 | observed_labels, test_ind, model, loss); 28 | 29 | end -------------------------------------------------------------------------------- /score_functions/loss_functions/expected_01_loss.m: -------------------------------------------------------------------------------- 1 | % EXPECTED_01_LOSS calculates expected 0/1 loss given a training set.
2 | % 3 | % This function computes the expected total 0/1 loss on a set of 4 | % points given a training set D = (X, Y): 5 | % 6 | % \sum_{x \in U} (1 - max p(y | x, D)), 7 | % 8 | % where U is the set of points whose labels are to be predicted. 9 | % 10 | % Usage: 11 | % 12 | % loss = expected_01_loss(problem, train_ind, observed_labels, ... 13 | % test_ind, model) 14 | % 15 | % Inputs: 16 | % 17 | % problem: a struct describing the problem, containing fields: 18 | % 19 | % points: an (n x d) data matrix for the available points 20 | % num_classes: the number of classes 21 | % 22 | % train_ind: a list of indices into problem.points indicating 23 | % the thus-far observed points 24 | % observed_labels: a list of labels corresponding to the 25 | % observations in train_ind 26 | % test_ind: a list of indices into problem.points indicating 27 | % the test points 28 | % model: a handle to a probability model 29 | % 30 | % Output: 31 | % 32 | % loss: the expected total 0/1 loss on the points in test_ind 33 | % 34 | % See also EXPECTED_ERROR_REDUCTION, MARGINAL_ENTROPY. 35 | 36 | % Copyright (c) 2014 Roman Garnett. 37 | 38 | function loss = expected_01_loss(problem, train_ind, observed_labels, ... 39 | test_ind, model) 40 | 41 | probabilities = model(problem, train_ind, observed_labels, test_ind); 42 | 43 | loss = sum(1 - max(probabilities, [], 2)); 44 | 45 | end -------------------------------------------------------------------------------- /score_functions/loss_functions/expected_log_loss.m: -------------------------------------------------------------------------------- 1 | % EXPECTED_LOG_LOSS calculates expected log loss given a training set.
2 | % 3 | % This function computes the expected total log loss on a set of 4 | % points given a training set D = (X, Y): 5 | % 6 | % \sum_{x \in U} H[y | x, D], 7 | % 8 | % where H[y | x, D] is the marginal entropy of the predictive 9 | % distribution p(y | x, D) and U is the set of points whose labels 10 | % are to be predicted. 11 | % 12 | % Usage: 13 | % 14 | % loss = expected_log_loss(problem, train_ind, observed_labels, ... 15 | % test_ind, model) 16 | % 17 | % Inputs: 18 | % 19 | % problem: a struct describing the problem, containing fields: 20 | % 21 | % points: an (n x d) data matrix for the available points 22 | % num_classes: the number of classes 23 | % 24 | % train_ind: a list of indices into problem.points indicating 25 | % the thus-far observed points 26 | % observed_labels: a list of labels corresponding to the 27 | % observations in train_ind 28 | % test_ind: a list of indices into problem.points indicating 29 | % the test points 30 | % model: a handle to a probability model 31 | % 32 | % Output: 33 | % 34 | % loss: the expected total log loss on the points in test_ind 35 | % 36 | % See also EXPECTED_ERROR_REDUCTION, MARGINAL_ENTROPY. 37 | 38 | % Copyright (c) 2014 Roman Garnett. 39 | 40 | function loss = expected_log_loss(problem, train_ind, observed_labels, ... 41 | test_ind, model) 42 | 43 | marginal_entropies = marginal_entropy(problem, train_ind, ... 44 | observed_labels, test_ind, model); 45 | 46 | loss = sum(marginal_entropies); 47 | 48 | end -------------------------------------------------------------------------------- /score_functions/loss_functions/loss_functions.m: -------------------------------------------------------------------------------- 1 | % Loss functions (also utility functions) compute the loss (or 2 | % utility) associated with a given set of observations. 
These are 3 | % typically used in active learning to, e.g., sample the point that, 4 | % after being incorporated into the current set of observations, 5 | % minimizes the expected final loss. Loss functions will typically not 6 | % be used directly but rather by a score function computing expected 7 | % losses (e.g., expected_loss_naive, expected_loss_lookahead) or 8 | % expected utilities (e.g., expected_utility_naive, 9 | % expected_utility_lookahead). 10 | % 11 | % Loss and utility functions must satisfy the following interface: 12 | % 13 | % loss = loss_function(problem, train_ind, observed_labels) 14 | % 15 | % or 16 | % 17 | % utility = utility_function(problem, train_ind, observed_labels) 18 | % 19 | % The only difference between the two is the semantic interpretation 20 | % of the output: losses are typically to be minimized (e.g., with 21 | % argmin) and utilities are typically to be maximized (e.g., with 22 | % argmax). 23 | % 24 | % Inputs: 25 | % 26 | % problem: a struct describing the problem, containing fields: 27 | % 28 | % points: an (n x d) data matrix for the available points 29 | % num_classes: the number of classes 30 | % 31 | % train_ind: a list of indices into problem.points indicating 32 | % the thus-far observed points 33 | % observed_labels: a list of labels corresponding to the 34 | % observations in train_ind 35 | % 36 | % Output: 37 | % 38 | % loss: the loss associated with the given set of observations 39 | % 40 | % or 41 | % 42 | % utility: the utility associated with the given set of observations 43 | % 44 | % See also EXPECTED_LOSS_NAIVE, EXPECTED_LOSS_LOOKAHEAD, 45 | % EXPECTED_UTILITY_NAIVE, EXPECTED_UTILITY_LOOKAHEAD. 46 | % 47 | % Copyright (c) 2014 Roman Garnett. 48 | -------------------------------------------------------------------------------- /score_functions/margin.m: -------------------------------------------------------------------------------- 1 | % MARGIN calculates predictive margin on given test points.
2 | % 3 | % The predictive margin for a point x is the difference between the 4 | % probability assigned to the most probable class and the 5 | % second-most probable class: 6 | % 7 | % margin(x | D) = p(y = y_1 | x, D) - p(y = y_2 | x, D), 8 | % 9 | % where y_1 and y_2 are the most and second-most probable class 10 | % labels for x given the observations in D, respectively. 11 | % 12 | % Minimizing margin gives rise to a popular query strategy known as 13 | % margin sampling. 14 | % 15 | % Usage: 16 | % 17 | % scores = margin(problem, train_ind, observed_labels, test_ind, model) 18 | % 19 | % Inputs: 20 | % 21 | % problem: a struct describing the problem, containing fields: 22 | % 23 | % points: an (n x d) data matrix for the available points 24 | % num_classes: the number of classes 25 | % 26 | % train_ind: a list of indices into problem.points indicating 27 | % the thus-far observed points 28 | % observed_labels: a list of labels corresponding to the 29 | % observations in train_ind 30 | % test_ind: a list of indices into problem.points indicating 31 | % the points eligible for observation 32 | % model: a handle to a probability model 33 | % 34 | % Output: 35 | % 36 | % scores: a vector of margins for each point specified by test_ind 37 | % 38 | % See also SCORE_FUNCTIONS, MODELS, MARGIN_SAMPLING. 39 | 40 | % Copyright (c) 2014 Roman Garnett. 41 | 42 | function scores = margin(problem, train_ind, observed_labels, ... 43 | test_ind, model) 44 | 45 | probabilities = model(problem, train_ind, observed_labels, test_ind); 46 | probabilities = sort(probabilities, 2, 'descend'); 47 | 48 | scores = probabilities(:, 1) - probabilities(:, 2); 49 | 50 | end -------------------------------------------------------------------------------- /score_functions/marginal_entropy.m: -------------------------------------------------------------------------------- 1 | % MARGINAL_ENTROPY calculates predictive entropy on given test points. 
2 | % 3 | % The predictive marginal entropy H for a point x given a set of 4 | % observations D = (X, Y) is given by 5 | % 6 | % H[y | x, D] = -\sum_i p(y = i | x, D) \log p(y = i | x, D). 7 | % 8 | % Maximizing the marginal entropy gives rise to a common query 9 | % strategy known as uncertainty sampling. 10 | % 11 | % Usage: 12 | % 13 | % scores = marginal_entropy(problem, train_ind, observed_labels, ... 14 | % test_ind, model) 15 | % 16 | % Inputs: 17 | % 18 | % problem: a struct describing the problem, containing fields: 19 | % 20 | % points: an (n x d) data matrix for the available points 21 | % num_classes: the number of classes 22 | % 23 | % train_ind: a list of indices into problem.points indicating 24 | % the thus-far observed points 25 | % observed_labels: a list of labels corresponding to the 26 | % observations in train_ind 27 | % test_ind: a list of indices into problem.points indicating 28 | % the points eligible for observation 29 | % model: a handle to a probability model 30 | % 31 | % Output: 32 | % 33 | % scores: a vector of marginal entropies for each point specified by 34 | % test_ind 35 | % 36 | % See also SCORE_FUNCTIONS, MODELS, UNCERTAINTY_SAMPLING. 37 | 38 | % Copyright (c) 2013--2014 Roman Garnett. 39 | 40 | function scores = marginal_entropy(problem, train_ind, observed_labels, ... 
41 | test_ind, model) 42 | 43 | probabilities = model(problem, train_ind, observed_labels, test_ind); 44 | 45 | % remove any zeros from probabilities to approximate 0 * -inf = 0 46 | probabilities = max(probabilities, 1e-100); 47 | 48 | scores = -sum(probabilities .* log(probabilities), 2); 49 | 50 | end -------------------------------------------------------------------------------- /score_functions/score_functions.m: -------------------------------------------------------------------------------- 1 | % A score function computes an arbitrary score for each given test 2 | % point that is in some way related to its influence or suitability 3 | % for making an observation there. Score functions are typically 4 | % converted into query strategies by either maximization (e.g. argmax) 5 | % or minimization (e.g. argmin) over the points eligible for 6 | % observation. 7 | % 8 | % Score functions must satisfy the following interface: 9 | % 10 | % scores = score_function(problem, train_ind, observed_labels, test_ind) 11 | % 12 | % Inputs: 13 | % 14 | % problem: a struct describing the problem, containing fields: 15 | % 16 | % points: an (n x d) data matrix for the available points 17 | % num_classes: the number of classes 18 | % 19 | % train_ind: a list of indices into problem.points indicating 20 | % the thus-far observed points 21 | % observed_labels: a list of labels corresponding to the 22 | % observations in train_ind 23 | % test_ind: a list of indices into problem.points indicating 24 | % the points eligible for observation 25 | % 26 | % Output: 27 | % 28 | % scores: a vector of real-valued scores; one for each point 29 | % specified by test_ind 30 | % 31 | % The following score functions are provided in this toolbox: 32 | % 33 | % expected_loss_lookahead: multiple-step lookahead expected loss for 34 | % arbitrary loss functions 35 | % expected_loss_naive: one-step lookahead expected loss for 36 | % arbitrary loss functions 37 | % margin: the predictive margin 38 |
marginal_entropy: the predictive entropy 39 | % 40 | % See also ARGMIN, ARGMAX. 41 | 42 | % Copyright (c) 2014 Roman Garnett. 43 | -------------------------------------------------------------------------------- /selectors/fixed_test_set_selector.m: -------------------------------------------------------------------------------- 1 | % FIXED_TEST_SET_SELECTOR selects all points besides a given test set. 2 | % 3 | % Usage: 4 | % 5 | % test_ind = fixed_test_set_selector(problem, train_ind, observed_labels, ... 6 | % test_set_ind) 7 | % 8 | % Inputs: 9 | % 10 | % problem: a struct describing the problem, which must at 11 | % least contain the field: 12 | % 13 | % points: an (n x d) data matrix for the available points 14 | % 15 | % train_ind: a list of indices into problem.points indicating 16 | % the thus-far observed points 17 | % 18 | % Note: this input, part of the standard selector 19 | % API, is ignored by fixed_test_set_selector. If 20 | % desired, for standalone use it can be replaced by 21 | % an empty matrix. 22 | % 23 | % observed_labels: a list of labels corresponding to the 24 | % observations in train_ind 25 | % 26 | % Note: this input, part of the standard selector 27 | % API, is ignored by fixed_test_set_selector. If 28 | % desired, for standalone use it can be replaced by 29 | % an empty matrix. 30 | % 31 | % test_set_ind: a list of indices into problem.points 32 | % indicating the test set 33 | % 34 | % Output: 35 | % 36 | % test_ind: a list of indices into problem.points indicating the 37 | % points to consider for labeling 38 | % 39 | % See also SELECTORS. 40 | 41 | % Copyright (c) 2011--2014 Roman Garnett.
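The selector documented above reduces to a single set difference: start from every point index and remove the held-out test set. A minimal sketch of the same operation — written in Python/NumPy purely for illustration (0-based indices; the function name mirrors the MATLAB one but is not part of the toolbox):

```python
import numpy as np

def fixed_test_set_selector(num_points, test_set_ind):
    # start from all point indices (what identity_selector returns)
    test_ind = np.arange(num_points)
    # drop the held-out test set; whatever remains is eligible for labeling
    return np.setdiff1d(test_ind, test_set_ind)

# with 6 points and test set {1, 3}, points 0, 2, 4, 5 remain eligible
remaining = fixed_test_set_selector(6, np.array([1, 3]))
```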
42 | 43 | function test_ind = fixed_test_set_selector(problem, ~, ~, test_set_ind) 44 | 45 | test_ind = identity_selector(problem, [], []); 46 | test_ind(test_set_ind) = []; 47 | 48 | end 49 | -------------------------------------------------------------------------------- /selectors/graph_walk_selector.m: -------------------------------------------------------------------------------- 1 | % GRAPH_WALK_SELECTOR confines an experiment to follow a path on a graph. 2 | % 3 | % This provides a selector that compels observations to be taken along 4 | % a connected path in a specified (possibly directed) graph. The nodes 5 | % adjacent to the previously queried node are selected. 6 | % 7 | % Usage: 8 | % 9 | % test_ind = graph_walk_selector(problem, train_ind, observed_labels, A) 10 | % 11 | % Inputs: 12 | % 13 | % problem: a struct describing the problem, containing fields: 14 | % 15 | % points: an (n x d) data matrix for the available points 16 | % num_classes: the number of classes 17 | % num_queries: the number of queries to make 18 | % 19 | % Note: this input, part of the standard selector 20 | % API, is ignored by graph_walk_selector. If 21 | % desired, for standalone use it can be replaced by 22 | % an empty matrix. 23 | % 24 | % train_ind: a list of indices into problem.points indicating 25 | % the thus-far observed points 26 | % observed_labels: a list of labels corresponding to the 27 | % observations in train_ind 28 | % 29 | % Note: this input, part of the standard selector 30 | % API, is ignored by graph_walk_selector. If 31 | % desired, for standalone use it can be replaced by 32 | % an empty matrix. 33 | % 34 | % A: the (n x n) adjacency matrix for the desired 35 | % graph. A nonzero entry for A(i, j) is interpreted 36 | % as the presence of the (possibly directed) edge 37 | % [i -> j]. 38 | % 39 | % Output: 40 | % 41 | % test_ind: a list of indices into problem.points indicating the 42 | % points to consider for labeling. 
Each index in test_ind 43 | can be reached from the last observed point via an 44 | outgoing edge in the given graph. 45 | 46 | % See also SELECTORS. 47 | 48 | % Copyright (c) 2013--2014 Roman Garnett. 49 | 50 | function test_ind = graph_walk_selector(~, train_ind, ~, A) 51 | 52 | test_ind = find(A(train_ind(end), :))'; 53 | 54 | end -------------------------------------------------------------------------------- /selectors/identity_selector.m: -------------------------------------------------------------------------------- 1 | % IDENTITY_SELECTOR selects all points. 2 | % 3 | % Usage: 4 | % 5 | % test_ind = identity_selector(problem, train_ind, observed_labels) 6 | % 7 | % Inputs: 8 | % 9 | % problem: a struct describing the problem, which must at 10 | % least contain the field: 11 | % 12 | % points: an (n x d) data matrix for the available points 13 | % 14 | % train_ind: a list of indices into problem.points indicating 15 | % the thus-far observed points 16 | % 17 | % Note: this input, part of the standard selector 18 | % API, is ignored by identity_selector. If desired, 19 | % for standalone use it can be replaced by an empty 20 | % matrix. 21 | % 22 | % observed_labels: a list of labels corresponding to the 23 | % observations in train_ind 24 | % 25 | % Note: this input, part of the standard selector 26 | % API, is ignored by identity_selector. If desired, 27 | % for standalone use it can be replaced by an empty 28 | % matrix. 29 | % 30 | % Output: 31 | % 32 | % test_ind: a list of indices into problem.points indicating the 33 | % points to consider for labeling 34 | % 35 | % See also SELECTORS. 36 | 37 | % Copyright (c) 2011--2014 Roman Garnett.
38 | 39 | function test_ind = identity_selector(problem, ~, ~) 40 | 41 | test_ind = (1:size(problem.points, 1))'; 42 | 43 | end -------------------------------------------------------------------------------- /selectors/meta_selectors/complement_selector.m: -------------------------------------------------------------------------------- 1 | % COMPLEMENT_SELECTOR takes the complement of a selector's output. 2 | % 3 | % This provides a meta-selector that returns the complement of 4 | % the test points returned by another selector. Note: this set can 5 | % be empty! 6 | % 7 | % Usage: 8 | % 9 | % test_ind = complement_selector(problem, train_ind, observed_labels, selector) 10 | % 11 | % Inputs: 12 | % 13 | % problem: a struct describing the problem, containing fields: 14 | % 15 | % points: an (n x d) data matrix for the available points 16 | % num_classes: the number of classes 17 | % num_queries: the number of queries to make 18 | % 19 | % train_ind: a list of indices into problem.points indicating 20 | % the thus-far observed points 21 | % observed_labels: a list of labels corresponding to the 22 | % observations in train_ind 23 | % selector: a function handle to a selector 24 | % 25 | % Output: 26 | % 27 | % test_ind: a list of indices into problem.points indicating the 28 | % points to consider for labeling. Each index in test_ind 29 | % was not selected by the given selector. 30 | % 31 | % See also SELECTORS. 32 | 33 | % Copyright (c) 2014 Roman Garnett. 34 | 35 | function test_ind = complement_selector(problem, train_ind, observed_labels, ... 
36 | selector) 37 | 38 | test_ind = identity_selector(problem, [], []); 39 | test_ind(selector(problem, train_ind, observed_labels)) = []; 40 | 41 | end -------------------------------------------------------------------------------- /selectors/meta_selectors/intersection_selector.m: -------------------------------------------------------------------------------- 1 | % INTERSECTION_SELECTOR takes the intersection of the outputs of selectors. 2 | % 3 | % This provides a meta-selector that returns the intersection of the 4 | % test points returned from each of a set of selectors. Note that this 5 | % intersection may be empty! 6 | % 7 | % Usage: 8 | % 9 | % test_ind = intersection_selector(problem, train_ind, observed_labels, ... 10 | % selectors) 11 | % 12 | % Inputs: 13 | % 14 | % problem: a struct describing the problem, containing fields: 15 | % 16 | % points: an (n x d) data matrix for the available points 17 | % num_classes: the number of classes 18 | % num_queries: the number of queries to make 19 | % 20 | % train_ind: a list of indices into problem.points indicating 21 | % the thus-far observed points 22 | % observed_labels: a list of labels corresponding to the 23 | % observations in train_ind 24 | % selectors: a cell array of function handles to selectors 25 | % to intersect 26 | % 27 | % Output: 28 | % 29 | % test_ind: a list of indices into problem.points indicating the 30 | % points to consider for labeling. Each index in test_ind 31 | % was selected by every provided selector. 32 | % 33 | % See also SELECTORS. 34 | 35 | % Copyright (c) 2013--2014 Roman Garnett. 36 | 37 | function test_ind = intersection_selector(problem, train_ind, ... 
38 | observed_labels, selectors) 39 | 40 | test_ind = selectors{1}(problem, train_ind, observed_labels); 41 | for i = 2:numel(selectors) 42 | test_ind = intersect(test_ind, selectors{i}(problem, train_ind, observed_labels)); 43 | end 44 | 45 | end -------------------------------------------------------------------------------- /selectors/meta_selectors/union_selector.m: -------------------------------------------------------------------------------- 1 | % UNION_SELECTOR takes the union of the output of selectors. 2 | % 3 | % This provides a meta-selector that returns the union of the test 4 | % points returned from each of a set of selectors. 5 | % 6 | % Usage: 7 | % 8 | % test_ind = union_selector(problem, train_ind, observed_labels, selectors) 9 | % 10 | % Inputs: 11 | % 12 | % problem: a struct describing the problem, containing fields: 13 | % 14 | % points: an (n x d) data matrix for the available points 15 | % num_classes: the number of classes 16 | % num_queries: the number of queries to make 17 | % 18 | % train_ind: a list of indices into problem.points indicating 19 | % the thus-far observed points 20 | % observed_labels: a list of labels corresponding to the 21 | % observations in train_ind 22 | % selectors: a cell array of function handles to selectors 23 | % to combine 24 | % 25 | % Output: 26 | % 27 | % test_ind: a list of indices into problem.points indicating the 28 | % points to consider for labeling. Each index in test_ind 29 | % was selected by at least one of the provided selectors. 30 | % 31 | % See also SELECTORS. 32 | 33 | % Copyright (c) 2014 Roman Garnett. 34 | 35 | function test_ind = union_selector(problem, train_ind, observed_labels, ... 
36 | selectors) 37 | 38 | test_ind = selectors{1}(problem, train_ind, observed_labels); 39 | for i = 2:numel(selectors) 40 | test_ind = union(test_ind, selectors{i}(problem, train_ind, observed_labels)); 41 | end 42 | 43 | end -------------------------------------------------------------------------------- /selectors/probability_treshold_selector.m: -------------------------------------------------------------------------------- 1 | % PROBABILITY_THRESHOLD_SELECTOR selects confident points. 2 | % 3 | % This provides a selector that selects points with at least one 4 | % class-membership probability at or above a specified threshold 5 | % according to a given model. 6 | % 7 | % Usage: 8 | % 9 | % test_ind = probability_treshold_selector(problem, train_ind, ... 10 | % observed_labels, model, threshold) 11 | % 12 | % Inputs: 13 | % problem: a struct describing the problem, containing fields: 14 | % 15 | % points: an (n x d) data matrix for the available points 16 | % num_classes: the number of classes 17 | % 18 | % train_ind: a list of indices into problem.points indicating 19 | % the thus-far observed points 20 | % observed_labels: a list of labels corresponding to the 21 | % observations in train_ind 22 | % model: a handle to a probability model 23 | % threshold: the (inclusive) probability threshold for selection 24 | % Output: 25 | % 26 | % test_ind: a list of indices into problem.points indicating the 27 | % points to consider for labeling. Each index in test_ind 28 | % has at least one class-membership probability greater 29 | % than or equal to the provided threshold. 30 | % 31 | % See also SELECTORS, MODELS. 32 | 33 | % Copyright (c) 2011--2014 Roman Garnett. 34 | 35 | function test_ind = probability_treshold_selector(problem, train_ind, ...
36 | observed_labels, model, threshold) 37 | 38 | test_ind = identity_selector(problem, [], []); 39 | 40 | probabilities = model(problem, train_ind, observed_labels, test_ind); 41 | 42 | test_ind = find(any(probabilities >= threshold, 2)); 43 | 44 | end 45 | -------------------------------------------------------------------------------- /selectors/random_selector.m: -------------------------------------------------------------------------------- 1 | % RANDOM_SELECTOR selects a random subset of points. 2 | % 3 | % Usage: 4 | % 5 | % test_ind = random_selector(problem, train_ind, observed_labels, ... 6 | % num_test) 7 | % 8 | % Inputs: 9 | % 10 | % problem: a struct describing the problem, which must at 11 | % least contain the field: 12 | % 13 | % points: an (n x d) data matrix for the available points 14 | % 15 | % train_ind: a list of indices into problem.points indicating 16 | % the thus-far observed points 17 | % 18 | % Note: this input, part of the standard selector 19 | % API, is ignored by random_selector. If desired, 20 | % for standalone use it can be replaced by an empty 21 | % matrix. 22 | % 23 | % observed_labels: a list of labels corresponding to the 24 | % observations in train_ind 25 | % 26 | % Note: this input, part of the standard selector 27 | % API, is ignored by random_selector. If desired, 28 | % for standalone use it can be replaced by an empty 29 | % matrix. 30 | % 31 | % num_test: the number of test points to select 32 | % 33 | % Output: 34 | % 35 | % test_ind: a list of indices into problem.points indicating the 36 | % points to consider for labeling 37 | % 38 | % See also SELECTORS. 39 | 40 | % Copyright (c) 2011--2014 Roman Garnett.
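random_selector draws num_test distinct point indices uniformly at random, which is what MATLAB's randperm(n, k) provides. The same behavior can be sketched in Python/NumPy — purely for illustration, the names below are not part of the toolbox:

```python
import numpy as np

def random_selector(num_points, num_test, seed=None):
    # draw num_test distinct point indices uniformly at random,
    # mirroring MATLAB's randperm(num_points, num_test)
    rng = np.random.default_rng(seed)
    return rng.choice(num_points, size=num_test, replace=False)

ind = random_selector(100, 10, seed=0)
```

Sampling without replacement matters here: a selector must return a set of candidate points, so duplicate indices would be meaningless.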
41 | 42 | function test_ind = random_selector(problem, ~, ~, num_test) 43 | 44 | test_ind = randperm(size(problem.points, 1), num_test); 45 | 46 | end -------------------------------------------------------------------------------- /selectors/selectors.m: -------------------------------------------------------------------------------- 1 | % A selector considers the current labeled dataset and indicates which 2 | % of the unlabeled points should be considered for observation at this 3 | % time. 4 | % 5 | % Selectors must satisfy the following interface: 6 | % 7 | % test_ind = selector(problem, train_ind, observed_labels) 8 | % 9 | % Inputs: 10 | % 11 | % problem: a struct describing the problem, containing fields: 12 | % 13 | % points: an (n x d) data matrix for the available points 14 | % num_classes: the number of classes 15 | % num_queries: the number of queries to make 16 | % 17 | % train_ind: a list of indices into problem.points indicating 18 | % the thus-far observed points 19 | % observed_labels: a list of labels corresponding to the 20 | % observations in train_ind 21 | % 22 | % Output: 23 | % 24 | % test_ind: a list of indices into problem.points indicating the 25 | % points to consider for labeling 26 | % 27 | % The following general-purpose selectors are provided in this 28 | % toolbox: 29 | % 30 | % fixed_test_set_selector: selects all points besides a given test 31 | % set 32 | % graph_walk_selector: confines an experiment to follow a path 33 | % on a graph 34 | % identity_selector: selects all points 35 | % random_selector: selects a random subset of points 36 | % unlabeled_selector: selects points not yet observed 37 | % 38 | % In addition, the following "meta" selectors are provided, which 39 | % combine or modify the outputs of other selectors: 40 | % 41 | % complement_selector: takes the complement of a selector's output 42 | % intersection_selector: takes the intersection of the outputs of selectors 43 | % union_selector: takes the union of the 
outputs of selectors 44 | 45 | % Copyright (c) 2011--2014 Roman Garnett. -------------------------------------------------------------------------------- /selectors/unlabeled_selector.m: -------------------------------------------------------------------------------- 1 | % UNLABELED_SELECTOR selects points not yet observed. 2 | % 3 | % Usage: 4 | % 5 | % test_ind = unlabeled_selector(problem, train_ind, observed_labels) 6 | % 7 | % Inputs: 8 | % 9 | % problem: a struct describing the problem, which must at 10 | % least contain the field: 11 | % 12 | % points: an (n x d) data matrix for the available points 13 | % 14 | % train_ind: a list of indices into problem.points indicating 15 | % the thus-far observed points 16 | % observed_labels: a list of labels corresponding to the 17 | % observations in train_ind 18 | % 19 | % Note: this input, part of the standard selector 20 | % API, is ignored by unlabeled_selector. If 21 | % desired, for standalone use it can be replaced by 22 | % an empty matrix. 23 | % 24 | % Output: 25 | % 26 | % test_ind: a list of indices into problem.points indicating the 27 | % points to consider for labeling 28 | % 29 | % See also SELECTORS. 30 | 31 | % Copyright (c) 2013--2014 Roman Garnett. 32 | 33 | function test_ind = unlabeled_selector(problem, train_ind, ~) 34 | 35 | test_ind = identity_selector(problem, [], []); 36 | test_ind(train_ind) = []; 37 | 38 | end --------------------------------------------------------------------------------
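As a closing illustration, the three probability-based computations documented in this section — the predictive margin, the marginal entropy, and probability-threshold selection — applied to a small hand-made class-probability matrix. This is a Python/NumPy sketch of the formulas only, not toolbox code:

```python
import numpy as np

# toy class-probability matrix: one row per candidate point, one column per class
p = np.array([[0.70, 0.20, 0.10],     # fairly confident
              [0.40, 0.35, 0.25],     # nearly uniform -> uncertain
              [0.99, 0.005, 0.005]])  # almost deterministic

# margin: difference between the largest and second-largest class probability
sorted_p = np.sort(p, axis=1)[:, ::-1]
margins = sorted_p[:, 0] - sorted_p[:, 1]

# marginal entropy, clamping zeros the same way marginal_entropy.m does
q = np.maximum(p, 1e-100)
entropies = -np.sum(q * np.log(q), axis=1)

# probability-threshold selection: keep points with any class probability >= 0.6
selected = np.flatnonzero(np.any(p >= 0.6, axis=1))
```

Margin sampling queries the point with the smallest margin and uncertainty sampling the point with the largest entropy — the near-uniform second row under both criteria here, though the two can disagree when there are more classes. The threshold selector keeps only the first and third rows, whose top probabilities clear 0.6.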