├── .gitignore ├── LICENSE ├── README.md ├── ibs_basic.m ├── ibs_example.m ├── ibslike.m ├── private │   └── ibs_test.m ├── psycho_gen.m └── psycho_nll.m /.gitignore: -------------------------------------------------------------------------------- 1 | *.asv 2 | *.out 3 | *.err 4 | .DS_Store 5 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2019 Luigi Acerbi 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Inverse binomial sampling (IBS) 2 | 3 | This repository contains MATLAB implementations and examples of IBS. For Python implementations, see [PyIBS](https://github.com/acerbilab/pyibs). 
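In brief, once you have a simulator, that is, a function that draws synthetic responses for a given parameter vector, estimating the log-likelihood of a dataset takes a single function call. The sketch below is purely illustrative; `my_simulator`, `theta`, `resp_mat`, and `design_mat` are placeholders for your own simulator, parameter vector, response matrix, and design matrix:

```matlab
% Illustrative sketch; my_simulator, theta, resp_mat, design_mat are placeholders
options = ibslike('defaults');   % default IBS options
options.Nreps = 10;              % independent IBS repeats per trial (the default)
% Unbiased estimate of the negative log-likelihood (and its variance):
[nll, nllvar] = ibslike(@my_simulator, theta, resp_mat, design_mat, options);
```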
4 | 5 | ## What is it? 6 | 7 | Inverse binomial sampling (IBS) is a technique to obtain unbiased, efficient estimates of the log-likelihood of a model by simulation. [[1](#references)] 8 | 9 | The typical scenario is the case in which you have a *simulator*, that is, a model from which you can randomly draw synthetic observations (for a given parameter vector), but cannot evaluate the log-likelihood analytically or numerically. In other words, IBS affords likelihood-based inference for models without explicit likelihood functions (also known as implicit models). 10 | 11 | ## Quickstart 12 | 13 | 14 | The main function is `ibslike.m`, which computes an estimate of the negative log-likelihood for a given simulator model and dataset via IBS. 15 | We recommend starting with the tutorial in `ibs_example.m`, which contains a full walkthrough with working example usages of IBS. 16 | 17 | IBS is commonly used as part of an algorithm for maximum-likelihood estimation or Bayesian inference: 18 | - For maximum-likelihood (or maximum-a-posteriori) estimation, we recommend combining IBS with [Bayesian Adaptive Direct Search (BADS)](https://github.com/acerbilab/bads). BADS is required to run the first part of the tutorial. 19 | - For Bayesian inference of posterior distributions and model evidence, we recommend using IBS with [Variational Bayesian Monte Carlo (VBMC)](https://github.com/acerbilab/vbmc). VBMC is required to run the second part of the tutorial. 20 | 21 | For practical recommendations and any other questions, check out the FAQ on the [IBS wiki](https://github.com/acerbilab/ibs/wiki). 22 | 23 | ## Code 24 | 25 | The files in this repository are described below: 26 | 27 | - `ibs_basic.m` is a bare-bones implementation of IBS for didactic purposes. 28 | - `ibslike.m` is a vectorized implementation of IBS which supports several advanced features (still a work in progress as we add more features). 
Please read the documentation in the file for usage information, and refer to the [dedicated section of the FAQ](https://github.com/acerbilab/ibs/wiki#matlab-implementation-ibslike) for specific questions about this MATLAB implementation. 29 | - Note that `ibslike` returns the *negative* log-likelihood, as it is meant to be used with an optimization method such as BADS. 30 | - Run `ibslike('test')` for a battery of unit tests. 31 | - If you want to run `ibslike` with [Variational Bayesian Monte Carlo](https://github.com/acerbilab/vbmc) (VBMC), note that you need to set the IBS options as follows: 32 | - `OPTIONS.ReturnPositive = true` to return the *positive* log-likelihood; 33 | - `OPTIONS.ReturnStd = true` to return as second output the standard deviation (SD) of the estimate (as opposed to the variance). 34 | - `ibs_example.m` is a tutorial with usage cases for IBS. 35 | - `psycho_gen.m` and `psycho_nll.m` are functions implementing, respectively, the generative model (simulator) and the negative log-likelihood function for the orientation discrimination model used as an example in the tutorial (see also Section 5.2 in the IBS paper). 36 | 37 | The code used to produce the results in the paper [[1](#references)] is available in the development repository [here](https://github.com/basvanopheusden/ibs-development). 38 | 39 | ## References 40 | 41 | 1. van Opheusden\*, B., Acerbi\*, L. & Ma, W.J. (2020). Unbiased and efficient log-likelihood estimation with inverse binomial sampling. *PLoS Computational Biology* 16(12): e1008483. (\* equal contribution) ([link](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008483)) 42 | 43 | You can cite IBS in your work with something along the lines of 44 | 45 | > We estimated the log-likelihood using inverse binomial sampling (IBS; van Opheusden, Acerbi & Ma, 2020), a technique that produces unbiased and efficient estimates of the log-likelihood via simulation. 
46 | 47 | If you use IBS in conjunction with [Bayesian Adaptive Direct Search](https://github.com/acerbilab/bads), as recommended in the paper, you could add 48 | 49 | > We obtained maximum-likelihood estimates of the model parameters via Bayesian Adaptive Direct Search (BADS; Acerbi & Ma, 2017), a hybrid Bayesian optimization algorithm which affords stochastic objective evaluations. 50 | 51 | and cite the appropriate paper: 52 | 53 | 2. Acerbi, L. & Ma, W. J. (2017). Practical Bayesian optimization for model fitting with Bayesian Adaptive Direct Search. In *Advances in Neural Information Processing Systems* 30:1834-1844. 54 | 55 | Similarly, if you use IBS in combination with [Variational Bayesian Monte Carlo](https://github.com/acerbilab/vbmc), you should cite these papers: 56 | 57 | 3. Acerbi, L. (2018). Variational Bayesian Monte Carlo. In *Advances in Neural Information Processing Systems* 31: 8222-8232. 58 | 4. Acerbi, L. (2020). Variational Bayesian Monte Carlo with Noisy Likelihoods. In *Advances in Neural Information Processing Systems* 33: 8211-8222. 59 | 60 | Besides formal citations, you can demonstrate your appreciation for our work in the following ways: 61 | 62 | - *Star* the IBS repository on GitHub; 63 | - Follow us on Twitter ([Luigi](https://twitter.com/AcerbiLuigi), [Bas](https://twitter.com/basvanopheusden)) for updates about IBS and other projects we are involved with; 64 | - Tell us about your model-fitting problem and your experience with IBS (positive or negative) in the [Discussions forum](https://github.com/orgs/acerbilab/discussions). 
65 | 66 | ### BibTeX 67 | 68 | ``` 69 | @article{vanOpheusden2020unbiased, 70 | title = {Unbiased and Efficient Log-Likelihood Estimation with Inverse Binomial Sampling}, 71 | author = {van Opheusden, Bas and Acerbi, Luigi and Ma, Wei Ji}, 72 | year = {2020}, 73 | journal = {PLOS Computational Biology}, 74 | volume = {16}, 75 | number = {12}, 76 | pages = {e1008483}, 77 | publisher = {{Public Library of Science}}, 78 | issn = {1553-7358}, 79 | doi = {10.1371/journal.pcbi.1008483}, 80 | } 81 | ``` 82 | 83 | ### License 84 | 85 | The IBS code is released under the terms of the [MIT License](https://github.com/acerbilab/ibs/blob/master/LICENSE). 86 | -------------------------------------------------------------------------------- /ibs_basic.m: -------------------------------------------------------------------------------- 1 | function L = ibs_basic(fun,theta,R,S) 2 | %IBS_BASIC A bare-bones implementation of inverse binomial sampling (IBS). 3 | % IBS_BASIC returns an unbiased estimate L of the log-likelihood for the 4 | % simulated model and data, calculated using inverse binomial sampling. 5 | % FUN is a function handle to a function that simulates the model's 6 | % responses (see below). 7 | % THETA is the parameter vector used to simulate the model's responses. 8 | % R is a "response" data matrix, where each row corresponds to one 9 | % observation or "trial" (e.g., a trial of a psychophysical experiment), 10 | % and each column represents a different response feature (e.g., the 11 | % subject's response and reported confidence level). Responses need to 12 | % belong to a finite set. 13 | % S is an optional "stimulus" matrix, where each row corresponds to one 14 | % trial, and each column corresponds to a different trial feature (such 15 | % as condition, stimulus value, etc.). 16 | % FUN is the generative model or simulator, which takes as input a vector 17 | % of parameters PARAMS and a row of S, and generates one row of the 18 | % response matrix. 
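%
%   Example (illustrative; MY_SIM stands for a user-supplied simulator
%   function handle, with THETA, R and S as described above):
%      L = ibs_basic(@my_sim,theta,R,S)   % unbiased log-likelihood estimate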
19 | % 20 | % This is a slow, bare bone implementation of IBS which should be used only 21 | % for didactic purposes. A proper vectorized implementation of IBS is 22 | % offered in the function IBSLIKE. 23 | % 24 | % See also IBSLIKE. 25 | 26 | % Luigi Acerbi 2020 27 | 28 | N = size(R,1); 29 | L = zeros(N,1); 30 | 31 | for i = 1:N % Loop over all trials (rows) 32 | K = 1; 33 | while any(fun(theta,S(i,:)) ~= R(i,:)) 34 | K = K + 1; % Sample until the generated response is a match 35 | end 36 | L(i) = -sum(1./(1:K-1)); % IBS estimator for the i-th trial 37 | end 38 | 39 | L = sum(L); % Return summed log-likelihood 40 | 41 | end -------------------------------------------------------------------------------- /ibs_example.m: -------------------------------------------------------------------------------- 1 | % IBS_EXAMPLE Example script for running inverse binomial sampling (IBS) 2 | 3 | % In this example, we fit some simple orientation discrimination data using 4 | % IBS and BADS. The model is described in the "Orientation discrimination" 5 | % section of the Results in the published paper. 6 | % Note that this model is very simple and used only for didactic purposes; 7 | % one should use the analytical log-likelihood whenever available. 8 | % 9 | % Reference: 10 | % van Opheusden*, B., Acerbi*, L. & Ma, W. J. (2020), "Unbiased and 11 | % efficient log-likelihood estimation with inverse binomial sampling". 12 | % (* equal contribution), PLoS Computational Biology 16(12): e1008483. 13 | % Link: https://doi.org/10.1371/journal.pcbi.1008483 14 | 15 | %% Generate simulated dataset 16 | 17 | % First, we simulate a synthetic dataset to fit. 18 | % Normally, you would have your own (real or simulated) data here. 
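% Aside (didactic sketch, base MATLAB only): the core IBS estimator used by
% IBS_BASIC can be sanity-checked on a single Bernoulli "trial" with known
% hit probability p. If K counts the simulations drawn until the first hit,
% the estimate -sum(1./(1:K-1)) is unbiased for log(p), so averaging many
% such estimates should approach log(p):
p_demo = 0.6; M_demo = 1e4; Lhat_demo = zeros(M_demo,1);
for m_demo = 1:M_demo
    K_demo = 1;
    while rand() > p_demo, K_demo = K_demo + 1; end   % sample until first hit
    Lhat_demo(m_demo) = -sum(1./(1:K_demo-1));        % IBS estimate of log(p)
end
fprintf('IBS sanity check: mean estimate %.3f (log(p) = %.3f).\n', mean(Lhat_demo), log(p_demo));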
19 | 20 | fprintf('Generating synthetic data set...\n'); 21 | 22 | Ntrials = 600; 23 | eta = log(1); % Fake subject (log) sensory noise 24 | bias = 0.2; % Fake subject bias 25 | lapse = 0.03; % Fake subject lapse rate 26 | theta_true = [eta,bias,lapse]; % Generating parameter vector 27 | 28 | S = 3*randn(Ntrials,1); % Generate stimulus orientation per trial 29 | R = psycho_gen(theta_true,S); % Generate fake subject responses 30 | 31 | % We set the lower/upper bounds for the parameters (in particular, note that 32 | % we set a nonzero lower bound for the lapse rate) 33 | LB = [log(0.1) -2 0.01]; 34 | UB = [log(10) 2 1]; 35 | 36 | % We also set the "plausible" lower/upper bounds representing where we 37 | % would expect to find most parameters (this is not a prior, it just 38 | % identifies a region e.g. for the starting points) 39 | PLB = [log(0.2) -1 0.02]; 40 | PUB = [log(5) 1 0.2]; 41 | 42 | 43 | %% Maximum-likelihood estimation (MLE) 44 | 45 | % We fit the data via maximum-likelihood estimation using Bayesian Adaptive 46 | % Direct Search (BADS), a particularly effective optimization algorithm for 47 | % noisy target functions. 48 | % If you do not have the BADS toolbox installed, you can freely download it 49 | % from here: https://github.com/acerbilab/bads 50 | % Please ensure you have the latest version of BADS installed (v1.1.1 or 51 | % newer), as it incorporates tweaks that improve performance with IBS. 52 | 53 | fprintf('Maximum-likelihood estimation with BADS using IBS...\n'); 54 | 55 | if isempty(which('bads.m')) % Check that you have BADS 56 | error('BADS not found. 
You can install it from here: https://github.com/acerbilab/bads'); 57 | end 58 | 59 | old_bads = false; 60 | try 61 | bads_version = bads('version'); 62 | fprintf('BADS found (version %s).\n', bads_version); 63 | bads_version(bads_version == '.') = ' '; 64 | bads_version = str2num(bads_version); 65 | if bads_version(2) < 1 66 | old_bads = true; 67 | end 68 | catch 69 | old_bads = true; 70 | end 71 | 72 | if old_bads 73 | error('You have installed an older version of BADS. You can find the latest version here: https://github.com/acerbilab/bads'); 74 | end 75 | 76 | fprintf('(press a key to continue)\n'); 77 | pause; 78 | 79 | % We define the negative log-likelihood function via a call to IBSLIKE 80 | % (IBSLIKE provides a vectorized implementation of IBS for MATLAB; check 81 | % out IBS_BASIC for a bare bone implementation) 82 | 83 | % We set 10 reps for the IBS estimator (see "Iterative multi-fidelity" 84 | % section in the Methods of the paper). Note that this is also the default 85 | % for IBSLIKE 86 | % We also tell IBSLIKE to return as a second output the standard deviation 87 | % of the estimate (as opposed to the variance). This is very important! 88 | 89 | options_ibs = ibslike('defaults'); 90 | options_ibs.Nreps = 10; 91 | options_ibs.ReturnStd = true; 92 | 93 | nllfun_ibs = @(theta) ibslike(@psycho_gen,theta,R,S,options_ibs); 94 | 95 | % As a starting point for the optimization, we draw a sample inside the 96 | % plausible box (in practice you should use multiple restarts!) 
97 | theta0 = rand(size(LB)).*(PUB-PLB) + PLB; 98 | 99 | % We inform BADS that IBSLIKE returns the noise estimate (SD) as second output 100 | options = bads('defaults'); 101 | options.SpecifyTargetNoise = true; 102 | 103 | theta_ibs = bads(nllfun_ibs,theta0,LB,UB,PLB,PUB,[],options); 104 | 105 | % Compare with the MLE obtained using the analytical log-likelihood expression 106 | 107 | fprintf('Maximum-likelihood estimation with BADS using the exact log-likelihood...\n'); 108 | fprintf('(press a key to continue)\n'); 109 | pause; 110 | 111 | options = bads('defaults'); 112 | theta_exact = bads(@(x) psycho_nll(x,S,R),theta0,LB,UB,PLB,PUB,[],options); 113 | 114 | 115 | %% Analysis of results 116 | 117 | fprintf('Returned maximum-likelihood solutions with different methods:\n'); 118 | 119 | theta_ibs 120 | theta_exact 121 | 122 | fprintf('True data-generating parameter value:\n'); 123 | theta_true 124 | 125 | % We now compute the log-likelihood at the returned solution with higher 126 | % precision (100 repeats), and estimate its uncertainty 127 | 128 | options_ibs = ibslike('defaults'); 129 | options_ibs.Nreps = 100; 130 | options_ibs.ReturnStd = true; 131 | 132 | % With OPTIONS.ReturnStd = true, IBSLIKE returns as second output the 133 | % standard deviation (SD) of the IBS estimate 134 | [nll_ibs,nll_std] = ibslike(@psycho_gen,theta_ibs,R,S,options_ibs); 135 | nll_exact = psycho_nll(theta_exact,S,R); 136 | 137 | fprintf('Estimated log-likelihood at the IBS-found solution: %.2f +/- %.2f (exact value: %.2f).\n', ... 
138 | -nll_ibs,nll_std,-psycho_nll(theta_ibs,S,R)); 139 | 140 | % We also evaluate the MLE found exactly 141 | fprintf('Log-likelihood at the exact-found solution: %.2f.\n',-nll_exact); 142 | 143 | % Note that even the exact MLE will differ from the true data-generating 144 | % parameters due to finiteness of the dataset (we expect to recover the 145 | % true data generating parameters in the limit NTRIALS -> infinity) 146 | 147 | 148 | %% Bayesian posterior estimation 149 | 150 | close all; 151 | 152 | % We now fit the same data via a method that computes the Bayesian posterior 153 | % over model parameters, called Variational Bayesian Monte Carlo (VBMC). 154 | % If you do not have the VBMC toolbox installed, you can freely download it 155 | % from here: https://github.com/acerbilab/vbmc 156 | 157 | fprintf('Bayesian posterior estimation with Variational Bayesian Monte Carlo (VBMC) using IBS...\n'); 158 | fprintf('(press a key to continue)\n'); 159 | pause; 160 | 161 | if isempty(which('vbmc.m')) % Check that you have VBMC 162 | error('VBMC not found. You can install it from here: https://github.com/acerbilab/vbmc'); 163 | end 164 | 165 | folder = fileparts(which('vbmc.m')); 166 | addpath([folder filesep 'utils']); % Ensure that UTILS is on the path 167 | 168 | % For VBMC, we need to return the POSITIVE log-likelihood as first output, 169 | % and the standard deviation (not variance!) of the estimate as second 170 | % output. 171 | 172 | % Options for IBSLIKE 173 | options_ibs = ibslike('defaults'); 174 | options_ibs.Nreps = 100; % Try and have a SD ~ 1 (and below 3) 175 | options_ibs.ReturnPositive = true; % Return *positive* log-likelihood 176 | options_ibs.ReturnStd = true; % 2nd output is SD (not variance!) 177 | 178 | llfun = @(theta) ibslike(@psycho_gen,theta,R,S,options_ibs); 179 | 180 | % We add a trapezoidal or "tent" prior over the parameters (flat between 181 | % PLB and PUB, and linearly decreasing to 0 towards LB and UB). 
182 | lpriorfun = @(x) log(mtrapezpdf(x,LB,PLB,PUB,UB)); 183 | 184 | % Since LLFUN has now two outputs (the log likelihood at X, and the 185 | % estimated SD of the log likelihood at X), we cannot directly sum the log 186 | % prior from LPRIORFUN and the log likelihood from LLFUN. 187 | % Thus, we use an auxiliary function LPOSTFUN that does exactly this. 188 | 189 | fun = @(x) lpostfun(x,llfun,lpriorfun); % Log joint 190 | 191 | % FUN will have two outputs: the first output is the sum of LLFUN and 192 | % LPRIORFUN (the log joint), and the second output is the variability of 193 | % LLFUN (that is, the second output of LLFUN). It is assumed that the prior 194 | % can be computed analytically and adds no noise. 195 | 196 | % Here we could use as starting point the result of a run of BADS 197 | theta0 = 0.5*(PLB+PUB); 198 | options = vbmc('defaults'); 199 | options.Plot = true; % Plot iterations 200 | options.SpecifyTargetNoise = true; % Noisy function evaluations 201 | 202 | % Run VBMC 203 | [vp,elbo,elbo_sd] = vbmc(fun,theta0,LB,UB,PLB,PUB,options); 204 | 205 | close all; 206 | 207 | Xs = vbmc_rnd(vp,3e5); % Generate samples from the variational posterior 208 | 209 | close all; 210 | cornerplot(Xs,{'\eta (log noise)','bias','lapse'},[eta,bias,lapse]); 211 | 212 | fprintf(' Have a look at the triangle-plot visualization of the approximate posterior.\n') 213 | fprintf(' The black line represents the true generating parameters for the fake dataset.\n') 214 | fprintf(' For more information, see tutorials and FAQ at https://github.com/acerbilab/vbmc.\n') 215 | 216 | 217 | % Luigi Acerbi, 2021-2022 218 | -------------------------------------------------------------------------------- /ibslike.m: -------------------------------------------------------------------------------- 1 | function [nlogL,nlogLvar,exitflag,output] = ibslike(fun,params,respMat,designMat,options,varargin) 2 | %IBSLIKE Unbiased negative log-likelihood via inverse binomial sampling. 
3 | % NLOGL = IBSLIKE(FUN,PARAMS,RESPMAT,DESIGNMAT) returns unbiased estimate 4 | % NLOGL of the negative of the log-likelihood for the simulated model 5 | % and data, calculated using inverse binomial sampling (IBS). 6 | % FUN is a function handle to a function that simulates the model's 7 | % responses (see below). 8 | % PARAMS is the parameter vector used to simulate the model's responses. 9 | % RESPMAT is a "response" data matrix, where each row corresponds to one 10 | % observation or "trial" (e.g., a trial of a psychophysical experiment), 11 | % and each column represents a different response feature (e.g., the 12 | % subject's response and reported confidence level). Responses need to 13 | % belong to a finite set. 14 | % DESIGNMAT is an optional experimental design matrix, where each row 15 | % corresponds to one trial, and each column corresponds to a different 16 | % trial feature (such as condition, stimulus value, etc.). 17 | % 18 | % FUN takes as input a vector of parameters PARAMS and an experimental 19 | % design matrix DMAT (one row per trial), and generates a matrix of 20 | % simulated model responses (one row per trial, corresponding to rows of 21 | % DMAT). DMAT is built by the algorithm out of rows of DESIGNMAT. 22 | % 23 | % DESIGNMAT can be omitted or left empty, in which case FUN needs to 24 | % accept a parameter vector PARAMS and an array of trial numbers T, and 25 | % returns a matrix of simulated responses, where the i-th row contains 26 | % the simulated response for trial T(i) (the indices in T may repeat). 27 | % 28 | % NLOGL = IBSLIKE(FUN,PARAMS,RESPMAT,DESIGNMAT,OPTIONS) uses options in 29 | % structure OPTIONS to replace default values. (To be explained...) 30 | % 31 | % NLOGL = IBSLIKE(...,VARARGIN) additional arguments are passed to FUN. 32 | % 33 | % [NLOGL,NLOGLVAR] = IBSLIKE(...) also returns an estimate NLOGLVAR of the 34 | % variance of the log likelihood. 35 | % 36 | % [NLOGL,NLOGLVAR,EXITFLAG] = IBSLIKE(...) 
returns an EXITFLAG that 37 | % describes the exit condition. Possible values of EXITFLAG and the 38 | % corresponding exit conditions are 39 | % 40 | % 2 IBS terminated after reaching the maximum runtime specified by the 41 | % user (the estimate can be arbitrarily biased). 42 | % 1 IBS terminated after reaching the negative log-likelihood 43 | % threshold specified by the user (the estimate is biased). 44 | % 0 Correct run of IBS; the estimate is unbiased. 45 | % 46 | % [NLOGL,NLOGLVAR,EXITFLAG,OUTPUT] = IBSLIKE(...) returns a structure 47 | % OUTPUT with additional information about the sampling. 48 | % 49 | % OPTIONS = IBSLIKE('defaults') returns a basic default OPTIONS structure. 50 | % 51 | % EXITFLAG = IBSLIKE('test') runs some tests. Here EXITFLAG is 0 if 52 | % everything works correctly. 53 | % 54 | % Test code on binomial sampling: 55 | % p = 0.7; Ntrials = 100; % Define binomial probability 56 | % fun = @(x,dmat) rand(size(dmat)) < x; % Simulating function 57 | % rmat = fun(p,NaN(Ntrials,1)); % Generate responses 58 | % [nlogL,nlogLvar,exitflag,output] = ibslike(fun,p,rmat); 59 | % nlogL_true = -log(p)*sum(rmat == 1) - log(1-p)*sum(rmat == 0); 60 | % fprintf('Ground truth: %.4g, Estimate: %.4g ± %.4g.\n',nlogL_true,nlogL,sqrt(nlogLvar)); 61 | % 62 | % Reference: 63 | % van Opheusden*, B., Acerbi*, L. & Ma, W. J. (2020), "Unbiased and 64 | % efficient log-likelihood estimation with inverse binomial sampling". 65 | % (* equal contribution), PLoS Computational Biology 16(12): e1008483. 66 | % Link: https://doi.org/10.1371/journal.pcbi.1008483 67 | % 68 | % See also IBS_BASIC. 69 | 70 | %-------------------------------------------------------------------------- 71 | % IBS: Inverse Binomial Sampling for unbiased log-likelihood estimation 72 | % To be used under the terms of the MIT License 73 | % (https://opensource.org/licenses/MIT). 
74 | % 75 | % Authors (copyright): Luigi Acerbi and Bas van Opheusden, 2020-2022 76 | % e-mail: luigi.acerbi@helsinki.fi, svo@princeton.edu 77 | % URL: http://luigiacerbi.com 78 | % Version: 0.96 79 | % Release date: Jan 21, 2021 80 | % Code repository: https://github.com/acerbilab/ibs 81 | %-------------------------------------------------------------------------- 82 | 83 | if nargin < 4; designMat = []; end 84 | if nargin < 5; options = []; end 85 | 86 | t0 = tic; 87 | 88 | % Default options 89 | % defopts.Display = 'off'; % Level of display on screen 90 | defopts.Nreps = 10; % # independent log-likelihood estimates per trial 91 | defopts.NegLogLikeThreshold = Inf; % Stop sampling if estimated nLL is above threshold (incompatible with vectorized sampling) 92 | defopts.Vectorized = 'auto'; % Use vectorized sampling algorithm with acceleration 93 | defopts.Acceleration = 1.5; % Acceleration factor for vectorized sampling 94 | defopts.NsamplesPerCall = 0; % # starting samples per trial per function call (0 = choose automatically) 95 | defopts.MaxIter = 1e5; % Maximum number of iterations (per trial and estimate) 96 | defopts.ReturnPositive = false; % If true, the first returned output is the *positive* log-likelihood 97 | defopts.ReturnStd = false; % If true, the second returned output is the standard deviation of the estimate 98 | defopts.MaxTime = Inf; % Maximum time for a IBS call (in seconds) 99 | defopts.TrialWeights = []; % Vector of per-trial weights to yield a weighted sum of log-likelihood 100 | 101 | %% If called with no arguments or with 'defaults', return default options 102 | if nargout <= 1 && (nargin == 0 || (nargin == 1 && ischar(fun) && strcmpi(fun,'defaults'))) 103 | if nargin < 1 104 | fprintf('Basic default options returned (type "help ibslike" for help).\n'); 105 | end 106 | nlogL = defopts; 107 | return; 108 | end 109 | 110 | %% If called with the first argument as 'test', run test 111 | if ischar(fun) && strcmpi(fun,'test') 112 | if nargin < 
2; options = []; else; options = params; end 113 | figure; 114 | subplot(1,3,1); 115 | exitflag(1) = runtest1(options); 116 | subplot(1,3,2); 117 | exitflag(2) = runtest2(options); 118 | subplot(1,3,3); 119 | exitflag(3) = runtest3(options); 120 | nlogL = any(exitflag); 121 | return; 122 | end 123 | 124 | for f = fields(defopts)' 125 | if ~isfield(options,f{:}) || isempty(options.(f{:})) 126 | options.(f{:}) = defopts.(f{:}); 127 | end 128 | end 129 | 130 | Ntrials = size(respMat,1); 131 | 132 | % Add hard-coded options 133 | options.MaxSamples = 1e4; % Maximum # of samples per function call 134 | options.AccelerationThreshold = 0.1; % Keep accelerating until threshold is passed (in s) 135 | options.VectorizedThreshold = 0.1; % Max threshold for using vectorized algorithm (in s) 136 | options.MaxMem = 1e6; % Maximum number of samples for vectorized implementation 137 | options.MaxMem = max(min(Ntrials,1e4),10)*100; % Maximum number of samples for vectorized implementation 138 | 139 | % NSAMPLESPERCALL should be a scalar integer 140 | if ~isnumeric(options.NsamplesPerCall) || ~isscalar(options.NsamplesPerCall) 141 | error('ibslike:NsamplesPerCall','OPTIONS.NsamplesPerCall should be a scalar integer.'); 142 | end 143 | 144 | % ACCELERATION should be a scalar equal or greater than 1 145 | if ~isnumeric(options.Acceleration) || ~isscalar(options.Acceleration) || ... 146 | options.Acceleration < 1 147 | error('ibslike:Acceleration','OPTIONS.Acceleration should be a scalar equal or greater than one.'); 148 | end 149 | 150 | % NEGLOGLIKETHRESHOLD should be a scalar greater than 0 (or Inf) 151 | if ~isnumeric(options.NegLogLikeThreshold) || ~isscalar(options.NegLogLikeThreshold) || ... 
152 | options.NegLogLikeThreshold <= 0 153 | error('ibslike:NegLogLikeThreshold','OPTIONS.NegLogLikeThreshold should be a positive scalar (including Inf).'); 154 | end 155 | 156 | % MAXTIME should be a positive scalar (including Inf) 157 | if ~isnumeric(options.MaxTime) || ~isscalar(options.MaxTime) || ... 158 | options.MaxTime <= 0 159 | error('ibslike:MaxTime','OPTIONS.MaxTime should be a positive scalar (or Inf).'); 160 | end 161 | 162 | % WEIGHT vector should be a scalar or same as number of trials 163 | weights = options.TrialWeights(:); 164 | if isempty(weights); weights = 1; end 165 | 166 | if numel(weights) ~= 1 && numel(weights) ~= Ntrials 167 | error('ibslike:NumWeights','OPTIONS.TrialWeights should be empty, a scalar or an array of weights with as many elements as the number of trials.'); 168 | end 169 | 170 | Trials = (1:Ntrials)'; 171 | funcCount = 0; 172 | 173 | simdata = []; elapsed_time = []; 174 | 175 | % Use vectorized or loop version? 176 | if ischar(options.Vectorized) && options.Vectorized(1) == 'a' 177 | if options.Nreps == 1 178 | vectorized_flag = false; 179 | else 180 | % First full simulation to determine computation time 181 | fun_clock = tic; 182 | if isempty(designMat) % Pass only trial indices 183 | simdata = fun(params,Trials(:),varargin{:}); 184 | else % Pass full design matrix per trial 185 | simdata = fun(params,designMat(Trials(:),:),varargin{:}); 186 | end 187 | elapsed_time = toc(fun_clock); 188 | vectorized_flag = elapsed_time < options.VectorizedThreshold; 189 | funcCount = 1; 190 | end 191 | else 192 | vectorized_flag = logical(options.Vectorized); 193 | if options.Nreps == 1 && vectorized_flag 194 | vectorized_flag = false; 195 | warning('Vectorized IBS requires OPTIONS.Nreps > 1. Switching to non-vectorized algorithm.'); 196 | end 197 | end 198 | 199 | if vectorized_flag 200 | [nlogL,K,Nreps,Ns,fc,exitflag] = ... 
201 | vectorized_ibs_sampling(fun,params,respMat,designMat,simdata,elapsed_time,t0,options,varargin{:}); 202 | else 203 | [nlogL,K,Nreps,Ns,fc,exitflag] = ... 204 | loop_ibs_sampling(fun,params,respMat,designMat,simdata,elapsed_time,t0,options,varargin{:}); 205 | end 206 | funcCount = funcCount + fc; 207 | 208 | % Variance of estimate per trial 209 | if nargout > 1 210 | K_max = max(max(K(:),1)); 211 | Ktab = -(psi(1,1:K_max)' - psi(1,1)); 212 | LLvar = Ktab(max(K,1)); 213 | nlogLvar = sum(LLvar,2)./Nreps.^2; 214 | end 215 | 216 | % OUTPUT structure with additional information 217 | if nargout > 2 218 | output.funcCount = funcCount; 219 | output.NsamplesPerTrial = Ns/Ntrials; 220 | output.nlogL_trials = nlogL; 221 | output.nlogLvar_trials = nlogLvar; 222 | end 223 | 224 | % Return negative log-likelihood and variance summed over trials 225 | nlogL = sum(nlogL.*weights); 226 | if options.ReturnPositive; nlogL = -nlogL; end 227 | if nargout > 1 228 | nlogLvar = sum(nlogLvar.*(weights.^2)); 229 | if options.ReturnStd % Return standard deviation instead of variance 230 | nlogLvar = sqrt(nlogLvar); 231 | end 232 | end 233 | 234 | end 235 | 236 | %-------------------------------------------------------------------------- 237 | function [nlogL,K,Nreps,Ns,fc,exitflag] = vectorized_ibs_sampling(fun,params,respMat,designMat,simdata0,elapsed_time0,t0,options,varargin) 238 | 239 | Ntrials = size(respMat,1); 240 | Trials = (1:Ntrials)'; 241 | Ns = 0; 242 | fc = 0; 243 | exitflag = 0; 244 | 245 | Psi_tab = []; % Empty PSI table 246 | 247 | % Empty matrix of K values (samples-to-hit) for each repeat for each trial 248 | K_mat = zeros([max(options.Nreps),Ntrials]); 249 | 250 | % Matrix of rep counts 251 | K_place0 = repmat((1:size(K_mat,1))',[1,Ntrials]); 252 | 253 | % Current rep being sampled for each trial 254 | Ridx = ones(1,Ntrials); 255 | 256 | % Current vector of "open" K values per trial (not reached a "hit" yet) 257 | K_open = zeros(1,Ntrials); 258 | 259 | targetHits = 
options.Nreps(:)'.*ones(1,Ntrials); 260 | MaxIter = options.MaxIter*max(options.Nreps); 261 | 262 | % Starting samples 263 | if options.NsamplesPerCall == 0 264 | samples_level = options.Nreps; 265 | else 266 | samples_level = options.NsamplesPerCall; 267 | end 268 | 269 | for iter = 1:MaxIter 270 | % Pick trials that need more hits, sample multiple times 271 | T = Trials(Ridx <= targetHits); 272 | if isfinite(options.MaxTime) && toc(t0) > options.MaxTime 273 | T = []; 274 | exitflag = 2; 275 | end 276 | if isempty(T); break; end 277 | 278 | Ttrials = numel(T); % Number of trials under consideration 279 | 280 | % With accelerated sampling, might request multiple samples at once 281 | Nsamples = min(options.MaxSamples,max(1,round(samples_level))); 282 | MaxSamples = ceil(options.MaxMem / Ttrials); 283 | Nsamples = min(Nsamples, MaxSamples); 284 | Tmat = repmat(T,[1,Nsamples]); 285 | 286 | % Simulate trials 287 | if iter == 1 && Nsamples == 1 && ~isempty(simdata0) 288 | simdata = simdata0; 289 | elapsed_time = elapsed_time0; 290 | else 291 | fun_clock = tic; 292 | if isempty(designMat) % Pass only trial indices 293 | simdata = fun(params,Tmat(:),varargin{:}); 294 | fc = fc + 1; 295 | else % Pass full design matrix per trial 296 | simdata = fun(params,designMat(Tmat(:),:),varargin{:}); 297 | fc = fc + 1; 298 | end 299 | elapsed_time = toc(fun_clock); 300 | end 301 | 302 | % Check that the returned simulated data have the right size 303 | if size(simdata,1) ~= numel(Tmat) 304 | error('ibslike:SizeMismatch', ... 
305 | 'Number of rows of returned simulated data does not match the number of requested trials.'); 306 | end 307 | 308 | Ns = Ns + Ttrials; 309 | 310 | % Accelerated sampling 311 | if options.Acceleration > 0 && elapsed_time < options.AccelerationThreshold 312 | samples_level = samples_level*options.Acceleration; 313 | end 314 | 315 | % Check new "hits" 316 | hits_temp = all(respMat(Tmat(:),:) == simdata,2); 317 | 318 | % Build matrix of new hits (sandwich with buffer of hits, then removed) 319 | hits_new = [ones(1,Ttrials);reshape(hits_temp,size(Tmat))';ones(1,Ttrials)]; 320 | 321 | % Warning: from now on it's going to be incomprehensible 322 | % (all vectorized for speed) 323 | 324 | % Extract matrix of Ks from matrix of hits for this iteration 325 | h = size(hits_new,1); 326 | list = find(hits_new(:) == 1)-1; 327 | row = floor(list/h)+1; 328 | col = mod(list,h)+1; 329 | delta = diff([col;1]); 330 | remidx = delta <= 0; 331 | delta(remidx) = []; 332 | row(remidx) = []; 333 | indexcol = find(diff([0;row])); 334 | col = 1 + (1:numel(row))' - indexcol(row); 335 | K_iter = zeros(size(T,1),max(col)); 336 | K_iter(row + (col-1)*size(K_iter,1)) = delta; 337 | 338 | % This is the comprehensible version that we want to get to: 339 | % 340 | % for iTrial = 1:Ntrials 341 | % index = find(hits_new(iTrial,:),targetHits(iTrial)); 342 | % K = diff([0 index]); 343 | % logL(iTrial) = sum(Ktab(K))/numel(index); 344 | % end 345 | 346 | 347 | % Add still-open K to first column 348 | K_iter(:,1) = K_open(T)' + K_iter(:,1); 349 | 350 | % Find last K position for each trial 351 | [~,idx_last] = min([K_iter,zeros(Ttrials,1)],[],2); 352 | idx_last = idx_last - 1; 353 | ii = sub2ind(size(K_iter),(1:Ttrials)',idx_last); 354 | 355 | % Subtract one hit from last K (it was added) 356 | K_iter(ii) = K_iter(ii) - 1; 357 | K_open(T) = K_iter(ii)'; 358 | 359 | % For each trial, ignore entries of K_iter past max # of reps 360 | idx_mat = 
bsxfun(@plus,Ridx(T)',repmat(0:size(K_iter,2)-1,[Ttrials,1])); 361 | K_iter(idx_mat > (options.Nreps)) = 0; 362 | 363 | % Find last K position for each trial again 364 | [~,idx_last2] = min([K_iter,zeros(Ttrials,1)],[],2); 365 | idx_last2 = idx_last2 - 1; 366 | 367 | % Add current K to full K matrix 368 | K_iter_place = bsxfun(@ge,K_place0(:,1:Ttrials),Ridx(T)) & bsxfun(@le,K_place0(:,1:Ttrials),Ridx(T) + idx_last2'- 1); 369 | K_place = false(size(K_place0)); 370 | K_place(:,T) = K_iter_place; 371 | Kt = K_iter'; 372 | K_mat(K_place) = Kt(Kt > 0); 373 | Ridx(T) = Ridx(T) + idx_last' - 1; 374 | 375 | % Compute log-likelihood only if requested for thresholding 376 | if isfinite(options.NegLogLikeThreshold) 377 | Rmin = min(Ridx(T)); % Find repeat still ongoing 378 | if Rmin > size(K_mat,1); continue; end 379 | [LL_temp,Psi_tab] = get_LL_from_K(Psi_tab,K_mat(Rmin,:)); 380 | nLL_temp = -sum(LL_temp,1); 381 | if nLL_temp > options.NegLogLikeThreshold 382 | idx_move = Ridx == Rmin; 383 | Ridx(idx_move) = Rmin+1; 384 | K_open(idx_move) = 0; 385 | exitflag = 1; 386 | end 387 | end 388 | end 389 | 390 | if ~isempty(T) 391 | error('ibslike:ConvergenceFail', ... 392 | 'Maximum number of iterations or time limit reached and algorithm did not converge. 
Check FUN and DATA.'); 393 | end 394 | 395 | % Log likelihood estimate per trial and run lengths K for each repetition 396 | Nreps = sum(K_mat > 0,1)'; 397 | [LL_mat,Psi_tab] = get_LL_from_K(Psi_tab,K_mat); 398 | nlogL = sum(-LL_mat',2)./Nreps; 399 | K = K_mat'; 400 | 401 | end 402 | 403 | 404 | 405 | %-------------------------------------------------------------------------- 406 | function [nlogL,K,Nreps,Ns,fc,exitflag] = loop_ibs_sampling(fun,params,respMat,designMat,simdata0,elapsed_time0,t0,options,varargin) 407 | 408 | Ntrials = size(respMat,1); 409 | Trials = (1:Ntrials)'; 410 | MaxIter = options.MaxIter; 411 | exitflag = 0; 412 | 413 | K = zeros(Ntrials,options.Nreps); 414 | Ns = 0; 415 | fc = 0; 416 | Psi_tab = []; 417 | 418 | for iRep = 1:options.Nreps 419 | 420 | offset = 1; 421 | hits = zeros(Ntrials,1); 422 | if isfinite(options.MaxTime) && toc(t0) > options.MaxTime 423 | exitflag = 2; 424 | break; 425 | end 426 | 427 | for iter = 1:MaxIter 428 | % Pick trials that need more hits, sample multiple times 429 | T = Trials(hits < 1); 430 | if isempty(T); break; end 431 | 432 | % Simulate trials 433 | if iter == 1 && iRep == 1 && ~isempty(simdata0) 434 | simdata = simdata0; 435 | elseif isempty(designMat) % Pass only trial indices 436 | simdata = fun(params,T(:),varargin{:}); 437 | fc = fc + 1; 438 | else % Pass full design matrix per trial 439 | simdata = fun(params,designMat(T(:),:),varargin{:}); 440 | fc = fc + 1; 441 | end 442 | 443 | % Check that the returned simulated data have the right size 444 | if size(simdata,1) ~= numel(T) 445 | error('ibslike:SizeMismatch', ... 
446 | 'Number of rows of returned simulated data does not match the number of requested trials.'); 447 | end 448 | 449 | Ns = Ns + numel(T); % Count samples 450 | hits_new = all(respMat(T(:),:) == simdata,2); 451 | hits(T) = hits(T) + hits_new; 452 | 453 | K(T(hits_new),iRep) = offset; 454 | offset = offset + 1; 455 | 456 | % Terminate if negative log likelihood is above a given threshold 457 | if isfinite(options.NegLogLikeThreshold) 458 | K(hits < 1,iRep) = offset; 459 | [LL_mat,Psi_tab] = get_LL_from_K(Psi_tab,K(:,iRep)); 460 | nlogL_sum = -sum(LL_mat,1); 461 | if nlogL_sum > options.NegLogLikeThreshold 462 | T = []; 463 | exitflag = 1; 464 | break; 465 | end 466 | end 467 | 468 | % Terminate if above maximum allowed runtime 469 | if isfinite(options.MaxTime) && toc(t0) > options.MaxTime 470 | T = []; 471 | exitflag = 2; 472 | break; 473 | end 474 | 475 | end 476 | 477 | if ~isempty(T) 478 | error('ibslike:ConvergenceFail', ... 479 | 'Maximum number of iterations reached and algorithm did not converge. Check FUN and DATA.'); 480 | end 481 | end 482 | 483 | Nreps = sum(K > 0,2); 484 | [LL_mat,Psi_tab] = get_LL_from_K(Psi_tab,K); 485 | nlogL = sum(-LL_mat,2)./Nreps; 486 | 487 | end 488 | 489 | 490 | %-------------------------------------------------------------------------- 491 | function [LL_mat,Psi_tab] = get_LL_from_K(Psi_tab,K_mat) 492 | %GET_LL_FROM_K Convert matrix of K values into log-likelihoods. 
493 | 494 | K_max = max(1,max(K_mat(:))); 495 | if K_max > numel(Psi_tab) % Fill digamma function table 496 | Psi_tab = [Psi_tab; (psi(1) - psi(numel(Psi_tab)+1:K_max)')]; 497 | end 498 | LL_mat = Psi_tab(max(1,K_mat)); 499 | 500 | end 501 | 502 | %-------------------------------------------------------------------------- 503 | function exitflag = runtest1(options) 504 | 505 | Nreps = 1e3; 506 | RMSE_tol = 2/sqrt(Nreps); 507 | 508 | % Binomial probability model 509 | p_model = exp(linspace(log(1e-3),log(1),10)); 510 | fun = @(x,dmat) rand(size(dmat)) < x; % Simulating function 511 | rmat = fun(1,NaN); 512 | 513 | fprintf('\n'); 514 | fprintf('TEST 1: Using IBS to compute log(p) of Bernoulli distributions with %d repeats.\n',Nreps); 515 | fprintf('We consider p = %s.\n',mat2str(p_model,3)); 516 | 517 | options.Nreps = Nreps; 518 | options.NegLogLikeThreshold = Inf; 519 | 520 | nlogL = zeros(1,numel(p_model)); 521 | nlogLvar = zeros(1,numel(p_model)); 522 | for iter = 1:numel(p_model) 523 | [nlogL(iter),nlogLvar(iter)] = ibslike(fun,p_model(iter),rmat,[],options); 524 | end 525 | 526 | % We expect the true value to be almost certainly (> 99.99%) in this range 527 | LL_min = (-nlogL - 4*sqrt(nlogLvar)); 528 | LL_max = (-nlogL + 4*sqrt(nlogLvar)); 529 | 530 | exitflag = any(log(p_model) < LL_min) | any(log(p_model) > LL_max); 531 | 532 | rmse = sqrt(mean((-nlogL - log(p_model)).^2)); 533 | fprintf('Average RMSE of log(p) estimates across p: %.4f.\n',rmse); 534 | 535 | exitflag = exitflag | (rmse > RMSE_tol); 536 | 537 | if exitflag 538 | fprintf('Test FAILED. Something might be wrong.\n'); 539 | else 540 | fprintf('Test PASSED. 
IBS estimates are calibrated and close to ground truth.\n'); 541 | end 542 | 543 | % Plot figure 544 | xx = log(p_model); 545 | h(1) = plot(xx,xx,'k-','LineWidth',2); hold on; 546 | 547 | yy = -nlogL; 548 | xxerr = [xx, fliplr(xx)]; 549 | yyerr_down = yy - 1.96*sqrt(nlogLvar); 550 | yyerr_up = yy + 1.96*sqrt(nlogLvar); 551 | yyerr = [yyerr_down, fliplr(yyerr_up)]; 552 | fill(xxerr, yyerr,'b','FaceAlpha',0.5,'LineStyle','none'); hold on; 553 | h(2) = plot(xx,yy,'b-','LineWidth',2); hold on; 554 | 555 | box off; 556 | set(gca,'TickDir','out'); 557 | set(gcf,'Color','w'); 558 | xlabel('True log(p)'); 559 | ylabel('Estimated log(p)') 560 | %xlim([-5 5]); 561 | hl = legend(h,'True log(p)','IBS estimate (95% CI)'); 562 | set(hl,'Location','NorthWest','Box','off'); 563 | title('IBS estimation test'); 564 | 565 | end 566 | 567 | 568 | %-------------------------------------------------------------------------- 569 | function exitflag = runtest2(options) 570 | 571 | % Binomial probability model 572 | Ntrials = 100; 573 | p_true = 0.9*rand() + 0.05; % True probability 574 | p_model = 0.9*rand() + 0.05; % Model probability 575 | fun = @(x,dmat) rand(size(dmat)) < x; % Simulating function 576 | Nexps = 2e3; 577 | 578 | options.NegLogLikeThreshold = Inf; 579 | 580 | fprintf('\n'); 581 | fprintf('TEST 2: Using IBS to compute the log-likelihood of a binomial distribution.\n'); 582 | fprintf('Parameters: p_true=%.2g, p_model=%.2g, %d trials per experiment.\n',p_true,p_model,Ntrials); 583 | fprintf('The distribution of z-scores should approximate a standard normal distribution (mean 0, SD 1).\n'); 584 | 585 | zscores = zeros(1,Nexps); 586 | for iter = 1:Nexps 587 | rmat = fun(p_true,NaN(Ntrials,1)); % Generate data 588 | [nlogL,nlogLvar] = ibslike(fun,p_model,rmat,[],options); 589 | nlogL_exact = -log(p_model)*sum(rmat == 1) - log(1-p_model)*sum(rmat == 0); 590 | zscores(iter) = (nlogL_exact - nlogL)/sqrt(nlogLvar); 591 | end 592 | 593 | edges = -4.75:0.5:4.75; 594 | nz = 
histc(zscores,edges); 595 | h(1) = bar(edges,nz,'histc'); 596 | hold on; 597 | xx = linspace(-5,5,1e4); 598 | h(2) = plot(xx,Nexps*exp(-xx.^2/2)/sqrt(2*pi)/2,'k-','LineWidth',2); 599 | 600 | box off; 601 | set(gca,'TickDir','out'); 602 | set(gcf,'Color','w'); 603 | xlabel('z-score'); 604 | ylabel('pdf') 605 | xlim([-5 5]); 606 | hl = legend(h,'z-scores histogram','expected pdf'); 607 | set(hl,'Location','NorthEast','Box','off'); 608 | title('Calibration test'); 609 | 610 | exitflag = abs(mean(zscores)) > 0.15 || abs(std(zscores) - 1) > 0.1; 611 | 612 | fprintf('Distribution of z-scores (%d experiments). Mean: %.4g. Standard deviation: %.4g.\n',Nexps,mean(zscores),std(zscores)); 613 | if exitflag 614 | fprintf('Test FAILED. Something might be wrong.\n'); 615 | else 616 | fprintf('Test PASSED. We verified that IBS is unbiased (~zero mean) and calibrated (SD ~1).\n'); 617 | end 618 | 619 | end 620 | 621 | %-------------------------------------------------------------------------- 622 | function exitflag = runtest3(options) 623 | 624 | Nreps = 100; 625 | RMSE_tol = 4/sqrt(Nreps); 626 | 627 | % Binomial probability model 628 | p_model = exp(linspace(log(1e-3),log(0.1),10)); 629 | fun = @(x,dmat) rand(size(dmat)) < x; % Simulating function 630 | rmat = fun(1,NaN); 631 | thresh = -log(0.01); 632 | p_target = max(p_model,exp(-thresh)); 633 | 634 | fprintf('\n'); 635 | fprintf('TEST 3: Log-likelihood thresholding at log(p) = %.3f.\n',-thresh); 636 | fprintf('Using IBS to compute thresholded log(p) of Bernoulli distributions with %d repeats.\n',Nreps); 637 | fprintf('We consider p = %s.\n',mat2str(p_model,3)); 638 | 639 | options.Nreps = Nreps; 640 | options.NegLogLikeThreshold = thresh; 641 | options.Acceleration = 1; 642 | 643 | nlogL = zeros(1,numel(p_model)); 644 | nlogLvar = zeros(1,numel(p_model)); 645 | for iter = 1:numel(p_model) 646 | [nlogL(iter),nlogLvar(iter)] = ibslike(fun,p_model(iter),rmat,[],options); 647 | end 648 | 649 | % We expect the true value to be 
almost certainly (> 99.99%) in this range 650 | LL_min = (-nlogL - 4*sqrt(nlogLvar)); 651 | LL_max = (-nlogL + 4*sqrt(nlogLvar)); 652 | 653 | % We expect the estimates to be (almost) correct away from the threshold 654 | idx = log(p_model) > -thresh*0.75; 655 | exitflag = any(log(p_model(idx)) < LL_min(idx)) | any(log(p_model(idx)) > LL_max(idx)); 656 | 657 | % We expect the estimates to be above the true value below the threshold 658 | LL_thresh = (-nlogL - sqrt(nlogLvar)); 659 | idx_below = log(p_model) < -thresh; 660 | exitflag = exitflag | any(log(p_model(idx_below)) > LL_thresh(idx_below)); 661 | % exitflag = any(log(p_target) < LL_min) | any(log(p_target) > LL_max); 662 | 663 | rmse = sqrt(mean((-nlogL(idx) - log(p_target(idx))).^2)); 664 | fprintf('Average RMSE of log(p) estimates across p: %.4f.\n',rmse); 665 | 666 | exitflag = exitflag | (rmse > RMSE_tol); 667 | 668 | if exitflag 669 | fprintf('Test FAILED. Something might be wrong.\n'); 670 | else 671 | fprintf('Test PASSED. IBS estimates are calibrated and close to (thresholded) ground truth.\n'); 672 | end 673 | 674 | % Plot figure 675 | xx = log(p_model); 676 | h(1) = plot(xx,xx,'k-','LineWidth',2); hold on; 677 | 678 | yy = -nlogL; 679 | xxerr = [xx, fliplr(xx)]; 680 | yyerr_down = yy - 1.96*sqrt(nlogLvar); 681 | yyerr_up = yy + 1.96*sqrt(nlogLvar); 682 | yyerr = [yyerr_down, fliplr(yyerr_up)]; 683 | fill(xxerr, yyerr,'b','FaceAlpha',0.5,'LineStyle','none'); hold on; 684 | 685 | h(2) = plot(xx,yy,'b-','LineWidth',2); hold on; 686 | h(3) = plot([xx(1),xx(end)],-thresh*[1 1],'k:','LineWidth',2); 687 | 688 | box off; 689 | set(gca,'TickDir','out'); 690 | set(gcf,'Color','w'); 691 | xlabel('True log(p)'); 692 | ylabel('Estimated log(p) with thresholding') 693 | %xlim([-5 5]); 694 | hl = legend(h,'True log(p)','IBS estimate (95% CI)','Threshold'); 695 | set(hl,'Location','NorthWest','Box','off'); 696 | title('Thresholded IBS test'); 697 | 698 | end 699 | 700 | 701 | % TODO: 702 | % - Fix help and 
documentation 703 | % - Optimal allocation of estimates? 704 | -------------------------------------------------------------------------------- /private/ibs_test.m: -------------------------------------------------------------------------------- 1 | function ibs_test 2 | 3 | responses = ones(50, 1); 4 | 5 | for nReps = [1, 2, 3, 10, 50] 6 | opt_ibs = ibslike; 7 | opt_ibs.Vectorized = true; 8 | opt_ibs.Nreps = nReps; 9 | 10 | value = ibslike(@(pars, data)randomResponses(pars, data), ... 11 | 0, responses, [], opt_ibs); 12 | disp(['Estimated nLogL with ' num2str(nReps) ' Nreps: ' num2str(value)]) 13 | end 14 | 15 | end 16 | 17 | 18 | function resp = randomResponses(~, data) 19 | resp = randi(2, size(data, 1), 1); 20 | end 21 | 22 | -------------------------------------------------------------------------------- /psycho_gen.m: -------------------------------------------------------------------------------- 1 | function R=psycho_gen(theta,S) 2 | %PSYCHO_GEN Generate responses for psychometric function model. 3 | % R=PSYCHO_GEN(THETA,S) generates responses in a simple orientation 4 | % discrimination task, where S is a vector of stimulus orientations (in 5 | % deg) for each trial, and THETA is a model parameter vector, with 6 | % THETA(1) as eta=log(sigma), the log of the sensory noise; THETA(2) the 7 | % bias term; THETA(3) is the lapse rate. The returned vector of responses 8 | % per trial reports 1 for "rightwards" and -1 for "leftwards". 9 | 10 | % See Section 5.2 of the manuscript for more details on the model. 11 | 12 | % Note that this model is very simple and used only for didactic purposes; 13 | % one should use the analytical log-likelihood whenever available.
14 | 15 | % Luigi Acerbi, 2020 16 | 17 | sigma = exp(theta(1)); 18 | bias = theta(2); 19 | lapse = theta(3); 20 | 21 | %% Noisy measurement 22 | 23 | % Add Gaussian noise to true orientations S to simulate noisy measurements 24 | X = S + sigma*randn(size(S)); 25 | 26 | %% Decision rule 27 | 28 | % The response is 1 for "rightwards" if the internal measurement is larger 29 | % than the BIAS term; -1 for "leftwards" otherwise 30 | R = zeros(size(S)); 31 | R(X >= bias) = 1; 32 | R(X < bias) = -1; 33 | 34 | %% Lapses 35 | 36 | % Choose trials in which the subject lapses; the response there is given at chance 37 | lapse_idx = rand(size(S)) < lapse; 38 | 39 | % Random responses (equal probability of 1 or -1) 40 | lapse_val = randi(2,[sum(lapse_idx),1])*2-3; 41 | R(lapse_idx) = lapse_val; 42 | 43 | end -------------------------------------------------------------------------------- /psycho_nll.m: -------------------------------------------------------------------------------- 1 | function L=psycho_nll(theta,S,R) 2 | %PSYCHO_NLL Negative log-likelihood for psychometric function model. 3 | % L=PSYCHO_NLL(THETA,S,R) returns the negative log-likelihood for a simple 4 | % orientation discrimination task, where THETA is a model parameter vector, 5 | % with THETA(1) as eta=log(sigma), the log of the sensory noise; THETA(2) 6 | % the bias term; THETA(3) is the lapse rate; S is a vector of stimulus 7 | % orientations (in deg) for each trial; and R the vector of responses 8 | % per trial (1 for "rightwards" and -1 for "leftwards").
9 | 10 | % Luigi Acerbi, 2020 11 | 12 | sigma = exp(theta(1)); 13 | bias = theta(2); 14 | lapse = theta(3); 15 | 16 | % Likelihood per trial (analytical solution) 17 | p_vec = lapse/2+(1-lapse)*((R==-1).*normcdf(-(S-bias)/sigma)+(R==1).*normcdf((S-bias)/sigma)); 18 | 19 | % Total negative log-likelihood 20 | L = -sum(log(p_vec)); 21 | 22 | end 23 | 24 | 25 | function z = normcdf(x) 26 | %NORMCDF Normal cumulative distribution function (cdf) 27 | z = 0.5 * erfc(-x ./ sqrt(2)); 28 | end --------------------------------------------------------------------------------
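The digamma bookkeeping in `get_LL_from_K` rests on the identity psi(K) - psi(1) = sum_{j=1}^{K-1} 1/j: each IBS repeat's contribution to the negative log-likelihood is a harmonic sum over the K samples needed to hit the observed response, and its variance is the trigamma difference psi'(1) - psi'(K) = sum_{j=1}^{K-1} 1/j^2 (the `Ktab` table used for `nlogLvar`). A minimal Python sketch of this core estimator, analogous to `ibs_basic.m` (function names are mine, not the repository's API):

```python
import random

def ibs_nll_trial(simulate, response, max_iter=10**6):
    """One-repeat IBS estimate for a single trial: draw from the simulator
    until the observed response is matched. With K draws, the unbiased
    estimate of -log p is the harmonic sum H_{K-1} = sum_{j<K} 1/j
    (= psi(K) - psi(1)); its variance is sum_{j<K} 1/j^2 (trigamma)."""
    K = 1
    while simulate() != response:
        K += 1
        if K > max_iter:
            raise RuntimeError("IBS did not hit; check the simulator.")
    nll = sum(1.0 / j for j in range(1, K))       # harmonic number H_{K-1}
    var = sum(1.0 / j ** 2 for j in range(1, K))  # psi'(1) - psi'(K)
    return nll, var

# Bernoulli example: E[nll] = -log(p) for any success probability p
random.seed(0)
p = 0.2
nll, var = ibs_nll_trial(lambda: random.random() < p, True)
```

Averaging such estimates over repeats (the `Nreps` option) divides the variance by the number of repeats, which is why the code above computes `nlogLvar` as the summed per-repeat variances divided by `Nreps.^2`.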
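`runtest2` verifies that IBS is unbiased and that `nlogLvar` is calibrated by z-scoring the estimates against the exact binomial log-likelihood: the z-scores should be roughly standard normal. A rough Python transcription of that check (constants reduced for speed; all names and values here are mine, not taken from the MATLAB code):

```python
import math, random

def ibs_estimate(p_model, responses, rng):
    """IBS estimate (one repeat per trial) of the negative log-likelihood
    of a Bernoulli model, plus the variance of the estimate."""
    nll, var = 0.0, 0.0
    for r in responses:
        K = 1
        while (rng.random() < p_model) != r:  # sample until hit
            K += 1
        nll += sum(1.0 / j for j in range(1, K))
        var += sum(1.0 / j ** 2 for j in range(1, K))
    return nll, var

rng = random.Random(1)
p_true, p_model, n_trials, n_exp = 0.6, 0.5, 50, 500
zscores = []
for _ in range(n_exp):
    resp = [rng.random() < p_true for _ in range(n_trials)]  # generate data
    nll, var = ibs_estimate(p_model, resp, rng)
    n1 = sum(resp)
    nll_exact = -n1 * math.log(p_model) - (n_trials - n1) * math.log(1 - p_model)
    zscores.append((nll_exact - nll) / math.sqrt(var))
mean_z = sum(zscores) / n_exp
```

If the estimator is unbiased and the variance estimate calibrated, `mean_z` should be near 0 and the z-scores should have standard deviation near 1, mirroring the pass criteria in `runtest2`.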
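For readers following along outside MATLAB, here is a hedged Python sketch of the `psycho_gen`/`psycho_nll` pair above (same model; the standard-normal cdf is written via `erfc`, as in the MATLAB helper; these are illustrative functions, not part of the repository):

```python
import math, random

def psycho_gen(theta, S, rng=random):
    """Simulate responses (+1 rightwards, -1 leftwards) for the
    orientation discrimination task."""
    sigma, bias, lapse = math.exp(theta[0]), theta[1], theta[2]
    R = []
    for s in S:
        if rng.random() < lapse:                 # lapse trial: respond at chance
            R.append(rng.choice((-1, 1)))
        else:
            x = s + sigma * rng.gauss(0.0, 1.0)  # noisy measurement
            R.append(1 if x >= bias else -1)
    return R

def psycho_nll(theta, S, R):
    """Analytical negative log-likelihood of the same model."""
    sigma, bias, lapse = math.exp(theta[0]), theta[1], theta[2]
    nll = 0.0
    for s, r in zip(S, R):
        # Phi((s - bias)/sigma) via the complementary error function
        p_right = 0.5 * math.erfc(-(s - bias) / (sigma * math.sqrt(2)))
        p = lapse / 2 + (1 - lapse) * (p_right if r == 1 else 1 - p_right)
        nll -= math.log(p)
    return nll
```

Feeding `psycho_gen` to an IBS estimator and comparing against `psycho_nll` reproduces the didactic point of these files: the simulator-based estimate should agree, within its reported variance, with the analytical value.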