├── .gitignore
├── README.txt
├── argmax.m
├── ars.m
├── drawGmm.m
├── drawMultinom.m
├── drawSpiral.m
├── gibbsGmm.m
├── igmm_mv.m
├── igmm_uv.m
├── logalphapdf.m
├── logbetapdf.m
├── logmvbetapdf.m
├── lognormalpdf.m
├── plotAutoCov.m
├── plotGmm.m
├── renumber.m
└── wishrnd.m
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
*~
--------------------------------------------------------------------------------
/README.txt:
--------------------------------------------------------------------------------
Michael Mandel
CS 4771 Final Project
The Infinite Gaussian Mixture Model
Prof. Tony Jebara
May 5, 2005

For my final project in Tony Jebara's Machine Learning course, cs4771,
I implemented Carl Rasmussen's Infinite Gaussian Mixture Model. I got
it working for both univariate and multivariate data. I'd like to see
what it does when presented with MFCC frames from music and
audio. There were some tricky parts of implementing it, which I wrote
up in a short paper describing my implementation. Since I've gotten
the multivariate case working, I'll trust you to ignore all statements
to the contrary in the paper. The IGMM requires Adaptive Rejection
Sampling to sample the posteriors of some of its parameters, so I
implemented that as well. Thanks to Siddharth Gopal for a bugfix.

See also:

The paper I wrote about implementing it:
http://mr-pc.org/work/cs4771igmm.pdf

Jacob Eisenstein's Dirichlet process mixture model, which adds some
cool features to the infinite GMM:
http://people.csail.mit.edu/jacobe/software.html

=====================================================

To generate the test data used in the paper, just make this call in
matlab:
[Y,z] = drawGmm([-3 3], [1 10], [1 2], 500);

To run the infinite GMM on the data for 10000 iterations, make this
call:
Samp = igmm_uv(Y, 10000);

It's as easy as that.

If you want to run the regular univariate Gibbs sampler on the data,
do this:
[mu,sigSq,p,z,churn] = gibbsGmm(Y,2,0,100,2,1,2,1000);

The IGMM for multivariate data is in igmm_mv.m, which uses
logmvbetapdf.m instead of the logbetapdf.m used by igmm_uv.m.
Otherwise, both igmms are self-contained.

To generate multivariate data, use e.g.
S = [2 1; 1 2]; S(:,:,2) = S;
[Y,z] = drawGmm([3 -3; -3 3], S, [1 1], 100);
Samp = igmm_mv(Y, 10000);

To generate figure 3 in the paper, use the function plotAutoCov.m


=====================================================
COPYRIGHT / LICENSE
=====================================================
All code was written by Michael Mandel, and is copyrighted under the
(lesser) GPL:
Copyright (C) 2005 Michael Mandel

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation; version 2.1 or later.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

The authors may be contacted via email at: mim at ee columbia edu
--------------------------------------------------------------------------------
/argmax.m:
--------------------------------------------------------------------------------
function am = argmax(X, dim)

% am = argmax(X, dim)
%
% Find the index of the maximum value of the matrix X. If dim is
% supplied, find the maximum index along the dimension dim.

% Copyright (C) 2005 Michael Mandel, mim at ee columbia edu;
% distributable under GPL

if(nargin < 2)
  [dummy, am] = max(X);
else
  [dummy, am] = max(X, [], dim);
end
--------------------------------------------------------------------------------
/ars.m:
--------------------------------------------------------------------------------
function samples = ars(logpdf, pdfargs, N, xi, support)

% Perform adaptive rejection sampling as described in Gilks & Wild
% (1992) and Wild & Gilks (1993). The PDF must be log-concave. Draw N
% samples from the pdf passed in as a function handle to its log. The
% log could be offset by an additive constant, corresponding to an
% unnormalized distribution.
%
% The pdf function should have prototype [h, hprime] = logpdf(x,
% pdfargs{:}), where x could be a vector of points, h is the value of
% the log pdf and hprime is its derivative.
%
% The xi argument to this function is a vector of points at which to
% initially evaluate the pdf, which must lie on either side of the
% distribution's mode. The support argument is a 2-vector specifying
% the support of the pdf, defaulting to [-inf inf].
%
% This function does not use the lower squeezing bound because it
% is optimized for generating a small number of samples each call.

% Copyright (C) 2005 Michael Mandel, mim at ee columbia edu;
% distributable under GPL, see README.txt

samples = [];

% Don't need to approximate the curve too well, all the sorting and
% whatnot gets expensive
Nxmax = 50;

if(nargin < 5) support = [-inf inf]; end

x = sort(xi);
[h, hprime] = feval(logpdf, x, pdfargs{:});
if(~isfinite(h(1)))
  % Dump the offending inputs before dying
  x
  h
  hprime
  logpdf
  pdfargs{:}
  size(pdfargs)
  det(pdfargs{2})
  error('h not finite');
end

if(support(1) == 0)
  % Cheat! Get closer and closer to 0 as needed
  while(hprime(1) < 0)
    xt = x(1)/2;
    [ht,hpt] = feval(logpdf, xt, pdfargs{:});
    [x,z,h,hprime,hu,sc,cu] = insert(x,xt,h,ht,hprime,hpt,support);
  end

  while(hprime(end) > 0)
    xt = x(end)*2;
    [ht,hpt] = feval(logpdf, xt, pdfargs{:});
    [x,z,h,hprime,hu,sc,cu] = insert(x,xt,h,ht,hprime,hpt,support);
  end
end

if(hprime(1) < 0 || hprime(end) > 0)
  % If the lower bound isn't 0, can't help it (for now)
  error(['Starting points ' num2str(x) ' do not enclose the' ...
         ' mode']);
end


% Avoid under/overflow errors. The envelope and pdf are only
% proportional to the true pdf, so we can choose any constant
% of proportionality.
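% A note on the shift below: since the x's straddle the mode, max(h)
% is within a constant of the peak of the log pdf, so after the shift
% the upper hull tops out near zero and the exp(hu) terms and segment
% CDF sc stay in a numerically safe range.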
offset = max(h);
h = h-offset;

[x,z,h,hprime,hu,sc,cu] = insert(x,[], h,[], hprime,[], support);

Nsamp = 0;
while Nsamp < N
  % Draw 2 random numbers in [0,1]
  u = rand(1,2);

  % Find the largest z such that sc(z) < u
  idx = find(sc/cu < u(1));
  idx = idx(end);

  % Figure out the x in that segment that u corresponds to
  xt = x(idx) + (-h(idx) + log(hprime(idx)*(cu*u(1) - sc(idx)) + ...
                                exp(hu(idx)))) / hprime(idx);
  [ht,hpt] = feval(logpdf, xt, pdfargs{:});
  ht = ht-offset;

  % Figure out what h_u(xt) is in a dumb way, using the assumption
  % that the log pdf is concave
  hut = min(hprime.*(xt - x) + h);

  % Decide whether to keep the sample
  if(u(2) < exp(ht - hut))
    Nsamp = Nsamp+1;
    samples(Nsamp) = xt;
  else
% $$$ fprintf('.');
  end

  % Update vectors if necessary
  if(length(x) < Nxmax)
    try
      [x,z,h,hprime,hu,sc,cu] = insert(x,xt,h,ht,hprime,hpt,support);
    catch ex
    end
  end
end



function [x, z, h, hprime, hu, sc, cu] = ...
    insert(x, xnew, h, hnew, hprime, hprimenew, support)
% Insert xnew into x and update all other vectors to reflect the
% new point's addition.
%
% From Wild & Gilks (1993)
% x: points where function has been evaluated
% z: points between x's where adjacent tangents intersect each other
% h: values of log pdf at x's
% hprime: slope of log pdf at x's
% hu: values of the piecewise linear upper hull at z's
% sc: CDF of piecewise linear upper hull
% cu: normalization constant for sc

[x,order] = sort([x xnew]);
h = [h hnew]; h = h(order);
hprime = [hprime hprimenew]; hprime = hprime(order);

z = [support(1) x(1:end-1)+(-diff(h)+hprime(2:end).*diff(x)) ./ ...
    diff(hprime) support(end)];
hu = [hprime(1) hprime] .* (z - [x(1) x]) + [h(1) h];

% $$$ plot(z, hu);

sc = [0 cumsum(diff(exp(hu)) ./ hprime)];
cu = sc(end);
assert(all(sc >= 0))
--------------------------------------------------------------------------------
/drawGmm.m:
--------------------------------------------------------------------------------
function [Y,z] = drawGmm(mu, sigSq, p, N)

% Draw N samples from a mixture of Gaussians. In the multivariate
% case, mu is a matrix where each row is the mean of one Gaussian.
% SigSq is a 3D matrix such that sigSq(:,:,i) is the covariance of the
% ith Gaussian. In the univariate case, mu(i) is the mean and
% sigSq(i) is the variance of the ith Gaussian.
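%
% Example (the univariate test data used in the paper, per README.txt):
%   [Y,z] = drawGmm([-3 3], [1 10], [1 2], 500);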

% Copyright (C) 2005 Michael Mandel, mim at ee columbia edu;
% distributable under GPL, see README.txt

% Y ~ sum[ p_i * N(mu_i, sigma_i) ]

[tD,D] = size(mu);
if(tD == 1) D=1; end

z = drawMultinom(repmat(p(:), 1, N));

if(D == 1)
  Y = randn(1,N).*sqrt(sigSq(z)) + mu(z);
else
  for i=1:length(p)
    inClass = find(z == i);
    n = numel(inClass);
    [u,s,v] = svd(sigSq(:,:,i));
    sig = sqrt(s)*v';
    Y(inClass,:) = randn(n,D) * sig + repmat(mu(i,:), n, 1);
  end
end
--------------------------------------------------------------------------------
/drawMultinom.m:
--------------------------------------------------------------------------------
function x = drawMultinom(p)

% Draw size(p,2) samples from a multinomial distribution where the
% elements [1..size(p,1)] have probabilities p. There should be a way
% to do it without the repmats...

% Copyright (C) 2005 Michael Mandel, mim at ee columbia edu;
% distributable under GPL, see README.txt


p = cumsum(p);
pmax = max(max(p))+1;
u = repmat(rand(1,size(p,2)).*p(end,:), size(p,1), 1);
m = (u < p) .* (pmax-p);
x = argmax(m);
--------------------------------------------------------------------------------
/drawSpiral.m:
--------------------------------------------------------------------------------
function Y = drawSpiral(N, std)

% Y = drawSpiral(N, std)
%
% Draw N points from a noisy 3D spiral similar to the one used in
% Rasmussen's paper, which was taken from Ueda et al (1998). Std is
% the standard deviation of the noise around the spiral. The default
% parameters should give something like Ueda's spiral.

% Copyright (C) 2005 Michael Mandel, mim at ee columbia edu;
% distributable under GPL, see README.txt


if(nargin < 2) std = .05; end
if(nargin < 1) N = 800; end

t = rand(1,N)*4*pi + 2*pi;
Y = [10*cos(t)./t; -10*sin(t)./t; t/(4*pi)]';
Y = Y + randn(size(Y))*std;
--------------------------------------------------------------------------------
/gibbsGmm.m:
--------------------------------------------------------------------------------
function [mu, sigmaSq, p, z, churn] = ...
    gibbsGmm(Y, k, m, etaSq, nu0, nu0lambda0, alpha, Nsamp)

% Use Markov chain Monte Carlo simulation to cluster the data Y into a
% mixture of k univariate Gaussians. Priors on the variables are:
% mu ~ N(m, etaSq), sigmaSq ~ scaled inverse chi-square(nu0, lambda0),
% pi ~ Dirichlet(alpha/k). Outputs of the function are samples from
% the posterior distributions, so that
% theta(i) = [mu(i,:) sigmaSq(i,:) z], i = 1..Nsamp

N = length(Y);

% Randomly assign to classes, initialize stats to the means and
% vars of those classes.
z = drawMultinom(ones(k,N));
for j=1:k
  yj = Y(find(z == j));
  mu(1,j) = yj(unidrnd(numel(yj)));
  sigmaSq(1,j) = std(yj).^2;
end
p(1,:) = full(sparse(1, z, 1, 1, k));

% Go!
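% Each sweep below samples from the full conditionals in turn:
% mu_j given sigmaSq_j and z (conjugate normal update), sigmaSq_j
% given mu_j and z (scaled inverse chi-square), and the labels z
% given all the component parameters (multinomial).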
for i=2:Nsamp
  % Mu
  for j=1:k
    n = sum(z == j);
    if(n <= 0) ybar = 0;
    else ybar = mean(Y(find(z == j)));
    end

    tmp_sigSq = 1/(n/sigmaSq(i-1,j) + 1/etaSq);
    tmp_mu = tmp_sigSq*(n*ybar/sigmaSq(i-1,j) + m/etaSq);
    mu(i,j) = drawNormal(tmp_mu, tmp_sigSq);
  end

  % Sigma
  for j=1:k
    inClass = z == j;
    n = sum(inClass);
    if(n <= 0) sigbar = 0;
    else sigbar = sum((Y(find(inClass)) - mu(i-1,j)).^2);
    end

    tmp_nu = nu0+n;
    tmp_nu_lambda = (nu0lambda0 + sigbar);
    sigmaSq(i,j) = drawInvChiSq(tmp_nu, tmp_nu_lambda);
  end

  % z \in {1..k}
  for j=1:k
    tmp_pr(j,:) = normalLike(Y, mu(i-1,j), sigmaSq(i-1,j));
  end
  n = tabulate(z);
  n = n(:,2)';
% $$$ n = full(sparse(1, z, 1, 1, k));

  % Scale likelihoods by class memberships times prior
  pri = repmat((n'+alpha/k)/(sum(n)-1+alpha), 1, N);
  idxs = sub2ind(size(pri), z, [1:N]);
  pri(idxs) = pri(idxs) - 1/(sum(n)-1+alpha);

  tz = drawMultinom(pri .* tmp_pr);
  churn(i) = sum(tz ~= z);
  z = tz;
  p(i,:) = n;

% $$$ plotGmm(mu(i,1), mu(i,2), sigmaSq(i,1), sigmaSq(i,2), p(i));
% $$$ pause(.1)
end


function x = drawNormal(mu, sigSq)
% Draw one sample from a Gaussian with mean mu and variance sigSq
x = randn(1)*sqrt(sigSq) + mu;


function pr = normalLike(y, mu, sigSq)
% Evaluate the likelihood of the points y under the Gaussian with mean
% mu and variance sigSq
pr = 1/sqrt(2*pi*sigSq) .* exp(-(y-mu).^2/(2*sigSq));


function x = drawInvChiSq(nu, nu_lambda)
% Draw one sample from a scaled inverse chi-square distribution with
% parameters nu and nu*lambda
x = nu_lambda / chi2rnd(nu);


function x = drawBeta(a, b)
% Draw one sample from a Beta distribution with parameters a and b
x = betarnd(a,b);


function x = drawBernoulli(p)
% Draw Bernoulli random variables with probability p of getting 1.
% x is the same size as p.
x = rand(size(p)) < p;
--------------------------------------------------------------------------------
/igmm_mv.m:
--------------------------------------------------------------------------------
function Samp = igmm_mv(Y, Nsamp, init, debg)

% Samp = igmm_mv(Y, Nsamp, init)
%
% Implementation of Rasmussen's infinite GMM for multivariate data.
% Use Gibbs sampling to draw Nsamp samples from the posterior
% distribution of an infinite GMM given some data Y, where each row is
% a data point. Based on a hierarchical graphical model where
% hyperparameters are also inferred, so there are no parameters to
% tweak by hand. Samp is a vector of structures, where each element
% has fields mu, s, lambda, r, beta, w, alpha, and k as described in
% the paper. Start the Gibbs sampler with the sample INIT, if
% supplied.
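%
% Example (from README.txt):
%   S = [2 1; 1 2]; S(:,:,2) = S;
%   [Y,z] = drawGmm([3 -3; -3 3], S, [1 1], 100);
%   Samp = igmm_mv(Y, 10000);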

% Copyright (C) 2005 Michael Mandel, mim at ee columbia edu;
% distributable under GPL, see README.txt


if(nargin < 4) debg = 0; end

[N,D] = size(Y);
mu_y = mean(Y);
sigSq_y = cov(Y);
sigSqi_y = inv(sigSq_y);
arsStart = [.2 2];

if(nargin < 3 || isempty(init))
  % Start off with one class
  c = ones(1,N);
  Samp(1).k = 1;
  Samp(1).mu = mu_y;
  Samp(1).s = sigSqi_y;
  Samp(1).lambda = drawNormal(mu_y, sigSq_y);
  Samp(1).r = drawWishart(D, sigSqi_y);
  Samp(1).beta = 1/drawGamma(1,1/D) + D-1;
  Samp(1).w = drawWishart(D, sigSq_y);
  Samp(1).alpha = 1/drawGamma(1,1);
  Samp(1).pi = 1;
  Samp(1).Ic = 1;
else
  % Initialize with the supplied sample
  Samp(1) = init;

  % Since c is not included in the initialization, sample it
  c = drawMultinom(normalLikeSame(Y, init.mu, init.s));
end

Ic = 1;

% Go!
for i=2:Nsamp
  prt(debg, 1, '########### ', i);

  % Make aliases for more readable code
  k = Samp(i-1).k; mu = Samp(i-1).mu; s = Samp(i-1).s;
  beta = Samp(i-1).beta; r = Samp(i-1).r; w = Samp(i-1).w;
  lambda = Samp(i-1).lambda; alpha = Samp(i-1).alpha;

  % Find the popularity of the classes
  nij = tabulate(c); nij = nij(:,2);
  prt(debg, 2, 'nij = ', nij')

  %%%%% Mu
  for j=1:k
    inClass = find(c == j);
    n = numel(inClass);
    if(n <= 0) ybar = 0;
    else ybar = mean(Y(inClass,:),1);
    end

    tmp_sigSq = inv(n*s(:,:,j) + r);
    tmp_mu = (n*ybar*s(:,:,j) + lambda*r)*tmp_sigSq;
    Samp(i).mu(j,:) = drawNormal(tmp_mu, tmp_sigSq);
  end
  prt(debg, 3, 'mu = ', Samp(i).mu);


  %%%%% Lambda
  tmp_sigSq = inv(sigSqi_y + k*r);
  tmp_mu = (mu_y*sigSqi_y + sum(mu,1)*r) * tmp_sigSq;
  Samp(i).lambda = drawNormal(tmp_mu, tmp_sigSq);
  prt(debg, 3, 'lambda = ', Samp(i).lambda);


  %%%%%% R
  mMinL = mu - repmat(lambda, k, 1);
  Samp(i).r = drawWishart(k+1, 1/(k+1)*inv(sigSq_y + mMinL'*mMinL));
  prt(debg, 3, 'r = ', Samp(i).r);


  %%%%%% S
  for j=1:k
    inClass = find(c == j);
    n = numel(inClass);
    if(n <= 0) sbar = 0;
    else
      yMinMu = Y(inClass,:) - repmat(mu(j,:), n, 1);
      sbar = yMinMu' * yMinMu;
    end

    tmp_a = beta+nij(j);
    tmp_b = tmp_a*inv(w*beta + sbar);
    Samp(i).s(:,:,j) = drawWishart(tmp_a, tmp_b);
  end
  prt(debg, 3, 's = ', Samp(i).s);


  %%%%%% W
  tmp_a = k*beta+1;
  tmp_b = tmp_a * inv(sigSqi_y + beta*sum(s,3));
  Samp(i).w = drawWishart(tmp_a, tmp_b);
  prt(debg, 3, 'w = ', Samp(i).w);


  %%%%%% Beta
  Samp(i).beta = ars(@logmvbetapdf, {s,w}, 1, arsStart, [0 inf])+D-1;
  prt(debg, 2, 'beta = ', Samp(i).beta)


  %%%%%% Alpha
  Samp(i).alpha = ars(@logalphapdf, {k, N}, 1, arsStart, [0 inf]);
  prt(debg, 2, 'alpha = ', Samp(i).alpha);


  %%%%%% C
  % Samples from the priors, which could be swapped in if we need a
  % new class.
  % Only needed for datapoints at indices >= Ic.
  mu_prop = [zeros(Ic-1,D); drawNormal(lambda, pinv(r), N-Ic+1)];
  s_prop = ones(D,D,N);
  wi = inv(w);
  [s_prop(:,:,Ic), WiCh] = drawWishart(beta, wi);
  for prop = Ic+1:N
    s_prop(:,:,prop) = drawWishart(beta, wi, WiCh);
  end

  % Find the likelihoods under samples from the prior for indices >= Ic
  unrep_like = alpha/(N-1+alpha) * normalLikeDiff(Y, mu_prop, s_prop);

  % Find the likelihoods of the observations under the *new* gaussians
  mu = Samp(i).mu; s = Samp(i).s;
  rep_like = normalLikeSame(Y, mu, s);

  % Calculate the priors, specific to each datapoint, because they
  % count over everyone else
  pri = repmat(nij/(N-1+alpha), 1, N);
  idxs = sub2ind(size(pri), c, [1:N]);
  pri(idxs) = pri(idxs) - 1/(N-1+alpha);

  % Tweak probabilities for classes with one member
  for cl=find(nij == 1)'
    idx = find(c == cl);
    unrep_like(idx) = 0;
    pri(cl,idx) = pri(cl,idx) + 1/(N-1+alpha);
  end

  % Assign datapoints to classes, get rid of empty classes, add new
  % ones
  like = [pri .* rep_like; unrep_like];
  prt(debg, 2, 'like = ', like);
  cn = drawMultinom(like);
  [c,keep,to_add,Ic] = renumber(cn, c, k, Ic);
  Samp(i).mu = [mu(keep,:); mu_prop(to_add,:)];
  Samp(i).s = s(:,:,keep);
  Samp(i).s(:,:,end+1:end+numel(to_add)) = s_prop(:,:,to_add);
  Samp(i).k = size(Samp(i).mu, 1);

  prt(debg, 4, 'mu = ', Samp(i).mu);
  prt(debg, 4, 'k = ', Samp(i).k);


  % Find the relative weights of the components
  nij = tabulate(c); nij = nij(:,2);
  Samp(i).pi = nij / sum(nij);

  Samp(i).Ic = Ic;
  prt(debg, 2, 'Ic = ', Ic);
end



function x = drawNormal(mu, sigSq, N)
% Draw N samples from a Gaussian with mean mu and covariance sigSq.
% Mu is a row vector.
if(nargin < 3) N = 1; end
D = size(mu,2);
[u,s,v] = svd(sigSq);
sig = sqrt(s)*v';
x = randn(N,D)*sig + repmat(mu, N, 1);


function pr = normalLikeSame(y, mu, s)
% Evaluate the likelihood of data y under each of the gaussians
% with mean mu(j,:) and precision s(:,:,j).
[Ny, D] = size(y);
Nmu = size(mu,1);
for j=1:Nmu
  yMinMu = (y - repmat(mu(j,:), Ny, 1))';
  pr(j,:) = sqrt(det(s(:,:,j)/(2*pi))) * ...
            exp( -1/2 * sum( yMinMu .* (s(:,:,j) * yMinMu) ) );
end


function pr = normalLikeDiff(y, mu, s)
% Evaluate the likelihood of data point y(j,:) under the gaussian
% with mean mu(j,:) and precision s(:,:,j).
[Nmu, D] = size(mu);
for j=1:Nmu
  yMinMu = y(j,:) - mu(j,:);
  pr(j) = sqrt(det(s(:,:,j)/(2*pi))) * exp(-1/2*yMinMu * s(:,:,j) * yMinMu');
end


function x = drawInvChiSq(nu, nu_lambda)
% Draw one sample from a scaled inverse chi-square distribution with
% parameters nu and nu*lambda
x = nu_lambda / chi2rnd(nu);


function x = drawBeta(a, b)
% Draw one sample from a Beta distribution with parameters a and b
x = betarnd(a,b);


function x = drawBernoulli(p)
% Draw Bernoulli random variables with probability p of getting 1.
% x is the same size as p.
x = rand(size(p)) < p;


function x = drawGamma(shape, mean)
% Draw a gamma-distributed random variable having shape and mean
% parameters given by the arguments.
% Translates Rasmussen's shape and mean notation to mathworld's and
% mathworks' alpha and theta notation. When Rasmussen writes
% G(beta, w^-1), matlab expects G(beta, w^-1/beta).
x = gamrnd(shape, mean./shape);



function [X,Mch] = drawWishart(shape, mean, varargin)
% Draw a wishart-distributed random matrix having shape and mean
% parameters given by the arguments.
[X,Mch] = wishrnd(mean/shape, shape, varargin{:});
while(det(X) <= 0)
% $$$ warning('Wishart drew nonpositive definite matrix');
  X = wishrnd(mean/shape, shape, Mch);
end



function prt(debg, level, txt, num)
% Print text and number to the screen if debugging is enabled.
if(debg >= level)
  if(numel(num) == 1)
    disp([txt num2str(num)]);
  else
    disp(txt)
    disp(num)
  end
end
--------------------------------------------------------------------------------
/igmm_uv.m:
--------------------------------------------------------------------------------
function Samp = igmm_uv(Y, Nsamp, debg)

% Implementation of Rasmussen's infinite GMM for univariate data. Use
% Gibbs sampling to draw Nsamp samples from the posterior distribution
% of an infinite GMM given some data Y. Based on a hierarchical
% graphical model where hyperparameters are also inferred, so there
% are no parameters to tweak by hand. Samp is a vector of structures,
% where each element has fields mu, s, lambda, r, beta, w, alpha, and
% k as described in the paper.

% Copyright (C) 2005 Michael Mandel, mim at ee columbia edu;
% distributable under GPL, see README.txt

% TODO:
% Handle case where n_{-i,j}=0, but c_i=j

if(nargin < 3) debg = 0; end

N = length(Y);
mu_y = mean(Y);
sigSq_y = var(Y);
sigSqi_y = 1/sigSq_y;
arsStart = [.1 4];

% Start off with one class
c = ones(1,N);
Samp(1).k = 1;
Samp(1).mu = mu_y;
Samp(1).s = sigSqi_y;
Samp(1).lambda = drawNormal(mu_y, sigSq_y);
Samp(1).r = drawGamma(1, sigSqi_y);
Samp(1).beta = 1/gamrnd(1,1);
Samp(1).w = drawGamma(1, sigSq_y);
Samp(1).alpha = 1/gamrnd(1,1);
Samp(1).pi = 1;
Samp(1).Ic = 1;

Ic = 1;

% Go!
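% Each sweep below resamples, in turn: the component means mu, the
% hyperparameters lambda and r, the component precisions s, the
% hyperparameters w and beta (beta via adaptive rejection sampling),
% the concentration alpha (also via ARS), and finally the class
% assignments c.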
for i=2:Nsamp
  prt(debg, '########### ', i);

  % Make aliases for more readable code
  k = Samp(i-1).k; mu = Samp(i-1).mu; s = Samp(i-1).s;
  beta = Samp(i-1).beta; r = Samp(i-1).r; w = Samp(i-1).w;
  lambda = Samp(i-1).lambda; alpha = Samp(i-1).alpha;

  % Find the popularity of the classes
  nij = tabulate(c); nij = nij(:,2);
  prt(debg, 'nij = ', nij')

  %%%%% Mu
  for j=1:k
    inClass = find(c == j);
    n = numel(inClass);
    if(n <= 0) ybar = 0;
    else ybar = mean(Y(inClass));
    end

    tmp_sigSq = 1/(n*s(j) + r);
    tmp_mu = tmp_sigSq*(n*ybar*s(j) + lambda*r);
    Samp(i).mu(j) = drawNormal(tmp_mu, tmp_sigSq);
  end
  prt(debg, 'mu = ', Samp(i).mu);


  %%%%% Lambda
  tmp_sigSq = 1/(sigSqi_y + k*r);
  tmp_mu = tmp_sigSq*(mu_y*sigSqi_y + r*sum(mu));
  Samp(i).lambda = drawNormal(tmp_mu, tmp_sigSq);
  prt(debg, 'lambda = ', Samp(i).lambda);


  %%%%%% R
  Samp(i).r = drawGamma(k+1, (k+1)/(sigSq_y + sum((mu-lambda).^2)));
  prt(debg, 'r = ', Samp(i).r);


  %%%%%% S
  tmp_a = []; tmp_b = [];
  for j=1:k
    inClass = find(c == j);
    n = numel(inClass);
    if(n <= 0) sbar = 0;
    else sbar = sum((Y(inClass) - mu(j)).^2);
    end

    tmp_a(j) = beta+nij(j);
    tmp_b(j) = tmp_a(j) / (w*beta + sbar);
  end
  Samp(i).s = drawGamma(tmp_a, tmp_b);
  prt(debg, 's = ', Samp(i).s);


  %%%%%% W
  Samp(i).w = drawGamma(k*beta+1, (k*beta+1)/(sigSqi_y + beta*sum(s)));
  prt(debg, 'w = ', Samp(i).w);


  %%%%%% Beta
  % Is beta log concave or log(beta)? Looks like beta itself is.
  Samp(i).beta = ars(@logbetapdf, {s, w}, 1, arsStart, [0 inf]);
  prt(debg, 'beta = ', Samp(i).beta)


  %%%%%% Alpha
  % Is alpha log concave or log(alpha)? Looks like alpha itself is.
  Samp(i).alpha = ars(@logalphapdf, {k, N}, 1, arsStart, [0 inf]);
  prt(debg, 'alpha = ', Samp(i).alpha);


  %%%%%% C
  % Samples from the priors, which could be swapped in if we need a
  % new class.
  % Only needed for datapoints at indices >= Ic.
  mu_prop = [zeros(1,Ic-1) drawNormal(lambda, 1/r, N-Ic+1)];
  s_prop = [ones(1,Ic-1) drawGamma(beta*ones(1,N-Ic+1), 1/w)];

  % Find the likelihoods of the observations under the *new*
  % gaussians for indices >= Ic
  unrep_like = alpha/(N-1+alpha) * normalLike(Y, mu_prop, s_prop);

  mu = Samp(i).mu; s = Samp(i).s;
  rep_like = [];
  for j=1:k
    rep_like(j,:) = normalLike(Y, mu(j), s(j));
  end

  % Calculate the priors, specific to each datapoint, because they
  % count over everyone else
  pri = repmat(nij/(N-1+alpha), 1, N);
  idxs = sub2ind(size(pri), c, [1:N]);
  pri(idxs) = pri(idxs) - 1/(N-1+alpha);

  % Assign datapoints to classes, get rid of empty classes, add new
  % ones
  cn = drawMultinom([pri .* rep_like; unrep_like]);
  [c,keep,to_add,Ic] = renumber(cn, c, k, Ic);
  Samp(i).mu = [mu(keep) mu_prop(to_add)];
  Samp(i).s = [s(keep) s_prop(to_add)];
  Samp(i).k = length(Samp(i).mu);

  prt(debg, 'mu = ', Samp(i).mu);
  prt(debg, 'k = ', Samp(i).k);


  % Find the relative weights of the components
  nij = tabulate(c); nij = nij(:,2);
  Samp(i).pi = nij / sum(nij);

  Samp(i).Ic = Ic;
end



function x = drawNormal(mu, sigSq, N)
% Draw N samples from a Gaussian with mean mu and variance sigSq
if(nargin < 3) N = 1; end
x = randn(1,N)*sqrt(sigSq) + mu;


function pr = normalLike(y, mu, s)
% The likelihood of data y under the gaussian with mean mu and inverse
% variance s. If y, mu, and s are the same size, evaluate the
% likelihood of each point in y under a different mu and s; if mu
% and s are scalars, evaluate all points under the same gaussian.
pr = sqrt(s/(2*pi)) .* exp(-s.*(y-mu).^2/2);


function x = drawInvChiSq(nu, nu_lambda)
% Draw one sample from a scaled inverse chi-square distribution with
% parameters nu and nu*lambda
x = nu_lambda / chi2rnd(nu);


function x = drawBeta(a, b)
% Draw one sample from a Beta distribution with parameters a and b
x = betarnd(a,b);


function x = drawBernoulli(p)
% Draw Bernoulli random variables with probability p of getting 1.
% x is the same size as p.
x = rand(size(p)) < p;


function x = drawGamma(shape, mean)
% Draw a gamma-distributed random variable having shape and mean
% parameters given by the arguments. Translates Rasmussen's shape
% and mean notation to mathworld's and mathworks' alpha and theta
% notation. When Rasmussen writes G(beta, w^-1), matlab expects
% G(beta/2, 2*w^-1/beta).
x = gamrnd(shape/2, 2*mean./shape);


function prt(debg, msg, num)
% Print text and number to the screen if debugging is enabled.
if(debg)
  disp([msg num2str(num)]);
end
--------------------------------------------------------------------------------
/logalphapdf.m:
--------------------------------------------------------------------------------
function [h, hprime] = logalphapdf(alpha, k, n)
% Generate the log of the pdf for the alpha variable in Rasmussen's
% infinite gmm, eq 15.

% Copyright (C) 2005 Michael Mandel, mim at ee columbia edu;
% distributable under GPL, see README.txt


h = (k-3/2)*log(alpha) - 1./(2*alpha) .* gammaln(alpha) ...
    - gammaln(n+alpha);
hprime = (k - 3/2)./alpha + 1./(2*alpha.^2) .* gammaln(alpha) ...
    - 1./(2*alpha) .* psi(alpha) - psi(n+alpha);
--------------------------------------------------------------------------------
/logbetapdf.m:
--------------------------------------------------------------------------------
function [h, hprime] = logbetapdf(beta, s, w)

% Generate the log of the pdf for the beta variable in Rasmussen's
% infinite gmm, eq 9.

% Copyright (C) 2005 Michael Mandel, mim at ee columbia edu;
% distributable under GPL, see README.txt


if(beta < 0)
  beta
end
k = length(s);
h = -k * gammaln(beta/2) - 1./(2*beta) + (k*beta-3)./2.*log(beta/2) + ...
    (beta/2)*sum(log(w) + log(s)) - beta*sum(s*w/2);
hprime = -k/2*psi(beta/2) + 1./(2*beta.^2) + k/2.*log(beta/2) + ...
    (k*beta-3)./(2*beta) + sum(log(s) + log(w) - s*w)/2;
--------------------------------------------------------------------------------
/logmvbetapdf.m:
--------------------------------------------------------------------------------
function [h, hprime] = logmvbetapdf(y, S, W)

% Generate the log of the pdf for y = beta-D+1 in Rasmussen's infinite
% gmm, eq 9 for the multivariate case. The i-th covariance matrix is
% S(:,:,i). To get samples from beta, take samples from this
% function and add D-1 to them.

% Copyright (C) 2005 Michael Mandel, mim at ee columbia edu;
% distributable under GPL, see README.txt


if(y < 0) y, end
k = size(S,3);
d = size(S,1);
N = length(y);

ldW = log(det(W));
for i=1:k
  ldS(i) = log(det(S(:,:,i)));
  trWS(i) = trace(W*S(:,:,i));
end
yd1 = y+d-1;

sumGammaLn = 0; sumPsi = 0;
for i=0:d-1
  sumGammaLn = sumGammaLn + gammaln(.5*(y+i));
  sumPsi = sumPsi + psi(.5*(y+i));
end

if(k < N)
  sum_ldS_trWS = zeros(size(y));
  for i=1:k
    sum_ldS_trWS = sum_ldS_trWS + (y+2)/2*ldS(i) - yd1/2*trWS(i);
  end
else
  for i=1:N
    sum_ldS_trWS(i) = sum((y(i)+2)/2*ldS - yd1(i)/2*trWS);
  end
end

h = -3/2*log(yd1) - d./(2*yd1) - k*(y*d*log(2)/2 + sumGammaLn) + ...
    yd1*k/2.*(d*log(yd1) + ldW) + sum_ldS_trWS;
hprime = -3./(2*yd1) + d./(2*yd1.^2) - k*(d*log(2)/2 + sumPsi/2) + ...
    k/2*(d*log(yd1) + ldW + d) + .5*sum(ldS - trWS);
--------------------------------------------------------------------------------
/lognormalpdf.m:
--------------------------------------------------------------------------------
function [h, hprime] = lognormalpdf(x, mu, sigma)

% Log of the univariate normal distribution with mean mu and
% standard deviation sigma. Returns the log of the pdf and the
% derivative of the log pdf at x.

% Copyright (C) 2005 Michael Mandel, mim at ee columbia edu;
% distributable under GPL, see README.txt


h = -1/2*log(2*pi*sigma^2) - (x-mu).^2/(2*sigma^2);
hprime = -(x-mu)./sigma^2;
--------------------------------------------------------------------------------
/plotAutoCov.m:
--------------------------------------------------------------------------------
function plotAutoCov(Samp)

% Plot the autocovariances of the various parameters as a function
% of lag in the sampling.
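%
% Used to generate figure 3 in the paper (see README.txt), e.g.:
%   Samp = igmm_uv(Y, 10000);
%   plotAutoCov(Samp);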

% Copyright (C) 2005 Michael Mandel, mim at ee columbia edu;
% distributable under GPL, see README.txt


acK = ifftshift(xcov([Samp.k], 'coeff'));
% $$$ acR = ifftshift(xcov([Samp.r], 'coeff'));
% $$$ acW = ifftshift(xcov([Samp.w], 'coeff'));
acLambda = ifftshift(xcov([Samp.lambda], 'coeff'));
acAlpha = ifftshift(xcov([Samp.alpha], 'coeff'));
acBeta = ifftshift(xcov([Samp.beta], 'coeff'));

x = [1:1000];

plot(x, acK(x), x, acAlpha(x), x, acLambda(x), x, ...
     acBeta(x), [1 1000], [0 0]);
% $$$ plot(x, acK(x), x, acAlpha(x), x, acW(x), x, acLambda(x), x, ...
% $$$      acR(x), x, acBeta(x), [1 1000], [0 0]);

legend('k', '\alpha', '\lambda', '\beta');
% $$$ legend('k', '\alpha', 'w', '\lambda', 'r', '\beta');
title('Autocovariances of Hyperparameters');
xlabel('Lag')
--------------------------------------------------------------------------------
/plotGmm.m:
--------------------------------------------------------------------------------
function plotGmm(gmm, data, x)

% Plot a Gaussian mixture model described by the structure gmm. This
% function works for 1D, 2D, and 3D Gaussians. GMM should have fields
% mu, s, and pi, where mu is an NxD matrix of the means of N
% Gaussians, s is a DxDxN matrix of precisions (inverse covariances),
% and pi is an N-vector of the relative weights of the Gaussians. The
% second argument, DATA, is optional but should contain the data from
% which these Gaussians were generated, which will be plotted on the
% same graph. The X argument, for 1D GMMs, is a 2-vector specifying
% the minimum and maximum X values over which to plot the Gaussians.

% Copyright (C) 2005 Michael Mandel, mim at ee columbia edu;
% distributable under GPL, see README.txt

[N,D] = size(gmm.mu);

if(D == 1)
  % Number of standard deviations to go outside of the lowest and
  % highest means.
  xrange = 3;
  points = 222;

  sigmaSq = 1./gmm.s;
  sigma = sqrt(sigmaSq);

  if(nargin < 3)
    xmin = min(gmm.mu - xrange*sigma);
    xmax = max(gmm.mu + xrange*sigma);
    x = [xmin:(xmax-xmin)/points:xmax];
  end

  if(nargin > 1)
    N = numel(data);
    [histN, histX] = hist(data,round(N/10));
    dx = mean(diff(histX));
    bar(histX, histN/(N*dx), 1, 'w'), shading faceted
    hold on
  end

  for i=1:length(gmm.mu)
    y(i,:) = gmm.pi(i)/sqrt(2*pi*sigmaSq(i)) * ...
             exp(-(x-gmm.mu(i)).^2/(2*sigmaSq(i)));
  end
  tot = sum(y);

  plot(x, y, x, tot, '--');

  if(nargin > 1) hold off; end

elseif (D == 2)
  if(nargin > 1)
    plot(data(:,1), data(:,2), '.k');
    hold on
  end
  w = gmm.pi / sum(gmm.pi);

  for i=1:size(gmm.mu,1);
    plotGauss2(gmm.mu(i,:), gmm.s(:,:,i), w(i));
    hold on
  end
  hold off


elseif (D == 3)
  grid on
  if(nargin > 1)
    scatter3(data(:,1), data(:,2), data(:,3), 2, 'b');
    hold on
  end
  w = gmm.pi / sum(gmm.pi);

  for i=1:size(gmm.mu,1);
    plotGauss3(gmm.mu(i,:), gmm.s(:,:,i), w(i));
    hold on
  end
  hold off


else
  error('Can only plot 1D, 2D, or 3D GMMs');
end



function plotGauss2(mu,S,w)
% plotGauss2(mu,S,w)
%
% Plot a 2D Gaussian. This function plots the given 2D gaussian on
% the current plot.
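%
% The unit circle traced by (x,y) below is mapped through a matrix
% square root of S (from its eigendecomposition, so A'*A = S) to
% trace the component's ellipse; heavier components (larger w) are
% drawn in darker shades.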

t = -pi:.01:pi;
x = sin(t);
y = cos(t);

[vv,dd] = eig(S);
A = real((vv*sqrt(dd))');
z = [x' y']*A;

hand = plot(mu(1),mu(2),'X');
set(hand, 'Color', 1-[w w w], 'LineWidth', 3);
hand = plot(z(:,1)+mu(1),z(:,2)+mu(2));
set(hand, 'Color', 1-[w w w], 'LineWidth', 3);



function plotGauss3(mu, S, w)
% Plot a 3D Gaussian on the current plot

[u,s,v] = svd(S);
A = diag(1./diag(sqrt(s))) * v';
pts = 2*[A; -A] + repmat(mu, 6, 1);
for i=1:3
  h = plot3(pts(i+[0 3],1), pts(i+[0 3],2), pts(i+[0 3],3), 'r');
  set(h, 'linewidth', w*20);
end
--------------------------------------------------------------------------------
/renumber.m:
--------------------------------------------------------------------------------
function [c,keep,to_add,Ic] = renumber(cn, c, k, Ic)
% Assign new classes to points. From Ic up to the first addition
% of a new class, c = cn; after the addition of the new class, c =
% c, and Ic is set to the index of the first point that wasn't
% relabeled. A new class is indicated by cn = k+1. If there is
% no new class added between Ic and the end of cn, reset Ic to 1.

% Copyright (C) 2005 Michael Mandel, mim at ee columbia edu;
% distributable under GPL, see README.txt


Icn = find(cn(Ic:end) == k+1)+Ic-1;
is_new = 1-isempty(Icn);

if(is_new)
  Icn = Icn(1);
  c(Ic:Icn) = cn(Ic:Icn);
  Ic = Icn+1;
  to_add = Icn;
else
  c(Ic:end) = cn(Ic:end);
  to_add = [];
  Ic = 1;
end

% Count up how many members each class has
tab = tabulate(c);

% Get rid of empty classes
keep = find(tab(:,2) ~= 0);
tab = tab(keep,:);

% Don't include the extra class if there is one
keep = keep(1:end-is_new);

% Count the number of classes to keep
kn = size(tab,1);

% Relabel points so that each group has at least one point in it
for j=1:kn
  c(find(c == tab(j,1))) = j;
end
--------------------------------------------------------------------------------
/wishrnd.m:
--------------------------------------------------------------------------------
function [a,d] = wishrnd(sigma,df,d)
%WISHRND Generate Wishart random matrix
% W=WISHRND(SIGMA,DF) generates a random matrix W having the Wishart
% distribution with covariance matrix SIGMA and with DF degrees of
% freedom.
%
% W=WISHRND(SIGMA,DF,D) expects D to be the Cholesky factor of
% SIGMA. If you call WISHRND multiple times using the same value
% of SIGMA, it's more efficient to supply D instead of computing
% it each time.
%
% [W,D]=WISHRND(SIGMA,DF) returns D so it can be used again in
% future calls to WISHRND.
%
% Updated 5/25/05 by mim so that df > n-1 instead of >= n
%
% See also IWISHRND.

% References:
% Krzanowski, W.J. (1990), Principles of Multivariate Analysis, Oxford.
% Smith, W.B., and R.R. Hocking (1972), "Wishart variate generator,"
% Applied Statistics, v. 21, p. 341. (Algorithm AS 53)

% Copyright 1993-2002 The MathWorks, Inc.
% $Revision: 1.3 $ $Date: 2005/05/26 22:28:43 $

% Error checking
[n,m] = size(sigma);
if (df<=n-1) & (df~=round(df))
  error('stats:wishrnd:BadDf',...
        'Degrees of freedom must be an integer, or larger than n-1.');
end

% Factor sigma unless that has already been done
if (nargin<3)
  [d,p] = chol(sigma);
  if p>0
    error('stats:wishrnd:BadCovariance',...
          'Covariance matrix must be positive semi-definite.')
  end
end

% For small degrees of freedom, generate the matrix using the definition
% of the Wishart distribution; see Krzanowski for example
if (df <= 81+n) & (df==round(df))
  x = randn(df,size(d,1)) * d;

% Otherwise use the Smith & Hocking procedure
else
  % Load diagonal elements with square root of chi-square variates
  a = diag(sqrt( gamrnd((df-(0:n-1))/2, 2) ));

  % Load upper triangle with independent normal (0, 1) variates
  a(itriu(n)) = randn(n*(n-1)/2,1);

  % Desired matrix is D'(A'A)D
  x = a*d;
end

a = x' * x;



% --------- get indices of upper triangle of p-by-p matrix
function d=itriu(p)

d=ones(p*(p-1)/2,1);
d(1+cumsum(0:p-2))=p+1:-1:3;
d = cumsum(d);
--------------------------------------------------------------------------------