├── LICENSE
├── README.md
├── VERSION
├── demos
│   ├── demo_escalatorVideo.m
│   ├── recreatePaperExperiment.m
│   └── test_l1l2.m
├── setup_fastRPCA.m
├── solvers
│   ├── solver_RPCA_Lagrangian.m
│   ├── solver_RPCA_SPGL1.m
│   ├── solver_RPCA_constrained.m
│   └── solver_split_SPCP.m
└── utilities
    ├── WolfeLineSearch.m
    ├── errorFunction.m
    ├── func_split_spcp.m
    ├── holdMarker.m
    ├── lbfgsAdd.m
    ├── lbfgsProd.m
    ├── lbfgs_gpu.m
    ├── loadSyntheticProblem.m
    ├── randomizedSVD.m
    ├── referenceSolutions
    │   ├── downloadEscalatorData.m
    │   ├── downloadReferenceSolutions.m
    │   ├── escalator_data.mat
    │   ├── verySmall_l1.mat
    │   └── verySmall_l1l2.mat
    └── vec.m

/LICENSE:
--------------------------------------------------------------------------------
3-clause license ("Revised BSD License", "New BSD License",
or "Modified BSD License")

Copyright (c) 2014, Stephen Becker and Aleksandr Aravkin
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
  list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

* Neither the name of University of Colorado Boulder nor IBM Research,
  nor the names of their contributors, may be used to endorse or promote
  products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
fastRPCA
========

MATLAB code for all variants of robust PCA and stable principal component pursuit (SPCP). This implements the algorithms from the conference paper "A variational approach to stable principal component pursuit" by Aravkin, Becker, Cevher, and Olsen; UAI 2014.

Not only is this code fast, but it is the only code we know of that solves all common SPCP variants, including the new variants we introduced in the paper. All these variants are in some sense equivalent, but some of them are easier to solve, and some have parameters that are easier to estimate. See the [paper for details](http://arxiv.org/abs/1406.1089).
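A minimal usage sketch, adapted from `demos/demo_escalatorVideo.m` (the synthetic data and parameter values below are illustrative assumptions, not tuned recommendations):

```
% Decompose X into a low-rank part L and a sparse part S by solving
%   min_{L,S} max( ||L||_* , lambda*||S||_1 )  s.t.  ||L+S-X||_F <= epsilon
setup_fastRPCA;                          % add the toolbox to the path

rng(0);                                  % synthetic example: rank-2 plus sparse
X = randn(50,2)*randn(2,60) + full(sprandn(50,60,0.05));

lambda  = 2e-2;                          % weight on the sparse term
epsilon = 5e-3*norm(X,'fro');            % allowed infidelity to the data
opts    = struct('max',true,'sum',false,'printEvery',10);
[L,S]   = solver_RPCA_SPGL1(X, lambda, epsilon, [], opts);
```

The `demos` directory runs the same call on real surveillance video, along with the Lagrangian and split-SPCP variants.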
More info on robust PCA and stable principal component pursuit
(websites with software, review articles, etc.):

* ["A variational approach to stable principal component pursuit"](http://arxiv.org/abs/1406.1089)
* [LRS Library](https://github.com/andrewssobral/lrslibrary)
* [Matrix Factorization jungle](https://sites.google.com/site/igorcarron2/matrixfactorizations)
* [One of the first papers on RPCA](http://arxiv.org/abs/0912.3599)


Citation
---------
BibTeX:

```
@inproceedings{aravkin2014,
  author    = "Aravkin, A. and Becker, S. and Cevher, V. and Olsen, P.",
  title     = "A variational approach to stable principal component pursuit",
  booktitle = "Conference on Uncertainty in Artificial Intelligence (UAI)",
  year      = "2014",
  month     = "July",
}
```

Code and installation
----

The code runs in MATLAB and does not require any mex files or installation: just unzip the file and you are set. Run `setup_fastRPCA` to set the correct paths, and try the `demos` directory for sample usage of the code.

[![View fastRPCA on File Exchange](https://www.mathworks.com/matlabcentral/images/matlab-file-exchange.svg)](https://www.mathworks.com/matlabcentral/fileexchange/52359-fastrpca)

Authors
--------------
The code was developed by all the authors of the UAI paper, but primary development is due to Stephen Becker and Aleksandr (Sasha) Aravkin (when both were at IBM Research). Further contributions are welcome.
## Contact us
* [Sasha Aravkin](https://uw-amo.github.io/saravkin/), University of Washington, Seattle
* [Stephen Becker](https://amath.colorado.edu/faculty/becker/), University of Colorado Boulder

--------------------------------------------------------------------------------
/VERSION:
--------------------------------------------------------------------------------
v0.1
November 2014
--------------------------------------------------------------------------------
/demos/demo_escalatorVideo.m:
--------------------------------------------------------------------------------
%% Demo of RPCA on a video clip
% We demonstrate this on a video clip taken from a surveillance
% camera in a subway station. Not only is there a background
% and a foreground (i.e., people), but there is an escalator
% with periodic motion. Conventional background subtraction
% algorithms would have difficulty with the escalator, but
% this RPCA formulation easily captures it as part of the low-rank
% structure.
% The clip is taken from the data at this website (after
% a bit of processing to convert it to grayscale):
% http://perception.i2r.a-star.edu.sg/bk_model/bk_index.html

% Load the data:
downloadEscalatorData;
load escalator_data % contains X (data), m and n (height and width)
X = double(X);


%% Run the SPGL1 algorithm
%{
We solve
    min_{L,S}  max( ||L||_* , lambda ||S||_1 )
    subject to
               || L+S-X ||_F <= epsilon
%}
nFrames = size(X,2);
lambda  = 2e-2;
L0      = repmat( median(X,2), 1, nFrames );
S0      = X - L0;
epsilon = 5e-3*norm(X,'fro'); % tolerance for fidelity to data
opts    = struct('sum',false,'L0',L0,'S0',S0,'max',true,...
32 | 'tau0',3e5,'SPGL1_tol',1e-1,'tol',1e-3,'printEvery',1); 33 | 34 | [L,S] = solver_RPCA_SPGL1(X,lambda,epsilon,[],opts); 35 | 36 | %% Uncomment to Solve Lagrangian Version 37 | 38 | % opts = struct('sum',false,'max',false,'tol',1e-8,'printEvery',1); 39 | % lambdaL = 115; 40 | % lambdaS = 0.8; 41 | % 42 | % [L,S] = solver_RPCA_Lagrangian(X,lambdaL,lambdaS,[],opts); 43 | 44 | 45 | %% Run the Split-SPCP Algorithm 46 | % Implements Split-SPCP method from ``Adapting Regularized Low-Rank Models 47 | % for Parallel Architectures'' (SIAM J. Sci. Comput., 2019). 48 | 49 | % Set options for L-BFGS 50 | params.progTol = 1e-10; 51 | params.optTol = 1e-10; 52 | params.MaxIter = 500; 53 | params.store = 1; 54 | 55 | % Set parameters for split_SPCP 56 | params.k = 10; % rank bound on L 57 | params.gpu = 0; % set to 1 to run on GPU 58 | params.lambdaL = 115; 59 | params.lambdaS = 0.8; 60 | 61 | 62 | [L,S] = solver_split_SPCP(X,params); 63 | %% show all together in movie format 64 | 65 | mat = @(x) reshape( x, m, n ); 66 | figure(1); clf; 67 | colormap( 'Gray' ); 68 | for k = 1:nFrames 69 | imagesc( [mat(X(:,k)), mat(L(:,k)), mat(S(:,k))] ); 70 | axis off 71 | axis image 72 | drawnow; 73 | pause(.05); 74 | end -------------------------------------------------------------------------------- /demos/recreatePaperExperiment.m: -------------------------------------------------------------------------------- 1 | %{ 2 | Recreates some tests from the conference paper 3 | "A variational approach to stable principal component pursuit", 4 | A. Aravkin, S. Becker, V. Cevher, P. Olsen 5 | UAI 2014 6 | http://arxiv.org/abs/1406.1089 7 | 8 | 9 | This script runs the code. There are several "fancy" things: 10 | 11 | 1) it calls loadSyntheticProblem( problemName ) 12 | This lets you load problems that have pre-computed reference solutions 13 | It also lets you load problems and the corresponding parameters 14 | for all variants (objectives vs. constraints). 
It does 15 | this by running the Lagrangian version, from which we can 16 | calculate the other parameters. 17 | 18 | 2) it can run the software NSA or PSPG or ASALM for comparison, if these 19 | codes are visible on the Matlab path 20 | (Note: it runs modified versions of these, in order to collect 21 | errors and such, plus minor improvements, but we do not distribute 22 | these codes. 23 | 24 | 3) it has a fancy error function that not only records errors per 25 | iteration, but also records at what time the error was collected, 26 | and it can subtract off the time taken to collect the error, 27 | and it can also make a program quit if it exceeds a given time limit. 28 | (if we exceed the time limit, it throws the error: 29 | errorFunction:timedOut 30 | This is the reason for the "try"/"catch" statements below) 31 | 32 | The paper used some synthetic problems taken from the NSA paper, 33 | and since these were not in Lagrangian format, we don't collect all 34 | parameters, so we cannot test all variants 35 | 36 | Stephen Becker, November 12 2014 37 | %} 38 | 39 | tolFactor = 1; 40 | maxIts = 50; 41 | maxTime = 20; % in seconds 42 | doNSA = true && exist('nsa','file'); 43 | doPSPG = true && exist('pspg','file'); 44 | doASALM = true && exist('asalm','file'); 45 | 46 | % Pick a problem: 47 | % The "maxTime" is a time limit (in seconds) for each solver 48 | problemString = 'medium_exponentialNoise_v2'; problem_type=0;tolFactor = 1e-3;maxIts = 100;maxTime=20; 49 | % problemString = 'nsa1500_SNR45_problem1_run1'; problem_type=1;maxTime = 60; tolFactor=1e-2;% 50 | % problemString = 'nsa1500_SNR45_problem2_run1'; problem_type=2;maxTime = 60; tolFactor=1e-2;% 51 | 52 | 53 | 54 | [Y,L,S,params] = loadSyntheticProblem(problemString); 55 | rng(100); % this affects the randomized SVDs 56 | normL = norm(L(:)); normS = norm(S(:)); 57 | errFcn = @(LL,SS) norm(LL-L,'fro')/normL + norm(SS-S,'fro')/normS; 58 | erF = @(varargin)errorFunction( errFcn, varargin{:} ); 59 | 60 | %% 
Solve via NSA 61 | if doNSA 62 | tol = 1e-2*tolFactor; 63 | t1 = tic; 64 | errorFunction( {t1, maxTime } ); % only allow 20 seconds 65 | try 66 | [L,S] = nsa(Y,-params.epsilon,tol,0,[],[],params.lambdaSum, erF ); 67 | tm = toc(t1); 68 | fprintf('Via NSA, time is %.2e, error is %.2e\n', tm, errFcn(L,S) ); 69 | catch thrownError 70 | if ~strcmpi(thrownError.identifier,'errorFunction:timedOut') 71 | rethrow(thrownError) 72 | else 73 | disp('NSA timed out, but error were still recorded'); 74 | end 75 | end 76 | [errHist_NSA, times_NSA] = errorFunction(); 77 | end 78 | 79 | %% Solve via ASALM, constrained version 80 | if doASALM 81 | % If 'minType' is 'unc', solves Lagrangian version 82 | Omega = (1:numel(Y))'; 83 | opts = struct('minType','con','tol',1e-3*tolFactor,'print',1,'record_res',true); 84 | if problem_type == 1 85 | opts.beta = 0.1/mean(abs(Y(Omega))); 86 | else 87 | opts.beta = 0.15/mean(abs(Y(Omega))); 88 | end 89 | if numel(Y) <= 50^2 90 | opts.SVD = true; opts.PROPACK = false; 91 | end 92 | 93 | t1 = tic; 94 | errorFunction( {t1, maxTime } ); % only allow 20 seconds 95 | try 96 | out = asalm(size(Y),Omega,Y(Omega),params.lambdaSum,params.epsilon,opts,[],[],[],[],erF); %,optimal_X,optimal_S); 97 | tm = toc(t1); 98 | L = out.LowRank; S = out.Sparse; 99 | fprintf('Via ASALM, constrained, time is %.2e, error is %.2e\n', tm, errFcn(L,S) ); 100 | catch thrownError 101 | if ~strcmpi(thrownError.identifier,'errorFunction:timedOut') 102 | rethrow(thrownError) 103 | else 104 | disp('ASALM timed out, but error were still recorded'); 105 | end 106 | end 107 | [errHist_ASALM,times_ASALM] = errorFunction(); 108 | end 109 | %% Solve via ASALM, unconstrained 110 | if doASALM 111 | if ~isempty( params.lambdaS ) 112 | Omega = (1:numel(Y))'; 113 | opts = struct('minType','unc','tol',1e-3*tolFactor,'print',1,'record_res',true); 114 | if problem_type == 1 115 | opts.beta = 0.1/mean(abs(Y(Omega))); 116 | else 117 | opts.beta = 0.15/mean(abs(Y(Omega))); 118 | end 119 | if 
numel(Y) <= 50^2 120 | opts.SVD = true; opts.PROPACK = false; 121 | end 122 | 123 | t1 = tic; 124 | errorFunction( {t1, maxTime } ); % only allow 20 seconds 125 | try 126 | out = asalm(size(Y),Omega,Y(Omega),params.lambdaSum,params.epsilon,opts,[],[],[],[],erF); %,optimal_X,optimal_S); 127 | tm = toc(t1); 128 | L = out.LowRank; S = out.Sparse; 129 | fprintf('Via ASALM, unconstrained, time is %.2e, error is %.2e\n', tm, errFcn(L,S) ); 130 | catch thrownError 131 | if ~strcmpi(thrownError.identifier,'errorFunction:timedOut') 132 | rethrow(thrownError) 133 | else 134 | disp('ASALM timed out, but error were still recorded'); 135 | end 136 | end 137 | [errHist_ASALM2,times_ASALM2] = errorFunction(); 138 | else 139 | errHist_ASALM2 = []; 140 | end 141 | end 142 | %% Solve via PSPG... 143 | if doPSPG 144 | tol = 1e-3*tolFactor; 145 | mu = 1.25/norm(Y); % their default 146 | t1 = tic; 147 | errorFunction( {t1, maxTime } ); 148 | try 149 | [L,S,out]=pspg(Y,-params.epsilon,tol,[],[],params.lambdaSum,erF,mu,maxIts); 150 | tm = toc(t1); 151 | fprintf('Via PSPG, constrained, time is %.2e, error is %.2e\n', tm, errFcn(L,S) ); 152 | catch thrownError 153 | if ~strcmpi(thrownError.identifier,'errorFunction:timedOut') 154 | rethrow(thrownError) 155 | else 156 | disp('PSPG timed out, but error were still recorded'); 157 | end 158 | end 159 | [errHist_PSPG,times_PSPG] = errorFunction(); 160 | end 161 | 162 | %% Test our dual version by itself, max version (not doing SPGL1 stuff) 163 | opts = struct('printEvery',5,'tol',1e-4*tolFactor,'max',true,'maxIts',maxIts); 164 | opts.errFcn = erF; 165 | opts.quasiNewton = true; opts.FISTA=false; opts.BB = false; 166 | t1 = tic; 167 | errorFunction( {t1, maxTime } ); 168 | try 169 | [L,S] = solver_RPCA_constrained(Y,params.lambdaMax, params.tauMax, [], opts); 170 | tm = toc(t1); 171 | fprintf('Via our "dual-max" code, constrained, time is %.2e, error is %.2e\n', tm, errFcn(L,S) ); 172 | catch thrownError 173 | if 
~strcmpi(thrownError.identifier,'errorFunction:timedOut') 174 | rethrow(thrownError) 175 | else 176 | disp('our "dual-max" code timed out, but error were still recorded'); 177 | end 178 | end 179 | [errHist_max,times_max] = errorFunction(); 180 | 181 | %% Test our dual version by itself, sum version (not doing SPGL1 stuff) 182 | opts = struct('printEvery',5,'tol',1e-5*tolFactor,'sum',true,'maxIts',maxIts); 183 | opts.restart = 100; 184 | opts.errFcn = erF; 185 | opts.BB = true; 186 | opts.FISTA = false; 187 | 188 | t1 = tic; 189 | errorFunction( {t1, maxTime } ); 190 | try 191 | [L,S] = solver_RPCA_constrained(Y,params.lambdaSum, params.tauSum, [], opts); 192 | tm = toc(t1); 193 | fprintf('Via our "dual-sum" code, constrained, time is %.2e, error is %.2e\n', tm, errFcn(L,S) ); 194 | catch thrownError 195 | if ~strcmpi(thrownError.identifier,'errorFunction:timedOut') 196 | rethrow(thrownError) 197 | else 198 | disp('our "dual-sum" code timed out, but error were still recorded'); 199 | end 200 | end 201 | [errHist_sum,times_sum] = errorFunction(); 202 | 203 | %% Our Lagrangian version 204 | if ~isempty( params.lambdaS ) 205 | opts = struct('printEvery',1,'tol',1e-4*tolFactor,'maxIts',maxIts,... 206 | 'errFcn',erF,'quasiNewton',true,'FISTA',false,'BB',false,... 
207 | 'SVDstyle',4,'SVDnPower',2); 208 | t1 = tic; 209 | errorFunction( {t1, maxTime } ); % only allow 20 seconds 210 | try 211 | [L,S] = solver_RPCA_Lagrangian(Y,params.lambdaL, params.lambdaS, [], opts); 212 | tm = toc(t1); 213 | fprintf('Via our "Lagrangian" code, constrained, time is %.2e, error is %.2e\n', tm, errFcn(L,S) ); 214 | catch thrownError 215 | if ~strcmpi(thrownError.identifier,'errorFunction:timedOut') 216 | rethrow(thrownError) 217 | else 218 | disp('our "Lagrangian" code timed out, but error were still recorded'); 219 | end 220 | end 221 | [errHist_Lag,times_Lag] = errorFunction(); 222 | else 223 | errHist_Lag = []; 224 | end 225 | 226 | %% Our Flip-Flop max version 227 | opts = struct('printEvery',500,'tol',1e-4*tolFactor,'max',true,'maxIts',maxIts); 228 | opts.errFcn = erF; 229 | opts.quasiNewton = true; opts.FISTA=false; opts.BB = false; 230 | opts.SPGL1_tol = 1e-2*tolFactor; 231 | t1 = tic; 232 | errorFunction( {t1, maxTime } ); 233 | try 234 | [L,S] = solver_RPCA_SPGL1(Y,params.lambdaMax, params.epsilon, [], opts); 235 | tm = toc(t1); 236 | fprintf('Via our "Flip-Flop max" code, constrained, time is %.2e, error is %.2e\n', tm, errFcn(L,S) ); 237 | catch thrownError 238 | if ~strcmpi(thrownError.identifier,'errorFunction:timedOut') 239 | rethrow(thrownError) 240 | else 241 | disp('our "Flip-Flop max" code timed out, but error were still recorded'); 242 | end 243 | end 244 | [errHist_FlipMax,times_FlipMax] = errorFunction(); 245 | 246 | %% Our Flip-Flop sum version 247 | opts = struct('printEvery',500,'tol',1e-4*tolFactor,'sum',true,'maxIts',maxIts); 248 | opts.errFcn = erF; 249 | opts.BB = true; 250 | opts.FISTA = false; 251 | opts.SVDstyle = 4; 252 | opts.SPGL1_tol = 1e-2*tolFactor; 253 | t1 = tic; 254 | errorFunction( {t1, maxTime } ); 255 | try 256 | [L,S] = solver_RPCA_SPGL1(Y,params.lambdaSum, params.epsilon, [], opts); 257 | tm = toc(t1); 258 | fprintf('Via our "Flip-Flop sum" code, constrained, time is %.2e, error is %.2e\n', tm, 
errFcn(L,S) ); 259 | catch thrownError 260 | if ~strcmpi(thrownError.identifier,'errorFunction:timedOut') 261 | rethrow(thrownError) 262 | else 263 | disp('our "Flip-Flop sum" code timed out, but error were still recorded'); 264 | end 265 | end 266 | [errHist_FlipSum,times_FlipSum] = errorFunction(); 267 | 268 | %% plot all of them 269 | figure(2); clf; 270 | holdMarker('reset'); 271 | fontsize = 14; 272 | style = {'-','linewidth',2,'markersize',8}; 273 | 274 | % Plot error vs. time 275 | plotF = @semilogy;xString = 'time (s)'; 276 | % If you want to plot error vs. iterations, not time, use the following: 277 | % plotF = @(x,varargin) semilogy(1:length(x), varargin{:} ); xString = 'iterations'; 278 | 279 | lString = {}; 280 | 281 | plotF( times_max, errHist_max , style{:} );lString{end+1} = 'this paper* (flip-SPCP-max, QN)'; 282 | hold all 283 | plotF( times_sum, errHist_sum , style{:} );lString{end+1} = 'this paper (flip-SPCP-sum)'; 284 | if ~isempty( errHist_Lag ) 285 | plotF( times_Lag, errHist_Lag , style{:} );lString{end+1} = 'this paper* (lag-SPCP, QN)'; 286 | end 287 | plotF( times_FlipMax, errHist_FlipMax, style{:} ); lString{end+1} = 'this paper* (SPCP-max, QN)'; 288 | plotF( times_FlipSum, errHist_FlipSum, style{:} ); lString{end+1} = 'this paper (SPCP-sum)'; 289 | 290 | if doNSA 291 | plotF( times_NSA, errHist_NSA, style{:} ); lString{end+1} = 'NSA* (SPCP-sum)'; 292 | end 293 | if doPSPG 294 | plotF( times_PSPG, errHist_PSPG , style{:} );lString{end+1} = 'PSPG* (SPCP-sum)'; 295 | end 296 | if doASALM 297 | plotF( times_ASALM, errHist_ASALM , style{:} );lString{end+1} = 'ASALM* (SPCP-sum)'; 298 | if ~isempty( errHist_ASALM2 ) 299 | plotF( times_ASALM2, errHist_ASALM2 , style{:} );lString{end+1} = 'ASALM* (lag-SPCP)'; 300 | end 301 | end 302 | % The ^* denotes fastSVD, and the ^{QN} denotes quasi-Newton 303 | 304 | holdMarker(0); 305 | lh = legend(lString); set(lh,'fontsize',fontsize); 306 | refresh 307 | ylabel('Error','fontsize',fontsize); 308 | 
xlabel(xString,'fontsize',fontsize); 309 | switch problem_type 310 | case 0 311 | ylim([1e-4,Inf]); 312 | case 1 313 | ylim([5e-3,3]); 314 | case 2 315 | ylim([1e-3,5]); 316 | set(lh,'location','east'); 317 | ps = get(gcf,'position'); 318 | set(gcf,'position',[ps(1:2), 600,400] ); 319 | end -------------------------------------------------------------------------------- /demos/test_l1l2.m: -------------------------------------------------------------------------------- 1 | %{ 2 | Friday April 17 2015, testing l1l2 block version 3 | This script only works if you have CVX installed to find 4 | the reference answers. 5 | Stephen Becker, stephen.becker@colorado.edu 6 | %} 7 | 8 | 9 | %% Simple test for just l1 norm: 10 | [Y,L,S,params] = loadSyntheticProblem('verySmall_l1'); 11 | % And test: 12 | opts = struct('tol',1e-8); 13 | [LL,SS,errHist] = solver_RPCA_Lagrangian(Y,params.lambdaL,params.lambdaS,[],opts); 14 | fprintf(2,'Error with Lagrangian version, L1 norm, is %.2e (L), %.2e (S)\n\n', ... 15 | norm( LL - L, 'fro')/norm(L,'fro'), norm( SS - S, 'fro')/norm(S,'fro') ); 16 | 17 | %% Test l1l2 block norm (sum of l2 norms of rows) 18 | 19 | % -- Load reference solutiion 20 | [Y,L,S,params] = loadSyntheticProblem('verySmall_l1l2'); 21 | 22 | % -- Solve Lagrangian version 23 | opts = struct('tol',1e-10,'printEvery',50,'L1L2','rows'); 24 | normLS = sqrt(norm(L,'fro')^2+norm(S,'fro')^2); 25 | % opts.FISTA = true; opts.restart = Inf; opts.BB = false; opts.quasiNewton = false; 26 | opts.errFcn = @(LL,SS) sqrt(norm(LL-L,'fro')^2+norm(SS-S,'fro')^2)/normLS; 27 | [LL,SS,errHist] = solver_RPCA_Lagrangian(Y,params.lambdaL,params.lambdaS,[],opts); 28 | fprintf(2,'Error with Lagrangian version, L1L2 via rows, is %.2e (L), %.2e (S)\n\n', ... 
29 | norm( LL - L, 'fro')/norm(L,'fro'), norm( SS - S, 'fro')/norm(S,'fro') ); 30 | 31 | % -- Solve simplest constrained versions 32 | opts = struct('tol',1e-10,'printEvery',50,'L1L2','rows'); 33 | normLS = sqrt(norm(L,'fro')^2+norm(S,'fro')^2); 34 | % opts.FISTA = true; opts.restart = Inf; opts.BB = false; opts.quasiNewton = false; 35 | opts.errFcn = @(LL,SS) sqrt(norm(LL-L,'fro')^2+norm(SS-S,'fro')^2)/normLS; 36 | % opts.sum = true; % doesn't work with L1L2 = 'rows' 37 | opts.max = true; 38 | [LL,SS,errHist] = solver_RPCA_constrained(Y,params.lambdaMax, params.tauMax,[], opts); 39 | fprintf(2,'Error with simple constrained version, L1L2 via rows, is %.2e (L), %.2e (S)\n\n', ... 40 | norm( LL - L, 'fro')/norm(L,'fro'), norm( SS - S, 'fro')/norm(S,'fro') ); 41 | 42 | % --- Solve fancy constrained versions 43 | opts = struct('tol',1e-12,'printEvery',50,'L1L2','rows'); 44 | normLS = sqrt(norm(L,'fro')^2+norm(S,'fro')^2); 45 | % opts.FISTA = true; opts.restart = Inf; opts.BB = false; opts.quasiNewton = false; 46 | opts.errFcn = @(LL,SS) sqrt(norm(LL-L,'fro')^2+norm(SS-S,'fro')^2)/normLS; 47 | % opts.sum = true; % doesn't work with L1L2 = 'rows' 48 | opts.max = true; 49 | % opts.tau0 = params.tauMax; % cheating 50 | [LL,SS,errHist,tau] = solver_RPCA_SPGL1(Y,params.lambdaMax, params.epsilon,[], opts); 51 | fprintf(2,'\n\nError with fancier constrained version, L1L2 via rows, is %.2e (L), %.2e (S)\n\n', ... 52 | norm( LL - L, 'fro')/norm(L,'fro'), norm( SS - S, 'fro')/norm(S,'fro') ); 53 | 54 | 55 | %% And suppose we want l1l2 on the columns? 
56 | % We can use the same reference solution by just transposing 57 | [Y,L,S,params] = loadSyntheticProblem('verySmall_l1l2'); 58 | Y = Y'; 59 | L = L'; 60 | S = S'; 61 | opts = struct('tol',1e-10,'printEvery',150,'L1L2','cols'); 62 | normLS = sqrt(norm(L,'fro')^2+norm(S,'fro')^2); 63 | opts.errFcn = @(LL,SS) sqrt(norm(LL-L,'fro')^2+norm(SS-S,'fro')^2)/normLS; 64 | [LL,SS,errHist] = solver_RPCA_Lagrangian(Y,params.lambdaL,params.lambdaS,[],opts); 65 | fprintf(2,'Error with Lagrangian version, L1L2 via columns, is %.2e (L), %.2e (S)\n\n', ... 66 | norm( LL - L, 'fro')/norm(L,'fro'), norm( SS - S, 'fro')/norm(S,'fro') ); 67 | 68 | % --- Now simple constrained version 69 | opts = struct('tol',1e-10,'printEvery',150,'L1L2','cols'); 70 | normLS = sqrt(norm(L,'fro')^2+norm(S,'fro')^2); 71 | % opts.FISTA = true; opts.restart = Inf; opts.BB = false; opts.quasiNewton = false; 72 | opts.errFcn = @(LL,SS) sqrt(norm(LL-L,'fro')^2+norm(SS-S,'fro')^2)/normLS; 73 | % opts.sum = true; % doesn't work with L1L2 = 'rows' 74 | opts.max = true; 75 | [LL,SS,errHist] = solver_RPCA_constrained(Y,params.lambdaMax, params.tauMax,[], opts); 76 | fprintf(2,'Error with simple constrained version, L1L2 via columns, is %.2e (L), %.2e (S)\n\n', ... 77 | norm( LL - L, 'fro')/norm(L,'fro'), norm( SS - S, 'fro')/norm(S,'fro') ); 78 | 79 | % --- Now fancier constrained version 80 | opts = struct('tol',1e-12,'printEvery',200,'L1L2','cols'); 81 | normLS = sqrt(norm(L,'fro')^2+norm(S,'fro')^2); 82 | opts.errFcn = @(LL,SS) sqrt(norm(LL-L,'fro')^2+norm(SS-S,'fro')^2)/normLS; 83 | opts.max = true; 84 | [LL,SS,errHist,tau] = solver_RPCA_SPGL1(Y,params.lambdaMax, params.epsilon,[], opts); 85 | fprintf(2,'\n\nError with fancier constrained version, L1L2 via columns, is %.2e (L), %.2e (S)\n\n', ... 
86 | norm( LL - L, 'fro')/norm(L,'fro'), norm( SS - S, 'fro')/norm(S,'fro') ); 87 | 88 | 89 | %% Ignore this: 90 | % The code I used to generate the reference solutions: 91 | 92 | % m = 10; 93 | % n = 11; 94 | % rng(343); 95 | % Y = randn(m,n); 96 | 97 | % -- First, test Lagrangian version, just l1 98 | % lambdaL = .4; 99 | % lambdaS = .1; 100 | % cvx_begin 101 | % cvx_precision best 102 | % variables L(m,n) S(m,n) 103 | % minimize lambdaL*norm_nuc(L)+lambdaS*norm(S(:),1)+.5*sum_square(vec(L+S-Y)) 104 | % cvx_end 105 | % S( abs(S)<1e-10 )=0; 106 | % save('~/Repos/fastRPCA/utilities/referenceSolutions/verySmall_l1.mat',... 107 | % 'Y','L','S','lambdaL','lambdaS','m','n' ); 108 | 109 | % -- Now, test Lagrangian version with l1l2 110 | % lambdaL = .25; 111 | % lambdaS = .24; 112 | % cvx_begin 113 | % cvx_precision best 114 | % variables L(m,n) S(m,n) 115 | % minimize lambdaL*norm_nuc(L)+lambdaS*sum(norms(S'))+.5*sum_square(vec(L+S-Y)) 116 | % cvx_end 117 | % S( abs(S)<1e-10 )=0 118 | % save('~/Repos/fastRPCA/utilities/referenceSolutions/verySmall_l1l2.mat',... 
%     'Y','L','S','lambdaL','lambdaS','m','n' );
--------------------------------------------------------------------------------
/setup_fastRPCA.m:
--------------------------------------------------------------------------------
function setup_fastRPCA
% SETUP_FASTRPCA Adds the fastRPCA toolbox to the path

baseDirectory = fileparts(mfilename('fullpath'));
addpath(genpath(baseDirectory));
--------------------------------------------------------------------------------
/solvers/solver_RPCA_Lagrangian.m:
--------------------------------------------------------------------------------
function [varargout] = solver_RPCA_Lagrangian(Y,lambda_L,lambda_S,varargin)
% [L,S,errHist] = solver_RPCA_Lagrangian(Y,lambda_L,lambda_S, A_cell, opts)
% Solves the problem
%   minimize_{L,S} .5|| L + S - Y ||_F^2 + lambda_L ||L||_* + lambda_S ||S||_1
%
% or, if A_cell is provided, where A_cell = {A, At}
% (A is a function handle, At is a function handle to the transpose of A),
% then
%
%   minimize_{L,S} .5|| A(L + S) - Y ||_F^2 + lambda_L ||L||_* + lambda_S ||S||_1
%   (here, Y usually represents A(Y) )
%
% errHist(:,1) is a record of the residual
% errHist(:,2) is a record of the full objective (.5*resid^2 + lambda_L ||L||_*,
%   etc.)
16 | % errHist(:,3) is the output of opts.errFcn if provided 17 | % 18 | % opts is a structure with options: 19 | % opts.sum, opts.max (as described above) 20 | % opts.L0 initial guess for L (default is 0) 21 | % opts.S0 initial guess for S (default is 0) 22 | % opts.tol sets stopping tolerance 23 | % opts.maxIts sets maximum number of iterations 24 | % opts.printEvery will print information this many iterations 25 | % opts.displayTime will print out timing information (default is true for large problems) 26 | % opts.errFcn a function of (L,S) that records information 27 | % opts.trueObj if provided, this will be subtracted from errHist(2,:) 28 | % opts.Lip Lipschitz constant, i.e., 2*spectralNorm(A)^2 29 | % by default, assume 2 (e.g., good if A = P_Omega) 30 | % opts.FISTA whether to use FISTA or not. By default, false 31 | % opts.restart how often to restart FISTA; set to -Inf to make it automatic 32 | % opts.BB whether to use the Barzilai-Borwein spectral steplength 33 | % opts.BB_type which BB stepsize to take. Default is 1, the larger step 34 | % opts.BB_split whether to calculate stepslengths for S and L independently. 35 | % Default is false, which is recommended. 36 | % opts.quasiNewton uses quasi-Newton-like Gauss-Seidel scheme. By 37 | % default, true 38 | % opts.quasiNewton_stepsize stepsize length. Default is .8*(2/Lip) 39 | % opts.quasinewton_SLS whether to take S-L-S sequence (default is true) 40 | % otherwise, takes a L-S Gauss-Seidel sequence 41 | % opts.SVDstyle controls what type of SVD is performed. 42 | % 1 = full svd using matlab's "svd". Best for small problems 43 | % 2 = partial svd using matlab's "svds". Not recommended. 44 | % 3 = partial svd using PROPACK, if installed. 
Better than option 2, worse than 4 45 | % 4 = partial svd using randomized linear algebra, following 46 | % the Halko/Tropp/Martinnson "Structure in Randomness" paper 47 | % in option 4, there are additional options: 48 | % opts.SVDwarmstart whether to "warm-start" the algorithm 49 | % opts.SVDnPower number of power iterations (default is 2 unless warm start) 50 | % opts.SVDoffset oversampling, e.g., "rho" in Tropp's paper. Default is 5 51 | % 52 | % opts.L1L2 instead of using l1 penalty, e.g., norm(S(:),1), we can 53 | % also use block norm penalties, such as (if opts.L1L2 = 'rows') 54 | % the sum of the l2-norm of rows (i.e., l1-norm of rows), 55 | % or if opts.L1L2='cols', the sum of the l2-norms of colimns. 56 | % By default, or if opts.L1L2 = [] or false, then uses usual l1 norm. 57 | % [Feature added April 17 2015] 58 | % 59 | % Stephen Becker, March 6 2014 60 | % See also solver_RPCA_constrained.m 61 | 62 | 63 | [varargout{1:nargout}] = solver_RPCA_constrained(Y,lambda_S/lambda_L, -lambda_L, varargin{:}); -------------------------------------------------------------------------------- /solvers/solver_RPCA_SPGL1.m: -------------------------------------------------------------------------------- 1 | function [L,S,errHist,tau] = solver_RPCA_SPGL1(AY,lambda_S, epsilon, A_cell, opts) 2 | % [L,S,errHist,tau] = solver_RPCA_SPGL1(Y,lambda_S, epsilon, A_cell, opts) 3 | % Solves the problem 4 | % minimize_{L,S} f(L,S) 5 | % subject to 6 | % || A( L+S ) - Y ||_F <= epsilon 7 | % if opts.sum = true 8 | % (1) f(L,S) = ||L||_* + lambda_S ||S||_1 9 | % if opts.max = true 10 | % (2) f(L,S) = max( ||L||_* , lambda_S ||S||_1 ) 11 | % 12 | % By default, A is the identity, 13 | % or if A_cell is provided, where A_cell = {A, At} 14 | % (A is a function handle, At is a function handle to the transpose of A) 15 | % 16 | % This uses the trick from SPGL1 to call several subproblems 17 | % 18 | % errHist(1,:) is a record of the residual 19 | % errHist(2,:) is a record of the full 
objective (that is, .5*resid^2 ) 20 | % errHist(3,:) is the output of opts.errFcn if provided 21 | % 22 | % opts is a structure with options: 23 | % opts.tau0 starting point for SPGL1 search (default: 10) 24 | % opts.SPGL1_tol tolerance for 1D Newton search (default: 1e-2) 25 | % opts.SPGL1_maxIts maximum number of iterations for 1D search (default: 10) 26 | % 27 | % And these options are the same as in solver_RPCA_constrained.m 28 | % opts.sum, opts.max (as described above) 29 | % opts.L0 initial guess for L (default is 0) 30 | % opts.S0 initial guess for S (default is 0) 31 | % opts.tol sets stopping tolerance (default is 1e-6) 32 | % opts.maxIts sets maximum number of iterations 33 | % opts.printEvery will print information this many iterations 34 | % opts.displayTime will print out timing information (default is true for large problems) 35 | % opts.errFcn a function of (L,S) that records information 36 | % opts.trueObj if provided, this will be subtracted from errHist(2,:) 37 | % opts.Lip Lipschitz constant, i.e., 2*spectralNorm(A)^2 38 | % by default, assume 2 (e.g., good if A = P_Omega) 39 | % opts.FISTA whether to use FISTA or not. By default, true 40 | % opts.restart how often to restart FISTA; set to -Inf to make it automatic 41 | % opts.BB whether to use the Barzilai-Borwein spectral steplength 42 | % opts.BB_type which BB stepsize to take. Default is 1, the larger step 43 | % opts.BB_split whether to calculate stepslengths for S and L independently. 44 | % Default is false, which is recommended. 45 | % opts.quasiNewton uses quasi-Newton-like Gauss-Seidel scheme. 46 | % Only available in "max" mode 47 | % opts.quasiNewton_stepsize stepsize length. Default is .8*(2/Lip) 48 | % opts.quasinewton_SLS whether to take S-L-S sequence (default is true) 49 | % otherwise, takes a L-S Gauss-Seidel sequence 50 | % opts.SVDstyle controls what type of SVD is performed. 51 | % 1 = full svd using matlab's "svd". 
Best for small problems 52 | % 2 = partial svd using matlab's "svds". Not recommended. 53 | % 3 = partial svd using PROPACK, if installed. Better than option 2, worse than 4 54 | % 4 = partial svd using randomized linear algebra, following 55 | % the Halko/Tropp/Martinsson "Finding Structure with Randomness" paper 56 | % in option 4, there are additional options: 57 | % opts.SVDwarmstart whether to "warm-start" the algorithm 58 | % opts.SVDnPower number of power iterations (default is 2 unless warm start) 59 | % opts.SVDoffset oversampling, e.g., "rho" in Tropp's paper. Default is 5 60 | % 61 | % opts.L1L2 instead of using the l1 penalty, e.g., norm(S(:),1), we can 62 | % also use block norm penalties, such as (if opts.L1L2 = 'rows') 63 | % the sum of the l2-norms of rows (i.e., the l1 norm of the row norms), 64 | % or if opts.L1L2='cols', the sum of the l2-norms of columns. 65 | % By default, or if opts.L1L2 = [] or false, then uses the usual l1 norm. 66 | % [Feature added April 17 2015] 67 | % 68 | % Stephen Becker, March 6 2014. Edited March 14 2014.
stephen.beckr@gmail.com 69 | % See also solver_RPCA_Lagrangian.m, solver_RPCA_constrained.m 70 | 71 | 72 | % todo: allow S >= 0 constraints, since this is easy 73 | 74 | error(nargchk(3,5,nargin,'struct')); 75 | if nargin < 5, opts = []; end 76 | % == PROCESS OPTIONS == 77 | function out = setOpts( field, default ) 78 | if ~isfield( opts, field ) 79 | opts.(field) = default; 80 | end 81 | out = opts.(field); 82 | opts = rmfield( opts, field ); % so we can do a check later 83 | end 84 | 85 | if nargin < 4 || isempty(A_cell) 86 | A = @(X) X(:); 87 | [n1,n2] = size(AY); 88 | At = @(x) reshape(x,n1,n2); 89 | else 90 | A = A_cell{1}; 91 | At = A_cell{2}; 92 | % Y could be either Y or A(Y) 93 | if size(AY,2) > 1 94 | % AY is a matrix, not a vector, so it is probably Y and not A(Y) 95 | disp('Changing Y to A(Y)'); 96 | AY = A(AY); 97 | end 98 | end 99 | 100 | tauInitial = setOpts('tau0', 1e1 ); 101 | 102 | % April 17 2015 103 | L1L2 = setOpts('L1L2',0); 104 | if isempty(L1L2), L1L2=0; end 105 | if L1L2, 106 | if ~isempty(strfind(lower(L1L2),'row')), L1L2 = 'rows'; 107 | elseif ~isempty(strfind(lower(L1L2),'col')), L1L2 = 'cols'; 108 | % so col, COL, cols, columns, etc. all acceptable 109 | else 110 | error('unrecognized option for L1L2: should be row or column or 0'); 111 | end 112 | end 113 | % Note: setOpts removes things, so add back.
Easy to have bugs 114 | opts.L1L2 = L1L2; 115 | 116 | finalTol = setOpts('tol',1e-6); 117 | sumProject = setOpts('sum', false ); 118 | maxProject = setOpts('max', false ); 119 | if (sumProject && maxProject) || (~sumProject && ~maxProject), error('must choose either "sum" or "max" type projection'); end 120 | opts.max = maxProject; % since setOpts removes them 121 | opts.sum = sumProject; 122 | rNormOld = Inf; 123 | tau = tauInitial; 124 | SPGL1_tol = setOpts('SPGL1_tol',1e-2); 125 | SPGL1_maxIts = setOpts('SPGL1_maxIts', 10 ); 126 | errHist = []; 127 | for nSteps = 1:SPGL1_maxIts 128 | opts.tol = max(finalTol, finalTol*10^((4/nSteps)-1) ); % slowly reduce tolerance 129 | % opts.tol = finalTol; 130 | fprintf('\n==Running SPGL1 Newton iteration with tau=%.2f\n\n', tau); 131 | [L,S,errHistTemp] = solver_RPCA_constrained(AY,lambda_S, tau, A_cell, opts); 132 | % Find a zero of the function phi(tau) = norm(x_tau) - epsilon 133 | 134 | errHist = [errHist; errHistTemp]; 135 | rNorm = errHist(end,1); % norm(residual) 136 | if abs( epsilon - rNorm ) < SPGL1_tol*epsilon 137 | disp('Reached end of SPGL1 iterations: converged to right residual'); 138 | break; 139 | end 140 | if abs( rNormOld - rNorm ) < .1*SPGL1_tol 141 | disp('Reached end of SPGL1 iterations: converged'); 142 | break; 143 | end 144 | % for nuclear norm matrix completion, 145 | % normG = spectralNorm( gradient ) and gradient is just residual 146 | G = At(A(L+S-AY)); % it's really two copies, [G;G] 147 | % if we have a linear operator, above needs to be modified... 
148 | if sumProject 149 | normG = max(norm(G), (1/lambda_S)*norm(G(:), inf)); 150 | elseif maxProject 151 | if ~any(L1L2) 152 | % Using l1 norm for S 153 | normG = norm(G) + (1/lambda_S)*norm(G(:), inf); 154 | elseif strcmpi(L1L2,'rows') 155 | % dual norm of l1 of l2-norm of rows' 156 | normG = norm(G) + (1/lambda_S)*norm( sqrt(sum(G.^2,2)), inf ); 157 | 158 | elseif strcmpi(L1L2,'cols') 159 | normG = norm(G) + (1/lambda_S)*norm( sqrt(sum(G.^2,1)), inf ); 160 | end 161 | end 162 | 163 | % otherwise, take another Newton step 164 | phiPrime = -normG/rNorm; % derivative of phi 165 | ALPHA = .99; % amount of Newton step to take 166 | tauOld = tau; 167 | tau = tau + ALPHA*( epsilon - rNorm )/phiPrime; 168 | tau = min( max(tau,tauOld/10), 1e10 ); 169 | % and warm-start the next iteration: 170 | opts.S0 = S; 171 | opts.L0 = L; 172 | end 173 | % And if we requested very high accuracy, run a bit more on the final 174 | % iteration... 175 | if finalTol < 1e-6 176 | opts.tol = finalTol/10; 177 | opts.S0 = S; 178 | opts.L0 = L; 179 | opts.SVDnPower = 3; 180 | [L,S,errHistTemp] = solver_RPCA_constrained(AY,lambda_S, tau, A_cell, opts); 181 | errHist = [errHist; errHistTemp]; 182 | end 183 | 184 | 185 | end % end of main function -------------------------------------------------------------------------------- /solvers/solver_RPCA_constrained.m: -------------------------------------------------------------------------------- 1 | function [L,S,errHist] = solver_RPCA_constrained(AY,lambda_S, tau, A_cell, opts) 2 | % [L,S,errHist] = solver_RPCA_constrained(Y,lambda_S, tau, A_cell, opts) 3 | % Solves the problem 4 | % minimize_{L,S} .5|| L + S - Y ||_F^2 5 | % subject to 6 | % if opts.sum = true 7 | % (1) ||L||_* + lambda_S ||S||_1 <= tau 8 | % if opts.max = true 9 | % (2) max( ||L||_* , lambda_S ||S||_1 ) <= tau 10 | % 11 | % if opts.max and opts.sum are false and tau is a negative number, then 12 | % we solve the problem: 13 | % minimize_{L,S} .5|| L + S - Y ||_F^2 + abs(tau)*( 
||L||_* + lambda_S ||S||_1 ) 14 | % (but see solver_RPCA_Lagrangian.m for a simpler interface) 15 | % 16 | % or if A_cell is provided, where A_cell = {A, At} 17 | % (A is a function handle, At is a function handle to the transpose of A) 18 | % then 19 | % 20 | % minimize_{L,S} .5|| A(L + S) - Y ||_F^2 21 | % subject to ... 22 | % (here, Y usually represents A(Y); if Y is not the same size 23 | % as A(L), then we will automatically set Y <-- A(Y) ) 24 | % 25 | % errHist(:,1) is a record of the residual 26 | % errHist(:,2) is a record of the full objective (that is, .5*resid^2 ) 27 | % errHist(:,3) is the output of opts.errFcn if provided 28 | % 29 | % opts is a structure with options: 30 | % opts.sum, opts.max (as described above) 31 | % opts.L0 initial guess for L (default is 0) 32 | % opts.S0 initial guess for S (default is 0) 33 | % opts.size [n1,n2] where L and S are n1 x n2 matrices. The size is automatically 34 | % determined in most cases, but when providing a linear operator 35 | % it may be necessary to provide an explicit size. 36 | % opts.tol sets stopping tolerance 37 | % opts.maxIts sets maximum number of iterations 38 | % opts.printEvery how often to print information (every printEvery iterations) 39 | % opts.displayTime will print out timing information (default is true for large problems) 40 | % opts.errFcn a function of (L,S) that records information 41 | % opts.trueObj if provided, this will be subtracted from errHist(:,2) 42 | % opts.Lip Lipschitz constant, i.e., 2*spectralNorm(A)^2 43 | % by default, assume 2 (e.g., good if A = P_Omega) 44 | % opts.FISTA whether to use FISTA or not. By default, true 45 | % opts.restart how often to restart FISTA; set to -Inf to make it automatic 46 | % opts.BB whether to use the Barzilai-Borwein spectral steplength 47 | % opts.BB_type which BB stepsize to take. Default is 1, the larger step 48 | % opts.BB_split whether to calculate steplengths for S and L independently. 49 | % Default is false, which is recommended.
50 | % opts.quasiNewton uses a quasi-Newton-like Gauss-Seidel scheme. 51 | % Only available in "max" mode 52 | % opts.quasiNewton_stepsize stepsize length. Default is .8*(2/Lip) 53 | % opts.quasiNewton_SLS whether to take an S-L-S sequence (default is true); 54 | % otherwise, takes an L-S Gauss-Seidel sequence 55 | % opts.SVDstyle controls what type of SVD is performed. 56 | % 1 = full svd using matlab's "svd". Best for small problems 57 | % 2 = partial svd using matlab's "svds". Not recommended. 58 | % 3 = partial svd using PROPACK, if installed. Better than option 2, worse than 4 59 | % 4 = partial svd using randomized linear algebra, following 60 | % the Halko/Tropp/Martinsson "Finding Structure with Randomness" paper 61 | % in option 4, there are additional options: 62 | % opts.SVDwarmstart whether to "warm-start" the algorithm 63 | % opts.SVDnPower number of power iterations (default is 2 unless warm start) 64 | % opts.SVDoffset oversampling, e.g., "rho" in Tropp's paper. Default is 5 65 | % 66 | % opts.L1L2 instead of using the l1 penalty, e.g., norm(S(:),1), we can 67 | % also use block norm penalties, such as (if opts.L1L2 = 'rows') 68 | % the sum of the l2-norms of rows (i.e., the l1 norm of the row norms), 69 | % or if opts.L1L2='cols', the sum of the l2-norms of columns. 70 | % By default, or if opts.L1L2 = [] or false, then uses the usual l1 norm. 71 | % [Feature added April 17 2015] 72 | % 73 | % Features that may be added later: [email developers if these are 74 | % important to you] 75 | % - Allow Huber loss function. 76 | % 77 | % Stephen Becker, March 6 2014. Edited March 14 2014, April 2015.
78 | % stephen.becker@colorado.edu 79 | % See also solver_RPCA_Lagrangian.m, solver_RPCA_SPGL1.m 80 | 81 | 82 | % todo: allow S >= 0 constraints, since this is easy 83 | % todo: allow Huber loss function 84 | 85 | error(nargchk(3,5,nargin,'struct')); 86 | if nargin < 5, opts = []; end 87 | % == PROCESS OPTIONS == 88 | function out = setOpts( field, default ) 89 | if ~isfield( opts, field ) 90 | opts.(field) = default; 91 | end 92 | out = opts.(field); 93 | opts = rmfield( opts, field ); % so we can do a check later 94 | end 95 | 96 | if nargin < 4 || isempty(A_cell) 97 | A = @(X) X(:); 98 | [n1,n2] = size(AY); 99 | At = @(x) reshape(x,n1,n2); 100 | 101 | if ~iscell(tau) 102 | AY = A(AY); 103 | end 104 | % The factor of 2 is since we are in both L and S 105 | else 106 | A = A_cell{1}; 107 | At = A_cell{2}; 108 | % Y could be either Y or A(Y) 109 | if size(AY,2) > 1 110 | % AY is a matrix, not a vector, so it is probably Y and not A(Y) 111 | disp('Changing Y to A(Y)'); 112 | AY = A(AY); 113 | [n1,n2] = size(AY); % April 24 '14 114 | else 115 | % April 24, '14: we need to know the (n1,n2) 116 | sz = setOpts('size',[] ); 117 | if isempty(sz) 118 | error('Cannot determine the size of the variables; please specify opts.size=[n1,n2]'); 119 | end 120 | n1 = sz(1); 121 | n2 = sz(2); 122 | end 123 | %[n1,n2] = size(AY); % comment out April 24 '14 124 | end 125 | normAY = norm(AY(:)); 126 | 127 | vec = @(X) X(:); 128 | 129 | % Some problem sizes. Feel free to tweak.
Mainly affect the defaults 130 | SMALL = ( n1*n2 <= 50^2 ); 131 | MEDIUM = ( n1*n2 <= 200^2 ) && ~SMALL; 132 | LARGE = ( n1*n2 <= 1000^2 ) && ~SMALL && ~MEDIUM; 133 | HUGE = ( n1*n2 > 1000^2 ); 134 | 135 | 136 | % -- PROCESS OPTIONS -- (some defaults depend on problem size ) 137 | tol = setOpts('tol',1e-6*(SMALL | MEDIUM) + 1e-4*LARGE + 1e-3*HUGE ); 138 | maxIts = setOpts('maxIts', 1e3*(SMALL | MEDIUM ) + 400*LARGE + 200*HUGE ); 139 | printEvery = setOpts('printEvery',100*SMALL + 50*MEDIUM + 5*LARGE + 1*HUGE); 140 | errFcn = setOpts('errFcn', [] ); 141 | Lip = setOpts('Lip', 2 ); 142 | restart = setOpts('restart',-Inf); 143 | trueObj = setOpts('trueObj',0); 144 | sumProject = setOpts('sum', false ); 145 | maxProject = setOpts('max', false ); 146 | 147 | if tau < 0 148 | Lagrangian = true; 149 | if sumProject || maxProject 150 | error('in Lagrangian mode (when tau<0 signifies lambda=|tau|), turn off sum/maxProject'); 151 | end 152 | lambda = abs(tau); 153 | tau = []; % help us track down bugs 154 | else 155 | Lagrangian = false; 156 | if (sumProject && maxProject) || (~sumProject && ~maxProject), error('must choose either "sum" or "max" type projection'); end 157 | end 158 | 159 | QUASINEWTON = setOpts('quasiNewton', maxProject || Lagrangian ); 160 | FISTA = setOpts('FISTA',~QUASINEWTON); 161 | BB = setOpts('BB',~QUASINEWTON); 162 | % Note: BB with FISTA is sometimes not so good 163 | if BB && FISTA, warning('solver_RPCA:convergence','Convergence not guaranteed with FISTA if opts.BB=true'); end 164 | BB_split = setOpts('BB_split',false); 165 | BB_type = setOpts('BB_type',1); % 1 or 2 166 | stepsizeQN = setOpts('quasiNewton_stepsize', .8*2/Lip ); 167 | S_L_S = setOpts('quasiNewton_SLS', true ); 168 | displayTime = setOpts('displayTime',LARGE | HUGE ); 169 | SVDstyle = setOpts('SVDstyle', 1*SMALL + 4*(~SMALL) ); % 1 is full SVD 170 | % and even finer tuning (only matters if SVDstyle==4) 171 | SVDwarmstart= setOpts('SVDwarmstart', true ); 172 | SVDnPower =
setOpts('SVDnPower', 1 + ~SVDwarmstart ); % number of power iterations 173 | SVDoffset = setOpts('SVDoffset', 5 ); 174 | SVDopts = struct('SVDstyle', SVDstyle,'warmstart',SVDwarmstart,... 175 | 'nPower',SVDnPower,'offset',SVDoffset ); 176 | 177 | if QUASINEWTON 178 | if sumProject 179 | error('Cannot run quasi-Newton mode when in "sum" formulation. Please change to "max"'); 180 | elseif FISTA 181 | error('Cannot run quasi-Newton with FISTA'); 182 | elseif BB 183 | error('Cannot run quasi-Newton with BB'); 184 | end 185 | end 186 | 187 | % April 17 2015 188 | L1L2 = setOpts('L1L2',0); 189 | if isempty(L1L2), L1L2=0; end 190 | if L1L2, 191 | if ~isempty(strfind(lower(L1L2),'row')), L1L2 = 'rows'; 192 | elseif ~isempty(strfind(lower(L1L2),'col')), L1L2 = 'cols'; 193 | % so col, COL, cols, columns, etc. all acceptable 194 | else 195 | error('unrecognized option for L1L2: should be row or column or 0'); 196 | end 197 | end 198 | 199 | projNuclear(); % remove any persistent variables 200 | if maxProject 201 | project = @(L,S,varargin) projectMax(L1L2,tau,lambda_S,SVDopts, L,S); 202 | elseif sumProject 203 | if any(L1L2), error('with opts.sum=true, need opts.L1L2=0'); end 204 | project = @(L,S,varargin) projectSum(tau,lambda_S,L,S); 205 | elseif Lagrangian 206 | project = @(L,S,varargin) projectMax(L1L2,lambda,lambda*lambda_S,SVDopts, L,S, varargin{:}); 207 | end 208 | 209 | L = setOpts('L0',zeros(n1,n2) ); 210 | S = setOpts('S0',zeros(n1,n2) ); 211 | 212 | % Check for extra options that were not processed 213 | if ~isempty( fieldnames(opts ) ) 214 | disp( 'Warning: found unrecognized fields in opts'); 215 | disp( opts ) 216 | error('Found unprocessed options in "opts"'); 217 | end 218 | 219 | stepsize = 1/Lip; 220 | errHist = zeros(maxIts, 2 + ~isempty(errFcn) ); 221 | Grad = 0; 222 | if FISTA || BB || QUASINEWTON 223 | L_old = L; 224 | S_old = S; 225 | end 226 | L_fista = L; 227 | S_fista = S; 228 | BREAK = false; 229 | kk = 0; % counter for FISTA 230 | timeRef = tic; 231 |
for k = 1:maxIts 232 | % Gradient in (L,S) (at the fista point) is (R,R) where... 233 | R = A(L_fista + S_fista) - AY; 234 | Grad_old = Grad; 235 | Grad = At(R); 236 | 237 | objL = Inf; 238 | if QUASINEWTON 239 | % stepsizeQN = 1 - min(0.1, 1/k ); 240 | % stepsizeQN = 1 - .3/sqrt(k); 241 | 242 | if S_L_S 243 | % we solve for S, update L, then re-update S 244 | % Exploits the fact that projection for S is faster 245 | dL = L - L_old; 246 | S_old = S; 247 | [~,S_temp] = project( [], S - stepsizeQN*( Grad + dL ), stepsizeQN ); % take small step... 248 | 249 | dS = S_temp - S_old; 250 | L_old = L; 251 | [L,~,rnk,objL] = project( L - stepsizeQN*( Grad + dS ), [], stepsizeQN ); 252 | 253 | dL = L - L_old; 254 | [~,S] = project( [], S - stepsizeQN*( Grad + dL ) , stepsizeQN); 255 | else 256 | % Gauss-Seidel update, starting with L, then S 257 | % Changing order seemed to not work as well 258 | dS = S - S_old; 259 | L_old = L; 260 | [L,~,rnk,objL] = project( L - stepsizeQN*( Grad + dS ), [] , stepsizeQN); 261 | dL = L - L_old; 262 | S_old = S; 263 | [~,S] = project( [], S - stepsizeQN*( Grad + dL ) , stepsizeQN); 264 | end 265 | % stepsizeQN = (1+3*stepsizeQN)/4; 266 | else 267 | if BB && k > 1 268 | [stepsizeL, stepsizeS] = compute_BB_stepsize( Grad, Grad_old, L, L_old, S, S_old, BB_split, BB_type); 269 | if isnan(stepsizeL) || isnan(stepsizeS) 270 | fprintf(2,'Warning: no BB stepsize possible since iterates have not changed!\n'); 271 | [stepsizeL, stepsizeS] = deal( stepsize ); 272 | end 273 | else 274 | [stepsizeL, stepsizeS] = deal( stepsize ); 275 | end 276 | 277 | % Now compute proximity step 278 | if FISTA || BB 279 | L_old = L; 280 | S_old = S; 281 | end 282 | L = L_fista - stepsizeL*Grad; 283 | S = S_fista - stepsizeS*Grad; 284 | if any(isnan(L(:))) || any(isnan(S(:))), fprintf(2,'DEBUG!\n'); keyboard; end 285 | [L,S,rnk,objL] = project( L, S, stepsizeL, stepsizeS); 286 | if any(isnan(L(:))) || any(isnan(S(:))), fprintf(2,'DEBUG!\n'); keyboard; end 287 | end 288 | 
289 | 290 | DO_RESTART = false; 291 | if FISTA 292 | if k>1 && restart > 0 && ~isinf(restart) && ~mod(kk,restart) 293 | kk = 0; 294 | DO_RESTART = true; 295 | elseif restart==-Inf && kk > 5 296 | % In this case, we restart if the function has significantly increased 297 | if (errHist(k-1,2)-errHist(k-5,2)) > 1e-8*abs(errHist(k-5,2)) 298 | DO_RESTART = true; 299 | kk = 0; 300 | end 301 | end 302 | L_fista = L + kk/(kk+3)*( L - L_old ); 303 | S_fista = S + kk/(kk+3)*( S - S_old ); 304 | kk = kk + 1; 305 | else 306 | L_fista = L; 307 | S_fista = S; 308 | end 309 | 310 | 311 | % res = norm(R(:)); % this is for L_fista, not L 312 | R = A(L + S) - AY; 313 | % sometimes we already have R pre-computed, so if this turns out to 314 | % be a significant computational cost we can make fancier code... 315 | res = norm(R(:)); 316 | errHist(k,1) = res; 317 | errHist(k,2) = 1/2*(res^2); % + lambda_L*objL + lambda_S*objS; 318 | if Lagrangian 319 | errHist(k,2) = errHist(k,2) + lambda*objL; 320 | if any(L1L2) 321 | if strcmpi(L1L2,'rows') 322 | errHist(k,2) = errHist(k,2) + lambda*lambda_S*sum( sqrt( sum(S.^2,2) ) ); 323 | else 324 | errHist(k,2) = errHist(k,2) + lambda*lambda_S*sum( sqrt( sum(S.^2,1) ) ); 325 | end 326 | else 327 | errHist(k,2) = errHist(k,2) + lambda*lambda_S*norm(S(:),1); 328 | end 329 | end 330 | if k > 1 && abs(diff(errHist(k-1:k,1)))/res < tol 331 | BREAK = true; 332 | end 333 | PRINT = ~mod(k,printEvery) | BREAK | DO_RESTART; 334 | if PRINT 335 | fprintf('Iter %4d, rel.
residual %.2e, objective %.2e', k, res/normAY, errHist(k,2) -trueObj); 336 | end 337 | if ~isempty(errFcn) 338 | err = errFcn(L,S); 339 | errHist(k,3) = err; 340 | if PRINT, fprintf(', err %.2e', err ); end 341 | end 342 | if ~isempty(rnk) && PRINT, fprintf(', rank(L) %3d', rnk ); end 343 | if PRINT 344 | fprintf(', sparsity(S) %5.1f%%', 100*nnz(S)/numel(S) ); 345 | end 346 | if displayTime && PRINT 347 | tm = toc( timeRef ); timeRef = tic; 348 | fprintf(', time %.1f s', tm ); 349 | end 350 | if DO_RESTART, fprintf(' [restarted FISTA]'); end 351 | if PRINT, fprintf('\n'); end 352 | if BREAK 353 | fprintf('Reached stopping criteria (based on change in residual)\n'); 354 | break; 355 | end 356 | end 357 | if BREAK 358 | errHist = errHist(1:k,:); 359 | else 360 | fprintf('Reached maximum number of allowed iterations\n'); 361 | end 362 | 363 | end 364 | 365 | % subfunctions for projection 366 | function [L,S,rnk,nuclearNorm] = projectMax( L1L2, tau, lambda_S,SVDopts, L, S , stepsize, stepsizeS) 367 | if nargin >= 7 && ~isempty(stepsize) 368 | % we compute proximity, not projection 369 | tauL = -abs( tau*stepsize ); 370 | if nargin < 8 || isempty( stepsizeS ), stepsizeS = stepsize; end 371 | tauS = -abs( lambda_S*stepsizeS ); 372 | else 373 | tauL = abs(tau); 374 | tauS = abs(tau/lambda_S ); 375 | end 376 | 377 | % We project separately, so very easy 378 | if ~isempty(L) 379 | [L,rnk,nuclearNorm] = projNuclear(tauL, L,SVDopts); 380 | if tauL > 0 381 | % we did projection, so this should be feasible 382 | nuclearNorm = 0; 383 | end 384 | else 385 | rnk = []; 386 | nuclearNorm = 0; 387 | end 388 | 389 | if ~isempty(S) 390 | if tauS > 0 391 | if ~any(L1L2) 392 | % use the l1 norm 393 | projS = project_l1(tauS); 394 | elseif strcmpi(L1L2,'rows') 395 | projS = project_l1l2(tauS,true); 396 | elseif strcmpi(L1L2,'cols') 397 | projS = project_l1l2(tauS,false); 398 | else 399 | error('bad value for L1L2: should be [], ''rows'' or ''cols'' '); 400 | end 401 | S = projS( S ); 402 
| else 403 | if ~any(L1L2) 404 | % use the l1 norm 405 | % simple prox 406 | S = sign(S).*max(0, abs(S) - abs(tauS)); 407 | elseif strcmpi(L1L2,'rows') 408 | projS = prox_l1l2(abs(tauS)); 409 | S = projS(S,1); 410 | elseif strcmpi(L1L2,'cols') 411 | projS = prox_l1l2(abs(tauS)); 412 | S = projS(S',1)'; 413 | else 414 | error('bad value for L1L2: should be [], ''rows'' or ''cols'' '); 415 | end 416 | 417 | end 418 | end 419 | end 420 | 421 | function [X,rEst,nrm] = projNuclear( tau, X, SVDopts ) 422 | % Input must be a matrix, not a vector 423 | % Computes either the proximity operator of the nuclear norm (if tau<0) 424 | % or projection onto the nuclear norm ball of radius tau (if tau>0) 425 | 426 | persistent oldRank Vold iteration 427 | if nargin==0, oldRank=[]; Vold = []; iteration = 0; return; end 428 | if isempty(oldRank), rEst = 10; 429 | else, rEst = oldRank + 2; 430 | end 431 | if isempty(iteration), iteration = 0; end 432 | iteration = iteration + 1; 433 | [n1,n2] = size(X); 434 | minN = min( [n1,n2] ); % could set smaller to make nonconvex 435 | % For the first few iterations, we constrain rankMax 436 | switch iteration 437 | case 1 438 | rankMax = round(minN/4); 439 | case 2 440 | rankMax = round(minN/2); 441 | otherwise 442 | rankMax = minN; 443 | end 444 | 445 | style = SVDopts.SVDstyle; 446 | if tau==0, X=0*X; return; end 447 | 448 | switch style 449 | case 1 450 | % full SVD 451 | [U,S,V] = svd(X,'econ'); 452 | s = diag(S); 453 | if tau < 0 % we do prox 454 | s = max( 0, s - abs(tau) ); 455 | else 456 | s = project_simplex( tau, s ); 457 | end 458 | tt = s > 0; 459 | rEst = nnz(tt); 460 | U = U(:,tt); 461 | S = diag(s(tt)); 462 | V = V(:,tt); 463 | nrm = sum(s(tt)); 464 | case {2,3,4} 465 | % 2: use Matlab's sparse SVD 466 | % 3: use PROPACK 467 | % 4: use Joel Tropp's randomized SVD 468 | if style==2 469 | opts = struct('tol',1e-4); 470 | if rankMax==1, opts.tol = min(opts.tol,1e-6); end % important! 
471 | svdFcn = @(X,rEst)svds(X,rEst,'L',opts); 472 | elseif style==3 473 | opts = struct('tol',1e-4,'eta',eps); 474 | opts.delta = 10*opts.eta; 475 | % set eta to eps, but not 0 otherwise reorth is very slow 476 | if rankMax==1, opts.tol = min(opts.tol,1e-6); end % important! 477 | svdFcn = @(X,rEst)lansvd(X,rEst,'L',opts); 478 | elseif style == 4 479 | opts = []; 480 | if isfield(SVDopts,'nPower') && ~isempty( SVDopts.nPower ) 481 | nPower = SVDopts.nPower; 482 | else 483 | nPower = 2; 484 | end 485 | if isfield(SVDopts,'offset') && ~isempty( SVDopts.offset ) 486 | offset = SVDopts.offset; 487 | else 488 | offset = 5; 489 | end 490 | if isfield( SVDopts, 'warmstart' ) && SVDopts.warmstart==true ... 491 | && ~isempty(Vold) 492 | opts = struct( 'warmStart', Vold ); 493 | end 494 | ell = @(r) min([r+offset,n1,n2]); % number of samples to take 495 | svdFcn = @(X,rEst)randomizedSVD(X,rEst,ell(rEst),nPower,[],opts ); 496 | end 497 | 498 | ok = false; 499 | while ~ok 500 | rEst = min( [rEst,rankMax] ); 501 | [U,S,V] = svdFcn(X,rEst); 502 | s = diag(S); 503 | if tau < 0 504 | % we are doing prox 505 | lambda = abs(tau); 506 | else 507 | lambda = findTau(s,tau); 508 | end 509 | ok = ( min(s) < lambda ) || (rEst == rankMax); 510 | if ok, break; end 511 | rEst = 2*rEst; 512 | end 513 | rEst = min( length(find(s>lambda)), rankMax ); 514 | S = diag( s(1:rEst) - lambda ); 515 | U = U(:,1:rEst); 516 | V = V(:,1:rEst); 517 | nrm = sum( s(1:rEst) - lambda ); 518 | otherwise 519 | error('bad value for SVDstyle'); 520 | end 521 | if isempty(U) 522 | X = 0*X; 523 | else 524 | X = U*S*V'; 525 | end 526 | oldRank = size(U,2); 527 | if isfield( SVDopts, 'warmstart' ) && SVDopts.warmstart==true 528 | Vold = V; 529 | end 530 | end 531 | 532 | function x = project_simplex( q, x ) 533 | % projects onto the constraints sum(x)=q and x >= 0 534 | % Update: projects onto sum(x) <= q and x >= 0 535 | 536 | x = x.*( x > 0 ); % March 11 '14 537 | % March 11, fixing bug: we want to project onto the 
volume, not 538 | % the actual simplex (surface) 539 | if sum(x) <= q, return; end 540 | if q==0, x = 0*x; return; end 541 | 542 | s = sort( x, 'descend' ); 543 | if q < eps(s(1)) 544 | % eps(x) is the distance from abs(x) to the next larger 545 | % floating point number, i.e., in floating point arithmetic, 546 | % x + eps(x)/2 = x 547 | % eps(1) is about 2.2e-16 548 | 549 | error('Input is scaled so large compared to q that accurate computations are difficult'); 550 | % since then cs(1) = s(1) - q is not even guaranteed to be 551 | % smaller than q ! 552 | end 553 | cs = ( cumsum(s) - q ) ./ ( 1 : numel(s) )'; 554 | ndx = nnz( s > cs ); 555 | x = max( x - cs(ndx), 0 ); 556 | end 557 | 558 | function tau = findTau( s, lambda ) 559 | % Returns the shrinkage value necessary to shrink the vector s 560 | % so that it is in the lambda scaled simplex 561 | % Usually, s is the diagonal part of an SVD 562 | if all(s==0)||lambda==0, tau=0; return; end 563 | if numel(s)>length(s), error('s should be a vector, not a matrix'); end 564 | if ~issorted(flipud(s)), s = sort(s, 'descend'); end 565 | if any( s < 0 ), error('s should be non-negative'); end 566 | 567 | % project onto the simplex of radius lambda (not tau) 568 | % and use this to find "tau" (the shrinkage amount) 569 | % If we know s_i > tau for i = 1, ..., k, then 570 | % tau = ( sum(s(1:k)) - lambda )/k 571 | % But we don't know k, so find it: 572 | cs = (cumsum(s) - abs(lambda) )./(1:length(s))'; 573 | ndx = nnz( s > cs ); % >= 1 as long as lambda > 0 574 | tau = max(0,cs(ndx)); 575 | % We want to make sure we project onto sum(sigma) <= lambda 576 | % and not sum(sigma) == lambda, so do not allow negative tau 577 | 578 | end 579 | 580 | 581 | function [L,S,rnk,nuclearNorm] = projectSum( tau, lambda_S, L, S ) 582 | [m,n] = size(L); 583 | [U,Sigma,V] = svd(L,'econ'); 584 | s = diag(Sigma); 585 | wts = [ ones(length(s),1); lambda_S*ones(m*n,1) ]; 586 | proj = project_l1(tau,wts); 587 | sS = proj( [s;vec(S)] ); 588 | 
sProj = sS(1:length(s)); 589 | S = reshape( sS(length(s)+1:end), m, n ); 590 | L = U*diag(sProj)*V'; 591 | rnk = nnz( sProj ); 592 | nuclearNorm = sum(sProj); 593 | end 594 | 595 | 596 | function [stepsizeL, stepsizeS] = compute_BB_stepsize( ... 597 | Grad, Grad_old, L, L_old, S, S_old, BB_split, BB_type) 598 | 599 | if ~BB_split 600 | % we take a Barzilai-Borwein stepsize in the full variable 601 | yk = Grad(:) - Grad_old(:); 602 | yk = [yk;yk]; % to account for both variables 603 | sk = [L(:) - L_old(:); S(:) - S_old(:) ]; 604 | if BB_type == 1 605 | % Default. The bigger stepsize 606 | stepsize = norm(sk)^2/(sk'*yk); 607 | elseif BB_type == 2 608 | stepsize = sk'*yk/(norm(yk)^2); 609 | end 610 | [stepsizeL, stepsizeS] = deal( stepsize ); 611 | 612 | elseif BB_split 613 | % treat L and S variables separately 614 | % Doesn't seem to work well. 615 | yk = Grad(:) - Grad_old(:); 616 | skL = L(:) - L_old(:); 617 | skS = S(:) - S_old(:); 618 | if BB_type == 1 619 | % Default. The bigger stepsize 620 | stepsizeL = norm(skL)^2/(skL'*yk); 621 | stepsizeS = norm(skS)^2/(skS'*yk); 622 | elseif BB_type == 2 623 | stepsizeL = skL'*yk/(norm(yk)^2); 624 | stepsizeS = skS'*yk/(norm(yk)^2); 625 | end 626 | end 627 | end 628 | 629 | 630 | function op = project_l1( q , d) 631 | %PROJECT_L1 Projection onto the scaled 1-norm ball. 632 | % OP = PROJECT_L1( Q ) returns an operator implementing the 633 | % indicator function for the 1-norm ball of radius q, 634 | % { X | norm( X, 1 ) <= q }. Q is optional; if omitted, 635 | % Q=1 is assumed. But if Q is supplied, it must be a positive 636 | % real scalar. 637 | % 638 | % OP = PROJECT_L1( Q, D ) uses a scaled 1-norm ball of radius q, 639 | % { X | norm( D.*X, 1 ) <= q }. D should be the same size as X 640 | % and non-negative (some zero entries are OK). 641 | 642 | % Note: theoretically, this can be done in O(n) 643 | % but in practice, worst-case O(n) median sorts are slow 644 | % and instead average-case O(n) median sorts are used.
645 | % But in matlab, the median is no faster than sort 646 | % (the sort is probably quicksort, O(n log n) expected, with 647 | % good constants, but O(n^2) worst-case). 648 | % So, we use the naive implementation with the sort, since 649 | % that is, in practice, the fastest. 650 | 651 | if nargin == 0, 652 | q = 1; 653 | elseif ~isnumeric( q ) || ~isreal( q ) || numel( q ) ~= 1 || q <= 0, 654 | error( 'Argument must be positive.' ); 655 | end 656 | if nargin < 2 || isempty(d) || numel(d)==1 657 | if nargin>=2 && ~isempty(d) 658 | % d is a scalar, so norm( d*x ) <= q is same as norm(x)<=q/d 659 | if d==0 660 | error('If d==0 in proj_l1, the set is just {0}, so use proj_0'); 661 | elseif d < 0 662 | error('Require d >= 0'); 663 | end 664 | q = q/d; 665 | end 666 | op = @(varargin)proj_l1_q(q, varargin{:} ); 667 | else 668 | if any(d<0) 669 | error('All entries of d must be non-negative'); 670 | end 671 | op = @(varargin)proj_l1_q_d(q, d, varargin{:} ); 672 | end 673 | 674 | % This is modified from TFOCS, Nov 26 2013, Stephen Becker 675 | % Note: removing "v" (value) output from TFOCS code 676 | % Also removing extraneous inputs 677 | function x = proj_l1_q( q, x, varargin ) 678 | myReshape = @(x) x; % do nothing 679 | if size(x,2) > 1 680 | if ndims(x) > 2, error('You must modify this code to deal with tensors'); end 681 | myReshape = @(y) reshape( y, size(x,1), size(x,2) ); 682 | x = x(:); % make it into a vector 683 | end 684 | s = sort(abs(nonzeros(x)),'descend'); 685 | cs = cumsum(s); 686 | % ndx = find( cs - (1:numel(s))' .* [ s(2:end) ; 0 ] >= q, 1 ); 687 | ndx = find( cs - (1:numel(s))' .* [ s(2:end) ; 0 ] >= q+2*eps(q), 1 ); % For stability 688 | if ~isempty( ndx ) 689 | thresh = ( cs(ndx) - q ) / ndx; 690 | x = x .* ( 1 - thresh ./ max( abs(x), thresh ) ); % May divide very small numbers 691 | end 692 | x = myReshape(x); 693 | end 694 | 695 | % Allows scaling. 
Added Feb 21 2014 696 | function x = proj_l1_q_d( q, d, x, varargin ) 697 | myReshape = @(x) x; % do nothing 698 | if size(x,2) > 1 699 | if ndims(x) > 2, error('You must modify this code to deal with tensors'); end 700 | myReshape = @(y) reshape( y, size(x,1), size(x,2) ); 701 | x = x(:); % make it into a vector 702 | end 703 | [goodInd,j,xOverD] = find( x./ d ); 704 | [lambdas,srt] = sort(abs(xOverD),'descend'); 705 | s = abs(x(goodInd).*d(goodInd)); 706 | s = s(srt); 707 | dd = d(goodInd).^2; 708 | dd = dd(srt); 709 | cs = cumsum(s); 710 | cd = cumsum(dd); 711 | ndx = find( cs - lambdas.*cd >= q+2*eps(q), 1, 'first'); 712 | if ~isempty( ndx ) 713 | ndx = ndx - 1; 714 | lambda = ( cs(ndx) - q )/cd(ndx); 715 | x = sign(x).*max( 0, abs(x) - lambda*d ); 716 | end 717 | x = myReshape(x); 718 | end 719 | 720 | 721 | end 722 | 723 | 724 | 725 | % Copied from TFOCS, April 17 2015 726 | function op = project_l1l2( q, rowNorms ) 727 | %PROJ_L1L2 L1-L2 block norm: sum of L2 norms of rows. 728 | % OP = PROJ_L1L2( q ) implements the constraint set 729 | % {X | sum_{i=1:m} norm(X(i,:),2) <= 1 } 730 | % where X is a m x n matrix. If n = 1, this is equivalent 731 | % to PROJ_L1. If m=1, this is equivalent to PROJ_L2 732 | % 733 | % Q is optional; if omitted, Q=1 is assumed. But if Q is supplied, 734 | % then it must be positive and real and a scalar. 735 | % 736 | % OP = PROJ_L1L2( q, rowNorms ) 737 | % will either do the sum of the l2-norms of rows if rowNorms=true 738 | % (the default), or the sum of the l2-norms of columns if 739 | % rowNorms = false. 740 | % 741 | % Known issues: doesn't yet work with complex-valued data. 742 | % Should be easy to fix, so email developers if this is 743 | % needed for your problem. 
744 | % 745 | % Dual: prox_linfl2.m [not yet available] 746 | % See also prox_l1l2.m, proj_l1.m, proj_l2.m 747 | 748 | if nargin == 0 || isempty(q), 749 | q = 1; 750 | elseif ~isnumeric( q ) || ~isreal( q ) || any(q <= 0) ||numel(q)>1, 751 | error( 'Argument must be a positive real scalar.' ); 752 | end 753 | 754 | if nargin<2 || isempty(rowNorms) 755 | rowNorms = true; 756 | end 757 | 758 | if rowNorms 759 | op = @(x,varargin)prox_f_rows(q,x); 760 | else 761 | op = @(x,varargin)prox_f_cols(q,x); 762 | end 763 | 764 | function X = prox_f_rows(tau,X) 765 | nrms = sqrt( sum( X.^2, 2 ) ); 766 | % When we include a row of x, corresponding to row y of Y, 767 | % its contribution is norm(y)-lambda 768 | % So we have sum_{i=1}^m max(0, norm(y_0)-lambda) 769 | % So, basically project nrms onto the l1 ball... 770 | s = sort( nrms, 'descend' ); 771 | cs = cumsum(s); 772 | 773 | ndx = find( cs - (1:numel(s))' .* [ s(2:end) ; 0 ] >= tau+2*eps(tau), 1 ); % For stability 774 | 775 | if ~isempty( ndx ) 776 | thresh = ( cs(ndx) - tau ) / ndx; 777 | % Apply to relevant rows 778 | d = max( 0, 1-thresh./nrms ); 779 | m = size(X,1); 780 | X = spdiags( d, 0, m, m )*X; 781 | end 782 | end 783 | 784 | function X = prox_f_cols(tau,X) 785 | nrms = sqrt( sum( X.^2, 1 ) ).'; 786 | s = sort( nrms, 'descend' ); 787 | cs = cumsum(s); 788 | 789 | ndx = find( cs - (1:numel(s))' .* [ s(2:end) ; 0 ] >= tau+2*eps(tau), 1 ); % For stability 790 | 791 | if ~isempty( ndx ) 792 | thresh = ( cs(ndx) - tau ) / ndx; 793 | d = max( 0, 1-thresh./nrms ); 794 | n = size(X,2); 795 | X = X*spdiags( d, 0, n,n ); 796 | end 797 | end 798 | 799 | end % end project_l1l2.m 800 | 801 | 802 | 803 | % Copied from TFOCS, April 17 2015 804 | function op = prox_l1l2( q ) 805 | %PROX_L1L2 L1-L2 block norm: sum of L2 norms of rows. 806 | % OP = PROX_L1L2( q ) implements the nonsmooth function 807 | % OP(X) = q * sum_{i=1:m} norm(X(i,:),2) 808 | % where X is an m x n matrix.
If n = 1, this is equivalent 809 | % to PROX_L1 810 | % Q is optional; if omitted, Q=1 is assumed. But if Q is supplied, 811 | % then it must be positive and real. 812 | % If Q is a vector, it must be m x 1, and in this case, 813 | % the weighted norm OP(X) = sum_{i} Q(i)*norm(X(i,:),2) 814 | % is calculated. 815 | % 816 | % Dual: proj_linfl2.m 817 | % See also proj_linfl2.m, proj_l1l2.m 818 | 819 | if nargin == 0, 820 | q = 1; 821 | elseif ~isnumeric( q ) || ~isreal( q ) || any(q <= 0), 822 | error( 'Argument must be positive.' ); 823 | end 824 | op = @(x,t)prox_f(q,x,t); 825 | 826 | function x = prox_f(q,x,t) 827 | if nargin < 3, 828 | error( 'Not enough arguments.' ); 829 | end 830 | v = sqrt( sum(x.^2,2) ); 831 | s = 1 - 1 ./ max( v ./ ( t .* q ), 1 ); 832 | m = length(s); 833 | x = spdiags(s,0,m,m)*x; 834 | end 835 | 836 | end 837 | -------------------------------------------------------------------------------- /solvers/solver_split_SPCP.m: -------------------------------------------------------------------------------- 1 | function [L,S,errHist] = solver_split_SPCP(X,params) 2 | % [L,S,errHist] = solver_split_SPCP(Y,params) 3 | % Solves the problem 4 | % minimize_{U,V} lambda_L/2 (||U||_F^2 + ||V||_F^2) + phi(U,V) 5 | % 6 | % where 7 | % 8 | % phi(U,V) = min_S .5|| U*V' + S - Y ||_F^2 + lambda_S ||S||_1. 9 | % 10 | % errHist(:,1) is a record of runtime 11 | % errHist(:,2) is a record of the full objective (.5*resid^2 + lambda_L, 12 | % etc.)
13 | % errHist(:,3) is the output of params.errFcn if provided 14 | % 15 | % params is a structure with optional fields 16 | % errFcn to compute objective or error (empty) 17 | % k desired rank of L (10) 18 | % U0,V0 initial points for L0 = U0*V0' (random) 19 | % gpu 1 for gpu, 0 for cpu (0) 20 | % lambdaS l1 penalty weight (0.8) 21 | % lambdaL nuclear norm penalty weight (115) 22 | % 23 | tic; 24 | 25 | [m, n] = size(X); 26 | params.m = m; 27 | params.n = n; 28 | 29 | errFcn = setOpts(params,'errFcn',[]); 30 | k = setOpts(params,'k',10); 31 | U0 = setOpts(params,'U0',randn(m,k)); 32 | V0 = setOpts(params,'V0',randn(n,k)); 33 | gpu = setOpts(params,'gpu',0); 34 | lambdaS = setOpts(params,'lambdaS',0.8); % default for demo_escalator 35 | lambdaL = setOpts(params,'lambdaL',115); % default for demo_escalator 36 | 37 | % check if we are on the GPU 38 | if strcmp(class(X), 'gpuArray') 39 | gpu = 1; 40 | end 41 | 42 | if gpu 43 | U0 = gpuArray(U0); 44 | V0 = gpuArray(V0); 45 | end 46 | 47 | % initial point 48 | R = [vec(U0); vec(V0)]; 49 | 50 | % set necessary parameters 51 | params.lambdaS = lambdaS; 52 | params.lambdaL = lambdaL; 53 | params.gpu = gpu; 54 | 55 | % objective/gradient map for L-BFGS solver 56 | ObjFunc = @(x)func_split_spcp(x,X,params,errFcn); 57 | 58 | func_split_spcp(); 59 | 60 | % solve using L-BFGS 61 | [x,~,~] = lbfgs_gpu(ObjFunc,R,params); 62 | 63 | errHist=func_split_spcp(); 64 | if ~isempty( errHist ) 65 | figure; 66 | semilogy( errHist ); 67 | end 68 | 69 | % prepare output 70 | U = reshape(x(1:m*k),m,k); 71 | V = reshape(x(m*k+1:m*k+k*n),n,k); 72 | S = func_split_spcp(x,X,params,'S'); 73 | S = reshape(S,m,n); 74 | 75 | L = U*V'; 76 | 77 | end 78 | 79 | 80 | function out = setOpts(options, opt, default) 81 | if isfield(options, opt) 82 | out = options.(opt); 83 | else 84 | out = default; 85 | end 86 | end -------------------------------------------------------------------------------- /utilities/.DS_Store: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenbeckr/fastRPCA/44dfee56f142ebffe5a7003578868e84bd4330b7/utilities/.DS_Store -------------------------------------------------------------------------------- /utilities/WolfeLineSearch.m: -------------------------------------------------------------------------------- 1 | function [t,f_new,g_new,funEvals] = WolfeLineSearch(... 2 | x,t,d,f,g,gtd,c1,c2,maxLS,progTol,funObj) 3 | % 4 | % Bracketing Line Search to Satisfy Wolfe Conditions 5 | % 6 | % Based on code from ''minFunc'' (Mark Schmidt, 2005) 7 | % 8 | % Inputs: 9 | % x: starting location 10 | % t: initial step size 11 | % d: descent direction 12 | % f: function value at starting location 13 | % g: gradient at starting location 14 | % gtd: directional derivative at starting location 15 | % c1: sufficient decrease parameter 16 | % c2: curvature parameter 17 | % debug: display debugging information (not an input in this version) 18 | % LS_interp: type of interpolation (not an input in this version) 19 | % maxLS: maximum number of iterations 20 | % progTol: minimum allowable step length 21 | % doPlot: do a graphical display of interpolation (not an input in this version) 22 | % funObj: objective function 23 | % varargin: parameters of objective function (not an input in this version) 24 | % 25 | % Outputs: 26 | % t: step length 27 | % f_new: function value at x+t*d 28 | % g_new: gradient value at x+t*d 29 | % funEvals: number of function evaluations performed by line search 30 | % H: Hessian at initial guess (not returned in this version) 31 | 32 | % Evaluate the Objective and Gradient at the Initial Step 33 | [f_new,g_new] = funObj(x+t*d); 34 | 35 | funEvals = 1; 36 | gtd_new = g_new'*d; 37 | 38 | % Bracket an Interval containing a point satisfying the 39 | % Wolfe criteria 40 | 41 | LSiter = 0; 42 | t_prev = 0; 43 | f_prev = f; 44 | g_prev = g; 45 | nrmD = max(abs(d)); 46 | done = 0; 47 | 48 | while LSiter < maxLS 49 | 50 | %% Bracketing Phase 51 | 52 | if f_new > f + c1*t*gtd || (LSiter > 1 && f_new >= f_prev) 53 |
bracket = [t_prev t]; 54 | bracketFval = [f_prev f_new]; 55 | bracketGval = [g_prev g_new]; 56 | break; 57 | elseif abs(gtd_new) <= -c2*gtd 58 | bracket = t; 59 | bracketFval = f_new; 60 | bracketGval = g_new; 61 | done = 1; 62 | break; 63 | elseif gtd_new >= 0 64 | bracket = [t_prev t]; 65 | bracketFval = [f_prev f_new]; 66 | bracketGval = [g_prev g_new]; 67 | break; 68 | end 69 | t_prev = t; 70 | maxStep = t*10; 71 | 72 | t = maxStep; 73 | 74 | f_prev = f_new; 75 | g_prev = g_new; 76 | 77 | [f_new,g_new] = funObj(x + t*d); 78 | 79 | funEvals = funEvals + 1; 80 | gtd_new = g_new'*d; 81 | LSiter = LSiter+1; 82 | end 83 | 84 | if LSiter == maxLS 85 | bracket = [0 t]; 86 | bracketFval = [f f_new]; 87 | bracketGval = [g g_new]; 88 | end 89 | 90 | %% Zoom Phase 91 | 92 | % We now either have a point satisfying the criteria, or a bracket 93 | % surrounding a point satisfying the criteria 94 | % Refine the bracket until we find a point satisfying the criteria 95 | insufProgress = 0; 96 | 97 | while ~done && LSiter < maxLS 98 | 99 | % Find High and Low Points in bracket 100 | [f_LO, LOpos] = min(bracketFval); 101 | HIpos = -LOpos + 3; 102 | 103 | % Compute new trial value 104 | t = mean(bracket); 105 | 106 | 107 | % Test that we are making sufficient progress 108 | if min(max(bracket)-t,t-min(bracket))/(max(bracket)-min(bracket)) < 0.1 109 | if insufProgress || t>=max(bracket) || t <= min(bracket) 110 | if false % 'debug' is not an input to this version, so the debug print is disabled 111 | fprintf(', Evaluating at 0.1 away from boundary\n'); 112 | end 113 | if abs(t-max(bracket)) < abs(t-min(bracket)) 114 | t = max(bracket)-0.1*(max(bracket)-min(bracket)); 115 | else 116 | t = min(bracket)+0.1*(max(bracket)-min(bracket)); 117 | end 118 | insufProgress = 0; 119 | else 120 | insufProgress = 1; 121 | end 122 | else 123 | insufProgress = 0; 124 | end 125 | 126 | % Evaluate new point 127 | [f_new,g_new] = funObj(x + t*d); 128 | 129 | funEvals = funEvals + 1; 130 | gtd_new = g_new'*d; 131 | LSiter = LSiter+1; 132 | 133 | armijo = f_new < f +
c1*t*gtd; 134 | if ~armijo || f_new >= f_LO 135 | % Armijo condition not satisfied or not lower than lowest 136 | % point 137 | bracket(HIpos) = t; 138 | bracketFval(HIpos) = f_new; 139 | bracketGval(:,HIpos) = g_new; 140 | else 141 | if abs(gtd_new) <= - c2*gtd 142 | % Wolfe conditions satisfied 143 | done = 1; 144 | elseif gtd_new*(bracket(HIpos)-bracket(LOpos)) >= 0 145 | % Old HI becomes new LO 146 | bracket(HIpos) = bracket(LOpos); 147 | bracketFval(HIpos) = bracketFval(LOpos); 148 | bracketGval(:,HIpos) = bracketGval(:,LOpos); 149 | end 150 | % New point becomes new LO 151 | bracket(LOpos) = t; 152 | bracketFval(LOpos) = f_new; 153 | bracketGval(:,LOpos) = g_new; 154 | end 155 | 156 | if ~done && abs(bracket(1)-bracket(2))*nrmD < progTol 157 | break; 158 | end 159 | 160 | end 161 | 162 | %% 163 | 164 | [~, LOpos] = min(bracketFval); 165 | t = bracket(LOpos); 166 | f_new = bracketFval(LOpos); 167 | g_new = bracketGval(:,LOpos); 168 | 169 | end -------------------------------------------------------------------------------- /utilities/errorFunction.m: -------------------------------------------------------------------------------- 1 | function [er,output2,output3] = errorFunction( errFcn, varargin ) 2 | % err = errorFunction( errFcn, arguments ) 3 | % just returns the evaluation of errFcn( arguments ). 4 | % This seems pointless, but the point is that this function 5 | % has memory of all the errors, so you can look at the errors later. 6 | % 7 | % errHist = errorFunction() 8 | % will return the history of errors and also clear the memory 9 | % 10 | % ... = errorFunction( ticReference ) 11 | % will allow the function to enable time logging 12 | % It will subtract off the time used for the error function 13 | % 14 | % [errHist, timeLog] = errorFunction( ) 15 | % will output the time logging information.
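The pattern errorFunction.m implements with persistent variables (record every error value, and subtract the logger's own runtime so it does not distort the reported solver time) can be sketched as a small Python class. This is our illustrative analogue only; the class name `ErrorLogger` and its attributes are not part of fastRPCA:

```python
import time

class ErrorLogger:
    """Sketch of the errorFunction.m pattern: wrap an error function so that
    each call is recorded, with the logger's cumulative cost subtracted
    from the wall-clock log (like erHist / timeLog / timeCorrection)."""
    def __init__(self, err_fcn, tic_reference=None):
        self.err_fcn = err_fcn
        # analogue of ticReference: the solver's start time
        self.tic = tic_reference if tic_reference is not None else time.perf_counter()
        self.history = []        # like the persistent erHist
        self.time_log = []       # like the persistent timeLog
        self.correction = 0.0    # like timeCorrection: seconds spent in err_fcn

    def __call__(self, *args):
        t0 = time.perf_counter()
        er = self.err_fcn(*args)
        self.correction += time.perf_counter() - t0
        self.history.append(er)
        # log when the error was measured, minus time spent measuring
        self.time_log.append(time.perf_counter() - self.tic - self.correction)
        return er
```

A solver would call the logger in its main loop and read `history`/`time_log` afterwards, just as the MATLAB code calls `errorFunction()` with no arguments to retrieve (and clear) the persistent history.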
16 | % The time log shows at what time, relative to ticReference, 17 | % the error was recorded 18 | % 19 | % The function uses persistent variables so don't forget to clear it 20 | % and set it with a new ticReference! 21 | % 22 | % Stephen Becker, March 6 2014 stephen.beckr@gmail.com 23 | % March 11 2014, updated to subtract off the time involved 24 | % for computing the error. Also allows error function 25 | % to return a row vector instead of a scalar 26 | % March 14 2014, if ticReference is really a cell array with 27 | % {ticReference,maxTime}, then this will throw an error 28 | % after reaching "maxTime" seconds 29 | 30 | persistent erHist timeLog ticReference timeCorrection maxTime 31 | if nargin <= 1 32 | if nargin == 1 && ~isempty(errFcn) 33 | if iscell( errFcn ) 34 | ticReference = errFcn{1}; 35 | maxTime = errFcn{2}; 36 | else 37 | maxTime = Inf; 38 | ticReference = errFcn; 39 | end 40 | else 41 | ticReference = []; 42 | end 43 | er = erHist; 44 | if nargout >= 2, output2 = timeLog; end 45 | if nargout >= 3, output3 = timeCorrection; end 46 | erHist = []; 47 | timeLog= []; 48 | timeCorrection = []; 49 | return 50 | end 51 | if isempty( maxTime ), maxTime = Inf; end 52 | 53 | tc1 = tic; 54 | er = errFcn( varargin{:} ); 55 | erTime = toc(tc1); 56 | erHist(end+1,:) = er; 57 | 58 | % And mark at what time we took this error measurement 59 | if ~isempty( ticReference ) 60 | % And we also want to add "erTime" to ticReference 61 | % but this isn't straightforward since ticReference is some huge 62 | % uint64 and not in seconds. 
So instead we update a correction 63 | if isempty( timeCorrection), timeCorrection = 0; end 64 | timeCorrection = timeCorrection + erTime; 65 | timeLog(end+1) = toc( ticReference ) - timeCorrection; 66 | 67 | if timeLog(end) > maxTime 68 | error('errorFunction:timedOut','Reached maximum allowed time'); 69 | end 70 | end -------------------------------------------------------------------------------- /utilities/func_split_spcp.m: -------------------------------------------------------------------------------- 1 | function [f, df] = func_split_spcp(x,Y,params,errFcn) 2 | % [f, df] = func_split_spcp(x,Y,params,errFunc) 3 | % [errHist] = func_split_spcp(); 4 | % [S] = func_split_spcp(x,Y,params,'S'); 5 | % 6 | % Compute function and gradient of split-SPCP objective 7 | % 8 | % lambda_L/2 (||U||_F^2 + ||V||_F^2) + phi(U,V) 9 | % 10 | % where 11 | % 12 | % phi(U,V) = min_S .5|| U*V' + S - Y ||_F^2 + lambda_S ||S||_1. 13 | 14 | persistent errHist 15 | 16 | if nargin==0 17 | f = errHist; 18 | errHist = []; 19 | return; 20 | end 21 | 22 | if nargin<3, params=[]; end 23 | m = params.m; 24 | n = params.n; 25 | k = params.k; 26 | lambdaL = params.lambdaL; 27 | lambdaS = params.lambdaS; 28 | useGPU = params.gpu; 29 | 30 | U = reshape(x(1:m*k),m,k); 31 | V = reshape(x(m*k+1:m*k+n*k),n,k); 32 | 33 | L = U*V'; 34 | LY = vec(Y-L); 35 | 36 | soft_thresh = @(LY,lambdaS) sign(LY).*max(abs(LY) - lambdaS,0); 37 | S = soft_thresh(LY,lambdaS); 38 | if nargout==1 39 | f = S; 40 | return; 41 | end 42 | SLY = reshape(S-LY,m,n); 43 | 44 | fS = norm(SLY,'fro')^2/2 + lambdaS*norm(S(:),1); 45 | 46 | f = lambdaL/2*(norm(U,'fro')^2 + norm(V,'fro')^2) + fS; 47 | 48 | if nargout > 1 49 | 50 | if useGPU 51 | df = gpuArray.zeros(m*k + n*k,1); 52 | else 53 | df = zeros(m*k + n*k,1); 54 | end 55 | 56 | df(1:m*k) = vec(lambdaL*U) + vec((SLY)*V); 57 | df(m*k+1:m*k+n*k) = vec(lambdaL*V) + vec((SLY)'*U); 58 | 59 | end 60 | 61 | errHist(end+1,1) = toc; 62 | errHist(end,2) = gather(f); 63 | 64 | if 
~isempty(errFcn) 65 | errHist(end,3) = errFcn(x); 66 | end 67 | 68 | tic; 69 | 70 | end 71 | 72 | 73 | 74 | function out = setOpts( params, field, default ) 75 | if ~isfield( params, field ) 76 | params.(field) = default; 77 | end 78 | out = params.(field); 79 | params = rmfield( params, field ); % so we can do a check later 80 | end 81 | -------------------------------------------------------------------------------- /utilities/holdMarker.m: -------------------------------------------------------------------------------- 1 | function symb = holdMarker(reset,symbList2) 2 | %HOLDMARKER Returns a new symbol for use in plotting or rotates through 3 | % symb = holdMarker() 4 | % returns a new symbol for use in plotting 5 | % 6 | % holdMarker('reset') 7 | % will reset the rotation 8 | % (if there is no output, then the rotation is reset) 9 | % holdMarker( {j} ) 10 | % will reset the rotation to start at symbol position "j" 11 | % 12 | % Use this with plot( ..., 'marker', holdMarker() ) 13 | % 14 | % holdMarker('reset',symbList) sets the symbol list (should be cell array) 15 | % 16 | % holdMarker( h ) 17 | % will apply this function to the handles in h 18 | % (if h=0, uses h = get(gca,'children') ) 19 | % ex. 20 | % h = plot( myMatrix ); 21 | % holdMarker(); % reset 22 | % holdMarker( h ); % apply to all the lines in the plot -- each 23 | % % will have a unique symbol. 24 | % 25 | % see also: 26 | % set(gcf,'DefaultAxesColorOrder',[1 0 0;0 1 0;0 0 1],... 27 | % 'DefaultAxesLineStyleOrder','-|--|:') 28 | % (using a 0 instead of gcf will make it permanent for your matlab session) 29 | % Stephen Becker, May 2011 30 | % 31 | % See also add_few_markers.m 32 | 33 | persistent currentLocation symbListP 34 | 35 | if isempty( currentLocation) 36 | currentLocation = 1; 37 | end 38 | 39 | if isempty( symbListP ) 40 | %symbList = { '+','o','*','.','x','s',... 41 | %'d','^','v','>','<','p','h' }; 42 | 43 | % Exclude the "." 44 | symbList = { '+','o','*','x','s',... 
45 | 'd','^','v','>','<','p','h' }; 46 | else 47 | % symbList = symbListP; 48 | % May 2013 49 | if ~iscell( symbListP ) 50 | symbList = cell(size(symbListP)); 51 | for jj = 1:length( symbListP ) 52 | symbList{jj} = symbListP(jj); 53 | end 54 | else 55 | symbList = symbListP; 56 | end 57 | end 58 | 59 | % April 2013 60 | if nargin > 0 && isequal(reset,0) 61 | reset = get( gca, 'children' ); 62 | end 63 | 64 | do_reset = false; 65 | if nargin > 0 66 | if ischar(reset) 67 | do_reset = true; 68 | elseif iscell( reset ) 69 | currentLocation = reset{1}; 70 | elseif all( ishandle(reset) ) && ~( isscalar(reset) && ( reset == 1) ) 71 | % we have handles 72 | handles = reset; 73 | for h = handles(:)' 74 | symb = get_symbol; 75 | set( h, 'Marker', symb ); 76 | end 77 | elseif isscalar(reset) 78 | currentLocation = reset; 79 | else 80 | error('invalid input'); 81 | end 82 | end 83 | % if nargout == 0 || do_reset 84 | if false; 85 | do_reset = true; 86 | else 87 | symb = get_symbol; 88 | end 89 | 90 | if do_reset 91 | currentLocation = 1; 92 | if nargin >= 2 93 | symbListP = symbList2; 94 | else 95 | symbListP = []; 96 | end 97 | return; 98 | end 99 | 100 | 101 | 102 | % This subfunction has side-effects 103 | function symb = get_symbol 104 | if currentLocation > length(symbList), currentLocation = 1; end 105 | symb = symbList{ currentLocation }; 106 | currentLocation = currentLocation + 1; 107 | end 108 | 109 | 110 | 111 | end % end of main function 112 | -------------------------------------------------------------------------------- /utilities/lbfgsAdd.m: -------------------------------------------------------------------------------- 1 | function [S,Y,YS,lbfgs_start,lbfgs_end,Hdiag,skipped] = lbfgsAdd(y,s,S,Y,YS,lbfgs_start,lbfgs_end,Hdiag) 2 | ys = y'*s; 3 | skipped = 0; 4 | corrections = size(S,2); 5 | if ys > 1e-10 6 | if lbfgs_end < corrections 7 | lbfgs_end = lbfgs_end+1; 8 | if lbfgs_start ~= 1 9 | if lbfgs_start == corrections 10 | lbfgs_start = 1; 11 | else 12 | 
lbfgs_start = lbfgs_start+1; 13 | end 14 | end 15 | else 16 | lbfgs_start = min(2,corrections); 17 | lbfgs_end = 1; 18 | end 19 | 20 | S(:,lbfgs_end) = s; 21 | Y(:,lbfgs_end) = y; 22 | YS(lbfgs_end) = ys; 23 | 24 | % Update scale of initial Hessian approximation 25 | Hdiag = ys/(y'*y); 26 | else 27 | skipped = 1; 28 | end -------------------------------------------------------------------------------- /utilities/lbfgsProd.m: -------------------------------------------------------------------------------- 1 | function [d] = lbfgsProd(g,S,Y,YS,lbfgs_start,lbfgs_end,Hdiag) 2 | % BFGS Search Direction 3 | % 4 | % This function returns the (L-BFGS) approximate inverse Hessian, 5 | % multiplied by the negative gradient 6 | 7 | % Determine if we are working on the GPU 8 | if strcmp(class(S), 'gpuArray') 9 | gpu = 1; 10 | else 11 | gpu = 0; 12 | end 13 | 14 | % Set up indexing 15 | [nVars,maxCorrections] = size(S); 16 | if lbfgs_start == 1 17 | ind = 1:lbfgs_end; 18 | nCor = lbfgs_end-lbfgs_start+1; 19 | else 20 | ind = [lbfgs_start:maxCorrections 1:lbfgs_end]; 21 | nCor = maxCorrections; 22 | end 23 | 24 | if gpu 25 | al = gpuArray.zeros(nCor,1); 26 | be = gpuArray.zeros(nCor,1); 27 | else 28 | al = zeros(nCor,1); 29 | be = zeros(nCor,1); 30 | end 31 | 32 | d = -g; 33 | for j = 1:length(ind) 34 | i = ind(end-j+1); 35 | al(i) = (S(:,i)'*d)/YS(i); 36 | d = d-al(i)*Y(:,i); 37 | end 38 | 39 | % Multiply by Initial Hessian 40 | d = Hdiag*d; 41 | 42 | for i = ind 43 | be(i) = (Y(:,i)'*d)/YS(i); 44 | d = d + S(:,i)*(al(i)-be(i)); 45 | end 46 | -------------------------------------------------------------------------------- /utilities/lbfgs_gpu.m: -------------------------------------------------------------------------------- 1 | function [x,f,exitflag] = lbfgs_gpu(funObj,x0,options) 2 | % [x,f,exitflag] = lbfgs_gpu(funObj,x0,options) 3 | % 4 | % Limited-memory BFGS with Wolfe line-search and GPU support. 
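The lbfgsProd routine above is the classical L-BFGS two-loop recursion. A NumPy sketch of the same recursion, under the simplifying assumption that the memory is a plain oldest-first list of (s, y) pairs rather than the circular buffer that lbfgsAdd maintains (the function name `lbfgs_direction` is ours):

```python
import numpy as np

def lbfgs_direction(g, pairs, H0=1.0):
    """Two-loop recursion: approximate -(inverse Hessian)*g from stored
    (s, y) correction pairs, as in lbfgsProd but without circular indexing.
    Pairs with y'*s <= 0 should be skipped before calling (cf. lbfgsAdd)."""
    d = -np.asarray(g, dtype=float)
    alphas = []
    for s, y in reversed(pairs):             # newest to oldest
        a = s.dot(d) / y.dot(s)
        alphas.append(a)
        d -= a * y
    d *= H0                                  # initial Hessian scaling (Hdiag)
    for (s, y), a in zip(pairs, reversed(alphas)):   # oldest to newest
        b = y.dot(d) / y.dot(s)
        d += s * (a - b)
    return d
```

With an empty memory this reduces to steepest descent, matching the `i == 1` branch of the main loop below; with pairs from an exact quadratic it reproduces the Newton direction on the subspace the pairs span.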
5 | % Based on code from ''minFunc'' (Mark Schmidt, 2005) 6 | % 7 | % Options: 8 | % GPU - 0 is off-GPU, 1 is on-GPU 9 | % maxIter - number of iterations 10 | % store - number of corrections to store in memory 11 | % (a larger store typically leads to faster convergence). 12 | % verbose - 0 turns off printing, 1 turns on printing 13 | % c1 - Sufficient Decrease for Armijo condition (1e-4) 14 | % c2 - Curvature Decrease for Wolfe conditions (0.9) 15 | % progTol - Termination tolerance on step size and objective decrease (1e-9) 16 | % optTol - Termination tolerance on gradient size (1e-8) 17 | % 18 | % Inputs: 19 | % funObj - is a function handle (objective/gradient map) 20 | % x0 - is a starting vector 21 | % options - is a struct containing parameters 22 | % 23 | % Outputs: 24 | % x is the minimizer found 25 | % f is the function value at the minimum found 26 | % exitflag returns an exit condition 27 | % 28 | 29 | if nargin < 3 30 | options = []; 31 | end 32 | 33 | % Set parameters 34 | maxIter = setOpts(options,'MaxIter',100); 35 | verbose = setOpts(options,'verbose',1); 36 | store = setOpts(options,'store',100); 37 | gpu = setOpts(options,'gpu',0); 38 | c1 = setOpts(options,'c1',1e-4); 39 | c2 = setOpts(options,'c2',0.9); 40 | progTol = setOpts(options,'progTol',1e-9); 41 | optTol = setOpts(options,'optTol',1e-8); 42 | 43 | % Initialize 44 | p = length(x0); 45 | d = zeros(p,1); 46 | x = x0; 47 | t = 1; 48 | 49 | funEvalMultiplier = 1; 50 | 51 | % Evaluate Initial Point 52 | [f,g] = funObj(x); 53 | funEvals = 1; 54 | 55 | % Output Log 56 | if verbose 57 | fprintf('%10s %10s %15s %15s %15s\n','Iteration','FunEvals','Step Length','Function Val','Opt Cond'); 58 | end 59 | 60 | % Compute optimality of initial point 61 | optCond = max(abs(g)); 62 | 63 | % Exit if initial point is optimal 64 | if optCond <= optTol 65 | exitflag=1; 66 | msg = 'Optimality Condition below optTol'; 67 | if verbose 68 | fprintf('%s\n',msg); 69 | end 70 | return; 71 | end 72 | 73 | 74 | % Perform up to a maximum of
'maxIter' descent steps: 75 | for i = 1:maxIter 76 | 77 | % ****************** COMPUTE DESCENT DIRECTION ***************** 78 | if i == 1 79 | d = -g; % Initially use steepest descent direction 80 | 81 | if gpu 82 | S = gpuArray.zeros(p,store); 83 | Y = gpuArray.zeros(p,store); 84 | YS = gpuArray.zeros(store,1); 85 | else 86 | S = zeros(p,store); 87 | Y = zeros(p,store); 88 | YS = zeros(store,1); 89 | end 90 | 91 | lbfgs_start = 1; 92 | lbfgs_end = 0; 93 | Hdiag = 1; 94 | else 95 | [S,Y,YS,lbfgs_start,lbfgs_end,Hdiag,skipped] = lbfgsAdd(g-g_old,t*d,S,Y,YS,lbfgs_start,lbfgs_end,Hdiag); 96 | d = lbfgsProd(g,S,Y,YS,lbfgs_start,lbfgs_end,Hdiag); 97 | end 98 | g_old = g; 99 | 100 | % ****************** COMPUTE STEP LENGTH ************************ 101 | 102 | % Directional Derivative 103 | gtd = g'*d; 104 | 105 | % Check that progress can be made along direction 106 | if gtd > -progTol 107 | exitflag=2; 108 | msg = 'Directional Derivative below progTol'; 109 | break; 110 | end 111 | 112 | % Select Initial Guess for Line Search 113 | if i == 1 114 | t = min(1,1/sum(abs(g))); 115 | else 116 | t = 1; 117 | end 118 | 119 | % Line Search 120 | f_old = f; 121 | 122 | % Find Point satisfying Wolfe conditions 123 | [t,f,g,LSfunEvals] = WolfeLineSearch(x,t,d,f,g,gtd,c1,c2,25,progTol,funObj); 124 | funEvals = funEvals + LSfunEvals; 125 | x = x + t*d; 126 | 127 | % Compute Optimality Condition 128 | optCond = max(abs(g)); 129 | 130 | % Output iteration information 131 | if verbose 132 | fprintf('%10d %10d %15.5e %15.5e %15.5e\n',i,funEvals*funEvalMultiplier,t,f,optCond); 133 | end 134 | 135 | % Check Optimality Condition 136 | if optCond <= optTol 137 | exitflag=1; 138 | msg = 'Optimality Condition below optTol'; 139 | break; 140 | end 141 | 142 | % ******************* Check for lack of progress ******************* 143 | 144 | if max(abs(t*d)) <= progTol 145 | exitflag=2; 146 | msg = 'Step Size below progTol'; 147 | break; 148 | end 149 | 150 | 151 | if abs(f-f_old) < progTol 152 | 
exitflag=2; 153 | msg = 'Function Value changing by less than progTol'; 154 | break; 155 | end 156 | 157 | % ******** Check for going over iteration/evaluation limit ******************* 158 | if i == maxIter 159 | exitflag = 0; 160 | msg='Reached Maximum Number of Iterations'; 161 | break; 162 | end 163 | 164 | end 165 | 166 | if verbose 167 | fprintf('%s\n',msg); 168 | end 169 | 170 | end 171 | 172 | 173 | 174 | 175 | 176 | function out = setOpts(options, opt, default) 177 | if isfield(options, opt) 178 | out = options.(opt); 179 | else 180 | out = default; 181 | end 182 | end 183 | 184 | -------------------------------------------------------------------------------- /utilities/loadSyntheticProblem.m: -------------------------------------------------------------------------------- 1 | function [Y,L,S,params] = loadSyntheticProblem(varargin) 2 | % [Y,L,S,params] = loadSyntheticProblem(problemName, referenceSolutionDirectory) 3 | % 4 | % Returns parameters and solution for the following problems 5 | % (parameters are tuned so that each problem has 6 | % the same true solution (L,S) ) 7 | % Or, call this with no input arguments to set the path...
8 | % 9 | % "Lag": min lambdaL*||L||_* + lambdaS*||S||_1 + .5||L+S-Y||_F^2 10 | % 11 | % "Sum": min ||L||_* + lambdaSum*||S||_1 12 | % subject to ||L+S-Y||_F <= epsilon 13 | % 14 | % "Max": min max( ||L||_* , lambdaMax*||S||_1 ) 15 | % subject to ||L+S-Y||_F <= epsilon 16 | % 17 | % "Dual-sum" min .5||L+S-Y||_F^2 18 | % subject to ||L||_* + lambdaSum*||S||_1 <= tauSum 19 | % 20 | % "Max-sum" min .5||L+S-Y||_F^2 21 | % subject to max( ||L||_*, lambdaMax*||S||_1) <= tauMax 22 | % 23 | % so params has the following fields: 24 | % .lambdaL 25 | % .lambdaS 26 | % .lambdaSum 27 | % .lambdaMax 28 | % .tauSum 29 | % .tauMax 30 | % .epsilon 31 | % 32 | % .objective This is a function of (L,S) and is the objective 33 | % of the "Lagrangian" formulation 34 | % 35 | % .L1L2 set to 0, 'rows', or 'sum' for using l1 or l1l2 block norms 36 | % 37 | % Stephen Becker, March 6 2014 38 | 39 | 40 | 41 | % addpath for the proxes and solvers 42 | Mfilename = mfilename('fullpath'); 43 | baseDirectory = fileparts(Mfilename); 44 | addpath(genpath(baseDirectory)); 45 | 46 | if nargin==0 47 | % we just setup the path, nothing else... 48 | return; 49 | end 50 | 51 | if nargin >= 2 52 | baseDirectory = varargin{2}; 53 | else 54 | baseDirectory = fullfile(baseDirectory,'referenceSolutions'); 55 | end 56 | norm_nuke = @(L) sum(svd(L,'econ')); 57 | vec = @(X) X(:); 58 | problemString = varargin{1}; 59 | 60 | %% 61 | % -- Common problem strings -- 62 | % problemString = 'small_randn_v2'; 63 | % problemString = 'medium_randn_v1'; 64 | % problemString = 'medium_exponentialNoise_v1'; 65 | % problemString = 'large_exponentialNoise_square_v1'; 66 | % problemString = 'large_exponentialNoise_rectangular_v1'; 67 | % problemString = 'huge_exponentialNoise_rectangular_v1'; % not yet done! 
68 | % problemString = 'nsa500_SNR80_problem1_run1'; 69 | 70 | fprintf('%s\n** Problem %s **\n\n',char('*'*ones(60,1)), problemString ); 71 | fileName = fullfile(baseDirectory,[problemString,'.mat']); 72 | if exist(fileName,'file') 73 | fprintf('Loading reference solution from file %s\n', fileName); 74 | load(fileName); % should contain L, S and Y at least, and preferably lambdaL and lambdaS 75 | else 76 | [L,S,Y] = deal( [] ); 77 | end 78 | % Default: 79 | L1L2 = 0; 80 | 81 | % For problems from the NSA software package: 82 | if length(problemString)>3 && strcmpi( problemString(1:3), 'nsa' ) 83 | % This sscanf line is broken in versions of Matlab after R2010a 84 | tmp = sscanf(problemString,'nsa%d_SNR%d_problem%d_run%d'); 85 | m = tmp(1); % either 500, 1000 or 1500 86 | SNR = tmp(2); % either 80 or 45 87 | problem_type = tmp(3); 88 | run_counter = tmp(4); 89 | % e.g., 'nsa500_SNR80_problem1_run1' 90 | n = m; 91 | % % Variable parameters 92 | % m = 500; n = m; 93 | % problem_type = 1; % between 1 and 4. Controls rank/sparsity 94 | % SNR = 80; 95 | % run_counter = 1; % between 1 and 10. Only affects seed 96 | 97 | 98 | % % Possible sparsity amounts: 99 | % p_array = ceil( m*n*[ 0.05, 0.1, 0.05, 0.1 ] ); 100 | % r_array = ceil(min(m,n)*[ 0.05, 0.05, 0.1, 0.1 ] ); 101 | 102 | % Changing the conventions, March 14. They had 4 cases, for p = {.05,.1} 103 | % and r = {.05, .1 } (these are sparsity and rank percentages) 104 | % I don't want to do that many tests since we will display graphically 105 | % and not in tables. I also want more values of p and r. So I will 106 | % make p and r be linked. 
107 | p_array = ceil( m*n*[ 0.05, 0.1, 0.2, 0.3 ] ); 108 | r_array = ceil(min(m,n)*[ 0.05, 0.1, 0.2, 0.3 ] ); 109 | 110 | p = p_array(problem_type); 111 | r = r_array(problem_type); 112 | pwr = r+p/(m*n)*100^2/3; 113 | stdev = sqrt(pwr/10^(SNR/10)); 114 | delta = sqrt(min(m,n)+sqrt(8*min(m,n)))*stdev; 115 | lambdaSum = 1/sqrt( max(m,n) ); % "xi" in NSA 116 | else 117 | switch problemString 118 | case 'small_randn_v1' 119 | my_rng(101); 120 | m = 40; n = 42; lambdaL = .5; lambdaS = 1e-1; 121 | if isempty(Y) 122 | Y = randn(m,n); 123 | printEvery = 500; tol = 1e-12; maxIts = 3e3; 124 | end 125 | 126 | case 'small_randn_v2' 127 | my_rng(102); 128 | m = 40; n = 42; lambdaL = .5; lambdaS = 1e-1; 129 | if isempty(Y) 130 | Y = randn(m,n); 131 | printEvery = 500; tol = 1e-12; maxIts = 3e3; 132 | end 133 | case 'medium_randn_v1' 134 | my_rng(101); 135 | m = 400; n = 500; lambdaL = .25; lambdaS = 1e-2; 136 | if isempty(Y) 137 | Y = randn(m,n); 138 | printEvery = 1; %tol = 1e-4; maxIts = 100; 139 | tol = 1e-10; maxIts = 1e3; 140 | end 141 | case 'medium_exponentialNoise_v1' 142 | % Converges much faster than 'medium_randn_v1' 143 | my_rng(101); 144 | m = 400; n = 500; lambdaL = .25; lambdaS = 1e-2; 145 | rk = round(.5*min(m,n) ); 146 | Q1 = haar_rankR(m,rk,false); 147 | Q2 = haar_rankR(n,rk,false); 148 | Y = Q1*diag(.1+rand(rk,1))*Q2'; 149 | mdn = median(abs(Y(:))); 150 | Y = Y + exprnd( .1*mdn, m, n ); 151 | 152 | printEvery = 5; tol = 1e-10; maxIts = 1000; 153 | 154 | case 'medium_exponentialNoise_v2' 155 | % Converges much faster than 'medium_randn_v1' 156 | % our quasi-Newton scheme does well here 157 | my_rng(101); 158 | m = 400; n = 500; lambdaL = .25; lambdaS = 1e-2; 159 | rk = round(.05*min(m,n) ); 160 | Q1 = haar_rankR(m,rk,false); 161 | Q2 = haar_rankR(n,rk,false); 162 | Y = Q1*diag(.1+rand(rk,1))*Q2'; 163 | mdn = median(abs(Y(:))); 164 | Y = Y + exprnd( .1*mdn, m, n ); 165 | 166 | printEvery = 5; tol = 1e-10; maxIts = 1000; 167 | 168 | case 
'medium_exponentialNoise_v3' 169 | % Converges much faster than 'medium_randn_v1' 170 | my_rng(101); 171 | m = 400; n = 500; lambdaL = .25; lambdaS = 1e-2; 172 | rk = round(.1*min(m,n) ); 173 | Q1 = haar_rankR(m,rk,false); 174 | Q2 = haar_rankR(n,rk,false); 175 | Y = Q1*diag(.1+rand(rk,1))*Q2'; 176 | mdn = median(abs(Y(:))); 177 | Y = Y + exprnd( .1*mdn, m, n ); 178 | 179 | printEvery = 5; tol = 1e-10; maxIts = 1000; 180 | 181 | case 'large_exponentialNoise_square_v1' 182 | my_rng(201); 183 | m = 1e3; n = m; lambdaL = .25; lambdaS = 1e-2; 184 | rk = round(.5*min(m,n) ); 185 | Q1 = haar_rankR(m,rk,false); 186 | Q2 = haar_rankR(n,rk,false); 187 | Y = Q1*diag(.1+rand(rk,1))*Q2'; 188 | mdn = median(abs(Y(:))); 189 | Y = Y + exprnd( .1*mdn, m, n ); 190 | 191 | printEvery = 5; tol = 1e-8; maxIts = 250; 192 | 193 | case 'large_exponentialNoise_rectangular_v1' 194 | my_rng(201); 195 | m = 5e3; n = 200; lambdaL = .25; lambdaS = 5e-3; % OK 196 | rk = round(.5*min(m,n) ); 197 | Q1 = haar_rankR(m,rk,false); 198 | Q2 = haar_rankR(n,rk,false); 199 | Y = Q1*diag(.1+rand(rk,1))*Q2'; 200 | mdn = median(abs(Y(:))); 201 | Y = Y + exprnd( .1*mdn, m, n ); 202 | 203 | printEvery = 5; tol = 1e-8; maxIts = 100; 204 | 205 | case 'huge_exponentialNoise_rectangular_v1'; 206 | disp('Warning: lambda values not yet tuned'); 207 | my_rng(301); 208 | m = 5e4; n = 200; lambdaL = .25; lambdaS = 5e-2; 209 | rk = round(.5*min(m,n) ); 210 | Q1 = haar_rankR(m,rk,false); 211 | Q2 = haar_rankR(n,rk,false); 212 | Y = Q1*diag(.1+rand(rk,1))*Q2'; 213 | mdn = median(abs(Y(:))); 214 | Y = Y + exprnd( .1*mdn, m, n ); 215 | 216 | printEvery = 5; tol = 1e-10; maxIts = 100; 217 | 218 | 219 | case 'escalator_small'; 220 | % added March 29 2015 221 | load escalator_data_2015_small % has Y, mat, m, n, nFrames 222 | m = m*n; % 64 x 79, 200 frames 223 | n = nFrames; 224 | lambdaL = 3; 225 | lambdaS = 0.06; 226 | printEvery = 1; 227 | tol = 1e-9; 228 | maxIts = 100; 229 | L0 = Y; 230 | S0 = 0*Y; 231 | 232 | 233 | case 
'verySmall_l1' 234 | load verySmall_l1 235 | % Used CVX to generate solution 236 | 237 | case 'verySmall_l1l2' 238 | load verySmall_l1l2 239 | % Used CVX to generate solution 240 | L1L2 = 'rows'; 241 | 242 | otherwise 243 | error('bad problem name'); 244 | end 245 | end 246 | 247 | 248 | % Generate new reference soln if not precomputed 249 | if isempty(L) 250 | fprintf('Generating reference solution...\n'); 251 | if length(problemString)>3 && strcmpi( problemString(1:3), 'nsa' ) 252 | % We solve the Sum problem, which gives us a lot of parameters but not 253 | % the parameters for the Lagrangian problem, unfortunately. 254 | 255 | % I am changing the seeds format 256 | seed = 601; 257 | my_rng( seed + run_counter ); 258 | [Y, optimal_L, optimal_S, deltabar]=create_data_noisy_L2(m,n,p,r,stdev); 259 | 260 | tol = 0.0001; 261 | denoise_flag=0; 262 | 263 | % [L,S,out] = nsa(Y,stdev,tol,denoise_flag,optimal_L,optimal_S); 264 | % [L,S,out] = nsa(Y,-delta,tol,denoise_flag,optimal_L,optimal_S); 265 | 266 | [L,S,out] = nsa(Y,-delta,tol,denoise_flag,[],[],lambdaSum); 267 | fprintf('Discrepancy with initial signal L is %.2e\n', ... 268 | norm(L-optimal_L,'fro')/norm(L,'fro') ); 269 | fprintf('Discrepancy with initial signal S is %.2e\n', ...
270 | norm(S-optimal_S,'fro')/norm(S,'fro') ); 271 | 272 | lambdaL = []; 273 | lambdaS = []; 274 | maxIts = []; 275 | else 276 | tic 277 | opts = struct('printEvery',printEvery,'FISTA',true,'tol',tol,'maxIts',maxIts); 278 | %[L,S,errHist] = solver_RPCA_Lagrangian_simple(Y,lambdaL,lambdaS,[],opts); 279 | opts.quasiNewton = true; opts.FISTA=false; opts.BB = false; 280 | opts.SVDstyle = 1; % for highest accuracy 281 | if exist('L0','var') 282 | opts.L0 = L0; 283 | end 284 | if exist('S0','var') 285 | opts.S0 = S0; 286 | end 287 | [L,S,errHist] = solver_RPCA_Lagrangian(Y,lambdaL,lambdaS,[],opts); 288 | toc 289 | end 290 | 291 | save(fileName,'L','S','Y','lambdaL','lambdaS','tol','maxIts'); 292 | end 293 | 294 | sigma = svd(L,'econ'); 295 | rk = sum( sigma > 1e-8 ); 296 | normL = sum(sigma); 297 | 298 | params = []; 299 | params.L1L2 = L1L2; 300 | if L1L2 301 | if strcmpi(L1L2,'rows') 302 | params.objective = @(L,S) lambdaL*norm_nuke(L)+lambdaS*sum( sqrt( sum(S.^2,2) ) ) + ... 303 | .5*norm(vec(L+S-Y))^2; 304 | normS = sum( sqrt( sum(S.^2,2) ) ); 305 | elseif strcmpi(L1L2,'cols') 306 | params.objective = @(L,S) lambdaL*norm_nuke(L)+lambdaS*sum( sqrt( sum(S.^2,1) ) ) + ... 307 | .5*norm(vec(L+S-Y))^2; 308 | normS = sum( sqrt( sum(S.^2,1) ) ); 309 | else error('bad value for L1L2'); 310 | end 311 | else 312 | params.objective = @(L,S) lambdaL*norm_nuke(L)+lambdaS*norm(S(:),1) + ... 313 | .5*norm(vec(L+S-Y))^2; 314 | normS = norm(S(:),1); 315 | end 316 | params.epsilon = norm( L + S - Y, 'fro' ); 317 | params.lambdaL = lambdaL; 318 | params.lambdaS = lambdaS; 319 | if exist('lambdaSum','var') 320 | params.lambdaSum = lambdaSum; 321 | else 322 | params.lambdaSum = lambdaS/lambdaL; 323 | end 324 | params.lambdaMax = normL/normS; 325 | params.tauSum = normL + params.lambdaSum*normS; 326 | params.tauMax = max(normL,params.lambdaMax*normS ); 327 | 328 | fprintf('\tSize %d x %d\n\tL has nuclear norm %.3f and rank %d of %d possible\n', ... 
329 | m,n,normL, rk, length(sigma) ); 330 | fprintf('\tS has l1 (or l1l2) norm %.3f and %.2f%% of its elements are nonzero\n', ... 331 | normS, 100*sum( abs(S(:)) > 1e-8)/numel(S) ); 332 | fprintf('\t||L+S-Y||/||Y|| is %.2e\n', params.epsilon/norm(Y,'fro' ) ); 333 | 334 | 335 | 336 | end % end of main function 337 | 338 | 339 | function Q = haar_rankR(n, r, cmplx) 340 | % Q = HAAR_RANKR( N , R) 341 | % returns an N x R matrix Q with orthonormal columns (unitary when R = N), 342 | % drawn from the Haar measure on the 343 | % Circular Unitary Ensemble (CUE) 344 | % 345 | % Q = HAAR_RANKR( N, R, COMPLEX ) 346 | % if COMPLEX is false, then returns a real matrix from the 347 | % Circular Orthogonal Ensemble (COE). By default, COMPLEX = true 348 | % 349 | % For more info, see http://www.ams.org/notices/200705/fea-mezzadri-web.pdf 350 | % "How to Generate Random Matrices from the Classical Compact Groups" 351 | % by Francesco Mezzadri (also at http://arxiv.org/abs/math-ph/0609050 ) 352 | % 353 | % Other references: 354 | % "How to generate a random unitary matrix" by Maris Ozols 355 | % 356 | % Stephen Becker, 2/24/11. updated 11/1/12 for rank r version. 357 | % 358 | % To test that the eigenvalues are evenly distributed on the unit 359 | % circle, do this: a = angle(eig(haar_rankR(1000))); hist(a,100); 360 | % 361 | % This calls "randn", so it will change the stream of the prng 362 | 363 | if nargin < 3, cmplx = true; end 364 | if nargin < 2 || isempty(r) , r=n; end 365 | 366 | if cmplx 367 | z = (randn(n,r) + 1i*randn(n,r))/sqrt(2.0); 368 | else 369 | z = randn(n,r); 370 | end 371 | [Q,R] = qr(z,0); % time consuming 372 | d = diag(R); 373 | ph = d./abs(d); 374 | % Q = multiply(q,ph,q) % in python. this is q <-- multiply(q,ph) 375 | % where we multiply each column of q by an element in ph 376 | 377 | %Q = Q.*repmat( ph', n, 1 ); 378 | % use bsxfun for this to make it fast...
379 | Q = bsxfun( @times, Q, ph' ); 380 | 381 | end % end of haar_rankR 382 | 383 | 384 | function varargout = my_rng(varargin) 385 | %MY_RNG Control the random number generator used by RAND, RANDI, and RANDN (SRB version) 386 | % MY_RNG(SD) seeds the random number generator using the non-negative 387 | % integer SD so that RAND, RANDI, and RANDN produce a predictable 388 | % sequence of numbers. 389 | % MODIFIED BY STEPHEN BECKER 390 | % See also RAND, RANDI, RANDN, RandStream, NOW. 391 | 392 | 393 | % See Choosing a Random Number Generator for details on these generators. 394 | 395 | if exist('rng','file') 396 | [varargout{1:nargout}] = rng(varargin{:}); 397 | return; 398 | end 399 | % Otherwise, imitate the functionality... 400 | 401 | % -- SRB adding this -- 402 | error(nargchk(1,1,nargin)); 403 | error(nargoutchk(0,0,nargout)); 404 | arg1 = varargin{1}; 405 | % For R2008a, this doesn't work... (not sure what earliest version is) 406 | if verLessThan('matlab','7.7') 407 | randn('state',arg1); 408 | rand('state',arg1); 409 | else 410 | if verLessThan('matlab','8.2') 411 | RandStream.setDefaultStream(RandStream('mt19937ar', 'seed', arg1) ); 412 | else 413 | RandStream.setGlobalStream(RandStream('mt19937ar', 'seed', arg1) ); 414 | end 415 | end 416 | end % end of my_rng 417 | -------------------------------------------------------------------------------- /utilities/randomizedSVD.m: -------------------------------------------------------------------------------- 1 | function [U,S,V] = randomizedSVD( X, r, rEst, nPower, seed, opts ) 2 | % [U,S,V] = randomizedSVD( X, r, rEst, nPower, seed, opts ) 3 | % returns U, S, V such that X ~ U*S*V' (X an m x n matrix) 4 | % where S is a r x r matrix 5 | % rEst >= r is the size of the random multiplies (default: ceil(r+log2(r)) ) 6 | % nPower is number of iterations to do the power method 7 | % (should be at least 1, which is the default) 8 | % seed can be the empty matrix; otherwise, it will be used to seed 9 | % the random number
generator (useful if you want reproducible results) 10 | % opts is a structure containing further options, including: 11 | % opts.warmStart Set this to a matrix if you have a good estimate 12 | % of the row-space of the matrix already. By default, 13 | % a random matrix is used. 14 | % 15 | % X can either be a m x n matrix, or it can be a cell array 16 | % of the form {@(y)X*y, @(y)X'*y, n } 17 | % 18 | % Follows the algorithm from [1] 19 | % 20 | % [1] "Finding Structure with Randomness: Probabilistic Algorithms 21 | % for Constructing Approximate Matrix Decompositions" 22 | % by N. Halko, P. G. Martinsson, and J. A. Tropp. SIAM Review vol 53 2011. 23 | % http://epubs.siam.org/doi/abs/10.1137/090771806 24 | % 25 | 26 | % added to TFOCS in October 2014 27 | 28 | if isnumeric( X ) 29 | X_forward = @(y) X*y; 30 | X_transpose = @(y) X'*y; 31 | n = size(X,2); 32 | elseif iscell(X) 33 | if isa(X{1},'function_handle') 34 | X_forward = X{1}; 35 | else 36 | error('X{1} should be a function handle'); 37 | end 38 | if isa(X{2},'function_handle') 39 | X_transpose = X{2}; 40 | else 41 | error('X{2} should be a function handle'); 42 | end 43 | if numel(X) < 3 44 | error('Please specify X in the form {@(y)X*y, @(y)X''*y, n }' ); 45 | end 46 | n = X{3}; 47 | else 48 | error('Unknown type for X: should be matrix or cell/function handle'); 49 | end 50 | function out = setOpts( field, default ) 51 | if ~isfield( opts, field ) 52 | out = default; 53 | else 54 | out = opts.(field); 55 | end 56 | end 57 | 58 | % If you want reproducible results for some reason: 59 | if nargin >= 5 && ~isempty(seed) 60 | % around 2013 or 14 (not sure exactly) 61 | % they started changing .setDefaultStream... 62 | if verLessThan('matlab','8.2') 63 | RandStream.setDefaultStream(RandStream('mt19937ar', 'seed', seed) ); 64 | else 65 | RandStream.setGlobalStream(RandStream('mt19937ar', 'seed', seed) ); 66 | end 67 | end 68 | if nargin < 3 || isempty( rEst ) 69 | rEst = ceil( r + log2(r) ); % for example...
70 | end 71 | rEst = min( rEst, n ); 72 | if r > n 73 | warning('randomizedSVD:r','Warning: r > # columns, so truncating it'); 74 | r = n; 75 | end 76 | 77 | % March 2015, do full SVD sometimes 78 | if isnumeric( X ) 79 | m = size(X,1); 80 | rEst = min( rEst, m ); 81 | r = min( r, m ); 82 | 83 | if r == min( m, n ) 84 | % do full SVD 85 | [U,S,V] = svd(X,'econ'); 86 | return; 87 | end 88 | end 89 | 90 | 91 | if nargin < 4 || isempty( nPower ) 92 | nPower = 1; 93 | end 94 | if nPower < 1, error('nPower must be >= 1'); end 95 | if nargin < 6, opts = []; end 96 | 97 | warmStart = setOpts('warmStart',[] ); 98 | if isempty( warmStart ) 99 | Q = randn( n, rEst ); 100 | else 101 | Q = warmStart; 102 | if size(Q,1) ~= n, error('bad height dimension for warmStart'); end 103 | if size(Q,2) > rEst 104 | % with Nesterov, we get this a lot, so disable it 105 | warning('randomizedSVD:warmStartLarge','Warning: warmStart has more columns than rEst'); 106 | % disp('Warning: warmStart has more columns than rEst'); 107 | else 108 | Q = [Q, randn(n,rEst - size(Q,2) )]; 109 | end 110 | end 111 | Q = X_forward(Q); 112 | % Algo 4.4 in "Finding Structure with Randomness" paper, but we re-arrange a little 113 | for j = 1:(nPower-1) 114 | [Q,R] = qr(Q,0); 115 | Q = X_transpose(Q); 116 | [Q,R] = qr(Q,0); 117 | Q = X_forward(Q); 118 | end 119 | [Q,R] = qr(Q,0); 120 | 121 | % We can now approximate: 122 | % X ~ QQ'X = QV' 123 | % Form Q'X, i.e. V = X'Q 124 | V = X_transpose(Q); 125 | 126 | [V,R] = qr(V,0); 127 | [U,S,VV] = svd(R','econ'); 128 | U = Q*U; 129 | V = V*VV; 130 | 131 | % Now, pick out top r. It's already sorted.
132 | U = U(:,1:r); 133 | V = V(:,1:r); 134 | S = S(1:r,1:r); 135 | 136 | 137 | end % end of function 138 | -------------------------------------------------------------------------------- /utilities/referenceSolutions/downloadEscalatorData.m: -------------------------------------------------------------------------------- 1 | function downloadEscalatorData 2 | if ~exist('escalator_data.mat','file') 3 | % This downloads a 3.7 MB demo 4 | disp('About to download the 3.7 MB video...'); 5 | 6 | baseDirectory = fileparts(mfilename('fullpath')); 7 | disp(baseDirectory) 8 | 9 | urlwrite('http://cvxr.com/tfocs/demos/rpca/escalator_data.mat',... 10 | fullfile(baseDirectory,'escalator_data.mat')); 11 | else 12 | disp('You already have the data file; everything is good'); 13 | end -------------------------------------------------------------------------------- /utilities/referenceSolutions/downloadReferenceSolutions.m: -------------------------------------------------------------------------------- 1 | % Run this script to download the reference files 2 | % (117 MB) 3 | 4 | disp('Downloading 117 MB file of reference solutions'); 5 | unzip(... 
6 | 'http://amath.colorado.edu/faculty/becker/code/referenceSolutions_v1.0.zip'); 7 | 8 | -------------------------------------------------------------------------------- /utilities/referenceSolutions/escalator_data.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenbeckr/fastRPCA/44dfee56f142ebffe5a7003578868e84bd4330b7/utilities/referenceSolutions/escalator_data.mat -------------------------------------------------------------------------------- /utilities/referenceSolutions/verySmall_l1.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenbeckr/fastRPCA/44dfee56f142ebffe5a7003578868e84bd4330b7/utilities/referenceSolutions/verySmall_l1.mat -------------------------------------------------------------------------------- /utilities/referenceSolutions/verySmall_l1l2.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenbeckr/fastRPCA/44dfee56f142ebffe5a7003578868e84bd4330b7/utilities/referenceSolutions/verySmall_l1l2.mat -------------------------------------------------------------------------------- /utilities/vec.m: -------------------------------------------------------------------------------- 1 | function y = vec(x) 2 | y = x(:); 3 | end --------------------------------------------------------------------------------
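As an aside for readers porting this code: `randomizedSVD.m` above implements Algorithm 4.4 of Halko, Martinsson, and Tropp (a randomized range finder, optional power iterations, then an SVD of a small factor). The NumPy sketch below mirrors that structure; the function name, argument names, and defaults are our own illustrative choices, not part of fastRPCA, and the MATLAB file remains the reference implementation.

```python
import numpy as np

def randomized_svd(X, r, r_est=None, n_power=1, rng=None):
    """Truncated SVD via randomized range finding (Halko/Martinsson/Tropp,
    Algorithm 4.4).  Returns U, s, V with X ~= U @ diag(s) @ V.T.
    Names and defaults are illustrative; see randomizedSVD.m for the
    reference MATLAB version."""
    rng = np.random.default_rng(rng)
    m, n = X.shape
    if r_est is None:
        # same oversampling default as the MATLAB code: ceil(r + log2(r))
        r_est = int(np.ceil(r + np.log2(r)))
    r_est = min(r_est, n)

    # Range finder: sketch the column space of X, with optional power iterations
    Q = X @ rng.standard_normal((n, r_est))
    for _ in range(n_power - 1):
        Q, _ = np.linalg.qr(Q)
        Q, _ = np.linalg.qr(X.T @ Q)
        Q = X @ Q
    Q, _ = np.linalg.qr(Q)

    # X ~= Q Q'X = Q R' V', so an SVD of the small r_est x r_est factor R'
    # yields the truncated SVD of X
    V, R = np.linalg.qr(X.T @ Q)
    U_small, s, Vh = np.linalg.svd(R.T)
    U = Q @ U_small
    V = V @ Vh.T
    return U[:, :r], s[:r], V[:, :r]
```

When the input has exact rank r and `r_est >= r`, the sketch captures the column space and the reconstruction `U @ diag(s) @ V.T` is accurate to near machine precision, matching the behavior of the MATLAB routine.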