├── LICENSE
├── README.md
├── example.m
└── subprograms
    ├── MRSM.m
    ├── covSEard.m
    ├── define_covmatrix.m
    ├── find_beta.m
    ├── initial_hyper.m
    ├── log_likelihood.m
    ├── max_likelihood.m
    ├── max_restr_likelihood.m
    ├── poschol.m
    ├── predict_resp.m
    ├── rest_log_likelihood.m
    ├── sq_dist.m
    ├── trend_fun.m
    └── tune_hyper.m

/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2019 Mohammad Sadoughi

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Multi-output-Gaussian-Process

## Multi-output regression

In multi-output regression (also called multi-target, multivariate, or multi-response regression), we aim to predict multiple real-valued output variables. One simple approach is to combine several single-output regression models, but this approach has some drawbacks and limitations [[1](https://towardsdatascience.com/regression-models-with-multiple-target-variables-8baa75aacd)]:

* Training several single-output models takes a long time.
* Each single-output model is trained and optimized for one specific target, not for the combination of all targets.
* In many cases there is strong interdependency and correlation among the targets. Single-output models cannot capture these relationships.

To address these drawbacks and limitations, we look for multi-output regression methods that model the dataset by considering not only the relationships between the input factors and the corresponding targets, but also the relationships among the targets themselves. Several regression methods have been developed for multi-output problems; click [here](http://cig.fi.upm.es/articles/2015/Borchani-2015-WDMKD.pdf) for a great review of these methods. For example, multi-target SVM and random forests are two of the most popular ones.

In this research, I propose and implement a new technique for multi-output regression using a Gaussian process (GP) model.

## Univariate GP
Let's first introduce the univariate GP. A univariate GP defines a Gaussian distribution over functions, which can be used for nonlinear regression, classification, ranking, preference learning, or ordinal regression.
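As a quick illustration of the univariate case, here is a minimal ```fitrgp``` sketch (the toy data are made up purely for illustration; ```fitrgp``` ships with MATLAB's Statistics and Machine Learning Toolbox):

```matlab
% Toy 1-D data (assumed example, not part of this repository)
x = linspace(0, 1, 10)';
y = sin(2*pi*x) + 0.05*randn(10, 1);

gprMdl = fitrgp(x, y);                % fit a univariate GP
xnew = linspace(0, 1, 50)';
[ypred, ysd] = predict(gprMdl, xnew); % predictive mean and standard deviation
```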
A univariate GP has several advantages over other regression techniques:

1. It outperforms other methods on problems where the data are limited and computationally expensive to generate.
2. It provides not only the mean value of the prediction but also its uncertainty, which can be used to define a confidence level for the prediction accuracy.

[PyKrige](https://github.com/bsmurphy/PyKrige) is a Python toolkit that implements the univariate GP model. In MATLAB, you can use [fitrgp](https://www.mathworks.com/help/stats/fitrgp.html) to build a univariate GP model.

## Multivariate GP

In this research, I extended the univariate GP to a multivariate GP in order to handle problems with multiple interdependent targets. The key point of this extension is redefining the covariance matrix so that it can capture the correlation among the targets. To this end, I used the nonseparable covariance structure described by [Thomas E. Fricker](https://amstat.tandfonline.com/doi/abs/10.1080/00401706.2012.715835#.XC5jpVxKi70). A conventional univariate GP model builds the spatial covariance matrix over the feature data, and the kernel's hyperparameters are learned by maximizing the likelihood of the observations. In a multivariate GP, however, we extend the covariance matrix to cover the correlation between both features and targets. It therefore involves a larger number of hyperparameters to be optimized, so the optimization process takes somewhat longer. For the mathematical description of the multivariate GP, please see our recent publication (Sadoughi, Mohammadkazem, Meng Li, and Chao Hu, "Multivariate system reliability analysis considering highly nonlinear and dependent safety events," Reliability Engineering & System Safety 180 (2018): 189-200).

You can train a multivariate GP model using the MRSM function.

```matlab
MGP = MRSM(X, Y, option)
```

Here, ```MGP``` is the trained model, ```X``` is the matrix of input factors, and ```Y``` is the matrix of targets. ```option``` is a struct that can have one or more of the following fields:

* ```option.s```: the upper bound for the entries of the correlation matrix among the targets. Setting this number to a very small value removes all possible correlation between the targets when building the multivariate GP model.

* ```option.degree```: the degree of the polynomial used as the trend function of the multivariate GP model. Only degrees 0, 1, and 2 are currently implemented; the default value is zero.

* ```option.optim```: the optimization method used during the training process. The default is ```'fmincon'```. You can also choose ```'Global'```, which uses a global optimization algorithm; this option is more accurate, but the optimization process takes longer.
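For example, to train with a quadratic trend and the global optimizer (a sketch; ```X``` and ```Y``` are the training matrices described above):

```matlab
option.s = 0.4;          % upper bound for the cross-target correlation entries
option.degree = 2;       % quadratic trend function
option.optim = 'Global'; % global search (slower, but more accurate)
MGP = MRSM(X, Y, option);
```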
After training the model, we can use the ```predict_resp``` function to perform prediction on a new dataset.
```matlab
[Y_, S_] = predict_resp(MGP, X_);
```

Here ```X_``` is the new dataset, ```Y_``` is the mean value of the prediction, and ```S_``` is the uncertainty of the prediction.

## Simple example

Now let's try to implement a simple example:

1. Define a simple input matrix with 4 samples and two features:
```matlab
>> X = [1, 2; 2, 1; 2, 3; 3, 4]
X =
     1     2
     2     1
     2     3
     3     4
```
2. Define the corresponding output matrix with 4 samples and 2 targets.
```matlab
>> Y = [3, 4; 2, 3; 1, 2; 4, 5]
Y =
     3     4
     2     3
     1     2
     4     5
```
3. Build an MRSM model over the data (here I let the code use the default options).
```matlab
>> MGP = MRSM(X, Y);
```
4. Predict the response at a new input point using the trained model.
```matlab
>> X_ = [2.5, 3.5]
X_ =
    2.5000    3.5000
>> [Y_, S_] = predict_resp(MGP, X_)
Y_ =
    2.2442    3.0326
S_ =
    0.3581    0.4193
    0.4193    1.0432
```
```Y_``` is the mean value of the prediction for the two targets, and ```S_``` is the covariance matrix summarizing the uncertainty of the prediction as well as the correlation between the two targets at this specific test point. For instance, this result shows that the covariance between the two targets at this point is 0.4193.
--------------------------------------------------------------------------------
/example.m:
--------------------------------------------------------------------------------
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% This code was written by Mohammad Kazem Sadoughi at SRSL, Iowa State
% University. The purpose of this code is to demonstrate how to run the
% multi-response surrogate modeling (MRSM) code on a sample problem with
% three objective functions. For more information on MRSM, please see our
% recent publication:
% Sadoughi, Mohammadkazem, Meng Li, and Chao Hu. "Multivariate system reliability analysis
% considering highly nonlinear and dependent safety events." Reliability Engineering & System
% Safety 180 (2018): 189-200.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

function example()
% NOTE: EXECUTE THIS EXAMPLE FROM THIS DIRECTORY. RELATIVE PATHS ARE USED
% IN THE EXAMPLES.
% This example considers three highly correlated response surfaces.
addpath('subprograms'); % make the MRSM subprograms visible on the path

%% Design variables definition
% Generating sample data: 16 uniformly distributed points are used as the
% training data. The training data can be changed or generated by any
% other algorithm, such as LHS.
n0 = 4;                    % number of samples along each direction (variable)
input1 = linspace(0,1,n0); % uniform samples along the first direction
input2 = linspace(0,1,n0); % uniform samples along the second direction
[INPUT1,INPUT2] = meshgrid(input1,input2);
INPUT11 = reshape(INPUT1,n0*n0,1); % mesh grids of data (also used for plotting)
INPUT22 = reshape(INPUT2,n0*n0,1); % mesh grids of data (also used for plotting)
INPUT = [INPUT11,INPUT22];         % combine the two dimensions (variables)

%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Calculate the output of the three functions at the generated initial
% points and build the multivariate Gaussian process. Please read our
% published article for more information on the multivariate Gaussian process:
% Sadoughi, Mohammadkazem, Meng Li, and Chao Hu. "Multivariate system reliability analysis
% considering highly nonlinear and dependent safety events." Reliability Engineering & System
% Safety 180 (2018): 189-200.
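% OUTPUT stacks the three responses column-wise, giving a 16-by-3 matrix:
% MRSM expects one row per training sample and one column per target,
% matching the 16-by-2 layout of INPUT.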

OUTPUT = [bmfun1(INPUT),bmfun2(INPUT),bmfun3(INPUT)];
k = MRSM(INPUT,OUTPUT); % build the model using the MRSM function

% k is the summary of the trained model containing all the optimized hyperparameters

%% %%%%%%%%%%%%%%%%% Prediction %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Now we want to use the trained model for prediction on a new set of data.
% Generating testing points
n = 100;
x = linspace(0,1,n); % testing points uniformly spaced along the first direction
y = linspace(0,1,n); % testing points uniformly spaced along the second direction
[X,Y] = meshgrid(x,y); % generate the 2-D samples by combining the unidirectional points

for i=1:n
    for j=1:n

        % NOTE: z_true is the true output at the testing points and z_pred is
        % the prediction at the testing points based on the MRSM method
        z_true(i,j,1) = bmfun1([x(i),y(j)]);
        z_true(i,j,2) = bmfun2([x(i),y(j)]);
        z_true(i,j,3) = bmfun3([x(i),y(j)]);

        % Prediction at testing points
        [z_pred(i,j,:), std_pred] = predict_resp(k,[x(i),y(j)]);
        % NOTE: MRSM provides not only the mean of the prediction but also
        % its uncertainty. This uncertainty can be very useful since it
        % quantifies how accurate we expect the prediction to be.
    end
end

%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Now that we have the true and predicted outputs, we can calculate the
% prediction error.
error = sum(sum(abs(z_true-z_pred)))/(n*n); % MAE (mean absolute error) per response
error = sum(error)/3;                       % mean of the error over the three responses


%% Plot the b-th response and the corresponding training points
b = 3;
z_true = reshape(z_true(:,:,b),n,n);
z_pred = reshape(z_pred(:,:,b),n,n);
figure
surf(X,Y,z_true');
hold on
surf(X,Y,z_pred');
hold on
plot3(INPUT(:,1),INPUT(:,2),OUTPUT(:,b),'ob','MarkerFaceColor','b');
xlabel('x1'); ylabel('x2'); zlabel('out');
end

%% The objective functions... you can customize them and see the effect!
function response = bmfun1(x)
response = 0.8*(1+x(:,1).^2+x(:,2).^2).*( 1+sin(2*pi*(x(:,1)-0.25)).*sin(2*pi*(x(:,2)-0.25)) );
end
function response = bmfun2(x)
response = 0.8*(1+x(:,1).^2+x(:,2).^2).*( 1+sin(2*pi*(x(:,1)-0.25)).*sin(2*pi*(x(:,2)-0.25)) );
end
function response = bmfun3(x)
response = 0.8*(1+x(:,1).^2+x(:,2).^2).*( 1+sin(2*pi*(x(:,1)-0.25)).*sin(2*pi*(x(:,2)-0.25)) );
end
--------------------------------------------------------------------------------
/subprograms/MRSM.m:
--------------------------------------------------------------------------------
function model=MRSM(input,output,option)

% This code was written by Mohammadkazem Sadoughi at Iowa State
% University (4/05/2017). This file is the main code for building the
% multivariate Gaussian process. Given the training data, the
% hyperparameters are tuned, and the regression coefficients, trend
% function, and covariance matrix are defined.
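% The optional struct "option" may carry the fields:
%   option.s      - upper bound on the off-diagonal entries of the
%                   cross-response correlation matrix (default 0.6)
%   option.degree - degree of the polynomial trend function: 0, 1, or 2
%                   (default 0)
%   option.optim  - 'fmincon' for a single local search, or 'Global' for
%                   GlobalSearch (default 'fmincon')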

if exist('option','var')

    if ~isfield(option,'s') || isempty(option.s) % Upper bound for the entries of matrix A.
        % Setting this parameter to a very small value means we do not
        % consider any correlation among the responses.
        model.s=0.6;
    else
        model.s=option.s;
    end

    if ~isfield(option,'degree') || isempty(option.degree) % Degree of the polynomial trend function: 0, 1, or 2
        model.degree=0;
    else
        model.degree=option.degree;
    end

    if ~isfield(option,'optim') || isempty(option.optim) % Optimization method
        model.optim='fmincon';
    else
        model.optim=option.optim;
    end
else
    model.s=0.6;
    model.degree=0;
    model.optim='fmincon';
end

model.X=input;
model.Y=output; % Matrix of (model.n by model.m)

model.d=size(input,2);  % Dimension of the input variables
model.n=size(input,1);  % Number of training points
model.m=size(output,2); % Number of responses

model.ymn=reshape(output',model.m*model.n,1); % Reshape the output to a vector of (model.m * model.n by 1)

model.cov_model = @(hyp, x, z, i)covSEard(hyp, x, z); % Define the covariance model (here we use the SEard kernel)

[model.fn, model.p]=trend_fun(model,model.X); % Define the shape of the trend function and Fn

model=initial_hyper(model); % Initialize the hyperparameters

model=tune_hyper(model);    % Tune the hyperparameters
end
--------------------------------------------------------------------------------
/subprograms/covSEard.m:
--------------------------------------------------------------------------------
function K = covSEard(hyp, x, z)
% Squared-exponential ARD kernel: K(i,j) = exp(-sum(hyp .* (x_i - z_j).^2)).
% Note that, unlike the GPML covSEard, hyp holds the decay (inverse squared
% length-scale) parameters directly rather than their logarithms.
n=size(x,1);
m=size(z,1);
K=zeros(n,m);
for i=1:n
    for j=1:m
        K(i,j)=exp(-sum( hyp.*((x(i,:)-z(j,:)).^2) ));
    end
end
end
--------------------------------------------------------------------------------
/subprograms/define_covmatrix.m:
--------------------------------------------------------------------------------
function z_mn=define_covmatrix(model,X1,X2)
% This function uses the Nonseparable Linear Model of Coregionalization
% (NLMC) to build the covariance between each pair of points.

for i=1:model.m
    L(:,:,i) = model.cov_model(model.hyper.teta(i,:), X1, X2); % one spatial kernel per response
end
z_mn=[];
for i=1:size(X1,1)
    supplement=[];
    for j=1:size(X2,1)
        % m-by-m cross-response covariance block for the point pair (i,j)
        model.cov{i,j}=model.hyper.A*diag(reshape(L(i,j,:),model.m,1))*model.hyper.A';
        supplement=[supplement, model.cov{i,j}];
    end
    z_mn=[z_mn;supplement];
end

if isequal(X1,X2)
    % Symmetrize and clip small eigenvalues to keep z_mn positive definite
    z_mn= triu(z_mn)+triu(z_mn)'-diag(diag(z_mn));
    [V,D]=eig(z_mn);
    d=diag(D);
    d(d<=.001)=0.001;
    z_mn= V*diag(d)*V';
end

end
--------------------------------------------------------------------------------
/subprograms/find_beta.m:
--------------------------------------------------------------------------------
function beta=find_beta(model)
% Generalized least squares estimate of the trend coefficients:
% beta = (F' K^-1 F)^-1 F' K^-1 y, computed via Cholesky solves.

zchol=poschol(model.z_mn);

alpha1 = (zchol\(zchol'\model.ymn)); % K^-1 y
alpha2 = (zchol\(zchol'\model.fn));  % K^-1 F
alpha3=model.fn'*alpha2;             % F' K^-1 F
alpha3chol=chol(alpha3);
alpha4=(alpha3chol\(alpha3chol'\model.fn')); % (F' K^-1 F)^-1 F'
beta=alpha4*alpha1;

end
--------------------------------------------------------------------------------
/subprograms/initial_hyper.m:
--------------------------------------------------------------------------------
function model=initial_hyper(model)

% This function assigns the initial values of all hyperparameters used in
% this algorithm.
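% teta holds one row of decay (inverse squared length-scale) parameters
% per response, with one column per input dimension. A is the square-root
% factor of the cross-response covariance zigma0; since zigma0 starts as
% the identity, A is initially the identity as well (no correlation among
% the responses).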

model.hyper.teta=ones(model.m,model.d);

model.zigma0=diag(ones(model.m,1)); % initial cross-response covariance (identity)
[V, D]=eig(model.zigma0);
model.hyper.A=V*(D^0.5)*V';         % matrix square root of zigma0

end
--------------------------------------------------------------------------------
/subprograms/log_likelihood.m:
--------------------------------------------------------------------------------
function [lml] = log_likelihood(hyper_param, model,i)

% This function evaluates the (negative) log-likelihood of the i-th
% response only; it is used when tuning each response separately.

y=model.Y(:,i);
fn=reshape(model.fn,model.m,model.n,model.p);
f=reshape(fn(i,:,:), model.n,model.p);

K = model.cov_model(hyper_param, model.X, model.X);
Kchol = chol(K);

alpha1=y-f*model.beta;
alpha2 = (Kchol\(Kchol'\alpha1));
sigma2=alpha1'*alpha2/model.n;

lml=0.5*model.n*(sigma2)+0.5*log(det(K));

end
--------------------------------------------------------------------------------
/subprograms/max_likelihood.m:
--------------------------------------------------------------------------------
function model= max_likelihood(model)

% This function is used when we only want to tune the decay parameters of
% each single response separately.

for i=1:model.m
    problem.f = @(x) log_likelihood(x,model,i);
    A = [];
    b = [];
    Aeq = [];
    beq = [];
    lb= 0.01*ones(1,model.d);
    ub= 100*ones(1,model.d);
    x0 = model.hyper.teta(i,:);
    options = optimset('Display', 'off');
    nonlcon=[];
    model.hyper.teta(i,:) = fmincon(problem.f,x0,A,b,Aeq,beq,lb,ub,nonlcon,options);
end

end
--------------------------------------------------------------------------------
/subprograms/max_restr_likelihood.m:
--------------------------------------------------------------------------------
function model= max_restr_likelihood(model)

% This function is used when we want to tune all hyperparameters
% simultaneously. To find the global optimum we can use MATLAB's
% GlobalSearch.

MLE = @(x) rest_log_likelihood(x,model); % define the likelihood function

% Set the constraints used in the optimization
A = [];
b = [];
Aeq = [];
beq = [];

% Initial set for the parameters of z0 (off-diagonal correlation entries)
n1=model.m*(model.m-1)/2;
lb1= zeros(1,n1);
ub1= model.s*ones(1,n1);
x01 = 0.1*ones(1,n1);

% Initial set for the parameters of teta
n2=model.m*model.d;
lb2= 0.01*ones(1,n2);
ub2= 10*ones(1,n2);
x02 =0.2*ones(1,n2);

% Combine all parameters:
n=n1+n2;
lb=[lb1,lb2];
ub=[ub1,ub2];
x0 =[x01,x02];

%% Optimization
% There are two options for the optimization: 1) a direct fmincon call
% from a single initial point x0, which is faster but may only find a
% local minimum; 2) a global search, which can find the global minimum.
% The global method is more accurate but takes more time.
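% Both branches minimize the same objective. The 'Global' branch wraps the
% identical fmincon problem in GlobalSearch, which examines multiple trial
% points (NumTrialPoints below), runs fmincon from the promising ones, and
% keeps the best local solution found.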
if strcmp(model.optim,'fmincon') % Option 1:
    options = optimset('Display', 'off');
    nonlcon=[];
    x= fmincon(MLE,x0,A,b,Aeq,beq,lb,ub, nonlcon, options);
else % Option 2:
    opts = optimoptions(@fmincon,'Algorithm','interior-point');
    problem = createOptimProblem('fmincon','objective',MLE,'x0',x0,'lb',lb,'ub',ub,'options',opts);
    gs = GlobalSearch('NumTrialPoints',300);
    [x,f] = run(gs,problem);
end

%% Update the hyperparameters
% Update the matrix A
k=1;
for i=1:model.m
    for j=i+1:model.m
        z0(i,j)=x(k);
        k=k+1;
    end
end
if model.m==1
    z0=[];
end

z0=[z0;zeros(1,model.m)];
z0=z0+z0'+diag(ones(model.m,1)); % symmetric correlation matrix with unit diagonal
[V, D]=eig(z0);
model.hyper.A=V*(D^0.5)*V';      % matrix square root of z0

% Update the hyperparameters teta
teta=x(n1+1:end);
teta=reshape(teta,model.m,model.d);
model.hyper.teta=teta;

end
--------------------------------------------------------------------------------
/subprograms/poschol.m:
--------------------------------------------------------------------------------
function Mchol=poschol(M)
% Cholesky factorization with a positive-definiteness safeguard: take the
% elementwise absolute value, symmetrize from the upper triangle, and clip
% eigenvalues below 10e-10 (= 1e-9) before calling chol.
M=abs(M);
M= triu(M)+triu(M)'-diag(diag(M));
[V,D]=eig(M);
d=diag(D);
d(d<=10e-10)=10e-10;
Mchol= V*diag(d)*V';
Mchol=chol(Mchol);
end
--------------------------------------------------------------------------------
/subprograms/predict_resp.m:
--------------------------------------------------------------------------------
function [y, s, model]=predict_resp(model,x)

model.x_pred=x;
model.n0=size(x,1);
model.z_0n0n=define_covmatrix(model,model.x_pred,model.X); % covariance between test and training points
z0=model.hyper.A*model.hyper.A'; % prior cross-response covariance at a point
n=size(model.x_pred,1);
model.z0=repmat(z0,n,1);

%% Prediction at the new points
alpha1=model.ymn-model.fn*model.beta;
zchol=poschol(model.z_mn);
alpha2 = (zchol\(zchol'\alpha1));
model.f0=trend_fun(model,model.x_pred);
y=model.f0*model.beta+model.z_0n0n*alpha2;
y=reshape(y,model.m,model.n0)'; % n0-by-m matrix of predictive means

%% Uncertainty in the prediction at the new points
alpha3=(zchol\(zchol'\model.fn));
u=model.f0-model.z_0n0n*alpha3;
alpha4=model.fn'*alpha3;
alpha4chol=poschol(alpha4);
alpha5=(alpha4chol\(alpha4chol'\u'));
alpha6=(zchol\(zchol'\model.z_0n0n'));

alpha7=zeros(model.m*n,model.m);
alpha8=zeros(model.m*n,model.m);
for i=1:n
    for j=1:model.m
        for jj=1:model.m
            alpha7(model.m*(i-1)+j,jj)=model.z_0n0n(model.m*(i-1)+j,:)*alpha6(:,model.m*(i-1)+jj);
            alpha8(model.m*(i-1)+j,jj)=u(model.m*(i-1)+j,:)*alpha5(:,model.m*(i-1)+jj);
        end
    end
end
s=model.z0-alpha7+alpha8; % stacked m-by-m predictive covariance blocks, one per test point

end
--------------------------------------------------------------------------------
/subprograms/rest_log_likelihood.m:
--------------------------------------------------------------------------------
function [lml] = rest_log_likelihood(x, model)

% This function calculates the restricted log-likelihood for a given set
% of hyperparameters.

n1=model.m*(model.m-1)/2;
n2=model.m*model.d;

k=1;

for i=1:model.m
    for j=i+1:model.m
        z0(i,j)=x(k);
        k=k+1;
    end
end

if model.m==1
    z0=[];
end

z0=[z0;zeros(1,model.m)];
z0=z0+z0'+diag(ones(model.m,1));

[V, D]=eig(z0);
model.hyper.A=V*(D^0.5)*V';
teta=x(n1+1:end);
teta=reshape(teta,model.m,model.d);
y=model.ymn;
f=model.fn;

model.hyper.teta=teta;
model.z_mn=define_covmatrix(model,model.X,model.X); % update the covariance matrix model.z_mn
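% find_beta returns the generalized least squares estimate of the trend
% coefficients, beta = (F' K^-1 F)^-1 F' K^-1 y, for the current covariance.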
model.beta=find_beta(model); % update the beta values

%% Determine the restricted likelihood function
zmnchol = poschol(model.z_mn);
alpha1=y-f*model.beta;
alpha2 = (zmnchol\(zmnchol'\alpha1));
sigma2=alpha1'*alpha2;
alpha3=(zmnchol\(zmnchol'\f));
alpha4=f'*alpha3;

% RMLE
lml=0.5*(sigma2)+0.5*log(det(model.z_mn))+0.5*log(det(alpha4));
% MLE (alternative):
% lml=0.5*(sigma2)+0.5*log(det(model.z_mn));
end
--------------------------------------------------------------------------------
/subprograms/sq_dist.m:
--------------------------------------------------------------------------------
% sq_dist - a function to compute a matrix of all pairwise squared distances
% between two sets of vectors, stored in the columns of the two matrices, a
% (of size D by n) and b (of size D by m). If only a single argument is given
% or the second matrix is empty, the missing matrix is taken to be identical
% to the first.
%
% Usage: C = sq_dist(a, b)
%    or: C = sq_dist(a)  or equiv.: C = sq_dist(a, [])
%
% Where a is of size Dxn, b is of size Dxm (or empty), C is of size nxm.
%
% Copyright (c) by Carl Edward Rasmussen and Hannes Nickisch, 2010-12-13.

function C = sq_dist(a, b)

if nargin<1 || nargin>3 || nargout>1, error('Wrong number of arguments.'); end
bsx = exist('bsxfun','builtin');    % since Matlab R2007a 7.4.0 and Octave 3.0
if ~bsx, bsx = exist('bsxfun'); end % bsxfun is not yet "builtin" in Octave
[D, n] = size(a);

% Computation of a^2 - 2*a*b + b^2 is less stable than (a-b)^2 because numerical
% precision can be lost when both a and b have very large absolute value and the
% same sign. For that reason, we subtract the mean from the data beforehand to
% stabilise the computations. This is OK because the squared error is
% independent of the mean.
if nargin==1                        % subtract mean
  mu = mean(a,2);
  if bsx
    a = bsxfun(@minus,a,mu);
  else
    a = a - repmat(mu,1,size(a,2));
  end
  b = a; m = n;
else
  [d, m] = size(b);
  if d ~= D, error('Error: column lengths must agree.'); end
  mu = (m/(n+m))*mean(b,2) + (n/(n+m))*mean(a,2);
  if bsx
    a = bsxfun(@minus,a,mu); b = bsxfun(@minus,b,mu);
  else
    a = a - repmat(mu,1,n);  b = b - repmat(mu,1,m);
  end
end

if bsx                              % compute squared distances
  C = bsxfun(@plus,sum(a.*a,1)',bsxfun(@minus,sum(b.*b,1),2*a'*b));
else
  C = repmat(sum(a.*a,1)',1,m) + repmat(sum(b.*b,1),n,1) - 2*a'*b;
end
C = max(C,0); % numerical noise can cause C to be slightly negative, i.e. C > -1e-14
--------------------------------------------------------------------------------
/subprograms/trend_fun.m:
--------------------------------------------------------------------------------
function [f, p]=trend_fun(model,X)

% This file was written by Mohammadkazem Sadoughi at Iowa State
% University (4/05/2017). It defines different types of trend functions
% based on the degree of the polynomial.
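% Returns the (m*n)-by-p regression matrix f, in which each sample
% contributes m identical rows (one per response), together with p, the
% number of basis functions: degree 0 -> p = 1, degree 1 -> p = 1 + d,
% degree 2 -> p = 1 + d*(d+3)/2.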

n=size(X,1);
f=[];

switch model.degree

    case 0 % f(x) = 1
        p=1;
        for i=1:n
            for j=1:model.m
                supplement=[1];
                f=[f;supplement];
            end
        end

    case 1 % f(x) = 1, x
        p=1+model.d;
        for i=1:n
            for j=1:model.m
                supplement=[1,X(i,:)];
                f=[f;supplement];
            end
        end

    case 2 % f(x) = 1, x, x^2 (including all quadratic cross terms)
        p=1+model.d*(model.d+3)/2;
        for i=1:n
            for j=1:model.m
                supplement=[1,X(i,:)];
                for k=1:model.d
                    for l=k:model.d
                        supplement=[supplement,X(i,k)*X(i,l)];
                    end
                end
                f=[f;supplement];
            end
        end

    % Higher-degree bases (e.g. 1, x, y, xy, x^2, y^2, x^2*y, x*y^2, x^3, y^3)
    % are not implemented.
    otherwise
        error('Trend function of degree %d is not implemented.', model.degree)
end

end
--------------------------------------------------------------------------------
/subprograms/tune_hyper.m:
--------------------------------------------------------------------------------
function model=tune_hyper(model)

% This file was written by Mohammadkazem Sadoughi at Iowa State
% University (4/05/2017). It tunes the hyperparameters used in the
% covariance matrix of the multivariate Gaussian process.

model = max_restr_likelihood(model); % use the (restricted) MLE technique to tune the matrix A and the hyperparameters teta

model.z_mn=define_covmatrix(model,model.X,model.X); % define the covariance matrix between training points

model.beta=find_beta(model); % define the trend function coefficients
end
--------------------------------------------------------------------------------