├── .project ├── .pydevproject ├── LICENSE ├── README.md ├── alg1.png ├── eq1.png ├── examples ├── __init__.py ├── __init__.pyc ├── compare_gaussian_methods.py ├── compare_linsvm_methods.py ├── compare_rbfsvm_methods.py ├── example.py ├── plotutils.py └── plotutils.pyc ├── frameworks ├── CPLELearning.py ├── SelfLearning.py └── __init__.py ├── methods ├── __init__.py ├── qns3vm.py ├── scikitTSVM.py └── scikitWQDA.py ├── qdaexample - Copy.png ├── qdaexample.png ├── setup.py ├── svmexample1.png └── svmexample2.png /.project: -------------------------------------------------------------------------------- 1 | 2 | 3 | semisup-learn 4 | 5 | 6 | 7 | 8 | 9 | org.python.pydev.PyDevBuilder 10 | 11 | 12 | 13 | 14 | 15 | org.python.pydev.pythonNature 16 | 17 | 18 | -------------------------------------------------------------------------------- /.pydevproject: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | /${PROJECT_DIR_NAME} 5 | 6 | python 2.7 7 | Default 8 | 9 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2015 Tamas Madl 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | 23 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Semi-supervised learning frameworks for Python 2 | =============== 3 | 4 | This project contains Python implementations for semi-supervised 5 | learning, made compatible with scikit-learn, including 6 | 7 | - **Contrastive Pessimistic Likelihood Estimation (CPLE)** (based on - but not equivalent to - [Loog, 2015](http://arxiv.org/abs/1503.00269)), a `safe' framework applicable for all classifiers which can yield prediction probabilities 8 | (safe here means that the model trained on both labelled and unlabelled data should not be worse than models trained only on the labelled data) 9 | 10 | - Self learning (self training), a naive semi-supervised learning framework applicable for any classifier (iteratively labelling the unlabelled instances using a trained classifier, and then re-training it on the resulting dataset - see e.g. 
http://pages.cs.wisc.edu/~jerryzhu/pub/sslicml07.pdf ) 11 | 12 | - Semi-Supervised Support Vector Machine (S3VM) - a simple scikit-learn compatible wrapper for the QN-S3VM code developed by 13 | Fabian Gieseke, Antti Airola, Tapio Pahikkala, Oliver Kramer (see http://www.fabiangieseke.de/index.php/code/qns3vm ) 14 | This method was included for comparison 15 | 16 | The first method is a novel extension of [Loog, 2015](http://arxiv.org/abs/1503.00269) for any discriminative classifier (the differences to the original CPLE are explained below). The last two methods are only included for comparison. 17 | 18 | 19 | The advantages of the CPLE framework compared to other semi-supervised learning approaches include 20 | 21 | - it is a **generally applicable framework (works with scikit-learn classifiers which allow per-sample weights)** 22 | 23 | - it needs low memory (as opposed to e.g. Label Spreading which needs O(n^2)), and 24 | 25 | - it makes no additional assumptions except for the ones made by the choice of classifier 26 | 27 | The main disadvantage is high computational complexity. Note: **this is an early stage research project, and work in progress** (it is by no means efficient or well tested)! 28 | 29 | If you need faster results, try the Self Learning framework (which is a naive approach but much faster): 30 | 31 | ```python 32 | from frameworks.SelfLearning import * 33 | 34 | any_scikitlearn_classifier = SVC() 35 | ssmodel = SelfLearningModel(any_scikitlearn_classifier) 36 | ssmodel.fit(X, y) 37 | ``` 38 | 39 | Usage 40 | =============== 41 | 42 | The project requires [scikit-learn](http://scikit-learn.org/stable/install.html), [matplotlib](http://matplotlib.org/users/installing.html) and [NLopt](http://ab-initio.mit.edu/wiki/index.php/NLopt_Installation) to run. 43 | 44 | Usage example: 45 | 46 | ```python 47 | # load `Lung cancer' dataset from mldata.org 48 | cancer = fetch_mldata("Lung cancer (Ontario)") 49 | X = cancer.target.T 50 | ytrue = np.copy(cancer.data).flatten() 51 | ytrue[ytrue>0]=1 52 | 53 | # label a few points 54 | labeled_N = 4 55 | ys = np.array([-1]*len(ytrue)) # -1 denotes unlabeled point 56 | random_labeled_points = random.sample(np.where(ytrue == 0)[0], labeled_N/2)+\ 57 | random.sample(np.where(ytrue == 1)[0], labeled_N/2) 58 | ys[random_labeled_points] = ytrue[random_labeled_points] 59 | 60 | # supervised score 61 | basemodel = SGDClassifier(loss='log', penalty='l1') # scikit logistic regression 62 | basemodel.fit(X[random_labeled_points, :], ys[random_labeled_points]) 63 | print "supervised log.reg. score", basemodel.score(X, ytrue) 64 | 65 | # fast (but naive, unsafe) self learning framework 66 | ssmodel = SelfLearningModel(basemodel) 67 | ssmodel.fit(X, ys) 68 | print "self-learning log.reg. score", ssmodel.score(X, ytrue) 69 | 70 | # semi-supervised score (base model has to be able to take weighted samples) 71 | ssmodel = CPLELearningModel(basemodel) 72 | ssmodel.fit(X, ys) 73 | print "CPLE semi-supervised log.reg. score", ssmodel.score(X, ytrue) 74 | 75 | # semi-supervised score, RBF SVM model 76 | ssmodel = CPLELearningModel(sklearn.svm.SVC(kernel="rbf", probability=True), predict_from_probabilities=True) # RBF SVM 77 | ssmodel.fit(X, ys) 78 | print "CPLE semi-supervised RBF SVM score", ssmodel.score(X, ytrue) 79 | 80 | # supervised log.reg. score 0.410256410256 81 | # self-learning log.reg. score 0.461538461538 82 | # semi-supervised log.reg. 
score 0.615384615385 83 | # semi-supervised RBF SVM score 0.769230769231 84 | ``` 85 | 86 | 87 | Examples 88 | =============== 89 | 90 | Two-class classification examples with 56 unlabelled (small circles in the plot) and 4 labelled (large circles in the plot) data points. 91 | Plot titles show classification accuracies (percentage of data points correctly classified by the model). 92 | 93 | In the second example, **the state-of-the-art S3VM performs worse than the purely supervised SVM**, while the CPLE SVM (by means of the 94 | pessimistic assumption) provides increased accuracy. 95 | 96 | Quadratic Discriminant Analysis (from left to right: supervised QDA, Self learning QDA, pessimistic CPLE QDA) 97 | ![Comparison of supervised QDA with CPLE QDA](qdaexample.png) 98 | 99 | Support Vector Machine (from left to right: supervised SVM, S3VM [(Gieseke et al., 2012)](http://www.sciencedirect.com/science/article/pii/S0925231213003706), pessimistic CPLE SVM) 100 | ![Comparison of supervised SVM, S3VM, and CPLE SVM](svmexample1.png) 101 | 102 | Support Vector Machine (from left to right: supervised SVM, S3VM [(Gieseke et al., 2012)](http://www.sciencedirect.com/science/article/pii/S0925231213003706), pessimistic CPLE SVM) 103 | ![Comparison of supervised SVM, S3VM, and CPLE SVM](svmexample2.png) 104 | 105 | Motivation 106 | =============== 107 | 108 | Current semi-supervised learning approaches require strong assumptions, and perform poorly if those 109 | assumptions are violated (e.g. the low density assumption or the clustering assumption). In some cases, they can perform worse than a supervised classifier trained only on the labeled examples. Furthermore, the vast majority require O(N^2) memory. 110 | 111 | [(Loog, 2015)](http://arxiv.org/abs/1503.00269) has suggested an elegant framework (called Contrastive Pessimistic Likelihood Estimation / CPLE) which 112 | **only uses assumptions intrinsic to the chosen classifier**, and thus allows choosing likelihood-based classifiers which fit the domain / data 113 | distribution at hand, and can work even if some of the assumptions mentioned above are violated. The idea is to pessimistically assign soft labels 114 | to the unlabelled data, such that the improvement over the supervised version is minimal (i.e. assume the worst case for the unknown labels). 115 | 116 | The parameters in CPLE can be estimated according to: 117 | ![CPLE Equation](eq1.png) 118 | 119 | The original CPLE framework is only applicable to likelihood-based classifiers, and (Loog, 2015) only provides solutions for Linear Discriminant Analysis and the Nearest Mean Classifier. 120 | 121 | The CPLE implementation in this project 122 | =============== 123 | 124 | Building on this idea, this project contains a general semi-supervised learning framework which allows plugging in **any classifier** which 1) supports instance weighting and 2) can generate probability 125 | estimates (such probability estimates can also be provided by [Platt scaling](https://en.wikipedia.org/wiki/Platt_scaling) for classifiers which don't support them; an experimental feature 126 | is also included to make the approach work with classifiers which do not support instance weighting). 127 | 128 | In order to make the approach work with any classifier, the discriminative likelihood (DL) is used instead of the generative likelihood, which is the first major difference to (Loog, 2015). The second 129 | difference is that only the unlabelled data is included in the first term of the minimization objective (point 2.
below), which leads to pessimistic minimization of the DL over the unlabelled data, but maximization 130 | of the DL over the labelled data. (Note that the DL is equivalent to the negative log loss for binary classifiers with probabilistic predictions - see below.) 131 | 132 | ![CPLE Equation](alg1.png) 133 | 134 | The resulting semi-supervised learning framework is highly computationally expensive, but has the advantages of being a generally applicable framework, needing low memory, and making no additional assumptions except for the ones made by the choice of classifier 135 | -------------------------------------------------------------------------------- /alg1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/alg1.png -------------------------------------------------------------------------------- /eq1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/eq1.png -------------------------------------------------------------------------------- /examples/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/examples/__init__.py -------------------------------------------------------------------------------- /examples/__init__.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/examples/__init__.pyc -------------------------------------------------------------------------------- /examples/compare_gaussian_methods.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | from frameworks.CPLELearning import CPLELearningModel 4 | from frameworks.SelfLearning import SelfLearningModel 5 | from methods.scikitWQDA import WQDA 6 | from examples.plotutils import evaluate_and_plot 7 | 8 | # number of data points 9 | N = 60 10 | supevised_data_points = 4 11 | 12 | # generate data 13 | meandistance = 1 14 | 15 | s = np.random.random() 16 | cov = [[s, 0], [0, s]] 17 | Xs = np.random.multivariate_normal([-s*meandistance, -s*meandistance], cov, (N,)) 18 | Xs = np.vstack(( Xs, np.random.multivariate_normal([s*meandistance, s*meandistance], cov, (N,)) )) 19 | ytrue = np.array([0]*N + [1]*N) 20 | 21 | ys = np.array([-1]*(2*N)) 22 | for i in range(supevised_data_points/2): 23 | ys[np.random.randint(0, N)] = 0 24 | for i in range(supevised_data_points/2): 25 | ys[np.random.randint(N, 2*N)] = 1 26 | 27 | Xsupervised = Xs[ys!=-1, :] 28 | ysupervised = ys[ys!=-1] 29 | 30 | # compare models 31 | 32 | lbl = "Purely supervised QDA:" 33 | print(lbl) 34 | model = WQDA() 35 | model.fit(Xsupervised, ysupervised) 36 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 1) 37 | 38 | lbl = "SelfLearning QDA:" 39 | print(lbl) 40 | model = SelfLearningModel(WQDA()) 41 | model.fit(Xs, ys) 42 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 2) 43 | 44 | lbl = "CPLE(pessimistic) QDA:" 45 | print(lbl) 46 | model = CPLELearningModel(WQDA(), predict_from_probabilities=True) 47 | model.fit(Xs, ys) 48 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 3) 49 | 50 | lbl = "CPLE(optimistic) QDA:" 51 | print(lbl) 52 | CPLELearningModel.pessimistic = False 53 | model = 
CPLELearningModel(WQDA(), predict_from_probabilities=True) 54 | model.fit(Xs, ys) 55 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 4, block=True) 56 | -------------------------------------------------------------------------------- /examples/compare_linsvm_methods.py: -------------------------------------------------------------------------------- 1 | import sklearn.svm 2 | import numpy as np 3 | import random 4 | 5 | from frameworks.CPLELearning import CPLELearningModel 6 | from methods import scikitTSVM 7 | from examples.plotutils import evaluate_and_plot 8 | 9 | kernel = "linear" 10 | 11 | # number of data points 12 | N = 60 13 | supevised_data_points = 2 14 | noise_probability = 0.1 15 | 16 | # generate data- 17 | cov = [[0.5, 0], [0, 0.5]] 18 | Xs = np.random.multivariate_normal([0.5,0.5], cov, (N,)) 19 | ytrue = [] 20 | for i in range(N): 21 | if np.random.random() < noise_probability: 22 | ytrue.append(np.random.randint(2)) 23 | else: 24 | ytrue.append(1 if np.sum(Xs[i])>1 else 0) 25 | Xs = np.array(Xs) 26 | ytrue = np.array(ytrue).astype(int) 27 | 28 | ys = np.array([-1]*N) 29 | sidx = random.sample(np.where(ytrue == 0)[0], supevised_data_points/2)+random.sample(np.where(ytrue == 1)[0], supevised_data_points/2) 30 | ys[sidx] = ytrue[sidx] 31 | 32 | Xsupervised = Xs[ys!=-1, :] 33 | ysupervised = ys[ys!=-1] 34 | 35 | # compare models 36 | lbl = "Purely supervised SVM:" 37 | print(lbl) 38 | model = sklearn.svm.SVC(kernel=kernel, probability=True) 39 | model.fit(Xsupervised, ysupervised) 40 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 1) 41 | 42 | lbl = "S3VM (Gieseke et al. 2012):" 43 | print(lbl) 44 | model = scikitTSVM.SKTSVM(kernel=kernel) 45 | model.fit(Xs, ys.astype(int)) 46 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 2) 47 | 48 | lbl = "CPLE(pessimistic) SVM:" 49 | print(lbl) 50 | model = CPLELearningModel(sklearn.svm.SVC(kernel=kernel, probability=True), predict_from_probabilities=True) 51 | model.fit(Xs, ys.astype(int)) 52 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 3) 53 | 54 | lbl = "CPLE(optimistic) SVM:" 55 | print(lbl) 56 | CPLELearningModel.pessimistic = False 57 | model = CPLELearningModel(sklearn.svm.SVC(kernel=kernel, probability=True), predict_from_probabilities=True) 58 | model.fit(Xs, ys.astype(int)) 59 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 4, block=True) 60 | -------------------------------------------------------------------------------- /examples/compare_rbfsvm_methods.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import random 3 | import sklearn.svm 4 | 5 | from frameworks.CPLELearning import CPLELearningModel 6 | from methods import scikitTSVM 7 | from examples.plotutils import evaluate_and_plot 8 | 9 | kernel = "rbf" 10 | 11 | # number of data points 12 | N = 60 13 | supevised_data_points = 4 14 | 15 | # generate data 16 | meandistance = 2 17 | 18 | s = np.random.random() 19 | cov = [[s, 0], [0, s]] 20 | # some random Gaussians 21 | gaussians = 6 #np.random.randint(4, 7) 22 | Xs = np.random.multivariate_normal([np.random.random()*meandistance, np.random.random()*meandistance], cov, (N/gaussians,)) 23 | for i in range(gaussians-1): 24 | Xs = np.vstack(( Xs, np.random.multivariate_normal([np.random.random()*meandistance, np.random.random()*meandistance], cov, (N/gaussians,)) )) 25 | 26 | # cut data into XOR 27 | ytrue = ((Xs[:, 0] < np.mean(Xs[:, 0]))*(Xs[:, 1] < np.mean(Xs[:, 1])) + (Xs[:, 0] > np.mean(Xs[:, 0]))*(Xs[:, 1] > np.mean(Xs[:, 1])))*1 28 | 29 | ys = np.array([-1]*N) 
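# -1 marks a point as unlabelled; the statements below draw supevised_data_points/2 points per class at random (Python 2 integer division) and reveal their true labels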
30 | sidx = random.sample(np.where(ytrue == 0)[0], supevised_data_points/2)+random.sample(np.where(ytrue == 1)[0], supevised_data_points/2) 31 | ys[sidx] = ytrue[sidx] 32 | 33 | Xsupervised = Xs[ys!=-1, :] 34 | ysupervised = ys[ys!=-1] 35 | 36 | # compare models 37 | lbl = "Purely supervised SVM:" 38 | print(lbl) 39 | model = sklearn.svm.SVC(kernel=kernel, probability=True) 40 | model.fit(Xsupervised, ysupervised) 41 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 1) 42 | 43 | 44 | lbl = "S3VM (Gieseke et al. 2012):" 45 | print(lbl) 46 | model = scikitTSVM.SKTSVM(kernel=kernel) 47 | model.fit(Xs, ys) 48 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 2) 49 | 50 | 51 | lbl = "CPLE(pessimistic) SVM:" 52 | print(lbl) 53 | model = CPLELearningModel(sklearn.svm.SVC(kernel=kernel, probability=True), predict_from_probabilities=True) 54 | model.fit(Xs, ys) 55 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 3) 56 | 57 | 58 | lbl = "CPLE(optimistic) SVM:" 59 | print(lbl) 60 | CPLELearningModel.pessimistic = False 61 | model = CPLELearningModel(sklearn.svm.SVC(kernel=kernel, probability=True), predict_from_probabilities=True) 62 | model.fit(Xs, ys) 63 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 4, block=True) 64 | -------------------------------------------------------------------------------- /examples/example.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import random 3 | from frameworks.CPLELearning import CPLELearningModel 4 | from sklearn.datasets.mldata import fetch_mldata 5 | from sklearn.linear_model.stochastic_gradient import SGDClassifier 6 | import sklearn.svm 7 | from methods.scikitWQDA import WQDA 8 | from frameworks.SelfLearning import SelfLearningModel 9 | 10 | # load data 11 | cancer = fetch_mldata("Lung cancer (Ontario)") 12 | X = cancer.target.T 13 | ytrue = np.copy(cancer.data).flatten() 14 | ytrue[ytrue>0]=1 15 | 16 | # label a few points 17 | labeled_N = 4 18 | ys = np.array([-1]*len(ytrue)) # -1 denotes unlabeled point 19 | random_labeled_points = random.sample(np.where(ytrue == 0)[0], labeled_N/2)+\ 20 | random.sample(np.where(ytrue == 1)[0], labeled_N/2) 21 | ys[random_labeled_points] = ytrue[random_labeled_points] 22 | 23 | # supervised score 24 | #basemodel = WQDA() # weighted Quadratic Discriminant Analysis 25 | basemodel = SGDClassifier(loss='log', penalty='l1') # scikit logistic regression 26 | basemodel.fit(X[random_labeled_points, :], ys[random_labeled_points]) 27 | print "supervised log.reg. score", basemodel.score(X, ytrue) 28 | 29 | # fast (but naive, unsafe) self learning framework 30 | ssmodel = SelfLearningModel(basemodel) 31 | ssmodel.fit(X, ys) 32 | print "self-learning log.reg. score", ssmodel.score(X, ytrue) 33 | 34 | # semi-supervised score (base model has to be able to take weighted samples) 35 | ssmodel = CPLELearningModel(basemodel) 36 | ssmodel.fit(X, ys) 37 | print "CPLE semi-supervised log.reg. 
score", ssmodel.score(X, ytrue) 38 | 39 | # semi-supervised score, WQDA model 40 | ssmodel = CPLELearningModel(WQDA(), predict_from_probabilities=True) # weighted Quadratic Discriminant Analysis 41 | ssmodel.fit(X, ys) 42 | print "CPLE semi-supervised WQDA score", ssmodel.score(X, ytrue) 43 | 44 | # semi-supervised score, RBF SVM model 45 | ssmodel = CPLELearningModel(sklearn.svm.SVC(kernel="rbf", probability=True), predict_from_probabilities=True) # RBF SVM 46 | ssmodel.fit(X, ys) 47 | print "CPLE semi-supervised RBF SVM score", ssmodel.score(X, ytrue) 48 | -------------------------------------------------------------------------------- /examples/plotutils.py: -------------------------------------------------------------------------------- 1 | import matplotlib.pyplot as plt 2 | import numpy as np 3 | 4 | cols = [np.array([1,0,0]),np.array([0,1,0])] # colors 5 | 6 | def evaluate_and_plot(model, Xs, ys, ytrue, lbl, subplot = None, block=False): 7 | if subplot != None: 8 | plt.subplot(2,2,subplot) 9 | 10 | # predict, and evaluate 11 | pred = model.predict(Xs) 12 | 13 | acc = np.mean(pred==ytrue) 14 | print "accuracy:", round(acc, 3) 15 | 16 | # plot probabilities 17 | [minx, maxx] = [np.min(Xs[:, 0]), np.max(Xs[:, 0])] 18 | [miny, maxy] = [np.min(Xs[:, 1]), np.max(Xs[:, 1])] 19 | gridsize = 100 20 | xx = np.linspace(minx, maxx, gridsize) 21 | yy = np.linspace(miny, maxy, gridsize).T 22 | xx, yy = np.meshgrid(xx, yy) 23 | Xfull = np.c_[xx.ravel(), yy.ravel()] 24 | probas = model.predict_proba(Xfull) 25 | plt.imshow(probas[:, 1].reshape((gridsize, gridsize)), extent=(minx, maxx, miny, maxy), origin='lower') 26 | 27 | # plot decision boundary 28 | try: 29 | if hasattr(model, 'predict_from_probabilities') and model.predict_from_probabilities: 30 | plt.contour((probas[:, 0]-1)*300+100, linewidth=1, edgecolor=[cols[p]*P[p] for p in model.predict(Xs).astype(int)], cmap='hot') 39 | plt.scatter(Xs[ys>-1, 0], Xs[ys>-1,1], c=ytrue[ys>-1], s=300, linewidth=1, edgecolor=[cols[p]*P[p] for p in model.predict(Xs).astype(int)], cmap='hot') 40 | plt.title(lbl + str(round(acc, 2))) 41 | 42 | plt.show(block=block) -------------------------------------------------------------------------------- /examples/plotutils.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/examples/plotutils.pyc -------------------------------------------------------------------------------- /frameworks/CPLELearning.py: -------------------------------------------------------------------------------- 1 | class Unbuffered(object): 2 | def __init__(self, stream): 3 | self.stream = stream 4 | def write(self, data): 5 | self.stream.write(data) 6 | self.stream.flush() 7 | def __getattr__(self, attr): 8 | return getattr(self.stream, attr) 9 | 10 | import sys 11 | sys.stdout = Unbuffered(sys.stdout) 12 | 13 | from sklearn.base import BaseEstimator 14 | import numpy 15 | import sklearn.metrics 16 | from sklearn.linear_model import LogisticRegression as LR 17 | import nlopt 18 | import scipy.stats 19 | 20 | class CPLELearningModel(BaseEstimator): 21 | """ 22 | Contrastive Pessimistic Likelihood Estimation framework for semi-supervised 23 | learning, based on (Loog, 2015). 
This implementation contains two 24 | significant differences to (Loog, 2015): 25 | - the discriminative likelihood p(y|X), instead of the generative 26 | likelihood p(X), is used for optimization 27 | - apart from `pessimism' (the assumption that the true labels of the 28 | unlabeled instances are as adversarial to the likelihood as possible), the 29 | optimization objective also tries to increase the likelihood on the labeled 30 | examples 31 | 32 | This class takes a base model (any scikit learn estimator), 33 | trains it on the labeled examples, and then uses global optimization to 34 | find (soft) label hypotheses for the unlabeled examples in a pessimistic 35 | fashion (such that the model log likelihood on the unlabeled data is as 36 | small as possible, but the log likelihood on the labeled data is as high 37 | as possible) 38 | 39 | See Loog, Marco. "Contrastive Pessimistic Likelihood Estimation for 40 | Semi-Supervised Classification." arXiv preprint arXiv:1503.00269 (2015). 41 | http://arxiv.org/pdf/1503.00269 42 | 43 | Attributes 44 | ---------- 45 | basemodel : BaseEstimator instance 46 | Base classifier to be trained on the partially supervised data 47 | 48 | pessimistic : boolean, optional (default=True) 49 | Whether the label hypotheses for the unlabeled instances should be 50 | pessimistic (i.e. minimize log likelihood) or optimistic (i.e. 51 | maximize log likelihood). 52 | Pessimistic label hypotheses ensure safety (i.e. the semi-supervised 53 | solution will not be worse than a model trained on the purely 54 | supervised instances) 55 | 56 | predict_from_probabilities : boolean, optional (default=False) 57 | The prediction is calculated from the probabilities if this is True 58 | (1 if more likely than the mean predicted probability or 0 otherwise). 59 | If it is false, the normal base model predictions are used. 60 | This only affects the predict function. Warning: only set to true if 61 | predict will be called with a substantial number of data points 62 | 63 | use_sample_weighting : boolean, optional (default=True) 64 | Whether to use sample weights (soft labels) for the unlabeled instances. 65 | Setting this to False allows the use of base classifiers which do not 66 | support sample weights (but might slow down the optimization) 67 | 68 | max_iter : int, optional (default=3000) 69 | Maximum number of iterations 70 | 71 | verbose : int, optional (default=1) 72 | Enable verbose output (1 shows progress, 2 shows the detailed log 73 | likelihood at every iteration). 
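Example
-------
Minimal usage, mirroring the README example (-1 in ys marks an unlabelled point):

    from sklearn.linear_model import SGDClassifier
    basemodel = SGDClassifier(loss='log', penalty='l1')  # any scikit-learn classifier supporting sample weights
    ssmodel = CPLELearningModel(basemodel)
    ssmodel.fit(X, ys)
    ssmodel.predict(X)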
74 | 75 | """ 76 | 77 | def __init__(self, basemodel, pessimistic=True, predict_from_probabilities = False, use_sample_weighting = True, max_iter=3000, verbose = 1): 78 | self.model = basemodel 79 | self.pessimistic = pessimistic 80 | self.predict_from_probabilities = predict_from_probabilities 81 | self.use_sample_weighting = use_sample_weighting 82 | self.max_iter = max_iter 83 | self.verbose = verbose 84 | 85 | self.it = 0 # iteration counter 86 | self.noimprovementsince = 0 # log likelihood hasn't improved since this number of iterations 87 | self.maxnoimprovementsince = 3 # threshold for iterations without improvements (convergence is assumed when this is reached) 88 | 89 | self.buffersize = 200 90 | # buffer for the last few discriminative likelihoods (used to check for convergence) 91 | self.lastdls = [0]*self.buffersize 92 | 93 | # best discriminative likelihood and corresponding soft labels; updated during training 94 | self.bestdl = numpy.infty 95 | self.bestlbls = [] 96 | 97 | # unique id 98 | self.id = str(unichr(numpy.random.randint(26)+97))+str(unichr(numpy.random.randint(26)+97)) 99 | 100 | def discriminative_likelihood(self, model, labeledData, labeledy = None, unlabeledData = None, unlabeledWeights = None, unlabeledlambda = 1, gradient=[], alpha = 0.01): 101 | unlabeledy = (unlabeledWeights[:, 0]<0.5)*1 102 | uweights = numpy.copy(unlabeledWeights[:, 0]) # large prob. for k=0 instances, small prob. for k=1 instances 103 | uweights[unlabeledy==1] = 1-uweights[unlabeledy==1] # subtract from 1 for k=1 instances to reflect confidence 104 | weights = numpy.hstack((numpy.ones(len(labeledy)), uweights)) 105 | labels = numpy.hstack((labeledy, unlabeledy)) 106 | 107 | # fit model on supervised data 108 | if self.use_sample_weighting: 109 | model.fit(numpy.vstack((labeledData, unlabeledData)), labels, sample_weight=weights) 110 | else: 111 | model.fit(numpy.vstack((labeledData, unlabeledData)), labels) 112 | 113 | # probability of labeled data 114 | P = model.predict_proba(labeledData) 115 | 116 | try: 117 | # labeled discriminative log likelihood 118 | labeledDL = -sklearn.metrics.log_loss(labeledy, P) 119 | except Exception as e: 120 | print(e) 121 | P = model.predict_proba(labeledData) 122 | 123 | # probability of unlabeled data 124 | unlabeledP = model.predict_proba(unlabeledData) 125 | 126 | try: 127 | # unlabeled discriminative log likelihood 128 | eps = 1e-15 129 | unlabeledP = numpy.clip(unlabeledP, eps, 1 - eps) 130 | unlabeledDL = numpy.average((unlabeledWeights*numpy.vstack((1-unlabeledy, unlabeledy)).T*numpy.log(unlabeledP)).sum(axis=1)) 131 | except Exception as e: 132 | print(e) 133 | unlabeledP = model.predict_proba(unlabeledData) 134 | 135 | if self.pessimistic: 136 | # pessimistic: minimize the difference between unlabeled and labeled discriminative likelihood (assume worst case for unknown true labels) 137 | dl = unlabeledlambda * unlabeledDL - labeledDL 138 | else: 139 | # optimistic: minimize negative total discriminative likelihood (i.e. 
maximize likelihood) 140 | dl = - unlabeledlambda * unlabeledDL - labeledDL 141 | 142 | return dl 143 | 144 | def discriminative_likelihood_objective(self, model, labeledData, labeledy = None, unlabeledData = None, unlabeledWeights = None, unlabeledlambda = 1, gradient=[], alpha = 0.01): 145 | if self.it == 0: 146 | self.lastdls = [0]*self.buffersize 147 | 148 | dl = self.discriminative_likelihood(model, labeledData, labeledy, unlabeledData, unlabeledWeights, unlabeledlambda, gradient, alpha) 149 | 150 | self.it += 1 151 | self.lastdls[numpy.mod(self.it, len(self.lastdls))] = dl 152 | 153 | if numpy.mod(self.it, self.buffersize) == 0: # or True: 154 | improvement = numpy.mean((self.lastdls[(len(self.lastdls)/2):])) - numpy.mean((self.lastdls[:(len(self.lastdls)/2)])) 155 | # ttest - test for hypothesis that the likelihoods have not changed (i.e. there has been no improvement, and we are close to convergence) 156 | _, prob = scipy.stats.ttest_ind(self.lastdls[(len(self.lastdls)/2):], self.lastdls[:(len(self.lastdls)/2)]) 157 | 158 | # if improvement is not certain accoring to t-test... 159 | noimprovement = prob > 0.1 and numpy.mean(self.lastdls[(len(self.lastdls)/2):]) < numpy.mean(self.lastdls[:(len(self.lastdls)/2)]) 160 | if noimprovement: 161 | self.noimprovementsince += 1 162 | if self.noimprovementsince >= self.maxnoimprovementsince: 163 | # no improvement since a while - converged; exit 164 | self.noimprovementsince = 0 165 | raise Exception(" converged.") # we need to raise an exception to get NLopt to stop before exceeding the iteration budget 166 | else: 167 | self.noimprovementsince = 0 168 | 169 | if self.verbose == 2: 170 | print(self.id,self.it, dl, numpy.mean(self.lastdls), improvement, round(prob, 3), (prob < 0.1)) 171 | elif self.verbose: 172 | sys.stdout.write(('.' if self.pessimistic else '.') if not noimprovement else 'n') 173 | 174 | if dl < self.bestdl: 175 | self.bestdl = dl 176 | self.bestlbls = numpy.copy(unlabeledWeights[:, 0]) 177 | 178 | return dl 179 | 180 | def fit(self, X, y): # -1 for unlabeled 181 | unlabeledX = X[y==-1, :] 182 | labeledX = X[y!=-1, :] 183 | labeledy = y[y!=-1] 184 | 185 | M = unlabeledX.shape[0] 186 | 187 | # train on labeled data 188 | self.model.fit(labeledX, labeledy) 189 | 190 | unlabeledy = self.predict(unlabeledX) 191 | 192 | #re-train, labeling unlabeled instances pessimistically 193 | 194 | # pessimistic soft labels ('weights') q for unlabelled points, q=P(k=0|Xu) 195 | f = lambda softlabels, grad=[]: self.discriminative_likelihood_objective(self.model, labeledX, labeledy=labeledy, unlabeledData=unlabeledX, unlabeledWeights=numpy.vstack((softlabels, 1-softlabels)).T, gradient=grad) #- supLL 196 | lblinit = numpy.random.random(len(unlabeledy)) 197 | 198 | try: 199 | self.it = 0 200 | opt = nlopt.opt(nlopt.GN_DIRECT_L_RAND, M) 201 | opt.set_lower_bounds(numpy.zeros(M)) 202 | opt.set_upper_bounds(numpy.ones(M)) 203 | opt.set_min_objective(f) 204 | opt.set_maxeval(self.max_iter) 205 | self.bestsoftlbl = opt.optimize(lblinit) 206 | print(" max_iter exceeded.") 207 | except Exception as e: 208 | print(e) 209 | self.bestsoftlbl = self.bestlbls 210 | 211 | if numpy.any(self.bestsoftlbl != self.bestlbls): 212 | self.bestsoftlbl = self.bestlbls 213 | ll = f(self.bestsoftlbl) 214 | 215 | unlabeledy = (self.bestsoftlbl<0.5)*1 216 | uweights = numpy.copy(self.bestsoftlbl) # large prob. for k=0 instances, small prob. 
for k=1 instances 217 | uweights[unlabeledy==1] = 1-uweights[unlabeledy==1] # subtract from 1 for k=1 instances to reflect confidence 218 | weights = numpy.hstack((numpy.ones(len(labeledy)), uweights)) 219 | labels = numpy.hstack((labeledy, unlabeledy)) 220 | if self.use_sample_weighting: 221 | self.model.fit(numpy.vstack((labeledX, unlabeledX)), labels, sample_weight=weights) 222 | else: 223 | self.model.fit(numpy.vstack((labeledX, unlabeledX)), labels) 224 | 225 | if self.verbose > 1: 226 | print("number of non-one soft labels: ", numpy.sum(self.bestsoftlbl != 1), ", balance:", numpy.sum(self.bestsoftlbl<0.5), " / ", len(self.bestsoftlbl)) 227 | print("current likelihood: ", ll) 228 | 229 | if not getattr(self.model, "predict_proba", None): 230 | # Platt scaling 231 | self.plattlr = LR() 232 | preds = self.model.predict(labeledX) 233 | self.plattlr.fit( preds.reshape( -1, 1 ), labeledy ) 234 | 235 | return self 236 | 237 | def predict_proba(self, X): 238 | """Compute probabilities of possible outcomes for samples in X. 239 | 240 | The model need to have probability information computed at training 241 | time: fit with attribute `probability` set to True. 242 | 243 | Parameters 244 | ---------- 245 | X : array-like, shape = [n_samples, n_features] 246 | 247 | Returns 248 | ------- 249 | T : array-like, shape = [n_samples, n_classes] 250 | Returns the probability of the sample for each class in 251 | the model. The columns correspond to the classes in sorted 252 | order, as they appear in the attribute `classes_`. 253 | """ 254 | 255 | if getattr(self.model, "predict_proba", None): 256 | return self.model.predict_proba(X) 257 | else: 258 | preds = self.model.predict(X) 259 | return self.plattlr.predict_proba(preds.reshape( -1, 1 )) 260 | 261 | def predict(self, X): 262 | """Perform classification on samples in X. 263 | 264 | Parameters 265 | ---------- 266 | X : array-like, shape = [n_samples, n_features] 267 | 268 | Returns 269 | ------- 270 | y_pred : array, shape = [n_samples] 271 | Class labels for samples in X. 272 | """ 273 | 274 | if self.predict_from_probabilities: 275 | P = self.predict_proba(X) 276 | return (P[:, 0] self.prob_threshold) | (unlabeledprob[:, 1] > self.prob_threshold))[0] 69 | 70 | self.model.fit(numpy.vstack((labeledX, unlabeledX[uidx, :])), numpy.hstack((labeledy, unlabeledy_old[uidx]))) 71 | unlabeledy = self.predict(unlabeledX) 72 | unlabeledprob = self.predict_proba(unlabeledX) 73 | i += 1 74 | 75 | if not getattr(self.model, "predict_proba", None): 76 | # Platt scaling if the model cannot generate predictions itself 77 | self.plattlr = LR() 78 | preds = self.model.predict(labeledX) 79 | self.plattlr.fit( preds.reshape( -1, 1 ), labeledy ) 80 | 81 | return self 82 | 83 | def predict_proba(self, X): 84 | """Compute probabilities of possible outcomes for samples in X. 85 | 86 | The model need to have probability information computed at training 87 | time: fit with attribute `probability` set to True. 88 | 89 | Parameters 90 | ---------- 91 | X : array-like, shape = [n_samples, n_features] 92 | 93 | Returns 94 | ------- 95 | T : array-like, shape = [n_samples, n_classes] 96 | Returns the probability of the sample for each class in 97 | the model. The columns correspond to the classes in sorted 98 | order, as they appear in the attribute `classes_`. 
99 | """ 100 | 101 | if getattr(self.model, "predict_proba", None): 102 | return self.model.predict_proba(X) 103 | else: 104 | preds = self.model.predict(X) 105 | return self.plattlr.predict_proba(preds.reshape( -1, 1 )) 106 | 107 | def predict(self, X): 108 | """Perform classification on samples in X. 109 | 110 | Parameters 111 | ---------- 112 | X : array-like, shape = [n_samples, n_features] 113 | 114 | Returns 115 | ------- 116 | y_pred : array, shape = [n_samples] 117 | Class labels for samples in X. 118 | """ 119 | 120 | return self.model.predict(X) 121 | 122 | def score(self, X, y, sample_weight=None): 123 | return sklearn.metrics.accuracy_score(y, self.predict(X), sample_weight=sample_weight) 124 | -------------------------------------------------------------------------------- /frameworks/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/frameworks/__init__.py -------------------------------------------------------------------------------- /methods/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/methods/__init__.py -------------------------------------------------------------------------------- /methods/qns3vm.py: -------------------------------------------------------------------------------- 1 | ############################################################################################ 2 | # QN-S3VM BFGS optimizer for semi-supervised support vector machines. 3 | # 4 | # This implementation provides both a L-BFGS optimization scheme 5 | # for semi-supvised support vector machines. Details can be found in: 6 | # 7 | # F. Gieseke, A. Airola, T. Pahikkala, O. Kramer, Sparse quasi- 8 | # Newton optimization for semi-supervised support vector ma- 9 | # chines, in: Proc. of the 1st Int. Conf. on Pattern Recognition 10 | # Applications and Methods, 2012, pp. 45-54. 11 | # 12 | # Version: 0.1 (September, 2012) 13 | # 14 | # Bugs: Please send any bugs to "f DOT gieseke AT uni-oldenburg.de" 15 | # 16 | # 17 | # Copyright (C) 2012 Fabian Gieseke, Antti Airola, Tapio Pahikkala, Oliver Kramer 18 | # 19 | # This program is free software: you can redistribute it and/or modify 20 | # it under the terms of the GNU General Public License as published by 21 | # the Free Software Foundation, either version 3 of the License, or 22 | # (at your option) any later version. 23 | # 24 | # This program is distributed in the hope that it will be useful, 25 | # but WITHOUT ANY WARRANTY; without even the implied warranty of 26 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 27 | # GNU General Public License for more details. 28 | # 29 | # You should have received a copy of the GNU General Public License 30 | # along with this program. If not, see . 31 | # 32 | # 33 | # INSTALLATION and DEPENDENCIES 34 | # 35 | # The module should work out of the box, given Python and Numpy (http://numpy.scipy.org/) 36 | # and Scipy (http://scipy.org/) installed correctly. 37 | # 38 | # We have tested the code on Ubuntu 12.04 (32 Bit) with Python 2.7.3, Numpy 1.6.1, 39 | # and Scipy 0.9.0. Installing these packages on a Ubuntu- or Debian-based systems 40 | # can be done via "sudo apt-get install python python-numpy python-scipy". 
41 | # 42 | # 43 | # RUNNING THE EXAMPLES 44 | # 45 | # For a description of the data sets, see the paper mentioned above and the references 46 | # therein. Running the command "python qns3vm.py" should yield an output similar to: 47 | # 48 | # Sparse text data set instance 49 | # Number of labeled patterns: 48 50 | # Number of unlabeled patterns: 924 51 | # Number of test patterns: 974 52 | # Time needed to compute the model: 0.775886058807 seconds 53 | # Classification error of QN-S3VM: 0.0667351129363 54 | # 55 | # Dense gaussian data set instance 56 | # Number of labeled patterns: 25 57 | # Number of unlabeled patterns: 225 58 | # Number of test patterns: 250 59 | # Time needed to compute the model: 0.464584112167 seconds 60 | # Classification error of QN-S3VM: 0.012 61 | # 62 | # Dense moons data set instance 63 | # Number of labeled patterns: 5 64 | # Number of unlabeled patterns: 495 65 | # Number of test patterns: 500 66 | # Time needed to compute the model: 0.69714307785 seconds 67 | # Classification error of QN-S3VM: 0.0 68 | 69 | ############################################################################################ 70 | 71 | import array as arr 72 | import math 73 | import copy as cp 74 | import logging 75 | import numpy as np 76 | from numpy import * 77 | import operator 78 | from time import time 79 | import sys 80 | from scipy import optimize 81 | import scipy.sparse.csc as csc 82 | from scipy import sparse 83 | import scipy 84 | import warnings 85 | warnings.simplefilter('error') 86 | 87 | __author__ = 'Fabian Gieseke, Antti Airola, Tapio Pahikkala, Oliver Kramer' 88 | __version__= '0.1' 89 | 90 | class QN_S3VM: 91 | """ 92 | L-BFGS optimizer for semi-supervised support vector machines (S3VM). 93 | """ 94 | def __init__(self, X_l, L_l, X_u, random_generator = None, ** kw): 95 | """ 96 | Initializes the model. Detects automatically if dense or sparse data is provided. 97 | 98 | Keyword arguments: 99 | X_l -- patterns of labeled part of the data 100 | L_l -- labels of labeled part of the data 101 | X_u -- patterns of unlabeled part of the data 102 | random_generator -- particular instance of a random_generator (default None) 103 | kw -- additional parameters for the optimizer 104 | lam -- regularization parameter lambda (default 1, must be a float > 0) 105 | lamU -- cost parameter that determines influence of unlabeled patterns (default 1, must be float > 0) 106 | sigma -- kernel width for RBF kernel (default 1.0, must be a float > 0) 107 | kernel_type -- "Linear" or "RBF" (default "Linear") 108 | numR -- implementation of subset of regressors. If None is provided, all patterns are used 109 | (no approximation). Must fulfill 0 <= numR <= len(X_l) + len(X_u) (default None) 110 | estimate_r -- desired ratio for positive and negative assigments for 111 | unlabeled patterns (-1.0 <= estimate_r <= 1.0). If estimate_r=None, 112 | then L_l is used to estimate this ratio (in case len(L_l) >= 113 | minimum_labeled_patterns_for_estimate_r. 
Otherwise use estimate_r = 0.0 114 | (default None) 115 | minimum_labeled_patterns_for_estimate_r -- see above (default 0) 116 | BFGS_m -- BFGS parameter (default 50) 117 | BFGS_maxfun -- BFGS parameter, maximum number of function calls (default 500) 118 | BFGS_factr -- BFGS parameter (default 1E12) 119 | BFGS_pgtol -- BFGS parameter (default 1.0000000000000001e-05) 120 | """ 121 | self.__model = None 122 | # Initiate model for sparse data 123 | if isinstance(X_l, csc.csc_matrix): 124 | self.__data_type = "sparse" 125 | self.__model = QN_S3VM_Sparse(X_l, L_l, X_u, random_generator, ** kw) 126 | # Initiate model for dense data 127 | elif (isinstance(X_l[0], list)) or (isinstance(X_l[0], np.ndarray)): 128 | self.__data_type = "dense" 129 | self.__model = QN_S3VM_Dense(X_l, L_l, X_u, random_generator, ** kw) 130 | # Data format unknown 131 | if self.__model == None: 132 | logging.info("Data format for patterns is unknown.") 133 | sys.exit(0) 134 | 135 | def train(self): 136 | """ 137 | Training phase. 138 | 139 | Returns: 140 | The computed partition for the unlabeled patterns. 141 | """ 142 | return self.__model.train() 143 | 144 | def getPredictions(self, X, real_valued=False): 145 | """ 146 | Computes the predicted labels for a given set of patterns 147 | 148 | Keyword arguments: 149 | X -- The set of patterns 150 | real_valued -- If True, then the real prediction values are returned 151 | 152 | Returns: 153 | The predictions for the list X of patterns. 154 | """ 155 | return self.__model.getPredictions(X, real_valued=False) 156 | 157 | def predict(self, x): 158 | """ 159 | Predicts a label (-1 or +1) for the pattern 160 | 161 | Keyword arguments: 162 | x -- The pattern 163 | 164 | Returns: 165 | The prediction for x. 166 | """ 167 | return self.__model.predict(x) 168 | 169 | def predictValue(self, x): 170 | """ 171 | Computes f(x) for a given pattern (see Representer Theorem) 172 | 173 | Keyword arguments: 174 | x -- The pattern 175 | 176 | Returns: 177 | The (real) prediction value for x. 178 | """ 179 | return self.__model.predictValue(x) 180 | 181 | def getNeededFunctionCalls(self): 182 | """ 183 | Returns the number of function calls needed during 184 | the optimization process. 185 | """ 186 | return self.__model.getNeededFunctionCalls() 187 | 188 | def mygetPreds(self, X, real_valued=False): 189 | return self.__model.mygetPreds(X, real_valued) 190 | 191 | ############################################################################################ 192 | ############################################################################################ 193 | class QN_S3VM_Dense: 194 | 195 | """ 196 | BFGS optimizer for semi-supervised support vector machines (S3VM). 197 | 198 | Dense Data 199 | """ 200 | parameters = { 201 | 'lam': 1, 202 | 'lamU':1, 203 | 'sigma': 1, 204 | 'kernel_type': "Linear", 205 | 'numR':None, 206 | 'estimate_r':None, 207 | 'minimum_labeled_patterns_for_estimate_r':0, 208 | 'BFGS_m':50, 209 | 'BFGS_maxfun':500, 210 | 'BFGS_factr':1E12, 211 | 'BFGS_pgtol':1.0000000000000001e-05, 212 | 'BFGS_verbose':-1, 213 | 'surrogate_s':3.0, 214 | 'surrogate_gamma':20.0, 215 | 'breakpoint_for_exp':500 216 | } 217 | 218 | def __init__(self, X_l, L_l, X_u, random_generator, ** kw): 219 | """ 220 | Intializes the S3VM optimizer. 
221 | """ 222 | self.__random_generator = random_generator 223 | self.__X_l, self.__X_u, self.__L_l = X_l, X_u, L_l 224 | assert len(X_l) == len(L_l) 225 | self.__X = cp.deepcopy(self.__X_l) 226 | self.__X.extend(cp.deepcopy(self.__X_u)) 227 | self.__size_l, self.__size_u, self.__size_n = len(X_l), len(X_u), len(X_l) + len(X_u) 228 | self.__matrices_initialized = False 229 | self.__setParameters( ** kw) 230 | self.__kw = kw 231 | 232 | def train(self): 233 | """ 234 | Training phase. 235 | 236 | Returns: 237 | The computed partition for the unlabeled patterns. 238 | """ 239 | indi_opt = self.__optimize() 240 | self.__recomputeModel(indi_opt) 241 | predictions = self.__getTrainingPredictions(self.__X) 242 | return predictions 243 | 244 | def mygetPreds(self, X, real_valued=False): 245 | KNR = self.__kernel.computeKernelMatrix(X, self.__Xreg) 246 | KNU_bar = self.__kernel.computeKernelMatrix(X, self.__X_u_subset, symmetric=False) 247 | KNU_bar_horizontal_sum = (1.0 / len(self.__X_u_subset)) * KNU_bar.sum(axis=1) 248 | KNR = KNR - KNU_bar_horizontal_sum - self.__KU_barR_vertical_sum + self.__KU_barU_bar_sum 249 | preds = KNR * self.__c[0:self.__dim-1,:] + self.__c[self.__dim-1,:] 250 | return preds 251 | 252 | def getPredictions(self, X, real_valued=False): 253 | """ 254 | Computes the predicted labels for a given set of patterns 255 | 256 | Keyword arguments: 257 | X -- The set of patterns 258 | real_valued -- If True, then the real prediction values are returned 259 | 260 | Returns: 261 | The predictions for the list X of patterns. 262 | """ 263 | KNR = self.__kernel.computeKernelMatrix(X, self.__Xreg) 264 | KNU_bar = self.__kernel.computeKernelMatrix(X, self.__X_u_subset, symmetric=False) 265 | KNU_bar_horizontal_sum = (1.0 / len(self.__X_u_subset)) * KNU_bar.sum(axis=1) 266 | KNR = KNR - KNU_bar_horizontal_sum - self.__KU_barR_vertical_sum + self.__KU_barU_bar_sum 267 | preds = KNR * self.__c[0:self.__dim-1,:] + self.__c[self.__dim-1,:] 268 | if real_valued == True: 269 | return preds.flatten(1).tolist()[0] 270 | else: 271 | return np.sign(np.sign(preds)+0.1).flatten(1).tolist()[0] 272 | 273 | def predict(self, x): 274 | """ 275 | Predicts a label for the pattern 276 | 277 | Keyword arguments: 278 | x -- The pattern 279 | 280 | Returns: 281 | The prediction for x. 282 | """ 283 | return self.getPredictions([x], real_valued=False)[0] 284 | 285 | def predictValue(self, x): 286 | """ 287 | Computes f(x) for a given pattern (see Representer Theorem) 288 | 289 | Keyword arguments: 290 | x -- The pattern 291 | 292 | Returns: 293 | The (real) prediction value for x. 294 | """ 295 | return self.getPredictions([x], real_valued=True)[0] 296 | 297 | def getNeededFunctionCalls(self): 298 | """ 299 | Returns the number of function calls needed during 300 | the optimization process. 
301 | """ 302 | return self.__needed_function_calls 303 | 304 | def __setParameters(self, ** kw): 305 | for attr, val in kw.items(): 306 | self.parameters[attr] = val 307 | self.__lam = float(self.parameters['lam']) 308 | assert self.__lam > 0 309 | self.__lamU = float(self.parameters['lamU']) 310 | assert self.__lamU > 0 311 | self.__lam_Uvec = [float(self.__lamU)*i for i in [0,0.000001,0.0001,0.01,0.1,0.5,1]] 312 | self.__sigma = float(self.parameters['sigma']) 313 | assert self.__sigma > 0 314 | self.__kernel_type = str(self.parameters['kernel_type']) 315 | if self.parameters['numR'] != None: 316 | self.__numR = int(self.parameters['numR']) 317 | assert (self.__numR <= len(self.__X)) and (self.__numR > 0) 318 | else: 319 | self.__numR = len(self.__X) 320 | self.__regressors_indices = sorted(self.__random_generator.sample( range(0,len(self.__X)), self.__numR )) 321 | self.__dim = self.__numR + 1 # add bias term b 322 | self.__minimum_labeled_patterns_for_estimate_r = float(self.parameters['minimum_labeled_patterns_for_estimate_r']) 323 | # If reliable estimate is available or can be estimated, use it, otherwise 324 | # assume classes to be balanced (i.e., estimate_r=0.0) 325 | if self.parameters['estimate_r'] != None: 326 | self.__estimate_r = float(self.parameters['estimate_r']) 327 | elif len(self.__L_l) >= self.__minimum_labeled_patterns_for_estimate_r: 328 | self.__estimate_r = (1.0 / len(self.__L_l)) * np.sum(self.__L_l) 329 | else: 330 | self.__estimate_r = 0.0 331 | self.__BFGS_m = int(self.parameters['BFGS_m']) 332 | self.__BFGS_maxfun = int(self.parameters['BFGS_maxfun']) 333 | self.__BFGS_factr = float(self.parameters['BFGS_factr']) 334 | # This is a hack for 64 bit systems (Linux). The machine precision 335 | # is different for the BFGS optimizer (Fortran code) and we fix this by: 336 | is_64bits = sys.maxsize > 2**32 337 | if is_64bits: 338 | logging.debug("64-bit system detected, modifying BFGS_factr!") 339 | self.__BFGS_factr = 0.000488288*self.__BFGS_factr 340 | self.__BFGS_pgtol = float(self.parameters['BFGS_pgtol']) 341 | self.__BFGS_verbose = int(self.parameters['BFGS_verbose']) 342 | self.__surrogate_gamma = float(self.parameters['surrogate_gamma']) 343 | self.__s = float(self.parameters['surrogate_s']) 344 | self.__breakpoint_for_exp = float(self.parameters['breakpoint_for_exp']) 345 | self.__b = self.__estimate_r 346 | # size of unlabeled patterns to estimate mean (used for balancing constraint) 347 | self.__max_unlabeled_subset_size = 1000 348 | 349 | 350 | def __optimize(self): 351 | logging.debug("Starting optimization with BFGS ...") 352 | self.__needed_function_calls = 0 353 | self.__initializeMatrices() 354 | # starting point 355 | c_current = zeros(self.__dim, float64) 356 | c_current[self.__dim-1] = self.__b 357 | # Annealing sequence. 
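# lamU is increased step by step through the factors in self.__lam_Uvec (from 0 up to the full lamU), so the non-convex unlabelled term is introduced gradually; each L-BFGS run is warm-started from the previous solution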
358 | for i in range(len(self.__lam_Uvec)): 359 | self.__lamU = self.__lam_Uvec[i] 360 | # crop one dimension (in case the offset b is fixed) 361 | c_current = c_current[:self.__dim-1] 362 | c_current = self.__localSearch(c_current) 363 | # reappend it if needed 364 | c_current = np.append(c_current, self.__b) 365 | f_opt = self.__getFitness(c_current) 366 | return c_current, f_opt 367 | 368 | def __localSearch(self, start): 369 | c_opt, f_opt, d = optimize.fmin_l_bfgs_b(self.__getFitness, start, m=self.__BFGS_m, \ 370 | fprime=self.__getFitness_Prime, maxfun=self.__BFGS_maxfun, factr=self.__BFGS_factr,\ 371 | pgtol=self.__BFGS_pgtol, iprint=self.__BFGS_verbose) 372 | self.__needed_function_calls += int(d['funcalls']) 373 | return c_opt 374 | 375 | def __initializeMatrices(self): 376 | if self.__matrices_initialized == False: 377 | logging.debug("Initializing matrices...") 378 | # Initialize labels 379 | x = arr.array('i') 380 | for l in self.__L_l: 381 | x.append(l) 382 | self.__YL = mat(x, dtype=np.float64) 383 | self.__YL = self.__YL.transpose() 384 | # Initialize kernel matrices 385 | if (self.__kernel_type == "Linear"): 386 | self.__kernel = LinearKernel() 387 | elif (self.__kernel_type == "RBF"): 388 | self.__kernel = RBFKernel(self.__sigma) 389 | self.__Xreg = (mat(self.__X)[self.__regressors_indices,:].tolist()) 390 | self.__KLR = self.__kernel.computeKernelMatrix(self.__X_l,self.__Xreg, symmetric=False) 391 | self.__KUR = self.__kernel.computeKernelMatrix(self.__X_u,self.__Xreg, symmetric=False) 392 | self.__KNR = cp.deepcopy(bmat([[self.__KLR], [self.__KUR]])) 393 | self.__KRR = self.__KNR[self.__regressors_indices,:] 394 | # Center patterns in feature space (with respect to approximated mean of unlabeled patterns in the feature space) 395 | subset_unlabled_indices = sorted(self.__random_generator.sample( range(0,len(self.__X_u)), min(self.__max_unlabeled_subset_size, len(self.__X_u)) )) 396 | self.__X_u_subset = (mat(self.__X_u)[subset_unlabled_indices,:].tolist()) 397 | self.__KNU_bar = self.__kernel.computeKernelMatrix(self.__X, self.__X_u_subset, symmetric=False) 398 | self.__KNU_bar_horizontal_sum = (1.0 / len(self.__X_u_subset)) * self.__KNU_bar.sum(axis=1) 399 | self.__KU_barR = self.__kernel.computeKernelMatrix(self.__X_u_subset, self.__Xreg, symmetric=False) 400 | self.__KU_barR_vertical_sum = (1.0 / len(self.__X_u_subset)) * self.__KU_barR.sum(axis=0) 401 | self.__KU_barU_bar = self.__kernel.computeKernelMatrix(self.__X_u_subset, self.__X_u_subset, symmetric=False) 402 | self.__KU_barU_bar_sum = (1.0 / (len(self.__X_u_subset)))**2 * self.__KU_barU_bar.sum() 403 | self.__KNR = self.__KNR - self.__KNU_bar_horizontal_sum - self.__KU_barR_vertical_sum + self.__KU_barU_bar_sum 404 | self.__KRR = self.__KNR[self.__regressors_indices,:] 405 | self.__KLR = self.__KNR[range(0,len(self.__X_l)),:] 406 | self.__KUR = self.__KNR[range(len(self.__X_l),len(self.__X)),:] 407 | self.__matrices_initialized = True 408 | 409 | def __getFitness(self,c): 410 | # Check whether the function is called from the bfgs solver 411 | # (that does not optimize the offset b) or not 412 | if len(c) == self.__dim - 1: 413 | c = np.append(c, self.__b) 414 | c = mat(c) 415 | b = c[:,self.__dim-1].T 416 | c_new = c[:,0:self.__dim-1].T 417 | preds_labeled = self.__surrogate_gamma*(1.0 - multiply(self.__YL, self.__KLR * c_new + b)) 418 | preds_unlabeled = self.__KUR * c_new + b 419 | # This vector has a "one" for each "numerically instable" entry; "zeros" for "good ones". 
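# (entries larger than breakpoint_for_exp, for which exp() would overflow; for those entries log(1 + exp(x)) is approximated by x further down)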
420 | preds_labeled_conflict_indicator = np.sign(np.sign(preds_labeled/self.__breakpoint_for_exp - 1.0) + 1.0) 421 | # This vector has a one for each good entry and zero otherwise 422 | preds_labeled_good_indicator = (-1)*(preds_labeled_conflict_indicator - 1.0) 423 | preds_labeled_for_conflicts = multiply(preds_labeled_conflict_indicator,preds_labeled) 424 | preds_labeled = multiply(preds_labeled,preds_labeled_good_indicator) 425 | # Compute values for good entries 426 | preds_labeled_log_exp = np.log(1.0 + np.exp(preds_labeled)) 427 | # Compute values for instable entries 428 | preds_labeled_log_exp = multiply(preds_labeled_good_indicator, preds_labeled_log_exp) 429 | # Replace critical values with values 430 | preds_labeled_final = preds_labeled_log_exp + preds_labeled_for_conflicts 431 | term1 = (1.0/(self.__surrogate_gamma*self.__size_l)) * np.sum(preds_labeled_final) 432 | preds_unlabeled_squared = multiply(preds_unlabeled,preds_unlabeled) 433 | term2 = (float(self.__lamU)/float(self.__size_u))*np.sum(np.exp(-self.__s * preds_unlabeled_squared)) 434 | term3 = self.__lam * (c_new.T * self.__KRR * c_new) 435 | return (term1 + term2 + term3)[0,0] 436 | 437 | def __getFitness_Prime(self,c): 438 | # Check whether the function is called from the bfgs solver 439 | # (that does not optimize the offset b) or not 440 | if len(c) == self.__dim - 1: 441 | c = np.append(c, self.__b) 442 | c = mat(c) 443 | b = c[:,self.__dim-1].T 444 | c_new = c[:,0:self.__dim-1].T 445 | preds_labeled = self.__surrogate_gamma * (1.0 - multiply(self.__YL, self.__KLR * c_new + b)) 446 | preds_unlabeled = (self.__KUR * c_new + b) 447 | # This vector has a "one" for each "numerically instable" entry; "zeros" for "good ones". 448 | preds_labeled_conflict_indicator = np.sign(np.sign(preds_labeled/self.__breakpoint_for_exp - 1.0) + 1.0) 449 | # This vector has a one for each good entry and zero otherwise 450 | preds_labeled_good_indicator = (-1)*(preds_labeled_conflict_indicator - 1.0) 451 | preds_labeled = multiply(preds_labeled,preds_labeled_good_indicator) 452 | preds_labeled_exp = np.exp(preds_labeled) 453 | term1 = multiply(preds_labeled_exp, 1.0/(1.0 + preds_labeled_exp)) 454 | term1 = multiply(preds_labeled_good_indicator, term1) 455 | # Replace critical values with "1.0" 456 | term1 = term1 + preds_labeled_conflict_indicator 457 | term1 = multiply(self.__YL, term1) 458 | preds_unlabeled_squared_exp_f = multiply(preds_unlabeled,preds_unlabeled) 459 | preds_unlabeled_squared_exp_f = np.exp(-self.__s * preds_unlabeled_squared_exp_f) 460 | preds_unlabeled_squared_exp_f = multiply(preds_unlabeled_squared_exp_f, preds_unlabeled) 461 | term1 = (-1.0/self.__size_l) * (term1.T * self.__KLR).T 462 | term2 = ((-2.0 * self.__s * self.__lamU)/float(self.__size_u)) * (preds_unlabeled_squared_exp_f.T * self.__KUR).T 463 | term3 = 2*self.__lam*(self.__KRR * c_new) 464 | return array((term1 + term2 + term3).T)[0] 465 | 466 | def __recomputeModel(self, indi): 467 | self.__c = mat(indi[0]).T 468 | 469 | def __getTrainingPredictions(self, X, real_valued=False): 470 | preds = self.__KNR * self.__c[0:self.__dim-1,:] + self.__c[self.__dim-1,:] 471 | if real_valued == True: 472 | return preds.flatten(1).tolist()[0] 473 | else: 474 | return np.sign(np.sign(preds)+0.1).flatten(1).tolist()[0] 475 | 476 | def __check_matrix(self, M): 477 | smallesteval = scipy.linalg.eigvalsh(M, eigvals=(0,0))[0] 478 | if smallesteval < 0.0: 479 | shift = abs(smallesteval) + 0.0000001 480 | M = M + shift 481 | return M 482 | 483 | 
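# The sparse variant below works directly on scipy.sparse matrices and implements the linear kernel only (its parameter set contains no 'kernel_type' or 'sigma' entries).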
############################################################################################ 484 | ############################################################################################ 485 | class QN_S3VM_Sparse: 486 | """ 487 | BFGS optimizer for semi-supervised support vector machines (S3VM). 488 | 489 | Sparse Data 490 | """ 491 | parameters = { 492 | 'lam': 1, 493 | 'lamU':1, 494 | 'estimate_r':None, 495 | 'minimum_labeled_patterns_for_estimate_r':0, 496 | 'BFGS_m':50, 497 | 'BFGS_maxfun':500, 498 | 'BFGS_factr':1E12, 499 | 'BFGS_pgtol':1.0000000000000001e-05, 500 | 'BFGS_verbose':-1, 501 | 'surrogate_s':3.0, 502 | 'surrogate_gamma':20.0, 503 | 'breakpoint_for_exp':500 504 | } 505 | 506 | 507 | def __init__(self, X_l, L_l, X_u, random_generator, ** kw): 508 | """ 509 | Intializes the S3VM optimizer. 510 | """ 511 | self.__random_generator = random_generator 512 | # This is a nuisance, but we may need to pad extra dimensions to either X_l or X_u 513 | # in case the highest feature indices appear only in one of the two data matrices 514 | if X_l.shape[1] > X_u.shape[1]: 515 | X_u = sparse.hstack([X_u, sparse.coo_matrix(X_u.shape[0], X_l.shape[1] - X_u.shape[1])]) 516 | elif X_l.shape[1] < X_u.shape[1]: 517 | X_l = sparse.hstack([X_l, sparse.coo_matrix(X_l.shape[0], X_u.shape[1] - X_u.shape[1])]) 518 | # We vertically stack the data matrices into one big matrix 519 | X = sparse.vstack([X_l, X_u]) 520 | self.__size_l, self.__size_u, self.__size_n = X_l.shape[0], X_u.shape[0], X_l.shape[0]+ X_u.shape[0] 521 | x = arr.array('i') 522 | for l in L_l: 523 | x.append(int(l)) 524 | self.__YL = mat(x, dtype=np.float64) 525 | self.__YL = self.__YL.transpose() 526 | self.__setParameters( ** kw) 527 | self.__kw = kw 528 | self.X_l = X_l.tocsr() 529 | self.X_u = X_u.tocsr() 530 | self.X = X.tocsr() 531 | # compute mean of unlabeled patterns 532 | self.__mean_u = self.X_u.mean(axis=0) 533 | self.X_u_T = X_u.tocsc().T 534 | self.X_l_T = X_l.tocsc().T 535 | self.X_T = X.tocsc().T 536 | 537 | def train(self): 538 | """ 539 | Training phase. 540 | 541 | Returns: 542 | The computed partition for the unlabeled patterns. 543 | """ 544 | indi_opt = self.__optimize() 545 | self.__recomputeModel(indi_opt) 546 | predictions = self.getPredictions(self.X) 547 | return predictions 548 | 549 | def getPredictions(self, X, real_valued=False): 550 | """ 551 | Computes the predicted labels for a given set of patterns 552 | 553 | Keyword arguments: 554 | X -- The set of patterns 555 | real_valued -- If True, then the real prediction values are returned 556 | 557 | Returns: 558 | The predictions for the list X of patterns. 559 | """ 560 | c_new = self.__c[:self.__dim-1] 561 | W = self.X.T*c_new - self.__mean_u.T*np.sum(c_new) 562 | # Again, possibility of dimension mismatch due to use of sparse matrices 563 | if X.shape[1] > W.shape[0]: 564 | X = X[:,range(W.shape[0])] 565 | if X.shape[1] < W.shape[0]: 566 | W = W[range(X.shape[1])] 567 | X = X.tocsc() 568 | preds = X * W + self.__b 569 | if real_valued == True: 570 | return preds.flatten(1).tolist()[0] 571 | else: 572 | return np.sign(np.sign(preds)+0.1).flatten(1).tolist()[0] 573 | 574 | def predict(self, x): 575 | """ 576 | Predicts a label for the pattern 577 | 578 | Keyword arguments: 579 | x -- The pattern 580 | 581 | Returns: 582 | The prediction for x. 
583 | """ 584 | return self.getPredictions([x], real_valued=False)[0] 585 | 586 | def predictValue(self, x): 587 | """ 588 | Computes f(x) for a given pattern (see Representer Theorem) 589 | 590 | Keyword arguments: 591 | x -- The pattern 592 | 593 | Returns: 594 | The (real) prediction value for x. 595 | """ 596 | return self.getPredictions([x], real_valued=True)[0] 597 | 598 | def getNeededFunctionCalls(self): 599 | """ 600 | Returns the number of function calls needed during 601 | the optimization process. 602 | """ 603 | return self.__needed_function_calls 604 | 605 | def __setParameters(self, ** kw): 606 | for attr, val in kw.items(): 607 | self.parameters[attr] = val 608 | self.__lam = float(self.parameters['lam']) 609 | assert self.__lam > 0 610 | self.__lamU = float(self.parameters['lamU']) 611 | assert self.__lamU > 0 612 | self.__lam_Uvec = [float(self.__lamU)*i for i in [0,0.000001,0.0001,0.01,0.1,0.5,1]] 613 | self.__minimum_labeled_patterns_for_estimate_r = float(self.parameters['minimum_labeled_patterns_for_estimate_r']) 614 | # If reliable estimate is available or can be estimated, use it, otherwise 615 | # assume classes to be balanced (i.e., estimate_r=0.0) 616 | if self.parameters['estimate_r'] != None: 617 | self.__estimate_r = float(self.parameters['estimate_r']) 618 | elif self.__YL.shape[0] > self.__minimum_labeled_patterns_for_estimate_r: 619 | self.__estimate_r = (1.0 / self.__YL.shape[0]) * np.sum(self.__YL[0:]) 620 | else: 621 | self.__estimate_r = 0.0 622 | self.__dim = self.__size_n + 1 # for offset term b 623 | self.__BFGS_m = int(self.parameters['BFGS_m']) 624 | self.__BFGS_maxfun = int(self.parameters['BFGS_maxfun']) 625 | self.__BFGS_factr = float(self.parameters['BFGS_factr']) 626 | # This is a hack for 64 bit systems (Linux). The machine precision 627 | # is different for the BFGS optimizer (Fortran code) and we fix this by: 628 | is_64bits = sys.maxsize > 2**32 629 | if is_64bits: 630 | logging.debug("64-bit system detected, modifying BFGS_factr!") 631 | self.__BFGS_factr = 0.000488288*self.__BFGS_factr 632 | self.__BFGS_pgtol = float(self.parameters['BFGS_pgtol']) 633 | self.__BFGS_verbose = int(self.parameters['BFGS_verbose']) 634 | self.__surrogate_gamma = float(self.parameters['surrogate_gamma']) 635 | self.__s = float(self.parameters['surrogate_s']) 636 | self.__breakpoint_for_exp = float(self.parameters['breakpoint_for_exp']) 637 | self.__b = self.__estimate_r 638 | 639 | def __optimize(self): 640 | logging.debug("Starting optimization with BFGS ...") 641 | self.__needed_function_calls = 0 642 | # starting_point 643 | c_current = zeros(self.__dim, float64) 644 | c_current[self.__dim-1] = self.__b 645 | # Annealing sequence. 
646 | for i in range(len(self.__lam_Uvec)): 647 | self.__lamU = self.__lam_Uvec[i] 648 | # crop one dimension (in case the offset b is fixed) 649 | c_current = c_current[:self.__dim-1] 650 | c_current = self.__localSearch(c_current) 651 | # reappend it if needed 652 | c_current = np.append(c_current, self.__b) 653 | f_opt = self.__getFitness(c_current) 654 | return c_current, f_opt 655 | 656 | def __localSearch(self, start): 657 | c_opt, f_opt, d = optimize.fmin_l_bfgs_b(self.__getFitness, start, m=self.__BFGS_m, \ 658 | fprime=self.__getFitness_Prime, maxfun=self.__BFGS_maxfun,\ 659 | factr=self.__BFGS_factr, pgtol=self.__BFGS_pgtol, iprint=self.__BFGS_verbose) 660 | self.__needed_function_calls += int(d['funcalls']) 661 | return c_opt 662 | 663 | def __getFitness(self,c): 664 | # check whether the function is called from the bfgs solver 665 | # (that does not optimize the offset b) or not 666 | if len(c) == self.__dim - 1: 667 | c = np.append(c, self.__b) 668 | c = mat(c) 669 | b = c[:,self.__dim-1].T 670 | c_new = c[:,0:self.__dim-1].T 671 | c_new_sum = np.sum(c_new) 672 | XTc = self.X_T*c_new - self.__mean_u.T*c_new_sum 673 | preds_labeled = self.__surrogate_gamma*(1.0 - multiply(self.__YL, (self.X_l*XTc - self.__mean_u*XTc) + b[0,0])) 674 | preds_unlabeled = (self.X_u*XTc - self.__mean_u*XTc) + b[0,0] 675 | # This vector has a "one" for each "numerically instable" entry; "zeros" for "good ones". 676 | preds_labeled_conflict_indicator = np.sign(np.sign(preds_labeled/self.__breakpoint_for_exp - 1.0) + 1.0) 677 | # This vector has a one for each good entry and zero otherwise 678 | preds_labeled_good_indicator = (-1)*(preds_labeled_conflict_indicator - 1.0) 679 | preds_labeled_for_conflicts = multiply(preds_labeled_conflict_indicator,preds_labeled) 680 | preds_labeled = multiply(preds_labeled,preds_labeled_good_indicator) 681 | # Compute values for good entries 682 | preds_labeled_log_exp = np.log(1.0 + np.exp(preds_labeled)) 683 | # Compute values for instable entries 684 | preds_labeled_log_exp = multiply(preds_labeled_good_indicator, preds_labeled_log_exp) 685 | # Replace critical values with values 686 | preds_labeled_final = preds_labeled_log_exp + preds_labeled_for_conflicts 687 | term1 = (1.0/(self.__surrogate_gamma*self.__size_l)) * np.sum(preds_labeled_final) 688 | preds_unlabeled_squared = multiply(preds_unlabeled,preds_unlabeled) 689 | term2 = (float(self.__lamU)/float(self.__size_u))*np.sum(np.exp(-self.__s * preds_unlabeled_squared)) 690 | term3 = self.__lam * c_new.T * (self.X * XTc - self.__mean_u*XTc) 691 | return (term1 + term2 + term3)[0,0] 692 | 693 | def __getFitness_Prime(self,c): 694 | # check whether the function is called from the bfgs solver 695 | # (that does not optimize the offset b) or not 696 | if len(c) == self.__dim - 1: 697 | c = np.append(c, self.__b) 698 | c = mat(c) 699 | b = c[:,self.__dim-1].T 700 | c_new = c[:,0:self.__dim-1].T 701 | c_new_sum = np.sum(c_new) 702 | XTc = self.X_T*c_new - self.__mean_u.T*c_new_sum 703 | preds_labeled = self.__surrogate_gamma*(1.0 - multiply(self.__YL, (self.X_l*XTc -self.__mean_u*XTc) + b[0,0])) 704 | preds_unlabeled = (self.X_u*XTc - self.__mean_u*XTc )+ b[0,0] 705 | preds_labeled_conflict_indicator = np.sign(np.sign(preds_labeled/self.__breakpoint_for_exp - 1.0) + 1.0) 706 | # This vector has a one for each good entry and zero otherwise 707 | preds_labeled_good_indicator = (-1)*(preds_labeled_conflict_indicator - 1.0) 708 | preds_labeled = multiply(preds_labeled,preds_labeled_good_indicator) 709 | preds_labeled_exp 
= np.exp(preds_labeled) 710 | term1 = multiply(preds_labeled_exp, 1.0/(1.0 + preds_labeled_exp)) 711 | term1 = multiply(preds_labeled_good_indicator, term1) 712 | # Replace critical values with "1.0" 713 | term1 = term1 + preds_labeled_conflict_indicator 714 | term1 = multiply(self.__YL, term1) 715 | preds_unlabeled_squared_exp_f = multiply(preds_unlabeled,preds_unlabeled) 716 | preds_unlabeled_squared_exp_f = np.exp(-self.__s * preds_unlabeled_squared_exp_f) 717 | preds_unlabeled_squared_exp_f = multiply(preds_unlabeled_squared_exp_f, preds_unlabeled) 718 | term1_sum = np.sum(term1) 719 | tmp = self.X_l_T * term1 - self.__mean_u.T*term1_sum 720 | term1 = (-1.0/self.__size_l) * (self.X * tmp - self.__mean_u*tmp) 721 | preds_unlabeled_squared_exp_f_sum = np.sum(preds_unlabeled_squared_exp_f) 722 | tmp_unlabeled = self.X_u_T * preds_unlabeled_squared_exp_f - self.__mean_u.T * preds_unlabeled_squared_exp_f_sum 723 | term2 = ((-2.0 * self.__s * self.__lamU)/float(self.__size_u)) * (self.X * tmp_unlabeled - self.__mean_u*tmp_unlabeled) 724 | XTc_sum = np.sum(XTc) 725 | term3 = 2*self.__lam*(self.X * XTc - self.__mean_u*XTc) 726 | return array((term1 + term2 + term3).T)[0] 727 | 728 | def __recomputeModel(self, indi): 729 | self.__c = mat(indi[0]).T 730 | 731 | ############################################################################################ 732 | ############################################################################################ 733 | class LinearKernel(): 734 | """ 735 | Linear Kernel 736 | """ 737 | def __init__(self): 738 | pass 739 | 740 | def computeKernelMatrix(self, data1, data2, symmetric=False): 741 | """ 742 | Computes the kernel matrix 743 | """ 744 | logging.debug("Starting Linear Kernel Matrix Computation...") 745 | self._data1 = mat(data1) 746 | self._data2 = mat(data2) 747 | assert self._data1.shape[1] == (self._data2.T).shape[0] 748 | try: 749 | return self._data1 * self._data2.T 750 | except Exception as e: 751 | logging.error("Error while computing kernel matrix: " + str(e)) 752 | import traceback 753 | traceback.print_exc() 754 | sys.exit() 755 | logging.debug("Kernel Matrix computed...") 756 | 757 | def getKernelValue(self, xi, xj): 758 | """ 759 | Returns a single kernel value. 
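        For the linear kernel this is the plain dot product k(x_i, x_j) = <x_i, x_j>.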
760 | """ 761 | xi = array(xi) 762 | xj = array(xj) 763 | val = dot(xi, xj) 764 | return val 765 | 766 | 767 | class DictLinearKernel(): 768 | """ 769 | Linear Kernel (for dictionaries) 770 | """ 771 | def __init__(self): 772 | pass 773 | 774 | def computeKernelMatrix(self, data1, data2, symmetric=False): 775 | """ 776 | Computes the kernel matrix 777 | """ 778 | logging.debug("Starting Linear Kernel Matrix Computation...") 779 | self._data1 = data1 780 | self._data2 = data2 781 | self._dim1 = len(data1) 782 | self._dim2 = len(data2) 783 | self._symmetric = symmetric 784 | self.__km = None 785 | try: 786 | km = mat(zeros((self._dim1, self._dim2), dtype=float64)) 787 | if self._symmetric: 788 | for i in range(self._dim1): 789 | message = 'Kernel Matrix Progress: %dx%d/%dx%d' % (i, self._dim2,self._dim1,self._dim2) 790 | logging.debug(message) 791 | for j in range(i, self._dim2): 792 | val = self.getKernelValue(self._data1[i], self._data2[j]) 793 | km[i, j] = val 794 | km[j, i] = val 795 | return km 796 | else: 797 | for i in range(self._dim1): 798 | message = 'Kernel Matrix Progress: %dx%d/%dx%d' % (i, self._dim2,self._dim1,self._dim2) 799 | logging.debug(message) 800 | for j in range(0, self._dim2): 801 | val = self.getKernelValue(self._data1[i], self._data2[j]) 802 | km[i, j] = val 803 | return km 804 | 805 | except Exception as e: 806 | logging.error("Error while computing kernel matrix: " + str(e)) 807 | sys.exit() 808 | logging.debug("Kernel Matrix computed...") 809 | 810 | def getKernelValue(self, xi, xj): 811 | """ 812 | Returns a single kernel value. 813 | """ 814 | val = 0. 815 | for key in xi: 816 | if key in xj: 817 | val += xi[key]*xj[key] 818 | return val 819 | 820 | class RBFKernel(): 821 | """ 822 | RBF Kernel 823 | """ 824 | def __init__(self, sigma): 825 | self.__sigma = sigma 826 | self.__sigma_squared_inv = 1.0 / (2* (self.__sigma ** 2) ) 827 | 828 | def computeKernelMatrix(self, data1, data2, symmetric=False): 829 | """ 830 | Computes the kernel matrix 831 | """ 832 | logging.debug("Starting RBF Kernel Matrix Computation...") 833 | self._data1 = mat(data1) 834 | self._data2 = mat(data2) 835 | assert self._data1.shape[1] == (self._data2.T).shape[0] 836 | self._dim1 = len(data1) 837 | self._dim2 = len(data2) 838 | self._symmetric = symmetric 839 | self.__km = None 840 | try: 841 | if self._symmetric: 842 | linearkm = self._data1 * self._data2.T 843 | trnorms = mat(np.diag(linearkm)).T 844 | trace_matrix = trnorms * mat(np.ones((1, self._dim1), dtype = float64)) 845 | self.__km = trace_matrix + trace_matrix.T 846 | self.__km = self.__km - 2*linearkm 847 | self.__km = - self.__sigma_squared_inv * self.__km 848 | self.__km = np.exp(self.__km) 849 | return self.__km 850 | else: 851 | m = self._data1.shape[0] 852 | n = self._data2.shape[0] 853 | assert self._data1.shape[1] == self._data2.shape[1] 854 | linkm = mat(self._data1 * self._data2.T) 855 | trnorms1 = [] 856 | for i in range(m): 857 | trnorms1.append((self._data1[i] * self._data1[i].T)[0,0]) 858 | trnorms1 = mat(trnorms1).T 859 | trnorms2 = [] 860 | for i in range(n): 861 | trnorms2.append((self._data2[i] * self._data2[i].T)[0,0]) 862 | trnorms2 = mat(trnorms2).T 863 | self.__km = trnorms1 * mat(np.ones((n, 1), dtype = float64)).T 864 | self.__km = self.__km + mat(np.ones((m, 1), dtype = float64)) * trnorms2.T 865 | self.__km = self.__km - 2 * linkm 866 | self.__km = - self.__sigma_squared_inv * self.__km 867 | self.__km = np.exp(self.__km) 868 | return self.__km 869 | except Exception as e: 870 | logging.error("Error 
while computing kernel matrix: " + str(e)) 871 | sys.exit() 872 | 873 | def getKernelValue(self, xi, xj): 874 | """ 875 | Returns a single kernel value. 876 | """ 877 | xi = array(xi) 878 | xj = array(xj) 879 | diff = xi-xj 880 | val = exp(-self.__sigma_squared_inv * (dot(diff, diff))) 881 | return val 882 | 883 | class DictRBFKernel(): 884 | """ 885 | RBF Kernel (for dictionaries) 886 | """ 887 | def __init__(self, sigma): 888 | self.__sigma = sigma 889 | self.__sigma_squared_inv = 1.0 / ((self.__sigma ** 2)) 890 | 891 | def computeKernelMatrix(self, data1, data2, symmetric=False): 892 | """ 893 | Computes the kernel matrix 894 | """ 895 | logging.debug("Starting RBF Kernel Matrix Computation...") 896 | self._data1 = data1 897 | self._data2 = data2 898 | self._dim1 = len(data1) 899 | self._dim2 = len(data2) 900 | self._symmetric = symmetric 901 | self.__km = None 902 | try: 903 | km = mat(zeros((self._dim1, self._dim2), dtype=float64)) 904 | if self._symmetric: 905 | for i in range(self._dim1): 906 | message = 'Kernel Matrix Progress: %dx%d/%dx%d' % (i, self._dim2,self._dim1,self._dim2) 907 | logging.debug(message) 908 | for j in range(i, self._dim2): 909 | val = self.getKernelValue(self._data1[i], self._data2[j]) 910 | km[i, j] = val 911 | km[j, i] = val 912 | return km 913 | else: 914 | for i in range(0, self._dim1): 915 | message = 'Kernel Matrix Progress: %dx%d/%dx%d' % (i, self._dim2,self._dim1,self._dim2) 916 | logging.debug(message) 917 | for j in range(0, self._dim2): 918 | val = self.getKernelValue(self._data1[i], self._data2[j]) 919 | km[i, j] = val 920 | return km 921 | except Exception as e: 922 | logging.error("Error while computing kernel matrix: " + str(e)) 923 | sys.exit() 924 | logging.info("Kernel Matrix computed...") 925 | 926 | def getKernelValue(self, xi, xj): 927 | """ 928 | Returns a single kernel value. 929 | """ 930 | diff = xi.copy() 931 | for key in xj: 932 | if key in diff: 933 | diff[key]-=xj[key] 934 | else: 935 | diff[key]=-xj[key] 936 | diff = diff.values() 937 | val = exp(-self.__sigma_squared_inv * (dot(diff, diff))) 938 | return val 939 | -------------------------------------------------------------------------------- /methods/scikitTSVM.py: -------------------------------------------------------------------------------- 1 | from sklearn.base import BaseEstimator 2 | import sklearn.metrics 3 | import random as rnd 4 | import numpy 5 | from sklearn.linear_model import LogisticRegression as LR 6 | from qns3vm import QN_S3VM 7 | 8 | class SKTSVM(BaseEstimator): 9 | """ 10 | Scikit-learn wrapper for transductive SVM (SKTSVM) 11 | 12 | Wraps QN-S3VM by Fabian Gieseke, Antti Airola, Tapio Pahikkala, Oliver Kramer (see http://www.fabiangieseke.de/index.php/code/qns3vm) 13 | as a scikit-learn BaseEstimator, and provides probability estimates using Platt scaling 14 | 15 | Parameters 16 | ---------- 17 | C : float, optional (default=1.0) 18 | Penalty parameter C of the error term. 19 | 20 | kernel : string, optional (default='rbf') 21 | Specifies the kernel type to be used in the algorithm. 22 | It must be 'linear' or 'rbf' 23 | 24 | gamma : float, optional (default=0.0) 25 | Kernel coefficient for 'rbf' 26 | 27 | lamU: float, optional (default=1.0) 28 | cost parameter that determines influence of unlabeled patterns 29 | must be float >0 30 | 31 | probability: boolean, optional (default=False) 32 | Whether to enable probability estimates. This must be enabled prior 33 | to calling `fit`, and will slow down that method. 
34 | """ 35 | 36 | # lamU -- cost parameter that determines influence of unlabeled patterns (default 1, must be float > 0) 37 | def __init__(self, kernel = 'RBF', C = 1e-4, gamma = 0.5, lamU = 1.0, probability=True): 38 | self.random_generator = rnd.Random() 39 | self.kernel = kernel 40 | self.C = C 41 | self.gamma = gamma 42 | self.lamU = lamU 43 | self.probability = probability 44 | 45 | def fit(self, X, y): # -1 for unlabeled 46 | """Fit the model according to the given training data. 47 | 48 | Parameters 49 | ---------- 50 | X : array-like, shape = [n_samples, n_features] 51 | Training vector, where n_samples in the number of samples and 52 | n_features is the number of features. 53 | 54 | y : array-like, shape = [n_samples] 55 | Target vector relative to X 56 | Must be 0 or 1 for labeled and -1 for unlabeled instances 57 | 58 | Returns 59 | ------- 60 | self : object 61 | Returns self. 62 | """ 63 | 64 | # http://www.fabiangieseke.de/index.php/code/qns3vm 65 | 66 | unlabeledX = X[y==-1, :].tolist() 67 | labeledX = X[y!=-1, :].tolist() 68 | labeledy = y[y!=-1] 69 | 70 | # convert class 0 to -1 for tsvm 71 | labeledy[labeledy==0] = -1 72 | labeledy = labeledy.tolist() 73 | 74 | if 'rbf' in self.kernel.lower(): 75 | self.model = QN_S3VM(labeledX, labeledy, unlabeledX, self.random_generator, lam=self.C, lamU=self.lamU, kernel_type="RBF", sigma=self.gamma) 76 | else: 77 | self.model = QN_S3VM(labeledX, labeledy, unlabeledX, self.random_generator, lam=self.C, lamU=self.lamU) 78 | 79 | self.model.train() 80 | 81 | # probabilities by Platt scaling 82 | if self.probability: 83 | self.plattlr = LR() 84 | preds = self.model.mygetPreds(labeledX) 85 | self.plattlr.fit( preds.reshape( -1, 1 ), labeledy ) 86 | 87 | def predict_proba(self, X): 88 | """Compute probabilities of possible outcomes for samples in X. 89 | 90 | The model need to have probability information computed at training 91 | time: fit with attribute `probability` set to True. 92 | 93 | Parameters 94 | ---------- 95 | X : array-like, shape = [n_samples, n_features] 96 | 97 | Returns 98 | ------- 99 | T : array-like, shape = [n_samples, n_classes] 100 | Returns the probability of the sample for each class in 101 | the model. The columns correspond to the classes in sorted 102 | order, as they appear in the attribute `classes_`. 103 | """ 104 | 105 | if self.probability: 106 | preds = self.model.mygetPreds(X.tolist()) 107 | return self.plattlr.predict_proba(preds.reshape( -1, 1 )) 108 | else: 109 | raise RuntimeError("Probabilities were not calculated for this model - make sure you pass probability=True to the constructor") 110 | 111 | def predict(self, X): 112 | """Perform classification on samples in X. 113 | 114 | Parameters 115 | ---------- 116 | X : array-like, shape = [n_samples, n_features] 117 | 118 | Returns 119 | ------- 120 | y_pred : array, shape = [n_samples] 121 | Class labels for samples in X. 
122 | """ 123 | 124 | y = numpy.array(self.model.getPredictions(X.tolist())) 125 | y[y == -1] = 0 126 | return y 127 | 128 | def score(self, X, y, sample_weight=None): 129 | return sklearn.metrics.accuracy_score(y, self.predict(X), sample_weight=sample_weight) 130 | -------------------------------------------------------------------------------- /methods/scikitWQDA.py: -------------------------------------------------------------------------------- 1 | import traceback 2 | import numpy 3 | import numpy as np 4 | from sklearn.base import BaseEstimator 5 | import sklearn.metrics 6 | import scipy.stats 7 | 8 | class WQDA(BaseEstimator): 9 | """ 10 | Weighted Quadratic Discriminant Analysis (QDA) 11 | 12 | A classifier with a quadratic decision boundary, which allows 13 | weighted samples, and which is generated by fitting class 14 | conditional densities to the data and using Bayes' rule. 15 | 16 | The model fits a Gaussian density to each class. 17 | 18 | Attributes 19 | ---------- 20 | covariances_ : list of array-like, shape = [n_features, n_features] 21 | Covariance matrices of each class. 22 | 23 | means_ : array-like, shape = [n_classes, n_features] 24 | Class means. 25 | 26 | priors_ : array-like, shape = [n_classes] 27 | Class priors (sum to 1). 28 | """ 29 | 30 | def __init__(self): # LDA 31 | self.use_shrinkage = False 32 | 33 | #TODO regularization 34 | def fit(self, X, y, sample_weight=[]): 35 | """ 36 | Fit the QDA model according to the given training data and parameters. 37 | 38 | Parameters 39 | ---------- 40 | X : array-like, shape = [n_samples, n_features] 41 | Training vector, where n_samples in the number of samples and 42 | n_features is the number of features. 43 | 44 | y : array, shape = [n_samples] 45 | Target values (integers) 46 | 47 | sample_weight : array-like, shape (n_samples,), optional 48 | Weights applied to individual samples. 49 | If not provided, uniform weights are assumed. 50 | 51 | Returns 52 | ------- 53 | self : returns an instance of self. 
54 | """ 55 | 56 | K = len(set(y)) 57 | if X.shape[0] < X.shape[1]: # less instances than dimensions -> use shrinkage 58 | self.use_shrinkage = True 59 | 60 | kindices = [numpy.where(y==k)[0] for k in range(K)] 61 | if len(sample_weight) == 0: 62 | qsum = numpy.bincount(y.astype(int)) 63 | sample_weight = numpy.ones(len(y)) 64 | else: 65 | qsum = numpy.array([numpy.sum(sample_weight[kindices[k]]) for k in range(K)]) 66 | sample_weight = numpy.reshape(sample_weight, (len(sample_weight),1)) 67 | self.priors_ = qsum / float(len(y)) 68 | self.means_ = [] 69 | self.covariances_ = [] 70 | self.covariance_ = [] 71 | for k in range(K): 72 | self.means_.append(numpy.average(X[kindices[k], :], axis=0, weights=sample_weight[kindices[k]][:,0])) 73 | ##QDA 74 | try: 75 | xm = X[kindices[k], :] - self.means_[k] 76 | if X.shape[0] > X.shape[1]: # more instances than features 77 | ##normalizing by number of data points (Loog 2015) 78 | self.covariances_.append(1./(len(kindices[k])) * numpy.multiply(xm, sample_weight[kindices[k]]).T.dot(xm)) 79 | ##weighted unbiased sample covariance (from http://stats.stackexchange.com/questions/61225/correct-equation-for-weighted-unbiased-sample-covariance ) 80 | #self.covariances_.append(1./(sample_weight[kindices[k], :].sum()) * numpy.multiply(xm, sample_weight[kindices[k]]).T.dot(xm)) 81 | else: # less instances than features - use shrinkage 82 | self.covariances_.append(weighted_oas(xm, sample_weight[kindices[k]])) 83 | 84 | except: 85 | traceback.print_exc() 86 | 87 | if len(self.covariance_) == 0: 88 | self.covariance_ = numpy.copy(self.covariances_[k]) 89 | else: 90 | self.covariance_ += self.covariances_[k] 91 | self.covariance_ /= float(K) 92 | 93 | return self 94 | 95 | def _log_posterior(self, X, normalize = True): 96 | # https://github.com/probml/pmtk3/blob/5fefd068a2e84ae508684d3e4750bd72a4164ba0/toolbox/SupervisedModels/discrimAnalysis/discrimAnalysisPredict.m 97 | # Apply Bayes rule with Gaussian class-conditional densities. 98 | # post[i,c] = P(y=c|x(i,:), params) 99 | # yhat[i] = arg max_c post[i,c] 100 | N = X.shape[0] 101 | Nclasses = len(self.priors_) 102 | loglik = numpy.zeros((N, Nclasses)) 103 | for c in range(Nclasses): 104 | try: 105 | mvnorm = scipy.stats.multivariate_normal(self.means_[c], self.covariances_[c], allow_singular=True) 106 | except: 107 | mvnorm = scipy.stats.multivariate_normal(self.means_[c], self.covariances_[c]) 108 | loglik[:, c] = mvnorm.logpdf(X) 109 | logjoint = numpy.log(self.priors_) + loglik 110 | 111 | if normalize: 112 | normalization = scipy.misc.logsumexp(logjoint, axis=1) 113 | return logjoint - numpy.reshape(normalization, (len(normalization), 1)) 114 | else: 115 | return logjoint 116 | 117 | def _posterior(self, X): 118 | return numpy.exp(self._log_posterior(X)) 119 | 120 | def predict(self, X): 121 | """Perform classification on samples in X. 122 | 123 | Parameters 124 | ---------- 125 | X : array-like, shape = [n_samples, n_features] 126 | 127 | Returns 128 | ------- 129 | y_pred : array, shape = [n_samples] 130 | Class labels for samples in X. 131 | """ 132 | return numpy.argmax(self._posterior(X), axis=1) 133 | 134 | def predict_proba(self, X): 135 | """Return posterior probabilities of classification. 136 | 137 | Parameters 138 | ---------- 139 | X : array-like, shape = [n_samples, n_features] 140 | Array of samples/test vectors. 141 | 142 | Returns 143 | ------- 144 | C : array, shape = [n_samples, n_classes] 145 | Posterior probabilities of classification per class. 
146 | """ 147 | return self._posterior(X) 148 | 149 | def score(self, X, y, sample_weight=None): 150 | return sklearn.metrics.accuracy_score(y, self.predict(X), sample_weight=sample_weight) 151 | 152 | def weighted_oas(X, weights): 153 | """Estimate covariance with the Oracle Approximating Shrinkage algorithm. 154 | 155 | Parameters 156 | ---------- 157 | X : array-like, shape (n_samples, n_features) 158 | Centered data from which to compute the covariance estimate. 159 | 160 | weights: sample weights 161 | 162 | Returns 163 | ------- 164 | shrunk_cov : array-like, shape (n_features, n_features) 165 | Shrunk covariance. 166 | 167 | Notes 168 | ----- 169 | The regularised (shrunk) covariance is: 170 | 171 | (1 - shrinkage)*cov 172 | + shrinkage * mu * np.identity(n_features) 173 | 174 | where mu = trace(cov) / n_features 175 | 176 | The formula we used to implement the OAS 177 | does not correspond to the one given in the article. It has been taken 178 | from the MATLAB program available from the author's webpage 179 | (https://tbayes.eecs.umich.edu/yilun/covestimation). 180 | 181 | """ 182 | X = np.asarray(X) 183 | n_samples, n_features = X.shape 184 | 185 | #emp_cov = empirical_covariance(X, assume_centered=assume_centered) 186 | emp_cov = 1./(len(weights)) * numpy.multiply(X, weights).T.dot(X) 187 | mu = np.trace(emp_cov) / n_features 188 | 189 | # formula from Chen et al.'s **implementation** 190 | alpha = np.mean(emp_cov ** 2) 191 | num = alpha + mu ** 2 192 | den = (n_samples + 1.) * (alpha - (mu ** 2) / n_features) 193 | 194 | shrinkage = 1. if den == 0 else min(num / den, 1.) 195 | shrunk_cov = (1. - shrinkage) * emp_cov 196 | shrunk_cov.flat[::n_features + 1] += shrinkage * mu 197 | 198 | return shrunk_cov#, shrinkage 199 | -------------------------------------------------------------------------------- /qdaexample - Copy.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/qdaexample - Copy.png -------------------------------------------------------------------------------- /qdaexample.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/qdaexample.png -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | 4 | try: 5 | from setuptools import setup 6 | except ImportError: 7 | from distutils.core import setup 8 | 9 | requirements = [ 10 | "sklearn", 11 | "scipy", 12 | "numpy", 13 | "matplotlib" 14 | ] 15 | 16 | test_requirements = [] 17 | 18 | setup( 19 | name='semisup_learn', 20 | version='0.0.1', 21 | description="Semisupervised Learning Framework", 22 | url='https://github.com/tmadl/semisup-learn', 23 | packages=[ 24 | 'methods', 'frameworks' 25 | ], 26 | include_package_data=True, 27 | install_requires=requirements, 28 | zip_safe=False, 29 | keywords='semisup-learn', 30 | ) -------------------------------------------------------------------------------- /svmexample1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/svmexample1.png -------------------------------------------------------------------------------- 
/svmexample2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/svmexample2.png --------------------------------------------------------------------------------
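As a closing illustration, the OAS shrinkage described in the `weighted_oas` docstring of `methods/scikitWQDA.py` can be exercised on its own. This is a sketch under stated assumptions (hypothetical data, uniform weights; it assumes the `methods` package is importable from the repository root):

```python
import numpy as np
from methods.scikitWQDA import weighted_oas

X = np.random.randn(10, 50)            # fewer samples than features
Xc = X - X.mean(axis=0)                # weighted_oas expects centered data
w = np.ones((Xc.shape[0], 1))          # uniform per-sample weights, shape (n_samples, 1)

# shrunk_cov = (1 - shrinkage) * emp_cov + shrinkage * mu * I,
# where mu = trace(emp_cov) / n_features
shrunk_cov = weighted_oas(Xc, w)
print(shrunk_cov.shape)                # (50, 50)
```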