├── .project
├── .pydevproject
├── LICENSE
├── README.md
├── alg1.png
├── eq1.png
├── examples
│   ├── __init__.py
│   ├── __init__.pyc
│   ├── compare_gaussian_methods.py
│   ├── compare_linsvm_methods.py
│   ├── compare_rbfsvm_methods.py
│   ├── example.py
│   ├── plotutils.py
│   └── plotutils.pyc
├── frameworks
│   ├── CPLELearning.py
│   ├── SelfLearning.py
│   └── __init__.py
├── methods
│   ├── __init__.py
│   ├── qns3vm.py
│   ├── scikitTSVM.py
│   └── scikitWQDA.py
├── qdaexample - Copy.png
├── qdaexample.png
├── setup.py
├── svmexample1.png
└── svmexample2.png
/.project:
--------------------------------------------------------------------------------
 1 | <?xml version="1.0" encoding="UTF-8"?>
 2 | <projectDescription>
 3 | <name>semisup-learn</name>
 4 | <comment></comment>
 5 | <projects>
 6 | </projects>
 7 | <buildSpec>
 8 | <buildCommand>
 9 | <name>org.python.pydev.PyDevBuilder</name>
10 | <arguments>
11 | </arguments>
12 | </buildCommand>
13 | </buildSpec>
14 | <natures>
15 | <nature>org.python.pydev.pythonNature</nature>
16 | </natures>
17 | </projectDescription>
18 |
--------------------------------------------------------------------------------
/.pydevproject:
--------------------------------------------------------------------------------
 1 | <?xml version="1.0" encoding="UTF-8" standalone="no"?>
 2 | <?eclipse-pydev version="1.0"?><pydev_project>
 3 | <pydev_pathproperty name="org.python.pydev.PROJECT_SOURCE_PATH">
 4 | <path>/${PROJECT_DIR_NAME}</path>
 5 | </pydev_pathproperty>
 6 | <pydev_property name="org.python.pydev.PYTHON_PROJECT_VERSION">python 2.7</pydev_property>
 7 | <pydev_property name="org.python.pydev.PYTHON_PROJECT_INTERPRETER">Default</pydev_property>
 8 | </pydev_project>
 9 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | The MIT License (MIT)
2 |
3 | Copyright (c) 2015 Tamas Madl
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
23 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | Semi-supervised learning frameworks for Python
2 | ===============
3 |
4 | This project contains Python implementations for semi-supervised
5 | learning, made compatible with scikit-learn, including
6 |
7 | - **Contrastive Pessimistic Likelihood Estimation (CPLE)** (based on - but not equivalent to - [Loog, 2015](http://arxiv.org/abs/1503.00269)), a `safe' framework applicable for all classifiers which can yield prediction probabilities
8 | (safe here means that the model trained on both labelled and unlabelled data should not be worse than models trained only on the labelled data)
9 |
10 | - Self learning (self training), a naive semi-supervised learning framework applicable for any classifier (iteratively labelling the unlabelled instances using a trained classifier, and then re-training it on the resulting dataset - see e.g. http://pages.cs.wisc.edu/~jerryzhu/pub/sslicml07.pdf )
11 |
12 | - Semi-Supervised Support Vector Machine (S3VM) - a simple scikit-learn compatible wrapper for the QN-S3VM code developed by
13 | Fabian Gieseke, Antti Airola, Tapio Pahikkala, Oliver Kramer (see http://www.fabiangieseke.de/index.php/code/qns3vm )
 14 | This method is included for comparison purposes.
15 |
16 | The first method is a novel extension of [Loog, 2015](http://arxiv.org/abs/1503.00269) for any discriminative classifier (the differences to the original CPLE are explained below). The last two methods are only included for comparison.
17 |
18 |
19 | The advantages of the CPLE framework compared to other semi-supervised learning approaches include
20 |
21 | - it is a **generally applicable framework (works with scikit-learn classifiers which allow per-sample weights)**
22 |
23 | - it needs low memory (as opposed to e.g. Label Spreading which needs O(n^2)), and
24 |
25 | - it makes no additional assumptions except for the ones made by the choice of classifier
26 |
27 | The main disadvantage is high computational complexity. Note: **this is an early stage research project, and work in progress** (it is by no means efficient or well tested)!
28 |
 29 | If you need results quickly, try the Self Learning framework (a naive but much faster approach):
30 |
31 | ```python
 32 | from sklearn.svm import SVC
 33 | from frameworks.SelfLearning import SelfLearningModel
34 | any_scikitlearn_classifier = SVC()
35 | ssmodel = SelfLearningModel(any_scikitlearn_classifier)
36 | ssmodel.fit(X, y)
37 | ```
38 |
39 | Usage
40 | ===============
41 |
42 | The project requires [scikit-learn](http://scikit-learn.org/stable/install.html), [matplotlib](http://matplotlib.org/users/installing.html) and [NLopt](http://ab-initio.mit.edu/wiki/index.php/NLopt_Installation) to run.
43 |
 44 | Usage example (for the required imports, see `examples/example.py`):
45 |
46 | ```python
47 | # load `Lung cancer' dataset from mldata.org
48 | cancer = fetch_mldata("Lung cancer (Ontario)")
49 | X = cancer.target.T
50 | ytrue = np.copy(cancer.data).flatten()
51 | ytrue[ytrue>0]=1
52 |
53 | # label a few points
54 | labeled_N = 4
55 | ys = np.array([-1]*len(ytrue)) # -1 denotes unlabeled point
56 | random_labeled_points = random.sample(np.where(ytrue == 0)[0], labeled_N/2)+\
57 | random.sample(np.where(ytrue == 1)[0], labeled_N/2)
58 | ys[random_labeled_points] = ytrue[random_labeled_points]
59 |
60 | # supervised score
61 | basemodel = SGDClassifier(loss='log', penalty='l1') # scikit logistic regression
62 | basemodel.fit(X[random_labeled_points, :], ys[random_labeled_points])
63 | print "supervised log.reg. score", basemodel.score(X, ytrue)
64 |
65 | # fast (but naive, unsafe) self learning framework
66 | ssmodel = SelfLearningModel(basemodel)
67 | ssmodel.fit(X, ys)
68 | print "self-learning log.reg. score", ssmodel.score(X, ytrue)
69 |
70 | # semi-supervised score (base model has to be able to take weighted samples)
71 | ssmodel = CPLELearningModel(basemodel)
72 | ssmodel.fit(X, ys)
73 | print "CPLE semi-supervised log.reg. score", ssmodel.score(X, ytrue)
74 |
75 | # semi-supervised score, RBF SVM model
76 | ssmodel = CPLELearningModel(sklearn.svm.SVC(kernel="rbf", probability=True), predict_from_probabilities=True) # RBF SVM
77 | ssmodel.fit(X, ys)
78 | print "CPLE semi-supervised RBF SVM score", ssmodel.score(X, ytrue)
79 |
80 | # supervised log.reg. score 0.410256410256
81 | # self-learning log.reg. score 0.461538461538
82 | # semi-supervised log.reg. score 0.615384615385
83 | # semi-supervised RBF SVM score 0.769230769231
84 | ```
85 |
86 |
87 | Examples
88 | ===============
89 |
90 | Two-class classification examples with 56 unlabelled (small circles in the plot) and 4 labelled (large circles in the plot) data points.
91 | Plot titles show classification accuracies (percentage of data points correctly classified by the model)
92 |
93 | In the second example, **the state-of-the-art S3VM performs worse than the purely supervised SVM**, while the CPLE SVM (by means of the
94 | pessimistic assumption) provides increased accuracy.
95 |
96 | Quadratic Discriminant Analysis (from left to right: supervised QDA, Self learning QDA, pessimistic CPLE QDA)
 97 | ![QDA comparison](qdaexample.png)
98 |
99 | Support Vector Machine (from left to right: supervised SVM, S3VM [(Gieseke et al., 2012)](http://www.sciencedirect.com/science/article/pii/S0925231213003706), pessimistic CPLE SVM)
 100 | ![SVM comparison 1](svmexample1.png)
101 |
102 | Support Vector Machine (from left to right: supervised SVM, S3VM [(Gieseke et al., 2012)](http://www.sciencedirect.com/science/article/pii/S0925231213003706), pessimistic CPLE SVM)
 103 | ![SVM comparison 2](svmexample2.png)
104 |
105 | Motivation
106 | ===============
107 |
108 | Current semi-supervised learning approaches require strong assumptions, and perform badly if those
 109 | assumptions are violated (e.g. the low density assumption, the clustering assumption). In some cases, they can perform worse than a supervised classifier trained only on the labeled examples. Furthermore, the vast majority require O(N^2) memory.
110 |
111 | [(Loog, 2015)](http://arxiv.org/abs/1503.00269) has suggested an elegant framework (called Contrastive Pessimistic Likelihood Estimation / CPLE) which
112 | **only uses assumptions intrinsic to the chosen classifier**, and thus allows choosing likelihood-based classifiers which fit the domain / data
113 | distribution at hand, and can work even if some of the assumptions mentioned above are violated. The idea is to pessimistically assign soft labels
114 | to the unlabelled data, such that the improvement over the supervised version is minimal (i.e. assume the worst case for the unknown labels).
115 |
116 | The parameters in CPLE can be estimated according to:
 117 | ![CPLE estimation equation](eq1.png)
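
For readers of this dump who cannot see `eq1.png`, the contrastive pessimistic estimate described in (Loog, 2015) has roughly the following form (a transcription of the idea, not necessarily the exact equation in the image):

```latex
% Contrastive pessimistic likelihood estimate (cf. Loog, 2015): maximize the
% worst-case gain in log likelihood L over the supervised solution theta_sup,
% where q are soft labels hypothesized for the unlabelled data U.
\hat{\theta}_{\mathrm{CPLE}} = \arg\max_{\theta} \; \min_{q \in [0,1]^{|U|}}
    \left[ L(\theta \mid X, y, U, q) \;-\; L(\hat{\theta}_{\mathrm{sup}} \mid X, y, U, q) \right]
```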
118 |
119 | The original CPLE framework is only applicable to likelihood-based classifiers, and (Loog, 2015) only provides solutions for Linear Discriminant Analysis and the Nearest Mean Classifier.
120 |
121 | The CPLE implementation in this project
122 | ===============
123 |
 124 | Building on this idea, this project provides a general semi-supervised learning framework into which **any classifier** can be plugged, as long as it 1) supports instance weighting and 2) can generate probability
 125 | estimates. (Probability estimates can also be obtained via [Platt scaling](https://en.wikipedia.org/wiki/Platt_scaling) for classifiers which don't provide them, and an experimental feature
 126 | is included to make the approach work with classifiers that do not support instance weighting.)
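
As an illustrative sketch (not part of the original README) of the second option, here is one way to plug in a base classifier without `sample_weight` support, using the constructor arguments defined in `frameworks/CPLELearning.py`; `KNeighborsClassifier` is an arbitrary example, and `X`, `ys` follow the same -1-for-unlabelled convention as the usage example above:

```python
from sklearn.neighbors import KNeighborsClassifier
from frameworks.CPLELearning import CPLELearningModel

# KNeighborsClassifier provides predict_proba but its fit() takes no
# sample_weight, so the experimental use_sample_weighting=False mode is used
# (the pessimistic soft labels are then rounded to hard labels instead of
# being passed as per-instance weights).
model = CPLELearningModel(KNeighborsClassifier(),
                          predict_from_probabilities=True,
                          use_sample_weighting=False)
model.fit(X, ys)  # ys contains -1 for unlabelled points
```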
127 |
128 | In order to make the approach work with any classifier, the discriminative likelihood (DL) is used instead of the generative likelihood, which is the first major difference to (Loog, 2015). The second
129 | difference is that only the unlabelled data is included in the first term of the minimization objective (point 2. below), which leads to pessimistic minimization of the DL over the unlabelled data, but maximization
130 | of the DL over the labelled data. (Note that the DL is equivalent to the negative log loss for binary classifiers with probabilistic predictions - see below.)
131 |
 132 | ![CPLE algorithm](alg1.png)
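
In symbols (a sketch of what `frameworks/CPLELearning.py` computes rather than a verbatim copy of `alg1.png`; here q_ik is the soft label weight assigning unlabelled point u_i to class k, and p_theta(k|x) is the base classifier's predicted probability):

```latex
% Discriminative log likelihood with soft labels q: the negative log loss
% (cross-entropy) of the classifier's probabilistic predictions.
\mathrm{DL}(\theta; U, q) = \frac{1}{|U|} \sum_{i=1}^{|U|} \sum_{k \in \{0,1\}}
    q_{ik} \, \log p_\theta(k \mid u_i)

% Pessimistic soft labels: minimize the unlabelled DL relative to the labelled
% DL, refitting the base model theta_q for every soft-label hypothesis q.
q^{*} = \arg\min_{q} \; \big[ \lambda \, \mathrm{DL}(\theta_q; U, q)
    - \mathrm{DL}(\theta_q; L, y_L) \big]
```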
133 |
 134 | The resulting semi-supervised learning framework is highly computationally expensive, but has the advantages of being a generally applicable framework, needing low memory, and making no additional assumptions except for the ones made by the choice of classifier.
135 |
--------------------------------------------------------------------------------
/alg1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/alg1.png
--------------------------------------------------------------------------------
/eq1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/eq1.png
--------------------------------------------------------------------------------
/examples/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/examples/__init__.py
--------------------------------------------------------------------------------
/examples/__init__.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/examples/__init__.pyc
--------------------------------------------------------------------------------
/examples/compare_gaussian_methods.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 |
3 | from frameworks.CPLELearning import CPLELearningModel
4 | from frameworks.SelfLearning import SelfLearningModel
5 | from methods.scikitWQDA import WQDA
6 | from examples.plotutils import evaluate_and_plot
7 |
8 | # number of data points
9 | N = 60
10 | supevised_data_points = 4
11 |
12 | # generate data
13 | meandistance = 1
14 |
15 | s = np.random.random()
16 | cov = [[s, 0], [0, s]]
17 | Xs = np.random.multivariate_normal([-s*meandistance, -s*meandistance], cov, (N,))
18 | Xs = np.vstack(( Xs, np.random.multivariate_normal([s*meandistance, s*meandistance], cov, (N,)) ))
19 | ytrue = np.array([0]*N + [1]*N)
20 |
21 | ys = np.array([-1]*(2*N))
22 | for i in range(supevised_data_points/2):
23 | ys[np.random.randint(0, N)] = 0
24 | for i in range(supevised_data_points/2):
25 | ys[np.random.randint(N, 2*N)] = 1
26 |
27 | Xsupervised = Xs[ys!=-1, :]
28 | ysupervised = ys[ys!=-1]
29 |
30 | # compare models
31 |
32 | lbl = "Purely supervised QDA:"
33 | print(lbl)
34 | model = WQDA()
35 | model.fit(Xsupervised, ysupervised)
36 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 1)
37 |
38 | lbl = "SelfLearning QDA:"
39 | print(lbl)
40 | model = SelfLearningModel(WQDA())
41 | model.fit(Xs, ys)
42 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 2)
43 |
44 | lbl = "CPLE(pessimistic) QDA:"
45 | print(lbl)
46 | model = CPLELearningModel(WQDA(), predict_from_probabilities=True)
47 | model.fit(Xs, ys)
48 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 3)
49 |
50 | lbl = "CPLE(optimistic) QDA:"
51 | print(lbl)
52 | CPLELearningModel.pessimistic = False
53 | model = CPLELearningModel(WQDA(), predict_from_probabilities=True)
54 | model.fit(Xs, ys)
55 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 4, block=True)
56 |
--------------------------------------------------------------------------------
/examples/compare_linsvm_methods.py:
--------------------------------------------------------------------------------
1 | import sklearn.svm
2 | import numpy as np
3 | import random
4 |
5 | from frameworks.CPLELearning import CPLELearningModel
6 | from methods import scikitTSVM
7 | from examples.plotutils import evaluate_and_plot
8 |
9 | kernel = "linear"
10 |
11 | # number of data points
12 | N = 60
13 | supevised_data_points = 2
14 | noise_probability = 0.1
15 |
16 | # generate data
17 | cov = [[0.5, 0], [0, 0.5]]
18 | Xs = np.random.multivariate_normal([0.5,0.5], cov, (N,))
19 | ytrue = []
20 | for i in range(N):
21 | if np.random.random() < noise_probability:
22 | ytrue.append(np.random.randint(2))
23 | else:
24 | ytrue.append(1 if np.sum(Xs[i])>1 else 0)
25 | Xs = np.array(Xs)
26 | ytrue = np.array(ytrue).astype(int)
27 |
28 | ys = np.array([-1]*N)
29 | sidx = random.sample(np.where(ytrue == 0)[0], supevised_data_points/2)+random.sample(np.where(ytrue == 1)[0], supevised_data_points/2)
30 | ys[sidx] = ytrue[sidx]
31 |
32 | Xsupervised = Xs[ys!=-1, :]
33 | ysupervised = ys[ys!=-1]
34 |
35 | # compare models
36 | lbl = "Purely supervised SVM:"
37 | print(lbl)
38 | model = sklearn.svm.SVC(kernel=kernel, probability=True)
39 | model.fit(Xsupervised, ysupervised)
40 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 1)
41 |
42 | lbl = "S3VM (Gieseke et al. 2012):"
43 | print(lbl)
44 | model = scikitTSVM.SKTSVM(kernel=kernel)
45 | model.fit(Xs, ys.astype(int))
46 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 2)
47 |
48 | lbl = "CPLE(pessimistic) SVM:"
49 | print(lbl)
50 | model = CPLELearningModel(sklearn.svm.SVC(kernel=kernel, probability=True), predict_from_probabilities=True)
51 | model.fit(Xs, ys.astype(int))
52 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 3)
53 |
54 | lbl = "CPLE(optimistic) SVM:"
55 | print(lbl)
56 | CPLELearningModel.pessimistic = False
57 | model = CPLELearningModel(sklearn.svm.SVC(kernel=kernel, probability=True), predict_from_probabilities=True)
58 | model.fit(Xs, ys.astype(int))
59 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 4, block=True)
60 |
--------------------------------------------------------------------------------
/examples/compare_rbfsvm_methods.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import random
3 | import sklearn.svm
4 |
5 | from frameworks.CPLELearning import CPLELearningModel
6 | from methods import scikitTSVM
7 | from examples.plotutils import evaluate_and_plot
8 |
9 | kernel = "rbf"
10 |
11 | # number of data points
12 | N = 60
13 | supevised_data_points = 4
14 |
15 | # generate data
16 | meandistance = 2
17 |
18 | s = np.random.random()
19 | cov = [[s, 0], [0, s]]
20 | # some random Gaussians
21 | gaussians = 6 #np.random.randint(4, 7)
22 | Xs = np.random.multivariate_normal([np.random.random()*meandistance, np.random.random()*meandistance], cov, (N/gaussians,))
23 | for i in range(gaussians-1):
24 | Xs = np.vstack(( Xs, np.random.multivariate_normal([np.random.random()*meandistance, np.random.random()*meandistance], cov, (N/gaussians,)) ))
25 |
26 | # cut data into XOR
27 | ytrue = ((Xs[:, 0] < np.mean(Xs[:, 0]))*(Xs[:, 1] < np.mean(Xs[:, 1])) + (Xs[:, 0] > np.mean(Xs[:, 0]))*(Xs[:, 1] > np.mean(Xs[:, 1])))*1
28 |
29 | ys = np.array([-1]*N)
30 | sidx = random.sample(np.where(ytrue == 0)[0], supevised_data_points/2)+random.sample(np.where(ytrue == 1)[0], supevised_data_points/2)
31 | ys[sidx] = ytrue[sidx]
32 |
33 | Xsupervised = Xs[ys!=-1, :]
34 | ysupervised = ys[ys!=-1]
35 |
36 | # compare models
37 | lbl = "Purely supervised SVM:"
38 | print(lbl)
39 | model = sklearn.svm.SVC(kernel=kernel, probability=True)
40 | model.fit(Xsupervised, ysupervised)
41 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 1)
42 |
43 |
44 | lbl = "S3VM (Gieseke et al. 2012):"
45 | print(lbl)
46 | model = scikitTSVM.SKTSVM(kernel=kernel)
47 | model.fit(Xs, ys)
48 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 2)
49 |
50 |
51 | lbl = "CPLE(pessimistic) SVM:"
52 | print(lbl)
53 | model = CPLELearningModel(sklearn.svm.SVC(kernel=kernel, probability=True), predict_from_probabilities=True)
54 | model.fit(Xs, ys)
55 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 3)
56 |
57 |
58 | lbl = "CPLE(optimistic) SVM:"
59 | print(lbl)
60 | CPLELearningModel.pessimistic = False
61 | model = CPLELearningModel(sklearn.svm.SVC(kernel=kernel, probability=True), predict_from_probabilities=True)
62 | model.fit(Xs, ys)
63 | evaluate_and_plot(model, Xs, ys, ytrue, lbl, 4, block=True)
64 |
--------------------------------------------------------------------------------
/examples/example.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import random
3 | from frameworks.CPLELearning import CPLELearningModel
4 | from sklearn.datasets.mldata import fetch_mldata
5 | from sklearn.linear_model.stochastic_gradient import SGDClassifier
6 | import sklearn.svm
7 | from methods.scikitWQDA import WQDA
8 | from frameworks.SelfLearning import SelfLearningModel
9 |
10 | # load data
11 | cancer = fetch_mldata("Lung cancer (Ontario)")
12 | X = cancer.target.T
13 | ytrue = np.copy(cancer.data).flatten()
14 | ytrue[ytrue>0]=1
15 |
16 | # label a few points
17 | labeled_N = 4
18 | ys = np.array([-1]*len(ytrue)) # -1 denotes unlabeled point
19 | random_labeled_points = random.sample(np.where(ytrue == 0)[0], labeled_N/2)+\
20 | random.sample(np.where(ytrue == 1)[0], labeled_N/2)
21 | ys[random_labeled_points] = ytrue[random_labeled_points]
22 |
23 | # supervised score
24 | #basemodel = WQDA() # weighted Quadratic Discriminant Analysis
25 | basemodel = SGDClassifier(loss='log', penalty='l1') # scikit logistic regression
26 | basemodel.fit(X[random_labeled_points, :], ys[random_labeled_points])
27 | print "supervised log.reg. score", basemodel.score(X, ytrue)
28 |
29 | # fast (but naive, unsafe) self learning framework
30 | ssmodel = SelfLearningModel(basemodel)
31 | ssmodel.fit(X, ys)
32 | print "self-learning log.reg. score", ssmodel.score(X, ytrue)
33 |
34 | # semi-supervised score (base model has to be able to take weighted samples)
35 | ssmodel = CPLELearningModel(basemodel)
36 | ssmodel.fit(X, ys)
37 | print "CPLE semi-supervised log.reg. score", ssmodel.score(X, ytrue)
38 |
39 | # semi-supervised score, WQDA model
40 | ssmodel = CPLELearningModel(WQDA(), predict_from_probabilities=True) # weighted Quadratic Discriminant Analysis
41 | ssmodel.fit(X, ys)
42 | print "CPLE semi-supervised WQDA score", ssmodel.score(X, ytrue)
43 |
44 | # semi-supervised score, RBF SVM model
45 | ssmodel = CPLELearningModel(sklearn.svm.SVC(kernel="rbf", probability=True), predict_from_probabilities=True) # RBF SVM
46 | ssmodel.fit(X, ys)
47 | print "CPLE semi-supervised RBF SVM score", ssmodel.score(X, ytrue)
48 |
--------------------------------------------------------------------------------
/examples/plotutils.py:
--------------------------------------------------------------------------------
1 | import matplotlib.pyplot as plt
2 | import numpy as np
3 |
4 | cols = [np.array([1,0,0]),np.array([0,1,0])] # colors
5 |
6 | def evaluate_and_plot(model, Xs, ys, ytrue, lbl, subplot = None, block=False):
7 | if subplot != None:
8 | plt.subplot(2,2,subplot)
9 |
10 | # predict, and evaluate
11 | pred = model.predict(Xs)
12 |
13 | acc = np.mean(pred==ytrue)
14 | print "accuracy:", round(acc, 3)
15 |
16 | # plot probabilities
17 | [minx, maxx] = [np.min(Xs[:, 0]), np.max(Xs[:, 0])]
18 | [miny, maxy] = [np.min(Xs[:, 1]), np.max(Xs[:, 1])]
19 | gridsize = 100
20 | xx = np.linspace(minx, maxx, gridsize)
21 | yy = np.linspace(miny, maxy, gridsize).T
22 | xx, yy = np.meshgrid(xx, yy)
23 | Xfull = np.c_[xx.ravel(), yy.ravel()]
24 | probas = model.predict_proba(Xfull)
25 | plt.imshow(probas[:, 1].reshape((gridsize, gridsize)), extent=(minx, maxx, miny, maxy), origin='lower')
26 |
27 | # plot decision boundary
28 | try:
29 | if hasattr(model, 'predict_from_probabilities') and model.predict_from_probabilities:
30 | plt.contour((probas[:, 0]-1)*300+100, linewidth=1, edgecolor=[cols[p]*P[p] for p in model.predict(Xs).astype(int)], cmap='hot')
39 | plt.scatter(Xs[ys>-1, 0], Xs[ys>-1,1], c=ytrue[ys>-1], s=300, linewidth=1, edgecolor=[cols[p]*P[p] for p in model.predict(Xs).astype(int)], cmap='hot')
40 | plt.title(lbl + str(round(acc, 2)))
41 |
42 | plt.show(block=block)
--------------------------------------------------------------------------------
/examples/plotutils.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/examples/plotutils.pyc
--------------------------------------------------------------------------------
/frameworks/CPLELearning.py:
--------------------------------------------------------------------------------
1 | class Unbuffered(object):
2 | def __init__(self, stream):
3 | self.stream = stream
4 | def write(self, data):
5 | self.stream.write(data)
6 | self.stream.flush()
7 | def __getattr__(self, attr):
8 | return getattr(self.stream, attr)
9 |
10 | import sys
11 | sys.stdout = Unbuffered(sys.stdout)
12 |
13 | from sklearn.base import BaseEstimator
14 | import numpy
15 | import sklearn.metrics
16 | from sklearn.linear_model import LogisticRegression as LR
17 | import nlopt
18 | import scipy.stats
19 |
20 | class CPLELearningModel(BaseEstimator):
21 | """
22 | Contrastive Pessimistic Likelihood Estimation framework for semi-supervised
23 | learning, based on (Loog, 2015). This implementation contains two
24 | significant differences to (Loog, 2015):
25 | - the discriminative likelihood p(y|X), instead of the generative
26 | likelihood p(X), is used for optimization
27 | - apart from `pessimism' (the assumption that the true labels of the
28 | unlabeled instances are as adversarial to the likelihood as possible), the
29 | optimization objective also tries to increase the likelihood on the labeled
30 | examples
31 |
32 | This class takes a base model (any scikit learn estimator),
33 | trains it on the labeled examples, and then uses global optimization to
34 | find (soft) label hypotheses for the unlabeled examples in a pessimistic
35 | fashion (such that the model log likelihood on the unlabeled data is as
36 | small as possible, but the log likelihood on the labeled data is as high
37 | as possible)
38 |
39 | See Loog, Marco. "Contrastive Pessimistic Likelihood Estimation for
40 | Semi-Supervised Classification." arXiv preprint arXiv:1503.00269 (2015).
41 | http://arxiv.org/pdf/1503.00269
42 |
43 | Attributes
44 | ----------
45 | basemodel : BaseEstimator instance
46 | Base classifier to be trained on the partially supervised data
47 |
48 | pessimistic : boolean, optional (default=True)
49 | Whether the label hypotheses for the unlabeled instances should be
50 | pessimistic (i.e. minimize log likelihood) or optimistic (i.e.
51 | maximize log likelihood).
52 | Pessimistic label hypotheses ensure safety (i.e. the semi-supervised
53 | solution will not be worse than a model trained on the purely
54 | supervised instances)
55 |
56 | predict_from_probabilities : boolean, optional (default=False)
57 | The prediction is calculated from the probabilities if this is True
58 | (1 if more likely than the mean predicted probability or 0 otherwise).
59 | If it is false, the normal base model predictions are used.
60 | This only affects the predict function. Warning: only set to true if
61 | predict will be called with a substantial number of data points
62 |
63 | use_sample_weighting : boolean, optional (default=True)
64 | Whether to use sample weights (soft labels) for the unlabeled instances.
65 | Setting this to False allows the use of base classifiers which do not
66 | support sample weights (but might slow down the optimization)
67 |
68 | max_iter : int, optional (default=3000)
69 | Maximum number of iterations
70 |
71 | verbose : int, optional (default=1)
72 | Enable verbose output (1 shows progress, 2 shows the detailed log
73 | likelihood at every iteration).
74 |
75 | """
76 |
77 | def __init__(self, basemodel, pessimistic=True, predict_from_probabilities = False, use_sample_weighting = True, max_iter=3000, verbose = 1):
78 | self.model = basemodel
79 | self.pessimistic = pessimistic
80 | self.predict_from_probabilities = predict_from_probabilities
81 | self.use_sample_weighting = use_sample_weighting
82 | self.max_iter = max_iter
83 | self.verbose = verbose
84 |
85 | self.it = 0 # iteration counter
86 | self.noimprovementsince = 0 # log likelihood hasn't improved since this number of iterations
87 | self.maxnoimprovementsince = 3 # threshold for iterations without improvements (convergence is assumed when this is reached)
88 |
89 | self.buffersize = 200
90 | # buffer for the last few discriminative likelihoods (used to check for convergence)
91 | self.lastdls = [0]*self.buffersize
92 |
93 | # best discriminative likelihood and corresponding soft labels; updated during training
94 | self.bestdl = numpy.infty
95 | self.bestlbls = []
96 |
97 | # unique id
98 | self.id = str(unichr(numpy.random.randint(26)+97))+str(unichr(numpy.random.randint(26)+97))
99 |
100 | def discriminative_likelihood(self, model, labeledData, labeledy = None, unlabeledData = None, unlabeledWeights = None, unlabeledlambda = 1, gradient=[], alpha = 0.01):
101 | unlabeledy = (unlabeledWeights[:, 0]<0.5)*1
102 | uweights = numpy.copy(unlabeledWeights[:, 0]) # large prob. for k=0 instances, small prob. for k=1 instances
103 | uweights[unlabeledy==1] = 1-uweights[unlabeledy==1] # subtract from 1 for k=1 instances to reflect confidence
104 | weights = numpy.hstack((numpy.ones(len(labeledy)), uweights))
105 | labels = numpy.hstack((labeledy, unlabeledy))
106 |
107 | # fit model on supervised data
108 | if self.use_sample_weighting:
109 | model.fit(numpy.vstack((labeledData, unlabeledData)), labels, sample_weight=weights)
110 | else:
111 | model.fit(numpy.vstack((labeledData, unlabeledData)), labels)
112 |
113 | # probability of labeled data
114 | P = model.predict_proba(labeledData)
115 |
116 | try:
117 | # labeled discriminative log likelihood
118 | labeledDL = -sklearn.metrics.log_loss(labeledy, P)
119 | except Exception as e:
120 | print(e)
121 | P = model.predict_proba(labeledData)
122 |
123 | # probability of unlabeled data
124 | unlabeledP = model.predict_proba(unlabeledData)
125 |
126 | try:
127 | # unlabeled discriminative log likelihood
128 | eps = 1e-15
129 | unlabeledP = numpy.clip(unlabeledP, eps, 1 - eps)
130 | unlabeledDL = numpy.average((unlabeledWeights*numpy.vstack((1-unlabeledy, unlabeledy)).T*numpy.log(unlabeledP)).sum(axis=1))
131 | except Exception as e:
132 | print(e)
133 | unlabeledP = model.predict_proba(unlabeledData)
134 |
135 | if self.pessimistic:
136 | # pessimistic: minimize the difference between unlabeled and labeled discriminative likelihood (assume worst case for unknown true labels)
137 | dl = unlabeledlambda * unlabeledDL - labeledDL
138 | else:
139 | # optimistic: minimize negative total discriminative likelihood (i.e. maximize likelihood)
140 | dl = - unlabeledlambda * unlabeledDL - labeledDL
141 |
142 | return dl
143 |
144 | def discriminative_likelihood_objective(self, model, labeledData, labeledy = None, unlabeledData = None, unlabeledWeights = None, unlabeledlambda = 1, gradient=[], alpha = 0.01):
145 | if self.it == 0:
146 | self.lastdls = [0]*self.buffersize
147 |
148 | dl = self.discriminative_likelihood(model, labeledData, labeledy, unlabeledData, unlabeledWeights, unlabeledlambda, gradient, alpha)
149 |
150 | self.it += 1
151 | self.lastdls[numpy.mod(self.it, len(self.lastdls))] = dl
152 |
153 | if numpy.mod(self.it, self.buffersize) == 0: # or True:
154 | improvement = numpy.mean((self.lastdls[(len(self.lastdls)/2):])) - numpy.mean((self.lastdls[:(len(self.lastdls)/2)]))
155 | # ttest - test for hypothesis that the likelihoods have not changed (i.e. there has been no improvement, and we are close to convergence)
156 | _, prob = scipy.stats.ttest_ind(self.lastdls[(len(self.lastdls)/2):], self.lastdls[:(len(self.lastdls)/2)])
157 |
158 | # if improvement is not certain according to the t-test...
159 | noimprovement = prob > 0.1 and numpy.mean(self.lastdls[(len(self.lastdls)/2):]) < numpy.mean(self.lastdls[:(len(self.lastdls)/2)])
160 | if noimprovement:
161 | self.noimprovementsince += 1
162 | if self.noimprovementsince >= self.maxnoimprovementsince:
163 | # no improvement for a while - assume convergence; exit
164 | self.noimprovementsince = 0
165 | raise Exception(" converged.") # we need to raise an exception to get NLopt to stop before exceeding the iteration budget
166 | else:
167 | self.noimprovementsince = 0
168 |
169 | if self.verbose == 2:
170 | print(self.id,self.it, dl, numpy.mean(self.lastdls), improvement, round(prob, 3), (prob < 0.1))
171 | elif self.verbose:
172 | sys.stdout.write(('.' if self.pessimistic else '.') if not noimprovement else 'n')
173 |
174 | if dl < self.bestdl:
175 | self.bestdl = dl
176 | self.bestlbls = numpy.copy(unlabeledWeights[:, 0])
177 |
178 | return dl
179 |
180 | def fit(self, X, y): # -1 for unlabeled
181 | unlabeledX = X[y==-1, :]
182 | labeledX = X[y!=-1, :]
183 | labeledy = y[y!=-1]
184 |
185 | M = unlabeledX.shape[0]
186 |
187 | # train on labeled data
188 | self.model.fit(labeledX, labeledy)
189 |
190 | unlabeledy = self.predict(unlabeledX)
191 |
192 | #re-train, labeling unlabeled instances pessimistically
193 |
194 | # pessimistic soft labels ('weights') q for unlabelled points, q=P(k=0|Xu)
195 | f = lambda softlabels, grad=[]: self.discriminative_likelihood_objective(self.model, labeledX, labeledy=labeledy, unlabeledData=unlabeledX, unlabeledWeights=numpy.vstack((softlabels, 1-softlabels)).T, gradient=grad) #- supLL
196 | lblinit = numpy.random.random(len(unlabeledy))
197 |
198 | try:
199 | self.it = 0
200 | opt = nlopt.opt(nlopt.GN_DIRECT_L_RAND, M)
201 | opt.set_lower_bounds(numpy.zeros(M))
202 | opt.set_upper_bounds(numpy.ones(M))
203 | opt.set_min_objective(f)
204 | opt.set_maxeval(self.max_iter)
205 | self.bestsoftlbl = opt.optimize(lblinit)
206 | print(" max_iter exceeded.")
207 | except Exception as e:
208 | print(e)
209 | self.bestsoftlbl = self.bestlbls
210 |
211 | if numpy.any(self.bestsoftlbl != self.bestlbls):
212 | self.bestsoftlbl = self.bestlbls
213 | ll = f(self.bestsoftlbl)
214 |
215 | unlabeledy = (self.bestsoftlbl<0.5)*1
216 | uweights = numpy.copy(self.bestsoftlbl) # large prob. for k=0 instances, small prob. for k=1 instances
217 | uweights[unlabeledy==1] = 1-uweights[unlabeledy==1] # subtract from 1 for k=1 instances to reflect confidence
218 | weights = numpy.hstack((numpy.ones(len(labeledy)), uweights))
219 | labels = numpy.hstack((labeledy, unlabeledy))
220 | if self.use_sample_weighting:
221 | self.model.fit(numpy.vstack((labeledX, unlabeledX)), labels, sample_weight=weights)
222 | else:
223 | self.model.fit(numpy.vstack((labeledX, unlabeledX)), labels)
224 |
225 | if self.verbose > 1:
226 | print("number of non-one soft labels: ", numpy.sum(self.bestsoftlbl != 1), ", balance:", numpy.sum(self.bestsoftlbl<0.5), " / ", len(self.bestsoftlbl))
227 | print("current likelihood: ", ll)
228 |
229 | if not getattr(self.model, "predict_proba", None):
230 | # Platt scaling
231 | self.plattlr = LR()
232 | preds = self.model.predict(labeledX)
233 | self.plattlr.fit( preds.reshape( -1, 1 ), labeledy )
234 |
235 | return self
236 |
237 | def predict_proba(self, X):
238 | """Compute probabilities of possible outcomes for samples in X.
239 |
240 | The model needs to have probability information computed at training
241 | time: fit with attribute `probability` set to True.
242 |
243 | Parameters
244 | ----------
245 | X : array-like, shape = [n_samples, n_features]
246 |
247 | Returns
248 | -------
249 | T : array-like, shape = [n_samples, n_classes]
250 | Returns the probability of the sample for each class in
251 | the model. The columns correspond to the classes in sorted
252 | order, as they appear in the attribute `classes_`.
253 | """
254 |
255 | if getattr(self.model, "predict_proba", None):
256 | return self.model.predict_proba(X)
257 | else:
258 | preds = self.model.predict(X)
259 | return self.plattlr.predict_proba(preds.reshape( -1, 1 ))
260 |
261 | def predict(self, X):
262 | """Perform classification on samples in X.
263 |
264 | Parameters
265 | ----------
266 | X : array-like, shape = [n_samples, n_features]
267 |
268 | Returns
269 | -------
270 | y_pred : array, shape = [n_samples]
271 | Class labels for samples in X.
272 | """
273 |
274 | if self.predict_from_probabilities:
275 | P = self.predict_proba(X)
276 | return (P[:, 0] < numpy.average(P[:, 0]))*1
277 | else:
278 | return self.model.predict(X)

--------------------------------------------------------------------------------
/frameworks/SelfLearning.py:
--------------------------------------------------------------------------------
 68 | uidx = numpy.where((unlabeledprob[:, 0] > self.prob_threshold) | (unlabeledprob[:, 1] > self.prob_threshold))[0]
69 |
70 | self.model.fit(numpy.vstack((labeledX, unlabeledX[uidx, :])), numpy.hstack((labeledy, unlabeledy_old[uidx])))
71 | unlabeledy = self.predict(unlabeledX)
72 | unlabeledprob = self.predict_proba(unlabeledX)
73 | i += 1
74 |
75 | if not getattr(self.model, "predict_proba", None):
76 | # Platt scaling if the model cannot generate probability estimates itself
77 | self.plattlr = LR()
78 | preds = self.model.predict(labeledX)
79 | self.plattlr.fit( preds.reshape( -1, 1 ), labeledy )
80 |
81 | return self
82 |
83 | def predict_proba(self, X):
84 | """Compute probabilities of possible outcomes for samples in X.
85 |
86 | The model needs to have probability information computed at training
87 | time: fit with attribute `probability` set to True.
88 |
89 | Parameters
90 | ----------
91 | X : array-like, shape = [n_samples, n_features]
92 |
93 | Returns
94 | -------
95 | T : array-like, shape = [n_samples, n_classes]
96 | Returns the probability of the sample for each class in
97 | the model. The columns correspond to the classes in sorted
98 | order, as they appear in the attribute `classes_`.
99 | """
100 |
101 | if getattr(self.model, "predict_proba", None):
102 | return self.model.predict_proba(X)
103 | else:
104 | preds = self.model.predict(X)
105 | return self.plattlr.predict_proba(preds.reshape( -1, 1 ))
106 |
107 | def predict(self, X):
108 | """Perform classification on samples in X.
109 |
110 | Parameters
111 | ----------
112 | X : array-like, shape = [n_samples, n_features]
113 |
114 | Returns
115 | -------
116 | y_pred : array, shape = [n_samples]
117 | Class labels for samples in X.
118 | """
119 |
120 | return self.model.predict(X)
121 |
122 | def score(self, X, y, sample_weight=None):
123 | return sklearn.metrics.accuracy_score(y, self.predict(X), sample_weight=sample_weight)
124 |
--------------------------------------------------------------------------------
/frameworks/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/frameworks/__init__.py
--------------------------------------------------------------------------------
/methods/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/methods/__init__.py
--------------------------------------------------------------------------------
/methods/qns3vm.py:
--------------------------------------------------------------------------------
1 | ############################################################################################
2 | # QN-S3VM BFGS optimizer for semi-supervised support vector machines.
3 | #
4 | # This implementation provides an L-BFGS optimization scheme
5 | # for semi-supervised support vector machines. Details can be found in:
6 | #
7 | # F. Gieseke, A. Airola, T. Pahikkala, O. Kramer, Sparse quasi-
8 | # Newton optimization for semi-supervised support vector ma-
9 | # chines, in: Proc. of the 1st Int. Conf. on Pattern Recognition
10 | # Applications and Methods, 2012, pp. 45-54.
11 | #
12 | # Version: 0.1 (September, 2012)
13 | #
14 | # Bugs: Please send any bugs to "f DOT gieseke AT uni-oldenburg.de"
15 | #
16 | #
17 | # Copyright (C) 2012 Fabian Gieseke, Antti Airola, Tapio Pahikkala, Oliver Kramer
18 | #
19 | # This program is free software: you can redistribute it and/or modify
20 | # it under the terms of the GNU General Public License as published by
21 | # the Free Software Foundation, either version 3 of the License, or
22 | # (at your option) any later version.
23 | #
24 | # This program is distributed in the hope that it will be useful,
25 | # but WITHOUT ANY WARRANTY; without even the implied warranty of
26 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
27 | # GNU General Public License for more details.
28 | #
29 | # You should have received a copy of the GNU General Public License
30 | # along with this program. If not, see <http://www.gnu.org/licenses/>.
31 | #
32 | #
33 | # INSTALLATION and DEPENDENCIES
34 | #
35 | # The module should work out of the box, given Python and Numpy (http://numpy.scipy.org/)
36 | # and Scipy (http://scipy.org/) installed correctly.
37 | #
38 | # We have tested the code on Ubuntu 12.04 (32 Bit) with Python 2.7.3, Numpy 1.6.1,
39 | # and Scipy 0.9.0. Installing these packages on Ubuntu- or Debian-based systems
40 | # can be done via "sudo apt-get install python python-numpy python-scipy".
41 | #
42 | #
43 | # RUNNING THE EXAMPLES
44 | #
45 | # For a description of the data sets, see the paper mentioned above and the references
46 | # therein. Running the command "python qns3vm.py" should yield an output similar to:
47 | #
48 | # Sparse text data set instance
49 | # Number of labeled patterns: 48
50 | # Number of unlabeled patterns: 924
51 | # Number of test patterns: 974
52 | # Time needed to compute the model: 0.775886058807 seconds
53 | # Classification error of QN-S3VM: 0.0667351129363
54 | #
55 | # Dense gaussian data set instance
56 | # Number of labeled patterns: 25
57 | # Number of unlabeled patterns: 225
58 | # Number of test patterns: 250
59 | # Time needed to compute the model: 0.464584112167 seconds
60 | # Classification error of QN-S3VM: 0.012
61 | #
62 | # Dense moons data set instance
63 | # Number of labeled patterns: 5
64 | # Number of unlabeled patterns: 495
65 | # Number of test patterns: 500
66 | # Time needed to compute the model: 0.69714307785 seconds
67 | # Classification error of QN-S3VM: 0.0
68 |
69 | ############################################################################################
70 |
71 | import array as arr
72 | import math
73 | import copy as cp
74 | import logging
75 | import numpy as np
76 | from numpy import *
77 | import operator
78 | from time import time
79 | import sys
80 | from scipy import optimize
81 | import scipy.sparse.csc as csc
82 | from scipy import sparse
83 | import scipy
84 | import warnings
85 | warnings.simplefilter('error')
86 |
87 | __author__ = 'Fabian Gieseke, Antti Airola, Tapio Pahikkala, Oliver Kramer'
88 | __version__= '0.1'
89 |
90 | class QN_S3VM:
91 | """
92 | L-BFGS optimizer for semi-supervised support vector machines (S3VM).
93 | """
94 | def __init__(self, X_l, L_l, X_u, random_generator = None, ** kw):
95 | """
96 | Initializes the model. Detects automatically if dense or sparse data is provided.
97 |
98 | Keyword arguments:
99 | X_l -- patterns of labeled part of the data
100 | L_l -- labels of labeled part of the data
101 | X_u -- patterns of unlabeled part of the data
102 | random_generator -- particular instance of a random_generator (default None)
103 | kw -- additional parameters for the optimizer
104 | lam -- regularization parameter lambda (default 1, must be a float > 0)
105 | lamU -- cost parameter that determines influence of unlabeled patterns (default 1, must be float > 0)
106 | sigma -- kernel width for RBF kernel (default 1.0, must be a float > 0)
107 | kernel_type -- "Linear" or "RBF" (default "Linear")
108 | numR -- implementation of subset of regressors. If None is provided, all patterns are used
109 | (no approximation). Must fulfill 0 <= numR <= len(X_l) + len(X_u) (default None)
110 | estimate_r -- desired ratio of positive and negative assignments for
111 | unlabeled patterns (-1.0 <= estimate_r <= 1.0). If estimate_r=None,
112 | then L_l is used to estimate this ratio (in case len(L_l) >=
113 | minimum_labeled_patterns_for_estimate_r); otherwise estimate_r = 0.0 is used
114 | (default None)
115 | minimum_labeled_patterns_for_estimate_r -- see above (default 0)
116 | BFGS_m -- BFGS parameter (default 50)
117 | BFGS_maxfun -- BFGS parameter, maximum number of function calls (default 500)
118 | BFGS_factr -- BFGS parameter (default 1E12)
119 | BFGS_pgtol -- BFGS parameter (default 1.0000000000000001e-05)
120 | """
121 | self.__model = None
122 | # Initiate model for sparse data
123 | if isinstance(X_l, csc.csc_matrix):
124 | self.__data_type = "sparse"
125 | self.__model = QN_S3VM_Sparse(X_l, L_l, X_u, random_generator, ** kw)
126 | # Initiate model for dense data
127 | elif (isinstance(X_l[0], list)) or (isinstance(X_l[0], np.ndarray)):
128 | self.__data_type = "dense"
129 | self.__model = QN_S3VM_Dense(X_l, L_l, X_u, random_generator, ** kw)
130 | # Data format unknown
131 | if self.__model == None:
132 | logging.info("Data format for patterns is unknown.")
133 | sys.exit(0)
134 |
135 | def train(self):
136 | """
137 | Training phase.
138 |
139 | Returns:
140 | The computed partition for the unlabeled patterns.
141 | """
142 | return self.__model.train()
143 |
144 | def getPredictions(self, X, real_valued=False):
145 | """
146 | Computes the predicted labels for a given set of patterns
147 |
148 | Keyword arguments:
149 | X -- The set of patterns
150 | real_valued -- If True, then the real prediction values are returned
151 |
152 | Returns:
153 | The predictions for the list X of patterns.
154 | """
155 | return self.__model.getPredictions(X, real_valued=real_valued)
156 |
157 | def predict(self, x):
158 | """
159 | Predicts a label (-1 or +1) for the pattern
160 |
161 | Keyword arguments:
162 | x -- The pattern
163 |
164 | Returns:
165 | The prediction for x.
166 | """
167 | return self.__model.predict(x)
168 |
169 | def predictValue(self, x):
170 | """
171 | Computes f(x) for a given pattern (see Representer Theorem)
172 |
173 | Keyword arguments:
174 | x -- The pattern
175 |
176 | Returns:
177 | The (real) prediction value for x.
178 | """
179 | return self.__model.predictValue(x)
180 |
181 | def getNeededFunctionCalls(self):
182 | """
183 | Returns the number of function calls needed during
184 | the optimization process.
185 | """
186 | return self.__model.getNeededFunctionCalls()
187 |
188 | def mygetPreds(self, X, real_valued=False):
189 | return self.__model.mygetPreds(X, real_valued)
190 |
191 | ############################################################################################
192 | ############################################################################################
193 | class QN_S3VM_Dense:
194 |
195 | """
196 | BFGS optimizer for semi-supervised support vector machines (S3VM).
197 |
198 | Dense Data
199 | """
200 | parameters = {
201 | 'lam': 1,
202 | 'lamU':1,
203 | 'sigma': 1,
204 | 'kernel_type': "Linear",
205 | 'numR':None,
206 | 'estimate_r':None,
207 | 'minimum_labeled_patterns_for_estimate_r':0,
208 | 'BFGS_m':50,
209 | 'BFGS_maxfun':500,
210 | 'BFGS_factr':1E12,
211 | 'BFGS_pgtol':1.0000000000000001e-05,
212 | 'BFGS_verbose':-1,
213 | 'surrogate_s':3.0,
214 | 'surrogate_gamma':20.0,
215 | 'breakpoint_for_exp':500
216 | }
217 |
218 | def __init__(self, X_l, L_l, X_u, random_generator, ** kw):
219 | """
220 | Intializes the S3VM optimizer.
221 | """
222 | self.__random_generator = random_generator
223 | self.__X_l, self.__X_u, self.__L_l = X_l, X_u, L_l
224 | assert len(X_l) == len(L_l)
225 | self.__X = cp.deepcopy(self.__X_l)
226 | self.__X.extend(cp.deepcopy(self.__X_u))
227 | self.__size_l, self.__size_u, self.__size_n = len(X_l), len(X_u), len(X_l) + len(X_u)
228 | self.__matrices_initialized = False
229 | self.__setParameters( ** kw)
230 | self.__kw = kw
231 |
232 | def train(self):
233 | """
234 | Training phase.
235 |
236 | Returns:
237 | The computed partition for the unlabeled patterns.
238 | """
239 | indi_opt = self.__optimize()
240 | self.__recomputeModel(indi_opt)
241 | predictions = self.__getTrainingPredictions(self.__X)
242 | return predictions
243 |
244 | def mygetPreds(self, X, real_valued=False):
245 | KNR = self.__kernel.computeKernelMatrix(X, self.__Xreg)
246 | KNU_bar = self.__kernel.computeKernelMatrix(X, self.__X_u_subset, symmetric=False)
247 | KNU_bar_horizontal_sum = (1.0 / len(self.__X_u_subset)) * KNU_bar.sum(axis=1)
248 | KNR = KNR - KNU_bar_horizontal_sum - self.__KU_barR_vertical_sum + self.__KU_barU_bar_sum
249 | preds = KNR * self.__c[0:self.__dim-1,:] + self.__c[self.__dim-1,:]
250 | return preds
251 |
252 | def getPredictions(self, X, real_valued=False):
253 | """
254 | Computes the predicted labels for a given set of patterns
255 |
256 | Keyword arguments:
257 | X -- The set of patterns
258 | real_valued -- If True, then the real prediction values are returned
259 |
260 | Returns:
261 | The predictions for the list X of patterns.
262 | """
263 | KNR = self.__kernel.computeKernelMatrix(X, self.__Xreg)
264 | KNU_bar = self.__kernel.computeKernelMatrix(X, self.__X_u_subset, symmetric=False)
265 | KNU_bar_horizontal_sum = (1.0 / len(self.__X_u_subset)) * KNU_bar.sum(axis=1)
266 | KNR = KNR - KNU_bar_horizontal_sum - self.__KU_barR_vertical_sum + self.__KU_barU_bar_sum
267 | preds = KNR * self.__c[0:self.__dim-1,:] + self.__c[self.__dim-1,:]
268 | if real_valued == True:
269 | return preds.flatten(1).tolist()[0]
270 | else:
271 | return np.sign(np.sign(preds)+0.1).flatten(1).tolist()[0]
272 |
273 | def predict(self, x):
274 | """
275 | Predicts a label for the pattern
276 |
277 | Keyword arguments:
278 | x -- The pattern
279 |
280 | Returns:
281 | The prediction for x.
282 | """
283 | return self.getPredictions([x], real_valued=False)[0]
284 |
285 | def predictValue(self, x):
286 | """
287 | Computes f(x) for a given pattern (see Representer Theorem)
288 |
289 | Keyword arguments:
290 | x -- The pattern
291 |
292 | Returns:
293 | The (real) prediction value for x.
294 | """
295 | return self.getPredictions([x], real_valued=True)[0]
296 |
297 | def getNeededFunctionCalls(self):
298 | """
299 | Returns the number of function calls needed during
300 | the optimization process.
301 | """
302 | return self.__needed_function_calls
303 |
304 | def __setParameters(self, ** kw):
305 | for attr, val in kw.items():
306 | self.parameters[attr] = val
307 | self.__lam = float(self.parameters['lam'])
308 | assert self.__lam > 0
309 | self.__lamU = float(self.parameters['lamU'])
310 | assert self.__lamU > 0
311 | self.__lam_Uvec = [float(self.__lamU)*i for i in [0,0.000001,0.0001,0.01,0.1,0.5,1]]
312 | self.__sigma = float(self.parameters['sigma'])
313 | assert self.__sigma > 0
314 | self.__kernel_type = str(self.parameters['kernel_type'])
315 | if self.parameters['numR'] != None:
316 | self.__numR = int(self.parameters['numR'])
317 | assert (self.__numR <= len(self.__X)) and (self.__numR > 0)
318 | else:
319 | self.__numR = len(self.__X)
320 | self.__regressors_indices = sorted(self.__random_generator.sample( range(0,len(self.__X)), self.__numR ))
321 | self.__dim = self.__numR + 1 # add bias term b
322 | self.__minimum_labeled_patterns_for_estimate_r = float(self.parameters['minimum_labeled_patterns_for_estimate_r'])
323 | # If reliable estimate is available or can be estimated, use it, otherwise
324 | # assume classes to be balanced (i.e., estimate_r=0.0)
325 | if self.parameters['estimate_r'] != None:
326 | self.__estimate_r = float(self.parameters['estimate_r'])
327 | elif len(self.__L_l) >= self.__minimum_labeled_patterns_for_estimate_r:
328 | self.__estimate_r = (1.0 / len(self.__L_l)) * np.sum(self.__L_l)
329 | else:
330 | self.__estimate_r = 0.0
331 | self.__BFGS_m = int(self.parameters['BFGS_m'])
332 | self.__BFGS_maxfun = int(self.parameters['BFGS_maxfun'])
333 | self.__BFGS_factr = float(self.parameters['BFGS_factr'])
334 | # This is a hack for 64 bit systems (Linux). The machine precision
335 | # is different for the BFGS optimizer (Fortran code) and we fix this by:
336 | is_64bits = sys.maxsize > 2**32
337 | if is_64bits:
338 | logging.debug("64-bit system detected, modifying BFGS_factr!")
339 | self.__BFGS_factr = 0.000488288*self.__BFGS_factr
340 | self.__BFGS_pgtol = float(self.parameters['BFGS_pgtol'])
341 | self.__BFGS_verbose = int(self.parameters['BFGS_verbose'])
342 | self.__surrogate_gamma = float(self.parameters['surrogate_gamma'])
343 | self.__s = float(self.parameters['surrogate_s'])
344 | self.__breakpoint_for_exp = float(self.parameters['breakpoint_for_exp'])
345 | self.__b = self.__estimate_r
346 | # size of unlabeled patterns to estimate mean (used for balancing constraint)
347 | self.__max_unlabeled_subset_size = 1000
348 |
349 |
350 | def __optimize(self):
351 | logging.debug("Starting optimization with BFGS ...")
352 | self.__needed_function_calls = 0
353 | self.__initializeMatrices()
354 | # starting point
355 | c_current = zeros(self.__dim, float64)
356 | c_current[self.__dim-1] = self.__b
357 | # Annealing sequence.
358 | for i in range(len(self.__lam_Uvec)):
359 | self.__lamU = self.__lam_Uvec[i]
360 | # crop one dimension (in case the offset b is fixed)
361 | c_current = c_current[:self.__dim-1]
362 | c_current = self.__localSearch(c_current)
363 | # reappend it if needed
364 | c_current = np.append(c_current, self.__b)
365 | f_opt = self.__getFitness(c_current)
366 | return c_current, f_opt
367 |
368 | def __localSearch(self, start):
369 | c_opt, f_opt, d = optimize.fmin_l_bfgs_b(self.__getFitness, start, m=self.__BFGS_m, \
370 | fprime=self.__getFitness_Prime, maxfun=self.__BFGS_maxfun, factr=self.__BFGS_factr,\
371 | pgtol=self.__BFGS_pgtol, iprint=self.__BFGS_verbose)
372 | self.__needed_function_calls += int(d['funcalls'])
373 | return c_opt
374 |
375 | def __initializeMatrices(self):
376 | if self.__matrices_initialized == False:
377 | logging.debug("Initializing matrices...")
378 | # Initialize labels
379 | x = arr.array('i')
380 | for l in self.__L_l:
381 | x.append(l)
382 | self.__YL = mat(x, dtype=np.float64)
383 | self.__YL = self.__YL.transpose()
384 | # Initialize kernel matrices
385 | if (self.__kernel_type == "Linear"):
386 | self.__kernel = LinearKernel()
387 | elif (self.__kernel_type == "RBF"):
388 | self.__kernel = RBFKernel(self.__sigma)
389 | self.__Xreg = (mat(self.__X)[self.__regressors_indices,:].tolist())
390 | self.__KLR = self.__kernel.computeKernelMatrix(self.__X_l,self.__Xreg, symmetric=False)
391 | self.__KUR = self.__kernel.computeKernelMatrix(self.__X_u,self.__Xreg, symmetric=False)
392 | self.__KNR = cp.deepcopy(bmat([[self.__KLR], [self.__KUR]]))
393 | self.__KRR = self.__KNR[self.__regressors_indices,:]
394 | # Center patterns in feature space (with respect to approximated mean of unlabeled patterns in the feature space)
395 | subset_unlabled_indices = sorted(self.__random_generator.sample( range(0,len(self.__X_u)), min(self.__max_unlabeled_subset_size, len(self.__X_u)) ))
396 | self.__X_u_subset = (mat(self.__X_u)[subset_unlabled_indices,:].tolist())
397 | self.__KNU_bar = self.__kernel.computeKernelMatrix(self.__X, self.__X_u_subset, symmetric=False)
398 | self.__KNU_bar_horizontal_sum = (1.0 / len(self.__X_u_subset)) * self.__KNU_bar.sum(axis=1)
399 | self.__KU_barR = self.__kernel.computeKernelMatrix(self.__X_u_subset, self.__Xreg, symmetric=False)
400 | self.__KU_barR_vertical_sum = (1.0 / len(self.__X_u_subset)) * self.__KU_barR.sum(axis=0)
401 | self.__KU_barU_bar = self.__kernel.computeKernelMatrix(self.__X_u_subset, self.__X_u_subset, symmetric=False)
402 | self.__KU_barU_bar_sum = (1.0 / (len(self.__X_u_subset)))**2 * self.__KU_barU_bar.sum()
403 | self.__KNR = self.__KNR - self.__KNU_bar_horizontal_sum - self.__KU_barR_vertical_sum + self.__KU_barU_bar_sum
404 | self.__KRR = self.__KNR[self.__regressors_indices,:]
405 | self.__KLR = self.__KNR[range(0,len(self.__X_l)),:]
406 | self.__KUR = self.__KNR[range(len(self.__X_l),len(self.__X)),:]
407 | self.__matrices_initialized = True
408 |
409 | def __getFitness(self,c):
410 | # Check whether the function is called from the bfgs solver
411 | # (that does not optimize the offset b) or not
412 | if len(c) == self.__dim - 1:
413 | c = np.append(c, self.__b)
414 | c = mat(c)
415 | b = c[:,self.__dim-1].T
416 | c_new = c[:,0:self.__dim-1].T
417 | preds_labeled = self.__surrogate_gamma*(1.0 - multiply(self.__YL, self.__KLR * c_new + b))
418 | preds_unlabeled = self.__KUR * c_new + b
419 | # This vector has a "one" for each "numerically instable" entry; "zeros" for "good ones".
420 | preds_labeled_conflict_indicator = np.sign(np.sign(preds_labeled/self.__breakpoint_for_exp - 1.0) + 1.0)
421 | # This vector has a one for each good entry and zero otherwise
422 | preds_labeled_good_indicator = (-1)*(preds_labeled_conflict_indicator - 1.0)
423 | preds_labeled_for_conflicts = multiply(preds_labeled_conflict_indicator,preds_labeled)
424 | preds_labeled = multiply(preds_labeled,preds_labeled_good_indicator)
425 | # Compute values for good entries
426 | preds_labeled_log_exp = np.log(1.0 + np.exp(preds_labeled))
427 | # Compute values for instable entries
428 | preds_labeled_log_exp = multiply(preds_labeled_good_indicator, preds_labeled_log_exp)
429 | # Replace critical values with values
430 | preds_labeled_final = preds_labeled_log_exp + preds_labeled_for_conflicts
431 | term1 = (1.0/(self.__surrogate_gamma*self.__size_l)) * np.sum(preds_labeled_final)
432 | preds_unlabeled_squared = multiply(preds_unlabeled,preds_unlabeled)
433 | term2 = (float(self.__lamU)/float(self.__size_u))*np.sum(np.exp(-self.__s * preds_unlabeled_squared))
434 | term3 = self.__lam * (c_new.T * self.__KRR * c_new)
435 | return (term1 + term2 + term3)[0,0]
436 |
437 | def __getFitness_Prime(self,c):
438 | # Check whether the function is called from the bfgs solver
439 | # (that does not optimize the offset b) or not
440 | if len(c) == self.__dim - 1:
441 | c = np.append(c, self.__b)
442 | c = mat(c)
443 | b = c[:,self.__dim-1].T
444 | c_new = c[:,0:self.__dim-1].T
445 | preds_labeled = self.__surrogate_gamma * (1.0 - multiply(self.__YL, self.__KLR * c_new + b))
446 | preds_unlabeled = (self.__KUR * c_new + b)
447 | # This vector has a "one" for each "numerically instable" entry; "zeros" for "good ones".
448 | preds_labeled_conflict_indicator = np.sign(np.sign(preds_labeled/self.__breakpoint_for_exp - 1.0) + 1.0)
449 | # This vector has a one for each good entry and zero otherwise
450 | preds_labeled_good_indicator = (-1)*(preds_labeled_conflict_indicator - 1.0)
451 | preds_labeled = multiply(preds_labeled,preds_labeled_good_indicator)
452 | preds_labeled_exp = np.exp(preds_labeled)
453 | term1 = multiply(preds_labeled_exp, 1.0/(1.0 + preds_labeled_exp))
454 | term1 = multiply(preds_labeled_good_indicator, term1)
455 | # Replace critical values with "1.0"
456 | term1 = term1 + preds_labeled_conflict_indicator
457 | term1 = multiply(self.__YL, term1)
458 | preds_unlabeled_squared_exp_f = multiply(preds_unlabeled,preds_unlabeled)
459 | preds_unlabeled_squared_exp_f = np.exp(-self.__s * preds_unlabeled_squared_exp_f)
460 | preds_unlabeled_squared_exp_f = multiply(preds_unlabeled_squared_exp_f, preds_unlabeled)
461 | term1 = (-1.0/self.__size_l) * (term1.T * self.__KLR).T
462 | term2 = ((-2.0 * self.__s * self.__lamU)/float(self.__size_u)) * (preds_unlabeled_squared_exp_f.T * self.__KUR).T
463 | term3 = 2*self.__lam*(self.__KRR * c_new)
464 | return array((term1 + term2 + term3).T)[0]
465 |
466 | def __recomputeModel(self, indi):
467 | self.__c = mat(indi[0]).T
468 |
469 | def __getTrainingPredictions(self, X, real_valued=False):
470 | preds = self.__KNR * self.__c[0:self.__dim-1,:] + self.__c[self.__dim-1,:]
471 | if real_valued == True:
472 | return preds.flatten(1).tolist()[0]
473 | else:
474 | return np.sign(np.sign(preds)+0.1).flatten(1).tolist()[0]
475 |
476 | def __check_matrix(self, M):
477 | smallesteval = scipy.linalg.eigvalsh(M, eigvals=(0,0))[0]
478 | if smallesteval < 0.0:
479 | shift = abs(smallesteval) + 0.0000001
480 | M = M + shift * np.eye(M.shape[0])  # shift the diagonal so that M becomes positive definite
481 | return M
482 |
483 | ############################################################################################
484 | ############################################################################################
485 | class QN_S3VM_Sparse:
486 | """
487 | BFGS optimizer for semi-supervised support vector machines (S3VM).
488 |
489 | Sparse Data
490 | """
491 | parameters = {
492 | 'lam': 1,
493 | 'lamU':1,
494 | 'estimate_r':None,
495 | 'minimum_labeled_patterns_for_estimate_r':0,
496 | 'BFGS_m':50,
497 | 'BFGS_maxfun':500,
498 | 'BFGS_factr':1E12,
499 | 'BFGS_pgtol':1e-05,
500 | 'BFGS_verbose':-1,
501 | 'surrogate_s':3.0,
502 | 'surrogate_gamma':20.0,
503 | 'breakpoint_for_exp':500
504 | }
505 |
506 |
507 | def __init__(self, X_l, L_l, X_u, random_generator, ** kw):
508 | """
509 | Initializes the S3VM optimizer.
510 | """
511 | self.__random_generator = random_generator
512 | # This is a nuisance, but we may need to pad extra dimensions to either X_l or X_u
513 | # in case the highest feature indices appear only in one of the two data matrices
514 | if X_l.shape[1] > X_u.shape[1]:
515 | X_u = sparse.hstack([X_u, sparse.coo_matrix((X_u.shape[0], X_l.shape[1] - X_u.shape[1]))])  # shape must be passed as a tuple
516 | elif X_l.shape[1] < X_u.shape[1]:
517 | X_l = sparse.hstack([X_l, sparse.coo_matrix((X_l.shape[0], X_u.shape[1] - X_l.shape[1]))])  # pad X_l up to the width of X_u
518 | # We vertically stack the data matrices into one big matrix
519 | X = sparse.vstack([X_l, X_u])
520 | self.__size_l, self.__size_u, self.__size_n = X_l.shape[0], X_u.shape[0], X_l.shape[0]+ X_u.shape[0]
521 | x = arr.array('i')
522 | for l in L_l:
523 | x.append(int(l))
524 | self.__YL = mat(x, dtype=np.float64)
525 | self.__YL = self.__YL.transpose()
526 | self.__setParameters( ** kw)
527 | self.__kw = kw
528 | self.X_l = X_l.tocsr()
529 | self.X_u = X_u.tocsr()
530 | self.X = X.tocsr()
531 | # compute mean of unlabeled patterns
532 | self.__mean_u = self.X_u.mean(axis=0)
533 | self.X_u_T = X_u.tocsc().T
534 | self.X_l_T = X_l.tocsc().T
535 | self.X_T = X.tocsc().T
536 |
537 | def train(self):
538 | """
539 | Training phase.
540 |
541 | Returns:
542 | The predicted labels for all training patterns (the stacked labeled and unlabeled data).
543 | """
544 | indi_opt = self.__optimize()
545 | self.__recomputeModel(indi_opt)
546 | predictions = self.getPredictions(self.X)
547 | return predictions
548 |
549 | def getPredictions(self, X, real_valued=False):
550 | """
551 | Computes the predicted labels for a given set of patterns
552 |
553 | Keyword arguments:
554 | X -- The set of patterns
555 | real_valued -- If True, then the real prediction values are returned
556 |
557 | Returns:
558 | The predictions for the list X of patterns.
559 | """
560 | c_new = self.__c[:self.__dim-1]
561 | W = self.X.T*c_new - self.__mean_u.T*np.sum(c_new)
562 | # Again, possibility of dimension mismatch due to use of sparse matrices
563 | if X.shape[1] > W.shape[0]:
564 | X = X[:,range(W.shape[0])]
565 | if X.shape[1] < W.shape[0]:
566 | W = W[range(X.shape[1])]
567 | X = X.tocsc()
568 | preds = X * W + self.__b
569 | if real_valued == True:
570 | return preds.flatten(1).tolist()[0]
571 | else:
572 | return np.sign(np.sign(preds)+0.1).flatten(1).tolist()[0]
573 |
574 | def predict(self, x):
575 | """
576 | Predicts a label for the pattern
577 |
578 | Keyword arguments:
579 | x -- The pattern
580 |
581 | Returns:
582 | The prediction for x.
583 | """
584 | return self.getPredictions([x], real_valued=False)[0]
585 |
586 | def predictValue(self, x):
587 | """
588 | Computes f(x) for a given pattern (see Representer Theorem)
589 |
590 | Keyword arguments:
591 | x -- The pattern
592 |
593 | Returns:
594 | The (real) prediction value for x.
595 | """
596 | return self.getPredictions([x], real_valued=True)[0]
597 |
598 | def getNeededFunctionCalls(self):
599 | """
600 | Returns the number of function calls needed during
601 | the optimization process.
602 | """
603 | return self.__needed_function_calls
604 |
605 | def __setParameters(self, ** kw):
606 | for attr, val in kw.items():
607 | self.parameters[attr] = val
608 | self.__lam = float(self.parameters['lam'])
609 | assert self.__lam > 0
610 | self.__lamU = float(self.parameters['lamU'])
611 | assert self.__lamU > 0
612 | self.__lam_Uvec = [float(self.__lamU)*i for i in [0,0.000001,0.0001,0.01,0.1,0.5,1]]
613 | self.__minimum_labeled_patterns_for_estimate_r = float(self.parameters['minimum_labeled_patterns_for_estimate_r'])
614 | # If reliable estimate is available or can be estimated, use it, otherwise
615 | # assume classes to be balanced (i.e., estimate_r=0.0)
616 | if self.parameters['estimate_r'] != None:
617 | self.__estimate_r = float(self.parameters['estimate_r'])
618 | elif self.__YL.shape[0] > self.__minimum_labeled_patterns_for_estimate_r:
619 | self.__estimate_r = (1.0 / self.__YL.shape[0]) * np.sum(self.__YL[0:])
620 | else:
621 | self.__estimate_r = 0.0
622 | self.__dim = self.__size_n + 1 # for offset term b
623 | self.__BFGS_m = int(self.parameters['BFGS_m'])
624 | self.__BFGS_maxfun = int(self.parameters['BFGS_maxfun'])
625 | self.__BFGS_factr = float(self.parameters['BFGS_factr'])
626 | # This is a hack for 64 bit systems (Linux). The machine precision
627 | # is different for the BFGS optimizer (Fortran code) and we fix this by:
628 | is_64bits = sys.maxsize > 2**32
629 | if is_64bits:
630 | logging.debug("64-bit system detected, modifying BFGS_factr!")
631 | self.__BFGS_factr = 0.000488288*self.__BFGS_factr
632 | self.__BFGS_pgtol = float(self.parameters['BFGS_pgtol'])
633 | self.__BFGS_verbose = int(self.parameters['BFGS_verbose'])
634 | self.__surrogate_gamma = float(self.parameters['surrogate_gamma'])
635 | self.__s = float(self.parameters['surrogate_s'])
636 | self.__breakpoint_for_exp = float(self.parameters['breakpoint_for_exp'])
637 | self.__b = self.__estimate_r
638 |
639 | def __optimize(self):
640 | logging.debug("Starting optimization with BFGS ...")
641 | self.__needed_function_calls = 0
642 | # starting_point
643 | c_current = zeros(self.__dim, float64)
644 | c_current[self.__dim-1] = self.__b
645 | # Annealing sequence.
646 | for i in range(len(self.__lam_Uvec)):
647 | self.__lamU = self.__lam_Uvec[i]
648 | # crop one dimension (in case the offset b is fixed)
649 | c_current = c_current[:self.__dim-1]
650 | c_current = self.__localSearch(c_current)
651 | # reappend it if needed
652 | c_current = np.append(c_current, self.__b)
653 | f_opt = self.__getFitness(c_current)
654 | return c_current, f_opt
655 |
656 | def __localSearch(self, start):
657 | c_opt, f_opt, d = optimize.fmin_l_bfgs_b(self.__getFitness, start, m=self.__BFGS_m, \
658 | fprime=self.__getFitness_Prime, maxfun=self.__BFGS_maxfun,\
659 | factr=self.__BFGS_factr, pgtol=self.__BFGS_pgtol, iprint=self.__BFGS_verbose)
660 | self.__needed_function_calls += int(d['funcalls'])
661 | return c_opt
662 |
663 | def __getFitness(self,c):
664 | # check whether the function is called from the bfgs solver
665 | # (that does not optimize the offset b) or not
666 | if len(c) == self.__dim - 1:
667 | c = np.append(c, self.__b)
668 | c = mat(c)
669 | b = c[:,self.__dim-1].T
670 | c_new = c[:,0:self.__dim-1].T
671 | c_new_sum = np.sum(c_new)
672 | XTc = self.X_T*c_new - self.__mean_u.T*c_new_sum
673 | preds_labeled = self.__surrogate_gamma*(1.0 - multiply(self.__YL, (self.X_l*XTc - self.__mean_u*XTc) + b[0,0]))
674 | preds_unlabeled = (self.X_u*XTc - self.__mean_u*XTc) + b[0,0]
676 | # This vector has a "one" for each numerically unstable entry; "zeros" for good ones.
676 | preds_labeled_conflict_indicator = np.sign(np.sign(preds_labeled/self.__breakpoint_for_exp - 1.0) + 1.0)
677 | # This vector has a one for each good entry and zero otherwise
678 | preds_labeled_good_indicator = (-1)*(preds_labeled_conflict_indicator - 1.0)
679 | preds_labeled_for_conflicts = multiply(preds_labeled_conflict_indicator,preds_labeled)
680 | preds_labeled = multiply(preds_labeled,preds_labeled_good_indicator)
681 | # Compute values for good entries
682 | preds_labeled_log_exp = np.log(1.0 + np.exp(preds_labeled))
683 | # Zero out the entries at the numerically unstable positions
684 | preds_labeled_log_exp = multiply(preds_labeled_good_indicator, preds_labeled_log_exp)
685 | # Replace the critical values by their linear approximation (log(1 + exp(x)) ~ x for large x)
686 | preds_labeled_final = preds_labeled_log_exp + preds_labeled_for_conflicts
687 | term1 = (1.0/(self.__surrogate_gamma*self.__size_l)) * np.sum(preds_labeled_final)
688 | preds_unlabeled_squared = multiply(preds_unlabeled,preds_unlabeled)
689 | term2 = (float(self.__lamU)/float(self.__size_u))*np.sum(np.exp(-self.__s * preds_unlabeled_squared))
690 | term3 = self.__lam * c_new.T * (self.X * XTc - self.__mean_u*XTc)
691 | return (term1 + term2 + term3)[0,0]
692 |
693 | def __getFitness_Prime(self,c):
694 | # check whether the function is called from the bfgs solver
695 | # (that does not optimize the offset b) or not
696 | if len(c) == self.__dim - 1:
697 | c = np.append(c, self.__b)
698 | c = mat(c)
699 | b = c[:,self.__dim-1].T
700 | c_new = c[:,0:self.__dim-1].T
701 | c_new_sum = np.sum(c_new)
702 | XTc = self.X_T*c_new - self.__mean_u.T*c_new_sum
703 | preds_labeled = self.__surrogate_gamma*(1.0 - multiply(self.__YL, (self.X_l*XTc -self.__mean_u*XTc) + b[0,0]))
704 | preds_unlabeled = (self.X_u*XTc - self.__mean_u*XTc )+ b[0,0]
705 | preds_labeled_conflict_indicator = np.sign(np.sign(preds_labeled/self.__breakpoint_for_exp - 1.0) + 1.0)
706 | # This vector has a one for each good entry and zero otherwise
707 | preds_labeled_good_indicator = (-1)*(preds_labeled_conflict_indicator - 1.0)
708 | preds_labeled = multiply(preds_labeled,preds_labeled_good_indicator)
709 | preds_labeled_exp = np.exp(preds_labeled)
710 | term1 = multiply(preds_labeled_exp, 1.0/(1.0 + preds_labeled_exp))
711 | term1 = multiply(preds_labeled_good_indicator, term1)
712 | # Replace critical values with "1.0"
713 | term1 = term1 + preds_labeled_conflict_indicator
714 | term1 = multiply(self.__YL, term1)
715 | preds_unlabeled_squared_exp_f = multiply(preds_unlabeled,preds_unlabeled)
716 | preds_unlabeled_squared_exp_f = np.exp(-self.__s * preds_unlabeled_squared_exp_f)
717 | preds_unlabeled_squared_exp_f = multiply(preds_unlabeled_squared_exp_f, preds_unlabeled)
718 | term1_sum = np.sum(term1)
719 | tmp = self.X_l_T * term1 - self.__mean_u.T*term1_sum
720 | term1 = (-1.0/self.__size_l) * (self.X * tmp - self.__mean_u*tmp)
721 | preds_unlabeled_squared_exp_f_sum = np.sum(preds_unlabeled_squared_exp_f)
722 | tmp_unlabeled = self.X_u_T * preds_unlabeled_squared_exp_f - self.__mean_u.T * preds_unlabeled_squared_exp_f_sum
723 | term2 = ((-2.0 * self.__s * self.__lamU)/float(self.__size_u)) * (self.X * tmp_unlabeled - self.__mean_u*tmp_unlabeled)
724 | XTc_sum = np.sum(XTc)
725 | term3 = 2*self.__lam*(self.X * XTc - self.__mean_u*XTc)
726 | return array((term1 + term2 + term3).T)[0]
727 |
728 | def __recomputeModel(self, indi):
729 | self.__c = mat(indi[0]).T
730 |
731 | ############################################################################################
732 | ############################################################################################
733 | class LinearKernel():
734 | """
735 | Linear Kernel
736 | """
737 | def __init__(self):
738 | pass
739 |
740 | def computeKernelMatrix(self, data1, data2, symmetric=False):
741 | """
742 | Computes the kernel matrix
743 | """
744 | logging.debug("Starting Linear Kernel Matrix Computation...")
745 | self._data1 = mat(data1)
746 | self._data2 = mat(data2)
747 | assert self._data1.shape[1] == (self._data2.T).shape[0]
748 | try:
749 | return self._data1 * self._data2.T
750 | except Exception as e:
751 | logging.error("Error while computing kernel matrix: " + str(e))
752 | import traceback
753 | traceback.print_exc()
754 | sys.exit()
755 | logging.debug("Kernel Matrix computed...")
756 |
757 | def getKernelValue(self, xi, xj):
758 | """
759 | Returns a single kernel value.
760 | """
761 | xi = array(xi)
762 | xj = array(xj)
763 | val = dot(xi, xj)
764 | return val
765 |
766 |
767 | class DictLinearKernel():
768 | """
769 | Linear Kernel (for dictionaries)
770 | """
771 | def __init__(self):
772 | pass
773 |
774 | def computeKernelMatrix(self, data1, data2, symmetric=False):
775 | """
776 | Computes the kernel matrix
777 | """
778 | logging.debug("Starting Linear Kernel Matrix Computation...")
779 | self._data1 = data1
780 | self._data2 = data2
781 | self._dim1 = len(data1)
782 | self._dim2 = len(data2)
783 | self._symmetric = symmetric
784 | self.__km = None
785 | try:
786 | km = mat(zeros((self._dim1, self._dim2), dtype=float64))
787 | if self._symmetric:
788 | for i in range(self._dim1):
789 | message = 'Kernel Matrix Progress: %dx%d/%dx%d' % (i, self._dim2,self._dim1,self._dim2)
790 | logging.debug(message)
791 | for j in range(i, self._dim2):
792 | val = self.getKernelValue(self._data1[i], self._data2[j])
793 | km[i, j] = val
794 | km[j, i] = val
795 | return km
796 | else:
797 | for i in range(self._dim1):
798 | message = 'Kernel Matrix Progress: %dx%d/%dx%d' % (i, self._dim2,self._dim1,self._dim2)
799 | logging.debug(message)
800 | for j in range(0, self._dim2):
801 | val = self.getKernelValue(self._data1[i], self._data2[j])
802 | km[i, j] = val
803 | return km
804 |
805 | except Exception as e:
806 | logging.error("Error while computing kernel matrix: " + str(e))
807 | sys.exit()
808 | logging.debug("Kernel Matrix computed...")
809 |
810 | def getKernelValue(self, xi, xj):
811 | """
812 | Returns a single kernel value.
813 | """
814 | val = 0.
815 | for key in xi:
816 | if key in xj:
817 | val += xi[key]*xj[key]
818 | return val
819 |
820 | class RBFKernel():
821 | """
822 | RBF Kernel
823 | """
824 | def __init__(self, sigma):
825 | self.__sigma = sigma
826 | self.__sigma_squared_inv = 1.0 / (2* (self.__sigma ** 2) )
827 |
828 | def computeKernelMatrix(self, data1, data2, symmetric=False):
829 | """
830 | Computes the kernel matrix
831 | """
832 | logging.debug("Starting RBF Kernel Matrix Computation...")
833 | self._data1 = mat(data1)
834 | self._data2 = mat(data2)
835 | assert self._data1.shape[1] == (self._data2.T).shape[0]
836 | self._dim1 = len(data1)
837 | self._dim2 = len(data2)
838 | self._symmetric = symmetric
839 | self.__km = None
840 | try:
841 | if self._symmetric:
842 | linearkm = self._data1 * self._data2.T
843 | trnorms = mat(np.diag(linearkm)).T
844 | trace_matrix = trnorms * mat(np.ones((1, self._dim1), dtype = float64))
845 | self.__km = trace_matrix + trace_matrix.T
846 | self.__km = self.__km - 2*linearkm
847 | self.__km = - self.__sigma_squared_inv * self.__km
848 | self.__km = np.exp(self.__km)
849 | return self.__km
850 | else:
851 | m = self._data1.shape[0]
852 | n = self._data2.shape[0]
853 | assert self._data1.shape[1] == self._data2.shape[1]
854 | linkm = mat(self._data1 * self._data2.T)
855 | trnorms1 = []
856 | for i in range(m):
857 | trnorms1.append((self._data1[i] * self._data1[i].T)[0,0])
858 | trnorms1 = mat(trnorms1).T
859 | trnorms2 = []
860 | for i in range(n):
861 | trnorms2.append((self._data2[i] * self._data2[i].T)[0,0])
862 | trnorms2 = mat(trnorms2).T
863 | self.__km = trnorms1 * mat(np.ones((n, 1), dtype = float64)).T
864 | self.__km = self.__km + mat(np.ones((m, 1), dtype = float64)) * trnorms2.T
865 | self.__km = self.__km - 2 * linkm
866 | self.__km = - self.__sigma_squared_inv * self.__km
867 | self.__km = np.exp(self.__km)
868 | return self.__km
869 | except Exception as e:
870 | logging.error("Error while computing kernel matrix: " + str(e))
871 | sys.exit()
872 |
873 | def getKernelValue(self, xi, xj):
874 | """
875 | Returns a single kernel value.
876 | """
877 | xi = array(xi)
878 | xj = array(xj)
879 | diff = xi-xj
880 | val = exp(-self.__sigma_squared_inv * (dot(diff, diff)))
881 | return val
882 |
883 | class DictRBFKernel():
884 | """
885 | RBF Kernel (for dictionaries)
886 | """
887 | def __init__(self, sigma):
888 | self.__sigma = sigma
889 | self.__sigma_squared_inv = 1.0 / ((self.__sigma ** 2))
890 |
891 | def computeKernelMatrix(self, data1, data2, symmetric=False):
892 | """
893 | Computes the kernel matrix
894 | """
895 | logging.debug("Starting RBF Kernel Matrix Computation...")
896 | self._data1 = data1
897 | self._data2 = data2
898 | self._dim1 = len(data1)
899 | self._dim2 = len(data2)
900 | self._symmetric = symmetric
901 | self.__km = None
902 | try:
903 | km = mat(zeros((self._dim1, self._dim2), dtype=float64))
904 | if self._symmetric:
905 | for i in range(self._dim1):
906 | message = 'Kernel Matrix Progress: %dx%d/%dx%d' % (i, self._dim2,self._dim1,self._dim2)
907 | logging.debug(message)
908 | for j in range(i, self._dim2):
909 | val = self.getKernelValue(self._data1[i], self._data2[j])
910 | km[i, j] = val
911 | km[j, i] = val
912 | return km
913 | else:
914 | for i in range(0, self._dim1):
915 | message = 'Kernel Matrix Progress: %dx%d/%dx%d' % (i, self._dim2,self._dim1,self._dim2)
916 | logging.debug(message)
917 | for j in range(0, self._dim2):
918 | val = self.getKernelValue(self._data1[i], self._data2[j])
919 | km[i, j] = val
920 | return km
921 | except Exception as e:
922 | logging.error("Error while computing kernel matrix: " + str(e))
923 | sys.exit()
924 | logging.info("Kernel Matrix computed...")
925 |
926 | def getKernelValue(self, xi, xj):
927 | """
928 | Returns a single kernel value.
929 | """
930 | diff = xi.copy()
931 | for key in xj:
932 | if key in diff:
933 | diff[key]-=xj[key]
934 | else:
935 | diff[key]=-xj[key]
936 | diff = diff.values()
937 | val = exp(-self.__sigma_squared_inv * (dot(diff, diff)))
938 | return val
939 |
--------------------------------------------------------------------------------
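The QN_S3VM_Sparse optimizer defined above can be driven directly when the data already lives in scipy.sparse matrices: construct it with the labeled patterns, their -1/+1 labels, the unlabeled patterns and a random generator, then call train(). The following is a minimal sketch on synthetic data, not part of the repository; it assumes methods/qns3vm.py is importable as qns3vm and uses only the constructor arguments and methods shown above.

# Hypothetical usage sketch for QN_S3VM_Sparse -- not part of the repository.
import random
import numpy as np
from scipy import sparse
from qns3vm import QN_S3VM_Sparse   # assumes the methods/ directory is on the Python path

rng = np.random.RandomState(0)
# two labeled blobs plus a pool of unlabeled points
X_l = sparse.csr_matrix(np.vstack([rng.randn(5, 3) - 2.0, rng.randn(5, 3) + 2.0]))
L_l = [-1] * 5 + [1] * 5                       # labels must be -1 / +1
X_u = sparse.csr_matrix(rng.randn(20, 3))      # unlabeled patterns

model = QN_S3VM_Sparse(X_l, L_l, X_u, random.Random(0), lam=0.001, lamU=1.0)
preds = model.train()                          # annealed L-BFGS-B optimization
print(preds[:10])                              # -1/+1 predictions for the labeled block

train() runs the L-BFGS-B search once per entry of the lamU annealing schedule set up in __setParameters and returns the predicted labels for the vertically stacked labeled-plus-unlabeled matrix.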
/methods/scikitTSVM.py:
--------------------------------------------------------------------------------
1 | from sklearn.base import BaseEstimator
2 | import sklearn.metrics
3 | import random as rnd
4 | import numpy
5 | from sklearn.linear_model import LogisticRegression as LR
6 | from qns3vm import QN_S3VM
7 |
8 | class SKTSVM(BaseEstimator):
9 | """
10 | Scikit-learn wrapper for transductive SVM (SKTSVM)
11 |
12 | Wraps QN-S3VM by Fabian Gieseke, Antti Airola, Tapio Pahikkala, Oliver Kramer (see http://www.fabiangieseke.de/index.php/code/qns3vm)
13 | as a scikit-learn BaseEstimator, and provides probability estimates using Platt scaling
14 |
15 | Parameters
16 | ----------
17 | C : float, optional (default=1e-4)
18 | Penalty parameter C of the error term.
19 |
20 | kernel : string, optional (default='rbf')
21 | Specifies the kernel type to be used in the algorithm.
22 | It must be 'linear' or 'rbf'
23 |
24 | gamma : float, optional (default=0.5)
25 | Kernel coefficient for 'rbf'
26 |
27 | lamU: float, optional (default=1.0)
28 | cost parameter that determines influence of unlabeled patterns
29 | must be float >0
30 |
31 | probability : boolean, optional (default=True)
32 | Whether to enable probability estimates. This must be enabled prior
33 | to calling `fit`, and will slow down that method.
34 | """
35 |
36 | # lamU -- cost parameter that determines influence of unlabeled patterns (default 1, must be float > 0)
37 | def __init__(self, kernel = 'RBF', C = 1e-4, gamma = 0.5, lamU = 1.0, probability=True):
38 | self.random_generator = rnd.Random()
39 | self.kernel = kernel
40 | self.C = C
41 | self.gamma = gamma
42 | self.lamU = lamU
43 | self.probability = probability
44 |
45 | def fit(self, X, y): # -1 for unlabeled
46 | """Fit the model according to the given training data.
47 |
48 | Parameters
49 | ----------
50 | X : array-like, shape = [n_samples, n_features]
51 | Training vector, where n_samples is the number of samples and
52 | n_features is the number of features.
53 |
54 | y : array-like, shape = [n_samples]
55 | Target vector relative to X
56 | Must be 0 or 1 for labeled and -1 for unlabeled instances
57 |
58 | Returns
59 | -------
60 | self : object
61 | Returns self.
62 | """
63 |
64 | # http://www.fabiangieseke.de/index.php/code/qns3vm
65 |
66 | unlabeledX = X[y==-1, :].tolist()
67 | labeledX = X[y!=-1, :].tolist()
68 | labeledy = y[y!=-1]
69 |
70 | # convert class 0 to -1 for tsvm
71 | labeledy[labeledy==0] = -1
72 | labeledy = labeledy.tolist()
73 |
74 | if 'rbf' in self.kernel.lower():
75 | self.model = QN_S3VM(labeledX, labeledy, unlabeledX, self.random_generator, lam=self.C, lamU=self.lamU, kernel_type="RBF", sigma=self.gamma)
76 | else:
77 | self.model = QN_S3VM(labeledX, labeledy, unlabeledX, self.random_generator, lam=self.C, lamU=self.lamU)
78 |
79 | self.model.train()
80 |
81 | # probabilities by Platt scaling
82 | if self.probability:
83 | self.plattlr = LR()
84 | preds = self.model.mygetPreds(labeledX)
85 | self.plattlr.fit( preds.reshape( -1, 1 ), labeledy )
86 |
87 | def predict_proba(self, X):
88 | """Compute probabilities of possible outcomes for samples in X.
89 |
90 | The model needs to have probability information computed at training
91 | time: fit with attribute `probability` set to True.
92 |
93 | Parameters
94 | ----------
95 | X : array-like, shape = [n_samples, n_features]
96 |
97 | Returns
98 | -------
99 | T : array-like, shape = [n_samples, n_classes]
100 | Returns the probability of the sample for each class in
101 | the model. The columns correspond to the classes in sorted
102 | order, as they appear in the attribute `classes_`.
103 | """
104 |
105 | if self.probability:
106 | preds = self.model.mygetPreds(X.tolist())
107 | return self.plattlr.predict_proba(preds.reshape( -1, 1 ))
108 | else:
109 | raise RuntimeError("Probabilities were not calculated for this model - make sure you pass probability=True to the constructor")
110 |
111 | def predict(self, X):
112 | """Perform classification on samples in X.
113 |
114 | Parameters
115 | ----------
116 | X : array-like, shape = [n_samples, n_features]
117 |
118 | Returns
119 | -------
120 | y_pred : array, shape = [n_samples]
121 | Class labels for samples in X.
122 | """
123 |
124 | y = numpy.array(self.model.getPredictions(X.tolist()))
125 | y[y == -1] = 0
126 | return y
127 |
128 | def score(self, X, y, sample_weight=None):
129 | return sklearn.metrics.accuracy_score(y, self.predict(X), sample_weight=sample_weight)
130 |
--------------------------------------------------------------------------------
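SKTSVM follows the usual scikit-learn estimator conventions, with the one twist that unlabeled rows are marked with y = -1. A minimal usage sketch on synthetic data follows; it is not part of the repository, it assumes the methods/ directory is on the Python path, and it only uses the constructor arguments and methods defined above.

# Hypothetical usage sketch for SKTSVM -- not part of the repository.
import numpy as np
from scikitTSVM import SKTSVM   # assumes the methods/ directory is on the Python path

rng = np.random.RandomState(0)
X = np.vstack([rng.randn(25, 2) - 2.0, rng.randn(25, 2) + 2.0])
y = np.full(50, -1, dtype=int)   # -1 marks unlabeled instances
y[:3] = 0                        # a few labeled points from class 0
y[-3:] = 1                       # ... and a few from class 1

model = SKTSVM(kernel='rbf', C=1e-4, gamma=0.5, lamU=1.0, probability=True)
model.fit(X, y)
print(model.predict(X[:5]))          # hard 0/1 labels
print(model.predict_proba(X[:5]))    # Platt-scaled class probabilities

Because probability=True, fit() additionally trains the logistic-regression Platt scaler on the labeled predictions, which is what predict_proba relies on.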
/methods/scikitWQDA.py:
--------------------------------------------------------------------------------
1 | import traceback
2 | import numpy
3 | import numpy as np
4 | from sklearn.base import BaseEstimator
5 | import sklearn.metrics
6 | import scipy.stats
7 | import scipy.misc  # provides scipy.misc.logsumexp, used in _log_posterior below
8 | class WQDA(BaseEstimator):
9 | """
10 | Weighted Quadratic Discriminant Analysis (QDA)
11 |
12 | A classifier with a quadratic decision boundary, which allows
13 | weighted samples, and which is generated by fitting class
14 | conditional densities to the data and using Bayes' rule.
15 |
16 | The model fits a Gaussian density to each class.
17 |
18 | Attributes
19 | ----------
20 | covariances_ : list of array-like, shape = [n_features, n_features]
21 | Covariance matrices of each class.
22 |
23 | means_ : array-like, shape = [n_classes, n_features]
24 | Class means.
25 |
26 | priors_ : array-like, shape = [n_classes]
27 | Class priors (sum to 1).
28 | """
29 |
30 | def __init__(self): # LDA
31 | self.use_shrinkage = False
32 |
33 | #TODO regularization
34 | def fit(self, X, y, sample_weight=[]):
35 | """
36 | Fit the QDA model according to the given training data and parameters.
37 |
38 | Parameters
39 | ----------
40 | X : array-like, shape = [n_samples, n_features]
41 | Training vector, where n_samples is the number of samples and
42 | n_features is the number of features.
43 |
44 | y : array, shape = [n_samples]
45 | Target values (integers)
46 |
47 | sample_weight : array-like, shape (n_samples,), optional
48 | Weights applied to individual samples.
49 | If not provided, uniform weights are assumed.
50 |
51 | Returns
52 | -------
53 | self : returns an instance of self.
54 | """
55 |
56 | K = len(set(y))
57 | if X.shape[0] < X.shape[1]: # fewer instances than dimensions -> use shrinkage
58 | self.use_shrinkage = True
59 |
60 | kindices = [numpy.where(y==k)[0] for k in range(K)]
61 | if len(sample_weight) == 0:
62 | qsum = numpy.bincount(y.astype(int))
63 | sample_weight = numpy.ones(len(y))
64 | else:
65 | qsum = numpy.array([numpy.sum(sample_weight[kindices[k]]) for k in range(K)])
66 | sample_weight = numpy.reshape(sample_weight, (len(sample_weight),1))
67 | self.priors_ = qsum / float(len(y))
68 | self.means_ = []
69 | self.covariances_ = []
70 | self.covariance_ = []
71 | for k in range(K):
72 | self.means_.append(numpy.average(X[kindices[k], :], axis=0, weights=sample_weight[kindices[k]][:,0]))
73 | ##QDA
74 | try:
75 | xm = X[kindices[k], :] - self.means_[k]
76 | if X.shape[0] > X.shape[1]: # more instances than features
77 | ##normalizing by number of data points (Loog 2015)
78 | self.covariances_.append(1./(len(kindices[k])) * numpy.multiply(xm, sample_weight[kindices[k]]).T.dot(xm))
79 | ##weighted unbiased sample covariance (from http://stats.stackexchange.com/questions/61225/correct-equation-for-weighted-unbiased-sample-covariance )
80 | #self.covariances_.append(1./(sample_weight[kindices[k], :].sum()) * numpy.multiply(xm, sample_weight[kindices[k]]).T.dot(xm))
81 | else: # fewer instances than features - use shrinkage
82 | self.covariances_.append(weighted_oas(xm, sample_weight[kindices[k]]))
83 |
84 | except:
85 | traceback.print_exc()
86 |
87 | if len(self.covariance_) == 0:
88 | self.covariance_ = numpy.copy(self.covariances_[k])
89 | else:
90 | self.covariance_ += self.covariances_[k]
91 | self.covariance_ /= float(K)
92 |
93 | return self
94 |
95 | def _log_posterior(self, X, normalize = True):
96 | # https://github.com/probml/pmtk3/blob/5fefd068a2e84ae508684d3e4750bd72a4164ba0/toolbox/SupervisedModels/discrimAnalysis/discrimAnalysisPredict.m
97 | # Apply Bayes rule with Gaussian class-conditional densities.
98 | # post[i,c] = P(y=c|x(i,:), params)
99 | # yhat[i] = arg max_c post[i,c]
100 | N = X.shape[0]
101 | Nclasses = len(self.priors_)
102 | loglik = numpy.zeros((N, Nclasses))
103 | for c in range(Nclasses):
104 | try:
105 | mvnorm = scipy.stats.multivariate_normal(self.means_[c], self.covariances_[c], allow_singular=True)
106 | except:
107 | mvnorm = scipy.stats.multivariate_normal(self.means_[c], self.covariances_[c])
108 | loglik[:, c] = mvnorm.logpdf(X)
109 | logjoint = numpy.log(self.priors_) + loglik
110 |
111 | if normalize:
112 | normalization = scipy.misc.logsumexp(logjoint, axis=1)
113 | return logjoint - numpy.reshape(normalization, (len(normalization), 1))
114 | else:
115 | return logjoint
116 |
117 | def _posterior(self, X):
118 | return numpy.exp(self._log_posterior(X))
119 |
120 | def predict(self, X):
121 | """Perform classification on samples in X.
122 |
123 | Parameters
124 | ----------
125 | X : array-like, shape = [n_samples, n_features]
126 |
127 | Returns
128 | -------
129 | y_pred : array, shape = [n_samples]
130 | Class labels for samples in X.
131 | """
132 | return numpy.argmax(self._posterior(X), axis=1)
133 |
134 | def predict_proba(self, X):
135 | """Return posterior probabilities of classification.
136 |
137 | Parameters
138 | ----------
139 | X : array-like, shape = [n_samples, n_features]
140 | Array of samples/test vectors.
141 |
142 | Returns
143 | -------
144 | C : array, shape = [n_samples, n_classes]
145 | Posterior probabilities of classification per class.
146 | """
147 | return self._posterior(X)
148 |
149 | def score(self, X, y, sample_weight=None):
150 | return sklearn.metrics.accuracy_score(y, self.predict(X), sample_weight=sample_weight)
151 |
152 | def weighted_oas(X, weights):
153 | """Estimate covariance with the Oracle Approximating Shrinkage algorithm.
154 |
155 | Parameters
156 | ----------
157 | X : array-like, shape (n_samples, n_features)
158 | Centered data from which to compute the covariance estimate.
159 |
160 | weights : array-like, shape (n_samples, 1) -- per-sample weights
161 |
162 | Returns
163 | -------
164 | shrunk_cov : array-like, shape (n_features, n_features)
165 | Shrunk covariance.
166 |
167 | Notes
168 | -----
169 | The regularised (shrunk) covariance is:
170 |
171 | (1 - shrinkage)*cov
172 | + shrinkage * mu * np.identity(n_features)
173 |
174 | where mu = trace(cov) / n_features
175 |
176 | The formula we used to implement the OAS
177 | does not correspond to the one given in the article. It has been taken
178 | from the MATLAB program available from the author's webpage
179 | (https://tbayes.eecs.umich.edu/yilun/covestimation).
180 |
181 | """
182 | X = np.asarray(X)
183 | n_samples, n_features = X.shape
184 |
185 | #emp_cov = empirical_covariance(X, assume_centered=assume_centered)
186 | emp_cov = 1./(len(weights)) * numpy.multiply(X, weights).T.dot(X)
187 | mu = np.trace(emp_cov) / n_features
188 |
189 | # formula from Chen et al.'s **implementation**
190 | alpha = np.mean(emp_cov ** 2)
191 | num = alpha + mu ** 2
192 | den = (n_samples + 1.) * (alpha - (mu ** 2) / n_features)
193 |
194 | shrinkage = 1. if den == 0 else min(num / den, 1.)
195 | shrunk_cov = (1. - shrinkage) * emp_cov
196 | shrunk_cov.flat[::n_features + 1] += shrinkage * mu
197 |
198 | return shrunk_cov#, shrinkage
199 |
--------------------------------------------------------------------------------
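WQDA is a weight-aware stand-in for a quadratic discriminant classifier: fit() accepts an optional sample_weight vector and predict_proba() returns class posteriors via Bayes' rule. A minimal sketch on synthetic data, not part of the repository, assuming the methods/ directory is on the Python path and using only the methods defined above:

# Hypothetical usage sketch for WQDA -- not part of the repository.
import numpy as np
from scikitWQDA import WQDA   # assumes the methods/ directory is on the Python path

rng = np.random.RandomState(0)
X = np.vstack([rng.randn(30, 2) - 2.0, rng.randn(30, 2) + 2.0])
y = np.hstack([np.zeros(30, dtype=int), np.ones(30, dtype=int)])
w = rng.uniform(0.1, 1.0, size=60)    # per-sample weights

clf = WQDA()
clf.fit(X, y, sample_weight=w)        # weighted class means and covariances
print(clf.predict(X[:5]))             # hard 0/1 labels
print(clf.predict_proba(X[:5]))       # class posteriors via Bayes' rule
print(clf.score(X, y))                # accuracy on the training data

With more samples than features the per-class covariances come from the weighted sample covariance; otherwise fit() falls back to the weighted_oas shrinkage estimator defined at the bottom of the file.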
/qdaexample - Copy.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/qdaexample - Copy.png
--------------------------------------------------------------------------------
/qdaexample.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/qdaexample.png
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # -*- coding: utf-8 -*-
3 |
4 | try:
5 | from setuptools import setup
6 | except ImportError:
7 | from distutils.core import setup
8 |
9 | requirements = [
10 | "sklearn",
11 | "scipy",
12 | "numpy",
13 | "matplotlib"
14 | ]
15 |
16 | test_requirements = []
17 |
18 | setup(
19 | name='semisup_learn',
20 | version='0.0.1',
21 | description="Semisupervised Learning Framework",
22 | url='https://github.com/tmadl/semisup-learn',
23 | packages=[
24 | 'methods', 'frameworks'
25 | ],
26 | include_package_data=True,
27 | install_requires=requirements,
28 | zip_safe=False,
29 | keywords='semisup-learn',
30 | )
--------------------------------------------------------------------------------
/svmexample1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/svmexample1.png
--------------------------------------------------------------------------------
/svmexample2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tmadl/semisup-learn/483698be0453f08e550a4c705fc7ad522708708c/svmexample2.png
--------------------------------------------------------------------------------