├── .gitignore ├── LICENSE ├── README.md ├── example ├── example.py ├── misvmio.py ├── musk1.data └── musk1.names ├── misvm ├── __init__.py ├── cccp.py ├── kernel.py ├── mi_svm.py ├── mica.py ├── misssvm.py ├── nsk.py ├── quadprog.py ├── sbmil.py ├── sil.py ├── smil.py ├── stk.py ├── stmil.py ├── svm.py └── util.py └── setup.py /.gitignore: -------------------------------------------------------------------------------- 1 | *.pyc 2 | *.swp 3 | *.swo 4 | build 5 | dist 6 | *.egg-info 7 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2013, Case Western Reserve University, Gary Doran 2 | All rights reserved. 3 | 4 | Redistribution and use in source and binary forms, with or without 5 | modification, are permitted provided that the following conditions are met: 6 | 7 | * Redistributions of source code must retain the above copyright notice, 8 | this list of conditions and the following disclaimer. 9 | * Redistributions in binary form must reproduce the above copyright notice, 10 | this list of conditions and the following disclaimer in the documentation 11 | and/or other materials provided with the distribution. 12 | * Neither the name of the owner nor the names of its contributors may be 13 | used to endorse or promote products derived from this software without 14 | specific prior written permission. 15 | 16 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 17 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 18 | WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 19 | DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR 20 | ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES 21 | (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; 22 | LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON 23 | ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 24 | (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 25 | SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 26 | 27 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | MISVM: Multiple-Instance Support Vector Machines 2 | ================================================ 3 | 4 | by Gary Doran () 5 | 6 | Overview 7 | -------- 8 | 9 | MISVM contains a Python implementation of numerous support vector machine (SVM) 10 | algorithms for the multiple-instance (MI) learning framework. The 11 | implementations were created for use in the following publication: 12 | > Doran, Gary and Soumya Ray. **A theoretical and empirical analysis of support 13 | > vector machine methods for multiple-instance classification.** _To appear in 14 | > Machine Learning Journal._ 2013. 15 | 16 | Installation 17 | ------------ 18 | 19 | This package can be installed in two ways. The easy way: 20 | 21 | # If needed: 22 | # pip install numpy 23 | # pip install scipy 24 | # pip install cvxopt 25 | pip install -e git+https://github.com/garydoranjr/misvm.git#egg=misvm 26 | 27 | or by running the setup file manually: 28 | 29 | git clone [the url for misvm] 30 | cd misvm 31 | python setup.py install 32 | 33 | Note that the code depends on the `numpy`, `scipy`, and `cvxopt` packages, so have those 34 | installed first; the build will likely fail if it cannot find them.
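Before installing, it may help to confirm that these dependencies are importable. The helper below is a minimal, stand-alone sketch (the `check_dependencies` name is made up for illustration; it is not part of MISVM):

```python
import importlib


def check_dependencies(packages=("numpy", "scipy", "cvxopt")):
    """Return the subset of `packages` that cannot be imported."""
    missing = []
    for name in packages:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == '__main__':
    missing = check_dependencies()
    if missing:
        print('Missing dependencies: %s' % ', '.join(missing))
    else:
        print('All dependencies found.')
```

An empty result means the build should find everything it needs.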
For more information, see: 35 | 36 | + [NumPy](http://www.numpy.org/): Library for efficient matrix math in Python 37 | + [SciPy](http://www.scipy.org/): Library for more MATLAB-like functionality 38 | + [CVXOPT](http://cvxopt.org/): Efficient convex (including quadratic program) optimization 39 | 40 | Contents 41 | -------- 42 | 43 | The MISVM package currently implements the following algorithms: 44 | 45 | ### SIL 46 | Single-Instance Learning (SIL) is a "naive" approach that assigns each instance 47 | the label of its bag, creating a supervised learning problem but mislabeling 48 | negative instances in positive bags. It works surprisingly well for many 49 | problems. 50 | > Ray, Soumya, and Mark Craven. **Supervised versus multiple instance learning: 51 | > an empirical comparison.** _Proceedings of the 22nd International Conference 52 | > on Machine Learning._ 2005. 53 | 54 | ### MI-SVM and mi-SVM 55 | These approaches modify the standard SVM formulation so that the constraints on 56 | instance labels correspond to the MI assumption that at least one instance in 57 | each bag is positive. For more information, see: 58 | > Andrews, Stuart, Ioannis Tsochantaridis, and Thomas Hofmann. **Support vector 59 | > machines for multiple-instance learning.** _Advances in Neural Information 60 | > Processing Systems._ 2002. 61 | 62 | ### NSK and STK 63 | The normalized set kernel (NSK) and statistics kernel (STK) approaches use 64 | kernels to map entire bags into a feature space, then use the standard SVM 65 | formulation to find bag classifiers: 66 | > Gärtner, Thomas, Peter A. Flach, Adam Kowalczyk, and Alex J. Smola. 67 | > **Multi-instance kernels.** _Proceedings of the 19th International Conference on 68 | > Machine Learning._ 2002. 69 | 70 | ### MissSVM 71 | MissSVM uses a semi-supervised learning approach, treating the instances in 72 | positive bags as unlabeled data: 73 | > Zhou, Zhi-Hua, and Jun-Ming Xu.
**On the relation between multi-instance 74 | > learning and semi-supervised learning.** _Proceedings of the 24th 75 | > International Conference on Machine Learning._ 2007. 76 | 77 | ### MICA 78 | The "multiple-instance classification algorithm" (MICA) represents each bag 79 | using a convex combination of its instances. The optimization program is then 80 | solved by iteratively solving a series of linear programs. In our formulation, 81 | we use L2 regularization, so we solve alternating linear and quadratic programs. 82 | For more information on the original algorithm, see: 83 | > Mangasarian, Olvi L., and Edward W. Wild. **Multiple instance classification 84 | > via successive linear programming.** _Journal of Optimization Theory and 85 | > Applications_ 137.3 (2008): 555-568. 86 | 87 | ### sMIL, stMIL, and sbMIL 88 | This family of approaches intentionally biases SVM formulations to handle the 89 | assumption that there are very few positive instances in each positive bag. In 90 | the case of sbMIL, prior knowledge on the "sparsity" of positive bags can be 91 | specified or found via cross-validation: 92 | > Bunescu, Razvan C., and Raymond J. Mooney. **Multiple instance learning for 93 | > sparse positive bags.** _Proceedings of the 24th International Conference on 94 | > Machine Learning._ 2007. 95 | 96 | How to Use 97 | ---------- 98 | 99 | The classifier implementations are loosely based on those found in the 100 | [scikit-learn](http://scikit-learn.org/stable/) library. First, construct a 101 | classifier with the desired parameters: 102 | 103 | >>> import misvm 104 | >>> classifier = misvm.MISVM(kernel='linear', C=1.0, max_iters=50) 105 | 106 | Use Python's `help` functionality as in `help(misvm.MISVM)` or read the 107 | documentation in the code to see which arguments each classifier takes.
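As a toy illustration of the inputs these classifiers expect, bags can be built from plain nested lists, since anything "array-like" is accepted (the values below are synthetic):

```python
# Two synthetic bags: each inner list is one instance's feature vector.
# Bags may contain different numbers of instances, but every instance
# must have the same number of features (here, two).
bags = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],  # bag with 3 instances
    [[0.5, 1.5]],                          # bag with 1 instance
]
# One +1/-1 label per bag (0/1 labels will give strange results)
labels = [1.0, -1.0]
```

These are exactly the `bags` and `labels` arguments that `fit` takes, as described below.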
Then, 108 | call the `fit` function with some data: 109 | 110 | >>> classifier.fit(bags, labels) 111 | 112 | Here, the `bags` argument is a list of "array-like" (could be NumPy arrays, or a 113 | list of lists) objects representing each bag. Each (array-like) bag has m rows 114 | and f columns, which correspond to m instances, each with f features. Of course, 115 | m can be different across bags, but f must be the same. Then `labels` is an 116 | array-like object containing a label corresponding to each bag. *Each label must 117 | be either +1 or -1.* You will likely get strange results if you try using 118 | 0/1-valued labels. After training the classifier, you can call the `predict` 119 | function as: 120 | 121 | >>> labels = classifier.predict(bags) 122 | 123 | Here `bags` has the same format as for `fit`, and the function returns an array 124 | of real-valued predictions (use `numpy.sign(labels)` to get -1/+1 class 125 | predictions). 126 | 127 | In order to get instance-level predictions from a classifier, use the 128 | `instancePrediction` flag, as in: 129 | 130 | >>> bag_labels, instance_labels = classifier.predict(bags, instancePrediction=True) 131 | 132 | The `instancePrediction` flag is not available for bag-level classifiers such 133 | as the NSK. However, you can always predict the labels of "singleton" bags 134 | containing a single instance to assign a label to that instance. In this case, 135 | one should use caution in interpreting the label of an instance produced by a 136 | bag-level classifier, since these classifiers are designed to make predictions 137 | based on properties of an entire bag. 138 | 139 | An example script is included that trains classifiers on the [musk1 140 | dataset](http://archive.ics.uci.edu/ml/datasets/Musk+(Version+1)); see: 141 | > Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository 142 | > [http://archive.ics.uci.edu/ml]. 
Irvine, CA: University of California, School 143 | > of Information and Computer Science. 144 | 145 | Install the package or add the `misvm` directory to the `PYTHONPATH` environment 146 | variable before attempting to run the example using `python example.py` within 147 | the `example` directory. 148 | 149 | Questions and Issues 150 | -------------------- 151 | 152 | If you find any bugs or have any questions about this code, please create an 153 | issue on [GitHub](https://github.com/garydoranjr/misvm/issues), or contact Gary 154 | Doran at . Of course, I cannot guarantee any support for 155 | this software. 156 | -------------------------------------------------------------------------------- /example/example.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | from __future__ import print_function, division 3 | 4 | import numpy as np 5 | 6 | from misvmio import parse_c45, bag_set 7 | import misvm 8 | 9 | 10 | def main(): 11 | # Load list of C4.5 Examples 12 | example_set = parse_c45('musk1') 13 | 14 | # Get stats to normalize data 15 | raw_data = np.array(example_set.to_float()) 16 | data_mean = np.average(raw_data, axis=0) 17 | data_std = np.std(raw_data, axis=0) 18 | data_std[np.nonzero(data_std == 0.0)] = 1.0 19 | def normalizer(ex): 20 | ex = np.array(ex) 21 | normed = ((ex - data_mean) / data_std) 22 | # The slice [2:-1] removes the first two entries and the last entry, 23 | # which are the bag/instance ids and class label, as part of the 24 | # normalization process 25 | return normed[2:-1] 26 | 27 | # Group examples into bags 28 | bagset = bag_set(example_set) 29 | 30 | # Convert bags to NumPy arrays 31 | bags = [np.array(b.to_float(normalizer)) for b in bagset] 32 | labels = np.array([b.label for b in bagset], dtype=float) 33 | # Convert 0/1 labels to -1/1 labels 34 | labels = 2 * labels - 1 35 | 36 | # Split the dataset arbitrarily into train/test sets 37 | train_bags = bags[10:] 38 | train_labels = 
labels[10:] 39 | test_bags = bags[:10] 40 | test_labels = labels[:10] 41 | 42 | # Construct classifiers 43 | classifiers = {} 44 | classifiers['MissSVM'] = misvm.MissSVM(kernel='linear', C=1.0, max_iters=20) 45 | classifiers['sbMIL'] = misvm.sbMIL(kernel='linear', eta=0.1, C=1e2) 46 | classifiers['SIL'] = misvm.SIL(kernel='linear', C=1.0) 47 | 48 | # Train/Evaluate classifiers 49 | accuracies = {} 50 | for algorithm, classifier in classifiers.items(): 51 | classifier.fit(train_bags, train_labels) 52 | predictions = classifier.predict(test_bags) 53 | accuracies[algorithm] = np.average(test_labels == np.sign(predictions)) 54 | 55 | for algorithm, accuracy in accuracies.items(): 56 | print('\n%s Accuracy: %.1f%%' % (algorithm, 100 * accuracy)) 57 | 58 | 59 | if __name__ == '__main__': 60 | main() 61 | -------------------------------------------------------------------------------- /example/misvmio.py: -------------------------------------------------------------------------------- 1 | """ 2 | Parses and represents C4.5 MI data sets 3 | """ 4 | from __future__ import print_function, division 5 | import os 6 | import re 7 | import sys 8 | import traceback 9 | from collections import MutableSequence, defaultdict, Sequence 10 | from itertools import chain, starmap 11 | 12 | NAMES_EXT = '.names' 13 | DATA_EXT = '.data' 14 | 15 | _COMMENT_RE = '//.*' 16 | _BINARY_RE = '\\s*0\\s*,\\s*1\\s*' 17 | 18 | 19 | class Feature(object): 20 | """ 21 | Information for a feature 22 | of C4.5 data 23 | """ 24 | 25 | class Type: 26 | """ 27 | Type of feature 28 | """ 29 | CLASS = 'CLASS' 30 | ID = 'ID' 31 | BINARY = 'BINARY' 32 | NOMINAL = 'NOMINAL' 33 | CONTINUOUS = 'CONTINUOUS' 34 | 35 | def __init__(self, name, ftype, values=None): 36 | self.name = name 37 | self.type = ftype 38 | if (self.type == Feature.Type.ID or 39 | self.type == Feature.Type.NOMINAL): 40 | if values is None: 41 | raise Exception('No values for %s feature' % self.type) 42 | else: 43 | self.values = tuple(values) 44 
| else: 45 | if values is None: 46 | self.values = None 47 | else: 48 | raise Exception('Values given for %s feature' % self.type) 49 | self.tup = (self.name, self.type, self.values) 50 | 51 | def __cmp__(self, other): 52 | if self.tup > other.tup: 53 | return 1 54 | elif self.tup < other.tup: 55 | return -1 56 | else: 57 | return 0 58 | 59 | def __hash__(self): 60 | return self.tup.__hash__() 61 | 62 | def __repr__(self): 63 | return '<%s, %s, %s>' % self.tup 64 | 65 | def to_float(self, value): 66 | if value is None: 67 | return None 68 | if (self.type == Feature.Type.ID or 69 | self.type == Feature.Type.NOMINAL): 70 | return float(self.values.index(value)) 71 | elif (self.type == Feature.Type.BINARY or 72 | self.type == Feature.Type.CLASS): 73 | if value: 74 | return 1.0 75 | else: 76 | return 0.0 77 | else: 78 | return value 79 | 80 | 81 | Feature.CLASS = Feature("CLASS", Feature.Type.CLASS) 82 | 83 | 84 | class Schema(Sequence): 85 | """ 86 | Represents a schema for C4.5 data 87 | """ 88 | 89 | def __init__(self, features): 90 | self.features = tuple(features) 91 | 92 | def __cmp__(self, other): 93 | if self.features > other.features: 94 | return 1 95 | elif self.features < other.features: 96 | return -1 97 | else: 98 | return 0 99 | 100 | def __hash__(self): 101 | return self.features.__hash__() 102 | 103 | def __repr__(self): 104 | return str(self.features) 105 | 106 | def __len__(self): 107 | return len(self.features) 108 | 109 | def __iter__(self): 110 | return self.features.__iter__() 111 | 112 | def __contains__(self, item): 113 | return self.features.__contains__(item) 114 | 115 | def __getitem__(self, key): 116 | return self.features[key] 117 | 118 | 119 | class ExampleSet(MutableSequence): 120 | """ 121 | Holds a set of examples 122 | """ 123 | 124 | def __init__(self, schema): 125 | self.schema = schema 126 | self.examples = [] 127 | 128 | def __len__(self): 129 | return len(self.examples) 130 | 131 | def __iter__(self): 132 | return 
self.examples.__iter__() 133 | 134 | def __contains__(self, item): 135 | return self.examples.__contains__(item) 136 | 137 | def __getitem__(self, key): 138 | return self.examples[key] 139 | 140 | def __setitem__(self, key, example): 141 | if example.schema != self.schema: 142 | raise ValueError('Schema mismatch') 143 | self.examples[key] = example 144 | 145 | def __delitem__(self, key): 146 | del self.examples[key] 147 | 148 | def insert(self, key, example): 149 | if example.schema != self.schema: 150 | raise ValueError('Schema mismatch') 151 | return self.examples.insert(key, example) 152 | 153 | def append(self, example): 154 | if example.schema != self.schema: 155 | raise ValueError('Schema mismatch') 156 | super(ExampleSet, self).append(example) 157 | 158 | def __repr__(self): 159 | return '<%s, %s>' % (self.schema, self.examples) 160 | 161 | def to_float(self, normalizer=None): 162 | return [ex.to_float(normalizer) for ex in self] 163 | 164 | 165 | class Example(MutableSequence): 166 | """ 167 | Represents a single example 168 | from a dataset 169 | """ 170 | 171 | def __init__(self, schema): 172 | self.schema = schema 173 | self.features = [None for i in range(len(schema))] 174 | self.weight = 1.0 175 | 176 | def __len__(self): 177 | return len(self.features) 178 | 179 | def __iter__(self): 180 | return self.features.__iter__() 181 | 182 | def __contains__(self, item): 183 | return self.features.__contains__(item) 184 | 185 | def __getitem__(self, key): 186 | return self.features[key] 187 | 188 | def __setitem__(self, key, value): 189 | self.features[key] = value 190 | 191 | def __delitem__(self, key): 192 | del self.features[key] 193 | 194 | def insert(self, key, item): 195 | return self.features.insert(key, item) 196 | 197 | def __repr__(self): 198 | return '<%s, %s, %s>' % (self.schema, self.features, self.weight) 199 | 200 | def copy_of(self): 201 | ex = Example(self.schema) 202 | for i, f in enumerate(self): 203 | ex[i] = f 204 | return ex 205 | 206 | 
def to_float(self, normalizer=None): 207 | if normalizer is None: 208 | normalizer = lambda x: x 209 | return normalizer([feature.to_float(value) 210 | for feature, value in zip(self.schema, self)]) 211 | 212 | 213 | class Bag(MutableSequence): 214 | """ 215 | Represents a Bag 216 | """ 217 | 218 | def __init__(self, bag_id, examples): 219 | classes = map(lambda x: x[-1], examples) 220 | if any(classes): 221 | self.label = True 222 | else: 223 | self.label = False 224 | self.bag_id = bag_id 225 | self.examples = examples 226 | 227 | def __len__(self): 228 | return len(self.examples) 229 | 230 | def __iter__(self): 231 | return self.examples.__iter__() 232 | 233 | def __contains__(self, item): 234 | return self.examples.__contains__(item) 235 | 236 | def __getitem__(self, key): 237 | return self.examples[key] 238 | 239 | def __setitem__(self, key, value): 240 | self.examples[key] = value 241 | 242 | def __delitem__(self, key): 243 | del self.examples[key] 244 | 245 | def insert(self, key, item): 246 | return self.examples.insert(key, item) 247 | 248 | def __repr__(self): 249 | return '<%s, %s>' % (self.examples, self.label) 250 | 251 | def to_float(self, normalizer=None): 252 | return [example.to_float(normalizer) for example in self] 253 | 254 | 255 | def bag_set(exampleset, bag_attr=0): 256 | """ 257 | Construct bags on the given attribute 258 | """ 259 | bag_dict = defaultdict(list) 260 | for example in exampleset: 261 | bag_dict[example[bag_attr]].append(example) 262 | return [Bag(bag_id, value) for bag_id, value in bag_dict.items()] 263 | 264 | 265 | def parse_c45(file_base, rootdir='.'): 266 | """ 267 | Returns an ExampleSet from the 268 | C4.5 formatted data 269 | """ 270 | schema_name = file_base + NAMES_EXT 271 | data_name = file_base + DATA_EXT 272 | schema_file = find_file(schema_name, rootdir) 273 | if schema_file is None: 274 | raise ValueError('Schema file not found') 275 | data_file = find_file(data_name, rootdir) 276 | if data_file is None: 277 | 
raise ValueError('Data file not found') 278 | return _parse_c45(schema_file, data_file) 279 | 280 | 281 | def _parse_c45(schema_filename, data_filename): 282 | """Parses C4.5 given file names""" 283 | try: 284 | schema = _parse_schema(schema_filename) 285 | except Exception as e: 286 | raise Exception('Error parsing schema: %s' % e) 287 | 288 | try: 289 | examples = _parse_examples(schema, data_filename) 290 | except Exception as e: 291 | raise Exception('Error parsing examples: %s' % e) 292 | 293 | return examples 294 | 295 | 296 | def _parse_schema(schema_filename): 297 | features = [] 298 | needs_id = True 299 | with open(schema_filename) as schema_file: 300 | for line in schema_file: 301 | feature = _parse_feature(line, needs_id) 302 | if feature is not None: 303 | if (needs_id and 304 | feature.type == Feature.Type.ID): 305 | needs_id = False 306 | features.append(feature) 307 | try: 308 | features.remove(Feature.CLASS) 309 | except: 310 | raise Exception('File does not contain worthless "Class" line.') 311 | features.append(Feature.CLASS) 312 | return Schema(features) 313 | 314 | 315 | def _parse_feature(line, needs_id): 316 | """ 317 | Parse a feature from the given line; 318 | second argument indicates whether we 319 | need an ID for our schema 320 | """ 321 | line = _trim_line(line) 322 | if len(line) == 0: 323 | # Blank line 324 | return None 325 | if re.match(_BINARY_RE, line) is not None: 326 | # Class feature 327 | return Feature.CLASS 328 | colon = line.find(':') 329 | if colon < 0: 330 | raise Exception('No feature name found.') 331 | name = line[:colon].strip() 332 | remainder = line[colon + 1:] 333 | values = _parse_values(remainder) 334 | if needs_id: 335 | return Feature(name, Feature.Type.ID, values) 336 | elif len(values) == 1 and values[0].startswith('continuous'): 337 | return Feature(name, Feature.Type.CONTINUOUS) 338 | elif len(values) == 2 and '0' in values and '1' in values: 339 | return Feature(name, Feature.Type.BINARY) 340 | else: 341 
| return Feature(name, Feature.Type.NOMINAL, values) 342 | 343 | 344 | def _parse_values(remainder): 345 | values = list() 346 | for raw in remainder.split(','): 347 | raw = raw.strip() 348 | if len(raw) > 1 and raw[0] == '"' and raw[-1] == '"': 349 | raw = raw[1:-1].strip() 350 | values.append(raw) 351 | return values 352 | 353 | 354 | def _parse_examples(schema, data_filename): 355 | exset = ExampleSet(schema) 356 | with open(data_filename) as data_file: 357 | for line in data_file: 358 | line = _trim_line(line) 359 | if len(line) == 0: 360 | continue 361 | try: 362 | ex = _parse_example(schema, line) 363 | exset.append(ex) 364 | except Exception as e: 365 | traceback.print_exc(file=sys.stderr) 366 | print('Warning: skipping line: "%s"' % line, file=sys.stderr) 367 | return exset 368 | 369 | 370 | def _parse_example(schema, line): 371 | values = _parse_values(line) 372 | if len(values) != len(schema): 373 | raise Exception('Feature-data size mismatch: %s' % line) 374 | ex = Example(schema) 375 | for i, value in enumerate(values): 376 | if value == '?': 377 | # Unknown value stays 'None' 378 | continue 379 | stype = schema[i].type 380 | if (stype == Feature.Type.ID or 381 | stype == Feature.Type.NOMINAL): 382 | ex[i] = value 383 | elif (stype == Feature.Type.BINARY or 384 | stype == Feature.Type.CLASS): 385 | ex[i] = bool(int(value)) 386 | elif stype == Feature.Type.CONTINUOUS: 387 | ex[i] = float(value) 388 | else: 389 | raise ValueError('Unknown schema type "%s"' % stype) 390 | return ex 391 | 392 | 393 | def _trim_line(line): 394 | """ 395 | Removes comments and periods 396 | from the given line 397 | """ 398 | line = re.sub(_COMMENT_RE, '', line) 399 | line = line.strip() 400 | if len(line) > 0 and line[-1] == '.': 401 | line = line[:-1].strip() 402 | return line 403 | 404 | 405 | def find_file(filename, rootdir): 406 | """ 407 | Finds a file with filename located in 408 | some subdirectory of the current directory 409 | """ 410 | for dirpath, _, filenames in 
os.walk(rootdir): 411 | if filename in filenames: 412 | return os.path.join(dirpath, filename) 413 | 414 | 415 | def save_c45(example_set, basename, basedir='.'): 416 | schema_name = os.path.join(basedir, basename + NAMES_EXT) 417 | data_name = os.path.join(basedir, basename + DATA_EXT) 418 | 419 | print(schema_name) 420 | with open(schema_name, 'w+') as schema_file: 421 | schema_file.write('0,1.\n') 422 | for feature in example_set.schema: 423 | if (feature.type == Feature.Type.ID or 424 | feature.type == Feature.Type.NOMINAL): 425 | schema_file.write('%s:%s.\n' % 426 | (feature.name, ','.join(sorted(feature.values)))) 427 | elif feature.type == Feature.Type.BINARY: 428 | schema_file.write('%s:0,1.\n' % feature.name) 429 | elif feature.type == Feature.Type.CONTINUOUS: 430 | schema_file.write('%s:continuous.\n' % feature.name) 431 | 432 | with open(data_name, 'w+') as data_file: 433 | for example in example_set: 434 | ex_strs = starmap(_feature_to_str, zip(example.schema, example)) 435 | data_file.write('%s.\n' % ','.join(ex_strs)) 436 | 437 | 438 | def _feature_to_str(feature, value): 439 | if (feature.type == Feature.Type.ID or 440 | feature.type == Feature.Type.NOMINAL): 441 | return value 442 | elif (feature.type == Feature.Type.BINARY or 443 | feature.type == Feature.Type.CLASS): 444 | return str(int(value)) 445 | elif feature.type == Feature.Type.CONTINUOUS: 446 | return str(float(value)) 447 | -------------------------------------------------------------------------------- /example/musk1.names: -------------------------------------------------------------------------------- 1 | 0,1. 
2 | molecule_name: MUSK-jf78,MUSK-jf67,MUSK-jf59,MUSK-jf58,MUSK-jf47,MUSK-jf46,MUSK-jf17,MUSK-j51,MUSK-j33,MUSK-f205,MUSK-f184,MUSK-f159,MUSK-f158,MUSK-f152,MUSK-344,MUSK-333,MUSK-331,MUSK-330,MUSK-323,MUSK-322,MUSK-321,MUSK-316,MUSK-315,MUSK-314,MUSK-311,MUSK-301,MUSK-293,MUSK-292,MUSK-285,MUSK-284,MUSK-273,MUSK-272,MUSK-256,MUSK-254,MUSK-246,MUSK-240,MUSK-238,MUSK-236,MUSK-228,MUSK-227,MUSK-224,MUSK-219,MUSK-213,MUSK-212,MUSK-211,MUSK-190,MUSK-188,NON-MUSK-jp13,NON-MUSK-jp10,NON-MUSK-j97,NON-MUSK-j96,NON-MUSK-j93,NON-MUSK-j90,NON-MUSK-j84,NON-MUSK-j83,NON-MUSK-j81,NON-MUSK-j148,NON-MUSK-j147,NON-MUSK-j146,NON-MUSK-j130,NON-MUSK-j129,NON-MUSK-j100,NON-MUSK-f209,NON-MUSK-f164,NON-MUSK-f161,NON-MUSK-f150,NON-MUSK-334,NON-MUSK-327,NON-MUSK-320,NON-MUSK-319,NON-MUSK-318,NON-MUSK-309,NON-MUSK-308,NON-MUSK-305,NON-MUSK-297,NON-MUSK-296,NON-MUSK-295,NON-MUSK-290,NON-MUSK-289,NON-MUSK-288,NON-MUSK-286,NON-MUSK-271,NON-MUSK-257,NON-MUSK-253,NON-MUSK-249,NON-MUSK-247,NON-MUSK-232,NON-MUSK-226,NON-MUSK-220,NON-MUSK-208,NON-MUSK-200,NON-MUSK-199. 
3 | conformation_name: 188_1+1,188_1+2,188_1+3,188_1+4,190_1+1,190_1+2,190_1+3,190_1+4,211_1+1,211_1+2,212_1+1,212_1+2,212_1+3,213_1+1,213_1+2,213_1+3,213_1+4,219_1+1,219_1+2,224_1+1,224_1+2,227_1+1,227_1+2,228_1+1,228_1+2,228_1+3,228_2+1,228_2+2,236_1+1,236_1+2,236_1+3,236_2+1,236_2+2,236_2+3,238_1+1,238_1+2,238_1+3,238_2+1,238_2+2,240_1+1,240_1+2,240_2+1,240_2+2,240_3+1,240_3+2,240_4+1,240_4+2,246_1+1,246_1+2,246_2+1,246_2+2,254_1+1,254_1+2,256_1+1,256_1+2,256_1+3,256_1+4,272_1+1,272_1+2,272_1+3,273_1+1,273_1+2,273_1+3,273_1+4,273_1+5,284_1+1,284_1+2,284_2+1,284_2+2,285_1+1,285_1+2,285_1+3,285_1+4,285_2+1,285_2+2,285_2+3,285_2+4,292_1+1,292_1+2,292_2+1,292_2+2,293_1+1,293_1+2,293_2+1,293_2+2,301_1+1,301_1+2,301_2+1,301_2+2,311_1+1,311_1+2,314_1+1,314_1+2,314_2+1,314_2+2,314_3+1,314_3+2,314_4+1,314_4+2,315_1+1,315_1+2,315_2+1,315_2+2,316_1+1,316_1+2,316_2+1,316_2+2,321_1+1,321_1+2,322_1+1,322_1+2,322_2+1,322_2+2,322_3+1,322_3+2,322_4+1,322_4+2,323_1+1,323_1+2,323_2+1,323_2+2,330_1+1,330_1+2,330_2+1,330_2+2,331_1+1,331_1+2,331_2+1,331_2+2,333_1+1,333_1+2,333_2+1,333_2+2,333_3+1,333_3+2,333_4+1,333_4+2,344_1+1,344_1+2,f152_1+1,f152_1+2,f152_1+3,f152_1+4,f158_1+1,f158_1+2,f158_1+3,f158_1+4,f159_1+1,f159_1+2,f184_1+1,f184_1+2,f184_1+3,f184_1+4,f184_2+1,f184_2+2,f184_2+3,f184_2+4,f205_1+1,f205_1+2,f205_1+3,f205_1+4,f205_2+1,f205_2+2,f205_2+3,f205_2+4,j33_1+1,j33_1+2,j51_1+1,j51_1+2,j51_2+1,j51_2+2,jf17_1+1,jf17_1+2,jf17_1+3,jf46_1+1,jf46_1+2,jf46_2+1,jf46_2+2,jf46_2+3,jf47_1+1,jf47_1+2,jf47_2+1,jf47_2+2,jf58_1+1,jf58_1+2,jf58_2+1,jf58_2+2,jf58_2+3,jf59_1+1,jf59_1+2,jf59_2+1,jf59_2+2,jf59_2+3,jf67_1+1,jf67_1+2,jf67_2+1,jf67_2+2,jf67_3+1,jf67_3+2,jf67_4+1,jf67_4+2,jf78_1+1,jf78_1+2,jf78_1+3,jf78_2+1,jf78_2+2,jf78_2+3,199_1+1,199_1+2,199_1+3,199_1+4,200_1+1,200_1+2,200_1+3,200_1+4,208_1+1,208_1+2,220_1+1,220_1+2,220_1+3,220_1+4,226_1+1,226_1+2,232_1+1,232_1+2,232_1+3,232_2+1,232_2+2,232_3+1,232_3+2,232_4+1,232_4+2,247_1+1,247_1+2,249_1+1,249_1+2,253_1+1,253_1+2,257_1+1,257
_1+2,257_1+3,257_1+4,271_1+1,271_1+2,286_1+1,286_1+2,286_1+3,286_1+4,286_2+1,286_2+2,286_2+3,286_2+4,286_2+5,288_1+1,288_1+2,288_1+3,288_1+4,288_1+5,288_1+6,288_1+7,288_1+8,288_2+1,288_2+2,288_2+3,288_2+4,288_2+5,288_2+6,288_2+7,288_2+8,288_3+1,288_3+2,288_3+3,288_3+4,288_3+5,288_3+6,288_3+7,288_3+8,288_4+1,288_4+2,288_4+3,288_4+4,288_4+5,288_4+6,288_4+7,288_4+8,289_1+1,289_1+2,289_1+3,289_1+4,290_1+1,290_1+2,295_1+1,295_1+2,296_1+1,296_1+2,296_2+1,296_2+2,297_1+1,297_1+2,297_2+1,297_2+2,305_1+1,305_1+2,308_1+1,308_1+2,309_1+1,309_1+2,318_1+1,318_1+2,319_1+1,319_1+2,319_2+1,319_2+2,320_1+1,320_1+2,327_1+1,327_1+2,327_2+1,327_2+2,334_1+1,334_1+2,f150_1+1,f150_1+2,f161_1+1,f161_1+2,f164_1+1,f164_1+2,f209_1+1,f209_1+2,f209_1+3,f209_1+4,f209_2+1,f209_2+2,f209_2+3,f209_2+4,j100_1+1,j100_1+2,j100_2+1,j100_2+2,j129_1+1,j129_1+2,j129_2+1,j129_2+2,j129_3+1,j129_3+2,j129_4+1,j129_4+2,j130_1+1,j130_1+2,j146_1+1,j146_1+10,j146_1+2,j146_1+3,j146_1+4,j146_1+5,j146_1+6,j146_1+7,j146_1+8,j146_1+9,j146_2+1,j146_2+10,j146_2+2,j146_2+3,j146_2+4,j146_2+5,j146_2+6,j146_2+7,j146_2+8,j146_2+9,j146_3+1,j146_3+10,j146_3+2,j146_3+3,j146_3+4,j146_3+5,j146_3+6,j146_3+7,j146_3+8,j146_3+9,j146_4+1,j146_4+10,j146_4+2,j146_4+3,j146_4+4,j146_4+5,j146_4+6,j146_4+7,j146_4+8,j146_4+9,j147_1+1,j147_1+10,j147_1+2,j147_1+3,j147_1+4,j147_1+5,j147_1+6,j147_1+7,j147_1+8,j147_1+9,j147_2+1,j147_2+10,j147_2+2,j147_2+3,j147_2+4,j147_2+5,j147_2+6,j147_2+7,j147_2+8,j147_2+9,j147_3+1,j147_3+10,j147_3+2,j147_3+3,j147_3+4,j147_3+5,j147_3+6,j147_3+7,j147_3+8,j147_3+9,j147_4+1,j147_4+10,j147_4+2,j147_4+3,j147_4+4,j147_4+5,j147_4+6,j147_4+7,j147_4+8,j147_4+9,j148_1+1,j148_1+2,j81_1+1,j81_1+2,j83_1+1,j83_1+2,j84_1+1,j84_1+2,j90_1+1,j90_1+2,j90_1+3,j90_1+4,j93_1+1,j93_1+2,j93_1+3,j93_1+4,j93_2+1,j93_2+2,j93_2+3,j93_2+4,j93_3+1,j93_3+2,j93_3+3,j93_3+4,j93_4+1,j93_4+2,j93_4+3,j93_4+4,j96_1+1,j96_1+2,j96_2+1,j96_2+2,j97_1+1,j97_1+2,j97_2+1,j97_2+2,jp10_1+1,jp10_1+2,jp10_1+3,jp13_1+1,jp13_1+2,jp13_1+3,jp13_1+4,jp13_2+1,jp13_
2+2,jp13_2+3,jp13_2+4. 4 | f1: continuous scale=0.001. 5 | f2: continuous scale=0.001. 6 | f3: continuous scale=0.001. 7 | f4: continuous scale=0.001. 8 | f5: continuous scale=0.001. 9 | f6: continuous scale=0.001. 10 | f7: continuous scale=0.001. 11 | f8: continuous scale=0.001. 12 | f9: continuous scale=0.001. 13 | f10: continuous scale=0.001. 14 | f11: continuous scale=0.001. 15 | f12: continuous scale=0.001. 16 | f13: continuous scale=0.001. 17 | f14: continuous scale=0.001. 18 | f15: continuous scale=0.001. 19 | f16: continuous scale=0.001. 20 | f17: continuous scale=0.001. 21 | f18: continuous scale=0.001. 22 | f19: continuous scale=0.001. 23 | f20: continuous scale=0.001. 24 | f21: continuous scale=0.001. 25 | f22: continuous scale=0.001. 26 | f23: continuous scale=0.001. 27 | f24: continuous scale=0.001. 28 | f25: continuous scale=0.001. 29 | f26: continuous scale=0.001. 30 | f27: continuous scale=0.001. 31 | f28: continuous scale=0.001. 32 | f29: continuous scale=0.001. 33 | f30: continuous scale=0.001. 34 | f31: continuous scale=0.001. 35 | f32: continuous scale=0.001. 36 | f33: continuous scale=0.001. 37 | f34: continuous scale=0.001. 38 | f35: continuous scale=0.001. 39 | f36: continuous scale=0.001. 40 | f37: continuous scale=0.001. 41 | f38: continuous scale=0.001. 42 | f39: continuous scale=0.001. 43 | f40: continuous scale=0.001. 44 | f41: continuous scale=0.001. 45 | f42: continuous scale=0.001. 46 | f43: continuous scale=0.001. 47 | f44: continuous scale=0.001. 48 | f45: continuous scale=0.001. 49 | f46: continuous scale=0.001. 50 | f47: continuous scale=0.001. 51 | f48: continuous scale=0.001. 52 | f49: continuous scale=0.001. 53 | f50: continuous scale=0.001. 54 | f51: continuous scale=0.001. 55 | f52: continuous scale=0.001. 56 | f53: continuous scale=0.001. 57 | f54: continuous scale=0.001. 58 | f55: continuous scale=0.001. 59 | f56: continuous scale=0.001. 60 | f57: continuous scale=0.001. 61 | f58: continuous scale=0.001. 
62 | f59: continuous scale=0.001. 63 | f60: continuous scale=0.001. 64 | f61: continuous scale=0.001. 65 | f62: continuous scale=0.001. 66 | f63: continuous scale=0.001. 67 | f64: continuous scale=0.001. 68 | f65: continuous scale=0.001. 69 | f66: continuous scale=0.001. 70 | f67: continuous scale=0.001. 71 | f68: continuous scale=0.001. 72 | f69: continuous scale=0.001. 73 | f70: continuous scale=0.001. 74 | f71: continuous scale=0.001. 75 | f72: continuous scale=0.001. 76 | f73: continuous scale=0.001. 77 | f74: continuous scale=0.001. 78 | f75: continuous scale=0.001. 79 | f76: continuous scale=0.001. 80 | f77: continuous scale=0.001. 81 | f78: continuous scale=0.001. 82 | f79: continuous scale=0.001. 83 | f80: continuous scale=0.001. 84 | f81: continuous scale=0.001. 85 | f82: continuous scale=0.001. 86 | f83: continuous scale=0.001. 87 | f84: continuous scale=0.001. 88 | f85: continuous scale=0.001. 89 | f86: continuous scale=0.001. 90 | f87: continuous scale=0.001. 91 | f88: continuous scale=0.001. 92 | f89: continuous scale=0.001. 93 | f90: continuous scale=0.001. 94 | f91: continuous scale=0.001. 95 | f92: continuous scale=0.001. 96 | f93: continuous scale=0.001. 97 | f94: continuous scale=0.001. 98 | f95: continuous scale=0.001. 99 | f96: continuous scale=0.001. 100 | f97: continuous scale=0.001. 101 | f98: continuous scale=0.001. 102 | f99: continuous scale=0.001. 103 | f100: continuous scale=0.001. 104 | f101: continuous scale=0.001. 105 | f102: continuous scale=0.001. 106 | f103: continuous scale=0.001. 107 | f104: continuous scale=0.001. 108 | f105: continuous scale=0.001. 109 | f106: continuous scale=0.001. 110 | f107: continuous scale=0.001. 111 | f108: continuous scale=0.001. 112 | f109: continuous scale=0.001. 113 | f110: continuous scale=0.001. 114 | f111: continuous scale=0.001. 115 | f112: continuous scale=0.001. 116 | f113: continuous scale=0.001. 117 | f114: continuous scale=0.001. 118 | f115: continuous scale=0.001. 
119 | f116: continuous scale=0.001. 120 | f117: continuous scale=0.001. 121 | f118: continuous scale=0.001. 122 | f119: continuous scale=0.001. 123 | f120: continuous scale=0.001. 124 | f121: continuous scale=0.001. 125 | f122: continuous scale=0.001. 126 | f123: continuous scale=0.001. 127 | f124: continuous scale=0.001. 128 | f125: continuous scale=0.001. 129 | f126: continuous scale=0.001. 130 | f127: continuous scale=0.001. 131 | f128: continuous scale=0.001. 132 | f129: continuous scale=0.001. 133 | f130: continuous scale=0.001. 134 | f131: continuous scale=0.001. 135 | f132: continuous scale=0.001. 136 | f133: continuous scale=0.001. 137 | f134: continuous scale=0.001. 138 | f135: continuous scale=0.001. 139 | f136: continuous scale=0.001. 140 | f137: continuous scale=0.001. 141 | f138: continuous scale=0.001. 142 | f139: continuous scale=0.001. 143 | f140: continuous scale=0.001. 144 | f141: continuous scale=0.001. 145 | f142: continuous scale=0.001. 146 | f143: continuous scale=0.001. 147 | f144: continuous scale=0.001. 148 | f145: continuous scale=0.001. 149 | f146: continuous scale=0.001. 150 | f147: continuous scale=0.001. 151 | f148: continuous scale=0.001. 152 | f149: continuous scale=0.001. 153 | f150: continuous scale=0.001. 154 | f151: continuous scale=0.001. 155 | f152: continuous scale=0.001. 156 | f153: continuous scale=0.001. 157 | f154: continuous scale=0.001. 158 | f155: continuous scale=0.001. 159 | f156: continuous scale=0.001. 160 | f157: continuous scale=0.001. 161 | f158: continuous scale=0.001. 162 | f159: continuous scale=0.001. 163 | f160: continuous scale=0.001. 164 | f161: continuous scale=0.001. 165 | f162: continuous scale=0.001. 166 | f163: continuous scale=0.001. 167 | f164: continuous scale=0.001. 168 | f165: continuous scale=0.001. 169 | f166: continuous scale=0.001. 
170 | -------------------------------------------------------------------------------- /misvm/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | MISVM: An implementation of multiple-instance support vector machines 3 | 4 | The following algorithms are implemented: 5 | 6 | SVM : a standard supervised SVM 7 | SIL : trains a standard SVM classifier after applying bag labels to each 8 | instance 9 | MISVM : the MI-SVM algorithm of Andrews, Tsochantaridis, & Hofmann (2002) 10 | miSVM : the mi-SVM algorithm of Andrews, Tsochantaridis, & Hofmann (2002) 11 | NSK : the normalized set kernel of Gaertner, et al. (2002) 12 | STK : the statistics kernel of Gaertner, et al. (2002) 13 | MissSVM : the semi-supervised learning approach of Zhou & Xu (2007) 14 | MICA : the MI classification algorithm of Mangasarian & Wild (2008) 15 | sMIL : sparse MIL (Bunescu & Mooney, 2007) 16 | stMIL : sparse, transductive MIL (Bunescu & Mooney, 2007) 17 | sbMIL : sparse, balanced MIL (Bunescu & Mooney, 2007) 18 | """ 19 | __name__ = 'misvm' 20 | __version__ = '1.0' 21 | from misvm.svm import SVM 22 | from misvm.sil import SIL 23 | from misvm.nsk import NSK 24 | from misvm.smil import sMIL 25 | from misvm.mi_svm import miSVM, MISVM 26 | from misvm.stk import STK 27 | from misvm.stmil import stMIL 28 | from misvm.sbmil import sbMIL 29 | from misvm.mica import MICA 30 | from misvm.misssvm import MissSVM 31 | -------------------------------------------------------------------------------- /misvm/cccp.py: -------------------------------------------------------------------------------- 1 | """ 2 | Implements standard code for problems that 3 | require the Concave-Convex Procedure (CCCP), 4 | or similar iteration. 
5 | """ 6 | from __future__ import print_function, division 7 | from sys import stderr 8 | 9 | 10 | class CCCP(object): 11 | """ 12 | Encapsulates the CCCP 13 | """ 14 | TOLERANCE = 1e-6 15 | 16 | def __init__(self, verbose=True, max_iters=50, **kwargs): 17 | self.verbose = verbose 18 | self.max_iters = (max_iters + 1) 19 | self.kwargs = kwargs 20 | 21 | def mention(self, message): 22 | if self.verbose: 23 | print(message) 24 | 25 | def solve(self): 26 | """ 27 | Called to solve the CCCP problem 28 | """ 29 | for i in range(1, self.max_iters): 30 | self.mention('\nIteration %d...' % i) 31 | try: 32 | self.kwargs, solution = self.iterate(**self.kwargs) 33 | except Exception as e: 34 | if self.verbose: 35 | print('Warning: Bailing due to error: %s' % e, file=stderr) 36 | return self.bailout(**self.kwargs) 37 | if solution is not None: 38 | return solution 39 | 40 | if self.verbose: 41 | print('Warning: Max iterations exceeded', file=stderr) 42 | return self.bailout(**self.kwargs) 43 | 44 | def iterate(self, **kwargs): 45 | """ 46 | Should perform an iteration of the CCCP, 47 | using values in kwargs, and returning the 48 | kwargs for the next iteration. 49 | 50 | If the CCCP should terminate, also return the 51 | solution; otherwise, return 'None' 52 | """ 53 | pass 54 | 55 | def bailout(self, **kwargs): 56 | """ 57 | Return a solution in the case that the 58 | maximum allowed iterations was exceeded. 59 | """ 60 | pass 61 | 62 | def check_tolerance(self, last_obj, new_obj=0.0): 63 | """ 64 | Compares objective values, or takes the first 65 | value as delta if no second argument is given. 
66 | """ 67 | if last_obj is not None: 68 | delta_obj = abs(float(new_obj) - float(last_obj)) 69 | self.mention('delta obj ratio: %.2e' % (delta_obj / self.TOLERANCE)) 70 | return delta_obj < self.TOLERANCE 71 | return False 72 | -------------------------------------------------------------------------------- /misvm/kernel.py: -------------------------------------------------------------------------------- 1 | """ 2 | Contains various kernels for SVMs 3 | 4 | A kernel should take two arguments, 5 | each of which is a list of examples 6 | as rows of a numpy matrix 7 | """ 8 | from __future__ import print_function, division 9 | from numpy import matrix, vstack, hstack 10 | import numpy as np 11 | from scipy.spatial.distance import cdist 12 | from scipy.io import loadmat, savemat 13 | import math 14 | import os 15 | 16 | import hashlib 17 | import time 18 | 19 | CACHE_CUTOFF_T = 10 20 | CACHE_DIR = '.kernel_cache' 21 | 22 | from misvm.util import spdiag, slices 23 | 24 | 25 | def by_name(full_name, gamma=None, p=None, use_caching=False): 26 | parts = full_name.split('_') 27 | name = parts.pop(0) 28 | 29 | try: 30 | # Skip over a numeric part of the name; gamma and p are passed in separately 31 | float(parts[0]) 32 | parts.pop(0) 33 | except (ValueError, IndexError): 34 | pass 35 | 36 | if name == 'linear': 37 | kernel = linear 38 | elif name == 'quadratic': 39 | kernel = quadratic 40 | elif name == 'polynomial': 41 | kernel = polynomial(int(p)) 42 | elif name == 'rbf': 43 | kernel = rbf(gamma) 44 | else: 45 | raise ValueError('Unknown Kernel type %s' % name) 46 | 47 | try: 48 | # See if remaining part is a norm 49 | norm_name = parts.pop(0) 50 | if norm_name == 'fs': 51 | norm = featurespace_norm 52 | elif norm_name == 'av': 53 | norm = averaging_norm 54 | else: 55 | raise ValueError('Unknown norm %s' % norm_name) 56 | except IndexError: 57 | norm = no_norm 58 | 59 | kernel_function = set_kernel(kernel, norm) 60 | kernel_function.name = full_name 61 | if use_caching: 62 | kernel_function = cached_kernel(kernel_function) 63 | 
kernel_function.name = full_name 64 | return kernel_function 65 | 66 | 67 | def averaging_norm(x, *args): 68 | return float(x.shape[0]) 69 | 70 | 71 | def featurespace_norm(x, k): 72 | return math.sqrt(np.sum(k(x, x))) 73 | 74 | 75 | def no_norm(x, k): 76 | return 1.0 77 | 78 | 79 | def _hash_array(x): 80 | return hashlib.sha1(x).hexdigest() 81 | 82 | 83 | def cached_kernel(K): 84 | def cached_K(X, Y): 85 | if type(X) == list: 86 | x_hash = ''.join(map(_hash_array, X)) 87 | y_hash = ''.join(map(_hash_array, Y)) 88 | else: 89 | x_hash = _hash_array(X) 90 | y_hash = _hash_array(Y) 91 | full_hash = hashlib.sha1((x_hash + y_hash + K.name).encode('utf-8')).hexdigest() 92 | cache_file = os.path.join(CACHE_DIR, full_hash + '.mat') 93 | if os.path.exists(cache_file): 94 | print('Using cached result!') 95 | result = np.matrix(loadmat(cache_file)['k']) 96 | return result 97 | # Cache miss: compute the kernel and time it 98 | t0 = time.time() 99 | result = K(X, Y) 100 | tf = time.time() 101 | if (tf - t0) > CACHE_CUTOFF_T: 102 | print('Caching result...') 103 | if not os.path.exists(CACHE_DIR): 104 | os.mkdir(CACHE_DIR) 105 | savemat(cache_file, {'k': result}, oned_as='column') 106 | return result 107 | 108 | return cached_K 109 | 110 | 111 | def set_kernel(k, normalizer=no_norm): 112 | """ 113 | Decorator that makes a normalized 114 | set kernel out of a standard kernel k 115 | """ 116 | 117 | def K(X, Y): 118 | if type(X) == list: 119 | norm = lambda x: normalizer(x, k) 120 | x_norm = matrix(list(map(norm, X))) 121 | if id(X) == id(Y): 122 | # Optimize for symmetric case 123 | norms = x_norm.T * x_norm 124 | if all(len(bag) == 1 for bag in X): 125 | # Optimize for singleton bags 126 | instX = vstack(X) 127 | raw_kernel = k(instX, instX) 128 | else: 129 | # Only need to compute half of 130 | # the matrix if it's symmetric 131 | upper = matrix([i * [0] + [np.sum(k(x, y)) 132 | for y in Y[i:]] 133 | for i, x in enumerate(X, 1)]) 134 | diag = np.array([np.sum(k(x, x)) for x in X]) 135 | raw_kernel = upper + upper.T + 
spdiag(diag) 136 | else: 137 | y_norm = matrix(list(map(norm, Y))) 138 | norms = x_norm.T * y_norm 139 | raw_kernel = k(vstack(X), vstack(Y)) 140 | lensX = list(map(len, X)) 141 | lensY = list(map(len, Y)) 142 | if any(l != 1 for l in lensX): 143 | raw_kernel = vstack([np.sum(raw_kernel[i:j, :], axis=0) 144 | for i, j in slices(lensX)]) 145 | if any(l != 1 for l in lensY): 146 | raw_kernel = hstack([np.sum(raw_kernel[:, i:j], axis=1) 147 | for i, j in slices(lensY)]) 148 | return np.divide(raw_kernel, norms) 149 | else: 150 | return k(X, Y) 151 | 152 | return K 153 | 154 | 155 | def linear(x, y): 156 | """Linear kernel x'*y""" 157 | return x * y.T 158 | 159 | 160 | def quadratic(x, y): 161 | """Quadratic kernel (1 + x'*y)^2""" 162 | return np.square(1e0 + x * y.T) 163 | 164 | 165 | def polynomial(p): 166 | """General polynomial kernel (1 + x'*y)^p""" 167 | 168 | def p_kernel(x, y): 169 | return np.power(1e0 + x * y.T, p) 170 | 171 | return p_kernel 172 | 173 | 174 | def rbf(gamma): 175 | """Radial Basis Function""" 176 | 177 | def rbf_kernel(x, y): 178 | return matrix(np.exp(-gamma * cdist(x, y, 'sqeuclidean'))) 179 | 180 | return rbf_kernel 181 | -------------------------------------------------------------------------------- /misvm/mi_svm.py: -------------------------------------------------------------------------------- 1 | """ 2 | Implements mi-SVM and MI-SVM 3 | """ 4 | from __future__ import print_function, division 5 | import numpy as np 6 | from random import uniform 7 | from cvxopt import matrix as cvxmat, sparse 8 | import inspect 9 | from misvm.sil import SIL 10 | from misvm.svm import SVM 11 | from misvm.cccp import CCCP 12 | from misvm.quadprog import IterativeQP, spzeros, speye 13 | from misvm.kernel import by_name as kernel_by_name 14 | from misvm.util import partition, BagSplitter, spdiag, rand_convex, slices 15 | from scipy.sparse import issparse 16 | 17 | 18 | 19 | class MISVM(SIL): 20 | """ 21 | The MI-SVM approach of Andrews, 
Tsochantaridis, & Hofmann (2002) 22 | """ 23 | 24 | def __init__(self, restarts=0, max_iters=50, **kwargs): 25 | """ 26 | @param kernel : the desired kernel function; can be linear, quadratic, 27 | polynomial, or rbf [default: linear] 28 | @param C : the loss/regularization tradeoff constant [default: 1.0] 29 | @param scale_C : if True [default], scale C by the number of examples 30 | @param p : polynomial degree when a 'polynomial' kernel is used 31 | [default: 3] 32 | @param gamma : RBF scale parameter when an 'rbf' kernel is used 33 | [default: 1.0] 34 | @param verbose : print optimization status messages [default: True] 35 | @param sv_cutoff : the numerical cutoff for an example to be considered 36 | a support vector [default: 1e-7] 37 | @param restarts : the number of random restarts [default: 0] 38 | @param max_iters : the maximum number of iterations in the outer loop of 39 | the optimization procedure [default: 50] 40 | """ 41 | self.restarts = restarts 42 | self.max_iters = max_iters 43 | super(MISVM, self).__init__(**kwargs) 44 | 45 | def fit(self, bags, y): 46 | """ 47 | @param bags : a sequence of n bags; each bag is an m-by-k array-like 48 | object containing m instances with k features 49 | @param y : an array-like object of length n containing -1/+1 labels 50 | """ 51 | def transform(mx): 52 | """ 53 | Convert array-like objects into np.matrix; 54 | scipy.sparse matrices are densified first 55 | """ 56 | if issparse(mx): 57 | return mx.todense() 58 | return np.asmatrix(mx) 59 | 60 | self._bags = [transform(bag) for bag in bags] 61 | y = np.asmatrix(y).reshape((-1, 1)) 62 | 63 | bs = BagSplitter(self._bags, y) 64 | best_obj = float('inf') 65 | best_svm = None 66 | for rr in range(self.restarts + 1): 67 | if rr == 0: 68 | if self.verbose: 69 | print('Non-random start...') 70 | pos_bag_avgs = np.vstack([np.average(bag, axis=0) for bag in bs.pos_bags]) 71 | else: 72 | if self.verbose: 73 | print('Random restart %d of %d...' 
% (rr, self.restarts)) 74 | pos_bag_avgs = np.vstack([rand_convex(len(bag)) * bag for bag in bs.pos_bags]) 75 | 76 | initial_instances = np.vstack([bs.neg_instances, pos_bag_avgs]) 77 | classes = np.vstack([-np.ones((bs.L_n, 1)), 78 | np.ones((bs.X_p, 1))]) 79 | 80 | # Setup SVM and QP 81 | if self.scale_C: 82 | C = self.C / float(len(initial_instances)) 83 | else: 84 | C = self.C 85 | setup = self._setup_svm(initial_instances, classes, C) 86 | K = setup[0] 87 | qp = IterativeQP(*setup[1:]) 88 | 89 | # Fix Gx <= h 90 | neg_cons = spzeros(bs.X_n, bs.L_n) 91 | for b, (l, u) in enumerate(slices(bs.neg_groups)): 92 | neg_cons[b, l:u] = 1.0 93 | pos_cons = speye(bs.X_p) 94 | bot_left = spzeros(bs.X_p, bs.L_n) 95 | top_right = spzeros(bs.X_n, bs.X_p) 96 | half_cons = sparse([[neg_cons, bot_left], 97 | [top_right, pos_cons]]) 98 | qp.G = sparse([-speye(bs.X_p + bs.L_n), half_cons]) 99 | qp.h = cvxmat(np.vstack([np.zeros((bs.X_p + bs.L_n, 1)), 100 | C * np.ones((bs.X_p + bs.X_n, 1))])) 101 | 102 | # Precompute the kernel between all instances 103 | kernel = kernel_by_name(self.kernel, gamma=self.gamma, p=self.p) 104 | K_all = kernel(bs.instances, bs.instances) 105 | 106 | neg_selectors = np.array(range(bs.L_n)) 107 | 108 | class MISVMCCCP(CCCP): 109 | 110 | def bailout(cself, svm, selectors, instances, K): 111 | return svm 112 | 113 | def iterate(cself, svm, selectors, instances, K): 114 | cself.mention('Training SVM...') 115 | alphas, obj = qp.solve(cself.verbose) 116 | 117 | # Construct SVM from solution 118 | svm = SVM(kernel=self.kernel, gamma=self.gamma, p=self.p, 119 | verbose=self.verbose, sv_cutoff=self.sv_cutoff) 120 | svm._X = instances 121 | svm._y = classes 122 | svm._alphas = alphas 123 | svm._objective = obj 124 | svm._compute_separator(K) 125 | svm._K = K 126 | 127 | cself.mention('Recomputing classes...') 128 | p_confs = svm.predict(bs.pos_instances) 129 | pos_selectors = bs.L_n + np.array([l + np.argmax(p_confs[l:u]) 130 | for l, u in slices(bs.pos_groups)]) 
 131 | new_selectors = np.hstack([neg_selectors, pos_selectors]) 132 | 133 | if selectors is None: 134 | sel_diff = len(new_selectors) 135 | else: 136 | sel_diff = np.nonzero(new_selectors - selectors)[0].size 137 | 138 | cself.mention('Selector differences: %d' % sel_diff) 139 | if sel_diff == 0: 140 | return None, svm 141 | elif sel_diff > 5: 142 | # Clear results to avoid a 143 | # bad starting point in 144 | # the next iteration 145 | qp.clear_results() 146 | 147 | cself.mention('Updating QP...') 148 | indices = (new_selectors,) 149 | K = K_all[indices].T[indices].T 150 | D = spdiag(classes) 151 | qp.update_H(D * K * D) 152 | return {'svm': svm, 'selectors': new_selectors, 153 | 'instances': bs.instances[indices], 'K': K}, None 154 | 155 | cccp = MISVMCCCP(verbose=self.verbose, svm=None, selectors=None, 156 | instances=initial_instances, K=K, max_iters=self.max_iters) 157 | svm = cccp.solve() 158 | if svm is not None: 159 | obj = float(svm._objective) 160 | if obj < best_obj: 161 | best_svm = svm 162 | best_obj = obj 163 | 164 | if best_svm is not None: 165 | self._X = best_svm._X 166 | self._y = best_svm._y 167 | self._alphas = best_svm._alphas 168 | self._objective = best_svm._objective 169 | self._compute_separator(best_svm._K) 170 | 171 | def _compute_separator(self, K): 172 | super(SIL, self)._compute_separator(K) 173 | self._bag_predictions = self.predict(self._bags) 174 | 175 | def get_params(self, deep=True): 176 | super_args = super(MISVM, self).get_params() 177 | args, _, _, _ = inspect.getargspec(self.__init__) 178 | args.pop(0) 179 | super_args.update({key: getattr(self, key, None) for key in args}) 180 | return super_args 181 | 182 | 183 | class miSVM(SIL): 184 | """ 185 | The mi-SVM approach of Andrews, Tsochantaridis, & Hofmann (2002) 186 | """ 187 | 188 | def __init__(self, *args, **kwargs): 189 | """ 190 | @param kernel : the desired kernel function; can be linear, quadratic, 191 | polynomial, or rbf [default: linear] 192 | @param C : the 
loss/regularization tradeoff constant [default: 1.0] 193 | @param scale_C : if True [default], scale C by the number of examples 194 | @param p : polynomial degree when a 'polynomial' kernel is used 195 | [default: 3] 196 | @param gamma : RBF scale parameter when an 'rbf' kernel is used 197 | [default: 1.0] 198 | @param verbose : print optimization status messages [default: True] 199 | @param sv_cutoff : the numerical cutoff for an example to be considered 200 | a support vector [default: 1e-7] 201 | @param restarts : the number of random restarts [default: 0] 202 | @param max_iters : the maximum number of iterations in the outer loop of 203 | the optimization procedure [default: 50] 204 | """ 205 | self.restarts = kwargs.pop('restarts', 0) 206 | self.max_iters = kwargs.pop('max_iters', 50) 207 | super(miSVM, self).__init__(*args, **kwargs) 208 | 209 | def fit(self, bags, y): 210 | """ 211 | @param bags : a sequence of n bags; each bag is an m-by-k array-like 212 | object containing m instances with k features 213 | @param y : an array-like object of length n containing -1/+1 labels 214 | """ 215 | self._bags = [np.asmatrix(bag) for bag in bags] 216 | y = np.asmatrix(y).reshape((-1, 1)) 217 | 218 | bs = BagSplitter(self._bags, y) 219 | best_obj = float('inf') 220 | best_svm = None 221 | for rr in range(self.restarts + 1): 222 | if rr == 0: 223 | if self.verbose: 224 | print('Non-random start...') 225 | initial_classes = np.vstack([-np.ones((bs.L_n, 1)), 226 | np.ones((bs.L_p, 1))]) 227 | else: 228 | if self.verbose: 229 | print('Random restart %d of %d...' 
% (rr, self.restarts)) 230 | rand_classes = np.matrix([np.sign([uniform(-1.0, 1.0) 231 | for i in range(bs.L_p)])]).T 232 | initial_classes = np.vstack([-np.ones((bs.L_n, 1)), 233 | rand_classes]) 234 | initial_classes[np.nonzero(initial_classes == 0.0)] = 1.0 235 | 236 | # Setup SVM and QP 237 | if self.scale_C: 238 | C = self.C / float(len(bs.instances)) 239 | else: 240 | C = self.C 241 | setup = self._setup_svm(bs.instances, initial_classes, C) 242 | K = setup[0] 243 | qp = IterativeQP(*setup[1:]) 244 | 245 | class miSVMCCCP(CCCP): 246 | 247 | def bailout(cself, svm, classes): 248 | return svm 249 | 250 | def iterate(cself, svm, classes): 251 | cself.mention('Training SVM...') 252 | D = spdiag(classes) 253 | qp.update_H(D * K * D) 254 | qp.update_Aeq(classes.T) 255 | alphas, obj = qp.solve(cself.verbose) 256 | 257 | # Construct SVM from solution 258 | svm = SVM(kernel=self.kernel, gamma=self.gamma, p=self.p, 259 | verbose=self.verbose, sv_cutoff=self.sv_cutoff) 260 | svm._X = bs.instances 261 | svm._y = classes 262 | svm._alphas = alphas 263 | svm._objective = obj 264 | svm._compute_separator(K) 265 | svm._K = K 266 | 267 | cself.mention('Recomputing classes...') 268 | p_conf = svm._predictions[-bs.L_p:] 269 | pos_classes = np.vstack([_update_classes(part) 270 | for part in 271 | partition(p_conf, bs.pos_groups)]) 272 | new_classes = np.vstack([-np.ones((bs.L_n, 1)), pos_classes]) 273 | 274 | class_changes = round(np.sum(np.abs(classes - new_classes) / 2)) 275 | cself.mention('Class Changes: %d' % class_changes) 276 | if class_changes == 0: 277 | return None, svm 278 | 279 | return {'svm': svm, 'classes': new_classes}, None 280 | 281 | cccp = miSVMCCCP(verbose=self.verbose, svm=None, 282 | classes=initial_classes, max_iters=self.max_iters) 283 | svm = cccp.solve() 284 | if svm is not None: 285 | obj = float(svm._objective) 286 | if obj < best_obj: 287 | best_svm = svm 288 | best_obj = obj 289 | 290 | if best_svm is not None: 291 | self._X = best_svm._X 292 | 
self._y = best_svm._y 293 | self._alphas = best_svm._alphas 294 | self._objective = best_svm._objective 295 | self._compute_separator(best_svm._K) 296 | 297 | def get_params(self, deep=True): 298 | super_args = super(miSVM, self).get_params() 299 | args, _, _, _ = inspect.getargspec(self.__init__) 300 | args.pop(0) 301 | super_args.update({key: getattr(self, key, None) for key in args}) 302 | return super_args 303 | 304 | 305 | def _update_classes(x): 306 | classes = np.sign(x) 307 | # If classification happened to 308 | # be zero, make it 1.0 309 | classes[np.nonzero(classes == 0.0)] = 1.0 310 | # Guarantee that at least one 311 | # instance is positive 312 | classes[np.argmax(x)] = 1.0 313 | return classes.reshape((-1, 1)) 314 | -------------------------------------------------------------------------------- /misvm/mica.py: -------------------------------------------------------------------------------- 1 | """ 2 | Implements the MICA algorithm 3 | """ 4 | from __future__ import print_function, division 5 | import sys 6 | import numpy as np 7 | import scipy.sparse as sp 8 | from cvxopt import matrix as cvxmat, sparse, spmatrix 9 | from cvxopt.solvers import lp 10 | import inspect 11 | from misvm.quadprog import IterativeQP, spzeros as spz, speye as spI, _apply_options 12 | from misvm.util import spdiag, BagSplitter, slices, rand_convex 13 | from misvm.kernel import by_name as kernel_by_name 14 | from misvm.svm import SVM 15 | from misvm.cccp import CCCP 16 | 17 | 18 | class MICA(SVM): 19 | """ 20 | The MICA approach of Mangasarian & Wild (2008) 21 | """ 22 | 23 | def __init__(self, regularization='L2', restarts=0, max_iters=50, **kwargs): 24 | """ 25 | @param kernel : the desired kernel function; can be linear, quadratic, 26 | polynomial, or rbf [default: linear] 27 | @param C : the loss/regularization tradeoff constant [default: 1.0] 28 | @param scale_C : if True [default], scale C by the number of examples 29 | @param p : polynomial degree when a 'polynomial' 
kernel is used 30 | [default: 3] 31 | @param gamma : RBF scale parameter when an 'rbf' kernel is used 32 | [default: 1.0] 33 | @param verbose : print optimization status messages [default: True] 34 | @param sv_cutoff : the numerical cutoff for an example to be considered 35 | a support vector [default: 1e-7] 36 | @param restarts : the number of random restarts [default: 0] 37 | @param max_iters : the maximum number of iterations in the outer loop of 38 | the optimization procedure [default: 50] 39 | @param regularization : currently only L2 regularization is implemented 40 | """ 41 | self.regularization = regularization 42 | if not self.regularization in ('L2',): 43 | raise ValueError('Invalid regularization "%s"' 44 | % self.regularization) 45 | self.restarts = restarts 46 | self.max_iters = max_iters 47 | super(MICA, self).__init__(**kwargs) 48 | self._bags = None 49 | self._sv_bags = None 50 | self._bag_predictions = None 51 | 52 | def fit(self, bags, y): 53 | """ 54 | @param bags : a sequence of n bags; each bag is an m-by-k array-like 55 | object containing m instances with k features 56 | @param y : an array-like object of length n containing -1/+1 labels 57 | """ 58 | self._bags = list(map(np.asmatrix, bags)) 59 | bs = BagSplitter(self._bags, 60 | np.asmatrix(y).reshape((-1, 1))) 61 | self._X = bs.instances 62 | Ln = bs.L_n 63 | Lp = bs.L_p 64 | Xp = bs.X_p 65 | m = Ln + Xp 66 | if self.scale_C: 67 | C = self.C / float(len(self._bags)) 68 | else: 69 | C = self.C 70 | 71 | K = kernel_by_name(self.kernel, gamma=self.gamma, p=self.p)(self._X, self._X) 72 | new_classes = np.matrix(np.vstack([-np.ones((Ln, 1)), 73 | np.ones((Xp, 1))])) 74 | self._y = new_classes 75 | D = spdiag(new_classes) 76 | setup = list(self._setup_svm(new_classes, new_classes, C))[1:] 77 | setup[0] = np.matrix([0]) 78 | qp = IterativeQP(*setup) 79 | 80 | c = cvxmat(np.hstack([np.zeros(Lp + 1), 81 | np.ones(Xp + Ln)])) 82 | b = cvxmat(np.ones((Xp, 1))) 83 | A = spz(Xp, Lp + 1 + Xp + Ln) 84 | 
for row, (i, j) in enumerate(slices(bs.pos_groups)): 85 | A[row, i:j] = 1.0 86 | 87 | bottom_left = sparse(t([[-spI(Lp), spz(Lp)], 88 | [spz(m, Lp), spz(m)]])) 89 | bottom_right = sparse([spz(Lp, m), -spI(m)]) 90 | inst_cons = sparse(t([[spz(Xp, Lp), -spo(Xp)], 91 | [spz(Ln, Lp), spo(Ln)]])) 92 | G = sparse(t([[inst_cons, -spI(m)], 93 | [bottom_left, bottom_right]])) 94 | h = cvxmat(np.vstack([-np.ones((Xp, 1)), 95 | np.zeros((Ln + Lp + m, 1))])) 96 | 97 | def to_V(upsilon): 98 | bot = np.zeros((Xp, Lp)) 99 | for row, (i, j) in enumerate(slices(bs.pos_groups)): 100 | bot[row, i:j] = upsilon.flat[i:j] 101 | return sp.bmat([[sp.eye(Ln, Ln), None], 102 | [None, sp.coo_matrix(bot)]]) 103 | 104 | class MICACCCP(CCCP): 105 | 106 | def bailout(cself, alphas, upsilon, svm): 107 | return svm 108 | 109 | def iterate(cself, alphas, upsilon, svm): 110 | V = to_V(upsilon) 111 | cself.mention('Update QP...') 112 | qp.update_H(D * V * K * V.T * D) 113 | cself.mention('Solve QP...') 114 | alphas, obj = qp.solve(self.verbose) 115 | svm = MICA(kernel=self.kernel, gamma=self.gamma, p=self.p, 116 | verbose=self.verbose, sv_cutoff=self.sv_cutoff) 117 | svm._X = self._X 118 | svm._y = self._y 119 | svm._V = V 120 | svm._alphas = alphas 121 | svm._objective = obj 122 | svm._compute_separator(K) 123 | svm._K = K 124 | 125 | cself.mention('Update LP...') 126 | for row, (i, j) in enumerate(slices(bs.pos_groups)): 127 | G[row, i:j] = cvxmat(-svm._dotprods[Ln + i: Ln + j].T) 128 | h[Xp: Xp + Ln] = cvxmat(-(1 + svm._dotprods[:Ln])) 129 | 130 | cself.mention('Solve LP...') 131 | sol, _ = linprog(c, G, h, A, b, verbose=self.verbose) 132 | new_upsilon = sol[:Lp] 133 | 134 | if cself.check_tolerance(np.linalg.norm(upsilon - new_upsilon)): 135 | return None, svm 136 | 137 | return {'alphas': alphas, 'upsilon': new_upsilon, 'svm': svm}, None 138 | 139 | best_obj = float('inf') 140 | best_svm = None 141 | for rr in range(self.restarts + 1): 142 | if rr == 0: 143 | if self.verbose: 144 | 
print('Non-random start...') 145 | upsilon0 = np.matrix(np.vstack([np.ones((size, 1)) / float(size) 146 | for size in bs.pos_groups])) 147 | else: 148 | if self.verbose: 149 | print('Random restart %d of %d...' % (rr, self.restarts)) 150 | upsilon0 = np.matrix(np.vstack([rand_convex(size).T 151 | for size in bs.pos_groups])) 152 | cccp = MICACCCP(verbose=self.verbose, alphas=None, upsilon=upsilon0, 153 | svm=None, max_iters=self.max_iters) 154 | svm = cccp.solve() 155 | if svm is not None: 156 | obj = float(svm._objective) 157 | if obj < best_obj: 158 | best_svm = svm 159 | best_obj = obj 160 | 161 | if best_svm is not None: 162 | self._V = best_svm._V 163 | self._alphas = best_svm._alphas 164 | self._objective = best_svm._objective 165 | self._compute_separator(best_svm._K) 166 | self._bag_predictions = self.predict(self._bags) 167 | 168 | def _compute_separator(self, K): 169 | sv = (self._alphas.flat > self.sv_cutoff) 170 | 171 | D = spdiag(self._y) 172 | self._b = (np.sum(D * sv) - np.sum(self._alphas.T * D * self._V * K)) / np.sum(sv) 173 | self._dotprods = (self._alphas.T * D * self._V * K).T 174 | self._predictions = self._b + self._dotprods 175 | 176 | def predict(self, bags): 177 | """ 178 | @param bags : a sequence of n bags; each bag is an m-by-k array-like 179 | object containing m instances with k features 180 | @return : an array of length n containing real-valued label predictions 181 | (threshold at zero to produce binary predictions) 182 | """ 183 | if self._b is None: 184 | return np.zeros(len(bags)) 185 | else: 186 | bags = [np.asmatrix(bag) for bag in bags] 187 | k = kernel_by_name(self.kernel, p=self.p, gamma=self.gamma) 188 | D = spdiag(self._y) 189 | return np.array([np.max(self._b + self._alphas.T * D * self._V * 190 | k(self._X, bag)) 191 | for bag in bags]) 192 | 193 | def get_params(self, deep=True): 194 | """ 195 | return params 196 | """ 197 | super_args, _, _, _ = inspect.getargspec(super(MICA, self).__init__) 198 | args, _, _, _ = 
inspect.getargspec(MICA.__init__) 199 | args.pop(0) 200 | super_args.pop(0) 201 | args += super_args 202 | return {key: getattr(self, key, None) for key in args} 203 | 204 | 205 | def linprog(*args, **kwargs): 206 | verbose = kwargs.get('verbose', False) 207 | # Save settings and set verbosity 208 | old_settings = _apply_options({'show_progress': verbose}) 209 | 210 | # Optimize 211 | results = lp(*args, solver='glpk') 212 | 213 | # Restore settings 214 | _apply_options(old_settings) 215 | 216 | # Check return status 217 | status = results['status'] 218 | if not status == 'optimal': 219 | print('Warning: termination of lp with status: %s' 220 | % status, file=sys.stderr) 221 | 222 | # Convert back to NumPy matrix 223 | # and return solution 224 | xstar = results['x'] 225 | return np.matrix(xstar), results['primal objective'] 226 | 227 | 228 | def spo(r, v=1.0): 229 | """Create a sparse one vector""" 230 | return spmatrix(v, range(r), r * [0]) 231 | 232 | 233 | def t(list_of_lists): 234 | """ 235 | Transpose a list of lists, since 'sparse' 236 | takes arguments in column-major order. 
237 | """ 238 | return list(map(list, zip(*list_of_lists))) 239 | -------------------------------------------------------------------------------- /misvm/misssvm.py: -------------------------------------------------------------------------------- 1 | """ 2 | Implements MissSVM 3 | """ 4 | from __future__ import print_function, division 5 | import numpy as np 6 | import scipy.sparse as sp 7 | from random import uniform 8 | import inspect 9 | from misvm.quadprog import IterativeQP, Objective 10 | from misvm.util import BagSplitter, spdiag, slices 11 | from misvm.kernel import by_name as kernel_by_name 12 | from misvm.mica import MICA 13 | from misvm.cccp import CCCP 14 | 15 | 16 | class MissSVM(MICA): 17 | """ 18 | Semi-supervised learning applied to MI data (Zhou & Xu 2007) 19 | """ 20 | 21 | def __init__(self, alpha=1e4, **kwargs): 22 | """ 23 | @param kernel : the desired kernel function; can be linear, quadratic, 24 | polynomial, or rbf [default: linear] 25 | @param C : the loss/regularization tradeoff constant [default: 1.0] 26 | @param scale_C : if True [default], scale C by the number of examples 27 | @param p : polynomial degree when a 'polynomial' kernel is used 28 | [default: 3] 29 | @param gamma : RBF scale parameter when an 'rbf' kernel is used 30 | [default: 1.0] 31 | @param verbose : print optimization status messages [default: True] 32 | @param sv_cutoff : the numerical cutoff for an example to be considered 33 | a support vector [default: 1e-7] 34 | @param restarts : the number of random restarts [default: 0] 35 | @param max_iters : the maximum number of iterations in the outer loop of 36 | the optimization procedure [default: 50] 37 | @param alpha : the softmax parameter [default: 1e4] 38 | """ 39 | self.alpha = alpha 40 | super(MissSVM, self).__init__(**kwargs) 41 | self._bags = None 42 | self._sv_bags = None 43 | self._bag_predictions = None 44 | 45 | def fit(self, bags, y): 46 | """ 47 | @param bags : a sequence of n bags; each bag is an m-by-k 
array-like 48 | object containing m instances with k features 49 | @param y : an array-like object of length n containing -1/+1 labels 50 | """ 51 | self._bags = list(map(np.asmatrix, bags)) 52 | bs = BagSplitter(self._bags, 53 | np.asmatrix(y).reshape((-1, 1))) 54 | self._X = np.vstack([bs.pos_instances, 55 | bs.pos_instances, 56 | bs.pos_instances, 57 | bs.neg_instances]) 58 | self._y = np.vstack([np.matrix(np.ones((bs.X_p + bs.L_p, 1))), 59 | -np.matrix(np.ones((bs.L_p + bs.L_n, 1)))]) 60 | if self.scale_C: 61 | C = self.C / float(len(self._bags)) 62 | else: 63 | C = self.C 64 | 65 | # Setup SVM and adjust constraints 66 | _, _, f, A, b, lb, ub = self._setup_svm(self._y, self._y, C) 67 | ub[:bs.X_p] *= (float(bs.L_n) / float(bs.X_p)) 68 | ub[bs.X_p: bs.X_p + 2 * bs.L_p] *= (float(bs.L_n) / float(bs.L_p)) 69 | K = kernel_by_name(self.kernel, gamma=self.gamma, p=self.p)(self._X, self._X) 70 | D = spdiag(self._y) 71 | ub0 = np.matrix(ub) 72 | ub0[bs.X_p: bs.X_p + 2 * bs.L_p] *= 0.5 73 | 74 | def get_V(pos_classifications): 75 | eye_n = bs.L_n + 2 * bs.L_p 76 | top = np.zeros((bs.X_p, bs.L_p)) 77 | for row, (i, j) in enumerate(slices(bs.pos_groups)): 78 | top[row, i:j] = _grad_softmin(-pos_classifications[i:j], self.alpha).flat 79 | return sp.bmat([[sp.coo_matrix(top), None], 80 | [None, sp.eye(eye_n, eye_n)]]) 81 | 82 | V0 = get_V(np.matrix(np.zeros((bs.L_p, 1)))) 83 | 84 | qp = IterativeQP(D * V0 * K * V0.T * D, f, A, b, lb, ub0) 85 | 86 | best_obj = float('inf') 87 | best_svm = None 88 | for rr in range(self.restarts + 1): 89 | if rr == 0: 90 | if self.verbose: 91 | print('Non-random start...') 92 | # Train on instances 93 | alphas, obj = qp.solve(self.verbose) 94 | else: 95 | if self.verbose: 96 | print('Random restart %d of %d...' 
% (rr, self.restarts)) 97 | alphas = np.matrix([uniform(0.0, 1.0) for i in range(len(lb))]).T 98 | obj = Objective(0.0, 0.0) 99 | svm = MICA(kernel=self.kernel, gamma=self.gamma, p=self.p, 100 | verbose=self.verbose, sv_cutoff=self.sv_cutoff) 101 | svm._X = self._X 102 | svm._y = self._y 103 | svm._V = V0 104 | svm._alphas = alphas 105 | svm._objective = obj 106 | svm._compute_separator(K) 107 | svm._K = K 108 | 109 | class missCCCP(CCCP): 110 | 111 | def bailout(cself, svm, obj_val): 112 | return svm 113 | 114 | def iterate(cself, svm, obj_val): 115 | cself.mention('Linearizing constraints...') 116 | classifications = svm._predictions[bs.X_p: bs.X_p + bs.L_p] 117 | V = get_V(classifications) 118 | 119 | cself.mention('Computing slacks...') 120 | # Difference is [1 - y_i*(w*phi(x_i) + b)] 121 | pos_differences = 1.0 - classifications 122 | neg_differences = 1.0 + classifications 123 | # Slacks are positive differences only 124 | pos_slacks = np.multiply(pos_differences > 0, pos_differences) 125 | neg_slacks = np.multiply(neg_differences > 0, neg_differences) 126 | all_slacks = np.hstack([pos_slacks, neg_slacks]) 127 | 128 | cself.mention('Linearizing...') 129 | # Compute gradient across pairs 130 | slack_grads = np.vstack([_grad_softmin(pair, self.alpha) 131 | for pair in all_slacks]) 132 | # Stack results into one column 133 | slack_grads = np.vstack([np.ones((bs.X_p, 1)), 134 | slack_grads[:, 0], 135 | slack_grads[:, 1], 136 | np.ones((bs.L_n, 1))]) 137 | # Update QP 138 | qp.update_H(D * V * K * V.T * D) 139 | qp.update_ub(np.multiply(ub, slack_grads)) 140 | 141 | # Re-solve 142 | cself.mention('Solving QP...') 143 | alphas, obj = qp.solve(self.verbose) 144 | new_svm = MICA(kernel=self.kernel, gamma=self.gamma, p=self.p, 145 | verbose=self.verbose, sv_cutoff=self.sv_cutoff) 146 | new_svm._X = self._X 147 | new_svm._y = self._y 148 | new_svm._V = V 149 | new_svm._alphas = alphas 150 | new_svm._objective = obj 151 | new_svm._compute_separator(K) 152 | new_svm._K = 
K 153 | 154 | if cself.check_tolerance(obj_val, obj): 155 | return None, new_svm 156 | 157 | return {'svm': new_svm, 'obj_val': obj}, None 158 | 159 | cccp = missCCCP(verbose=self.verbose, svm=svm, obj_val=None, 160 | max_iters=self.max_iters) 161 | svm = cccp.solve() 162 | if svm is not None: 163 | obj = float(svm._objective) 164 | if obj < best_obj: 165 | best_svm = svm 166 | best_obj = obj 167 | 168 | if best_svm is not None: 169 | self._V = best_svm._V 170 | self._alphas = best_svm._alphas 171 | self._objective = best_svm._objective 172 | self._compute_separator(best_svm._K) 173 | self._bag_predictions = self.predict(self._bags) 174 | 175 | def get_params(self, deep=True): 176 | super_args = super(MissSVM, self).get_params() 177 | args, _, _, _ = inspect.getargspec(MissSVM.__init__) 178 | args.pop(0) 179 | super_args.update({key: getattr(self, key, None) for key in args}) 180 | return super_args 181 | 182 | 183 | def _grad_softmin(x, alpha=1e4): 184 | """ 185 | Computes the gradient of min function, 186 | taken from gradient of softmin as 187 | alpha goes to infinity. It is: 188 | 0 if x_i != min(x), or 189 | 1/n if x_i is one of the n 190 | elements equal to min(x) 191 | """ 192 | grad = np.matrix(np.zeros(x.shape)) 193 | minimizers = (x == min(x.flat)) 194 | n = float(np.sum(minimizers)) 195 | grad[np.nonzero(minimizers)] = 1.0 / n 196 | return grad 197 | -------------------------------------------------------------------------------- /misvm/nsk.py: -------------------------------------------------------------------------------- 1 | """ 2 | Implements the Normalized Set Kernel 3 | of Gartner et al. 4 | """ 5 | from __future__ import print_function, division 6 | import numpy as np 7 | import inspect 8 | from misvm.quadprog import quadprog 9 | from misvm.kernel import by_name as kernel_by_name 10 | from misvm.util import spdiag 11 | from misvm.svm import SVM 12 | 13 | 14 | 15 | 16 | class NSK(SVM): 17 | """ 18 | Normalized set kernel of Gaertner, et al. 
(2002) 19 | """ 20 | 21 | def __init__(self, **kwargs): 22 | """ 23 | @param kernel : the desired kernel function; can be linear, quadratic, 24 | polynomial, or rbf [default: linear] 25 | (by default, no normalization is used; to use averaging 26 | or feature space normalization, append either '_av' or 27 | '_fs' to the kernel name, as in 'rbf_av') 28 | @param C : the loss/regularization tradeoff constant [default: 1.0] 29 | @param scale_C : if True [default], scale C by the number of examples 30 | @param p : polynomial degree when a 'polynomial' kernel is used 31 | [default: 3] 32 | @param gamma : RBF scale parameter when an 'rbf' kernel is used 33 | [default: 1.0] 34 | @param verbose : print optimization status messages [default: True] 35 | @param sv_cutoff : the numerical cutoff for an example to be considered 36 | a support vector [default: 1e-7] 37 | """ 38 | super(NSK, self).__init__(**kwargs) 39 | self._bags = None 40 | self._sv_bags = None 41 | self._bag_predictions = None 42 | 43 | def fit(self, bags, y): 44 | """ 45 | @param bags : a sequence of n bags; each bag is an m-by-k array-like 46 | object containing m instances with k features 47 | @param y : an array-like object of length n containing -1/+1 labels 48 | """ 49 | self._bags = list(map(np.asmatrix, bags)) 50 | self._y = np.asmatrix(y).reshape((-1, 1)) 51 | if self.scale_C: 52 | C = self.C / float(len(self._bags)) 53 | else: 54 | C = self.C 55 | 56 | if self.verbose: 57 | print('Setup QP...') 58 | K, H, f, A, b, lb, ub = self._setup_svm(self._bags, self._y, C) 59 | 60 | # Solve QP 61 | if self.verbose: 62 | print('Solving QP...') 63 | self._alphas, self._objective = quadprog(H, f, A, b, lb, ub, 64 | self.verbose) 65 | self._compute_separator(K) 66 | 67 | def _compute_separator(self, K): 68 | 69 | self._sv = np.nonzero(self._alphas.flat > self.sv_cutoff) 70 | self._sv_alphas = self._alphas[self._sv] 71 | self._sv_bags = [self._bags[i] for i in self._sv[0]] 72 | self._sv_y = self._y[self._sv] 73 | 74 
| n = len(self._sv_bags) 75 | if n == 0: 76 | self._b = 0.0 77 | self._bag_predictions = np.zeros(len(self._bags)) 78 | else: 79 | _sv_all_K = K[self._sv] 80 | _sv_K = _sv_all_K.T[self._sv].T 81 | e = np.matrix(np.ones((n, 1))) 82 | D = spdiag(self._sv_y) 83 | self._b = float(e.T * D * e - self._sv_alphas.T * D * _sv_K * e) / n 84 | self._bag_predictions = np.array(self._b 85 | + self._sv_alphas.T * D * _sv_all_K).reshape((-1,)) 86 | 87 | def predict(self, bags): 88 | """ 89 | @param bags : a sequence of n bags; each bag is an m-by-k array-like 90 | object containing m instances with k features 91 | @return : an array of length n containing real-valued label predictions 92 | (threshold at zero to produce binary predictions) 93 | """ 94 | if self._sv_bags is None or len(self._sv_bags) == 0: 95 | return np.zeros(len(bags)) 96 | else: 97 | kernel = kernel_by_name(self.kernel, p=self.p, gamma=self.gamma) 98 | K = kernel(list(map(np.asmatrix, bags)), self._sv_bags) 99 | return np.array(self._b + K * spdiag(self._sv_y) * self._sv_alphas).reshape((-1,)) 100 | 101 | def get_params(self, deep=True): 102 | """ 103 | return params 104 | """ 105 | args, _, _, _ = inspect.getargspec(super(NSK, self).__init__) 106 | args.pop(0) 107 | return {key: getattr(self, key, None) for key in args} -------------------------------------------------------------------------------- /misvm/quadprog.py: -------------------------------------------------------------------------------- 1 | from __future__ import print_function, division 2 | from cvxopt import matrix as cvxmat, sparse, spmatrix 3 | from cvxopt.solvers import qp, options 4 | from sys import stderr 5 | from itertools import count 6 | 7 | from numpy import eye, vstack, matrix 8 | 9 | 10 | class IterativeQP(object): 11 | """ 12 | Iteratively solves QPs, allowing 13 | an update of parameters and using 14 | the previous solution as an initial solution 15 | """ 16 | 17 | def __init__(self, H, f, Aeq, beq, lb, ub, fix_pd=False): 18 | """ 19 
| minimize: 20 | (1/2)*x'*H*x + f'*x 21 | subject to: 22 | Aeq*x = beq 23 | lb <= x <= ub 24 | """ 25 | self.lb = lb 26 | (self.P, self.q, self.G, 27 | self.h, self.A, self.b) = _convert(H, f, Aeq, beq, lb, ub) 28 | self.last_results = None 29 | self.fix_pd = fix_pd 30 | 31 | def update_ub(self, ub): 32 | self.h = cvxmat(vstack([-self.lb, ub])) 33 | # Old results no longer valid 34 | self.last_results = None 35 | 36 | def update_H(self, H): 37 | self.P = cvxmat(H) 38 | 39 | def update_Aeq(self, Aeq): 40 | if Aeq is None: 41 | self.A = None 42 | else: 43 | self.A = cvxmat(Aeq) 44 | # Old results no longer valid 45 | self.last_results = None 46 | 47 | def _ensure_pd(self, epsilon): 48 | """ 49 | Add epsilon times identity matrix 50 | to P to ensure numerically it is P.D. 51 | """ 52 | n = self.P.size[0] 53 | self.P = self.P + cvxmat(epsilon * eye(n)) 54 | 55 | def clear_results(self): 56 | self.last_results = None 57 | 58 | def solve(self, verbose=False): 59 | # Optimize 60 | old_settings = _apply_options({'show_progress': verbose}) 61 | 62 | for i in count(-9): 63 | try: 64 | results = qp(self.P, self.q, self.G, 65 | self.h, self.A, self.b, 66 | initvals=self.last_results) 67 | break 68 | except ValueError as e: 69 | # Sometimes the hessian isn't full rank, 70 | # due to numerical error 71 | if self.fix_pd: 72 | eps = 10.0 ** i 73 | print('Rank error while solving, adjusting to fix...') 74 | print('Using epsilon = %.1e' % eps) 75 | self._ensure_pd(eps) 76 | else: 77 | raise e 78 | 79 | _apply_options(old_settings) 80 | 81 | # Store results 82 | self.last_results = results 83 | 84 | # Check return status 85 | status = results['status'] 86 | if not status == 'optimal': 87 | print('Warning: termination of qp with status: %s' 88 | % status, file=stderr) 89 | 90 | # Convert back to NumPy matrix 91 | # and return solution 92 | xstar = results['x'] 93 | obj = Objective((0.5 * xstar.T * self.P * xstar)[0], (self.q.T * xstar)[0]) 94 | return matrix(xstar), obj 95 | 96 | 97 | 
def quadprog(H, f, Aeq, beq, lb, ub, verbose=False, fix_pd=False): 98 | """ 99 | minimize: 100 | (1/2)*x'*H*x + f'*x 101 | subject to: 102 | Aeq*x = beq 103 | lb <= x <= ub 104 | """ 105 | qp = IterativeQP(H, f, Aeq, beq, lb, ub, fix_pd) 106 | return qp.solve(verbose) 107 | 108 | 109 | def speye(n): 110 | """Create a sparse identity matrix""" 111 | r = range(n) 112 | return spmatrix(1.0, r, r) 113 | 114 | 115 | def spzeros(r, c=1): 116 | """Create a sparse zero vector or matrix""" 117 | return spmatrix([], [], [], (r, c)) 118 | 119 | 120 | def _convert(H, f, Aeq, beq, lb, ub): 121 | """ 122 | Convert everything to 123 | cvxopt-style matrices 124 | """ 125 | P = cvxmat(H) 126 | q = cvxmat(f) 127 | if Aeq is None: 128 | A = None 129 | else: 130 | A = cvxmat(Aeq) 131 | if beq is None: 132 | b = None 133 | else: 134 | b = cvxmat(beq) 135 | 136 | n = lb.size 137 | G = sparse([-speye(n), speye(n)]) 138 | h = cvxmat(vstack([-lb, ub])) 139 | return P, q, G, h, A, b 140 | 141 | 142 | def _apply_options(option_dict): 143 | old_settings = {} 144 | for k, v in option_dict.items(): 145 | old_settings[k] = options.get(k, None) 146 | if v is None: 147 | del options[k] 148 | else: 149 | options[k] = v 150 | return old_settings 151 | 152 | 153 | class Objective(object): 154 | def __init__(self, quadratic, linear): 155 | self.objective = quadratic + linear 156 | self.quadratic = quadratic 157 | self.linear = linear 158 | 159 | def __float__(self): 160 | return float(self.objective) 161 | 162 | def __str__(self): 163 | return str(self.objective) 164 | -------------------------------------------------------------------------------- /misvm/sbmil.py: -------------------------------------------------------------------------------- 1 | """ 2 | Implements sbMIL 3 | """ 4 | from __future__ import print_function, division 5 | import numpy as np 6 | 7 | from misvm.smil import sMIL 8 | from misvm.sil import SIL 9 | from misvm.util import BagSplitter 10 | 11 | 12 | class sbMIL(SIL): 13 | """ 14 
| Sparse, balanced MIL (Bunescu & Mooney, 2007) 15 | """ 16 | 17 | def __init__(self, *args, **kwargs): 18 | """ 19 | @param kernel : the desired kernel function; can be linear, quadratic, 20 | polynomial, or rbf [default: linear] 21 | (by default, no normalization is used; to use averaging 22 | or feature space normalization, append either '_av' or 23 | '_fs' to the kernel name, as in 'rbf_av'; averaging 24 | normalization is used in the original formulation) 25 | @param C : the loss/regularization tradeoff constant [default: 1.0] 26 | @param scale_C : if True [default], scale C by the number of examples 27 | @param p : polynomial degree when a 'polynomial' kernel is used 28 | [default: 3] 29 | @param gamma : RBF scale parameter when an 'rbf' kernel is used 30 | [default: 1.0] 31 | @param verbose : print optimization status messages [default: True] 32 | @param sv_cutoff : the numerical cutoff for an example to be considered 33 | a support vector [default: 1e-7] 34 | @param eta : balance parameter 35 | """ 36 | self.eta = kwargs.pop('eta', 0.0) 37 | self.eta = max(0.0, self.eta) 38 | self.eta = min(1.0, self.eta) 39 | super(sbMIL, self).__init__(*args, **kwargs) 40 | 41 | def fit(self, bags, y): 42 | """ 43 | @param bags : a sequence of n bags; each bag is an m-by-k array-like 44 | object containing m instances with k features 45 | @param y : an array-like object of length n containing -1/+1 labels 46 | """ 47 | self._bags = [np.asmatrix(bag) for bag in bags] 48 | y = np.asmatrix(y).reshape((-1, 1)) 49 | bs = BagSplitter(self._bags, y) 50 | 51 | if self.verbose: 52 | print('Training initial sMIL classifier for sbMIL...') 53 | initial_classifier = sMIL(kernel=self.kernel, C=self.C, p=self.p, gamma=self.gamma, 54 | scale_C=self.scale_C, verbose=self.verbose, 55 | sv_cutoff=self.sv_cutoff) 56 | initial_classifier.fit(bags, y) 57 | if self.verbose: 58 | print('Computing initial instance labels for sbMIL...') 59 | f_pos = initial_classifier.predict(bs.pos_inst_as_bags) 
60 | # Select nth largest value as cutoff for positive instances 61 | n = int(round(bs.L_p * self.eta)) 62 | n = min(bs.L_p, n) 63 | n = max(bs.X_p, n) 64 | f_cutoff = sorted((float(f) for f in f_pos), reverse=True)[n - 1] 65 | 66 | # Label all except for n largest as -1 67 | pos_labels = -np.matrix(np.ones((bs.L_p, 1))) 68 | pos_labels[np.nonzero(f_pos >= f_cutoff)] = 1.0 69 | 70 | # Train on all instances 71 | if self.verbose: 72 | print('Retraining with top %d%% as positive...' % int(100 * self.eta)) 73 | all_labels = np.vstack([-np.ones((bs.L_n, 1)), pos_labels]) 74 | super(SIL, self).fit(bs.instances, all_labels) 75 | 76 | def _compute_separator(self, K): 77 | super(SIL, self)._compute_separator(K) 78 | self._bag_predictions = self.predict(self._bags) 79 | -------------------------------------------------------------------------------- /misvm/sil.py: -------------------------------------------------------------------------------- 1 | """ 2 | Implements Single Instance Learning SVM 3 | """ 4 | from __future__ import print_function, division 5 | import numpy as np 6 | import inspect 7 | from misvm.svm import SVM 8 | from misvm.util import slices 9 | 10 | 11 | class SIL(SVM): 12 | """ 13 | Single-Instance Learning applied to MI data 14 | """ 15 | 16 | def __init__(self, **kwargs): 17 | """ 18 | @param kernel : the desired kernel function; can be linear, quadratic, 19 | polynomial, or rbf [default: linear] 20 | @param C : the loss/regularization tradeoff constant [default: 1.0] 21 | @param scale_C : if True [default], scale C by the number of examples 22 | @param p : polynomial degree when a 'polynomial' kernel is used 23 | [default: 3] 24 | @param gamma : RBF scale parameter when an 'rbf' kernel is used 25 | [default: 1.0] 26 | @param verbose : print optimization status messages [default: True] 27 | @param sv_cutoff : the numerical cutoff for an example to be considered 28 | a support vector [default: 1e-7] 29 | """ 30 | super(SIL, self).__init__(**kwargs) 31 | 
self._bags = None 32 | self._bag_predictions = None 33 | 34 | def fit(self, bags, y): 35 | """ 36 | @param bags : a sequence of n bags; each bag is an m-by-k array-like 37 | object containing m instances with k features 38 | @param y : an array-like object of length n containing -1/+1 labels 39 | """ 40 | self._bags = [np.asmatrix(bag) for bag in bags] 41 | y = np.asmatrix(y).reshape((-1, 1)) 42 | svm_X = np.vstack(self._bags) 43 | svm_y = np.vstack([float(cls) * np.matrix(np.ones((len(bag), 1))) 44 | for bag, cls in zip(self._bags, y)]) 45 | super(SIL, self).fit(svm_X, svm_y) 46 | 47 | def _compute_separator(self, K): 48 | super(SIL, self)._compute_separator(K) 49 | self._bag_predictions = _inst_to_bag_preds(self._predictions, self._bags) 50 | 51 | def predict(self, bags, instancePrediction = None): 52 | """ 53 | @param bags : a sequence of n bags; each bag is an m-by-k array-like 54 | object containing m instances with k features 55 | @param instancePrediction : flag to indicate if instance predictions 56 | should be given as output. 
57 | @return : an array of length n containing real-valued label predictions 58 | (threshold at zero to produce binary predictions) 59 | """ 60 | if instancePrediction is None: 61 | instancePrediction = False 62 | 63 | bags = [np.asmatrix(bag) for bag in bags] 64 | inst_preds = super(SIL, self).predict(np.vstack(bags)) 65 | 66 | if instancePrediction: 67 | return _inst_to_bag_preds(inst_preds, bags), inst_preds 68 | else: 69 | return _inst_to_bag_preds(inst_preds, bags) 70 | 71 | def get_params(self, deep=True): 72 | """ 73 | return params 74 | """ 75 | args, _, _, _ = inspect.getargspec(super(SIL, self).__init__) 76 | args.pop(0) 77 | return {key: getattr(self, key, None) for key in args} 78 | 79 | 80 | def _inst_to_bag_preds(inst_preds, bags): 81 | return np.array([np.max(inst_preds[slice(*bidx)]) 82 | for bidx in slices(list(map(len, bags)))]) 83 | -------------------------------------------------------------------------------- /misvm/smil.py: -------------------------------------------------------------------------------- 1 | """ 2 | Implements sMIL 3 | """ 4 | from __future__ import print_function, division 5 | import numpy as np 6 | 7 | from misvm.quadprog import quadprog 8 | from misvm.kernel import by_name as kernel_by_name 9 | from misvm.util import BagSplitter 10 | from misvm.nsk import NSK 11 | 12 | 13 | class sMIL(NSK): 14 | """ 15 | Sparse MIL (Bunescu & Mooney, 2007) 16 | """ 17 | 18 | def __init__(self, **kwargs): 19 | """ 20 | @param kernel : the desired kernel function; can be linear, quadratic, 21 | polynomial, or rbf [default: linear] 22 | (by default, no normalization is used; to use averaging 23 | or feature space normalization, append either '_av' or 24 | '_fs' to the kernel name, as in 'rbf_av'; averaging 25 | normalization is used in the original formulation) 26 | @param C : the loss/regularization tradeoff constant [default: 1.0] 27 | @param scale_C : if True [default], scale C by the number of examples 28 | @param p : polynomial degree 
when a 'polynomial' kernel is used 29 | [default: 3] 30 | @param gamma : RBF scale parameter when an 'rbf' kernel is used 31 | [default: 1.0] 32 | @param verbose : print optimization status messages [default: True] 33 | @param sv_cutoff : the numerical cutoff for an example to be considered 34 | a support vector [default: 1e-7] 35 | """ 36 | super(sMIL, self).__init__(**kwargs) 37 | 38 | def fit(self, bags, y): 39 | """ 40 | @param bags : a sequence of n bags; each bag is an m-by-k array-like 41 | object containing m instances with k features 42 | @param y : an array-like object of length n containing -1/+1 labels 43 | """ 44 | bs = BagSplitter(list(map(np.asmatrix, bags)), 45 | np.asmatrix(y).reshape((-1, 1))) 46 | self._bags = bs.neg_inst_as_bags + bs.pos_bags 47 | self._y = np.matrix(np.vstack([-np.ones((bs.L_n, 1)), 48 | np.ones((bs.X_p, 1))])) 49 | if self.scale_C: 50 | iC = float(self.C) / bs.L_n 51 | bC = float(self.C) / bs.X_p 52 | else: 53 | iC = self.C 54 | bC = self.C 55 | C = np.vstack([iC * np.ones((bs.L_n, 1)), 56 | bC * np.ones((bs.X_p, 1))]) 57 | 58 | if self.verbose: 59 | print('Setup QP...') 60 | K, H, f, A, b, lb, ub = self._setup_svm(self._bags, self._y, C) 61 | 62 | # Adjust f with balancing terms 63 | factors = np.vstack([np.matrix(np.ones((bs.L_n, 1))), 64 | np.matrix([2.0 / len(bag) - 1.0 65 | for bag in bs.pos_bags]).T]) 66 | f = np.multiply(f, factors) 67 | 68 | if self.verbose: 69 | print('Solving QP...') 70 | self._alphas, self._objective = quadprog(H, f, A, b, lb, ub, 71 | self.verbose) 72 | self._compute_separator(K) 73 | 74 | # Recompute predictions for full bags 75 | self._bag_predictions = self.predict(bags) -------------------------------------------------------------------------------- /misvm/stk.py: -------------------------------------------------------------------------------- 1 | """ 2 | Implements the STK of Gartner et al. 
3 | """ 4 | from __future__ import print_function, division 5 | import inspect 6 | import numpy as np 7 | 8 | from misvm.svm import SVM 9 | 10 | 11 | class STK(SVM): 12 | """ 13 | Statistics kernel of Gaertner, et al. (2002) 14 | """ 15 | 16 | def __init__(self, **kwargs): 17 | """ 18 | @param kernel : the desired kernel function; can be linear, quadratic, 19 | polynomial, or rbf [default: linear] 20 | @param C : the loss/regularization tradeoff constant [default: 1.0] 21 | @param scale_C : if True [default], scale C by the number of examples 22 | @param p : polynomial degree when a 'polynomial' kernel is used 23 | [default: 3] 24 | @param gamma : RBF scale parameter when an 'rbf' kernel is used 25 | [default: 1.0] 26 | @param verbose : print optimization status messages [default: True] 27 | @param sv_cutoff : the numerical cutoff for an example to be considered 28 | a support vector [default: 1e-7] 29 | """ 30 | super(STK, self).__init__(**kwargs) 31 | self._bags = None 32 | self._bag_predictions = None 33 | 34 | def fit(self, bags, y): 35 | """ 36 | @param bags : a sequence of n bags; each bag is an m-by-k array-like 37 | object containing m instances with k features 38 | @param y : an array-like object of length n containing -1/+1 labels 39 | """ 40 | self._bags = [np.asmatrix(bag) for bag in bags] 41 | y = np.asmatrix(y).reshape((-1, 1)) 42 | svm_X = _stats_from_bags(bags) 43 | super(STK, self).fit(svm_X, y) 44 | 45 | def _compute_separator(self, K): 46 | super(STK, self)._compute_separator(K) 47 | self._bag_predictions = self._predictions 48 | 49 | def predict(self, bags): 50 | """ 51 | @param bags : a sequence of n bags; each bag is an m-by-k array-like 52 | object containing m instances with k features 53 | @return : an array of length n containing real-valued label predictions 54 | (threshold at zero to produce binary predictions) 55 | """ 56 | bags = [np.asmatrix(bag) for bag in bags] 57 | svm_X = _stats_from_bags(bags) 58 | return super(STK, 
self).predict(svm_X) 59 | 60 | def get_params(self, deep=True): 61 | """ 62 | return params 63 | """ 64 | args, _, _, _ = inspect.getargspec(super(STK, self).__init__) 65 | args.pop(0) 66 | return {key: getattr(self, key, None) for key in args} 67 | 68 | def _stats_from_bags(bags): 69 | return np.vstack([np.hstack([np.min(bag, 0), np.max(bag, 0)]) 70 | for bag in bags]) 71 | -------------------------------------------------------------------------------- /misvm/stmil.py: -------------------------------------------------------------------------------- 1 | """ 2 | Implements stMIL 3 | """ 4 | from __future__ import print_function, division 5 | import numpy as np 6 | from random import uniform 7 | 8 | from misvm.nsk import NSK 9 | from misvm.smil import sMIL 10 | from misvm.quadprog import IterativeQP 11 | from misvm.cccp import CCCP 12 | from misvm.util import BagSplitter, spdiag 13 | 14 | 15 | class stMIL(NSK): 16 | """ 17 | Sparse, transductive MIL (Bunescu & Mooney, 2007) 18 | """ 19 | 20 | def __init__(self, *args, **kwargs): 21 | """ 22 | @param kernel : the desired kernel function; can be linear, quadratic, 23 | polynomial, or rbf [default: linear] 24 | (by default, no normalization is used; to use averaging 25 | or feature space normalization, append either '_av' or 26 | '_fs' to the kernel name, as in 'rbf_av'; averaging 27 | normalization is used in the original formulation) 28 | @param C : the loss/regularization tradeoff constant [default: 1.0] 29 | @param scale_C : if True [default], scale C by the number of examples 30 | @param p : polynomial degree when a 'polynomial' kernel is used 31 | [default: 3] 32 | @param gamma : RBF scale parameter when an 'rbf' kernel is used 33 | [default: 1.0] 34 | @param verbose : print optimization status messages [default: True] 35 | @param sv_cutoff : the numerical cutoff for an example to be considered 36 | a support vector [default: 1e-7] 37 | """ 38 | self.restarts = kwargs.pop('restarts', 0) 39 | self.max_iters = 
kwargs.pop('max_iters', 50) 40 | super(stMIL, self).__init__(*args, **kwargs) 41 | 42 | def fit(self, bags, y): 43 | """ 44 | @param bags : a sequence of n bags; each bag is an m-by-k array-like 45 | object containing m instances with k features 46 | @param y : an array-like object of length n containing -1/+1 labels 47 | """ 48 | self._bags = list(map(np.asmatrix, bags)) 49 | bs = BagSplitter(self._bags, 50 | np.asmatrix(y).reshape((-1, 1))) 51 | self._all_bags = bs.neg_inst_as_bags + bs.pos_inst_as_bags + bs.pos_bags 52 | all_classes = np.vstack([-np.ones((bs.L_n, 1)), 53 | np.ones((bs.L_p + bs.X_p, 1))]) 54 | 55 | if self.scale_C: 56 | niC = float(self.C) / bs.L_n 57 | piC = float(self.C) / bs.L_p 58 | pbC = float(self.C) / bs.X_p 59 | else: 60 | niC = float(self.C) 61 | piC = float(self.C) 62 | pbC = float(self.C) 63 | C = np.vstack([niC * np.ones((bs.L_n, 1)), 64 | piC * np.ones((bs.L_p, 1)), 65 | pbC * np.ones((bs.X_p, 1))]) 66 | 67 | # Used to adjust balancing terms 68 | factors = np.vstack([np.matrix(np.ones((bs.L_n + bs.L_p, 1))), 69 | np.matrix([2.0 / bag.shape[0] - 1.0 70 | for bag in bs.pos_bags]).T]) 71 | 72 | best_obj = float('inf') 73 | best_svm = None 74 | for rr in range(self.restarts + 1): 75 | if rr == 0: 76 | if self.verbose: 77 | print('Non-random start...') 78 | if self.verbose: 79 | print('Initial sMIL solution...') 80 | smil = sMIL(kernel=self.kernel, C=self.C, 81 | gamma=self.gamma, p=self.p, scale_C=self.scale_C) 82 | smil.fit(bags, y) 83 | if self.verbose: 84 | print('Computing instance classes...') 85 | initial_svm = smil 86 | initial_classes = np.sign(smil.predict(bs.pos_inst_as_bags)) 87 | else: 88 | if self.verbose: 89 | print('Random restart %d of %d...'
% (rr, self.restarts)) 90 | initial_svm = None 91 | initial_classes = np.matrix([np.sign([uniform(-1.0, 1.0) 92 | for i in range(bs.L_p)])]).T 93 | 94 | if self.verbose: 95 | print('Setup SVM and QP...') 96 | # Setup SVM and QP 97 | K, H, f, A, b, lb, ub = self._setup_svm(self._all_bags, all_classes, C) 98 | # Adjust f with balancing terms 99 | f = np.multiply(f, factors) 100 | qp = IterativeQP(H, f, A, b, lb, ub) 101 | 102 | class stMILCCCP(CCCP): 103 | 104 | def bailout(cself, svm, obj_val, classes): 105 | return svm 106 | 107 | def iterate(cself, svm, obj_val, classes): 108 | # Fix classes with zero classification 109 | classes[np.nonzero(classes == 0.0)] = 1.0 110 | 111 | cself.mention('Linearizing constraints...') 112 | all_classes = np.matrix(np.vstack([-np.ones((bs.L_n, 1)), 113 | classes.reshape((-1, 1)), 114 | np.ones((bs.X_p, 1))])) 115 | D = spdiag(all_classes) 116 | 117 | # Update QP 118 | qp.update_H(D * K * D) 119 | qp.update_Aeq(all_classes.T) 120 | 121 | # Solve QP 122 | alphas, obj = qp.solve(self.verbose) 123 | 124 | # Update SVM 125 | svm = NSK(kernel=self.kernel, gamma=self.gamma, p=self.p, 126 | verbose=self.verbose, sv_cutoff=self.sv_cutoff) 127 | svm._bags = self._all_bags 128 | svm._y = all_classes 129 | svm._alphas = alphas 130 | svm._objective = obj 131 | svm._compute_separator(K) 132 | svm._K = K 133 | 134 | if cself.check_tolerance(obj_val, obj): 135 | return None, svm 136 | 137 | # Use precomputed classifications from SVM 138 | new_classes = np.sign(svm._bag_predictions[bs.L_n:-bs.X_p]) 139 | return {'svm': svm, 'obj_val': obj, 'classes': new_classes}, None 140 | 141 | cccp = stMILCCCP(verbose=self.verbose, svm=initial_svm, obj_val=None, 142 | classes=initial_classes, max_iters=self.max_iters) 143 | svm = cccp.solve() 144 | if svm is not None: 145 | obj = float(svm._objective) 146 | if obj < best_obj: 147 | best_svm = svm 148 | best_obj = obj 149 | 150 | if best_svm is not None: 151 | self._all_bags = best_svm._bags 152 | self._y =
best_svm._y 153 | self._alphas = best_svm._alphas 154 | self._objective = best_svm._objective 155 | self._compute_separator(best_svm._K) 156 | 157 | def _compute_separator(self, K): 158 | bags = self._bags 159 | self._bags = self._all_bags 160 | super(stMIL, self)._compute_separator(K) 161 | self._bags = bags 162 | self._bag_predictions = self.predict(self._bags) 163 | -------------------------------------------------------------------------------- /misvm/svm.py: -------------------------------------------------------------------------------- 1 | """ 2 | Implements a standard SVM 3 | """ 4 | from __future__ import print_function, division 5 | import numpy as np 6 | from misvm.quadprog import quadprog 7 | from misvm.kernel import by_name as kernel_by_name 8 | from misvm.util import spdiag 9 | from sklearn.base import ClassifierMixin, BaseEstimator 10 | 11 | 12 | class SVM(ClassifierMixin, BaseEstimator): 13 | """ 14 | A standard supervised SVM implementation. 15 | """ 16 | 17 | def __init__(self, kernel='linear', C=1.0, p=3, gamma=1e0, scale_C=True, 18 | verbose=True, sv_cutoff=1e-7): 19 | """ 20 | @param kernel : the desired kernel function; can be linear, quadratic, 21 | polynomial, or rbf [default: linear] 22 | @param C : the loss/regularization tradeoff constant [default: 1.0] 23 | @param scale_C : if True [default], scale C by the number of examples 24 | @param p : polynomial degree when a 'polynomial' kernel is used 25 | [default: 3] 26 | @param gamma : RBF scale parameter when an 'rbf' kernel is used 27 | [default: 1.0] 28 | @param verbose : print optimization status messages [default: True] 29 | @param sv_cutoff : the numerical cutoff for an example to be considered 30 | a support vector [default: 1e-7] 31 | """ 32 | self.kernel = kernel 33 | self.C = C 34 | self.gamma = gamma 35 | self.p = p 36 | self.scale_C = scale_C 37 | self.verbose = verbose 38 | self.sv_cutoff = sv_cutoff 39 | 40 | self._X = None 41 | self._y = None 42 | self._objective = None 43 | 
self._alphas = None 44 | self._sv = None 45 | self._sv_alphas = None 46 | self._sv_X = None 47 | self._sv_y = None 48 | self._b = None 49 | self._predictions = None 50 | 51 | def fit(self, X, y): 52 | """ 53 | @param X : an n-by-m array-like object containing n examples with m 54 | features 55 | @param y : an array-like object of length n containing -1/+1 labels 56 | """ 57 | self._X = np.asmatrix(X) 58 | self._y = np.asmatrix(y).reshape((-1, 1)) 59 | if self.scale_C: 60 | C = self.C / float(len(self._X)) 61 | else: 62 | C = self.C 63 | 64 | K, H, f, A, b, lb, ub = self._setup_svm(self._X, self._y, C) 65 | 66 | # Solve QP 67 | self._alphas, self._objective = quadprog(H, f, A, b, lb, ub, 68 | self.verbose) 69 | self._compute_separator(K) 70 | 71 | def _compute_separator(self, K): 72 | 73 | self._sv = np.nonzero(self._alphas.flat > self.sv_cutoff) 74 | self._sv_alphas = self._alphas[self._sv] 75 | self._sv_X = self._X[self._sv] 76 | self._sv_y = self._y[self._sv] 77 | 78 | n = len(self._sv_X) 79 | if n == 0: 80 | self._b = 0.0 81 | self._predictions = np.zeros(len(self._X)) 82 | else: 83 | _sv_all_K = K[self._sv] 84 | _sv_K = _sv_all_K.T[self._sv].T 85 | e = np.matrix(np.ones((n, 1))) 86 | D = spdiag(self._sv_y) 87 | self._b = float(e.T * D * e - self._sv_alphas.T * D * _sv_K * e) / n 88 | self._predictions = np.array(self._b 89 | + self._sv_alphas.T * D * _sv_all_K).reshape((-1,)) 90 | 91 | def predict(self, X): 92 | """ 93 | @param X : an n-by-m array-like object containing n examples with m 94 | features 95 | @return : an array of length n containing real-valued label predictions 96 | (threshold at zero to produce binary predictions) 97 | """ 98 | if self._sv_X is None or len(self._sv_X) == 0: 99 | return np.zeros(len(X)) 100 | else: 101 | kernel = kernel_by_name(self.kernel, p=self.p, gamma=self.gamma) 102 | K = kernel(np.asmatrix(X), self._sv_X) 103 | return np.array(self._b + K * spdiag(self._sv_y) * self._sv_alphas).reshape((-1,)) 104 | 105 | def 
_setup_svm(self, examples, classes, C):
106 |         kernel = kernel_by_name(self.kernel, gamma=self.gamma, p=self.p)
107 |         n = len(examples)
108 |         e = np.matrix(np.ones((n, 1)))
109 | 
110 |         # Kernel and Hessian
111 |         if kernel is None:
112 |             K = None
113 |             H = None
114 |         else:
115 |             K = _smart_kernel(kernel, examples)
116 |             D = spdiag(classes)
117 |             H = D * K * D
118 | 
119 |         # Term for -sum of alphas
120 |         f = -e
121 | 
122 |         # Sum(y_i * alpha_i) = 0
123 |         A = classes.T.astype(float)
124 |         b = np.matrix([0.0])
125 | 
126 |         # 0 <= alpha_i <= C
127 |         lb = np.matrix(np.zeros((n, 1)))
128 |         if type(C) == float:
129 |             ub = C * e
130 |         else:
131 |             # Allow for C to be an array
132 |             ub = C
133 |         return K, H, f, A, b, lb, ub
134 | 
135 | 
136 | def _smart_kernel(kernel, examples):
137 |     """
138 |     Optimize the case when instances are
139 |     treated as singleton bags. In such
140 |     cases, singleton bags should be placed
141 |     at the beginning of the list of examples.
142 |     """
143 |     if type(examples) == list:
144 |         for i, bag in enumerate(examples):
145 |             if len(bag) > 1:
146 |                 break
147 |         else:
148 |             # Every bag is a singleton (or the list is empty); without this
149 |             # branch, i would be undefined for an empty list.
150 |             i = len(examples)
151 |         singletons, bags = examples[:i], examples[i:]
152 |         if singletons and bags:
153 |             ss = kernel(singletons, singletons)
154 |             sb = kernel(singletons, bags)
155 |             bb = kernel(bags, bags)
156 |             return np.bmat([[ss, sb], [sb.T, bb]])
157 | 
158 |     return kernel(examples, examples)
159 | 
--------------------------------------------------------------------------------
/misvm/util.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Utility functions and classes
 3 | """
 4 | from __future__ import print_function, division
 5 | import numpy as np
 6 | import scipy.sparse as sp
 7 | from itertools import chain
 8 | from random import uniform
 9 | 
10 | 
11 | def rand_convex(n):
12 |     rand = np.matrix([uniform(0.0, 1.0) for i in range(n)])
13 |     return rand / np.sum(rand)
14 | 
15 | 
16 | def spdiag(x):
17 |     n = len(x)
18 |     return sp.spdiags(x.flat, [0], n, n)
19 | 
20 | 
21 | def 
partition(items, group_sizes): 22 | """ 23 | Partition a sequence of items 24 | into groups of the given sizes 25 | """ 26 | i = 0 27 | for group in group_sizes: 28 | yield items[i: i + group] 29 | i += group 30 | 31 | 32 | def slices(groups): 33 | """ 34 | Generate slices to select 35 | groups of the given sizes 36 | within a list/matrix 37 | """ 38 | i = 0 39 | for group in groups: 40 | yield i, i + group 41 | i += group 42 | 43 | 44 | class BagSplitter(object): 45 | def __init__(self, bags, classes): 46 | self.bags = bags 47 | self.classes = classes 48 | 49 | def __getattr__(self, name): 50 | if name == 'pos_bags': 51 | self.pos_bags = [bag for bag, cls in 52 | zip(self.bags, self.classes) 53 | if cls > 0.0] 54 | return self.pos_bags 55 | elif name == 'neg_bags': 56 | self.neg_bags = [bag for bag, cls in 57 | zip(self.bags, self.classes) 58 | if cls <= 0.0] 59 | return self.neg_bags 60 | elif name == 'neg_instances': 61 | self.neg_instances = np.vstack(self.neg_bags) 62 | return self.neg_instances 63 | elif name == 'pos_instances': 64 | self.pos_instances = np.vstack(self.pos_bags) 65 | return self.pos_instances 66 | elif name == 'instances': 67 | self.instances = np.vstack([self.neg_instances, 68 | self.pos_instances]) 69 | return self.instances 70 | elif name == 'inst_classes': 71 | self.inst_classes = np.vstack([-np.ones((self.L_n, 1)), 72 | np.ones((self.L_p, 1))]) 73 | return self.inst_classes 74 | elif name == 'pos_groups': 75 | self.pos_groups = [len(bag) for bag in self.pos_bags] 76 | return self.pos_groups 77 | elif name == 'neg_groups': 78 | self.neg_groups = [len(bag) for bag in self.neg_bags] 79 | return self.neg_groups 80 | elif name == 'L_n': 81 | self.L_n = len(self.neg_instances) 82 | return self.L_n 83 | elif name == 'L_p': 84 | self.L_p = len(self.pos_instances) 85 | return self.L_p 86 | elif name == 'L': 87 | self.L = self.L_p + self.L_n 88 | return self.L 89 | elif name == 'X_n': 90 | self.X_n = len(self.neg_bags) 91 | return self.X_n 92 | 
elif name == 'X_p':
 93 |             self.X_p = len(self.pos_bags)
 94 |             return self.X_p
 95 |         elif name == 'X':
 96 |             self.X = self.X_p + self.X_n
 97 |             return self.X
 98 |         elif name == 'neg_inst_as_bags':
 99 |             self.neg_inst_as_bags = [inst for inst in chain(*self.neg_bags)]
100 |             return self.neg_inst_as_bags
101 |         elif name == 'pos_inst_as_bags':
102 |             self.pos_inst_as_bags = [inst for inst in chain(*self.pos_bags)]
103 |             return self.pos_inst_as_bags
104 |         else:
105 |             raise AttributeError('No "%s" attribute.' % name)
106 | 
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
 1 | try:
 2 |     from setuptools import setup
 3 | 
 4 |     setup  # quiet "redefinition of unused ..." warning from pyflakes
 5 |     # arguments that distutils doesn't understand
 6 |     setuptools_kwargs = {
 7 |         'install_requires': [
 8 |             'numpy',
 9 |             'scipy',
10 |             'cvxopt',
11 |         ],
12 |         'provides': ['misvm'],
13 |     }
14 | except ImportError:
15 |     from distutils.core import setup
16 | 
17 |     setuptools_kwargs = {}
18 | 
19 | setup(name='misvm',
20 |       version="1.0",
21 |       description=(
22 |           """
23 |           Implementations of various
24 |           multiple-instance support vector machine approaches
25 |           """
26 |       ),
27 |       author='Gary Doran',
28 |       author_email='gary.doran@case.edu',
29 |       url='https://github.com/garydoranjr/misvm.git',
30 |       license="BSD compatible (see the LICENSE file)",
31 |       packages=['misvm'],
32 |       platforms=['unix'],
33 |       scripts=[],
34 |       **setuptools_kwargs)
35 | 
--------------------------------------------------------------------------------
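The `BagSplitter` class in `misvm/util.py` above derives all of its views of the data (positive/negative bags, stacked instances, counts) lazily through `__getattr__`: the method is only invoked when normal attribute lookup fails, so assigning the computed value caches it and later accesses bypass the computation entirely. A minimal, self-contained sketch of that idiom (the `TinySplitter` class and the toy data are hypothetical illustrations, not part of misvm):

```python
from itertools import chain


class TinySplitter(object):
    """Stdlib-only sketch of BagSplitter's lazy __getattr__ caching idiom."""

    def __init__(self, bags, classes):
        self.bags = bags
        self.classes = classes

    def __getattr__(self, name):
        # Called only when normal lookup fails; assigning the result
        # caches it on the instance, so this runs at most once per name.
        if name == 'pos_bags':
            self.pos_bags = [b for b, c in zip(self.bags, self.classes)
                             if c > 0.0]
            return self.pos_bags
        elif name == 'neg_bags':
            self.neg_bags = [b for b, c in zip(self.bags, self.classes)
                             if c <= 0.0]
            return self.neg_bags
        elif name == 'pos_instances':
            # Referencing self.pos_bags here recursively triggers (and
            # caches) that attribute if it has not been computed yet.
            self.pos_instances = list(chain(*self.pos_bags))
            return self.pos_instances
        raise AttributeError('No "%s" attribute.' % name)


bags = [[1, 2], [3], [4, 5]]
classes = [1.0, -1.0, 1.0]
s = TinySplitter(bags, classes)
print(s.pos_bags)                # [[1, 2], [4, 5]]
print(s.neg_bags)                # [[3]]
print(s.pos_instances)           # [1, 2, 4, 5]
print('pos_bags' in s.__dict__)  # True: cached after first access
```

The payoff of this design is that each MI-SVM variant touches only the views it actually needs: an algorithm that never asks for `pos_instances` never pays for stacking the positive bags into one matrix.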