├── README.md ├── binaries ├── linux │ ├── COPYRIGHT │ ├── FAQ.html │ ├── README │ ├── README-GPU │ ├── svm-predict │ ├── svm-scale │ ├── svm-train │ ├── svm-train-gpu │ ├── tools │ │ ├── README │ │ ├── checkdata.py │ │ ├── easy.py │ │ ├── grid.py │ │ └── subset.py │ └── train_set └── windows │ ├── x64 │ ├── COPYRIGHT │ ├── FAQ.html │ ├── README │ ├── README-GPU │ ├── tools │ │ ├── README │ │ ├── checkdata.py │ │ ├── easy.py │ │ ├── grid.py │ │ └── subset.py │ ├── train_set │ └── windows │ │ ├── cublas64_55.dll │ │ ├── cudart64_55.dll │ │ ├── libsvm.dll │ │ ├── svm-predict.exe │ │ ├── svm-scale.exe │ │ ├── svm-toy.exe │ │ ├── svm-train-gpu.exe │ │ └── svm-train.exe │ └── x86 │ ├── COPYRIGHT │ ├── FAQ.html │ ├── README │ ├── README-GPU │ ├── tools │ ├── README │ ├── checkdata.py │ ├── easy.py │ ├── grid.py │ └── subset.py │ ├── train_set │ └── windows │ ├── cublas32_55.dll │ ├── cudart32_55.dll │ ├── libsvm.dll │ ├── svm-predict.exe │ ├── svm-scale.exe │ ├── svm-toy.exe │ ├── svm-train-gpu.exe │ └── svm-train.exe └── src ├── linux ├── COPYRIGHT ├── Makefile ├── README ├── README-GPU ├── cross_validation_with_matrix_precomputation.c ├── findcudalib.mk ├── kernel_matrix_calculation.c ├── readme.txt ├── svm-train.c ├── svm.cpp └── svm.h └── windows ├── README-GPU ├── libsvm_train_dense_gpu.ncb ├── libsvm_train_dense_gpu.sdf ├── libsvm_train_dense_gpu.sln ├── libsvm_train_dense_gpu.suo └── libsvm_train_dense_gpu ├── cross_validation_with_matrix_precomputation.c ├── kernel_matrix_calculation.c ├── libsvm_train_dense_gpu.vcxproj ├── libsvm_train_dense_gpu.vcxproj.filters ├── libsvm_train_dense_gpu.vcxproj.user ├── svm-train.c ├── svm.cpp └── svm.h /README.md: -------------------------------------------------------------------------------- 1 | CUDA: GPU-accelerated LIBSVM 2 | ==== 3 | **LIBSVM Accelerated with GPU using the CUDA Framework** 4 | 5 | GPU-accelerated LIBSVM is a modification of the [original LIBSVM](http://www.csie.ntu.edu.tw/~cjlin/libsvm/) that exploits the 
CUDA framework to significantly reduce processing time while producing identical results. The functionality and interface of LIBSVM remain the same. The modifications are confined to the kernel computation, which is now performed on the GPU.

Watch a [short video](http://www.youtube.com/watch?v=Fl99tQQd55U) on the capabilities of the GPU-accelerated LIBSVM package.

### CHANGELOG

V1.2

* Updated to LIBSVM version 3.17
* Updated to CUDA SDK v5.5
* Uses CUBLAS_V2, which is compatible with the CUDA SDK v4.0 and up.

### FEATURES

Mode supported

* C-SVC classification with RBF kernel

Functionality / user interface

* Same as LIBSVM

### PREREQUISITES

* LIBSVM prerequisites
* NVIDIA graphics card with CUDA support
* Latest NVIDIA drivers for the GPU

### PERFORMANCE COMPARISON

To showcase the performance gain of GPU-accelerated LIBSVM, we present an example run.

PC setup

* Quad-core Intel Q6600 processor
* 3.5 GB of DDR2 RAM
* Windows XP 32-bit OS

Input data

* TRECVID 2007 dataset for the detection of high-level features in video shots
* Training vectors with a dimension of 6000
* 20 different feature models with a variable number of input training vectors, ranging from 36 up to 3772

Classification parameters

* C-SVC
* RBF kernel
* Parameter optimization using the easy.py script provided by LIBSVM
* 4 local workers

![Diagram](http://mklab.iti.gr/files/GPULIBSVM-comparison.jpg)

Discussion

GPU-accelerated LIBSVM gives a performance gain that depends on the size of the input data set, and this gain increases dramatically with the dataset size. Please take into consideration the input data size limitations imposed by the memory capacity of the graphics card that is used.
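To make the kernel computation and the memory caveat concrete, here is a NumPy sketch of the underlying idea (an illustration only, not the package's actual CUDA code; function names and parameter values are ours): for the RBF kernel, ||x_i - x_j||^2 = ||x_i||^2 + ||x_j||^2 - 2 x_i.x_j, so the whole kernel matrix reduces to one dense Gram-matrix product, which is exactly the kind of operation a GPU BLAS such as CUBLAS accelerates, plus cheap element-wise work. The resulting n x n matrix is also what has to fit in graphics-card memory.

```python
import numpy as np

def rbf_kernel_matrix(X, gamma):
    """Full RBF kernel matrix via one dense matrix product.

    ||x_i - x_j||^2 = ||x_i||^2 + ||x_j||^2 - 2 * x_i . x_j,
    so the dominant cost is the Gram matrix X @ X.T -- the dense
    product a GPU BLAS library would offload to the device.
    """
    sq = np.sum(X * X, axis=1)                 # ||x_i||^2 for every row
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)                # clip tiny negative round-off
    return np.exp(-gamma * d2)

def kernel_matrix_mib(n):
    """Rough single-precision footprint (MiB) of the n x n kernel matrix."""
    return n * n * 4 / 2**20

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((5, 3))
    K = rbf_kernel_matrix(X, gamma=0.5)
    assert np.allclose(np.diag(K), 1.0)        # K(x, x) = exp(0) = 1
    assert np.allclose(K, K.T)                 # kernel matrix is symmetric
    print(round(kernel_matrix_mib(3772), 1), "MiB")  # largest model above, ~54.3 MiB
```

This quadratic growth of the kernel matrix with the number of training vectors is why very large datasets can exhaust GPU memory even when they train comfortably on the CPU.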

### PUBLICATION

A first document describing some of the work related to GPU-accelerated LIBSVM is the following; please cite it if you find this implementation useful in your work:

A. Athanasopoulos, A. Dimou, V. Mezaris, I. Kompatsiaris, "GPU Acceleration for Support Vector Machines", Proc. 12th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS 2011), Delft, The Netherlands, April 2011.

### LICENSE

THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

### FREQUENTLY ASKED QUESTIONS (FAQ)

* Is there a GPU-accelerated LIBSVM version for Matlab?

We are interested in porting our implementation to Matlab, but due to our workload it is not in our immediate plans. Everyone is welcome to make the port, and we can host the ported software.

* Visual Studio will not let me build the provided project.

The project has been built in both 32-bit and 64-bit mode. If you are working in 32-bit mode, you might still need the x64 compiler for Visual Studio 2010 to build the project.

* When building the project, I get linker error messages.

Please go to the project properties and check the library settings. The CUDA libraries have different names for x86 and x64. Make sure that the correct paths and filenames are given.

* I have built the project, but the executables will not run ("The application has failed to start because its side-by-side configuration is incorrect.")

Please install the VS2010 redistributables on the PC where you are running the executable, and install all the latest patches for Visual Studio.

* My GPU-accelerated LIBSVM is running smoothly, but I do not see any speed-up.

GPU-accelerated LIBSVM gives speed-ups mainly for large datasets. The GPU-accelerated implementation needs some extra time to load the data into GPU memory. If the dataset is not big enough to give a significant performance gain, the gain is lost to the CPU-memory -> GPU-memory and GPU-memory -> CPU-memory transfer time. Please refer to the graph above for a better understanding of the performance gain for different dataset sizes.
Problems also seem to arise when the input dataset contains values with extreme differences (e.g., 10^7) if no scaling is performed. One such example is the "breast-cancer" dataset provided on the official LIBSVM page.

### ACKNOWLEDGEMENTS

This work was supported by the EU FP7 projects GLOCAL (FP7-248984) and WeKnowIt (FP7-215453).
-------------------------------------------------------------------------------- /binaries/linux/COPYRIGHT: --------------------------------------------------------------------------------

Copyright (c) 2000-2013 Chih-Chung Chang and Chih-Jen Lin
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.

2.
Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.

3. Neither name of copyright holders nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-------------------------------------------------------------------------------- /binaries/linux/README-GPU: --------------------------------------------------------------------------------
GPU-Accelerated LIBSVM exploits the GPU, through the CUDA interface, to
speed up the training process. This package contains a new executable for
training classifiers, "svm-train-gpu", alongside the original one.
The new executable is used in exactly the same way as the original.

This binary was built with the CUBLAS API version 2, which is compatible with SDKs from 4.0 and up.

To test the binary "svm-train-gpu" you can run the easy.py script, which is located in the "tools" folder.
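Since the command-line interface is identical to the original svm-train, a typical session might look like the following (the -c and -g values here are only illustrative, not recommended parameters):

```shell
# CPU training with the original binary
./svm-train -c 32 -g 0.5 train_set model_cpu
# GPU training: same options, only the executable name changes
./svm-train-gpu -c 32 -g 0.5 train_set model_gpu
```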
9 | To observe speed improvements between CPU and GPU execution we provide a custom relatively large dataset (train_set) which can be used as an input to easy.py. 10 | 11 | 12 | FEATURES 13 | 14 | Mode Supported 15 | 16 | * c-svc classification with RBF kernel 17 | 18 | Functionality / User interface 19 | 20 | * Same as LIBSVM 21 | 22 | 23 | PREREQUISITES 24 | 25 | * NVIDIA Graphics card with CUDA support 26 | * Latest NVIDIA drivers for GPU 27 | 28 | Additional Information 29 | ====================== 30 | 31 | If you find GPU-Accelerated LIBSVM helpful, please cite it as 32 | 33 | A. Athanasopoulos, A. Dimou, V. Mezaris, I. Kompatsiaris, "GPU Acceleration for Support Vector Machines", 34 | Proc. 12th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS 2011), Delft, The Netherlands, April 2011. 35 | 36 | Software available at http://mklab.iti.gr/project/GPU-LIBSVM -------------------------------------------------------------------------------- /binaries/linux/svm-predict: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/binaries/linux/svm-predict -------------------------------------------------------------------------------- /binaries/linux/svm-scale: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/binaries/linux/svm-scale -------------------------------------------------------------------------------- /binaries/linux/svm-train: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/binaries/linux/svm-train -------------------------------------------------------------------------------- /binaries/linux/svm-train-gpu: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/binaries/linux/svm-train-gpu -------------------------------------------------------------------------------- /binaries/linux/tools/README: -------------------------------------------------------------------------------- 1 | This directory includes some useful codes: 2 | 3 | 1. subset selection tools. 4 | 2. parameter selection tools. 5 | 3. LIBSVM format checking tools 6 | 7 | Part I: Subset selection tools 8 | 9 | Introduction 10 | ============ 11 | 12 | Training large data is time consuming. Sometimes one should work on a 13 | smaller subset first. The python script subset.py randomly selects a 14 | specified number of samples. For classification data, we provide a 15 | stratified selection to ensure the same class distribution in the 16 | subset. 17 | 18 | Usage: subset.py [options] dataset number [output1] [output2] 19 | 20 | This script selects a subset of the given data set. 21 | 22 | options: 23 | -s method : method of selection (default 0) 24 | 0 -- stratified selection (classification only) 25 | 1 -- random selection 26 | 27 | output1 : the subset (optional) 28 | output2 : the rest of data (optional) 29 | 30 | If output1 is omitted, the subset will be printed on the screen. 31 | 32 | Example 33 | ======= 34 | 35 | > python subset.py heart_scale 100 file1 file2 36 | 37 | From heart_scale 100 samples are randomly selected and stored in 38 | file1. All remaining instances are stored in file2. 39 | 40 | 41 | Part II: Parameter Selection Tools 42 | 43 | Introduction 44 | ============ 45 | 46 | grid.py is a parameter selection tool for C-SVM classification using 47 | the RBF (radial basis function) kernel. 
It uses cross validation (CV) 48 | technique to estimate the accuracy of each parameter combination in 49 | the specified range and helps you to decide the best parameters for 50 | your problem. 51 | 52 | grid.py directly executes libsvm binaries (so no python binding is needed) 53 | for cross validation and then draw contour of CV accuracy using gnuplot. 54 | You must have libsvm and gnuplot installed before using it. The package 55 | gnuplot is available at http://www.gnuplot.info/ 56 | 57 | On Mac OSX, the precompiled gnuplot file needs the library Aquarterm, 58 | which thus must be installed as well. In addition, this version of 59 | gnuplot does not support png, so you need to change "set term png 60 | transparent small" and use other image formats. For example, you may 61 | have "set term pbm small color". 62 | 63 | Usage: grid.py [grid_options] [svm_options] dataset 64 | 65 | grid_options : 66 | -log2c {begin,end,step | "null"} : set the range of c (default -5,15,2) 67 | begin,end,step -- c_range = 2^{begin,...,begin+k*step,...,end} 68 | "null" -- do not grid with c 69 | -log2g {begin,end,step | "null"} : set the range of g (default 3,-15,-2) 70 | begin,end,step -- g_range = 2^{begin,...,begin+k*step,...,end} 71 | "null" -- do not grid with g 72 | -v n : n-fold cross validation (default 5) 73 | -svmtrain pathname : set svm executable path and name 74 | -gnuplot {pathname | "null"} : 75 | pathname -- set gnuplot executable path and name 76 | "null" -- do not plot 77 | -out {pathname | "null"} : (default dataset.out) 78 | pathname -- set output file path and name 79 | "null" -- do not output file 80 | -png pathname : set graphic output file path and name (default dataset.png) 81 | -resume [pathname] : resume the grid task using an existing output file (default pathname is dataset.out) 82 | Use this option only if some parameters have been checked for the SAME data. 
83 | 84 | svm_options : additional options for svm-train 85 | 86 | The program conducts v-fold cross validation using parameter C (and gamma) 87 | = 2^begin, 2^(begin+step), ..., 2^end. 88 | 89 | You can specify where the libsvm executable and gnuplot are using the 90 | -svmtrain and -gnuplot parameters. 91 | 92 | For windows users, please use pgnuplot.exe. If you are using gnuplot 93 | 3.7.1, please upgrade to version 3.7.3 or higher. The version 3.7.1 94 | has a bug. If you use cygwin on windows, please use gunplot-x11. 95 | 96 | If the task is terminated accidentally or you would like to change the 97 | range of parameters, you can apply '-resume' to save time by re-using 98 | previous results. You may specify the output file of a previous run 99 | or use the default (i.e., dataset.out) without giving a name. Please 100 | note that the same condition must be used in two runs. For example, 101 | you cannot use '-v 10' earlier and resume the task with '-v 5'. 102 | 103 | The value of some options can be "null." For example, `-log2c -1,0,1 104 | -log2 "null"' means that C=2^-1,2^0,2^1 and g=LIBSVM's default gamma 105 | value. That is, you do not conduct parameter selection on gamma. 106 | 107 | Example 108 | ======= 109 | 110 | > python grid.py -log2c -5,5,1 -log2g -4,0,1 -v 5 -m 300 heart_scale 111 | 112 | Users (in particular MS Windows users) may need to specify the path of 113 | executable files. You can either change paths in the beginning of 114 | grid.py or specify them in the command line. For example, 115 | 116 | > grid.py -log2c -5,5,1 -svmtrain "c:\Program Files\libsvm\windows\svm-train.exe" -gnuplot c:\tmp\gnuplot\binary\pgnuplot.exe -v 10 heart_scale 117 | 118 | Output: two files 119 | dataset.png: the CV accuracy contour plot generated by gnuplot 120 | dataset.out: the CV accuracy at each (log2(C),log2(gamma)) 121 | 122 | The following example saves running time by loading the output file of a previous run. 
123 | 124 | > python grid.py -log2c -7,7,1 -log2g -5,2,1 -v 5 -resume heart_scale.out heart_scale 125 | 126 | Parallel grid search 127 | ==================== 128 | 129 | You can conduct a parallel grid search by dispatching jobs to a 130 | cluster of computers which share the same file system. First, you add 131 | machine names in grid.py: 132 | 133 | ssh_workers = ["linux1", "linux5", "linux5"] 134 | 135 | and then setup your ssh so that the authentication works without 136 | asking a password. 137 | 138 | The same machine (e.g., linux5 here) can be listed more than once if 139 | it has multiple CPUs or has more RAM. If the local machine is the 140 | best, you can also enlarge the nr_local_worker. For example: 141 | 142 | nr_local_worker = 2 143 | 144 | Example: 145 | 146 | > python grid.py heart_scale 147 | [local] -1 -1 78.8889 (best c=0.5, g=0.5, rate=78.8889) 148 | [linux5] -1 -7 83.3333 (best c=0.5, g=0.0078125, rate=83.3333) 149 | [linux5] 5 -1 77.037 (best c=0.5, g=0.0078125, rate=83.3333) 150 | [linux1] 5 -7 83.3333 (best c=0.5, g=0.0078125, rate=83.3333) 151 | . 152 | . 153 | . 154 | 155 | If -log2c, -log2g, or -v is not specified, default values are used. 156 | 157 | If your system uses telnet instead of ssh, you list the computer names 158 | in telnet_workers. 159 | 160 | Calling grid in Python 161 | ====================== 162 | 163 | In addition to using grid.py as a command-line tool, you can use it as a 164 | Python module. 165 | 166 | >>> rate, param = find_parameters(dataset, options) 167 | 168 | You need to specify `dataset' and `options' (default ''). See the following example. 169 | 170 | > python 171 | 172 | >>> from grid import * 173 | >>> rate, param = find_parameters('../heart_scale', '-log2c -1,1,1 -log2g -1,1,1') 174 | [local] 0.0 0.0 rate=74.8148 (best c=1.0, g=1.0, rate=74.8148) 175 | [local] 0.0 -1.0 rate=77.037 (best c=1.0, g=0.5, rate=77.037) 176 | . 177 | . 
178 | [local] -1.0 -1.0 rate=78.8889 (best c=0.5, g=0.5, rate=78.8889) 179 | . 180 | . 181 | >>> rate 182 | 78.8889 183 | >>> param 184 | {'c': 0.5, 'g': 0.5} 185 | 186 | 187 | Part III: LIBSVM format checking tools 188 | 189 | Introduction 190 | ============ 191 | 192 | `svm-train' conducts only a simple check of the input data. To do a 193 | detailed check, we provide a python script `checkdata.py.' 194 | 195 | Usage: checkdata.py dataset 196 | 197 | Exit status (returned value): 1 if there are errors, 0 otherwise. 198 | 199 | This tool is written by Rong-En Fan at National Taiwan University. 200 | 201 | Example 202 | ======= 203 | 204 | > cat bad_data 205 | 1 3:1 2:4 206 | > python checkdata.py bad_data 207 | line 1: feature indices must be in an ascending order, previous/current features 3:1 2:4 208 | Found 1 lines with error. 209 | 210 | 211 | -------------------------------------------------------------------------------- /binaries/linux/tools/checkdata.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | # 4 | # A format checker for LIBSVM 5 | # 6 | 7 | # 8 | # Copyright (c) 2007, Rong-En Fan 9 | # 10 | # All rights reserved. 11 | # 12 | # This program is distributed under the same license of the LIBSVM package. 
13 | # 14 | 15 | from sys import argv, exit 16 | import os.path 17 | 18 | def err(line_no, msg): 19 | print("line {0}: {1}".format(line_no, msg)) 20 | 21 | # works like float() but does not accept nan and inf 22 | def my_float(x): 23 | if x.lower().find("nan") != -1 or x.lower().find("inf") != -1: 24 | raise ValueError 25 | 26 | return float(x) 27 | 28 | def main(): 29 | if len(argv) != 2: 30 | print("Usage: {0} dataset".format(argv[0])) 31 | exit(1) 32 | 33 | dataset = argv[1] 34 | 35 | if not os.path.exists(dataset): 36 | print("dataset {0} not found".format(dataset)) 37 | exit(1) 38 | 39 | line_no = 1 40 | error_line_count = 0 41 | for line in open(dataset, 'r'): 42 | line_error = False 43 | 44 | # each line must end with a newline character 45 | if line[-1] != '\n': 46 | err(line_no, "missing a newline character in the end") 47 | line_error = True 48 | 49 | nodes = line.split() 50 | 51 | # check label 52 | try: 53 | label = nodes.pop(0) 54 | 55 | if label.find(',') != -1: 56 | # multi-label format 57 | try: 58 | for l in label.split(','): 59 | l = my_float(l) 60 | except: 61 | err(line_no, "label {0} is not a valid multi-label form".format(label)) 62 | line_error = True 63 | else: 64 | try: 65 | label = my_float(label) 66 | except: 67 | err(line_no, "label {0} is not a number".format(label)) 68 | line_error = True 69 | except: 70 | err(line_no, "missing label, perhaps an empty line?") 71 | line_error = True 72 | 73 | # check features 74 | prev_index = -1 75 | for i in range(len(nodes)): 76 | try: 77 | (index, value) = nodes[i].split(':') 78 | 79 | index = int(index) 80 | value = my_float(value) 81 | 82 | # precomputed kernel's index starts from 0 and LIBSVM 83 | # checks it. Hence, don't treat index 0 as an error. 
84 | if index < 0: 85 | err(line_no, "feature index must be positive; wrong feature {0}".format(nodes[i])) 86 | line_error = True 87 | elif index <= prev_index: 88 | err(line_no, "feature indices must be in an ascending order, previous/current features {0} {1}".format(nodes[i-1], nodes[i])) 89 | line_error = True 90 | prev_index = index 91 | except: 92 | err(line_no, "feature '{0}' not an : pair, integer, real number ".format(nodes[i])) 93 | line_error = True 94 | 95 | line_no += 1 96 | 97 | if line_error: 98 | error_line_count += 1 99 | 100 | if error_line_count > 0: 101 | print("Found {0} lines with error.".format(error_line_count)) 102 | return 1 103 | else: 104 | print("No error.") 105 | return 0 106 | 107 | if __name__ == "__main__": 108 | exit(main()) 109 | -------------------------------------------------------------------------------- /binaries/linux/tools/easy.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | import sys 4 | import os 5 | from subprocess import * 6 | 7 | if len(sys.argv) <= 1: 8 | print('Usage: {0} training_file [testing_file]'.format(sys.argv[0])) 9 | raise SystemExit 10 | 11 | # svm, grid, and gnuplot executable files 12 | 13 | is_win32 = (sys.platform == 'win32') 14 | if not is_win32: 15 | svmscale_exe = "../svm-scale" 16 | svmtrain_exe = "../svm-train-gpu" 17 | svmpredict_exe = "../svm-predict" 18 | grid_py = "./grid.py" 19 | gnuplot_exe = "/usr/bin/gnuplot" 20 | else: 21 | # example for windows 22 | svmscale_exe = r"..\windows\svm-scale.exe" 23 | svmtrain_exe = r"..\windows\svm-train-gpu.exe" 24 | svmpredict_exe = r"..\windows\svm-predict.exe" 25 | gnuplot_exe = r"c:\tmp\gnuplot\binary\pgnuplot.exe" 26 | grid_py = r".\grid.py" 27 | 28 | assert os.path.exists(svmscale_exe),"svm-scale executable not found" 29 | assert os.path.exists(svmtrain_exe),"svm-train executable not found" 30 | assert os.path.exists(svmpredict_exe),"svm-predict executable not found" 31 | assert 
os.path.exists(gnuplot_exe),"gnuplot executable not found" 32 | assert os.path.exists(grid_py),"grid.py not found" 33 | 34 | train_pathname = sys.argv[1] 35 | assert os.path.exists(train_pathname),"training file not found" 36 | file_name = os.path.split(train_pathname)[1] 37 | scaled_file = file_name + ".scale" 38 | model_file = file_name + ".model" 39 | range_file = file_name + ".range" 40 | 41 | if len(sys.argv) > 2: 42 | test_pathname = sys.argv[2] 43 | file_name = os.path.split(test_pathname)[1] 44 | assert os.path.exists(test_pathname),"testing file not found" 45 | scaled_test_file = file_name + ".scale" 46 | predict_test_file = file_name + ".predict" 47 | 48 | cmd = '{0} -s "{1}" "{2}" > "{3}"'.format(svmscale_exe, range_file, train_pathname, scaled_file) 49 | print('Scaling training data...') 50 | Popen(cmd, shell = True, stdout = PIPE).communicate() 51 | 52 | cmd = '{0} -svmtrain "{1}" -gnuplot "{2}" "{3}"'.format(grid_py, svmtrain_exe, gnuplot_exe, scaled_file) 53 | print('Cross validation...') 54 | f = Popen(cmd, shell = True, stdout = PIPE).stdout 55 | 56 | line = '' 57 | while True: 58 | last_line = line 59 | line = f.readline() 60 | if not line: break 61 | c,g,rate = map(float,last_line.split()) 62 | 63 | print('Best c={0}, g={1} CV rate={2}'.format(c,g,rate)) 64 | 65 | cmd = '{0} -c {1} -g {2} "{3}" "{4}"'.format(svmtrain_exe,c,g,scaled_file,model_file) 66 | print('Training...') 67 | Popen(cmd, shell = True, stdout = PIPE).communicate() 68 | 69 | print('Output model: {0}'.format(model_file)) 70 | if len(sys.argv) > 2: 71 | cmd = '{0} -r "{1}" "{2}" > "{3}"'.format(svmscale_exe, range_file, test_pathname, scaled_test_file) 72 | print('Scaling testing data...') 73 | Popen(cmd, shell = True, stdout = PIPE).communicate() 74 | 75 | cmd = '{0} "{1}" "{2}" "{3}"'.format(svmpredict_exe, scaled_test_file, model_file, predict_test_file) 76 | print('Testing...') 77 | Popen(cmd, shell = True).communicate() 78 | 79 | print('Output prediction: 
{0}'.format(predict_test_file)) 80 | -------------------------------------------------------------------------------- /binaries/linux/tools/grid.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __all__ = ['find_parameters'] 3 | 4 | import os, sys, traceback, getpass, time, re 5 | from threading import Thread 6 | from subprocess import * 7 | 8 | if sys.version_info[0] < 3: 9 | from Queue import Queue 10 | else: 11 | from queue import Queue 12 | 13 | telnet_workers = [] 14 | ssh_workers = [] 15 | nr_local_worker = 1 16 | 17 | class GridOption: 18 | def __init__(self, dataset_pathname, options): 19 | dirname = os.path.dirname(__file__) 20 | if sys.platform != 'win32': 21 | self.svmtrain_pathname = os.path.join(dirname, '../svm-train-gpu') 22 | self.gnuplot_pathname = '/usr/bin/gnuplot' 23 | else: 24 | # example for windows 25 | self.svmtrain_pathname = os.path.join(dirname, r'..\windows\svm-train-gpu.exe') 26 | # svmtrain_pathname = r'c:\Program Files\libsvm\windows\svm-train-gpu.exe' 27 | self.gnuplot_pathname = r'c:\tmp\gnuplot\binary\pgnuplot.exe' 28 | self.fold = 5 29 | self.c_begin, self.c_end, self.c_step = -5, 15, 2 30 | self.g_begin, self.g_end, self.g_step = 3, -15, -2 31 | self.grid_with_c, self.grid_with_g = True, True 32 | self.dataset_pathname = dataset_pathname 33 | self.dataset_title = os.path.split(dataset_pathname)[1] 34 | self.out_pathname = '{0}.out'.format(self.dataset_title) 35 | self.png_pathname = '{0}.png'.format(self.dataset_title) 36 | self.pass_through_string = ' ' 37 | self.resume_pathname = None 38 | self.parse_options(options) 39 | 40 | def parse_options(self, options): 41 | if type(options) == str: 42 | options = options.split() 43 | i = 0 44 | pass_through_options = [] 45 | 46 | while i < len(options): 47 | if options[i] == '-log2c': 48 | i = i + 1 49 | if options[i] == 'null': 50 | self.grid_with_c = False 51 | else: 52 | self.c_begin, self.c_end, self.c_step = 
map(float,options[i].split(',')) 53 | elif options[i] == '-log2g': 54 | i = i + 1 55 | if options[i] == 'null': 56 | self.grid_with_g = False 57 | else: 58 | self.g_begin, self.g_end, self.g_step = map(float,options[i].split(',')) 59 | elif options[i] == '-v': 60 | i = i + 1 61 | self.fold = options[i] 62 | elif options[i] in ('-c','-g'): 63 | raise ValueError('Use -log2c and -log2g.') 64 | elif options[i] == '-svmtrain': 65 | i = i + 1 66 | self.svmtrain_pathname = options[i] 67 | elif options[i] == '-gnuplot': 68 | i = i + 1 69 | if options[i] == 'null': 70 | self.gnuplot_pathname = None 71 | else: 72 | self.gnuplot_pathname = options[i] 73 | elif options[i] == '-out': 74 | i = i + 1 75 | if options[i] == 'null': 76 | self.out_pathname = None 77 | else: 78 | self.out_pathname = options[i] 79 | elif options[i] == '-png': 80 | i = i + 1 81 | self.png_pathname = options[i] 82 | elif options[i] == '-resume': 83 | if i == (len(options)-1) or options[i+1].startswith('-'): 84 | self.resume_pathname = self.dataset_title + '.out' 85 | else: 86 | i = i + 1 87 | self.resume_pathname = options[i] 88 | else: 89 | pass_through_options.append(options[i]) 90 | i = i + 1 91 | 92 | self.pass_through_string = ' '.join(pass_through_options) 93 | if not os.path.exists(self.svmtrain_pathname): 94 | raise IOError('svm-train executable not found') 95 | if not os.path.exists(self.dataset_pathname): 96 | raise IOError('dataset not found') 97 | if self.resume_pathname and not os.path.exists(self.resume_pathname): 98 | raise IOError('file for resumption not found') 99 | if not self.grid_with_c and not self.grid_with_g: 100 | raise ValueError('-log2c and -log2g should not be null simultaneously') 101 | if self.gnuplot_pathname and not os.path.exists(self.gnuplot_pathname): 102 | sys.stderr.write('gnuplot executable not found\n') 103 | self.gnuplot_pathname = None 104 | 105 | def redraw(db,best_param,gnuplot,options,tofile=False): 106 | if len(db) == 0: return 107 | begin_level = 
round(max(x[2] for x in db)) - 3 108 | step_size = 0.5 109 | 110 | best_log2c,best_log2g,best_rate = best_param 111 | 112 | # if newly obtained c, g, or cv values are the same, 113 | # then stop redrawing the contour. 114 | if all(x[0] == db[0][0] for x in db): return 115 | if all(x[1] == db[0][1] for x in db): return 116 | if all(x[2] == db[0][2] for x in db): return 117 | 118 | if tofile: 119 | gnuplot.write(b"set term png transparent small linewidth 2 medium enhanced\n") 120 | gnuplot.write("set output \"{0}\"\n".format(options.png_pathname.replace('\\','\\\\')).encode()) 121 | #gnuplot.write(b"set term postscript color solid\n") 122 | #gnuplot.write("set output \"{0}.ps\"\n".format(options.dataset_title).encode().encode()) 123 | elif sys.platform == 'win32': 124 | gnuplot.write(b"set term windows\n") 125 | else: 126 | gnuplot.write( b"set term x11\n") 127 | gnuplot.write(b"set xlabel \"log2(C)\"\n") 128 | gnuplot.write(b"set ylabel \"log2(gamma)\"\n") 129 | gnuplot.write("set xrange [{0}:{1}]\n".format(options.c_begin,options.c_end).encode()) 130 | gnuplot.write("set yrange [{0}:{1}]\n".format(options.g_begin,options.g_end).encode()) 131 | gnuplot.write(b"set contour\n") 132 | gnuplot.write("set cntrparam levels incremental {0},{1},100\n".format(begin_level,step_size).encode()) 133 | gnuplot.write(b"unset surface\n") 134 | gnuplot.write(b"unset ztics\n") 135 | gnuplot.write(b"set view 0,0\n") 136 | gnuplot.write("set title \"{0}\"\n".format(options.dataset_title).encode()) 137 | gnuplot.write(b"unset label\n") 138 | gnuplot.write("set label \"Best log2(C) = {0} log2(gamma) = {1} accuracy = {2}%\" \ 139 | at screen 0.5,0.85 center\n". 
\ 140 | format(best_log2c, best_log2g, best_rate).encode()) 141 | gnuplot.write("set label \"C = {0} gamma = {1}\"" 142 | " at screen 0.5,0.8 center\n".format(2**best_log2c, 2**best_log2g).encode()) 143 | gnuplot.write(b"set key at screen 0.9,0.9\n") 144 | gnuplot.write(b"splot \"-\" with lines\n") 145 | 146 | db.sort(key = lambda x:(x[0], -x[1])) 147 | 148 | prevc = db[0][0] 149 | for line in db: 150 | if prevc != line[0]: 151 | gnuplot.write(b"\n") 152 | prevc = line[0] 153 | gnuplot.write("{0[0]} {0[1]} {0[2]}\n".format(line).encode()) 154 | gnuplot.write(b"e\n") 155 | gnuplot.write(b"\n") # force gnuplot back to prompt when term set failure 156 | gnuplot.flush() 157 | 158 | 159 | def calculate_jobs(options): 160 | 161 | def range_f(begin,end,step): 162 | # like range, but works on non-integer too 163 | seq = [] 164 | while True: 165 | if step > 0 and begin > end: break 166 | if step < 0 and begin < end: break 167 | seq.append(begin) 168 | begin = begin + step 169 | return seq 170 | 171 | def permute_sequence(seq): 172 | n = len(seq) 173 | if n <= 1: return seq 174 | 175 | mid = int(n/2) 176 | left = permute_sequence(seq[:mid]) 177 | right = permute_sequence(seq[mid+1:]) 178 | 179 | ret = [seq[mid]] 180 | while left or right: 181 | if left: ret.append(left.pop(0)) 182 | if right: ret.append(right.pop(0)) 183 | 184 | return ret 185 | 186 | 187 | c_seq = permute_sequence(range_f(options.c_begin,options.c_end,options.c_step)) 188 | g_seq = permute_sequence(range_f(options.g_begin,options.g_end,options.g_step)) 189 | 190 | if not options.grid_with_c: 191 | c_seq = [None] 192 | if not options.grid_with_g: 193 | g_seq = [None] 194 | 195 | nr_c = float(len(c_seq)) 196 | nr_g = float(len(g_seq)) 197 | i, j = 0, 0 198 | jobs = [] 199 | 200 | while i < nr_c or j < nr_g: 201 | if i/nr_c < j/nr_g: 202 | # increase C resolution 203 | line = [] 204 | for k in range(0,j): 205 | line.append((c_seq[i],g_seq[k])) 206 | i = i + 1 207 | jobs.append(line) 208 | else: 209 | # 
increase g resolution 210 | line = [] 211 | for k in range(0,i): 212 | line.append((c_seq[k],g_seq[j])) 213 | j = j + 1 214 | jobs.append(line) 215 | 216 | resumed_jobs = {} 217 | 218 | if options.resume_pathname is None: 219 | return jobs, resumed_jobs 220 | 221 | for line in open(options.resume_pathname, 'r'): 222 | line = line.strip() 223 | rst = re.findall(r'rate=([0-9.]+)',line) 224 | if not rst: 225 | continue 226 | rate = float(rst[0]) 227 | 228 | c, g = None, None 229 | rst = re.findall(r'log2c=([0-9.-]+)',line) 230 | if rst: 231 | c = float(rst[0]) 232 | rst = re.findall(r'log2g=([0-9.-]+)',line) 233 | if rst: 234 | g = float(rst[0]) 235 | 236 | resumed_jobs[(c,g)] = rate 237 | 238 | return jobs, resumed_jobs 239 | 240 | 241 | class WorkerStopToken: # used to notify the worker to stop or if a worker is dead 242 | pass 243 | 244 | class Worker(Thread): 245 | def __init__(self,name,job_queue,result_queue,options): 246 | Thread.__init__(self) 247 | self.name = name 248 | self.job_queue = job_queue 249 | self.result_queue = result_queue 250 | self.options = options 251 | 252 | def run(self): 253 | while True: 254 | (cexp,gexp) = self.job_queue.get() 255 | if cexp is WorkerStopToken: 256 | self.job_queue.put((cexp,gexp)) 257 | # print('worker {0} stop.'.format(self.name)) 258 | break 259 | try: 260 | c, g = None, None 261 | if cexp != None: 262 | c = 2.0**cexp 263 | if gexp != None: 264 | g = 2.0**gexp 265 | rate = self.run_one(c,g) 266 | if rate is None: raise RuntimeError('get no rate') 267 | except: 268 | # we failed, let others do that and we just quit 269 | 270 | traceback.print_exception(sys.exc_info()[0], sys.exc_info()[1], sys.exc_info()[2]) 271 | 272 | self.job_queue.put((cexp,gexp)) 273 | sys.stderr.write('worker {0} quit.\n'.format(self.name)) 274 | break 275 | else: 276 | self.result_queue.put((self.name,cexp,gexp,rate)) 277 | 278 | def get_cmd(self,c,g): 279 | options=self.options 280 | cmdline = options.svmtrain_pathname 281 | if 
options.grid_with_c: 282 | cmdline += ' -c {0} '.format(c) 283 | if options.grid_with_g: 284 | cmdline += ' -g {0} '.format(g) 285 | cmdline += ' -v {0} {1} {2} '.format\ 286 | (options.fold,options.pass_through_string,options.dataset_pathname) 287 | return cmdline 288 | 289 | class LocalWorker(Worker): 290 | def run_one(self,c,g): 291 | cmdline = self.get_cmd(c,g) 292 | result = Popen(cmdline,shell=True,stdout=PIPE,stderr=PIPE,stdin=PIPE).stdout 293 | for line in result.readlines(): 294 | if str(line).find('Cross') != -1: 295 | return float(line.split()[-1][0:-1]) 296 | 297 | class SSHWorker(Worker): 298 | def __init__(self,name,job_queue,result_queue,host,options): 299 | Worker.__init__(self,name,job_queue,result_queue,options) 300 | self.host = host 301 | self.cwd = os.getcwd() 302 | def run_one(self,c,g): 303 | cmdline = 'ssh -x -t -t {0} "cd {1}; {2}"'.format\ 304 | (self.host,self.cwd,self.get_cmd(c,g)) 305 | result = Popen(cmdline,shell=True,stdout=PIPE,stderr=PIPE,stdin=PIPE).stdout 306 | for line in result.readlines(): 307 | if str(line).find('Cross') != -1: 308 | return float(line.split()[-1][0:-1]) 309 | 310 | class TelnetWorker(Worker): 311 | def __init__(self,name,job_queue,result_queue,host,username,password,options): 312 | Worker.__init__(self,name,job_queue,result_queue,options) 313 | self.host = host 314 | self.username = username 315 | self.password = password 316 | def run(self): 317 | import telnetlib 318 | self.tn = tn = telnetlib.Telnet(self.host) 319 | tn.read_until('login: ') 320 | tn.write(self.username + '\n') 321 | tn.read_until('Password: ') 322 | tn.write(self.password + '\n') 323 | 324 | # XXX: how to know whether login is successful? 
325 | tn.read_until(self.username)
326 | #
327 | print('login ok', self.host)
328 | tn.write('cd '+os.getcwd()+'\n')
329 | Worker.run(self)
330 | tn.write('exit\n')
331 | def run_one(self,c,g):
332 | cmdline = self.get_cmd(c,g)
333 | result = self.tn.write(cmdline+'\n')
334 | (idx,matchm,output) = self.tn.expect(['Cross.*\n'])
335 | for line in output.split('\n'):
336 | if str(line).find('Cross') != -1:
337 | return float(line.split()[-1][0:-1])
338 |
339 | def find_parameters(dataset_pathname, options=''):
340 |
341 | def update_param(c,g,rate,best_c,best_g,best_rate,worker,resumed):
342 | if (rate > best_rate) or (rate==best_rate and g==best_g and c < best_c):
343 | best_rate,best_c,best_g = rate,c,g
-------------------------------------------------------------------------------- /binaries/linux/tools/subset.py: --------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 |
3 | import os, sys, math, random
4 | from collections import defaultdict
5 |
6 | if sys.version_info[0] >= 3:
7 | xrange = range
8 |
9 | def exit_with_help(argv):
10 | print("""\
11 | Usage: {0} [options] dataset subset_size [output1] [output2]
12 |
13 | This script randomly selects a subset of the dataset.
14 |
15 | options:
16 | -s method : method of selection (default 0)
17 | 0 -- stratified selection (classification only)
18 | 1 -- random selection
19 |
20 | output1 : the subset (optional)
21 | output2 : rest of the data (optional)
22 | If output1 is omitted, the subset will be printed on the screen.""".format(argv[0]))
23 | exit(1)
24 |
25 | def process_options(argv):
26 | argc = len(argv)
27 | if argc < 3:
28 | exit_with_help(argv)
29 |
30 | # default method is stratified selection
31 | method = 0
32 | subset_file = sys.stdout
33 | rest_file = None
34 |
35 | i = 1
36 | while i < argc:
37 | if argv[i][0] != "-":
38 | break
39 | if argv[i] == "-s":
40 | i = i + 1
41 | method = int(argv[i])
42 | if method not in [0,1]:
43 | print("Unknown selection method {0}".format(method))
44 | exit_with_help(argv)
45 | i = i + 1
46 |
47 | dataset = argv[i]
48 | subset_size = int(argv[i+1])
49 | if i+2 < argc:
50 | subset_file = open(argv[i+2],'w')
51 | if i+3 < argc:
52 | rest_file = open(argv[i+3],'w')
53 |
54 | return dataset, subset_size, method, subset_file, rest_file
55 |
56 | def random_selection(dataset, subset_size):
57 | l
= sum(1 for line in open(dataset,'r')) 58 | return sorted(random.sample(xrange(l), subset_size)) 59 | 60 | def stratified_selection(dataset, subset_size): 61 | labels = [line.split(None,1)[0] for line in open(dataset)] 62 | label_linenums = defaultdict(list) 63 | for i, label in enumerate(labels): 64 | label_linenums[label] += [i] 65 | 66 | l = len(labels) 67 | remaining = subset_size 68 | ret = [] 69 | 70 | # classes with fewer data are sampled first; otherwise 71 | # some rare classes may not be selected 72 | for label in sorted(label_linenums, key=lambda x: len(label_linenums[x])): 73 | linenums = label_linenums[label] 74 | label_size = len(linenums) 75 | # at least one instance per class 76 | s = int(min(remaining, max(1, math.ceil(label_size*(float(subset_size)/l))))) 77 | if s == 0: 78 | sys.stderr.write('''\ 79 | Error: failed to have at least one instance per class 80 | 1. You may have regression data. 81 | 2. Your classification data is unbalanced or too small. 82 | Please use -s 1. 
83 | ''') 84 | sys.exit(-1) 85 | remaining -= s 86 | ret += [linenums[i] for i in random.sample(xrange(label_size), s)] 87 | return sorted(ret) 88 | 89 | def main(argv=sys.argv): 90 | dataset, subset_size, method, subset_file, rest_file = process_options(argv) 91 | #uncomment the following line to fix the random seed 92 | #random.seed(0) 93 | selected_lines = [] 94 | 95 | if method == 0: 96 | selected_lines = stratified_selection(dataset, subset_size) 97 | elif method == 1: 98 | selected_lines = random_selection(dataset, subset_size) 99 | 100 | #select instances based on selected_lines 101 | dataset = open(dataset,'r') 102 | prev_selected_linenum = -1 103 | for i in xrange(len(selected_lines)): 104 | for cnt in xrange(selected_lines[i]-prev_selected_linenum-1): 105 | line = dataset.readline() 106 | if rest_file: 107 | rest_file.write(line) 108 | subset_file.write(dataset.readline()) 109 | prev_selected_linenum = selected_lines[i] 110 | subset_file.close() 111 | 112 | if rest_file: 113 | for line in dataset: 114 | rest_file.write(line) 115 | rest_file.close() 116 | dataset.close() 117 | 118 | if __name__ == '__main__': 119 | main(sys.argv) 120 | 121 | -------------------------------------------------------------------------------- /binaries/windows/x64/COPYRIGHT: -------------------------------------------------------------------------------- 1 | 2 | Copyright (c) 2000-2013 Chih-Chung Chang and Chih-Jen Lin 3 | All rights reserved. 4 | 5 | Redistribution and use in source and binary forms, with or without 6 | modification, are permitted provided that the following conditions 7 | are met: 8 | 9 | 1. Redistributions of source code must retain the above copyright 10 | notice, this list of conditions and the following disclaimer. 11 | 12 | 2. Redistributions in binary form must reproduce the above copyright 13 | notice, this list of conditions and the following disclaimer in the 14 | documentation and/or other materials provided with the distribution. 15 | 16 | 3. 
Neither name of copyright holders nor the names of its contributors
17 | may be used to endorse or promote products derived from this software
18 | without specific prior written permission.
19 |
20 |
21 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
22 | ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
23 | LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
24 | A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR
25 | CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
26 | EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
27 | PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
28 | PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
29 | LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
30 | NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
31 | SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
32 |
-------------------------------------------------------------------------------- /binaries/windows/x64/README-GPU: --------------------------------------------------------------------------------
1 | GPU-Accelerated LIBSVM exploits the GPU, through the CUDA interface, to
2 | speed up the training process. This package contains a new executable for
3 | training classifiers, "svm-train-gpu.exe", alongside the original one.
4 | The use of the new executable is exactly the same as with the original one.
5 |
6 | This executable was built with the CUBLAS API version 2, which is compatible with SDKs 4.0 and up.
7 |
8 | To test the executable "svm-train-gpu" you can run the easy.py script, which is located in the "tools" folder.
9 | To observe the speed improvement of GPU over CPU execution, we provide a relatively large custom dataset (train_set) that can be used as input to easy.py.
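Since the GPU build accepts exactly the same flags as the stock svm-train, a CPU-versus-GPU comparison only needs the binary name swapped. A minimal timing harness sketch (the binary and dataset paths are illustrative, taken from this package's layout; the trainer is not invoked unless you uncomment the last lines):

```python
import subprocess, time

def train_command(binary, dataset, model):
    # The GPU trainer is flag-compatible with svm-train, so only
    # the executable name differs between the two runs.
    return [binary, dataset, model]

def timed_run(cmd):
    # Return wall-clock seconds for one training run.
    start = time.time()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return time.time() - start

# Uncomment to compare on the bundled train_set:
# cpu = timed_run(train_command("./svm-train", "train_set", "cpu.model"))
# gpu = timed_run(train_command("./svm-train-gpu", "train_set", "gpu.model"))
# print("speed-up: {0:.1f}x".format(cpu / gpu))
```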
10 | 11 | FEATURES 12 | 13 | Mode Supported 14 | 15 | * c-svc classification with RBF kernel 16 | 17 | Functionality / User interface 18 | 19 | * Same as LIBSVM 20 | 21 | 22 | PREREQUISITES 23 | 24 | * NVIDIA Graphics card with CUDA support 25 | * Latest NVIDIA drivers for GPU 26 | 27 | Additional Information 28 | ====================== 29 | 30 | If you find GPU-Accelerated LIBSVM helpful, please cite it as 31 | 32 | A. Athanasopoulos, A. Dimou, V. Mezaris, I. Kompatsiaris, "GPU Acceleration for Support Vector Machines", 33 | Proc. 12th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS 2011), Delft, The Netherlands, April 2011. 34 | 35 | Software available at http://mklab.iti.gr/project/GPU-LIBSVM 36 | -------------------------------------------------------------------------------- /binaries/windows/x64/tools/README: -------------------------------------------------------------------------------- 1 | This directory includes some useful codes: 2 | 3 | 1. subset selection tools. 4 | 2. parameter selection tools. 5 | 3. LIBSVM format checking tools 6 | 7 | Part I: Subset selection tools 8 | 9 | Introduction 10 | ============ 11 | 12 | Training large data is time consuming. Sometimes one should work on a 13 | smaller subset first. The python script subset.py randomly selects a 14 | specified number of samples. For classification data, we provide a 15 | stratified selection to ensure the same class distribution in the 16 | subset. 17 | 18 | Usage: subset.py [options] dataset number [output1] [output2] 19 | 20 | This script selects a subset of the given data set. 21 | 22 | options: 23 | -s method : method of selection (default 0) 24 | 0 -- stratified selection (classification only) 25 | 1 -- random selection 26 | 27 | output1 : the subset (optional) 28 | output2 : the rest of data (optional) 29 | 30 | If output1 is omitted, the subset will be printed on the screen. 
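The stratified strategy described above (sample each class in proportion to its size, keeping at least one instance per class, rarest classes first) can be sketched in a few lines. This is a simplified restatement of subset.py's logic, not the script itself:

```python
import math, random
from collections import defaultdict

def stratified_sample(labels, subset_size):
    # Group line numbers by class label.
    by_label = defaultdict(list)
    for i, label in enumerate(labels):
        by_label[label].append(i)
    total, remaining, chosen = len(labels), subset_size, []
    # Rare classes go first, so they are not squeezed out of the budget.
    for label in sorted(by_label, key=lambda x: len(by_label[x])):
        lines = by_label[label]
        s = min(remaining,
                max(1, int(math.ceil(len(lines) * float(subset_size) / total))))
        remaining -= s
        chosen += random.sample(lines, s)
    return sorted(chosen)
```

On a 90/10 class split with a budget of 10, this draws roughly 9 lines from the large class and at least 1 from the small one.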
31 |
32 | Example
33 | =======
34 |
35 | > python subset.py heart_scale 100 file1 file2
36 |
37 | From heart_scale, 100 samples are randomly selected and stored in
38 | file1. All remaining instances are stored in file2.
39 |
40 |
41 | Part II: Parameter Selection Tools
42 |
43 | Introduction
44 | ============
45 |
46 | grid.py is a parameter selection tool for C-SVM classification using
47 | the RBF (radial basis function) kernel. It uses the cross validation (CV)
48 | technique to estimate the accuracy of each parameter combination in
49 | the specified range and helps you decide the best parameters for
50 | your problem.
51 |
52 | grid.py directly executes libsvm binaries (so no python binding is needed)
53 | for cross validation and then draws a contour of CV accuracy using gnuplot.
54 | You must have libsvm and gnuplot installed before using it. The package
55 | gnuplot is available at http://www.gnuplot.info/
56 |
57 | On Mac OS X, the precompiled gnuplot binary needs the AquaTerm library,
58 | which thus must be installed as well. In addition, this version of
59 | gnuplot does not support png, so you need to change "set term png
60 | transparent small" and use other image formats. For example, you may
61 | have "set term pbm small color".
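The search space grid.py explores is every combination of C and gamma over power-of-two ranges. With the default ranges (-log2c -5,15,2 and -log2g 3,-15,-2, as documented below), the grid can be enumerated like this:

```python
def log2_range(begin, end, step):
    # Like range(), but inclusive of `end` and tolerant of negative steps,
    # mirroring grid.py's range_f helper.
    seq, v = [], begin
    while (step > 0 and v <= end) or (step < 0 and v >= end):
        seq.append(v)
        v += step
    return seq

c_values = [2.0 ** e for e in log2_range(-5, 15, 2)]   # 11 candidate C values
g_values = [2.0 ** e for e in log2_range(3, -15, -2)]  # 10 candidate gamma values
grid = [(c, g) for c in c_values for g in g_values]    # 110 (C, gamma) pairs
```

Each of the 110 pairs costs one v-fold cross validation run, which is why grid.py supports resuming and parallel workers.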
62 |
63 | Usage: grid.py [grid_options] [svm_options] dataset
64 |
65 | grid_options :
66 | -log2c {begin,end,step | "null"} : set the range of c (default -5,15,2)
67 | begin,end,step -- c_range = 2^{begin,...,begin+k*step,...,end}
68 | "null" -- do not grid with c
69 | -log2g {begin,end,step | "null"} : set the range of g (default 3,-15,-2)
70 | begin,end,step -- g_range = 2^{begin,...,begin+k*step,...,end}
71 | "null" -- do not grid with g
72 | -v n : n-fold cross validation (default 5)
73 | -svmtrain pathname : set svm executable path and name
74 | -gnuplot {pathname | "null"} :
75 | pathname -- set gnuplot executable path and name
76 | "null" -- do not plot
77 | -out {pathname | "null"} : (default dataset.out)
78 | pathname -- set output file path and name
79 | "null" -- do not output file
80 | -png pathname : set graphic output file path and name (default dataset.png)
81 | -resume [pathname] : resume the grid task using an existing output file (default pathname is dataset.out)
82 | Use this option only if some parameters have been checked for the SAME data.
83 |
84 | svm_options : additional options for svm-train
85 |
86 | The program conducts v-fold cross validation using parameter C (and gamma)
87 | = 2^begin, 2^(begin+step), ..., 2^end.
88 |
89 | You can specify where the libsvm executable and gnuplot are using the
90 | -svmtrain and -gnuplot parameters.
91 |
92 | For Windows users, please use pgnuplot.exe. If you are using gnuplot
93 | 3.7.1, please upgrade to version 3.7.3 or higher, since version 3.7.1
94 | has a bug. If you use cygwin on Windows, please use gnuplot-x11.
95 |
96 | If the task is terminated accidentally or you would like to change the
97 | range of parameters, you can apply '-resume' to save time by re-using
98 | previous results. You may specify the output file of a previous run
99 | or use the default (i.e., dataset.out) without giving a name. Please
100 | note that the same conditions must be used in both runs. For example,
101 | you cannot use '-v 10' earlier and resume the task with '-v 5'.
102 |
103 | The value of some options can be "null". For example, `-log2c -1,0,1
104 | -log2g "null"' means that C=2^-1,2^0,2^1 and g=LIBSVM's default gamma
105 | value. That is, you do not conduct parameter selection on gamma.
106 |
107 | Example
108 | =======
109 |
110 | > python grid.py -log2c -5,5,1 -log2g -4,0,1 -v 5 -m 300 heart_scale
111 |
112 | Users (in particular MS Windows users) may need to specify the path of
113 | executable files. You can either change paths in the beginning of
114 | grid.py or specify them in the command line. For example,
115 |
116 | > grid.py -log2c -5,5,1 -svmtrain "c:\Program Files\libsvm\windows\svm-train.exe" -gnuplot c:\tmp\gnuplot\binary\pgnuplot.exe -v 10 heart_scale
117 |
118 | Output: two files
119 | dataset.png: the CV accuracy contour plot generated by gnuplot
120 | dataset.out: the CV accuracy at each (log2(C),log2(gamma))
121 |
122 | The following example saves running time by loading the output file of a previous run.
123 |
124 | > python grid.py -log2c -7,7,1 -log2g -5,2,1 -v 5 -resume heart_scale.out heart_scale
125 |
126 | Parallel grid search
127 | ====================
128 |
129 | You can conduct a parallel grid search by dispatching jobs to a
130 | cluster of computers which share the same file system. First, you add
131 | machine names in grid.py:
132 |
133 | ssh_workers = ["linux1", "linux5", "linux5"]
134 |
135 | and then set up your ssh so that the authentication works without
136 | asking a password.
137 |
138 | The same machine (e.g., linux5 here) can be listed more than once if
139 | it has multiple CPUs or has more RAM. If the local machine is the
140 | best, you can also enlarge the nr_local_worker.
For example: 141 | 142 | nr_local_worker = 2 143 | 144 | Example: 145 | 146 | > python grid.py heart_scale 147 | [local] -1 -1 78.8889 (best c=0.5, g=0.5, rate=78.8889) 148 | [linux5] -1 -7 83.3333 (best c=0.5, g=0.0078125, rate=83.3333) 149 | [linux5] 5 -1 77.037 (best c=0.5, g=0.0078125, rate=83.3333) 150 | [linux1] 5 -7 83.3333 (best c=0.5, g=0.0078125, rate=83.3333) 151 | . 152 | . 153 | . 154 | 155 | If -log2c, -log2g, or -v is not specified, default values are used. 156 | 157 | If your system uses telnet instead of ssh, you list the computer names 158 | in telnet_workers. 159 | 160 | Calling grid in Python 161 | ====================== 162 | 163 | In addition to using grid.py as a command-line tool, you can use it as a 164 | Python module. 165 | 166 | >>> rate, param = find_parameters(dataset, options) 167 | 168 | You need to specify `dataset' and `options' (default ''). See the following example. 169 | 170 | > python 171 | 172 | >>> from grid import * 173 | >>> rate, param = find_parameters('../heart_scale', '-log2c -1,1,1 -log2g -1,1,1') 174 | [local] 0.0 0.0 rate=74.8148 (best c=1.0, g=1.0, rate=74.8148) 175 | [local] 0.0 -1.0 rate=77.037 (best c=1.0, g=0.5, rate=77.037) 176 | . 177 | . 178 | [local] -1.0 -1.0 rate=78.8889 (best c=0.5, g=0.5, rate=78.8889) 179 | . 180 | . 181 | >>> rate 182 | 78.8889 183 | >>> param 184 | {'c': 0.5, 'g': 0.5} 185 | 186 | 187 | Part III: LIBSVM format checking tools 188 | 189 | Introduction 190 | ============ 191 | 192 | `svm-train' conducts only a simple check of the input data. To do a 193 | detailed check, we provide a python script `checkdata.py.' 194 | 195 | Usage: checkdata.py dataset 196 | 197 | Exit status (returned value): 1 if there are errors, 0 otherwise. 198 | 199 | This tool is written by Rong-En Fan at National Taiwan University. 
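The core ascending-index check that checkdata.py performs on each data line can be sketched as follows. This is a simplified illustration of the idea, not the full script, which additionally validates labels, rejects nan/inf values, and reports line numbers:

```python
def check_line(line):
    # Return error messages for one line in LIBSVM format:
    # "<label> <index1>:<value1> <index2>:<value2> ..."
    errors = []
    nodes = line.split()[1:]  # drop the label
    prev_index = -1
    for prev, node in zip([None] + nodes, nodes):
        try:
            index, value = node.split(':')
            index, value = int(index), float(value)
        except ValueError:
            errors.append("feature '{0}' is not an <index>:<value> pair".format(node))
            continue
        if index <= prev_index:
            errors.append("indices must be ascending, previous/current features "
                          "{0} {1}".format(prev, node))
        prev_index = index
    return errors
```

Running it on the bad_data line from the example below, check_line("1 3:1 2:4") reports the out-of-order pair, while a well-formed line produces no errors.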
200 | 201 | Example 202 | ======= 203 | 204 | > cat bad_data 205 | 1 3:1 2:4 206 | > python checkdata.py bad_data 207 | line 1: feature indices must be in an ascending order, previous/current features 3:1 2:4 208 | Found 1 lines with error. 209 | 210 | 211 | -------------------------------------------------------------------------------- /binaries/windows/x64/tools/checkdata.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | # 4 | # A format checker for LIBSVM 5 | # 6 | 7 | # 8 | # Copyright (c) 2007, Rong-En Fan 9 | # 10 | # All rights reserved. 11 | # 12 | # This program is distributed under the same license of the LIBSVM package. 13 | # 14 | 15 | from sys import argv, exit 16 | import os.path 17 | 18 | def err(line_no, msg): 19 | print("line {0}: {1}".format(line_no, msg)) 20 | 21 | # works like float() but does not accept nan and inf 22 | def my_float(x): 23 | if x.lower().find("nan") != -1 or x.lower().find("inf") != -1: 24 | raise ValueError 25 | 26 | return float(x) 27 | 28 | def main(): 29 | if len(argv) != 2: 30 | print("Usage: {0} dataset".format(argv[0])) 31 | exit(1) 32 | 33 | dataset = argv[1] 34 | 35 | if not os.path.exists(dataset): 36 | print("dataset {0} not found".format(dataset)) 37 | exit(1) 38 | 39 | line_no = 1 40 | error_line_count = 0 41 | for line in open(dataset, 'r'): 42 | line_error = False 43 | 44 | # each line must end with a newline character 45 | if line[-1] != '\n': 46 | err(line_no, "missing a newline character in the end") 47 | line_error = True 48 | 49 | nodes = line.split() 50 | 51 | # check label 52 | try: 53 | label = nodes.pop(0) 54 | 55 | if label.find(',') != -1: 56 | # multi-label format 57 | try: 58 | for l in label.split(','): 59 | l = my_float(l) 60 | except: 61 | err(line_no, "label {0} is not a valid multi-label form".format(label)) 62 | line_error = True 63 | else: 64 | try: 65 | label = my_float(label) 66 | except: 67 | err(line_no, "label {0} is 
not a number".format(label))
68 | line_error = True
69 | except:
70 | err(line_no, "missing label, perhaps an empty line?")
71 | line_error = True
72 |
73 | # check features
74 | prev_index = -1
75 | for i in range(len(nodes)):
76 | try:
77 | (index, value) = nodes[i].split(':')
78 |
79 | index = int(index)
80 | value = my_float(value)
81 |
82 | # precomputed kernel's index starts from 0 and LIBSVM
83 | # checks it. Hence, don't treat index 0 as an error.
84 | if index < 0:
85 | err(line_no, "feature index must be positive; wrong feature {0}".format(nodes[i]))
86 | line_error = True
87 | elif index <= prev_index:
88 | err(line_no, "feature indices must be in an ascending order, previous/current features {0} {1}".format(nodes[i-1], nodes[i]))
89 | line_error = True
90 | prev_index = index
91 | except:
92 | err(line_no, "feature '{0}' not an <index>:<value> pair, <index> integer, <value> real number ".format(nodes[i]))
93 | line_error = True
94 |
95 | line_no += 1
96 |
97 | if line_error:
98 | error_line_count += 1
99 |
100 | if error_line_count > 0:
101 | print("Found {0} lines with error.".format(error_line_count))
102 | return 1
103 | else:
104 | print("No error.")
105 | return 0
106 |
107 | if __name__ == "__main__":
108 | exit(main())
109 |
-------------------------------------------------------------------------------- /binaries/windows/x64/tools/easy.py: --------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 |
3 | import sys
4 | import os
5 | from subprocess import *
6 |
7 | if len(sys.argv) <= 1:
8 | print('Usage: {0} training_file [testing_file]'.format(sys.argv[0]))
9 | raise SystemExit
10 |
11 | # svm, grid, and gnuplot executable files
12 |
13 | is_win32 = (sys.platform == 'win32')
14 | if not is_win32:
15 | svmscale_exe = "../svm-scale"
16 | svmtrain_exe = "../svm-train-gpu"
17 | svmpredict_exe = "../svm-predict"
18 | grid_py = "./grid.py"
19 | gnuplot_exe = "/usr/bin/gnuplot"
20 | else:
21 | # example for windows
22 |
svmscale_exe = r"..\windows\svm-scale.exe" 23 | svmtrain_exe = r"..\windows\svm-train-gpu.exe" 24 | svmpredict_exe = r"..\windows\svm-predict.exe" 25 | gnuplot_exe = r"C:\Program Files (x86)\gnuplot\bin\pgnuplot.exe" 26 | grid_py = r".\grid.py" 27 | 28 | assert os.path.exists(svmscale_exe),"svm-scale executable not found" 29 | assert os.path.exists(svmtrain_exe),"svm-train-gpu executable not found" 30 | assert os.path.exists(svmpredict_exe),"svm-predict executable not found" 31 | assert os.path.exists(gnuplot_exe),"gnuplot executable not found" 32 | assert os.path.exists(grid_py),"grid.py not found" 33 | 34 | train_pathname = sys.argv[1] 35 | assert os.path.exists(train_pathname),"training file not found" 36 | file_name = os.path.split(train_pathname)[1] 37 | scaled_file = file_name + ".scale" 38 | model_file = file_name + ".model" 39 | range_file = file_name + ".range" 40 | 41 | if len(sys.argv) > 2: 42 | test_pathname = sys.argv[2] 43 | file_name = os.path.split(test_pathname)[1] 44 | assert os.path.exists(test_pathname),"testing file not found" 45 | scaled_test_file = file_name + ".scale" 46 | predict_test_file = file_name + ".predict" 47 | 48 | cmd = '{0} -s "{1}" "{2}" > "{3}"'.format(svmscale_exe, range_file, train_pathname, scaled_file) 49 | print('Scaling training data...') 50 | Popen(cmd, shell = True, stdout = PIPE).communicate() 51 | 52 | cmd = '{0} -svmtrain "{1}" -gnuplot "{2}" "{3}"'.format(grid_py, svmtrain_exe, gnuplot_exe, scaled_file) 53 | print('Cross validation...') 54 | f = Popen(cmd, shell = True, stdout = PIPE).stdout 55 | 56 | line = '' 57 | while True: 58 | last_line = line 59 | line = f.readline() 60 | if not line: break 61 | c,g,rate = map(float,last_line.split()) 62 | 63 | print('Best c={0}, g={1} CV rate={2}'.format(c,g,rate)) 64 | 65 | cmd = '{0} -c {1} -g {2} "{3}" "{4}"'.format(svmtrain_exe,c,g,scaled_file,model_file) 66 | print('Training...') 67 | Popen(cmd, shell = True, stdout = PIPE).communicate() 68 | 69 | print('Output model: 
{0}'.format(model_file)) 70 | if len(sys.argv) > 2: 71 | cmd = '{0} -r "{1}" "{2}" > "{3}"'.format(svmscale_exe, range_file, test_pathname, scaled_test_file) 72 | print('Scaling testing data...') 73 | Popen(cmd, shell = True, stdout = PIPE).communicate() 74 | 75 | cmd = '{0} "{1}" "{2}" "{3}"'.format(svmpredict_exe, scaled_test_file, model_file, predict_test_file) 76 | print('Testing...') 77 | Popen(cmd, shell = True).communicate() 78 | 79 | print('Output prediction: {0}'.format(predict_test_file)) 80 | -------------------------------------------------------------------------------- /binaries/windows/x64/tools/grid.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __all__ = ['find_parameters'] 3 | 4 | import os, sys, traceback, getpass, time, re 5 | from threading import Thread 6 | from subprocess import * 7 | 8 | if sys.version_info[0] < 3: 9 | from Queue import Queue 10 | else: 11 | from queue import Queue 12 | 13 | telnet_workers = [] 14 | ssh_workers = [] 15 | nr_local_worker = 1 16 | 17 | class GridOption: 18 | def __init__(self, dataset_pathname, options): 19 | dirname = os.path.dirname(__file__) 20 | if sys.platform != 'win32': 21 | self.svmtrain_pathname = os.path.join(dirname, '../svm-train') 22 | self.gnuplot_pathname = '/usr/bin/gnuplot' 23 | else: 24 | # example for windows 25 | self.svmtrain_pathname = os.path.join(dirname, r'..\windows\svm-train.exe') 26 | # svmtrain_pathname = r'c:\Program Files\libsvm\windows\svm-train.exe' 27 | self.gnuplot_pathname = r'C:\Program Files (x86)\gnuplot\bin\pgnuplot.exe' 28 | self.fold = 5 29 | self.c_begin, self.c_end, self.c_step = -5, 15, 2 30 | self.g_begin, self.g_end, self.g_step = 3, -15, -2 31 | self.grid_with_c, self.grid_with_g = True, True 32 | self.dataset_pathname = dataset_pathname 33 | self.dataset_title = os.path.split(dataset_pathname)[1] 34 | self.out_pathname = '{0}.out'.format(self.dataset_title) 35 | self.png_pathname = 
'{0}.png'.format(self.dataset_title) 36 | self.pass_through_string = ' ' 37 | self.resume_pathname = None 38 | self.parse_options(options) 39 | 40 | def parse_options(self, options): 41 | if type(options) == str: 42 | options = options.split() 43 | i = 0 44 | pass_through_options = [] 45 | 46 | while i < len(options): 47 | if options[i] == '-log2c': 48 | i = i + 1 49 | if options[i] == 'null': 50 | self.grid_with_c = False 51 | else: 52 | self.c_begin, self.c_end, self.c_step = map(float,options[i].split(',')) 53 | elif options[i] == '-log2g': 54 | i = i + 1 55 | if options[i] == 'null': 56 | self.grid_with_g = False 57 | else: 58 | self.g_begin, self.g_end, self.g_step = map(float,options[i].split(',')) 59 | elif options[i] == '-v': 60 | i = i + 1 61 | self.fold = options[i] 62 | elif options[i] in ('-c','-g'): 63 | raise ValueError('Use -log2c and -log2g.') 64 | elif options[i] == '-svmtrain': 65 | i = i + 1 66 | self.svmtrain_pathname = options[i] 67 | elif options[i] == '-gnuplot': 68 | i = i + 1 69 | if options[i] == 'null': 70 | self.gnuplot_pathname = None 71 | else: 72 | self.gnuplot_pathname = options[i] 73 | elif options[i] == '-out': 74 | i = i + 1 75 | if options[i] == 'null': 76 | self.out_pathname = None 77 | else: 78 | self.out_pathname = options[i] 79 | elif options[i] == '-png': 80 | i = i + 1 81 | self.png_pathname = options[i] 82 | elif options[i] == '-resume': 83 | if i == (len(options)-1) or options[i+1].startswith('-'): 84 | self.resume_pathname = self.dataset_title + '.out' 85 | else: 86 | i = i + 1 87 | self.resume_pathname = options[i] 88 | else: 89 | pass_through_options.append(options[i]) 90 | i = i + 1 91 | 92 | self.pass_through_string = ' '.join(pass_through_options) 93 | if not os.path.exists(self.svmtrain_pathname): 94 | raise IOError('svm-train executable not found') 95 | if not os.path.exists(self.dataset_pathname): 96 | raise IOError('dataset not found') 97 | if self.resume_pathname and not os.path.exists(self.resume_pathname): 98 
| raise IOError('file for resumption not found') 99 | if not self.grid_with_c and not self.grid_with_g: 100 | raise ValueError('-log2c and -log2g should not be null simultaneously') 101 | if self.gnuplot_pathname and not os.path.exists(self.gnuplot_pathname): 102 | sys.stderr.write('gnuplot executable not found\n') 103 | self.gnuplot_pathname = None 104 | 105 | def redraw(db,best_param,gnuplot,options,tofile=False): 106 | if len(db) == 0: return 107 | begin_level = round(max(x[2] for x in db)) - 3 108 | step_size = 0.5 109 | 110 | best_log2c,best_log2g,best_rate = best_param 111 | 112 | # if newly obtained c, g, or cv values are the same, 113 | # then stop redrawing the contour. 114 | if all(x[0] == db[0][0] for x in db): return 115 | if all(x[1] == db[0][1] for x in db): return 116 | if all(x[2] == db[0][2] for x in db): return 117 | 118 | if tofile: 119 | gnuplot.write(b"set term png transparent small linewidth 2 medium enhanced\n") 120 | gnuplot.write("set output \"{0}\"\n".format(options.png_pathname.replace('\\','\\\\')).encode()) 121 | #gnuplot.write(b"set term postscript color solid\n") 122 | #gnuplot.write("set output \"{0}.ps\"\n".format(options.dataset_title).encode().encode()) 123 | elif sys.platform == 'win32': 124 | gnuplot.write(b"set term windows\n") 125 | else: 126 | gnuplot.write( b"set term x11\n") 127 | gnuplot.write(b"set xlabel \"log2(C)\"\n") 128 | gnuplot.write(b"set ylabel \"log2(gamma)\"\n") 129 | gnuplot.write("set xrange [{0}:{1}]\n".format(options.c_begin,options.c_end).encode()) 130 | gnuplot.write("set yrange [{0}:{1}]\n".format(options.g_begin,options.g_end).encode()) 131 | gnuplot.write(b"set contour\n") 132 | gnuplot.write("set cntrparam levels incremental {0},{1},100\n".format(begin_level,step_size).encode()) 133 | gnuplot.write(b"unset surface\n") 134 | gnuplot.write(b"unset ztics\n") 135 | gnuplot.write(b"set view 0,0\n") 136 | gnuplot.write("set title \"{0}\"\n".format(options.dataset_title).encode()) 137 | gnuplot.write(b"unset 
label\n") 138 | gnuplot.write("set label \"Best log2(C) = {0} log2(gamma) = {1} accuracy = {2}%\" \ 139 | at screen 0.5,0.85 center\n". \ 140 | format(best_log2c, best_log2g, best_rate).encode()) 141 | gnuplot.write("set label \"C = {0} gamma = {1}\"" 142 | " at screen 0.5,0.8 center\n".format(2**best_log2c, 2**best_log2g).encode()) 143 | gnuplot.write(b"set key at screen 0.9,0.9\n") 144 | gnuplot.write(b"splot \"-\" with lines\n") 145 | 146 | db.sort(key = lambda x:(x[0], -x[1])) 147 | 148 | prevc = db[0][0] 149 | for line in db: 150 | if prevc != line[0]: 151 | gnuplot.write(b"\n") 152 | prevc = line[0] 153 | gnuplot.write("{0[0]} {0[1]} {0[2]}\n".format(line).encode()) 154 | gnuplot.write(b"e\n") 155 | gnuplot.write(b"\n") # force gnuplot back to prompt when term set failure 156 | gnuplot.flush() 157 | 158 | 159 | def calculate_jobs(options): 160 | 161 | def range_f(begin,end,step): 162 | # like range, but works on non-integer too 163 | seq = [] 164 | while True: 165 | if step > 0 and begin > end: break 166 | if step < 0 and begin < end: break 167 | seq.append(begin) 168 | begin = begin + step 169 | return seq 170 | 171 | def permute_sequence(seq): 172 | n = len(seq) 173 | if n <= 1: return seq 174 | 175 | mid = int(n/2) 176 | left = permute_sequence(seq[:mid]) 177 | right = permute_sequence(seq[mid+1:]) 178 | 179 | ret = [seq[mid]] 180 | while left or right: 181 | if left: ret.append(left.pop(0)) 182 | if right: ret.append(right.pop(0)) 183 | 184 | return ret 185 | 186 | 187 | c_seq = permute_sequence(range_f(options.c_begin,options.c_end,options.c_step)) 188 | g_seq = permute_sequence(range_f(options.g_begin,options.g_end,options.g_step)) 189 | 190 | if not options.grid_with_c: 191 | c_seq = [None] 192 | if not options.grid_with_g: 193 | g_seq = [None] 194 | 195 | nr_c = float(len(c_seq)) 196 | nr_g = float(len(g_seq)) 197 | i, j = 0, 0 198 | jobs = [] 199 | 200 | while i < nr_c or j < nr_g: 201 | if i/nr_c < j/nr_g: 202 | # increase C resolution 203 | line = 
[] 204 | for k in range(0,j): 205 | line.append((c_seq[i],g_seq[k])) 206 | i = i + 1 207 | jobs.append(line) 208 | else: 209 | # increase g resolution 210 | line = [] 211 | for k in range(0,i): 212 | line.append((c_seq[k],g_seq[j])) 213 | j = j + 1 214 | jobs.append(line) 215 | 216 | resumed_jobs = {} 217 | 218 | if options.resume_pathname is None: 219 | return jobs, resumed_jobs 220 | 221 | for line in open(options.resume_pathname, 'r'): 222 | line = line.strip() 223 | rst = re.findall(r'rate=([0-9.]+)',line) 224 | if not rst: 225 | continue 226 | rate = float(rst[0]) 227 | 228 | c, g = None, None 229 | rst = re.findall(r'log2c=([0-9.-]+)',line) 230 | if rst: 231 | c = float(rst[0]) 232 | rst = re.findall(r'log2g=([0-9.-]+)',line) 233 | if rst: 234 | g = float(rst[0]) 235 | 236 | resumed_jobs[(c,g)] = rate 237 | 238 | return jobs, resumed_jobs 239 | 240 | 241 | class WorkerStopToken: # used to notify the worker to stop or if a worker is dead 242 | pass 243 | 244 | class Worker(Thread): 245 | def __init__(self,name,job_queue,result_queue,options): 246 | Thread.__init__(self) 247 | self.name = name 248 | self.job_queue = job_queue 249 | self.result_queue = result_queue 250 | self.options = options 251 | 252 | def run(self): 253 | while True: 254 | (cexp,gexp) = self.job_queue.get() 255 | if cexp is WorkerStopToken: 256 | self.job_queue.put((cexp,gexp)) 257 | # print('worker {0} stop.'.format(self.name)) 258 | break 259 | try: 260 | c, g = None, None 261 | if cexp != None: 262 | c = 2.0**cexp 263 | if gexp != None: 264 | g = 2.0**gexp 265 | rate = self.run_one(c,g) 266 | if rate is None: raise RuntimeError('get no rate') 267 | except: 268 | # we failed, let others do that and we just quit 269 | 270 | traceback.print_exception(sys.exc_info()[0], sys.exc_info()[1], sys.exc_info()[2]) 271 | 272 | self.job_queue.put((cexp,gexp)) 273 | sys.stderr.write('worker {0} quit.\n'.format(self.name)) 274 | break 275 | else: 276 | self.result_queue.put((self.name,cexp,gexp,rate)) 
277 | 278 | def get_cmd(self,c,g): 279 | options=self.options 280 | cmdline = options.svmtrain_pathname 281 | if options.grid_with_c: 282 | cmdline += ' -c {0} '.format(c) 283 | if options.grid_with_g: 284 | cmdline += ' -g {0} '.format(g) 285 | cmdline += ' -v {0} {1} {2} '.format\ 286 | (options.fold,options.pass_through_string,options.dataset_pathname) 287 | return cmdline 288 | 289 | class LocalWorker(Worker): 290 | def run_one(self,c,g): 291 | cmdline = self.get_cmd(c,g) 292 | result = Popen(cmdline,shell=True,stdout=PIPE,stderr=PIPE,stdin=PIPE).stdout 293 | for line in result.readlines(): 294 | if str(line).find('Cross') != -1: 295 | return float(line.split()[-1][0:-1]) 296 | 297 | class SSHWorker(Worker): 298 | def __init__(self,name,job_queue,result_queue,host,options): 299 | Worker.__init__(self,name,job_queue,result_queue,options) 300 | self.host = host 301 | self.cwd = os.getcwd() 302 | def run_one(self,c,g): 303 | cmdline = 'ssh -x -t -t {0} "cd {1}; {2}"'.format\ 304 | (self.host,self.cwd,self.get_cmd(c,g)) 305 | result = Popen(cmdline,shell=True,stdout=PIPE,stderr=PIPE,stdin=PIPE).stdout 306 | for line in result.readlines(): 307 | if str(line).find('Cross') != -1: 308 | return float(line.split()[-1][0:-1]) 309 | 310 | class TelnetWorker(Worker): 311 | def __init__(self,name,job_queue,result_queue,host,username,password,options): 312 | Worker.__init__(self,name,job_queue,result_queue,options) 313 | self.host = host 314 | self.username = username 315 | self.password = password 316 | def run(self): 317 | import telnetlib 318 | self.tn = tn = telnetlib.Telnet(self.host) 319 | tn.read_until('login: ') 320 | tn.write(self.username + '\n') 321 | tn.read_until('Password: ') 322 | tn.write(self.password + '\n') 323 | 324 | # XXX: how to know whether login is successful? 
325 | tn.read_until(self.username) 326 | # 327 | print('login ok', self.host) 328 | tn.write('cd '+os.getcwd()+'\n') 329 | Worker.run(self) 330 | tn.write('exit\n') 331 | def run_one(self,c,g): 332 | cmdline = self.get_cmd(c,g) 333 | result = self.tn.write(cmdline+'\n') 334 | (idx,matchm,output) = self.tn.expect(['Cross.*\n']) 335 | for line in output.split('\n'): 336 | if str(line).find('Cross') != -1: 337 | return float(line.split()[-1][0:-1]) 338 | 339 | def find_parameters(dataset_pathname, options=''): 340 | 341 | def update_param(c,g,rate,best_c,best_g,best_rate,worker,resumed): 342 | if (rate > best_rate) or (rate==best_rate and g==best_g and c= 3: 7 | xrange = range 8 | 9 | def exit_with_help(argv): 10 | print("""\ 11 | Usage: {0} [options] dataset subset_size [output1] [output2] 12 | 13 | This script randomly selects a subset of the dataset. 14 | 15 | options: 16 | -s method : method of selection (default 0) 17 | 0 -- stratified selection (classification only) 18 | 1 -- random selection 19 | 20 | output1 : the subset (optional) 21 | output2 : rest of the data (optional) 22 | If output1 is omitted, the subset will be printed on the screen.""".format(argv[0])) 23 | exit(1) 24 | 25 | def process_options(argv): 26 | argc = len(argv) 27 | if argc < 3: 28 | exit_with_help(argv) 29 | 30 | # default method is stratified selection 31 | method = 0 32 | subset_file = sys.stdout 33 | rest_file = None 34 | 35 | i = 1 36 | while i < argc: 37 | if argv[i][0] != "-": 38 | break 39 | if argv[i] == "-s": 40 | i = i + 1 41 | method = int(argv[i]) 42 | if method not in [0,1]: 43 | print("Unknown selection method {0}".format(method)) 44 | exit_with_help(argv) 45 | i = i + 1 46 | 47 | dataset = argv[i] 48 | subset_size = int(argv[i+1]) 49 | if i+2 < argc: 50 | subset_file = open(argv[i+2],'w') 51 | if i+3 < argc: 52 | rest_file = open(argv[i+3],'w') 53 | 54 | return dataset, subset_size, method, subset_file, rest_file 55 | 56 | def random_selection(dataset, subset_size): 57 | l 
= sum(1 for line in open(dataset,'r')) 58 | return sorted(random.sample(xrange(l), subset_size)) 59 | 60 | def stratified_selection(dataset, subset_size): 61 | labels = [line.split(None,1)[0] for line in open(dataset)] 62 | label_linenums = defaultdict(list) 63 | for i, label in enumerate(labels): 64 | label_linenums[label] += [i] 65 | 66 | l = len(labels) 67 | remaining = subset_size 68 | ret = [] 69 | 70 | # classes with fewer data are sampled first; otherwise 71 | # some rare classes may not be selected 72 | for label in sorted(label_linenums, key=lambda x: len(label_linenums[x])): 73 | linenums = label_linenums[label] 74 | label_size = len(linenums) 75 | # at least one instance per class 76 | s = int(min(remaining, max(1, math.ceil(label_size*(float(subset_size)/l))))) 77 | if s == 0: 78 | sys.stderr.write('''\ 79 | Error: failed to have at least one instance per class 80 | 1. You may have regression data. 81 | 2. Your classification data is unbalanced or too small. 82 | Please use -s 1. 
83 | ''') 84 | sys.exit(-1) 85 | remaining -= s 86 | ret += [linenums[i] for i in random.sample(xrange(label_size), s)] 87 | return sorted(ret) 88 | 89 | def main(argv=sys.argv): 90 | dataset, subset_size, method, subset_file, rest_file = process_options(argv) 91 | #uncomment the following line to fix the random seed 92 | #random.seed(0) 93 | selected_lines = [] 94 | 95 | if method == 0: 96 | selected_lines = stratified_selection(dataset, subset_size) 97 | elif method == 1: 98 | selected_lines = random_selection(dataset, subset_size) 99 | 100 | #select instances based on selected_lines 101 | dataset = open(dataset,'r') 102 | prev_selected_linenum = -1 103 | for i in xrange(len(selected_lines)): 104 | for cnt in xrange(selected_lines[i]-prev_selected_linenum-1): 105 | line = dataset.readline() 106 | if rest_file: 107 | rest_file.write(line) 108 | subset_file.write(dataset.readline()) 109 | prev_selected_linenum = selected_lines[i] 110 | subset_file.close() 111 | 112 | if rest_file: 113 | for line in dataset: 114 | rest_file.write(line) 115 | rest_file.close() 116 | dataset.close() 117 | 118 | if __name__ == '__main__': 119 | main(sys.argv) 120 | 121 | -------------------------------------------------------------------------------- /binaries/windows/x64/windows/cublas64_55.dll: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/binaries/windows/x64/windows/cublas64_55.dll -------------------------------------------------------------------------------- /binaries/windows/x64/windows/cudart64_55.dll: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/binaries/windows/x64/windows/cudart64_55.dll -------------------------------------------------------------------------------- /binaries/windows/x64/windows/libsvm.dll: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/binaries/windows/x64/windows/libsvm.dll -------------------------------------------------------------------------------- /binaries/windows/x64/windows/svm-predict.exe: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/binaries/windows/x64/windows/svm-predict.exe -------------------------------------------------------------------------------- /binaries/windows/x64/windows/svm-scale.exe: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/binaries/windows/x64/windows/svm-scale.exe -------------------------------------------------------------------------------- /binaries/windows/x64/windows/svm-toy.exe: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/binaries/windows/x64/windows/svm-toy.exe -------------------------------------------------------------------------------- /binaries/windows/x64/windows/svm-train-gpu.exe: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/binaries/windows/x64/windows/svm-train-gpu.exe -------------------------------------------------------------------------------- /binaries/windows/x64/windows/svm-train.exe: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/binaries/windows/x64/windows/svm-train.exe -------------------------------------------------------------------------------- 
/binaries/windows/x86/COPYRIGHT: -------------------------------------------------------------------------------- 1 | 2 | Copyright (c) 2000-2013 Chih-Chung Chang and Chih-Jen Lin 3 | All rights reserved. 4 | 5 | Redistribution and use in source and binary forms, with or without 6 | modification, are permitted provided that the following conditions 7 | are met: 8 | 9 | 1. Redistributions of source code must retain the above copyright 10 | notice, this list of conditions and the following disclaimer. 11 | 12 | 2. Redistributions in binary form must reproduce the above copyright 13 | notice, this list of conditions and the following disclaimer in the 14 | documentation and/or other materials provided with the distribution. 15 | 16 | 3. Neither name of copyright holders nor the names of its contributors 17 | may be used to endorse or promote products derived from this software 18 | without specific prior written permission. 19 | 20 | 21 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 22 | ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 23 | LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 24 | A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR 25 | CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, 26 | EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, 27 | PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR 28 | PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF 29 | LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 30 | NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 31 | SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
32 | -------------------------------------------------------------------------------- /binaries/windows/x86/README-GPU: -------------------------------------------------------------------------------- 1 | GPU-Accelerated LIBSVM exploits the GPU, through the CUDA interface, to 2 | speed up the training process. This package contains a new executable for 3 | training classifiers, "svm-train-gpu.exe", alongside the original one. 4 | The new executable is used in exactly the same way as the original. 5 | 6 | This executable was built with the CUBLAS API version 2, which is compatible with CUDA SDK versions 4.0 and later. 7 | 8 | To test the executable "svm-train-gpu", you can run the easy.py script located in the "tools" folder. 9 | To observe the speed improvement of GPU over CPU execution, we provide a relatively large custom dataset (train_set) that can be used as input to easy.py. 10 | 11 | FEATURES 12 | 13 | Mode Supported 14 | 15 | * c-svc classification with RBF kernel 16 | 17 | Functionality / User interface 18 | 19 | * Same as LIBSVM 20 | 21 | 22 | PREREQUISITES 23 | 24 | * NVIDIA graphics card with CUDA support 25 | * Latest NVIDIA drivers for the GPU 26 | 27 | Additional Information 28 | ====================== 29 | 30 | If you find GPU-Accelerated LIBSVM helpful, please cite it as 31 | 32 | A. Athanasopoulos, A. Dimou, V. Mezaris, I. Kompatsiaris, "GPU Acceleration for Support Vector Machines", 33 | Proc. 12th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS 2011), Delft, The Netherlands, April 2011. 34 | 35 | Software available at http://mklab.iti.gr/project/GPU-LIBSVM 36 | -------------------------------------------------------------------------------- /binaries/windows/x86/tools/README: -------------------------------------------------------------------------------- 1 | This directory includes some useful tools: 2 | 3 | 1. subset selection tools. 4 | 2. parameter selection tools. 5 | 3.
LIBSVM format checking tools 6 | 7 | Part I: Subset selection tools 8 | 9 | Introduction 10 | ============ 11 | 12 | Training on large data is time consuming. Sometimes one should work on a 13 | smaller subset first. The Python script subset.py randomly selects a 14 | specified number of samples. For classification data, we provide 15 | stratified selection to ensure the same class distribution in the 16 | subset. 17 | 18 | Usage: subset.py [options] dataset number [output1] [output2] 19 | 20 | This script selects a subset of the given data set. 21 | 22 | options: 23 | -s method : method of selection (default 0) 24 | 0 -- stratified selection (classification only) 25 | 1 -- random selection 26 | 27 | output1 : the subset (optional) 28 | output2 : the rest of the data (optional) 29 | 30 | If output1 is omitted, the subset will be printed on the screen. 31 | 32 | Example 33 | ======= 34 | 35 | > python subset.py heart_scale 100 file1 file2 36 | 37 | From heart_scale, 100 samples are randomly selected and stored in 38 | file1. All remaining instances are stored in file2. 39 | 40 | 41 | Part II: Parameter Selection Tools 42 | 43 | Introduction 44 | ============ 45 | 46 | grid.py is a parameter selection tool for C-SVM classification using 47 | the RBF (radial basis function) kernel. It uses the cross validation (CV) 48 | technique to estimate the accuracy of each parameter combination in 49 | the specified range and helps you decide the best parameters for 50 | your problem. 51 | 52 | grid.py directly executes the libsvm binaries (so no Python binding is needed) 53 | for cross validation and then draws contours of CV accuracy using gnuplot. 54 | You must have libsvm and gnuplot installed before using it. The package 55 | gnuplot is available at http://www.gnuplot.info/ 56 | 57 | On Mac OS X, the precompiled gnuplot file needs the library AquaTerm, 58 | which must therefore be installed as well.
In addition, this version of 59 | gnuplot does not support png, so you need to change "set term png 60 | transparent small" and use another image format. For example, you may 61 | use "set term pbm small color". 62 | 63 | Usage: grid.py [grid_options] [svm_options] dataset 64 | 65 | grid_options : 66 | -log2c {begin,end,step | "null"} : set the range of c (default -5,15,2) 67 | begin,end,step -- c_range = 2^{begin,...,begin+k*step,...,end} 68 | "null" -- do not grid with c 69 | -log2g {begin,end,step | "null"} : set the range of g (default 3,-15,-2) 70 | begin,end,step -- g_range = 2^{begin,...,begin+k*step,...,end} 71 | "null" -- do not grid with g 72 | -v n : n-fold cross validation (default 5) 73 | -svmtrain pathname : set svm executable path and name 74 | -gnuplot {pathname | "null"} : 75 | pathname -- set gnuplot executable path and name 76 | "null" -- do not plot 77 | -out {pathname | "null"} : (default dataset.out) 78 | pathname -- set output file path and name 79 | "null" -- do not output file 80 | -png pathname : set graphic output file path and name (default dataset.png) 81 | -resume [pathname] : resume the grid task using an existing output file (default pathname is dataset.out) 82 | Use this option only if some parameters have been checked for the SAME data. 83 | 84 | svm_options : additional options for svm-train 85 | 86 | The program conducts v-fold cross validation using parameter C (and gamma) 87 | = 2^begin, 2^(begin+step), ..., 2^end. 88 | 89 | You can specify where the libsvm executable and gnuplot are using the 90 | -svmtrain and -gnuplot parameters. 91 | 92 | For Windows users, please use pgnuplot.exe. If you are using gnuplot 93 | 3.7.1, please upgrade to version 3.7.3 or higher, as version 3.7.1 94 | has a bug. If you use Cygwin on Windows, please use gnuplot-x11. 95 | 96 | If the task is terminated accidentally or you would like to change the 97 | range of parameters, you can apply '-resume' to save time by re-using 98 | previous results.
You may specify the output file of a previous run 99 | or use the default (i.e., dataset.out) without giving a name. Please 100 | note that the same conditions must be used in both runs. For example, 101 | you cannot use '-v 10' earlier and resume the task with '-v 5'. 102 | 103 | The value of some options can be "null." For example, `-log2c -1,0,1 104 | -log2g "null"' means that C=2^-1,2^0,2^1 and g=LIBSVM's default gamma 105 | value. That is, you do not conduct parameter selection on gamma. 106 | 107 | Example 108 | ======= 109 | 110 | > python grid.py -log2c -5,5,1 -log2g -4,0,1 -v 5 -m 300 heart_scale 111 | 112 | Users (in particular MS Windows users) may need to specify the path of 113 | executable files. You can either change the paths at the beginning of 114 | grid.py or specify them on the command line. For example, 115 | 116 | > grid.py -log2c -5,5,1 -svmtrain "c:\Program Files\libsvm\windows\svm-train.exe" -gnuplot c:\tmp\gnuplot\binary\pgnuplot.exe -v 10 heart_scale 117 | 118 | Output: two files 119 | dataset.png: the CV accuracy contour plot generated by gnuplot 120 | dataset.out: the CV accuracy at each (log2(C),log2(gamma)) 121 | 122 | The following example saves running time by loading the output file of a previous run. 123 | 124 | > python grid.py -log2c -7,7,1 -log2g -5,2,1 -v 5 -resume heart_scale.out heart_scale 125 | 126 | Parallel grid search 127 | ==================== 128 | 129 | You can conduct a parallel grid search by dispatching jobs to a 130 | cluster of computers that share the same file system. First, you add 131 | machine names in grid.py: 132 | 133 | ssh_workers = ["linux1", "linux5", "linux5"] 134 | 135 | and then set up your ssh so that authentication works without 136 | asking for a password. 137 | 138 | The same machine (e.g., linux5 here) can be listed more than once if 139 | it has multiple CPUs or more RAM. If the local machine is the 140 | best, you can also enlarge nr_local_worker.
For example: 141 | 142 | nr_local_worker = 2 143 | 144 | Example: 145 | 146 | > python grid.py heart_scale 147 | [local] -1 -1 78.8889 (best c=0.5, g=0.5, rate=78.8889) 148 | [linux5] -1 -7 83.3333 (best c=0.5, g=0.0078125, rate=83.3333) 149 | [linux5] 5 -1 77.037 (best c=0.5, g=0.0078125, rate=83.3333) 150 | [linux1] 5 -7 83.3333 (best c=0.5, g=0.0078125, rate=83.3333) 151 | . 152 | . 153 | . 154 | 155 | If -log2c, -log2g, or -v is not specified, default values are used. 156 | 157 | If your system uses telnet instead of ssh, you list the computer names 158 | in telnet_workers. 159 | 160 | Calling grid in Python 161 | ====================== 162 | 163 | In addition to using grid.py as a command-line tool, you can use it as a 164 | Python module. 165 | 166 | >>> rate, param = find_parameters(dataset, options) 167 | 168 | You need to specify `dataset' and `options' (default ''). See the following example. 169 | 170 | > python 171 | 172 | >>> from grid import * 173 | >>> rate, param = find_parameters('../heart_scale', '-log2c -1,1,1 -log2g -1,1,1') 174 | [local] 0.0 0.0 rate=74.8148 (best c=1.0, g=1.0, rate=74.8148) 175 | [local] 0.0 -1.0 rate=77.037 (best c=1.0, g=0.5, rate=77.037) 176 | . 177 | . 178 | [local] -1.0 -1.0 rate=78.8889 (best c=0.5, g=0.5, rate=78.8889) 179 | . 180 | . 181 | >>> rate 182 | 78.8889 183 | >>> param 184 | {'c': 0.5, 'g': 0.5} 185 | 186 | 187 | Part III: LIBSVM format checking tools 188 | 189 | Introduction 190 | ============ 191 | 192 | `svm-train' conducts only a simple check of the input data. To do a 193 | detailed check, we provide a python script `checkdata.py.' 194 | 195 | Usage: checkdata.py dataset 196 | 197 | Exit status (returned value): 1 if there are errors, 0 otherwise. 198 | 199 | This tool is written by Rong-En Fan at National Taiwan University. 
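The format rule that checkdata.py enforces on each data line (a numeric label followed by index:value pairs whose indices are non-negative integers in strictly ascending order) can be sketched as a minimal stand-alone validator. This is a hypothetical helper for illustration only, not part of the LIBSVM tools, and it omits the comma-separated multi-label form that checkdata.py also accepts:

```python
def valid_libsvm_line(line):
    """Return True if one LIBSVM-format line is well formed:
    a numeric label, then index:value pairs with non-negative
    integer indices in strictly ascending order (index 0 is
    allowed, as for precomputed kernels)."""
    nodes = line.split()
    if not nodes:
        return False  # empty line: no label at all
    try:
        float(nodes[0])  # label must be a number (single-label case)
    except ValueError:
        return False
    prev_index = -1
    for node in nodes[1:]:
        try:
            index, value = node.split(':')
            index, value = int(index), float(value)
        except ValueError:
            return False  # not an index:value pair
        if index < 0 or index <= prev_index:
            return False  # negative or non-ascending index
        prev_index = index
    return True
```

On the bad_data example shown in the Example section, this sketch rejects the line for the same reason checkdata.py reports: index 2 follows index 3.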
200 | 201 | Example 202 | ======= 203 | 204 | > cat bad_data 205 | 1 3:1 2:4 206 | > python checkdata.py bad_data 207 | line 1: feature indices must be in an ascending order, previous/current features 3:1 2:4 208 | Found 1 lines with error. 209 | 210 | 211 | -------------------------------------------------------------------------------- /binaries/windows/x86/tools/checkdata.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | # 4 | # A format checker for LIBSVM 5 | # 6 | 7 | # 8 | # Copyright (c) 2007, Rong-En Fan 9 | # 10 | # All rights reserved. 11 | # 12 | # This program is distributed under the same license of the LIBSVM package. 13 | # 14 | 15 | from sys import argv, exit 16 | import os.path 17 | 18 | def err(line_no, msg): 19 | print("line {0}: {1}".format(line_no, msg)) 20 | 21 | # works like float() but does not accept nan and inf 22 | def my_float(x): 23 | if x.lower().find("nan") != -1 or x.lower().find("inf") != -1: 24 | raise ValueError 25 | 26 | return float(x) 27 | 28 | def main(): 29 | if len(argv) != 2: 30 | print("Usage: {0} dataset".format(argv[0])) 31 | exit(1) 32 | 33 | dataset = argv[1] 34 | 35 | if not os.path.exists(dataset): 36 | print("dataset {0} not found".format(dataset)) 37 | exit(1) 38 | 39 | line_no = 1 40 | error_line_count = 0 41 | for line in open(dataset, 'r'): 42 | line_error = False 43 | 44 | # each line must end with a newline character 45 | if line[-1] != '\n': 46 | err(line_no, "missing a newline character in the end") 47 | line_error = True 48 | 49 | nodes = line.split() 50 | 51 | # check label 52 | try: 53 | label = nodes.pop(0) 54 | 55 | if label.find(',') != -1: 56 | # multi-label format 57 | try: 58 | for l in label.split(','): 59 | l = my_float(l) 60 | except: 61 | err(line_no, "label {0} is not a valid multi-label form".format(label)) 62 | line_error = True 63 | else: 64 | try: 65 | label = my_float(label) 66 | except: 67 | err(line_no, "label {0} is 
not a number".format(label)) 68 | line_error = True 69 | except: 70 | err(line_no, "missing label, perhaps an empty line?") 71 | line_error = True 72 | 73 | # check features 74 | prev_index = -1 75 | for i in range(len(nodes)): 76 | try: 77 | (index, value) = nodes[i].split(':') 78 | 79 | index = int(index) 80 | value = my_float(value) 81 | 82 | # precomputed kernel's index starts from 0 and LIBSVM 83 | # checks it. Hence, don't treat index 0 as an error. 84 | if index < 0: 85 | err(line_no, "feature index must be positive; wrong feature {0}".format(nodes[i])) 86 | line_error = True 87 | elif index <= prev_index: 88 | err(line_no, "feature indices must be in an ascending order, previous/current features {0} {1}".format(nodes[i-1], nodes[i])) 89 | line_error = True 90 | prev_index = index 91 | except: 92 | err(line_no, "feature '{0}' not an <index>:<value> pair, <index> integer, <value> real number ".format(nodes[i])) 93 | line_error = True 94 | 95 | line_no += 1 96 | 97 | if line_error: 98 | error_line_count += 1 99 | 100 | if error_line_count > 0: 101 | print("Found {0} lines with error.".format(error_line_count)) 102 | return 1 103 | else: 104 | print("No error.") 105 | return 0 106 | 107 | if __name__ == "__main__": 108 | exit(main()) 109 | -------------------------------------------------------------------------------- /binaries/windows/x86/tools/easy.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | import sys 4 | import os 5 | from subprocess import * 6 | 7 | if len(sys.argv) <= 1: 8 | print('Usage: {0} training_file [testing_file]'.format(sys.argv[0])) 9 | raise SystemExit 10 | 11 | # svm, grid, and gnuplot executable files 12 | 13 | is_win32 = (sys.platform == 'win32') 14 | if not is_win32: 15 | svmscale_exe = "../svm-scale" 16 | svmtrain_exe = "../svm-train-gpu" 17 | svmpredict_exe = "../svm-predict" 18 | grid_py = "./grid.py" 19 | gnuplot_exe = "/usr/bin/gnuplot" 20 | else: 21 | # example for windows 22 |
svmscale_exe = r"..\windows\svm-scale.exe" 23 | svmtrain_exe = r"..\windows\svm-train-gpu.exe" 24 | svmpredict_exe = r"..\windows\svm-predict.exe" 25 | gnuplot_exe = r"C:\Program Files (x86)\gnuplot\bin\pgnuplot.exe" 26 | grid_py = r".\grid.py" 27 | 28 | assert os.path.exists(svmscale_exe),"svm-scale executable not found" 29 | assert os.path.exists(svmtrain_exe),"svm-train-gpu executable not found" 30 | assert os.path.exists(svmpredict_exe),"svm-predict executable not found" 31 | assert os.path.exists(gnuplot_exe),"gnuplot executable not found" 32 | assert os.path.exists(grid_py),"grid.py not found" 33 | 34 | train_pathname = sys.argv[1] 35 | assert os.path.exists(train_pathname),"training file not found" 36 | file_name = os.path.split(train_pathname)[1] 37 | scaled_file = file_name + ".scale" 38 | model_file = file_name + ".model" 39 | range_file = file_name + ".range" 40 | 41 | if len(sys.argv) > 2: 42 | test_pathname = sys.argv[2] 43 | file_name = os.path.split(test_pathname)[1] 44 | assert os.path.exists(test_pathname),"testing file not found" 45 | scaled_test_file = file_name + ".scale" 46 | predict_test_file = file_name + ".predict" 47 | 48 | cmd = '{0} -s "{1}" "{2}" > "{3}"'.format(svmscale_exe, range_file, train_pathname, scaled_file) 49 | print('Scaling training data...') 50 | Popen(cmd, shell = True, stdout = PIPE).communicate() 51 | 52 | cmd = '{0} -svmtrain "{1}" -gnuplot "{2}" "{3}"'.format(grid_py, svmtrain_exe, gnuplot_exe, scaled_file) 53 | print('Cross validation...') 54 | f = Popen(cmd, shell = True, stdout = PIPE).stdout 55 | 56 | line = '' 57 | while True: 58 | last_line = line 59 | line = f.readline() 60 | if not line: break 61 | c,g,rate = map(float,last_line.split()) 62 | 63 | print('Best c={0}, g={1} CV rate={2}'.format(c,g,rate)) 64 | 65 | cmd = '{0} -c {1} -g {2} "{3}" "{4}"'.format(svmtrain_exe,c,g,scaled_file,model_file) 66 | print('Training...') 67 | Popen(cmd, shell = True, stdout = PIPE).communicate() 68 | 69 | print('Output model: 
{0}'.format(model_file)) 70 | if len(sys.argv) > 2: 71 | cmd = '{0} -r "{1}" "{2}" > "{3}"'.format(svmscale_exe, range_file, test_pathname, scaled_test_file) 72 | print('Scaling testing data...') 73 | Popen(cmd, shell = True, stdout = PIPE).communicate() 74 | 75 | cmd = '{0} "{1}" "{2}" "{3}"'.format(svmpredict_exe, scaled_test_file, model_file, predict_test_file) 76 | print('Testing...') 77 | Popen(cmd, shell = True).communicate() 78 | 79 | print('Output prediction: {0}'.format(predict_test_file)) 80 | -------------------------------------------------------------------------------- /binaries/windows/x86/tools/grid.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __all__ = ['find_parameters'] 3 | 4 | import os, sys, traceback, getpass, time, re 5 | from threading import Thread 6 | from subprocess import * 7 | 8 | if sys.version_info[0] < 3: 9 | from Queue import Queue 10 | else: 11 | from queue import Queue 12 | 13 | telnet_workers = [] 14 | ssh_workers = [] 15 | nr_local_worker = 1 16 | 17 | class GridOption: 18 | def __init__(self, dataset_pathname, options): 19 | dirname = os.path.dirname(__file__) 20 | if sys.platform != 'win32': 21 | self.svmtrain_pathname = os.path.join(dirname, '../svm-train') 22 | self.gnuplot_pathname = '/usr/bin/gnuplot' 23 | else: 24 | # example for windows 25 | self.svmtrain_pathname = os.path.join(dirname, r'..\windows\svm-train.exe') 26 | # svmtrain_pathname = r'c:\Program Files\libsvm\windows\svm-train.exe' 27 | self.gnuplot_pathname = r'C:\Program Files (x86)\gnuplot\bin\pgnuplot.exe' 28 | self.fold = 5 29 | self.c_begin, self.c_end, self.c_step = -5, 15, 2 30 | self.g_begin, self.g_end, self.g_step = 3, -15, -2 31 | self.grid_with_c, self.grid_with_g = True, True 32 | self.dataset_pathname = dataset_pathname 33 | self.dataset_title = os.path.split(dataset_pathname)[1] 34 | self.out_pathname = '{0}.out'.format(self.dataset_title) 35 | self.png_pathname = 
'{0}.png'.format(self.dataset_title) 36 | self.pass_through_string = ' ' 37 | self.resume_pathname = None 38 | self.parse_options(options) 39 | 40 | def parse_options(self, options): 41 | if type(options) == str: 42 | options = options.split() 43 | i = 0 44 | pass_through_options = [] 45 | 46 | while i < len(options): 47 | if options[i] == '-log2c': 48 | i = i + 1 49 | if options[i] == 'null': 50 | self.grid_with_c = False 51 | else: 52 | self.c_begin, self.c_end, self.c_step = map(float,options[i].split(',')) 53 | elif options[i] == '-log2g': 54 | i = i + 1 55 | if options[i] == 'null': 56 | self.grid_with_g = False 57 | else: 58 | self.g_begin, self.g_end, self.g_step = map(float,options[i].split(',')) 59 | elif options[i] == '-v': 60 | i = i + 1 61 | self.fold = options[i] 62 | elif options[i] in ('-c','-g'): 63 | raise ValueError('Use -log2c and -log2g.') 64 | elif options[i] == '-svmtrain': 65 | i = i + 1 66 | self.svmtrain_pathname = options[i] 67 | elif options[i] == '-gnuplot': 68 | i = i + 1 69 | if options[i] == 'null': 70 | self.gnuplot_pathname = None 71 | else: 72 | self.gnuplot_pathname = options[i] 73 | elif options[i] == '-out': 74 | i = i + 1 75 | if options[i] == 'null': 76 | self.out_pathname = None 77 | else: 78 | self.out_pathname = options[i] 79 | elif options[i] == '-png': 80 | i = i + 1 81 | self.png_pathname = options[i] 82 | elif options[i] == '-resume': 83 | if i == (len(options)-1) or options[i+1].startswith('-'): 84 | self.resume_pathname = self.dataset_title + '.out' 85 | else: 86 | i = i + 1 87 | self.resume_pathname = options[i] 88 | else: 89 | pass_through_options.append(options[i]) 90 | i = i + 1 91 | 92 | self.pass_through_string = ' '.join(pass_through_options) 93 | if not os.path.exists(self.svmtrain_pathname): 94 | raise IOError('svm-train executable not found') 95 | if not os.path.exists(self.dataset_pathname): 96 | raise IOError('dataset not found') 97 | if self.resume_pathname and not os.path.exists(self.resume_pathname): 98 
| raise IOError('file for resumption not found') 99 | if not self.grid_with_c and not self.grid_with_g: 100 | raise ValueError('-log2c and -log2g should not be null simultaneously') 101 | if self.gnuplot_pathname and not os.path.exists(self.gnuplot_pathname): 102 | sys.stderr.write('gnuplot executable not found\n') 103 | self.gnuplot_pathname = None 104 | 105 | def redraw(db,best_param,gnuplot,options,tofile=False): 106 | if len(db) == 0: return 107 | begin_level = round(max(x[2] for x in db)) - 3 108 | step_size = 0.5 109 | 110 | best_log2c,best_log2g,best_rate = best_param 111 | 112 | # if newly obtained c, g, or cv values are the same, 113 | # then stop redrawing the contour. 114 | if all(x[0] == db[0][0] for x in db): return 115 | if all(x[1] == db[0][1] for x in db): return 116 | if all(x[2] == db[0][2] for x in db): return 117 | 118 | if tofile: 119 | gnuplot.write(b"set term png transparent small linewidth 2 medium enhanced\n") 120 | gnuplot.write("set output \"{0}\"\n".format(options.png_pathname.replace('\\','\\\\')).encode()) 121 | #gnuplot.write(b"set term postscript color solid\n") 122 | #gnuplot.write("set output \"{0}.ps\"\n".format(options.dataset_title).encode().encode()) 123 | elif sys.platform == 'win32': 124 | gnuplot.write(b"set term windows\n") 125 | else: 126 | gnuplot.write( b"set term x11\n") 127 | gnuplot.write(b"set xlabel \"log2(C)\"\n") 128 | gnuplot.write(b"set ylabel \"log2(gamma)\"\n") 129 | gnuplot.write("set xrange [{0}:{1}]\n".format(options.c_begin,options.c_end).encode()) 130 | gnuplot.write("set yrange [{0}:{1}]\n".format(options.g_begin,options.g_end).encode()) 131 | gnuplot.write(b"set contour\n") 132 | gnuplot.write("set cntrparam levels incremental {0},{1},100\n".format(begin_level,step_size).encode()) 133 | gnuplot.write(b"unset surface\n") 134 | gnuplot.write(b"unset ztics\n") 135 | gnuplot.write(b"set view 0,0\n") 136 | gnuplot.write("set title \"{0}\"\n".format(options.dataset_title).encode()) 137 | gnuplot.write(b"unset 
label\n") 138 | gnuplot.write("set label \"Best log2(C) = {0} log2(gamma) = {1} accuracy = {2}%\" \ 139 | at screen 0.5,0.85 center\n". \ 140 | format(best_log2c, best_log2g, best_rate).encode()) 141 | gnuplot.write("set label \"C = {0} gamma = {1}\"" 142 | " at screen 0.5,0.8 center\n".format(2**best_log2c, 2**best_log2g).encode()) 143 | gnuplot.write(b"set key at screen 0.9,0.9\n") 144 | gnuplot.write(b"splot \"-\" with lines\n") 145 | 146 | db.sort(key = lambda x:(x[0], -x[1])) 147 | 148 | prevc = db[0][0] 149 | for line in db: 150 | if prevc != line[0]: 151 | gnuplot.write(b"\n") 152 | prevc = line[0] 153 | gnuplot.write("{0[0]} {0[1]} {0[2]}\n".format(line).encode()) 154 | gnuplot.write(b"e\n") 155 | gnuplot.write(b"\n") # force gnuplot back to prompt when term set failure 156 | gnuplot.flush() 157 | 158 | 159 | def calculate_jobs(options): 160 | 161 | def range_f(begin,end,step): 162 | # like range, but works on non-integer too 163 | seq = [] 164 | while True: 165 | if step > 0 and begin > end: break 166 | if step < 0 and begin < end: break 167 | seq.append(begin) 168 | begin = begin + step 169 | return seq 170 | 171 | def permute_sequence(seq): 172 | n = len(seq) 173 | if n <= 1: return seq 174 | 175 | mid = int(n/2) 176 | left = permute_sequence(seq[:mid]) 177 | right = permute_sequence(seq[mid+1:]) 178 | 179 | ret = [seq[mid]] 180 | while left or right: 181 | if left: ret.append(left.pop(0)) 182 | if right: ret.append(right.pop(0)) 183 | 184 | return ret 185 | 186 | 187 | c_seq = permute_sequence(range_f(options.c_begin,options.c_end,options.c_step)) 188 | g_seq = permute_sequence(range_f(options.g_begin,options.g_end,options.g_step)) 189 | 190 | if not options.grid_with_c: 191 | c_seq = [None] 192 | if not options.grid_with_g: 193 | g_seq = [None] 194 | 195 | nr_c = float(len(c_seq)) 196 | nr_g = float(len(g_seq)) 197 | i, j = 0, 0 198 | jobs = [] 199 | 200 | while i < nr_c or j < nr_g: 201 | if i/nr_c < j/nr_g: 202 | # increase C resolution 203 | line = 
[] 204 | for k in range(0,j): 205 | line.append((c_seq[i],g_seq[k])) 206 | i = i + 1 207 | jobs.append(line) 208 | else: 209 | # increase g resolution 210 | line = [] 211 | for k in range(0,i): 212 | line.append((c_seq[k],g_seq[j])) 213 | j = j + 1 214 | jobs.append(line) 215 | 216 | resumed_jobs = {} 217 | 218 | if options.resume_pathname is None: 219 | return jobs, resumed_jobs 220 | 221 | for line in open(options.resume_pathname, 'r'): 222 | line = line.strip() 223 | rst = re.findall(r'rate=([0-9.]+)',line) 224 | if not rst: 225 | continue 226 | rate = float(rst[0]) 227 | 228 | c, g = None, None 229 | rst = re.findall(r'log2c=([0-9.-]+)',line) 230 | if rst: 231 | c = float(rst[0]) 232 | rst = re.findall(r'log2g=([0-9.-]+)',line) 233 | if rst: 234 | g = float(rst[0]) 235 | 236 | resumed_jobs[(c,g)] = rate 237 | 238 | return jobs, resumed_jobs 239 | 240 | 241 | class WorkerStopToken: # used to notify the worker to stop or if a worker is dead 242 | pass 243 | 244 | class Worker(Thread): 245 | def __init__(self,name,job_queue,result_queue,options): 246 | Thread.__init__(self) 247 | self.name = name 248 | self.job_queue = job_queue 249 | self.result_queue = result_queue 250 | self.options = options 251 | 252 | def run(self): 253 | while True: 254 | (cexp,gexp) = self.job_queue.get() 255 | if cexp is WorkerStopToken: 256 | self.job_queue.put((cexp,gexp)) 257 | # print('worker {0} stop.'.format(self.name)) 258 | break 259 | try: 260 | c, g = None, None 261 | if cexp != None: 262 | c = 2.0**cexp 263 | if gexp != None: 264 | g = 2.0**gexp 265 | rate = self.run_one(c,g) 266 | if rate is None: raise RuntimeError('get no rate') 267 | except: 268 | # we failed, let others do that and we just quit 269 | 270 | traceback.print_exception(sys.exc_info()[0], sys.exc_info()[1], sys.exc_info()[2]) 271 | 272 | self.job_queue.put((cexp,gexp)) 273 | sys.stderr.write('worker {0} quit.\n'.format(self.name)) 274 | break 275 | else: 276 | self.result_queue.put((self.name,cexp,gexp,rate)) 
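As an aside on calculate_jobs above: its nested helpers range_f and permute_sequence produce a coarse-to-fine sweep order, visiting the midpoint of each (C, gamma) exponent range first so that early cross-validation jobs are spread across the whole grid. A standalone re-implementation (illustrative only; the real helpers are nested inside calculate_jobs in grid.py) behaves like this:

```python
def range_f(begin, end, step):
    # like range(), but works on floats too
    seq = []
    while (step > 0 and begin <= end) or (step < 0 and begin >= end):
        seq.append(begin)
        begin += step
    return seq

def permute_sequence(seq):
    # visit the midpoint first, then recursively interleave the two halves,
    # so early elements of the result are spread across the whole range
    n = len(seq)
    if n <= 1:
        return seq
    mid = n // 2
    left = permute_sequence(seq[:mid])
    right = permute_sequence(seq[mid+1:])
    ret = [seq[mid]]
    while left or right:
        if left:
            ret.append(left.pop(0))
        if right:
            ret.append(right.pop(0))
    return ret

print(permute_sequence(range_f(-5, 15, 5)))  # the midpoint exponent comes first
```

Because the middle exponents are evaluated first, grid.py can draw a rough accuracy contour early and refine it as worker results stream in.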
277 | 278 | def get_cmd(self,c,g): 279 | options=self.options 280 | cmdline = options.svmtrain_pathname 281 | if options.grid_with_c: 282 | cmdline += ' -c {0} '.format(c) 283 | if options.grid_with_g: 284 | cmdline += ' -g {0} '.format(g) 285 | cmdline += ' -v {0} {1} {2} '.format\ 286 | (options.fold,options.pass_through_string,options.dataset_pathname) 287 | return cmdline 288 | 289 | class LocalWorker(Worker): 290 | def run_one(self,c,g): 291 | cmdline = self.get_cmd(c,g) 292 | result = Popen(cmdline,shell=True,stdout=PIPE,stderr=PIPE,stdin=PIPE).stdout 293 | for line in result.readlines(): 294 | if str(line).find('Cross') != -1: 295 | return float(line.split()[-1][0:-1]) 296 | 297 | class SSHWorker(Worker): 298 | def __init__(self,name,job_queue,result_queue,host,options): 299 | Worker.__init__(self,name,job_queue,result_queue,options) 300 | self.host = host 301 | self.cwd = os.getcwd() 302 | def run_one(self,c,g): 303 | cmdline = 'ssh -x -t -t {0} "cd {1}; {2}"'.format\ 304 | (self.host,self.cwd,self.get_cmd(c,g)) 305 | result = Popen(cmdline,shell=True,stdout=PIPE,stderr=PIPE,stdin=PIPE).stdout 306 | for line in result.readlines(): 307 | if str(line).find('Cross') != -1: 308 | return float(line.split()[-1][0:-1]) 309 | 310 | class TelnetWorker(Worker): 311 | def __init__(self,name,job_queue,result_queue,host,username,password,options): 312 | Worker.__init__(self,name,job_queue,result_queue,options) 313 | self.host = host 314 | self.username = username 315 | self.password = password 316 | def run(self): 317 | import telnetlib 318 | self.tn = tn = telnetlib.Telnet(self.host) 319 | tn.read_until('login: ') 320 | tn.write(self.username + '\n') 321 | tn.read_until('Password: ') 322 | tn.write(self.password + '\n') 323 | 324 | # XXX: how to know whether login is successful? 
325 | tn.read_until(self.username) 326 | # 327 | print('login ok', self.host) 328 | tn.write('cd '+os.getcwd()+'\n') 329 | Worker.run(self) 330 | tn.write('exit\n') 331 | def run_one(self,c,g): 332 | cmdline = self.get_cmd(c,g) 333 | result = self.tn.write(cmdline+'\n') 334 | (idx,matchm,output) = self.tn.expect(['Cross.*\n']) 335 | for line in output.split('\n'): 336 | if str(line).find('Cross') != -1: 337 | return float(line.split()[-1][0:-1]) 338 | 339 | def find_parameters(dataset_pathname, options=''): 340 | 341 | def update_param(c,g,rate,best_c,best_g,best_rate,worker,resumed): 342 | if (rate > best_rate) or (rate==best_rate and g==best_g and c<best_c): -------------------------------------------------------------------------------- /binaries/windows/x86/tools/subset.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | import os, sys, math, random 4 | from collections import defaultdict 5 | 6 | if sys.version_info[0] >= 3: 7 | xrange = range 8 | 9 | def exit_with_help(argv): 10 | print("""\ 11 | Usage: {0} [options] dataset subset_size [output1] [output2] 12 | 13 | This script randomly selects a subset of the dataset. 14 | 15 | options: 16 | -s method : method of selection (default 0) 17 | 0 -- stratified selection (classification only) 18 | 1 -- random selection 19 | 20 | output1 : the subset (optional) 21 | output2 : rest of the data (optional) 22 | If output1 is omitted, the subset will be printed on the screen.""".format(argv[0])) 23 | exit(1) 24 | 25 | def process_options(argv): 26 | argc = len(argv) 27 | if argc < 3: 28 | exit_with_help(argv) 29 | 30 | # default method is stratified selection 31 | method = 0 32 | subset_file = sys.stdout 33 | rest_file = None 34 | 35 | i = 1 36 | while i < argc: 37 | if argv[i][0] != "-": 38 | break 39 | if argv[i] == "-s": 40 | i = i + 1 41 | method = int(argv[i]) 42 | if method not in [0,1]: 43 | print("Unknown selection method {0}".format(method)) 44 | exit_with_help(argv) 45 | i = i + 1 46 | 47 | dataset = argv[i] 48 | subset_size = int(argv[i+1]) 49 | if i+2 < argc: 50 | subset_file = open(argv[i+2],'w') 51 | if i+3 < argc: 52 | rest_file = open(argv[i+3],'w') 53 | 54 | return dataset, subset_size, method, subset_file, rest_file 55 | 56 | def random_selection(dataset, subset_size): 57 | l
= sum(1 for line in open(dataset,'r')) 58 | return sorted(random.sample(xrange(l), subset_size)) 59 | 60 | def stratified_selection(dataset, subset_size): 61 | labels = [line.split(None,1)[0] for line in open(dataset)] 62 | label_linenums = defaultdict(list) 63 | for i, label in enumerate(labels): 64 | label_linenums[label] += [i] 65 | 66 | l = len(labels) 67 | remaining = subset_size 68 | ret = [] 69 | 70 | # classes with fewer data are sampled first; otherwise 71 | # some rare classes may not be selected 72 | for label in sorted(label_linenums, key=lambda x: len(label_linenums[x])): 73 | linenums = label_linenums[label] 74 | label_size = len(linenums) 75 | # at least one instance per class 76 | s = int(min(remaining, max(1, math.ceil(label_size*(float(subset_size)/l))))) 77 | if s == 0: 78 | sys.stderr.write('''\ 79 | Error: failed to have at least one instance per class 80 | 1. You may have regression data. 81 | 2. Your classification data is unbalanced or too small. 82 | Please use -s 1. 
83 | ''') 84 | sys.exit(-1) 85 | remaining -= s 86 | ret += [linenums[i] for i in random.sample(xrange(label_size), s)] 87 | return sorted(ret) 88 | 89 | def main(argv=sys.argv): 90 | dataset, subset_size, method, subset_file, rest_file = process_options(argv) 91 | #uncomment the following line to fix the random seed 92 | #random.seed(0) 93 | selected_lines = [] 94 | 95 | if method == 0: 96 | selected_lines = stratified_selection(dataset, subset_size) 97 | elif method == 1: 98 | selected_lines = random_selection(dataset, subset_size) 99 | 100 | #select instances based on selected_lines 101 | dataset = open(dataset,'r') 102 | prev_selected_linenum = -1 103 | for i in xrange(len(selected_lines)): 104 | for cnt in xrange(selected_lines[i]-prev_selected_linenum-1): 105 | line = dataset.readline() 106 | if rest_file: 107 | rest_file.write(line) 108 | subset_file.write(dataset.readline()) 109 | prev_selected_linenum = selected_lines[i] 110 | subset_file.close() 111 | 112 | if rest_file: 113 | for line in dataset: 114 | rest_file.write(line) 115 | rest_file.close() 116 | dataset.close() 117 | 118 | if __name__ == '__main__': 119 | main(sys.argv) 120 | 121 | -------------------------------------------------------------------------------- /binaries/windows/x86/windows/cublas32_55.dll: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/binaries/windows/x86/windows/cublas32_55.dll -------------------------------------------------------------------------------- /binaries/windows/x86/windows/cudart32_55.dll: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/binaries/windows/x86/windows/cudart32_55.dll -------------------------------------------------------------------------------- /binaries/windows/x86/windows/libsvm.dll: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/binaries/windows/x86/windows/libsvm.dll -------------------------------------------------------------------------------- /binaries/windows/x86/windows/svm-predict.exe: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/binaries/windows/x86/windows/svm-predict.exe -------------------------------------------------------------------------------- /binaries/windows/x86/windows/svm-scale.exe: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/binaries/windows/x86/windows/svm-scale.exe -------------------------------------------------------------------------------- /binaries/windows/x86/windows/svm-toy.exe: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/binaries/windows/x86/windows/svm-toy.exe -------------------------------------------------------------------------------- /binaries/windows/x86/windows/svm-train-gpu.exe: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/binaries/windows/x86/windows/svm-train-gpu.exe -------------------------------------------------------------------------------- /binaries/windows/x86/windows/svm-train.exe: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/binaries/windows/x86/windows/svm-train.exe -------------------------------------------------------------------------------- 
/src/linux/COPYRIGHT: -------------------------------------------------------------------------------- 1 | 2 | Copyright (c) 2000-2010 Chih-Chung Chang and Chih-Jen Lin 3 | All rights reserved. 4 | 5 | Redistribution and use in source and binary forms, with or without 6 | modification, are permitted provided that the following conditions 7 | are met: 8 | 9 | 1. Redistributions of source code must retain the above copyright 10 | notice, this list of conditions and the following disclaimer. 11 | 12 | 2. Redistributions in binary form must reproduce the above copyright 13 | notice, this list of conditions and the following disclaimer in the 14 | documentation and/or other materials provided with the distribution. 15 | 16 | 3. Neither name of copyright holders nor the names of its contributors 17 | may be used to endorse or promote products derived from this software 18 | without specific prior written permission. 19 | 20 | 21 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 22 | ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 23 | LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 24 | A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR 25 | CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, 26 | EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, 27 | PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR 28 | PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF 29 | LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 30 | NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 31 | SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
32 | -------------------------------------------------------------------------------- /src/linux/Makefile: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # 3 | # Copyright 1993-2013 NVIDIA Corporation. All rights reserved. 4 | # 5 | # NOTICE TO USER: 6 | # 7 | # This source code is subject to NVIDIA ownership rights under U.S. and 8 | # international Copyright laws. 9 | # 10 | # NVIDIA MAKES NO REPRESENTATION ABOUT THE SUITABILITY OF THIS SOURCE 11 | # CODE FOR ANY PURPOSE. IT IS PROVIDED "AS IS" WITHOUT EXPRESS OR 12 | # IMPLIED WARRANTY OF ANY KIND. NVIDIA DISCLAIMS ALL WARRANTIES WITH 13 | # REGARD TO THIS SOURCE CODE, INCLUDING ALL IMPLIED WARRANTIES OF 14 | # MERCHANTABILITY, NONINFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE. 15 | # IN NO EVENT SHALL NVIDIA BE LIABLE FOR ANY SPECIAL, INDIRECT, INCIDENTAL, 16 | # OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS 17 | # OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE 18 | # OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE 19 | # OR PERFORMANCE OF THIS SOURCE CODE. 20 | # 21 | # U.S. Government End Users. This source code is a "commercial item" as 22 | # that term is defined at 48 C.F.R. 2.101 (OCT 1995), consisting of 23 | # "commercial computer software" and "commercial computer software 24 | # documentation" as such terms are used in 48 C.F.R. 12.212 (SEPT 1995) 25 | # and is provided to the U.S. Government only as a commercial end item. 26 | # Consistent with 48 C.F.R.12.212 and 48 C.F.R. 227.7202-1 through 27 | # 227.7202-4 (JUNE 1995), all U.S. Government End Users acquire the 28 | # source code with only those rights set forth herein. 
29 | # 30 | ################################################################################ 31 | # 32 | # Makefile project only supported on Mac OS X and Linux Platforms) 33 | # 34 | ################################################################################ 35 | 36 | include ./findcudalib.mk 37 | 38 | # Location of the CUDA Toolkit 39 | CUDA_PATH ?= "/usr/local/cuda-5.5" 40 | 41 | # internal flags 42 | NVCCFLAGS := -m${OS_SIZE} -maxrregcount=16 43 | CCFLAGS := 44 | NVCCLDFLAGS := 45 | LDFLAGS := 46 | 47 | # Extra user flags 48 | EXTRA_NVCCFLAGS ?= 49 | EXTRA_NVCCLDFLAGS ?= 50 | EXTRA_LDFLAGS ?= 51 | EXTRA_CCFLAGS ?= 52 | 53 | # OS-specific build flags 54 | ifneq ($(DARWIN),) 55 | LDFLAGS += -rpath $(CUDA_PATH)/lib 56 | CCFLAGS += -arch $(OS_ARCH) $(STDLIB) 57 | else 58 | ifeq ($(OS_ARCH),armv7l) 59 | ifeq ($(abi),gnueabi) 60 | CCFLAGS += -mfloat-abi=softfp 61 | else 62 | # default to gnueabihf 63 | override abi := gnueabihf 64 | LDFLAGS += --dynamic-linker=/lib/ld-linux-armhf.so.3 65 | CCFLAGS += -mfloat-abi=hard 66 | endif 67 | endif 68 | endif 69 | 70 | ifeq ($(ARMv7),1) 71 | NVCCFLAGS += -target-cpu-arch ARM 72 | ifneq ($(TARGET_FS),) 73 | CCFLAGS += --sysroot=$(TARGET_FS) 74 | LDFLAGS += --sysroot=$(TARGET_FS) 75 | LDFLAGS += -rpath-link=$(TARGET_FS)/lib 76 | LDFLAGS += -rpath-link=$(TARGET_FS)/usr/lib 77 | LDFLAGS += -rpath-link=$(TARGET_FS)/usr/lib/arm-linux-$(abi) 78 | endif 79 | endif 80 | 81 | # Debug build flags 82 | ifeq ($(dbg),1) 83 | NVCCFLAGS += -g -G 84 | TARGET := debug 85 | else 86 | TARGET := release 87 | endif 88 | 89 | ALL_CCFLAGS := 90 | ALL_CCFLAGS += $(NVCCFLAGS) 91 | ALL_CCFLAGS += $(addprefix -Xcompiler ,$(CCFLAGS)) 92 | ALL_CCFLAGS += $(EXTRA_NVCCFLAGS) 93 | ALL_CCFLAGS += $(addprefix -Xcompiler ,$(EXTRA_CCFLAGS)) 94 | 95 | ALL_LDFLAGS := 96 | ALL_LDFLAGS += $(ALL_CCFLAGS) 97 | ALL_LDFLAGS += $(NVCCLDFLAGS) 98 | ALL_LDFLAGS += $(addprefix -Xlinker ,$(LDFLAGS)) 99 | ALL_LDFLAGS += $(EXTRA_NVCCLDFLAGS) 100 | ALL_LDFLAGS += 
$(addprefix -Xlinker ,$(EXTRA_LDFLAGS)) 101 | 102 | # Common includes and paths for CUDA 103 | INCLUDES := -I/usr/local/cuda-5.5/include 104 | INCLUDES += -I/usr/local/cuda-5.5/samples/common/inc 105 | LIBRARIES := -L/usr/local/cuda-5.5/lib64 106 | 107 | ################################################################################ 108 | 109 | LIBRARIES += -lcublas -lcudart 110 | 111 | ################################################################################ 112 | 113 | # CUDA code generation flags 114 | ifneq ($(OS_ARCH),armv7l) 115 | GENCODE_SM10 := -gencode arch=compute_10,code=sm_10 116 | endif 117 | GENCODE_SM20 := -gencode arch=compute_20,code=sm_20 118 | GENCODE_SM30 := -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=\"sm_35,compute_35\" 119 | GENCODE_FLAGS := $(GENCODE_SM10) $(GENCODE_SM20) $(GENCODE_SM30) 120 | 121 | ################################################################################ 122 | 123 | 124 | # Target rules 125 | all: build 126 | 127 | build: svm-train-gpu 128 | 129 | svm.o: svm.cpp 130 | $(NVCC) $(INCLUDES) $(ALL_CCFLAGS) $(GENCODE_FLAGS) -o $@ -c $< 131 | 132 | svm-train.o: svm-train.c 133 | $(NVCC) $(INCLUDES) $(ALL_CCFLAGS) $(GENCODE_FLAGS) -o $@ -c $< 134 | 135 | svm-train-gpu: svm.o svm-train.o 136 | $(NVCC) $(ALL_LDFLAGS) -o $@ $+ $(LIBRARIES) 137 | mkdir -p /bin/$(OS_ARCH)/$(OSLOWER)/$(TARGET)$(if $(abi),/$(abi)) 138 | cp $@ /bin/$(OS_ARCH)/$(OSLOWER)/$(TARGET)$(if $(abi),/$(abi)) 139 | 140 | run: build 141 | ./svm-train-gpu 142 | 143 | clean: 144 | rm -f svm.o svm-train.o svm-train-gpu 145 | rm -rf /bin/$(OS_ARCH)/$(OSLOWER)/$(TARGET)$(if $(abi),/$(abi))/svm-train-gpu 146 | 147 | clobber: clean 148 | -------------------------------------------------------------------------------- /src/linux/README-GPU: -------------------------------------------------------------------------------- 1 | GPU-Accelerated LIBSVM exploits the GPU, using the CUDA interface, to
This package contains a new executable for 3 | training classifiers "svm-train-gpu.exe" together with the original one. 4 | The use of the new executable is exactly the same as with the original one. 5 | 6 | 7 | 8 | FEATURES 9 | 10 | Mode Supported 11 | 12 | * c-svc classification with RBF kernel 13 | 14 | Functionality / User interface 15 | 16 | * Same as LIBSVM 17 | 18 | 19 | PREREQUISITES 20 | 21 | * NVIDIA Graphics card with CUDA support 22 | * Latest NVIDIA drivers for GPU 23 | * CUDA toolkit & GPU Computing SDK 5.5 24 | 25 | Download all in one package from: 26 | https://developer.nvidia.com/cuda-downloads 27 | 28 | 29 | INSTRUCTIONS 30 | 31 | 1. Install the NVIDIA drivers, CUDA toolkit and GPU Computing SDK code samples. You can find them all in one package here: 32 | 33 | https://developer.nvidia.com/cuda-downloads (Version 5.5) 34 | 35 | You may need some additional packets to be installed in order to complete the installation above. 36 | 37 | A very helpful and descriptive guide is on the CUDA webpage: 38 | 39 | http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux/index.html 40 | 41 | Make sure you have followed every step that is relevant to your system, like declaring $PATH and $LD_LIBRARY_PATH on your bash configuration file. 42 | 43 | 2. Copy this folder anywhere you like. 44 | 45 | 3. Use the Makefile found inside this folder. 46 | 47 | 4. Find the "svm-train-gpu" executable inside this folder. 48 | 49 | 50 | Troubleshooting 51 | 52 | 1. Nearly all problems are resolved by reading carefully through the nvidia guidelines. 53 | 54 | 2. When making, there is a chance a "cannot find cublas_v2.h" or "cuda_runtime.h" error to arise. Find where these files are located (Default path is: "/usr/local/cuda-5.5/include") and replace the paths on the header files in "kernel_matrix_calculation.c" file with your system paths. 
You can also change the default CUDA toolkit location in the Makefile (CUDA_PATH ?= "/usr/local/cuda-5.5") to point to your own CUDA toolkit path. 55 | 56 | 57 | 58 | Additional Information 59 | ====================== 60 | 61 | If you find GPU-Accelerated LIBSVM helpful, please cite it as 62 | 63 | A. Athanasopoulos, A. Dimou, V. Mezaris, I. Kompatsiaris, "GPU Acceleration for Support Vector Machines", 64 | Proc. 12th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS 2011), Delft, The Netherlands, April 2011. 65 | 66 | Software available at http://mklab.iti.gr/project/GPU-LIBSVM 67 | -------------------------------------------------------------------------------- /src/linux/cross_validation_with_matrix_precomputation.c: -------------------------------------------------------------------------------- 1 | void setup_pkm(struct svm_problem *p_km) 2 | { 3 | 4 | int i; 5 | 6 | p_km->l = prob.l; 7 | p_km->x = Malloc(struct svm_node,p_km->l); 8 | p_km->y = Malloc(double,p_km->l); 9 | 10 | for(i=0;i<prob.l;i++) 11 | { 12 | (p_km->x+i)->values = Malloc(double,prob.l+1); 13 | (p_km->x+i)->dim = prob.l+1; 14 | } 15 | 16 | for( i=0; i<prob.l; i++ ) p_km->y[i] = prob.y[i]; 17 | } 18 | 19 | void free_pkm(struct svm_problem *p_km) 20 | { 21 | 22 | int i; 23 | 24 | for(i=0;i<p_km->l;i++) 25 | free( (p_km->x+i)->values ); 26 | 27 | free( p_km->x ); 28 | free( p_km->y ); 29 | 30 | } 31 | 32 | 33 | double do_crossvalidation(struct svm_problem * p_km) 34 | { 35 | double rate; 36 | 37 | int i; 38 | int total_correct = 0; 39 | double total_error = 0; 40 | double sumv = 0, sumy = 0, sumvv = 0, sumyy = 0, sumvy = 0; 41 | double *target = Malloc(double,prob.l); 42 | 43 | svm_cross_validation(p_km,&param,nr_fold,target); 44 | 45 | 46 | if(param.svm_type == EPSILON_SVR || 47 | param.svm_type == NU_SVR) 48 | { 49 | for(i=0;i<prob.l;i++) -------------------------------------------------------------------------------- /src/linux/findcudalib.mk: -------------------------------------------------------------------------------- 38 | OSUPPER = $(shell uname -s 2>/dev/null | tr "[:lower:]" "[:upper:]") 39 | OSLOWER = $(shell uname -s 2>/dev/null | tr "[:upper:]" "[:lower:]") 40 | 41 | # Flags to detect 32-bit or 64-bit OS platform 42 | OS_SIZE = $(shell uname -m | sed -e "s/i.86/32/" -e
"s/x86_64/64/" -e "s/armv7l/32/") 43 | OS_ARCH = $(shell uname -m | sed -e "s/i386/i686/") 44 | 45 | # Determine OS platform and unix distribution 46 | ifeq ("$(OSLOWER)","linux") 47 | # first search lsb_release 48 | DISTRO = $(shell lsb_release -i -s 2>/dev/null | tr "[:upper:]" "[:lower:]") 49 | DISTVER = $(shell lsb_release -r -s 2>/dev/null) 50 | ifeq ("$(DISTRO)",'') 51 | # second search and parse /etc/issue 52 | DISTRO = $(shell more /etc/issue | awk '{print $$1}' | sed '1!d' | sed -e "/^$$/d" 2>/dev/null | tr "[:upper:]" "[:lower:]") 53 | DISTVER= $(shell more /etc/issue | awk '{print $$2}' | sed '1!d' 2>/dev/null 54 | endif 55 | ifeq ("$(DISTRO)",'') 56 | # third, we can search in /etc/os-release or /etc/{distro}-release 57 | DISTRO = $(shell awk '/ID/' /etc/*-release | sed 's/ID=//' | grep -v "VERSION" | grep -v "ID" | grep -v "DISTRIB") 58 | DISTVER= $(shell awk '/DISTRIB_RELEASE/' /etc/*-release | sed 's/DISTRIB_RELEASE=//' | grep -v "DISTRIB_RELEASE") 59 | endif 60 | endif 61 | 62 | # search at Darwin (unix based info) 63 | DARWIN = $(strip $(findstring DARWIN, $(OSUPPER))) 64 | ifneq ($(DARWIN),) 65 | SNOWLEOPARD = $(strip $(findstring 10.6, $(shell egrep "10\.6" /System/Library/CoreServices/SystemVersion.plist))) 66 | LION = $(strip $(findstring 10.7, $(shell egrep "10\.7" /System/Library/CoreServices/SystemVersion.plist))) 67 | MOUNTAIN = $(strip $(findstring 10.8, $(shell egrep "10\.8" /System/Library/CoreServices/SystemVersion.plist))) 68 | MAVERICKS = $(strip $(findstring 10.9, $(shell egrep "10\.9" /System/Library/CoreServices/SystemVersion.plist))) 69 | endif 70 | 71 | # Common binaries 72 | GCC ?= g++ 73 | CLANG ?= /usr/bin/clang 74 | 75 | ifeq ("$(OSUPPER)","LINUX") 76 | NVCC ?= $(CUDA_PATH)/bin/nvcc -ccbin $(GCC) 77 | else 78 | # for some newer versions of XCode, CLANG is the default compiler, so we need to include this 79 | ifneq ($(MAVERICKS),) 80 | NVCC ?= $(CUDA_PATH)/bin/nvcc -ccbin $(CLANG) 81 | STDLIB ?= -stdlib=libstdc++ 82 | else 83 
| NVCC ?= $(CUDA_PATH)/bin/nvcc -ccbin $(GCC) 84 | endif 85 | endif 86 | 87 | # Take command line flags that override any of these settings 88 | ifeq ($(i386),1) 89 | OS_SIZE = 32 90 | OS_ARCH = i686 91 | endif 92 | ifeq ($(x86_64),1) 93 | OS_SIZE = 64 94 | OS_ARCH = x86_64 95 | endif 96 | ifeq ($(ARMv7),1) 97 | OS_SIZE = 32 98 | OS_ARCH = armv7l 99 | endif 100 | 101 | ifeq ("$(OSUPPER)","LINUX") 102 | # Each Linux Distribuion has a set of different paths. This applies especially when using the Linux RPM/debian packages 103 | ifeq ("$(DISTRO)","ubuntu") 104 | CUDAPATH ?= /usr/lib/nvidia-current 105 | CUDALINK ?= -L/usr/lib/nvidia-current 106 | DFLT_PATH = /usr/lib 107 | endif 108 | ifeq ("$(DISTRO)","kubuntu") 109 | CUDAPATH ?= /usr/lib/nvidia-current 110 | CUDALINK ?= -L/usr/lib/nvidia-current 111 | DFLT_PATH = /usr/lib 112 | endif 113 | ifeq ("$(DISTRO)","debian") 114 | CUDAPATH ?= /usr/lib/nvidia-current 115 | CUDALINK ?= -L/usr/lib/nvidia-current 116 | DFLT_PATH = /usr/lib 117 | endif 118 | ifeq ("$(DISTRO)","suse") 119 | ifeq ($(OS_SIZE),64) 120 | CUDAPATH ?= 121 | CUDALINK ?= 122 | DFLT_PATH = /usr/lib64 123 | else 124 | CUDAPATH ?= 125 | CUDALINK ?= 126 | DFLT_PATH = /usr/lib 127 | endif 128 | endif 129 | ifeq ("$(DISTRO)","suse linux") 130 | ifeq ($(OS_SIZE),64) 131 | CUDAPATH ?= 132 | CUDALINK ?= 133 | DFLT_PATH = /usr/lib64 134 | else 135 | CUDAPATH ?= 136 | CUDALINK ?= 137 | DFLT_PATH = /usr/lib 138 | endif 139 | endif 140 | ifeq ("$(DISTRO)","opensuse") 141 | ifeq ($(OS_SIZE),64) 142 | CUDAPATH ?= 143 | CUDALINK ?= 144 | DFLT_PATH = /usr/lib64 145 | else 146 | CUDAPATH ?= 147 | CUDALINK ?= 148 | DFLT_PATH = /usr/lib 149 | endif 150 | endif 151 | ifeq ("$(DISTRO)","fedora") 152 | ifeq ($(OS_SIZE),64) 153 | CUDAPATH ?= /usr/lib64/nvidia 154 | CUDALINK ?= -L/usr/lib64/nvidia 155 | DFLT_PATH = /usr/lib64 156 | else 157 | CUDAPATH ?= 158 | CUDALINK ?= 159 | DFLT_PATH = /usr/lib 160 | endif 161 | endif 162 | ifeq ("$(DISTRO)","redhat") 163 | ifeq 
($(OS_SIZE),64) 164 | CUDAPATH ?= /usr/lib64/nvidia 165 | CUDALINK ?= -L/usr/lib64/nvidia 166 | DFLT_PATH = /usr/lib64 167 | else 168 | CUDAPATH ?= 169 | CUDALINK ?= 170 | DFLT_PATH = /usr/lib 171 | endif 172 | endif 173 | ifeq ("$(DISTRO)","red") 174 | ifeq ($(OS_SIZE),64) 175 | CUDAPATH ?= /usr/lib64/nvidia 176 | CUDALINK ?= -L/usr/lib64/nvidia 177 | DFLT_PATH = /usr/lib64 178 | else 179 | CUDAPATH ?= 180 | CUDALINK ?= 181 | DFLT_PATH = /usr/lib 182 | endif 183 | endif 184 | ifeq ("$(DISTRO)","redhatenterpriseworkstation") 185 | ifeq ($(OS_SIZE),64) 186 | CUDAPATH ?= /usr/lib64/nvidia 187 | CUDALINK ?= -L/usr/lib64/nvidia 188 | DFLT_PATH ?= /usr/lib64 189 | else 190 | CUDAPATH ?= 191 | CUDALINK ?= 192 | DFLT_PATH ?= /usr/lib 193 | endif 194 | endif 195 | ifeq ("$(DISTRO)","centos") 196 | ifeq ($(OS_SIZE),64) 197 | CUDAPATH ?= /usr/lib64/nvidia 198 | CUDALINK ?= -L/usr/lib64/nvidia 199 | DFLT_PATH = /usr/lib64 200 | else 201 | CUDAPATH ?= 202 | CUDALINK ?= 203 | DFLT_PATH = /usr/lib 204 | endif 205 | endif 206 | 207 | ifeq ($(ARMv7),1) 208 | CUDAPATH := /usr/arm-linux-gnueabihf/lib 209 | CUDALINK := -L/usr/arm-linux-gnueabihf/lib 210 | ifneq ($(TARGET_FS),) 211 | CUDAPATH += $(TARGET_FS)/usr/lib/nvidia-current 212 | CUDALINK += -L$(TARGET_FS)/usr/lib/nvidia-current 213 | endif 214 | endif 215 | 216 | # Search for Linux distribution path for libcuda.so 217 | CUDALIB ?= $(shell find $(CUDAPATH) $(DFLT_PATH) -name libcuda.so -print 2>/dev/null) 218 | 219 | ifeq ("$(CUDALIB)",'') 220 | $(info >>> WARNING - CUDA Driver libcuda.so is not found. Please check and re-install the NVIDIA driver. 
<<<) 221 | EXEC=@echo "[@]" 222 | endif 223 | else 224 | # This would be the Mac OS X path if we had to do anything special 225 | endif 226 | 227 | -------------------------------------------------------------------------------- /src/linux/kernel_matrix_calculation.c: -------------------------------------------------------------------------------- 1 | #include "/usr/local/cuda-5.5/include/cuda_runtime.h" 2 | #include "/usr/local/cuda-5.5/include/cublas_v2.h" 3 | 4 | // Scalars 5 | const float alpha = 1; 6 | const float beta = 0; 7 | 8 | void ckm( struct svm_problem *prob, struct svm_problem *pecm, float *gamma ) 9 | { 10 | cublasStatus_t status; 11 | 12 | double g_val = *gamma; 13 | 14 | long int nfa; 15 | 16 | int len_tv; 17 | int ntv; 18 | int i_v; 19 | int i_el; 20 | int i_r, i_c; 21 | int trvei; 22 | 23 | double *tv_sq; 24 | double *v_f_g; 25 | 26 | float *tr_ar; 27 | float *tva, *vtm, *DP; 28 | float *g_tva = 0, *g_vtm = 0, *g_DotProd = 0; 29 | 30 | cudaError_t cudaStat; 31 | cublasHandle_t handle; 32 | 33 | status = cublasCreate(&handle); 34 | 35 | len_tv = prob-> x[0].dim; 36 | ntv = prob-> l; 37 | 38 | nfa = len_tv * ntv; 39 | 40 | tva = (float*) malloc ( len_tv * ntv* sizeof(float) ); 41 | vtm = (float*) malloc ( len_tv * sizeof(float) ); 42 | DP = (float*) malloc ( ntv * sizeof(float) ); 43 | 44 | tr_ar = (float*) malloc ( len_tv * ntv* sizeof(float) ); 45 | 46 | tv_sq = (double*) malloc ( ntv * sizeof(double) ); 47 | 48 | v_f_g = (double*) malloc ( ntv * sizeof(double) ); 49 | 50 | for ( i_r = 0; i_r < ntv ; i_r++ ) 51 | { 52 | for ( i_c = 0; i_c < len_tv; i_c++ ) 53 | tva[i_r * len_tv + i_c] = (float)prob-> x[i_r].values[i_c]; 54 | } 55 | 56 | cudaStat = cudaMalloc((void**)&g_tva, len_tv * ntv * sizeof(float)); 57 | 58 | if (cudaStat != cudaSuccess) { 59 | free( tva ); 60 | free( vtm ); 61 | free( DP ); 62 | 63 | free( v_f_g ); 64 | free( tv_sq ); 65 | 66 | cudaFree( g_tva ); 67 | cublasDestroy( handle ); 68 | 69 | fprintf (stderr, "!!!! 
Device memory allocation error (A)\n"); 70 | getchar(); 71 | return; 72 | } 73 | 74 | cudaStat = cudaMalloc((void**)&g_vtm, len_tv * sizeof(float)); 75 | 76 | cudaStat = cudaMalloc((void**)&g_DotProd, ntv * sizeof(float)); 77 | 78 | for( i_r = 0; i_r < ntv; i_r++ ) 79 | for( i_c = 0; i_c < len_tv; i_c++ ) 80 | tr_ar[i_c * ntv + i_r] = tva[i_r * len_tv + i_c]; 81 | 82 | // Copy cpu vector to gpu vector 83 | status = cublasSetVector( len_tv * ntv, sizeof(float), tr_ar, 1, g_tva, 1 ); 84 | 85 | free( tr_ar ); 86 | 87 | for( i_v = 0; i_v < ntv; i_v++ ) 88 | { 89 | tv_sq[ i_v ] = 0; 90 | for( i_el = 0; i_el < len_tv; i_el++ ) 91 | tv_sq[i_v] += pow( tva[i_v*len_tv + i_el], (float)2.0 ); 92 | } 93 | 94 | 95 | 96 | for ( trvei = 0; trvei < ntv; trvei++ ) 97 | { 98 | status = cublasSetVector( len_tv, sizeof(float), &tva[trvei * len_tv], 1, g_vtm, 1 ); 99 | 100 | status = cublasSgemv( handle, CUBLAS_OP_N, ntv, len_tv, &alpha, g_tva, ntv , g_vtm, 1, &beta, g_DotProd, 1 ); 101 | 102 | status = cublasGetVector( ntv, sizeof(float), g_DotProd, 1, DP, 1 ); 103 | 104 | for ( i_c = 0; i_c < ntv; i_c++ ) 105 | v_f_g[i_c] = exp( -g_val * (tv_sq[trvei] + tv_sq[i_c]-((double)2.0)* (double)DP[i_c] )); 106 | 107 | 108 | pecm-> x[trvei].values[0] = trvei + 1; 109 | 110 | for ( i_c = 0; i_c < ntv; i_c++ ) 111 | pecm-> x[trvei].values[i_c + 1] = v_f_g[i_c]; 112 | 113 | 114 | } 115 | 116 | free( tva ); 117 | free( vtm ); 118 | free( DP ); 119 | free( v_f_g ); 120 | free( tv_sq ); 121 | 122 | cudaFree( g_tva ); 123 | cudaFree( g_vtm ); 124 | cudaFree( g_DotProd ); 125 | 126 | cublasDestroy( handle ); 127 | } 128 | 129 | void cal_km( struct svm_problem * p_km) 130 | { 131 | float gamma = param.gamma; 132 | 133 | ckm(&prob, p_km, &gamma); 134 | } 135 | -------------------------------------------------------------------------------- /src/linux/readme.txt: -------------------------------------------------------------------------------- 1 | Instructions to compile Linux GPU-Accelerated LIBSVM 2 | 
3 | 1. Install the NVIDIA drivers, CUDA toolkit and GPU Computing SDK code samples. You can find them all in one package here: 4 | 5 | https://developer.nvidia.com/cuda-downloads (Version 5.5) 6 | 7 | You may need to install some additional packages to complete the installation above. 8 | 9 | A very helpful and descriptive guide is available on the CUDA webpage: 10 | 11 | http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux/index.html 12 | 13 | Make sure you have followed every step that is relevant to your system, such as declaring $PATH and $LD_LIBRARY_PATH in your bash configuration file. 14 | 15 | 2. Copy this folder anywhere you like. 16 | 17 | 3. Use the Makefile found inside this folder. 18 | 19 | 4. Find the "svm-train-gpu" executable inside this folder. 20 | 21 | 22 | Troubleshooting 23 | 24 | 1. Nearly all problems are resolved by reading carefully through the NVIDIA guidelines. 25 | 26 | 2. When running make, a "cannot find cublas_v2.h" or "cuda_runtime.h" error may arise. Find where these files are located (the default path is "/usr/local/cuda-5.5/include") and replace the paths in the header #include directives of the "kernel_matrix_calculation.c" file with your system paths.
27 | 28 | Alternatively, you can change the default CUDA toolkit path in the Makefile (CUDA_PATH ?= "/usr/local/cuda-5.5") to point to your CUDA toolkit location 29 | -------------------------------------------------------------------------------- /src/linux/svm-train.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <stdlib.h> 3 | #include <string.h> 4 | #include <ctype.h> 5 | #include <errno.h> 6 | #include "svm.h" 7 | #define Malloc(type,n) (type *)malloc((n)*sizeof(type)) 8 | 9 | void print_null(const char *s) {} 10 | 11 | void exit_with_help() 12 | { 13 | printf( 14 | "Usage: svm-train [options] training_set_file [model_file]\n" 15 | "options:\n" 16 | "-s svm_type : set type of SVM (default 0)\n" 17 | " 0 -- C-SVC (multi-class classification)\n" 18 | " 1 -- nu-SVC (multi-class classification)\n" 19 | " 2 -- one-class SVM\n" 20 | " 3 -- epsilon-SVR (regression)\n" 21 | " 4 -- nu-SVR (regression)\n" 22 | "-t kernel_type : set type of kernel function (default 2)\n" 23 | " 0 -- linear: u'*v\n" 24 | " 1 -- polynomial: (gamma*u'*v + coef0)^degree\n" 25 | " 2 -- radial basis function: exp(-gamma*|u-v|^2)\n" 26 | " 3 -- sigmoid: tanh(gamma*u'*v + coef0)\n" 27 | " 4 -- precomputed kernel (kernel values in training_set_file)\n" 28 | "-d degree : set degree in kernel function (default 3)\n" 29 | "-g gamma : set gamma in kernel function (default 1/num_features)\n" 30 | "-r coef0 : set coef0 in kernel function (default 0)\n" 31 | "-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)\n" 32 | "-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)\n" 33 | "-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)\n" 34 | "-m cachesize : set cache memory size in MB (default 100)\n" 35 | "-e epsilon : set tolerance of termination criterion (default 0.001)\n" 36 | "-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)\n" 37 | "-b probability_estimates : whether to train a SVC or SVR
model for probability estimates, 0 or 1 (default 0)\n" 38 | "-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)\n" 39 | "-v n: n-fold cross validation mode\n" 40 | "-q : quiet mode (no outputs)\n" 41 | ); 42 | exit(1); 43 | } 44 | 45 | void exit_input_error(int line_num) 46 | { 47 | fprintf(stderr,"Wrong input format at line %d\n", line_num); 48 | exit(1); 49 | } 50 | 51 | void parse_command_line(int argc, char **argv, char *input_file_name, char *model_file_name); 52 | void read_problem(const char *filename); 53 | void do_cross_validation(); 54 | 55 | struct svm_parameter param; // set by parse_command_line 56 | struct svm_problem prob; // set by read_problem 57 | struct svm_model *model; 58 | struct svm_node *x_space; 59 | int cross_validation; 60 | int nr_fold; 61 | 62 | static char *line = NULL; 63 | static int max_line_len; 64 | 65 | #include "kernel_matrix_calculation.c" 66 | #include "cross_validation_with_matrix_precomputation.c" 67 | 68 | static char* readline(FILE *input) 69 | { 70 | int len; 71 | 72 | if(fgets(line,max_line_len,input) == NULL) 73 | return NULL; 74 | 75 | while(strrchr(line,'\n') == NULL) 76 | { 77 | max_line_len *= 2; 78 | line = (char *) realloc(line,max_line_len); 79 | len = (int) strlen(line); 80 | if(fgets(line+len,max_line_len-len,input) == NULL) 81 | break; 82 | } 83 | return line; 84 | } 85 | 86 | int main(int argc, char **argv) 87 | { 88 | int i; 89 | char input_file_name[1024]; 90 | char model_file_name[1024]; 91 | const char *error_msg; 92 | 93 | parse_command_line(argc, argv, input_file_name, model_file_name); 94 | read_problem(input_file_name); 95 | error_msg = svm_check_parameter(&prob,&param); 96 | if(error_msg) 97 | { 98 | fprintf(stderr,"ERROR: %s\n",error_msg); 99 | exit(1); 100 | } 101 | 102 | if(cross_validation) 103 | { 104 | do_cross_validation_with_KM_precalculated( ); 105 | 106 | // do_cross_validation(); 107 | } 108 | else 109 | { 110 | model = svm_train(&prob,&param); 111 |
if(svm_save_model(model_file_name,model)) 112 | { 113 | fprintf(stderr, "can't save model to file %s\n", model_file_name); 114 | exit(1); 115 | } 116 | svm_free_and_destroy_model(&model); 117 | } 118 | svm_destroy_param(&param); 119 | free(prob.y); 120 | 121 | #ifdef _DENSE_REP 122 | for (i = 0; i < prob.l; ++i) 123 | free((prob.x+i)->values); 124 | #else 125 | free(x_space); 126 | #endif 127 | free(prob.x); 128 | free(line); 129 | 130 | return 0; 131 | } 132 | 133 | void do_cross_validation() 134 | { 135 | int i; 136 | int total_correct = 0; 137 | double total_error = 0; 138 | double sumv = 0, sumy = 0, sumvv = 0, sumyy = 0, sumvy = 0; 139 | double *target = Malloc(double,prob.l); 140 | 141 | svm_cross_validation(&prob,&param,nr_fold,target); 142 | if(param.svm_type == EPSILON_SVR || 143 | param.svm_type == NU_SVR) 144 | { 145 | for(i=0;i<prob.l;i++) 146 | { 147 | double y = prob.y[i]; 148 | double v = target[i]; 149 | total_error += (v-y)*(v-y); 150 | sumv += v; 151 | sumy += y; 152 | sumvv += v*v; 153 | sumyy += y*y; 154 | sumvy += v*y; 155 | } 156 | printf("Cross Validation Mean squared error = %g\n",total_error/prob.l); 157 | printf("Cross Validation Squared correlation coefficient = %g\n", 158 | ((prob.l*sumvy-sumv*sumy)*(prob.l*sumvy-sumv*sumy))/ 159 | ((prob.l*sumvv-sumv*sumv)*(prob.l*sumyy-sumy*sumy)) 160 | ); 161 | } 162 | else 163 | { 164 | for(i=0;i<prob.l;i++) 165 | if(target[i] == prob.y[i]) 166 | ++total_correct; 167 | printf("Cross Validation Accuracy = %g%%\n",100.0*total_correct/prob.l); 168 | } 169 | free(target); 170 | } 171 | 172 | void parse_command_line(int argc, char **argv, char *input_file_name, char *model_file_name) 173 | { 174 | int i; 175 | void (*print_func)(const char*) = NULL; // default printing to stdout 176 | 177 | // default values 178 | param.svm_type = C_SVC; 179 | param.kernel_type = RBF; 180 | param.degree = 3; 181 | param.gamma = 0; // 1/num_features 182 | param.coef0 = 0; 183 | param.nu = 0.5; 184 | param.cache_size = 100; 185 | param.C = 1; 186 | param.eps = 1e-3; 187 | param.p = 0.1; 188 | param.shrinking = 1; 189 | param.probability = 0; 190 | param.nr_weight = 0; 191 | param.weight_label = NULL; 192 | param.weight = NULL; 193 | cross_validation = 0; 194 | 195 | // parse options 196 | for(i=1;i<argc;i++) 197 | { 198 | if(argv[i][0] != '-') break; 199 | if(++i>=argc) 200 | exit_with_help(); 201 | switch(argv[i-1][1]) 202 | { 203 | case 's': 204 | param.svm_type = atoi(argv[i]); 205 | break; 206 | case 't': 207 | param.kernel_type = atoi(argv[i]); 208 | break; 209 | case 'd': 210 | param.degree = atoi(argv[i]); 211 | break; 212 | case 'g': 213 | param.gamma = atof(argv[i]); 214 | break; 215 | case 'r': 216 | param.coef0 = atof(argv[i]); 217 | break; 218 | case 'n': 219 | param.nu = atof(argv[i]); 220 | break; 221 | case 'm': 222 | param.cache_size = atof(argv[i]); 223 | break; 224 | case 'c': 225 | param.C = atof(argv[i]); 226 | break; 227 | case 'e': 228 | param.eps = atof(argv[i]); 229 | break; 230 | case 'p': 231 | param.p = atof(argv[i]); 232 | break; 233 | case 'h': 234 | param.shrinking = atoi(argv[i]); 235 | break; 236 | case 'b': 237 | param.probability = atoi(argv[i]); 238 | break; 239 | case 'q': 240 | print_func = &print_null; 241 | i--; 242 | break; 243 | case 'v': 244 | cross_validation = 1; 245 | nr_fold = atoi(argv[i]); 246 | if(nr_fold < 2) 247 | { 248 | fprintf(stderr,"n-fold cross validation: n must >= 2\n"); 249 | exit_with_help(); 250 | } 251 | break; 252 | case 'w': 253 |
++param.nr_weight; 254 | param.weight_label = (int *)realloc(param.weight_label,sizeof(int)*param.nr_weight); 255 | param.weight = (double *)realloc(param.weight,sizeof(double)*param.nr_weight); 256 | param.weight_label[param.nr_weight-1] = atoi(&argv[i-1][2]); 257 | param.weight[param.nr_weight-1] = atof(argv[i]); 258 | break; 259 | default: 260 | fprintf(stderr,"Unknown option: -%c\n", argv[i-1][1]); 261 | exit_with_help(); 262 | } 263 | } 264 | 265 | svm_set_print_string_function(print_func); 266 | 267 | // determine filenames 268 | 269 | if(i>=argc) 270 | exit_with_help(); 271 | 272 | strcpy(input_file_name, argv[i]); 273 | 274 | if(i line) 319 | p--; 320 | if(p > line) 321 | max_index = (int) strtol(p,&endptr,10) + 1; 322 | } 323 | if(max_index > elements) 324 | elements = max_index; 325 | ++prob.l; 326 | } 327 | 328 | rewind(fp); 329 | 330 | prob.y = Malloc(double,prob.l); 331 | prob.x = Malloc(struct svm_node,prob.l); 332 | 333 | for(i=0;ivalues = Malloc(double,elements); 337 | (prob.x+i)->dim = 0; 338 | 339 | inst_max_index = -1; // strtol gives 0 if wrong format, and precomputed kernel has start from 0 340 | readline(fp); 341 | 342 | label = strtok(line," \t"); 343 | prob.y[i] = strtod(label,&endptr); 344 | if(endptr == label) 345 | exit_input_error(i+1); 346 | 347 | while(1) 348 | { 349 | idx = strtok(NULL,":"); 350 | val = strtok(NULL," \t"); 351 | 352 | if(val == NULL) 353 | break; 354 | 355 | errno = 0; 356 | j = (int) strtol(idx,&endptr,10); 357 | if(endptr == idx || errno != 0 || *endptr != '\0' || j <= inst_max_index) 358 | exit_input_error(i+1); 359 | else 360 | inst_max_index = j; 361 | 362 | errno = 0; 363 | value = strtod(val,&endptr); 364 | if(endptr == val || errno != 0 || (*endptr != '\0' && !isspace(*endptr))) 365 | exit_input_error(i+1); 366 | 367 | d = &((prob.x+i)->dim); 368 | while (*d < j) 369 | (prob.x+i)->values[(*d)++] = 0.0; 370 | (prob.x+i)->values[(*d)++] = value; 371 | } 372 | } 373 | max_index = elements-1; 374 | 375 | #else 376 
| while(readline(fp)!=NULL) 377 | { 378 | char *p = strtok(line," \t"); // label 379 | 380 | // features 381 | while(1) 382 | { 383 | p = strtok(NULL," \t"); 384 | if(p == NULL || *p == '\n') // check '\n' as ' ' may be after the last feature 385 | break; 386 | ++elements; 387 | } 388 | ++elements; 389 | ++prob.l; 390 | } 391 | rewind(fp); 392 | 393 | prob.y = Malloc(double,prob.l); 394 | prob.x = Malloc(struct svm_node *,prob.l); 395 | x_space = Malloc(struct svm_node,elements); 396 | 397 | max_index = 0; 398 | j=0; 399 | for(i=0;i start from 0 402 | readline(fp); 403 | prob.x[i] = &x_space[j]; 404 | label = strtok(line," \t\n"); 405 | if(label == NULL) // empty line 406 | exit_input_error(i+1); 407 | 408 | prob.y[i] = strtod(label,&endptr); 409 | if(endptr == label || *endptr != '\0') 410 | exit_input_error(i+1); 411 | 412 | while(1) 413 | { 414 | idx = strtok(NULL,":"); 415 | val = strtok(NULL," \t"); 416 | 417 | if(val == NULL) 418 | break; 419 | 420 | errno = 0; 421 | x_space[j].index = (int) strtol(idx,&endptr,10); 422 | if(endptr == idx || errno != 0 || *endptr != '\0' || x_space[j].index <= inst_max_index) 423 | exit_input_error(i+1); 424 | else 425 | inst_max_index = x_space[j].index; 426 | 427 | errno = 0; 428 | x_space[j].value = strtod(val,&endptr); 429 | if(endptr == val || errno != 0 || (*endptr != '\0' && !isspace(*endptr))) 430 | exit_input_error(i+1); 431 | 432 | ++j; 433 | } 434 | 435 | if(inst_max_index > max_index) 436 | max_index = inst_max_index; 437 | x_space[j++].index = -1; 438 | } 439 | #endif 440 | 441 | if(param.gamma == 0 && max_index > 0) 442 | param.gamma = 1.0/max_index; 443 | 444 | if(param.kernel_type == PRECOMPUTED) 445 | for(i=0;idim == 0 || (prob.x+i)->values[0] == 0.0) 449 | { 450 | fprintf(stderr,"Wrong input format: first column must be 0:sample_serial_number\n"); 451 | exit(1); 452 | } 453 | if ((int)(prob.x+i)->values[0] < 0 || (int)(prob.x+i)->values[0] > max_index) 454 | { 455 | fprintf(stderr,"Wrong input format: 
sample_serial_number out of range\n"); 456 | exit(1); 457 | } 458 | #else 459 | if (prob.x[i][0].index != 0) 460 | { 461 | fprintf(stderr,"Wrong input format: first column must be 0:sample_serial_number\n"); 462 | exit(1); 463 | } 464 | if ((int)prob.x[i][0].value <= 0 || (int)prob.x[i][0].value > max_index) 465 | { 466 | fprintf(stderr,"Wrong input format: sample_serial_number out of range\n"); 467 | exit(1); 468 | } 469 | #endif 470 | } 471 | fclose(fp); 472 | } 473 | -------------------------------------------------------------------------------- /src/linux/svm.h: -------------------------------------------------------------------------------- 1 | #ifndef _LIBSVM_H 2 | #define _LIBSVM_H 3 | #define _DENSE_REP 4 | #define LIBSVM_VERSION 317 5 | 6 | #ifdef __cplusplus 7 | extern "C" { 8 | #endif 9 | 10 | extern int libsvm_version; 11 | 12 | #ifdef _DENSE_REP 13 | struct svm_node 14 | { 15 | int dim; 16 | double *values; 17 | }; 18 | 19 | struct svm_problem 20 | { 21 | int l; 22 | double *y; 23 | struct svm_node *x; 24 | }; 25 | 26 | #else 27 | struct svm_node 28 | { 29 | int index; 30 | double value; 31 | }; 32 | 33 | struct svm_problem 34 | { 35 | int l; 36 | double *y; 37 | struct svm_node **x; 38 | }; 39 | #endif 40 | 41 | enum { C_SVC, NU_SVC, ONE_CLASS, EPSILON_SVR, NU_SVR }; /* svm_type */ 42 | enum { LINEAR, POLY, RBF, SIGMOID, PRECOMPUTED }; /* kernel_type */ 43 | 44 | struct svm_parameter 45 | { 46 | int svm_type; 47 | int kernel_type; 48 | int degree; /* for poly */ 49 | double gamma; /* for poly/rbf/sigmoid */ 50 | double coef0; /* for poly/sigmoid */ 51 | 52 | /* these are for training only */ 53 | double cache_size; /* in MB */ 54 | double eps; /* stopping criteria */ 55 | double C; /* for C_SVC, EPSILON_SVR and NU_SVR */ 56 | int nr_weight; /* for C_SVC */ 57 | int *weight_label; /* for C_SVC */ 58 | double* weight; /* for C_SVC */ 59 | double nu; /* for NU_SVC, ONE_CLASS, and NU_SVR */ 60 | double p; /* for EPSILON_SVR */ 61 | int shrinking; /* use 
the shrinking heuristics */ 62 | int probability; /* do probability estimates */ 63 | }; 64 | 65 | // 66 | // svm_model 67 | // 68 | struct svm_model 69 | { 70 | struct svm_parameter param; /* parameter */ 71 | int nr_class; /* number of classes, = 2 in regression/one class svm */ 72 | int l; /* total #SV */ 73 | #ifdef _DENSE_REP 74 | struct svm_node *SV; /* SVs (SV[l]) */ 75 | #else 76 | struct svm_node **SV; /* SVs (SV[l]) */ 77 | #endif 78 | double **sv_coef; /* coefficients for SVs in decision functions (sv_coef[k-1][l]) */ 79 | double *rho; /* constants in decision functions (rho[k*(k-1)/2]) */ 80 | double *probA; /* pariwise probability information */ 81 | double *probB; 82 | int *sv_indices; /* sv_indices[0,...,nSV-1] are values in [1,...,num_traning_data] to indicate SVs in the training set */ 83 | 84 | /* for classification only */ 85 | 86 | int *label; /* label of each class (label[k]) */ 87 | int *nSV; /* number of SVs for each class (nSV[k]) */ 88 | /* nSV[0] + nSV[1] + ... 
+ nSV[k-1] = l */ 89 | /* XXX */ 90 | int free_sv; /* 1 if svm_model is created by svm_load_model*/ 91 | /* 0 if svm_model is created by svm_train */ 92 | }; 93 | 94 | struct svm_model *svm_train(const struct svm_problem *prob, const struct svm_parameter *param); 95 | void svm_cross_validation(const struct svm_problem *prob, const struct svm_parameter *param, int nr_fold, double *target); 96 | 97 | int svm_save_model(const char *model_file_name, const struct svm_model *model); 98 | struct svm_model *svm_load_model(const char *model_file_name); 99 | 100 | int svm_get_svm_type(const struct svm_model *model); 101 | int svm_get_nr_class(const struct svm_model *model); 102 | void svm_get_labels(const struct svm_model *model, int *label); 103 | void svm_get_sv_indices(const struct svm_model *model, int *sv_indices); 104 | int svm_get_nr_sv(const struct svm_model *model); 105 | double svm_get_svr_probability(const struct svm_model *model); 106 | 107 | double svm_predict_values(const struct svm_model *model, const struct svm_node *x, double* dec_values); 108 | double svm_predict(const struct svm_model *model, const struct svm_node *x); 109 | double svm_predict_probability(const struct svm_model *model, const struct svm_node *x, double* prob_estimates); 110 | 111 | void svm_free_model_content(struct svm_model *model_ptr); 112 | void svm_free_and_destroy_model(struct svm_model **model_ptr_ptr); 113 | void svm_destroy_param(struct svm_parameter *param); 114 | 115 | const char *svm_check_parameter(const struct svm_problem *prob, const struct svm_parameter *param); 116 | int svm_check_probability_model(const struct svm_model *model); 117 | 118 | void svm_set_print_string_function(void (*print_func)(const char *)); 119 | 120 | #ifdef __cplusplus 121 | } 122 | #endif 123 | 124 | #endif /* _LIBSVM_H */ 125 | -------------------------------------------------------------------------------- /src/windows/README-GPU: 
-------------------------------------------------------------------------------- 1 | GPU-Accelerated LIBSVM exploits the GPU, through the CUDA interface, to 2 | speed up the training process. This package contains a new executable for 3 | training classifiers, "svm-train-gpu.exe", together with the original one. 4 | The new executable is used exactly like the original one. 5 | 6 | This executable was built with CUBLAS API version 2, which is compatible with SDK versions 4.0 and up. 7 | 8 | FEATURES 9 | 10 | Mode Supported 11 | 12 | * c-svc classification with RBF kernel 13 | 14 | Functionality / User interface 15 | 16 | * Same as LIBSVM 17 | 18 | 19 | PREREQUISITES 20 | 21 | * NVIDIA graphics card with CUDA support 22 | * Latest NVIDIA drivers for the GPU 23 | * CUDA toolkit & GPU Computing SDK 5.5 24 | 25 | Download all in one package from: 26 | https://developer.nvidia.com/cuda-downloads 27 | 28 | 29 | INSTRUCTIONS 30 | 31 | 1. Install the NVIDIA drivers, CUDA toolkit and GPU Computing SDK code samples. You can find them all in one package here: 32 | 33 | https://developer.nvidia.com/cuda-downloads (Version 5.5) 34 | 35 | 2. Open the Visual Studio 2010 project file located inside this folder and build. 36 | 37 | 38 | Additional Information 39 | ====================== 40 | 41 | If you find GPU-Accelerated LIBSVM helpful, please cite it as 42 | 43 | A. Athanasopoulos, A. Dimou, V. Mezaris, I. Kompatsiaris, "GPU Acceleration for Support Vector Machines", 44 | Proc. 12th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS 2011), Delft, The Netherlands, April 2011.
45 | 46 | Software available at http://mklab.iti.gr/project/GPU-LIBSVM 47 | -------------------------------------------------------------------------------- /src/windows/libsvm_train_dense_gpu.ncb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/src/windows/libsvm_train_dense_gpu.ncb -------------------------------------------------------------------------------- /src/windows/libsvm_train_dense_gpu.sdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/src/windows/libsvm_train_dense_gpu.sdf -------------------------------------------------------------------------------- /src/windows/libsvm_train_dense_gpu.sln: -------------------------------------------------------------------------------- 1 |  2 | Microsoft Visual Studio Solution File, Format Version 11.00 3 | # Visual Studio 2010 4 | Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "libsvm_train_dense_gpu", "libsvm_train_dense_gpu\libsvm_train_dense_gpu.vcxproj", "{80730853-7F34-44C7-BBF7-496977A4C14F}" 5 | EndProject 6 | Global 7 | GlobalSection(SolutionConfigurationPlatforms) = preSolution 8 | Debug|Win32 = Debug|Win32 9 | Debug|x64 = Debug|x64 10 | Release|Win32 = Release|Win32 11 | Release|x64 = Release|x64 12 | EndGlobalSection 13 | GlobalSection(ProjectConfigurationPlatforms) = postSolution 14 | {80730853-7F34-44C7-BBF7-496977A4C14F}.Debug|Win32.ActiveCfg = Debug|Win32 15 | {80730853-7F34-44C7-BBF7-496977A4C14F}.Debug|Win32.Build.0 = Debug|Win32 16 | {80730853-7F34-44C7-BBF7-496977A4C14F}.Debug|x64.ActiveCfg = Debug|x64 17 | {80730853-7F34-44C7-BBF7-496977A4C14F}.Debug|x64.Build.0 = Debug|x64 18 | {80730853-7F34-44C7-BBF7-496977A4C14F}.Release|Win32.ActiveCfg = Release|Win32 19 | {80730853-7F34-44C7-BBF7-496977A4C14F}.Release|Win32.Build.0 = Release|Win32 20 | 
{80730853-7F34-44C7-BBF7-496977A4C14F}.Release|x64.ActiveCfg = Release|x64 21 | {80730853-7F34-44C7-BBF7-496977A4C14F}.Release|x64.Build.0 = Release|x64 22 | EndGlobalSection 23 | GlobalSection(SolutionProperties) = preSolution 24 | HideSolutionNode = FALSE 25 | EndGlobalSection 26 | EndGlobal 27 | -------------------------------------------------------------------------------- /src/windows/libsvm_train_dense_gpu.suo: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MKLab-ITI/CUDA/e567d31391d51742e1d6fe6cadffa4f9557bd518/src/windows/libsvm_train_dense_gpu.suo -------------------------------------------------------------------------------- /src/windows/libsvm_train_dense_gpu/cross_validation_with_matrix_precomputation.c: -------------------------------------------------------------------------------- 1 | void setup_pkm(struct svm_problem *p_km) 2 | { 3 | 4 | int i; 5 | 6 | p_km->l = prob.l; 7 | p_km->x = Malloc(struct svm_node,p_km->l); 8 | p_km->y = Malloc(double,p_km->l); 9 | 10 | for(i=0;ix+i)->values = Malloc(double,prob.l+1); 13 | (p_km->x+i)->dim = prob.l+1; 14 | } 15 | 16 | for( i=0; iy[i] = prob.y[i]; 17 | } 18 | 19 | void free_pkm(struct svm_problem *p_km) 20 | { 21 | 22 | int i; 23 | 24 | for(i=0;ix+i)->values); 26 | 27 | free( p_km->x ); 28 | free( p_km->y ); 29 | 30 | } 31 | 32 | 33 | double do_crossvalidation(struct svm_problem * p_km) 34 | { 35 | double rate; 36 | 37 | int i; 38 | int total_correct = 0; 39 | double total_error = 0; 40 | double sumv = 0, sumy = 0, sumvv = 0, sumyy = 0, sumvy = 0; 41 | double *target = Malloc(double,prob.l); 42 | 43 | svm_cross_validation(p_km,¶m,nr_fold,target); 44 | 45 | 46 | if(param.svm_type == EPSILON_SVR || 47 | param.svm_type == NU_SVR) 48 | { 49 | for(i=0;i 2 | #include "cublas_v2.h" 3 | 4 | // Scalars 5 | const float alpha = 1; 6 | const float beta = 0; 7 | 8 | void ckm( struct svm_problem *prob, struct svm_problem *pecm, float *gamma ) 9 | { 10 | 
cublasStatus_t status; 11 | 12 | double g_val = *gamma; 13 | 14 | long int nfa; 15 | 16 | int len_tv; 17 | int ntv; 18 | int i_v; 19 | int i_el; 20 | int i_r, i_c; 21 | int trvei; 22 | 23 | double *tv_sq; 24 | double *v_f_g; 25 | 26 | float *tr_ar; 27 | float *tva, *vtm, *DP; 28 | float *g_tva = 0, *g_vtm = 0, *g_DotProd = 0; 29 | 30 | cudaError_t cudaStat; 31 | cublasHandle_t handle; 32 | 33 | status = cublasCreate(&handle); 34 | 35 | len_tv = prob-> x[0].dim; 36 | ntv = prob-> l; 37 | 38 | nfa = len_tv * ntv; 39 | 40 | tva = (float*) malloc ( len_tv * ntv* sizeof(float) ); 41 | vtm = (float*) malloc ( len_tv * sizeof(float) ); 42 | DP = (float*) malloc ( ntv * sizeof(float) ); 43 | 44 | tr_ar = (float*) malloc ( len_tv * ntv* sizeof(float) ); 45 | 46 | tv_sq = (double*) malloc ( ntv * sizeof(double) ); 47 | 48 | v_f_g = (double*) malloc ( ntv * sizeof(double) ); 49 | 50 | for ( i_r = 0; i_r < ntv ; i_r++ ) 51 | { 52 | for ( i_c = 0; i_c < len_tv; i_c++ ) 53 | tva[i_r * len_tv + i_c] = (float)prob-> x[i_r].values[i_c]; 54 | } 55 | 56 | cudaStat = cudaMalloc((void**)&g_tva, len_tv * ntv * sizeof(float)); 57 | 58 | if (cudaStat != cudaSuccess) { 59 | free( tva ); 60 | free( vtm ); 61 | free( DP ); 62 | 63 | free( v_f_g ); 64 | free( tv_sq ); 65 | 66 | cudaFree( g_tva ); 67 | cublasDestroy( handle ); 68 | 69 | fprintf (stderr, "!!!! 
Device memory allocation error (A)\n"); 70 | getchar(); 71 | return; 72 | } 73 | 74 | cudaStat = cudaMalloc((void**)&g_vtm, len_tv * sizeof(float)); 75 | 76 | cudaStat = cudaMalloc((void**)&g_DotProd, ntv * sizeof(float)); 77 | 78 | for( i_r = 0; i_r < ntv; i_r++ ) 79 | for( i_c = 0; i_c < len_tv; i_c++ ) 80 | tr_ar[i_c * ntv + i_r] = tva[i_r * len_tv + i_c]; 81 | 82 | // Copy cpu vector to gpu vector 83 | status = cublasSetVector( len_tv * ntv, sizeof(float), tr_ar, 1, g_tva, 1 ); 84 | 85 | free( tr_ar ); 86 | 87 | for( i_v = 0; i_v < ntv; i_v++ ) 88 | { 89 | tv_sq[ i_v ] = 0; 90 | for( i_el = 0; i_el < len_tv; i_el++ ) 91 | tv_sq[i_v] += pow( tva[i_v*len_tv + i_el], (float)2.0 ); 92 | } 93 | 94 | 95 | 96 | for ( trvei = 0; trvei < ntv; trvei++ ) 97 | { 98 | status = cublasSetVector( len_tv, sizeof(float), &tva[trvei * len_tv], 1, g_vtm, 1 ); 99 | 100 | status = cublasSgemv( handle, CUBLAS_OP_N, ntv, len_tv, &alpha, g_tva, ntv , g_vtm, 1, &beta, g_DotProd, 1 ); 101 | 102 | status = cublasGetVector( ntv, sizeof(float), g_DotProd, 1, DP, 1 ); 103 | 104 | for ( i_c = 0; i_c < ntv; i_c++ ) 105 | v_f_g[i_c] = exp( -g_val * (tv_sq[trvei] + tv_sq[i_c]-((double)2.0)* (double)DP[i_c] )); 106 | 107 | 108 | pecm-> x[trvei].values[0] = trvei + 1; 109 | 110 | for ( i_c = 0; i_c < ntv; i_c++ ) 111 | pecm-> x[trvei].values[i_c + 1] = v_f_g[i_c]; 112 | 113 | 114 | } 115 | 116 | free( tva ); 117 | free( vtm ); 118 | free( DP ); 119 | free( v_f_g ); 120 | free( tv_sq ); 121 | 122 | cudaFree( g_tva ); 123 | cudaFree( g_vtm ); 124 | cudaFree( g_DotProd ); 125 | 126 | cublasDestroy( handle ); 127 | } 128 | 129 | void cal_km( struct svm_problem * p_km) 130 | { 131 | float gamma = param.gamma; 132 | 133 | ckm(&prob, p_km, &gamma); 134 | } -------------------------------------------------------------------------------- /src/windows/libsvm_train_dense_gpu/libsvm_train_dense_gpu.vcxproj: -------------------------------------------------------------------------------- 1 |  2 | 3 | 4 | 5 | 
Debug 6 | Win32 7 | 8 | 9 | Debug 10 | x64 11 | 12 | 13 | Release 14 | Win32 15 | 16 | 17 | Release 18 | x64 19 | 20 | 21 | 22 | {80730853-7F34-44C7-BBF7-496977A4C14F} 23 | libsvm_289_dense 24 | Win32Proj 25 | 26 | 27 | 28 | Application 29 | Unicode 30 | true 31 | 32 | 33 | Application 34 | Unicode 35 | 36 | 37 | Application 38 | Unicode 39 | true 40 | 41 | 42 | Application 43 | Unicode 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | <_ProjectFileVersion>10.0.40219.1 64 | $(SolutionDir)$(Configuration)\ 65 | $(Configuration)\ 66 | true 67 | $(SolutionDir)$(Platform)\$(Configuration)\ 68 | $(Platform)\$(Configuration)\ 69 | true 70 | $(SolutionDir)$(Configuration)\ 71 | $(Configuration)\ 72 | false 73 | $(SolutionDir)$(Platform)\$(Configuration)\ 74 | $(Platform)\$(Configuration)\ 75 | false 76 | $(CUDA_WIN32_LIBS);$(LibraryPath) 77 | $(CUDA_INC_PATH);C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.5\common\inc;$(IncludePath) 78 | 79 | 80 | 81 | Disabled 82 | $(CUDA_INC_PATH);%(AdditionalIncludeDirectories) 83 | WIN32;_DEBUG;_CONSOLE;_CONSOLE;_CRT_SECURE_NO_DEPRECATE;_DENSE_REP;%(PreprocessorDefinitions) 84 | true 85 | EnableFastChecks 86 | MultiThreadedDebugDLL 87 | 88 | 89 | Level3 90 | EditAndContinue 91 | 92 | 93 | cublas.lib;cudart.lib;%(AdditionalDependencies) 94 | $(CUDA_WIN32_LIBS);%(AdditionalLibraryDirectories) 95 | true 96 | Console 97 | false 98 | 99 | 100 | MachineX86 101 | 102 | 103 | C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.5\common\inc;%(Include) 104 | 105 | 106 | 107 | 108 | X64 109 | 110 | 111 | Disabled 112 | $(CUDA_INC_PATH);C:/Documents and Settings/All Users/Application Data/NVIDIA Corporation/NVIDIA GPU Computing SDK/C/common/inc;%(AdditionalIncludeDirectories) 113 | WIN32;_DEBUG;_CONSOLE;_CONSOLE;_CRT_SECURE_NO_DEPRECATE;_DENSE_REP;%(PreprocessorDefinitions) 114 | true 115 | EnableFastChecks 116 | MultiThreadedDebugDLL 117 | 118 | 119 | Level3 120 | ProgramDatabase 121 | 122 | 123 
| cublas.lib;cudart.lib;%(AdditionalDependencies) 124 | $(CUDA_LIB_PATH);%(AdditionalLibraryDirectories) 125 | true 126 | Console 127 | false 128 | 129 | 130 | MachineX64 131 | 132 | 133 | 134 | 135 | MaxSpeed 136 | Default 137 | false 138 | Neither 139 | true 140 | $(CUDA_INC_PATH);%(AdditionalIncludeDirectories) 141 | WIN32;NDEBUG;_CONSOLE;_CRT_SECURE_NO_DEPRECATE;_DENSE_REP;%(PreprocessorDefinitions) 142 | Sync 143 | MultiThreadedDLL 144 | true 145 | NotSet 146 | Precise 147 | 148 | 149 | AssemblyAndMachineCode 150 | Level3 151 | ProgramDatabase 152 | 153 | 154 | false 155 | cublas.lib;cudart.lib;cuda.lib;%(AdditionalDependencies) 156 | $(CUDA_LIB_PATH);%(AdditionalLibraryDirectories) 157 | true 158 | Console 159 | true 160 | true 161 | false 162 | 163 | 164 | MachineX86 165 | 166 | 167 | 168 | 169 | X64 170 | 171 | 172 | /Oa %(AdditionalOptions) 173 | MaxSpeed 174 | Default 175 | false 176 | Neither 177 | true 178 | $(CUDA_INC_PATH);C:/Documents and Settings/All Users/Application Data/NVIDIA Corporation/NVIDIA GPU Computing SDK/C/common/inc;%(AdditionalIncludeDirectories) 179 | WIN32;NDEBUG;_CONSOLE;_CRT_SECURE_NO_DEPRECATE;_DENSE_REP;%(PreprocessorDefinitions) 180 | Sync 181 | MultiThreadedDLL 182 | true 183 | NotSet 184 | Precise 185 | 186 | 187 | AssemblyAndMachineCode 188 | Level3 189 | ProgramDatabase 190 | 191 | 192 | cublas.lib;cudart.lib;cuda.lib;%(AdditionalDependencies) 193 | $(CUDA_X64_LIBS);%(AdditionalLibraryDirectories) 194 | true 195 | Console 196 | true 197 | true 198 | false 199 | 200 | 201 | MachineX64 202 | 203 | 204 | 205 | 206 | true 207 | true 208 | true 209 | false 210 | true 211 | 212 | 213 | true 214 | true 215 | true 216 | true 217 | 218 | 219 | 220 | 221 | 222 | 223 | 224 | 225 | 226 | 227 | 228 | 229 | -------------------------------------------------------------------------------- /src/windows/libsvm_train_dense_gpu/libsvm_train_dense_gpu.vcxproj.filters: 
-------------------------------------------------------------------------------- 1 |  2 | 3 | 4 | 5 | {4FC737F1-C7A5-4376-A066-2A32D752A2FF} 6 | cpp;c;cc;cxx;def;odl;idl;hpj;bat;asm;asmx 7 | 8 | 9 | {93995380-89BD-4b04-88EB-625FBE52EBFB} 10 | h;hpp;hxx;hm;inl;inc;xsd 11 | 12 | 13 | {67DA6AB6-F800-4c08-8B7A-83BB121AAD01} 14 | rc;ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe;resx;tiff;tif;png;wav 15 | 16 | 17 | 18 | 19 | Source Files 20 | 21 | 22 | Source Files 23 | 24 | 25 | Source Files 26 | 27 | 28 | Source Files 29 | 30 | 31 | 32 | 33 | Source Files 34 | 35 | 36 | Header Files 37 | 38 | 39 | -------------------------------------------------------------------------------- /src/windows/libsvm_train_dense_gpu/libsvm_train_dense_gpu.vcxproj.user: -------------------------------------------------------------------------------- 1 |  2 | 3 | -------------------------------------------------------------------------------- /src/windows/libsvm_train_dense_gpu/svm-train.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <stdlib.h> 3 | #include <string.h> 4 | #include <ctype.h> 5 | #include <errno.h> 6 | #include "svm.h" 7 | #define Malloc(type,n) (type *)malloc((n)*sizeof(type)) 8 | 9 | void print_null(const char *s) {} 10 | 11 | void exit_with_help() 12 | { 13 | printf( 14 | "Usage: svm-train [options] training_set_file [model_file]\n" 15 | "options:\n" 16 | "-s svm_type : set type of SVM (default 0)\n" 17 | " 0 -- C-SVC (multi-class classification)\n" 18 | " 1 -- nu-SVC (multi-class classification)\n" 19 | " 2 -- one-class SVM\n" 20 | " 3 -- epsilon-SVR (regression)\n" 21 | " 4 -- nu-SVR (regression)\n" 22 | "-t kernel_type : set type of kernel function (default 2)\n" 23 | " 0 -- linear: u'*v\n" 24 | " 1 -- polynomial: (gamma*u'*v + coef0)^degree\n" 25 | " 2 -- radial basis function: exp(-gamma*|u-v|^2)\n" 26 | " 3 -- sigmoid: tanh(gamma*u'*v + coef0)\n" 27 | " 4 -- precomputed kernel (kernel values in training_set_file)\n" 28 | "-d degree : set degree
in kernel function (default 3)\n" 29 | "-g gamma : set gamma in kernel function (default 1/num_features)\n" 30 | "-r coef0 : set coef0 in kernel function (default 0)\n" 31 | "-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)\n" 32 | "-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)\n" 33 | "-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)\n" 34 | "-m cachesize : set cache memory size in MB (default 100)\n" 35 | "-e epsilon : set tolerance of termination criterion (default 0.001)\n" 36 | "-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)\n" 37 | "-b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)\n" 38 | "-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)\n" 39 | "-v n: n-fold cross validation mode\n" 40 | "-q : quiet mode (no outputs)\n" 41 | ); 42 | exit(1); 43 | } 44 | 45 | void exit_input_error(int line_num) 46 | { 47 | fprintf(stderr,"Wrong input format at line %d\n", line_num); 48 | exit(1); 49 | } 50 | 51 | void parse_command_line(int argc, char **argv, char *input_file_name, char *model_file_name); 52 | void read_problem(const char *filename); 53 | void do_cross_validation(); 54 | 55 | struct svm_parameter param; // set by parse_command_line 56 | struct svm_problem prob; // set by read_problem 57 | struct svm_model *model; 58 | struct svm_node *x_space; 59 | int cross_validation; 60 | int nr_fold; 61 | 62 | static char *line = NULL; 63 | static int max_line_len; 64 | 65 | #include "kernel_matrix_calculation.c" 66 | #include "cross_validation_with_matrix_precomputation.c" 67 | 68 | static char* readline(FILE *input) 69 | { 70 | int len; 71 | 72 | if(fgets(line,max_line_len,input) == NULL) 73 | return NULL; 74 | 75 | while(strrchr(line,'\n') == NULL) 76 | { 77 | max_line_len *= 2; 78 | line = (char *) realloc(line,max_line_len); 79 | len = (int) strlen(line); 80 | 
		if(fgets(line+len,max_line_len-len,input) == NULL)
			break;
	}
	return line;
}

int main(int argc, char **argv)
{
	int i;
	char input_file_name[1024];
	char model_file_name[1024];
	const char *error_msg;

	parse_command_line(argc, argv, input_file_name, model_file_name);
	read_problem(input_file_name);
	error_msg = svm_check_parameter(&prob,&param);
	if(error_msg)
	{
		fprintf(stderr,"ERROR: %s\n",error_msg);
		exit(1);
	}

	if(cross_validation)
	{
		do_cross_validation_with_KM_precalculated( );

		// do_cross_validation();
	}
	else
	{
		model = svm_train(&prob,&param);
		if(svm_save_model(model_file_name,model))
		{
			fprintf(stderr, "can't save model to file %s\n", model_file_name);
			exit(1);
		}
		svm_free_and_destroy_model(&model);
	}
	svm_destroy_param(&param);
	free(prob.y);

#ifdef _DENSE_REP
	for (i = 0; i < prob.l; ++i)
		free((prob.x+i)->values);
#else
	free(x_space);
#endif
	free(prob.x);
	free(line);

	return 0;
}

void do_cross_validation()
{
	int i;
	int total_correct = 0;
	double total_error = 0;
	double sumv = 0, sumy = 0, sumvv = 0, sumyy = 0, sumvy = 0;
	double *target = Malloc(double,prob.l);

	svm_cross_validation(&prob,&param,nr_fold,target);
	if(param.svm_type == EPSILON_SVR ||
	   param.svm_type == NU_SVR)
	{
		for(i=0;i<prob.l;i++)
		{
			double y = prob.y[i];
			double v = target[i];
			total_error += (v-y)*(v-y);
			sumv += v;
			sumy += y;
			sumvv += v*v;
			sumyy += y*y;
			sumvy += v*y;
		}
		printf("Cross Validation Mean squared error = %g\n",total_error/prob.l);
		printf("Cross Validation Squared correlation coefficient = %g\n",
			((prob.l*sumvy-sumv*sumy)*(prob.l*sumvy-sumv*sumy))/
			((prob.l*sumvv-sumv*sumv)*(prob.l*sumyy-sumy*sumy))
			);
	}
	else
	{
		for(i=0;i<prob.l;i++)
			if(target[i] == prob.y[i])
				++total_correct;
		printf("Cross Validation Accuracy = %g%%\n",100.0*total_correct/prob.l);
	}
	free(target);
}

void parse_command_line(int argc, char **argv, char *input_file_name, char *model_file_name)
{
	int i;
	void (*print_func)(const char*) = NULL;	// default printing to stdout

	// default values
	param.svm_type = C_SVC;
	param.kernel_type = RBF;
	param.degree = 3;
	param.gamma = 0;	// 1/num_features
	param.coef0 = 0;
	param.nu = 0.5;
	param.cache_size = 100;
	param.C = 1;
	param.eps = 1e-3;
	param.p = 0.1;
	param.shrinking = 1;
	param.probability = 0;
	param.nr_weight = 0;
	param.weight_label = NULL;
	param.weight = NULL;
	cross_validation = 0;

	// parse options
	for(i=1;i<argc;i++)
	{
		if(argv[i][0] != '-') break;
		if(++i>=argc)
			exit_with_help();
		switch(argv[i-1][1])
		{
			case 's':
				param.svm_type = atoi(argv[i]);
				break;
			case 't':
				param.kernel_type = atoi(argv[i]);
				break;
			case 'd':
				param.degree = atoi(argv[i]);
				break;
			case 'g':
				param.gamma = atof(argv[i]);
				break;
			case 'r':
				param.coef0 = atof(argv[i]);
				break;
			case 'n':
				param.nu = atof(argv[i]);
				break;
			case 'm':
				param.cache_size = atof(argv[i]);
				break;
			case 'c':
				param.C = atof(argv[i]);
				break;
			case 'e':
				param.eps = atof(argv[i]);
				break;
			case 'p':
				param.p = atof(argv[i]);
				break;
			case 'h':
				param.shrinking = atoi(argv[i]);
				break;
			case 'b':
				param.probability = atoi(argv[i]);
				break;
			case 'q':
				print_func = &print_null;
				i--;
				break;
			case 'v':
				cross_validation = 1;
				nr_fold = atoi(argv[i]);
				if(nr_fold < 2)
				{
					fprintf(stderr,"n-fold cross validation: n must >= 2\n");
					exit_with_help();
				}
				break;
			case 'w':
				++param.nr_weight;
				param.weight_label = (int *)realloc(param.weight_label,sizeof(int)*param.nr_weight);
				param.weight = (double *)realloc(param.weight,sizeof(double)*param.nr_weight);
				param.weight_label[param.nr_weight-1] = atoi(&argv[i-1][2]);
				param.weight[param.nr_weight-1] = atof(argv[i]);
				break;
			default:
				fprintf(stderr,"Unknown option: -%c\n", argv[i-1][1]);
				exit_with_help();
		}
	}

	svm_set_print_string_function(print_func);

	// determine filenames

	if(i>=argc)
		exit_with_help();

	strcpy(input_file_name, argv[i]);

	if(i<argc-1)
		strcpy(model_file_name,argv[i+1]);
	else
	{
		char *p = strrchr(argv[i],'/');
		if(p==NULL)
			p = argv[i];
		else
			++p;
		sprintf(model_file_name,"%s.model",p);
	}
}

// read in a problem (in svmlight format)

void read_problem(const char *filename)
{
	int max_index, inst_max_index, i, j;
	size_t elements;
	FILE *fp = fopen(filename,"r");
	char *endptr;
	char *idx, *val, *label;

	if(fp == NULL)
	{
		fprintf(stderr,"can't open input file %s\n",filename);
		exit(1);
	}

	prob.l = 0;
	elements = 0;

	max_line_len = 1024;
	line = Malloc(char,max_line_len);

#ifdef _DENSE_REP
	max_index = 0;

	while(readline(fp) != NULL)
	{
		char *p;

		p = strrchr(line,':');
		if(p != NULL)
		{
			while(*p != ' ' && *p != '\t' && p > line)
				p--;
			if(p > line)
				max_index = (int) strtol(p,&endptr,10) + 1;
		}
		if(max_index > elements)
			elements = max_index;
		++prob.l;
	}

	rewind(fp);

	prob.y = Malloc(double,prob.l);
	prob.x = Malloc(struct svm_node,prob.l);

	for(i=0;i<prob.l;i++)
	{
		int *d; double value;
		(prob.x+i)->values = Malloc(double,elements);
		(prob.x+i)->dim = 0;

		inst_max_index = -1; // strtol gives 0 if wrong format, and precomputed kernel has <index> start from 0
		readline(fp);

		label = strtok(line," \t");
		prob.y[i] = strtod(label,&endptr);
		if(endptr == label)
			exit_input_error(i+1);

		while(1)
		{
			idx = strtok(NULL,":");
			val = strtok(NULL," \t");

			if(val == NULL)
				break;

			errno = 0;
			j = (int) strtol(idx,&endptr,10);
			if(endptr == idx || errno != 0 || *endptr != '\0' || j <= inst_max_index)
				exit_input_error(i+1);
			else
				inst_max_index = j;

			errno = 0;
			value = strtod(val,&endptr);
			if(endptr == val || errno != 0 || (*endptr != '\0' && !isspace(*endptr)))
				exit_input_error(i+1);

			d = &((prob.x+i)->dim);
			while (*d < j)
				(prob.x+i)->values[(*d)++] = 0.0;
			(prob.x+i)->values[(*d)++] = value;
		}
	}
	max_index = elements-1;

#else
	while(readline(fp)!=NULL)
	{
		char *p = strtok(line," \t"); // label

		// features
		while(1)
		{
			p = strtok(NULL," \t");
			if(p == NULL || *p == '\n') // check '\n' as ' ' may be after the last feature
				break;
			++elements;
		}
		++elements;
		++prob.l;
	}
	rewind(fp);

	prob.y = Malloc(double,prob.l);
	prob.x = Malloc(struct svm_node *,prob.l);
	x_space = Malloc(struct svm_node,elements);

	max_index = 0;
	j=0;
	for(i=0;i<prob.l;i++)
	{
		inst_max_index = -1; // strtol gives 0 if wrong format, and precomputed kernel has <index> start from 0
		readline(fp);
		prob.x[i] = &x_space[j];
		label = strtok(line," \t\n");
		if(label == NULL) // empty line
			exit_input_error(i+1);

		prob.y[i] = strtod(label,&endptr);
		if(endptr == label || *endptr != '\0')
			exit_input_error(i+1);

		while(1)
		{
			idx = strtok(NULL,":");
			val = strtok(NULL," \t");

			if(val == NULL)
				break;

			errno = 0;
			x_space[j].index = (int) strtol(idx,&endptr,10);
			if(endptr == idx || errno != 0 || *endptr != '\0' || x_space[j].index <= inst_max_index)
				exit_input_error(i+1);
			else
				inst_max_index = x_space[j].index;

			errno = 0;
			x_space[j].value = strtod(val,&endptr);
			if(endptr == val || errno != 0 || (*endptr != '\0' && !isspace(*endptr)))
				exit_input_error(i+1);

			++j;
		}

		if(inst_max_index > max_index)
			max_index = inst_max_index;
		x_space[j++].index = -1;
	}
#endif

	if(param.gamma == 0 && max_index > 0)
		param.gamma = 1.0/max_index;

	if(param.kernel_type == PRECOMPUTED)
		for(i=0;i<prob.l;i++)
		{
#ifdef _DENSE_REP
			if ((prob.x+i)->dim == 0 || (prob.x+i)->values[0] == 0.0)
			{
				fprintf(stderr,"Wrong input format: first column must be 0:sample_serial_number\n");
				exit(1);
			}
			if ((int)(prob.x+i)->values[0] < 0 || (int)(prob.x+i)->values[0] > max_index)
			{
				fprintf(stderr,"Wrong input format: sample_serial_number out of range\n");
				exit(1);
			}
#else
			if (prob.x[i][0].index != 0)
			{
				fprintf(stderr,"Wrong input format: first column must be 0:sample_serial_number\n");
				exit(1);
			}
			if ((int)prob.x[i][0].value <= 0 || (int)prob.x[i][0].value > max_index)
			{
				fprintf(stderr,"Wrong input format: sample_serial_number out of range\n");
				exit(1);
			}
#endif
		}
	fclose(fp);
}
--------------------------------------------------------------------------------
/src/windows/libsvm_train_dense_gpu/svm.h:
--------------------------------------------------------------------------------
#ifndef _LIBSVM_H
#define _LIBSVM_H
#define _DENSE_REP
#define LIBSVM_VERSION 317

#ifdef __cplusplus
extern "C" {
#endif

extern int libsvm_version;

#ifdef _DENSE_REP
struct svm_node
{
	int dim;
	double *values;
};

struct svm_problem
{
	int l;
	double *y;
	struct svm_node *x;
};

#else
struct svm_node
{
	int index;
	double value;
};

struct svm_problem
{
	int l;
	double *y;
	struct svm_node **x;
};
#endif

enum { C_SVC, NU_SVC,
       ONE_CLASS, EPSILON_SVR, NU_SVR };	/* svm_type */
enum { LINEAR, POLY, RBF, SIGMOID, PRECOMPUTED }; /* kernel_type */

struct svm_parameter
{
	int svm_type;
	int kernel_type;
	int degree;	/* for poly */
	double gamma;	/* for poly/rbf/sigmoid */
	double coef0;	/* for poly/sigmoid */

	/* these are for training only */
	double cache_size;	/* in MB */
	double eps;		/* stopping criteria */
	double C;		/* for C_SVC, EPSILON_SVR and NU_SVR */
	int nr_weight;		/* for C_SVC */
	int *weight_label;	/* for C_SVC */
	double* weight;		/* for C_SVC */
	double nu;		/* for NU_SVC, ONE_CLASS, and NU_SVR */
	double p;		/* for EPSILON_SVR */
	int shrinking;		/* use the shrinking heuristics */
	int probability;	/* do probability estimates */
};

//
// svm_model
//
struct svm_model
{
	struct svm_parameter param;	/* parameter */
	int nr_class;		/* number of classes, = 2 in regression/one class svm */
	int l;			/* total #SV */
#ifdef _DENSE_REP
	struct svm_node *SV;	/* SVs (SV[l]) */
#else
	struct svm_node **SV;	/* SVs (SV[l]) */
#endif
	double **sv_coef;	/* coefficients for SVs in decision functions (sv_coef[k-1][l]) */
	double *rho;		/* constants in decision functions (rho[k*(k-1)/2]) */
	double *probA;		/* pairwise probability information */
	double *probB;
	int *sv_indices;	/* sv_indices[0,...,nSV-1] are values in [1,...,num_training_data] to indicate SVs in the training set */

	/* for classification only */

	int *label;		/* label of each class (label[k]) */
	int *nSV;		/* number of SVs for each class (nSV[k]) */
				/* nSV[0] + nSV[1] + ...
				   + nSV[k-1] = l */
	/* XXX */
	int free_sv;		/* 1 if svm_model is created by svm_load_model */
				/* 0 if svm_model is created by svm_train */
};

struct svm_model *svm_train(const struct svm_problem *prob, const struct svm_parameter *param);
void svm_cross_validation(const struct svm_problem *prob, const struct svm_parameter *param, int nr_fold, double *target);

int svm_save_model(const char *model_file_name, const struct svm_model *model);
struct svm_model *svm_load_model(const char *model_file_name);

int svm_get_svm_type(const struct svm_model *model);
int svm_get_nr_class(const struct svm_model *model);
void svm_get_labels(const struct svm_model *model, int *label);
void svm_get_sv_indices(const struct svm_model *model, int *sv_indices);
int svm_get_nr_sv(const struct svm_model *model);
double svm_get_svr_probability(const struct svm_model *model);

double svm_predict_values(const struct svm_model *model, const struct svm_node *x, double* dec_values);
double svm_predict(const struct svm_model *model, const struct svm_node *x);
double svm_predict_probability(const struct svm_model *model, const struct svm_node *x, double* prob_estimates);

void svm_free_model_content(struct svm_model *model_ptr);
void svm_free_and_destroy_model(struct svm_model **model_ptr_ptr);
void svm_destroy_param(struct svm_parameter *param);

const char *svm_check_parameter(const struct svm_problem *prob, const struct svm_parameter *param);
int svm_check_probability_model(const struct svm_model *model);

void svm_set_print_string_function(void (*print_func)(const char *));

#ifdef __cplusplus
}
#endif

#endif /* _LIBSVM_H */
--------------------------------------------------------------------------------