├── LICENSE
├── LSH_DL
│   ├── LSH_DL.iml
│   ├── LSH_DL.ipr
│   ├── pom.xml
│   └── src
│       ├── main
│       │   └── java
│       │       └── dl
│       │           ├── dataset
│       │           │   ├── DLDataSet.java
│       │           │   ├── DMPair.java
│       │           │   ├── MNISTDataSet.java
│       │           │   └── NORBDataSet.java
│       │           ├── lsh
│       │           │   ├── CosineDistance.java
│       │           │   ├── EuclideanDistance.java
│       │           │   ├── HashBuckets.java
│       │           │   ├── Histogram.java
│       │           │   ├── LSH.java
│       │           │   ├── Pooling.java
│       │           │   └── RandomProjection.java
│       │           └── nn
│       │               ├── CrossEntropy.java
│       │               ├── HiddenLayer.java
│       │               ├── ICostFunction.java
│       │               ├── LogisticNeuronLayer.java
│       │               ├── NN_parameters.java
│       │               ├── NeuralCoordinator.java
│       │               ├── NeuralNetwork.java
│       │               ├── NeuronLayer.java
│       │               ├── ReLUNeuronLayer.java
│       │               ├── SoftMaxNeuronLayer.java
│       │               └── Util.java
│       └── test
│           └── java
│               └── org
│                   └── dl
│                       └── LSH_DL_Experiments.groovy
└── README.md

/LICENSE:
--------------------------------------------------------------------------------
                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.
      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution.
      You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE.
      You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

   APPENDIX: How to apply the Apache License to your work.

      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "{}"
      replaced with your own identifying information. (Don't include
      the brackets!) The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.

   Copyright {yyyy} {name of copyright owner}

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
--------------------------------------------------------------------------------

/LSH_DL/LSH_DL.iml:
--------------------------------------------------------------------------------
(IntelliJ IDEA module file; its XML content was not preserved in this dump.)
--------------------------------------------------------------------------------

/LSH_DL/LSH_DL.ipr:
--------------------------------------------------------------------------------
(IntelliJ IDEA project file; its XML content was not preserved in this dump.)
--------------------------------------------------------------------------------

/LSH_DL/pom.xml:
--------------------------------------------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>groupId</groupId>
    <artifactId>LSH_DL</artifactId>
    <version>1.0-SNAPSHOT</version>

    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
        </dependency>
        <dependency>
            <groupId>org.codehaus.groovy</groupId>
            <artifactId>groovy-all</artifactId>
            <version>2.4.6</version>
        </dependency>
        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-compress</artifactId>
            <version>1.5</version>
        </dependency>
        <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>19.0-rc3</version>
        </dependency>
        <dependency>
            <groupId>org.jblas</groupId>
            <artifactId>jblas</artifactId>
            <version>1.2.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-lang3</artifactId>
            <version>3.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-math3</artifactId>
            <version>3.5</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.10.1</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
--------------------------------------------------------------------------------

/LSH_DL/src/main/java/dl/dataset/DLDataSet.java:
--------------------------------------------------------------------------------
package dl.dataset;

import org.apache.commons.lang3.tuple.Pair;
import org.jblas.DoubleMatrix;

import java.io.BufferedReader;
import java.util.ArrayList;
import java.util.List;

public class DLDataSet
{
    public static Pair<List<DoubleMatrix>, double[]> loadDataSet(BufferedReader stream, final int size, final int numAttributes) throws Exception
    {
        // read LibSVM-style data: one example per line, features followed by the label
        double[] label_list = new double[size];
        List<DoubleMatrix> data_list = new ArrayList<>(size);

        for(int label_idx = 0; label_idx < size; ++label_idx)
        {
            String[] data = stream.readLine().trim().split("\\s+");
            label_list[label_idx] = Double.parseDouble(data[numAttributes]);

            DoubleMatrix feature_vector = DoubleMatrix.zeros(numAttributes);
            for (int idx = 0; idx < numAttributes; ++idx)
            {
                feature_vector.put(idx, Double.parseDouble(data[idx]));
                assert(feature_vector.get(idx) >= 0.0 && feature_vector.get(idx) <= 1.0);
            }
            data_list.add(feature_vector);
        }
        return new DMPair(data_list, label_list);
    }
}
--------------------------------------------------------------------------------
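For orientation, a minimal usage sketch of the loader above (not a file from this repository): the file path, example count, and feature count below are placeholder assumptions.

import dl.dataset.DLDataSet;
import org.apache.commons.lang3.tuple.Pair;
import org.jblas.DoubleMatrix;

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.List;

public class DLDataSetDemo
{
    public static void main(String[] args) throws Exception
    {
        // hypothetical file: one example per line, 784 features then an integer label
        try (BufferedReader reader = new BufferedReader(new FileReader("data/train.txt")))
        {
            Pair<List<DoubleMatrix>, double[]> dataSet = DLDataSet.loadDataSet(reader, 1000, 784);
            System.out.println("examples: " + dataSet.getLeft().size()
                    + ", first label: " + dataSet.getRight()[0]);
        }
    }
}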
/LSH_DL/src/main/java/dl/dataset/DMPair.java:
--------------------------------------------------------------------------------
package dl.dataset;

import org.apache.commons.lang3.tuple.Pair;
import org.jblas.DoubleMatrix;
import java.util.List;

public class DMPair extends Pair<List<DoubleMatrix>, double[]>
{
    private List<DoubleMatrix> m_left;
    private double[] m_right;

    public DMPair(List<DoubleMatrix> left, double[] right)
    {
        m_left = left;
        m_right = right;
    }

    @Override
    public List<DoubleMatrix> getLeft()
    {
        return m_left;
    }

    @Override
    public double[] getRight()
    {
        return m_right;
    }

    // this pair is effectively immutable; setValue is required by Map.Entry but unsupported
    public double[] setValue(double[] value)
    {
        return null;
    }
}
--------------------------------------------------------------------------------

/LSH_DL/src/main/java/dl/dataset/MNISTDataSet.java:
--------------------------------------------------------------------------------
package dl.dataset;

import org.apache.commons.lang3.tuple.Pair;
import org.jblas.DoubleMatrix;

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.util.ArrayList;
import java.util.List;

public class MNISTDataSet
{
    private static final int LABEL_MAGIC = 2049;
    private static final int IMAGE_MAGIC = 2051;

    public static Pair<List<DoubleMatrix>, double[]> loadDataSet(final String label_path, final String image_path) throws Exception
    {
        // read MNIST data
        DataInputStream label_stream = new DataInputStream(new FileInputStream(label_path));
        DataInputStream image_stream = new DataInputStream(new FileInputStream(image_path));

        int label_magicNumber = label_stream.readInt();
        if (label_magicNumber != LABEL_MAGIC)
        {
            System.err.println("Label file has wrong magic number: " + label_magicNumber + " expected: " + LABEL_MAGIC);
        }
        int image_magicNumber = image_stream.readInt();
        if (image_magicNumber != IMAGE_MAGIC)
        {
            // report the image magic number here, not the label magic number
            System.err.println("Image file has wrong magic number: " + image_magicNumber + " expected: " + IMAGE_MAGIC);
        }

        int numLabels = label_stream.readInt();
        int numImages = image_stream.readInt();
        int numRows = image_stream.readInt();
        int numCols = image_stream.readInt();
        if (numLabels != numImages)
        {
            System.err.println("Image file and label file do not contain the same number of entries.");
            System.err.println("  Label file contains: " + numLabels);
            System.err.println("  Image file contains: " + numImages);
        }

        int label_idx = 0;
        int numImagesRead = 0;
        double[] label_list = new double[numLabels];
        List<DoubleMatrix> image_list = new ArrayList<>(numImages);
        while (label_stream.available() > 0 && label_idx < numLabels)
        {
            DoubleMatrix image = DoubleMatrix.zeros(numCols * numRows);
            label_list[label_idx++] = label_stream.readByte();
            int image_idx = 0;
            for (int colIdx = 0; colIdx < numCols; colIdx++)
            {
                for (int rowIdx = 0; rowIdx < numRows; rowIdx++)
                {
                    // scale raw pixel bytes into [0, 1]
                    image.put(image_idx++, image_stream.readUnsignedByte() / 255.0);
                }
            }
            image_list.add(image);
            ++numImagesRead;
        }
        assert(label_idx == numImagesRead);
        return new DMPair(image_list, label_list);
    }
}
--------------------------------------------------------------------------------
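A hedged usage sketch for the MNIST reader (again not part of the repository); the idx file names are the standard MNIST distribution names and the local paths are assumptions.

import dl.dataset.MNISTDataSet;
import org.apache.commons.lang3.tuple.Pair;
import org.jblas.DoubleMatrix;

import java.util.List;

public class MNISTDemo
{
    public static void main(String[] args) throws Exception
    {
        Pair<List<DoubleMatrix>, double[]> train = MNISTDataSet.loadDataSet(
                "data/train-labels-idx1-ubyte",   // label file (magic 2049)
                "data/train-images-idx3-ubyte");  // image file (magic 2051)
        List<DoubleMatrix> images = train.getLeft(); // flat 28*28 vectors scaled to [0, 1]
        double[] labels = train.getRight();          // digit labels 0-9 stored as doubles
        System.out.println(images.size() + " images, first label " + labels[0]);
    }
}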
/LSH_DL/src/main/java/dl/dataset/NORBDataSet.java:
--------------------------------------------------------------------------------
package dl.dataset;

import org.apache.commons.lang3.tuple.Pair;
import org.jblas.DoubleMatrix;

import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class NORBDataSet
{
    private static final int FEATURE_SIZE = 2048;

    public static Pair<List<DoubleMatrix>, double[]> loadDataSet(BufferedReader stream, final int size) throws IOException
    {
        // read LibSVM-style data
        double[] label_list = new double[size];
        List<DoubleMatrix> data_list = new ArrayList<>(size);

        for(int label_idx = 0; label_idx < size; ++label_idx)
        {
            // split on runs of whitespace, matching DLDataSet
            String[] data = stream.readLine().trim().split("\\s+");
            label_list[label_idx] = Double.parseDouble(data[FEATURE_SIZE]);

            DoubleMatrix feature_vector = DoubleMatrix.zeros(FEATURE_SIZE);
            for (int idx = 0; idx < FEATURE_SIZE; ++idx)
            {
                feature_vector.put(idx, Double.parseDouble(data[idx]));
                assert(feature_vector.get(idx) >= -1.0 && feature_vector.get(idx) <= 1.0);
            }
            data_list.add(feature_vector);
        }
        assert(data_list.size() == size);
        return new DMPair(data_list, label_list);
    }
}
--------------------------------------------------------------------------------

/LSH_DL/src/main/java/dl/lsh/CosineDistance.java:
--------------------------------------------------------------------------------
package dl.lsh;

import dl.nn.Util;
import org.jblas.DoubleMatrix;

import java.util.ArrayList;
import java.util.List;

public class CosineDistance implements LSH
{
    private int m_L;
    private List<DoubleMatrix> randomMatrix;

    public CosineDistance(int b, int L, int d)
    {
        m_L = L;
        randomMatrix = new ArrayList<>();
        for(int jdx = 0; jdx < m_L; ++jdx)
        {
            randomMatrix.add(DoubleMatrix.randn(b, d));
        }
    }

    public int[] hashSignature(DoubleMatrix data)
    {
        // allocate a fresh signature array per call; reusing one shared array would
        // make every previously returned signature alias the most recent one
        return new RandomProjection(new int[m_L], randomMatrix, data).run();
    }

    public static double distance(DoubleMatrix x, DoubleMatrix y)
    {
        return 1 - (x.dot(y) / (x.norm2() * y.norm2()));
    }

    public static double distance(double[] x, double[] y)
    {
        assert(x.length == y.length);
        double dp = 0.0;
        double x_norm = 0.0;
        double y_norm = 0.0;

        for(int idx = 0; idx < x.length; ++idx)
        {
            dp += x[idx] * y[idx];
            x_norm += Math.pow(x[idx], 2);
            y_norm += Math.pow(y[idx], 2);
        }

        x_norm = Math.sqrt(x_norm);
        y_norm = Math.sqrt(y_norm);
        return 1 - (dp / (x_norm * y_norm));
    }

    public static double dotProductDistance(int[] x, int[] y, final int b)
    {
        final int numIntegers = b / Util.INT_SIZE;
        assert(x.length == numIntegers);
        assert(y.length == numIntegers);

        double dp = 0.0;
        double x_norm = 0.0;
        double y_norm = 0.0;

        for(int idx = 0; idx < x.length; ++idx)
        {
            dp += count(x[idx] & y[idx]);
            x_norm += count(x[idx]);
            y_norm += count(y[idx]);
        }

        x_norm = Math.sqrt(x_norm);
        y_norm = Math.sqrt(y_norm);
        return 1 - (dp / (x_norm * y_norm));
    }

    public static double hammingDistance(int[] x, int[] y, final int b)
    {
        final int numIntegers = b / Util.INT_SIZE;
        int numBits = b % Util.INT_SIZE;
        numBits = (numBits == 0) ? Util.INT_SIZE : numBits;
        final int bitMask = (int) Math.pow(2, numBits) - 1;
        assert(x.length == numIntegers);
        assert(y.length == numIntegers);

        int hammingDistance = 0;
        for(int idx = 0; idx < x.length-1; ++idx)
        {
            hammingDistance += count(x[idx] ^ y[idx]);
        }

        // mask the final word down to the bits that are actually in use
        hammingDistance += count((x[x.length-1] & bitMask) ^ (y[y.length-1] & bitMask));
        // estimate cosine distance from the fraction of differing sign bits
        return 1 - Math.cos((double) hammingDistance * Math.PI / (double) b);
    }

    private static int count(int value)
    {
        // population count: number of set bits in value
        int count = 0;
        for(int idx = 0; idx < Util.INT_SIZE; ++idx)
        {
            count += (value & 1);
            value >>= 1;
        }
        return count;
    }
}
--------------------------------------------------------------------------------
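A small sketch of how this hash family behaves (not repository code): two nearby vectors should produce identical signatures in most of the L tables while their exact cosine distance stays small. The dimensions and the perturbation scale are arbitrary assumptions.

import dl.lsh.CosineDistance;
import org.jblas.DoubleMatrix;

public class CosineDemo
{
    public static void main(String[] args)
    {
        final int b = 16, L = 8, d = 64;          // 16 sign bits per table, 8 tables
        CosineDistance lsh = new CosineDistance(b, L, d);

        DoubleMatrix x = DoubleMatrix.randn(d);
        DoubleMatrix y = x.add(DoubleMatrix.randn(d).mul(0.05)); // small perturbation

        int[] hx = lsh.hashSignature(x);
        int[] hy = lsh.hashSignature(y);
        int matches = 0;
        for (int i = 0; i < L; ++i)
        {
            if (hx[i] == hy[i]) ++matches;
        }
        System.out.println("cosine distance " + CosineDistance.distance(x, y)
                + ", identical signatures in " + matches + "/" + L + " tables");
    }
}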
/LSH_DL/src/main/java/dl/lsh/EuclideanDistance.java:
--------------------------------------------------------------------------------
package dl.lsh;

import org.jblas.DoubleMatrix;

public class EuclideanDistance
{
    public static double distance(DoubleMatrix x, DoubleMatrix y)
    {
        return x.distance2(y);
    }

    // Euclidean distance between the L2-normalized copies of x and y
    public static double distance(double[] x, double[] y)
    {
        assert(x.length == y.length);
        double distance = 0.0;
        double x_mag = 0;
        double y_mag = 0;

        for(int idx = 0; idx < x.length; ++idx)
        {
            x_mag += Math.pow(x[idx], 2.0);
            y_mag += Math.pow(y[idx], 2.0);
        }
        x_mag = Math.sqrt(x_mag);
        y_mag = Math.sqrt(y_mag);

        for(int idx = 0; idx < x.length; ++idx)
        {
            distance += Math.pow((x[idx] / x_mag) - (y[idx] / y_mag), 2.0);
        }
        return Math.sqrt(distance);
    }
}
--------------------------------------------------------------------------------

/LSH_DL/src/main/java/dl/lsh/HashBuckets.java:
--------------------------------------------------------------------------------
package dl.lsh;

import org.jblas.DoubleMatrix;

import java.util.*;

public class HashBuckets
{
    private double m_nn_sizeLimit;
    private int m_L;
    private int m_poolDim;
    private LSH m_hashFunction;
    private List<Map<Integer, Set<Integer>>> m_Tables = new ArrayList<>();
    private List<Map<Integer, int[]>> m_bucket_hashes = new ArrayList<>();

    public HashBuckets(double sizeLimit, int poolDim, int L, LSH hashFunction)
    {
        m_hashFunction = hashFunction;
        m_poolDim = poolDim;
        m_nn_sizeLimit = sizeLimit;
        m_L = L;
        construct();
    }

    public void construct()
    {
        for (int i = 0; i < m_L; i++)
        {
            m_Tables.add(new HashMap<>());
            m_bucket_hashes.add(new HashMap<>());
        }
    }

    public void clear()
    {
        m_Tables.clear();
        m_bucket_hashes.clear();
        construct();
    }

    public void LSHAdd(int recIndex, DoubleMatrix data)
    {
        LSHAdd(recIndex, generateHashSignature(data));
    }

    private void LSHAdd(int recIndex, int[] hashes)
    {
        assert(hashes.length == m_L);

        for (int idx = 0; idx < m_L; idx++)
        {
            if (!m_Tables.get(idx).containsKey(hashes[idx]))
            {
                Set<Integer> set = new HashSet<>();
                set.add(recIndex);
                m_Tables.get(idx).put(hashes[idx], set);
                m_bucket_hashes.get(idx).put(hashes[idx], hashes);
            }
            else
            {
                m_Tables.get(idx).get(hashes[idx]).add(recIndex);
            }
        }
    }

    public Set<Integer> LSHUnion(DoubleMatrix data)
    {
        return LSHUnion(generateHashSignature(data));
    }

    public Set<Integer> histogramLSH(DoubleMatrix data)
    {
        return histogramLSH(generateHashSignature(data));
    }

    public Set<Integer> histogramLSH(int[] hashes)
    {
        assert(hashes.length == m_L);

        // rank candidates by how many of the L tables they collide in
        Histogram hist = new Histogram();
        for (int idx = 0; idx < m_L; ++idx)
        {
            if (m_Tables.get(idx).containsKey(hashes[idx]))
            {
                hist.add(m_Tables.get(idx).get(hashes[idx]));
            }
        }
        return hist.thresholdSet(m_nn_sizeLimit);
    }

    public Set<Integer> LSHUnion(int[] hashes)
    {
        assert(hashes.length == m_L);

        Set<Integer> retrieved = new HashSet<>();
        for (int idx = 0; idx < m_L && retrieved.size() < m_nn_sizeLimit; ++idx)
        {
            if (m_Tables.get(idx).containsKey(hashes[idx]))
            {
                retrieved.addAll(m_Tables.get(idx).get(hashes[idx]));
            }
        }
        return retrieved;
    }

    public int[] generateHashSignature(DoubleMatrix data)
    {
        // average-pool the vector down by m_poolDim before hashing
        return m_hashFunction.hashSignature(Pooling.compress(m_poolDim, data));
    }
}
--------------------------------------------------------------------------------
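How the pieces compose, as a sketch (not repository code): vectors are pooled down by poolDim before hashing, so the CosineDistance dimensionality must be d / poolDim. All constants here are illustrative.

import dl.lsh.CosineDistance;
import dl.lsh.HashBuckets;
import org.jblas.DoubleMatrix;

import java.util.Set;

public class BucketsDemo
{
    public static void main(String[] args)
    {
        final int d = 64, poolDim = 2, b = 16, L = 8;
        // keep at most 10 candidates per query; the hash sees pooled (d / poolDim)-dim vectors
        HashBuckets buckets = new HashBuckets(10.0, poolDim, L, new CosineDistance(b, L, d / poolDim));

        for (int node = 0; node < 100; ++node)
        {
            buckets.LSHAdd(node, DoubleMatrix.randn(d));
        }

        Set<Integer> candidates = buckets.histogramLSH(DoubleMatrix.randn(d));
        System.out.println("retrieved " + candidates.size() + " candidate ids");
    }
}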
/LSH_DL/src/main/java/dl/lsh/Histogram.java:
--------------------------------------------------------------------------------
package dl.lsh;

import org.apache.commons.lang3.mutable.MutableInt;

import java.util.*;

public class Histogram
{
    private HashMap<Integer, MutableInt> histogram = new HashMap<>();

    public void add(Collection<Integer> data)
    {
        for(Integer value : data)
        {
            if(!histogram.containsKey(value))
            {
                histogram.put(value, new MutableInt(1));
            }
            else
            {
                histogram.get(value).increment();
            }
        }
    }

    // return the `count` most frequently seen keys
    public Set<Integer> thresholdSet(double count)
    {
        List<Map.Entry<Integer, MutableInt>> list = new LinkedList<>(histogram.entrySet());
        Collections.sort(list, (Map.Entry<Integer, MutableInt> o1, Map.Entry<Integer, MutableInt> o2) -> o2.getValue().compareTo(o1.getValue()));
        count = Math.min(count, list.size());

        Set<Integer> retrieved = new HashSet<>();
        Iterator<Map.Entry<Integer, MutableInt>> iterator = list.iterator();
        for(int idx = 0; idx < count; ++idx)
        {
            retrieved.add(iterator.next().getKey());
        }
        return retrieved;
    }
}
--------------------------------------------------------------------------------

/LSH_DL/src/main/java/dl/lsh/LSH.java:
--------------------------------------------------------------------------------
package dl.lsh;

import org.jblas.DoubleMatrix;

public interface LSH
{
    int[] hashSignature(DoubleMatrix data);
}
--------------------------------------------------------------------------------

/LSH_DL/src/main/java/dl/lsh/Pooling.java:
--------------------------------------------------------------------------------
package dl.lsh;

import org.jblas.DoubleMatrix;

public class Pooling
{
    // average pooling: each output entry is the mean of a length-`size` window
    public static DoubleMatrix compress(final int size, DoubleMatrix data)
    {
        int compressSize = data.length / size;

        DoubleMatrix compressData = DoubleMatrix.zeros(compressSize);
        for(int idx = 0; idx < compressSize; ++idx)
        {
            int offset = idx * size;
            compressData.put(idx, sum(data, offset, offset + size));
        }
        return compressData;
    }

    // despite the name, this returns the mean of the slice, not the raw sum
    private static double sum(DoubleMatrix data, int start, int end)
    {
        double value = 0;
        for(int idx = start; idx < end; ++idx)
        {
            value += data.get(idx);
        }
        return value / (end - start);
    }
}
--------------------------------------------------------------------------------
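A quick worked example of the pooling step (sketch, not repository code): with a window of 2, a six-dimensional vector collapses to three window means.

import dl.lsh.Pooling;
import org.jblas.DoubleMatrix;

public class PoolingDemo
{
    public static void main(String[] args)
    {
        DoubleMatrix v = new DoubleMatrix(new double[]{1, 3, 2, 6, 4, 0});
        DoubleMatrix pooled = Pooling.compress(2, v);
        // windows (1,3), (2,6), (4,0) average to 2.0, 4.0, 2.0
        System.out.println(pooled);
    }
}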
/LSH_DL/src/main/java/dl/lsh/RandomProjection.java:
--------------------------------------------------------------------------------
package dl.lsh;

import org.jblas.DoubleMatrix;
import java.util.List;

public class RandomProjection
{
    private List<DoubleMatrix> m_projection_matrix;
    private DoubleMatrix m_query;
    private int[] m_hashes;

    public RandomProjection(int[] hashes, List<DoubleMatrix> projection_matrix, DoubleMatrix query)
    {
        m_projection_matrix = projection_matrix;
        m_query = query;
        m_hashes = hashes;
    }

    public int[] run()
    {
        int hash_idx = -1;
        for(DoubleMatrix projection : m_projection_matrix)
        {
            assert(projection.columns == m_query.rows);
            DoubleMatrix dotProduct = projection.mmul(m_query);

            // pack one sign bit per projected dimension into the signature;
            // shift before OR-ing so the final bit is not dropped
            int signature = 0;
            for(int idx = 0; idx < dotProduct.length; ++idx)
            {
                signature <<= 1;
                signature |= sign(dotProduct.get(idx));
            }
            m_hashes[++hash_idx] = signature;
        }
        return m_hashes;
    }

    private int sign(double value)
    {
        return (value >= 0) ? 1 : 0;
    }
}
--------------------------------------------------------------------------------
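Each signature bit is the sign of a Gaussian projection, so for unit vectors a bit differs with probability theta/pi, where theta is the angle between them; that is why CosineDistance.hammingDistance estimates cosine distance as 1 - cos(h*pi/b). A quick empirical check (sketch, not repository code; it counts sign bits directly instead of packing them into ints):

import dl.lsh.CosineDistance;
import org.jblas.DoubleMatrix;

public class ProjectionDemo
{
    public static void main(String[] args)
    {
        final int b = 4096, d = 32;                 // many bits for a tight estimate
        DoubleMatrix projection = DoubleMatrix.randn(b, d);
        DoubleMatrix x = DoubleMatrix.randn(d);
        DoubleMatrix y = DoubleMatrix.randn(d);

        DoubleMatrix px = projection.mmul(x);
        DoubleMatrix py = projection.mmul(y);
        int differing = 0;
        for (int i = 0; i < b; ++i)
        {
            if ((px.get(i) >= 0) != (py.get(i) >= 0)) ++differing;
        }
        double estimate = 1 - Math.cos(Math.PI * differing / b); // same estimator as hammingDistance
        System.out.println("exact " + CosineDistance.distance(x, y) + " vs estimate " + estimate);
    }
}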
/LSH_DL/src/main/java/dl/nn/CrossEntropy.java:
--------------------------------------------------------------------------------
package dl.nn;

import java.util.Arrays;

// Compare Against Labels - Classification
public class CrossEntropy implements ICostFunction
{
    private int max_idx(double[] array)
    {
        int max_idx = 0;
        // Double.MIN_VALUE is the smallest positive double, so start from negative infinity
        double max_value = Double.NEGATIVE_INFINITY;
        for(int idx = 0; idx < array.length; ++idx)
        {
            if(max_value < array[idx])
            {
                max_idx = idx;
                max_value = array[idx];
            }
        }
        return max_idx;
    }

    public double correct(double[] y_hat, double labels)
    {
        return (max_idx(y_hat) == (int) labels) ? 1.0 : 0.0;
    }

    public double accuracy(double[][] y_hat, double[] labels)
    {
        // select highest probability index as label for data set
        // check for matches and return average
        double correct = 0;
        for(int idx = 0; idx < labels.length; ++idx)
        {
            if(max_idx(y_hat[idx]) == (int) labels[idx])
            {
                ++correct;
            }
        }
        return correct / labels.length;
    }

    public double costFunction(double[] y_hat, double labels)
    {
        return -Math.log(y_hat[(int) labels]);
    }

    public double[] outputDelta(double[] y_hat, double labels, NeuronLayer l)
    {
        double[] delta = Arrays.copyOf(y_hat, y_hat.length);
        delta[(int) labels] -= 1.0;
        return delta;
    }
}
--------------------------------------------------------------------------------
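A worked example of the cost and output delta (sketch, not repository code): outputDelta is the usual softmax-with-cross-entropy gradient, y_hat minus the one-hot label vector.

import dl.nn.CrossEntropy;

import java.util.Arrays;

public class CrossEntropyDemo
{
    public static void main(String[] args)
    {
        CrossEntropy ce = new CrossEntropy();
        double[] y_hat = {0.7, 0.2, 0.1};                 // softmax output
        System.out.println(ce.costFunction(y_hat, 0));    // -ln(0.7), about 0.357
        // the layer argument is unused by this cost function
        System.out.println(Arrays.toString(ce.outputDelta(y_hat, 0, null))); // [-0.3, 0.2, 0.1]
    }
}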
/LSH_DL/src/main/java/dl/nn/HiddenLayer.java:
--------------------------------------------------------------------------------
package dl.nn;

import org.jblas.DoubleMatrix;

import java.util.*;

public abstract class HiddenLayer extends NeuronLayer
{
    protected Set<Integer> m_node_set;
    protected Set<Integer> m_total_node_set;
    protected long m_total_nn_set_size;
    protected long m_total_multiplication;

    public HiddenLayer(int prev_layer_size, int layer_size, double L2)
    {
        super(prev_layer_size, layer_size, L2);
        m_total_node_set = new HashSet<>();
        m_delta = new double[m_layer_size];
    }

    public abstract HiddenLayer clone();

    // Derivative Function
    protected abstract double derivative(double input);

    public double[] forwardPropagation(DoubleMatrix input, Set<Integer> nn_node_set, boolean training)
    {
        assert(nn_node_set.size() <= m_layer_size);
        assert(input.length == m_prev_layer_size);

        m_input = input.toArray();
        m_node_set = nn_node_set;

        if(training)
        {
            m_total_nn_set_size += m_node_set.size();
            m_total_multiplication += m_node_set.size() * m_prev_layer_size;
        }

        // only the LSH-retrieved (active) nodes compute a weighted sum
        Arrays.fill(m_weightedSum, 0.0);
        for(int idx : nn_node_set)
        {
            m_weightedSum[idx] = m_theta.getWeightVector(m_pos, idx).dot(input) + m_theta.getBias(m_pos, idx);
        }
        return activationFunction();
    }

    public double[] forwardPropagation(double[] input)
    {
        return forwardPropagation(Util.vectorize(input), false);
    }

    public double[] forwardPropagation(double[] input, boolean training)
    {
        return forwardPropagation(Util.vectorize(input), training);
    }

    public double[] forwardPropagation(DoubleMatrix input, boolean training)
    {
        return forwardPropagation(input, m_theta.retrieveNodes(m_pos, input), training);
    }

    public double[] forwardPropagation(DoubleMatrix input, int[] hashes, boolean training)
    {
        return forwardPropagation(input, m_theta.retrieveNodes(m_pos, hashes), training);
    }

    public double[] calculateDelta(final double[] prev_layer_delta)
    {
        Arrays.fill(m_delta, 0.0);
        for(int idx : m_node_set)
        {
            for(int jdx = 0; jdx < prev_layer_delta.length; ++jdx)
            {
                m_delta[idx] += m_theta.getWeight(m_pos+1, jdx, idx) * prev_layer_delta[jdx];
            }
            m_delta[idx] *= derivative(m_weightedSum[idx]);
        }
        return m_delta;
    }

    public void calculateGradient()
    {
        assert(m_delta.length == m_layer_size);

        for(int idx : m_node_set)
        {
            // Set Weight Gradient
            for(int jdx = 0; jdx < m_prev_layer_size; ++jdx)
            {
                m_theta.stochasticGradientDescent(m_theta.weightOffset(m_pos, idx, jdx), m_delta[idx] * m_input[jdx]);
            }

            // Set Bias Gradient
            m_theta.stochasticGradientDescent(m_theta.biasOffset(m_pos, idx), m_delta[idx]);
        }
    }

    public void updateHashTables(double size)
    {
        System.out.println(m_pos + " : " + m_total_nn_set_size / size);
    }
}
--------------------------------------------------------------------------------

/LSH_DL/src/main/java/dl/nn/ICostFunction.java:
--------------------------------------------------------------------------------
package dl.nn;

public interface ICostFunction
{
    double correct(double[] y_hat, double labels);
    double accuracy(double[][] y_hat, double[] labels);
    double costFunction(double[] y_hat, double labels);
    double[] outputDelta(double[] y_hat, double labels, NeuronLayer l);
}
--------------------------------------------------------------------------------

/LSH_DL/src/main/java/dl/nn/LogisticNeuronLayer.java:
--------------------------------------------------------------------------------
package dl.nn;

public class LogisticNeuronLayer extends HiddenLayer
{
    public LogisticNeuronLayer(int prev_layer_size, int layer_size, double L2)
    {
        super(prev_layer_size, layer_size, L2);
    }

    public HiddenLayer clone()
    {
        LogisticNeuronLayer copy = new LogisticNeuronLayer(m_prev_layer_size, m_layer_size, L2_Lambda);
        copy.m_theta = this.m_theta;
        copy.m_pos = this.m_pos;
        return copy;
    }

    // Random Weight Initialization
    protected double weightInitialization()
    {
        double interval = 4.0*Math.sqrt(6.0 / (m_prev_layer_size + m_layer_size));
        return Util.rand.nextDouble() * (2*interval) - interval;
    }

    // Activation Function
    protected double[] activationFunction(double[] input)
    {
        double[] output = new double[input.length];
        for(int idx = 0; idx < output.length; ++idx)
        {
            output[idx] = 1.0 / (1.0 + Math.exp(-input[idx]));
        }
        return output;
    }

    // Derivative Function
    protected double derivative(double input)
    {
        double negative_exp = Math.exp(-input);
        return negative_exp / Math.pow((1 + negative_exp), 2.0);
    }
}
--------------------------------------------------------------------------------
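LogisticNeuronLayer.derivative computes exp(-x) / (1 + exp(-x))^2, which is algebraically the familiar sigmoid derivative sigma(x) * (1 - sigma(x)). A one-line numerical check (sketch, not repository code):

public class SigmoidDerivativeCheck
{
    public static void main(String[] args)
    {
        double x = 0.8;
        double sigma = 1.0 / (1.0 + Math.exp(-x));
        double asImplemented = Math.exp(-x) / Math.pow(1 + Math.exp(-x), 2.0);
        // both print the same value, ~0.2139
        System.out.println(asImplemented + " == " + sigma * (1 - sigma));
    }
}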
/LSH_DL/src/main/java/dl/nn/NN_parameters.java:
--------------------------------------------------------------------------------
package dl.nn;


import dl.lsh.CosineDistance;
import dl.lsh.HashBuckets;
import org.jblas.DoubleMatrix;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.text.DecimalFormat;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;

/**
 * Created by sml1 on 9/23/15.
 */
public class NN_parameters
{
    // Neural Network Structure
    private int m_epoch_offset;
    private List<NeuronLayer> m_layers;

    private int[] m_weight_idx;
    private int[] m_bias_idx;
    private int m_size = 0;
    private int m_layer_count = 0;
    private int[] m_layer_row;
    private int[] m_layer_col;

    // Stochastic Gradient Descent
    private double[] m_theta;
    private double[] m_gradient;

    // Momentum
    private double[] m_momentum;
    private double m_momentum_lambda = 0.50;
    private final double momentum_max = 0.90;
    private final double momentum_rate = 1.00;

    // Learning Rate - Adagrad
    private final double m_learning_rate;
    private double[] m_learning_rates;

    // LSH
    private List<HashBuckets> m_tables;
    private final int[] m_poolDim;
    private final int[] m_b;
    private final int[] m_L;
    private double[] m_size_limit;

    // Create a NN_parameters object for a given neural network
    public NN_parameters(List<NeuronLayer> NN_structure, final int[] poolDim, final int[] b, final int[] L, final double learning_rate, double[] size_limit)
    {
        m_layers = NN_structure;
        m_learning_rate = learning_rate;
        m_poolDim = poolDim;
        m_b = b;
        m_L = L;
        m_size_limit = size_limit;
        m_tables = new ArrayList<>();
        construct(NN_structure);
        weight_initialization(NN_structure);
        createLSHTable(m_tables, poolDim, b, L, size_limit);
        System.out.println("Finished Initializing Parameters");
    }

    // Load parameters from saved file
    public NN_parameters(BufferedReader reader, List<NeuronLayer> NN_structure, final int[] poolDim, final int[] b, final int[] L, final double learning_rate, double[] size_limit) throws IOException
    {
        m_layers = NN_structure;
        m_learning_rate = learning_rate;
        m_poolDim = poolDim;
        m_b = b;
        m_L = L;
        m_size_limit = size_limit;
        m_tables = new ArrayList<>();
        construct(NN_structure);

        // Load model and parameters
        m_epoch_offset = Integer.parseInt(reader.readLine());
        load_model(NN_structure, reader, m_theta);
        load_model(NN_structure, reader, m_momentum);
        load_model(NN_structure, reader, m_learning_rates);

        createLSHTable(m_tables, poolDim, b, L, size_limit);
        System.out.println("Finished Initializing Parameters");
    }

    /*
        Create an empty duplicate of the other NN_parameters object
    */
    public double[] copy()
    {
        return Arrays.copyOf(m_theta, m_theta.length);
    }

    public void copy(double[] theta)
    {
        assert(theta.length == m_theta.length);
        m_theta = Arrays.copyOf(theta, theta.length);
        Arrays.fill(m_momentum, 0.0);
        Arrays.fill(m_learning_rates, 0.0);
    }

    public void construct(List<NeuronLayer> NN_structure)
    {
        m_weight_idx = new int[NN_structure.size()];
        m_bias_idx = new int[NN_structure.size()];
        m_layer_row = new int[NN_structure.size()];
        m_layer_col = new int[NN_structure.size()];

        for(NeuronLayer l : NN_structure)
        {
            m_layer_row[m_layer_count] = l.m_layer_size;
            m_layer_col[m_layer_count] = l.m_prev_layer_size;

            m_weight_idx[m_layer_count] = m_size;
            m_size += l.numWeights();
            m_bias_idx[m_layer_count] = m_size;
            m_size += l.numBias();
            l.m_pos = m_layer_count++;
            l.m_theta = this;
        }
        m_theta = new double[m_size];
        m_gradient = new double[m_size];
        m_momentum = new double[m_size];
        m_learning_rates = new double[m_size];
    }

    public int epoch_offset()
    {
        return m_epoch_offset;
    }

    public void save_model(int epoch, BufferedWriter writer) throws IOException
    {
        writer.write(Long.toString(epoch + m_epoch_offset));
        writer.newLine();
        save_model(writer, m_theta);
        save_model(writer, m_momentum);
        save_model(writer, m_learning_rates);
        writer.close();
    }

    private void save_model(BufferedWriter writer, double[] array) throws IOException
    {
        DecimalFormat df = new DecimalFormat("#.##########");
        final String SPACE = " ";

        int global_idx = -1;
        for(NeuronLayer l : m_layers)
        {
            for(int idx = 0; idx < l.m_layer_size; ++idx)
            {
                for(int jdx = 0; jdx < l.m_prev_layer_size; ++jdx)
                {
                    writer.write(df.format(array[++global_idx]));
                    writer.write(SPACE);
                }
                writer.newLine();
            }

            for(int idx = 0; idx < l.m_layer_size; ++idx)
            {
                writer.write(df.format(array[++global_idx]));
                writer.write(SPACE);
            }
            writer.newLine();
        }
        assert(global_idx == m_size-1);
    }

    private void load_model(List<NeuronLayer> NN_structure, BufferedReader reader, double[] array) throws IOException
    {
        int global_idx = -1;
        for(NeuronLayer l : NN_structure)
        {
            for(int idx = 0; idx < l.m_layer_size; ++idx)
            {
                String[] node = reader.readLine().trim().split("\\s+");
                assert(node.length == l.m_prev_layer_size);
                for(String weight : node)
                {
                    array[++global_idx] = Double.parseDouble(weight);
                }
            }

            String[] biases = reader.readLine().trim().split("\\s+");
            assert(biases.length == l.m_layer_size);
            for(String bias : biases)
            {
                array[++global_idx] = Double.parseDouble(bias);
            }
        }
        assert(global_idx == m_size-1);
    }

    private void weight_initialization(List<NeuronLayer> NN_structure)
    {
        int global_idx = -1;
        for(NeuronLayer l : NN_structure)
        {
            for(int idx = 0; idx < l.m_layer_size; ++idx)
            {
                for(int jdx = 0; jdx < l.m_prev_layer_size; ++jdx)
                {
                    m_theta[++global_idx] = l.weightInitialization();
                }
            }
            // skip over the bias entries, leaving them initialized to zero
            global_idx += l.m_layer_size;
        }
        assert(global_idx == m_size-1);
    }

    public List<int[]> computeHashes(List<DoubleMatrix> data)
    {
        // guard against a zero interval (and division by zero) for tiny data sets
        final int interval = Math.max(1, data.size() / 10);
        List<int[]> hashes = new ArrayList<>();
        for(int idx = 0; idx < data.size(); ++idx)
        {
            if(idx % interval == 0)
            {
                System.out.println("Completed " + idx + " / " + data.size());
            }

            hashes.add(m_tables.get(0).generateHashSignature(data.get(idx)));
        }
        return hashes;
    }

    public DoubleMatrix getWeight(int layer, int node)
    {
        assert(layer >= 0 && layer < m_layer_count);
        assert(node >= 0 && node < m_layer_row[layer]);

        return Util.vectorize(m_theta, m_weight_idx[layer] + node * m_layer_col[layer], m_layer_col[layer]);
    }

    public void rebuildTables()
    {
        int global_idx = 0;
        for(int layer_idx = 0; layer_idx < m_layer_count-1; ++layer_idx)
        {
            m_tables.get(layer_idx).clear();
            for(int idx = 0; idx < m_layer_row[layer_idx]; ++idx)
            {
                m_tables.get(layer_idx).LSHAdd(idx, Util.vectorize(m_theta, global_idx, m_layer_col[layer_idx]));
                global_idx += m_layer_col[layer_idx];
            }
            global_idx += m_layer_row[layer_idx];
        }
    }

    public void createLSHTable(List<HashBuckets> tables, int[] poolDim, int[] b, int[] L, final double[] size_limit)
    {
        int global_idx = 0;
        for(int layer_idx = 0; layer_idx < m_layer_count-1; ++layer_idx)
        {
            HashBuckets table = new HashBuckets(size_limit[layer_idx] * m_layer_row[layer_idx], poolDim[layer_idx], L[layer_idx], new CosineDistance(b[layer_idx], L[layer_idx], m_layer_col[layer_idx] / poolDim[layer_idx]));
            for(int idx = 0; idx < m_layer_row[layer_idx]; ++idx)
            {
                table.LSHAdd(idx, Util.vectorize(m_theta, global_idx, m_layer_col[layer_idx]));
                global_idx += m_layer_col[layer_idx];
            }
            tables.add(table);
            global_idx += m_layer_row[layer_idx];
        }
    }

    public Set<Integer> retrieveNodes(int layer, DoubleMatrix input)
    {
        //return m_tables.get(layer).LSHUnion(input);
        return m_tables.get(layer).histogramLSH(input);
    }

    public Set<Integer> retrieveNodes(int layer, int[] hashes)
    {
        //return m_tables.get(layer).LSHUnion(hashes);
        return m_tables.get(layer).histogramLSH(hashes);
    }

    public void timeStep()
    {
        m_momentum_lambda *= momentum_rate;
        m_momentum_lambda = Math.min(m_momentum_lambda, momentum_max);
    }

    public int size()
    {
        return m_size;
    }

    public double getGradient(int idx)
    {
        assert(idx >= 0 && idx < m_theta.length);
        return m_momentum[idx] / m_learning_rate;
    }

    public double getTheta(int idx)
    {
        assert(idx >= 0 && idx < m_theta.length);
        return m_theta[idx];
    }

    public void setTheta(int idx, double value)
    {
        assert(idx >= 0 && idx < m_theta.length);
        m_theta[idx] = value;
    }

    public double getWeight(int layer, int row, int col)
    {
        assert(layer >= 0 && layer < m_layer_count);
        assert(row >= 0 && row < m_layer_row[layer]);
        assert(col >= 0 && col < m_layer_col[layer]);

        int idx = row * m_layer_col[layer] + col;
        return m_theta[m_weight_idx[layer] + idx];
    }

    public DoubleMatrix getWeightVector(int layer, int node_idx)
    {
        assert(layer >= 0 && layer < m_layer_count);
        assert(node_idx >= 0 && node_idx < m_layer_row[layer]);
        return Util.vectorize(m_theta, m_weight_idx[layer] + node_idx * m_layer_col[layer], m_layer_col[layer]);
    }

    public void setWeight(int layer, int row, int col, double value)
    {
        assert(layer >= 0 && layer < m_layer_count);
        assert(row >= 0 && row < m_layer_row[layer]);
        assert(col >= 0 && col < m_layer_col[layer]);

        int idx = row * m_layer_col[layer] + col;
        m_theta[m_weight_idx[layer] + idx] = value;
    }

    public double getBias(int layer, int idx)
    {
        assert(layer >= 0 && layer < m_layer_count);
        assert(idx >= 0 && idx < m_layer_row[layer]);

        return m_theta[m_bias_idx[layer] + idx];
    }

    public void setBias(int layer, int idx, double value)
    {
        assert(layer >= 0 && layer < m_layer_count);
        assert(idx >= 0 && idx < m_layer_row[layer]);

        m_theta[m_bias_idx[layer] + idx] = value;
    }

    public double L2_regularization()
    {
        double L2 = 0.0;
        for (int layer_idx = 0; layer_idx < m_layer_count; ++layer_idx)
        {
            for(int idx = m_weight_idx[layer_idx]; idx < m_bias_idx[layer_idx]; ++idx)
            {
                L2 += Math.pow(m_theta[idx], 2.0);
            }
        }
        return 0.5 * L2;
    }

    public int weightOffset(int layer, int row, int column)
    {
        assert(layer >= 0 && layer < m_layer_count);
        assert(row >= 0 && row < m_layer_row[layer]);
        assert(column >= 0 && column < m_layer_col[layer]);

        int idx = row * m_layer_col[layer] + column;
        return m_weight_idx[layer] + idx;
    }

    public int biasOffset(int layer, int idx)
    {
        assert(layer >= 0 && layer < m_layer_count);
        assert(idx >= 0 && idx < m_layer_row[layer]);

        return m_bias_idx[layer] + idx;
    }

    public void stochasticGradientDescent(int idx, double gradient)
    {
        m_gradient[idx] = gradient;
        m_learning_rates[idx] += Math.pow(gradient, 2.0);
        double learning_rate = m_learning_rate / (1e-6 + Math.sqrt(m_learning_rates[idx]));
        m_momentum[idx] *= m_momentum_lambda;
        m_momentum[idx] += learning_rate * gradient;
        m_theta[idx] -= m_momentum[idx];
    }

    public void clear_gradient()
    {
        Arrays.fill(m_gradient, 0);
    }

    public void print_active_nodes(String dataset, String filename, double threshold) throws Exception
    {
        final int linesize = 25;

        BufferedWriter writer = new BufferedWriter(new FileWriter(Util.DATAPATH + dataset + "/" + dataset + "_" + filename, true));
        StringBuilder string = new StringBuilder();
        string.append("[");
        int count = 0;
        for(int layer = 0; layer < m_layer_row.length; ++layer)
        {
            int layer_size = m_layer_row[layer];
            int[] grad_count = new int[layer_size];
            for(int idx = 0; idx < layer_size; ++idx)
            {
                int pos = idx * m_layer_col[layer];
                for(int jdx = 0; jdx < m_layer_col[layer]; ++jdx)
                {
                    if(Math.abs(m_gradient[pos+jdx]) > 0)
                    {
                        ++grad_count[idx];
                    }
                }
            }

            for(int idx = 0; idx < layer_size; ++idx)
            {
                int value = (grad_count[idx] >= threshold * layer_size) ? 1 : 0;
                string.append(value);
                if(!(layer == m_layer_row.length-1 && idx == layer_size-1))
                {
                    if(count <= linesize)
                    {
                        string.append(", ");
                    }
                    else
                    {
                        string.append("\n");
                        count = 0;
                    }
                }
                ++count;
            }
        }
        string.append("]\n");
        writer.write(string.toString());
        writer.flush();
        writer.close();
    }
}
--------------------------------------------------------------------------------
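stochasticGradientDescent above couples an AdaGrad-style per-parameter learning rate with classical momentum. A self-contained restatement of that update rule (sketch, not repository code; the base rate and momentum factor are illustrative stand-ins for m_learning_rate and m_momentum_lambda):

public class SgdSketch
{
    private final double baseRate = 1e-2;   // plays the role of m_learning_rate
    private final double lambda = 0.5;      // plays the role of m_momentum_lambda
    private final double[] theta, momentum, sumSqGrad;

    public SgdSketch(int n)
    {
        theta = new double[n];
        momentum = new double[n];
        sumSqGrad = new double[n];
    }

    public void step(int i, double gradient)
    {
        sumSqGrad[i] += gradient * gradient;                       // AdaGrad accumulator
        double rate = baseRate / (1e-6 + Math.sqrt(sumSqGrad[i])); // shrinking per-parameter rate
        momentum[i] = lambda * momentum[i] + rate * gradient;      // momentum smoothing
        theta[i] -= momentum[i];                                   // parameter update
    }
}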
/LSH_DL/src/main/java/dl/nn/NeuralCoordinator.java:
--------------------------------------------------------------------------------
package dl.nn;

import org.jblas.DoubleMatrix;

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.math.RoundingMode;
import java.text.DecimalFormat;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class NeuralCoordinator
{
    private List<NeuralNetwork> m_networks;
    private NN_parameters m_params;
    private String m_modelTitle;
    private String m_model_path;
    private String m_train_path;
    private String m_test_path;
    private double m_total_nodes;
    private final int update_threshold = 20;

    public NeuralCoordinator(String model_title, String title, String dataset, NN_parameters params, List<NeuronLayer> layers, LinkedList<HiddenLayer> hiddenLayers, double L2, ICostFunction cf) throws IOException
    {
        m_modelTitle = model_title;
        m_model_path = Util.DATAPATH + dataset + "/" + Util.MODEL + title + "_" + model_title;
        m_train_path = Util.DATAPATH + dataset + "/" + Util.TRAIN + title;
        m_test_path = Util.DATAPATH + dataset + "/" + Util.TEST + title;

        for(NeuronLayer layer : layers)
        {
            m_total_nodes += layer.m_layer_size;
        }

        m_params = params;
        m_networks = new ArrayList<>(Util.LAYER_THREADS);
        m_networks.add(new NeuralNetwork(params, layers, hiddenLayers, L2, cf));
        for(int idx = 1; idx < Util.LAYER_THREADS; ++idx)
        {
            // each worker thread gets layer clones that share the same parameter store
            LinkedList<HiddenLayer> hiddenLayers1 = new LinkedList<>();
            hiddenLayers.forEach(e -> hiddenLayers1.add(e.clone()));

            List<NeuronLayer> layers1 = new LinkedList<>();
            layers1.addAll(hiddenLayers1);
            layers1.add(layers.get(layers.size()-1).clone());
            m_networks.add(new NeuralNetwork(params, layers1, hiddenLayers1, L2, cf));
        }
    }

    private List<Integer> initIndices(int length)
    {
        List<Integer> indices = new ArrayList<>();
        for(int idx = 0; idx < length; ++idx)
        {
            indices.add(idx);
        }
        return indices;
    }

    private void shuffle(List<Integer> indices)
    {
        for(int idx = 0; idx < indices.size(); ++idx)
        {
            int rand = Util.rand.nextInt(indices.size());
            int value = indices.get(idx);
            indices.set(idx, indices.get(rand));
            indices.set(rand, value);
        }
    }

    public void test(List<DoubleMatrix> data, double[] labels)
    {
        List<int[]> test_hashes = m_params.computeHashes(data);
        System.out.println("Finished Pre-Computing Testing Hashes");
        System.out.println(m_networks.get(0).test(test_hashes, data, labels));
    }

    // training data, training labels
    public void train(final int max_epoch, List<DoubleMatrix> data, double[] labels, List<DoubleMatrix> test_data, double[] test_labels) throws Exception
    {
        assert(data.size() == labels.length);
        assert(test_data.size() == test_labels.length);

        List<int[]> input_hashes = m_params.computeHashes(data);
        System.out.println("Finished Pre-Computing Training Hashes");

        List<int[]> test_hashes = m_params.computeHashes(test_data);
        System.out.println("Finished Pre-Computing Testing Hashes");

        List<Integer> data_idx = initIndices(labels.length);
        final int m_examples_per_thread = data.size() / (Util.UPDATE_SIZE * Util.LAYER_THREADS);
        assert(data_idx.size() == labels.length);

        BufferedWriter train_writer = new BufferedWriter(new FileWriter(m_train_path, true));
        BufferedWriter test_writer = new BufferedWriter(new FileWriter(m_test_path, true));
        for(int epoch_count = 0; epoch_count < max_epoch; ++epoch_count)
        {
            m_params.clear_gradient();
            shuffle(data_idx);
            int count = 0;
            while(count < data_idx.size())
            {
                List<Thread> threads = new LinkedList<>();
                for(NeuralNetwork network : m_networks)
                {
                    if(count < data_idx.size())
                    {
                        int start = count;
                        count = Math.min(data_idx.size(), count + m_examples_per_thread);
                        int end = count;

                        Thread t = new Thread()
                        {
                            @Override
                            public void run()
                            {
                                for (int pos = start; pos < end; ++pos)
                                {
                                    network.execute(input_hashes.get(pos), data.get(pos), labels[pos], true);
                                }
                            }
                        };
                        t.start();
                        threads.add(t);
                    }
                }
                Util.join(threads);
                // periodically re-index the updated weight vectors in the LSH tables
                if(epoch_count <= update_threshold && epoch_count % (epoch_count / 10 + 1) == 0)
                {
                    m_params.rebuildTables();
                }
            }

            // Console Debug Output
            int epoch = m_params.epoch_offset() + epoch_count;
            //m_networks.stream().forEach(e -> e.updateHashTables(labels.length / Util.LAYER_THREADS));
            double activeNodes = calculateActiveNodes(m_total_nodes * data.size());
            double test_accuracy = m_networks.get(0).test(test_hashes, test_data, test_labels);
            System.out.println("Epoch " + epoch + " Accuracy: " + test_accuracy);

            // Test Output
            DecimalFormat df = new DecimalFormat("#.###");
            df.setRoundingMode(RoundingMode.FLOOR);
            test_writer.write(m_modelTitle + " " + epoch + " " + df.format(activeNodes) + " " + test_accuracy);
            test_writer.newLine();

            // Train Output
            train_writer.write(m_modelTitle + " " + epoch + " " + df.format(activeNodes) + " " + calculateTrainAccuracy(data.size()));
            train_writer.newLine();

            test_writer.flush();
            train_writer.flush();

            m_params.timeStep();
        }
        test_writer.close();
        train_writer.close();
        save_model(max_epoch, m_model_path);
    }

    public void save_model(int epoch, String path) throws IOException
    {
        m_params.save_model(epoch, Util.writerBZ2(path));
    }

    private double calculateTrainAccuracy(double size)
    {
        double count = 0;
        for(NeuralNetwork network : m_networks)
        {
            count += network.m_train_correct;
            network.m_train_correct = 0;
        }
        return count / size;
    }

    private double calculateActiveNodes(double total)
    {
        long active = 0;
        for(NeuralNetwork network : m_networks)
        {
            active += network.calculateActiveNodes();
        }
        return active / total;
    }

    private long calculateMultiplications()
    {
        long total = 0;
        for(NeuralNetwork network : m_networks)
        {
            total += network.calculateMultiplications();
        }
        return total;
    }
}
--------------------------------------------------------------------------------
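For context, a heavily hedged sketch of how these classes appear to be wired together (not repository code; the SoftMaxNeuronLayer constructor shape, every hyper-parameter, and the data variables are assumptions, and the repo's real driver is the Groovy experiment script listed in the tree):

import dl.nn.*;
import org.jblas.DoubleMatrix;

import java.util.LinkedList;
import java.util.List;

public class TrainingSketch
{
    public static void train(List<DoubleMatrix> trainData, double[] trainLabels,
                             List<DoubleMatrix> testData, double[] testLabels) throws Exception
    {
        final double L2 = 0.003;
        LinkedList<HiddenLayer> hidden = new LinkedList<>();
        hidden.add(new ReLUNeuronLayer(784, 1000, L2));

        List<NeuronLayer> layers = new LinkedList<>(hidden);
        layers.add(new SoftMaxNeuronLayer(1000, 10, L2)); // assumed constructor shape

        // one LSH table per hidden layer: pool 784 -> 28 dims, 8-bit signatures,
        // 50 tables, and retrieve at most 5% of the layer's nodes per example
        int[] poolDim = {28}, b = {8}, L = {50};
        double[] sizeLimit = {0.05};
        NN_parameters params = new NN_parameters(layers, poolDim, b, L, 1e-2, sizeLimit);

        NeuralCoordinator model = new NeuralCoordinator("demo_model", "demo", "MNIST",
                params, layers, hidden, L2, new CrossEntropy());
        model.train(10, trainData, trainLabels, testData, testLabels);
    }
}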
/LSH_DL/src/main/java/dl/nn/NeuralNetwork.java:
--------------------------------------------------------------------------------
1 | package dl.nn;
2 |
3 | import org.jblas.DoubleMatrix;
4 | import java.util.*;
5 |
6 | public class NeuralNetwork
7 | {
8 | private List<NeuronLayer> m_layers;
9 | private LinkedList<HiddenLayer> m_hidden_layers;
10 | private double L2_lambda;
11 | private ICostFunction m_cf;
12 | private NN_parameters m_params;
13 | private double m_cost;
14 | protected double m_train_correct;
15 |
16 | public NeuralNetwork(NN_parameters params, List<NeuronLayer> layers, LinkedList<HiddenLayer> hiddenLayers, double L2, ICostFunction cf)
17 | {
18 | L2_lambda = L2;
19 | m_cf = cf;
20 | m_params = params;
21 | m_hidden_layers = hiddenLayers;
22 | m_layers = layers;
23 | }
24 |
25 | public long calculateActiveNodes()
26 | {
27 | long total = 0;
28 | for(HiddenLayer l : m_hidden_layers)
29 | {
30 | total += l.m_total_nn_set_size;
31 | l.m_total_nn_set_size = 0;
32 | }
33 | total += m_layers.get(m_layers.size()-1).m_layer_size;
34 | return total;
35 | }
36 |
37 | public long calculateMultiplications()
38 | {
39 | long total = 0;
40 | for(HiddenLayer l : m_hidden_layers)
41 | {
42 | total += l.m_total_multiplication;
43 | l.m_total_multiplication = 0;
44 | }
45 | total += m_layers.get(m_layers.size()-1).numWeights();
46 | return total;
47 | }
48 |
49 | public double test(List<int[]> input_hashes, List<DoubleMatrix> data, double[] labels)
50 | {
51 | double[][] y_hat = new double[labels.length][];
52 | for(int idx = 0; idx < labels.length; ++idx)
53 | {
54 | y_hat[idx] = forwardPropagation(data.get(idx), input_hashes.get(idx), false);
55 | }
56 | return m_cf.accuracy(y_hat, labels);
57 | }
58 |
59 | public double getGradient(int idx)
60 | {
61 | return m_params.getGradient(idx);
62 | }
63 |
64 | public double[] copyTheta()
65 | {
66 | return m_params.copy();
67 | }
68 |
69 | public double getCost()
70 | {
71 | return m_cost;
72 | }
73 |
74 | public int numTheta()
75 | {
76 | return m_params.size();
77 | }
78 |
79 | public double getTheta(int idx)
80 | {
81 | return m_params.getTheta(idx);
82 | }
83 |
84 | public void setTheta(double[] params)
85 | {
86 | m_params.copy(params);
87 | }
88 |
89 | public void setTheta(int idx, double value)
90 | {
91 | m_params.setTheta(idx, value);
92 | }
93 |
94 | public List<int[]> computeHashes(List<DoubleMatrix> data)
95 | {
96 | return m_params.computeHashes(data);
97 | }
98 |
99 | public void updateHashTables(int miniBatch_size)
100 | {
101 | m_hidden_layers.forEach(e -> e.updateHashTables(miniBatch_size));
102 | }
103 |
104 | public void execute(int[] hashes, DoubleMatrix input, double labels, boolean training)
105 | {
106 | double[] y_hat = forwardPropagation(input, hashes, training);
107 | backPropagation(y_hat, labels); // Calculate Cost and Gradient
108 | m_train_correct += m_cf.correct(y_hat, labels);
109 | }
110 |
111 | private void backPropagation(double[] y_hat, double labels)
112 | {
113 | // Cost function value plus the L2 regularization penalty
114 | m_cost = m_cf.costFunction(y_hat, labels) + L2_lambda * m_params.L2_regularization();
115 |
116 | NeuronLayer outputLayer = m_layers.get(m_layers.size()-1);
117 |
118 | // cost function derivatives
119 | double[] delta = m_cf.outputDelta(y_hat, labels, outputLayer);
120 |
121 | // Walk the layers in reverse, computing each layer's delta
122 | ListIterator<NeuronLayer> it = m_layers.listIterator(m_layers.size());
123 | // (output layer first, then the hidden layers)
124 | while (it.hasPrevious())
125 | {
126 | delta = it.previous().calculateDelta(delta);
127 | }
128 |
129 | // Walk the layers in reverse again, accumulating each layer's gradient
130 | it = m_layers.listIterator(m_layers.size());
131 | while (it.hasPrevious())
132 | {
133 | it.previous().calculateGradient();
134 | }
135 | }
136 |
137 | private double[] forwardPropagation(DoubleMatrix input, int[] hashes, boolean training)
138 | {
139 | Iterator<HiddenLayer> iterator = m_hidden_layers.iterator();
140 | double[] data = iterator.next().forwardPropagation(input, hashes, training);
141 | while (iterator.hasNext())
142 | {
143 | data = iterator.next().forwardPropagation(data, training);
144 | }
145 | return m_layers.get(m_layers.size() - 1).forwardPropagation(data);
146 | }
147 | }
148 |
--------------------------------------------------------------------------------
/LSH_DL/src/main/java/dl/nn/NeuronLayer.java:
--------------------------------------------------------------------------------
1 | package dl.nn;
2 |
3 | // Input: A(l) Params: W(l) b(l) Output: Z(l+1) A(l+1)
4 | public abstract class NeuronLayer implements Cloneable
5 | {
6 | public int m_pos = -1;
7 | public NN_parameters m_theta;
8 |
9 | protected double[] m_input;
10 | protected double[] m_weightedSum;
11 | protected double[] m_delta;
12 |
13 | protected int m_prev_layer_size;
14 | protected int m_layer_size;
15 | protected double L2_Lambda;
16 |
17 | public NeuronLayer(int prev_layer_size, int layer_size, double L2)
18 | {
19 | m_prev_layer_size = prev_layer_size;
20 | m_layer_size = layer_size;
21 | L2_Lambda = L2;
22 | m_weightedSum = new double[m_layer_size];
23 | }
24 |
25 | public abstract NeuronLayer clone();
26 |
27 | // Random Weight Initialization
28 | protected abstract double weightInitialization();
29 |
30 | // Activation Function
31 | protected abstract double[] activationFunction(double[] input);
32 |
33 | public double[] activationFunction()
34 | {
35 | return activationFunction(m_weightedSum);
36 | }
37 |
38 | public int numWeights()
39 | {
40 | return m_prev_layer_size * m_layer_size;
41 | }
42 |
43 | public int numBias()
44 | {
45 | return m_layer_size;
46 | }
47 |
48 | public abstract double[] forwardPropagation(double[] input);
49 |
50 | public abstract double[] calculateDelta(double[] prev_layer_delta);
51 |
52 | public abstract void calculateGradient();
53 | }
54 |
--------------------------------------------------------------------------------
/LSH_DL/src/main/java/dl/nn/ReLUNeuronLayer.java:
--------------------------------------------------------------------------------
1 | package dl.nn;
2 |
3 | public class ReLUNeuronLayer extends HiddenLayer
4 | {
5 | public ReLUNeuronLayer(int prev_layer_size, int layer_size, double L2)
6 | {
7 | super(prev_layer_size, layer_size, L2);
8 | }
9 |
10 | public HiddenLayer clone()
11 | {
12 | ReLUNeuronLayer copy = new ReLUNeuronLayer(m_prev_layer_size, m_layer_size, L2_Lambda);
13 | copy.m_theta = this.m_theta;
14 | copy.m_pos = this.m_pos;
15 | return copy;
16 | }
17 |
18 | // Random Weight Initialization
19 | protected double weightInitialization()
20 | {
21 | double interval = 2.0*Math.sqrt(6.0 / (m_prev_layer_size + m_layer_size));
22 | return Util.rand.nextDouble() * (2*interval) - interval; // uniform over [-interval, interval]
23 | }
24 |
25 | // Activation Function
26 | protected double[] activationFunction(double[] input)
27 | {
28 | double[] output = new double[input.length];
29 | for(int idx = 0; idx < output.length; ++idx)
30 | {
31 | output[idx] = Math.max(input[idx], 0.0);
32 | }
33 | return output;
34 | }
35 |
36 | // Derivative Function
37 | protected double derivative(double input)
38 | {
39 | return (input > 0) ? 1.0 : 0.0;
40 | }
41 | }
42 |
--------------------------------------------------------------------------------
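Editor's note — weightInitialization above (and its twin in SoftMaxNeuronLayer below) draws each initial weight uniformly from $[-r, r]$ with

$$ r \;=\; 2\sqrt{\frac{6}{n_{\mathrm{in}} + n_{\mathrm{out}}}}, $$

that is, twice the Glorot/Xavier uniform bound $\sqrt{6/(n_{\mathrm{in}}+n_{\mathrm{out}})}$. The source does not say whether the extra factor of 2 is intentional; it is reminiscent of He-style scaling for ReLU units, which compensates for the activation zeroing roughly half of its inputs.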
/LSH_DL/src/main/java/dl/nn/SoftMaxNeuronLayer.java:
--------------------------------------------------------------------------------
1 | package dl.nn;
2 |
3 | public class SoftMaxNeuronLayer extends NeuronLayer
4 | {
5 | public SoftMaxNeuronLayer(int prev_layer_size, int layer_size, double L2)
6 | {
7 | super(prev_layer_size, layer_size, L2);
8 | }
9 |
10 | public NeuronLayer clone()
11 | {
12 | SoftMaxNeuronLayer copy = new SoftMaxNeuronLayer(m_prev_layer_size, m_layer_size, L2_Lambda);
13 | copy.m_theta = this.m_theta;
14 | copy.m_pos = this.m_pos;
15 | return copy;
16 | }
17 |
18 | // Random Weight Initialization
19 | protected double weightInitialization()
20 | {
21 | double interval = 2.0*Math.sqrt(6.0 / (m_prev_layer_size + m_layer_size));
22 | return Util.rand.nextDouble() * (2*interval) - interval;
23 | }
24 |
25 | // Activation Function
26 | protected double[] activationFunction(double[] input)
27 | {
28 | double sum = 0.0;
29 | double[] output = new double[input.length];
30 | for(int idx = 0; idx < input.length; ++idx)
31 | {
32 | output[idx] = Math.exp(input[idx]); // NOTE: assumes the weighted sums stay small enough not to overflow; subtracting max(input) first would make this numerically safe
33 | sum += output[idx];
34 | }
35 |
36 | for(int idx = 0; idx < output.length; ++idx)
37 | {
38 | output[idx] /= sum;
39 | }
40 | return output;
41 | }
42 |
43 | public double[] forwardPropagation(double[] input)
44 | {
45 | assert(input.length == m_prev_layer_size);
46 | m_input = input;
47 |
48 | for(int jdx = 0; jdx < m_layer_size; ++jdx)
49 | {
50 | m_weightedSum[jdx] = 0.0;
51 | for(int idx = 0; idx < m_prev_layer_size; ++idx)
52 | {
53 | m_weightedSum[jdx] += m_theta.getWeight(m_pos, jdx, idx) * m_input[idx];
54 | }
55 | m_weightedSum[jdx] += m_theta.getBias(m_pos, jdx);
56 | }
57 | return activationFunction();
58 | }
59 |
60 | public double[] calculateDelta(double[] prev_layer_delta)
61 | {
62 | assert(prev_layer_delta.length == m_layer_size);
63 | m_delta = prev_layer_delta;
64 | return m_delta;
65 | }
66 |
67 | public void calculateGradient()
68 | {
69 | assert(m_delta.length == m_layer_size);
70 |
71 | for(int idx = 0; idx < m_layer_size; ++idx)
72 | {
73 | // Set Weight Gradient
74 | for(int jdx = 0; jdx < m_prev_layer_size; ++jdx)
75 | {
76 | m_theta.stochasticGradientDescent(m_theta.weightOffset(m_pos, idx, jdx), m_delta[idx] * m_input[jdx] + L2_Lambda * m_theta.getWeight(m_pos, idx, jdx));
77 | }
78 |
79 | // Set Bias Gradient
80 | m_theta.stochasticGradientDescent(m_theta.biasOffset(m_pos, idx), m_delta[idx]);
81 | }
82 | }
83 | }
--------------------------------------------------------------------------------
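Editor's note — SoftMaxNeuronLayer.calculateDelta simply stores and returns the incoming delta. That is correct for the softmax/cross-entropy pairing used in the experiments: for $\hat{y}_j = e^{z_j}/\sum_k e^{z_k}$ and $L = -\sum_k y_k \log \hat{y}_k$, the softmax Jacobian and the cross-entropy derivative collapse to

$$ \frac{\partial L}{\partial z_j} \;=\; \hat{y}_j - y_j, $$

so, assuming CrossEntropy.outputDelta emits $\hat{y} - y$ (that class is not shown in this section), there is nothing left for the layer to multiply in.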
/LSH_DL/src/main/java/dl/nn/Util.java:
--------------------------------------------------------------------------------
1 | package dl.nn;
2 |
3 | import org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream;
4 | import org.apache.commons.compress.compressors.bzip2.BZip2CompressorOutputStream;
5 | import org.jblas.DoubleMatrix;
6 |
7 | import java.io.*;
8 | import java.util.*;
9 |
10 | public class Util
11 | {
12 | public static final String DATAPATH = "../data/";
13 | public static final String MODEL = "Models/";
14 | public static final String TRAIN = "Train/";
15 | public static final String TEST = "Test/";
16 |
17 | public final static int INT_SIZE = 32;
18 | // ASGD - higher thread counts improve performance on larger workloads
19 | public final static int LAYER_THREADS = 1;
20 | public final static int UPDATE_SIZE = 10;
21 |
22 | public static Random rand = new Random(System.currentTimeMillis());
23 | public static int randInt(int min, int max)
24 | {
25 | return rand.nextInt((max - min)) + min;
26 | }
27 |
28 | public static boolean randBoolean(double probability)
29 | {
30 | return rand.nextDouble() < probability;
31 | }
32 |
33 | public static BufferedWriter writerBZ2(final String path) throws IOException
34 | {
35 | return new BufferedWriter(new OutputStreamWriter(new BZip2CompressorOutputStream(new BufferedOutputStream(new FileOutputStream(path)))));
36 | }
37 |
38 | public static BufferedReader readerBZ2(final String path) throws IOException
39 | {
40 | return new BufferedReader(new InputStreamReader(new BZip2CompressorInputStream(new BufferedInputStream(new FileInputStream(path)))));
41 | }
42 |
43 | public static DataInputStream byteReaderBZ2(final String path) throws Exception
44 | {
45 |
46 | return new DataInputStream(new BZip2CompressorInputStream(new BufferedInputStream(new FileInputStream(path))));
47 | }
48 |
49 | public static DoubleMatrix vectorize(double[] data)
50 | {
51 | return vectorize(data, 0, data.length);
52 | }
53 |
54 | public static DoubleMatrix vectorize(double[] data, int offset, int length)
55 | {
56 | DoubleMatrix vector = DoubleMatrix.zeros(length);
57 | for(int idx = 0; idx < length; ++idx)
58 | {
59 | vector.put(idx, 0, data[offset + idx]);
60 | }
61 | return vector;
62 | }
63 |
64 | public static List<DoubleMatrix> mean_normalization(double[] sum, List<DoubleMatrix> data_list)
65 | {
66 | DoubleMatrix meanVector = new DoubleMatrix(sum);
67 | meanVector.divi(data_list.size());
68 | for(DoubleMatrix data : data_list)
69 | {
70 | data.subi(meanVector);
71 | }
72 | return data_list;
73 | }
74 |
75 | public static List<DoubleMatrix> range_normalization(double[] min, double[] max, List<DoubleMatrix> data_list)
76 | {
77 | DoubleMatrix minVector = new DoubleMatrix(min);
78 | DoubleMatrix maxVector = new DoubleMatrix(max);
79 | DoubleMatrix range = maxVector.sub(minVector); // assumes max[i] > min[i] for every feature
80 | for(DoubleMatrix data : data_list)
81 | {
82 | data.divi(range);
83 | }
84 | return data_list;
85 | }
86 |
87 | public static double gradient_check(NeuralNetwork NN, List<DoubleMatrix> data, double[] labels, int num_checks)
88 | {
89 | List<int[]> input_hashes = NN.computeHashes(data);
90 | final double delta = 0.0001;
91 |
92 | double max = 0.0;
93 | double[] original_params = NN.copyTheta();
94 | for(int n = 0; n < num_checks; ++n)
95 | {
96 | int randData = randInt(0, labels.length);
97 | int randIdx = randInt(0, NN.numTheta());
98 | double theta = NN.getTheta(randIdx);
99 |
100 | NN.execute(input_hashes.get(randData), data.get(randData), labels[randData], false);
101 | double gradient = NN.getGradient(randIdx);
102 | NN.setTheta(original_params);
103 |
104 | NN.setTheta(randIdx, theta-delta);
105 | NN.execute(input_hashes.get(randData), data.get(randData), labels[randData], false);
106 | double J0 = NN.getCost();
107 | NN.setTheta(original_params);
108 |
109 | NN.setTheta(randIdx, theta+delta);
110 | NN.execute(input_hashes.get(randData), data.get(randData), labels[randData], false);
111 | double J1 = NN.getCost();
112 | NN.setTheta(original_params);
113 |
114 | double est_gradient = (J1-J0) / (2*delta);
115 | double error = Math.abs(gradient - est_gradient);
116 | System.out.println("Error: " + error + " Gradient: " + gradient + " Est.Gradient: " + est_gradient);
117 | max = Math.max(max, error);
118 | }
119 | return max;
120 | }
121 |
122 | public static <E extends Thread> List<E> join(List<E> threads)
123 | {
124 | for(E t : threads)
125 | {
126 | try
127 | {
128 | t.join();
129 | }
130 | catch (InterruptedException e)
131 | {
132 | System.out.println("Thread interrupted"); Thread.currentThread().interrupt(); // restore the interrupt flag instead of swallowing it
133 | }
134 | }
135 | return threads;
136 | }
137 | }
138 |
--------------------------------------------------------------------------------
"/norb-small-test.bz2" 100 | 101 | Pair, double[]> training = NORBDataSet.loadDataSet(Util.readerBZ2(training_path), training_size) 102 | Pair, double[]> test = NORBDataSet.loadDataSet(Util.readerBZ2(test_path), test_size) 103 | execute(training.getLeft(), training.getRight(), test.getLeft(), test.getRight()); 104 | } 105 | 106 | void testRectangles() 107 | { 108 | test_size = 50000 109 | inputLayer = 784 110 | outputLayer = 2 111 | k = 98 112 | 113 | // Rectangles Data 114 | dataset = "Rectangles"; 115 | training_size = 12000; 116 | final String training_path = Util.DATAPATH + dataset + "/rectangles_im_train.amat.bz2"; 117 | final String test_path = Util.DATAPATH + dataset + "/rectangles_im_test.amat.bz2"; 118 | learning_rates = [1e-2, 1e-2, 5e-3, 1e-3, 1e-3, 1e-3] 119 | 120 | Pair, double[]> training = DLDataSet.loadDataSet(Util.readerBZ2(training_path), training_size, inputLayer) 121 | Pair, double[]> test = DLDataSet.loadDataSet(Util.readerBZ2(test_path), test_size, inputLayer) 122 | execute(training.getLeft(), training.getRight(), test.getLeft(), test.getRight()); 123 | } 124 | 125 | void testConvex() 126 | { 127 | test_size = 50000 128 | inputLayer = 784 129 | outputLayer = 2 130 | k = 98 131 | 132 | // Convex 133 | dataset = "Convex"; 134 | training_size = 8000; 135 | final String training_path = Util.DATAPATH + dataset + "/convex_train.amat.bz2"; 136 | final String test_path = Util.DATAPATH + dataset + "/convex_test.amat.bz2"; 137 | learning_rates = [1e-2, 1e-2, 5e-3, 1e-3, 1e-3, 1e-3] 138 | 139 | Pair, double[]> training = DLDataSet.loadDataSet(Util.readerBZ2(training_path), training_size, inputLayer) 140 | Pair, double[]> test = DLDataSet.loadDataSet(Util.readerBZ2(test_path), test_size, inputLayer) 141 | execute(training.getLeft(), training.getRight(), test.getLeft(), test.getRight()); 142 | } 143 | 144 | void construct(final int inputLayer, final int outputLayer) 145 | { 146 | // Hidden Layers 147 | hidden_layers = new ArrayList<>(); 148 | hidden_layers.add(new ReLUNeuronLayer(inputLayer, hiddenLayers[0], L2_Lambda)); 149 | for(int idx = 0; idx < hiddenLayers.length-1; ++idx) 150 | { 151 | hidden_layers.add(new ReLUNeuronLayer(hiddenLayers[idx], hiddenLayers[idx+1], L2_Lambda)); 152 | } 153 | 154 | // Output Layers 155 | NN_layers = new ArrayList<>(); 156 | NN_layers.addAll(hidden_layers); 157 | NN_layers.add(new SoftMaxNeuronLayer(hiddenLayers[hiddenLayers.length-1], outputLayer, L2_Lambda)); 158 | } 159 | 160 | void execute(List training_data, double[] training_labels, List test_data, double[] test_labels) 161 | { 162 | assert(size_limits.length == learning_rates.length) 163 | 164 | for(int size = min_layers; size <= max_layers; ++size) 165 | { 166 | hiddenLayers = new int[size] 167 | Arrays.fill(hiddenLayers, hidden_layer_size) 168 | 169 | int[] sum_pool = new int[size] 170 | Arrays.fill(sum_pool, hidden_pool_size) 171 | sum_pool[0] = k 172 | 173 | int[] bits = new int[size] 174 | Arrays.fill(bits, b) 175 | 176 | int[] tables = new int[size] 177 | Arrays.fill(tables, L) 178 | 179 | for(int idx = 0; idx < size_limits.length; ++idx) 180 | { 181 | double[] sl = new double[size] 182 | Arrays.fill(sl, size_limits[idx]) 183 | 184 | System.out.println(make_title()) 185 | construct(inputLayer, outputLayer) 186 | NN_parameters parameters 187 | try 188 | { 189 | parameters = new NN_parameters(Util.readerBZ2(Util.DATAPATH + dataset + "/" + Util.MODEL + title), NN_layers, sum_pool, bits, tables, learning_rates[idx], sl) 190 | } 191 | catch (Exception ignore) 192 | { 193 | parameters = 
new NN_parameters(NN_layers, sum_pool, bits, tables, learning_rates[idx], sl) 194 | } 195 | NeuralCoordinator NN = new NeuralCoordinator(Double.toString(size_limits[idx]), title, dataset, parameters, NN_layers, hidden_layers, L2_Lambda, new CrossEntropy()) 196 | long startTime = System.currentTimeMillis() 197 | NN.train(max_epoch, training_data, training_labels, test_data, test_labels) 198 | long estimatedTime = (System.currentTimeMillis() - startTime) / 1000 199 | System.out.println(estimatedTime) 200 | } 201 | } 202 | } 203 | } 204 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ## Scalable and Sustainable Deep Learning via Randomized Hashing 2 | 3 | **Look for major updates around ICLR 2018 deadline this November!** 4 | 5 | # Abstract 6 | Current deep learning architectures are growing larger in order to learn from complex datasets. These architectures require giant matrix multiplication operations to train millions of parameters. Conversely, there is another growing trend to bring deep learning to low-power, embedded devices. The matrix operations, associated with the training and testing of deep networks, are very expensive from a computational and energy standpoint. We present a novel hashing-based technique to drastically reduce the amount of computation needed to train and test neural networks. Our approach combines two recent ideas, Adaptive Dropout and Randomized Hashing for Maximum Inner Product Search (MIPS), to select the nodes with the highest activation efficiently. Our new algorithm for deep learning reduces the overall computational cost of the forward and backward propagation steps by operating on significantly fewer nodes. As a consequence, our algorithm uses only 5% of the total multiplications, while keeping within 1% of the accuracy of the original model on average. A unique property of the proposed hashing-based back-propagation is that the updates are always sparse. Due to the sparse gradient updates, our algorithm is ideally suited for asynchronous, parallel training, leading to near-linear speedup, as the number of cores increases. We demonstrate the scalability and sustainability (energy efficiency) of our proposed algorithm via rigorous experimental evaluations on several datasets. 7 | 8 | # Future Work 9 | 1. Implement a general LSH framework for GPUs with API support for TensorFlow, PyTorch, and MXNet 10 | 2. Build Scalable One-Shot Learning using Locality-Sensitive Hashing [https://github.com/RUSH-LAB/LSH_Memory] 11 | 3. Demonstrate Tera-Scale machine learning on a single machine using our algorithm, tailored for sparse, high-dimensional datasets (See Netflix VectorFlow) 12 | 13 | # References 14 | 1. [Scalable and Sustainable Deep Learning via Randomized Hashing (KDD 2017 - Oral)](http://dl.acm.org/citation.cfm?id=3098035) 15 | 2. [Efficient Class of LSH-Based Samplers](https://arxiv.org/abs/1703.05160) 16 | 3. [Learning to Remember Rare Events](https://arxiv.org/abs/1703.03129) 17 | 4. [Netflix VectorFlow](https://medium.com/@NetflixTechBlog/introducing-vectorflow-fe10d7f126b8) --------------------------------------------------------------------------------
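Editor's note — to make the abstract's node-selection idea concrete, here is a toy, self-contained sketch (not the repository's actual API; all class and variable names are illustrative). Signed random projections hash a vector into a bucket; only the neurons whose weight vectors share the input's bucket are treated as active. The production code uses b-bit hashes across L tables plus pooling (b = 6, L = 100 in LSH_DL_Experiments above), whereas this sketch uses a single 6-bit table:

```java
import java.util.*;

// Toy LSH-based node selection: neurons that collide with the input
// under a signed-random-projection hash are likely to have large inner
// products with it, so only those nodes would be activated.
public class LshSelectionSketch
{
    static final Random RAND = new Random(42);

    // One b-bit signed-random-projection hash of v: each bit is the
    // sign of v's projection onto a random hyperplane.
    static int srpHash(double[] v, double[][] planes)
    {
        int code = 0;
        for (double[] p : planes)
        {
            double dot = 0.0;
            for (int i = 0; i < v.length; ++i) dot += p[i] * v[i];
            code = (code << 1) | (dot >= 0 ? 1 : 0);
        }
        return code;
    }

    public static void main(String[] args)
    {
        int dim = 16, neurons = 100, bits = 6;
        double[][] planes = new double[bits][dim];
        for (double[] p : planes)
            for (int i = 0; i < dim; ++i) p[i] = RAND.nextGaussian();

        // Pre-hash every neuron's weight vector into one table.
        double[][] weights = new double[neurons][dim];
        Map<Integer, List<Integer>> table = new HashMap<>();
        for (int n = 0; n < neurons; ++n)
        {
            for (int i = 0; i < dim; ++i) weights[n][i] = RAND.nextGaussian();
            table.computeIfAbsent(srpHash(weights[n], planes), h -> new ArrayList<>()).add(n);
        }

        // A forward pass would now touch only the colliding neurons.
        double[] input = new double[dim];
        for (int i = 0; i < dim; ++i) input[i] = RAND.nextGaussian();
        List<Integer> active = table.getOrDefault(srpHash(input, planes), Collections.emptyList());
        System.out.println("Active nodes: " + active.size() + " of " + neurons);
    }
}
```

In the full system the tables are rebuilt periodically as the weights drift (NN_parameters.rebuildTables, called from NeuralCoordinator.train above), which keeps the retrieved candidates aligned with the true maximum-inner-product neurons.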