├── .gitattributes
├── .github
│   └── workflows
│       └── msvc-analysis.yml
├── .gitignore
├── Chapter1.md
├── Chapter1
│   ├── .clang-format
│   ├── Makefile
│   ├── NeuralNet1.cpp
│   ├── NeuralNet1.sln
│   ├── NeuralNet1.vcxproj
│   ├── NeuralNet1.vcxproj.filters
│   ├── packages.config
│   ├── stdafx.cpp
│   ├── stdafx.h
│   └── targetver.h
├── Chapter3.md
├── Chapter3
│   ├── .clang-format
│   ├── Makefile
│   ├── NeuralNet.h
│   ├── NeuralNet1.vcxproj
│   ├── NeuralNet1.vcxproj.filters
│   ├── NeuralNet2.cpp
│   ├── NeuralNet2.sln
│   ├── packages.config
│   ├── stdafx.cpp
│   ├── stdafx.h
│   └── targetver.h
├── Data
│   ├── t10k-images.idx3-ubyte
│   ├── t10k-labels.idx1-ubyte
│   ├── train-images.idx3-ubyte
│   └── train-labels.idx1-ubyte
├── Future.md
├── Future
│   ├── .clang-format
│   ├── CMakeLists.txt
│   ├── Makefile
│   ├── NeuralNet.h
│   ├── NeuralNet2F.sln
│   ├── NeuralNet2F.vcxproj
│   ├── NeuralNet2F.vcxproj.filters
│   ├── NeuralNet3.cpp
│   ├── packages.config
│   ├── stdafx.cpp
│   ├── stdafx.h
│   └── targetver.h
├── LICENSE
├── Loader
│   ├── .clang-format
│   └── mnist_loader.h
└── README.md

/.gitattributes:
--------------------------------------------------------------------------------
###############################################################################
# Set default behavior to automatically normalize line endings.
###############################################################################
* text=auto

###############################################################################
# Set default behavior for command prompt diff.
#
# This is needed for earlier builds of msysgit that do not have it on by
# default for csharp files.
# Note: This is only used by command line
###############################################################################
#*.cs diff=csharp

###############################################################################
# Set the merge driver for project and solution files
#
# Merging from the command prompt will add diff markers to the files if there
# are conflicts (Merging from VS is not affected by the settings below, in VS
# the diff markers are never inserted). Diff markers may cause the following
# file extensions to fail to load in VS. An alternative would be to treat
# these files as binary, and thus they will always conflict and require user
# intervention with every merge. To do so, just uncomment the entries below
###############################################################################
#*.sln merge=binary
#*.csproj merge=binary
#*.vbproj merge=binary
#*.vcxproj merge=binary
#*.vcproj merge=binary
#*.dbproj merge=binary
#*.fsproj merge=binary
#*.lsproj merge=binary
#*.wixproj merge=binary
#*.modelproj merge=binary
#*.sqlproj merge=binary
#*.wwaproj merge=binary

###############################################################################
# behavior for image files
#
# image files are treated as binary by default.
###############################################################################
#*.jpg binary
#*.png binary
#*.gif binary

###############################################################################
# diff behavior for common document formats
#
# Convert binary document formats to text before diffing them. This feature
# is only available from the command line. Turn it on by uncommenting the
# entries below.
53 | ############################################################################### 54 | #*.doc diff=astextplain 55 | #*.DOC diff=astextplain 56 | #*.docx diff=astextplain 57 | #*.DOCX diff=astextplain 58 | #*.dot diff=astextplain 59 | #*.DOT diff=astextplain 60 | #*.pdf diff=astextplain 61 | #*.PDF diff=astextplain 62 | #*.rtf diff=astextplain 63 | #*.RTF diff=astextplain 64 | -------------------------------------------------------------------------------- /.github/workflows/msvc-analysis.yml: -------------------------------------------------------------------------------- 1 | # This workflow uses actions that are not certified by GitHub. 2 | # They are provided by a third-party and are governed by 3 | # separate terms of service, privacy policy, and support 4 | # documentation. 5 | # 6 | # Find more information at: 7 | # https://github.com/microsoft/msvc-code-analysis-action 8 | 9 | name: Microsoft C++ Code Analysis 10 | 11 | on: 12 | push: 13 | branches: [ master ] 14 | pull_request: 15 | branches: [ master ] 16 | schedule: 17 | - cron: '19 22 * * 1' 18 | 19 | env: 20 | # Path to the CMake build directory. 21 | build: '${{ github.workspace }}/build' 22 | 23 | jobs: 24 | analyze: 25 | name: Analyze 26 | runs-on: windows-latest 27 | 28 | steps: 29 | - name: Checkout repository 30 | uses: actions/checkout@v2 31 | 32 | - name: Configure CMake 33 | run: cmake -B ${{ env.build }} 34 | 35 | # Build is not required unless generated source files are used 36 | # - name: Build CMake 37 | # run: cmake --build ${{ env.build }} 38 | 39 | - name: Initialize MSVC Code Analysis 40 | uses: microsoft/msvc-code-analysis-action@04825f6d9e00f87422d6bf04e1a38b1f3ed60d99 41 | # Provide a unique ID to access the sarif output path 42 | id: run-analysis 43 | with: 44 | cmakeBuildDirectory: ${{ env.build }} 45 | # Ruleset file that will determine what checks will be run 46 | ruleset: NativeRecommendedRules.ruleset 47 | 48 | # Upload SARIF file to GitHub Code Scanning Alerts 49 | - name: Upload SARIF to GitHub 50 | uses: github/codeql-action/upload-sarif@v1 51 | with: 52 | sarif_file: ${{ steps.run-analysis.outputs.sarif }} 53 | 54 | # Upload SARIF file as an Artifact to download and view 55 | # - name: Upload SARIF as an Artifact 56 | # uses: actions/upload-artifact@v2 57 | # with: 58 | # name: sarif-file 59 | # path: ${{ steps.run-analysis.outputs.sarif }} 60 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | ## Ignore Visual Studio temporary files, build results, and 2 | ## files generated by popular Visual Studio add-ons. 
3 | 4 | # User-specific files 5 | *.suo 6 | *.user 7 | *.userosscache 8 | *.sln.docstates 9 | 10 | # User-specific files (MonoDevelop/Xamarin Studio) 11 | *.userprefs 12 | 13 | # Build results 14 | [Dd]ebug/ 15 | [Dd]ebugPublic/ 16 | [Rr]elease/ 17 | [Rr]eleases/ 18 | x64/ 19 | x86/ 20 | bld/ 21 | [Bb]in/ 22 | [Oo]bj/ 23 | [Ll]og/ 24 | 25 | # Visual Studio 2015 cache/options directory 26 | .vs/ 27 | # Uncomment if you have tasks that create the project's static files in wwwroot 28 | #wwwroot/ 29 | 30 | # MSTest test Results 31 | [Tt]est[Rr]esult*/ 32 | [Bb]uild[Ll]og.* 33 | 34 | # NUNIT 35 | *.VisualState.xml 36 | TestResult.xml 37 | 38 | # Build Results of an ATL Project 39 | [Dd]ebugPS/ 40 | [Rr]eleasePS/ 41 | dlldata.c 42 | 43 | # DNX 44 | project.lock.json 45 | project.fragment.lock.json 46 | artifacts/ 47 | 48 | *_i.c 49 | *_p.c 50 | *_i.h 51 | *.ilk 52 | *.meta 53 | *.obj 54 | *.pch 55 | *.pdb 56 | *.pgc 57 | *.pgd 58 | *.rsp 59 | *.sbr 60 | *.tlb 61 | *.tli 62 | *.tlh 63 | *.tmp 64 | *.tmp_proj 65 | *.log 66 | *.vspscc 67 | *.vssscc 68 | .builds 69 | *.pidb 70 | *.svclog 71 | *.scc 72 | 73 | # Chutzpah Test files 74 | _Chutzpah* 75 | 76 | # Visual C++ cache files 77 | ipch/ 78 | *.aps 79 | *.ncb 80 | *.opendb 81 | *.opensdf 82 | *.sdf 83 | *.cachefile 84 | *.VC.db 85 | *.VC.VC.opendb 86 | 87 | # Visual Studio profiler 88 | *.psess 89 | *.vsp 90 | *.vspx 91 | *.sap 92 | 93 | # TFS 2012 Local Workspace 94 | $tf/ 95 | 96 | # Guidance Automation Toolkit 97 | *.gpState 98 | 99 | # ReSharper is a .NET coding add-in 100 | _ReSharper*/ 101 | *.[Rr]e[Ss]harper 102 | *.DotSettings.user 103 | 104 | # JustCode is a .NET coding add-in 105 | .JustCode 106 | 107 | # TeamCity is a build add-in 108 | _TeamCity* 109 | 110 | # DotCover is a Code Coverage Tool 111 | *.dotCover 112 | 113 | # NCrunch 114 | _NCrunch_* 115 | .*crunch*.local.xml 116 | nCrunchTemp_* 117 | 118 | # MightyMoose 119 | *.mm.* 120 | AutoTest.Net/ 121 | 122 | # Web workbench (sass) 123 | .sass-cache/ 124 | 125 | # Installshield output folder 126 | [Ee]xpress/ 127 | 128 | # DocProject is a documentation generator add-in 129 | DocProject/buildhelp/ 130 | DocProject/Help/*.HxT 131 | DocProject/Help/*.HxC 132 | DocProject/Help/*.hhc 133 | DocProject/Help/*.hhk 134 | DocProject/Help/*.hhp 135 | DocProject/Help/Html2 136 | DocProject/Help/html 137 | 138 | # Click-Once directory 139 | publish/ 140 | 141 | # Publish Web Output 142 | *.[Pp]ublish.xml 143 | *.azurePubxml 144 | # TODO: Comment the next line if you want to checkin your web deploy settings 145 | # but database connection strings (with potential passwords) will be unencrypted 146 | #*.pubxml 147 | *.publishproj 148 | 149 | # Microsoft Azure Web App publish settings. Comment the next line if you want to 150 | # checkin your Azure Web App publish settings, but sensitive information contained 151 | # in these scripts will be unencrypted 152 | PublishScripts/ 153 | 154 | # NuGet Packages 155 | *.nupkg 156 | # The packages folder can be ignored because of Package Restore 157 | **/packages/* 158 | # except build/, which is used as an MSBuild target. 
159 | !**/packages/build/ 160 | # Uncomment if necessary however generally it will be regenerated when needed 161 | #!**/packages/repositories.config 162 | # NuGet v3's project.json files produces more ignoreable files 163 | *.nuget.props 164 | *.nuget.targets 165 | 166 | # Microsoft Azure Build Output 167 | csx/ 168 | *.build.csdef 169 | 170 | # Microsoft Azure Emulator 171 | ecf/ 172 | rcf/ 173 | 174 | # Windows Store app package directories and files 175 | AppPackages/ 176 | BundleArtifacts/ 177 | Package.StoreAssociation.xml 178 | _pkginfo.txt 179 | 180 | # Visual Studio cache files 181 | # files ending in .cache can be ignored 182 | *.[Cc]ache 183 | # but keep track of directories ending in .cache 184 | !*.[Cc]ache/ 185 | 186 | # Others 187 | ClientBin/ 188 | ~$* 189 | *~ 190 | *.dbmdl 191 | *.dbproj.schemaview 192 | *.jfm 193 | *.pfx 194 | *.publishsettings 195 | node_modules/ 196 | orleans.codegen.cs 197 | 198 | # Since there are multiple workflows, uncomment next line to ignore bower_components 199 | # (https://github.com/github/gitignore/pull/1529#issuecomment-104372622) 200 | #bower_components/ 201 | 202 | # RIA/Silverlight projects 203 | Generated_Code/ 204 | 205 | # Backup & report files from converting an old project file 206 | # to a newer Visual Studio version. Backup files are not needed, 207 | # because we have git ;-) 208 | _UpgradeReport_Files/ 209 | Backup*/ 210 | UpgradeLog*.XML 211 | UpgradeLog*.htm 212 | 213 | # SQL Server files 214 | *.mdf 215 | *.ldf 216 | 217 | # Business Intelligence projects 218 | *.rdl.data 219 | *.bim.layout 220 | *.bim_*.settings 221 | 222 | # Depend files 223 | .depend 224 | 225 | # Object files 226 | *.o 227 | 228 | # Microsoft Fakes 229 | FakesAssemblies/ 230 | 231 | # GhostDoc plugin setting file 232 | *.GhostDoc.xml 233 | 234 | # Node.js Tools for Visual Studio 235 | .ntvs_analysis.dat 236 | 237 | # Visual Studio 6 build log 238 | *.plg 239 | 240 | # Visual Studio 6 workspace options file 241 | *.opt 242 | 243 | # Visual Studio LightSwitch build output 244 | **/*.HTMLClient/GeneratedArtifacts 245 | **/*.DesktopClient/GeneratedArtifacts 246 | **/*.DesktopClient/ModelManifest.xml 247 | **/*.Server/GeneratedArtifacts 248 | **/*.Server/ModelManifest.xml 249 | _Pvt_Extensions 250 | 251 | # Paket dependency manager 252 | .paket/paket.exe 253 | paket-files/ 254 | 255 | # FAKE - F# Make 256 | .fake/ 257 | 258 | # JetBrains Rider 259 | .idea/ 260 | *.sln.iml 261 | 262 | # CodeRush 263 | .cr/ 264 | 265 | # Python Tools for Visual Studio (PTVS) 266 | __pycache__/ 267 | *.pyc -------------------------------------------------------------------------------- /Chapter1.md: -------------------------------------------------------------------------------- 1 | # Chapter 1 2 | The C++ in the Chapter1 directory is unsurprisingly a port of the code of [chapter 1](http://neuralnetworksanddeeplearning.com/chap1.html) 3 | of the online book [Neural networks and deep learning](http://neuralnetworksanddeeplearning.com). 4 | 5 | ## The Network 6 | As explained in the book a neural network can be modelled by a list of biases, 1D vectors 7 | and a set of weights, 2D matrices. 
These can be defined in C++ as follows:
```c++
using BiasesVector = std::vector<ublas::vector<double>>;
using WeightsVector = std::vector<ublas::matrix<double>>;
BiasesVector biases;
WeightsVector weights;
```
The weights and biases are shaped by a single vector of layer sizes, which we store in m_sizes:
```C++
void PopulateZeroWeightsAndBiases(BiasesVector &b, WeightsVector &w) const
{
    for (size_t i = 1; i < m_sizes.size(); ++i)
    {
        b.push_back(ublas::zero_vector<double>(m_sizes[i]));
        w.push_back(ublas::zero_matrix<double>(m_sizes[i], m_sizes[i - 1]));
    }
}
```
This code is equivalent to the Python code:
```python
self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
self.weights = [np.random.randn(y, x)
                for x, y in zip(sizes[:-1], sizes[1:])]
```
But not quite equivalent, as the C++ biases and weights are still zero at this point.

## The Training Data
The training data is a vector of pairs of uBLAS vectors, where the first vector is the array of inputs (the x's in the Python) and the second vector is the expected result: a uBLAS vector which contains a 1 at the index of the expected answer.
```c++
using TrainingData = std::pair<ublas::vector<double>, ublas::vector<double>>;
```
By creating the alias TrainingData we add greater readability to the class definition. As the alias is not a distinct type, the class which loads the training data can spell the TrainingData out in full as:
```c++
std::vector<std::pair<ublas::vector<double>, ublas::vector<double>>>
```
and this interoperates with the TrainingData definition.

## Feed Forward

The equivalent of the feedforward function in C++ is as follows:
```c++
ublas::vector<double> feedforward(ublas::vector<double> a) const
{
    for (size_t i = 0; i < biases.size(); ++i)
    {
        ublas::vector<double> c = prod(weights[i], a) + biases[i];
        sigmoid(c);
        a = c;
    }
    return a;
}
```
This is of course the only function your application needs if you just want to classify inputs using a pre-learned set of weights.
In this function the prod function of the uBLAS library does the matrix multiplication, and the result is a 1D vector to which we add the biases.
uBLAS is an extensible library, and with a bit of effort I'm sure you could achieve:
``` c++
a = sigmoid(prod(weights[i], a) + biases[i]);
```
which would be no more verbose than the Python:
``` python
sigmoid(np.dot(w, a)+b)
```
By now, if you're unfamiliar with Python, you may be puzzled by some of the Python constructs used in network.py. This construct:
``` python
for b, w in zip(self.biases, self.weights):
```
is a Pythonic way of iterating through the weights and biases lists together. In C++ I've fallen back to writing a simple loop for this expression.

## Backpropagation

Let's have a brief look at the backprop function to explore the differences between the Python and C++ code. The first loop over the biases and weights
runs the same algorithm as the feedforward function, but stores the intermediate working, which we use later in the function.

The second part of the function calculates the derivatives using the backpropagation method. The first calculation involves the derivative of the cost, and the
remaining loops propagate this back through the rest of the net. In Python we can index an array with [-1] to obtain its last element. In the
C++ version of the code we set up the iterators we need for the calculation and move them backwards on each iteration.
```c++
// Populates the gradient for the cost function for the biases in the vector nabla_b
// and the weights in nabla_w
void backprop(const ublas::vector<double> &x, const ublas::vector<double> &y,
              BiasesVector &nabla_b, WeightsVector &nabla_w)
{
    auto activation = x;
    std::vector<ublas::vector<double>> activations; // Stores the activations of each layer
    activations.push_back(x);
    std::vector<ublas::vector<double>> zs; // The z vectors layer by layer
    for (size_t i = 0; i < biases.size(); ++i) {
        ublas::vector<double> z = prod(weights[i], activation) + biases[i];
        zs.push_back(z);
        activation = z;
        sigmoid(activation);
        activations.push_back(activation);
    }
    // backward pass
    auto iActivations = activations.end() - 1;
    auto izs = zs.end() - 1;
    sigmoid_prime(*izs);
    ublas::vector<double> delta = element_prod(cost_derivative(*iActivations, y), *izs);
    auto ib = nabla_b.end() - 1;
    auto iw = nabla_w.end() - 1;
    *ib = delta;
    iActivations--;
    *iw = outer_prod(delta, trans(*iActivations));

    auto iWeights = weights.end();
    while (iActivations != activations.begin())
    {
        izs--; iWeights--; iActivations--; ib--; iw--;
        sigmoid_prime(*izs);
        delta = element_prod(prod(trans(*iWeights), delta), *izs);
        *ib = delta;
        *iw = outer_prod(delta, trans(*iActivations));
    }
}
```
In the numpy library, multiplying two vectors with a*b multiplies them element by element; in uBLAS,
element_prod(a, b) is the equivalent. In Python the code np.dot(delta, activations[-l-1].transpose()) creates a matrix in which m[i,j]=a[i]*b[j];
the equivalent function in the uBLAS library is outer_prod.

## Evaluate

In the C++ version of the evaluate function I've used the max_element and distance functions to determine the location of the maximum
signal.
```c++
int evaluate(const std::vector<TrainingData> &td) const
{
    return count_if(td.begin(), td.end(), [this](const TrainingData &testElement) {
        auto res = feedforward(testElement.first);
        return (std::distance(res.begin(), max_element(res.begin(), res.end()))
                == std::distance(testElement.second.begin(), max_element(testElement.second.begin(), testElement.second.end()))
               );
    });
}
```

## Loading the MNIST data
The data used by this project was obtained from the [MNIST Database](http://yann.lecun.com/exdb/mnist/). The code to load the data is contained in
the Loader directory.

## Running the code
So you've downloaded the code, compiled it, and you wait, and wait, go and make a cup of tea, come back and still nothing. The debug version of this code
is almost certainly not using the vectorising abilities of your computer. If this is the case, please build an optimized version of the code,
and if your PC is similar to mine you should start to see output like this:
```
Epoch 0: 9137 / 10000
Epoch 1: 9294 / 10000
Epoch 2: 9306 / 10000
Epoch 3: 9396 / 10000
Epoch 4: 9420 / 10000
Epoch 5: 9428 / 10000
.
.
.
Epoch 26: 9522 / 10000
Epoch 27: 9530 / 10000
Epoch 28: 9518 / 10000
Epoch 29: 9527 / 10000
```
Similar accuracy to network.py. A result!
166 | -------------------------------------------------------------------------------- /Chapter1/.clang-format: -------------------------------------------------------------------------------- 1 | --- 2 | Language: Cpp 3 | # BasedOnStyle: LLVM 4 | AccessModifierOffset: -2 5 | AlignAfterOpenBracket: Align 6 | AlignConsecutiveAssignments: false 7 | AlignConsecutiveDeclarations: false 8 | AlignEscapedNewlines: Right 9 | AlignOperands: true 10 | AlignTrailingComments: true 11 | AllowAllParametersOfDeclarationOnNextLine: true 12 | AllowShortBlocksOnASingleLine: false 13 | AllowShortCaseLabelsOnASingleLine: false 14 | AllowShortFunctionsOnASingleLine: All 15 | AllowShortIfStatementsOnASingleLine: false 16 | AllowShortLoopsOnASingleLine: false 17 | AlwaysBreakAfterDefinitionReturnType: None 18 | AlwaysBreakAfterReturnType: None 19 | AlwaysBreakBeforeMultilineStrings: false 20 | AlwaysBreakTemplateDeclarations: false 21 | BinPackArguments: true 22 | BinPackParameters: true 23 | BraceWrapping: 24 | AfterClass: false 25 | AfterControlStatement: false 26 | AfterEnum: false 27 | AfterFunction: false 28 | AfterNamespace: false 29 | AfterObjCDeclaration: false 30 | AfterStruct: false 31 | AfterUnion: false 32 | BeforeCatch: false 33 | BeforeElse: false 34 | IndentBraces: false 35 | SplitEmptyFunction: true 36 | SplitEmptyRecord: true 37 | SplitEmptyNamespace: true 38 | BreakBeforeBinaryOperators: None 39 | BreakBeforeBraces: Attach 40 | BreakBeforeInheritanceComma: false 41 | BreakBeforeTernaryOperators: true 42 | BreakConstructorInitializersBeforeComma: false 43 | BreakConstructorInitializers: BeforeColon 44 | BreakAfterJavaFieldAnnotations: false 45 | BreakStringLiterals: true 46 | ColumnLimit: 120 47 | CommentPragmas: '^ IWYU pragma:' 48 | CompactNamespaces: false 49 | ConstructorInitializerAllOnOneLineOrOnePerLine: false 50 | ConstructorInitializerIndentWidth: 4 51 | ContinuationIndentWidth: 4 52 | Cpp11BracedListStyle: true 53 | DerivePointerAlignment: false 54 | DisableFormat: false 55 | ExperimentalAutoDetectBinPacking: false 56 | FixNamespaceComments: true 57 | ForEachMacros: 58 | - foreach 59 | - Q_FOREACH 60 | - BOOST_FOREACH 61 | IncludeCategories: 62 | - Regex: '^"(llvm|llvm-c|clang|clang-c)/' 63 | Priority: 2 64 | - Regex: '^(<|"(gtest|gmock|isl|json)/)' 65 | Priority: 3 66 | - Regex: '.*' 67 | Priority: 1 68 | IncludeIsMainRegex: '(Test)?$' 69 | IndentCaseLabels: false 70 | IndentWidth: 4 71 | IndentWrappedFunctionNames: false 72 | JavaScriptQuotes: Leave 73 | JavaScriptWrapImports: true 74 | KeepEmptyLinesAtTheStartOfBlocks: true 75 | MacroBlockBegin: '' 76 | MacroBlockEnd: '' 77 | MaxEmptyLinesToKeep: 1 78 | NamespaceIndentation: None 79 | ObjCBlockIndentWidth: 2 80 | ObjCSpaceAfterProperty: false 81 | ObjCSpaceBeforeProtocolList: true 82 | PenaltyBreakAssignment: 2 83 | PenaltyBreakBeforeFirstCallParameter: 19 84 | PenaltyBreakComment: 300 85 | PenaltyBreakFirstLessLess: 120 86 | PenaltyBreakString: 1000 87 | PenaltyExcessCharacter: 1000000 88 | PenaltyReturnTypeOnItsOwnLine: 60 89 | PointerAlignment: Right 90 | ReflowComments: true 91 | SortIncludes: true 92 | SortUsingDeclarations: true 93 | SpaceAfterCStyleCast: false 94 | SpaceAfterTemplateKeyword: true 95 | SpaceBeforeAssignmentOperators: true 96 | SpaceBeforeParens: ControlStatements 97 | SpaceInEmptyParentheses: false 98 | SpacesBeforeTrailingComments: 1 99 | SpacesInAngles: false 100 | SpacesInContainerLiterals: true 101 | SpacesInCStyleCastParentheses: false 102 | SpacesInParentheses: false 103 | SpacesInSquareBrackets: false 
Standard: Cpp11
TabWidth: 8
UseTab: Never
...

-------------------------------------------------------------------------------- /Chapter1/Makefile: --------------------------------------------------------------------------------
appname := NeuralNet

CC=gcc
CXX=g++
RM=rm -f
CXXFLAGS=-O3 -I../../boost_1_65_1 -I../Loader
LDFLAGS=-g
LDLIBS=

SRCS := $(shell find . -maxdepth 1 -name "*.cpp")
OBJS := $(patsubst %.cpp, %.o, $(SRCS))

all: $(appname)

$(appname): $(OBJS)
	$(CXX) $(LDFLAGS) -o $(appname) $(OBJS) $(LDLIBS)

depend: .depend

.depend: $(SRCS)
	$(RM) ./.depend
	$(CXX) $(CXXFLAGS) -MM $^>>./.depend;

clean:
	$(RM) $(OBJS)

distclean: clean
	$(RM) .depend

include .depend

-------------------------------------------------------------------------------- /Chapter1/NeuralNet1.cpp: --------------------------------------------------------------------------------
// NeuralNet1.cpp : Console application to demonstrate Machine learning
//
// An example written to implement the stochastic gradient descent learning algorithm
// for a feedforward neural network. Gradients are calculated using backpropagation.
//
// Code is written to be a C++ version of network.py from http://neuralnetworksanddeeplearning.com/chap1.html
// Variable and function names follow the names used in the original Python
//
// Uses the boost uBLAS library for linear algebra operations

#include "stdafx.h"

#include "boost\numeric\ublas\matrix.hpp"
#include "boost\numeric\ublas\vector.hpp"
#include "mnist_loader.h"
#include <algorithm>
#include <iostream>
#include <random>
#include <vector>

using namespace boost::numeric;

// Set up the random number generator
std::random_device rd;
std::mt19937 gen(rd());

// Randomize a ublas vector
void Randomize(ublas::vector<double> &vec) {
    std::normal_distribution<> d(0, 1);
    for (auto &e : vec) {
        e = d(gen);
    }
}

// Randomize a ublas matrix
void Randomize(ublas::matrix<double> &m) {
    std::normal_distribution<> d(0, 1);
    for (auto &e : m.data()) {
        e = d(gen);
    }
}

// The sigmoid function.
void sigmoid(ublas::vector<double> &v) {
    for (auto &iz : v) {
        iz = 1.0 / (1.0 + exp(-iz));
    }
}
// Derivative of the sigmoid function.
void sigmoid_prime(ublas::vector<double> &v) {
    for (auto &iz : v) {
        iz = 1.0 / (1.0 + exp(-iz));
        iz = iz * (1 - iz);
    }
}

class Network {
  private:
    std::vector<int> m_sizes;
    using BiasesVector = std::vector<ublas::vector<double>>;
    using WeightsVector = std::vector<ublas::matrix<double>>;
    BiasesVector biases;
    WeightsVector weights;

  public:
    // The vector sizes contains the number of neurons in the
    // respective layers of the network. For example, if the list
    // was {2, 3, 1} then it would be a three-layer network, with the
    // first layer containing 2 neurons, the second layer 3 neurons,
    // and the third layer 1 neuron. The biases and weights for the
    // network are initialized randomly, using a Gaussian
    // distribution with mean 0, and variance 1. Note that the first
    // layer is assumed to be an input layer, and by convention we
    // won't set any biases for those neurons, since biases are only
    // ever used in computing the outputs from later layers.
    Network(const std::vector<int> &sizes) : m_sizes(sizes) {
        PopulateZeroWeightsAndBiases(biases, weights);
        for (auto &b : biases)
            Randomize(b);
        for (auto &w : weights)
            Randomize(w);
    }
    // Initialise the array of biases and matrix of weights
    void PopulateZeroWeightsAndBiases(BiasesVector &b, WeightsVector &w) const {
        for (size_t i = 1; i < m_sizes.size(); ++i) {
            b.push_back(ublas::zero_vector<double>(m_sizes[i]));
            w.push_back(ublas::zero_matrix<double>(m_sizes[i], m_sizes[i - 1]));
        }
    }
    // Returns the output of the network if ``a`` is the input
    ublas::vector<double> feedforward(ublas::vector<double> a) const {
        for (size_t i = 0; i < biases.size(); ++i) {
            ublas::vector<double> c = prod(weights[i], a) + biases[i];
            sigmoid(c);
            a = c;
        }
        return a;
    }
    // Type definition of the Training data
    using TrainingData = std::pair<ublas::vector<double>, ublas::vector<double>>;
    using TrainingDataIterator = typename std::vector<TrainingData>::iterator;
    // Train the neural network using mini-batch stochastic
    // gradient descent. The training_data is a vector of pairs
    // representing the training inputs and the desired
    // outputs. The other non-optional parameters are
    // self-explanatory. If test_data is provided then the
    // network will be evaluated against the test data after each
    // epoch, and partial progress printed out. This is useful for
    // tracking progress, but slows things down substantially.
    void SGD(std::vector<TrainingData> training_data, int epochs, int mini_batch_size, double eta,
             std::vector<TrainingData> test_data) {
        for (auto j = 0; j < epochs; j++) {
            std::shuffle(training_data.begin(), training_data.end(), rd);
            for (size_t i = 0; i < training_data.size(); i += mini_batch_size) {
                auto iter = training_data.begin();
                std::advance(iter, i);
                update_mini_batch(iter, mini_batch_size, eta);
            }
            if (test_data.size() != 0)
                std::cout << "Epoch " << j << ": " << evaluate(test_data) << " / " << test_data.size() << std::endl;
            else
                std::cout << "Epoch " << j << " complete" << std::endl;
        }
    }
    // Update the network's weights and biases by applying
    // gradient descent using backpropagation to a single mini batch.
    // The "mini_batch" is a list of tuples "(x, y)", and "eta"
    // is the learning rate.
    void update_mini_batch(TrainingDataIterator td, int mini_batch_size, double eta) {
        std::vector<ublas::vector<double>> nabla_b;
        std::vector<ublas::matrix<double>> nabla_w;
        PopulateZeroWeightsAndBiases(nabla_b, nabla_w);
        for (auto i = 0; i < mini_batch_size; ++i, td++) {
            auto &x = td->first;  // training input
            auto &y = td->second; // expected result
            std::vector<ublas::vector<double>> delta_nabla_b;
            std::vector<ublas::matrix<double>> delta_nabla_w;
            PopulateZeroWeightsAndBiases(delta_nabla_b, delta_nabla_w);
            backprop(x, y, delta_nabla_b, delta_nabla_w);
            for (size_t k = 0; k < biases.size(); ++k) {
                nabla_b[k] += delta_nabla_b[k];
                nabla_w[k] += delta_nabla_w[k];
            }
        }
        for (size_t i = 0; i < biases.size(); ++i) {
            biases[i] -= eta / mini_batch_size * nabla_b[i];
            weights[i] -= eta / mini_batch_size * nabla_w[i];
        }
    }
    // Populates the gradient for the cost function for the biases in the vector nabla_b
    // and the weights in nabla_w
    void backprop(const ublas::vector<double> &x, const ublas::vector<double> &y, BiasesVector &nabla_b,
                  WeightsVector &nabla_w) {
        auto activation = x;
        std::vector<ublas::vector<double>> activations; // Stores the activations of each layer
        activations.push_back(x);
        std::vector<ublas::vector<double>> zs; // The z vectors layer by layer
        for (size_t i = 0; i < biases.size(); ++i) {
            ublas::vector<double> z = prod(weights[i], activation) + biases[i];
            zs.push_back(z);
            activation = z;
            sigmoid(activation);
            activations.push_back(activation);
        }
        // backward pass
        auto iActivations = activations.end() - 1;
        auto izs = zs.end() - 1;
        sigmoid_prime(*izs);
        ublas::vector<double> delta = element_prod(cost_derivative(*iActivations, y), *izs);
        auto ib = nabla_b.end() - 1;
        auto iw = nabla_w.end() - 1;
        *ib = delta;
        iActivations--;
        *iw = outer_prod(delta, trans(*iActivations));

        auto iWeights = weights.end();
        while (iActivations != activations.begin()) {
            izs--;
            iWeights--;
            iActivations--;
            ib--;
            iw--;
            sigmoid_prime(*izs);
            delta = element_prod(prod(trans(*iWeights), delta), *izs);
            *ib = delta;
            *iw = outer_prod(delta, trans(*iActivations));
        }
    }
    // Return the number of test inputs for which the neural
    // network outputs the correct result. Note that the neural
    // network's output is assumed to be the index of whichever
    // neuron in the final layer has the highest activation.
    int evaluate(const std::vector<TrainingData> &td) const {
        return count_if(td.begin(), td.end(), [this](const TrainingData &testElement) {
            auto res = feedforward(testElement.first);
            return (std::distance(res.begin(), max_element(res.begin(), res.end())) ==
                    std::distance(testElement.second.begin(),
                                  max_element(testElement.second.begin(), testElement.second.end())));
        });
    }
    // Return the vector of partial derivatives \partial C_x /
    // \partial a for the output activations.
203 | ublas::vector cost_derivative(const ublas::vector &output_activations, 204 | const ublas::vector &y) const { 205 | return output_activations - y; 206 | } 207 | }; 208 | 209 | int main() { 210 | std::vector td, testData; 211 | // Load training data 212 | try { 213 | mnist_loader loader("..\\Data\\train-images.idx3-ubyte", "..\\Data\\train-labels.idx1-ubyte", td); 214 | // Load test data 215 | mnist_loader loader2("..\\Data\\t10k-images.idx3-ubyte", "..\\Data\\t10k-labels.idx1-ubyte", testData); 216 | } 217 | catch (const char *ex) { 218 | std::cout << "Error: " << ex << std::endl; 219 | return 0; 220 | } 221 | Network net({ 784, 30, 10 }); 222 | net.SGD(td, 30, 10, 3.0, testData); 223 | 224 | return 0; 225 | } 226 | -------------------------------------------------------------------------------- /Chapter1/NeuralNet1.sln: -------------------------------------------------------------------------------- 1 |  2 | Microsoft Visual Studio Solution File, Format Version 12.00 3 | # Visual Studio 15 4 | VisualStudioVersion = 15.0.27004.2006 5 | MinimumVisualStudioVersion = 10.0.40219.1 6 | Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "NeuralNet1", "NeuralNet1.vcxproj", "{3E2E207B-E939-4B80-9F13-5CAE3E04A4C1}" 7 | EndProject 8 | Global 9 | GlobalSection(SolutionConfigurationPlatforms) = preSolution 10 | Debug|x64 = Debug|x64 11 | Debug|x86 = Debug|x86 12 | Release|x64 = Release|x64 13 | Release|x86 = Release|x86 14 | EndGlobalSection 15 | GlobalSection(ProjectConfigurationPlatforms) = postSolution 16 | {3E2E207B-E939-4B80-9F13-5CAE3E04A4C1}.Debug|x64.ActiveCfg = Debug|x64 17 | {3E2E207B-E939-4B80-9F13-5CAE3E04A4C1}.Debug|x64.Build.0 = Debug|x64 18 | {3E2E207B-E939-4B80-9F13-5CAE3E04A4C1}.Debug|x86.ActiveCfg = Debug|Win32 19 | {3E2E207B-E939-4B80-9F13-5CAE3E04A4C1}.Debug|x86.Build.0 = Debug|Win32 20 | {3E2E207B-E939-4B80-9F13-5CAE3E04A4C1}.Release|x64.ActiveCfg = Release|x64 21 | {3E2E207B-E939-4B80-9F13-5CAE3E04A4C1}.Release|x64.Build.0 = Release|x64 22 | {3E2E207B-E939-4B80-9F13-5CAE3E04A4C1}.Release|x86.ActiveCfg = Release|Win32 23 | {3E2E207B-E939-4B80-9F13-5CAE3E04A4C1}.Release|x86.Build.0 = Release|Win32 24 | EndGlobalSection 25 | GlobalSection(SolutionProperties) = preSolution 26 | HideSolutionNode = FALSE 27 | EndGlobalSection 28 | GlobalSection(ExtensibilityGlobals) = postSolution 29 | SolutionGuid = {A868641A-544F-4BD0-B707-DFC56CA860DA} 30 | EndGlobalSection 31 | EndGlobal 32 | -------------------------------------------------------------------------------- /Chapter1/NeuralNet1.vcxproj: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | Debug 6 | Win32 7 | 8 | 9 | Release 10 | Win32 11 | 12 | 13 | Debug 14 | x64 15 | 16 | 17 | Release 18 | x64 19 | 20 | 21 | 22 | 15.0 23 | {3E2E207B-E939-4B80-9F13-5CAE3E04A4C1} 24 | Win32Proj 25 | NeuralNet1 26 | 10.0.16299.0 27 | 28 | 29 | 30 | Application 31 | true 32 | v141 33 | Unicode 34 | 35 | 36 | Application 37 | false 38 | v141 39 | true 40 | Unicode 41 | 42 | 43 | Application 44 | true 45 | v141 46 | Unicode 47 | 48 | 49 | Application 50 | false 51 | v141 52 | true 53 | Unicode 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 | 72 | 73 | 74 | true 75 | 76 | 77 | true 78 | 79 | 80 | false 81 | 82 | 83 | false 84 | 85 | 86 | 87 | Use 88 | Level3 89 | Disabled 90 | true 91 | WIN32;_DEBUG;_CONSOLE;%(PreprocessorDefinitions);-D_SCL_SECURE_NO_WARNINGS;-D_SCL_SECURE_NO_WARNINGS 92 | ..\Loader;%(AdditionalIncludeDirectories) 93 | -D_SCL_SECURE_NO_WARNINGS 
%(AdditionalOptions) 94 | 95 | 96 | Console 97 | true 98 | 99 | 100 | 101 | 102 | Use 103 | Level3 104 | Disabled 105 | true 106 | _DEBUG;_CONSOLE;%(PreprocessorDefinitions);-D_SCL_SECURE_NO_WARNINGS;-D_SCL_SECURE_NO_WARNINGS 107 | ..\Loader;%(AdditionalIncludeDirectories) 108 | -D_SCL_SECURE_NO_WARNINGS %(AdditionalOptions) 109 | 110 | 111 | Console 112 | true 113 | 114 | 115 | 116 | 117 | Use 118 | Level3 119 | MaxSpeed 120 | true 121 | true 122 | true 123 | WIN32;NDEBUG;_CONSOLE;%(PreprocessorDefinitions);-D_SCL_SECURE_NO_WARNINGS;-D_SCL_SECURE_NO_WARNINGS 124 | ..\Loader;%(AdditionalIncludeDirectories) 125 | -D_SCL_SECURE_NO_WARNINGS %(AdditionalOptions) 126 | 127 | 128 | Console 129 | true 130 | true 131 | true 132 | 133 | 134 | 135 | 136 | Use 137 | Level3 138 | MaxSpeed 139 | true 140 | true 141 | true 142 | NDEBUG;_CONSOLE;%(PreprocessorDefinitions);-D_SCL_SECURE_NO_WARNINGS;-D_SCL_SECURE_NO_WARNINGS 143 | ..\Loader;%(AdditionalIncludeDirectories) 144 | -D_SCL_SECURE_NO_WARNINGS %(AdditionalOptions) 145 | 146 | 147 | Console 148 | true 149 | true 150 | true 151 | 152 | 153 | 154 | 155 | 156 | 157 | 158 | 159 | 160 | Create 161 | Create 162 | Create 163 | Create 164 | 165 | 166 | 167 | 168 | 169 | 170 | 171 | 172 | 173 | 174 | 175 | This project references NuGet package(s) that are missing on this computer. Use NuGet Package Restore to download them. For more information, see http://go.microsoft.com/fwlink/?LinkID=322105. The missing file is {0}. 176 | 177 | 178 | 179 | -------------------------------------------------------------------------------- /Chapter1/NeuralNet1.vcxproj.filters: -------------------------------------------------------------------------------- 1 |  2 | 3 | 4 | 5 | {4FC737F1-C7A5-4376-A066-2A32D752A2FF} 6 | cpp;c;cc;cxx;def;odl;idl;hpj;bat;asm;asmx 7 | 8 | 9 | {93995380-89BD-4b04-88EB-625FBE52EBFB} 10 | h;hh;hpp;hxx;hm;inl;inc;xsd 11 | 12 | 13 | {67DA6AB6-F800-4c08-8B7A-83BB121AAD01} 14 | rc;ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe;resx;tiff;tif;png;wav;mfcribbon-ms 15 | 16 | 17 | 18 | 19 | Header Files 20 | 21 | 22 | Header Files 23 | 24 | 25 | 26 | 27 | Source Files 28 | 29 | 30 | Source Files 31 | 32 | 33 | 34 | 35 | 36 | -------------------------------------------------------------------------------- /Chapter1/packages.config: -------------------------------------------------------------------------------- 1 |  2 | 3 | 4 | -------------------------------------------------------------------------------- /Chapter1/stdafx.cpp: -------------------------------------------------------------------------------- 1 | // stdafx.cpp : source file that includes just the standard includes 2 | // NeuralNet1.pch will be the pre-compiled header 3 | // stdafx.obj will contain the pre-compiled type information 4 | 5 | #include "stdafx.h" 6 | 7 | // TODO: reference any additional headers you need in STDAFX.H 8 | // and not in this file 9 | -------------------------------------------------------------------------------- /Chapter1/stdafx.h: -------------------------------------------------------------------------------- 1 | // stdafx.h : include file for standard system include files, 2 | // or project specific include files that are used frequently, but 3 | // are changed infrequently 4 | // 5 | 6 | #pragma once 7 | 8 | #include "targetver.h" 9 | 10 | #include 11 | #include 12 | 13 | 14 | 15 | // TODO: reference additional headers your program requires here 16 | -------------------------------------------------------------------------------- /Chapter1/targetver.h: 
--------------------------------------------------------------------------------
#pragma once

// Including SDKDDKVer.h defines the highest available Windows platform.

// If you wish to build your application for a previous Windows platform, include WinSDKVer.h and
// set the _WIN32_WINNT macro to the platform you wish to support before including SDKDDKVer.h.

#include <SDKDDKVer.h>

-------------------------------------------------------------------------------- /Chapter3.md: --------------------------------------------------------------------------------
# Chapter 3
The code written for chapter 1 was passable C++, but obviously not good enough for library code. You may ask: why did I choose double rather than float as the type to do the processing in? Perhaps float would be more efficient on your hardware? In this question lies the justification for templates. Here is how we convert the Network class to a template:
```c++
template <typename T>
class Network {
private:
    using BiasesVector = std::vector<ublas::vector<T>>;
    using WeightsVector = std::vector<ublas::matrix<T>>;
    std::vector<int> m_sizes;
    BiasesVector biases;
    WeightsVector weights;
```
## The cost functions
In Chapter 3 Michael discusses different cost functions we can use on the network. In C++ these can be implemented as cost policy classes. Here is the code for the QuadraticCost function:
```c++
template <typename T>
class QuadraticCost {
public:
    T cost_fn(const ublas::vector<T>& a,
              const ublas::vector<T>& y) const
    {
        return 0.5 * pow(norm_2(a - y), 2);
    }
    ublas::vector<T> cost_delta(const ublas::vector<T>& z, const ublas::vector<T>& a,
                                const ublas::vector<T>& y) const
    {
        auto zp = z;
        sigmoid_prime(zp);
        return element_prod(a - y, zp);
    }
};
```
And here is the code for the CrossEntropyCost function:
```c++
template <typename T>
class CrossEntropyCost {
public:
    // Return the cost associated with an output ``a`` and desired output
    // ``y``. Note that in the Python np.nan_to_num is used to ensure
    // numerical stability: if both ``a`` and ``y`` have a 1.0
    // in the same slot, then the expression (1 - y)*np.log(1 - a) returns nan.
    T cost_fn(const ublas::vector<T>& a,
              const ublas::vector<T>& y) const
    {
        T total(0);
        for (size_t i = 0; i < a.size(); ++i)
        {
            total += -y(i) * log(a(i)) - (1 - y(i)) * log(1 - a(i));
        }
        return total;
    }
    // Return the error delta from the output layer.
    ublas::vector<T> cost_delta(const ublas::vector<T>& z, const ublas::vector<T>& a,
                                const ublas::vector<T>& y) const
    {
        (void)z; // not used by design
        return a - y;
    }
};
```
The important feature to note with policy classes is that they must all expose the same interface.
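To make that contract concrete, here is a minimal sketch of a hypothetical third policy (a log-cosh cost, not part of this chapter's code). It only needs to supply the same two member functions, cost_fn and cost_delta, and the Network template will accept it:
```c++
// Hypothetical policy, sketched only to illustrate the interface contract.
// Cost: C = sum_i log(cosh(a_i - y_i)), so dC/da_i = tanh(a_i - y_i).
template <typename T>
class LogCoshCost {
public:
    T cost_fn(const ublas::vector<T>& a,
              const ublas::vector<T>& y) const
    {
        T total(0);
        for (size_t i = 0; i < a.size(); ++i)
            total += log(cosh(a(i) - y(i)));
        return total;
    }
    // Output-layer delta: element-wise product of dC/da and sigmoid'(z).
    ublas::vector<T> cost_delta(const ublas::vector<T>& z, const ublas::vector<T>& a,
                                const ublas::vector<T>& y) const
    {
        auto zp = z;
        sigmoid_prime(zp);
        ublas::vector<T> dCda = a - y;
        for (auto &e : dCda)
            e = tanh(e);
        return element_prod(dCda, zp);
    }
};
```
Any class with these two members can be dropped in, which is the whole point of the policy approach.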
We can decide at compile time which class we would like to use as our cost function by defining the Network class as follows:
```c++
template <typename T, typename CostPolicy>
class Network : private CostPolicy {
private:
    using BiasesVector = std::vector<ublas::vector<T>>;
    using WeightsVector = std::vector<ublas::matrix<T>>;
    std::vector<int> m_sizes;
    BiasesVector biases;
    WeightsVector weights;
```
To create an implementation of the Network using float and the CrossEntropyCost we can write:
```c++
NeuralNet::Network<float, NeuralNet::CrossEntropyCost<float>> net({ 784, 30, 10 });
```
Naturally that's not nice C++ to type out or read, so we define it as a type:
```c++
using NetCrossEntropyCost = NeuralNet::Network<float, NeuralNet::CrossEntropyCost<float>>;
NetCrossEntropyCost net({ 784, 30, 10 });
```
This helps when we want to obtain the definition of the training data:
```c++
NetCrossEntropyCost::TrainingData
// Instead of
NeuralNet::Network<float, NeuralNet::CrossEntropyCost<float>>::TrainingData
```

## Feedback function
Users of the Network class need to see feedback from each round of fitting, and a C++ lambda is a very clean approach to achieving this goal. It can be provided by adding a std::function parameter to our interface:
```c++
void SGD(typename std::vector<TrainingData>::iterator td_begin,
         typename std::vector<TrainingData>::iterator td_end,
         int epochs, int mini_batch_size, T eta, T lmbda,
         std::function<void(const Network &, int)> feedback)
```
By passing a reference to the Network, the user of the class can interrogate it for the current cost and accuracy of the network as follows:
```c++
NeuralNet1 net({ 784, 30, 10 });
net.SGD(td.begin(), td.end(), 30, 10, 0.5, Lmbda, [&testData, &td, Lmbda](const NeuralNet1 &network, int Epoch) {
    std::cout << "Epoch " << Epoch << " : " << network.accuracy(testData.begin(), testData.end()) << " / " << testData.size() << std::endl;
    std::cout << "Epoch " << Epoch << " : " << network.accuracy(td.begin(), td.end()) << " / " << td.size() << std::endl;
    std::cout << "Cost : " << network.total_cost(td.begin(), td.end(), Lmbda) << std::endl;
    std::cout << "Cost : " << network.total_cost(testData.begin(), testData.end(), Lmbda) << std::endl;
});
```
This, I hope you will agree, results in a nice clean interface for our class.
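If printing to std::cout is not what you want, the same hook can drive any other kind of reporting. Here is a minimal sketch, assuming the NeuralNet1, td, testData and Lmbda names from the example above, that records the per-epoch test accuracy for later inspection instead of printing it:
```c++
// Hypothetical variation: collect the accuracy per epoch rather than printing.
std::vector<int> history;
NeuralNet1 net({ 784, 30, 10 });
net.SGD(td.begin(), td.end(), 30, 10, 0.5, Lmbda,
        [&testData, &history](const NeuralNet1 &network, int /*epoch*/) {
            history.push_back(network.accuracy(testData.begin(), testData.end()));
        });
```
Because the parameter is a std::function, anything callable with the right signature works: a lambda, a free function or a functor.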
109 | -------------------------------------------------------------------------------- /Chapter3/.clang-format: -------------------------------------------------------------------------------- 1 | --- 2 | Language: Cpp 3 | # BasedOnStyle: LLVM 4 | AccessModifierOffset: -2 5 | AlignAfterOpenBracket: Align 6 | AlignConsecutiveAssignments: false 7 | AlignConsecutiveDeclarations: false 8 | AlignEscapedNewlines: Right 9 | AlignOperands: true 10 | AlignTrailingComments: true 11 | AllowAllParametersOfDeclarationOnNextLine: true 12 | AllowShortBlocksOnASingleLine: false 13 | AllowShortCaseLabelsOnASingleLine: false 14 | AllowShortFunctionsOnASingleLine: All 15 | AllowShortIfStatementsOnASingleLine: false 16 | AllowShortLoopsOnASingleLine: false 17 | AlwaysBreakAfterDefinitionReturnType: None 18 | AlwaysBreakAfterReturnType: None 19 | AlwaysBreakBeforeMultilineStrings: false 20 | AlwaysBreakTemplateDeclarations: false 21 | BinPackArguments: true 22 | BinPackParameters: true 23 | BraceWrapping: 24 | AfterClass: false 25 | AfterControlStatement: false 26 | AfterEnum: false 27 | AfterFunction: false 28 | AfterNamespace: false 29 | AfterObjCDeclaration: false 30 | AfterStruct: false 31 | AfterUnion: false 32 | BeforeCatch: false 33 | BeforeElse: false 34 | IndentBraces: false 35 | SplitEmptyFunction: true 36 | SplitEmptyRecord: true 37 | SplitEmptyNamespace: true 38 | BreakBeforeBinaryOperators: None 39 | BreakBeforeBraces: Attach 40 | BreakBeforeInheritanceComma: false 41 | BreakBeforeTernaryOperators: true 42 | BreakConstructorInitializersBeforeComma: false 43 | BreakConstructorInitializers: BeforeColon 44 | BreakAfterJavaFieldAnnotations: false 45 | BreakStringLiterals: true 46 | ColumnLimit: 120 47 | CommentPragmas: '^ IWYU pragma:' 48 | CompactNamespaces: false 49 | ConstructorInitializerAllOnOneLineOrOnePerLine: false 50 | ConstructorInitializerIndentWidth: 4 51 | ContinuationIndentWidth: 4 52 | Cpp11BracedListStyle: true 53 | DerivePointerAlignment: false 54 | DisableFormat: false 55 | ExperimentalAutoDetectBinPacking: false 56 | FixNamespaceComments: true 57 | ForEachMacros: 58 | - foreach 59 | - Q_FOREACH 60 | - BOOST_FOREACH 61 | IncludeCategories: 62 | - Regex: '^"(llvm|llvm-c|clang|clang-c)/' 63 | Priority: 2 64 | - Regex: '^(<|"(gtest|gmock|isl|json)/)' 65 | Priority: 3 66 | - Regex: '.*' 67 | Priority: 1 68 | IncludeIsMainRegex: '(Test)?$' 69 | IndentCaseLabels: false 70 | IndentWidth: 4 71 | IndentWrappedFunctionNames: false 72 | JavaScriptQuotes: Leave 73 | JavaScriptWrapImports: true 74 | KeepEmptyLinesAtTheStartOfBlocks: true 75 | MacroBlockBegin: '' 76 | MacroBlockEnd: '' 77 | MaxEmptyLinesToKeep: 1 78 | NamespaceIndentation: None 79 | ObjCBlockIndentWidth: 2 80 | ObjCSpaceAfterProperty: false 81 | ObjCSpaceBeforeProtocolList: true 82 | PenaltyBreakAssignment: 2 83 | PenaltyBreakBeforeFirstCallParameter: 19 84 | PenaltyBreakComment: 300 85 | PenaltyBreakFirstLessLess: 120 86 | PenaltyBreakString: 1000 87 | PenaltyExcessCharacter: 1000000 88 | PenaltyReturnTypeOnItsOwnLine: 60 89 | PointerAlignment: Right 90 | ReflowComments: true 91 | SortIncludes: true 92 | SortUsingDeclarations: true 93 | SpaceAfterCStyleCast: false 94 | SpaceAfterTemplateKeyword: true 95 | SpaceBeforeAssignmentOperators: true 96 | SpaceBeforeParens: ControlStatements 97 | SpaceInEmptyParentheses: false 98 | SpacesBeforeTrailingComments: 1 99 | SpacesInAngles: false 100 | SpacesInContainerLiterals: true 101 | SpacesInCStyleCastParentheses: false 102 | SpacesInParentheses: false 103 | SpacesInSquareBrackets: false 
Standard: Cpp11
TabWidth: 8
UseTab: Never
...

-------------------------------------------------------------------------------- /Chapter3/Makefile: --------------------------------------------------------------------------------
appname := NeuralNet2

CC=gcc
CXX=g++
RM=rm -f
CXXFLAGS=-O3 -I../../boost_1_65_1 -I../Loader
LDFLAGS=-g
LDLIBS=

SRCS := $(shell find . -maxdepth 1 -name "*.cpp")
OBJS := $(patsubst %.cpp, %.o, $(SRCS))

all: $(appname)

$(appname): $(OBJS)
	$(CXX) $(LDFLAGS) -o $(appname) $(OBJS) $(LDLIBS)

depend: .depend

.depend: $(SRCS)
	$(RM) ./.depend
	$(CXX) $(CXXFLAGS) -MM $^>>./.depend;

clean:
	$(RM) $(OBJS)

distclean: clean
	$(RM) .depend

include .depend

-------------------------------------------------------------------------------- /Chapter3/NeuralNet.h: --------------------------------------------------------------------------------
#pragma once
//
// Copyright (c) 2017
// Gareth Richards
//
// NeuralNet.h Definition for NeuralNet namespace contains the following classes
// Network - main class containing the implementation of the NeuralNet
// The following Cost policies can be applied to this class.
// Cost Policies:
//     QuadraticCost
//     CrossEntropyCost

#include "boost\numeric\ublas\matrix.hpp"
#include "boost\numeric\ublas\vector.hpp"
#include <algorithm>
#include <functional>
#include <numeric>
#include <random>

using namespace boost::numeric;

namespace NeuralNet {
// Set up the random number generator
std::random_device rd;
std::mt19937 gen(rd());

// Randomize a ublas vector
template <typename T> void Randomize(ublas::vector<T> &vec) {
    std::normal_distribution<> d(0, 1);
    for (auto &e : vec) {
        e = d(gen);
    }
}
// Randomize a ublas matrix
template <typename T> void Randomize(ublas::matrix<T> &m) {
    std::normal_distribution<> d(0, 1);
    T sx = sqrt(m.size2());
    for (auto &e : m.data()) {
        e = d(gen) / sx;
    }
}
// The sigmoid function.
template <typename T> void sigmoid(ublas::vector<T> &v) {
    for (auto &iv : v) {
        iv = 1.0 / (1.0 + exp(-iv));
    }
}
// Derivative of the sigmoid function.
template <typename T> void sigmoid_prime(ublas::vector<T> &v) {
    for (auto &iv : v) {
        iv = 1.0 / (1.0 + exp(-iv));
        iv = iv * (1.0 - iv);
    }
}

template <typename T> class QuadraticCost {
  public:
    T cost_fn(const ublas::vector<T> &a, const ublas::vector<T> &y) const {
        return 0.5 * pow(norm_2(a - y), 2);
    }
    ublas::vector<T> cost_delta(const ublas::vector<T> &z, const ublas::vector<T> &a,
                                const ublas::vector<T> &y) const {
        auto zp = z;
        sigmoid_prime(zp);
        return element_prod(a - y, zp);
    }
};

template <typename T> class CrossEntropyCost {
  public:
    // Return the cost associated with an output ``a`` and desired output
    // ``y``. Note that np.nan_to_num is used to ensure numerical
    // stability. In particular, if both ``a`` and ``y`` have a 1.0
    // in the same slot, then the expression (1 - y)*np.log(1 - a)
    // returns nan. The np.nan_to_num ensures that that is converted
    // to the correct value (0.0).
    T cost_fn(const ublas::vector<T> &a, const ublas::vector<T> &y) const {
        T total(0);
        for (size_t i = 0; i < a.size(); ++i) {
            total += -y(i) * log(a(i)) - (1 - y(i)) * log(1 - a(i));
        }
        return total;
    }
    // Return the error delta from the output layer. Note that the
    // parameter ``z`` is not used by the method. It is included in
    // the method's parameters in order to make the interface
    // consistent with the delta method for other cost classes.
    ublas::vector<T> cost_delta(const ublas::vector<T> &z, const ublas::vector<T> &a,
                                const ublas::vector<T> &y) const {
        (void)z; // not used by design
        return a - y;
    }
};

template <typename T, typename CostPolicy> class Network : private CostPolicy {
  private:
    using BiasesVector = std::vector<ublas::vector<T>>;
    using WeightsVector = std::vector<ublas::matrix<T>>;
    std::vector<int> m_sizes;
    BiasesVector biases;
    WeightsVector weights;

  public:
    Network(const std::vector<int> &sizes) : m_sizes(sizes) {
        PopulateZeroWeightsAndBiases(biases, weights);
        for (auto &b : biases)
            Randomize(b);
        for (auto &w : weights)
            Randomize(w);
    }
    // Initialize the array of biases and matrix of weights
    void PopulateZeroWeightsAndBiases(BiasesVector &b, WeightsVector &w) const {
        for (size_t i = 1; i < m_sizes.size(); ++i) {
            b.push_back(ublas::zero_vector<T>(m_sizes[i]));
            w.push_back(ublas::zero_matrix<T>(m_sizes[i], m_sizes[i - 1]));
        }
    }
    // Returns the output of the network if ``a`` is the input
    ublas::vector<T> feedforward(ublas::vector<T> a) const {
        for (size_t i = 0; i < biases.size(); ++i) {
            ublas::vector<T> c = prod(weights[i], a) + biases[i];
            sigmoid(c);
            a = c;
        }
        return a;
    }
    // Type definition of the Training data
    using TrainingData = std::pair<ublas::vector<T>, ublas::vector<T>>;
    using TrainingDataIterator = typename std::vector<TrainingData>::iterator;
    // Train the neural network using mini-batch stochastic
    // gradient descent. The training data is the iterator range
    // td_begin to td_end, a sequence of pairs representing the
    // training inputs and the desired outputs. The other
    // non-optional parameters are self-explanatory. The feedback
    // function is called with a reference to the network after each
    // epoch, which is useful for tracking progress, but slows things
    // down substantially.
    void SGD(TrainingDataIterator td_begin, TrainingDataIterator td_end, int epochs, int mini_batch_size, T eta,
             T lmbda, std::function<void(const Network &, int)> feedback) {
        for (auto j = 0; j < epochs; j++) {
            std::shuffle(td_begin, td_end, gen);
            for (auto td_i = td_begin; td_i < td_end; td_i += mini_batch_size) {
                update_mini_batch(td_i, mini_batch_size, eta, lmbda, std::distance(td_begin, td_end));
            }
            feedback(*this, j);
        }
    }
    // Update the network's weights and biases by applying
    // gradient descent using backpropagation to a single mini batch.
    // The "mini_batch" is a range of training pairs "(x, y)", "eta"
    // is the learning rate, "lmbda" the regularization parameter and
    // "n" the total size of the training data set.
    void update_mini_batch(TrainingDataIterator td, int mini_batch_size, T eta, T lmbda, int n) {
        std::vector<ublas::vector<T>> nabla_b;
        std::vector<ublas::matrix<T>> nabla_w;
        PopulateZeroWeightsAndBiases(nabla_b, nabla_w);
        for (auto i = 0; i < mini_batch_size; ++i, td++) {
            auto &x = td->first;  // training input
            auto &y = td->second; // expected result
            std::vector<ublas::vector<T>> delta_nabla_b;
            std::vector<ublas::matrix<T>> delta_nabla_w;
            PopulateZeroWeightsAndBiases(delta_nabla_b, delta_nabla_w);
            backprop(x, y, delta_nabla_b, delta_nabla_w);
            for (size_t j = 0; j < biases.size(); ++j) {
                nabla_b[j] += delta_nabla_b[j];
                nabla_w[j] += delta_nabla_w[j];
            }
        }
        for (size_t i = 0; i < biases.size(); ++i) {
            biases[i] -= eta / mini_batch_size * nabla_b[i];
            weights[i] = (1 - eta * (lmbda / n)) * weights[i] - (eta / mini_batch_size) * nabla_w[i];
        }
    }
    // Populates the gradient for the cost function for the biases in the vector
    // nabla_b and the weights in nabla_w
    void backprop(const ublas::vector<T> &x, const ublas::vector<T> &y, std::vector<ublas::vector<T>> &nabla_b,
                  std::vector<ublas::matrix<T>> &nabla_w) {
        auto activation = x;
        std::vector<ublas::vector<T>> activations; // Stores the activations of each layer
        activations.push_back(x);
        std::vector<ublas::vector<T>> zs; // The z vectors layer by layer
        for (size_t i = 0; i < biases.size(); ++i) {
            ublas::vector<T> z = prod(weights[i], activation) + biases[i];
            zs.push_back(z);
            activation = z;
            sigmoid(activation);
            activations.push_back(activation);
        }
        // backward pass
        auto iActivations = activations.end() - 1;
        auto izs = zs.end() - 1;
        sigmoid_prime(*izs);
        ublas::vector<T> delta = this->cost_delta(*izs, *iActivations, y);
        auto ib = nabla_b.end() - 1;
        auto iw = nabla_w.end() - 1;
        *ib = delta;
        iActivations--;
        *iw = outer_prod(delta, trans(*iActivations));
        auto iWeights = weights.end();
        while (iActivations != activations.begin()) {
            izs--;
            iWeights--;
            iActivations--;
            ib--;
            iw--;
            sigmoid_prime(*izs);
            delta = element_prod(prod(trans(*iWeights), delta), *izs);
            *ib = delta;
            *iw = outer_prod(delta, trans(*iActivations));
        }
    }
    // Return the number of inputs in the range for which the network
    // predicts the correct result, i.e. the index of the highest
    // activation in the output layer matches the expected answer.
    int accuracy(TrainingDataIterator td_begin, TrainingDataIterator td_end) const {
        return count_if(td_begin, td_end, [=](const TrainingData &testElement) {
            auto res = feedforward(testElement.first);
            return (std::distance(res.begin(), max_element(res.begin(), res.end())) ==
                    std::distance(testElement.second.begin(),
                                  max_element(testElement.second.begin(), testElement.second.end())));
        });
    }
    // Return the total cost for the data set ``data``.
221 | 222 | double total_cost(TrainingDataIterator td_begin, TrainingDataIterator td_end, T lmbda) const { 223 | T cost(0); 224 | cost = std::accumulate(td_begin, td_end, cost, [=](T cost, const TrainingData &td) { 225 | auto res = feedforward(td.first); 226 | return cost + this->cost_fn(res, td.second); 227 | }); 228 | size_t count = std::distance(td_begin, td_end); 229 | cost /= static_cast(count); 230 | T reg = std::accumulate(weights.begin(), weights.end(), 0.0, [lmbda, count](T reg, const ublas::matrix &w) { 231 | return reg + .5 * (lmbda * pow(norm_frobenius(w), 2)) / static_cast(count); 232 | }); 233 | return cost + reg; 234 | } 235 | }; 236 | 237 | } // namespace NeuralNet 238 | -------------------------------------------------------------------------------- /Chapter3/NeuralNet1.vcxproj: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | Debug 6 | Win32 7 | 8 | 9 | Release 10 | Win32 11 | 12 | 13 | Debug 14 | x64 15 | 16 | 17 | Release 18 | x64 19 | 20 | 21 | 22 | 15.0 23 | {3E2E207B-E939-4B80-9F13-5CAE3E04A4C1} 24 | Win32Proj 25 | NeuralNet1 26 | 10.0.16299.0 27 | NeuralNet2 28 | 29 | 30 | 31 | Application 32 | true 33 | v141 34 | Unicode 35 | 36 | 37 | Application 38 | false 39 | v141 40 | true 41 | Unicode 42 | 43 | 44 | Application 45 | true 46 | v141 47 | Unicode 48 | 49 | 50 | Application 51 | false 52 | v141 53 | true 54 | Unicode 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 | 72 | 73 | 74 | 75 | true 76 | 77 | 78 | true 79 | 80 | 81 | false 82 | 83 | 84 | false 85 | 86 | 87 | 88 | Use 89 | Level4 90 | Disabled 91 | true 92 | WIN32;_DEBUG;_CONSOLE;%(PreprocessorDefinitions);-D_SCL_SECURE_NO_WARNINGS;-D_SCL_SECURE_NO_WARNINGS 93 | ..\Loader;%(AdditionalIncludeDirectories) 94 | -D_SCL_SECURE_NO_WARNINGS %(AdditionalOptions) 95 | 96 | 97 | Console 98 | true 99 | 100 | 101 | 102 | 103 | Use 104 | Level4 105 | Disabled 106 | true 107 | _DEBUG;_CONSOLE;%(PreprocessorDefinitions);-D_SCL_SECURE_NO_WARNINGS;-D_SCL_SECURE_NO_WARNINGS 108 | ..\Loader;%(AdditionalIncludeDirectories) 109 | -D_SCL_SECURE_NO_WARNINGS %(AdditionalOptions) 110 | 111 | 112 | Console 113 | true 114 | 115 | 116 | 117 | 118 | Use 119 | Level4 120 | MaxSpeed 121 | true 122 | true 123 | true 124 | WIN32;NDEBUG;_CONSOLE;%(PreprocessorDefinitions);-D_SCL_SECURE_NO_WARNINGS;-D_SCL_SECURE_NO_WARNINGS 125 | ..\Loader;%(AdditionalIncludeDirectories) 126 | -D_SCL_SECURE_NO_WARNINGS %(AdditionalOptions) 127 | 128 | 129 | Console 130 | true 131 | true 132 | true 133 | 134 | 135 | 136 | 137 | Use 138 | Level4 139 | MaxSpeed 140 | true 141 | true 142 | true 143 | NDEBUG;_CONSOLE;%(PreprocessorDefinitions);-D_SCL_SECURE_NO_WARNINGS;-D_SCL_SECURE_NO_WARNINGS 144 | ..\Loader;%(AdditionalIncludeDirectories) 145 | -D_SCL_SECURE_NO_WARNINGS %(AdditionalOptions) 146 | 147 | 148 | Console 149 | true 150 | true 151 | true 152 | 153 | 154 | 155 | 156 | 157 | 158 | 159 | 160 | 161 | 162 | Create 163 | Create 164 | Create 165 | Create 166 | 167 | 168 | 169 | 170 | 171 | 172 | 173 | 174 | 175 | 176 | 177 | This project references NuGet package(s) that are missing on this computer. Use NuGet Package Restore to download them. For more information, see http://go.microsoft.com/fwlink/?LinkID=322105. The missing file is {0}. 
178 | 179 | 180 | 181 | 
--------------------------------------------------------------------------------
/Chapter3/NeuralNet1.vcxproj.filters:
--------------------------------------------------------------------------------
1 |  2 | 3 | 4 | 5 | {4FC737F1-C7A5-4376-A066-2A32D752A2FF} 6 | cpp;c;cc;cxx;def;odl;idl;hpj;bat;asm;asmx 7 | 8 | 9 | {93995380-89BD-4b04-88EB-625FBE52EBFB} 10 | h;hh;hpp;hxx;hm;inl;inc;xsd 11 | 12 | 13 | {67DA6AB6-F800-4c08-8B7A-83BB121AAD01} 14 | rc;ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe;resx;tiff;tif;png;wav;mfcribbon-ms 15 | 16 | 17 | 18 | 19 | Header Files 20 | 21 | 22 | Header Files 23 | 24 | 25 | Header Files 26 | 27 | 28 | 29 | 30 | Source Files 31 | 32 | 33 | Source Files 34 | 35 | 36 | 37 | 38 | 39 | 
--------------------------------------------------------------------------------
/Chapter3/NeuralNet2.cpp:
--------------------------------------------------------------------------------
1 | // NeuralNet2.cpp : Defines the entry point for the console application.
2 | //
3 | // An example written to implement the stochastic gradient descent learning
4 | // algorithm for a feed forward neural network. Gradients are calculated
5 | // using back propagation.
6 | //
7 | // Code is written to be a C++ version of network2.py from
8 | // http://neuralnetworksanddeeplearning.com/chap3.html Variable and function
9 | // names follow the names used in the original Python.
10 | //
11 | // This implementation aims to be slightly better C++ rather than Python
12 | // code ported to C++.
13 | //
14 | // Uses the boost ublas library for linear algebra operations.
15 | //
16 | //
17 | //
18 | 
19 | #include "stdafx.h"
20 | 
21 | #include "NeuralNet.h"
22 | #include "boost\numeric\ublas\matrix.hpp"
23 | #include "boost\numeric\ublas\vector.hpp"
24 | #include "mnist_loader.h"
25 | #include <functional>
26 | #include <iostream>
27 | #include <random>
28 | #include <string>
29 | #include <vector>
30 | 
31 | using namespace boost::numeric;
32 | using namespace NeuralNet;
33 | 
34 | int main() {
35 |     using NeuralNet1 = NeuralNet::Network<double, NeuralNet::CrossEntropyCost<double>>;
36 |     std::vector<NeuralNet1::TrainingData> td, testData;
37 |     try {
38 |         // Load training data
39 |         mnist_loader<double> loader("..\\Data\\train-images.idx3-ubyte", "..\\Data\\train-labels.idx1-ubyte", td);
40 |         // Load test data
41 |         mnist_loader<double> loader2("..\\Data\\t10k-images.idx3-ubyte", "..\\Data\\t10k-labels.idx1-ubyte", testData);
42 |     }
43 |     catch (const char *ex) {
44 |         std::cout << "Error " << ex << std::endl;
45 |         return 0;
46 |     }
47 |     double Lmbda = 5.0;
48 |     NeuralNet1 net({ 784, 30, 10 });
49 |     net.SGD(td.begin(), td.begin() + 1000, 400, 10, 0.5, Lmbda,
50 |             [&testData, &td, Lmbda](const NeuralNet1 &network, int Epoch) {
51 |                 std::cout << "Epoch " << Epoch << " : " << network.accuracy(testData.begin(), testData.end()) << " / "
52 |                           << testData.size() << std::endl;
53 |                 std::cout << "Epoch " << Epoch << " : " << network.accuracy(td.begin(), td.begin() + 1000) << " / "
54 |                           << 1000 << std::endl;
55 |                 std::cout << "Cost : " << network.total_cost(td.begin(), td.begin() + 1000, Lmbda) << std::endl;
56 |                 std::cout << "Cost : " << network.total_cost(testData.begin(), testData.end(), Lmbda) << std::endl;
57 |             });
58 | 
59 |     net.SGD(td.begin(), td.end(), 30, 10, 0.5, Lmbda, [&testData, &td, Lmbda](const NeuralNet1 &network, int Epoch) {
60 |         std::cout << "Epoch " << Epoch << " : " << network.accuracy(testData.begin(), testData.end()) << " / "
61 |                   << testData.size() << std::endl;
62 | std::cout << "Epoch " << Epoch << " : " << network.accuracy(td.begin(), td.end()) << " / " << td.size() 63 | << std::endl; 64 | std::cout << "Cost : " << network.total_cost(td.begin(), td.end(), Lmbda) << std::endl; 65 | std::cout << "Cost : " << network.total_cost(testData.begin(), testData.end(), Lmbda) << std::endl; 66 | }); 67 | 68 | return 0; 69 | } 70 | -------------------------------------------------------------------------------- /Chapter3/NeuralNet2.sln: -------------------------------------------------------------------------------- 1 |  2 | Microsoft Visual Studio Solution File, Format Version 12.00 3 | # Visual Studio 15 4 | VisualStudioVersion = 15.0.27004.2006 5 | MinimumVisualStudioVersion = 10.0.40219.1 6 | Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "NeuralNet1", "NeuralNet1.vcxproj", "{3E2E207B-E939-4B80-9F13-5CAE3E04A4C1}" 7 | EndProject 8 | Global 9 | GlobalSection(SolutionConfigurationPlatforms) = preSolution 10 | Debug|x64 = Debug|x64 11 | Debug|x86 = Debug|x86 12 | Release|x64 = Release|x64 13 | Release|x86 = Release|x86 14 | EndGlobalSection 15 | GlobalSection(ProjectConfigurationPlatforms) = postSolution 16 | {3E2E207B-E939-4B80-9F13-5CAE3E04A4C1}.Debug|x64.ActiveCfg = Debug|x64 17 | {3E2E207B-E939-4B80-9F13-5CAE3E04A4C1}.Debug|x64.Build.0 = Debug|x64 18 | {3E2E207B-E939-4B80-9F13-5CAE3E04A4C1}.Debug|x86.ActiveCfg = Debug|Win32 19 | {3E2E207B-E939-4B80-9F13-5CAE3E04A4C1}.Debug|x86.Build.0 = Debug|Win32 20 | {3E2E207B-E939-4B80-9F13-5CAE3E04A4C1}.Release|x64.ActiveCfg = Release|x64 21 | {3E2E207B-E939-4B80-9F13-5CAE3E04A4C1}.Release|x64.Build.0 = Release|x64 22 | {3E2E207B-E939-4B80-9F13-5CAE3E04A4C1}.Release|x86.ActiveCfg = Release|Win32 23 | {3E2E207B-E939-4B80-9F13-5CAE3E04A4C1}.Release|x86.Build.0 = Release|Win32 24 | EndGlobalSection 25 | GlobalSection(SolutionProperties) = preSolution 26 | HideSolutionNode = FALSE 27 | EndGlobalSection 28 | GlobalSection(ExtensibilityGlobals) = postSolution 29 | SolutionGuid = {A868641A-544F-4BD0-B707-DFC56CA860DA} 30 | EndGlobalSection 31 | EndGlobal 32 | -------------------------------------------------------------------------------- /Chapter3/packages.config: -------------------------------------------------------------------------------- 1 |  2 | 3 | 4 | -------------------------------------------------------------------------------- /Chapter3/stdafx.cpp: -------------------------------------------------------------------------------- 1 | // stdafx.cpp : source file that includes just the standard includes 2 | // NeuralNet1.pch will be the pre-compiled header 3 | // stdafx.obj will contain the pre-compiled type information 4 | 5 | #include "stdafx.h" 6 | 7 | // TODO: reference any additional headers you need in STDAFX.H 8 | // and not in this file 9 | -------------------------------------------------------------------------------- /Chapter3/stdafx.h: -------------------------------------------------------------------------------- 1 | // stdafx.h : include file for standard system include files, 2 | // or project specific include files that are used frequently, but 3 | // are changed infrequently 4 | // 5 | 6 | #pragma once 7 | 8 | #include "targetver.h" 9 | 10 | #include 11 | #include 12 | 13 | 14 | 15 | // TODO: reference additional headers your program requires here 16 | -------------------------------------------------------------------------------- /Chapter3/targetver.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | // Including SDKDDKVer.h 
defines the highest available Windows platform.
4 | 
5 | // If you wish to build your application for a previous Windows platform, include WinSDKVer.h and
6 | // set the _WIN32_WINNT macro to the platform you wish to support before including SDKDDKVer.h.
7 | 
8 | #include <SDKDDKVer.h>
9 | 
--------------------------------------------------------------------------------
/Data/t10k-images.idx3-ubyte:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/GarethRichards/Machine-Learning-CPP/919b41f38896f61d45cf1c26bfa8900164edc438/Data/t10k-images.idx3-ubyte
--------------------------------------------------------------------------------
/Data/t10k-labels.idx1-ubyte:
--------------------------------------------------------------------------------
(binary MNIST label data; not shown)
--------------------------------------------------------------------------------
/Data/train-images.idx3-ubyte:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/GarethRichards/Machine-Learning-CPP/919b41f38896f61d45cf1c26bfa8900164edc438/Data/train-images.idx3-ubyte
--------------------------------------------------------------------------------
/Data/train-labels.idx1-ubyte:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/GarethRichards/Machine-Learning-CPP/919b41f38896f61d45cf1c26bfa8900164edc438/Data/train-labels.idx1-ubyte
--------------------------------------------------------------------------------
/Future.md:
--------------------------------------------------------------------------------
1 | # Future
2 | In this section I'll look at how I've evolved the code from the Chapter 3 version and my plans to improve it in the future. The machine learning ideas
3 | implemented here are expounded on in greater depth in the [Deep Learning](http://neuralnetworksanddeeplearning.com/chap6.html) chapter of the book. There are some
4 | great new features in C++17, and I've tried to showcase some of them here, starting with the language features themselves.
5 | 
6 | ## C++ 17
7 | ### Structured bindings
8 | Structured bindings give us the ability to declare multiple variables from a pair or a tuple.
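For example (a minimal illustration of the feature itself, not code from the library):
```c++
std::pair<int, double> p{42, 0.5};
auto [count, rate] = p; // declares count == 42 and rate == 0.5 in one go
```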
The training data in the NeuralNet library is
9 | a vector containing pairs of uBLAS vectors, `std::vector<std::pair<ublas::vector<T>, ublas::vector<T>>>`. Before C++17, referencing this data looked as follows:
10 | ```c++
11 | auto nabla=std::accumulate(td,td+mini_batch_size,nabla0,[=](NetworkData &nabla,const TrainingData &td)
12 | {
13 |     const auto &x=td.first;
14 |     const auto &y=td.second;
15 | ```
16 | With C++17 this can be simplified to:
17 | ```c++
18 | auto nabla=std::accumulate(td,td+mini_batch_size,nabla0,[=](NetworkData &nabla,const TrainingData &td)
19 | {
20 |     const auto& [ x, y ] = td; // test data x, expected result y
21 | ```
22 | A definite win for readability.
23 | 
24 | ## Parallel STL
25 | When writing modern C++, a good practice is to look out for places where you can replace a hand-written loop with an algorithm from the standard
26 | library. The processing of the mini batch looked like an ideal case:
27 | ```c++
28 | std::vector<ublas::vector<T>> nabla_b;
29 | std::vector<ublas::matrix<T>> nabla_w;
30 | PopulateZeroWeightsAndBiases(nabla_b, nabla_w);
31 | for (auto i = 0; i < mini_batch_size; ++i, td++) {
32 |     const auto& [ x, y ] = *td; // test data x, expected result y
33 |     std::vector<ublas::vector<T>> delta_nabla_b;
34 |     std::vector<ublas::matrix<T>> delta_nabla_w;
35 |     PopulateZeroWeightsAndBiases(delta_nabla_b, delta_nabla_w);
36 |     backprop(x, y, delta_nabla_b, delta_nabla_w);
37 |     for (auto j = 0; j < biases.size(); ++j)
38 |     {
39 |         nabla_b[j] += delta_nabla_b[j];
40 |         nabla_w[j] += delta_nabla_w[j];
41 |     }
42 | }
43 | ```
44 | All I have written here is my very own version of the accumulate algorithm. But a little refactoring is required before I'm able to use
45 | std::accumulate. First, I refactored the code to create a class NetworkData which contains the matrices of weights and the vectors of
46 | biases. This class can also hold the randomization functions and handle its own creation. The finished product is here:
47 | ```c++
48 | NetworkData nabla0(nd.m_sizes);
49 | auto nabla=std::accumulate(td,td+mini_batch_size,nabla0,[=](NetworkData &nabla,const TrainingData &td)
50 | {
51 |     const auto& [ x, y ] = td; // test data x, expected result y
52 |     NetworkData delta_nabla(this->nd.m_sizes);
53 |     backprop(x, y, delta_nabla);
54 |     nabla += delta_nabla;
55 |     return nabla;
56 | });
57 | ```
58 | Much nicer and more readable. C++11 introduced threads to the standard library, but if you needed to run a loop over many threads it
59 | was easier to rely on solutions such as the [Parallel Patterns Library](https://msdn.microsoft.com/en-us/library/dd492418.aspx) rather
60 | than on a standard-library C++ solution. C++17 gives us the [Extensions for parallelism](http://en.cppreference.com/w/cpp/experimental/parallelism),
61 | which allow you to set execution policies on your STL algorithms.
62 | Accumulate is not the simplest algorithm to parallelize, as you require guarded access to the item keeping
63 | the running total. The go-to algorithm for simple parallelism is for_each; since I'm accumulating the result, I need a mutex to
64 | avoid race conditions.
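An execution policy is simply an extra first argument passed to the algorithm. As a quick illustration of a policy in isolation (a sketch written for this page, not code from the library):
```c++
#include <execution>
#include <numeric>
#include <vector>

int main() {
    std::vector<double> v(1'000'000, 1.0);
    // std::reduce may reorder and parallelize the summation, which is why
    // it requires an associative operation; std::execution::par requests
    // a parallel run.
    double sum = std::reduce(std::execution::par, v.begin(), v.end(), 0.0);
    return sum == 1'000'000.0 ? 0 : 1;
}
```
Applied to the mini-batch processing, the guarded for_each version looks like this: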
65 | ```c++
66 | NetworkData nabla(nd.m_sizes);
67 | std::mutex mtx; // mutex for critical section
68 | for_each(std::execution::par, td, td + mini_batch_size, [=, &nabla, &mtx](const TrainingData &td)
69 | {
70 |     const auto& [ x, y ] = td; // test data x, expected result y
71 |     NetworkData delta_nabla(this->nd.m_sizes);
72 |     backprop(x, y, delta_nabla);
73 |     // critical section
74 |     std::lock_guard guard(mtx);
75 |     nabla += delta_nabla;
76 | });
77 | 
78 | ```
79 | An important note: in this example the work done in the loop before the critical section takes many more clock
80 | cycles than the addition performed inside it. It's important to remember that entering a critical
81 | section is an expensive operation, and executing the critical section constantly will cause your application to run slower
82 | than the single-threaded version. The golden rule here is: test and measure.
83 | 
84 | ## Activation functions
85 | The previous two versions of this code used the sigmoid activation function. Clearly a neural net library should provide a set of activation functions and a way to add new ones. An activation class needs to contain a definition of the activation function and its derivative. The activation function is used in the forward pass through the neural net and the derivative is needed to back-propagate the errors through the network.
86 | This is the class for the sigmoid function:
87 | ```c++
88 | // The sigmoid function.
89 | template <typename T> class SigmoidActivation {
90 |   public:
91 |     void Activation(ublas::vector<T> &v) const {
92 |         constexpr T one = 1.0;
93 |         for (auto &iv : v) {
94 |             iv = one / (one + exp(-iv));
95 |         }
96 |     }
97 |     void ActivationPrime(ublas::vector<T> &v) const {
98 |         constexpr T one = 1.0;
99 |         for (auto &iv : v) {
100 |             iv = one / (one + exp(-iv));
101 |             iv = iv * (one - iv);
102 |         }
103 |     }
104 | };
105 | ```
106 | Using different policy classes one can easily create neural nets with different activation functions:
107 | ```c++
108 | using NeuralNet1 = NeuralNet::Network<float, NeuralNet::CrossEntropyCost<float>,
109 |                                       NeuralNet::ReLUActivation<float>>;
110 | 
111 | using NeuralNet2 = NeuralNet::Network<float, NeuralNet::CrossEntropyCost<float>,
112 |                                       NeuralNet::TanhActivation<float>>;
113 | 
114 | 
115 | ```
116 | In this example two possible networks have been defined: one using rectified linear units and the other the tanh activation function. Modifying the library to add a new activation function is very simple. If you're inspired enough to try, Wikipedia has a list [here](https://en.wikipedia.org/wiki/Activation_function), but remember that eta, the size of the step taken during backpropagation, may need to change when you change the activation function.
117 | 
118 | ## Changing the step parameter eta
119 | Another feature a neural net library needs is the ability to adjust the step size as the fitting progresses. In this C++ library a feedback function can be given to the SGD function, permitting the user to change eta, the step size, between epochs.
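To support this, the feedback callback takes eta by reference, so assigning to it inside the callback changes the step size used for the following epochs. The SGD declaration looks roughly like this (a sketch reconstructed from the Future version of the header):
```c++
void SGD(TrainingDataIterator td_begin, TrainingDataIterator td_end,
         int epochs, int mini_batch_size, T eta, T lmbda,
         std::function<void(const Network &, int, T &)> feedback);
```
In the example below, the feedback function reports progress and decays eta by 5% after every epoch: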
120 | ```c++
121 | NeuralNet1 net2({ 784, 60, 10 });
122 | net2.SGD(td.begin(), td.end(), 60, 100, eta, Lmbda,
123 |          [&periodStart, &Lmbda, &testData, &td](const NeuralNet1 &network, int Epoch, float &eta) {
124 |              // eta can be manipulated in the feedback function
125 |              auto end = std::chrono::high_resolution_clock::now();
126 |              std::chrono::duration<float> diff = end - periodStart;
127 |              std::cout << "Epoch " << Epoch << " time taken: " << diff.count() << "\n";
128 |              std::cout << "Test accuracy : " << network.accuracy(testData.begin(), testData.end()) << " / "
129 |                        << testData.size() << "\n";
130 |              std::cout << "Training accuracy : " << network.accuracy(td.begin(), td.end()) << " / " << td.size()
131 |                        << "\n";
132 |              std::cout << "Cost Training: " << network.total_cost(td.begin(), td.end(), Lmbda) << "\n";
133 |              std::cout << "Cost Test : " << network.total_cost(testData.begin(), testData.end(), Lmbda)
134 |                        << std::endl;
135 |              eta *= .95;
136 |              periodStart = std::chrono::high_resolution_clock::now();
137 |          });
138 | ```
139 | 
140 | ## To Do
141 | There are many improvements which could be made to the library. If anyone wishes to extend their understanding of machine learning concepts
142 | and C++, I've created a list of interesting changes which could be made, in what I believe is ascending order of difficulty:
143 | 1. Add code to load and save a Network.
144 | 2. Add different regularization schemes.
145 | 3. Add a Dropout scheme to the code.
146 | 4. Add a Convolutional neural network scheme to the library.
147 | 
--------------------------------------------------------------------------------
/Future/.clang-format:
--------------------------------------------------------------------------------
1 | ---
2 | Language: Cpp
3 | # BasedOnStyle: LLVM
4 | AccessModifierOffset: -2
5 | AlignAfterOpenBracket: Align
6 | AlignConsecutiveAssignments: false
7 | AlignConsecutiveDeclarations: false
8 | AlignEscapedNewlines: Right
9 | AlignOperands: true
10 | AlignTrailingComments: true
11 | AllowAllParametersOfDeclarationOnNextLine: true
12 | AllowShortBlocksOnASingleLine: false
13 | AllowShortCaseLabelsOnASingleLine: false
14 | AllowShortFunctionsOnASingleLine: All
15 | AllowShortIfStatementsOnASingleLine: false
16 | AllowShortLoopsOnASingleLine: false
17 | AlwaysBreakAfterDefinitionReturnType: None
18 | AlwaysBreakAfterReturnType: None
19 | AlwaysBreakBeforeMultilineStrings: false
20 | AlwaysBreakTemplateDeclarations: false
21 | BinPackArguments: true
22 | BinPackParameters: true
23 | BraceWrapping:
24 |   AfterClass: false
25 |   AfterControlStatement: false
26 |   AfterEnum: false
27 |   AfterFunction: false
28 |   AfterNamespace: false
29 |   AfterObjCDeclaration: false
30 |   AfterStruct: false
31 |   AfterUnion: false
32 |   BeforeCatch: false
33 |   BeforeElse: false
34 |   IndentBraces: false
35 |   SplitEmptyFunction: true
36 |   SplitEmptyRecord: true
37 |   SplitEmptyNamespace: true
38 | BreakBeforeBinaryOperators: None
39 | BreakBeforeBraces: Attach
40 | BreakBeforeInheritanceComma: false
41 | BreakBeforeTernaryOperators: true
42 | BreakConstructorInitializersBeforeComma: false
43 | BreakConstructorInitializers: BeforeColon
44 | BreakAfterJavaFieldAnnotations: false
45 | BreakStringLiterals: true
46 | ColumnLimit: 120
47 | CommentPragmas: '^ IWYU pragma:'
48 | CompactNamespaces: false
49 | ConstructorInitializerAllOnOneLineOrOnePerLine: false
50 | ConstructorInitializerIndentWidth: 4
51 | ContinuationIndentWidth: 4
52 | Cpp11BracedListStyle: true
53 | DerivePointerAlignment: false
54 | DisableFormat: false
55 | 
ExperimentalAutoDetectBinPacking: false 56 | FixNamespaceComments: true 57 | ForEachMacros: 58 | - foreach 59 | - Q_FOREACH 60 | - BOOST_FOREACH 61 | IncludeCategories: 62 | - Regex: '^"(llvm|llvm-c|clang|clang-c)/' 63 | Priority: 2 64 | - Regex: '^(<|"(gtest|gmock|isl|json)/)' 65 | Priority: 3 66 | - Regex: '.*' 67 | Priority: 1 68 | IncludeIsMainRegex: '(Test)?$' 69 | IndentCaseLabels: false 70 | IndentWidth: 4 71 | IndentWrappedFunctionNames: false 72 | JavaScriptQuotes: Leave 73 | JavaScriptWrapImports: true 74 | KeepEmptyLinesAtTheStartOfBlocks: true 75 | MacroBlockBegin: '' 76 | MacroBlockEnd: '' 77 | MaxEmptyLinesToKeep: 1 78 | NamespaceIndentation: None 79 | ObjCBlockIndentWidth: 2 80 | ObjCSpaceAfterProperty: false 81 | ObjCSpaceBeforeProtocolList: true 82 | PenaltyBreakAssignment: 2 83 | PenaltyBreakBeforeFirstCallParameter: 19 84 | PenaltyBreakComment: 300 85 | PenaltyBreakFirstLessLess: 120 86 | PenaltyBreakString: 1000 87 | PenaltyExcessCharacter: 1000000 88 | PenaltyReturnTypeOnItsOwnLine: 60 89 | PointerAlignment: Right 90 | ReflowComments: true 91 | SortIncludes: true 92 | SortUsingDeclarations: true 93 | SpaceAfterCStyleCast: false 94 | SpaceAfterTemplateKeyword: true 95 | SpaceBeforeAssignmentOperators: true 96 | SpaceBeforeParens: ControlStatements 97 | SpaceInEmptyParentheses: false 98 | SpacesBeforeTrailingComments: 1 99 | SpacesInAngles: false 100 | SpacesInContainerLiterals: true 101 | SpacesInCStyleCastParentheses: false 102 | SpacesInParentheses: false 103 | SpacesInSquareBrackets: false 104 | Standard: Cpp11 105 | TabWidth: 8 106 | UseTab: Never 107 | ... 108 | 109 | -------------------------------------------------------------------------------- /Future/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | cmake_minimum_required(VERSION 3.11) 2 | 3 | project(NeuralNet) 4 | 5 | set(CMAKE_CXX_STANDARD 20) 6 | set(CMAKE_CXX_STANDARD_REQUIRED ON) 7 | 8 | find_package(Boost 1.78 REQUIRED) 9 | 10 | INCLUDE_DIRECTORIES("../Loader") 11 | 12 | add_executable(NeuralNet NeuralNet3.cpp) 13 | target_link_libraries(NeuralNet PUBLIC Boost::boost) 14 | -------------------------------------------------------------------------------- /Future/Makefile: -------------------------------------------------------------------------------- 1 | appname := NeuralNet3 2 | 3 | CC=gcc 4 | CXX=g++ 5 | RM=rm -f 6 | CXXFLAGS=-O3 -I../../boost_1_65_1 -I../Loader -std=gnu++1z 7 | LDFLAGS=-g 8 | LDLIBS= 9 | 10 | # clang++ -Ipackages\boost.1.65.1.0\lib\native\include -I../loader -std=c++17 -O3 -D_SILENCE_CXX17_ITERATOR_BASE_CLASS_DEPRECATION_WARNING NeuralNet3.cpp -D_SILENCE_CXX17_OLD_ALLOCATOR_MEMBERS_DEPRECATION_WARNING 11 | 12 | SRCS := $(shell find . 
-maxdepth 1 -name "*.cpp")
13 | OBJS := $(patsubst %.cpp, %.o, $(SRCS))
14 | 
15 | all: $(appname)
16 | 
17 | $(appname): $(OBJS)
18 | 	$(CXX) $(LDFLAGS) -o $(appname) $(OBJS) $(LDLIBS)
19 | 
20 | depend: .depend
21 | 
22 | .depend: $(SRCS)
23 | 	$(RM) ./.depend
24 | 	$(CXX) $(CXXFLAGS) -MM $^>>./.depend;
25 | 
26 | clean:
27 | 	$(RM) $(OBJS)
28 | 
29 | distclean: clean
30 | 	$(RM) .depend
31 | 
32 | include .depend
33 | 
--------------------------------------------------------------------------------
/Future/NeuralNet.h:
--------------------------------------------------------------------------------
1 | #pragma once
2 | //
3 | // Copyright (c) 2017
4 | // Gareth Richards
5 | //
6 | // NeuralNet.h - definitions for the NeuralNet namespace, which contains the following classes:
7 | // Network - main class containing the implementation of the neural net
8 | // The following Cost policies can be applied to this class.
9 | // Cost Policies:
10 | //   QuadraticCost
11 | //   CrossEntropyCost
12 | // Activation Policies:
13 | //   SigmoidActivation
14 | //   TanhActivation
15 | //   ReLUActivation
16 | 
17 | #include "boost/numeric/ublas/matrix.hpp"
18 | #include "boost/numeric/ublas/vector.hpp"
19 | #include <algorithm>
20 | #include <concepts>
21 | #include <execution>
22 | #include <functional>
23 | #include <iostream>
24 | #include <numeric>
25 | #include <random>
26 | #include <ranges>
27 | 
28 | using namespace boost::numeric;
29 | 
30 | namespace NeuralNet {
31 | 
32 | // The sigmoid function.
33 | template <typename T>
34 | class SigmoidActivation {
35 |   public:
36 |     void Activation(ublas::vector<T> &v) const {
37 |         constexpr T one = 1.0;
38 |         std::ranges::for_each(v, [](T &iv) { iv = one / (one + exp(-iv)); });
39 |     }
40 |     void ActivationPrime(ublas::vector<T> &v) const {
41 |         constexpr T one = 1.0;
42 |         std::ranges::for_each(v, [](T &iv) {
43 |             iv = one / (one + exp(-iv));
44 |             iv = iv * (one - iv);
45 |         });
46 |     }
47 | };
48 | 
49 | // The tanh function.
50 | template <typename T>
51 | class TanhActivation {
52 |   public:
53 |     void Activation(ublas::vector<T> &v) const {
54 |         constexpr T one = 1.0;
55 |         constexpr T two = 2.0;
56 |         for (auto &iv : v) {
57 |             iv = (one + tanh(iv)) / two;
58 |         }
59 |     }
60 |     void ActivationPrime(ublas::vector<T> &v) const {
61 |         constexpr T two = 2.0;
62 |         for (auto &iv : v) {
63 |             iv = pow(two / (exp(-iv) + exp(iv)), two) / two;
64 |         }
65 |     }
66 | };
67 | 
68 | // The ReLU function.
69 | template <typename T>
70 | class ReLUActivation {
71 |   public:
72 |     void Activation(ublas::vector<T> &v) const {
73 |         constexpr T zero = 0.0;
74 |         for (auto &iv : v) {
75 |             iv = std::max(zero, iv);
76 |         }
77 |     }
78 |     void ActivationPrime(ublas::vector<T> &v) const {
79 |         constexpr T zero = 0.0;
80 |         constexpr T one = 1.0;
81 |         for (auto &iv : v) {
82 |             iv = iv < zero ? zero : one;
83 |         }
84 |     }
85 | };
86 | 
87 | template <typename T>
88 | class QuadraticCost {
89 |   public:
90 |     T cost_fn(const ublas::vector<T> &a, const ublas::vector<T> &y) const {
91 |         return 0.5 * pow(norm_2(a - y), 2);
92 | 
93 |     }
94 |     ublas::vector<T> cost_delta(const ublas::vector<T> &z, const ublas::vector<T> &a, const ublas::vector<T> &y) const {
95 |         auto zp = z;
96 |         this->ActivationPrime(zp);
97 |         return element_prod(a - y, zp);
98 |     }
99 | };
100 | 
101 | template <typename T>
102 | class CrossEntropyCost {
103 |   public:
104 |     // Return the cost associated with an output ``a`` and desired output
105 |     // ``y``. Note that the guard expressions below are used to ensure
106 |     // numerical stability. In particular, if both ``a`` and ``y`` have a 1.0
107 |     // in the same slot, then the expression (1 - y) * log(1 - a)
108 |     // would return nan; the guards ensure that it is converted
109 |     // to the correct value (0.0).
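    // (Annotation added for clarity: the cross-entropy cost computed below is
    //     C = -sum_i [ y_i * log(a_i) + (1 - y_i) * log(1 - a_i) ]
    // with the two conditional guards handling the a_i == 0 and a_i >= 1
    // limits that would otherwise produce NaNs.)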
110 |     T cost_fn(const ublas::vector<T> &a, const ublas::vector<T> &y) const {
111 |         constexpr T zero = 0.0;
112 |         constexpr T one = 1.0;
113 |         T total(zero);
114 |         for (auto i = 0; i < a.size(); ++i) {
115 |             total += a(i) == zero ? zero : -y(i) * log(a(i));
116 |             total += a(i) >= one ? zero : -(one - y(i)) * log(one - a(i));
117 |         }
118 |         return total;
119 |     }
120 |     // Return the error delta from the output layer. Note that the
121 |     // parameter ``z`` is not used by the method. It is included in
122 |     // the method's parameters in order to make the interface
123 |     // consistent with the delta method for other cost classes.
124 |     ublas::vector<T> cost_delta(const ublas::vector<T> &z, const ublas::vector<T> &a, const ublas::vector<T> &y) const {
125 |         (void)z; // not used by design
126 |         return a - y;
127 |     }
128 | };
129 | 
130 | template <typename T, class CostPolicy, class ActivationPolicy>
131 | requires std::floating_point<T>
132 | class Network : private CostPolicy, private ActivationPolicy {
133 |   private:
134 |     using BiasesVector = std::vector<ublas::vector<T>>;
135 |     using WeightsVector = std::vector<ublas::matrix<T>>;
136 | 
137 |   public:
138 |     // Type definition of the training data
139 |     using TrainingData = std::pair<ublas::vector<T>, ublas::vector<T>>;
140 |     using TrainingDataIterator = typename std::vector<TrainingData>::iterator;
141 |     using TrainingDataVector = std::vector<TrainingData>;
142 | 
143 |   protected:
144 |     class NetworkData {
145 |       public:
146 |         std::vector<size_t> m_sizes;
147 |         BiasesVector biases;
148 |         WeightsVector weights;
149 |         std::mt19937 gen;
150 | 
151 |         NetworkData() {
152 |             std::random_device r;
153 |             gen = std::mt19937(r());
154 |         }
155 |         explicit NetworkData(const std::vector<size_t> &m_sizes) : m_sizes(m_sizes)
156 |         { PopulateZeroWeightsAndBiases(); }
157 |         ~NetworkData() = default;
158 |         void PopulateZeroWeightsAndBiases() {
159 |             for (auto i = 1; i < m_sizes.size(); ++i) {
160 |                 biases.emplace_back(ublas::zero_vector<T>{m_sizes[i]});
161 |                 weights.emplace_back(ublas::zero_matrix<T>{m_sizes[i], m_sizes[i - 1]});
162 |             }
163 |         }
164 |         NetworkData &operator+=(const NetworkData &rhs) {
165 |             for (auto j = 0; j < biases.size(); ++j) {
166 |                 biases[j] += rhs.biases[j];
167 |                 weights[j] += rhs.weights[j];
168 |             }
169 |             return *this;
170 |         }
171 |         friend NetworkData operator+(NetworkData lhs, const NetworkData &rhs) {
172 |             lhs += rhs; // reuse compound assignment
173 |             return lhs;
174 |         }
175 |         void Randomize() {
176 |             for (auto &b : biases)
177 |                 RandomizeVector(b);
178 |             for (auto &w : weights)
179 |                 RandomizeMatrix(w);
180 |         }
181 |         void RandomizeVector(ublas::vector<T> &vec) {
182 |             std::normal_distribution<T> d(0, 1);
183 |             std::ranges::for_each(vec, [&](T &e) { e = d(gen); });
184 |         }
185 |         // Randomize a ublas matrix, scaling by the square root of the input count
186 |         void RandomizeMatrix(ublas::matrix<T> &m) {
187 |             std::normal_distribution<T> d(0, 1);
188 |             T sx = sqrt(static_cast<T>(m.size2()));
189 |             std::ranges::for_each(m.data(), [&](auto &e) { e = d(gen) / sx; });
190 |         }
191 |     };
192 |   private:
193 |     NetworkData nd;
194 |   public:
195 |     Network() = default;
196 |     explicit Network(const std::vector<size_t> &sizes) : nd(sizes) { nd.Randomize(); }
197 |     // (Initialization of the biases and weights is handled by NetworkData.)
198 | 
199 |     // Returns the output of the network if ``a`` is the input
200 |     ublas::vector<T> feedforward(ublas::vector<T> a) const {
201 |         for (auto i = 0; i < nd.biases.size(); ++i) {
202 |             ublas::vector<T> c = prod(nd.weights[i], a) + nd.biases[i];
203 |             this->Activation(c);
204 |             a = c;
205 |         }
206 |         return a;
207 |     }
208 |     // Train the neural network using mini-batch stochastic
209 |     // gradient descent. The training data is a vector of pairs
210 |     // representing the training inputs and the
desired
211 |     // outputs. The other non-optional parameters are
212 |     // self-explanatory. The feedback function is called after each
213 |     // epoch so the network can be evaluated against the test data
214 |     // and partial progress printed out. This is useful for
215 |     // tracking progress, but slows things down substantially.
216 |     // The eta parameter can be altered in the feedback function.
217 |     void SGD(TrainingDataIterator td_begin, TrainingDataIterator td_end, int epochs, int mini_batch_size, T eta,
218 |              T lmbda, std::function<void(const Network &, int, T &)> feedback) {
219 |         for (auto j = 0; j < epochs; j++) {
220 |             std::shuffle(td_begin, td_end, nd.gen);
221 |             for (auto td_i = td_begin; td_i < td_end; td_i += mini_batch_size) {
222 |                 update_mini_batch(td_i, mini_batch_size, eta, lmbda, std::distance(td_begin, td_end));
223 |             }
224 |             feedback(*this, j, eta);
225 |         }
226 |     }
227 |     // Update the network's weights and biases by applying
228 |     // gradient descent using backpropagation to a single mini batch.
229 |     // The "mini_batch" is a range of pairs "(x, y)", and "eta"
230 |     // is the learning rate.
231 |     void update_mini_batch(TrainingDataIterator td, int mini_batch_size, T eta, T lmbda, auto n) {
232 |         NetworkData nabla(nd.m_sizes);
233 |         nabla = std::transform_reduce(std::execution::par, td, td + mini_batch_size, nabla, std::plus(), [this](const TrainingData &tdIn) {
234 |             const auto &[x, y] = tdIn; // test data x, expected result y
235 |             NetworkData delta_nabla(this->nd.m_sizes);
236 |             backprop(x, y, delta_nabla);
237 |             return delta_nabla;
238 |         });
239 |         constexpr T one = 1.0;
240 |         for (auto i = 0; i < nd.biases.size(); ++i) {
241 |             nd.biases[i] -= eta / mini_batch_size * nabla.biases[i];
242 |             nd.weights[i] = (one - eta * (lmbda / n)) * nd.weights[i] - (eta / mini_batch_size) * nabla.weights[i];
243 |         }
244 |     }
245 |     // Populates the gradient of the cost function for the biases and
246 |     // weights held in ``nabla``
247 |     void backprop(const ublas::vector<T> &x, const ublas::vector<T> &y, NetworkData &nabla) {
248 |         auto activation = x;
249 |         std::vector<ublas::vector<T>> activations; // Stores the activations of each layer
250 |         activations.push_back(x);
251 |         std::vector<ublas::vector<T>> zs; // The z vectors layer by layer
252 |         for (auto i = 0; i < nd.biases.size(); ++i) {
253 |             ublas::vector<T> z = prod(nd.weights[i], activation) + nd.biases[i];
254 |             zs.push_back(z);
255 |             activation = z;
256 |             this->Activation(activation);
257 |             activations.push_back(activation);
258 |         }
259 |         // backward pass
260 |         auto iActivations = activations.end() - 1;
261 |         auto izs = zs.end() - 1;
262 | 
263 |         this->ActivationPrime(*izs);
264 |         ublas::vector<T> delta = this->cost_delta(*izs, *iActivations, y);
265 |         auto ib = nabla.biases.end() - 1;
266 |         auto iw = nabla.weights.end() - 1;
267 |         *ib = delta;
268 |         iActivations--;
269 |         *iw = outer_prod(delta, trans(*iActivations));
270 |         auto iWeights = nd.weights.end();
271 |         while (iActivations != activations.begin()) {
272 |             izs--;
273 |             iWeights--;
274 |             iActivations--;
275 |             ib--;
276 |             iw--;
277 |             this->ActivationPrime(*izs);
278 |             delta = element_prod(prod(trans(*iWeights), delta), *izs);
279 |             *ib = delta;
280 |             *iw = outer_prod(delta, trans(*iActivations));
281 |         }
282 |     }
283 |     // Return the index of the largest element of ``res``: the network's prediction.
284 |     auto result(const ublas::vector<T> &res) const {
285 |         return std::distance(res.begin(), max_element(res.begin(), res.end()));
286 |     }
287 |     // Return the number of test inputs for which the network predicts the correct result.
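    // (Annotation added for clarity: predictions are compared by argmax. The
    // index of the largest output activation is checked against the index of
    // the 1.0 entry in the one-hot expected-result vector.)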
288 | auto accuracy(TrainingDataIterator td_begin, TrainingDataIterator td_end) const { 289 | return count_if(std::execution::par, td_begin, td_end, [this](const TrainingData& testElement) { 290 | const auto& [x, y] = testElement; // test data x, expected result y 291 | auto res = feedforward(x); 292 | return result(res) == result(y); 293 | }); 294 | } 295 | // Return the total cost for the data set ``data``. 296 | 297 | T total_cost(TrainingDataIterator td_begin, TrainingDataIterator td_end, T lmbda) const { 298 | auto count = std::distance(td_begin, td_end); 299 | T cost(0); 300 | cost = std::transform_reduce(std::execution::par, td_begin, td_end, cost, std::plus<>(), [this](const TrainingData& td) { 301 | const auto& [testData, expectedResult] = td; 302 | auto res = feedforward(testData); 303 | return this->cost_fn(res, expectedResult); 304 | }); 305 | 306 | cost /= static_cast(count); 307 | constexpr T zero = 0.0; 308 | constexpr T half = 0.5; 309 | T reg = std::accumulate(nd.weights.begin(), nd.weights.end(), zero, 310 | [lmbda, count](T regC, const ublas::matrix &w) { 311 | return regC + half * (lmbda * pow(norm_frobenius(w), 2)) / static_cast(count); 312 | }); 313 | return cost + reg; 314 | } 315 | 316 | friend std::ostream &operator<<(std::ostream &os, const Network &net) { 317 | os << net.nd.m_sizes.size() << " "; 318 | std::ranges::for_each(net.nd.m_sizes, [&](size_t x) { os << x << " "; }); 319 | for (auto x = 0; x < net.nd.m_sizes.size() - 1; ++x) { 320 | std::ranges::for_each(net.nd.biases[x], [&](T y) { os << y << " "; }); 321 | std::ranges::for_each(net.nd.weights[x].data(), [&](T y) { os << y << " "; }); 322 | }; 323 | 324 | return os; 325 | } 326 | 327 | friend std::istream &operator>>(std::istream &is, Network &obj) { 328 | int netSize; 329 | is >> netSize; 330 | for (int i = 0; i < netSize; ++i) { 331 | int size; 332 | is >> size; 333 | obj.nd.m_sizes.push_back(size); 334 | } 335 | obj.nd.PopulateZeroWeightsAndBiases(); 336 | T a; 337 | for (auto x = 1; x < obj.nd.m_sizes.size(); ++x) { 338 | for (auto y = 0; y < obj.nd.m_sizes[x]; ++y) { 339 | is >> a; 340 | obj.nd.biases[x - 1][y] = a; 341 | } 342 | for (auto y = 0; y < obj.nd.m_sizes[x] * obj.nd.m_sizes[x - 1]; ++y) { 343 | is >> a; 344 | obj.nd.weights[x - 1].data()[y] = a; 345 | } 346 | }; 347 | return is; 348 | } 349 | }; 350 | 351 | } // namespace NeuralNet 352 | -------------------------------------------------------------------------------- /Future/NeuralNet2F.sln: -------------------------------------------------------------------------------- 1 |  2 | Microsoft Visual Studio Solution File, Format Version 12.00 3 | # Visual Studio 15 4 | VisualStudioVersion = 15.0.27004.2006 5 | MinimumVisualStudioVersion = 10.0.40219.1 6 | Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "NeuralNet2F", "NeuralNet2F.vcxproj", "{C8B6D31F-C1F0-4E8F-B05E-0E2E1E3947B7}" 7 | EndProject 8 | Global 9 | GlobalSection(SolutionConfigurationPlatforms) = preSolution 10 | Debug|x64 = Debug|x64 11 | Debug|x86 = Debug|x86 12 | Release|x64 = Release|x64 13 | Release|x86 = Release|x86 14 | EndGlobalSection 15 | GlobalSection(ProjectConfigurationPlatforms) = postSolution 16 | {C8B6D31F-C1F0-4E8F-B05E-0E2E1E3947B7}.Debug|x64.ActiveCfg = Debug|x64 17 | {C8B6D31F-C1F0-4E8F-B05E-0E2E1E3947B7}.Debug|x64.Build.0 = Debug|x64 18 | {C8B6D31F-C1F0-4E8F-B05E-0E2E1E3947B7}.Debug|x86.ActiveCfg = Debug|Win32 19 | {C8B6D31F-C1F0-4E8F-B05E-0E2E1E3947B7}.Debug|x86.Build.0 = Debug|Win32 20 | {C8B6D31F-C1F0-4E8F-B05E-0E2E1E3947B7}.Release|x64.ActiveCfg = 
Release|x64 21 | {C8B6D31F-C1F0-4E8F-B05E-0E2E1E3947B7}.Release|x64.Build.0 = Release|x64 22 | {C8B6D31F-C1F0-4E8F-B05E-0E2E1E3947B7}.Release|x86.ActiveCfg = Release|Win32 23 | {C8B6D31F-C1F0-4E8F-B05E-0E2E1E3947B7}.Release|x86.Build.0 = Release|Win32 24 | EndGlobalSection 25 | GlobalSection(SolutionProperties) = preSolution 26 | HideSolutionNode = FALSE 27 | EndGlobalSection 28 | GlobalSection(ExtensibilityGlobals) = postSolution 29 | SolutionGuid = {3328FF37-0803-42DB-A97F-C473C5F898FC} 30 | EndGlobalSection 31 | EndGlobal 32 | -------------------------------------------------------------------------------- /Future/NeuralNet2F.vcxproj: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | Debug 6 | Win32 7 | 8 | 9 | Release 10 | Win32 11 | 12 | 13 | Debug 14 | x64 15 | 16 | 17 | Release 18 | x64 19 | 20 | 21 | 22 | 15.0 23 | {C8B6D31F-C1F0-4E8F-B05E-0E2E1E3947B7} 24 | Win32Proj 25 | NeuralNet2F 26 | 10.0 27 | 28 | 29 | 30 | Application 31 | true 32 | v143 33 | Unicode 34 | 35 | 36 | Application 37 | false 38 | v143 39 | true 40 | Unicode 41 | 42 | 43 | Application 44 | true 45 | v143 46 | Unicode 47 | 48 | 49 | Application 50 | false 51 | v143 52 | true 53 | Unicode 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 | 72 | 73 | 74 | true 75 | 76 | 77 | true 78 | 79 | 80 | false 81 | 82 | 83 | false 84 | 85 | 86 | 87 | Use 88 | Level4 89 | Disabled 90 | true 91 | WIN32;_DEBUG;_CONSOLE;%(PreprocessorDefinitions) 92 | ..\Loader;%(AdditionalIncludeDirectories) 93 | -D_SILENCE_CXX17_ITERATOR_BASE_CLASS_DEPRECATION_WARNING -D_SILENCE_CXX17_OLD_ALLOCATOR_MEMBERS_DEPRECATION_WARNING -D_SILENCE_PARALLEL_ALGORITHMS_EXPERIMENTAL_WARNING 94 | stdcpplatest 95 | true 96 | 97 | 98 | Console 99 | true 100 | 101 | 102 | 103 | 104 | Use 105 | Level4 106 | Disabled 107 | true 108 | _DEBUG;_CONSOLE;%(PreprocessorDefinitions) 109 | ..\Loader;%(AdditionalIncludeDirectories) 110 | -D_SILENCE_PARALLEL_ALGORITHMS_EXPERIMENTAL_WARNING -D_SILENCE_CXX17_ITERATOR_BASE_CLASS_DEPRECATION_WARNING -D_SILENCE_CXX17_OLD_ALLOCATOR_MEMBERS_DEPRECATION_WARNING -D_SCL_SECURE_NO_WARNINGS 111 | stdcpplatest 112 | true 113 | 114 | 115 | Console 116 | true 117 | 118 | 119 | 120 | 121 | Use 122 | Level4 123 | Full 124 | true 125 | true 126 | true 127 | WIN32;NDEBUG;_CONSOLE;%(PreprocessorDefinitions) 128 | ..\Loader;%(AdditionalIncludeDirectories) 129 | -D_SILENCE_CXX17_ITERATOR_BASE_CLASS_DEPRECATION_WARNING -D_SILENCE_CXX17_OLD_ALLOCATOR_MEMBERS_DEPRECATION_WARNING -D_SILENCE_PARALLEL_ALGORITHMS_EXPERIMENTAL_WARNING 130 | stdcpplatest 131 | true 132 | AdvancedVectorExtensions2 133 | true 134 | true 135 | 136 | 137 | Console 138 | true 139 | true 140 | true 141 | 142 | 143 | 144 | 145 | Use 146 | Level4 147 | Full 148 | true 149 | true 150 | true 151 | NDEBUG;_CONSOLE;%(PreprocessorDefinitions) 152 | ..\Loader;%(AdditionalIncludeDirectories) 153 | -D_SILENCE_PARALLEL_ALGORITHMS_EXPERIMENTAL_WARNING -D_SILENCE_CXX17_ITERATOR_BASE_CLASS_DEPRECATION_WARNING -D_SILENCE_CXX17_OLD_ALLOCATOR_MEMBERS_DEPRECATION_WARNING 154 | stdcpplatest 155 | true 156 | 157 | 158 | Console 159 | true 160 | true 161 | true 162 | 163 | 164 | 165 | 166 | 167 | 168 | 169 | 170 | 171 | 172 | Create 173 | Create 174 | Create 175 | Create 176 | 177 | 178 | 179 | 180 | 181 | 182 | 183 | 184 | 185 | 186 | 187 | This project references NuGet package(s) that are missing on this computer. Use NuGet Package Restore to download them. 
For more information, see http://go.microsoft.com/fwlink/?LinkID=322105. The missing file is {0}. 188 | 189 | 190 | 191 | -------------------------------------------------------------------------------- /Future/NeuralNet2F.vcxproj.filters: -------------------------------------------------------------------------------- 1 |  2 | 3 | 4 | 5 | {4FC737F1-C7A5-4376-A066-2A32D752A2FF} 6 | cpp;c;cc;cxx;def;odl;idl;hpj;bat;asm;asmx 7 | 8 | 9 | {93995380-89BD-4b04-88EB-625FBE52EBFB} 10 | h;hh;hpp;hxx;hm;inl;inc;xsd 11 | 12 | 13 | {67DA6AB6-F800-4c08-8B7A-83BB121AAD01} 14 | rc;ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe;resx;tiff;tif;png;wav;mfcribbon-ms 15 | 16 | 17 | 18 | 19 | Header Files 20 | 21 | 22 | Header Files 23 | 24 | 25 | Header Files 26 | 27 | 28 | 29 | 30 | Source Files 31 | 32 | 33 | Source Files 34 | 35 | 36 | 37 | 38 | 39 | -------------------------------------------------------------------------------- /Future/NeuralNet3.cpp: -------------------------------------------------------------------------------- 1 | // NeuralNet3.cpp : Defines the entry point for the console application. 2 | // 3 | // An example written to implement the stochastic gradient descent learning 4 | // algorithm for a feed forward neural network. Gradients are calculated using 5 | // back propagation. 6 | // 7 | // Code is written to be a C++ version of network2.py from 8 | // http://neuralnetworksanddeeplearning.com/chap3.html Variable and functions 9 | // names follow the names used in the original Python 10 | // 11 | // This implementation aims to be slight better C++ rather than Python code 12 | // ported to C++ 13 | // 14 | // Uses the boost ublas library for linear algebra operations 15 | 16 | #include "stdafx.h" 17 | 18 | #include "NeuralNet.h" 19 | #include "boost/numeric/ublas/matrix.hpp" 20 | #include "boost/numeric/ublas/vector.hpp" 21 | #include "mnist_loader.h" 22 | #include 23 | #include 24 | #include 25 | #include 26 | #include 27 | 28 | using namespace boost::numeric; 29 | using namespace NeuralNet; 30 | 31 | int main() { 32 | using NeuralNet1 = NeuralNet::Network, NeuralNet::ReLUActivation>; 33 | std::vector td, testData; 34 | try { 35 | // Load training data 36 | mnist_loader loader("..\\Data\\train-images.idx3-ubyte", "..\\Data\\train-labels.idx1-ubyte", td); 37 | // Load test data 38 | mnist_loader loader2("..\\Data\\t10k-images.idx3-ubyte", "..\\Data\\t10k-labels.idx1-ubyte", testData); 39 | } 40 | catch (const char *Error) { 41 | std::cout << "Error: " << Error << "\n"; 42 | return 0; 43 | } 44 | float Lmbda = 0.1f; // try 5.0; 45 | float eta = 0.03f; // try 0.5 46 | 47 | auto start = std::chrono::high_resolution_clock::now(); 48 | auto periodStart = std::chrono::high_resolution_clock::now(); 49 | NeuralNet1 net({ 784, 100, 60, 10 }); 50 | net.SGD(td.begin(), td.end(), 20, 100, eta, Lmbda, 51 | [&periodStart, &Lmbda, &testData, &td](const NeuralNet1 &network, int Epoch, float ¤ctEta) { 52 | // eta can be manipulated in the feed back function 53 | auto end = std::chrono::high_resolution_clock::now(); 54 | std::chrono::duration diff = end - periodStart; 55 | std::cout << "Epoch " << Epoch << " time taken: " << diff.count() << "\n"; 56 | std::cout << "Test accuracy : " << network.accuracy(testData.begin(), testData.end()) << " / " 57 | << testData.size() << "\n"; 58 | std::cout << "Training accuracy : " << network.accuracy(td.begin(), td.end()) << " / " << td.size() 59 | << "\n"; 60 | std::cout << "Cost Training: " << network.total_cost(td.begin(), td.end(), Lmbda) << "\n"; 61 | std::cout 
<< "Cost Test : " << network.total_cost(testData.begin(), testData.end(), Lmbda) 62 | << std::endl; 63 | currenctEta *= .95f; 64 | periodStart = std::chrono::high_resolution_clock::now(); 65 | }); 66 | auto end = std::chrono::high_resolution_clock::now(); 67 | std::chrono::duration diff = end - start; 68 | std::cout << "Total time: " << diff.count() << "\n"; 69 | // write out net 70 | { 71 | std::ofstream f("./net-save.txt", std::ios::binary | std::ios::out); 72 | f << net; 73 | f.close(); 74 | } 75 | // read it and use it 76 | NeuralNet1 net2; 77 | { 78 | std::fstream f("./net-save.txt", std::ios::binary | std::ios::in); 79 | if (!f.is_open()) { 80 | std::cout << "failed to open ./net-save.txt\n"; 81 | return 0; 82 | } else { 83 | f >> net2; 84 | } 85 | // test total cost should be same as before 86 | std::cout << "Cost Test : " << net2.total_cost(testData.begin(), testData.end(), Lmbda) << "\n"; 87 | auto x = net2.result(net2.feedforward(testData[0].first)); 88 | auto y = net2.result(testData[2].second); 89 | std::cout << "looks like a " << x << " is a " << y << "\n"; 90 | } 91 | return 0; 92 | } 93 | -------------------------------------------------------------------------------- /Future/packages.config: -------------------------------------------------------------------------------- 1 |  2 | 3 | 4 | -------------------------------------------------------------------------------- /Future/stdafx.cpp: -------------------------------------------------------------------------------- 1 | // stdafx.cpp : source file that includes just the standard includes 2 | // NeuralNet1.pch will be the pre-compiled header 3 | // stdafx.obj will contain the pre-compiled type information 4 | 5 | #include "stdafx.h" 6 | 7 | // TODO: reference any additional headers you need in STDAFX.H 8 | // and not in this file 9 | -------------------------------------------------------------------------------- /Future/stdafx.h: -------------------------------------------------------------------------------- 1 | // stdafx.h : include file for standard system include files, 2 | // or project specific include files that are used frequently, but 3 | // are changed infrequently 4 | // 5 | 6 | #pragma once 7 | 8 | #include "targetver.h" 9 | 10 | #include 11 | #include 12 | 13 | 14 | 15 | // TODO: reference additional headers your program requires here 16 | -------------------------------------------------------------------------------- /Future/targetver.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | // Including SDKDDKVer.h defines the highest available Windows platform. 4 | 5 | // If you wish to build your application for a previous Windows platform, include WinSDKVer.h and 6 | // set the _WIN32_WINNT macro to the platform you wish to support before including SDKDDKVer.h. 7 | 8 | #include 9 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | This is free and unencumbered software released into the public domain. 2 | 3 | Anyone is free to copy, modify, publish, use, compile, sell, or 4 | distribute this software, either in source code form or as a compiled 5 | binary, for any purpose, commercial or non-commercial, and by any 6 | means. 7 | 8 | In jurisdictions that recognize copyright laws, the author or authors 9 | of this software dedicate any and all copyright interest in the 10 | software to the public domain. 
We make this dedication for the benefit 11 | of the public at large and to the detriment of our heirs and 12 | successors. We intend this dedication to be an overt act of 13 | relinquishment in perpetuity of all present and future rights to this 14 | software under copyright law. 15 | 16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 17 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 18 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 19 | IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR 20 | OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 21 | ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR 22 | OTHER DEALINGS IN THE SOFTWARE. 23 | 24 | For more information, please refer to 25 | -------------------------------------------------------------------------------- /Loader/.clang-format: -------------------------------------------------------------------------------- 1 | --- 2 | Language: Cpp 3 | # BasedOnStyle: LLVM 4 | AccessModifierOffset: -2 5 | AlignAfterOpenBracket: Align 6 | AlignConsecutiveAssignments: false 7 | AlignConsecutiveDeclarations: false 8 | AlignEscapedNewlines: Right 9 | AlignOperands: true 10 | AlignTrailingComments: true 11 | AllowAllParametersOfDeclarationOnNextLine: true 12 | AllowShortBlocksOnASingleLine: false 13 | AllowShortCaseLabelsOnASingleLine: false 14 | AllowShortFunctionsOnASingleLine: All 15 | AllowShortIfStatementsOnASingleLine: false 16 | AllowShortLoopsOnASingleLine: false 17 | AlwaysBreakAfterDefinitionReturnType: None 18 | AlwaysBreakAfterReturnType: None 19 | AlwaysBreakBeforeMultilineStrings: false 20 | AlwaysBreakTemplateDeclarations: false 21 | BinPackArguments: true 22 | BinPackParameters: true 23 | BraceWrapping: 24 | AfterClass: false 25 | AfterControlStatement: false 26 | AfterEnum: false 27 | AfterFunction: false 28 | AfterNamespace: false 29 | AfterObjCDeclaration: false 30 | AfterStruct: false 31 | AfterUnion: false 32 | BeforeCatch: false 33 | BeforeElse: false 34 | IndentBraces: false 35 | SplitEmptyFunction: true 36 | SplitEmptyRecord: true 37 | SplitEmptyNamespace: true 38 | BreakBeforeBinaryOperators: None 39 | BreakBeforeBraces: Attach 40 | BreakBeforeInheritanceComma: false 41 | BreakBeforeTernaryOperators: true 42 | BreakConstructorInitializersBeforeComma: false 43 | BreakConstructorInitializers: BeforeColon 44 | BreakAfterJavaFieldAnnotations: false 45 | BreakStringLiterals: true 46 | ColumnLimit: 120 47 | CommentPragmas: '^ IWYU pragma:' 48 | CompactNamespaces: false 49 | ConstructorInitializerAllOnOneLineOrOnePerLine: false 50 | ConstructorInitializerIndentWidth: 4 51 | ContinuationIndentWidth: 4 52 | Cpp11BracedListStyle: true 53 | DerivePointerAlignment: false 54 | DisableFormat: false 55 | ExperimentalAutoDetectBinPacking: false 56 | FixNamespaceComments: true 57 | ForEachMacros: 58 | - foreach 59 | - Q_FOREACH 60 | - BOOST_FOREACH 61 | IncludeCategories: 62 | - Regex: '^"(llvm|llvm-c|clang|clang-c)/' 63 | Priority: 2 64 | - Regex: '^(<|"(gtest|gmock|isl|json)/)' 65 | Priority: 3 66 | - Regex: '.*' 67 | Priority: 1 68 | IncludeIsMainRegex: '(Test)?$' 69 | IndentCaseLabels: false 70 | IndentWidth: 4 71 | IndentWrappedFunctionNames: false 72 | JavaScriptQuotes: Leave 73 | JavaScriptWrapImports: true 74 | KeepEmptyLinesAtTheStartOfBlocks: true 75 | MacroBlockBegin: '' 76 | MacroBlockEnd: '' 77 | MaxEmptyLinesToKeep: 1 78 | NamespaceIndentation: None 79 | ObjCBlockIndentWidth: 2 80 | ObjCSpaceAfterProperty: false 
81 | ObjCSpaceBeforeProtocolList: true 82 | PenaltyBreakAssignment: 2 83 | PenaltyBreakBeforeFirstCallParameter: 19 84 | PenaltyBreakComment: 300 85 | PenaltyBreakFirstLessLess: 120 86 | PenaltyBreakString: 1000 87 | PenaltyExcessCharacter: 1000000 88 | PenaltyReturnTypeOnItsOwnLine: 60 89 | PointerAlignment: Right 90 | ReflowComments: true 91 | SortIncludes: true 92 | SortUsingDeclarations: true 93 | SpaceAfterCStyleCast: false 94 | SpaceAfterTemplateKeyword: true 95 | SpaceBeforeAssignmentOperators: true 96 | SpaceBeforeParens: ControlStatements 97 | SpaceInEmptyParentheses: false 98 | SpacesBeforeTrailingComments: 1 99 | SpacesInAngles: false 100 | SpacesInContainerLiterals: true 101 | SpacesInCStyleCastParentheses: false 102 | SpacesInParentheses: false 103 | SpacesInSquareBrackets: false 104 | Standard: Cpp11 105 | TabWidth: 8 106 | UseTab: Never 107 | ... 108 | 109 | -------------------------------------------------------------------------------- /Loader/mnist_loader.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include 4 | #ifdef _MSC_VER 5 | #define bswap_32(x) _byteswap_ulong(x) 6 | #else 7 | #define bswap_32(x) __builtin_bswap32(x) 8 | #endif 9 | 10 | 11 | #include "boost/numeric/ublas/vector.hpp" 12 | #include 13 | #include 14 | #include 15 | #include 16 | 17 | using namespace boost::numeric; 18 | 19 | // Loads the MNIST data files 20 | template class mnist_loader { 21 | public: 22 | mnist_loader(const std::string &FileData, const std::string &FileLabels, 23 | std::vector, ublas::vector>> &mnist_data) { 24 | { 25 | std::ifstream myFile(FileData, std::wifstream::in | std::wifstream::binary); 26 | if (!myFile) 27 | throw "File does not exist"; 28 | int MagicNumber(0); 29 | unsigned int nItems(0); 30 | unsigned int nRows(0); 31 | unsigned int nCol(0); 32 | myFile.read((char *)&MagicNumber, 4); 33 | MagicNumber = bswap_32(MagicNumber); 34 | if (MagicNumber != 2051) 35 | throw "Magic number for training data incorrect"; 36 | myFile.read((char *)&nItems, 4); 37 | nItems = bswap_32(nItems); 38 | myFile.read((char *)&nRows, 4); 39 | nRows = bswap_32(nRows); 40 | myFile.read((char *)&nCol, 4); 41 | nCol = bswap_32(nCol); 42 | std::unique_ptr buf(new unsigned char[nRows * nCol]); 43 | for (unsigned int i = 0; i < nItems; ++i) { 44 | myFile.read((char *)buf.get(), nRows * nCol); 45 | ublas::vector data(nRows * nCol); 46 | for (unsigned int j = 0; j < nRows * nCol; ++j) { 47 | data[j] = static_cast(buf[j]) / static_cast(255.0); 48 | } 49 | mnist_data.push_back(make_pair(data, ublas::zero_vector(10))); 50 | } 51 | } 52 | { 53 | std::ifstream myFile(FileLabels, std::wifstream::in | std::wifstream::binary); 54 | if (!myFile) 55 | throw "File does not exist"; 56 | int MagicNumber(0); 57 | int nItems(0); 58 | myFile.read((char *)&MagicNumber, 4); 59 | MagicNumber = bswap_32(MagicNumber); 60 | if (MagicNumber != 2049) 61 | throw "Magic number for label file incorrect"; 62 | myFile.read((char *)&nItems, 4); 63 | nItems = bswap_32(nItems); 64 | for (int i = 0; i < nItems; ++i) { 65 | char data; 66 | myFile.read(&data, 1); 67 | mnist_data[i].second[data] = 1.0; 68 | } 69 | } 70 | } 71 | }; 72 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Machine Learning with C++ 2 | 3 | The C++ code in this repository is a hopefully accurate port of the python code in Michael Nielsen's book 4 | [Neural Networks and 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Machine Learning with C++
2 | 
3 | The C++ code in this repository is a hopefully accurate port of the Python code in Michael Nielsen's book
4 | [Neural Networks and Deep Learning](http://neuralnetworksanddeeplearning.com/). I recommend reading
5 | Michael Nielsen's book; if you wish to use C++ rather than Python, the code in this repository
6 | can supplement your understanding. I would also like to thank Grant Sanderson, [3Blue1Brown](https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw/featured),
7 | for his entertaining and educational videos on [Machine learning](https://www.youtube.com/watch?v=aircAruvnKk&t=68s) and
8 | for introducing me to Michael's book.
9 | If you're new to Machine learning, I can also recommend [Machine Learning](https://www.coursera.org/learn/machine-learning/home/welcome)
10 | by Stanford University on [Coursera](https://www.coursera.org) as an excellent introduction.
11 | Finally, thanks to the people who created the
12 | [MNIST dataset](http://yann.lecun.com/exdb/mnist/) for making their dataset available to us.
13 | 
14 | I've aimed to produce code as simple and concise as possible using only features of standard C++17, but as
15 | the standard library does not contain a linear algebra library I've used [uBLAS](http://www.boost.org) from the Boost libraries. I've also
16 | tried to showcase some of the features of Modern C++, such as STL algorithms and lambda functions (see the sketch later in this README). With just a couple of hundred lines of code we can produce a program which recognizes handwritten digits and implements the standard learning algorithm for neural networks, known as stochastic gradient descent.
17 | 
18 | ## [Chapter 1](https://github.com/GarethRichards/Machine-Learning-CPP/blob/master/Chapter1.md)
19 | In Chapter 1, a C++ version of network.py is constructed in Python style, with all the code in one .cpp file. The purpose of this version is to create code similar to the Python code, rather than the best possible C++, so that you can read [Using neural nets to recognize handwritten digits](http://neuralnetworksanddeeplearning.com/chap1.html) and follow the online book with C++.
20 | 
21 | ## Chapter 2
22 | Michael explains how the backpropagation algorithm [works](http://neuralnetworksanddeeplearning.com/chap2.html) but adds no
23 | more Python code to his book, which gave me time to read up on some C++17 features I'd like to try out.
24 | 
25 | ## [Chapter 3](https://github.com/GarethRichards/Machine-Learning-CPP/blob/master/Chapter3.md)
26 | This time I'm aiming to produce C++ worthy of being called a library, while still maintaining code which is recognisable next to the Python network2.py from [chapter 3](http://neuralnetworksanddeeplearning.com/chap3.html) of Michael's book. Of course the main body of the code goes into a header file, and I show how to construct some [policy classes](https://en.wikipedia.org/wiki/Policy-based_design) and how to use them in NeuralNet2.cpp.
27 | 
28 | ## [Future](https://github.com/GarethRichards/Machine-Learning-CPP/blob/master/Future.md)
29 | The code written so far all runs in a single thread. Of course, thanks to the C++ compiler's vectorization skills, it does its job in a reasonable amount of time. In C++17 the Extensions for Parallelism [TS](http://en.cppreference.com/w/cpp/experimental/parallelism)
30 | lets us change the execution policy of algorithms. I introduce [structured bindings](http://en.cppreference.com/w/cpp/language/structured_binding) into the codebase, which increase conciseness and readability. Finally, I use some ideas outlined in Michael's book and add different activation functions to the library.
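31 | 
32 | As a taste of this style, here is a minimal sketch of a sigmoid activation function written with a uBLAS vector, an STL algorithm and a lambda. It is illustrative only, not a file from this repository, and assumes Boost.uBLAS is available:
33 | 
34 | ```c++
35 | #include "boost/numeric/ublas/vector.hpp"
36 | #include <algorithm>
37 | #include <cmath>
38 | 
39 | using namespace boost::numeric;
40 | 
41 | // Apply the sigmoid 1 / (1 + e^(-z)) element-wise to a uBLAS vector.
42 | ublas::vector<double> sigmoid(const ublas::vector<double> &z) {
43 |     ublas::vector<double> res(z.size());
44 |     std::transform(z.begin(), z.end(), res.begin(),
45 |                    [](double v) { return 1.0 / (1.0 + std::exp(-v)); });
46 |     return res;
47 | }
48 | ```
49 | 
50 | The Future chapter applies the same pattern with parallel execution policies.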
51 | 
52 | ## Final thoughts
53 | I hope you enjoyed this excursion into both C++ and Machine learning. I started writing this code because I was unable to find any easily approachable
54 | code for experimenting with Machine learning in C++. Once again, thanks to Michael Nielsen for writing such an accessible and readable book.
55 | 
56 | Gareth Richards
57 | 14/11/2017
58 | 
--------------------------------------------------------------------------------