├── .github └── ISSUE_TEMPLATE │ ├── bug_report.md │ └── feature_request.md ├── .gitignore ├── 00a_livebook_for_python_jupyter.livemd ├── 01a_matmul_using_CPU.livemd ├── 01g_matmul_EXLA_gpu.livemd ├── 01h_matmul_Torchx_gpu.livemd ├── 01i_matmul_Exla_cpu.livemd ├── ElixirFashionML_Challenge ├── fashion_mnist_challenge.livemd └── fashion_mnist_sean_m.livemd └── README.md /.github/ISSUE_TEMPLATE/bug_report.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Bug report 3 | about: Create a report to help us improve 4 | title: '' 5 | labels: '' 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Describe the bug** 11 | A clear and concise description of what the bug is. 12 | 13 | **To Reproduce** 14 | Steps to reproduce the behavior: 15 | 1. Go to '...' 16 | 2. Click on '....' 17 | 3. Scroll down to '....' 18 | 4. See error 19 | 20 | **Expected behavior** 21 | A clear and concise description of what you expected to happen. 22 | 23 | **Screenshots** 24 | If applicable, add screenshots to help explain your problem. 25 | 26 | **Desktop (please complete the following information):** 27 | - OS: [e.g. iOS] 28 | - Browser [e.g. chrome, safari] 29 | - Version [e.g. 22] 30 | 31 | **Smartphone (please complete the following information):** 32 | - Device: [e.g. iPhone6] 33 | - OS: [e.g. iOS8.1] 34 | - Browser [e.g. stock browser, safari] 35 | - Version [e.g. 22] 36 | 37 | **Additional context** 38 | Add any other context about the problem here. 39 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/feature_request.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Feature request 3 | about: Suggest an idea for this project 4 | title: '' 5 | labels: '' 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Is your feature request related to a problem? Please describe.** 11 | A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] 12 | 13 | **Describe the solution you'd like** 14 | A clear and concise description of what you want to happen. 15 | 16 | **Describe alternatives you've considered** 17 | A clear and concise description of any alternative solutions or features you've considered. 18 | 19 | **Additional context** 20 | Add any other context or screenshots about the feature request here. 21 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | /_build 2 | /cover 3 | /deps 4 | /doc 5 | /.fetch 6 | erl_crash.dump 7 | *.ez 8 | *.beam 9 | /config/*.secret.exs 10 | .elixir_ls/ 11 | -------------------------------------------------------------------------------- /00a_livebook_for_python_jupyter.livemd: -------------------------------------------------------------------------------- 1 | # Elixir/Livebook for Python/Jupyter Developers 2 | 3 | ## Quick Overview 4 | 5 | Python/Jupyter focused developers that are casually looking at Elixir and Livebook notebooks will see concepts that kind of look the same but are different. Knowing the key differences could help ease understanding about key Elixir concepts in the notebook. The goal of this guide is to help Python focused people look at a Livebook notebook and grasp what is happening in the notebook. 6 | 7 | ## Installing Livebook 8 | 9 | We've had good success with installing Livebook.dev, https://livebook.dev/#install. There are native applications for Windows and Mac. 
On a Linux system, we go to the Github site, https://github.com/livebook-dev/livebook, and install via Escript. Don't forget to set the shims. Also, when running on a local Linux server, the firewall ports for Kino and other interactive cells are different from the Livebook server port. Be sure to pay attention to the environment variable options in the Readme.md.
10 | 
11 | Let's discuss how Livebook/Elixir is a little different from Jupyter/Python.
12 | 
13 | ## Function vs Object Oriented
14 | 
15 | Elixir is a functional language. State exists outside of a function, with values passed into a function. Most Elixir functions will probably transform the inputs and then return an output. You could think of them as procedures that have everything passed in, don't update any object state, and return the result.
16 | 
17 | However, there are a few situations where state is held after a function call. In Elixir, we think of these functions as having a side effect. Common side effects are storing data in a database, writing to a file, or using other operating system resources. The database "write" and file "write" functions result in changing a resource that can later be retrieved. There are several other examples of side-effect situations. We'll even see a few examples in Elixir machine learning libraries.
18 | 
19 | Elixir has modules that hold function definitions and may define a data structure. Python has class definitions that hold state and function definitions. Where a variable can have a method invocation in Python, e.g. list_a.sum(), Elixir values must be passed as arguments into a module's function, e.g. Enum.sum(list_b).
20 | 
21 | ## Immutable state in Elixir
22 | 
23 | For machine learning notebooks, state is referenced in variable names specific to the notebook. This is very similar to how variable state is held in a Jupyter notebook. The variable values are held by the notebook until the Elixir notebook is closed.
24 | 
25 | One pretty big difference in Elixir is that all state is immutable. A function can receive state as an argument; however, the value is immutable, so it can't be changed. The function may transform the information, but any transformation must be returned as a newly created value. One convenient approach in Elixir is to assign the result of the function call back to the same variable name, but there are better conventions that we'll see below.
26 | 
27 | 
28 | 
29 | ```elixir
30 | list_b = [1,2,3]
31 | list_b = Enum.map(list_b, fn(value) -> value * value end)
32 | ```
33 | 
34 | Like Jupyter notebooks, shift-*return key* will execute the current cell. The other keyboard shortcuts can be found in the keypad-like icon on the left bar. There is also a mouse approach with the > Execute button that appears above the active cell; clicking the Execute button will also work. If you've already installed Livebook, try executing the following code cell.
35 | 
36 | ```elixir
37 | list_b = [1, 2, 3]
38 | list_b = Enum.map(list_b, fn value -> value * value end)
39 | ```
40 | 
41 | ## Chaining function calls
42 | 
43 | In many object-oriented languages, method calls can be chained sequentially, e.g. array_a.square().sum()
44 | 
45 | Elixir has a special notation for chaining function calls together.
46 | 
47 | 
48 | 
49 | ```elixir
50 | list_b
51 | |> Enum.map(fn(value) -> value * value end)
52 | |> Enum.sum()
53 | ```
54 | 
55 | The |> pipe operator takes the result of the previous expression and passes it as the first argument to the following function call.
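To make the rewriting explicit, here is the same computation written with nested calls and with the pipe; the two forms are equivalent:

```elixir
# Nested calls read inside-out...
Enum.sum(Enum.map(list_b, fn value -> value * value end))

# ...while the pipe reads top-to-bottom, passing each result
# as the first argument of the next call
list_b
|> Enum.map(fn value -> value * value end)
|> Enum.sum()
```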
Note that the first argument, a list or enumerable, for Enum.map and Enum.sum isn't shown because the pipe operator represents the output from the previous line of code.
56 | 
57 | ```elixir
58 | list_b
59 | |> Enum.map(fn value -> value * value end)
60 | |> Enum.sum()
61 | ```
62 | 
63 | ```elixir
64 | # All in one line also works
65 | list_b |> Enum.map(fn value -> value * value end) |> Enum.sum()
66 | ```
67 | 
68 | ## Elixir functions in modules
69 | 
70 | As long as the code in a Livebook is calling existing functions, variable assignment works pretty much like it does in Python/Jupyter. However, Python supports the definition of standalone functions in notebooks.
71 | 
72 | ```python
73 | def chunks(x, sz):
74 |     for i in range(0, len(x), sz): yield x[i:i+sz]
75 | ```
76 | 
77 | All Elixir function definitions must be inside a module definition.
78 | 
79 | 
80 | 
81 | ```elixir
82 | defmodule ModA do
83 |   def funct_a() do
84 |   end
85 | end
86 | ```
87 | 
88 | 
89 | 
90 | Elixir has an anonymous function capability. In the above Enum function call, the fn(something) -> transform(something) end is creating an anonymous function, like Python's lambda. Anonymous functions can be assigned to variable names and called. Note the .(args) when the named anonymous function is called.
91 | 
92 | 
93 | 
94 | ```elixir
95 | sum_of_squares = fn(value) ->
96 |   Enum.map(value, fn(v) -> v * v end)
97 |   |> Enum.sum()
98 | end
99 | 
100 | sum_of_squares.(list_b)
101 | ```
102 | 
103 | ```elixir
104 | sum_of_squares = fn value ->
105 |   Enum.map(value, fn v -> v * v end)
106 |   |> Enum.sum()
107 | end
108 | 
109 | sum_of_squares.(list_b)
110 | ```
111 | 
112 | ## Livebook module version management
113 | 
114 | One item to note about Livebook: modules are installed with a version. Rather than a requirements.txt for an entire folder of Jupyter notebooks, the module dependencies are defined within each notebook. The Livebook convention is to use the first cell to define any module dependencies. In this notebook, the basic Elixir language capabilities were sufficient, so no modules were Mix.installed. Watch for the contents of the first cell to explore the modules used in notebooks. Pinning the specific module versions helps with the repeatability challenges of notebooks. However, you'll note that the Elixir and Erlang versions are not defined in the notebook. Neither is the version of Livebook the notebook was run under. Operating system dependencies, like CUDA, cuDNN, cmake, make, etc., are not defined in notebooks either.
115 | 
116 | ## Livebook file format
117 | 
118 | Livebook's file format is a Markdown file. The use of a well-defined standard format allows support for understandable Git pull requests against the .livemd file.
119 | 
120 | ## Left sidebar and hints
121 | 
122 | We've already mentioned the keyboard shortcuts. Other icons represent the table of Section labels and the connected users. The lock captures secrets that you don't want stored in your Livebook. Secrets can be things like a database login, etc. The runtime settings are a more advanced area; we suggest finding the documentation or blog posts on how to use those settings. They don't have a strong mapping to Jupyter. Finally, a big hint: if you accidentally delete a cell, you can retrieve it from the bin/trash.
123 | 
124 | 
125 | 
126 | For the active cell, there are some icons above the cell to the right. The up and down arrows move the active cell up or down in your notebook.
We just noticed, in Livebook 0.7.2, that there is an icon to insert an image into a markdown cell. We'll need to try it out.
127 | 
128 | 
129 | 
130 | Another big hint: Livebook knows which cells are stale. If you go to the bottom of the notebook, or someplace in the middle of the notebook, and execute that cell, any cells that are out of date with your edits are executed down to your cell. This is one technique for executing all of the cells in a notebook. However, it doesn't force the re-execution of all cells; only stale cells are run. If you re-execute the first cell and then execute the last cell, all cells will be executed.
131 | 
132 | 
133 | 
134 | A Livebook notebook opened from a web resource will not be saved locally unless you instruct Livebook to save the notebook. Click on the floppy disk icon in the lower right and choose someplace you want to store the notebook.
135 | 
136 | ## Try out some Livebook notebooks
137 | 
138 | This hasn't been a complete guide to Livebook, but hopefully it provides some context for your exploration of Elixir and Livebook. Have fun!
139 | 
--------------------------------------------------------------------------------
/01g_matmul_EXLA_gpu.livemd:
--------------------------------------------------------------------------------
1 | # Matrix multiplication on GPU - XLA
2 | 
3 | ```elixir
4 | Mix.install(
5 |   [
6 |     {:nx, "~> 0.4.0"},
7 |     {:scidata, "~> 0.1.9"},
8 |     {:axon, "~> 0.3.0"},
9 |     {:exla, "~> 0.4"}
10 |   ],
11 |   system_env: %{"XLA_TARGET" => "cuda111"}
12 | )
13 | ```
14 | 
15 | ## Before running notebook
16 | 
17 | This notebook has a dependency on EXLA. XLA supports systems with direct access to an NVidia GPU, AMD ROCm, or a Google TPU. According to the documentation, https://github.com/elixir-nx/nx/tree/main/exla#readme, EXLA will try to find a precompiled version that matches your system. If it doesn't find a match, you will need to install CUDA and CuDNN for your system.
18 | 
19 | The notebook is currently configured for an Nvidia GPU via
20 | 
21 | ```
22 | system_env: %{"XLA_TARGET" => "cuda111"}
23 | ```
24 | 
25 | Review the configuration documentation for more options. https://hexdocs.pm/exla/EXLA.html#module-configuration
26 | 
27 | We had to install CUDA and CuDNN, but that was several months ago. Your experience may vary from ours.
28 | 
29 | ## Context
30 | 
31 | This Livebook is a transformation of a Python Jupyter Notebook from Fast.ai's From Deep Learning Foundations to Stable Diffusion, Practical Deep Learning for Coders part 2, 2022. Specifically, it mimics the CUDA portion of https://github.com/fastai/course22p2/blob/master/nbs/01_matmul.ipynb
32 | 
33 | The purpose of the transformation is to bring the Fast.ai concepts to Elixir focused developers. The object-oriented Python/PyTorch implementation is transformed into a functional programming implementation using Nx and Axon.
34 | 
35 | ## Experimenting with backend control
36 | 
37 | In this notebook, we are going to experiment with swapping out backends in the same notebook. One of the strengths of Elixir's numerical processing approach is the concept of a backend. The same Nx code can run on several different backends. This allows Nx to adapt to changes in numerical libraries and technology. Currently, Nx has support for Tensorflow's XLA and PyTorch's TorchScript. Theoretically, backends for SOC type devices should be possible.
38 | 
39 | We chose not to set the backend globally throughout the notebook. At the beginning of the notebook, we'll repeat the approach we used in 01a_matmul_using_CPU.
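As an aside, if you did want a single backend for the whole notebook, Nx can set a global default. A minimal sketch (assuming EXLA is installed; this is not what this notebook does):

```elixir
# Set the default backend for the whole runtime
# (intentionally avoided in this notebook so we can compare backends)
Nx.global_default_backend(EXLA.Backend)

# Or set it only for the current process, optionally picking a client
Nx.default_backend({EXLA.Backend, client: :host})
```

Instead, the cells below switch backends explicitly so the same computation can be timed on each one.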
We begin with the Elixir Binary backend. You'll see that it isn't quick multiplying 10,000 rows of MNIST data by some arbitrary weights. We'll then repeat the same multiplication using an NVidia 1080Ti GPU. The 1080 Ti is not the fastest GPU, but it is tremendously faster than a "large" set of data on the BinaryBackend. 40 | 41 | * 31649.26 milliseconds using BinaryBackend with a CPU only. 42 | * 0.14 milliseconds using XLA with a warmed up GPU 43 | 44 | *226,000 times faster on an old GPU* 45 | 46 | ## Default - BinaryBackend 47 | 48 | ```elixir 49 | # Without choosing a backend, Nx defaults to Nx.BinaryBackend 50 | Nx.default_backend() 51 | ``` 52 | 53 | ```elixir 54 | # Just in case you rerun the notebook, let's make sure the default backend is BinaryBackend 55 | # Setting to the Nx default backend 56 | Nx.default_backend(Nx.BinaryBackend) 57 | Nx.default_backend() 58 | ``` 59 | 60 | We'll pull down the MNIST data 61 | 62 | ```elixir 63 | {train_images, train_labels} = Scidata.MNIST.download() 64 | ``` 65 | 66 | ```elixir 67 | {train_images_binary, train_tensor_type, train_shape} = train_images 68 | ``` 69 | 70 | ```elixir 71 | train_tensor_type 72 | ``` 73 | 74 | Convert into Tensors and normalize to between 0 and 1 75 | 76 | ```elixir 77 | train_tensors = 78 | train_images_binary 79 | |> Nx.from_binary(train_tensor_type) 80 | |> Nx.reshape({60000, 28 * 28}) 81 | |> Nx.divide(255) 82 | ``` 83 | 84 | We'll separate the data into 50,000 train images and 10,000 validation images. 85 | 86 | ```elixir 87 | x_train_cpu = train_tensors[0..49_999] 88 | x_valid_cpu = train_tensors[50_000..59_999] 89 | {x_train_cpu.shape, x_valid_cpu.shape} 90 | ``` 91 | 92 | Training is more stable when random numbers are initialized with a mean of 0.0 and a variance of 1.0 93 | 94 | ```elixir 95 | mean = 0.0 96 | variance = 1.0 97 | weights_cpu = Nx.random_normal({784, 10}, mean, variance, type: {:f, 32}) 98 | ``` 99 | 100 | In order to simplify timing the performance of the Nx.dot/2 function, we'll use an 0 parameter anonymous function. Invoking the anonymous function will always use the two parameters, x_valid_cpu and weights_cpu. 101 | 102 | ```elixir 103 | large_nx_mult_fn = fn -> Nx.dot(x_valid_cpu, weights_cpu) end 104 | ``` 105 | 106 | The following anonymous function takes function and the number of times to make the call to the function. 107 | 108 | ```elixir 109 | repeat = fn timed_fn, times -> Enum.each(1..times, fn _x -> timed_fn.() end) end 110 | ``` 111 | 112 | Timing the average duration of the dot multiply function to run. The cell will output the average and total elapsed time 113 | 114 | ```elixir 115 | repeat_times = 5 116 | {elapsed_time_micro, _} = :timer.tc(repeat, [large_nx_mult_fn, repeat_times]) 117 | avg_elapsed_time_ms = elapsed_time_micro / 1000 / repeat_times 118 | 119 | {backend, _device} = Nx.default_backend() 120 | 121 | "#{backend} CPU avg time in #{avg_elapsed_time_ms} milliseconds, total_time #{elapsed_time_micro / 1000} milliseconds" 122 | ``` 123 | 124 | ## XLA using GPU 125 | 126 | We'll switch to the XLA backend and use the cuda device. If you have a different device, replace all the :cuda specifications with your device. 127 | 128 | ```elixir 129 | Nx.default_backend({EXLA.Backend, device: :cuda}) 130 | Nx.default_backend() 131 | ``` 132 | 133 | In the following cell, we transfer the target data onto the GPU. 
134 | 135 | ```elixir 136 | x_valid_cuda = Nx.backend_transfer(x_valid_cpu, {EXLA.Backend, client: :cuda}) 137 | weights_cuda = Nx.backend_transfer(weights_cpu, {EXLA.Backend, client: :cuda}) 138 | ``` 139 | 140 | An anonymous function that calls Nx.dot/2 with data on the GPU 141 | 142 | ```elixir 143 | exla_gpu_mult_fn = fn -> Nx.dot(x_valid_cuda, weights_cuda) end 144 | ``` 145 | 146 | We'll warm up the GPU by looping through 5 function calls and then timing the next 5 147 | function calls. 148 | 149 | ```elixir 150 | repeat_times = 5 151 | # Warm up one epoch 152 | {elapsed_time_micro, _} = :timer.tc(repeat, [exla_gpu_mult_fn, repeat_times]) 153 | # The real timing starts here 154 | {elapsed_time_micro, _} = :timer.tc(repeat, [exla_gpu_mult_fn, repeat_times]) 155 | avg_elapsed_time_ms = elapsed_time_micro / 1000 / repeat_times 156 | 157 | {backend, [device: device]} = Nx.default_backend() 158 | 159 | "#{backend} #{device} avg time in #{avg_elapsed_time_ms} milliseconds total_time #{elapsed_time_micro / 1000} milliseconds" 160 | ``` 161 | 162 | ```elixir 163 | x_valid_cpu = Nx.backend_transfer(x_valid_cuda, Nx.BinaryBackend) 164 | weights_cpu = Nx.backend_transfer(weights_cuda, Nx.BinaryBackend) 165 | ``` 166 | -------------------------------------------------------------------------------- /01h_matmul_Torchx_gpu.livemd: -------------------------------------------------------------------------------- 1 | # Matrix multiplication on GPU - TorchScript 2 | 3 | ```elixir 4 | Mix.install( 5 | [ 6 | {:nx, "~> 0.4.0"}, 7 | {:scidata, "~> 0.1.9"}, 8 | {:torchx, "~> 0.3"} 9 | ], 10 | system_env: %{"LIBTORCH_TARGET" => "cu116"} 11 | ) 12 | ``` 13 | 14 | 15 | 16 | ``` 17 | :ok 18 | ``` 19 | 20 | ## Before running notebook 21 | 22 | This notebook has a dependency on TorchScript. Torchx can use your CPU or GPU. If you have direct access to an NVidia GPU, the notebook has a section on running matrix multiplication on a GPU. If you only have a CPU, you can comment out the last GPU section and just run on your CPU. CPU is still pretty fast for this simple notebook. 23 | 24 | According to the documentation, https://github.com/elixir-nx/nx/tree/main/torchx#readme Torchx will need to compile the TorchScript binding. Before you run the above cell, you will need make/nmake, cmake (3.12+) and a C++ compiler. The Windows binding to TorchScript is also supported and more information can be found at the Torchx readme. At this time, the MacOS binding doesn't support access to a GPU. 25 | 26 | **Running the first cell downloads and compiles the binding to TorchScript. The download of TorchScript took about 9 minutes and compilation took about 1 minute on our system.** In the future, it is likely that the downloaded TorchScript file will be cached locally, however, right now each notebook that uses torchx will download the file. 27 | 28 | The notebook is currently set up for an Nvidia GPU on Linux. 29 | 30 | ``` 31 | system_env: %{"LIBTORCH_TARGET" => "cu111"} 32 | ``` 33 | 34 | Feel free to read the Torchx documentation and modify to fit your needs. 35 | 36 | ## Context 37 | 38 | The notebook is a transformation of a Python Jupyter Notebook from Fast.ai's [From Deep Learning Foundations to Stable Diffusion](https://www.fast.ai/posts/part2-2022.html), Practical Deep Learning for Coders part 2, 2022. Specifically, it mimics the CUDA portion of https://github.com/fastai/course22p2/blob/master/nbs/01_matmul.ipynb 39 | 40 | The purpose of the transformation is to bring the Fast.ai concepts to Elixir focused developers. 
The object-oriented Python/PyTorch implementation is transformed into a functional programming implementation using [Nx](https://github.com/elixir-nx/nx) and [Axon](https://github.com/elixir-nx/axon) 41 | 42 | ## Experimenting with backend control 43 | 44 | In this notebook, we are going to experiment with swapping out backends in the same notebook. One of the strengths of Elixir's numerical processing approach is the concept of a backend. The same Nx code can run on several different backends. This allows Nx to adapt to changes in numerical libaries and technology. Currently, Nx has support for Tensorflow's XLA and PyTorch's TorchScript. Theoretically, backends for SOC type devices should be possible. 45 | 46 | We chose not to set the backend globally in this notebook. At the beginning of the notebook, we'll repeat the approach we used in 01a_matmul_using_CPU. We begin with the Elixir Binary backend. You'll see that it isn't quick multiplying 10,000 rows of MNIST data by some arbitrary weights. 47 | 48 | We'll then repeat the same multiplication using TorchScript on the CPU. Followed again by TorchScript using an NVidia 1080Ti GPU. The 1080 Ti is not the fastest GPU, but it is tremendously faster than a "large" set of data on the BinaryBackend but only a little faster than just the CPU 49 | 50 | * About 32 seconds using BinaryBackend with only a CPU. 51 | * 1.8 milliseconds using TorchScript with only a CPU 52 | 53 | 17,778 times faster than Binary backend 54 | 55 | * 70 microseconds using TorchScript with a warmed up, but old, GPU 56 | 57 | 111 times faster on the GPU vs the CPU 58 | 59 | ## Default - BinaryBackend 60 | 61 | ```elixir 62 | # Without choosing a backend, Nx defaults to Nx.BinaryBackend 63 | Nx.default_backend() 64 | ``` 65 | 66 | 67 | 68 | ``` 69 | {Nx.BinaryBackend, []} 70 | ``` 71 | 72 | ```elixir 73 | # Just in case you rerun the notebook, let's make sure the default backend is BinaryBackend 74 | # Setting to the Nx default backend 75 | Nx.default_backend(Nx.BinaryBackend) 76 | Nx.default_backend() 77 | ``` 78 | 79 | 80 | 81 | ``` 82 | {Nx.BinaryBackend, []} 83 | ``` 84 | 85 | We'll pull down the MNIST data 86 | 87 | ```elixir 88 | {train_images, train_labels} = Scidata.MNIST.download() 89 | {test_images, test_labels} = Scidata.MNIST.download_test() 90 | ``` 91 | 92 | 93 | 94 | ``` 95 | {{<<0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 96 | 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...>>, {:u, 8}, {10000, 1, 28, 28}}, 97 | {<<7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1, 5, 9, 7, 3, 4, 9, 6, 6, 5, 4, 0, 7, 4, 0, 1, 3, 1, 98 | 3, 4, 7, 2, 7, 1, 2, 1, 1, 7, 4, 2, 3, 5, 1, ...>>, {:u, 8}, {10000}}} 99 | ``` 100 | 101 | ```elixir 102 | {train_images_binary, train_tensor_type, train_shape} = train_images 103 | {test_images_binary, test_tensor_type, test_shape} = test_images 104 | ``` 105 | 106 | 107 | 108 | ``` 109 | {<<0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 110 | 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...>>, {:u, 8}, {10000, 1, 28, 28}} 111 | ``` 112 | 113 | ```elixir 114 | {train_tensor_type, test_tensor_type} 115 | ``` 116 | 117 | 118 | 119 | ``` 120 | {{:u, 8}, {:u, 8}} 121 | ``` 122 | 123 | Convert into Tensors and normalize to between 0 and 1 124 | 125 | ```elixir 126 | train_tensors = 127 | train_images_binary 128 | |> Nx.from_binary(train_tensor_type) 129 | |> Nx.reshape({60000, 28 * 28}) 130 | |> Nx.divide(255) 131 | ``` 132 | 133 | 134 | 135 | 
``` 136 | #Nx.Tensor< 137 | f32[60000][784] 138 | [ 139 | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...], 140 | ... 141 | ] 142 | > 143 | ``` 144 | 145 | We'll separate the data into 50,000 train images and 10,000 validation images. 146 | 147 | ```elixir 148 | x_train = train_tensors[0..49_999] 149 | x_valid = train_tensors[50_000..59_999] 150 | {x_train.shape, x_valid.shape} 151 | ``` 152 | 153 | 154 | 155 | ``` 156 | {{50000, 784}, {10000, 784}} 157 | ``` 158 | 159 | Training is more stable when random numbers are initialized with a mean of 0.0 and a variance of 1.0 160 | 161 | ```elixir 162 | mean = 0.0 163 | variance = 1.0 164 | weights = Nx.random_normal({784, 10}, mean, variance, type: {:f, 32}) 165 | ``` 166 | 167 | 168 | 169 | ``` 170 | #Nx.Tensor< 171 | f32[784][10] 172 | [ 173 | [1.182692050933838, 1.6625404357910156, -0.598689079284668, -0.6435468196868896, 0.25204139947891235, -1.1432150602340698, -0.9701210260391235, 1.9566036462783813, -0.6923237442970276, -1.0753910541534424], 174 | [0.17891690135002136, 0.42717286944389343, -0.9910821914672852, -2.649228096008301, 0.13641099631786346, 0.48691749572753906, -1.0575640201568604, 0.40385302901268005, 0.5131964683532715, 0.41488444805145264], 175 | [2.100423574447632, -1.2787413597106934, -1.8883213996887207, -0.49423742294311523, 0.5708040595054626, -0.48230457305908203, -0.19617703557014465, 0.7797456979751587, 0.7876895070075989, -0.33916765451431274], 176 | [-0.4369395673274994, 0.4421914517879486, 0.18007169663906097, 0.7891340255737305, 0.28369951248168945, -1.2312926054000854, -0.17864377796649933, -1.2232452630996704, 0.6976354718208313, 1.300831913948059], 177 | [-1.9821809530258179, 1.426361083984375, -2.2645328044891357, 0.26135173439979553, -0.36276111006736755, 2.7461342811584473, 0.007044021971523762, -0.18955571949481964, 0.6062670946121216, -0.4373891055583954], 178 | ... 179 | ] 180 | > 181 | ``` 182 | 183 | In order to simplify timing the performance of the Nx.dot/2 function, we'll use an 0 parameter anonymous function. Invoking the anonymous function will always use the two parameters, x_valid_cpu and weights_cpu. 184 | 185 | ```elixir 186 | large_nx_mult_fn = fn -> Nx.dot(x_valid, weights) end 187 | ``` 188 | 189 | 190 | 191 | ``` 192 | #Function<43.3316493/0 in :erl_eval.expr/6> 193 | ``` 194 | 195 | The following anonymous function take a function and the number of times to make the call to the function. 196 | 197 | ```elixir 198 | repeat = fn timed_fn, times -> Enum.each(1..times, fn _x -> timed_fn.() end) end 199 | ``` 200 | 201 | 202 | 203 | ``` 204 | #Function<41.3316493/2 in :erl_eval.expr/6> 205 | ``` 206 | 207 | Timing the average duration of the dot multiply function to run. 
The cell will output the average and total elapsed time 208 | 209 | ```elixir 210 | repeat_times = 5 211 | {elapsed_time_micro, _} = :timer.tc(repeat, [large_nx_mult_fn, repeat_times]) 212 | avg_elapsed_time_ms = elapsed_time_micro / 1000 / repeat_times 213 | 214 | {backend, _device} = Nx.default_backend() 215 | 216 | "#{backend} CPU avg time in #{avg_elapsed_time_ms} milliseconds total_time #{elapsed_time_micro / 1000} milliseconds" 217 | ``` 218 | 219 | 220 | 221 | ``` 222 | "Elixir.Nx.BinaryBackend CPU avg time in 31846.6328 milliseconds total_time 159233.164 milliseconds" 223 | ``` 224 | 225 | ## TorchScript CPU only 226 | 227 | We'll switch to the TorchScript backend but we'll stick with using the CPU. 228 | 229 | ```elixir 230 | Nx.default_backend({Torchx.Backend, device: :cpu}) 231 | Nx.default_backend() 232 | ``` 233 | 234 | 235 | 236 | ``` 237 | {Torchx.Backend, [device: :cpu]} 238 | ``` 239 | 240 | In the following cell, we transfer the target data from BinaryBackend to Torchx cpu backend. 241 | 242 | ```elixir 243 | x_valid_torchx_cpu = Nx.backend_transfer(x_valid, {Torchx.Backend, device: :cpu}) 244 | weights_torchx_cpu = Nx.backend_transfer(weights, {Torchx.Backend, device: :cpu}) 245 | ``` 246 | 247 | 248 | 249 | ``` 250 | #Nx.Tensor< 251 | f32[784][10] 252 | Torchx.Backend(cpu) 253 | [ 254 | [1.182692050933838, 1.6625404357910156, -0.598689079284668, -0.6435468196868896, 0.25204139947891235, -1.1432150602340698, -0.9701210260391235, 1.9566036462783813, -0.6923237442970276, -1.0753910541534424], 255 | [0.17891690135002136, 0.42717286944389343, -0.9910821914672852, -2.649228096008301, 0.13641099631786346, 0.48691749572753906, -1.0575640201568604, 0.40385302901268005, 0.5131964683532715, 0.41488444805145264], 256 | [2.100423574447632, -1.2787413597106934, -1.8883213996887207, -0.49423742294311523, 0.5708040595054626, -0.48230457305908203, -0.19617703557014465, 0.7797456979751587, 0.7876895070075989, -0.33916765451431274], 257 | [-0.4369395673274994, 0.4421914517879486, 0.18007169663906097, 0.7891340255737305, 0.28369951248168945, -1.2312926054000854, -0.17864377796649933, -1.2232452630996704, 0.6976354718208313, 1.300831913948059], 258 | [-1.9821809530258179, 1.426361083984375, -2.2645328044891357, 0.26135173439979553, -0.36276111006736755, 2.7461342811584473, 0.007044021971523762, -0.18955571949481964, 0.6062670946121216, -0.4373891055583954], 259 | ... 260 | ] 261 | > 262 | ``` 263 | 264 | An anonymous function that calls Nx.dot/2 with data on the Torchx cpu backend. 265 | 266 | ```elixir 267 | torchx_cpu_mult_fn = fn -> Nx.dot(x_valid_torchx_cpu, weights_torchx_cpu) end 268 | ``` 269 | 270 | 271 | 272 | ``` 273 | #Function<43.3316493/0 in :erl_eval.expr/6> 274 | ``` 275 | 276 | We'll time using Torchx on the CPU. Notice the significant performance improvement over BinaryBackend while still using just the CPU. 277 | 278 | ```elixir 279 | repeat_times = 5 280 | {elapsed_time_micro, _} = :timer.tc(repeat, [torchx_cpu_mult_fn, repeat_times]) 281 | avg_elapsed_time_ms = elapsed_time_micro / 1000 / repeat_times 282 | 283 | {backend, [device: device]} = Nx.default_backend() 284 | 285 | "#{backend} #{device} avg time in milliseconds #{avg_elapsed_time_ms} total_time #{elapsed_time_micro / 1000}" 286 | ``` 287 | 288 | 289 | 290 | ``` 291 | "Elixir.Torchx.Backend cpu avg time in milliseconds 1.7149999999999999 total_time 8.575" 292 | ``` 293 | 294 | ## TorchScript using GPU 295 | 296 | We'll switch to using the cuda device. 
If you have a different device, replace all the :cuda specifications with your device. 297 | 298 | ```elixir 299 | Nx.default_backend({Torchx.Backend, device: :cuda}) 300 | Nx.default_backend() 301 | ``` 302 | 303 | 304 | 305 | ``` 306 | {Torchx.Backend, [device: :cuda]} 307 | ``` 308 | 309 | In the following cell, we transfer the target data onto the GPU. 310 | 311 | ```elixir 312 | x_valid_cuda = Nx.backend_transfer(x_valid, {Torchx.Backend, client: :cuda}) 313 | weights_cuda = Nx.backend_transfer(weights, {Torchx.Backend, client: :cuda}) 314 | ``` 315 | 316 | 317 | 318 | ``` 319 | #Nx.Tensor< 320 | f32[784][10] 321 | Torchx.Backend(cuda) 322 | [ 323 | [1.182692050933838, 1.6625404357910156, -0.598689079284668, -0.6435468196868896, 0.25204139947891235, -1.1432150602340698, -0.9701210260391235, 1.9566036462783813, -0.6923237442970276, -1.0753910541534424], 324 | [0.17891690135002136, 0.42717286944389343, -0.9910821914672852, -2.649228096008301, 0.13641099631786346, 0.48691749572753906, -1.0575640201568604, 0.40385302901268005, 0.5131964683532715, 0.41488444805145264], 325 | [2.100423574447632, -1.2787413597106934, -1.8883213996887207, -0.49423742294311523, 0.5708040595054626, -0.48230457305908203, -0.19617703557014465, 0.7797456979751587, 0.7876895070075989, -0.33916765451431274], 326 | [-0.4369395673274994, 0.4421914517879486, 0.18007169663906097, 0.7891340255737305, 0.28369951248168945, -1.2312926054000854, -0.17864377796649933, -1.2232452630996704, 0.6976354718208313, 1.300831913948059], 327 | [-1.9821809530258179, 1.426361083984375, -2.2645328044891357, 0.26135173439979553, -0.36276111006736755, 2.7461342811584473, 0.007044021971523762, -0.18955571949481964, 0.6062670946121216, -0.4373891055583954], 328 | ... 329 | ] 330 | > 331 | ``` 332 | 333 | An anonymous function that calls Nx.dot/2 with data on the GPU 334 | 335 | ```elixir 336 | torchx_gpu_mult_fn = fn -> Nx.dot(x_valid_cuda, weights_cuda) end 337 | ``` 338 | 339 | 340 | 341 | ``` 342 | #Function<43.3316493/0 in :erl_eval.expr/6> 343 | ``` 344 | 345 | We'll warm up the GPU by looping through 5 function calls and then timing the next 5 function calls. 
346 | 347 | ```elixir 348 | repeat_times = 5 349 | # Warmup 350 | {elapsed_time_micro, _} = :timer.tc(repeat, [torchx_gpu_mult_fn, repeat_times]) 351 | {elapsed_time_micro, _} = :timer.tc(repeat, [torchx_gpu_mult_fn, repeat_times]) 352 | avg_elapsed_time_ms = elapsed_time_micro / 1000 / repeat_times 353 | 354 | {backend, [device: device]} = Nx.default_backend() 355 | 356 | "#{backend} #{device} avg time in milliseconds #{avg_elapsed_time_ms} total_time #{elapsed_time_micro / 1000}" 357 | ``` 358 | 359 | 360 | 361 | ``` 362 | "Elixir.Torchx.Backend cuda avg time in milliseconds 0.0718 total_time 0.359" 363 | ``` 364 | 365 | ```elixir 366 | x_valid = Nx.backend_transfer(x_valid_cuda, Nx.BinaryBackend) 367 | weights = Nx.backend_transfer(weights_cuda, Nx.BinaryBackend) 368 | ``` 369 | 370 | 371 | 372 | ``` 373 | #Nx.Tensor< 374 | f32[784][10] 375 | [ 376 | [1.182692050933838, 1.6625404357910156, -0.598689079284668, -0.6435468196868896, 0.25204139947891235, -1.1432150602340698, -0.9701210260391235, 1.9566036462783813, -0.6923237442970276, -1.0753910541534424], 377 | [0.17891690135002136, 0.42717286944389343, -0.9910821914672852, -2.649228096008301, 0.13641099631786346, 0.48691749572753906, -1.0575640201568604, 0.40385302901268005, 0.5131964683532715, 0.41488444805145264], 378 | [2.100423574447632, -1.2787413597106934, -1.8883213996887207, -0.49423742294311523, 0.5708040595054626, -0.48230457305908203, -0.19617703557014465, 0.7797456979751587, 0.7876895070075989, -0.33916765451431274], 379 | [-0.4369395673274994, 0.4421914517879486, 0.18007169663906097, 0.7891340255737305, 0.28369951248168945, -1.2312926054000854, -0.17864377796649933, -1.2232452630996704, 0.6976354718208313, 1.300831913948059], 380 | [-1.9821809530258179, 1.426361083984375, -2.2645328044891357, 0.26135173439979553, -0.36276111006736755, 2.7461342811584473, 0.007044021971523762, -0.18955571949481964, 0.6062670946121216, -0.4373891055583954], 381 | ... 382 | ] 383 | > 384 | ``` 385 | -------------------------------------------------------------------------------- /01i_matmul_Exla_cpu.livemd: -------------------------------------------------------------------------------- 1 | # Matrix multiplication on CPU- XLA 2 | 3 | ```elixir 4 | Mix.install( 5 | [ 6 | {:nx, "~> 0.4.0"}, 7 | {:scidata, "~> 0.1.9"}, 8 | {:axon, "~> 0.3.0"}, 9 | {:exla, "~> 0.4"} 10 | ] 11 | ) 12 | ``` 13 | 14 | 15 | 16 | ``` 17 | Resolving Hex dependencies... 
18 | Dependency resolution completed: 19 | New: 20 | axon 0.3.0 21 | castore 0.1.18 22 | complex 0.4.2 23 | elixir_make 0.6.3 24 | exla 0.4.0 25 | jason 1.4.0 26 | nimble_csv 1.2.0 27 | nx 0.4.0 28 | scidata 0.1.9 29 | xla 0.3.0 30 | * Getting nx (Hex package) 31 | * Getting scidata (Hex package) 32 | * Getting axon (Hex package) 33 | * Getting exla (Hex package) 34 | * Getting elixir_make (Hex package) 35 | * Getting xla (Hex package) 36 | * Getting castore (Hex package) 37 | * Getting jason (Hex package) 38 | * Getting nimble_csv (Hex package) 39 | * Getting complex (Hex package) 40 | ==> jason 41 | Compiling 10 files (.ex) 42 | Generated jason app 43 | ==> nimble_csv 44 | Compiling 1 file (.ex) 45 | Generated nimble_csv app 46 | ==> complex 47 | Compiling 2 files (.ex) 48 | Generated complex app 49 | ==> nx 50 | Compiling 27 files (.ex) 51 | Generated nx app 52 | ==> axon 53 | Compiling 24 files (.ex) 54 | Generated axon app 55 | ==> elixir_make 56 | Compiling 1 file (.ex) 57 | Generated elixir_make app 58 | ==> xla 59 | Compiling 2 files (.ex) 60 | Generated xla app 61 | ==> exla 62 | Unpacking /home/ml3/.cache/xla/0.3.0/cache/download/xla_extension-x86_64-linux-cpu.tar.gz into /home/ml3/.cache/mix/installs/elixir-1.14.1-erts-13.1/45e4038ac8aacd103fe2688496702add/deps/exla/cache 63 | g++ -fPIC -I/home/ml3/.asdf/installs/erlang/25.1/erts-13.1/include -Icache/xla_extension/include -O3 -Wall -Wno-sign-compare -Wno-unused-parameter -Wno-missing-field-initializers -Wno-comment -shared -std=c++14 c_src/exla/exla.cc c_src/exla/exla_nif_util.cc c_src/exla/exla_client.cc -o cache/libexla.so -Lcache/xla_extension/lib -lxla_extension -Wl,-rpath,'$ORIGIN/lib' 64 | Compiling 21 files (.ex) 65 | Generated exla app 66 | ==> castore 67 | Compiling 1 file (.ex) 68 | Generated castore app 69 | ==> scidata 70 | Compiling 13 files (.ex) 71 | Generated scidata app 72 | ``` 73 | 74 | 75 | 76 | ``` 77 | :ok 78 | ``` 79 | 80 | ## Before running notebook 81 | 82 | This notebook has a dependency on EXLA. XLA supports systems with direct access to an NVidia GPU, AMD ROCm or a Google TPU. According to the documentation, https://github.com/elixir-nx/nx/tree/main/exla#readme EXLA will try to find a precompiled version that matches your system. If it doesn't find a match. you will need to install CUDA and CuDNN for your system. 83 | 84 | The notebook is currently configured for Nvidia GPU via 85 | 86 | ``` 87 | system_env: %{"XLA_TARGET" => "cuda111"} 88 | ``` 89 | 90 | Review the configuration documentation for more options. https://hexdocs.pm/exla/EXLA.html#module-configuration 91 | 92 | We had to install CUDA and CuDNN but that was several months ago. Your experience may vary from ours. 93 | 94 | ## Context 95 | 96 | This Livebook is a transformation of a Python Jupyter Notebook from Fast.ai's From Deep Learning Foundations to Stable Diffusion, Practical Deep Learning for Coders part 2, 2022. Specifically, it mimics the CUDA portion of https://github.com/fastai/course22p2/blob/master/nbs/01_matmul.ipynb 97 | 98 | The purpose of the transformation is to bring the Fast.ai concepts to Elixir focused developers. The object-oriented Python/PyTorch implementation is transformed into a functional programming implementation using Nx and Axon 99 | 100 | ## Experimenting with backend control 101 | 102 | In this notebook, we are going to experiment with swapping out backends in the same notebook. One of the strengths of Elixir's numerical processing approach is the concept of a backend. 
The same Nx code can run on several different backends. This allows Nx to adapt to changes in numerical libaries and technology. Currently, Nx has support for Tensorflow's XLA and PyTorch's TorchScript. Theoretically, backends for SOC type devices should be possible. 103 | 104 | We chose not to set the backend globally throughout the notebook. At the beginning of the notebook we'll repeat the approach we used in 01a_matmul_using_CPU. We begin with the Elixir Binary backend. You'll see that it isn't quick multiplying 10,000 rows of MNIST data by some arbitrary weights. We'll then repeat the same multiplication using an NVidia 1080Ti GPU. The 1080 Ti is not the fastest GPU, but it is tremendously faster than a "large" set of data on the BinaryBackend. 105 | 106 | * 31649.26 milliseconds using BinaryBackend with a CPU only. 107 | * 0.14 milliseconds using XLA with a warmed up GPU 108 | 109 | *226,000 times faster on an old GPU* 110 | 111 | ## Backends 112 | 113 | ```elixir 114 | # Without choosing a backend, Nx defaults to Nx.BinaryBackend 115 | Nx.default_backend() 116 | ``` 117 | 118 | 119 | 120 | ``` 121 | {Nx.BinaryBackend, []} 122 | ``` 123 | 124 | Let's change to EXLA with CPU 125 | 126 | ```elixir 127 | Nx.default_backend({EXLA.Backend, device: :host}) 128 | Nx.default_backend() 129 | ``` 130 | 131 | 132 | 133 | ``` 134 | {EXLA.Backend, [device: :host]} 135 | ``` 136 | 137 | We'll pull down the MNIST data 138 | 139 | ```elixir 140 | {train_images, train_labels} = Scidata.MNIST.download() 141 | ``` 142 | 143 | 144 | 145 | ``` 146 | {{<<0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 147 | 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...>>, {:u, 8}, {60000, 1, 28, 28}}, 148 | {<<5, 0, 4, 1, 9, 2, 1, 3, 1, 4, 3, 5, 3, 6, 1, 7, 2, 8, 6, 9, 4, 0, 9, 1, 1, 2, 4, 3, 2, 7, 3, 8, 149 | 6, 9, 0, 5, 6, 0, 7, 6, 1, 8, 7, 9, 3, 9, 8, ...>>, {:u, 8}, {60000}}} 150 | ``` 151 | 152 | ```elixir 153 | {train_images_binary, train_tensor_type, train_shape} = train_images 154 | ``` 155 | 156 | 157 | 158 | ``` 159 | {<<0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 160 | 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...>>, {:u, 8}, {60000, 1, 28, 28}} 161 | ``` 162 | 163 | ```elixir 164 | train_tensor_type 165 | ``` 166 | 167 | 168 | 169 | ``` 170 | {:u, 8} 171 | ``` 172 | 173 | Convert into Tensors and normalize to between 0 and 1 174 | 175 | ```elixir 176 | train_tensors = 177 | train_images_binary 178 | |> Nx.from_binary(train_tensor_type) 179 | |> Nx.reshape({60000, 28 * 28}) 180 | |> Nx.divide(255) 181 | ``` 182 | 183 | 184 | 185 | ``` 186 | 187 | 18:50:30.293 [info] XLA service 0x7fe6d40e2330 initialized for platform Host (this does not guarantee that XLA will be used). Devices: 188 | 189 | 18:50:30.295 [info] StreamExecutor device (0): Host, Default Version 190 | 191 | ``` 192 | 193 | 194 | 195 | ``` 196 | #Nx.Tensor< 197 | f32[60000][784] 198 | EXLA.Backend 199 | [ 200 | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...], 201 | ... 202 | ] 203 | > 204 | ``` 205 | 206 | We'll separate the data into 50,000 train images and 10,000 validation images. 
207 | 208 | ```elixir 209 | x_train_cpu = train_tensors[0..49_999] 210 | x_valid_cpu = train_tensors[50_000..59_999] 211 | {x_train_cpu.shape, x_valid_cpu.shape} 212 | ``` 213 | 214 | 215 | 216 | ``` 217 | {{50000, 784}, {10000, 784}} 218 | ``` 219 | 220 | Training is more stable when random numbers are initialized with a mean of 0.0 and a variance of 1.0 221 | 222 | ```elixir 223 | mean = 0.0 224 | variance = 1.0 225 | weights_cpu = Nx.random_normal({784, 10}, mean, variance, type: {:f, 32}) 226 | ``` 227 | 228 | 229 | 230 | ``` 231 | #Nx.Tensor< 232 | f32[784][10] 233 | EXLA.Backend 234 | [ 235 | [-0.973583996295929, 1.3404284715652466, 0.5889155268669128, -0.06439179182052612, -2.2255215644836426, -0.3939111828804016, -1.5497547388076782, -1.1714494228363037, 1.0855729579925537, -0.4689534306526184], 236 | [-0.31778475642204285, 0.07520100474357605, 0.053238045424222946, 0.42360711097717285, -2.253004312515259, -0.3818463981151581, -0.5468025803565979, 1.3460612297058105, 1.509813904762268, 0.10178464651107788], 237 | [2.7212319374084473, -0.6341637969017029, 1.9983967542648315, 0.4862823486328125, 0.951216459274292, -0.8570582270622253, 1.7834625244140625, -0.1596108078956604, -0.369051992893219, 0.7038326263427734], 238 | [-1.321571946144104, -0.573075532913208, -0.5281657576560974, -1.528030276298523, 0.5641341209411621, -0.13296610116958618, -0.20917919278144836, -0.5405102372169495, 0.13647650182247162, 1.0692965984344482], 239 | [1.1940683126449585, -1.0889204740524292, 0.26889121532440186, -0.8505605459213257, 0.31284958124160767, 0.8289848566055298, 0.23549814522266388, 0.5921769738197327, 0.506867527961731, 0.6787563562393188], 240 | ... 241 | ] 242 | > 243 | ``` 244 | 245 | In order to simplify timing the performance of the Nx.dot/2 function, we'll use an 0 parameter anonymous function. Invoking the anonymous function will always use the two parameters, x_valid_cpu and weights_cpu. 246 | 247 | ```elixir 248 | large_nx_mult_fn = fn -> Nx.dot(x_valid_cpu, weights_cpu) end 249 | ``` 250 | 251 | 252 | 253 | ``` 254 | #Function<43.3316493/0 in :erl_eval.expr/6> 255 | ``` 256 | 257 | The following anonymous function takes function and the number of times to make the call to the function. 258 | 259 | ```elixir 260 | repeat = fn timed_fn, times -> Enum.each(1..times, fn _x -> timed_fn.() end) end 261 | ``` 262 | 263 | 264 | 265 | ``` 266 | #Function<41.3316493/2 in :erl_eval.expr/6> 267 | ``` 268 | 269 | Timing the average duration of the dot multiply function to run. 
The cell will output the average and total elapsed time 270 | 271 | ```elixir 272 | repeat_times = 5 273 | {elapsed_time_micro, _} = :timer.tc(repeat, [large_nx_mult_fn, repeat_times]) 274 | avg_elapsed_time_ms = elapsed_time_micro / 1000 / repeat_times 275 | 276 | {backend, device} = Nx.default_backend() 277 | 278 | "#{backend} CPU avg time in #{avg_elapsed_time_ms} milliseconds, total_time #{elapsed_time_micro / 1000} milliseconds" 279 | ``` 280 | 281 | 282 | 283 | ``` 284 | "Elixir.EXLA.Backend CPU avg time in 1.2837999999999998 milliseconds, total_time 6.419 milliseconds" 285 | ``` 286 | -------------------------------------------------------------------------------- /ElixirFashionML_Challenge/fashion_mnist_challenge.livemd: -------------------------------------------------------------------------------- 1 | # Classifying Simple Fashion Types - State of the Art (SOTA) Challenge 2 | 3 | ```elixir 4 | Mix.install([ 5 | {:axon, "~> 0.5.1"}, 6 | {:exla, "~> 0.5.3"}, 7 | {:req, "~> 0.3.10"}, 8 | {:scidata, "~> 0.1.10"} 9 | ]) 10 | ``` 11 | 12 | ## Introduction 13 | 14 | This livebook is inspired by the Classifying handwritten digits notebook in the Axon documentation. FashionMNIST was designed as a drop-in replacment for the MNIST dataset. Instead of digits, there a grey scale images of clothing types. Like MNIST, there are 10 kinds of images. FashionMNIST was designed as a harder problem than the digits dataset. You can check the difficulty by running this notebook for 10 epochs. Notice the training accuracy will be lower than the corresponding MNIST notebook when using the exact same model and epochs. 15 | 16 | ## State of the Art 17 | 18 | In a December tweet, Jeremy Howard created a challenge for the machine learning community. Can anyone beat his accuracy in 5, 20 or 50 epochs. The challenge's epoch accuracy approach is open to the community and inclusive because the compute requirements are broader. It doesn't matter whether you are running on an NVidia 1060, 4080, or some GPU in the cloud. In fact, because the problem is small enough, you can even use your CPU and patience. A CPU cloud resource can be used on a free Huggingface Space or Fly.io. If you only have a CPU, be sure to use the EXLA or TorchX backends because they are faster than the pure Elixir default. 19 | 20 | 21 | 22 | ![](images/fashion-MNIST_Challenge.png) 23 | 24 | 25 | 26 | One implied rule that isn't written in Jeremy's challenge, the model must be trained using only the original FashionMNIST training dataset. Participants can't add any extra images to the training set. For example, you can't use generative AI to create new fashion training data images. 27 | 28 | 29 | 30 | Leaderboard (Accuracy) on 12/15/2022 31 | 32 | * 5 Epochs - 92.7% 33 | * 20 Epochs - 93.2% 34 | 35 | 36 | 37 | Using Axon, we should be able to match those mid December numbers. The techniques that Jeremy used can be built in the Nx family of libraries. The foundations for the necessary tools and techniques are in the Axon, Nx, Kino, and NxImage libraries. Going through training resources, and hints I'll provide, should allow participants to improve the score. Try implementing one techique and share your results. If you improve the accuracy, I'll add you to the leaderboard. I'll also keep track of everyone who has been on the leaderboard. 38 | 39 | By competing with each other and sharing, we'll all learn the best techniques for building a State of the Art model in Elixir. 
Also, I strongly recommend sharing techniques that you try that don't improve the leaderboard. If you try something, you learn something. When you share, everyone learns something. 40 | 41 | If we can match the numbers, then we might be able to get close to the current [leaderboard](https://forums.fast.ai/t/a-challenge-for-you-all/102656). But let's try the 12/15 leaderboard first. 42 | 43 | ## Hyperparameters 44 | 45 | Hyperparameters in machine learning are choices the developer makes that shape the training of a model. However, what model to use is one of those choices but it isn't a simple hyperparameter. Let's create a Map with our simple parameter choices. It should make it easier to see some key choices that we are making. We can then reference the choices later in our notebook. When you add a new technique, you are probably going to make some hyperparameter choices. Please add your choices to this datastructure. When we get further along, I plan upon sharing the reasoning for a separate hyperparameter data structure. 46 | 47 | ```elixir 48 | hyperparams = %{ 49 | epochs: 5, 50 | batch_size: 32 51 | } 52 | ``` 53 | 54 | ## Retrieving and exploring the dataset 55 | 56 | The Fashion MNIST dataset is available for free online. The Elixir SciData library provides an easy technique to access the training and test datasets. 57 | 58 | ```elixir 59 | {train_images, train_labels} = Scidata.FashionMNIST.download() 60 | ``` 61 | 62 | ```elixir 63 | # Normalize and batch images 64 | {images_binary, images_type, images_shape} = train_images 65 | 66 | batched_images = 67 | images_binary 68 | |> Nx.from_binary(images_type) 69 | |> Nx.reshape(images_shape) 70 | |> Nx.divide(255) 71 | |> Nx.to_batched(hyperparams[:batch_size]) 72 | ``` 73 | 74 | ```elixir 75 | # One-hot-encode and batch labels 76 | {labels_binary, labels_type, _shape} = train_labels 77 | 78 | batched_labels = 79 | labels_binary 80 | |> Nx.from_binary(labels_type) 81 | |> Nx.new_axis(-1) 82 | |> Nx.equal(Nx.tensor(Enum.to_list(0..9))) 83 | |> Nx.to_batched(hyperparams[:batch_size]) 84 | ``` 85 | 86 | ## Defining the model 87 | 88 | We'll use the same model from the MNIST example. By starting with an extremely simple model, I've left room for challenge participants to try different models. Remember, the models have to start with random weights. Pre-trained models can't be used on the leaderboard. However, you can learn from trying a pre-trained model. Check out Sean's Machine Learning for Elixir book for an example. 89 | 90 | ```elixir 91 | model = 92 | Axon.input("input", shape: {nil, 1, 28, 28}) 93 | |> Axon.flatten() 94 | |> Axon.dense(128, activation: :relu) 95 | |> Axon.dense(10, activation: :softmax) 96 | ``` 97 | 98 | All `Axon` models start with an input layer to tell subsequent layers what shapes to expect. We then use `Axon.flatten/2` which flattens the previous layer by squeezing all dimensions but the first dimension into a single dimension. Our model consists of 2 fully connected layers with 128 and 10 units respectively. The first layer uses `:relu` activation which returns `max(0, input)` element-wise. The final layer uses `:softmax` activation to return a probability distribution over the 10 labels. 99 | 100 | ## Training 101 | 102 | In Axon we express the task of training using a declarative loop API. First, we need to specify a loss function and optimizer, there are many built-in variants to choose from. In this example, we'll use *categorical cross-entropy* and the *Adam* optimizer. 
We will also keep track of the *accuracy* metric. Finally, we run training loop passing our batched images and labels. We'll train for 10 epochs using the `EXLA` compiler. 103 | 104 | 105 | 106 | Based upon the results of PyTorch challenge from last winter, every leaderboard change overtook the others for all 3 epoch levels. Five epochs is enough to experiment with different model and training approaches. If 5 epochs is more accurate than the current leaderboard, then try the 20 and 50 epochs for completeness 107 | 108 | ```elixir 109 | trained_model_params = 110 | model 111 | |> Axon.Loop.trainer(:categorical_cross_entropy, :adam) 112 | |> Axon.Loop.metric(:accuracy, "Accuracy") 113 | |> Axon.Loop.run(Stream.zip(batched_images, batched_labels), %{}, 114 | epochs: hyperparams[:epochs], 115 | compiler: EXLA 116 | ) 117 | ``` 118 | 119 | ## Comparison with the test data leaderboard 120 | 121 | Now that we have the trained model parameters from the training effort, we can use them for calculating test data accuracy. 122 | 123 | Let's get the test data. 124 | 125 | ```elixir 126 | {test_images, test_labels} = Scidata.FashionMNIST.download_test() 127 | ``` 128 | 129 | ```elixir 130 | {test_images_binary, _test_images_type, test_images_shape} = test_images 131 | 132 | test_batched_images = 133 | test_images_binary 134 | |> Nx.from_binary(images_type) 135 | |> Nx.reshape(test_images_shape) 136 | |> Nx.divide(255) 137 | |> Nx.to_batched(hyperparams[:batch_size]) 138 | ``` 139 | 140 | ```elixir 141 | # One-hot-encode and batch labels 142 | {test_labels_binary, _test_labels_type, _shape} = test_labels 143 | 144 | test_batched_labels = 145 | test_labels_binary 146 | |> Nx.from_binary(labels_type) 147 | |> Nx.new_axis(-1) 148 | |> Nx.equal(Nx.tensor(Enum.to_list(0..9))) 149 | |> Nx.to_batched(hyperparams[:batch_size]) 150 | ``` 151 | 152 | Instead of Axon.predict, we'll use Axon.loop.evaluator with an accuracy metric. 153 | 154 | ```elixir 155 | Axon.Loop.evaluator(model) 156 | |> Axon.Loop.metric(:accuracy, "Accuracy") 157 | |> Axon.Loop.run( 158 | Stream.zip(test_batched_images, test_batched_labels), 159 | trained_model_params, 160 | compiler: EXLA 161 | ) 162 | ``` 163 | 164 | ## Challenge: #ElixirFashionML 165 | 166 | '#ElixirFashionMLChallenge Leaderboard (Accuracy) on 7/30/2023 167 | 168 | * 5 Epochs - 87.4% 169 | * 20 Epochs - 87.7% 170 | * 50 Epochs - 87.8% 171 | 172 | 173 | 174 | We have an 5 epoch accuracy of 87.4% vs Jeremy's 12/15 accuracy of 92.7%. That should plenty of opportunities for the community to leap to the top of the leaderboard 175 | 176 | ## How can you beat this initial result? 177 | 178 | I'll provide a quick set of resources and expand upon important resources at a later time. For now, start reading, try various Livebook notebooks, and watch some videos. 179 | 180 | ## Resources 181 | 182 | We highly recommend purchasing Sean Moriarity's book, [Machine Learning in Elixir](https://pragprog.com/titles/smelixir/machine-learning-in-elixir/). He and Jose' started the Elixir numerical compute capability. The book explains many important concepts about training models in Elixir. 183 | 184 | 185 | 186 | Nicolo` G created a batch of Livebook notebooks that translated Python book examples into Nx. The notebooks can be found on his [Github account](https://github.com/nickgnd/programming-machine-learning-livebooks) 187 | 188 | 189 | 190 | The techniques to achieve the SOTA are taught in the Fast.ai [Part 2 course](https://course.fast.ai/Lessons/part2.html). 
There are three parts of the course: Stable Diffusion, Deep Learning Foundations, and Stable Diffusion from scratch. Deep Learning Foundations focuses on the skills for this challenge and is covered in the second half of Lesson 10 through the first half of Lesson 19, about 18 hours of video on the PyTorch implementation needed to reach the SOTA numbers. I have some Livebook notebooks for [Lesson 10-11](https://github.com/meanderingstream/dl_foundations_in_elixir).
191 | 
192 | I struggled for a while with translating the object-oriented concepts into a similar approach in Elixir before I decided that the object-oriented abstractions probably weren't worth translating. The calculations and tools are important though. Axon and Kino have elements that can provide some of the same ease of use as Fast.ai. Kino can be used, with Axon, to create the visualizations and tools that are created in the course. Axon and NxImage have elements that can be combined to create other capabilities taught in the course. I'll have more thoughts and hints to share soon.
193 | 
194 | ## Why is it important for Elixir folks to try to beat Jeremy's 12/15 SOTA values
195 | 
196 | By implementing the techniques from Fast.ai's Lessons 10-18, we will be learning how to train a very accurate model at lower compute cost. When a business is trying to use a model in production, it normally wants the best-performing model that fits the problem constraints. By learning techniques to improve model performance while also reducing the compute training requirements, we help reduce costs and have a better chance of meeting business goals.
197 | 
198 | 
199 | 
200 | While it may seem that FashionMNIST is a simple problem, all of the techniques used to reach SOTA were originally combined in the [2018 DawnBench competition](https://www.fast.ai/posts/2018-04-30-dawnbench-fastai.html). Fast.ai students in a study group teamed up to compete against well-funded companies and came in second place (ImageNet) and first place (CIFAR). Unlike this challenge, DawnBench was a time-based competition.
201 | 
--------------------------------------------------------------------------------
/ElixirFashionML_Challenge/fashion_mnist_sean_m.livemd:
--------------------------------------------------------------------------------
1 | # Classifying Simple Fashion Types - Sean Moriarity
2 | 
3 | ```elixir
4 | Mix.install([
5 |   {:axon, "~> 0.5.1"},
6 |   {:exla, "~> 0.5.3"},
7 |   {:req, "~> 0.3.10"},
8 |   {:scidata, "~> 0.1.10"}
9 | ])
10 | ```
11 | 
12 | ## Elixir FashionMNIST Challenge
13 | 
14 | I challenged the Elixir community to an Elixir Fashion MNIST challenge, https://alongtheaxon.com/blog/fashion_mnist_challenge. The idea was derived from a Twitter post by Jeremy Howard. Jeremy was teaching his Deep Learning from the Foundations 2022 course. He used the Fashion MNIST dataset as an accessible deep learning problem. By using accuracy, anyone with patience and a CPU could try to compete for a better accuracy measure. Having a GPU would be faster, but using the CPU works pretty well.
15 | 
16 | ## Sean Moriarity's Blog Post
17 | 
18 | Sean created an excellent Dockyard blog post inviting Elixir folks to join the [Elixir FashionMNIST Challenge](https://dockyard.com/blog/2023/08/08/join-the-elixir-fashionmnist-challenge). In his post, he presented his initial approach to improving upon my initial baseline. He achieved a 90.7% accuracy on the test set. However, he didn't link to a Livebook implementation.
This notebook recreates his blog post as a Livebook notebook that you can use to repeat his results. I also tried the other two epoch counts and added Sean to the leaderboard.

## Hyperparameters

Hyperparameters in machine learning are choices the developer makes that shape the training of a model. Which model to use is also one of those choices, but it isn't a simple hyperparameter. Let's create a map with our simple parameter choices so the key training decisions are easy to see. We can then reference the choices later in the notebook.

```elixir
hyperparams = %{
  epochs: 5,
  batch_size: 32
}
```

## Retrieving and exploring the dataset

The Fashion MNIST dataset is available for free online. The Elixir Scidata library provides an easy way to access the training and test datasets.

```elixir
{train_images, train_labels} = Scidata.FashionMNIST.download()
```

```elixir
# Normalize training images
{images_binary, images_type, images_shape} = train_images

train_images =
  images_binary
  |> Nx.from_binary(images_type)
  |> Nx.reshape({:auto, 28, 28, 1})
  |> Nx.divide(255)
```

```elixir
# One-hot-encode labels
{labels_binary, labels_type, _shape} = train_labels

train_labels =
  labels_binary
  |> Nx.from_binary(labels_type)
  |> Nx.new_axis(-1)
  |> Nx.equal(Nx.tensor(Enum.to_list(0..9)))
```

## Pipeline

The next cell was one of the key improvements Sean made. The pipeline concept provides a place to dynamically modify the original dataset. Augmenting the input data improves the effective quality of the dataset without having to capture more examples. When working on a machine learning problem, the business team often asks how much data is required. There isn't a magic number or rule of thumb for how much data is needed. However, useful augmentation can improve the data at a low acquisition cost.

```elixir
seed = 42

{batched_images, _} =
  train_images
  |> Nx.to_batched(hyperparams[:batch_size])
  |> Enum.map_reduce(Nx.Random.key(seed), fn batch, key ->
    # Roughly half of the batches are replaced with a copy flipped along axis 1.
    fun =
      Nx.Defn.jit(
        fn regular, key ->
          {mask, key} = Nx.Random.uniform(key)
          flipped = Nx.reverse(regular, axes: [1])
          augmented = Nx.select(Nx.greater(mask, 0.5), flipped, regular)
          {augmented, key}
        end,
        compiler: EXLA
      )

    fun.(batch, key)
  end)

batched_labels =
  train_labels
  |> Nx.to_batched(hyperparams[:batch_size])
  |> Enum.to_list()
```

## Defining the model

Sean created a small, custom convolutional neural network (CNN): two convolutional blocks, each followed by max pooling, then two dense layers. This is an incremental improvement over the initial model. You can experiment with other model approaches to improve the score.

```elixir
model =
  Axon.input("features")
  |> Axon.conv(32, kernel_size: {3, 3}, padding: :same, activation: :relu)
  |> Axon.max_pool(kernel_size: 2)
  |> Axon.conv(64, kernel_size: {3, 3}, padding: :same, activation: :relu)
  |> Axon.max_pool(kernel_size: 2)
  |> Axon.flatten()
  |> Axon.dense(128, activation: :relu)
  |> Axon.dense(10, activation: :softmax)
```
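
If you're curious how small this network actually is, the sketch below (not part of Sean's post) initializes the model from a template input with `Axon.build/1` and sums the sizes of the parameter tensors. It assumes the Axon 0.5 plain-map parameter format and the `model` variable defined above.

```elixir
# Sketch only: initialize parameters from a template input and count them.
{init_fn, _predict_fn} = Axon.build(model)

params = init_fn.(%{"features" => Nx.template({1, 28, 28, 1}, :f32)}, %{})

# Sum the number of elements in every parameter tensor of every layer.
params
|> Enum.flat_map(fn {_layer_name, layer_params} -> Map.values(layer_params) end)
|> Enum.map(&Nx.size/1)
|> Enum.sum()
```

The total should be on the order of a few hundred thousand parameters, which is part of what makes this challenge practical to train on a CPU.
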
## Training

Sean also improved the training process. He added a learning rate scheduler instead of the constant default learning rate from the Adam optimizer. Learning rate schedules are a fantastic area for experimentation. Sean used 1,850 transition steps, which is just under one epoch at a batch size of 32 (60,000 / 32 = 1,875 batches per epoch).

What happens when the decay rate is changed? What is the best nominal learning rate? What happens when the transition steps are increased or decreased? Is decay optimal, or should the learning rate increase instead? What about a one-cycle approach? These are all hyperparameter decisions that can be varied to improve the model's test set accuracy.

Based upon the results of the PyTorch challenge from last winter, every leaderboard improvement overtook the previous entries at all three epoch levels. Five epochs is enough to experiment with different model and training approaches. If your 5 epoch result is more accurate than the current leaderboard, then try 20 and 50 epochs for completeness. Sean only provided a 5 epoch accuracy, and his 90.7% improves upon my initial notebook. So let's also run the 20 epoch and 50 epoch levels and credit Sean with all three numbers.

```elixir
training_seed = 42
learning_rate = 1.0e-3

# Start at 5.0e-3 and halve the learning rate every 1,850 steps, roughly once per epoch.
schedule =
  Axon.Schedules.exponential_decay(
    5.0e-3,
    transition_steps: 1850,
    decay_rate: 0.5
  )

optimizer = Axon.Optimizers.adam(schedule)

trained_model_state =
  model
  |> Axon.Loop.trainer(:categorical_cross_entropy, optimizer)
  |> Axon.Loop.metric(:accuracy)
  |> Axon.Loop.run(Stream.zip(batched_images, batched_labels), %{},
    epochs: hyperparams[:epochs],
    compiler: EXLA,
    seed: training_seed
  )
```

## Comparison with the test data leaderboard

Now that we have the trained model parameters from the training effort, we can use them to calculate the test data accuracy.

Let's get the test data.

```elixir
{test_images, test_labels} = Scidata.FashionMNIST.download_test()
```

```elixir
{test_images_binary, _, _} = test_images

test_images =
  test_images_binary
  |> Nx.from_binary(images_type)
  |> Nx.reshape({:auto, 28, 28, 1})
  |> Nx.divide(255)

{test_labels_binary, _, _} = test_labels

test_labels =
  test_labels_binary
  |> Nx.from_binary(labels_type)
  |> Nx.new_axis(-1)
  |> Nx.equal(Nx.tensor(Enum.to_list(0..9)))
```

Instead of `Axon.predict`, we'll use `Axon.Loop.evaluator` with an accuracy metric.

```elixir
test_batched_images = Nx.to_batched(test_images, hyperparams[:batch_size])
test_batched_labels = Nx.to_batched(test_labels, hyperparams[:batch_size])

model
|> Axon.Loop.evaluator()
|> Axon.Loop.metric(:accuracy)
|> Axon.Loop.run(Stream.zip(test_batched_images, test_batched_labels), trained_model_state,
  compiler: EXLA
)
```

## Challenge: #ElixirFashionML

Sean's #ElixirFashionMLChallenge Leaderboard (Accuracy) on 8/13/2023

* 5 Epochs - 90.7%
* 20 Epochs - 91.1%
* 50 Epochs - 90.9%

We have a 5 epoch accuracy of 90.7% vs Jeremy's 12/15 accuracy of 92.7%. That still leaves plenty of opportunities for the community to leap to the top of the leaderboard.
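
If you want to hunt for the remaining percentage points, one starting point (a sketch, not part of Sean's post) is to look at individual predictions with `Axon.predict/3` and see which test images the trained model still gets wrong. It reuses the `model`, `trained_model_state`, and batched test tensors defined above.

```elixir
# Sketch only: run the trained model on a single test batch and flag the
# images it classifies incorrectly.
image_batch = Enum.at(test_batched_images, 0)
label_batch = Enum.at(test_batched_labels, 0)

predicted_classes =
  model
  |> Axon.predict(trained_model_state, %{"features" => image_batch})
  |> Nx.argmax(axis: -1)

actual_classes = Nx.argmax(label_batch, axis: -1)

# A 1 marks an image in this batch that the model got wrong.
Nx.not_equal(predicted_classes, actual_classes)
```

From there you could slice out the misclassified images, render them with Kino, or tally the errors per class to decide where augmentation or model changes might help most.
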
## Resources

We highly recommend purchasing Sean Moriarity's book, [Machine Learning in Elixir](https://pragprog.com/titles/smelixir/machine-learning-in-elixir/). He and José Valim started the Elixir numerical computing capability. The book explains many important concepts about training models in Elixir.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

# Deep Learning from the Foundations with Elixir

This repository is a transformation of the 2022 version of Fast.ai's Deep Learning for Coders Part 2. These code notebooks follow the notebooks from the [first portion](https://github.com/fastai/course22p2) of the course. Please watch Jeremy Howard's course videos for the complete context. In the course, Jeremy starts with some constraints. His approach is to first implement a concept in standard Python code. Once he has introduced a concept, the notebooks start to bring in the PyTorch library code that also implements it. He tries to incrementally demystify the PyTorch and Fast.ai library code.

Similar to the course, these notebooks start from standard Elixir code and then bring in the Nx and Axon libraries. We'll use Elixir and [Livebook.dev interactive & collaborative code notebooks](https://livebook.dev/). A requirement for these notebooks is a running Livebook application or server. Livebook runs on Windows and Mac desktops and on Linux. Please see the Livebook web site for instructions on installing the basic Livebook application. For our purposes, we run Livebook on a local Linux server using escript. For more information on running Livebook with escript, please see the README.md at https://github.com/livebook-dev/livebook.

We'll be building a Livebook notebook for every Fast.ai Foundations Jupyter notebook. We welcome pull requests that improve our notebooks.

--------------------------------------------------------------------------------