├── 02_first ├── pizza.txt └── supervised_pizzas.livemd ├── 03_gradient ├── gradient_descend.livemd └── pizza.txt ├── 04_hyperspace ├── multiple_regression.livemd ├── pizza_2_vars.txt └── pizza_3_vars.txt ├── 05_discerning ├── classifier.livemd └── police.txt ├── 06_real └── digit_classifier.livemd ├── 07_final ├── multiclass_classifier.livemd ├── seed_results.txt ├── sonar_classifier.livemd └── sonar_seed_comparison.ex ├── 10_building ├── forward_propagation.livemd └── weights.json ├── 11_training └── neural_network.livemd ├── 12_classifiers ├── 12_classifiers_01.livemd ├── 12_classifiers_02.livemd ├── circles.txt ├── circles_data.livemd ├── linearly_separable.txt └── non_linearly_separable.txt ├── 13_batching ├── 13_batching.livemd └── images │ └── visualization.png ├── 14_testing ├── 14_testing.livemd └── images │ ├── MNIST_2_sets.png │ └── MNIST_3_sets.png ├── 15_development └── 15_development.livemd ├── 16_deeper ├── 16_deeper.livemd ├── mnist_with_axon.livemd └── model1_params.term ├── 17_overfitting └── 17_overfitting.livemd ├── 18_taming ├── beyond_the_sigmoid.livemd └── ten_epochs_challenge.livemd ├── 19_beyond └── beyond_vanilla_networks.livemd ├── LICENSE ├── README.md ├── data ├── echidna.txt ├── mnist │ ├── readme.txt │ ├── t10k-images-idx3-ubyte.gz │ ├── t10k-labels-idx1-ubyte.gz │ ├── train-images-idx3-ubyte.gz │ └── train-labels-idx1-ubyte.gz └── sonar │ ├── sonar.all-data │ └── sonar.names ├── docker-compose.yml └── images └── livebooks_home.png /02_first/pizza.txt: -------------------------------------------------------------------------------- 1 | Reservations Pizzas 2 | 13 33 3 | 2 16 4 | 14 32 5 | 23 51 6 | 13 27 7 | 1 16 8 | 18 34 9 | 10 17 10 | 26 29 11 | 3 15 12 | 3 15 13 | 21 32 14 | 7 22 15 | 22 37 16 | 2 13 17 | 27 44 18 | 6 16 19 | 10 21 20 | 18 37 21 | 15 30 22 | 9 26 23 | 26 34 24 | 8 23 25 | 15 39 26 | 10 27 27 | 21 37 28 | 5 17 29 | 6 18 30 | 13 25 31 | 13 23 32 | -------------------------------------------------------------------------------- /02_first/supervised_pizzas.livemd: -------------------------------------------------------------------------------- 1 | # Chapter 2: Your First Learning Program 2 | 3 | ```elixir 4 | Mix.install([ 5 | {:vega_lite, "~> 0.1.6"}, 6 | {:kino, "~> 0.8.1"}, 7 | {:kino_vega_lite, "~> 0.1.7"} 8 | ]) 9 | ``` 10 | 11 | ## Pizza and Correlation 12 | 13 | ### Read the data 14 | 15 | ```elixir 16 | file = 17 | __DIR__ 18 | |> Path.join("pizza.txt") 19 | |> Path.expand() 20 | 21 | # Read the data from the file, remove the header and return 22 | # `[%{reservations: integer(), pizzas: integer()}]` 23 | data = 24 | file 25 | |> File.read!() 26 | |> String.split("\n", trim: true) 27 | |> Enum.slice(1..-1) 28 | |> Enum.map(&String.split(&1, ~r{\s+}, trim: true)) 29 | |> Enum.map(fn [r, p] -> 30 | %{reservations: String.to_integer(r), pizzas: String.to_integer(p)} 31 | end) 32 | 33 | Kino.DataTable.new(data) 34 | ``` 35 | 36 | ### Plot the data 37 | 38 | 39 | 40 | ```elixir 41 | VegaLite.new(width: 600, height: 400) 42 | |> VegaLite.data_from_values(data, only: ["reservations", "pizzas"]) 43 | |> VegaLite.mark(:point) 44 | |> VegaLite.encode_field(:x, "reservations", type: :quantitative) 45 | |> VegaLite.encode_field(:y, "pizzas", type: :quantitative) 46 | ``` 47 | 48 | ## Tracing a Line 49 | 50 | ```elixir 51 | defmodule C2.LinearRegression do 52 | @doc """ 53 | Returns a list of predictions. 
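
  For example, `predict([13, 2], 1.5)` recurses over the list and returns
  `[19.5, 3.0]`, one prediction per input value.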
54 | """ 55 | def predict([item | rest], weight) do 56 | [predict(item, weight) | predict(rest, weight)] 57 | end 58 | 59 | def predict([], _weight), do: [] 60 | 61 | # The function predicts the pizzas from the reservations. 62 | # To be more precise, it takes the input variable and the weight, 63 | # and it uses them to calculate ŷ. 64 | def predict(x, weight), do: x * weight 65 | 66 | @doc """ 67 | Returns the mean squared error. 68 | """ 69 | def loss(x, y, weight) when is_list(x) and is_list(y) do 70 | predictions = predict(x, weight) 71 | errors = Enum.zip_with([predictions, y], fn [pr, y] -> pr - y end) 72 | squared_error = square(errors) 73 | avg(squared_error) 74 | end 75 | 76 | def train(x, y, iterations, lr) when is_list(x) and is_list(y) do 77 | Enum.reduce_while(0..(iterations - 1), 0, fn i, w -> 78 | current_loss = loss(x, y, w) 79 | 80 | IO.puts("Iteration #{i} => Loss: #{current_loss}") 81 | 82 | cond do 83 | loss(x, y, w + lr) < current_loss -> {:cont, w + lr} 84 | loss(x, y, w - lr) < current_loss -> {:cont, w - lr} 85 | true -> {:halt, w} 86 | end 87 | end) 88 | end 89 | 90 | defp square(list) when is_list(list) do 91 | for i <- list, do: i * i 92 | end 93 | 94 | defp avg(list) when is_list(list) do 95 | Enum.sum(list) / length(list) 96 | end 97 | end 98 | ``` 99 | 100 | ### Train the system 101 | 102 | ```elixir 103 | # Transform the data to unpack the 2 columns `reservations` and 104 | # `pizzas` into separate arrays called x and y 105 | %{x: x, y: y} = 106 | Enum.reduce(data, %{x: [], y: []}, fn item, %{x: x, y: y} -> 107 | %{x: x ++ [item.reservations], y: y ++ [item.pizzas]} 108 | end) 109 | ``` 110 | 111 | ```elixir 112 | iterations = Kino.Input.number("iterations", default: 10_000) 113 | ``` 114 | 115 | ```elixir 116 | lr = Kino.Input.number("lr (learning rate)", default: 0.01) 117 | ``` 118 | 119 | ```elixir 120 | iterations = Kino.Input.read(iterations) 121 | lr = Kino.Input.read(lr) 122 | 123 | w = C2.LinearRegression.train(x, y, iterations = iterations, lr = lr) 124 | ``` 125 | 126 | ### Predict the number of pizzas 127 | 128 | ```elixir 129 | C2.LinearRegression.predict(20, w) 130 | ``` 131 | 132 | ```elixir 133 | # Compute the predictions 134 | 135 | predictions = 136 | Enum.map(0..Enum.max(x), fn i -> 137 | %{x: i, prediction: C2.LinearRegression.predict(i, w)} 138 | end) 139 | ``` 140 | 141 | 142 | 143 | ```elixir 144 | VegaLite.new(width: 600, height: 400) 145 | |> VegaLite.layers([ 146 | VegaLite.new() 147 | |> VegaLite.data_from_values(data, only: ["reservations", "pizzas"]) 148 | |> VegaLite.mark(:point) 149 | |> VegaLite.encode_field(:x, "reservations", type: :quantitative) 150 | |> VegaLite.encode_field(:y, "pizzas", type: :quantitative), 151 | VegaLite.new() 152 | |> VegaLite.data_from_values(predictions, only: ["x", "prediction"]) 153 | |> VegaLite.mark(:line) 154 | |> VegaLite.encode_field(:x, "x", type: :quantitative) 155 | |> VegaLite.encode_field(:y, "prediction", type: :quantitative) 156 | ]) 157 | ``` 158 | 159 | ## Adding a Bias 160 | 161 | ```elixir 162 | defmodule C2.LinearRegressionWithBias do 163 | @doc """ 164 | Returns a list of predictions. 165 | """ 166 | def predict([item | rest], weight, bias) do 167 | [predict(item, weight, bias) | predict(rest, weight, bias)] 168 | end 169 | 170 | def predict([], _weight, _bias), do: [] 171 | 172 | # The function predicts the pizzas from the reservations. 173 | # To be more precise, it takes the input variable, the weight 174 | # and the bias, and it uses them to calculate ŷ. 
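  # In other words the model is now ŷ = x * weight + bias: the weight sets the
  # slope of the line and the bias shifts it up or down, so the fitted line no
  # longer has to pass through the origin.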
175 | def predict(x, weight, bias), do: x * weight + bias 176 | 177 | @doc """ 178 | Returns the mean squared error. 179 | """ 180 | def loss(x, y, weight, bias) when is_list(x) and is_list(y) do 181 | predictions = predict(x, weight, bias) 182 | errors = Enum.zip_with([predictions, y], fn [pr, y] -> pr - y end) 183 | squared_error = square(errors) 184 | avg(squared_error) 185 | end 186 | 187 | def train(x, y, iterations, lr) when is_list(x) and is_list(y) do 188 | Enum.reduce_while(0..(iterations - 1), %{weight: 0, bias: 0}, fn i, 189 | %{weight: w, bias: b} = acc -> 190 | current_loss = loss(x, y, w, b) 191 | 192 | IO.puts("Iteration #{i} => Loss: #{current_loss}") 193 | 194 | cond do 195 | loss(x, y, w + lr, b) < current_loss -> {:cont, %{acc | weight: w + lr}} 196 | loss(x, y, w - lr, b) < current_loss -> {:cont, %{acc | weight: w - lr}} 197 | loss(x, y, w, b + lr) < current_loss -> {:cont, %{acc | bias: b + lr}} 198 | loss(x, y, w, b - lr) < current_loss -> {:cont, %{acc | bias: b - lr}} 199 | true -> {:halt, acc} 200 | end 201 | end) 202 | end 203 | 204 | defp square(list) when is_list(list) do 205 | for i <- list, do: i * i 206 | end 207 | 208 | defp avg(list) when is_list(list) do 209 | Enum.sum(list) / length(list) 210 | end 211 | end 212 | ``` 213 | 214 | ### Train the system 215 | 216 | ```elixir 217 | iterations = Kino.Input.number("iterations", default: 10_000) 218 | ``` 219 | 220 | ```elixir 221 | lr = Kino.Input.number("lr (learning rate)", default: 0.01) 222 | ``` 223 | 224 | ```elixir 225 | iterations = Kino.Input.read(iterations) 226 | lr = Kino.Input.read(lr) 227 | 228 | %{weight: w2, bias: bias} = 229 | C2.LinearRegressionWithBias.train(x, y, iterations = 10_000, lr = 0.01) 230 | ``` 231 | 232 | ### Predict the number of pizzas 233 | 234 | ```elixir 235 | n_reservations = Kino.Input.number("number of reservations", default: 20) 236 | ``` 237 | 238 | ```elixir 239 | n = Kino.Input.read(n_reservations) 240 | 241 | C2.LinearRegressionWithBias.predict(n, w2, bias) 242 | ``` 243 | 244 | ```elixir 245 | # Compute the predictions 246 | 247 | predictions = 248 | Enum.map(0..Enum.max(x), fn i -> 249 | %{x: i, prediction: C2.LinearRegressionWithBias.predict(i, w2, bias)} 250 | end) 251 | ``` 252 | 253 | 254 | 255 | ```elixir 256 | VegaLite.new(width: 600, height: 400) 257 | |> VegaLite.layers([ 258 | VegaLite.new() 259 | |> VegaLite.data_from_values(data, only: ["reservations", "pizzas"]) 260 | |> VegaLite.mark(:point) 261 | |> VegaLite.encode_field(:x, "reservations", type: :quantitative) 262 | |> VegaLite.encode_field(:y, "pizzas", type: :quantitative), 263 | VegaLite.new() 264 | |> VegaLite.data_from_values(predictions, only: ["x", "prediction"]) 265 | |> VegaLite.mark(:line) 266 | |> VegaLite.encode_field(:x, "x", type: :quantitative) 267 | |> VegaLite.encode_field(:y, "prediction", type: :quantitative) 268 | ]) 269 | ``` 270 | -------------------------------------------------------------------------------- /03_gradient/gradient_descend.livemd: -------------------------------------------------------------------------------- 1 | # Chapter 3: Walking the Gradient 2 | 3 | ```elixir 4 | Mix.install([ 5 | {:vega_lite, "~> 0.1.6"}, 6 | {:kino, "~> 0.8.1"}, 7 | {:kino_vega_lite, "~> 0.1.7"} 8 | ]) 9 | ``` 10 | 11 | ## Read the data 12 | 13 | ```elixir 14 | file = 15 | __DIR__ 16 | |> Path.join("pizza.txt") 17 | |> Path.expand() 18 | 19 | # Read the data from the file, remove the header and return 20 | # `[%{reservations: integer(), pizzas: integer()}]` 21 | data = 22 | file 23 | |> 
File.read!() 24 | |> String.split("\n", trim: true) 25 | |> Enum.slice(1..-1) 26 | |> Enum.map(&String.split(&1, ~r{\s+}, trim: true)) 27 | |> Enum.map(fn [r, p] -> %{reservations: String.to_integer(r), pizzas: String.to_integer(p)} end) 28 | 29 | Kino.DataTable.new(data) 30 | ``` 31 | 32 | ## Linear regression with bias 33 | 34 | ☝️ From chapter 2 35 | 36 | ```elixir 37 | defmodule C2.LinearRegressionWithBias do 38 | @doc """ 39 | Returns a list of predictions. 40 | """ 41 | def predict([item | rest], weight, bias) do 42 | [predict(item, weight, bias) | predict(rest, weight, bias)] 43 | end 44 | 45 | def predict([], _weight, _bias), do: [] 46 | 47 | # The function predicts the pizzas from the reservations. 48 | # To be more precise, it takes the input variable, the weight 49 | # and the bias, and it uses them to calculate ŷ. 50 | def predict(x, weight, bias), do: x * weight + bias 51 | 52 | @doc """ 53 | Returns the mean squared error. 54 | """ 55 | def loss(x, y, weight, bias) when is_list(x) and is_list(y) do 56 | predictions = predict(x, weight, bias) 57 | errors = Enum.zip_with([predictions, y], fn [pr, y] -> pr - y end) 58 | squared_error = square(errors) 59 | avg(squared_error) 60 | end 61 | 62 | def train(x, y, iterations, lr) when is_list(x) and is_list(y) do 63 | Enum.reduce(0..(iterations - 1), %{weight: 0, bias: 0}, fn i, %{weight: w, bias: b} = acc -> 64 | current_loss = loss(x, y, w, b) 65 | 66 | IO.puts("Iteration #{i} => Loss: #{current_loss}") 67 | 68 | cond do 69 | loss(x, y, w + lr, b) < current_loss -> %{acc | weight: w + lr} 70 | loss(x, y, w - lr, b) < current_loss -> %{acc | weight: w - lr} 71 | loss(x, y, w, b + lr) < current_loss -> %{acc | bias: b + lr} 72 | loss(x, y, w, b - lr) < current_loss -> %{acc | bias: b - lr} 73 | true -> acc 74 | end 75 | end) 76 | end 77 | 78 | defp square(list) when is_list(list) do 79 | for i <- list, do: i * i 80 | end 81 | 82 | defp avg(list) when is_list(list) do 83 | Enum.sum(list) / length(list) 84 | end 85 | end 86 | ``` 87 | 88 | ### Plot the loss curve 89 | 90 | ```elixir 91 | # Transform the data to unpack the 2 columns `reservations` and 92 | # `pizzas` into separate arrays called x and y 93 | %{x: x, y: y} = 94 | Enum.reduce(data, %{x: [], y: []}, fn item, %{x: x, y: y} -> 95 | %{x: x ++ [item.reservations], y: y ++ [item.pizzas]} 96 | end) 97 | ``` 98 | 99 | ```elixir 100 | alias VegaLite, as: Vl 101 | 102 | # Generate a sequence that will be used as `weight` 103 | # From -1 to -4, step 0.01 104 | weights = Enum.map(-100..400, &(&1 / 100)) 105 | 106 | # Compute the loss for each weight, with bias=0 107 | losses = Enum.map(weights, &C2.LinearRegressionWithBias.loss(x, y, &1, 0)) 108 | 109 | # Get the min loss index 110 | min_loss_index = Enum.find_index(losses, &(&1 == Enum.min(losses))) 111 | 112 | Vl.new(width: 600, height: 400) 113 | |> Vl.layers([ 114 | Vl.new() 115 | |> Vl.data_from_values(weight: weights, loss: losses) 116 | |> Vl.mark(:line) 117 | |> Vl.encode_field(:x, "weight", type: :quantitative) 118 | |> Vl.encode_field(:y, "loss", type: :quantitative), 119 | Vl.new() 120 | |> Vl.data_from_values( 121 | weight: [Enum.at(weights, min_loss_index)], 122 | min_loss: [Enum.at(losses, min_loss_index)] 123 | ) 124 | |> Vl.mark(:circle, tooltip: true, size: "100", color: "red") 125 | |> Vl.encode_field(:x, "weight", type: :quantitative) 126 | |> Vl.encode_field(:y, "min_loss", type: :quantitative, title: "loss") 127 | ]) 128 | ``` 129 | 130 | ## Gradient Descent 131 | 132 | ```elixir 133 | defmodule 
C3.LinearRegressionWithoutBias do 134 | def predict([item | rest], weight, bias) do 135 | [predict(item, weight, bias) | predict(rest, weight, bias)] 136 | end 137 | 138 | def predict([], _weight, _bias), do: [] 139 | def predict(x, weight, bias), do: x * weight + bias 140 | 141 | @doc """ 142 | Returns the mean squared error. 143 | """ 144 | def loss(x, y, weight, bias) when is_list(x) and is_list(y) do 145 | predictions = predict(x, weight, bias) 146 | errors = Enum.zip_with([predictions, y], fn [pr, y] -> pr - y end) 147 | squared_error = square(errors) 148 | avg(squared_error) 149 | end 150 | 151 | @doc """ 152 | Returns the derivative of the loss curve 153 | """ 154 | def gradient(x, y, weight) do 155 | predictions = predict(x, weight, 0) 156 | errors = Enum.zip_with([predictions, y], fn [pr, y] -> pr - y end) 157 | 2 * avg(Enum.zip_with([x, errors], fn [x_item, error] -> x_item * error end)) 158 | end 159 | 160 | def train(x, y, iterations, lr) when is_list(x) and is_list(y) do 161 | Enum.reduce(0..(iterations - 1), 0, fn i, weight -> 162 | IO.puts("Iteration #{i} => Loss: #{loss(x, y, weight, 0)}") 163 | weight - gradient(x, y, weight) * lr 164 | end) 165 | end 166 | 167 | defp square(list) when is_list(list) do 168 | for i <- list, do: i * i 169 | end 170 | 171 | defp avg(list) when is_list(list) do 172 | Enum.sum(list) / length(list) 173 | end 174 | end 175 | ``` 176 | 177 | ### Train the system 178 | 179 | ```elixir 180 | iterations = Kino.Input.number("iterations", default: 100) 181 | ``` 182 | 183 | ```elixir 184 | lr = Kino.Input.number("lr (learning rate)", default: 0.001) 185 | ``` 186 | 187 | ```elixir 188 | iterations = Kino.Input.read(iterations) 189 | lr = Kino.Input.read(lr) 190 | 191 | weight = C3.LinearRegressionWithoutBias.train(x, y, iterations = 100, lr = 0.001) 192 | ``` 193 | 194 | ## Putting Gradient Descent to the Test 195 | 196 | ```elixir 197 | defmodule C3.LinearRegressionWithBias do 198 | def predict([item | rest], weight, bias) do 199 | [predict(item, weight, bias) | predict(rest, weight, bias)] 200 | end 201 | 202 | def predict([], _weight, _bias), do: [] 203 | def predict(x, weight, bias), do: x * weight + bias 204 | 205 | @doc """ 206 | Returns the mean squared error. 
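
  That is, the average of the squared differences between
  `predict(x, weight, bias)` and the labels `y`.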
207 | """ 208 | def loss(x, y, weight, bias) when is_list(x) and is_list(y) do 209 | predictions = predict(x, weight, bias) 210 | errors = Enum.zip_with([predictions, y], fn [pr, y] -> pr - y end) 211 | squared_error = square(errors) 212 | avg(squared_error) 213 | end 214 | 215 | @doc """ 216 | Returns the derivative of the loss curve 217 | """ 218 | def gradient(x, y, weight, bias) do 219 | predictions = predict(x, weight, bias) 220 | errors = Enum.zip_with([predictions, y], fn [pr, y] -> pr - y end) 221 | 222 | w_gradient = 2 * avg(Enum.zip_with([x, errors], fn [x_item, error] -> x_item * error end)) 223 | b_gradient = 2 * avg(errors) 224 | 225 | {w_gradient, b_gradient} 226 | end 227 | 228 | def train(x, y, iterations, lr) when is_list(x) and is_list(y) do 229 | Enum.reduce(0..(iterations - 1), %{weight: 0, bias: 0}, fn i, %{weight: weight, bias: bias} -> 230 | IO.puts("Iteration #{i} => Loss: #{loss(x, y, weight, bias)}") 231 | 232 | {w_gradient, b_gradient} = gradient(x, y, weight, bias) 233 | %{weight: weight - w_gradient * lr, bias: bias - b_gradient * lr} 234 | end) 235 | end 236 | 237 | defp square(list) when is_list(list) do 238 | for i <- list, do: i * i 239 | end 240 | 241 | defp avg(list) when is_list(list) do 242 | Enum.sum(list) / length(list) 243 | end 244 | end 245 | ``` 246 | 247 | ### Train the system 248 | 249 | ```elixir 250 | iterations = Kino.Input.number("iterations", default: 20_000) 251 | ``` 252 | 253 | ```elixir 254 | lr = Kino.Input.number("lr (learning rate)", default: 0.001) 255 | ``` 256 | 257 | ```elixir 258 | iterations = Kino.Input.read(iterations) 259 | lr = Kino.Input.read(lr) 260 | 261 | %{weight: weight, bias: bias} = 262 | C3.LinearRegressionWithBias.train(x, y, iterations = iterations, lr = lr) 263 | ``` 264 | 265 | ### Predict the number of pizzas 266 | 267 | ```elixir 268 | n_reservations = Kino.Input.number("number of reservations", default: 20) 269 | ``` 270 | 271 | ```elixir 272 | n = Kino.Input.read(n_reservations) 273 | 274 | C3.LinearRegressionWithBias.predict(n, weight, bias) 275 | ``` 276 | -------------------------------------------------------------------------------- /03_gradient/pizza.txt: -------------------------------------------------------------------------------- 1 | Reservations Pizzas 2 | 13 33 3 | 2 16 4 | 14 32 5 | 23 51 6 | 13 27 7 | 1 16 8 | 18 34 9 | 10 17 10 | 26 29 11 | 3 15 12 | 3 15 13 | 21 32 14 | 7 22 15 | 22 37 16 | 2 13 17 | 27 44 18 | 6 16 19 | 10 21 20 | 18 37 21 | 15 30 22 | 9 26 23 | 26 34 24 | 8 23 25 | 15 39 26 | 10 27 27 | 21 37 28 | 5 17 29 | 6 18 30 | 13 25 31 | 13 23 32 | -------------------------------------------------------------------------------- /04_hyperspace/multiple_regression.livemd: -------------------------------------------------------------------------------- 1 | # Chapter 4: Hyperspace! 
2 | 3 | ```elixir 4 | Mix.install( 5 | [ 6 | {:exla, "~> 0.5"}, 7 | {:nx, "~> 0.5"}, 8 | {:vega_lite, "~> 0.1.6"}, 9 | {:kino, "~> 0.8.1"}, 10 | {:kino_vega_lite, "~> 0.1.7"} 11 | ], 12 | config: [nx: [default_backend: EXLA.Backend]] 13 | ) 14 | ``` 15 | 16 | ## Upgrading the Learner 17 | 18 | ### Preparing Data 19 | 20 | ```elixir 21 | file = 22 | __DIR__ 23 | |> Path.join("pizza_3_vars.txt") 24 | |> Path.expand() 25 | 26 | # Read the data from the file, remove the header and return 27 | # `[%{reservations: integer(), temperature: integer(), tourists: integer(), pizzas: integer()}]` 28 | data = 29 | File.read!(file) 30 | |> String.split("\n", trim: true) 31 | |> Enum.slice(1..-1) 32 | |> Enum.map(&String.split(&1, ~r{\s+}, trim: true)) 33 | |> Enum.map(fn [r, temp, tour, p] -> 34 | %{ 35 | reservations: String.to_integer(r), 36 | temperature: String.to_integer(temp), 37 | tourists: String.to_integer(tour), 38 | pizzas: String.to_integer(p) 39 | } 40 | end) 41 | 42 | Kino.DataTable.new(data, keys: [:reservations, :temperature, :tourists, :pizzas]) 43 | ``` 44 | 45 | ```elixir 46 | # Transform the data to unpack the 4 columns `reservations`, 47 | # `temperature`, `tourists` and `pizzas` into separate arrays 48 | # called x1, x2, x3 and y 49 | %{x1: x1, x2: x2, x3: x3, y: y} = 50 | Enum.reduce(data, %{x1: [], x2: [], x3: [], y: []}, fn item, %{x1: x1, x2: x2, x3: x3, y: y} -> 51 | %{ 52 | x1: x1 ++ [item.reservations], 53 | x2: x2 ++ [item.temperature], 54 | x3: x3 ++ [item.tourists], 55 | y: y ++ [item.pizzas] 56 | } 57 | end) 58 | ``` 59 | 60 | ### Let's build the matrix x for input variables 61 | 62 | ```elixir 63 | # Same of `numpy.column_stack((x1, x2, x3))` used in the book 64 | x = 65 | [x1, x2, x3] 66 | |> Nx.tensor() 67 | |> Nx.transpose() 68 | ``` 69 | 70 | ```elixir 71 | # Inspect x shape 72 | x.shape() 73 | ``` 74 | 75 | ```elixir 76 | # Get the first 2 rows of x 77 | x[0..1] 78 | ``` 79 | 80 | ### And reshape y into a matrix for labels 81 | 82 | ```elixir 83 | # Same of `y.reshape(-1, 1)` used in the book 84 | y = Nx.tensor([y]) |> Nx.transpose() 85 | ``` 86 | 87 | ```elixir 88 | # Inspect y shape 89 | y.shape() 90 | ``` 91 | 92 | ## Multiple Linear Regression 93 | 94 | ```elixir 95 | defmodule C4.MultipleLinearRegression do 96 | import Nx.Defn 97 | 98 | @doc """ 99 | Return the prediction tensor given the inputs and weight. 100 | """ 101 | defn(predict(x, weight), do: Nx.dot(x, weight)) 102 | 103 | @doc """ 104 | Returns the mean squared error. 105 | """ 106 | defn loss(x, y, weight) do 107 | predictions = predict(x, weight) 108 | errors = Nx.subtract(predictions, y) 109 | squared_error = Nx.pow(errors, 2) 110 | 111 | Nx.mean(squared_error) 112 | end 113 | 114 | @doc """ 115 | Returns the derivative of the loss curve. 116 | """ 117 | defn gradient(x, y, weight) do 118 | # in python: 119 | # 2 * np.matmul(X.T, (predict(X, w) - Y)) / X.shape[0] 120 | 121 | predictions = predict(x, weight) 122 | errors = Nx.subtract(predictions, y) 123 | n_examples = elem(Nx.shape(x), 0) 124 | 125 | Nx.transpose(x) 126 | |> Nx.dot(errors) 127 | |> Nx.multiply(2) 128 | |> Nx.divide(n_examples) 129 | end 130 | 131 | @doc """ 132 | Computes the weight by training the system 133 | with the given inputs and labels, by iterating 134 | over the examples the specified number of times. 
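
  Each iteration takes one gradient descent step: the new weight is
  `weight - gradient(x, y, weight) * lr`.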
135 | """ 136 | def train(x, y, iterations, lr) do 137 | Enum.reduce(0..(iterations - 1), init_weight(x), fn i, weight -> 138 | current_loss = loss(x, y, weight) |> Nx.to_number() 139 | IO.puts("Iteration #{i} => Loss: #{current_loss}") 140 | Nx.subtract(weight, Nx.multiply(gradient(x, y, weight), lr)) 141 | end) 142 | end 143 | 144 | # Given n elements it returns a tensor 145 | # with this shape {n, 1}, each element 146 | # initialized to 0 147 | defnp init_weight(x) do 148 | Nx.broadcast(Nx.tensor([0]), {elem(Nx.shape(x), 1), 1}) 149 | end 150 | end 151 | ``` 152 | 153 | ### Train the system 154 | 155 | ```elixir 156 | iterations = Kino.Input.number("iterations", default: 10_000) 157 | ``` 158 | 159 | ```elixir 160 | lr = Kino.Input.number("lr (learning rate)", default: 0.001) 161 | ``` 162 | 163 | ```elixir 164 | iterations = Kino.Input.read(iterations) 165 | lr = Kino.Input.read(lr) 166 | 167 | weight = C4.MultipleLinearRegression.train(x, y, iterations, lr) 168 | ``` 169 | 170 | ```elixir 171 | loss = C4.MultipleLinearRegression.loss(x, y, weight) |> Nx.to_number() 172 | ``` 173 | 174 | ## Bye bye, bias 👋 175 | 176 | Quoting the book: 177 | 178 | > The bias is just the **weight** of an input variable that happens to have the constant value 1. 179 | 180 | Basically, this expression: 181 | 182 | 183 | 184 | ```elixir 185 | ŷ = x1 * w1 + x2 * w2 + x3 * w3 + b 186 | ``` 187 | 188 | can be rewritten as: 189 | 190 | 191 | 192 | ```elixir 193 | ŷ = x1 * w1 + x2 * w2 + x3 * w3 + x0 * b 194 | ``` 195 | 196 | where `x0` is a constant matrix of value 1 with `{30, 1}` shape. 197 | 198 | ```elixir 199 | x0 = List.duplicate(1, length(x1)) 200 | ``` 201 | 202 | And now let's add the new input `x0` to the `x` tensor. 203 | 204 | ```elixir 205 | x = 206 | [x0, x1, x2, x3] 207 | |> Nx.tensor() 208 | |> Nx.transpose() 209 | ``` 210 | 211 | ```elixir 212 | weight = C4.MultipleLinearRegression.train(x, y, iterations, lr) 213 | ``` 214 | 215 | ```elixir 216 | loss = C4.MultipleLinearRegression.loss(x, y, weight) 217 | ``` 218 | 219 | A few predictions 220 | 221 | ```elixir 222 | Enum.map(0..4, fn i -> 223 | prediction = 224 | x[i] 225 | |> C4.MultipleLinearRegression.predict(weight) 226 | |> Nx.squeeze() 227 | |> Nx.to_number() 228 | |> Float.round(4) 229 | 230 | label = 231 | y[i] 232 | |> Nx.squeeze() 233 | |> Nx.to_number() 234 | 235 | IO.inspect("x[#{i}] -> #{prediction} (label: #{label})") 236 | end) 237 | 238 | Kino.nothing() 239 | ``` 240 | -------------------------------------------------------------------------------- /04_hyperspace/pizza_2_vars.txt: -------------------------------------------------------------------------------- 1 | Reservations Temperature Pizzas 2 | 13 26 44 3 | 2 14 23 4 | 14 20 28 5 | 23 25 60 6 | 13 24 42 7 | 1 12 5 8 | 18 23 51 9 | 10 18 44 10 | 26 24 42 11 | 3 14 9 12 | 3 12 14 13 | 21 27 43 14 | 7 17 22 15 | 22 21 34 16 | 2 12 16 17 | 27 26 46 18 | 6 15 26 19 | 10 21 33 20 | 18 18 29 21 | 15 26 43 22 | 9 20 37 23 | 26 25 62 24 | 8 21 47 25 | 15 22 38 26 | 10 20 22 27 | 21 21 29 28 | 5 12 34 29 | 6 14 38 30 | 13 19 30 31 | 13 20 28 32 | -------------------------------------------------------------------------------- /04_hyperspace/pizza_3_vars.txt: -------------------------------------------------------------------------------- 1 | Reservations Temperature Tourists Pizzas 2 | 13 26 9 44 3 | 2 14 6 23 4 | 14 20 3 28 5 | 23 25 9 60 6 | 13 24 8 42 7 | 1 12 2 5 8 | 18 23 9 51 9 | 10 18 10 44 10 | 26 24 3 42 11 | 3 14 1 9 12 | 3 12 3 14 13 | 21 27 5 43 14 | 7 17 3 22 15 | 22 21 
1 34 16 | 2 12 4 16 17 | 27 26 2 46 18 | 6 15 4 26 19 | 10 21 7 33 20 | 18 18 3 29 21 | 15 26 8 43 22 | 9 20 6 37 23 | 26 25 9 62 24 | 8 21 10 47 25 | 15 22 7 38 26 | 10 20 2 22 27 | 21 21 1 29 28 | 5 12 7 34 29 | 6 14 9 38 30 | 13 19 4 30 31 | 13 20 3 28 32 | -------------------------------------------------------------------------------- /05_discerning/classifier.livemd: -------------------------------------------------------------------------------- 1 | # Chapter 5: A Discerning Machine 2 | 3 | ```elixir 4 | Mix.install( 5 | [ 6 | {:exla, "~> 0.5"}, 7 | {:nx, "~> 0.5"}, 8 | {:vega_lite, "~> 0.1.6"}, 9 | {:kino, "~> 0.8.1"}, 10 | {:kino_vega_lite, "~> 0.1.7"} 11 | ], 12 | config: [nx: [default_backend: EXLA.Backend]] 13 | ) 14 | ``` 15 | 16 | ## Invasion of the Sigmoids 17 | 18 | $$ 19 | \sigma(z) = \cfrac{1}{1 + e^{-z}} 20 | $$ 21 | 22 | ```elixir 23 | alias VegaLite, as: Vl 24 | 25 | sigmoid_fn = fn z -> 1 / (1 + :math.exp(-z)) end 26 | 27 | # Generate a sequence that will be used as `z` 28 | # From -5 to 5, step 0.1 29 | z = Enum.map(-50..50, &(&1 / 10)) 30 | 31 | # Compute the sigmoids 32 | sigmoids = Enum.map(z, fn v -> sigmoid_fn.(v) end) 33 | 34 | Vl.new(width: 600, height: 400) 35 | |> Vl.data_from_values(z: z, sigmoids: sigmoids) 36 | |> Vl.mark(:line) 37 | |> Vl.encode_field(:x, "z", type: :quantitative) 38 | |> Vl.encode_field(:y, "sigmoids", type: :quantitative, title: "sigmoid(z)") 39 | ``` 40 | 41 | ## Classification in Action 42 | 43 | ```elixir 44 | defmodule C5.Classifier do 45 | import Nx.Defn 46 | 47 | @doc """ 48 | A sigmoid function is a mathematical function having 49 | a characteristic "S"-shaped curve or sigmoid curve. 50 | 51 | A sigmoid function: 52 | - is monotonic 53 | - has no local minimums 54 | - has a non-negative derivative for each point 55 | 56 | More here https://en.wikipedia.org/wiki/Sigmoid_function 57 | """ 58 | defn sigmoid(z) do 59 | Nx.divide(1, Nx.add(1, Nx.exp(Nx.negate(z)))) 60 | end 61 | 62 | @doc """ 63 | Return the prediction tensor ŷ given the inputs and weight. 64 | The returned tensor is a matrix with the same dimensions as 65 | the weighted sum: one row per example, and one column. 66 | Each element in the matrix is now constrained between 0 and 1. 67 | """ 68 | defn forward(x, weight) do 69 | weighted_sum = Nx.dot(x, weight) 70 | sigmoid(weighted_sum) 71 | end 72 | 73 | @doc """ 74 | Return the prediction rounded to forecast a binary value (0, 1). 75 | """ 76 | defn classify(x, weight) do 77 | forward(x, weight) 78 | |> Nx.round() 79 | end 80 | 81 | @doc """ 82 | Log loss function. 83 | """ 84 | defn loss(x, y, weight) do 85 | # in python: 86 | # y_hat = forward(X, w) 87 | # first_term = Y * np.log(y_hat) 88 | # second_term = (1 - Y) * np.log(1 - y_hat) 89 | # return -np.average(first_term + second_term) 90 | 91 | y_hat = forward(x, weight) 92 | 93 | # Each label in the matrix `y_hat` is either `0` or `1`. 94 | # - `first_term` disappears when `y_hat` is 0 95 | # - `second_term` disappears when `y_hat` is 1 96 | first_term = y * Nx.log(y_hat) 97 | second_term = Nx.subtract(1, y) * Nx.log(Nx.subtract(1, y_hat)) 98 | 99 | Nx.add(first_term, second_term) 100 | |> Nx.mean() 101 | |> Nx.negate() 102 | end 103 | 104 | @doc """ 105 | Returns the derivative of the loss curve. 
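
  Concretely, it is the matrix product of `transpose(x)` and the prediction
  errors `(forward(x, w) - y)`, divided by the number of examples, which is
  exactly what the body below computes.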
106 | """ 107 | defn gradient(x, y, weight) do 108 | # in python: 109 | # np.matmul(X.T, (predict(X, w) - Y)) / X.shape[0] 110 | 111 | predictions = forward(x, weight) 112 | errors = Nx.subtract(predictions, y) 113 | n_examples = elem(Nx.shape(x), 0) 114 | 115 | Nx.transpose(x) 116 | |> Nx.dot(errors) 117 | |> Nx.divide(n_examples) 118 | end 119 | 120 | @doc """ 121 | Computes the weight by training the system 122 | with the given inputs and labels, by iterating 123 | over the examples the specified number of times. 124 | """ 125 | def train(x, y, iterations, lr) do 126 | Enum.reduce(0..(iterations - 1), init_weight(x), fn i, weight -> 127 | IO.inspect("Iteration #{i} => Loss: #{Nx.to_number(loss(x, y, weight))}") 128 | 129 | step(x, y, weight, lr) 130 | end) 131 | end 132 | 133 | defnp step(x, y, weight, lr) do 134 | Nx.subtract(weight, Nx.multiply(gradient(x, y, weight), lr)) 135 | end 136 | 137 | def test(x, y, weight) do 138 | total_examples = elem(Nx.shape(x), 0) 139 | 140 | correct_results = 141 | classify(x, weight) 142 | |> Nx.equal(y) 143 | |> Nx.sum() 144 | |> Nx.to_number() 145 | 146 | # Accuracy of the classifier 147 | success_percent = Float.round(correct_results * 100 / total_examples, 2) 148 | 149 | IO.puts("Success: #{correct_results}/#{total_examples} (#{success_percent}%)") 150 | end 151 | 152 | # Given n elements it returns a tensor 153 | # with this shape {n, 1}, each element 154 | # initialized to 0 155 | defnp init_weight(x) do 156 | Nx.broadcast(Nx.tensor([0]), {elem(Nx.shape(x), 1), 1}) 157 | end 158 | end 159 | ``` 160 | 161 | ## Read the data 162 | 163 | ```elixir 164 | file = 165 | __DIR__ 166 | |> Path.join("police.txt") 167 | |> Path.expand() 168 | 169 | # Read the data from the file, remove the header and return 170 | # `[%{reservations: integer(), temperature: integer(), tourists: integer(), police: integer()}]` 171 | data = 172 | File.read!(file) 173 | |> String.split("\n", trim: true) 174 | |> Enum.slice(1..-1) 175 | |> Enum.map(&String.split(&1, ~r{\s+}, trim: true)) 176 | |> Enum.map(fn [r, temp, tour, p] -> 177 | %{ 178 | reservations: String.to_integer(r), 179 | temperature: String.to_integer(temp), 180 | tourists: String.to_integer(tour), 181 | police: String.to_integer(p) 182 | } 183 | end) 184 | 185 | Kino.DataTable.new(data, keys: [:reservations, :temperature, :tourists, :police]) 186 | ``` 187 | 188 | ### Prepare the data 189 | 190 | ```elixir 191 | # Transform the data to unpack the 4 columns `reservations`, 192 | # `temperature`, `tourists` and `police` into separate arrays 193 | # called x1, x2, x3 and y 194 | %{x1: x1, x2: x2, x3: x3, y: y} = 195 | Enum.reduce(data, %{x1: [], x2: [], x3: [], y: []}, fn item, %{x1: x1, x2: x2, x3: x3, y: y} -> 196 | %{ 197 | x1: x1 ++ [item.reservations], 198 | x2: x2 ++ [item.temperature], 199 | x3: x3 ++ [item.tourists], 200 | y: y ++ [item.police] 201 | } 202 | end) 203 | ``` 204 | 205 | ```elixir 206 | # bias 207 | x0 = List.duplicate(1, length(x1)) 208 | 209 | x = 210 | [x0, x1, x2, x3] 211 | |> Nx.tensor() 212 | |> Nx.transpose() 213 | 214 | # Same of `y.reshape(-1, 1)` used in the book 215 | y = Nx.tensor(y) |> Nx.reshape({:auto, 1}) 216 | ``` 217 | 218 | ### Our new model 219 | 220 | Plot of the `forward()` function. 
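
Before plotting, here is a small self-contained sketch (added for illustration, with made-up numbers rather than the trained weight) of what `forward/2` returns for a single example: the sigmoid of the weighted sum, so a weighted sum of zero lands exactly on 0.5, while larger or smaller sums saturate towards 1 or 0.

```elixir
# Illustration only: a hypothetical example and weight, not the trained ones.
x_example = Nx.tensor([[1, 10]])        # bias column plus 10 reservations
w_example = Nx.tensor([[-5.0], [0.5]])  # made-up bias weight and reservation weight

# 1 * -5.0 + 10 * 0.5 == 0.0 and sigmoid(0.0) == 0.5
C5.Classifier.forward(x_example, w_example)
```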
221 | 222 | ```elixir 223 | alias VegaLite, as: Vl 224 | 225 | reservations_tensor = Nx.tensor([x0, x1]) |> Nx.transpose() 226 | 227 | # It can take a bit of time 228 | weight = C5.Classifier.train(reservations_tensor, y, iterations = 1_000_000, lr = 0.01) 229 | 230 | predictions = C5.Classifier.forward(reservations_tensor, weight) 231 | rounded_predictions = C5.Classifier.classify(reservations_tensor, weight) 232 | 233 | :ok 234 | ``` 235 | 236 | ```elixir 237 | Vl.new(width: 600, height: 400, title: "Model - forward()") 238 | |> Vl.layers([ 239 | Vl.new() 240 | |> Vl.data_from_values( 241 | reservations: x1, 242 | police_calls: Nx.to_flat_list(y) 243 | ) 244 | |> Vl.mark(:circle) 245 | |> Vl.encode_field(:x, "reservations", type: :quantitative, title: "Reservations") 246 | |> Vl.encode_field(:y, "police_calls", type: :quantitative, title: "Police Calls"), 247 | Vl.new() 248 | |> Vl.data_from_values( 249 | reservations: x1, 250 | forward: Nx.to_flat_list(predictions) 251 | ) 252 | |> Vl.mark(:line) 253 | |> Vl.encode_field(:x, "reservations", type: :quantitative, title: "Reservations") 254 | |> Vl.encode_field(:y, "forward", type: :quantitative, title: "forward(x, w)") 255 | ]) 256 | ``` 257 | 258 | ```elixir 259 | Vl.new(width: 600, height: 400, title: "Predictions based on binary classification - classify()") 260 | |> Vl.layers([ 261 | Vl.new() 262 | |> Vl.data_from_values( 263 | reservations: x1, 264 | police_calls: Nx.to_flat_list(y) 265 | ) 266 | |> Vl.mark(:circle) 267 | |> Vl.encode_field(:x, "reservations", type: :quantitative, title: "Reservations") 268 | |> Vl.encode_field(:y, "police_calls", type: :quantitative, title: "Police Calls"), 269 | Vl.new() 270 | |> Vl.data_from_values( 271 | reservations: x1, 272 | classify: Nx.to_flat_list(rounded_predictions) 273 | ) 274 | |> Vl.mark(:line) 275 | |> Vl.encode_field(:x, "reservations", type: :quantitative, title: "Reservations") 276 | |> Vl.encode_field(:y, "classify", type: :quantitative, title: "classify(x, w)") 277 | ]) 278 | ``` 279 | 280 | ### Train the system 281 | 282 | ```elixir 283 | weight = C5.Classifier.train(x, y, iterations = 10_000, lr = 0.001) 284 | ``` 285 | 286 | ### Test the system 287 | 288 | The percentage of correctly classified examples is called the accuracy of the classifier. 
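
Restated as a formula, this is what `test/3` above prints:

$$
\text{accuracy} = \frac{\text{correctly classified examples}}{\text{total examples}} \times 100
$$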
289 | 290 | ```elixir 291 | C5.Classifier.test(x, y, weight) 292 | 293 | Kino.nothing() 294 | ``` 295 | -------------------------------------------------------------------------------- /05_discerning/police.txt: -------------------------------------------------------------------------------- 1 | Reservations Temperature Tourists Police 2 | 13 26 9 1 3 | 2 14 6 0 4 | 14 20 3 1 5 | 23 25 9 1 6 | 13 24 8 1 7 | 1 13 2 0 8 | 18 23 9 1 9 | 10 18 10 1 10 | 26 24 3 1 11 | 3 14 1 0 12 | 3 12 3 0 13 | 21 27 5 1 14 | 7 17 3 0 15 | 22 21 1 1 16 | 2 14 4 0 17 | 27 26 2 1 18 | 6 15 4 0 19 | 10 21 7 0 20 | 18 18 3 0 21 | 15 26 8 1 22 | 9 20 6 0 23 | 26 25 9 1 24 | 8 21 10 0 25 | 15 22 7 1 26 | 10 20 2 0 27 | 21 21 1 1 28 | 5 12 7 0 29 | 6 14 9 0 30 | 13 19 4 1 31 | 13 20 3 0 32 | -------------------------------------------------------------------------------- /06_real/digit_classifier.livemd: -------------------------------------------------------------------------------- 1 | # Chapter 6: Getting Real 2 | 3 | ```elixir 4 | Mix.install( 5 | [ 6 | {:exla, "~> 0.5"}, 7 | {:nx, "~> 0.5"}, 8 | {:vega_lite, "~> 0.1.6"}, 9 | {:kino, "~> 0.8.1"}, 10 | {:kino_vega_lite, "~> 0.1.7"} 11 | ], 12 | config: [nx: [default_backend: EXLA.Backend]] 13 | ) 14 | ``` 15 | 16 | ## Our Own MNIST Library 17 | 18 | ```elixir 19 | defmodule C6.MNIST do 20 | @moduledoc """ 21 | Use this Module to load the MNIST database (test, train, and labels). 22 | 23 | MNIST dataset specifications can be found here: http://yann.lecun.com/exdb/mnist/ 24 | """ 25 | 26 | @data_path Path.join(__DIR__, "../data/mnist") |> Path.expand() 27 | 28 | @train_images_filename Path.join(@data_path, "train-images-idx3-ubyte.gz") 29 | @test_images_filename Path.join(@data_path, "t10k-images-idx3-ubyte.gz") 30 | @train_labels_filename Path.join(@data_path, "train-labels-idx1-ubyte.gz") 31 | @test_labels_filename Path.join(@data_path, "t10k-labels-idx1-ubyte.gz") 32 | 33 | defstruct [:x_train, :x_test, :y_train, :y_test] 34 | 35 | @doc """ 36 | Load the MNIST database and return the train and test images. 37 | """ 38 | def load() do 39 | %__MODULE__{ 40 | # 60000 images, each 785 elements (1 bias + 28 * 28 pixels) 41 | x_train: prepend_bias(load_images(@train_images_filename)), 42 | # 10000 images, each 785 elements, with the same structure as `x_train` 43 | x_test: prepend_bias(load_images(@test_images_filename)), 44 | # 60000 labels 45 | y_train: load_labels(@train_labels_filename), 46 | # 10000 labels, with the same encoding as `y_train` 47 | y_test: load_labels(@test_labels_filename) 48 | } 49 | end 50 | 51 | @doc """ 52 | Encode the five in the given label matrix by 53 | converting all 5s to 1, and everything else to 0. 54 | """ 55 | def encode_fives(y) do 56 | Nx.equal(y, 5) 57 | end 58 | 59 | @doc """ 60 | Load the MNIST labels from the given file 61 | and return a matrix. 62 | """ 63 | def load_labels(filename) do 64 | # Open and unzip the file of labels 65 | with {:ok, binary} <- File.read(filename) do 66 | <<_::32, n_labels::32, labels_binary::binary>> = :zlib.gunzip(binary) 67 | 68 | # Create a tensor from the binary and 69 | # reshape the list of labels into a one-column matrix. 70 | labels_binary 71 | |> Nx.from_binary({:u, 8}) 72 | |> Nx.reshape({n_labels, 1}) 73 | end 74 | end 75 | 76 | @doc """ 77 | Load the MNIST images from the given file 78 | and return a matrix. 
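
  The IDX file begins with a 32-bit magic number followed by the number of
  images, rows and columns, and then one byte per pixel; that header is what
  the binary pattern match in the body unpacks before reshaping the pixels.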
79 | """ 80 | def load_images(filename) do 81 | # Open and unzip the file of images 82 | with {:ok, binary} <- File.read(filename) do 83 | <<_::32, n_images::32, n_rows::32, n_cols::32, images_binary::binary>> = 84 | :zlib.gunzip(binary) 85 | 86 | # Create a tensor from the binary and 87 | # reshape the pixels into a matrix where each line is an image. 88 | images_binary 89 | |> Nx.from_binary({:u, 8}) 90 | |> Nx.reshape({n_images, n_cols * n_rows}) 91 | end 92 | end 93 | 94 | @doc """ 95 | Prepend a the bias, an extra column of 1s, to 96 | the given tensor. 97 | """ 98 | def prepend_bias(x) do 99 | bias = Nx.broadcast(1, {elem(Nx.shape(x), 0), 1}) 100 | 101 | # Insert a column of 1s in the position 0 of x. 102 | # ("axis: 1" stands for: "insert a column, not a row") 103 | # in python: `np.insert(X, 0, 1, axis=1)` 104 | Nx.concatenate([bias, x], axis: 1) 105 | end 106 | end 107 | ``` 108 | 109 | ```elixir 110 | # Unzips and decodes images from MNIST’s binary files. 111 | filename = Path.join(__DIR__, "../data/mnist/train-images-idx3-ubyte.gz") |> Path.expand() 112 | images_tensor = C6.MNIST.load_images(filename) 113 | ``` 114 | 115 | ```elixir 116 | # Add the bias to the images tensor 117 | images_tensor_with_bias = C6.MNIST.prepend_bias(images_tensor) 118 | ``` 119 | 120 | ## The Real Thing 121 | 122 | Use the classifier developed in chapter 5 to train and test the system. 123 | 124 | ```elixir 125 | defmodule C5.Classifier do 126 | import Nx.Defn 127 | 128 | @doc """ 129 | A sigmoid function is a mathematical function having 130 | a characteristic "S"-shaped curve or sigmoid curve. 131 | 132 | A sigmoid function: 133 | - is monotonic 134 | - has no local minimums 135 | - has a non-negative derivative for each point 136 | 137 | More here https://en.wikipedia.org/wiki/Sigmoid_function 138 | """ 139 | defn sigmoid(z) do 140 | Nx.divide(1, Nx.add(1, Nx.exp(Nx.negate(z)))) 141 | end 142 | 143 | @doc """ 144 | Return the prediction tensor ŷ given the inputs and weight. 145 | The returned tensor is a matrix with the same dimensions as 146 | the weighted sum: one row per example, and one column. 147 | Each element in the matrix is now constrained between 0 and 1. 148 | """ 149 | defn forward(x, weight) do 150 | weighted_sum = Nx.dot(x, weight) 151 | sigmoid(weighted_sum) 152 | end 153 | 154 | @doc """ 155 | Return the prediction rounded to forecast a binary value (0, 1). 156 | """ 157 | defn classify(x, weight) do 158 | forward(x, weight) 159 | |> Nx.round() 160 | end 161 | 162 | @doc """ 163 | Log loss function. 164 | """ 165 | defn loss(x, y, weight) do 166 | # in python: 167 | # y_hat = forward(X, w) 168 | # first_term = Y * np.log(y_hat) 169 | # second_term = (1 - Y) * np.log(1 - y_hat) 170 | # return -np.average(first_term + second_term) 171 | 172 | y_hat = forward(x, weight) 173 | 174 | # Each label in the matrix `y_hat` is either `0` or `1`. 175 | # - `first_term` disappears when `y_hat` is 0 176 | # - `second_term` disappears when `y_hat` is 1 177 | first_term = y * Nx.log(y_hat) 178 | second_term = Nx.subtract(1, y) * Nx.log(Nx.subtract(1, y_hat)) 179 | 180 | Nx.add(first_term, second_term) 181 | |> Nx.mean() 182 | |> Nx.negate() 183 | end 184 | 185 | @doc """ 186 | Returns the derivative of the loss curve. 
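
  The formula is the same one used in chapter 5; the only difference is the
  size of the matrices it runs over (the MNIST training set is 60000 rows by
  785 columns).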
187 | """ 188 | defn gradient(x, y, weight) do 189 | # in python: 190 | # np.matmul(X.T, (predict(X, w) - Y)) / X.shape[0] 191 | 192 | predictions = forward(x, weight) 193 | errors = Nx.subtract(predictions, y) 194 | n_examples = elem(Nx.shape(x), 0) 195 | 196 | Nx.transpose(x) 197 | |> Nx.dot(errors) 198 | |> Nx.divide(n_examples) 199 | end 200 | 201 | @doc """ 202 | Computes the weight by training the system 203 | with the given inputs and labels, by iterating 204 | over the examples the specified number of times. 205 | """ 206 | def train(x, y, iterations, lr) do 207 | Enum.reduce(0..(iterations - 1), init_weight(x), fn i, weight -> 208 | IO.inspect("Iteration #{i} => Loss: #{Nx.to_number(loss(x, y, weight))}") 209 | 210 | step(x, y, weight, lr) 211 | end) 212 | end 213 | 214 | defnp step(x, y, weight, lr) do 215 | Nx.subtract(weight, Nx.multiply(gradient(x, y, weight), lr)) 216 | end 217 | 218 | def test(x, y, weight) do 219 | total_examples = elem(Nx.shape(x), 0) 220 | 221 | correct_results = 222 | classify(x, weight) 223 | |> Nx.equal(y) 224 | |> Nx.sum() 225 | |> Nx.to_number() 226 | 227 | # Accuracy of the classifier 228 | success_percent = Float.round(correct_results * 100 / total_examples, 2) 229 | 230 | IO.puts("Success: #{correct_results}/#{total_examples} (#{success_percent}%)") 231 | end 232 | 233 | # Given n elements it returns a tensor 234 | # with this shape {n, 1}, each element 235 | # initialized to 0 236 | defnp init_weight(x) do 237 | Nx.broadcast(Nx.tensor([0]), {elem(Nx.shape(x), 1), 1}) 238 | end 239 | end 240 | ``` 241 | 242 | ### Binary classifier - recognize the 5 243 | 244 | 245 | 246 | The test and train labels contain values from 0 to 9, but for this chapter we want to recognize only 5s, therefore we use the `C6.MNIST.encode_fives/1` function to turn these labels into binary values: 247 | 248 | * 1 when the value is `5` 249 | * 0 otherwise 250 | 251 | 252 | 253 | #### Train and test the system 254 | 255 | ```elixir 256 | # Use the public API to get train and test images 257 | %{x_train: x_train, x_test: x_test, y_train: y_train, y_test: y_test} = C6.MNIST.load() 258 | ``` 259 | 260 | ```elixir 261 | updated_y_train = C6.MNIST.encode_fives(y_train) 262 | 263 | weight = C5.Classifier.train(x_train, updated_y_train, iterations = 100, lr = 1.0e-5) 264 | ``` 265 | 266 | ```elixir 267 | updated_y_test = C6.MNIST.encode_fives(y_test) 268 | 269 | C5.Classifier.test(x_test, updated_y_test, weight) 270 | ``` 271 | -------------------------------------------------------------------------------- /07_final/multiclass_classifier.livemd: -------------------------------------------------------------------------------- 1 | # Chapter 7: The Final Challenge 2 | 3 | ```elixir 4 | Mix.install( 5 | [ 6 | {:exla, "~> 0.5"}, 7 | {:nx, "~> 0.5"}, 8 | {:vega_lite, "~> 0.1.6"}, 9 | {:kino, "~> 0.8.1"}, 10 | {:kino_vega_lite, "~> 0.1.7"} 11 | ], 12 | config: [nx: [default_backend: EXLA.Backend]] 13 | ) 14 | ``` 15 | 16 | ## Going Multinomial 17 | 18 | ### Load MNIST dataset 19 | 20 | ```elixir 21 | defmodule C7.MNIST do 22 | @moduledoc """ 23 | Use this Module to load the MNIST database (test, train, and labels). 
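
  Compared to the chapter 6 loader, this version exposes `one_hot_encode/1`,
  which turns each digit label into a row of ten 0/1 values, one per class.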
24 | 25 | MNIST dataset specifications can be found here: http://yann.lecun.com/exdb/mnist/ 26 | """ 27 | 28 | @data_path Path.join(__DIR__, "../data/mnist") |> Path.expand() 29 | 30 | @train_images_filename Path.join(@data_path, "train-images-idx3-ubyte.gz") 31 | @test_images_filename Path.join(@data_path, "t10k-images-idx3-ubyte.gz") 32 | @train_labels_filename Path.join(@data_path, "train-labels-idx1-ubyte.gz") 33 | @test_labels_filename Path.join(@data_path, "t10k-labels-idx1-ubyte.gz") 34 | 35 | defstruct [:x_train, :x_test, :y_train, :y_test] 36 | 37 | @doc """ 38 | Load the MNIST database and return the train and test images. 39 | """ 40 | def load() do 41 | %__MODULE__{ 42 | # 60000 images, each 785 elements (1 bias + 28 * 28 pixels) 43 | x_train: prepend_bias(load_images(@train_images_filename)), 44 | # 10000 images, each 785 elements, with the same structure as `x_train` 45 | x_test: prepend_bias(load_images(@test_images_filename)), 46 | # 60000 labels 47 | y_train: load_labels(@train_labels_filename), 48 | # 10000 labels, with the same encoding as `y_train` 49 | y_test: load_labels(@test_labels_filename) 50 | } 51 | end 52 | 53 | @doc """ 54 | One-hot encode the given tensor (classes: from 0 to 9). 55 | """ 56 | def one_hot_encode(y) do 57 | Nx.equal(y, Nx.tensor(Enum.to_list(0..9))) 58 | end 59 | 60 | @doc """ 61 | Load the MNIST labels from the given file 62 | and return a matrix. 63 | """ 64 | def load_labels(filename) do 65 | # Open and unzip the file of labels 66 | with {:ok, binary} <- File.read(filename) do 67 | <<_::32, n_labels::32, labels_binary::binary>> = :zlib.gunzip(binary) 68 | 69 | # Create a tensor from the binary and 70 | # reshape the list of labels into a one-column matrix. 71 | labels_binary 72 | |> Nx.from_binary({:u, 8}) 73 | |> Nx.reshape({n_labels, 1}) 74 | end 75 | end 76 | 77 | @doc """ 78 | Load the MNIST images from the given file 79 | and return a matrix. 80 | """ 81 | def load_images(filename) do 82 | # Open and unzip the file of images 83 | with {:ok, binary} <- File.read(filename) do 84 | <<_::32, n_images::32, n_rows::32, n_cols::32, images_binary::binary>> = 85 | :zlib.gunzip(binary) 86 | 87 | # Create a tensor from the binary and 88 | # reshape the pixels into a matrix where each line is an image. 89 | images_binary 90 | |> Nx.from_binary({:u, 8}) 91 | |> Nx.reshape({n_images, n_cols * n_rows}) 92 | end 93 | end 94 | 95 | @doc """ 96 | Prepend a the bias, an extra column of 1s, to 97 | the given tensor. 98 | """ 99 | def prepend_bias(x) do 100 | bias = Nx.broadcast(1, {elem(Nx.shape(x), 0), 1}) 101 | 102 | Nx.concatenate([bias, x], axis: 1) 103 | end 104 | end 105 | ``` 106 | 107 | ```elixir 108 | # 60K labels, each a single digit from 0 to 9 109 | filename = Path.join(__DIR__, "../data/mnist/train-labels-idx1-ubyte.gz") |> Path.expand() 110 | y_train_unencoded = C7.MNIST.load_labels(filename) 111 | ``` 112 | 113 | Hot-encode the labels tensor (train data). 114 | 115 | ```elixir 116 | # 60K labels, each consisting of 10 one-hot encoded elements 117 | y_train = C7.MNIST.one_hot_encode(y_train_unencoded) 118 | ``` 119 | 120 | ## Moment of Truth 121 | 122 | Update the classifier implemented in chapter 5 to handle multiclasses. 123 | 124 | ```elixir 125 | defmodule C7.Classifier do 126 | import Nx.Defn 127 | 128 | @doc """ 129 | A sigmoid function is a mathematical function having 130 | a characteristic "S"-shaped curve or sigmoid curve. 
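
  Here it is `sigmoid(z) = 1 / (1 + exp(-z))`, which squashes any real number
  into the open interval (0, 1).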
131 | 132 | A sigmoid function: 133 | - is monotonic 134 | - has no local minimums 135 | - has a non-negative derivative for each point 136 | 137 | More here https://en.wikipedia.org/wiki/Sigmoid_function 138 | """ 139 | defn sigmoid(z) do 140 | Nx.divide(1, Nx.add(1, Nx.exp(Nx.negate(z)))) 141 | end 142 | 143 | @doc """ 144 | Return the prediction tensor ŷ (y_hat) given the inputs and weight. 145 | The returned tensor is a matrix with the same dimensions as 146 | the weighted sum: one row per example, and one column. 147 | Each element in the matrix is now constrained between 0 and 1. 148 | """ 149 | defn forward(x, weight) do 150 | weighted_sum = Nx.dot(x, weight) 151 | sigmoid(weighted_sum) 152 | end 153 | 154 | @doc """ 155 | Return the prediction rounded to forecast a value between 0 and 9. 156 | """ 157 | defn classify(x, weight) do 158 | y_hat = forward(x, weight) 159 | 160 | # Get the index of the maximum value in each row of y_hat 161 | # (the value that’s closer to 1). 162 | # NOTE: in case of MNIST dataset, the returned index is also the 163 | # decoded label (0..9). 164 | labels = Nx.argmax(y_hat, axis: 1) 165 | 166 | Nx.reshape(labels, {:auto, 1}) 167 | end 168 | 169 | @doc """ 170 | Log loss function. 171 | """ 172 | defn loss(x, y, weight) do 173 | y_hat = forward(x, weight) 174 | 175 | # Each label in the matrix `y_hat` is either `0` or `1`. 176 | # - `first_term` disappears when `y_hat` is 0 177 | # - `second_term` disappears when `y_hat` is 1 178 | first_term = y * Nx.log(y_hat) 179 | second_term = Nx.subtract(1, y) * Nx.log(Nx.subtract(1, y_hat)) 180 | 181 | # Corrected version (Chapter 7) 182 | Nx.add(first_term, second_term) 183 | |> Nx.sum() 184 | |> Nx.divide(elem(Nx.shape(x), 0)) 185 | |> Nx.negate() 186 | end 187 | 188 | @doc """ 189 | Returns the derivative of the loss curve. 190 | """ 191 | defn gradient(x, y, weight) do 192 | # in python: 193 | # np.matmul(X.T, (predict(X, w) - Y)) / X.shape[0] 194 | 195 | predictions = forward(x, weight) 196 | errors = Nx.subtract(predictions, y) 197 | n_examples = elem(Nx.shape(x), 0) 198 | 199 | Nx.transpose(x) 200 | |> Nx.dot(errors) 201 | |> Nx.divide(n_examples) 202 | end 203 | 204 | @doc """ 205 | Utility to report (to stdout) the loss per iteration. 206 | """ 207 | def report(iteration, x_train, y_train, x_test, y_test, weight) do 208 | matches = 209 | classify(x_test, weight) 210 | |> Nx.equal(y_test) 211 | |> Nx.sum() 212 | |> Nx.to_number() 213 | 214 | n_test_examples = elem(Nx.shape(y_test), 0) 215 | matches = matches * 100.0 / n_test_examples 216 | training_loss = loss(x_train, y_train, weight) |> Nx.to_number() 217 | 218 | IO.inspect("Iteration #{iteration} => Loss: #{training_loss}, #{matches}%") 219 | end 220 | 221 | @doc """ 222 | Computes the weight by training the system 223 | with the given inputs and labels, by iterating 224 | over the examples the specified number of times. 
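
  Unlike the chapter 5 version, it also receives the test set, so `report/6`
  can print the training loss together with the accuracy on the test data at
  every iteration.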
225 | """ 226 | def train(x_train, y_train, x_test, y_test, iterations, lr) do 227 | final_weight = 228 | Enum.reduce(0..(iterations - 1), init_weight(x_train, y_train), fn i, weight -> 229 | report(i, x_train, y_train, x_test, y_test, weight) 230 | step(x_train, y_train, weight, lr) 231 | end) 232 | 233 | report(iterations, x_train, y_train, x_test, y_test, final_weight) 234 | 235 | final_weight 236 | end 237 | 238 | defnp step(x, y, weight, lr) do 239 | Nx.subtract(weight, Nx.multiply(gradient(x, y, weight), lr)) 240 | end 241 | 242 | # Returns a tensor of shape `{n, m}`, where 243 | # `n` is the number of columns in `x` (input variables) and 244 | # `m` is the number of columns in `y` (classes). 245 | # Each element in the tensor is initialized to 0. 246 | defnp init_weight(x, y) do 247 | n_input_variables = elem(Nx.shape(x), 1) 248 | n_classes = elem(Nx.shape(y), 1) 249 | Nx.broadcast(0, {n_input_variables, n_classes}) 250 | end 251 | end 252 | ``` 253 | 254 | ### Train and test the system 255 | 256 | 257 | 258 | Load the data first. 259 | 260 | ```elixir 261 | # Use the public API to get train and test images 262 | %{x_train: x_train, x_test: x_test, y_train: y_train, y_test: y_test} = data = C7.MNIST.load() 263 | ``` 264 | 265 | One-hot encode the train labels. 266 | 267 | ```elixir 268 | updated_y_train = C7.MNIST.one_hot_encode(y_train) 269 | ``` 270 | 271 | ```elixir 272 | weight = 273 | C7.Classifier.train(x_train, updated_y_train, x_test, y_test, iterations = 200, lr = 1.0e-5) 274 | ``` 275 | -------------------------------------------------------------------------------- /07_final/seed_results.txt: -------------------------------------------------------------------------------- 1 | seed: 0, last report: {99999, 81.25%}, max report: {18909, 81.25%} 2 | seed: 1, last report: {99999, 81.25%}, max report: {34046, 81.25%} 3 | seed: 2, last report: {99999, 66.66666666666667%}, max report: {599, 75.0%} 4 | seed: 3, last report: {99999, 72.91666666666667%}, max report: {17413, 77.08333333333333%} 5 | seed: 4, last report: {99999, 83.33333333333333%}, max report: {73627, 83.33333333333333%} 6 | seed: 5, last report: {99999, 85.41666666666667%}, max report: {22479, 87.5%} 7 | seed: 6, last report: {99999, 85.41666666666667%}, max report: {5052, 87.5%} 8 | seed: 7, last report: {99999, 68.75%}, max report: {7600, 68.75%} 9 | seed: 8, last report: {99999, 83.33333333333333%}, max report: {18584, 85.41666666666667%} 10 | seed: 9, last report: {99999, 68.75%}, max report: {4100, 70.83333333333333%} 11 | seed: 10, last report: {99999, 75.0%}, max report: {20283, 77.08333333333333%} 12 | seed: 11, last report: {99999, 83.33333333333333%}, max report: {93458, 83.33333333333333%} 13 | seed: 12, last report: {99999, 68.75%}, max report: {39547, 68.75%} 14 | seed: 13, last report: {99999, 75.0%}, max report: {70571, 75.0%} 15 | seed: 14, last report: {99999, 75.0%}, max report: {8184, 81.25%} 16 | seed: 15, last report: {99999, 75.0%}, max report: {52230, 75.0%} 17 | seed: 16, last report: {99999, 79.16666666666667%}, max report: {19844, 83.33333333333333%} 18 | seed: 17, last report: {99999, 75.0%}, max report: {3787, 77.08333333333333%} 19 | seed: 18, last report: {99999, 83.33333333333333%}, max report: {60810, 87.5%} 20 | seed: 19, last report: {99999, 75.0%}, max report: {25625, 75.0%} 21 | seed: 20, last report: {99999, 70.83333333333333%}, max report: {10482, 72.91666666666667%} 22 | seed: 21, last report: {99999, 72.91666666666667%}, max report: {78058, 75.0%} 23 | seed: 22, last report: 
{99999, 79.16666666666667%}, max report: {27242, 81.25%} 24 | seed: 23, last report: {99999, 75.0%}, max report: {59677, 75.0%} 25 | seed: 24, last report: {99999, 79.16666666666667%}, max report: {7921, 83.33333333333333%} 26 | seed: 25, last report: {99999, 75.0%}, max report: {30436, 77.08333333333333%} 27 | seed: 26, last report: {99999, 75.0%}, max report: {10052, 75.0%} 28 | seed: 27, last report: {99999, 77.08333333333333%}, max report: {19500, 79.16666666666667%} 29 | seed: 28, last report: {99999, 66.66666666666667%}, max report: {800, 70.83333333333333%} 30 | seed: 29, last report: {99999, 83.33333333333333%}, max report: {44372, 85.41666666666667%} 31 | seed: 30, last report: {99999, 85.41666666666667%}, max report: {74103, 85.41666666666667%} 32 | seed: 31, last report: {99999, 75.0%}, max report: {20929, 79.16666666666667%} 33 | seed: 32, last report: {99999, 81.25%}, max report: {44995, 89.58333333333333%} 34 | seed: 33, last report: {99999, 79.16666666666667%}, max report: {2055, 81.25%} 35 | seed: 34, last report: {99999, 83.33333333333333%}, max report: {35563, 83.33333333333333%} 36 | seed: 35, last report: {99999, 77.08333333333333%}, max report: {72040, 77.08333333333333%} 37 | seed: 36, last report: {99999, 72.91666666666667%}, max report: {71573, 75.0%} 38 | seed: 37, last report: {99999, 79.16666666666667%}, max report: {27332, 79.16666666666667%} 39 | seed: 38, last report: {99999, 83.33333333333333%}, max report: {44491, 83.33333333333333%} 40 | seed: 39, last report: {99999, 70.83333333333333%}, max report: {35787, 72.91666666666667%} 41 | seed: 40, last report: {99999, 79.16666666666667%}, max report: {87046, 79.16666666666667%} 42 | seed: 41, last report: {99999, 75.0%}, max report: {33515, 81.25%} 43 | seed: 42, last report: {99999, 77.08333333333333%}, max report: {17229, 77.08333333333333%} 44 | seed: 43, last report: {99999, 75.0%}, max report: {3933, 77.08333333333333%} 45 | seed: 44, last report: {99999, 83.33333333333333%}, max report: {7832, 87.5%} 46 | seed: 45, last report: {99999, 83.33333333333333%}, max report: {8693, 85.41666666666667%} 47 | seed: 46, last report: {99999, 79.16666666666667%}, max report: {16584, 79.16666666666667%} 48 | seed: 47, last report: {99999, 77.08333333333333%}, max report: {53828, 77.08333333333333%} 49 | seed: 48, last report: {99999, 66.66666666666667%}, max report: {1788, 77.08333333333333%} 50 | seed: 49, last report: {99999, 75.0%}, max report: {3898, 85.41666666666667%} 51 | seed: 50, last report: {99999, 81.25%}, max report: {44039, 83.33333333333333%} 52 | seed: 51, last report: {99999, 72.91666666666667%}, max report: {3355, 75.0%} 53 | seed: 52, last report: {99999, 72.91666666666667%}, max report: {4164, 79.16666666666667%} 54 | seed: 53, last report: {99999, 79.16666666666667%}, max report: {12252, 79.16666666666667%} 55 | seed: 54, last report: {99999, 79.16666666666667%}, max report: {98892, 79.16666666666667%} 56 | seed: 55, last report: {99999, 75.0%}, max report: {17781, 75.0%} 57 | seed: 56, last report: {99999, 79.16666666666667%}, max report: {85959, 81.25%} 58 | seed: 57, last report: {99999, 79.16666666666667%}, max report: {5334, 83.33333333333333%} 59 | seed: 58, last report: {99999, 77.08333333333333%}, max report: {27106, 79.16666666666667%} 60 | seed: 59, last report: {99999, 72.91666666666667%}, max report: {5261, 72.91666666666667%} 61 | seed: 60, last report: {99999, 77.08333333333333%}, max report: {15309, 79.16666666666667%} 62 | seed: 61, last report: {99999, 77.08333333333333%}, max 
report: {48603, 79.16666666666667%} 63 | seed: 62, last report: {99999, 87.5%}, max report: {7114, 87.5%} 64 | seed: 63, last report: {99999, 77.08333333333333%}, max report: {1513, 79.16666666666667%} 65 | seed: 64, last report: {99999, 72.91666666666667%}, max report: {12601, 75.0%} 66 | seed: 65, last report: {99999, 66.66666666666667%}, max report: {5296, 75.0%} 67 | seed: 66, last report: {99999, 72.91666666666667%}, max report: {95258, 72.91666666666667%} 68 | seed: 67, last report: {99999, 81.25%}, max report: {14952, 85.41666666666667%} 69 | seed: 68, last report: {99999, 79.16666666666667%}, max report: {19580, 81.25%} 70 | seed: 69, last report: {99999, 85.41666666666667%}, max report: {80082, 85.41666666666667%} 71 | seed: 70, last report: {99999, 79.16666666666667%}, max report: {67560, 79.16666666666667%} 72 | seed: 71, last report: {99999, 81.25%}, max report: {59272, 81.25%} 73 | seed: 72, last report: {99999, 81.25%}, max report: {14584, 83.33333333333333%} 74 | seed: 73, last report: {99999, 75.0%}, max report: {1845, 75.0%} 75 | seed: 74, last report: {99999, 70.83333333333333%}, max report: {38719, 75.0%} 76 | seed: 75, last report: {99999, 68.75%}, max report: {47293, 68.75%} 77 | seed: 76, last report: {99999, 72.91666666666667%}, max report: {675, 77.08333333333333%} 78 | seed: 77, last report: {99999, 75.0%}, max report: {1945, 79.16666666666667%} 79 | seed: 78, last report: {99999, 81.25%}, max report: {38393, 81.25%} 80 | seed: 79, last report: {99999, 81.25%}, max report: {41124, 83.33333333333333%} 81 | seed: 80, last report: {99999, 77.08333333333333%}, max report: {3001, 77.08333333333333%} 82 | seed: 81, last report: {99999, 79.16666666666667%}, max report: {27362, 83.33333333333333%} 83 | seed: 82, last report: {99999, 68.75%}, max report: {35765, 72.91666666666667%} 84 | seed: 83, last report: {99999, 70.83333333333333%}, max report: {32968, 70.83333333333333%} 85 | seed: 84, last report: {99999, 72.91666666666667%}, max report: {8628, 79.16666666666667%} 86 | seed: 85, last report: {99999, 85.41666666666667%}, max report: {93904, 85.41666666666667%} 87 | seed: 86, last report: {99999, 75.0%}, max report: {48891, 77.08333333333333%} 88 | seed: 87, last report: {99999, 70.83333333333333%}, max report: {422, 72.91666666666667%} 89 | seed: 88, last report: {99999, 79.16666666666667%}, max report: {8319, 81.25%} 90 | seed: 89, last report: {99999, 70.83333333333333%}, max report: {30000, 77.08333333333333%} 91 | seed: 90, last report: {99999, 77.08333333333333%}, max report: {24308, 79.16666666666667%} 92 | seed: 91, last report: {99999, 66.66666666666667%}, max report: {34768, 72.91666666666667%} 93 | seed: 92, last report: {99999, 72.91666666666667%}, max report: {20252, 72.91666666666667%} 94 | seed: 93, last report: {99999, 75.0%}, max report: {18490, 83.33333333333333%} 95 | seed: 94, last report: {99999, 81.25%}, max report: {23952, 81.25%} 96 | seed: 95, last report: {99999, 77.08333333333333%}, max report: {3725, 81.25%} 97 | seed: 96, last report: {99999, 81.25%}, max report: {2720, 85.41666666666667%} 98 | seed: 97, last report: {99999, 85.41666666666667%}, max report: {10278, 93.75%} 99 | seed: 98, last report: {99999, 66.66666666666667%}, max report: {118, 72.91666666666667%} 100 | seed: 99, last report: {99999, 75.0%}, max report: {11320, 77.08333333333333%} 101 | seed: 100, last report: {99999, 79.16666666666667%}, max report: {39636, 81.25%} -------------------------------------------------------------------------------- 
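The results above are written out by `sonar_seed_comparison.ex` (included later in this folder). As a quick sanity check, the snippet below is a minimal sketch, not part of the original notebooks, that parses `seed_results.txt` and ranks the seeds by their best reported accuracy; it assumes the `seed: N, last report: {...}, max report: {iteration, accuracy%}` layout shown above. Seed 97 comes out on top (93.75% at iteration 10278), which is why `sonar_classifier.livemd` seeds the shuffle with 97.

```elixir
# Minimal sketch: rank the seeds in seed_results.txt by their best reported accuracy.
Path.join(__DIR__, "seed_results.txt")
|> File.read!()
|> String.split("\n", trim: true)
|> Enum.map(fn line ->
  # Assumes the exact line layout shown above; adjust the regex if the format changes.
  [_, seed, iteration, accuracy] =
    Regex.run(~r/seed: (\d+),.*max report: \{(\d+), ([\d.]+)%\}/, line)

  %{
    seed: String.to_integer(seed),
    best_at_iteration: String.to_integer(iteration),
    best_accuracy: String.to_float(accuracy)
  }
end)
|> Enum.sort_by(& &1.best_accuracy, :desc)
|> Enum.take(5)
```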
/07_final/sonar_classifier.livemd: -------------------------------------------------------------------------------- 1 | # SONAR, mines vs. rocks 2 | 3 | ```elixir 4 | Mix.install( 5 | [ 6 | {:exla, "~> 0.5"}, 7 | {:nx, "~> 0.5"}, 8 | {:vega_lite, "~> 0.1.6"}, 9 | {:kino, "~> 0.8.1"}, 10 | {:kino_vega_lite, "~> 0.1.7"} 11 | ], 12 | config: [nx: [default_backend: EXLA.Backend]] 13 | ) 14 | ``` 15 | 16 | ## Load Sonar data 17 | 18 | ```elixir 19 | defmodule C7.SonarDataset do 20 | @moduledoc """ 21 | Use this Module to load the Sonar database (test, train, and labels). 22 | 23 | Sonar dataset specifications can be found here: https://archive.ics.uci.edu/ml/datasets/Connectionist+Bench+(Sonar,+Mines+vs.+Rocks) 24 | The documentation of the dataset can be found in the `sonar.names` file. 25 | """ 26 | 27 | @data_path Path.join(__DIR__, "../data/sonar") |> Path.expand() 28 | 29 | @sonar_all_data_filename Path.join(@data_path, "sonar.all-data") 30 | 31 | defstruct [:x_train, :x_test, :y_train, :y_test] 32 | 33 | # Seed the random algorithm to initialize its state. 34 | :rand.seed(:exsss, 97) 35 | 36 | @doc """ 37 | Load the Sonar database and return the train and test data. 38 | 39 | The function accepts the argument `test_number` to define how many 40 | of the patterns will be used for training. 41 | The given number should consider that the are in total 208 patterns (111 from metal, 42 | 97 from rocks). The default value is 48, which is ~23% of the whole dataset. 43 | """ 44 | def load(test_number \\ 48) do 45 | # The file "sonar.all-data" contains 208 patterns: 46 | # - 111 patterns obtained by bouncing sonar signals off a 47 | # metal cylinder at various angles and under various conditions. 48 | # - 97 patterns obtained from rocks under similar conditions. 49 | # 50 | # - The patterns are ordered: first the 97 rocks ones and then the 111 metals ones. 51 | # - Each pattern is a set of 60 numbers in the range 0.0 to 1.0, followed by either 52 | # `M` or `R` depending if it has been obtained from a metal or rock. 53 | # 54 | 55 | with {:ok, binary} <- File.read(@sonar_all_data_filename) do 56 | data = 57 | binary 58 | |> parse() 59 | |> Enum.shuffle() 60 | 61 | {train_data, test_data} = Enum.split(data, length(data) - test_number) 62 | 63 | {x_train, y_train} = split_inputs_and_labels(train_data) 64 | {x_test, y_test} = split_inputs_and_labels(test_data) 65 | 66 | %__MODULE__{ 67 | x_train: prepend_bias(Nx.tensor(x_train)), 68 | x_test: prepend_bias(Nx.tensor(x_test)), 69 | y_train: Nx.tensor(y_train) |> Nx.reshape({:auto, 1}), 70 | y_test: Nx.tensor(y_test) |> Nx.reshape({:auto, 1}) 71 | } 72 | end 73 | end 74 | 75 | defp split_inputs_and_labels(data) do 76 | Enum.reduce(data, {[], []}, fn [pattern, label], {x, y} = _acc -> 77 | {x ++ [pattern], y ++ [label]} 78 | end) 79 | end 80 | 81 | defp pattern_type_to_label("M"), do: 1 82 | defp pattern_type_to_label("R"), do: 0 83 | 84 | defp parse(binary) do 85 | binary 86 | |> String.split("\n", trim: true) 87 | |> Enum.map(&String.split(&1, ",", trim: true)) 88 | |> Enum.map(fn row -> 89 | {pattern_type, pattern} = List.pop_at(row, -1) 90 | 91 | [ 92 | Enum.map(pattern, &String.to_float/1), 93 | pattern_type_to_label(pattern_type) 94 | ] 95 | end) 96 | end 97 | 98 | @doc """ 99 | One-hot encode the given tensor (classes: either 0 or 1). 100 | """ 101 | def one_hot_encode(y) do 102 | Nx.equal(y, Nx.tensor([0, 1])) 103 | end 104 | 105 | @doc """ 106 | Prepend a the bias, an extra column of 1s, to 107 | the given tensor. 
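  For the sonar data this turns an `{n, 60}` inputs tensor into an `{n, 61}`
  tensor, so the first row of the weight tensor acts as the bias term.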
108 | """ 109 | def prepend_bias(x) do 110 | bias = Nx.broadcast(1, {elem(Nx.shape(x), 0), 1}) 111 | 112 | # Insert a column of 1s in the position 0 of x. 113 | # ("axis: 1" stands for: "insert a column, not a row") 114 | # in python: `np.insert(X, 0, 1, axis=1)` 115 | Nx.concatenate([bias, x], axis: 1) 116 | end 117 | end 118 | ``` 119 | 120 | ```elixir 121 | %{x_train: x_train, x_test: x_test, y_train: y_train, y_test: y_test} = C7.SonarDataset.load() 122 | ``` 123 | 124 | Hot-encode the labels tensor (train data). 125 | 126 | ```elixir 127 | y_train = C7.SonarDataset.one_hot_encode(y_train) 128 | ``` 129 | 130 | ## Multiclass classifier 131 | 132 | ```elixir 133 | defmodule C7.Classifier do 134 | import Nx.Defn 135 | 136 | @doc """ 137 | A sigmoid function is a mathematical function having 138 | a characteristic "S"-shaped curve or sigmoid curve. 139 | 140 | A sigmoid function: 141 | - is monotonic 142 | - has no local minimums 143 | - has a non-negative derivative for each point 144 | 145 | More here https://en.wikipedia.org/wiki/Sigmoid_function 146 | """ 147 | @spec sigmoid(Nx.Tensor.t()) :: Nx.Tensor.t() 148 | defn sigmoid(z) do 149 | Nx.divide(1, Nx.add(1, Nx.exp(Nx.negate(z)))) 150 | end 151 | 152 | @doc """ 153 | Return the prediction tensor ŷ (y_hat) given the inputs and weight. 154 | The returned tensor is a matrix with the same dimensions as 155 | the weighted sum: one row per example, and one column. 156 | Each element in the matrix is now constrained between 0 and 1. 157 | """ 158 | @spec forward(Nx.Tensor.t(), Nx.Tensor.t()) :: Nx.Tensor.t() 159 | defn forward(x, weight) do 160 | weighted_sum = Nx.dot(x, weight) 161 | sigmoid(weighted_sum) 162 | end 163 | 164 | @doc """ 165 | Return the prediction rounded to forecast a value between 0 and 9. 166 | """ 167 | @spec classify(Nx.Tensor.t(), Nx.Tensor.t()) :: Nx.Tensor.t() 168 | defn classify(x, weight) do 169 | y_hat = forward(x, weight) 170 | 171 | # Get the index of the maximum value in each row of y_hat 172 | # (the value that’s closer to 1). 173 | # NOTE: in case of MNIST dataset, the returned index is also the 174 | # decoded label (0..9). 175 | labels = Nx.argmax(y_hat, axis: 1) 176 | 177 | Nx.reshape(labels, {:auto, 1}) 178 | end 179 | 180 | @doc """ 181 | Log loss function. 182 | """ 183 | @spec loss(Nx.Tensor.t(), Nx.Tensor.t(), Nx.Tensor.t()) :: Nx.Tensor.t() 184 | defn loss(x, y, weight) do 185 | y_hat = forward(x, weight) 186 | 187 | # Each label in the matrix `y_hat` is either `0` or `1`. 188 | # - `first_term` disappears when `y_hat` is 0 189 | # - `second_term` disappears when `y_hat` is 1 190 | first_term = y * Nx.log(y_hat) 191 | second_term = Nx.subtract(1, y) * Nx.log(Nx.subtract(1, y_hat)) 192 | 193 | # Corrected version (Chapter 7) 194 | Nx.add(first_term, second_term) 195 | |> Nx.sum() 196 | |> Nx.divide(elem(Nx.shape(x), 0)) 197 | |> Nx.negate() 198 | end 199 | 200 | @doc """ 201 | Returns the derivative of the loss curve. 
202 | """ 203 | @spec gradient(Nx.Tensor.t(), Nx.Tensor.t(), Nx.Tensor.t()) :: Nx.Tensor.t() 204 | defn gradient(x, y, weight) do 205 | # in python: 206 | # np.matmul(X.T, (predict(X, w) - Y)) / X.shape[0] 207 | 208 | predictions = forward(x, weight) 209 | errors = Nx.subtract(predictions, y) 210 | n_examples = elem(Nx.shape(x), 0) 211 | 212 | Nx.transpose(x) 213 | |> Nx.dot(errors) 214 | |> Nx.divide(n_examples) 215 | end 216 | 217 | @typep report_item :: { 218 | iteration :: integer(), 219 | training_loss :: float(), 220 | matches_percentage :: float() 221 | } 222 | 223 | @doc """ 224 | Utility to compute training loss and matches % per iteration, 225 | use this to report the result. 226 | """ 227 | @spec report( 228 | integer(), 229 | Nx.Tensor.t(), 230 | Nx.Tensor.t(), 231 | Nx.Tensor.t(), 232 | Nx.Tensor.t(), 233 | Nx.Tensor.t() 234 | ) :: report_item() 235 | def report(iteration, x_train, y_train, x_test, y_test, weight) do 236 | matches = matches(x_test, y_test, weight) |> Nx.to_number() 237 | n_test_examples = elem(Nx.shape(y_test), 0) 238 | matches = matches * 100.0 / n_test_examples 239 | training_loss = loss(x_train, y_train, weight) |> Nx.to_number() 240 | 241 | IO.inspect("Iteration #{iteration} => Loss: #{training_loss}, #{matches}%") 242 | 243 | {iteration, training_loss, matches} 244 | end 245 | 246 | defnp matches(x_test, y_test, weight) do 247 | classify(x_test, weight) 248 | |> Nx.equal(y_test) 249 | |> Nx.sum() 250 | end 251 | 252 | @doc """ 253 | Computes the weight by training the system 254 | with the given inputs and labels, by iterating 255 | over the examples the specified number of times. 256 | 257 | It returns a tuple with the final weight and the 258 | reports of the loss per iteration. 259 | """ 260 | @spec train( 261 | train_inputs :: Nx.Tensor.t(), 262 | train_labels :: Nx.Tensor.t(), 263 | test_inputs :: Nx.Tensor.t(), 264 | test_labels :: Nx.Tensor.t(), 265 | iterations :: integer(), 266 | learning_rate :: float() 267 | ) :: {weight :: Nx.Tensor.t(), reports :: [report_item()]} 268 | def train(x_train, y_train, x_test, y_test, iterations, lr) do 269 | init_weight = init_weight(x_train, y_train) 270 | init_reports = [report(0, x_train, y_train, x_test, y_test, init_weight)] 271 | 272 | {final_weight, reversed_reports} = 273 | Enum.reduce(1..(iterations - 1), {init_weight, init_reports}, fn i, {weight, reports} -> 274 | new_weight = step(x_train, y_train, weight, lr) 275 | updated_reports = [report(i, x_train, y_train, x_test, y_test, new_weight) | reports] 276 | 277 | {new_weight, updated_reports} 278 | end) 279 | 280 | {final_weight, Enum.reverse(reversed_reports)} 281 | end 282 | 283 | defnp step(x, y, weight, lr) do 284 | Nx.subtract(weight, Nx.multiply(gradient(x, y, weight), lr)) 285 | end 286 | 287 | # Returns a tensor of shape `{n, m}`, where 288 | # `n` is the number of columns in `x` (input variables) and 289 | # `m` is the number of columns in `y` (classes). 290 | # Each element in the tensor is initialized to 0. 291 | defnp init_weight(x, y) do 292 | n_input_variables = elem(Nx.shape(x), 1) 293 | n_classes = elem(Nx.shape(y), 1) 294 | Nx.broadcast(0, {n_input_variables, n_classes}) 295 | end 296 | end 297 | ``` 298 | 299 | ### Train and test the system 300 | 301 | * Shuffle the patterns randomly (random algorithm previously seeded) 302 | * The seed is initialized with `97`, which is the seed that gave the better results in terms of % of matches. 
303 | See `seed_results.txt` to compare how the % of matches differs when training the system with a different seed (from 0 to 100). To start the comparison run `elixir sonar_seed_comparison.ex` (it will take a while). 304 | * 48 examples used for testing 305 | * iterations `100_000` 306 | * learning rate `0.01` 307 | 308 | ```elixir 309 | {weight, reports} = 310 | C7.Classifier.train(x_train, y_train, x_test, y_test, iterations = 100_000, lr = 0.01) 311 | ``` 312 | 313 | ```elixir 314 | alias VegaLite, as: Vl 315 | 316 | iterations = Enum.map(reports, &elem(&1, 0)) 317 | matches = Enum.map(reports, &elem(&1, 2)) 318 | 319 | # Get the max matches % 320 | max_matches_index = Enum.find_index(matches, &(&1 == Enum.max(matches))) 321 | 322 | Vl.new(width: 800, height: 400) 323 | |> Vl.layers([ 324 | Vl.new() 325 | |> Vl.data_from_values(iterations: iterations, matches: matches) 326 | |> Vl.mark(:line) 327 | |> Vl.encode_field(:x, "iterations", type: :quantitative) 328 | |> Vl.encode_field(:y, "matches", type: :quantitative, title: "matches (%)"), 329 | Vl.new() 330 | |> Vl.data_from_values( 331 | iterations: [Enum.at(iterations, max_matches_index)], 332 | max_matches: [Enum.at(matches, max_matches_index)] 333 | ) 334 | |> Vl.mark(:circle, tooltip: true, size: "100", color: "red") 335 | |> Vl.encode_field(:x, "iterations", type: :quantitative) 336 | |> Vl.encode_field(:y, "max_matches", type: :quantitative, title: "matches (%)") 337 | ]) 338 | ``` 339 | -------------------------------------------------------------------------------- /07_final/sonar_seed_comparison.ex: -------------------------------------------------------------------------------- 1 | # Based on sonar_classifier 2 | 3 | Mix.install([ 4 | {:exla, "~> 0.1.0-dev", github: "elixir-nx/nx", sparse: "exla"}, 5 | {:nx, "~> 0.1.0-dev", github: "elixir-nx/nx", sparse: "nx", override: true} 6 | ]) 7 | 8 | # Set the backend 9 | Nx.Defn.global_default_options(compiler: EXLA) 10 | 11 | defmodule C7.SonarDataset do 12 | @moduledoc """ 13 | Use this Module to load the Sonar database (test, train, and labels). 14 | 15 | Sonar dataset specifications can be found here: https://archive.ics.uci.edu/ml/datasets/Connectionist+Bench+(Sonar,+Mines+vs.+Rocks) 16 | The documentation of the dataset can be found in the `sonar.names` file. 17 | """ 18 | 19 | @sonar_all_data_filename "../data/sonar/sonar.all-data" 20 | 21 | @type t :: %__MODULE__{ 22 | x_train: Nx.Tensor.t(), 23 | x_test: Nx.Tensor.t(), 24 | y_train: Nx.Tensor.t(), 25 | y_test: Nx.Tensor.t() 26 | } 27 | defstruct [:x_train, :x_test, :y_train, :y_test] 28 | 29 | @doc """ 30 | Load the Sonar database and return the train and test data. 31 | 32 | The function accepts the argument `rnd_seed` to initialize the Random algorithm 33 | before shuffling the list of patterns. 34 | The given number should consider that the are in total 208 patterns (111 from metal, 35 | 97 from rocks). The default value is 48, which is ~23% of the whole dataset. 36 | """ 37 | @spec load(rnd_seed :: integer()) :: t() 38 | def load(rnd_seed) do 39 | # The file "sonar.all-data" contains 208 patterns: 40 | # - 111 patterns obtained by bouncing sonar signals off a 41 | # metal cylinder at various angles and under various conditions. 42 | # - 97 patterns obtained from rocks under similar conditions. 43 | # 44 | # - The patterns are ordered: first the 97 rocks ones and then the 111 metals ones. 
45 | # - Each pattern is a set of 60 numbers in the range 0.0 to 1.0, followed by either 46 | # `M` or `R` depending if it has been obtained from a metal or rock. 47 | # 48 | 49 | :rand.seed(:exsss, rnd_seed) 50 | 51 | with {:ok, binary} <- File.read(@sonar_all_data_filename) do 52 | data = 53 | binary 54 | |> parse() 55 | |> Enum.shuffle() 56 | 57 | # Keep 48 examples for testing 58 | {train_data, test_data} = Enum.split(data, length(data) - 48) 59 | 60 | {x_train, y_train} = split_inputs_and_labels(train_data) 61 | {x_test, y_test} = split_inputs_and_labels(test_data) 62 | 63 | %__MODULE__{ 64 | x_train: prepend_bias(Nx.tensor(x_train)), 65 | x_test: prepend_bias(Nx.tensor(x_test)), 66 | y_train: Nx.tensor(y_train) |> Nx.reshape({:auto, 1}), 67 | y_test: Nx.tensor(y_test) |> Nx.reshape({:auto, 1}) 68 | } 69 | end 70 | end 71 | 72 | defp split_inputs_and_labels(data) do 73 | Enum.reduce(data, {[], []}, fn [pattern, label], {x, y} = _acc -> 74 | {x ++ [pattern], y ++ [label]} 75 | end) 76 | end 77 | 78 | defp pattern_type_to_label("M"), do: 1 79 | defp pattern_type_to_label("R"), do: 0 80 | 81 | defp parse(binary) do 82 | binary 83 | |> String.split("\n", trim: true) 84 | |> Enum.map(&String.split(&1, ",", trim: true)) 85 | |> Enum.map(fn row -> 86 | {pattern_type, pattern} = List.pop_at(row, -1) 87 | 88 | [ 89 | Enum.map(pattern, &String.to_float/1), 90 | pattern_type_to_label(pattern_type) 91 | ] 92 | end) 93 | end 94 | 95 | @doc """ 96 | One-hot encode the given tensor (classes: either 0 or 1). 97 | """ 98 | @spec one_hot_encode(y :: Nx.Tensor.t()) :: Nx.Tensor.t() 99 | def one_hot_encode(y) do 100 | Nx.equal(y, Nx.tensor([0, 1])) 101 | end 102 | 103 | @doc """ 104 | Prepend a the bias, an extra column of 1s, to 105 | the given tensor. 106 | """ 107 | @spec prepend_bias(Nx.Tensor.t()) :: Nx.Tensor.t() 108 | def prepend_bias(x) do 109 | bias = Nx.broadcast(1, {elem(Nx.shape(x), 0), 1}) 110 | 111 | # Insert a column of 1s in the position 0 of x. 112 | # ("axis: 1" stands for: "insert a column, not a row") 113 | # in python: `np.insert(X, 0, 1, axis=1)` 114 | Nx.concatenate([bias, x], axis: 1) 115 | end 116 | end 117 | 118 | defmodule C7.Classifier do 119 | import Nx.Defn 120 | 121 | @doc """ 122 | A sigmoid function is a mathematical function having 123 | a characteristic "S"-shaped curve or sigmoid curve. 124 | 125 | A sigmoid function: 126 | - is monotonic 127 | - has no local minimums 128 | - has a non-negative derivative for each point 129 | 130 | More here https://en.wikipedia.org/wiki/Sigmoid_function 131 | """ 132 | @spec sigmoid(Nx.Tensor.t()) :: Nx.Tensor.t() 133 | defn sigmoid(z) do 134 | Nx.divide(1, Nx.add(1, Nx.exp(Nx.negate(z)))) 135 | end 136 | 137 | @doc """ 138 | Return the prediction tensor ŷ (y_hat) given the inputs and weight. 139 | The returned tensor is a matrix with the same dimensions as 140 | the weighted sum: one row per example, and one column. 141 | Each element in the matrix is now constrained between 0 and 1. 142 | """ 143 | @spec forward(Nx.Tensor.t(), Nx.Tensor.t()) :: Nx.Tensor.t() 144 | defn forward(x, weight) do 145 | weighted_sum = Nx.dot(x, weight) 146 | sigmoid(weighted_sum) 147 | end 148 | 149 | @doc """ 150 | Return the prediction rounded to forecast a value between 0 and 9. 151 | """ 152 | @spec classify(Nx.Tensor.t(), Nx.Tensor.t()) :: Nx.Tensor.t() 153 | defn classify(x, weight) do 154 | y_hat = forward(x, weight) 155 | 156 | # Get the index of the maximum value in each row of y_hat 157 | # (the value that’s closer to 1). 
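    # For the sonar data in this script the labels are one-hot encoded into two
    # columns, so the returned index is 0 (rock) or 1 (metal).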
158 | # NOTE: in case of MNIST dataset, the returned index is also the 159 | # decoded label (0..9). 160 | labels = Nx.argmax(y_hat, axis: 1) 161 | 162 | Nx.reshape(labels, {:auto, 1}) 163 | end 164 | 165 | @doc """ 166 | Log loss function. 167 | """ 168 | @spec loss(Nx.Tensor.t(), Nx.Tensor.t(), Nx.Tensor.t()) :: Nx.Tensor.t() 169 | defn loss(x, y, weight) do 170 | y_hat = forward(x, weight) 171 | 172 | # Each label in the matrix `y_hat` is either `0` or `1`. 173 | # - `first_term` disappears when `y_hat` is 0 174 | # - `second_term` disappears when `y_hat` is 1 175 | first_term = y * Nx.log(y_hat) 176 | second_term = Nx.subtract(1, y) * Nx.log(Nx.subtract(1, y_hat)) 177 | 178 | # Corrected version (Chapter 7) 179 | Nx.add(first_term, second_term) 180 | |> Nx.sum() 181 | |> Nx.divide(elem(Nx.shape(x), 0)) 182 | |> Nx.negate() 183 | end 184 | 185 | @doc """ 186 | Returns the derivative of the loss curve. 187 | """ 188 | @spec gradient(Nx.Tensor.t(), Nx.Tensor.t(), Nx.Tensor.t()) :: Nx.Tensor.t() 189 | defn gradient(x, y, weight) do 190 | # in python: 191 | # np.matmul(X.T, (predict(X, w) - Y)) / X.shape[0] 192 | 193 | predictions = forward(x, weight) 194 | errors = Nx.subtract(predictions, y) 195 | n_examples = elem(Nx.shape(x), 0) 196 | 197 | Nx.transpose(x) 198 | |> Nx.dot(errors) 199 | |> Nx.divide(n_examples) 200 | end 201 | 202 | @typep report_item :: { 203 | iteration :: integer(), 204 | training_loss :: float(), 205 | matches_percentage :: float() 206 | } 207 | 208 | @doc """ 209 | Utility to compute training loss and matches % per iteration, 210 | use this to report the result. 211 | """ 212 | @spec report( 213 | integer(), 214 | Nx.Tensor.t(), 215 | Nx.Tensor.t(), 216 | Nx.Tensor.t(), 217 | Nx.Tensor.t(), 218 | Nx.Tensor.t() 219 | ) :: report_item() 220 | def report(iteration, x_train, y_train, x_test, y_test, weight) do 221 | matches = matches(x_test, y_test, weight) |> Nx.to_number() 222 | n_test_examples = elem(Nx.shape(y_test), 0) 223 | matches = matches * 100.0 / n_test_examples 224 | training_loss = loss(x_train, y_train, weight) |> Nx.to_number() 225 | 226 | # Commented to don't freeze the browser 227 | # IO.inspect("Iteration #{iteration} => Loss: #{training_loss}, #{matches}%") 228 | {iteration, training_loss, matches} 229 | end 230 | 231 | defnp matches(x_test, y_test, weight) do 232 | classify(x_test, weight) 233 | |> Nx.equal(y_test) 234 | |> Nx.sum() 235 | end 236 | 237 | @doc """ 238 | Computes the weight by training the system 239 | with the given inputs and labels, by iterating 240 | over the examples the specified number of times. 241 | 242 | It returns a tuple with the final weight and the 243 | reports of the loss per iteration. 
244 | """ 245 | @spec train( 246 | train_inputs :: Nx.Tensor.t(), 247 | train_labels :: Nx.Tensor.t(), 248 | test_inputs :: Nx.Tensor.t(), 249 | test_labels :: Nx.Tensor.t(), 250 | iterations :: integer(), 251 | learning_rate :: float() 252 | ) :: {weight :: Nx.Tensor.t(), reports :: [report_item()]} 253 | def train(x_train, y_train, x_test, y_test, iterations, lr) do 254 | init_weight = init_weight(x_train, y_train) 255 | init_reports = [report(0, x_train, y_train, x_test, y_test, init_weight)] 256 | 257 | {final_weight, reversed_reports} = 258 | Enum.reduce(1..(iterations - 1), {init_weight, init_reports}, fn i, {weight, reports} -> 259 | new_weight = step(x_train, y_train, weight, lr) 260 | updated_reports = [report(i, x_train, y_train, x_test, y_test, new_weight) | reports] 261 | 262 | {new_weight, updated_reports} 263 | end) 264 | 265 | {final_weight, Enum.reverse(reversed_reports)} 266 | end 267 | 268 | defnp step(x, y, weight, lr) do 269 | Nx.subtract(weight, Nx.multiply(gradient(x, y, weight), lr)) 270 | end 271 | 272 | # Returns a tensor of shape `{n, m}`, where 273 | # `n` is the number of columns in `x` (input variables) and 274 | # `m` is the number of columns in `y` (classes). 275 | # Each element in the tensor is initialized to 0. 276 | defnp init_weight(x, y) do 277 | n_input_variables = elem(Nx.shape(x), 1) 278 | n_classes = elem(Nx.shape(y), 1) 279 | Nx.broadcast(0, {n_input_variables, n_classes}) 280 | end 281 | end 282 | 283 | seed_range = 0..100 284 | 285 | results = 286 | Enum.reduce(seed_range, [], fn seed, acc -> 287 | IO.puts "Seed #{seed} in progress..." 288 | 289 | %{x_train: x_train, x_test: x_test, y_train: y_train, y_test: y_test} = 290 | C7.SonarDataset.load(seed) 291 | 292 | y_train = C7.SonarDataset.one_hot_encode(y_train) 293 | 294 | {weight, reports} = 295 | C7.Classifier.train(x_train, y_train, x_test, y_test, iterations = 100_000, lr = 0.01) 296 | 297 | max_match_report = Enum.max_by(reports, &elem(&1, 2)) 298 | 299 | acc ++ [{seed, Enum.at(reports, -1), max_match_report}] 300 | end) 301 | 302 | content = 303 | Enum.map(results, fn {seed, last_report, max_report} -> 304 | "seed: #{seed}, last report: {#{elem(last_report, 0)}, #{elem(last_report, 2)}%}, max report: {#{elem(max_report, 0)}, #{elem(max_report, 2)}%}" 305 | end) 306 | |> Enum.join("\n") 307 | 308 | File.write!("./seed_results.txt", content) 309 | -------------------------------------------------------------------------------- /10_building/forward_propagation.livemd: -------------------------------------------------------------------------------- 1 | # Chapter 10: Building the Network 2 | 3 | ```elixir 4 | Mix.install( 5 | [ 6 | {:exla, "~> 0.5"}, 7 | {:nx, "~> 0.5"}, 8 | {:jason, "~> 1.4"} 9 | ], 10 | config: [nx: [default_backend: EXLA.Backend]] 11 | ) 12 | ``` 13 | 14 | ## Load MNIST images 15 | 16 | _The module to load MNIST data is based on the chapter 7 implementation._ 17 | 18 | ```elixir 19 | defmodule C10.MNIST do 20 | @moduledoc """ 21 | Use this Module to load the MNIST database (test, train, and labels). 
22 | 23 | MNIST dataset specifications can be found here: http://yann.lecun.com/exdb/mnist/ 24 | """ 25 | 26 | @data_path Path.join(__DIR__, "../data/mnist") |> Path.expand() 27 | 28 | @train_images_filename Path.join(@data_path, "train-images-idx3-ubyte.gz") 29 | @test_images_filename Path.join(@data_path, "t10k-images-idx3-ubyte.gz") 30 | @train_labels_filename Path.join(@data_path, "train-labels-idx1-ubyte.gz") 31 | @test_labels_filename Path.join(@data_path, "t10k-labels-idx1-ubyte.gz") 32 | 33 | defstruct [:x_train, :x_test, :y_train, :y_test] 34 | 35 | @doc """ 36 | Load the MNIST database and return the train and test images. 37 | """ 38 | def load() do 39 | %__MODULE__{ 40 | # 60000 images, each 784 elements (28 * 28 pixels) 41 | x_train: load_images(@train_images_filename), 42 | # 10000 images, each 784 elements, with the same structure as `x_train` 43 | x_test: load_images(@test_images_filename), 44 | # 60000 labels 45 | y_train: load_labels(@train_labels_filename), 46 | # 10000 labels, with the same encoding as `y_train` 47 | y_test: load_labels(@test_labels_filename) 48 | } 49 | end 50 | 51 | @doc """ 52 | One-hot encode the given tensor (classes: from 0 to 9). 53 | """ 54 | def one_hot_encode(y) do 55 | Nx.equal(y, Nx.tensor(Enum.to_list(0..9))) 56 | end 57 | 58 | @doc """ 59 | Load the MNIST labels from the given file 60 | and return a matrix. 61 | """ 62 | def load_labels(filename) do 63 | # Open and unzip the file of labels 64 | with {:ok, binary} <- File.read(filename) do 65 | <<_::32, n_labels::32, labels_binary::binary>> = :zlib.gunzip(binary) 66 | 67 | # Create a tensor from the binary and 68 | # reshape the list of labels into a one-column matrix. 69 | labels_binary 70 | |> Nx.from_binary({:u, 8}) 71 | |> Nx.reshape({n_labels, 1}) 72 | end 73 | end 74 | 75 | @doc """ 76 | Load the MNIST images from the given file 77 | and return a matrix. 78 | """ 79 | def load_images(filename) do 80 | # Open and unzip the file of images 81 | with {:ok, binary} <- File.read(filename) do 82 | <<_::32, n_images::32, n_rows::32, n_cols::32, images_binary::binary>> = 83 | :zlib.gunzip(binary) 84 | 85 | # Create a tensor from the binary and 86 | # reshape the pixels into a matrix where each line is an image. 87 | images_binary 88 | |> Nx.from_binary({:u, 8}) 89 | |> Nx.reshape({n_images, n_cols * n_rows}) 90 | end 91 | end 92 | end 93 | ``` 94 | 95 | ### Load the data 96 | 97 | ```elixir 98 | # Use the public API to get train and test images 99 | %{x_train: x_train, x_test: x_test, y_train: y_train, y_test: y_test} = data = C10.MNIST.load() 100 | ``` 101 | 102 | ## Writing the Softmax Function 103 | 104 | Softmax is used as activation function as the `sigmoid` we used in the previous chapter. It is used in the last network's layer. 
105 | 106 | 107 | 108 | $$ 109 | softmax(l_i) = \cfrac{e^{l_i}}{\sum e^{l}} 110 | $$ 111 | 112 | ```elixir 113 | softmax = fn logits -> 114 | exponentials = Nx.exp(logits) 115 | 116 | Nx.divide( 117 | exponentials, 118 | Nx.sum(exponentials, axes: [1]) |> Nx.reshape({:auto, 1}) 119 | ) 120 | end 121 | 122 | output = Nx.tensor([[0.3, 0.8, 0.2], [0.1, 0.9, 0.1]]) 123 | 124 | softmax.(output) 125 | ``` 126 | 127 | ### Numerical Stability 128 | 129 | Our implementations of `softmax/1` and `sigmoid/1` have a problem: they're numerically unstable, meaning that they amplify small changes in the inputs 130 | 131 | ```elixir 132 | softmax.(Nx.tensor([[1, 20]])) |> IO.inspect(label: "softmax([[1, 20]])") 133 | 134 | softmax.(Nx.tensor([[1, 1000]])) |> IO.inspect(label: "softmax([[1, 1000]])") 135 | 136 | :ok 137 | ``` 138 | 139 | ## Forward propagation and Cross entropy 140 | 141 | Update the classifier implemented in chapter 7 with: 142 | 143 | * Softmax activation function `softmax/1` 144 | * Forward propagation `forward/3` 145 | * Classification function `classify/3` 146 | * Cross-entropy loss `loss/2` 147 | 148 | ```elixir 149 | defmodule C10.Classifier do 150 | import Nx.Defn 151 | 152 | @doc """ 153 | A sigmoid function is a mathematical function having 154 | a characteristic "S"-shaped curve or sigmoid curve. 155 | 156 | A sigmoid function: 157 | - is monotonic 158 | - has no local minimums 159 | - has a non-negative derivative for each point 160 | 161 | It is used as activation function in the intermediate 162 | layers of a neural network. 163 | 164 | More here https://en.wikipedia.org/wiki/Sigmoid_function 165 | """ 166 | defn sigmoid(z) do 167 | Nx.divide(1, Nx.add(1, Nx.exp(Nx.negate(z)))) 168 | end 169 | 170 | @doc """ 171 | A softmax function turns a list of numbers (logits) 172 | into probabilities that sum to one. 173 | 174 | It is used as activation function in the last 175 | layer of a neural network. 176 | 177 | More here https://en.wikipedia.org/wiki/Softmax_function 178 | """ 179 | defn softmax(logits) do 180 | exponentials = Nx.exp(logits) 181 | 182 | Nx.divide( 183 | exponentials, 184 | Nx.sum(exponentials, axes: [1]) |> Nx.reshape({:auto, 1}) 185 | ) 186 | end 187 | 188 | @doc """ 189 | Prepend a the bias, an extra column of 1s, to 190 | the given tensor. 191 | """ 192 | defn prepend_bias(x) do 193 | bias = Nx.broadcast(1, {elem(Nx.shape(x), 0), 1}) 194 | 195 | # Insert a column of 1s in the position 0 of x. 196 | # ("axis: 1" stands for: "insert a column, not a row") 197 | Nx.concatenate([bias, x], axis: 1) 198 | end 199 | 200 | @doc """ 201 | Return the prediction tensor ŷ (y_hat) given the inputs and weights. 202 | The returned tensor is a matrix with the same dimensions as 203 | the weighted sum: one row per example, and one column. 204 | Each element in the matrix is now constrained between 0 and 1. 205 | """ 206 | defn forward(x, weight1, weight2) do 207 | h = sigmoid(Nx.dot(prepend_bias(x), weight1)) 208 | softmax(Nx.dot(prepend_bias(h), weight2)) 209 | end 210 | 211 | @doc """ 212 | Return the prediction rounded to forecast a value between 0 and 9. 213 | """ 214 | defn classify(x, weight1, weight2) do 215 | y_hat = forward(x, weight1, weight2) 216 | 217 | # Get the index of the maximum value in each row of y_hat 218 | # (the value that’s closer to 1). 219 | # NOTE: in case of MNIST dataset, the returned index is also the 220 | # decoded label (0..9). 
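    # For MNIST, y_hat has one column per digit class (10), so the argmax along
    # axis 1 yields one predicted digit per image.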
221 | labels = Nx.argmax(y_hat, axis: 1) 222 | 223 | Nx.reshape(labels, {:auto, 1}) 224 | end 225 | 226 | @doc """ 227 | Cross-entropy loss. 228 | 229 | It measures the distance between the classifier's prediction 230 | and the labels. 231 | """ 232 | defn loss(y, y_hat) do 233 | # In python: -np.sum(Y * np.log(y_hat)) / Y.shape[0] 234 | -Nx.sum(y * Nx.log(y_hat)) / elem(Nx.shape(y), 0) 235 | end 236 | 237 | @doc """ 238 | Utility to report (to stdout) the loss per iteration. 239 | """ 240 | def report(iteration, x_train, y_train, x_test, y_test, weight1, weight2) do 241 | y_hat = forward(x_train, weight1, weight2) 242 | training_loss = loss(y_train, y_hat) |> Nx.to_number() 243 | classifications = classify(x_test, weight1, weight2) 244 | accuracy = Nx.multiply(Nx.mean(Nx.equal(classifications, y_test)), 100.0) |> Nx.to_number() 245 | 246 | IO.puts("Iteration #{iteration}, Loss: #{training_loss}, Accuracy: #{accuracy}%") 247 | end 248 | end 249 | ``` 250 | 251 | ## Hands on: Time Travel Testing 252 | 253 | ### Test the system with some pre-computed weights 254 | 255 | ```elixir 256 | [weight1, weight2] = 257 | Path.join(__DIR__, "./weights.json") 258 | |> Path.expand() 259 | |> File.read!() 260 | |> Jason.decode!() 261 | |> Enum.map(&Nx.tensor/1) 262 | 263 | C10.Classifier.report(0, x_train, y_train, x_test, y_test, weight1, weight2) 264 | ``` 265 | -------------------------------------------------------------------------------- /11_training/neural_network.livemd: -------------------------------------------------------------------------------- 1 | # Chapter 11: Training the Network 2 | 3 | ```elixir 4 | Mix.install( 5 | [ 6 | {:exla, "~> 0.5"}, 7 | {:nx, "~> 0.5"} 8 | ], 9 | config: [nx: [default_backend: EXLA.Backend]] 10 | ) 11 | ``` 12 | 13 | ## Load MNIST dataset 14 | 15 | _The module to load MNIST data is the based on the developed one in the chapter 10, but the returned `y_train` is already hot-encoded._ 16 | 17 | ```elixir 18 | defmodule C11.MNIST do 19 | @moduledoc """ 20 | Use this Module to load the MNIST database (test, train, and labels). 21 | 22 | MNIST dataset specifications can be found here: http://yann.lecun.com/exdb/mnist/ 23 | """ 24 | 25 | @data_path Path.join(__DIR__, "../data/mnist") |> Path.expand() 26 | 27 | @train_images_filename Path.join(@data_path, "train-images-idx3-ubyte.gz") 28 | @test_images_filename Path.join(@data_path, "t10k-images-idx3-ubyte.gz") 29 | @train_labels_filename Path.join(@data_path, "train-labels-idx1-ubyte.gz") 30 | @test_labels_filename Path.join(@data_path, "t10k-labels-idx1-ubyte.gz") 31 | 32 | defstruct [:x_train, :x_test, :y_train, :y_test] 33 | 34 | @doc """ 35 | Load the MNIST database and return the train and test images. 36 | 37 | `y_train` already hot-encoded. 38 | """ 39 | def load() do 40 | %__MODULE__{ 41 | # 60000 images, each 784 elements (28 * 28 pixels) 42 | x_train: load_images(@train_images_filename), 43 | # 10000 images, each 784 elements, with the same structure as `x_train` 44 | x_test: load_images(@test_images_filename), 45 | # 60000 labels 46 | y_train: load_labels(@train_labels_filename) |> one_hot_encode(), 47 | # 10000 labels, with the same encoding as `y_train` 48 | y_test: load_labels(@test_labels_filename) 49 | } 50 | end 51 | 52 | @doc """ 53 | One-hot encode the given tensor (classes: from 0 to 9). 54 | """ 55 | def one_hot_encode(y) do 56 | Nx.equal(y, Nx.tensor(Enum.to_list(0..9))) 57 | end 58 | 59 | @doc """ 60 | Load the MNIST labels from the given file 61 | and return a matrix. 
62 | """ 63 | def load_labels(filename) do 64 | # Open and unzip the file of labels 65 | with {:ok, binary} <- File.read(filename) do 66 | <<_::32, n_labels::32, labels_binary::binary>> = :zlib.gunzip(binary) 67 | 68 | # Create a tensor from the binary and 69 | # reshape the list of labels into a one-column matrix. 70 | labels_binary 71 | |> Nx.from_binary({:u, 8}) 72 | |> Nx.reshape({n_labels, 1}) 73 | end 74 | end 75 | 76 | @doc """ 77 | Load the MNIST images from the given file 78 | and return a matrix. 79 | """ 80 | def load_images(filename) do 81 | # Open and unzip the file of images 82 | with {:ok, binary} <- File.read(filename) do 83 | <<_::32, n_images::32, n_rows::32, n_cols::32, images_binary::binary>> = 84 | :zlib.gunzip(binary) 85 | 86 | # Create a tensor from the binary and 87 | # reshape the pixels into a matrix where each line is an image. 88 | images_binary 89 | |> Nx.from_binary({:u, 8}) 90 | |> Nx.reshape({n_images, n_cols * n_rows}) 91 | end 92 | end 93 | end 94 | ``` 95 | 96 | ### Load the data. 97 | 98 | ```elixir 99 | # Use the public API to get train and test images 100 | %{x_train: x_train, x_test: x_test, y_train: y_train, y_test: y_test} = data = C11.MNIST.load() 101 | ``` 102 | 103 | ## Backpropagation 104 | 105 | Local gradient for the `w2`: 106 | 107 | $$ 108 | 109 | \frac {\partial L}{\partial w2} = SML\rq \cdot \frac {\partial b}{\partial w2} = (\hat y - y) \cdot h 110 | 111 | $$ 112 | 113 | 114 | 115 | Local gradient for the `w1`: 116 | 117 | $$ 118 | 119 | \frac {\partial L}{\partial w1} = SML\rq \cdot \frac {\partial b}{\partial w1} \cdot \sigma\rq \frac {\partial a}{\partial w1}= (\hat y - y) \cdot w2 \cdot \sigma \cdot (1 - \sigma) \cdot x 120 | 121 | $$ 122 | 123 | ## The Finished Network 124 | 125 | _The Neural Network is based on the classifier implemented in chapter 7, but with the backpropagation step when training the model._ 126 | 127 | ```elixir 128 | defmodule C11.Classifier do 129 | import Nx.Defn 130 | 131 | @doc """ 132 | A sigmoid function is a mathematical function having 133 | a characteristic "S"-shaped curve or sigmoid curve. 134 | 135 | A sigmoid function: 136 | - is monotonic 137 | - has no local minimums 138 | - has a non-negative derivative for each point 139 | 140 | It is used as activation function in the intermediate 141 | layers of a neural network. 142 | 143 | More here https://en.wikipedia.org/wiki/Sigmoid_function 144 | """ 145 | defn sigmoid(z) do 146 | Nx.divide(1, Nx.add(1, Nx.exp(Nx.negate(z)))) 147 | end 148 | 149 | @doc """ 150 | A softmax function turns a list of numbers (logits) 151 | into probabilities that sum to one. 152 | 153 | It is used as activation function in the last 154 | layer of a neural network. 155 | 156 | More here https://en.wikipedia.org/wiki/Softmax_function 157 | 158 | For MNIST dataset, the `logits` is a tensor `{60_000, 10}`: 159 | - one row for each MNIST image (60_000) 160 | - one column per class (0..9) 161 | and it must return a matrix of the same shape. 162 | """ 163 | defn softmax(logits) do 164 | exponentials = Nx.exp(logits) 165 | 166 | Nx.divide( 167 | exponentials, 168 | Nx.sum(exponentials, axes: [1]) |> Nx.reshape({:auto, 1}) 169 | ) 170 | end 171 | 172 | @doc """ 173 | Helper function that calculates the sigmoid's gradient 174 | from the sigmoid's output. 175 | """ 176 | defn sigmoid_gradient(sigmoid) do 177 | Nx.multiply(sigmoid, 1 - sigmoid) 178 | end 179 | 180 | @doc """ 181 | Cross-entropy loss. 
182 | 183 | It measures the distance between the classifier's prediction 184 | and the labels. 185 | """ 186 | defn loss(y, y_hat) do 187 | -Nx.sum(y * Nx.log(y_hat)) / elem(Nx.shape(y), 0) 188 | end 189 | 190 | @doc """ 191 | Prepend a the bias, an extra column of 1s, to 192 | the given tensor. 193 | """ 194 | defn prepend_bias(x) do 195 | bias = Nx.broadcast(1, {elem(Nx.shape(x), 0), 1}) 196 | 197 | # Insert a column of 1s in the position 0 of x. 198 | # ("axis: 1" stands for: "insert a column, not a row") 199 | Nx.concatenate([bias, x], axis: 1) 200 | end 201 | 202 | @doc """ 203 | Forward propagation: it propagates data "forward" through the 204 | network's layers, from input to hidden layer to output 205 | It returns a tuple `{ŷ, h}` with the prediction tensor `ŷ` (`y_hat`) 206 | and the tensor `h` for the hidden layer given the inputs and weights. 207 | 208 | Each element in the tensors is now constrained between 0 and 1, 209 | but the activation functions used for `h` and `y_hat` are 210 | different: 211 | - `sigmoid` for the hidden layer `h` 212 | - `softmax` for the prediction tensor `y_hat` 213 | 214 | Tensors shapes: 215 | - `weight1` shape: `{785, 200}` 216 | - `h` shape: `{60000, 200}` 217 | - `weight2` shape: `{201, 10}` 218 | - `y_hat` shape: `{60000, 10}` 219 | """ 220 | defn forward(x, weight1, weight2) do 221 | h = sigmoid(Nx.dot(prepend_bias(x), weight1)) 222 | y_hat = softmax(Nx.dot(prepend_bias(h), weight2)) 223 | 224 | {y_hat, h} 225 | end 226 | 227 | @doc """ 228 | Calculates the gradients of the weights by multiplying the local 229 | gradients of individual operations, from the loss to the weights. 230 | 231 | It uses the chain rule to calculate the gradient of any 232 | node `y` with respect to any other node `x`, we multiply 233 | the local gradient of all the nodes on the way back from `y` to `x`. 234 | Thanks to the chain rule, we can calculate a complicated gradient 235 | as a multiplication of many simple gradients. 236 | """ 237 | defn back(x, y, y_hat, weight2, h) do 238 | # - The swapping and transposition are needed to get the correct dimension 239 | # - The bias columnm is prepended to `h` as it is done in `forward/3` 240 | # - It is divided by `elem(Nx.shape(x), 0)` because the matrix multiplication 241 | # gives us the accumulated gradient over all the examples, but we want the 242 | # average gradient 243 | # 244 | # numpy: 245 | # w2_gradient = np.matmul(prepend_bias(h).T, y_hat - Y) / X.shape[0] 246 | w2_gradient = 247 | Nx.dot( 248 | Nx.transpose(prepend_bias(h)), 249 | Nx.subtract(y_hat, y) 250 | ) / elem(Nx.shape(x), 0) 251 | 252 | # - The swapping and transposition are needed to get a result with 253 | # the same dimensions as `w2`. 254 | # - In this case, we don't need to add a bias column to `h` because 255 | # the bias is added after its calculation (when computing `y_hat`) 256 | # Instead, the bias is prepended to `x` as it is done in the `forward/3` 257 | # function 258 | # - And since we ignored the the bias prepended to `h`, we need to 259 | # ignore the its weights (1st row of `weight2`). 
260 | # - It is divided by `elem(Nx.shape(x), 0)` because the matrix multiplication 261 | # gives us the accumulated gradient over all the examples, but we want the 262 | # average gradient 263 | # 264 | # numpy: 265 | # w1_gradient = np.matmul(prepend_bias(X).T, np.matmul(y_hat - Y, w2[1:].T) * sigmoid_gradient(h)) / X.shape[0] 266 | w1_gradient = 267 | Nx.dot( 268 | Nx.transpose(prepend_bias(x)), 269 | Nx.dot(y_hat - y, Nx.transpose(weight2[1..-1//1])) * sigmoid_gradient(h) 270 | ) / elem(Nx.shape(x), 0) 271 | 272 | {w1_gradient, w2_gradient} 273 | end 274 | 275 | @doc """ 276 | Return a single-column matrix of prediction, where each value is between 0 and 9. 277 | """ 278 | defn classify(x, weight1, weight2) do 279 | {y_hat, _h} = forward(x, weight1, weight2) 280 | 281 | # Get the index of the maximum value in each row of `y_hat` 282 | # (the value that's closer to 1). 283 | # NOTE: in case of MNIST dataset, the returned index is also the 284 | # decoded label (0..9). 285 | labels = Nx.argmax(y_hat, axis: 1) 286 | 287 | Nx.reshape(labels, {:auto, 1}) 288 | end 289 | 290 | @doc """ 291 | Initialize the weights `weight1` and `weight2` with 292 | the given shape passed as options. 293 | 294 | - The weights are initialized with random numbers to "break the symmetry", 295 | otherwise our neural network would behave as if it had only 296 | one hidden node. 297 | - The initial values must be small because large values 298 | can cause problems if the network's function are not numerically 299 | stable (overflow). Plus, large values make the training slower with the 300 | possibility to halt the learning completely ("dead neurons"). 301 | 302 | The initialization of the weights is done via `Nx.Random.normal/4` 303 | https://hexdocs.pm/nx/Nx.Random.html#normal/4 304 | 305 | And for doing that we need to pass a pseudo-random number generator (PRNG) key 306 | to the function as argument, a new one for each different Nx’s random number generation 307 | calls. 308 | https://hexdocs.pm/nx/Nx.Random.html#module-design-and-context 309 | """ 310 | defn initialize_weights(opts \\ []) do 311 | opts = keyword!(opts, [:w1_shape, :w2_shape]) 312 | mean = 0.0 313 | std_deviation = 0.01 314 | 315 | prng_key = Nx.Random.key(1234) 316 | 317 | {weight1, new_prng_key} = 318 | Nx.Random.normal(prng_key, mean, std_deviation, shape: opts[:w1_shape]) 319 | 320 | {weight2, _new_prng_key} = 321 | Nx.Random.normal(new_prng_key, mean, std_deviation, shape: opts[:w2_shape]) 322 | 323 | {weight1, weight2} 324 | end 325 | 326 | @doc """ 327 | Utility to report (to stdout) the loss per iteration. 328 | """ 329 | def report(iteration, x_train, y_train, x_test, y_test, weight1, weight2) do 330 | {y_hat, _h} = forward(x_train, weight1, weight2) 331 | training_loss = loss(y_train, y_hat) |> Nx.to_number() 332 | classifications = classify(x_test, weight1, weight2) 333 | accuracy = Nx.multiply(Nx.mean(Nx.equal(classifications, y_test)), 100.0) |> Nx.to_number() 334 | 335 | IO.puts("Iteration #{iteration}, Loss: #{training_loss}, Accuracy: #{accuracy}%") 336 | end 337 | 338 | @doc """ 339 | Computes the weights `w1` and `w2` by training the system 340 | with the given inputs and labels, by iterating 341 | over the examples the specified number of times. 342 | 343 | For each iteration, it prints the loss and the accuracy. 
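  It returns the final `{weight1, weight2}` tuple.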
344 | """ 345 | def train(x_train, y_train, x_test, y_test, n_hidden_nodes, iterations, lr) do 346 | n_input_variables = elem(Nx.shape(x_train), 1) 347 | n_classes = elem(Nx.shape(y_train), 1) 348 | 349 | {initial_weight_1, initial_weight_2} = 350 | initialize_weights( 351 | w1_shape: {n_input_variables + 1, n_hidden_nodes}, 352 | w2_shape: {n_hidden_nodes + 1, n_classes} 353 | ) 354 | 355 | Enum.reduce(0..(iterations - 1), {initial_weight_1, initial_weight_2}, fn i, {w1, w2} -> 356 | {updated_w1, updated_w2} = step(x_train, y_train, w1, w2, lr) 357 | report(i, x_train, y_train, x_test, y_test, updated_w1, updated_w2) 358 | {updated_w1, updated_w2} 359 | end) 360 | end 361 | 362 | defnp step(x_train, y_train, w1, w2, lr) do 363 | {y_hat, h} = forward(x_train, w1, w2) 364 | {w1_gradient, w2_gradient} = back(x_train, y_train, y_hat, w2, h) 365 | w1 = w1 - w1_gradient * lr 366 | w2 = w2 - w2_gradient * lr 367 | 368 | {w1, w2} 369 | end 370 | end 371 | ``` 372 | 373 | ### Training the network 374 | 375 | ```elixir 376 | hidden_nodes = 200 377 | # In the books the NN is trained for 10_000 iterations 378 | # but already 3500 lead to the same result in terms of 379 | # loss and accuracy 380 | iterations = 3500 381 | learning_rate = 0.01 382 | 383 | {w1, w2} = 384 | C11.Classifier.train( 385 | x_train, 386 | y_train, 387 | x_test, 388 | y_test, 389 | hidden_nodes, 390 | iterations, 391 | learning_rate 392 | ) 393 | ``` 394 | 395 | ``` 396 | Iteration 0, Loss: 2.2787163257598877, Accuracy: 13.799999237060547% 397 | Iteration 1, Loss: 2.2671103477478027, Accuracy: 18.26999855041504% 398 | Iteration 2, Loss: 2.2557082176208496, Accuracy: 24.209999084472656% 399 | Iteration 3, Loss: 2.2444677352905273, Accuracy: 29.920000076293945% 400 | Iteration 4, Loss: 2.2333567142486572, Accuracy: 34.720001220703125% 401 | ... 402 | ... 403 | Iteration 3497, Loss: 0.13987472653388977, Accuracy: 94.20999908447266% 404 | Iteration 3498, Loss: 0.13985344767570496, Accuracy: 94.20999908447266% 405 | Iteration 3499, Loss: 0.13983216881752014, Accuracy: 94.20999908447266% 406 | ``` 407 | 408 | After 35 minutes ca. 
and 3500 iterations: 409 | 410 | * Loss: 0.13983216881752014 411 | * Accuracy: 94.20999908447266% 412 | -------------------------------------------------------------------------------- /12_classifiers/12_classifiers_01.livemd: -------------------------------------------------------------------------------- 1 | # Chapter 12: How Classifiers Works (1 of 2) 2 | 3 | ```elixir 4 | Mix.install( 5 | [ 6 | {:exla, "~> 0.5"}, 7 | {:nx, "~> 0.5"}, 8 | {:vega_lite, "~> 0.1.6"}, 9 | {:kino, "~> 0.8.1"}, 10 | {:kino_vega_lite, "~> 0.1.7"} 11 | ], 12 | config: [nx: [default_backend: EXLA.Backend]] 13 | ) 14 | ``` 15 | 16 | ## Load the Data 17 | 18 | ```elixir 19 | filepath = Path.join(__DIR__, "./linearly_separable.txt") |> Path.expand() 20 | 21 | [head | data] = 22 | filepath 23 | |> File.read!() 24 | |> String.split("\r\n", trim: true) 25 | 26 | inputs = 27 | data 28 | |> Enum.map(&String.split(&1, "\s", trim: true)) 29 | |> Enum.map(fn [input_a, input_b, label] -> 30 | %{ 31 | "input_a" => String.to_float(input_a), 32 | "input_b" => String.to_float(input_b), 33 | "label" => String.to_integer(label) 34 | } 35 | end) 36 | 37 | Kino.DataTable.new(inputs) 38 | ``` 39 | 40 | 41 | 42 | ```elixir 43 | VegaLite.new(width: 600, height: 400) 44 | |> VegaLite.data_from_values(inputs, only: ["input_a", "input_b", "label"]) 45 | |> VegaLite.mark(:point) 46 | |> VegaLite.encode_field(:x, "input_a", type: :quantitative) 47 | |> VegaLite.encode_field(:y, "input_b", type: :quantitative) 48 | |> VegaLite.encode_field(:color, "label", type: :nominal) 49 | ``` 50 | 51 | ## Perceptron 52 | 53 | Perceptron based on `C7.Classifier` implementation. 54 | 55 | ```elixir 56 | defmodule C12.Perceptron do 57 | import Nx.Defn 58 | 59 | defn sigmoid(z) do 60 | Nx.divide(1, Nx.add(1, Nx.exp(Nx.negate(z)))) 61 | end 62 | 63 | defn forward(x, weight) do 64 | weighted_sum = Nx.dot(x, weight) 65 | sigmoid(weighted_sum) 66 | end 67 | 68 | defn classify(x, weight) do 69 | y_hat = forward(x, weight) 70 | labels = Nx.argmax(y_hat, axis: 1) 71 | Nx.reshape(labels, {:auto, 1}) 72 | end 73 | 74 | defn loss(x, y, weight) do 75 | y_hat = forward(x, weight) 76 | first_term = y * Nx.log(y_hat) 77 | second_term = Nx.subtract(1, y) * Nx.log(Nx.subtract(1, y_hat)) 78 | 79 | Nx.add(first_term, second_term) 80 | |> Nx.sum() 81 | |> Nx.divide(elem(Nx.shape(x), 0)) 82 | |> Nx.negate() 83 | end 84 | 85 | defn gradient(x, y, weight) do 86 | predictions = forward(x, weight) 87 | errors = Nx.subtract(predictions, y) 88 | n_examples = elem(Nx.shape(x), 0) 89 | 90 | Nx.transpose(x) 91 | |> Nx.dot(errors) 92 | |> Nx.divide(n_examples) 93 | end 94 | 95 | def report(iteration, x_train, y_train, x_test, y_test, weight) do 96 | matches = matches(x_test, y_test, weight) |> Nx.to_number() 97 | n_test_examples = elem(Nx.shape(y_test), 0) 98 | matches = matches * 100.0 / n_test_examples 99 | training_loss = loss(x_train, y_train, weight) |> Nx.to_number() 100 | 101 | IO.inspect("Iteration #{iteration} => Loss: #{training_loss}, #{matches}%") 102 | 103 | {iteration, training_loss, matches} 104 | end 105 | 106 | defnp matches(x_test, y_test, weight) do 107 | classify(x_test, weight) 108 | |> Nx.equal(y_test) 109 | |> Nx.sum() 110 | end 111 | 112 | def train(x_train, y_train, x_test, y_test, iterations, lr) do 113 | init_weight = init_weight(x_train, y_train) 114 | 115 | final_weight = 116 | Enum.reduce(0..(iterations - 1), init_weight, fn i, weight -> 117 | report(i, x_train, y_train, x_test, y_test, weight) 118 | step(x_train, y_train, weight, lr) 119 | end) 120 | 121 
| report(iterations, x_train, y_train, x_test, y_test, final_weight) 122 | 123 | final_weight 124 | end 125 | 126 | defnp step(x, y, weight, lr) do 127 | Nx.subtract(weight, Nx.multiply(gradient(x, y, weight), lr)) 128 | end 129 | 130 | defnp init_weight(x, y) do 131 | n_input_variables = elem(Nx.shape(x), 1) 132 | n_classes = elem(Nx.shape(y), 1) 133 | Nx.broadcast(0, {n_input_variables, n_classes}) 134 | end 135 | end 136 | ``` 137 | 138 | ## Train Perceptron 139 | 140 | ```elixir 141 | # Prepend the bias function 142 | prepend_bias_fn = fn x -> 143 | bias = Nx.broadcast(1, {elem(Nx.shape(x), 0), 1}) 144 | 145 | # Insert a column of 1s in the position 0 of x. 146 | # ("axis: 1" stands for: "insert a column, not a row") 147 | # in python: `np.insert(X, 0, 1, axis=1)` 148 | Nx.concatenate([bias, x], axis: 1) 149 | end 150 | 151 | # hot encode function 152 | one_hot_encode_fn = fn y -> 153 | Nx.equal(y, Nx.tensor([0, 1])) 154 | end 155 | 156 | # Create tensors out of the inputs 157 | 158 | # NOTE: the tensor type is float, double-precision because 159 | # with an high number of iterations (> 7000) the loss is too small 160 | # to be represented with single-precision floating points. 161 | x_train = 162 | x_test = 163 | inputs 164 | |> Enum.map(&[&1["input_a"], &1["input_b"]]) 165 | |> Nx.tensor(type: {:f, 64}) 166 | |> then(fn x -> prepend_bias_fn.(x) end) 167 | 168 | y_train_unencoded = 169 | y_test = 170 | inputs 171 | |> Enum.map(& &1["label"]) 172 | |> Nx.tensor() 173 | |> Nx.reshape({:auto, 1}) 174 | 175 | y_train = one_hot_encode_fn.(y_train_unencoded) 176 | 177 | # Train the system 178 | 179 | iterations = 10_000 180 | lr = 0.1 181 | weight = C12.Perceptron.train(x_train, y_train, x_test, y_test, iterations, lr) 182 | ``` 183 | 184 | ## Plot Decision Boundary 185 | 186 | The idea: 187 | 188 | * Generate a grid of points and use the min/max values from the initial dataset to compute the boundaries. 
189 | * Classify each point using the weight computed before with the initial dataset 190 | * Plot the result highlighting the "decision boundary" 191 | 192 | ```elixir 193 | # Get x from the tensor 194 | x = 195 | x_train 196 | |> Nx.slice_along_axis(1, 1, axis: 1) 197 | 198 | # Get y from the tensor 199 | y = 200 | x_train 201 | |> Nx.slice_along_axis(2, 1, axis: 1) 202 | 203 | # Compute the grid boundaries 204 | x_min = 205 | x 206 | |> Nx.to_flat_list() 207 | |> Enum.min() 208 | 209 | x_max = 210 | x 211 | |> Nx.to_flat_list() 212 | |> Enum.max() 213 | 214 | y_min = 215 | y 216 | |> Nx.to_flat_list() 217 | |> Enum.min() 218 | 219 | y_max = 220 | y 221 | |> Nx.to_flat_list() 222 | |> Enum.max() 223 | 224 | padding = 0.05 225 | 226 | boundaries = %{ 227 | x_min: x_min - abs(x_min * padding), 228 | x_max: x_max + abs(x_max * padding), 229 | y_min: y_min - abs(y_min * padding), 230 | y_max: y_max + abs(y_max * padding) 231 | } 232 | ``` 233 | 234 | ```elixir 235 | # Define the grid of data that will be classified 236 | 237 | resolution = 200 238 | x_step = (boundaries.x_max - boundaries.x_min) / resolution 239 | y_step = (boundaries.y_max - boundaries.y_min) / resolution 240 | 241 | grid = 242 | for i <- 0..(resolution - 1), j <- 0..(resolution - 1) do 243 | [boundaries.x_min + x_step * i, boundaries.y_min + y_step * j] 244 | end 245 | ``` 246 | 247 | ```elixir 248 | # Classification 249 | 250 | labels = 251 | grid 252 | |> Nx.tensor() 253 | |> then(fn t -> prepend_bias_fn.(t) end) 254 | |> C12.Perceptron.classify(weight) 255 | 256 | # Add the labels to the grid dataset 257 | data_with_labels = 258 | Enum.zip_with([grid, Nx.to_flat_list(labels)], fn [[x, y], label] -> 259 | %{x: x, y: y, label: label} 260 | end) 261 | ``` 262 | 263 | ```elixir 264 | alias VegaLite, as: Vl 265 | 266 | Vl.new(width: 600, height: 400) 267 | |> Vl.layers([ 268 | # Grid 269 | Vl.new() 270 | |> Vl.data_from_values(data_with_labels) 271 | |> Vl.mark(:point) 272 | |> Vl.encode_field(:x, "x", type: :quantitative) 273 | |> Vl.encode_field(:y, "y", type: :quantitative) 274 | |> Vl.encode(:color, field: "label", scale: %{"range" => ["lightblue", "aquamarine"]}), 275 | # Inputs 276 | Vl.new() 277 | |> Vl.data_from_values(inputs) 278 | |> Vl.mark(:point, filled: true, tooltip: true) 279 | |> Vl.encode_field(:x, "input_a", type: :quantitative) 280 | |> Vl.encode_field(:y, "input_b", type: :quantitative) 281 | |> Vl.encode(:color, field: "label", scale: %{"range" => ["blue", "green"]}) 282 | |> Vl.encode(:shape, field: "label", scale: %{"range" => ["square", "triangle-up"]}), 283 | # Threshold line 284 | Vl.new() 285 | |> Vl.data_from_values(data_with_labels) 286 | |> Vl.transform(filter: "datum['label'] == 1") 287 | |> Vl.mark(:line, stroke: "red", stroke_width: 3) 288 | |> Vl.encode_field(:x, "x", type: :quantitative) 289 | |> Vl.encode_field(:y, "y", type: :quantitative, aggregate: :max) 290 | ]) 291 | |> Vl.resolve(:scale, x: :shared, y: :shared, color: :independent) 292 | ``` 293 | -------------------------------------------------------------------------------- /12_classifiers/circles.txt: -------------------------------------------------------------------------------- 1 | Input_A Input_B Label 2 | 0.110023429593 0.755769122530 1 3 | 0.203819384954 1.011932998700 0 4 | 0.662127970322 0.537398029981 1 5 | 0.922053022927 0.111370918555 1 6 | 0.874500596794 0.360572216974 0 7 | -0.66866385592 -0.45720770050 1 8 | 0.639596184973 -0.83703947403 0 9 | 0.773196435805 -0.07704264484 1 10 | 0.330872756091 -0.82275272472 1 11 
| -0.70316869026 -0.20982691915 1 12 | -0.92565594408 0.332573624748 0 13 | 1.009228609240 -0.23124272645 0 14 | 0.735549482446 0.724056492445 0 15 | 0.909009069575 -0.39959835839 0 16 | -0.23809680509 0.684093913695 1 17 | -0.84655714011 0.140472796988 1 18 | 0.011855680894 0.769759691415 1 19 | -0.60394102332 -0.69455391229 1 20 | -0.83934192952 -0.35429132875 0 21 | -0.19141019987 0.965445779520 0 22 | 0.510352474935 -0.69103160168 0 23 | 0.978931159421 -0.15425180241 0 24 | 0.076856629458 1.000579554110 0 25 | -0.16273836933 0.775128442331 1 26 | -0.95511105406 -0.05026483508 0 27 | -0.94772099912 0.110666754225 0 28 | -0.61377809146 0.835576589062 0 29 | 0.506931282745 0.478331419017 1 30 | 0.213536647027 -0.81790349196 1 31 | -0.19299597641 -1.04991591892 0 32 | 0.232625266795 0.935024301221 0 33 | -0.64675129319 0.579150663702 1 34 | -0.82322782322 -0.17151601059 1 35 | 0.711514708076 0.304076794179 1 36 | 0.284254330506 -0.88311890425 0 37 | 0.804750576501 -0.26908584035 1 38 | 0.867850898528 0.444306945682 0 39 | -0.45153278922 -0.81107463736 0 40 | 0.066027029910 -1.02449971447 0 41 | -0.56489316310 0.800508645304 0 42 | -0.70563525783 0.561772375158 0 43 | -0.56951900277 0.722664524461 0 44 | -0.40999947793 -0.87904023969 0 45 | -0.83457861041 0.108066717169 1 46 | 0.504492599556 -0.62912921289 1 47 | 0.860148791988 0.623878269262 0 48 | 0.370185487410 -0.71603371453 1 49 | 0.640067262113 -0.83063549872 0 50 | -0.79238942608 -0.52003048125 0 51 | -0.49754818744 0.782639763713 0 52 | 0.169263330412 0.734829152096 1 53 | 0.007962336251 0.761128826410 1 54 | 0.969177424748 -0.05886213067 0 55 | 0.799837856823 -0.29187115330 1 56 | -0.22076715827 -0.75875876803 1 57 | -0.54410553564 -0.80865264036 0 58 | 0.323665703437 0.683967651472 1 59 | 0.773999035560 -0.61616574853 0 60 | -0.38799721600 0.965529754564 0 61 | 0.767193012963 0.629089669793 0 62 | -0.68257933235 -0.26075836627 1 63 | -0.38843996608 -0.84365053203 0 64 | 0.773758680919 -0.51806208436 0 65 | -0.61062363778 0.841495324922 0 66 | -0.12330238200 -0.99278134731 0 67 | 0.814808858103 0.114367164912 1 68 | -1.07828666433 0.288807351643 0 69 | 0.100725272018 -1.03911600483 0 70 | -0.67248741412 -0.44470344191 1 71 | -0.98981478148 0.105446436438 0 72 | -0.57345799569 -0.79390068929 0 73 | -0.72307732522 0.481542918139 1 74 | 0.469617736995 0.586875414389 1 75 | -0.74301720418 -0.04016696782 1 76 | -0.72339913999 -0.39431067723 1 77 | 0.961644915690 -0.22793157898 0 78 | 0.551579626423 -0.58647954443 1 79 | -0.16182277273 0.825473416072 1 80 | 0.529837046908 0.621155531005 1 81 | 0.675659878113 0.450153222577 1 82 | -0.19657985768 -0.72918070809 1 83 | 0.098254511274 -0.75314267512 1 84 | -0.96008846242 0.228604907408 0 85 | 0.654606321881 0.393813182259 1 86 | 0.784699267110 0.344695299978 1 87 | -0.71414336660 0.310158149551 1 88 | 0.894536686095 -0.38300799537 0 89 | 0.782700085340 -0.07205205635 1 90 | -0.81261576497 -0.15023229094 1 91 | -0.73990610840 -0.44046361458 1 92 | -0.53065981757 0.658527431879 1 93 | -0.40327766168 0.718030403469 1 94 | 0.379829846026 0.766671954828 1 95 | 0.430318665842 0.652717337375 1 96 | 0.975280797317 0.464436598197 0 97 | -0.40436625677 -0.60441000673 1 98 | -0.22673795033 -0.78557346837 1 99 | -0.40144663142 -0.68565773652 1 100 | 0.885363181765 -0.21538075106 1 101 | 0.626873538653 -0.39566390676 1 102 | 0.464630421270 -0.63798823727 1 103 | -0.27414288324 0.912294398182 0 104 | 0.740629468873 -0.36250867443 1 105 | -0.88542051894 -0.54191663601 0 106 | -0.01253839206 -1.02373778157 
0 107 | -0.83337846639 0.146964380533 1 108 | -0.01530599407 1.020106217834 0 109 | 0.380292831993 0.956465149023 0 110 | -0.76552192631 -0.13683448696 1 111 | 0.789762805228 0.095122448279 1 112 | 0.918015803788 0.533025715864 0 113 | 0.773101103484 0.027160711858 1 114 | 0.698731924386 0.661104622573 0 115 | 0.160445613471 -0.83843252722 1 116 | 0.436689829263 0.655519116212 1 117 | 0.793196338333 0.627070843831 0 118 | -0.36764573390 0.706629300337 1 119 | -0.88419744499 -0.05062385053 1 120 | -0.45252460946 0.851556221011 0 121 | 0.660846305907 0.720214047326 0 122 | 0.137903658715 0.760535780451 1 123 | -0.14542626413 1.002773201534 0 124 | -0.88799429529 0.377243590256 0 125 | 0.300404236023 0.944203943279 0 126 | -0.38464868442 -0.69535952685 1 127 | 0.288178431538 -0.69018939182 1 128 | -0.35966710592 0.841763516545 0 129 | 0.311563919782 0.672065714934 1 130 | 0.193074851100 -0.92353352384 0 131 | 0.010244748663 0.758325671064 1 132 | -0.65044865787 -0.79001220535 0 133 | -0.76345869226 -0.01147087177 1 134 | -0.75331816291 -0.41691350736 1 135 | 0.444415546535 -0.80760357793 0 136 | -0.79535763925 0.665484891382 0 137 | 0.429528122299 0.898735076522 0 138 | 0.103495344276 0.763402677281 1 139 | 1.020260657340 -0.07213576185 0 140 | -0.71824773386 -0.66299565715 0 141 | -0.34680961586 -1.04219321881 0 142 | -0.23492139642 0.699431620867 1 143 | -0.01854076862 -0.82915402859 1 144 | -0.24588108908 0.668612218083 1 145 | -0.70086322345 0.399528369991 1 146 | -0.79815377740 -0.69345462264 0 147 | -0.54897046612 -0.56071763391 1 148 | 0.517262173841 0.796119282211 0 149 | -0.73390252291 0.159873474245 1 150 | 0.908923252523 -0.50462614241 0 151 | -0.55064743735 0.548010507535 1 152 | 0.794191678354 -0.31678058157 1 153 | -0.65095670966 -0.73624730594 0 154 | 0.826512537148 0.227295377236 1 155 | -0.80050813932 0.641586375394 0 156 | 0.691491362988 0.301834336174 1 157 | -1.10217651123 -0.01737303455 0 158 | -0.09344738898 -0.92000067013 0 159 | 0.490733952414 0.834177948806 0 160 | -0.49696145047 0.825056405484 0 161 | 0.718244527716 0.388715903037 1 162 | -0.12922878128 -0.79812030887 1 163 | 1.012269685200 0.044910607201 0 164 | 0.523910410298 -0.79720912573 0 165 | -0.66764166668 0.456336028016 1 166 | 0.508714211080 -0.71255190512 1 167 | -1.00868464883 -0.34717017276 0 168 | 0.892230126343 0.486445670729 0 169 | 0.250960572203 0.791460570959 1 170 | -0.74997799394 0.158773071498 1 171 | 0.688505383695 -0.72219636559 0 172 | 0.649926801982 0.560455674056 1 173 | 1.006824348950 0.243371496985 0 174 | 0.895528908150 -0.61313784723 0 175 | 0.353575137994 0.950452217838 0 176 | -0.60194696115 -0.67587900374 0 177 | -0.97003544835 -0.12609130605 0 178 | -0.42694981541 0.540182335509 1 179 | -0.89938065333 -0.45462561427 0 180 | -0.56885446999 -0.75443163055 0 181 | 0.585259023482 -0.41959181697 1 182 | -0.34409119156 -0.75292435436 1 183 | -0.76856621936 -0.49920210985 0 184 | 0.148136525888 -0.78272172203 1 185 | -0.15611442386 -1.01618829557 0 186 | 0.671535202627 -0.47690829904 1 187 | 0.839987843562 -0.61650109607 0 188 | 0.912590078392 0.394457375357 0 189 | 0.697157243450 -0.46252949317 1 190 | 0.181911549147 -0.76454151399 1 191 | 0.761011312894 0.095949280302 1 192 | -0.65319698462 -0.55720115914 1 193 | -0.73061746348 0.477765334494 1 194 | -0.47376667397 -0.88157236509 0 195 | 0.402779519896 -0.82095424149 0 196 | 0.561486729245 -0.57342318763 1 197 | 0.907660397943 -0.58024405206 0 198 | -0.80050764734 0.213352224278 1 199 | 0.765115548490 -0.15622733873 1 200 | 
-0.77727687575 0.540394995699 0 201 | 0.259558545785 -0.81342845877 1 202 | -0.50277683932 0.492187021815 1 203 | -0.67986072196 0.363673677008 1 204 | 0.112707535212 -0.99613915716 0 205 | -0.62340947917 -0.57569796367 1 206 | -0.28921150359 0.973097787902 0 207 | -1.00432177297 0.272818884778 0 208 | -0.64530808929 0.535116310413 1 209 | -0.32613464418 0.619201891241 1 210 | -0.08733344376 0.771734620912 1 211 | 0.253648571894 -1.00386293261 0 212 | 0.056315310616 -0.96872929772 0 213 | -0.13869810863 -0.72585049851 1 214 | -0.84980945099 0.476114883234 0 215 | 0.689448350655 0.702108712659 0 216 | -0.52462228821 -0.63185924113 1 217 | 0.496910348611 -0.57992866319 1 218 | -0.43286087916 0.707856055807 1 219 | -0.04473233418 1.002633576925 0 220 | 0.327705595422 -0.77998945277 1 221 | 0.359570371851 0.716241611459 1 222 | -0.22473506185 0.780847493345 1 223 | 0.904033348136 0.248602445251 0 224 | -0.27920402881 0.809211782559 1 225 | 1.046165842020 0.116612622725 0 226 | 0.165717453985 0.972784993772 0 227 | 0.747711462984 0.244273990032 1 228 | 0.551037666764 0.557739116004 1 229 | 0.853199242524 0.087575879592 1 230 | 0.668694113210 -0.24798064989 1 231 | -0.61662196990 -0.49973855984 1 232 | 0.443245353222 0.646229677841 1 233 | -0.74785472084 0.299342406402 1 234 | 0.974389034976 0.009079458533 0 235 | 0.402631839749 -0.92488105358 0 236 | -0.91576222615 0.526449047198 0 237 | 0.616849844993 -0.47928708048 1 238 | 0.925144332971 0.330273773523 0 239 | 0.975678027313 -0.02104914627 0 240 | 0.494483630475 -0.65031095039 1 241 | -0.68435065877 0.633209843728 0 242 | -0.40972320199 1.018185839532 0 243 | 0.055303031997 0.776788958214 1 244 | -1.00520746759 -0.27889757025 0 245 | 0.303950171476 0.726936233774 1 246 | 0.953500732043 -0.16719878025 0 247 | -0.11717608527 0.925513509068 0 248 | -0.89020436564 0.528890574306 0 249 | -0.07353752917 -0.79208352354 1 250 | 0.639642088642 0.742842724654 0 251 | -1.07356966984 -0.01674303935 0 252 | -1.01263374310 0.127066450478 0 253 | 0.043401860337 -0.84717207307 1 254 | -0.75663877445 -0.28720131927 1 255 | -0.78292359807 0.016892565298 1 256 | -0.85966128426 -0.54167459599 0 257 | -0.00707316464 0.985889407799 0 258 | 0.828985821724 0.219138241833 1 259 | -0.95186388824 -0.25476470076 0 260 | -0.40106083959 -0.73443786268 1 261 | 0.871828153652 0.446387604833 0 262 | -0.97389834365 -0.27063779536 0 263 | -0.79367009159 0.700693736837 0 264 | -0.06367479312 -0.86350362152 1 265 | -0.26947579350 -1.02533748422 0 266 | -0.77139863340 -0.02285936916 1 267 | 1.003904544750 0.217993843845 0 268 | -0.89580696815 0.109475071195 0 269 | 0.785407201616 -0.15830336348 1 270 | -0.26052734167 -1.01974112621 0 271 | 0.299567950174 -0.96957652236 0 272 | 0.884755433557 -0.24621045831 0 273 | 0.794180633037 -0.78361845146 0 274 | 0.439033067006 -0.87282436105 0 275 | -0.35674031682 -0.73541100455 1 276 | -0.88201290051 0.562021289179 0 277 | 0.088835510973 1.014070541408 0 278 | -0.10139508008 0.723610826221 1 279 | 0.809523870110 -0.05886137031 1 280 | 0.585079725777 0.780579207671 0 281 | -0.69783184332 -0.26745832364 1 282 | -0.28457773394 0.972666036139 0 283 | 0.071829105482 -0.80865564795 1 284 | 0.680439884176 0.547679497031 1 285 | -0.64143757258 0.404363372531 1 286 | -0.95571712144 -0.18119088146 0 287 | 0.828242395774 -0.42262541541 0 288 | -1.00599006596 -0.05373428128 0 289 | 0.173762144598 -0.80203259318 1 290 | -0.43421721235 0.608340631045 1 291 | -0.55950270911 0.535699072424 1 292 | -0.48411844047 -0.69422810954 1 293 | -0.74522613052 
-0.56260050564 0 294 | -0.29678244452 -0.96029654721 0 295 | 0.950762404638 -0.40494891534 0 296 | 0.223907691500 -0.97977303358 0 297 | -0.17778101427 -0.91072039103 1 298 | 0.347707581745 0.947793270625 0 299 | 0.306921523411 0.943666292813 0 300 | 0.592046929080 -0.75145537221 0 301 | 0.653155450758 -0.53526913327 1 302 | -------------------------------------------------------------------------------- /12_classifiers/linearly_separable.txt: -------------------------------------------------------------------------------- 1 | Input_A Input_B Label 2 | -0.470680718301 -1.905835436960 1 3 | 0.9952553595720 1.4019246363100 0 4 | -0.903484238413 -1.233058043620 1 5 | -1.775876322450 -0.436802254656 1 6 | 1.3563494010000 0.5395259340720 0 7 | -1.523291276850 -0.130726098431 1 8 | -0.948509423135 -1.191648136130 1 9 | 1.2031859143200 0.7386249299650 0 10 | 1.1716854353300 0.9055105325930 0 11 | -0.699992649302 -1.022686011270 1 12 | 0.8910828099040 0.9931434257470 0 13 | -1.105656568760 -1.180884829720 1 14 | 0.9373057969120 1.8041560044800 0 15 | -1.273226531880 -1.380125146520 1 16 | 1.4509924414200 0.7726546212530 0 17 | 0.6704035840430 1.5282263370500 0 18 | 1.4839239944300 1.5080098927000 0 19 | -0.484677535537 -0.659274596854 1 20 | -0.816917609462 -1.073307236700 1 21 | -0.252829661757 -0.171824879967 1 22 | -1.060415543080 -0.822824859763 1 23 | 0.6153584302610 0.2701981850990 0 24 | 0.6420747478000 0.3034024181110 0 25 | -1.469957422320 0.0455996352457 1 26 | 0.5750681486780 1.0949489046200 0 27 | -1.385116406610 -0.967973268338 1 28 | -1.275722959420 -1.300661054350 1 29 | 1.1683407987600 0.5154355733350 0 30 | 0.0519073384173 0.2556956845800 0 31 | -0.023173622401 1.0643227364000 0 32 | -0.905809192224 -0.955266002471 1 33 | 0.6293772394960 1.7020433543200 0 34 | -1.262021502860 -1.226092037760 1 35 | -1.424893895880 -1.599801380110 1 36 | 0.8069709417360 0.9854944615590 0 37 | 0.8217246480510 1.1873466724900 0 38 | 1.3355100290700 1.1067572850900 0 39 | -1.127294433520 -0.826514810918 1 40 | 1.1237448672500 0.7690458108970 0 41 | 1.1511936443700 0.6984453860840 0 42 | 0.7922245139770 0.7777078782610 0 43 | -0.924825045590 -1.180181782920 1 44 | -1.012639466800 -0.906348844996 1 45 | -0.623371511691 -0.786654418898 1 46 | 0.5916389546960 0.4346335844810 0 47 | -1.098153290840 -0.695079156783 1 48 | -1.109236742350 -1.213933608110 1 49 | 1.3450320330100 0.9959872391650 0 50 | -1.105348232650 -0.477429927560 1 51 | 0.8782907696090 0.3288839016950 0 52 | 1.0306100271000 0.8811183335750 0 53 | -0.044921081918 1.4565000760600 0 54 | -1.372195769350 -0.869001106280 1 55 | -0.797523635064 -0.335194203779 1 56 | -1.422646625730 -0.443219978282 1 57 | -0.316670610478 -1.388399779330 1 58 | 1.1210454247700 1.2095088272600 0 59 | 1.0200037423600 1.2680862377400 0 60 | 0.5079283847360 1.2284312554500 0 61 | 0.6531457901300 1.1534333033900 0 62 | 1.7863122290000 1.5573623331900 0 63 | -1.364195655220 -1.087878514180 1 64 | -1.327106119750 -0.794496147385 1 65 | -0.961953027201 -0.526132581918 1 66 | -1.356860479070 -0.079703998837 1 67 | -0.973376486006 -0.804014095481 1 68 | 0.7062811712300 0.7085981285960 0 69 | 1.2909475126300 0.6525077062480 0 70 | 0.9230136072150 0.6515249397190 0 71 | -0.223808531791 -0.782705148285 1 72 | 0.7095146029360 1.0891582183300 0 73 | -1.096445060940 -0.905455785072 1 74 | -1.354960374250 -0.267474558693 1 75 | 0.7695170454570 0.0318822283650 0 76 | -1.151296609900 -1.138328266650 1 77 | 0.7390124634160 0.5126792093680 0 78 | -0.873430478169 -0.758666137083 1 79 | 
1.2339878951400 1.0610706291200 0 80 | -0.754128030381 -0.591443554241 1 81 | -0.950111709919 -0.581469608191 1 82 | -0.898988053587 -1.376255465080 1 83 | -0.774175513043 -1.076192305900 1 84 | -1.155720548150 -1.189540212160 1 85 | 1.5829198818600 0.2438047378500 0 86 | -1.778492154280 -0.797987286451 1 87 | 0.6892130105620 1.5760667475300 0 88 | 1.2721282995000 1.2542049426700 0 89 | -1.304734885340 -1.132646759280 1 90 | -1.365991444210 -1.191585665580 1 91 | 1.1831760572400 1.2220617641300 0 92 | 1.4429698403100 1.0363748357900 0 93 | -1.700526171970 -0.748442355691 1 94 | 1.3719875035900 0.3722917579190 0 95 | 0.7885060695510 1.1828007485200 0 96 | 0.8453386712840 1.3463958168100 0 97 | 0.7145824059610 0.6675385844580 0 98 | -1.114377859470 -1.891374800260 1 99 | -0.360186549421 -0.810371692086 1 100 | -1.096494130510 -1.379148872130 1 101 | -0.840718044775 -0.943344610658 1 102 | 1.7683137047400 1.2985732153000 0 103 | 1.1325142600800 1.5398968858000 0 104 | -0.148501483408 -0.436158128493 1 105 | 0.5910052087820 0.8388692302350 0 106 | -0.079730194916 -0.772201013197 1 107 | -1.106772333160 -0.655086380953 1 108 | 1.0279506755300 1.0986696439900 0 109 | -0.886999254348 -1.535255772000 1 110 | 1.0747918056600 0.8433099004460 0 111 | -1.433292878340 -0.716639402619 1 112 | -1.190198486920 -0.957438955295 1 113 | 1.0430628886000 0.7573819498660 0 114 | 0.4669561169270 1.4298490760400 0 115 | -0.822083976768 -0.948587333338 1 116 | 0.6323033211670 0.6644693245470 0 117 | -1.045660056250 -0.837679022988 1 118 | -0.496126658094 -0.614027689635 1 119 | 0.5790977041720 1.5746411143200 0 120 | 1.6676101238100 0.8245721056660 0 121 | 0.3737370824740 0.7749839247700 0 122 | 0.5144596377730 0.8117476275140 0 123 | 0.5103418863230 0.2796640346000 0 124 | -0.594383059907 -1.454003087200 1 125 | -1.173778375270 -1.351841143840 1 126 | 1.1414428408300 1.2808440685900 0 127 | -1.279045558250 -0.465740215072 1 128 | 0.7841034217910 1.1907940041600 0 129 | 0.3832017888500 0.6117056350960 0 130 | -0.404236127527 -0.494299886794 1 131 | -1.555149630960 -0.918059432183 1 132 | -0.564184381380 -0.175647304129 1 133 | 0.0562324770465 0.9201828178660 0 134 | 1.3033452580100 0.7357902689100 0 135 | -0.115078924294 -0.970427726951 1 136 | 0.5745950207790 1.1331530862400 0 137 | 0.8964532351540 0.7673412599770 0 138 | 1.1707944340600 0.3745040617080 0 139 | -0.162859248723 -1.054788718450 1 140 | 1.5226878899700 0.8237785719210 0 141 | -1.286545829840 -1.041303488340 1 142 | -1.366773756050 -1.390325442550 1 143 | 0.4529903579940 0.9878879792500 0 144 | 1.2894566021200 1.2760007411200 0 145 | -0.948393810064 -0.749169634474 1 146 | -0.736664294587 -1.631786334170 1 147 | 1.0240244851100 0.9097904023840 0 148 | -1.441750537890 -1.359502070050 1 149 | 0.6451582278460 0.6088255070590 0 150 | -1.022040397170 -1.454370315140 1 151 | -0.817088540498 -0.953720549805 1 152 | 1.2567271265000 0.8673357773200 0 153 | -0.272562175434 -1.632612399280 1 154 | -0.948534278482 -0.938522277887 1 155 | -1.200242739850 -0.951341988043 1 156 | 0.6556970406270 1.2240581172100 0 157 | -1.247732937380 -1.290683252720 1 158 | -1.309745363580 -0.778825402771 1 159 | 0.9869343443360 0.6283975191060 0 160 | -1.500601500160 -2.179737399260 1 161 | 1.4006173770000 0.7987650434560 0 162 | -1.546652944400 -0.819439307043 1 163 | -0.511337833311 -0.798520284664 1 164 | 1.8898638360600 0.7282398358010 0 165 | 0.8530049914070 1.4183650133000 0 166 | 0.8794598880480 1.1995165959700 0 167 | -1.274193558480 -0.957427474106 1 168 | 
1.3076090270400 1.4985738945500 0 169 | 1.2884777355600 1.3643930567600 0 170 | -1.122625321800 -1.064897065890 1 171 | -1.597836585550 -1.044357631730 1 172 | 1.0003761111100 1.5375239174500 0 173 | -1.256112201590 -0.701610479749 1 174 | -0.478794767696 -1.563760944350 1 175 | -0.057763796169 0.9388056786610 0 176 | 1.5398904558500 1.5401198936100 0 177 | 1.2501712626000 1.4091489537200 0 178 | -0.746146335766 -0.950337193556 1 179 | -1.456414631760 -1.200310760350 1 180 | 0.9457111439050 1.5720167223800 0 181 | -1.290766555320 -0.927865954922 1 182 | 0.8417196990660 0.8842256400220 0 183 | 1.4160716983600 0.5668310977330 0 184 | -0.958554325640 -1.598287422630 1 185 | -1.369211732840 -0.719784925345 1 186 | 0.7730180515980 0.8509433789150 0 187 | -0.966478773582 -1.748935558680 1 188 | -0.856497746895 -1.155297977440 1 189 | 0.4771878996120 1.1145398988100 0 190 | 0.0988811012792 0.5332671134940 0 191 | -0.750565222607 -0.404317629205 1 192 | 1.1598375813700 0.9393700785010 0 193 | 0.2620849535590 1.8274831405000 0 194 | -0.652233248386 -1.020266979260 1 195 | -1.144453037790 -0.774390190569 1 196 | -0.839716025670 -1.050428673440 1 197 | 0.6612590194690 1.2413128520200 0 198 | 1.0472439014800 0.7005873806030 0 199 | 0.9906026717230 0.0783064484811 0 200 | -1.249018825170 -0.625484273867 1 201 | -1.611672242710 -1.620685996780 1 202 | 1.6538944818400 1.3956033207700 0 203 | -0.939469629210 -1.216869266260 1 204 | -0.857493522625 -1.004046127620 1 205 | 1.0205264403700 0.5369122134750 0 206 | -1.390135933970 -1.446696841000 1 207 | -0.395037402631 -0.789402751816 1 208 | 0.8599565373000 1.0871828065900 0 209 | -1.145922612870 -0.947559305368 1 210 | 0.8251879342500 1.1389955240700 0 211 | -1.215665562050 -1.574958241690 1 212 | -1.208083493790 -0.840763167464 1 213 | 1.4042947784000 1.7299500845400 0 214 | -1.277935291990 -0.509450381670 1 215 | -0.581126200945 -0.630420632870 1 216 | -1.205825050750 -1.244750421400 1 217 | -0.817405059268 -0.922084757841 1 218 | 1.3761957285600 0.7430252997980 0 219 | -0.745487615590 -1.463057663400 1 220 | 0.6102487493990 0.7461162979160 0 221 | 0.5325463706700 0.6699920112680 0 222 | 1.5576288904700 0.5693032223840 0 223 | 0.5719036369870 1.3216890791400 0 224 | -1.704507069200 -1.364653251200 1 225 | -0.781938322909 -0.839670693339 1 226 | -1.366774029770 -1.033060053800 1 227 | 1.0178789505500 0.6408974736180 0 228 | 0.9540287181670 1.8014789449700 0 229 | -0.891225487545 -0.507692258937 1 230 | 0.9538348144080 0.8597021570050 0 231 | 0.7691650787840 1.0496485102700 0 232 | -1.475681000060 -0.962561020387 1 233 | -0.433698731353 -0.773557786959 1 234 | -1.102050239460 -0.658345073901 1 235 | 1.1099966533700 1.0915651515700 0 236 | -1.634888723740 -1.293115532190 1 237 | 1.4381560240600 1.3923713909900 0 238 | 1.0118440609300 1.3181012623400 0 239 | 1.5054289025900 0.8978037775350 0 240 | 1.3411860129000 0.6176524590820 0 241 | 1.4879817459700 0.9009203535210 0 242 | -0.745186826234 -1.215709834730 1 243 | 0.7510903099760 0.6315325567950 0 244 | -0.884619276992 -1.173915396600 1 245 | -1.068202501770 -1.463234906840 1 246 | 1.0371631507000 1.1126984611300 0 247 | 1.5411667340700 1.3545717362400 0 248 | 0.9181169362060 1.1915773352200 0 249 | -1.363249829300 -0.395084242962 1 250 | 1.0003829476300 1.0261012723400 0 251 | -0.480951004166 -1.674773364460 1 252 | 0.8331742367090 0.9931971852700 0 253 | -1.042173711210 -0.719828653235 1 254 | -0.881082833125 -0.862083519507 1 255 | 0.8901723181730 0.9443431247510 0 256 | -0.391702783828 -0.883862726707 1 
257 | -0.732691003234 -1.323814192980 1 258 | -0.847975014132 -0.924402130617 1 259 | 0.9557770561740 1.4909547979400 0 260 | 0.8024882972100 1.4959919529100 0 261 | -0.558160345235 -0.746304760005 1 262 | 1.5308778455100 0.6322953769020 0 263 | -1.206751292650 -1.038254267080 1 264 | -0.771269628542 -0.773157588284 1 265 | 0.9473689595150 1.3649656608200 0 266 | -0.836777846838 -1.241258164470 1 267 | 0.2707492475350 -0.616186983221 1 268 | 0.6992315987780 0.9469562992280 0 269 | -0.599141820737 -1.210165993720 1 270 | -0.729482388431 -1.261342464840 1 271 | 0.4958317489400 1.1858299174000 0 272 | 0.6971293990240 1.2872343337900 0 273 | -0.301106560361 -1.564098457550 1 274 | -0.967369213827 -1.120933948550 1 275 | 0.8335071473000 0.9533011982950 0 276 | -1.057650377940 -1.286565609700 1 277 | -0.503370884371 -1.062703812860 1 278 | 1.0881947453100 0.9226395568400 0 279 | 0.4936262033230 1.0479308498600 0 280 | 0.4822903905160 0.6459279948880 0 281 | 1.5186431368000 1.1009100835500 0 282 | -0.758517913622 -0.781727876523 1 283 | -1.979638213690 -1.413579207030 1 284 | -1.159241344170 -0.828238377011 1 285 | -0.957575444190 0.5710112163170 1 286 | 1.1740621218300 0.7768781131460 0 287 | 2.2995775677700 0.5915089902580 0 288 | 1.4028757430300 0.4815115563550 0 289 | 0.8931130601890 1.1933351287500 0 290 | 1.3266827743600 1.1734438422600 0 291 | -1.260918119180 -0.788468958318 1 292 | -0.856882700266 -0.924540256107 1 293 | 0.9250013895590 0.8025760000620 0 294 | -1.644336403750 -1.410648275780 1 295 | 0.6009927009160 1.3402364394900 0 296 | 0.8398669066450 1.1799521660600 0 297 | 1.2676633345900 0.3404060721990 0 298 | 0.3803574246860 1.0088738394600 0 299 | 1.0752842721700 1.8677845750800 0 300 | 0.7642049255190 1.6326800376800 0 301 | 0.1993450761360 0.8512629851440 0 302 | -------------------------------------------------------------------------------- /12_classifiers/non_linearly_separable.txt: -------------------------------------------------------------------------------- 1 | Input_A Input_B Label 2 | 0.185872557346525 0.567645715301291 0 3 | 0.284770005578376 0.458145376240974 1 4 | 0.150041107877389 0.453290078176759 0 5 | 0.638036522046149 0.329524624417077 1 6 | 0.125934731854359 0.550300640129279 0 7 | 0.166464187716999 0.419816200865374 1 8 | 0.547470416724505 0.263535311275099 1 9 | 0.375753328604949 0.563583141495753 1 10 | 0.415273597565861 0.347759831340554 1 11 | 0.382347902722197 0.222930022755573 1 12 | 0.255260089152012 0.423860456768629 1 13 | 0.386691437752741 0.200981321457421 1 14 | 0.231481536261434 0.373080303194094 1 15 | 0.171773383639811 0.178632239692453 0 16 | 0.238704153711792 0.544639885812218 1 17 | 0.295522242629006 0.549355592164107 1 18 | 0.371772586052267 0.580335756015484 1 19 | 0.617397114540016 0.587913749082441 0 20 | 0.494464078202965 0.262169347641719 1 21 | 0.611811601217844 0.603069274858823 0 22 | 0.137944742898229 0.363250499398117 0 23 | 0.340318076243722 0.596033063135561 1 24 | 0.188319676612397 0.444352164020086 1 25 | 0.143330174635788 0.406692798754551 0 26 | 0.290542075401884 0.589813063431241 1 27 | 0.065086220600478 0.343727382094072 0 28 | 0.608277468173034 0.452784471503011 1 29 | 0.184416779785702 0.517665510771726 1 30 | 0.343472476805794 0.426794223851852 1 31 | 0.218237419709771 0.569996827669265 1 32 | 0.744663399630232 0.546570483438378 1 33 | 0.434998017337283 0.567856397859814 1 34 | 0.579751069730495 0.605212657414364 0 35 | 0.487682438977831 0.673312003550014 0 36 | 0.395524452706471 0.401146500038936 1 37 | 
0.187061803326975 0.462713697225775 1 38 | 0.483802677766077 0.546897822217223 1 39 | 0.691691432345648 0.415027355545197 1 40 | 0.439754996995227 0.372781003953939 1 41 | 0.443700921039504 0.492764032055088 1 42 | 0.391102216942822 0.494379280322189 1 43 | 0.073621089054039 0.317454270640649 0 44 | 0.642267386338217 0.549634407115158 0 45 | 0.339509302371488 0.665385155804249 0 46 | 0.217327918995033 0.429356966892458 1 47 | 0.661128514384210 0.544327110154572 0 48 | 0.730505292125581 0.751185831220796 0 49 | 0.336520366615677 0.533132320277557 1 50 | 0.314599722929185 0.648822919302380 0 51 | 0.262815998364609 0.535016626997539 1 52 | 0.496402468825085 0.365325121433015 1 53 | 0.294282129101217 0.613919263842754 1 54 | 0.488041737074197 0.610832329564415 0 55 | 0.191898180021961 0.782041871439979 0 56 | 0.678130682356868 0.537303529216493 0 57 | 0.699360156967237 0.343692073367899 1 58 | 0.306171006130718 0.452049837813493 1 59 | 0.455228696223868 0.489288594215802 1 60 | 0.158458393105841 0.518387061089405 1 61 | 0.582099514745833 0.485971027377901 1 62 | 0.377714868632692 0.443571363798634 1 63 | 0.214375368886653 0.666741718360665 0 64 | 0.012520854116208 0.662431224980643 0 65 | 0.532443751082547 0.345719280546294 1 66 | 0.814952757250424 0.550088860711323 1 67 | 0.650374862259001 0.555450013273306 0 68 | 0.477581688063969 0.549768454319392 1 69 | 0.512354528953985 0.466188499782383 1 70 | 0.241048745325229 0.573733275663036 1 71 | 0.720878683845167 0.261892655776282 1 72 | 0.391114342750221 0.409594358951365 1 73 | 0.111927140677257 0.335258182760164 0 74 | 0.578609290902156 0.749905503979216 0 75 | 0.196779789910107 0.525179953088861 1 76 | 0.355241878906987 0.321595134196293 0 77 | 0.402668888142536 0.413435032761744 1 78 | 0.133166113712124 0.495332222521401 0 79 | 0.303893155969414 0.531896365084603 1 80 | 0.586013381959417 0.397637508981821 1 81 | 0.005998749615912 0.750954051545646 0 82 | 0.361409865089487 0.165064449710988 1 83 | 0.421122394931855 0.294502417760282 1 84 | 0.797707697828697 0.483692497824512 1 85 | 0.529674693181247 0.509545744198941 1 86 | 0.449317343046755 0.264276326057303 1 87 | 0.226201543262588 0.459952545313192 1 88 | 0.427741992173633 0.554622389666354 1 89 | 0.685000185875834 0.606295197588184 0 90 | 0.376008329516316 0.318675202858641 0 91 | 0.489838215648457 0.490421854937454 1 92 | 0.359620252707536 0.349318356029733 1 93 | 0.620267053085032 0.284954563529464 1 94 | 0.476164037364125 0.448266478650097 1 95 | 0.253380396550363 0.490417066415215 1 96 | 0.561555124330989 0.403391772176988 1 97 | 0.419656833063424 0.504457114159702 1 98 | 0.113427104521424 0.355190217637390 0 99 | 0.187061936862685 0.343047069307368 0 100 | 0.178345032061314 0.229338901614721 0 101 | 0.835293252134245 0.336962482785625 1 102 | 0.337869366650093 0.377801595507688 1 103 | 0.440160571217575 0.499108846812169 1 104 | 0.384555811368976 0.632509885346964 0 105 | 0.312702693472014 0.402129803554429 1 106 | 0.629382735488488 0.654701260683496 0 107 | 0.496575334352883 0.252855148547268 1 108 | 0.443971986109412 0.453479124828843 1 109 | 0.656640869135908 0.643172298786415 0 110 | 0.472376910648912 0.527489780697676 1 111 | 0.136723254612367 0.824012529186366 0 112 | 0.366177887758381 0.653831734211027 0 113 | 0.413080110676208 0.395724042510887 1 114 | 0.449752022599598 0.442137814389682 1 115 | 0.310194573043006 0.175219765172843 1 116 | 0.224279605133654 0.503650907923338 1 117 | 0.175852247975371 0.388748454060114 0 118 | 0.171336637119857 0.587749969895873 0 119 | 
0.497967808056733 0.386250129240487 1 120 | 0.380927659162033 0.433967607315432 1 121 | 0.313849056725867 0.538545134936431 1 122 | 0.389042580322112 0.373498468531512 1 123 | 0.391170745181978 0.557803258787498 1 124 | 0.604390751159719 0.541273177562692 0 125 | 0.192883016570914 0.342762678483355 0 126 | 0.290025329752611 0.410300411538986 1 127 | 0.150484128244726 0.353371408486552 0 128 | 0.312414400609526 0.455144854823256 1 129 | 0.444189761212638 0.456896049590973 1 130 | 0.051651571715417 0.336239902990064 0 131 | 0.343567073752331 0.742243903593177 0 132 | 0.740820192208105 0.573794804617963 0 133 | 0.284607593433955 0.471508602520287 1 134 | 0.452791646949676 0.491007973494799 1 135 | 0.620990375253472 0.414761560110805 1 136 | 0.297673476515481 0.502989240527838 1 137 | 0.507207291998171 0.566234117376812 1 138 | 0.450911501327695 0.399052174527401 1 139 | 0.133930977776328 0.486327464089020 0 140 | 0.426363969908016 0.396428732032907 1 141 | 0.402722219375909 0.764453083150158 0 142 | 0.099132932968521 0.469730275249526 0 143 | 0.245136612462514 0.376422384218068 1 144 | 0.288776541192477 0.435008404717222 1 145 | 0.624985887185437 0.350623949892745 1 146 | 0.306617926037395 0.248995778639368 1 147 | 0.561437647706132 0.406383237721582 1 148 | 0.386213885404089 0.273389863002481 1 149 | 0.198024430585314 0.548835850303379 1 150 | 0.612570489499521 0.439556424135737 1 151 | 0.490159575675978 0.318685292649777 1 152 | 0.318919242193624 0.346790679907011 1 153 | 0.238670288072295 0.310558318387962 0 154 | 0.706953158860115 0.284841512929566 1 155 | 0.191378724971581 0.240460713920564 0 156 | 0.294805302868403 0.491352313146076 1 157 | 0.022297003125116 0.351646011908525 0 158 | 0.430642163298947 0.654626563855130 0 159 | 0.280711434708331 0.520343995226133 1 160 | 0.335408710613888 0.743140589070063 0 161 | 0.455312085810807 0.449232837310457 1 162 | 0.568977117861075 0.328696582210342 1 163 | 0.185184180243927 0.708937789423362 0 164 | 0.224142440630555 0.497948633175144 1 165 | 0.345419738111637 0.567631068576167 1 166 | 0.264479797428601 0.527123854686018 1 167 | 0.105946200289441 0.617765869485902 0 168 | 0.379630280822902 0.506836775554604 1 169 | 0.465119693276561 0.386239208075271 1 170 | 0.143775723683174 0.626692665659776 0 171 | 0.624225533812329 0.170376151191855 1 172 | 0.435516917897362 0.472431959392391 1 173 | 0.158707727739740 0.272882760460164 0 174 | 0.344420009440997 0.638233331249493 0 175 | 0.238165862307001 0.398057317605384 1 176 | 0.244509285488090 0.599231873177618 1 177 | 0.525368271572631 0.531882515906129 1 178 | 0.192825155928426 0.719147814457877 0 179 | 0.067586950664138 0.265887423855564 0 180 | 0.328436192367912 0.357064923491133 1 181 | 0.159804070040797 0.660281597921673 0 182 | 0.318109812290871 0.575920953000552 1 183 | 0.178113380451287 0.484514622771554 1 184 | 0.604058804412156 0.440600326378454 1 185 | 0.094427439721678 0.447052320408327 0 186 | 0.219262988672062 0.456020141901216 1 187 | 0.160655050355314 0.666264200182135 0 188 | 0.332721458774421 0.318625924667504 0 189 | 0.343718906752124 0.528156893937911 1 190 | 0.425047033009332 0.543962541782292 1 191 | 0.749600864832302 0.660476644106951 0 192 | 0.200737870566906 0.554615352024247 1 193 | 0.491240048995677 0.429351117497986 1 194 | 0.127112171016374 0.232509503429745 0 195 | 0.060468814917301 0.398417713016151 0 196 | 0.785935955710797 0.602346015926031 1 197 | 0.374716737989999 0.353505236284064 1 198 | 0.436002712207406 0.421769904219358 1 199 | 0.273205090009305 
0.488043108201696 1 200 | 0.024239104312959 0.558046862014373 0 201 | 0.662755325317231 0.512687586630719 0 202 | 0.279021852724453 0.536539291683743 1 203 | 0.120033325869932 0.536031252082054 0 204 | 0.099306949732981 0.534266316713788 0 205 | 0.455074280915086 0.499885054618869 1 206 | 0.074336747164541 0.458929554972241 0 207 | 0.188781102651723 0.676404741414833 0 208 | 0.498753590445956 0.558933578941906 1 209 | 0.230402238438460 0.638097615677935 0 210 | 0.244562527631309 0.350368442042444 1 211 | 0.110704059390191 0.764951935497288 0 212 | 0.056260670997384 0.375607688217045 0 213 | 0.268304949560206 0.490085297378408 1 214 | 0.619017121616898 0.507587214459785 0 215 | 0.506729746912698 0.595841268473231 0 216 | 0.563876710274587 0.321718137798429 1 217 | 0.516063045852790 0.640785360165191 0 218 | 0.206413997236883 0.542621068944641 1 219 | 0.419060605332245 0.638686501779906 0 220 | 0.265581590478578 0.391807661403432 1 221 | 0.534320051775695 0.472443135643378 1 222 | 0.281215725927083 0.355937448281976 1 223 | 0.513982944200595 0.385289733710778 1 224 | 0.250960560271128 0.685704432751575 0 225 | 0.630005374379881 0.592551601057047 0 226 | 0.602079209456856 0.643945963211525 0 227 | 0.226751958901932 0.362428720054178 1 228 | 0.265129718608035 0.460973886968577 1 229 | 0.260780846432465 0.281204010422012 0 230 | 0.158689210985088 0.505590507803563 1 231 | 0.318943181034731 0.505454956318115 1 232 | 0.253552105301414 0.149514581042475 1 233 | 0.539841287952326 0.319550672859052 1 234 | 0.095161797495434 0.501233344043949 0 235 | 0.312858588683517 0.471490595680891 1 236 | 0.229860589333571 0.652738407669928 0 237 | 0.307188286480966 0.540615627513335 1 238 | 0.329405942207326 0.407659723898524 1 239 | 0.313904968170452 0.589316598860567 1 240 | 0.271824181221892 0.468928562576986 1 241 | 0.455357781590165 0.548167168565001 1 242 | 0.448120324448889 0.311481567019774 1 243 | 0.378982818240203 0.539433016624798 1 244 | 0.232699431271707 0.346463675195537 1 245 | 0.154610240953759 0.568699313308745 0 246 | 0.359828171138622 0.505155824433198 1 247 | 0.409020660090158 0.447875209152292 1 248 | 0.366181222985928 0.482529880440538 1 249 | 0.581548147915803 0.558762853193273 0 250 | 0.214883531417443 0.547870010563497 1 251 | 0.024609222248019 0.423967509402905 0 252 | 0.315478578260119 0.395857789906608 1 253 | 0.485944304883226 0.610633868051437 0 254 | 0.063698582914894 0.476759523555957 0 255 | 0.387002851688177 0.392521902660983 1 256 | 0.096537480846897 0.224481265279902 0 257 | 0.118957339098346 0.489841733247089 0 258 | 0.549523420919422 0.387713119170254 1 259 | 0.445623166868433 0.392977398586459 1 260 | 0.316208665865129 0.588225094309012 1 261 | 0.129121037237315 0.410731775551996 0 262 | 0.402481967051439 0.421303610632036 1 263 | 0.781402330478245 0.662646976094367 0 264 | 0.464773535960461 0.696465008270909 0 265 | 0.422283479124878 0.415533969137168 1 266 | 0.562539403191263 0.300118661893326 1 267 | 0.175664578217876 0.324165376943464 0 268 | 0.535654198310941 0.466998758531653 1 269 | 0.800119048890266 0.382760703753727 1 270 | 0.563231686893447 0.718285194473858 0 271 | 0.363679702784266 0.608412532602636 1 272 | 0.232227662373562 0.488046953835068 1 273 | 0.507934640359342 0.357394102748849 1 274 | 0.166773363921389 0.284416808328599 0 275 | 0.260668144012676 0.537693981083947 1 276 | 0.186611684236044 0.361400666423465 0 277 | 0.110937828092493 0.512152543428773 0 278 | 0.287872646881721 0.534062489637283 1 279 | 0.435814910727064 0.499062581442704 1 280 | 
0.390332658902304 0.613974657635868 0 281 | 0.490306317308272 0.401534853541549 1 282 | 0.013797001126364 0.541451706197354 0 283 | 0.157145387681494 0.581537501123647 0 284 | 0.205199093424556 0.650717494407026 0 285 | 0.570344137098391 0.597578431634414 0 286 | 0.264630060600315 0.399330105715248 1 287 | 0.381913496867797 0.433280192053377 1 288 | 0.411945870002103 0.488770942305929 1 289 | 0.187443591715223 0.409615283567895 1 290 | 0.292183570780564 0.427453936683949 1 291 | 0.774397820828806 0.455435645367286 1 292 | 0.619215122034888 0.247770771340771 1 293 | 0.549734544988108 0.545247668773684 1 294 | 0.269637300035278 0.639920534578213 0 295 | 0.326047326233748 0.537956359113939 1 296 | 0.222702158847761 0.376181681712591 1 297 | 0.480146598029984 0.494242923854163 1 298 | 0.424008751453331 0.519982567802566 1 299 | 0.314449299569264 0.413199546083012 1 300 | 0.288307691309231 0.531319053799275 1 301 | 0.384124634210163 0.511535750857765 1 302 | -------------------------------------------------------------------------------- /13_batching/images/visualization.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nickgnd/programming-machine-learning-livebooks/fbd0d9242bcc08621171006cf53637d417828a06/13_batching/images/visualization.png -------------------------------------------------------------------------------- /14_testing/images/MNIST_2_sets.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nickgnd/programming-machine-learning-livebooks/fbd0d9242bcc08621171006cf53637d417828a06/14_testing/images/MNIST_2_sets.png -------------------------------------------------------------------------------- /14_testing/images/MNIST_3_sets.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nickgnd/programming-machine-learning-livebooks/fbd0d9242bcc08621171006cf53637d417828a06/14_testing/images/MNIST_3_sets.png -------------------------------------------------------------------------------- /16_deeper/16_deeper.livemd: -------------------------------------------------------------------------------- 1 | # Chapter 16: A Deeper Kind of Network 2 | 3 | ```elixir 4 | Mix.install( 5 | [ 6 | {:exla, "~> 0.5"}, 7 | {:nx, "~> 0.5"}, 8 | {:axon, "~> 0.5"}, 9 | {:kino, "~> 0.8.1"}, 10 | {:kino_vega_lite, "~> 0.1.7"}, 11 | {:vega_lite, "~> 0.1.6"}, 12 | {:table_rex, "~> 3.1.1"} 13 | ], 14 | config: [nx: [default_backend: EXLA.Backend]] 15 | ) 16 | ``` 17 | 18 | ## The Echidna Dataset 19 | 20 | ```elixir 21 | defmodule C16.EchidnaDataset do 22 | import Nx.Defn 23 | 24 | @data_path Path.join(__DIR__, "../data") |> Path.expand() 25 | 26 | @filename Path.join(@data_path, "echidna.txt") 27 | 28 | @doc """ 29 | Loads the echidna dataset and returns the input `x` and label `y` tensors. 
30 | 31 | - the dataset has been shuffled 32 | - the input tensor is already normalized 33 | """ 34 | def load() do 35 | with {:ok, binary} <- read_file() do 36 | # seed the random algorithm 37 | :rand.seed(:exsss, {1, 2, 3}) 38 | 39 | tensor = 40 | binary 41 | |> parse() 42 | |> Enum.shuffle() 43 | |> Nx.tensor() 44 | 45 | # all the rows, only first 2 columns 46 | x = tensor[[0..-1//1, 0..1//1]] |> normalize_inputs() 47 | 48 | # all the rows, only 3rd column 49 | y = 50 | tensor[[0..-1//1, 2]] 51 | |> Nx.reshape({:auto, 1}) 52 | |> Nx.as_type(:u8) 53 | 54 | %{x: x, y: y} 55 | end 56 | end 57 | 58 | def parse(binary) do 59 | binary 60 | |> String.split("\n", trim: true) 61 | |> Enum.slice(1..-1) 62 | |> Enum.map(fn row -> 63 | row 64 | |> String.split(" ", trim: true) 65 | |> Enum.map(&parse_float/1) 66 | end) 67 | end 68 | 69 | # Normalization (Min-Max Scalar) 70 | # 71 | # In this approach, the data is scaled to a fixed range — usually 0 to 1. 72 | # In contrast to standardization, the cost of having this bounded range 73 | # is that we will end up with smaller standard deviations, 74 | # which can suppress the effect of outliers. 75 | # Thus MinMax Scalar is sensitive to outliers. 76 | defnp normalize_inputs(x_raw) do 77 | # Compute the min/max over the first axe 78 | min = Nx.reduce_min(x_raw, axes: [0]) 79 | max = Nx.reduce_max(x_raw, axes: [0]) 80 | 81 | # After MinMaxScaling, the distributions are not centered 82 | # at zero and the standard deviation is not 1. 83 | # Therefore, subtract 0.5 to rescale data between -0.5 and 0.5 84 | (x_raw - min) / (max - min) - 0.5 85 | end 86 | 87 | # to handle both integer and float numbers 88 | defp parse_float(stringified_float) do 89 | {float, ""} = Float.parse(stringified_float) 90 | float 91 | end 92 | 93 | def read_file() do 94 | if File.exists?(@filename) do 95 | File.read(@filename) 96 | else 97 | {:error, "The file #{@filename} is missing!"} 98 | end 99 | end 100 | end 101 | ``` 102 | 103 | ### Visualize the Echidna dataset 104 | 105 | ```elixir 106 | alias VegaLite, as: Vl 107 | 108 | dataset = 109 | C16.EchidnaDataset.read_file() 110 | |> then(fn {:ok, binary} -> C16.EchidnaDataset.parse(binary) end) 111 | |> Enum.map(fn [input_a, input_b, label] -> 112 | %{input_a: input_a, input_b: input_b, label: label} 113 | end) 114 | 115 | Vl.new(width: 600, height: 400) 116 | |> Vl.data_from_values(dataset) 117 | |> Vl.mark(:point, filled: true, tooltip: true) 118 | |> Vl.encode_field(:x, "input_a", type: :quantitative) 119 | |> Vl.encode_field(:y, "input_b", type: :quantitative) 120 | |> Vl.encode(:color, field: "label", scale: %{"range" => ["blue", "green"]}) 121 | |> Vl.encode(:shape, field: "label", scale: %{"range" => ["square", "triangle-up"]}) 122 | ``` 123 | 124 | ### Load the data 125 | 126 | Load the data and split the input/label tensors in train, validate and test sets to use in the different stages. 127 | 128 | ```elixir 129 | %{x: x_all, y: y_all} = C16.EchidnaDataset.load() 130 | 131 | size = (elem(Nx.shape(x_all), 0) / 3) |> ceil() 132 | 133 | [x_train, x_validation, x_test] = Nx.to_batched(x_all, size) |> Enum.to_list() 134 | [y_train, y_validation, y_test] = Nx.to_batched(y_all, size) |> Enum.to_list() 135 | 136 | data = %{ 137 | x_train: x_train, 138 | x_validation: x_validation, 139 | x_test: x_test, 140 | y_train: y_train, 141 | y_validation: y_validation, 142 | y_test: y_test 143 | } 144 | ``` 145 | 146 | ## Building a Neural Network with Axon 147 | 148 | 1. 
The Echidna dataset has two input variables, so we only need two input nodes. 149 | 2. The Echidna dataset has two classes, so we only need two output nodes. 150 | 3. The number of hidden nodes is a hyperparameter that we can change later. To begin with, let's go with 100 hidden nodes. 151 | 4. Axon will add a bias nodes to the input and hidden layers 152 | 153 | 154 | 155 | ### Prepare the data 156 | 157 | ```elixir 158 | x_train = data.x_train 159 | x_validation = data.x_validation 160 | 161 | # One-hot encode the labels 162 | y_train = Nx.equal(data.y_train, Nx.tensor(Enum.to_list(0..1))) 163 | y_validation = Nx.equal(data.y_validation, Nx.tensor(Enum.to_list(0..1))) 164 | ``` 165 | 166 | ### Creating the model 167 | 168 | 169 | 170 | * Let's create a sequential model 171 | 172 | > Sequential models are named after the sequential nature in which data flows through them. Sequential models transform the input with sequential, successive transformations. 173 | 174 | 👆 Axon does not need a distinct sequential construct. To create a sequential model, you just pass Axon models through successive transformations in the Axon API. 175 | 176 | * A layer is _dense_ when each of its nodes is connected to all the nodes in a neighboring layer. 177 | 178 | * Note that for each layer, we specify the activation function that comes before the layer, not after it. 179 | 180 | ```elixir 181 | model = 182 | Axon.input("data", shape: Nx.shape(x_train)) 183 | |> Axon.dense(100, activation: :sigmoid) 184 | |> Axon.dense(2, activation: :softmax) 185 | ``` 186 | 187 | #### Visualize the model 188 | 189 | ```elixir 190 | template = Nx.template(Nx.shape(x_train), :f32) 191 | 192 | Axon.Display.as_table(model, template) |> IO.puts() 193 | 194 | Axon.Display.as_graph(model, template) 195 | ``` 196 | 197 | ### Training the Network 198 | 199 | ```elixir 200 | batch_size = 25 201 | 202 | train_inputs = Nx.to_batched(x_train, batch_size) 203 | train_labels = Nx.to_batched(y_train, batch_size) 204 | train_batches = Stream.zip(train_inputs, train_labels) 205 | 206 | validation_data = [{x_validation, y_validation}] 207 | 208 | epochs = 30_000 209 | 210 | # (~360 seconds with CPU) 211 | params = 212 | model 213 | |> Axon.Loop.trainer(:categorical_cross_entropy, Axon.Optimizers.rmsprop(0.001)) 214 | |> Axon.Loop.metric(:accuracy) 215 | |> Axon.Loop.validate(model, validation_data) 216 | |> Axon.Loop.run(train_batches, %{}, epochs: epochs, compiler: EXLA) 217 | ``` 218 | 219 | ### Drawing the Boundary 220 | 221 | ```elixir 222 | defmodule C16.Plotter do 223 | @moduledoc """ 224 | The module exposes an API to draw the echidna dataset 225 | and the predictions based on the params returned from 226 | the training. 227 | 228 | NOTE: since the training has been done on the normalized inputs, 229 | instead of using the original inputs from the Echidna dataset, 230 | the inputs and labels are extracted from the tensors in order 231 | to be in scale with the predictions. 
232 | """ 233 | 234 | alias VegaLite, as: Vl 235 | 236 | def plot(%{x: x_all, y: y_all}, model, params) do 237 | Vl.new(width: 600, height: 400) 238 | |> Vl.layers([ 239 | # Grid 240 | prediction_layer(x_all, model, params), 241 | # Inputs 242 | normalized_dataset_layer(x_all, y_all) 243 | ]) 244 | |> Vl.resolve(:scale, x: :shared, y: :shared, color: :independent) 245 | end 246 | 247 | defp prediction_layer(x_all, model, params) do 248 | # Build the grid 249 | grid = 250 | x_all 251 | |> boundaries() 252 | |> build_grid() 253 | 254 | labels = 255 | model 256 | |> Axon.predict(params, Nx.tensor(grid), compiler: EXLA) 257 | |> Nx.argmax(axis: 1) 258 | 259 | # Add the labels to the grid dataset 260 | data_with_labels = 261 | Enum.zip_with([grid, Nx.to_flat_list(labels)], fn [[x, y], label] -> 262 | %{x: x, y: y, label: label} 263 | end) 264 | 265 | Vl.new() 266 | |> Vl.data_from_values(data_with_labels) 267 | |> Vl.mark(:point) 268 | |> Vl.encode_field(:x, "x", type: :quantitative) 269 | |> Vl.encode_field(:y, "y", type: :quantitative) 270 | |> Vl.encode(:color, field: "label", scale: %{"range" => ["lightblue", "aquamarine"]}) 271 | end 272 | 273 | defp build_grid(%{x_max: x_max, x_min: x_min, y_max: y_max, y_min: y_min}) do 274 | resolution = 200 275 | x_step = (x_max - x_min) / resolution 276 | y_step = (y_max - y_min) / resolution 277 | 278 | for i <- 0..(resolution - 1), j <- 0..(resolution - 1) do 279 | [x_min + x_step * i, y_min + y_step * j] 280 | end 281 | end 282 | 283 | defp boundaries(inputs) do 284 | # Get x from the tensor 285 | x = Nx.slice_along_axis(inputs, 1, 1, axis: 1) 286 | 287 | # Get y from the tensor 288 | y = Nx.slice_along_axis(inputs, 2, 1, axis: 1) 289 | 290 | # Compute the grid boundaries 291 | x_min = x |> Nx.to_flat_list() |> Enum.min() 292 | x_max = x |> Nx.to_flat_list() |> Enum.max() 293 | y_min = y |> Nx.to_flat_list() |> Enum.min() 294 | y_max = y |> Nx.to_flat_list() |> Enum.max() 295 | 296 | padding = 0.1 297 | 298 | %{ 299 | x_min: x_min - abs(x_min * padding), 300 | x_max: x_max + abs(x_max * padding), 301 | y_min: y_min - abs(y_min * padding), 302 | y_max: y_max + abs(y_max * padding) 303 | } 304 | end 305 | 306 | defp normalized_dataset_layer(x_all, y_all) do 307 | normalized_inputs = to_list(x_all) 308 | normalized_labels = to_list(y_all) 309 | 310 | dataset = 311 | Enum.zip(normalized_inputs, normalized_labels) 312 | |> Enum.map(fn {[input_a, input_b], [label]} -> 313 | %{input_a: input_a, input_b: input_b, label: label} 314 | end) 315 | 316 | Vl.new() 317 | |> Vl.data_from_values(dataset) 318 | |> Vl.mark(:point, filled: true, tooltip: true) 319 | |> Vl.encode_field(:x, "input_a", type: :quantitative) 320 | |> Vl.encode_field(:y, "input_b", type: :quantitative) 321 | |> Vl.encode(:color, field: "label", scale: %{"range" => ["blue", "green"]}) 322 | |> Vl.encode(:shape, field: "label", scale: %{"range" => ["square", "triangle-up"]}) 323 | end 324 | 325 | defp to_list(tensor) do 326 | # utility to transform a tensor to 327 | # a list keeping the nesting 328 | tensor 329 | |> Nx.to_batched(1) 330 | |> Enum.map(&Nx.to_flat_list/1) 331 | end 332 | end 333 | ``` 334 | 335 | ```elixir 336 | C16.Plotter.plot(%{x: x_all, y: y_all}, model, params) 337 | ``` 338 | 339 | ## Making Deep 340 | 341 | ```elixir 342 | new_model = 343 | Axon.input("data", shape: Nx.shape(x_train)) 344 | |> Axon.dense(100, activation: :sigmoid) 345 | |> Axon.dense(30, activation: :sigmoid) 346 | |> Axon.dense(2, activation: :softmax) 347 | 348 | Axon.Display.as_graph(new_model, 
template) 349 | ``` 350 | 351 | ### Train the Network 352 | 353 | ```elixir 354 | # epochs are defined in the previous model's training 355 | 356 | # Set `eps` option in the RMSprop to prevent division by zero (NaN) 357 | # By default in Axon is 1.0e-8, I tried with 1.0e-7 (Keras default) and 358 | # it was still returning NaN. 359 | epsilon = 1.0e-4 360 | 361 | # (~450 seconds with CPU) 362 | new_params = 363 | new_model 364 | |> Axon.Loop.trainer(:categorical_cross_entropy, Axon.Optimizers.rmsprop(0.001, eps: epsilon)) 365 | |> Axon.Loop.metric(:accuracy) 366 | |> Axon.Loop.validate(new_model, validation_data) 367 | |> Axon.Loop.run(train_batches, %{}, epochs: epochs, compiler: EXLA) 368 | ``` 369 | 370 | ```elixir 371 | C16.Plotter.plot(%{x: x_all, y: y_all}, new_model, new_params) 372 | ``` 373 | -------------------------------------------------------------------------------- /16_deeper/mnist_with_axon.livemd: -------------------------------------------------------------------------------- 1 | # MNIST with Axon 2 | 3 | ```elixir 4 | Mix.install( 5 | [ 6 | {:exla, "~> 0.5"}, 7 | {:nx, "~> 0.5"}, 8 | {:vega_lite, "~> 0.1.6"}, 9 | {:kino, "~> 0.8.1"}, 10 | {:kino_vega_lite, "~> 0.1.7"}, 11 | {:table_rex, "~> 3.1.1"} 12 | ], 13 | config: [nx: [default_backend: EXLA.Backend]] 14 | ) 15 | ``` 16 | 17 | ## Prepare and load MNIST dataset 18 | 19 | _inspired by https://hexdocs.pm/axon/mnist.html#introduction_ 20 | 21 | ```elixir 22 | defmodule C16.MNISTDataset do 23 | @moduledoc """ 24 | Use this Module to load the MNIST database (test and validation sets) with 25 | normalized inputs. 26 | 27 | MNIST dataset specifications can be found here: http://yann.lecun.com/exdb/mnist/ 28 | """ 29 | 30 | @data_path Path.join(__DIR__, "../data/mnist") |> Path.expand() 31 | 32 | @train_images_filename Path.join(@data_path, "train-images-idx3-ubyte.gz") 33 | @test_images_filename Path.join(@data_path, "t10k-images-idx3-ubyte.gz") 34 | @train_labels_filename Path.join(@data_path, "train-labels-idx1-ubyte.gz") 35 | @test_labels_filename Path.join(@data_path, "t10k-labels-idx1-ubyte.gz") 36 | 37 | @type t :: %__MODULE__{ 38 | x_train: Nx.Tensor.t(), 39 | x_validation: Nx.Tensor.t(), 40 | y_train: Nx.Tensor.t(), 41 | y_validation: Nx.Tensor.t() 42 | } 43 | defstruct [ 44 | :x_train, 45 | :x_validation, 46 | :y_train, 47 | :y_validation 48 | ] 49 | 50 | @doc """ 51 | Load the MNIST database and return a map with train and validation images/labels. 52 | 53 | * train and validation images normalized (`x_train` and `x_test`) 54 | * `y_train` and `y_validation` one-hot encoded 55 | """ 56 | @spec load() :: t() 57 | def load() do 58 | # 60000 images, 1 channel, 28 pixel width, 28 pixel height 59 | train_images = load_images(@train_images_filename) 60 | validation_images = load_images(@test_images_filename) 61 | 62 | # 10000 labels, one-hot encoded 63 | train_labels = load_labels(@train_labels_filename) 64 | validation_labels = load_labels(@test_labels_filename) 65 | 66 | %__MODULE__{ 67 | x_train: train_images, 68 | x_validation: validation_images, 69 | y_train: train_labels, 70 | y_validation: validation_labels 71 | } 72 | end 73 | 74 | defp load_labels(filename) do 75 | # Open and unzip the file of labels 76 | with {:ok, binary} <- File.read(filename) do 77 | <<_::32, n_labels::32, labels_binary::binary>> = :zlib.gunzip(binary) 78 | 79 | # Nx.from_binary/2 returns a flat tensor. 
80 | # With Nx.reshape/3 we can manipulate this flat tensor
81 | # and reshape it: 1 row for each image, each row composed of
82 | # one column:
83 | # [
84 | # [1],
85 | # [4],
86 | # [9],
87 | # …
88 | # ]
89 | labels_binary
90 | |> Nx.from_binary({:u, 8})
91 | |> Nx.reshape({n_labels, 1})
92 | |> Nx.equal(Nx.tensor(Enum.to_list(0..9)))
93 | end
94 | end
95 | 
96 | defp load_images(filename) do
97 | # Open and unzip the file of images
98 | with {:ok, binary} <- File.read(filename) do
99 | <<_::32, n_images::32, n_rows::32, n_cols::32, images_binary::binary>> =
100 | :zlib.gunzip(binary)
101 | 
102 | # Nx.from_binary/2 returns a flat tensor.
103 | # Using Nx.reshape/3 we can manipulate this flat tensor into meaningful dimensions.
104 | # Notice we also normalized the tensor by dividing the input data by 255.
105 | # This squeezes the data between 0 and 1 which often leads to better behavior when
106 | # training models.
107 | # https://hexdocs.pm/axon/mnist.html#introduction
108 | images_binary
109 | |> Nx.from_binary({:u, 8})
110 | |> Nx.reshape({n_images, 1, n_rows, n_cols}, names: [:images, :channels, :height, :width])
111 | |> Nx.divide(255)
112 | end
113 | end
114 | end
115 | ```
116 | 
117 | ### Visualize the dataset via heatmap
118 | 
119 | 
120 | 
121 | We slice the images dimension of the images tensor to obtain the first 5 training images. Then, we convert them to a heatmap for easy visualization.
122 | 
123 | ```elixir
124 | %{x_train: x_train} = C16.MNISTDataset.load()
125 | 
126 | x_train[[images: 0..4]] |> Nx.to_heatmap()
127 | ```
128 | 
129 | ## Build the model and train the network
130 | 
131 | ```elixir
132 | # The `Axon.flatten` layer will flatten all but the batch dimensions of the
133 | # input into a single dimension. Typically called to flatten the output
134 | # of a convolution for use with a dense layer.
135 | #
136 | # https://hexdocs.pm/axon/Axon.html#flatten/2
137 | #
138 | # Flattening is converting the data into a 1-dimensional array for
139 | # inputting it to the next layer.
140 | # From `{60_000, 1, 28, 28}` to `{60_000, 784}` 141 | 142 | model = 143 | Axon.input("input", shape: {nil, 1, 28, 28}) 144 | |> Axon.flatten() 145 | |> Axon.dense(1200, activation: :sigmoid) 146 | |> Axon.dense(10, activation: :softmax) 147 | 148 | Axon.Display.as_table(model, Nx.to_template(x_train)) |> IO.puts() 149 | 150 | Axon.Display.as_graph(model, Nx.to_template(x_train)) 151 | ``` 152 | 153 | ```elixir 154 | %{ 155 | x_train: x_train, 156 | y_train: y_train, 157 | x_validation: x_validation, 158 | y_validation: y_validation 159 | } = C16.MNISTDataset.load() 160 | 161 | # Batch the training data 162 | train_data = Stream.zip(Nx.to_batched(x_train, 32), Nx.to_batched(y_train, 32)) 163 | 164 | validation_data = [{x_validation, y_validation}] 165 | 166 | params = 167 | model 168 | |> Axon.Loop.trainer(:categorical_cross_entropy, Axon.Optimizers.sgd(0.1)) 169 | |> Axon.Loop.metric(:accuracy) 170 | |> Axon.Loop.validate(model, validation_data) 171 | |> Axon.Loop.run(train_data, %{}, epochs: 50, compiler: EXLA) 172 | ``` 173 | -------------------------------------------------------------------------------- /16_deeper/model1_params.term: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nickgnd/programming-machine-learning-livebooks/fbd0d9242bcc08621171006cf53637d417828a06/16_deeper/model1_params.term -------------------------------------------------------------------------------- /17_overfitting/17_overfitting.livemd: -------------------------------------------------------------------------------- 1 | # Chapter 17: Defeating Overfitting 2 | 3 | ```elixir 4 | Mix.install( 5 | [ 6 | {:exla, "~> 0.5"}, 7 | {:nx, "~> 0.5"}, 8 | {:axon, "~> 0.5"}, 9 | {:kino, "~> 0.8.1"}, 10 | {:kino_vega_lite, "~> 0.1.7"}, 11 | {:vega_lite, "~> 0.1.6"} 12 | ], 13 | config: [nx: [default_backend: EXLA.Backend]] 14 | ) 15 | ``` 16 | 17 | ## Regularizing the Model 18 | 19 | ### Reviewing the Deep Network 20 | 21 | #### Load Echidna Dataset 22 | 23 | ```elixir 24 | defmodule C17.EchidnaDataset do 25 | import Nx.Defn 26 | 27 | @data_path Path.join(__DIR__, "../data") |> Path.expand() 28 | 29 | @filename Path.join(@data_path, "echidna.txt") 30 | 31 | @doc """ 32 | Loads the echidna dataset and returns the input `x` and label `y` tensors. 33 | 34 | - the dataset has been shuffled 35 | - the input tensor is already normalized 36 | """ 37 | def load() do 38 | with {:ok, binary} <- read_file() do 39 | # seed the random algorithm 40 | :rand.seed(:exsss, {1, 2, 3}) 41 | 42 | tensor = 43 | binary 44 | |> parse() 45 | |> Enum.shuffle() 46 | |> Nx.tensor() 47 | 48 | # all the rows, only first 2 columns 49 | x = tensor[[0..-1//1, 0..1//1]] |> normalize_inputs() 50 | 51 | # all the rows, only 3rd column 52 | y = 53 | tensor[[0..-1//1, 2]] 54 | |> Nx.reshape({:auto, 1}) 55 | |> Nx.as_type(:u8) 56 | 57 | %{x: x, y: y} 58 | end 59 | end 60 | 61 | def parse(binary) do 62 | binary 63 | |> String.split("\n", trim: true) 64 | |> Enum.slice(1..-1) 65 | |> Enum.map(fn row -> 66 | row 67 | |> String.split(" ", trim: true) 68 | |> Enum.map(&parse_float/1) 69 | end) 70 | end 71 | 72 | # Normalization (Min-Max Scalar) 73 | # 74 | # In this approach, the data is scaled to a fixed range — usually 0 to 1. 75 | # In contrast to standardization, the cost of having this bounded range 76 | # is that we will end up with smaller standard deviations, 77 | # which can suppress the effect of outliers. 78 | # Thus MinMax Scalar is sensitive to outliers. 
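  # Illustrative example (made-up values): if an input column were [2.0, 4.0, 6.0],
  # then min = 2.0 and max = 6.0, so scaling gives [0.0, 0.5, 1.0] and, after
  # subtracting 0.5, the normalized values are [-0.5, 0.0, 0.5].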
79 | defnp normalize_inputs(x_raw) do 80 | # Compute the min/max over the first axis 81 | min = Nx.reduce_min(x_raw, axes: [0]) 82 | max = Nx.reduce_max(x_raw, axes: [0]) 83 | 84 | # After MinMaxScaling, the distributions are not centered 85 | # at zero and the standard deviation is not 1. 86 | # Therefore, subtract 0.5 to rescale data between -0.5 and 0.5 87 | (x_raw - min) / (max - min) - 0.5 88 | end 89 | 90 | # to handle both integer and float numbers 91 | defp parse_float(stringified_float) do 92 | {float, ""} = Float.parse(stringified_float) 93 | float 94 | end 95 | 96 | def read_file() do 97 | if File.exists?(@filename) do 98 | File.read(@filename) 99 | else 100 | {:error, "The file #{@filename} is missing!"} 101 | end 102 | end 103 | end 104 | ``` 105 | 106 | Load the data and split the input/label tensors into train, validation and test sets to use in the different stages. 107 | 108 | ```elixir 109 | %{x: x_all, y: y_all} = C17.EchidnaDataset.load() 110 | 111 | size = (elem(Nx.shape(x_all), 0) / 3) |> ceil() 112 | 113 | [x_train, x_validation, x_test] = Nx.to_batched(x_all, size) |> Enum.to_list() 114 | [y_train, y_validation, y_test] = Nx.to_batched(y_all, size) |> Enum.to_list() 115 | 116 | # One-hot encode the labels 117 | y_train = Nx.equal(y_train, Nx.tensor([0, 1])) 118 | y_validation = Nx.equal(y_validation, Nx.tensor([0, 1])) 119 | ``` 120 | 121 | ### Building a Neural Network with Axon 122 | 123 | ```elixir 124 | batch_size = 25 125 | 126 | train_inputs = Nx.to_batched(x_train, batch_size) 127 | train_labels = Nx.to_batched(y_train, batch_size) 128 | train_batches = Stream.zip(train_inputs, train_labels) 129 | 130 | validation_data = [{x_validation, y_validation}] 131 | 132 | epochs = 30_000 133 | 134 | # Set the `eps` option in RMSprop to prevent division by zero (NaN). 135 | # The default in Axon is 1.0e-8; I tried 1.0e-7 (the Keras default) and 136 | # it was still returning NaN. 137 | epsilon = 1.0e-4 138 | 139 | model = 140 | Axon.input("data") 141 | |> Axon.dense(100, activation: :sigmoid) 142 | |> Axon.dense(30, activation: :sigmoid) 143 | |> Axon.dense(2, activation: :softmax) 144 | 145 | # `output_transform/1` applies a transformation to the final accumulated loop state. 146 | # 147 | # At the moment Axon does not provide a clean API to override/set it, 148 | # therefore we use a "hack" (`Map.update`) to override its value in the Loop's state.
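# With this override, `Axon.Loop.run/4` returns a plain map in the shape this
# notebook expects: `%{params: ..., metrics: %{epoch_index => %{"loss" => ..., "validation_loss" => ...}}}`.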
# 149 | 150 | # https://hexdocs.pm/axon/Axon.Loop.html#loop/3 151 | # https://github.com/elixir-nx/axon/blob/d180f074c33cf841fcbaf44c8e66d677c364d713/test/axon/loop_test.exs#L1073-L1080 152 | output_transform = fn %Axon.Loop.State{step_state: step_state, metrics: metrics} -> 153 | %{params: step_state[:model_state], metrics: metrics} 154 | end 155 | 156 | # (~450 seconds with CPU) 157 | %{params: params, metrics: metrics} = 158 | model 159 | |> Axon.Loop.trainer(:categorical_cross_entropy, Axon.Optimizers.rmsprop(0.001, eps: epsilon)) 160 | |> Axon.Loop.validate(model, validation_data) 161 | |> Map.update(:output_transform, nil, fn _original_output_transform -> 162 | fn state -> output_transform.(state) end 163 | end) 164 | |> Axon.Loop.run(train_batches, %{}, epochs: epochs, compiler: EXLA) 165 | ``` 166 | 167 | ```elixir 168 | training_losses = 169 | metrics 170 | |> Enum.sort_by(fn {index, _metric} -> index end) 171 | |> Enum.map(fn {index, %{"loss" => loss}} -> 172 | %{loss: Nx.to_number(loss), epoch: index, type: "training"} 173 | end) 174 | 175 | validation_losses = 176 | metrics 177 | |> Enum.sort_by(fn {index, _metric} -> index end) 178 | |> Enum.map(fn {index, %{"validation_loss" => validation_loss}} -> 179 | %{loss: Nx.to_number(validation_loss), epoch: index, type: "validation"} 180 | end) 181 | ``` 182 | 183 | 184 | 185 | ```elixir 186 | VegaLite.new(width: 600, height: 400) 187 | |> VegaLite.layers([ 188 | VegaLite.new() 189 | |> VegaLite.data_from_values(training_losses, only: ["epoch", "loss", "type"]) 190 | |> VegaLite.mark(:line) 191 | |> VegaLite.encode_field(:x, "epoch", type: :quantitative) 192 | |> VegaLite.encode_field(:y, "loss", type: :quantitative) 193 | |> VegaLite.encode_field(:color, "type", type: :nominal), 194 | VegaLite.new() 195 | |> VegaLite.data_from_values(validation_losses, only: ["epoch", "loss", "type"]) 196 | |> VegaLite.mark(:line) 197 | |> VegaLite.encode_field(:x, "epoch", type: :quantitative) 198 | |> VegaLite.encode_field(:y, "loss", type: :quantitative) 199 | |> VegaLite.encode_field(:color, "type", type: :nominal) 200 | ]) 201 | ``` 202 | 203 | ## L1 and L2 regularization 204 | 205 | I couldn't replicate this section of the book because L1/L2 regularization is not supported by Axon out of the box. 206 | 207 | More details in [this post](https://elixirforum.com/t/how-to-apply-a-l1-l2-penalty-to-layers-output-in-axon/52857) on the Elixir Forum. 208 | 209 | Interestingly enough, it was possible with a previous version of Axon, but the feature was then removed for the following reasons: 210 | 211 | * It's not in PyTorch, and it didn't seem very commonly used in TensorFlow 212 | * Regularization is a concern of training/optimization and not the model 213 | 214 | It is probably possible to achieve this by writing a custom training loop that applies the L1/L2 penalty per layer. I tried, but I couldn't manage to make it work 😞. A minimal sketch of just the penalty term follows below.
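For reference, here is a minimal, framework-free sketch of what an L2 penalty computes. It is not Axon's API (the module, function and argument names below are made up for illustration), but it shows the extra term a custom training loop would have to add to the loss; an L1 penalty is analogous, using `Nx.sum(Nx.abs(w))` instead of the sum of squares.

```elixir
defmodule C17.RegularizationSketch do
  import Nx.Defn

  # L2 penalty: lambda * sum(w^2), added on top of the unregularized loss.
  defn l2_penalized_loss(base_loss, weights, lambda) do
    base_loss + lambda * Nx.sum(weights * weights)
  end
end

# Toy usage with made-up numbers: base loss 0.35, a 2x2 weight matrix, lambda = 0.01
C17.RegularizationSketch.l2_penalized_loss(
  Nx.tensor(0.35),
  Nx.tensor([[0.1, -0.2], [0.3, 0.05]]),
  0.01
)
```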
215 | -------------------------------------------------------------------------------- /18_taming/beyond_the_sigmoid.livemd: -------------------------------------------------------------------------------- 1 | # Chapter 18: Beyond the Sigmoid 2 | 3 | ```elixir 4 | Mix.install( 5 | [ 6 | {:exla, "~> 0.5"}, 7 | {:nx, "~> 0.5"}, 8 | {:axon, "~> 0.5"}, 9 | {:kino, "~> 0.8.1"}, 10 | {:kino_vega_lite, "~> 0.1.7"}, 11 | {:vega_lite, "~> 0.1.6"} 12 | ], 13 | config: [nx: [default_backend: EXLA.Backend]] 14 | ) 15 | ``` 16 | 17 | ## ReLU 18 | 19 | ```elixir 20 | relu_fn = fn z -> 21 | if z <= 0 do 22 | 0 23 | else 24 | z 25 | end 26 | end 27 | ``` 28 | 29 | ```elixir 30 | dataset = Enum.map(-6..6, fn x -> %{x: x, y: relu_fn.(x)} end) 31 | ``` 32 | 33 | 34 | 35 | ```elixir 36 | VegaLite.new(width: 600, height: 400, title: "ReLU(z)") 37 | |> VegaLite.data_from_values(dataset, only: ["x", "y"]) 38 | |> VegaLite.mark(:line) 39 | |> VegaLite.encode_field(:x, "x", type: :quantitative) 40 | |> VegaLite.encode_field(:y, "y", type: :quantitative) 41 | ``` 42 | 43 | ## Leaky ReLU 44 | 45 | ```elixir 46 | leaky_relu_fn = fn z, alpha -> 47 | if z <= 0 do 48 | z * alpha 49 | else 50 | z 51 | end 52 | end 53 | ``` 54 | 55 | ```elixir 56 | alpha = 0.02 57 | dataset = Enum.map(-6..6, fn x -> %{x: x, y: leaky_relu_fn.(x, alpha)} end) 58 | ``` 59 | 60 | 61 | 62 | ```elixir 63 | VegaLite.new(width: 600, height: 400, title: "Leaky ReLU(z, alpha)") 64 | |> VegaLite.data_from_values(dataset, only: ["x", "y"]) 65 | |> VegaLite.mark(:line) 66 | |> VegaLite.encode_field(:x, "x", type: :quantitative) 67 | |> VegaLite.encode_field(:y, "y", type: :quantitative) 68 | ``` 69 | -------------------------------------------------------------------------------- /18_taming/ten_epochs_challenge.livemd: -------------------------------------------------------------------------------- 1 | # Chapter 18 - Hands on: the 10 Epochs Challenge 2 | 3 | ```elixir 4 | Mix.install( 5 | [ 6 | {:exla, "~> 0.5"}, 7 | {:nx, "~> 0.5"}, 8 | {:axon, "~> 0.5"}, 9 | {:kino, "~> 0.8.1"}, 10 | {:kino_vega_lite, "~> 0.1.7"}, 11 | {:vega_lite, "~> 0.1.6"}, 12 | {:scidata, "~> 0.1"} 13 | ], 14 | config: [nx: [default_backend: EXLA.Backend]] 15 | ) 16 | ``` 17 | 18 | ## Prepare the date 19 | 20 | ```elixir 21 | defmodule Chapter18.MNIST do 22 | def load_data() do 23 | {raw_images, raw_labels} = Scidata.MNIST.download() 24 | {raw_test_images, raw_test_labels} = Scidata.MNIST.download_test() 25 | 26 | train_images = transform_images(raw_images) 27 | train_labels = transform_labels(raw_labels) 28 | all_test_images = transform_images(raw_test_images) 29 | all_test_labels = transform_labels(raw_test_labels) 30 | 31 | {validation_images, test_images} = split(all_test_images) 32 | {validation_labels, test_labels} = split(all_test_labels) 33 | 34 | %{ 35 | train_images: train_images, 36 | train_labels: train_labels, 37 | validation_images: validation_images, 38 | validation_labels: validation_labels, 39 | test_images: test_images, 40 | test_labels: test_labels 41 | } 42 | end 43 | 44 | defp transform_images({bin, type, shape}) do 45 | bin 46 | |> Nx.from_binary(type) 47 | |> Nx.reshape({elem(shape, 0), :auto}) 48 | |> Nx.divide(255.0) 49 | end 50 | 51 | defp transform_labels({bin, type, _}) do 52 | bin 53 | |> Nx.from_binary(type) 54 | |> Nx.new_axis(-1) 55 | |> Nx.equal(Nx.tensor(Enum.to_list(0..9))) 56 | end 57 | 58 | defp split(tensor) do 59 | {x, _} = Nx.shape(tensor) 60 | len = trunc(x / 2) 61 | first_half = Nx.slice_along_axis(tensor, 0, len, axis: 0) 62 | 
second_half = Nx.slice_along_axis(tensor, len, len, axis: 0) 63 | {first_half, second_half} 64 | end 65 | end 66 | ``` 67 | 68 | ```elixir 69 | %{ 70 | train_images: train_images, 71 | train_labels: train_labels, 72 | validation_images: validation_images, 73 | validation_labels: validation_labels, 74 | test_images: test_images, 75 | test_labels: test_labels 76 | } = Chapter18.MNIST.load_data() 77 | 78 | train_batches = Stream.zip(Nx.to_batched(train_images, 32), Nx.to_batched(train_labels, 32)) 79 | validation_data = [{validation_images, validation_labels}] 80 | ``` 81 | 82 | ## Build and train the basic model 83 | 84 | ### Initial model in Keras 85 | 86 | ```python 87 | model = Sequential() 88 | model.add(Dense(1200, activation='sigmoid')) 89 | model.add(Dense(500, activation='sigmoid')) 90 | model.add(Dense(200, activation='sigmoid')) 91 | model.add(Dense(10, activation='softmax')) 92 | 93 | model.compile(loss='categorical_crossentropy', 94 | optimizer=SGD(lr=0.1), 95 | metrics=['accuracy']) 96 | 97 | history = model.fit(X_train, Y_train, 98 | validation_data=(X_validation, Y_validation), 99 | epochs=10, batch_size=32) 100 | ``` 101 | 102 | ```elixir 103 | epochs = 10 104 | 105 | model = 106 | Axon.input("data") 107 | |> Axon.dense(1200, activation: :sigmoid) 108 | |> Axon.dense(500, activation: :sigmoid) 109 | |> Axon.dense(200, activation: :sigmoid) 110 | |> Axon.dense(10, activation: :softmax) 111 | 112 | model 113 | |> Axon.Loop.trainer(:categorical_cross_entropy, Axon.Optimizers.sgd(0.1)) 114 | |> Axon.Loop.metric(:accuracy) 115 | |> Axon.Loop.validate(model, validation_data) 116 | |> Axon.Loop.run(train_batches, %{}, epochs: epochs, compiler: EXLA) 117 | ``` 118 | 119 | 👆 With the "basic" model: 120 | 121 | * accuracy: 0.9384413; loss: 0.3782525 122 | * val. accuracy: 0.9130000; val.loss: 0.2655237 123 | 124 | It took ~450 seconds to train this model for 10 epochs. 125 | 126 | ## Build and train the optimized model 127 | 128 | ```elixir 129 | epochs = 5 130 | 131 | model = 132 | Axon.input("data") 133 | |> Axon.dense(1200) 134 | |> Axon.leaky_relu(alpha: 0.2) 135 | |> Axon.batch_norm() 136 | |> Axon.dense(500) 137 | |> Axon.leaky_relu(alpha: 0.2) 138 | |> Axon.batch_norm() 139 | |> Axon.dense(200) 140 | |> Axon.leaky_relu(alpha: 0.2) 141 | |> Axon.batch_norm() 142 | |> Axon.dense(10, activation: :softmax) 143 | 144 | model 145 | |> Axon.Loop.trainer(:categorical_cross_entropy, Axon.Optimizers.adam()) 146 | |> Axon.Loop.metric(:accuracy) 147 | |> Axon.Loop.validate(model, validation_data) 148 | |> Axon.Loop.early_stop("validation_accuracy") 149 | |> Axon.Loop.run(train_batches, %{}, epochs: epochs, compiler: EXLA) 150 | ``` 151 | 152 | 👆 With the "optimized" model we get better results after only 5 epochs: 153 | 154 | * accuracy: 0.9847380; loss: 0.1109478 155 | * val. accuracy: 0.9386000; val.loss: 0.2847975 156 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2023 Nicolò G.
4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Programming Machine Learning: From Coding to Deep Learning - Elixir Livebooks 2 | 3 | > Last year (2022), one of my New Year's resolutions was to get into Machine Learning. So, I grabbed a digital copy of a book called ["Programming Machine Learning"](https://pragprog.com/titles/pplearn/programming-machine-learning/) by [P. Perrotta](https://github.com/nusco) and started my journey. It took a bit longer than I expected, but it was totally worth it in the end. 4 | 5 | ![livebooks home](./images/livebooks_home.png) 6 | 7 | [Programming Machine Learning](https://pragprog.com/titles/pplearn/programming-machine-learning/) is a hands-on book: it guides you through the creation of an image recognition application from scratch with supervised learning, iteratively and step by step. 8 | 9 | All the code examples are in Python/numpy (see [Source Code](https://pragprog.com/titles/pplearn/programming-machine-learning/#resources)), but I decided to give Elixir a spin using Livebook, Nx and company. 10 | 11 | ## Repository Structure 12 | 13 | The repository contains different Livebooks, around one per chapter, and each of them mirrors the corresponding Jupyter notebook in the [Source Code](https://pragprog.com/titles/pplearn/programming-machine-learning/#resources) provided with the book. 14 | 15 | ## Prerequisites 16 | 17 | * Erlang and Elixir (https://elixir-lang.org/install.html) 18 | 19 | I personally tested it with: 20 | 21 | * erlang 25.1.2 22 | * elixir 1.14.2-otp-25 23 | 24 | ### Livebook 25 | 26 | You can install and run [Livebook](https://livebook.dev) in different ways: 27 | 28 | * The most convenient one is to download the Livebook Desktop App (https://livebook.dev/#install) 29 | 30 | Otherwise: 31 | 32 | * via [Docker](https://github.com/livebook-dev/livebook#docker) (the repository has a `docker-compose.yml` file already) 33 | * using [Escript](https://github.com/livebook-dev/livebook#escript) which comes with a convenient CLI 34 | 35 | ## How to run the livebooks? 36 | 37 | * Git clone the repository on your local machine, then: 38 | 39 | If you have the Livebook Desktop App installed locally: 40 | 41 | * Launch the app and navigate to the cloned repo locally, select one of the livebooks and click on the "Open" button.
42 | 43 | If you installed Livebook via Escript: 44 | 45 | * run `livebook server --home ` (the `home` option will launch Livebook in the repository root folder), then select one of the livebooks and click on the "Open" button. 46 | 47 | If you want to launch Livebook via Docker: 48 | 49 | * run `docker-compose up`; it will launch Livebook in the repository root folder, then select one of the livebooks and click on the "Open" button. 50 | 51 | ## Additional notes 52 | 53 | ### Differences between Livebook and Jupyter notebooks 54 | 55 | * I could replicate all the different Jupyter notebooks in Elixir with Livebook/Nx/Axon, apart from the 2nd section of Chapter 17, where the book introduces L1/L2 regularization techniques, which are not supported by [Axon](https://github.com/elixir-nx/axon) out of the box (more details in the corresponding Livebook). 56 | 57 | ### Code Style 58 | 59 | * The Elixir code style used in the Livebooks is not the most idiomatic one because I aimed to keep it similar and comparable to the Python code in the Jupyter notebooks. 60 | 61 | ## Acknowledgements 62 | 63 | * First, I want to thank [@nusco](https://github.com/nusco) for the well-written and entertaining book. It has been a pleasant read and I can only recommend it. 64 | * Then, I want to give a huge shoutout to the elixir-nx and Livebook teams (and contributors) for their incredible achievements in less than two years! I'm absolutely blown away by the ecosystem of libraries they've created to make Elixir a possible alternative to the more popular languages in the ML field. The ecosystem is evolving fast and I'm genuinely curious to see what the future will bring! 65 | 66 | ## Contributing 67 | 68 | Contributions are welcome! Feel free to open an issue or PR if something is not working as expected or if you see possible improvements ☺️.
69 | -------------------------------------------------------------------------------- /data/echidna.txt: -------------------------------------------------------------------------------- 1 | Input_A Input_B Label 2 | 2.95 7.25 0 3 | 2.95 6.8 0 4 | 3.6 6.85 0 5 | 3.4 7.9 0 6 | 3.55 7.75 0 7 | 4.15 7.5 0 8 | 4.8 7.45 0 9 | 5.95 7.65 0 10 | 5.45 7.65 0 11 | 5.35 8.25 0 12 | 4.7 8 0 13 | 4.4 8.45 0 14 | 5.35 9.05 0 15 | 6.1 8.65 0 16 | 6.15 8.1 0 17 | 6.6 7.7 0 18 | 7.4 7.65 0 19 | 8.45 8.45 0 20 | 7.3 8.65 0 21 | 7.15 8.1 0 22 | 3.6 7.4 0 23 | 8.4 8 0 24 | 8.15 7.7 0 25 | 7.9 8.9 0 26 | 6.9 9.7 0 27 | 6.75 8.75 0 28 | 5.9 9.5 0 29 | 5.2 8.85 0 30 | 6.45 9.3 0 31 | 5.25 9.7 0 32 | 5.55 10.45 0 33 | 6.1 10.4 0 34 | 7.2 11.45 0 35 | 8.1 10.75 0 36 | 10.6 10.15 0 37 | 13.35 10.7 0 38 | 13.35 8.3 0 39 | 15.15 5.5 0 40 | 18.6 9.15 0 41 | 18.05 13.9 0 42 | 24.45 9.9 0 43 | 24.6 6.7 0 44 | 12.05 4.8 0 45 | 11.75 6.35 0 46 | 13 6.1 0 47 | 12.2 5.55 0 48 | 9.6 8.1 0 49 | 9 7.7 0 50 | 8.95 9.3 0 51 | 9.8 7.65 0 52 | 10.7 7.7 0 53 | 10.25 7.35 0 54 | 10.9 7.15 0 55 | 11.95 7.3 0 56 | 11.2 6.75 0 57 | 11.8 4.4 0 58 | 10.25 4.7 0 59 | 11.15 4.1 0 60 | 10.9 4.6 0 61 | 10.4 3.95 0 62 | 10.85 3.8 0 63 | 11.35 3.4 0 64 | 11.85 3.3 0 65 | 11.65 2.5 0 66 | 11.45 2.25 0 67 | 12.05 2.2 0 68 | 12.05 3.15 0 69 | 12.2 2.9 0 70 | 12.8 2.75 0 71 | 13.35 3.8 0 72 | 13.65 3.05 0 73 | 13.55 2.55 0 74 | 12.6 3.7 0 75 | 11.9 4.05 0 76 | 12.95 4.55 0 77 | 12.45 5 0 78 | 12.5 4.55 0 79 | 14.05 3.85 0 80 | 14.85 4 0 81 | 12.8 3.4 0 82 | 14.2 3.3 0 83 | 15.7 4.2 0 84 | 16.15 4 0 85 | 17.15 6 0 86 | 14.15 6.9 0 87 | 14.15 5.75 0 88 | 13.6 5.15 0 89 | 7.55 12.2 0 90 | 6.45 11.8 0 91 | 8.4 12.3 0 92 | 8.4 13.15 0 93 | 11 13.35 0 94 | 11.85 11.15 0 95 | 19.8 9.95 0 96 | 24.15 5.35 0 97 | 25.05 3 0 98 | 25.5 2.4 0 99 | 26.15 2.05 0 100 | 26.75 2.2 0 101 | 27.4 2.25 0 102 | 28.5 2.45 0 103 | 28.75 3.95 0 104 | 28.75 3.25 0 105 | 28.35 2.95 0 106 | 28.9 2.7 0 107 | 27.15 2.7 0 108 | 27.5 2.65 0 109 | 27.8 3 0 110 | 26.7 4 0 111 | 26.7 3.35 0 112 | 25.7 2.45 0 113 | 25.65 3.05 0 114 | 24.7 3.95 0 115 | 26 3.15 0 116 | 26.4 2.85 0 117 | 27.4 4.1 0 118 | 28.1 4.95 0 119 | 27.1 6.3 0 120 | 27.2 4.9 0 121 | 25.5 3.85 0 122 | 24.95 3.5 0 123 | 26.1 3.9 0 124 | 26.1 5.3 0 125 | 26.8 4.65 0 126 | 25.35 4.65 0 127 | 28.6 5.25 0 128 | 28.15 6 0 129 | 29.2 5.85 0 130 | 28.05 5.65 0 131 | 29.7 6.35 0 132 | 28.95 6.35 0 133 | 28.95 6.95 0 134 | 30.05 7.4 0 135 | 29.2 7.55 0 136 | 29.4 8.6 0 137 | 29.1 8.2 0 138 | 28.25 9.2 0 139 | 28.45 7.75 0 140 | 27.3 7.5 0 141 | 27.85 6.8 0 142 | 28.65 7.15 0 143 | 28.5 6.6 0 144 | 27.2 5.45 0 145 | 26.55 5.6 0 146 | 26 6.25 0 147 | 26.6 6.3 0 148 | 26.4 7.15 0 149 | 27.55 6.95 0 150 | 27.65 8.45 0 151 | 28.95 8.95 0 152 | 29.65 9.85 0 153 | 28.85 9.6 0 154 | 29.35 10.65 0 155 | 28.6 10.25 0 156 | 28.85 11.25 0 157 | 28.15 11.05 0 158 | 28.1 10 0 159 | 28.5 9.15 0 160 | 28.4 8.7 0 161 | 27.6 11.65 0 162 | 28.25 11.9 0 163 | 27.7 12.8 0 164 | 26.65 13.05 0 165 | 24.4 13.65 0 166 | 23.85 11.65 0 167 | 21.45 12.45 0 168 | 19.6 11.6 0 169 | 17.55 11.9 0 170 | 17.2 10.1 0 171 | 14.9 9.9 0 172 | 14.3 11.8 0 173 | 8.75 13.75 0 174 | 9.55 14.5 0 175 | 10.55 14.8 0 176 | 11.8 15.9 0 177 | 11.6 15.1 0 178 | 13 16.3 0 179 | 12.85 15.6 0 180 | 14.2 16.4 0 181 | 15.4 16.9 0 182 | 16.45 17.2 0 183 | 17.9 17.6 0 184 | 19.6 17.65 0 185 | 21.4 17.2 0 186 | 20.7 17.5 0 187 | 22 17.05 0 188 | 22.25 17.05 0 189 | 22.9 16.7 0 190 | 24.05 16.4 0 191 | 25.55 15.3 0 192 | 25.65 14.75 0 193 | 26.3 14.15 0 194 | 26.85 13.85 0 195 | 26.2 
11.7 0 196 | 21.65 11.8 0 197 | 22.2 8.35 0 198 | 19.4 5.3 0 199 | 16.85 5.2 0 200 | 17.2 4.2 0 201 | 18.25 5.3 0 202 | 18 4.3 0 203 | 19.25 4.25 0 204 | 18.6 4.65 0 205 | 20.3 4.2 0 206 | 20.25 5.15 0 207 | 20.65 4.15 0 208 | 21.9 4.05 0 209 | 22.5 5.2 0 210 | 22.6 4.05 0 211 | 23.1 4 0 212 | 23.6 4.5 0 213 | 23.6 4.1 0 214 | 24.15 3.9 0 215 | 21.05 7.25 0 216 | 20.9 5.3 0 217 | 17.1 6.9 0 218 | 14.9 7.55 0 219 | 10.9 9.95 0 220 | 10.55 8.85 0 221 | 8.3 8.85 0 222 | 7.5 10.05 0 223 | 6.85 10.9 0 224 | 6.25 10.35 0 225 | 9 9.7 0 226 | 8.25 9.7 0 227 | 8.5 11 0 228 | 7.95 11.45 0 229 | 9.15 11.65 0 230 | 9.45 10.3 0 231 | 10.65 9.9 0 232 | 9.75 9.85 0 233 | 9.75 8.5 0 234 | 11.9 8.2 0 235 | 11.9 9.85 0 236 | 11.55 9.1 0 237 | 10.7 9.15 0 238 | 9.3 11.75 0 239 | 10.65 13.15 0 240 | 14.4 14.15 0 241 | 12.55 13.95 0 242 | 13.5 13 0 243 | 17.05 13.8 0 244 | 16.85 16.45 0 245 | 16.4 13.45 0 246 | 15.2 15.3 0 247 | 16.3 12 0 248 | 23.6 13.2 0 249 | 21.65 9.35 0 250 | 16.35 11.15 0 251 | 20.55 9.2 0 252 | 21.3 13.55 0 253 | 16.75 13.45 0 254 | 15 9.5 0 255 | 24.55 8.9 0 256 | 18.7 6.6 0 257 | 16.9 8.4 0 258 | 23.4 10.5 0 259 | 18.2 12.9 0 260 | 24.5 9.85 0 261 | 18.25 7.9 0 262 | 20.75 7.65 0 263 | 19.65 8.35 0 264 | 19.6 14.1 0 265 | 18.25 15.7 0 266 | 21.05 14.65 0 267 | 26.5 11.4 0 268 | 23.5 8.7 0 269 | 16.1 8.4 0 270 | 14.5 12.3 0 271 | 18.15 9.95 0 272 | 22 9.95 0 273 | 22 8.2 0 274 | 22.45 5.65 0 275 | 22.05 4.75 0 276 | 25.45 8.55 0 277 | 26.2 9.8 0 278 | 27.05 8.4 0 279 | 27.05 10.6 0 280 | 25.05 10.6 0 281 | 25.9 8 0 282 | 22.35 7.45 0 283 | 20.45 6.6 0 284 | 15.05 6.5 0 285 | 14.3 4.9 0 286 | 16.3 4.7 0 287 | 15.05 8 0 288 | 13.1 7 0 289 | 13.2 9.75 0 290 | 14.8 9 0 291 | 10.7 11.35 0 292 | 10 12.55 0 293 | 9.4 13.7 0 294 | 11.5 12.9 0 295 | 11.3 11.45 0 296 | 13.9 11.45 0 297 | 12.05 12.75 0 298 | 17.5 13.65 0 299 | 22.05 15.7 0 300 | 23.45 14.8 0 301 | 22.6 13.95 0 302 | 22.9 11.85 0 303 | 22.9 9.7 0 304 | 24.05 6.85 0 305 | 23.1 7.35 0 306 | 23.1 7.35 0 307 | 23.9 4.85 0 308 | 22.85 6.1 0 309 | 21.45 6.1 0 310 | 25.9 7.25 0 311 | 24 7.8 0 312 | 25.6 5.8 0 313 | 24.3 5.8 0 314 | 17.65 8.1 0 315 | 15.35 11 0 316 | 16.5 9.3 0 317 | 14.85 10.1 0 318 | 13.95 9.9 0 319 | 13.95 8.9 0 320 | 15.8 8.9 0 321 | 15.95 10.65 0 322 | 12.7 12.25 0 323 | 11.45 12.3 0 324 | 12.5 10.25 0 325 | 12.75 9.2 0 326 | 11.5 11.15 0 327 | 11.35 11 0 328 | 12.8 12.05 0 329 | 12.75 11.5 0 330 | 15.95 13.75 0 331 | 15.45 12.35 0 332 | 14.95 13.35 0 333 | 17.15 12.4 0 334 | 17.1 10.5 0 335 | 16 9.55 0 336 | 18.3 10.25 0 337 | 18.2 11.65 0 338 | 18.9 10.8 0 339 | 18.9 12.4 0 340 | 20.45 12.5 0 341 | 20.45 10.6 0 342 | 19.05 10.3 0 343 | 19.55 9.55 0 344 | 19.15 8.5 0 345 | 19.5 7.35 0 346 | 18.5 7.35 0 347 | 17.95 6.8 0 348 | 18 6 0 349 | 19.65 6 0 350 | 16.8 7.4 0 351 | 16.2 7.6 0 352 | 21.95 9.2 0 353 | 21.1 10.85 0 354 | 21.05 11.55 0 355 | 22.5 10.8 0 356 | 22.45 12.25 0 357 | 24.3 12.05 0 358 | 25.7 10.25 0 359 | 25.35 9.45 0 360 | 25.25 12 0 361 | 25.25 14.25 0 362 | 23.85 15.6 0 363 | 20 16.7 0 364 | 20 15.05 0 365 | 18.95 16.15 0 366 | 20.3 16 0 367 | 18.55 16.7 0 368 | 17 15.25 0 369 | 13.3 15 0 370 | 12 14.8 0 371 | 10.05 14.05 0 372 | 11.35 12.85 0 373 | 8.75 12.9 0 374 | 10.3 11.6 0 375 | 9.55 10.25 0 376 | 9.35 11.3 0 377 | 10.2 10.9 0 378 | 7.6 9.2 0 379 | 12.9 8.9 0 380 | 15.65 12.35 0 381 | 12.65 14 0 382 | 13.95 12.5 0 383 | 11.8 13.4 0 384 | 14.55 13.9 0 385 | 14.4 15.9 0 386 | 14.15 14.85 0 387 | 14.15 13.75 0 388 | 16.35 14.7 0 389 | 18.75 14.3 0 390 | 18.55 13.4 0 391 | 21.15 13.45 0 392 | 23.1 
15.4 0 393 | 21.9 14.2 0 394 | 21.95 15.25 0 395 | 23.2 14.6 0 396 | 22.25 14.7 0 397 | 22.9 13.45 0 398 | 22.2 13.3 0 399 | 24.05 12.65 0 400 | 24.8 12.95 0 401 | 26.65 12.55 0 402 | 25.9 13.55 0 403 | 25.8 12.25 0 404 | 25.7 10.95 0 405 | 24.4 11.5 0 406 | 24.2 10.1 0 407 | 22.5 11.25 0 408 | 21.65 9.2 0 409 | 24.3 8.5 0 410 | 19.8 12.85 0 411 | 20.65 14.8 0 412 | 21.15 16.1 0 413 | 23.1 16.55 0 414 | 22.55 16.15 0 415 | 21.85 17 0 416 | 21.5 16.05 0 417 | 20.95 16.7 0 418 | 19.8 16.15 0 419 | 17.75 16.05 0 420 | 15.65 15.9 0 421 | 16.6 15.7 0 422 | 16.25 16.7 0 423 | 17.15 16.2 0 424 | 17.2 14.5 0 425 | 17.35 16.95 0 426 | 18.5 14.95 0 427 | 17.8 14.95 0 428 | 19.35 15.35 0 429 | 19.35 16.65 0 430 | 18.85 17.3 0 431 | 18.35 17 0 432 | 19.1 15 0 433 | 16.45 12.25 0 434 | 20.7 11.2 0 435 | 24.75 14.5 0 436 | 22.25 11.15 0 437 | 23.95 14.3 0 438 | 23.4 14.3 0 439 | 25.45 11.75 0 440 | 22.2 10.45 0 441 | 20.55 6.7 0 442 | 22.2 6.1 0 443 | 21.9 7.4 0 444 | 20.35 7.6 0 445 | 20.35 8.9 0 446 | 23.15 8.35 0 447 | 25.05 13.2 0 448 | 23.4 15.85 0 449 | 23.4 15.75 0 450 | 24.25 15 0 451 | 1.85 7.3 1 452 | 1.95 9.6 1 453 | 3.05 9.15 1 454 | 3.5 10.45 1 455 | 2.7 12.55 1 456 | 4.2 10.1 1 457 | 4.9 11.55 1 458 | 2.8 11.25 1 459 | 3.75 8.85 1 460 | 2.5 8.05 1 461 | 2.15 8.9 1 462 | 3.25 8.45 1 463 | 2.1 6.7 1 464 | 2.65 6.45 1 465 | 2.65 5.35 1 466 | 2.25 6.35 1 467 | 4.05 5.5 1 468 | 4.6 6.3 1 469 | 3.55 6.25 1 470 | 4.8 7.1 1 471 | 5.7 7 1 472 | 5.7 6.05 1 473 | 6.45 6.1 1 474 | 6.45 7.3 1 475 | 7.3 6.7 1 476 | 8.3 7.2 1 477 | 9.4 6.8 1 478 | 10 6.25 1 479 | 9.05 6.3 1 480 | 8.15 5.95 1 481 | 8.1 6.7 1 482 | 7.05 7 1 483 | 7.1 5.55 1 484 | 8.35 4.8 1 485 | 4.55 4.85 1 486 | 5.85 3.35 1 487 | 5.6 5.55 1 488 | 2.9 5.1 1 489 | 3.85 4.25 1 490 | 5.95 4.5 1 491 | 8.15 2.7 1 492 | 6.45 2.5 1 493 | 8.4 2.3 1 494 | 7.3 3.9 1 495 | 9 4.8 1 496 | 10 5.65 1 497 | 10.45 6.75 1 498 | 10.45 6.25 1 499 | 11 6.05 1 500 | 11.5 5.75 1 501 | 11.4 4.85 1 502 | 10.4 5.6 1 503 | 11.1 5.5 1 504 | 11.7 5.1 1 505 | 10.05 5.1 1 506 | 10.65 4.95 1 507 | 10 4.55 1 508 | 10.25 3.6 1 509 | 10.55 3.25 1 510 | 10.55 2.85 1 511 | 11 2.25 1 512 | 11 2.25 1 513 | 11.15 3.2 1 514 | 12.55 1.2 1 515 | 11.6 1.45 1 516 | 10.25 1.5 1 517 | 9.95 2.7 1 518 | 9.2 3.65 1 519 | 9.55 4.35 1 520 | 9.1 5.7 1 521 | 8.05 5.55 1 522 | 6.85 5.1 1 523 | 7.8 4.7 1 524 | 7.8 4.2 1 525 | 7.7 7.3 1 526 | 8.6 2.15 1 527 | 6.8 3.75 1 528 | 4.9 3.75 1 529 | 6.7 2.8 1 530 | 8.3 3.1 1 531 | 7.95 2.25 1 532 | 10.2 1.55 1 533 | 9.25 2.45 1 534 | 10.2 2.2 1 535 | 8.3 3.95 1 536 | 5.6 4.4 1 537 | 5 5.1 1 538 | 2.3 10.05 1 539 | 3.9 10.95 1 540 | 5.15 10.5 1 541 | 4.45 9.7 1 542 | 3.35 9.9 1 543 | 4.45 11 1 544 | 4.05 12.6 1 545 | 3.55 11.85 1 546 | 2.4 11.65 1 547 | 2.2 11.15 1 548 | 1.5 8.6 1 549 | 5.5 12.65 1 550 | 5.45 11.85 1 551 | 5.45 11.35 1 552 | 4.3 12 1 553 | 6.25 12.4 1 554 | 7.05 12.55 1 555 | 6.05 11.9 1 556 | 7.4 14.5 1 557 | 7.3 13.8 1 558 | 5.25 13.8 1 559 | 4.6 12.95 1 560 | 3.3 13 1 561 | 4.5 14 1 562 | 3.9 13.5 1 563 | 5.3 13.5 1 564 | 5.35 14.8 1 565 | 6.2 13.7 1 566 | 6.15 14.45 1 567 | 6.75 13.8 1 568 | 6.3 12.95 1 569 | 7.3 13.1 1 570 | 8.05 13.7 1 571 | 8.1 14.65 1 572 | 2.7 10.75 1 573 | 1.85 10.65 1 574 | 13.85 1.8 1 575 | 12.9 1.8 1 576 | 13.7 1.2 1 577 | 12.1 1.7 1 578 | 14.4 1.65 1 579 | 14.8 1.4 1 580 | 15.2 2.25 1 581 | 17 1.2 1 582 | 15.3 1.05 1 583 | 16.45 0.85 1 584 | 17.65 1.05 1 585 | 17.95 1.65 1 586 | 19.15 1.25 1 587 | 18.65 1.05 1 588 | 20.15 1.35 1 589 | 20 2.85 1 590 | 18.8 2.2 1 591 | 16.35 2.4 1 592 | 16.25 1.7 1 593 | 
15.45 1.7 1 594 | 10.8 1.7 1 595 | 9.2 1.9 1 596 | 7.6 2.9 1 597 | 21.25 1.05 1 598 | 22.8 1 1 599 | 22.2 1.55 1 600 | 24.25 1.35 1 601 | 22.8 2.2 1 602 | 24.6 1.7 1 603 | 25.4 1.35 1 604 | 26.9 0.95 1 605 | 25.6 0.95 1 606 | 26.8 1.45 1 607 | 28.15 1.45 1 608 | 29.2 1.7 1 609 | 29.7 2.65 1 610 | 30.45 3.45 1 611 | 31 4.45 1 612 | 30.15 4.65 1 613 | 29.7 3.95 1 614 | 27.85 4.4 1 615 | 27.8 4.05 1 616 | 28.1 3.55 1 617 | 27.55 3.4 1 618 | 28.6 4.6 1 619 | 29.45 4.3 1 620 | 29.25 3.65 1 621 | 28.05 2.35 1 622 | 27.4 1.9 1 623 | 29.35 2.95 1 624 | 29.9 3.6 1 625 | 31.4 3.7 1 626 | 31.4 5.8 1 627 | 30.05 5.4 1 628 | 29.4 5.15 1 629 | 28.75 5.45 1 630 | 29.45 5.7 1 631 | 29.1 6.15 1 632 | 29.9 6.2 1 633 | 29.55 6.75 1 634 | 29.95 6.7 1 635 | 30.45 7.3 1 636 | 29.75 8.05 1 637 | 29.5 7.8 1 638 | 30.35 7.9 1 639 | 30.35 9.1 1 640 | 29.9 9.3 1 641 | 28.85 9.9 1 642 | 29.65 10.3 1 643 | 30.2 10.2 1 644 | 29.8 9.85 1 645 | 29.15 12 1 646 | 28.6 10.7 1 647 | 29 10.9 1 648 | 27.85 12.4 1 649 | 28.8 11.85 1 650 | 29.35 11.65 1 651 | 30 11.15 1 652 | 31.1 10.9 1 653 | 31 9.95 1 654 | 31.1 8.55 1 655 | 31 7.6 1 656 | 30.7 6.65 1 657 | 30.75 6 1 658 | 30.45 5.05 1 659 | 30.4 4.35 1 660 | 31.3 4.55 1 661 | 30.9 5.45 1 662 | 31.65 5.9 1 663 | 31.55 7.8 1 664 | 31.55 6.9 1 665 | 31.85 7.85 1 666 | 31.6 9.3 1 667 | 30.4 8.75 1 668 | 32 8.75 1 669 | 32.95 10.55 1 670 | 32.65 10.15 1 671 | 32.1 10.45 1 672 | 31.6 11.55 1 673 | 30.45 12.95 1 674 | 9 14.85 1 675 | 9.85 14.85 1 676 | 11.55 16.25 1 677 | 10.95 15.8 1 678 | 11.8 15.6 1 679 | 12.4 16.2 1 680 | 13 15.9 1 681 | 13.75 16.85 1 682 | 12.85 17.1 1 683 | 13.65 16.25 1 684 | 11.9 16.9 1 685 | 14.65 16.95 1 686 | 15 17.95 1 687 | 15.75 17.7 1 688 | 16.65 17.95 1 689 | 18 17.95 1 690 | 17.45 18.6 1 691 | 18.6 18.35 1 692 | 19.35 18 1 693 | 20.55 18.1 1 694 | 19.8 18.65 1 695 | 21.9 18 1 696 | 21.1 18.5 1 697 | 21.25 17.95 1 698 | 22.05 17.4 1 699 | 23.65 17.05 1 700 | 21.1 17.65 1 701 | 24.45 16.6 1 702 | 25.65 16.25 1 703 | 25.75 15.65 1 704 | 27 14.8 1 705 | 26.25 15.6 1 706 | 27.25 14.05 1 707 | 27.9 13.3 1 708 | 28.7 12.95 1 709 | 28.6 12.3 1 710 | 29.8 12.35 1 711 | 29.85 13.45 1 712 | 30.75 12.1 1 713 | 31.8 12.15 1 714 | 32.1 11.05 1 715 | 32.15 12.4 1 716 | 30.9 13.35 1 717 | 29.25 14 1 718 | 29.05 15.2 1 719 | 27.45 15.2 1 720 | 28.1 13.65 1 721 | 28.05 14.65 1 722 | 28.65 13.45 1 723 | 28.6 14.3 1 724 | 29.6 14.35 1 725 | 30.25 14.3 1 726 | 29.25 12.9 1 727 | 30.25 11.35 1 728 | 14.5 2.45 1 729 | 14.9 3.3 1 730 | 15.5 2.9 1 731 | 15.55 3.7 1 732 | 16.65 3.05 1 733 | 15.95 2.4 1 734 | 16.95 2.2 1 735 | 16.15 3.5 1 736 | 17.2 3.6 1 737 | 17.2 2.9 1 738 | 17.5 3 1 739 | 17.75 1.9 1 740 | 18.15 3.65 1 741 | 18.5 2.85 1 742 | 17.9 2.75 1 743 | 18.8 2.25 1 744 | 18.75 3.55 1 745 | 19.4 2.8 1 746 | 19.45 2.1 1 747 | 9.7 1.45 1 748 | 11.65 0.75 1 749 | 13.55 0.5 1 750 | 15.5 0.4 1 751 | 17 0.5 1 752 | 17.75 0.25 1 753 | 18.45 0.6 1 754 | 19.6 0.6 1 755 | 20.1 1.85 1 756 | 20.55 0.3 1 757 | 20.8 2.55 1 758 | 20.85 3.4 1 759 | 21.7 2.3 1 760 | 21.7 2.3 1 761 | 19.7 3.95 1 762 | 19.25 3.6 1 763 | 20.1 3.5 1 764 | 22 3.4 1 765 | 21.05 3.85 1 766 | 21.25 3.35 1 767 | 21.15 1.9 1 768 | 20.3 1.6 1 769 | 21.55 0.95 1 770 | 22.8 3.25 1 771 | 22.9 3.85 1 772 | 24 3.6 1 773 | 24 2.85 1 774 | 23.45 3.4 1 775 | 23.6 2.1 1 776 | 24.65 3.3 1 777 | 24.65 2.55 1 778 | 23.35 1.55 1 779 | 23.45 0.65 1 780 | 24.75 0.65 1 781 | 22.25 0.4 1 782 | 22.15 2.9 1 783 | 14.3 0.75 1 784 | 5.95 15.45 1 785 | 7.95 15.95 1 786 | 6.85 15.55 1 787 | 4.05 15.35 1 788 | 3.35 13.95 1 789 | 4.75 
15.2 1 790 | 8.65 15.7 1 791 | 10 15.6 1 792 | 9.35 16.9 1 793 | 6.85 16.6 1 794 | 7.2 15.1 1 795 | 6.25 15.75 1 796 | 4.85 16.4 1 797 | 5.6 16.15 1 798 | 9.2 16.35 1 799 | 9.95 17.65 1 800 | 11.1 17.5 1 801 | 16.6 18.9 1 802 | 19.35 18.65 1 803 | 22.55 18.65 1 804 | 22.65 17.7 1 805 | 23.5 18.05 1 806 | 24.6 17.65 1 807 | 24.85 16.9 1 808 | 24.15 17.5 1 809 | 26 17.25 1 810 | 26.2 16.25 1 811 | 27.6 16.3 1 812 | 27.3 16.5 1 813 | 27.25 15.75 1 814 | 28.5 15.45 1 815 | 28.55 16.15 1 816 | 33.25 11.35 1 817 | 32.95 11.55 1 818 | 32.45 11.9 1 819 | 32.3 13.1 1 820 | 31.05 13.5 1 821 | 32.9 12.7 1 822 | 32.85 8.95 1 823 | 31.85 10.05 1 824 | 32.15 9.35 1 825 | 5.8 17.05 1 826 | 6.4 16.45 1 827 | 7.7 16.7 1 828 | 6.9 17.65 1 829 | 8.15 17.65 1 830 | 8.15 16.9 1 831 | 7.3 15.65 1 832 | 10.15 16.15 1 833 | 10.4 16.6 1 834 | 8.45 17.4 1 835 | 8.9 18.2 1 836 | 7.85 18.05 1 837 | 9.55 18.6 1 838 | 10.7 17.85 1 839 | 10.7 18.4 1 840 | 10.3 19 1 841 | 11.55 18.55 1 842 | 11.75 17.75 1 843 | 12.3 18.55 1 844 | 13.3 18.5 1 845 | 13.35 17.2 1 846 | 14.05 17.95 1 847 | 13.05 17.95 1 848 | 14.95 18.7 1 849 | 14.1 18.85 1 850 | 15.95 18.85 1 851 | 15.95 18.65 1 852 | 18 18.8 1 853 | 21.8 18.75 1 854 | 30.1 15.3 1 855 | 31.3 14.5 1 856 | 30.35 14.45 1 857 | -------------------------------------------------------------------------------- /data/mnist/readme.txt: -------------------------------------------------------------------------------- 1 | The MNIST dataset. See http://yann.lecun.com/exdb/mnist/. 2 | -------------------------------------------------------------------------------- /data/mnist/t10k-images-idx3-ubyte.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nickgnd/programming-machine-learning-livebooks/fbd0d9242bcc08621171006cf53637d417828a06/data/mnist/t10k-images-idx3-ubyte.gz -------------------------------------------------------------------------------- /data/mnist/t10k-labels-idx1-ubyte.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nickgnd/programming-machine-learning-livebooks/fbd0d9242bcc08621171006cf53637d417828a06/data/mnist/t10k-labels-idx1-ubyte.gz -------------------------------------------------------------------------------- /data/mnist/train-images-idx3-ubyte.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nickgnd/programming-machine-learning-livebooks/fbd0d9242bcc08621171006cf53637d417828a06/data/mnist/train-images-idx3-ubyte.gz -------------------------------------------------------------------------------- /data/mnist/train-labels-idx1-ubyte.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nickgnd/programming-machine-learning-livebooks/fbd0d9242bcc08621171006cf53637d417828a06/data/mnist/train-labels-idx1-ubyte.gz -------------------------------------------------------------------------------- /data/sonar/sonar.names: -------------------------------------------------------------------------------- 1 | NAME: Sonar, Mines vs. Rocks 2 | 3 | SUMMARY: This is the data set used by Gorman and Sejnowski in their study 4 | of the classification of sonar signals using a neural network [1]. The 5 | task is to train a network to discriminate between sonar signals bounced 6 | off a metal cylinder and those bounced off a roughly cylindrical rock. 
7 | 8 | SOURCE: The data set was contributed to the benchmark collection by Terry 9 | Sejnowski, now at the Salk Institute and the University of California at 10 | San Deigo. The data set was developed in collaboration with R. Paul 11 | Gorman of Allied-Signal Aerospace Technology Center. 12 | 13 | MAINTAINER: Scott E. Fahlman 14 | 15 | PROBLEM DESCRIPTION: 16 | 17 | The file "sonar.mines" contains 111 patterns obtained by bouncing sonar 18 | signals off a metal cylinder at various angles and under various 19 | conditions. The file "sonar.rocks" contains 97 patterns obtained from 20 | rocks under similar conditions. The transmitted sonar signal is a 21 | frequency-modulated chirp, rising in frequency. The data set contains 22 | signals obtained from a variety of different aspect angles, spanning 90 23 | degrees for the cylinder and 180 degrees for the rock. 24 | 25 | Each pattern is a set of 60 numbers in the range 0.0 to 1.0. Each number 26 | represents the energy within a particular frequency band, integrated over 27 | a certain period of time. The integration aperture for higher frequencies 28 | occur later in time, since these frequencies are transmitted later during 29 | the chirp. 30 | 31 | The label associated with each record contains the letter "R" if the object 32 | is a rock and "M" if it is a mine (metal cylinder). The numbers in the 33 | labels are in increasing order of aspect angle, but they do not encode the 34 | angle directly. 35 | 36 | METHODOLOGY: 37 | 38 | This data set can be used in a number of different ways to test learning 39 | speed, quality of ultimate learning, ability to generalize, or combinations 40 | of these factors. 41 | 42 | In [1], Gorman and Sejnowski report two series of experiments: an 43 | "aspect-angle independent" series, in which the whole data set is used 44 | without controlling for aspect angle, and an "aspect-angle dependent" 45 | series in which the training and testing sets were carefully controlled to 46 | ensure that each set contained cases from each aspect angle in 47 | appropriate proportions. 48 | 49 | For the aspect-angle independent experiments the combined set of 208 cases 50 | is divided randomly into 13 disjoint sets with 16 cases in each. For each 51 | experiment, 12 of these sets are used as training data, while the 13th is 52 | reserved for testing. The experiment is repeated 13 times so that every 53 | case appears once as part of a test set. The reported performance is an 54 | average over the entire set of 13 different test sets, each run 10 times. 55 | 56 | It was observed that this random division of the sample set led to rather 57 | uneven performance. A few of the splits gave poor results, presumably 58 | because the test set contains some samples from aspect angles that are 59 | under-represented in the corresponding training set. This motivated Gorman 60 | and Sejnowski to devise a different set of experiments in which an attempt 61 | was made to balance the training and test sets so that each would have a 62 | representative number of samples from all aspect angles. Since detailed 63 | aspect angle information was not present in the data base of samples, the 64 | 208 samples were first divided into clusters, using a 60-dimensional 65 | Euclidian metric; each of these clusters was then divided between the 66 | 104-member training set and the 104-member test set. 67 | 68 | The actual training and testing samples used for the "aspect angle 69 | dependent" experiments are marked in the data files. 
The reported 70 | performance is an average over 10 runs with this single division of the 71 | data set. 72 | 73 | A standard back-propagation network was used for all experiments. The 74 | network had 60 inputs and 2 output units, one indicating a cylinder and the 75 | other a rock. Experiments were run with no hidden units (direct 76 | connections from each input to each output) and with a single hidden layer 77 | with 2, 3, 6, 12, or 24 units. Each network was trained by 300 epochs over 78 | the entire training set. 79 | 80 | The weight-update formulas used in this study were slightly different from 81 | the standard form. A learning rate of 2.0 and momentum of 0.0 was used. 82 | Errors less than 0.2 were treated as zero. Initial weights were uniform 83 | random values in the range -0.3 to +0.3. 84 | 85 | RESULTS: 86 | 87 | For the angle independent experiments, Gorman and Sejnowski report the 88 | following results for networks with different numbers of hidden units: 89 | 90 | Hidden % Right on Std. % Right on Std. 91 | Units Training set Dev. Test Set Dev. 92 | ------ ------------ ---- ---------- ---- 93 | 0 89.4 2.1 77.1 8.3 94 | 2 96.5 0.7 81.9 6.2 95 | 3 98.8 0.4 82.0 7.3 96 | 6 99.7 0.2 83.5 5.6 97 | 12 99.8 0.1 84.7 5.7 98 | 24 99.8 0.1 84.5 5.7 99 | 100 | For the angle-dependent experiments Gorman and Sejnowski report the 101 | following results: 102 | 103 | Hidden % Right on Std. % Right on Std. 104 | Units Training set Dev. Test Set Dev. 105 | ------ ------------ ---- ---------- ---- 106 | 0 79.3 3.4 73.1 4.8 107 | 2 96.2 2.2 85.7 6.3 108 | 3 98.1 1.5 87.6 3.0 109 | 6 99.4 0.9 89.3 2.4 110 | 12 99.8 0.6 90.4 1.8 111 | 24 100.0 0.0 89.2 1.4 112 | 113 | Not surprisingly, the network's performance on the test set was somewhat 114 | better when the aspect angles in the training and test sets were balanced. 115 | 116 | Gorman and Sejnowski further report that a nearest neighbor classifier on 117 | the same data gave an 82.7% probability of correct classification. 118 | 119 | Three trained human subjects were each tested on 100 signals, chosen at 120 | random from the set of 208 returns used to create this data set. Their 121 | responses ranged between 88% and 97% correct. However, they may have been 122 | using information from the raw sonar signal that is not preserved in the 123 | processed data sets presented here. 124 | 125 | REFERENCES: 126 | 127 | 1. Gorman, R. P., and Sejnowski, T. J. (1988). "Analysis of Hidden Units 128 | in a Layered Network Trained to Classify Sonar Targets" in Neural Networks, 129 | Vol. 1, pp. 75-89. 130 | -------------------------------------------------------------------------------- /docker-compose.yml: -------------------------------------------------------------------------------- 1 | version: '3' 2 | 3 | services: 4 | livebook: 5 | image: livebook/livebook 6 | ports: 7 | - '8080:8080' 8 | working_dir: '/programming_machine_learning' 9 | volumes: 10 | - ./:/programming_machine_learning 11 | environment: 12 | - LIVEBOOK_HOME=/programming_machine_learning 13 | - LIVEBOOK_DEFAULT_RUNTIME=standalone 14 | -------------------------------------------------------------------------------- /images/livebooks_home.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nickgnd/programming-machine-learning-livebooks/fbd0d9242bcc08621171006cf53637d417828a06/images/livebooks_home.png --------------------------------------------------------------------------------