├── 1004588303.png
├── 1351541081.png
├── 2218687897.png
├── 2587122783.png
├── 2590211320.png
├── 2797619445.png
├── 3923620403.png
├── 4133943752.png
├── 4167443430.png
├── 517000181.png
├── 759701435.png
├── LICENCE.md
├── MLJLogo2.svg
├── Manifest.toml
├── Project.toml
├── README.md
├── Untitled.ipynb
├── apt.txt
├── assets
│   ├── scitypes.drawio
│   └── scitypes.png
├── data
│   ├── horse.csv
│   ├── house.csv
│   ├── small.csv
│   └── src
│       ├── Manifest.toml
│       ├── Project.toml
│       ├── ames.csv
│       ├── convert_ames.jl
│       ├── convert_ames
│       │   ├── Manifest.toml
│       │   └── Project.toml
│       ├── generate_horse.jl
│       ├── get_king_county.jl
│       └── reduced_ames.csv
├── environment.yml
├── exercise_6ci.png
├── exercise_7c.png
├── exercise_7c_2.png
├── exercise_7c_3.png
├── exercise_8c.png
├── gamma_sampler.png
├── iris_learning_curve.png
├── learning_curve.png
├── learning_curve2.png
├── methods.md
├── outline.md
├── setup.jl
├── stacking.png
├── tuning.png
├── tutorials.ipynb
├── tutorials.jl
├── tutorials.md
├── vecstack.png
├── wow.ipynb
├── wow.jl
└── wow.md
/1004588303.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/1004588303.png
--------------------------------------------------------------------------------
/1351541081.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/1351541081.png
--------------------------------------------------------------------------------
/2218687897.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/2218687897.png
--------------------------------------------------------------------------------
/2587122783.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/2587122783.png
--------------------------------------------------------------------------------
/2590211320.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/2590211320.png
--------------------------------------------------------------------------------
/2797619445.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/2797619445.png
--------------------------------------------------------------------------------
/3923620403.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/3923620403.png
--------------------------------------------------------------------------------
/4133943752.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/4133943752.png
--------------------------------------------------------------------------------
/4167443430.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/4167443430.png
--------------------------------------------------------------------------------
/517000181.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/517000181.png
--------------------------------------------------------------------------------
/759701435.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/759701435.png
--------------------------------------------------------------------------------
/LICENCE.md:
--------------------------------------------------------------------------------
1 | The MLJ.jl package is licensed under the MIT "Expat" License:
2 |
3 | > Copyright (c) 2020: Anthony Blaom
4 |
5 | > Permission is hereby granted, free of charge, to any person obtaining a copy
6 | > of this software and associated documentation files (the "Software"), to deal
7 | > in the Software without restriction, including without limitation the rights
8 | > to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | > copies of the Software, and to permit persons to whom the Software is
10 | > furnished to do so, subject to the following conditions:
11 | >
12 | > The above copyright notice and this permission notice shall be included in all
13 | > copies or substantial portions of the Software.
14 | >
15 | > THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | > IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | > FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | > AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | > LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | > OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | > SOFTWARE.
22 | >
23 |
--------------------------------------------------------------------------------
/MLJLogo2.svg:
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
/Project.toml:
--------------------------------------------------------------------------------
1 | [deps]
2 | CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
3 | CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597"
4 | ComputationalResources = "ed09eef8-17a6-5b46-8889-db040fac31e3"
5 | DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
6 | DecisionTree = "7806a523-6efd-50cb-b5f6-3fa6f1930dbb"
7 | Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
8 | EvoTrees = "f6006082-12f8-11e9-0c9c-0d5d367ab1e5"
9 | Literate = "98b081ad-f1c9-55d3-8b20-4c87d4299306"
10 | MLJ = "add582a8-e3ab-11e8-2d5e-e98b27df1bc7"
11 | MLJBase = "a7f614a8-145f-11e9-1d2a-a57a1082229d"
12 | MLJClusteringInterface = "d354fa79-ed1c-40d4-88ef-b8c7bd1568af"
13 | MLJDecisionTreeInterface = "c6f25543-311c-4c74-83dc-3ea6d1015661"
14 | MLJFlux = "094fc8d1-fd35-5302-93ea-dabda2abf845"
15 | MLJLinearModels = "6ee0df7b-362f-4a72-a706-9e79364fb692"
16 | MLJModels = "d491faf4-2d78-11e9-2867-c94bc002c0b7"
17 | MLJMultivariateStatsInterface = "1b6a4a23-ba22-4f51-9698-8599985d3728"
18 | MLJScikitLearnInterface = "5ae90465-5518-4432-b9d2-8a1def2f0cab"
19 | NearestNeighborModels = "636a865e-7cf4-491e-846c-de09b730eb36"
20 | NearestNeighbors = "b8a86587-4115-5ab1-83bc-aa920d37bbce"
21 | Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
22 | Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
23 | ScientificTypes = "321657f4-b219-11e9-178b-2701a2544e81"
24 | StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
25 | Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"
26 | UnicodePlots = "b8865327-cd53-5732-bb35-84acbb429228"
27 |
28 | [compat]
29 | julia = ">=1.6, <1.7"
30 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Machine Learning in Julia using MLJ, JuliaCon2020
2 |
3 | **Now updated for MLJ version 0.16 and Julia 1.6**
4 |
5 | But the binder notebook will not work until [this binder issue](https://github.com/jupyterhub/binderhub/issues/1424) is resolved.
6 |
7 | Interactive tutorials for a workshop introducing the machine learning
8 | toolbox [MLJ](https://alan-turing-institute.github.io/MLJ.jl/stable/) (v0.14.4)
9 |
10 |
11 |
12 |
13 |
14 | These tutorials were prepared for use in a 3 1/2 hour online workshop
15 | at JuliaCon2020, recorded
16 | [here](https://www.youtube.com/watch?time_continue=27&v=qSWbCn170HU&feature=emb_title). Their
17 | main aim is to introduce the
18 | [MLJ](https://alan-turing-institute.github.io/MLJ.jl/stable/) machine
19 | learning toolbox to data scientists.
20 |
21 | Differences from the original resources are minor (main difference:
22 | `@load` now returns a type instead of an instance). However, if you
23 | wish to access resources precisely matching those used in the video,
24 | switch to the `JuliaCon2020` branch by clicking
25 | [here](https://github.com/ablaom/MachineLearningInJulia2020/tree/for-MLJ-version-0.16).
26 |
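For readers unfamiliar with the change, here is a minimal illustration of the
new loading pattern (not taken from the tutorials themselves; it assumes MLJ
0.16 conventions and uses the DecisionTree.jl model already listed in this
repo's Project.toml):

```julia
using MLJ

# In MLJ 0.16 and later, @load returns the model *type* ...
Tree = @load DecisionTreeClassifier pkg=DecisionTree

# ... which you instantiate yourself. In the version used in the recorded
# workshop, @load returned a ready-to-use model instance instead.
tree = Tree()
```
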
27 | **Future revisions** of these tutorials will appear [here](https://github.com/ablaom/MLJTutorial.jl).
28 |
29 |
30 | ### [Options for running the tutorials](#options-for-running-the-tutorials)
31 |
32 | ### [Non-interactive version](tutorials.md)
33 |
34 | ### Topics covered
35 |
36 | #### Basic
37 |
38 | - Part 1 - **Data Representation**
39 |
40 | - Part 2 - **Selecting, Training and Evaluating Models**
41 |
42 | - Part 3 - **Transformers and Pipelines**
43 |
44 | #### Advanced
45 |
46 | - Part 4 - **Tuning hyper-parameters**
47 |
48 | - Part 5 - **Advanced model composition** (as time permits)
49 |
50 | The tutorials include links to external resources and exercises with
51 | solutions.
52 |
53 |
54 | ## Options for running the tutorials
55 |
56 | ### 1. Plug-and-play
57 |
58 | Only recommended for users with little Julia experience or users having
59 | problems with the other options.
60 |
61 | Use this option if you have neither run a Julia/Jupyter notebook on your
62 | local machine before, nor used a Julia IDE to run a Julia script.
63 |
64 |
65 | #### Pros
66 |
67 | One
68 | [click](https://mybinder.org/v2/gh/ablaom/MachineLearningInJulia2020/master?filepath=tutorials.ipynb). No
69 | need to install anything on your local machine.
70 |
71 |
72 | #### Cons
73 |
74 | - The (automatic) setup can take a little while, sometimes over 15
75 | minutes (but you do get a static version of the notebook while it
76 | loads).
77 |
78 | - **You will have to start over** if:
79 |
80 | - The notebook drops your connection for some reason.
81 | - You are **inactive for ten minutes**.
82 |
83 |
84 | #### Instructions
85 |
86 | Click this button: [](https://mybinder.org/v2/gh/ablaom/MachineLearningInJulia2020/master?filepath=tutorials.ipynb)
87 |
88 |
89 | ### 2. Clone the repo and choose your preferred interface
90 |
91 | Assumes that you have a working installation of
92 | [Julia](https://julialang.org/downloads/) 1.3 or higher and that
93 | either:
94 |
95 | - You can run Julia/Jupyter notebooks on your local machine without problems; or
96 |
97 | - You are comfortable running Julia scripts from an IDE, such as [Juno](https://junolab.org) or [Emacs](https://github.com/JuliaEditorSupport/julia-emacs) (see [here](https://julialang.org) for a complete list).
98 |
99 |
100 | #### Pros
101 |
102 | A more stable option.
103 |
104 | #### Cons
105 |
106 | You need to meet the requirements listed above.
107 |
108 |
109 | #### Instructions
110 |
111 | - Clone [this repository](https://github.com/ablaom/MachineLearningInJulia2020)
112 |
113 | - Change to your local repo directory "MachineLearningInJulia2020/"
114 |
115 | - Either run the Jupyter notebook called "tutorials.ipynb" from that
116 | directory (corresponding to [this file](tutorials.ipynb) on GitHub)
117 | or open "tutorials.jl" from that directory in your favourite IDE
118 | (corresponding to [this file](tutorials.jl) on GitHub). You cannot
119 | download these files individually - you need the whole directory.
120 |
121 | - **Immediately** evaluate the first two lines of code to activate the
122 |   package environment and pre-load the packages, as this can take a
123 |   few minutes (a sketch of what such code does follows this list).
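
For orientation, here is a hypothetical sketch of the kind of
environment-activation code meant above (the authoritative lines are the
opening cells of "tutorials.ipynb" / "tutorials.jl"; the path passed to
`Pkg.activate` here is an assumption):

```julia
using Pkg
Pkg.activate(@__DIR__)  # activate the Project.toml / Manifest.toml shipped with the repo
Pkg.instantiate()       # download and precompile the pinned packages (this is the slow part)
```
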
124 |
125 |
126 | ## More about the tutorials
127 |
128 | - The tutorials focus on the *machine learning* part of the data
129 | science workflow, and less on exploratory data analysis and other
130 | conventional "data analytics" methodology
131 |
132 | - Here "machine learning" is meant in a broad sense, and is not
133 | restricted to so-called *deep learning* (neural networks)
134 |
135 | - The tutorials are crafted to rapidly familiarize the user with what
136 | MLJ can do and how to do it, and are not a substitute for a course
137 | on machine learning fundamentals. Examples do not necessarily
138 | represent best practice or the best solution to a problem.
139 |
140 | ## Binder notebook for stacking demo used in video
141 |
142 | [](https://mybinder.org/v2/gh/ablaom/MachineLearningInJulia2020/386ce06766dc1d9d9a0197ec57738b732c1c5d23?filepath=wow.ipynb)
143 |
144 |
--------------------------------------------------------------------------------
/Untitled.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 4,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "using DataFrames, CategoricalArrays"
10 | ]
11 | },
12 | {
13 | "cell_type": "code",
14 | "execution_count": 2,
15 | "metadata": {},
16 | "outputs": [
17 | {
18 | "data": {
19 | "text/plain": [
20 | "12-element Array{Float64,1}:\n",
21 | " 1.0\n",
22 | " 2.0\n",
23 | " 3.0\n",
24 | " 4.0\n",
25 | " 5.0\n",
26 | " 6.0\n",
27 | " 7.0\n",
28 | " 8.0\n",
29 | " 9.0\n",
30 | " 10.0\n",
31 | " 11.0\n",
32 | " 12.0"
33 | ]
34 | },
35 | "execution_count": 2,
36 | "metadata": {},
37 | "output_type": "execute_result"
38 | }
39 | ],
40 | "source": [
41 | "time = float.(1:12 )"
42 | ]
43 | },
44 | {
45 | "cell_type": "code",
46 | "execution_count": 5,
47 | "metadata": {},
48 | "outputs": [
49 | {
50 | "data": {
51 | "text/plain": [
52 | "4-element CategoricalArray{String,1,UInt32}:\n",
53 | " \"kitchen\"\n",
54 | " \"bathroom\"\n",
55 | " \"bedroom_1\"\n",
56 | " \"living_room\""
57 | ]
58 | },
59 | "execution_count": 5,
60 | "metadata": {},
61 | "output_type": "execute_result"
62 | }
63 | ],
64 | "source": [
65 | "room = categorical([\"kitchen\", \"bathroom\", \"bedroom_1\", \"living_room\"])"
66 | ]
67 | },
68 | {
69 | "cell_type": "code",
70 | "execution_count": 6,
71 | "metadata": {},
72 | "outputs": [
73 | {
74 | "data": {
75 | "text/plain": [
76 | "12-element CategoricalArray{String,1,UInt32}:\n",
77 | " \"kitchen\"\n",
78 | " \"bathroom\"\n",
79 | " \"bedroom_1\"\n",
80 | " \"living_room\"\n",
81 | " \"kitchen\"\n",
82 | " \"bathroom\"\n",
83 | " \"bedroom_1\"\n",
84 | " \"living_room\"\n",
85 | " \"kitchen\"\n",
86 | " \"bathroom\"\n",
87 | " \"bedroom_1\"\n",
88 | " \"living_room\""
89 | ]
90 | },
91 | "execution_count": 6,
92 | "metadata": {},
93 | "output_type": "execute_result"
94 | }
95 | ],
96 | "source": [
97 | "room = vcat(room, room, room)"
98 | ]
99 | },
100 | {
101 | "cell_type": "code",
102 | "execution_count": 7,
103 | "metadata": {},
104 | "outputs": [
105 | {
106 | "data": {
107 | "text/plain": [
108 | "1×12 Array{Int64,2}:\n",
109 | " 5 5 5 5 6 6 6 6 7 7 7 7"
110 | ]
111 | },
112 | "execution_count": 7,
113 | "metadata": {},
114 | "output_type": "execute_result"
115 | }
116 | ],
117 | "source": [
118 | "time = [5 5 5 5 6 6 6 6 7 7 7 7]"
119 | ]
120 | },
121 | {
122 | "cell_type": "code",
123 | "execution_count": 11,
124 | "metadata": {},
125 | "outputs": [
126 | {
127 | "data": {
128 | "text/plain": [
129 | "12-element Array{Int64,1}:\n",
130 | " 5\n",
131 | " 5\n",
132 | " 5\n",
133 | " 5\n",
134 | " 6\n",
135 | " 6\n",
136 | " 6\n",
137 | " 6\n",
138 | " 7\n",
139 | " 7\n",
140 | " 7\n",
141 | " 7"
142 | ]
143 | },
144 | "execution_count": 11,
145 | "metadata": {},
146 | "output_type": "execute_result"
147 | }
148 | ],
149 | "source": [
150 | "time =reshape(time, (12,))"
151 | ]
152 | },
153 | {
154 | "cell_type": "code",
155 | "execution_count": 12,
156 | "metadata": {},
157 | "outputs": [
158 | {
159 | "data": {
160 | "text/html": [
161 | " | time | room |
---|
| Int64 | Cat… |
---|
12 rows × 2 columns
1 | 5 | kitchen |
---|
2 | 5 | bathroom |
---|
3 | 5 | bedroom_1 |
---|
4 | 5 | living_room |
---|
5 | 6 | kitchen |
---|
6 | 6 | bathroom |
---|
7 | 6 | bedroom_1 |
---|
8 | 6 | living_room |
---|
9 | 7 | kitchen |
---|
10 | 7 | bathroom |
---|
11 | 7 | bedroom_1 |
---|
12 | 7 | living_room |
---|
"
162 | ],
163 | "text/latex": [
164 | "\\begin{tabular}{r|cc}\n",
165 | "\t& time & room\\\\\n",
166 | "\t\\hline\n",
167 | "\t& Int64 & Cat…\\\\\n",
168 | "\t\\hline\n",
169 | "\t1 & 5 & kitchen \\\\\n",
170 | "\t2 & 5 & bathroom \\\\\n",
171 | "\t3 & 5 & bedroom\\_1 \\\\\n",
172 | "\t4 & 5 & living\\_room \\\\\n",
173 | "\t5 & 6 & kitchen \\\\\n",
174 | "\t6 & 6 & bathroom \\\\\n",
175 | "\t7 & 6 & bedroom\\_1 \\\\\n",
176 | "\t8 & 6 & living\\_room \\\\\n",
177 | "\t9 & 7 & kitchen \\\\\n",
178 | "\t10 & 7 & bathroom \\\\\n",
179 | "\t11 & 7 & bedroom\\_1 \\\\\n",
180 | "\t12 & 7 & living\\_room \\\\\n",
181 | "\\end{tabular}\n"
182 | ],
183 | "text/plain": [
184 | "12×2 DataFrame\n",
185 | "│ Row │ time │ room │\n",
186 | "│ │ \u001b[90mInt64\u001b[39m │ \u001b[90mCat…\u001b[39m │\n",
187 | "├─────┼───────┼─────────────┤\n",
188 | "│ 1 │ 5 │ kitchen │\n",
189 | "│ 2 │ 5 │ bathroom │\n",
190 | "│ 3 │ 5 │ bedroom_1 │\n",
191 | "│ 4 │ 5 │ living_room │\n",
192 | "│ 5 │ 6 │ kitchen │\n",
193 | "│ 6 │ 6 │ bathroom │\n",
194 | "│ 7 │ 6 │ bedroom_1 │\n",
195 | "│ 8 │ 6 │ living_room │\n",
196 | "│ 9 │ 7 │ kitchen │\n",
197 | "│ 10 │ 7 │ bathroom │\n",
198 | "│ 11 │ 7 │ bedroom_1 │\n",
199 | "│ 12 │ 7 │ living_room │"
200 | ]
201 | },
202 | "execution_count": 12,
203 | "metadata": {},
204 | "output_type": "execute_result"
205 | }
206 | ],
207 | "source": [
208 | "X = DataFrame(time=time, room=room)"
209 | ]
210 | },
211 | {
212 | "cell_type": "code",
213 | "execution_count": 13,
214 | "metadata": {},
215 | "outputs": [
216 | {
217 | "name": "stderr",
218 | "output_type": "stream",
219 | "text": [
220 | "┌ Info: Precompiling MLJ [add582a8-e3ab-11e8-2d5e-e98b27df1bc7]\n",
221 | "└ @ Base loading.jl:1260\n",
222 | "[ Info: Model metadata loaded from registry. \n"
223 | ]
224 | }
225 | ],
226 | "source": [
227 | "using MLJ"
228 | ]
229 | },
230 | {
231 | "cell_type": "code",
232 | "execution_count": 31,
233 | "metadata": {
234 | "scrolled": true
235 | },
236 | "outputs": [
237 | {
238 | "name": "stdout",
239 | "output_type": "stream",
240 | "text": [
241 | "\n",
242 | "\n",
243 | "┌\u001b[0m───────\u001b[0m┬\u001b[0m─────────────────────────────────\u001b[0m┐\u001b[0m\n",
244 | "│\u001b[0m\u001b[1m time \u001b[0m│\u001b[0m\u001b[1m room \u001b[0m│\u001b[0m\n",
245 | "│\u001b[0m\u001b[90m Int64 \u001b[0m│\u001b[0m\u001b[90m CategoricalValue{String,UInt32} \u001b[0m│\u001b[0m\n",
246 | "├\u001b[0m───────\u001b[0m┼\u001b[0m─────────────────────────────────\u001b[0m┤\u001b[0m\n",
247 | "│\u001b[0m 5 \u001b[0m│\u001b[0m kitchen \u001b[0m│\u001b[0m\n",
248 | "│\u001b[0m 5 \u001b[0m│\u001b[0m bathroom \u001b[0m│\u001b[0m\n",
249 | "│\u001b[0m 5 \u001b[0m│\u001b[0m bedroom_1 \u001b[0m│\u001b[0m\n",
250 | "│\u001b[0m 5 \u001b[0m│\u001b[0m living_room \u001b[0m│\u001b[0m\n",
251 | "│\u001b[0m 6 \u001b[0m│\u001b[0m kitchen \u001b[0m│\u001b[0m\n",
252 | "│\u001b[0m 6 \u001b[0m│\u001b[0m bathroom \u001b[0m│\u001b[0m\n",
253 | "│\u001b[0m 6 \u001b[0m│\u001b[0m bedroom_1 \u001b[0m│\u001b[0m\n",
254 | "│\u001b[0m 6 \u001b[0m│\u001b[0m living_room \u001b[0m│\u001b[0m\n",
255 | "│\u001b[0m 7 \u001b[0m│\u001b[0m kitchen \u001b[0m│\u001b[0m\n",
256 | "│\u001b[0m 7 \u001b[0m│\u001b[0m bathroom \u001b[0m│\u001b[0m\n",
257 | "│\u001b[0m 7 \u001b[0m│\u001b[0m bedroom_1 \u001b[0m│\u001b[0m\n",
258 | "│\u001b[0m 7 \u001b[0m│\u001b[0m living_room \u001b[0m│\u001b[0m\n",
259 | "└\u001b[0m───────\u001b[0m┴\u001b[0m─────────────────────────────────\u001b[0m┘\u001b[0m\n"
260 | ]
261 | }
262 | ],
263 | "source": [
264 | "println()\n",
265 | "println()\n",
266 | "MLJ.MLJBase.PrettyTables.pretty_table(X)"
267 | ]
268 | },
269 | {
270 | "cell_type": "code",
271 | "execution_count": 15,
272 | "metadata": {},
273 | "outputs": [
274 | {
275 | "ename": "UndefVarError",
276 | "evalue": "UndefVarError: y not defined",
277 | "output_type": "error",
278 | "traceback": [
279 | "UndefVarError: y not defined",
280 | "",
281 | "Stacktrace:",
282 | " [1] top-level scope at In[15]:1"
283 | ]
284 | }
285 | ],
286 | "source": [
287 | "pretty(y)"
288 | ]
289 | },
290 | {
291 | "cell_type": "code",
292 | "execution_count": 16,
293 | "metadata": {},
294 | "outputs": [
295 | {
296 | "data": {
297 | "text/plain": [
298 | "12-element Array{Float64,1}:\n",
299 | " 18.490955359526012\n",
300 | " 18.304060288673128\n",
301 | " 18.25954037947709\n",
302 | " 17.419481829632957\n",
303 | " 16.589235329028348\n",
304 | " 20.66317138311018\n",
305 | " 18.945861996750985\n",
306 | " 20.158722013970333\n",
307 | " 20.361567584624957\n",
308 | " 19.85771377870428\n",
309 | " 16.180836445944205\n",
310 | " 17.330000922162835"
311 | ]
312 | },
313 | "execution_count": 16,
314 | "metadata": {},
315 | "output_type": "execute_result"
316 | }
317 | ],
318 | "source": [
319 | "temp = 16 .+ 5*rand(12)"
320 | ]
321 | },
322 | {
323 | "cell_type": "code",
324 | "execution_count": 19,
325 | "metadata": {},
326 | "outputs": [
327 | {
328 | "data": {
329 | "text/plain": [
330 | "12-element Array{Float64,1}:\n",
331 | " 18.5\n",
332 | " 18.3\n",
333 | " 18.3\n",
334 | " 17.4\n",
335 | " 16.6\n",
336 | " 20.7\n",
337 | " 18.9\n",
338 | " 20.2\n",
339 | " 20.4\n",
340 | " 19.9\n",
341 | " 16.2\n",
342 | " 17.3"
343 | ]
344 | },
345 | "execution_count": 19,
346 | "metadata": {},
347 | "output_type": "execute_result"
348 | }
349 | ],
350 | "source": [
351 | "temperature = map(temp) do x round(x, sigdigits=3) end"
352 | ]
353 | },
354 | {
355 | "cell_type": "code",
356 | "execution_count": 20,
357 | "metadata": {},
358 | "outputs": [
359 | {
360 | "ename": "UndefVarError",
361 | "evalue": "UndefVarError: y not defined",
362 | "output_type": "error",
363 | "traceback": [
364 | "UndefVarError: y not defined",
365 | "",
366 | "Stacktrace:",
367 | " [1] top-level scope at In[20]:1"
368 | ]
369 | }
370 | ],
371 | "source": [
372 | "y = DataFrame(y)"
373 | ]
374 | },
375 | {
376 | "cell_type": "code",
377 | "execution_count": 23,
378 | "metadata": {},
379 | "outputs": [
380 | {
381 | "data": {
382 | "text/html": [
383 | " | temperature |
---|
| Float64 |
---|
12 rows × 1 columns
1 | 18.5 |
---|
2 | 18.3 |
---|
3 | 18.3 |
---|
4 | 17.4 |
---|
5 | 16.6 |
---|
6 | 20.7 |
---|
7 | 18.9 |
---|
8 | 20.2 |
---|
9 | 20.4 |
---|
10 | 19.9 |
---|
11 | 16.2 |
---|
12 | 17.3 |
---|
"
384 | ],
385 | "text/latex": [
386 | "\\begin{tabular}{r|c}\n",
387 | "\t& temperature\\\\\n",
388 | "\t\\hline\n",
389 | "\t& Float64\\\\\n",
390 | "\t\\hline\n",
391 | "\t1 & 18.5 \\\\\n",
392 | "\t2 & 18.3 \\\\\n",
393 | "\t3 & 18.3 \\\\\n",
394 | "\t4 & 17.4 \\\\\n",
395 | "\t5 & 16.6 \\\\\n",
396 | "\t6 & 20.7 \\\\\n",
397 | "\t7 & 18.9 \\\\\n",
398 | "\t8 & 20.2 \\\\\n",
399 | "\t9 & 20.4 \\\\\n",
400 | "\t10 & 19.9 \\\\\n",
401 | "\t11 & 16.2 \\\\\n",
402 | "\t12 & 17.3 \\\\\n",
403 | "\\end{tabular}\n"
404 | ],
405 | "text/plain": [
406 | "12×1 DataFrame\n",
407 | "│ Row │ temperature │\n",
408 | "│ │ \u001b[90mFloat64\u001b[39m │\n",
409 | "├─────┼─────────────┤\n",
410 | "│ 1 │ 18.5 │\n",
411 | "│ 2 │ 18.3 │\n",
412 | "│ 3 │ 18.3 │\n",
413 | "│ 4 │ 17.4 │\n",
414 | "│ 5 │ 16.6 │\n",
415 | "│ 6 │ 20.7 │\n",
416 | "│ 7 │ 18.9 │\n",
417 | "│ 8 │ 20.2 │\n",
418 | "│ 9 │ 20.4 │\n",
419 | "│ 10 │ 19.9 │\n",
420 | "│ 11 │ 16.2 │\n",
421 | "│ 12 │ 17.3 │"
422 | ]
423 | },
424 | "execution_count": 23,
425 | "metadata": {},
426 | "output_type": "execute_result"
427 | }
428 | ],
429 | "source": [
430 | "y=DataFrame(temperature=temperature)"
431 | ]
432 | },
433 | {
434 | "cell_type": "code",
435 | "execution_count": 32,
436 | "metadata": {},
437 | "outputs": [
438 | {
439 | "name": "stdout",
440 | "output_type": "stream",
441 | "text": [
442 | "\n",
443 | "\n",
444 | "┌\u001b[0m─────────────\u001b[0m┐\u001b[0m\n",
445 | "│\u001b[0m\u001b[1m temperature \u001b[0m│\u001b[0m\n",
446 | "│\u001b[0m\u001b[90m Float64 \u001b[0m│\u001b[0m\n",
447 | "├\u001b[0m─────────────\u001b[0m┤\u001b[0m\n",
448 | "│\u001b[0m 18.5 \u001b[0m│\u001b[0m\n",
449 | "│\u001b[0m 18.3 \u001b[0m│\u001b[0m\n",
450 | "│\u001b[0m 18.3 \u001b[0m│\u001b[0m\n",
451 | "│\u001b[0m 17.4 \u001b[0m│\u001b[0m\n",
452 | "│\u001b[0m 16.6 \u001b[0m│\u001b[0m\n",
453 | "│\u001b[0m 20.7 \u001b[0m│\u001b[0m\n",
454 | "│\u001b[0m 18.9 \u001b[0m│\u001b[0m\n",
455 | "│\u001b[0m 20.2 \u001b[0m│\u001b[0m\n",
456 | "│\u001b[0m 20.4 \u001b[0m│\u001b[0m\n",
457 | "│\u001b[0m 19.9 \u001b[0m│\u001b[0m\n",
458 | "│\u001b[0m 16.2 \u001b[0m│\u001b[0m\n",
459 | "│\u001b[0m 17.3 \u001b[0m│\u001b[0m\n",
460 | "└\u001b[0m─────────────\u001b[0m┘\u001b[0m\n"
461 | ]
462 | }
463 | ],
464 | "source": [
465 | "println()\n",
466 | "println()\n",
467 | "MLJ.MLJBase.PrettyTables.pretty_table(y)"
468 | ]
469 | },
470 | {
471 | "cell_type": "code",
472 | "execution_count": 25,
473 | "metadata": {},
474 | "outputs": [
475 | {
476 | "data": {
477 | "text/html": [
478 | " | temperature |
---|
| Float64 |
---|
12 rows × 1 columns
1 | 18.5 |
---|
2 | 18.3 |
---|
3 | 18.3 |
---|
4 | 17.4 |
---|
5 | 16.6 |
---|
6 | 20.7 |
---|
7 | 18.9 |
---|
8 | 20.2 |
---|
9 | 20.4 |
---|
10 | 19.9 |
---|
11 | 16.2 |
---|
12 | 17.3 |
---|
"
479 | ],
480 | "text/latex": [
481 | "\\begin{tabular}{r|c}\n",
482 | "\t& temperature\\\\\n",
483 | "\t\\hline\n",
484 | "\t& Float64\\\\\n",
485 | "\t\\hline\n",
486 | "\t1 & 18.5 \\\\\n",
487 | "\t2 & 18.3 \\\\\n",
488 | "\t3 & 18.3 \\\\\n",
489 | "\t4 & 17.4 \\\\\n",
490 | "\t5 & 16.6 \\\\\n",
491 | "\t6 & 20.7 \\\\\n",
492 | "\t7 & 18.9 \\\\\n",
493 | "\t8 & 20.2 \\\\\n",
494 | "\t9 & 20.4 \\\\\n",
495 | "\t10 & 19.9 \\\\\n",
496 | "\t11 & 16.2 \\\\\n",
497 | "\t12 & 17.3 \\\\\n",
498 | "\\end{tabular}\n"
499 | ],
500 | "text/plain": [
501 | "12×1 DataFrame\n",
502 | "│ Row │ temperature │\n",
503 | "│ │ \u001b[90mFloat64\u001b[39m │\n",
504 | "├─────┼─────────────┤\n",
505 | "│ 1 │ 18.5 │\n",
506 | "│ 2 │ 18.3 │\n",
507 | "│ 3 │ 18.3 │\n",
508 | "│ 4 │ 17.4 │\n",
509 | "│ 5 │ 16.6 │\n",
510 | "│ 6 │ 20.7 │\n",
511 | "│ 7 │ 18.9 │\n",
512 | "│ 8 │ 20.2 │\n",
513 | "│ 9 │ 20.4 │\n",
514 | "│ 10 │ 19.9 │\n",
515 | "│ 11 │ 16.2 │\n",
516 | "│ 12 │ 17.3 │"
517 | ]
518 | },
519 | "execution_count": 25,
520 | "metadata": {},
521 | "output_type": "execute_result"
522 | }
523 | ],
524 | "source": [
525 | "y"
526 | ]
527 | },
528 | {
529 | "cell_type": "code",
530 | "execution_count": null,
531 | "metadata": {},
532 | "outputs": [],
533 | "source": []
534 | }
535 | ],
536 | "metadata": {
537 | "kernelspec": {
538 | "display_name": "Julia 1.4.2",
539 | "language": "julia",
540 | "name": "julia-1.4"
541 | },
542 | "language_info": {
543 | "file_extension": ".jl",
544 | "mimetype": "application/julia",
545 | "name": "julia",
546 | "version": "1.4.2"
547 | }
548 | },
549 | "nbformat": 4,
550 | "nbformat_minor": 4
551 | }
552 |
--------------------------------------------------------------------------------
/apt.txt:
--------------------------------------------------------------------------------
1 | tzdata
--------------------------------------------------------------------------------
/assets/scitypes.drawio:
--------------------------------------------------------------------------------
1 | 7ZnbbptAEIafhstU5pxcJiROKiWt2lRK07sVjGHbZddZBhvn6btrlmCKGhPJdq2UKzP/zJ5mPjNgW26UV9eSzLM7kQCznElSWe6l5Ti2PQnUh1ZWtXLqTmohlTQxQa1wT5/BiE1YSRMoOoEoBEM674qx4Bxi7GhESrHshs0E6646Jyn0hPuYsL76QBPMzCmcsNVvgKZZs7IdnNWenDTB5iRFRhKx3JDcK8uNpBBYX+VVBEwnr8lLPW76F+/LxiRwHDJg+cUn0Vn2Y/bwNcrOr5/pE41OzGYLXDUHhkSd35hCYiZSwQm7atWLuJQL0JPaypCi5MnamiirHXArxNyE/ATElaksKVEoKcOcGS9UFL/r4R9C35iPG67Lyky9NlaNwVGu6lF+Yz5u+tpha6sZV8xJTHl6CzPsKsrSE80ExynJKdMDboAtAGlMlKOfbJP/QpQyhlcy3EBLZAr4Spz5nuj0byxgSnkNIgd1EBUggRGkiy6exFCevsS1IKgLw8IbuLC9IwLD+a/B8I8LjHreBWGlWclyAoYmRR1igqdSNI6TYl3mcxVgB/OqdaqrVH9OKacIVnjxyQovmynVDutZ65gekggVdqEpUIpfEAkmpFK44JrLGWXsD4kwmnJlxqpsoPSLBUhdUXZuHDlNkjXUy0zt617BoNdcqibXY3v37OjNQPVqtY3X90x3Me21aTbLtlcFRso22lQTtns87F6Rxoay2/uGM/C+4R3XfcM5IjDeZ0MZCkZ4VGA4e2ooH/msbiljLxnYS8Jgay/xDtlL3D2RcUeLQmd0BGMgGM5pF4zTfwyGtycwIpVxHLEYisXLTzlHgoW/Jyw+ywQkJFMSoyrj+IbyNko8bysltntITIJ9tZWSqawxUhQjI299iw39rYzYh2Qk3FuH4Uh5KcpihGMoHMH2p4/DwmH3G803VaWSsPdWPe0wr9N2sJtqugd7llRm++fK2rfxF5V79Rs=
--------------------------------------------------------------------------------
/assets/scitypes.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/assets/scitypes.png
--------------------------------------------------------------------------------
/data/horse.csv:
--------------------------------------------------------------------------------
1 | surgery,age,rectal_temperature,pulse,respiratory_rate,temperature_extremities,mucous_membranes,capillary_refill_time,pain,peristalsis,abdominal_distension,packed_cell_volume,total_protein,outcome,surgical_lesion,cp_data
2 | 2,1,38.5,66,66,3,1,2,5,4,4,45.0,8.4,2,2,2
3 | 1,1,39.2,88,88,3,4,1,3,4,2,50.0,85.0,3,2,2
4 | 2,1,38.3,40,40,1,3,1,3,3,1,33.0,6.7,1,2,1
5 | 1,9,39.1,164,164,4,6,2,2,4,4,48.0,7.2,2,1,1
6 | 2,1,37.3,104,104,3,6,2,3,3,1,74.0,7.4,2,2,2
7 | 2,1,38.1,60,60,2,3,1,2,3,2,44.0,7.5,1,2,2
8 | 1,1,37.9,48,48,1,1,1,3,3,3,37.0,7.0,1,1,2
9 | 1,1,38.1,60,60,3,1,1,3,4,2,44.0,8.3,2,1,2
10 | 2,1,38.1,80,80,3,3,1,4,4,4,38.0,6.2,3,1,2
11 | 2,9,38.3,90,90,1,1,1,5,3,1,40.0,6.2,1,2,1
12 | 1,1,38.1,66,66,3,5,1,3,3,1,44.0,6.0,1,1,1
13 | 2,1,39.1,72,72,2,2,1,2,1,2,50.0,7.8,1,1,2
14 | 1,1,37.2,42,42,2,1,1,3,3,3,44.0,7.0,1,2,2
15 | 2,9,38.0,92,92,1,2,1,1,3,2,37.0,6.1,2,2,1
16 | 1,1,38.2,76,76,3,1,1,3,4,1,46.0,81.0,1,1,2
17 | 1,1,37.6,96,96,3,4,1,5,3,3,45.0,6.8,2,1,2
18 | 1,9,38.1,128,128,3,4,2,4,4,3,53.0,7.8,2,2,1
19 | 2,1,37.5,48,48,3,1,1,3,3,1,44.0,7.5,1,2,2
20 | 1,1,37.6,64,64,1,2,1,2,3,1,40.0,7.0,1,1,1
21 | 2,1,39.4,110,110,4,6,1,3,3,3,55.0,8.7,1,2,2
22 | 1,1,39.9,72,72,1,5,2,5,4,4,46.0,6.1,1,1,2
23 | 2,1,38.4,48,48,1,1,1,1,3,1,49.0,6.8,1,2,2
24 | 1,1,38.6,42,42,2,4,1,2,3,1,48.0,7.2,1,1,2
25 | 1,9,38.3,130,130,3,1,1,2,4,1,50.0,70.0,1,1,2
26 | 1,1,38.1,60,60,3,3,1,3,4,3,51.0,65.0,1,1,2
27 | 2,1,37.8,60,60,3,1,1,3,3,1,44.0,7.5,1,2,2
28 | 1,1,38.3,72,72,4,3,2,3,3,3,43.0,7.0,1,1,1
29 | 1,1,37.8,48,48,3,1,1,3,3,2,37.0,5.5,1,2,1
30 | 1,1,38.1,60,60,3,1,1,3,3,1,44.0,7.5,2,2,2
31 | 2,1,37.7,48,48,2,1,1,1,1,1,45.0,76.0,1,2,2
32 | 2,1,37.7,96,96,3,4,2,5,4,4,66.0,7.5,2,1,2
33 | 2,1,37.2,108,108,3,4,2,2,4,2,52.0,8.2,3,1,1
34 | 1,1,37.2,60,60,2,1,1,3,3,3,43.0,6.6,1,1,2
35 | 1,1,38.2,64,64,1,1,1,3,1,1,49.0,8.6,1,1,1
36 | 1,1,38.1,100,100,3,4,2,5,4,4,52.0,6.6,1,1,2
37 | 2,1,38.1,104,104,4,3,2,4,4,3,73.0,8.4,3,1,2
38 | 2,1,38.3,112,112,3,5,2,3,3,1,51.0,6.0,3,2,1
39 | 1,1,37.8,72,72,3,1,1,5,3,1,56.0,80.0,1,1,2
40 | 2,1,38.6,52,52,1,1,1,3,3,2,32.0,6.6,1,2,1
41 | 1,9,39.2,146,146,3,1,1,3,3,1,44.0,7.5,2,1,2
42 | 1,1,38.1,88,88,3,6,2,5,3,3,63.0,6.5,2,1,2
43 | 2,9,39.0,150,150,3,1,1,3,3,1,47.0,8.5,1,1,1
44 | 2,1,38.0,60,60,3,3,1,3,3,1,47.0,7.0,1,2,2
45 | 1,1,38.1,120,120,3,4,1,4,4,4,52.0,67.0,3,1,2
46 | 1,1,35.4,140,140,3,4,2,4,4,1,57.0,69.0,3,1,2
47 | 2,1,38.1,120,120,4,4,2,5,4,4,60.0,6.5,2,1,2
48 | 1,1,37.9,60,60,3,4,2,5,4,4,65.0,7.5,1,1,1
49 | 2,1,37.5,48,48,1,1,1,1,1,1,37.0,6.5,1,2,2
50 | 1,1,38.9,80,80,3,3,2,2,3,3,54.0,6.5,2,1,2
51 | 2,1,37.2,84,84,3,5,2,4,1,2,73.0,5.5,2,2,1
52 | 2,1,38.6,46,46,1,2,1,1,3,2,49.0,9.1,1,2,1
53 | 1,1,37.4,84,84,1,3,2,3,3,2,44.0,7.5,2,1,1
54 | 2,1,38.1,60,60,1,3,1,1,3,1,43.0,7.7,1,2,2
55 | 2,1,38.6,40,40,3,1,1,3,3,1,41.0,6.4,1,2,1
56 | 2,1,40.3,114,114,3,1,2,2,3,3,57.0,8.1,3,1,1
57 | 1,9,38.6,160,160,3,5,1,3,3,4,38.0,7.5,2,1,1
58 | 1,1,38.1,60,60,3,1,1,3,3,1,24.0,6.7,1,1,2
59 | 1,1,38.1,64,64,2,2,1,5,3,3,42.0,7.7,2,1,2
60 | 1,1,38.1,60,60,4,3,1,5,4,3,53.0,5.9,2,1,1
61 | 2,1,38.1,96,96,3,3,2,5,4,4,60.0,7.5,2,1,2
62 | 2,1,37.8,48,48,1,3,1,2,1,1,37.0,6.7,1,2,2
63 | 2,1,38.5,60,60,2,1,1,1,2,2,44.0,7.7,1,2,2
64 | 1,1,37.8,88,88,2,2,1,3,3,1,64.0,8.0,2,1,1
65 | 2,1,38.2,130,130,4,4,2,2,4,4,65.0,82.0,3,2,2
66 | 1,1,39.0,64,64,3,4,2,3,3,2,44.0,7.5,1,1,1
67 | 1,1,38.1,60,60,3,3,1,3,3,2,26.0,72.0,1,1,2
68 | 2,1,37.9,72,72,1,5,2,3,3,1,58.0,74.0,1,1,2
69 | 2,1,38.4,54,54,1,1,1,1,3,1,49.0,7.2,1,2,1
70 | 2,1,38.1,52,52,1,3,1,3,3,1,55.0,7.2,1,2,2
71 | 2,1,38.0,48,48,1,1,1,1,3,1,42.0,6.3,1,2,1
72 | 2,1,37.0,60,60,3,1,1,3,3,3,43.0,7.6,3,1,1
73 | 1,1,37.8,48,48,1,1,1,1,2,1,46.0,5.9,1,2,1
74 | 1,1,37.7,56,56,3,1,1,3,3,1,44.0,7.5,2,1,2
75 | 1,1,38.1,52,52,1,5,1,4,3,1,54.0,7.5,2,1,1
76 | 1,9,38.1,60,60,3,1,1,3,3,1,37.0,4.9,2,1,2
77 | 1,9,39.7,100,100,3,5,2,2,3,1,48.0,57.0,3,1,2
78 | 1,1,37.6,38,38,3,1,1,3,3,2,37.0,68.0,1,1,2
79 | 2,1,38.7,52,52,2,1,1,1,1,1,33.0,77.0,1,2,2
80 | 1,1,38.1,60,60,3,3,3,5,3,3,46.0,5.9,2,1,2
81 | 1,1,37.5,96,96,1,6,2,3,4,2,69.0,8.9,1,1,1
82 | 1,1,36.4,98,98,3,4,1,4,3,2,47.0,6.4,2,1,1
83 | 1,1,37.3,40,40,3,1,1,2,3,2,36.0,7.5,1,1,2
84 | 1,9,38.1,100,100,3,2,1,3,4,1,36.0,5.7,1,1,2
85 | 1,1,38.0,60,60,3,6,2,5,3,4,68.0,7.8,2,1,2
86 | 1,1,37.8,60,60,1,2,2,2,3,3,40.0,4.5,1,1,1
87 | 2,1,38.0,54,54,2,3,3,3,1,2,45.0,6.2,1,2,2
88 | 1,1,38.1,88,88,3,4,2,5,4,3,50.0,7.7,2,1,1
89 | 2,1,38.1,40,40,3,1,1,3,3,1,50.0,7.0,3,1,1
90 | 2,1,39.0,64,64,1,5,1,3,3,2,42.0,7.5,1,2,1
91 | 2,1,38.3,42,42,1,1,1,1,1,1,38.0,61.0,1,2,2
92 | 2,1,38.0,52,52,3,1,1,2,3,1,53.0,86.0,1,1,2
93 | 2,1,40.3,114,114,3,1,2,2,3,3,57.0,8.1,2,1,1
94 | 2,1,38.8,50,50,3,1,1,1,1,1,42.0,6.2,1,2,2
95 | 2,1,38.1,60,60,3,1,1,5,3,3,38.0,6.5,2,1,2
96 | 2,1,37.5,48,48,4,3,1,3,2,1,48.0,8.6,1,2,2
97 | 1,1,37.3,48,48,3,2,1,3,3,3,41.0,69.0,1,1,2
98 | 2,1,38.1,84,84,3,3,1,3,3,1,44.0,8.5,1,1,2
99 | 1,1,38.1,88,88,3,4,1,2,3,3,55.0,60.0,3,2,2
100 | 2,1,37.7,44,44,2,3,1,1,3,2,41.0,60.0,1,2,2
101 | 2,1,39.6,108,108,3,6,2,2,4,3,59.0,8.0,1,2,1
102 | 1,1,38.2,40,40,3,1,1,1,3,1,34.0,66.0,1,2,2
103 | 1,1,38.1,60,60,4,4,2,5,4,1,44.0,7.5,3,1,2
104 | 2,1,38.3,40,40,3,1,1,2,3,1,37.0,57.0,1,2,2
105 | 1,9,38.0,140,140,1,1,1,3,3,2,39.0,5.3,1,1,2
106 | 1,1,37.8,52,52,1,3,1,4,4,1,48.0,6.6,2,1,2
107 | 1,1,38.1,70,70,1,3,2,2,3,2,36.0,7.3,1,1,2
108 | 1,1,38.3,52,52,3,3,1,3,3,1,43.0,6.1,1,1,1
109 | 2,1,37.3,50,50,1,3,1,1,3,2,44.0,7.0,1,2,2
110 | 1,1,38.7,60,60,4,2,2,4,4,4,53.0,64.0,3,1,2
111 | 1,9,38.4,84,84,3,2,1,3,3,3,36.0,6.6,2,1,1
112 | 1,1,38.1,70,70,3,5,2,2,3,2,60.0,7.5,2,1,2
113 | 1,1,38.3,40,40,3,1,1,1,3,2,38.0,58.0,1,1,2
114 | 1,1,38.1,40,40,2,1,1,1,3,1,39.0,56.0,1,1,2
115 | 1,1,36.8,60,60,3,1,1,3,3,1,44.0,7.5,2,1,1
116 | 1,1,38.4,44,44,3,4,1,5,4,3,50.0,77.0,1,1,2
117 | 2,1,38.1,60,60,3,1,1,3,3,2,45.0,70.0,1,2,2
118 | 1,1,38.0,44,44,1,1,1,3,3,3,42.0,65.0,1,1,2
119 | 2,1,39.5,60,60,3,4,2,3,4,3,44.0,6.7,3,1,2
120 | 1,1,36.5,78,78,1,1,1,5,3,1,34.0,75.0,1,1,2
121 | 2,1,38.1,56,56,2,2,1,1,3,1,46.0,70.0,1,2,2
122 | 1,1,39.4,54,54,1,2,1,2,3,2,39.0,6.0,1,1,1
123 | 1,1,38.3,80,80,3,6,2,4,3,1,67.0,10.2,3,1,2
124 | 2,1,38.7,40,40,2,1,1,3,1,1,39.0,62.0,1,2,2
125 | 1,1,38.2,64,64,1,3,1,4,4,3,45.0,7.5,2,1,1
126 | 2,1,37.6,48,48,3,4,1,1,1,3,37.0,5.5,3,1,2
127 | 1,1,38.0,42,42,4,1,1,3,3,2,41.0,7.6,1,1,2
128 | 1,1,38.7,60,60,3,3,1,5,4,2,33.0,6.5,1,1,2
129 | 1,1,37.4,50,50,3,1,1,4,4,1,45.0,7.9,1,1,1
130 | 1,1,37.4,84,84,3,3,1,2,3,3,31.0,61.0,3,2,2
131 | 1,1,38.4,49,49,3,1,1,3,3,1,44.0,7.6,1,1,2
132 | 1,1,37.8,30,30,3,1,1,3,3,1,44.0,7.5,2,1,2
133 | 2,1,37.6,88,88,3,1,1,3,3,2,44.0,6.0,2,1,2
134 | 2,1,37.9,40,40,1,1,1,2,3,1,40.0,5.7,1,1,2
135 | 1,1,38.1,100,100,3,4,2,5,4,1,59.0,6.3,2,1,2
136 | 1,9,38.1,136,136,3,3,1,5,1,3,33.0,4.9,2,1,1
137 | 1,1,38.1,60,60,3,3,2,5,3,3,46.0,5.9,2,1,2
138 | 1,1,38.0,48,48,1,1,1,1,2,4,44.0,7.5,1,1,2
139 | 2,1,38.0,56,56,1,3,1,1,1,1,42.0,71.0,1,2,2
140 | 2,1,38.0,60,60,1,1,1,3,3,1,50.0,7.0,1,2,2
141 | 1,1,38.1,44,44,3,1,1,2,2,1,31.0,7.3,1,2,2
142 | 2,1,36.0,42,42,3,5,1,3,3,1,64.0,6.8,2,2,2
143 | 1,1,38.1,120,120,4,6,2,5,4,4,57.0,4.5,2,1,1
144 | 1,1,37.8,48,48,1,1,2,1,2,1,46.0,5.9,1,2,1
145 | 1,1,37.1,84,84,3,6,1,2,4,4,75.0,81.0,3,2,2
146 | 2,1,38.1,80,80,3,2,1,2,3,3,50.0,80.0,1,1,2
147 | 1,1,38.2,48,48,1,3,1,3,4,4,42.0,71.0,1,1,2
148 | 2,1,38.0,44,44,2,3,1,3,4,3,33.0,6.5,2,1,2
149 | 1,1,38.3,132,132,3,6,2,2,4,2,57.0,8.0,1,1,1
150 | 2,1,38.7,48,48,3,1,1,1,1,1,34.0,63.0,1,2,2
151 | 2,1,38.9,44,44,3,1,1,2,3,2,33.0,64.0,1,2,2
152 | 1,1,39.3,60,60,4,6,2,4,4,2,75.0,7.5,2,1,1
153 | 1,1,38.1,100,100,3,4,2,3,4,4,68.0,64.0,1,1,2
154 | 2,1,38.6,48,48,3,1,1,1,3,2,50.0,7.3,1,2,1
155 | 2,1,38.8,48,48,1,3,1,3,3,4,41.0,65.0,1,1,2
156 | 2,1,38.0,48,48,3,4,1,1,4,2,49.0,8.3,1,2,1
157 | 2,1,38.6,52,52,1,1,1,3,3,2,36.0,6.6,1,2,1
158 | 1,1,37.8,60,60,1,3,2,3,4,4,52.0,75.0,3,1,2
159 | 2,1,38.0,42,42,3,1,1,3,3,1,44.0,7.5,1,2,2
160 | 2,1,38.1,60,60,1,2,1,2,1,2,44.0,7.5,1,2,1
161 | 1,1,38.1,60,60,3,1,1,4,3,1,35.0,58.0,1,1,2
162 | 1,1,38.3,42,42,3,1,1,3,3,1,40.0,8.5,2,1,2
163 | 2,1,39.5,60,60,3,1,2,3,3,2,38.0,56.0,1,2,2
164 | 1,1,38.0,66,66,1,3,1,5,3,1,46.0,46.0,3,1,2
165 | 1,1,38.7,76,76,1,5,2,3,3,2,50.0,8.0,1,1,1
166 | 1,1,39.4,120,120,3,5,1,3,3,3,56.0,64.0,3,2,2
167 | 1,1,38.3,40,40,1,1,1,3,1,1,43.0,5.9,1,2,1
168 | 2,1,38.1,44,44,1,1,1,3,3,1,44.0,6.3,1,2,2
169 | 1,1,38.4,104,104,1,3,1,2,4,2,55.0,8.5,1,1,2
170 | 1,1,38.1,65,65,3,1,2,5,3,4,44.0,7.5,3,1,2
171 | 2,1,37.5,44,44,1,3,1,3,1,1,35.0,7.2,1,2,2
172 | 2,1,39.0,86,86,3,5,1,3,3,3,68.0,5.8,2,1,1
173 | 1,1,38.5,129,129,3,3,1,2,4,3,57.0,66.0,1,1,2
174 | 1,1,38.1,104,104,3,5,2,2,4,3,69.0,8.6,2,1,1
175 | 2,1,38.1,60,60,3,6,1,4,3,4,44.0,7.5,2,1,1
176 | 1,1,38.1,60,60,3,1,1,3,3,1,44.0,7.5,1,1,2
177 | 1,1,38.2,60,60,1,3,1,3,3,1,48.0,66.0,1,1,2
178 | 1,1,38.1,68,68,3,4,1,4,3,1,44.0,7.5,2,1,1
179 | 1,1,38.1,60,60,3,4,2,5,4,4,45.0,70.0,1,1,2
180 | 2,1,38.5,100,100,3,5,2,4,3,4,44.0,7.5,3,2,1
181 | 1,1,38.4,84,84,3,5,2,4,3,3,47.0,7.5,2,1,2
182 | 2,1,37.8,48,48,3,1,1,3,3,2,35.0,7.5,1,2,1
183 | 1,1,38.0,60,60,3,6,2,5,3,4,68.0,7.8,2,1,2
184 | 2,1,37.8,56,56,1,2,1,2,1,1,44.0,68.0,1,2,2
185 | 2,1,38.2,68,68,2,2,1,1,1,1,43.0,65.0,1,2,2
186 | 1,1,38.5,120,120,4,6,2,3,3,1,54.0,7.5,1,1,2
187 | 1,1,39.3,64,64,2,1,1,3,3,1,39.0,6.7,1,1,2
188 | 1,1,38.4,80,80,4,1,1,3,3,3,32.0,6.1,1,1,1
189 | 1,1,38.5,60,60,1,1,1,3,1,1,33.0,53.0,1,1,2
190 | 1,1,38.3,60,60,3,1,1,2,1,1,30.0,6.0,1,1,2
191 | 1,1,37.1,40,40,3,4,1,3,3,1,23.0,6.7,1,1,1
192 | 2,9,38.1,100,100,2,1,1,4,1,1,37.0,4.7,1,2,2
193 | 1,1,38.2,48,48,1,1,1,3,3,3,48.0,74.0,1,1,2
194 | 1,1,38.1,60,60,3,4,2,4,3,4,58.0,7.6,2,1,2
195 | 2,1,37.9,88,88,1,2,1,2,2,1,37.0,56.0,1,2,2
196 | 2,1,38.0,44,44,3,1,1,3,1,2,42.0,64.0,1,2,2
197 | 2,1,38.5,60,60,1,5,2,2,2,1,63.0,7.5,3,2,1
198 | 2,1,38.5,96,96,3,1,2,2,4,2,70.0,8.5,2,1,1
199 | 2,1,38.3,60,60,1,1,2,1,3,1,34.0,66.0,1,2,2
200 | 2,1,38.5,60,60,3,2,1,2,1,2,49.0,59.0,1,2,2
201 | 1,1,37.3,48,48,1,3,1,3,1,3,40.0,6.6,1,1,1
202 | 1,1,38.5,86,86,1,3,1,4,4,3,45.0,7.4,2,1,1
203 | 1,1,37.5,48,48,3,1,1,3,3,1,41.0,55.0,3,1,2
204 | 2,1,37.2,36,36,1,1,1,2,3,1,35.0,5.7,1,2,2
205 | 1,1,39.2,60,60,3,3,1,4,4,2,36.0,6.6,1,1,1
206 | 2,1,38.5,100,100,3,5,2,4,3,4,44.0,7.5,3,2,2
207 | 1,1,38.5,96,96,2,4,2,4,4,3,50.0,65.0,1,1,2
208 | 1,1,38.1,60,60,3,1,1,3,3,1,45.0,8.7,2,1,2
209 | 1,1,37.8,88,88,3,5,2,3,3,3,64.0,89.0,3,1,2
210 | 2,1,37.5,44,44,3,1,1,3,1,2,43.0,51.0,1,2,2
211 | 1,1,37.9,68,68,3,2,1,2,4,2,45.0,4.0,2,1,1
212 | 1,1,38.0,86,86,4,4,1,2,4,4,45.0,5.5,2,1,1
213 | 1,9,38.9,120,120,1,2,2,3,3,3,47.0,6.3,1,2,2
214 | 1,1,37.6,45,45,3,3,1,3,2,2,39.0,7.0,1,1,1
215 | 2,1,38.6,56,56,2,1,1,1,1,1,40.0,7.0,1,2,1
216 | 1,1,37.8,40,40,1,1,1,1,2,1,38.0,7.0,1,1,2
217 | 2,1,38.1,60,60,3,1,1,3,3,1,44.0,7.5,1,2,2
218 | 1,1,38.0,76,76,3,1,2,3,3,1,71.0,11.0,1,1,1
219 | 1,1,38.1,40,40,1,2,1,2,2,1,44.0,7.5,3,1,2
220 | 1,1,38.1,52,52,3,4,1,3,4,3,37.0,8.1,1,1,2
221 | 1,1,39.2,88,88,4,1,2,5,4,1,44.0,7.5,3,2,2
222 | 1,1,38.5,92,92,4,1,1,2,4,3,46.0,67.0,1,1,2
223 | 1,1,38.1,112,112,4,4,1,2,3,1,60.0,6.3,1,1,1
224 | 1,1,37.7,66,66,1,3,1,3,3,2,31.5,6.2,1,1,1
225 | 1,1,38.8,50,50,1,1,1,3,1,1,38.0,58.0,1,1,2
226 | 2,1,38.4,54,54,1,1,1,1,3,1,49.0,7.2,1,2,1
227 | 1,1,39.2,120,120,4,5,2,2,3,3,60.0,8.8,2,1,2
228 | 1,9,38.1,60,60,3,1,1,3,3,1,45.0,6.5,1,1,1
229 | 1,1,37.3,90,90,3,6,2,5,4,3,65.0,50.0,3,1,2
230 | 1,9,38.5,120,120,3,1,1,3,1,1,35.0,54.0,1,1,2
231 | 1,1,38.5,104,104,3,1,1,4,3,4,44.0,7.5,1,1,2
232 | 2,1,39.5,92,92,3,6,1,5,4,1,72.0,6.4,2,2,2
233 | 1,1,38.5,30,30,3,1,1,3,3,1,40.0,7.7,1,1,2
234 | 1,1,38.3,72,72,4,3,2,3,3,3,43.0,7.0,1,1,1
235 | 2,1,37.5,48,48,4,3,1,3,2,1,48.0,8.6,1,2,2
236 | 1,1,38.1,52,52,1,5,1,4,3,1,54.0,7.5,2,1,1
237 | 2,1,38.2,42,42,1,1,1,3,1,2,36.0,6.9,1,2,2
238 | 2,1,37.9,54,54,2,5,1,3,1,1,47.0,54.0,1,2,2
239 | 2,1,36.1,88,88,3,3,1,3,3,2,45.0,7.0,3,1,1
240 | 1,1,38.1,70,70,3,1,1,5,3,1,36.0,65.0,3,1,2
241 | 1,1,38.0,90,90,4,4,2,5,4,4,55.0,6.1,2,1,2
242 | 1,1,38.2,52,52,1,2,1,1,2,1,43.0,8.1,1,2,1
243 | 1,1,38.1,36,36,1,4,1,5,3,3,41.0,5.9,2,1,2
244 | 1,1,38.4,92,92,1,1,2,3,3,3,44.0,7.5,1,1,1
245 | 1,9,38.2,124,124,1,2,1,2,3,4,47.0,8.0,1,1,1
246 | 2,1,38.1,96,96,3,3,2,5,4,4,60.0,7.5,2,1,2
247 | 1,1,37.6,68,68,3,3,1,4,2,4,47.0,7.2,1,1,2
248 | 1,1,38.1,88,88,3,4,1,5,4,3,41.0,4.6,2,1,2
249 | 1,1,38.0,108,108,2,4,1,4,3,3,44.0,7.5,1,1,2
250 | 2,1,38.2,48,48,2,1,2,3,3,1,34.0,6.6,1,2,2
251 | 1,1,39.3,100,100,4,6,1,2,4,1,66.0,13.0,3,1,2
252 | 2,1,36.6,42,42,3,2,1,1,4,1,52.0,7.1,2,1,2
253 | 1,9,38.8,124,124,3,2,1,2,3,4,50.0,7.6,2,1,1
254 | 2,1,38.1,112,112,3,4,2,5,4,2,40.0,5.3,1,2,1
255 | 1,1,38.1,80,80,3,3,1,4,4,4,43.0,70.0,1,1,2
256 | 1,9,38.8,184,184,1,1,1,4,1,3,33.0,3.3,2,1,2
257 | 1,1,37.5,72,72,2,1,1,2,1,1,35.0,65.0,3,1,2
258 | 1,1,38.7,96,96,3,4,1,3,4,1,64.0,9.0,2,1,1
259 | 2,1,37.5,52,52,1,1,1,2,3,2,36.0,61.0,1,2,2
260 | 1,1,40.8,72,72,3,1,1,2,3,1,54.0,7.4,2,1,1
261 | 2,1,38.0,40,40,3,1,1,4,3,2,37.0,69.0,1,2,2
262 | 2,1,38.4,48,48,2,1,1,1,3,2,39.0,6.5,1,2,1
263 | 2,9,38.6,88,88,3,1,1,3,3,1,35.0,5.9,1,2,2
264 | 1,1,37.1,75,75,3,3,2,4,4,2,48.0,7.4,2,1,1
265 | 1,1,38.3,44,44,3,2,1,3,3,3,44.0,6.5,1,1,1
266 | 2,1,38.1,56,56,3,1,1,3,3,1,40.0,6.0,3,1,2
267 | 2,1,38.6,68,68,2,3,1,3,3,2,38.0,6.5,1,2,1
268 | 2,1,38.3,54,54,3,2,1,2,3,2,44.0,7.2,1,2,1
269 | 1,1,38.2,42,42,3,1,1,3,3,1,47.0,60.0,1,2,2
270 | 1,1,39.3,64,64,2,1,1,3,3,1,39.0,6.7,1,1,2
271 | 1,1,37.5,60,60,3,1,1,3,3,2,35.0,6.5,2,1,2
272 | 1,1,37.7,80,80,3,6,1,5,4,1,50.0,55.0,1,1,2
273 | 1,1,38.1,100,100,3,4,2,5,4,4,52.0,6.6,1,1,2
274 | 1,1,37.7,120,120,3,3,1,5,3,3,65.0,7.0,2,1,1
275 | 1,1,38.1,76,76,3,1,1,3,4,4,44.0,7.5,3,1,2
276 | 1,9,38.8,150,150,1,6,2,5,3,2,50.0,6.2,2,1,2
277 | 1,1,38.0,36,36,3,1,1,4,2,2,37.0,75.0,3,2,2
278 | 2,1,36.9,50,50,2,3,1,1,3,2,37.5,6.5,1,2,2
279 | 2,1,37.8,40,40,1,1,1,1,1,1,37.0,6.8,1,2,2
280 | 2,1,38.2,56,56,4,1,1,2,4,3,47.0,7.2,1,2,1
281 | 1,1,38.6,48,48,3,1,1,1,1,1,36.0,67.0,1,2,2
282 | 2,1,40.0,78,78,3,5,1,2,3,1,66.0,6.5,2,1,1
283 | 1,1,38.1,70,70,3,5,2,2,3,2,60.0,7.5,2,1,2
284 | 1,1,38.2,72,72,3,1,1,3,3,1,35.0,6.4,1,1,2
285 | 2,1,38.5,54,54,1,1,1,3,1,1,40.0,6.8,1,2,1
286 | 1,1,38.5,66,66,1,1,1,3,3,1,40.0,6.7,1,1,1
287 | 2,1,37.8,82,82,3,1,2,4,3,3,50.0,7.0,3,1,2
288 | 2,9,39.5,84,84,3,1,1,3,3,1,28.0,5.0,1,2,2
289 | 1,1,38.1,60,60,3,1,1,3,3,1,44.0,7.5,1,1,2
290 | 1,1,38.0,50,50,3,1,1,3,2,2,39.0,6.6,1,1,1
291 | 2,1,38.6,45,45,2,2,1,1,1,1,43.0,58.0,1,2,2
292 | 1,1,38.9,80,80,3,3,1,2,3,3,54.0,6.5,2,1,2
293 | 1,1,37.0,66,66,1,2,1,4,3,3,35.0,6.9,2,1,2
294 | 1,1,38.1,78,78,3,3,1,3,3,1,43.0,62.0,3,2,2
295 | 2,1,38.5,40,40,1,1,1,2,1,1,37.0,67.0,1,2,2
296 | 1,1,38.1,120,120,4,4,2,2,4,1,55.0,65.0,3,2,2
297 | 2,1,37.2,72,72,3,4,2,4,3,3,44.0,7.5,3,1,1
298 | 1,1,37.5,72,72,4,4,1,4,4,3,60.0,6.8,2,1,2
299 | 1,1,36.5,100,100,3,3,1,3,3,3,50.0,6.0,1,1,1
300 | 1,1,37.2,40,40,3,1,1,3,3,1,36.0,62.0,3,2,2
301 | 2,1,38.5,54,54,3,2,2,3,4,1,42.0,6.3,1,2,1
302 | 2,1,37.6,48,48,3,1,1,3,3,1,44.0,6.3,1,2,1
303 | 1,1,37.7,44,44,3,3,2,5,4,4,45.0,70.0,1,1,2
304 | 1,1,37.0,56,56,3,4,2,4,4,3,35.0,61.0,3,2,2
305 | 2,1,38.0,42,42,3,3,1,1,3,1,37.0,5.8,1,2,2
306 | 1,1,38.1,60,60,3,1,1,3,4,1,42.0,72.0,1,1,2
307 | 2,1,38.4,80,80,3,2,1,3,2,1,54.0,6.9,1,2,2
308 | 2,1,37.8,48,48,2,2,1,3,3,1,48.0,7.3,1,2,1
309 | 2,1,37.9,45,45,3,3,2,2,3,1,33.0,5.7,1,1,1
310 | 2,1,39.0,84,84,3,5,1,2,4,2,62.0,5.9,2,1,1
311 | 2,1,38.2,60,60,3,3,2,3,3,2,53.0,7.5,1,2,1
312 | 1,1,38.1,140,140,3,4,2,5,4,4,30.0,69.0,2,2,2
313 | 1,1,37.9,120,120,3,3,1,5,4,4,52.0,6.6,2,1,1
314 | 2,1,38.0,72,72,1,3,1,3,3,2,38.0,6.8,1,2,1
315 | 2,9,38.0,92,92,1,2,1,1,3,2,37.0,6.1,1,2,1
316 | 1,1,38.3,66,66,2,1,1,2,4,3,37.0,6.0,1,1,2
317 | 2,1,37.5,48,48,3,1,1,2,1,1,43.0,6.0,1,2,1
318 | 1,1,37.5,88,88,2,3,1,4,3,3,35.0,6.4,2,1,2
319 | 2,9,38.1,150,150,4,4,2,5,4,4,44.0,7.5,2,1,2
320 | 1,1,39.7,100,100,3,6,2,4,4,3,65.0,75.0,3,1,2
321 | 1,1,38.3,80,80,3,4,2,5,4,3,45.0,7.5,1,1,1
322 | 2,1,37.5,40,40,3,3,1,3,2,3,32.0,6.4,1,1,1
323 | 1,1,38.4,84,84,3,5,2,4,3,3,47.0,7.5,2,1,2
324 | 1,1,38.1,84,84,4,4,2,5,3,1,60.0,6.8,2,1,1
325 | 2,1,38.7,52,52,1,1,1,1,3,1,4.0,74.0,1,2,2
326 | 2,1,38.1,44,44,2,3,1,3,3,1,35.0,6.8,1,2,2
327 | 2,1,38.4,52,52,2,3,1,1,3,2,41.0,63.0,1,2,2
328 | 1,1,38.2,60,60,1,3,1,2,1,1,43.0,6.2,1,1,1
329 | 2,1,37.7,40,40,1,1,1,3,2,1,36.0,3.5,1,2,2
330 | 1,1,39.1,60,60,3,1,1,2,3,1,44.0,7.5,1,1,2
331 | 2,1,37.8,48,48,1,1,1,3,1,1,43.0,7.5,1,2,2
332 | 1,1,39.0,120,120,4,5,2,2,4,3,65.0,8.2,1,2,2
333 | 1,1,38.2,76,76,2,2,1,5,3,3,35.0,6.5,1,1,1
334 | 2,1,38.3,88,88,3,6,1,3,3,1,44.0,7.5,2,2,2
335 | 1,1,38.0,80,80,3,3,1,3,3,1,48.0,8.3,1,1,2
336 | 1,1,38.1,60,60,3,1,1,2,3,3,44.0,7.5,2,1,1
337 | 1,1,37.6,40,40,1,1,1,1,1,1,44.0,7.5,1,1,1
338 | 2,1,37.5,44,44,1,1,1,3,3,2,45.0,5.8,1,2,1
339 | 2,1,38.2,42,42,1,3,1,1,3,1,35.0,60.0,1,2,2
340 | 2,1,38.0,56,56,3,3,1,3,1,1,47.0,70.0,1,2,2
341 | 2,1,38.3,45,45,3,2,2,2,4,1,44.0,7.5,1,2,2
342 | 1,1,38.1,48,48,1,3,1,3,4,1,42.0,8.0,1,1,2
343 | 1,1,37.7,55,55,2,2,1,2,3,3,44.0,7.5,1,1,2
344 | 2,1,36.0,100,100,4,6,2,2,4,3,74.0,5.7,3,1,1
345 | 1,1,37.1,60,60,2,4,1,3,3,3,64.0,8.5,1,1,1
346 | 2,1,37.1,114,114,3,3,2,2,2,1,32.0,7.5,1,2,2
347 | 1,1,38.1,72,72,3,3,1,4,4,3,37.0,56.0,1,1,2
348 | 1,1,37.0,44,44,3,1,2,1,1,1,40.0,6.7,1,1,2
349 | 1,1,38.6,48,48,3,1,1,4,3,1,37.0,75.0,1,1,2
350 | 1,1,38.1,82,82,3,4,1,2,3,3,53.0,65.0,3,1,2
351 | 1,9,38.2,78,78,4,6,1,3,3,3,59.0,5.8,2,1,1
352 | 2,1,37.8,60,60,1,3,1,2,3,2,41.0,73.0,3,2,2
353 | 1,1,38.7,34,34,2,3,1,2,3,1,33.0,69.0,3,1,2
354 | 1,1,38.1,36,36,1,1,1,1,2,1,44.0,7.5,1,1,1
355 | 2,1,38.3,44,44,3,1,1,3,3,1,6.4,36.0,1,1,2
356 | 2,1,37.4,54,54,3,1,1,3,4,3,30.0,7.1,1,1,1
357 | 1,1,38.1,60,60,4,1,2,2,4,1,54.0,76.0,1,1,2
358 | 1,1,36.6,48,48,3,3,1,4,1,1,27.0,56.0,3,1,2
359 | 1,1,38.5,90,90,1,3,1,3,3,3,47.0,79.0,1,1,2
360 | 1,1,38.1,75,75,1,4,1,5,3,3,58.0,8.5,1,1,1
361 | 2,1,38.2,42,42,3,1,1,1,1,2,35.0,5.9,1,2,2
362 | 1,9,38.2,78,78,4,6,1,3,3,3,59.0,5.8,2,1,1
363 | 2,1,38.6,60,60,1,3,1,4,2,2,40.0,6.0,1,1,2
364 | 2,1,37.8,42,42,1,1,1,1,3,1,36.0,6.2,1,2,2
365 | 1,1,38.0,60,60,1,2,1,2,1,1,44.0,65.0,3,1,2
366 | 2,1,38.0,42,42,3,3,1,1,1,1,37.0,5.8,1,2,2
367 | 2,1,37.6,88,88,3,1,1,3,3,2,44.0,6.0,2,1,2
368 |
--------------------------------------------------------------------------------
/data/small.csv:
--------------------------------------------------------------------------------
1 | h,e,t
2 | 185.0,rotten,2.3
3 | 153.0,great,4.5
4 | 163.0,bla,4.2
5 | 114.0,great,1.8
6 | 180.0,bla,7.1
7 |
--------------------------------------------------------------------------------
/data/src/Manifest.toml:
--------------------------------------------------------------------------------
1 | # This file is machine-generated - editing it directly is not advised
2 |
3 | [[AbstractFFTs]]
4 | deps = ["LinearAlgebra"]
5 | git-tree-sha1 = "051c95d6836228d120f5f4b984dd5aba1624f716"
6 | uuid = "621f4979-c628-5d54-868e-fcf4e3e8185c"
7 | version = "0.5.0"
8 |
9 | [[Adapt]]
10 | deps = ["LinearAlgebra"]
11 | git-tree-sha1 = "0fac443759fa829ed8066db6cf1077d888bb6573"
12 | uuid = "79e6a3ab-5dfb-504d-930d-738a2a938a0e"
13 | version = "2.0.2"
14 |
15 | [[Arpack]]
16 | deps = ["Arpack_jll", "Libdl", "LinearAlgebra"]
17 | git-tree-sha1 = "2ff92b71ba1747c5fdd541f8fc87736d82f40ec9"
18 | uuid = "7d9fca2a-8960-54d3-9f78-7d1dccf2cb97"
19 | version = "0.4.0"
20 |
21 | [[Arpack_jll]]
22 | deps = ["Libdl", "OpenBLAS_jll", "Pkg"]
23 | git-tree-sha1 = "e214a9b9bd1b4e1b4f15b22c0994862b66af7ff7"
24 | uuid = "68821587-b530-5797-8361-c406ea357684"
25 | version = "3.5.0+3"
26 |
27 | [[ArrayInterface]]
28 | deps = ["LinearAlgebra", "Requires", "SparseArrays"]
29 | git-tree-sha1 = "0eccdcbe27fd6bd9cba3be31c67bdd435a21e865"
30 | uuid = "4fba245c-0d91-5ea0-9b3e-6abc04ee57a9"
31 | version = "2.9.1"
32 |
33 | [[AxisAlgorithms]]
34 | deps = ["LinearAlgebra", "Random", "SparseArrays", "WoodburyMatrices"]
35 | git-tree-sha1 = "a4d07a1c313392a77042855df46c5f534076fab9"
36 | uuid = "13072b0f-2c55-5437-9ae7-d433b7a33950"
37 | version = "1.0.0"
38 |
39 | [[BSON]]
40 | git-tree-sha1 = "dd36d7cf3d185eeaaf64db902c15174b22f5dafb"
41 | uuid = "fbb218c0-5317-5bc6-957e-2ee96dd4b1f0"
42 | version = "0.2.6"
43 |
44 | [[Base64]]
45 | uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f"
46 |
47 | [[Bzip2_jll]]
48 | deps = ["Libdl", "Pkg"]
49 | git-tree-sha1 = "3663bfffede2ef41358b6fc2e1d8a6d50b3c3904"
50 | uuid = "6e34b625-4abd-537c-b88f-471c36dfa7a0"
51 | version = "1.0.6+2"
52 |
53 | [[CSV]]
54 | deps = ["CategoricalArrays", "DataFrames", "Dates", "FilePathsBase", "Mmap", "Parsers", "PooledArrays", "Tables", "Unicode", "WeakRefStrings"]
55 | git-tree-sha1 = "52a8e60c7822f53d57e4403b7f2811e7e1bdd32b"
56 | uuid = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
57 | version = "0.6.2"
58 |
59 | [[CategoricalArrays]]
60 | deps = ["DataAPI", "Future", "JSON", "Missings", "Printf", "Statistics", "Unicode"]
61 | git-tree-sha1 = "a6c17353ee38ddab30e73dcfaa1107752de724ec"
62 | uuid = "324d7699-5711-5eae-9e2f-1d82baa6b597"
63 | version = "0.8.1"
64 |
65 | [[Clustering]]
66 | deps = ["Distances", "LinearAlgebra", "NearestNeighbors", "Printf", "SparseArrays", "Statistics", "StatsBase"]
67 | git-tree-sha1 = "b11c8d607af357776a046889a7c32567d05f1319"
68 | uuid = "aaaa29a8-35af-508c-8bc3-b662a17a0fe5"
69 | version = "0.14.1"
70 |
71 | [[CodecZlib]]
72 | deps = ["TranscodingStreams", "Zlib_jll"]
73 | git-tree-sha1 = "ded953804d019afa9a3f98981d99b33e3db7b6da"
74 | uuid = "944b1d66-785c-5afd-91f1-9de20f533193"
75 | version = "0.7.0"
76 |
77 | [[ColorSchemes]]
78 | deps = ["ColorTypes", "Colors", "FixedPointNumbers", "Random", "StaticArrays"]
79 | git-tree-sha1 = "7a15e3690529fd1042f0ab954dff7445b1efc8a5"
80 | uuid = "35d6a980-a343-548e-a6ea-1d62b119f2f4"
81 | version = "3.9.0"
82 |
83 | [[ColorTypes]]
84 | deps = ["FixedPointNumbers", "Random"]
85 | git-tree-sha1 = "6e7aa35d0294f647bb9c985ccc34d4f5d371a533"
86 | uuid = "3da002f7-5984-5a60-b8a6-cbb66c0b333f"
87 | version = "0.10.6"
88 |
89 | [[Colors]]
90 | deps = ["ColorTypes", "FixedPointNumbers", "InteractiveUtils", "Reexport"]
91 | git-tree-sha1 = "5639e44833cfcf78c6a73fbceb4da75611d312cd"
92 | uuid = "5ae59095-9a9b-59fe-a467-6f913c188581"
93 | version = "0.12.3"
94 |
95 | [[CommonSubexpressions]]
96 | deps = ["MacroTools", "Test"]
97 | git-tree-sha1 = "7b8a93dba8af7e3b42fecabf646260105ac373f7"
98 | uuid = "bbf7d656-a473-5ed7-a52c-81e309532950"
99 | version = "0.3.0"
100 |
101 | [[Compat]]
102 | deps = ["Base64", "Dates", "DelimitedFiles", "Distributed", "InteractiveUtils", "LibGit2", "Libdl", "LinearAlgebra", "Markdown", "Mmap", "Pkg", "Printf", "REPL", "Random", "SHA", "Serialization", "SharedArrays", "Sockets", "SparseArrays", "Statistics", "Test", "UUIDs", "Unicode"]
103 | git-tree-sha1 = "a6a8197ae253f2c1a22b2ae17c2dfaf5812c03aa"
104 | uuid = "34da2185-b29b-5c13-b0c7-acf172513d20"
105 | version = "3.13.0"
106 |
107 | [[CompilerSupportLibraries_jll]]
108 | deps = ["Libdl", "Pkg"]
109 | git-tree-sha1 = "7c4f882c41faa72118841185afc58a2eb00ef612"
110 | uuid = "e66e0078-7015-5450-92f7-15fbd957f2ae"
111 | version = "0.3.3+0"
112 |
113 | [[ComputationalResources]]
114 | git-tree-sha1 = "52cb3ec90e8a8bea0e62e275ba577ad0f74821f7"
115 | uuid = "ed09eef8-17a6-5b46-8889-db040fac31e3"
116 | version = "0.3.2"
117 |
118 | [[Conda]]
119 | deps = ["JSON", "VersionParsing"]
120 | git-tree-sha1 = "7a58bb32ce5d85f8bf7559aa7c2842f9aecf52fc"
121 | uuid = "8f4d0f93-b110-5947-807f-2305c1781a2d"
122 | version = "1.4.1"
123 |
124 | [[Contour]]
125 | deps = ["StaticArrays"]
126 | git-tree-sha1 = "81685fee51fc5168898e3cbd8b0f01506cd9148e"
127 | uuid = "d38c429a-6771-53c6-b99e-75d170b6e991"
128 | version = "0.5.4"
129 |
130 | [[Crayons]]
131 | git-tree-sha1 = "c437a9c2114c7ba19322712e58942b383ffbd6c0"
132 | uuid = "a8cc5b0e-0ffa-5ad4-8c14-923d3ee1735f"
133 | version = "4.0.3"
134 |
135 | [[DataAPI]]
136 | git-tree-sha1 = "176e23402d80e7743fc26c19c681bfb11246af32"
137 | uuid = "9a962f9c-6df0-11e9-0e5d-c546b8b5ee8a"
138 | version = "1.3.0"
139 |
140 | [[DataFrames]]
141 | deps = ["CategoricalArrays", "Compat", "DataAPI", "Future", "InvertedIndices", "IteratorInterfaceExtensions", "Missings", "PooledArrays", "Printf", "REPL", "Reexport", "SortingAlgorithms", "Statistics", "TableTraits", "Tables", "Unicode"]
142 | git-tree-sha1 = "d4436b646615928b634b37e99a3288588072f851"
143 | uuid = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
144 | version = "0.21.4"
145 |
146 | [[DataStructures]]
147 | deps = ["InteractiveUtils", "OrderedCollections"]
148 | git-tree-sha1 = "edad9434967fdc0a2631a65d902228400642120c"
149 | uuid = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8"
150 | version = "0.17.19"
151 |
152 | [[DataValueInterfaces]]
153 | git-tree-sha1 = "bfc1187b79289637fa0ef6d4436ebdfe6905cbd6"
154 | uuid = "e2d170a0-9d28-54be-80f0-106bbe20a464"
155 | version = "1.0.0"
156 |
157 | [[DataValues]]
158 | deps = ["DataValueInterfaces", "Dates"]
159 | git-tree-sha1 = "d88a19299eba280a6d062e135a43f00323ae70bf"
160 | uuid = "e7dc6d0d-1eca-5fa6-8ad6-5aecde8b7ea5"
161 | version = "0.4.13"
162 |
163 | [[Dates]]
164 | deps = ["Printf"]
165 | uuid = "ade2ca70-3891-5945-98fb-dc099432e06a"
166 |
167 | [[DecisionTree]]
168 | deps = ["DelimitedFiles", "Distributed", "LinearAlgebra", "Random", "ScikitLearnBase", "Statistics", "Test"]
169 | git-tree-sha1 = "9faa81d6e611cf00d16d4dabbd60a325ada72a83"
170 | uuid = "7806a523-6efd-50cb-b5f6-3fa6f1930dbb"
171 | version = "0.10.7"
172 |
173 | [[DelimitedFiles]]
174 | deps = ["Mmap"]
175 | uuid = "8bb1440f-4735-579b-a4ab-409b98df4dab"
176 |
177 | [[DiffResults]]
178 | deps = ["StaticArrays"]
179 | git-tree-sha1 = "da24935df8e0c6cf28de340b958f6aac88eaa0cc"
180 | uuid = "163ba53b-c6d8-5494-b064-1a9d43ac40c5"
181 | version = "1.0.2"
182 |
183 | [[DiffRules]]
184 | deps = ["NaNMath", "Random", "SpecialFunctions"]
185 | git-tree-sha1 = "eb0c34204c8410888844ada5359ac8b96292cfd1"
186 | uuid = "b552c78f-8df3-52c6-915a-8e097449b14b"
187 | version = "1.0.1"
188 |
189 | [[Distances]]
190 | deps = ["LinearAlgebra", "Statistics"]
191 | git-tree-sha1 = "23717536c81b63e250f682b0e0933769eecd1411"
192 | uuid = "b4f34e82-e78d-54a5-968a-f98e89d6e8f7"
193 | version = "0.8.2"
194 |
195 | [[Distributed]]
196 | deps = ["Random", "Serialization", "Sockets"]
197 | uuid = "8ba89e20-285c-5b6f-9357-94700520ee1b"
198 |
199 | [[Distributions]]
200 | deps = ["FillArrays", "LinearAlgebra", "PDMats", "Printf", "QuadGK", "Random", "SpecialFunctions", "Statistics", "StatsBase", "StatsFuns"]
201 | git-tree-sha1 = "78c4c32a2357a00a0a7d614880f02c2c6e1ec73c"
202 | uuid = "31c24e10-a181-5473-b8eb-7969acd0382f"
203 | version = "0.23.4"
204 |
205 | [[DocStringExtensions]]
206 | deps = ["LibGit2", "Markdown", "Pkg", "Test"]
207 | git-tree-sha1 = "c5714d9bcdba66389612dc4c47ed827c64112997"
208 | uuid = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae"
209 | version = "0.8.2"
210 |
211 | [[Documenter]]
212 | deps = ["Base64", "Dates", "DocStringExtensions", "InteractiveUtils", "JSON", "LibGit2", "Logging", "Markdown", "REPL", "Test", "Unicode"]
213 | git-tree-sha1 = "395fa1554c69735802bba37d9e7d9586fd44326c"
214 | uuid = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
215 | version = "0.24.11"
216 |
217 | [[EvoTrees]]
218 | deps = ["CategoricalArrays", "Distributions", "MLJModelInterface", "Random", "StaticArrays", "Statistics", "StatsBase"]
219 | git-tree-sha1 = "2608d6cd10db187b7ef96c2197f809c04a1ac735"
220 | uuid = "f6006082-12f8-11e9-0c9c-0d5d367ab1e5"
221 | version = "0.4.9"
222 |
223 | [[ExprTools]]
224 | git-tree-sha1 = "6f0517056812fd6aa3af23d4b70d5325a2ae4e95"
225 | uuid = "e2ba6199-217a-4e67-a87a-7c52f15ade04"
226 | version = "0.1.1"
227 |
228 | [[EzXML]]
229 | deps = ["Printf", "XML2_jll"]
230 | git-tree-sha1 = "0fa3b52a04a4e210aeb1626def9c90df3ae65268"
231 | uuid = "8f5d6c58-4d21-5cfd-889c-e3ad7ee6a615"
232 | version = "1.1.0"
233 |
234 | [[FFMPEG]]
235 | deps = ["FFMPEG_jll"]
236 | git-tree-sha1 = "c82bef6fc01e30d500f588cd01d29bdd44f1924e"
237 | uuid = "c87230d0-a227-11e9-1b43-d7ebe4e7570a"
238 | version = "0.3.0"
239 |
240 | [[FFMPEG_jll]]
241 | deps = ["Bzip2_jll", "FreeType2_jll", "FriBidi_jll", "LAME_jll", "LibVPX_jll", "Libdl", "Ogg_jll", "OpenSSL_jll", "Opus_jll", "Pkg", "Zlib_jll", "libass_jll", "libfdk_aac_jll", "libvorbis_jll", "x264_jll", "x265_jll"]
242 | git-tree-sha1 = "0fa07f43e5609ea54848b82b4bb330b250e9645b"
243 | uuid = "b22a6f82-2f65-5046-a5b2-351ab43fb4e5"
244 | version = "4.1.0+3"
245 |
246 | [[FFTW]]
247 | deps = ["AbstractFFTs", "FFTW_jll", "IntelOpenMP_jll", "Libdl", "LinearAlgebra", "MKL_jll", "Reexport"]
248 | git-tree-sha1 = "14536c95939aadcee44014728a459d2fe3ca9acf"
249 | uuid = "7a1cc6ca-52ef-59f5-83cd-3a7055c09341"
250 | version = "1.2.2"
251 |
252 | [[FFTW_jll]]
253 | deps = ["Libdl", "Pkg"]
254 | git-tree-sha1 = "6c975cd606128d45d1df432fb812d6eb10fee00b"
255 | uuid = "f5851436-0d7a-5f13-b9de-f02708fd171a"
256 | version = "3.3.9+5"
257 |
258 | [[FileIO]]
259 | deps = ["Pkg"]
260 | git-tree-sha1 = "202335fd24c2776493e198d6c66a6d910400a895"
261 | uuid = "5789e2e9-d7fb-5bc7-8068-2c6fae9b9549"
262 | version = "1.3.0"
263 |
264 | [[FilePathsBase]]
265 | deps = ["Dates", "LinearAlgebra", "Printf", "Test", "UUIDs"]
266 | git-tree-sha1 = "923fd3b942a11712435682eaa95cc8518c428b2c"
267 | uuid = "48062228-2e41-5def-b9a4-89aafe57970f"
268 | version = "0.8.0"
269 |
270 | [[FileWatching]]
271 | uuid = "7b1f6079-737a-58dc-b8bc-7a2ca5c1b5ee"
272 |
273 | [[FillArrays]]
274 | deps = ["LinearAlgebra", "Random", "SparseArrays"]
275 | git-tree-sha1 = "be4180bdb27a11188d694ee3773122f4921f1a62"
276 | uuid = "1a297f60-69ca-5386-bcde-b61e274b549b"
277 | version = "0.8.13"
278 |
279 | [[FiniteDiff]]
280 | deps = ["ArrayInterface", "LinearAlgebra", "Requires", "SparseArrays", "StaticArrays"]
281 | git-tree-sha1 = "b02b6f6ea2c33f86a444f9cf132c1d1180a66cfd"
282 | uuid = "6a86dc24-6348-571c-b903-95158fe2bd41"
283 | version = "2.4.1"
284 |
285 | [[FixedPointNumbers]]
286 | deps = ["Statistics"]
287 | git-tree-sha1 = "266baee2e9d875cb7a3bfdcc6cab553c543ff8ab"
288 | uuid = "53c48c17-4a7d-5ca2-90c5-79b7896eea93"
289 | version = "0.8.2"
290 |
291 | [[Formatting]]
292 | deps = ["Printf"]
293 | git-tree-sha1 = "a0c901c29c0e7c763342751c0a94211d56c0de5c"
294 | uuid = "59287772-0a20-5a39-b81b-1366585eb4c0"
295 | version = "0.4.1"
296 |
297 | [[ForwardDiff]]
298 | deps = ["CommonSubexpressions", "DiffResults", "DiffRules", "NaNMath", "Random", "SpecialFunctions", "StaticArrays"]
299 | git-tree-sha1 = "1d090099fb82223abc48f7ce176d3f7696ede36d"
300 | uuid = "f6369f11-7733-5829-9624-2563aa707210"
301 | version = "0.10.12"
302 |
303 | [[Franklin]]
304 | deps = ["Crayons", "Dates", "DelimitedFiles", "DocStringExtensions", "FranklinTemplates", "HTTP", "Literate", "LiveServer", "Logging", "Markdown", "NodeJS", "OrderedCollections", "Pkg", "Random"]
305 | git-tree-sha1 = "c79cc974f019c23e8e5841772070b60c42cdef1f"
306 | uuid = "713c75ef-9fc9-4b05-94a9-213340da978e"
307 | version = "0.8.6"
308 |
309 | [[FranklinTemplates]]
310 | git-tree-sha1 = "dc509923f200b7385ffe699d82aca084aede014b"
311 | uuid = "3a985190-f512-4703-8d38-2a7944ed5916"
312 | version = "0.7.2"
313 |
314 | [[FreeType2_jll]]
315 | deps = ["Bzip2_jll", "Libdl", "Pkg", "Zlib_jll"]
316 | git-tree-sha1 = "7d900f32a3788d4eacac2bfa3bf5c770179c8afd"
317 | uuid = "d7e528f0-a631-5988-bf34-fe36492bcfd7"
318 | version = "2.10.1+2"
319 |
320 | [[FriBidi_jll]]
321 | deps = ["Libdl", "Pkg"]
322 | git-tree-sha1 = "2f56bee16bd0151de7b6a1eeea2ced190a2ad8d4"
323 | uuid = "559328eb-81f9-559d-9380-de523a88c83c"
324 | version = "1.0.5+3"
325 |
326 | [[Future]]
327 | deps = ["Random"]
328 | uuid = "9fa8497b-333b-5362-9e8d-4d0656e87820"
329 |
330 | [[GLM]]
331 | deps = ["Distributions", "LinearAlgebra", "Printf", "Random", "Reexport", "SparseArrays", "SpecialFunctions", "Statistics", "StatsBase", "StatsFuns", "StatsModels"]
332 | git-tree-sha1 = "db0ace36f9dbe7b6a7a08434c5921377e9df2c72"
333 | uuid = "38e38edf-8417-5370-95a0-9cbb8c7f171a"
334 | version = "1.3.9"
335 |
336 | [[GR]]
337 | deps = ["Base64", "DelimitedFiles", "HTTP", "JSON", "LinearAlgebra", "Printf", "Random", "Serialization", "Sockets", "Test", "UUIDs"]
338 | git-tree-sha1 = "e26c513329675092535de20cc4bb9c579c8f85a0"
339 | uuid = "28b8d3ca-fb5f-59d9-8090-bfdbd6d07a71"
340 | version = "0.51.0"
341 |
342 | [[GeometryBasics]]
343 | deps = ["IterTools", "LinearAlgebra", "StaticArrays", "StructArrays", "Tables"]
344 | git-tree-sha1 = "119f32f9c2b497b49cd3f7f513b358b82660294c"
345 | uuid = "5c1252a2-5f33-56bf-86c9-59e7332b4326"
346 | version = "0.2.15"
347 |
348 | [[GeometryTypes]]
349 | deps = ["ColorTypes", "FixedPointNumbers", "LinearAlgebra", "StaticArrays"]
350 | git-tree-sha1 = "34bfa994967e893ab2f17b864eec221b3521ba4d"
351 | uuid = "4d00f742-c7ba-57c2-abde-4428a4b178cb"
352 | version = "0.8.3"
353 |
354 | [[HTTP]]
355 | deps = ["Base64", "Dates", "IniFile", "MbedTLS", "Sockets"]
356 | git-tree-sha1 = "eca61b35cdd8cd2fcc5eec1eda766424a995b02f"
357 | uuid = "cd3eb016-35fb-5094-929b-558a96fad6f3"
358 | version = "0.8.16"
359 |
360 | [[IniFile]]
361 | deps = ["Test"]
362 | git-tree-sha1 = "098e4d2c533924c921f9f9847274f2ad89e018b8"
363 | uuid = "83e8ac13-25f8-5344-8a64-a9f2b223428f"
364 | version = "0.5.0"
365 |
366 | [[IntelOpenMP_jll]]
367 | deps = ["Libdl", "Pkg"]
368 | git-tree-sha1 = "fb8e1c7a5594ba56f9011310790e03b5384998d6"
369 | uuid = "1d5cc7b8-4909-519e-a0f8-d0f5ad9712d0"
370 | version = "2018.0.3+0"
371 |
372 | [[InteractiveUtils]]
373 | deps = ["Markdown"]
374 | uuid = "b77e0a4c-d291-57a0-90e8-8db25a27a240"
375 |
376 | [[Interpolations]]
377 | deps = ["AxisAlgorithms", "LinearAlgebra", "OffsetArrays", "Random", "Ratios", "SharedArrays", "SparseArrays", "StaticArrays", "WoodburyMatrices"]
378 | git-tree-sha1 = "2b7d4e9be8b74f03115e64cf36ed2f48ae83d946"
379 | uuid = "a98d9a8b-a2ab-59e6-89dd-64a1c18fca59"
380 | version = "0.12.10"
381 |
382 | [[InvertedIndices]]
383 | deps = ["Test"]
384 | git-tree-sha1 = "15732c475062348b0165684ffe28e85ea8396afc"
385 | uuid = "41ab1584-1d38-5bbf-9106-f11c6c58b48f"
386 | version = "1.0.0"
387 |
388 | [[IterTools]]
389 | git-tree-sha1 = "05110a2ab1fc5f932622ffea2a003221f4782c18"
390 | uuid = "c8e1da08-722c-5040-9ed9-7db0dc04731e"
391 | version = "1.3.0"
392 |
393 | [[IterativeSolvers]]
394 | deps = ["LinearAlgebra", "Printf", "Random", "RecipesBase", "SparseArrays"]
395 | git-tree-sha1 = "3b7e2aac8c94444947facea7cc7ca91c49169be0"
396 | uuid = "42fd0dbc-a981-5370-80f2-aaf504508153"
397 | version = "0.8.4"
398 |
399 | [[IteratorInterfaceExtensions]]
400 | git-tree-sha1 = "a3f24677c21f5bbe9d2a714f95dcd58337fb2856"
401 | uuid = "82899510-4779-5014-852e-03e436cf321d"
402 | version = "1.0.0"
403 |
404 | [[JLSO]]
405 | deps = ["BSON", "CodecZlib", "FilePathsBase", "Memento", "Pkg", "Serialization"]
406 | git-tree-sha1 = "9dc0c7a4b7527806e53f524ccd66be0cd9e75e2e"
407 | uuid = "9da8a3cd-07a3-59c0-a743-3fdc52c30d11"
408 | version = "2.3.2"
409 |
410 | [[JSON]]
411 | deps = ["Dates", "Mmap", "Parsers", "Unicode"]
412 | git-tree-sha1 = "b34d7cef7b337321e97d22242c3c2b91f476748e"
413 | uuid = "682c06a0-de6a-54ab-a142-c8b1cf79cde6"
414 | version = "0.21.0"
415 |
416 | [[KernelDensity]]
417 | deps = ["Distributions", "FFTW", "Interpolations", "Optim", "StatsBase", "Test"]
418 | git-tree-sha1 = "c1048817fe5711f699abc8fabd47b1ac6ba4db04"
419 | uuid = "5ab0869b-81aa-558d-bb23-cbf5423bbe9b"
420 | version = "0.5.1"
421 |
422 | [[LAME_jll]]
423 | deps = ["Libdl", "Pkg"]
424 | git-tree-sha1 = "221cc8998b9060677448cbb6375f00032554c4fd"
425 | uuid = "c1c5ebd0-6772-5130-a774-d5fcae4a789d"
426 | version = "3.100.0+1"
427 |
428 | [[LIBLINEAR]]
429 | deps = ["DelimitedFiles", "Libdl", "SparseArrays", "Test"]
430 | git-tree-sha1 = "42cacc29d9b4ae77b6702c181bbfa58f14d8ef7a"
431 | uuid = "2d691ee1-e668-5016-a719-b2531b85e0f5"
432 | version = "0.5.1"
433 |
434 | [[LIBSVM]]
435 | deps = ["Compat", "LIBLINEAR", "Libdl", "ScikitLearnBase", "SparseArrays"]
436 | git-tree-sha1 = "05d574c6598bce023ba6f2d2aa99ffd4f8e00789"
437 | uuid = "b1bec4e5-fd48-53fe-b0cb-9723c09d164b"
438 | version = "0.4.0"
439 |
440 | [[LaTeXStrings]]
441 | git-tree-sha1 = "de44b395389b84fd681394d4e8d39ef14e3a2ea8"
442 | uuid = "b964fa9f-0449-5b57-a5c2-d3ea65f4040f"
443 | version = "1.1.0"
444 |
445 | [[LearnBase]]
446 | git-tree-sha1 = "a0d90569edd490b82fdc4dc078ea54a5a800d30a"
447 | uuid = "7f8f8fb0-2700-5f03-b4bd-41f8cfc144b6"
448 | version = "0.4.1"
449 |
450 | [[LibGit2]]
451 | deps = ["Printf"]
452 | uuid = "76f85450-5226-5b5a-8eaa-529ad045b433"
453 |
454 | [[LibVPX_jll]]
455 | deps = ["Libdl", "Pkg"]
456 | git-tree-sha1 = "e3549ca9bf35feb9d9d954f4c6a9032e92f46e7c"
457 | uuid = "dd192d2f-8180-539f-9fb4-cc70b1dcf69a"
458 | version = "1.8.1+1"
459 |
460 | [[Libdl]]
461 | uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb"
462 |
463 | [[Libiconv_jll]]
464 | deps = ["Libdl", "Pkg"]
465 | git-tree-sha1 = "c9d4035d7481bcdff2babf5a55525a818ef8ed8f"
466 | uuid = "94ce4f54-9a6c-5748-9c1c-f9c7231a4531"
467 | version = "1.16.0+5"
468 |
469 | [[LightGBM]]
470 | deps = ["Dates", "Libdl", "MLJModelInterface", "StatsBase"]
471 | git-tree-sha1 = "cae192532a16a84190935389dae1a3a9cdc92ce4"
472 | uuid = "7acf609c-83a4-11e9-1ffb-b912bcd3b04a"
473 | version = "0.3.1"
474 |
475 | [[LineSearches]]
476 | deps = ["LinearAlgebra", "NLSolversBase", "NaNMath", "Parameters", "Printf", "Test"]
477 | git-tree-sha1 = "54eb90e8dbe745d617c78dee1d6ae95c7f6f5779"
478 | uuid = "d3d80556-e9d4-5f37-9878-2ab0fcc64255"
479 | version = "7.0.1"
480 |
481 | [[LinearAlgebra]]
482 | deps = ["Libdl"]
483 | uuid = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
484 |
485 | [[LinearMaps]]
486 | deps = ["LinearAlgebra", "SparseArrays"]
487 | git-tree-sha1 = "e204a96dbb8d49fbca24086c586734435d7bf5b5"
488 | uuid = "7a12625a-238d-50fd-b39a-03d52299707e"
489 | version = "2.6.1"
490 |
491 | [[Literate]]
492 | deps = ["Base64", "JSON", "REPL"]
493 | git-tree-sha1 = "422133037d6dc5df9f9b97c2cb81fcd9e35ddffe"
494 | uuid = "98b081ad-f1c9-55d3-8b20-4c87d4299306"
495 | version = "2.5.0"
496 |
497 | [[LiveServer]]
498 | deps = ["Crayons", "Documenter", "FileWatching", "HTTP", "Pkg", "Sockets", "Test"]
499 | git-tree-sha1 = "452307c337d1f625e7475d3e1a028cc5f1ca2fcb"
500 | uuid = "16fef848-5104-11e9-1b77-fb7a48bbb589"
501 | version = "0.5.0"
502 |
503 | [[Logging]]
504 | uuid = "56ddb016-857b-54e1-b83d-db4d58db5568"
505 |
506 | [[LossFunctions]]
507 | deps = ["LearnBase", "Markdown", "RecipesBase", "SparseArrays", "StatsBase"]
508 | git-tree-sha1 = "3cd347266e394a066ca7f17bd8ff589ff5ce1d35"
509 | uuid = "30fc2ffe-d236-52d8-8643-a9d8f7c094a7"
510 | version = "0.6.2"
511 |
512 | [[MKL_jll]]
513 | deps = ["IntelOpenMP_jll", "Libdl", "Pkg"]
514 | git-tree-sha1 = "0ce9a7fa68c70cf83c49d05d2c04d91b47404b08"
515 | uuid = "856f044c-d86e-5d09-b602-aeab76dc8ba7"
516 | version = "2020.1.216+0"
517 |
518 | [[MLJ]]
519 | deps = ["CategoricalArrays", "ComputationalResources", "Distributed", "Distributions", "LinearAlgebra", "MLJBase", "MLJModels", "MLJScientificTypes", "MLJTuning", "Pkg", "ProgressMeter", "Random", "Statistics", "StatsBase", "Tables"]
520 | git-tree-sha1 = "724663b1628522d83cb58189e57819f82d41063f"
521 | uuid = "add582a8-e3ab-11e8-2d5e-e98b27df1bc7"
522 | version = "0.11.6"
523 |
524 | [[MLJBase]]
525 | deps = ["CategoricalArrays", "ComputationalResources", "Dates", "DelimitedFiles", "Distributed", "Distributions", "HTTP", "InteractiveUtils", "InvertedIndices", "JLSO", "JSON", "LinearAlgebra", "LossFunctions", "MLJModelInterface", "MLJScientificTypes", "Missings", "OrderedCollections", "Parameters", "PrettyTables", "ProgressMeter", "Random", "ScientificTypes", "Statistics", "StatsBase", "Tables"]
526 | git-tree-sha1 = "d8ba2063ffaaa7f0fe91ea5455a7bf838c1424ac"
527 | uuid = "a7f614a8-145f-11e9-1d2a-a57a1082229d"
528 | version = "0.13.10"
529 |
530 | [[MLJLinearModels]]
531 | deps = ["DocStringExtensions", "IterativeSolvers", "LinearAlgebra", "LinearMaps", "MLJModelInterface", "Optim", "Parameters"]
532 | git-tree-sha1 = "01e7a3dc5c07982315c9163bbc3ad9d08811ea8e"
533 | uuid = "6ee0df7b-362f-4a72-a706-9e79364fb692"
534 | version = "0.5.0"
535 |
536 | [[MLJModelInterface]]
537 | deps = ["Random", "ScientificTypes"]
538 | git-tree-sha1 = "b02b13fde7b0dc301adc070d650405aa4909e657"
539 | uuid = "e80e1ace-859a-464e-9ed9-23947d8ae3ea"
540 | version = "0.3.0"
541 |
542 | [[MLJModels]]
543 | deps = ["CategoricalArrays", "Dates", "Distances", "Distributions", "InteractiveUtils", "LinearAlgebra", "MLJBase", "MLJModelInterface", "MultivariateStats", "OrderedCollections", "Parameters", "Pkg", "Random", "Requires", "ScientificTypes", "Statistics", "StatsBase", "Tables"]
544 | git-tree-sha1 = "3a434db580e736e23643867cd7c7e3ccaeafb31d"
545 | uuid = "d491faf4-2d78-11e9-2867-c94bc002c0b7"
546 | version = "0.10.1"
547 |
548 | [[MLJScientificTypes]]
549 | deps = ["CategoricalArrays", "ColorTypes", "Dates", "PrettyTables", "ScientificTypes", "Tables"]
550 | git-tree-sha1 = "c85856fca1302f7fd7d46dd72db7cf43d93777d9"
551 | uuid = "2e2323e0-db8b-457b-ae0d-bdfb3bc63afd"
552 | version = "0.2.8"
553 |
554 | [[MLJScikitLearnInterface]]
555 | deps = ["MLJModelInterface", "PyCall", "ScikitLearn"]
556 | git-tree-sha1 = "9202b249509ec05fd8a5e71b278f42b491f4f324"
557 | uuid = "5ae90465-5518-4432-b9d2-8a1def2f0cab"
558 | version = "0.1.5"
559 |
560 | [[MLJTuning]]
561 | deps = ["ComputationalResources", "Distributed", "Distributions", "MLJBase", "MLJModelInterface", "ProgressMeter", "Random", "RecipesBase"]
562 | git-tree-sha1 = "f9aa8dafd3dc4b8d195aa1b5518188cfd3e181e1"
563 | uuid = "03970b2e-30c4-11ea-3135-d1576263f10f"
564 | version = "0.3.6"
565 |
566 | [[MacroTools]]
567 | deps = ["Markdown", "Random"]
568 | git-tree-sha1 = "f7d2e3f654af75f01ec49be82c231c382214223a"
569 | uuid = "1914dd2f-81c6-5fcd-8719-6d5c9610ff09"
570 | version = "0.5.5"
571 |
572 | [[Markdown]]
573 | deps = ["Base64"]
574 | uuid = "d6f4376e-aef5-505a-96c1-9c027394607a"
575 |
576 | [[MbedTLS]]
577 | deps = ["Dates", "MbedTLS_jll", "Random", "Sockets"]
578 | git-tree-sha1 = "426a6978b03a97ceb7ead77775a1da066343ec6e"
579 | uuid = "739be429-bea8-5141-9913-cc70e7f3736d"
580 | version = "1.0.2"
581 |
582 | [[MbedTLS_jll]]
583 | deps = ["Libdl", "Pkg"]
584 | git-tree-sha1 = "a0cb0d489819fa7ea5f9fa84c7e7eba19d8073af"
585 | uuid = "c8ffd9c3-330d-5841-b78e-0817d7145fa1"
586 | version = "2.16.6+1"
587 |
588 | [[Measures]]
589 | git-tree-sha1 = "e498ddeee6f9fdb4551ce855a46f54dbd900245f"
590 | uuid = "442fdcdd-2543-5da2-b0f3-8c86c306513e"
591 | version = "0.3.1"
592 |
593 | [[Memento]]
594 | deps = ["Dates", "Distributed", "JSON", "Serialization", "Sockets", "Syslogs", "Test", "TimeZones", "UUIDs"]
595 | git-tree-sha1 = "31921ad09307dd9ad693da3213a218152fadb8f2"
596 | uuid = "f28f55f0-a522-5efc-85c2-fe41dfb9b2d9"
597 | version = "1.1.0"
598 |
599 | [[Missings]]
600 | deps = ["DataAPI"]
601 | git-tree-sha1 = "de0a5ce9e5289f27df672ffabef4d1e5861247d5"
602 | uuid = "e1d29d7a-bbdc-5cf2-9ac0-f12de2c33e28"
603 | version = "0.4.3"
604 |
605 | [[Mmap]]
606 | uuid = "a63ad114-7e13-5084-954f-fe012c677804"
607 |
608 | [[Mocking]]
609 | deps = ["ExprTools"]
610 | git-tree-sha1 = "916b850daad0d46b8c71f65f719c49957e9513ed"
611 | uuid = "78c3b35d-d492-501b-9361-3d52fe80e533"
612 | version = "0.7.1"
613 |
614 | [[MultivariateStats]]
615 | deps = ["Arpack", "LinearAlgebra", "SparseArrays", "Statistics", "StatsBase"]
616 | git-tree-sha1 = "352fae519b447bf52e6de627b89f448bcd469e4e"
617 | uuid = "6f286f6a-111f-5878-ab1e-185364afe411"
618 | version = "0.7.0"
619 |
620 | [[NLSolversBase]]
621 | deps = ["DiffResults", "Distributed", "FiniteDiff", "ForwardDiff"]
622 | git-tree-sha1 = "7c4e66c47848562003250f28b579c584e55becc0"
623 | uuid = "d41bc354-129a-5804-8e4c-c37616107c6c"
624 | version = "7.6.1"
625 |
626 | [[NaNMath]]
627 | git-tree-sha1 = "c84c576296d0e2fbb3fc134d3e09086b3ea617cd"
628 | uuid = "77ba4419-2d1f-58cd-9bb1-8ffee604a2e3"
629 | version = "0.3.4"
630 |
631 | [[NearestNeighbors]]
632 | deps = ["Distances", "StaticArrays"]
633 | git-tree-sha1 = "8bc6180f328f3c0ea2663935db880d34c57d6eae"
634 | uuid = "b8a86587-4115-5ab1-83bc-aa920d37bbce"
635 | version = "0.4.4"
636 |
637 | [[NodeJS]]
638 | deps = ["Pkg"]
639 | git-tree-sha1 = "350ac618f41958e6e0f6b0d2005ae4547eb1b503"
640 | uuid = "2bd173c7-0d6d-553b-b6af-13a54713934c"
641 | version = "1.1.1"
642 |
643 | [[Observables]]
644 | git-tree-sha1 = "11832878355305984235a2e90d0e3737383c634c"
645 | uuid = "510215fc-4207-5dde-b226-833fc4488ee2"
646 | version = "0.3.1"
647 |
648 | [[OffsetArrays]]
649 | git-tree-sha1 = "4ba4cd84c88df8340da1c3e2d8dcb9d18dd1b53b"
650 | uuid = "6fe1bfb0-de20-5000-8ca7-80f57d26f881"
651 | version = "1.1.1"
652 |
653 | [[Ogg_jll]]
654 | deps = ["Libdl", "Pkg"]
655 | git-tree-sha1 = "59cf7a95bf5ac39feac80b796e0f39f9d69dc887"
656 | uuid = "e7412a2a-1a6e-54c0-be00-318e2571c051"
657 | version = "1.3.4+0"
658 |
659 | [[OpenBLAS_jll]]
660 | deps = ["CompilerSupportLibraries_jll", "Libdl", "Pkg"]
661 | git-tree-sha1 = "0c922fd9634e358622e333fc58de61f05a048492"
662 | uuid = "4536629a-c528-5b80-bd46-f80d51c5b363"
663 | version = "0.3.9+5"
664 |
665 | [[OpenSSL_jll]]
666 | deps = ["Libdl", "Pkg"]
667 | git-tree-sha1 = "7aaaded15bf393b5f34c2aad5b765c18d26cb495"
668 | uuid = "458c3c95-2e84-50aa-8efc-19380b2a3a95"
669 | version = "1.1.1+4"
670 |
671 | [[OpenSpecFun_jll]]
672 | deps = ["CompilerSupportLibraries_jll", "Libdl", "Pkg"]
673 | git-tree-sha1 = "d51c416559217d974a1113522d5919235ae67a87"
674 | uuid = "efe28fd5-8261-553b-a9e1-b2916fc3738e"
675 | version = "0.5.3+3"
676 |
677 | [[Optim]]
678 | deps = ["Compat", "FillArrays", "LineSearches", "LinearAlgebra", "NLSolversBase", "NaNMath", "Parameters", "PositiveFactorizations", "Printf", "SparseArrays", "StatsBase"]
679 | git-tree-sha1 = "33af70b64e8ce2f2b857e3d5de7b71f67715c121"
680 | uuid = "429524aa-4258-5aef-a3af-852621145aeb"
681 | version = "0.21.0"
682 |
683 | [[Opus_jll]]
684 | deps = ["Libdl", "Pkg"]
685 | git-tree-sha1 = "002c18f222a542907e16c83c64a1338992da7e2c"
686 | uuid = "91d4177d-7536-5919-b921-800302f37372"
687 | version = "1.3.1+1"
688 |
689 | [[OrderedCollections]]
690 | git-tree-sha1 = "293b70ac1780f9584c89268a6e2a560d938a7065"
691 | uuid = "bac558e1-5e72-5ebc-8fee-abe8a469f55d"
692 | version = "1.3.0"
693 |
694 | [[PDMats]]
695 | deps = ["Arpack", "LinearAlgebra", "SparseArrays", "SuiteSparse", "Test"]
696 | git-tree-sha1 = "2fc6f50ddd959e462f0a2dbc802ddf2a539c6e35"
697 | uuid = "90014a1f-27ba-587c-ab20-58faa44d9150"
698 | version = "0.9.12"
699 |
700 | [[Parameters]]
701 | deps = ["OrderedCollections", "UnPack"]
702 | git-tree-sha1 = "38b2e970043613c187bd56a995fe2e551821eb4a"
703 | uuid = "d96e819e-fc66-5662-9728-84c9c7592b0a"
704 | version = "0.12.1"
705 |
706 | [[Parsers]]
707 | deps = ["Dates", "Test"]
708 | git-tree-sha1 = "10134f2ee0b1978ae7752c41306e131a684e1f06"
709 | uuid = "69de0a69-1ddd-5017-9359-2bf0b02dc9f0"
710 | version = "1.0.7"
711 |
712 | [[Pkg]]
713 | deps = ["Dates", "LibGit2", "Libdl", "Logging", "Markdown", "Printf", "REPL", "Random", "SHA", "UUIDs"]
714 | uuid = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
715 |
716 | [[PlotThemes]]
717 | deps = ["PlotUtils", "Requires", "Statistics"]
718 | git-tree-sha1 = "c6f5ea535551b3b16835134697f0c65d06c94b91"
719 | uuid = "ccf2f8ad-2431-5c83-bf29-c5338b663b6a"
720 | version = "2.0.0"
721 |
722 | [[PlotUtils]]
723 | deps = ["ColorSchemes", "Colors", "Dates", "Printf", "Random", "Reexport", "Statistics"]
724 | git-tree-sha1 = "e18e0e51ff07bf92bb7e06dcb9c082a4e125e20c"
725 | uuid = "995b91a9-d308-5afd-9ec6-746e21dbc043"
726 | version = "1.0.5"
727 |
728 | [[Plots]]
729 | deps = ["Base64", "Contour", "Dates", "FFMPEG", "FixedPointNumbers", "GR", "GeometryBasics", "GeometryTypes", "JSON", "LinearAlgebra", "Measures", "NaNMath", "PlotThemes", "PlotUtils", "Printf", "REPL", "Random", "RecipesBase", "RecipesPipeline", "Reexport", "Requires", "Showoff", "SparseArrays", "Statistics", "StatsBase", "UUIDs"]
730 | git-tree-sha1 = "ba747739872a67bc1a8078aec3313bde075b3fb0"
731 | uuid = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
732 | version = "1.5.5"
733 |
734 | [[PooledArrays]]
735 | deps = ["DataAPI"]
736 | git-tree-sha1 = "b1333d4eced1826e15adbdf01a4ecaccca9d353c"
737 | uuid = "2dfb63ee-cc39-5dd5-95bd-886bf059d720"
738 | version = "0.5.3"
739 |
740 | [[PositiveFactorizations]]
741 | deps = ["LinearAlgebra", "Test"]
742 | git-tree-sha1 = "127c47b91990c101ee3752291c4f45640eeb03d1"
743 | uuid = "85a6dd25-e78a-55b7-8502-1745935b8125"
744 | version = "0.2.3"
745 |
746 | [[PrettyPrinting]]
747 | git-tree-sha1 = "cb3bd68c8e0fabf6e13c10bdf11713068e748a79"
748 | uuid = "54e16d92-306c-5ea0-a30b-337be88ac337"
749 | version = "0.2.0"
750 |
751 | [[PrettyTables]]
752 | deps = ["Crayons", "Formatting", "Parameters", "Reexport", "Tables"]
753 | git-tree-sha1 = "8458dc04a493ae5c2fed3796c1d3117972c69694"
754 | uuid = "08abe8d2-0d0c-5749-adfa-8a2ac140af0d"
755 | version = "0.9.1"
756 |
757 | [[Printf]]
758 | deps = ["Unicode"]
759 | uuid = "de0858da-6303-5e67-8744-51eddeeeb8d7"
760 |
761 | [[ProgressMeter]]
762 | deps = ["Distributed", "Printf"]
763 | git-tree-sha1 = "2de4cddc0ceeddafb6b143b5b6cd9c659b64507c"
764 | uuid = "92933f4c-e287-5a05-a399-4b506db050ca"
765 | version = "1.3.2"
766 |
767 | [[PyCall]]
768 | deps = ["Conda", "Dates", "Libdl", "LinearAlgebra", "MacroTools", "Serialization", "VersionParsing"]
769 | git-tree-sha1 = "3a3fdb9000d35958c9ba2323ca7c4958901f115d"
770 | uuid = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"
771 | version = "1.91.4"
772 |
773 | [[PyPlot]]
774 | deps = ["Colors", "LaTeXStrings", "PyCall", "Sockets", "Test", "VersionParsing"]
775 | git-tree-sha1 = "67dde2482fe1a72ef62ed93f8c239f947638e5a2"
776 | uuid = "d330b81b-6aea-500a-939a-2ce795aea3ee"
777 | version = "2.9.0"
778 |
779 | [[QuadGK]]
780 | deps = ["DataStructures", "LinearAlgebra"]
781 | git-tree-sha1 = "0ab8a09d4478ebeb99a706ecbf8634a65077ccdc"
782 | uuid = "1fd47b50-473d-5c70-9696-f719f8f3bcdc"
783 | version = "2.4.0"
784 |
785 | [[RData]]
786 | deps = ["CategoricalArrays", "CodecZlib", "DataFrames", "Dates", "FileIO", "Requires", "TimeZones", "Unicode"]
787 | git-tree-sha1 = "10693c581956334a368c26b7c544e406c4c94385"
788 | uuid = "df47a6cb-8c03-5eed-afd8-b6050d6c41da"
789 | version = "0.7.2"
790 |
791 | [[RDatasets]]
792 | deps = ["CSV", "CodecZlib", "DataFrames", "FileIO", "Printf", "RData", "Reexport"]
793 | git-tree-sha1 = "511854268c47438216a7640341ad4ce14b3463bb"
794 | uuid = "ce6b1742-4840-55fa-b093-852dadbb1d8b"
795 | version = "0.6.9"
796 |
797 | [[REPL]]
798 | deps = ["InteractiveUtils", "Markdown", "Sockets"]
799 | uuid = "3fa0cd96-eef1-5676-8a61-b3b8758bbffb"
800 |
801 | [[Random]]
802 | deps = ["Serialization"]
803 | uuid = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
804 |
805 | [[Ratios]]
806 | git-tree-sha1 = "37d210f612d70f3f7d57d488cb3b6eff56ad4e41"
807 | uuid = "c84ed2f1-dad5-54f0-aa8e-dbefe2724439"
808 | version = "0.4.0"
809 |
810 | [[RecipesBase]]
811 | git-tree-sha1 = "54f8ceb165a0f6d083f0d12cb4996f5367c6edbc"
812 | uuid = "3cdcf5f2-1ef4-517c-9805-6587b60abb01"
813 | version = "1.0.1"
814 |
815 | [[RecipesPipeline]]
816 | deps = ["Dates", "PlotUtils", "RecipesBase"]
817 | git-tree-sha1 = "d2a58b8291d1c0abae6a91489973f8a92bf5c04a"
818 | uuid = "01d81517-befc-4cb6-b9ec-a95719d0359c"
819 | version = "0.1.11"
820 |
821 | [[Reexport]]
822 | deps = ["Pkg"]
823 | git-tree-sha1 = "7b1d07f411bc8ddb7977ec7f377b97b158514fe0"
824 | uuid = "189a3867-3050-52da-a836-e630ba90ab69"
825 | version = "0.2.0"
826 |
827 | [[Requires]]
828 | deps = ["UUIDs"]
829 | git-tree-sha1 = "d37400976e98018ee840e0ca4f9d20baa231dc6b"
830 | uuid = "ae029012-a4dd-5104-9daa-d747884805df"
831 | version = "1.0.1"
832 |
833 | [[Rmath]]
834 | deps = ["Random", "Rmath_jll"]
835 | git-tree-sha1 = "86c5647b565873641538d8f812c04e4c9dbeb370"
836 | uuid = "79098fc4-a85e-5d69-aa6a-4863f24498fa"
837 | version = "0.6.1"
838 |
839 | [[Rmath_jll]]
840 | deps = ["Libdl", "Pkg"]
841 | git-tree-sha1 = "d76185aa1f421306dec73c057aa384bad74188f0"
842 | uuid = "f50d1b31-88e8-58de-be2c-1cc44531875f"
843 | version = "0.2.2+1"
844 |
845 | [[SHA]]
846 | uuid = "ea8e919c-243c-51af-8825-aaa63cd721ce"
847 |
848 | [[ScientificTypes]]
849 | git-tree-sha1 = "1a9f881c800ea009fb7f8b5274f04e4e8a5faef8"
850 | uuid = "321657f4-b219-11e9-178b-2701a2544e81"
851 | version = "0.8.0"
852 |
853 | [[ScikitLearn]]
854 | deps = ["Compat", "Conda", "DataFrames", "Distributed", "IterTools", "LinearAlgebra", "MacroTools", "Parameters", "Printf", "PyCall", "Random", "ScikitLearnBase", "SparseArrays", "StatsBase", "VersionParsing"]
855 | git-tree-sha1 = "b2dbb141575879beb3ad771fb0314a22617586d3"
856 | uuid = "3646fa90-6ef7-5e7e-9f22-8aca16db6324"
857 | version = "0.6.2"
858 |
859 | [[ScikitLearnBase]]
860 | deps = ["LinearAlgebra", "Random", "Statistics"]
861 | git-tree-sha1 = "7877e55c1523a4b336b433da39c8e8c08d2f221f"
862 | uuid = "6e75b9c4-186b-50bd-896f-2d2496a4843e"
863 | version = "0.5.0"
864 |
865 | [[Serialization]]
866 | uuid = "9e88b42a-f829-5b0c-bbe9-9e923198166b"
867 |
868 | [[SharedArrays]]
869 | deps = ["Distributed", "Mmap", "Random", "Serialization"]
870 | uuid = "1a1011a3-84de-559e-8e89-a11a2f7dc383"
871 |
872 | [[ShiftedArrays]]
873 | git-tree-sha1 = "22395afdcf37d6709a5a0766cc4a5ca52cb85ea0"
874 | uuid = "1277b4bf-5013-50f5-be3d-901d8477a67a"
875 | version = "1.0.0"
876 |
877 | [[Showoff]]
878 | deps = ["Dates"]
879 | git-tree-sha1 = "e032c9df551fb23c9f98ae1064de074111b7bc39"
880 | uuid = "992d4aef-0814-514b-bc4d-f2e9a6c4116f"
881 | version = "0.3.1"
882 |
883 | [[Sockets]]
884 | uuid = "6462fe0b-24de-5631-8697-dd941f90decc"
885 |
886 | [[SortingAlgorithms]]
887 | deps = ["DataStructures", "Random", "Test"]
888 | git-tree-sha1 = "03f5898c9959f8115e30bc7226ada7d0df554ddd"
889 | uuid = "a2af1166-a08f-5f64-846c-94a0d3cef48c"
890 | version = "0.3.1"
891 |
892 | [[SparseArrays]]
893 | deps = ["LinearAlgebra", "Random"]
894 | uuid = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
895 |
896 | [[SpecialFunctions]]
897 | deps = ["OpenSpecFun_jll"]
898 | git-tree-sha1 = "d8d8b8a9f4119829410ecd706da4cc8594a1e020"
899 | uuid = "276daf66-3868-5448-9aa4-cd146d93841b"
900 | version = "0.10.3"
901 |
902 | [[StableRNGs]]
903 | deps = ["Random", "Test"]
904 | git-tree-sha1 = "705f8782b1d532c6db75e0a986fb848a629f971a"
905 | uuid = "860ef19b-820b-49d6-a774-d7a799459cd3"
906 | version = "0.1.1"
907 |
908 | [[StaticArrays]]
909 | deps = ["LinearAlgebra", "Random", "Statistics"]
910 | git-tree-sha1 = "016d1e1a00fabc556473b07161da3d39726ded35"
911 | uuid = "90137ffa-7385-5640-81b9-e52037218182"
912 | version = "0.12.4"
913 |
914 | [[Statistics]]
915 | deps = ["LinearAlgebra", "SparseArrays"]
916 | uuid = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
917 |
918 | [[StatsBase]]
919 | deps = ["DataAPI", "DataStructures", "LinearAlgebra", "Missings", "Printf", "Random", "SortingAlgorithms", "SparseArrays", "Statistics"]
920 | git-tree-sha1 = "a6102b1f364befdb05746f386b67c6b7e3262c45"
921 | uuid = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
922 | version = "0.33.0"
923 |
924 | [[StatsFuns]]
925 | deps = ["Rmath", "SpecialFunctions"]
926 | git-tree-sha1 = "04a5a8e6ab87966b43f247920eab053fd5fdc925"
927 | uuid = "4c63d2b9-4356-54db-8cca-17b64c39e42c"
928 | version = "0.9.5"
929 |
930 | [[StatsModels]]
931 | deps = ["DataAPI", "DataStructures", "Distributions", "LinearAlgebra", "Printf", "ShiftedArrays", "SparseArrays", "StatsBase", "Tables"]
932 | git-tree-sha1 = "b79969dac368d8a61515b861b15d0e691e0bff96"
933 | uuid = "3eaba693-59b7-5ba5-a881-562e759f1c8d"
934 | version = "0.6.12"
935 |
936 | [[StatsPlots]]
937 | deps = ["Clustering", "DataStructures", "DataValues", "Distributions", "Interpolations", "KernelDensity", "MultivariateStats", "Observables", "Plots", "RecipesBase", "RecipesPipeline", "Reexport", "StatsBase", "TableOperations", "Tables", "Widgets"]
938 | git-tree-sha1 = "b9b7fff81f573465fcac4685df1497d968537a9e"
939 | uuid = "f3b207a7-027a-5e70-b257-86293d7955fd"
940 | version = "0.14.6"
941 |
942 | [[StructArrays]]
943 | deps = ["Adapt", "DataAPI", "Tables"]
944 | git-tree-sha1 = "8099ed9fb90b6e754d6ba8c6ed8670f010eadca0"
945 | uuid = "09ab397b-f2b6-538f-b94a-2f83cf4a842a"
946 | version = "0.4.4"
947 |
948 | [[SuiteSparse]]
949 | deps = ["Libdl", "LinearAlgebra", "Serialization", "SparseArrays"]
950 | uuid = "4607b0f0-06f3-5cda-b6b1-a6196a1729e9"
951 |
952 | [[Syslogs]]
953 | deps = ["Printf", "Sockets"]
954 | git-tree-sha1 = "46badfcc7c6e74535cc7d833a91f4ac4f805f86d"
955 | uuid = "cea106d9-e007-5e6c-ad93-58fe2094e9c4"
956 | version = "0.3.0"
957 |
958 | [[TableOperations]]
959 | deps = ["Tables", "Test"]
960 | git-tree-sha1 = "208630a14884abd110a8f8008b0882f0d0f5632c"
961 | uuid = "ab02a1b2-a7df-11e8-156e-fb1833f50b87"
962 | version = "0.2.1"
963 |
964 | [[TableTraits]]
965 | deps = ["IteratorInterfaceExtensions"]
966 | git-tree-sha1 = "b1ad568ba658d8cbb3b892ed5380a6f3e781a81e"
967 | uuid = "3783bdb8-4a98-5b6b-af9a-565f29a5fe9c"
968 | version = "1.0.0"
969 |
970 | [[Tables]]
971 | deps = ["DataAPI", "DataValueInterfaces", "IteratorInterfaceExtensions", "LinearAlgebra", "TableTraits", "Test"]
972 | git-tree-sha1 = "c45dcc27331febabc20d86cb3974ef095257dcf3"
973 | uuid = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"
974 | version = "1.0.4"
975 |
976 | [[Test]]
977 | deps = ["Distributed", "InteractiveUtils", "Logging", "Random"]
978 | uuid = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
979 |
980 | [[TimeZones]]
981 | deps = ["Dates", "EzXML", "Mocking", "Pkg", "Printf", "RecipesBase", "Serialization", "Unicode"]
982 | git-tree-sha1 = "fc9deaf6636c12c564a9eb7c110eff469eec2efa"
983 | uuid = "f269a46b-ccf7-5d73-abea-4c690281aa53"
984 | version = "1.3.0"
985 |
986 | [[TranscodingStreams]]
987 | deps = ["Random", "Test"]
988 | git-tree-sha1 = "7c53c35547de1c5b9d46a4797cf6d8253807108c"
989 | uuid = "3bb67fe8-82b1-5028-8e26-92a6c54297fa"
990 | version = "0.9.5"
991 |
992 | [[UUIDs]]
993 | deps = ["Random", "SHA"]
994 | uuid = "cf7118a7-6976-5b1a-9a39-7adc72f591a4"
995 |
996 | [[UnPack]]
997 | git-tree-sha1 = "d4bfa022cd30df012700cf380af2141961bb3bfb"
998 | uuid = "3a884ed6-31ef-47d7-9d2a-63182c4928ed"
999 | version = "1.0.1"
1000 |
1001 | [[Unicode]]
1002 | uuid = "4ec0a83e-493e-50e2-b9ac-8f72acf5a8f5"
1003 |
1004 | [[UrlDownload]]
1005 | deps = ["HTTP", "ProgressMeter"]
1006 | git-tree-sha1 = "5f4a56e15ed7c4e37d35cd30b82ecc2fb28a0f5d"
1007 | uuid = "856ac37a-3032-4c1c-9122-f86d88358c8b"
1008 | version = "0.3.0"
1009 |
1010 | [[VersionParsing]]
1011 | git-tree-sha1 = "80229be1f670524750d905f8fc8148e5a8c4537f"
1012 | uuid = "81def892-9a0e-5fdd-b105-ffc91e053289"
1013 | version = "1.2.0"
1014 |
1015 | [[WeakRefStrings]]
1016 | deps = ["DataAPI", "Random", "Test"]
1017 | git-tree-sha1 = "28807f85197eaad3cbd2330386fac1dcb9e7e11d"
1018 | uuid = "ea10d353-3f73-51f8-a26c-33c1cb351aa5"
1019 | version = "0.6.2"
1020 |
1021 | [[Widgets]]
1022 | deps = ["Colors", "Dates", "Observables", "OrderedCollections"]
1023 | git-tree-sha1 = "fc0feda91b3fef7fe6948ee09bb628f882b49ca4"
1024 | uuid = "cc8bc4a8-27d6-5769-a93b-9d913e69aa62"
1025 | version = "0.6.2"
1026 |
1027 | [[WoodburyMatrices]]
1028 | deps = ["LinearAlgebra", "SparseArrays"]
1029 | git-tree-sha1 = "28ffe06d28b1ba8fdb2f36ec7bb079fac81bac0d"
1030 | uuid = "efce3f68-66dc-5838-9240-27a6d6f5f9b6"
1031 | version = "0.5.2"
1032 |
1033 | [[XGBoost]]
1034 | deps = ["Libdl", "Printf", "Random", "SparseArrays", "Statistics", "Test", "XGBoost_jll"]
1035 | git-tree-sha1 = "8a692f817f1a6c15ef4913a0ffefa6163117f43d"
1036 | uuid = "009559a3-9522-5dbb-924b-0b6ed2b22bb9"
1037 | version = "1.1.1"
1038 |
1039 | [[XGBoost_jll]]
1040 | deps = ["CompilerSupportLibraries_jll", "Libdl", "Pkg"]
1041 | git-tree-sha1 = "72c0d8bfbb56856c5f25668b72247ec18bbf5579"
1042 | uuid = "a5c6f535-4255-5ca2-a466-0e519f119c46"
1043 | version = "1.1.1+0"
1044 |
1045 | [[XML2_jll]]
1046 | deps = ["Libdl", "Libiconv_jll", "Pkg", "Zlib_jll"]
1047 | git-tree-sha1 = "432d91f45e950f2f2bda5c0f4e2b938c14493af9"
1048 | uuid = "02c8fc9c-b97f-50b9-bbe4-9be30ff0a78a"
1049 | version = "2.9.10+1"
1050 |
1051 | [[Zlib_jll]]
1052 | deps = ["Libdl", "Pkg"]
1053 | git-tree-sha1 = "622d8b6dc0c7e8029f17127703de9819134d1b71"
1054 | uuid = "83775a58-1f1d-513f-b197-d71354ab007a"
1055 | version = "1.2.11+14"
1056 |
1057 | [[libass_jll]]
1058 | deps = ["Bzip2_jll", "FreeType2_jll", "FriBidi_jll", "Libdl", "Pkg", "Zlib_jll"]
1059 | git-tree-sha1 = "027a304b2a90de84f690949a21f94e5ae0f92c73"
1060 | uuid = "0ac62f75-1d6f-5e53-bd7c-93b484bb37c0"
1061 | version = "0.14.0+2"
1062 |
1063 | [[libfdk_aac_jll]]
1064 | deps = ["Libdl", "Pkg"]
1065 | git-tree-sha1 = "480c7ed04f68ea3edd4c757f5db5b6a0a4e0bd99"
1066 | uuid = "f638f0a6-7fb0-5443-88ba-1cc74229b280"
1067 | version = "0.1.6+2"
1068 |
1069 | [[libvorbis_jll]]
1070 | deps = ["Libdl", "Ogg_jll", "Pkg"]
1071 | git-tree-sha1 = "6a66f65b5275dfa799036c8a3a26616a0a271c4a"
1072 | uuid = "f27f6e37-5d2b-51aa-960f-b287f2bc3b7a"
1073 | version = "1.3.6+4"
1074 |
1075 | [[x264_jll]]
1076 | deps = ["Libdl", "Pkg"]
1077 | git-tree-sha1 = "d89346fe63a6465a9f44e958ac0e3d366af90b74"
1078 | uuid = "1270edf5-f2f9-52d2-97e9-ab00b5d0237a"
1079 | version = "2019.5.25+2"
1080 |
1081 | [[x265_jll]]
1082 | deps = ["Libdl", "Pkg"]
1083 | git-tree-sha1 = "61324ad346b00a6e541896b94201c9426591e43a"
1084 | uuid = "dfaa095f-4041-5dcd-9319-2fabd8486b76"
1085 | version = "3.0.0+1"
1086 |
--------------------------------------------------------------------------------
/data/src/Project.toml:
--------------------------------------------------------------------------------
1 | name = "DataScienceTutorials"
2 | uuid = "b22f6415-6e67-485c-b34d-7995e604d9c9"
3 | version = "0.4.1"
4 |
5 | [deps]
6 | CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
7 | CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597"
8 | Clustering = "aaaa29a8-35af-508c-8bc3-b662a17a0fe5"
9 | DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
10 | Dates = "ade2ca70-3891-5945-98fb-dc099432e06a"
11 | DecisionTree = "7806a523-6efd-50cb-b5f6-3fa6f1930dbb"
12 | Distances = "b4f34e82-e78d-54a5-968a-f98e89d6e8f7"
13 | Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
14 | EvoTrees = "f6006082-12f8-11e9-0c9c-0d5d367ab1e5"
15 | Franklin = "713c75ef-9fc9-4b05-94a9-213340da978e"
16 | GLM = "38e38edf-8417-5370-95a0-9cbb8c7f171a"
17 | HTTP = "cd3eb016-35fb-5094-929b-558a96fad6f3"
18 | LIBSVM = "b1bec4e5-fd48-53fe-b0cb-9723c09d164b"
19 | LightGBM = "7acf609c-83a4-11e9-1ffb-b912bcd3b04a"
20 | LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
21 | LossFunctions = "30fc2ffe-d236-52d8-8643-a9d8f7c094a7"
22 | MLJ = "add582a8-e3ab-11e8-2d5e-e98b27df1bc7"
23 | MLJBase = "a7f614a8-145f-11e9-1d2a-a57a1082229d"
24 | MLJLinearModels = "6ee0df7b-362f-4a72-a706-9e79364fb692"
25 | MLJModelInterface = "e80e1ace-859a-464e-9ed9-23947d8ae3ea"
26 | MLJModels = "d491faf4-2d78-11e9-2867-c94bc002c0b7"
27 | MLJScientificTypes = "2e2323e0-db8b-457b-ae0d-bdfb3bc63afd"
28 | MLJScikitLearnInterface = "5ae90465-5518-4432-b9d2-8a1def2f0cab"
29 | MultivariateStats = "6f286f6a-111f-5878-ab1e-185364afe411"
30 | NearestNeighbors = "b8a86587-4115-5ab1-83bc-aa920d37bbce"
31 | PrettyPrinting = "54e16d92-306c-5ea0-a30b-337be88ac337"
32 | PyPlot = "d330b81b-6aea-500a-939a-2ce795aea3ee"
33 | RDatasets = "ce6b1742-4840-55fa-b093-852dadbb1d8b"
34 | Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
35 | ScikitLearn = "3646fa90-6ef7-5e7e-9f22-8aca16db6324"
36 | StableRNGs = "860ef19b-820b-49d6-a774-d7a799459cd3"
37 | Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
38 | StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
39 | StatsPlots = "f3b207a7-027a-5e70-b257-86293d7955fd"
40 | Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"
41 | UrlDownload = "856ac37a-3032-4c1c-9122-f86d88358c8b"
42 | XGBoost = "009559a3-9522-5dbb-924b-0b6ed2b22bb9"
43 |
44 | [compat]
45 | MLJ = "0.11"
46 | MLJBase = "0.13"
47 | MLJLinearModels = "0.5"
48 | MLJModelInterface = "0.3"
49 | MLJModels = "0.10"
50 | MLJScientificTypes = "0.2"
51 |
52 | [extras]
53 | Logging = "56ddb016-857b-54e1-b83d-db4d58db5568"
54 | Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
55 |
56 | [targets]
57 | test = ["Test", "Logging"]
58 |
--------------------------------------------------------------------------------
/data/src/convert_ames.jl:
--------------------------------------------------------------------------------
1 | using Pkg
2 | Pkg.activate(joinpath(@__DIR__, "convert_ames"))
3 | Pkg.instantiate()
4 |
5 | using DataFrames, CSV, MLJBase, CategoricalArrays
6 |
7 | df = CSV.read(joinpath(@__DIR__, "reduced_ames.csv"))
8 |
9 | schema(df)
10 |
11 | price = df.target
12 | quality = df.OverallQual
13 | area1 = map(df.GrLivArea) do a round(Int, a) end
14 | area2 = map(df.x1stFlrSF) do a round(Int, a) end
15 | area3 = map(df.TotalBsmtSF) do a round(Int, a) end
16 | area4 = map(df.BsmtFinSF1) do a round(Int, a) end
17 | area5 = map(df.GarageArea) do a round(Int, a) end
18 | lot_area = map(df.LotArea) do a round(Int, a) end
19 | garage_cars = map(df.GarageCars) do a round(Int, a) end
20 | suburb = df.Neighborhood
21 | council_code = map(df.MSSubClass) do a parse(Int, a[2:end]) end
22 | year_built = map(df.YearBuilt) do a round(Int, a) end
23 | year_upgraded = map(df.YearRemodAdd) do a round(Int, a) end
24 | zone = df.MSSubClass
25 |
26 | df2 = DataFrame(price=price,
27 | area1=area1,
28 | area2=area2,
29 | area3=area3,
30 | area4=area4,
31 | area5=area5,
32 | lot_area=lot_area,
33 | year_built=year_built,
34 | year_upgraded=year_upgraded,
35 | quality=quality,
36 | garage_cars=garage_cars,
37 | suburb=suburb,
38 | council_code=council_code,
39 | zone=zone)
40 |
41 | CSV.write(joinpath(@__DIR__, "ames.csv"), df2)  # write the converted table df2, not the raw input
42 |
43 |
--------------------------------------------------------------------------------
/data/src/convert_ames/Manifest.toml:
--------------------------------------------------------------------------------
1 | # This file is machine-generated - editing it directly is not advised
2 |
3 | [[Arpack]]
4 | deps = ["Arpack_jll", "Libdl", "LinearAlgebra"]
5 | git-tree-sha1 = "2ff92b71ba1747c5fdd541f8fc87736d82f40ec9"
6 | uuid = "7d9fca2a-8960-54d3-9f78-7d1dccf2cb97"
7 | version = "0.4.0"
8 |
9 | [[Arpack_jll]]
10 | deps = ["Libdl", "OpenBLAS_jll", "Pkg"]
11 | git-tree-sha1 = "e214a9b9bd1b4e1b4f15b22c0994862b66af7ff7"
12 | uuid = "68821587-b530-5797-8361-c406ea357684"
13 | version = "3.5.0+3"
14 |
15 | [[BSON]]
16 | git-tree-sha1 = "dd36d7cf3d185eeaaf64db902c15174b22f5dafb"
17 | uuid = "fbb218c0-5317-5bc6-957e-2ee96dd4b1f0"
18 | version = "0.2.6"
19 |
20 | [[Base64]]
21 | uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f"
22 |
23 | [[CSV]]
24 | deps = ["CategoricalArrays", "DataFrames", "Dates", "FilePathsBase", "Mmap", "Parsers", "PooledArrays", "Tables", "Unicode", "WeakRefStrings"]
25 | git-tree-sha1 = "52a8e60c7822f53d57e4403b7f2811e7e1bdd32b"
26 | uuid = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
27 | version = "0.6.2"
28 |
29 | [[CategoricalArrays]]
30 | deps = ["DataAPI", "Future", "JSON", "Missings", "Printf", "Statistics", "Unicode"]
31 | git-tree-sha1 = "a6c17353ee38ddab30e73dcfaa1107752de724ec"
32 | uuid = "324d7699-5711-5eae-9e2f-1d82baa6b597"
33 | version = "0.8.1"
34 |
35 | [[CodecZlib]]
36 | deps = ["TranscodingStreams", "Zlib_jll"]
37 | git-tree-sha1 = "ded953804d019afa9a3f98981d99b33e3db7b6da"
38 | uuid = "944b1d66-785c-5afd-91f1-9de20f533193"
39 | version = "0.7.0"
40 |
41 | [[ColorTypes]]
42 | deps = ["FixedPointNumbers", "Random"]
43 | git-tree-sha1 = "c73d9cfc2a9d8433dc77f5bff4bddf46b1d78c20"
44 | uuid = "3da002f7-5984-5a60-b8a6-cbb66c0b333f"
45 | version = "0.10.3"
46 |
47 | [[Compat]]
48 | deps = ["Base64", "Dates", "DelimitedFiles", "Distributed", "InteractiveUtils", "LibGit2", "Libdl", "LinearAlgebra", "Markdown", "Mmap", "Pkg", "Printf", "REPL", "Random", "SHA", "Serialization", "SharedArrays", "Sockets", "SparseArrays", "Statistics", "Test", "UUIDs", "Unicode"]
49 | git-tree-sha1 = "054993b6611376ddb40203e973e954fd9d1d1902"
50 | uuid = "34da2185-b29b-5c13-b0c7-acf172513d20"
51 | version = "3.12.0"
52 |
53 | [[CompilerSupportLibraries_jll]]
54 | deps = ["Libdl", "Pkg"]
55 | git-tree-sha1 = "7c4f882c41faa72118841185afc58a2eb00ef612"
56 | uuid = "e66e0078-7015-5450-92f7-15fbd957f2ae"
57 | version = "0.3.3+0"
58 |
59 | [[ComputationalResources]]
60 | git-tree-sha1 = "52cb3ec90e8a8bea0e62e275ba577ad0f74821f7"
61 | uuid = "ed09eef8-17a6-5b46-8889-db040fac31e3"
62 | version = "0.3.2"
63 |
64 | [[Crayons]]
65 | git-tree-sha1 = "9f3adcb26c79d6270eb678f3c61bf44cc6b7077e"
66 | uuid = "a8cc5b0e-0ffa-5ad4-8c14-923d3ee1735f"
67 | version = "4.0.2"
68 |
69 | [[DataAPI]]
70 | git-tree-sha1 = "176e23402d80e7743fc26c19c681bfb11246af32"
71 | uuid = "9a962f9c-6df0-11e9-0e5d-c546b8b5ee8a"
72 | version = "1.3.0"
73 |
74 | [[DataFrames]]
75 | deps = ["CategoricalArrays", "Compat", "DataAPI", "Future", "InvertedIndices", "IteratorInterfaceExtensions", "Missings", "PooledArrays", "Printf", "REPL", "Reexport", "SortingAlgorithms", "Statistics", "TableTraits", "Tables", "Unicode"]
76 | git-tree-sha1 = "02f08ae77249b7f6d4186b081a016fb7454c616f"
77 | uuid = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
78 | version = "0.21.2"
79 |
80 | [[DataStructures]]
81 | deps = ["InteractiveUtils", "OrderedCollections"]
82 | git-tree-sha1 = "be680f1ad03c0a03796aa3fda5a2180df7f83b46"
83 | uuid = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8"
84 | version = "0.17.18"
85 |
86 | [[DataValueInterfaces]]
87 | git-tree-sha1 = "bfc1187b79289637fa0ef6d4436ebdfe6905cbd6"
88 | uuid = "e2d170a0-9d28-54be-80f0-106bbe20a464"
89 | version = "1.0.0"
90 |
91 | [[Dates]]
92 | deps = ["Printf"]
93 | uuid = "ade2ca70-3891-5945-98fb-dc099432e06a"
94 |
95 | [[DelimitedFiles]]
96 | deps = ["Mmap"]
97 | uuid = "8bb1440f-4735-579b-a4ab-409b98df4dab"
98 |
99 | [[Distributed]]
100 | deps = ["Random", "Serialization", "Sockets"]
101 | uuid = "8ba89e20-285c-5b6f-9357-94700520ee1b"
102 |
103 | [[Distributions]]
104 | deps = ["FillArrays", "LinearAlgebra", "PDMats", "Printf", "QuadGK", "Random", "SpecialFunctions", "Statistics", "StatsBase", "StatsFuns"]
105 | git-tree-sha1 = "78c4c32a2357a00a0a7d614880f02c2c6e1ec73c"
106 | uuid = "31c24e10-a181-5473-b8eb-7969acd0382f"
107 | version = "0.23.4"
108 |
109 | [[ExprTools]]
110 | git-tree-sha1 = "6f0517056812fd6aa3af23d4b70d5325a2ae4e95"
111 | uuid = "e2ba6199-217a-4e67-a87a-7c52f15ade04"
112 | version = "0.1.1"
113 |
114 | [[EzXML]]
115 | deps = ["Printf", "XML2_jll"]
116 | git-tree-sha1 = "0fa3b52a04a4e210aeb1626def9c90df3ae65268"
117 | uuid = "8f5d6c58-4d21-5cfd-889c-e3ad7ee6a615"
118 | version = "1.1.0"
119 |
120 | [[FilePathsBase]]
121 | deps = ["Dates", "LinearAlgebra", "Printf", "Test", "UUIDs"]
122 | git-tree-sha1 = "923fd3b942a11712435682eaa95cc8518c428b2c"
123 | uuid = "48062228-2e41-5def-b9a4-89aafe57970f"
124 | version = "0.8.0"
125 |
126 | [[FillArrays]]
127 | deps = ["LinearAlgebra", "Random", "SparseArrays"]
128 | git-tree-sha1 = "44f561e293987ffc84272cd3d2b14b0b93123d63"
129 | uuid = "1a297f60-69ca-5386-bcde-b61e274b549b"
130 | version = "0.8.10"
131 |
132 | [[FixedPointNumbers]]
133 | git-tree-sha1 = "3ba9ea634d4c8b289d590403b4a06f8e227a6238"
134 | uuid = "53c48c17-4a7d-5ca2-90c5-79b7896eea93"
135 | version = "0.8.0"
136 |
137 | [[Formatting]]
138 | deps = ["Printf"]
139 | git-tree-sha1 = "a0c901c29c0e7c763342751c0a94211d56c0de5c"
140 | uuid = "59287772-0a20-5a39-b81b-1366585eb4c0"
141 | version = "0.4.1"
142 |
143 | [[Future]]
144 | deps = ["Random"]
145 | uuid = "9fa8497b-333b-5362-9e8d-4d0656e87820"
146 |
147 | [[HTTP]]
148 | deps = ["Base64", "Dates", "IniFile", "MbedTLS", "Sockets"]
149 | git-tree-sha1 = "ec87d5e2acbe1693789efbbe14f5ea7525758f71"
150 | uuid = "cd3eb016-35fb-5094-929b-558a96fad6f3"
151 | version = "0.8.15"
152 |
153 | [[IniFile]]
154 | deps = ["Test"]
155 | git-tree-sha1 = "098e4d2c533924c921f9f9847274f2ad89e018b8"
156 | uuid = "83e8ac13-25f8-5344-8a64-a9f2b223428f"
157 | version = "0.5.0"
158 |
159 | [[InteractiveUtils]]
160 | deps = ["Markdown"]
161 | uuid = "b77e0a4c-d291-57a0-90e8-8db25a27a240"
162 |
163 | [[InvertedIndices]]
164 | deps = ["Test"]
165 | git-tree-sha1 = "15732c475062348b0165684ffe28e85ea8396afc"
166 | uuid = "41ab1584-1d38-5bbf-9106-f11c6c58b48f"
167 | version = "1.0.0"
168 |
169 | [[IteratorInterfaceExtensions]]
170 | git-tree-sha1 = "a3f24677c21f5bbe9d2a714f95dcd58337fb2856"
171 | uuid = "82899510-4779-5014-852e-03e436cf321d"
172 | version = "1.0.0"
173 |
174 | [[JLSO]]
175 | deps = ["BSON", "CodecZlib", "FilePathsBase", "Memento", "Pkg", "Serialization"]
176 | git-tree-sha1 = "9dc0c7a4b7527806e53f524ccd66be0cd9e75e2e"
177 | uuid = "9da8a3cd-07a3-59c0-a743-3fdc52c30d11"
178 | version = "2.3.2"
179 |
180 | [[JSON]]
181 | deps = ["Dates", "Mmap", "Parsers", "Unicode"]
182 | git-tree-sha1 = "b34d7cef7b337321e97d22242c3c2b91f476748e"
183 | uuid = "682c06a0-de6a-54ab-a142-c8b1cf79cde6"
184 | version = "0.21.0"
185 |
186 | [[LearnBase]]
187 | git-tree-sha1 = "a0d90569edd490b82fdc4dc078ea54a5a800d30a"
188 | uuid = "7f8f8fb0-2700-5f03-b4bd-41f8cfc144b6"
189 | version = "0.4.1"
190 |
191 | [[LibGit2]]
192 | uuid = "76f85450-5226-5b5a-8eaa-529ad045b433"
193 |
194 | [[Libdl]]
195 | uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb"
196 |
197 | [[Libiconv_jll]]
198 | deps = ["Libdl", "Pkg"]
199 | git-tree-sha1 = "e5256a3b0ebc710dbd6da0c0b212164a3681037f"
200 | uuid = "94ce4f54-9a6c-5748-9c1c-f9c7231a4531"
201 | version = "1.16.0+2"
202 |
203 | [[LinearAlgebra]]
204 | deps = ["Libdl"]
205 | uuid = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
206 |
207 | [[Logging]]
208 | uuid = "56ddb016-857b-54e1-b83d-db4d58db5568"
209 |
210 | [[LossFunctions]]
211 | deps = ["LearnBase", "Markdown", "RecipesBase", "SparseArrays", "StatsBase"]
212 | git-tree-sha1 = "3cd347266e394a066ca7f17bd8ff589ff5ce1d35"
213 | uuid = "30fc2ffe-d236-52d8-8643-a9d8f7c094a7"
214 | version = "0.6.2"
215 |
216 | [[MLJBase]]
217 | deps = ["CategoricalArrays", "ComputationalResources", "Dates", "DelimitedFiles", "Distributed", "Distributions", "HTTP", "InteractiveUtils", "InvertedIndices", "JLSO", "JSON", "LinearAlgebra", "LossFunctions", "MLJModelInterface", "MLJScientificTypes", "Missings", "OrderedCollections", "Parameters", "PrettyTables", "ProgressMeter", "Random", "ScientificTypes", "Statistics", "StatsBase", "Tables"]
218 | git-tree-sha1 = "d8ba2063ffaaa7f0fe91ea5455a7bf838c1424ac"
219 | uuid = "a7f614a8-145f-11e9-1d2a-a57a1082229d"
220 | version = "0.13.10"
221 |
222 | [[MLJModelInterface]]
223 | deps = ["Random", "ScientificTypes"]
224 | git-tree-sha1 = "b02b13fde7b0dc301adc070d650405aa4909e657"
225 | uuid = "e80e1ace-859a-464e-9ed9-23947d8ae3ea"
226 | version = "0.3.0"
227 |
228 | [[MLJScientificTypes]]
229 | deps = ["CategoricalArrays", "ColorTypes", "Dates", "PrettyTables", "ScientificTypes", "Tables"]
230 | git-tree-sha1 = "5296df0ffd2ff7c667260c027d03a465b59dcff5"
231 | uuid = "2e2323e0-db8b-457b-ae0d-bdfb3bc63afd"
232 | version = "0.2.7"
233 |
234 | [[Markdown]]
235 | deps = ["Base64"]
236 | uuid = "d6f4376e-aef5-505a-96c1-9c027394607a"
237 |
238 | [[MbedTLS]]
239 | deps = ["Dates", "MbedTLS_jll", "Random", "Sockets"]
240 | git-tree-sha1 = "426a6978b03a97ceb7ead77775a1da066343ec6e"
241 | uuid = "739be429-bea8-5141-9913-cc70e7f3736d"
242 | version = "1.0.2"
243 |
244 | [[MbedTLS_jll]]
245 | deps = ["Libdl", "Pkg"]
246 | git-tree-sha1 = "c83f5a1d038f034ad0549f9ee4d5fac3fb429e33"
247 | uuid = "c8ffd9c3-330d-5841-b78e-0817d7145fa1"
248 | version = "2.16.0+2"
249 |
250 | [[Memento]]
251 | deps = ["Dates", "Distributed", "JSON", "Serialization", "Sockets", "Syslogs", "Test", "TimeZones", "UUIDs"]
252 | git-tree-sha1 = "31921ad09307dd9ad693da3213a218152fadb8f2"
253 | uuid = "f28f55f0-a522-5efc-85c2-fe41dfb9b2d9"
254 | version = "1.1.0"
255 |
256 | [[Missings]]
257 | deps = ["DataAPI"]
258 | git-tree-sha1 = "de0a5ce9e5289f27df672ffabef4d1e5861247d5"
259 | uuid = "e1d29d7a-bbdc-5cf2-9ac0-f12de2c33e28"
260 | version = "0.4.3"
261 |
262 | [[Mmap]]
263 | uuid = "a63ad114-7e13-5084-954f-fe012c677804"
264 |
265 | [[Mocking]]
266 | deps = ["ExprTools"]
267 | git-tree-sha1 = "916b850daad0d46b8c71f65f719c49957e9513ed"
268 | uuid = "78c3b35d-d492-501b-9361-3d52fe80e533"
269 | version = "0.7.1"
270 |
271 | [[OpenBLAS_jll]]
272 | deps = ["CompilerSupportLibraries_jll", "Libdl", "Pkg"]
273 | git-tree-sha1 = "1887096f6897306a4662f7c5af936da7d5d1a062"
274 | uuid = "4536629a-c528-5b80-bd46-f80d51c5b363"
275 | version = "0.3.9+4"
276 |
277 | [[OpenSpecFun_jll]]
278 | deps = ["CompilerSupportLibraries_jll", "Libdl", "Pkg"]
279 | git-tree-sha1 = "d51c416559217d974a1113522d5919235ae67a87"
280 | uuid = "efe28fd5-8261-553b-a9e1-b2916fc3738e"
281 | version = "0.5.3+3"
282 |
283 | [[OrderedCollections]]
284 | git-tree-sha1 = "12ce190210d278e12644bcadf5b21cbdcf225cd3"
285 | uuid = "bac558e1-5e72-5ebc-8fee-abe8a469f55d"
286 | version = "1.2.0"
287 |
288 | [[PDMats]]
289 | deps = ["Arpack", "LinearAlgebra", "SparseArrays", "SuiteSparse", "Test"]
290 | git-tree-sha1 = "2fc6f50ddd959e462f0a2dbc802ddf2a539c6e35"
291 | uuid = "90014a1f-27ba-587c-ab20-58faa44d9150"
292 | version = "0.9.12"
293 |
294 | [[Parameters]]
295 | deps = ["OrderedCollections", "UnPack"]
296 | git-tree-sha1 = "38b2e970043613c187bd56a995fe2e551821eb4a"
297 | uuid = "d96e819e-fc66-5662-9728-84c9c7592b0a"
298 | version = "0.12.1"
299 |
300 | [[Parsers]]
301 | deps = ["Dates", "Test"]
302 | git-tree-sha1 = "eb3e09940c0d7ae01b01d9291ebad7b081c844d3"
303 | uuid = "69de0a69-1ddd-5017-9359-2bf0b02dc9f0"
304 | version = "1.0.5"
305 |
306 | [[Pkg]]
307 | deps = ["Dates", "LibGit2", "Libdl", "Logging", "Markdown", "Printf", "REPL", "Random", "SHA", "UUIDs"]
308 | uuid = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
309 |
310 | [[PooledArrays]]
311 | deps = ["DataAPI"]
312 | git-tree-sha1 = "b1333d4eced1826e15adbdf01a4ecaccca9d353c"
313 | uuid = "2dfb63ee-cc39-5dd5-95bd-886bf059d720"
314 | version = "0.5.3"
315 |
316 | [[PrettyTables]]
317 | deps = ["Crayons", "Formatting", "Parameters", "Reexport", "Tables"]
318 | git-tree-sha1 = "ac3cecc7254adfffb8fdbd2c83eaa247e14b02da"
319 | uuid = "08abe8d2-0d0c-5749-adfa-8a2ac140af0d"
320 | version = "0.9.0"
321 |
322 | [[Printf]]
323 | deps = ["Unicode"]
324 | uuid = "de0858da-6303-5e67-8744-51eddeeeb8d7"
325 |
326 | [[ProgressMeter]]
327 | deps = ["Distributed", "Printf"]
328 | git-tree-sha1 = "3e1784c27847bba115815d4d4e668b99873985e5"
329 | uuid = "92933f4c-e287-5a05-a399-4b506db050ca"
330 | version = "1.3.1"
331 |
332 | [[QuadGK]]
333 | deps = ["DataStructures", "LinearAlgebra"]
334 | git-tree-sha1 = "dc84e810393cfc6294248c9032a9cdacc14a3db4"
335 | uuid = "1fd47b50-473d-5c70-9696-f719f8f3bcdc"
336 | version = "2.3.1"
337 |
338 | [[REPL]]
339 | deps = ["InteractiveUtils", "Markdown", "Sockets"]
340 | uuid = "3fa0cd96-eef1-5676-8a61-b3b8758bbffb"
341 |
342 | [[Random]]
343 | deps = ["Serialization"]
344 | uuid = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
345 |
346 | [[RecipesBase]]
347 | git-tree-sha1 = "54f8ceb165a0f6d083f0d12cb4996f5367c6edbc"
348 | uuid = "3cdcf5f2-1ef4-517c-9805-6587b60abb01"
349 | version = "1.0.1"
350 |
351 | [[Reexport]]
352 | deps = ["Pkg"]
353 | git-tree-sha1 = "7b1d07f411bc8ddb7977ec7f377b97b158514fe0"
354 | uuid = "189a3867-3050-52da-a836-e630ba90ab69"
355 | version = "0.2.0"
356 |
357 | [[Rmath]]
358 | deps = ["Random", "Rmath_jll"]
359 | git-tree-sha1 = "86c5647b565873641538d8f812c04e4c9dbeb370"
360 | uuid = "79098fc4-a85e-5d69-aa6a-4863f24498fa"
361 | version = "0.6.1"
362 |
363 | [[Rmath_jll]]
364 | deps = ["Libdl", "Pkg"]
365 | git-tree-sha1 = "d76185aa1f421306dec73c057aa384bad74188f0"
366 | uuid = "f50d1b31-88e8-58de-be2c-1cc44531875f"
367 | version = "0.2.2+1"
368 |
369 | [[SHA]]
370 | uuid = "ea8e919c-243c-51af-8825-aaa63cd721ce"
371 |
372 | [[ScientificTypes]]
373 | git-tree-sha1 = "1a9f881c800ea009fb7f8b5274f04e4e8a5faef8"
374 | uuid = "321657f4-b219-11e9-178b-2701a2544e81"
375 | version = "0.8.0"
376 |
377 | [[Serialization]]
378 | uuid = "9e88b42a-f829-5b0c-bbe9-9e923198166b"
379 |
380 | [[SharedArrays]]
381 | deps = ["Distributed", "Mmap", "Random", "Serialization"]
382 | uuid = "1a1011a3-84de-559e-8e89-a11a2f7dc383"
383 |
384 | [[Sockets]]
385 | uuid = "6462fe0b-24de-5631-8697-dd941f90decc"
386 |
387 | [[SortingAlgorithms]]
388 | deps = ["DataStructures", "Random", "Test"]
389 | git-tree-sha1 = "03f5898c9959f8115e30bc7226ada7d0df554ddd"
390 | uuid = "a2af1166-a08f-5f64-846c-94a0d3cef48c"
391 | version = "0.3.1"
392 |
393 | [[SparseArrays]]
394 | deps = ["LinearAlgebra", "Random"]
395 | uuid = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
396 |
397 | [[SpecialFunctions]]
398 | deps = ["OpenSpecFun_jll"]
399 | git-tree-sha1 = "d8d8b8a9f4119829410ecd706da4cc8594a1e020"
400 | uuid = "276daf66-3868-5448-9aa4-cd146d93841b"
401 | version = "0.10.3"
402 |
403 | [[Statistics]]
404 | deps = ["LinearAlgebra", "SparseArrays"]
405 | uuid = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
406 |
407 | [[StatsBase]]
408 | deps = ["DataAPI", "DataStructures", "LinearAlgebra", "Missings", "Printf", "Random", "SortingAlgorithms", "SparseArrays", "Statistics"]
409 | git-tree-sha1 = "a6102b1f364befdb05746f386b67c6b7e3262c45"
410 | uuid = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
411 | version = "0.33.0"
412 |
413 | [[StatsFuns]]
414 | deps = ["Rmath", "SpecialFunctions"]
415 | git-tree-sha1 = "04a5a8e6ab87966b43f247920eab053fd5fdc925"
416 | uuid = "4c63d2b9-4356-54db-8cca-17b64c39e42c"
417 | version = "0.9.5"
418 |
419 | [[SuiteSparse]]
420 | deps = ["Libdl", "LinearAlgebra", "Serialization", "SparseArrays"]
421 | uuid = "4607b0f0-06f3-5cda-b6b1-a6196a1729e9"
422 |
423 | [[Syslogs]]
424 | deps = ["Printf", "Sockets"]
425 | git-tree-sha1 = "46badfcc7c6e74535cc7d833a91f4ac4f805f86d"
426 | uuid = "cea106d9-e007-5e6c-ad93-58fe2094e9c4"
427 | version = "0.3.0"
428 |
429 | [[TableTraits]]
430 | deps = ["IteratorInterfaceExtensions"]
431 | git-tree-sha1 = "b1ad568ba658d8cbb3b892ed5380a6f3e781a81e"
432 | uuid = "3783bdb8-4a98-5b6b-af9a-565f29a5fe9c"
433 | version = "1.0.0"
434 |
435 | [[Tables]]
436 | deps = ["DataAPI", "DataValueInterfaces", "IteratorInterfaceExtensions", "LinearAlgebra", "TableTraits", "Test"]
437 | git-tree-sha1 = "c45dcc27331febabc20d86cb3974ef095257dcf3"
438 | uuid = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"
439 | version = "1.0.4"
440 |
441 | [[Test]]
442 | deps = ["Distributed", "InteractiveUtils", "Logging", "Random"]
443 | uuid = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
444 |
445 | [[TimeZones]]
446 | deps = ["Dates", "EzXML", "Mocking", "Printf", "RecipesBase", "Serialization", "Unicode"]
447 | git-tree-sha1 = "db7bc2051d4c2e5f336409224df81485c00de6cb"
448 | uuid = "f269a46b-ccf7-5d73-abea-4c690281aa53"
449 | version = "1.2.0"
450 |
451 | [[TranscodingStreams]]
452 | deps = ["Random", "Test"]
453 | git-tree-sha1 = "7c53c35547de1c5b9d46a4797cf6d8253807108c"
454 | uuid = "3bb67fe8-82b1-5028-8e26-92a6c54297fa"
455 | version = "0.9.5"
456 |
457 | [[UUIDs]]
458 | deps = ["Random", "SHA"]
459 | uuid = "cf7118a7-6976-5b1a-9a39-7adc72f591a4"
460 |
461 | [[UnPack]]
462 | git-tree-sha1 = "d4bfa022cd30df012700cf380af2141961bb3bfb"
463 | uuid = "3a884ed6-31ef-47d7-9d2a-63182c4928ed"
464 | version = "1.0.1"
465 |
466 | [[Unicode]]
467 | uuid = "4ec0a83e-493e-50e2-b9ac-8f72acf5a8f5"
468 |
469 | [[WeakRefStrings]]
470 | deps = ["DataAPI", "Random", "Test"]
471 | git-tree-sha1 = "28807f85197eaad3cbd2330386fac1dcb9e7e11d"
472 | uuid = "ea10d353-3f73-51f8-a26c-33c1cb351aa5"
473 | version = "0.6.2"
474 |
475 | [[XML2_jll]]
476 | deps = ["Libdl", "Libiconv_jll", "Pkg", "Zlib_jll"]
477 | git-tree-sha1 = "987c02a43fa10a491a5f0f7c46a6d3559ed6a8e2"
478 | uuid = "02c8fc9c-b97f-50b9-bbe4-9be30ff0a78a"
479 | version = "2.9.9+4"
480 |
481 | [[Zlib_jll]]
482 | deps = ["Libdl", "Pkg"]
483 | git-tree-sha1 = "a2e0d558f6031002e380a90613b199e37a8565bf"
484 | uuid = "83775a58-1f1d-513f-b197-d71354ab007a"
485 | version = "1.2.11+10"
486 |
--------------------------------------------------------------------------------
/data/src/convert_ames/Project.toml:
--------------------------------------------------------------------------------
1 | [deps]
2 | CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
3 | CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597"
4 | DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
5 | MLJBase = "a7f614a8-145f-11e9-1d2a-a57a1082229d"
6 |
--------------------------------------------------------------------------------
/data/src/generate_horse.jl:
--------------------------------------------------------------------------------
1 | using Pkg;
2 | Pkg.activate(@__DIR__)
3 | Pkg.instantiate()
4 |
5 | using MLJ
6 |
7 | using HTTP
8 | using CSV
9 | import DataFrames: DataFrame, select!, Not
10 | req1 = HTTP.get("http://archive.ics.uci.edu/ml/machine-learning-databases/horse-colic/horse-colic.data")
11 | req2 = HTTP.get("http://archive.ics.uci.edu/ml/machine-learning-databases/horse-colic/horse-colic.test")
12 | header = ["surgery", "age", "hospital_number",
13 | "rectal_temperature", "pulse",
14 | "respiratory_rate", "temperature_extremities",
15 | "peripheral_pulse", "mucous_membranes",
16 | "capillary_refill_time", "pain",
17 | "peristalsis", "abdominal_distension",
18 | "nasogastric_tube", "nasogastric_reflux",
19 | "nasogastric_reflux_ph", "feces", "abdomen",
20 | "packed_cell_volume", "total_protein",
21 | "abdomcentesis_appearance", "abdomcentesis_total_protein",
22 | "outcome", "surgical_lesion", "lesion_1", "lesion_2", "lesion_3",
23 | "cp_data"]
24 | csv_opts = (header=header, delim=' ', missingstring="?",
25 | ignorerepeated=true)
26 | data_train = CSV.read(req1.body; csv_opts...)
27 | data_test = CSV.read(req2.body; csv_opts...)
28 | @show size(data_train)
29 | @show size(data_test)
30 |
31 | unwanted = [:lesion_1, :lesion_2, :lesion_3]
32 | data = vcat(data_train, data_test)
33 | select!(data, Not(unwanted));
34 |
35 | train = 1:nrows(data_train)
36 | test = last(train) .+ (1:nrows(data_test));
37 |
38 | datac = coerce(data, autotype(data));
39 |
40 | sch0 = schema(data)
41 | sch = schema(datac)
42 |
43 | old_scitype_given_name = Dict(
44 | sch0.names[j] => sch0.scitypes[j] for j in eachindex(sch0.names))
45 |
46 | length(unique(datac.hospital_number))
47 |
48 | datac = select!(datac, Not(:hospital_number));
49 |
50 | datac = coerce(datac, autotype(datac, rules=(:discrete_to_continuous,)));
51 |
52 | missing_outcome = ismissing.(datac.outcome)
53 | idx_missing_outcome = missing_outcome |> findall
54 |
55 | train = setdiff!(train |> collect, idx_missing_outcome)
56 | test = setdiff!(test |> collect, idx_missing_outcome)
57 | datac = datac[.!missing_outcome, :];
58 |
59 | for name in names(datac)
60 | col = datac[:, name]
61 | ratio_missing = sum(ismissing.(col)) / nrows(datac) * 100
62 | println(rpad(name, 30), round(ratio_missing, sigdigits=3))
63 | end
64 |
65 | unwanted = [:peripheral_pulse, :nasogastric_tube, :nasogastric_reflux,
66 | :nasogastric_reflux_ph, :feces, :abdomen, :abdomcentesis_appearance, :abdomcentesis_total_protein]
67 | select!(datac, Not(unwanted));
68 |
69 | @load FillImputer
70 | filler = machine(FillImputer(), datac)
71 | fit!(filler)
72 | datac = transform(filler, datac)
73 |
74 | cat_fields = filter(schema(datac).names) do field
75 | datac[:, field] isa CategoricalArray
76 | end
77 |
78 | for f in cat_fields
79 | datac[!, f] = get.(datac[:, f])
80 | end
81 |
82 | datac.pulse = coerce(datac.pulse, Count)
83 | datac.respiratory_rate = coerce(datac.respiratory_rate, Count)
84 |
85 | sch1 = schema(datac)
86 |
87 | CSV.write("horse.csv", datac)
88 |
--------------------------------------------------------------------------------
/data/src/get_king_county.jl:
--------------------------------------------------------------------------------
1 | using Pkg;
2 | Pkg.activate(@__DIR__)
3 | Pkg.instantiate()
4 |
5 | using MLJ
6 | using PrettyPrinting
7 | import DataFrames: DataFrame, select!, Not, describe
8 | import Statistics
9 | using Dates
10 | using UrlDownload
11 | using CSV
12 |
13 |
14 | df = DataFrame(urldownload("https://raw.githubusercontent.com/tlienart/DataScienceTutorialsData.jl/master/data/kc_housing.csv", true))
15 | describe(df)
16 |
17 | df.is_renovated = df.yr_renovated .!= 0  # true if a renovation year is recorded
18 |
19 | select!(df, Not([:id, :date, :yr_renovated]))
20 | CSV.write(joinpath(@__DIR__, "..", "house.csv"), df)
21 |
--------------------------------------------------------------------------------
/environment.yml:
--------------------------------------------------------------------------------
1 | dependencies:
2 | - matplotlib
3 | - numpy
4 | - pip
5 | - pip:
6 | - julia
7 |
--------------------------------------------------------------------------------
/exercise_6ci.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/exercise_6ci.png
--------------------------------------------------------------------------------
/exercise_7c.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/exercise_7c.png
--------------------------------------------------------------------------------
/exercise_7c_2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/exercise_7c_2.png
--------------------------------------------------------------------------------
/exercise_7c_3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/exercise_7c_3.png
--------------------------------------------------------------------------------
/exercise_8c.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/exercise_8c.png
--------------------------------------------------------------------------------
/gamma_sampler.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/gamma_sampler.png
--------------------------------------------------------------------------------
/iris_learning_curve.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/iris_learning_curve.png
--------------------------------------------------------------------------------
/learning_curve.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/learning_curve.png
--------------------------------------------------------------------------------
/learning_curve2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/learning_curve2.png
--------------------------------------------------------------------------------
/methods.md:
--------------------------------------------------------------------------------
1 | # List of methods introduced in the tutorials
2 |
3 | ## Part 1
4 |
5 | `scitype(object)`, `coerce(vector, SomeSciType)`,
6 | `levels(categorical_vector)`, `levels!(categorical_vector, levels)`,
7 | `schema(table)`, `MLJ.table(matrix)`, `autotype(table)`,
8 | `coerce(table, ...)`, `coerce!(dataframe, ...)`, `elscitype(vector)`
9 |
10 | ## Part 2
11 |
12 | `OpenML.load(id)`, `unpack(table, ...)`, `models()`, `models(filter)`,
13 | `models(string)`, `@load ModelType pkg=PackageName`, `info(model)`,
14 | `machine(model, X, y)`, `partition(row_indices, ...)`, `fit!(mach,
15 | rows=...)`, `predict(mach, rows=...)`, `predict(mach, Xnew)`,
16 | `fitted_params(mach)`, `report(mach)`, `MLJ.save`,
17 | `machine(filename)`, `machine(filename, X, y)`,
18 | `pdf(single_prediction, class)`, `predict_mode(mach, Xnew)`,
19 | `predict_mean(mach, Xnew)`, `predict_median(mach, Xnew)`,
20 | `measures()`, `evaluate!`, `range(model, :(param.nested_param), ...)`,
21 | `learning_curve(mach, ...)`
22 |
23 | ## Part 3
24 |
25 | `Standardizer`, `transform`, `inverse_transform`, `ContinuousEncoder`, `@pipeline`
26 |
27 | ## Part 4
28 |
29 | `iterator(r, resolution)`, `sampler(r, distribution)`, `RandomSearch`,
30 | `TunedModel`
31 |
32 | ## Part 5
33 |
34 | `source(data)`, `source()`, `Probabilistic()`, `Deterministic()`,
35 | `Unsupervised()`, `@from_network`
36 |
--------------------------------------------------------------------------------
/outline.md:
--------------------------------------------------------------------------------
1 | # Machine Learning in Julia using MLJ
2 |
3 | ## Housekeeping
4 |
5 | ### Getting help during the workshop
6 |
7 | ### Resources to help you
8 |
9 | From the MLJ ecosystem:
10 |
11 | - The docs
12 |
13 | - DataScienceTutorials
14 |
15 | From elsewhere:
16 |
17 | - Julia specific:
18 |
19 | - ScikitLearn
20 |
21 | - General:
22 |
23 | -
24 |
25 | -
26 |
27 | ## Programme
28 |
29 | - An overview of machine learning and MLJ (lecture)
30 |
31 | - Workshop scope
32 |
33 | - Installing MLJ and the tutorials
34 |
35 | - Part 1. Data representations
36 |
37 | Break
38 |
39 | - Part 2: Selecting, training and evaluating models
40 |
41 | - Part 3: Tuning model hyper-parameters
42 |
43 | Break
44 |
45 | - Part 4: Model pipelines
46 |
47 | - Part 5: Advanced features (lecture)
48 |
49 | Each of Parts 2-5 begins with a demonstration on the "teacher's dataset", with
50 | time for participants to carry out a similar exercise on a "student's
51 | dataset" and interact with the instructors in the chat forum.
52 |
53 |
54 | ## What this workshop won't cover
55 |
56 | This workshop assumes some experience with data and, ideally, some
57 | understanding of machine learning principles.
58 |
59 | Lightly covered or not covered:
60 |
61 | - data wrangling and data cleaning
62 |
63 | - feature engineering
64 |
65 | - options for parallelism or using GPUs
66 |
67 |
68 | ## An overview of machine learning and MLJ
69 |
70 | ### What is machine learning?
71 |
72 | Supervised learning - show with examples and pictures what the basic
73 | idea and processes are: fitting, evaluating, tuning.
74 |
75 | Unsupervised learning - no labels; main use-case is dimension reduction; explain PCA with a picture
76 |
77 | Reinforcement learning - out of scope
78 |
79 |
80 | ### Different machine learning models and paradigms
81 |
82 | - machine learning ≠ deep learning
83 |
84 | - there are hundreds of machine learning models. All of the following
85 | are in common use:
86 |
87 |   - linear models, especially ridge regression, elastic net, PCA (unsupervised)
88 |
89 | - Naive Bayes
90 |
91 | - K-nearest neighbours
92 |
93 | - K-means clustering (unsupervised)
94 |
95 | - random forests
96 |
97 | - gradient boosted tree models (e.g., XGBoost)
98 |
99 | - support vector machines
100 |
101 |   - probabilistic programming models
102 |
103 | - neural networks
104 |
105 |
106 | ### What is a (good) machine learning toolbox?
107 |
108 | - provides uniform interface to zoo of models scattered everywhere
109 | (different packages, different languages)
110 |
111 | - provides a searchable model registry
112 |
113 | - meta-algorithms:
114 |
115 | - evaluating performance using different performance measures (aka
116 | metrics, scores, loss functions)
117 |
118 |   - tuning (optimizing hyperparameters)
119 |
120 | - facilitates model *composition* (e.g., pipelines)
121 |
122 | - customizable (getting under the hood)
123 |
124 | ### MLJ features
125 |
126 |
127 | ### A short tour of MLJ
128 |
129 |
130 | ## Part 1: Data ingestion and pre-processing
131 |
132 | ### Scientific types and type coercion
133 |
134 | - inspecting scitypes and coercing them
135 |
136 | - working with categorical data
137 |
138 |
139 | ### Tabular data
140 |
141 | - Lots of things can be considered as tabular data; examples: native
142 | tables, matrices, DataFrames, CSV files
143 |
144 | - Lots of ways to grab data; examples:
145 |
146 | - load a canned dataset
147 | - load from local file (e.g., csv)
148 | - create a synthetic data set
149 | - use OpenML
150 | - use RDatasets
151 | - use UrlDownload (or is there something better?)
152 |
153 | ### Demo
154 |
155 | ### Exercise
156 |
157 | ##
158 |
159 |
--------------------------------------------------------------------------------
/setup.jl:
--------------------------------------------------------------------------------
1 | # Setup:
2 |
3 | isbinder() = "jovyan" in split(pwd(), "/")
4 |
5 | const REPO = "https://github.com/ablaom/MachineLearningInJulia2020"
6 | using Pkg
7 |
8 | if !isbinder()
9 | Pkg.activate(DIR)
10 | Pkg.instantiate()
11 | using CategoricalArrays
12 | import MLJLinearModels
13 | import DataFrames
14 | import CSV
15 | import MLJDecisionTreeInterface
16 | using MLJ
17 | import MLJClusteringInterface
18 | import MLJMultivariateStatsInterface
19 | import MLJScikitLearnInterface
20 | import MLJLinearModels
21 | import MLJMultivariateStatsInterface
22 | import MLJFlux
23 | import Plots
24 | else
25 | @info "Skipping package instantiation as binder notebook. "
26 | end
27 | @info "Done loading"
28 |
--------------------------------------------------------------------------------
/stacking.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/stacking.png
--------------------------------------------------------------------------------
/tuning.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/tuning.png
--------------------------------------------------------------------------------
/tutorials.jl:
--------------------------------------------------------------------------------
1 | # # Machine Learning in Julia, JuliaCon2020
2 |
3 | # A workshop introducing the machine learning toolbox
4 | # [MLJ](https://alan-turing-institute.github.io/MLJ.jl/stable/).
5 |
6 |
7 | # ### Set-up
8 |
9 | # The following instantiates a package environment and pre-loads some
10 | # packages, to avoid delays later on.
11 |
12 | # The package environment has been created using **Julia 1.6** and may not
13 | # instantiate properly for other Julia versions.
14 |
15 | VERSION
16 |
17 | #-
18 |
19 | DIR = @__DIR__
20 | include(joinpath(DIR, "setup.jl"))
21 | color_off()
22 |
23 | # ## General resources
24 |
25 | # - [List of methods introduced in this tutorial](methods.md)
26 | # - [MLJ Cheatsheet](https://alan-turing-institute.github.io/MLJ.jl/dev/mlj_cheatsheet/)
27 | # - [Common MLJ Workflows](https://alan-turing-institute.github.io/MLJ.jl/dev/common_mlj_workflows/)
28 | # - [MLJ manual](https://alan-turing-institute.github.io/MLJ.jl/dev/)
29 | # - [Data Science Tutorials in Julia](https://juliaai.github.io/DataScienceTutorials.jl/)
30 |
31 |
32 | # ## Contents
33 |
34 | # ### Basic
35 |
36 | # - [Part 1 - Data Representation](#part-1-data-representation)
37 | # - [Part 2 - Selecting, Training and Evaluating Models](#part-2-selecting-training-and-evaluating-models)
38 | # - [Part 3 - Transformers and Pipelines](#part-3-transformers-and-pipelines)
39 | # - [Part 4 - Tuning Hyper-parameters](#part-4-tuning-hyper-parameters)
40 | # - [Part 5 - Advanced model composition](#part-5-advanced-model-composition)
41 | # - [Solutions to Exercises](#solutions-to-exercises)
42 |
43 |
44 | #
45 |
46 |
47 | # ## Part 1 - Data Representation
48 |
49 | # > **Goals:**
50 | # > 1. Learn how MLJ specifies its data requirements using "scientific" types
51 | # > 2. Understand the options for representing tabular data
52 | # > 3. Learn how to inspect and fix the representation of data to meet MLJ requirements
53 |
54 |
55 | # ### Scientific types
56 |
57 | # To help you focus on the intended *purpose* or *interpretation* of
58 | # data, MLJ models specify data requirements using *scientific types*,
59 | # instead of machine types. An example of a scientific type is
60 | # `OrderedFactor`. The other basic "scalar" scientific types are
61 | # illustrated below:
62 |
63 | # 
64 |
65 | # A scientific type is an ordinary Julia type (so it can be used for
66 | # method dispatch, for example) but it usually has no instances. The
67 | # `scitype` function is used to articulate MLJ's convention about how
68 | # different machine types will be interpreted by MLJ models:
69 |
70 | using MLJ
71 | scitype(3.141)
72 |
73 | #-
74 |
75 | time = [2.3, 4.5, 4.2, 1.8, 7.1]
76 | scitype(time)
77 |
78 | # To fix data which MLJ is interpreting incorrectly, we use the
79 | # `coerce` method:
80 |
81 | height = [185, 153, 163, 114, 180]
82 | scitype(height)
83 |
84 | #-
85 |
86 | height = coerce(height, Continuous)
87 |
88 | # Here's an example of data we would want interpreted as
89 | # `OrderedFactor` but isn't:
90 |
91 | exam_mark = ["rotten", "great", "bla", missing, "great"]
92 | scitype(exam_mark)
93 |
94 | #-
95 |
96 | exam_mark = coerce(exam_mark, OrderedFactor)
97 |
98 | #-
99 |
100 | levels(exam_mark)
101 |
102 | # Use `levels!` to put the classes in the right order:
103 |
104 | levels!(exam_mark, ["rotten", "bla", "great"])
105 | exam_mark[1] < exam_mark[2]
106 |
107 | # When sub-sampling, no levels are lost:
108 |
109 | levels(exam_mark[1:2])
110 |
111 | # **Note on binary data.** There is no separate scientific type for
112 | # binary data. Binary data is `OrderedFactor{2}` or
113 | # `Multiclass{2}`. If a binary measure like `truepositive` is
114 | # applied to `OrderedFactor{2}` then the "positive" class is assumed
115 | # to appear *second* in the ordering. If such a measure is applied to
116 | # `Multiclass{2}` data, a warning is issued. A single `OrderedFactor`
117 | # can be coerced to a single `Continuous` variable, for models that
118 | # require this, while a `Multiclass` variable can only be one-hot
119 | # encoded.
120 |
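# As a purely illustrative sketch (the vector below is hypothetical),
# here is one way to arrange for "lived" to be the positive class of a
# binary outcome:

outcome_example = coerce(["lived", "died", "lived", "lived"], OrderedFactor);
levels!(outcome_example, ["died", "lived"]); # the positive class appears second
levels(outcome_example)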
121 |
122 | # ### Two-dimensional data
123 |
124 | # Whenever it makes sense, MLJ models generally expect two-dimensional
125 | # data to be *tabular*. All the tabular formats implementing the
126 | # [Tables.jl API](https://juliadata.github.io/Tables.jl/stable/) (see
127 | # this
128 | # [list](https://github.com/JuliaData/Tables.jl/blob/master/INTEGRATIONS.md))
129 | # have a scientific type of `Table` and can be used with such models.
130 |
131 | # Probably the simplest example of a table is the Julia native *column
132 | # table*, which is just a named tuple of equal-length vectors:
133 |
134 | column_table = (h=height, e=exam_mark, t=time)
135 |
136 | #-
137 |
138 | scitype(column_table)
139 |
140 | #-
141 |
142 | # Notice the `Table{K}` type parameter `K` encodes the scientific
143 | # types of the columns. (This is useful when comparing table scitypes
144 | # with `<:`). To inspect the individual column scitypes, we use the
145 | # `schema` method instead:
146 |
147 | schema(column_table)
148 |
149 | # Here are five other examples of tables:
150 |
151 | dict_table = Dict(:h => height, :e => exam_mark, :t => time)
152 | schema(dict_table)
153 |
154 | # (To control column order here, instead use `LittleDict` from
155 | # OrderedCollections.jl.)
156 |
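# For instance, assuming OrderedCollections.jl is available in the
# environment, a sketch might look like this:

using OrderedCollections
ordered_dict_table = LittleDict(:h => height, :e => exam_mark, :t => time)
schema(ordered_dict_table)

#-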
157 | row_table = [(a=1, b=3.4),
158 | (a=2, b=4.5),
159 | (a=3, b=5.6)]
160 | schema(row_table)
161 |
162 | #-
163 |
164 | import DataFrames
165 | df = DataFrames.DataFrame(column_table)
166 |
167 | #-
168 |
169 | schema(df) == schema(column_table)
170 |
171 | #-
172 |
173 | using CSV
174 | file = CSV.File(joinpath(DIR, "data", "horse.csv"));
175 | schema(file) # (triggers a file read)
176 |
177 |
178 | # Most MLJ models do not accept a matrix in lieu of a table, but you can
179 | # wrap a matrix as a table:
180 |
181 | matrix_table = MLJ.table(rand(2,3))
182 | schema(matrix_table)
183 |
184 | # The matrix is *not* copied, only wrapped. As observations must correspond to rows,
185 | # some models may perform better if one wraps the adjoint or transpose of a matrix whose columns hold the observations - see
186 | # [here](https://alan-turing-institute.github.io/MLJ.jl/dev/getting_started/#Observations-correspond-to-rows,-not-columns).
187 |
188 |
189 | # **Manipulating tabular data.** In this workshop we assume
190 | # familiarity with some kind of tabular data container (although it is
191 | # possible, in principle, to carry out the exercises without this.)
192 | # For a quick start introduction to `DataFrames`, see [this
193 | # tutorial](https://juliaai.github.io/DataScienceTutorials.jl/data/dataframe/).
194 |
195 | # ### Fixing scientific types in tabular data
196 |
197 | # To show how we can correct the scientific types of data in tables,
198 | # we introduce a cleaned up version of the UCI Horse Colic Data Set
199 | # (the cleaning work-flow is described
200 | # [here](https://juliaai.github.io/DataScienceTutorials.jl/end-to-end/horse/#dealing_with_missing_values)).
201 |
202 | using CSV
203 | file = CSV.File(joinpath(DIR, "data", "horse.csv"));
204 | horse = DataFrames.DataFrame(file); # convert to data frame without copying columns
205 | first(horse, 4)
206 |
207 | #-
208 |
209 | # From [the UCI
210 | # docs](http://archive.ics.uci.edu/ml/datasets/Horse+Colic) we can
211 | # surmise how each variable ought to be interpreted (a step in our
212 | # work-flow that cannot reliably be left to the computer):
213 |
214 | # variable | scientific type (interpretation)
215 | # ----------------------------|-----------------------------------
216 | # `:surgery` | Multiclass
217 | # `:age` | Multiclass
218 | # `:rectal_temperature` | Continuous
219 | # `:pulse` | Continuous
220 | # `:respiratory_rate` | Continuous
221 | # `:temperature_extremities` | OrderedFactor
222 | # `:mucous_membranes` | Multiclass
223 | # `:capillary_refill_time` | Multiclass
224 | # `:pain` | OrderedFactor
225 | # `:peristalsis` | OrderedFactor
226 | # `:abdominal_distension` | OrderedFactor
227 | # `:packed_cell_volume` | Continuous
228 | # `:total_protein` | Continuous
229 | # `:outcome` | Multiclass
230 | # `:surgical_lesion` | OrderedFactor
231 | # `:cp_data` | Multiclass
232 |
233 | # Let's see how MLJ will actually interpret the data, as it is
234 | # currently encoded:
235 |
236 | schema(horse)
237 |
238 | # As a first correction step, we can get MLJ to "guess" the
239 | # appropriate fix, using the `autotype` method:
240 |
241 | autotype(horse)
242 |
243 | #-
244 |
245 | # Okay, this is not perfect, but a step in the right direction, which
246 | # we implement like this:
247 |
248 | coerce!(horse, autotype(horse));
249 | schema(horse)
250 |
251 | # All remaining `Count` data should be `Continuous`:
252 |
253 | coerce!(horse, Count => Continuous);
254 | schema(horse)
255 |
256 | # We'll correct the remaining truant entries manually:
257 |
258 | coerce!(horse,
259 | :surgery => Multiclass,
260 | :age => Multiclass,
261 | :mucous_membranes => Multiclass,
262 | :capillary_refill_time => Multiclass,
263 | :outcome => Multiclass,
264 | :cp_data => Multiclass);
265 | schema(horse)
266 |
267 |
268 | # ### Resources for Part 1
269 | #
270 | # - From the MLJ manual:
271 | # - [A preview of data type specification in
272 | # MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/getting_started/#A-preview-of-data-type-specification-in-MLJ-1)
273 | # - [Data containers and scientific types](https://alan-turing-institute.github.io/MLJ.jl/dev/getting_started/#Data-containers-and-scientific-types-1)
274 | # - [Working with Categorical Data](https://alan-turing-institute.github.io/MLJ.jl/dev/working_with_categorical_data/)
275 | # - [Summary](https://juliaai.github.io/ScientificTypes.jl/dev/#Summary-of-the-default-convention) of the MLJ convention for representing scientific types
276 | # - [ScientificTypes.jl](https://juliaai.github.io/ScientificTypes.jl/dev/)
277 | # - From Data Science Tutorials:
278 | # - [Data interpretation: Scientific Types](https://juliaai.github.io/DataScienceTutorials.jl/data/scitype/)
279 | # - [Horse colic data](https://juliaai.github.io/DataScienceTutorials.jl/end-to-end/horse/)
280 | # - [UCI Horse Colic Data Set](http://archive.ics.uci.edu/ml/datasets/Horse+Colic)
281 |
282 |
283 | # ### Exercises for Part 1
284 |
285 |
286 | # #### Exercise 1
287 |
288 | # Try to guess how each code snippet below will evaluate:
289 |
290 | scitype(42)
291 |
292 | #-
293 |
294 | questions = ["who", "why", "what", "when"]
295 | scitype(questions)
296 |
297 | #-
298 |
299 | elscitype(questions)
300 |
301 | #-
302 |
303 | t = (3.141, 42, "how")
304 | scitype(t)
305 |
306 | #-
307 |
308 | A = rand(2, 3)
309 |
310 | #-
311 |
312 | scitype(A)
313 |
314 | #-
315 |
316 | elscitype(A)
317 |
318 | #-
319 |
320 | using SparseArrays
321 | Asparse = sparse(A)
322 |
323 | #-
324 |
325 | scitype(Asparse)
326 |
327 | #-
328 |
329 | using CategoricalArrays
330 | C1 = categorical(A)
331 |
332 | #-
333 |
334 | scitype(C1)
335 |
336 | #-
337 |
338 | elscitype(C1)
339 |
340 | #-
341 |
342 | C2 = categorical(A, ordered=true)
343 | scitype(C2)
344 |
345 | #-
346 |
347 | v = [1, 2, missing, 4]
348 | scitype(v)
349 |
350 | #-
351 |
352 | elscitype(v)
353 |
354 | #-
355 |
356 | scitype(v[1:2])
357 |
358 | # Can you guess at the general behavior of
359 | # `scitype` with respect to tuples, abstract arrays and missing
360 | # values? The answers are
361 | # [here](https://github.com/juliaai/ScientificTypesBase.jl#2-the-scitype-and-scitype-methods)
362 | # (ignore "Property 1").
363 |
364 |
365 | # #### Exercise 2
366 |
367 | # Coerce the following vector to make MLJ recognize it as a vector of
368 | # ordered factors (with an appropriate ordering):
369 |
370 | quality = ["good", "poor", "poor", "excellent", missing, "good", "excellent"]
371 |
372 | #-
373 |
374 |
375 | # #### Exercise 3 (fixing scitypes in a table)
376 |
377 | # Fix the scitypes for the [House Prices in King
378 | # County](https://mlr3gallery.mlr-org.com/posts/2020-01-30-house-prices-in-king-county/)
379 | # dataset:
380 |
381 | file = CSV.File(joinpath(DIR, "data", "house.csv"));
382 | house = DataFrames.DataFrame(file); # convert to data frame without copying columns
383 | first(house, 4)
384 |
385 | # (Two features in the original data set have been deemed uninformative
386 | # and dropped, namely `:id` and `:date`. The original feature
387 | # `:yr_renovated` has been replaced by the `Bool` feature `is_renovated`.)
388 |
389 | #
390 |
391 |
392 | # ## Part 2 - Selecting, Training and Evaluating Models
393 |
394 | # > **Goals:**
395 | # > 1. Search MLJ's database of model metadata to identify model candidates for a supervised learning task.
396 | # > 2. Evaluate the performance of a model on a holdout set using basic `fit!`/`predict` work-flow.
397 | # > 3. Inspect the outcomes of training and save these to a file.
398 | # > 4. Evaluate performance using other resampling strategies, such as cross-validation, in one line, using `evaluate!`
399 | # > 5. Plot a "learning curve", to inspect performance as a function of some model hyper-parameter, such as an iteration parameter
400 |
401 | # The "Hello World!" of machine learning is to classify Fisher's
402 | # famous iris data set. This time, we'll grab the data from
403 | # [OpenML](https://www.openml.org):
404 |
405 | OpenML.describe_dataset(61)
406 |
407 | #-
408 |
409 | iris = OpenML.load(61); # a row table
410 | iris = DataFrames.DataFrame(iris);
411 | first(iris, 4)
412 |
413 | # **Main goal.** To build and evaluate models for predicting the
414 | # `:class` variable, given the four remaining measurement variables.
415 |
416 |
417 | # ### Step 1. Inspect and fix scientific types
418 |
419 | schema(iris)
420 |
421 | # Unfortunately, `Missing` is appearing in the element type, despite
422 | # the fact that there are no missing values (see this
423 | # [issue](https://github.com/JuliaAI/OpenML.jl/issues/10)). To fix this
424 | # we have to explicitly tighten the types:
425 |
426 | #-
427 |
428 | coerce!(iris,
429 | Union{Missing,Continuous}=>Continuous,
430 | Union{Missing,Multiclass}=>Multiclass,
431 | tight=true)
432 | schema(iris)
433 |
434 |
435 | # ### Step 2. Split data into input and target parts
436 |
437 | # Here's how we split the data into target and input features, which
438 | # is needed for MLJ supervised models. We randomize the data at the
439 | # same time:
440 |
441 | y, X = unpack(iris, ==(:class), name->true; rng=123);
442 | scitype(y)
443 |
444 | # Here's one way to access the documentation (at the REPL, `?unpack`
445 | # also works):
446 |
447 | @doc unpack #!md
448 |
449 | # #md
450 |
451 |
452 | # ### On searching for a model
453 |
454 | # Here's how to see *all* models (not immediately useful):
455 |
456 | all_models = models()
457 |
458 | # Each entry contains metadata for a model whose defining code is not yet loaded:
459 |
460 | meta = all_models[3]
461 |
462 | #-
463 |
464 | targetscitype = meta.target_scitype
465 |
466 | #-
467 |
468 | scitype(y) <: targetscitype
469 |
470 | # So this model won't do. Let's find all pure Julia classifiers:
471 |
472 | filter_julia_classifiers(meta) =
473 | AbstractVector{Finite} <: meta.target_scitype &&
474 | meta.is_pure_julia
475 |
476 | models(filter_julia_classifiers)
477 |
478 | # Find all models with "Classifier" in `name` (or `docstring`):
479 |
480 | models("Classifier")
481 |
482 |
483 | # Find all (supervised) models that match my data!
484 |
485 | models(matching(X, y))
486 |
487 |
488 |
489 | # ### Step 3. Select and instantiate a model
490 |
491 | # To load the code defining a new model type we use the `@load` macro:
492 |
493 | NeuralNetworkClassifier = @load NeuralNetworkClassifier
494 |
495 | # Other ways to load model code are described
496 | # [here](https://alan-turing-institute.github.io/MLJ.jl/dev/loading_model_code/#Loading-Model-Code).
497 |
498 | # We'll instantiate this type with default values for the
499 | # hyperparameters:
500 |
501 | model = NeuralNetworkClassifier()
502 |
503 | #-
504 |
505 | info(model)
506 |
507 | # In MLJ a *model* is just a struct containing hyper-parameters, and
508 | # that's all. A model does not store *learned* parameters. Models are
509 | # mutable:
510 |
511 | model.epochs = 12
512 |
513 | # And all models have a key-word constructor that works once `@load`
514 | # has been performed:
515 |
516 | NeuralNetworkClassifier(epochs=12) == model
517 |
518 |
519 | # ### On fitting, predicting, and inspecting models
520 |
521 | # In MLJ a model and training/validation data are typically bound
522 | # together in a machine:
523 |
524 | mach = machine(model, X, y)
525 |
526 | # A machine stores *learned* parameters, among other things. We'll
527 | # train this machine on 70% of the data and evaluate on a 30% holdout
528 | # set. Let's start by dividing all row indices into `train` and `test`
529 | # subsets:
530 |
531 | train, test = partition(eachindex(y), 0.7)
532 |
533 | # Now we can `fit!`...
534 |
535 | fit!(mach, rows=train, verbosity=2)
536 |
537 | # ... and `predict`:
538 |
539 | yhat = predict(mach, rows=test); # or `predict(mach, Xnew)`
540 | yhat[1:3]
541 |
542 | # We'll have more to say on the form of this prediction shortly.
543 |
544 | # After training, one can inspect the learned parameters:
545 |
546 | fitted_params(mach)
547 |
548 | #-
549 |
550 | # Everything else the user might be interested in is accessed from the
551 | # training *report*:
552 |
553 | report(mach)
554 |
555 | # You save a machine like this:
556 |
557 | MLJ.save("neural_net.jlso", mach)
558 |
559 | # And retrieve it like this:
560 |
561 | mach2 = machine("neural_net.jlso")
562 | yhat = predict(mach2, X);
563 | yhat[1:3]
564 |
565 | # If you want to fit a retrieved model, you will need to bind some data to it:
566 |
567 | mach3 = machine("neural_net.jlso", X, y)
568 | fit!(mach3)
569 |
570 | # Machines remember the last set of hyper-parameters used during fit,
571 | # which, in the case of iterative models, allows for a warm restart of
572 | # computations in the case that only the iteration parameter is
573 | # increased:
574 |
575 | model.epochs = model.epochs + 4
576 | fit!(mach, rows=train, verbosity=2)
577 |
578 | # For this particular model we can also increase `:learning_rate`
579 | # without triggering a cold restart:
580 |
581 | model.epochs = model.epochs + 4
582 | model.optimiser.eta = 10*model.optimiser.eta
583 | fit!(mach, rows=train, verbosity=2)
584 |
585 | # However, change any other parameter and training will restart from
586 | # scratch:
587 |
588 | model.lambda = 0.001
589 | fit!(mach, rows=train, verbosity=2)
590 |
591 | # Iterative models that implement warm-restart for training can be
592 | # controlled externally (eg, using an out-of-sample stopping
593 | # criterion). See
594 | # [here](https://alan-turing-institute.github.io/MLJ.jl/dev/controlling_iterative_models/)
595 | # for details.
596 |
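# For instance, here is a rough sketch of such external control. It
# assumes the `IteratedModel` wrapper and the `Step`, `Patience` and
# `NumberLimit` controls (provided by MLJIteration and re-exported by
# recent MLJ versions) are available, and that the classifier supports
# MLJ's iteration interface:

iterated_model = IteratedModel(model=model,
                               resampling=Holdout(fraction_train=0.7),
                               measure=cross_entropy,
                               controls=[Step(1),           # add one epoch per control cycle
                                         Patience(5),       # stop after 5 non-improving steps
                                         NumberLimit(100)]) # hard cap on control cycles
iterated_mach = machine(iterated_model, X, y)
fit!(iterated_mach)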
597 |
598 | # Let's train silently for a total of 50 epochs, and look at a
599 | # prediction:
600 |
601 | model.epochs = 50
602 | fit!(mach, rows=train)
603 | yhat = predict(mach, X[test,:]); # or predict(mach, rows=test)
604 | yhat[1]
605 |
606 | # What's going on here?
607 |
608 | info(model).prediction_type
609 |
610 | # **Important**:
611 | # - In MLJ, a model that can predict probabilities (and not just point values) will do so by default.
612 | # - For most probabilistic predictors, the predicted object is a `Distributions.Distribution` object, supporting the `Distributions.jl` [API](https://juliastats.org/Distributions.jl/latest/extends/#Create-a-Distribution-1) for such objects. In particular, the methods `rand`, `pdf`, `logpdf`, `mode`, `median` and `mean` will apply, where appropriate.
613 |
614 | # So, to obtain the probability of "Iris-virginica" in the first test
615 | # prediction, we do
616 |
617 | pdf(yhat[1], "Iris-virginica")
618 |
619 | # To get the most likely observation, we do
620 |
621 | mode(yhat[1])
622 |
623 | # These can be broadcast over multiple predictions in the usual way:
624 |
625 | broadcast(pdf, yhat[1:4], "Iris-versicolor")
626 |
627 | #-
628 |
629 | mode.(yhat[1:4])
630 |
631 | # Or, alternatively, you can use the `predict_mode` operation instead
632 | # of `predict`:
633 |
634 | predict_mode(mach, X[test,:])[1:4] # or predict_mode(mach, rows=test)[1:4]
635 |
636 | # For a more conventional matrix of probabilities you can do this:
637 |
638 | L = levels(y)
639 | pdf(yhat, L)[1:4, :]
640 |
641 | # However, in a typical MLJ work-flow, this is not as useful as you
642 | # might imagine. In particular, all probabilistic performance measures
643 | # in MLJ expect distribution objects in their first slot:
644 |
645 | cross_entropy(yhat, y[test]) |> mean
646 |
647 | # To apply a deterministic measure, we first need to obtain point-estimates:
648 |
649 | misclassification_rate(mode.(yhat), y[test])
650 |
651 | # We note in passing that there is also a search tool for measures
652 | # analogous to `models`:
653 |
654 | measures()
655 |
656 |
657 | # ### Step 4. Evaluate the model performance
658 |
659 | # Naturally, MLJ provides boilerplate code for carrying out a model
660 | # evaluation with a lot less fuss. Let's repeat the performance
661 | # evaluation above and add an extra measure, `brier_score`:
662 |
663 | evaluate!(mach, resampling=Holdout(fraction_train=0.7),
664 | measures=[cross_entropy, brier_score])
665 |
666 | # Or applying cross-validation instead:
667 |
668 | evaluate!(mach, resampling=CV(nfolds=6),
669 | measures=[cross_entropy, brier_score])
670 |
671 | # Or, Monte Carlo cross-validation (cross-validation repeated with
672 | # different randomized folds):
673 |
674 | e = evaluate!(mach, resampling=CV(nfolds=6, rng=123),
675 | repeats=3,
676 | measures=[cross_entropy, brier_score])
677 |
678 | # One can access the following properties of the output `e` of an
679 | # evaluation: `measure`, `measurement`, `per_fold` (measurement for
680 | # each fold) and `per_observation` (measurement per observation, if
681 | # reported).
682 |
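# For instance, continuing with the Monte Carlo evaluation `e` above:

e.measurement[1]  # aggregated cross entropy (the first measure) over all folds and repeats

#-

e.per_fold[1]     # one cross entropy value for each of the 6 folds x 3 repeats
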
683 | # We finally note that you can restrict the rows of observations from
684 | # which train and test folds are drawn, by specifying `rows=...`. For
685 | # example, imagining the last 30% of target observations are `missing`
686 | # you might have a work-flow like this:
687 |
688 | train, test = partition(eachindex(y), 0.7)
689 | mach = machine(model, X, y)
690 | evaluate!(mach, resampling=CV(nfolds=6),
691 | measures=[cross_entropy, brier_score],
692 | rows=train) # cv estimate, resampling from `train`
693 | fit!(mach, rows=train) # re-train using all of `train` observations
694 | predict(mach, rows=test); # and predict missing targets
695 |
696 |
697 | # ### On learning curves
698 |
699 | # Since our model is an iterative one, we might want to inspect the
700 | # out-of-sample performance as a function of the iteration
701 | # parameter. For this we can use the `learning_curve` function (which,
702 | # incidentally can be applied to any model hyper-parameter). This
703 | # starts by defining a one-dimensional range object for the parameter
704 | # (more on this when we discuss tuning in Part 4):
705 |
706 | r = range(model, :epochs, lower=1, upper=50, scale=:log)
707 |
708 | #-
709 |
710 | curve = learning_curve(mach,
711 | range=r,
712 | resampling=Holdout(fraction_train=0.7), # (default)
713 | measure=cross_entropy)
714 |
715 | using Plots
716 | gr(size=(490,300))
717 | plt=plot(curve.parameter_values, curve.measurements)
718 | xlabel!(plt, "epochs")
719 | ylabel!(plt, "cross entropy on holdout set")
720 | savefig("learning_curve.png")
721 | plt #!md
722 | #  #md
723 |
724 | # We will return to learning curves when we look at tuning in Part 4.
725 |
726 |
727 | # ### Resources for Part 2
728 |
729 | # - From the MLJ manual:
730 | # - [Getting Started](https://alan-turing-institute.github.io/MLJ.jl/dev/getting_started/)
731 | # - [Model Search](https://alan-turing-institute.github.io/MLJ.jl/dev/model_search/)
732 | # - [Evaluating Performance](https://alan-turing-institute.github.io/MLJ.jl/dev/evaluating_model_performance/) (using `evaluate!`)
733 | # - [Learning Curves](https://alan-turing-institute.github.io/MLJ.jl/dev/learning_curves/)
734 | # - [Performance Measures](https://alan-turing-institute.github.io/MLJ.jl/dev/performance_measures/) (loss functions, scores, etc)
735 | # - From Data Science Tutorials:
736 | # - [Choosing and evaluating a model](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/choosing-a-model/)
737 | # - [Fit, predict, transform](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/fit-and-predict/)
738 |
739 |
740 | # ### Exercises for Part 2
741 |
742 |
743 | # #### Exercise 4
744 |
745 | # (a) Identify all supervised MLJ models that can be applied (without
746 | # type coercion or one-hot encoding) to a supervised learning problem
747 | # with input features `X4` and target `y4` defined below:
748 |
749 | import Distributions
750 | poisson = Distributions.Poisson
751 |
752 | age = 18 .+ 60*rand(10);
753 | salary = coerce(rand(["small", "big", "huge"], 10), OrderedFactor);
754 | levels!(salary, ["small", "big", "huge"]);
755 | small = CategoricalValue("small", salary)
756 |
757 | #-
758 |
759 | X4 = DataFrames.DataFrame(age=age, salary=salary)
760 |
761 | n_devices(salary) = salary > small ? rand(poisson(1.3)) : rand(poisson(2.9))
762 | y4 = [n_devices(row.salary) for row in eachrow(X4)]
763 |
764 | # (b) What models can be applied if you coerce the salary to a
765 | # `Continuous` scitype?
766 |
767 |
768 | # #### Exercise 5 (unpack)
769 |
770 | # After evaluating the following ...
771 |
772 | data = (a = [1, 2, 3, 4],
773 | b = rand(4),
774 | c = rand(4),
775 | d = coerce(["male", "female", "female", "male"], OrderedFactor));
776 | pretty(data)
777 |
778 | #-
779 |
780 | using Tables
781 | y, X, w = unpack(data,
782 | ==(:a),
783 | name -> elscitype(Tables.getcolumn(data, name)) == Continuous,
784 | name -> true);
785 |
786 | # ...attempt to guess the evaluations of the following:
787 |
788 | y
789 |
790 | #-
791 |
792 | pretty(X)
793 |
794 | #-
795 |
796 | w
797 |
798 | # #### Exercise 6 (first steps in modeling Horse Colic)
799 |
800 | # (a) Suppose we want to predict the `:outcome` variable in the
801 | # Horse Colic study introduced in Part 1, based on the remaining
802 | # variables that are `Continuous` (one-hot encoding categorical
803 | # variables is discussed later in Part 3) *while ignoring the others*.
804 | # Extract from the `horse` data set (defined in Part 1) appropriate
805 | # input features `X` and target variable `y`. (Do not, however,
806 | # randomize the observations.)
807 |
808 | # (b) Create a 70:30 `train`/`test` split of the data and train a
809 | # `LogisticClassifier` model, from the `MLJLinearModels` package, on
810 | # the `train` rows. Use `lambda=100` and default values for the
811 | # other hyper-parameters. (Although one would normally standardize
812 | # (whiten) the continuous features for this model, do not do so here.)
813 | # After training:
814 |
815 | # - (i) Recalling that a logistic classifier (aka logistic regressor) is
816 | # a linear-based model learning a *vector* of coefficients for each
817 | # feature (one coefficient for each target class), use the
818 | # `fitted_params` method to find this vector of coefficients in the
819 | # case of the `:pulse` feature. (You can convert a vector of pairs `v =
820 | # [x1 => y1, x2 => y2, ...]` into a dictionary with `Dict(v)`.)
821 |
822 | # - (ii) Evaluate the `cross_entropy` performance on the `test`
823 | # observations.
824 |
825 | # - ☆(iii) In how many `test` observations does the predicted
826 | # probability of the observed class exceed 50%?
827 |
828 | # - (iv) Find the `misclassification_rate` in the `test`
829 | # set. (*Hint.* As this measure is deterministic, you will either
830 | # need to broadcast `mode` or use `predict_mode` instead of
831 | # `predict`.)
832 |
833 | # (c) Instead use a `RandomForestClassifier` model from the
834 | # `DecisionTree` package and:
835 | #
836 | # - (i) Generate an appropriate learning curve to convince yourself
837 | # that out-of-sample estimates of the `cross_entropy` loss do not
838 | # substantially improve for `n_trees > 50`. Use default values for
839 | # all other hyper-parameters, and feel free to use all available
840 | # data to generate the curve.
841 |
842 | # - (ii) Fix `n_trees=90` and use `evaluate!` to obtain a 9-fold
843 | # cross-validation estimate of the `cross_entropy`, restricting
844 | # sub-sampling to the `train` observations.
845 |
846 | # - (iii) Now use *all* available data but set
847 | # `resampling=Holdout(fraction_train=0.7)` to obtain a score you can
848 | #   compare with the `LogisticClassifier` in part (b)(ii). Which model is
849 | # better?
850 |
851 | #
852 |
853 |
854 | # ## Part 3 - Transformers and Pipelines
855 |
856 | # ### Transformers
857 |
858 | # Unsupervised models, which receive no target `y` during training,
859 | # always have a `transform` operation. They sometimes also support an
860 | # `inverse_transform` operation, with obvious meaning, and sometimes
861 | # support a `predict` operation (see the clustering example discussed
862 | # [here](https://alan-turing-institute.github.io/MLJ.jl/dev/transformers/#Transformers-that-also-predict-1)).
863 | # Otherwise, they are handled much like supervised models.
864 |
865 | # Here's a simple standardization example:
866 |
867 | x = rand(100);
868 | @show mean(x) std(x);
869 |
870 | #-
871 |
872 | model = Standardizer() # a built-in model
873 | mach = machine(model, x)
874 | fit!(mach)
875 | xhat = transform(mach, x);
876 | @show mean(xhat) std(xhat);
877 |
878 | # This particular model has an `inverse_transform`:
879 |
880 | inverse_transform(mach, xhat) ≈ x
881 |
882 |
883 | # ### Re-encoding the King County House data as continuous
884 |
885 | # For further illustrations of transformers, let's re-encode *all* of the
886 | # King County House input features (see [Ex
887 | # 3](#exercise-3-fixing-scitypes-in-a-table)) into a set of `Continuous`
888 | # features. We do this with the `ContinuousEncoder` model, which, by
889 | # default, will:
890 |
891 | # - one-hot encode all `Multiclass` features
892 | # - coerce all `OrderedFactor` features to `Continuous` ones
893 | # - coerce all `Count` features to `Continuous` ones (there aren't any)
894 | # - drop any remaining non-Continuous features (none of these either)
895 |
896 | # First, we reload the data and fix the scitypes (Exercise 3):
897 |
898 | file = CSV.File(joinpath(DIR, "data", "house.csv"));
899 | house = DataFrames.DataFrame(file);
900 | coerce!(house, autotype(file));
901 | coerce!(house, Count => Continuous, :zipcode => Multiclass);
902 | schema(house)
903 |
904 | #-
905 |
906 | y, X = unpack(house, ==(:price), name -> true, rng=123);
907 |
908 | # Instantiate the unsupervised model (transformer):
909 |
910 | encoder = ContinuousEncoder() # a built-in model; no need to @load it
911 |
912 | # Bind the model to the data and fit!
913 |
914 | mach = machine(encoder, X) |> fit!;
915 |
916 | # Transform and inspect the result:
917 |
918 | Xcont = transform(mach, X);
919 | schema(Xcont)
920 |
921 |
922 | # ### More transformers
923 |
924 | # Here's how to list all of MLJ's unsupervised models:
925 |
926 | models(m->!m.is_supervised)
927 |
928 | # Some commonly used ones are built-in (do not require `@load`ing):
929 |
930 | # model type | does what?
931 | # ----------------------------|----------------------------------------------
932 | # ContinuousEncoder | transform input table to a table of `Continuous` features (see above)
933 | # FeatureSelector | retain or dump selected features
934 | # FillImputer | impute missing values
935 | # OneHotEncoder | one-hot encode `Multiclass` (and optionally `OrderedFactor`) features
936 | # Standardizer | standardize (whiten) a vector or all `Continuous` features of a table
937 | # UnivariateBoxCoxTransformer | apply a learned Box-Cox transformation to a vector
938 | # UnivariateDiscretizer | discretize a `Continuous` vector, and hence render its elscitype `OrderedFactor`
939 |
940 |
941 | # In addition to "dynamic" transformers (ones that learn something
942 | # from the data and must be `fit!`) users can wrap ordinary functions
943 | # as transformers, and such *static* transformers can depend on
944 | # parameters, like the dynamic ones. See
945 | # [here](https://alan-turing-institute.github.io/MLJ.jl/dev/transformers/#Static-transformers-1)
946 | # for how to define your own static transformers.
947 |
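# As a rough sketch (the `Shifter` type and its single parameter are
# hypothetical; the calling pattern follows the manual section linked
# above):

mutable struct Shifter <: MLJ.Static  # a static transformer with one parameter
    offset::Float64
end

MLJ.transform(s::Shifter, _, x) = x .+ s.offset # second argument is the (trivial) fitresult

shift_mach = machine(Shifter(100.0)) |> fit!    # no data is bound to a static machine
transform(shift_mach, [1.0, 2.0, 3.0])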
948 |
949 | # ### Pipelines
950 |
951 | length(schema(Xcont).names)
952 |
953 | # Let's suppose that additionally we'd like to reduce the dimension of
954 | # our data. A model that will do this is `PCA` from
955 | # `MultivariateStats`:
956 |
957 | PCA = @load PCA
958 | reducer = PCA()
959 |
960 | # Now, rather than simply repeating the work-flow above, applying the new
961 | # transformation to `Xcont`, we can combine both the encoding and the
962 | # dimension-reducing models into a single model, known as a
963 | # *pipeline*. While MLJ offers a powerful interface for composing
964 | # models in a variety of ways, we'll stick to this simplest class of
965 | # composite models for now. The easiest way to construct them is using
966 | # the `@pipeline` macro:
967 |
968 | pipe = @pipeline encoder reducer
969 |
970 | # Notice that `pipe` is an *instance* of an automatically generated
971 | # type (called `Pipeline`).
972 |
973 | # The new model behaves like any other transformer:
974 |
975 | mach = machine(pipe, X)
976 | fit!(mach)
977 | Xsmall = transform(mach, X)
978 | schema(Xsmall)
979 |
980 | # Want to combine this pre-processing with ridge regression?
981 |
982 | RidgeRegressor = @load RidgeRegressor pkg=MLJLinearModels
983 | rgs = RidgeRegressor()
984 | pipe2 = @pipeline encoder reducer rgs
985 |
986 | # Now our pipeline is a supervised model, instead of a transformer,
987 | # whose performance we can evaluate:
988 |
989 | mach = machine(pipe2, X, y)
990 | evaluate!(mach, measure=mae, resampling=Holdout()) # CV(nfolds=6) is default
991 |
992 |
993 | # ### Training of composite models is "smart"
994 |
995 | # Now notice what happens if we train on all the data, then change a
996 | # regressor hyper-parameter and retrain:
997 |
998 | fit!(mach)
999 |
1000 | #-
1001 |
1002 | pipe2.ridge_regressor.lambda = 0.1
1003 | fit!(mach)
1004 |
1005 | # Second time only the ridge regressor is retrained!
1006 |
1007 | # Mutate a hyper-parameter of the `PCA` model and every model except
1008 | # the `ContinuousEncoder` (which comes before it) will be retrained:
1009 |
1010 | pipe2.pca.pratio = 0.9999
1011 | fit!(mach)
1012 |
1013 |
1014 | # ### Inspecting composite models
1015 |
1016 | # The dot syntax used above to change the values of *nested*
1017 | # hyper-parameters is also useful when inspecting the learned
1018 | # parameters and report generated when training a composite model:
1019 |
1020 | fitted_params(mach).ridge_regressor
1021 |
1022 | #-
1023 |
1024 | report(mach).pca
1025 |
1026 |
1027 | # ### Incorporating target transformations
1028 |
1029 | # Next, suppose that instead of using the raw `:price` as the
1030 | # training target, we want to use the log-price (a common practice in
1031 | # dealing with house price data). However, suppose that we still want
1032 | # to report final *predictions* on the original linear scale (and use
1033 | # these for evaluation purposes). Then we supply appropriate functions
1034 | # to key-word arguments `target` and `inverse`.
1035 |
1036 | # First we'll overload `log` and `exp` for broadcasting:
1037 | Base.log(v::AbstractArray) = log.(v)
1038 | Base.exp(v::AbstractArray) = exp.(v)
1039 |
1040 | # Now for the new pipeline:
1041 |
1042 | pipe3 = @pipeline encoder reducer rgs target=log inverse=exp
1043 | mach = machine(pipe3, X, y)
1044 | evaluate!(mach, measure=mae)
1045 |
1046 | # MLJ will also allow you to insert *learned* target
1047 | # transformations. For example, we might want to apply
1048 | # `Standardizer()` to the target, to standardize it, or
1049 | # `UnivariateBoxCoxTransformer()` to make it look Gaussian. Then
1050 | # instead of specifying a *function* for `target`, we specify a
1051 | # unsupervised *model* (or model type). One does not specify `inverse`
1052 | # because only models implementing `inverse_transform` are
1053 | # allowed.
1054 |
1055 | # Let's see which of these two options results in a better outcome:
1056 |
1057 | box = UnivariateBoxCoxTransformer(n=20)
1058 | stand = Standardizer()
1059 |
1060 | pipe4 = @pipeline encoder reducer rgs target=box
1061 | mach = machine(pipe4, X, y)
1062 | evaluate!(mach, measure=mae)
1063 |
1064 | #-
1065 |
1066 | pipe4.target = stand
1067 | evaluate!(mach, measure=mae)
1068 |
1069 |
1070 | # ### Resources for Part 3
1071 |
1072 | # - From the MLJ manual:
1073 | # - [Transformers and other unsupervised models](https://alan-turing-institute.github.io/MLJ.jl/dev/transformers/)
1074 | # - [Linear pipelines](https://alan-turing-institute.github.io/MLJ.jl/dev/linear_pipelines/#Linear-Pipelines)
1075 | # - From Data Science Tutorials:
1076 | # - [Composing models](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/composing-models/)
1077 |
1078 |
1079 | # ### Exercises for Part 3
1080 |
1081 | # #### Exercise 7
1082 |
1083 | # Consider again the Horse Colic classification problem considered in
1084 | # Exercise 6, but with all features, `Finite` and `Infinite`:
1085 |
1086 | y, X = unpack(horse, ==(:outcome), name -> true);
1087 | schema(X)
1088 |
1089 | # (a) Define a pipeline that:
1090 | # - uses `Standardizer` to ensure that features that are already
1091 | # continuous are centered at zero and have unit variance
1092 | # - re-encodes the full set of features as `Continuous`, using
1093 | # `ContinuousEncoder`
1094 | # - uses the `KMeans` clustering model from `Clustering.jl`
1095 | # to reduce the dimension of the feature space to `k=10`.
1096 | # - trains a `EvoTreeClassifier` (a gradient tree boosting
1097 | # algorithm in `EvoTrees.jl`) on the reduced data, using
1098 | # `nrounds=50` and default values for the other
1099 | # hyper-parameters
1100 |
1101 | # (b) Evaluate the pipeline on all data, using 6-fold cross-validation
1102 | # and `cross_entropy` loss.
1103 |
1104 | # ☆(c) Plot a learning curve which examines the effect on this loss
1105 | # as the tree booster parameter `max_depth` varies from 2 to 10.
1106 |
1107 | #
1108 |
1109 |
1110 | # ## Part 4 - Tuning Hyper-parameters
1111 |
1112 | # ### Naive tuning of a single parameter
1113 |
1114 | # The most naive way to tune a single hyper-parameter is to use
1115 | # `learning_curve`, which we already saw in Part 2. Let's see this in
1116 | # the Horse Colic classification problem, in a case where the parameter
1117 | # to be tuned is *nested* (because the model is a pipeline):
1118 |
1119 | y, X = unpack(horse, ==(:outcome), name -> true);
1120 |
1121 | LogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels
1122 | model = @pipeline Standardizer ContinuousEncoder LogisticClassifier
1123 | mach = machine(model, X, y)
1124 |
1125 | #-
1126 |
1127 | r = range(model, :(logistic_classifier.lambda), lower = 1e-2, upper=100, scale=:log10)
1128 |
1129 | # If you're curious, you can see what `lambda` values this range will
1130 | # generate for a given resolution:
1131 |
1132 | iterator(r, 5)
1133 |
1134 | #-
1135 |
1136 | _, _, lambdas, losses = learning_curve(mach,
1137 | range=r,
1138 | resampling=CV(nfolds=6),
1139 | resolution=30, # default
1140 | measure=cross_entropy)
1141 | plt=plot(lambdas, losses, xscale=:log10)
1142 | xlabel!(plt, "lambda")
1143 | ylabel!(plt, "cross entropy using 6-fold CV")
1144 | savefig("learning_curve2.png")
1145 | plt #!md
1146 |
1147 | #  #md
1148 |
1149 | best_lambda = lambdas[argmin(losses)]
1150 |
1151 |
1152 | # ### Self tuning models
1153 |
1154 | # A more sophisticated way to view hyper-parameter tuning (inspired by
1155 | # MLR) is as a model *wrapper*. The wrapped model is a new model in
1156 | # its own right and when you fit it, it tunes specified
1157 | # hyper-parameters of the model being wrapped, before training on all
1158 | # supplied data. Calling `predict` on the wrapped model is like
1159 | # calling `predict` on the original model, but with the
1160 | # hyper-parameters already optimized.
1161 |
1162 | # In other words, we can think of the wrapped model as a "self-tuning"
1163 | # version of the original.
1164 |
1165 | # We now create a self-tuning version of the pipeline above, adding a
1166 | # parameter from the `ContinuousEncoder` to the parameters we want
1167 | # optimized.
1168 |
1169 | # First, let's choose a tuning strategy (from [these
1170 | # options](https://github.com/juliaai/MLJTuning.jl#what-is-provided-here)). MLJ
1171 | # supports ordinary `Grid` search (query `?Grid` for
1172 | # details). However, as the utility of `Grid` search is limited to a
1173 | # small number of parameters, and as `Grid` searches are demonstrated
1174 | # elsewhere (see the [resources below](#resources-for-part-4)) we'll
1175 | # demonstrate `RandomSearch` here:
1176 |
1177 | tuning = RandomSearch(rng=123)
1178 |
1179 | # In this strategy each parameter is sampled according to a
1180 | # pre-specified prior distribution that is fit to the one-dimensional
1181 | # range object constructed using `range` as before. While one has a
1182 | # lot of control over the specification of the priors (run
1183 | # `?RandomSearch` for details) we'll let the algorithm generate these
1184 | # priors automatically.
1185 |
1186 |
1187 | # #### Unbounded ranges and sampling
1188 |
1189 | # In MLJ a range does not have to be bounded. In a `RandomSearch` a
1190 | # positive unbounded range is sampled using a `Gamma` distribution, by
1191 | # default:
1192 |
1193 | r = range(model,
1194 | :(logistic_classifier.lambda),
1195 | lower=0,
1196 | origin=6,
1197 | unit=5,
1198 | scale=:log10)
1199 |
1200 | # The `scale` in a range makes no difference in a `RandomSearch`
1201 | # (unless it is a function), but it does affect the
1202 | # later plots.
1203 |
1204 | # Let's see what sampling using a Gamma distribution is going to mean
1205 | # for this range:
1206 |
1207 | import Distributions
1208 | sampler_r = sampler(r, Distributions.Gamma)
1209 | plt = histogram(rand(sampler_r, 10000), nbins=50)
1210 | savefig("gamma_sampler.png")
1211 | plt #!md
1212 |
1213 | # 
1214 |
1215 | # The second parameter that we'll add to this is *nominal* (finite) and, by
1216 | # default, will be sampled uniformly. Since it is nominal, we specify
1217 | # `values` instead of `upper` and `lower` bounds:
1218 |
1219 | s = range(model, :(continuous_encoder.one_hot_ordered_factors),
1220 | values = [true, false])
1221 |
1222 |
1223 | # #### The tuning wrapper
1224 |
1225 | # Now for the wrapper, which is an instance of `TunedModel`:
1226 |
1227 | tuned_model = TunedModel(model=model,
1228 | ranges=[r, s],
1229 | resampling=CV(nfolds=6),
1230 | measures=cross_entropy,
1231 | tuning=tuning,
1232 | n=15)
1233 |
1234 | # We can apply the `fit!/predict` work-flow to `tuned_model` just as
1235 | # for any other model:
1236 |
1237 | tuned_mach = machine(tuned_model, X, y);
1238 | fit!(tuned_mach);
1239 | predict(tuned_mach, rows=1:3)
1240 |
1241 | # The outcomes of the tuning can be inspected from a detailed
1242 | # report. For example, we have:
1243 |
1244 | rep = report(tuned_mach);
1245 | rep.best_model
1246 |
1247 | # By default, sampling of a bounded range is uniform.
1248 |
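# As a quick check of this claim, here is a minimal sketch using a
# bounded variant of the `lambda` range (the bounds and the name
# `r_bounded` are our own choices):

r_bounded = range(model, :(logistic_classifier.lambda), lower=1, upper=9)
histogram(rand(sampler(r_bounded, Distributions.Uniform), 10000), nbins=50)
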
1249 | # In the special case of two parameters, you can also plot the results:
1250 |
1251 | plt = plot(tuned_mach)
1252 | savefig("tuning.png")
1253 | plt #!md
1254 |
1255 | #  #md
1256 |
1257 | # Finally, let's compare cross-validation estimate of the performance
1258 | # of the self-tuning model with that of the original model (an example
1259 | # of [*nested
1260 | # resampling*](https://mlr.mlr-org.com/articles/tutorial/nested_resampling.html)
1261 | # here):
1262 |
1263 | err = evaluate!(mach, resampling=CV(nfolds=3), measure=cross_entropy)
1264 |
1265 | #-
1266 |
1267 | tuned_err = evaluate!(tuned_mach, resampling=CV(nfolds=3), measure=cross_entropy)
1268 |
1269 | #
1270 |
1271 |
1272 | # ### Resources for Part 4
1273 | #
1274 | # - From the MLJ manual:
1275 | # - [Learning Curves](https://alan-turing-institute.github.io/MLJ.jl/dev/learning_curves/)
1276 | # - [Tuning Models](https://alan-turing-institute.github.io/MLJ.jl/dev/tuning_models/)
1277 | # - The [MLJTuning repo](https://github.com/juliaai/MLJTuning.jl#who-is-this-repo-for) - mostly for developers
1278 | #
1279 | # - From Data Science Tutorials:
1280 | #   - [Tuning a model](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/model-tuning/)
1281 | #   - [Crabs with XGBoost](https://juliaai.github.io/DataScienceTutorials.jl/end-to-end/crabs-xgb/) - `Grid` tuning in stages for a tree-boosting model with many parameters
1282 | #   - [Boston with LightGBM](https://juliaai.github.io/DataScienceTutorials.jl/end-to-end/boston-lgbm/) - `Grid` tuning for another popular tree-booster
1283 | #   - [Boston with Flux](https://juliaai.github.io/DataScienceTutorials.jl/end-to-end/boston-flux/) - optimizing batch size in a simple neural network regressor
1284 | #   - [UCI Horse Colic Data Set](http://archive.ics.uci.edu/ml/datasets/Horse+Colic)
1285 |
1286 |
1287 | # ### Exercises for Part 4
1288 |
1289 | # #### Exercise 8
1290 |
1291 | # This exercise continues our analysis of the King County House price
1292 | # prediction problem:
1293 |
1294 | y, X = unpack(house, ==(:price), name -> true, rng=123);
1295 |
1296 | # Your task will be to tune the following pipeline regression model,
1297 | # which includes a gradient tree boosting component:
1298 |
1299 | EvoTreeRegressor = @load EvoTreeRegressor
1300 | tree_booster = EvoTreeRegressor(nrounds = 70)
1301 | model = @pipeline ContinuousEncoder tree_booster
1302 |
1303 | # (a) Construct a bounded range `r1` for the `evo_tree_booster`
1304 | # parameter `max_depth`, varying between 1 and 12.
1305 |
1306 | # (b) For the `nbins` parameter of the `EvoTreeRegressor`, define the range
1307 |
1308 | r2 = range(model,
1309 | :(evo_tree_regressor.nbins),
1310 | lower = 2.5,
1311 |            upper = 7.5, scale=x->2^round(Int, x))
1312 |
1313 | # Notice that in this case we've specified a *function* instead of a
1314 | # canned scale, like `:log10`. In this case the `scale` function is
1315 | # applied after sampling (uniformly) between the limits of `lower` and
1316 | # `upper`. Perhaps you can guess the outputs of the following lines of
1317 | # code?
1318 |
1319 | r2_sampler = sampler(r2, Distributions.Uniform)
1320 | samples = rand(r2_sampler, 1000);
1321 | plt = histogram(samples, nbins=50)
1322 | savefig("uniform_sampler.png")
1323 |
1324 | plt #!md
1325 |
1326 | # 
1327 |
1328 | sort(unique(samples))
1329 |
1330 | # (c) Optimize `model` over the parameter ranges `r1` and `r2`
1331 | # using a random search with uniform priors (the default). Use
1332 | # `Holdout()` resampling, and implement your search by first
1333 | # constructing a "self-tuning" wrap of `model`, as described
1334 | # above. Make `mae` (mean absolute error) the loss function that you
1335 | # optimize, and search over a total of 40 combinations of
1336 | # hyper-parameters. If you have time, plot the results of your
1337 | # search. Feel free to use all available data.
1338 |
1339 | # (d) Evaluate the best model found in the search using 3-fold
1340 | # cross-validation and compare with that of the self-tuning model
1341 | # (which is different!). Setting data hygiene concerns aside, feel
1342 | # free to use all available data.
1343 |
1344 | #
1345 |
1346 |
1347 | # ## Part 5 - Advanced Model Composition
1348 |
1349 | # > **Goals:**
1350 | # > 1. Learn how to build a prototype of a composite model, called a *learning network*
1351 | # > 2. Learn how to use the `@from_network` macro to export a learning network as a new stand-alone model type
1352 |
1353 | # While `@pipeline` is great for composing models in an unbranching
1354 | # sequence, for more complicated model composition you'll want to use
1355 | # MLJ's generic model composition syntax. There are two main steps:
1356 |
1357 | # - **Prototype** the composite model by building a *learning
1358 | # network*, which can be tested on some (dummy) data as you build
1359 | # it.
1360 |
1361 | # - **Export** the learning network as a new stand-alone model type.
1362 |
1363 | # Like pipeline models, instances of the exported model type behave
1364 | # like any other model (and are not bound to any data, until you wrap
1365 | # them in a machine).
1366 |
1367 |
1368 | # ### Building a pipeline using the generic composition syntax
1369 |
1370 | # To warm up, we'll do the equivalent of
1371 |
1372 | pipe = @pipeline Standardizer LogisticClassifier;
1373 |
1374 | # using the generic syntax.
1375 |
1376 | # Here's some dummy data we'll be using to test our learning network:
1377 |
1378 | X, y = make_blobs(5, 3)
1379 | pretty(X)
1380 |
1381 | # **Step 0** - Proceed as if you were combining the models "by hand",
1382 | # using all the data available for training, transforming and
1383 | # prediction:
1384 |
1385 | stand = Standardizer();
1386 | linear = LogisticClassifier();
1387 |
1388 | mach1 = machine(stand, X);
1389 | fit!(mach1);
1390 | Xstand = transform(mach1, X);
1391 |
1392 | mach2 = machine(linear, Xstand, y);
1393 | fit!(mach2);
1394 | yhat = predict(mach2, Xstand)
1395 |
1396 | # **Step 1** - Edit your code as follows:
1397 |
1398 | # - pre-wrap the data in `Source` nodes
1399 |
1400 | # - delete the `fit!` calls
1401 |
1402 | X = source(X) # or X = source() if not testing
1403 | y = source(y) # or y = source()
1404 |
1405 | stand = Standardizer();
1406 | linear = LogisticClassifier();
1407 |
1408 | mach1 = machine(stand, X);
1409 | Xstand = transform(mach1, X);
1410 |
1411 | mach2 = machine(linear, Xstand, y);
1412 | yhat = predict(mach2, Xstand)
1413 |
1414 | # Now `X`, `y`, `Xstand` and `yhat` are *nodes* ("variables" or
1415 | # "dynammic data") instead of data. All training, predicting and
1416 | # transforming is now executed lazily, whenever we `fit!` one of these
1417 | # nodes. We *call* a node to retrieve the data it represents in the
1418 | # original manual workflow.
1419 |
1420 | fit!(Xstand)
1421 | Xstand() |> pretty
1422 |
1423 | #-
1424 |
1425 | fit!(yhat);
1426 | yhat()
1427 |
1428 | # The node `yhat` is the "descendant" (in an associated DAG we have
1429 | # defined) of a unique source node:
1430 |
1431 | sources(yhat)
1432 |
1433 | #-
1434 |
1435 | # The data at the source node is replaced by `Xnew` to obtain a
1436 | # new prediction when we call `yhat` like this:
1437 |
1438 | Xnew, _ = make_blobs(2, 3);
1439 | yhat(Xnew)
1440 |
1441 |
1442 | # **Step 2** - Export the learning network as a new stand-alone model type
1443 |
1444 | # Now, somewhat paradoxically, we can wrap the whole network in a
1445 | # special machine - called a *learning network machine* - before we have
1446 | # defined the new model type. Indeed, doing so is a necessary step in
1447 | # the export process, for this machine will tell the export macro:
1448 |
1449 | # - what kind of model the composite will be (`Deterministic`,
1450 | #   `Probabilistic` or `Unsupervised`)
1451 |
1452 | # - which source nodes are input nodes and which are for the target
1453 |
1454 | # - which nodes correspond to each operation (`predict`, `transform`,
1455 | # etc) that we might want to define
1456 |
1457 | surrogate = Probabilistic() # a model with no fields!
1458 | mach = machine(surrogate, X, y; predict=yhat)
1459 |
1460 | # Although we have no real need to use it, this machine behaves like
1461 | # you'd expect it to:
1462 |
1463 | Xnew, _ = make_blobs(2, 3)
1464 | fit!(mach)
1465 | predict(mach, Xnew)
1466 |
1467 | #-
1468 |
1469 | # Now we create a new model type using a Julia `struct` definition
1470 | # appropriately decorated:
1471 |
1472 | @from_network mach begin
1473 | mutable struct YourPipe
1474 | standardizer = stand
1475 | classifier = linear::Probabilistic
1476 | end
1477 | end
1478 |
1479 | # Instantiating and evaluating on some new data:
1480 |
1481 | pipe = YourPipe()
1482 | X, y = @load_iris; # built-in data set
1483 | mach = machine(pipe, X, y)
1484 | evaluate!(mach, measure=misclassification_rate, operation=predict_mode)
1485 |
1486 |
1487 | # ### A composite model to average two regressor predictors
1488 |
1489 | # The following is a condensed version of
1490 | # [this](https://github.com/alan-turing-institute/MLJ.jl/blob/master/binder/MLJ_demo.ipynb)
1491 | # tutorial. We will define a composite model that:
1492 |
1493 | # - standardizes the input data
1494 |
1495 | # - learns and applies a Box-Cox transformation to the target variable
1496 |
1497 | # - blends the predictions of two supervised learning models - a ridge
1498 | # regressor and a random forest regressor; we'll blend using a simple
1499 | # average (for a more sophisticated stacking example, see
1500 | # [here](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/stacking/))
1501 |
1502 | # - applies the *inverse* Box-Cox transformation to this blended prediction
1503 |
1504 | RandomForestRegressor = @load RandomForestRegressor pkg=DecisionTree
1505 |
1506 | # **Input layer**
1507 |
1508 | X = source()
1509 | y = source()
1510 |
1511 | # **First layer and target transformation**
1512 |
1513 | std_model = Standardizer()
1514 | stand = machine(std_model, X)
1515 | W = MLJ.transform(stand, X)
1516 |
1517 | box_model = UnivariateBoxCoxTransformer()
1518 | box = machine(box_model, y)
1519 | z = MLJ.transform(box, y)
1520 |
1521 | # **Second layer**
1522 |
1523 | ridge_model = RidgeRegressor(lambda=0.1)
1524 | ridge = machine(ridge_model, W, z)
1525 |
1526 | forest_model = RandomForestRegressor(n_trees=50)
1527 | forest = machine(forest_model, W, z)
1528 |
1529 | ẑ = 0.5*predict(ridge, W) + 0.5*predict(forest, W)
1530 |
1531 | # **Output**
1532 |
1533 | ŷ = inverse_transform(box, ẑ)
1534 |
1535 | # With the learning network defined, we're ready to export:
1536 |
1537 | @from_network machine(Deterministic(), X, y, predict=ŷ) begin
1538 | mutable struct CompositeModel
1539 | rgs1 = ridge_model
1540 | rgs2 = forest_model
1541 | end
1542 | end
1543 |
1544 | # Let's instantiate the new model type and try it out on some data:
1545 |
1546 | composite = CompositeModel()
1547 |
1548 | #-
1549 |
1550 | X, y = @load_boston;
1551 | mach = machine(composite, X, y);
1552 | evaluate!(mach,
1553 | resampling=CV(nfolds=6, shuffle=true),
1554 | measures=[rms, mae])
1555 |
1556 |
1557 | # ### Resources for Part 5
1558 | #
1559 | # - From the MLJ manual:
1560 | #   - [Learning Networks](https://alan-turing-institute.github.io/MLJ.jl/stable/composing_models/#Learning-Networks-1)
1561 | # - From Data Science Tutorials:
1562 | #   - [Learning Networks](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/learning-networks/)
1563 | #   - [Learning Networks 2](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/learning-networks-2/)
1564 |
1565 | # - [Stacking](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/stacking/): an advanced example of model composition
1566 |
1567 | # - [Finer Control](https://alan-turing-institute.github.io/MLJ.jl/dev/composing_models/#Method-II:-Finer-control-(advanced)-1):
1568 | # exporting learning networks without a macro for finer control
1569 |
1570 | #
1571 |
1572 |
1573 | # ## Solutions to exercises
1574 |
1575 | # #### Exercise 2 solution
1576 |
1577 | quality = coerce(quality, OrderedFactor);
1578 | levels!(quality, ["poor", "good", "excellent"]);
1579 | elscitype(quality)
1580 |
1581 |
1582 | # #### Exercise 3 solution
1583 |
1584 | # First pass:
1585 |
1586 | coerce!(house, autotype(house));
1587 | schema(house)
1588 |
1589 | #-
1590 |
1591 | # All the "sqft" fields refer to "square feet" so are
1592 | # really `Continuous`. We'll regard `:yr_built` (the other `Count`
1593 | # variable above) as `Continuous` as well. So:
1594 |
1595 | coerce!(house, Count => Continuous);
1596 |
1597 | # And `:zipcode` should not be ordered:
1598 |
1599 | coerce!(house, :zipcode => Multiclass);
1600 | schema(house)
1601 |
1602 | # `:bathrooms` looks like it has a lot of levels, but on further
1603 | # inspection we see why, and `OrderedFactor` remains appropriate:
1604 |
1605 | import StatsBase.countmap
1606 | countmap(house.bathrooms)
1607 |
1608 |
1609 | # #### Exercise 4 solution
1610 |
1611 | # 4(a)
1612 |
1613 | # There are *no* models that apply immediately:
1614 |
1615 | models(matching(X4, y4))
1616 |
1617 | # 4(b)
1618 |
1619 | y4 = coerce(y4, Continuous);
1620 | models(matching(X4, y4))
1621 |
1622 |
1623 | # #### Exercise 6 solution
1624 |
1625 | # 6(a)
1626 |
1627 | y, X = unpack(horse,
1628 | ==(:outcome),
1629 | name -> elscitype(Tables.getcolumn(horse, name)) == Continuous);
1630 |
1631 | # 6(b)(i)
1632 |
1633 | model = (@load LogisticClassifier pkg=MLJLinearModels)();
1634 | model.lambda = 100
1635 | mach = machine(model, X, y)
1636 | fit!(mach, rows=train)
1637 | fitted_params(mach)
1638 |
1639 | #-
1640 |
1641 | coefs_given_feature = Dict(fitted_params(mach).coefs)
1642 | coefs_given_feature[:pulse]
1643 |
1644 | # 6(b)(ii)
1645 |
1646 | yhat = predict(mach, rows=test); # or predict(mach, X[test,:])
1647 | err = cross_entropy(yhat, y[test]) |> mean
1648 |
1649 | # 6(b)(iii)
1650 |
1651 | # The predicted probabilities of the actual observations in the test
1652 | # are given by
1653 |
1654 | p = broadcast(pdf, yhat, y[test]);
1655 |
1656 | # The number of times this probability exceeds 50% is:
1657 | n50 = filter(x -> x > 0.5, p) |> length
1658 |
1659 | # Or, as a proportion:
1660 |
1661 | n50/length(test)
1662 |
1663 | # 6(b)(iv)
1664 |
1665 | misclassification_rate(mode.(yhat), y[test])
1666 |
1667 | # 6(c)(i)
1668 |
1669 | model = (@load RandomForestClassifier pkg=DecisionTree)()
1670 | mach = machine(model, X, y)
1671 | evaluate!(mach, resampling=CV(nfolds=6), measure=cross_entropy)
1672 |
1673 | r = range(model, :n_trees, lower=10, upper=70, scale=:log10)
1674 |
1675 | # Since random forests are inherently randomized, we generate multiple
1676 | # curves:
1677 |
1678 | plt = plot()
1679 | for i in 1:4
1680 | one_curve = learning_curve(mach,
1681 | range=r,
1682 | resampling=Holdout(),
1683 | measure=cross_entropy)
1684 | plot!(one_curve.parameter_values, one_curve.measurements)
1685 | end
1686 | xlabel!(plt, "n_trees")
1687 | ylabel!(plt, "cross entropy")
1688 | savefig("exercise_6ci.png")
1689 | plt #!md
1690 |
1691 | #  #md
1692 |
1693 |
1694 | # 6(c)(ii)
1695 |
1696 | evaluate!(mach, resampling=CV(nfolds=9),
1697 | measure=cross_entropy,
1698 | rows=train).measurement[1]
1699 |
1700 | model.n_trees = 90
1701 |
1702 | # 6(c)(iii)
1703 |
1704 | err_forest = evaluate!(mach, resampling=Holdout(),
1705 | measure=cross_entropy).measurement[1]
1706 |
1707 | # #### Exercise 7
1708 |
1709 | # (a)
1710 |
1711 | KMeans = @load KMeans pkg=Clustering
1712 | EvoTreeClassifier = @load EvoTreeClassifier
1713 | pipe = @pipeline(Standardizer,
1714 | ContinuousEncoder,
1715 | KMeans(k=10),
1716 | EvoTreeClassifier(nrounds=50))
1717 |
1718 | # (b)
1719 |
1720 | mach = machine(pipe, X, y)
1721 | evaluate!(mach, resampling=CV(nfolds=6), measure=cross_entropy)
1722 |
1723 | # (c)
1724 |
1725 | r = range(pipe, :(evo_tree_classifier.max_depth), lower=1, upper=10)
1726 |
1727 | curve = learning_curve(mach,
1728 | range=r,
1729 | resampling=CV(nfolds=6),
1730 | measure=cross_entropy)
1731 |
1732 | plt = plot(curve.parameter_values, curve.measurements)
1733 | xlabel!(plt, "max_depth")
1734 | ylabel!(plt, "CV estimate of cross entropy")
1735 | savefig("exercise_7c.png")
1736 | plt #!md
1737 |
1738 | #  #md
1739 |
1740 | # Here's a second curve using a different random seed for the booster:
1741 |
1742 | using Random
1743 | pipe.evo_tree_classifier.rng = MersenneTwister(123)
1744 | curve = learning_curve(mach,
1745 | range=r,
1746 | resampling=CV(nfolds=6),
1747 | measure=cross_entropy)
1748 | plot!(curve.parameter_values, curve.measurements)
1749 | savefig("exercise_7c_2.png")
1750 | plt #!md
1751 |
1752 | #  #md
1753 |
1754 | # One can automate the production of multiple curves with different
1755 | # seeds in the following way:
1756 | curves = learning_curve(mach,
1757 | range=r,
1758 | resampling=CV(nfolds=6),
1759 | measure=cross_entropy,
1760 | rng_name=:(evo_tree_classifier.rng),
1761 | rngs=6) # list of RNGs, or num to auto generate
1762 | plt = plot(curves.parameter_values, curves.measurements)
1763 | savefig("exercise_7c_3.png")
1764 | plt #!md
1765 |
1766 | #  #md
1767 |
1768 | # If you have multiple threads available in your julia session, you
1769 | # can add the option `acceleration=CPUThreads()` to speed up this
1770 | # computation.
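
# For example - here is a minimal sketch of the previous call with
# threading enabled:

curves = learning_curve(mach,
                        range=r,
                        resampling=CV(nfolds=6),
                        measure=cross_entropy,
                        rng_name=:(evo_tree_classifier.rng),
                        rngs=6,
                        acceleration=CPUThreads())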
1771 |
1772 | # #### Exercise 8
1773 |
1774 | y, X = unpack(house, ==(:price), name -> true, rng=123);
1775 |
1776 | EvoTreeRegressor = @load EvoTreeRegressor
1777 | tree_booster = EvoTreeRegressor(nrounds = 70)
1778 | model = @pipeline ContinuousEncoder tree_booster
1779 |
1780 | # (a)
1781 |
1782 | r1 = range(model, :(evo_tree_regressor.max_depth), lower=1, upper=12)
1783 |
1784 | # (c)
1785 |
1786 | tuned_model = TunedModel(model=model,
1787 | ranges=[r1, r2],
1788 | resampling=Holdout(),
1789 | measures=mae,
1790 | tuning=RandomSearch(rng=123),
1791 | n=40)
1792 |
1793 | tuned_mach = machine(tuned_model, X, y) |> fit!
1794 | plt = plot(tuned_mach)
1795 | savefig("exercise_8c.png")
1796 | plt #!md
1797 |
1798 | #  #md
1799 |
1800 | # (d)
1801 |
1802 | best_model = report(tuned_mach).best_model;
1803 | best_mach = machine(best_model, X, y);
1804 | best_err = evaluate!(best_mach, resampling=CV(nfolds=3), measure=mae)
1805 |
1806 | #-
1807 |
1808 | tuned_err = evaluate!(tuned_mach, resampling=CV(nfolds=3), measure=mae)
1809 |
1810 |
1811 | using Literate #src
1812 | Literate.markdown(@__FILE__, DIR, execute=true) #src
1813 | Literate.notebook(@__FILE__, DIR, execute=false) #src
1814 |
--------------------------------------------------------------------------------
/vecstack.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/vecstack.png
--------------------------------------------------------------------------------
/wow.jl:
--------------------------------------------------------------------------------
1 | # # State-of-the-art model composition in MLJ (Machine Learning in Julia)
2 |
3 | # In this script we use model stacking to demonstrate the ease with
4 | # which machine learning models can be combined in sophisticated ways
5 | # using MLJ. In practice, one would use MLJ's [canned stacking model
6 | # constructor](https://alan-turing-institute.github.io/MLJ.jl/dev/model_stacking/#Model-Stacking)
7 | # `Stack`. Here, however, we give a quick demonstration of how you would
8 | # build a stack yourself, using MLJ's generic model composition
9 | # syntax, which is an extension of the normal fit/predict syntax.
10 |
11 | # For a more leisurely notebook on the same material, see
12 | # [this](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/stacking/)
13 | # tutorial.
14 |
15 |
16 | DIR = @__DIR__
17 | include(joinpath(DIR, "setup.jl"))
18 |
19 | # ## Stacking is hard
20 |
21 | # [Model
22 | # stacking](https://alan-turing-institute.github.io/DataScienceTutorials.jl/getting-started/stacking/),
23 | # popular in Kaggle data science competitions, is a sophisticated way
24 | # to blend the predictions of multiple models.
25 |
26 | # With the python toolbox
27 | # [scikit-learn](https://scikit-learn.org/stable/) (or its [Julia
28 | # wrapper](https://github.com/cstjean/ScikitLearn.jl)) you can use
29 | # pipelines to build composite models in simple ways, but (automated)
30 | # stacking is beyond its capabilities.
31 |
32 | # One python alternative is to use
33 | # [vecstack](https://github.com/vecxoz/vecstack). The [core
34 | # algorithm](https://github.com/vecxoz/vecstack/blob/master/vecstack/core.py)
35 | # is about eight pages (without the scikit-learn interface):
36 |
37 | # .
38 |
39 | # ## Stacking is easy (in MLJ)
40 |
41 | # Using MLJ's [generic model composition
42 | # API](https://alan-turing-institute.github.io/MLJ.jl/dev/composing_models/)
43 | # you can build a stack in about a page.
44 |
45 | # Here's the complete code needed to define a new model type that
46 | # stacks two base regressors and one adjudicator in MLJ. To keep the code
47 | # easy to read, we use only three folds to create the base-learner
48 | # [out-of-sample
49 | # predictions](https://alan-turing-institute.github.io/DataScienceTutorials.jl/getting-started/stacking/). You can make this generic with little fuss.
50 |
51 | using MLJ
52 |
53 | folds(data, nfolds) =
54 | partition(1:nrows(data), (1/nfolds for i in 1:(nfolds-1))...);
55 |
56 | # these models are only going to be default choices for the stack:
57 |
58 | LinearRegressor = @load LinearRegressor pkg=MLJLinearModels
59 | model1 = LinearRegressor()
60 | model2 = LinearRegressor()
61 | judge = LinearRegressor()
62 |
63 | X = source()
64 | y = source()
65 |
66 | folds(X::AbstractNode, nfolds) = node(XX->folds(XX, nfolds), X)
67 | MLJ.restrict(X::AbstractNode, f::AbstractNode, i) =
68 | node((XX, ff) -> restrict(XX, ff, i), X, f);
69 | MLJ.corestrict(X::AbstractNode, f::AbstractNode, i) =
70 | node((XX, ff) -> corestrict(XX, ff, i), X, f);
71 |
72 | f = folds(X, 3)
73 |
74 | m11 = machine(model1, corestrict(X, f, 1), corestrict(y, f, 1))
75 | m12 = machine(model1, corestrict(X, f, 2), corestrict(y, f, 2))
76 | m13 = machine(model1, corestrict(X, f, 3), corestrict(y, f, 3))
77 |
78 | y11 = predict(m11, restrict(X, f, 1));
79 | y12 = predict(m12, restrict(X, f, 2));
80 | y13 = predict(m13, restrict(X, f, 3));
81 |
82 | m21 = machine(model2, corestrict(X, f, 1), corestrict(y, f, 1))
83 | m22 = machine(model2, corestrict(X, f, 2), corestrict(y, f, 2))
84 | m23 = machine(model2, corestrict(X, f, 3), corestrict(y, f, 3))
85 |
86 | y21 = predict(m21, restrict(X, f, 1));
87 | y22 = predict(m22, restrict(X, f, 2));
88 | y23 = predict(m23, restrict(X, f, 3));
89 |
90 | y1_oos = vcat(y11, y12, y13);
91 | y2_oos = vcat(y21, y22, y23);
92 |
93 | X_oos = MLJ.table(hcat(y1_oos, y2_oos))
94 |
95 | m_judge = machine(judge, X_oos, y)
96 |
97 | m1 = machine(model1, X, y)
98 | m2 = machine(model2, X, y)
99 |
100 | y1 = predict(m1, X);
101 | y2 = predict(m2, X);
102 |
103 | X_judge = MLJ.table(hcat(y1, y2))
104 | yhat = predict(m_judge, X_judge)
105 |
106 | @from_network machine(Deterministic(), X, y; predict=yhat) begin
107 | mutable struct MyStack
108 | regressor1=model1
109 | regressor2=model2
110 | judge=judge
111 | end
112 | end
113 |
114 | my_stack = MyStack()
115 |
116 | # For the curious: Only the last block defines the new model type. The
117 | # rest defines a *[learning network](https://alan-turing-institute.github.io/MLJ.jl/dev/composing_models/)* - a kind of working prototype
118 | # or blueprint for the type. If the source nodes `X` and `y` wrap some
119 | # data (instead of nothing) then the network can be trained and tested
120 | # as you build it.
121 |
122 |
123 | # ## Composition plays well with other work-flows
124 |
125 | # We did not include standardization of inputs and target (with
126 | # post-prediction inversion) in our stack. However, we can add these
127 | # now, using MLJ's canned pipeline composition:
128 |
129 | pipe = @pipeline Standardizer my_stack target=Standardizer
130 |
131 | # Want to change a base learner and adjudicator?
132 |
133 | DecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree;
134 | KNNRegressor = @load KNNRegressor;
135 | pipe.my_stack.regressor2 = DecisionTreeRegressor()
136 | pipe.my_stack.judge = KNNRegressor();
137 |
138 | # Want a CV estimate of performance of the complete model on some data?
139 |
140 | X, y = @load_boston;
141 | mach = machine(pipe, X, y)
142 | evaluate!(mach, resampling=CV(), measure=mae)
143 |
144 | # Want to inspect the learned parameters of the adjudicator?
145 |
146 | fp = fitted_params(mach);
147 | fp.my_stack.judge
148 |
149 | # What about the first base-learner of the stack? There are four sets
150 | # of learned parameters! One for each fold to make an out-of-sample
151 | # prediction, and one trained on all the data:
152 |
153 | fp.my_stack.regressor1
154 |
155 | #-
156 |
157 | fp.my_stack.regressor1[1].coefs
158 |
159 | # Want to tune multiple (nested) hyperparameters in the stack? Tuning is a
160 | # model wrapper (for better composition!):
161 |
162 | r1 = range(pipe, :(my_stack.regressor2.max_depth), lower = 1, upper = 25, scale=:linear)
163 | r2 = range(pipe, :(my_stack.judge.K), lower=1, origin=10, unit=10, scale=:log10)
164 |
165 | import Distributions.Poisson
166 |
167 | tuned_pipe = TunedModel(model=pipe,
168 | ranges=[r1, (r2, Poisson)],
169 | tuning=RandomSearch(),
170 | resampling=CV(),
171 | measure=rms,
172 | n=100)
173 | mach = machine(tuned_pipe, X, y) |> fit!
174 | best_model = fitted_params(mach).best_model
175 | K = fitted_params(mach).best_model.my_stack.judge.K;
176 | max_depth = fitted_params(mach).best_model.my_stack.regressor2.max_depth
177 | @show K max_depth;
178 |
179 | # Visualize tuning results:
180 |
181 | using Plots
182 | gr(size=(700,700*(sqrt(5) - 1)/2))
183 | plt = plot(mach)
184 | savefig("stacking.png")
185 | plt #!md
186 |
187 | # 
188 |
189 | using Literate #src
190 | Literate.markdown(@__FILE__, @__DIR__, execute=false) #src
191 | Literate.notebook(@__FILE__, @__DIR__, execute=true) #src
192 |
--------------------------------------------------------------------------------
/wow.md:
--------------------------------------------------------------------------------
1 | ```@meta
2 | EditURL = "/wow.jl"
3 | ```
4 |
5 | # State-of-the-art model composition in MLJ (Machine Learning in Julia)
6 |
7 | In this script we use model stacking to demonstrate the ease with
8 | which machine learning models can be combined in sophisticated ways
9 | using MLJ. In practice, one would use MLJ's [canned stacking model
10 | constructor](https://alan-turing-institute.github.io/MLJ.jl/dev/model_stacking/#Model-Stacking)
11 | `Stack`. Here, however, we give a quick demonstration of how you would
12 | build a stack yourself, using MLJ's generic model composition
13 | syntax, which is an extension of the normal fit/predict syntax.
14 |
15 | For a more leisurely notebook on the same material, see
16 | [this](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/stacking/)
17 | tutorial.
18 |
19 | ````@example wow
20 | DIR = @__DIR__
21 | include(joinpath(DIR, "setup.jl"))
22 | ````
23 |
24 | ## Stacking is hard
25 |
26 | [Model
27 | stacking](https://alan-turing-institute.github.io/DataScienceTutorials.jl/getting-started/stacking/),
28 | popular in Kaggle data science competitions, is a sophisticated way
29 | to blend the predictions of multiple models.
30 |
31 | With the python toolbox
32 | [scikit-learn](https://scikit-learn.org/stable/) (or its [Julia
33 | wrapper](https://github.com/cstjean/ScikitLearn.jl)) you can use
34 | pipelines to build composite models in simple ways, but (automated)
35 | stacking is beyond its capabilities.
36 |
37 | One python alternative is to use
38 | [vecstack](https://github.com/vecxoz/vecstack). The [core
39 | algorithm](https://github.com/vecxoz/vecstack/blob/master/vecstack/core.py)
40 | is about eight pages (without the scikit-learn interface):
41 |
42 | .
43 |
44 | ## Stacking is easy (in MLJ)
45 |
46 | Using MLJ's [generic model composition
47 | API](https://alan-turing-institute.github.io/MLJ.jl/dev/composing_models/)
48 | you can build a stack in about a page.
49 |
50 | Here's the complete code needed to define a new model type that
51 | stacks two base regressors and one adjudicator in MLJ. To keep the code
52 | easy to read, we use only three folds to create the base-learner
53 | [out-of-sample
54 | predictions](https://alan-turing-institute.github.io/DataScienceTutorials.jl/getting-started/stacking/). You can make this generic with little fuss.
55 |
56 | ````@example wow
57 | using MLJ
58 |
59 | folds(data, nfolds) =
60 | partition(1:nrows(data), (1/nfolds for i in 1:(nfolds-1))...);
61 | nothing #hide
62 | ````
63 |
64 | these models are only going to be default choices for the stack:
65 |
66 | ````@example wow
67 | LinearRegressor = @load LinearRegressor pkg=MLJLinearModels
68 | model1 = LinearRegressor()
69 | model2 = LinearRegressor()
70 | judge = LinearRegressor()
71 |
72 | X = source()
73 | y = source()
74 |
75 | folds(X::AbstractNode, nfolds) = node(XX->folds(XX, nfolds), X)
76 | MLJ.restrict(X::AbstractNode, f::AbstractNode, i) =
77 | node((XX, ff) -> restrict(XX, ff, i), X, f);
78 | MLJ.corestrict(X::AbstractNode, f::AbstractNode, i) =
79 | node((XX, ff) -> corestrict(XX, ff, i), X, f);
80 |
81 | f = folds(X, 3)
82 |
83 | m11 = machine(model1, corestrict(X, f, 1), corestrict(y, f, 1))
84 | m12 = machine(model1, corestrict(X, f, 2), corestrict(y, f, 2))
85 | m13 = machine(model1, corestrict(X, f, 3), corestrict(y, f, 3))
86 |
87 | y11 = predict(m11, restrict(X, f, 1));
88 | y12 = predict(m12, restrict(X, f, 2));
89 | y13 = predict(m13, restrict(X, f, 3));
90 |
91 | m21 = machine(model2, corestrict(X, f, 1), corestrict(y, f, 1))
92 | m22 = machine(model2, corestrict(X, f, 2), corestrict(y, f, 2))
93 | m23 = machine(model2, corestrict(X, f, 3), corestrict(y, f, 3))
94 |
95 | y21 = predict(m21, restrict(X, f, 1));
96 | y22 = predict(m22, restrict(X, f, 2));
97 | y23 = predict(m23, restrict(X, f, 3));
98 |
99 | y1_oos = vcat(y11, y12, y13);
100 | y2_oos = vcat(y21, y22, y23);
101 |
102 | X_oos = MLJ.table(hcat(y1_oos, y2_oos))
103 |
104 | m_judge = machine(judge, X_oos, y)
105 |
106 | m1 = machine(model1, X, y)
107 | m2 = machine(model2, X, y)
108 |
109 | y1 = predict(m1, X);
110 | y2 = predict(m2, X);
111 |
112 | X_judge = MLJ.table(hcat(y1, y2))
113 | yhat = predict(m_judge, X_judge)
114 |
115 | @from_network machine(Deterministic(), X, y; predict=yhat) begin
116 | mutable struct MyStack
117 | regressor1=model1
118 | regressor2=model2
119 | judge=judge
120 | end
121 | end
122 |
123 | my_stack = MyStack()
124 | ````
125 |
126 | For the curious: Only the last block defines the new model type. The
127 | rest defines a *[learning network](https://alan-turing-institute.github.io/MLJ.jl/dev/composing_models/)* - a kind of working prototype
128 | or blueprint for the type. If the source nodes `X` and `y` wrap some
129 | data (instead of nothing) then the network can be trained and tested
130 | as you build it.
131 |
132 | ## Composition plays well with other work-flows
133 |
134 | We did not include standardization of inputs and target (with
135 | post-prediction inversion) in our stack. However, we can add these
136 | now, using MLJ's canned pipeline composition:
137 |
138 | ````@example wow
139 | pipe = @pipeline Standardizer my_stack target=Standardizer
140 | ````
141 |
142 | Want to change a base learner and adjudicator?
143 |
144 | ````@example wow
145 | DecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree;
146 | KNNRegressor = @load KNNRegressor;
147 | pipe.my_stack.regressor2 = DecisionTreeRegressor()
148 | pipe.my_stack.judge = KNNRegressor();
149 | nothing #hide
150 | ````
151 |
152 | Want a CV estimate of performance of the complete model on some data?
153 |
154 | ````@example wow
155 | X, y = @load_boston;
156 | mach = machine(pipe, X, y)
157 | evaluate!(mach, resampling=CV(), measure=mae)
158 | ````
159 |
160 | Want to inspect the learned parameters of the adjudicator?
161 |
162 | ````@example wow
163 | fp = fitted_params(mach);
164 | fp.my_stack.judge
165 | ````
166 |
167 | What about the first base-learner of the stack? There are four sets
168 | of learned parameters! One for each fold to make an out-of-sample
169 | prediction, and one trained on all the data:
170 |
171 | ````@example wow
172 | fp.my_stack.regressor1
173 | ````
174 |
175 | ````@example wow
176 | fp.my_stack.regressor1[1].coefs
177 | ````
178 |
179 | Want to tune multiple (nested) hyperparameters in the stack? Tuning is a
180 | model wrapper (for better composition!):
181 |
182 | ````@example wow
183 | r1 = range(pipe, :(my_stack.regressor2.max_depth), lower = 1, upper = 25, scale=:linear)
184 | r2 = range(pipe, :(my_stack.judge.K), lower=1, origin=10, unit=10, scale=:log10)
185 |
186 | import Distributions.Poisson
187 |
188 | tuned_pipe = TunedModel(model=pipe,
189 | ranges=[r1, (r2, Poisson)],
190 | tuning=RandomSearch(),
191 | resampling=CV(),
192 | measure=rms,
193 | n=100)
194 | mach = machine(tuned_pipe, X, y) |> fit!
195 | best_model = fitted_params(mach).best_model
196 | K = fitted_params(mach).best_model.my_stack.judge.K;
197 | max_depth = fitted_params(mach).best_model.my_stack.regressor2.max_depth
198 | @show K max_depth;
199 | nothing #hide
200 | ````
201 |
202 | Visualize tuning results:
203 |
204 | ````@example wow
205 | using Plots
206 | gr(size=(700,700*(sqrt(5) - 1)/2))
207 | plt = plot(mach)
208 | savefig("stacking.png")
209 | ````
210 |
211 | 
212 |
213 | ---
214 |
215 | *This page was generated using [Literate.jl](https://github.com/fredrikekre/Literate.jl).*
216 |
217 |
--------------------------------------------------------------------------------