├── 1004588303.png ├── 1351541081.png ├── 2218687897.png ├── 2587122783.png ├── 2590211320.png ├── 2797619445.png ├── 3923620403.png ├── 4133943752.png ├── 4167443430.png ├── 517000181.png ├── 759701435.png ├── LICENCE.md ├── MLJLogo2.svg ├── Manifest.toml ├── Project.toml ├── README.md ├── Untitled.ipynb ├── apt.txt ├── assets ├── scitypes.drawio └── scitypes.png ├── data ├── horse.csv ├── house.csv ├── small.csv └── src │ ├── Manifest.toml │ ├── Project.toml │ ├── ames.csv │ ├── convert_ames.jl │ ├── convert_ames │ ├── Manifest.toml │ └── Project.toml │ ├── generate_horse.jl │ ├── get_king_county.jl │ └── reduced_ames.csv ├── environment.yml ├── exercise_6ci.png ├── exercise_7c.png ├── exercise_7c_2.png ├── exercise_7c_3.png ├── exercise_8c.png ├── gamma_sampler.png ├── iris_learning_curve.png ├── learning_curve.png ├── learning_curve2.png ├── methods.md ├── outline.md ├── setup.jl ├── stacking.png ├── tuning.png ├── tutorials.ipynb ├── tutorials.jl ├── tutorials.md ├── vecstack.png ├── wow.ipynb ├── wow.jl └── wow.md /1004588303.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/1004588303.png -------------------------------------------------------------------------------- /1351541081.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/1351541081.png -------------------------------------------------------------------------------- /2218687897.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/2218687897.png -------------------------------------------------------------------------------- /2587122783.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/2587122783.png -------------------------------------------------------------------------------- /2590211320.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/2590211320.png -------------------------------------------------------------------------------- /2797619445.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/2797619445.png -------------------------------------------------------------------------------- /3923620403.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/3923620403.png -------------------------------------------------------------------------------- /4133943752.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/4133943752.png -------------------------------------------------------------------------------- /4167443430.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/4167443430.png -------------------------------------------------------------------------------- /517000181.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/517000181.png -------------------------------------------------------------------------------- /759701435.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/759701435.png -------------------------------------------------------------------------------- /LICENCE.md: -------------------------------------------------------------------------------- 1 | The MLJ.jl package is licensed under the MIT "Expat" License: 2 | 3 | > Copyright (c) 2020: Anthony Blaom 4 | 5 | > Permission is hereby granted, free of charge, to any person obtaining a copy 6 | > of this software and associated documentation files (the "Software"), to deal 7 | > in the Software without restriction, including without limitation the rights 8 | > to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | > copies of the Software, and to permit persons to whom the Software is 10 | > furnished to do so, subject to the following conditions: 11 | > 12 | > The above copyright notice and this permission notice shall be included in all 13 | > copies or substantial portions of the Software. 14 | > 15 | > THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | > IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | > FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | > AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | > LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | > OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | > SOFTWARE. 
22 | > 23 | -------------------------------------------------------------------------------- /MLJLogo2.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 | 5 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 | 72 | 73 | 74 | 75 | 76 | 77 | 78 | 79 | 80 | 81 | 82 | 83 | -------------------------------------------------------------------------------- /Project.toml: -------------------------------------------------------------------------------- 1 | [deps] 2 | CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b" 3 | CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597" 4 | ComputationalResources = "ed09eef8-17a6-5b46-8889-db040fac31e3" 5 | DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" 6 | DecisionTree = "7806a523-6efd-50cb-b5f6-3fa6f1930dbb" 7 | Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f" 8 | EvoTrees = "f6006082-12f8-11e9-0c9c-0d5d367ab1e5" 9 | Literate = "98b081ad-f1c9-55d3-8b20-4c87d4299306" 10 | MLJ = "add582a8-e3ab-11e8-2d5e-e98b27df1bc7" 11 | MLJBase = "a7f614a8-145f-11e9-1d2a-a57a1082229d" 12 | MLJClusteringInterface = "d354fa79-ed1c-40d4-88ef-b8c7bd1568af" 13 | MLJDecisionTreeInterface = "c6f25543-311c-4c74-83dc-3ea6d1015661" 14 | MLJFlux = "094fc8d1-fd35-5302-93ea-dabda2abf845" 15 | MLJLinearModels = "6ee0df7b-362f-4a72-a706-9e79364fb692" 16 | MLJModels = "d491faf4-2d78-11e9-2867-c94bc002c0b7" 17 | MLJMultivariateStatsInterface = "1b6a4a23-ba22-4f51-9698-8599985d3728" 18 | MLJScikitLearnInterface = "5ae90465-5518-4432-b9d2-8a1def2f0cab" 19 | NearestNeighborModels = "636a865e-7cf4-491e-846c-de09b730eb36" 20 | NearestNeighbors = "b8a86587-4115-5ab1-83bc-aa920d37bbce" 21 | Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80" 22 | Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c" 23 | ScientificTypes = "321657f4-b219-11e9-178b-2701a2544e81" 24 | StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91" 25 | Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c" 26 | UnicodePlots = "b8865327-cd53-5732-bb35-84acbb429228" 27 | 28 | [compat] 29 | julia = ">=1.6, <1.7" 30 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Machine Learning in Julia using MLJ, JuliaCon2020 2 | 3 | **Now updated for MLJ version 0.16 and Julia 1.6** 4 | 5 | But binder notebook will not work until [this binder issue](https://github.com/jupyterhub/binderhub/issues/1424) is resolved. 6 | 7 | Interactive tutorials for a workshop introducing the machine learning 8 | toolbox [MLJ](https://alan-turing-institute.github.io/MLJ.jl/stable/) (v0.14.4) 9 | 10 |
11 | [MLJ logo image stripped in this dump] 12 |
13 | 14 | These tutorials were prepared for use in a 3 1/2 hour online workshop 15 | at JuliaCon2020, recorded 16 | [here](https://www.youtube.com/watch?time_continue=27&v=qSWbCn170HU&feature=emb_title). Their 17 | main aim is to introduce the 18 | [MLJ](https://alan-turing-institute.github.io/MLJ.jl/stable/) machine 19 | learning toolbox to data scientists. 20 | 21 | Differences from the original resources are minor (main difference: 22 | `@load` now returns a type instead of an instance). However, if you 23 | wish to access resources precisely matching those used in the video, 24 | switch to the `JuliaCon2020` branch by clicking 25 | [here](https://github.com/ablaom/MachineLearningInJulia2020/tree/for-MLJ-version-0.16). 26 | 27 | **Future revisions** of these tutorials will appear [here](https://github.com/ablaom/MLJTutorial.jl). 28 | 29 | 30 | ### [Options for running the tutorials](#options-for-running-the-tutorials) 31 | 32 | ### [Non-interactive version](tutorials.md) 33 | 34 | ### Topics covered 35 | 36 | #### Basic 37 | 38 | - Part 1 - **Data Representation** 39 | 40 | - Part 2 - **Selecting, Training and Evaluating Models** 41 | 42 | - Part 3 - **Transformers and Pipelines** 43 | 44 | #### Advanced 45 | 46 | - Part 4 - **Tuning hyper-parameters** 47 | 48 | - Part 5 - **Advanced model composition** (as time permits) 49 | 50 | The tutorials include links to external resources and exercises with 51 | solutions. 52 | 53 | 54 | ## Options for running the tutorials 55 | 56 | ### 1. Plug-and-play 57 | 58 | Only recommended for users with little Julia experience or users having 59 | problems with the other options. 60 | 61 | Use this option if you have neither run a Julia/Jupyter notebook on your 62 | local machine before, nor used a Julia IDE to run a Julia script. 63 | 64 | 65 | #### Pros 66 | 67 | One 68 | [click](https://mybinder.org/v2/gh/ablaom/MachineLearningInJulia2020/master?filepath=tutorials.ipynb). No 69 | need to install anything on your local machine. 70 | 71 | 72 | #### Cons 73 | 74 | - The (automatic) setup can take a little while, sometimes over 15 75 | minutes (but you do get a static version of the notebook while it 76 | loads). 77 | 78 | - **You will have to start over** if: 79 | 80 | - The notebook drops your connection for some reason. 81 | - You are **inactive for ten minutes**. 82 | 83 | 84 | #### Instructions 85 | 86 | Click this button: [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/ablaom/MachineLearningInJulia2020/master?filepath=tutorials.ipynb) 87 | 88 | 89 | ### 2. Clone the repo and choose your preferred interface 90 | 91 | Assumes that you have a working installation of 92 | [Julia](https://julialang.org/downloads/) 1.3 or higher and that 93 | either: 94 | 95 | - You can run Julia/Jupyter notebooks on your local machine without problems; or 96 | 97 | - You are comfortable running Julia scripts from an IDE, such as [Juno](https://junolab.org) or [Emacs](https://github.com/JuliaEditorSupport/julia-emacs) (see [here](https://julialang.org) for a complete list). 
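If you would like to check your local setup against the bundled package environment before following the instructions below, here is a minimal sketch (illustrative only, not the tutorials' own setup code; the path assumes you cloned the repository into `MachineLearningInJulia2020/`):

```julia
# Activate the project environment shipped with the repository and install
# the dependencies pinned in its Project.toml/Manifest.toml.
using Pkg
Pkg.activate("MachineLearningInJulia2020")  # path to your local clone
Pkg.instantiate()                           # first run may take several minutes
```

Running this once from the Julia REPL means the notebooks and scripts resolve the same package versions the tutorials were prepared with.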
98 | 99 | 100 | #### Pros 101 | 102 | A more stable option. 103 | 104 | #### Cons 105 | 106 | You need to meet the above requirements. 107 | 108 | 109 | #### Instructions 110 | 111 | - Clone [this repository](https://github.com/ablaom/MachineLearningInJulia2020) 112 | 113 | - Change to your local repo directory "MachineLearningInJulia2020/" 114 | 115 | - Either run the Jupyter notebook called "tutorials.ipynb" from that 116 | directory (corresponding to [this file](tutorials.ipynb) on GitHub) 117 | or open "tutorials.jl" from that directory in your favourite IDE 118 | (corresponding to [this file](tutorials.jl) on GitHub). You cannot 119 | download these files individually - you need the whole directory. 120 | 121 | - **Immediately** evaluate the first two lines of code to activate the 122 | package environment and pre-load the packages, as this can take a 123 | few minutes. 124 | 125 | 126 | ## More about the tutorials 127 | 128 | - The tutorials focus on the *machine learning* part of the data 129 | science workflow, and less on exploratory data analysis and other 130 | conventional "data analytics" methodology 131 | 132 | - Here "machine learning" is meant in a broad sense, and is not 133 | restricted to so-called *deep learning* (neural networks) 134 | 135 | - The tutorials are crafted to rapidly familiarize the user with what 136 | MLJ can do and how to do it, and are not a substitute for a course 137 | on machine learning fundamentals. Examples do not necessarily 138 | represent best practice or the best solution to a problem. 139 | 140 | ## Binder notebook for stacking demo used in video 141 | 142 | [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/ablaom/MachineLearningInJulia2020/386ce06766dc1d9d9a0197ec57738b732c1c5d23?filepath=wow.ipynb) 143 | 144 | -------------------------------------------------------------------------------- /Untitled.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 4, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "using DataFrames, CategoricalArrays" 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": 2, 15 | "metadata": {}, 16 | "outputs": [ 17 | { 18 | "data": { 19 | "text/plain": [ 20 | "12-element Array{Float64,1}:\n", 21 | " 1.0\n", 22 | " 2.0\n", 23 | " 3.0\n", 24 | " 4.0\n", 25 | " 5.0\n", 26 | " 6.0\n", 27 | " 7.0\n", 28 | " 8.0\n", 29 | " 9.0\n", 30 | " 10.0\n", 31 | " 11.0\n", 32 | " 12.0" 33 | ] 34 | }, 35 | "execution_count": 2, 36 | "metadata": {}, 37 | "output_type": "execute_result" 38 | } 39 | ], 40 | "source": [ 41 | "time = float.(1:12 )" 42 | ] 43 | }, 44 | { 45 | "cell_type": "code", 46 | "execution_count": 5, 47 | "metadata": {}, 48 | "outputs": [ 49 | { 50 | "data": { 51 | "text/plain": [ 52 | "4-element CategoricalArray{String,1,UInt32}:\n", 53 | " \"kitchen\"\n", 54 | " \"bathroom\"\n", 55 | " \"bedroom_1\"\n", 56 | " \"living_room\"" 57 | ] 58 | }, 59 | "execution_count": 5, 60 | "metadata": {}, 61 | "output_type": "execute_result" 62 | } 63 | ], 64 | "source": [ 65 | "room = categorical([\"kitchen\", \"bathroom\", \"bedroom_1\", \"living_room\"])" 66 | ] 67 | }, 68 | { 69 | "cell_type": "code", 70 | "execution_count": 6, 71 | "metadata": {}, 72 | "outputs": [ 73 | { 74 | "data": { 75 | "text/plain": [ 76 | "12-element CategoricalArray{String,1,UInt32}:\n", 77 | " \"kitchen\"\n", 78 | " \"bathroom\"\n", 79 | " \"bedroom_1\"\n", 80 | " \"living_room\"\n", 81 | " 
\"kitchen\"\n", 82 | " \"bathroom\"\n", 83 | " \"bedroom_1\"\n", 84 | " \"living_room\"\n", 85 | " \"kitchen\"\n", 86 | " \"bathroom\"\n", 87 | " \"bedroom_1\"\n", 88 | " \"living_room\"" 89 | ] 90 | }, 91 | "execution_count": 6, 92 | "metadata": {}, 93 | "output_type": "execute_result" 94 | } 95 | ], 96 | "source": [ 97 | "room = vcat(room, room, room)" 98 | ] 99 | }, 100 | { 101 | "cell_type": "code", 102 | "execution_count": 7, 103 | "metadata": {}, 104 | "outputs": [ 105 | { 106 | "data": { 107 | "text/plain": [ 108 | "1×12 Array{Int64,2}:\n", 109 | " 5 5 5 5 6 6 6 6 7 7 7 7" 110 | ] 111 | }, 112 | "execution_count": 7, 113 | "metadata": {}, 114 | "output_type": "execute_result" 115 | } 116 | ], 117 | "source": [ 118 | "time = [5 5 5 5 6 6 6 6 7 7 7 7]" 119 | ] 120 | }, 121 | { 122 | "cell_type": "code", 123 | "execution_count": 11, 124 | "metadata": {}, 125 | "outputs": [ 126 | { 127 | "data": { 128 | "text/plain": [ 129 | "12-element Array{Int64,1}:\n", 130 | " 5\n", 131 | " 5\n", 132 | " 5\n", 133 | " 5\n", 134 | " 6\n", 135 | " 6\n", 136 | " 6\n", 137 | " 6\n", 138 | " 7\n", 139 | " 7\n", 140 | " 7\n", 141 | " 7" 142 | ] 143 | }, 144 | "execution_count": 11, 145 | "metadata": {}, 146 | "output_type": "execute_result" 147 | } 148 | ], 149 | "source": [ 150 | "time =reshape(time, (12,))" 151 | ] 152 | }, 153 | { 154 | "cell_type": "code", 155 | "execution_count": 12, 156 | "metadata": {}, 157 | "outputs": [ 158 | { 159 | "data": { 160 | "text/html": [ 161 | "

[stripped HTML table rendering of the 12×2 DataFrame (columns: time, room); the same table appears in the text/plain output below]
" 162 | ], 163 | "text/latex": [ 164 | "\\begin{tabular}{r|cc}\n", 165 | "\t& time & room\\\\\n", 166 | "\t\\hline\n", 167 | "\t& Int64 & Cat…\\\\\n", 168 | "\t\\hline\n", 169 | "\t1 & 5 & kitchen \\\\\n", 170 | "\t2 & 5 & bathroom \\\\\n", 171 | "\t3 & 5 & bedroom\\_1 \\\\\n", 172 | "\t4 & 5 & living\\_room \\\\\n", 173 | "\t5 & 6 & kitchen \\\\\n", 174 | "\t6 & 6 & bathroom \\\\\n", 175 | "\t7 & 6 & bedroom\\_1 \\\\\n", 176 | "\t8 & 6 & living\\_room \\\\\n", 177 | "\t9 & 7 & kitchen \\\\\n", 178 | "\t10 & 7 & bathroom \\\\\n", 179 | "\t11 & 7 & bedroom\\_1 \\\\\n", 180 | "\t12 & 7 & living\\_room \\\\\n", 181 | "\\end{tabular}\n" 182 | ], 183 | "text/plain": [ 184 | "12×2 DataFrame\n", 185 | "│ Row │ time │ room │\n", 186 | "│ │ \u001b[90mInt64\u001b[39m │ \u001b[90mCat…\u001b[39m │\n", 187 | "├─────┼───────┼─────────────┤\n", 188 | "│ 1 │ 5 │ kitchen │\n", 189 | "│ 2 │ 5 │ bathroom │\n", 190 | "│ 3 │ 5 │ bedroom_1 │\n", 191 | "│ 4 │ 5 │ living_room │\n", 192 | "│ 5 │ 6 │ kitchen │\n", 193 | "│ 6 │ 6 │ bathroom │\n", 194 | "│ 7 │ 6 │ bedroom_1 │\n", 195 | "│ 8 │ 6 │ living_room │\n", 196 | "│ 9 │ 7 │ kitchen │\n", 197 | "│ 10 │ 7 │ bathroom │\n", 198 | "│ 11 │ 7 │ bedroom_1 │\n", 199 | "│ 12 │ 7 │ living_room │" 200 | ] 201 | }, 202 | "execution_count": 12, 203 | "metadata": {}, 204 | "output_type": "execute_result" 205 | } 206 | ], 207 | "source": [ 208 | "X = DataFrame(time=time, room=room)" 209 | ] 210 | }, 211 | { 212 | "cell_type": "code", 213 | "execution_count": 13, 214 | "metadata": {}, 215 | "outputs": [ 216 | { 217 | "name": "stderr", 218 | "output_type": "stream", 219 | "text": [ 220 | "┌ Info: Precompiling MLJ [add582a8-e3ab-11e8-2d5e-e98b27df1bc7]\n", 221 | "└ @ Base loading.jl:1260\n", 222 | "[ Info: Model metadata loaded from registry. 
\n" 223 | ] 224 | } 225 | ], 226 | "source": [ 227 | "using MLJ" 228 | ] 229 | }, 230 | { 231 | "cell_type": "code", 232 | "execution_count": 31, 233 | "metadata": { 234 | "scrolled": true 235 | }, 236 | "outputs": [ 237 | { 238 | "name": "stdout", 239 | "output_type": "stream", 240 | "text": [ 241 | "\n", 242 | "\n", 243 | "┌\u001b[0m───────\u001b[0m┬\u001b[0m─────────────────────────────────\u001b[0m┐\u001b[0m\n", 244 | "│\u001b[0m\u001b[1m time \u001b[0m│\u001b[0m\u001b[1m room \u001b[0m│\u001b[0m\n", 245 | "│\u001b[0m\u001b[90m Int64 \u001b[0m│\u001b[0m\u001b[90m CategoricalValue{String,UInt32} \u001b[0m│\u001b[0m\n", 246 | "├\u001b[0m───────\u001b[0m┼\u001b[0m─────────────────────────────────\u001b[0m┤\u001b[0m\n", 247 | "│\u001b[0m 5 \u001b[0m│\u001b[0m kitchen \u001b[0m│\u001b[0m\n", 248 | "│\u001b[0m 5 \u001b[0m│\u001b[0m bathroom \u001b[0m│\u001b[0m\n", 249 | "│\u001b[0m 5 \u001b[0m│\u001b[0m bedroom_1 \u001b[0m│\u001b[0m\n", 250 | "│\u001b[0m 5 \u001b[0m│\u001b[0m living_room \u001b[0m│\u001b[0m\n", 251 | "│\u001b[0m 6 \u001b[0m│\u001b[0m kitchen \u001b[0m│\u001b[0m\n", 252 | "│\u001b[0m 6 \u001b[0m│\u001b[0m bathroom \u001b[0m│\u001b[0m\n", 253 | "│\u001b[0m 6 \u001b[0m│\u001b[0m bedroom_1 \u001b[0m│\u001b[0m\n", 254 | "│\u001b[0m 6 \u001b[0m│\u001b[0m living_room \u001b[0m│\u001b[0m\n", 255 | "│\u001b[0m 7 \u001b[0m│\u001b[0m kitchen \u001b[0m│\u001b[0m\n", 256 | "│\u001b[0m 7 \u001b[0m│\u001b[0m bathroom \u001b[0m│\u001b[0m\n", 257 | "│\u001b[0m 7 \u001b[0m│\u001b[0m bedroom_1 \u001b[0m│\u001b[0m\n", 258 | "│\u001b[0m 7 \u001b[0m│\u001b[0m living_room \u001b[0m│\u001b[0m\n", 259 | "└\u001b[0m───────\u001b[0m┴\u001b[0m─────────────────────────────────\u001b[0m┘\u001b[0m\n" 260 | ] 261 | } 262 | ], 263 | "source": [ 264 | "println()\n", 265 | "println()\n", 266 | "MLJ.MLJBase.PrettyTables.pretty_table(X)" 267 | ] 268 | }, 269 | { 270 | "cell_type": "code", 271 | "execution_count": 15, 272 | "metadata": {}, 273 | "outputs": [ 274 | { 275 | "ename": "UndefVarError", 276 | "evalue": "UndefVarError: y not defined", 277 | "output_type": "error", 278 | "traceback": [ 279 | "UndefVarError: y not defined", 280 | "", 281 | "Stacktrace:", 282 | " [1] top-level scope at In[15]:1" 283 | ] 284 | } 285 | ], 286 | "source": [ 287 | "pretty(y)" 288 | ] 289 | }, 290 | { 291 | "cell_type": "code", 292 | "execution_count": 16, 293 | "metadata": {}, 294 | "outputs": [ 295 | { 296 | "data": { 297 | "text/plain": [ 298 | "12-element Array{Float64,1}:\n", 299 | " 18.490955359526012\n", 300 | " 18.304060288673128\n", 301 | " 18.25954037947709\n", 302 | " 17.419481829632957\n", 303 | " 16.589235329028348\n", 304 | " 20.66317138311018\n", 305 | " 18.945861996750985\n", 306 | " 20.158722013970333\n", 307 | " 20.361567584624957\n", 308 | " 19.85771377870428\n", 309 | " 16.180836445944205\n", 310 | " 17.330000922162835" 311 | ] 312 | }, 313 | "execution_count": 16, 314 | "metadata": {}, 315 | "output_type": "execute_result" 316 | } 317 | ], 318 | "source": [ 319 | "temp = 16 .+ 5*rand(12)" 320 | ] 321 | }, 322 | { 323 | "cell_type": "code", 324 | "execution_count": 19, 325 | "metadata": {}, 326 | "outputs": [ 327 | { 328 | "data": { 329 | "text/plain": [ 330 | "12-element Array{Float64,1}:\n", 331 | " 18.5\n", 332 | " 18.3\n", 333 | " 18.3\n", 334 | " 17.4\n", 335 | " 16.6\n", 336 | " 20.7\n", 337 | " 18.9\n", 338 | " 20.2\n", 339 | " 20.4\n", 340 | " 19.9\n", 341 | " 16.2\n", 342 | " 17.3" 343 | ] 344 | }, 345 | "execution_count": 19, 346 | "metadata": {}, 347 | "output_type": "execute_result" 348 | } 349 
| ], 350 | "source": [ 351 | "temperature = map(temp) do x round(x, sigdigits=3) end" 352 | ] 353 | }, 354 | { 355 | "cell_type": "code", 356 | "execution_count": 20, 357 | "metadata": {}, 358 | "outputs": [ 359 | { 360 | "ename": "UndefVarError", 361 | "evalue": "UndefVarError: y not defined", 362 | "output_type": "error", 363 | "traceback": [ 364 | "UndefVarError: y not defined", 365 | "", 366 | "Stacktrace:", 367 | " [1] top-level scope at In[20]:1" 368 | ] 369 | } 370 | ], 371 | "source": [ 372 | "y = DataFrame(y)" 373 | ] 374 | }, 375 | { 376 | "cell_type": "code", 377 | "execution_count": 23, 378 | "metadata": {}, 379 | "outputs": [ 380 | { 381 | "data": { 382 | "text/html": [ 383 | "

[stripped HTML table rendering of the 12×1 DataFrame (column: temperature); the same table appears in the text/plain output below]
" 384 | ], 385 | "text/latex": [ 386 | "\\begin{tabular}{r|c}\n", 387 | "\t& temperature\\\\\n", 388 | "\t\\hline\n", 389 | "\t& Float64\\\\\n", 390 | "\t\\hline\n", 391 | "\t1 & 18.5 \\\\\n", 392 | "\t2 & 18.3 \\\\\n", 393 | "\t3 & 18.3 \\\\\n", 394 | "\t4 & 17.4 \\\\\n", 395 | "\t5 & 16.6 \\\\\n", 396 | "\t6 & 20.7 \\\\\n", 397 | "\t7 & 18.9 \\\\\n", 398 | "\t8 & 20.2 \\\\\n", 399 | "\t9 & 20.4 \\\\\n", 400 | "\t10 & 19.9 \\\\\n", 401 | "\t11 & 16.2 \\\\\n", 402 | "\t12 & 17.3 \\\\\n", 403 | "\\end{tabular}\n" 404 | ], 405 | "text/plain": [ 406 | "12×1 DataFrame\n", 407 | "│ Row │ temperature │\n", 408 | "│ │ \u001b[90mFloat64\u001b[39m │\n", 409 | "├─────┼─────────────┤\n", 410 | "│ 1 │ 18.5 │\n", 411 | "│ 2 │ 18.3 │\n", 412 | "│ 3 │ 18.3 │\n", 413 | "│ 4 │ 17.4 │\n", 414 | "│ 5 │ 16.6 │\n", 415 | "│ 6 │ 20.7 │\n", 416 | "│ 7 │ 18.9 │\n", 417 | "│ 8 │ 20.2 │\n", 418 | "│ 9 │ 20.4 │\n", 419 | "│ 10 │ 19.9 │\n", 420 | "│ 11 │ 16.2 │\n", 421 | "│ 12 │ 17.3 │" 422 | ] 423 | }, 424 | "execution_count": 23, 425 | "metadata": {}, 426 | "output_type": "execute_result" 427 | } 428 | ], 429 | "source": [ 430 | "y=DataFrame(temperature=temperature)" 431 | ] 432 | }, 433 | { 434 | "cell_type": "code", 435 | "execution_count": 32, 436 | "metadata": {}, 437 | "outputs": [ 438 | { 439 | "name": "stdout", 440 | "output_type": "stream", 441 | "text": [ 442 | "\n", 443 | "\n", 444 | "┌\u001b[0m─────────────\u001b[0m┐\u001b[0m\n", 445 | "│\u001b[0m\u001b[1m temperature \u001b[0m│\u001b[0m\n", 446 | "│\u001b[0m\u001b[90m Float64 \u001b[0m│\u001b[0m\n", 447 | "├\u001b[0m─────────────\u001b[0m┤\u001b[0m\n", 448 | "│\u001b[0m 18.5 \u001b[0m│\u001b[0m\n", 449 | "│\u001b[0m 18.3 \u001b[0m│\u001b[0m\n", 450 | "│\u001b[0m 18.3 \u001b[0m│\u001b[0m\n", 451 | "│\u001b[0m 17.4 \u001b[0m│\u001b[0m\n", 452 | "│\u001b[0m 16.6 \u001b[0m│\u001b[0m\n", 453 | "│\u001b[0m 20.7 \u001b[0m│\u001b[0m\n", 454 | "│\u001b[0m 18.9 \u001b[0m│\u001b[0m\n", 455 | "│\u001b[0m 20.2 \u001b[0m│\u001b[0m\n", 456 | "│\u001b[0m 20.4 \u001b[0m│\u001b[0m\n", 457 | "│\u001b[0m 19.9 \u001b[0m│\u001b[0m\n", 458 | "│\u001b[0m 16.2 \u001b[0m│\u001b[0m\n", 459 | "│\u001b[0m 17.3 \u001b[0m│\u001b[0m\n", 460 | "└\u001b[0m─────────────\u001b[0m┘\u001b[0m\n" 461 | ] 462 | } 463 | ], 464 | "source": [ 465 | "println()\n", 466 | "println()\n", 467 | "MLJ.MLJBase.PrettyTables.pretty_table(y)" 468 | ] 469 | }, 470 | { 471 | "cell_type": "code", 472 | "execution_count": 25, 473 | "metadata": {}, 474 | "outputs": [ 475 | { 476 | "data": { 477 | "text/html": [ 478 | "

[stripped HTML table rendering of the 12×1 DataFrame (column: temperature); the same table appears in the text/plain output below]
" 479 | ], 480 | "text/latex": [ 481 | "\\begin{tabular}{r|c}\n", 482 | "\t& temperature\\\\\n", 483 | "\t\\hline\n", 484 | "\t& Float64\\\\\n", 485 | "\t\\hline\n", 486 | "\t1 & 18.5 \\\\\n", 487 | "\t2 & 18.3 \\\\\n", 488 | "\t3 & 18.3 \\\\\n", 489 | "\t4 & 17.4 \\\\\n", 490 | "\t5 & 16.6 \\\\\n", 491 | "\t6 & 20.7 \\\\\n", 492 | "\t7 & 18.9 \\\\\n", 493 | "\t8 & 20.2 \\\\\n", 494 | "\t9 & 20.4 \\\\\n", 495 | "\t10 & 19.9 \\\\\n", 496 | "\t11 & 16.2 \\\\\n", 497 | "\t12 & 17.3 \\\\\n", 498 | "\\end{tabular}\n" 499 | ], 500 | "text/plain": [ 501 | "12×1 DataFrame\n", 502 | "│ Row │ temperature │\n", 503 | "│ │ \u001b[90mFloat64\u001b[39m │\n", 504 | "├─────┼─────────────┤\n", 505 | "│ 1 │ 18.5 │\n", 506 | "│ 2 │ 18.3 │\n", 507 | "│ 3 │ 18.3 │\n", 508 | "│ 4 │ 17.4 │\n", 509 | "│ 5 │ 16.6 │\n", 510 | "│ 6 │ 20.7 │\n", 511 | "│ 7 │ 18.9 │\n", 512 | "│ 8 │ 20.2 │\n", 513 | "│ 9 │ 20.4 │\n", 514 | "│ 10 │ 19.9 │\n", 515 | "│ 11 │ 16.2 │\n", 516 | "│ 12 │ 17.3 │" 517 | ] 518 | }, 519 | "execution_count": 25, 520 | "metadata": {}, 521 | "output_type": "execute_result" 522 | } 523 | ], 524 | "source": [ 525 | "y" 526 | ] 527 | }, 528 | { 529 | "cell_type": "code", 530 | "execution_count": null, 531 | "metadata": {}, 532 | "outputs": [], 533 | "source": [] 534 | } 535 | ], 536 | "metadata": { 537 | "kernelspec": { 538 | "display_name": "Julia 1.4.2", 539 | "language": "julia", 540 | "name": "julia-1.4" 541 | }, 542 | "language_info": { 543 | "file_extension": ".jl", 544 | "mimetype": "application/julia", 545 | "name": "julia", 546 | "version": "1.4.2" 547 | } 548 | }, 549 | "nbformat": 4, 550 | "nbformat_minor": 4 551 | } 552 | -------------------------------------------------------------------------------- /apt.txt: -------------------------------------------------------------------------------- 1 | tzdata -------------------------------------------------------------------------------- /assets/scitypes.drawio: -------------------------------------------------------------------------------- 1 | 7ZnbbptAEIafhstU5pxcJiROKiWt2lRK07sVjGHbZddZBhvn6btrlmCKGhPJdq2UKzP/zJ5mPjNgW26UV9eSzLM7kQCznElSWe6l5Ti2PQnUh1ZWtXLqTmohlTQxQa1wT5/BiE1YSRMoOoEoBEM674qx4Bxi7GhESrHshs0E6646Jyn0hPuYsL76QBPMzCmcsNVvgKZZs7IdnNWenDTB5iRFRhKx3JDcK8uNpBBYX+VVBEwnr8lLPW76F+/LxiRwHDJg+cUn0Vn2Y/bwNcrOr5/pE41OzGYLXDUHhkSd35hCYiZSwQm7atWLuJQL0JPaypCi5MnamiirHXArxNyE/ATElaksKVEoKcOcGS9UFL/r4R9C35iPG67Lyky9NlaNwVGu6lF+Yz5u+tpha6sZV8xJTHl6CzPsKsrSE80ExynJKdMDboAtAGlMlKOfbJP/QpQyhlcy3EBLZAr4Spz5nuj0byxgSnkNIgd1EBUggRGkiy6exFCevsS1IKgLw8IbuLC9IwLD+a/B8I8LjHreBWGlWclyAoYmRR1igqdSNI6TYl3mcxVgB/OqdaqrVH9OKacIVnjxyQovmynVDutZ65gekggVdqEpUIpfEAkmpFK44JrLGWXsD4kwmnJlxqpsoPSLBUhdUXZuHDlNkjXUy0zt617BoNdcqibXY3v37OjNQPVqtY3X90x3Me21aTbLtlcFRso22lQTtns87F6Rxoay2/uGM/C+4R3XfcM5IjDeZ0MZCkZ4VGA4e2ooH/msbiljLxnYS8Jgay/xDtlL3D2RcUeLQmd0BGMgGM5pF4zTfwyGtycwIpVxHLEYisXLTzlHgoW/Jyw+ywQkJFMSoyrj+IbyNko8bysltntITIJ9tZWSqawxUhQjI299iw39rYzYh2Qk3FuH4Uh5KcpihGMoHMH2p4/DwmH3G803VaWSsPdWPe0wr9N2sJtqugd7llRm++fK2rfxF5V79Rs= -------------------------------------------------------------------------------- /assets/scitypes.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/assets/scitypes.png -------------------------------------------------------------------------------- /data/horse.csv: -------------------------------------------------------------------------------- 1 | 
surgery,age,rectal_temperature,pulse,respiratory_rate,temperature_extremities,mucous_membranes,capillary_refill_time,pain,peristalsis,abdominal_distension,packed_cell_volume,total_protein,outcome,surgical_lesion,cp_data 2 | 2,1,38.5,66,66,3,1,2,5,4,4,45.0,8.4,2,2,2 3 | 1,1,39.2,88,88,3,4,1,3,4,2,50.0,85.0,3,2,2 4 | 2,1,38.3,40,40,1,3,1,3,3,1,33.0,6.7,1,2,1 5 | 1,9,39.1,164,164,4,6,2,2,4,4,48.0,7.2,2,1,1 6 | 2,1,37.3,104,104,3,6,2,3,3,1,74.0,7.4,2,2,2 7 | 2,1,38.1,60,60,2,3,1,2,3,2,44.0,7.5,1,2,2 8 | 1,1,37.9,48,48,1,1,1,3,3,3,37.0,7.0,1,1,2 9 | 1,1,38.1,60,60,3,1,1,3,4,2,44.0,8.3,2,1,2 10 | 2,1,38.1,80,80,3,3,1,4,4,4,38.0,6.2,3,1,2 11 | 2,9,38.3,90,90,1,1,1,5,3,1,40.0,6.2,1,2,1 12 | 1,1,38.1,66,66,3,5,1,3,3,1,44.0,6.0,1,1,1 13 | 2,1,39.1,72,72,2,2,1,2,1,2,50.0,7.8,1,1,2 14 | 1,1,37.2,42,42,2,1,1,3,3,3,44.0,7.0,1,2,2 15 | 2,9,38.0,92,92,1,2,1,1,3,2,37.0,6.1,2,2,1 16 | 1,1,38.2,76,76,3,1,1,3,4,1,46.0,81.0,1,1,2 17 | 1,1,37.6,96,96,3,4,1,5,3,3,45.0,6.8,2,1,2 18 | 1,9,38.1,128,128,3,4,2,4,4,3,53.0,7.8,2,2,1 19 | 2,1,37.5,48,48,3,1,1,3,3,1,44.0,7.5,1,2,2 20 | 1,1,37.6,64,64,1,2,1,2,3,1,40.0,7.0,1,1,1 21 | 2,1,39.4,110,110,4,6,1,3,3,3,55.0,8.7,1,2,2 22 | 1,1,39.9,72,72,1,5,2,5,4,4,46.0,6.1,1,1,2 23 | 2,1,38.4,48,48,1,1,1,1,3,1,49.0,6.8,1,2,2 24 | 1,1,38.6,42,42,2,4,1,2,3,1,48.0,7.2,1,1,2 25 | 1,9,38.3,130,130,3,1,1,2,4,1,50.0,70.0,1,1,2 26 | 1,1,38.1,60,60,3,3,1,3,4,3,51.0,65.0,1,1,2 27 | 2,1,37.8,60,60,3,1,1,3,3,1,44.0,7.5,1,2,2 28 | 1,1,38.3,72,72,4,3,2,3,3,3,43.0,7.0,1,1,1 29 | 1,1,37.8,48,48,3,1,1,3,3,2,37.0,5.5,1,2,1 30 | 1,1,38.1,60,60,3,1,1,3,3,1,44.0,7.5,2,2,2 31 | 2,1,37.7,48,48,2,1,1,1,1,1,45.0,76.0,1,2,2 32 | 2,1,37.7,96,96,3,4,2,5,4,4,66.0,7.5,2,1,2 33 | 2,1,37.2,108,108,3,4,2,2,4,2,52.0,8.2,3,1,1 34 | 1,1,37.2,60,60,2,1,1,3,3,3,43.0,6.6,1,1,2 35 | 1,1,38.2,64,64,1,1,1,3,1,1,49.0,8.6,1,1,1 36 | 1,1,38.1,100,100,3,4,2,5,4,4,52.0,6.6,1,1,2 37 | 2,1,38.1,104,104,4,3,2,4,4,3,73.0,8.4,3,1,2 38 | 2,1,38.3,112,112,3,5,2,3,3,1,51.0,6.0,3,2,1 39 | 1,1,37.8,72,72,3,1,1,5,3,1,56.0,80.0,1,1,2 40 | 2,1,38.6,52,52,1,1,1,3,3,2,32.0,6.6,1,2,1 41 | 1,9,39.2,146,146,3,1,1,3,3,1,44.0,7.5,2,1,2 42 | 1,1,38.1,88,88,3,6,2,5,3,3,63.0,6.5,2,1,2 43 | 2,9,39.0,150,150,3,1,1,3,3,1,47.0,8.5,1,1,1 44 | 2,1,38.0,60,60,3,3,1,3,3,1,47.0,7.0,1,2,2 45 | 1,1,38.1,120,120,3,4,1,4,4,4,52.0,67.0,3,1,2 46 | 1,1,35.4,140,140,3,4,2,4,4,1,57.0,69.0,3,1,2 47 | 2,1,38.1,120,120,4,4,2,5,4,4,60.0,6.5,2,1,2 48 | 1,1,37.9,60,60,3,4,2,5,4,4,65.0,7.5,1,1,1 49 | 2,1,37.5,48,48,1,1,1,1,1,1,37.0,6.5,1,2,2 50 | 1,1,38.9,80,80,3,3,2,2,3,3,54.0,6.5,2,1,2 51 | 2,1,37.2,84,84,3,5,2,4,1,2,73.0,5.5,2,2,1 52 | 2,1,38.6,46,46,1,2,1,1,3,2,49.0,9.1,1,2,1 53 | 1,1,37.4,84,84,1,3,2,3,3,2,44.0,7.5,2,1,1 54 | 2,1,38.1,60,60,1,3,1,1,3,1,43.0,7.7,1,2,2 55 | 2,1,38.6,40,40,3,1,1,3,3,1,41.0,6.4,1,2,1 56 | 2,1,40.3,114,114,3,1,2,2,3,3,57.0,8.1,3,1,1 57 | 1,9,38.6,160,160,3,5,1,3,3,4,38.0,7.5,2,1,1 58 | 1,1,38.1,60,60,3,1,1,3,3,1,24.0,6.7,1,1,2 59 | 1,1,38.1,64,64,2,2,1,5,3,3,42.0,7.7,2,1,2 60 | 1,1,38.1,60,60,4,3,1,5,4,3,53.0,5.9,2,1,1 61 | 2,1,38.1,96,96,3,3,2,5,4,4,60.0,7.5,2,1,2 62 | 2,1,37.8,48,48,1,3,1,2,1,1,37.0,6.7,1,2,2 63 | 2,1,38.5,60,60,2,1,1,1,2,2,44.0,7.7,1,2,2 64 | 1,1,37.8,88,88,2,2,1,3,3,1,64.0,8.0,2,1,1 65 | 2,1,38.2,130,130,4,4,2,2,4,4,65.0,82.0,3,2,2 66 | 1,1,39.0,64,64,3,4,2,3,3,2,44.0,7.5,1,1,1 67 | 1,1,38.1,60,60,3,3,1,3,3,2,26.0,72.0,1,1,2 68 | 2,1,37.9,72,72,1,5,2,3,3,1,58.0,74.0,1,1,2 69 | 2,1,38.4,54,54,1,1,1,1,3,1,49.0,7.2,1,2,1 70 | 2,1,38.1,52,52,1,3,1,3,3,1,55.0,7.2,1,2,2 71 | 2,1,38.0,48,48,1,1,1,1,3,1,42.0,6.3,1,2,1 72 | 
2,1,37.0,60,60,3,1,1,3,3,3,43.0,7.6,3,1,1 73 | 1,1,37.8,48,48,1,1,1,1,2,1,46.0,5.9,1,2,1 74 | 1,1,37.7,56,56,3,1,1,3,3,1,44.0,7.5,2,1,2 75 | 1,1,38.1,52,52,1,5,1,4,3,1,54.0,7.5,2,1,1 76 | 1,9,38.1,60,60,3,1,1,3,3,1,37.0,4.9,2,1,2 77 | 1,9,39.7,100,100,3,5,2,2,3,1,48.0,57.0,3,1,2 78 | 1,1,37.6,38,38,3,1,1,3,3,2,37.0,68.0,1,1,2 79 | 2,1,38.7,52,52,2,1,1,1,1,1,33.0,77.0,1,2,2 80 | 1,1,38.1,60,60,3,3,3,5,3,3,46.0,5.9,2,1,2 81 | 1,1,37.5,96,96,1,6,2,3,4,2,69.0,8.9,1,1,1 82 | 1,1,36.4,98,98,3,4,1,4,3,2,47.0,6.4,2,1,1 83 | 1,1,37.3,40,40,3,1,1,2,3,2,36.0,7.5,1,1,2 84 | 1,9,38.1,100,100,3,2,1,3,4,1,36.0,5.7,1,1,2 85 | 1,1,38.0,60,60,3,6,2,5,3,4,68.0,7.8,2,1,2 86 | 1,1,37.8,60,60,1,2,2,2,3,3,40.0,4.5,1,1,1 87 | 2,1,38.0,54,54,2,3,3,3,1,2,45.0,6.2,1,2,2 88 | 1,1,38.1,88,88,3,4,2,5,4,3,50.0,7.7,2,1,1 89 | 2,1,38.1,40,40,3,1,1,3,3,1,50.0,7.0,3,1,1 90 | 2,1,39.0,64,64,1,5,1,3,3,2,42.0,7.5,1,2,1 91 | 2,1,38.3,42,42,1,1,1,1,1,1,38.0,61.0,1,2,2 92 | 2,1,38.0,52,52,3,1,1,2,3,1,53.0,86.0,1,1,2 93 | 2,1,40.3,114,114,3,1,2,2,3,3,57.0,8.1,2,1,1 94 | 2,1,38.8,50,50,3,1,1,1,1,1,42.0,6.2,1,2,2 95 | 2,1,38.1,60,60,3,1,1,5,3,3,38.0,6.5,2,1,2 96 | 2,1,37.5,48,48,4,3,1,3,2,1,48.0,8.6,1,2,2 97 | 1,1,37.3,48,48,3,2,1,3,3,3,41.0,69.0,1,1,2 98 | 2,1,38.1,84,84,3,3,1,3,3,1,44.0,8.5,1,1,2 99 | 1,1,38.1,88,88,3,4,1,2,3,3,55.0,60.0,3,2,2 100 | 2,1,37.7,44,44,2,3,1,1,3,2,41.0,60.0,1,2,2 101 | 2,1,39.6,108,108,3,6,2,2,4,3,59.0,8.0,1,2,1 102 | 1,1,38.2,40,40,3,1,1,1,3,1,34.0,66.0,1,2,2 103 | 1,1,38.1,60,60,4,4,2,5,4,1,44.0,7.5,3,1,2 104 | 2,1,38.3,40,40,3,1,1,2,3,1,37.0,57.0,1,2,2 105 | 1,9,38.0,140,140,1,1,1,3,3,2,39.0,5.3,1,1,2 106 | 1,1,37.8,52,52,1,3,1,4,4,1,48.0,6.6,2,1,2 107 | 1,1,38.1,70,70,1,3,2,2,3,2,36.0,7.3,1,1,2 108 | 1,1,38.3,52,52,3,3,1,3,3,1,43.0,6.1,1,1,1 109 | 2,1,37.3,50,50,1,3,1,1,3,2,44.0,7.0,1,2,2 110 | 1,1,38.7,60,60,4,2,2,4,4,4,53.0,64.0,3,1,2 111 | 1,9,38.4,84,84,3,2,1,3,3,3,36.0,6.6,2,1,1 112 | 1,1,38.1,70,70,3,5,2,2,3,2,60.0,7.5,2,1,2 113 | 1,1,38.3,40,40,3,1,1,1,3,2,38.0,58.0,1,1,2 114 | 1,1,38.1,40,40,2,1,1,1,3,1,39.0,56.0,1,1,2 115 | 1,1,36.8,60,60,3,1,1,3,3,1,44.0,7.5,2,1,1 116 | 1,1,38.4,44,44,3,4,1,5,4,3,50.0,77.0,1,1,2 117 | 2,1,38.1,60,60,3,1,1,3,3,2,45.0,70.0,1,2,2 118 | 1,1,38.0,44,44,1,1,1,3,3,3,42.0,65.0,1,1,2 119 | 2,1,39.5,60,60,3,4,2,3,4,3,44.0,6.7,3,1,2 120 | 1,1,36.5,78,78,1,1,1,5,3,1,34.0,75.0,1,1,2 121 | 2,1,38.1,56,56,2,2,1,1,3,1,46.0,70.0,1,2,2 122 | 1,1,39.4,54,54,1,2,1,2,3,2,39.0,6.0,1,1,1 123 | 1,1,38.3,80,80,3,6,2,4,3,1,67.0,10.2,3,1,2 124 | 2,1,38.7,40,40,2,1,1,3,1,1,39.0,62.0,1,2,2 125 | 1,1,38.2,64,64,1,3,1,4,4,3,45.0,7.5,2,1,1 126 | 2,1,37.6,48,48,3,4,1,1,1,3,37.0,5.5,3,1,2 127 | 1,1,38.0,42,42,4,1,1,3,3,2,41.0,7.6,1,1,2 128 | 1,1,38.7,60,60,3,3,1,5,4,2,33.0,6.5,1,1,2 129 | 1,1,37.4,50,50,3,1,1,4,4,1,45.0,7.9,1,1,1 130 | 1,1,37.4,84,84,3,3,1,2,3,3,31.0,61.0,3,2,2 131 | 1,1,38.4,49,49,3,1,1,3,3,1,44.0,7.6,1,1,2 132 | 1,1,37.8,30,30,3,1,1,3,3,1,44.0,7.5,2,1,2 133 | 2,1,37.6,88,88,3,1,1,3,3,2,44.0,6.0,2,1,2 134 | 2,1,37.9,40,40,1,1,1,2,3,1,40.0,5.7,1,1,2 135 | 1,1,38.1,100,100,3,4,2,5,4,1,59.0,6.3,2,1,2 136 | 1,9,38.1,136,136,3,3,1,5,1,3,33.0,4.9,2,1,1 137 | 1,1,38.1,60,60,3,3,2,5,3,3,46.0,5.9,2,1,2 138 | 1,1,38.0,48,48,1,1,1,1,2,4,44.0,7.5,1,1,2 139 | 2,1,38.0,56,56,1,3,1,1,1,1,42.0,71.0,1,2,2 140 | 2,1,38.0,60,60,1,1,1,3,3,1,50.0,7.0,1,2,2 141 | 1,1,38.1,44,44,3,1,1,2,2,1,31.0,7.3,1,2,2 142 | 2,1,36.0,42,42,3,5,1,3,3,1,64.0,6.8,2,2,2 143 | 1,1,38.1,120,120,4,6,2,5,4,4,57.0,4.5,2,1,1 144 | 1,1,37.8,48,48,1,1,2,1,2,1,46.0,5.9,1,2,1 145 | 
1,1,37.1,84,84,3,6,1,2,4,4,75.0,81.0,3,2,2 146 | 2,1,38.1,80,80,3,2,1,2,3,3,50.0,80.0,1,1,2 147 | 1,1,38.2,48,48,1,3,1,3,4,4,42.0,71.0,1,1,2 148 | 2,1,38.0,44,44,2,3,1,3,4,3,33.0,6.5,2,1,2 149 | 1,1,38.3,132,132,3,6,2,2,4,2,57.0,8.0,1,1,1 150 | 2,1,38.7,48,48,3,1,1,1,1,1,34.0,63.0,1,2,2 151 | 2,1,38.9,44,44,3,1,1,2,3,2,33.0,64.0,1,2,2 152 | 1,1,39.3,60,60,4,6,2,4,4,2,75.0,7.5,2,1,1 153 | 1,1,38.1,100,100,3,4,2,3,4,4,68.0,64.0,1,1,2 154 | 2,1,38.6,48,48,3,1,1,1,3,2,50.0,7.3,1,2,1 155 | 2,1,38.8,48,48,1,3,1,3,3,4,41.0,65.0,1,1,2 156 | 2,1,38.0,48,48,3,4,1,1,4,2,49.0,8.3,1,2,1 157 | 2,1,38.6,52,52,1,1,1,3,3,2,36.0,6.6,1,2,1 158 | 1,1,37.8,60,60,1,3,2,3,4,4,52.0,75.0,3,1,2 159 | 2,1,38.0,42,42,3,1,1,3,3,1,44.0,7.5,1,2,2 160 | 2,1,38.1,60,60,1,2,1,2,1,2,44.0,7.5,1,2,1 161 | 1,1,38.1,60,60,3,1,1,4,3,1,35.0,58.0,1,1,2 162 | 1,1,38.3,42,42,3,1,1,3,3,1,40.0,8.5,2,1,2 163 | 2,1,39.5,60,60,3,1,2,3,3,2,38.0,56.0,1,2,2 164 | 1,1,38.0,66,66,1,3,1,5,3,1,46.0,46.0,3,1,2 165 | 1,1,38.7,76,76,1,5,2,3,3,2,50.0,8.0,1,1,1 166 | 1,1,39.4,120,120,3,5,1,3,3,3,56.0,64.0,3,2,2 167 | 1,1,38.3,40,40,1,1,1,3,1,1,43.0,5.9,1,2,1 168 | 2,1,38.1,44,44,1,1,1,3,3,1,44.0,6.3,1,2,2 169 | 1,1,38.4,104,104,1,3,1,2,4,2,55.0,8.5,1,1,2 170 | 1,1,38.1,65,65,3,1,2,5,3,4,44.0,7.5,3,1,2 171 | 2,1,37.5,44,44,1,3,1,3,1,1,35.0,7.2,1,2,2 172 | 2,1,39.0,86,86,3,5,1,3,3,3,68.0,5.8,2,1,1 173 | 1,1,38.5,129,129,3,3,1,2,4,3,57.0,66.0,1,1,2 174 | 1,1,38.1,104,104,3,5,2,2,4,3,69.0,8.6,2,1,1 175 | 2,1,38.1,60,60,3,6,1,4,3,4,44.0,7.5,2,1,1 176 | 1,1,38.1,60,60,3,1,1,3,3,1,44.0,7.5,1,1,2 177 | 1,1,38.2,60,60,1,3,1,3,3,1,48.0,66.0,1,1,2 178 | 1,1,38.1,68,68,3,4,1,4,3,1,44.0,7.5,2,1,1 179 | 1,1,38.1,60,60,3,4,2,5,4,4,45.0,70.0,1,1,2 180 | 2,1,38.5,100,100,3,5,2,4,3,4,44.0,7.5,3,2,1 181 | 1,1,38.4,84,84,3,5,2,4,3,3,47.0,7.5,2,1,2 182 | 2,1,37.8,48,48,3,1,1,3,3,2,35.0,7.5,1,2,1 183 | 1,1,38.0,60,60,3,6,2,5,3,4,68.0,7.8,2,1,2 184 | 2,1,37.8,56,56,1,2,1,2,1,1,44.0,68.0,1,2,2 185 | 2,1,38.2,68,68,2,2,1,1,1,1,43.0,65.0,1,2,2 186 | 1,1,38.5,120,120,4,6,2,3,3,1,54.0,7.5,1,1,2 187 | 1,1,39.3,64,64,2,1,1,3,3,1,39.0,6.7,1,1,2 188 | 1,1,38.4,80,80,4,1,1,3,3,3,32.0,6.1,1,1,1 189 | 1,1,38.5,60,60,1,1,1,3,1,1,33.0,53.0,1,1,2 190 | 1,1,38.3,60,60,3,1,1,2,1,1,30.0,6.0,1,1,2 191 | 1,1,37.1,40,40,3,4,1,3,3,1,23.0,6.7,1,1,1 192 | 2,9,38.1,100,100,2,1,1,4,1,1,37.0,4.7,1,2,2 193 | 1,1,38.2,48,48,1,1,1,3,3,3,48.0,74.0,1,1,2 194 | 1,1,38.1,60,60,3,4,2,4,3,4,58.0,7.6,2,1,2 195 | 2,1,37.9,88,88,1,2,1,2,2,1,37.0,56.0,1,2,2 196 | 2,1,38.0,44,44,3,1,1,3,1,2,42.0,64.0,1,2,2 197 | 2,1,38.5,60,60,1,5,2,2,2,1,63.0,7.5,3,2,1 198 | 2,1,38.5,96,96,3,1,2,2,4,2,70.0,8.5,2,1,1 199 | 2,1,38.3,60,60,1,1,2,1,3,1,34.0,66.0,1,2,2 200 | 2,1,38.5,60,60,3,2,1,2,1,2,49.0,59.0,1,2,2 201 | 1,1,37.3,48,48,1,3,1,3,1,3,40.0,6.6,1,1,1 202 | 1,1,38.5,86,86,1,3,1,4,4,3,45.0,7.4,2,1,1 203 | 1,1,37.5,48,48,3,1,1,3,3,1,41.0,55.0,3,1,2 204 | 2,1,37.2,36,36,1,1,1,2,3,1,35.0,5.7,1,2,2 205 | 1,1,39.2,60,60,3,3,1,4,4,2,36.0,6.6,1,1,1 206 | 2,1,38.5,100,100,3,5,2,4,3,4,44.0,7.5,3,2,2 207 | 1,1,38.5,96,96,2,4,2,4,4,3,50.0,65.0,1,1,2 208 | 1,1,38.1,60,60,3,1,1,3,3,1,45.0,8.7,2,1,2 209 | 1,1,37.8,88,88,3,5,2,3,3,3,64.0,89.0,3,1,2 210 | 2,1,37.5,44,44,3,1,1,3,1,2,43.0,51.0,1,2,2 211 | 1,1,37.9,68,68,3,2,1,2,4,2,45.0,4.0,2,1,1 212 | 1,1,38.0,86,86,4,4,1,2,4,4,45.0,5.5,2,1,1 213 | 1,9,38.9,120,120,1,2,2,3,3,3,47.0,6.3,1,2,2 214 | 1,1,37.6,45,45,3,3,1,3,2,2,39.0,7.0,1,1,1 215 | 2,1,38.6,56,56,2,1,1,1,1,1,40.0,7.0,1,2,1 216 | 1,1,37.8,40,40,1,1,1,1,2,1,38.0,7.0,1,1,2 217 | 2,1,38.1,60,60,3,1,1,3,3,1,44.0,7.5,1,2,2 218 | 
1,1,38.0,76,76,3,1,2,3,3,1,71.0,11.0,1,1,1 219 | 1,1,38.1,40,40,1,2,1,2,2,1,44.0,7.5,3,1,2 220 | 1,1,38.1,52,52,3,4,1,3,4,3,37.0,8.1,1,1,2 221 | 1,1,39.2,88,88,4,1,2,5,4,1,44.0,7.5,3,2,2 222 | 1,1,38.5,92,92,4,1,1,2,4,3,46.0,67.0,1,1,2 223 | 1,1,38.1,112,112,4,4,1,2,3,1,60.0,6.3,1,1,1 224 | 1,1,37.7,66,66,1,3,1,3,3,2,31.5,6.2,1,1,1 225 | 1,1,38.8,50,50,1,1,1,3,1,1,38.0,58.0,1,1,2 226 | 2,1,38.4,54,54,1,1,1,1,3,1,49.0,7.2,1,2,1 227 | 1,1,39.2,120,120,4,5,2,2,3,3,60.0,8.8,2,1,2 228 | 1,9,38.1,60,60,3,1,1,3,3,1,45.0,6.5,1,1,1 229 | 1,1,37.3,90,90,3,6,2,5,4,3,65.0,50.0,3,1,2 230 | 1,9,38.5,120,120,3,1,1,3,1,1,35.0,54.0,1,1,2 231 | 1,1,38.5,104,104,3,1,1,4,3,4,44.0,7.5,1,1,2 232 | 2,1,39.5,92,92,3,6,1,5,4,1,72.0,6.4,2,2,2 233 | 1,1,38.5,30,30,3,1,1,3,3,1,40.0,7.7,1,1,2 234 | 1,1,38.3,72,72,4,3,2,3,3,3,43.0,7.0,1,1,1 235 | 2,1,37.5,48,48,4,3,1,3,2,1,48.0,8.6,1,2,2 236 | 1,1,38.1,52,52,1,5,1,4,3,1,54.0,7.5,2,1,1 237 | 2,1,38.2,42,42,1,1,1,3,1,2,36.0,6.9,1,2,2 238 | 2,1,37.9,54,54,2,5,1,3,1,1,47.0,54.0,1,2,2 239 | 2,1,36.1,88,88,3,3,1,3,3,2,45.0,7.0,3,1,1 240 | 1,1,38.1,70,70,3,1,1,5,3,1,36.0,65.0,3,1,2 241 | 1,1,38.0,90,90,4,4,2,5,4,4,55.0,6.1,2,1,2 242 | 1,1,38.2,52,52,1,2,1,1,2,1,43.0,8.1,1,2,1 243 | 1,1,38.1,36,36,1,4,1,5,3,3,41.0,5.9,2,1,2 244 | 1,1,38.4,92,92,1,1,2,3,3,3,44.0,7.5,1,1,1 245 | 1,9,38.2,124,124,1,2,1,2,3,4,47.0,8.0,1,1,1 246 | 2,1,38.1,96,96,3,3,2,5,4,4,60.0,7.5,2,1,2 247 | 1,1,37.6,68,68,3,3,1,4,2,4,47.0,7.2,1,1,2 248 | 1,1,38.1,88,88,3,4,1,5,4,3,41.0,4.6,2,1,2 249 | 1,1,38.0,108,108,2,4,1,4,3,3,44.0,7.5,1,1,2 250 | 2,1,38.2,48,48,2,1,2,3,3,1,34.0,6.6,1,2,2 251 | 1,1,39.3,100,100,4,6,1,2,4,1,66.0,13.0,3,1,2 252 | 2,1,36.6,42,42,3,2,1,1,4,1,52.0,7.1,2,1,2 253 | 1,9,38.8,124,124,3,2,1,2,3,4,50.0,7.6,2,1,1 254 | 2,1,38.1,112,112,3,4,2,5,4,2,40.0,5.3,1,2,1 255 | 1,1,38.1,80,80,3,3,1,4,4,4,43.0,70.0,1,1,2 256 | 1,9,38.8,184,184,1,1,1,4,1,3,33.0,3.3,2,1,2 257 | 1,1,37.5,72,72,2,1,1,2,1,1,35.0,65.0,3,1,2 258 | 1,1,38.7,96,96,3,4,1,3,4,1,64.0,9.0,2,1,1 259 | 2,1,37.5,52,52,1,1,1,2,3,2,36.0,61.0,1,2,2 260 | 1,1,40.8,72,72,3,1,1,2,3,1,54.0,7.4,2,1,1 261 | 2,1,38.0,40,40,3,1,1,4,3,2,37.0,69.0,1,2,2 262 | 2,1,38.4,48,48,2,1,1,1,3,2,39.0,6.5,1,2,1 263 | 2,9,38.6,88,88,3,1,1,3,3,1,35.0,5.9,1,2,2 264 | 1,1,37.1,75,75,3,3,2,4,4,2,48.0,7.4,2,1,1 265 | 1,1,38.3,44,44,3,2,1,3,3,3,44.0,6.5,1,1,1 266 | 2,1,38.1,56,56,3,1,1,3,3,1,40.0,6.0,3,1,2 267 | 2,1,38.6,68,68,2,3,1,3,3,2,38.0,6.5,1,2,1 268 | 2,1,38.3,54,54,3,2,1,2,3,2,44.0,7.2,1,2,1 269 | 1,1,38.2,42,42,3,1,1,3,3,1,47.0,60.0,1,2,2 270 | 1,1,39.3,64,64,2,1,1,3,3,1,39.0,6.7,1,1,2 271 | 1,1,37.5,60,60,3,1,1,3,3,2,35.0,6.5,2,1,2 272 | 1,1,37.7,80,80,3,6,1,5,4,1,50.0,55.0,1,1,2 273 | 1,1,38.1,100,100,3,4,2,5,4,4,52.0,6.6,1,1,2 274 | 1,1,37.7,120,120,3,3,1,5,3,3,65.0,7.0,2,1,1 275 | 1,1,38.1,76,76,3,1,1,3,4,4,44.0,7.5,3,1,2 276 | 1,9,38.8,150,150,1,6,2,5,3,2,50.0,6.2,2,1,2 277 | 1,1,38.0,36,36,3,1,1,4,2,2,37.0,75.0,3,2,2 278 | 2,1,36.9,50,50,2,3,1,1,3,2,37.5,6.5,1,2,2 279 | 2,1,37.8,40,40,1,1,1,1,1,1,37.0,6.8,1,2,2 280 | 2,1,38.2,56,56,4,1,1,2,4,3,47.0,7.2,1,2,1 281 | 1,1,38.6,48,48,3,1,1,1,1,1,36.0,67.0,1,2,2 282 | 2,1,40.0,78,78,3,5,1,2,3,1,66.0,6.5,2,1,1 283 | 1,1,38.1,70,70,3,5,2,2,3,2,60.0,7.5,2,1,2 284 | 1,1,38.2,72,72,3,1,1,3,3,1,35.0,6.4,1,1,2 285 | 2,1,38.5,54,54,1,1,1,3,1,1,40.0,6.8,1,2,1 286 | 1,1,38.5,66,66,1,1,1,3,3,1,40.0,6.7,1,1,1 287 | 2,1,37.8,82,82,3,1,2,4,3,3,50.0,7.0,3,1,2 288 | 2,9,39.5,84,84,3,1,1,3,3,1,28.0,5.0,1,2,2 289 | 1,1,38.1,60,60,3,1,1,3,3,1,44.0,7.5,1,1,2 290 | 1,1,38.0,50,50,3,1,1,3,2,2,39.0,6.6,1,1,1 291 | 
2,1,38.6,45,45,2,2,1,1,1,1,43.0,58.0,1,2,2 292 | 1,1,38.9,80,80,3,3,1,2,3,3,54.0,6.5,2,1,2 293 | 1,1,37.0,66,66,1,2,1,4,3,3,35.0,6.9,2,1,2 294 | 1,1,38.1,78,78,3,3,1,3,3,1,43.0,62.0,3,2,2 295 | 2,1,38.5,40,40,1,1,1,2,1,1,37.0,67.0,1,2,2 296 | 1,1,38.1,120,120,4,4,2,2,4,1,55.0,65.0,3,2,2 297 | 2,1,37.2,72,72,3,4,2,4,3,3,44.0,7.5,3,1,1 298 | 1,1,37.5,72,72,4,4,1,4,4,3,60.0,6.8,2,1,2 299 | 1,1,36.5,100,100,3,3,1,3,3,3,50.0,6.0,1,1,1 300 | 1,1,37.2,40,40,3,1,1,3,3,1,36.0,62.0,3,2,2 301 | 2,1,38.5,54,54,3,2,2,3,4,1,42.0,6.3,1,2,1 302 | 2,1,37.6,48,48,3,1,1,3,3,1,44.0,6.3,1,2,1 303 | 1,1,37.7,44,44,3,3,2,5,4,4,45.0,70.0,1,1,2 304 | 1,1,37.0,56,56,3,4,2,4,4,3,35.0,61.0,3,2,2 305 | 2,1,38.0,42,42,3,3,1,1,3,1,37.0,5.8,1,2,2 306 | 1,1,38.1,60,60,3,1,1,3,4,1,42.0,72.0,1,1,2 307 | 2,1,38.4,80,80,3,2,1,3,2,1,54.0,6.9,1,2,2 308 | 2,1,37.8,48,48,2,2,1,3,3,1,48.0,7.3,1,2,1 309 | 2,1,37.9,45,45,3,3,2,2,3,1,33.0,5.7,1,1,1 310 | 2,1,39.0,84,84,3,5,1,2,4,2,62.0,5.9,2,1,1 311 | 2,1,38.2,60,60,3,3,2,3,3,2,53.0,7.5,1,2,1 312 | 1,1,38.1,140,140,3,4,2,5,4,4,30.0,69.0,2,2,2 313 | 1,1,37.9,120,120,3,3,1,5,4,4,52.0,6.6,2,1,1 314 | 2,1,38.0,72,72,1,3,1,3,3,2,38.0,6.8,1,2,1 315 | 2,9,38.0,92,92,1,2,1,1,3,2,37.0,6.1,1,2,1 316 | 1,1,38.3,66,66,2,1,1,2,4,3,37.0,6.0,1,1,2 317 | 2,1,37.5,48,48,3,1,1,2,1,1,43.0,6.0,1,2,1 318 | 1,1,37.5,88,88,2,3,1,4,3,3,35.0,6.4,2,1,2 319 | 2,9,38.1,150,150,4,4,2,5,4,4,44.0,7.5,2,1,2 320 | 1,1,39.7,100,100,3,6,2,4,4,3,65.0,75.0,3,1,2 321 | 1,1,38.3,80,80,3,4,2,5,4,3,45.0,7.5,1,1,1 322 | 2,1,37.5,40,40,3,3,1,3,2,3,32.0,6.4,1,1,1 323 | 1,1,38.4,84,84,3,5,2,4,3,3,47.0,7.5,2,1,2 324 | 1,1,38.1,84,84,4,4,2,5,3,1,60.0,6.8,2,1,1 325 | 2,1,38.7,52,52,1,1,1,1,3,1,4.0,74.0,1,2,2 326 | 2,1,38.1,44,44,2,3,1,3,3,1,35.0,6.8,1,2,2 327 | 2,1,38.4,52,52,2,3,1,1,3,2,41.0,63.0,1,2,2 328 | 1,1,38.2,60,60,1,3,1,2,1,1,43.0,6.2,1,1,1 329 | 2,1,37.7,40,40,1,1,1,3,2,1,36.0,3.5,1,2,2 330 | 1,1,39.1,60,60,3,1,1,2,3,1,44.0,7.5,1,1,2 331 | 2,1,37.8,48,48,1,1,1,3,1,1,43.0,7.5,1,2,2 332 | 1,1,39.0,120,120,4,5,2,2,4,3,65.0,8.2,1,2,2 333 | 1,1,38.2,76,76,2,2,1,5,3,3,35.0,6.5,1,1,1 334 | 2,1,38.3,88,88,3,6,1,3,3,1,44.0,7.5,2,2,2 335 | 1,1,38.0,80,80,3,3,1,3,3,1,48.0,8.3,1,1,2 336 | 1,1,38.1,60,60,3,1,1,2,3,3,44.0,7.5,2,1,1 337 | 1,1,37.6,40,40,1,1,1,1,1,1,44.0,7.5,1,1,1 338 | 2,1,37.5,44,44,1,1,1,3,3,2,45.0,5.8,1,2,1 339 | 2,1,38.2,42,42,1,3,1,1,3,1,35.0,60.0,1,2,2 340 | 2,1,38.0,56,56,3,3,1,3,1,1,47.0,70.0,1,2,2 341 | 2,1,38.3,45,45,3,2,2,2,4,1,44.0,7.5,1,2,2 342 | 1,1,38.1,48,48,1,3,1,3,4,1,42.0,8.0,1,1,2 343 | 1,1,37.7,55,55,2,2,1,2,3,3,44.0,7.5,1,1,2 344 | 2,1,36.0,100,100,4,6,2,2,4,3,74.0,5.7,3,1,1 345 | 1,1,37.1,60,60,2,4,1,3,3,3,64.0,8.5,1,1,1 346 | 2,1,37.1,114,114,3,3,2,2,2,1,32.0,7.5,1,2,2 347 | 1,1,38.1,72,72,3,3,1,4,4,3,37.0,56.0,1,1,2 348 | 1,1,37.0,44,44,3,1,2,1,1,1,40.0,6.7,1,1,2 349 | 1,1,38.6,48,48,3,1,1,4,3,1,37.0,75.0,1,1,2 350 | 1,1,38.1,82,82,3,4,1,2,3,3,53.0,65.0,3,1,2 351 | 1,9,38.2,78,78,4,6,1,3,3,3,59.0,5.8,2,1,1 352 | 2,1,37.8,60,60,1,3,1,2,3,2,41.0,73.0,3,2,2 353 | 1,1,38.7,34,34,2,3,1,2,3,1,33.0,69.0,3,1,2 354 | 1,1,38.1,36,36,1,1,1,1,2,1,44.0,7.5,1,1,1 355 | 2,1,38.3,44,44,3,1,1,3,3,1,6.4,36.0,1,1,2 356 | 2,1,37.4,54,54,3,1,1,3,4,3,30.0,7.1,1,1,1 357 | 1,1,38.1,60,60,4,1,2,2,4,1,54.0,76.0,1,1,2 358 | 1,1,36.6,48,48,3,3,1,4,1,1,27.0,56.0,3,1,2 359 | 1,1,38.5,90,90,1,3,1,3,3,3,47.0,79.0,1,1,2 360 | 1,1,38.1,75,75,1,4,1,5,3,3,58.0,8.5,1,1,1 361 | 2,1,38.2,42,42,3,1,1,1,1,2,35.0,5.9,1,2,2 362 | 1,9,38.2,78,78,4,6,1,3,3,3,59.0,5.8,2,1,1 363 | 2,1,38.6,60,60,1,3,1,4,2,2,40.0,6.0,1,1,2 364 | 
2,1,37.8,42,42,1,1,1,1,3,1,36.0,6.2,1,2,2 365 | 1,1,38.0,60,60,1,2,1,2,1,1,44.0,65.0,3,1,2 366 | 2,1,38.0,42,42,3,3,1,1,1,1,37.0,5.8,1,2,2 367 | 2,1,37.6,88,88,3,1,1,3,3,2,44.0,6.0,2,1,2 368 | -------------------------------------------------------------------------------- /data/small.csv: -------------------------------------------------------------------------------- 1 | h,e,t 2 | 185.0,rotten,2.3 3 | 153.0,great,4.5 4 | 163.0,bla,4.2 5 | 114.0,great,1.8 6 | 180.0,bla,7.1 7 | -------------------------------------------------------------------------------- /data/src/Manifest.toml: -------------------------------------------------------------------------------- 1 | # This file is machine-generated - editing it directly is not advised 2 | 3 | [[AbstractFFTs]] 4 | deps = ["LinearAlgebra"] 5 | git-tree-sha1 = "051c95d6836228d120f5f4b984dd5aba1624f716" 6 | uuid = "621f4979-c628-5d54-868e-fcf4e3e8185c" 7 | version = "0.5.0" 8 | 9 | [[Adapt]] 10 | deps = ["LinearAlgebra"] 11 | git-tree-sha1 = "0fac443759fa829ed8066db6cf1077d888bb6573" 12 | uuid = "79e6a3ab-5dfb-504d-930d-738a2a938a0e" 13 | version = "2.0.2" 14 | 15 | [[Arpack]] 16 | deps = ["Arpack_jll", "Libdl", "LinearAlgebra"] 17 | git-tree-sha1 = "2ff92b71ba1747c5fdd541f8fc87736d82f40ec9" 18 | uuid = "7d9fca2a-8960-54d3-9f78-7d1dccf2cb97" 19 | version = "0.4.0" 20 | 21 | [[Arpack_jll]] 22 | deps = ["Libdl", "OpenBLAS_jll", "Pkg"] 23 | git-tree-sha1 = "e214a9b9bd1b4e1b4f15b22c0994862b66af7ff7" 24 | uuid = "68821587-b530-5797-8361-c406ea357684" 25 | version = "3.5.0+3" 26 | 27 | [[ArrayInterface]] 28 | deps = ["LinearAlgebra", "Requires", "SparseArrays"] 29 | git-tree-sha1 = "0eccdcbe27fd6bd9cba3be31c67bdd435a21e865" 30 | uuid = "4fba245c-0d91-5ea0-9b3e-6abc04ee57a9" 31 | version = "2.9.1" 32 | 33 | [[AxisAlgorithms]] 34 | deps = ["LinearAlgebra", "Random", "SparseArrays", "WoodburyMatrices"] 35 | git-tree-sha1 = "a4d07a1c313392a77042855df46c5f534076fab9" 36 | uuid = "13072b0f-2c55-5437-9ae7-d433b7a33950" 37 | version = "1.0.0" 38 | 39 | [[BSON]] 40 | git-tree-sha1 = "dd36d7cf3d185eeaaf64db902c15174b22f5dafb" 41 | uuid = "fbb218c0-5317-5bc6-957e-2ee96dd4b1f0" 42 | version = "0.2.6" 43 | 44 | [[Base64]] 45 | uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f" 46 | 47 | [[Bzip2_jll]] 48 | deps = ["Libdl", "Pkg"] 49 | git-tree-sha1 = "3663bfffede2ef41358b6fc2e1d8a6d50b3c3904" 50 | uuid = "6e34b625-4abd-537c-b88f-471c36dfa7a0" 51 | version = "1.0.6+2" 52 | 53 | [[CSV]] 54 | deps = ["CategoricalArrays", "DataFrames", "Dates", "FilePathsBase", "Mmap", "Parsers", "PooledArrays", "Tables", "Unicode", "WeakRefStrings"] 55 | git-tree-sha1 = "52a8e60c7822f53d57e4403b7f2811e7e1bdd32b" 56 | uuid = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b" 57 | version = "0.6.2" 58 | 59 | [[CategoricalArrays]] 60 | deps = ["DataAPI", "Future", "JSON", "Missings", "Printf", "Statistics", "Unicode"] 61 | git-tree-sha1 = "a6c17353ee38ddab30e73dcfaa1107752de724ec" 62 | uuid = "324d7699-5711-5eae-9e2f-1d82baa6b597" 63 | version = "0.8.1" 64 | 65 | [[Clustering]] 66 | deps = ["Distances", "LinearAlgebra", "NearestNeighbors", "Printf", "SparseArrays", "Statistics", "StatsBase"] 67 | git-tree-sha1 = "b11c8d607af357776a046889a7c32567d05f1319" 68 | uuid = "aaaa29a8-35af-508c-8bc3-b662a17a0fe5" 69 | version = "0.14.1" 70 | 71 | [[CodecZlib]] 72 | deps = ["TranscodingStreams", "Zlib_jll"] 73 | git-tree-sha1 = "ded953804d019afa9a3f98981d99b33e3db7b6da" 74 | uuid = "944b1d66-785c-5afd-91f1-9de20f533193" 75 | version = "0.7.0" 76 | 77 | [[ColorSchemes]] 78 | deps = ["ColorTypes", "Colors", 
"FixedPointNumbers", "Random", "StaticArrays"] 79 | git-tree-sha1 = "7a15e3690529fd1042f0ab954dff7445b1efc8a5" 80 | uuid = "35d6a980-a343-548e-a6ea-1d62b119f2f4" 81 | version = "3.9.0" 82 | 83 | [[ColorTypes]] 84 | deps = ["FixedPointNumbers", "Random"] 85 | git-tree-sha1 = "6e7aa35d0294f647bb9c985ccc34d4f5d371a533" 86 | uuid = "3da002f7-5984-5a60-b8a6-cbb66c0b333f" 87 | version = "0.10.6" 88 | 89 | [[Colors]] 90 | deps = ["ColorTypes", "FixedPointNumbers", "InteractiveUtils", "Reexport"] 91 | git-tree-sha1 = "5639e44833cfcf78c6a73fbceb4da75611d312cd" 92 | uuid = "5ae59095-9a9b-59fe-a467-6f913c188581" 93 | version = "0.12.3" 94 | 95 | [[CommonSubexpressions]] 96 | deps = ["MacroTools", "Test"] 97 | git-tree-sha1 = "7b8a93dba8af7e3b42fecabf646260105ac373f7" 98 | uuid = "bbf7d656-a473-5ed7-a52c-81e309532950" 99 | version = "0.3.0" 100 | 101 | [[Compat]] 102 | deps = ["Base64", "Dates", "DelimitedFiles", "Distributed", "InteractiveUtils", "LibGit2", "Libdl", "LinearAlgebra", "Markdown", "Mmap", "Pkg", "Printf", "REPL", "Random", "SHA", "Serialization", "SharedArrays", "Sockets", "SparseArrays", "Statistics", "Test", "UUIDs", "Unicode"] 103 | git-tree-sha1 = "a6a8197ae253f2c1a22b2ae17c2dfaf5812c03aa" 104 | uuid = "34da2185-b29b-5c13-b0c7-acf172513d20" 105 | version = "3.13.0" 106 | 107 | [[CompilerSupportLibraries_jll]] 108 | deps = ["Libdl", "Pkg"] 109 | git-tree-sha1 = "7c4f882c41faa72118841185afc58a2eb00ef612" 110 | uuid = "e66e0078-7015-5450-92f7-15fbd957f2ae" 111 | version = "0.3.3+0" 112 | 113 | [[ComputationalResources]] 114 | git-tree-sha1 = "52cb3ec90e8a8bea0e62e275ba577ad0f74821f7" 115 | uuid = "ed09eef8-17a6-5b46-8889-db040fac31e3" 116 | version = "0.3.2" 117 | 118 | [[Conda]] 119 | deps = ["JSON", "VersionParsing"] 120 | git-tree-sha1 = "7a58bb32ce5d85f8bf7559aa7c2842f9aecf52fc" 121 | uuid = "8f4d0f93-b110-5947-807f-2305c1781a2d" 122 | version = "1.4.1" 123 | 124 | [[Contour]] 125 | deps = ["StaticArrays"] 126 | git-tree-sha1 = "81685fee51fc5168898e3cbd8b0f01506cd9148e" 127 | uuid = "d38c429a-6771-53c6-b99e-75d170b6e991" 128 | version = "0.5.4" 129 | 130 | [[Crayons]] 131 | git-tree-sha1 = "c437a9c2114c7ba19322712e58942b383ffbd6c0" 132 | uuid = "a8cc5b0e-0ffa-5ad4-8c14-923d3ee1735f" 133 | version = "4.0.3" 134 | 135 | [[DataAPI]] 136 | git-tree-sha1 = "176e23402d80e7743fc26c19c681bfb11246af32" 137 | uuid = "9a962f9c-6df0-11e9-0e5d-c546b8b5ee8a" 138 | version = "1.3.0" 139 | 140 | [[DataFrames]] 141 | deps = ["CategoricalArrays", "Compat", "DataAPI", "Future", "InvertedIndices", "IteratorInterfaceExtensions", "Missings", "PooledArrays", "Printf", "REPL", "Reexport", "SortingAlgorithms", "Statistics", "TableTraits", "Tables", "Unicode"] 142 | git-tree-sha1 = "d4436b646615928b634b37e99a3288588072f851" 143 | uuid = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" 144 | version = "0.21.4" 145 | 146 | [[DataStructures]] 147 | deps = ["InteractiveUtils", "OrderedCollections"] 148 | git-tree-sha1 = "edad9434967fdc0a2631a65d902228400642120c" 149 | uuid = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8" 150 | version = "0.17.19" 151 | 152 | [[DataValueInterfaces]] 153 | git-tree-sha1 = "bfc1187b79289637fa0ef6d4436ebdfe6905cbd6" 154 | uuid = "e2d170a0-9d28-54be-80f0-106bbe20a464" 155 | version = "1.0.0" 156 | 157 | [[DataValues]] 158 | deps = ["DataValueInterfaces", "Dates"] 159 | git-tree-sha1 = "d88a19299eba280a6d062e135a43f00323ae70bf" 160 | uuid = "e7dc6d0d-1eca-5fa6-8ad6-5aecde8b7ea5" 161 | version = "0.4.13" 162 | 163 | [[Dates]] 164 | deps = ["Printf"] 165 | uuid = "ade2ca70-3891-5945-98fb-dc099432e06a" 
166 | 167 | [[DecisionTree]] 168 | deps = ["DelimitedFiles", "Distributed", "LinearAlgebra", "Random", "ScikitLearnBase", "Statistics", "Test"] 169 | git-tree-sha1 = "9faa81d6e611cf00d16d4dabbd60a325ada72a83" 170 | uuid = "7806a523-6efd-50cb-b5f6-3fa6f1930dbb" 171 | version = "0.10.7" 172 | 173 | [[DelimitedFiles]] 174 | deps = ["Mmap"] 175 | uuid = "8bb1440f-4735-579b-a4ab-409b98df4dab" 176 | 177 | [[DiffResults]] 178 | deps = ["StaticArrays"] 179 | git-tree-sha1 = "da24935df8e0c6cf28de340b958f6aac88eaa0cc" 180 | uuid = "163ba53b-c6d8-5494-b064-1a9d43ac40c5" 181 | version = "1.0.2" 182 | 183 | [[DiffRules]] 184 | deps = ["NaNMath", "Random", "SpecialFunctions"] 185 | git-tree-sha1 = "eb0c34204c8410888844ada5359ac8b96292cfd1" 186 | uuid = "b552c78f-8df3-52c6-915a-8e097449b14b" 187 | version = "1.0.1" 188 | 189 | [[Distances]] 190 | deps = ["LinearAlgebra", "Statistics"] 191 | git-tree-sha1 = "23717536c81b63e250f682b0e0933769eecd1411" 192 | uuid = "b4f34e82-e78d-54a5-968a-f98e89d6e8f7" 193 | version = "0.8.2" 194 | 195 | [[Distributed]] 196 | deps = ["Random", "Serialization", "Sockets"] 197 | uuid = "8ba89e20-285c-5b6f-9357-94700520ee1b" 198 | 199 | [[Distributions]] 200 | deps = ["FillArrays", "LinearAlgebra", "PDMats", "Printf", "QuadGK", "Random", "SpecialFunctions", "Statistics", "StatsBase", "StatsFuns"] 201 | git-tree-sha1 = "78c4c32a2357a00a0a7d614880f02c2c6e1ec73c" 202 | uuid = "31c24e10-a181-5473-b8eb-7969acd0382f" 203 | version = "0.23.4" 204 | 205 | [[DocStringExtensions]] 206 | deps = ["LibGit2", "Markdown", "Pkg", "Test"] 207 | git-tree-sha1 = "c5714d9bcdba66389612dc4c47ed827c64112997" 208 | uuid = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae" 209 | version = "0.8.2" 210 | 211 | [[Documenter]] 212 | deps = ["Base64", "Dates", "DocStringExtensions", "InteractiveUtils", "JSON", "LibGit2", "Logging", "Markdown", "REPL", "Test", "Unicode"] 213 | git-tree-sha1 = "395fa1554c69735802bba37d9e7d9586fd44326c" 214 | uuid = "e30172f5-a6a5-5a46-863b-614d45cd2de4" 215 | version = "0.24.11" 216 | 217 | [[EvoTrees]] 218 | deps = ["CategoricalArrays", "Distributions", "MLJModelInterface", "Random", "StaticArrays", "Statistics", "StatsBase"] 219 | git-tree-sha1 = "2608d6cd10db187b7ef96c2197f809c04a1ac735" 220 | uuid = "f6006082-12f8-11e9-0c9c-0d5d367ab1e5" 221 | version = "0.4.9" 222 | 223 | [[ExprTools]] 224 | git-tree-sha1 = "6f0517056812fd6aa3af23d4b70d5325a2ae4e95" 225 | uuid = "e2ba6199-217a-4e67-a87a-7c52f15ade04" 226 | version = "0.1.1" 227 | 228 | [[EzXML]] 229 | deps = ["Printf", "XML2_jll"] 230 | git-tree-sha1 = "0fa3b52a04a4e210aeb1626def9c90df3ae65268" 231 | uuid = "8f5d6c58-4d21-5cfd-889c-e3ad7ee6a615" 232 | version = "1.1.0" 233 | 234 | [[FFMPEG]] 235 | deps = ["FFMPEG_jll"] 236 | git-tree-sha1 = "c82bef6fc01e30d500f588cd01d29bdd44f1924e" 237 | uuid = "c87230d0-a227-11e9-1b43-d7ebe4e7570a" 238 | version = "0.3.0" 239 | 240 | [[FFMPEG_jll]] 241 | deps = ["Bzip2_jll", "FreeType2_jll", "FriBidi_jll", "LAME_jll", "LibVPX_jll", "Libdl", "Ogg_jll", "OpenSSL_jll", "Opus_jll", "Pkg", "Zlib_jll", "libass_jll", "libfdk_aac_jll", "libvorbis_jll", "x264_jll", "x265_jll"] 242 | git-tree-sha1 = "0fa07f43e5609ea54848b82b4bb330b250e9645b" 243 | uuid = "b22a6f82-2f65-5046-a5b2-351ab43fb4e5" 244 | version = "4.1.0+3" 245 | 246 | [[FFTW]] 247 | deps = ["AbstractFFTs", "FFTW_jll", "IntelOpenMP_jll", "Libdl", "LinearAlgebra", "MKL_jll", "Reexport"] 248 | git-tree-sha1 = "14536c95939aadcee44014728a459d2fe3ca9acf" 249 | uuid = "7a1cc6ca-52ef-59f5-83cd-3a7055c09341" 250 | version = "1.2.2" 251 | 252 | 
[[FFTW_jll]] 253 | deps = ["Libdl", "Pkg"] 254 | git-tree-sha1 = "6c975cd606128d45d1df432fb812d6eb10fee00b" 255 | uuid = "f5851436-0d7a-5f13-b9de-f02708fd171a" 256 | version = "3.3.9+5" 257 | 258 | [[FileIO]] 259 | deps = ["Pkg"] 260 | git-tree-sha1 = "202335fd24c2776493e198d6c66a6d910400a895" 261 | uuid = "5789e2e9-d7fb-5bc7-8068-2c6fae9b9549" 262 | version = "1.3.0" 263 | 264 | [[FilePathsBase]] 265 | deps = ["Dates", "LinearAlgebra", "Printf", "Test", "UUIDs"] 266 | git-tree-sha1 = "923fd3b942a11712435682eaa95cc8518c428b2c" 267 | uuid = "48062228-2e41-5def-b9a4-89aafe57970f" 268 | version = "0.8.0" 269 | 270 | [[FileWatching]] 271 | uuid = "7b1f6079-737a-58dc-b8bc-7a2ca5c1b5ee" 272 | 273 | [[FillArrays]] 274 | deps = ["LinearAlgebra", "Random", "SparseArrays"] 275 | git-tree-sha1 = "be4180bdb27a11188d694ee3773122f4921f1a62" 276 | uuid = "1a297f60-69ca-5386-bcde-b61e274b549b" 277 | version = "0.8.13" 278 | 279 | [[FiniteDiff]] 280 | deps = ["ArrayInterface", "LinearAlgebra", "Requires", "SparseArrays", "StaticArrays"] 281 | git-tree-sha1 = "b02b6f6ea2c33f86a444f9cf132c1d1180a66cfd" 282 | uuid = "6a86dc24-6348-571c-b903-95158fe2bd41" 283 | version = "2.4.1" 284 | 285 | [[FixedPointNumbers]] 286 | deps = ["Statistics"] 287 | git-tree-sha1 = "266baee2e9d875cb7a3bfdcc6cab553c543ff8ab" 288 | uuid = "53c48c17-4a7d-5ca2-90c5-79b7896eea93" 289 | version = "0.8.2" 290 | 291 | [[Formatting]] 292 | deps = ["Printf"] 293 | git-tree-sha1 = "a0c901c29c0e7c763342751c0a94211d56c0de5c" 294 | uuid = "59287772-0a20-5a39-b81b-1366585eb4c0" 295 | version = "0.4.1" 296 | 297 | [[ForwardDiff]] 298 | deps = ["CommonSubexpressions", "DiffResults", "DiffRules", "NaNMath", "Random", "SpecialFunctions", "StaticArrays"] 299 | git-tree-sha1 = "1d090099fb82223abc48f7ce176d3f7696ede36d" 300 | uuid = "f6369f11-7733-5829-9624-2563aa707210" 301 | version = "0.10.12" 302 | 303 | [[Franklin]] 304 | deps = ["Crayons", "Dates", "DelimitedFiles", "DocStringExtensions", "FranklinTemplates", "HTTP", "Literate", "LiveServer", "Logging", "Markdown", "NodeJS", "OrderedCollections", "Pkg", "Random"] 305 | git-tree-sha1 = "c79cc974f019c23e8e5841772070b60c42cdef1f" 306 | uuid = "713c75ef-9fc9-4b05-94a9-213340da978e" 307 | version = "0.8.6" 308 | 309 | [[FranklinTemplates]] 310 | git-tree-sha1 = "dc509923f200b7385ffe699d82aca084aede014b" 311 | uuid = "3a985190-f512-4703-8d38-2a7944ed5916" 312 | version = "0.7.2" 313 | 314 | [[FreeType2_jll]] 315 | deps = ["Bzip2_jll", "Libdl", "Pkg", "Zlib_jll"] 316 | git-tree-sha1 = "7d900f32a3788d4eacac2bfa3bf5c770179c8afd" 317 | uuid = "d7e528f0-a631-5988-bf34-fe36492bcfd7" 318 | version = "2.10.1+2" 319 | 320 | [[FriBidi_jll]] 321 | deps = ["Libdl", "Pkg"] 322 | git-tree-sha1 = "2f56bee16bd0151de7b6a1eeea2ced190a2ad8d4" 323 | uuid = "559328eb-81f9-559d-9380-de523a88c83c" 324 | version = "1.0.5+3" 325 | 326 | [[Future]] 327 | deps = ["Random"] 328 | uuid = "9fa8497b-333b-5362-9e8d-4d0656e87820" 329 | 330 | [[GLM]] 331 | deps = ["Distributions", "LinearAlgebra", "Printf", "Random", "Reexport", "SparseArrays", "SpecialFunctions", "Statistics", "StatsBase", "StatsFuns", "StatsModels"] 332 | git-tree-sha1 = "db0ace36f9dbe7b6a7a08434c5921377e9df2c72" 333 | uuid = "38e38edf-8417-5370-95a0-9cbb8c7f171a" 334 | version = "1.3.9" 335 | 336 | [[GR]] 337 | deps = ["Base64", "DelimitedFiles", "HTTP", "JSON", "LinearAlgebra", "Printf", "Random", "Serialization", "Sockets", "Test", "UUIDs"] 338 | git-tree-sha1 = "e26c513329675092535de20cc4bb9c579c8f85a0" 339 | uuid = "28b8d3ca-fb5f-59d9-8090-bfdbd6d07a71" 340 | 
version = "0.51.0" 341 | 342 | [[GeometryBasics]] 343 | deps = ["IterTools", "LinearAlgebra", "StaticArrays", "StructArrays", "Tables"] 344 | git-tree-sha1 = "119f32f9c2b497b49cd3f7f513b358b82660294c" 345 | uuid = "5c1252a2-5f33-56bf-86c9-59e7332b4326" 346 | version = "0.2.15" 347 | 348 | [[GeometryTypes]] 349 | deps = ["ColorTypes", "FixedPointNumbers", "LinearAlgebra", "StaticArrays"] 350 | git-tree-sha1 = "34bfa994967e893ab2f17b864eec221b3521ba4d" 351 | uuid = "4d00f742-c7ba-57c2-abde-4428a4b178cb" 352 | version = "0.8.3" 353 | 354 | [[HTTP]] 355 | deps = ["Base64", "Dates", "IniFile", "MbedTLS", "Sockets"] 356 | git-tree-sha1 = "eca61b35cdd8cd2fcc5eec1eda766424a995b02f" 357 | uuid = "cd3eb016-35fb-5094-929b-558a96fad6f3" 358 | version = "0.8.16" 359 | 360 | [[IniFile]] 361 | deps = ["Test"] 362 | git-tree-sha1 = "098e4d2c533924c921f9f9847274f2ad89e018b8" 363 | uuid = "83e8ac13-25f8-5344-8a64-a9f2b223428f" 364 | version = "0.5.0" 365 | 366 | [[IntelOpenMP_jll]] 367 | deps = ["Libdl", "Pkg"] 368 | git-tree-sha1 = "fb8e1c7a5594ba56f9011310790e03b5384998d6" 369 | uuid = "1d5cc7b8-4909-519e-a0f8-d0f5ad9712d0" 370 | version = "2018.0.3+0" 371 | 372 | [[InteractiveUtils]] 373 | deps = ["Markdown"] 374 | uuid = "b77e0a4c-d291-57a0-90e8-8db25a27a240" 375 | 376 | [[Interpolations]] 377 | deps = ["AxisAlgorithms", "LinearAlgebra", "OffsetArrays", "Random", "Ratios", "SharedArrays", "SparseArrays", "StaticArrays", "WoodburyMatrices"] 378 | git-tree-sha1 = "2b7d4e9be8b74f03115e64cf36ed2f48ae83d946" 379 | uuid = "a98d9a8b-a2ab-59e6-89dd-64a1c18fca59" 380 | version = "0.12.10" 381 | 382 | [[InvertedIndices]] 383 | deps = ["Test"] 384 | git-tree-sha1 = "15732c475062348b0165684ffe28e85ea8396afc" 385 | uuid = "41ab1584-1d38-5bbf-9106-f11c6c58b48f" 386 | version = "1.0.0" 387 | 388 | [[IterTools]] 389 | git-tree-sha1 = "05110a2ab1fc5f932622ffea2a003221f4782c18" 390 | uuid = "c8e1da08-722c-5040-9ed9-7db0dc04731e" 391 | version = "1.3.0" 392 | 393 | [[IterativeSolvers]] 394 | deps = ["LinearAlgebra", "Printf", "Random", "RecipesBase", "SparseArrays"] 395 | git-tree-sha1 = "3b7e2aac8c94444947facea7cc7ca91c49169be0" 396 | uuid = "42fd0dbc-a981-5370-80f2-aaf504508153" 397 | version = "0.8.4" 398 | 399 | [[IteratorInterfaceExtensions]] 400 | git-tree-sha1 = "a3f24677c21f5bbe9d2a714f95dcd58337fb2856" 401 | uuid = "82899510-4779-5014-852e-03e436cf321d" 402 | version = "1.0.0" 403 | 404 | [[JLSO]] 405 | deps = ["BSON", "CodecZlib", "FilePathsBase", "Memento", "Pkg", "Serialization"] 406 | git-tree-sha1 = "9dc0c7a4b7527806e53f524ccd66be0cd9e75e2e" 407 | uuid = "9da8a3cd-07a3-59c0-a743-3fdc52c30d11" 408 | version = "2.3.2" 409 | 410 | [[JSON]] 411 | deps = ["Dates", "Mmap", "Parsers", "Unicode"] 412 | git-tree-sha1 = "b34d7cef7b337321e97d22242c3c2b91f476748e" 413 | uuid = "682c06a0-de6a-54ab-a142-c8b1cf79cde6" 414 | version = "0.21.0" 415 | 416 | [[KernelDensity]] 417 | deps = ["Distributions", "FFTW", "Interpolations", "Optim", "StatsBase", "Test"] 418 | git-tree-sha1 = "c1048817fe5711f699abc8fabd47b1ac6ba4db04" 419 | uuid = "5ab0869b-81aa-558d-bb23-cbf5423bbe9b" 420 | version = "0.5.1" 421 | 422 | [[LAME_jll]] 423 | deps = ["Libdl", "Pkg"] 424 | git-tree-sha1 = "221cc8998b9060677448cbb6375f00032554c4fd" 425 | uuid = "c1c5ebd0-6772-5130-a774-d5fcae4a789d" 426 | version = "3.100.0+1" 427 | 428 | [[LIBLINEAR]] 429 | deps = ["DelimitedFiles", "Libdl", "SparseArrays", "Test"] 430 | git-tree-sha1 = "42cacc29d9b4ae77b6702c181bbfa58f14d8ef7a" 431 | uuid = "2d691ee1-e668-5016-a719-b2531b85e0f5" 432 | version = "0.5.1" 433 
| 434 | [[LIBSVM]] 435 | deps = ["Compat", "LIBLINEAR", "Libdl", "ScikitLearnBase", "SparseArrays"] 436 | git-tree-sha1 = "05d574c6598bce023ba6f2d2aa99ffd4f8e00789" 437 | uuid = "b1bec4e5-fd48-53fe-b0cb-9723c09d164b" 438 | version = "0.4.0" 439 | 440 | [[LaTeXStrings]] 441 | git-tree-sha1 = "de44b395389b84fd681394d4e8d39ef14e3a2ea8" 442 | uuid = "b964fa9f-0449-5b57-a5c2-d3ea65f4040f" 443 | version = "1.1.0" 444 | 445 | [[LearnBase]] 446 | git-tree-sha1 = "a0d90569edd490b82fdc4dc078ea54a5a800d30a" 447 | uuid = "7f8f8fb0-2700-5f03-b4bd-41f8cfc144b6" 448 | version = "0.4.1" 449 | 450 | [[LibGit2]] 451 | deps = ["Printf"] 452 | uuid = "76f85450-5226-5b5a-8eaa-529ad045b433" 453 | 454 | [[LibVPX_jll]] 455 | deps = ["Libdl", "Pkg"] 456 | git-tree-sha1 = "e3549ca9bf35feb9d9d954f4c6a9032e92f46e7c" 457 | uuid = "dd192d2f-8180-539f-9fb4-cc70b1dcf69a" 458 | version = "1.8.1+1" 459 | 460 | [[Libdl]] 461 | uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb" 462 | 463 | [[Libiconv_jll]] 464 | deps = ["Libdl", "Pkg"] 465 | git-tree-sha1 = "c9d4035d7481bcdff2babf5a55525a818ef8ed8f" 466 | uuid = "94ce4f54-9a6c-5748-9c1c-f9c7231a4531" 467 | version = "1.16.0+5" 468 | 469 | [[LightGBM]] 470 | deps = ["Dates", "Libdl", "MLJModelInterface", "StatsBase"] 471 | git-tree-sha1 = "cae192532a16a84190935389dae1a3a9cdc92ce4" 472 | uuid = "7acf609c-83a4-11e9-1ffb-b912bcd3b04a" 473 | version = "0.3.1" 474 | 475 | [[LineSearches]] 476 | deps = ["LinearAlgebra", "NLSolversBase", "NaNMath", "Parameters", "Printf", "Test"] 477 | git-tree-sha1 = "54eb90e8dbe745d617c78dee1d6ae95c7f6f5779" 478 | uuid = "d3d80556-e9d4-5f37-9878-2ab0fcc64255" 479 | version = "7.0.1" 480 | 481 | [[LinearAlgebra]] 482 | deps = ["Libdl"] 483 | uuid = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e" 484 | 485 | [[LinearMaps]] 486 | deps = ["LinearAlgebra", "SparseArrays"] 487 | git-tree-sha1 = "e204a96dbb8d49fbca24086c586734435d7bf5b5" 488 | uuid = "7a12625a-238d-50fd-b39a-03d52299707e" 489 | version = "2.6.1" 490 | 491 | [[Literate]] 492 | deps = ["Base64", "JSON", "REPL"] 493 | git-tree-sha1 = "422133037d6dc5df9f9b97c2cb81fcd9e35ddffe" 494 | uuid = "98b081ad-f1c9-55d3-8b20-4c87d4299306" 495 | version = "2.5.0" 496 | 497 | [[LiveServer]] 498 | deps = ["Crayons", "Documenter", "FileWatching", "HTTP", "Pkg", "Sockets", "Test"] 499 | git-tree-sha1 = "452307c337d1f625e7475d3e1a028cc5f1ca2fcb" 500 | uuid = "16fef848-5104-11e9-1b77-fb7a48bbb589" 501 | version = "0.5.0" 502 | 503 | [[Logging]] 504 | uuid = "56ddb016-857b-54e1-b83d-db4d58db5568" 505 | 506 | [[LossFunctions]] 507 | deps = ["LearnBase", "Markdown", "RecipesBase", "SparseArrays", "StatsBase"] 508 | git-tree-sha1 = "3cd347266e394a066ca7f17bd8ff589ff5ce1d35" 509 | uuid = "30fc2ffe-d236-52d8-8643-a9d8f7c094a7" 510 | version = "0.6.2" 511 | 512 | [[MKL_jll]] 513 | deps = ["IntelOpenMP_jll", "Libdl", "Pkg"] 514 | git-tree-sha1 = "0ce9a7fa68c70cf83c49d05d2c04d91b47404b08" 515 | uuid = "856f044c-d86e-5d09-b602-aeab76dc8ba7" 516 | version = "2020.1.216+0" 517 | 518 | [[MLJ]] 519 | deps = ["CategoricalArrays", "ComputationalResources", "Distributed", "Distributions", "LinearAlgebra", "MLJBase", "MLJModels", "MLJScientificTypes", "MLJTuning", "Pkg", "ProgressMeter", "Random", "Statistics", "StatsBase", "Tables"] 520 | git-tree-sha1 = "724663b1628522d83cb58189e57819f82d41063f" 521 | uuid = "add582a8-e3ab-11e8-2d5e-e98b27df1bc7" 522 | version = "0.11.6" 523 | 524 | [[MLJBase]] 525 | deps = ["CategoricalArrays", "ComputationalResources", "Dates", "DelimitedFiles", "Distributed", "Distributions", "HTTP", 
"InteractiveUtils", "InvertedIndices", "JLSO", "JSON", "LinearAlgebra", "LossFunctions", "MLJModelInterface", "MLJScientificTypes", "Missings", "OrderedCollections", "Parameters", "PrettyTables", "ProgressMeter", "Random", "ScientificTypes", "Statistics", "StatsBase", "Tables"] 526 | git-tree-sha1 = "d8ba2063ffaaa7f0fe91ea5455a7bf838c1424ac" 527 | uuid = "a7f614a8-145f-11e9-1d2a-a57a1082229d" 528 | version = "0.13.10" 529 | 530 | [[MLJLinearModels]] 531 | deps = ["DocStringExtensions", "IterativeSolvers", "LinearAlgebra", "LinearMaps", "MLJModelInterface", "Optim", "Parameters"] 532 | git-tree-sha1 = "01e7a3dc5c07982315c9163bbc3ad9d08811ea8e" 533 | uuid = "6ee0df7b-362f-4a72-a706-9e79364fb692" 534 | version = "0.5.0" 535 | 536 | [[MLJModelInterface]] 537 | deps = ["Random", "ScientificTypes"] 538 | git-tree-sha1 = "b02b13fde7b0dc301adc070d650405aa4909e657" 539 | uuid = "e80e1ace-859a-464e-9ed9-23947d8ae3ea" 540 | version = "0.3.0" 541 | 542 | [[MLJModels]] 543 | deps = ["CategoricalArrays", "Dates", "Distances", "Distributions", "InteractiveUtils", "LinearAlgebra", "MLJBase", "MLJModelInterface", "MultivariateStats", "OrderedCollections", "Parameters", "Pkg", "Random", "Requires", "ScientificTypes", "Statistics", "StatsBase", "Tables"] 544 | git-tree-sha1 = "3a434db580e736e23643867cd7c7e3ccaeafb31d" 545 | uuid = "d491faf4-2d78-11e9-2867-c94bc002c0b7" 546 | version = "0.10.1" 547 | 548 | [[MLJScientificTypes]] 549 | deps = ["CategoricalArrays", "ColorTypes", "Dates", "PrettyTables", "ScientificTypes", "Tables"] 550 | git-tree-sha1 = "c85856fca1302f7fd7d46dd72db7cf43d93777d9" 551 | uuid = "2e2323e0-db8b-457b-ae0d-bdfb3bc63afd" 552 | version = "0.2.8" 553 | 554 | [[MLJScikitLearnInterface]] 555 | deps = ["MLJModelInterface", "PyCall", "ScikitLearn"] 556 | git-tree-sha1 = "9202b249509ec05fd8a5e71b278f42b491f4f324" 557 | uuid = "5ae90465-5518-4432-b9d2-8a1def2f0cab" 558 | version = "0.1.5" 559 | 560 | [[MLJTuning]] 561 | deps = ["ComputationalResources", "Distributed", "Distributions", "MLJBase", "MLJModelInterface", "ProgressMeter", "Random", "RecipesBase"] 562 | git-tree-sha1 = "f9aa8dafd3dc4b8d195aa1b5518188cfd3e181e1" 563 | uuid = "03970b2e-30c4-11ea-3135-d1576263f10f" 564 | version = "0.3.6" 565 | 566 | [[MacroTools]] 567 | deps = ["Markdown", "Random"] 568 | git-tree-sha1 = "f7d2e3f654af75f01ec49be82c231c382214223a" 569 | uuid = "1914dd2f-81c6-5fcd-8719-6d5c9610ff09" 570 | version = "0.5.5" 571 | 572 | [[Markdown]] 573 | deps = ["Base64"] 574 | uuid = "d6f4376e-aef5-505a-96c1-9c027394607a" 575 | 576 | [[MbedTLS]] 577 | deps = ["Dates", "MbedTLS_jll", "Random", "Sockets"] 578 | git-tree-sha1 = "426a6978b03a97ceb7ead77775a1da066343ec6e" 579 | uuid = "739be429-bea8-5141-9913-cc70e7f3736d" 580 | version = "1.0.2" 581 | 582 | [[MbedTLS_jll]] 583 | deps = ["Libdl", "Pkg"] 584 | git-tree-sha1 = "a0cb0d489819fa7ea5f9fa84c7e7eba19d8073af" 585 | uuid = "c8ffd9c3-330d-5841-b78e-0817d7145fa1" 586 | version = "2.16.6+1" 587 | 588 | [[Measures]] 589 | git-tree-sha1 = "e498ddeee6f9fdb4551ce855a46f54dbd900245f" 590 | uuid = "442fdcdd-2543-5da2-b0f3-8c86c306513e" 591 | version = "0.3.1" 592 | 593 | [[Memento]] 594 | deps = ["Dates", "Distributed", "JSON", "Serialization", "Sockets", "Syslogs", "Test", "TimeZones", "UUIDs"] 595 | git-tree-sha1 = "31921ad09307dd9ad693da3213a218152fadb8f2" 596 | uuid = "f28f55f0-a522-5efc-85c2-fe41dfb9b2d9" 597 | version = "1.1.0" 598 | 599 | [[Missings]] 600 | deps = ["DataAPI"] 601 | git-tree-sha1 = "de0a5ce9e5289f27df672ffabef4d1e5861247d5" 602 | uuid = 
"e1d29d7a-bbdc-5cf2-9ac0-f12de2c33e28" 603 | version = "0.4.3" 604 | 605 | [[Mmap]] 606 | uuid = "a63ad114-7e13-5084-954f-fe012c677804" 607 | 608 | [[Mocking]] 609 | deps = ["ExprTools"] 610 | git-tree-sha1 = "916b850daad0d46b8c71f65f719c49957e9513ed" 611 | uuid = "78c3b35d-d492-501b-9361-3d52fe80e533" 612 | version = "0.7.1" 613 | 614 | [[MultivariateStats]] 615 | deps = ["Arpack", "LinearAlgebra", "SparseArrays", "Statistics", "StatsBase"] 616 | git-tree-sha1 = "352fae519b447bf52e6de627b89f448bcd469e4e" 617 | uuid = "6f286f6a-111f-5878-ab1e-185364afe411" 618 | version = "0.7.0" 619 | 620 | [[NLSolversBase]] 621 | deps = ["DiffResults", "Distributed", "FiniteDiff", "ForwardDiff"] 622 | git-tree-sha1 = "7c4e66c47848562003250f28b579c584e55becc0" 623 | uuid = "d41bc354-129a-5804-8e4c-c37616107c6c" 624 | version = "7.6.1" 625 | 626 | [[NaNMath]] 627 | git-tree-sha1 = "c84c576296d0e2fbb3fc134d3e09086b3ea617cd" 628 | uuid = "77ba4419-2d1f-58cd-9bb1-8ffee604a2e3" 629 | version = "0.3.4" 630 | 631 | [[NearestNeighbors]] 632 | deps = ["Distances", "StaticArrays"] 633 | git-tree-sha1 = "8bc6180f328f3c0ea2663935db880d34c57d6eae" 634 | uuid = "b8a86587-4115-5ab1-83bc-aa920d37bbce" 635 | version = "0.4.4" 636 | 637 | [[NodeJS]] 638 | deps = ["Pkg"] 639 | git-tree-sha1 = "350ac618f41958e6e0f6b0d2005ae4547eb1b503" 640 | uuid = "2bd173c7-0d6d-553b-b6af-13a54713934c" 641 | version = "1.1.1" 642 | 643 | [[Observables]] 644 | git-tree-sha1 = "11832878355305984235a2e90d0e3737383c634c" 645 | uuid = "510215fc-4207-5dde-b226-833fc4488ee2" 646 | version = "0.3.1" 647 | 648 | [[OffsetArrays]] 649 | git-tree-sha1 = "4ba4cd84c88df8340da1c3e2d8dcb9d18dd1b53b" 650 | uuid = "6fe1bfb0-de20-5000-8ca7-80f57d26f881" 651 | version = "1.1.1" 652 | 653 | [[Ogg_jll]] 654 | deps = ["Libdl", "Pkg"] 655 | git-tree-sha1 = "59cf7a95bf5ac39feac80b796e0f39f9d69dc887" 656 | uuid = "e7412a2a-1a6e-54c0-be00-318e2571c051" 657 | version = "1.3.4+0" 658 | 659 | [[OpenBLAS_jll]] 660 | deps = ["CompilerSupportLibraries_jll", "Libdl", "Pkg"] 661 | git-tree-sha1 = "0c922fd9634e358622e333fc58de61f05a048492" 662 | uuid = "4536629a-c528-5b80-bd46-f80d51c5b363" 663 | version = "0.3.9+5" 664 | 665 | [[OpenSSL_jll]] 666 | deps = ["Libdl", "Pkg"] 667 | git-tree-sha1 = "7aaaded15bf393b5f34c2aad5b765c18d26cb495" 668 | uuid = "458c3c95-2e84-50aa-8efc-19380b2a3a95" 669 | version = "1.1.1+4" 670 | 671 | [[OpenSpecFun_jll]] 672 | deps = ["CompilerSupportLibraries_jll", "Libdl", "Pkg"] 673 | git-tree-sha1 = "d51c416559217d974a1113522d5919235ae67a87" 674 | uuid = "efe28fd5-8261-553b-a9e1-b2916fc3738e" 675 | version = "0.5.3+3" 676 | 677 | [[Optim]] 678 | deps = ["Compat", "FillArrays", "LineSearches", "LinearAlgebra", "NLSolversBase", "NaNMath", "Parameters", "PositiveFactorizations", "Printf", "SparseArrays", "StatsBase"] 679 | git-tree-sha1 = "33af70b64e8ce2f2b857e3d5de7b71f67715c121" 680 | uuid = "429524aa-4258-5aef-a3af-852621145aeb" 681 | version = "0.21.0" 682 | 683 | [[Opus_jll]] 684 | deps = ["Libdl", "Pkg"] 685 | git-tree-sha1 = "002c18f222a542907e16c83c64a1338992da7e2c" 686 | uuid = "91d4177d-7536-5919-b921-800302f37372" 687 | version = "1.3.1+1" 688 | 689 | [[OrderedCollections]] 690 | git-tree-sha1 = "293b70ac1780f9584c89268a6e2a560d938a7065" 691 | uuid = "bac558e1-5e72-5ebc-8fee-abe8a469f55d" 692 | version = "1.3.0" 693 | 694 | [[PDMats]] 695 | deps = ["Arpack", "LinearAlgebra", "SparseArrays", "SuiteSparse", "Test"] 696 | git-tree-sha1 = "2fc6f50ddd959e462f0a2dbc802ddf2a539c6e35" 697 | uuid = "90014a1f-27ba-587c-ab20-58faa44d9150" 698 | 
version = "0.9.12" 699 | 700 | [[Parameters]] 701 | deps = ["OrderedCollections", "UnPack"] 702 | git-tree-sha1 = "38b2e970043613c187bd56a995fe2e551821eb4a" 703 | uuid = "d96e819e-fc66-5662-9728-84c9c7592b0a" 704 | version = "0.12.1" 705 | 706 | [[Parsers]] 707 | deps = ["Dates", "Test"] 708 | git-tree-sha1 = "10134f2ee0b1978ae7752c41306e131a684e1f06" 709 | uuid = "69de0a69-1ddd-5017-9359-2bf0b02dc9f0" 710 | version = "1.0.7" 711 | 712 | [[Pkg]] 713 | deps = ["Dates", "LibGit2", "Libdl", "Logging", "Markdown", "Printf", "REPL", "Random", "SHA", "UUIDs"] 714 | uuid = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f" 715 | 716 | [[PlotThemes]] 717 | deps = ["PlotUtils", "Requires", "Statistics"] 718 | git-tree-sha1 = "c6f5ea535551b3b16835134697f0c65d06c94b91" 719 | uuid = "ccf2f8ad-2431-5c83-bf29-c5338b663b6a" 720 | version = "2.0.0" 721 | 722 | [[PlotUtils]] 723 | deps = ["ColorSchemes", "Colors", "Dates", "Printf", "Random", "Reexport", "Statistics"] 724 | git-tree-sha1 = "e18e0e51ff07bf92bb7e06dcb9c082a4e125e20c" 725 | uuid = "995b91a9-d308-5afd-9ec6-746e21dbc043" 726 | version = "1.0.5" 727 | 728 | [[Plots]] 729 | deps = ["Base64", "Contour", "Dates", "FFMPEG", "FixedPointNumbers", "GR", "GeometryBasics", "GeometryTypes", "JSON", "LinearAlgebra", "Measures", "NaNMath", "PlotThemes", "PlotUtils", "Printf", "REPL", "Random", "RecipesBase", "RecipesPipeline", "Reexport", "Requires", "Showoff", "SparseArrays", "Statistics", "StatsBase", "UUIDs"] 730 | git-tree-sha1 = "ba747739872a67bc1a8078aec3313bde075b3fb0" 731 | uuid = "91a5bcdd-55d7-5caf-9e0b-520d859cae80" 732 | version = "1.5.5" 733 | 734 | [[PooledArrays]] 735 | deps = ["DataAPI"] 736 | git-tree-sha1 = "b1333d4eced1826e15adbdf01a4ecaccca9d353c" 737 | uuid = "2dfb63ee-cc39-5dd5-95bd-886bf059d720" 738 | version = "0.5.3" 739 | 740 | [[PositiveFactorizations]] 741 | deps = ["LinearAlgebra", "Test"] 742 | git-tree-sha1 = "127c47b91990c101ee3752291c4f45640eeb03d1" 743 | uuid = "85a6dd25-e78a-55b7-8502-1745935b8125" 744 | version = "0.2.3" 745 | 746 | [[PrettyPrinting]] 747 | git-tree-sha1 = "cb3bd68c8e0fabf6e13c10bdf11713068e748a79" 748 | uuid = "54e16d92-306c-5ea0-a30b-337be88ac337" 749 | version = "0.2.0" 750 | 751 | [[PrettyTables]] 752 | deps = ["Crayons", "Formatting", "Parameters", "Reexport", "Tables"] 753 | git-tree-sha1 = "8458dc04a493ae5c2fed3796c1d3117972c69694" 754 | uuid = "08abe8d2-0d0c-5749-adfa-8a2ac140af0d" 755 | version = "0.9.1" 756 | 757 | [[Printf]] 758 | deps = ["Unicode"] 759 | uuid = "de0858da-6303-5e67-8744-51eddeeeb8d7" 760 | 761 | [[ProgressMeter]] 762 | deps = ["Distributed", "Printf"] 763 | git-tree-sha1 = "2de4cddc0ceeddafb6b143b5b6cd9c659b64507c" 764 | uuid = "92933f4c-e287-5a05-a399-4b506db050ca" 765 | version = "1.3.2" 766 | 767 | [[PyCall]] 768 | deps = ["Conda", "Dates", "Libdl", "LinearAlgebra", "MacroTools", "Serialization", "VersionParsing"] 769 | git-tree-sha1 = "3a3fdb9000d35958c9ba2323ca7c4958901f115d" 770 | uuid = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0" 771 | version = "1.91.4" 772 | 773 | [[PyPlot]] 774 | deps = ["Colors", "LaTeXStrings", "PyCall", "Sockets", "Test", "VersionParsing"] 775 | git-tree-sha1 = "67dde2482fe1a72ef62ed93f8c239f947638e5a2" 776 | uuid = "d330b81b-6aea-500a-939a-2ce795aea3ee" 777 | version = "2.9.0" 778 | 779 | [[QuadGK]] 780 | deps = ["DataStructures", "LinearAlgebra"] 781 | git-tree-sha1 = "0ab8a09d4478ebeb99a706ecbf8634a65077ccdc" 782 | uuid = "1fd47b50-473d-5c70-9696-f719f8f3bcdc" 783 | version = "2.4.0" 784 | 785 | [[RData]] 786 | deps = ["CategoricalArrays", "CodecZlib", 
"DataFrames", "Dates", "FileIO", "Requires", "TimeZones", "Unicode"] 787 | git-tree-sha1 = "10693c581956334a368c26b7c544e406c4c94385" 788 | uuid = "df47a6cb-8c03-5eed-afd8-b6050d6c41da" 789 | version = "0.7.2" 790 | 791 | [[RDatasets]] 792 | deps = ["CSV", "CodecZlib", "DataFrames", "FileIO", "Printf", "RData", "Reexport"] 793 | git-tree-sha1 = "511854268c47438216a7640341ad4ce14b3463bb" 794 | uuid = "ce6b1742-4840-55fa-b093-852dadbb1d8b" 795 | version = "0.6.9" 796 | 797 | [[REPL]] 798 | deps = ["InteractiveUtils", "Markdown", "Sockets"] 799 | uuid = "3fa0cd96-eef1-5676-8a61-b3b8758bbffb" 800 | 801 | [[Random]] 802 | deps = ["Serialization"] 803 | uuid = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c" 804 | 805 | [[Ratios]] 806 | git-tree-sha1 = "37d210f612d70f3f7d57d488cb3b6eff56ad4e41" 807 | uuid = "c84ed2f1-dad5-54f0-aa8e-dbefe2724439" 808 | version = "0.4.0" 809 | 810 | [[RecipesBase]] 811 | git-tree-sha1 = "54f8ceb165a0f6d083f0d12cb4996f5367c6edbc" 812 | uuid = "3cdcf5f2-1ef4-517c-9805-6587b60abb01" 813 | version = "1.0.1" 814 | 815 | [[RecipesPipeline]] 816 | deps = ["Dates", "PlotUtils", "RecipesBase"] 817 | git-tree-sha1 = "d2a58b8291d1c0abae6a91489973f8a92bf5c04a" 818 | uuid = "01d81517-befc-4cb6-b9ec-a95719d0359c" 819 | version = "0.1.11" 820 | 821 | [[Reexport]] 822 | deps = ["Pkg"] 823 | git-tree-sha1 = "7b1d07f411bc8ddb7977ec7f377b97b158514fe0" 824 | uuid = "189a3867-3050-52da-a836-e630ba90ab69" 825 | version = "0.2.0" 826 | 827 | [[Requires]] 828 | deps = ["UUIDs"] 829 | git-tree-sha1 = "d37400976e98018ee840e0ca4f9d20baa231dc6b" 830 | uuid = "ae029012-a4dd-5104-9daa-d747884805df" 831 | version = "1.0.1" 832 | 833 | [[Rmath]] 834 | deps = ["Random", "Rmath_jll"] 835 | git-tree-sha1 = "86c5647b565873641538d8f812c04e4c9dbeb370" 836 | uuid = "79098fc4-a85e-5d69-aa6a-4863f24498fa" 837 | version = "0.6.1" 838 | 839 | [[Rmath_jll]] 840 | deps = ["Libdl", "Pkg"] 841 | git-tree-sha1 = "d76185aa1f421306dec73c057aa384bad74188f0" 842 | uuid = "f50d1b31-88e8-58de-be2c-1cc44531875f" 843 | version = "0.2.2+1" 844 | 845 | [[SHA]] 846 | uuid = "ea8e919c-243c-51af-8825-aaa63cd721ce" 847 | 848 | [[ScientificTypes]] 849 | git-tree-sha1 = "1a9f881c800ea009fb7f8b5274f04e4e8a5faef8" 850 | uuid = "321657f4-b219-11e9-178b-2701a2544e81" 851 | version = "0.8.0" 852 | 853 | [[ScikitLearn]] 854 | deps = ["Compat", "Conda", "DataFrames", "Distributed", "IterTools", "LinearAlgebra", "MacroTools", "Parameters", "Printf", "PyCall", "Random", "ScikitLearnBase", "SparseArrays", "StatsBase", "VersionParsing"] 855 | git-tree-sha1 = "b2dbb141575879beb3ad771fb0314a22617586d3" 856 | uuid = "3646fa90-6ef7-5e7e-9f22-8aca16db6324" 857 | version = "0.6.2" 858 | 859 | [[ScikitLearnBase]] 860 | deps = ["LinearAlgebra", "Random", "Statistics"] 861 | git-tree-sha1 = "7877e55c1523a4b336b433da39c8e8c08d2f221f" 862 | uuid = "6e75b9c4-186b-50bd-896f-2d2496a4843e" 863 | version = "0.5.0" 864 | 865 | [[Serialization]] 866 | uuid = "9e88b42a-f829-5b0c-bbe9-9e923198166b" 867 | 868 | [[SharedArrays]] 869 | deps = ["Distributed", "Mmap", "Random", "Serialization"] 870 | uuid = "1a1011a3-84de-559e-8e89-a11a2f7dc383" 871 | 872 | [[ShiftedArrays]] 873 | git-tree-sha1 = "22395afdcf37d6709a5a0766cc4a5ca52cb85ea0" 874 | uuid = "1277b4bf-5013-50f5-be3d-901d8477a67a" 875 | version = "1.0.0" 876 | 877 | [[Showoff]] 878 | deps = ["Dates"] 879 | git-tree-sha1 = "e032c9df551fb23c9f98ae1064de074111b7bc39" 880 | uuid = "992d4aef-0814-514b-bc4d-f2e9a6c4116f" 881 | version = "0.3.1" 882 | 883 | [[Sockets]] 884 | uuid = "6462fe0b-24de-5631-8697-dd941f90decc" 
885 | 886 | [[SortingAlgorithms]] 887 | deps = ["DataStructures", "Random", "Test"] 888 | git-tree-sha1 = "03f5898c9959f8115e30bc7226ada7d0df554ddd" 889 | uuid = "a2af1166-a08f-5f64-846c-94a0d3cef48c" 890 | version = "0.3.1" 891 | 892 | [[SparseArrays]] 893 | deps = ["LinearAlgebra", "Random"] 894 | uuid = "2f01184e-e22b-5df5-ae63-d93ebab69eaf" 895 | 896 | [[SpecialFunctions]] 897 | deps = ["OpenSpecFun_jll"] 898 | git-tree-sha1 = "d8d8b8a9f4119829410ecd706da4cc8594a1e020" 899 | uuid = "276daf66-3868-5448-9aa4-cd146d93841b" 900 | version = "0.10.3" 901 | 902 | [[StableRNGs]] 903 | deps = ["Random", "Test"] 904 | git-tree-sha1 = "705f8782b1d532c6db75e0a986fb848a629f971a" 905 | uuid = "860ef19b-820b-49d6-a774-d7a799459cd3" 906 | version = "0.1.1" 907 | 908 | [[StaticArrays]] 909 | deps = ["LinearAlgebra", "Random", "Statistics"] 910 | git-tree-sha1 = "016d1e1a00fabc556473b07161da3d39726ded35" 911 | uuid = "90137ffa-7385-5640-81b9-e52037218182" 912 | version = "0.12.4" 913 | 914 | [[Statistics]] 915 | deps = ["LinearAlgebra", "SparseArrays"] 916 | uuid = "10745b16-79ce-11e8-11f9-7d13ad32a3b2" 917 | 918 | [[StatsBase]] 919 | deps = ["DataAPI", "DataStructures", "LinearAlgebra", "Missings", "Printf", "Random", "SortingAlgorithms", "SparseArrays", "Statistics"] 920 | git-tree-sha1 = "a6102b1f364befdb05746f386b67c6b7e3262c45" 921 | uuid = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91" 922 | version = "0.33.0" 923 | 924 | [[StatsFuns]] 925 | deps = ["Rmath", "SpecialFunctions"] 926 | git-tree-sha1 = "04a5a8e6ab87966b43f247920eab053fd5fdc925" 927 | uuid = "4c63d2b9-4356-54db-8cca-17b64c39e42c" 928 | version = "0.9.5" 929 | 930 | [[StatsModels]] 931 | deps = ["DataAPI", "DataStructures", "Distributions", "LinearAlgebra", "Printf", "ShiftedArrays", "SparseArrays", "StatsBase", "Tables"] 932 | git-tree-sha1 = "b79969dac368d8a61515b861b15d0e691e0bff96" 933 | uuid = "3eaba693-59b7-5ba5-a881-562e759f1c8d" 934 | version = "0.6.12" 935 | 936 | [[StatsPlots]] 937 | deps = ["Clustering", "DataStructures", "DataValues", "Distributions", "Interpolations", "KernelDensity", "MultivariateStats", "Observables", "Plots", "RecipesBase", "RecipesPipeline", "Reexport", "StatsBase", "TableOperations", "Tables", "Widgets"] 938 | git-tree-sha1 = "b9b7fff81f573465fcac4685df1497d968537a9e" 939 | uuid = "f3b207a7-027a-5e70-b257-86293d7955fd" 940 | version = "0.14.6" 941 | 942 | [[StructArrays]] 943 | deps = ["Adapt", "DataAPI", "Tables"] 944 | git-tree-sha1 = "8099ed9fb90b6e754d6ba8c6ed8670f010eadca0" 945 | uuid = "09ab397b-f2b6-538f-b94a-2f83cf4a842a" 946 | version = "0.4.4" 947 | 948 | [[SuiteSparse]] 949 | deps = ["Libdl", "LinearAlgebra", "Serialization", "SparseArrays"] 950 | uuid = "4607b0f0-06f3-5cda-b6b1-a6196a1729e9" 951 | 952 | [[Syslogs]] 953 | deps = ["Printf", "Sockets"] 954 | git-tree-sha1 = "46badfcc7c6e74535cc7d833a91f4ac4f805f86d" 955 | uuid = "cea106d9-e007-5e6c-ad93-58fe2094e9c4" 956 | version = "0.3.0" 957 | 958 | [[TableOperations]] 959 | deps = ["Tables", "Test"] 960 | git-tree-sha1 = "208630a14884abd110a8f8008b0882f0d0f5632c" 961 | uuid = "ab02a1b2-a7df-11e8-156e-fb1833f50b87" 962 | version = "0.2.1" 963 | 964 | [[TableTraits]] 965 | deps = ["IteratorInterfaceExtensions"] 966 | git-tree-sha1 = "b1ad568ba658d8cbb3b892ed5380a6f3e781a81e" 967 | uuid = "3783bdb8-4a98-5b6b-af9a-565f29a5fe9c" 968 | version = "1.0.0" 969 | 970 | [[Tables]] 971 | deps = ["DataAPI", "DataValueInterfaces", "IteratorInterfaceExtensions", "LinearAlgebra", "TableTraits", "Test"] 972 | git-tree-sha1 = 
"c45dcc27331febabc20d86cb3974ef095257dcf3" 973 | uuid = "bd369af6-aec1-5ad0-b16a-f7cc5008161c" 974 | version = "1.0.4" 975 | 976 | [[Test]] 977 | deps = ["Distributed", "InteractiveUtils", "Logging", "Random"] 978 | uuid = "8dfed614-e22c-5e08-85e1-65c5234f0b40" 979 | 980 | [[TimeZones]] 981 | deps = ["Dates", "EzXML", "Mocking", "Pkg", "Printf", "RecipesBase", "Serialization", "Unicode"] 982 | git-tree-sha1 = "fc9deaf6636c12c564a9eb7c110eff469eec2efa" 983 | uuid = "f269a46b-ccf7-5d73-abea-4c690281aa53" 984 | version = "1.3.0" 985 | 986 | [[TranscodingStreams]] 987 | deps = ["Random", "Test"] 988 | git-tree-sha1 = "7c53c35547de1c5b9d46a4797cf6d8253807108c" 989 | uuid = "3bb67fe8-82b1-5028-8e26-92a6c54297fa" 990 | version = "0.9.5" 991 | 992 | [[UUIDs]] 993 | deps = ["Random", "SHA"] 994 | uuid = "cf7118a7-6976-5b1a-9a39-7adc72f591a4" 995 | 996 | [[UnPack]] 997 | git-tree-sha1 = "d4bfa022cd30df012700cf380af2141961bb3bfb" 998 | uuid = "3a884ed6-31ef-47d7-9d2a-63182c4928ed" 999 | version = "1.0.1" 1000 | 1001 | [[Unicode]] 1002 | uuid = "4ec0a83e-493e-50e2-b9ac-8f72acf5a8f5" 1003 | 1004 | [[UrlDownload]] 1005 | deps = ["HTTP", "ProgressMeter"] 1006 | git-tree-sha1 = "5f4a56e15ed7c4e37d35cd30b82ecc2fb28a0f5d" 1007 | uuid = "856ac37a-3032-4c1c-9122-f86d88358c8b" 1008 | version = "0.3.0" 1009 | 1010 | [[VersionParsing]] 1011 | git-tree-sha1 = "80229be1f670524750d905f8fc8148e5a8c4537f" 1012 | uuid = "81def892-9a0e-5fdd-b105-ffc91e053289" 1013 | version = "1.2.0" 1014 | 1015 | [[WeakRefStrings]] 1016 | deps = ["DataAPI", "Random", "Test"] 1017 | git-tree-sha1 = "28807f85197eaad3cbd2330386fac1dcb9e7e11d" 1018 | uuid = "ea10d353-3f73-51f8-a26c-33c1cb351aa5" 1019 | version = "0.6.2" 1020 | 1021 | [[Widgets]] 1022 | deps = ["Colors", "Dates", "Observables", "OrderedCollections"] 1023 | git-tree-sha1 = "fc0feda91b3fef7fe6948ee09bb628f882b49ca4" 1024 | uuid = "cc8bc4a8-27d6-5769-a93b-9d913e69aa62" 1025 | version = "0.6.2" 1026 | 1027 | [[WoodburyMatrices]] 1028 | deps = ["LinearAlgebra", "SparseArrays"] 1029 | git-tree-sha1 = "28ffe06d28b1ba8fdb2f36ec7bb079fac81bac0d" 1030 | uuid = "efce3f68-66dc-5838-9240-27a6d6f5f9b6" 1031 | version = "0.5.2" 1032 | 1033 | [[XGBoost]] 1034 | deps = ["Libdl", "Printf", "Random", "SparseArrays", "Statistics", "Test", "XGBoost_jll"] 1035 | git-tree-sha1 = "8a692f817f1a6c15ef4913a0ffefa6163117f43d" 1036 | uuid = "009559a3-9522-5dbb-924b-0b6ed2b22bb9" 1037 | version = "1.1.1" 1038 | 1039 | [[XGBoost_jll]] 1040 | deps = ["CompilerSupportLibraries_jll", "Libdl", "Pkg"] 1041 | git-tree-sha1 = "72c0d8bfbb56856c5f25668b72247ec18bbf5579" 1042 | uuid = "a5c6f535-4255-5ca2-a466-0e519f119c46" 1043 | version = "1.1.1+0" 1044 | 1045 | [[XML2_jll]] 1046 | deps = ["Libdl", "Libiconv_jll", "Pkg", "Zlib_jll"] 1047 | git-tree-sha1 = "432d91f45e950f2f2bda5c0f4e2b938c14493af9" 1048 | uuid = "02c8fc9c-b97f-50b9-bbe4-9be30ff0a78a" 1049 | version = "2.9.10+1" 1050 | 1051 | [[Zlib_jll]] 1052 | deps = ["Libdl", "Pkg"] 1053 | git-tree-sha1 = "622d8b6dc0c7e8029f17127703de9819134d1b71" 1054 | uuid = "83775a58-1f1d-513f-b197-d71354ab007a" 1055 | version = "1.2.11+14" 1056 | 1057 | [[libass_jll]] 1058 | deps = ["Bzip2_jll", "FreeType2_jll", "FriBidi_jll", "Libdl", "Pkg", "Zlib_jll"] 1059 | git-tree-sha1 = "027a304b2a90de84f690949a21f94e5ae0f92c73" 1060 | uuid = "0ac62f75-1d6f-5e53-bd7c-93b484bb37c0" 1061 | version = "0.14.0+2" 1062 | 1063 | [[libfdk_aac_jll]] 1064 | deps = ["Libdl", "Pkg"] 1065 | git-tree-sha1 = "480c7ed04f68ea3edd4c757f5db5b6a0a4e0bd99" 1066 | uuid = 
"f638f0a6-7fb0-5443-88ba-1cc74229b280" 1067 | version = "0.1.6+2" 1068 | 1069 | [[libvorbis_jll]] 1070 | deps = ["Libdl", "Ogg_jll", "Pkg"] 1071 | git-tree-sha1 = "6a66f65b5275dfa799036c8a3a26616a0a271c4a" 1072 | uuid = "f27f6e37-5d2b-51aa-960f-b287f2bc3b7a" 1073 | version = "1.3.6+4" 1074 | 1075 | [[x264_jll]] 1076 | deps = ["Libdl", "Pkg"] 1077 | git-tree-sha1 = "d89346fe63a6465a9f44e958ac0e3d366af90b74" 1078 | uuid = "1270edf5-f2f9-52d2-97e9-ab00b5d0237a" 1079 | version = "2019.5.25+2" 1080 | 1081 | [[x265_jll]] 1082 | deps = ["Libdl", "Pkg"] 1083 | git-tree-sha1 = "61324ad346b00a6e541896b94201c9426591e43a" 1084 | uuid = "dfaa095f-4041-5dcd-9319-2fabd8486b76" 1085 | version = "3.0.0+1" 1086 | -------------------------------------------------------------------------------- /data/src/Project.toml: -------------------------------------------------------------------------------- 1 | name = "DataScienceTutorials" 2 | uuid = "b22f6415-6e67-485c-b34d-7995e604d9c9" 3 | version = "0.4.1" 4 | 5 | [deps] 6 | CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b" 7 | CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597" 8 | Clustering = "aaaa29a8-35af-508c-8bc3-b662a17a0fe5" 9 | DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" 10 | Dates = "ade2ca70-3891-5945-98fb-dc099432e06a" 11 | DecisionTree = "7806a523-6efd-50cb-b5f6-3fa6f1930dbb" 12 | Distances = "b4f34e82-e78d-54a5-968a-f98e89d6e8f7" 13 | Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f" 14 | EvoTrees = "f6006082-12f8-11e9-0c9c-0d5d367ab1e5" 15 | Franklin = "713c75ef-9fc9-4b05-94a9-213340da978e" 16 | GLM = "38e38edf-8417-5370-95a0-9cbb8c7f171a" 17 | HTTP = "cd3eb016-35fb-5094-929b-558a96fad6f3" 18 | LIBSVM = "b1bec4e5-fd48-53fe-b0cb-9723c09d164b" 19 | LightGBM = "7acf609c-83a4-11e9-1ffb-b912bcd3b04a" 20 | LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e" 21 | LossFunctions = "30fc2ffe-d236-52d8-8643-a9d8f7c094a7" 22 | MLJ = "add582a8-e3ab-11e8-2d5e-e98b27df1bc7" 23 | MLJBase = "a7f614a8-145f-11e9-1d2a-a57a1082229d" 24 | MLJLinearModels = "6ee0df7b-362f-4a72-a706-9e79364fb692" 25 | MLJModelInterface = "e80e1ace-859a-464e-9ed9-23947d8ae3ea" 26 | MLJModels = "d491faf4-2d78-11e9-2867-c94bc002c0b7" 27 | MLJScientificTypes = "2e2323e0-db8b-457b-ae0d-bdfb3bc63afd" 28 | MLJScikitLearnInterface = "5ae90465-5518-4432-b9d2-8a1def2f0cab" 29 | MultivariateStats = "6f286f6a-111f-5878-ab1e-185364afe411" 30 | NearestNeighbors = "b8a86587-4115-5ab1-83bc-aa920d37bbce" 31 | PrettyPrinting = "54e16d92-306c-5ea0-a30b-337be88ac337" 32 | PyPlot = "d330b81b-6aea-500a-939a-2ce795aea3ee" 33 | RDatasets = "ce6b1742-4840-55fa-b093-852dadbb1d8b" 34 | Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c" 35 | ScikitLearn = "3646fa90-6ef7-5e7e-9f22-8aca16db6324" 36 | StableRNGs = "860ef19b-820b-49d6-a774-d7a799459cd3" 37 | Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2" 38 | StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91" 39 | StatsPlots = "f3b207a7-027a-5e70-b257-86293d7955fd" 40 | Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c" 41 | UrlDownload = "856ac37a-3032-4c1c-9122-f86d88358c8b" 42 | XGBoost = "009559a3-9522-5dbb-924b-0b6ed2b22bb9" 43 | 44 | [compat] 45 | MLJ = "0.11" 46 | MLJBase = "0.13" 47 | MLJLinearModels = "0.5" 48 | MLJModelInterface = "0.3" 49 | MLJModels = "0.10" 50 | MLJScientificTypes = "0.2" 51 | 52 | [extras] 53 | Logging = "56ddb016-857b-54e1-b83d-db4d58db5568" 54 | Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40" 55 | 56 | [targets] 57 | test = ["Test", "Logging"] 58 | 
-------------------------------------------------------------------------------- /data/src/convert_ames.jl: -------------------------------------------------------------------------------- 1 | using Pkg 2 | Pkg.activate(joinpath(@__DIR__, "convert_ames")) 3 | Pkg.instantiate() 4 | 5 | using DataFrames, CSV, MLJBase, CategoricalArrays 6 | 7 | df = CSV.read(joinpath(@__DIR__, "reduced_ames.csv")) 8 | 9 | schema(df) 10 | 11 | price = df.target 12 | quality = df.OverallQual 13 | area1 = map(df.GrLivArea) do a round(Int, a) end 14 | area2 = map(df.x1stFlrSF) do a round(Int, a) end 15 | area3 = map(df.TotalBsmtSF) do a round(Int, a) end 16 | area4 = map(df.BsmtFinSF1) do a round(Int, a) end 17 | area5 = map(df.GarageArea) do a round(Int, a) end 18 | lot_area = map(df.LotArea) do a round(Int, a) end 19 | garage_cars = map(df.GarageCars) do a round(Int, a) end 20 | suburb = df.Neighborhood 21 | council_code = map(df.MSSubClass) do a parse(Int, a[2:end]) end 22 | year_built = map(df.YearBuilt) do a round(Int, a) end 23 | year_upgraded = map(df.YearRemodAdd) do a round(Int, a) end 24 | zone = df.MSSubClass 25 | 26 | df2 = DataFrame(price=price, 27 | area1=area1, 28 | area2=area2, 29 | area3=area3, 30 | area4=area4, 31 | area5=area5, 32 | lot_area=lot_area, 33 | year_built=year_built, 34 | year_upgraded=year_upgraded, 35 | quality=quality, 36 | garage_cars=garage_cars, 37 | suburb=suburb, 38 | council_code=council_code, 39 | zone=zone) 40 | 41 | CSV.write(joinpath(@__DIR__, "ames.csv"), df2) 42 | 43 | -------------------------------------------------------------------------------- /data/src/convert_ames/Manifest.toml: -------------------------------------------------------------------------------- 1 | # This file is machine-generated - editing it directly is not advised 2 | 3 | [[Arpack]] 4 | deps = ["Arpack_jll", "Libdl", "LinearAlgebra"] 5 | git-tree-sha1 = "2ff92b71ba1747c5fdd541f8fc87736d82f40ec9" 6 | uuid = "7d9fca2a-8960-54d3-9f78-7d1dccf2cb97" 7 | version = "0.4.0" 8 | 9 | [[Arpack_jll]] 10 | deps = ["Libdl", "OpenBLAS_jll", "Pkg"] 11 | git-tree-sha1 = "e214a9b9bd1b4e1b4f15b22c0994862b66af7ff7" 12 | uuid = "68821587-b530-5797-8361-c406ea357684" 13 | version = "3.5.0+3" 14 | 15 | [[BSON]] 16 | git-tree-sha1 = "dd36d7cf3d185eeaaf64db902c15174b22f5dafb" 17 | uuid = "fbb218c0-5317-5bc6-957e-2ee96dd4b1f0" 18 | version = "0.2.6" 19 | 20 | [[Base64]] 21 | uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f" 22 | 23 | [[CSV]] 24 | deps = ["CategoricalArrays", "DataFrames", "Dates", "FilePathsBase", "Mmap", "Parsers", "PooledArrays", "Tables", "Unicode", "WeakRefStrings"] 25 | git-tree-sha1 = "52a8e60c7822f53d57e4403b7f2811e7e1bdd32b" 26 | uuid = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b" 27 | version = "0.6.2" 28 | 29 | [[CategoricalArrays]] 30 | deps = ["DataAPI", "Future", "JSON", "Missings", "Printf", "Statistics", "Unicode"] 31 | git-tree-sha1 = "a6c17353ee38ddab30e73dcfaa1107752de724ec" 32 | uuid = "324d7699-5711-5eae-9e2f-1d82baa6b597" 33 | version = "0.8.1" 34 | 35 | [[CodecZlib]] 36 | deps = ["TranscodingStreams", "Zlib_jll"] 37 | git-tree-sha1 = "ded953804d019afa9a3f98981d99b33e3db7b6da" 38 | uuid = "944b1d66-785c-5afd-91f1-9de20f533193" 39 | version = "0.7.0" 40 | 41 | [[ColorTypes]] 42 | deps = ["FixedPointNumbers", "Random"] 43 | git-tree-sha1 = "c73d9cfc2a9d8433dc77f5bff4bddf46b1d78c20" 44 | uuid = "3da002f7-5984-5a60-b8a6-cbb66c0b333f" 45 | version = "0.10.3" 46 | 47 | [[Compat]] 48 | deps = ["Base64", "Dates", "DelimitedFiles", "Distributed", "InteractiveUtils", "LibGit2", "Libdl", "LinearAlgebra", 
"Markdown", "Mmap", "Pkg", "Printf", "REPL", "Random", "SHA", "Serialization", "SharedArrays", "Sockets", "SparseArrays", "Statistics", "Test", "UUIDs", "Unicode"] 49 | git-tree-sha1 = "054993b6611376ddb40203e973e954fd9d1d1902" 50 | uuid = "34da2185-b29b-5c13-b0c7-acf172513d20" 51 | version = "3.12.0" 52 | 53 | [[CompilerSupportLibraries_jll]] 54 | deps = ["Libdl", "Pkg"] 55 | git-tree-sha1 = "7c4f882c41faa72118841185afc58a2eb00ef612" 56 | uuid = "e66e0078-7015-5450-92f7-15fbd957f2ae" 57 | version = "0.3.3+0" 58 | 59 | [[ComputationalResources]] 60 | git-tree-sha1 = "52cb3ec90e8a8bea0e62e275ba577ad0f74821f7" 61 | uuid = "ed09eef8-17a6-5b46-8889-db040fac31e3" 62 | version = "0.3.2" 63 | 64 | [[Crayons]] 65 | git-tree-sha1 = "9f3adcb26c79d6270eb678f3c61bf44cc6b7077e" 66 | uuid = "a8cc5b0e-0ffa-5ad4-8c14-923d3ee1735f" 67 | version = "4.0.2" 68 | 69 | [[DataAPI]] 70 | git-tree-sha1 = "176e23402d80e7743fc26c19c681bfb11246af32" 71 | uuid = "9a962f9c-6df0-11e9-0e5d-c546b8b5ee8a" 72 | version = "1.3.0" 73 | 74 | [[DataFrames]] 75 | deps = ["CategoricalArrays", "Compat", "DataAPI", "Future", "InvertedIndices", "IteratorInterfaceExtensions", "Missings", "PooledArrays", "Printf", "REPL", "Reexport", "SortingAlgorithms", "Statistics", "TableTraits", "Tables", "Unicode"] 76 | git-tree-sha1 = "02f08ae77249b7f6d4186b081a016fb7454c616f" 77 | uuid = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" 78 | version = "0.21.2" 79 | 80 | [[DataStructures]] 81 | deps = ["InteractiveUtils", "OrderedCollections"] 82 | git-tree-sha1 = "be680f1ad03c0a03796aa3fda5a2180df7f83b46" 83 | uuid = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8" 84 | version = "0.17.18" 85 | 86 | [[DataValueInterfaces]] 87 | git-tree-sha1 = "bfc1187b79289637fa0ef6d4436ebdfe6905cbd6" 88 | uuid = "e2d170a0-9d28-54be-80f0-106bbe20a464" 89 | version = "1.0.0" 90 | 91 | [[Dates]] 92 | deps = ["Printf"] 93 | uuid = "ade2ca70-3891-5945-98fb-dc099432e06a" 94 | 95 | [[DelimitedFiles]] 96 | deps = ["Mmap"] 97 | uuid = "8bb1440f-4735-579b-a4ab-409b98df4dab" 98 | 99 | [[Distributed]] 100 | deps = ["Random", "Serialization", "Sockets"] 101 | uuid = "8ba89e20-285c-5b6f-9357-94700520ee1b" 102 | 103 | [[Distributions]] 104 | deps = ["FillArrays", "LinearAlgebra", "PDMats", "Printf", "QuadGK", "Random", "SpecialFunctions", "Statistics", "StatsBase", "StatsFuns"] 105 | git-tree-sha1 = "78c4c32a2357a00a0a7d614880f02c2c6e1ec73c" 106 | uuid = "31c24e10-a181-5473-b8eb-7969acd0382f" 107 | version = "0.23.4" 108 | 109 | [[ExprTools]] 110 | git-tree-sha1 = "6f0517056812fd6aa3af23d4b70d5325a2ae4e95" 111 | uuid = "e2ba6199-217a-4e67-a87a-7c52f15ade04" 112 | version = "0.1.1" 113 | 114 | [[EzXML]] 115 | deps = ["Printf", "XML2_jll"] 116 | git-tree-sha1 = "0fa3b52a04a4e210aeb1626def9c90df3ae65268" 117 | uuid = "8f5d6c58-4d21-5cfd-889c-e3ad7ee6a615" 118 | version = "1.1.0" 119 | 120 | [[FilePathsBase]] 121 | deps = ["Dates", "LinearAlgebra", "Printf", "Test", "UUIDs"] 122 | git-tree-sha1 = "923fd3b942a11712435682eaa95cc8518c428b2c" 123 | uuid = "48062228-2e41-5def-b9a4-89aafe57970f" 124 | version = "0.8.0" 125 | 126 | [[FillArrays]] 127 | deps = ["LinearAlgebra", "Random", "SparseArrays"] 128 | git-tree-sha1 = "44f561e293987ffc84272cd3d2b14b0b93123d63" 129 | uuid = "1a297f60-69ca-5386-bcde-b61e274b549b" 130 | version = "0.8.10" 131 | 132 | [[FixedPointNumbers]] 133 | git-tree-sha1 = "3ba9ea634d4c8b289d590403b4a06f8e227a6238" 134 | uuid = "53c48c17-4a7d-5ca2-90c5-79b7896eea93" 135 | version = "0.8.0" 136 | 137 | [[Formatting]] 138 | deps = ["Printf"] 139 | git-tree-sha1 = 
"a0c901c29c0e7c763342751c0a94211d56c0de5c" 140 | uuid = "59287772-0a20-5a39-b81b-1366585eb4c0" 141 | version = "0.4.1" 142 | 143 | [[Future]] 144 | deps = ["Random"] 145 | uuid = "9fa8497b-333b-5362-9e8d-4d0656e87820" 146 | 147 | [[HTTP]] 148 | deps = ["Base64", "Dates", "IniFile", "MbedTLS", "Sockets"] 149 | git-tree-sha1 = "ec87d5e2acbe1693789efbbe14f5ea7525758f71" 150 | uuid = "cd3eb016-35fb-5094-929b-558a96fad6f3" 151 | version = "0.8.15" 152 | 153 | [[IniFile]] 154 | deps = ["Test"] 155 | git-tree-sha1 = "098e4d2c533924c921f9f9847274f2ad89e018b8" 156 | uuid = "83e8ac13-25f8-5344-8a64-a9f2b223428f" 157 | version = "0.5.0" 158 | 159 | [[InteractiveUtils]] 160 | deps = ["Markdown"] 161 | uuid = "b77e0a4c-d291-57a0-90e8-8db25a27a240" 162 | 163 | [[InvertedIndices]] 164 | deps = ["Test"] 165 | git-tree-sha1 = "15732c475062348b0165684ffe28e85ea8396afc" 166 | uuid = "41ab1584-1d38-5bbf-9106-f11c6c58b48f" 167 | version = "1.0.0" 168 | 169 | [[IteratorInterfaceExtensions]] 170 | git-tree-sha1 = "a3f24677c21f5bbe9d2a714f95dcd58337fb2856" 171 | uuid = "82899510-4779-5014-852e-03e436cf321d" 172 | version = "1.0.0" 173 | 174 | [[JLSO]] 175 | deps = ["BSON", "CodecZlib", "FilePathsBase", "Memento", "Pkg", "Serialization"] 176 | git-tree-sha1 = "9dc0c7a4b7527806e53f524ccd66be0cd9e75e2e" 177 | uuid = "9da8a3cd-07a3-59c0-a743-3fdc52c30d11" 178 | version = "2.3.2" 179 | 180 | [[JSON]] 181 | deps = ["Dates", "Mmap", "Parsers", "Unicode"] 182 | git-tree-sha1 = "b34d7cef7b337321e97d22242c3c2b91f476748e" 183 | uuid = "682c06a0-de6a-54ab-a142-c8b1cf79cde6" 184 | version = "0.21.0" 185 | 186 | [[LearnBase]] 187 | git-tree-sha1 = "a0d90569edd490b82fdc4dc078ea54a5a800d30a" 188 | uuid = "7f8f8fb0-2700-5f03-b4bd-41f8cfc144b6" 189 | version = "0.4.1" 190 | 191 | [[LibGit2]] 192 | uuid = "76f85450-5226-5b5a-8eaa-529ad045b433" 193 | 194 | [[Libdl]] 195 | uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb" 196 | 197 | [[Libiconv_jll]] 198 | deps = ["Libdl", "Pkg"] 199 | git-tree-sha1 = "e5256a3b0ebc710dbd6da0c0b212164a3681037f" 200 | uuid = "94ce4f54-9a6c-5748-9c1c-f9c7231a4531" 201 | version = "1.16.0+2" 202 | 203 | [[LinearAlgebra]] 204 | deps = ["Libdl"] 205 | uuid = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e" 206 | 207 | [[Logging]] 208 | uuid = "56ddb016-857b-54e1-b83d-db4d58db5568" 209 | 210 | [[LossFunctions]] 211 | deps = ["LearnBase", "Markdown", "RecipesBase", "SparseArrays", "StatsBase"] 212 | git-tree-sha1 = "3cd347266e394a066ca7f17bd8ff589ff5ce1d35" 213 | uuid = "30fc2ffe-d236-52d8-8643-a9d8f7c094a7" 214 | version = "0.6.2" 215 | 216 | [[MLJBase]] 217 | deps = ["CategoricalArrays", "ComputationalResources", "Dates", "DelimitedFiles", "Distributed", "Distributions", "HTTP", "InteractiveUtils", "InvertedIndices", "JLSO", "JSON", "LinearAlgebra", "LossFunctions", "MLJModelInterface", "MLJScientificTypes", "Missings", "OrderedCollections", "Parameters", "PrettyTables", "ProgressMeter", "Random", "ScientificTypes", "Statistics", "StatsBase", "Tables"] 218 | git-tree-sha1 = "d8ba2063ffaaa7f0fe91ea5455a7bf838c1424ac" 219 | uuid = "a7f614a8-145f-11e9-1d2a-a57a1082229d" 220 | version = "0.13.10" 221 | 222 | [[MLJModelInterface]] 223 | deps = ["Random", "ScientificTypes"] 224 | git-tree-sha1 = "b02b13fde7b0dc301adc070d650405aa4909e657" 225 | uuid = "e80e1ace-859a-464e-9ed9-23947d8ae3ea" 226 | version = "0.3.0" 227 | 228 | [[MLJScientificTypes]] 229 | deps = ["CategoricalArrays", "ColorTypes", "Dates", "PrettyTables", "ScientificTypes", "Tables"] 230 | git-tree-sha1 = "5296df0ffd2ff7c667260c027d03a465b59dcff5" 231 | uuid = 
"2e2323e0-db8b-457b-ae0d-bdfb3bc63afd" 232 | version = "0.2.7" 233 | 234 | [[Markdown]] 235 | deps = ["Base64"] 236 | uuid = "d6f4376e-aef5-505a-96c1-9c027394607a" 237 | 238 | [[MbedTLS]] 239 | deps = ["Dates", "MbedTLS_jll", "Random", "Sockets"] 240 | git-tree-sha1 = "426a6978b03a97ceb7ead77775a1da066343ec6e" 241 | uuid = "739be429-bea8-5141-9913-cc70e7f3736d" 242 | version = "1.0.2" 243 | 244 | [[MbedTLS_jll]] 245 | deps = ["Libdl", "Pkg"] 246 | git-tree-sha1 = "c83f5a1d038f034ad0549f9ee4d5fac3fb429e33" 247 | uuid = "c8ffd9c3-330d-5841-b78e-0817d7145fa1" 248 | version = "2.16.0+2" 249 | 250 | [[Memento]] 251 | deps = ["Dates", "Distributed", "JSON", "Serialization", "Sockets", "Syslogs", "Test", "TimeZones", "UUIDs"] 252 | git-tree-sha1 = "31921ad09307dd9ad693da3213a218152fadb8f2" 253 | uuid = "f28f55f0-a522-5efc-85c2-fe41dfb9b2d9" 254 | version = "1.1.0" 255 | 256 | [[Missings]] 257 | deps = ["DataAPI"] 258 | git-tree-sha1 = "de0a5ce9e5289f27df672ffabef4d1e5861247d5" 259 | uuid = "e1d29d7a-bbdc-5cf2-9ac0-f12de2c33e28" 260 | version = "0.4.3" 261 | 262 | [[Mmap]] 263 | uuid = "a63ad114-7e13-5084-954f-fe012c677804" 264 | 265 | [[Mocking]] 266 | deps = ["ExprTools"] 267 | git-tree-sha1 = "916b850daad0d46b8c71f65f719c49957e9513ed" 268 | uuid = "78c3b35d-d492-501b-9361-3d52fe80e533" 269 | version = "0.7.1" 270 | 271 | [[OpenBLAS_jll]] 272 | deps = ["CompilerSupportLibraries_jll", "Libdl", "Pkg"] 273 | git-tree-sha1 = "1887096f6897306a4662f7c5af936da7d5d1a062" 274 | uuid = "4536629a-c528-5b80-bd46-f80d51c5b363" 275 | version = "0.3.9+4" 276 | 277 | [[OpenSpecFun_jll]] 278 | deps = ["CompilerSupportLibraries_jll", "Libdl", "Pkg"] 279 | git-tree-sha1 = "d51c416559217d974a1113522d5919235ae67a87" 280 | uuid = "efe28fd5-8261-553b-a9e1-b2916fc3738e" 281 | version = "0.5.3+3" 282 | 283 | [[OrderedCollections]] 284 | git-tree-sha1 = "12ce190210d278e12644bcadf5b21cbdcf225cd3" 285 | uuid = "bac558e1-5e72-5ebc-8fee-abe8a469f55d" 286 | version = "1.2.0" 287 | 288 | [[PDMats]] 289 | deps = ["Arpack", "LinearAlgebra", "SparseArrays", "SuiteSparse", "Test"] 290 | git-tree-sha1 = "2fc6f50ddd959e462f0a2dbc802ddf2a539c6e35" 291 | uuid = "90014a1f-27ba-587c-ab20-58faa44d9150" 292 | version = "0.9.12" 293 | 294 | [[Parameters]] 295 | deps = ["OrderedCollections", "UnPack"] 296 | git-tree-sha1 = "38b2e970043613c187bd56a995fe2e551821eb4a" 297 | uuid = "d96e819e-fc66-5662-9728-84c9c7592b0a" 298 | version = "0.12.1" 299 | 300 | [[Parsers]] 301 | deps = ["Dates", "Test"] 302 | git-tree-sha1 = "eb3e09940c0d7ae01b01d9291ebad7b081c844d3" 303 | uuid = "69de0a69-1ddd-5017-9359-2bf0b02dc9f0" 304 | version = "1.0.5" 305 | 306 | [[Pkg]] 307 | deps = ["Dates", "LibGit2", "Libdl", "Logging", "Markdown", "Printf", "REPL", "Random", "SHA", "UUIDs"] 308 | uuid = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f" 309 | 310 | [[PooledArrays]] 311 | deps = ["DataAPI"] 312 | git-tree-sha1 = "b1333d4eced1826e15adbdf01a4ecaccca9d353c" 313 | uuid = "2dfb63ee-cc39-5dd5-95bd-886bf059d720" 314 | version = "0.5.3" 315 | 316 | [[PrettyTables]] 317 | deps = ["Crayons", "Formatting", "Parameters", "Reexport", "Tables"] 318 | git-tree-sha1 = "ac3cecc7254adfffb8fdbd2c83eaa247e14b02da" 319 | uuid = "08abe8d2-0d0c-5749-adfa-8a2ac140af0d" 320 | version = "0.9.0" 321 | 322 | [[Printf]] 323 | deps = ["Unicode"] 324 | uuid = "de0858da-6303-5e67-8744-51eddeeeb8d7" 325 | 326 | [[ProgressMeter]] 327 | deps = ["Distributed", "Printf"] 328 | git-tree-sha1 = "3e1784c27847bba115815d4d4e668b99873985e5" 329 | uuid = "92933f4c-e287-5a05-a399-4b506db050ca" 330 | version = 
"1.3.1" 331 | 332 | [[QuadGK]] 333 | deps = ["DataStructures", "LinearAlgebra"] 334 | git-tree-sha1 = "dc84e810393cfc6294248c9032a9cdacc14a3db4" 335 | uuid = "1fd47b50-473d-5c70-9696-f719f8f3bcdc" 336 | version = "2.3.1" 337 | 338 | [[REPL]] 339 | deps = ["InteractiveUtils", "Markdown", "Sockets"] 340 | uuid = "3fa0cd96-eef1-5676-8a61-b3b8758bbffb" 341 | 342 | [[Random]] 343 | deps = ["Serialization"] 344 | uuid = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c" 345 | 346 | [[RecipesBase]] 347 | git-tree-sha1 = "54f8ceb165a0f6d083f0d12cb4996f5367c6edbc" 348 | uuid = "3cdcf5f2-1ef4-517c-9805-6587b60abb01" 349 | version = "1.0.1" 350 | 351 | [[Reexport]] 352 | deps = ["Pkg"] 353 | git-tree-sha1 = "7b1d07f411bc8ddb7977ec7f377b97b158514fe0" 354 | uuid = "189a3867-3050-52da-a836-e630ba90ab69" 355 | version = "0.2.0" 356 | 357 | [[Rmath]] 358 | deps = ["Random", "Rmath_jll"] 359 | git-tree-sha1 = "86c5647b565873641538d8f812c04e4c9dbeb370" 360 | uuid = "79098fc4-a85e-5d69-aa6a-4863f24498fa" 361 | version = "0.6.1" 362 | 363 | [[Rmath_jll]] 364 | deps = ["Libdl", "Pkg"] 365 | git-tree-sha1 = "d76185aa1f421306dec73c057aa384bad74188f0" 366 | uuid = "f50d1b31-88e8-58de-be2c-1cc44531875f" 367 | version = "0.2.2+1" 368 | 369 | [[SHA]] 370 | uuid = "ea8e919c-243c-51af-8825-aaa63cd721ce" 371 | 372 | [[ScientificTypes]] 373 | git-tree-sha1 = "1a9f881c800ea009fb7f8b5274f04e4e8a5faef8" 374 | uuid = "321657f4-b219-11e9-178b-2701a2544e81" 375 | version = "0.8.0" 376 | 377 | [[Serialization]] 378 | uuid = "9e88b42a-f829-5b0c-bbe9-9e923198166b" 379 | 380 | [[SharedArrays]] 381 | deps = ["Distributed", "Mmap", "Random", "Serialization"] 382 | uuid = "1a1011a3-84de-559e-8e89-a11a2f7dc383" 383 | 384 | [[Sockets]] 385 | uuid = "6462fe0b-24de-5631-8697-dd941f90decc" 386 | 387 | [[SortingAlgorithms]] 388 | deps = ["DataStructures", "Random", "Test"] 389 | git-tree-sha1 = "03f5898c9959f8115e30bc7226ada7d0df554ddd" 390 | uuid = "a2af1166-a08f-5f64-846c-94a0d3cef48c" 391 | version = "0.3.1" 392 | 393 | [[SparseArrays]] 394 | deps = ["LinearAlgebra", "Random"] 395 | uuid = "2f01184e-e22b-5df5-ae63-d93ebab69eaf" 396 | 397 | [[SpecialFunctions]] 398 | deps = ["OpenSpecFun_jll"] 399 | git-tree-sha1 = "d8d8b8a9f4119829410ecd706da4cc8594a1e020" 400 | uuid = "276daf66-3868-5448-9aa4-cd146d93841b" 401 | version = "0.10.3" 402 | 403 | [[Statistics]] 404 | deps = ["LinearAlgebra", "SparseArrays"] 405 | uuid = "10745b16-79ce-11e8-11f9-7d13ad32a3b2" 406 | 407 | [[StatsBase]] 408 | deps = ["DataAPI", "DataStructures", "LinearAlgebra", "Missings", "Printf", "Random", "SortingAlgorithms", "SparseArrays", "Statistics"] 409 | git-tree-sha1 = "a6102b1f364befdb05746f386b67c6b7e3262c45" 410 | uuid = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91" 411 | version = "0.33.0" 412 | 413 | [[StatsFuns]] 414 | deps = ["Rmath", "SpecialFunctions"] 415 | git-tree-sha1 = "04a5a8e6ab87966b43f247920eab053fd5fdc925" 416 | uuid = "4c63d2b9-4356-54db-8cca-17b64c39e42c" 417 | version = "0.9.5" 418 | 419 | [[SuiteSparse]] 420 | deps = ["Libdl", "LinearAlgebra", "Serialization", "SparseArrays"] 421 | uuid = "4607b0f0-06f3-5cda-b6b1-a6196a1729e9" 422 | 423 | [[Syslogs]] 424 | deps = ["Printf", "Sockets"] 425 | git-tree-sha1 = "46badfcc7c6e74535cc7d833a91f4ac4f805f86d" 426 | uuid = "cea106d9-e007-5e6c-ad93-58fe2094e9c4" 427 | version = "0.3.0" 428 | 429 | [[TableTraits]] 430 | deps = ["IteratorInterfaceExtensions"] 431 | git-tree-sha1 = "b1ad568ba658d8cbb3b892ed5380a6f3e781a81e" 432 | uuid = "3783bdb8-4a98-5b6b-af9a-565f29a5fe9c" 433 | version = "1.0.0" 434 | 435 | [[Tables]] 
436 | deps = ["DataAPI", "DataValueInterfaces", "IteratorInterfaceExtensions", "LinearAlgebra", "TableTraits", "Test"] 437 | git-tree-sha1 = "c45dcc27331febabc20d86cb3974ef095257dcf3" 438 | uuid = "bd369af6-aec1-5ad0-b16a-f7cc5008161c" 439 | version = "1.0.4" 440 | 441 | [[Test]] 442 | deps = ["Distributed", "InteractiveUtils", "Logging", "Random"] 443 | uuid = "8dfed614-e22c-5e08-85e1-65c5234f0b40" 444 | 445 | [[TimeZones]] 446 | deps = ["Dates", "EzXML", "Mocking", "Printf", "RecipesBase", "Serialization", "Unicode"] 447 | git-tree-sha1 = "db7bc2051d4c2e5f336409224df81485c00de6cb" 448 | uuid = "f269a46b-ccf7-5d73-abea-4c690281aa53" 449 | version = "1.2.0" 450 | 451 | [[TranscodingStreams]] 452 | deps = ["Random", "Test"] 453 | git-tree-sha1 = "7c53c35547de1c5b9d46a4797cf6d8253807108c" 454 | uuid = "3bb67fe8-82b1-5028-8e26-92a6c54297fa" 455 | version = "0.9.5" 456 | 457 | [[UUIDs]] 458 | deps = ["Random", "SHA"] 459 | uuid = "cf7118a7-6976-5b1a-9a39-7adc72f591a4" 460 | 461 | [[UnPack]] 462 | git-tree-sha1 = "d4bfa022cd30df012700cf380af2141961bb3bfb" 463 | uuid = "3a884ed6-31ef-47d7-9d2a-63182c4928ed" 464 | version = "1.0.1" 465 | 466 | [[Unicode]] 467 | uuid = "4ec0a83e-493e-50e2-b9ac-8f72acf5a8f5" 468 | 469 | [[WeakRefStrings]] 470 | deps = ["DataAPI", "Random", "Test"] 471 | git-tree-sha1 = "28807f85197eaad3cbd2330386fac1dcb9e7e11d" 472 | uuid = "ea10d353-3f73-51f8-a26c-33c1cb351aa5" 473 | version = "0.6.2" 474 | 475 | [[XML2_jll]] 476 | deps = ["Libdl", "Libiconv_jll", "Pkg", "Zlib_jll"] 477 | git-tree-sha1 = "987c02a43fa10a491a5f0f7c46a6d3559ed6a8e2" 478 | uuid = "02c8fc9c-b97f-50b9-bbe4-9be30ff0a78a" 479 | version = "2.9.9+4" 480 | 481 | [[Zlib_jll]] 482 | deps = ["Libdl", "Pkg"] 483 | git-tree-sha1 = "a2e0d558f6031002e380a90613b199e37a8565bf" 484 | uuid = "83775a58-1f1d-513f-b197-d71354ab007a" 485 | version = "1.2.11+10" 486 | -------------------------------------------------------------------------------- /data/src/convert_ames/Project.toml: -------------------------------------------------------------------------------- 1 | [deps] 2 | CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b" 3 | CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597" 4 | DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" 5 | MLJBase = "a7f614a8-145f-11e9-1d2a-a57a1082229d" 6 | -------------------------------------------------------------------------------- /data/src/generate_horse.jl: -------------------------------------------------------------------------------- 1 | using Pkg; 2 | Pkg.activate(@__DIR__) 3 | Pkg.instantiate() 4 | 5 | using MLJ 6 | 7 | using HTTP 8 | using CSV 9 | import DataFrames: DataFrame, select!, Not 10 | req1 = HTTP.get("http://archive.ics.uci.edu/ml/machine-learning-databases/horse-colic/horse-colic.data") 11 | req2 = HTTP.get("http://archive.ics.uci.edu/ml/machine-learning-databases/horse-colic/horse-colic.test") 12 | header = ["surgery", "age", "hospital_number", 13 | "rectal_temperature", "pulse", 14 | "respiratory_rate", "temperature_extremities", 15 | "peripheral_pulse", "mucous_membranes", 16 | "capillary_refill_time", "pain", 17 | "peristalsis", "abdominal_distension", 18 | "nasogastric_tube", "nasogastric_reflux", 19 | "nasogastric_reflux_ph", "feces", "abdomen", 20 | "packed_cell_volume", "total_protein", 21 | "abdomcentesis_appearance", "abdomcentesis_total_protein", 22 | "outcome", "surgical_lesion", "lesion_1", "lesion_2", "lesion_3", 23 | "cp_data"] 24 | csv_opts = (header=header, delim=' ', missingstring="?", 25 | ignorerepeated=true) 26 | data_train = 
CSV.read(req1.body; csv_opts...) 27 | data_test = CSV.read(req2.body; csv_opts...) 28 | @show size(data_train) 29 | @show size(data_test) 30 | 31 | unwanted = [:lesion_1, :lesion_2, :lesion_3] 32 | data = vcat(data_train, data_test) 33 | select!(data, Not(unwanted)); 34 | 35 | train = 1:nrows(data_train) 36 | test = last(train) .+ (1:nrows(data_test)); 37 | 38 | datac = coerce(data, autotype(data)); 39 | 40 | sch0 = schema(data) 41 | sch = schema(datac) 42 | 43 | old_scitype_given_name = Dict( 44 | sch0.names[j] => sch0.scitypes[j] for j in eachindex(sch0.names)) 45 | 46 | length(unique(datac.hospital_number)) 47 | 48 | datac = select!(datac, Not(:hospital_number)); 49 | 50 | datac = coerce(datac, autotype(datac, rules=(:discrete_to_continuous,))); 51 | 52 | missing_outcome = ismissing.(datac.outcome) 53 | idx_missing_outcome = missing_outcome |> findall 54 | 55 | train = setdiff!(train |> collect, idx_missing_outcome) 56 | test = setdiff!(test |> collect, idx_missing_outcome) 57 | datac = datac[.!missing_outcome, :]; 58 | 59 | for name in names(datac) 60 | col = datac[:, name] 61 | ratio_missing = sum(ismissing.(col)) / nrows(datac) * 100 62 | println(rpad(name, 30), round(ratio_missing, sigdigits=3)) 63 | end 64 | 65 | unwanted = [:peripheral_pulse, :nasogastric_tube, :nasogastric_reflux, 66 | :nasogastric_reflux_ph, :feces, :abdomen, :abdomcentesis_appearance, :abdomcentesis_total_protein] 67 | select!(datac, Not(unwanted)); 68 | 69 | @load FillImputer 70 | filler = machine(FillImputer(), datac) 71 | fit!(filler) 72 | datac = transform(filler, datac) 73 | 74 | cat_fields = filter(schema(datac).names) do field 75 | datac[:, field] isa CategoricalArray 76 | end 77 | 78 | for f in cat_fields 79 | datac[!, f] = get.(datac[:, f]) 80 | end 81 | 82 | datac.pulse = coerce(datac.pulse, Count) 83 | datac.respiratory_rate = coerce(datac.respiratory_rate, Count) 84 | 85 | sch1 = schema(datac) 86 | 87 | CSV.write("horse.csv", datac) 88 | -------------------------------------------------------------------------------- /data/src/get_king_county.jl: -------------------------------------------------------------------------------- 1 | using Pkg; 2 | Pkg.activate(@__DIR__) 3 | Pkg.instantiate() 4 | 5 | using MLJ 6 | using PrettyPrinting 7 | import DataFrames: DataFrame, select!, Not, describe 8 | import Statistics 9 | using Dates 10 | using UrlDownload 11 | using CSV 12 | 13 | 14 | df = DataFrame(urldownload("https://raw.githubusercontent.com/tlienart/DataScienceTutorialsData.jl/master/data/kc_housing.csv", true)) 15 | describe(df) 16 | 17 | df.is_renovated = df.yr_renovated .== 0 18 | 19 | select!(df, Not([:id, :date, :yr_renovated])) 20 | CSV.write(joinpath(@__DIR__, "..", "house.csv"), df) 21 | -------------------------------------------------------------------------------- /environment.yml: -------------------------------------------------------------------------------- 1 | dependencies: 2 | - matplotlib 3 | - numpy 4 | - pip 5 | - pip: 6 | - julia 7 | -------------------------------------------------------------------------------- /exercise_6ci.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/exercise_6ci.png -------------------------------------------------------------------------------- /exercise_7c.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/exercise_7c.png -------------------------------------------------------------------------------- /exercise_7c_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/exercise_7c_2.png -------------------------------------------------------------------------------- /exercise_7c_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/exercise_7c_3.png -------------------------------------------------------------------------------- /exercise_8c.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/exercise_8c.png -------------------------------------------------------------------------------- /gamma_sampler.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/gamma_sampler.png -------------------------------------------------------------------------------- /iris_learning_curve.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/iris_learning_curve.png -------------------------------------------------------------------------------- /learning_curve.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/learning_curve.png -------------------------------------------------------------------------------- /learning_curve2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/learning_curve2.png -------------------------------------------------------------------------------- /methods.md: -------------------------------------------------------------------------------- 1 | # List of methods introduced in the tutorials 2 | 3 | ## Part 1 4 | 5 | `scitype(object)`, `coerce(vector, SomeSciType)`, 6 | `levels(categorical_vector)`, `levels!(categorical_vector)`, 7 | `schema(table)`, `MLJ.table(matrix)`, `autotype(table)`, 8 | `coerce(table, ...)`, `coerce!(dataframe, ...)`, `elscitype(vector)` 9 | 10 | ## Part 2 11 | 12 | `OpenML.load(id)`, `unpack(table, ...)`, `models()`, `models(filter)`, 13 | `models(string)`, `@load ModelType pkg=PackageName`, `info(model)`, 14 | `machine(model, X, y)`, `partition(row_indices, ...)`, `fit!(mach, 15 | rows=...)`, `predict(mach, rows=...)`, `predict(mach, Xnew)`, 16 | `fitted_params(mach)`, `report(mach)`, `MLJ.save`, 17 | `machine(filename)`, `machine(filename, X, y)`, 18 | `pdf(single_prediction, class)`, `predict_mode(mach, Xnew)`, 19 | `predict_mean(mach, Xnew)`, `predict_median(mach, Xnew)`, 20 | `measures()`, `evaluate!`, `range(model, :(param.nested_param), ...)`, 21 | `learning_curve(mach, ...)` 22 | 23 | ## Part 3 24 | 25 | `Standardizer`, `transform`, `inverse_transform`, 
`ContinuousEncoder`, `@pipeline` 26 | 27 | ## Part 4 28 | 29 | `iterator(r, resolution)`, `sampler(r, distribution)`, `RandomSearch`, 30 | `TunedModel` 31 | 32 | ## Part 5 33 | 34 | `source(data)`, `source()`, `Probabilistic()`, `Deterministic()`, 35 | `Unsupervised()`, `@from_network` 36 | -------------------------------------------------------------------------------- /outline.md: -------------------------------------------------------------------------------- 1 | # Machine Learning in Julia using MLJ 2 | 3 | ## Housekeeping 4 | 5 | ### Getting help during the workshop 6 | 7 | ### Resources to help you 8 | 9 | From the MLJ ecosystem: 10 | 11 | - The docs 12 | 13 | - DataScienceTutorials 14 | 15 | From elsewhere: 16 | 17 | - Julia specific: 18 | 19 | - ScikitLearn 20 | 21 | - General: 22 | 23 | - 24 | 25 | - 26 | 27 | ## Programme 28 | 29 | - An overview of machine learning and MLJ (lecture) 30 | 31 | - Workshop scope 32 | 33 | - Installing MLJ and the tutorials 34 | 35 | - Part 1. Data representations 36 | 37 | Break 38 | 39 | - Part 2: Selecting, training and evaluating models 40 | 41 | - Part 3: Tuning model hyper-parameters 42 | 43 | Break 44 | 45 | - Part 4: Model pipelines 46 | 47 | - Part 5: Advanced features (lecture) 48 | 49 | Each Parts 2-6 begins with demonstration on the "teacher's dataset", with 50 | time for participants to carry out a similar exercise on a "student's 51 | datasets" and interact with the instructors in the chat forum. 52 | 53 | 54 | ## What this workshop won't cover 55 | 56 | This workshop assumes at some experience with data and, ideally, some 57 | understanding of machine learning principles. 58 | 59 | Lightly covered or not covered 60 | 61 | - data wrangling and data cleaning 62 | 63 | - feature engineering 64 | 65 | - options for parallelism or using GPU's 66 | 67 | 68 | ## Part 1: Data ingestion and pre-processing 69 | 70 | ### What is machine learning? 71 | 72 | Supervised learning - show with examples and pictures what the basic 73 | idea and processes are: fitting, evaluating, tuning. 74 | 75 | Unsupervised learning - no labels; main use-case is dimension reduction; explain PCA with a picture 76 | 77 | Re-enforcement learning - out of scope 78 | 79 | 80 | ### Different machine learning models and paradigms 81 | 82 | - machine learning ≠ deep learning 83 | 84 | - there are hundreds of machine learning models. All of the following 85 | are in common use: 86 | 87 | - linear models, especially Ridge regression, elastic net, pca (unsupervised) 88 | 89 | - Naive Bayes 90 | 91 | - K-nearest neighbours 92 | 93 | - K-means clustering (unsupervised) 94 | 95 | - random forests 96 | 97 | - gradient boosted tree models (e.g., XGBoost) 98 | 99 | - support vector machines 100 | 101 | - probablistic programming models 102 | 103 | - neural networks 104 | 105 | 106 | ### What is a (good) machine learning toolbox? 
107 | 108 | - provides uniform interface to zoo of models scattered everywhere 109 | (different packages, different languages) 110 | 111 | - provides a searchable model registry 112 | 113 | - meta-algorithms: 114 | 115 | - evaluating performance using different performance measures (aka 116 | metrics, scores, loss functions) 117 | 118 | - tuning (optimizing hyperparmaters) 119 | 120 | - facilitates model *composition* (e.g., pipelines) 121 | 122 | - customizable (getting under the hood) 123 | 124 | ### MLJ features 125 | 126 | 127 | ### A short tour of MLJ 128 | 129 | 130 | ## Part 1: Data ingestion and pre-processing 131 | 132 | ### Scientific types and type coercion 133 | 134 | - inspecting scitypes and coercing them 135 | 136 | - working with categorical data 137 | 138 | 139 | ### Tabular data 140 | 141 | - Lots of things can be considered as tabular data; examples: native 142 | tables, matrices, DataFrames, CSV files 143 | 144 | - Lots of ways to grab data; examples: 145 | 146 | - load a canned dataset 147 | - load from local file (e.g., csv) 148 | - create a synthetic data set 149 | - use OpenML 150 | - use RDatasets 151 | - use UrlDownload (or is there something better?) 152 | 153 | ### Demo 154 | 155 | ### Exercise 156 | 157 | ## 158 | 159 | -------------------------------------------------------------------------------- /setup.jl: -------------------------------------------------------------------------------- 1 | # Setup: 2 | 3 | isbinder() = "jovyan" in split(pwd(), "/") 4 | 5 | const REPO = "https://github.com/ablaom/MachineLearningInJulia2020" 6 | using Pkg 7 | 8 | if !isbinder() 9 | Pkg.activate(DIR) 10 | Pkg.instantiate() 11 | using CategoricalArrays 12 | import MLJLinearModels 13 | import DataFrames 14 | import CSV 15 | import MLJDecisionTreeInterface 16 | using MLJ 17 | import MLJClusteringInterface 18 | import MLJMultivariateStatsInterface 19 | import MLJScikitLearnInterface 20 | import MLJLinearModels 21 | import MLJMultivariateStatsInterface 22 | import MLJFlux 23 | import Plots 24 | else 25 | @info "Skipping package instantiation as binder notebook. " 26 | end 27 | @info "Done loading" 28 | -------------------------------------------------------------------------------- /stacking.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/stacking.png -------------------------------------------------------------------------------- /tuning.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/tuning.png -------------------------------------------------------------------------------- /tutorials.jl: -------------------------------------------------------------------------------- 1 | # # Machine Learning in Julia, JuliaCon2020 2 | 3 | # A workshop introducing the machine learning toolbox 4 | # [MLJ](https://alan-turing-institute.github.io/MLJ.jl/stable/). 5 | 6 | 7 | # ### Set-up 8 | 9 | # The following instantiates a package environment and pre-loads some 10 | # packages, to avoid delays later on. 11 | 12 | # The package environment has been created using **Julia 1.6** and may not 13 | # instantiate properly for other Julia versions. 
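# As an aside (a small sketch added here, not part of the original workshop
# script): if you want an explicit early warning when running under a
# different Julia version, a guard along these lines can be added before
# instantiating the environment:

VERSION >= v"1.6" && VERSION < v"1.7" ||
    @warn "This environment was built with Julia 1.6; " *
          "instantiation may fail under Julia $VERSION."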
14 | 15 | VERSION 16 | 17 | #- 18 | 19 | DIR = @__DIR__ 20 | include(joinpath(DIR, "setup.jl")) 21 | color_off() 22 | 23 | # ## General resources 24 | 25 | # - [List of methods introduced in this tutorial](methods.md) 26 | # - [MLJ Cheatsheet](https://alan-turing-institute.github.io/MLJ.jl/dev/mlj_cheatsheet/) 27 | # - [Common MLJ Workflows](https://alan-turing-institute.github.io/MLJ.jl/dev/common_mlj_workflows/) 28 | # - [MLJ manual](https://alan-turing-institute.github.io/MLJ.jl/dev/) 29 | # - [Data Science Tutorials in Julia](https://juliaai.github.io/DataScienceTutorials.jl/) 30 | 31 | 32 | # ## Contents 33 | 34 | # ### Basic 35 | 36 | # - [Part 1 - Data Representation](#part-1-data-representation) 37 | # - [Part 2 - Selecting, Training and Evaluating Models](#part-2-selecting-training-and-evaluating-models) 38 | # - [Part 3 - Transformers and Pipelines](#part-3-transformers-and-pipelines) 39 | # - [Part 4 - Tuning Hyper-parameters](#part-4-tuning-hyper-parameters) 40 | # - [Part 5 - Advanced model composition](#part-5-advanced-model-composition) 41 | # - [Solutions to Exercises](#solutions-to-exercises) 42 | 43 | 44 | # 45 | 46 | 47 | # ## Part 1 - Data Representation 48 | 49 | # > **Goals:** 50 | # > 1. Learn how MLJ specifies it's data requirements using "scientific" types 51 | # > 2. Understand the options for representing tabular data 52 | # > 3. Learn how to inspect and fix the representation of data to meet MLJ requirements 53 | 54 | 55 | # ### Scientific types 56 | 57 | # To help you focus on the intended *purpose* or *interpretation* of 58 | # data, MLJ models specify data requirements using *scientific types*, 59 | # instead of machine types. An example of a scientific type is 60 | # `OrderedFactor`. The other basic "scalar" scientific types are 61 | # illustrated below: 62 | 63 | # ![](assets/scitypes.png) 64 | 65 | # A scientific type is an ordinary Julia type (so it can be used for 66 | # method dispatch, for example) but it usually has no instances. The 67 | # `scitype` function is used to articulate MLJ's convention about how 68 | # different machine types will be interpreted by MLJ models: 69 | 70 | using MLJ 71 | scitype(3.141) 72 | 73 | #- 74 | 75 | time = [2.3, 4.5, 4.2, 1.8, 7.1] 76 | scitype(time) 77 | 78 | # To fix data which MLJ is interpreting incorrectly, we use the 79 | # `coerce` method: 80 | 81 | height = [185, 153, 163, 114, 180] 82 | scitype(height) 83 | 84 | #- 85 | 86 | height = coerce(height, Continuous) 87 | 88 | # Here's an example of data we would want interpreted as 89 | # `OrderedFactor` but isn't: 90 | 91 | exam_mark = ["rotten", "great", "bla", missing, "great"] 92 | scitype(exam_mark) 93 | 94 | #- 95 | 96 | exam_mark = coerce(exam_mark, OrderedFactor) 97 | 98 | #- 99 | 100 | levels(exam_mark) 101 | 102 | # Use `levels!` to put the classes in the right order: 103 | 104 | levels!(exam_mark, ["rotten", "bla", "great"]) 105 | exam_mark[1] < exam_mark[2] 106 | 107 | # When sub-sampling, no levels are lost: 108 | 109 | levels(exam_mark[1:2]) 110 | 111 | # **Note on binary data.** There is no separate scientific type for 112 | # binary data. Binary data is `OrderedFactor{2}` or 113 | # `Multiclass{2}`. If a binary measure like `truepositive` is a 114 | # applied to `OrderedFactor{2}` then the "positive" class is assumed 115 | # to appear *second* in the ordering. If such a measure is applied to 116 | # `Multiclass{2}` data, a warning is issued. 
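# For instance (a small illustration added here, not in the original
# script), a two-class vector coerced to `OrderedFactor` has element
# scitype `OrderedFactor{2}`, and the level listed *second* by `levels`
# is the one treated as "positive":

sick = coerce(["no", "yes", "yes", "no"], OrderedFactor)
levels(sick)     # ["no", "yes"], so "yes" plays the role of "positive"
elscitype(sick)  # OrderedFactor{2}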
A single `OrderedFactor` 117 | # can be coerced to a single `Continuous` variable, for models that 118 | # require this, while a `Multiclass` variable can only be one-hot 119 | # encoded. 120 | 121 | 122 | # ### Two-dimensional data 123 | 124 | # Whenever it makes sense, MLJ Models generally expect two-dimensional 125 | # data to be *tabular*. All the tabular formats implementing the 126 | # [Tables.jl API](https://juliadata.github.io/Tables.jl/stable/) (see 127 | # this 128 | # [list](https://github.com/JuliaData/Tables.jl/blob/master/INTEGRATIONS.md)) 129 | # have a scientific type of `Table` and can be used with such models. 130 | 131 | # Probably the simplest example of a table is the julia native *column 132 | # table*, which is just a named tuple of equal-length vectors: 133 | 134 | column_table = (h=height, e=exam_mark, t=time) 135 | 136 | #- 137 | 138 | scitype(column_table) 139 | 140 | #- 141 | 142 | # Notice the `Table{K}` type parameter `K` encodes the scientific 143 | # types of the columns. (This is useful when comparing table scitypes 144 | # with `<:`). To inspect the individual column scitypes, we use the 145 | # `schema` method instead: 146 | 147 | schema(column_table) 148 | 149 | # Here are five other examples of tables: 150 | 151 | dict_table = Dict(:h => height, :e => exam_mark, :t => time) 152 | schema(dict_table) 153 | 154 | # (To control column order here, instead use `LittleDict` from 155 | # OrderedCollections.jl.) 156 | 157 | row_table = [(a=1, b=3.4), 158 | (a=2, b=4.5), 159 | (a=3, b=5.6)] 160 | schema(row_table) 161 | 162 | #- 163 | 164 | import DataFrames 165 | df = DataFrames.DataFrame(column_table) 166 | 167 | #- 168 | 169 | schema(df) == schema(column_table) 170 | 171 | #- 172 | 173 | using CSV 174 | file = CSV.File(joinpath(DIR, "data", "horse.csv")); 175 | schema(file) # (triggers a file read) 176 | 177 | 178 | # Most MLJ models do not accept matrix in lieu of a table, but you can 179 | # wrap a matrix as a table: 180 | 181 | matrix_table = MLJ.table(rand(2,3)) 182 | schema(matrix_table) 183 | 184 | # The matrix is *not* copied, only wrapped. Some models may perform 185 | # better if one wraps the adjoint of the transpose - see 186 | # [here](https://alan-turing-institute.github.io/MLJ.jl/dev/getting_started/#Observations-correspond-to-rows,-not-columns). 187 | 188 | 189 | # **Manipulating tabular data.** In this workshop we assume 190 | # familiarity with some kind of tabular data container (although it is 191 | # possible, in principle, to carry out the exercises without this.) 192 | # For a quick start introduction to `DataFrames`, see [this 193 | # tutorial](https://juliaai.github.io/DataScienceTutorials.jl/data/dataframe/). 194 | 195 | # ### Fixing scientific types in tabular data 196 | 197 | # To show how we can correct the scientific types of data in tables, 198 | # we introduce a cleaned up version of the UCI Horse Colic Data Set 199 | # (the cleaning work-flow is described 200 | # [here](https://juliaai.github.io/DataScienceTutorials.jl/end-to-end/horse/#dealing_with_missing_values)). 
201 | 202 | using CSV 203 | file = CSV.File(joinpath(DIR, "data", "horse.csv")); 204 | horse = DataFrames.DataFrame(file); # convert to data frame without copying columns 205 | first(horse, 4) 206 | 207 | #- 208 | 209 | # From [the UCI 210 | # docs](http://archive.ics.uci.edu/ml/datasets/Horse+Colic) we can 211 | # surmise how each variable ought to be interpreted (a step in our 212 | # work-flow that cannot reliably be left to the computer): 213 | 214 | # variable | scientific type (interpretation) 215 | # ----------------------------|----------------------------------- 216 | # `:surgery` | Multiclass 217 | # `:age` | Multiclass 218 | # `:rectal_temperature` | Continuous 219 | # `:pulse` | Continuous 220 | # `:respiratory_rate` | Continuous 221 | # `:temperature_extremities` | OrderedFactor 222 | # `:mucous_membranes` | Multiclass 223 | # `:capillary_refill_time` | Multiclass 224 | # `:pain` | OrderedFactor 225 | # `:peristalsis` | OrderedFactor 226 | # `:abdominal_distension` | OrderedFactor 227 | # `:packed_cell_volume` | Continuous 228 | # `:total_protein` | Continuous 229 | # `:outcome` | Multiclass 230 | # `:surgical_lesion` | OrderedFactor 231 | # `:cp_data` | Multiclass 232 | 233 | # Let's see how MLJ will actually interpret the data, as it is 234 | # currently encoded: 235 | 236 | schema(horse) 237 | 238 | # As a first correction step, we can get MLJ to "guess" the 239 | # appropriate fix, using the `autotype` method: 240 | 241 | autotype(horse) 242 | 243 | #- 244 | 245 | # Okay, this is not perfect, but a step in the right direction, which 246 | # we implement like this: 247 | 248 | coerce!(horse, autotype(horse)); 249 | schema(horse) 250 | 251 | # All remaining `Count` data should be `Continuous`: 252 | 253 | coerce!(horse, Count => Continuous); 254 | schema(horse) 255 | 256 | # We'll correct the remaining truant entries manually: 257 | 258 | coerce!(horse, 259 | :surgery => Multiclass, 260 | :age => Multiclass, 261 | :mucous_membranes => Multiclass, 262 | :capillary_refill_time => Multiclass, 263 | :outcome => Multiclass, 264 | :cp_data => Multiclass); 265 | schema(horse) 266 | 267 | 268 | # ### Resources for Part 1 269 | # 270 | # - From the MLJ manual: 271 | # - [A preview of data type specification in 272 | # MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/getting_started/#A-preview-of-data-type-specification-in-MLJ-1) 273 | # - [Data containers and scientific types](https://alan-turing-institute.github.io/MLJ.jl/dev/getting_started/#Data-containers-and-scientific-types-1) 274 | # - [Working with Categorical Data](https://alan-turing-institute.github.io/MLJ.jl/dev/working_with_categorical_data/) 275 | # - [Summary](https://juliaai.github.io/ScientificTypes.jl/dev/#Summary-of-the-default-convention) of the MLJ convention for representing scientific types 276 | # - [ScientificTypes.jl](https://juliaai.github.io/ScientificTypes.jl/dev/) 277 | # - From Data Science Tutorials: 278 | # - [Data interpretation: Scientific Types](https://juliaai.github.io/DataScienceTutorials.jl/data/scitype/) 279 | # - [Horse colic data](https://juliaai.github.io/DataScienceTutorials.jl/end-to-end/horse/) 280 | # - [UCI Horse Colic Data Set](http://archive.ics.uci.edu/ml/datasets/Horse+Colic) 281 | 282 | 283 | # ### Exercises for Part 1 284 | 285 | 286 | # #### Exercise 1 287 | 288 | # Try to guess how each code snippet below will evaluate: 289 | 290 | scitype(42) 291 | 292 | #- 293 | 294 | questions = ["who", "why", "what", "when"] 295 | scitype(questions) 296 | 297 | #- 298 | 299 | 
elscitype(questions) 300 | 301 | #- 302 | 303 | t = (3.141, 42, "how") 304 | scitype(t) 305 | 306 | #- 307 | 308 | A = rand(2, 3) 309 | 310 | # - 311 | 312 | scitype(A) 313 | 314 | #- 315 | 316 | elscitype(A) 317 | 318 | #- 319 | 320 | using SparseArrays 321 | Asparse = sparse(A) 322 | 323 | #- 324 | 325 | scitype(Asparse) 326 | 327 | #- 328 | 329 | using CategoricalArrays 330 | C1 = categorical(A) 331 | 332 | #- 333 | 334 | scitype(C1) 335 | 336 | #- 337 | 338 | elscitype(C1) 339 | 340 | #- 341 | 342 | C2 = categorical(A, ordered=true) 343 | scitype(C2) 344 | 345 | #- 346 | 347 | v = [1, 2, missing, 4] 348 | scitype(v) 349 | 350 | #- 351 | 352 | elscitype(v) 353 | 354 | #- 355 | 356 | scitype(v[1:2]) 357 | 358 | # Can you guess at the general behavior of 359 | # `scitype` with respect to tuples, abstract arrays and missing 360 | # values? The answers are 361 | # [here](https://github.com/juliaai/ScientificTypesBase.jl#2-the-scitype-and-scitype-methods) 362 | # (ignore "Property 1"). 363 | 364 | 365 | # #### Exercise 2 366 | 367 | # Coerce the following vector to make MLJ recognize it as a vector of 368 | # ordered factors (with an appropriate ordering): 369 | 370 | quality = ["good", "poor", "poor", "excellent", missing, "good", "excellent"] 371 | 372 | #- 373 | 374 | 375 | # #### Exercise 3 (fixing scitypes in a table) 376 | 377 | # Fix the scitypes for the [House Prices in King 378 | # County](https://mlr3gallery.mlr-org.com/posts/2020-01-30-house-prices-in-king-county/) 379 | # dataset: 380 | 381 | file = CSV.File(joinpath(DIR, "data", "house.csv")); 382 | house = DataFrames.DataFrame(file); # convert to data frame without copying columns 383 | first(house, 4) 384 | 385 | # (Two features in the original data set have been deemed uninformative 386 | # and dropped, namely `:id` and `:date`. The original feature 387 | # `:yr_renovated` has been replaced by the `Bool` feature `is_renovated`.) 388 | 389 | # 390 | 391 | 392 | # ## Part 2 - Selecting, Training and Evaluating Models 393 | 394 | # > **Goals:** 395 | # > 1. Search MLJ's database of model metadata to identify model candidates for a supervised learning task. 396 | # > 2. Evaluate the performance of a model on a holdout set using basic `fit!`/`predict` work-flow. 397 | # > 3. Inspect the outcomes of training and save these to a file. 398 | # > 3. Evaluate performance using other resampling strategies, such as cross-validation, in one line, using `evaluate!` 399 | # > 4. Plot a "learning curve", to inspect performance as a function of some model hyper-parameter, such as an iteration parameter 400 | 401 | # The "Hello World!" of machine learning is to classify Fisher's 402 | # famous iris data set. This time, we'll grab the data from 403 | # [OpenML](https://www.openml.org): 404 | 405 | OpenML.describe_dataset(61) 406 | 407 | #- 408 | 409 | iris = OpenML.load(61); # a row table 410 | iris = DataFrames.DataFrame(iris); 411 | first(iris, 4) 412 | 413 | # **Main goal.** To build and evaluate models for predicting the 414 | # `:class` variable, given the four remaining measurement variables. 415 | 416 | 417 | # ### Step 1. Inspect and fix scientific types 418 | 419 | schema(iris) 420 | 421 | # Unfortunately, `Missing` is appearing in the element type, despite 422 | # the fact there are no missing values (see this 423 | # [issue](https://github.com/JuliaAI/OpenML.jl/issues/10)). 
To do this 424 | # we have to explicilty tighten the types: 425 | 426 | #- 427 | 428 | coerce!(iris, 429 | Union{Missing,Continuous}=>Continuous, 430 | Union{Missing,Multiclass}=>Multiclass, 431 | tight=true) 432 | schema(iris) 433 | 434 | 435 | # ### Step 2. Split data into input and target parts 436 | 437 | # Here's how we split the data into target and input features, which 438 | # is needed for MLJ supervised models. We randomize the data at the 439 | # same time: 440 | 441 | y, X = unpack(iris, ==(:class), name->true; rng=123); 442 | scitype(y) 443 | 444 | # Here's one way to access the documentation (at the REPL, `?unpack` 445 | # also works): 446 | 447 | @doc unpack #!md 448 | 449 | # #md 450 | 451 | 452 | # ### On searching for a model 453 | 454 | # Here's how to see *all* models (not immediately useful): 455 | 456 | all_models = models() 457 | 458 | # Each entry contains metadata for a model whose defining code is not yet loaded: 459 | 460 | meta = all_models[3] 461 | 462 | #- 463 | 464 | targetscitype = meta.target_scitype 465 | 466 | #- 467 | 468 | scitype(y) <: targetscitype 469 | 470 | # So this model won't do. Let's find all pure julia classifiers: 471 | 472 | filter_julia_classifiers(meta) = 473 | AbstractVector{Finite} <: meta.target_scitype && 474 | meta.is_pure_julia 475 | 476 | models(filter_julia_classifiers) 477 | 478 | # Find all models with "Classifier" in `name` (or `docstring`): 479 | 480 | models("Classifier") 481 | 482 | 483 | # Find all (supervised) models that match my data! 484 | 485 | models(matching(X, y)) 486 | 487 | 488 | 489 | # ### Step 3. Select and instantiate a model 490 | 491 | # To load the code defining a new model type we use the `@load` macro: 492 | 493 | NeuralNetworkClassifier = @load NeuralNetworkClassifier 494 | 495 | # Other ways to load model code are described 496 | # [here](https://alan-turing-institute.github.io/MLJ.jl/dev/loading_model_code/#Loading-Model-Code). 497 | 498 | # We'll instantiate this type with default values for the 499 | # hyperparameters: 500 | 501 | model = NeuralNetworkClassifier() 502 | 503 | #- 504 | 505 | info(model) 506 | 507 | # In MLJ a *model* is just a struct containing hyper-parameters, and 508 | # that's all. A model does not store *learned* parameters. Models are 509 | # mutable: 510 | 511 | model.epochs = 12 512 | 513 | # And all models have a key-word constructor that works once `@load` 514 | # has been performed: 515 | 516 | NeuralNetworkClassifier(epochs=12) == model 517 | 518 | 519 | # ### On fitting, predicting, and inspecting models 520 | 521 | # In MLJ a model and training/validation data are typically bound 522 | # together in a machine: 523 | 524 | mach = machine(model, X, y) 525 | 526 | # A machine stores *learned* parameters, among other things. We'll 527 | # train this machine on 70% of the data and evaluate on a 30% holdout 528 | # set. Let's start by dividing all row indices into `train` and `test` 529 | # subsets: 530 | 531 | train, test = partition(eachindex(y), 0.7) 532 | 533 | # Now we can `fit!`... 534 | 535 | fit!(mach, rows=train, verbosity=2) 536 | 537 | # ... and `predict`: 538 | 539 | yhat = predict(mach, rows=test); # or `predict(mach, Xnew)` 540 | yhat[1:3] 541 | 542 | # We'll have more to say on the form of this prediction shortly. 
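# An aside on the 70:30 split above (a sketch, not from the original
# script): `partition` also accepts keyword arguments, so the split can be
# shuffled reproducibly (and, in recent MLJ versions, stratified by class
# with `stratify=y`). Here this is optional, because the observations were
# already randomized in the earlier `unpack` call. We keep the original
# `train`/`test` defined above; the variables below are for illustration
# only and are not used later:

train_shuffled, test_shuffled =
    partition(eachindex(y), 0.7, shuffle=true, rng=123);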
543 | 544 | # After training, one can inspect the learned parameters: 545 | 546 | fitted_params(mach) 547 | 548 | #- 549 | 550 | # Everything else the user might be interested in is accessed from the 551 | # training *report*: 552 | 553 | report(mach) 554 | 555 | # You save a machine like this: 556 | 557 | MLJ.save("neural_net.jlso", mach) 558 | 559 | # And retrieve it like this: 560 | 561 | mach2 = machine("neural_net.jlso") 562 | yhat = predict(mach2, X); 563 | yhat[1:3] 564 | 565 | # If you want to fit a retrieved model, you will need to bind some data to it: 566 | 567 | mach3 = machine("neural_net.jlso", X, y) 568 | fit!(mach3) 569 | 570 | # Machines remember the last set of hyper-parameters used during fit, 571 | # which, in the case of iterative models, allows for a warm restart of 572 | # computations in the case that only the iteration parameter is 573 | # increased: 574 | 575 | model.epochs = model.epochs + 4 576 | fit!(mach, rows=train, verbosity=2) 577 | 578 | # For this particular model we can also increase `:learning_rate` 579 | # without triggering a cold restart: 580 | 581 | model.epochs = model.epochs + 4 582 | model.optimiser.eta = 10*model.optimiser.eta 583 | fit!(mach, rows=train, verbosity=2) 584 | 585 | # However, change any other parameter and training will restart from 586 | # scratch: 587 | 588 | model.lambda = 0.001 589 | fit!(mach, rows=train, verbosity=2) 590 | 591 | # Iterative models that implement warm-restart for training can be 592 | # controlled externally (eg, using an out-of-sample stopping 593 | # criterion). See 594 | # [here](https://alan-turing-institute.github.io/MLJ.jl/dev/controlling_iterative_models/) 595 | # for details. 596 | 597 | 598 | # Let's train silently for a total of 50 epochs, and look at a 599 | # prediction: 600 | 601 | model.epochs = 50 602 | fit!(mach, rows=train) 603 | yhat = predict(mach, X[test,:]); # or predict(mach, rows=test) 604 | yhat[1] 605 | 606 | # What's going on here? 607 | 608 | info(model).prediction_type 609 | 610 | # **Important**: 611 | # - In MLJ, a model that can predict probabilities (and not just point values) will do so by default. 612 | # - For most probabilistic predictors, the predicted object is a `Distributions.Distribution` object, supporting the `Distributions.jl` [API](https://juliastats.org/Distributions.jl/latest/extends/#Create-a-Distribution-1) for such objects. In particular, the methods `rand`, `pdf`, `logpdf`, `mode`, `median` and `mean` will apply, where appropriate. 613 | 614 | # So, to obtain the probability of "Iris-virginica" in the first test 615 | # prediction, we do 616 | 617 | pdf(yhat[1], "Iris-virginica") 618 | 619 | # To get the most likely observation, we do 620 | 621 | mode(yhat[1]) 622 | 623 | # These can be broadcast over multiple predictions in the usual way: 624 | 625 | broadcast(pdf, yhat[1:4], "Iris-versicolor") 626 | 627 | #- 628 | 629 | mode.(yhat[1:4]) 630 | 631 | # Or, alternatively, you can use the `predict_mode` operation instead 632 | # of `predict`: 633 | 634 | predict_mode(mach, X[test,:])[1:4] # or predict_mode(mach, rows=test)[1:4] 635 | 636 | # For a more conventional matrix of probabilities you can do this: 637 | 638 | L = levels(y) 639 | pdf(yhat, L)[1:4, :] 640 | 641 | # However, in a typical MLJ work-flow, this is not as useful as you 642 | # might imagine. 
In particular, all probabilistic performance measures 643 | # in MLJ expect distribution objects in their first slot: 644 | 645 | cross_entropy(yhat, y[test]) |> mean 646 | 647 | # To apply a deterministic measure, we first need to obtain point-estimates: 648 | 649 | misclassification_rate(mode.(yhat), y[test]) 650 | 651 | # We note in passing that there is also a search tool for measures 652 | # analogous to `models`: 653 | 654 | measures() 655 | 656 | 657 | # ### Step 4. Evaluate the model performance 658 | 659 | # Naturally, MLJ provides boilerplate code for carrying out a model 660 | # evaluation with a lot less fuss. Let's repeat the performance 661 | # evaluation above and add an extra measure, `brier_score`: 662 | 663 | evaluate!(mach, resampling=Holdout(fraction_train=0.7), 664 | measures=[cross_entropy, brier_score]) 665 | 666 | # Or applying cross-validation instead: 667 | 668 | evaluate!(mach, resampling=CV(nfolds=6), 669 | measures=[cross_entropy, brier_score]) 670 | 671 | # Or, Monte Carlo cross-validation (cross-validation repeated 672 | # randomized folds) 673 | 674 | e = evaluate!(mach, resampling=CV(nfolds=6, rng=123), 675 | repeats=3, 676 | measures=[cross_entropy, brier_score]) 677 | 678 | # One can access the following properties of the output `e` of an 679 | # evaluation: `measure`, `measurement`, `per_fold` (measurement for 680 | # each fold) and `per_observation` (measurement per observation, if 681 | # reported). 682 | 683 | # We finally note that you can restrict the rows of observations from 684 | # which train and test folds are drawn, by specifying `rows=...`. For 685 | # example, imagining the last 30% of target observations are `missing` 686 | # you might have a work-flow like this: 687 | 688 | train, test = partition(eachindex(y), 0.7) 689 | mach = machine(model, X, y) 690 | evaluate!(mach, resampling=CV(nfolds=6), 691 | measures=[cross_entropy, brier_score], 692 | rows=train) # cv estimate, resampling from `train` 693 | fit!(mach, rows=train) # re-train using all of `train` observations 694 | predict(mach, rows=test); # and predict missing targets 695 | 696 | 697 | # ### On learning curves 698 | 699 | # Since our model is an iterative one, we might want to inspect the 700 | # out-of-sample performance as a function of the iteration 701 | # parameter. For this we can use the `learning_curve` function (which, 702 | # incidentally can be applied to any model hyper-parameter). This 703 | # starts by defining a one-dimensional range object for the parameter 704 | # (more on this when we discuss tuning in Part 4): 705 | 706 | r = range(model, :epochs, lower=1, upper=50, scale=:log) 707 | 708 | #- 709 | 710 | curve = learning_curve(mach, 711 | range=r, 712 | resampling=Holdout(fraction_train=0.7), # (default) 713 | measure=cross_entropy) 714 | 715 | using Plots 716 | gr(size=(490,300)) 717 | plt=plot(curve.parameter_values, curve.measurements) 718 | xlabel!(plt, "epochs") 719 | ylabel!(plt, "cross entropy on holdout set") 720 | savefig("learning_curve.png") 721 | plt #!md 722 | # ![](learning_curve.png) #md 723 | 724 | # We will return to learning curves when we look at tuning in Part 4. 
725 | 726 | 727 | # ### Resources for Part 2 728 | 729 | # - From the MLJ manual: 730 | # - [Getting Started](https://alan-turing-institute.github.io/MLJ.jl/dev/getting_started/) 731 | # - [Model Search](https://alan-turing-institute.github.io/MLJ.jl/dev/model_search/) 732 | # - [Evaluating Performance](https://alan-turing-institute.github.io/MLJ.jl/dev/evaluating_model_performance/) (using `evaluate!`) 733 | # - [Learning Curves](https://alan-turing-institute.github.io/MLJ.jl/dev/learning_curves/) 734 | # - [Performance Measures](https://alan-turing-institute.github.io/MLJ.jl/dev/performance_measures/) (loss functions, scores, etc) 735 | # - From Data Science Tutorials: 736 | # - [Choosing and evaluating a model](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/choosing-a-model/) 737 | # - [Fit, predict, transform](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/fit-and-predict/) 738 | 739 | 740 | # ### Exercises for Part 2 741 | 742 | 743 | # #### Exercise 4 744 | 745 | # (a) Identify all supervised MLJ models that can be applied (without 746 | # type coercion or one-hot encoding) to a supervised learning problem 747 | # with input features `X4` and target `y4` defined below: 748 | 749 | import Distributions 750 | poisson = Distributions.Poisson 751 | 752 | age = 18 .+ 60*rand(10); 753 | salary = coerce(rand(["small", "big", "huge"], 10), OrderedFactor); 754 | levels!(salary, ["small", "big", "huge"]); 755 | small = CategoricalValue("small", salary) 756 | 757 | #- 758 | 759 | X4 = DataFrames.DataFrame(age=age, salary=salary) 760 | 761 | n_devices(salary) = salary > small ? rand(poisson(1.3)) : rand(poisson(2.9)) 762 | y4 = [n_devices(row.salary) for row in eachrow(X4)] 763 | 764 | # (b) What models can be applied if you coerce the salary to a 765 | # `Continuous` scitype? 766 | 767 | 768 | # #### Exercise 5 (unpack) 769 | 770 | # After evaluating the following ... 771 | 772 | data = (a = [1, 2, 3, 4], 773 | b = rand(4), 774 | c = rand(4), 775 | d = coerce(["male", "female", "female", "male"], OrderedFactor)); 776 | pretty(data) 777 | 778 | #- 779 | 780 | using Tables 781 | y, X, w = unpack(data, 782 | ==(:a), 783 | name -> elscitype(Tables.getcolumn(data, name)) == Continuous, 784 | name -> true); 785 | 786 | # ...attempt to guess the evaluations of the following: 787 | 788 | y 789 | 790 | #- 791 | 792 | pretty(X) 793 | 794 | #- 795 | 796 | w 797 | 798 | # #### Exercise 6 (first steps in modeling Horse Colic) 799 | 800 | # (a) Suppose we want to use predict the `:outcome` variable in the 801 | # Horse Colic study introduced in Part 1, based on the remaining 802 | # variables that are `Continuous` (one-hot encoding categorical 803 | # variables is discussed later in Part 3) *while ignoring the others*. 804 | # Extract from the `horse` data set (defined in Part 1) appropriate 805 | # input features `X` and target variable `y`. (Do not, however, 806 | # randomize the observations.) 807 | 808 | # (b) Create a 70:30 `train`/`test` split of the data and train a 809 | # `LogisticClassifier` model, from the `MLJLinearModels` package, on 810 | # the `train` rows. Use `lambda=100` and default values for the 811 | # other hyper-parameters. (Although one would normally standardize 812 | # (whiten) the continuous features for this model, do not do so here.) 
813 | # After training: 814 | 815 | # - (i) Recalling that a logistic classifier (aka logistic regressor) is 816 | # a linear-based model learning a *vector* of coefficients for each 817 | # feature (one coefficient for each target class), use the 818 | # `fitted_params` method to find this vector of coefficients in the 819 | # case of the `:pulse` feature. (You can convert a vector of pairs `v = 820 | # [x1 => y1, x2 => y2, ...]` into a dictionary with `Dict(v)`.) 821 | 822 | # - (ii) Evaluate the `cross_entropy` performance on the `test` 823 | # observations. 824 | 825 | # - ☆(iii) In how many `test` observations does the predicted 826 | # probability of the observed class exceed 50%? 827 | 828 | # - (iv) Find the `misclassification_rate` in the `test` 829 | # set. (*Hint.* As this measure is deterministic, you will either 830 | # need to broadcast `mode` or use `predict_mode` instead of 831 | # `predict`.) 832 | 833 | # (c) Instead use a `RandomForestClassifier` model from the 834 | # `DecisionTree` package and: 835 | # 836 | # - (i) Generate an appropriate learning curve to convince yourself 837 | # that out-of-sample estimates of the `cross_entropy` loss do not 838 | # substantially improve for `n_trees > 50`. Use default values for 839 | # all other hyper-parameters, and feel free to use all available 840 | # data to generate the curve. 841 | 842 | # - (ii) Fix `n_trees=90` and use `evaluate!` to obtain a 9-fold 843 | # cross-validation estimate of the `cross_entropy`, restricting 844 | # sub-sampling to the `train` observations. 845 | 846 | # - (iii) Now use *all* available data but set 847 | # `resampling=Holdout(fraction_train=0.7)` to obtain a score you can 848 | # compare with the `KNNClassifier` in part (b)(iii). Which model is 849 | # better? 850 | 851 | # 852 | 853 | 854 | # ## Part 3 - Transformers and Pipelines 855 | 856 | # ### Transformers 857 | 858 | # Unsupervised models, which receive no target `y` during training, 859 | # always have a `transform` operation. They sometimes also support an 860 | # `inverse_transform` operation, with obvious meaning, and sometimes 861 | # support a `predict` operation (see the clustering example discussed 862 | # [here](https://alan-turing-institute.github.io/MLJ.jl/dev/transformers/#Transformers-that-also-predict-1)). 863 | # Otherwise, they are handled much like supervised models. 864 | 865 | # Here's a simple standardization example: 866 | 867 | x = rand(100); 868 | @show mean(x) std(x); 869 | 870 | #- 871 | 872 | model = Standardizer() # a built-in model 873 | mach = machine(model, x) 874 | fit!(mach) 875 | xhat = transform(mach, x); 876 | @show mean(xhat) std(xhat); 877 | 878 | # This particular model has an `inverse_transform`: 879 | 880 | inverse_transform(mach, xhat) ≈ x 881 | 882 | 883 | # ### Re-encoding the King County House data as continuous 884 | 885 | # For further illustrations of transformers, let's re-encode *all* of the 886 | # King County House input features (see [Ex 887 | # 3](#exercise-3-fixing-scitypes-in-a-table)) into a set of `Continuous` 888 | # features. 
We do this with the `ContinuousEncoder` model, which, by 889 | # default, will: 890 | 891 | # - one-hot encode all `Multiclass` features 892 | # - coerce all `OrderedFactor` features to `Continuous` ones 893 | # - coerce all `Count` features to `Continuous` ones (there aren't any) 894 | # - drop any remaining non-Continuous features (none of these either) 895 | 896 | # First, we reload the data and fix the scitypes (Exercise 3): 897 | 898 | file = CSV.File(joinpath(DIR, "data", "house.csv")); 899 | house = DataFrames.DataFrame(file); 900 | coerce!(house, autotype(file)); 901 | coerce!(house, Count => Continuous, :zipcode => Multiclass); 902 | schema(house) 903 | 904 | #- 905 | 906 | y, X = unpack(house, ==(:price), name -> true, rng=123); 907 | 908 | # Instantiate the unsupervised model (transformer): 909 | 910 | encoder = ContinuousEncoder() # a built-in model; no need to @load it 911 | 912 | # Bind the model to the data and fit! 913 | 914 | mach = machine(encoder, X) |> fit!; 915 | 916 | # Transform and inspect the result: 917 | 918 | Xcont = transform(mach, X); 919 | schema(Xcont) 920 | 921 | 922 | # ### More transformers 923 | 924 | # Here's how to list all of MLJ's unsupervised models: 925 | 926 | models(m->!m.is_supervised) 927 | 928 | # Some commonly used ones are built-in (do not require `@load`ing): 929 | 930 | # model type | does what? 931 | # ----------------------------|---------------------------------------------- 932 | # ContinuousEncoder | transform input table to a table of `Continuous` features (see above) 933 | # FeatureSelector | retain or dump selected features 934 | # FillImputer | impute missing values 935 | # OneHotEncoder | one-hot encoder `Multiclass` (and optionally `OrderedFactor`) features 936 | # Standardizer | standardize (whiten) a vector or all `Continuous` features of a table 937 | # UnivariateBoxCoxTransformer | apply a learned Box-Cox transformation to a vector 938 | # UnivariateDiscretizer | discretize a `Continuous` vector, and hence render its elscitypw `OrderedFactor` 939 | 940 | 941 | # In addition to "dynamic" transformers (ones that learn something 942 | # from the data and must be `fit!`) users can wrap ordinary functions 943 | # as transformers, and such *static* transformers can depend on 944 | # parameters, like the dynamic ones. See 945 | # [here](https://alan-turing-institute.github.io/MLJ.jl/dev/transformers/#Static-transformers-1) 946 | # for how to define your own static transformers. 947 | 948 | 949 | # ### Pipelines 950 | 951 | length(schema(Xcont).names) 952 | 953 | # Let's suppose that additionally we'd like to reduce the dimension of 954 | # our data. A model that will do this is `PCA` from 955 | # `MultivariateStats`: 956 | 957 | PCA = @load PCA 958 | reducer = PCA() 959 | 960 | # Now, rather simply repeating the work-flow above, applying the new 961 | # transformation to `Xcont`, we can combine both the encoding and the 962 | # dimension-reducing models into a single model, known as a 963 | # *pipeline*. While MLJ offers a powerful interface for composing 964 | # models in a variety of ways, we'll stick to these simplest class of 965 | # composite models for now. The easiest way to construct them is using 966 | # the `@pipeline` macro: 967 | 968 | pipe = @pipeline encoder reducer 969 | 970 | # Notice that `pipe` is an *instance* of an automatically generated 971 | # type (called `Pipeline`). 
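# As an aside (an extra illustration, not part of the original script): the
# component models become hyper-parameters of the automatically generated
# type, with snake-case field names derived from the component model types,
# so nested parameters can be reached with dot syntax (a fact used again
# below when mutating pipeline hyper-parameters):

pipe.continuous_encoder    # the wrapped ContinuousEncoder
pipe.pca.pratio            # `pratio` of the wrapped PCA model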
972 | 973 | # The new model behaves like any other transformer: 974 | 975 | mach = machine(pipe, X) 976 | fit!(mach) 977 | Xsmall = transform(mach, X) 978 | schema(Xsmall) 979 | 980 | # Want to combine this pre-processing with ridge regression? 981 | 982 | RidgeRegressor = @load RidgeRegressor pkg=MLJLinearModels 983 | rgs = RidgeRegressor() 984 | pipe2 = @pipeline encoder reducer rgs 985 | 986 | # Now our pipeline is a supervised model, instead of a transformer, 987 | # whose performance we can evaluate: 988 | 989 | mach = machine(pipe2, X, y) 990 | evaluate!(mach, measure=mae, resampling=Holdout()) # CV(nfolds=6) is default 991 | 992 | 993 | # ### Training of composite models is "smart" 994 | 995 | # Now notice what happens if we train on all the data, then change a 996 | # regressor hyper-parameter and retrain: 997 | 998 | fit!(mach) 999 | 1000 | #- 1001 | 1002 | pipe2.ridge_regressor.lambda = 0.1 1003 | fit!(mach) 1004 | 1005 | # Second time only the ridge regressor is retrained! 1006 | 1007 | # Mutate a hyper-parameter of the `PCA` model and every model except 1008 | # the `ContinuousEncoder` (which comes before it will be retrained): 1009 | 1010 | pipe2.pca.pratio = 0.9999 1011 | fit!(mach) 1012 | 1013 | 1014 | # ### Inspecting composite models 1015 | 1016 | # The dot syntax used above to change the values of *nested* 1017 | # hyper-parameters is also useful when inspecting the learned 1018 | # parameters and report generated when training a composite model: 1019 | 1020 | fitted_params(mach).ridge_regressor 1021 | 1022 | #- 1023 | 1024 | report(mach).pca 1025 | 1026 | 1027 | # ### Incorporating target transformations 1028 | 1029 | # Next, suppose that instead of using the raw `:price` as the 1030 | # training target, we want to use the log-price (a common practice in 1031 | # dealing with house price data). However, suppose that we still want 1032 | # to report final *predictions* on the original linear scale (and use 1033 | # these for evaluation purposes). Then we supply appropriate functions 1034 | # to key-word arguments `target` and `inverse`. 1035 | 1036 | # First we'll overload `log` and `exp` for broadcasting: 1037 | Base.log(v::AbstractArray) = log.(v) 1038 | Base.exp(v::AbstractArray) = exp.(v) 1039 | 1040 | # Now for the new pipeline: 1041 | 1042 | pipe3 = @pipeline encoder reducer rgs target=log inverse=exp 1043 | mach = machine(pipe3, X, y) 1044 | evaluate!(mach, measure=mae) 1045 | 1046 | # MLJ will also allow you to insert *learned* target 1047 | # transformations. For example, we might want to apply 1048 | # `Standardizer()` to the target, to standardize it, or 1049 | # `UnivariateBoxCoxTransformer()` to make it look Gaussian. Then 1050 | # instead of specifying a *function* for `target`, we specify a 1051 | # unsupervised *model* (or model type). One does not specify `inverse` 1052 | # because only models implementing `inverse_transform` are 1053 | # allowed. 
1054 | 1055 | # Let's see which of these two options results in a better outcome: 1056 | 1057 | box = UnivariateBoxCoxTransformer(n=20) 1058 | stand = Standardizer() 1059 | 1060 | pipe4 = @pipeline encoder reducer rgs target=box 1061 | mach = machine(pipe4, X, y) 1062 | evaluate!(mach, measure=mae) 1063 | 1064 | #- 1065 | 1066 | pipe4.target = stand 1067 | evaluate!(mach, measure=mae) 1068 | 1069 | 1070 | # ### Resources for Part 3 1071 | 1072 | # - From the MLJ manual: 1073 | # - [Transformers and other unsupervised models](https://alan-turing-institute.github.io/MLJ.jl/dev/transformers/) 1074 | # - [Linear pipelines](https://alan-turing-institute.github.io/MLJ.jl/dev/linear_pipelines/#Linear-Pipelines) 1075 | # - From Data Science Tutorials: 1076 | # - [Composing models](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/composing-models/) 1077 | 1078 | 1079 | # ### Exercises for Part 3 1080 | 1081 | # #### Exercise 7 1082 | 1083 | # Consider again the Horse Colic classification problem considered in 1084 | # Exercise 6, but with all features, `Finite` and `Infinite`: 1085 | 1086 | y, X = unpack(horse, ==(:outcome), name -> true); 1087 | schema(X) 1088 | 1089 | # (a) Define a pipeline that: 1090 | # - uses `Standardizer` to ensure that features that are already 1091 | # continuous are centered at zero and have unit variance 1092 | # - re-encodes the full set of features as `Continuous`, using 1093 | # `ContinuousEncoder` 1094 | # - uses the `KMeans` clustering model from `Clustering.jl` 1095 | # to reduce the dimension of the feature space to `k=10`. 1096 | # - trains a `EvoTreeClassifier` (a gradient tree boosting 1097 | # algorithm in `EvoTrees.jl`) on the reduced data, using 1098 | # `nrounds=50` and default values for the other 1099 | # hyper-parameters 1100 | 1101 | # (b) Evaluate the pipeline on all data, using 6-fold cross-validation 1102 | # and `cross_entropy` loss. 1103 | 1104 | # ☆(c) Plot a learning curve which examines the effect on this loss 1105 | # as the tree booster parameter `max_depth` varies from 2 to 10. 1106 | 1107 | # 1108 | 1109 | 1110 | # ## Part 4 - Tuning Hyper-parameters 1111 | 1112 | # ### Naive tuning of a single parameter 1113 | 1114 | # The most naive way to tune a single hyper-parameter is to use 1115 | # `learning_curve`, which we already saw in Part 2. 
Let's see this in 1116 | # the Horse Colic classification problem, in a case where the parameter 1117 | # to be tuned is *nested* (because the model is a pipeline): 1118 | 1119 | y, X = unpack(horse, ==(:outcome), name -> true); 1120 | 1121 | LogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels 1122 | model = @pipeline Standardizer ContinuousEncoder LogisticClassifier 1123 | mach = machine(model, X, y) 1124 | 1125 | #- 1126 | 1127 | r = range(model, :(logistic_classifier.lambda), lower = 1e-2, upper=100, scale=:log10) 1128 | 1129 | # If you're curious, you can see what `lambda` values this range will 1130 | # generate for a given resolution: 1131 | 1132 | iterator(r, 5) 1133 | 1134 | #- 1135 | 1136 | _, _, lambdas, losses = learning_curve(mach, 1137 | range=r, 1138 | resampling=CV(nfolds=6), 1139 | resolution=30, # default 1140 | measure=cross_entropy) 1141 | plt=plot(lambdas, losses, xscale=:log10) 1142 | xlabel!(plt, "lambda") 1143 | ylabel!(plt, "cross entropy using 6-fold CV") 1144 | savefig("learning_curve2.png") 1145 | plt #!md 1146 | 1147 | # ![](learning_curve2.png) #md 1148 | 1149 | best_lambda = lambdas[argmin(losses)] 1150 | 1151 | 1152 | # ### Self tuning models 1153 | 1154 | # A more sophisticated way to view hyper-parameter tuning (inspired by 1155 | # MLR) is as a model *wrapper*. The wrapped model is a new model in 1156 | # its own right and when you fit it, it tunes specified 1157 | # hyper-parameters of the model being wrapped, before training on all 1158 | # supplied data. Calling `predict` on the wrapped model is like 1159 | # calling `predict` on the original model, but with the 1160 | # hyper-parameters already optimized. 1161 | 1162 | # In other words, we can think of the wrapped model as a "self-tuning" 1163 | # version of the original. 1164 | 1165 | # We now create a self-tuning version of the pipeline above, adding a 1166 | # parameter from the `ContinuousEncoder` to the parameters we want 1167 | # optimized. 1168 | 1169 | # First, let's choose a tuning strategy (from [these 1170 | # options](https://github.com/juliaai/MLJTuning.jl#what-is-provided-here)). MLJ 1171 | # supports ordinary `Grid` search (query `?Grid` for 1172 | # details). However, as the utility of `Grid` search is limited to a 1173 | # small number of parameters, and as `Grid` searches are demonstrated 1174 | # elsewhere (see the [resources below](#resources-for-part-4)) we'll 1175 | # demonstrate `RandomSearch` here: 1176 | 1177 | tuning = RandomSearch(rng=123) 1178 | 1179 | # In this strategy each parameter is sampled according to a 1180 | # pre-specified prior distribution that is fit to the one-dimensional 1181 | # range object constructed using `range` as before. While one has a 1182 | # lot of control over the specification of the priors (run 1183 | # `?RandomSearch` for details) we'll let the algorithm generate these 1184 | # priors automatically. 1185 | 1186 | 1187 | # #### Unbounded ranges and sampling 1188 | 1189 | # In MLJ a range does not have to be bounded. In a `RandomSearch` a 1190 | # positive unbounded range is sampled using a `Gamma` distribution, by 1191 | # default: 1192 | 1193 | r = range(model, 1194 | :(logistic_classifier.lambda), 1195 | lower=0, 1196 | origin=6, 1197 | unit=5, 1198 | scale=:log10) 1199 | 1200 | # The `scale` in a range makes no in a `RandomSearch` (unless it is a 1201 | # function) but this will effect later plots but it does effect the 1202 | # later plots. 
1203 | 1204 | # Let's see what sampling using a Gamma distribution is going to mean 1205 | # for this range: 1206 | 1207 | import Distributions 1208 | sampler_r = sampler(r, Distributions.Gamma) 1209 | plt = histogram(rand(sampler_r, 10000), nbins=50) 1210 | savefig("gamma_sampler.png") 1211 | plt #!md 1212 | 1213 | # ![](gamma_sampler.png) 1214 | 1215 | # The second parameter that we'll add to this is *nominal* (finite) and, by 1216 | # default, will be sampled uniformly. Since it is nominal, we specify 1217 | # `values` instead of `upper` and `lower` bounds: 1218 | 1219 | s = range(model, :(continuous_encoder.one_hot_ordered_factors), 1220 | values = [true, false]) 1221 | 1222 | 1223 | # #### The tuning wrapper 1224 | 1225 | # Now for the wrapper, which is an instance of `TunedModel`: 1226 | 1227 | tuned_model = TunedModel(model=model, 1228 | ranges=[r, s], 1229 | resampling=CV(nfolds=6), 1230 | measures=cross_entropy, 1231 | tuning=tuning, 1232 | n=15) 1233 | 1234 | # We can apply the `fit!/predict` work-flow to `tuned_model` just as 1235 | # for any other model: 1236 | 1237 | tuned_mach = machine(tuned_model, X, y); 1238 | fit!(tuned_mach); 1239 | predict(tuned_mach, rows=1:3) 1240 | 1241 | # The outcomes of the tuning can be inspected from a detailed 1242 | # report. For example, we have: 1243 | 1244 | rep = report(tuned_mach); 1245 | rep.best_model 1246 | 1247 | # By default, sampling of a bounded range is uniform. Lets 1248 | 1249 | # In the special case of two-parameters, you can also plot the results: 1250 | 1251 | plt = plot(tuned_mach) 1252 | savefig("tuning.png") 1253 | plt #!md 1254 | 1255 | # ![](tuning.png) #md 1256 | 1257 | # Finally, let's compare cross-validation estimate of the performance 1258 | # of the self-tuning model with that of the original model (an example 1259 | # of [*nested 1260 | # resampling*]((https://mlr.mlr-org.com/articles/tutorial/nested_resampling.html) 1261 | # here): 1262 | 1263 | err = evaluate!(mach, resampling=CV(nfolds=3), measure=cross_entropy) 1264 | 1265 | #- 1266 | 1267 | tuned_err = evaluate!(tuned_mach, resampling=CV(nfolds=3), measure=cross_entropy) 1268 | 1269 | # 1270 | 1271 | 1272 | # ### Resources for Part 4 1273 | # 1274 | # - From the MLJ manual: 1275 | # - [Learning Curves](https://alan-turing-institute.github.io/MLJ.jl/dev/learning_curves/) 1276 | # - [Tuning Models](https://alan-turing-institute.github.io/MLJ.jl/dev/tuning_models/) 1277 | # - The [MLJTuning repo](https://github.com/juliaai/MLJTuning.jl#who-is-this-repo-for) - mostly for developers 1278 | # 1279 | # - From Data Science Tutorials: 1280 | # - [Tuning a model](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/model-tuning/) 1281 | # - [Crabs with XGBoost](https://juliaai.github.io/DataScienceTutorials.jl/end-to-end/crabs-xgb/) `Grid` tuning in stages for a tree-boosting model with many parameters 1282 | # - [Boston with LightGBM](https://juliaai.github.io/DataScienceTutorials.jl/end-to-end/boston-lgbm/) - `Grid` tuning for another popular tree-booster 1283 | # - [Boston with Flux](https://juliaai.github.io/DataScienceTutorials.jl/end-to-end/boston-flux/) - optimizing batch size in a simple neural network regressor 1284 | # - [UCI Horse Colic Data Set](http://archive.ics.uci.edu/ml/datasets/Horse+Colic) 1285 | 1286 | 1287 | # ### Exercises for Part 4 1288 | 1289 | # #### Exercise 8 1290 | 1291 | # This exercise continues our analysis of the King County House price 1292 | # prediction problem: 1293 | 1294 | y, X = unpack(house, ==(:price), name -> 
true, rng=123); 1295 | 1296 | # Your task will be to tune the following pipeline regression model, 1297 | # which includes a gradient tree boosting component: 1298 | 1299 | EvoTreeRegressor = @load EvoTreeRegressor 1300 | tree_booster = EvoTreeRegressor(nrounds = 70) 1301 | model = @pipeline ContinuousEncoder tree_booster 1302 | 1303 | # (a) Construct a bounded range `r1` for the `evo_tree_booster` 1304 | # parameter `max_depth`, varying between 1 and 12. 1305 | 1306 | # \star&(b) For the `nbins` parameter of the `EvoTreeRegressor`, define the range 1307 | 1308 | r2 = range(model, 1309 | :(evo_tree_regressor.nbins), 1310 | lower = 2.5, 1311 | upper= 7.5, scale=x->2^round(Int, x)) 1312 | 1313 | # Notice that in this case we've specified a *function* instead of a 1314 | # canned scale, like `:log10`. In this case the `scale` function is 1315 | # applied after sampling (uniformly) between the limits of `lower` and 1316 | # `upper`. Perhaps you can guess the outputs of the following lines of 1317 | # code? 1318 | 1319 | r2_sampler = sampler(r2, Distributions.Uniform) 1320 | samples = rand(r2_sampler, 1000); 1321 | plt = histogram(samples, nbins=50) 1322 | savefig("uniform_sampler.png") 1323 | 1324 | plt #!md 1325 | 1326 | # ![](uniform_sampler.png) 1327 | 1328 | sort(unique(samples)) 1329 | 1330 | # (c) Optimize `model` over these the parameter ranges `r1` and `r2` 1331 | # using a random search with uniform priors (the default). Use 1332 | # `Holdout()` resampling, and implement your search by first 1333 | # constructing a "self-tuning" wrap of `model`, as described 1334 | # above. Make `mae` (mean absolute error) the loss function that you 1335 | # optimize, and search over a total of 40 combinations of 1336 | # hyper-parameters. If you have time, plot the results of your 1337 | # search. Feel free to use all available data. 1338 | 1339 | # (d) Evaluate the best model found in the search using 3-fold 1340 | # cross-validation and compare with that of the self-tuning model 1341 | # (which is different!). Setting data hygiene concerns aside, feel 1342 | # free to use all available data. 1343 | 1344 | # 1345 | 1346 | 1347 | # ## Part 5 - Advanced Model Composition 1348 | 1349 | # > **Goals:** 1350 | # > 1. Learn how to build a prototypes of a composite model, called a *learning network* 1351 | # > 2. Learn how to use the `@from_network` macro to export a learning network as a new stand-alone model type 1352 | 1353 | # While `@pipeline` is great for composing models in an unbranching 1354 | # sequence, for more complicated model composition you'll want to use 1355 | # MLJ's generic model composition syntax. There are two main steps: 1356 | 1357 | # - **Prototype** the composite model by building a *learning 1358 | # network*, which can be tested on some (dummy) data as you build 1359 | # it. 1360 | 1361 | # - **Export** the learning network as a new stand-alone model type. 1362 | 1363 | # Like pipeline models, instances of the exported model type behave 1364 | # like any other model (and are not bound to any data, until you wrap 1365 | # them in a machine). 1366 | 1367 | 1368 | # ### Building a pipeline using the generic composition syntax 1369 | 1370 | # To warm up, we'll do the equivalent of 1371 | 1372 | pipe = @pipeline Standardizer LogisticClassifier; 1373 | 1374 | # using the generic syntax. 
1375 | 1376 | # Here's some dummy data we'll be using to test our learning network: 1377 | 1378 | X, y = make_blobs(5, 3) 1379 | pretty(X) 1380 | 1381 | # **Step 0** - Proceed as if you were combining the models "by hand", 1382 | # using all the data available for training, transforming and 1383 | # prediction: 1384 | 1385 | stand = Standardizer(); 1386 | linear = LogisticClassifier(); 1387 | 1388 | mach1 = machine(stand, X); 1389 | fit!(mach1); 1390 | Xstand = transform(mach1, X); 1391 | 1392 | mach2 = machine(linear, Xstand, y); 1393 | fit!(mach2); 1394 | yhat = predict(mach2, Xstand) 1395 | 1396 | # **Step 1** - Edit your code as follows: 1397 | 1398 | # - pre-wrap the data in `Source` nodes 1399 | 1400 | # - delete the `fit!` calls 1401 | 1402 | X = source(X) # or X = source() if not testing 1403 | y = source(y) # or y = source() 1404 | 1405 | stand = Standardizer(); 1406 | linear = LogisticClassifier(); 1407 | 1408 | mach1 = machine(stand, X); 1409 | Xstand = transform(mach1, X); 1410 | 1411 | mach2 = machine(linear, Xstand, y); 1412 | yhat = predict(mach2, Xstand) 1413 | 1414 | # Now `X`, `y`, `Xstand` and `yhat` are *nodes* ("variables" or 1415 | # "dynamic data") instead of data. All training, predicting and 1416 | # transforming is now executed lazily, whenever we `fit!` one of these 1417 | # nodes. We *call* a node to retrieve the data it represents in the 1418 | # original manual workflow. 1419 | 1420 | fit!(Xstand) 1421 | Xstand() |> pretty 1422 | 1423 | #- 1424 | 1425 | fit!(yhat); 1426 | yhat() 1427 | 1428 | # The node `yhat` is the "descendant" (in an associated DAG we have 1429 | # defined) of a unique source node: 1430 | 1431 | sources(yhat) 1432 | 1433 | #- 1434 | 1435 | # The data at the source node is replaced by `Xnew` to obtain a 1436 | # new prediction when we call `yhat` like this: 1437 | 1438 | Xnew, _ = make_blobs(2, 3); 1439 | yhat(Xnew) 1440 | 1441 | 1442 | # **Step 2** - Export the learning network as a new stand-alone model type 1443 | 1444 | # Now, somewhat paradoxically, we can wrap the whole network in a 1445 | # special machine - called a *learning network machine* - before we have 1446 | # defined the new model type. Indeed, doing so is a necessary step in 1447 | # the export process, for this machine will tell the export macro: 1448 | 1449 | # - what kind of model the composite will be (`Deterministic`, 1450 | # `Probabilistic` or `Unsupervised`) 1451 | 1452 | # - which source nodes are input nodes and which are for the target 1453 | 1454 | # - which nodes correspond to each operation (`predict`, `transform`, 1455 | # etc) that we might want to define 1456 | 1457 | surrogate = Probabilistic() # a model with no fields! 
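# (Added aside, a quick sketch not in the original: we can probe the claim that
# the surrogate really has no fields - it is just an empty container whose only
# job is to declare what kind of composite we are exporting:)

fieldnames(typeof(surrogate))   # expected to be (), i.e. no hyper-parameters at all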
1458 | mach = machine(surrogate, X, y; predict=yhat) 1459 | 1460 | # Although we have no real need to use it, this machine behaves like 1461 | # you'd expect it to: 1462 | 1463 | Xnew, _ = make_blobs(2, 3) 1464 | fit!(mach) 1465 | predict(mach, Xnew) 1466 | 1467 | #- 1468 | 1469 | # Now we create a new model type using a Julia `struct` definition 1470 | # appropriately decorated: 1471 | 1472 | @from_network mach begin 1473 | mutable struct YourPipe 1474 | standardizer = stand 1475 | classifier = linear::Probabilistic 1476 | end 1477 | end 1478 | 1479 | # Instantiating and evaluating on some new data: 1480 | 1481 | pipe = YourPipe() 1482 | X, y = @load_iris; # built-in data set 1483 | mach = machine(pipe, X, y) 1484 | evaluate!(mach, measure=misclassification_rate, operation=predict_mode) 1485 | 1486 | 1487 | # ### A composite model to average two regressor predictors 1488 | 1489 | # The following is condensed version of 1490 | # [this](https://github.com/alan-turing-institute/MLJ.jl/blob/master/binder/MLJ_demo.ipynb) 1491 | # tutorial. We will define a composite model that: 1492 | 1493 | # - standardizes the input data 1494 | 1495 | # - learns and applies a Box-Cox transformation to the target variable 1496 | 1497 | # - blends the predictions of two supervised learning models - a ridge 1498 | # regressor and a random forest regressor; we'll blend using a simple 1499 | # average (for a more sophisticated stacking example, see 1500 | # [here](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/stacking/)) 1501 | 1502 | # - applies the *inverse* Box-Cox transformation to this blended prediction 1503 | 1504 | RandomForestRegressor = @load RandomForestRegressor pkg=DecisionTree 1505 | 1506 | # **Input layer** 1507 | 1508 | X = source() 1509 | y = source() 1510 | 1511 | # **First layer and target transformation** 1512 | 1513 | std_model = Standardizer() 1514 | stand = machine(std_model, X) 1515 | W = MLJ.transform(stand, X) 1516 | 1517 | box_model = UnivariateBoxCoxTransformer() 1518 | box = machine(box_model, y) 1519 | z = MLJ.transform(box, y) 1520 | 1521 | # **Second layer** 1522 | 1523 | ridge_model = RidgeRegressor(lambda=0.1) 1524 | ridge = machine(ridge_model, W, z) 1525 | 1526 | forest_model = RandomForestRegressor(n_trees=50) 1527 | forest = machine(forest_model, W, z) 1528 | 1529 | ẑ = 0.5*predict(ridge, W) + 0.5*predict(forest, W) 1530 | 1531 | # **Output** 1532 | 1533 | ŷ = inverse_transform(box, ẑ) 1534 | 1535 | # With the learning network defined, we're ready to export: 1536 | 1537 | @from_network machine(Deterministic(), X, y, predict=ŷ) begin 1538 | mutable struct CompositeModel 1539 | rgs1 = ridge_model 1540 | rgs2 = forest_model 1541 | end 1542 | end 1543 | 1544 | # Let's instantiate the new model type and try it out on some data: 1545 | 1546 | composite = CompositeModel() 1547 | 1548 | #- 1549 | 1550 | X, y = @load_boston; 1551 | mach = machine(composite, X, y); 1552 | evaluate!(mach, 1553 | resampling=CV(nfolds=6, shuffle=true), 1554 | measures=[rms, mae]) 1555 | 1556 | 1557 | # ### Resources for Part 5 1558 | # 1559 | # - From the MLJ manual: 1560 | # - [Learning Networks](https://alan-turing-institute.github.io/MLJ.jl/stable/composing_models/#Learning-Networks-1) 1561 | # - From Data Science Tutorials: 1562 | # - [Learning Networks](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/learning-networks/) 1563 | # - [Learning Networks 2](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/learning-networks-2/) 1564 | 1565 | # - 
[Stacking](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/stacking/): an advanced example of model composition 1566 | 1567 | # - [Finer Control](https://alan-turing-institute.github.io/MLJ.jl/dev/composing_models/#Method-II:-Finer-control-(advanced)-1): 1568 | # exporting learning networks without a macro for finer control 1569 | 1570 | # 1571 | 1572 | 1573 | # ## Solutions to exercises 1574 | 1575 | # #### Exercise 2 solution 1576 | 1577 | quality = coerce(quality, OrderedFactor); 1578 | levels!(quality, ["poor", "good", "excellent"]); 1579 | elscitype(quality) 1580 | 1581 | 1582 | # #### Exercise 3 solution 1583 | 1584 | # First pass: 1585 | 1586 | coerce!(house, autotype(house)); 1587 | schema(house) 1588 | 1589 | #- 1590 | 1591 | # All the "sqft" fields refer to "square feet" so are 1592 | # really `Continuous`. We'll regard `:yr_built` (the other `Count` 1593 | # variable above) as `Continuous` as well. So: 1594 | 1595 | coerce!(house, Count => Continuous); 1596 | 1597 | # And `:zipcode` should not be ordered: 1598 | 1599 | coerce!(house, :zipcode => Multiclass); 1600 | schema(house) 1601 | 1602 | # `:bathrooms` looks like it has a lot of levels, but on further 1603 | # inspection we see why, and `OrderedFactor` remains appropriate: 1604 | 1605 | import StatsBase.countmap 1606 | countmap(house.bathrooms) 1607 | 1608 | 1609 | # #### Exercise 4 solution 1610 | 1611 | # 4(a) 1612 | 1613 | # There are *no* models that apply immediately: 1614 | 1615 | models(matching(X4, y4)) 1616 | 1617 | # 4(b) 1618 | 1619 | y4 = coerce(y4, Continuous); 1620 | models(matching(X4, y4)) 1621 | 1622 | 1623 | # #### Exercise 6 solution 1624 | 1625 | # 6(a) 1626 | 1627 | y, X = unpack(horse, 1628 | ==(:outcome), 1629 | name -> elscitype(Tables.getcolumn(horse, name)) == Continuous); 1630 | 1631 | # 6(b)(i) 1632 | 1633 | model = (@load LogisticClassifier pkg=MLJLinearModels)(); 1634 | model.lambda = 100 1635 | mach = machine(model, X, y) 1636 | fit!(mach, rows=train) 1637 | fitted_params(mach) 1638 | 1639 | #- 1640 | 1641 | coefs_given_feature = Dict(fitted_params(mach).coefs) 1642 | coefs_given_feature[:pulse] 1643 | 1644 | #6(b)(ii) 1645 | 1646 | yhat = predict(mach, rows=test); # or predict(mach, X[test,:]) 1647 | err = cross_entropy(yhat, y[test]) |> mean 1648 | 1649 | # 6(b)(iii) 1650 | 1651 | # The predicted probabilities of the actual observations in the test 1652 | # are given by 1653 | 1654 | p = broadcast(pdf, yhat, y[test]); 1655 | 1656 | # The number of times this probability exceeds 50% is: 1657 | n50 = filter(x -> x > 0.5, p) |> length 1658 | 1659 | # Or, as a proportion: 1660 | 1661 | n50/length(test) 1662 | 1663 | # 6(b)(iv) 1664 | 1665 | misclassification_rate(mode.(yhat), y[test]) 1666 | 1667 | # 6(c)(i) 1668 | 1669 | model = (@load RandomForestClassifier pkg=DecisionTree)() 1670 | mach = machine(model, X, y) 1671 | evaluate!(mach, resampling=CV(nfolds=6), measure=cross_entropy) 1672 | 1673 | r = range(model, :n_trees, lower=10, upper=70, scale=:log10) 1674 | 1675 | # Since random forests are inherently randomized, we generate multiple 1676 | # curves: 1677 | 1678 | plt = plot() 1679 | for i in 1:4 1680 | one_curve = learning_curve(mach, 1681 | range=r, 1682 | resampling=Holdout(), 1683 | measure=cross_entropy) 1684 | plot!(one_curve.parameter_values, one_curve.measurements) 1685 | end 1686 | xlabel!(plt, "n_trees") 1687 | ylabel!(plt, "cross entropy") 1688 | savefig("exercise_6ci.png") 1689 | plt #!md 1690 | 1691 | # ![](exercise_6ci.png) #md 1692 | 1693 | 1694 | # 
6(c)(ii) 1695 | 1696 | evaluate!(mach, resampling=CV(nfolds=9), 1697 | measure=cross_entropy, 1698 | rows=train).measurement[1] 1699 | 1700 | model.n_trees = 90 1701 | 1702 | # 6(c)(iii) 1703 | 1704 | err_forest = evaluate!(mach, resampling=Holdout(), 1705 | measure=cross_entropy).measurement[1] 1706 | 1707 | # #### Exercise 7 solution 1708 | 1709 | # (a) 1710 | 1711 | KMeans = @load KMeans pkg=Clustering 1712 | EvoTreeClassifier = @load EvoTreeClassifier 1713 | pipe = @pipeline(Standardizer, 1714 | ContinuousEncoder, 1715 | KMeans(k=10), 1716 | EvoTreeClassifier(nrounds=50)) 1717 | 1718 | # (b) 1719 | 1720 | mach = machine(pipe, X, y) 1721 | evaluate!(mach, resampling=CV(nfolds=6), measure=cross_entropy) 1722 | 1723 | # (c) 1724 | 1725 | r = range(pipe, :(evo_tree_classifier.max_depth), lower=1, upper=10) 1726 | 1727 | curve = learning_curve(mach, 1728 | range=r, 1729 | resampling=CV(nfolds=6), 1730 | measure=cross_entropy) 1731 | 1732 | plt = plot(curve.parameter_values, curve.measurements) 1733 | xlabel!(plt, "max_depth") 1734 | ylabel!(plt, "CV estimate of cross entropy") 1735 | savefig("exercise_7c.png") 1736 | plt #!md 1737 | 1738 | # ![](exercise_7c.png) #md 1739 | 1740 | # Here's a second curve using a different random seed for the booster: 1741 | 1742 | using Random 1743 | pipe.evo_tree_classifier.rng = MersenneTwister(123) 1744 | curve = learning_curve(mach, 1745 | range=r, 1746 | resampling=CV(nfolds=6), 1747 | measure=cross_entropy) 1748 | plot!(curve.parameter_values, curve.measurements) 1749 | savefig("exercise_7c_2.png") 1750 | plt #!md 1751 | 1752 | # ![](exercise_7c_2.png) #md 1753 | 1754 | # One can automate the production of multiple curves with different 1755 | # seeds in the following way: 1756 | curves = learning_curve(mach, 1757 | range=r, 1758 | resampling=CV(nfolds=6), 1759 | measure=cross_entropy, 1760 | rng_name=:(evo_tree_classifier.rng), 1761 | rngs=6) # a list of RNGs, or a number to auto-generate 1762 | plt = plot(curves.parameter_values, curves.measurements) 1763 | savefig("exercise_7c_3.png") 1764 | plt #!md 1765 | 1766 | # ![](exercise_7c_3.png) #md 1767 | 1768 | # If you have multiple threads available in your Julia session, you 1769 | # can add the option `acceleration=CPUThreads()` to speed up this 1770 | # computation. 1771 | 1772 | # #### Exercise 8 solution 1773 | 1774 | y, X = unpack(house, ==(:price), name -> true, rng=123); 1775 | 1776 | EvoTreeRegressor = @load EvoTreeRegressor 1777 | tree_booster = EvoTreeRegressor(nrounds = 70) 1778 | model = @pipeline ContinuousEncoder tree_booster 1779 | 1780 | # (a) 1781 | 1782 | r1 = range(model, :(evo_tree_regressor.max_depth), lower=1, upper=12) 1783 | 1784 | # (c) 1785 | 1786 | tuned_model = TunedModel(model=model, 1787 | ranges=[r1, r2], 1788 | resampling=Holdout(), 1789 | measures=mae, 1790 | tuning=RandomSearch(rng=123), 1791 | n=40) 1792 | 1793 | tuned_mach = machine(tuned_model, X, y) |> fit! 
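# (Added aside, not part of the original solution: assuming the report of a
# `TunedModel` machine exposes `best_model` and `history` fields, as in recent
# MLJTuning releases, the search results can also be inspected directly before
# plotting:)

rep8 = report(tuned_mach);
rep8.best_model.evo_tree_regressor.max_depth   # best depth found (hypothetical inspection)
rep8.history[1:2]                              # first two evaluated hyper-parameter combinations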
1794 | plt = plot(tuned_mach) 1795 | savefig("exercise_8c.png") 1796 | plt #!md 1797 | 1798 | # ![](exercise_8c.png) #md 1799 | 1800 | # (d) 1801 | 1802 | best_model = report(tuned_mach).best_model; 1803 | best_mach = machine(best_model, X, y); 1804 | best_err = evaluate!(best_mach, resampling=CV(nfolds=3), measure=mae) 1805 | 1806 | #- 1807 | 1808 | tuned_err = evaluate!(tuned_mach, resampling=CV(nfolds=3), measure=mae) 1809 | 1810 | 1811 | using Literate #src 1812 | Literate.markdown(@__FILE__, DIR, execute=true) #src 1813 | Literate.notebook(@__FILE__, DIR, execute=false) #src 1814 | -------------------------------------------------------------------------------- /vecstack.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ablaom/MachineLearningInJulia2020/552f98fbf012475d67cd29a72448ac7c476ea2c7/vecstack.png -------------------------------------------------------------------------------- /wow.jl: -------------------------------------------------------------------------------- 1 | # # State-of-the-art model composition in MLJ (Machine Learning in Julia) 2 | 3 | # In this script we use model stacking to demonstrate the ease with 4 | # which machine learning models can be combined in sophisticated ways 5 | # using MLJ. In practice, one would use MLJ's [canned stacking model 6 | # constructor](https://alan-turing-institute.github.io/MLJ.jl/dev/model_stacking/#Model-Stacking) 7 | # `Stack`. Here, however, we give a quick demonstation how you would 8 | # build a stack yourself, using MLJ's generic model composition 9 | # syntax, which is an extension of the normal fit/predict syntax. 10 | 11 | # For a more leisurely notebook on the same material, see 12 | # [this](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/stacking/) 13 | # tutorial. 14 | 15 | 16 | DIR = @__DIR__ 17 | include(joinpath(DIR, "setup.jl")) 18 | 19 | # ## Stacking is hard 20 | 21 | # [Model 22 | # stacking](https://alan-turing-institute.github.io/DataScienceTutorials.jl/getting-started/stacking/), 23 | # popular in Kaggle data science competitions, is a sophisticated way 24 | # to blend the predictions of multiple models. 25 | 26 | # With the python toolbox 27 | # [scikit-learn](https://scikit-learn.org/stable/) (or its [julia 28 | # wrap](https://github.com/cstjean/ScikitLearn.jl)) you can use 29 | # pipelines to combine composite models in simple ways but (automated) 30 | # stacking is beyond its capabilities. 31 | 32 | # One python alternative is to use 33 | # [vecstack](https://github.com/vecxoz/vecstack). The [core 34 | # algorithm](https://github.com/vecxoz/vecstack/blob/master/vecstack/core.py) 35 | # is about eight pages (without the scikit-learn interface): 36 | 37 | # ![](vecstack.png). 38 | 39 | # ## Stacking is easy (in MLJ) 40 | 41 | # Using MLJ's [generic model composition 42 | # API](https://alan-turing-institute.github.io/MLJ.jl/dev/composing_models/) 43 | # you can build a stack in about a page. 44 | 45 | # Here's the complete code needed to define a new model type that 46 | # stacks two base regressors and one adjudicator in MLJ. Here we use 47 | # three folds to create the base-learner [out-of-sample 48 | # predictions](https://alan-turing-institute.github.io/DataScienceTutorials.jl/getting-started/stacking/) 49 | # to make it easier to read. You can make this generic with little fuss. 
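# (Added aside, a rough sketch not in the original script, using the names
# defined in the code below: "making it generic" essentially means replacing the
# hand-written per-fold machines and out-of-sample predictions with a
# comprehension over `1:nfolds`, roughly
#
#     machs  = [machine(model1, corestrict(X, f, i), corestrict(y, f, i)) for i in 1:nfolds]
#     y1_oos = vcat([predict(machs[i], restrict(X, f, i)) for i in 1:nfolds]...)
#
# with the two base models, the judge and `nfolds` passed in as parameters.)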
50 | 51 | using MLJ 52 | 53 | folds(data, nfolds) = 54 | partition(1:nrows(data), (1/nfolds for i in 1:(nfolds-1))...); 55 | 56 | # these models are only going to be default choices for the stack: 57 | 58 | LinearRegressor = @load LinearRegressor pkg=MLJLinearModels 59 | model1 = LinearRegressor() 60 | model2 = LinearRegressor() 61 | judge = LinearRegressor() 62 | 63 | X = source() 64 | y = source() 65 | 66 | folds(X::AbstractNode, nfolds) = node(XX->folds(XX, nfolds), X) 67 | MLJ.restrict(X::AbstractNode, f::AbstractNode, i) = 68 | node((XX, ff) -> restrict(XX, ff, i), X, f); 69 | MLJ.corestrict(X::AbstractNode, f::AbstractNode, i) = 70 | node((XX, ff) -> corestrict(XX, ff, i), X, f); 71 | 72 | f = folds(X, 3) 73 | 74 | m11 = machine(model1, corestrict(X, f, 1), corestrict(y, f, 1)) 75 | m12 = machine(model1, corestrict(X, f, 2), corestrict(y, f, 2)) 76 | m13 = machine(model1, corestrict(X, f, 3), corestrict(y, f, 3)) 77 | 78 | y11 = predict(m11, restrict(X, f, 1)); 79 | y12 = predict(m12, restrict(X, f, 2)); 80 | y13 = predict(m13, restrict(X, f, 3)); 81 | 82 | m21 = machine(model2, corestrict(X, f, 1), corestrict(y, f, 1)) 83 | m22 = machine(model2, corestrict(X, f, 2), corestrict(y, f, 2)) 84 | m23 = machine(model2, corestrict(X, f, 3), corestrict(y, f, 3)) 85 | 86 | y21 = predict(m21, restrict(X, f, 1)); 87 | y22 = predict(m22, restrict(X, f, 2)); 88 | y23 = predict(m23, restrict(X, f, 3)); 89 | 90 | y1_oos = vcat(y11, y12, y13); 91 | y2_oos = vcat(y21, y22, y23); 92 | 93 | X_oos = MLJ.table(hcat(y1_oos, y2_oos)) 94 | 95 | m_judge = machine(judge, X_oos, y) 96 | 97 | m1 = machine(model1, X, y) 98 | m2 = machine(model2, X, y) 99 | 100 | y1 = predict(m1, X); 101 | y2 = predict(m2, X); 102 | 103 | X_judge = MLJ.table(hcat(y1, y2)) 104 | yhat = predict(m_judge, X_judge) 105 | 106 | @from_network machine(Deterministic(), X, y; predict=yhat) begin 107 | mutable struct MyStack 108 | regressor1=model1 109 | regressor2=model2 110 | judge=judge 111 | end 112 | end 113 | 114 | my_stack = MyStack() 115 | 116 | # For the curious: Only the last block defines the new model type. The 117 | # rest defines a *[learning network]()* - a kind of working prototype 118 | # or blueprint for the type. If the source nodes `X` and `y` wrap some 119 | # data (instead of nothing) then the network can be trained and tested 120 | # as you build it. 121 | 122 | 123 | # ## Composition plays well with other work-flows 124 | 125 | # We did not include standardization of inputs and target (with 126 | # post-prediction inversion) in our stack. However, we can add these 127 | # now, using MLJ's canned pipeline composition: 128 | 129 | pipe = @pipeline Standardizer my_stack target=Standardizer 130 | 131 | # Want to change a base learner and adjudicator? 132 | 133 | DecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree; 134 | KNNRegressor = @load KNNRegressor; 135 | pipe.my_stack.regressor2 = DecisionTreeRegressor() 136 | pipe.my_stack.judge = KNNRegressor(); 137 | 138 | # Want a CV estimate of performance of the complete model on some data? 139 | 140 | X, y = @load_boston; 141 | mach = machine(pipe, X, y) 142 | evaluate!(mach, resampling=CV(), measure=mae) 143 | 144 | # Want to inspect the learned parameters of the adjudicator? 145 | 146 | fp = fitted_params(mach); 147 | fp.my_stack.judge 148 | 149 | # What about the first base-learner of the stack? There are four sets 150 | # of learned parameters! 
One for each fold to make an out-of-sample 151 | # prediction, and one trained on all the data: 152 | 153 | fp.my_stack.regressor1 154 | 155 | #- 156 | 157 | fp.my_stack.regressor1[1].coefs 158 | 159 | # Want to tune multiple (nested) hyperparameters in the stack? Tuning is a 160 | # model wrapper (for better composition!): 161 | 162 | r1 = range(pipe, :(my_stack.regressor2.max_depth), lower = 1, upper = 25, scale=:linear) 163 | r2 = range(pipe, :(my_stack.judge.K), lower=1, origin=10, unit=10, scale=:log10) 164 | 165 | import Distributions.Poisson 166 | 167 | tuned_pipe = TunedModel(model=pipe, 168 | ranges=[r1, (r2, Poisson)], 169 | tuning=RandomSearch(), 170 | resampling=CV(), 171 | measure=rms, 172 | n=100) 173 | mach = machine(tuned_pipe, X, y) |> fit! 174 | best_model = fitted_params(mach).best_model 175 | K = fitted_params(mach).best_model.my_stack.judge.K; 176 | max_depth = fitted_params(mach).best_model.my_stack.regressor2.max_depth 177 | @show K max_depth; 178 | 179 | # Visualize tuning results: 180 | 181 | using Plots 182 | gr(size=(700,700*(sqrt(5) - 1)/2)) 183 | plt = plot(mach) 184 | savefig("stacking.png") 185 | plt #!md 186 | 187 | # ![](stacking.png) 188 | 189 | using Literate #src 190 | Literate.markdown(@__FILE__, @__DIR__, execute=false) #src 191 | Literate.notebook(@__FILE__, @__DIR__, execute=true) #src 192 | -------------------------------------------------------------------------------- /wow.md: -------------------------------------------------------------------------------- 1 | ```@meta 2 | EditURL = "/wow.jl" 3 | ``` 4 | 5 | # State-of-the-art model composition in MLJ (Machine Learning in Julia) 6 | 7 | In this script we use model stacking to demonstrate the ease with 8 | which machine learning models can be combined in sophisticated ways 9 | using MLJ. In practice, one would use MLJ's [canned stacking model 10 | constructor](https://alan-turing-institute.github.io/MLJ.jl/dev/model_stacking/#Model-Stacking) 11 | `Stack`. Here, however, we give a quick demonstation how you would 12 | build a stack yourself, using MLJ's generic model composition 13 | syntax, which is an extension of the normal fit/predict syntax. 14 | 15 | For a more leisurely notebook on the same material, see 16 | [this](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/stacking/) 17 | tutorial. 18 | 19 | ````@example wow 20 | DIR = @__DIR__ 21 | include(joinpath(DIR, "setup.jl")) 22 | ```` 23 | 24 | ## Stacking is hard 25 | 26 | [Model 27 | stacking](https://alan-turing-institute.github.io/DataScienceTutorials.jl/getting-started/stacking/), 28 | popular in Kaggle data science competitions, is a sophisticated way 29 | to blend the predictions of multiple models. 30 | 31 | With the python toolbox 32 | [scikit-learn](https://scikit-learn.org/stable/) (or its [julia 33 | wrap](https://github.com/cstjean/ScikitLearn.jl)) you can use 34 | pipelines to combine composite models in simple ways but (automated) 35 | stacking is beyond its capabilities. 36 | 37 | One python alternative is to use 38 | [vecstack](https://github.com/vecxoz/vecstack). The [core 39 | algorithm](https://github.com/vecxoz/vecstack/blob/master/vecstack/core.py) 40 | is about eight pages (without the scikit-learn interface): 41 | 42 | ![](vecstack.png). 43 | 44 | ## Stacking is easy (in MLJ) 45 | 46 | Using MLJ's [generic model composition 47 | API](https://alan-turing-institute.github.io/MLJ.jl/dev/composing_models/) 48 | you can build a stack in about a page. 
49 | 50 | Here's the complete code needed to define a new model type that 51 | stacks two base regressors and one adjudicator in MLJ. Here we use 52 | three folds to create the base-learner [out-of-sample 53 | predictions](https://alan-turing-institute.github.io/DataScienceTutorials.jl/getting-started/stacking/) 54 | to make it easier to read. You can make this generic with little fuss. 55 | 56 | ````@example wow 57 | using MLJ 58 | 59 | folds(data, nfolds) = 60 | partition(1:nrows(data), (1/nfolds for i in 1:(nfolds-1))...); 61 | nothing #hide 62 | ```` 63 | 64 | these models are only going to be default choices for the stack: 65 | 66 | ````@example wow 67 | LinearRegressor = @load LinearRegressor pkg=MLJLinearModels 68 | model1 = LinearRegressor() 69 | model2 = LinearRegressor() 70 | judge = LinearRegressor() 71 | 72 | X = source() 73 | y = source() 74 | 75 | folds(X::AbstractNode, nfolds) = node(XX->folds(XX, nfolds), X) 76 | MLJ.restrict(X::AbstractNode, f::AbstractNode, i) = 77 | node((XX, ff) -> restrict(XX, ff, i), X, f); 78 | MLJ.corestrict(X::AbstractNode, f::AbstractNode, i) = 79 | node((XX, ff) -> corestrict(XX, ff, i), X, f); 80 | 81 | f = folds(X, 3) 82 | 83 | m11 = machine(model1, corestrict(X, f, 1), corestrict(y, f, 1)) 84 | m12 = machine(model1, corestrict(X, f, 2), corestrict(y, f, 2)) 85 | m13 = machine(model1, corestrict(X, f, 3), corestrict(y, f, 3)) 86 | 87 | y11 = predict(m11, restrict(X, f, 1)); 88 | y12 = predict(m12, restrict(X, f, 2)); 89 | y13 = predict(m13, restrict(X, f, 3)); 90 | 91 | m21 = machine(model2, corestrict(X, f, 1), corestrict(y, f, 1)) 92 | m22 = machine(model2, corestrict(X, f, 2), corestrict(y, f, 2)) 93 | m23 = machine(model2, corestrict(X, f, 3), corestrict(y, f, 3)) 94 | 95 | y21 = predict(m21, restrict(X, f, 1)); 96 | y22 = predict(m22, restrict(X, f, 2)); 97 | y23 = predict(m23, restrict(X, f, 3)); 98 | 99 | y1_oos = vcat(y11, y12, y13); 100 | y2_oos = vcat(y21, y22, y23); 101 | 102 | X_oos = MLJ.table(hcat(y1_oos, y2_oos)) 103 | 104 | m_judge = machine(judge, X_oos, y) 105 | 106 | m1 = machine(model1, X, y) 107 | m2 = machine(model2, X, y) 108 | 109 | y1 = predict(m1, X); 110 | y2 = predict(m2, X); 111 | 112 | X_judge = MLJ.table(hcat(y1, y2)) 113 | yhat = predict(m_judge, X_judge) 114 | 115 | @from_network machine(Deterministic(), X, y; predict=yhat) begin 116 | mutable struct MyStack 117 | regressor1=model1 118 | regressor2=model2 119 | judge=judge 120 | end 121 | end 122 | 123 | my_stack = MyStack() 124 | ```` 125 | 126 | For the curious: Only the last block defines the new model type. The 127 | rest defines a *[learning network]()* - a kind of working prototype 128 | or blueprint for the type. If the source nodes `X` and `y` wrap some 129 | data (instead of nothing) then the network can be trained and tested 130 | as you build it. 131 | 132 | ## Composition plays well with other work-flows 133 | 134 | We did not include standardization of inputs and target (with 135 | post-prediction inversion) in our stack. However, we can add these 136 | now, using MLJ's canned pipeline composition: 137 | 138 | ````@example wow 139 | pipe = @pipeline Standardizer my_stack target=Standardizer 140 | ```` 141 | 142 | Want to change a base learner and adjudicator? 
143 | 144 | ````@example wow 145 | DecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree; 146 | KNNRegressor = @load KNNRegressor; 147 | pipe.my_stack.regressor2 = DecisionTreeRegressor() 148 | pipe.my_stack.judge = KNNRegressor(); 149 | nothing #hide 150 | ```` 151 | 152 | Want a CV estimate of performance of the complete model on some data? 153 | 154 | ````@example wow 155 | X, y = @load_boston; 156 | mach = machine(pipe, X, y) 157 | evaluate!(mach, resampling=CV(), measure=mae) 158 | ```` 159 | 160 | Want to inspect the learned parameters of the adjudicator? 161 | 162 | ````@example wow 163 | fp = fitted_params(mach); 164 | fp.my_stack.judge 165 | ```` 166 | 167 | What about the first base-learner of the stack? There are four sets 168 | of learned parameters! One for each fold to make an out-of-sample 169 | prediction, and one trained on all the data: 170 | 171 | ````@example wow 172 | fp.my_stack.regressor1 173 | ```` 174 | 175 | ````@example wow 176 | fp.my_stack.regressor1[1].coefs 177 | ```` 178 | 179 | Want to tune multiple (nested) hyperparameters in the stack? Tuning is a 180 | model wrapper (for better composition!): 181 | 182 | ````@example wow 183 | r1 = range(pipe, :(my_stack.regressor2.max_depth), lower = 1, upper = 25, scale=:linear) 184 | r2 = range(pipe, :(my_stack.judge.K), lower=1, origin=10, unit=10, scale=:log10) 185 | 186 | import Distributions.Poisson 187 | 188 | tuned_pipe = TunedModel(model=pipe, 189 | ranges=[r1, (r2, Poisson)], 190 | tuning=RandomSearch(), 191 | resampling=CV(), 192 | measure=rms, 193 | n=100) 194 | mach = machine(tuned_pipe, X, y) |> fit! 195 | best_model = fitted_params(mach).best_model 196 | K = fitted_params(mach).best_model.my_stack.judge.K; 197 | max_depth = fitted_params(mach).best_model.my_stack.regressor2.max_depth 198 | @show K max_depth; 199 | nothing #hide 200 | ```` 201 | 202 | Visualize tuning results: 203 | 204 | ````@example wow 205 | using Plots 206 | gr(size=(700,700*(sqrt(5) - 1)/2)) 207 | plt = plot(mach) 208 | savefig("stacking.png") 209 | ```` 210 | 211 | ![](stacking.png) 212 | 213 | --- 214 | 215 | *This page was generated using [Literate.jl](https://github.com/fredrikekre/Literate.jl).* 216 | 217 | --------------------------------------------------------------------------------