├── .gitignore
├── Deploying Machine Learning Models in Production
└── C4W3-Lab
│ ├── C4_W3_Lab_1_Kubeflow_Pipelines.ipynb
│ └── img
│ ├── complete.png
│ ├── complete_pipeline.png
│ ├── dag_kfp.png
│ ├── highlevel.jpg
│ ├── kfp_ui.png
│ ├── logs.png
│ ├── progress.png
│ ├── simple_dag.jpg
│ ├── upload.png
│ ├── upload_pipeline.png
│ └── url.png
├── Machine Learning Data Lifecycle in Production
├── C2W1-Assignment
│ ├── C2W1_Assignment.ipynb
│ └── dataset_diabetes
│ │ └── diabetic_data.csv
├── C2W2-Assignment
│ ├── C2W2_Assignment.ipynb
│ └── img
│ │ └── feature_eng_pipeline.png
└── C2W3-Assignment
│ ├── C2W3_Assignment.ipynb
│ └── util.py
└── Machine Learning Modeling Pipelines in Production
├── C3W1-Lab
├── C3_W1_Lab_1_Keras_Tuner.ipynb
├── C3_W1_Lab_2_TFX_Tuner_and_Trainer.ipynb
├── fmnist_transform.py
├── trainer.py
└── tuner.py
├── C3W2-Lab
├── C3_W2_Lab_1_Manual_Dimensionality.ipynb
├── C3_W2_Lab_2_Algorithmic_Dimensionality.ipynb
├── C3_W2_Lab_3_Quantization_and_Pruning.ipynb
├── dnn_model.png
└── dnn_model_engineered.png
├── C3W3-Assignment
├── lab-files
│ ├── Dockerfile
│ ├── Kptfile
│ ├── README.md
│ ├── distributed-training-gke
│ │ ├── Dockerfile
│ │ ├── Kptfile
│ │ ├── README.md
│ │ ├── mnist
│ │ │ ├── __init__.py
│ │ │ ├── main.py
│ │ │ └── model.py
│ │ └── tfjob.yaml
│ ├── mnist
│ │ ├── __init__.py
│ │ ├── main.py
│ │ └── model.py
│ └── tfjob.yaml
├── tf-distributed-training-kubeflow.ipynb
└── tf-training
│ ├── Kptfile
│ ├── OWNERS
│ ├── tf-job-crds
│ ├── base
│ │ ├── crd.yaml
│ │ └── kustomization.yaml
│ └── overlays
│ │ └── application
│ │ ├── application.yaml
│ │ └── kustomization.yaml
│ ├── tf-job-operator
│ ├── base
│ │ ├── cluster-role-binding.yaml
│ │ ├── cluster-role.yaml
│ │ ├── deployment.yaml
│ │ ├── kustomization.yaml
│ │ ├── params.env
│ │ ├── service-account.yaml
│ │ └── service.yaml
│ └── overlays
│ │ └── application
│ │ ├── application.yaml
│ │ └── kustomization.yaml
│ └── tf-training
│ ├── Kptfile
│ ├── OWNERS
│ ├── tf-job-crds
│ ├── base
│ │ ├── crd.yaml
│ │ └── kustomization.yaml
│ └── overlays
│ │ └── application
│ │ ├── application.yaml
│ │ └── kustomization.yaml
│ └── tf-job-operator
│ ├── base
│ ├── cluster-role-binding.yaml
│ ├── cluster-role.yaml
│ ├── deployment.yaml
│ ├── kustomization.yaml
│ ├── params.env
│ ├── service-account.yaml
│ └── service.yaml
│ └── overlays
│ └── application
│ ├── application.yaml
│ └── kustomization.yaml
├── C3W3-Lab
├── C3_W3_Lab_1_Distributed_Training.ipynb
├── C3_W3_Lab_2_Knowledge_Distillation.ipynb
└── model.png
├── C3W4-Assignment
└── xgboost_caip_e2e.ipynb
├── C3W4-Lab
├── C3_W4_Lab_1_TFMA.ipynb
└── C3_W4_Lab_2_TFX_Evaluator.ipynb
└── C3W5-Lab
├── C3_W5_Lab_1_Shap_Values.ipynb
└── C3_W5_Lab_2_Permutation_Importance.ipynb
/.gitignore:
--------------------------------------------------------------------------------
1 | .ipynb_checkpoints
2 | data
--------------------------------------------------------------------------------
/Deploying Machine Learning Models in Production/C4W3-Lab/C4_W3_Lab_1_Kubeflow_Pipelines.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {
6 | "id": "BE97DJ2_2gYM"
7 | },
8 | "source": [
9 | "# Ungraded Lab: Building ML Pipelines with Kubeflow"
10 | ]
11 | },
12 | {
13 | "cell_type": "markdown",
14 | "metadata": {
15 | "id": "EzUU3ZPtib8K"
16 | },
17 | "source": [
18 | "In this lab, you will have some hands-on practice with [Kubeflow Pipelines](https://www.kubeflow.org/docs/components/pipelines/overview/pipelines-overview/). As mentioned in the lectures, modern ML engineering is moving towards pipeline automation for rapid iteration and experiment tracking. This is especially useful in production deployments where models need to be frequently retrained to catch trends in newer data.\n",
19 | "\n",
20 | "Kubeflow Pipelines is one component of the [Kubeflow](https://www.kubeflow.org/) suite of tools for machine learning workflows. It is deployed on top of a Kubernetes cluster and builds an infrastructure for orchestrating ML pipelines and monitoring inputs and outputs of each component. You will use this tool in Google Cloud Platform in the first assignment this week and this lab will help prepare you for that by exploring its features on a local deployment. In particular, you will:\n",
21 | "\n",
22 | "* set up [Kubeflow Pipelines](https://www.kubeflow.org/docs/components/pipelines/overview/pipelines-overview/) on your local workstation\n",
23 | "* get familiar with the Kubeflow Pipelines UI\n",
24 | "* build pipeline components with Python and the Kubeflow Pipelines SDK\n",
25 | "* run an ML pipeline with Kubeflow Pipelines\n",
26 | "\n",
27 | "Let's begin!"
28 | ]
29 | },
30 | {
31 | "cell_type": "markdown",
32 | "metadata": {
33 | "id": "uOZgYS16iqAo"
34 | },
35 | "source": [
36 | "## Setup\n",
37 | "\n",
38 | "You will need these tools installed on your local machine to complete the exercises:\n",
39 | "\n",
40 | "1. Docker - platform for building and running containerized applications. You should already have this installed from the previous ungraded labs. If not, you can see the instructions [here](https://docs.docker.com/get-docker/). If you are using Docker for Desktop (Mac or Windows), you may need to increase the resource limits to start Kubeflow Pipelines later. You can click on the Docker icon in your Task Bar, choose `Preferences`, and adjust the CPU to 4, Storage to 50GB, and the memory to at least 4GB (8GB recommended). Just make sure you are not maxing out any of these limits (i.e. the slider should ideally be at the midpoint or less) since it can make your machine slow or unresponsive. If you're constrained on resources, don't worry. You can still use this notebook as a reference since we'll show the expected outputs at each step. The important thing is to become familiar with Kubeflow Pipelines before you get more hands-on in the assignment.\n",
41 | "\n",
42 | "2. kubectl - tool for running commands on Kubernetes clusters. This should also be installed from the previous labs. If not, please see the instructions [here](https://kubernetes.io/docs/tasks/tools/)\n",
43 | "\n",
44 | "3. [kind](https://kind.sigs.k8s.io/) - a Kubernetes distribution for running local clusters using Docker. Please follow the instructions [here](https://www.kubeflow.org/docs/components/pipelines/installation/localcluster-deployment/#kind) to install kind and create a local cluster.\n",
45 | "\n",
46 | "4. Kubeflow Pipelines - a platform for building and deploying portable, scalable machine learning (ML) workflows based on Docker containers. Once you've created a local cluster using kind, you can deploy Kubeflow Pipelines with these commands.\n",
47 | "\n",
48 | "```\n",
49 | "export PIPELINE_VERSION=1.7.0\n",
50 | "kubectl apply -k \"github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION\"\n",
51 | "kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io\n",
52 | "kubectl apply -k \"github.com/kubeflow/pipelines/manifests/kustomize/env/platform-agnostic-pns?ref=$PIPELINE_VERSION\"\n",
53 | "```\n",
54 | "\n",
55 | "You can enter the commands above one line at a time. They will set up all the deployments and spin up the pods for the entire application in the `kubeflow` namespace. After sending the last command, it will take a while (around 30 minutes) for all the deployments to be ready. You can run `kubectl get deploy -n kubeflow` a few times to check the status. You should see all deployments with the `READY` status before you proceed to the next section.\n",
56 | "\n",
57 | "```\n",
58 | "NAME READY UP-TO-DATE AVAILABLE AGE\n",
59 | "cache-deployer-deployment 1/1 1 1 21h\n",
60 | "cache-server 1/1 1 1 21h\n",
61 | "metadata-envoy-deployment 1/1 1 1 21h\n",
62 | "metadata-grpc-deployment 1/1 1 1 21h\n",
63 | "metadata-writer 1/1 1 1 21h\n",
64 | "minio 1/1 1 1 21h\n",
65 | "ml-pipeline 1/1 1 1 21h\n",
66 | "ml-pipeline-persistenceagent 1/1 1 1 21h\n",
67 | "ml-pipeline-scheduledworkflow 1/1 1 1 21h\n",
68 | "ml-pipeline-ui 1/1 1 1 21h\n",
69 | "ml-pipeline-viewer-crd 1/1 1 1 21h\n",
70 | "ml-pipeline-visualizationserver 1/1 1 1 21h\n",
71 | "mysql 1/1 1 1 21h\n",
72 | "workflow-controller 1/1 1 1 21h\n",
73 | "```\n",
74 | "\n",
75 | "When everything is ready, you can run the following command to access the `ml-pipeline-ui` service.\n",
76 | "\n",
77 | "```\n",
78 | "kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8080:80\n",
79 | "```\n",
80 | "\n",
81 | "The terminal should respond with something like this:\n",
82 | "\n",
83 | "```\n",
84 | "Forwarding from 127.0.0.1:8080 -> 3000\n",
85 | "Forwarding from [::1]:8080 -> 3000\n",
86 | "```\n",
87 | "\n",
88 | "You can then open your browser and go to `http://localhost:8080` to see the user interface.\n",
89 | "\n",
90 | "*(screenshot: the Kubeflow Pipelines UI)*"
91 | ]
92 | },
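{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Optional:* if you prefer checking the deployments from Python instead of `kubectl get deploy -n kubeflow`, the sketch below does the same check with the official `kubernetes` client. This is only an illustrative snippet (not part of the lab) and it assumes you have run `pip install kubernetes` and that your kubeconfig points at the kind cluster.\n",
"\n",
"```python\n",
"# Hedged sketch: list the Kubeflow deployments and report how many replicas are ready.\n",
"# Assumes `pip install kubernetes` and a kubeconfig for the local kind cluster.\n",
"from kubernetes import client, config\n",
"\n",
"config.load_kube_config()  # use the active kubeconfig context\n",
"apps = client.AppsV1Api()\n",
"\n",
"for dep in apps.list_namespaced_deployment(namespace=\"kubeflow\").items:\n",
"    ready = dep.status.ready_replicas or 0\n",
"    desired = dep.spec.replicas or 0\n",
"    print(f\"{dep.metadata.name}: {ready}/{desired} ready\")\n",
"```"
]
},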
93 | {
94 | "cell_type": "markdown",
95 | "metadata": {
96 | "id": "LbEdKUHBvLdi"
97 | },
98 | "source": [
99 | "## Operationalizing your ML Pipelines\n",
100 | "\n",
101 | "As you know, generating a trained model involves executing a sequence of steps. Here is a high level overview of what these steps might look like:\n",
102 | "\n",
103 | "*(figure: high-level overview of the steps for generating a trained model)*\n",
104 | "\n",
105 | "If you recall the very first model you ever built, your code more likely than not followed a similar flow. In essence, building an ML pipeline mainly involves implementing these steps, but you will need to optimize your operations to deliver value to your team. Platforms such as Kubeflow help you build ML pipelines that are automated, reproducible, and easily monitored. You will see this as you build your pipeline in the sections below."
106 | ]
107 | },
108 | {
109 | "cell_type": "markdown",
110 | "metadata": {
111 | "id": "pWrq6Ean7ZVE"
112 | },
113 | "source": [
114 | "### Pipeline components\n",
115 | "\n",
116 | "The main building blocks of your ML pipeline are referred to as [components](https://www.kubeflow.org/docs/components/pipelines/overview/concepts/component/). In the context of Kubeflow, these are containerized applications that run a specific task in the pipeline. Moreover, these components generate and consume *artifacts* from other components. For example, a download task will generate a dataset artifact and this will be consumed by a data splitting task. If you go back to the simple pipeline image above and describe it using tasks and artifacts, it will look something like this:\n",
117 | "\n",
118 | "*(figure: the simple pipeline described as tasks and artifacts)*\n",
119 | "\n",
120 | "This relationship between tasks and their artifacts is what constitutes a pipeline, and it is also called a [directed acyclic graph (DAG)](https://en.wikipedia.org/wiki/Directed_acyclic_graph).\n",
121 | "\n",
122 | "Kubeflow Pipelines lets you create components either by [building the component specification directly](https://www.kubeflow.org/docs/components/pipelines/sdk/component-development/#component-spec) or through [Python functions](https://www.kubeflow.org/docs/components/pipelines/sdk/python-function-components/). For this lab, you will use the latter since it is more intuitive and allows for quick iteration. As you gain more experience, you can explore building the component specification directly, especially if you want to use languages other than Python.\n",
123 | "\n",
124 | "You will begin by installing the Kubeflow Pipelines SDK. Remember to restart the runtime to load the newly installed modules in Colab."
125 | ]
126 | },
127 | {
128 | "cell_type": "code",
129 | "execution_count": null,
130 | "metadata": {
131 | "id": "4IvRt6wC2n8Y"
132 | },
133 | "outputs": [],
134 | "source": [
135 | "# Install the KFP SDK\n",
136 | "!pip install --upgrade kfp"
137 | ]
138 | },
139 | {
140 | "cell_type": "markdown",
141 | "metadata": {
142 | "id": "7GVWoTzD7eT4"
143 | },
144 | "source": [
145 | "**Note:** *Please do not proceed to the next steps without restarting the Runtime after installing `kfp`. You can do that by either pressing the `Restart Runtime` button at the end of the cell output above, or going to the `Runtime` button at the Colab toolbar above and selecting `Restart Runtime`.*"
146 | ]
147 | },
148 | {
149 | "cell_type": "markdown",
150 | "metadata": {
151 | "id": "DmZeOyVu8MyJ"
152 | },
153 | "source": [
154 | "Now you will import the modules you will be using to construct the Kubeflow pipeline. You will learn more about what these are for in the next sections."
155 | ]
156 | },
157 | {
158 | "cell_type": "code",
159 | "execution_count": null,
160 | "metadata": {
161 | "id": "cSt2DEJA2ttR"
162 | },
163 | "outputs": [],
164 | "source": [
165 | "# Import the modules you will use\n",
166 | "import kfp\n",
167 | "\n",
168 | "# For creating the pipeline\n",
169 | "from kfp.v2 import dsl\n",
170 | "\n",
171 | "# For building components\n",
172 | "from kfp.v2.dsl import component\n",
173 | "\n",
174 | "# Type annotations for the component artifacts\n",
175 | "from kfp.v2.dsl import (\n",
176 | " Input,\n",
177 | " Output,\n",
178 | " Artifact,\n",
179 | " Dataset,\n",
180 | " Model,\n",
181 | " Metrics\n",
182 | ")"
183 | ]
184 | },
185 | {
186 | "cell_type": "markdown",
187 | "metadata": {
188 | "id": "MV8AZsyW8ahR"
189 | },
190 | "source": [
191 | "In this lab, you will build a pipeline to train a multi-output model on the [Energy Efficiency dataset from the UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/Energy+efficiency). It uses the building features (e.g. wall area, roof area) as inputs and has two outputs: Cooling Load and Heating Load. You will follow the five-task graph above with some slight differences in the generated artifacts.\n",
192 | "\n",
193 | "You will now build the component to load your data into the pipeline. The code is shown below and we will discuss the syntax in more detail after running it."
194 | ]
195 | },
196 | {
197 | "cell_type": "code",
198 | "execution_count": null,
199 | "metadata": {
200 | "id": "gT4SZtZM22Gc"
201 | },
202 | "outputs": [],
203 | "source": [
204 | "@component(\n",
205 | " packages_to_install=[\"pandas\", \"openpyxl\"],\n",
206 | " output_component_file=\"download_data_component.yaml\"\n",
207 | ")\n",
208 | "def download_data(url:str, output_csv:Output[Dataset]):\n",
209 | " import pandas as pd\n",
210 | "\n",
211 | " # Use pandas excel reader\n",
212 | " df = pd.read_excel(url)\n",
213 | " df = df.sample(frac=1).reset_index(drop=True)\n",
214 | " df.to_csv(output_csv.path, index=False)"
215 | ]
216 | },
217 | {
218 | "cell_type": "markdown",
219 | "metadata": {
220 | "id": "UAa5GSbDaJpd"
221 | },
222 | "source": [
223 | "When building a component, it's good to first determine its inputs and outputs.\n",
224 | "\n",
225 | "* The dataset you want to download is an Excel file hosted by UCI [here](https://archive.ics.uci.edu/ml/machine-learning-databases/00242/ENB2012_data.xlsx), and you can load it using Pandas. Instead of hardcoding the URL in your code, you can design your function to accept an *input* string parameter so you can use other URLs in case the data is moved elsewhere.\n",
226 | "\n",
227 | "* For the *output*, you will want to pass the downloaded dataset to the next task (i.e. data splitting). You should assign this as an `Output` type and specify what kind of artifact it is. Kubeflow provides [several of these](https://github.com/kubeflow/pipelines/blob/master/sdk/python/kfp/v2/components/types/artifact_types.py) such as `Dataset`, `Model`, `Metrics`, etc. All artifacts are saved by Kubeflow to a storage server. For local deployments, the default will be a [MinIO](https://min.io/) server. The [path](https://github.com/kubeflow/pipelines/blob/master/sdk/python/kfp/v2/components/types/artifact_types.py#L51) property fetches the location where this artifact will be saved and that's what you did above when you called `df.to_csv(output_csv.path, index=False)`\n",
228 | "\n",
229 | "The inputs and outputs are declared as parameters in the function definition. As you can see in the code we defined a `url` parameter with a `str` type and an `output_csv` parameter with an `Output[Dataset]` type.\n",
230 | "\n",
231 | "Lastly, you'll need to use the `component` decorator to specify that this is a Kubeflow Pipeline component. The [documentation](https://github.com/kubeflow/pipelines/blob/master/sdk/python/kfp/v2/components/component_decorator.py#L23) shows several parameters you can set, and two of them are used in the code above. As the name suggests, the `packages_to_install` argument declares any extra packages outside the base image that are needed to run your code. As of this writing, the default base image is `python:3.7`, so you'll need `pandas` and `openpyxl` to load the Excel file.\n",
232 | "\n",
233 | "The `output_component_file` is an output file that contains the specification for your newly built component. You should see it in the Colab file explorer once you've run the cell above. You'll see your code there and other settings that pertain to your component. You can use this file when building other pipelines if necessary. You don't have to rewrite the code in your next project as long as you have this YAML file. You can also pass it to your team members or use it on another machine. Kubeflow also hosts other reusable modules in their repo [here](https://github.com/kubeflow/pipelines/tree/master/components). For example, if you want a file downloader component in one of your projects, you can load the component from that repo using the [load_component_from_url](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.components.html#kfp.components.ComponentStore.load_component_from_url) function as shown below. The [YAML file](https://raw.githubusercontent.com/kubeflow/pipelines/master/components/web/Download/component-sdk-v2.yaml) of that component should tell you the inputs and outputs so you can use it accordingly.\n",
234 | "\n",
235 | "```\n",
236 | "web_downloader_op = kfp.components.load_component_from_url(\n",
237 | " 'https://raw.githubusercontent.com/kubeflow/pipelines/master/components/web/Download/component-sdk-v2.yaml')\n",
238 | "```"
239 | ]
240 | },
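{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick illustration of that reuse, the `download_data_component.yaml` written above can be loaded back as a component factory without copying any Python code. This is a minimal sketch only, assuming the YAML file sits in your working directory:\n",
"\n",
"```python\n",
"import kfp\n",
"\n",
"# Load the component factory from the spec written by `output_component_file`\n",
"download_data_op = kfp.components.load_component_from_file('download_data_component.yaml')\n",
"\n",
"# Inside a @dsl.pipeline function you could then call it like any other component, e.g.:\n",
"#     download_task = download_data_op(url='<your-dataset-url>')\n",
"```"
]
},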
241 | {
242 | "cell_type": "markdown",
243 | "metadata": {
244 | "id": "8sNacAzvh6Ei"
245 | },
246 | "source": [
247 | "Next, you will build the second component in the pipeline. Like in the previous step, you should design it first with inputs and outputs in mind. You know that the input of this component will come from the artifact generated by the `download_data()` function above. To declare input artifacts, you can annotate your parameter with the `Input[Dataset]` data type as shown below. For the outputs, you want two: the train and test datasets. You can see the implementation below:"
248 | ]
249 | },
250 | {
251 | "cell_type": "code",
252 | "execution_count": null,
253 | "metadata": {
254 | "id": "zpItc-Ob6pnO"
255 | },
256 | "outputs": [],
257 | "source": [
258 | "@component(\n",
259 | " packages_to_install=[\"pandas\", \"scikit-learn\"],\n",
260 | " output_component_file=\"split_data_component.yaml\"\n",
261 | ")\n",
262 | "def split_data(input_csv: Input[Dataset], train_csv: Output[Dataset], test_csv: Output[Dataset]):\n",
263 | " import pandas as pd\n",
264 | " from sklearn.model_selection import train_test_split\n",
265 | "\n",
266 | " df = pd.read_csv(input_csv.path)\n",
267 | " train, test = train_test_split(df, test_size=0.2)\n",
268 | "\n",
269 | " train.to_csv(train_csv.path, index=False)\n",
270 | " test.to_csv(test_csv.path, index=False)"
271 | ]
272 | },
273 | {
274 | "cell_type": "markdown",
275 | "metadata": {
276 | "id": "9ZM0MDM4qweD"
277 | },
278 | "source": [
279 | "### Building and Running a Pipeline"
280 | ]
281 | },
282 | {
283 | "cell_type": "markdown",
284 | "metadata": {
285 | "id": "JTQVk643lDMo"
286 | },
287 | "source": [
288 | "Now that you have at least two components, you can try building a pipeline just to quickly see how it works. The code is shown below. Basically, you just define a function with the sequence of steps then use the `dsl.pipeline` decorator. Notice in the last line (i.e. `split_data_task`) that to get a particular artifact from a previous step, you will need to use the `outputs` dictionary and use the parameter name as the key."
289 | ]
290 | },
291 | {
292 | "cell_type": "code",
293 | "execution_count": null,
294 | "metadata": {
295 | "id": "wZ-U_xsbLOIH"
296 | },
297 | "outputs": [],
298 | "source": [
299 | "@dsl.pipeline(\n",
300 | " name=\"my-pipeline\",\n",
301 | ")\n",
302 | "def my_pipeline(url: str):\n",
303 | " download_data_task = download_data(url=url)\n",
304 | " split_data_task = split_data(input_csv=download_data_task.outputs['output_csv'])"
305 | ]
306 | },
307 | {
308 | "cell_type": "markdown",
309 | "metadata": {
310 | "id": "OQZH5d2omdos"
311 | },
312 | "source": [
313 | "To generate your pipeline specification file, you need to compile your pipeline function using the [`Compiler`](https://kubeflow-pipelines.readthedocs.io/en/stable/source/kfp.compiler.html#kfp.compiler.Compiler) class as shown below."
314 | ]
315 | },
316 | {
317 | "cell_type": "code",
318 | "execution_count": null,
319 | "metadata": {
320 | "id": "JKFD7AGgLvHV"
321 | },
322 | "outputs": [],
323 | "source": [
324 | "kfp.compiler.Compiler(mode=kfp.dsl.PipelineExecutionMode.V2_COMPATIBLE).compile(\n",
325 | " pipeline_func=my_pipeline,\n",
326 | " package_path='pipeline.yaml')"
327 | ]
328 | },
329 | {
330 | "cell_type": "markdown",
331 | "metadata": {
332 | "id": "tfB-1JyInB-s"
333 | },
334 | "source": [
335 | "After running the cell, you'll see a `pipeline.yaml` file in the Colab file explorer. Please download that because it will be needed in the next step.\n",
336 | "\n",
337 | "You can run a pipeline programmatically or from the UI. For this exercise, you will do it from the UI and you will see how it is done programmatically in the Qwiklabs later this week. \n",
338 | "\n",
339 | "Please go back to the Kubeflow Pipelines UI and click `Upload Pipelines` from the `Pipelines` page.\n",
340 | "\n",
341 | "*(screenshots: the Upload Pipeline button on the Pipelines page)*\n",
342 | "\n",
343 | "\n",
344 | "\n",
345 | "Next, select `Upload a file` and choose the `pipeline.yaml` you downloaded earlier then click `Create`. This will open a screen showing your simple DAG (just two tasks). \n",
346 | "\n",
347 | "*(screenshots: uploading pipeline.yaml and the resulting two-task DAG)*\n",
348 | "\n",
349 | "\n",
350 | "\n",
351 | "Click `Create Run`, then scroll to the bottom to input the URL of the Excel file: https://archive.ics.uci.edu/ml/machine-learning-databases/00242/ENB2012_data.xlsx . Then click `Start`.\n",
352 | "\n",
353 | "*(screenshots: creating a run and entering the dataset URL)*\n",
354 | "\n",
355 | "\n",
356 | "\n",
357 | "Select the topmost entry in the `Runs` page and you should see the progress of your run. You can click on the `download-data` box to see more details about that particular task (i.e. the URL input and the container logs). After it turns green, you should also see the output artifact, which you can download by clicking the MinIO link.\n",
358 | "\n",
359 | "*(screenshots: run progress, task details, and container logs)*\n",
360 | "\n",
361 | "\n",
362 | "\n",
363 | "Eventually, both tasks will turn green indicating that the run completed successfully. Nicely done!"
364 | ]
365 | },
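{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, a programmatic run looks roughly like the sketch below. It assumes the `kubectl port-forward` command from the Setup section is still running so the Kubeflow Pipelines API is reachable at `http://localhost:8080`; you will see the official version of this workflow in the Qwiklabs.\n",
"\n",
"```python\n",
"import kfp\n",
"\n",
"# Connect to the Kubeflow Pipelines API exposed by the port-forward\n",
"client = kfp.Client(host='http://localhost:8080')\n",
"\n",
"# Submit the compiled pipeline, passing the Excel file URL as the pipeline parameter\n",
"client.create_run_from_pipeline_package(\n",
"    'pipeline.yaml',\n",
"    arguments={'url': 'https://archive.ics.uci.edu/ml/machine-learning-databases/00242/ENB2012_data.xlsx'}\n",
")\n",
"```"
]
},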
366 | {
367 | "cell_type": "markdown",
368 | "metadata": {
369 | "id": "9eBSFSmuq-l7"
370 | },
371 | "source": [
372 | "### Generate the rest of the components"
373 | ]
374 | },
375 | {
376 | "cell_type": "markdown",
377 | "metadata": {
378 | "id": "yQGXOPvms2sW"
379 | },
380 | "source": [
381 | "Now that you've seen a sample workflow, you can build the rest of the components for preprocessing, model training, and model evaluation. The functions will be longer because the task is more complex. Nonetheless, it follows the same principles as before such as declaring inputs and outputs, and specifying the additional packages.\n",
382 | "\n",
383 | "In the `eval_model()` function, you'll notice the use of [`log_metric()`](https://github.com/kubeflow/pipelines/blob/master/sdk/python/kfp/v2/components/types/artifact_types.py#L123) to record the results. You'll see these in the `Visualizations` tab of that task after it has completed."
384 | ]
385 | },
386 | {
387 | "cell_type": "code",
388 | "execution_count": null,
389 | "metadata": {
390 | "id": "sF6gLo0w6nA4"
391 | },
392 | "outputs": [],
393 | "source": [
394 | "@component(\n",
395 | " packages_to_install=[\"pandas\", \"numpy\"],\n",
396 | " output_component_file=\"preprocess_data_component.yaml\"\n",
397 | ")\n",
398 | "def preprocess_data(input_train_csv: Input[Dataset], input_test_csv: Input[Dataset], \n",
399 | " output_train_x: Output[Dataset], output_test_x: Output[Dataset],\n",
400 | " output_train_y: Output[Artifact], output_test_y: Output[Artifact]):\n",
401 | " \n",
402 | " import pandas as pd\n",
403 | " import numpy as np\n",
404 | " import pickle\n",
405 | " \n",
406 | " def format_output(data):\n",
407 | " y1 = data.pop('Y1')\n",
408 | " y1 = np.array(y1)\n",
409 | " y2 = data.pop('Y2')\n",
410 | " y2 = np.array(y2)\n",
411 | " return y1, y2\n",
412 | "\n",
413 | " def norm(x, train_stats):\n",
414 | " return (x - train_stats['mean']) / train_stats['std']\n",
415 | "\n",
416 | " train = pd.read_csv(input_train_csv.path)\n",
417 | " test = pd.read_csv(input_test_csv.path)\n",
418 | "\n",
419 | " train_stats = train.describe()\n",
420 | "\n",
421 | " # Get Y1 and Y2 as the 2 outputs and format them as np arrays\n",
422 | " train_stats.pop('Y1')\n",
423 | " train_stats.pop('Y2')\n",
424 | " train_stats = train_stats.transpose()\n",
425 | " \n",
426 | " train_Y = format_output(train)\n",
427 | " with open(output_train_y.path, \"wb\") as file:\n",
428 | " pickle.dump(train_Y, file)\n",
429 | " \n",
430 | " test_Y = format_output(test)\n",
431 | " with open(output_test_y.path, \"wb\") as file:\n",
432 | " pickle.dump(test_Y, file)\n",
433 | "\n",
434 | " # Normalize the training and test data\n",
435 | " norm_train_X = norm(train, train_stats)\n",
436 | " norm_test_X = norm(test, train_stats)\n",
437 | "\n",
438 | " norm_train_X.to_csv(output_train_x.path, index=False)\n",
439 | " norm_test_X.to_csv(output_test_x.path, index=False)\n",
440 | "\n",
441 | "\n",
442 | "\n",
443 | "@component(\n",
444 | " packages_to_install=[\"tensorflow\", \"pandas\"],\n",
445 | " output_component_file=\"train_model_component.yaml\"\n",
446 | ")\n",
447 | "def train_model(input_train_x: Input[Dataset], input_train_y: Input[Artifact], \n",
448 | " output_model: Output[Model], output_history: Output[Artifact]):\n",
449 | " import pandas as pd\n",
450 | " import tensorflow as tf\n",
451 | " import pickle\n",
452 | " \n",
453 | " from tensorflow.keras.models import Model\n",
454 | " from tensorflow.keras.layers import Dense, Input\n",
455 | " \n",
456 | " norm_train_X = pd.read_csv(input_train_x.path)\n",
457 | "\n",
458 | " with open(input_train_y.path, \"rb\") as file:\n",
459 | " train_Y = pickle.load(file)\n",
460 | "\n",
461 | " def model_builder(train_X):\n",
462 | "\n",
463 | " # Define model layers.\n",
464 | " input_layer = Input(shape=(len(train_X.columns),))\n",
465 | " first_dense = Dense(units='128', activation='relu')(input_layer)\n",
466 | " second_dense = Dense(units='128', activation='relu')(first_dense)\n",
467 | "\n",
468 | " # Y1 output will be fed directly from the second dense\n",
469 | " y1_output = Dense(units='1', name='y1_output')(second_dense)\n",
470 | " third_dense = Dense(units='64', activation='relu')(second_dense)\n",
471 | "\n",
472 | " # Y2 output will come via the third dense\n",
473 | " y2_output = Dense(units='1', name='y2_output')(third_dense)\n",
474 | "\n",
475 | " # Define the model with the input layer and a list of output layers\n",
476 | " model = Model(inputs=input_layer, outputs=[y1_output, y2_output])\n",
477 | "\n",
478 | " print(model.summary())\n",
479 | "\n",
480 | " return model\n",
481 | "\n",
482 | " model = model_builder(norm_train_X)\n",
483 | "\n",
484 | " # Specify the optimizer, and compile the model with loss functions for both outputs\n",
485 | " optimizer = tf.keras.optimizers.SGD(learning_rate=0.001)\n",
486 | " model.compile(optimizer=optimizer,\n",
487 | " loss={'y1_output': 'mse', 'y2_output': 'mse'},\n",
488 | " metrics={'y1_output': tf.keras.metrics.RootMeanSquaredError(),\n",
489 | " 'y2_output': tf.keras.metrics.RootMeanSquaredError()})\n",
490 | " # Train the model for 100 epochs\n",
491 | " history = model.fit(norm_train_X, train_Y, epochs=100, batch_size=10)\n",
492 | " model.save(output_model.path)\n",
493 | "\n",
494 | " with open(output_history.path, \"wb\") as file:\n",
495 | " pickle.dump(history.history, file)\n",
496 | "\n",
497 | "\n",
498 | "\n",
499 | "@component(\n",
500 | " packages_to_install=[\"tensorflow\", \"pandas\"],\n",
501 | " output_component_file=\"eval_model_component.yaml\"\n",
502 | ")\n",
503 | "def eval_model(input_model: Input[Model], input_history: Input[Artifact], \n",
504 | " input_test_x: Input[Dataset], input_test_y: Input[Artifact], \n",
505 | " MLPipeline_Metrics: Output[Metrics]):\n",
506 | " import pandas as pd\n",
507 | " import tensorflow as tf\n",
508 | " import pickle\n",
509 | "\n",
510 | " model = tf.keras.models.load_model(input_model.path)\n",
511 | " \n",
512 | " norm_test_X = pd.read_csv(input_test_x.path)\n",
513 | "\n",
514 | " with open(input_test_y.path, \"rb\") as file:\n",
515 | " test_Y = pickle.load(file)\n",
516 | "\n",
517 | " # Test the model and print loss and mse for both outputs\n",
518 | " loss, Y1_loss, Y2_loss, Y1_rmse, Y2_rmse = model.evaluate(x=norm_test_X, y=test_Y)\n",
519 | " print(\"Loss = {}, Y1_loss = {}, Y1_rmse = {}, Y2_loss = {}, Y2_rmse = {}\".format(loss, Y1_loss, Y1_rmse, Y2_loss, Y2_rmse))\n",
520 | " \n",
521 | " MLPipeline_Metrics.log_metric(\"loss\", loss)\n",
522 | " MLPipeline_Metrics.log_metric(\"Y1_loss\", Y1_loss)\n",
523 | " MLPipeline_Metrics.log_metric(\"Y2_loss\", Y2_loss)\n",
524 | " MLPipeline_Metrics.log_metric(\"Y1_rmse\", Y1_rmse)\n",
525 | " MLPipeline_Metrics.log_metric(\"Y2_rmse\", Y2_rmse)"
526 | ]
527 | },
528 | {
529 | "cell_type": "markdown",
530 | "metadata": {
531 | "id": "JEsO8UYurD1k"
532 | },
533 | "source": [
534 | "### Build and run the complete pipeline"
535 | ]
536 | },
537 | {
538 | "cell_type": "markdown",
539 | "metadata": {
540 | "id": "7XqEnO97vIwY"
541 | },
542 | "source": [
543 | "You can then build and run the entire pipeline as you did earlier. It will take around 20 minutes for all the tasks to complete, and you can check the `Logs` tab of each task to see how it's going. For instance, you can see the model training epochs there just as you normally would in a notebook environment."
544 | ]
545 | },
546 | {
547 | "cell_type": "code",
548 | "execution_count": null,
549 | "metadata": {
550 | "id": "HqD895So2-h2"
551 | },
552 | "outputs": [],
553 | "source": [
554 | "# Define a pipeline and create a task from a component:\n",
555 | "@dsl.pipeline(\n",
556 | " name=\"my-pipeline\",\n",
557 | ")\n",
558 | "def my_pipeline(url: str):\n",
559 | " \n",
560 | " download_data_task = download_data(url=url)\n",
561 | " \n",
562 | " split_data_task = split_data(input_csv=download_data_task.outputs['output_csv'])\n",
563 | " \n",
564 | " preprocess_data_task = preprocess_data(input_train_csv=split_data_task.outputs['train_csv'],\n",
565 | " input_test_csv=split_data_task.outputs['test_csv'])\n",
566 | " \n",
567 | " train_model_task = train_model(input_train_x=preprocess_data_task.outputs[\"output_train_x\"],\n",
568 | " input_train_y=preprocess_data_task.outputs[\"output_train_y\"])\n",
569 | " \n",
570 | " eval_model_task = eval_model(input_model=train_model_task.outputs[\"output_model\"],\n",
571 | " input_history=train_model_task.outputs[\"output_history\"],\n",
572 | " input_test_x=preprocess_data_task.outputs[\"output_test_x\"],\n",
573 | " input_test_y=preprocess_data_task.outputs[\"output_test_y\"])"
574 | ]
575 | },
576 | {
577 | "cell_type": "code",
578 | "execution_count": null,
579 | "metadata": {
580 | "id": "UNPq9D263A3d"
581 | },
582 | "outputs": [],
583 | "source": [
584 | "kfp.compiler.Compiler(mode=kfp.dsl.PipelineExecutionMode.V2_COMPATIBLE).compile(\n",
585 | " pipeline_func=my_pipeline,\n",
586 | " package_path='pipeline.yaml')"
587 | ]
588 | },
589 | {
590 | "cell_type": "markdown",
591 | "metadata": {},
592 | "source": [
593 | "After you've uploaded and run the entire pipeline, you should see all green boxes and the training metrics in the `Visualizations` tab of the `eval-model` task.\n",
594 | "\n",
595 | "*(screenshot: the completed pipeline with metrics in the Visualizations tab)*"
596 | ]
597 | },
598 | {
599 | "cell_type": "markdown",
600 | "metadata": {
601 | "id": "9bs8p5KZGCgI"
602 | },
603 | "source": [
604 | "## Tear Down\n",
605 | "\n",
606 | "If you're done experimenting with the software and want to free up resources, you can execute the commands below to delete Kubeflow Pipelines from your system:\n",
607 | "\n",
608 | "```\n",
609 | "export PIPELINE_VERSION=1.7.0\n",
610 | "kubectl delete -k \"github.com/kubeflow/pipelines/manifests/kustomize/env/platform-agnostic-pns?ref=$PIPELINE_VERSION\"\n",
611 | "kubectl delete -k \"github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION\"\n",
612 | "```\n",
613 | "\n",
614 | "You can delete the cluster for `kind` with the following:\n",
615 | "```\n",
616 | "kind delete cluster\n",
617 | "```"
618 | ]
619 | },
620 | {
621 | "cell_type": "markdown",
622 | "metadata": {
623 | "id": "PUFoY2iqIHyW"
624 | },
625 | "source": [
626 | "## Wrap Up\n",
627 | "\n",
628 | "This lab demonstrated how you can use Kubeflow Pipelines to build and orchestrate your ML workflows. Having automated, shareable, and modular pipelines is a very useful feature in production deployments so you and your team can monitor and maintain your system more effectively. In the first Qwiklabs this week, you will use Kubeflow Pipelines as part of the Google Cloud AI Platform. You'll see more features implemented there such as integration with Tensorboard and more output visualizations from each component. If you want to know more, you can start with the [Kubeflow Pipelines documentation](https://www.kubeflow.org/docs/components/pipelines/) and start conversations in Discourse. \n",
629 | "\n",
630 | "Great job and on to the next part of the course!"
631 | ]
632 | }
633 | ],
634 | "metadata": {
635 | "colab": {
636 | "collapsed_sections": [],
637 | "name": "C4_W3_Lab_1_Kubeflow_Pipelines.ipynb",
638 | "private_outputs": true,
639 | "provenance": [],
640 | "toc_visible": true
641 | },
642 | "kernelspec": {
643 | "display_name": "Python 3",
644 | "language": "python",
645 | "name": "python3"
646 | },
647 | "language_info": {
648 | "codemirror_mode": {
649 | "name": "ipython",
650 | "version": 3
651 | },
652 | "file_extension": ".py",
653 | "mimetype": "text/x-python",
654 | "name": "python",
655 | "nbconvert_exporter": "python",
656 | "pygments_lexer": "ipython3",
657 | "version": "3.7.4"
658 | }
659 | },
660 | "nbformat": 4,
661 | "nbformat_minor": 1
662 | }
663 |
--------------------------------------------------------------------------------
/Deploying Machine Learning Models in Production/C4W3-Lab/img/complete.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zhulingchen/Machine-Learning-Engineering-for-Production-MLOps-Specialization/4a936bba66fa15a60899b7b2d0d8bddb56633a3a/Deploying Machine Learning Models in Production/C4W3-Lab/img/complete.png
--------------------------------------------------------------------------------
/Deploying Machine Learning Models in Production/C4W3-Lab/img/complete_pipeline.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zhulingchen/Machine-Learning-Engineering-for-Production-MLOps-Specialization/4a936bba66fa15a60899b7b2d0d8bddb56633a3a/Deploying Machine Learning Models in Production/C4W3-Lab/img/complete_pipeline.png
--------------------------------------------------------------------------------
/Deploying Machine Learning Models in Production/C4W3-Lab/img/dag_kfp.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zhulingchen/Machine-Learning-Engineering-for-Production-MLOps-Specialization/4a936bba66fa15a60899b7b2d0d8bddb56633a3a/Deploying Machine Learning Models in Production/C4W3-Lab/img/dag_kfp.png
--------------------------------------------------------------------------------
/Deploying Machine Learning Models in Production/C4W3-Lab/img/highlevel.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zhulingchen/Machine-Learning-Engineering-for-Production-MLOps-Specialization/4a936bba66fa15a60899b7b2d0d8bddb56633a3a/Deploying Machine Learning Models in Production/C4W3-Lab/img/highlevel.jpg
--------------------------------------------------------------------------------
/Deploying Machine Learning Models in Production/C4W3-Lab/img/kfp_ui.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zhulingchen/Machine-Learning-Engineering-for-Production-MLOps-Specialization/4a936bba66fa15a60899b7b2d0d8bddb56633a3a/Deploying Machine Learning Models in Production/C4W3-Lab/img/kfp_ui.png
--------------------------------------------------------------------------------
/Deploying Machine Learning Models in Production/C4W3-Lab/img/logs.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zhulingchen/Machine-Learning-Engineering-for-Production-MLOps-Specialization/4a936bba66fa15a60899b7b2d0d8bddb56633a3a/Deploying Machine Learning Models in Production/C4W3-Lab/img/logs.png
--------------------------------------------------------------------------------
/Deploying Machine Learning Models in Production/C4W3-Lab/img/progress.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zhulingchen/Machine-Learning-Engineering-for-Production-MLOps-Specialization/4a936bba66fa15a60899b7b2d0d8bddb56633a3a/Deploying Machine Learning Models in Production/C4W3-Lab/img/progress.png
--------------------------------------------------------------------------------
/Deploying Machine Learning Models in Production/C4W3-Lab/img/simple_dag.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zhulingchen/Machine-Learning-Engineering-for-Production-MLOps-Specialization/4a936bba66fa15a60899b7b2d0d8bddb56633a3a/Deploying Machine Learning Models in Production/C4W3-Lab/img/simple_dag.jpg
--------------------------------------------------------------------------------
/Deploying Machine Learning Models in Production/C4W3-Lab/img/upload.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zhulingchen/Machine-Learning-Engineering-for-Production-MLOps-Specialization/4a936bba66fa15a60899b7b2d0d8bddb56633a3a/Deploying Machine Learning Models in Production/C4W3-Lab/img/upload.png
--------------------------------------------------------------------------------
/Deploying Machine Learning Models in Production/C4W3-Lab/img/upload_pipeline.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zhulingchen/Machine-Learning-Engineering-for-Production-MLOps-Specialization/4a936bba66fa15a60899b7b2d0d8bddb56633a3a/Deploying Machine Learning Models in Production/C4W3-Lab/img/upload_pipeline.png
--------------------------------------------------------------------------------
/Deploying Machine Learning Models in Production/C4W3-Lab/img/url.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zhulingchen/Machine-Learning-Engineering-for-Production-MLOps-Specialization/4a936bba66fa15a60899b7b2d0d8bddb56633a3a/Deploying Machine Learning Models in Production/C4W3-Lab/img/url.png
--------------------------------------------------------------------------------
/Machine Learning Data Lifecycle in Production/C2W2-Assignment/img/feature_eng_pipeline.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zhulingchen/Machine-Learning-Engineering-for-Production-MLOps-Specialization/4a936bba66fa15a60899b7b2d0d8bddb56633a3a/Machine Learning Data Lifecycle in Production/C2W2-Assignment/img/feature_eng_pipeline.png
--------------------------------------------------------------------------------
/Machine Learning Data Lifecycle in Production/C2W3-Assignment/util.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import pandas as pd
3 | from google.protobuf.json_format import MessageToDict
4 |
5 | def get_records(dataset, num_records):
6 | '''Extracts records from the given dataset.
7 | Args:
8 | dataset (TFRecordDataset): dataset saved by ExampleGen
9 | num_records (int): number of records to preview
10 | '''
11 |
12 | # initialize an empty list
13 | records = []
14 |
15 | # Use the `take()` method to specify how many records to get
16 | for tfrecord in dataset.take(num_records):
17 |
18 | # Get the numpy property of the tensor
19 | serialized_example = tfrecord.numpy()
20 |
21 | # Initialize a `tf.train.Example()` to read the serialized data
22 | example = tf.train.Example()
23 |
24 | # Read the example data (output is a protocol buffer message)
25 | example.ParseFromString(serialized_example)
26 |
27 | # convert the protocol buffer message to a Python dictionary
28 | example_dict = MessageToDict(example)
29 |
30 | # append to the records list
31 | records.append(example_dict)
32 |
33 | return records
34 |
35 | def display_types(types):
36 | # Helper function to render dataframes for the artifact and execution types
37 | table = {'id': [], 'name': []}
38 | for a_type in types:
39 | table['id'].append(a_type.id)
40 | table['name'].append(a_type.name)
41 | return pd.DataFrame(data=table)
42 |
43 | def display_artifacts(store, artifacts, base_dir):
44 | # Helper function to render dataframes for the input artifacts
45 | table = {'artifact id': [], 'type': [], 'uri': []}
46 | for a in artifacts:
47 | table['artifact id'].append(a.id)
48 | artifact_type = store.get_artifact_types_by_id([a.type_id])[0]
49 | table['type'].append(artifact_type.name)
50 | table['uri'].append(a.uri.replace(base_dir, './'))
51 | return pd.DataFrame(data=table)
52 |
53 | def display_properties(store, node):
54 | # Helper function to render dataframes for artifact and execution properties
55 | table = {'property': [], 'value': []}
56 |
57 | for k, v in node.properties.items():
58 | table['property'].append(k)
59 | table['value'].append(
60 | v.string_value if v.HasField('string_value') else v.int_value)
61 |
62 | for k, v in node.custom_properties.items():
63 | table['property'].append(k)
64 | table['value'].append(
65 | v.string_value if v.HasField('string_value') else v.int_value)
66 |
67 | return pd.DataFrame(data=table)
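
# ---------------------------------------------------------------------------
# Example usage (illustrative sketch only; nothing in the assignment calls this).
# The TFRecord path below is a hypothetical placeholder -- replace it with the
# URI of an actual artifact produced by ExampleGen in your pipeline.
#
#   import tensorflow as tf
#   from util import get_records
#
#   dataset = tf.data.TFRecordDataset(
#       'pipeline/CsvExampleGen/examples/1/Split-train/data_tfrecord-00000-of-00001.gz',
#       compression_type='GZIP')
#   print(get_records(dataset, num_records=2))
# ---------------------------------------------------------------------------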
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W1-Lab/C3_W1_Lab_1_Keras_Tuner.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {
6 | "id": "qFdPvlXBOdUN"
7 | },
8 | "source": [
9 | "# Ungraded Lab: Intro to Keras Tuner"
10 | ]
11 | },
12 | {
13 | "cell_type": "markdown",
14 | "metadata": {
15 | "id": "xHxb-dlhMIzW"
16 | },
17 | "source": [
18 | "Developing machine learning models is usually an iterative process. You start with an initial design then reconfigure until you get a model that can be trained efficiently in terms of time and compute resources. As you may already know, these settings that you adjust are called _hyperparameters_. These are the variables that govern the training process and the topology of an ML model. These remain constant over the training process and directly impact the performance of your ML program. \n",
19 | "\n",
20 | "The process of finding the optimal set of hyperparameters is called *hyperparameter tuning* or *hypertuning*, and it is an essential part of a machine learning pipeline. Without it, you might end up with a model that has unnecessary parameters and takes too long to train.\n",
21 | "\n",
22 | "Hyperparameters are of two types:\n",
23 | "1. *Model hyperparameters* which influence model selection such as the number and width of hidden layers\n",
24 | "\n",
25 | "2. *Algorithm hyperparameters* which influence the speed and quality of the learning algorithm such as the learning rate for Stochastic Gradient Descent (SGD) and the number of nearest neighbors for a k Nearest Neighbors (KNN) classifier.\n",
26 | "\n",
27 | "For more complex models, the number of hyperparameters can increase dramatically and tuning them manually can be quite challenging.\n",
28 | "\n",
29 | "In this lab, you will practice hyperparameter tuning with [Keras Tuner](https://keras-team.github.io/keras-tuner/), a package from the Keras team that automates this process. For comparison, you will first train a baseline model with pre-selected hyperparameters, then redo the process with tuned hyperparameters. Some of the examples and discussions here are taken from the [official tutorial provided by Tensorflow](https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/keras/keras_tuner.ipynb#scrollTo=sKwLOzKpFGAj) but we've expounded on a few key parts for clarity.\n",
30 | "\n",
31 | "Let's begin!\n",
32 | "\n",
33 | "**Note: The notebooks in this course are shared with read-only access. To be able to save your work, kindly select File > Save a Copy in Drive from the Colab menu and run the notebook from there. You will need a Gmail account to save a copy.**"
34 | ]
35 | },
36 | {
37 | "cell_type": "markdown",
38 | "metadata": {
39 | "id": "ReV_UXOgCZvx"
40 | },
41 | "source": [
42 | "## Download and prepare the dataset\n",
43 | "\n",
44 | "Let us first load the [Fashion MNIST dataset](https://github.com/zalandoresearch/fashion-mnist) into your workspace. You will use this to train a machine learning model that classifies images of clothing."
45 | ]
46 | },
47 | {
48 | "cell_type": "code",
49 | "execution_count": 1,
50 | "metadata": {
51 | "id": "ysAmHLZoDld7"
52 | },
53 | "outputs": [],
54 | "source": [
55 | "# Import keras\n",
56 | "from tensorflow import keras"
57 | ]
58 | },
59 | {
60 | "cell_type": "code",
61 | "execution_count": 2,
62 | "metadata": {
63 | "id": "OHlHs9Wj_PUM"
64 | },
65 | "outputs": [],
66 | "source": [
67 | "# Download the dataset and split into train and test sets\n",
68 | "(img_train, label_train), (img_test, label_test) = keras.datasets.fashion_mnist.load_data()"
69 | ]
70 | },
71 | {
72 | "cell_type": "markdown",
73 | "metadata": {
74 | "id": "nHkQOzHLoKNA"
75 | },
76 | "source": [
77 | "For preprocessing, you will normalize the pixel values to make the training converge faster."
78 | ]
79 | },
80 | {
81 | "cell_type": "code",
82 | "execution_count": 3,
83 | "metadata": {
84 | "id": "bLVhXs3xrUD0"
85 | },
86 | "outputs": [],
87 | "source": [
88 | "# Normalize pixel values between 0 and 1\n",
89 | "img_train = img_train.astype('float32') / 255.0\n",
90 | "img_test = img_test.astype('float32') / 255.0"
91 | ]
92 | },
93 | {
94 | "cell_type": "markdown",
95 | "metadata": {
96 | "id": "_hM19_JWD6eF"
97 | },
98 | "source": [
99 | "## Baseline Performance\n",
100 | "\n",
101 | "As mentioned, you will first establish a baseline performance using arbitrarily handpicked hyperparameters so you can compare the results later. In the interest of time and the resource limits of Colab, you will just build a shallow dense neural network (DNN) as shown below. This is to demonstrate the concepts without involving huge datasets and long tuning and training times. As you'll see later, even small models can take some time to tune. You can extend the concepts here when you build more complex models in your own projects."
102 | ]
103 | },
104 | {
105 | "cell_type": "code",
106 | "execution_count": 4,
107 | "metadata": {
108 | "id": "sqbYwwukkA6z"
109 | },
110 | "outputs": [
111 | {
112 | "name": "stdout",
113 | "output_type": "stream",
114 | "text": [
115 | "Model: \"sequential\"\n",
116 | "_________________________________________________________________\n",
117 | "Layer (type) Output Shape Param # \n",
118 | "=================================================================\n",
119 | "flatten (Flatten) (None, 784) 0 \n",
120 | "_________________________________________________________________\n",
121 | "dense_1 (Dense) (None, 512) 401920 \n",
122 | "_________________________________________________________________\n",
123 | "dropout (Dropout) (None, 512) 0 \n",
124 | "_________________________________________________________________\n",
125 | "dense (Dense) (None, 10) 5130 \n",
126 | "=================================================================\n",
127 | "Total params: 407,050\n",
128 | "Trainable params: 407,050\n",
129 | "Non-trainable params: 0\n",
130 | "_________________________________________________________________\n"
131 | ]
132 | }
133 | ],
134 | "source": [
135 | "# Build the baseline model using the Sequential API\n",
136 | "b_model = keras.Sequential()\n",
137 | "b_model.add(keras.layers.Flatten(input_shape=(28, 28)))\n",
138 | "b_model.add(keras.layers.Dense(units=512, activation='relu', name='dense_1')) # You will tune this layer later\n",
139 | "b_model.add(keras.layers.Dropout(0.2))\n",
140 | "b_model.add(keras.layers.Dense(10, activation='softmax'))\n",
141 | "\n",
142 | "# Print model summary\n",
143 | "b_model.summary()"
144 | ]
145 | },
146 | {
147 | "cell_type": "markdown",
148 | "metadata": {
149 | "id": "WAlb_KxTK50d"
150 | },
151 | "source": [
152 | "As shown, we hardcoded all the hyperparameters when declaring the layers. These include the number of hidden units, activation, and dropout. You will see how you can automatically tune some of these a bit later."
153 | ]
154 | },
155 | {
156 | "cell_type": "markdown",
157 | "metadata": {
158 | "id": "RM354GIBKdf0"
159 | },
160 | "source": [
161 | "Let's then set up the loss, metrics, and the optimizer. The learning rate is also a hyperparameter you can tune automatically, but for now, let's set it at `0.001`."
162 | ]
163 | },
164 | {
165 | "cell_type": "code",
166 | "execution_count": 5,
167 | "metadata": {
168 | "id": "Lp58Ety3pLj2"
169 | },
170 | "outputs": [],
171 | "source": [
172 | "# Setup the training parameters\n",
173 | "b_model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),\n",
174 | " loss=keras.losses.SparseCategoricalCrossentropy(),\n",
175 | " metrics=['accuracy'])"
176 | ]
177 | },
178 | {
179 | "cell_type": "markdown",
180 | "metadata": {
181 | "id": "_FxeAlZlLpHI"
182 | },
183 | "source": [
184 | "With all settings set, you can start training the model. We've set the number of epochs to 10 but feel free to increase it if you have more time to go through the notebook. "
185 | ]
186 | },
187 | {
188 | "cell_type": "code",
189 | "execution_count": 6,
190 | "metadata": {
191 | "id": "K1JjZ-FdLXZ3"
192 | },
193 | "outputs": [
194 | {
195 | "name": "stdout",
196 | "output_type": "stream",
197 | "text": [
198 | "Epoch 1/10\n",
199 | "1500/1500 [==============================] - 3s 2ms/step - loss: 0.5161 - accuracy: 0.8169 - val_loss: 0.4415 - val_accuracy: 0.8363\n",
200 | "Epoch 2/10\n",
201 | "1500/1500 [==============================] - 3s 2ms/step - loss: 0.3917 - accuracy: 0.8568 - val_loss: 0.4219 - val_accuracy: 0.8359\n",
202 | "Epoch 3/10\n",
203 | "1500/1500 [==============================] - 3s 2ms/step - loss: 0.3580 - accuracy: 0.8682 - val_loss: 0.3553 - val_accuracy: 0.8732\n",
204 | "Epoch 4/10\n",
205 | "1500/1500 [==============================] - 3s 2ms/step - loss: 0.3344 - accuracy: 0.8770 - val_loss: 0.3420 - val_accuracy: 0.8791\n",
206 | "Epoch 5/10\n",
207 | "1500/1500 [==============================] - 3s 2ms/step - loss: 0.3183 - accuracy: 0.8813 - val_loss: 0.3436 - val_accuracy: 0.8776\n",
208 | "Epoch 6/10\n",
209 | "1500/1500 [==============================] - 3s 2ms/step - loss: 0.3064 - accuracy: 0.8855 - val_loss: 0.3357 - val_accuracy: 0.8783\n",
210 | "Epoch 7/10\n",
211 | "1500/1500 [==============================] - 3s 2ms/step - loss: 0.2941 - accuracy: 0.8917 - val_loss: 0.3231 - val_accuracy: 0.8862\n",
212 | "Epoch 8/10\n",
213 | "1500/1500 [==============================] - 3s 2ms/step - loss: 0.2856 - accuracy: 0.8927 - val_loss: 0.3322 - val_accuracy: 0.8787\n",
214 | "Epoch 9/10\n",
215 | "1500/1500 [==============================] - 3s 2ms/step - loss: 0.2720 - accuracy: 0.8983 - val_loss: 0.3380 - val_accuracy: 0.8818\n",
216 | "Epoch 10/10\n",
217 | "1500/1500 [==============================] - 3s 2ms/step - loss: 0.2658 - accuracy: 0.9004 - val_loss: 0.3233 - val_accuracy: 0.8905\n"
218 | ]
219 | },
220 | {
221 | "data": {
222 | "text/plain": [
223 | ""
224 | ]
225 | },
226 | "execution_count": 6,
227 | "metadata": {},
228 | "output_type": "execute_result"
229 | }
230 | ],
231 | "source": [
232 | "# Number of training epochs.\n",
233 | "NUM_EPOCHS = 10\n",
234 | "\n",
235 | "# Train the model\n",
236 | "b_model.fit(img_train, label_train, epochs=NUM_EPOCHS, validation_split=0.2)"
237 | ]
238 | },
239 | {
240 | "cell_type": "markdown",
241 | "metadata": {
242 | "id": "S6LALxGwMtkV"
243 | },
244 | "source": [
245 | "Finally, you want to see how this baseline model performs against the test set."
246 | ]
247 | },
248 | {
249 | "cell_type": "code",
250 | "execution_count": 7,
251 | "metadata": {
252 | "id": "kBnZ2tFbpxgC"
253 | },
254 | "outputs": [
255 | {
256 | "name": "stdout",
257 | "output_type": "stream",
258 | "text": [
259 | "313/313 [==============================] - 1s 2ms/step - loss: 0.3483 - accuracy: 0.8800\n"
260 | ]
261 | }
262 | ],
263 | "source": [
264 | "# Evaluate model on the test set\n",
265 | "b_eval_dict = b_model.evaluate(img_test, label_test, return_dict=True)"
266 | ]
267 | },
268 | {
269 | "cell_type": "markdown",
270 | "metadata": {
271 | "id": "9YCfzg0IM9b6"
272 | },
273 | "source": [
274 | "Let's define a helper function for displaying the results so it's easier to compare later."
275 | ]
276 | },
277 | {
278 | "cell_type": "code",
279 | "execution_count": 8,
280 | "metadata": {
281 | "id": "Vt2dWs0NxnUn"
282 | },
283 | "outputs": [
284 | {
285 | "name": "stdout",
286 | "output_type": "stream",
287 | "text": [
288 | "\n",
289 | "BASELINE MODEL:\n",
290 | "number of units in 1st Dense layer: 512\n",
291 | "learning rate for the optimizer: 0.0010000000474974513\n",
292 | "loss: 0.3482625186443329\n",
293 | "accuracy: 0.8799999952316284\n"
294 | ]
295 | }
296 | ],
297 | "source": [
298 | "# Define helper function\n",
299 | "def print_results(model, model_name, eval_dict):\n",
300 | " '''\n",
301 | " Prints the values of the hyperparameters to tune, and the results of model evaluation\n",
302 | "\n",
303 | " Args:\n",
304 | " model (Model) - Keras model to evaluate\n",
305 | " model_name (string) - arbitrary string to be used in identifying the model\n",
306 | " eval_dict (dict) - results of model.evaluate\n",
307 | " '''\n",
308 | " print(f'\\n{model_name}:')\n",
309 | "\n",
310 | " print(f'number of units in 1st Dense layer: {model.get_layer(\"dense_1\").units}')\n",
311 | " print(f'learning rate for the optimizer: {model.optimizer.lr.numpy()}')\n",
312 | "\n",
313 | " for key,value in eval_dict.items():\n",
314 | " print(f'{key}: {value}')\n",
315 | "\n",
316 | "# Print results for baseline model\n",
317 | "print_results(b_model, 'BASELINE MODEL', b_eval_dict)"
318 | ]
319 | },
320 | {
321 | "cell_type": "markdown",
322 | "metadata": {
323 | "id": "AH-RLK3Wxt_X"
324 | },
325 | "source": [
326 | "That's it for getting the results for a single set of hyperparameters. As you can see, this process can be tedious if you want to try different sets of parameters. For example, will your model improve if you use `learning_rate=0.00001` and `units=128`? What if `0.001` paired with `256`? The process will be even more difficult if you decide to also tune the dropout and try out other activation functions as well. Keras Tuner solves this problem by having an API to automatically search for the optimal set. You will just need to set it up once then wait for the results. You will see how this is done in the next sections."
327 | ]
328 | },
329 | {
330 | "cell_type": "markdown",
331 | "metadata": {
332 | "id": "7oyczDXqtWjI"
333 | },
334 | "source": [
335 | "## Keras Tuner\n",
336 | "\n",
337 | "To perform hypertuning with Keras Tuner, you will need to:\n",
338 | "\n",
339 | "* Define the model\n",
340 | "* Select which hyperparameters to tune\n",
341 | "* Define their search space\n",
342 | "* Define the search strategy"
343 | ]
344 | },
345 | {
346 | "cell_type": "markdown",
347 | "metadata": {
348 | "id": "MUXex9ctTuDB"
349 | },
350 | "source": [
351 | "### Install and import packages\n",
352 | "\n",
353 | "You will start by installing and importing the required packages."
354 | ]
355 | },
356 | {
357 | "cell_type": "code",
358 | "execution_count": 9,
359 | "metadata": {
360 | "id": "hpMLpbt9jcO6"
361 | },
362 | "outputs": [],
363 | "source": [
364 | "# Install Keras Tuner\n",
365 | "# !pip install -q -U keras-tuner"
366 | ]
367 | },
368 | {
369 | "cell_type": "code",
370 | "execution_count": 10,
371 | "metadata": {
372 | "id": "_leAIdFKAxAD"
373 | },
374 | "outputs": [],
375 | "source": [
376 | "# Import required packages\n",
377 | "import tensorflow as tf\n",
378 | "import keras_tuner as kt"
379 | ]
380 | },
381 | {
382 | "cell_type": "code",
383 | "execution_count": 11,
384 | "metadata": {},
385 | "outputs": [
386 | {
387 | "name": "stdout",
388 | "output_type": "stream",
389 | "text": [
390 | "TensorFlow version: 2.3.1\n",
391 | "KerasTuner version: 1.0.3\n"
392 | ]
393 | }
394 | ],
395 | "source": [
396 | "print('TensorFlow version:', tf.__version__)\n",
397 | "print('KerasTuner version:', kt.__version__)"
398 | ]
399 | },
400 | {
401 | "cell_type": "markdown",
402 | "metadata": {
403 | "id": "K5YEL2H2Ax3e"
404 | },
405 | "source": [
406 | "### Define the model\n",
407 | "\n",
408 | "The model you set up for hypertuning is called a *hypermodel*. When you build this model, you define the hyperparameter search space in addition to the model architecture. \n",
409 | "\n",
410 | "You can define a hypermodel through two approaches:\n",
411 | "\n",
412 | "* By using a model builder function\n",
413 | "* By [subclassing the `HyperModel` class](https://keras-team.github.io/keras-tuner/#you-can-use-a-hypermodel-subclass-instead-of-a-model-building-function) of the Keras Tuner API\n",
414 | "\n",
415 | "\n",
416 | "In this lab, you will take the first approach: you will use a model builder function to define the image classification model. This function returns a compiled model and uses hyperparameters you define inline to hypertune the model. \n",
417 | "\n",
418 | "The function below basically builds the same model you used earlier. The difference is that there are two hyperparameters set up for tuning:\n",
419 | "\n",
420 | "* the number of hidden units of the first Dense layer\n",
421 | "* the learning rate of the Adam optimizer\n",
422 | "\n",
423 | "You will see that this is done with a HyperParameters object which configures the hyperparameters you'd like to tune. For this exercise, you will:\n",
424 | "\n",
425 | "* use its `Int()` method to define the search space for the Dense units. This allows you to set a minimum and maximum value, as well as the step size when incrementing between these values. \n",
426 | "\n",
427 | "* use its `Choice()` method for the learning rate. This allows you to define discrete values to include in the search space when hypertuning.\n",
428 | "\n",
429 | "You can view all available methods and their sample usage in the [official documentation](https://keras-team.github.io/keras-tuner/documentation/hyperparameters/#hyperparameters)."
430 | ]
431 | },
432 | {
433 | "cell_type": "code",
434 | "execution_count": 12,
435 | "metadata": {
436 | "id": "ZQKodC-jtsva"
437 | },
438 | "outputs": [],
439 | "source": [
440 | "def model_builder(hp):\n",
441 | " '''\n",
442 | " Builds the model and sets up the hyperparameters to tune.\n",
443 | "\n",
444 | " Args:\n",
445 | " hp - Keras tuner object\n",
446 | "\n",
447 | " Returns:\n",
448 | " model with hyperparameters to tune\n",
449 | " '''\n",
450 | "\n",
451 | " # Initialize the Sequential API and start stacking the layers\n",
452 | " model = keras.Sequential()\n",
453 | " model.add(keras.layers.Flatten(input_shape=(28, 28)))\n",
454 | "\n",
455 | " # Tune the number of units in the first Dense layer\n",
456 | " # Choose an optimal value between 32-512\n",
457 | " hp_units = hp.Int('units', min_value=32, max_value=512, step=32)\n",
458 | " model.add(keras.layers.Dense(units=hp_units, activation='relu', name='dense_1'))\n",
459 | "\n",
460 | " # Add next layers\n",
461 | " model.add(keras.layers.Dropout(0.2))\n",
462 | " model.add(keras.layers.Dense(10, activation='softmax'))\n",
463 | "\n",
464 | " # Tune the learning rate for the optimizer\n",
465 | " # Choose an optimal value from 0.01, 0.001, or 0.0001\n",
466 | " hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])\n",
467 | "\n",
468 | " model.compile(optimizer=keras.optimizers.Adam(learning_rate=hp_learning_rate),\n",
469 | " loss=keras.losses.SparseCategoricalCrossentropy(),\n",
470 | " metrics=['accuracy'])\n",
471 | "\n",
472 | " return model"
473 | ]
474 | },
475 | {
476 | "cell_type": "markdown",
477 | "metadata": {
478 | "id": "0J1VYw4q3x0b"
479 | },
480 | "source": [
481 | "## Instantiate the Tuner and perform hypertuning\n",
482 | "\n",
483 | "Now that you have the model builder, you can then define how the tuner can find the optimal set of hyperparameters, also called the search strategy. Keras Tuner has [four tuners](https://keras-team.github.io/keras-tuner/documentation/tuners/) available with built-in strategies - `RandomSearch`, `Hyperband`, `BayesianOptimization`, and `Sklearn`. \n",
484 | "\n",
485 | "In this tutorial, you will use the Hyperband tuner. Hyperband is an algorithm specifically developed for hyperparameter optimization. It uses adaptive resource allocation and early-stopping to quickly converge on a high-performing model. This is done using a sports championship style bracket wherein the algorithm trains a large number of models for a few epochs and carries forward only the top-performing half of models to the next round. You can read about the intuition behind the algorithm in section 3 of [this paper](https://arxiv.org/pdf/1603.06560.pdf).\n",
486 | "\n",
487 | "Hyperband determines the number of models to train in a bracket by computing 1 + log<sub>`factor`</sub>(`max_epochs`) and rounding it up to the nearest integer. You will see these parameters (i.e. `factor` and `max_epochs`) passed into the initializer below. In addition, you will also need to define the following to instantiate the Hyperband tuner:\n",
488 | "\n",
489 | "* the hypermodel (built by your model builder function)\n",
490 | "* the `objective` to optimize (e.g. validation accuracy)\n",
491 | "* a `directory` to save logs and checkpoints for every trial (model configuration) run during the hyperparameter search. If you re-run the hyperparameter search, the Keras Tuner uses the existing state from these logs to resume the search. To disable this behavior, pass an additional `overwrite=True` argument while instantiating the tuner.\n",
492 | "* the `project_name` to differentiate with other runs. This will be used as a subdirectory name under the `directory`.\n",
493 | "\n",
494 | "You can refer to the [documentation](https://keras.io/api/keras_tuner/tuners/hyperband/) for other arguments you can pass in."
495 | ]
496 | },
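{
 "cell_type": "markdown",
 "metadata": {},
 "source": [
  "To make the formula above concrete (this arithmetic is added for illustration only): with `max_epochs=10` and `factor=3`, the values used below, 1 + log<sub>3</sub>(10) is roughly 3.1, which rounds up to 4."
 ]
},
{
 "cell_type": "code",
 "execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
  "# Added for illustration: evaluate the formula above for the values used below\n",
  "import math\n",
  "\n",
  "max_epochs = 10\n",
  "factor = 3\n",
  "\n",
  "print(math.ceil(1 + math.log(max_epochs, factor)))  # 4"
 ]
},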
497 | {
498 | "cell_type": "code",
499 | "execution_count": 13,
500 | "metadata": {
501 | "id": "oichQFly6Y46"
502 | },
503 | "outputs": [],
504 | "source": [
505 | "# Instantiate the tuner\n",
506 | "tuner = kt.Hyperband(model_builder,\n",
507 | " objective='val_accuracy',\n",
508 | " max_epochs=10,\n",
509 | " factor=3,\n",
510 | " directory='kt_dir',\n",
511 | " project_name='kt_hyperband')"
512 | ]
513 | },
514 | {
515 | "cell_type": "markdown",
516 | "metadata": {
517 | "id": "Ij3hGcp4e8QG"
518 | },
519 | "source": [
520 | "Let's see a summary of the hyperparameters that you will tune:"
521 | ]
522 | },
523 | {
524 | "cell_type": "code",
525 | "execution_count": 14,
526 | "metadata": {
527 | "id": "JmkJOPp5WkiG"
528 | },
529 | "outputs": [
530 | {
531 | "name": "stdout",
532 | "output_type": "stream",
533 | "text": [
534 | "Search space summary\n",
535 | "Default search space size: 2\n",
536 | "units (Int)\n",
537 | "{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 512, 'step': 32, 'sampling': None}\n",
538 | "learning_rate (Choice)\n",
539 | "{'default': 0.01, 'conditions': [], 'values': [0.01, 0.001, 0.0001], 'ordered': True}\n"
540 | ]
541 | }
542 | ],
543 | "source": [
544 | "# Display hypertuning settings\n",
545 | "tuner.search_space_summary()"
546 | ]
547 | },
548 | {
549 | "cell_type": "markdown",
550 | "metadata": {
551 | "id": "cwhBdXx0Ekj8"
552 | },
553 | "source": [
554 | "You can pass in a callback to stop training early when a metric is not improving. Below, we define an [EarlyStopping](https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/EarlyStopping) callback to monitor the validation loss and stop training if it's not improving after 5 epochs."
555 | ]
556 | },
557 | {
558 | "cell_type": "code",
559 | "execution_count": 15,
560 | "metadata": {
561 | "id": "WT9IkS9NEjLc"
562 | },
563 | "outputs": [],
564 | "source": [
565 | "stop_early = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)"
566 | ]
567 | },
568 | {
569 | "cell_type": "markdown",
570 | "metadata": {
571 | "id": "UKghEo15Tduy"
572 | },
573 | "source": [
574 | "You will now run the hyperparameter search. The arguments for the search method are the same as those used for `tf.keras.Model.fit`, in addition to the callback above. This will take around 10 minutes to run."
575 | ]
576 | },
577 | {
578 | "cell_type": "code",
579 | "execution_count": 16,
580 | "metadata": {
581 | "id": "dSBQcTHF9cKt"
582 | },
583 | "outputs": [
584 | {
585 | "name": "stdout",
586 | "output_type": "stream",
587 | "text": [
588 | "Trial 30 Complete [00h 00m 32s]\n",
589 | "val_accuracy: 0.8661666512489319\n",
590 | "\n",
591 | "Best val_accuracy So Far: 0.8854166865348816\n",
592 | "Total elapsed time: 00h 06m 47s\n",
593 | "INFO:tensorflow:Oracle triggered exit\n"
594 | ]
595 | }
596 | ],
597 | "source": [
598 | "# Perform hypertuning\n",
599 | "tuner.search(img_train, label_train, epochs=NUM_EPOCHS, validation_split=0.2, callbacks=[stop_early])"
600 | ]
601 | },
602 | {
603 | "cell_type": "markdown",
604 | "metadata": {
605 | "id": "ewN6WBDYWvRw"
606 | },
607 | "source": [
608 | "You can get the hyperparameters of the top-performing model with the [get_best_hyperparameters()](https://keras-team.github.io/keras-tuner/documentation/tuners/#get_best_hyperparameters-method) method."
609 | ]
610 | },
611 | {
612 | "cell_type": "code",
613 | "execution_count": 17,
614 | "metadata": {
615 | "id": "iG0zIuP5WuTI"
616 | },
617 | "outputs": [
618 | {
619 | "name": "stdout",
620 | "output_type": "stream",
621 | "text": [
622 | "\n",
623 | "The hyperparameter search is complete. The optimal number of units in the first densely-connected\n",
624 | "layer is 512 and the optimal learning rate for the optimizer\n",
625 | "is 0.001.\n",
626 | "\n"
627 | ]
628 | }
629 | ],
630 | "source": [
631 | "# Get the optimal hyperparameters from the results\n",
632 | "best_hps=tuner.get_best_hyperparameters()[0]\n",
633 | "\n",
634 | "print(f\"\"\"\n",
635 | "The hyperparameter search is complete. The optimal number of units in the first densely-connected\n",
636 | "layer is {best_hps.get('units')} and the optimal learning rate for the optimizer\n",
637 | "is {best_hps.get('learning_rate')}.\n",
638 | "\"\"\")"
639 | ]
640 | },
641 | {
642 | "cell_type": "markdown",
643 | "metadata": {
644 | "id": "Lak_ylf88xBv"
645 | },
646 | "source": [
647 | "## Build and train the model\n",
648 | "\n",
649 | "Now that you have the best set of hyperparameters, you can rebuild the hypermodel with these values and retrain it."
650 | ]
651 | },
652 | {
653 | "cell_type": "code",
654 | "execution_count": 18,
655 | "metadata": {
656 | "id": "McO82AXOuxXh"
657 | },
658 | "outputs": [
659 | {
660 | "name": "stdout",
661 | "output_type": "stream",
662 | "text": [
663 | "Model: \"sequential\"\n",
664 | "_________________________________________________________________\n",
665 | "Layer (type) Output Shape Param # \n",
666 | "=================================================================\n",
667 | "flatten (Flatten) (None, 784) 0 \n",
668 | "_________________________________________________________________\n",
669 | "dense_1 (Dense) (None, 512) 401920 \n",
670 | "_________________________________________________________________\n",
671 | "dropout (Dropout) (None, 512) 0 \n",
672 | "_________________________________________________________________\n",
673 | "dense (Dense) (None, 10) 5130 \n",
674 | "=================================================================\n",
675 | "Total params: 407,050\n",
676 | "Trainable params: 407,050\n",
677 | "Non-trainable params: 0\n",
678 | "_________________________________________________________________\n"
679 | ]
680 | }
681 | ],
682 | "source": [
683 | "# Build the model with the optimal hyperparameters\n",
684 | "h_model = tuner.hypermodel.build(best_hps)\n",
685 | "h_model.summary()"
686 | ]
687 | },
688 | {
689 | "cell_type": "code",
690 | "execution_count": 19,
691 | "metadata": {
692 | "id": "l64WP7Rau1lm"
693 | },
694 | "outputs": [
695 | {
696 | "name": "stdout",
697 | "output_type": "stream",
698 | "text": [
699 | "Epoch 1/10\n",
700 | "1500/1500 [==============================] - 3s 2ms/step - loss: 0.5107 - accuracy: 0.8185 - val_loss: 0.4400 - val_accuracy: 0.8351\n",
701 | "Epoch 2/10\n",
702 | "1500/1500 [==============================] - 3s 2ms/step - loss: 0.3930 - accuracy: 0.8562 - val_loss: 0.3902 - val_accuracy: 0.8577\n",
703 | "Epoch 3/10\n",
704 | "1500/1500 [==============================] - 3s 2ms/step - loss: 0.3560 - accuracy: 0.8695 - val_loss: 0.3468 - val_accuracy: 0.8755\n",
705 | "Epoch 4/10\n",
706 | "1500/1500 [==============================] - 3s 2ms/step - loss: 0.3344 - accuracy: 0.8766 - val_loss: 0.3648 - val_accuracy: 0.8641\n",
707 | "Epoch 5/10\n",
708 | "1500/1500 [==============================] - 3s 2ms/step - loss: 0.3198 - accuracy: 0.8826 - val_loss: 0.3414 - val_accuracy: 0.8787\n",
709 | "Epoch 6/10\n",
710 | "1500/1500 [==============================] - 3s 2ms/step - loss: 0.3014 - accuracy: 0.8881 - val_loss: 0.3216 - val_accuracy: 0.8832\n",
711 | "Epoch 7/10\n",
712 | "1500/1500 [==============================] - 3s 2ms/step - loss: 0.2950 - accuracy: 0.8891 - val_loss: 0.3533 - val_accuracy: 0.8768\n",
713 | "Epoch 8/10\n",
714 | "1500/1500 [==============================] - 3s 2ms/step - loss: 0.2811 - accuracy: 0.8954 - val_loss: 0.3182 - val_accuracy: 0.8877\n",
715 | "Epoch 9/10\n",
716 | "1500/1500 [==============================] - 3s 2ms/step - loss: 0.2714 - accuracy: 0.8977 - val_loss: 0.3182 - val_accuracy: 0.8884\n",
717 | "Epoch 10/10\n",
718 | "1500/1500 [==============================] - 3s 2ms/step - loss: 0.2647 - accuracy: 0.8994 - val_loss: 0.3159 - val_accuracy: 0.8873\n"
719 | ]
720 | },
721 | {
722 | "data": {
723 | "text/plain": [
724 | ""
725 | ]
726 | },
727 | "execution_count": 19,
728 | "metadata": {},
729 | "output_type": "execute_result"
730 | }
731 | ],
732 | "source": [
733 | "# Train the hypertuned model\n",
734 | "h_model.fit(img_train, label_train, epochs=NUM_EPOCHS, validation_split=0.2)"
735 | ]
736 | },
737 | {
738 | "cell_type": "markdown",
739 | "metadata": {
740 | "id": "MqU5ZVAaag2v"
741 | },
742 | "source": [
743 | "You will then get its performance against the test set."
744 | ]
745 | },
746 | {
747 | "cell_type": "code",
748 | "execution_count": 20,
749 | "metadata": {
750 | "id": "9E0BTp9Ealjb"
751 | },
752 | "outputs": [
753 | {
754 | "name": "stdout",
755 | "output_type": "stream",
756 | "text": [
757 | "313/313 [==============================] - 1s 2ms/step - loss: 0.3449 - accuracy: 0.8802\n"
758 | ]
759 | }
760 | ],
761 | "source": [
762 | "# Evaluate the hypertuned model against the test set\n",
763 | "h_eval_dict = h_model.evaluate(img_test, label_test, return_dict=True)"
764 | ]
765 | },
766 | {
767 | "cell_type": "markdown",
768 | "metadata": {
769 | "id": "EQRpPHZsz-eC"
770 | },
771 | "source": [
772 | "We can compare these results with those of the baseline model we used at the start of the notebook. Results may vary, but you will usually get a model that has fewer units in the dense layer while having comparable loss and accuracy. This indicates that you reduced the model size and saved compute resources while keeping more or less the same accuracy."
773 | ]
774 | },
775 | {
776 | "cell_type": "code",
777 | "execution_count": 21,
778 | "metadata": {
779 | "id": "BjVYPOw6MH5d"
780 | },
781 | "outputs": [
782 | {
783 | "name": "stdout",
784 | "output_type": "stream",
785 | "text": [
786 | "\n",
787 | "BASELINE MODEL:\n",
788 | "number of units in 1st Dense layer: 512\n",
789 | "learning rate for the optimizer: 0.0010000000474974513\n",
790 | "loss: 0.3482625186443329\n",
791 | "accuracy: 0.8799999952316284\n",
792 | "\n",
793 | "HYPERTUNED MODEL:\n",
794 | "number of units in 1st Dense layer: 512\n",
795 | "learning rate for the optimizer: 0.0010000000474974513\n",
796 | "loss: 0.34490492939949036\n",
797 | "accuracy: 0.8802000284194946\n"
798 | ]
799 | }
800 | ],
801 | "source": [
802 | "# Print results of the baseline and hypertuned model\n",
803 | "print_results(b_model, 'BASELINE MODEL', b_eval_dict)\n",
804 | "print_results(h_model, 'HYPERTUNED MODEL', h_eval_dict)"
805 | ]
806 | },
807 | {
808 | "cell_type": "markdown",
809 | "metadata": {
810 | "id": "rKn4g_HzP2KS"
811 | },
812 | "source": [
813 | "## Bonus Challenges (optional)\n",
814 | "\n",
815 | "If you want to keep practicing with Keras Tuner in this notebook, you can do a factory reset (`Runtime > Factory reset runtime`) and take on any of the following:\n",
816 | "\n",
817 | "- hypertune the dropout layer with `hp.Float()` or `hp.Choice()` (a starter sketch is shown below)\n",
818 | "- hypertune the activation function of the 1st dense layer with `hp.Choice()`\n",
819 | "- determine the optimal number of Dense layers you can add to improve the model. You can use the code [here](https://keras-team.github.io/keras-tuner/#the-search-space-may-contain-conditional-hyperparameters) as reference.\n",
820 | "- explore pre-defined `HyperModel` classes - [HyperXception and HyperResNet](https://keras-team.github.io/keras-tuner/documentation/hypermodels/#hyperresnet-class) for computer vision applications."
821 | ]
822 | },
823 | {
824 | "cell_type": "markdown",
825 | "metadata": {
826 | "id": "sKwLOzKpFGAj"
827 | },
828 | "source": [
829 | "## Wrap Up\n",
830 | "\n",
831 | "In this tutorial, you used Keras Tuner to conveniently tune hyperparameters. You defined which ones to tune, their search space, and the search strategy to arrive at the optimal set of hyperparameters. These concepts will be discussed again in the next sections, but in the context of AutoML, which automates the entire machine learning pipeline. On to the next!\n"
832 | ]
833 | }
834 | ],
835 | "metadata": {
836 | "accelerator": "GPU",
837 | "colab": {
838 | "collapsed_sections": [],
839 | "name": "C3_W1_Lab_1_Keras_Tuner.ipynb",
840 | "private_outputs": true,
841 | "provenance": [],
842 | "toc_visible": true
843 | },
844 | "kernelspec": {
845 | "display_name": "Python 3",
846 | "language": "python",
847 | "name": "python3"
848 | },
849 | "language_info": {
850 | "codemirror_mode": {
851 | "name": "ipython",
852 | "version": 3
853 | },
854 | "file_extension": ".py",
855 | "mimetype": "text/x-python",
856 | "name": "python",
857 | "nbconvert_exporter": "python",
858 | "pygments_lexer": "ipython3",
859 | "version": "3.6.13"
860 | }
861 | },
862 | "nbformat": 4,
863 | "nbformat_minor": 1
864 | }
865 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W1-Lab/fmnist_transform.py:
--------------------------------------------------------------------------------
1 |
2 | import tensorflow as tf
3 | import tensorflow_transform as tft
4 |
5 | # Keys
6 | _LABEL_KEY = 'label'
7 | _IMAGE_KEY = 'image'
8 |
9 |
10 | def _transformed_name(key):
11 | return key + '_xf'
12 |
13 | def _image_parser(image_str):
14 | '''converts the images to a float tensor'''
15 | image = tf.image.decode_image(image_str, channels=1)
16 | image = tf.reshape(image, (28, 28, 1))
17 | image = tf.cast(image, tf.float32)
18 | return image
19 |
20 |
21 | def _label_parser(label_id):
22 | '''converts the labels to a float tensor'''
23 | label = tf.cast(label_id, tf.float32)
24 | return label
25 |
26 |
27 | def preprocessing_fn(inputs):
28 | """tf.transform's callback function for preprocessing inputs.
29 | Args:
30 | inputs: map from feature keys to raw not-yet-transformed features.
31 | Returns:
32 | Map from string feature key to transformed feature operations.
33 | """
34 |
35 | # Convert the raw image and labels to a float array
36 | with tf.device("/cpu:0"):
37 | outputs = {
38 | _transformed_name(_IMAGE_KEY):
39 | tf.map_fn(
40 | _image_parser,
41 | tf.squeeze(inputs[_IMAGE_KEY], axis=1),
42 | dtype=tf.float32),
43 | _transformed_name(_LABEL_KEY):
44 | tf.map_fn(
45 | _label_parser,
46 | inputs[_LABEL_KEY],
47 | dtype=tf.float32)
48 | }
49 |
50 | # scale the pixels from 0 to 1
51 | outputs[_transformed_name(_IMAGE_KEY)] = tft.scale_to_0_1(outputs[_transformed_name(_IMAGE_KEY)])
52 |
53 | return outputs
54 |
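# ---------------------------------------------------------------------------
# Note added for clarity (not part of the original lab file): this module is
# consumed by a TFX Transform component through its `module_file` argument,
# which calls `preprocessing_fn` on the raw examples. A rough sketch of that
# wiring, where `example_gen` and `schema_gen` stand in for the upstream
# components of a typical pipeline, might look like:
#
#   from tfx.components import Transform
#
#   transform = Transform(
#       examples=example_gen.outputs['examples'],
#       schema=schema_gen.outputs['schema'],
#       module_file='fmnist_transform.py')
# ---------------------------------------------------------------------------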
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W1-Lab/trainer.py:
--------------------------------------------------------------------------------
1 |
2 | from tensorflow import keras
3 | from typing import NamedTuple, Dict, Text, Any, List
4 | from tfx.components.trainer.fn_args_utils import FnArgs, DataAccessor
5 | import tensorflow as tf
6 | import tensorflow_transform as tft
7 |
8 | # Define the label key
9 | LABEL_KEY = 'label_xf'
10 |
11 | def _gzip_reader_fn(filenames):
12 | '''Load compressed dataset
13 |
14 | Args:
15 | filenames - filenames of TFRecords to load
16 |
17 | Returns:
18 | TFRecordDataset loaded from the filenames
19 | '''
20 |
21 | # Load the dataset. Specify the compression type since it is saved as `.gz`
22 | return tf.data.TFRecordDataset(filenames, compression_type='GZIP')
23 |
24 |
25 | def _input_fn(file_pattern,
26 | tf_transform_output,
27 | num_epochs=None,
28 | batch_size=32) -> tf.data.Dataset:
29 | '''Create batches of features and labels from TF Records
30 |
31 | Args:
32 | file_pattern - List of files or patterns of file paths containing Example records.
33 | tf_transform_output - transform output graph
34 | num_epochs - Integer specifying the number of times to read through the dataset.
35 | If None, cycles through the dataset forever.
36 | batch_size - An int representing the number of records to combine in a single batch.
37 |
38 | Returns:
39 | A dataset of dict elements, (or a tuple of dict elements and label).
40 | Each dict maps feature keys to Tensor or SparseTensor objects.
41 | '''
42 | transformed_feature_spec = (
43 | tf_transform_output.transformed_feature_spec().copy())
44 |
45 | dataset = tf.data.experimental.make_batched_features_dataset(
46 | file_pattern=file_pattern,
47 | batch_size=batch_size,
48 | features=transformed_feature_spec,
49 | reader=_gzip_reader_fn,
50 | num_epochs=num_epochs,
51 | label_key=LABEL_KEY)
52 |
53 | return dataset
54 |
55 |
56 | def model_builder(hp):
57 | '''
58 | Builds the model and sets up the hyperparameters to tune.
59 |
60 | Args:
61 | hp - Keras tuner object
62 |
63 | Returns:
64 | model with hyperparameters to tune
65 | '''
66 |
67 | # Initialize the Sequential API and start stacking the layers
68 | model = keras.Sequential()
69 | model.add(keras.layers.Flatten(input_shape=(28, 28, 1)))
70 |
71 | # Get the number of units from the Tuner results
72 | hp_units = hp.get('units')
73 | model.add(keras.layers.Dense(units=hp_units, activation='relu'))
74 |
75 | # Add next layers
76 | model.add(keras.layers.Dropout(0.2))
77 | model.add(keras.layers.Dense(10, activation='softmax'))
78 |
79 | # Get the learning rate from the Tuner results
80 | hp_learning_rate = hp.get('learning_rate')
81 |
82 | # Setup model for training
83 | model.compile(optimizer=keras.optimizers.Adam(learning_rate=hp_learning_rate),
84 | loss=keras.losses.SparseCategoricalCrossentropy(),
85 | metrics=['accuracy'])
86 |
87 | # Print the model summary
88 | model.summary()
89 |
90 | return model
91 |
92 |
93 | def run_fn(fn_args: FnArgs) -> None:
94 | """Defines and trains the model.
95 | Args:
96 | fn_args: Holds args as name/value pairs. Refer here for the complete attributes:
97 | https://www.tensorflow.org/tfx/api_docs/python/tfx/components/trainer/fn_args_utils/FnArgs#attributes
98 | """
99 |
100 | # Callback for TensorBoard
101 | tensorboard_callback = tf.keras.callbacks.TensorBoard(
102 | log_dir=fn_args.model_run_dir, update_freq='batch')
103 |
104 | # Load transform output
105 | tf_transform_output = tft.TFTransformOutput(fn_args.transform_graph_path)
106 |
107 | # Create batches of data good for 10 epochs
108 | train_set = _input_fn(fn_args.train_files[0], tf_transform_output, 10)
109 | val_set = _input_fn(fn_args.eval_files[0], tf_transform_output, 10)
110 |
111 | # Load best hyperparameters
112 | hp = fn_args.hyperparameters.get('values')
113 |
114 | # Build the model
115 | model = model_builder(hp)
116 |
117 | # Train the model
118 | model.fit(
119 | x=train_set,
120 | validation_data=val_set,
121 | callbacks=[tensorboard_callback]
122 | )
123 |
124 | # Save the model
125 | model.save(fn_args.serving_model_dir, save_format='tf')
126 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W1-Lab/tuner.py:
--------------------------------------------------------------------------------
1 |
2 | # Define imports
3 | from kerastuner.engine import base_tuner
4 | import kerastuner as kt
5 | from tensorflow import keras
6 | from typing import NamedTuple, Dict, Text, Any, List
7 | from tfx.components.trainer.fn_args_utils import FnArgs, DataAccessor
8 | import tensorflow as tf
9 | import tensorflow_transform as tft
10 |
11 | # Declare namedtuple field names
12 | TunerFnResult = NamedTuple('TunerFnResult', [('tuner', base_tuner.BaseTuner),
13 | ('fit_kwargs', Dict[Text, Any])])
14 |
15 | # Label key
16 | LABEL_KEY = 'label_xf'
17 |
18 | # Callback for the search strategy
19 | stop_early = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)
20 |
21 |
22 | def _gzip_reader_fn(filenames):
23 | '''Load compressed dataset
24 |
25 | Args:
26 | filenames - filenames of TFRecords to load
27 |
28 | Returns:
29 | TFRecordDataset loaded from the filenames
30 | '''
31 |
32 | # Load the dataset. Specify the compression type since it is saved as `.gz`
33 | return tf.data.TFRecordDataset(filenames, compression_type='GZIP')
34 |
35 |
36 | def _input_fn(file_pattern,
37 | tf_transform_output,
38 | num_epochs=None,
39 | batch_size=32) -> tf.data.Dataset:
40 | '''Create batches of features and labels from TF Records
41 |
42 | Args:
43 | file_pattern - List of files or patterns of file paths containing Example records.
44 | tf_transform_output - transform output graph
45 | num_epochs - Integer specifying the number of times to read through the dataset.
46 | If None, cycles through the dataset forever.
47 | batch_size - An int representing the number of records to combine in a single batch.
48 |
49 | Returns:
50 | A dataset of dict elements, (or a tuple of dict elements and label).
51 | Each dict maps feature keys to Tensor or SparseTensor objects.
52 | '''
53 |
54 | # Get feature specification based on transform output
55 | transformed_feature_spec = (
56 | tf_transform_output.transformed_feature_spec().copy())
57 |
58 | # Create batches of features and labels
59 | dataset = tf.data.experimental.make_batched_features_dataset(
60 | file_pattern=file_pattern,
61 | batch_size=batch_size,
62 | features=transformed_feature_spec,
63 | reader=_gzip_reader_fn,
64 | num_epochs=num_epochs,
65 | label_key=LABEL_KEY)
66 |
67 | return dataset
68 |
69 |
70 | def model_builder(hp):
71 | '''
72 | Builds the model and sets up the hyperparameters to tune.
73 |
74 | Args:
75 | hp - Keras tuner object
76 |
77 | Returns:
78 | model with hyperparameters to tune
79 | '''
80 |
81 | # Initialize the Sequential API and start stacking the layers
82 | model = keras.Sequential()
83 | model.add(keras.layers.Flatten(input_shape=(28, 28, 1)))
84 |
85 | # Tune the number of units in the first Dense layer
86 | # Choose an optimal value between 32-512
87 | hp_units = hp.Int('units', min_value=32, max_value=512, step=32)
88 | model.add(keras.layers.Dense(units=hp_units, activation='relu', name='dense_1'))
89 |
90 | # Add next layers
91 | model.add(keras.layers.Dropout(0.2))
92 | model.add(keras.layers.Dense(10, activation='softmax'))
93 |
94 | # Tune the learning rate for the optimizer
95 | # Choose an optimal value from 0.01, 0.001, or 0.0001
96 | hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])
97 |
98 | model.compile(optimizer=keras.optimizers.Adam(learning_rate=hp_learning_rate),
99 | loss=keras.losses.SparseCategoricalCrossentropy(),
100 | metrics=['accuracy'])
101 |
102 | return model
103 |
104 | def tuner_fn(fn_args: FnArgs) -> TunerFnResult:
105 | """Build the tuner using the KerasTuner API.
106 | Args:
107 | fn_args: Holds args as name/value pairs.
108 |
109 | - working_dir: working dir for tuning.
110 | - train_files: List of file paths containing training tf.Example data.
111 | - eval_files: List of file paths containing eval tf.Example data.
112 | - train_steps: number of train steps.
113 | - eval_steps: number of eval steps.
114 | - schema_path: optional schema of the input data.
115 | - transform_graph_path: optional transform graph produced by TFT.
116 |
117 | Returns:
118 | A namedtuple contains the following:
119 | - tuner: A BaseTuner that will be used for tuning.
120 | - fit_kwargs: Args to pass to tuner's run_trial function for fitting the
121 | model , e.g., the training and validation dataset. Required
122 | args depend on the above tuner's implementation.
123 | """
124 |
125 | # Define tuner search strategy
126 | tuner = kt.Hyperband(model_builder,
127 | objective='val_accuracy',
128 | max_epochs=10,
129 | factor=3,
130 | directory=fn_args.working_dir,
131 | project_name='kt_hyperband')
132 |
133 | # Load transform output
134 | tf_transform_output = tft.TFTransformOutput(fn_args.transform_graph_path)
135 |
136 | # Use _input_fn() to extract input features and labels from the train and val set
137 | train_set = _input_fn(fn_args.train_files[0], tf_transform_output)
138 | val_set = _input_fn(fn_args.eval_files[0], tf_transform_output)
139 |
140 |
141 | return TunerFnResult(
142 | tuner=tuner,
143 | fit_kwargs={
144 | "callbacks":[stop_early],
145 | 'x': train_set,
146 | 'validation_data': val_set,
147 | 'steps_per_epoch': fn_args.train_steps,
148 | 'validation_steps': fn_args.eval_steps
149 | }
150 | )
151 |
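# ---------------------------------------------------------------------------
# Note added for clarity (not part of the original lab file): this module is
# passed to a TFX Tuner component via `module_file`, and the best
# hyperparameters it finds flow into the Trainer component (which uses the
# companion trainer.py) through `fn_args.hyperparameters`, whose 'values'
# entry maps hyperparameter names to the chosen values. A rough sketch of the
# wiring, assuming a TFX release where the Keras generic executor is the
# default (component arguments can vary by version):
#
#   from tfx.components import Trainer, Tuner
#   from tfx.proto import trainer_pb2
#
#   tuner = Tuner(
#       module_file='tuner.py',
#       examples=transform.outputs['transformed_examples'],
#       transform_graph=transform.outputs['transform_graph'],
#       train_args=trainer_pb2.TrainArgs(num_steps=500),
#       eval_args=trainer_pb2.EvalArgs(num_steps=100))
#
#   trainer = Trainer(
#       module_file='trainer.py',
#       examples=transform.outputs['transformed_examples'],
#       transform_graph=transform.outputs['transform_graph'],
#       hyperparameters=tuner.outputs['best_hyperparameters'],
#       train_args=trainer_pb2.TrainArgs(num_steps=500),
#       eval_args=trainer_pb2.EvalArgs(num_steps=100))
# ---------------------------------------------------------------------------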
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W2-Lab/dnn_model.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zhulingchen/Machine-Learning-Engineering-for-Production-MLOps-Specialization/4a936bba66fa15a60899b7b2d0d8bddb56633a3a/Machine Learning Modeling Pipelines in Production/C3W2-Lab/dnn_model.png
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W2-Lab/dnn_model_engineered.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zhulingchen/Machine-Learning-Engineering-for-Production-MLOps-Specialization/4a936bba66fa15a60899b7b2d0d8bddb56633a3a/Machine Learning Modeling Pipelines in Production/C3W2-Lab/dnn_model_engineered.png
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/lab-files/Dockerfile:
--------------------------------------------------------------------------------
1 | # Copyright 2020 Google. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 |
16 | FROM tensorflow/tensorflow:2.4.1
17 | RUN pip install tensorflow_datasets
18 |
19 | ADD mnist mnist
20 |
21 | ENTRYPOINT ["python", "-m", "mnist.main"]
22 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/lab-files/Kptfile:
--------------------------------------------------------------------------------
1 | apiVersion: kpt.dev/v1alpha1
2 | kind: Kptfile
3 | metadata:
4 | name: lab-files
5 | upstream:
6 | type: git
7 | git:
8 | commit: 89ae33dc9cf10ac777a845e0a5f2c691708151b2
9 | repo: https://github.com/GoogleCloudPlatform/mlops-on-gcp
10 | directory: workshops/mlep-qwiklabs/distributed-training-gke
11 | ref: master
12 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/lab-files/README.md:
--------------------------------------------------------------------------------
1 | # Distributed TensorFlow training on Kubernetes
2 |
3 |
4 |
5 | ## Setup and Requirements
6 |
7 | ### Qwiklabs setup
8 |
9 | ### Activate Cloud Shell
10 |
11 | ## Setting up your GKE cluster
12 |
13 |
14 | Set the project ID
15 |
16 | ```
17 | PROJECT_ID=$(gcloud config get-value project)
18 | gcloud config set compute/zone us-central1-f
19 | ```
20 |
21 | ### Creating a Kubernetes cluster
22 |
23 | Set the name and the zone for your cluster
24 |
25 | ```
26 | CLUSTER_NAME=cluster-1
27 |
28 | gcloud beta container clusters create $CLUSTER_NAME \
29 | --project=$PROJECT_ID \
30 | --cluster-version=latest \
31 | --machine-type=n1-standard-8 \
32 | --scopes compute-rw,gke-default,storage-rw \
33 | --num-nodes=3
34 |
35 | ```
36 |
37 | ### Verifying the installation
38 |
39 | Check that the cluster is up and running
40 |
41 | ```
42 | gcloud container clusters list
43 | ```
44 |
45 | Get the credentials for your new cluster so you can interact with it using `kubectl`.
46 |
47 | ```
48 | gcloud container clusters get-credentials $CLUSTER_NAME
49 | ```
50 |
51 |
52 |
53 | ## Installing TensorFlow Training (TFJob)
54 |
55 | ### Get TensorFlow Training manifests
56 |
57 | Get the manifests for TensorFlow Training from v1.1.0 of Kubeflow.
58 | ```
59 | SRC_REPO=https://github.com/kubeflow/manifests
60 | kpt pkg get $SRC_REPO/tf-training@v1.1.0 tf-training
61 | ```
62 |
63 | ### Create the `kubeflow` namespace
64 |
65 | ```
66 | kubectl create namespace kubeflow
67 | ```
68 |
69 | ### Install the TFJob CRD
70 |
71 | ```
72 | kubectl apply --kustomize tf-training/tf-job-crds/base
73 | ```
74 |
75 | ### Install the TFJob operator
76 | ```
77 | kubectl apply --kustomize tf-training/tf-job-operator/base
78 | ```
79 |
80 | ### Verify installation
81 | ```
82 | kubectl get pods -n kubeflow
83 | ```
84 |
85 | ## Create a GCS bucket for checkpoints and SavedModel
86 |
87 | ```
88 | export TFJOB_BUCKET=
89 |
90 | gsutil mb gs://${TFJOB_BUCKET}
91 | ```
92 |
93 | ## Running and monitoring distributed jobs
94 |
95 | ### Copy lab files
96 |
97 | ```
98 | SRC_REPO=https://github.com/jarokaz/mle-labs
99 | kpt pkg get $SRC_REPO/lab-03-tfjob lab-files
100 | cd lab-files
101 | ```
102 |
103 | ### Build a training container
104 | ```
105 | IMAGE_NAME=mnist-train
106 |
107 | docker build -t gcr.io/${PROJECT_ID}/${IMAGE_NAME} .
108 | docker push gcr.io/${PROJECT_ID}/${IMAGE_NAME}
109 | ```
110 |
111 | Verify that the image was pushed successfully
112 |
113 | ```
114 | gcloud container images list
115 | ```
116 |
117 | ### Update the TFJob manifest
118 | ```
119 | yq w -i tfjob.yaml 'spec.tfReplicaSpecs.Worker.template.spec.containers[0].image' gcr.io/${PROJECT_ID}/${IMAGE_NAME}
120 | yq w -i tfjob.yaml 'spec.tfReplicaSpecs.Worker.template.spec.containers[0].args[3]' --saved_model_path=gs://${TFJOB_BUCKET}/saved_model_dir
121 | yq w -i tfjob.yaml 'spec.tfReplicaSpecs.Worker.template.spec.containers[0].args[4]' --checkpoint_path=gs://${TFJOB_BUCKET}/checkpoints
122 | ```
123 | ### Submit a training job
124 | ```
125 | kubectl apply -f tfjob.yaml
126 | ```
127 |
128 | ### Monitor the job
129 | ```
130 | kubectl describe tfjob multi-worker
131 | ```
132 |
133 | ```
134 | kubectl logs --follow multi-worker-worker-0
135 | ```
136 |
137 | ```
138 | kubectl get pods
139 | ```
140 |
141 |
142 | ### Delete the job
143 | ```
144 | kubectl delete tfjob multi-worker
145 | ```
146 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/lab-files/distributed-training-gke/Dockerfile:
--------------------------------------------------------------------------------
1 | # Copyright 2020 Google. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 |
16 | FROM tensorflow/tensorflow:2.4.1
17 | RUN pip install tensorflow_datasets
18 |
19 | ADD mnist mnist
20 |
21 | ENTRYPOINT ["python", "-m", "mnist.main"]
22 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/lab-files/distributed-training-gke/Kptfile:
--------------------------------------------------------------------------------
1 | apiVersion: kpt.dev/v1alpha1
2 | kind: Kptfile
3 | metadata:
4 | name: distributed-training-gke
5 | upstream:
6 | type: git
7 | git:
8 | commit: 89ae33dc9cf10ac777a845e0a5f2c691708151b2
9 | repo: https://github.com/GoogleCloudPlatform/mlops-on-gcp
10 | directory: workshops/mlep-qwiklabs/distributed-training-gke
11 | ref: master
12 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/lab-files/distributed-training-gke/README.md:
--------------------------------------------------------------------------------
1 | # Distributed TensorFlow training on Kubernetes
2 |
3 |
4 |
5 | ## Setup and Requirements
6 |
7 | ### Qwiklabs setup
8 |
9 | ### Activate Cloud Shell
10 |
11 | ## Setting up your GKE cluster
12 |
13 |
14 | Set the project ID
15 |
16 | ```
17 | PROJECT_ID=$(gcloud config get-value project)
18 | gcloud config set compute/zone us-central1-f
19 | ```
20 |
21 | ### Creating a Kubernetes cluster
22 |
23 | Set the name and the zone for your cluster
24 |
25 | ```
26 | CLUSTER_NAME=cluster-1
27 |
28 | gcloud beta container clusters create $CLUSTER_NAME \
29 | --project=$PROJECT_ID \
30 | --cluster-version=latest \
31 | --machine-type=n1-standard-8 \
32 | --scopes compute-rw,gke-default,storage-rw \
33 | --num-nodes=3
34 |
35 | ```
36 |
37 | ### Verifying the installation
38 |
39 | Check that the cluster is up and running
40 |
41 | ```
42 | gcloud container clusters list
43 | ```
44 |
45 | Get the credentials for your new cluster so you can interact with it using `kubectl`.
46 |
47 | ```
48 | gcloud container clusters get-credentials $CLUSTER_NAME
49 | ```
50 |
51 |
52 |
53 | ## Installing TensorFlow Training (TFJob)
54 |
55 | ### Get TensorFlow Training manifests
56 |
57 | Get the manifests for TensorFlow Training from v1.1.0 of Kubeflow.
58 | ```
59 | SRC_REPO=https://github.com/kubeflow/manifests
60 | kpt pkg get $SRC_REPO/tf-training@v1.1.0 tf-training
61 | ```
62 |
63 | ### Create the `kubeflow` namespace
64 |
65 | ```
66 | kubectl create namespace kubeflow
67 | ```
68 |
69 | ### Install the TFJob CRD
70 |
71 | ```
72 | kubectl apply --kustomize tf-training/tf-job-crds/base
73 | ```
74 |
75 | ### Install the TFJob operator
76 | ```
77 | kubectl apply --kustomize tf-training/tf-job-operator/base
78 | ```
79 |
80 | ### Verify installation
81 | ```
82 | kubectl get pods -n kubeflow
83 | ```
84 |
85 | ## Create a GCS bucket for checkpoints and SavedModel
86 |
87 | ```
88 | export TFJOB_BUCKET=
89 |
90 | gsutil mb gs://${TFJOB_BUCKET}
91 | ```
92 |
93 | ## Running and monitoring distributed jobs
94 |
95 | ### Copy lab files
96 |
97 | ```
98 | SRC_REPO=https://github.com/jarokaz/mle-labs
99 | kpt pkg get $SRC_REPO/lab-03-tfjob lab-files
100 | cd lab-files
101 | ```
102 |
103 | ### Build a training container
104 | ```
105 | IMAGE_NAME=mnist-train
106 |
107 | docker build -t gcr.io/${PROJECT_ID}/${IMAGE_NAME} .
108 | docker push gcr.io/${PROJECT_ID}/${IMAGE_NAME}
109 | ```
110 |
111 | Verify that the image was pushed successfully
112 |
113 | ```
114 | gcloud container images list
115 | ```
116 |
117 | ### Update the TFJob manifest
118 | ```
119 | yq w -i tfjob.yaml 'spec.tfReplicaSpecs.Worker.template.spec.containers[0].image' gcr.io/${PROJECT_ID}/${IMAGE_NAME}
120 | yq w -i tfjob.yaml 'spec.tfReplicaSpecs.Worker.template.spec.containers[0].args[3]' --saved_model_path=gs://${TFJOB_BUCKET}/saved_model_dir
121 | yq w -i tfjob.yaml 'spec.tfReplicaSpecs.Worker.template.spec.containers[0].args[4]' --checkpoint_path=gs://${TFJOB_BUCKET}/checkpoints
122 | ```
123 | ### Submit a training job
124 | ```
125 | kubectl apply -f tfjob.yaml
126 | ```
127 |
128 | ### Monitor the job
129 | ```
130 | kubectl describe tfjob multi-worker
131 | ```
132 |
133 | ```
134 | kubectl logs --follow multi-worker-worker-0
135 | ```
136 |
137 | ```
138 | kubectl get pods
139 | ```
140 |
141 |
142 | ### Delete the job
143 | ```
144 | kubectl delete tfjob multi-worker
145 | ```
146 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/lab-files/distributed-training-gke/mnist/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zhulingchen/Machine-Learning-Engineering-for-Production-MLOps-Specialization/4a936bba66fa15a60899b7b2d0d8bddb56633a3a/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/lab-files/distributed-training-gke/mnist/__init__.py
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/lab-files/distributed-training-gke/mnist/main.py:
--------------------------------------------------------------------------------
1 |
2 | # Copyright 2020 Google. All Rights Reserved.
3 | #
4 | # Licensed under the Apache License, Version 2.0 (the "License");
5 | # you may not use this file except in compliance with the License.
6 | # You may obtain a copy of the License at
7 | #
8 | # http://www.apache.org/licenses/LICENSE-2.0
9 | #
10 | # Unless required by applicable law or agreed to in writing, software
11 | # distributed under the License is distributed on an "AS IS" BASIS,
12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | # See the License for the specific language governing permissions and
14 | # limitations under the License.
15 | # ==============================================================================
16 | """An example of multi-worker training with Keras model using Strategy API."""
17 |
18 | import argparse
19 | import json
20 | import logging
21 | import os
22 |
23 | import tensorflow_datasets as tfds
24 | import tensorflow as tf
25 | import mnist.model as mnist
26 |
27 | BUFFER_SIZE = 100000
28 |
29 |
30 | def _scale(image, label):
31 | """Scales an image tensor."""
32 | image = tf.cast(image, tf.float32)
33 | image /= 255
34 | return image, label
35 |
36 |
37 | def _is_chief(task_type, task_id):
38 | """Determines if the replica is the Chief."""
39 | return task_type is None or task_type == 'chief' or (
40 | task_type == 'worker' and task_id == 0)
41 |
42 |
43 | def _get_saved_model_dir(base_path, task_type, task_id):
44 | """Returns a location for the SavedModel."""
45 |
46 | saved_model_path = base_path
47 | if not _is_chief(task_type, task_id):
48 | temp_dir = os.path.join('/tmp', task_type, str(task_id))
49 | tf.io.gfile.makedirs(temp_dir)
50 | saved_model_path = temp_dir
51 |
52 | return saved_model_path
53 |
54 |
55 | def train(epochs, steps_per_epoch, per_worker_batch, checkpoint_path, saved_model_path):
56 | """Trains a MNIST classification model using multi-worker mirrored strategy."""
57 |
58 | strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()
59 | task_type = strategy.cluster_resolver.task_type
60 | task_id = strategy.cluster_resolver.task_id
61 | global_batch_size = per_worker_batch * strategy.num_replicas_in_sync
62 |
63 | with strategy.scope():
64 | datasets, _ = tfds.load(name='mnist', with_info=True, as_supervised=True)
65 | dataset = datasets['train'].map(_scale).cache().shuffle(BUFFER_SIZE).batch(global_batch_size).repeat()
66 | options = tf.data.Options()
67 | options.experimental_distribute.auto_shard_policy = \
68 | tf.data.experimental.AutoShardPolicy.DATA
69 | dataset = dataset.with_options(options)
70 | multi_worker_model = mnist.build_and_compile_cnn_model()
71 |
72 | callbacks = [
73 | tf.keras.callbacks.experimental.BackupAndRestore(checkpoint_path)
74 | ]
75 |
76 | multi_worker_model.fit(dataset,
77 | epochs=epochs,
78 | steps_per_epoch=steps_per_epoch,
79 | callbacks=callbacks)
80 |
81 |
82 | logging.info("Saving the trained model to: {}".format(saved_model_path))
83 | saved_model_dir = _get_saved_model_dir(saved_model_path, task_type, task_id)
84 | multi_worker_model.save(saved_model_dir)
85 |
86 | if __name__ == '__main__':
87 |
88 | logging.getLogger().setLevel(logging.INFO)
89 | tfds.disable_progress_bar()
90 |
91 | parser = argparse.ArgumentParser()
92 | parser.add_argument('--epochs',
93 | type=int,
94 | required=True,
95 | help='Number of epochs to train.')
96 | parser.add_argument('--steps_per_epoch',
97 | type=int,
98 | required=True,
99 | help='Steps per epoch.')
100 | parser.add_argument('--per_worker_batch',
101 | type=int,
102 | required=True,
103 | help='Per worker batch.')
104 | parser.add_argument('--saved_model_path',
105 | type=str,
106 | required=True,
107 | help='Tensorflow export directory.')
108 | parser.add_argument('--checkpoint_path',
109 | type=str,
110 | required=True,
111 | help='Tensorflow checkpoint directory.')
112 |
113 | args = parser.parse_args()
114 |
115 | train(args.epochs, args.steps_per_epoch, args.per_worker_batch,
116 | args.checkpoint_path, args.saved_model_path)
117 |
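# ---------------------------------------------------------------------------
# Note added for clarity (not part of the original lab file): when this script
# runs under the TFJob operator, each worker pod receives a TF_CONFIG
# environment variable that MultiWorkerMirroredStrategy's cluster resolver
# reads to discover the cluster and its own role. For the 3-worker job named
# `multi-worker` defined in tfjob.yaml, worker 0's TF_CONFIG looks roughly
# like this (hostnames and port are illustrative):
#
#   {
#     "cluster": {
#       "worker": ["multi-worker-worker-0:2222",
#                  "multi-worker-worker-1:2222",
#                  "multi-worker-worker-2:2222"]
#     },
#     "task": {"type": "worker", "index": 0}
#   }
#
# _is_chief() above treats worker 0 as the chief, which is why only that
# replica saves the SavedModel to the real --saved_model_path while the other
# workers write to temporary directories.
# ---------------------------------------------------------------------------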
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/lab-files/distributed-training-gke/mnist/model.py:
--------------------------------------------------------------------------------
1 | # Copyright 2020 Google. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 | """An example of multi-worker training with Keras model using Strategy API."""
16 |
17 | import os
18 | import tensorflow as tf
19 | import numpy as np
20 |
21 | def build_and_compile_cnn_model():
22 | model = tf.keras.Sequential([
23 | tf.keras.Input(shape=(28, 28)),
24 | tf.keras.layers.Reshape(target_shape=(28, 28, 1)),
25 | tf.keras.layers.Conv2D(32, 3, activation='relu'),
26 | tf.keras.layers.Flatten(),
27 | tf.keras.layers.Dense(128, activation='relu'),
28 | tf.keras.layers.Dense(10)
29 | ])
30 | model.compile(
31 | loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
32 | optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
33 | metrics=['accuracy'])
34 | return model
35 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/lab-files/distributed-training-gke/tfjob.yaml:
--------------------------------------------------------------------------------
1 | # Copyright 2020 Google. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 |
16 | apiVersion: kubeflow.org/v1
17 | kind: TFJob
18 | metadata:
19 | name: multi-worker
20 | spec:
21 | cleanPodPolicy: None
22 | tfReplicaSpecs:
23 | Worker:
24 | replicas: 3
25 | template:
26 | spec:
27 | containers:
28 | - name: tensorflow
29 | image: mnist
30 | args:
31 | - --epochs=5
32 | - --steps_per_epoch=100
33 | - --per_worker_batch=64
34 | - --saved_model_path=gs://bucket/saved_model_dir
35 | - --checkpoint_path=gs://bucket/checkpoints
36 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/lab-files/mnist/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zhulingchen/Machine-Learning-Engineering-for-Production-MLOps-Specialization/4a936bba66fa15a60899b7b2d0d8bddb56633a3a/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/lab-files/mnist/__init__.py
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/lab-files/mnist/main.py:
--------------------------------------------------------------------------------
1 |
2 | # Copyright 2020 Google. All Rights Reserved.
3 | #
4 | # Licensed under the Apache License, Version 2.0 (the "License");
5 | # you may not use this file except in compliance with the License.
6 | # You may obtain a copy of the License at
7 | #
8 | # http://www.apache.org/licenses/LICENSE-2.0
9 | #
10 | # Unless required by applicable law or agreed to in writing, software
11 | # distributed under the License is distributed on an "AS IS" BASIS,
12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | # See the License for the specific language governing permissions and
14 | # limitations under the License.
15 | # ==============================================================================
16 | """An example of multi-worker training with Keras model using Strategy API."""
17 |
18 | import argparse
19 | import json
20 | import logging
21 | import os
22 |
23 | import tensorflow_datasets as tfds
24 | import tensorflow as tf
25 | import mnist.model as mnist
26 |
27 | BUFFER_SIZE = 100000
28 |
29 |
30 | def _scale(image, label):
31 | """Scales an image tensor."""
32 | image = tf.cast(image, tf.float32)
33 | image /= 255
34 | return image, label
35 |
36 |
37 | def _is_chief(task_type, task_id):
38 | """Determines if the replica is the Chief."""
39 | return task_type is None or task_type == 'chief' or (
40 | task_type == 'worker' and task_id == 0)
41 |
42 |
43 | def _get_saved_model_dir(base_path, task_type, task_id):
44 | """Returns a location for the SavedModel."""
45 |
46 | saved_model_path = base_path
47 | if not _is_chief(task_type, task_id):
48 | temp_dir = os.path.join('/tmp', task_type, str(task_id))
49 | tf.io.gfile.makedirs(temp_dir)
50 | saved_model_path = temp_dir
51 |
52 | return saved_model_path
53 |
54 |
55 | def train(epochs, steps_per_epoch, per_worker_batch, checkpoint_path, saved_model_path):
56 | """Trains a MNIST classification model using multi-worker mirrored strategy."""
57 |
58 | strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()
59 | task_type = strategy.cluster_resolver.task_type
60 | task_id = strategy.cluster_resolver.task_id
61 | global_batch_size = per_worker_batch * strategy.num_replicas_in_sync
62 |
63 | with strategy.scope():
64 | datasets, _ = tfds.load(name='mnist', with_info=True, as_supervised=True)
65 | dataset = datasets['train'].map(_scale).cache().shuffle(BUFFER_SIZE).batch(global_batch_size).repeat()
66 | options = tf.data.Options()
67 | options.experimental_distribute.auto_shard_policy = \
68 | tf.data.experimental.AutoShardPolicy.DATA
69 | dataset = dataset.with_options(options)
70 | multi_worker_model = mnist.build_and_compile_cnn_model()
71 |
72 | callbacks = [
73 | tf.keras.callbacks.experimental.BackupAndRestore(checkpoint_path)
74 | ]
75 |
76 | multi_worker_model.fit(dataset,
77 | epochs=epochs,
78 | steps_per_epoch=steps_per_epoch,
79 | callbacks=callbacks)
80 |
81 |
82 | logging.info("Saving the trained model to: {}".format(saved_model_path))
83 | saved_model_dir = _get_saved_model_dir(saved_model_path, task_type, task_id)
84 | multi_worker_model.save(saved_model_dir)
85 |
86 | if __name__ == '__main__':
87 |
88 | logging.getLogger().setLevel(logging.INFO)
89 | tfds.disable_progress_bar()
90 |
91 | parser = argparse.ArgumentParser()
92 | parser.add_argument('--epochs',
93 | type=int,
94 | required=True,
95 | help='Number of epochs to train.')
96 | parser.add_argument('--steps_per_epoch',
97 | type=int,
98 | required=True,
99 | help='Steps per epoch.')
100 | parser.add_argument('--per_worker_batch',
101 | type=int,
102 | required=True,
103 | help='Per worker batch.')
104 | parser.add_argument('--saved_model_path',
105 | type=str,
106 | required=True,
107 | help='Tensorflow export directory.')
108 | parser.add_argument('--checkpoint_path',
109 | type=str,
110 | required=True,
111 | help='Tensorflow checkpoint directory.')
112 |
113 | args = parser.parse_args()
114 |
115 | train(args.epochs, args.steps_per_epoch, args.per_worker_batch,
116 | args.checkpoint_path, args.saved_model_path)
117 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/lab-files/mnist/model.py:
--------------------------------------------------------------------------------
1 | # Copyright 2020 Google. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 | """An example of multi-worker training with Keras model using Strategy API."""
16 |
17 | import os
18 | import tensorflow as tf
19 | import numpy as np
20 |
21 | def build_and_compile_cnn_model():
22 | model = tf.keras.Sequential([
23 | tf.keras.Input(shape=(28, 28)),
24 | tf.keras.layers.Reshape(target_shape=(28, 28, 1)),
25 | tf.keras.layers.Conv2D(32, 3, activation='relu'),
26 | tf.keras.layers.Flatten(),
27 | tf.keras.layers.Dense(128, activation='relu'),
28 | tf.keras.layers.Dense(10)
29 | ])
30 | model.compile(
31 | loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
32 | optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
33 | metrics=['accuracy'])
34 | return model
35 |
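36 | # Note: the final Dense(10) layer emits raw logits (no softmax), which is why the
37 | # loss is built with from_logits=True. A rough sketch of turning predictions into
38 | # class labels at inference time (not part of the lab code):
39 | #
40 | #     probs = tf.nn.softmax(model(images))   # images: a float batch of 28x28 inputs
41 | #     labels = tf.argmax(probs, axis=-1)     # predicted digit per example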
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/lab-files/tfjob.yaml:
--------------------------------------------------------------------------------
1 |
2 | apiVersion: kubeflow.org/v1
3 | kind: TFJob
4 | metadata:
5 | name: multi-worker
6 | spec:
7 | cleanPodPolicy: None
8 | tfReplicaSpecs:
9 | Worker:
10 | replicas: 3
11 | template:
12 | spec:
13 | containers:
14 | - name: tensorflow
15 | image: gcr.io/zlc-test-2017/mnist-train
16 | args:
17 | - --epochs=5
18 | - --steps_per_epoch=100
19 | - --per_worker_batch=64
20 | - --saved_model_path=gs://zlc-test-2017-bucket/saved_model_dir
21 | - --checkpoint_path=gs://zlc-test-2017-bucket/checkpoints
22 |
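23 | # Note: the container image and GCS paths above are project-specific; point them at
24 | # your own registry and bucket before applying. When the TFJob controller starts the
25 | # 3 worker replicas, it injects a TF_CONFIG environment variable into each pod, which
26 | # MultiWorkerMirroredStrategy in main.py reads to discover the cluster. For worker 0
27 | # it looks roughly like this (host names here are illustrative only):
28 | #
29 | #   TF_CONFIG='{"cluster": {"worker": ["multi-worker-worker-0:2222",
30 | #                                      "multi-worker-worker-1:2222",
31 | #                                      "multi-worker-worker-2:2222"]},
32 | #               "task": {"type": "worker", "index": 0}}'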
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/Kptfile:
--------------------------------------------------------------------------------
1 | apiVersion: kpt.dev/v1alpha1
2 | kind: Kptfile
3 | metadata:
4 | name: tf-training
5 | upstream:
6 | type: git
7 | git:
8 | commit: adfca58fa01eedb1e3cacef097d7f1a3a405d16e
9 | repo: https://github.com/kubeflow/manifests
10 | directory: tf-training
11 | ref: v1.1.0
12 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/OWNERS:
--------------------------------------------------------------------------------
1 | approvers:
2 | - andreyvelich
3 | - gaocegege
4 | - johnugeorge
5 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-job-crds/base/crd.yaml:
--------------------------------------------------------------------------------
1 | apiVersion: apiextensions.k8s.io/v1beta1
2 | kind: CustomResourceDefinition
3 | metadata:
4 | name: tfjobs.kubeflow.org
5 | spec:
6 | additionalPrinterColumns:
7 | - JSONPath: .status.conditions[-1:].type
8 | name: State
9 | type: string
10 | - JSONPath: .metadata.creationTimestamp
11 | name: Age
12 | type: date
13 | group: kubeflow.org
14 | names:
15 | kind: TFJob
16 | plural: tfjobs
17 | singular: tfjob
18 | scope: Namespaced
19 | subresources:
20 | status: {}
21 | validation:
22 | openAPIV3Schema:
23 | properties:
24 | spec:
25 | properties:
26 | tfReplicaSpecs:
27 | properties:
28 | Chief:
29 | properties:
30 | replicas:
31 | maximum: 1
32 | minimum: 1
33 | type: integer
34 | PS:
35 | properties:
36 | replicas:
37 | minimum: 1
38 | type: integer
39 | Worker:
40 | properties:
41 | replicas:
42 | minimum: 1
43 | type: integer
44 | versions:
45 | - name: v1
46 | served: true
47 | storage: true
48 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-job-crds/base/kustomization.yaml:
--------------------------------------------------------------------------------
1 | apiVersion: kustomize.config.k8s.io/v1beta1
2 | kind: Kustomization
3 | resources:
4 | - crd.yaml
5 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-job-crds/overlays/application/application.yaml:
--------------------------------------------------------------------------------
1 | apiVersion: app.k8s.io/v1beta1
2 | kind: Application
3 | metadata:
4 | name: tf-job-crds
5 | spec:
6 | selector:
7 | matchLabels:
8 | app.kubernetes.io/name: tf-job-crds
9 | app.kubernetes.io/instance: tf-job-crds-v0.7.0
10 | app.kubernetes.io/managed-by: kfctl
11 | app.kubernetes.io/component: tfjob
12 | app.kubernetes.io/part-of: kubeflow
13 | app.kubernetes.io/version: v0.7.0
14 | componentKinds:
15 | - group: core
16 | kind: Service
17 | - group: apps
18 | kind: Deployment
19 | - group: core
20 | kind: ServiceAccount
21 | - group: kubeflow.org
22 | kind: TFJob
23 | descriptor:
24 | type: "tf-job-crds"
25 | version: "v1"
26 | description: "Tf-job-crds contains the \"TFJob\" custom resource definition."
27 | maintainers:
28 | - name: Richard Liu
29 | email: ricliu@google.com
30 | owners:
31 | - name: Richard Liu
32 | email: ricliu@google.com
33 | keywords:
34 | - "tfjob"
35 | - "tf-operator"
36 | - "tf-training"
37 | links:
38 | - description: About
39 | url: "https://github.com/kubeflow/tf-operator"
40 | - description: Docs
41 | url: "https://www.kubeflow.org/docs/reference/tfjob/v1/tensorflow/"
42 | addOwnerRef: true
43 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-job-crds/overlays/application/kustomization.yaml:
--------------------------------------------------------------------------------
1 | apiVersion: kustomize.config.k8s.io/v1beta1
2 | bases:
3 | - ../../base
4 | commonLabels:
5 | app.kubernetes.io/component: tfjob
6 | app.kubernetes.io/name: tf-job-crds
7 | kind: Kustomization
8 | resources:
9 | - application.yaml
10 |
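11 | # Note: a rough usage sketch, assuming kustomize and kubectl are installed and the
12 | # working directory is C3W3-Assignment/tf-training:
13 | #
14 | #   kustomize build tf-job-crds/overlays/application | kubectl apply -f -
15 | #   kustomize build tf-job-operator/overlays/application | kubectl apply -f -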
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-job-operator/base/cluster-role-binding.yaml:
--------------------------------------------------------------------------------
1 | ---
2 | apiVersion: rbac.authorization.k8s.io/v1beta1
3 | kind: ClusterRoleBinding
4 | metadata:
5 | labels:
6 | app: tf-job-operator
7 | name: tf-job-operator
8 | roleRef:
9 | apiGroup: rbac.authorization.k8s.io
10 | kind: ClusterRole
11 | name: tf-job-operator
12 | subjects:
13 | - kind: ServiceAccount
14 | name: tf-job-operator
15 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-job-operator/base/cluster-role.yaml:
--------------------------------------------------------------------------------
1 | ---
2 | apiVersion: rbac.authorization.k8s.io/v1beta1
3 | kind: ClusterRole
4 | metadata:
5 | labels:
6 | app: tf-job-operator
7 | name: tf-job-operator
8 | rules:
9 | - apiGroups:
10 | - kubeflow.org
11 | resources:
12 | - tfjobs
13 | - tfjobs/status
14 | - tfjobs/finalizers
15 | verbs:
16 | - '*'
17 | - apiGroups:
18 | - apiextensions.k8s.io
19 | resources:
20 | - customresourcedefinitions
21 | verbs:
22 | - '*'
23 | - apiGroups:
24 | - ""
25 | resources:
26 | - pods
27 | - services
28 | - endpoints
29 | - events
30 | verbs:
31 | - '*'
32 | - apiGroups:
33 | - apps
34 | - extensions
35 | resources:
36 | - deployments
37 | verbs:
38 | - '*'
39 |
40 | ---
41 |
42 | apiVersion: rbac.authorization.k8s.io/v1
43 | kind: ClusterRole
44 | metadata:
45 | name: kubeflow-tfjobs-admin
46 | labels:
47 | rbac.authorization.kubeflow.org/aggregate-to-kubeflow-admin: "true"
48 | aggregationRule:
49 | clusterRoleSelectors:
50 | - matchLabels:
51 | rbac.authorization.kubeflow.org/aggregate-to-kubeflow-tfjobs-admin: "true"
52 | rules: []
53 |
54 | ---
55 |
56 | apiVersion: rbac.authorization.k8s.io/v1
57 | kind: ClusterRole
58 | metadata:
59 | name: kubeflow-tfjobs-edit
60 | labels:
61 | rbac.authorization.kubeflow.org/aggregate-to-kubeflow-edit: "true"
62 | rbac.authorization.kubeflow.org/aggregate-to-kubeflow-tfjobs-admin: "true"
63 | rules:
64 | - apiGroups:
65 | - kubeflow.org
66 | resources:
67 | - tfjobs
68 | - tfjobs/status
69 | verbs:
70 | - get
71 | - list
72 | - watch
73 | - create
74 | - delete
75 | - deletecollection
76 | - patch
77 | - update
78 |
79 | ---
80 |
81 | apiVersion: rbac.authorization.k8s.io/v1
82 | kind: ClusterRole
83 | metadata:
84 | name: kubeflow-tfjobs-view
85 | labels:
86 | rbac.authorization.kubeflow.org/aggregate-to-kubeflow-view: "true"
87 | rules:
88 | - apiGroups:
89 | - kubeflow.org
90 | resources:
91 | - tfjobs
92 | - tfjobs/status
93 | verbs:
94 | - get
95 | - list
96 | - watch
97 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-job-operator/base/deployment.yaml:
--------------------------------------------------------------------------------
1 | ---
2 | apiVersion: apps/v1
3 | kind: Deployment
4 | metadata:
5 | name: tf-job-operator
6 | spec:
7 | replicas: 1
8 | template:
9 | metadata:
10 | labels:
11 | name: tf-job-operator
12 | annotations:
13 | sidecar.istio.io/inject: "false"
14 | spec:
15 | containers:
16 | - args:
17 | - --alsologtostderr
18 | - -v=1
19 | - --monitoring-port=8443
20 | env:
21 | - name: MY_POD_NAMESPACE
22 | valueFrom:
23 | fieldRef:
24 | fieldPath: metadata.namespace
25 | - name: MY_POD_NAME
26 | valueFrom:
27 | fieldRef:
28 | fieldPath: metadata.name
29 | image: gcr.io/kubeflow-images-public/tf_operator:kubeflow-tf-operator-postsubmit-v1-5adee6f-6109-a25c
30 | name: tf-job-operator
31 | serviceAccountName: tf-job-operator
32 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-job-operator/base/kustomization.yaml:
--------------------------------------------------------------------------------
1 | apiVersion: kustomize.config.k8s.io/v1beta1
2 | kind: Kustomization
3 | namespace: kubeflow
4 | resources:
5 | - cluster-role-binding.yaml
6 | - cluster-role.yaml
7 | - deployment.yaml
8 | - service-account.yaml
9 | - service.yaml
10 | commonLabels:
11 | kustomize.component: tf-job-operator
12 | images:
13 | - name: gcr.io/kubeflow-images-public/tf_operator
14 | newName: gcr.io/kubeflow-images-public/tf_operator
15 | newTag: vmaster-ga2ae7bff
16 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-job-operator/base/params.env:
--------------------------------------------------------------------------------
1 | namespace=
2 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-job-operator/base/service-account.yaml:
--------------------------------------------------------------------------------
1 | ---
2 | apiVersion: v1
3 | kind: ServiceAccount
4 | metadata:
5 | labels:
6 | app: tf-job-dashboard
7 | name: tf-job-dashboard
8 | ---
9 | apiVersion: v1
10 | kind: ServiceAccount
11 | metadata:
12 | labels:
13 | app: tf-job-operator
14 | name: tf-job-operator
15 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-job-operator/base/service.yaml:
--------------------------------------------------------------------------------
1 | ---
2 | apiVersion: v1
3 | kind: Service
4 | metadata:
5 | annotations:
6 | prometheus.io/path: /metrics
7 | prometheus.io/scrape: "true"
8 | prometheus.io/port: "8443"
9 | labels:
10 | app: tf-job-operator
11 | name: tf-job-operator
12 | spec:
13 | ports:
14 | - name: monitoring-port
15 | port: 8443
16 | targetPort: 8443
17 | selector:
18 | name: tf-job-operator
19 | type: ClusterIP
20 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-job-operator/overlays/application/application.yaml:
--------------------------------------------------------------------------------
1 | apiVersion: app.k8s.io/v1beta1
2 | kind: Application
3 | metadata:
4 | name: tf-job-operator
5 | spec:
6 | selector:
7 | matchLabels:
8 | app.kubernetes.io/name: tf-job-operator
9 | app.kubernetes.io/instance: tf-job-operator-v0.7.0
10 | app.kubernetes.io/managed-by: kfctl
11 | app.kubernetes.io/component: tfjob
12 | app.kubernetes.io/part-of: kubeflow
13 | app.kubernetes.io/version: v0.7.0
14 | componentKinds:
15 | - group: core
16 | kind: Service
17 | - group: apps
18 | kind: Deployment
19 | - group: core
20 | kind: ServiceAccount
21 | - group: kubeflow.org
22 | kind: TFJob
23 | descriptor:
24 | type: "tf-job-operator"
25 | version: "v1"
26 | description: "Tf-operator allows users to create and manage the \"TFJob\" custom resource."
27 | maintainers:
28 | - name: Richard Liu
29 | email: ricliu@google.com
30 | owners:
31 | - name: Richard Liu
32 | email: ricliu@google.com
33 | keywords:
34 | - "tfjob"
35 | - "tf-operator"
36 | - "tf-training"
37 | links:
38 | - description: About
39 | url: "https://github.com/kubeflow/tf-operator"
40 | - description: Docs
41 | url: "https://www.kubeflow.org/docs/reference/tfjob/v1/tensorflow/"
42 | addOwnerRef: true
43 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-job-operator/overlays/application/kustomization.yaml:
--------------------------------------------------------------------------------
1 | apiVersion: kustomize.config.k8s.io/v1beta1
2 | bases:
3 | - ../../base
4 | commonLabels:
5 | app.kubernetes.io/component: tfjob
6 | app.kubernetes.io/name: tf-job-operator
7 | kind: Kustomization
8 | resources:
9 | - application.yaml
10 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-training/Kptfile:
--------------------------------------------------------------------------------
1 | apiVersion: kpt.dev/v1alpha1
2 | kind: Kptfile
3 | metadata:
4 | name: tf-training
5 | upstream:
6 | type: git
7 | git:
8 | commit: adfca58fa01eedb1e3cacef097d7f1a3a405d16e
9 | repo: https://github.com/kubeflow/manifests
10 | directory: tf-training
11 | ref: v1.1.0
12 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-training/OWNERS:
--------------------------------------------------------------------------------
1 | approvers:
2 | - andreyvelich
3 | - gaocegege
4 | - johnugeorge
5 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-training/tf-job-crds/base/crd.yaml:
--------------------------------------------------------------------------------
1 | apiVersion: apiextensions.k8s.io/v1beta1
2 | kind: CustomResourceDefinition
3 | metadata:
4 | name: tfjobs.kubeflow.org
5 | spec:
6 | additionalPrinterColumns:
7 | - JSONPath: .status.conditions[-1:].type
8 | name: State
9 | type: string
10 | - JSONPath: .metadata.creationTimestamp
11 | name: Age
12 | type: date
13 | group: kubeflow.org
14 | names:
15 | kind: TFJob
16 | plural: tfjobs
17 | singular: tfjob
18 | scope: Namespaced
19 | subresources:
20 | status: {}
21 | validation:
22 | openAPIV3Schema:
23 | properties:
24 | spec:
25 | properties:
26 | tfReplicaSpecs:
27 | properties:
28 | Chief:
29 | properties:
30 | replicas:
31 | maximum: 1
32 | minimum: 1
33 | type: integer
34 | PS:
35 | properties:
36 | replicas:
37 | minimum: 1
38 | type: integer
39 | Worker:
40 | properties:
41 | replicas:
42 | minimum: 1
43 | type: integer
44 | versions:
45 | - name: v1
46 | served: true
47 | storage: true
48 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-training/tf-job-crds/base/kustomization.yaml:
--------------------------------------------------------------------------------
1 | apiVersion: kustomize.config.k8s.io/v1beta1
2 | kind: Kustomization
3 | resources:
4 | - crd.yaml
5 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-training/tf-job-crds/overlays/application/application.yaml:
--------------------------------------------------------------------------------
1 | apiVersion: app.k8s.io/v1beta1
2 | kind: Application
3 | metadata:
4 | name: tf-job-crds
5 | spec:
6 | selector:
7 | matchLabels:
8 | app.kubernetes.io/name: tf-job-crds
9 | app.kubernetes.io/instance: tf-job-crds-v0.7.0
10 | app.kubernetes.io/managed-by: kfctl
11 | app.kubernetes.io/component: tfjob
12 | app.kubernetes.io/part-of: kubeflow
13 | app.kubernetes.io/version: v0.7.0
14 | componentKinds:
15 | - group: core
16 | kind: Service
17 | - group: apps
18 | kind: Deployment
19 | - group: core
20 | kind: ServiceAccount
21 | - group: kubeflow.org
22 | kind: TFJob
23 | descriptor:
24 | type: "tf-job-crds"
25 | version: "v1"
26 | description: "Tf-job-crds contains the \"TFJob\" custom resource definition."
27 | maintainers:
28 | - name: Richard Liu
29 | email: ricliu@google.com
30 | owners:
31 | - name: Richard Liu
32 | email: ricliu@google.com
33 | keywords:
34 | - "tfjob"
35 | - "tf-operator"
36 | - "tf-training"
37 | links:
38 | - description: About
39 | url: "https://github.com/kubeflow/tf-operator"
40 | - description: Docs
41 | url: "https://www.kubeflow.org/docs/reference/tfjob/v1/tensorflow/"
42 | addOwnerRef: true
43 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-training/tf-job-crds/overlays/application/kustomization.yaml:
--------------------------------------------------------------------------------
1 | apiVersion: kustomize.config.k8s.io/v1beta1
2 | bases:
3 | - ../../base
4 | commonLabels:
5 | app.kubernetes.io/component: tfjob
6 | app.kubernetes.io/name: tf-job-crds
7 | kind: Kustomization
8 | resources:
9 | - application.yaml
10 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-training/tf-job-operator/base/cluster-role-binding.yaml:
--------------------------------------------------------------------------------
1 | ---
2 | apiVersion: rbac.authorization.k8s.io/v1beta1
3 | kind: ClusterRoleBinding
4 | metadata:
5 | labels:
6 | app: tf-job-operator
7 | name: tf-job-operator
8 | roleRef:
9 | apiGroup: rbac.authorization.k8s.io
10 | kind: ClusterRole
11 | name: tf-job-operator
12 | subjects:
13 | - kind: ServiceAccount
14 | name: tf-job-operator
15 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-training/tf-job-operator/base/cluster-role.yaml:
--------------------------------------------------------------------------------
1 | ---
2 | apiVersion: rbac.authorization.k8s.io/v1beta1
3 | kind: ClusterRole
4 | metadata:
5 | labels:
6 | app: tf-job-operator
7 | name: tf-job-operator
8 | rules:
9 | - apiGroups:
10 | - kubeflow.org
11 | resources:
12 | - tfjobs
13 | - tfjobs/status
14 | - tfjobs/finalizers
15 | verbs:
16 | - '*'
17 | - apiGroups:
18 | - apiextensions.k8s.io
19 | resources:
20 | - customresourcedefinitions
21 | verbs:
22 | - '*'
23 | - apiGroups:
24 | - ""
25 | resources:
26 | - pods
27 | - services
28 | - endpoints
29 | - events
30 | verbs:
31 | - '*'
32 | - apiGroups:
33 | - apps
34 | - extensions
35 | resources:
36 | - deployments
37 | verbs:
38 | - '*'
39 |
40 | ---
41 |
42 | apiVersion: rbac.authorization.k8s.io/v1
43 | kind: ClusterRole
44 | metadata:
45 | name: kubeflow-tfjobs-admin
46 | labels:
47 | rbac.authorization.kubeflow.org/aggregate-to-kubeflow-admin: "true"
48 | aggregationRule:
49 | clusterRoleSelectors:
50 | - matchLabels:
51 | rbac.authorization.kubeflow.org/aggregate-to-kubeflow-tfjobs-admin: "true"
52 | rules: []
53 |
54 | ---
55 |
56 | apiVersion: rbac.authorization.k8s.io/v1
57 | kind: ClusterRole
58 | metadata:
59 | name: kubeflow-tfjobs-edit
60 | labels:
61 | rbac.authorization.kubeflow.org/aggregate-to-kubeflow-edit: "true"
62 | rbac.authorization.kubeflow.org/aggregate-to-kubeflow-tfjobs-admin: "true"
63 | rules:
64 | - apiGroups:
65 | - kubeflow.org
66 | resources:
67 | - tfjobs
68 | - tfjobs/status
69 | verbs:
70 | - get
71 | - list
72 | - watch
73 | - create
74 | - delete
75 | - deletecollection
76 | - patch
77 | - update
78 |
79 | ---
80 |
81 | apiVersion: rbac.authorization.k8s.io/v1
82 | kind: ClusterRole
83 | metadata:
84 | name: kubeflow-tfjobs-view
85 | labels:
86 | rbac.authorization.kubeflow.org/aggregate-to-kubeflow-view: "true"
87 | rules:
88 | - apiGroups:
89 | - kubeflow.org
90 | resources:
91 | - tfjobs
92 | - tfjobs/status
93 | verbs:
94 | - get
95 | - list
96 | - watch
97 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-training/tf-job-operator/base/deployment.yaml:
--------------------------------------------------------------------------------
1 | ---
2 | apiVersion: apps/v1
3 | kind: Deployment
4 | metadata:
5 | name: tf-job-operator
6 | spec:
7 | replicas: 1
8 | template:
9 | metadata:
10 | labels:
11 | name: tf-job-operator
12 | annotations:
13 | sidecar.istio.io/inject: "false"
14 | spec:
15 | containers:
16 | - args:
17 | - --alsologtostderr
18 | - -v=1
19 | - --monitoring-port=8443
20 | env:
21 | - name: MY_POD_NAMESPACE
22 | valueFrom:
23 | fieldRef:
24 | fieldPath: metadata.namespace
25 | - name: MY_POD_NAME
26 | valueFrom:
27 | fieldRef:
28 | fieldPath: metadata.name
29 | image: gcr.io/kubeflow-images-public/tf_operator:kubeflow-tf-operator-postsubmit-v1-5adee6f-6109-a25c
30 | name: tf-job-operator
31 | serviceAccountName: tf-job-operator
32 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-training/tf-job-operator/base/kustomization.yaml:
--------------------------------------------------------------------------------
1 | apiVersion: kustomize.config.k8s.io/v1beta1
2 | kind: Kustomization
3 | namespace: kubeflow
4 | resources:
5 | - cluster-role-binding.yaml
6 | - cluster-role.yaml
7 | - deployment.yaml
8 | - service-account.yaml
9 | - service.yaml
10 | commonLabels:
11 | kustomize.component: tf-job-operator
12 | images:
13 | - name: gcr.io/kubeflow-images-public/tf_operator
14 | newName: gcr.io/kubeflow-images-public/tf_operator
15 | newTag: vmaster-ga2ae7bff
16 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-training/tf-job-operator/base/params.env:
--------------------------------------------------------------------------------
1 | namespace=
2 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-training/tf-job-operator/base/service-account.yaml:
--------------------------------------------------------------------------------
1 | ---
2 | apiVersion: v1
3 | kind: ServiceAccount
4 | metadata:
5 | labels:
6 | app: tf-job-dashboard
7 | name: tf-job-dashboard
8 | ---
9 | apiVersion: v1
10 | kind: ServiceAccount
11 | metadata:
12 | labels:
13 | app: tf-job-operator
14 | name: tf-job-operator
15 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-training/tf-job-operator/base/service.yaml:
--------------------------------------------------------------------------------
1 | ---
2 | apiVersion: v1
3 | kind: Service
4 | metadata:
5 | annotations:
6 | prometheus.io/path: /metrics
7 | prometheus.io/scrape: "true"
8 | prometheus.io/port: "8443"
9 | labels:
10 | app: tf-job-operator
11 | name: tf-job-operator
12 | spec:
13 | ports:
14 | - name: monitoring-port
15 | port: 8443
16 | targetPort: 8443
17 | selector:
18 | name: tf-job-operator
19 | type: ClusterIP
20 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-training/tf-job-operator/overlays/application/application.yaml:
--------------------------------------------------------------------------------
1 | apiVersion: app.k8s.io/v1beta1
2 | kind: Application
3 | metadata:
4 | name: tf-job-operator
5 | spec:
6 | selector:
7 | matchLabels:
8 | app.kubernetes.io/name: tf-job-operator
9 | app.kubernetes.io/instance: tf-job-operator-v0.7.0
10 | app.kubernetes.io/managed-by: kfctl
11 | app.kubernetes.io/component: tfjob
12 | app.kubernetes.io/part-of: kubeflow
13 | app.kubernetes.io/version: v0.7.0
14 | componentKinds:
15 | - group: core
16 | kind: Service
17 | - group: apps
18 | kind: Deployment
19 | - group: core
20 | kind: ServiceAccount
21 | - group: kubeflow.org
22 | kind: TFJob
23 | descriptor:
24 | type: "tf-job-operator"
25 | version: "v1"
26 | description: "Tf-operator allows users to create and manage the \"TFJob\" custom resource."
27 | maintainers:
28 | - name: Richard Liu
29 | email: ricliu@google.com
30 | owners:
31 | - name: Richard Liu
32 | email: ricliu@google.com
33 | keywords:
34 | - "tfjob"
35 | - "tf-operator"
36 | - "tf-training"
37 | links:
38 | - description: About
39 | url: "https://github.com/kubeflow/tf-operator"
40 | - description: Docs
41 | url: "https://www.kubeflow.org/docs/reference/tfjob/v1/tensorflow/"
42 | addOwnerRef: true
43 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Assignment/tf-training/tf-training/tf-job-operator/overlays/application/kustomization.yaml:
--------------------------------------------------------------------------------
1 | apiVersion: kustomize.config.k8s.io/v1beta1
2 | bases:
3 | - ../../base
4 | commonLabels:
5 | app.kubernetes.io/component: tfjob
6 | app.kubernetes.io/name: tf-job-operator
7 | kind: Kustomization
8 | resources:
9 | - application.yaml
10 |
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W3-Lab/model.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zhulingchen/Machine-Learning-Engineering-for-Production-MLOps-Specialization/4a936bba66fa15a60899b7b2d0d8bddb56633a3a/Machine Learning Modeling Pipelines in Production/C3W3-Lab/model.png
--------------------------------------------------------------------------------
/Machine Learning Modeling Pipelines in Production/C3W5-Lab/C3_W5_Lab_2_Permutation_Importance.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {
6 | "id": "tUUVrx70Z5KL"
7 | },
8 | "source": [
9 | "# Ungraded lab: Permutation Feature Importance\n",
10 | "------------------------\n",
11 | " \n",
12 | "Welcome, during this ungraded lab you are going to be perform Permutation Feature Importance on the [wine dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_wine.html#sklearn.datasets.load_wine) using scikit-learn. In particular you will:\n",
13 | "\n",
14 | "\n",
15 | "1. Train a Random Forest classifier on the data.\n",
16 | "2. Compute the feature importance score by permutating each feature.\n",
17 | "3. Re-train the model with only the top features.\n",
18 | "4. Check other classifiers for comparison.\n",
19 | "\n",
20 | "Let's get started!\n"
21 | ]
22 | },
23 | {
24 | "cell_type": "markdown",
25 | "metadata": {
26 | "id": "KxXreDg0tGOF"
27 | },
28 | "source": [
29 | "## Inspect and pre-process the data\n",
30 | "\n",
31 | "Begin by upgrading scikit-learn to the latest version:"
32 | ]
33 | },
34 | {
35 | "cell_type": "code",
36 | "execution_count": 1,
37 | "metadata": {
38 | "id": "UsGnA3r9DQkw"
39 | },
40 | "outputs": [],
41 | "source": [
42 | "# !pip install -U scikit-learn"
43 | ]
44 | },
45 | {
46 | "cell_type": "markdown",
47 | "metadata": {
48 | "id": "wx9r-xd5tWZO"
49 | },
50 | "source": [
51 | "Now import the required dependencies and load the dataset:"
52 | ]
53 | },
54 | {
55 | "cell_type": "code",
56 | "execution_count": 2,
57 | "metadata": {
58 | "id": "DCRACaLFC-1N"
59 | },
60 | "outputs": [
61 | {
62 | "name": "stdout",
63 | "output_type": "stream",
64 | "text": [
65 | "scikit-learn version: 0.24.2\n"
66 | ]
67 | }
68 | ],
69 | "source": [
70 | "import numpy as np\n",
71 | "import sklearn\n",
72 | "from sklearn.datasets import load_wine\n",
73 | "\n",
74 | "print('scikit-learn version:', sklearn.__version__)"
75 | ]
76 | },
77 | {
78 | "cell_type": "code",
79 | "execution_count": 3,
80 | "metadata": {},
81 | "outputs": [
82 | {
83 | "data": {
84 | "text/html": [
85 | "\n",
86 | "\n",
99 | "
\n",
100 | " \n",
101 | " \n",
102 | " | \n",
103 | " alcohol | \n",
104 | " malic_acid | \n",
105 | " ash | \n",
106 | " alcalinity_of_ash | \n",
107 | " magnesium | \n",
108 | " total_phenols | \n",
109 | " flavanoids | \n",
110 | " nonflavanoid_phenols | \n",
111 | " proanthocyanins | \n",
112 | " color_intensity | \n",
113 | " hue | \n",
114 | " od280/od315_of_diluted_wines | \n",
115 | " proline | \n",
116 | " target | \n",
117 | "
\n",
118 | " \n",
119 | " \n",
120 | " \n",
121 | " 0 | \n",
122 | " 14.23 | \n",
123 | " 1.71 | \n",
124 | " 2.43 | \n",
125 | " 15.6 | \n",
126 | " 127.0 | \n",
127 | " 2.80 | \n",
128 | " 3.06 | \n",
129 | " 0.28 | \n",
130 | " 2.29 | \n",
131 | " 5.64 | \n",
132 | " 1.04 | \n",
133 | " 3.92 | \n",
134 | " 1065.0 | \n",
135 | " 0 | \n",
136 | "
\n",
137 | " \n",
138 | " 1 | \n",
139 | " 13.20 | \n",
140 | " 1.78 | \n",
141 | " 2.14 | \n",
142 | " 11.2 | \n",
143 | " 100.0 | \n",
144 | " 2.65 | \n",
145 | " 2.76 | \n",
146 | " 0.26 | \n",
147 | " 1.28 | \n",
148 | " 4.38 | \n",
149 | " 1.05 | \n",
150 | " 3.40 | \n",
151 | " 1050.0 | \n",
152 | " 0 | \n",
153 | "
\n",
154 | " \n",
155 | " 2 | \n",
156 | " 13.16 | \n",
157 | " 2.36 | \n",
158 | " 2.67 | \n",
159 | " 18.6 | \n",
160 | " 101.0 | \n",
161 | " 2.80 | \n",
162 | " 3.24 | \n",
163 | " 0.30 | \n",
164 | " 2.81 | \n",
165 | " 5.68 | \n",
166 | " 1.03 | \n",
167 | " 3.17 | \n",
168 | " 1185.0 | \n",
169 | " 0 | \n",
170 | "
\n",
171 | " \n",
172 | " 3 | \n",
173 | " 14.37 | \n",
174 | " 1.95 | \n",
175 | " 2.50 | \n",
176 | " 16.8 | \n",
177 | " 113.0 | \n",
178 | " 3.85 | \n",
179 | " 3.49 | \n",
180 | " 0.24 | \n",
181 | " 2.18 | \n",
182 | " 7.80 | \n",
183 | " 0.86 | \n",
184 | " 3.45 | \n",
185 | " 1480.0 | \n",
186 | " 0 | \n",
187 | "
\n",
188 | " \n",
189 | " 4 | \n",
190 | " 13.24 | \n",
191 | " 2.59 | \n",
192 | " 2.87 | \n",
193 | " 21.0 | \n",
194 | " 118.0 | \n",
195 | " 2.80 | \n",
196 | " 2.69 | \n",
197 | " 0.39 | \n",
198 | " 1.82 | \n",
199 | " 4.32 | \n",
200 | " 1.04 | \n",
201 | " 2.93 | \n",
202 | " 735.0 | \n",
203 | " 0 | \n",
204 | "
\n",
205 | " \n",
206 | "
\n",
207 | "
"
208 | ],
209 | "text/plain": [
210 | " alcohol malic_acid ash alcalinity_of_ash magnesium total_phenols \\\n",
211 | "0 14.23 1.71 2.43 15.6 127.0 2.80 \n",
212 | "1 13.20 1.78 2.14 11.2 100.0 2.65 \n",
213 | "2 13.16 2.36 2.67 18.6 101.0 2.80 \n",
214 | "3 14.37 1.95 2.50 16.8 113.0 3.85 \n",
215 | "4 13.24 2.59 2.87 21.0 118.0 2.80 \n",
216 | "\n",
217 | " flavanoids nonflavanoid_phenols proanthocyanins color_intensity hue \\\n",
218 | "0 3.06 0.28 2.29 5.64 1.04 \n",
219 | "1 2.76 0.26 1.28 4.38 1.05 \n",
220 | "2 3.24 0.30 2.81 5.68 1.03 \n",
221 | "3 3.49 0.24 2.18 7.80 0.86 \n",
222 | "4 2.69 0.39 1.82 4.32 1.04 \n",
223 | "\n",
224 | " od280/od315_of_diluted_wines proline target \n",
225 | "0 3.92 1065.0 0 \n",
226 | "1 3.40 1050.0 0 \n",
227 | "2 3.17 1185.0 0 \n",
228 | "3 3.45 1480.0 0 \n",
229 | "4 2.93 735.0 0 "
230 | ]
231 | },
232 | "execution_count": 3,
233 | "metadata": {},
234 | "output_type": "execute_result"
235 | }
236 | ],
237 | "source": [
238 | "# as_frame param requires scikit-learn >= 0.23\n",
239 | "data = load_wine(as_frame=True)\n",
240 | "\n",
241 | "# Print first rows of the data\n",
242 | "data.frame.head()"
243 | ]
244 | },
245 | {
246 | "cell_type": "code",
247 | "execution_count": 4,
248 | "metadata": {},
249 | "outputs": [
250 | {
251 | "name": "stdout",
252 | "output_type": "stream",
253 | "text": [
254 | "Number of features: 13\n"
255 | ]
256 | }
257 | ],
258 | "source": [
259 | "print('Number of features:', len(data.feature_names))"
260 | ]
261 | },
262 | {
263 | "cell_type": "markdown",
264 | "metadata": {
265 | "id": "l8opVcVeuFLn"
266 | },
267 | "source": [
268 | "This dataset is made up of 13 numerical features and there are 3 different classes of wine.\n",
269 | "\n",
270 | "Now perform the train/test split and normalize the data using [`StandardScaler`](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html):"
271 | ]
272 | },
273 | {
274 | "cell_type": "code",
275 | "execution_count": 5,
276 | "metadata": {
277 | "id": "Aaszrn9CEsIf"
278 | },
279 | "outputs": [],
280 | "source": [
281 | "from sklearn.model_selection import train_test_split\n",
282 | "from sklearn.preprocessing import StandardScaler\n",
283 | "\n",
284 | "# Train / Test split\n",
285 | "X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, random_state=42)\n",
286 | "\n",
287 | "# Instantiate StandardScaler\n",
288 | "scaler = StandardScaler()\n",
289 | "\n",
290 | "# Fit it to the train data\n",
291 | "scaler.fit(X_train)\n",
292 | "\n",
293 | "# Use it to transform the train and test data\n",
294 | "X_train = scaler.transform(X_train)\n",
295 | "\n",
296 | "# Notice that the scaler is trained on the train data to avoid data leakage from the test set\n",
297 | "X_test = scaler.transform(X_test)"
298 | ]
299 | },
300 | {
301 | "cell_type": "markdown",
302 | "metadata": {
303 | "id": "BVmae4rtvGiA"
304 | },
305 | "source": [
306 | "## Train the classifier\n",
307 | "\n",
308 | "Now you will fit a [Random Forest classifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html) with 10 estimators and compute the mean accuracy achieved:"
309 | ]
310 | },
311 | {
312 | "cell_type": "code",
313 | "execution_count": 6,
314 | "metadata": {
315 | "id": "NK5Dxa70Ir3N"
316 | },
317 | "outputs": [
318 | {
319 | "data": {
320 | "text/plain": [
321 | "0.9111111111111111"
322 | ]
323 | },
324 | "execution_count": 6,
325 | "metadata": {},
326 | "output_type": "execute_result"
327 | }
328 | ],
329 | "source": [
330 | "from sklearn.ensemble import RandomForestClassifier\n",
331 | "\n",
332 | "# Fit the classifier\n",
333 | "rf_clf = RandomForestClassifier(n_estimators=10, random_state=42).fit(X_train, y_train)\n",
334 | "\n",
335 | "# Print the mean accuracy achieved by the classifier on the test set\n",
336 | "rf_clf.score(X_test, y_test)"
337 | ]
338 | },
339 | {
340 | "cell_type": "markdown",
341 | "metadata": {
342 | "id": "-exiHBeJwVGh"
343 | },
344 | "source": [
345 | "This model achieved a mean accuracy of 91%. Pretty good for a model without fine tunning."
346 | ]
347 | },
348 | {
349 | "cell_type": "markdown",
350 | "metadata": {
351 | "id": "O_RPyB9Owfms"
352 | },
353 | "source": [
354 | "# Permutation Feature Importance\n",
355 | "\n",
356 | "To perform the model inspection technique known as Permutation Feature Importance you will use scikit-learn's built-in function [`permutation_importance`](https://scikit-learn.org/stable/modules/generated/sklearn.inspection.permutation_importance.html#sklearn.inspection.permutation_importance).\n",
357 | "\n",
358 | "You will create a function that given a classifier, features and labels computes the feature importance for every feature:"
359 | ]
360 | },
361 | {
362 | "cell_type": "code",
363 | "execution_count": 7,
364 | "metadata": {},
365 | "outputs": [],
366 | "source": [
367 | "from sklearn.inspection import permutation_importance\n",
368 | "bunch = permutation_importance(rf_clf, X_train, y_train, n_repeats=50, random_state=42)"
369 | ]
370 | },
371 | {
372 | "cell_type": "code",
373 | "execution_count": 8,
374 | "metadata": {},
375 | "outputs": [
376 | {
377 | "data": {
378 | "text/plain": [
379 | "{'importances_mean': array([0. , 0.00240602, 0. , 0. , 0. ,\n",
380 | " 0.00315789, 0.22676692, 0. , 0.00180451, 0.11233083,\n",
381 | " 0.00150376, 0.00661654, 0.14165414]),\n",
382 | " 'importances_std': array([0. , 0.00350734, 0. , 0. , 0. ,\n",
383 | " 0.00371097, 0.02481704, 0. , 0.00321115, 0.02287544,\n",
384 | " 0.00300752, 0.00490511, 0.01916099]),\n",
385 | " 'importances': array([[0. , 0. , 0. , 0. , 0. ,\n",
386 | " 0. , 0. , 0. , 0. , 0. ,\n",
387 | " 0. , 0. , 0. , 0. , 0. ,\n",
388 | " 0. , 0. , 0. , 0. , 0. ,\n",
389 | " 0. , 0. , 0. , 0. , 0. ,\n",
390 | " 0. , 0. , 0. , 0. , 0. ,\n",
391 | " 0. , 0. , 0. , 0. , 0. ,\n",
392 | " 0. , 0. , 0. , 0. , 0. ,\n",
393 | " 0. , 0. , 0. , 0. , 0. ,\n",
394 | " 0. , 0. , 0. , 0. , 0. ],\n",
395 | " [0.0075188 , 0.0075188 , 0.0075188 , 0. , 0. ,\n",
396 | " 0. , 0. , 0. , 0. , 0. ,\n",
397 | " 0. , 0.0075188 , 0.0075188 , 0. , 0.0075188 ,\n",
398 | " 0. , 0. , 0.0075188 , 0. , 0.0075188 ,\n",
399 | " 0. , 0. , 0. , 0. , 0.0075188 ,\n",
400 | " 0. , 0. , 0. , 0.0075188 , 0. ,\n",
401 | " 0.0075188 , 0. , 0.0075188 , 0. , 0.0075188 ,\n",
402 | " 0.0075188 , 0. , 0. , 0. , 0. ,\n",
403 | " 0. , 0. , 0. , 0. , 0. ,\n",
404 | " 0. , 0. , 0.0075188 , 0.0075188 , 0. ],\n",
405 | " [0. , 0. , 0. , 0. , 0. ,\n",
406 | " 0. , 0. , 0. , 0. , 0. ,\n",
407 | " 0. , 0. , 0. , 0. , 0. ,\n",
408 | " 0. , 0. , 0. , 0. , 0. ,\n",
409 | " 0. , 0. , 0. , 0. , 0. ,\n",
410 | " 0. , 0. , 0. , 0. , 0. ,\n",
411 | " 0. , 0. , 0. , 0. , 0. ,\n",
412 | " 0. , 0. , 0. , 0. , 0. ,\n",
413 | " 0. , 0. , 0. , 0. , 0. ,\n",
414 | " 0. , 0. , 0. , 0. , 0. ],\n",
415 | " [0. , 0. , 0. , 0. , 0. ,\n",
416 | " 0. , 0. , 0. , 0. , 0. ,\n",
417 | " 0. , 0. , 0. , 0. , 0. ,\n",
418 | " 0. , 0. , 0. , 0. , 0. ,\n",
419 | " 0. , 0. , 0. , 0. , 0. ,\n",
420 | " 0. , 0. , 0. , 0. , 0. ,\n",
421 | " 0. , 0. , 0. , 0. , 0. ,\n",
422 | " 0. , 0. , 0. , 0. , 0. ,\n",
423 | " 0. , 0. , 0. , 0. , 0. ,\n",
424 | " 0. , 0. , 0. , 0. , 0. ],\n",
425 | " [0. , 0. , 0. , 0. , 0. ,\n",
426 | " 0. , 0. , 0. , 0. , 0. ,\n",
427 | " 0. , 0. , 0. , 0. , 0. ,\n",
428 | " 0. , 0. , 0. , 0. , 0. ,\n",
429 | " 0. , 0. , 0. , 0. , 0. ,\n",
430 | " 0. , 0. , 0. , 0. , 0. ,\n",
431 | " 0. , 0. , 0. , 0. , 0. ,\n",
432 | " 0. , 0. , 0. , 0. , 0. ,\n",
433 | " 0. , 0. , 0. , 0. , 0. ,\n",
434 | " 0. , 0. , 0. , 0. , 0. ],\n",
435 | " [0.0075188 , 0. , 0. , 0. , 0. ,\n",
436 | " 0.0075188 , 0. , 0.0075188 , 0.0075188 , 0. ,\n",
437 | " 0. , 0.0075188 , 0. , 0.0075188 , 0. ,\n",
438 | " 0.0075188 , 0. , 0. , 0.0075188 , 0.0075188 ,\n",
439 | " 0.0075188 , 0. , 0.0075188 , 0.0075188 , 0. ,\n",
440 | " 0.0075188 , 0. , 0. , 0. , 0. ,\n",
441 | " 0. , 0.0075188 , 0.0075188 , 0. , 0. ,\n",
442 | " 0.0075188 , 0. , 0. , 0. , 0.0075188 ,\n",
443 | " 0.0075188 , 0. , 0. , 0. , 0. ,\n",
444 | " 0.0075188 , 0. , 0. , 0.0075188 , 0.0075188 ],\n",
445 | " [0.22556391, 0.21052632, 0.26315789, 0.2406015 , 0.23308271,\n",
446 | " 0.2556391 , 0.23308271, 0.22556391, 0.27067669, 0.22556391,\n",
447 | " 0.21804511, 0.2406015 , 0.20300752, 0.2481203 , 0.21804511,\n",
448 | " 0.21052632, 0.2481203 , 0.21052632, 0.18796992, 0.19548872,\n",
449 | " 0.20300752, 0.18796992, 0.23308271, 0.20300752, 0.26315789,\n",
450 | " 0.2481203 , 0.22556391, 0.22556391, 0.22556391, 0.19548872,\n",
451 | " 0.21052632, 0.21804511, 0.21804511, 0.23308271, 0.19548872,\n",
452 | " 0.21052632, 0.29323308, 0.2406015 , 0.2481203 , 0.2406015 ,\n",
453 | " 0.23308271, 0.21052632, 0.2481203 , 0.21052632, 0.22556391,\n",
454 | " 0.20300752, 0.23308271, 0.29323308, 0.17293233, 0.22556391],\n",
455 | " [0. , 0. , 0. , 0. , 0. ,\n",
456 | " 0. , 0. , 0. , 0. , 0. ,\n",
457 | " 0. , 0. , 0. , 0. , 0. ,\n",
458 | " 0. , 0. , 0. , 0. , 0. ,\n",
459 | " 0. , 0. , 0. , 0. , 0. ,\n",
460 | " 0. , 0. , 0. , 0. , 0. ,\n",
461 | " 0. , 0. , 0. , 0. , 0. ,\n",
462 | " 0. , 0. , 0. , 0. , 0. ,\n",
463 | " 0. , 0. , 0. , 0. , 0. ,\n",
464 | " 0. , 0. , 0. , 0. , 0. ],\n",
465 | " [0. , 0. , 0. , 0. , 0.0075188 ,\n",
466 | " 0. , 0.0075188 , 0. , 0. , 0.0075188 ,\n",
467 | " 0. , 0. , 0. , 0. , 0.0075188 ,\n",
468 | " 0. , 0.0075188 , 0. , 0. , 0. ,\n",
469 | " 0. , 0.0075188 , 0. , 0. , 0. ,\n",
470 | " 0. , 0. , 0. , 0.0075188 , 0. ,\n",
471 | " 0. , 0. , 0. , 0.0075188 , 0.0075188 ,\n",
472 | " 0. , 0. , 0. , 0.0075188 , 0. ,\n",
473 | " 0. , 0. , 0.0075188 , 0. , 0. ,\n",
474 | " 0. , 0. , 0.0075188 , 0. , 0. ],\n",
475 | " [0.09774436, 0.11278195, 0.12030075, 0.14285714, 0.10526316,\n",
476 | " 0.12030075, 0.09022556, 0.10526316, 0.12781955, 0.15037594,\n",
477 | " 0.13533835, 0.16541353, 0.09022556, 0.11278195, 0.11278195,\n",
478 | " 0.08270677, 0.12781955, 0.09022556, 0.12030075, 0.13533835,\n",
479 | " 0.10526316, 0.12030075, 0.17293233, 0.07518797, 0.12030075,\n",
480 | " 0.07518797, 0.10526316, 0.09022556, 0.11278195, 0.09022556,\n",
481 | " 0.12030075, 0.07518797, 0.10526316, 0.09774436, 0.10526316,\n",
482 | " 0.10526316, 0.12781955, 0.13533835, 0.10526316, 0.13533835,\n",
483 | " 0.06015038, 0.09022556, 0.09022556, 0.12781955, 0.15037594,\n",
484 | " 0.09774436, 0.12030075, 0.12030075, 0.11278195, 0.12030075],\n",
485 | " [0. , 0. , 0. , 0.0075188 , 0. ,\n",
486 | " 0. , 0. , 0. , 0.0075188 , 0.0075188 ,\n",
487 | " 0. , 0. , 0. , 0.0075188 , 0. ,\n",
488 | " 0. , 0. , 0.0075188 , 0. , 0. ,\n",
489 | " 0. , 0. , 0. , 0. , 0. ,\n",
490 | " 0. , 0. , 0. , 0. , 0.0075188 ,\n",
491 | " 0. , 0. , 0. , 0. , 0. ,\n",
492 | " 0. , 0. , 0.0075188 , 0. , 0. ,\n",
493 | " 0. , 0. , 0. , 0.0075188 , 0. ,\n",
494 | " 0.0075188 , 0.0075188 , 0. , 0. , 0. ],\n",
495 | " [0. , 0.0075188 , 0.01503759, 0.0075188 , 0. ,\n",
496 | " 0.0075188 , 0.0075188 , 0.01503759, 0.0075188 , 0. ,\n",
497 | " 0.01503759, 0.0075188 , 0.0075188 , 0. , 0. ,\n",
498 | " 0.0075188 , 0.0075188 , 0.0075188 , 0.0075188 , 0.0075188 ,\n",
499 | " 0.0075188 , 0.0075188 , 0.0075188 , 0. , 0. ,\n",
500 | " 0.0075188 , 0. , 0. , 0.0075188 , 0.01503759,\n",
501 | " 0. , 0.0075188 , 0.0075188 , 0.0075188 , 0.0075188 ,\n",
502 | " 0.0075188 , 0.0075188 , 0.01503759, 0. , 0.0075188 ,\n",
503 | " 0.01503759, 0. , 0. , 0.0075188 , 0. ,\n",
504 | " 0.0075188 , 0.0075188 , 0.01503759, 0.01503759, 0.0075188 ],\n",
505 | " [0.15037594, 0.14285714, 0.14285714, 0.15789474, 0.13533835,\n",
506 | " 0.12781955, 0.12781955, 0.13533835, 0.13533835, 0.17293233,\n",
507 | " 0.13533835, 0.17293233, 0.15037594, 0.15789474, 0.12781955,\n",
508 | " 0.12030075, 0.12781955, 0.14285714, 0.14285714, 0.16541353,\n",
509 | " 0.15789474, 0.15789474, 0.18045113, 0.16541353, 0.15789474,\n",
510 | " 0.10526316, 0.12781955, 0.13533835, 0.16541353, 0.16541353,\n",
511 | " 0.14285714, 0.12781955, 0.09774436, 0.10526316, 0.15789474,\n",
512 | " 0.13533835, 0.15037594, 0.13533835, 0.11278195, 0.14285714,\n",
513 | " 0.13533835, 0.12030075, 0.10526316, 0.16541353, 0.16541353,\n",
514 | " 0.12781955, 0.12781955, 0.15037594, 0.13533835, 0.15037594]])}"
515 | ]
516 | },
517 | "execution_count": 8,
518 | "metadata": {},
519 | "output_type": "execute_result"
520 | }
521 | ],
522 | "source": [
523 | "bunch"
524 | ]
525 | },
526 | {
527 | "cell_type": "code",
528 | "execution_count": 9,
529 | "metadata": {},
530 | "outputs": [
531 | {
532 | "data": {
533 | "text/plain": [
534 | "{'importances_mean': (13,), 'importances_std': (13,), 'importances': (13, 50)}"
535 | ]
536 | },
537 | "execution_count": 9,
538 | "metadata": {},
539 | "output_type": "execute_result"
540 | }
541 | ],
542 | "source": [
543 | "{k: v.shape for k, v in bunch.items()}"
544 | ]
545 | },
546 | {
547 | "cell_type": "code",
548 | "execution_count": 10,
549 | "metadata": {
550 | "id": "nAvDl_2rJsTA"
551 | },
552 | "outputs": [],
553 | "source": [
554 | "def feature_importance(clf, X, y, top_limit=None):\n",
555 | "\n",
556 | " # Retrieve the Bunch object after 50 repeats\n",
557 | " # n_repeats is the number of times that each feature was permuted to compute the final score\n",
558 | " bunch = permutation_importance(clf, X, y,\n",
559 | " n_repeats=50, random_state=42)\n",
560 | "\n",
561 | " # Average feature importance\n",
562 | " imp_means = bunch.importances_mean\n",
563 | "\n",
564 | " # List that contains the index of each feature in descending order of importance\n",
565 | " ordered_imp_means_args = np.argsort(imp_means)[::-1]\n",
566 | "\n",
567 | " # If no limit print all features\n",
568 | " if top_limit is None:\n",
569 | " top_limit = len(ordered_imp_means_args)\n",
570 | "\n",
571 | " # Print relevant information\n",
572 | " for i, _ in zip(ordered_imp_means_args, range(top_limit)):\n",
573 | " name = data.feature_names[i]\n",
574 | " imp_score = imp_means[i]\n",
575 | " imp_std = bunch.importances_std[i]\n",
576 | " print(f\"Feature '{name}' with index {i} has an average importance score of {imp_score:.3f} +/- {imp_std:.3f}\\n\")"
577 | ]
578 | },
579 | {
580 | "cell_type": "markdown",
581 | "metadata": {
582 | "id": "dCHWbK_3yeJ_"
583 | },
584 | "source": [
585 | "The importance score is computed in a way that higher values represent better predictive power. To know exactly how it is computed check out this [link](https://scikit-learn.org/stable/modules/permutation_importance.html#outline-of-the-permutation-importance-algorithm).\n",
586 | "\n",
587 | "Now use the `feature_importance` function on the Random Forest classifier and the train set:"
588 | ]
589 | },
590 | {
591 | "cell_type": "code",
592 | "execution_count": 11,
593 | "metadata": {
594 | "id": "wuB6EyTuHT7S"
595 | },
596 | "outputs": [
597 | {
598 | "name": "stdout",
599 | "output_type": "stream",
600 | "text": [
601 | "Feature 'flavanoids' with index 6 has an average importance score of 0.227 +/- 0.025\n",
602 | "\n",
603 | "Feature 'proline' with index 12 has an average importance score of 0.142 +/- 0.019\n",
604 | "\n",
605 | "Feature 'color_intensity' with index 9 has an average importance score of 0.112 +/- 0.023\n",
606 | "\n",
607 | "Feature 'od280/od315_of_diluted_wines' with index 11 has an average importance score of 0.007 +/- 0.005\n",
608 | "\n",
609 | "Feature 'total_phenols' with index 5 has an average importance score of 0.003 +/- 0.004\n",
610 | "\n",
611 | "Feature 'malic_acid' with index 1 has an average importance score of 0.002 +/- 0.004\n",
612 | "\n",
613 | "Feature 'proanthocyanins' with index 8 has an average importance score of 0.002 +/- 0.003\n",
614 | "\n",
615 | "Feature 'hue' with index 10 has an average importance score of 0.002 +/- 0.003\n",
616 | "\n",
617 | "Feature 'nonflavanoid_phenols' with index 7 has an average importance score of 0.000 +/- 0.000\n",
618 | "\n",
619 | "Feature 'magnesium' with index 4 has an average importance score of 0.000 +/- 0.000\n",
620 | "\n",
621 | "Feature 'alcalinity_of_ash' with index 3 has an average importance score of 0.000 +/- 0.000\n",
622 | "\n",
623 | "Feature 'ash' with index 2 has an average importance score of 0.000 +/- 0.000\n",
624 | "\n",
625 | "Feature 'alcohol' with index 0 has an average importance score of 0.000 +/- 0.000\n",
626 | "\n"
627 | ]
628 | }
629 | ],
630 | "source": [
631 | "feature_importance(rf_clf, X_train, y_train)"
632 | ]
633 | },
634 | {
635 | "cell_type": "markdown",
636 | "metadata": {
637 | "id": "CiYP--tl5623"
638 | },
639 | "source": [
640 | "Looks like many of the features have a fairly low importance score. This points that the predictive power of this dataset is conmdensed in a few features.\n",
641 | "\n",
642 | "However it is important to notice that this process was done for the training set, so this feature importance does NOT have into account if the feature might help with the generalization power of the model.\n",
643 | "\n",
644 | "To check this, repeat the process for the test set:"
645 | ]
646 | },
647 | {
648 | "cell_type": "code",
649 | "execution_count": 12,
650 | "metadata": {
651 | "id": "iDjYLTDBzfXT"
652 | },
653 | "outputs": [
654 | {
655 | "name": "stdout",
656 | "output_type": "stream",
657 | "text": [
658 | "Feature 'flavanoids' with index 6 has an average importance score of 0.202 +/- 0.047\n",
659 | "\n",
660 | "Feature 'proline' with index 12 has an average importance score of 0.143 +/- 0.042\n",
661 | "\n",
662 | "Feature 'color_intensity' with index 9 has an average importance score of 0.112 +/- 0.043\n",
663 | "\n",
664 | "Feature 'alcohol' with index 0 has an average importance score of 0.024 +/- 0.017\n",
665 | "\n",
666 | "Feature 'magnesium' with index 4 has an average importance score of 0.021 +/- 0.015\n",
667 | "\n",
668 | "Feature 'od280/od315_of_diluted_wines' with index 11 has an average importance score of 0.015 +/- 0.018\n",
669 | "\n",
670 | "Feature 'hue' with index 10 has an average importance score of 0.013 +/- 0.018\n",
671 | "\n",
672 | "Feature 'total_phenols' with index 5 has an average importance score of 0.002 +/- 0.016\n",
673 | "\n",
674 | "Feature 'nonflavanoid_phenols' with index 7 has an average importance score of 0.000 +/- 0.000\n",
675 | "\n",
676 | "Feature 'alcalinity_of_ash' with index 3 has an average importance score of 0.000 +/- 0.000\n",
677 | "\n",
678 | "Feature 'malic_acid' with index 1 has an average importance score of -0.002 +/- 0.017\n",
679 | "\n",
680 | "Feature 'ash' with index 2 has an average importance score of -0.003 +/- 0.008\n",
681 | "\n",
682 | "Feature 'proanthocyanins' with index 8 has an average importance score of -0.021 +/- 0.020\n",
683 | "\n"
684 | ]
685 | }
686 | ],
687 | "source": [
688 | "feature_importance(rf_clf, X_test, y_test)"
689 | ]
690 | },
691 | {
692 | "cell_type": "markdown",
693 | "metadata": {
694 | "id": "aclWL_oJ7h3V"
695 | },
696 | "source": [
697 | "Notice that the top most important features are the same for both sets. However features such as **alcohol**, which was considered not important for the training set is much more important when using the testing set. This hints that this feature will contribute to the generalization power of the model.\n",
698 | "\n",
699 | "**If a feature is deemed as important for the train set but not for the testing, this feature will probably cause the model to overfit.**"
700 | ]
701 | },
702 | {
703 | "cell_type": "markdown",
704 | "metadata": {
705 | "id": "DpHC8q3y8byl"
706 | },
707 | "source": [
708 | "## Re-train the model with the most important features\n",
709 | "\n",
710 | "Now you will re-train the Random Forest classifier with only the top 3 most important features. \n",
711 | "\n",
712 | "In this case they are the same for both sets:"
713 | ]
714 | },
715 | {
716 | "cell_type": "code",
717 | "execution_count": 13,
718 | "metadata": {
719 | "id": "daZkt0PE1oxc"
720 | },
721 | "outputs": [
722 | {
723 | "name": "stdout",
724 | "output_type": "stream",
725 | "text": [
726 | "On TRAIN split:\n",
727 | "\n",
728 | "Feature 'flavanoids' with index 6 has an average importance score of 0.227 +/- 0.025\n",
729 | "\n",
730 | "Feature 'proline' with index 12 has an average importance score of 0.142 +/- 0.019\n",
731 | "\n",
732 | "Feature 'color_intensity' with index 9 has an average importance score of 0.112 +/- 0.023\n",
733 | "\n",
734 | "\n",
735 | "On TEST split:\n",
736 | "\n",
737 | "Feature 'flavanoids' with index 6 has an average importance score of 0.202 +/- 0.047\n",
738 | "\n",
739 | "Feature 'proline' with index 12 has an average importance score of 0.143 +/- 0.042\n",
740 | "\n",
741 | "Feature 'color_intensity' with index 9 has an average importance score of 0.112 +/- 0.043\n",
742 | "\n"
743 | ]
744 | }
745 | ],
746 | "source": [
747 | "print(\"On TRAIN split:\\n\")\n",
748 | "feature_importance(rf_clf, X_train, y_train, top_limit=3)\n",
749 | "\n",
750 | "print(\"\\nOn TEST split:\\n\")\n",
751 | "feature_importance(rf_clf, X_test, y_test, top_limit=3)"
752 | ]
753 | },
754 | {
755 | "cell_type": "code",
756 | "execution_count": 14,
757 | "metadata": {
758 | "id": "VOZyil7eqH5-"
759 | },
760 | "outputs": [
761 | {
762 | "data": {
763 | "text/plain": [
764 | "0.9333333333333333"
765 | ]
766 | },
767 | "execution_count": 14,
768 | "metadata": {},
769 | "output_type": "execute_result"
770 | }
771 | ],
772 | "source": [
773 | "# Preserve only the top 3 features\n",
774 | "X_train_top_features = X_train[:, [6, 9, 12]]\n",
775 | "X_test_top_features = X_test[:, [6, 9, 12]]\n",
776 | "\n",
777 | "# Re-train with only these features\n",
778 | "rf_clf_top = RandomForestClassifier(n_estimators=10, random_state=42).fit(X_train_top_features, y_train)\n",
779 | "\n",
780 | "# Compute mean accuracy achieved\n",
781 | "rf_clf_top.score(X_test_top_features, y_test)"
782 | ]
783 | },
784 | {
785 | "cell_type": "markdown",
786 | "metadata": {
787 | "id": "5CeyUCTi8zIj"
788 | },
789 | "source": [
790 | "Notice that by using only the 3 most important features the model achieved a mean accuracy even higher than the one using all 13 features. \n",
791 | "\n",
792 | "\n",
793 | "Remember that the **alcohol** feature was deemed not important in the train split but you had the hypotheses that it had important information for the generalization of the model. \n",
794 | "\n",
795 | "Add this feature and see how the model performs:"
796 | ]
797 | },
798 | {
799 | "cell_type": "code",
800 | "execution_count": 15,
801 | "metadata": {
802 | "id": "b72FJLpj9Aly"
803 | },
804 | "outputs": [
805 | {
806 | "data": {
807 | "text/plain": [
808 | "1.0"
809 | ]
810 | },
811 | "execution_count": 15,
812 | "metadata": {},
813 | "output_type": "execute_result"
814 | }
815 | ],
816 | "source": [
817 | "# Preserve only the top 4 features\n",
818 | "X_train_top_features = X_train[:,[0, 6, 9, 12]]\n",
819 | "X_test_top_features = X_test[:,[0, 6, 9, 12]]\n",
820 | "\n",
821 | "# Re-train with only these features\n",
822 | "rf_clf_top = RandomForestClassifier(n_estimators=10, random_state=42).fit(X_train_top_features, y_train)\n",
823 | "\n",
824 | "# Compute mean accuracy achieved\n",
825 | "rf_clf_top.score(X_test_top_features, y_test)"
826 | ]
827 | },
828 | {
829 | "cell_type": "markdown",
830 | "metadata": {
831 | "id": "L4imw5IU9nGx"
832 | },
833 | "source": [
834 | "Wow! By adding this additional feature you know get a mean accuracy of 100%! Quite remarkable! Looks like this feature did in fact provided some important information that helped the model do a better job at generalizing."
835 | ]
836 | },
837 | {
838 | "cell_type": "markdown",
839 | "metadata": {},
840 | "source": [
841 | "If we use the 3 least important features:"
842 | ]
843 | },
844 | {
845 | "cell_type": "code",
846 | "execution_count": 16,
847 | "metadata": {},
848 | "outputs": [
849 | {
850 | "data": {
851 | "text/plain": [
852 | "0.6444444444444445"
853 | ]
854 | },
855 | "execution_count": 16,
856 | "metadata": {},
857 | "output_type": "execute_result"
858 | }
859 | ],
860 | "source": [
861 | "# Preserve only the bottom 3 features\n",
862 | "X_train_bottom_features = X_train[:, [1, 2, 8]]\n",
863 | "X_test_bottom_features = X_test[:, [1, 2, 8]]\n",
864 | "\n",
865 | "# Re-train with only these features\n",
866 | "rf_clf_top = RandomForestClassifier(n_estimators=10, random_state=42).fit(X_train_bottom_features, y_train)\n",
867 | "\n",
868 | "# Compute mean accuracy achieved\n",
869 | "rf_clf_top.score(X_test_bottom_features, y_test)"
870 | ]
871 | },
872 | {
873 | "cell_type": "markdown",
874 | "metadata": {},
875 | "source": [
876 | "Then the mean accuracy reduces to 64.44%."
877 | ]
878 | },
879 | {
880 | "cell_type": "markdown",
881 | "metadata": {
882 | "id": "pGqKd5c7-Jaw"
883 | },
884 | "source": [
885 | "## Try out other classifiers\n",
886 | "\n",
887 | "The process of Permutation Feature Importance is also dependant on the classifier you are using. Since different classifiers follow different rules for classification it is natural to assume they will consider different features to be important or unimportant.\n",
888 | "\n",
889 | "To test this, try out other classifiers:"
890 | ]
891 | },
892 | {
893 | "cell_type": "code",
894 | "execution_count": 17,
895 | "metadata": {
896 | "id": "Hv6oXNMUrzmR"
897 | },
898 | "outputs": [],
899 | "source": [
900 | "from sklearn.svm import SVC\n",
901 | "from sklearn.linear_model import Lasso, Ridge\n",
902 | "from sklearn.tree import DecisionTreeClassifier\n",
903 | "\n",
904 | "# Compute feature importance on the test set given a classifier\n",
905 | "def fit_compute_importance(clf):\n",
906 | " clf.fit(X_train, y_train)\n",
907 | " print(f\"📏 Mean accuracy score on the test set: {clf.score(X_test, y_test)*100:.2f}%\\n\")\n",
908 | " print(\"🔝 Top 4 features when using the test set:\\n\")\n",
909 | " feature_importance(clf, X_test, y_test, top_limit=4)"
910 | ]
911 | },
912 | {
913 | "cell_type": "code",
914 | "execution_count": 18,
915 | "metadata": {},
916 | "outputs": [
917 | {
918 | "name": "stdout",
919 | "output_type": "stream",
920 | "text": [
921 | "====================================================================================================\n",
922 | "\u001b[1m➡️ Laso classifier\n",
923 | "\u001b[0m\n",
924 | "📏 Mean accuracy score on the test set: 86.80%\n",
925 | "\n",
926 | "🔝 Top 4 features when using the test set:\n",
927 | "\n",
928 | "Feature 'flavanoids' with index 6 has an average importance score of 0.323 +/- 0.055\n",
929 | "\n",
930 | "Feature 'proline' with index 12 has an average importance score of 0.203 +/- 0.035\n",
931 | "\n",
932 | "Feature 'od280/od315_of_diluted_wines' with index 11 has an average importance score of 0.146 +/- 0.030\n",
933 | "\n",
934 | "Feature 'alcalinity_of_ash' with index 3 has an average importance score of 0.038 +/- 0.014\n",
935 | "\n",
936 | "====================================================================================================\n",
937 | "\u001b[1m➡️ Ridge classifier\n",
938 | "\u001b[0m\n",
939 | "📏 Mean accuracy score on the test set: 88.71%\n",
940 | "\n",
941 | "🔝 Top 4 features when using the test set:\n",
942 | "\n",
943 | "Feature 'flavanoids' with index 6 has an average importance score of 0.445 +/- 0.071\n",
944 | "\n",
945 | "Feature 'proline' with index 12 has an average importance score of 0.210 +/- 0.035\n",
946 | "\n",
947 | "Feature 'color_intensity' with index 9 has an average importance score of 0.119 +/- 0.029\n",
948 | "\n",
949 | "Feature 'od280/od315_of_diluted_wines' with index 11 has an average importance score of 0.111 +/- 0.026\n",
950 | "\n",
951 | "====================================================================================================\n",
952 | "\u001b[1m➡️ Decision Tree classifier\n",
953 | "\u001b[0m\n",
954 | "📏 Mean accuracy score on the test set: 93.33%\n",
955 | "\n",
956 | "🔝 Top 4 features when using the test set:\n",
957 | "\n",
958 | "Feature 'flavanoids' with index 6 has an average importance score of 0.297 +/- 0.061\n",
959 | "\n",
960 | "Feature 'color_intensity' with index 9 has an average importance score of 0.206 +/- 0.046\n",
961 | "\n",
962 | "Feature 'proline' with index 12 has an average importance score of 0.198 +/- 0.038\n",
963 | "\n",
964 | "Feature 'malic_acid' with index 1 has an average importance score of 0.008 +/- 0.012\n",
965 | "\n",
966 | "====================================================================================================\n",
967 | "\u001b[1m➡️ Support Vector classifier\n",
968 | "\u001b[0m\n",
969 | "📏 Mean accuracy score on the test set: 97.78%\n",
970 | "\n",
971 | "🔝 Top 4 features when using the test set:\n",
972 | "\n",
973 | "Feature 'proline' with index 12 has an average importance score of 0.069 +/- 0.031\n",
974 | "\n",
975 | "Feature 'flavanoids' with index 6 has an average importance score of 0.061 +/- 0.023\n",
976 | "\n",
977 | "Feature 'alcohol' with index 0 has an average importance score of 0.044 +/- 0.023\n",
978 | "\n",
979 | "Feature 'ash' with index 2 has an average importance score of 0.032 +/- 0.018\n",
980 | "\n"
981 | ]
982 | }
983 | ],
984 | "source": [
985 | "# Select 4 new classifiers\n",
986 | "clfs = {\"Laso\": Lasso(alpha=0.05), \n",
987 | " \"Ridge\": Ridge(), \n",
988 | " \"Decision Tree\": DecisionTreeClassifier(), \n",
989 | " \"Support Vector\": SVC()}\n",
990 | "\n",
991 | "# Print results\n",
992 | "for name, clf in clfs.items():\n",
993 | " print('=' * 100)\n",
994 | " print('\\033[1m' + f\"➡️ {name} classifier\\n\" + '\\033[0m')\n",
995 | " fit_compute_importance(clf)"
996 | ]
997 | },
998 | {
999 | "cell_type": "markdown",
1000 | "metadata": {
1001 | "id": "0vLoU0xjDsqg"
1002 | },
1003 | "source": [
1004 | "Looks like **flavanoids** and **proline** are very important across all classifiers. However there is variability from one classifier to the others on what features are considered the most important ones."
1005 | ]
1006 | },
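{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a self-contained recap, the same kind of analysis can be run directly with `sklearn.inspection.permutation_importance`. The snippet below is a minimal sketch rather than part of the graded flow: it reloads the wine dataset and makes its own train/test split, so the split parameters are illustrative and may differ from the ones used earlier in this lab:\n",
"\n",
"```python\n",
"from sklearn.datasets import load_wine\n",
"from sklearn.ensemble import RandomForestClassifier\n",
"from sklearn.inspection import permutation_importance\n",
"from sklearn.model_selection import train_test_split\n",
"\n",
"# Load the wine dataset and create an illustrative train/test split\n",
"data = load_wine()\n",
"X_tr, X_te, y_tr, y_te = train_test_split(\n",
"    data.data, data.target, test_size=0.25, random_state=42)\n",
"\n",
"# Fit an estimator and measure the score drop when each feature is permuted on the test set\n",
"clf = RandomForestClassifier(n_estimators=10, random_state=42).fit(X_tr, y_tr)\n",
"result = permutation_importance(clf, X_te, y_te, n_repeats=10, random_state=42)\n",
"\n",
"# List the features sorted by mean importance\n",
"for idx in result.importances_mean.argsort()[::-1]:\n",
"    mean, std = result.importances_mean[idx], result.importances_std[idx]\n",
"    print(f\"{data.feature_names[idx]}: {mean:.3f} +/- {std:.3f}\")\n",
"```"
]
},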
1007 | {
1008 | "cell_type": "markdown",
1009 | "metadata": {
1010 | "id": "cwRlO14nZ_nY"
1011 | },
1012 | "source": [
1013 | "-----------------------------\n",
1014 | "**Congratulations on finishing this ungraded lab!** Now you should have a clearer understanding of what Permutation Feature Importance is, why it is useful and how to implement this technique using scikit-learn. \n",
1015 | "\n",
1016 | "**Keep it up!**"
1017 | ]
1018 | }
1019 | ],
1020 | "metadata": {
1021 | "colab": {
1022 | "collapsed_sections": [],
1023 | "name": "C3_W5_Lab_2_Permutation_Importance.ipynb",
1024 | "provenance": []
1025 | },
1026 | "kernelspec": {
1027 | "display_name": "Python 3",
1028 | "language": "python",
1029 | "name": "python3"
1030 | },
1031 | "language_info": {
1032 | "codemirror_mode": {
1033 | "name": "ipython",
1034 | "version": 3
1035 | },
1036 | "file_extension": ".py",
1037 | "mimetype": "text/x-python",
1038 | "name": "python",
1039 | "nbconvert_exporter": "python",
1040 | "pygments_lexer": "ipython3",
1041 | "version": "3.6.13"
1042 | }
1043 | },
1044 | "nbformat": 4,
1045 | "nbformat_minor": 1
1046 | }
1047 |
--------------------------------------------------------------------------------