192 | Solution (expand to see)
193 |
194 |
195 | ```bash
196 | helm install stable/dokuwiki --set dokuwikiWikiName="Hello MLADS"
197 | ```
198 |
199 |
200 |
201 |
202 |
203 | ## Next Step
204 |
205 | [4 - Kubeflow](../4-kubeflow/README.md)
--------------------------------------------------------------------------------
/3-helm/dokuwiki.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Azure/kubeflow-labs/39657fb130a563ef03d4beaf526f8fc544825287/3-helm/dokuwiki.png
--------------------------------------------------------------------------------
/4-kubeflow/README.md:
--------------------------------------------------------------------------------
1 | # Kubeflow - Overview and Installation
2 |
3 | ## Prerequisites
4 |
5 | - [1 - Docker](../1-docker/README.md)
6 | - [2 - Kubernetes](../2-kubernetes/README.md)
7 |
8 | ## Summary
9 |
10 | In this module we are going to get an overview of the different components that make up [Kubeflow](https://github.com/kubeflow/kubeflow), and how to install them into our newly deployed Kubernetes cluster.
11 |
12 | ### Kubeflow Overview
13 |
14 | From [Kubeflow](https://github.com/kubeflow/kubeflow)'s own documentation:
15 |
16 | > The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable. Our goal is not to recreate other services, but to provide a straightforward way to deploy best-of-breed open-source systems for ML to diverse infrastructures. Anywhere you are running Kubernetes, you should be able to run Kubeflow.
17 |
18 | Kubeflow is composed of multiple components:
19 |
20 | - [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/), which allows users to request an instance of a Jupyter Notebook server dedicated to them.
21 | - One or multiple training controllers. These are components that simplify and manage the deployment of training jobs. For the purpose of this lab we are only going to deploy a training controller for TensorFlow jobs. However, the Kubeflow community has started working on controllers for PyTorch and Caffe2 as well.
22 | - A serving component that will help you serve predictions with your models.
23 |
24 | For more general info on Kubeflow, head to the repo's [README](https://github.com/kubeflow/kubeflow/blob/master/README.md).
25 |
26 | ### Deploying Kubeflow
27 |
28 | Kubeflow uses [`ksonnet`](https://github.com/ksonnet/ksonnet) templates as a way to package and deploy the different components.
29 |
30 | > ksonnet simplifies defining an application configuration, updating the configuration over time, and specializing it for different clusters and environments.
31 |
32 | First, install ksonnet version [0.13.1](https://ksonnet.io/#get-started), or [download a prebuilt binary](https://github.com/ksonnet/ksonnet/releases/tag/v0.13.1) for your OS.
33 |
34 | Then run the following commands to download Kubeflow:
35 |
36 | ```bash
37 | KUBEFLOW_SRC=$(realpath kubeflow)
38 |
39 | mkdir ${KUBEFLOW_SRC}
40 | cd ${KUBEFLOW_SRC}
41 |
42 | export KUBEFLOW_TAG=v0.4.1
43 |
44 | curl https://raw.githubusercontent.com/kubeflow/kubeflow/${KUBEFLOW_TAG}/scripts/download.sh | bash
45 | ```
46 |
47 | `KUBEFLOW_SRC` is the directory where you want to download the source.
48 |
49 | `KUBEFLOW_TAG` is a tag corresponding to the version to check out, such as `master` for the latest code.
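
As a minimal sketch of how `KUBEFLOW_TAG` feeds into the download step, the helper below builds the `download.sh` URL used by the `curl` command above and fails early when the tag is empty. The function name `kubeflow_download_url` is hypothetical and is not part of Kubeflow; only the URL pattern comes from this lab.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Kubeflow): build the download.sh URL
# for a given tag, and fail early if the tag is empty.
kubeflow_download_url() {
  local tag="${1:-}"
  if [ -z "${tag}" ]; then
    echo "KUBEFLOW_TAG is not set" >&2
    return 1
  fi
  echo "https://raw.githubusercontent.com/kubeflow/kubeflow/${tag}/scripts/download.sh"
}

# Print the URL that would be fetched for the tag used in this lab.
kubeflow_download_url "v0.4.1"
```

Checking the tag before piping a script into `bash` avoids fetching a malformed URL when the variable was never exported.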
50 |
51 | ```bash
52 | # Initialize a kubeflow app
53 | KFAPP=mykubeflowapp
54 | ${KUBEFLOW_SRC}/scripts/kfctl.sh init ${KFAPP} --platform none
55 |
56 | # Generate kubeflow app
57 | cd ${KFAPP}
58 | ${KUBEFLOW_SRC}/scripts/kfctl.sh generate k8s
59 |
60 | # Deploy Kubeflow app
61 | ${KUBEFLOW_SRC}/scripts/kfctl.sh apply k8s
62 | ```
63 |
64 | ### Validation
65 |
66 | `kubectl get pods -n kubeflow`
67 |
68 | should return something like this:
69 |
70 | ```
71 | NAME                                                      READY   STATUS    RESTARTS   AGE
72 | ambassador-b4d9cdb8-2qgww                                 1/1     Running   0          111m
73 | ambassador-b4d9cdb8-hpwdc                                 1/1     Running   0          111m
74 | ambassador-b4d9cdb8-khg8l                                 1/1     Running   0          111m
75 | argo-ui-6d6658d8f7-t6whw                                  1/1     Running   0          110m
76 | centraldashboard-6f686c5b7c-462cq                         1/1     Running   0          111m
77 | jupyter-0                                                 1/1     Running   0          111m
78 | katib-ui-6c59754c48-mgf62                                 1/1     Running   0          110m
79 | metacontroller-0                                          1/1     Running   0          111m
80 | minio-d79b65988-6qkxp                                     1/1     Running   0          110m
81 | ml-pipeline-66df9d86f6-rp245                              1/1     Running   0          110m
82 | ml-pipeline-persistenceagent-7b86dbf4b5-rgndj             1/1     Running   0          110m
83 | ml-pipeline-scheduledworkflow-84f6477479-9tvhk            1/1     Running   0          110m
84 | ml-pipeline-ui-f76bb5f97-2s5qb                            1/1     Running   0          110m
85 | mysql-ffc889689-xkpxb                                     1/1     Running   0          110m
86 | pytorch-operator-ff46f9b7d-qkbvh                          1/1     Running   0          111m
87 | spartakus-volunteer-5b6c956c8f-2gnvb                      1/1     Running   0          111m
88 | studyjob-controller-b7cdbd4cd-nf9z5                       1/1     Running   0          110m
89 | tf-job-dashboard-7746db84cf-njdzk                         1/1     Running   0          111m
90 | tf-job-operator-v1beta1-5949f668f7-j5zrn                  1/1     Running   0          111m
91 | vizier-core-7c56465f6-t6d5p                               1/1     Running   0          110m
92 | vizier-core-rest-67f588b4cb-lqvgr                         1/1     Running   0          110m
93 | vizier-db-86dc7d89c5-8vtfs                                1/1     Running   0          110m
94 | vizier-suggestion-bayesianoptimization-7cb546fb84-tsrn4   1/1     Running   0          110m
95 | vizier-suggestion-grid-6587f9d6b-92c9h                    1/1     Running   0          110m
96 | vizier-suggestion-hyperband-8bb44f8c8-gs72m               1/1     Running   0          110m
97 | vizier-suggestion-random-7ff5db687b-bjdh5                 1/1     Running   0          110m
98 | workflow-controller-cf79dfbff-lv7jk                       1/1     Running   0          110m
99 | ```
100 |
101 | The most important components for the purpose of this lab are `jupyter-0`, which is the JupyterHub spawner running on your cluster, and `tf-job-operator-v1beta1-5949f668f7-j5zrn`, which is a controller that monitors your cluster for new TensorFlow training job specifications (called `TfJobs`) and manages the training. We will look at these two components later.
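
To spot these two components quickly in a long pod listing, the output can be filtered by name. The sketch below assumes the pod names start with `jupyter-` and `tf-job-operator-`, as in the listing above; the function name `filter_lab_pods` is hypothetical. The demo runs on canned text so it works without a cluster.

```bash
#!/usr/bin/env bash
# Hypothetical helper: keep only the pod lines this lab cares about.
# Matches the name at the start of a line or after whitespace.
filter_lab_pods() {
  grep -E '(^|[[:space:]])(jupyter-[0-9]+|tf-job-operator-)'
}

# Demo on canned text copied from the sample listing.
printf '%s\n' \
  'jupyter-0 1/1 Running 0 111m' \
  'minio-d79b65988-6qkxp 1/1 Running 0 110m' \
  'tf-job-operator-v1beta1-5949f668f7-j5zrn 1/1 Running 0 111m' \
  | filter_lab_pods
```

Against a real cluster, the same filter could be applied as `kubectl get pods -n kubeflow | filter_lab_pods`.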
102 |
103 | ### Remove Kubeflow
104 |
105 | If you want to remove the Kubeflow deployment, you can run the following to remove the namespace and installed components:
106 |
107 | ```bash
108 | cd ${KUBEFLOW_SRC}/${KFAPP}
109 | ${KUBEFLOW_SRC}/scripts/kfctl.sh delete k8s
110 | ```
111 |
112 | ## Next Step
113 |
114 | [5 - JupyterHub](../5-jupyterhub/README.md)
115 |
--------------------------------------------------------------------------------
/5-jupyterhub/README.md:
--------------------------------------------------------------------------------
1 | # Jupyter Notebooks on Kubernetes
2 |
3 | ## Prerequisites
4 |
5 | - [1 - Docker Basics](../1-docker)
6 | - [2 - Kubernetes Basics and cluster created](../2-kubernetes)
7 | - [4 - Kubeflow](../4-kubeflow)
8 |
9 | ## Summary
10 |
11 | In this module, you will learn how to:
12 |
13 | - Run Jupyter Notebooks locally using Docker
14 | - Run JupyterHub on Kubernetes using Kubeflow
15 |
16 | ## How Jupyter Notebooks work
17 |
18 | The [Jupyter Notebook](http://jupyter.org/) is an open source web application that allows users to create and share documents that contain live code, equations, visualizations, and narrative text for rapid prototyping. It is often used for data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and more. To better support exploratory iteration and to accelerate computation of TensorFlow jobs, let's look at how we can combine data science tools like Jupyter Notebook with Docker and Kubernetes.
19 |
20 | ## How JupyterHub works
21 |
22 | [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/) is a multi-user hub that spawns, manages, and proxies multiple instances of the single-user Jupyter notebook server. JupyterHub can be used to serve notebooks to a class of students, a corporate data science group, or a scientific research group. Let's look at how we can use Kubeflow to run JupyterHub and spawn multiple instances of Jupyter Notebook on Kubernetes.
23 |
24 | ## Exercises
25 |
26 | ### Exercise 1: Run Jupyter Notebooks locally using Docker
27 |
28 | In this first exercise, we will run Jupyter Notebooks locally using Docker. We will use the official TensorFlow Docker image, as it comes with Jupyter Notebook.
29 |
30 | ```console
31 | docker run -it -p 8888:8888 tensorflow/tensorflow
32 | ```
33 |
34 | #### Validation
35 |
36 | To verify, browse to the URL printed in the output log.
37 |
38 | For example: `http://localhost:8888/?token=a3ea3cd914c5b68149e2b4a6d0220eca186fec41563c0413`
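
If the log is long, the URL can be pulled out with a small filter. This is a sketch that assumes the log contains a URL of the same shape as the example above (an `http` address ending in `?token=` followed by hex characters); the function name `extract_jupyter_url` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the first tokenized Jupyter URL found on stdin.
# Assumes the URL looks like the example in this exercise.
extract_jupyter_url() {
  grep -oE 'https?://[^[:space:]]+[?]token=[0-9a-f]+' | head -n 1
}

# Demo on a canned log line built from the example URL.
echo 'Copy/paste this URL: http://localhost:8888/?token=a3ea3cd914c5b68149e2b4a6d0220eca186fec41563c0413' \
  | extract_jupyter_url
```

With a running container, the log text could be piped in from `docker logs <container-id> 2>&1`, where the container ID is whatever `docker ps` reports on your machine.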
39 |
40 | ### Exercise 2: Run JupyterHub on Kubernetes using Kubeflow
41 |
42 | In this exercise, we will run JupyterHub to spawn multiple instances of Jupyter Notebooks on a Kubernetes cluster using Kubeflow.
43 |
44 | As a prerequisite, you should already have a Kubernetes cluster running (follow [module 2 - Kubernetes](../2-kubernetes) to create your own cluster) and Kubeflow running in that cluster (follow [module 4 - Kubeflow](../4-kubeflow)).
45 |
46 | In module 4, you installed the kubeflow-core component, which already includes JupyterHub and a corresponding service of type `ClusterIP`. To check its status, run the following kubectl command.
47 |
48 | ```
49 | NAMESPACE=kubeflow
50 | kubectl get svc -n=${NAMESPACE}
51 |
52 | NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
53 | ...
54 | jupyter-0 ClusterIP None