├── HandsOn.dbc
└── Readme.md
/HandsOn.dbc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tsmatz/azure-databricks-exercise/7d293b27a153ba99aa1538413fb8fe673adb1ce8/HandsOn.dbc
--------------------------------------------------------------------------------
/Readme.md:
--------------------------------------------------------------------------------
# Azure Databricks Hands-on (Tutorials)

To run these exercises, follow the instructions below in this readme.

1. [Storage Settings](https://tsmatz.github.io/azure-databricks-exercise/exercise01-blob.html)
2. [Basics of PySpark, Spark Dataframe, and Spark Machine Learning](https://tsmatz.github.io/azure-databricks-exercise/exercise02-pyspark-dataframe.html)
3. [Spark Machine Learning Pipeline](https://tsmatz.github.io/azure-databricks-exercise/exercise03-sparkml-pipeline.html)
4. [Hyper-parameter Tuning](https://tsmatz.github.io/azure-databricks-exercise/exercise04-hyperparams-tuning.html)
5. [MLeap](https://tsmatz.github.io/azure-databricks-exercise/exercise05-mleap.html) (requires ML runtime)
6. [Spark PyTorch Distributor](https://tsmatz.github.io/azure-databricks-exercise/exercise06-dnn-distributor.html) (requires ML runtime)
7. [Structured Streaming (Basic)](https://tsmatz.github.io/azure-databricks-exercise/exercise07-structured-streaming.html)
8. [Structured Streaming with Azure Event Hubs or Kafka](https://tsmatz.github.io/azure-databricks-exercise/exercise08-streaming-eventhub.html)
9. [Delta Lake](https://tsmatz.github.io/azure-databricks-exercise/exercise09-databricks-delta.html)
10. [MLflow](https://tsmatz.github.io/azure-databricks-exercise/exercise10-mlflow.html) (requires ML runtime)
11. [Orchestration with Azure Data Services](https://tsmatz.github.io/azure-databricks-exercise/exercise11-orchestration.html)
12. [Delta Live Tables](https://tsmatz.github.io/azure-databricks-exercise/exercise12-dlt.html)
13. [Databricks SQL](https://tsmatz.github.io/azure-databricks-exercise/exercise13-sql.html)

## Getting Started

- Create an Azure Databricks resource in [Microsoft Azure](https://portal.azure.com/).
When you create the resource, please select the Premium plan. (An optional Azure CLI sketch follows this list.)
- After the resource is created, launch the Databricks workspace UI by clicking "Launch Workspace".
- Create a compute (cluster) in the Databricks UI. (Select the "Compute" menu and proceed to create.)
Please select an ML runtime (not a standard runtime).
- Clone this repository by running the following command. (Or download [HandsOn.dbc](https://github.com/tsmatz/azure-databricks-exercise/raw/master/HandsOn.dbc).)
```git clone https://github.com/tsmatz/azure-databricks-exercise```
- Import ```HandsOn.dbc``` into your Databricks workspace as follows. (A Databricks CLI alternative is sketched after this list.)
    - Select "Workspace" in the workspace UI.
    - Go to your user folder, click the arrow icon next to your e-mail, and then select "Import".
    - Choose ```HandsOn.dbc``` as the file to import.
- Open the imported notebooks and attach the above compute (cluster) to every notebook. (Select the compute (cluster) at the top of each notebook.)
- Please make sure to run "Exercise 01 : Storage Settings (Prepare)" before running the other notebooks.
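
If you prefer the command line for the first step, the workspace can also be created with the Azure CLI instead of the portal. This is only a minimal sketch: it assumes the Azure CLI with its ```databricks``` extension, and the resource group name, workspace name, and region below are placeholders to replace with your own values.

```
# Assumption: Azure CLI is installed and you are logged in (az login).
# Add the Databricks extension if it is not already installed.
az extension add --name databricks

# Create a resource group and a Premium-tier Azure Databricks workspace.
# (Resource names and region are placeholders -- change them for your environment.)
az group create --name databricks-handson-rg --location westus2
az databricks workspace create \
  --resource-group databricks-handson-rg \
  --name databricks-handson-ws \
  --location westus2 \
  --sku premium
```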
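
Likewise, the downloaded ```HandsOn.dbc``` archive can be imported without the workspace UI. The following is a rough sketch, not this tutorial's required procedure; it assumes the legacy Databricks CLI configured with a personal access token, and the target workspace path (your e-mail) is a placeholder.

```
# Assumption: the (legacy) Databricks CLI is installed and configured
# with your workspace URL and a personal access token.
pip install databricks-cli
databricks configure --token

# Import the downloaded archive into your user folder.
# (Replace the e-mail with your own workspace user name.)
databricks workspace import --format DBC HandsOn.dbc /Users/you@example.com/HandsOn
```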

> Note : You cannot use an Azure trial (free) subscription because of its limited quota. If you are on an Azure free subscription, please upgrade to pay-as-you-go. (The credit from the free subscription is kept even after you switch to pay-as-you-go.)

*Tsuyoshi Matsuzaki @ Microsoft*

--------------------------------------------------------------------------------