├── .gitignore ├── .gitmodules ├── README.md ├── Vagrantfile ├── dev.md ├── figures ├── arch.dot ├── arch.svg ├── arch_vm.dot └── arch_vm.svg ├── index.html ├── play.sh ├── provision.bash ├── python ├── _version.py ├── gen_cert.sh ├── setup.py └── sqlflow_playground │ ├── __init__.py │ ├── k8s.py │ ├── playground_server_design.md │ └── server.py ├── release.sh └── start.bash /.gitignore: -------------------------------------------------------------------------------- 1 | *.log 2 | *~ 3 | .vagrant 4 | -------------------------------------------------------------------------------- /.gitmodules: -------------------------------------------------------------------------------- 1 | [submodule "sqlflow"] 2 | path = sqlflow 3 | url = https://github.com/sql-machine-learning/sqlflow 4 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Release SQLFlow Desktop Distribution as a VM Image 2 | 3 | This is experimental work to deploy the whole 4 | [SQLFlow](https://sqlflow.org/sqlflow) service mesh on a Windows, Linux, 5 | or macOS desktop. 6 | 7 | The general architecture of SQLFlow is as follows: 8 | 9 |  10 | 11 | In this deployment, we have the Jupyter Notebook server, SQLFlow server, 12 | and MySQL running in a container executing the 13 | `sqlflow/sqlflow:latest` image. Argo runs on a minikube cluster 14 | running on the VM. The deployment is shown in the following figure: 15 | 16 |  17 | 18 | I chose this deployment plan for the following reasons: 19 | 20 | 1. We don't have a well-written local workflow engine, and at the 21 | moment, we need to focus on the Kubernetes-native engine. 22 | So, we use minikube and install Argo on minikube. 23 | 24 | 1. We can install minikube directly on users' desktop computers 25 | running Windows, Linux, or macOS. However, writing a shell script to 26 | do that requires us to consider many edge cases.
To have a clean 27 | deployment environment, I introduced a VM. 28 | 29 | 1. To make the VM manageable in a programmatic way, I used Vagrant. 30 | Please be aware that Vagrant is the only software users need to 31 | install to use SQLFlow on their desktop computer. And Vagrant 32 | provides official support for Windows, Linux, and macOS. 33 | 34 | 1. We could run the SQLFlow server container (`sqlflow/sqlflow:latest`) 35 | on minikube as well, but that would make it harder to expose ports. 36 | Running the container directly in the VM but outside of minikube, we 37 | 38 | 1. expose the in-container port by adding an `EXPOSE` statement to the 39 | Dockerfile, and 40 | 1. expose the Docker port for access from outside the VM by 41 | adding the following code snippet to the Vagrantfile. 42 | 43 | ```ruby 44 | config.vm.network "forwarded_port", guest: 3306, host: 3306 45 | config.vm.network "forwarded_port", guest: 50051, host: 50051 46 | config.vm.network "forwarded_port", guest: 8888, host: 8888 47 | ``` 48 | -------------------------------------------------------------------------------- /Vagrantfile: -------------------------------------------------------------------------------- 1 | # -*- mode: ruby -*- 2 | # vi: set ft=ruby : 3 | 4 | Vagrant.configure("2") do |config| 5 | config.vm.box = "ubuntu/bionic64" 6 | config.vm.provision "shell", path: "provision.bash" 7 | 8 | # Enlarge disk size from the default '10G' to '20G'. 9 | # This needs the vagrant-disksize plugin, which is installed in play.sh. 10 | config.disksize.size = '20GB' 11 | 12 | # Don't forward 22. Even if we do so, the exposed port only binds 13 | # to 127.0.0.1, not 0.0.0.0. Other ports bind to all IPs.
14 | config.vm.network "forwarded_port", guest: 3306, host: 3306, 15 | auto_correct: true 16 | config.vm.network "forwarded_port", guest: 50051, host: 50051, 17 | auto_correct: true 18 | # Jupyter Notebook 19 | config.vm.network "forwarded_port", guest: 8888, host: 8888, 20 | auto_correct: true 21 | # minikube dashboard 22 | config.vm.network "forwarded_port", guest: 9000, host: 9000, 23 | auto_correct: true 24 | # Argo dashboard 25 | config.vm.network "forwarded_port", guest: 9001, host: 9001, 26 | auto_correct: true 27 | 28 | config.vm.provider "virtualbox" do |v| 29 | v.memory = 8192 30 | v.cpus = 4 31 | end 32 | 33 | # Bind the host directory ./ into the VM. 34 | config.vm.synced_folder "./", "/home/vagrant/desktop" 35 | end 36 | -------------------------------------------------------------------------------- /dev.md: -------------------------------------------------------------------------------- 1 | ## Develop, Release, and Use SQLFlow-in-a-VM 2 | 3 | ### For Developers 4 | 5 | 1. Install [VirtualBox 6.1.6](https://www.virtualbox.org/) and [Vagrant 2.2.7](https://www.vagrantup.com/) on a computer with a relatively large amount of memory; a host with 16G of memory and 8 cores is recommended. 6 | 1. Clone the `SQLFlow playground` project and update its submodules. 7 | ```bash 8 | git clone https://github.com/sql-machine-learning/playground.git 9 | cd playground 10 | git submodule update --init 11 | ``` 12 | 1. Run `play.sh` under the playground's root directory. This script will guide you through installing SQLFlow on a VirtualBox VM. If you have a slow Internet connection to Vagrant Cloud, you might want to download the Ubuntu VirtualBox image manually from a mirror site into `~/.cache/sqlflow/` before running the script. We use `wget -c` here to resume the download from the last breakpoint, so if the command fails, just re-run it.
13 | ```bash 14 | # download the Vagrant image manually (optional) 15 | mkdir -p $HOME/.cache/sqlflow 16 | wget -c -O $HOME/.cache/sqlflow/ubuntu-bionic64.box \ 17 | "https://mirrors.ustc.edu.cn/ubuntu-cloud-images/bionic/current/bionic-server-cloudimg-amd64-vagrant.box" 18 | 19 | ./play.sh 20 | ``` 21 | `play.sh` installs some Vagrant plugins, like `vagrant-disksize`, which enlarges the disk size of the VM. The script then calls the `vagrant up` command to boot up the VM. After the VM is up, `provision.bash` runs automatically and installs the dependencies for SQLFlow. Provisioning is a one-shot job; after it is done, we have an environment with SQLFlow, Docker, and minikube installed. 22 | 23 | 1. Log on to the VM and start the SQLFlow playground by running the `start.bash` script; it will pull some Docker images and start the playground minikube cluster. Because pulling the images may be slow, the script might fail sometimes. Feel free to re-run it until you get output like `Access Jupyter Notebook at ...`. 24 | ```bash 25 | vagrant ssh 26 | sudo su 27 | cd desktop 28 | ./start.bash 29 | ``` 30 | 1. After minikube has started, you can access the `Jupyter Notebook` from your desktop, or use the SQLFlow command-line tool [sqlflow](https://github.com/sql-machine-learning/sqlflow/blob/develop/doc/run/cli.md) to access the `SQLFlow server`. Just follow the output of `start.bash`; it will give you some hints. 31 | 1. After playing for a while, you may want to stop the SQLFlow playground; just log on to the VM again and stop the minikube cluster. 32 | ```bash 33 | vagrant ssh # optional if you are already logged on 34 | minikube stop 35 | ``` 36 | 1. Finally, if you want to stop the VM, run the `vagrant halt` command. To completely destroy the VM, run the `vagrant destroy` command. 37 | 38 | ### For Releasers 39 | 40 | The releaser, who in most cases is a developer, can export a running VirtualBox VM into a VM image file with the extension `.ova`.
An `ova` file is a tarball of a directory whose content follows the OVF specification. For the concepts, please refer to this [explanation](https://damiankarlson.com/2010/11/01/ovas-and-ovfs-what-are-they-and-whats-the-difference/). 41 | 42 | According to this [tutorial](https://www.techrepublic.com/article/how-to-import-and-export-virtualbox-appliances-from-the-command-line/), releasers can call the VBoxManage command to export a VM. We have written a script to do this. Simply run the script below to export our playground. It will create a file named `SQLFlowPlayground.ova`, which we can import through the VirtualBox GUI. 43 | 44 | ```bash 45 | ./release.sh 46 | ``` 47 | 48 | ### For End-users 49 | 50 | To run SQLFlow on a desktop computer running Windows, Linux, or macOS, follow the steps below: 51 | 1. Install [VirtualBox](https://www.virtualbox.org/) (v6.1.6 is recommended). 52 | 53 | 1. Download the released VirtualBox `.ova` file; you have two choices: 54 | - the minimal image (about 600M): ships with all bootstrap files but no dependency Docker images. When you start the playground, you will wait a while for it to download the latest Docker images, the minikube framework, and other packages. 55 | ```bash 56 | wget -c http://cdn.sqlflow.tech/latest/SQLFlowPlaygroundBare.ova 57 | ``` 58 | - the fully installed image (about 2G): ships with all dependencies, so no extra downloading is needed at startup. Note that in this case the images will not be updated automatically; you will have to update them manually when needed. 59 | ```bash 60 | wget -c http://cdn.sqlflow.tech/latest/SQLFlowPlaygroundFull.ova 61 | ``` 62 | 1. Optionally, download the [sqlflow](https://github.com/sql-machine-learning/sqlflow/blob/develop/doc/run/cli.md) command-line tool released by SQLFlow CI. 63 | 64 | After VirtualBox is installed, you can import the `.ova` file and start a VM.
If your host has a lower-end configuration, you can adjust the CPU core count and RAM amount in VirtualBox's settings panel, say, to 2 cores and 4G of RAM. After that, you can log in to the system through the VirtualBox GUI or through an SSH connection like the one below. The default password of `root` is `sqlflow`. 65 | ```bash 66 | ssh -p2222 root@127.0.0.1 67 | root@127.0.0.1's password: sqlflow 68 | ``` 69 | Once logged in to the VM, you will immediately see a script named `start.bash`; just run it to start the SQLFlow playground. It will output some hint messages for you; follow those hints, and after a while you will see something like `Access Jupyter Notebook at: http://127.0.0.1:8888/...`, which means we are all set. Copy the link into your web browser and you will see SQLFlow's Jupyter Notebook user interface. Enjoy! 70 | ```bash 71 | ./start.bash 72 | ``` 73 | 74 | Or, if you have an AWS or Google Cloud account, you can upload the `.ova` file and start the VM on the cloud. AWS users can follow [these steps](https://aws.amazon.com/ec2/vm-import/). 75 | 76 | Given a running VM, the end-user can run the following command to connect to it: 77 | 78 | ```bash 79 | sqlflow --sqlflow-server=my-vm.aws.com:50051 80 | ``` 81 | 82 | ### For End-users with Kubernetes (without a VM) 83 | 84 | The SQLFlow playground now also supports installing directly on Kubernetes. Users can refer to [this doc](https://github.com/sql-machine-learning/sqlflow/blob/develop/doc/run/kubernetes.md) for a fast deployment.
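Across the deployments above, the client-facing endpoints are MySQL (3306), the SQLFlow gRPC server (50051), and Jupyter Notebook (8888). A quick way to sanity-check a running playground from the host is to probe those ports. The sketch below is not part of this repository; it assumes `bash` with `/dev/tcp` redirection support, and `SQLFLOW_HOST` is a hypothetical environment variable for pointing the check at a remote VM instead of the local forwarded ports.

```shell
#!/usr/bin/env bash
# Probe the ports the Vagrantfile forwards to the host, then print the
# sqlflow CLI command for the gRPC endpoint. SQLFLOW_HOST is an assumed
# variable name; it defaults to the locally forwarded ports.

endpoint() {  # compose host:port for the sqlflow --sqlflow-server flag
  echo "${1}:${2}"
}

port_open() {  # succeed if a TCP connection to host $1, port $2 opens
  # /dev/tcp is a bash feature, not a real device file.
  (exec 3<>"/dev/tcp/${1}/${2}") 2>/dev/null
}

HOST="${SQLFLOW_HOST:-127.0.0.1}"
for port in 3306 50051 8888; do
  if port_open "$HOST" "$port"; then
    echo "port ${port}: reachable"
  else
    echo "port ${port}: not reachable -- is the playground running?"
  fi
done

echo "connect with: sqlflow --sqlflow-server=$(endpoint "$HOST" 50051)"
```

If all three ports report reachable, the playground is up; otherwise, re-run `start.bash` inside the VM and check the port forwardings in the Vagrantfile.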
85 | -------------------------------------------------------------------------------- /figures/arch.dot: -------------------------------------------------------------------------------- 1 | digraph G { 2 | node [shape=box]; 3 | 4 | User1 [shape=oval, label="Lily"]; 5 | User2 [shape=oval, label="Bob"]; 6 | User3 [shape=oval, label="Eva"]; 7 | 8 | {rank = same; User1; User2; User3} 9 | 10 | Browser1 [label="Web browser"]; 11 | Browser2 [label="Web browser"]; 12 | 13 | {rank = same; Browser1, Browser2, Client} 14 | 15 | Jupyter [label="Jupyter Notebook server +\n SQLFlow magic command"]; 16 | SQLFlow [label="SQLFlow server"]; 17 | Argo [label="Tekton on Kubernetes\n(each workflow step is a container)"]; 18 | AI [label="AI engine\n(Alibaba PAI, Kubeflow+Kubernetes, etc)"]; 19 | DBMS [label="database system\n(Hive, MySQL, MaxCompute, etc)"]; 20 | 21 | User1 -> Browser1; 22 | User2 -> Browser2; 23 | Browser1 -> Jupyter [label="SQL/Flow program"]; 24 | Browser2 -> Jupyter; 25 | 26 | Jupyter -> SQLFlow [label="SQL/Flow program"]; 27 | SQLFlow -> Argo [label="Argo workflow"]; 28 | Argo -> DBMS [label="submit SQL statement"]; 29 | Argo -> AI [label="submit AI job"]; 30 | Argo -> DBMS [label="verify data schema"]; 31 | 32 | Client [label="sqlflow command-line client"]; 33 | 34 | User3 -> Client; 35 | Client -> SQLFlow [label="SQL/Flow program"]; 36 | } 37 | -------------------------------------------------------------------------------- /figures/arch.svg: -------------------------------------------------------------------------------- 1 | 2 | 4 | 6 | 7 | 157 | -------------------------------------------------------------------------------- /figures/arch_vm.dot: -------------------------------------------------------------------------------- 1 | digraph G { 2 | node [shape=box]; 3 | 4 | User1 [shape=oval, label="Lily"]; 5 | User2 [shape=oval, label="Bob"]; 6 | User3 [shape=oval, label="Eva"]; 7 | 8 | {rank = same; User1; User2; User3} 9 | 10 | Browser1 [label="Web browser"]; 11
| Browser2 [label="Web browser"]; 12 | 13 | {rank = same; Browser1, Browser2, Client} 14 | 15 | subgraph cluster_vm { 16 | label="VM" 17 | subgraph cluster_container { 18 | label="sqlflow/sqlflow:latest"; 19 | Jupyter [label="Jupyter Notebook server +\n SQLFlow magic command"]; 20 | SQLFlow [label="SQLFlow server"]; 21 | DBMS [label="MySQL"]; 22 | } 23 | subgraph cluster_minikube { 24 | label="minikube"; 25 | Argo [label="Argo"]; 26 | AI [label="AI engine:\ncontainer-local run"]; 27 | } 28 | } 29 | 30 | User1 -> Browser1; 31 | User2 -> Browser2; 32 | Browser1 -> Jupyter [label="SQL/Flow program"]; 33 | Browser2 -> Jupyter; 34 | 35 | Jupyter -> SQLFlow [label="SQL/Flow program"]; 36 | SQLFlow -> Argo [label="Argo workflow"]; 37 | Argo -> DBMS [label="submit SQL statement"]; 38 | Argo -> AI [label="submit AI job"]; 39 | Argo -> DBMS [label="verify data schema"]; 40 | 41 | Client [label="sqlflow command-line client"]; 42 | 43 | User3 -> Client; 44 | Client -> SQLFlow [label="SQL/Flow program"]; 45 | } 46 | -------------------------------------------------------------------------------- /figures/arch_vm.svg: -------------------------------------------------------------------------------- 1 | 2 | 4 | 6 | 7 | 170 | -------------------------------------------------------------------------------- /index.html: -------------------------------------------------------------------------------- 1 | 30 | 31 | 32 |
33 | 34 | 35 | 38 | 39 | 40 |