├── .gitignore
├── AnomalyDetection
│   └── anomalydetection-interactivenotebook-main
│       ├── 01-Prerequisites.md
│       ├── 02-Dataflow_Pub_Sub_Notebook.md
│       ├── Dataflow_Pub_Sub_Notebook.ipynb
│       ├── Images
│       │   ├── DataflowJob.png
│       │   ├── Lab_Arch.png
│       │   ├── OrgPolicy.png
│       │   ├── agg-data-results.png
│       │   ├── agg-schema.png
│       │   ├── clonedRepoDisplayed.png
│       │   ├── create_notebook.png
│       │   ├── dataflowFailed.png
│       │   ├── default_notebook_settings.png
│       │   ├── fixed-window.png
│       │   ├── git_clone_icon.png
│       │   ├── navigate_to_workbench.png
│       │   ├── plot.png
│       │   ├── raw-data-results.png
│       │   ├── raw-schema.png
│       │   └── search_for_dataflow.png
│       ├── PythonSimulator.ipynb
│       └── README.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE
├── README.md
└── RealTimePrediction
    └── realtime-intelligence-main
        ├── README.md
        ├── create_train_data.sh
        ├── images
        │   ├── architecture.png
        │   ├── batch.png
        │   ├── data_folder.png
        │   ├── dataflow_jobs1.png
        │   ├── dataflow_jobs2.png
        │   ├── flights_folder.png
        │   ├── ingestion_bq.png
        │   ├── ingestion_gcs.png
        │   ├── op_externalip.png
        │   ├── op_shieldedvm.png
        │   ├── prediction.png
        │   ├── pubsub.png
        │   ├── streaming.png
        │   ├── vertex_ai_deployment.png
        │   ├── vertex_ai_endpoint.png
        │   └── vertex_ai_training.png
        ├── install_packages.sh
        ├── predict_flights.sh
        ├── realtime
        │   ├── .gitignore
        │   ├── README.md
        │   ├── alldata_sample.json
        │   ├── call_predict.py
        │   ├── create_sample_input.sh
        │   ├── create_traindata.py
        │   ├── evaluation.ipynb
        │   ├── flightstxf
        │   │   ├── __init__.py
        │   │   └── flights_transforms.py
        │   ├── make_predictions.py
        │   ├── model.py
        │   ├── setup.py
        │   ├── simevents_sample.json
        │   └── train_on_vertex.py
        ├── setup_env.sh
        ├── simulate
        │   ├── airports.csv.gz
        │   └── simulate.py
        ├── simulate_flight.sh
        ├── stage_data.sh
        └── train_model.sh
/.gitignore:
--------------------------------------------------------------------------------
1 | **/.DS_Store
2 | # Covers JetBrains IDEs: IntelliJ, RubyMine, PhpStorm, AppCode, PyCharm, CLion, Android Studio, WebStorm and Rider
3 | # Reference: https://intellij-support.jetbrains.com/hc/en-us/articles/206544839
4 |
5 | # User-specific stuff
6 | .idea/**/workspace.xml
7 | .idea/**/tasks.xml
8 | .idea/**/usage.statistics.xml
9 | .idea/**/dictionaries
10 | .idea/**/shelf
11 | *ipynb_checkpoints
12 |
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/01-Prerequisites.md:
--------------------------------------------------------------------------------
1 | # About
2 |
3 | This module covers all prerequisites for this lab.
4 |
5 | [0. Prerequisites](#0-prerequisites)
6 | [1. Variables](#1-variables)
7 | [2. Enable APIs](#2-enable-api-services)
8 | [3. Create a VPC & a subnet](#3-create-vpc--subnet)
9 | [4. Create firewall rules](#4-create-firewall-rules)
10 | [5. Update organizational policies](#5-update-organizational-policies)
11 | [6. Service Account](#6-service-account)
12 | [7. Grant general IAM permissions](#7-grant-permissions-for-service-account-that-you-just-created)
13 | [8. Launch Apache Beam Notebook Instance](#8-launch-an-apache-beam-notebook-instance)
14 | [9. Next Step](#9-next-step)
15 |
16 | ## 0. Prerequisites
17 |
18 | #### 1. Create a project called "anomaly-detection".
19 | Note the project number and project ID.
20 | We will need these for the rest of the lab.
21 |
22 | Set the project back to "anomaly-detection" in the UI.
23 |
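For reference, a minimal sketch of this step with `gcloud` (the project ID `anomaly-detection-demo` below is a placeholder; project IDs are globally unique, so choose your own):

```
# Create the project and note its number and ID
gcloud projects create anomaly-detection-demo --name="anomaly-detection"
gcloud projects describe anomaly-detection-demo --format="value(projectNumber)"

# Point gcloud at the new project
gcloud config set project anomaly-detection-demo
```
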
24 | ## 1. Variables
25 |
26 | We will use these throughout the lab.
27 | Run the below in Cloud Shell, scoped to the new project you created:
28 | ```
29 | DEST_PROJECT=`gcloud config get-value project`
30 | VPC=$DEST_PROJECT"-vpc"
31 | SUBNET=$VPC"-subnet"
32 | REGION=us-central1
33 | VPC_FQN=projects/$DEST_PROJECT/global/networks/$VPC
34 |
35 | SERVICE_ACCOUNT="example-name"
36 | SERVICE_ACCOUNT_FQN=$SERVICE_ACCOUNT@$DEST_PROJECT.iam.gserviceaccount.com
37 | YOUR_IP=xx.xxx.xx.xx
38 |
39 | ```
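
If you are unsure of your machine's public IP for `YOUR_IP`, one common way to look it up is via a third-party echo service such as ifconfig.me:

```
# Prints the public IP of the machine you run it on
curl -s ifconfig.me
```
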
40 | ## 2. Enable API Services
41 |
42 | From Cloud Shell, run the below:
43 | ```
44 | gcloud services enable compute.googleapis.com
45 | gcloud services enable aiplatform.googleapis.com
46 | gcloud services enable dataflow.googleapis.com
47 | gcloud services enable datastream.googleapis.com
48 | gcloud services enable datacatalog.googleapis.com
49 | gcloud services enable bigquery.googleapis.com
50 | gcloud services enable composer.googleapis.com
51 | gcloud services enable sourcerepo.googleapis.com
52 | gcloud services enable cloudresourcemanager.googleapis.com
53 | ```
54 |
55 | ## 3. Create VPC & Subnet
56 |
57 | Run the below from Cloud Shell.
58 |
59 | ```
60 | gcloud compute networks create $VPC \
61 | --subnet-mode=custom \
62 | --bgp-routing-mode=regional \
63 | --mtu=1500
64 | ```
65 |
66 | ```
67 | gcloud compute networks subnets create $SUBNET \
68 | --network=$VPC \
69 | --range=10.0.0.0/24 \
70 | --region=$REGION \
71 | --enable-private-ip-google-access
72 | ```
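
Optionally, verify that the subnet was created with Private Google Access enabled:

```
gcloud compute networks subnets describe $SUBNET --region=$REGION
```
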
73 | ## 4. Create Firewall Rules
74 |
75 | 4.1) Intra-VPC, allow all communication
76 |
77 | ```
78 | gcloud compute firewall-rules create allow-all-intra-vpc --project=$DEST_PROJECT --network=$VPC_FQN \
79 |     --description="Allows connection from any source to any instance on the network using custom protocols." --direction=INGRESS \
80 |     --priority=65534 --source-ranges=10.0.0.0/20 --action=ALLOW --rules=all
81 | ```
82 |
83 | 4.2) Allow-SSH
84 |
85 | ```
86 | gcloud compute firewall-rules create allow-all-ssh --project=$DEST_PROJECT --network=$VPC_FQN \
87 |     --description="Allows TCP connections from any source to any instance on the network using port 22." --direction=INGRESS \
88 |     --priority=65534 --source-ranges=0.0.0.0/0 --action=ALLOW --rules=tcp:22
89 | ```
90 |
91 | 4.3) Allow Ingress
92 |
93 | ```
94 | gcloud compute --project=$DEST_PROJECT firewall-rules create allow-all-to-my-machine --direction=INGRESS --priority=1000 --network=$VPC \
95 | --action=ALLOW --rules=all --source-ranges=$YOUR_IP
96 |
97 | ```
98 | 4.4) Allow your computer to access Node-RED from the browser on port 1880. (You need this rule to open Node-RED from a browser on your computer.)
99 |
100 | ```
101 | gcloud compute firewall-rules create allow-node-red \
102 |     --project=$DEST_PROJECT \
103 |     --network=$VPC_FQN \
104 |     --description="Allows TCP connections from any source to any instance on the network using port 1880." \
105 |     --direction=INGRESS \
106 |     --priority=1010 \
107 |     --source-ranges=$YOUR_IP \
108 |     --action=ALLOW \
109 |     --rules=tcp:1880
110 | ```
111 |
112 | ## 5. Update Organizational Policies
113 |
114 | In the Google Cloud Console, navigate to IAM -> Organization Policies.
115 |
116 | Turn off the following org policies: `constraints/compute.vmExternalIpAccess` and `constraints/iam.disableServiceAccountKeyCreation`.
117 |
118 | ![Org Policy](Images/OrgPolicy.png)
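
If you prefer the CLI for the boolean constraint, a sketch (this assumes your account holds the Organization Policy Administrator role; the list constraint `constraints/compute.vmExternalIpAccess` is easiest to change in the console as shown above):

```
# Stop enforcing the service-account-key constraint on this project
gcloud resource-manager org-policies disable-enforce iam.disableServiceAccountKeyCreation --project=$DEST_PROJECT
```
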
119 |
120 | ## 6. Service Account
121 |
122 | Run the following in Cloud Shell:
123 |
124 | ```
125 | gcloud iam service-accounts create ${SERVICE_ACCOUNT} \
126 | --description="User Managed Service Account" \
127 | --display-name=$SERVICE_ACCOUNT
128 | ```
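
To confirm the account was created:

```
gcloud iam service-accounts describe $SERVICE_ACCOUNT_FQN
```
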
129 |
130 | ## 7. Grant Permissions for Service Account that you just created
131 |
132 | Run the following to grant all the permissions the service account needs for this lab:
133 |
134 | ```
135 | gcloud projects add-iam-policy-binding ${DEST_PROJECT} \
136 | --member=serviceAccount:${SERVICE_ACCOUNT_FQN} \
137 | --role=roles/iam.serviceAccountTokenCreator
138 | ```
139 | ```
140 | gcloud projects add-iam-policy-binding ${DEST_PROJECT} \
141 | --member=serviceAccount:${SERVICE_ACCOUNT_FQN} \
142 | --role=roles/pubsub.editor
143 | ```
144 | ```
145 | gcloud projects add-iam-policy-binding ${DEST_PROJECT} \
146 | --member=serviceAccount:${SERVICE_ACCOUNT_FQN} \
147 | --role=roles/pubsub.publisher
148 | ```
149 | ```
150 | gcloud projects add-iam-policy-binding ${DEST_PROJECT} \
151 | --member=serviceAccount:${SERVICE_ACCOUNT_FQN} \
152 | --role=roles/bigquery.admin
153 | ```
154 | ```
155 | gcloud projects add-iam-policy-binding ${DEST_PROJECT} \
156 | --member=serviceAccount:${SERVICE_ACCOUNT_FQN} \
157 | --role=roles/bigquery.dataEditor
158 | ```
159 | ```
160 | gcloud projects add-iam-policy-binding ${DEST_PROJECT} \
161 | --member=serviceAccount:${SERVICE_ACCOUNT_FQN} \
162 | --role=roles/dataflow.developer
163 | ```
164 | ```
165 | gcloud projects add-iam-policy-binding ${DEST_PROJECT} \
166 | --member=serviceAccount:${SERVICE_ACCOUNT_FQN} \
167 | --role=roles/dataflow.worker
168 | ```
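
Optionally, list the roles now bound to the service account to confirm the grants took effect:

```
gcloud projects get-iam-policy $DEST_PROJECT \
    --flatten="bindings[].members" \
    --filter="bindings.members:serviceAccount:$SERVICE_ACCOUNT_FQN" \
    --format="value(bindings.role)"
```
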
169 | ## 8. Launch an Apache Beam notebook instance
170 |
171 | Go to the Google Cloud Console -> Dataflow -> Workbench.
172 |
173 | Make sure that you are on the User-managed notebooks tab.
174 |
175 | In the toolbar, click New notebook.
176 |
177 | Select Apache Beam > Without GPUs.
178 |
179 | On the New notebook page, select the subnetwork you created in Step 3 for the notebook VM.
180 |
181 | Click Create.
182 |
183 | When the link becomes active, click Open JupyterLab. Vertex AI Workbench creates a new Apache Beam notebook instance.
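
For reference, a roughly equivalent CLI sketch with `gcloud notebooks` (the instance name is arbitrary, and the Apache Beam image family shown is an assumption; check the image families published by the `deeplearning-platform-release` project before relying on it):

```
gcloud notebooks instances create beam-notebook \
    --vm-image-project=deeplearning-platform-release \
    --vm-image-family=apache-beam-notebooks \
    --machine-type=n1-standard-4 \
    --location=us-central1-a \
    --network=$VPC_FQN \
    --subnet=projects/$DEST_PROJECT/regions/$REGION/subnetworks/$SUBNET
```
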
184 |
185 | ## 9. Next Step
186 |
187 | [Data Generation](02-Dataflow_Pub_Sub_Notebook.md)
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/02-Dataflow_Pub_Sub_Notebook.md:
--------------------------------------------------------------------------------
1 | # Real Time Visibility: Anomaly Detection
2 |
3 | ### Overview
4 |
5 | Anomaly Detection is a demo that shows the end-to-end architecture of a streaming pipeline, from raw data ingestion to transformation with Dataflow: it covers Dataflow notebooks, setting up an Apache Beam pipeline, windowing the data, and finally landing the data in BigQuery for further analysis.
6 |
7 | ### Architecture
8 | ![Architecture](Images/Lab_Arch.png)
9 |
10 | ### Getting Started
11 |
12 | Within the GCP Console, type `dataflow` into the search bar at the top.
13 |
14 | ![Search for Dataflow](Images/search_for_dataflow.png)
15 |
16 |
17 | ![Navigate to Workbench](Images/navigate_to_workbench.png)
18 |
19 |
20 | You will see an existing notebook called **demo-notebook**. This is a default notebook that has some examples in it. We will leave the default notebook alone and create a new notebook.
21 |
22 | Click on the **User-Managed Notebooks** tab and click **New Notebook**.
23 |
24 | ![Create notebook](Images/create_notebook.png)
25 |
26 | Select Apache Beam > Without GPUs
27 |
28 | Leave the default settings as is and click **CREATE**
29 |
30 | ![Default notebook settings](Images/default_notebook_settings.png)
31 |
32 | You can click *Refresh* to see the notebook being provisioned.
33 |
34 | Vertex AI Workbench will create a new Apache Beam notebook instance. Once it's available, click on **OPEN JUPYTERLAB**.
35 |
36 | Once the notebook is launched, you will see some default files and folders that come pre-installed with a new notebook.
37 |
38 | Next, we will clone a repo in order to get the files we need. Click on the **clone repo** icon:
39 |
40 | ![Git clone icon](Images/git_clone_icon.png)
41 |
42 | Enter the below HTTPS address to clone the repo. This is a public repo containing the files we will use.
43 |
44 | ```shell
45 | https://github.com/seidou-1/GoogleCloud.git
46 | ```
47 |
48 | You can leave `Include submodules` **checked** and `Download the repository` **unchecked**.
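
Equivalently, if you prefer the JupyterLab terminal, the same clone with submodules is just standard git:

```shell
git clone --recurse-submodules https://github.com/seidou-1/GoogleCloud.git
```
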
49 |
50 |
51 | Once it's cloned, you'll see a folder called **GoogleCloud**
52 |
53 |
54 | ![Cloned repo displayed](Images/clonedRepoDisplayed.png)
55 |
56 | Click into that folder and continue until you reach the subdirectory `anomalydetection-interactivenotebook-main`.
57 |
58 | Double click on the file `Dataflow_Pub_Sub_Notebook.ipynb` and follow the instructions in the notebook.
59 |
60 |
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/Dataflow_Pub_Sub_Notebook.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# How to run the examples on Dataflow\n",
8 | "\n",
9 | "This notebook illustrates a pipeline to stream the raw data from pub/sub to bigquery using dataflow runner and interactive runner .\n",
10 | "\n",
11 | "This pipeline processes the raw data from pub/sub and loads into Bigquery and in parallel it also windows the raw data (using fixed windowing) for every 3 seconds and calculates the mean of sensor values on the windowed data\n",
12 | "\n",
13 | "\n",
14 | "Note that running this example incurs a small [charge](https://cloud.google.com/dataflow/pricing) from Dataflow.\n",
15 | "\n",
16 | "Let's make sure the dependencies are installed. This allows to load the bq query results to a dataframe to plot the anomalies.\n",
17 | "\n"
18 | ]
19 | },
20 | {
21 | "cell_type": "code",
22 | "execution_count": null,
23 | "metadata": {},
24 | "outputs": [],
25 | "source": [
26 | "pip install db-dtypes"
27 | ]
28 | },
29 | {
30 | "cell_type": "markdown",
31 | "metadata": {},
32 | "source": [
33 | "After you do `pip install db-dtypes` restart the kernel by clicking on the reload icon up top near the navigation menu. Once restarted, proceed with the rest of the steps below."
34 | ]
35 | },
36 | {
37 | "cell_type": "markdown",
38 | "metadata": {},
39 | "source": [
40 | " Lets make sure the Dataflow API is enabled. This [allows](https://cloud.google.com/apis/docs/getting-started#enabling_apis) your project to access the Dataflow service:"
41 | ]
42 | },
43 | {
44 | "cell_type": "code",
45 | "execution_count": null,
46 | "metadata": {},
47 | "outputs": [],
48 | "source": [
49 | "!gcloud services enable dataflow.googleapis.com\n",
50 | "!gcloud services enable dataflow"
51 | ]
52 | },
53 | {
54 | "cell_type": "markdown",
55 | "metadata": {},
56 | "source": [
57 | "### 1. Start with necessary imports\n"
58 | ]
59 | },
60 | {
61 | "cell_type": "code",
62 | "execution_count": null,
63 | "metadata": {},
64 | "outputs": [],
65 | "source": [
66 | "import re\n",
67 | "import json\n",
68 | "from datetime import datetime\n",
69 | "import apache_beam as beam\n",
70 | "import random\n",
71 | "import time\n",
72 | "from google.cloud import pubsub_v1,bigquery\n",
73 | "from apache_beam.options import pipeline_options\n",
74 | "from apache_beam.options.pipeline_options import GoogleCloudOptions\n",
75 | "from apache_beam.runners import DataflowRunner\n",
76 | "from apache_beam.runners.interactive import interactive_runner\n",
77 | "from apache_beam import DoFn, GroupByKey, io, ParDo, Pipeline, PTransform, WindowInto, WithKeys,Create,Map , CombineGlobally ,dataframe\n",
78 | "import apache_beam.runners.interactive.interactive_beam as ib\n",
79 | "import google.auth\n",
80 | "import matplotlib.pyplot as plt\n",
81 | "\n",
82 | "publisher = pubsub_v1.PublisherClient() #Pubsub publisher client\n",
83 | "subscriber = pubsub_v1.SubscriberClient() #Pubsub subscriber client\n",
84 | "client = bigquery.Client() #bigquery client"
85 | ]
86 | },
87 | {
88 | "cell_type": "markdown",
89 | "metadata": {},
90 | "source": [
91 | "### 2. Set the variables . These variables will be referenced in later sections\n"
92 | ]
93 | },
94 | {
95 | "cell_type": "code",
96 | "execution_count": null,
97 | "metadata": {},
98 | "outputs": [],
99 | "source": [
100 | "dest_project=!gcloud config get-value project\n",
101 | "project_id=dest_project[1]\n",
102 | "print(project_id)\n",
103 | "\n",
104 | "pubsub_topic = project_id + \"-\" + \"topic\" \n",
105 | "pubsub_subscription = pubsub_topic + \"-\" + \"sub\"\n",
106 | "pubsub_topic_path = publisher.topic_path(project_id, pubsub_topic)\n",
107 | "pubsub_subscription_path = subscriber.subscription_path(project_id, pubsub_subscription)\n",
108 | "\n",
109 | "bq_dataset = \"anomaly_detection_demo\"\n",
110 | "bigquery_agg_schema = \"sensorID:STRING,sensorValue:FLOAT,windowStart:DATETIME,windowEnd:DATETIME\"\n",
111 | "bigquery_raw_schema = \"sensorID:STRING,timeStamp:DATETIME,sensorValue:FLOAT\"\n",
112 | "bigquery_raw_table = bq_dataset + \".anomaly_raw_table\" \n",
113 | "bigquery_agg_table = bq_dataset + \".anomaly_windowed_table\" \n",
114 | "region = \"us-central1\"\n",
115 | "bucket_name = project_id "
116 | ]
117 | },
118 | {
119 | "cell_type": "markdown",
120 | "metadata": {},
121 | "source": [
122 | "### 3: Create Pub/sub topic"
123 | ]
124 | },
125 | {
126 | "cell_type": "code",
127 | "execution_count": null,
128 | "metadata": {},
129 | "outputs": [],
130 | "source": [
131 | "!gcloud pubsub topics create {pubsub_topic}"
132 | ]
133 | },
134 | {
135 | "cell_type": "markdown",
136 | "metadata": {},
137 | "source": [
138 | "If you get an error that says `Run client channel backup poller: UNKNOWN:pollset_` don't be alarmed it won't effect the job. It is just a formatting issue."
139 | ]
140 | },
141 | {
142 | "cell_type": "markdown",
143 | "metadata": {},
144 | "source": [
145 | "### 4: Create Pub/sub subscription"
146 | ]
147 | },
148 | {
149 | "cell_type": "code",
150 | "execution_count": null,
151 | "metadata": {},
152 | "outputs": [],
153 | "source": [
154 | "!gcloud pubsub subscriptions create {pubsub_subscription} --topic={pubsub_topic}"
155 | ]
156 | },
157 | {
158 | "cell_type": "markdown",
159 | "metadata": {},
160 | "source": [
161 | "### 5. Create BigQuery Dataset\n"
162 | ]
163 | },
164 | {
165 | "cell_type": "code",
166 | "execution_count": null,
167 | "metadata": {},
168 | "outputs": [],
169 | "source": [
170 | "!bq --location={region} mk --dataset {project_id}:{bq_dataset}"
171 | ]
172 | },
173 | {
174 | "cell_type": "markdown",
175 | "metadata": {},
176 | "source": [
177 | "### 6. Create BigQuery Tables\n",
178 | "\n",
179 | "raw big query schema\n",
180 | "\n",
181 | "\n",
182 | "\n",
183 | "aggregated big query schema\n",
184 | ""
185 | ]
186 | },
187 | {
188 | "cell_type": "code",
189 | "execution_count": null,
190 | "metadata": {},
191 | "outputs": [],
192 | "source": [
193 | "!bq mk --schema {bigquery_raw_schema} -t {bigquery_raw_table}\n",
194 | "!bq mk --schema {bigquery_agg_schema} -t {bigquery_agg_table}"
195 | ]
196 | },
197 | {
198 | "cell_type": "markdown",
199 | "metadata": {},
200 | "source": [
201 | "If you get an error that says `Run client channel backup poller: UNKNOWN:pollset_` don't be alarmed it won't effect the job. It is just a formatting issue."
202 | ]
203 | },
204 | {
205 | "cell_type": "markdown",
206 | "metadata": {},
207 | "source": [
208 | "### 7. Create GCS Bucket "
209 | ]
210 | },
211 | {
212 | "cell_type": "code",
213 | "execution_count": null,
214 | "metadata": {},
215 | "outputs": [],
216 | "source": [
217 | "!gsutil mb -c standard -l {region} gs://{bucket_name}"
218 | ]
219 | },
220 | {
221 | "cell_type": "markdown",
222 | "metadata": {},
223 | "source": [
224 | "### 8. IMPORTANT! open GCS bucket from console and create a folder called dataflow.\n",
225 | "path should be gs://project_id/dataflow\n"
226 | ]
227 | },
228 | {
229 | "cell_type": "markdown",
230 | "metadata": {},
231 | "source": [
232 | "### 9. set the pipeline options"
233 | ]
234 | },
235 | {
236 | "cell_type": "code",
237 | "execution_count": null,
238 | "metadata": {},
239 | "outputs": [],
240 | "source": [
241 | "# Setting up the Apache Beam pipeline options.\n",
242 | "options = pipeline_options.PipelineOptions(flags={})\n",
243 | "\n",
244 | "# Sets the pipeline mode to streaming, so we can stream the data from PubSub.\n",
245 | "options.view_as(pipeline_options.StandardOptions).streaming = True\n",
246 | "\n",
247 | "# Sets the project to the default project in your current Google Cloud environment.\n",
248 | "options.view_as(GoogleCloudOptions).project = project_id\n",
249 | "\n",
250 | "# Sets the Google Cloud Region in which Cloud Dataflow runs.\n",
251 | "options.view_as(GoogleCloudOptions).region = region"
252 | ]
253 | },
254 | {
255 | "cell_type": "markdown",
256 | "metadata": {},
257 | "source": [
258 | "### 10. create the function to format the raw data and processed data"
259 | ]
260 | },
261 | {
262 | "cell_type": "code",
263 | "execution_count": null,
264 | "metadata": {},
265 | "outputs": [],
266 | "source": [
267 | "# to add window begin datetime and endtime to the aggregated PCollections.\n",
268 | "class FormatDoFn(beam.DoFn):\n",
269 | " def process(self, element, window=beam.DoFn.WindowParam):\n",
270 | " from datetime import datetime\n",
271 | " window_start = datetime.fromtimestamp(window.start)\n",
272 | " window_end = datetime.fromtimestamp(window.end)\n",
273 | " return [{\n",
274 | " 'sensorID': element[0],\n",
275 | " 'sensorValue': element[1],\n",
276 | " 'windowStart': window_start,\n",
277 | " 'windowEnd': window_end\n",
278 | " }] "
279 | ]
280 | },
281 | {
282 | "cell_type": "code",
283 | "execution_count": null,
284 | "metadata": {},
285 | "outputs": [],
286 | "source": [
287 | "# to get the raw PCollections\n",
288 | "class ProcessDoFn(beam.DoFn):\n",
289 | " def process(self, element):\n",
290 | " yield element "
291 | ]
292 | },
293 | {
294 | "cell_type": "markdown",
295 | "metadata": {},
296 | "source": [
297 | "### 11. Construct the pipeline \n",
298 | "This step will take the pipeline from pub/sub topic and do some processing. It will process the raw data into raw PCollections and process the aggregated windowed data into aggregated pcollections.\n",
299 | "\n",
300 | "With the aggregated window, the pipeline will read the data from the pub/topic and group the data into 5 sec intervals. Lastly it will calculate the mean of sensor value for each window."
301 | ]
302 | },
303 | {
304 | "cell_type": "markdown",
305 | "metadata": {},
306 | "source": [
307 | ""
308 | ]
309 | },
310 | {
311 | "cell_type": "code",
312 | "execution_count": null,
313 | "metadata": {},
314 | "outputs": [],
315 | "source": [
316 | "# Set pipeline options \n",
317 | "p = beam.Pipeline(interactive_runner.InteractiveRunner(), options=options)\n",
318 | "\n",
319 | "# pub/sub => mapped(pcollections)\n",
320 | "mapped = (p | \"ReadFromPubSub\" >> beam.io.gcp.pubsub.ReadFromPubSub(subscription=pubsub_subscription_path)\n",
321 | " | \"Json Loads\" >> Map(json.loads))\n",
322 | "\n",
323 | "# mapped(input pcollections => raw_data(output pcollections)\n",
324 | "raw_data = (mapped \n",
325 | " | 'Format' >> beam.ParDo(ProcessDoFn()))\n",
326 | "\n",
327 | "# mapped(input pcollections) => agg_date(output pcollections) \n",
328 | "agg_data = (mapped \n",
329 | " | \"Map Keys\" >> Map(lambda x: (x[\"SensorID\"],x[\"SensorValue\"]))\n",
330 | " | \"ApplyFixedWindow\" >> beam.WindowInto(beam.window.FixedWindows(5))\n",
331 | " | \"Total Per Key\" >> beam.combiners.Mean.PerKey()\n",
332 | " | 'Final Format' >> beam.ParDo(FormatDoFn())) "
333 | ]
334 | },
335 | {
336 | "cell_type": "markdown",
337 | "metadata": {},
338 | "source": [
339 | "Note that the `Pipeline` is constructed by an `InteractiveRunner`, so you can use operations such as `ib.collect` or `ib.show`.\n",
340 | "### Important \n",
341 | "Run steps 1-4 in simulator script(PythonSimulator.ipynb) in a separate tab -- (this is to simulate the data and writes to pub/sub topic to test interactiverunner(1 message per millisecond until it reaches 100 messages)\n",
342 | "\n",
343 | "Remember to **only** run steps 1-4 for now. We will come back to this script to run step 5 later.\n",
344 | " "
345 | ]
346 | },
347 | {
348 | "cell_type": "code",
349 | "execution_count": null,
350 | "metadata": {},
351 | "outputs": [],
352 | "source": [
353 | "ib.show(agg_data)"
354 | ]
355 | },
356 | {
357 | "cell_type": "markdown",
358 | "metadata": {},
359 | "source": [
360 | "### 12.Dataflow Additions\n",
361 | "\n",
362 | "Now, for something a bit different. Because Dataflow executes in the cloud, you need to output to a cloud sink. In this case, you are loading the transformed data into Cloud Storage.\n",
363 | "\n",
364 | "First, set up the `PipelineOptions` to specify to the Dataflow service the Google Cloud project, the region to run the Dataflow Job, and the SDK location."
365 | ]
366 | },
367 | {
368 | "cell_type": "code",
369 | "execution_count": null,
370 | "metadata": {},
371 | "outputs": [],
372 | "source": [
373 | "# IMPORTANT! Adjust the following to choose a Cloud Storage location.\n",
374 | "dataflow_gcs_location = \"gs:///dataflow\"\n",
375 | "\n",
376 | "# Dataflow Staging Location. This location is used to stage the Dataflow Pipeline and SDK binary.\n",
377 | "# options.view_as(GoogleCloudOptions).staging_location = dataflow_gcs_location\n",
378 | "\n",
379 | "# Sets the project to the default project in your current Google Cloud environment.\n",
380 | "_, options.view_as(GoogleCloudOptions).project = google.auth.default()\n",
381 | "\n",
382 | "# Dataflow Temp Location. This location is used to store temporary files or intermediate results before finally outputting to the sink.\n",
383 | "options.view_as(GoogleCloudOptions).temp_location = '%s/temp' % dataflow_gcs_location\n",
384 | "\n",
385 | "# Dataflow job name. when pipeline runs as dataflowrunner.\n",
386 | "options.view_as(GoogleCloudOptions).job_name = project_id\n"
387 | ]
388 | },
389 | {
390 | "cell_type": "code",
391 | "execution_count": null,
392 | "metadata": {},
393 | "outputs": [],
394 | "source": [
395 | "# Specifying the bigquery table to write `add_data` to,\n",
396 | "# based on the `bigquery_raw_table` variable set earlier.\n",
397 | "(raw_data | 'Write raw data to Bigquery' \n",
398 | " >> beam.io.WriteToBigQuery(\n",
399 | " bigquery_raw_table,\n",
400 | " write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND))\n",
401 | "# Specifying the bigquery table to write `add_data` to,\n",
402 | "# based on the `bigquery_agg_table` variable set earlier.\n",
403 | "(agg_data | 'Write windowed aggregated data to Bigquery' \n",
404 | " >> beam.io.WriteToBigQuery(\n",
405 | " bigquery_agg_table,\n",
406 | " write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND))"
407 | ]
408 | },
409 | {
410 | "cell_type": "code",
411 | "execution_count": null,
412 | "metadata": {},
413 | "outputs": [],
414 | "source": [
415 | "# IMPORTANT! Ensure that the graph is correct before sending it out to Dataflow.\n",
416 | "# Because this is a notebook environment, unintended additions to the graph may have occurred when rerunning cells. \n",
417 | "ib.show_graph(p)"
418 | ]
419 | },
420 | {
421 | "cell_type": "markdown",
422 | "metadata": {},
423 | "source": [
424 | "### 13.Running the pipeline\n",
425 | "\n",
426 | "Now you are ready to run the pipeline on Dataflow. `run_pipeline()` runs the pipeline and return a pipeline result object."
427 | ]
428 | },
429 | {
430 | "cell_type": "code",
431 | "execution_count": null,
432 | "metadata": {},
433 | "outputs": [],
434 | "source": [
435 | "pipeline_result = DataflowRunner().run_pipeline(p, options=options)"
436 | ]
437 | },
438 | {
439 | "cell_type": "markdown",
440 | "metadata": {},
441 | "source": [
442 | "### Important \n",
443 | "\n",
444 | "\n",
445 | "Before moving forward, check the dataflow job to see if it's running (Hamburger menu->Dataflow->Jobs). If the status shows as `failed`, **rerun** the above cell `pipeline_result = DataflowRunner().run_pipeline(p, options=options)` one more time. This happens when the Dataflow API is not fully enabled. It takes a minute or so for the API to permeate fully.\n",
446 | "\n",
447 | "\n"
448 | ]
449 | },
450 | {
451 | "cell_type": "markdown",
452 | "metadata": {},
453 | "source": [
454 | "Using the `pipeline_result` handle, the following code builds a link to the Google Cloud Console web page that shows you details of the Dataflow job you just started:"
455 | ]
456 | },
457 | {
458 | "cell_type": "code",
459 | "execution_count": null,
460 | "metadata": {},
461 | "outputs": [],
462 | "source": [
463 | "from IPython.core.display import display, HTML\n",
464 | "url = ('https://console.cloud.google.com/dataflow/jobs/%s/%s?project=%s' % \n",
465 | " (pipeline_result._job.location, pipeline_result._job.id, pipeline_result._job.projectId))\n",
466 | "display(HTML('Click here for the details of your Dataflow job!' % url))\n"
467 | ]
468 | },
469 | {
470 | "cell_type": "markdown",
471 | "metadata": {},
472 | "source": [
473 | "dtaflow job\n",
474 | ""
475 | ]
476 | },
477 | {
478 | "cell_type": "markdown",
479 | "metadata": {},
480 | "source": [
481 | "### Important \n",
482 | "Run step5 in simulator script(PythonSimulator.ipynb) that is in a separate tab -- (this is to simulate the data and writes to pub/sub topic to test dataflow runner(1 message per millisecond until it reaches 5000 messages). "
483 | ]
484 | },
485 | {
486 | "cell_type": "markdown",
487 | "metadata": {},
488 | "source": [
489 | "### 14.Checking the raw table results (note: it will take ~90sec to appear the initial data in table due to dataflow warmup time)\n",
490 | "raw table results\n",
491 | ""
492 | ]
493 | },
494 | {
495 | "cell_type": "code",
496 | "execution_count": null,
497 | "metadata": {},
498 | "outputs": [],
499 | "source": [
500 | "#check the raw data in BQ raw Table\n",
501 | "sql = 'SELECT * FROM `{}` '.format(bigquery_raw_table)\n",
502 | "query_job = client.query(sql) # API request\n",
503 | "raw_df = query_job.to_dataframe()\n",
504 | "raw_df"
505 | ]
506 | },
507 | {
508 | "cell_type": "markdown",
509 | "metadata": {},
510 | "source": [
511 | "### 15.Checking the agg table results\n",
512 | "agg table results\n",
513 | "\n",
514 | ""
515 | ]
516 | },
517 | {
518 | "cell_type": "code",
519 | "execution_count": null,
520 | "metadata": {},
521 | "outputs": [],
522 | "source": [
523 | "#check the agg data in BQ raw Table\n",
524 | "sql = 'SELECT sensorID , case when sensorValue >= 200 then \"Anomaly\" else \"Normal\" end as type, sensorValue,row_number() over (order by windowStart) as cycle FROM `{}` '.format(bigquery_agg_table)\n",
525 | "query_job = client.query(sql) # API request\n",
526 | "agg_df = query_job.to_dataframe()\n",
527 | "agg_df"
528 | ]
529 | },
530 | {
531 | "cell_type": "markdown",
532 | "metadata": {},
533 | "source": [
534 | "### 16.Plot the results in a simple scatterplot chart \n",
535 | "\n",
536 | "Chart will display Anomalies in red color and Normal in Green color\n",
537 | ""
538 | ]
539 | },
540 | {
541 | "cell_type": "code",
542 | "execution_count": null,
543 | "metadata": {},
544 | "outputs": [],
545 | "source": [
546 | "c=['green' if g=='Normal' else 'red' for g in agg_df['type']]\n",
547 | "agg_df.plot(\n",
548 | " kind=\"scatter\",\n",
549 | " x=\"cycle\",\n",
550 | " y=\"sensorValue\" , c = c, s = 150,\n",
551 | " figsize=(20, 10) \n",
552 | " )\n",
553 | "plt.axhline(y=200, color='black', linestyle='-',linewidth=3)\n"
554 | ]
555 | },
556 | {
557 | "cell_type": "markdown",
558 | "metadata": {},
559 | "source": [
560 | "# Congratulations!!!\n",
561 | "End of lab\n"
562 | ]
563 | }
564 | ],
565 | "metadata": {
566 | "kernelspec": {
567 | "display_name": "01. Apache Beam 2.45.0 for Python 3",
568 | "language": "python",
569 | "name": "01-apache-beam-2.45.0"
570 | },
571 | "language_info": {
572 | "codemirror_mode": {
573 | "name": "ipython",
574 | "version": 3
575 | },
576 | "file_extension": ".py",
577 | "mimetype": "text/x-python",
578 | "name": "python",
579 | "nbconvert_exporter": "python",
580 | "pygments_lexer": "ipython3",
581 | "version": "3.8.10"
582 | }
583 | },
584 | "nbformat": 4,
585 | "nbformat_minor": 4
586 | }
587 |
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/DataflowJob.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/DataflowJob.png
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/Lab_Arch.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/Lab_Arch.png
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/OrgPolicy.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/OrgPolicy.png
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/agg-data-results.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/agg-data-results.png
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/agg-schema.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/agg-schema.png
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/clonedRepoDisplayed.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/clonedRepoDisplayed.png
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/create_notebook.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/create_notebook.png
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/dataflowFailed.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/dataflowFailed.png
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/default_notebook_settings.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/default_notebook_settings.png
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/fixed-window.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/fixed-window.png
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/git_clone_icon.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/git_clone_icon.png
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/navigate_to_workbench.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/navigate_to_workbench.png
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/plot.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/plot.png
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/raw-data-results.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/raw-data-results.png
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/raw-schema.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/raw-schema.png
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/search_for_dataflow.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/AnomalyDetection/anomalydetection-interactivenotebook-main/Images/search_for_dataflow.png
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/PythonSimulator.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "e974ebb8",
6 | "metadata": {},
7 | "source": [
8 | "Copyright 2023 Google LLC\n",
9 | "\n",
10 | "Licensed under the Apache License, Version 2.0 (the \"License\");\n",
11 | "you may not use this file except in compliance with the License.\n",
12 | "You may obtain a copy of the License at\n",
13 | "\n",
14 | " https://www.apache.org/licenses/LICENSE-2.0\n",
15 | "\n",
16 | "Unless required by applicable law or agreed to in writing, software\n",
17 | "distributed under the License is distributed on an \"AS IS\" BASIS,\n",
18 | "WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
19 | "See the License for the specific language governing permissions and\n",
20 | "limitations under the License."
21 | ]
22 | },
23 | {
24 | "cell_type": "markdown",
25 | "id": "2d419814-3f29-4a93-8fc9-a26eac2e7439",
26 | "metadata": {},
27 | "source": [
28 | "### 1. Start with necessary imports"
29 | ]
30 | },
31 | {
32 | "cell_type": "code",
33 | "execution_count": null,
34 | "id": "3bcb5a61-b6f2-4c0b-8515-fc38666adbe1",
35 | "metadata": {},
36 | "outputs": [],
37 | "source": [
38 | "from google.cloud import pubsub_v1\n",
39 | "import json\n",
40 | "from datetime import datetime\n",
41 | "import random\n",
42 | "import time\n",
43 | "publisher = pubsub_v1.PublisherClient()"
44 | ]
45 | },
46 | {
47 | "cell_type": "markdown",
48 | "id": "29290d31-9ffb-4cbe-90e0-e508f2a49eae",
49 | "metadata": {},
50 | "source": [
51 | "### 2. Set the variables . These variables will be referenced in later sections"
52 | ]
53 | },
54 | {
55 | "cell_type": "code",
56 | "execution_count": null,
57 | "id": "3f189dbc-3ba8-41c9-a37c-1fc37a6256cd",
58 | "metadata": {},
59 | "outputs": [],
60 | "source": [
61 | "dest_project=!gcloud config get-value project\n",
62 | "project_id=dest_project[1]\n",
63 | "print(project_id)\n",
64 | "\n",
65 | "pubsub_topic = project_id + \"-\" + \"topic\" \n",
66 | "pubsub_topic_path = publisher.topic_path(project_id, pubsub_topic)\n"
67 | ]
68 | },
69 | {
70 | "cell_type": "markdown",
71 | "id": "cceca00c-2ca3-41f2-a80a-114010158947",
72 | "metadata": {},
73 | "source": [
74 | "### 3. Create the function to simulate the data"
75 | ]
76 | },
77 | {
78 | "cell_type": "code",
79 | "execution_count": null,
80 | "id": "dc71111e-778e-4df2-8e9c-96427df99fde",
81 | "metadata": {},
82 | "outputs": [],
83 | "source": [
84 | "def simulator(number): \n",
85 | " \n",
86 | " i = 0 \n",
87 | " while i < number:\n",
88 | " json_object = json.dumps({\"SensorID\":\"75c18751-7a94-453e-86f5-67be2b0c8fd4\",'Timestamp': datetime.now().strftime('%Y-%m-%d %H:%M:%S.%f')[:-3],\"SensorValue\":random.uniform(100, 300)})\n",
89 | " data = json_object.encode(\"utf-8\")\n",
90 | " future = publisher.publish(pubsub_topic_path, data)\n",
91 | " time.sleep(0.1)\n",
92 | " i= i + 1\n",
93 | "\n"
94 | ]
95 | },
96 | {
97 | "cell_type": "markdown",
98 | "id": "07ab8e92-e4a4-4481-84bd-32b5a8881579",
99 | "metadata": {},
100 | "source": [
101 | "### 4.Run the simulator to test interactive runner"
102 | ]
103 | },
104 | {
105 | "cell_type": "code",
106 | "execution_count": null,
107 | "id": "3b84d710-7d27-4512-ae1e-5d3a618009fb",
108 | "metadata": {},
109 | "outputs": [],
110 | "source": [
111 | "#interactive test data simulation\n",
112 | "if __name__ == \"__main__\":\n",
113 | " simulator(100)\n"
114 | ]
115 | },
116 | {
117 | "cell_type": "markdown",
118 | "id": "d7cfb1b2-e6ab-4da0-a584-599bbb0458a7",
119 | "metadata": {},
120 | "source": [
121 | "### 5.Run the simulator to test dataflow runner"
122 | ]
123 | },
124 | {
125 | "cell_type": "code",
126 | "execution_count": null,
127 | "id": "d4303433-773d-43f3-87cb-4a2c6fee2346",
128 | "metadata": {},
129 | "outputs": [],
130 | "source": [
131 | "#dataflow test data simulation\n",
132 | "if __name__ == \"__main__\":\n",
133 | " simulator(5000)"
134 | ]
135 | }
136 | ],
137 | "metadata": {
138 | "kernelspec": {
139 | "display_name": "01. Apache Beam 2.45.0 for Python 3",
140 | "language": "python",
141 | "name": "01-apache-beam-2.45.0"
142 | },
143 | "language_info": {
144 | "codemirror_mode": {
145 | "name": "ipython",
146 | "version": 3
147 | },
148 | "file_extension": ".py",
149 | "mimetype": "text/x-python",
150 | "name": "python",
151 | "nbconvert_exporter": "python",
152 | "pygments_lexer": "ipython3",
153 | "version": "3.8.10"
154 | }
155 | },
156 | "nbformat": 4,
157 | "nbformat_minor": 5
158 | }
159 |
--------------------------------------------------------------------------------
/AnomalyDetection/anomalydetection-interactivenotebook-main/README.md:
--------------------------------------------------------------------------------
1 | Copyright 2023 Google LLC
2 |
3 | Licensed under the Apache License, Version 2.0 (the "License");
4 | you may not use this file except in compliance with the License.
5 | You may obtain a copy of the License at
6 |
7 | https://www.apache.org/licenses/LICENSE-2.0
8 |
9 | Unless required by applicable law or agreed to in writing, software
10 | distributed under the License is distributed on an "AS IS" BASIS,
11 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | See the License for the specific language governing permissions and
13 | limitations under the License.
14 |
15 | # Real Time Visibility - Anomaly Detection
16 |
17 | Demo asset for the Anomaly Detection use case in the Real-Time Intelligence go-to-market sales play.
18 |
19 | ## About this Lab
20 |
21 | Anomaly Detection is a demo that shows the end-to-end architecture of a streaming pipeline, from raw data ingestion to transformation with Dataflow: it covers Dataflow notebooks, setting up an Apache Beam pipeline, windowing the data, and finally landing the data in BigQuery for further analysis. Below you will find an architecture diagram of the overall end-to-end solution.
22 |
23 | ## Architecture
24 |
25 | ![Architecture](Images/Lab_Arch.png)
26 |
27 | ## Lab Modules
28 |
29 | This repo is organized across various modules:
30 |
31 | [1. Prerequisites - provisioning, configuring, securing](01-Prerequisites.md)
32 |
33 | [2. Data Integration Pipeline](02-Dataflow_Pub_Sub_Notebook.md)
34 |
--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
1 | # Code of Conduct
2 |
3 | ## Our Pledge
4 |
5 | In the interest of fostering an open and welcoming environment, we as
6 | contributors and maintainers pledge to making participation in our project and
7 | our community a harassment-free experience for everyone, regardless of age, body
8 | size, disability, ethnicity, gender identity and expression, level of
9 | experience, education, socio-economic status, nationality, personal appearance,
10 | race, religion, or sexual identity and orientation.
11 |
12 | ## Our Standards
13 |
14 | Examples of behavior that contributes to creating a positive environment
15 | include:
16 |
17 | * Using welcoming and inclusive language
18 | * Being respectful of differing viewpoints and experiences
19 | * Gracefully accepting constructive criticism
20 | * Focusing on what is best for the community
21 | * Showing empathy towards other community members
22 |
23 | Examples of unacceptable behavior by participants include:
24 |
25 | * The use of sexualized language or imagery and unwelcome sexual attention or
26 | advances
27 | * Trolling, insulting/derogatory comments, and personal or political attacks
28 | * Public or private harassment
29 | * Publishing others' private information, such as a physical or electronic
30 | address, without explicit permission
31 | * Other conduct which could reasonably be considered inappropriate in a
32 | professional setting
33 |
34 | ## Our Responsibilities
35 |
36 | Project maintainers are responsible for clarifying the standards of acceptable
37 | behavior and are expected to take appropriate and fair corrective action in
38 | response to any instances of unacceptable behavior.
39 |
40 | Project maintainers have the right and responsibility to remove, edit, or reject
41 | comments, commits, code, wiki edits, issues, and other contributions that are
42 | not aligned to this Code of Conduct, or to ban temporarily or permanently any
43 | contributor for other behaviors that they deem inappropriate, threatening,
44 | offensive, or harmful.
45 |
46 | ## Scope
47 |
48 | This Code of Conduct applies both within project spaces and in public spaces
49 | when an individual is representing the project or its community. Examples of
50 | representing a project or community include using an official project e-mail
51 | address, posting via an official social media account, or acting as an appointed
52 | representative at an online or offline event. Representation of a project may be
53 | further defined and clarified by project maintainers.
54 |
55 | This Code of Conduct also applies outside the project spaces when the Project
56 | Steward has a reasonable belief that an individual's behavior may have a
57 | negative impact on the project or its community.
58 |
59 | ## Conflict Resolution
60 |
61 | We do not believe that all conflict is bad; healthy debate and disagreement
62 | often yield positive results. However, it is never okay to be disrespectful or
63 | to engage in behavior that violates the project’s code of conduct.
64 |
65 | If you see someone violating the code of conduct, you are encouraged to address
66 | the behavior directly with those involved. Many issues can be resolved quickly
67 | and easily, and this gives people more control over the outcome of their
68 | dispute. If you are unable to resolve the matter for any reason, or if the
69 | behavior is threatening or harassing, report it. We are dedicated to providing
70 | an environment where participants feel welcome and safe.
71 |
72 | Reports should be directed to *[PROJECT STEWARD NAME(s) AND EMAIL(s)]*, the
73 | Project Steward(s) for *[PROJECT NAME]*. It is the Project Steward’s duty to
74 | receive and address reported violations of the code of conduct. They will then
75 | work with a committee consisting of representatives from the Open Source
76 | Programs Office and the Google Open Source Strategy team. If for any reason you
77 | are uncomfortable reaching out to the Project Steward, please email
78 | opensource@google.com.
79 |
80 | We will investigate every complaint, but you may not receive a direct response.
81 | We will use our discretion in determining when and how to follow up on reported
82 | incidents, which may range from not taking action to permanent expulsion from
83 | the project and project-sponsored spaces. We will notify the accused of the
84 | report and provide them an opportunity to discuss it before any action is taken.
85 | The identity of the reporter will be omitted from the details of the report
86 | supplied to the accused. In potentially harmful situations, such as ongoing
87 | harassment or threats to anyone's safety, we may take action without notice.
88 |
89 | ## Attribution
90 |
91 | This Code of Conduct is adapted from the Contributor Covenant, version 1.4,
92 | available at
93 | https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
94 |
95 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | # How to contribute
2 |
3 | We'd love to accept your patches and contributions to this project.
4 |
5 | ## Before you begin
6 |
7 | ### Sign our Contributor License Agreement
8 |
9 | Contributions to this project must be accompanied by a
10 | [Contributor License Agreement](https://cla.developers.google.com/about) (CLA).
11 | You (or your employer) retain the copyright to your contribution; this simply
12 | gives us permission to use and redistribute your contributions as part of the
13 | project.
14 |
15 | If you or your current employer have already signed the Google CLA (even if it
16 | was for a different project), you probably don't need to do it again.
17 |
18 | Visit <https://cla.developers.google.com/> to see your current agreements or to
19 | sign a new one.
20 |
21 | ### Review our community guidelines
22 |
23 | This project follows
24 | [Google's Open Source Community Guidelines](https://opensource.google/conduct/).
25 |
26 | ## Contribution process
27 |
28 | ### Code reviews
29 |
30 | All submissions, including submissions by project members, require review. We
31 | use GitHub pull requests for this purpose. Consult
32 | [GitHub Help](https://help.github.com/articles/about-pull-requests/) for more
33 | information on using pull requests.
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 |
2 | Apache License
3 | Version 2.0, January 2004
4 | http://www.apache.org/licenses/
5 |
6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
7 |
8 | 1. Definitions.
9 |
10 | "License" shall mean the terms and conditions for use, reproduction,
11 | and distribution as defined by Sections 1 through 9 of this document.
12 |
13 | "Licensor" shall mean the copyright owner or entity authorized by
14 | the copyright owner that is granting the License.
15 |
16 | "Legal Entity" shall mean the union of the acting entity and all
17 | other entities that control, are controlled by, or are under common
18 | control with that entity. For the purposes of this definition,
19 | "control" means (i) the power, direct or indirect, to cause the
20 | direction or management of such entity, whether by contract or
21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the
22 | outstanding shares, or (iii) beneficial ownership of such entity.
23 |
24 | "You" (or "Your") shall mean an individual or Legal Entity
25 | exercising permissions granted by this License.
26 |
27 | "Source" form shall mean the preferred form for making modifications,
28 | including but not limited to software source code, documentation
29 | source, and configuration files.
30 |
31 | "Object" form shall mean any form resulting from mechanical
32 | transformation or translation of a Source form, including but
33 | not limited to compiled object code, generated documentation,
34 | and conversions to other media types.
35 |
36 | "Work" shall mean the work of authorship, whether in Source or
37 | Object form, made available under the License, as indicated by a
38 | copyright notice that is included in or attached to the work
39 | (an example is provided in the Appendix below).
40 |
41 | "Derivative Works" shall mean any work, whether in Source or Object
42 | form, that is based on (or derived from) the Work and for which the
43 | editorial revisions, annotations, elaborations, or other modifications
44 | represent, as a whole, an original work of authorship. For the purposes
45 | of this License, Derivative Works shall not include works that remain
46 | separable from, or merely link (or bind by name) to the interfaces of,
47 | the Work and Derivative Works thereof.
48 |
49 | "Contribution" shall mean any work of authorship, including
50 | the original version of the Work and any modifications or additions
51 | to that Work or Derivative Works thereof, that is intentionally
52 | submitted to Licensor for inclusion in the Work by the copyright owner
53 | or by an individual or Legal Entity authorized to submit on behalf of
54 | the copyright owner. For the purposes of this definition, "submitted"
55 | means any form of electronic, verbal, or written communication sent
56 | to the Licensor or its representatives, including but not limited to
57 | communication on electronic mailing lists, source code control systems,
58 | and issue tracking systems that are managed by, or on behalf of, the
59 | Licensor for the purpose of discussing and improving the Work, but
60 | excluding communication that is conspicuously marked or otherwise
61 | designated in writing by the copyright owner as "Not a Contribution."
62 |
63 | "Contributor" shall mean Licensor and any individual or Legal Entity
64 | on behalf of whom a Contribution has been received by Licensor and
65 | subsequently incorporated within the Work.
66 |
67 | 2. Grant of Copyright License. Subject to the terms and conditions of
68 | this License, each Contributor hereby grants to You a perpetual,
69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
70 | copyright license to reproduce, prepare Derivative Works of,
71 | publicly display, publicly perform, sublicense, and distribute the
72 | Work and such Derivative Works in Source or Object form.
73 |
74 | 3. Grant of Patent License. Subject to the terms and conditions of
75 | this License, each Contributor hereby grants to You a perpetual,
76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
77 | (except as stated in this section) patent license to make, have made,
78 | use, offer to sell, sell, import, and otherwise transfer the Work,
79 | where such license applies only to those patent claims licensable
80 | by such Contributor that are necessarily infringed by their
81 | Contribution(s) alone or by combination of their Contribution(s)
82 | with the Work to which such Contribution(s) was submitted. If You
83 | institute patent litigation against any entity (including a
84 | cross-claim or counterclaim in a lawsuit) alleging that the Work
85 | or a Contribution incorporated within the Work constitutes direct
86 | or contributory patent infringement, then any patent licenses
87 | granted to You under this License for that Work shall terminate
88 | as of the date such litigation is filed.
89 |
90 | 4. Redistribution. You may reproduce and distribute copies of the
91 | Work or Derivative Works thereof in any medium, with or without
92 | modifications, and in Source or Object form, provided that You
93 | meet the following conditions:
94 |
95 | (a) You must give any other recipients of the Work or
96 | Derivative Works a copy of this License; and
97 |
98 | (b) You must cause any modified files to carry prominent notices
99 | stating that You changed the files; and
100 |
101 | (c) You must retain, in the Source form of any Derivative Works
102 | that You distribute, all copyright, patent, trademark, and
103 | attribution notices from the Source form of the Work,
104 | excluding those notices that do not pertain to any part of
105 | the Derivative Works; and
106 |
107 | (d) If the Work includes a "NOTICE" text file as part of its
108 | distribution, then any Derivative Works that You distribute must
109 | include a readable copy of the attribution notices contained
110 | within such NOTICE file, excluding those notices that do not
111 | pertain to any part of the Derivative Works, in at least one
112 | of the following places: within a NOTICE text file distributed
113 | as part of the Derivative Works; within the Source form or
114 | documentation, if provided along with the Derivative Works; or,
115 | within a display generated by the Derivative Works, if and
116 | wherever such third-party notices normally appear. The contents
117 | of the NOTICE file are for informational purposes only and
118 | do not modify the License. You may add Your own attribution
119 | notices within Derivative Works that You distribute, alongside
120 | or as an addendum to the NOTICE text from the Work, provided
121 | that such additional attribution notices cannot be construed
122 | as modifying the License.
123 |
124 | You may add Your own copyright statement to Your modifications and
125 | may provide additional or different license terms and conditions
126 | for use, reproduction, or distribution of Your modifications, or
127 | for any such Derivative Works as a whole, provided Your use,
128 | reproduction, and distribution of the Work otherwise complies with
129 | the conditions stated in this License.
130 |
131 | 5. Submission of Contributions. Unless You explicitly state otherwise,
132 | any Contribution intentionally submitted for inclusion in the Work
133 | by You to the Licensor shall be under the terms and conditions of
134 | this License, without any additional terms or conditions.
135 | Notwithstanding the above, nothing herein shall supersede or modify
136 | the terms of any separate license agreement you may have executed
137 | with Licensor regarding such Contributions.
138 |
139 | 6. Trademarks. This License does not grant permission to use the trade
140 | names, trademarks, service marks, or product names of the Licensor,
141 | except as required for reasonable and customary use in describing the
142 | origin of the Work and reproducing the content of the NOTICE file.
143 |
144 | 7. Disclaimer of Warranty. Unless required by applicable law or
145 | agreed to in writing, Licensor provides the Work (and each
146 | Contributor provides its Contributions) on an "AS IS" BASIS,
147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
148 | implied, including, without limitation, any warranties or conditions
149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
150 | PARTICULAR PURPOSE. You are solely responsible for determining the
151 | appropriateness of using or redistributing the Work and assume any
152 | risks associated with Your exercise of permissions under this License.
153 |
154 | 8. Limitation of Liability. In no event and under no legal theory,
155 | whether in tort (including negligence), contract, or otherwise,
156 | unless required by applicable law (such as deliberate and grossly
157 | negligent acts) or agreed to in writing, shall any Contributor be
158 | liable to You for damages, including any direct, indirect, special,
159 | incidental, or consequential damages of any character arising as a
160 | result of this License or out of the use or inability to use the
161 | Work (including but not limited to damages for loss of goodwill,
162 | work stoppage, computer failure or malfunction, or any and all
163 | other commercial damages or losses), even if such Contributor
164 | has been advised of the possibility of such damages.
165 |
166 | 9. Accepting Warranty or Additional Liability. While redistributing
167 | the Work or Derivative Works thereof, You may choose to offer,
168 | and charge a fee for, acceptance of support, warranty, indemnity,
169 | or other liability obligations and/or rights consistent with this
170 | License. However, in accepting such obligations, You may act only
171 | on Your own behalf and on Your sole responsibility, not on behalf
172 | of any other Contributor, and only if You agree to indemnify,
173 | defend, and hold each Contributor harmless for any liability
174 | incurred by, or claims asserted against, such Contributor by reason
175 | of your accepting any such warranty or additional liability.
176 |
177 | END OF TERMS AND CONDITIONS
178 |
179 | APPENDIX: How to apply the Apache License to your work.
180 |
181 | To apply the Apache License to your work, attach the following
182 | boilerplate notice, with the fields enclosed by brackets "[]"
183 | replaced with your own identifying information. (Don't include
184 | the brackets!) The text should be enclosed in the appropriate
185 | comment syntax for the file format. We also recommend that a
186 | file or class name and description of purpose be included on the
187 | same "printed page" as the copyright notice for easier
188 | identification within third-party archives.
189 |
190 | Copyright [yyyy] [name of copyright owner]
191 |
192 | Licensed under the Apache License, Version 2.0 (the "License");
193 | you may not use this file except in compliance with the License.
194 | You may obtain a copy of the License at
195 |
196 | http://www.apache.org/licenses/LICENSE-2.0
197 |
198 | Unless required by applicable law or agreed to in writing, software
199 | distributed under the License is distributed on an "AS IS" BASIS,
200 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
201 | See the License for the specific language governing permissions and
202 | limitations under the License.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | Copyright 2024 Google LLC
2 |
3 | Licensed under the Apache License, Version 2.0 (the "License");
4 | you may not use this file except in compliance with the License.
5 | You may obtain a copy of the License at
6 |
7 | https://www.apache.org/licenses/LICENSE-2.0
8 |
9 | Unless required by applicable law or agreed to in writing, software
10 | distributed under the License is distributed on an "AS IS" BASIS,
11 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | See the License for the specific language governing permissions and
13 | limitations under the License.
14 |
15 | # Real Time Intelligence Hands-on Labs
16 |
17 | ## About
18 | This repository features self-contained, hands-on labs with detailed step-by-step instructions and associated collateral (data, code, configuration) to enable Real Time Intelligence learning.
19 |
20 | ## Labs
21 |
22 | | # | Use Case | Lab summary | Contributed by |
23 | | -- | :--- | :--- |:--- |
24 | | 1. |[Real Time Prediction](RealTimePrediction/realtime-intelligence-main/README.md)|A real-time, streaming, machine learning (ML) prediction pipeline that uses Dataflow, Pub/Sub, Vertex AI, BigQuery and Cloud Storage | Sam Iyer
25 | | 2. |[Anomaly Detection Interactive Notebook](AnomalyDetection/anomalydetection-interactivenotebook-main/README.md)|Running an Apache Beam pipeline using Dataflow notebooks| Smitha Venkat, Purnima Maganti and Mohamed Barry
26 |
27 | ## Contributing
28 | See the contributing [instructions](CONTRIBUTING.md) to get started contributing.
29 |
30 | ## License
31 | All solutions within this repository are provided under the Apache 2.0 license. Please see the LICENSE file for more detailed terms and conditions.
32 |
33 | ## Disclaimer
34 | This repository and its contents are not an official Google Product.
35 |
36 | ## Contact
37 | Share your feedback and ideas by logging [issues](../../issues).
38 |
39 | ## Release History
40 |
41 | | # | Release Summary | Date | Contributor |
42 | | -- | :--- | :--- |:--- |
43 | | 1. |Initial release| 4/4/2023| Various|
44 | | 2. |Code fix: realtime/train_on_vertexai.py|6/10/2024| |
46 |
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/README.md:
--------------------------------------------------------------------------------
1 | Copyright 2024 Google LLC
2 |
3 | Licensed under the Apache License, Version 2.0 (the "License");
4 | you may not use this file except in compliance with the License.
5 | You may obtain a copy of the License at
6 |
7 | https://www.apache.org/licenses/LICENSE-2.0
8 |
9 | Unless required by applicable law or agreed to in writing, software
10 | distributed under the License is distributed on an "AS IS" BASIS,
11 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | See the License for the specific language governing permissions and
13 | limitations under the License.
14 |
15 | ## Realtime Prediction
16 | This lab helps you to implement a real-time, streaming, machine learning (ML) prediction pipeline that uses Dataflow, Pub/Sub, Vertex AI, BigQuery and Cloud Storage.
17 |
18 | ## Solution Overview
19 | This lab predicts whether a flight will arrive on time, using historical data from the US Bureau of Transportation Statistics (BTS) website. (https://www.bts.gov/topics/airline-time-tables)
20 | This website provides historical on-time performance information for domestic flights in the United States. All major US air carriers are required to file statistics about each of their domestic flights with the BTS, including the scheduled and actual departure and arrival times. From the scheduled and actual arrival times, the arrival delay associated with each flight can be calculated. This dataset therefore gives us the ground truth needed to build a model that predicts arrival delay.
21 |
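For intuition, the label this lab eventually trains on can be computed directly from those times. A minimal sketch (the 15-minute threshold matches the label built later in realtime/flightstxf/flights_transforms.py; the function and field names here are illustrative):

```python
from datetime import datetime

def ontime_label(scheduled_arr: str, actual_arr: str,
                 fmt: str = '%Y-%m-%d %H:%M:%S') -> float:
    """Return 1.0 if the flight arrived less than 15 minutes late, else 0.0."""
    delay_min = (datetime.strptime(actual_arr, fmt) -
                 datetime.strptime(scheduled_arr, fmt)).total_seconds() / 60.0
    return 1.0 if delay_min < 15 else 0.0

print(ontime_label('2015-03-10 10:00:00', '2015-03-10 10:09:00'))  # -> 1.0
```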
22 | ## Architecture
23 |
24 | 
25 | 1. Data Ingestion
26 |
27 | - Ingest - Extract Flight On-Time Performance Data (Date, Flight Number, Origin, Destination, Departure Time, Taxi Time, Arrival Time, etc.) -> Stored in Cloud Storage Bucket
28 |
29 | - Ingest - Extract Airport Information (Airport code, City, Latitude, Longitude, etc.) -> Stored in Cloud Storage Bucket
30 |
31 | - Store - Store standardized and transformed datasets in BigQuery
32 |
33 | 2. Model Training
34 |
35 | - Batch Dataflow Process to create Training Dataset using simulated events.
36 |
37 | - Use the Training Dataset for Vertex AI Model Training.
38 |
39 | 3. Prediction
40 |
41 | - Simulate - Simulate Realtime Flight Takeoffs & Landings and capture this data in Pub/Sub Topics.
42 |
43 | - Prediction - Streaming Dataflow job to read from Pub/Sub and call Vertex AI Model to predict on-time arrival of flights.
44 |
45 | - Store - Capture the predictions in a BigQuery Dataset for Analysis and Dashboarding needs.
46 |
47 | ## Datasets
48 |
49 | 1. Inputs
50 |
51 | - Airports Information - airports
52 |
53 | - Ontime Flight Data - flights_raw
54 |
55 | - Time Zone Corrected Data - flights_tzcorr
56 |
57 | - Simulated Flight Event - flights_simevents
58 |
59 | 2. Outputs
60 |
61 | - Streaming Prediction - streaming_preds
62 |
63 |
64 | ## Getting started
65 |
66 | ### Step 01. Create a GCP project and open Cloud Shell
67 |
68 | ### Step 02. Clone this github repository:
69 |
70 | git clone https://github.com/google/real-time-intelligence-workshop.git
71 |
72 | ### Step 03. Change Directory to **RealTimePrediction/realtime-intelligence-main**
73 |
74 | cd real-time-intelligence-workshop/RealTimePrediction/realtime-intelligence-main/
75 |
66 | ### Step 04. Execute the environment setup script
77 |
78 | ./setup_env.sh
79 |
80 | This script sets up your project:
81 |
82 | - Create Project Variables
83 |
84 | - Enable necessary APIs
85 |
86 | - Add the necessary roles for the default compute service account
87 |
88 | - Create Network, Sub-network & Firewall rules
89 |
90 | ### Step 05. Execute the data staging script
91 |
92 | ./stage_data.sh
93 |
94 | This script stages the following data for the lab:
95 |
96 | - Download flight ontime performance data
97 |
98 | - Download flight timezone corrected data
99 |
100 | - Download Airport information
101 |
102 | - Download Flight Simulated Events
103 |
104 | - Upload the downloaded files to BigQuery
105 |
106 | ### Step 06. Validate that data has been copied to Cloud Storage and BigQuery
107 |
108 | Sample image of the GCS Bucket
109 |
110 | 
111 |
112 | - In the Google Cloud Console menu, navigate to Cloud Storage and validate that the <project>-ml bucket was created
113 |
114 | Open the bucket and validate that the following files exist
115 |
116 | - flight_simevents_dump*.gz (5 files)
117 |
118 | - flights folder has 3 sub-folders - airports, raw & tzcorr
119 |
120 | - airports folder has 1 file - airports.csv
121 |
122 | - raw folder has 2 files - 201501.csv & 201502.csv
123 |
124 | - tzcorr folder has 1 file - all_flights*
125 |
126 | Sample Image of BigQuery Dataset
127 |
128 | 
129 |
130 | - In the Google Cloud Console menu, navigate to BigQuery and validate that the flights dataset was created
131 |
132 | Open the dataset and validate that the following tables exist (a spot-check sketch follows the list)
133 |
134 | - airports - 13,386 rows
135 |
136 | - flights_raw - 899,159 rows
137 |
138 | - flights_simevents - 2,600,380 rows
139 |
140 | - flights_tzcorr - 65,099 rows
141 |
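To spot-check these row counts programmatically, here is a small sketch using the google-cloud-bigquery client (it assumes Cloud Shell is pointed at the lab project):

```python
from google.cloud import bigquery

client = bigquery.Client()  # picks up the current project from the environment
for name in ['airports', 'flights_raw', 'flights_simevents', 'flights_tzcorr']:
    table = client.get_table('flights.{}'.format(name))
    print('{}: {:,} rows'.format(name, table.num_rows))
```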
142 | ### Step 07. Check Organization Policies to review the following constraints
143 |
144 | - In the Google Cloud Console menu, navigate to IAM & Admin -> Organization Policies
145 |
146 | - Turn off Shielded VM Policy
147 |
148 | - Filter the following constraint to validate current settings
149 |
150 | constraints/compute.requireShieldedVm
151 |
152 | Sample Image of Shielded VM - Organization Policy
153 |
154 | 
155 |
156 |
157 | - Allow VM external IP Access
158 |
159 | - Filter the following constraint to validate current settings
160 |
161 | constraints/compute.vmExternalIpAccess
162 |
163 | Sample Image of External IP Access - Organization Policy
164 |
165 | 
166 |
167 | ### Step 08. Execute script to install the necessary packages.
168 |
169 | ./install_packages.sh
170 |
171 | - These packages are necessary to run the TensorFlow and Apache Beam processes
172 |
173 | ### Step 09. Execute script to create data for model training.
174 |
175 | ./create_train_data.sh
176 |
177 | This script creates data for testing, training and validation of the model.
178 |
179 | - In the Google Cloud Console menu, navigate to Dataflow > Jobs
180 |
181 | - Click on the traindata job to review the job graph
182 |
183 | - Wait for the job to run and succeed - it will take about 20 minutes
184 |
185 | Sample Image of Dataflow Jobs - Note: traindata is a batch job
186 |
187 | 
188 |
189 |
190 | Sample Image of TrainData Job Graph
191 |
192 | 
193 |
194 | Open the <project>-ml bucket to validate that the following files and folders are present (a listing sketch follows the images)
195 |
196 | - train folder with 1 sub-folder, data, containing 4 files: all*.csv, test*.csv, train*.csv, validate*.csv
197 |
198 | Sample Image of the bucket
199 |
200 | 
201 |
202 | - flights folder with 2 sub-folders, staging & temp, containing Dataflow staging and temp files
203 |
204 | Sample Image of the bucket
205 |
206 | 
207 |
208 |
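To confirm the Dataflow job wrote these objects, you can list them with the google-cloud-storage client. A sketch, assuming your bucket is named <project>-ml as created by the setup scripts:

```python
from google.cloud import storage

client = storage.Client()
bucket_name = 'your-project-ml'  # assumption: replace with your <project>-ml bucket
for prefix in ['train/data/', 'flights/staging/', 'flights/temp/']:
    count = sum(1 for _ in client.list_blobs(bucket_name, prefix=prefix))
    print('{}: {} objects'.format(prefix, count))
```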
209 | ### Step 10. Execute script to train and deploy the model
210 |
211 | ./train_model.sh
212 |
213 | - In Google Cloud Console menu, navigate to Vertex AI -> Training to monitor the training pipeline.
214 |
215 | Sample Image of Vertex AI Training Pipeline
216 |
217 | 
218 |
219 | - When the status is Finished, click on the training pipeline name and select the Deploy & Test tab
220 |
221 | Sample Image of Vertex AI Deployment
222 |
223 | 
224 |
225 |
226 | - Monitor the deployment status of the model
227 |
228 | - Note: It will take around 20 minutes to complete the model training and deployment.
229 |
230 | - Once the model is deployed, the flights endpoint will be used to call the model for prediction (a readiness-check sketch follows the image).
231 |
232 | Sample Image of Vertex AI Endpoint
233 |
234 | 
235 |
236 |
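Before moving on, you can confirm the endpoint is up with a sketch condensed from realtime/call_predict.py (the project ID is a placeholder; this lab deploys to us-central1):

```python
from google.cloud import aiplatform

aiplatform.init(project='your-project-id', location='us-central1')  # placeholders
endpoints = aiplatform.Endpoint.list(
    filter='display_name="flights"', order_by='create_time desc')
if endpoints:
    print('Endpoint ready:', endpoints[0].resource_name)
else:
    print('No endpoint named flights yet -- deployment may still be in progress')
```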
237 | ### Step 11. Open another Cloud Shell tab and execute the script to stream simulated flight data
238 |
239 | ./simulate_flight.sh
240 |
241 | - In Google Cloud Console menu, navigate to Pub/Sub -> Topics
242 |
243 | - Review the 3 topics that were created to stream simulated flight events (a message-pull sketch follows the image)
244 |
245 | - arrived - simulates flight arrivals
246 | - departed - simulates flight departures
247 | - wheelsoff - simulates flight take-offs
248 |
249 | Sample Image of Pub/Sub Topics
250 |
251 | 
252 |
253 |
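To peek at the simulated events in flight, a sketch that attaches a temporary subscription to the wheelsoff topic and pulls a few messages (the subscription name is hypothetical, and only events published after the subscription exists will be received):

```python
from google.cloud import pubsub_v1

project_id = 'your-project-id'  # assumption: replace with your project
topic = 'projects/{}/topics/wheelsoff'.format(project_id)
sub = 'projects/{}/subscriptions/wheelsoff-peek'.format(project_id)  # hypothetical

subscriber = pubsub_v1.SubscriberClient()
subscriber.create_subscription(name=sub, topic=topic)
response = subscriber.pull(subscription=sub, max_messages=3, timeout=30.0)
for received in response.received_messages:
    print(received.message.data.decode('utf-8'))
subscriber.delete_subscription(subscription=sub)  # clean up the peek subscription
```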
254 | ### Step 12. In the previous tab, execute the script to predict the probability of flights being on time
255 |
256 | ./predict_flights.sh
257 |
258 | This script creates a streaming Dataflow job that calls the Vertex AI model trained in Step 10
259 |
260 | Sample Image of Dataflow Jobs - Note: predictions is a streaming job
261 |
262 | 
263 |
264 | Sample Image of predictions Job Graph
265 |
266 | 
267 |
268 | - Wait about 15 minutes for predictions to accumulate.
269 |
270 | - In the Google Cloud Console menu, navigate to BigQuery -> SQL Studio
271 |
272 | - Open the flights dataset and review the streaming_preds table
273 |
274 | - Streaming predictions of the probability of on-time flight arrival are captured in this table (a query sketch follows the image)
275 |
276 | Sample Image of Streaming Predictions Table
277 |
278 | 
279 |
280 |
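The same check from Python, using the google-cloud-bigquery client (equivalent to running the query in SQL Studio):

```python
from google.cloud import bigquery

client = bigquery.Client()
query = """
    SELECT event_time, origin, dest, prob_ontime
    FROM flights.streaming_preds
    ORDER BY event_time DESC
    LIMIT 10
"""
for row in client.query(query).result():
    print(row.event_time, row.origin, row.dest, row.prob_ontime)
```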
281 |
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/create_train_data.sh:
--------------------------------------------------------------------------------
1 | # Copyright 2023 Google LLC
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # https://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | #
15 | # Set environment variables
16 | #
17 | export PROJECT_ID=$(gcloud info --format='value(config.project)')
18 | export BUCKET=$PROJECT_ID-ml
19 | #
20 | # Change directory to realtime directory
21 | #
22 | cd ./realtime
23 | #
24 | # Create data for Training
25 | # Run Dataflow Pipeline to create Training Dataset
26 | # Note: It will take around 15-20 minutes to complete the job.
27 | #
28 | python3 create_traindata.py --input bigquery --project $PROJECT_ID --bucket $BUCKET --region us-central1
29 | #
30 | cd ..
31 | #
32 | #In the GCP Cloud Console menu, navigate to Dataflow > Jobs
33 | #Open Traindata job and review the Job Graph
34 | #
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/images/architecture.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/RealTimePrediction/realtime-intelligence-main/images/architecture.png
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/images/batch.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/RealTimePrediction/realtime-intelligence-main/images/batch.png
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/images/data_folder.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/RealTimePrediction/realtime-intelligence-main/images/data_folder.png
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/images/dataflow_jobs1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/RealTimePrediction/realtime-intelligence-main/images/dataflow_jobs1.png
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/images/dataflow_jobs2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/RealTimePrediction/realtime-intelligence-main/images/dataflow_jobs2.png
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/images/flights_folder.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/RealTimePrediction/realtime-intelligence-main/images/flights_folder.png
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/images/ingestion_bq.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/RealTimePrediction/realtime-intelligence-main/images/ingestion_bq.png
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/images/ingestion_gcs.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/RealTimePrediction/realtime-intelligence-main/images/ingestion_gcs.png
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/images/op_externalip.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/RealTimePrediction/realtime-intelligence-main/images/op_externalip.png
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/images/op_shieldedvm.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/RealTimePrediction/realtime-intelligence-main/images/op_shieldedvm.png
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/images/prediction.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/RealTimePrediction/realtime-intelligence-main/images/prediction.png
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/images/pubsub.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/RealTimePrediction/realtime-intelligence-main/images/pubsub.png
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/images/streaming.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/RealTimePrediction/realtime-intelligence-main/images/streaming.png
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/images/vertex_ai_deployment.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/RealTimePrediction/realtime-intelligence-main/images/vertex_ai_deployment.png
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/images/vertex_ai_endpoint.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/RealTimePrediction/realtime-intelligence-main/images/vertex_ai_endpoint.png
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/images/vertex_ai_training.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/RealTimePrediction/realtime-intelligence-main/images/vertex_ai_training.png
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/install_packages.sh:
--------------------------------------------------------------------------------
1 | # Copyright 2023 Google LLC
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # https://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | #
15 | # Install the following packages in your Cloud Shell or VM
16 | #
17 | pip3 install google-cloud-aiplatform
18 | pip3 install cloudml-hypertune
19 | pip3 install pyfarmhash
20 | pip3 install tensorflow
21 | pip3 install kfp 'apache-beam[gcp]'
22 |
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/predict_flights.sh:
--------------------------------------------------------------------------------
1 | # Copyright 2023 Google LLC
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # https://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | #
15 | # Set environment variables
16 | #
17 | export PROJECT_ID=$(gcloud info --format='value(config.project)')
18 | export BUCKET=$PROJECT_ID-ml
19 | #
20 | # Change directory to realtime directory
21 | #
22 | cd ./realtime
23 | #
24 | # Predict Flights:
25 | #
26 | python3 make_predictions.py --input pubsub --output bigquery --project $PROJECT_ID --bucket $BUCKET --region us-central1
27 | cd ..
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/realtime/.gitignore:
--------------------------------------------------------------------------------
1 | *.egg-info
2 |
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/realtime/README.md:
--------------------------------------------------------------------------------
1 | Copyright 2023 Google LLC
2 |
3 | Licensed under the Apache License, Version 2.0 (the "License");
4 | you may not use this file except in compliance with the License.
5 | You may obtain a copy of the License at
6 |
7 | https://www.apache.org/licenses/LICENSE-2.0
8 |
9 | Unless required by applicable law or agreed to in writing, software
10 | distributed under the License is distributed on an "AS IS" BASIS,
11 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | See the License for the specific language governing permissions and
13 | limitations under the License.
14 |
15 | # Machine Learning on Streaming Pipelines
16 |
17 | ### Catch up from previous chapters if necessary
18 | If you didn't go through Chapters 2-9, the simplest way to catch up is to copy data from my bucket:
19 |
20 | #### Catch up from Chapters 2-9
21 | * Open CloudShell and git clone this repo:
22 | ```
23 | git clone https://github.com/GoogleCloudPlatform/data-science-on-gcp
24 | ```
25 | * Go to the 02_ingest folder of the repo, run the program ./ingest_from_crsbucket.sh and specify your bucket name.
26 | * Go to the 04_streaming folder of the repo, run the program ./ingest_from_crsbucket.sh and specify your bucket name.
27 | * Go to the 05_bqnotebook folder of the repo, run the program ./create_trainday.sh and specify your bucket name.
28 | * Go to the 10_mlops folder of the repo, run the program ./ingest_from_crsbucket.sh and specify your bucket name.
29 |
30 | #### From CloudShell
31 | * Install the Python libraries you'll need
32 | ```
33 | pip3 install google-cloud-aiplatform cloudml-hypertune pyfarmhash
34 | ```
35 | * [Optional] Create a small, local sample of BigQuery datasets for local experimentation:
36 | ```
37 | bash create_sample_input.sh
38 | ```
39 | * [Optional] Run a local pipeline to create a training dataset:
40 | ```
41 | python3 create_traindata.py --input local
42 | ```
43 | Verify the results:
44 | ```
45 | cat /tmp/all_data*
46 | ```
47 | * Run a Dataflow pipeline to create the full training dataset:
48 | ```
49 | python3 create_traindata.py --input bigquery --project --bucket --region
50 | ```
51 | Note if you get an error similar to:
52 | ```
53 | AttributeError: Can't get attribute '_create_code' on
54 | ```
55 | it is because the global versions of your modules are ahead of, or behind, what Apache Beam on the server requires. Make sure to submit Apache Beam code to Dataflow from a pristine virtual environment that has only the modules you need:
56 | ```
57 | python -m venv ~/beamenv
58 | source ~/beamenv/bin/activate
59 | pip install apache-beam[gcp] google-cloud-aiplatform cloudml-hypertune pyfarmhash pyparsing==2.4.2
60 | python3 create_traindata.py ...
61 | ```
62 | Note that beamenv is only for submitting to Dataflow. Run train_on_vertexai.py and other code directly in the terminal.
63 | * Run the script that copies over the Ch10 model.py and train_on_vertexai.py files and makes the necessary changes:
64 | ```
65 | python3 change_ch10_files.py
66 | ```
67 | * [Optional] Train an AutoML model on the enriched dataset:
68 | ```
69 | python3 train_on_vertexai.py --automl --project --bucket --region
70 | ```
71 | Verify performance by running the following BigQuery query:
72 | ```
73 | SELECT
74 | SQRT(SUM(
75 | (CAST(ontime AS FLOAT64) - predicted_ontime.scores[OFFSET(0)])*
76 | (CAST(ontime AS FLOAT64) - predicted_ontime.scores[OFFSET(0)])
77 | )/COUNT(*))
78 | FROM dsongcp.ch11_automl_evaluated
79 | ```
80 | * Train custom ML model on the enriched dataset:
81 | ```
82 | python3 train_on_vertexai.py --project --bucket --region
83 | ```
84 | Look at the training logs to determine the final RMSE.
85 | * Run a local pipeline to invoke predictions:
86 | ```
87 | python3 make_predictions.py --input local
88 | ```
89 | Verify the results:
90 | ```
91 | cat /tmp/predictions*
92 | ```
93 | * [Optional] Run a pipeline on full BigQuery dataset to invoke predictions:
94 | ```
95 | python3 make_predictions.py --input bigquery --project --bucket --region
96 | ```
97 | Verify the results
98 | ```
99 | gsutil cat gs://BUCKET/flights/ch11/predictions* | head -5
100 | ```
101 | * [Optional] Simulate real-time pipeline and check to see if predictions are being made
102 |
103 |
104 | In one terminal, type:
105 | ```
106 | cd ../04_streaming/simulate
107 | python3 ./simulate.py --startTime '2015-05-01 00:00:00 UTC' \
108 | --endTime '2015-05-04 00:00:00 UTC' --speedFactor=30 --project
109 | ```
110 |
111 | In another terminal type:
112 | ```
113 | python3 make_predictions.py --input pubsub \
114 | --project --bucket --region
115 | ```
116 |
117 | Ensure that the pipeline starts; to check that output elements are being written out, do:
118 | ```
119 | gsutil ls gs://BUCKET/flights/ch11/predictions*
120 | ```
121 | Make sure to go to the GCP Console and stop the Dataflow pipeline.
122 |
123 |
124 | * Simulate the real-time pipeline and try out different parameters (time ranges, speed factors, etc.)
125 |
126 | In one terminal, type:
127 | ```
128 | cd ../04_streaming/simulate
129 | python3 ./simulate.py --startTime '2015-02-01 00:00:00 UTC' \
130 | --endTime '2015-02-03 00:00:00 UTC' --speedFactor=30 --project
131 | ```
132 |
133 | In another terminal type:
134 | ```
135 | python3 make_predictions.py --input pubsub --output bigquery \
136 | --project --bucket --region
137 | ```
138 |
139 | Ensure that the pipeline starts, then look at BigQuery:
140 | ```
141 | SELECT * FROM dsongcp.streaming_preds ORDER BY event_time DESC LIMIT 10
142 | ```
143 | When done, make sure to go to the GCP Console and stop the Dataflow pipeline.
144 |
145 | Note: If you are going to try it a second time around, delete the BigQuery sink, or simulate with a different time range
146 | ```
147 | bq rm -f dsongcp.streaming_preds
148 | ```
149 |
150 |
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/realtime/call_predict.py:
--------------------------------------------------------------------------------
1 | #### DO NOT EDIT! Autogenerated from ../mlops/call_predict.py -- Copyright 2023 Google Inc. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 |
15 | import sys, json
16 | from google.cloud import aiplatform
17 | from google.cloud.aiplatform import gapic as aip
18 |
19 | #ENDPOINT_NAME = 'flights-ch11'
20 | ENDPOINT_NAME = 'flights'
21 |
22 | if __name__ == '__main__':
23 |
24 | endpoints = aiplatform.Endpoint.list(
25 | filter='display_name="{}"'.format(ENDPOINT_NAME),
26 | order_by='create_time desc'
27 | )
28 | if len(endpoints) == 0:
29 | print("No endpoint named {}".format(ENDPOINT_NAME))
30 | sys.exit(-1)
31 |
32 | endpoint = endpoints[0]
33 |
34 | input_data = {"instances": [
35 | {"dep_hour": 2, "is_weekday": 1, "dep_delay": 40, "taxi_out": 17, "distance": 41, "carrier": "AS", "avg_dep_delay": -3.0, "avg_taxi_out": 5.0,
36 | "dep_airport_lat": 58.42527778, "dep_airport_lon": -135.7075, "arr_airport_lat": 58.35472222,
37 | "arr_airport_lon": -134.57472222, "origin": "GST", "dest": "JNU"},
38 | {"dep_hour": 22, "is_weekday": 0, "dep_delay": -7, "taxi_out": 7, "distance": 201, "carrier": "HA", "avg_dep_delay": 3.0, "avg_taxi_out": 8.0,
39 | "dep_airport_lat": 21.97611111, "dep_airport_lon": -159.33888889, "arr_airport_lat": 20.89861111,
40 | "arr_airport_lon": -156.43055556, "origin": "LIH", "dest": "OGG"}
41 | ]}
42 |
43 | preds = endpoint.predict(input_data['instances'])
44 | print(preds)
45 |
46 |
47 |
48 |
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/realtime/create_sample_input.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | # Copyright 2023 Google LLC
3 | #
4 | # Licensed under the Apache License, Version 2.0 (the "License");
5 | # you may not use this file except in compliance with the License.
6 | # You may obtain a copy of the License at
7 | #
8 | # https://www.apache.org/licenses/LICENSE-2.0
9 | #
10 | # Unless required by applicable law or agreed to in writing, software
11 | # distributed under the License is distributed on an "AS IS" BASIS,
12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | # See the License for the specific language governing permissions and
14 | # limitations under the License.
15 |
16 | bq query --nouse_legacy_sql --format=sparse \
17 | "SELECT EVENT_DATA FROM dsongcp.flights_simevents WHERE EVENT_TYPE = 'wheelsoff' AND EVENT_TIME BETWEEN '2015-03-10T10:00:00' AND '2015-03-10T14:00:00' " \
18 | | grep FL_DATE \
19 | > simevents_sample.json
20 |
21 |
22 | bq query --nouse_legacy_sql --format=json \
23 | "SELECT * FROM dsongcp.flights_tzcorr WHERE DEP_TIME BETWEEN '2015-03-10T10:00:00' AND '2015-03-10T14:00:00' " \
24 | | sed 's/\[//g' | sed 's/\]//g' | sed s'/\},/\}\n/g' \
25 | > alldata_sample.json
26 |
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/realtime/create_traindata.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 |
3 | # Copyright 2023 Google LLC
4 | #
5 | # Licensed under the Apache License, Version 2.0 (the "License");
6 | # you may not use this file except in compliance with the License.
7 | # You may obtain a copy of the License at
8 | #
9 | # https://www.apache.org/licenses/LICENSE-2.0
10 | #
11 | # Unless required by applicable law or agreed to in writing, software
12 | # distributed under the License is distributed on an "AS IS" BASIS,
13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 | # See the License for the specific language governing permissions and
15 | # limitations under the License.
16 |
17 | import apache_beam as beam
18 | import logging
19 | import os
20 | import json
21 |
22 | from flightstxf import flights_transforms as ftxf
23 |
24 | CSV_HEADER = 'ontime,dep_delay,taxi_out,distance,origin,dest,dep_hour,is_weekday,carrier,dep_airport_lat,dep_airport_lon,arr_airport_lat,arr_airport_lon,avg_dep_delay,avg_taxi_out,data_split'
25 |
26 |
27 | def dict_to_csv(f):
28 | try:
29 | yield ','.join([str(x) for x in f.values()])
30 | except Exception as e:
31 | logging.warning('Ignoring {} because: {}'.format(f, e), exc_info=True)
32 | pass
33 |
34 |
35 | def run(project, bucket, region, input):
36 | if input == 'local':
37 | logging.info('Running locally on small extract')
38 | argv = [
39 | '--runner=DirectRunner'
40 | ]
41 | flights_output = '/tmp/'
42 | else:
43 | logging.info('Running in the cloud on full dataset input={}'.format(input))
44 | argv = [
45 | '--project={0}'.format(project),
46 | '--job_name=traindata',
47 | # '--save_main_session', # not needed as we are running as a package now
48 | '--staging_location=gs://{0}/flights/staging/'.format(bucket),
49 | '--temp_location=gs://{0}/flights/temp/'.format(bucket),
50 | '--setup_file=./setup.py',
51 | '--autoscaling_algorithm=THROUGHPUT_BASED',
52 | '--max_num_workers=20',
53 | # '--max_num_workers=4', '--worker_machine_type=m1-ultramem-40', '--disk_size_gb=500', # for full 2015-2019 dataset
54 | '--region={}'.format(region),
55 | '--subnetwork=regions/us-central1/subnetworks/default',
56 | '--runner=DataflowRunner'
57 | ]
58 | flights_output = 'gs://{}/train/data/'.format(bucket)
59 |
60 | with beam.Pipeline(argv=argv) as pipeline:
61 |
62 | # read the event stream
63 | if input == 'local':
64 | input_file = './alldata_sample.json'
65 | logging.info("Reading from {} ... Writing to {}".format(input_file, flights_output))
66 | events = (
67 | pipeline
68 | | 'read_input' >> beam.io.ReadFromText(input_file)
69 | | 'parse_input' >> beam.Map(lambda line: json.loads(line))
70 | )
71 | elif input == 'bigquery':
72 | input_table = 'flights.flights_tzcorr'
73 | logging.info("Reading from {} ... Writing to {}".format(input_table, flights_output))
74 | events = (
75 | pipeline
76 | | 'read_input' >> beam.io.ReadFromBigQuery(table=input_table)
77 | )
78 | else:
79 | logging.error("Unknown input type {}".format(input))
80 | return
81 |
82 | # events -> features. See ./flights_transforms.py for the code shared between training & prediction
83 | features = ftxf.transform_events_to_features(events)
84 |
85 | # shuffle globally so that we are not at the mercy of TensorFlow's shuffle buffer
86 | features = (
87 | features
88 | | 'into_global' >> beam.WindowInto(beam.window.GlobalWindows())
89 | | 'shuffle' >> beam.util.Reshuffle()
90 | )
91 |
92 | # write out
93 | for split in ['ALL', 'TRAIN', 'VALIDATE', 'TEST']:
94 | feats = features
95 | if split != 'ALL':
96 | feats = feats | 'only_{}'.format(split) >> beam.Filter(lambda f: f['data_split'] == split)
97 | (
98 | feats
99 | | '{}_to_string'.format(split) >> beam.FlatMap(dict_to_csv)
100 | | '{}_to_gcs'.format(split) >> beam.io.textio.WriteToText(os.path.join(flights_output, split.lower()),
101 | file_name_suffix='.csv', header=CSV_HEADER,
102 | # workaround b/207384805
103 | num_shards=1)
104 | )
105 |
106 |
107 | if __name__ == '__main__':
108 | import argparse
109 |
110 | parser = argparse.ArgumentParser(description='Create training CSV file that includes time-aggregate features')
111 | parser.add_argument('-p', '--project', help='Project to be billed for Dataflow job. Omit if running locally.')
112 | parser.add_argument('-b', '--bucket', help='Training data will be written to gs://BUCKET/train/')
113 | parser.add_argument('-r', '--region', help='Region to run Dataflow job. Choose the same region as your bucket.')
114 | parser.add_argument('-i', '--input', help='local OR bigquery', required=True)
115 |
116 | logging.getLogger().setLevel(logging.INFO)
117 | args = vars(parser.parse_args())
118 |
119 | if args['input'] != 'local':
120 | if not args['bucket'] or not args['project'] or not args['region']:
121 | print("Project, Bucket, Region are needed in order to run on the cloud on full dataset.")
122 | parser.print_help()
123 | parser.exit()
124 |
125 | run(project=args['project'], bucket=args['bucket'], region=args['region'], input=args['input'])
126 |
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/realtime/flightstxf/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/RealTimePrediction/realtime-intelligence-main/realtime/flightstxf/__init__.py
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/realtime/flightstxf/flights_transforms.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # Copyright 2023 Google LLC
3 | #
4 | # Licensed under the Apache License, Version 2.0 (the "License");
5 | # you may not use this file except in compliance with the License.
6 | # You may obtain a copy of the License at
7 | #
8 | # https://www.apache.org/licenses/LICENSE-2.0
9 | #
10 | # Unless required by applicable law or agreed to in writing, software
11 | # distributed under the License is distributed on an "AS IS" BASIS,
12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | # See the License for the specific language governing permissions and
14 | # limitations under the License.
15 |
16 | import apache_beam as beam
17 | import datetime as dt
18 | import logging
19 | import numpy as np
20 | import farmhash # pip install pyfarmhash
21 |
22 | DATETIME_FORMAT = '%Y-%m-%d %H:%M:%S'
23 | WINDOW_DURATION = 60 * 60
24 | WINDOW_EVERY = 5 * 60
25 |
26 |
27 | def get_data_split(fl_date):
28 | fl_date_str = str(fl_date)
29 | # Use farm fingerprint just like in BigQuery
30 | x = np.abs(np.uint64(farmhash.fingerprint64(fl_date_str)).astype('int64') % 100)
31 | if x < 60:
32 | data_split = 'TRAIN'
33 | elif x < 80:
34 | data_split = 'VALIDATE'
35 | else:
36 | data_split = 'TEST'
37 | return data_split
38 |
39 |
40 | def get_data_split_2019(fl_date):
41 | fl_date_str = str(fl_date)
42 | if fl_date_str > '2019':
43 | data_split = 'TEST'
44 | else:
45 | # Use farm fingerprint just like in BigQuery
46 | x = np.abs(np.uint64(farmhash.fingerprint64(fl_date_str)).astype('int64') % 100)
47 | if x < 95:
48 | data_split = 'TRAIN'
49 | else:
50 | data_split = 'VALIDATE'
51 | return data_split
52 |
53 |
54 | def to_datetime(event_time):
55 | if isinstance(event_time, str):
56 | # In BigQuery, this is a datetime.datetime. In JSON, it's a string
57 | # sometimes it has a T separating the date, sometimes it doesn't
58 | # Handle all the possibilities
59 | event_time = dt.datetime.strptime(event_time.replace('T', ' '), DATETIME_FORMAT)
60 | return event_time
61 |
62 |
63 | def approx_miles_between(lat1, lon1, lat2, lon2):
64 | # convert to radians
65 | lat1 = float(lat1) * np.pi / 180.0
66 | lat2 = float(lat2) * np.pi / 180.0
67 | lon1 = float(lon1) * np.pi / 180.0
68 | lon2 = float(lon2) * np.pi / 180.0
69 |
70 | # apply Haversine formula
71 | d_lat = lat2 - lat1
72 | d_lon = lon2 - lon1
73 | a = (pow(np.sin(d_lat / 2), 2) +
74 | pow(np.sin(d_lon / 2), 2) *
75 | np.cos(lat1) * np.cos(lat2))
76 | c = 2 * np.arcsin(np.sqrt(a))
77 | return float(6371 * c * 0.621371) # miles
78 |
79 |
80 | def create_features_and_label(event, for_training):
81 | try:
82 | model_input = {}
83 |
84 | if for_training:
85 | model_input.update({
86 | 'ontime': 1.0 if float(event['ARR_DELAY'] or 0) < 15 else 0,
87 | })
88 |
89 | # features for both training and prediction
90 | model_input.update({
91 | # same as in ch9
92 | 'dep_delay': event['DEP_DELAY'],
93 | 'taxi_out': event['TAXI_OUT'],
94 | # distance is not in wheelsoff
95 | 'distance': approx_miles_between(event['DEP_AIRPORT_LAT'], event['DEP_AIRPORT_LON'],
96 | event['ARR_AIRPORT_LAT'], event['ARR_AIRPORT_LON']),
97 | 'origin': event['ORIGIN'],
98 | 'dest': event['DEST'],
99 | 'dep_hour': to_datetime(event['DEP_TIME']).hour,
100 | 'is_weekday': 1.0 if to_datetime(event['DEP_TIME']).isoweekday() < 6 else 0.0,
101 | 'carrier': event['UNIQUE_CARRIER'],
102 | 'dep_airport_lat': event['DEP_AIRPORT_LAT'],
103 | 'dep_airport_lon': event['DEP_AIRPORT_LON'],
104 | 'arr_airport_lat': event['ARR_AIRPORT_LAT'],
105 | 'arr_airport_lon': event['ARR_AIRPORT_LON'],
106 | # newly computed averages
107 | 'avg_dep_delay': event['AVG_DEP_DELAY'],
108 | 'avg_taxi_out': event['AVG_TAXI_OUT'],
109 |
110 | })
111 |
112 | if for_training:
113 | model_input.update({
114 | # training data split
115 | 'data_split': get_data_split(event['FL_DATE'])
116 | })
117 | else:
118 | model_input.update({
119 | # prediction output should include timestamp
120 | 'event_time': event['WHEELS_OFF']
121 | })
122 |
123 | yield model_input
124 | except Exception as e:
125 | # if any key is not present, don't use for training
126 | logging.warning('Ignoring {} because: {}'.format(event, e), exc_info=True)
127 | pass
128 |
129 |
130 | def compute_mean(events, col_name):
131 | values = [float(event[col_name]) for event in events if col_name in event and event[col_name]]
132 | return float(np.mean(values)) if len(values) > 0 else None
133 |
134 |
135 | def add_stats(element, window=beam.DoFn.WindowParam):
136 | # result of a group-by, so this will be called once for each airport and window
137 | # all averages here are by airport
138 | airport = element[0]
139 | events = element[1]
140 |
141 | # how late are flights leaving?
142 | avg_dep_delay = compute_mean(events, 'DEP_DELAY')
143 | avg_taxiout = compute_mean(events, 'TAXI_OUT')
144 |
145 | # remember that an event will be present for 60 minutes, but we want to emit
146 | # it only if it has just arrived (if it is within 5 minutes of the start of the window)
147 | emit_end_time = window.start + WINDOW_EVERY
148 | for event in events:
149 | event_time = to_datetime(event['WHEELS_OFF']).timestamp()
150 | if event_time < emit_end_time:
151 | event_plus_stat = event.copy()
152 | event_plus_stat['AVG_DEP_DELAY'] = avg_dep_delay
153 | event_plus_stat['AVG_TAXI_OUT'] = avg_taxiout
154 | yield event_plus_stat
155 |
156 |
157 | def assign_timestamp(event):
158 | try:
159 | event_time = to_datetime(event['WHEELS_OFF'])
160 | yield beam.window.TimestampedValue(event, event_time.timestamp())
161 | except Exception:  # drop events whose WHEELS_OFF is missing or unparseable
162 | pass
163 |
164 |
165 | def is_normal_operation(event):
166 | for flag in ['CANCELLED', 'DIVERTED']:
167 | if flag in event:
168 | s = str(event[flag]).lower()
169 | if s == 'true':
170 | return False  # cancelled or diverted
171 | return True # normal operation
172 |
173 |
174 | def transform_events_to_features(events, for_training=True):
175 | # events are assigned the time at which predictions will have to be made -- the wheels off time
176 | events = events | 'assign_time' >> beam.FlatMap(assign_timestamp)
177 | events = events | 'remove_cancelled' >> beam.Filter(is_normal_operation)
178 |
179 | # compute stats by airport, and add to events
180 | features = (
181 | events
182 | | 'window' >> beam.WindowInto(beam.window.SlidingWindows(WINDOW_DURATION, WINDOW_EVERY))
183 | | 'by_airport' >> beam.Map(lambda x: (x['ORIGIN'], x))
184 | | 'group_by_airport' >> beam.GroupByKey()
185 | | 'events_and_stats' >> beam.FlatMap(add_stats)
186 | | 'events_to_features' >> beam.FlatMap(lambda x: create_features_and_label(x, for_training))
187 | )
188 |
189 | return features
190 |
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/realtime/make_predictions.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 |
3 | # Copyright 2023 Google Inc.
4 | #
5 | # Licensed under the Apache License, Version 2.0 (the "License");
6 | # you may not use this file except in compliance with the License.
7 | # You may obtain a copy of the License at
8 | #
9 | # http://www.apache.org/licenses/LICENSE-2.0
10 | #
11 | # Unless required by applicable law or agreed to in writing, software
12 | # distributed under the License is distributed on an "AS IS" BASIS,
13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 | # See the License for the specific language governing permissions and
15 | # limitations under the License.
16 |
17 | import apache_beam as beam
18 | import logging
19 | import json
20 | import os
21 |
22 | from flightstxf import flights_transforms as ftxf
23 |
24 |
25 | CSV_HEADER = 'event_time,dep_delay,taxi_out,distance,origin,dest,dep_hour,is_weekday,carrier,dep_airport_lat,dep_airport_lon,arr_airport_lat,arr_airport_lon,avg_dep_delay,avg_taxi_out,prob_ontime'
26 |
27 | class FlightsModelInvoker(beam.DoFn):
28 | def __init__(self):
29 | self.endpoint = None
30 |
31 | def setup(self):
32 | from google.cloud import aiplatform
33 | endpoint_name = 'flights'
34 | endpoints = aiplatform.Endpoint.list(
35 | filter='display_name="{}"'.format(endpoint_name),
36 | order_by='create_time desc'
37 | )
38 | if len(endpoints) == 0:
39 | raise EnvironmentError("No endpoint named {}".format(endpoint_name))
40 | logging.info("Found endpoint {}".format(endpoints[0]))
41 | self.endpoint = endpoints[0]
42 |
43 | def process(self, input_data):
44 | # call predictions and pull out probability
45 | logging.info("Invoking ML model on {} flights".format(len(input_data)))
46 | # drop inputs not needed by model
47 | features = [x.copy() for x in input_data]
48 | for f in features:
49 | f.pop('event_time')
50 | # call model
51 | predictions = self.endpoint.predict(features).predictions
52 | for idx, input_instance in enumerate(input_data):
53 | result = input_instance.copy()
54 | result['prob_ontime'] = predictions[idx][0]
55 | yield result
56 |
57 |
58 | def run(project, bucket, region, source, sink):
59 | if source == 'local':
60 | logging.info('Running locally on small extract')
61 | argv = [
62 | '--project={0}'.format(project),
63 | '--runner=DirectRunner'
64 | ]
65 | flights_output = '/tmp/predictions'
66 | else:
67 | logging.info('Running in the cloud on full dataset input={}'.format(source))
68 | argv = [
69 | '--project={0}'.format(project),
70 | '--job_name=predictions',
71 | '--save_main_session',
72 | '--staging_location=gs://{0}/flights/staging/'.format(bucket),
73 | '--temp_location=gs://{0}/flights/temp/'.format(bucket),
74 | '--setup_file=./setup.py',
75 | '--autoscaling_algorithm=THROUGHPUT_BASED',
76 | '--max_num_workers=8',
77 | '--region={}'.format(region),
78 | '--subnetwork=regions/us-central1/subnetworks/default',
79 | '--runner=DataflowRunner'
80 | ]
81 | if source == 'pubsub':
82 | logging.info("Turning on streaming. Cancel the pipeline from GCP console")
83 | argv += ['--streaming']
84 | flights_output = 'gs://{}/flights/predictions'.format(bucket)
85 |
86 | with beam.Pipeline(argv=argv) as pipeline:
87 |
88 | # read the event stream
89 | if source == 'local':
90 | input_file = './simevents_sample.json'
91 | logging.info("Reading from {} ... Writing to {}".format(input_file, flights_output))
92 | events = (
93 | pipeline
94 | | 'read_input' >> beam.io.ReadFromText(input_file)
95 | | 'parse_input' >> beam.Map(lambda line: json.loads(line))
96 | )
97 | elif source == 'bigquery':
98 | input_query = ("SELECT EVENT_DATA FROM flights.flights_simevents " +
99 | "WHERE EVENT_TIME BETWEEN '2015-03-01' AND '2015-03-02'")
100 | logging.info("Reading from {} ... Writing to {}".format(input_query, flights_output))
101 | events = (
102 | pipeline
103 | | 'read_input' >> beam.io.ReadFromBigQuery(query=input_query, use_standard_sql=True)
104 | | 'parse_input' >> beam.Map(lambda row: json.loads(row['EVENT_DATA']))
105 | )
106 | elif source == 'pubsub':
107 | input_topic = "projects/{}/topics/wheelsoff".format(project)
108 | logging.info("Reading from {} ... Writing to {}".format(input_topic, flights_output))
109 | events = (
110 | pipeline
111 | | 'read_input' >> beam.io.ReadFromPubSub(topic=input_topic,
112 | timestamp_attribute='EventTimeStamp')
113 | | 'parse_input' >> beam.Map(lambda s: json.loads(s))
114 | )
115 | else:
116 | logging.error("Unknown input type {}".format(source))
117 | return
118 |
119 | # events -> features. See ./flights_transforms.py for the code shared between training & prediction
120 | features = ftxf.transform_events_to_features(events, for_training=False)
121 |
122 | # call model endpoint
123 | # shared_handle = beam.utils.shared.Shared()
124 | preds = (
125 | features
126 | | 'into_global' >> beam.WindowInto(beam.window.GlobalWindows())
127 | | 'batch_instances' >> beam.BatchElements(min_batch_size=1, max_batch_size=64)
128 | | 'model_predict' >> beam.ParDo(FlightsModelInvoker())
129 | )
130 |
131 | # write it out
132 | if sink == 'file':
133 | (preds
134 | | 'to_string' >> beam.Map(lambda f: ','.join([str(x) for x in f.values()]))
135 | | 'to_gcs' >> beam.io.textio.WriteToText(flights_output,
136 | file_name_suffix='.csv', header=CSV_HEADER,
137 | # workaround b/207384805
138 | num_shards=1)
139 | )
140 | elif sink == 'bigquery':
141 | preds_schema = ','.join([
142 | 'event_time:timestamp',
143 | 'prob_ontime:float',
144 | 'dep_delay:float',
145 | 'taxi_out:float',
146 | 'distance:float',
147 | 'origin:string',
148 | 'dest:string',
149 | 'dep_hour:integer',
150 | 'is_weekday:integer',
151 | 'carrier:string',
152 | 'dep_airport_lat:float,dep_airport_lon:float',
153 | 'arr_airport_lat:float,arr_airport_lon:float',
154 | 'avg_dep_delay:float',
155 | 'avg_taxi_out:float',
156 | ])
157 | (preds
158 | | 'to_bigquery' >> beam.io.WriteToBigQuery(
159 | 'flights.streaming_preds', schema=preds_schema,
160 | # write_disposition=beam.io.BigQueryDisposition.WRITE_TRUNCATE,
161 | create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
162 | method='STREAMING_INSERTS'
163 | )
164 | )
165 | else:
166 | logging.error("Unknown output type {}".format(sink))
167 | return
168 |
169 |
170 | if __name__ == '__main__':
171 | import argparse
172 |
173 | parser = argparse.ArgumentParser(description='Make flight on-time predictions from local, BigQuery, or Pub/Sub events')
174 | parser.add_argument('-p', '--project', help='Project to be billed for Dataflow/BigQuery', required=True)
175 |     parser.add_argument('-b', '--bucket', help='predictions will be written to gs://BUCKET/flights/predictions/')
176 | parser.add_argument('-r', '--region', help='Region to run Dataflow job. Choose the same region as your bucket.')
177 | parser.add_argument('-i', '--input', help='local, bigquery OR pubsub', required=True)
178 |     parser.add_argument('-o', '--output', help='file OR bigquery', default='file')
179 |
180 | logging.getLogger().setLevel(logging.INFO)
181 | args = vars(parser.parse_args())
182 |
183 | if args['input'] != 'local':
184 | if not args['bucket'] or not args['project'] or not args['region']:
185 |             print("Project, bucket, and region are needed in order to run on the cloud on the full dataset.")
186 | parser.print_help()
187 | parser.exit()
188 |
189 | run(project=args['project'], bucket=args['bucket'], region=args['region'],
190 | source=args['input'], sink=args['output'])
191 |
--------------------------------------------------------------------------------
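Note: the `model_predict` step above hands each batch produced by `BatchElements` to `FlightsModelInvoker`, which is defined earlier in this file. As a rough, hypothetical sketch only (not the file's actual implementation), a DoFn in that role calling a Vertex AI endpoint could look like the following; the endpoint resource name and the `prob_ontime` output layout are assumptions:

```python
# Illustrative sketch only -- NOT the FlightsModelInvoker defined earlier in
# make_predictions.py. Shows the general shape of a DoFn that sends each
# batch produced by beam.BatchElements to a Vertex AI endpoint.
import apache_beam as beam


class EndpointInvoker(beam.DoFn):
    def __init__(self, endpoint_name):
        # e.g. 'projects/PROJECT/locations/REGION/endpoints/ENDPOINT_ID' (placeholder)
        self._endpoint_name = endpoint_name
        self._endpoint = None

    def setup(self):
        # Create the client once per worker, not once per element.
        from google.cloud import aiplatform
        self._endpoint = aiplatform.Endpoint(self._endpoint_name)

    def process(self, batch):
        # 'batch' is a list of feature dicts (see the batch_instances step above).
        response = self._endpoint.predict(instances=[dict(f) for f in batch])
        for features, pred in zip(batch, response.predictions):
            result = dict(features)
            result['prob_ontime'] = pred[0]  # assumed output layout
            yield result
```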
/RealTimePrediction/realtime-intelligence-main/realtime/model.py:
--------------------------------------------------------------------------------
1 | # Copyright 2023 Google LLC
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # https://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 |
15 | # Trains a wide-and-deep model that predicts the probability a flight arrives on time.
16 |
17 | import argparse
18 | import logging
19 | import os, time
20 | import hypertune
21 | import numpy as np
22 | import tensorflow as tf
23 |
24 | BUCKET = None
25 | TF_VERSION = '2-' + tf.__version__.split('.')[1]  # e.g. '2-8'; needed to choose container
26 |
27 | DEVELOP_MODE = True
28 | NUM_EXAMPLES = 5000 * 1000 # doesn't need to be precise but get order of magnitude right.
29 |
30 | NUM_BUCKETS = 5
31 | NUM_EMBEDS = 3
32 | TRAIN_BATCH_SIZE = 64
33 | DNN_HIDDEN_UNITS = '64,32'
34 |
35 | CSV_COLUMNS = (
36 | 'ontime,dep_delay,taxi_out,distance,origin,dest,dep_hour,is_weekday,carrier,' +
37 | 'dep_airport_lat,dep_airport_lon,arr_airport_lat,arr_airport_lon,avg_dep_delay,avg_taxi_out,data_split'
38 | ).split(',')
39 |
40 | CSV_COLUMN_TYPES = [
41 | 1.0, -3.0, 5.0, 1037.493622678299, 'OTH', 'DEN', 21, 1.0, 'OO',
42 | 43.41694444, -124.24694444, 39.86166667, -104.67305556, -3.0, 5.0, 'TRAIN'
43 | ]
44 |
45 |
46 | def features_and_labels(features):
47 | label = features.pop('ontime') # this is what we will train for
48 | return features, label
49 |
50 |
51 | def read_dataset(pattern, batch_size, mode=tf.estimator.ModeKeys.TRAIN, truncate=None):
52 | dataset = tf.data.experimental.make_csv_dataset(
53 | pattern, batch_size,
54 | column_names=CSV_COLUMNS,
55 | column_defaults=CSV_COLUMN_TYPES,
56 | sloppy=True,
57 | num_parallel_reads=2,
58 | ignore_errors=True,
59 | num_epochs=1)
60 | dataset = dataset.map(features_and_labels)
61 | if mode == tf.estimator.ModeKeys.TRAIN:
62 | dataset = dataset.shuffle(batch_size * 10)
63 | dataset = dataset.repeat()
64 | dataset = dataset.prefetch(1)
65 | if truncate is not None:
66 | dataset = dataset.take(truncate)
67 | return dataset
68 |
69 |
70 | def create_model():
71 | real = {
72 | colname: tf.feature_column.numeric_column(colname)
73 | for colname in
74 | (
75 | 'dep_delay,taxi_out,distance,dep_hour,is_weekday,' +
76 | 'dep_airport_lat,dep_airport_lon,' +
77 | 'arr_airport_lat,arr_airport_lon,avg_dep_delay,avg_taxi_out'
78 | ).split(',')
79 | }
80 | sparse = {
81 | 'carrier': tf.feature_column.categorical_column_with_vocabulary_list('carrier',
82 | vocabulary_list='AS,VX,F9,UA,US,WN,HA,EV,MQ,DL,OO,B6,NK,AA'.split(
83 | ',')),
84 | 'origin': tf.feature_column.categorical_column_with_hash_bucket('origin', hash_bucket_size=1000),
85 | 'dest': tf.feature_column.categorical_column_with_hash_bucket('dest', hash_bucket_size=1000),
86 | }
87 |
88 | inputs = {
89 | colname: tf.keras.layers.Input(name=colname, shape=(), dtype='float32')
90 | for colname in real.keys()
91 | }
92 | inputs.update({
93 | colname: tf.keras.layers.Input(name=colname, shape=(), dtype='string')
94 | for colname in sparse.keys()
95 | })
96 |
97 | latbuckets = np.linspace(20.0, 50.0, NUM_BUCKETS).tolist() # USA
98 | lonbuckets = np.linspace(-120.0, -70.0, NUM_BUCKETS).tolist() # USA
99 | disc = {}
100 | disc.update({
101 | 'd_{}'.format(key): tf.feature_column.bucketized_column(real[key], latbuckets)
102 | for key in ['dep_airport_lat', 'arr_airport_lat']
103 | })
104 | disc.update({
105 | 'd_{}'.format(key): tf.feature_column.bucketized_column(real[key], lonbuckets)
106 | for key in ['dep_airport_lon', 'arr_airport_lon']
107 | })
108 |
109 | # cross columns that make sense in combination
110 | sparse['dep_loc'] = tf.feature_column.crossed_column(
111 | [disc['d_dep_airport_lat'], disc['d_dep_airport_lon']], NUM_BUCKETS * NUM_BUCKETS)
112 | sparse['arr_loc'] = tf.feature_column.crossed_column(
113 | [disc['d_arr_airport_lat'], disc['d_arr_airport_lon']], NUM_BUCKETS * NUM_BUCKETS)
114 | sparse['dep_arr'] = tf.feature_column.crossed_column([sparse['dep_loc'], sparse['arr_loc']], NUM_BUCKETS ** 4)
115 |
116 | # embed all the sparse columns
117 | embed = {
118 | 'embed_{}'.format(colname): tf.feature_column.embedding_column(col, NUM_EMBEDS)
119 | for colname, col in sparse.items()
120 | }
121 | real.update(embed)
122 |
123 | # one-hot encode the sparse columns
124 | sparse = {
125 | colname: tf.feature_column.indicator_column(col)
126 | for colname, col in sparse.items()
127 | }
128 |
129 | model = wide_and_deep_classifier(
130 | inputs,
131 | linear_feature_columns=sparse.values(),
132 | dnn_feature_columns=real.values(),
133 | dnn_hidden_units=DNN_HIDDEN_UNITS)
134 |
135 | return model
136 |
137 |
138 | def wide_and_deep_classifier(inputs, linear_feature_columns, dnn_feature_columns, dnn_hidden_units):
139 | deep = tf.keras.layers.DenseFeatures(dnn_feature_columns, name='deep_inputs')(inputs)
140 | layers = [int(x) for x in dnn_hidden_units.split(',')]
141 | for layerno, numnodes in enumerate(layers):
142 | deep = tf.keras.layers.Dense(numnodes, activation='relu', name='dnn_{}'.format(layerno + 1))(deep)
143 | wide = tf.keras.layers.DenseFeatures(linear_feature_columns, name='wide_inputs')(inputs)
144 | both = tf.keras.layers.concatenate([deep, wide], name='both')
145 | output = tf.keras.layers.Dense(1, activation='sigmoid', name='pred')(both)
146 | model = tf.keras.Model(inputs, output)
147 | model.compile(optimizer='adam',
148 | loss='binary_crossentropy',
149 | metrics=['accuracy', rmse, tf.keras.metrics.AUC()])
150 | return model
151 |
152 |
153 | def rmse(y_true, y_pred):
154 | return tf.sqrt(tf.reduce_mean(tf.square(y_pred - y_true)))
155 |
156 |
157 | def train_and_evaluate(train_data_pattern, eval_data_pattern, test_data_pattern, export_dir, output_dir):
158 | train_batch_size = TRAIN_BATCH_SIZE
159 | if DEVELOP_MODE:
160 | eval_batch_size = 100
161 | steps_per_epoch = 3
162 | epochs = 2
163 | num_eval_examples = eval_batch_size * 10
164 | else:
165 | eval_batch_size = 100
166 | steps_per_epoch = NUM_EXAMPLES // train_batch_size
167 | epochs = NUM_EPOCHS
168 | num_eval_examples = eval_batch_size * 100
169 |
170 | train_dataset = read_dataset(train_data_pattern, train_batch_size)
171 | eval_dataset = read_dataset(eval_data_pattern, eval_batch_size, tf.estimator.ModeKeys.EVAL, num_eval_examples)
172 |
173 | # checkpoint
174 | checkpoint_path = '{}/checkpoints/flights.cpt'.format(output_dir)
175 | logging.info("Checkpointing to {}".format(checkpoint_path))
176 | cp_callback = tf.keras.callbacks.ModelCheckpoint(checkpoint_path,
177 | save_weights_only=True,
178 | verbose=1)
179 |
180 | # call back to write out hyperparameter tuning metric
181 | METRIC = 'val_rmse'
182 | hpt = hypertune.HyperTune()
183 |
184 | class HpCallback(tf.keras.callbacks.Callback):
185 | def on_epoch_end(self, epoch, logs=None):
186 | if logs and METRIC in logs:
187 | logging.info("Epoch {}: {} = {}".format(epoch, METRIC, logs[METRIC]))
188 | hpt.report_hyperparameter_tuning_metric(hyperparameter_metric_tag=METRIC,
189 | metric_value=logs[METRIC],
190 | global_step=epoch)
191 |
192 | # train the model
193 | model = create_model()
194 |     logging.info(f"Training on {train_data_pattern}; eval on {eval_data_pattern}; {epochs} epochs; {steps_per_epoch} steps per epoch")
195 | history = model.fit(train_dataset,
196 | validation_data=eval_dataset,
197 | epochs=epochs,
198 | steps_per_epoch=steps_per_epoch,
199 | callbacks=[cp_callback, HpCallback()])
200 |
201 | # export
202 | logging.info('Exporting to {}'.format(export_dir))
203 | tf.saved_model.save(model, export_dir)
204 |
205 | # write out final metric
206 | final_rmse = history.history[METRIC][-1]
207 | logging.info("Validation metric {} on {} samples = {}".format(METRIC, num_eval_examples, final_rmse))
208 |
209 | if (not DEVELOP_MODE) and (test_data_pattern is not None) and (not SKIP_FULL_EVAL):
210 | logging.info("Evaluating over full test dataset")
211 | test_dataset = read_dataset(test_data_pattern, eval_batch_size, tf.estimator.ModeKeys.EVAL, None)
212 | final_metrics = model.evaluate(test_dataset)
213 | logging.info("Final metrics on full test dataset = {}".format(final_metrics))
214 | else:
215 | logging.info("Skipping evaluation on full test dataset")
216 |
217 |
218 | if __name__ == '__main__':
219 |     logging.info("TensorFlow version " + tf.__version__)
220 | parser = argparse.ArgumentParser()
221 |
222 | parser.add_argument(
223 | '--bucket',
224 | #help='Data will be read from gs://BUCKET/ch11/data and output will be in gs://BUCKET/ch11/trained_model',
225 | help='Data will be read from gs://BUCKET/train/data and output will be in gs://BUCKET/train/trained_model',
226 | required=True
227 | )
228 |
229 | parser.add_argument(
230 | '--num_examples',
231 | help='Number of examples per epoch. Get order of magnitude correct.',
232 | type=int,
233 | default=5000000
234 | )
235 |
236 | # for hyper-parameter tuning
237 | parser.add_argument(
238 | '--train_batch_size',
239 | help='Number of examples to compute gradient on',
240 | type=int,
241 | default=256 # originally 64
242 | )
243 | parser.add_argument(
244 | '--nbuckets',
245 | help='Number of bins into which to discretize lats and lons',
246 | type=int,
247 | default=10 # originally 5
248 | )
249 | parser.add_argument(
250 | '--nembeds',
251 | help='Embedding dimension for categorical variables',
252 | type=int,
253 | default=3
254 | )
255 | parser.add_argument(
256 | '--num_epochs',
257 | help='Number of epochs (used only if --develop is not set)',
258 | type=int,
259 | default=10
260 | )
261 | parser.add_argument(
262 | '--dnn_hidden_units',
263 | help='Architecture of DNN part of wide-and-deep network',
264 | default='64,64,64,8' # originally '64,32'
265 | )
266 | parser.add_argument(
267 | '--develop',
268 | help='Train on a small subset in development',
269 | dest='develop',
270 | action='store_true')
271 | parser.set_defaults(develop=False)
272 | parser.add_argument(
273 | '--skip_full_eval',
274 | help='Just train. Do not evaluate on test dataset.',
275 | dest='skip_full_eval',
276 | action='store_true')
277 | parser.set_defaults(skip_full_eval=False)
278 |
279 | # parse args
280 | args = parser.parse_args().__dict__
281 | logging.getLogger().setLevel(logging.INFO)
282 |
283 | # The Vertex AI contract. If not running in Vertex AI Training, these will be None
284 | OUTPUT_MODEL_DIR = os.getenv("AIP_MODEL_DIR") # or None
285 | TRAIN_DATA_PATTERN = os.getenv("AIP_TRAINING_DATA_URI")
286 | EVAL_DATA_PATTERN = os.getenv("AIP_VALIDATION_DATA_URI")
287 | TEST_DATA_PATTERN = os.getenv("AIP_TEST_DATA_URI")
288 |
289 | # set top-level output directory for checkpoints, etc.
290 | BUCKET = args['bucket']
291 | #OUTPUT_DIR = 'gs://{}/ch11/train_output'.format(BUCKET)
292 | OUTPUT_DIR = 'gs://{}/train/train_output'.format(BUCKET)
293 | # During hyperparameter tuning, we need to make sure different trials don't clobber each other
294 | # https://cloud.google.com/ai-platform/training/docs/distributed-training-details#tf-config-format
295 | # This doesn't exist in Vertex AI
296 | # OUTPUT_DIR = os.path.join(
297 | # OUTPUT_DIR,
298 | # json.loads(
299 | # os.environ.get('TF_CONFIG', '{}')
300 | # ).get('task', {}).get('trial', '')
301 | # )
302 | if OUTPUT_MODEL_DIR:
303 | # convert gs://ai-analytics-solutions-dsongcp2/aiplatform-custom-job-2021-11-13-22:22:46.175/1/model/
304 | # to gs://ai-analytics-solutions-dsongcp2/aiplatform-custom-job-2021-11-13-22:22:46.175/1
305 | OUTPUT_DIR = os.path.join(
306 | os.path.dirname(OUTPUT_MODEL_DIR if OUTPUT_MODEL_DIR[-1] != '/' else OUTPUT_MODEL_DIR[:-1]),
307 | 'train_output')
308 | logging.info('Writing checkpoints and other outputs to {}'.format(OUTPUT_DIR))
309 |
310 | # Set default values for the contract variables in case we are not running in Vertex AI Training
311 | if not OUTPUT_MODEL_DIR:
312 | OUTPUT_MODEL_DIR = os.path.join(OUTPUT_DIR,
313 | 'export/flights_{}'.format(time.strftime("%Y%m%d-%H%M%S")))
314 | if not TRAIN_DATA_PATTERN:
315 | #TRAIN_DATA_PATTERN = 'gs://{}/ch11/data/train*'.format(BUCKET)
316 | TRAIN_DATA_PATTERN = 'gs://{}/train/data/train*'.format(BUCKET)
317 | CSV_COLUMNS.pop() # the data_split column won't exist
318 | CSV_COLUMN_TYPES.pop() # the data_split column won't exist
319 | if not EVAL_DATA_PATTERN:
320 |         #EVAL_DATA_PATTERN = 'gs://{}/ch11/data/eval*'.format(BUCKET)
321 | EVAL_DATA_PATTERN = 'gs://{}/train/data/eval*'.format(BUCKET)
322 | logging.info('Exporting trained model to {}'.format(OUTPUT_MODEL_DIR))
323 | logging.info("Reading training data from {}".format(TRAIN_DATA_PATTERN))
324 |     logging.info("Reading validation data from {}".format(EVAL_DATA_PATTERN))
325 |
326 | # other global parameters
327 | NUM_BUCKETS = args['nbuckets']
328 | NUM_EMBEDS = args['nembeds']
329 | NUM_EXAMPLES = args['num_examples']
330 | NUM_EPOCHS = args['num_epochs']
331 | TRAIN_BATCH_SIZE = args['train_batch_size']
332 | DNN_HIDDEN_UNITS = args['dnn_hidden_units']
333 | DEVELOP_MODE = args['develop']
334 | SKIP_FULL_EVAL = args['skip_full_eval']
335 |
336 | # run
337 | train_and_evaluate(TRAIN_DATA_PATTERN, EVAL_DATA_PATTERN, TEST_DATA_PATTERN, OUTPUT_MODEL_DIR, OUTPUT_DIR)
338 |
339 | logging.info("Done")
340 |
--------------------------------------------------------------------------------
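Note: a quick way to sanity-check the feature-column wiring in `create_model()` is to build the model and push one fake example through it. A minimal sketch, assuming model.py is importable as `model`, its dependencies (tensorflow, cloudml-hypertune) are installed, and the TF 2.x release in use still ships `tf.feature_column`:

```python
# Hypothetical smoke test for create_model(); the column list mirrors
# CSV_COLUMNS above minus the 'ontime' label and the 'data_split' marker.
import numpy as np
import model  # assumes model.py is on the PYTHONPATH

m = model.create_model()
m.summary()

real_cols = (
    'dep_delay,taxi_out,distance,dep_hour,is_weekday,'
    'dep_airport_lat,dep_airport_lon,arr_airport_lat,arr_airport_lon,'
    'avg_dep_delay,avg_taxi_out'
).split(',')
example = {c: np.array([0.0], dtype='float32') for c in real_cols}
example.update({
    'carrier': np.array(['AA']),
    'origin': np.array(['ORD']),
    'dest': np.array(['MIA']),
})
print(m.predict(example))  # one sigmoid probability of an on-time arrival
```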
/RealTimePrediction/realtime-intelligence-main/realtime/setup.py:
--------------------------------------------------------------------------------
1 | #
2 | # Licensed to the Apache Software Foundation (ASF) under one or more
3 | # contributor license agreements. See the NOTICE file distributed with
4 | # this work for additional information regarding copyright ownership.
5 | # The ASF licenses this file to You under the Apache License, Version 2.0
6 | # (the "License"); you may not use this file except in compliance with
7 | # the License. You may obtain a copy of the License at
8 | #
9 | # http://www.apache.org/licenses/LICENSE-2.0
10 | #
11 | # Unless required by applicable law or agreed to in writing, software
12 | # distributed under the License is distributed on an "AS IS" BASIS,
13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 | # See the License for the specific language governing permissions and
15 | # limitations under the License.
16 | #
17 |
18 | """Setup.py module for the workflow's worker utilities.
19 |
20 | All the workflow related code is gathered in a package that will be built as a
21 | source distribution, staged in the staging area for the workflow being run and
22 | then installed in the workers when they start running.
23 |
24 | This behavior is triggered by specifying the --setup_file command line option
25 | when running the workflow for remote execution.
26 | """
27 |
28 | from distutils.command.build import build as _build
29 | import subprocess
30 |
31 | import setuptools
32 |
33 |
34 | # This class handles the pip install mechanism.
35 | class build(_build): # pylint: disable=invalid-name
36 | """A build command class that will be invoked during package install.
37 |
38 | The package built using the current setup.py will be staged and later
39 |     installed in the worker using 'pip install package'. This class will be
40 | instantiated during install for this specific scenario and will trigger
41 | running the custom commands specified.
42 | """
43 | sub_commands = _build.sub_commands + [('CustomCommands', None)]
44 |
45 |
46 | # Some custom command to run during setup. The command is not essential for this
47 | # workflow. It is used here as an example. Each command will spawn a child
48 | # process. Typically, these commands will include steps to install non-Python
49 | # packages. For instance, to install a C++-based library libjpeg62 the following
50 | # two commands will have to be added:
51 | #
52 | # ['apt-get', 'update'],
53 | # ['apt-get', '--assume-yes', 'install', 'libjpeg62'],
54 | #
55 | # First, note that there is no need to use the sudo command because the setup
56 | # script runs with appropriate access.
57 | # Second, if the apt-get tool is used then the first command needs to be 'apt-get
58 | # update' so the tool refreshes itself and initializes links to download
59 | # repositories. Without this initial step the other apt-get install commands
60 | # will fail with package-not-found errors. Note also the --assume-yes option,
61 | # which shortcuts the interactive confirmation.
62 | #
63 | # The output of custom commands (including failures) will be logged in the
64 | # worker-startup log.
65 | CUSTOM_COMMANDS = [
66 | ]
67 |
68 |
69 | class CustomCommands(setuptools.Command):
70 | """A setuptools Command class able to run arbitrary commands."""
71 |
72 | def initialize_options(self):
73 | pass
74 |
75 | def finalize_options(self):
76 | pass
77 |
78 | def RunCustomCommand(self, command_list):
79 |         print('Running command: %s' % command_list)
80 | p = subprocess.Popen(
81 | command_list,
82 | stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
83 | # Can use communicate(input='y\n'.encode()) if the command run requires
84 | # some confirmation.
85 | stdout_data, _ = p.communicate()
86 |         print('Command output: %s' % stdout_data)
87 | if p.returncode != 0:
88 | raise RuntimeError(
89 | 'Command %s failed: exit code: %s' % (command_list, p.returncode))
90 |
91 | def run(self):
92 | for command in CUSTOM_COMMANDS:
93 | self.RunCustomCommand(command)
94 |
95 |
96 | # Configure the required packages and scripts to install.
97 | # Note that the Python Dataflow containers come with numpy already installed
98 | # so this dependency will not trigger anything to be installed unless a version
99 | # restriction is specified.
100 | REQUIRED_PACKAGES = [
101 | 'pyfarmhash',
102 | 'google-cloud-aiplatform',
103 | 'cloudml-hypertune',
104 | 'dill==0.3.1.1'
105 | ]
106 |
107 |
108 | setuptools.setup(
109 | name='flightsdf',
110 | version='0.0.1',
111 | description='Data Science on GCP flights training and prediction pipelines',
112 | install_requires=REQUIRED_PACKAGES,
113 | packages=setuptools.find_packages(),
114 | cmdclass={
115 | # Command class instantiated and run during pip install scenarios.
116 | 'build': build,
117 | 'CustomCommands': CustomCommands,
118 | }
119 | )
120 |
--------------------------------------------------------------------------------
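Note: this setup.py only takes effect when a pipeline is launched with `--setup_file`; Beam then builds a source distribution of the `flightsdf` package and installs it, along with REQUIRED_PACKAGES, on each Dataflow worker at startup. A sketch of the relevant launch flags, in the same argv style make_predictions.py uses (all values are placeholders):

```python
# Illustrative only: the launch flags that make Dataflow stage and install
# this package on its workers. Project and bucket names are placeholders.
argv = [
    '--project=my-project',
    '--region=us-central1',
    '--staging_location=gs://my-bucket/staging',
    '--temp_location=gs://my-bucket/temp',
    '--setup_file=./setup.py',  # triggers the sdist build described above
    '--runner=DataflowRunner',
]
```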
/RealTimePrediction/realtime-intelligence-main/realtime/simevents_sample.json:
--------------------------------------------------------------------------------
1 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "US", "ORIGIN_AIRPORT_SEQ_ID": "1393003", "ORIGIN": "ORD", "DEST_AIRPORT_SEQ_ID": "1410702", "DEST": "PHX", "CRS_DEP_TIME": "2015-03-10T12:00:00", "DEP_TIME": "2015-03-10T11:56:00", "DEP_DELAY": -4.0, "TAXI_OUT": 21.0, "WHEELS_OFF": "2015-03-10T12:17:00", "CRS_ARR_TIME": "2015-03-10T16:10:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 41.97944444, "DEP_AIRPORT_LON": -87.9075, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 33.43416667, "ARR_AIRPORT_LON": -112.01166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:17:00"}
2 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "US", "ORIGIN_AIRPORT_SEQ_ID": "1467903", "ORIGIN": "SAN", "DEST_AIRPORT_SEQ_ID": "1410702", "DEST": "PHX", "CRS_DEP_TIME": "2015-03-10T13:30:00", "DEP_TIME": "2015-03-10T13:26:00", "DEP_DELAY": -4.0, "TAXI_OUT": 34.0, "WHEELS_OFF": "2015-03-10T14:00:00", "CRS_ARR_TIME": "2015-03-10T14:57:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 32.73361111, "DEP_AIRPORT_LON": -117.18972222, "DEP_AIRPORT_TZOFFSET": -25200.0, "ARR_AIRPORT_LAT": 33.43416667, "ARR_AIRPORT_LON": -112.01166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T14:00:00"}
3 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "US", "ORIGIN_AIRPORT_SEQ_ID": "1483103", "ORIGIN": "SJC", "DEST_AIRPORT_SEQ_ID": "1410702", "DEST": "PHX", "CRS_DEP_TIME": "2015-03-10T13:20:00", "DEP_TIME": "2015-03-10T13:15:00", "DEP_DELAY": -5.0, "TAXI_OUT": 16.0, "WHEELS_OFF": "2015-03-10T13:31:00", "CRS_ARR_TIME": "2015-03-10T15:13:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 37.36277778, "DEP_AIRPORT_LON": -121.92916667, "DEP_AIRPORT_TZOFFSET": -25200.0, "ARR_AIRPORT_LAT": 33.43416667, "ARR_AIRPORT_LON": -112.01166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:31:00"}
4 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1410002", "ORIGIN": "PHL", "DEST_AIRPORT_SEQ_ID": "1530402", "DEST": "TPA", "CRS_DEP_TIME": "2015-03-10T09:50:00", "DEP_TIME": "2015-03-10T09:48:00", "DEP_DELAY": -2.0, "TAXI_OUT": 15.0, "WHEELS_OFF": "2015-03-10T10:03:00", "CRS_ARR_TIME": "2015-03-10T12:35:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 39.87222222, "DEP_AIRPORT_LON": -75.24083333, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 27.97555556, "ARR_AIRPORT_LON": -82.53333333, "ARR_AIRPORT_TZOFFSET": -14400.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T10:03:00"}
5 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1323202", "ORIGIN": "MDW", "DEST_AIRPORT_SEQ_ID": "1410702", "DEST": "PHX", "CRS_DEP_TIME": "2015-03-10T12:10:00", "DEP_TIME": "2015-03-10T12:08:00", "DEP_DELAY": -2.0, "TAXI_OUT": 10.0, "WHEELS_OFF": "2015-03-10T12:18:00", "CRS_ARR_TIME": "2015-03-10T16:00:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 41.78583333, "DEP_AIRPORT_LON": -87.7525, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 33.43416667, "ARR_AIRPORT_LON": -112.01166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:18:00"}
6 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1323202", "ORIGIN": "MDW", "DEST_AIRPORT_SEQ_ID": "1410702", "DEST": "PHX", "CRS_DEP_TIME": "2015-03-10T13:35:00", "DEP_TIME": "2015-03-10T13:40:00", "DEP_DELAY": 5.0, "TAXI_OUT": 6.0, "WHEELS_OFF": "2015-03-10T13:46:00", "CRS_ARR_TIME": "2015-03-10T17:30:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 41.78583333, "DEP_AIRPORT_LON": -87.7525, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 33.43416667, "ARR_AIRPORT_LON": -112.01166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:46:00"}
7 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1348702", "ORIGIN": "MSP", "DEST_AIRPORT_SEQ_ID": "1410702", "DEST": "PHX", "CRS_DEP_TIME": "2015-03-10T13:00:00", "DEP_TIME": "2015-03-10T12:56:00", "DEP_DELAY": -4.0, "TAXI_OUT": 24.0, "WHEELS_OFF": "2015-03-10T13:20:00", "CRS_ARR_TIME": "2015-03-10T16:30:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 44.88194444, "DEP_AIRPORT_LON": -93.22166667, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 33.43416667, "ARR_AIRPORT_LON": -112.01166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:20:00"}
8 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1405702", "ORIGIN": "PDX", "DEST_AIRPORT_SEQ_ID": "1410702", "DEST": "PHX", "CRS_DEP_TIME": "2015-03-10T13:45:00", "DEP_TIME": "2015-03-10T13:44:00", "DEP_DELAY": -1.0, "TAXI_OUT": 13.0, "WHEELS_OFF": "2015-03-10T13:57:00", "CRS_ARR_TIME": "2015-03-10T16:20:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 45.58861111, "DEP_AIRPORT_LON": -122.59694444, "DEP_AIRPORT_TZOFFSET": -25200.0, "ARR_AIRPORT_LAT": 33.43416667, "ARR_AIRPORT_LON": -112.01166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:57:00"}
9 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1468303", "ORIGIN": "SAT", "DEST_AIRPORT_SEQ_ID": "1410702", "DEST": "PHX", "CRS_DEP_TIME": "2015-03-10T12:20:00", "DEP_TIME": "2015-03-10T12:20:00", "DEP_DELAY": 0.0, "TAXI_OUT": 10.0, "WHEELS_OFF": "2015-03-10T12:30:00", "CRS_ARR_TIME": "2015-03-10T14:55:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 29.53388889, "DEP_AIRPORT_LON": -98.46916667, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 33.43416667, "ARR_AIRPORT_LON": -112.01166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:30:00"}
10 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1486903", "ORIGIN": "SLC", "DEST_AIRPORT_SEQ_ID": "1410702", "DEST": "PHX", "CRS_DEP_TIME": "2015-03-10T12:00:00", "DEP_TIME": "2015-03-10T11:55:00", "DEP_DELAY": -5.0, "TAXI_OUT": 5.0, "WHEELS_OFF": "2015-03-10T12:00:00", "CRS_ARR_TIME": "2015-03-10T13:40:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 40.78833333, "DEP_AIRPORT_LON": -111.97777778, "DEP_AIRPORT_TZOFFSET": -21600.0, "ARR_AIRPORT_LAT": 33.43416667, "ARR_AIRPORT_LON": -112.01166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:00:00"}
11 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "OO", "ORIGIN_AIRPORT_SEQ_ID": "1014103", "ORIGIN": "ABR", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T10:15:00", "DEP_TIME": "2015-03-10T10:07:00", "DEP_DELAY": -8.0, "TAXI_OUT": 16.0, "WHEELS_OFF": "2015-03-10T10:23:00", "CRS_ARR_TIME": "2015-03-10T11:25:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 45.44833333, "DEP_AIRPORT_LON": -98.4225, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T10:23:00"}
12 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1039705", "ORIGIN": "ATL", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T12:25:00", "DEP_TIME": "2015-03-10T12:24:00", "DEP_DELAY": -1.0, "TAXI_OUT": 16.0, "WHEELS_OFF": "2015-03-10T12:40:00", "CRS_ARR_TIME": "2015-03-10T15:15:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 33.63666667, "DEP_AIRPORT_LON": -84.42777778, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:40:00"}
13 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1039705", "ORIGIN": "ATL", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T11:35:00", "DEP_TIME": "2015-03-10T11:35:00", "DEP_DELAY": 0.0, "TAXI_OUT": 15.0, "WHEELS_OFF": "2015-03-10T11:50:00", "CRS_ARR_TIME": "2015-03-10T14:11:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 33.63666667, "DEP_AIRPORT_LON": -84.42777778, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T11:50:00"}
14 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1052904", "ORIGIN": "BDL", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T10:20:00", "DEP_TIME": "2015-03-10T10:14:00", "DEP_DELAY": -6.0, "TAXI_OUT": 16.0, "WHEELS_OFF": "2015-03-10T10:30:00", "CRS_ARR_TIME": "2015-03-10T13:18:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 41.93916667, "DEP_AIRPORT_LON": -72.68333333, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T10:30:00"}
15 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1062002", "ORIGIN": "BIL", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T12:25:00", "DEP_TIME": "2015-03-10T12:18:00", "DEP_DELAY": -7.0, "TAXI_OUT": 11.0, "WHEELS_OFF": "2015-03-10T12:29:00", "CRS_ARR_TIME": "2015-03-10T14:15:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 45.80777778, "DEP_AIRPORT_LON": -108.54277778, "DEP_AIRPORT_TZOFFSET": -21600.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:29:00"}
16 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "EV", "ORIGIN_AIRPORT_SEQ_ID": "1062702", "ORIGIN": "BIS", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T12:45:00", "DEP_TIME": "2015-03-10T12:40:00", "DEP_DELAY": -5.0, "TAXI_OUT": 29.0, "WHEELS_OFF": "2015-03-10T13:09:00", "CRS_ARR_TIME": "2015-03-10T14:14:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 46.77277778, "DEP_AIRPORT_LON": -100.74583333, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:09:00"}
17 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1062702", "ORIGIN": "BIS", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T10:00:00", "DEP_TIME": "2015-03-10T09:53:00", "DEP_DELAY": -7.0, "TAXI_OUT": 9.0, "WHEELS_OFF": "2015-03-10T10:02:00", "CRS_ARR_TIME": "2015-03-10T11:25:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 46.77277778, "DEP_AIRPORT_LON": -100.74583333, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T10:02:00"}
18 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "OO", "ORIGIN_AIRPORT_SEQ_ID": "1063104", "ORIGIN": "BJI", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T10:05:00", "DEP_TIME": "2015-03-10T09:59:00", "DEP_DELAY": -6.0, "TAXI_OUT": 15.0, "WHEELS_OFF": "2015-03-10T10:14:00", "CRS_ARR_TIME": "2015-03-10T11:05:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 47.51083333, "DEP_AIRPORT_LON": -94.93472222, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T10:14:00"}
19 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "OO", "ORIGIN_AIRPORT_SEQ_ID": "1069302", "ORIGIN": "BNA", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T13:15:00", "DEP_TIME": "2015-03-10T12:47:00", "DEP_DELAY": -28.0, "TAXI_OUT": 56.0, "WHEELS_OFF": "2015-03-10T13:43:00", "CRS_ARR_TIME": "2015-03-10T15:32:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 36.12444444, "DEP_AIRPORT_LON": -86.67805556, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:43:00"}
20 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1072102", "ORIGIN": "BOS", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T12:10:00", "DEP_TIME": "2015-03-10T12:08:00", "DEP_DELAY": -2.0, "TAXI_OUT": 20.0, "WHEELS_OFF": "2015-03-10T12:28:00", "CRS_ARR_TIME": "2015-03-10T15:27:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 42.36305556, "DEP_AIRPORT_LON": -71.00638889, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:28:00"}
21 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1072102", "ORIGIN": "BOS", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T10:00:00", "DEP_TIME": "2015-03-10T09:55:00", "DEP_DELAY": -5.0, "TAXI_OUT": 19.0, "WHEELS_OFF": "2015-03-10T10:14:00", "CRS_ARR_TIME": "2015-03-10T13:21:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 42.36305556, "DEP_AIRPORT_LON": -71.00638889, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T10:14:00"}
22 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "EV", "ORIGIN_AIRPORT_SEQ_ID": "1104203", "ORIGIN": "CLE", "DEST_AIRPORT_SEQ_ID": "1334205", "DEST": "MKE", "CRS_DEP_TIME": "2015-03-10T10:31:00", "DEP_TIME": "2015-03-10T10:19:00", "DEP_DELAY": -12.0, "TAXI_OUT": 12.0, "WHEELS_OFF": "2015-03-10T10:31:00", "CRS_ARR_TIME": "2015-03-10T12:02:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 41.40944444, "DEP_AIRPORT_LON": -81.85472222, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 42.94694444, "ARR_AIRPORT_LON": -87.89694444, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T10:31:00"}
23 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "EV", "ORIGIN_AIRPORT_SEQ_ID": "1104203", "ORIGIN": "CLE", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T11:10:00", "DEP_TIME": "2015-03-10T11:07:00", "DEP_DELAY": -3.0, "TAXI_OUT": 7.0, "WHEELS_OFF": "2015-03-10T11:14:00", "CRS_ARR_TIME": "2015-03-10T13:25:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 41.40944444, "DEP_AIRPORT_LON": -81.85472222, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T11:14:00"}
24 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1105703", "ORIGIN": "CLT", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T10:20:00", "DEP_TIME": "2015-03-10T10:13:00", "DEP_DELAY": -7.0, "TAXI_OUT": 15.0, "WHEELS_OFF": "2015-03-10T10:28:00", "CRS_ARR_TIME": "2015-03-10T13:08:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 35.21361111, "DEP_AIRPORT_LON": -80.94916667, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T10:28:00"}
25 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "US", "ORIGIN_AIRPORT_SEQ_ID": "1105703", "ORIGIN": "CLT", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T13:35:00", "DEP_TIME": "2015-03-10T13:35:00", "DEP_DELAY": 0.0, "TAXI_OUT": 16.0, "WHEELS_OFF": "2015-03-10T13:51:00", "CRS_ARR_TIME": "2015-03-10T16:23:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 35.21361111, "DEP_AIRPORT_LON": -80.94916667, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:51:00"}
26 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "OO", "ORIGIN_AIRPORT_SEQ_ID": "1106603", "ORIGIN": "CMH", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T13:05:00", "DEP_TIME": "2015-03-10T13:01:00", "DEP_DELAY": -4.0, "TAXI_OUT": 9.0, "WHEELS_OFF": "2015-03-10T13:10:00", "CRS_ARR_TIME": "2015-03-10T15:15:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 39.99694444, "DEP_AIRPORT_LON": -82.89222222, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:10:00"}
27 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1127802", "ORIGIN": "DCA", "DEST_AIRPORT_SEQ_ID": "1334205", "DEST": "MKE", "CRS_DEP_TIME": "2015-03-10T10:55:00", "DEP_TIME": "2015-03-10T10:49:00", "DEP_DELAY": -6.0, "TAXI_OUT": 7.0, "WHEELS_OFF": "2015-03-10T10:56:00", "CRS_ARR_TIME": "2015-03-10T13:00:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 38.85194444, "DEP_AIRPORT_LON": -77.03777778, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 42.94694444, "ARR_AIRPORT_LON": -87.89694444, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T10:56:00"}
28 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1127802", "ORIGIN": "DCA", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T12:35:00", "DEP_TIME": "2015-03-10T12:32:00", "DEP_DELAY": -3.0, "TAXI_OUT": 9.0, "WHEELS_OFF": "2015-03-10T12:41:00", "CRS_ARR_TIME": "2015-03-10T15:23:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 38.85194444, "DEP_AIRPORT_LON": -77.03777778, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:41:00"}
29 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1129202", "ORIGIN": "DEN", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T12:10:00", "DEP_TIME": "2015-03-10T12:07:00", "DEP_DELAY": -3.0, "TAXI_OUT": 14.0, "WHEELS_OFF": "2015-03-10T12:21:00", "CRS_ARR_TIME": "2015-03-10T14:10:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 39.86166667, "DEP_AIRPORT_LON": -104.67305556, "DEP_AIRPORT_TZOFFSET": -21600.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:21:00"}
30 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "NK", "ORIGIN_AIRPORT_SEQ_ID": "1129803", "ORIGIN": "DFW", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T11:20:00", "DEP_TIME": "2015-03-10T11:17:00", "DEP_DELAY": -3.0, "TAXI_OUT": 19.0, "WHEELS_OFF": "2015-03-10T11:36:00", "CRS_ARR_TIME": "2015-03-10T13:39:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 32.89694444, "DEP_AIRPORT_LON": -97.03805556, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T11:36:00"}
31 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1143302", "ORIGIN": "DTW", "DEST_AIRPORT_SEQ_ID": "1334205", "DEST": "MKE", "CRS_DEP_TIME": "2015-03-10T12:20:00", "DEP_TIME": "2015-03-10T12:19:00", "DEP_DELAY": -1.0, "TAXI_OUT": 14.0, "WHEELS_OFF": "2015-03-10T12:33:00", "CRS_ARR_TIME": "2015-03-10T13:29:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 42.2125, "DEP_AIRPORT_LON": -83.35333333, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 42.94694444, "ARR_AIRPORT_LON": -87.89694444, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:33:00"}
32 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1143302", "ORIGIN": "DTW", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T11:25:00", "DEP_TIME": "2015-03-10T11:22:00", "DEP_DELAY": -3.0, "TAXI_OUT": 15.0, "WHEELS_OFF": "2015-03-10T11:37:00", "CRS_ARR_TIME": "2015-03-10T13:20:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 42.2125, "DEP_AIRPORT_LON": -83.35333333, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T11:37:00"}
33 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1143302", "ORIGIN": "DTW", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T12:35:00", "DEP_TIME": "2015-03-10T12:32:00", "DEP_DELAY": -3.0, "TAXI_OUT": 18.0, "WHEELS_OFF": "2015-03-10T12:50:00", "CRS_ARR_TIME": "2015-03-10T14:28:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 42.2125, "DEP_AIRPORT_LON": -83.35333333, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:50:00"}
34 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1163703", "ORIGIN": "FAR", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T10:00:00", "DEP_TIME": "2015-03-10T10:20:00", "DEP_DELAY": 20.0, "TAXI_OUT": 8.0, "WHEELS_OFF": "2015-03-10T10:28:00", "CRS_ARR_TIME": "2015-03-10T11:12:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 46.92055556, "DEP_AIRPORT_LON": -96.81583333, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T10:28:00"}
35 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1169703", "ORIGIN": "FLL", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T11:25:00", "DEP_TIME": "2015-03-10T11:20:00", "DEP_DELAY": -5.0, "TAXI_OUT": 13.0, "WHEELS_OFF": "2015-03-10T11:33:00", "CRS_ARR_TIME": "2015-03-10T15:18:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 26.0725, "DEP_AIRPORT_LON": -80.15277778, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T11:33:00"}
36 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "OO", "ORIGIN_AIRPORT_SEQ_ID": "1182304", "ORIGIN": "FWA", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T11:34:00", "DEP_TIME": "2015-03-10T11:31:00", "DEP_DELAY": -3.0, "TAXI_OUT": 17.0, "WHEELS_OFF": "2015-03-10T11:48:00", "CRS_ARR_TIME": "2015-03-10T13:30:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 40.97833333, "DEP_AIRPORT_LON": -85.19527778, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T11:48:00"}
37 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1198603", "ORIGIN": "GRR", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T11:25:00", "DEP_TIME": "2015-03-10T11:18:00", "DEP_DELAY": -7.0, "TAXI_OUT": 16.0, "WHEELS_OFF": "2015-03-10T11:34:00", "CRS_ARR_TIME": "2015-03-10T12:59:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 42.88083333, "DEP_AIRPORT_LON": -85.52277778, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T11:34:00"}
38 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "OO", "ORIGIN_AIRPORT_SEQ_ID": "1212903", "ORIGIN": "HIB", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T12:14:00", "DEP_TIME": "2015-03-10T12:01:00", "DEP_DELAY": -13.0, "TAXI_OUT": 9.0, "WHEELS_OFF": "2015-03-10T12:10:00", "CRS_ARR_TIME": "2015-03-10T13:25:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 47.38666667, "DEP_AIRPORT_LON": -92.83888889, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:10:00"}
39 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1233904", "ORIGIN": "IND", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T11:12:00", "DEP_TIME": "2015-03-10T11:04:00", "DEP_DELAY": -8.0, "TAXI_OUT": 14.0, "WHEELS_OFF": "2015-03-10T11:18:00", "CRS_ARR_TIME": "2015-03-10T13:00:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 39.71722222, "DEP_AIRPORT_LON": -86.29472222, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T11:18:00"}
40 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "OO", "ORIGIN_AIRPORT_SEQ_ID": "1238902", "ORIGIN": "ISN", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T11:15:00", "DEP_TIME": "2015-03-10T11:08:00", "DEP_DELAY": -7.0, "TAXI_OUT": 41.0, "WHEELS_OFF": "2015-03-10T11:49:00", "CRS_ARR_TIME": "2015-03-10T13:11:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 48.17805556, "DEP_AIRPORT_LON": -103.64222222, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T11:49:00"}
41 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1295302", "ORIGIN": "LGA", "DEST_AIRPORT_SEQ_ID": "1334205", "DEST": "MKE", "CRS_DEP_TIME": "2015-03-10T12:30:00", "DEP_TIME": "2015-03-10T12:38:00", "DEP_DELAY": 8.0, "TAXI_OUT": 17.0, "WHEELS_OFF": "2015-03-10T12:55:00", "CRS_ARR_TIME": "2015-03-10T14:55:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 40.77722222, "DEP_AIRPORT_LON": -73.8725, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 42.94694444, "ARR_AIRPORT_LON": -87.89694444, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:55:00"}
42 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "OO", "ORIGIN_AIRPORT_SEQ_ID": "1307602", "ORIGIN": "LSE", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T11:20:00", "DEP_TIME": "2015-03-10T11:20:00", "DEP_DELAY": 0.0, "TAXI_OUT": 7.0, "WHEELS_OFF": "2015-03-10T11:27:00", "CRS_ARR_TIME": "2015-03-10T12:25:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 43.87916667, "DEP_AIRPORT_LON": -91.25666667, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T11:27:00"}
43 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1320402", "ORIGIN": "MCO", "DEST_AIRPORT_SEQ_ID": "1334205", "DEST": "MKE", "CRS_DEP_TIME": "2015-03-10T12:45:00", "DEP_TIME": "2015-03-10T12:44:00", "DEP_DELAY": -1.0, "TAXI_OUT": 9.0, "WHEELS_OFF": "2015-03-10T12:53:00", "CRS_ARR_TIME": "2015-03-10T15:45:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 28.42944444, "DEP_AIRPORT_LON": -81.30888889, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 42.94694444, "ARR_AIRPORT_LON": -87.89694444, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:53:00"}
44 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1348602", "ORIGIN": "MSO", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T11:45:00", "DEP_TIME": "2015-03-10T11:38:00", "DEP_DELAY": -7.0, "TAXI_OUT": 21.0, "WHEELS_OFF": "2015-03-10T11:59:00", "CRS_ARR_TIME": "2015-03-10T14:23:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 46.91638889, "DEP_AIRPORT_LON": -114.09055556, "DEP_AIRPORT_TZOFFSET": -21600.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T11:59:00"}
45 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "AA", "ORIGIN_AIRPORT_SEQ_ID": "1393003", "ORIGIN": "ORD", "DEST_AIRPORT_SEQ_ID": "1330303", "DEST": "MIA", "CRS_DEP_TIME": "2015-03-10T11:00:00", "DEP_TIME": "2015-03-10T10:54:00", "DEP_DELAY": -6.0, "TAXI_OUT": 12.0, "WHEELS_OFF": "2015-03-10T11:06:00", "CRS_ARR_TIME": "2015-03-10T14:03:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 41.97944444, "DEP_AIRPORT_LON": -87.9075, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 25.79527778, "ARR_AIRPORT_LON": -80.29, "ARR_AIRPORT_TZOFFSET": -14400.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T11:06:00"}
46 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "AA", "ORIGIN_AIRPORT_SEQ_ID": "1393003", "ORIGIN": "ORD", "DEST_AIRPORT_SEQ_ID": "1330303", "DEST": "MIA", "CRS_DEP_TIME": "2015-03-10T12:30:00", "DEP_TIME": "2015-03-10T12:21:00", "DEP_DELAY": -9.0, "TAXI_OUT": 16.0, "WHEELS_OFF": "2015-03-10T12:37:00", "CRS_ARR_TIME": "2015-03-10T15:38:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 41.97944444, "DEP_AIRPORT_LON": -87.9075, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 25.79527778, "ARR_AIRPORT_LON": -80.29, "ARR_AIRPORT_TZOFFSET": -14400.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:37:00"}
47 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "AA", "ORIGIN_AIRPORT_SEQ_ID": "1393003", "ORIGIN": "ORD", "DEST_AIRPORT_SEQ_ID": "1330303", "DEST": "MIA", "CRS_DEP_TIME": "2015-03-10T13:00:00", "DEP_TIME": "2015-03-10T12:55:00", "DEP_DELAY": -5.0, "TAXI_OUT": 13.0, "WHEELS_OFF": "2015-03-10T13:08:00", "CRS_ARR_TIME": "2015-03-10T16:06:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 41.97944444, "DEP_AIRPORT_LON": -87.9075, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 25.79527778, "ARR_AIRPORT_LON": -80.29, "ARR_AIRPORT_TZOFFSET": -14400.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:08:00"}
48 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "F9", "ORIGIN_AIRPORT_SEQ_ID": "1393003", "ORIGIN": "ORD", "DEST_AIRPORT_SEQ_ID": "1330303", "DEST": "MIA", "CRS_DEP_TIME": "2015-03-10T11:00:00", "DEP_TIME": "2015-03-10T10:56:00", "DEP_DELAY": -4.0, "TAXI_OUT": 21.0, "WHEELS_OFF": "2015-03-10T11:17:00", "CRS_ARR_TIME": "2015-03-10T14:00:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 41.97944444, "DEP_AIRPORT_LON": -87.9075, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 25.79527778, "ARR_AIRPORT_LON": -80.29, "ARR_AIRPORT_TZOFFSET": -14400.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T11:17:00"}
49 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "US", "ORIGIN_AIRPORT_SEQ_ID": "1410002", "ORIGIN": "PHL", "DEST_AIRPORT_SEQ_ID": "1330303", "DEST": "MIA", "CRS_DEP_TIME": "2015-03-10T11:35:00", "DEP_TIME": "2015-03-10T11:31:00", "DEP_DELAY": -4.0, "TAXI_OUT": 26.0, "WHEELS_OFF": "2015-03-10T11:57:00", "CRS_ARR_TIME": "2015-03-10T14:37:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 39.87222222, "DEP_AIRPORT_LON": -75.24083333, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 25.79527778, "ARR_AIRPORT_LON": -80.29, "ARR_AIRPORT_TZOFFSET": -14400.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T11:57:00"}
50 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1410702", "ORIGIN": "PHX", "DEST_AIRPORT_SEQ_ID": "1334205", "DEST": "MKE", "CRS_DEP_TIME": "2015-03-10T12:50:00", "DEP_TIME": "2015-03-10T12:47:00", "DEP_DELAY": -3.0, "TAXI_OUT": 8.0, "WHEELS_OFF": "2015-03-10T12:55:00", "CRS_ARR_TIME": "2015-03-10T16:10:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 33.43416667, "DEP_AIRPORT_LON": -112.01166667, "DEP_AIRPORT_TZOFFSET": -25200.0, "ARR_AIRPORT_LAT": 42.94694444, "ARR_AIRPORT_LON": -87.89694444, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:55:00"}
51 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "OO", "ORIGIN_AIRPORT_SEQ_ID": "1445702", "ORIGIN": "RAP", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T12:35:00", "DEP_TIME": "2015-03-10T12:20:00", "DEP_DELAY": -15.0, "TAXI_OUT": 12.0, "WHEELS_OFF": "2015-03-10T12:32:00", "CRS_ARR_TIME": "2015-03-10T14:22:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 44.04527778, "DEP_AIRPORT_LON": -103.05722222, "DEP_AIRPORT_TZOFFSET": -21600.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:32:00"}
52 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "EV", "ORIGIN_AIRPORT_SEQ_ID": "1452401", "ORIGIN": "RIC", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T11:15:00", "DEP_TIME": "2015-03-10T11:56:00", "DEP_DELAY": 41.0, "TAXI_OUT": 9.0, "WHEELS_OFF": "2015-03-10T12:05:00", "CRS_ARR_TIME": "2015-03-10T14:13:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 37.50527778, "DEP_AIRPORT_LON": -77.31972222, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:05:00"}
53 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "OO", "ORIGIN_AIRPORT_SEQ_ID": "1469606", "ORIGIN": "SBN", "DEST_AIRPORT_SEQ_ID": "1348702", "DEST": "MSP", "CRS_DEP_TIME": "2015-03-10T11:30:00", "DEP_TIME": "2015-03-10T11:19:00", "DEP_DELAY": -11.0, "TAXI_OUT": 44.0, "WHEELS_OFF": "2015-03-10T12:03:00", "CRS_ARR_TIME": "2015-03-10T13:16:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 41.70833333, "DEP_AIRPORT_LON": -86.31722222, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 44.88194444, "ARR_AIRPORT_LON": -93.22166667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:03:00"}
54 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1501603", "ORIGIN": "STL", "DEST_AIRPORT_SEQ_ID": "1334205", "DEST": "MKE", "CRS_DEP_TIME": "2015-03-10T13:40:00", "DEP_TIME": "2015-03-10T13:38:00", "DEP_DELAY": -2.0, "TAXI_OUT": 15.0, "WHEELS_OFF": "2015-03-10T13:53:00", "CRS_ARR_TIME": "2015-03-10T14:50:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 38.74861111, "DEP_AIRPORT_LON": -90.37, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 42.94694444, "ARR_AIRPORT_LON": -87.89694444, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:53:00"}
55 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1348702", "ORIGIN": "MSP", "DEST_AIRPORT_SEQ_ID": "1410702", "DEST": "PHX", "CRS_DEP_TIME": "2015-03-10T12:05:00", "DEP_TIME": "2015-03-10T12:00:00", "DEP_DELAY": -5.0, "TAXI_OUT": 10.0, "WHEELS_OFF": "2015-03-10T12:10:00", "CRS_ARR_TIME": "2015-03-10T15:28:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 44.88194444, "DEP_AIRPORT_LON": -93.22166667, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 33.43416667, "ARR_AIRPORT_LON": -112.01166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:10:00"}
56 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "UA", "ORIGIN_AIRPORT_SEQ_ID": "1393003", "ORIGIN": "ORD", "DEST_AIRPORT_SEQ_ID": "1530402", "DEST": "TPA", "CRS_DEP_TIME": "2015-03-10T11:00:00", "DEP_TIME": "2015-03-10T10:53:00", "DEP_DELAY": -7.0, "TAXI_OUT": 18.0, "WHEELS_OFF": "2015-03-10T11:11:00", "CRS_ARR_TIME": "2015-03-10T13:40:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 41.97944444, "DEP_AIRPORT_LON": -87.9075, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 27.97555556, "ARR_AIRPORT_LON": -82.53333333, "ARR_AIRPORT_TZOFFSET": -14400.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T11:11:00"}
57 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "US", "ORIGIN_AIRPORT_SEQ_ID": "1348702", "ORIGIN": "MSP", "DEST_AIRPORT_SEQ_ID": "1410702", "DEST": "PHX", "CRS_DEP_TIME": "2015-03-10T12:30:00", "DEP_TIME": "2015-03-10T12:21:00", "DEP_DELAY": -9.0, "TAXI_OUT": 26.0, "WHEELS_OFF": "2015-03-10T12:47:00", "CRS_ARR_TIME": "2015-03-10T16:16:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 44.88194444, "DEP_AIRPORT_LON": -93.22166667, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 33.43416667, "ARR_AIRPORT_LON": -112.01166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:47:00"}
58 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1154003", "ORIGIN": "ELP", "DEST_AIRPORT_SEQ_ID": "1468303", "DEST": "SAT", "CRS_DEP_TIME": "2015-03-10T12:15:00", "DEP_TIME": "2015-03-10T12:12:00", "DEP_DELAY": -3.0, "TAXI_OUT": 6.0, "WHEELS_OFF": "2015-03-10T12:18:00", "CRS_ARR_TIME": "2015-03-10T13:40:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 31.80722222, "DEP_AIRPORT_LON": -106.37638889, "DEP_AIRPORT_TZOFFSET": -21600.0, "ARR_AIRPORT_LAT": 29.53388889, "ARR_AIRPORT_LON": -98.46916667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:18:00"}
59 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "UA", "ORIGIN_AIRPORT_SEQ_ID": "1226402", "ORIGIN": "IAD", "DEST_AIRPORT_SEQ_ID": "1467903", "DEST": "SAN", "CRS_DEP_TIME": "2015-03-10T12:21:00", "DEP_TIME": "2015-03-10T12:16:00", "DEP_DELAY": -5.0, "TAXI_OUT": 9.0, "WHEELS_OFF": "2015-03-10T12:25:00", "CRS_ARR_TIME": "2015-03-10T18:12:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 38.9475, "DEP_AIRPORT_LON": -77.46, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 32.73361111, "ARR_AIRPORT_LON": -117.18972222, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:25:00"}
60 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "B6", "ORIGIN_AIRPORT_SEQ_ID": "1247802", "ORIGIN": "JFK", "DEST_AIRPORT_SEQ_ID": "1467903", "DEST": "SAN", "CRS_DEP_TIME": "2015-03-10T13:23:00", "DEP_TIME": "2015-03-10T13:15:00", "DEP_DELAY": -8.0, "TAXI_OUT": 23.0, "WHEELS_OFF": "2015-03-10T13:38:00", "CRS_ARR_TIME": "2015-03-10T19:39:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 40.63972222, "DEP_AIRPORT_LON": -73.77888889, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 32.73361111, "ARR_AIRPORT_LON": -117.18972222, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:38:00"}
61 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1082103", "ORIGIN": "BWI", "DEST_AIRPORT_SEQ_ID": "1468303", "DEST": "SAT", "CRS_DEP_TIME": "2015-03-10T13:30:00", "DEP_TIME": "2015-03-10T13:36:00", "DEP_DELAY": 6.0, "TAXI_OUT": 10.0, "WHEELS_OFF": "2015-03-10T13:46:00", "CRS_ARR_TIME": "2015-03-10T17:30:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 39.17527778, "DEP_AIRPORT_LON": -76.66833333, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 29.53388889, "ARR_AIRPORT_LON": -98.46916667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:46:00"}
62 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1323202", "ORIGIN": "MDW", "DEST_AIRPORT_SEQ_ID": "1468303", "DEST": "SAT", "CRS_DEP_TIME": "2015-03-10T13:40:00", "DEP_TIME": "2015-03-10T13:39:00", "DEP_DELAY": -1.0, "TAXI_OUT": 11.0, "WHEELS_OFF": "2015-03-10T13:50:00", "CRS_ARR_TIME": "2015-03-10T16:35:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 41.78583333, "DEP_AIRPORT_LON": -87.7525, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 29.53388889, "ARR_AIRPORT_LON": -98.46916667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:50:00"}
63 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "AS", "ORIGIN_AIRPORT_SEQ_ID": "1405702", "ORIGIN": "PDX", "DEST_AIRPORT_SEQ_ID": "1467903", "DEST": "SAN", "CRS_DEP_TIME": "2015-03-10T13:45:00", "DEP_TIME": "2015-03-10T13:37:00", "DEP_DELAY": -8.0, "TAXI_OUT": 18.0, "WHEELS_OFF": "2015-03-10T13:55:00", "CRS_ARR_TIME": "2015-03-10T16:10:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 45.58861111, "DEP_AIRPORT_LON": -122.59694444, "DEP_AIRPORT_TZOFFSET": -25200.0, "ARR_AIRPORT_LAT": 32.73361111, "ARR_AIRPORT_LON": -117.18972222, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:55:00"}
64 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "AS", "ORIGIN_AIRPORT_SEQ_ID": "1474703", "ORIGIN": "SEA", "DEST_AIRPORT_SEQ_ID": "1467903", "DEST": "SAN", "CRS_DEP_TIME": "2015-03-10T13:35:00", "DEP_TIME": "2015-03-10T13:49:00", "DEP_DELAY": 14.0, "TAXI_OUT": 10.0, "WHEELS_OFF": "2015-03-10T13:59:00", "CRS_ARR_TIME": "2015-03-10T16:17:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 47.45, "DEP_AIRPORT_LON": -122.31166667, "DEP_AIRPORT_TZOFFSET": -25200.0, "ARR_AIRPORT_LAT": 32.73361111, "ARR_AIRPORT_LON": -117.18972222, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:59:00"}
65 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1477101", "ORIGIN": "SFO", "DEST_AIRPORT_SEQ_ID": "1467903", "DEST": "SAN", "CRS_DEP_TIME": "2015-03-10T13:05:00", "DEP_TIME": "2015-03-10T13:00:00", "DEP_DELAY": -5.0, "TAXI_OUT": 12.0, "WHEELS_OFF": "2015-03-10T13:12:00", "CRS_ARR_TIME": "2015-03-10T14:40:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 37.61888889, "DEP_AIRPORT_LON": -122.375, "DEP_AIRPORT_TZOFFSET": -25200.0, "ARR_AIRPORT_LAT": 32.73361111, "ARR_AIRPORT_LON": -117.18972222, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:12:00"}
66 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1483103", "ORIGIN": "SJC", "DEST_AIRPORT_SEQ_ID": "1467903", "DEST": "SAN", "CRS_DEP_TIME": "2015-03-10T13:30:00", "DEP_TIME": "2015-03-10T13:27:00", "DEP_DELAY": -3.0, "TAXI_OUT": 14.0, "WHEELS_OFF": "2015-03-10T13:41:00", "CRS_ARR_TIME": "2015-03-10T14:50:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 37.36277778, "DEP_AIRPORT_LON": -121.92916667, "DEP_AIRPORT_TZOFFSET": -25200.0, "ARR_AIRPORT_LAT": 32.73361111, "ARR_AIRPORT_LON": -117.18972222, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:41:00"}
67 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1288903", "ORIGIN": "LAS", "DEST_AIRPORT_SEQ_ID": "1468303", "DEST": "SAT", "CRS_DEP_TIME": "2015-03-10T12:55:00", "DEP_TIME": "2015-03-10T13:01:00", "DEP_DELAY": 6.0, "TAXI_OUT": 24.0, "WHEELS_OFF": "2015-03-10T13:25:00", "CRS_ARR_TIME": "2015-03-10T15:25:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 36.08, "DEP_AIRPORT_LON": -115.15222222, "DEP_AIRPORT_TZOFFSET": -25200.0, "ARR_AIRPORT_LAT": 29.53388889, "ARR_AIRPORT_LON": -98.46916667, "ARR_AIRPORT_TZOFFSET": -18000.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:25:00"}
68 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "AS", "ORIGIN_AIRPORT_SEQ_ID": "1066602", "ORIGIN": "BLI", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T13:40:00", "DEP_TIME": "2015-03-10T13:37:00", "DEP_DELAY": -3.0, "TAXI_OUT": 15.0, "WHEELS_OFF": "2015-03-10T13:52:00", "CRS_ARR_TIME": "2015-03-10T14:20:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 48.79277778, "DEP_AIRPORT_LON": -122.5375, "DEP_AIRPORT_TZOFFSET": -25200.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:52:00"}
69 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "AS", "ORIGIN_AIRPORT_SEQ_ID": "1348702", "ORIGIN": "MSP", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T12:00:00", "DEP_TIME": "2015-03-10T11:55:00", "DEP_DELAY": -5.0, "TAXI_OUT": 15.0, "WHEELS_OFF": "2015-03-10T12:10:00", "CRS_ARR_TIME": "2015-03-10T15:40:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 44.88194444, "DEP_AIRPORT_LON": -93.22166667, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:10:00"}
70 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "AA", "ORIGIN_AIRPORT_SEQ_ID": "1393003", "ORIGIN": "ORD", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T13:25:00", "DEP_TIME": "2015-03-10T13:36:00", "DEP_DELAY": 11.0, "TAXI_OUT": 17.0, "WHEELS_OFF": "2015-03-10T13:53:00", "CRS_ARR_TIME": "2015-03-10T18:00:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 41.97944444, "DEP_AIRPORT_LON": -87.9075, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:53:00"}
71 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "AS", "ORIGIN_AIRPORT_SEQ_ID": "1393003", "ORIGIN": "ORD", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T11:00:00", "DEP_TIME": "2015-03-10T10:50:00", "DEP_DELAY": -10.0, "TAXI_OUT": 27.0, "WHEELS_OFF": "2015-03-10T11:17:00", "CRS_ARR_TIME": "2015-03-10T15:25:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 41.97944444, "DEP_AIRPORT_LON": -87.9075, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T11:17:00"}
72 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "AS", "ORIGIN_AIRPORT_SEQ_ID": "1393003", "ORIGIN": "ORD", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T13:00:00", "DEP_TIME": "2015-03-10T12:50:00", "DEP_DELAY": -10.0, "TAXI_OUT": 47.0, "WHEELS_OFF": "2015-03-10T13:37:00", "CRS_ARR_TIME": "2015-03-10T17:30:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 41.97944444, "DEP_AIRPORT_LON": -87.9075, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:37:00"}
73 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "AS", "ORIGIN_AIRPORT_SEQ_ID": "1379603", "ORIGIN": "OAK", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T13:10:00", "DEP_TIME": "2015-03-10T12:57:00", "DEP_DELAY": -13.0, "TAXI_OUT": 14.0, "WHEELS_OFF": "2015-03-10T13:11:00", "CRS_ARR_TIME": "2015-03-10T15:08:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 37.72277778, "DEP_AIRPORT_LON": -122.22138889, "DEP_AIRPORT_TZOFFSET": -25200.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:11:00"}
74 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "OO", "ORIGIN_AIRPORT_SEQ_ID": "1163805", "ORIGIN": "FAT", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T13:00:00", "DEP_TIME": "2015-03-10T12:50:00", "DEP_DELAY": -10.0, "TAXI_OUT": 12.0, "WHEELS_OFF": "2015-03-10T13:02:00", "CRS_ARR_TIME": "2015-03-10T15:07:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 36.77666667, "DEP_AIRPORT_LON": -119.71888889, "DEP_AIRPORT_TZOFFSET": -25200.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:02:00"}
75 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "AS", "ORIGIN_AIRPORT_SEQ_ID": "1389101", "ORIGIN": "ONT", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T13:00:00", "DEP_TIME": "2015-03-10T12:50:00", "DEP_DELAY": -10.0, "TAXI_OUT": 14.0, "WHEELS_OFF": "2015-03-10T13:04:00", "CRS_ARR_TIME": "2015-03-10T15:38:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 34.05611111, "DEP_AIRPORT_LON": -117.60111111, "DEP_AIRPORT_TZOFFSET": -25200.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:04:00"}
76 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "AS", "ORIGIN_AIRPORT_SEQ_ID": "1289203", "ORIGIN": "LAX", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T13:00:00", "DEP_TIME": "2015-03-10T12:53:00", "DEP_DELAY": -7.0, "TAXI_OUT": 21.0, "WHEELS_OFF": "2015-03-10T13:14:00", "CRS_ARR_TIME": "2015-03-10T15:46:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 33.9425, "DEP_AIRPORT_LON": -118.40805556, "DEP_AIRPORT_TZOFFSET": -25200.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:14:00"}
77 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "AS", "ORIGIN_AIRPORT_SEQ_ID": "1410702", "ORIGIN": "PHX", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T13:00:00", "DEP_TIME": "2015-03-10T12:56:00", "DEP_DELAY": -4.0, "TAXI_OUT": 9.0, "WHEELS_OFF": "2015-03-10T13:05:00", "CRS_ARR_TIME": "2015-03-10T15:58:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 33.43416667, "DEP_AIRPORT_LON": -112.01166667, "DEP_AIRPORT_TZOFFSET": -25200.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:05:00"}
78 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "UA", "ORIGIN_AIRPORT_SEQ_ID": "1477101", "ORIGIN": "SFO", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T13:17:00", "DEP_TIME": "2015-03-10T13:12:00", "DEP_DELAY": -5.0, "TAXI_OUT": 14.0, "WHEELS_OFF": "2015-03-10T13:26:00", "CRS_ARR_TIME": "2015-03-10T15:22:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 37.61888889, "DEP_AIRPORT_LON": -122.375, "DEP_AIRPORT_TZOFFSET": -25200.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:26:00"}
79 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1039705", "ORIGIN": "ATL", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T12:15:00", "DEP_TIME": "2015-03-10T12:21:00", "DEP_DELAY": 6.0, "TAXI_OUT": 17.0, "WHEELS_OFF": "2015-03-10T12:38:00", "CRS_ARR_TIME": "2015-03-10T17:45:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 33.63666667, "DEP_AIRPORT_LON": -84.42777778, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:38:00"}
80 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "B6", "ORIGIN_AIRPORT_SEQ_ID": "1072102", "ORIGIN": "BOS", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T12:55:00", "DEP_TIME": "2015-03-10T12:48:00", "DEP_DELAY": -7.0, "TAXI_OUT": 18.0, "WHEELS_OFF": "2015-03-10T13:06:00", "CRS_ARR_TIME": "2015-03-10T19:07:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 42.36305556, "DEP_AIRPORT_LON": -71.00638889, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:06:00"}
81 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1143302", "ORIGIN": "DTW", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T12:45:00", "DEP_TIME": "2015-03-10T12:41:00", "DEP_DELAY": -4.0, "TAXI_OUT": 18.0, "WHEELS_OFF": "2015-03-10T12:59:00", "CRS_ARR_TIME": "2015-03-10T17:39:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 42.2125, "DEP_AIRPORT_LON": -83.35333333, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:59:00"}
82 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "UA", "ORIGIN_AIRPORT_SEQ_ID": "1161802", "ORIGIN": "EWR", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T13:45:00", "DEP_TIME": "2015-03-10T13:46:00", "DEP_DELAY": 1.0, "TAXI_OUT": 10.0, "WHEELS_OFF": "2015-03-10T13:56:00", "CRS_ARR_TIME": "2015-03-10T20:01:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 40.6925, "DEP_AIRPORT_LON": -74.16861111, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:56:00"}
83 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "UA", "ORIGIN_AIRPORT_SEQ_ID": "1226402", "ORIGIN": "IAD", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T12:15:00", "DEP_TIME": "2015-03-10T12:09:00", "DEP_DELAY": -6.0, "TAXI_OUT": 16.0, "WHEELS_OFF": "2015-03-10T12:25:00", "CRS_ARR_TIME": "2015-03-10T18:00:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 38.9475, "DEP_AIRPORT_LON": -77.46, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:25:00"}
84 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1247802", "ORIGIN": "JFK", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T11:30:00", "DEP_TIME": "2015-03-10T12:02:00", "DEP_DELAY": 32.0, "TAXI_OUT": 27.0, "WHEELS_OFF": "2015-03-10T12:29:00", "CRS_ARR_TIME": "2015-03-10T17:40:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 40.63972222, "DEP_AIRPORT_LON": -73.77888889, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:29:00"}
85 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "AS", "ORIGIN_AIRPORT_SEQ_ID": "1163002", "ORIGIN": "FAI", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T09:50:00", "DEP_TIME": "2015-03-10T09:43:00", "DEP_DELAY": -7.0, "TAXI_OUT": 26.0, "WHEELS_OFF": "2015-03-10T10:09:00", "CRS_ARR_TIME": "2015-03-10T13:20:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 64.815, "DEP_AIRPORT_LON": -147.85638889, "DEP_AIRPORT_TZOFFSET": -28800.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T10:09:00"}
86 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "AS", "ORIGIN_AIRPORT_SEQ_ID": "1029904", "ORIGIN": "ANC", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T13:05:00", "DEP_TIME": "2015-03-10T13:01:00", "DEP_DELAY": -4.0, "TAXI_OUT": 14.0, "WHEELS_OFF": "2015-03-10T13:15:00", "CRS_ARR_TIME": "2015-03-10T16:24:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 61.17416667, "DEP_AIRPORT_LON": -149.99805556, "DEP_AIRPORT_TZOFFSET": -28800.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:15:00"}
87 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "AS", "ORIGIN_AIRPORT_SEQ_ID": "1129803", "ORIGIN": "DFW", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T13:00:00", "DEP_TIME": "2015-03-10T12:57:00", "DEP_DELAY": -3.0, "TAXI_OUT": 13.0, "WHEELS_OFF": "2015-03-10T13:10:00", "CRS_ARR_TIME": "2015-03-10T17:20:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 32.89694444, "DEP_AIRPORT_LON": -97.03805556, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:10:00"}
88 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "AA", "ORIGIN_AIRPORT_SEQ_ID": "1129803", "ORIGIN": "DFW", "DEST_AIRPORT_SEQ_ID": "1474703", "DEST": "SEA", "CRS_DEP_TIME": "2015-03-10T13:05:00", "DEP_TIME": "2015-03-10T13:01:00", "DEP_DELAY": -4.0, "TAXI_OUT": 18.0, "WHEELS_OFF": "2015-03-10T13:19:00", "CRS_ARR_TIME": "2015-03-10T17:34:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 32.89694444, "DEP_AIRPORT_LON": -97.03805556, "DEP_AIRPORT_TZOFFSET": -18000.0, "ARR_AIRPORT_LAT": 47.45, "ARR_AIRPORT_LON": -122.31166667, "ARR_AIRPORT_TZOFFSET": -25200.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:19:00"}
89 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "F9", "ORIGIN_AIRPORT_SEQ_ID": "1129202", "ORIGIN": "DEN", "DEST_AIRPORT_SEQ_ID": "1127802", "DEST": "DCA", "CRS_DEP_TIME": "2015-03-10T13:35:00", "DEP_TIME": "2015-03-10T13:32:00", "DEP_DELAY": -3.0, "TAXI_OUT": 9.0, "WHEELS_OFF": "2015-03-10T13:41:00", "CRS_ARR_TIME": "2015-03-10T16:50:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 39.86166667, "DEP_AIRPORT_LON": -104.67305556, "DEP_AIRPORT_TZOFFSET": -21600.0, "ARR_AIRPORT_LAT": 38.85194444, "ARR_AIRPORT_LON": -77.03777778, "ARR_AIRPORT_TZOFFSET": -14400.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:41:00"}
90 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "B6", "ORIGIN_AIRPORT_SEQ_ID": "1245102", "ORIGIN": "JAX", "DEST_AIRPORT_SEQ_ID": "1127802", "DEST": "DCA", "CRS_DEP_TIME": "2015-03-10T13:15:00", "DEP_TIME": "2015-03-10T13:10:00", "DEP_DELAY": -5.0, "TAXI_OUT": 15.0, "WHEELS_OFF": "2015-03-10T13:25:00", "CRS_ARR_TIME": "2015-03-10T15:06:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 30.49416667, "DEP_AIRPORT_LON": -81.68777778, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 38.85194444, "ARR_AIRPORT_LON": -77.03777778, "ARR_AIRPORT_TZOFFSET": -14400.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:25:00"}
91 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "B6", "ORIGIN_AIRPORT_SEQ_ID": "1169703", "ORIGIN": "FLL", "DEST_AIRPORT_SEQ_ID": "1127802", "DEST": "DCA", "CRS_DEP_TIME": "2015-03-10T13:35:00", "DEP_TIME": "2015-03-10T13:33:00", "DEP_DELAY": -2.0, "TAXI_OUT": 13.0, "WHEELS_OFF": "2015-03-10T13:46:00", "CRS_ARR_TIME": "2015-03-10T16:00:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 26.0725, "DEP_AIRPORT_LON": -80.15277778, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 38.85194444, "ARR_AIRPORT_LON": -77.03777778, "ARR_AIRPORT_TZOFFSET": -14400.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:46:00"}
92 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1233904", "ORIGIN": "IND", "DEST_AIRPORT_SEQ_ID": "1127802", "DEST": "DCA", "CRS_DEP_TIME": "2015-03-10T11:20:00", "DEP_TIME": "2015-03-10T11:19:00", "DEP_DELAY": -1.0, "TAXI_OUT": 9.0, "WHEELS_OFF": "2015-03-10T11:28:00", "CRS_ARR_TIME": "2015-03-10T12:55:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 39.71722222, "DEP_AIRPORT_LON": -86.29472222, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 38.85194444, "ARR_AIRPORT_LON": -77.03777778, "ARR_AIRPORT_TZOFFSET": -14400.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T11:28:00"}
93 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1039705", "ORIGIN": "ATL", "DEST_AIRPORT_SEQ_ID": "1127802", "DEST": "DCA", "CRS_DEP_TIME": "2015-03-10T13:40:00", "DEP_TIME": "2015-03-10T13:35:00", "DEP_DELAY": -5.0, "TAXI_OUT": 10.0, "WHEELS_OFF": "2015-03-10T13:45:00", "CRS_ARR_TIME": "2015-03-10T15:25:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 33.63666667, "DEP_AIRPORT_LON": -84.42777778, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 38.85194444, "ARR_AIRPORT_LON": -77.03777778, "ARR_AIRPORT_TZOFFSET": -14400.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:45:00"}
94 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1039705", "ORIGIN": "ATL", "DEST_AIRPORT_SEQ_ID": "1127802", "DEST": "DCA", "CRS_DEP_TIME": "2015-03-10T13:20:00", "DEP_TIME": "2015-03-10T13:15:00", "DEP_DELAY": -5.0, "TAXI_OUT": 14.0, "WHEELS_OFF": "2015-03-10T13:29:00", "CRS_ARR_TIME": "2015-03-10T15:01:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 33.63666667, "DEP_AIRPORT_LON": -84.42777778, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 38.85194444, "ARR_AIRPORT_LON": -77.03777778, "ARR_AIRPORT_TZOFFSET": -14400.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:29:00"}
95 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1039705", "ORIGIN": "ATL", "DEST_AIRPORT_SEQ_ID": "1127802", "DEST": "DCA", "CRS_DEP_TIME": "2015-03-10T12:20:00", "DEP_TIME": "2015-03-10T12:20:00", "DEP_DELAY": 0.0, "TAXI_OUT": 19.0, "WHEELS_OFF": "2015-03-10T12:39:00", "CRS_ARR_TIME": "2015-03-10T14:01:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 33.63666667, "DEP_AIRPORT_LON": -84.42777778, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 38.85194444, "ARR_AIRPORT_LON": -77.03777778, "ARR_AIRPORT_TZOFFSET": -14400.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:39:00"}
96 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "WN", "ORIGIN_AIRPORT_SEQ_ID": "1039705", "ORIGIN": "ATL", "DEST_AIRPORT_SEQ_ID": "1127802", "DEST": "DCA", "CRS_DEP_TIME": "2015-03-10T10:45:00", "DEP_TIME": "2015-03-10T10:39:00", "DEP_DELAY": -6.0, "TAXI_OUT": 15.0, "WHEELS_OFF": "2015-03-10T10:54:00", "CRS_ARR_TIME": "2015-03-10T12:20:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 33.63666667, "DEP_AIRPORT_LON": -84.42777778, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 38.85194444, "ARR_AIRPORT_LON": -77.03777778, "ARR_AIRPORT_TZOFFSET": -14400.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T10:54:00"}
97 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "B6", "ORIGIN_AIRPORT_SEQ_ID": "1072102", "ORIGIN": "BOS", "DEST_AIRPORT_SEQ_ID": "1127802", "DEST": "DCA", "CRS_DEP_TIME": "2015-03-10T12:42:00", "DEP_TIME": "2015-03-10T13:31:00", "DEP_DELAY": 49.0, "TAXI_OUT": 19.0, "WHEELS_OFF": "2015-03-10T13:50:00", "CRS_ARR_TIME": "2015-03-10T14:19:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 42.36305556, "DEP_AIRPORT_LON": -71.00638889, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 38.85194444, "ARR_AIRPORT_LON": -77.03777778, "ARR_AIRPORT_TZOFFSET": -14400.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T13:50:00"}
98 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "US", "ORIGIN_AIRPORT_SEQ_ID": "1072102", "ORIGIN": "BOS", "DEST_AIRPORT_SEQ_ID": "1127802", "DEST": "DCA", "CRS_DEP_TIME": "2015-03-10T11:00:00", "DEP_TIME": "2015-03-10T10:54:00", "DEP_DELAY": -6.0, "TAXI_OUT": 18.0, "WHEELS_OFF": "2015-03-10T11:12:00", "CRS_ARR_TIME": "2015-03-10T12:46:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 42.36305556, "DEP_AIRPORT_LON": -71.00638889, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 38.85194444, "ARR_AIRPORT_LON": -77.03777778, "ARR_AIRPORT_TZOFFSET": -14400.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T11:12:00"}
99 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "DL", "ORIGIN_AIRPORT_SEQ_ID": "1143302", "ORIGIN": "DTW", "DEST_AIRPORT_SEQ_ID": "1127802", "DEST": "DCA", "CRS_DEP_TIME": "2015-03-10T11:25:00", "DEP_TIME": "2015-03-10T11:23:00", "DEP_DELAY": -2.0, "TAXI_OUT": 15.0, "WHEELS_OFF": "2015-03-10T11:38:00", "CRS_ARR_TIME": "2015-03-10T13:00:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 42.2125, "DEP_AIRPORT_LON": -83.35333333, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 38.85194444, "ARR_AIRPORT_LON": -77.03777778, "ARR_AIRPORT_TZOFFSET": -14400.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T11:38:00"}
100 | {"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "EV", "ORIGIN_AIRPORT_SEQ_ID": "1161802", "ORIGIN": "EWR", "DEST_AIRPORT_SEQ_ID": "1127802", "DEST": "DCA", "CRS_DEP_TIME": "2015-03-10T10:36:00", "DEP_TIME": "2015-03-10T10:31:00", "DEP_DELAY": -5.0, "TAXI_OUT": 12.0, "WHEELS_OFF": "2015-03-10T10:43:00", "CRS_ARR_TIME": "2015-03-10T11:51:00", "CANCELLED": false, "DIVERTED": false, "DEP_AIRPORT_LAT": 40.6925, "DEP_AIRPORT_LON": -74.16861111, "DEP_AIRPORT_TZOFFSET": -14400.0, "ARR_AIRPORT_LAT": 38.85194444, "ARR_AIRPORT_LON": -77.03777778, "ARR_AIRPORT_TZOFFSET": -14400.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T10:43:00"}
101 |
--------------------------------------------------------------------------------
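
Each line of simevents_sample.json above is a single JSON event record. A minimal sketch (not part of the repo) of reading one record; the record below is abbreviated and hypothetical, but uses the same field names as the sample:

```
import json

# Abbreviated, hypothetical record with the same field names as the sample above
line = '{"FL_DATE": "2015-03-10", "UNIQUE_CARRIER": "OO", "DEP_DELAY": -11.0, "EVENT_TYPE": "wheelsoff", "EVENT_TIME": "2015-03-10T12:03:00"}'
event = json.loads(line)
print(event['EVENT_TYPE'], event['DEP_DELAY'])  # wheelsoff -11.0
```
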
/RealTimePrediction/realtime-intelligence-main/realtime/train_on_vertex.py:
--------------------------------------------------------------------------------
1 | # Copyright 2023 Google LLC
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # https://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 |
15 |
16 | import argparse
17 | import logging
18 | from datetime import datetime
19 | import tensorflow as tf
20 |
21 | from google.cloud import aiplatform
22 | from google.cloud.aiplatform import gapic as aip
23 | from google.cloud.aiplatform import hyperparameter_tuning as hpt
24 | # References to kfp removed (pipeline support is disabled in this version)
25 | #from kfp.v2 import compiler, dsl
26 | ENDPOINT_NAME = 'flights'
27 |
28 |
29 | def train_custom_model(data_set, timestamp, develop_mode, cpu_only_mode, tf_version, extra_args=None):
30 | # Set up training and deployment infra
31 |
32 | if cpu_only_mode:
33 | train_image='us-docker.pkg.dev/vertex-ai/training/tf-cpu.{}:latest'.format(tf_version)
34 | deploy_image='us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.{}:latest'.format(tf_version)
35 | else:
36 | train_image = "us-docker.pkg.dev/vertex-ai/training/tf-gpu.{}:latest".format(tf_version)
37 | deploy_image = "us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.{}:latest".format(tf_version)
38 |
39 | # train
40 | model_display_name = '{}-{}'.format(ENDPOINT_NAME, timestamp)
41 | job = aiplatform.CustomTrainingJob(
42 | display_name='train-{}'.format(model_display_name),
43 | script_path="model.py",
44 | container_uri=train_image,
45 | requirements=['cloudml-hypertune'], # any extra Python packages
46 | model_serving_container_image_uri=deploy_image
47 | )
48 | model_args = [
49 | '--bucket', BUCKET,
50 | ]
51 | if develop_mode:
52 | model_args += ['--develop']
53 | if extra_args:
54 | model_args += extra_args
55 |
56 | if cpu_only_mode:
57 | model = job.run(
58 | dataset=data_set,
59 | # See https://googleapis.dev/python/aiplatform/latest/aiplatform.html#
60 | predefined_split_column_name='data_split',
61 | model_display_name=model_display_name,
62 | args=model_args,
63 | replica_count=1,
64 | machine_type='n1-standard-4',
65 | sync=develop_mode
66 | )
67 | else:
68 | model = job.run(
69 | dataset=data_set,
70 | # See https://googleapis.dev/python/aiplatform/latest/aiplatform.html#
71 | predefined_split_column_name='data_split',
72 | model_display_name=model_display_name,
73 | args=model_args,
74 | replica_count=1,
75 | machine_type='n1-standard-4',
76 | # See https://cloud.google.com/vertex-ai/docs/general/locations#accelerators
77 | accelerator_type=aip.AcceleratorType.NVIDIA_TESLA_T4.name,
78 | accelerator_count=1,
79 | sync=develop_mode
80 | )
81 | return model
82 |
83 |
84 | def train_automl_model(data_set, timestamp, develop_mode):
85 | # train
86 | model_display_name = '{}-{}'.format(ENDPOINT_NAME, timestamp)
87 | job = aiplatform.AutoMLTabularTrainingJob(
88 | display_name='train-{}'.format(model_display_name),
89 | optimization_prediction_type='classification'
90 | )
91 | model = job.run(
92 | dataset=data_set,
93 | # See https://googleapis.dev/python/aiplatform/latest/aiplatform.html#
94 | predefined_split_column_name='data_split',
95 | target_column='ontime',
96 | model_display_name=model_display_name,
97 | budget_milli_node_hours=(300 if develop_mode else 2000),
98 | disable_early_stopping=False,
99 | export_evaluated_data_items=True,
100 | export_evaluated_data_items_bigquery_destination_uri='{}:flights.automl_evaluated'.format(PROJECT),
101 | export_evaluated_data_items_override_destination=True,
102 | sync=develop_mode
103 | )
104 | return model
105 |
106 |
107 | def do_hyperparameter_tuning(data_set, timestamp, develop_mode, cpu_only_mode, tf_version):
108 | # Vertex AI services require regional API endpoints.
109 | if cpu_only_mode:
110 | train_image='us-docker.pkg.dev/vertex-ai/training/tf-cpu.{}:latest'.format(tf_version)
111 | else:
112 | train_image = "us-docker.pkg.dev/vertex-ai/training/tf-gpu.{}:latest".format(tf_version)
113 |
114 | # a single trial job
115 | model_display_name = '{}-{}'.format(ENDPOINT_NAME, timestamp)
116 | if cpu_only_mode:
117 | trial_job = aiplatform.CustomJob.from_local_script(
118 | display_name='train-{}'.format(model_display_name),
119 | script_path="model.py",
120 | container_uri=train_image,
121 | args=[
122 | '--bucket', BUCKET,
123 | '--skip_full_eval', # no need to evaluate on test data set
124 | '--num_epochs', '10',
125 | '--num_examples', '500000' # 1/10 actual size to finish faster
126 | ],
127 | requirements=['cloudml-hypertune'], # any extra Python packages
128 | replica_count=1,
129 | machine_type='n1-standard-4'
130 | )
131 | else:
132 | trial_job = aiplatform.CustomJob.from_local_script(
133 | display_name='train-{}'.format(model_display_name),
134 | script_path="model.py",
135 | container_uri=train_image,
136 | args=[
137 | '--bucket', BUCKET,
138 | '--skip_full_eval', # no need to evaluate on test data set
139 | '--num_epochs', '10',
140 | '--num_examples', '500000' # 1/10 actual size to finish faster
141 | ],
142 | requirements=['cloudml-hypertune'], # any extra Python packages
143 | replica_count=1,
144 | machine_type='n1-standard-4',
145 | # See https://cloud.google.com/vertex-ai/docs/general/locations#accelerators
146 | accelerator_type=aip.AcceleratorType.NVIDIA_TESLA_T4.name,
147 | accelerator_count=1,
148 | )
149 |
150 | # the tuning job
151 | hparam_job = aiplatform.HyperparameterTuningJob(
152 | # See https://googleapis.dev/python/aiplatform/latest/aiplatform.html#
153 | display_name='hparam-{}'.format(model_display_name),
154 | custom_job=trial_job,
155 | metric_spec={'val_rmse': 'minimize'},
156 | parameter_spec={
157 | "train_batch_size": hpt.IntegerParameterSpec(min=16, max=256, scale='log'),
158 | "nbuckets": hpt.IntegerParameterSpec(min=5, max=10, scale='linear'),
159 | "dnn_hidden_units": hpt.CategoricalParameterSpec(values=["64,16", "64,16,4", "64,64,64,8", "256,64,16"])
160 | },
161 | max_trial_count=2 if develop_mode else NUM_HPARAM_TRIALS,
162 | parallel_trial_count=2,
163 | search_algorithm=None, # Bayesian
164 | )
165 |
166 | hparam_job.run(sync=True) # has to finish before we can get trials.
167 |
168 | # get the parameters corresponding to the best trial
169 | best = sorted(hparam_job.trials, key=lambda x: x.final_measurement.metrics[0].value)[0]
170 | logging.info('Best trial: {}'.format(best))
171 | best_params = []
172 | for param in best.parameters:
173 | best_params.append('--{}'.format(param.parameter_id))
174 |
175 | if param.parameter_id in ["train_batch_size", "nbuckets"]:
176 | # hparam returns 10.0 even though it's an integer param. so round it.
177 | # but CustomTrainingJob makes integer args into floats. so make it a string
178 | best_params.append(str(int(round(param.value))))
179 | else:
180 | # string or float parameters
181 | best_params.append(param.value)
182 |
183 | # run the best trial to completion
184 | logging.info('Launching full training job with {}'.format(best_params))
185 | return train_custom_model(data_set, timestamp, develop_mode, cpu_only_mode, tf_version, extra_args=best_params)
186 |
187 | # kfp @dsl.pipeline decorator removed along with pipeline support
188 | #@dsl.pipeline(name="flights-pipeline",
189 | # description="ds-on-gcp flights pipeline"
190 | #)
191 |
192 | def main():
193 | aiplatform.init(project=PROJECT, location=REGION, staging_bucket='gs://{}'.format(BUCKET))
194 |
195 | # create data set
196 | all_files = tf.io.gfile.glob('gs://{}/train/data/all*.csv'.format(BUCKET))
197 | logging.info("Training on {}".format(all_files))
198 | data_set = aiplatform.TabularDataset.create(
199 | display_name='data-{}'.format(ENDPOINT_NAME),
200 | gcs_source=all_files
201 | )
202 | if TF_VERSION is not None:
203 | tf_version = TF_VERSION.replace(".", "-")
204 | else:
205 | tf_version = '2-' + tf.__version__.split('.')[1]  # e.g. '2.12.0' -> '2-12'
206 |
207 | # train
208 | if AUTOML:
209 | model = train_automl_model(data_set, TIMESTAMP, DEVELOP_MODE)
210 | elif NUM_HPARAM_TRIALS > 1:
211 | model = do_hyperparameter_tuning(data_set, TIMESTAMP, DEVELOP_MODE, CPU_ONLY_MODE, tf_version)
212 | else:
213 | model = train_custom_model(data_set, TIMESTAMP, DEVELOP_MODE, CPU_ONLY_MODE, tf_version)
214 |
215 | # create endpoint if it doesn't already exist
216 | endpoints = aiplatform.Endpoint.list(
217 | filter='display_name="{}"'.format(ENDPOINT_NAME),
218 | order_by='create_time desc',
219 | project=PROJECT, location=REGION,
220 | )
221 | if len(endpoints) > 0:
222 | endpoint = endpoints[0] # most recently created
223 | else:
224 | endpoint = aiplatform.Endpoint.create(
225 | display_name=ENDPOINT_NAME, project=PROJECT, location=REGION,
226 | sync=DEVELOP_MODE
227 | )
228 |
229 | # deploy
230 | model.deploy(
231 | endpoint=endpoint,
232 | traffic_split={"0": 100},
233 | machine_type='n1-standard-2',
234 | min_replica_count=1,
235 | max_replica_count=1,
236 | sync=DEVELOP_MODE
237 | )
238 |
239 | if DEVELOP_MODE:
240 | model.wait()
241 |
242 | # run_pipeline() removed along with kfp support; kept below for reference
243 | #def run_pipeline():
244 | # compiler.Compiler().compile(pipeline_func=main, package_path='flights_pipeline.json')
245 |
246 | # job = aip.PipelineJob(
247 | # display_name="{}-pipeline".format(ENDPOINT_NAME),
248 | # template_path="{}_pipeline.json".format(ENDPOINT_NAME),
249 | # pipeline_root="{}/pipeline_root/intro".format(BUCKET),
250 | # enable_caching=False
251 | # )
252 |
253 | # job.run()
254 |
255 |
256 | if __name__ == '__main__':
257 | parser = argparse.ArgumentParser()
258 |
259 | parser.add_argument(
260 | '--bucket',
261 | help='Data will be read from gs://BUCKET/train/data and checkpoints will be in gs://BUCKET/train/trained_model',
262 | required=True
263 | )
264 | parser.add_argument(
265 | '--region',
266 | help='Where to run the trainer',
267 | default='us-central1'
268 | )
269 | parser.add_argument(
270 | '--project',
271 | help='Project to be billed',
272 | required=True
273 | )
274 | parser.add_argument(
275 | '--develop',
276 | help='Train on a small subset in development',
277 | dest='develop',
278 | action='store_true')
279 | parser.set_defaults(develop=False)
280 | parser.add_argument(
281 | '--automl',
282 | help='Train an AutoML Table, instead of using model.py',
283 | dest='automl',
284 | action='store_true')
285 | parser.set_defaults(automl=False)
286 | parser.add_argument(
287 | '--num_hparam_trials',
288 | help='Number of hyperparameter trials. 0/1 means no hyperparam. Ignored if --automl is set.',
289 | type=int,
290 | default=0)
291 | parser.add_argument(
292 | '--pipeline',
293 | help='Run as pipeline (currently a no-op: kfp pipeline support has been removed)',
294 | dest='pipeline',
295 | action='store_true')
296 | parser.add_argument(
297 | '--cpuonly',
298 | help='Run without GPU',
299 | dest='cpuonly',
300 | action='store_true')
301 | parser.set_defaults(cpuonly=False)
302 | parser.add_argument(
303 | '--tfversion',
304 | help='TensorFlow version to use'
305 | )
306 |
307 | # parse args
308 | logging.getLogger().setLevel(logging.INFO)
309 | args = parser.parse_args().__dict__
310 | BUCKET = args['bucket']
311 | PROJECT = args['project']
312 | REGION = args['region']
313 | DEVELOP_MODE = args['develop']
314 | CPU_ONLY_MODE = args['cpuonly']
315 | TF_VERSION = args['tfversion']
316 | AUTOML = args['automl']
317 | NUM_HPARAM_TRIALS = args['num_hparam_trials']
318 | TIMESTAMP = datetime.now().strftime("%Y%m%d%H%M%S")
319 |
320 | #if args['pipeline']:
321 | #run_pipeline()
322 | #else:
323 | main()
--------------------------------------------------------------------------------
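
One subtlety in do_hyperparameter_tuning() above: Vertex AI reports integer hyperparameters as floats (e.g. 10.0), and CustomTrainingJob coerces integer args back into floats, so the script rounds integer parameters and passes them as strings. A self-contained sketch (not part of the repo; the trial values are made up) of that round-trip:

```
from types import SimpleNamespace

# Hypothetical trial parameters, mimicking hparam_job.trials[i].parameters
params = [
    SimpleNamespace(parameter_id='train_batch_size', value=64.0),
    SimpleNamespace(parameter_id='nbuckets', value=10.0),
    SimpleNamespace(parameter_id='dnn_hidden_units', value='64,16,4'),
]

best_params = []
for param in params:
    best_params.append('--{}'.format(param.parameter_id))
    if param.parameter_id in ['train_batch_size', 'nbuckets']:
        best_params.append(str(int(round(param.value))))  # 10.0 -> '10'
    else:
        best_params.append(param.value)                   # strings pass through

print(best_params)
# ['--train_batch_size', '64', '--nbuckets', '10', '--dnn_hidden_units', '64,16,4']
```
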
/RealTimePrediction/realtime-intelligence-main/setup_env.sh:
--------------------------------------------------------------------------------
1 | # Copyright 2023 Google LLC
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # https://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | #Set Project Variables
15 | PROJECT_ID=${DEVSHELL_PROJECT_ID}
16 | PROJECT_NBR=$(gcloud projects describe $PROJECT_ID --format="value(projectNumber)")
17 | NETWORK=default
18 | SUBNET=default
19 | SUBNET_CIDR=10.6.0.0/24
20 | REGION=us-central1
21 | #
22 | #Enable APIs
23 | #
24 | gcloud services enable compute.googleapis.com
25 | gcloud services enable dataflow.googleapis.com
26 | gcloud services enable pubsub.googleapis.com
27 | gcloud services enable aiplatform.googleapis.com
28 | gcloud services enable logging.googleapis.com
29 | gcloud services enable serviceusage.googleapis.com
30 | gcloud services enable bigquery.googleapis.com
31 | gcloud services enable monitoring.googleapis.com
32 | #
33 | #Provide roles to the compute service account
34 | #
35 | gcloud projects add-iam-policy-binding $PROJECT_ID \
36 | --member=serviceAccount:$PROJECT_NBR-compute@developer.gserviceaccount.com \
37 | --role="roles/storage.objectAdmin"
38 |
39 | gcloud projects add-iam-policy-binding $PROJECT_ID \
40 | --member=serviceAccount:$PROJECT_NBR-compute@developer.gserviceaccount.com \
41 | --role="roles/bigquery.jobUser"
42 |
43 | gcloud projects add-iam-policy-binding $PROJECT_ID \
44 | --member=serviceAccount:$PROJECT_NBR-compute@developer.gserviceaccount.com \
45 | --role="roles/bigquery.dataEditor"
46 |
47 | gcloud projects add-iam-policy-binding $PROJECT_ID \
48 | --member=serviceAccount:$PROJECT_NBR-compute@developer.gserviceaccount.com \
49 | --role="roles/dataflow.worker"
50 |
51 | gcloud projects add-iam-policy-binding $PROJECT_ID \
52 | --member=serviceAccount:$PROJECT_NBR-compute@developer.gserviceaccount.com \
53 | --role="roles/dataflow.developer"
54 |
55 | gcloud projects add-iam-policy-binding $PROJECT_ID \
56 | --member=serviceAccount:$PROJECT_NBR-compute@developer.gserviceaccount.com \
57 | --role="roles/aiplatform.user"
58 |
59 | gcloud projects add-iam-policy-binding $PROJECT_ID \
60 | --member=serviceAccount:$PROJECT_NBR-compute@developer.gserviceaccount.com \
61 | --role="roles/pubsub.editor"
62 |
63 | #
64 | #Create VPC
65 | #
66 | #gcloud compute networks create $NETWORK \
67 | #--project=$PROJECT_ID \
68 | #--subnet-mode=custom \
69 | #--mtu=1460 \
70 | #--bgp-routing-mode=regional
71 | #
72 | #Create Subnet
73 | #
74 | #gcloud compute networks subnets create $SUBNET \
75 | # --network=$NETWORK \
76 | # --range=$SUBNET_CIDR \
77 | # --region=$REGION \
78 | # --enable-private-ip-google-access \
79 | # --project=$PROJECT_ID
80 | #
81 | #Create Firewall Rules
82 | #
83 | #gcloud compute --project=$PROJECT_ID firewall-rules create allow-intra-$SUBNET \
84 | #--direction=INGRESS \
85 | #--priority=1000 \
86 | #--network=$NETWORK \
87 | #--action=ALLOW \
88 | #--rules=all \
89 | #--source-ranges=$SUBNET_CIDR
90 |
91 |
92 | # If needed, disable the following organization policies:
93 | # constraints/compute.requireShieldedVm (requires Shielded VMs)
94 | # constraints/compute.vmExternalIpAccess (restricts external IP access)
95 |
96 |
--------------------------------------------------------------------------------
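
The seven IAM bindings in setup_env.sh above all grant roles to the same default compute service account. A sketch (not part of the repo; the project values are hypothetical) that generates the equivalent gcloud commands from a role list rather than repeating the binding by hand:

```
# Role list copied from setup_env.sh above
ROLES = [
    'roles/storage.objectAdmin', 'roles/bigquery.jobUser', 'roles/bigquery.dataEditor',
    'roles/dataflow.worker', 'roles/dataflow.developer',
    'roles/aiplatform.user', 'roles/pubsub.editor',
]

def bindings(project_id: str, project_nbr: str):
    # Default compute service account, as used in the script
    sa = f'{project_nbr}-compute@developer.gserviceaccount.com'
    for role in ROLES:
        yield (f'gcloud projects add-iam-policy-binding {project_id} '
               f'--member=serviceAccount:{sa} --role="{role}"')

for cmd in bindings('my-project', '123456789'):  # hypothetical values
    print(cmd)
```
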
/RealTimePrediction/realtime-intelligence-main/simulate/airports.csv.gz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google/real-time-intelligence-workshop/ea99fd5b6fcf44c4f006dc17adde8c42ca163479/RealTimePrediction/realtime-intelligence-main/simulate/airports.csv.gz
--------------------------------------------------------------------------------
/RealTimePrediction/realtime-intelligence-main/simulate/simulate.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # Copyright 2023 Google LLC
3 | #
4 | # Licensed under the Apache License, Version 2.0 (the "License");
5 | # you may not use this file except in compliance with the License.
6 | # You may obtain a copy of the License at
7 | #
8 | # https://www.apache.org/licenses/LICENSE-2.0
9 | #
10 | # Unless required by applicable law or agreed to in writing, software
11 | # distributed under the License is distributed on an "AS IS" BASIS,
12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | # See the License for the specific language governing permissions and
14 | # limitations under the License.
15 |
16 | import time
17 | import pytz
18 | import logging
19 | import argparse
20 | import datetime
21 | import google.cloud.pubsub_v1 as pubsub # Use v1 of the API
22 | import google.cloud.bigquery as bq
23 |
24 | TIME_FORMAT = '%Y-%m-%d %H:%M:%S %Z'
25 | RFC3339_TIME_FORMAT = '%Y-%m-%dT%H:%M:%S-00:00'
26 |
27 | def publish(publisher, topics, allevents, notify_time):
28 | timestamp = notify_time.strftime(RFC3339_TIME_FORMAT)
29 | for key in topics: # 'departed', 'arrived', etc.
30 | topic = topics[key]
31 | events = allevents[key]
32 | # the client automatically batches
33 | logging.info('Publishing {} {} till {}'.format(len(events), key, timestamp))
34 | for event_data in events:
35 | publisher.publish(topic, event_data.encode(), EventTimeStamp=timestamp)
36 |
37 | def notify(publisher, topics, rows, simStartTime, programStart, speedFactor):
38 | # sleep computation
39 | def compute_sleep_secs(notify_time):
40 | time_elapsed = (datetime.datetime.utcnow() - programStart).total_seconds()
41 | sim_time_elapsed = (notify_time - simStartTime).total_seconds() / speedFactor
42 | to_sleep_secs = sim_time_elapsed - time_elapsed
43 | return to_sleep_secs
44 |
45 | tonotify = {}
46 | for key in topics:
47 | tonotify[key] = list()
48 |
49 | for row in rows:
50 | event_type, notify_time, event_data = row
51 |
52 | # how much time should we sleep?
53 | if compute_sleep_secs(notify_time) > 1:
54 | # notify the accumulated tonotify
55 | publish(publisher, topics, tonotify, notify_time)
56 | for key in topics:
57 | tonotify[key] = list()
58 |
59 | # recompute sleep, since notification takes a while
60 | to_sleep_secs = compute_sleep_secs(notify_time)
61 | if to_sleep_secs > 0:
62 | logging.info('Sleeping {} seconds'.format(to_sleep_secs))
63 | time.sleep(to_sleep_secs)
64 | tonotify[event_type].append(event_data)
65 |
66 | # left-over records; notify again
67 | publish(publisher, topics, tonotify, notify_time)
68 |
69 |
70 | if __name__ == '__main__':
71 | parser = argparse.ArgumentParser(description='Send simulated flight events to Cloud Pub/Sub')
72 | parser.add_argument('--startTime', help='Example: 2015-05-01 00:00:00 UTC', required=True)
73 | parser.add_argument('--endTime', help='Example: 2015-05-03 00:00:00 UTC', required=True)
74 | parser.add_argument('--project', help='your project id, to create pubsub topic', required=True)
75 | parser.add_argument('--speedFactor', help='Example: 60 implies 1 hour of data sent to Cloud Pub/Sub in 1 minute', required=True, type=float)
76 | parser.add_argument('--jitter', help='type of jitter to add: one of None, uniform, exp', default='None')
77 |
78 | # set up BigQuery bqclient
79 | logging.basicConfig(format='%(levelname)s: %(message)s', level=logging.INFO)
80 | args = parser.parse_args()
81 | bqclient = bq.Client(args.project)
82 | bqclient.get_table('flights.flights_simevents') # throws exception on failure
83 |
84 | # jitter?
85 | if args.jitter == 'exp':
86 | jitter = 'CAST (-LN(RAND()*0.99 + 0.01)*30 + 90.5 AS INT64)'
87 | elif args.jitter == 'uniform':
88 | jitter = 'CAST(90.5 + RAND()*30 AS INT64)'
89 | else:
90 | jitter = '0'
91 |
92 |
93 | # run the query to pull simulated events
94 | querystr = """
95 | SELECT
96 | EVENT_TYPE,
97 | TIMESTAMP_ADD(EVENT_TIME, INTERVAL @jitter SECOND) AS NOTIFY_TIME,
98 | EVENT_DATA
99 | FROM
100 | flights.flights_simevents
101 | WHERE
102 | EVENT_TIME >= @startTime
103 | AND EVENT_TIME < @endTime
104 | ORDER BY
105 | EVENT_TIME ASC
106 | """
107 | job_config = bq.QueryJobConfig(
108 | query_parameters=[
109 | bq.ScalarQueryParameter("jitter", "INT64", jitter),
110 | bq.ScalarQueryParameter("startTime", "TIMESTAMP", args.startTime),
111 | bq.ScalarQueryParameter("endTime", "TIMESTAMP", args.endTime),
112 | ]
113 | )
114 | rows = bqclient.query(querystr, job_config=job_config)
115 |
116 | # create one Pub/Sub notification topic for each type of event
117 | publisher = pubsub.PublisherClient()
118 | topics = {}
119 | for event_type in ['wheelsoff', 'arrived', 'departed']:
120 | topics[event_type] = publisher.topic_path(args.project, event_type)
121 | try:
122 | publisher.get_topic(topic=topics[event_type])
123 | logging.info("Already exists: {}".format(topics[event_type]))
124 | except:  # get_topic raises if the topic does not exist yet
125 | logging.info("Creating {}".format(topics[event_type]))
126 | publisher.create_topic(name=topics[event_type])
127 |
128 |
129 | # notify about each row in the dataset
130 | programStartTime = datetime.datetime.utcnow()
131 | simStartTime = datetime.datetime.strptime(args.startTime, TIME_FORMAT).replace(tzinfo=pytz.UTC)
132 | logging.info('Simulation start time is {}'.format(simStartTime))
133 | notify(publisher, topics, rows, simStartTime, programStartTime, args.speedFactor)
134 |
--------------------------------------------------------------------------------
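
The pacing in notify() above maps simulated time onto wall-clock time via speedFactor. A self-contained sketch (not part of the repo) with concrete numbers: at speedFactor=60, one simulated hour should replay in one real minute, so 30 seconds into the run the simulator still needs to sleep 30 seconds before publishing an event stamped one simulated hour in:

```
import datetime

def to_sleep_secs(notify_time, sim_start, program_start, now, speed_factor):
    # Same arithmetic as compute_sleep_secs() in simulate.py above
    time_elapsed = (now - program_start).total_seconds()
    sim_time_elapsed = (notify_time - sim_start).total_seconds() / speed_factor
    return sim_time_elapsed - time_elapsed

sim_start = datetime.datetime(2015, 5, 1, 0, 0, 0)
notify_time = datetime.datetime(2015, 5, 1, 1, 0, 0)   # 1 simulated hour later
program_start = datetime.datetime(2023, 1, 1, 12, 0, 0)  # hypothetical run start
now = program_start + datetime.timedelta(seconds=30)     # 30 real seconds in

# 3600 sim-seconds / 60 = 60 real seconds target; 30 already elapsed -> sleep 30
print(to_sleep_secs(notify_time, sim_start, program_start, now, speed_factor=60))  # 30.0
```
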
/RealTimePrediction/realtime-intelligence-main/simulate_flight.sh:
--------------------------------------------------------------------------------
1 | # Copyright 2023 Google LLC
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # https://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | #
15 | # Set environment variables
16 | #
17 | export PROJECT_ID=$(gcloud info --format='value(config.project)')
18 | export BUCKET=$PROJECT_ID-ml
19 | #
20 | # Change directory to simulate directory
21 | #
22 | cd ./simulate
23 | #
24 | # Simulate Flights:
25 | #
26 | python3 ./simulate.py --startTime '2015-02-01 00:00:00 UTC' --endTime '2015-03-03 00:00:00 UTC' --speedFactor=30 --project $PROJECT_ID
27 | cd ..
--------------------------------------------------------------------------------
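
A quick check (not part of the repo) of what --speedFactor=30 implies for the window used in simulate_flight.sh: 2015-02-01 through 2015-03-03 is 30 days of events, replayed 30x faster, i.e. roughly one day of wall-clock time:

```
from datetime import datetime

# Simulation window from simulate_flight.sh above
window = datetime(2015, 3, 3) - datetime(2015, 2, 1)
print(window.total_seconds() / 30 / 3600, 'hours')  # 24.0 hours of wall-clock replay
```
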
/RealTimePrediction/realtime-intelligence-main/stage_data.sh:
--------------------------------------------------------------------------------
1 | # Copyright 2023 Google LLC
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # https://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | #
15 | # Create staging environment to generate data needed
16 | #
17 | export PROJECT_ID=${DEVSHELL_PROJECT_ID}
18 | export BUCKET=$PROJECT_ID-ml
19 | #
20 | #Create a bucket to stage raw flights data from the US Bureau of Transportation Statistics (https://www.transtats.bts.gov/)
21 | #
22 | gsutil mb -l us-central1 gs://$BUCKET
23 | #
24 | # Download Flight On Time Data from BTS
25 | #
26 | YEAR=2015
27 | #
28 | #BTS URL
29 | #
30 | #SOURCE=https://transtats.bts.gov/PREZIP
31 | #
32 | #Using a mirror
33 | #
34 | SOURCE=https://storage.googleapis.com/data-science-on-gcp/edition2/raw
35 | BASEURL="${SOURCE}/On_Time_Reporting_Carrier_On_Time_Performance_1987_present"
36 |
37 |
38 | for MONTH in `seq 1 2`; do
39 | echo "Downloading YEAR=$YEAR ... MONTH=$MONTH ... from $BASEURL"
40 | MONTH2=$(printf "%02d" $MONTH)
41 | #
42 | #Create a temp directory to store downloaded zip files
43 | #
44 | TMPDIR=$(mktemp -d)
45 | ZIPFILE=${TMPDIR}/${YEAR}_${MONTH2}.zip
46 | echo $ZIPFILE
47 | curl -o $ZIPFILE ${BASEURL}_${YEAR}_${MONTH}.zip
48 | unzip -d $TMPDIR $ZIPFILE
49 | gsutil cp $TMPDIR/*.csv gs://$BUCKET/flights/raw/${YEAR}${MONTH2}.csv
50 | rm -rf $TMPDIR
51 | done
52 | #
53 | #Define Schema for BQ flights_raw table
54 | #
55 | SCHEMA=Year:STRING,Quarter:STRING,Month:STRING,DayofMonth:STRING,DayOfWeek:STRING,FlightDate:DATE,Reporting_Airline:STRING,DOT_ID_Reporting_Airline:STRING,IATA_CODE_Reporting_Airline:STRING,Tail_Number:STRING,Flight_Number_Reporting_Airline:STRING,OriginAirportID:STRING,OriginAirportSeqID:STRING,OriginCityMarketID:STRING,Origin:STRING,OriginCityName:STRING,OriginState:STRING,OriginStateFips:STRING,OriginStateName:STRING,OriginWac:STRING,DestAirportID:STRING,DestAirportSeqID:STRING,DestCityMarketID:STRING,Dest:STRING,DestCityName:STRING,DestState:STRING,DestStateFips:STRING,DestStateName:STRING,DestWac:STRING,CRSDepTime:STRING,DepTime:STRING,DepDelay:STRING,DepDelayMinutes:STRING,DepDel15:STRING,DepartureDelayGroups:STRING,DepTimeBlk:STRING,TaxiOut:STRING,WheelsOff:STRING,WheelsOn:STRING,TaxiIn:STRING,CRSArrTime:STRING,ArrTime:STRING,ArrDelay:STRING,ArrDelayMinutes:STRING,ArrDel15:STRING,ArrivalDelayGroups:STRING,ArrTimeBlk:STRING,Cancelled:STRING,CancellationCode:STRING,Diverted:STRING,CRSElapsedTime:STRING,ActualElapsedTime:STRING,AirTime:STRING,Flights:STRING,Distance:STRING,DistanceGroup:STRING,CarrierDelay:STRING,WeatherDelay:STRING,NASDelay:STRING,SecurityDelay:STRING,LateAircraftDelay:STRING,FirstDepTime:STRING,TotalAddGTime:STRING,LongestAddGTime:STRING,DivAirportLandings:STRING,DivReachedDest:STRING,DivActualElapsedTime:STRING,DivArrDelay:STRING,DivDistance:STRING,Div1Airport:STRING,Div1AirportID:STRING,Div1AirportSeqID:STRING,Div1WheelsOn:STRING,Div1TotalGTime:STRING,Div1LongestGTime:STRING,Div1WheelsOff:STRING,Div1TailNum:STRING,Div2Airport:STRING,Div2AirportID:STRING,Div2AirportSeqID:STRING,Div2WheelsOn:STRING,Div2TotalGTime:STRING,Div2LongestGTime:STRING,Div2WheelsOff:STRING,Div2TailNum:STRING,Div3Airport:STRING,Div3AirportID:STRING,Div3AirportSeqID:STRING,Div3WheelsOn:STRING,Div3TotalGTime:STRING,Div3LongestGTime:STRING,Div3WheelsOff:STRING,Div3TailNum:STRING,Div4Airport:STRING,Div4AirportID:STRING,Div4AirportSeqID:STRING,Div4WheelsOn:STRING,Div4TotalGTime:STRING,Div4LongestGTime:STRING,Div4WheelsOff:STRING,Div4TailNum:STRING,Div5Airport:STRING,Div5AirportID:STRING,Div5AirportSeqID:STRING,Div5WheelsOn:STRING,Div5TotalGTime:STRING,Div5LongestGTime:STRING,Div5WheelsOff:STRING,Div5TailNum:STRING
56 | #
57 | #Create BQ Dataset
58 | #
59 | bq --project_id $PROJECT_ID show flights || bq mk --sync flights
60 | #
61 | # Create BQ table flights_raw in flights dataset and load the raw CSV files copied in the previous steps
62 | #
63 | for MONTH in `seq -w 1 2`; do
64 | CSVFILE=gs://$BUCKET/flights/raw/20150$MONTH.csv
65 | bq --project_id $PROJECT_ID --sync load \
66 | --time_partitioning_field=FlightDate --time_partitioning_type=MONTH \
67 | --source_format=CSV --ignore_unknown_values --skip_leading_rows=1 --schema=$SCHEMA \
68 | --replace $PROJECT_ID:flights.flights_raw\$20150$MONTH $CSVFILE
69 | done
70 | #
71 | # Copy all flights time zone corrected JSON file
72 | #
73 | gsutil -m cp gs://data-science-on-gcp/edition2/flights/tzcorr/all_flights-00000-of-00026 gs://${BUCKET}/flights/tzcorr/
74 | #
75 | # Create BQ table flights_tzcorr - time zone corrected
76 | #
77 | bq --project_id $PROJECT_ID load \
78 | --source_format=NEWLINE_DELIMITED_JSON \
79 | --autodetect ${PROJECT_ID}:flights.flights_tzcorr gs://${BUCKET}/flights/tzcorr/all_flights-*
80 | #
81 | #Copy Airport Information
82 | #
83 | gsutil cp gs://data-science-on-gcp/edition2/raw/airports.csv gs://${BUCKET}/flights/airports/airports.csv
84 | #
85 | # Create BQ airports table
86 | #
87 | bq --project_id=$PROJECT_ID load --autodetect --replace --source_format=CSV ${PROJECT_ID}:flights.airports gs://${BUCKET}/flights/airports/airports.csv
88 | #
89 | # Copy Simulated events file
90 | #
91 | gsutil -m cp gs://cloud-training/gsp201/simevents/flights_simevents_dump00000000000*.csv.gz gs://$BUCKET/
92 | #
93 | # Create BQ Simevents Table
94 | #
95 | bq load --replace --autodetect --source_format=CSV flights.flights_simevents gs://$BUCKET/flights_simevents_dump00000000000*.csv.gz
96 |
--------------------------------------------------------------------------------
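
For reference, the flights_tzcorr load above can also be done from Python with the BigQuery client library. A sketch (not part of the repo; the project and bucket names are placeholders for $PROJECT_ID and $BUCKET):

```
from google.cloud import bigquery

client = bigquery.Client(project='my-project')  # placeholder for $PROJECT_ID
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    autodetect=True,  # same as --autodetect in the bq CLI call above
)
uri = 'gs://my-project-ml/flights/tzcorr/all_flights-*'  # placeholder for $BUCKET
job = client.load_table_from_uri(uri, 'my-project.flights.flights_tzcorr',
                                 job_config=job_config)
job.result()  # block until the load job completes
```
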
/RealTimePrediction/realtime-intelligence-main/train_model.sh:
--------------------------------------------------------------------------------
1 | # Copyright 2023 Google LLC
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # https://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | #
15 | # Set environment variables
16 | #
17 | export PROJECT_ID=$(gcloud info --format='value(config.project)')
18 | export BUCKET=$PROJECT_ID-ml
19 | #
20 | # Train custom ML model on the enriched dataset:
21 | #
22 | cd ./realtime
23 | python3 train_on_vertex.py --project $PROJECT_ID --bucket $BUCKET --region us-central1 --develop --cpuonly
24 | #
25 | #In the Cloud Console, on the Navigation menu,
26 | #click Vertex AI > Training to monitor the training pipeline.
27 | #When the status is Finished, click on the training pipeline name to monitor the deployment status.
28 | #Note: It will take around 20 minutes to complete the model training and deployment.
--------------------------------------------------------------------------------
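
Once train_model.sh reports the training pipeline as Finished, the deployment can be sanity-checked from Python. A sketch (not part of the repo; project and region are placeholders) that mirrors the Endpoint.list lookup in train_on_vertex.py:

```
from google.cloud import aiplatform

aiplatform.init(project='my-project', location='us-central1')  # placeholders

# Same lookup as main() in train_on_vertex.py: most recent 'flights' endpoint
endpoints = aiplatform.Endpoint.list(filter='display_name="flights"',
                                     order_by='create_time desc')
if endpoints:
    endpoint = endpoints[0]
    print('Deployed models:', [m.display_name for m in endpoint.list_models()])
else:
    print('No endpoint named "flights" found; deployment may still be running.')
```
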