├── .gitignore ├── 01 - Regression ├── -- TensorBoard.ipynb ├── 00.0 - TensorFlow Version Update.ipynb ├── 01.0 - Regression Data Generation.ipynb ├── 02.0 - TF Regression Model - Estimator APIs + Pandas.ipynb ├── 03.0 - TF Regression Model - Experiment APIs + CSV Files.ipynb ├── 04.0 - TF Regression Model - Dataset Input + JSON Serving.ipynb ├── 04.0 - TF Regression Model - Dataset Input.ipynb ├── 05.0 - TF Regression Model - Custom Estimator.ipynb ├── 06.0 - Convert CSV to TFRecords.ipynb ├── 07.0 - TF Regression Model - DNN Wide & Deep + estimator.train_and_evaluate.ipynb ├── 08.0 - TF Regression Example - Housing Price Estimation + Features Scaling.ipynb └── data │ ├── housingdata.csv │ ├── new-data.csv │ ├── new-data.json │ ├── test-data.csv │ ├── test-data.tfrecords │ ├── train-data.csv │ ├── train-data.tfrecords │ ├── valid-data.csv │ └── valid-data.tfrecords ├── 02 - Classification ├── -- TensorBoard.ipynb ├── 00.0 - TensorFlow Version Update.ipynb ├── 01.0 - Classification Data Generation.ipynb ├── 02.0 - Convert CSV to TFRecords.ipynb ├── 03.0 - TF Classification Model - DNN Wide & Deep + Train_And_Evaluate + Dataset + TFRecords.ipynb ├── 04.0 - TF Classification Model - Custom Estimator + Experiment + Dataset + CSV.ipynb ├── 05.0 - Classification Example - Census Income Prediction.ipynb ├── 06.0 - Classification Example - Census Income Prediction - Custom Estimator + Exponential Decay Learning Rate.ipynb └── data │ ├── adult.data.csv │ ├── adult.stats.csv │ ├── adult.test.csv │ ├── test-data.csv │ ├── test-data.tfrecords │ ├── train-data.csv │ ├── train-data.tfrecords │ ├── valid-data.csv │ └── valid-data.tfrecords ├── 03 - Clustering ├── 00.0 - TensorFlow Version Update.ipynb ├── 01.0 - Generate Data Points + SKLearn Clustering.ipynb ├── 02.0 - TF k-means - Estimator API.ipynb ├── 03.0 - TF k-means - Experiment API.ipynb └── data │ ├── new-data.csv │ ├── test-data.csv │ └── train-data.csv ├── 04 - Times Series ├── 00.0 - Generate Time Series Data.ipynb ├── 01.0 - TF ARRegressor - Estimator + Numpy.ipynb ├── 02.0 - TF ARRegressor - Experiment + CSV.ipynb └── data │ ├── test-data.csv │ ├── timeseries-multivariate.txt │ ├── timeseries-univariate.csv │ └── train-data.csv ├── 05 - Autoencoding ├── 01.0 - Generate Dataset with High-Dimensionality.ipynb ├── 02.0 - Dimensionality Reduction - Autoencoding + Custom Estimator.ipynb ├── 03.0 - Dimensionality Reduction - Autoencoding + Normalizer + XEntropy Loss.ipynb ├── 04.0 - Dimensionality Reduction - Autoencoding + Custom Estimator with MNIST.ipynb └── data │ └── data-01.csv ├── 06 - Sequence Models ├── 01 - RNN with LSTM - Predicting the Next Values - Single Pattern.ipynb ├── 02 - RNN with LSTM - Predicting the Next Values - Multiple Patterns.ipynb ├── 03 - RNN with LSTM - Sequence Classification.ipynb ├── TODO.txt └── data │ ├── seq01.test.csv │ └── seq01.train.csv ├── 07 - Image Analysis ├── 00.0 - TensorFlow Version Update.ipynb ├── 01.0 - CNN Example with CIFAR-10 dataset.ipynb ├── 02.0 - CNN Example with CIFAR-10 dataset using TFRecords.ipynb └── 03.0 - CNN Example with CIFAR-10 (Keras ver.).ipynb ├── 08 - Text Analysis ├── 01 - Text Classification - SMS Ham vs. Spam - Data Preparation.ipynb ├── 02 - Text Classification - SMS Ham vs. Spam - Document Embedding.ipynb ├── 03 - Text Classification - SMS Ham vs. Spam - Word Embeddings + CNN.ipynb ├── 04 - Text Classification - SMS Ham vs. 
Spam - Word Embeddings + LSTM.ipynb ├── 05 - Text Classification - Hacker News - End-to-End + TF-Hub Sentence Embedding.ipynb ├── 06 - Part_1 - Text Classification - Hacker News - Data Preprocessing with TFT.ipynb ├── 06 - Part_2 - Text Classification - Hacker News - DNNClassifier with TF-Hub Sentence Embedding.ipynb ├── 06 - Part_3 - Text Classification - Hacker News - Custom Estimator Word Embedding.ipynb ├── 06 - Part_4 - Text Classification - Hacker News - DNNClassifier with TF.IDF.ipynb └── data │ └── sms-spam │ ├── SMSSpamCollection │ ├── n_words.tsv │ ├── train-data.tsv │ ├── valid-data.tsv │ └── vocab_list.tsv ├── README.md └── images └── exp-api2.png /.gitignore: -------------------------------------------------------------------------------- 1 | 01 - Regression/trained_models 2 | 01 - Regression/.ipynb_checkpoints 3 | 01 - Regression/.DS_Store 4 | 02 - Classification/trained_models 5 | 02 - Classification/.ipynb_checkpoints 6 | 02 - Classification/.DS_Store 7 | 03 - Clustering/trained_models 8 | 03 - Clustering/.ipynb_checkpoints 9 | 03 - Clustering/.DS_Store 10 | 04 - Times Series/trained_models 11 | 04 - Times Series/.ipynb_checkpoints 12 | 04 - Times Series/.DS_Store 13 | 05 - Autoencoding/trained_models 14 | 05 - Autoencoding/.ipynb_checkpoints 15 | 05 - Autoencoding/.DS_Store 16 | .ipynb_checkpoints 17 | .DS_Store 18 | -------------------------------------------------------------------------------- /01 - Regression/-- TensorBoard.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "MODEL_NAME = 'reg-model-01'\n", 10 | "model_dir = 'trained_models/{}'.format(MODEL_NAME)\n", 11 | "print(model_dir)" 12 | ] 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | "metadata": {}, 17 | "source": [ 18 | "## Start TensorBoard Process" 19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "execution_count": null, 24 | "metadata": {}, 25 | "outputs": [], 26 | "source": [ 27 | "from google.datalab.ml import TensorBoard\n", 28 | "TensorBoard().start(model_dir)\n", 29 | "TensorBoard().list()" 30 | ] 31 | }, 32 | { 33 | "cell_type": "markdown", 34 | "metadata": {}, 35 | "source": [ 36 | "## Kill TensorBoard Process" 37 | ] 38 | }, 39 | { 40 | "cell_type": "code", 41 | "execution_count": null, 42 | "metadata": {}, 43 | "outputs": [], 44 | "source": [ 45 | "# to stop TensorBoard\n", 46 | "TensorBoard().stop(23002)\n", 47 | "print('stopped TensorBoard')\n", 48 | "TensorBoard().list()" 49 | ] 50 | } 51 | ], 52 | "metadata": { 53 | "kernelspec": { 54 | "display_name": "Python 3", 55 | "language": "python", 56 | "name": "python3" 57 | }, 58 | "language_info": { 59 | "codemirror_mode": { 60 | "name": "ipython", 61 | "version": 3 62 | }, 63 | "file_extension": ".py", 64 | "mimetype": "text/x-python", 65 | "name": "python", 66 | "nbconvert_exporter": "python", 67 | "pygments_lexer": "ipython3", 68 | "version": "3.6.1" 69 | } 70 | }, 71 | "nbformat": 4, 72 | "nbformat_minor": 2 73 | } 74 | -------------------------------------------------------------------------------- /01 - Regression/00.0 - TensorFlow Version Update.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [ 8 | { 9 | "name": "stdout", 10 | "output_type": "stream", 11 | "text": [ 12 | "Collecting tensorflow\n", 13 | " 
Downloading tensorflow-1.4.0-cp36-cp36m-macosx_10_11_x86_64.whl (39.3MB)\n", 14 | "Collecting tensorflow-tensorboard<0.5.0,>=0.4.0rc1 (from tensorflow)\n", 15 | " Downloading tensorflow_tensorboard-0.4.0rc2-py3-none-any.whl (1.7MB)\n", 16 | "Requirement already up-to-date: protobuf>=3.3.0 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow)\n", 17 | "Requirement already up-to-date: numpy>=1.12.1 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow)\n", 18 | "Requirement already up-to-date: wheel>=0.26 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow)\n", 19 | "Collecting enum34>=1.1.6 (from tensorflow)\n", 20 | " Downloading enum34-1.1.6-py3-none-any.whl\n", 21 | "Requirement already up-to-date: six>=1.10.0 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow)\n", 22 | "Requirement already up-to-date: werkzeug>=0.11.10 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow-tensorboard<0.5.0,>=0.4.0rc1->tensorflow)\n", 23 | "Requirement already up-to-date: html5lib==0.9999999 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow-tensorboard<0.5.0,>=0.4.0rc1->tensorflow)\n", 24 | "Requirement already up-to-date: markdown>=2.6.8 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow-tensorboard<0.5.0,>=0.4.0rc1->tensorflow)\n", 25 | "Requirement already up-to-date: bleach==1.5.0 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow-tensorboard<0.5.0,>=0.4.0rc1->tensorflow)\n", 26 | "Requirement already up-to-date: setuptools in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from protobuf>=3.3.0->tensorflow)\n", 27 | "Installing collected packages: tensorflow-tensorboard, enum34, tensorflow\n", 28 | " Found existing installation: tensorflow-tensorboard 0.1.8\n", 29 | " Uninstalling tensorflow-tensorboard-0.1.8:\n", 30 | " Successfully uninstalled tensorflow-tensorboard-0.1.8\n", 31 | " Found existing installation: tensorflow 1.3.0\n", 32 | " Uninstalling tensorflow-1.3.0:\n", 33 | " Successfully uninstalled tensorflow-1.3.0\n", 34 | "Successfully installed enum34-1.1.6 tensorflow-1.4.0 tensorflow-tensorboard-0.4.0rc2\n" 35 | ] 36 | } 37 | ], 38 | "source": [ 39 | "%%bash\n", 40 | "\n", 41 | "pip install -U tensorflow" 42 | ] 43 | }, 44 | { 45 | "cell_type": "code", 46 | "execution_count": 2, 47 | "metadata": {}, 48 | "outputs": [ 49 | { 50 | "name": "stderr", 51 | "output_type": "stream", 52 | "text": [ 53 | "/Users/khalidsalama/anaconda/lib/python3.6/importlib/_bootstrap.py:205: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6\n", 54 | " return f(*args, **kwds)\n" 55 | ] 56 | }, 57 | { 58 | "name": "stdout", 59 | "output_type": "stream", 60 | "text": [ 61 | "1.4.0\n" 62 | ] 63 | } 64 | ], 65 | "source": [ 66 | "import tensorflow as tf\n", 67 | "print(tf.__version__)" 68 | ] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": null, 73 | "metadata": { 74 | "collapsed": true 75 | }, 76 | "outputs": [], 77 | "source": [] 78 | } 79 | ], 80 | "metadata": { 81 | "kernelspec": { 82 | "display_name": "Python 3", 83 | "language": "python", 84 | "name": "python3" 85 | }, 86 | "language_info": { 87 | "codemirror_mode": { 88 | "name": "ipython", 89 | "version": 3 90 | }, 91 | "file_extension": ".py", 92 | "mimetype": "text/x-python", 93 | "name": "python", 94 | "nbconvert_exporter": "python", 95 | 
"pygments_lexer": "ipython3", 96 | "version": "3.6.1" 97 | } 98 | }, 99 | "nbformat": 4, 100 | "nbformat_minor": 2 101 | } 102 | -------------------------------------------------------------------------------- /01 - Regression/01.0 - Regression Data Generation.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [ 8 | { 9 | "name": "stderr", 10 | "output_type": "stream", 11 | "text": [ 12 | "/Users/khalidsalama/anaconda/lib/python3.6/site-packages/sklearn/cross_validation.py:41: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.\n", 13 | " \"This module will be removed in 0.20.\", DeprecationWarning)\n", 14 | "/Users/khalidsalama/anaconda/lib/python3.6/site-packages/sklearn/grid_search.py:42: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. This module will be removed in 0.20.\n", 15 | " DeprecationWarning)\n", 16 | "/Users/khalidsalama/anaconda/lib/python3.6/site-packages/sklearn/learning_curve.py:22: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the functions are moved. This module will be removed in 0.20\n", 17 | " DeprecationWarning)\n" 18 | ] 19 | } 20 | ], 21 | "source": [ 22 | "import numpy as np\n", 23 | "import pandas as pd\n", 24 | "from sklearn import *\n", 25 | "import matplotlib.pyplot as plt\n", 26 | "%matplotlib inline" 27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": 2, 32 | "metadata": { 33 | "collapsed": true 34 | }, 35 | "outputs": [], 36 | "source": [ 37 | "sample_size = 5000" 38 | ] 39 | }, 40 | { 41 | "cell_type": "code", 42 | "execution_count": 3, 43 | "metadata": { 44 | "collapsed": true 45 | }, 46 | "outputs": [], 47 | "source": [ 48 | "\n", 49 | "data1,target1 = datasets.make_circles(n_samples=sample_size, factor=.1, noise=0.2)\n", 50 | "target1 = (3*data1[:,0])-(16*data1[:,1]) + (0.5*data1[:,0]*data1[:,1]) + np.random.normal(0,2,size=sample_size)\n", 51 | "\n", 52 | "\n", 53 | "data2,target2 = datasets.make_circles(n_samples=sample_size, factor=.5, noise=0.2)\n", 54 | "target2 = np.power(data2[:,0],2) + 10*np.power(data2[:,1],3) + (50*data2[:,0]*np.power(data2[:,1],2)) + np.random.normal(0,2,size=sample_size)\n", 55 | "\n", 56 | "data3,target3 = datasets.make_moons(n_samples=sample_size,noise=0.2)\n", 57 | "data3[:,0] = (2 * (data3[:, 0]-(-1))/(3))-1\n", 58 | "data3[:,1] = (2 * (data3[:, 1]-(-1))/(2))-1\n", 59 | "target3 = (50*data3[:,0]*np.sin(data3[:,1])) + (50*data3[:,1]*np.cos(data3[:,0]))\n", 60 | "\n", 61 | "data4,target4 = datasets.make_moons(n_samples=sample_size,noise=0.2)\n", 62 | "\n", 63 | "temp = np.copy(data4[:, 0])\n", 64 | "data4[:, 0] = data4[:, 1]\n", 65 | "data4[:, 1] = temp\n", 66 | "data4[:,0] = (2 * (data4[:, 0]-(-1))/(2))-1\n", 67 | "data4[:,1] = (2 * (data4[:, 1]-(-1))/(3))-1\n", 68 | "\n", 69 | "target4 = (30*data1[:,0])-(16*data1[:,1]) - (1.5*data1[:,0]*data1[:,1]) + np.random.normal(0,1,size=sample_size)" 70 | ] 71 | }, 72 | { 73 | "cell_type": "code", 74 | "execution_count": 4, 75 | "metadata": { 76 | "collapsed": true 77 | }, 78 | "outputs": 
[], 79 | "source": [ 80 | "data = np.concatenate((data1, data2, data3, data4), axis=0)\n", 81 | "target = np.concatenate((target1,target2,target3,target4),axis=0)\n", 82 | "alpha = np.concatenate((np.zeros(sample_size),np.ones(sample_size),np.zeros(sample_size),np.ones(sample_size)), axis=0)\n", 83 | "beta = np.concatenate((np.zeros(sample_size),np.zeros(sample_size),np.ones(sample_size),np.ones(sample_size)), axis=0)" 84 | ] 85 | }, 86 | { 87 | "cell_type": "code", 88 | "execution_count": 5, 89 | "metadata": { 90 | "collapsed": true 91 | }, 92 | "outputs": [], 93 | "source": [ 94 | "data_frame = pd.DataFrame(data = data,columns=[\"x\",\"y\"])\n", 95 | "data_frame[\"alpha\"] = pd.Series(alpha).map(lambda v: 'ax01' if v==0 else 'ax02')\n", 96 | "data_frame[\"beta\"] = pd.Series(beta).map(lambda v: 'bx01' if v==0 else 'bx02')\n", 97 | "data_frame[\"target\"] = target" 98 | ] 99 | }, 100 | { 101 | "cell_type": "code", 102 | "execution_count": 6, 103 | "metadata": {}, 104 | "outputs": [ 105 | { 106 | "data": { 107 | "text/html": [ 108 | "
\n", 109 | "\n", 122 | "\n", 123 | " \n", 124 | " \n", 125 | " \n", 126 | " \n", 127 | " \n", 128 | " \n", 129 | " \n", 130 | " \n", 131 | " \n", 132 | " \n", 133 | " \n", 134 | " \n", 135 | " \n", 136 | " \n", 137 | " \n", 138 | " \n", 139 | " \n", 140 | " \n", 141 | " \n", 142 | " \n", 143 | " \n", 144 | " \n", 145 | " \n", 146 | " \n", 147 | " \n", 148 | " \n", 149 | " \n", 150 | " \n", 151 | " \n", 152 | " \n", 153 | " \n", 154 | " \n", 155 | " \n", 156 | " \n", 157 | " \n", 158 | " \n", 159 | " \n", 160 | " \n", 161 | " \n", 162 | " \n", 163 | " \n", 164 | " \n", 165 | " \n", 166 | " \n", 167 | " \n", 168 | " \n", 169 | " \n", 170 | " \n", 171 | " \n", 172 | " \n", 173 | " \n", 174 | " \n", 175 | " \n", 176 | " \n", 177 | " \n", 178 | " \n", 179 | " \n", 180 | " \n", 181 | "
xytarget
count20000.00000020000.00000020000.000000
mean0.0630320.0612921.326481
std0.5771480.57705117.741681
min-1.567981-1.578965-73.096282
25%-0.333928-0.334557-6.737629
50%0.0535080.0535260.417512
75%0.4771570.4756788.707335
max1.6175111.72412586.776134
\n", 182 | "
" 183 | ], 184 | "text/plain": [ 185 | " x y target\n", 186 | "count 20000.000000 20000.000000 20000.000000\n", 187 | "mean 0.063032 0.061292 1.326481\n", 188 | "std 0.577148 0.577051 17.741681\n", 189 | "min -1.567981 -1.578965 -73.096282\n", 190 | "25% -0.333928 -0.334557 -6.737629\n", 191 | "50% 0.053508 0.053526 0.417512\n", 192 | "75% 0.477157 0.475678 8.707335\n", 193 | "max 1.617511 1.724125 86.776134" 194 | ] 195 | }, 196 | "execution_count": 6, 197 | "metadata": {}, 198 | "output_type": "execute_result" 199 | } 200 | ], 201 | "source": [ 202 | "data_frame.describe()" 203 | ] 204 | }, 205 | { 206 | "cell_type": "code", 207 | "execution_count": 7, 208 | "metadata": {}, 209 | "outputs": [ 210 | { 211 | "name": "stdout", 212 | "output_type": "stream", 213 | "text": [ 214 | "12000\n", 215 | "3000\n", 216 | "5000\n" 217 | ] 218 | } 219 | ], 220 | "source": [ 221 | "distribution = ([0] * sample_size) + ([1] * sample_size) + ([2] * sample_size) + ([3] * sample_size)\n", 222 | "\n", 223 | "splitter = model_selection.StratifiedShuffleSplit(n_splits=1, test_size=0.25, random_state=0)\n", 224 | "splits = list(splitter.split(X=data_frame.iloc[:,[0,1,2,3]],y=distribution))\n", 225 | "learn_index = splits[0][0]\n", 226 | "test_index = splits[0][1]\n", 227 | "\n", 228 | "learn_df = data_frame.iloc[learn_index,:]\n", 229 | "\n", 230 | "size2 = int(len(learn_df)/4)\n", 231 | "distribution2 = ([0] * size2) + ([1] * size2) + ([2] * size2) + ([3] * size2)\n", 232 | "\n", 233 | "\n", 234 | "splitter = model_selection.StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=0)\n", 235 | "splits = list(splitter.split(X=learn_df.iloc[:,[0,1,2,3]],y=distribution2))\n", 236 | "train_index = splits[0][0]\n", 237 | "valid_index = splits[0][1]\n", 238 | "\n", 239 | "\n", 240 | "train_df = learn_df.iloc[train_index,:]\n", 241 | "print(len(train_df))\n", 242 | "\n", 243 | "valid_df = learn_df.iloc[valid_index,:]\n", 244 | "print(len(valid_df))\n", 245 | "\n", 246 | "test_df = data_frame.iloc[test_index,:]\n", 247 | "print(len(test_df))\n" 248 | ] 249 | }, 250 | { 251 | "cell_type": "code", 252 | "execution_count": 8, 253 | "metadata": { 254 | "collapsed": true 255 | }, 256 | "outputs": [], 257 | "source": [ 258 | "train_df.to_csv(path_or_buf=\"data/train-data.csv\", header=False, index=True)\n", 259 | "valid_df.to_csv(path_or_buf=\"data/valid-data.csv\", header=False, index=True)\n", 260 | "test_df.to_csv(path_or_buf=\"data/test-data.csv\", header=False, index=True)" 261 | ] 262 | }, 263 | { 264 | "cell_type": "code", 265 | "execution_count": null, 266 | "metadata": { 267 | "collapsed": true 268 | }, 269 | "outputs": [], 270 | "source": [] 271 | } 272 | ], 273 | "metadata": { 274 | "kernelspec": { 275 | "display_name": "Python 3", 276 | "language": "python", 277 | "name": "python3" 278 | }, 279 | "language_info": { 280 | "codemirror_mode": { 281 | "name": "ipython", 282 | "version": 3 283 | }, 284 | "file_extension": ".py", 285 | "mimetype": "text/x-python", 286 | "name": "python", 287 | "nbconvert_exporter": "python", 288 | "pygments_lexer": "ipython3", 289 | "version": "3.6.1" 290 | } 291 | }, 292 | "nbformat": 4, 293 | "nbformat_minor": 2 294 | } 295 | -------------------------------------------------------------------------------- /01 - Regression/02.0 - TF Regression Model - Estimator APIs + Pandas.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [ 8 | 
{ 9 | "name": "stderr", 10 | "output_type": "stream", 11 | "text": [ 12 | "/Users/khalidsalama/anaconda/lib/python3.6/importlib/_bootstrap.py:205: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6\n", 13 | " return f(*args, **kwds)\n" 14 | ] 15 | }, 16 | { 17 | "name": "stdout", 18 | "output_type": "stream", 19 | "text": [ 20 | "1.4.0\n" 21 | ] 22 | } 23 | ], 24 | "source": [ 25 | "import tensorflow as tf\n", 26 | "import pandas as pd\n", 27 | "import numpy as np\n", 28 | "import shutil\n", 29 | "import math\n", 30 | "import multiprocessing\n", 31 | "from datetime import datetime\n", 32 | "from tensorflow.python.feature_column import feature_column\n", 33 | "print(tf.__version__)" 34 | ] 35 | }, 36 | { 37 | "cell_type": "markdown", 38 | "metadata": {}, 39 | "source": [ 40 | "## Steps to use the TF Estimator APIs\n", 41 | "1. Define dataset **metadata**\n", 42 | "2. Define **data input function** to read the data from Pandas dataframe + **apply feature processing**\n", 43 | "3. Create TF **feature columns** based on metadata + **extended feature columns**\n", 44 | "4. Instantiate an **estimator** with the required **feature columns & parameters**\n", 45 | "5. **Train** estimator using training data\n", 46 | "6. **Evaluate** estimator using test data\n", 47 | "7. Perform **predictions**\n", 48 | "8. **Save & Serve** the estimator" 49 | ] 50 | }, 51 | { 52 | "cell_type": "code", 53 | "execution_count": 2, 54 | "metadata": { 55 | "collapsed": true 56 | }, 57 | "outputs": [], 58 | "source": [ 59 | "MODEL_NAME = 'reg-model-01'\n", 60 | "\n", 61 | "TRAIN_DATA_FILE = 'data/train-data.csv'\n", 62 | "VALID_DATA_FILE = 'data/valid-data.csv'\n", 63 | "TEST_DATA_FILE = 'data/test-data.csv'\n", 64 | "\n", 65 | "RESUME_TRAINING = False\n", 66 | "PROCESS_FEATURES = True\n", 67 | "MULTI_THREADING = False" 68 | ] 69 | }, 70 | { 71 | "cell_type": "markdown", 72 | "metadata": {}, 73 | "source": [ 74 | "## 1. 
Define Dataset Metadata\n", 75 | "* CSV file header and defaults\n", 76 | "* Numeric and categorical feature names\n", 77 | "* Target feature name\n", 78 | "* Unused columns" 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": 3, 84 | "metadata": {}, 85 | "outputs": [ 86 | { 87 | "name": "stdout", 88 | "output_type": "stream", 89 | "text": [ 90 | "Header: ['key', 'x', 'y', 'alpha', 'beta', 'target']\n", 91 | "Numeric Features: ['x', 'y']\n", 92 | "Categorical Features: ['alpha', 'beta']\n", 93 | "Target: target\n", 94 | "Unused Features: ['key']\n" 95 | ] 96 | } 97 | ], 98 | "source": [ 99 | "HEADER = ['key','x','y','alpha','beta','target']\n", 100 | "HEADER_DEFAULTS = [[0], [0.0], [0.0], ['NA'], ['NA'], [0.0]]\n", 101 | "\n", 102 | "NUMERIC_FEATURE_NAMES = ['x', 'y'] \n", 103 | "\n", 104 | "CATEGORICAL_FEATURE_NAMES_WITH_VOCABULARY = {'alpha':['ax01', 'ax02'], 'beta':['bx01', 'bx02']}\n", 105 | "CATEGORICAL_FEATURE_NAMES = list(CATEGORICAL_FEATURE_NAMES_WITH_VOCABULARY.keys())\n", 106 | "\n", 107 | "FEATURE_NAMES = NUMERIC_FEATURE_NAMES + CATEGORICAL_FEATURE_NAMES\n", 108 | "\n", 109 | "TARGET_NAME = 'target'\n", 110 | "\n", 111 | "UNUSED_FEATURE_NAMES = list(set(HEADER) - set(FEATURE_NAMES) - {TARGET_NAME})\n", 112 | "\n", 113 | "print(\"Header: {}\".format(HEADER))\n", 114 | "print(\"Numeric Features: {}\".format(NUMERIC_FEATURE_NAMES))\n", 115 | "print(\"Categorical Features: {}\".format(CATEGORICAL_FEATURE_NAMES))\n", 116 | "print(\"Target: {}\".format(TARGET_NAME))\n", 117 | "print(\"Unused Features: {}\".format(UNUSED_FEATURE_NAMES))" 118 | ] 119 | }, 120 | { 121 | "cell_type": "markdown", 122 | "metadata": {}, 123 | "source": [ 124 | "## 2. Define Data Input Function\n", 125 | "* Input csv file name\n", 126 | "* Load pandas Dataframe\n", 127 | "* Apply feature processing\n", 128 | "* Return a function that returns (features, target) tensors" 129 | ] 130 | }, 131 | { 132 | "cell_type": "code", 133 | "execution_count": 4, 134 | "metadata": { 135 | "collapsed": true 136 | }, 137 | "outputs": [], 138 | "source": [ 139 | "def process_dataframe(dataset_df):\n", 140 | " \n", 141 | " dataset_df[\"x_2\"] = np.square(dataset_df['x'])\n", 142 | " dataset_df[\"y_2\"] = np.square(dataset_df['y'])\n", 143 | " dataset_df[\"xy\"] = dataset_df['x'] * dataset_df['y']\n", 144 | " dataset_df['dist_xy'] = np.sqrt(np.square(dataset_df['x']-dataset_df['y']))\n", 145 | " \n", 146 | " return dataset_df\n", 147 | "\n", 148 | "def generate_pandas_input_fn(file_name, mode=tf.estimator.ModeKeys.EVAL,\n", 149 | " skip_header_lines=0,\n", 150 | " num_epochs=1,\n", 151 | " batch_size=100):\n", 152 | "\n", 153 | " df_dataset = pd.read_csv(file_name, names=HEADER, skiprows=skip_header_lines)\n", 154 | " \n", 155 | " x = df_dataset[FEATURE_NAMES].copy()\n", 156 | " if PROCESS_FEATURES:\n", 157 | " x = process_dataframe(x)\n", 158 | " \n", 159 | " y = df_dataset[TARGET_NAME]\n", 160 | " \n", 161 | " shuffle = True if mode == tf.estimator.ModeKeys.TRAIN else False\n", 162 | " \n", 163 | " num_threads=1\n", 164 | " \n", 165 | " if MULTI_THREADING:\n", 166 | " num_threads=multiprocessing.cpu_count()\n", 167 | " num_epochs = int(num_epochs/num_threads) if mode == tf.estimator.ModeKeys.TRAIN else num_epochs\n", 168 | " \n", 169 | " pandas_input_fn = tf.estimator.inputs.pandas_input_fn(\n", 170 | " batch_size=batch_size,\n", 171 | " num_epochs= num_epochs,\n", 172 | " shuffle=shuffle,\n", 173 | " x=x,\n", 174 | " y=y,\n", 175 | " target_column=TARGET_NAME\n", 176 | " )\n", 177 | " \n", 178 | " 
print(\"\")\n", 179 | " print(\"* data input_fn:\")\n", 180 | " print(\"================\")\n", 181 | " print(\"Input file: {}\".format(file_name))\n", 182 | " print(\"Dataset size: {}\".format(len(df_dataset)))\n", 183 | " print(\"Batch size: {}\".format(batch_size))\n", 184 | " print(\"Epoch Count: {}\".format(num_epochs))\n", 185 | " print(\"Mode: {}\".format(mode))\n", 186 | " print(\"Thread Count: {}\".format(num_threads))\n", 187 | " print(\"Shuffle: {}\".format(shuffle))\n", 188 | " print(\"================\")\n", 189 | " print(\"\")\n", 190 | " \n", 191 | " return pandas_input_fn" 192 | ] 193 | }, 194 | { 195 | "cell_type": "code", 196 | "execution_count": 5, 197 | "metadata": {}, 198 | "outputs": [ 199 | { 200 | "name": "stdout", 201 | "output_type": "stream", 202 | "text": [ 203 | "\n", 204 | "* data input_fn:\n", 205 | "================\n", 206 | "Input file: data/train-data.csv\n", 207 | "Dataset size: 12000\n", 208 | "Batch size: 100\n", 209 | "Epoch Count: 1\n", 210 | "Mode: eval\n", 211 | "Thread Count: 1\n", 212 | "Shuffle: False\n", 213 | "================\n", 214 | "\n", 215 | "Feature read from DataFrame: ['x', 'y', 'alpha', 'beta', 'x_2', 'y_2', 'xy', 'dist_xy']\n", 216 | "Target read from DataFrame: Tensor(\"fifo_queue_DequeueUpTo:9\", shape=(?,), dtype=float64)\n" 217 | ] 218 | } 219 | ], 220 | "source": [ 221 | "features, target = generate_pandas_input_fn(file_name=TRAIN_DATA_FILE)()\n", 222 | "print(\"Feature read from DataFrame: {}\".format(list(features.keys())))\n", 223 | "print(\"Target read from DataFrame: {}\".format(target))" 224 | ] 225 | }, 226 | { 227 | "cell_type": "markdown", 228 | "metadata": {}, 229 | "source": [ 230 | "## 3. Define Feature Columns\n", 231 | "The input numeric columns are assumed to be normalized (or have the same scale). Otherwise, a normlizer_fn, along with the normlisation params (mean, stdv or min, max) should be passed to tf.feature_column.numeric_column() constructor." 
232 | ] 233 | }, 234 | { 235 | "cell_type": "code", 236 | "execution_count": 6, 237 | "metadata": {}, 238 | "outputs": [ 239 | { 240 | "name": "stdout", 241 | "output_type": "stream", 242 | "text": [ 243 | "Feature Columns: {'x': _NumericColumn(key='x', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None), 'y': _NumericColumn(key='y', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None), 'x_2': _NumericColumn(key='x_2', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None), 'y_2': _NumericColumn(key='y_2', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None), 'xy': _NumericColumn(key='xy', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None), 'dist_xy': _NumericColumn(key='dist_xy', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None), 'alpha': _VocabularyListCategoricalColumn(key='alpha', vocabulary_list=('ax01', 'ax02'), dtype=tf.string, default_value=-1, num_oov_buckets=0), 'beta': _VocabularyListCategoricalColumn(key='beta', vocabulary_list=('bx01', 'bx02'), dtype=tf.string, default_value=-1, num_oov_buckets=0), 'alpha_X_beta': _CrossedColumn(keys=(_VocabularyListCategoricalColumn(key='alpha', vocabulary_list=('ax01', 'ax02'), dtype=tf.string, default_value=-1, num_oov_buckets=0), _VocabularyListCategoricalColumn(key='beta', vocabulary_list=('bx01', 'bx02'), dtype=tf.string, default_value=-1, num_oov_buckets=0)), hash_bucket_size=4, hash_key=None)}\n" 244 | ] 245 | } 246 | ], 247 | "source": [ 248 | "def get_feature_columns():\n", 249 | " \n", 250 | " \n", 251 | " all_numeric_feature_names = NUMERIC_FEATURE_NAMES\n", 252 | " \n", 253 | " CONSTRUCTED_NUMERIC_FEATURES_NAMES = ['x_2', 'y_2', 'xy', 'dist_xy']\n", 254 | " \n", 255 | " if PROCESS_FEATURES:\n", 256 | " all_numeric_feature_names += CONSTRUCTED_NUMERIC_FEATURES_NAMES\n", 257 | "\n", 258 | " numeric_columns = {feature_name: tf.feature_column.numeric_column(feature_name)\n", 259 | " for feature_name in all_numeric_feature_names}\n", 260 | "\n", 261 | " categorical_column_with_vocabulary = \\\n", 262 | " {item[0]: tf.feature_column.categorical_column_with_vocabulary_list(item[0], item[1])\n", 263 | " for item in CATEGORICAL_FEATURE_NAMES_WITH_VOCABULARY.items()}\n", 264 | " \n", 265 | " feature_columns = {}\n", 266 | "\n", 267 | " if numeric_columns is not None:\n", 268 | " feature_columns.update(numeric_columns)\n", 269 | "\n", 270 | " if categorical_column_with_vocabulary is not None:\n", 271 | " feature_columns.update(categorical_column_with_vocabulary)\n", 272 | " \n", 273 | " # add extended features (crossing, bucektization, embedding)\n", 274 | " \n", 275 | " feature_columns['alpha_X_beta'] = tf.feature_column.crossed_column(\n", 276 | " [feature_columns['alpha'], feature_columns['beta']], 4)\n", 277 | " \n", 278 | " return feature_columns\n", 279 | "\n", 280 | "feature_columns = get_feature_columns()\n", 281 | "print(\"Feature Columns: {}\".format(feature_columns))" 282 | ] 283 | }, 284 | { 285 | "cell_type": "markdown", 286 | "metadata": {}, 287 | "source": [ 288 | "## 4. Create an Estimator" 289 | ] 290 | }, 291 | { 292 | "cell_type": "markdown", 293 | "metadata": {}, 294 | "source": [ 295 | "### a. 
284 | { 285 | "cell_type": "markdown", 286 | "metadata": {}, 287 | "source": [ 288 | "## 4. Create an Estimator" 289 | ] 290 | }, 291 | { 292 | "cell_type": "markdown", 293 | "metadata": {}, 294 | "source": [ 295 | "### a. Define an Estimator Creation Function\n", 296 | "\n", 297 | "* Get dense (numeric) columns from the feature columns\n", 298 | "* Convert categorical columns to indicator columns\n", 299 | "* Instantiate a DNNRegressor estimator given the **dense + indicator** feature columns + params" 300 | ] 301 | }, 302 | { 303 | "cell_type": "code", 304 | "execution_count": 7, 305 | "metadata": { 306 | "collapsed": true 307 | }, 308 | "outputs": [], 309 | "source": [ 310 | "def create_estimator(run_config, hparams):\n", 311 | " \n", 312 | " feature_columns = list(get_feature_columns().values())\n", 313 | " \n", 314 | " dense_columns = list(\n", 315 | " filter(lambda column: isinstance(column, feature_column._NumericColumn),\n", 316 | " feature_columns\n", 317 | " )\n", 318 | " )\n", 319 | "\n", 320 | " categorical_columns = list(\n", 321 | " filter(lambda column: isinstance(column, feature_column._VocabularyListCategoricalColumn) |\n", 322 | " isinstance(column, feature_column._BucketizedColumn),\n", 323 | " feature_columns)\n", 324 | " )\n", 325 | "\n", 326 | " indicator_columns = list(\n", 327 | " map(lambda column: tf.feature_column.indicator_column(column),\n", 328 | " categorical_columns)\n", 329 | " )\n", 330 | " \n", 331 | " \n", 332 | " estimator_feature_columns = dense_columns + indicator_columns \n", 333 | " \n", 334 | " estimator = tf.estimator.DNNRegressor(\n", 335 | " \n", 336 | " feature_columns= estimator_feature_columns,\n", 337 | " hidden_units= hparams.hidden_units,\n", 338 | " \n", 339 | " optimizer= tf.train.AdamOptimizer(),\n", 340 | " activation_fn= tf.nn.elu,\n", 341 | " dropout= hparams.dropout_prob,\n", 342 | " \n", 343 | " config= run_config\n", 344 | " )\n", 345 | " \n", 346 | " print(\"\")\n", 347 | " print(\"Estimator Type: {}\".format(type(estimator)))\n", 348 | " print(\"\")\n", 349 | " \n", 350 | " return estimator" 351 | ] 352 | }, 353 | { 354 | "cell_type": "markdown", 355 | "metadata": {}, 356 | "source": [ 357 | "### b. Set hyper-parameter values (HParams)" 358 | ] 359 | }, 360 | { 361 | "cell_type": "code", 362 | "execution_count": 8, 363 | "metadata": {}, 364 | "outputs": [ 365 | { 366 | "name": "stdout", 367 | "output_type": "stream", 368 | "text": [ 369 | "Model directory: trained_models/reg-model-01\n", 370 | "Hyper-parameters: [('batch_size', 500), ('dropout_prob', 0.0), ('hidden_units', [8, 4]), ('num_epochs', 100)]\n" 371 | ] 372 | } 373 | ], 374 | "source": [ 375 | "hparams = tf.contrib.training.HParams(\n", 376 | " num_epochs = 100,\n", 377 | " batch_size = 500,\n", 378 | " hidden_units=[8, 4], \n", 379 | " dropout_prob = 0.0)\n", 380 | "\n", 381 | "\n", 382 | "model_dir = 'trained_models/{}'.format(MODEL_NAME)\n", 383 | "\n", 384 | "run_config = tf.estimator.RunConfig().replace(model_dir=model_dir)\n", 385 | "print(\"Model directory: {}\".format(run_config.model_dir))\n", 386 | "print(\"Hyper-parameters: {}\".format(hparams))" 387 | ] 388 | }, 389 | { 390 | "cell_type": "markdown", 391 | "metadata": {}, 392 | "source": [ 393 | "### c. 
Instantiate the estimator " 394 | ] 395 | }, 396 | { 397 | "cell_type": "code", 398 | "execution_count": 9, 399 | "metadata": {}, 400 | "outputs": [ 401 | { 402 | "name": "stdout", 403 | "output_type": "stream", 404 | "text": [ 405 | "INFO:tensorflow:Using config: {'_model_dir': 'trained_models/reg-model-01', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_service': None, '_cluster_spec': , '_task_type': 'worker', '_task_id': 0, '_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}\n", 406 | "\n", 407 | "Estimator Type: \n", 408 | "\n" 409 | ] 410 | } 411 | ], 412 | "source": [ 413 | "estimator = create_estimator(run_config, hparams)" 414 | ] 415 | }, 416 | { 417 | "cell_type": "markdown", 418 | "metadata": {}, 419 | "source": [ 420 | "## 5. Train the Estimator" 421 | ] 422 | }, 423 | { 424 | "cell_type": "code", 425 | "execution_count": 10, 426 | "metadata": {}, 427 | "outputs": [ 428 | { 429 | "name": "stdout", 430 | "output_type": "stream", 431 | "text": [ 432 | "\n", 433 | "* data input_fn:\n", 434 | "================\n", 435 | "Input file: data/train-data.csv\n", 436 | "Dataset size: 12000\n", 437 | "Batch size: 500\n", 438 | "Epoch Count: 100\n", 439 | "Mode: train\n", 440 | "Thread Count: 1\n", 441 | "Shuffle: True\n", 442 | "================\n", 443 | "\n", 444 | "Estimator training started at 19:19:12\n", 445 | ".......................................\n", 446 | "INFO:tensorflow:Create CheckpointSaverHook.\n", 447 | "INFO:tensorflow:Saving checkpoints for 1 into trained_models/reg-model-01/model.ckpt.\n", 448 | "INFO:tensorflow:loss = 179225.0, step = 1\n", 449 | "INFO:tensorflow:global_step/sec: 166.515\n", 450 | "INFO:tensorflow:loss = 124778.0, step = 101 (0.602 sec)\n", 451 | "INFO:tensorflow:global_step/sec: 182.042\n", 452 | "INFO:tensorflow:loss = 144432.0, step = 201 (0.550 sec)\n", 453 | "INFO:tensorflow:global_step/sec: 221.401\n", 454 | "INFO:tensorflow:loss = 167542.0, step = 301 (0.451 sec)\n", 455 | "INFO:tensorflow:global_step/sec: 208.414\n", 456 | "INFO:tensorflow:loss = 146349.0, step = 401 (0.480 sec)\n", 457 | "INFO:tensorflow:global_step/sec: 216.184\n", 458 | "INFO:tensorflow:loss = 148680.0, step = 501 (0.462 sec)\n", 459 | "INFO:tensorflow:global_step/sec: 217.155\n", 460 | "INFO:tensorflow:loss = 123907.0, step = 601 (0.460 sec)\n", 461 | "INFO:tensorflow:global_step/sec: 209.701\n", 462 | "INFO:tensorflow:loss = 113046.0, step = 701 (0.477 sec)\n", 463 | "INFO:tensorflow:global_step/sec: 168.637\n", 464 | "INFO:tensorflow:loss = 107878.0, step = 801 (0.594 sec)\n", 465 | "INFO:tensorflow:global_step/sec: 126.787\n", 466 | "INFO:tensorflow:loss = 118305.0, step = 901 (0.788 sec)\n", 467 | "INFO:tensorflow:global_step/sec: 138.261\n", 468 | "INFO:tensorflow:loss = 101507.0, step = 1001 (0.723 sec)\n", 469 | "INFO:tensorflow:global_step/sec: 162.629\n", 470 | "INFO:tensorflow:loss = 106166.0, step = 1101 (0.616 sec)\n", 471 | "INFO:tensorflow:global_step/sec: 210.706\n", 472 | "INFO:tensorflow:loss = 107934.0, step = 1201 (0.474 sec)\n", 473 | "INFO:tensorflow:global_step/sec: 175.23\n", 474 | "INFO:tensorflow:loss = 98094.9, step = 1301 (0.571 sec)\n", 475 | "INFO:tensorflow:global_step/sec: 176.572\n", 476 | "INFO:tensorflow:loss = 89144.2, step = 1401 (0.566 sec)\n", 477 | "INFO:tensorflow:global_step/sec: 177.678\n", 478 | 
"INFO:tensorflow:loss = 104465.0, step = 1501 (0.563 sec)\n", 479 | "INFO:tensorflow:global_step/sec: 183.081\n", 480 | "INFO:tensorflow:loss = 92220.2, step = 1601 (0.546 sec)\n", 481 | "INFO:tensorflow:global_step/sec: 218.108\n", 482 | "INFO:tensorflow:loss = 79086.9, step = 1701 (0.458 sec)\n", 483 | "INFO:tensorflow:global_step/sec: 138.97\n", 484 | "INFO:tensorflow:loss = 93577.3, step = 1801 (0.724 sec)\n", 485 | "INFO:tensorflow:global_step/sec: 145.418\n", 486 | "INFO:tensorflow:loss = 75269.3, step = 1901 (0.684 sec)\n", 487 | "INFO:tensorflow:global_step/sec: 181.944\n", 488 | "INFO:tensorflow:loss = 73518.7, step = 2001 (0.549 sec)\n", 489 | "INFO:tensorflow:global_step/sec: 165.012\n", 490 | "INFO:tensorflow:loss = 75916.3, step = 2101 (0.607 sec)\n", 491 | "INFO:tensorflow:global_step/sec: 130.054\n", 492 | "INFO:tensorflow:loss = 65138.1, step = 2201 (0.768 sec)\n", 493 | "INFO:tensorflow:global_step/sec: 128.839\n", 494 | "INFO:tensorflow:loss = 65868.5, step = 2301 (0.777 sec)\n", 495 | "INFO:tensorflow:Saving checkpoints for 2400 into trained_models/reg-model-01/model.ckpt.\n", 496 | "INFO:tensorflow:Loss for final step: 88071.1.\n", 497 | ".......................................\n", 498 | "Estimator training finished at 19:19:30\n", 499 | "\n", 500 | "Estimator training elapsed time: 17.686301 seconds\n" 501 | ] 502 | } 503 | ], 504 | "source": [ 505 | "train_input_fn = generate_pandas_input_fn(file_name= TRAIN_DATA_FILE, \n", 506 | " mode=tf.estimator.ModeKeys.TRAIN,\n", 507 | " num_epochs=hparams.num_epochs,\n", 508 | " batch_size=hparams.batch_size) \n", 509 | "\n", 510 | "if not RESUME_TRAINING:\n", 511 | " shutil.rmtree(model_dir, ignore_errors=True)\n", 512 | " \n", 513 | "tf.logging.set_verbosity(tf.logging.INFO)\n", 514 | "\n", 515 | "time_start = datetime.utcnow() \n", 516 | "print(\"Estimator training started at {}\".format(time_start.strftime(\"%H:%M:%S\")))\n", 517 | "print(\".......................................\")\n", 518 | "\n", 519 | "estimator.train(input_fn = train_input_fn)\n", 520 | "\n", 521 | "time_end = datetime.utcnow() \n", 522 | "print(\".......................................\")\n", 523 | "print(\"Estimator training finished at {}\".format(time_end.strftime(\"%H:%M:%S\")))\n", 524 | "print(\"\")\n", 525 | "time_elapsed = time_end - time_start\n", 526 | "print(\"Estimator training elapsed time: {} seconds\".format(time_elapsed.total_seconds()))\n" 527 | ] 528 | }, 529 | { 530 | "cell_type": "markdown", 531 | "metadata": {}, 532 | "source": [ 533 | "## 6. 
Evaluate the Model" 534 | ] 535 | }, 536 | { 537 | "cell_type": "code", 538 | "execution_count": 11, 539 | "metadata": {}, 540 | "outputs": [ 541 | { 542 | "name": "stdout", 543 | "output_type": "stream", 544 | "text": [ 545 | "\n", 546 | "* data input_fn:\n", 547 | "================\n", 548 | "Input file: data/test-data.csv\n", 549 | "Dataset size: 5000\n", 550 | "Batch size: 5000\n", 551 | "Epoch Count: 1\n", 552 | "Mode: eval\n", 553 | "Thread Count: 1\n", 554 | "Shuffle: False\n", 555 | "================\n", 556 | "\n", 557 | "INFO:tensorflow:Starting evaluation at 2017-11-14-19:19:30\n", 558 | "INFO:tensorflow:Restoring parameters from trained_models/reg-model-01/model.ckpt-2400\n", 559 | "INFO:tensorflow:Finished evaluation at 2017-11-14-19:19:31\n", 560 | "INFO:tensorflow:Saving dict for global step 2400: average_loss = 164.862, global_step = 2400, loss = 824311.0\n", 561 | "\n", 562 | "{'average_loss': 164.86218, 'loss': 824310.88, 'global_step': 2400}\n", 563 | "\n", 564 | "RMSE: 12.83987\n" 565 | ] 566 | } 567 | ], 568 | "source": [ 569 | "TEST_SIZE = 5000\n", 570 | "\n", 571 | "test_input_fn = generate_pandas_input_fn(file_name=TEST_DATA_FILE, \n", 572 | " mode= tf.estimator.ModeKeys.EVAL,\n", 573 | " batch_size= TEST_SIZE)\n", 574 | "\n", 575 | "results = estimator.evaluate(input_fn=test_input_fn)\n", 576 | "print(\"\")\n", 577 | "print(results)\n", 578 | "rmse = round(math.sqrt(results[\"average_loss\"]),5)\n", 579 | "print(\"\")\n", 580 | "print(\"RMSE: {}\".format(rmse))" 581 | ] 582 | }, 583 | { 584 | "cell_type": "markdown", 585 | "metadata": {}, 586 | "source": [ 587 | "## 7. Prediction" 588 | ] 589 | }, 590 | { 591 | "cell_type": "code", 592 | "execution_count": 12, 593 | "metadata": {}, 594 | "outputs": [ 595 | { 596 | "name": "stdout", 597 | "output_type": "stream", 598 | "text": [ 599 | "\n", 600 | "* data input_fn:\n", 601 | "================\n", 602 | "Input file: data/test-data.csv\n", 603 | "Dataset size: 5000\n", 604 | "Batch size: 5\n", 605 | "Epoch Count: 1\n", 606 | "Mode: infer\n", 607 | "Thread Count: 1\n", 608 | "Shuffle: False\n", 609 | "================\n", 610 | "\n", 611 | "INFO:tensorflow:Restoring parameters from trained_models/reg-model-01/model.ckpt-2400\n", 612 | "\n", 613 | "Predicted Values: [13.141397, -5.9562521, 11.541443, 3.8178449, 2.1242597]\n" 614 | ] 615 | } 616 | ], 617 | "source": [ 618 | "import itertools\n", 619 | "\n", 620 | "predict_input_fn = generate_pandas_input_fn(file_name=TEST_DATA_FILE, \n", 621 | " mode= tf.estimator.ModeKeys.PREDICT,\n", 622 | " batch_size= 5)\n", 623 | "\n", 624 | "predictions = estimator.predict(input_fn=predict_input_fn)\n", 625 | "values = list(map(lambda item: item[\"predictions\"][0],list(itertools.islice(predictions, 5))))\n", 626 | "print()\n", 627 | "print(\"Predicted Values: {}\".format(values))" 628 | ] 629 | }, 630 | { 631 | "cell_type": "markdown", 632 | "metadata": {}, 633 | "source": [ 634 | "## 8. Save & Serve the Model" 635 | ] 636 | }, 637 | { 638 | "cell_type": "markdown", 639 | "metadata": {}, 640 | "source": [ 641 | "### a. 
Define Serving Function\n", 642 | ] 643 | }, 644 | { 645 | "cell_type": "code", 646 | "execution_count": 1, 647 | "metadata": { 648 | "collapsed": true 649 | }, 650 | "outputs": [], 651 | "source": [ 652 | "def process_features(features):\n", 653 | " \n", 654 | " features[\"x_2\"] = tf.square(features['x'])\n", 655 | " features[\"y_2\"] = tf.square(features['y'])\n", 656 | " features[\"xy\"] = tf.multiply(features['x'], features['y'])\n", 657 | " features['dist_xy'] = tf.sqrt(tf.squared_difference(features['x'],features['y']))\n", 658 | " \n", 659 | " return features\n", 660 | "\n", 661 | "def csv_serving_input_fn():\n", 662 | " \n", 663 | " SERVING_HEADER = ['x','y','alpha','beta']\n", 664 | " SERVING_HEADER_DEFAULTS = [[0.0], [0.0], ['NA'], ['NA']]\n", 665 | "\n", 666 | " rows_string_tensor = tf.placeholder(dtype=tf.string,\n", 667 | " shape=[None],\n", 668 | " name='csv_rows')\n", 669 | " \n", 670 | " receiver_tensor = {'csv_rows': rows_string_tensor}\n", 671 | "\n", 672 | " row_columns = tf.expand_dims(rows_string_tensor, -1)\n", 673 | " columns = tf.decode_csv(row_columns, record_defaults=SERVING_HEADER_DEFAULTS)\n", 674 | " features = dict(zip(SERVING_HEADER, columns))\n", 675 | " \n", 676 | " if PROCESS_FEATURES:\n", 677 | " features = process_features(features)\n", 678 | "\n", 679 | " return tf.estimator.export.ServingInputReceiver(\n", 680 | " features, receiver_tensor)" 681 | ] 682 | },
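{ "cell_type": "markdown", "metadata": {}, "source": [ "A JSON-style serving function is a possible alternative to the CSV one above; this is a minimal sketch assuming the same four input features (notebook 04.0 covers JSON serving in full):\n", "```python\n", "def json_serving_input_fn():\n", "    # each input feature arrives as its own tensor instead of one CSV row string\n", "    receiver_tensor = {\n", "        'x': tf.placeholder(tf.float32, [None]),\n", "        'y': tf.placeholder(tf.float32, [None]),\n", "        'alpha': tf.placeholder(tf.string, [None]),\n", "        'beta': tf.placeholder(tf.string, [None])\n", "    }\n", "    features = process_features(dict(receiver_tensor))\n", "    return tf.estimator.export.ServingInputReceiver(features, receiver_tensor)\n", "```" ] },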
683 | { 684 | "cell_type": "markdown", 685 | "metadata": {}, 686 | "source": [ 687 | "### b. Export SavedModel" 688 | ] 689 | }, 690 | { 691 | "cell_type": "code", 692 | "execution_count": 31, 693 | "metadata": {}, 694 | "outputs": [ 695 | { 696 | "name": "stdout", 697 | "output_type": "stream", 698 | "text": [ 699 | "INFO:tensorflow:Restoring parameters from trained_models/reg-model-01/model.ckpt-2400\n", 700 | "INFO:tensorflow:Assets added to graph.\n", 701 | "INFO:tensorflow:No assets to write.\n", 702 | "INFO:tensorflow:SavedModel written to: b\"trained_models/reg-model-01/export/temp-b'1510688109'/saved_model.pbtxt\"\n" 703 | ] 704 | }, 705 | { 706 | "data": { 707 | "text/plain": [ 708 | "b'trained_models/reg-model-01/export/1510688109'" 709 | ] 710 | }, 711 | "execution_count": 31, 712 | "metadata": {}, 713 | "output_type": "execute_result" 714 | } 715 | ], 716 | "source": [ 717 | "export_dir = model_dir + \"/export\"\n", 718 | "\n", 719 | "estimator.export_savedmodel(\n", 720 | " export_dir_base = export_dir,\n", 721 | " serving_input_receiver_fn = csv_serving_input_fn,\n", 722 | " as_text=True\n", 723 | ")\n" 724 | ] 725 | }, 726 | { 727 | "cell_type": "markdown", 728 | "metadata": {}, 729 | "source": [ 730 | "### c. Serve the Saved Model" 731 | ] 732 | }, 733 | { 734 | "cell_type": "code", 735 | "execution_count": 35, 736 | "metadata": {}, 737 | "outputs": [ 738 | { 739 | "name": "stdout", 740 | "output_type": "stream", 741 | "text": [ 742 | "trained_models/reg-model-01/export/1510688109\n", 743 | "INFO:tensorflow:Restoring parameters from b'trained_models/reg-model-01/export/1510688109/variables/variables'\n", 744 | "{'predictions': array([[ 13.15929985],\n", 745 | " [-13.96904373]], dtype=float32)}\n" 746 | ] 747 | } 748 | ], 749 | "source": [ 750 | "import os\n", 751 | "\n", 752 | "saved_model_dir = export_dir + \"/\" + os.listdir(path=export_dir)[-1] \n", 753 | "\n", 754 | "print(saved_model_dir)\n", 755 | "\n", 756 | "predictor_fn = tf.contrib.predictor.from_saved_model(\n", 757 | " export_dir = saved_model_dir,\n", 758 | " signature_def_key=\"predict\"\n", 759 | ")\n", 760 | "\n", 761 | "output = predictor_fn({'csv_rows': [\"0.5,1,ax01,bx02\", \"-0.5,-1,ax02,bx02\"]})\n", 762 | "print(output)" 763 | ] 764 | }, 765 | { 766 | "cell_type": "markdown", 767 | "metadata": {}, 768 | "source": [ 769 | "## What can we improve?\n", 770 | "\n", 771 | "* **Use data files instead of DataFrames** - pandas dataframes need to fit in memory and are hard to distribute. Working with (sharded) training data files allows reading records in batches (so we can work with large datasets regardless of memory size) and supports distributed training (data parallelism).\n", 772 | "\n", 773 | "\n", 774 | "* **Use Experiment APIs** - The Experiment API knows how to invoke training and eval loops in a sensible fashion for local & distributed training.\n", 775 | "\n", 776 | "\n", 777 | "* **Early Stopping** - Use the validation set evaluation to stop the training and avoid overfitting.\n" 778 | ] 779 | }, 780 | { 781 | "cell_type": "code", 782 | "execution_count": null, 783 | "metadata": { 784 | "collapsed": true 785 | }, 786 | "outputs": [], 787 | "source": [] 788 | } 789 | ], 790 | "metadata": { 791 | "kernelspec": { 792 | "display_name": "Python 3", 793 | "language": "python", 794 | "name": "python3" 795 | }, 796 | "language_info": { 797 | "codemirror_mode": { 798 | "name": "ipython", 799 | "version": 3 800 | }, 801 | "file_extension": ".py", 802 | "mimetype": "text/x-python", 803 | "name": "python", 804 | "nbconvert_exporter": "python", 805 | "pygments_lexer": "ipython3", 806 | "version": "3.6.1" 807 | } 808 | }, 809 | "nbformat": 4, 810 | "nbformat_minor": 2 811 | } 812 | -------------------------------------------------------------------------------- /01 - Regression/06.0 - Convert CSV to TFRecords.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [ 8 | { 9 | "name": "stderr", 10 | "output_type": "stream", 11 | "text": [ 12 | "/Users/khalidsalama/anaconda/lib/python3.6/importlib/_bootstrap.py:205: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6\n", 13 | " return f(*args, **kwds)\n" 14 | ] 15 | }, 16 | { 17 | "name": "stdout", 18 | "output_type": "stream", 19 | "text": [ 20 | "1.4.0\n" 21 | ] 22 | } 23 | ], 24 | "source": [ 25 | "import tensorflow as tf\n", 26 | "import csv\n", 27 | "import os\n", 28 | "\n", 29 | "print(tf.__version__)" 30 | ] 31 | },
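{ "cell_type": "markdown", "metadata": {}, "source": [ "As a minimal sketch of how the records written by this notebook can be parsed back (serialized_example is a hypothetical variable standing for one record read from the files; the spec mirrors the schema defined below):\n", "```python\n", "feature_spec = {\n", "    'x': tf.FixedLenFeature([], tf.float32),\n", "    'y': tf.FixedLenFeature([], tf.float32),\n", "    'alpha': tf.FixedLenFeature([], tf.string),\n", "    'beta': tf.FixedLenFeature([], tf.string),\n", "    'target': tf.FixedLenFeature([], tf.float32)\n", "}\n", "parsed_features = tf.parse_single_example(serialized_example, feature_spec)\n", "```" ] },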
"source": [ 40 | "train_data_files = ['data/train-data.csv']\n", 41 | "valid_data_files = ['data/valid-data.csv']\n", 42 | "test_data_files = ['data/test-data.csv']" 43 | ] 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": 3, 48 | "metadata": {}, 49 | "outputs": [ 50 | { 51 | "name": "stdout", 52 | "output_type": "stream", 53 | "text": [ 54 | "Header: ['key', 'x', 'y', 'alpha', 'beta', 'target']\n", 55 | "Numeric Features: ['x', 'y']\n", 56 | "Categorical Features: ['alpha', 'beta']\n", 57 | "Target: target\n", 58 | "Unused Features: ['key']\n" 59 | ] 60 | } 61 | ], 62 | "source": [ 63 | "HEADER = ['key','x','y','alpha','beta','target']\n", 64 | "HEADER_DEFAULTS = [[0], [0.0], [0.0], ['NA'], ['NA'], [0.0]]\n", 65 | "\n", 66 | "NUMERIC_FEATURE_NAMES = ['x', 'y'] \n", 67 | "\n", 68 | "CATEGORICAL_FEATURE_NAMES_WITH_VOCABULARY = {'alpha':['ax01', 'ax02'], 'beta':['bx01', 'bx02']}\n", 69 | "CATEGORICAL_FEATURE_NAMES = list(CATEGORICAL_FEATURE_NAMES_WITH_VOCABULARY.keys())\n", 70 | "\n", 71 | "FEATURE_NAMES = NUMERIC_FEATURE_NAMES + CATEGORICAL_FEATURE_NAMES\n", 72 | "\n", 73 | "TARGET_NAME = 'target'\n", 74 | "\n", 75 | "UNUSED_FEATURE_NAMES = list(set(HEADER) - set(FEATURE_NAMES) - {TARGET_NAME})\n", 76 | "\n", 77 | "print(\"Header: {}\".format(HEADER))\n", 78 | "print(\"Numeric Features: {}\".format(NUMERIC_FEATURE_NAMES))\n", 79 | "print(\"Categorical Features: {}\".format(CATEGORICAL_FEATURE_NAMES))\n", 80 | "print(\"Target: {}\".format(TARGET_NAME))\n", 81 | "print(\"Unused Features: {}\".format(UNUSED_FEATURE_NAMES))" 82 | ] 83 | }, 84 | { 85 | "cell_type": "code", 86 | "execution_count": 4, 87 | "metadata": { 88 | "collapsed": true 89 | }, 90 | "outputs": [], 91 | "source": [ 92 | "def create_csv_iterator(csv_file_path, skip_header):\n", 93 | " \n", 94 | " with tf.gfile.Open(csv_file_path) as csv_file:\n", 95 | " reader = csv.reader(csv_file)\n", 96 | " if skip_header: # Skip the header\n", 97 | " next(reader)\n", 98 | " for row in reader:\n", 99 | " yield row" 100 | ] 101 | }, 102 | { 103 | "cell_type": "code", 104 | "execution_count": 5, 105 | "metadata": { 106 | "collapsed": true 107 | }, 108 | "outputs": [], 109 | "source": [ 110 | "def create_example(row):\n", 111 | " \"\"\"\n", 112 | " Returns a tensorflow.Example Protocol Buffer object.\n", 113 | " \"\"\"\n", 114 | " example = tf.train.Example()\n", 115 | "\n", 116 | " for i in range(len(HEADER)):\n", 117 | " \n", 118 | " feature_name = HEADER[i]\n", 119 | " feature_value = row[i]\n", 120 | " \n", 121 | " if feature_name in UNUSED_FEATURE_NAMES:\n", 122 | " continue\n", 123 | " \n", 124 | " if feature_name in NUMERIC_FEATURE_NAMES:\n", 125 | " example.features.feature[feature_name].float_list.value.extend([float(feature_value)])\n", 126 | " \n", 127 | " elif feature_name in CATEGORICAL_FEATURE_NAMES:\n", 128 | " example.features.feature[feature_name].bytes_list.value.extend([bytes(feature_value, 'utf-8')])\n", 129 | " \n", 130 | "\n", 131 | " elif feature_name in TARGET_NAME:\n", 132 | " example.features.feature[feature_name].float_list.value.extend([float(feature_value)])\n", 133 | "\n", 134 | " return example" 135 | ] 136 | }, 137 | { 138 | "cell_type": "code", 139 | "execution_count": 6, 140 | "metadata": { 141 | "collapsed": true 142 | }, 143 | "outputs": [], 144 | "source": [ 145 | "def create_tfrecords_file(input_csv_file):\n", 146 | " \"\"\"\n", 147 | " Creates a TFRecords file for the given input data and\n", 148 | " example transofmration function\n", 149 | " \"\"\"\n", 150 | " output_tfrecord_file = 
input_csv_file.replace(\"csv\",\"tfrecords\")\n", 151 | " writer = tf.python_io.TFRecordWriter(output_tfrecord_file)\n", 152 | " \n", 153 | " print(\"Creating TFRecords file at\", output_tfrecord_file, \"...\")\n", 154 | " \n", 155 | " for i, row in enumerate(create_csv_iterator(input_csv_file, skip_header=False)):\n", 156 | " \n", 157 | " if len(row) == 0:\n", 158 | " continue\n", 159 | " \n", 160 | " example = create_example(row)\n", 161 | " content = example.SerializeToString()\n", 162 | " writer.write(content)\n", 163 | " \n", 164 | " writer.close()\n", 165 | " \n", 166 | " print(\"Finish Writing\", output_tfrecord_file)" 167 | ] 168 | }, 169 | { 170 | "cell_type": "code", 171 | "execution_count": 7, 172 | "metadata": {}, 173 | "outputs": [ 174 | { 175 | "name": "stdout", 176 | "output_type": "stream", 177 | "text": [ 178 | "Converting Training Data Files\n", 179 | "Creating TFRecords file at data/train-data.tfrecords ...\n", 180 | "Finish Writing data/train-data.tfrecords\n", 181 | "\n", 182 | "Converting Validation Data Files\n", 183 | "Creating TFRecords file at data/valid-data.tfrecords ...\n", 184 | "Finish Writing data/valid-data.tfrecords\n", 185 | "\n", 186 | "Converting Test Data Files\n", 187 | "Creating TFRecords file at data/test-data.tfrecords ...\n", 188 | "Finish Writing data/test-data.tfrecords\n" 189 | ] 190 | } 191 | ], 192 | "source": [ 193 | "print(\"Converting Training Data Files\")\n", 194 | "for input_csv_file in train_data_files:\n", 195 | " create_tfrecords_file(input_csv_file)\n", 196 | "print(\"\")\n", 197 | "\n", 198 | "print(\"Converting Validation Data Files\")\n", 199 | "for input_csv_file in valid_data_files:\n", 200 | " create_tfrecords_file(input_csv_file)\n", 201 | "print(\"\")\n", 202 | "\n", 203 | "print(\"Converting Test Data Files\")\n", 204 | "for input_csv_file in test_data_files:\n", 205 | " create_tfrecords_file(input_csv_file)" 206 | ] 207 | }, 208 | { 209 | "cell_type": "code", 210 | "execution_count": null, 211 | "metadata": { 212 | "collapsed": true 213 | }, 214 | "outputs": [], 215 | "source": [] 216 | } 217 | ], 218 | "metadata": { 219 | "kernelspec": { 220 | "display_name": "Python 3", 221 | "language": "python", 222 | "name": "python3" 223 | }, 224 | "language_info": { 225 | "codemirror_mode": { 226 | "name": "ipython", 227 | "version": 3 228 | }, 229 | "file_extension": ".py", 230 | "mimetype": "text/x-python", 231 | "name": "python", 232 | "nbconvert_exporter": "python", 233 | "pygments_lexer": "ipython3", 234 | "version": "3.6.1" 235 | } 236 | }, 237 | "nbformat": 4, 238 | "nbformat_minor": 2 239 | } 240 | -------------------------------------------------------------------------------- /01 - Regression/data/new-data.csv: -------------------------------------------------------------------------------- 1 | 1.3,-0.5,ax01,bx02 -------------------------------------------------------------------------------- /01 - Regression/data/new-data.json: -------------------------------------------------------------------------------- 1 | {"x": 1.3, "y": -0.5, "alpha": "ax01", "beta": "bx02"} -------------------------------------------------------------------------------- /01 - Regression/data/test-data.tfrecords: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ksalama/tf-estimator-tutorials/cecfea0c378ebc8552941c9ebf8a530228dd845d/01 - Regression/data/test-data.tfrecords -------------------------------------------------------------------------------- /01 - 
Regression/data/train-data.tfrecords: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ksalama/tf-estimator-tutorials/cecfea0c378ebc8552941c9ebf8a530228dd845d/01 - Regression/data/train-data.tfrecords -------------------------------------------------------------------------------- /01 - Regression/data/valid-data.tfrecords: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ksalama/tf-estimator-tutorials/cecfea0c378ebc8552941c9ebf8a530228dd845d/01 - Regression/data/valid-data.tfrecords -------------------------------------------------------------------------------- /02 - Classification/-- TensorBoard.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [ 8 | { 9 | "name": "stdout", 10 | "output_type": "stream", 11 | "text": [ 12 | "trained_models/class-model-01\n" 13 | ] 14 | } 15 | ], 16 | "source": [ 17 | "MODEL_NAME = 'class-model-01'\n", 18 | "model_dir = 'trained_models/{}'.format(MODEL_NAME)\n", 19 | "print(model_dir)" 20 | ] 21 | }, 22 | { 23 | "cell_type": "markdown", 24 | "metadata": {}, 25 | "source": [ 26 | "## Start TensorBoard Process" 27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": null, 32 | "metadata": { 33 | "collapsed": true 34 | }, 35 | "outputs": [], 36 | "source": [ 37 | "from google.datalab.ml import TensorBoard\n", 38 | "TensorBoard().start(model_dir)\n", 39 | "TensorBoard().list()" 40 | ] 41 | }, 42 | { 43 | "cell_type": "markdown", 44 | "metadata": {}, 45 | "source": [ 46 | "## Kill TensorBoard Process" 47 | ] 48 | }, 49 | { 50 | "cell_type": "code", 51 | "execution_count": null, 52 | "metadata": { 53 | "collapsed": true 54 | }, 55 | "outputs": [], 56 | "source": [ 57 | "# to stop TensorBoard\n", 58 | "TensorBoard().stop(23002)\n", 59 | "print('stopped TensorBoard')\n", 60 | "TensorBoard().list()" 61 | ] 62 | } 63 | ], 64 | "metadata": { 65 | "kernelspec": { 66 | "display_name": "Python 3", 67 | "language": "python", 68 | "name": "python3" 69 | }, 70 | "language_info": { 71 | "codemirror_mode": { 72 | "name": "ipython", 73 | "version": 3 74 | }, 75 | "file_extension": ".py", 76 | "mimetype": "text/x-python", 77 | "name": "python", 78 | "nbconvert_exporter": "python", 79 | "pygments_lexer": "ipython3", 80 | "version": "3.6.1" 81 | } 82 | }, 83 | "nbformat": 4, 84 | "nbformat_minor": 2 85 | } 86 | -------------------------------------------------------------------------------- /02 - Classification/00.0 - TensorFlow Version Update.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [ 8 | { 9 | "name": "stdout", 10 | "output_type": "stream", 11 | "text": [ 12 | "Collecting tensorflow\n", 13 | " Downloading tensorflow-1.4.0-cp36-cp36m-macosx_10_11_x86_64.whl (39.3MB)\n", 14 | "Collecting tensorflow-tensorboard<0.5.0,>=0.4.0rc1 (from tensorflow)\n", 15 | " Downloading tensorflow_tensorboard-0.4.0rc2-py3-none-any.whl (1.7MB)\n", 16 | "Requirement already up-to-date: protobuf>=3.3.0 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow)\n", 17 | "Requirement already up-to-date: numpy>=1.12.1 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow)\n", 18 | "Requirement already 
up-to-date: wheel>=0.26 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow)\n", 19 | "Collecting enum34>=1.1.6 (from tensorflow)\n", 20 | " Downloading enum34-1.1.6-py3-none-any.whl\n", 21 | "Requirement already up-to-date: six>=1.10.0 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow)\n", 22 | "Requirement already up-to-date: werkzeug>=0.11.10 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow-tensorboard<0.5.0,>=0.4.0rc1->tensorflow)\n", 23 | "Requirement already up-to-date: html5lib==0.9999999 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow-tensorboard<0.5.0,>=0.4.0rc1->tensorflow)\n", 24 | "Requirement already up-to-date: markdown>=2.6.8 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow-tensorboard<0.5.0,>=0.4.0rc1->tensorflow)\n", 25 | "Requirement already up-to-date: bleach==1.5.0 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow-tensorboard<0.5.0,>=0.4.0rc1->tensorflow)\n", 26 | "Requirement already up-to-date: setuptools in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from protobuf>=3.3.0->tensorflow)\n", 27 | "Installing collected packages: tensorflow-tensorboard, enum34, tensorflow\n", 28 | " Found existing installation: tensorflow-tensorboard 0.1.8\n", 29 | " Uninstalling tensorflow-tensorboard-0.1.8:\n", 30 | " Successfully uninstalled tensorflow-tensorboard-0.1.8\n", 31 | " Found existing installation: tensorflow 1.3.0\n", 32 | " Uninstalling tensorflow-1.3.0:\n", 33 | " Successfully uninstalled tensorflow-1.3.0\n", 34 | "Successfully installed enum34-1.1.6 tensorflow-1.4.0 tensorflow-tensorboard-0.4.0rc2\n" 35 | ] 36 | } 37 | ], 38 | "source": [ 39 | "%%bash\n", 40 | "\n", 41 | "pip install -U tensorflow" 42 | ] 43 | }, 44 | { 45 | "cell_type": "code", 46 | "execution_count": 2, 47 | "metadata": {}, 48 | "outputs": [ 49 | { 50 | "name": "stderr", 51 | "output_type": "stream", 52 | "text": [ 53 | "/Users/khalidsalama/anaconda/lib/python3.6/importlib/_bootstrap.py:205: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6\n", 54 | " return f(*args, **kwds)\n" 55 | ] 56 | }, 57 | { 58 | "name": "stdout", 59 | "output_type": "stream", 60 | "text": [ 61 | "1.4.0\n" 62 | ] 63 | } 64 | ], 65 | "source": [ 66 | "import tensorflow as tf\n", 67 | "print(tf.__version__)" 68 | ] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": null, 73 | "metadata": { 74 | "collapsed": true 75 | }, 76 | "outputs": [], 77 | "source": [] 78 | } 79 | ], 80 | "metadata": { 81 | "kernelspec": { 82 | "display_name": "Python 3", 83 | "language": "python", 84 | "name": "python3" 85 | }, 86 | "language_info": { 87 | "codemirror_mode": { 88 | "name": "ipython", 89 | "version": 3 90 | }, 91 | "file_extension": ".py", 92 | "mimetype": "text/x-python", 93 | "name": "python", 94 | "nbconvert_exporter": "python", 95 | "pygments_lexer": "ipython3", 96 | "version": "3.6.1" 97 | } 98 | }, 99 | "nbformat": 4, 100 | "nbformat_minor": 2 101 | } 102 | -------------------------------------------------------------------------------- /02 - Classification/02.0 - Convert CSV to TFRecords.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [ 8 | { 9 | "name": "stderr", 10 | "output_type": "stream", 11 | "text": [ 12 | 
"/Users/khalidsalama/anaconda/lib/python3.6/importlib/_bootstrap.py:205: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6\n", 13 | " return f(*args, **kwds)\n" 14 | ] 15 | }, 16 | { 17 | "name": "stdout", 18 | "output_type": "stream", 19 | "text": [ 20 | "1.4.0\n" 21 | ] 22 | } 23 | ], 24 | "source": [ 25 | "import tensorflow as tf\n", 26 | "import csv\n", 27 | "import os\n", 28 | "\n", 29 | "print(tf.__version__)" 30 | ] 31 | }, 32 | { 33 | "cell_type": "code", 34 | "execution_count": 2, 35 | "metadata": { 36 | "collapsed": true 37 | }, 38 | "outputs": [], 39 | "source": [ 40 | "train_data_files = ['data/train-data.csv']\n", 41 | "valid_data_files = ['data/valid-data.csv']\n", 42 | "test_data_files = ['data/test-data.csv']" 43 | ] 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": 3, 48 | "metadata": {}, 49 | "outputs": [ 50 | { 51 | "name": "stdout", 52 | "output_type": "stream", 53 | "text": [ 54 | "Header: ['key', 'x', 'y', 'alpha', 'beta', 'target']\n", 55 | "Numeric Features: ['x', 'y']\n", 56 | "Categorical Features: ['alpha', 'beta']\n", 57 | "Target: target - labels: ['postive', 'negative']\n", 58 | "Unused Features: ['key']\n" 59 | ] 60 | } 61 | ], 62 | "source": [ 63 | "HEADER = ['key','x','y','alpha','beta','target']\n", 64 | "HEADER_DEFAULTS = [[0], [0.0], [0.0], ['NA'], ['NA'], ['NA']]\n", 65 | "\n", 66 | "NUMERIC_FEATURE_NAMES = ['x', 'y'] \n", 67 | "\n", 68 | "CATEGORICAL_FEATURE_NAMES_WITH_VOCABULARY = {'alpha':['ax01', 'ax02'], 'beta':['bx01', 'bx02']}\n", 69 | "CATEGORICAL_FEATURE_NAMES = list(CATEGORICAL_FEATURE_NAMES_WITH_VOCABULARY.keys())\n", 70 | "\n", 71 | "FEATURE_NAMES = NUMERIC_FEATURE_NAMES + CATEGORICAL_FEATURE_NAMES\n", 72 | "\n", 73 | "TARGET_NAME = 'target'\n", 74 | "\n", 75 | "TARGET_LABELS = ['postive', 'negative']\n", 76 | "\n", 77 | "UNUSED_FEATURE_NAMES = list(set(HEADER) - set(FEATURE_NAMES) - {TARGET_NAME})\n", 78 | "\n", 79 | "print(\"Header: {}\".format(HEADER))\n", 80 | "print(\"Numeric Features: {}\".format(NUMERIC_FEATURE_NAMES))\n", 81 | "print(\"Categorical Features: {}\".format(CATEGORICAL_FEATURE_NAMES))\n", 82 | "print(\"Target: {} - labels: {}\".format(TARGET_NAME, TARGET_LABELS))\n", 83 | "print(\"Unused Features: {}\".format(UNUSED_FEATURE_NAMES))" 84 | ] 85 | }, 86 | { 87 | "cell_type": "code", 88 | "execution_count": 4, 89 | "metadata": { 90 | "collapsed": true 91 | }, 92 | "outputs": [], 93 | "source": [ 94 | "def create_csv_iterator(csv_file_path, skip_header):\n", 95 | " \n", 96 | " with tf.gfile.Open(csv_file_path) as csv_file:\n", 97 | " reader = csv.reader(csv_file)\n", 98 | " if skip_header: # Skip the header\n", 99 | " next(reader)\n", 100 | " for row in reader:\n", 101 | " yield row" 102 | ] 103 | }, 104 | { 105 | "cell_type": "code", 106 | "execution_count": 5, 107 | "metadata": { 108 | "collapsed": true 109 | }, 110 | "outputs": [], 111 | "source": [ 112 | "def create_example(row):\n", 113 | " \"\"\"\n", 114 | " Returns a tensorflow.Example Protocol Buffer object.\n", 115 | " \"\"\"\n", 116 | " example = tf.train.Example()\n", 117 | "\n", 118 | " for i in range(len(HEADER)):\n", 119 | " \n", 120 | " feature_name = HEADER[i]\n", 121 | " feature_value = row[i]\n", 122 | " \n", 123 | " if feature_name in UNUSED_FEATURE_NAMES:\n", 124 | " continue\n", 125 | " \n", 126 | " if feature_name in NUMERIC_FEATURE_NAMES:\n", 127 | " example.features.feature[feature_name].float_list.value.extend([float(feature_value)])\n", 128 | " \n", 129 | 
" elif feature_name in CATEGORICAL_FEATURE_NAMES:\n", 130 | " example.features.feature[feature_name].bytes_list.value.extend([bytes(feature_value, 'utf-8')])\n", 131 | "\n", 132 | " elif feature_name in TARGET_NAME:\n", 133 | " example.features.feature[feature_name].bytes_list.value.extend([bytes(feature_value, 'utf-8')])\n", 134 | "\n", 135 | " return example" 136 | ] 137 | }, 138 | { 139 | "cell_type": "code", 140 | "execution_count": 6, 141 | "metadata": { 142 | "collapsed": true 143 | }, 144 | "outputs": [], 145 | "source": [ 146 | "def create_tfrecords_file(input_csv_file):\n", 147 | " \"\"\"\n", 148 | " Creates a TFRecords file for the given input data and\n", 149 | " example transofmration function\n", 150 | " \"\"\"\n", 151 | " output_tfrecord_file = input_csv_file.replace(\"csv\",\"tfrecords\")\n", 152 | " writer = tf.python_io.TFRecordWriter(output_tfrecord_file)\n", 153 | " \n", 154 | " print(\"Creating TFRecords file at\", output_tfrecord_file, \"...\")\n", 155 | " \n", 156 | " for i, row in enumerate(create_csv_iterator(input_csv_file, skip_header=False)):\n", 157 | " \n", 158 | " if len(row) == 0:\n", 159 | " continue\n", 160 | " \n", 161 | " example = create_example(row)\n", 162 | " content = example.SerializeToString()\n", 163 | " writer.write(content)\n", 164 | " \n", 165 | " writer.close()\n", 166 | " \n", 167 | " print(\"Finish Writing\", output_tfrecord_file)" 168 | ] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "execution_count": 7, 173 | "metadata": {}, 174 | "outputs": [ 175 | { 176 | "name": "stdout", 177 | "output_type": "stream", 178 | "text": [ 179 | "Converting Training Data Files\n", 180 | "Creating TFRecords file at data/train-data.tfrecords ...\n", 181 | "Finish Writing data/train-data.tfrecords\n", 182 | "\n", 183 | "Converting Validation Data Files\n", 184 | "Creating TFRecords file at data/valid-data.tfrecords ...\n", 185 | "Finish Writing data/valid-data.tfrecords\n", 186 | "\n", 187 | "Converting Test Data Files\n", 188 | "Creating TFRecords file at data/test-data.tfrecords ...\n", 189 | "Finish Writing data/test-data.tfrecords\n" 190 | ] 191 | } 192 | ], 193 | "source": [ 194 | "print(\"Converting Training Data Files\")\n", 195 | "for input_csv_file in train_data_files:\n", 196 | " create_tfrecords_file(input_csv_file)\n", 197 | "print(\"\")\n", 198 | "\n", 199 | "print(\"Converting Validation Data Files\")\n", 200 | "for input_csv_file in valid_data_files:\n", 201 | " create_tfrecords_file(input_csv_file)\n", 202 | "print(\"\")\n", 203 | "\n", 204 | "print(\"Converting Test Data Files\")\n", 205 | "for input_csv_file in test_data_files:\n", 206 | " create_tfrecords_file(input_csv_file)" 207 | ] 208 | }, 209 | { 210 | "cell_type": "code", 211 | "execution_count": null, 212 | "metadata": { 213 | "collapsed": true 214 | }, 215 | "outputs": [], 216 | "source": [] 217 | } 218 | ], 219 | "metadata": { 220 | "kernelspec": { 221 | "display_name": "Python 3", 222 | "language": "python", 223 | "name": "python3" 224 | }, 225 | "language_info": { 226 | "codemirror_mode": { 227 | "name": "ipython", 228 | "version": 3 229 | }, 230 | "file_extension": ".py", 231 | "mimetype": "text/x-python", 232 | "name": "python", 233 | "nbconvert_exporter": "python", 234 | "pygments_lexer": "ipython3", 235 | "version": "3.6.1" 236 | } 237 | }, 238 | "nbformat": 4, 239 | "nbformat_minor": 2 240 | } 241 | -------------------------------------------------------------------------------- /02 - Classification/data/adult.stats.csv: 
-------------------------------------------------------------------------------- 1 | ,max,mean,min,stdv 2 | age,90,38.58164675532078,17,13.640432553581146 3 | fnlwgt,1484705,189778.36651208502,12285,105549.97769702235 4 | education_num,16,10.0806793403151,1,2.5727203320673406 5 | capital_gain,99999,1077.6488437087312,0,7385.292084839299 6 | capital_loss,4356,87.303829734959,0,402.96021864905896 7 | hours_per_week,99,40.437455852092995,1,12.347428681730811 8 | -------------------------------------------------------------------------------- /02 - Classification/data/test-data.tfrecords: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ksalama/tf-estimator-tutorials/cecfea0c378ebc8552941c9ebf8a530228dd845d/02 - Classification/data/test-data.tfrecords -------------------------------------------------------------------------------- /02 - Classification/data/train-data.tfrecords: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ksalama/tf-estimator-tutorials/cecfea0c378ebc8552941c9ebf8a530228dd845d/02 - Classification/data/train-data.tfrecords -------------------------------------------------------------------------------- /02 - Classification/data/valid-data.tfrecords: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ksalama/tf-estimator-tutorials/cecfea0c378ebc8552941c9ebf8a530228dd845d/02 - Classification/data/valid-data.tfrecords -------------------------------------------------------------------------------- /03 - Clustering/00.0 - TensorFlow Version Update.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [ 8 | { 9 | "name": "stdout", 10 | "output_type": "stream", 11 | "text": [ 12 | "Collecting tensorflow\n", 13 | " Downloading tensorflow-1.4.0-cp36-cp36m-macosx_10_11_x86_64.whl (39.3MB)\n", 14 | "Collecting tensorflow-tensorboard<0.5.0,>=0.4.0rc1 (from tensorflow)\n", 15 | " Downloading tensorflow_tensorboard-0.4.0rc2-py3-none-any.whl (1.7MB)\n", 16 | "Requirement already up-to-date: protobuf>=3.3.0 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow)\n", 17 | "Requirement already up-to-date: numpy>=1.12.1 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow)\n", 18 | "Requirement already up-to-date: wheel>=0.26 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow)\n", 19 | "Collecting enum34>=1.1.6 (from tensorflow)\n", 20 | " Downloading enum34-1.1.6-py3-none-any.whl\n", 21 | "Requirement already up-to-date: six>=1.10.0 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow)\n", 22 | "Requirement already up-to-date: werkzeug>=0.11.10 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow-tensorboard<0.5.0,>=0.4.0rc1->tensorflow)\n", 23 | "Requirement already up-to-date: html5lib==0.9999999 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow-tensorboard<0.5.0,>=0.4.0rc1->tensorflow)\n", 24 | "Requirement already up-to-date: markdown>=2.6.8 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from tensorflow-tensorboard<0.5.0,>=0.4.0rc1->tensorflow)\n", 25 | "Requirement already up-to-date: bleach==1.5.0 in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from 
tensorflow-tensorboard<0.5.0,>=0.4.0rc1->tensorflow)\n", 26 | "Requirement already up-to-date: setuptools in /Users/khalidsalama/anaconda/lib/python3.6/site-packages (from protobuf>=3.3.0->tensorflow)\n", 27 | "Installing collected packages: tensorflow-tensorboard, enum34, tensorflow\n", 28 | " Found existing installation: tensorflow-tensorboard 0.1.8\n", 29 | " Uninstalling tensorflow-tensorboard-0.1.8:\n", 30 | " Successfully uninstalled tensorflow-tensorboard-0.1.8\n", 31 | " Found existing installation: tensorflow 1.3.0\n", 32 | " Uninstalling tensorflow-1.3.0:\n", 33 | " Successfully uninstalled tensorflow-1.3.0\n", 34 | "Successfully installed enum34-1.1.6 tensorflow-1.4.0 tensorflow-tensorboard-0.4.0rc2\n" 35 | ] 36 | } 37 | ], 38 | "source": [ 39 | "%%bash\n", 40 | "\n", 41 | "pip install -U tensorflow" 42 | ] 43 | }, 44 | { 45 | "cell_type": "code", 46 | "execution_count": 2, 47 | "metadata": {}, 48 | "outputs": [ 49 | { 50 | "name": "stderr", 51 | "output_type": "stream", 52 | "text": [ 53 | "/Users/khalidsalama/anaconda/lib/python3.6/importlib/_bootstrap.py:205: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6\n", 54 | " return f(*args, **kwds)\n" 55 | ] 56 | }, 57 | { 58 | "name": "stdout", 59 | "output_type": "stream", 60 | "text": [ 61 | "1.4.0\n" 62 | ] 63 | } 64 | ], 65 | "source": [ 66 | "import tensorflow as tf\n", 67 | "print(tf.__version__)" 68 | ] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": null, 73 | "metadata": { 74 | "collapsed": true 75 | }, 76 | "outputs": [], 77 | "source": [] 78 | } 79 | ], 80 | "metadata": { 81 | "kernelspec": { 82 | "display_name": "Python 3", 83 | "language": "python", 84 | "name": "python3" 85 | }, 86 | "language_info": { 87 | "codemirror_mode": { 88 | "name": "ipython", 89 | "version": 3 90 | }, 91 | "file_extension": ".py", 92 | "mimetype": "text/x-python", 93 | "name": "python", 94 | "nbconvert_exporter": "python", 95 | "pygments_lexer": "ipython3", 96 | "version": "3.6.1" 97 | } 98 | }, 99 | "nbformat": 4, 100 | "nbformat_minor": 2 101 | } 102 | -------------------------------------------------------------------------------- /03 - Clustering/data/new-data.csv: -------------------------------------------------------------------------------- 1 | 0.5,-0.3,7 -------------------------------------------------------------------------------- /04 - Times Series/data/test-data.csv: -------------------------------------------------------------------------------- 1 | time_index,value 2 | 700,4.062140225471643 3 | 701,3.1703847192297845 4 | 702,2.8296873454315192 5 | 703,3.6961238939396597 6 | 704,3.774688382600603 7 | 705,3.8527851934977764 8 | 706,3.0342060812112686 9 | 707,3.3306549645378025 10 | 708,3.6487243344222495 11 | 709,2.1335273551814233 12 | 710,3.683338770039675 13 | 711,3.4407180005651257 14 | 712,3.1847285264243808 15 | 713,2.8896834732269645 16 | 714,3.443554099017879 17 | 715,3.1628527640306943 18 | 716,3.477295109150082 19 | 717,3.36526258592279 20 | 718,3.122152506677769 21 | 719,3.348075456125274 22 | 720,2.4504905110212265 23 | 721,3.00950452903947 24 | 722,3.205798117619332 25 | 723,2.5846071060841815 26 | 724,2.3892445742499167 27 | 725,3.3945942547686103 28 | 726,2.561123153365352 29 | 727,2.035257638932893 30 | 728,2.9801074278502275 31 | 729,2.9562399361791156 32 | 730,2.1654708168278356 33 | 731,3.4468449142981705 34 | 732,2.6893807928426563 35 | 733,3.025794994419157 36 | 734,2.542869596532311 37 | 
735,2.9275771470778706 38 | 736,2.8204505932091055 39 | 737,3.527816758474815 40 | 738,2.197418634802284 41 | 739,2.554280646235888 42 | 740,2.4324240602338847 43 | 741,3.1271375891212405 44 | 742,2.2850209514914006 45 | 743,2.0776756899911613 46 | 744,2.529000935802995 47 | 745,3.297087742223073 48 | 746,2.2394742253878963 49 | 747,3.1367437479006797 50 | 748,2.3953147600203675 51 | 749,2.8848458913301296 52 | 750,2.9185911092297903 53 | 751,2.768126620869814 54 | 752,2.43488473055407 55 | 753,2.8870032425325123 56 | 754,3.317655820661928 57 | 755,2.1790416388446836 58 | 756,2.7702407610447577 59 | 757,2.554226484730687 60 | 758,2.8134188158141438 61 | 759,2.758781861045474 62 | 760,2.272104718154779 63 | 761,2.8103970647324372 64 | 762,2.8972594904941387 65 | 763,3.4002482934772478 66 | 764,3.3455711599757834 67 | 765,2.715918824573258 68 | 766,3.7061718277620113 69 | 767,3.0195081640399204 70 | 768,3.4891004444538325 71 | 769,2.9311254106642193 72 | 770,2.3379837346623598 73 | 771,2.5146941193432775 74 | 772,3.534476172205595 75 | 773,3.0003799070150836 76 | 774,2.8915136087990785 77 | 775,2.393552222327803 78 | 776,3.011905423311392 79 | 777,3.8801787347996632 80 | 778,3.2515228547754886 81 | 779,2.789501465000945 82 | 780,3.426429385272551 83 | 781,3.418712155395797 84 | 782,3.983713621739207 85 | 783,3.6345931864075576 86 | 784,3.028427715036325 87 | 785,3.6675103028495144 88 | 786,4.199142625113263 89 | 787,3.0825750004211327 90 | 788,3.3340944569219486 91 | 789,3.7900930100567076 92 | 790,3.9891451701449654 93 | 791,4.437402936216056 94 | 792,3.483434383801479 95 | 793,4.856432283156268 96 | 794,3.112032068064219 97 | 795,3.764822361311284 98 | 796,4.778499314027573 99 | 797,3.33724185989896 100 | 798,3.8058737331849453 101 | 799,3.9223811712262653 102 | 800,4.80546589113736 103 | 801,4.421552582453292 104 | 802,3.606081714628961 105 | 803,3.9941737176325596 106 | 804,4.662649705612334 107 | 805,4.018590914241019 108 | 806,3.5680466115701646 109 | 807,5.103635450598651 110 | 808,4.553832764926619 111 | 809,4.480087371204185 112 | 810,4.462603498918542 113 | 811,4.2137200426188075 114 | 812,4.189374217427936 115 | 813,4.044349362105051 116 | 814,3.3654308023514417 117 | 815,4.551988909577272 118 | 816,5.281251897092956 119 | 817,4.919655962013503 120 | 818,4.268853670537956 121 | 819,5.326461607549719 122 | 820,4.423531000313117 123 | 821,4.203178570982242 124 | 822,4.120263855677827 125 | 823,3.776759734748973 126 | 824,4.5429757684426 127 | 825,5.351165193685153 128 | 826,4.3428152492354775 129 | 827,5.394929077351952 130 | 828,5.218609727629257 131 | 829,4.9831655977115625 132 | 830,5.602842952189427 133 | 831,5.3664242391999775 134 | 832,5.14450210344502 135 | 833,5.014801223804789 136 | 834,5.404549894954248 137 | 835,4.611614806903722 138 | 836,5.91369549455372 139 | 837,5.575199712425203 140 | 838,4.551385680651797 141 | 839,5.696239295334581 142 | 840,5.673983860921718 143 | 841,5.131646815240395 144 | 842,4.7304053547507685 145 | 843,5.131704930574379 146 | 844,5.350692049974139 147 | 845,5.043122726463051 148 | 846,5.433980654640878 149 | 847,5.392811171818018 150 | 848,6.127902771533075 151 | 849,4.948801899758867 152 | 850,5.672670683819614 153 | 851,4.619344342691638 154 | 852,4.461927290385413 155 | 853,5.134271002568175 156 | 854,5.244183015774759 157 | 855,5.0454199444160315 158 | 856,5.63991663670262 159 | 857,5.444551179447414 160 | 858,5.358876256297602 161 | 859,6.300157516455733 162 | 860,5.521687291262919 163 | 861,6.482989918871226 164 | 862,4.452113139646457 
165 | 863,5.947519115228699 166 | 864,4.732843968683768 167 | 865,4.663305658213866 168 | 866,5.060828778618426 169 | 867,5.630137501067726 170 | 868,4.837622754661279 171 | 869,4.589984321432029 172 | 870,5.149519770472633 173 | 871,4.926183108085338 174 | 872,5.529322212911065 175 | 873,4.757430665280789 176 | 874,5.39173836256956 177 | 875,5.23465202505217 178 | 876,4.714170978848213 179 | 877,4.662839356640053 180 | 878,4.60819971256791 181 | 879,4.882721694617192 182 | 880,5.390915345465747 183 | 881,3.8287811359231476 184 | 882,4.905994302868104 185 | 883,5.1710621658328515 186 | 884,4.391188353483403 187 | 885,4.748422379069466 188 | 886,5.83622319255817 189 | 887,5.085489278108183 190 | 888,5.085950301210101 191 | 889,5.267403853016739 192 | 890,5.494662308615086 193 | 891,5.113984813073912 194 | 892,5.1585692571022514 195 | 893,4.4319546043214695 196 | 894,4.387326346526101 197 | 895,4.756366569062057 198 | 896,4.413291810154036 199 | 897,5.013658966087423 200 | 898,4.549167180263243 201 | 899,5.172728541719882 202 | 900,3.886746174651208 203 | 901,4.588569016237771 204 | 902,4.929271797376781 205 | 903,4.599656125469891 206 | 904,4.808639274005556 207 | 905,4.325581040660617 208 | 906,4.194580144654176 209 | 907,3.9974315940262444 210 | 908,4.715515557271075 211 | 909,4.1909689237542285 212 | 910,4.074666135679668 213 | 911,4.901169926148987 214 | 912,3.9552622873015917 215 | 913,3.5796376754546113 216 | 914,4.711809225431517 217 | 915,3.6796683690417753 218 | 916,3.5442679947216176 219 | 917,3.1421330422267806 220 | 918,3.756923255399997 221 | 919,4.221580470129694 222 | 920,3.7462046848740105 223 | 921,4.185030793915441 224 | 922,3.396145019184779 225 | 923,4.5190483220341156 226 | 924,4.397364432124365 227 | 925,4.187464540915996 228 | 926,3.9272795891536907 229 | 927,3.6197921717482333 230 | 928,4.032448297113617 231 | 929,4.220999672128967 232 | 930,4.257691213195231 233 | 931,4.154362183090802 234 | 932,4.263289039195564 235 | 933,4.223630570761886 236 | 934,3.5084926009711603 237 | 935,4.062713546013925 238 | 936,4.273967524010026 239 | 937,3.9695138571146784 240 | 938,3.8583311980482913 241 | 939,2.9247548632596514 242 | 940,3.854652554626871 243 | 941,3.234429271148347 244 | 942,3.044781194988175 245 | 943,3.656513334659649 246 | 944,3.4818997204930073 247 | 945,2.875852392487933 248 | 946,3.961998294895795 249 | 947,4.105299492259828 250 | 948,4.216647670087034 251 | 949,3.6508133990630003 252 | 950,3.8246399910006907 253 | 951,3.9922875618756573 254 | 952,3.588030199403815 255 | 953,4.384184812214926 256 | 954,3.5120901674831724 257 | 955,3.418442907989169 258 | 956,3.2977455863735003 259 | 957,2.752204501626123 260 | 958,3.6410521282086634 261 | 959,3.473679507191042 262 | 960,3.921614723961769 263 | 961,3.8925441023134137 264 | 962,3.4275589055043403 265 | 963,3.2211072262973106 266 | 964,3.109856818870202 267 | 965,4.680680857691393 268 | 966,4.546573184643018 269 | 967,3.306769027768169 270 | 968,3.981130047116258 271 | 969,3.8552929675600307 272 | 970,3.7797822835350914 273 | 971,4.427976023630056 274 | 972,4.202232494544553 275 | 973,5.007984438145008 276 | 974,4.489380804569116 277 | 975,4.378691233249306 278 | 976,3.844107079000932 279 | 977,3.2052757587187792 280 | 978,3.966125426862569 281 | 979,3.871352865158788 282 | 980,3.8431485227426725 283 | 981,3.657077938996487 284 | 982,4.356575779819392 285 | 983,3.991957980024272 286 | 984,3.937984423830078 287 | 985,4.673590913114282 288 | 986,5.608245991898564 289 | 987,4.549470674312809 290 | 
988,4.706222218782882 291 | 989,4.006099739650296 292 | 990,4.735846319022681 293 | 991,4.395454732392768 294 | 992,4.807044589998613 295 | 993,4.40877616086605 296 | 994,3.623780385854121 297 | 995,5.332176415583953 298 | 996,4.798612805906225 299 | 997,4.91261012214747 300 | 998,5.1126834613174434 301 | 999,5.6472138130982374 302 | -------------------------------------------------------------------------------- /04 - Times Series/data/timeseries-multivariate.txt: -------------------------------------------------------------------------------- 1 | 0,0.926906299771,1.99107237682,2.56546245685,3.07914768197,4.04839057867 2 | 1,0.108010001864,1.41645361423,2.1686839775,2.94963962176,4.1263503303 3 | 2,-0.800567600028,1.0172132907,1.96434754116,2.99885333086,4.04300485864 4 | 3,0.0607042871898,0.719540073421,1.9765012584,2.89265588817,4.0951014426 5 | 4,0.933712200629,0.28052120776,1.41018552514,2.69232603996,4.06481164223 6 | 5,-0.171730652974,0.260054421028,1.48770816369,2.62199129293,4.44572807842 7 | 6,-1.00180162933,0.333045158863,1.50006392277,2.88888309683,4.24755865606 8 | 7,0.0580061875336,0.688929398826,1.56543458772,2.99840358953,4.52726873347 9 | 8,0.764139447412,1.24704875327,1.77649279698,3.13578593851,4.63238922951 10 | 9,-0.230331874785,1.47903998963,2.03547545751,3.20624030377,4.77980005228 11 | 10,-1.03846045211,2.01133000781,2.31977503972,3.67951536251,5.09716775897 12 | 11,0.188643592253,2.23285349038,2.68338482249,3.49817168611,5.24928239634 13 | 12,0.91207302309,2.24244446841,2.71362604985,3.96332587625,5.37802271594 14 | 13,-0.296588665881,2.02594634141,3.07733910479,3.99698324956,5.56365901394 15 | 14,-0.959961476551,1.45078629833,3.18996420137,4.3763059609,5.65356015609 16 | 15,0.46313530679,1.01141441548,3.4980215948,4.20224896882,5.88842247449 17 | 16,0.929354125798,0.626635305936,3.70508262244,4.51791573544,5.73945973251 18 | 17,-0.519110731957,0.269249223148,3.39866823332,4.46802003061,5.82768174382 19 | 18,-0.924330981367,0.349602834684,3.21762413294,4.72803587499,5.94918925767 20 | 19,0.253239387885,0.345158023497,3.11071425333,4.79311566935,5.9489259713 21 | 20,0.637408390225,0.698996675371,3.25232492145,4.73814732384,5.9612010251 22 | 21,-0.407396859412,1.17456342803,2.49526823723,4.59323415742,5.82501686811 23 | 22,-0.967485452118,1.66655933642,2.47284606244,4.58316034754,5.88721406681 24 | 23,0.474480867904,1.95018556323,2.0228950072,4.48651142819,5.8255943735 25 | 24,1.04309652155,2.23519892356,1.91924131572,4.19094661783,5.87457348436 26 | 25,-0.517861513772,2.12501967336,1.70266619979,4.05280882887,5.72160912899 27 | 26,-0.945301585146,1.65464653549,1.81567174251,3.92309850635,5.58270493814 28 | 27,0.501153868974,1.40600764889,1.53991387719,3.72853247942,5.60169001727 29 | 28,0.972859524418,1.00344321868,1.5175642828,3.64092376655,5.10567722582 30 | 29,-0.70553406135,0.465306263885,1.7038540803,3.33236870312,5.09182481555 31 | 30,-0.946093634916,0.294539309453,1.88052827037,2.93011492669,4.97354922696 32 | 31,0.47922123231,0.308465865031,2.03445883031,2.90772899045,4.86241793548 33 | 32,0.754030014252,0.549752241167,2.46115815089,2.95063349534,4.71834614627 34 | 33,-0.64875949826,0.894615488148,2.5922463381,2.81269864022,4.43480095104 35 | 34,-0.757829951086,1.39123914261,2.69258079904,2.61834837315,4.36580046156 36 | 35,0.565653301088,1.72360022693,2.97794913834,2.80403840334,4.27327248459 37 | 36,0.867440092372,2.21100730052,3.38648090792,2.84057515729,4.12210169576 38 | 37,-0.894567758095,2.17549105818,3.45532493329,2.90446025717,4.00251740584 39 | 
38,-0.715442356893,2.15105389965,3.52041791902,3.03650393392,4.12809249577 40 | 39,0.80671703672,1.81504564517,3.60463324866,3.00747789871,3.98440762467 41 | 40,0.527014790142,1.31803513865,3.43842186337,3.3332594663,4.03232406566 42 | 41,-0.795936862129,0.847809114454,3.09875133548,3.52863155938,3.94883924909 43 | 42,-0.610245806946,0.425530441018,2.92581949152,3.77238736123,4.27287245021 44 | 43,0.611662279431,0.178432049837,2.48128214822,3.73212087883,4.17319013831 45 | 44,0.650866553108,0.220341648392,2.41694642022,4.2609098519,4.27271645905 46 | 45,-0.774156982023,0.632667602331,2.05474356052,4.32889204886,4.18029723271 47 | 46,-0.714058448409,0.924562377599,1.75706135146,4.52492718422,4.3972678094 48 | 47,0.889627293379,1.46207968841,1.78299357672,4.64466731095,4.56317887554 49 | 48,0.520140662861,1.8996333843,1.41377633823,4.48899091177,4.78805049769 50 | 49,-1.03816935616,2.08997002059,1.51218375351,4.84167764204,4.93026048606 51 | 50,-0.40772951362,2.30878972136,1.44144415128,4.76854460997,5.01538444629 52 | 51,0.792730684781,1.91367048509,1.58887384677,4.71739397335,5.25690012199 53 | 52,0.371311881576,1.67565079528,1.81688563053,4.60353107555,5.44265822961 54 | 53,-0.814398070371,1.13374634126,1.80328814859,4.72264252878,5.52674761122 55 | 54,-0.469017949323,0.601244136627,2.29690896736,4.49859178859,5.54126153454 56 | 55,0.871044371426,0.407597593794,2.7499112487,4.19060637761,5.57693767301 57 | 56,0.523764933017,0.247705192709,3.09002071379,4.02095509006,5.80510362182 58 | 57,-0.881326403531,0.31513103164,3.11358205718,3.96079100808,5.81000652365 59 | 58,-0.357928025339,0.486163915865,3.17884556771,3.72634990659,5.85693642011 60 | 59,0.853038779822,1.04218094475,3.45835384454,3.36703969978,5.9585988449 61 | 60,0.435311516013,1.59715085283,3.63313338588,3.11276729421,5.93643818229 62 | 61,-1.02703719138,1.92205832542,3.47606111735,3.06247155999,6.02106646259 63 | 62,-0.246661325557,2.14653802542,3.29446326567,2.89936259181,5.67531541272 64 | 63,1.02554736569,2.25943737733,3.07031591528,2.78176218013,5.78206328989 65 | 64,0.337814475969,2.07589147224,2.80356226089,2.55888206331,5.7094075496 66 | 65,-1.12023369929,1.25333011618,2.56497288445,2.77361359194,5.50799418376 67 | 66,-0.178980246554,1.11937139901,2.51598681313,2.91438309151,5.47469577206 68 | 67,0.97550951531,0.60553823137,2.11657741073,2.88081098981,5.37034999502 69 | 68,0.136653357206,0.365828836075,1.97386033165,3.13217903204,5.07254490219 70 | 69,-1.05607596951,0.153152115069,1.52110743825,3.01308794192,5.08902539125 71 | 70,-0.13095280331,0.337113974483,1.52703079853,3.16687131599,4.86649398514 72 | 71,1.07081057754,0.714247566736,1.53761382634,3.45151989484,4.75892309166 73 | 72,0.0153410376082,1.24631231847,1.61690939161,3.85481994498,4.35683752832 74 | 73,-0.912801257303,1.60791309476,1.8729264524,4.03037260012,4.36072588913 75 | 74,-0.0894895640338,2.02535207407,1.93484909619,4.09557485132,4.35327025188 76 | 75,0.978646999652,2.20085086625,2.09003440427,4.27542353033,4.1805058388 77 | 76,-0.113312642876,2.2444100761,2.50789248839,4.4151861502,4.03267168136 78 | 77,-1.00215099149,1.84305628445,2.61691237246,4.45425147595,3.81203553766 79 | 78,-0.0183234614205,1.49573923116,2.99308471214,4.71134960112,4.0273804959 80 | 79,1.0823738177,1.12211589848,3.27079386925,4.94288270502,4.01851068083 81 | 80,0.124370187893,0.616474412808,3.4284236674,4.76942168327,3.9749536483 82 | 81,-0.929423379352,0.290977090976,3.34131726136,4.78590392707,4.10190661656 83 | 
82,0.23766302648,0.155302052254,3.49779513794,4.64605656795,4.15571321107 84 | 83,1.03531486192,0.359702776204,3.4880725919,4.48167586667,4.21134561991 85 | 84,-0.261234571382,0.713877760378,3.42756426614,4.426443869,4.25208300527 86 | 85,-1.03572442277,1.25001113691,2.96908341113,4.25500915322,4.25723010649 87 | 86,0.380034261243,1.70543355622,2.73605932518,4.16703432307,4.63700400788 88 | 87,1.03734873488,1.97544410562,2.55586572141,3.84976673263,4.55282864289 89 | 88,-0.177344253372,2.22614526325,2.09565864891,3.77378097953,4.82577400298 90 | 89,-0.976821526892,2.18385079177,1.78522284118,3.67768223554,5.06302440873 91 | 90,0.264820472091,1.86981946157,1.50048403865,3.43619796921,5.05651761669 92 | 91,1.05642344868,1.47568646076,1.51347671977,3.20898518885,5.50149047462 93 | 92,-0.311607433358,1.04226467636,1.52089650905,3.02291865417,5.4889046232 94 | 93,-0.724285777937,0.553052311957,1.48573560173,2.7365973598,5.72549174225 95 | 94,0.519859192905,0.226520626591,1.61543723167,2.84102086852,5.69330622288 96 | 95,1.0323195039,0.260873217055,1.81913034804,2.83951143848,5.90325028086 97 | 96,-0.53285682538,0.387695521405,1.70935609313,2.57977050631,5.79579213161 98 | 97,-0.975127997215,0.920948771589,2.51292643636,2.71004616612,5.87016469227 99 | 98,0.540246804099,1.36445470181,2.61949412896,2.98482553485,6.02447664937 100 | 99,0.987764008058,1.85581989607,2.84685706149,2.94760204892,6.0212151724 -------------------------------------------------------------------------------- /04 - Times Series/data/timeseries-univariate.csv: -------------------------------------------------------------------------------- 1 | 1,-0.6656603714 2 | 2,-0.1164380359 3 | 3,0.7398626488 4 | 4,0.7368633029 5 | 5,0.2289480898 6 | 6,2.257073255 7 | 7,3.023457405 8 | 8,2.481161007 9 | 9,3.773638612 10 | 10,5.059257738 11 | 11,3.553186083 12 | 12,4.554486452 13 | 13,3.655475698 14 | 14,3.419647598 15 | 15,4.303376245 16 | 16,4.830153934 17 | 17,7.253057441 18 | 18,5.064802335 19 | 19,5.448082106 20 | 20,6.251301517 21 | 21,6.214335675 22 | 22,3.07021164 23 | 23,6.995487627 24 | 24,7.180942656 25 | 25,6.084876071 26 | 26,6.95580607 27 | 27,6.692312738 28 | 28,6.339959049 29 | 29,7.659013269 30 | 30,6.157071564 31 | 31,4.023661782 32 | 32,7.380555018 33 | 33,6.972155839 34 | 34,6.655956847 35 | 35,6.532594924 36 | 36,6.780524726 37 | 37,6.723407547 38 | 38,7.616777776 39 | 39,6.394157367 40 | 40,5.046574011 41 | 41,5.715326568 42 | 42,6.536737479 43 | 43,6.527307846 44 | 44,5.671954159 45 | 45,6.508512087 46 | 46,4.740656344 47 | 47,5.449062618 48 | 48,5.796110609 49 | 49,4.802213058 50 | 50,4.627081034 51 | 51,5.748934924 52 | 52,4.05776044 53 | 53,2.743057715 54 | 54,3.590052501 55 | 55,2.937786376 56 | 56,5.333221794 57 | 57,5.102383904 58 | 58,5.097946146 59 | 59,2.771776766 60 | 60,3.75493571 61 | 61,3.268329562 62 | 62,3.127887555 63 | 63,5.723894838 64 | 64,2.365351066 65 | 65,2.030890988 66 | 66,5.74385257 67 | 67,2.637874242 68 | 68,2.851492945 69 | 69,1.907194917 70 | 70,2.568816256 71 | 71,3.869259698 72 | 72,3.989917724 73 | 73,3.641515351 74 | 74,2.812911768 75 | 75,4.964828171 76 | 76,3.050937945 77 | 77,4.203046785 78 | 78,4.269162745 79 | 79,2.818643243 80 | 80,3.334928424 81 | 81,5.239741508 82 | 82,4.972880771 83 | 83,5.212782208 84 | 84,6.056729012 85 | 85,5.404247421 86 | 86,4.733521027 87 | 87,5.241044888 88 | 88,6.844720502 89 | 89,8.242617764 90 | 90,6.686818708 91 | 91,6.429035591 92 | 92,7.45926043 93 | 93,8.225717423 94 | 94,7.661722793 95 | 95,8.348721917 96 | 96,8.029228135 97 | 
97,9.780942864 98 | 98,9.755623978 99 | 99,9.149489124 100 | 100,8.947965351 101 | 101,9.176768019 102 | 102,8.768408716 103 | 103,10.39624874 104 | 104,10.39477408 105 | 105,11.63126076 106 | 106,11.8222078 107 | 107,13.60107691 108 | 108,14.54919169 109 | 109,12.63475358 110 | 110,13.77411599 111 | 111,14.45808191 112 | 112,13.27674112 113 | 113,16.00004992 114 | 114,13.04977221 115 | 115,14.65730048 116 | 116,14.76178039 117 | 117,14.62716229 118 | 118,16.20697047 119 | 119,14.79470608 120 | 120,16.70541749 121 | 121,15.8638474 122 | 122,15.63192699 123 | 123,17.20433954 124 | 124,16.29180965 125 | 125,16.93688521 126 | 126,16.07521662 127 | 127,18.33942893 128 | 128,15.62502668 129 | 129,16.81519558 130 | 130,16.86177911 131 | 131,19.18323671 132 | 132,16.68993279 133 | 133,16.52735528 134 | 134,15.22702085 135 | 135,16.13574242 136 | 136,16.08079964 137 | 137,17.16828833 138 | 138,16.09004409 139 | 139,16.92712829 140 | 140,15.54298161 141 | 141,16.03893798 142 | 142,15.38310389 143 | 143,16.18064645 144 | 144,16.22326501 145 | 145,17.1657127 146 | 146,14.87850136 147 | 147,12.80968507 148 | 148,16.25354113 149 | 149,15.14082073 150 | 150,15.79111348 151 | 151,14.02005588 152 | 152,14.32583767 153 | 153,13.87437546 154 | 154,14.47127314 155 | 155,14.29661188 156 | 156,14.68406313 157 | 157,15.84514503 158 | 158,13.89667867 159 | 159,13.58135083 160 | 160,14.26005818 161 | 161,13.3826131 162 | 162,12.85293827 163 | 163,11.06745237 164 | 164,14.08812275 165 | 165,13.05949205 166 | 166,12.18454971 167 | 167,13.01005879 168 | 168,12.45032762 169 | 169,12.20445297 170 | 170,14.39420173 171 | 171,13.49261191 172 | 172,14.91460871 173 | 173,15.97672915 174 | 174,13.96235436 175 | 175,13.77840615 176 | 176,14.39425289 177 | 177,14.31499272 178 | 178,14.37080989 179 | 179,15.34130707 180 | 180,13.42441434 181 | 181,14.54726137 182 | 182,12.51644144 183 | 183,15.36040785 184 | 184,14.52577002 185 | 185,15.90562887 186 | 186,15.12482026 187 | 187,15.55534424 188 | 188,12.22427756 189 | 189,15.11554898 190 | 190,14.23464612 191 | 191,16.52156964 192 | 192,18.14558077 193 | 193,16.51932129 194 | 194,16.88159194 195 | 195,18.08337828 196 | 196,18.70889734 197 | 197,20.97040748 198 | 198,18.98358689 199 | 199,20.76308391 200 | 200,19.81117586 201 | 201,20.24139919 202 | 202,20.78884634 203 | 203,19.92458806 204 | 204,21.60401889 205 | 205,23.30040897 206 | 206,22.2621713 207 | 207,21.24305034 208 | 208,22.07690632 209 | 209,21.78022193 210 | 210,22.94853418 211 | 211,23.72076264 212 | 212,24.12217213 213 | 213,23.04498673 214 | 214,23.8767225 215 | 215,26.52157498 216 | 216,26.24329682 217 | 217,24.83932457 218 | 218,25.66570111 219 | 219,25.61834475 220 | 220,24.41079934 221 | 221,25.31871793 222 | 222,26.7612452 223 | 223,27.00663389 224 | 224,27.86719501 225 | 225,24.87319457 226 | 226,27.85768696 227 | 227,25.70405436 228 | 228,26.11077958 229 | 229,28.11250875 230 | 230,27.6743468 231 | 231,27.19705336 232 | 232,28.08086799 233 | 233,26.19946123 234 | 234,27.32830376 235 | 235,25.98334256 236 | 236,26.71791978 237 | 237,26.67921906 238 | 238,26.25811051 239 | 239,26.64228363 240 | 240,26.20667398 241 | 241,26.39816025 242 | 242,24.83672957 243 | 243,24.27745854 244 | 244,26.10007483 245 | 245,25.67761738 246 | 246,25.91667268 247 | 247,27.57057095 248 | 248,25.68913621 249 | 249,24.92375989 250 | 250,25.5593706 251 | 251,25.14638402 252 | 252,26.46738639 253 | 253,24.55740644 254 | 254,23.5691458 255 | 255,24.07138538 256 | 256,24.94177528 257 | 257,22.33546227 258 | 258,22.32323763 259 | 
259,24.38075647 260 | 260,22.40754744 261 | 261,22.61183469 262 | 262,23.28658677 263 | 263,22.98637689 264 | 264,25.46468191 265 | 265,24.14497597 266 | 266,22.97023633 267 | 267,24.37831161 268 | 268,24.86418705 269 | 269,22.61185053 270 | 270,21.70979546 271 | 271,22.09389192 272 | 272,23.25882086 273 | 273,23.56494308 274 | 274,24.13181731 275 | 275,24.28160263 276 | 276,24.43623736 277 | 277,23.24956419 278 | 278,21.76696726 279 | 279,25.14997786 280 | 280,24.67520728 281 | 281,23.40400797 282 | 282,26.24489282 283 | 283,25.05952039 284 | 284,24.53922399 285 | 285,24.89917455 286 | 286,25.13438134 287 | 287,26.05220822 288 | 288,26.94133112 289 | 289,26.02788294 290 | 290,26.65909349 291 | 291,26.0832158 292 | 292,27.39946496 293 | 293,26.57973099 294 | 294,27.49867838 295 | 295,29.89834253 296 | 296,27.78403709 297 | 297,28.92405258 298 | 298,26.58518509 299 | 299,30.91291741 300 | 300,31.73949474 301 | 301,29.25173685 302 | 302,30.3747463 303 | 303,30.59695095 304 | 304,31.50757627 305 | 305,30.97036633 306 | 306,31.27177079 307 | 307,33.43369051 308 | 308,33.9848363 309 | 309,33.31775176 310 | 310,31.69164009 311 | 311,33.07897081 312 | 312,33.10849644 313 | 313,33.29428375 314 | 314,35.60397723 315 | 315,35.33614012 316 | 316,33.95701506 317 | 317,35.16914759 318 | 318,35.92430987 319 | 319,35.81820171 320 | 320,37.36378976 321 | 321,36.74459793 322 | 322,35.27569759 323 | 323,35.9767425 324 | 324,36.17811539 325 | 325,35.68567729 326 | 326,35.54212562 327 | 327,38.78114238 328 | 328,36.46819618 329 | 329,38.07352601 330 | 330,36.56662256 331 | 331,38.1938068 332 | 332,37.42919226 333 | 333,37.44666875 334 | 334,37.16795054 335 | 335,34.97440399 336 | 336,35.6174255 337 | 337,37.37634133 338 | 338,37.26137677 339 | 339,38.09726659 340 | 340,36.04071363 341 | 341,37.07494746 342 | 342,34.4281316 343 | 343,35.1959716 344 | 344,35.26041345 345 | 345,36.9398346 346 | 346,33.58933988 347 | 347,35.00075536 348 | 348,35.97807689 349 | 349,35.66631707 350 | 350,35.44925794 351 | 351,33.69565848 352 | 352,35.38969147 353 | 353,35.96432261 354 | 354,33.6956667 355 | 355,34.05230212 356 | 356,32.70536873 357 | 357,33.91009672 358 | 358,34.45606416 359 | 359,34.97972516 360 | 360,32.36260234 361 | 361,31.69621537 362 | 362,33.02307596 363 | 363,33.94445036 364 | 364,32.2763097 365 | 365,32.06228645 366 | 366,34.25956906 367 | 367,33.61620818 368 | 368,35.00141908 369 | 369,34.47493965 370 | 370,34.31576327 371 | 371,33.24772844 372 | 372,32.95185358 373 | 373,32.55224164 374 | 374,33.06560689 375 | 375,35.2082848 376 | 376,34.50372086 377 | 377,33.54922461 378 | 378,35.46287805 379 | 379,34.68829823 380 | 380,35.04640557 381 | 381,33.48711975 382 | 382,34.03264662 383 | 383,34.43296169 384 | 384,35.7571391 385 | 385,32.58466542 386 | 386,34.44295272 387 | 387,35.43369124 388 | 388,37.7196386 389 | 389,37.55863215 390 | 390,35.11245844 391 | 391,37.36667774 392 | 392,36.41904568 393 | 393,38.11951592 394 | 394,39.351325 395 | 395,38.87795167 396 | 396,38.8144378 397 | 397,38.96059714 398 | 398,39.95536453 399 | 399,39.78580611 400 | 400,40.70319964 401 | 401,41.32804151 402 | 402,42.79937243 403 | 403,38.43432481 404 | 404,42.12051726 405 | 405,42.50068551 406 | 406,43.89812523 407 | 407,42.18632495 408 | 408,43.99716859 409 | 409,43.67726129 410 | 410,42.98072384 411 | 411,43.59181621 412 | 412,44.98283057 413 | 413,42.17674627 414 | 414,46.49541908 415 | 415,45.58212027 416 | 416,42.7202171 417 | 417,45.66108535 418 | 418,45.03844556 419 | 419,44.96618253 420 | 420,45.0371585 421 | 
421,46.12237848 422 | 422,46.18891162 423 | 423,46.82075672 424 | 424,47.25058257 425 | 425,45.91853936 426 | 426,46.83241571 427 | 427,47.77383153 428 | 428,48.12984438 429 | 429,46.74042025 430 | 430,46.66834779 431 | 431,47.41473153 432 | 432,46.93101415 433 | 433,48.24438209 434 | 434,47.41007874 435 | 435,46.92607209 436 | 436,46.77346554 437 | 437,47.80447575 438 | 438,45.7000972 439 | 439,46.60252512 440 | 440,45.59290618 441 | 441,47.37025588 442 | 442,46.46333171 443 | 443,46.19762396 444 | 444,47.57763766 445 | 445,46.92624737 446 | 446,46.1536802 447 | 447,45.94947611 448 | 448,46.37457004 449 | 449,44.22344538 450 | 450,43.18937717 451 | 451,44.3387774 452 | 452,45.63204816 453 | 453,43.87816917 454 | 454,43.67301546 455 | 455,42.11959709 456 | 456,43.89387883 457 | 457,44.40734798 458 | 458,42.67367897 459 | 459,43.76501429 460 | 460,44.74698445 461 | 461,43.14500236 462 | 462,42.41214263 463 | 463,44.1631715 464 | 464,41.81378406 465 | 465,43.00929934 466 | 466,42.80360515 467 | 467,44.30252713 468 | 468,42.88123048 469 | 469,43.47049118 470 | 470,44.42168141 471 | 471,42.43276664 472 | 472,44.57582419 473 | 473,43.56138481 474 | 474,43.4549005 475 | 475,43.06396235 476 | 476,43.8737132 477 | 477,42.1428636 478 | 478,43.60856585 479 | 479,44.16778079 480 | 480,42.90474298 481 | 481,44.99882414 482 | 482,43.304605 483 | 483,44.4468626 484 | 484,45.49241923 485 | 485,44.46713555 486 | 486,46.27348465 487 | 487,45.76034556 488 | 488,45.37440079 489 | 489,46.19246701 490 | 490,48.28190231 491 | 491,47.81719203 492 | 492,47.23213374 493 | 493,48.03313818 494 | 494,46.73599653 495 | 495,47.12327054 496 | 496,48.58597108 497 | 497,48.6738899 498 | 498,48.52018743 499 | 499,48.50385022 500 | 500,50.17026668 -------------------------------------------------------------------------------- /04 - Times Series/data/train-data.csv: -------------------------------------------------------------------------------- 1 | time_index,value 2 | 0,-0.5070139274941298 3 | 1,0.1253712967775968 4 | 2,-0.12267575206840288 5 | 3,-0.16956387939957984 6 | 4,0.30393116333315534 7 | 5,-0.12859637797270826 8 | 6,0.3184790887830743 9 | 7,-0.42364367699352623 10 | 8,0.21185831866839666 11 | 9,-0.3894984396354476 12 | 10,0.780461773776706 13 | 11,0.21716827635006894 14 | 12,0.208066295332867 15 | 13,0.9023076086982981 16 | 14,0.37177403780424256 17 | 15,0.43073320317895214 18 | 16,1.23308015942016 19 | 17,0.9121301645172821 20 | 18,0.8702649415061833 21 | 19,1.3225444506952997 22 | 20,1.2849016700625 23 | 21,0.8053488682091177 24 | 22,0.7683377036246207 25 | 23,0.4057552748095914 26 | 24,0.978329292153113 27 | 25,0.696015936876089 28 | 26,1.5554789446672375 29 | 27,0.4781779876080657 30 | 28,0.14013906948820853 31 | 29,1.6449368085323504 32 | 30,1.6590749923600454 33 | 31,2.0141355497897404 34 | 32,1.2459591113581108 35 | 33,0.9793177817011796 36 | 34,0.241826654996384 37 | 35,1.7570742528353063 38 | 36,1.0695330784833672 39 | 37,1.5210907139820962 40 | 38,0.8710384662192763 41 | 39,1.4839397118303155 42 | 40,1.243793091688579 43 | 41,0.6848339518810963 44 | 42,1.6276559859206734 45 | 43,1.3116497622098806 46 | 44,1.3608905379061378 47 | 45,1.041190994021974 48 | 46,0.9799805971350682 49 | 47,1.2354969054174134 50 | 48,0.22235989417954904 51 | 49,1.1513923265108672 52 | 50,0.9396515278276432 53 | 51,0.30260959424652467 54 | 52,1.0056960398178687 55 | 53,1.6068853568408674 56 | 54,1.7676627305773898 57 | 55,0.9173150845957287 58 | 56,1.9939897894609664 59 | 57,0.7414664658496637 60 | 58,0.5332771867761734 61 | 
59,0.8338414219102095 62 | 60,0.8641193396342405 63 | 61,1.8207245814139355 64 | 62,1.443486437588053 65 | 63,1.674623601228037 66 | 64,1.443584875090209 67 | 65,1.308804473574395 68 | 66,1.733630056325742 69 | 67,0.8359910593520475 70 | 68,1.0980006203179862 71 | 69,0.7093105225204377 72 | 70,1.1191743713347615 73 | 71,1.150825963564253 74 | 72,2.3425419082645034 75 | 73,1.5345352246330584 76 | 74,1.6779068527914949 77 | 75,0.8676696369755783 78 | 76,0.7677086011436657 79 | 77,0.8998287612805649 80 | 78,0.6025577314257724 81 | 79,1.5358568541287414 82 | 80,1.7413454713205905 83 | 81,1.7294779805951772 84 | 82,0.24100658343511855 85 | 83,0.8087157551467282 86 | 84,1.5151550594728582 87 | 85,1.6630951453036857 88 | 86,0.6939581780591096 89 | 87,1.8563910702987192 90 | 88,0.8695925892718722 91 | 89,1.4075735259518103 92 | 90,0.813779511845681 93 | 91,1.0075753587561769 94 | 92,0.5362436479057621 95 | 93,1.208505762728457 96 | 94,0.39508516366335494 97 | 95,0.5387949889626957 98 | 96,0.06824975090018343 99 | 97,0.5515019585271188 100 | 98,0.9717347784462206 101 | 99,0.6453799423415064 102 | 100,0.9638253752187474 103 | 101,0.8253807454700055 104 | 102,0.765295849230056 105 | 103,-0.2490667183903401 106 | 104,0.1570243819306409 107 | 105,0.25567212153198093 108 | 106,0.13750761989296234 109 | 107,-0.20660746818522513 110 | 108,0.5933906620273861 111 | 109,0.5663378756409212 112 | 110,1.0593951296282083 113 | 111,0.2521124010736045 114 | 112,0.3260453790642569 115 | 113,0.1540775658086948 116 | 114,1.1988817491487544 117 | 115,0.2736270594678474 118 | 116,0.07319358529892356 119 | 117,0.2355506823573319 120 | 118,-0.3498579462123355 121 | 119,1.1746109061731147 122 | 120,-0.2454734313621706 123 | 121,0.03472511352169072 124 | 122,0.6452287293459306 125 | 123,-0.5196418251954463 126 | 124,0.27339202936903717 127 | 125,0.49343731462487284 128 | 126,1.0226849755300311 129 | 127,-0.19257056273006357 130 | 128,-0.47107762571031686 131 | 129,-0.23374598590110818 132 | 130,-0.007609464770713337 133 | 131,-0.3980476099888386 134 | 132,-0.5558966206457587 135 | 133,-0.1344657899523246 136 | 134,-0.6562174891480932 137 | 135,-0.9529234885623512 138 | 136,-0.1824689939763049 139 | 137,-0.4824704474604228 140 | 138,-0.9436448853093331 141 | 139,-0.3369041551721612 142 | 140,0.14497797127573497 143 | 141,0.016325582764854185 144 | 142,-0.19500561044644937 145 | 143,-0.9654489601806846 146 | 144,-0.9612848959159918 147 | 145,-0.162283592524345 148 | 146,0.22063804277118648 149 | 147,-0.7768224686464962 150 | 148,-0.5474822299406553 151 | 149,-0.22684463014547362 152 | 150,-0.05073639447563938 153 | 151,-0.3540337760171799 154 | 152,0.26733413075841495 155 | 153,-0.48318666001008803 156 | 154,1.0412721613305362 157 | 155,-0.3009441654464442 158 | 156,0.23672219675628858 159 | 157,-0.107098377724405 160 | 158,-0.4440316895674985 161 | 159,-0.24570790256824426 162 | 160,0.5943460949278447 163 | 161,-1.0682094133994264 164 | 162,-0.2680015981885515 165 | 163,0.033828877133002866 166 | 164,-0.28231626805343357 167 | 165,0.025170222611089033 168 | 166,-0.17207076283859424 169 | 167,0.2296022365944559 170 | 168,0.04598573095359981 171 | 169,0.8987253251768407 172 | 170,0.4586360956080101 173 | 171,0.42576578797620623 174 | 172,0.10791234233154612 175 | 173,-0.23875487135166396 176 | 174,-0.34084971403433867 177 | 175,0.6546440666166147 178 | 176,1.0435654514531552 179 | 177,0.6100905299960653 180 | 178,-0.42662090687008325 181 | 179,-0.7205534111701549 182 | 180,0.2370598496105042 183 | 181,0.7156811736776737 184 | 
182,0.09764433823741903 185 | 183,0.1713968836530575 186 | 184,1.1557269685335108 187 | 185,0.7253976794449961 188 | 186,0.26055050392723333 189 | 187,1.429007916594629 190 | 188,0.8915221745929881 191 | 189,1.849975707474486 192 | 190,1.2170415838605697 193 | 191,0.28227177326870756 194 | 192,0.3758340112512873 195 | 193,0.489395008190605 196 | 194,-0.07780361911193254 197 | 195,1.0592080566932727 198 | 196,1.260639592592383 199 | 197,0.8778403107793371 200 | 198,0.23693750056347074 201 | 199,1.3629839804917614 202 | 200,0.933989699638078 203 | 201,1.2559119044689113 204 | 202,1.562489659119579 205 | 203,1.4766387826355065 206 | 204,1.491247246204821 207 | 205,1.0330384259109189 208 | 206,1.2717659065426434 209 | 207,0.5450718100076353 210 | 208,1.588810638400759 211 | 209,1.180009407506983 212 | 210,1.2517809833903317 213 | 211,1.5428349841931654 214 | 212,0.9090514743496572 215 | 213,2.0636856127889924 216 | 214,1.7295023405620553 217 | 215,0.7207840153166203 218 | 216,1.329677130143393 219 | 217,1.7757506097740814 220 | 218,0.7162167349060784 221 | 219,2.2275539995837077 222 | 220,2.012786749166425 223 | 221,0.8684124735270828 224 | 222,1.7121799084141875 225 | 223,1.574730314175874 226 | 224,1.9539080144419492 227 | 225,1.0177157482426051 228 | 226,1.7317822328139219 229 | 227,1.931341421920829 230 | 228,2.3984024094583827 231 | 229,1.4395431414826998 232 | 230,2.014204161701857 233 | 231,2.8239742537290016 234 | 232,1.2932303643848833 235 | 233,1.9383163687374438 236 | 234,2.2300236863813026 237 | 235,2.110700974442951 238 | 236,1.9604749859151185 239 | 237,2.873056945604347 240 | 238,3.042481203367786 241 | 239,2.0349069225390064 242 | 240,2.0777500108680504 243 | 241,2.291484544878781 244 | 242,2.769883711814939 245 | 243,2.4088627771736624 246 | 244,1.9491837560161467 247 | 245,3.1833487056181706 248 | 246,2.2988579493188883 249 | 247,1.957192841259238 250 | 248,2.9068158400814905 251 | 249,2.3701889353165955 252 | 250,1.919831480345358 253 | 251,2.6692682753733843 254 | 252,2.3481555360095228 255 | 253,2.0611353817546756 256 | 254,2.084063946698618 257 | 255,2.5871870558846437 258 | 256,2.5349460436653226 259 | 257,2.1937121705238254 260 | 258,2.465205616564662 261 | 259,3.5011148068047655 262 | 260,1.0872350793182335 263 | 261,3.1172222909842504 264 | 262,2.2166159479532883 265 | 263,1.7705159676237796 266 | 264,3.1727664641419024 267 | 265,1.9892904101862803 268 | 266,1.5376910059503701 269 | 267,2.7220079745425036 270 | 268,2.2792294831645616 271 | 269,1.2867915515837316 272 | 270,1.7632027534734713 273 | 271,2.608122652864524 274 | 272,1.8349037986395822 275 | 273,2.5657744969319713 276 | 274,2.0294851497169994 277 | 275,2.701897623997814 278 | 276,2.136856139216165 279 | 277,3.0684468389668704 280 | 278,2.5880287951712986 281 | 279,2.09758044416676 282 | 280,1.3842149430365502 283 | 281,2.090777383464958 284 | 282,2.7737299554551322 285 | 283,1.677875405665155 286 | 284,2.5185996092876226 287 | 285,2.3979314455762495 288 | 286,0.7706116377140514 289 | 287,1.9068300017223707 290 | 288,1.6884109083823688 291 | 289,2.3211973311275136 292 | 290,2.074017902710044 293 | 291,1.8601854699639824 294 | 292,2.5770457442035712 295 | 293,1.1320215688173692 296 | 294,1.6661182755241826 297 | 295,1.5065410558928145 298 | 296,1.3504730246720247 299 | 297,1.4781447340715372 300 | 298,1.8287400728716516 301 | 299,2.4439413941638555 302 | 300,1.1335870239005752 303 | 301,1.1376121185076786 304 | 302,1.883419689823916 305 | 303,1.1643748038367192 306 | 304,0.9052621787229713 307 | 
305,1.49029893737825 308 | 306,1.504595817338015 309 | 307,0.5972730492321923 310 | 308,1.4200104774184505 311 | 309,1.7746909364603747 312 | 310,1.2224561324894654 313 | 311,0.9871804251175486 314 | 312,1.5178942988712019 315 | 313,1.6323865515785387 316 | 314,0.8180963782943355 317 | 315,1.179894094942133 318 | 316,0.7957333525885373 319 | 317,1.5596466944625096 320 | 318,1.4642504546959447 321 | 319,1.3051724059444467 322 | 320,1.620731615853558 323 | 321,0.34301128322455676 324 | 322,0.8935376468255153 325 | 323,0.7162418593271469 326 | 324,0.7935228855212434 327 | 325,0.715036157783857 328 | 326,0.47255776093126256 329 | 327,0.06053479272504858 330 | 328,1.1092282612381497 331 | 329,1.5854843894731405 332 | 330,0.7226338344053831 333 | 331,0.5415628726616049 334 | 332,0.769707956298005 335 | 333,1.0811826467816688 336 | 334,1.2324015386438218 337 | 335,0.1850672631368836 338 | 336,0.8329997477763121 339 | 337,1.2827590248617144 340 | 338,0.15409359872707973 341 | 339,0.7131220830375605 342 | 340,0.3538863693240233 343 | 341,0.4729187010748861 344 | 342,0.25167060592814694 345 | 343,0.4795340919745855 346 | 344,1.244545001156966 347 | 345,0.9070580938291402 348 | 346,-0.2474843556396955 349 | 347,0.9318249817242306 350 | 348,0.8691696762707389 351 | 349,0.10928398400582107 352 | 350,1.1954642151715116 353 | 351,1.484077554996629 354 | 352,0.40735836498626776 355 | 353,0.8741504310303659 356 | 354,1.3984054286233911 357 | 355,0.8676682893671924 358 | 356,1.1623868990365802 359 | 357,1.3360059694609663 360 | 358,0.5750144526389834 361 | 359,1.4273445427609106 362 | 360,0.7625919084249725 363 | 361,1.5300702740487817 364 | 362,1.1771428900727505 365 | 363,0.9815884070712058 366 | 364,1.7794529677664737 367 | 365,0.5096286014930141 368 | 366,0.8621047208445878 369 | 367,0.6572835372437593 370 | 368,0.7704329971617637 371 | 369,0.5021708574153725 372 | 370,0.9521055790991899 373 | 371,0.5124929945662163 374 | 372,0.24419637629300284 375 | 373,1.1672863895506451 376 | 374,0.7178411779625429 377 | 375,1.1888475010857138 378 | 376,1.2673469074966683 379 | 377,0.4135564113401964 380 | 378,1.2619200687245757 381 | 379,1.2847332369524564 382 | 380,1.0587556277731935 383 | 381,1.2057020854211569 384 | 382,1.9577257560566927 385 | 383,2.0900437646623122 386 | 384,0.9604295505152993 387 | 385,1.6910295543239444 388 | 386,2.4846342819242393 389 | 387,2.0867271577793156 390 | 388,1.2630580708033659 391 | 389,1.211468044794045 392 | 390,0.955653622356968 393 | 391,1.3890179730031247 394 | 392,2.44345181147831 395 | 393,1.204457284758076 396 | 394,2.845650935135721 397 | 395,1.5276002219192633 398 | 396,1.6151275384846175 399 | 397,2.5959934698396947 400 | 398,1.8126461999813244 401 | 399,2.8157807817978533 402 | 400,1.422968461929559 403 | 401,0.9105198075182803 404 | 402,1.9572272725379791 405 | 403,2.600692828025921 406 | 404,1.3992103054577312 407 | 405,3.018559318027152 408 | 406,1.5093470931108484 409 | 407,2.8971446400960468 410 | 408,1.6865911401470695 411 | 409,2.2332932663990315 412 | 410,2.6608879170904864 413 | 411,2.669532774586354 414 | 412,2.0251473312590482 415 | 413,2.523094088712269 416 | 414,2.911419909915979 417 | 415,2.340494856846694 418 | 416,2.545681180095806 419 | 417,3.099596750711708 420 | 418,2.136135468673287 421 | 419,2.1888107371201158 422 | 420,2.6488645103329107 423 | 421,3.1568399578858273 424 | 422,2.440864677932434 425 | 423,2.610809173659483 426 | 424,1.8178056091295223 427 | 425,2.9315735968273544 428 | 426,2.8184595352670607 429 | 427,2.069996262482516 430 | 
428,2.210078410725907 431 | 429,4.014375425597757 432 | 430,2.797331532036595 433 | 431,2.34882390480658 434 | 432,2.4007093792308676 435 | 433,3.2702296111523 436 | 434,3.4303221048765566 437 | 435,3.37858399500833 438 | 436,3.212384205955622 439 | 437,2.216321278261658 440 | 438,2.784889634345377 441 | 439,4.419156418444061 442 | 440,3.957440911283956 443 | 441,2.754337273745299 444 | 442,3.9590830385435067 445 | 443,2.9510304655840445 446 | 444,3.430818337970406 447 | 445,3.417164122127078 448 | 446,3.7727735339485235 449 | 447,2.193807925768478 450 | 448,3.3244896074740726 451 | 449,3.4043681191391784 452 | 450,3.6974548916638454 453 | 451,3.9523937610220634 454 | 452,3.5729878538357385 455 | 453,2.495321372584242 456 | 454,2.0219326537398943 457 | 455,2.6792841395141833 458 | 456,3.5211582944939983 459 | 457,2.9853510573663247 460 | 458,2.725114037495373 461 | 459,3.0731529711235157 462 | 460,2.8064542337639415 463 | 461,3.961218040102244 464 | 462,3.318125102556614 465 | 463,3.7284890757462783 466 | 464,3.4752958055141363 467 | 465,3.278428878730352 468 | 466,3.444283176951884 469 | 467,3.77325515759806 470 | 468,2.2907371586806886 471 | 469,2.938790653954773 472 | 470,3.8176787682406395 473 | 471,2.7734549949442497 474 | 472,3.5007835419150353 475 | 473,3.18851778210803 476 | 474,3.594162344250331 477 | 475,2.1631708225851556 478 | 476,3.876183487335109 479 | 477,3.238333572409265 480 | 478,3.0111588283739703 481 | 479,2.6859592974631026 482 | 480,3.543110508796691 483 | 481,3.3523951980962914 484 | 482,3.3045303739129706 485 | 483,2.1505872783687434 486 | 484,3.255545595834051 487 | 485,2.1834361711617376 488 | 486,3.016004454036089 489 | 487,2.4348048508379136 490 | 488,2.2764720072045774 491 | 489,2.13094695214868 492 | 490,2.4107915906463733 493 | 491,2.2028169203722907 494 | 492,2.5179343324986063 495 | 493,2.05351310068931 496 | 494,2.5003053592680518 497 | 495,2.218774657884143 498 | 496,1.644258843640977 499 | 497,2.062599169684491 500 | 498,2.3219602009985816 501 | 499,2.6096961732821673 502 | 500,2.3765700831025884 503 | 501,3.0287779735690266 504 | 502,2.3451050020349733 505 | 503,2.2730504246034657 506 | 504,1.7170030083608254 507 | 505,3.972820176853931 508 | 506,2.96081928143457 509 | 507,1.7440825856250106 510 | 508,2.283956392920651 511 | 509,3.017090874099012 512 | 510,1.73269060308032 513 | 511,2.4818036862008084 514 | 512,1.9178423638357804 515 | 513,2.008468538769001 516 | 514,1.4303512189916163 517 | 515,2.420028353972818 518 | 516,2.333694471544968 519 | 517,2.0358856878361853 520 | 518,1.930372332964022 521 | 519,2.772314114023144 522 | 520,2.6525270397365643 523 | 521,2.7874542842670094 524 | 522,1.4108282031162545 525 | 523,1.6868876057556874 526 | 524,1.5037689438264519 527 | 525,1.2325642370435734 528 | 526,0.7124232713063819 529 | 527,2.3800907644028957 530 | 528,1.3877233338461814 531 | 529,1.8462752581855768 532 | 530,1.6416133440083642 533 | 531,1.8126044092561209 534 | 532,2.0663554509247932 535 | 533,2.761008359626194 536 | 534,2.1577238764476894 537 | 535,2.017417635006672 538 | 536,1.601923991176038 539 | 537,1.7680351150614104 540 | 538,2.065200901662619 541 | 539,1.725351491022523 542 | 540,1.924858339794002 543 | 541,2.125758189878704 544 | 542,1.2301071586988084 545 | 543,1.709721540041509 546 | 544,1.5239738349384686 547 | 545,2.3385901731902794 548 | 546,2.5132419702994624 549 | 547,1.9750801909178817 550 | 548,0.12333314756948865 551 | 549,2.0991657257046445 552 | 550,1.9142082333554962 553 | 551,1.9309896520009284 554 | 
552,1.341544103330502 555 | 553,1.28049058809898 556 | 554,2.6218423637877235 557 | 555,2.1286009393550938 558 | 556,2.2438217064205723 559 | 557,1.456842576721939 560 | 558,2.4680883404735128 561 | 559,2.678058024740523 562 | 560,2.1697469640198386 563 | 561,1.9274790031063742 564 | 562,1.263900280529004 565 | 563,1.3976212029710853 566 | 564,0.7847085746264618 567 | 565,2.239433783516727 568 | 566,1.0804046348105025 569 | 567,2.0291971262278277 570 | 568,1.9031523291722041 571 | 569,1.594750755676741 572 | 570,2.095705429543913 573 | 571,2.1439876601219687 574 | 572,2.17447718714394 575 | 573,2.509210779721647 576 | 574,1.348754113695628 577 | 575,2.2650768647581687 578 | 576,2.469957066691454 579 | 577,1.9094084008513759 580 | 578,2.6907915613546765 581 | 579,2.4283581283856654 582 | 580,2.6198086506106506 583 | 581,2.824532088498242 584 | 582,2.144986257680162 585 | 583,2.6967709905140853 586 | 584,2.155027926934807 587 | 585,3.0763715554468307 588 | 586,2.2804956585199467 589 | 587,2.330030225288893 590 | 588,2.7283262483956863 591 | 589,1.9667832940469763 592 | 590,2.7629914605916728 593 | 591,3.347673260683096 594 | 592,2.7204991774409706 595 | 593,2.2213380726123417 596 | 594,3.295738484226204 597 | 595,3.0943834635313716 598 | 596,3.395245128818069 599 | 597,1.7801657319322999 600 | 598,3.006003815599795 601 | 599,3.7271303717241318 602 | 600,3.3687920925147727 603 | 601,3.494081235835588 604 | 602,2.6398630706074826 605 | 603,3.32437043035613 606 | 604,3.9716937039130134 607 | 605,3.0148970547587193 608 | 606,2.7587802972729034 609 | 607,3.541314808626614 610 | 608,3.419552004582584 611 | 609,4.096884215866331 612 | 610,2.5616750924741796 613 | 611,3.5639179882703873 614 | 612,3.4268808208312347 615 | 613,2.872363219761527 616 | 614,3.3855457293820366 617 | 615,2.93785729220192 618 | 616,3.1458155752845265 619 | 617,3.2465539652297237 620 | 618,2.946748408784293 621 | 619,3.7676191513221244 622 | 620,4.106623396844353 623 | 621,3.4644362851382864 624 | 622,4.2298715687484965 625 | 623,4.614724972380464 626 | 624,4.485952085461564 627 | 625,4.008892979961672 628 | 626,3.4117492445208164 629 | 627,3.6571908075816983 630 | 628,3.8408519603679254 631 | 629,3.431112244463793 632 | 630,4.556865657959409 633 | 631,4.237222933222556 634 | 632,3.222116818635072 635 | 633,3.576165574677214 636 | 634,3.9754856194853234 637 | 635,3.328213368589007 638 | 636,4.104693254466189 639 | 637,3.9691640998211914 640 | 638,3.3161168154225567 641 | 639,3.7508224414841997 642 | 640,4.434260513708335 643 | 641,3.2003371924917166 644 | 642,4.980128628132029 645 | 643,4.545334396893772 646 | 644,4.181800858792881 647 | 645,4.264018934480361 648 | 646,3.81587906099798 649 | 647,4.546549273705075 650 | 648,4.343850164247174 651 | 649,3.785030301942927 652 | 650,3.9667904424294553 653 | 651,4.832489508124424 654 | 652,3.5115344994617628 655 | 653,5.280271863593596 656 | 654,5.170105810177021 657 | 655,4.001193197013245 658 | 656,4.152953851268918 659 | 657,4.349360432568223 660 | 658,3.5909299965583887 661 | 659,4.734825589863275 662 | 660,3.893199952723647 663 | 661,5.383358186348112 664 | 662,3.861226525588445 665 | 663,3.8204241629880973 666 | 664,4.030051057138472 667 | 665,4.01900086168555 668 | 666,4.245729586883419 669 | 667,3.8258969209411626 670 | 668,4.640010552723009 671 | 669,4.283439814229849 672 | 670,4.4789429151128495 673 | 671,3.9050383247869966 674 | 672,4.440188192844067 675 | 673,3.9891542166926928 676 | 674,4.533468802275714 677 | 675,3.2600453449902833 678 | 676,4.435279876652234 679 | 
677,3.8421123752652178 680 | 678,3.81433230199235 681 | 679,4.095330470306205 682 | 680,3.7508912112831534 683 | 681,3.6067124664798547 684 | 682,3.773260762561308 685 | 683,3.9661995229630125 686 | 684,4.025269079939741 687 | 685,3.891316308095429 688 | 686,2.6268497228468517 689 | 687,3.9555450836012014 690 | 688,4.217561006995724 691 | 689,3.959901576095089 692 | 690,3.9814289938170444 693 | 691,3.4927129816373235 694 | 692,3.643282736855781 695 | 693,3.415009233378614 696 | 694,3.755798217824436 697 | 695,3.767404234589839 698 | 696,3.2622188273548947 699 | 697,3.7034220234951563 700 | 698,2.449142007600186 701 | 699,2.7817285578755713 702 | -------------------------------------------------------------------------------- /06 - Sequence Models/TODO.txt: -------------------------------------------------------------------------------- 1 | - RNN, LSTM, etc. 2 | - sequence classification 3 | - predicting next element in the sequence 4 | - time-series using RNN 5 | - ... -------------------------------------------------------------------------------- /06 - Sequence Models/data/seq01.test.csv: -------------------------------------------------------------------------------- 1 | 0.0,0.687785252292,1.1510565163,1.2510565163,0.987785252292,0.5,0.0122147477075,-0.251056516295,-0.151056516295,0.312214747708,1.0,1.68778525229,2.1510565163,2.2510565163,1.98778525229,1.5,1.01221474771,0.748943483705,0.848943483705,1.31221474771 2 | 0.687785252292,1.1510565163,1.2510565163,0.987785252292,0.5,0.0122147477075,-0.251056516295,-0.151056516295,0.312214747708,1.0,1.68778525229,2.1510565163,2.2510565163,1.98778525229,1.5,1.01221474771,0.748943483705,0.848943483705,1.31221474771,2.0 3 | 1.1510565163,1.2510565163,0.987785252292,0.5,0.0122147477075,-0.251056516295,-0.151056516295,0.312214747708,1.0,1.68778525229,2.1510565163,2.2510565163,1.98778525229,1.5,1.01221474771,0.748943483705,0.848943483705,1.31221474771,2.0,2.68778525229 4 | 1.2510565163,0.987785252292,0.5,0.0122147477075,-0.251056516295,-0.151056516295,0.312214747708,1.0,1.68778525229,2.1510565163,2.2510565163,1.98778525229,1.5,1.01221474771,0.748943483705,0.848943483705,1.31221474771,2.0,2.68778525229,3.1510565163 5 | 0.987785252292,0.5,0.0122147477075,-0.251056516295,-0.151056516295,0.312214747708,1.0,1.68778525229,2.1510565163,2.2510565163,1.98778525229,1.5,1.01221474771,0.748943483705,0.848943483705,1.31221474771,2.0,2.68778525229,3.1510565163,3.2510565163 6 | 0.5,0.0122147477075,-0.251056516295,-0.151056516295,0.312214747708,1.0,1.68778525229,2.1510565163,2.2510565163,1.98778525229,1.5,1.01221474771,0.748943483705,0.848943483705,1.31221474771,2.0,2.68778525229,3.1510565163,3.2510565163,2.98778525229 7 | 0.0122147477075,-0.251056516295,-0.151056516295,0.312214747708,1.0,1.68778525229,2.1510565163,2.2510565163,1.98778525229,1.5,1.01221474771,0.748943483705,0.848943483705,1.31221474771,2.0,2.68778525229,3.1510565163,3.2510565163,2.98778525229,2.5 8 | -0.251056516295,-0.151056516295,0.312214747708,1.0,1.68778525229,2.1510565163,2.2510565163,1.98778525229,1.5,1.01221474771,0.748943483705,0.848943483705,1.31221474771,2.0,2.68778525229,3.1510565163,3.2510565163,2.98778525229,2.5,2.01221474771 9 | -0.151056516295,0.312214747708,1.0,1.68778525229,2.1510565163,2.2510565163,1.98778525229,1.5,1.01221474771,0.748943483705,0.848943483705,1.31221474771,2.0,2.68778525229,3.1510565163,3.2510565163,2.98778525229,2.5,2.01221474771,1.7489434837 10 | 
0.312214747708,1.0,1.68778525229,2.1510565163,2.2510565163,1.98778525229,1.5,1.01221474771,0.748943483705,0.848943483705,1.31221474771,2.0,2.68778525229,3.1510565163,3.2510565163,2.98778525229,2.5,2.01221474771,1.7489434837,1.8489434837 11 | 1.0,1.68778525229,2.1510565163,2.2510565163,1.98778525229,1.5,1.01221474771,0.748943483705,0.848943483705,1.31221474771,2.0,2.68778525229,3.1510565163,3.2510565163,2.98778525229,2.5,2.01221474771,1.7489434837,1.8489434837,2.31221474771 12 | 1.68778525229,2.1510565163,2.2510565163,1.98778525229,1.5,1.01221474771,0.748943483705,0.848943483705,1.31221474771,2.0,2.68778525229,3.1510565163,3.2510565163,2.98778525229,2.5,2.01221474771,1.7489434837,1.8489434837,2.31221474771,3.0 13 | 2.1510565163,2.2510565163,1.98778525229,1.5,1.01221474771,0.748943483705,0.848943483705,1.31221474771,2.0,2.68778525229,3.1510565163,3.2510565163,2.98778525229,2.5,2.01221474771,1.7489434837,1.8489434837,2.31221474771,3.0,3.68778525229 14 | 2.2510565163,1.98778525229,1.5,1.01221474771,0.748943483705,0.848943483705,1.31221474771,2.0,2.68778525229,3.1510565163,3.2510565163,2.98778525229,2.5,2.01221474771,1.7489434837,1.8489434837,2.31221474771,3.0,3.68778525229,4.1510565163 15 | 1.98778525229,1.5,1.01221474771,0.748943483705,0.848943483705,1.31221474771,2.0,2.68778525229,3.1510565163,3.2510565163,2.98778525229,2.5,2.01221474771,1.7489434837,1.8489434837,2.31221474771,3.0,3.68778525229,4.1510565163,4.2510565163 16 | 1.5,1.01221474771,0.748943483705,0.848943483705,1.31221474771,2.0,2.68778525229,3.1510565163,3.2510565163,2.98778525229,2.5,2.01221474771,1.7489434837,1.8489434837,2.31221474771,3.0,3.68778525229,4.1510565163,4.2510565163,3.98778525229 17 | 1.01221474771,0.748943483705,0.848943483705,1.31221474771,2.0,2.68778525229,3.1510565163,3.2510565163,2.98778525229,2.5,2.01221474771,1.7489434837,1.8489434837,2.31221474771,3.0,3.68778525229,4.1510565163,4.2510565163,3.98778525229,3.5 18 | 0.748943483705,0.848943483705,1.31221474771,2.0,2.68778525229,3.1510565163,3.2510565163,2.98778525229,2.5,2.01221474771,1.7489434837,1.8489434837,2.31221474771,3.0,3.68778525229,4.1510565163,4.2510565163,3.98778525229,3.5,3.01221474771 19 | 0.848943483705,1.31221474771,2.0,2.68778525229,3.1510565163,3.2510565163,2.98778525229,2.5,2.01221474771,1.7489434837,1.8489434837,2.31221474771,3.0,3.68778525229,4.1510565163,4.2510565163,3.98778525229,3.5,3.01221474771,2.7489434837 20 | 1.31221474771,2.0,2.68778525229,3.1510565163,3.2510565163,2.98778525229,2.5,2.01221474771,1.7489434837,1.8489434837,2.31221474771,3.0,3.68778525229,4.1510565163,4.2510565163,3.98778525229,3.5,3.01221474771,2.7489434837,2.8489434837 21 | 2.0,2.68778525229,3.1510565163,3.2510565163,2.98778525229,2.5,2.01221474771,1.7489434837,1.8489434837,2.31221474771,3.0,3.68778525229,4.1510565163,4.2510565163,3.98778525229,3.5,3.01221474771,2.7489434837,2.8489434837,3.31221474771 22 | 2.68778525229,3.1510565163,3.2510565163,2.98778525229,2.5,2.01221474771,1.7489434837,1.8489434837,2.31221474771,3.0,3.68778525229,4.1510565163,4.2510565163,3.98778525229,3.5,3.01221474771,2.7489434837,2.8489434837,3.31221474771,4.0 23 | 3.1510565163,3.2510565163,2.98778525229,2.5,2.01221474771,1.7489434837,1.8489434837,2.31221474771,3.0,3.68778525229,4.1510565163,4.2510565163,3.98778525229,3.5,3.01221474771,2.7489434837,2.8489434837,3.31221474771,4.0,4.68778525229 24 | 
3.2510565163,2.98778525229,2.5,2.01221474771,1.7489434837,1.8489434837,2.31221474771,3.0,3.68778525229,4.1510565163,4.2510565163,3.98778525229,3.5,3.01221474771,2.7489434837,2.8489434837,3.31221474771,4.0,4.68778525229,5.1510565163 25 | 2.98778525229,2.5,2.01221474771,1.7489434837,1.8489434837,2.31221474771,3.0,3.68778525229,4.1510565163,4.2510565163,3.98778525229,3.5,3.01221474771,2.7489434837,2.8489434837,3.31221474771,4.0,4.68778525229,5.1510565163,5.2510565163 26 | 2.5,2.01221474771,1.7489434837,1.8489434837,2.31221474771,3.0,3.68778525229,4.1510565163,4.2510565163,3.98778525229,3.5,3.01221474771,2.7489434837,2.8489434837,3.31221474771,4.0,4.68778525229,5.1510565163,5.2510565163,4.98778525229 27 | 2.01221474771,1.7489434837,1.8489434837,2.31221474771,3.0,3.68778525229,4.1510565163,4.2510565163,3.98778525229,3.5,3.01221474771,2.7489434837,2.8489434837,3.31221474771,4.0,4.68778525229,5.1510565163,5.2510565163,4.98778525229,4.5 28 | 1.7489434837,1.8489434837,2.31221474771,3.0,3.68778525229,4.1510565163,4.2510565163,3.98778525229,3.5,3.01221474771,2.7489434837,2.8489434837,3.31221474771,4.0,4.68778525229,5.1510565163,5.2510565163,4.98778525229,4.5,4.01221474771 29 | 1.8489434837,2.31221474771,3.0,3.68778525229,4.1510565163,4.2510565163,3.98778525229,3.5,3.01221474771,2.7489434837,2.8489434837,3.31221474771,4.0,4.68778525229,5.1510565163,5.2510565163,4.98778525229,4.5,4.01221474771,3.7489434837 30 | 2.31221474771,3.0,3.68778525229,4.1510565163,4.2510565163,3.98778525229,3.5,3.01221474771,2.7489434837,2.8489434837,3.31221474771,4.0,4.68778525229,5.1510565163,5.2510565163,4.98778525229,4.5,4.01221474771,3.7489434837,3.8489434837 31 | 3.0,3.68778525229,4.1510565163,4.2510565163,3.98778525229,3.5,3.01221474771,2.7489434837,2.8489434837,3.31221474771,4.0,4.68778525229,5.1510565163,5.2510565163,4.98778525229,4.5,4.01221474771,3.7489434837,3.8489434837,4.31221474771 32 | 3.68778525229,4.1510565163,4.2510565163,3.98778525229,3.5,3.01221474771,2.7489434837,2.8489434837,3.31221474771,4.0,4.68778525229,5.1510565163,5.2510565163,4.98778525229,4.5,4.01221474771,3.7489434837,3.8489434837,4.31221474771,5.0 33 | 4.1510565163,4.2510565163,3.98778525229,3.5,3.01221474771,2.7489434837,2.8489434837,3.31221474771,4.0,4.68778525229,5.1510565163,5.2510565163,4.98778525229,4.5,4.01221474771,3.7489434837,3.8489434837,4.31221474771,5.0,5.68778525229 34 | 4.2510565163,3.98778525229,3.5,3.01221474771,2.7489434837,2.8489434837,3.31221474771,4.0,4.68778525229,5.1510565163,5.2510565163,4.98778525229,4.5,4.01221474771,3.7489434837,3.8489434837,4.31221474771,5.0,5.68778525229,6.1510565163 35 | 3.98778525229,3.5,3.01221474771,2.7489434837,2.8489434837,3.31221474771,4.0,4.68778525229,5.1510565163,5.2510565163,4.98778525229,4.5,4.01221474771,3.7489434837,3.8489434837,4.31221474771,5.0,5.68778525229,6.1510565163,6.2510565163 36 | 3.5,3.01221474771,2.7489434837,2.8489434837,3.31221474771,4.0,4.68778525229,5.1510565163,5.2510565163,4.98778525229,4.5,4.01221474771,3.7489434837,3.8489434837,4.31221474771,5.0,5.68778525229,6.1510565163,6.2510565163,5.98778525229 37 | 3.01221474771,2.7489434837,2.8489434837,3.31221474771,4.0,4.68778525229,5.1510565163,5.2510565163,4.98778525229,4.5,4.01221474771,3.7489434837,3.8489434837,4.31221474771,5.0,5.68778525229,6.1510565163,6.2510565163,5.98778525229,5.5 38 | 2.7489434837,2.8489434837,3.31221474771,4.0,4.68778525229,5.1510565163,5.2510565163,4.98778525229,4.5,4.01221474771,3.7489434837,3.8489434837,4.31221474771,5.0,5.68778525229,6.1510565163,6.2510565163,5.98778525229,5.5,5.01221474771 39 | 
2.8489434837,3.31221474771,4.0,4.68778525229,5.1510565163,5.2510565163,4.98778525229,4.5,4.01221474771,3.7489434837,3.8489434837,4.31221474771,5.0,5.68778525229,6.1510565163,6.2510565163,5.98778525229,5.5,5.01221474771,4.7489434837 40 | 3.31221474771,4.0,4.68778525229,5.1510565163,5.2510565163,4.98778525229,4.5,4.01221474771,3.7489434837,3.8489434837,4.31221474771,5.0,5.68778525229,6.1510565163,6.2510565163,5.98778525229,5.5,5.01221474771,4.7489434837,4.8489434837 41 | 4.0,4.68778525229,5.1510565163,5.2510565163,4.98778525229,4.5,4.01221474771,3.7489434837,3.8489434837,4.31221474771,5.0,5.68778525229,6.1510565163,6.2510565163,5.98778525229,5.5,5.01221474771,4.7489434837,4.8489434837,5.31221474771 42 | 4.68778525229,5.1510565163,5.2510565163,4.98778525229,4.5,4.01221474771,3.7489434837,3.8489434837,4.31221474771,5.0,5.68778525229,6.1510565163,6.2510565163,5.98778525229,5.5,5.01221474771,4.7489434837,4.8489434837,5.31221474771,6.0 43 | 5.1510565163,5.2510565163,4.98778525229,4.5,4.01221474771,3.7489434837,3.8489434837,4.31221474771,5.0,5.68778525229,6.1510565163,6.2510565163,5.98778525229,5.5,5.01221474771,4.7489434837,4.8489434837,5.31221474771,6.0,6.68778525229 44 | 5.2510565163,4.98778525229,4.5,4.01221474771,3.7489434837,3.8489434837,4.31221474771,5.0,5.68778525229,6.1510565163,6.2510565163,5.98778525229,5.5,5.01221474771,4.7489434837,4.8489434837,5.31221474771,6.0,6.68778525229,7.1510565163 45 | 4.98778525229,4.5,4.01221474771,3.7489434837,3.8489434837,4.31221474771,5.0,5.68778525229,6.1510565163,6.2510565163,5.98778525229,5.5,5.01221474771,4.7489434837,4.8489434837,5.31221474771,6.0,6.68778525229,7.1510565163,7.2510565163 46 | 4.5,4.01221474771,3.7489434837,3.8489434837,4.31221474771,5.0,5.68778525229,6.1510565163,6.2510565163,5.98778525229,5.5,5.01221474771,4.7489434837,4.8489434837,5.31221474771,6.0,6.68778525229,7.1510565163,7.2510565163,6.98778525229 47 | 4.01221474771,3.7489434837,3.8489434837,4.31221474771,5.0,5.68778525229,6.1510565163,6.2510565163,5.98778525229,5.5,5.01221474771,4.7489434837,4.8489434837,5.31221474771,6.0,6.68778525229,7.1510565163,7.2510565163,6.98778525229,6.5 48 | 3.7489434837,3.8489434837,4.31221474771,5.0,5.68778525229,6.1510565163,6.2510565163,5.98778525229,5.5,5.01221474771,4.7489434837,4.8489434837,5.31221474771,6.0,6.68778525229,7.1510565163,7.2510565163,6.98778525229,6.5,6.01221474771 49 | 3.8489434837,4.31221474771,5.0,5.68778525229,6.1510565163,6.2510565163,5.98778525229,5.5,5.01221474771,4.7489434837,4.8489434837,5.31221474771,6.0,6.68778525229,7.1510565163,7.2510565163,6.98778525229,6.5,6.01221474771,5.7489434837 50 | 4.31221474771,5.0,5.68778525229,6.1510565163,6.2510565163,5.98778525229,5.5,5.01221474771,4.7489434837,4.8489434837,5.31221474771,6.0,6.68778525229,7.1510565163,7.2510565163,6.98778525229,6.5,6.01221474771,5.7489434837,5.8489434837 51 | 5.0,5.68778525229,6.1510565163,6.2510565163,5.98778525229,5.5,5.01221474771,4.7489434837,4.8489434837,5.31221474771,6.0,6.68778525229,7.1510565163,7.2510565163,6.98778525229,6.5,6.01221474771,5.7489434837,5.8489434837,6.31221474771 52 | 5.68778525229,6.1510565163,6.2510565163,5.98778525229,5.5,5.01221474771,4.7489434837,4.8489434837,5.31221474771,6.0,6.68778525229,7.1510565163,7.2510565163,6.98778525229,6.5,6.01221474771,5.7489434837,5.8489434837,6.31221474771,7.0 53 | 6.1510565163,6.2510565163,5.98778525229,5.5,5.01221474771,4.7489434837,4.8489434837,5.31221474771,6.0,6.68778525229,7.1510565163,7.2510565163,6.98778525229,6.5,6.01221474771,5.7489434837,5.8489434837,6.31221474771,7.0,7.68778525229 54 | 
6.2510565163,5.98778525229,5.5,5.01221474771,4.7489434837,4.8489434837,5.31221474771,6.0,6.68778525229,7.1510565163,7.2510565163,6.98778525229,6.5,6.01221474771,5.7489434837,5.8489434837,6.31221474771,7.0,7.68778525229,8.1510565163 55 | 5.98778525229,5.5,5.01221474771,4.7489434837,4.8489434837,5.31221474771,6.0,6.68778525229,7.1510565163,7.2510565163,6.98778525229,6.5,6.01221474771,5.7489434837,5.8489434837,6.31221474771,7.0,7.68778525229,8.1510565163,8.2510565163 56 | 5.5,5.01221474771,4.7489434837,4.8489434837,5.31221474771,6.0,6.68778525229,7.1510565163,7.2510565163,6.98778525229,6.5,6.01221474771,5.7489434837,5.8489434837,6.31221474771,7.0,7.68778525229,8.1510565163,8.2510565163,7.98778525229 57 | 5.01221474771,4.7489434837,4.8489434837,5.31221474771,6.0,6.68778525229,7.1510565163,7.2510565163,6.98778525229,6.5,6.01221474771,5.7489434837,5.8489434837,6.31221474771,7.0,7.68778525229,8.1510565163,8.2510565163,7.98778525229,7.5 58 | 4.7489434837,4.8489434837,5.31221474771,6.0,6.68778525229,7.1510565163,7.2510565163,6.98778525229,6.5,6.01221474771,5.7489434837,5.8489434837,6.31221474771,7.0,7.68778525229,8.1510565163,8.2510565163,7.98778525229,7.5,7.01221474771 59 | 4.8489434837,5.31221474771,6.0,6.68778525229,7.1510565163,7.2510565163,6.98778525229,6.5,6.01221474771,5.7489434837,5.8489434837,6.31221474771,7.0,7.68778525229,8.1510565163,8.2510565163,7.98778525229,7.5,7.01221474771,6.7489434837 60 | 5.31221474771,6.0,6.68778525229,7.1510565163,7.2510565163,6.98778525229,6.5,6.01221474771,5.7489434837,5.8489434837,6.31221474771,7.0,7.68778525229,8.1510565163,8.2510565163,7.98778525229,7.5,7.01221474771,6.7489434837,6.8489434837 61 | 6.0,6.68778525229,7.1510565163,7.2510565163,6.98778525229,6.5,6.01221474771,5.7489434837,5.8489434837,6.31221474771,7.0,7.68778525229,8.1510565163,8.2510565163,7.98778525229,7.5,7.01221474771,6.7489434837,6.8489434837,7.31221474771 62 | 6.68778525229,7.1510565163,7.2510565163,6.98778525229,6.5,6.01221474771,5.7489434837,5.8489434837,6.31221474771,7.0,7.68778525229,8.1510565163,8.2510565163,7.98778525229,7.5,7.01221474771,6.7489434837,6.8489434837,7.31221474771,8.0 63 | 7.1510565163,7.2510565163,6.98778525229,6.5,6.01221474771,5.7489434837,5.8489434837,6.31221474771,7.0,7.68778525229,8.1510565163,8.2510565163,7.98778525229,7.5,7.01221474771,6.7489434837,6.8489434837,7.31221474771,8.0,8.68778525229 64 | 7.2510565163,6.98778525229,6.5,6.01221474771,5.7489434837,5.8489434837,6.31221474771,7.0,7.68778525229,8.1510565163,8.2510565163,7.98778525229,7.5,7.01221474771,6.7489434837,6.8489434837,7.31221474771,8.0,8.68778525229,9.1510565163 65 | 6.98778525229,6.5,6.01221474771,5.7489434837,5.8489434837,6.31221474771,7.0,7.68778525229,8.1510565163,8.2510565163,7.98778525229,7.5,7.01221474771,6.7489434837,6.8489434837,7.31221474771,8.0,8.68778525229,9.1510565163,9.2510565163 66 | 6.5,6.01221474771,5.7489434837,5.8489434837,6.31221474771,7.0,7.68778525229,8.1510565163,8.2510565163,7.98778525229,7.5,7.01221474771,6.7489434837,6.8489434837,7.31221474771,8.0,8.68778525229,9.1510565163,9.2510565163,8.98778525229 67 | 6.01221474771,5.7489434837,5.8489434837,6.31221474771,7.0,7.68778525229,8.1510565163,8.2510565163,7.98778525229,7.5,7.01221474771,6.7489434837,6.8489434837,7.31221474771,8.0,8.68778525229,9.1510565163,9.2510565163,8.98778525229,8.5 68 | 5.7489434837,5.8489434837,6.31221474771,7.0,7.68778525229,8.1510565163,8.2510565163,7.98778525229,7.5,7.01221474771,6.7489434837,6.8489434837,7.31221474771,8.0,8.68778525229,9.1510565163,9.2510565163,8.98778525229,8.5,8.01221474771 69 | 
5.8489434837,6.31221474771,7.0,7.68778525229,8.1510565163,8.2510565163,7.98778525229,7.5,7.01221474771,6.7489434837,6.8489434837,7.31221474771,8.0,8.68778525229,9.1510565163,9.2510565163,8.98778525229,8.5,8.01221474771,7.7489434837 70 | 6.31221474771,7.0,7.68778525229,8.1510565163,8.2510565163,7.98778525229,7.5,7.01221474771,6.7489434837,6.8489434837,7.31221474771,8.0,8.68778525229,9.1510565163,9.2510565163,8.98778525229,8.5,8.01221474771,7.7489434837,7.8489434837 71 | 7.0,7.68778525229,8.1510565163,8.2510565163,7.98778525229,7.5,7.01221474771,6.7489434837,6.8489434837,7.31221474771,8.0,8.68778525229,9.1510565163,9.2510565163,8.98778525229,8.5,8.01221474771,7.7489434837,7.8489434837,8.31221474771 72 | 7.68778525229,8.1510565163,8.2510565163,7.98778525229,7.5,7.01221474771,6.7489434837,6.8489434837,7.31221474771,8.0,8.68778525229,9.1510565163,9.2510565163,8.98778525229,8.5,8.01221474771,7.7489434837,7.8489434837,8.31221474771,9.0 73 | 8.1510565163,8.2510565163,7.98778525229,7.5,7.01221474771,6.7489434837,6.8489434837,7.31221474771,8.0,8.68778525229,9.1510565163,9.2510565163,8.98778525229,8.5,8.01221474771,7.7489434837,7.8489434837,8.31221474771,9.0,9.68778525229 74 | 8.2510565163,7.98778525229,7.5,7.01221474771,6.7489434837,6.8489434837,7.31221474771,8.0,8.68778525229,9.1510565163,9.2510565163,8.98778525229,8.5,8.01221474771,7.7489434837,7.8489434837,8.31221474771,9.0,9.68778525229,10.1510565163 75 | 7.98778525229,7.5,7.01221474771,6.7489434837,6.8489434837,7.31221474771,8.0,8.68778525229,9.1510565163,9.2510565163,8.98778525229,8.5,8.01221474771,7.7489434837,7.8489434837,8.31221474771,9.0,9.68778525229,10.1510565163,10.2510565163 76 | 7.5,7.01221474771,6.7489434837,6.8489434837,7.31221474771,8.0,8.68778525229,9.1510565163,9.2510565163,8.98778525229,8.5,8.01221474771,7.7489434837,7.8489434837,8.31221474771,9.0,9.68778525229,10.1510565163,10.2510565163,9.98778525229 77 | 7.01221474771,6.7489434837,6.8489434837,7.31221474771,8.0,8.68778525229,9.1510565163,9.2510565163,8.98778525229,8.5,8.01221474771,7.7489434837,7.8489434837,8.31221474771,9.0,9.68778525229,10.1510565163,10.2510565163,9.98778525229,9.5 78 | 6.7489434837,6.8489434837,7.31221474771,8.0,8.68778525229,9.1510565163,9.2510565163,8.98778525229,8.5,8.01221474771,7.7489434837,7.8489434837,8.31221474771,9.0,9.68778525229,10.1510565163,10.2510565163,9.98778525229,9.5,9.01221474771 79 | 6.8489434837,7.31221474771,8.0,8.68778525229,9.1510565163,9.2510565163,8.98778525229,8.5,8.01221474771,7.7489434837,7.8489434837,8.31221474771,9.0,9.68778525229,10.1510565163,10.2510565163,9.98778525229,9.5,9.01221474771,8.7489434837 80 | 7.31221474771,8.0,8.68778525229,9.1510565163,9.2510565163,8.98778525229,8.5,8.01221474771,7.7489434837,7.8489434837,8.31221474771,9.0,9.68778525229,10.1510565163,10.2510565163,9.98778525229,9.5,9.01221474771,8.7489434837,8.8489434837 81 | 8.0,8.68778525229,9.1510565163,9.2510565163,8.98778525229,8.5,8.01221474771,7.7489434837,7.8489434837,8.31221474771,9.0,9.68778525229,10.1510565163,10.2510565163,9.98778525229,9.5,9.01221474771,8.7489434837,8.8489434837,9.31221474771 82 | 8.68778525229,9.1510565163,9.2510565163,8.98778525229,8.5,8.01221474771,7.7489434837,7.8489434837,8.31221474771,9.0,9.68778525229,10.1510565163,10.2510565163,9.98778525229,9.5,9.01221474771,8.7489434837,8.8489434837,9.31221474771,10.0 83 | 
9.1510565163,9.2510565163,8.98778525229,8.5,8.01221474771,7.7489434837,7.8489434837,8.31221474771,9.0,9.68778525229,10.1510565163,10.2510565163,9.98778525229,9.5,9.01221474771,8.7489434837,8.8489434837,9.31221474771,10.0,10.6877852523 84 | 9.2510565163,8.98778525229,8.5,8.01221474771,7.7489434837,7.8489434837,8.31221474771,9.0,9.68778525229,10.1510565163,10.2510565163,9.98778525229,9.5,9.01221474771,8.7489434837,8.8489434837,9.31221474771,10.0,10.6877852523,11.1510565163 85 | 8.98778525229,8.5,8.01221474771,7.7489434837,7.8489434837,8.31221474771,9.0,9.68778525229,10.1510565163,10.2510565163,9.98778525229,9.5,9.01221474771,8.7489434837,8.8489434837,9.31221474771,10.0,10.6877852523,11.1510565163,11.2510565163 86 | 8.5,8.01221474771,7.7489434837,7.8489434837,8.31221474771,9.0,9.68778525229,10.1510565163,10.2510565163,9.98778525229,9.5,9.01221474771,8.7489434837,8.8489434837,9.31221474771,10.0,10.6877852523,11.1510565163,11.2510565163,10.9877852523 87 | 8.01221474771,7.7489434837,7.8489434837,8.31221474771,9.0,9.68778525229,10.1510565163,10.2510565163,9.98778525229,9.5,9.01221474771,8.7489434837,8.8489434837,9.31221474771,10.0,10.6877852523,11.1510565163,11.2510565163,10.9877852523,10.5 88 | 7.7489434837,7.8489434837,8.31221474771,9.0,9.68778525229,10.1510565163,10.2510565163,9.98778525229,9.5,9.01221474771,8.7489434837,8.8489434837,9.31221474771,10.0,10.6877852523,11.1510565163,11.2510565163,10.9877852523,10.5,10.0122147477 89 | 7.8489434837,8.31221474771,9.0,9.68778525229,10.1510565163,10.2510565163,9.98778525229,9.5,9.01221474771,8.7489434837,8.8489434837,9.31221474771,10.0,10.6877852523,11.1510565163,11.2510565163,10.9877852523,10.5,10.0122147477,9.7489434837 90 | 8.31221474771,9.0,9.68778525229,10.1510565163,10.2510565163,9.98778525229,9.5,9.01221474771,8.7489434837,8.8489434837,9.31221474771,10.0,10.6877852523,11.1510565163,11.2510565163,10.9877852523,10.5,10.0122147477,9.7489434837,9.8489434837 91 | 9.0,9.68778525229,10.1510565163,10.2510565163,9.98778525229,9.5,9.01221474771,8.7489434837,8.8489434837,9.31221474771,10.0,10.6877852523,11.1510565163,11.2510565163,10.9877852523,10.5,10.0122147477,9.7489434837,9.8489434837,10.3122147477 92 | 9.68778525229,10.1510565163,10.2510565163,9.98778525229,9.5,9.01221474771,8.7489434837,8.8489434837,9.31221474771,10.0,10.6877852523,11.1510565163,11.2510565163,10.9877852523,10.5,10.0122147477,9.7489434837,9.8489434837,10.3122147477,11.0 93 | 10.1510565163,10.2510565163,9.98778525229,9.5,9.01221474771,8.7489434837,8.8489434837,9.31221474771,10.0,10.6877852523,11.1510565163,11.2510565163,10.9877852523,10.5,10.0122147477,9.7489434837,9.8489434837,10.3122147477,11.0,11.6877852523 94 | 10.2510565163,9.98778525229,9.5,9.01221474771,8.7489434837,8.8489434837,9.31221474771,10.0,10.6877852523,11.1510565163,11.2510565163,10.9877852523,10.5,10.0122147477,9.7489434837,9.8489434837,10.3122147477,11.0,11.6877852523,12.1510565163 95 | 9.98778525229,9.5,9.01221474771,8.7489434837,8.8489434837,9.31221474771,10.0,10.6877852523,11.1510565163,11.2510565163,10.9877852523,10.5,10.0122147477,9.7489434837,9.8489434837,10.3122147477,11.0,11.6877852523,12.1510565163,12.2510565163 96 | 9.5,9.01221474771,8.7489434837,8.8489434837,9.31221474771,10.0,10.6877852523,11.1510565163,11.2510565163,10.9877852523,10.5,10.0122147477,9.7489434837,9.8489434837,10.3122147477,11.0,11.6877852523,12.1510565163,12.2510565163,11.9877852523 97 | 
9.01221474771,8.7489434837,8.8489434837,9.31221474771,10.0,10.6877852523,11.1510565163,11.2510565163,10.9877852523,10.5,10.0122147477,9.7489434837,9.8489434837,10.3122147477,11.0,11.6877852523,12.1510565163,12.2510565163,11.9877852523,11.5 98 | 8.7489434837,8.8489434837,9.31221474771,10.0,10.6877852523,11.1510565163,11.2510565163,10.9877852523,10.5,10.0122147477,9.7489434837,9.8489434837,10.3122147477,11.0,11.6877852523,12.1510565163,12.2510565163,11.9877852523,11.5,11.0122147477 99 | 8.8489434837,9.31221474771,10.0,10.6877852523,11.1510565163,11.2510565163,10.9877852523,10.5,10.0122147477,9.7489434837,9.8489434837,10.3122147477,11.0,11.6877852523,12.1510565163,12.2510565163,11.9877852523,11.5,11.0122147477,10.7489434837 100 | 9.31221474771,10.0,10.6877852523,11.1510565163,11.2510565163,10.9877852523,10.5,10.0122147477,9.7489434837,9.8489434837,10.3122147477,11.0,11.6877852523,12.1510565163,12.2510565163,11.9877852523,11.5,11.0122147477,10.7489434837,10.8489434837 101 | -------------------------------------------------------------------------------- /07 - Image Analysis/00.0 - TensorFlow Version Update.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "%%bash\n", 10 | "\n", 11 | "pip install -U tensorflow" 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": null, 17 | "metadata": {}, 18 | "outputs": [], 19 | "source": [ 20 | "import tensorflow as tf\n", 21 | "print(tf.__version__)" 22 | ] 23 | } 24 | ], 25 | "metadata": { 26 | "kernelspec": { 27 | "display_name": "Python 2", 28 | "language": "python", 29 | "name": "python2" 30 | }, 31 | "language_info": { 32 | "codemirror_mode": { 33 | "name": "ipython", 34 | "version": 2 35 | }, 36 | "file_extension": ".py", 37 | "mimetype": "text/x-python", 38 | "name": "python", 39 | "nbconvert_exporter": "python", 40 | "pygments_lexer": "ipython2", 41 | "version": "2.7.13" 42 | } 43 | }, 44 | "nbformat": 4, 45 | "nbformat_minor": 2 46 | } 47 | -------------------------------------------------------------------------------- /08 - Text Analysis/01 - Text Classification - SMS Ham vs. Spam - Data Preparation.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "## UCI SMS Spam Collection Dataset\n", 8 | "\n", 9 | "### Dataset URL: http://archive.ics.uci.edu/ml/datasets/SMS+Spam+Collection\n", 10 | "\n", 11 | "A set of labeled SMS messages + label (ham vs Spam)" 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": 1, 17 | "metadata": { 18 | "collapsed": true 19 | }, 20 | "outputs": [], 21 | "source": [ 22 | "import pandas as pd\n", 23 | "import string\n", 24 | "import re\n", 25 | "from sklearn import model_selection" 26 | ] 27 | }, 28 | { 29 | "cell_type": "code", 30 | "execution_count": 2, 31 | "metadata": {}, 32 | "outputs": [ 33 | { 34 | "data": { 35 | "text/html": [ 36 | "
\n", 37 | "\n", 50 | "\n", 51 | " \n", 52 | " \n", 53 | " \n", 54 | " \n", 55 | " \n", 56 | " \n", 57 | " \n", 58 | " \n", 59 | " \n", 60 | " \n", 61 | " \n", 62 | " \n", 63 | " \n", 64 | " \n", 65 | " \n", 66 | " \n", 67 | " \n", 68 | " \n", 69 | " \n", 70 | " \n", 71 | " \n", 72 | " \n", 73 | " \n", 74 | " \n", 75 | " \n", 76 | " \n", 77 | " \n", 78 | " \n", 79 | " \n", 80 | " \n", 81 | " \n", 82 | " \n", 83 | " \n", 84 | " \n", 85 | "
classsms
0hamGo until jurong point, crazy.. Available only ...
1hamOk lar... Joking wif u oni...
2spamFree entry in 2 a wkly comp to win FA Cup fina...
3hamU dun say so early hor... U c already then say...
4hamNah I don't think he goes to usf, he lives aro...
\n", 86 | "
" 87 | ], 88 | "text/plain": [ 89 | " class sms\n", 90 | "0 ham Go until jurong point, crazy.. Available only ...\n", 91 | "1 ham Ok lar... Joking wif u oni...\n", 92 | "2 spam Free entry in 2 a wkly comp to win FA Cup fina...\n", 93 | "3 ham U dun say so early hor... U c already then say...\n", 94 | "4 ham Nah I don't think he goes to usf, he lives aro..." 95 | ] 96 | }, 97 | "execution_count": 2, 98 | "metadata": {}, 99 | "output_type": "execute_result" 100 | } 101 | ], 102 | "source": [ 103 | "DATASET_FILE = 'data/sms-spam/SMSSpamCollection'\n", 104 | "dataset = pd.read_csv(DATASET_FILE, sep='\\t', names=['class','sms'])\n", 105 | "dataset.head()" 106 | ] 107 | }, 108 | { 109 | "cell_type": "code", 110 | "execution_count": 3, 111 | "metadata": {}, 112 | "outputs": [ 113 | { 114 | "name": "stdout", 115 | "output_type": "stream", 116 | "text": [ 117 | "Dataset Size: 5572\n", 118 | "ham 4825\n", 119 | "spam 747\n", 120 | "Name: class, dtype: int64\n", 121 | "ham %: 86.59\n", 122 | "ham %: 13.41\n" 123 | ] 124 | } 125 | ], 126 | "source": [ 127 | "print(\"Dataset Size: {}\".format(len(dataset)))\n", 128 | "value_counts = dataset['class'].value_counts()\n", 129 | "print(value_counts)\n", 130 | "print(\"ham %: {}\".format(round(value_counts[0]/len(dataset)*100,2)))\n", 131 | "print(\"ham %: {}\".format(round(value_counts[1]/len(dataset)*100,2)))" 132 | ] 133 | }, 134 | { 135 | "cell_type": "markdown", 136 | "metadata": {}, 137 | "source": [ 138 | "## Create Training and Validation Datasets" 139 | ] 140 | }, 141 | { 142 | "cell_type": "code", 143 | "execution_count": 4, 144 | "metadata": {}, 145 | "outputs": [ 146 | { 147 | "name": "stdout", 148 | "output_type": "stream", 149 | "text": [ 150 | "4179\n", 151 | "1393\n" 152 | ] 153 | } 154 | ], 155 | "source": [ 156 | "exclude = ['\\t', '\"']\n", 157 | "def clean_text(text):\n", 158 | " for c in exclude:\n", 159 | " text=text.replace(c,'')\n", 160 | " return text.lower().strip()\n", 161 | "\n", 162 | "sms_processed = list(map(lambda text: clean_text(text), \n", 163 | " dataset['sms'].values))\n", 164 | "\n", 165 | "dataset['sms'] = sms_processed\n", 166 | "\n", 167 | "splitter = model_selection.StratifiedShuffleSplit(n_splits=1,\n", 168 | " test_size=0.25, \n", 169 | " random_state=19850610)\n", 170 | "\n", 171 | "splits = list(splitter.split(X=dataset['sms'], y=dataset['class']))\n", 172 | "train_index = splits[0][0]\n", 173 | "valid_index = splits[0][1]\n", 174 | "\n", 175 | "train_df = dataset.loc[train_index,:]\n", 176 | "print(len(train_df))\n", 177 | "\n", 178 | "valid_df = dataset.loc[valid_index,:]\n", 179 | "print(len(valid_df))" 180 | ] 181 | }, 182 | { 183 | "cell_type": "code", 184 | "execution_count": 5, 185 | "metadata": {}, 186 | "outputs": [ 187 | { 188 | "name": "stdout", 189 | "output_type": "stream", 190 | "text": [ 191 | "Training Set\n", 192 | "ham 3619\n", 193 | "spam 560\n", 194 | "Name: class, dtype: int64\n", 195 | "ham %: 86.6\n", 196 | "ham %: 13.4\n", 197 | "\n", 198 | "Validation Set\n", 199 | "ham 1206\n", 200 | "spam 187\n", 201 | "Name: class, dtype: int64\n", 202 | "ham %: 86.58\n", 203 | "ham %: 13.42\n" 204 | ] 205 | } 206 | ], 207 | "source": [ 208 | "print(\"Training Set\")\n", 209 | "training_value_counts = train_df['class'].value_counts()\n", 210 | "print(training_value_counts)\n", 211 | "print(\"ham %: {}\".format(round(training_value_counts[0]/len(train_df)*100,2)))\n", 212 | "print(\"ham %: {}\".format(round(training_value_counts[1]/len(train_df)*100,2)))\n", 213 | "print(\"\")\n", 214 | 
"print(\"Validation Set\")\n", 215 | "validation_value_counts = valid_df['class'].value_counts()\n", 216 | "print(validation_value_counts)\n", 217 | "print(\"ham %: {}\".format(round(validation_value_counts[0]/len(valid_df)*100,2)))\n", 218 | "print(\"ham %: {}\".format(round(validation_value_counts[1]/len(valid_df)*100,2)))" 219 | ] 220 | }, 221 | { 222 | "cell_type": "markdown", 223 | "metadata": {}, 224 | "source": [ 225 | "## Save Training and Validation Datasets" 226 | ] 227 | }, 228 | { 229 | "cell_type": "code", 230 | "execution_count": 6, 231 | "metadata": { 232 | "collapsed": true 233 | }, 234 | "outputs": [], 235 | "source": [ 236 | "train_df.to_csv(\"data/sms-spam/train-data.tsv\", header=False, index=False, sep='\\t')\n", 237 | "valid_df.to_csv(\"data/sms-spam/valid-data.tsv\", header=False, index=False, sep='\\t')" 238 | ] 239 | }, 240 | { 241 | "cell_type": "code", 242 | "execution_count": 7, 243 | "metadata": {}, 244 | "outputs": [ 245 | { 246 | "data": { 247 | "text/html": [ 248 | "
\n", 249 | "\n", 262 | "\n", 263 | " \n", 264 | " \n", 265 | " \n", 266 | " \n", 267 | " \n", 268 | " \n", 269 | " \n", 270 | " \n", 271 | " \n", 272 | " \n", 273 | " \n", 274 | " \n", 275 | " \n", 276 | " \n", 277 | " \n", 278 | " \n", 279 | " \n", 280 | " \n", 281 | " \n", 282 | " \n", 283 | " \n", 284 | " \n", 285 | " \n", 286 | " \n", 287 | " \n", 288 | " \n", 289 | " \n", 290 | " \n", 291 | " \n", 292 | " \n", 293 | " \n", 294 | " \n", 295 | " \n", 296 | " \n", 297 | "
classsms
4174hamjust woke up. yeesh its late. but i didn't fal...
4175hamwhat do u reckon as need 2 arrange transport i...
4176spamfree entry into our £250 weekly competition ju...
4177spam-pls stop bootydelious (32/f) is inviting you ...
4178hamtell my bad character which u dnt lik in me. ...
\n", 298 | "
" 299 | ], 300 | "text/plain": [ 301 | " class sms\n", 302 | "4174 ham just woke up. yeesh its late. but i didn't fal...\n", 303 | "4175 ham what do u reckon as need 2 arrange transport i...\n", 304 | "4176 spam free entry into our £250 weekly competition ju...\n", 305 | "4177 spam -pls stop bootydelious (32/f) is inviting you ...\n", 306 | "4178 ham tell my bad character which u dnt lik in me. ..." 307 | ] 308 | }, 309 | "execution_count": 7, 310 | "metadata": {}, 311 | "output_type": "execute_result" 312 | } 313 | ], 314 | "source": [ 315 | "pd.read_csv(\"data/sms-spam/train-data.tsv\", sep='\\t', names=['class','sms']).tail()" 316 | ] 317 | }, 318 | { 319 | "cell_type": "code", 320 | "execution_count": 12, 321 | "metadata": {}, 322 | "outputs": [ 323 | { 324 | "data": { 325 | "text/html": [ 326 | "
\n", 327 | "\n", 340 | "\n", 341 | " \n", 342 | " \n", 343 | " \n", 344 | " \n", 345 | " \n", 346 | " \n", 347 | " \n", 348 | " \n", 349 | " \n", 350 | " \n", 351 | " \n", 352 | " \n", 353 | " \n", 354 | " \n", 355 | " \n", 356 | " \n", 357 | " \n", 358 | " \n", 359 | " \n", 360 | " \n", 361 | " \n", 362 | " \n", 363 | " \n", 364 | " \n", 365 | " \n", 366 | " \n", 367 | " \n", 368 | " \n", 369 | " \n", 370 | " \n", 371 | " \n", 372 | " \n", 373 | " \n", 374 | " \n", 375 | "
classsms
1387hamtrue dear..i sat to pray evening and felt so.s...
1388hamwhat will we do in the shower, baby?
1389hamwhere are you ? what are you doing ? are yuou ...
1390spamur cash-balance is currently 500 pounds - to m...
1391spamnot heard from u4 a while. call 4 rude chat pr...
\n", 376 | "
" 377 | ], 378 | "text/plain": [ 379 | " class sms\n", 380 | "1387 ham true dear..i sat to pray evening and felt so.s...\n", 381 | "1388 ham what will we do in the shower, baby?\n", 382 | "1389 ham where are you ? what are you doing ? are yuou ...\n", 383 | "1390 spam ur cash-balance is currently 500 pounds - to m...\n", 384 | "1391 spam not heard from u4 a while. call 4 rude chat pr..." 385 | ] 386 | }, 387 | "execution_count": 12, 388 | "metadata": {}, 389 | "output_type": "execute_result" 390 | } 391 | ], 392 | "source": [ 393 | "pd.read_csv(\"data/sms-spam/valid-data.tsv\", sep='\\t', names=['class','sms']).tail()" 394 | ] 395 | }, 396 | { 397 | "cell_type": "markdown", 398 | "metadata": {}, 399 | "source": [ 400 | "## Calculate Vocabulary" 401 | ] 402 | }, 403 | { 404 | "cell_type": "code", 405 | "execution_count": 9, 406 | "metadata": { 407 | "collapsed": true 408 | }, 409 | "outputs": [], 410 | "source": [ 411 | "def get_vocab():\n", 412 | " vocab = set()\n", 413 | " for text in train_df['sms'].values:\n", 414 | " words = text.split(' ')\n", 415 | " word_set = set(words)\n", 416 | " vocab.update(word_set)\n", 417 | " \n", 418 | " vocab.remove('')\n", 419 | " return list(vocab)" 420 | ] 421 | }, 422 | { 423 | "cell_type": "code", 424 | "execution_count": 10, 425 | "metadata": {}, 426 | "outputs": [ 427 | { 428 | "name": "stdout", 429 | "output_type": "stream", 430 | "text": [ 431 | "11330\n" 432 | ] 433 | }, 434 | { 435 | "data": { 436 | "text/plain": [ 437 | "['child',\n", 438 | " 'place..',\n", 439 | " 'hi..i',\n", 440 | " 'oso?',\n", 441 | " 'home!',\n", 442 | " 'lasting',\n", 443 | " 'there..do',\n", 444 | " 'clock',\n", 445 | " 'advice',\n", 446 | " 'free...']" 447 | ] 448 | }, 449 | "execution_count": 10, 450 | "metadata": {}, 451 | "output_type": "execute_result" 452 | } 453 | ], 454 | "source": [ 455 | "vocab = get_vocab()\n", 456 | "print(len(vocab))\n", 457 | "vocab[10:20]" 458 | ] 459 | }, 460 | { 461 | "cell_type": "markdown", 462 | "metadata": {}, 463 | "source": [ 464 | "## Save Vocabulary" 465 | ] 466 | }, 467 | { 468 | "cell_type": "code", 469 | "execution_count": 11, 470 | "metadata": { 471 | "collapsed": true 472 | }, 473 | "outputs": [], 474 | "source": [ 475 | "PAD_WORD = '#=KS=#'\n", 476 | "\n", 477 | "with open('data/sms-spam/vocab_list.tsv', 'w') as file:\n", 478 | " file.write(\"{}\\n\".format(PAD_WORD))\n", 479 | " for word in vocab:\n", 480 | " file.write(\"{}\\n\".format(word))\n", 481 | " \n", 482 | "with open('data/sms-spam/n_words.tsv', 'w') as file:\n", 483 | " file.write(str(len(vocab)))" 484 | ] 485 | }, 486 | { 487 | "cell_type": "code", 488 | "execution_count": null, 489 | "metadata": { 490 | "collapsed": true 491 | }, 492 | "outputs": [], 493 | "source": [] 494 | } 495 | ], 496 | "metadata": { 497 | "kernelspec": { 498 | "display_name": "Python 3", 499 | "language": "python", 500 | "name": "python3" 501 | }, 502 | "language_info": { 503 | "codemirror_mode": { 504 | "name": "ipython", 505 | "version": 3 506 | }, 507 | "file_extension": ".py", 508 | "mimetype": "text/x-python", 509 | "name": "python", 510 | "nbconvert_exporter": "python", 511 | "pygments_lexer": "ipython3", 512 | "version": "3.6.1" 513 | } 514 | }, 515 | "nbformat": 4, 516 | "nbformat_minor": 2 517 | } 518 | -------------------------------------------------------------------------------- /08 - Text Analysis/06 - Part_1 - Text Classification - Hacker News - Data Preprocessing with TFT.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | 
"cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "# %%bash\n", 10 | "\n", 11 | "# pip install tensorflow==1.7\n", 12 | "# pip install google-cloud-dataflow==2.3\n", 13 | "# pip install tensorflow-hub" 14 | ] 15 | }, 16 | { 17 | "cell_type": "markdown", 18 | "metadata": {}, 19 | "source": [ 20 | "# Text Classification using TensorFlow and Google Cloud - Part 1\n", 21 | "\n", 22 | "This [bigquery-public-data:hacker_news](https://cloud.google.com/bigquery/public-data/hacker-news) contains all stories and comments from Hacker News from its launch in 2006. Each story contains a story id, url, the title of the story, tthe author that made the post, when it was written, and the number of points the story received.\n", 23 | "\n", 24 | "The objective is, given the title of the story, we want to build an ML model that can predict the source of this story.\n", 25 | "\n", 26 | "## Data preparation with tf.Transform and DataFlow\n", 27 | "\n", 28 | "This notebook illustrates how to build a Beam pipeline using tf.transform to prepare ML 'train' and 'eval' datasets. \n", 29 | "The pipeline includes the following steps:\n", 30 | "1. Read data from BigQuery\n", 31 | "2. Extract and clean features from BQ rows\n", 32 | "3. Use tf.transfrom to process the text and produce the following features for each entry\n", 33 | " * title: Raw text - string\n", 34 | " * bow: Bag of word indecies - sparse vector of integers\n", 35 | " * weight: TF.IDF values - sparse vector of floats\n", 36 | " * source: target feature - string\n", 37 | "4. Save the data as .tfrecord files\n", 38 | " \n", 39 | "\n" 40 | ] 41 | }, 42 | { 43 | "cell_type": "markdown", 44 | "metadata": {}, 45 | "source": [ 46 | "### Setting Global Parameters" 47 | ] 48 | }, 49 | { 50 | "cell_type": "code", 51 | "execution_count": 2, 52 | "metadata": {}, 53 | "outputs": [], 54 | "source": [ 55 | "import os\n", 56 | "\n", 57 | "class Params:\n", 58 | " pass\n", 59 | "\n", 60 | "# Set to run on GCP\n", 61 | "Params.GCP_PROJECT_ID = 'ksalama-gcp-playground'\n", 62 | "Params.REGION = 'europe-west1'\n", 63 | "Params.BUCKET = 'ksalama-gcs-cloudml'\n", 64 | "\n", 65 | "Params.PLATFORM = 'local' # local | GCP\n", 66 | "\n", 67 | "Params.DATA_DIR = 'data/news' if Params.PLATFORM == 'local' else 'gs://{}/data/news'.format(Params.BUCKET)\n", 68 | "\n", 69 | "Params.TRANSFORMED_DATA_DIR = os.path.join(Params.DATA_DIR, 'transformed')\n", 70 | "Params.TRANSFORMED_TRAIN_DATA_FILE_PREFIX = os.path.join(Params.TRANSFORMED_DATA_DIR, 'train')\n", 71 | "Params.TRANSFORMED_EVAL_DATA_FILE_PREFIX = os.path.join(Params.TRANSFORMED_DATA_DIR, 'eval')\n", 72 | "\n", 73 | "Params.TEMP_DIR = os.path.join(Params.DATA_DIR, 'tmp')\n", 74 | "\n", 75 | "Params.MODELS_DIR = 'models/news' if Params.PLATFORM == 'local' else 'gs://{}/models/news'.format(Params.BUCKET)\n", 76 | "\n", 77 | "Params.TRANSFORM_ARTEFACTS_DIR = os.path.join(Params.MODELS_DIR,'transform')\n", 78 | "\n", 79 | "Params.TRANSFORM = True" 80 | ] 81 | }, 82 | { 83 | "cell_type": "markdown", 84 | "metadata": {}, 85 | "source": [ 86 | "### Importing libraries" 87 | ] 88 | }, 89 | { 90 | "cell_type": "code", 91 | "execution_count": 3, 92 | "metadata": {}, 93 | "outputs": [ 94 | { 95 | "name": "stdout", 96 | "output_type": "stream", 97 | "text": [ 98 | "WARNING:tensorflow:From /Users/khalidsalama/Technology/python-venvs/py27-venv/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from 
tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.\n", 99 | "Instructions for updating:\n", 100 | "Use the retry module or similar alternatives.\n" 101 | ] 102 | } 103 | ], 104 | "source": [ 105 | "import apache_beam as beam\n", 106 | "\n", 107 | "import tensorflow as tf\n", 108 | "import tensorflow_transform as tft\n", 109 | "import tensorflow_transform.coders as tft_coders\n", 110 | "\n", 111 | "from tensorflow.contrib.learn.python.learn.utils import input_fn_utils\n", 112 | "\n", 113 | "from tensorflow_transform.beam import impl\n", 114 | "from tensorflow_transform.beam.tft_beam_io import transform_fn_io\n", 115 | "from tensorflow_transform.tf_metadata import metadata_io\n", 116 | "from tensorflow_transform.tf_metadata import dataset_schema\n", 117 | "from tensorflow_transform.tf_metadata import dataset_metadata\n", 118 | "from tensorflow_transform.saved import saved_transform_io" 119 | ] 120 | }, 121 | { 122 | "cell_type": "markdown", 123 | "metadata": {}, 124 | "source": [ 125 | "## 1. Source Query" 126 | ] 127 | }, 128 | { 129 | "cell_type": "code", 130 | "execution_count": 4, 131 | "metadata": {}, 132 | "outputs": [], 133 | "source": [ 134 | "bq_query = '''\n", 135 | "SELECT\n", 136 | " key,\n", 137 | " REGEXP_REPLACE(title, '[^a-zA-Z0-9 $.-]', ' ') AS title, \n", 138 | " source\n", 139 | "FROM\n", 140 | "(\n", 141 | " SELECT\n", 142 | " ARRAY_REVERSE(SPLIT(REGEXP_EXTRACT(url, '.*://(.[^/]+)/'), '.'))[OFFSET(1)] AS source,\n", 143 | " title,\n", 144 | " ABS(FARM_FINGERPRINT(title)) AS Key\n", 145 | " FROM\n", 146 | " `bigquery-public-data.hacker_news.stories`\n", 147 | " WHERE\n", 148 | " REGEXP_CONTAINS(REGEXP_EXTRACT(url, '.*://(.[^/]+)/'), '.com$')\n", 149 | " AND LENGTH(title) > 10\n", 150 | ")\n", 151 | "WHERE (source = 'github' OR source = 'nytimes' OR source = 'techcrunch')\n", 152 | "'''\n", 153 | "\n", 154 | "def get_source_query(step):\n", 155 | " \n", 156 | " if step == 'train':\n", 157 | " source_query = 'SELECT * FROM ({}) WHERE MOD(key,100) <= 75'.format(bq_query)\n", 158 | " else:\n", 159 | " source_query = 'SELECT * FROM ({}) WHERE MOD(key,100) > 75'.format(bq_query)\n", 160 | " \n", 161 | " return source_query" 162 | ] 163 | }, 164 | { 165 | "cell_type": "markdown", 166 | "metadata": {}, 167 | "source": [ 168 | "## 2. Raw metadata" 169 | ] 170 | }, 171 | { 172 | "cell_type": "code", 173 | "execution_count": 5, 174 | "metadata": {}, 175 | "outputs": [], 176 | "source": [ 177 | "RAW_HEADER = 'key,title,source'.split(',')\n", 178 | "RAW_DEFAULTS = [['NA'],['NA'],['NA']]\n", 179 | "TARGET_FEATURE_NAME = 'source'\n", 180 | "TARGET_LABELS = ['github', 'nytimes', 'techcrunch']\n", 181 | "TEXT_FEATURE_NAME = 'title'\n", 182 | "KEY_COLUMN = 'key'\n", 183 | "\n", 184 | "VOCAB_SIZE = 20000\n", 185 | "TRAIN_SIZE = 73124\n", 186 | "EVAL_SIZE = 23079\n", 187 | "\n", 188 | "DELIMITERS = '.,!?() '\n", 189 | "\n", 190 | "raw_metadata = dataset_metadata.DatasetMetadata(dataset_schema.Schema({\n", 191 | " KEY_COLUMN: dataset_schema.ColumnSchema(\n", 192 | " tf.string, [], dataset_schema.FixedColumnRepresentation()),\n", 193 | " TEXT_FEATURE_NAME: dataset_schema.ColumnSchema(\n", 194 | " tf.string, [], dataset_schema.FixedColumnRepresentation()),\n", 195 | " TARGET_FEATURE_NAME: dataset_schema.ColumnSchema(\n", 196 | " tf.string, [], dataset_schema.FixedColumnRepresentation()),\n", 197 | "}))" 198 | ] 199 | }, 200 | { 201 | "cell_type": "markdown", 202 | "metadata": {}, 203 | "source": [ 204 | "## 3. 
Preprocessing functions" 205 | ] 206 | }, 207 | { 208 | "cell_type": "code", 209 | "execution_count": 6, 210 | "metadata": {}, 211 | "outputs": [], 212 | "source": [ 213 | "def get_features(bq_row):\n", 214 | " \n", 215 | " CSV_HEADER = 'key,title,source'.split(',')\n", 216 | " \n", 217 | " input_features = {}\n", 218 | " \n", 219 | " for feature_name in CSV_HEADER:\n", 220 | " input_features[feature_name] = str(bq_row[feature_name]).lower()\n", 221 | " \n", 222 | " return input_features\n", 223 | "\n", 224 | "\n", 225 | "def preprocessing_fn(input_features):\n", 226 | " \n", 227 | " text = input_features[TEXT_FEATURE_NAME]\n", 228 | "\n", 229 | " text_tokens = tf.string_split(text, DELIMITERS)\n", 230 | " text_tokens_indcies = tft.string_to_int(text_tokens, top_k=VOCAB_SIZE)\n", 231 | " bag_of_words_indices, text_weight = tft.tfidf(text_tokens_indcies, VOCAB_SIZE + 1)\n", 232 | " \n", 233 | " output_features = {}\n", 234 | " output_features[TEXT_FEATURE_NAME] = input_features[TEXT_FEATURE_NAME]\n", 235 | " output_features['bow'] = bag_of_words_indices\n", 236 | " output_features['weight'] = text_weight\n", 237 | " output_features[TARGET_FEATURE_NAME] = input_features[TARGET_FEATURE_NAME]\n", 238 | " \n", 239 | " return output_features" 240 | ] 241 | }, 242 | { 243 | "cell_type": "markdown", 244 | "metadata": {}, 245 | "source": [ 246 | "## 4. Beam Pipeline" 247 | ] 248 | }, 249 | { 250 | "cell_type": "code", 251 | "execution_count": 7, 252 | "metadata": {}, 253 | "outputs": [], 254 | "source": [ 255 | "import apache_beam as beam\n", 256 | "\n", 257 | "\n", 258 | "def run_pipeline(runner, opts):\n", 259 | " \n", 260 | " print(\"Sink train data files: {}\".format(Params.TRANSFORMED_TRAIN_DATA_FILE_PREFIX))\n", 261 | " print(\"Sink data files: {}\".format(Params.TRANSFORMED_EVAL_DATA_FILE_PREFIX))\n", 262 | " print(\"Temporary directory: {}\".format(Params.TEMP_DIR))\n", 263 | " print(\"\")\n", 264 | " \n", 265 | " \n", 266 | " with beam.Pipeline(runner, options=opts) as pipeline:\n", 267 | " with impl.Context(Params.TEMP_DIR): \n", 268 | " \n", 269 | " ###### analyze & transform train #########################################################\n", 270 | " if(runner=='DirectRunner'):\n", 271 | " print(\"\")\n", 272 | " print(\"Transform training data....\")\n", 273 | " print(\"\")\n", 274 | " \n", 275 | " step = 'train'\n", 276 | " source_query = get_source_query(step)\n", 277 | " \n", 278 | " # Read raw train data from BQ and cleanup\n", 279 | " raw_train_data = (\n", 280 | " pipeline\n", 281 | " | '{} - Read Data from BigQuery'.format(step) >> beam.io.Read(beam.io.BigQuerySource(query=source_query, use_standard_sql=True))\n", 282 | " | '{} - Extract Features'.format(step) >> beam.Map(get_features)\n", 283 | " )\n", 284 | " \n", 285 | " # create a train dataset from the data and schema\n", 286 | " raw_train_dataset = (raw_train_data, raw_metadata)\n", 287 | " \n", 288 | " # analyze and transform raw_train_dataset to produced transformed_train_dataset and transform_fn\n", 289 | " transformed_train_dataset, transform_fn = (\n", 290 | " raw_train_dataset \n", 291 | " | '{} - Analyze & Transform'.format(step) >> impl.AnalyzeAndTransformDataset(preprocessing_fn)\n", 292 | " )\n", 293 | " \n", 294 | " # get data and schema separately from the transformed_train_dataset\n", 295 | " transformed_train_data, transformed_metadata = transformed_train_dataset\n", 296 | "\n", 297 | " # write transformed train data to sink\n", 298 | " _ = (\n", 299 | " transformed_train_data \n", 300 | " | '{} - Write 
Transformed Data as tfrecords'.format(step) >> beam.io.tfrecordio.WriteToTFRecord(\n", 301 | " file_path_prefix=Params.TRANSFORMED_TRAIN_DATA_FILE_PREFIX,\n", 302 | " file_name_suffix=\".tfrecords\",\n", 303 | " num_shards=25,\n", 304 | " coder=tft_coders.example_proto_coder.ExampleProtoCoder(transformed_metadata.schema))\n", 305 | " )\n", 306 | " \n", 307 | " \n", 308 | "# #### TEST write transformed AS TEXT train data to sink\n", 309 | "# _ = (\n", 310 | "# transformed_train_data \n", 311 | "# | '{} - Write Transformed Data as Text'.format(step) >> beam.io.textio.WriteToText(\n", 312 | "# file_path_prefix=Params.TRANSFORMED_TRAIN_DATA_FILE_PREFIX,\n", 313 | "# file_name_suffix=\".csv\")\n", 314 | "# )\n", 315 | "# ##################################################\n", 316 | "\n", 317 | "\n", 318 | " ###### transform eval ##################################################################\n", 319 | " \n", 320 | " if(runner=='DirectRunner'):\n", 321 | " print(\"\")\n", 322 | " print(\"Transform eval data....\")\n", 323 | " print(\"\")\n", 324 | " \n", 325 | " step = 'eval'\n", 326 | " source_query = get_source_query(step)\n", 327 | "\n", 328 | " # Read raw eval data from BQ and cleanup\n", 329 | " raw_eval_data = (\n", 330 | " pipeline\n", 331 | " | '{} - Read Data from BigQuery'.format(step) >> beam.io.Read(beam.io.BigQuerySource(query=source_query, use_standard_sql=True))\n", 332 | " | '{} - Extract Features'.format(step) >> beam.Map(get_features)\n", 333 | " )\n", 334 | " \n", 335 | " # create a eval dataset from the data and schema\n", 336 | " raw_eval_dataset = (raw_eval_data, raw_metadata)\n", 337 | " \n", 338 | " # transform eval data based on produced transform_fn (from analyzing train_data)\n", 339 | " transformed_eval_dataset = (\n", 340 | " (raw_eval_dataset, transform_fn) \n", 341 | " | '{} - Transform'.format(step) >> impl.TransformDataset()\n", 342 | " )\n", 343 | " \n", 344 | " # get data from the transformed_eval_dataset\n", 345 | " transformed_eval_data, _ = transformed_eval_dataset\n", 346 | " \n", 347 | " # write transformed eval data to sink\n", 348 | " _ = (\n", 349 | " transformed_eval_data \n", 350 | " | '{} - Write Transformed Data'.format(step) >> beam.io.tfrecordio.WriteToTFRecord(\n", 351 | " file_path_prefix=Params.TRANSFORMED_EVAL_DATA_FILE_PREFIX,\n", 352 | " file_name_suffix=\".tfrecords\",\n", 353 | " num_shards=10,\n", 354 | " coder=tft_coders.example_proto_coder.ExampleProtoCoder(transformed_metadata.schema))\n", 355 | " )\n", 356 | " \n", 357 | " ###### write transformation metadata #######################################################\n", 358 | " if(runner=='DirectRunner'):\n", 359 | " print(\"\")\n", 360 | " print(\"Saving transformation artefacts ....\")\n", 361 | " print(\"\")\n", 362 | " \n", 363 | " # write transform_fn as tf.graph\n", 364 | " _ = (\n", 365 | " transform_fn \n", 366 | " | 'Write Transform Artefacts' >> transform_fn_io.WriteTransformFn(Params.TRANSFORM_ARTEFACTS_DIR)\n", 367 | " )\n", 368 | "\n", 369 | " if runner=='DataflowRunner':\n", 370 | " pipeline.run()" 371 | ] 372 | }, 373 | { 374 | "cell_type": "markdown", 375 | "metadata": {}, 376 | "source": [ 377 | "## 5. Run Pipeline" 378 | ] 379 | }, 380 | { 381 | "cell_type": "code", 382 | "execution_count": 8, 383 | "metadata": {}, 384 | "outputs": [ 385 | { 386 | "name": "stdout", 387 | "output_type": "stream", 388 | "text": [ 389 | "Launching DirectRunner job preprocess-hackernews-data-180514-115222 ... 
hang on\n", 390 | "Sink train data files: data/news/transformed/train\n", 391 | "Sink data files: data/news/transformed/eval\n", 392 | "Temporary directory: data/news/tmp\n", 393 | "\n", 394 | "\n", 395 | "Transform training data....\n", 396 | "\n", 397 | "\n", 398 | "Transform eval data....\n", 399 | "\n", 400 | "\n", 401 | "Saving transformation artefacts ....\n", 402 | "\n" 403 | ] 404 | }, 405 | { 406 | "name": "stderr", 407 | "output_type": "stream", 408 | "text": [ 409 | "/Users/khalidsalama/Technology/python-venvs/py27-venv/lib/python2.7/site-packages/apache_beam/runners/direct/direct_runner.py:337: DeprecationWarning: options is deprecated since First stable release.. References to .options will not be supported\n", 410 | " pipeline.replace_all(_get_transform_overrides(pipeline.options))\n", 411 | "WARNING:root:Dataset ksalama-gcp-playground:temp_dataset_151e64fa07a3490bae91dd844ce4b7da does not exist so we will create it as temporary with location=None\n", 412 | "WARNING:root:Dataset ksalama-gcp-playground:temp_dataset_f3701d6e27e14e068968a255f43c4b8c does not exist so we will create it as temporary with location=None\n" 413 | ] 414 | }, 415 | { 416 | "name": "stdout", 417 | "output_type": "stream", 418 | "text": [ 419 | "Pipline completed.\n" 420 | ] 421 | } 422 | ], 423 | "source": [ 424 | "from datetime import datetime\n", 425 | "import shutil\n", 426 | "\n", 427 | "job_name = 'preprocess-hackernews-data' + '-' + datetime.utcnow().strftime('%y%m%d-%H%M%S')\n", 428 | "\n", 429 | "options = {\n", 430 | " 'region': Params.REGION,\n", 431 | " 'staging_location': os.path.join(Params.TEMP_DIR, 'staging'),\n", 432 | " 'temp_location': Params.TEMP_DIR,\n", 433 | " 'job_name': job_name,\n", 434 | " 'project': Params.GCP_PROJECT_ID\n", 435 | "}\n", 436 | "\n", 437 | "tf.logging.set_verbosity(tf.logging.ERROR)\n", 438 | "\n", 439 | "opts = beam.pipeline.PipelineOptions(flags=[], **options)\n", 440 | "runner = 'DirectRunner' if Params.PLATFORM == 'local' else 'DirectRunner'\n", 441 | "\n", 442 | "if Params.TRANSFORM:\n", 443 | " \n", 444 | " if Params.PLATFORM == 'local':\n", 445 | " shutil.rmtree(Params.TRANSFORMED_DATA_DIR, ignore_errors=True)\n", 446 | " shutil.rmtree(Params.TRANSFORM_ARTEFACTS_DIR, ignore_errors=True)\n", 447 | " shutil.rmtree(Params.TEMP_DIR, ignore_errors=True)\n", 448 | " \n", 449 | " print 'Launching {} job {} ... 
hang on'.format(runner, job_name)\n", 450 | " \n", 451 | " run_pipeline(runner, opts)\n", 452 | " \n", 453 | " print \"Pipeline completed.\"\n", 454 | "else:\n", 455 | " print \"Transformation skipped!\"" 456 | ] 457 | }, 458 | { 459 | "cell_type": "code", 460 | "execution_count": 9, 461 | "metadata": {}, 462 | "outputs": [ 463 | { 464 | "name": "stdout", 465 | "output_type": "stream", 466 | "text": [ 467 | "** transformed data:\n", 468 | "eval-00000-of-00010.tfrecords\n", 469 | "eval-00001-of-00010.tfrecords\n", 470 | "eval-00002-of-00010.tfrecords\n", 471 | "eval-00003-of-00010.tfrecords\n", 472 | "eval-00004-of-00010.tfrecords\n", 473 | "eval-00005-of-00010.tfrecords\n", 474 | "eval-00006-of-00010.tfrecords\n", 475 | "eval-00007-of-00010.tfrecords\n", 476 | "eval-00008-of-00010.tfrecords\n", 477 | "eval-00009-of-00010.tfrecords\n", 478 | "train-00000-of-00025.tfrecords\n", 479 | "train-00001-of-00025.tfrecords\n", 480 | "train-00002-of-00025.tfrecords\n", 481 | "train-00003-of-00025.tfrecords\n", 482 | "train-00004-of-00025.tfrecords\n", 483 | "train-00005-of-00025.tfrecords\n", 484 | "train-00006-of-00025.tfrecords\n", 485 | "train-00007-of-00025.tfrecords\n", 486 | "train-00008-of-00025.tfrecords\n", 487 | "train-00009-of-00025.tfrecords\n", 488 | "train-00010-of-00025.tfrecords\n", 489 | "train-00011-of-00025.tfrecords\n", 490 | "train-00012-of-00025.tfrecords\n", 491 | "train-00013-of-00025.tfrecords\n", 492 | "train-00014-of-00025.tfrecords\n", 493 | "train-00015-of-00025.tfrecords\n", 494 | "train-00016-of-00025.tfrecords\n", 495 | "train-00017-of-00025.tfrecords\n", 496 | "train-00018-of-00025.tfrecords\n", 497 | "train-00019-of-00025.tfrecords\n", 498 | "train-00020-of-00025.tfrecords\n", 499 | "train-00021-of-00025.tfrecords\n", 500 | "train-00022-of-00025.tfrecords\n", 501 | "train-00023-of-00025.tfrecords\n", 502 | "train-00024-of-00025.tfrecords\n", 503 | "\n", 504 | "** transform artefacts:\n", 505 | "transform_fn\n", 506 | "transformed_metadata\n", 507 | "\n", 508 | "** transform assets:\n", 509 | "vocab_string_to_int_uniques\n", 510 | "\n", 511 | "the\n", 512 | "a\n", 513 | "to\n", 514 | "for\n", 515 | "in\n", 516 | "of\n", 517 | "and\n", 518 | "s\n", 519 | "on\n", 520 | "with\n" 521 | ] 522 | } 523 | ], 524 | "source": [ 525 | "%%bash\n", 526 | "\n", 527 | "echo \"** transformed data:\"\n", 528 | "ls data/news/transformed\n", 529 | "echo \"\"\n", 530 | "\n", 531 | "echo \"** transform artefacts:\"\n", 532 | "ls models/news/transform\n", 533 | "echo \"\"\n", 534 | "\n", 535 | "echo \"** transform assets:\"\n", 536 | "ls models/news/transform/transform_fn/assets\n", 537 | "echo \"\"\n", 538 | "\n", 539 | "head models/news/transform/transform_fn/assets/vocab_string_to_int_uniques" 540 | ] 541 | }, 542 | { 543 | "cell_type": "code", 544 | "execution_count": null, 545 | "metadata": {}, 546 | "outputs": [], 547 | "source": [] 548 | } 549 | ], 550 | "metadata": { 551 | "kernelspec": { 552 | "display_name": "Python 2", 553 | "language": "python", 554 | "name": "python2" 555 | }, 556 | "language_info": { 557 | "codemirror_mode": { 558 | "name": "ipython", 559 | "version": 2 560 | }, 561 | "file_extension": ".py", 562 | "mimetype": "text/x-python", 563 | "name": "python", 564 | "nbconvert_exporter": "python", 565 | "pygments_lexer": "ipython2", 566 | "version": "2.7.10" 567 | } 568 | }, 569 | "nbformat": 4, 570 | "nbformat_minor": 2 571 | } 572 | -------------------------------------------------------------------------------- /08 - Text Analysis/06 - Part_4 - Text Classification - 
Hacker News - DNNClassifier with TF.IDF.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "# %%bash\n", 10 | "\n", 11 | "# pip install tensorflow==1.7\n", 12 | "# pip install tensorflow-transform" 13 | ] 14 | }, 15 | { 16 | "cell_type": "markdown", 17 | "metadata": {}, 18 | "source": [ 19 | "# Text Classification using TensorFlow and Google Cloud - Part 4\n", 20 | "\n", 21 | "The [bigquery-public-data:hacker_news](https://cloud.google.com/bigquery/public-data/hacker-news) dataset contains all stories and comments from Hacker News since its launch in 2006. Each story includes a story id, url, the title of the story, the author who made the post, when it was written, and the number of points the story received.\n", 22 | "\n", 23 | "The objective is to build an ML model that, given the title of a story, predicts the source of that story.\n", 24 | "\n", 25 | "## TF DNNClassifier with TF.IDF Text Representation\n", 26 | "\n", 27 | "This notebook illustrates how to build a premade TF estimator, namely DNNClassifier, where the input text is represented by the TF.IDF weights computed during the preprocessing phase in Part 1. The overall steps are as follows:\n", 28 | "\n", 29 | "\n", 30 | "1. Define the metadata\n", 31 | "2. Define the data input function\n", 32 | "3. Create the feature columns (using the TF.IDF weights)\n", 33 | "4. Create the premade DNNClassifier estimator\n", 34 | "5. Set up the experiment\n", 35 | " * Hyper-parameters & RunConfig\n", 36 | " * Serving function (for the exported model)\n", 37 | " * TrainSpec & EvalSpec\n", 38 | "6. Run the experiment\n", 39 | "7. Evaluate the model\n", 40 | "8. 
Use SavedModel for prediction\n", 41 | " \n", 42 | "\n" 43 | ] 44 | }, 45 | { 46 | "cell_type": "markdown", 47 | "metadata": {}, 48 | "source": [ 49 | "### Setting Global Parameters" 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "execution_count": 1, 55 | "metadata": {}, 56 | "outputs": [], 57 | "source": [ 58 | "import os\n", 59 | "\n", 60 | "class Params:\n", 61 | " pass\n", 62 | "\n", 63 | "# Set to run on GCP\n", 64 | "Params.GCP_PROJECT_ID = 'ksalama-gcp-playground'\n", 65 | "Params.REGION = 'europe-west1'\n", 66 | "Params.BUCKET = 'ksalama-gcs-cloudml'\n", 67 | "\n", 68 | "Params.PLATFORM = 'local' # local | GCP\n", 69 | "\n", 70 | "Params.DATA_DIR = 'data/news' if Params.PLATFORM == 'local' else 'gs://{}/data/news'.format(Params.BUCKET)\n", 71 | "\n", 72 | "Params.TRANSFORMED_DATA_DIR = os.path.join(Params.DATA_DIR, 'transformed')\n", 73 | "Params.TRANSFORMED_TRAIN_DATA_FILE_PREFIX = os.path.join(Params.TRANSFORMED_DATA_DIR, 'train')\n", 74 | "Params.TRANSFORMED_EVAL_DATA_FILE_PREFIX = os.path.join(Params.TRANSFORMED_DATA_DIR, 'eval')\n", 75 | "\n", 76 | "Params.TEMP_DIR = os.path.join(Params.DATA_DIR, 'tmp')\n", 77 | "\n", 78 | "Params.MODELS_DIR = 'models/news' if Params.PLATFORM == 'local' else 'gs://{}/models/news'.format(Params.BUCKET)\n", 79 | "\n", 80 | "Params.TRANSFORM_ARTEFACTS_DIR = os.path.join(Params.MODELS_DIR,'transform')\n", 81 | "\n", 82 | "Params.TRAIN = True\n", 83 | "\n", 84 | "Params.RESUME_TRAINING = False\n", 85 | "\n", 86 | "Params.EAGER = False\n", 87 | "\n", 88 | "if Params.EAGER:\n", 89 | " import tensorflow as tf # tf is not imported until the 'Importing libraries' cell below\n", " tf.enable_eager_execution()" 90 | ] 91 | }, 92 | { 93 | "cell_type": "markdown", 94 | "metadata": {}, 95 | "source": [ 96 | "### Importing libraries" 97 | ] 98 | }, 99 | { 100 | "cell_type": "code", 101 | "execution_count": 2, 102 | "metadata": {}, 103 | "outputs": [ 104 | { 105 | "name": "stdout", 106 | "output_type": "stream", 107 | "text": [ 108 | "WARNING:tensorflow:From /Users/khalidsalama/Technology/python-venvs/py27-venv/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.\n", 109 | "Instructions for updating:\n", 110 | "Use the retry module or similar alternatives.\n", 111 | "1.7.0\n" 112 | ] 113 | } 114 | ], 115 | "source": [ 116 | "import tensorflow as tf\n", 117 | "from tensorflow import data\n", 118 | "\n", 119 | "\n", 120 | "from tensorflow.contrib.learn.python.learn.utils import input_fn_utils\n", 121 | "from tensorflow_transform.beam.tft_beam_io import transform_fn_io\n", 122 | "from tensorflow_transform.tf_metadata import metadata_io\n", 123 | "from tensorflow_transform.tf_metadata import dataset_schema\n", 124 | "from tensorflow_transform.tf_metadata import dataset_metadata\n", 125 | "from tensorflow_transform.saved import saved_transform_io\n", 126 | "\n", 127 | "print tf.__version__" 128 | ] 129 | }, 130 | { 131 | "cell_type": "markdown", 132 | "metadata": {}, 133 | "source": [ 134 | "## 1. 
Define Metadata" 135 | ] 136 | }, 137 | { 138 | "cell_type": "code", 139 | "execution_count": 3, 140 | "metadata": {}, 141 | "outputs": [ 142 | { 143 | "name": "stdout", 144 | "output_type": "stream", 145 | "text": [ 146 | "{u'source': FixedLenFeature(shape=[], dtype=tf.string, default_value=None), u'title': FixedLenFeature(shape=[], dtype=tf.string, default_value=None), u'weight': VarLenFeature(dtype=tf.float32), u'bow': VarLenFeature(dtype=tf.int64)}\n" 147 | ] 148 | } 149 | ], 150 | "source": [ 151 | "RAW_HEADER = 'key,title,source'.split(',')\n", 152 | "RAW_DEFAULTS = [['NA'],['NA'],['NA']]\n", 153 | "TARGET_FEATURE_NAME = 'source'\n", 154 | "TARGET_LABELS = ['github', 'nytimes', 'techcrunch']\n", 155 | "TEXT_FEATURE_NAME = 'title'\n", 156 | "KEY_COLUMN = 'key'\n", 157 | "\n", 158 | "VOCAB_SIZE = 20000\n", 159 | "TRAIN_SIZE = 73124\n", 160 | "EVAL_SIZE = 23079\n", 161 | "\n", 162 | "DELIMITERS = '.,!?() '\n", 163 | "\n", 164 | "raw_metadata = dataset_metadata.DatasetMetadata(dataset_schema.Schema({\n", 165 | " KEY_COLUMN: dataset_schema.ColumnSchema(\n", 166 | " tf.string, [], dataset_schema.FixedColumnRepresentation()),\n", 167 | " TEXT_FEATURE_NAME: dataset_schema.ColumnSchema(\n", 168 | " tf.string, [], dataset_schema.FixedColumnRepresentation()),\n", 169 | " TARGET_FEATURE_NAME: dataset_schema.ColumnSchema(\n", 170 | " tf.string, [], dataset_schema.FixedColumnRepresentation()),\n", 171 | "}))\n", 172 | "\n", 173 | "\n", 174 | "transformed_metadata = metadata_io.read_metadata(\n", 175 | " os.path.join(Params.TRANSFORM_ARTEFACTS_DIR,\"transformed_metadata\"))\n", 176 | "\n", 177 | "raw_feature_spec = raw_metadata.schema.as_feature_spec()\n", 178 | "transformed_feature_spec = transformed_metadata.schema.as_feature_spec()\n", 179 | "\n", 180 | "print transformed_feature_spec" 181 | ] 182 | }, 183 | { 184 | "cell_type": "markdown", 185 | "metadata": {}, 186 | "source": [ 187 | "## 2. 
Define Input Function" 188 | ] 189 | }, 190 | { 191 | "cell_type": "code", 192 | "execution_count": 4, 193 | "metadata": {}, 194 | "outputs": [], 195 | "source": [ 196 | "def parse_tf_example(tf_example):\n", 197 | " \n", 198 | " parsed_features = tf.parse_single_example(serialized=tf_example, features=transformed_feature_spec)\n", 199 | " target = parsed_features.pop(TARGET_FEATURE_NAME)\n", 200 | " \n", 201 | " return parsed_features, target\n", 202 | "\n", 203 | "\n", 204 | "def generate_tfrecords_input_fn(files_pattern, \n", 205 | " mode=tf.estimator.ModeKeys.EVAL, \n", 206 | " num_epochs=1, \n", 207 | " batch_size=200):\n", 208 | " \n", 209 | " def _input_fn():\n", 210 | " \n", 211 | " file_names = data.Dataset.list_files(files_pattern)\n", 212 | "\n", 213 | " if Params.EAGER:\n", 214 | " print file_names\n", 215 | "\n", 216 | " dataset = data.TFRecordDataset(file_names )\n", 217 | "\n", 218 | " dataset = dataset.apply(\n", 219 | " tf.contrib.data.shuffle_and_repeat(count=num_epochs,\n", 220 | " buffer_size=batch_size*2)\n", 221 | " )\n", 222 | "\n", 223 | " dataset = dataset.apply(\n", 224 | " tf.contrib.data.map_and_batch(parse_tf_example, \n", 225 | " batch_size=batch_size, \n", 226 | " num_parallel_batches=2)\n", 227 | " )\n", 228 | "\n", 229 | " datset = dataset.prefetch(batch_size)\n", 230 | "\n", 231 | " if Params.EAGER:\n", 232 | " return dataset\n", 233 | "\n", 234 | " iterator = dataset.make_one_shot_iterator()\n", 235 | " features, target = iterator.get_next()\n", 236 | " return features, target\n", 237 | " \n", 238 | " return _input_fn" 239 | ] 240 | }, 241 | { 242 | "cell_type": "markdown", 243 | "metadata": {}, 244 | "source": [ 245 | "## 3. Create feature columns" 246 | ] 247 | }, 248 | { 249 | "cell_type": "code", 250 | "execution_count": 5, 251 | "metadata": {}, 252 | "outputs": [], 253 | "source": [ 254 | "BOW_FEATURE_NAME = 'bow'\n", 255 | "TFIDF_FEATURE_NAME = 'weight'\n", 256 | "\n", 257 | "def create_feature_columns():\n", 258 | " \n", 259 | " # Get word indecies from bow\n", 260 | " bow = tf.feature_column.categorical_column_with_identity(\n", 261 | " BOW_FEATURE_NAME, num_buckets=VOCAB_SIZE + 1)\n", 262 | " \n", 263 | " # Add weight to the word indecies\n", 264 | " weight_bow = tf.feature_column.weighted_categorical_column(\n", 265 | " bow, TFIDF_FEATURE_NAME)\n", 266 | " \n", 267 | " # Convert to indicator \n", 268 | " weight_bow_indicators = tf.feature_column.indicator_column(weight_bow)\n", 269 | " \n", 270 | " return [weight_bow_indicators]" 271 | ] 272 | }, 273 | { 274 | "cell_type": "markdown", 275 | "metadata": {}, 276 | "source": [ 277 | "## 4. Create a model using a premade DNNClassifer" 278 | ] 279 | }, 280 | { 281 | "cell_type": "code", 282 | "execution_count": 6, 283 | "metadata": {}, 284 | "outputs": [], 285 | "source": [ 286 | "def create_estimator(hparams, run_config):\n", 287 | " \n", 288 | " feature_columns = create_feature_columns()\n", 289 | " \n", 290 | " optimizer = tf.train.AdamOptimizer(learning_rate=hparams.learning_rate)\n", 291 | " \n", 292 | " estimator = tf.estimator.DNNClassifier(\n", 293 | " feature_columns=feature_columns,\n", 294 | " n_classes =len(TARGET_LABELS),\n", 295 | " label_vocabulary=TARGET_LABELS,\n", 296 | " hidden_units=hparams.hidden_units,\n", 297 | " optimizer=optimizer,\n", 298 | " config=run_config\n", 299 | " )\n", 300 | " \n", 301 | " \n", 302 | " return estimator" 303 | ] 304 | }, 305 | { 306 | "cell_type": "markdown", 307 | "metadata": {}, 308 | "source": [ 309 | "## 5. 
Setup Experiment" 310 | ] 311 | }, 312 | { 313 | "cell_type": "markdown", 314 | "metadata": {}, 315 | "source": [ 316 | "### 5.1 HParams and RunConfig" 317 | ] 318 | }, 319 | { 320 | "cell_type": "code", 321 | "execution_count": 7, 322 | "metadata": {}, 323 | "outputs": [ 324 | { 325 | "name": "stdout", 326 | "output_type": "stream", 327 | "text": [ 328 | "[('batch_size', 1000), ('hidden_units', [64, 32]), ('learning_rate', 0.01), ('max_steps', 730), ('num_epochs', 10), ('trainable_embedding', False)]\n", 329 | "\n", 330 | "('Model Directory:', 'models/news/dnn_estimator_tfidf')\n", 331 | "('Dataset Size:', 73124)\n", 332 | "('Batch Size:', 1000)\n", 333 | "('Steps per Epoch:', 73)\n", 334 | "('Total Steps:', 730)\n" 335 | ] 336 | } 337 | ], 338 | "source": [ 339 | "NUM_EPOCHS = 10\n", 340 | "BATCH_SIZE = 1000\n", 341 | "\n", 342 | "TOTAL_STEPS = (TRAIN_SIZE/BATCH_SIZE)*NUM_EPOCHS\n", 343 | "EVAL_EVERY_SEC = 60\n", 344 | "\n", 345 | "hparams = tf.contrib.training.HParams(\n", 346 | " num_epochs = NUM_EPOCHS,\n", 347 | " batch_size = BATCH_SIZE,\n", 348 | " learning_rate = 0.01,\n", 349 | " hidden_units=[64, 32],\n", 350 | " max_steps = TOTAL_STEPS,\n", 351 | "\n", 352 | ")\n", 353 | "\n", 354 | "MODEL_NAME = 'dnn_estimator_tfidf' \n", 355 | "model_dir = os.path.join(Params.MODELS_DIR, MODEL_NAME)\n", 356 | "\n", 357 | "run_config = tf.estimator.RunConfig(\n", 358 | " tf_random_seed=19830610,\n", 359 | " log_step_count_steps=1000,\n", 360 | " save_checkpoints_secs=EVAL_EVERY_SEC,\n", 361 | " keep_checkpoint_max=1,\n", 362 | " model_dir=model_dir\n", 363 | ")\n", 364 | "\n", 365 | "\n", 366 | "print(hparams)\n", 367 | "print(\"\")\n", 368 | "print(\"Model Directory:\", run_config.model_dir)\n", 369 | "print(\"Dataset Size:\", TRAIN_SIZE)\n", 370 | "print(\"Batch Size:\", BATCH_SIZE)\n", 371 | "print(\"Steps per Epoch:\",TRAIN_SIZE/BATCH_SIZE)\n", 372 | "print(\"Total Steps:\", TOTAL_STEPS)" 373 | ] 374 | }, 375 | { 376 | "cell_type": "markdown", 377 | "metadata": {}, 378 | "source": [ 379 | "### 5.2 Serving function" 380 | ] 381 | }, 382 | { 383 | "cell_type": "code", 384 | "execution_count": 8, 385 | "metadata": {}, 386 | "outputs": [], 387 | "source": [ 388 | "def generate_serving_input_fn():\n", 389 | " \n", 390 | " def _serving_fn():\n", 391 | " \n", 392 | " receiver_tensor = {\n", 393 | " 'title': tf.placeholder(dtype=tf.string, shape=[None])\n", 394 | " }\n", 395 | "\n", 396 | " _, transformed_features = (\n", 397 | " saved_transform_io.partially_apply_saved_transform(\n", 398 | " os.path.join(Params.TRANSFORM_ARTEFACTS_DIR, transform_fn_io.TRANSFORM_FN_DIR),\n", 399 | " receiver_tensor)\n", 400 | " )\n", 401 | " \n", 402 | " return tf.estimator.export.ServingInputReceiver(\n", 403 | " transformed_features, receiver_tensor)\n", 404 | " \n", 405 | " return _serving_fn" 406 | ] 407 | }, 408 | { 409 | "cell_type": "markdown", 410 | "metadata": {}, 411 | "source": [ 412 | "### 5.3 TrainSpec & EvalSpec" 413 | ] 414 | }, 415 | { 416 | "cell_type": "code", 417 | "execution_count": 9, 418 | "metadata": {}, 419 | "outputs": [], 420 | "source": [ 421 | "train_spec = tf.estimator.TrainSpec(\n", 422 | " input_fn = generate_tfrecords_input_fn(\n", 423 | " Params.TRANSFORMED_TRAIN_DATA_FILE_PREFIX+\"*\",\n", 424 | " mode = tf.estimator.ModeKeys.TRAIN,\n", 425 | " num_epochs=hparams.num_epochs,\n", 426 | " batch_size=hparams.batch_size\n", 427 | " ),\n", 428 | " max_steps=hparams.max_steps,\n", 429 | " hooks=None\n", 430 | ")\n", 431 | "\n", 432 | "eval_spec = tf.estimator.EvalSpec(\n", 433 | " 
input_fn = generate_tfrecords_input_fn(\n", 434 | " Params.TRANSFORMED_EVAL_DATA_FILE_PREFIX+\"*\",\n", 435 | " mode=tf.estimator.ModeKeys.EVAL,\n", 436 | " num_epochs=1,\n", 437 | " batch_size=hparams.batch_size\n", 438 | " ),\n", 439 | " exporters=[tf.estimator.LatestExporter(\n", 440 | " name=\"estimate\", # the name of the folder under 'export' to which the model is exported\n", 441 | " serving_input_receiver_fn=generate_serving_input_fn(),\n", 442 | " exports_to_keep=1,\n", 443 | " as_text=False)],\n", 444 | " steps=None,\n", 445 | " throttle_secs=EVAL_EVERY_SEC\n", 446 | ")" 447 | ] 448 | }, 449 | { 450 | "cell_type": "markdown", 451 | "metadata": {}, 452 | "source": [ 453 | "## 6. Run experiment" 454 | ] 455 | }, 456 | { 457 | "cell_type": "code", 458 | "execution_count": 10, 459 | "metadata": {}, 460 | "outputs": [ 461 | { 462 | "name": "stdout", 463 | "output_type": "stream", 464 | "text": [ 465 | "Removing previous training artefacts...\n", 466 | "Experiment started at 16:13:21\n", 467 | ".......................................\n", 468 | "INFO:tensorflow:Using config: {'_save_checkpoints_secs': 60, '_session_config': None, '_keep_checkpoint_max': 1, '_tf_random_seed': 19830610, '_task_type': 'worker', '_global_id_in_cluster': 0, '_is_chief': True, '_cluster_spec': , '_model_dir': 'models/news/dnn_estimator_tfidf', '_num_worker_replicas': 1, '_task_id': 0, '_log_step_count_steps': 1000, '_master': '', '_save_checkpoints_steps': None, '_keep_checkpoint_every_n_hours': 10000, '_evaluation_master': '', '_service': None, '_save_summary_steps': 100, '_num_ps_replicas': 0}\n", 469 | "INFO:tensorflow:Running training and evaluation locally (non-distributed).\n", 470 | "INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after 60 secs (eval_spec.throttle_secs) or training is finished.\n", 471 | "INFO:tensorflow:Calling model_fn.\n", 472 | "INFO:tensorflow:Done calling model_fn.\n", 473 | "INFO:tensorflow:Create CheckpointSaverHook.\n", 474 | "INFO:tensorflow:Graph was finalized.\n", 475 | "INFO:tensorflow:Running local_init_op.\n", 476 | "INFO:tensorflow:Done running local_init_op.\n", 477 | "INFO:tensorflow:Saving checkpoints for 1 into models/news/dnn_estimator_tfidf/model.ckpt.\n", 478 | "INFO:tensorflow:loss = 1098.7266, step = 1\n", 479 | "INFO:tensorflow:loss = 213.40088, step = 101 (15.307 sec)\n", 480 | "INFO:tensorflow:loss = 147.65674, step = 201 (13.971 sec)\n", 481 | "INFO:tensorflow:loss = 71.7646, step = 301 (15.121 sec)\n", 482 | "INFO:tensorflow:Saving checkpoints for 392 into models/news/dnn_estimator_tfidf/model.ckpt.\n", 483 | "INFO:tensorflow:Loss for final step: 26.048763.\n", 484 | "INFO:tensorflow:Calling model_fn.\n", 485 | "INFO:tensorflow:Done calling model_fn.\n", 486 | "INFO:tensorflow:Starting evaluation at 2018-05-14-16:14:22\n", 487 | "INFO:tensorflow:Graph was finalized.\n", 488 | "INFO:tensorflow:Restoring parameters from models/news/dnn_estimator_tfidf/model.ckpt-392\n", 489 | "INFO:tensorflow:Running local_init_op.\n", 490 | "INFO:tensorflow:Done running local_init_op.\n", 491 | "INFO:tensorflow:Finished evaluation at 2018-05-14-16:14:25\n", 492 | "INFO:tensorflow:Saving dict for global step 392: accuracy = 0.8243858, average_loss = 0.94847244, global_step = 392, loss = 912.07477\n", 493 | "WARNING:tensorflow:Expected binary or unicode string, got type_url: \"type.googleapis.com/tensorflow.AssetFileDef\"\n", 494 | "value: \"\\n\\t\\n\\007Const:0\\022\\033vocab_string_to_int_uniques\"\n", 495 | "\n", 496 | 
"INFO:tensorflow:Calling model_fn.\n", 497 | "INFO:tensorflow:Done calling model_fn.\n", 498 | "INFO:tensorflow:Signatures INCLUDED in export for Classify: ['serving_default', 'classification']\n", 499 | "INFO:tensorflow:Signatures INCLUDED in export for Regress: None\n", 500 | "INFO:tensorflow:Signatures INCLUDED in export for Predict: ['predict']\n", 501 | "INFO:tensorflow:Restoring parameters from models/news/dnn_estimator_tfidf/model.ckpt-392\n", 502 | "INFO:tensorflow:Assets added to graph.\n", 503 | "INFO:tensorflow:Assets written to: models/news/dnn_estimator_tfidf/export/estimate/temp-1526314465/assets\n", 504 | "INFO:tensorflow:SavedModel written to: models/news/dnn_estimator_tfidf/export/estimate/temp-1526314465/saved_model.pb\n", 505 | "INFO:tensorflow:Calling model_fn.\n", 506 | "INFO:tensorflow:Done calling model_fn.\n", 507 | "INFO:tensorflow:Create CheckpointSaverHook.\n", 508 | "INFO:tensorflow:Graph was finalized.\n", 509 | "INFO:tensorflow:Restoring parameters from models/news/dnn_estimator_tfidf/model.ckpt-392\n", 510 | "INFO:tensorflow:Running local_init_op.\n", 511 | "INFO:tensorflow:Done running local_init_op.\n", 512 | "INFO:tensorflow:Saving checkpoints for 393 into models/news/dnn_estimator_tfidf/model.ckpt.\n", 513 | "INFO:tensorflow:loss = 27.088547, step = 393\n", 514 | "INFO:tensorflow:loss = 2.9095829, step = 493 (13.979 sec)\n", 515 | "INFO:tensorflow:loss = 4.3351374, step = 593 (13.651 sec)\n", 516 | "INFO:tensorflow:loss = 11.017786, step = 693 (14.415 sec)\n", 517 | "INFO:tensorflow:Saving checkpoints for 730 into models/news/dnn_estimator_tfidf/model.ckpt.\n", 518 | "INFO:tensorflow:Loss for final step: 3.2552278.\n", 519 | "INFO:tensorflow:Calling model_fn.\n", 520 | "INFO:tensorflow:Done calling model_fn.\n", 521 | "INFO:tensorflow:Starting evaluation at 2018-05-14-16:15:15\n", 522 | "INFO:tensorflow:Graph was finalized.\n", 523 | "INFO:tensorflow:Restoring parameters from models/news/dnn_estimator_tfidf/model.ckpt-730\n", 524 | "INFO:tensorflow:Running local_init_op.\n", 525 | "INFO:tensorflow:Done running local_init_op.\n", 526 | "INFO:tensorflow:Finished evaluation at 2018-05-14-16:15:17\n", 527 | "INFO:tensorflow:Saving dict for global step 730: accuracy = 0.82416916, average_loss = 1.344607, global_step = 730, loss = 1293.0077\n", 528 | "WARNING:tensorflow:Expected binary or unicode string, got type_url: \"type.googleapis.com/tensorflow.AssetFileDef\"\n", 529 | "value: \"\\n\\t\\n\\007Const:0\\022\\033vocab_string_to_int_uniques\"\n", 530 | "\n", 531 | "INFO:tensorflow:Calling model_fn.\n", 532 | "INFO:tensorflow:Done calling model_fn.\n", 533 | "INFO:tensorflow:Signatures INCLUDED in export for Classify: ['serving_default', 'classification']\n", 534 | "INFO:tensorflow:Signatures INCLUDED in export for Regress: None\n", 535 | "INFO:tensorflow:Signatures INCLUDED in export for Predict: ['predict']\n", 536 | "INFO:tensorflow:Restoring parameters from models/news/dnn_estimator_tfidf/model.ckpt-730\n", 537 | "INFO:tensorflow:Assets added to graph.\n", 538 | "INFO:tensorflow:Assets written to: models/news/dnn_estimator_tfidf/export/estimate/temp-1526314518/assets\n", 539 | "INFO:tensorflow:SavedModel written to: models/news/dnn_estimator_tfidf/export/estimate/temp-1526314518/saved_model.pb\n", 540 | ".......................................\n", 541 | "Experiment finished at 16:15:18\n", 542 | "\n", 543 | "Experiment elapsed time: 117.021302 seconds\n" 544 | ] 545 | } 546 | ], 547 | "source": [ 548 | "from datetime import datetime\n", 549 | "import 
shutil\n", 550 | "\n", 551 | "if Params.TRAIN:\n", 552 | " if not Params.RESUME_TRAINING:\n", 553 | " print(\"Removing previous training artefacts...\")\n", 554 | " shutil.rmtree(model_dir, ignore_errors=True)\n", 555 | " else:\n", 556 | " print(\"Resuming training...\") \n", 557 | "\n", 558 | "\n", 559 | " tf.logging.set_verbosity(tf.logging.INFO)\n", 560 | "\n", 561 | " time_start = datetime.utcnow() \n", 562 | " print(\"Experiment started at {}\".format(time_start.strftime(\"%H:%M:%S\")))\n", 563 | " print(\".......................................\") \n", 564 | "\n", 565 | " estimator = create_estimator(hparams, run_config)\n", 566 | "\n", 567 | " tf.estimator.train_and_evaluate(\n", 568 | " estimator=estimator,\n", 569 | " train_spec=train_spec, \n", 570 | " eval_spec=eval_spec\n", 571 | " )\n", 572 | "\n", 573 | " time_end = datetime.utcnow() \n", 574 | " print(\".......................................\")\n", 575 | " print(\"Experiment finished at {}\".format(time_end.strftime(\"%H:%M:%S\")))\n", 576 | " print(\"\")\n", 577 | " time_elapsed = time_end - time_start\n", 578 | " print(\"Experiment elapsed time: {} seconds\".format(time_elapsed.total_seconds()))\n", 579 | "else:\n", 580 | " print \"Training was skipped!\"" 581 | ] 582 | }, 583 | { 584 | "cell_type": "markdown", 585 | "metadata": {}, 586 | "source": [ 587 | "## 7. Evaluate the model" 588 | ] 589 | }, 590 | { 591 | "cell_type": "code", 592 | "execution_count": 11, 593 | "metadata": {}, 594 | "outputs": [ 595 | { 596 | "name": "stdout", 597 | "output_type": "stream", 598 | "text": [ 599 | "############################################################################################\n", 600 | "# Train Measures: {'average_loss': 0.0037224626, 'accuracy': 0.99904275, 'global_step': 730, 'loss': 272.20135}\n", 601 | "############################################################################################\n", 602 | "\n", 603 | "############################################################################################\n", 604 | "# Eval Measures: {'average_loss': 1.3446056, 'accuracy': 0.82416916, 'global_step': 730, 'loss': 31032.152}\n", 605 | "############################################################################################\n" 606 | ] 607 | } 608 | ], 609 | "source": [ 610 | "tf.logging.set_verbosity(tf.logging.ERROR)\n", 611 | "\n", 612 | "estimator = create_estimator(hparams, run_config)\n", 613 | "\n", 614 | "train_metrics = estimator.evaluate(\n", 615 | " input_fn = generate_tfrecords_input_fn(\n", 616 | " files_pattern= Params.TRANSFORMED_TRAIN_DATA_FILE_PREFIX+\"*\", \n", 617 | " mode= tf.estimator.ModeKeys.EVAL,\n", 618 | " batch_size= TRAIN_SIZE), \n", 619 | " steps=1\n", 620 | ")\n", 621 | "\n", 622 | "\n", 623 | "print(\"############################################################################################\")\n", 624 | "print(\"# Train Measures: {}\".format(train_metrics))\n", 625 | "print(\"############################################################################################\")\n", 626 | "\n", 627 | "eval_metrics = estimator.evaluate(\n", 628 | " input_fn=generate_tfrecords_input_fn(\n", 629 | " files_pattern= Params.TRANSFORMED_EVAL_DATA_FILE_PREFIX+\"*\", \n", 630 | " mode= tf.estimator.ModeKeys.EVAL,\n", 631 | " batch_size= EVAL_SIZE), \n", 632 | " steps=1\n", 633 | ")\n", 634 | "print(\"\")\n", 635 | "print(\"############################################################################################\")\n", 636 | "print(\"# Eval Measures: {}\".format(eval_metrics))\n", 637 | 
"print(\"############################################################################################\")\n" 638 | ] 639 | }, 640 | { 641 | "cell_type": "markdown", 642 | "metadata": {}, 643 | "source": [ 644 | "## 8. Use Saved Model for Predictions" 645 | ] 646 | }, 647 | { 648 | "cell_type": "code", 649 | "execution_count": 12, 650 | "metadata": {}, 651 | "outputs": [ 652 | { 653 | "name": "stdout", 654 | "output_type": "stream", 655 | "text": [ 656 | "models/news/dnn_estimator_tfidf/export/estimate/1526314518\n", 657 | "\n", 658 | "{u'probabilities': array([[0.96217114, 0.01375495, 0.02407398],\n", 659 | " [0.02322701, 0.39720485, 0.5795681 ],\n", 660 | " [0.03017025, 0.9552083 , 0.01462139]], dtype=float32), u'class_ids': array([[0],\n", 661 | " [2],\n", 662 | " [1]]), u'classes': array([['github'],\n", 663 | " ['techcrunch'],\n", 664 | " ['nytimes']], dtype=object), u'logits': array([[ 2.4457023, -1.8020908, -1.2423583],\n", 665 | " [-2.1229138, 0.7162221, 1.0940531],\n", 666 | " [-0.9709409, 2.4841323, -1.6953117]], dtype=float32)}\n" 667 | ] 668 | } 669 | ], 670 | "source": [ 671 | "import os\n", 672 | "\n", 673 | "export_dir = model_dir +\"/export/estimate/\"\n", 674 | "saved_model_dir = os.path.join(export_dir, os.listdir(export_dir)[0])\n", 675 | "\n", 676 | "print(saved_model_dir)\n", 677 | "print(\"\")\n", 678 | "\n", 679 | "predictor_fn = tf.contrib.predictor.from_saved_model(\n", 680 | " export_dir = saved_model_dir,\n", 681 | " signature_def_key=\"predict\"\n", 682 | ")\n", 683 | "\n", 684 | "output = predictor_fn(\n", 685 | " {\n", 686 | " 'title':[\n", 687 | " 'Microsoft and Google are joining forces for a new AI framework',\n", 688 | " 'A new version of Python is mind blowing',\n", 689 | " 'EU is investigating new data privacy policies'\n", 690 | " ]\n", 691 | " \n", 692 | " }\n", 693 | ")\n", 694 | "print(output)" 695 | ] 696 | }, 697 | { 698 | "cell_type": "code", 699 | "execution_count": null, 700 | "metadata": {}, 701 | "outputs": [], 702 | "source": [] 703 | } 704 | ], 705 | "metadata": { 706 | "kernelspec": { 707 | "display_name": "Python 2", 708 | "language": "python", 709 | "name": "python2" 710 | }, 711 | "language_info": { 712 | "codemirror_mode": { 713 | "name": "ipython", 714 | "version": 2 715 | }, 716 | "file_extension": ".py", 717 | "mimetype": "text/x-python", 718 | "name": "python", 719 | "nbconvert_exporter": "python", 720 | "pygments_lexer": "ipython2", 721 | "version": "2.7.10" 722 | } 723 | }, 724 | "nbformat": 4, 725 | "nbformat_minor": 2 726 | } 727 | -------------------------------------------------------------------------------- /08 - Text Analysis/data/sms-spam/n_words.tsv: -------------------------------------------------------------------------------- 1 | 11330 -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # TensorFlow Estimator APIs Tutorials - TensorFlow v1.4 2 | 3 | ## The tutorials use the TF estimator APIs to cover: 4 | 5 | * Various ML tasks, currently covering: 6 | * Classification 7 | * Regression 8 | * Clustering (k-means) 9 | * Time-series Analysis (AR Models) 10 | * Dimensionality Reduction (Autoencoding) 11 | * Sequence Models (RNN and LSTMs) 12 | * Image Analysis (CNN for Image Classification) 13 | * Text Analysis (Text Classification with embeddings, CNN, and RNN) 14 | * How to use **canned estimators** to train ML models. 15 | 16 | * How to implement **custom estimators** (model_fn & EstimatorSpec). 
17 | 18 | * A standard **metadata-driven** approach to build the model **feature_column**(s) including: 19 | * **numerical** features 20 | * **categorical** features with **vocabulary**, 21 | * **categorical** features **hash bucket**, and 22 | * **categorical** features with **identity** 23 | 24 | * Data **input pipelines** (input_fn) using: 25 | * tf.estimator.inputs.**pandas_input_fn**, 26 | * tf.train.**string_input_producer**, and 27 | * tf.data.**Dataset** APIs to read both **.csv** and **.tfrecords** (tf.example) data files 28 | * tf.contrib.timeseries.**RandomWindowInputFn** and **WholeDatasetInputFn** for time-series data 29 | * Feature **preprocessing** and **creation** as part of reading data (input_fn), for example, sin, sqrt, polynomial expansion, Fourier transform, log, boolean comparisons, Euclidean distance, custom formulas, etc. 30 | 31 | * A standard approach to prepare **wide** (sparse) and **deep** (dense) feature_column(s) for Wide and Deep **DNN Linear Combined Models** 32 | 33 | * The use of **normalizer_fn** in numeric_column() to **scale** the numeric features using pre-computed statistics (for Min-Max or Standard scaling) 34 | 35 | * The use of **weight_column** in the canned estimators, and in the loss metric in custom estimators. 36 | 37 | * Implicit **Feature Engineering** as part of defining feature_column(s), including: 38 | * crossing, 39 | * clipping, 40 | * embedding, 41 | * indicators (encoding categorical features), and 42 | * bucketization 43 | * How to use the tf.contrib.learn.**Experiment** APIs to train, evaluate, and export models 44 | 45 | * How to use the tf.estimator.**train_and_evaluate** function (along with TrainSpec & EvalSpec) to train, evaluate, and export models 46 | 47 | * How to use the **tf.train.exponential_decay** function as a learning-rate scheduler 48 | 49 | * How to **serve** an exported model (export_savedmodel) using **csv** and **json** inputs 50 | ## Coming Soon: 51 | * Early-stopping implementation 52 | * DynamicRnnEstimator and the use of variable-length sequences 53 | * Collaborative Filtering for Recommendation Models 54 | * Text Analysis (Topic Models, Word/Doc embedding, etc.) 55 | * tf.Transform for preprocessing and feature engineering 56 | * Keras examples 57 | 58 | 59 | 60 | 61 | -------------------------------------------------------------------------------- /images/exp-api2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ksalama/tf-estimator-tutorials/cecfea0c378ebc8552941c9ebf8a530228dd845d/images/exp-api2.png --------------------------------------------------------------------------------