├── How it works - Bike Share Regression PyTorch Lightning.ipynb
├── How it works - Covariance, Correlation & Linear Regression.ipynb
├── How it works - GeoPandas basics.ipynb
├── How it works - GeoPandas detailed mapping.ipynb
├── How it works - GeoPandas spatial joins.ipynb
├── How it works - KMeans Clustering.ipynb
├── How it works - Multivariate regression.ipynb
├── How it works - Pandas, data manipulation.ipynb
├── How it works - Pandas, data selection.ipynb
├── How it works - Pandas, groupby method.ipynb
├── How it works - Pandas, mapping series values.ipynb
├── How it works - Pandas, merge method.ipynb
├── How it works - Pandas, pivot tables.ipynb
├── How it works - Polynomial Regression.ipynb
├── How it works - Principal Component Analysis.ipynb
├── How it works - basic lists.ipynb
├── How it works - data structures for deep learning.ipynb
├── How it works - labelling districts in GeoPandas.ipynb
├── How it works - list comprehensions.ipynb
├── How it works - lists vs arrays.ipynb
├── How it works - positive & negative indexation.ipynb
├── How it works - visualizing data overlaps.ipynb
├── Practice Run - Linear Regression Age vs Blood Pressure.ipynb
├── Practice Run - Linear Regression Fish Sizes.ipynb
├── README.md
├── customer_product_list.xlsx
├── data_structures.png
├── flattening.png
├── img_5.png
├── multiplication.png
├── sea_picture.jpg
├── sorted.jpg
├── take1.png
├── take2.png
├── tensor_shape_2.png
├── unsorted.jpg
└── weight_matrix_detail_2.png

/How it works - Covariance, Correlation & Linear Regression.ipynb:
--------------------------------------------------------------------------------
1 | { 2 | "cells": [
3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "## Co-variance" 8 | ] 9 | },
10 | { 11 | "cell_type": "code", 12 | "execution_count": 1, 13 | "metadata": {}, 14 | "outputs": [], 15 | "source": [ 16 | "import numpy as np\n", 17 | "import matplotlib.pyplot as plt\n", 18 | "%matplotlib inline" 19 | ] 20 | },
21 | {
22 | "cell_type": "markdown", 23 | "metadata": {}, 24 | "source": [ 25 | "wikiHow provides a very easy to understand example of how to calculate co-variance
\n", 26 | "https://www.wikihow.com/Calculate-Covariance
\n", 27 | "Using their sample data, below we see how to view the data, and then find the covariance using np.cov():" 28 | ] 29 | },
30 | { 31 | "cell_type": "code", 32 | "execution_count": 2, 33 | "metadata": {}, 34 | "outputs": [ 35 | { 36 | "data": { 37 | "text/plain": [ 38 | "" 39 | ] 40 | }, 41 | "execution_count": 2, 42 | "metadata": {}, 43 | "output_type": "execute_result" 44 | }, 45 | { 46 | "data": { 47 | "image/png": "[base64 PNG data omitted: scatter plot of y against x]", 48 | "text/plain": [ 49 | "" 50 | ] 51
| }, 52 | "metadata": {}, 53 | "output_type": "display_data" 54 | } 55 | ], 56 | "source": [ 57 | "x = [1, 3, 2, 5, 8, 7, 12, 2, 4]\n", 58 | "y = [8, 6, 9, 4, 3, 3, 2, 7, 7]\n", 59 | "plt.scatter(x, y)" 60 | ] 61 | },
62 | { 63 | "cell_type": "markdown", 64 | "metadata": {}, 65 | "source": [ 66 | "Note that as per https://machinelearningmastery.com/introduction-to-expected-value-variance-and-covariance/ np.cov() returns the covariance matrix. Its diagonal holds the variances of x and y, and the key value we are looking for is the off-diagonal [0,1] element, i.e. -8.07" 67 | ] 68 | },
69 | { 70 | "cell_type": "code", 71 | "execution_count": 3, 72 | "metadata": {}, 73 | "outputs": [ 74 | { 75 | "data": { 76 | "text/plain": [ 77 | "array([[ 12.61111111, -8.06944444],\n", 78 | " [ -8.06944444, 6.27777778]])" 79 | ] 80 | }, 81 | "execution_count": 3, 82 | "metadata": {}, 83 | "output_type": "execute_result" 84 | } 85 | ], 86 | "source": [ 87 | "np.cov(x, y)" 88 | ] 89 | },
90 | { 91 | "cell_type": "markdown", 92 | "metadata": {}, 93 | "source": [ 94 | "Let's do another example which is fairly typical: the relationship between heights and weights:" 95 | ] 96 | },
97 | { 98 | "cell_type": "code", 99 | "execution_count": 14, 100 | "metadata": {}, 101 | "outputs": [], 102 | "source": [ 103 | "# First let's get a normally distributed sample of people's heights (in meters)\n", 104 | "heights = np.random.normal(1.65, 0.10, 100)" 105 | ] 106 | },
107 | { 108 | "cell_type": "code", 109 | "execution_count": 15, 110 | "metadata": {}, 111 | "outputs": [], 112 | "source": [ 113 | "# And a normally distributed sample of people's weights (in kilograms)\n", 114 | "weights = np.random.normal(60, 8, 100)" 115 | ] 116 | },
117 | { 118 | "cell_type": "code", 119 | "execution_count": 16, 120 | "metadata": {}, 121 | "outputs": [ 122 | { 123 | "data": { 124 | "text/plain": [ 125 | "" 126 | ] 127 | }, 128 | "execution_count": 16, 129 | "metadata": {}, 130 | "output_type": "execute_result" 131 | }, 132 | { 133 | "data": { 134 | "image/png": "[base64 PNG data omitted: scatter plot of weights against heights for the unsorted samples]", 135 | "text/plain": [ 136 | "" 137 | ] 138 | }, 139 | "metadata": {}, 140 | "output_type": "display_data" 141 | } 142 | ], 143 | "source": [ 144 | "# Because the two samples were drawn independently, visualizing the data shows no real \n", 145 | "# relationship between heights and weights:\n", 146 | "plt.scatter(heights, weights)" 147 | ] 148 | },
149 | { 150 | "cell_type": "code", 151 | "execution_count": 17, 152 | "metadata": {}, 153 | "outputs": [], 154 | "source": [ 155 | "# But we can force one by sorting both arrays, which pairs the smallest height with the smallest weight and so on:\n", 156 | "heights.sort()\n", 157 | "weights.sort()" 158 | ] 159 | },
160 | { 161 | "cell_type": "code", 162 | "execution_count": 18, 163 | "metadata": {}, 164 | "outputs": [ 165 | { 166 | "data": { 167 | "text/plain": [ 168 | "" 169 | ] 170 | }, 171 | "execution_count": 18, 172 | "metadata": {}, 173 | "output_type": "execute_result" 174 | }, 175 | { 176 | "data": { 177 | "image/png": "[base64 PNG data omitted: scatter plot of weights against heights after sorting both arrays]", 178 | "text/plain": [ 179 | "" 180 | ] 181 | }, 182 | "metadata": {}, 183 | "output_type": "display_data" 184 | } 185 | ], 186 | "source": [ 187 | "# Having forced a relationship we can clearly see this in the data\n", 188 | "plt.scatter(heights, weights)" 189 | ] 190 | },
191 | { 192 | "cell_type": "code", 193 | "execution_count": 20, 194 | "metadata": {}, 195 | "outputs": [ 196 | { 197 | "data": { 198 | "text/plain": [ 199 | "array([[ 7.87464896e-03, 7.10461347e-01],\n", 200 | " [ 7.10461347e-01, 6.58789965e+01]])" 201 | ] 202 | }, 203 | "execution_count": 20, 204 | "metadata": {}, 205 | "output_type": "execute_result" 206 | } 207 | ], 208 | "source": [ 209 | "# Now let's find the co-variance...\n", 210 | "np.cov(heights, weights)" 211 | ] 212 | },
213 | { 214 | "cell_type": "code", 215 | "execution_count": 21, 216 | "metadata": { 217 | "scrolled": true 218 | }, 219 | "outputs": [ 220 | { 221 | "name": "stdout", 222 | "output_type": "stream", 223 | "text": [ 224 | "0.710461347301\n" 225 | ] 226 | } 227 | ], 228 | "source": [ 229 | "# We can also find this answer the long way based on the formula:\n", 230 | "# ∑(xi - xavg)(yi - yavg) / (n - 1)\n", 231 | "\n", 232 | "mean_height = np.mean(heights)\n", 233 | "mean_weight = np.mean(weights)\n", 234 | "heights_array = np.array([hi - mean_height for hi in heights])\n", 235 | "weights_array = np.array([wi - mean_weight for wi in weights])\n", 236 | "numerator = heights_array @ weights_array\n", 237 | "denominator = len(heights) - 1\n", 238 | "covariance = numerator / denominator\n", 239 | "print(covariance)" 240 | ] 241 | },
242 | { 243 | "cell_type": "markdown", 244 | "metadata": {}, 245 | "source": [ 246 | "Interpreting the result:
\n", 247 | "- Is it positive or negative? If it's positive then as one variable increases the other tends to increase too (e.g. heights and weights). If it's negative then as one variable increases the other tends to decrease (e.g. the x and y sample data in the first example, where the covariance was -8.07)
\n", 248 | "- How big is the number in relation to the data? A value that is large compared with the spread of the two variables points to a strong relationship. A value that is tiny compared with that spread, say 0.00something, means the relationship is probably negligible.
\n", 249 | "- By looking at the spread of the data above we can check this understanding...

\n", 250 | "But of course the difficulty here is \"what is a big number?\" since we have 2 different units of measure (meters and kilograms)!
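
To make the units problem concrete, here is a minimal sketch (not part of the original notebook; the seed and the names heights_m, heights_cm and weights_kg are assumptions chosen for illustration). It measures the same heights first in meters and then in centimeters: the relationship between the two variables is unchanged, yet the covariance comes out 100 times larger.

```python
import numpy as np

np.random.seed(0)  # assumed seed, only so the sketch is repeatable

# Hypothetical samples drawn the same way as in the cells above,
# then sorted so that a relationship is forced between them
heights_m = np.sort(np.random.normal(1.65, 0.10, 100))   # meters
weights_kg = np.sort(np.random.normal(60, 8, 100))       # kilograms

# The very same heights, expressed in centimeters
heights_cm = heights_m * 100

# The off-diagonal [0, 1] element of the covariance matrix is the covariance
cov_m = np.cov(heights_m, weights_kg)[0, 1]
cov_cm = np.cov(heights_cm, weights_kg)[0, 1]

# Only the unit changed, but the covariance is scaled by the same factor of 100
print(cov_m, cov_cm, cov_cm / cov_m)
```

So the size of a covariance cannot be judged without knowing the units behind it.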
" 251 | ] 252 | }, 253 | { 254 | "cell_type": "markdown", 255 | "metadata": {}, 256 | "source": [ 257 | "## Correlation\n", 258 | "Maths is fun has a lovely article on how to calculate correlation: https://www.mathsisfun.com/data/correlation.html – also infinitely intelligible!
\n", 259 | "Correlation will normalize the different units of measure to give a standardized value between -1 (perfect negative correlation) and 1 (perfect positive correlation), with 0 representing no linear correlation at all." 260 | ] 261 | },
262 | { 263 | "cell_type": "code", 264 | "execution_count": 22, 265 | "metadata": {}, 266 | "outputs": [ 267 | { 268 | "data": { 269 | "text/plain": [ 270 | "array([[ 1. , 0.98639614],\n", 271 | " [ 0.98639614, 1. ]])" 272 | ] 273 | }, 274 | "execution_count": 22, 275 | "metadata": {}, 276 | "output_type": "execute_result" 277 | } 278 | ], 279 | "source": [ 280 | "# Finding the correlation co-efficient is easy enough with numpy!\n", 281 | "np.corrcoef(heights, weights)" 282 | ] 283 | },
284 | { 285 | "cell_type": "code", 286 | "execution_count": 23, 287 | "metadata": {}, 288 | "outputs": [ 289 | { 290 | "name": "stdout", 291 | "output_type": "stream", 292 | "text": [ 293 | "0.986396144201\n" 294 | ] 295 | } 296 | ], 297 | "source": [ 298 | "# We can also find this answer the long way by building on the values found for co-variance above. The formula is...\n", 299 | "# ∑(xi - xavg)(yi - yavg) / sqrt(∑(xi - xavg)^2 * ∑(yi - yavg)^2)\n", 300 | "heights_sq = heights_array @ heights_array\n", 301 | "weights_sq = weights_array @ weights_array\n", 302 | "correlation = numerator / np.sqrt(heights_sq * weights_sq)\n", 303 | "print(correlation)" 304 | ] 305 | },
306 | { 307 | "cell_type": "markdown", 308 | "metadata": {}, 309 | "source": [ 310 | "So we DO indeed have a very high positive correlation, as could be seen in the plot!" 311 | ] 312 | },
313 | { 314 | "cell_type": "markdown", 315 | "metadata": {}, 316 | "source": [ 317 | "## Linear regression" 318 | ] 319 | },
320 | { 321 | "cell_type": "markdown", 322 | "metadata": {}, 323 | "source": [ 324 | "So now that we have established that there is a high correlation between heights and weights, we would like to find the line that best fits this relationship. Why?
because then given the height of any other random future person we could predict their probable weight, and vice-versa." 325 | ] 326 | }, 327 | { 328 | "cell_type": "markdown", 329 | "metadata": {}, 330 | "source": [ 331 | "Mathsisfun provides a very easy to understand lesson on how least squares regression is calculated:\n", 332 | "https://www.mathsisfun.com/data/least-squares-regression.html. Essentially, given that the equation of a line is expressed as y = mx + b, we are trying to solve for the constants m and b so that we get the line that represents the best fit for our data (the one that minimizes the squared differences from the data points to the line in each case)." 333 | ] 334 | }, 335 | { 336 | "cell_type": "code", 337 | "execution_count": 36, 338 | "metadata": {}, 339 | "outputs": [ 340 | { 341 | "name": "stdout", 342 | "output_type": "stream", 343 | "text": [ 344 | "90.2213356483 -88.82225037 0.986396144201\n" 345 | ] 346 | } 347 | ], 348 | "source": [ 349 | "# We'll need the stats module from scipy\n", 350 | "from scipy import stats\n", 351 | "\n", 352 | "# Using the linregress() function returns 5 values, which we name here for convenience:\n", 353 | "slope, intercept, r_value, p_value, std_err = stats.linregress(heights, weights)\n", 354 | "\n", 355 | "# Slope = m from our equation, Intercept = b, r_value = our correlation co-efficient\n", 356 | "print(slope, intercept, r_value)" 357 | ] 358 | }, 359 | { 360 | "cell_type": "code", 361 | "execution_count": 37, 362 | "metadata": {}, 363 | "outputs": [ 364 | { 365 | "data": { 366 | "text/plain": [ 367 | "0.9729773532953242" 368 | ] 369 | }, 370 | "execution_count": 37, 371 | "metadata": {}, 372 | "output_type": "execute_result" 373 | } 374 | ], 375 | "source": [ 376 | "# By squaring r_value we get the co-efficient of determination\n", 377 | "r_value ** 2" 378 | ] 379 | }, 380 | { 381 | "cell_type": "markdown", 382 | "metadata": {}, 383 | "source": [ 384 | "A quick note here - what is the difference 
between the co-efficient of correlation, and the co-efficient of determination?
\n", 385 | "It is very nicely explained here: http://blog.uwgb.edu/bansalg/statistics-data-analytics/linear-regression/what-is-the-difference-between-coefficient-of-determination-and-coefficient-of-correlation/ but essentially:
\n", 386 | "- The co-efficient of correlation is a number between -1 (perfect negative correlation) and 1 (perfect positive correlation) which indicates the direction and strength of the linear relationship between the 2 sets of values
\n", 387 | "- The co-efficient of determination is a number between 0 and 1 (of course because it has been squared!) and is indicative of how good a fit the line is to the original data - the higher the better" 388 | ] 389 | }, 390 | { 391 | "cell_type": "code", 392 | "execution_count": 34, 393 | "metadata": { 394 | "scrolled": true 395 | }, 396 | "outputs": [ 397 | { 398 | "data": { 399 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAD8CAYAAABn919SAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJzt3XmczvXex/HXxxgZVDMhopgWqZOUiIqDUmmz1EEkKZ2c+3S3SITsqSidk9NyTpFKJbcsjdJBi6VSKhq7pGQbimQoM5jle//xu8Y6yzVjrv39fDw8xnXN75r5dDXevj6/72LOOUREJPKVCXUBIiJSOhToIiJRQoEuIhIlFOgiIlFCgS4iEiUU6CIiUUKBLiISJRToIiJRQoEuIhIlygbzm1WpUsUlJycH81uKiES8JUuW/Oqcq1rUdUEN9OTkZBYvXhzMbykiEvHMbKM/16nlIiISJRToIiJRQoEuIhIlFOgiIlFCgS4iEiWCOstFRCQapaSmMXrOWramZ1IjMYG+revSvkHNoNehQBcROQ4pqWkMmL6CzKwcANLSMxkwfQVA0ENdLRcRkeMwes7ag2GeJzMrh9Fz1ga9FgW6iMhx2JqeWaznA0mBLiJyHGokJhTr+UBSoIuI+CElNY2mo+ZyZv8PaDpqLimpaQD0bV2XhPi4I65NiI+jb+u6Qa9RN0VFRIrgz41PzXIREYkAhd34bN+g5sFfoeZXy8XMHjKzVWa20swmmVl5MzvTzL4ys3VmNtnMygW6WBGRUAinG5+FKTLQzawm8ADQyDlXD4gDOgNPAc865+oAu4C7A1moiEiohNONz8L4e1O0LJBgZmWBCsA24Cpgqu/zE4D2pV+eiEjohdONz8IU2UN3zqWZ2TPAJiAT+BBYAqQ757J9l20BQt9AEhEJgHC68VmYIgPdzJKAdsCZQDowBbg+n0tdAa/vCfQEqFWrVokLFREJpXC58VkYf1ouVwM/Oed2OOeygOnAFUCirwUDcDqwNb8XO+fGOucaOecaVa1a5JF4IiJSQv4E+ibgMjOrYGYGtAJWA/OADr5rugMzAlOiiIj4o8hAd859hXfz81tghe81Y4F+QG8z+wGoDIwPYJ0iIlIEvxYWOeeGAkOPeno90LjUKxIRkRLRXi4iIlFCgS4iEiUU6CIiUUKbc4lIzAqXs0BLi0boIhKTBqWs4KHJS0lLz8RxaEvcvH3Oi+2LL6B1a9i5s1TrLA4FuojEnJTUNCYu2nTM8vYSnQW6ZQt07QpNm8LKlfDDD6VWZ3Ep0EUkpqSkpvHwO8vy36uEYmyJm5kJjz8OdevCtGkwaBCsXQtNmpRarcWlHrqIxIy8k4dyXEFx7seWuM55Ad6nD2zcCB06wOjRkJxcusWWgEboIhIz8jt56HAGhW+Ju2wZXHkldOwIJ58M8+bBlClhEeagQBeRGFJYO8WArpfVyn+Wy44d8D//A5dc4vXJX3oJvv0WWrYMWK0loZaLiMSMGokJpOUT6nFm/KPTRceGeVYWvPgiDBsGe/fCAw/AkCGQlBScgotJI3QRiRkFnTyUb5jPng3168NDD8Fll
8Hy5fDss2Eb5qBAF5EY0r5BTUbeciE1ExMwoGZiAiNvufDIMP/+e7jpJrj+esjJgZkzYdYsOP/8kNXtL7VcRCQilXSVZ4EnD+3eDSNGwHPPQfny8MwzcP/9UK5cAKoPDAW6iEScvOmHeTNW8lZ5AsVfup+TA6+9Bo8+Cr/+Cj16wBNPQLVqpV12wKnlIiIRJ7/phyVa5fn559C4Mdxzj7dAaPFieOWViAxzUKCLSAQqaPqh36s8N22CLl3gz3+G7dth0iT49FNvWmIEU6CLSMQpaDVnkas8MzJg+HA47zxISYHBg+G776BzZzALQKXBpUAXkYhT0PTDAld5OgeTJ3tBPmwYtGnjBfljj0HFioEvOEgU6CIScfyafpjn22+heXNvFF65MixY4IV77dpBrzvQNMtFRCJSgdMP82zfDgMHwvjxUKUKjB3rzWCJiyv4NRFOgS4i0eXAAXj+ea+dkpHhrfQcPBgSE0NdWcAp0EUkenzwgRfg69bBDTfAP//pTUeMEeqhi0jk++47b6n+TTdBmTJesH/wQUyFOSjQRSSSpadD795w4YXw5ZfeiHz5cm90HoPUchGRyJOT493sHDjQO5T5nnu8fVhOPTXUlYWURugiElkWLICGDeFvf/N2QFyyBF5+OebDHBToIhIpNmyATp28U4J27fLmki9YAA0ahLqysKGWi4iEt7174amnvIOYzbyl+336QIUKoa4s7CjQRSQ8OedtmvXII5CW5m2m9dRTcMYZoa4sbKnlIiLhZ/FiaNYMunb1trL9/HN4+22FeRE0QheR8PHzz95BE6+95t3kHD8e7rzTm1t+mJKeVhTtFOgiEnr793tHv40YAfv2Qd++MGgQnHTSMZeW6mlFUUYtFxEJHefg/fehXj2vV96iBaxaBU8/nW+YQymeVhSFigx0M6trZksP+7XHzHqZ2Slm9pGZrfN9TApGwSISJVavhuuug7ZtIT4eZs/2wr1OnUJfdtynFUWxIlsuzrm1wMUAZhYHpAHvAv2BT5xzo8ysv+9xvwDWKiLRYNcu75CJF1+ESpVgzBi4914v1PNxdL88sUI8uzKyjrmuyNOKYkBxe+itgB+dcxvNrB3Q0vf8BGA+CnQRKUh2Nowb521lu2vXoeX6VasW+JJBKSuYuGgTzvc4LT2T+DJGfJyRleMOXlfoaUUxpLiB3hmY5Pt9NefcNgDn3DYzy3fdrZn1BHoC1KpVq6R1ikgkmzcPHnwQVqzwVnr+619Qv36Bl6ekpjHsvVWkZx47Es/KdSQmxFPxhLKa5XIUvwPdzMoBbYEBxfkGzrmxwFiARo0auSIuF5Fo8tNP3qrO6dMhORmmTYObbz7mQOZBKSuY9NVmcpzDgDJljJzcguNid2YWS4deG9jaI1BxRujXA986537xPf7FzE7zjc5PA7aXfnkiEpH++ANGjYJnnvGOfHv8cW+b24SEY3riyZUTWPjjbwdf6qDQMAf1ywtSnEDvwqF2C8B7QHdglO/jjFKsS0QiUW6ut6KzXz/YuhVuv90L9ppeOyS/OeRpxZydYqB+eQH8moduZhWAa4Dphz09CrjGzNb5Pjeq9MsTkUgx/82ZrEyuB926sbrMiSx4bQa8+ebBMIf855AXhwFdL6ulfnkB/BqhO+cygMpHPbcTb9aLiESwvBZIWnomcWbkOEdN341GoOgl9tu2selvD9Ly/Slsr5hEnxt6Ma3eVZT/MZ6RqWlHXF/c0fjhkirEM7TNBQrzQmjpv0gMO7oFkuO83nVaeiZ9pywD4+D0wGOW2O/b580hf+IJqmfu59+XdeDFyzqx9wRvW9u81ZuHB3DeXxj+qFgujowDOZrFUgwKdJEYVlgLJCufG5OZWTmMnv0d7Td+Aw8/DOvXQ/v2XHPqjWxMOu2Y649evVlYmOeFfZwZXZqcwePtLyzmf40o0EVi0OFtluI4d8cGhvzfONi4DC64AD76CK6+muxRcyGfr3X0bJSaiQn5fs+aiQks7H9V8
f4j5BjanEskhqSkpnH+4Fn0mry0WGGemLmH4R/9h1mvPUD97evh+edh6VK4+mrAm3WSEB93xGvyW73p73VSMhqhi8SIlNQ0er+zlCKmeB8UX8aIczl0WvwBvT+fSKX9GUxqeCNJTz/JjVfWO+LavP52UTdQ/b1OSsacnzcoSkOjRo3c4sWLg/b9ROSQBo99mO+mVvmJM+ON2nu48JlhnPTjWhbWrs9/2j9Ah+7XKXxDwMyWOOcaFXWdRugiMcLfMD/391+YsPodThs1B848E959l6bt2tH0qOX6En4U6CICQMX9GfRPfZfbvphGXLl4GDkSevWC8uVDXZr4SYEuEiMSE+Lz3b3QXC7PZC7jL1Ne9M707N4dnnwSatQIQZVyPBToIjHipotO461Fm454rkHadzy/6HVO/2ElNGkCM2ZA48YhqlCOlwJdJAYMSllxRJhX+/1X+i2YwC2r5pFZpRpMmOBtpFVGM5kjmQJdJEodvsd4nhOyD/DXr9/lfxe9Q1xuLs9ffiszWnfj4ztuDGGlUloU6CJR4vB9xsvHlyEzK/fQJ52j9fdfMmjeeM7Y/Quzz72cJ668m82J1TGdrRw1FOgiUeDoTbYOD/Pztv/E0E/GcvmmFXxXpTZdOj/Bl7UvOvh5HRYRPRToIlEgv022kjJ28/Bnb9Fl2Rz2nFCRQdfey6SLWpNT5tDS+/g407L7KKJAF4kCh+/LUjYnm26pH9Dr87epeCCTNy65kTFNb2N3wolHvKZiuTieuPlCrfyMIgp0kSiQt/Xsn3/6liGfjKPOzs18mtyAEVf9lXVVax9xbU3tnxK1FOgiUeCMnVsYOG881/zwNRsST+Ovtwzm43Mag2+5vvYYjw0KdJFItns36+7vx4cTx7O/bDwjW97Jaw3bcaBsPKB9xmONAl0kEuXmwuuvw4ABnL1jB1PrtWJ08+7sqJR08BID3fCMMQp0kUizcCE8+CAsWQJXXEG76waw4rQ6x1zmQH3yGKN1viKRYvNmuO02aNbM20Rr4kT4/HNW1Tg2zAHKaLfbmKMRuki4y8yEZ57xtrN1DgYPhn79oGJFgAJPIPL3ZCKJHgp0kXDlHEydCn36wKZN0LEjPP00JCf7lvl/VexDniW6qeUiEo6WLoWWLaFTJ0hKgvnz4Z13Dob5gOkrigzzxIT4oJQq4UMjdJFwsmMHDBoE48ZB5crw8stw992kLP+Z0aPmsjU9kzK+RUSFiS9jDGt7QZCKlnChQBcJBwcOwIsvwvDhsHevN4tl6FAGzd/MWwNnH3FpUWEOMLrjRZrhEoMU6CKhNmsWPPQQrF3Ll+deyqDmPdhQvhY5oxaW6MvVTExQmMcoBbpIqKxdC717w3//y++1z6JP5+HMqd3Q+5wfo/D8JMTHaTFRDFOgiwRbejqMGEHOv54jo2w5/nVlDyY0bENW3PHdxIwzY+Qt2j0xlinQRYIgJTWNf8xaTbNP36PvZ2+SmLGHKfWv4Znm3fi1YlLRX6AICfFxCnNRoIsEWkpqGlPHTOI/c16i3i8/8vXpf2J4x+Gsqn7OcX3dpArxpGdkUUPb4YqPAl0kkDZt4sTu3XlrxXzSTqzK/W368v75zQ9ua1sSSRXiGdrmAgW4HMOvQDezROAVoB7enj89gLXAZCAZ2AB0cs7tCkiVIpEmI8Nb1fnUU1yR7RjTtAsvNfkL++LL+/XyvAMrdBiFFIe/I/R/AbOdcx3MrBxQAXgU+MQ5N8rM+gP9gX4BqlMkMjgHkydD376wZQt06sQtNduwppx/ffLbL6ulQyikxIpc+m9mJwHNgfEAzrkDzrl0oB0wwXfZBKB9oIoUiQhLlkDz5tClC1vLVeLWrk+RfOYdfod507NPUZjLcfFnhH4WsAN4zcwuApYADwLVnHPbAJxz28zs1MCVKRLGfvkFBg6EV1+FKlUY160/I6tfTm6ZOL9erp64lBZ/Ar0scAlwv3PuKzP7F157xS9m1hPoCVCrVq0SFSkSlg4cg
Oeeg8ceg3374OGHGXHxzYxf4d+tJB0PJ6XNn90WtwBbnHNf+R5PxQv4X8zsNADfx+35vdg5N9Y518g516hq1aqlUbNIaDkHM2dCvXper7x5c1i5kpTbevkd5lrRKYFQZKA7534GNptZ3k9fK2A18B7Q3fdcd2BGQCoUCSdr1sD110ObNlCmjLcPy8yZcO65DH9/lV9fQis6JVD8neVyPzDRN8NlPXAX3l8G75jZ3cAmoGNgShQJA7t2eTshvvACWQkV+feN9/L8+deSPT8H5n+A4c3nLYpWdEog+RXozrmlQKN8PtWqdMsRCTM5OTBuHPsHPErZ3buZXP9a/tG8GzsrnHzEZf6EueaUS6BppajEnJTUNIa9t4r0zKyDz+U702T+fG9f8uXLWVbrQoa1H87qamcV+/uVMfhnp4sV5BJwCnSJKSmpafSdsoyso05Q3pWRRd+pywBon5Tl3eycOhVq12bgbUOYePqlJVqubwpzCSKdKSoxZfSctceEeZ6ymZmkP9wfzjsP/vtfGDEC1qzh7TMalyjME+LjeFZhLkGkEbrElHwPVnaOdqvn03/+65z2x07o2hVGjYLTTwegRmJCkQcy56lYLo6MAznaAVFCQoEuUS8lNY3Rc9ayNZ9Qrr/te4Z+PJaGW79jWfU6DLt9KC//5/4jrunbui4Dpq8gMyvn4HNHz2qJM6NLkzO0dF9CSoEuUefwAD85IZ69B7LJyjmyzVL1j108smACHVd+zI6KifS5oRfvXdSKpzs2OObr5Y2y876mRt8SrhToEhXyQjwtPfOI0fPhM1kAymVn0WPxDO77cjLlsrN4qfEtvHBFZ/44oQJjOhbc727foKYCXMKeAl0iVkEhnu8tT+e4+oevGTT3FZLTt/HROU144soebDjFC+nbL6ulwJaIp0CXiJSSmnZEX7uwhT11dmxk8NxXaL4hlXWVz6Bbp8f47MxLAPW+Jboo0CUijZ6z9oiblPk5OfN3ei18m27ffsDecgkMa9WTtxrcQHz5Exij5fcShRToElEOb7MUJC43hy5LZ9P784mcvO8PJjW4nvFXd2eDVdANTYlqCnQJewX1yvNzxYalDPlkHOf9upEvatXnpfb3ccudNzBPAS4xQIEuYW1QygomLtpU+A1P4Iz0nxk4bzzXff8lW5Oq89XosVzx8F+5ogQrPEUilQJdwkp+G2cVpuL+DO5dNIV7vkmhTLl4ePJJajz0EDXKlw9wpSLhR4EuYSMlNY2HJi/1aytac7ncvGoe/RZMoNofv8Edd8DIkVCjRsDrFAlXCnQJG/2mLfcrzC/eupZhH7/Mxdu+Z1mNunz3wmu06N424PWJhDsFuoSFlNQ09mfnFnrNqb/vpN+C1/nLqnn8UukURnTsx4X9/pf2Dc8IUpUi4U2BLiFzxKZZhdy7PCH7AHd/k8L/fvkOZXOzWfCXu2nx2rMMPvHE4BUrEgEU6BISR6/0zLfX4hyt133JwLnjqbX7Fxb8qSn7Rz3NtW2uCGqtIpFCgS4hUdRKz7o7NjDkk7E03biczTXOgmkf06KVjrAVKYwCXYLi8PZKYQdGJGbuofdnE+m6dBZ7TqjIa7c9zF0TRkFZ/aiKFEV/SiTgjl4clF+Yl83JpuvSWTz0+UQq7c/grQbX8+rVd7LgqQ7BLVYkginQJaAGpazgrUWbCr2m2U+pDPlkHOfu3MTntS/isVb3sLnG2Yy8RTsgihSHAl0CJiU1jYmFhHntXVsZOO9Vrl23iI2J1bnnlkF8dE4TaiZVYKQ20BIpNgW6BMzoOWvznbxSaX8G9305mR7fzOBA2XieatGd8Y3ac6BsPDUTE1jY/6qg1yoSDRToUuoK2uLWXC4dVnzCI59OoOredKbUu5qnW9zBjkqnABAfZ/RtXTcUJYtEBQW6lJrCNta6ZMsahswdy8Xb1vFtjbo89+AzvF/ujIPXJlWIZ2ibC9RmETkOCnQpFccsFPKpvudX+i14nZtXz+fnSqfQ+8beV
OxxByNuuYgRIapVJFop0KVUDH9/1RFhfkLWfu755l3uXTSFuNxcnr/8Vma07sZ9bS/WKFwkQBToctwGpaxgV4avzeIc169dyMB5r3L6nu18ULcpI1vehUs+Uzc7RQJMgS7H5fCpiedvX8/Qj8dy2eaVrKmaTOcuT7KoVn0S4uMYqZudIgGnQJdiO3qXxKSM3Tz82Zt0XvYhu8tXYtC19zLpotbklInTzU6RIFKgS7EcfvOzbE423VI/oNfnb1PxQCZvXHIjzzbryp7ylQBITIgndci1Ia5YJHYo0KVY8nZJbL5+CUM+Gcc5v23h0+QGDG/Vkx+rHDpowoBhbS8IXaEiMcivQDezDcDvQA6Q7ZxrZGanAJOBZGAD0Mk5tyswZUowHL0jYt98lt+fsP4Hxs99hVY/fsNPSafR4y9DmHv2pWBHnlDR9bJaarOIBFlxRuhXOud+Pexxf+AT59woM+vve9yvVKuToDl6HnlaeiYDpq8A8IJ59254/HHmvDqGfXHxPNnyLl5r1JasuPhjvlZiQjyPt9fGWiLBdjwtl3ZAS9/vJwDzUaBHlMJWdgJkZuXwj1mraf/tbHj0Udixg7R2t3J7clu2nHBSvq9JiI9Tq0UkRMr4eZ0DPjSzJWbW0/dcNefcNgDfx1Pze6GZ9TSzxWa2eMeOHcdfsZSKlNQ0+k5ZVmCYAzTasop/P/d3+OtfoU4d+OYbkt+dRJ87mlMzMQHDG40nVYjHgJqJCYy85UK1WkRCxN8RelPn3FYzOxX4yMy+8/cbOOfGAmMBGjVqlN/mexICo+esJSs3//8dNfZsp//812m75lN+ObkqTJoEt956sE/evkFNhbZIGPIr0J1zW30ft5vZu0Bj4BczO805t83MTgO2B7BOKUUpqWn5nhpUPmsff/tqOv/z1TQMxwt/7krtUUNpc0WdEFQpIsVVZKCbWUWgjHPud9/vrwUeA94DugOjfB9nBLJQKR15Nz+P4Bw3ffcZA+a9Rs3fdzDzvD/zatu/c0fn5rTRSFwkYvgzQq8GvGveP7fLAm8752ab2TfAO2Z2N7AJ6Bi4MuV4FbRH+QW//MiQj8fSZMsqVp16Fn3a9eHW3l2ZriAXiThFBrpzbj1wUT7P7wRaBaIoKV35bW1beW86fT59g1uXf8RvFU5iQOv7+PCyGxjcrr764yIRSitFY0De6k6A+Jwsui95nwcW/h8J2fsZf2k7nr+iMydWr8oS7YYoEtEU6DFgq6/N0vLHbxg89xXO/i2NeWc1ZMRV97C+8ukkxMfp6DeRKKBAjwGXHdhBzxkvcOX6Jfx4Sk3u7DCU+WdfCnhzx/Nb4i8ikUeBHs3S02H4cCa+8AJ748rx+JU9mNCwDVlx8d4e5VoEJBJVFOjRKCcHxo+HgQNh5042te/CXWe346e4ioAOZBaJVv4u/ZdI8emn0LAh/O1vcP75zJs4i+sv6HYwzAH2ZeWGsEARCRQFerTYuBE6dYIWLeC332DyZFiwgEEb44+Yrgjepluj56wNUaEiEihquUS6vXvhqadg9Giygdev7s4/6rfhlPVJ9F269eAMl6MV9LyIRC4FeqRyzts065FHIC2Nzde14446f+GnCqcAh/YzT6wQz66MY3dUrJGYEOyKRSTA1HKJRIsXQ7Nm0LUrVKsGn31G5xYPHAzzPJlZOTjn7VF+OM07F4lOCvRI8vPP0KMHNG4MP/zgzWT5+mto1izf3RMB0jOzGHnLhQf3L9ee5SLRSy2XSLB/Pzz3HIwYAfv2QZ8+MGgQnHTo1KA4M3Lcsfubx5lp/3KRGKFALwF/DlMuFc7BzJnQu7c3Ir/pJvjnP73Tg46SX5gX9ryIRB+1XIopb+fCtPRMHIduPqakppXuN1q9Gq67Dtq2hbJlYdYseP/9fMMcvFZKcZ4XkeijQC+mw3cuzFOq87p37YIHH4T69eGrr2DMGFi+3Av3QvRtXVc3P0VinFouxRSwed3Z2TBuHAwe7IV6z57w2GNQt
apfL89r+QSlFSQiYUmBXkw1EhPynVFyXPO6582DXr28kXjLlt6o/KJjzhQpkm5+isQ2tVyKqVRbGz/9BB06wFVXwe7dMHUqzJ1bojAXEdEIvZhKpbXxxx8wahQ88wzExcHjj3szWRJ0A1NESk6BXgIlbm3k5sLbb0O/frB1K9x+uxfsNdUmEZHjp5ZLsHz9NTRtCt26eQH+xRfw5psKcxEpNQr0QNu2De68E5o0gQ0b4PXXYdEiuPzyEBcmItFGLZdA2bfPm63yxBNw4IDXZhk4EE48MdSViUiUUqCXNufgvfe8m5zr10O7dt7Nz3POCXVlIhLl1HIpTStXwrXXQvv2UL48fPghpKQozEUkKBTopeG33+D+++Hii2HJEnj+eVi2DK65JtSViUgMUcvleGRnw8svw5AhkJ4Of/87DB8OlSuHujIRiUEK9JL65BNvE61Vq7yVnmPGwIUXhroqEYlharkU148/ws03w9VXQ0YGTJ8OH3+sMBeRkFOg++v332HAAPjTn+Cjj2DkSG/P8ptvBrNQVyciopZLkXJzvRWd/ft7Z3recYcX5jVqhLoyEZEjKNALs2iR1yf/+mtvpWdKivdRRCQMqeWSn7Q0byR++eWweTO88Ya394rCXETCmN+BbmZxZpZqZjN9j880s6/MbJ2ZTTazcoErM0j27YMnn4S6deGdd+DRR+H7770Ntcro7z4RCW/FSakHgTWHPX4KeNY5VwfYBdxdmoUFlXMwbRqcf76330rr1t4NzyeegEqVQl2diIhf/Ap0MzsduBF4xffYgKuAqb5LJgDtA1FgwC1fDq1aeScHVarkzS+fNg3OOivUlYmIFIu/I/QxwCNAru9xZSDdOZfte7wFiKyNvX/9Fe69Fxo08Jbp//vfkJrqLRISEYlARQa6md0EbHfOLTn86XwudQW8vqeZLTazxTt27ChhmaUoKwueew7q1IGxY+G++2DdOm/ZfllN+hGRyOVPgjUF2prZDUB54CS8EXuimZX1jdJPB7bm92Ln3FhgLECjRo3yDf2g+fBD6NUL1qzxNs4aM8ZbKCQiEgWKHKE75wY45053ziUDnYG5zrmuwDygg++y7sCMgFV5vNatg7ZtvZudBw7AjBkwZ47CXESiyvHMxesH9DazH/B66uNLp6RStGcPPPIIXHABzJ8PTz/tbabVtq2W64tI1ClW09g5Nx+Y7/v9eqBx6ZdUCnJzvbM7BwyA7dvhrru8+eXVq4e6MhGRgIm+u4ALF3rL9Zcs8VZ6zpwJl14a6qpERAIuepY/bt4Mt90GzZp5m2hNnOiFu8JcRGJE5I/QMzK8Q5hHjfJWfA4eDP36QcWKoa5MRCSoIjfQnYMpU6BvX9i0CTp29G56JieHujIRkZCI3EAHePFFSErydkNs0SLU1YiIhFTkBrqZN0KvXBni4kJdjYhIyEVuoAOcemqoKxARCRvRM8tFRCTGKdBFRKKEAl1EJEoo0EVEokTE3hRNSU1j9Jy1bE3PpEZiAn1b16V9g8g6Y0NEpDQ8str1AAAD2klEQVRFZKCnpKYxYPoKMrNyAEhLz2TA9BUACnURiVkR2XIZPWftwTDPk5mVw+g5a0NUkYhI6EVkoG9NzyzW8yIisSAiA71GYkKxnhcRiQURGeh9W9clIf7I5f4J8XH0bV03RBWJiIReRN4UzbvxqVkuIiKHRGSggxfqCnARkUMisuUiIiLHUqCLiEQJBbqISJRQoIuIRAkFuohIlDDnXPC+mdkOYGMpfbkqwK+l9LUimd4Hj96HQ/ReeKLpfajtnKta1EVBDfTSZGaLnXONQl1HqOl98Oh9OETvhScW3we1XEREooQCXUQkSkRyoI8NdQFhQu+DR+/DIXovPDH3PkRsD11ERI4UySN0ERE5TFgHupm9ambbzWxlEdddamY5ZtYhWLUFU1Hvg5m1NLPdZrbU92tIsGsMFn9+Jnzvx1IzW2VmC4JZX7D48TPR97Cfh5W+Px+nBLvOQPPjf
TjZzN43s2W+n4e7gl1jMIV1y8XMmgN/AG845+oVcE0c8BGwD3jVOTc1iCUGRVHvg5m1BPo4524Kdm3B5sd7kQh8AVznnNtkZqc657YHu85A8+fPxmHXtgEecs5dFZTigsiPn4dHgZOdc/3MrCqwFqjunDsQ5FKDIqxH6M65T4HfirjsfmAaEHV/aPP4+T7EBD/ei9uA6c65Tb7ro/Lnopg/E12ASQEsJ2T8eB8ccKKZGVDJd212MGoLhbAO9KKYWU3gZuClUNcSBi73/bNylpldEOpiQuhcIMnM5pvZEjO7I9QFhZKZVQCuwxv0xKIXgPOBrcAK4EHnXG5oSwqciD3gwmcM0M85l+P9BRyzvsVbGvyHmd0ApAB1QlxTqJQFGgKtgATgSzNb5Jz7PrRlhUwbYKFzLlb/hdcaWApcBZwNfGRmnznn9oS2rMCI6BE60Aj4PzPbAHQA/m1m7UNbUvA55/Y45/7w/f6/QLyZVQlxWaGyBZjtnNvrnPsV+BS4KMQ1hVJnorTd4qe78Fpwzjn3A/ATcF6IawqYiA5059yZzrlk51wyMBW41zmXEuKygs7Mqvt6hJhZY7z/rztDW1XIzAD+bGZlfe2GJsCaENcUEmZ2MtAC7z2JVZvw/rWGmVUD6gLrQ1pRAIV1y8XMJgEtgSpmtgUYCsQDOOdipm/ux/vQAfi7mWUDmUBnF87Tl45DUe+Fc26Nmc0GlgO5wCvOuUKnvUYiP/9s3Ax86JzbG5Iig8CP92EE8LqZrQAMr0UbLTswHiOspy2KiIj/IrrlIiIihyjQRUSihAJdRCRKKNBFRKKEAl1EJEoo0EVEooQCXUQkSijQRUSixP8DmQkNGJYhykMAAAAASUVORK5CYII=\n", 400 | "text/plain": [ 401 | "" 402 | ] 403 | }, 404 | "metadata": {}, 405 | "output_type": "display_data" 406 | } 407 | ], 408 | "source": [ 409 | "# Here we create a little function which takes heights as inputs in order to predict corresponding weights values:\n", 410 | "def get_weights(heights):\n", 411 | " return slope * heights + intercept\n", 412 | "\n", 413 | "plt.scatter(heights, weights)\n", 414 | "plt.plot(heights, get_weights(heights), c='r')\n", 415 | "plt.show()" 416 | ] 417 | } 418 | ], 419 | "metadata": { 420 | "kernelspec": { 421 | "display_name": "Python 3", 422 | "language": "python", 423 | "name": "python3" 424 | }, 425 | "language_info": { 426 | "codemirror_mode": { 427 | "name": "ipython", 428 | "version": 3 429 | }, 430 | "file_extension": ".py", 431 | "mimetype": "text/x-python", 432 | "name": "python", 433 | "nbconvert_exporter": "python", 434 | "pygments_lexer": "ipython3", 435 | "version": "3.6.3" 436 | } 437 | }, 438 | "nbformat": 4, 439 | "nbformat_minor": 2 440 | } 441 | -------------------------------------------------------------------------------- /How it works - 
Pandas, data manipulation.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "import pandas as pd" 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": 2, 15 | "metadata": {}, 16 | "outputs": [ 17 | { 18 | "data": { 19 | "text/html": [ 20 | "

" 77 | ], 78 | "text/plain": [ 79 | " Make Color Capacity\n", 80 | "Jane Ford Blue 1.6l\n", 81 | "John BMW Grey 2.0l\n", 82 | "June Mini Red 1.6l\n", 83 | "Jim Mercedes White 2.2l\n", 84 | "Jay Toyota White 1.2l" 85 | ] 86 | }, 87 | "execution_count": 2, 88 | "metadata": {}, 89 | "output_type": "execute_result" 90 | } 91 | ], 92 | "source": [ 93 | "list1 = [\"Jane\", \"John\", \"June\", \"Jim\", \"Jay\"]\n", 94 | "list2 = [\"Ford\", \"BMW\", \"Mini\", \"Mercedes\", \"Toyota\"]\n", 95 | "list3 = [\"Blue\", \"Grey\", \"Red\", \"White\", \"White\"]\n", 96 | "list4 = [\"1.6l\", \"2.0l\", \"1.6l\", \"2.2l\", \"1.2l\"]\n", 97 | "df1 = pd.DataFrame({\"Make\":list2, \"Color\":list3, \"Capacity\":list4}, index = list1)\n", 98 | "df1" 99 | ] 100 | }, 101 | { 102 | "cell_type": "markdown", 103 | "metadata": {}, 104 | "source": [ 105 | "## Updating values via indexation" 106 | ] 107 | }, 108 | { 109 | "cell_type": "code", 110 | "execution_count": 3, 111 | "metadata": {}, 112 | "outputs": [ 113 | { 114 | "data": { 115 | "text/html": [ 116 | "
" 173 | ], 174 | "text/plain": [ 175 | " Capacity Color Make\n", 176 | "Jane 1.6l Orange Ford\n", 177 | "John 2.0l Grey BMW\n", 178 | "June 1.6l Red Mini\n", 179 | "Jim 2.2l White Mercedes\n", 180 | "Jay 1.2l White Toyota" 181 | ] 182 | }, 183 | "execution_count": 3, 184 | "metadata": {}, 185 | "output_type": "execute_result" 186 | } 187 | ], 188 | "source": [ 189 | "df1.loc[\"Jane\", \"Color\"] = \"Orange\" # Jane gives us the rows to update, \n", 190 | " # Color tells us which column to update\n", 191 | "df1" 192 | ] 193 | }, 194 | { 195 | "cell_type": "code", 196 | "execution_count": 4, 197 | "metadata": {}, 198 | "outputs": [ 199 | { 200 | "data": { 201 | "text/html": [ 202 | "
" 259 | ], 260 | "text/plain": [ 261 | " Capacity Color Make\n", 262 | "Jane 1.6l Orange Ford\n", 263 | "John 2.0l Grey BMW\n", 264 | "June 1.6l Red Mini\n", 265 | "Jim 2.2l Off-White Mercedes\n", 266 | "Jay 1.2l Off-White Toyota" 267 | ] 268 | }, 269 | "execution_count": 4, 270 | "metadata": {}, 271 | "output_type": "execute_result" 272 | } 273 | ], 274 | "source": [ 275 | "df1.loc[df1[\"Color\"].isin([\"White\"]), \"Color\"] = \"Off-White\" # The first part gives us the rows to update,\n", 276 | " # The second part tells us which column to update\n", 277 | "df1" 278 | ] 279 | }, 280 | { 281 | "cell_type": "markdown", 282 | "metadata": {}, 283 | "source": [ 284 | "## Replacing values" 285 | ] 286 | }, 287 | { 288 | "cell_type": "markdown", 289 | "metadata": {}, 290 | "source": [ 291 | "### An example across the entire dataframe, replacing entire cell values" 292 | ] 293 | }, 294 | { 295 | "cell_type": "code", 296 | "execution_count": 5, 297 | "metadata": {}, 298 | "outputs": [ 299 | { 300 | "data": { 301 | "text/html": [ 302 | "
" 359 | ], 360 | "text/plain": [ 361 | " Capacity Color Make\n", 362 | "Jane 1.6l Orange Ford\n", 363 | "John 2.0l Silver BMW\n", 364 | "June 1.6l Red Mini\n", 365 | "Jim 2.2l White Mercedes\n", 366 | "Jay 1.2l White Toyota" 367 | ] 368 | }, 369 | "execution_count": 5, 370 | "metadata": {}, 371 | "output_type": "execute_result" 372 | } 373 | ], 374 | "source": [ 375 | "df1.replace([\"Off-White\", \"Grey\"], [\"White\", \"Silver\"], inplace = True)\n", 376 | "df1" 377 | ] 378 | }, 379 | { 380 | "cell_type": "markdown", 381 | "metadata": {}, 382 | "source": [ 383 | "### An example for a single series, replacing partial cell values" 384 | ] 385 | }, 386 | { 387 | "cell_type": "code", 388 | "execution_count": 6, 389 | "metadata": {}, 390 | "outputs": [ 391 | { 392 | "data": { 393 | "text/html": [ 394 | "
" 451 | ], 452 | "text/plain": [ 453 | " Capacity Color Make\n", 454 | "Jane 1.6 Orange Ford\n", 455 | "John 2.0 Silver BMW\n", 456 | "June 1.6 Red Mini\n", 457 | "Jim 2.2 White Mercedes\n", 458 | "Jay 1.2 White Toyota" 459 | ] 460 | }, 461 | "execution_count": 6, 462 | "metadata": {}, 463 | "output_type": "execute_result" 464 | } 465 | ], 466 | "source": [ 467 | "df1[\"Capacity\"].replace({'l':''}, inplace = True, regex=True)\n", 468 | "df1" 469 | ] 470 | }, 471 | { 472 | "cell_type": "markdown", 473 | "metadata": {}, 474 | "source": [ 475 | "## Changing the type of a series" 476 | ] 477 | }, 478 | { 479 | "cell_type": "markdown", 480 | "metadata": {}, 481 | "source": [ 482 | "### Using astype()" 483 | ] 484 | }, 485 | { 486 | "cell_type": "code", 487 | "execution_count": 7, 488 | "metadata": {}, 489 | "outputs": [ 490 | { 491 | "name": "stdout", 492 | "output_type": "stream", 493 | "text": [ 494 | "\n", 495 | "Index: 5 entries, Jane to Jay\n", 496 | "Data columns (total 3 columns):\n", 497 | "Capacity 5 non-null object\n", 498 | "Color 5 non-null object\n", 499 | "Make 5 non-null object\n", 500 | "dtypes: object(3)\n", 501 | "memory usage: 320.0+ bytes\n" 502 | ] 503 | } 504 | ], 505 | "source": [ 506 | "df1.info() # All 3 series currently classified as \"object\" aka \"string\"" 507 | ] 508 | }, 509 | { 510 | "cell_type": "code", 511 | "execution_count": 8, 512 | "metadata": {}, 513 | "outputs": [ 514 | { 515 | "name": "stdout", 516 | "output_type": "stream", 517 | "text": [ 518 | "\n", 519 | "Index: 5 entries, Jane to Jay\n", 520 | "Data columns (total 3 columns):\n", 521 | "Capacity 5 non-null float64\n", 522 | "Color 5 non-null category\n", 523 | "Make 5 non-null object\n", 524 | "dtypes: category(1), float64(1), object(1)\n", 525 | "memory usage: 477.0+ bytes\n" 526 | ] 527 | } 528 | ], 529 | "source": [ 530 | "df1[\"Color\"] = df1[\"Color\"].astype(\"category\") # Convert color to a categorical variable\n", 531 | "df1[\"Capacity\"] = 
df1[\"Capacity\"].astype(\"float\") # Convert capacity to a float\n", 532 | "\n", 533 | "df1.info() # Here we see the results of the updates - the data didn't change\n", 534 | " # but the format did, so that now we can e.g. perform calcs on Capacity\n", 535 | " # which we could not have done while it was classified as an Object" 536 | ] 537 | }, 538 | { 539 | "cell_type": "markdown", 540 | "metadata": {}, 541 | "source": [ 542 | "### Using pd.to_numeric()" 543 | ] 544 | }, 545 | { 546 | "cell_type": "code", 547 | "execution_count": 9, 548 | "metadata": {}, 549 | "outputs": [ 550 | { 551 | "data": { 552 | "text/html": [ 553 | "
" 610 | ], 611 | "text/plain": [ 612 | " Capacity Color Make\n", 613 | "Jane 1.6l Orange Ford\n", 614 | "John 2 Silver BMW\n", 615 | "June 1.6 Red Mini\n", 616 | "Jim 2.2 White Mercedes\n", 617 | "Jay 1.2 White Toyota" 618 | ] 619 | }, 620 | "execution_count": 9, 621 | "metadata": {}, 622 | "output_type": "execute_result" 623 | } 624 | ], 625 | "source": [ 626 | "df1[\"Capacity\"] = df1[\"Capacity\"].astype(\"object\") # Now let's put Capacity back\n", 627 | "df1.loc[\"Jane\", \"Capacity\"] = \"1.6l\" # And then introduce a non-numeric value in one field\n", 628 | "df1" 629 | ] 630 | }, 631 | { 632 | "cell_type": "code", 633 | "execution_count": null, 634 | "metadata": {}, 635 | "outputs": [], 636 | "source": [ 637 | "df1[\"Capacity\"] = df1[\"Capacity\"].astype(\"float\") # THIS script will now end in an error\n", 638 | " # because the l in 1.6l can't be converted " 639 | ] 640 | }, 641 | { 642 | "cell_type": "code", 643 | "execution_count": 10, 644 | "metadata": {}, 645 | "outputs": [ 646 | { 647 | "data": { 648 | "text/html": [ 649 | "
" 706 | ], 707 | "text/plain": [ 708 | " Capacity Color Make\n", 709 | "Jane NaN Orange Ford\n", 710 | "John 2.0 Silver BMW\n", 711 | "June 1.6 Red Mini\n", 712 | "Jim 2.2 White Mercedes\n", 713 | "Jay 1.2 White Toyota" 714 | ] 715 | }, 716 | "execution_count": 10, 717 | "metadata": {}, 718 | "output_type": "execute_result" 719 | } 720 | ], 721 | "source": [ 722 | "df1[\"Capacity\"] = pd.to_numeric(df1[\"Capacity\"], errors = \"coerce\")\n", 723 | "df1\n", 724 | "# This is an alternative method to convert data types - the argument errors = \"coerce\" \n", 725 | "# is pretty handy if your data has some exceptions in it, fills NaN where no conversion is possible\n", 726 | "# pd.to_datetime() also has errors = \"coerce\" which can be useful" 727 | ] 728 | }, 729 | { 730 | "cell_type": "markdown", 731 | "metadata": {}, 732 | "source": [ 733 | "## Dealing with null values" 734 | ] 735 | }, 736 | { 737 | "cell_type": "markdown", 738 | "metadata": {}, 739 | "source": [ 740 | "### Filling with another default value" 741 | ] 742 | }, 743 | { 744 | "cell_type": "code", 745 | "execution_count": 11, 746 | "metadata": {}, 747 | "outputs": [ 748 | { 749 | "data": { 750 | "text/html": [ 751 | "
" 808 | ], 809 | "text/plain": [ 810 | " Capacity Color Make\n", 811 | "Jane 0.0 Orange Ford\n", 812 | "John 2.0 Silver BMW\n", 813 | "June 1.6 Red Mini\n", 814 | "Jim 2.2 White Mercedes\n", 815 | "Jay 1.2 White Toyota" 816 | ] 817 | }, 818 | "execution_count": 11, 819 | "metadata": {}, 820 | "output_type": "execute_result" 821 | } 822 | ], 823 | "source": [ 824 | "df1[\"Capacity\"].fillna(0, inplace = True) # Fills the NaN values in the series with the specified value\n", 825 | " # or indeed the entire dataframe (which would seldom make sense!)\n", 826 | "df1" 827 | ] 828 | }, 829 | { 830 | "cell_type": "markdown", 831 | "metadata": {}, 832 | "source": [ 833 | "### Removing rows with NaN" 834 | ] 835 | }, 836 | { 837 | "cell_type": "code", 838 | "execution_count": 12, 839 | "metadata": {}, 840 | "outputs": [], 841 | "source": [ 842 | "# Here are some additional ways to deal with them:\n", 843 | "# df.dropna() Removes ALL rows in the entire dataframe with \n", 844 | "# one or more null values\n", 845 | "# df.dropna(how = 'all' Removes only rows where all columns contain null values\n", 846 | "# df.dropna(subset = [“Column name”]) Removes rows only where there is a null value in the \n", 847 | "# specified column name" 848 | ] 849 | }, 850 | { 851 | "cell_type": "markdown", 852 | "metadata": {}, 853 | "source": [ 854 | "## Altering the shape of the dataframe" 855 | ] 856 | }, 857 | { 858 | "cell_type": "markdown", 859 | "metadata": {}, 860 | "source": [ 861 | "### Adding columns quickly" 862 | ] 863 | }, 864 | { 865 | "cell_type": "code", 866 | "execution_count": 13, 867 | "metadata": {}, 868 | "outputs": [ 869 | { 870 | "data": { 871 | "text/html": [ 872 | "
\n", 873 | "\n", 886 | "\n", 887 | " \n", 888 | " \n", 889 | " \n", 890 | " \n", 891 | " \n", 892 | " \n", 893 | " \n", 894 | " \n", 895 | " \n", 896 | " \n", 897 | " \n", 898 | " \n", 899 | " \n", 900 | " \n", 901 | " \n", 902 | " \n", 903 | " \n", 904 | " \n", 905 | " \n", 906 | " \n", 907 | " \n", 908 | " \n", 909 | " \n", 910 | " \n", 911 | " \n", 912 | " \n", 913 | " \n", 914 | " \n", 915 | " \n", 916 | " \n", 917 | " \n", 918 | " \n", 919 | " \n", 920 | " \n", 921 | " \n", 922 | " \n", 923 | " \n", 924 | " \n", 925 | " \n", 926 | " \n", 927 | " \n", 928 | " \n", 929 | " \n", 930 | " \n", 931 | " \n", 932 | " \n", 933 | "
\n", 934 | "
" 935 | ], 936 | "text/plain": [ 937 | " Capacity Color Make Model\n", 938 | "Jane 0.0 Orange Ford \n", 939 | "John 2.0 Silver BMW \n", 940 | "June 1.6 Red Mini \n", 941 | "Jim 2.2 White Mercedes \n", 942 | "Jay 1.2 White Toyota " 943 | ] 944 | }, 945 | "execution_count": 13, 946 | "metadata": {}, 947 | "output_type": "execute_result" 948 | } 949 | ], 950 | "source": [ 951 | "df1[\"Model\"] = df[\"\"]\n", 952 | "df1" 953 | ] 954 | }, 955 | { 956 | "cell_type": "markdown", 957 | "metadata": {}, 958 | "source": [ 959 | "### Adding columns at a specified location" 960 | ] 961 | }, 962 | { 963 | "cell_type": "code", 964 | "execution_count": 14, 965 | "metadata": {}, 966 | "outputs": [ 967 | { 968 | "data": { 969 | "text/html": [ 970 | "
\n", 971 | "\n", 984 | "\n", 985 | " \n", 986 | " \n", 987 | " \n", 988 | " \n", 989 | " \n", 990 | " \n", 991 | " \n", 992 | " \n", 993 | " \n", 994 | " \n", 995 | " \n", 996 | " \n", 997 | " \n", 998 | " \n", 999 | " \n", 1000 | " \n", 1001 | " \n", 1002 | " \n", 1003 | " \n", 1004 | " \n", 1005 | " \n", 1006 | " \n", 1007 | " \n", 1008 | " \n", 1009 | " \n", 1010 | " \n", 1011 | " \n", 1012 | " \n", 1013 | " \n", 1014 | " \n", 1015 | " \n", 1016 | " \n", 1017 | " \n", 1018 | " \n", 1019 | " \n", 1020 | " \n", 1021 | " \n", 1022 | " \n", 1023 | " \n", 1024 | " \n", 1025 | " \n", 1026 | " \n", 1027 | " \n", 1028 | " \n", 1029 | " \n", 1030 | " \n", 1031 | " \n", 1032 | " \n", 1033 | " \n", 1034 | " \n", 1035 | " \n", 1036 | " \n", 1037 | "
\n", 1038 | "
" 1039 | ], 1040 | "text/plain": [ 1041 | " Capacity Service Interval Color Make Model\n", 1042 | "Jane 0.0 20000km Orange Ford \n", 1043 | "John 2.0 20000km Silver BMW \n", 1044 | "June 1.6 20000km Red Mini \n", 1045 | "Jim 2.2 20000km White Mercedes \n", 1046 | "Jay 1.2 20000km White Toyota " 1047 | ] 1048 | }, 1049 | "execution_count": 14, 1050 | "metadata": {}, 1051 | "output_type": "execute_result" 1052 | } 1053 | ], 1054 | "source": [ 1055 | "df1.insert(1, \"Service Interval\", \"20000km\")\n", 1056 | "df1" 1057 | ] 1058 | }, 1059 | { 1060 | "cell_type": "markdown", 1061 | "metadata": {}, 1062 | "source": [ 1063 | "### Adding rows" 1064 | ] 1065 | }, 1066 | { 1067 | "cell_type": "code", 1068 | "execution_count": 15, 1069 | "metadata": {}, 1070 | "outputs": [ 1071 | { 1072 | "data": { 1073 | "text/html": [ 1074 | "
\n", 1075 | "\n", 1088 | "\n", 1089 | " \n", 1090 | " \n", 1091 | " \n", 1092 | " \n", 1093 | " \n", 1094 | " \n", 1095 | " \n", 1096 | " \n", 1097 | " \n", 1098 | " \n", 1099 | " \n", 1100 | " \n", 1101 | " \n", 1102 | " \n", 1103 | " \n", 1104 | " \n", 1105 | " \n", 1106 | " \n", 1107 | " \n", 1108 | " \n", 1109 | " \n", 1110 | " \n", 1111 | " \n", 1112 | " \n", 1113 | " \n", 1114 | " \n", 1115 | " \n", 1116 | " \n", 1117 | " \n", 1118 | " \n", 1119 | " \n", 1120 | " \n", 1121 | " \n", 1122 | " \n", 1123 | " \n", 1124 | " \n", 1125 | " \n", 1126 | " \n", 1127 | " \n", 1128 | " \n", 1129 | " \n", 1130 | " \n", 1131 | " \n", 1132 | " \n", 1133 | " \n", 1134 | " \n", 1135 | " \n", 1136 | " \n", 1137 | " \n", 1138 | " \n", 1139 | " \n", 1140 | " \n", 1141 | " \n", 1142 | " \n", 1143 | " \n", 1144 | " \n", 1145 | " \n", 1146 | " \n", 1147 | " \n", 1148 | " \n", 1149 | "
\n", 1150 | "
" 1151 | ], 1152 | "text/plain": [ 1153 | " Capacity Service Interval Color Make Model\n", 1154 | "Jane 0.0 20000km Orange Ford \n", 1155 | "John 2.0 20000km Silver BMW \n", 1156 | "June 1.6 20000km Red Mini \n", 1157 | "Jim 2.2 20000km White Mercedes \n", 1158 | "Jay 1.2 20000km White Toyota \n", 1159 | "James 1.6 NaN Blue Honda Jazz" 1160 | ] 1161 | }, 1162 | "execution_count": 15, 1163 | "metadata": {}, 1164 | "output_type": "execute_result" 1165 | } 1166 | ], 1167 | "source": [ 1168 | "extra_row = pd.Series({\"Capacity\": 1.6, \"Color\": \"Blue\", \"Make\": \"Honda\", \"Model\": \"Jazz\"})\n", 1169 | "extra_row.name = \"James\"\n", 1170 | "df1 = df1.append(extra_row)\n", 1171 | "df1" 1172 | ] 1173 | }, 1174 | { 1175 | "cell_type": "markdown", 1176 | "metadata": {}, 1177 | "source": [ 1178 | "### Removing columns" 1179 | ] 1180 | }, 1181 | { 1182 | "cell_type": "code", 1183 | "execution_count": 16, 1184 | "metadata": {}, 1185 | "outputs": [ 1186 | { 1187 | "data": { 1188 | "text/html": [ 1189 | "
\n", 1190 | "\n", 1203 | "\n", 1204 | " \n", 1205 | " \n", 1206 | " \n", 1207 | " \n", 1208 | " \n", 1209 | " \n", 1210 | " \n", 1211 | " \n", 1212 | " \n", 1213 | " \n", 1214 | " \n", 1215 | " \n", 1216 | " \n", 1217 | " \n", 1218 | " \n", 1219 | " \n", 1220 | " \n", 1221 | " \n", 1222 | " \n", 1223 | " \n", 1224 | " \n", 1225 | " \n", 1226 | " \n", 1227 | " \n", 1228 | " \n", 1229 | " \n", 1230 | " \n", 1231 | " \n", 1232 | " \n", 1233 | " \n", 1234 | " \n", 1235 | " \n", 1236 | " \n", 1237 | " \n", 1238 | " \n", 1239 | " \n", 1240 | " \n", 1241 | " \n", 1242 | " \n", 1243 | " \n", 1244 | " \n", 1245 | " \n", 1246 | " \n", 1247 | " \n", 1248 | " \n", 1249 | " \n", 1250 | " \n", 1251 | " \n", 1252 | " \n", 1253 | " \n", 1254 | " \n", 1255 | " \n", 1256 | " \n", 1257 | "
\n", 1258 | "
" 1259 | ], 1260 | "text/plain": [ 1261 | " Capacity Color Make Model\n", 1262 | "Jane 0.0 Orange Ford \n", 1263 | "John 2.0 Silver BMW \n", 1264 | "June 1.6 Red Mini \n", 1265 | "Jim 2.2 White Mercedes \n", 1266 | "Jay 1.2 White Toyota \n", 1267 | "James 1.6 Blue Honda Jazz" 1268 | ] 1269 | }, 1270 | "execution_count": 16, 1271 | "metadata": {}, 1272 | "output_type": "execute_result" 1273 | } 1274 | ], 1275 | "source": [ 1276 | "df1.drop(\"Service Interval\", axis = 1, inplace = True) # axis = 1 says you're looking a columns\n", 1277 | "df1" 1278 | ] 1279 | }, 1280 | { 1281 | "cell_type": "markdown", 1282 | "metadata": {}, 1283 | "source": [ 1284 | "### Removing rows" 1285 | ] 1286 | }, 1287 | { 1288 | "cell_type": "code", 1289 | "execution_count": 17, 1290 | "metadata": {}, 1291 | "outputs": [ 1292 | { 1293 | "data": { 1294 | "text/html": [ 1295 | "
\n", 1296 | "\n", 1309 | "\n", 1310 | " \n", 1311 | " \n", 1312 | " \n", 1313 | " \n", 1314 | " \n", 1315 | " \n", 1316 | " \n", 1317 | " \n", 1318 | " \n", 1319 | " \n", 1320 | " \n", 1321 | " \n", 1322 | " \n", 1323 | " \n", 1324 | " \n", 1325 | " \n", 1326 | " \n", 1327 | " \n", 1328 | " \n", 1329 | " \n", 1330 | " \n", 1331 | " \n", 1332 | " \n", 1333 | " \n", 1334 | " \n", 1335 | " \n", 1336 | " \n", 1337 | " \n", 1338 | " \n", 1339 | " \n", 1340 | " \n", 1341 | " \n", 1342 | " \n", 1343 | " \n", 1344 | " \n", 1345 | " \n", 1346 | " \n", 1347 | " \n", 1348 | " \n", 1349 | " \n", 1350 | " \n", 1351 | " \n", 1352 | " \n", 1353 | " \n", 1354 | " \n", 1355 | " \n", 1356 | "
\n", 1357 | "
" 1358 | ], 1359 | "text/plain": [ 1360 | " Capacity Color Make Model\n", 1361 | "Jane 0.0 Orange Ford \n", 1362 | "June 1.6 Red Mini \n", 1363 | "Jim 2.2 White Mercedes \n", 1364 | "Jay 1.2 White Toyota \n", 1365 | "James 1.6 Blue Honda Jazz" 1366 | ] 1367 | }, 1368 | "execution_count": 17, 1369 | "metadata": {}, 1370 | "output_type": "execute_result" 1371 | } 1372 | ], 1373 | "source": [ 1374 | "df1.drop(\"John\", axis = 0, inplace = True) # axis = 0 says you're looking a rows\n", 1375 | "df1" 1376 | ] 1377 | }, 1378 | { 1379 | "cell_type": "code", 1380 | "execution_count": 18, 1381 | "metadata": {}, 1382 | "outputs": [ 1383 | { 1384 | "data": { 1385 | "text/html": [ 1386 | "
\n", 1387 | "\n", 1400 | "\n", 1401 | " \n", 1402 | " \n", 1403 | " \n", 1404 | " \n", 1405 | " \n", 1406 | " \n", 1407 | " \n", 1408 | " \n", 1409 | " \n", 1410 | " \n", 1411 | " \n", 1412 | " \n", 1413 | " \n", 1414 | " \n", 1415 | " \n", 1416 | " \n", 1417 | " \n", 1418 | " \n", 1419 | " \n", 1420 | " \n", 1421 | " \n", 1422 | " \n", 1423 | " \n", 1424 | " \n", 1425 | " \n", 1426 | " \n", 1427 | " \n", 1428 | " \n", 1429 | " \n", 1430 | " \n", 1431 | " \n", 1432 | " \n", 1433 | "
\n", 1434 | "
" 1435 | ], 1436 | "text/plain": [ 1437 | " Capacity Color Make Model\n", 1438 | "Jane 0.0 Orange Ford \n", 1439 | "June 1.6 Red Mini \n", 1440 | "James 1.6 Blue Honda Jazz" 1441 | ] 1442 | }, 1443 | "execution_count": 18, 1444 | "metadata": {}, 1445 | "output_type": "execute_result" 1446 | } 1447 | ], 1448 | "source": [ 1449 | "df1 = df1[df1[\"Color\"] != \"White\"] # a simple way to remove rows based on a condition\n", 1450 | "df1" 1451 | ] 1452 | }, 1453 | { 1454 | "cell_type": "markdown", 1455 | "metadata": {}, 1456 | "source": [ 1457 | "### Transposing the data" 1458 | ] 1459 | }, 1460 | { 1461 | "cell_type": "code", 1462 | "execution_count": 19, 1463 | "metadata": {}, 1464 | "outputs": [ 1465 | { 1466 | "data": { 1467 | "text/html": [ 1468 | "
\n", 1469 | "\n", 1482 | "\n", 1483 | " \n", 1484 | " \n", 1485 | " \n", 1486 | " \n", 1487 | " \n", 1488 | " \n", 1489 | " \n", 1490 | " \n", 1491 | " \n", 1492 | " \n", 1493 | " \n", 1494 | " \n", 1495 | " \n", 1496 | " \n", 1497 | " \n", 1498 | " \n", 1499 | " \n", 1500 | " \n", 1501 | " \n", 1502 | " \n", 1503 | " \n", 1504 | " \n", 1505 | " \n", 1506 | " \n", 1507 | " \n", 1508 | " \n", 1509 | " \n", 1510 | " \n", 1511 | " \n", 1512 | " \n", 1513 | " \n", 1514 | " \n", 1515 | " \n", 1516 | " \n", 1517 | "
\n", 1518 | "
" 1519 | ], 1520 | "text/plain": [ 1521 | " Jane June James\n", 1522 | "Capacity 0 1.6 1.6\n", 1523 | "Color Orange Red Blue\n", 1524 | "Make Ford Mini Honda\n", 1525 | "Model Jazz" 1526 | ] 1527 | }, 1528 | "execution_count": 19, 1529 | "metadata": {}, 1530 | "output_type": "execute_result" 1531 | } 1532 | ], 1533 | "source": [ 1534 | "df1 = df1.transpose() # flips the data around if it's more convenient\n", 1535 | "df1" 1536 | ] 1537 | } 1538 | ], 1539 | "metadata": { 1540 | "kernelspec": { 1541 | "display_name": "Python 3", 1542 | "language": "python", 1543 | "name": "python3" 1544 | }, 1545 | "language_info": { 1546 | "codemirror_mode": { 1547 | "name": "ipython", 1548 | "version": 3 1549 | }, 1550 | "file_extension": ".py", 1551 | "mimetype": "text/x-python", 1552 | "name": "python", 1553 | "nbconvert_exporter": "python", 1554 | "pygments_lexer": "ipython3", 1555 | "version": "3.6.5" 1556 | } 1557 | }, 1558 | "nbformat": 4, 1559 | "nbformat_minor": 2 1560 | } 1561 | -------------------------------------------------------------------------------- /How it works - Pandas, data selection.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# How it works - Pandas, data selection" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": 1, 13 | "metadata": {}, 14 | "outputs": [], 15 | "source": [ 16 | "import pandas as pd" 17 | ] 18 | }, 19 | { 20 | "cell_type": "code", 21 | "execution_count": 2, 22 | "metadata": {}, 23 | "outputs": [], 24 | "source": [ 25 | "list1 = [\"Jane\", \"John\", \"June\", \"Jim\", \"Jay\"]\n", 26 | "list2 = [\"Ford\", \"BMW\", \"Mini\", \"Mercedes\", \"Toyota\"]\n", 27 | "list3 = [\"Blue\", \"Grey\", \"Red\", \"White\", \"White\"]\n", 28 | "list4 = [\"1.6l\", \"2.0l\", \"1.6l\", \"2.2l\", \"1.2l\"]\n", 29 | "df = pd.DataFrame({\"Make\":list2, \"Color\":list3, \"Capacity\":list4}, \n", 30 | " index 
= list1)" 31 | ] 32 | }, 33 | { 34 | "cell_type": "code", 35 | "execution_count": 3, 36 | "metadata": {}, 37 | "outputs": [ 38 | { 39 | "data": { 40 | "text/html": [ 41 | "
\n", 42 | "\n", 55 | "\n", 56 | " \n", 57 | " \n", 58 | " \n", 59 | " \n", 60 | " \n", 61 | " \n", 62 | " \n", 63 | " \n", 64 | " \n", 65 | " \n", 66 | " \n", 67 | " \n", 68 | " \n", 69 | " \n", 70 | " \n", 71 | " \n", 72 | " \n", 73 | " \n", 74 | " \n", 75 | " \n", 76 | " \n", 77 | " \n", 78 | " \n", 79 | " \n", 80 | " \n", 81 | " \n", 82 | " \n", 83 | " \n", 84 | " \n", 85 | " \n", 86 | " \n", 87 | " \n", 88 | " \n", 89 | " \n", 90 | " \n", 91 | " \n", 92 | " \n", 93 | " \n", 94 | " \n", 95 | " \n", 96 | "
\n", 97 | "
" 98 | ], 99 | "text/plain": [ 100 | " Make Color Capacity\n", 101 | "Jane Ford Blue 1.6l\n", 102 | "John BMW Grey 2.0l\n", 103 | "June Mini Red 1.6l\n", 104 | "Jim Mercedes White 2.2l\n", 105 | "Jay Toyota White 1.2l" 106 | ] 107 | }, 108 | "execution_count": 3, 109 | "metadata": {}, 110 | "output_type": "execute_result" 111 | } 112 | ], 113 | "source": [ 114 | "df" 115 | ] 116 | }, 117 | { 118 | "cell_type": "markdown", 119 | "metadata": {}, 120 | "source": [ 121 | "## Selection by column" 122 | ] 123 | }, 124 | { 125 | "cell_type": "code", 126 | "execution_count": 4, 127 | "metadata": {}, 128 | "outputs": [ 129 | { 130 | "data": { 131 | "text/plain": [ 132 | "Jane 1.6l\n", 133 | "John 2.0l\n", 134 | "June 1.6l\n", 135 | "Jim 2.2l\n", 136 | "Jay 1.2l\n", 137 | "Name: Capacity, dtype: object" 138 | ] 139 | }, 140 | "execution_count": 4, 141 | "metadata": {}, 142 | "output_type": "execute_result" 143 | } 144 | ], 145 | "source": [ 146 | "df[\"Capacity\"] # Selection of the SERIES by the column name (note the single bracket)" 147 | ] 148 | }, 149 | { 150 | "cell_type": "code", 151 | "execution_count": 5, 152 | "metadata": {}, 153 | "outputs": [ 154 | { 155 | "data": { 156 | "text/html": [ 157 | "
\n", 158 | "\n", 171 | "\n", 172 | " \n", 173 | " \n", 174 | " \n", 175 | " \n", 176 | " \n", 177 | " \n", 178 | " \n", 179 | " \n", 180 | " \n", 181 | " \n", 182 | " \n", 183 | " \n", 184 | " \n", 185 | " \n", 186 | " \n", 187 | " \n", 188 | " \n", 189 | " \n", 190 | " \n", 191 | " \n", 192 | " \n", 193 | " \n", 194 | " \n", 195 | " \n", 196 | " \n", 197 | " \n", 198 | " \n", 199 | " \n", 200 | "
\n", 201 | "
" 202 | ], 203 | "text/plain": [ 204 | " Capacity\n", 205 | "Jane 1.6l\n", 206 | "John 2.0l\n", 207 | "June 1.6l\n", 208 | "Jim 2.2l\n", 209 | "Jay 1.2l" 210 | ] 211 | }, 212 | "execution_count": 5, 213 | "metadata": {}, 214 | "output_type": "execute_result" 215 | } 216 | ], 217 | "source": [ 218 | "df[[\"Capacity\"]] # Selection of the DATAFRAME by column name (note the double bracket)" 219 | ] 220 | }, 221 | { 222 | "cell_type": "code", 223 | "execution_count": 6, 224 | "metadata": {}, 225 | "outputs": [ 226 | { 227 | "data": { 228 | "text/html": [ 229 | "
\n", 230 | "\n", 243 | "\n", 244 | " \n", 245 | " \n", 246 | " \n", 247 | " \n", 248 | " \n", 249 | " \n", 250 | " \n", 251 | " \n", 252 | " \n", 253 | " \n", 254 | " \n", 255 | " \n", 256 | " \n", 257 | " \n", 258 | " \n", 259 | " \n", 260 | " \n", 261 | " \n", 262 | " \n", 263 | " \n", 264 | " \n", 265 | " \n", 266 | " \n", 267 | " \n", 268 | " \n", 269 | " \n", 270 | " \n", 271 | " \n", 272 | " \n", 273 | " \n", 274 | " \n", 275 | " \n", 276 | " \n", 277 | " \n", 278 | "
\n", 279 | "
" 280 | ], 281 | "text/plain": [ 282 | " Color Capacity\n", 283 | "Jane Blue 1.6l\n", 284 | "John Grey 2.0l\n", 285 | "June Red 1.6l\n", 286 | "Jim White 2.2l\n", 287 | "Jay White 1.2l" 288 | ] 289 | }, 290 | "execution_count": 6, 291 | "metadata": {}, 292 | "output_type": "execute_result" 293 | } 294 | ], 295 | "source": [ 296 | "df[[\"Color\", \"Capacity\"]] # Selection by multiple column names (note the new listed order)" 297 | ] 298 | }, 299 | { 300 | "cell_type": "code", 301 | "execution_count": 7, 302 | "metadata": {}, 303 | "outputs": [ 304 | { 305 | "data": { 306 | "text/html": [ 307 | "
\n", 308 | "\n", 321 | "\n", 322 | " \n", 323 | " \n", 324 | " \n", 325 | " \n", 326 | " \n", 327 | " \n", 328 | " \n", 329 | " \n", 330 | " \n", 331 | " \n", 332 | " \n", 333 | " \n", 334 | " \n", 335 | " \n", 336 | " \n", 337 | " \n", 338 | " \n", 339 | " \n", 340 | " \n", 341 | " \n", 342 | " \n", 343 | " \n", 344 | " \n", 345 | " \n", 346 | " \n", 347 | " \n", 348 | " \n", 349 | " \n", 350 | " \n", 351 | " \n", 352 | " \n", 353 | " \n", 354 | " \n", 355 | " \n", 356 | "
\n", 357 | "
" 358 | ], 359 | "text/plain": [ 360 | " Color Capacity\n", 361 | "Jane Blue 1.6l\n", 362 | "John Grey 2.0l\n", 363 | "June Red 1.6l\n", 364 | "Jim White 2.2l\n", 365 | "Jay White 1.2l" 366 | ] 367 | }, 368 | "execution_count": 7, 369 | "metadata": {}, 370 | "output_type": "execute_result" 371 | } 372 | ], 373 | "source": [ 374 | "df.loc[:,[\"Color\", \"Capacity\"]] # Selection by column names using the .loc method (columns is the 2nd argument)" 375 | ] 376 | }, 377 | { 378 | "cell_type": "code", 379 | "execution_count": 8, 380 | "metadata": {}, 381 | "outputs": [ 382 | { 383 | "data": { 384 | "text/html": [ 385 | "
\n", 386 | "\n", 399 | "\n", 400 | " \n", 401 | " \n", 402 | " \n", 403 | " \n", 404 | " \n", 405 | " \n", 406 | " \n", 407 | " \n", 408 | " \n", 409 | " \n", 410 | " \n", 411 | " \n", 412 | " \n", 413 | " \n", 414 | " \n", 415 | " \n", 416 | " \n", 417 | " \n", 418 | " \n", 419 | " \n", 420 | " \n", 421 | " \n", 422 | " \n", 423 | " \n", 424 | " \n", 425 | " \n", 426 | " \n", 427 | " \n", 428 | " \n", 429 | " \n", 430 | " \n", 431 | " \n", 432 | " \n", 433 | " \n", 434 | "
\n", 435 | "
" 436 | ], 437 | "text/plain": [ 438 | " Color Make\n", 439 | "Jane Blue Ford\n", 440 | "John Grey BMW\n", 441 | "June Red Mini\n", 442 | "Jim White Mercedes\n", 443 | "Jay White Toyota" 444 | ] 445 | }, 446 | "execution_count": 8, 447 | "metadata": {}, 448 | "output_type": "execute_result" 449 | } 450 | ], 451 | "source": [ 452 | "df.iloc[:,[1, 0]] # Selection by column indices using the .iloc method (columns is the 2nd argument)" 453 | ] 454 | }, 455 | { 456 | "cell_type": "markdown", 457 | "metadata": {}, 458 | "source": [ 459 | "## Selection by row" 460 | ] 461 | }, 462 | { 463 | "cell_type": "code", 464 | "execution_count": 9, 465 | "metadata": {}, 466 | "outputs": [ 467 | { 468 | "data": { 469 | "text/html": [ 470 | "
\n", 471 | "\n", 484 | "\n", 485 | " \n", 486 | " \n", 487 | " \n", 488 | " \n", 489 | " \n", 490 | " \n", 491 | " \n", 492 | " \n", 493 | " \n", 494 | " \n", 495 | " \n", 496 | " \n", 497 | " \n", 498 | " \n", 499 | " \n", 500 | " \n", 501 | " \n", 502 | " \n", 503 | " \n", 504 | " \n", 505 | " \n", 506 | " \n", 507 | "
\n", 508 | "
" 509 | ], 510 | "text/plain": [ 511 | " Make Color Capacity\n", 512 | "Jane Ford Blue 1.6l\n", 513 | "June Mini Red 1.6l" 514 | ] 515 | }, 516 | "execution_count": 9, 517 | "metadata": {}, 518 | "output_type": "execute_result" 519 | } 520 | ], 521 | "source": [ 522 | "df.loc[[\"Jane\", \"June\"],:] # Selection by row names using the .loc method (rows is the 1st argument)" 523 | ] 524 | }, 525 | { 526 | "cell_type": "code", 527 | "execution_count": 10, 528 | "metadata": {}, 529 | "outputs": [ 530 | { 531 | "data": { 532 | "text/html": [ 533 | "
\n", 534 | "\n", 547 | "\n", 548 | " \n", 549 | " \n", 550 | " \n", 551 | " \n", 552 | " \n", 553 | " \n", 554 | " \n", 555 | " \n", 556 | " \n", 557 | " \n", 558 | " \n", 559 | " \n", 560 | " \n", 561 | " \n", 562 | " \n", 563 | " \n", 564 | " \n", 565 | " \n", 566 | " \n", 567 | " \n", 568 | " \n", 569 | " \n", 570 | "
\n", 571 | "
" 572 | ], 573 | "text/plain": [ 574 | " Make Color Capacity\n", 575 | "Jane Ford Blue 1.6l\n", 576 | "June Mini Red 1.6l" 577 | ] 578 | }, 579 | "execution_count": 10, 580 | "metadata": {}, 581 | "output_type": "execute_result" 582 | } 583 | ], 584 | "source": [ 585 | "df.iloc[[0,2],:] # Selection by row indices using the .iloc method (rows is the 1st argument)" 586 | ] 587 | }, 588 | { 589 | "cell_type": "markdown", 590 | "metadata": {}, 591 | "source": [ 592 | "## Selection by row and column" 593 | ] 594 | }, 595 | { 596 | "cell_type": "code", 597 | "execution_count": 11, 598 | "metadata": {}, 599 | "outputs": [ 600 | { 601 | "data": { 602 | "text/html": [ 603 | "
\n", 604 | "\n", 617 | "\n", 618 | " \n", 619 | " \n", 620 | " \n", 621 | " \n", 622 | " \n", 623 | " \n", 624 | " \n", 625 | " \n", 626 | " \n", 627 | " \n", 628 | " \n", 629 | " \n", 630 | " \n", 631 | " \n", 632 | " \n", 633 | " \n", 634 | " \n", 635 | " \n", 636 | " \n", 637 | "
\n", 638 | "
" 639 | ], 640 | "text/plain": [ 641 | " Capacity Make\n", 642 | "Jane 1.6l Ford\n", 643 | "June 1.6l Mini" 644 | ] 645 | }, 646 | "execution_count": 11, 647 | "metadata": {}, 648 | "output_type": "execute_result" 649 | } 650 | ], 651 | "source": [ 652 | "df.loc[[\"Jane\", \"June\"],[\"Capacity\", \"Make\"]] # Selection using the .loc method would be preferred!" 653 | ] 654 | }, 655 | { 656 | "cell_type": "code", 657 | "execution_count": 12, 658 | "metadata": {}, 659 | "outputs": [ 660 | { 661 | "data": { 662 | "text/html": [ 663 | "
\n", 664 | "\n", 677 | "\n", 678 | " \n", 679 | " \n", 680 | " \n", 681 | " \n", 682 | " \n", 683 | " \n", 684 | " \n", 685 | " \n", 686 | " \n", 687 | " \n", 688 | " \n", 689 | " \n", 690 | " \n", 691 | " \n", 692 | " \n", 693 | " \n", 694 | " \n", 695 | " \n", 696 | " \n", 697 | "
\n", 698 | "
" 699 | ], 700 | "text/plain": [ 701 | " Make Capacity\n", 702 | "Jane Ford 1.6l\n", 703 | "John BMW 2.0l" 704 | ] 705 | }, 706 | "execution_count": 12, 707 | "metadata": {}, 708 | "output_type": "execute_result" 709 | } 710 | ], 711 | "source": [ 712 | "df.iloc[[0,1],[0,2]] # Selection using the .iloc method would be preferred!" 713 | ] 714 | }, 715 | { 716 | "cell_type": "markdown", 717 | "metadata": {}, 718 | "source": [ 719 | "## Selection by filter" 720 | ] 721 | }, 722 | { 723 | "cell_type": "code", 724 | "execution_count": 13, 725 | "metadata": {}, 726 | "outputs": [ 727 | { 728 | "data": { 729 | "text/html": [ 730 | "
\n", 731 | "\n", 744 | "\n", 745 | " \n", 746 | " \n", 747 | " \n", 748 | " \n", 749 | " \n", 750 | " \n", 751 | " \n", 752 | " \n", 753 | " \n", 754 | " \n", 755 | " \n", 756 | " \n", 757 | " \n", 758 | " \n", 759 | " \n", 760 | " \n", 761 | " \n", 762 | " \n", 763 | " \n", 764 | " \n", 765 | " \n", 766 | " \n", 767 | "
\n", 768 | "
" 769 | ], 770 | "text/plain": [ 771 | " Make Color Capacity\n", 772 | "Jim Mercedes White 2.2l\n", 773 | "Jay Toyota White 1.2l" 774 | ] 775 | }, 776 | "execution_count": 13, 777 | "metadata": {}, 778 | "output_type": "execute_result" 779 | } 780 | ], 781 | "source": [ 782 | "df[df[\"Color\"] == \"White\"] # Notice that we essentially filter on a series and then apply the result to the df\n", 783 | " # df[\"Color] is the series\n", 784 | " # we look for values == \"White\" in that series\n", 785 | " # apply the result to the df\n", 786 | " # When in doubt... build the code from the inside out!" 787 | ] 788 | }, 789 | { 790 | "cell_type": "code", 791 | "execution_count": 14, 792 | "metadata": {}, 793 | "outputs": [ 794 | { 795 | "data": { 796 | "text/html": [ 797 | "
\n", 798 | "\n", 811 | "\n", 812 | " \n", 813 | " \n", 814 | " \n", 815 | " \n", 816 | " \n", 817 | " \n", 818 | " \n", 819 | " \n", 820 | " \n", 821 | " \n", 822 | " \n", 823 | " \n", 824 | " \n", 825 | " \n", 826 | " \n", 827 | " \n", 828 | " \n", 829 | " \n", 830 | " \n", 831 | " \n", 832 | " \n", 833 | " \n", 834 | " \n", 835 | " \n", 836 | " \n", 837 | " \n", 838 | " \n", 839 | " \n", 840 | "
\n", 841 | "
" 842 | ], 843 | "text/plain": [ 844 | " Make Color Capacity\n", 845 | "Jane Ford Blue 1.6l\n", 846 | "Jim Mercedes White 2.2l\n", 847 | "Jay Toyota White 1.2l" 848 | ] 849 | }, 850 | "execution_count": 14, 851 | "metadata": {}, 852 | "output_type": "execute_result" 853 | } 854 | ], 855 | "source": [ 856 | "df[(df[\"Color\"] == \"White\") | (df[\"Color\"] == \"Blue\")] # One can apply this technique with boolean operators too" 857 | ] 858 | }, 859 | { 860 | "cell_type": "markdown", 861 | "metadata": {}, 862 | "source": [ 863 | "## Selection by mask\n", 864 | "This mechanism will be quite handy for re-usability" 865 | ] 866 | }, 867 | { 868 | "cell_type": "code", 869 | "execution_count": 15, 870 | "metadata": {}, 871 | "outputs": [], 872 | "source": [ 873 | "mask = df[\"Color\"] == \"White\"" 874 | ] 875 | }, 876 | { 877 | "cell_type": "code", 878 | "execution_count": 16, 879 | "metadata": {}, 880 | "outputs": [ 881 | { 882 | "data": { 883 | "text/html": [ 884 | "
\n", 885 | "\n", 898 | "\n", 899 | " \n", 900 | " \n", 901 | " \n", 902 | " \n", 903 | " \n", 904 | " \n", 905 | " \n", 906 | " \n", 907 | " \n", 908 | " \n", 909 | " \n", 910 | " \n", 911 | " \n", 912 | " \n", 913 | " \n", 914 | " \n", 915 | " \n", 916 | " \n", 917 | " \n", 918 | " \n", 919 | " \n", 920 | " \n", 921 | "
\n", 922 | "
" 923 | ], 924 | "text/plain": [ 925 | " Make Color Capacity\n", 926 | "Jim Mercedes White 2.2l\n", 927 | "Jay Toyota White 1.2l" 928 | ] 929 | }, 930 | "execution_count": 16, 931 | "metadata": {}, 932 | "output_type": "execute_result" 933 | } 934 | ], 935 | "source": [ 936 | "df[mask]" 937 | ] 938 | }, 939 | { 940 | "cell_type": "markdown", 941 | "metadata": {}, 942 | "source": [ 943 | "## Selection using the .isin() method\n", 944 | "This is pretty handy where you have larger lists of values that you want to check for" 945 | ] 946 | }, 947 | { 948 | "cell_type": "code", 949 | "execution_count": 17, 950 | "metadata": {}, 951 | "outputs": [ 952 | { 953 | "data": { 954 | "text/html": [ 955 | "
\n", 956 | "\n", 969 | "\n", 970 | " \n", 971 | " \n", 972 | " \n", 973 | " \n", 974 | " \n", 975 | " \n", 976 | " \n", 977 | " \n", 978 | " \n", 979 | " \n", 980 | " \n", 981 | " \n", 982 | " \n", 983 | " \n", 984 | " \n", 985 | " \n", 986 | " \n", 987 | " \n", 988 | " \n", 989 | " \n", 990 | " \n", 991 | " \n", 992 | " \n", 993 | " \n", 994 | " \n", 995 | " \n", 996 | " \n", 997 | " \n", 998 | " \n", 999 | " \n", 1000 | " \n", 1001 | " \n", 1002 | " \n", 1003 | " \n", 1004 | "
\n", 1005 | "
" 1006 | ], 1007 | "text/plain": [ 1008 | " Make Color Capacity\n", 1009 | "Jane Ford Blue 1.6l\n", 1010 | "June Mini Red 1.6l\n", 1011 | "Jim Mercedes White 2.2l\n", 1012 | "Jay Toyota White 1.2l" 1013 | ] 1014 | }, 1015 | "execution_count": 17, 1016 | "metadata": {}, 1017 | "output_type": "execute_result" 1018 | } 1019 | ], 1020 | "source": [ 1021 | "required_vals = [\"White\", \"Blue\", \"Red\"] # Make a list of all the values you want included\n", 1022 | "df[df[\"Color\"].isin(required_vals)] # Use this list as a filter with the .isin() method" 1023 | ] 1024 | }, 1025 | { 1026 | "cell_type": "code", 1027 | "execution_count": 18, 1028 | "metadata": {}, 1029 | "outputs": [ 1030 | { 1031 | "data": { 1032 | "text/html": [ 1033 | "
\n", 1034 | "\n", 1047 | "\n", 1048 | " \n", 1049 | " \n", 1050 | " \n", 1051 | " \n", 1052 | " \n", 1053 | " \n", 1054 | " \n", 1055 | " \n", 1056 | " \n", 1057 | " \n", 1058 | " \n", 1059 | " \n", 1060 | " \n", 1061 | " \n", 1062 | " \n", 1063 | " \n", 1064 | "
\n", 1065 | "
" 1066 | ], 1067 | "text/plain": [ 1068 | " Make Color Capacity\n", 1069 | "John BMW Grey 2.0l" 1070 | ] 1071 | }, 1072 | "execution_count": 18, 1073 | "metadata": {}, 1074 | "output_type": "execute_result" 1075 | } 1076 | ], 1077 | "source": [ 1078 | "df[~df[\"Color\"].isin(required_vals)] # Use the handy ~ notation to change it to \"isnotin\"!" 1079 | ] 1080 | }, 1081 | { 1082 | "cell_type": "code", 1083 | "execution_count": 25, 1084 | "metadata": {}, 1085 | "outputs": [ 1086 | { 1087 | "data": { 1088 | "text/html": [ 1089 | "
\n", 1090 | "\n", 1103 | "\n", 1104 | " \n", 1105 | " \n", 1106 | " \n", 1107 | " \n", 1108 | " \n", 1109 | " \n", 1110 | " \n", 1111 | " \n", 1112 | " \n", 1113 | " \n", 1114 | " \n", 1115 | " \n", 1116 | " \n", 1117 | " \n", 1118 | " \n", 1119 | " \n", 1120 | " \n", 1121 | " \n", 1122 | " \n", 1123 | " \n", 1124 | " \n", 1125 | " \n", 1126 | "
\n", 1127 | "
" 1128 | ], 1129 | "text/plain": [ 1130 | " Make Color Capacity\n", 1131 | "Jim Mercedes White 2.2l\n", 1132 | "Jay Toyota White 1.2l" 1133 | ] 1134 | }, 1135 | "execution_count": 25, 1136 | "metadata": {}, 1137 | "output_type": "execute_result" 1138 | } 1139 | ], 1140 | "source": [ 1141 | "df[df.Color.str.contains(\"te\")]" 1142 | ] 1143 | } 1144 | ], 1145 | "metadata": { 1146 | "kernelspec": { 1147 | "display_name": "Python 3", 1148 | "language": "python", 1149 | "name": "python3" 1150 | }, 1151 | "language_info": { 1152 | "codemirror_mode": { 1153 | "name": "ipython", 1154 | "version": 3 1155 | }, 1156 | "file_extension": ".py", 1157 | "mimetype": "text/x-python", 1158 | "name": "python", 1159 | "nbconvert_exporter": "python", 1160 | "pygments_lexer": "ipython3", 1161 | "version": "3.6.5" 1162 | } 1163 | }, 1164 | "nbformat": 4, 1165 | "nbformat_minor": 2 1166 | } 1167 | -------------------------------------------------------------------------------- /How it works - Pandas, groupby method.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "### import pandas as pd" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": 2, 13 | "metadata": {}, 14 | "outputs": [], 15 | "source": [ 16 | "names = [\"Jane\", \"June\", \"Jenny\", \"Jacky\", \"Johnny\", \"Jack\", \"Jeremy\"]\n", 17 | "cars = [\"Ford\", \"Fiat\", \"Ford\", \"Ford\", \"BMW\", \"Mercedes\", \"Honda\"]\n", 18 | "colors = [\"black\", \"blue\", \"white\", \"white\", \"white\", \"white\", \"blue\"]" 19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "execution_count": 3, 24 | "metadata": { 25 | "scrolled": true 26 | }, 27 | "outputs": [], 28 | "source": [ 29 | "df = pd.DataFrame(data = {\"Names\": names, \"Cars\": cars, \"Colours\": colors})\n", 30 | "df = df[[\"Names\", \"Cars\", \"Colours\"]]" 31 | ] 32 | }, 33 | { 34 | "cell_type": "code", 35 | 
"execution_count": 4, 36 | "metadata": {}, 37 | "outputs": [ 38 | { 39 | "data": { 40 | "text/html": [ 41 | "
" 110 | ], 111 | "text/plain": [ 112 | " Names Cars Colours\n", 113 | "0 Jane Ford black\n", 114 | "1 June Fiat blue\n", 115 | "2 Jenny Ford white\n", 116 | "3 Jacky Ford white\n", 117 | "4 Johnny BMW white\n", 118 | "5 Jack Mercedes white\n", 119 | "6 Jeremy Honda blue" 120 | ] 121 | }, 122 | "execution_count": 4, 123 | "metadata": {}, 124 | "output_type": "execute_result" 125 | } 126 | ], 127 | "source": [ 128 | "df" 129 | ] 130 | }, 131 | { 132 | "cell_type": "code", 133 | "execution_count": 5, 134 | "metadata": {}, 135 | "outputs": [], 136 | "source": [ 137 | "summary = df.groupby(by = [\"Cars\", \"Colours\"])[\"Names\"].count()" 138 | ] 139 | }, 140 | { 141 | "cell_type": "code", 142 | "execution_count": 11, 143 | "metadata": {}, 144 | "outputs": [ 145 | { 146 | "data": { 147 | "text/plain": [ 148 | "Cars Colours\n", 149 | "BMW white 1\n", 150 | "Fiat blue 1\n", 151 | "Ford black 1\n", 152 | " white 2\n", 153 | "Honda blue 1\n", 154 | "Mercedes white 1\n", 155 | "Name: Names, dtype: int64" 156 | ] 157 | }, 158 | "execution_count": 11, 159 | "metadata": {}, 160 | "output_type": "execute_result" 161 | } 162 | ], 163 | "source": [ 164 | "summary" 165 | ] 166 | }, 167 | { 168 | "cell_type": "code", 169 | "execution_count": 13, 170 | "metadata": {}, 171 | "outputs": [ 172 | { 173 | "data": { 174 | "text/plain": [ 175 | "Cars Colours\n", 176 | "BMW white 1\n", 177 | "Ford white 2\n", 178 | "Name: Names, dtype: int64" 179 | ] 180 | }, 181 | "execution_count": 13, 182 | "metadata": {}, 183 | "output_type": "execute_result" 184 | } 185 | ], 186 | "source": [ 187 | "summary.loc[[('BMW', 'white'), ('Ford', 'white')]]" 188 | ] 189 | }, 190 | { 191 | "cell_type": "code", 192 | "execution_count": 36, 193 | "metadata": {}, 194 | "outputs": [ 195 | { 196 | "data": { 197 | "text/html": [ 198 | "
" 261 | ], 262 | "text/plain": [ 263 | " Cars Colours Names\n", 264 | "0 BMW white 1\n", 265 | "1 Fiat blue 1\n", 266 | "2 Ford black 1\n", 267 | "3 Ford white 2\n", 268 | "4 Honda blue 1\n", 269 | "5 Mercedes white 1" 270 | ] 271 | }, 272 | "execution_count": 36, 273 | "metadata": {}, 274 | "output_type": "execute_result" 275 | } 276 | ], 277 | "source": [ 278 | "summary.reset_index(level=[\"Cars\", \"Colours\"])" 279 | ] 280 | } 281 | ], 282 | "metadata": { 283 | "kernelspec": { 284 | "display_name": "Python 3", 285 | "language": "python", 286 | "name": "python3" 287 | }, 288 | "language_info": { 289 | "codemirror_mode": { 290 | "name": "ipython", 291 | "version": 3 292 | }, 293 | "file_extension": ".py", 294 | "mimetype": "text/x-python", 295 | "name": "python", 296 | "nbconvert_exporter": "python", 297 | "pygments_lexer": "ipython3", 298 | "version": "3.6.5" 299 | } 300 | }, 301 | "nbformat": 4, 302 | "nbformat_minor": 2 303 | } 304 | -------------------------------------------------------------------------------- /How it works - Pandas, mapping series values.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "import pandas as pd" 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": 2, 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "names = [\"Liza\", \"Lisa\", \"Lizzy\", \"Lynne\", \"Lisbeth\", \"Lana\"]\n", 19 | "sizes = [16, 12, 14, 10, 8, 14]" 20 | ] 21 | }, 22 | { 23 | "cell_type": "code", 24 | "execution_count": 3, 25 | "metadata": {}, 26 | "outputs": [], 27 | "source": [ 28 | "s_name = pd.Series(names)" 29 | ] 30 | }, 31 | { 32 | "cell_type": "code", 33 | "execution_count": 4, 34 | "metadata": {}, 35 | "outputs": [ 36 | { 37 | "data": { 38 | "text/plain": [ 39 | "0 Liza\n", 40 | "1 Lisa\n", 41 | "2 Lizzy\n", 42 | "3 Lynne\n", 43 | "4 Lisbeth\n", 44 
| "5 Lana\n", 45 | "dtype: object" 46 | ] 47 | }, 48 | "execution_count": 4, 49 | "metadata": {}, 50 | "output_type": "execute_result" 51 | } 52 | ], 53 | "source": [ 54 | "s_name" 55 | ] 56 | }, 57 | { 58 | "cell_type": "code", 59 | "execution_count": 5, 60 | "metadata": {}, 61 | "outputs": [], 62 | "source": [ 63 | "s_sizes = pd.Series(sizes, names)" 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": 6, 69 | "metadata": {}, 70 | "outputs": [ 71 | { 72 | "data": { 73 | "text/plain": [ 74 | "Liza 16\n", 75 | "Lisa 12\n", 76 | "Lizzy 14\n", 77 | "Lynne 10\n", 78 | "Lisbeth 8\n", 79 | "Lana 14\n", 80 | "dtype: int64" 81 | ] 82 | }, 83 | "execution_count": 6, 84 | "metadata": {}, 85 | "output_type": "execute_result" 86 | } 87 | ], 88 | "source": [ 89 | "s_sizes" 90 | ] 91 | }, 92 | { 93 | "cell_type": "code", 94 | "execution_count": 10, 95 | "metadata": {}, 96 | "outputs": [ 97 | { 98 | "data": { 99 | "text/plain": [ 100 | "0 16\n", 101 | "1 12\n", 102 | "2 14\n", 103 | "3 10\n", 104 | "4 8\n", 105 | "5 14\n", 106 | "dtype: int64" 107 | ] 108 | }, 109 | "execution_count": 10, 110 | "metadata": {}, 111 | "output_type": "execute_result" 112 | } 113 | ], 114 | "source": [ 115 | "s_name.map(s_sizes)" 116 | ] 117 | } 118 | ], 119 | "metadata": { 120 | "kernelspec": { 121 | "display_name": "Python 3", 122 | "language": "python", 123 | "name": "python3" 124 | }, 125 | "language_info": { 126 | "codemirror_mode": { 127 | "name": "ipython", 128 | "version": 3 129 | }, 130 | "file_extension": ".py", 131 | "mimetype": "text/x-python", 132 | "name": "python", 133 | "nbconvert_exporter": "python", 134 | "pygments_lexer": "ipython3", 135 | "version": "3.6.5" 136 | } 137 | }, 138 | "nbformat": 4, 139 | "nbformat_minor": 2 140 | } 141 | -------------------------------------------------------------------------------- /How it works - Pandas, merge method.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | 
{ 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "import pandas as pd" 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": 2, 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "mothers = [\"Evelyn\", \"Eve\", \"Eugenia\", \"Elizabeth\"]\n", 19 | "daughters = [\"Jenny\", \"Joan\", \"June\", \"Julia\"]" 20 | ] 21 | }, 22 | { 23 | "cell_type": "code", 24 | "execution_count": 3, 25 | "metadata": {}, 26 | "outputs": [], 27 | "source": [ 28 | "kids1 = [\"Jenny\", \"Jenny\", \"Joan\", \"June\", \"Julia\", \"Julia\", \"Julia\"]\n", 29 | "kids2 = [\"Freddy\", \"Johnny\", \"Michael\", \"Robin\", \"Nadia\", \"Freddy\", \"Mark\"]" 30 | ] 31 | }, 32 | { 33 | "cell_type": "code", 34 | "execution_count": 5, 35 | "metadata": {}, 36 | "outputs": [], 37 | "source": [ 38 | "senior = pd.DataFrame(data = {\"Grandmothers\": mothers, \"Mothers\": daughters})" 39 | ] 40 | }, 41 | { 42 | "cell_type": "code", 43 | "execution_count": 16, 44 | "metadata": {}, 45 | "outputs": [], 46 | "source": [ 47 | "junior = pd.DataFrame(data = {\"Mothers\": kids1, \"Children\": kids2})\n", 48 | "junior = junior[[\"Mothers\", \"Children\"]]" 49 | ] 50 | }, 51 | { 52 | "cell_type": "code", 53 | "execution_count": 17, 54 | "metadata": {}, 55 | "outputs": [ 56 | { 57 | "data": { 58 | "text/html": [ 59 | "
" 105 | ], 106 | "text/plain": [ 107 | " Grandmothers Mothers\n", 108 | "0 Evelyn Jenny\n", 109 | "1 Eve Joan\n", 110 | "2 Eugenia June\n", 111 | "3 Elizabeth Julia" 112 | ] 113 | }, 114 | "execution_count": 17, 115 | "metadata": {}, 116 | "output_type": "execute_result" 117 | } 118 | ], 119 | "source": [ 120 | "senior" 121 | ] 122 | }, 123 | { 124 | "cell_type": "code", 125 | "execution_count": 18, 126 | "metadata": {}, 127 | "outputs": [ 128 | { 129 | "data": { 130 | "text/html": [ 131 | "
" 192 | ], 193 | "text/plain": [ 194 | " Mothers Children\n", 195 | "0 Jenny Freddy\n", 196 | "1 Jenny Johnny\n", 197 | "2 Joan Michael\n", 198 | "3 June Robin\n", 199 | "4 Julia Nadia\n", 200 | "5 Julia Freddy\n", 201 | "6 Julia Mark" 202 | ] 203 | }, 204 | "execution_count": 18, 205 | "metadata": {}, 206 | "output_type": "execute_result" 207 | } 208 | ], 209 | "source": [ 210 | "junior" 211 | ] 212 | }, 213 | { 214 | "cell_type": "code", 215 | "execution_count": 19, 216 | "metadata": {}, 217 | "outputs": [], 218 | "source": [ 219 | "master = pd.merge(senior, junior, how = \"right\")" 220 | ] 221 | }, 222 | { 223 | "cell_type": "code", 224 | "execution_count": 21, 225 | "metadata": {}, 226 | "outputs": [ 227 | { 228 | "data": { 229 | "text/html": [ 230 | "
" 299 | ], 300 | "text/plain": [ 301 | " Grandmothers Mothers Children\n", 302 | "0 Evelyn Jenny Freddy\n", 303 | "1 Evelyn Jenny Johnny\n", 304 | "2 Eve Joan Michael\n", 305 | "3 Eugenia June Robin\n", 306 | "4 Elizabeth Julia Nadia\n", 307 | "5 Elizabeth Julia Freddy\n", 308 | "6 Elizabeth Julia Mark" 309 | ] 310 | }, 311 | "execution_count": 21, 312 | "metadata": {}, 313 | "output_type": "execute_result" 314 | } 315 | ], 316 | "source": [ 317 | "master" 318 | ] 319 | }, 320 | { 321 | "cell_type": "code", 322 | "execution_count": null, 323 | "metadata": {}, 324 | "outputs": [], 325 | "source": [] 326 | } 327 | ], 328 | "metadata": { 329 | "kernelspec": { 330 | "display_name": "Python 3", 331 | "language": "python", 332 | "name": "python3" 333 | }, 334 | "language_info": { 335 | "codemirror_mode": { 336 | "name": "ipython", 337 | "version": 3 338 | }, 339 | "file_extension": ".py", 340 | "mimetype": "text/x-python", 341 | "name": "python", 342 | "nbconvert_exporter": "python", 343 | "pygments_lexer": "ipython3", 344 | "version": "3.6.5" 345 | } 346 | }, 347 | "nbformat": 4, 348 | "nbformat_minor": 2 349 | } 350 | -------------------------------------------------------------------------------- /How it works - basic lists.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": { 7 | "collapsed": true 8 | }, 9 | "outputs": [], 10 | "source": [ 11 | "List = [3, 4, 6, 8, 9, 11, 12, 14, 15, 16, 17, 20, 21]" 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": 2, 17 | "metadata": {}, 18 | "outputs": [ 19 | { 20 | "name": "stdout", 21 | "output_type": "stream", 22 | "text": [ 23 | "[3, 4, 6, 8, 9, 11, 12, 14, 15, 16, 17, 20, 21]\n" 24 | ] 25 | } 26 | ], 27 | "source": [ 28 | "print(List)" 29 | ] 30 | }, 31 | { 32 | "cell_type": "code", 33 | "execution_count": 8, 34 | "metadata": {}, 35 | "outputs": [ 36 | { 37 | "name": "stdout", 
38 | "output_type": "stream", 39 | "text": [ 40 | "3 is a Multiple of 3\n", 41 | "4\n", 42 | "6 is a Multiple of 3\n", 43 | "8\n", 44 | "9 is a Multiple of 3\n", 45 | "11\n", 46 | "12 is a Multiple of 3\n", 47 | "14\n", 48 | "15 is a Multiple of 3\n", 49 | "16\n", 50 | "17\n", 51 | "20\n", 52 | "21 is a Multiple of 3\n" 53 | ] 54 | } 55 | ], 56 | "source": [ 57 | "for i in List:\n", 58 | " if i % 3 == 0:\n", 59 | " print(i, \"is a Multiple of 3\")\n", 60 | " else:\n", 61 | " print(i)" 62 | ] 63 | }, 64 | { 65 | "cell_type": "code", 66 | "execution_count": 10, 67 | "metadata": {}, 68 | "outputs": [ 69 | { 70 | "name": "stdout", 71 | "output_type": "stream", 72 | "text": [ 73 | "[3, 6, 9, 12, 15, 21]\n" 74 | ] 75 | } 76 | ], 77 | "source": [ 78 | "List_of_3 = []\n", 79 | "for i in List:\n", 80 | " if i % 3 == 0:\n", 81 | " List_of_3.append(i)\n", 82 | "print(List_of_3)" 83 | ] 84 | }, 85 | { 86 | "cell_type": "code", 87 | "execution_count": 11, 88 | "metadata": {}, 89 | "outputs": [ 90 | { 91 | "name": "stdout", 92 | "output_type": "stream", 93 | "text": [ 94 | "o\n" 95 | ] 96 | } 97 | ], 98 | "source": [ 99 | "words = \"Hello world\"\n", 100 | "print(words[4])" 101 | ] 102 | }, 103 | { 104 | "cell_type": "code", 105 | "execution_count": 12, 106 | "metadata": {}, 107 | "outputs": [ 108 | { 109 | "name": "stdout", 110 | "output_type": "stream", 111 | "text": [ 112 | "Hello worldHello worldHello world\n" 113 | ] 114 | } 115 | ], 116 | "source": [ 117 | "print(words * 3)" 118 | ] 119 | }, 120 | { 121 | "cell_type": "code", 122 | "execution_count": 13, 123 | "metadata": {}, 124 | "outputs": [ 125 | { 126 | "data": { 127 | "text/plain": [ 128 | "True" 129 | ] 130 | }, 131 | "execution_count": 13, 132 | "metadata": {}, 133 | "output_type": "execute_result" 134 | } 135 | ], 136 | "source": [ 137 | "\"Hello\" in words" 138 | ] 139 | }, 140 | { 141 | "cell_type": "code", 142 | "execution_count": 14, 143 | "metadata": {}, 144 | "outputs": [ 145 | { 146 | "data": { 147 | 
"text/plain": [ 148 | "False" 149 | ] 150 | }, 151 | "execution_count": 14, 152 | "metadata": {}, 153 | "output_type": "execute_result" 154 | } 155 | ], 156 | "source": [ 157 | "\"Goodbye\" in words" 158 | ] 159 | }, 160 | { 161 | "cell_type": "code", 162 | "execution_count": 15, 163 | "metadata": {}, 164 | "outputs": [ 165 | { 166 | "data": { 167 | "text/plain": [ 168 | "False" 169 | ] 170 | }, 171 | "execution_count": 15, 172 | "metadata": {}, 173 | "output_type": "execute_result" 174 | } 175 | ], 176 | "source": [ 177 | "\"Hello\" not in words" 178 | ] 179 | }, 180 | { 181 | "cell_type": "code", 182 | "execution_count": 16, 183 | "metadata": {}, 184 | "outputs": [ 185 | { 186 | "data": { 187 | "text/plain": [ 188 | "True" 189 | ] 190 | }, 191 | "execution_count": 16, 192 | "metadata": {}, 193 | "output_type": "execute_result" 194 | } 195 | ], 196 | "source": [ 197 | "\"Goodbye\" not in words" 198 | ] 199 | }, 200 | { 201 | "cell_type": "code", 202 | "execution_count": null, 203 | "metadata": { 204 | "collapsed": true 205 | }, 206 | "outputs": [], 207 | "source": [] 208 | } 209 | ], 210 | "metadata": { 211 | "kernelspec": { 212 | "display_name": "Python 3", 213 | "language": "python", 214 | "name": "python3" 215 | }, 216 | "language_info": { 217 | "codemirror_mode": { 218 | "name": "ipython", 219 | "version": 3 220 | }, 221 | "file_extension": ".py", 222 | "mimetype": "text/x-python", 223 | "name": "python", 224 | "nbconvert_exporter": "python", 225 | "pygments_lexer": "ipython3", 226 | "version": "3.6.5" 227 | } 228 | }, 229 | "nbformat": 4, 230 | "nbformat_minor": 2 231 | } 232 | -------------------------------------------------------------------------------- /How it works - list comprehensions.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "# Tips from 
https://www.kaggle.com/colinmorris/learn-python-challenge-day-5\n", 10 | "# Thank you https://www.kaggle.com/colinmorris :)" 11 | ] 12 | }, 13 | { 14 | "cell_type": "code", 15 | "execution_count": 8, 16 | "metadata": {}, 17 | "outputs": [ 18 | { 19 | "name": "stdout", 20 | "output_type": "stream", 21 | "text": [ 22 | "[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]\n" 23 | ] 24 | } 25 | ], 26 | "source": [ 27 | "# A list comprehension creates a list by specifying a rule for its construction\n", 28 | "squares = [n**2 for n in range(10)]\n", 29 | "print(squares)\n", 30 | "# One can see the beauty of this by comparing it to the less satisfactory methods below..." 31 | ] 32 | }, 33 | { 34 | "cell_type": "code", 35 | "execution_count": 9, 36 | "metadata": {}, 37 | "outputs": [], 38 | "source": [ 39 | "# At worst we could have typed the values out by hand\n", 40 | "squares = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]" 41 | ] 42 | }, 43 | { 44 | "cell_type": "code", 45 | "execution_count": 10, 46 | "metadata": {}, 47 | "outputs": [ 48 | { 49 | "name": "stdout", 50 | "output_type": "stream", 51 | "text": [ 52 | "[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]\n" 53 | ] 54 | } 55 | ], 56 | "source": [ 57 | "# And more tediously we could have used an explicit loop\n", 58 | "squares = []\n", 59 | "for n in range(10):\n", 60 | " squares.append(n**2)\n", 61 | "print(squares)" 62 | ] 63 | }, 64 | { 65 | "cell_type": "code", 66 | "execution_count": 17, 67 | "metadata": {}, 68 | "outputs": [], 69 | "source": [ 70 | "# You can do some nice fancy footwork - let's get an arbitrary list of integers\n", 71 | "my_list = [5, -1, -2, 0, 3]" 72 | ] 73 | }, 74 | { 75 | "cell_type": "code", 76 | "execution_count": 24, 77 | "metadata": {}, 78 | "outputs": [ 79 | { 80 | "data": { 81 | "text/plain": [ 82 | "2" 83 | ] 84 | }, 85 | "execution_count": 24, 86 | "metadata": {}, 87 | "output_type": "execute_result" 88 | } 89 | ], 90 | "source": [ 91 | "# I can easily count the negative numbers now - without writing out a loop!\n", 92 | "len([num for num in my_list if num < 0])" 93 
| ] 94 | }, 95 | { 96 | "cell_type": "code", 97 | "execution_count": 26, 98 | "metadata": {}, 99 | "outputs": [ 100 | { 101 | "data": { 102 | "text/plain": [ 103 | "2" 104 | ] 105 | }, 106 | "execution_count": 26, 107 | "metadata": {}, 108 | "output_type": "execute_result" 109 | } 110 | ], 111 | "source": [ 112 | "# There are a couple of options: summing the booleans also works, because True counts as 1\n", 113 | "sum([num < 0 for num in my_list])" 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": 27, 119 | "metadata": {}, 120 | "outputs": [], 121 | "source": [ 122 | "# Some more examples\n", 123 | "planets = [\"Mercury\", \"Venus\", \"Earth\", \"Mars\", \"Jupiter\", \"Saturn\", \"Uranus\", \"Neptune\", \"Pluto\"]" 124 | ] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "execution_count": 28, 129 | "metadata": {}, 130 | "outputs": [ 131 | { 132 | "data": { 133 | "text/plain": [ 134 | "['VENUS!', 'EARTH!', 'MARS!', 'PLUTO!']" 135 | ] 136 | }, 137 | "execution_count": 28, 138 | "metadata": {}, 139 | "output_type": "execute_result" 140 | } 141 | ], 142 | "source": [ 143 | "# Notice how you can lay it out nicely and readably across several lines\n", 144 | "[\n", 145 | " planet.upper() + '!' 
\n", 146 | " for planet in planets \n", 147 | " if len(planet) < 6\n", 148 | "]" 149 | ] 150 | } 151 | ], 152 | "metadata": { 153 | "kernelspec": { 154 | "display_name": "Python 3", 155 | "language": "python", 156 | "name": "python3" 157 | }, 158 | "language_info": { 159 | "codemirror_mode": { 160 | "name": "ipython", 161 | "version": 3 162 | }, 163 | "file_extension": ".py", 164 | "mimetype": "text/x-python", 165 | "name": "python", 166 | "nbconvert_exporter": "python", 167 | "pygments_lexer": "ipython3", 168 | "version": "3.6.5" 169 | } 170 | }, 171 | "nbformat": 4, 172 | "nbformat_minor": 2 173 | } 174 | -------------------------------------------------------------------------------- /How it works - lists vs arrays.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "import numpy as np" 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": 2, 15 | "metadata": {}, 16 | "outputs": [ 17 | { 18 | "name": "stdout", 19 | "output_type": "stream", 20 | "text": [ 21 | "[12, 24, 65, 19, 242]\n" 22 | ] 23 | } 24 | ], 25 | "source": [ 26 | "List1 = [12, 24, 65 , 19, 242]\n", 27 | "print(List1)" 28 | ] 29 | }, 30 | { 31 | "cell_type": "code", 32 | "execution_count": 3, 33 | "metadata": {}, 34 | "outputs": [ 35 | { 36 | "name": "stdout", 37 | "output_type": "stream", 38 | "text": [ 39 | "[12, 24, 65, 19]\n" 40 | ] 41 | } 42 | ], 43 | "source": [ 44 | "List2 = List1[0:4]\n", 45 | "print(List2)" 46 | ] 47 | }, 48 | { 49 | "cell_type": "code", 50 | "execution_count": 4, 51 | "metadata": {}, 52 | "outputs": [], 53 | "source": [ 54 | "List2[0] = 15" 55 | ] 56 | }, 57 | { 58 | "cell_type": "code", 59 | "execution_count": 5, 60 | "metadata": {}, 61 | "outputs": [ 62 | { 63 | "name": "stdout", 64 | "output_type": "stream", 65 | "text": [ 66 | "List1 contains: [12, 24, 65, 19, 242]\n", 67 | "List2 
contains: [15, 24, 65, 19]\n" 68 | ] 69 | } 70 | ], 71 | "source": [ 72 | "# Notice that although we changed a value in List2, the value in List1 remains the same\n", 73 | "# Slicing a list makes a copy, so the 2 lists are independent of one another\n", 74 | "print(\"List1 contains: \", List1)\n", 75 | "print(\"List2 contains: \", List2)" 76 | ] 77 | }, 78 | { 79 | "cell_type": "code", 80 | "execution_count": 6, 81 | "metadata": {}, 82 | "outputs": [ 83 | { 84 | "name": "stdout", 85 | "output_type": "stream", 86 | "text": [ 87 | "[ 12 24 65 19 242]\n" 88 | ] 89 | } 90 | ], 91 | "source": [ 92 | "Array1 = np.array(List1)\n", 93 | "print(Array1)" 94 | ] 95 | }, 96 | { 97 | "cell_type": "code", 98 | "execution_count": 7, 99 | "metadata": {}, 100 | "outputs": [ 101 | { 102 | "name": "stdout", 103 | "output_type": "stream", 104 | "text": [ 105 | "[12 24 65 19]\n" 106 | ] 107 | } 108 | ], 109 | "source": [ 110 | "Array2 = Array1[0:4]\n", 111 | "print(Array2)" 112 | ] 113 | }, 114 | { 115 | "cell_type": "code", 116 | "execution_count": 8, 117 | "metadata": {}, 118 | "outputs": [], 119 | "source": [ 120 | "Array2[0] = 15" 121 | ] 122 | }, 123 | { 124 | "cell_type": "code", 125 | "execution_count": 9, 126 | "metadata": {}, 127 | "outputs": [ 128 | { 129 | "name": "stdout", 130 | "output_type": "stream", 131 | "text": [ 132 | "Array1 contains: [ 15 24 65 19 242]\n", 133 | "Array2 contains: [15 24 65 19]\n" 134 | ] 135 | } 136 | ], 137 | "source": [ 138 | "# Notice that because Array2 is a slice of Array1, it is a view onto the same data, so when you change\n", 139 | "# a value in one, the value in the other immediately changes too!\n", 140 | "print(\"Array1 contains: \", Array1)\n", 141 | "print(\"Array2 contains: \", Array2)" 142 | ] 143 | }, 144 | { 145 | "cell_type": "code", 146 | "execution_count": 10, 147 | "metadata": {}, 148 | "outputs": [ 149 | { 150 | "name": "stdout", 151 | "output_type": "stream", 152 | "text": [ 153 | "[ 12 24 65 19 242]\n" 154 | ] 155 | } 156 | ], 157 | "source": [ 158 | "Array1 = 
np.array(List1)\n", 159 | "print(Array1)" 160 | ] 161 | }, 162 | { 163 | "cell_type": "code", 164 | "execution_count": 11, 165 | "metadata": {}, 166 | "outputs": [ 167 | { 168 | "name": "stdout", 169 | "output_type": "stream", 170 | "text": [ 171 | "[12 24 65 19]\n" 172 | ] 173 | } 174 | ], 175 | "source": [ 176 | "# If this is not the desired effect use .copy() to make an independent copy\n", 177 | "Array2 = Array1[0:4].copy()\n", 178 | "print(Array2)" 179 | ] 180 | }, 181 | { 182 | "cell_type": "code", 183 | "execution_count": 12, 184 | "metadata": {}, 185 | "outputs": [], 186 | "source": [ 187 | "Array2[0] = 15" 188 | ] 189 | }, 190 | { 191 | "cell_type": "code", 192 | "execution_count": 13, 193 | "metadata": {}, 194 | "outputs": [ 195 | { 196 | "name": "stdout", 197 | "output_type": "stream", 198 | "text": [ 199 | "Array1 contains: [ 12 24 65 19 242]\n", 200 | "Array2 contains: [15 24 65 19]\n" 201 | ] 202 | } 203 | ], 204 | "source": [ 205 | "# Notice now that although we changed a value in Array2, Array1 has remained unaffected\n", 206 | "print(\"Array1 contains: \", Array1)\n", 207 | "print(\"Array2 contains: \", Array2)" 208 | ] 209 | }, 210 | { 211 | "cell_type": "code", 212 | "execution_count": 14, 213 | "metadata": {}, 214 | "outputs": [], 215 | "source": [ 216 | "List1 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])" 217 | ] 218 | }, 219 | { 220 | "cell_type": "code", 221 | "execution_count": 15, 222 | "metadata": {}, 223 | "outputs": [ 224 | { 225 | "name": "stdout", 226 | "output_type": "stream", 227 | "text": [ 228 | "[[1 2 3]\n", 229 | " [4 5 6]\n", 230 | " [7 8 9]]\n", 231 | "\n" 232 | ] 233 | } 234 | ], 235 | "source": [ 236 | "List_Table = np.reshape(List1, (3,3))\n", 237 | "print(List_Table)\n", 238 | "print(type(List_Table))" 239 | ] 240 | }, 241 | { 242 | "cell_type": "code", 243 | "execution_count": 16, 244 | "metadata": {}, 245 | "outputs": [ 246 | { 247 | "name": "stdout", 248 | "output_type": "stream", 249 | "text": [ 250 | "3\n", 251 | "[2 5 
8]\n" 252 | ] 253 | } 254 | ], 255 | "source": [ 256 | "print(List_Table[0,2])\n", 257 | "print(List_Table[:,1])" 258 | ] 259 | }, 260 | { 261 | "cell_type": "code", 262 | "execution_count": 17, 263 | "metadata": {}, 264 | "outputs": [], 265 | "source": [ 266 | "#Object oriented programming" 267 | ] 268 | }, 269 | { 270 | "cell_type": "code", 271 | "execution_count": 18, 272 | "metadata": {}, 273 | "outputs": [ 274 | { 275 | "data": { 276 | "text/plain": [ 277 | "array([[1, 2, 3],\n", 278 | " [4, 5, 6],\n", 279 | " [7, 8, 9]])" 280 | ] 281 | }, 282 | "execution_count": 18, 283 | "metadata": {}, 284 | "output_type": "execute_result" 285 | } 286 | ], 287 | "source": [ 288 | "List1.reshape(3,3)" 289 | ] 290 | }, 291 | { 292 | "cell_type": "code", 293 | "execution_count": 19, 294 | "metadata": {}, 295 | "outputs": [ 296 | { 297 | "name": "stdout", 298 | "output_type": "stream", 299 | "text": [ 300 | "[[1 2 3]\n", 301 | " [4 5 6]\n", 302 | " [7 8 9]]\n" 303 | ] 304 | } 305 | ], 306 | "source": [ 307 | "print(List1.reshape(3,3))" 308 | ] 309 | }, 310 | { 311 | "cell_type": "code", 312 | "execution_count": 20, 313 | "metadata": {}, 314 | "outputs": [], 315 | "source": [ 316 | "#We played Scrabble in Jan, Feb & Mar, here are the scores:" 317 | ] 318 | }, 319 | { 320 | "cell_type": "code", 321 | "execution_count": 21, 322 | "metadata": {}, 323 | "outputs": [], 324 | "source": [ 325 | "Lisa_Mitford = [108, 215, 99]\n", 326 | "Barry_Benjamin = [260, 212, 220]\n", 327 | "Geoff_Louw = [176, 98, 232]\n", 328 | "Denise_Louw = [102, 89, 276]" 329 | ] 330 | }, 331 | { 332 | "cell_type": "code", 333 | "execution_count": 22, 334 | "metadata": {}, 335 | "outputs": [], 336 | "source": [ 337 | "#Insert this lot into an array" 338 | ] 339 | }, 340 | { 341 | "cell_type": "code", 342 | "execution_count": 23, 343 | "metadata": {}, 344 | "outputs": [ 345 | { 346 | "name": "stdout", 347 | "output_type": "stream", 348 | "text": [ 349 | "[[108 215 99]\n", 350 | " [260 212 220]\n", 351 | " [176 
98 232]\n", 352 | " [102 89 276]]\n" 353 | ] 354 | } 355 | ], 356 | "source": [ 357 | "Scores = np.array([Lisa_Mitford, Barry_Benjamin, Geoff_Louw, Denise_Louw])\n", 358 | "print(Scores)" 359 | ] 360 | }, 361 | { 362 | "cell_type": "code", 363 | "execution_count": 24, 364 | "metadata": {}, 365 | "outputs": [], 366 | "source": [ 367 | "#Create a player dictionary" 368 | ] 369 | }, 370 | { 371 | "cell_type": "code", 372 | "execution_count": 25, 373 | "metadata": {}, 374 | "outputs": [], 375 | "source": [ 376 | "PDict = {'Lisa_Mitford':0, 'Barry_Benjamin':1, 'Geoff_Louw':2, 'Denise_Louw':3}" 377 | ] 378 | }, 379 | { 380 | "cell_type": "code", 381 | "execution_count": 26, 382 | "metadata": {}, 383 | "outputs": [], 384 | "source": [ 385 | "#Create a months dictionary\n", 386 | "MDict = {'Jan':0, 'Feb':1, 'Mar':2}" 387 | ] 388 | }, 389 | { 390 | "cell_type": "code", 391 | "execution_count": 27, 392 | "metadata": {}, 393 | "outputs": [], 394 | "source": [ 395 | "#To retrieve an individual item from the matrix" 396 | ] 397 | }, 398 | { 399 | "cell_type": "code", 400 | "execution_count": 28, 401 | "metadata": {}, 402 | "outputs": [ 403 | { 404 | "name": "stdout", 405 | "output_type": "stream", 406 | "text": [ 407 | "232\n" 408 | ] 409 | } 410 | ], 411 | "source": [ 412 | "print(Scores[PDict['Geoff_Louw'], MDict['Mar']])" 413 | ] 414 | }, 415 | { 416 | "cell_type": "code", 417 | "execution_count": 29, 418 | "metadata": {}, 419 | "outputs": [], 420 | "source": [ 421 | "#To retrieve a whole row from the matrix" 422 | ] 423 | }, 424 | { 425 | "cell_type": "code", 426 | "execution_count": 30, 427 | "metadata": {}, 428 | "outputs": [ 429 | { 430 | "name": "stdout", 431 | "output_type": "stream", 432 | "text": [ 433 | "[176 98 232]\n" 434 | ] 435 | } 436 | ], 437 | "source": [ 438 | "print(Scores[PDict['Geoff_Louw']])" 439 | ] 440 | }, 441 | { 442 | "cell_type": "code", 443 | "execution_count": 31, 444 | "metadata": {}, 445 | "outputs": [ 446 | { 447 | "name": "stdout", 448 | 
"output_type": "stream", 449 | "text": [ 450 | "[ 99 220 232 276]\n" 451 | ] 452 | } 453 | ], 454 | "source": [ 455 | "#Or to retrieve a whole column from the matrix\n", 456 | "print(Scores[:,MDict['Mar']])" 457 | ] 458 | } 459 | ], 460 | "metadata": { 461 | "kernelspec": { 462 | "display_name": "Python 3", 463 | "language": "python", 464 | "name": "python3" 465 | }, 466 | "language_info": { 467 | "codemirror_mode": { 468 | "name": "ipython", 469 | "version": 3 470 | }, 471 | "file_extension": ".py", 472 | "mimetype": "text/x-python", 473 | "name": "python", 474 | "nbconvert_exporter": "python", 475 | "pygments_lexer": "ipython3", 476 | "version": "3.6.5" 477 | } 478 | }, 479 | "nbformat": 4, 480 | "nbformat_minor": 2 481 | } 482 | -------------------------------------------------------------------------------- /How it works - positive & negative indexation.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "# Let's start with the basics" 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": 2, 15 | "metadata": {}, 16 | "outputs": [ 17 | { 18 | "data": { 19 | "text/plain": [ 20 | "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]" 21 | ] 22 | }, 23 | "execution_count": 2, 24 | "metadata": {}, 25 | "output_type": "execute_result" 26 | } 27 | ], 28 | "source": [ 29 | "simple = list(range(1,19))\n", 30 | "simple" 31 | ] 32 | }, 33 | { 34 | "cell_type": "code", 35 | "execution_count": 3, 36 | "metadata": {}, 37 | "outputs": [ 38 | { 39 | "data": { 40 | "text/plain": [ 41 | "1" 42 | ] 43 | }, 44 | "execution_count": 3, 45 | "metadata": {}, 46 | "output_type": "execute_result" 47 | } 48 | ], 49 | "source": [ 50 | "# Select the 1st item using positive indexation\n", 51 | "simple[0]" 52 | ] 53 | }, 54 | { 55 | "cell_type": "code", 56 | "execution_count": 4, 57 | 
"metadata": {}, 58 | "outputs": [ 59 | { 60 | "data": { 61 | "text/plain": [ 62 | "1" 63 | ] 64 | }, 65 | "execution_count": 4, 66 | "metadata": {}, 67 | "output_type": "execute_result" 68 | } 69 | ], 70 | "source": [ 71 | "# Select the 1st item using negative indexation\n", 72 | "simple[-18]" 73 | ] 74 | }, 75 | { 76 | "cell_type": "code", 77 | "execution_count": 5, 78 | "metadata": {}, 79 | "outputs": [ 80 | { 81 | "data": { 82 | "text/plain": [ 83 | "18" 84 | ] 85 | }, 86 | "execution_count": 5, 87 | "metadata": {}, 88 | "output_type": "execute_result" 89 | } 90 | ], 91 | "source": [ 92 | "# Select the last item using positive indexation\n", 93 | "simple[17]" 94 | ] 95 | }, 96 | { 97 | "cell_type": "code", 98 | "execution_count": 6, 99 | "metadata": {}, 100 | "outputs": [ 101 | { 102 | "data": { 103 | "text/plain": [ 104 | "18" 105 | ] 106 | }, 107 | "execution_count": 6, 108 | "metadata": {}, 109 | "output_type": "execute_result" 110 | } 111 | ], 112 | "source": [ 113 | "# Select the last item using negative indexation\n", 114 | "simple[-1]" 115 | ] 116 | }, 117 | { 118 | "cell_type": "code", 119 | "execution_count": 7, 120 | "metadata": {}, 121 | "outputs": [ 122 | { 123 | "data": { 124 | "text/plain": [ 125 | "[1, 2, 3, 4, 5, 6, 7]" 126 | ] 127 | }, 128 | "execution_count": 7, 129 | "metadata": {}, 130 | "output_type": "execute_result" 131 | } 132 | ], 133 | "source": [ 134 | "# Select a range of items with positive indexation\n", 135 | "simple[0:7]" 136 | ] 137 | }, 138 | { 139 | "cell_type": "code", 140 | "execution_count": 8, 141 | "metadata": {}, 142 | "outputs": [ 143 | { 144 | "data": { 145 | "text/plain": [ 146 | "[1, 2, 3, 4, 5, 6, 7]" 147 | ] 148 | }, 149 | "execution_count": 8, 150 | "metadata": {}, 151 | "output_type": "execute_result" 152 | } 153 | ], 154 | "source": [ 155 | "# Select a range of items with negative indexation\n", 156 | "simple[-18:-11]" 157 | ] 158 | }, 159 | { 160 | "cell_type": "code", 161 | "execution_count": 9, 162 | 
"metadata": {}, 163 | "outputs": [ 164 | { 165 | "data": { 166 | "text/plain": [ 167 | "[2, 4, 6]" 168 | ] 169 | }, 170 | "execution_count": 9, 171 | "metadata": {}, 172 | "output_type": "execute_result" 173 | } 174 | ], 175 | "source": [ 176 | "# Select a range of items between 1 and 7 in increments of 2\n", 177 | "simple[1:7:2]" 178 | ] 179 | }, 180 | { 181 | "cell_type": "code", 182 | "execution_count": 10, 183 | "metadata": {}, 184 | "outputs": [ 185 | { 186 | "data": { 187 | "text/plain": [ 188 | "[6, 4, 2]" 189 | ] 190 | }, 191 | "execution_count": 10, 192 | "metadata": {}, 193 | "output_type": "execute_result" 194 | } 195 | ], 196 | "source": [ 197 | "# Select the same range of items between 1 and 7 in increments of -2 (backwards)\n", 198 | "simple[-13:-18:-2]" 199 | ] 200 | }, 201 | { 202 | "cell_type": "code", 203 | "execution_count": 11, 204 | "metadata": {}, 205 | "outputs": [ 206 | { 207 | "data": { 208 | "text/plain": [ 209 | "[]" 210 | ] 211 | }, 212 | "execution_count": 11, 213 | "metadata": {}, 214 | "output_type": "execute_result" 215 | } 216 | ], 217 | "source": [ 218 | "# Note how the sign of the step matters - this returns an empty list because it says start at 1,\n", 219 | "# go on until 7 and use increments of negative 2, but stepping backwards from 1 can never reach 7,\n", 220 | "# so no items are selected\n", 221 | "simple[1:7:-2]" 222 | ] 223 | }, 224 | { 225 | "cell_type": "code", 226 | "execution_count": 12, 227 | "metadata": {}, 228 | "outputs": [ 229 | { 230 | "data": { 231 | "text/plain": [ 232 | "[]" 233 | ] 234 | }, 235 | "execution_count": 12, 236 | "metadata": {}, 237 | "output_type": "execute_result" 238 | } 239 | ], 240 | "source": [ 241 | "# Similarly here we are saying start at -13 and go forwards by 2, but -18 lies behind -13, so again\n", 242 | "# no items are selected\n", 243 | "simple[-13:-18:2]" 244 | ] 245 | }, 246 | { 247 | "cell_type": "code", 248 | "execution_count": 13, 249 | "metadata": {}, 
250 | "outputs": [], 251 | "source": [ 252 | "# Now replace a list item with a new value (6 > 99)\n", 253 | "simple[-13] = 99" 254 | ] 255 | }, 256 | { 257 | "cell_type": "code", 258 | "execution_count": 14, 259 | "metadata": {}, 260 | "outputs": [ 261 | { 262 | "data": { 263 | "text/plain": [ 264 | "[99, 4, 2]" 265 | ] 266 | }, 267 | "execution_count": 14, 268 | "metadata": {}, 269 | "output_type": "execute_result" 270 | } 271 | ], 272 | "source": [ 273 | "# And check what it looks like now\n", 274 | "simple[-13:-18:-2]" 275 | ] 276 | }, 277 | { 278 | "cell_type": "code", 279 | "execution_count": 15, 280 | "metadata": {}, 281 | "outputs": [ 282 | { 283 | "data": { 284 | "text/plain": [ 285 | "[1, 2, 3, 4, 5, 99, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 909]" 286 | ] 287 | }, 288 | "execution_count": 15, 289 | "metadata": {}, 290 | "output_type": "execute_result" 291 | } 292 | ], 293 | "source": [ 294 | "# Add a number at the end of the list\n", 295 | "simple.append(909)\n", 296 | "simple" 297 | ] 298 | }, 299 | { 300 | "cell_type": "code", 301 | "execution_count": 16, 302 | "metadata": {}, 303 | "outputs": [ 304 | { 305 | "data": { 306 | "text/plain": [ 307 | "[1, 2, 3, 4, 5, 6, 99, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 909]" 308 | ] 309 | }, 310 | "execution_count": 16, 311 | "metadata": {}, 312 | "output_type": "execute_result" 313 | } 314 | ], 315 | "source": [ 316 | "# Add a number in the middle of the list (add number 6 just before position 5)\n", 317 | "simple.insert(5, 6)\n", 318 | "simple" 319 | ] 320 | }, 321 | { 322 | "cell_type": "code", 323 | "execution_count": 17, 324 | "metadata": {}, 325 | "outputs": [ 326 | { 327 | "data": { 328 | "text/plain": [ 329 | "True" 330 | ] 331 | }, 332 | "execution_count": 17, 333 | "metadata": {}, 334 | "output_type": "execute_result" 335 | } 336 | ], 337 | "source": [ 338 | "# Quickly check if a number is somewhere in the list\n", 339 | "99 in simple" 340 | ] 341 | }, 342 | { 343 | "cell_type": "code", 344 | 
"execution_count": 18, 345 | "metadata": {}, 346 | "outputs": [ 347 | { 348 | "data": { 349 | "text/plain": [ 350 | "6" 351 | ] 352 | }, 353 | "execution_count": 18, 354 | "metadata": {}, 355 | "output_type": "execute_result" 356 | } 357 | ], 358 | "source": [ 359 | "# And then check which index position it occurs in the list\n", 360 | "simple.index(99)" 361 | ] 362 | }, 363 | { 364 | "cell_type": "code", 365 | "execution_count": 19, 366 | "metadata": {}, 367 | "outputs": [ 368 | { 369 | "data": { 370 | "text/plain": [ 371 | "909" 372 | ] 373 | }, 374 | "execution_count": 19, 375 | "metadata": {}, 376 | "output_type": "execute_result" 377 | } 378 | ], 379 | "source": [ 380 | "# What is the biggest number in the list?\n", 381 | "max(simple)" 382 | ] 383 | }, 384 | { 385 | "cell_type": "code", 386 | "execution_count": 20, 387 | "metadata": {}, 388 | "outputs": [ 389 | { 390 | "data": { 391 | "text/plain": [ 392 | "1" 393 | ] 394 | }, 395 | "execution_count": 20, 396 | "metadata": {}, 397 | "output_type": "execute_result" 398 | } 399 | ], 400 | "source": [ 401 | "# And the smallest?\n", 402 | "min(simple)" 403 | ] 404 | }, 405 | { 406 | "cell_type": "code", 407 | "execution_count": 21, 408 | "metadata": {}, 409 | "outputs": [], 410 | "source": [ 411 | "# Some more finicky examples" 412 | ] 413 | }, 414 | { 415 | "cell_type": "code", 416 | "execution_count": 22, 417 | "metadata": {}, 418 | "outputs": [ 419 | { 420 | "name": "stdout", 421 | "output_type": "stream", 422 | "text": [ 423 | "Fortune 500\n", 424 | "Think Bike\n", 425 | "Make Love\n", 426 | "Be Careful\n" 427 | ] 428 | } 429 | ], 430 | "source": [ 431 | "List_1 = [\"Be\", \"Make\",\"Think\", \"Fortune\"]\n", 432 | "List_2 = [\"Careful\", \"Love\", \"Bike\", \"500\"]\n", 433 | "if len(List_1) < len(List_2):\n", 434 | " x = len(List_1) + 1\n", 435 | "else:\n", 436 | " x = len(List_2) + 1\n", 437 | "for i in range(1,x):\n", 438 | " print(List_1[-i] + \" \" + List_2[-i])" 439 | ] 440 | }, 441 | { 442 | 
"cell_type": "code", 443 | "execution_count": 23, 444 | "metadata": {}, 445 | "outputs": [ 446 | { 447 | "name": "stdout", 448 | "output_type": "stream", 449 | "text": [ 450 | "Be Careful\n", 451 | "Fortune 500\n", 452 | "Think Bike\n", 453 | "Make Love\n" 454 | ] 455 | } 456 | ], 457 | "source": [ 458 | "List_1 = [\"Be\", \"Make\",\"Think\", \"Fortune\"]\n", 459 | "List_2 = [\"Careful\", \"Love\", \"Bike\", \"500\"]\n", 460 | "if len(List_1) < len(List_2):\n", 461 | " x = len(List_1)\n", 462 | "else:\n", 463 | " x = len(List_2)\n", 464 | "for i in range(x):\n", 465 | " print(List_1[-i] + \" \" + List_2[-i])" 466 | ] 467 | }, 468 | { 469 | "cell_type": "code", 470 | "execution_count": 24, 471 | "metadata": { 472 | "scrolled": true 473 | }, 474 | "outputs": [ 475 | { 476 | "name": "stdout", 477 | "output_type": "stream", 478 | "text": [ 479 | "Fortune 500\n", 480 | "Think Bike\n", 481 | "Make Love\n", 482 | "Be Careful\n" 483 | ] 484 | } 485 | ], 486 | "source": [ 487 | "List_1 = [\"Be\", \"Make\",\"Think\", \"Fortune\"]\n", 488 | "List_2 = [\"Careful\", \"Love\", \"Bike\", \"500\"]\n", 489 | "if len(List_1) < len(List_2):\n", 490 | " x = len(List_1)\n", 491 | "else:\n", 492 | " x = len(List_2)\n", 493 | "for i in range(-1,(-x-1),-1):\n", 494 | " print(List_1[i] + \" \" + List_2[i])" 495 | ] 496 | } 497 | ], 498 | "metadata": { 499 | "kernelspec": { 500 | "display_name": "Python 3", 501 | "language": "python", 502 | "name": "python3" 503 | }, 504 | "language_info": { 505 | "codemirror_mode": { 506 | "name": "ipython", 507 | "version": 3 508 | }, 509 | "file_extension": ".py", 510 | "mimetype": "text/x-python", 511 | "name": "python", 512 | "nbconvert_exporter": "python", 513 | "pygments_lexer": "ipython3", 514 | "version": "3.6.5" 515 | } 516 | }, 517 | "nbformat": 4, 518 | "nbformat_minor": 2 519 | } 520 | -------------------------------------------------------------------------------- /Practice Run - Linear Regression Age vs Blood Pressure.ipynb: 
-------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "Small dataset for linear regression from:
\n", 8 | "http://people.sc.fsu.edu/~jburkardt/datasets/regression/x03.txt" 9 | ] 10 | }, 11 | { 12 | "cell_type": "code", 13 | "execution_count": 1, 14 | "metadata": {}, 15 | "outputs": [], 16 | "source": [ 17 | "import numpy as np\n", 18 | "from scipy import stats\n", 19 | "import pandas as pd\n", 20 | "import matplotlib.pyplot as plt\n", 21 | "%matplotlib inline" 22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "execution_count": 2, 27 | "metadata": {}, 28 | "outputs": [], 29 | "source": [ 30 | "x03 = pd.read_excel(\"x03.xlsx\")\n", 31 | "# I, the index;\n", 32 | "# A0, 1,\n", 33 | "# A1, the age;\n", 34 | "# B, the systolic blood pressure." 35 | ] 36 | }, 37 | { 38 | "cell_type": "code", 39 | "execution_count": 3, 40 | "metadata": {}, 41 | "outputs": [ 42 | { 43 | "data": { 44 | "text/html": [ 45 | "
\n", 46 | "\n", 59 | "\n", 60 | " \n", 61 | " \n", 62 | " \n", 63 | " \n", 64 | " \n", 65 | " \n", 66 | " \n", 67 | " \n", 68 | " \n", 69 | " \n", 70 | " \n", 71 | " \n", 72 | " \n", 73 | " \n", 74 | " \n", 75 | " \n", 76 | " \n", 77 | " \n", 78 | " \n", 79 | " \n", 80 | " \n", 81 | " \n", 82 | " \n", 83 | " \n", 84 | " \n", 85 | " \n", 86 | " \n", 87 | " \n", 88 | " \n", 89 | " \n", 90 | " \n", 91 | " \n", 92 | "
   I  A0  A1    B
0  1   1  39  144
1  2   1  47  220
2  3   1  45  138
\n", 93 | "
" 94 | ], 95 | "text/plain": [ 96 | " I A0 A1 B\n", 97 | "0 1 1 39 144\n", 98 | "1 2 1 47 220\n", 99 | "2 3 1 45 138" 100 | ] 101 | }, 102 | "execution_count": 3, 103 | "metadata": {}, 104 | "output_type": "execute_result" 105 | } 106 | ], 107 | "source": [ 108 | "x03.head(3)" 109 | ] 110 | }, 111 | { 112 | "cell_type": "code", 113 | "execution_count": 4, 114 | "metadata": {}, 115 | "outputs": [ 116 | { 117 | "data": { 118 | "text/plain": [ 119 | "" 120 | ] 121 | }, 122 | "execution_count": 4, 123 | "metadata": {}, 124 | "output_type": "execute_result" 125 | }, 126 | { 127 | "data": { 128 | "image/png": "[PNG image data omitted: scatter plot of A1 (age) against B (systolic blood pressure)]\n", 129 | "text/plain": [ 130 | "
" 131 | ] 132 | }, 133 | "metadata": {}, 134 | "output_type": "display_data" 135 | } 136 | ], 137 | "source": [ 138 | "plt.scatter(x03[\"A1\"], x03[\"B\"])" 139 | ] 140 | }, 141 | { 142 | "cell_type": "code", 143 | "execution_count": 5, 144 | "metadata": {}, 145 | "outputs": [ 146 | { 147 | "data": { 148 | "text/plain": [ 149 | "array([[1. , 0.65756728],\n", 150 | " [0.65756728, 1. ]])" 151 | ] 152 | }, 153 | "execution_count": 5, 154 | "metadata": {}, 155 | "output_type": "execute_result" 156 | } 157 | ], 158 | "source": [ 159 | "np.corrcoef(x03[\"A1\"], x03[\"B\"])" 160 | ] 161 | }, 162 | { 163 | "cell_type": "code", 164 | "execution_count": 6, 165 | "metadata": {}, 166 | "outputs": [ 167 | { 168 | "name": "stdout", 169 | "output_type": "stream", 170 | "text": [ 171 | "0.9708703514427236 98.71471813821842 0.4323947319275954\n" 172 | ] 173 | } 174 | ], 175 | "source": [ 176 | "# linregress() returns 5 values, which we name here for convenience:\n", 177 | "slope, intercept, r_value, p_value, std_err = stats.linregress(x03[\"A1\"], x03[\"B\"])\n", 178 | "\n", 179 | "# slope = m from our equation, intercept = b, r_value = our correlation coefficient (printed squared, i.e. R^2)\n", 180 | "print(slope, intercept, r_value**2)" 181 | ] 182 | }, 183 | { 184 | "cell_type": "code", 185 | "execution_count": 7, 186 | "metadata": {}, 187 | "outputs": [ 188 | { 189 | "data": { 190 | "image/png": 
"[PNG image data omitted: scatter plot of A1 (age) against B (systolic blood pressure) with the fitted regression line in red]\n", 191 | "text/plain": [ 192 | "
" 193 | ] 194 | }, 195 | "metadata": {}, 196 | "output_type": "display_data" 197 | } 198 | ], 199 | "source": [ 200 | "def get_B():\n", 201 | " return slope * x03[\"A1\"] + intercept\n", 202 | "\n", 203 | "plt.scatter(x03[\"A1\"], x03[\"B\"])\n", 204 | "plt.plot(x03[\"A1\"], get_B(), c='r')\n", 205 | "plt.show()" 206 | ] 207 | } 208 | ], 209 | "metadata": { 210 | "kernelspec": { 211 | "display_name": "Python 3", 212 | "language": "python", 213 | "name": "python3" 214 | }, 215 | "language_info": { 216 | "codemirror_mode": { 217 | "name": "ipython", 218 | "version": 3 219 | }, 220 | "file_extension": ".py", 221 | "mimetype": "text/x-python", 222 | "name": "python", 223 | "nbconvert_exporter": "python", 224 | "pygments_lexer": "ipython3", 225 | "version": "3.6.5" 226 | } 227 | }, 228 | "nbformat": 4, 229 | "nbformat_minor": 2 230 | } 231 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # how-to-python 2 | Knowledge is often hard-won, and then quickly forgotten! This repository aims to explain simply, and then serve as an aid to memory. 
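
As a small taste of the notes collected here, the slicing behaviour explored in the positive & negative indexation notebook can be sketched as follows. This is a minimal illustration rather than a cell copied from the notebook; the names `forward`, `backward`, `empty` and `first_not_last` are chosen here purely for the example.

```python
simple = list(range(1, 19))  # [1, 2, ..., 18], as in the notebook

# With a positive step the slice walks forwards, so start must lie before stop.
forward = simple[1:7:2]

# With a negative step the slice walks backwards, so start must lie after stop.
backward = simple[-13:-18:-2]

# If the direction of the step does not lead from start to stop,
# nothing is selected; no error is raised.
empty = simple[1:7:-2]

# Gotcha: -0 is the same as 0, so index -0 is the FIRST item, not the last.
first_not_last = simple[-0]

print(forward, backward, empty, first_not_last)
```

The last line is the reason a loop that counts back from the end of a list should start at -1 rather than at 0.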
3 | -------------------------------------------------------------------------------- /customer_product_list.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/customer_product_list.xlsx -------------------------------------------------------------------------------- /data_structures.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/data_structures.png -------------------------------------------------------------------------------- /flattening.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/flattening.png -------------------------------------------------------------------------------- /img_5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/img_5.png -------------------------------------------------------------------------------- /multiplication.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/multiplication.png -------------------------------------------------------------------------------- /sea_picture.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/sea_picture.jpg -------------------------------------------------------------------------------- /sorted.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/sorted.jpg -------------------------------------------------------------------------------- /take1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/take1.png -------------------------------------------------------------------------------- /take2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/take2.png -------------------------------------------------------------------------------- /tensor_shape_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/tensor_shape_2.png -------------------------------------------------------------------------------- /unsorted.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/unsorted.jpg -------------------------------------------------------------------------------- /weight_matrix_detail_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/weight_matrix_detail_2.png --------------------------------------------------------------------------------