"
180 | ]
181 | },
182 | "metadata": {},
183 | "output_type": "display_data"
184 | }
185 | ],
186 | "source": [
187 | "# Having forced a relationship we can clearly see this in the data\n",
188 | "plt.scatter(heights, weights)"
189 | ]
190 | },
191 | {
192 | "cell_type": "code",
193 | "execution_count": 20,
194 | "metadata": {},
195 | "outputs": [
196 | {
197 | "data": {
198 | "text/plain": [
199 | "array([[ 7.87464896e-03, 7.10461347e-01],\n",
200 | " [ 7.10461347e-01, 6.58789965e+01]])"
201 | ]
202 | },
203 | "execution_count": 20,
204 | "metadata": {},
205 | "output_type": "execute_result"
206 | }
207 | ],
208 | "source": [
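209 | "# Now let's find the co-variance: np.cov returns a 2x2 matrix whose diagonal holds the variances and whose off-diagonal entries are cov(heights, weights)\n",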
210 | "np.cov(heights, weights)"
211 | ]
212 | },
213 | {
214 | "cell_type": "code",
215 | "execution_count": 21,
216 | "metadata": {
217 | "scrolled": true
218 | },
219 | "outputs": [
220 | {
221 | "name": "stdout",
222 | "output_type": "stream",
223 | "text": [
224 | "0.710461347301\n"
225 | ]
226 | }
227 | ],
228 | "source": [
229 | "# We can also find this answer the long way based on the formula:\n",
230 | "# ∑(xi - xavg)(yi - yavg) / (n - 1)\n",
231 | "\n",
232 | "mean_height = np.mean(heights)\n",
233 | "mean_weight = np.mean(weights)\n",
234 | "heights_array = np.asarray(heights) - mean_height  # deviations from the mean height\n",
235 | "weights_array = np.asarray(weights) - mean_weight  # deviations from the mean weight\n",
236 | "numerator = heights_array @ weights_array\n",
237 | "denominator = len(heights) - 1\n",
238 | "covariance = numerator / denominator\n",
239 | "print(covariance)"
240 | ]
241 | },
242 | {
243 | "cell_type": "markdown",
244 | "metadata": {},
245 | "source": [
246 | "Interpreting the result:\n",
247 | "- Is it positive or negative? If it's positive then as one variable increases so does the other (e.g. heights and weights). If it's negative then as one variable increases the other decreases (e.g. a car's age and its resale value).\n",
248 | "- How big is the number in relation to the data? A large value relative to the scale of the data suggests a strong relationship. A very small value relative to the data, say 0.00something, suggests the relationship is probably negligible.\n",
249 | "- By looking at the spread of the data above we can check this understanding...\n",
250 | "But of course the difficulty here is \"what is a big number?\" since we have 2 different units of measure (meters and kilograms)!"
251 | ]
252 | },
253 | {
254 | "cell_type": "markdown",
255 | "metadata": {},
256 | "source": [
257 | "## Correlation\n",
258 | "Maths is Fun has a lovely article on how to calculate correlation: https://www.mathsisfun.com/data/correlation.html (also infinitely intelligible!)\n",
259 | "Correlation normalizes away the different units of measure to give a standardized value between -1 (perfect negative correlation) and 1 (perfect positive correlation), with 0 representing no linear correlation at all."
260 | ]
261 | },
262 | {
263 | "cell_type": "code",
264 | "execution_count": 22,
265 | "metadata": {},
266 | "outputs": [
267 | {
268 | "data": {
269 | "text/plain": [
270 | "array([[ 1. , 0.98639614],\n",
271 | " [ 0.98639614, 1. ]])"
272 | ]
273 | },
274 | "execution_count": 22,
275 | "metadata": {},
276 | "output_type": "execute_result"
277 | }
278 | ],
279 | "source": [
280 | "# Finding the correlation co-efficient is easy enough with numpy!\n",
281 | "np.corrcoef(heights, weights)"
282 | ]
283 | },
284 | {
285 | "cell_type": "code",
286 | "execution_count": 23,
287 | "metadata": {},
288 | "outputs": [
289 | {
290 | "name": "stdout",
291 | "output_type": "stream",
292 | "text": [
293 | "0.986396144201\n"
294 | ]
295 | }
296 | ],
297 | "source": [
298 | "# We can also find this answer the long way by building on the values found for co-variance above. The formula is...\n",
299 | "# ∑(xi - xavg)(yi - yavg) / sqrt(∑(xi - xavg)² * ∑(yi - yavg)²)\n",
300 | "heights_sq = heights_array @ heights_array\n",
301 | "weights_sq = weights_array @ weights_array\n",
302 | "correlation = numerator / np.sqrt(heights_sq * weights_sq)\n",
303 | "print(correlation)"
304 | ]
305 | },
306 | {
307 | "cell_type": "markdown",
308 | "metadata": {},
309 | "source": [
310 | "So we DO indeed have a very high positive correlation, as could be seen in the plot!"
311 | ]
312 | },
313 | {
314 | "cell_type": "markdown",
315 | "metadata": {},
316 | "source": [
317 | "## Linear regression"
318 | ]
319 | },
320 | {
321 | "cell_type": "markdown",
322 | "metadata": {},
323 | "source": [
324 | "So now that we have established that there is a high correlation between heights and weights, we would like to find the line that best fits this relationship. Why? Because then, given the height of any future person, we could predict their probable weight (predicting height from weight would strictly call for a separate regression of height on weight)."
325 | ]
326 | },
327 | {
328 | "cell_type": "markdown",
329 | "metadata": {},
330 | "source": [
331 | "Maths is Fun provides a very easy to understand lesson on how least squares regression is calculated:\n",
332 | "https://www.mathsisfun.com/data/least-squares-regression.html. Essentially, given that the equation of a line is expressed as y = mx + b, we are trying to solve for the constants m and b that give the line of best fit for our data (the one that minimizes the sum of the squared vertical distances from the data points to the line)."
333 | ]
334 | },
335 | {
336 | "cell_type": "code",
337 | "execution_count": 36,
338 | "metadata": {},
339 | "outputs": [
340 | {
341 | "name": "stdout",
342 | "output_type": "stream",
343 | "text": [
344 | "90.2213356483 -88.82225037 0.986396144201\n"
345 | ]
346 | }
347 | ],
348 | "source": [
349 | "# We'll need the stats module from scipy\n",
350 | "from scipy import stats\n",
351 | "\n",
352 | "# linregress() returns 5 values, which we name here for convenience:\n",
353 | "slope, intercept, r_value, p_value, std_err = stats.linregress(heights, weights)\n",
354 | "\n",
355 | "# Slope = m from our equation, Intercept = b, r_value = our correlation co-efficient\n",
356 | "print(slope, intercept, r_value)"
357 | ]
358 | },
359 | {
360 | "cell_type": "code",
361 | "execution_count": 37,
362 | "metadata": {},
363 | "outputs": [
364 | {
365 | "data": {
366 | "text/plain": [
367 | "0.9729773532953242"
368 | ]
369 | },
370 | "execution_count": 37,
371 | "metadata": {},
372 | "output_type": "execute_result"
373 | }
374 | ],
375 | "source": [
376 | "# By squaring r_value we get the co-efficient of determination\n",
377 | "r_value ** 2"
378 | ]
379 | },
380 | {
381 | "cell_type": "markdown",
382 | "metadata": {},
383 | "source": [
384 | "A quick note here: what is the difference between the co-efficient of correlation and the co-efficient of determination?\n",
385 | "It is very nicely explained here: http://blog.uwgb.edu/bansalg/statistics-data-analytics/linear-regression/what-is-the-difference-between-coefficient-of-determination-and-coefficient-of-correlation/ but essentially:\n",
386 | "- The co-efficient of correlation is a number between -1 (perfect negative correlation) and 1 (perfect positive correlation) which indicates the direction and strength of the linear relationship between the 2 sets of values\n",
387 | "- The co-efficient of determination is a number between 0 and 1 (of course, because it has been squared!) and indicates how good a fit the line is to the original data: it is the proportion of the variation in weights that the line explains, so the higher the better"
388 | ]
389 | },
390 | {
391 | "cell_type": "code",
392 | "execution_count": 34,
393 | "metadata": {
394 | "scrolled": true
395 | },
396 | "outputs": [
397 | {
398 | "data": {
399 | "image/png": "(base64 image data omitted: scatter plot of heights against weights with the fitted regression line drawn in red)",
400 | "text/plain": [
401 | ""
402 | ]
403 | },
404 | "metadata": {},
405 | "output_type": "display_data"
406 | }
407 | ],
408 | "source": [
409 | "# Here we create a little function which takes heights as inputs in order to predict corresponding weights values:\n",
410 | "def get_weights(heights):\n",
411 | "    return slope * heights + intercept\n",
412 | "\n",
413 | "plt.scatter(heights, weights)\n",
414 | "plt.plot(heights, get_weights(heights), c='r')\n",
415 | "plt.show()"
416 | ]
417 | }
418 | ],
419 | "metadata": {
420 | "kernelspec": {
421 | "display_name": "Python 3",
422 | "language": "python",
423 | "name": "python3"
424 | },
425 | "language_info": {
426 | "codemirror_mode": {
427 | "name": "ipython",
428 | "version": 3
429 | },
430 | "file_extension": ".py",
431 | "mimetype": "text/x-python",
432 | "name": "python",
433 | "nbconvert_exporter": "python",
434 | "pygments_lexer": "ipython3",
435 | "version": "3.6.3"
436 | }
437 | },
438 | "nbformat": 4,
439 | "nbformat_minor": 2
440 | }
441 |
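The three quantities computed in the notebook above (co-variance, correlation and the least squares line) can also be sketched in plain Python without numpy or scipy. This is only a minimal illustration: the function names and the small data sets are made up for the example and are not the notebook's heights and weights. It relies on the fact that the least squares slope equals cov(x, y) / var(x), which agrees with the notebook's own output (0.7105 / 0.00787 ≈ 90.2).

```python
import math


def covariance(xs, ys):
    """Sample co-variance: sum of products of deviations, divided by (n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Pearson correlation: co-variance scaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


def least_squares(xs, ys):
    """Slope m and intercept b of the best-fit line y = m * x + b."""
    slope = covariance(xs, ys) / covariance(xs, xs)
    intercept = sum(ys) / len(ys) - slope * sum(xs) / len(xs)
    return slope, intercept


# Hypothetical data lying exactly on the line y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = least_squares(xs, ys)
r = correlation(xs, ys)
print(slope, intercept, r)
```

Because the made-up points sit exactly on a line, the correlation comes out at 1 (up to floating point rounding) and its square, the co-efficient of determination, is 1 as well. Real data such as the heights and weights above will give values below 1.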
--------------------------------------------------------------------------------
/How it works - Pandas, data manipulation.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 1,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "import pandas as pd"
10 | ]
11 | },
12 | {
13 | "cell_type": "code",
14 | "execution_count": 2,
15 | "metadata": {},
16 | "outputs": [
17 | {
18 | "data": {
78 | "text/plain": [
79 | " Make Color Capacity\n",
80 | "Jane Ford Blue 1.6l\n",
81 | "John BMW Grey 2.0l\n",
82 | "June Mini Red 1.6l\n",
83 | "Jim Mercedes White 2.2l\n",
84 | "Jay Toyota White 1.2l"
85 | ]
86 | },
87 | "execution_count": 2,
88 | "metadata": {},
89 | "output_type": "execute_result"
90 | }
91 | ],
92 | "source": [
93 | "list1 = [\"Jane\", \"John\", \"June\", \"Jim\", \"Jay\"]\n",
94 | "list2 = [\"Ford\", \"BMW\", \"Mini\", \"Mercedes\", \"Toyota\"]\n",
95 | "list3 = [\"Blue\", \"Grey\", \"Red\", \"White\", \"White\"]\n",
96 | "list4 = [\"1.6l\", \"2.0l\", \"1.6l\", \"2.2l\", \"1.2l\"]\n",
97 | "df1 = pd.DataFrame({\"Make\":list2, \"Color\":list3, \"Capacity\":list4}, index = list1)\n",
98 | "df1"
99 | ]
100 | },
101 | {
102 | "cell_type": "markdown",
103 | "metadata": {},
104 | "source": [
105 | "## Updating values via indexation"
106 | ]
107 | },
108 | {
109 | "cell_type": "code",
110 | "execution_count": 3,
111 | "metadata": {},
112 | "outputs": [
113 | {
114 | "data": {
174 | "text/plain": [
175 | " Capacity Color Make\n",
176 | "Jane 1.6l Orange Ford\n",
177 | "John 2.0l Grey BMW\n",
178 | "June 1.6l Red Mini\n",
179 | "Jim 2.2l White Mercedes\n",
180 | "Jay 1.2l White Toyota"
181 | ]
182 | },
183 | "execution_count": 3,
184 | "metadata": {},
185 | "output_type": "execute_result"
186 | }
187 | ],
188 | "source": [
189 | "df1.loc[\"Jane\", \"Color\"] = \"Orange\"  # \"Jane\" selects the row to update,\n",
190 | "                                     # \"Color\" selects the column to update\n",
191 | "df1"
192 | ]
193 | },
194 | {
195 | "cell_type": "code",
196 | "execution_count": 4,
197 | "metadata": {},
198 | "outputs": [
199 | {
200 | "data": {
260 | "text/plain": [
261 | " Capacity Color Make\n",
262 | "Jane 1.6l Orange Ford\n",
263 | "John 2.0l Grey BMW\n",
264 | "June 1.6l Red Mini\n",
265 | "Jim 2.2l Off-White Mercedes\n",
266 | "Jay 1.2l Off-White Toyota"
267 | ]
268 | },
269 | "execution_count": 4,
270 | "metadata": {},
271 | "output_type": "execute_result"
272 | }
273 | ],
274 | "source": [
275 | "df1.loc[df1[\"Color\"].isin([\"White\"]), \"Color\"] = \"Off-White\" # The first part gives us the rows to update,\n",
276 | " # The second part tells us which column to update\n",
277 | "df1"
278 | ]
279 | },
280 | {
281 | "cell_type": "markdown",
282 | "metadata": {},
283 | "source": [
284 | "## Replacing values"
285 | ]
286 | },
287 | {
288 | "cell_type": "markdown",
289 | "metadata": {},
290 | "source": [
291 | "### An example across the entire dataframe, replacing entire cell values"
292 | ]
293 | },
294 | {
295 | "cell_type": "code",
296 | "execution_count": 5,
297 | "metadata": {},
298 | "outputs": [
299 | {
300 | "data": {
360 | "text/plain": [
361 | " Capacity Color Make\n",
362 | "Jane 1.6l Orange Ford\n",
363 | "John 2.0l Silver BMW\n",
364 | "June 1.6l Red Mini\n",
365 | "Jim 2.2l White Mercedes\n",
366 | "Jay 1.2l White Toyota"
367 | ]
368 | },
369 | "execution_count": 5,
370 | "metadata": {},
371 | "output_type": "execute_result"
372 | }
373 | ],
374 | "source": [
375 | "df1.replace([\"Off-White\", \"Grey\"], [\"White\", \"Silver\"], inplace = True)\n",
376 | "df1"
377 | ]
378 | },
379 | {
380 | "cell_type": "markdown",
381 | "metadata": {},
382 | "source": [
383 | "### An example for a single series, replacing partial cell values"
384 | ]
385 | },
386 | {
387 | "cell_type": "code",
388 | "execution_count": 6,
389 | "metadata": {},
390 | "outputs": [
391 | {
392 | "data": {
452 | "text/plain": [
453 | " Capacity Color Make\n",
454 | "Jane 1.6 Orange Ford\n",
455 | "John 2.0 Silver BMW\n",
456 | "June 1.6 Red Mini\n",
457 | "Jim 2.2 White Mercedes\n",
458 | "Jay 1.2 White Toyota"
459 | ]
460 | },
461 | "execution_count": 6,
462 | "metadata": {},
463 | "output_type": "execute_result"
464 | }
465 | ],
466 | "source": [
467 | "df1[\"Capacity\"] = df1[\"Capacity\"].replace({'l': ''}, regex=True)  # assign back rather than replacing in place on a column selection\n",
468 | "df1"
469 | ]
470 | },
471 | {
472 | "cell_type": "markdown",
473 | "metadata": {},
474 | "source": [
475 | "## Changing the type of a series"
476 | ]
477 | },
478 | {
479 | "cell_type": "markdown",
480 | "metadata": {},
481 | "source": [
482 | "### Using astype()"
483 | ]
484 | },
485 | {
486 | "cell_type": "code",
487 | "execution_count": 7,
488 | "metadata": {},
489 | "outputs": [
490 | {
491 | "name": "stdout",
492 | "output_type": "stream",
493 | "text": [
494 | "\n",
495 | "Index: 5 entries, Jane to Jay\n",
496 | "Data columns (total 3 columns):\n",
497 | "Capacity 5 non-null object\n",
498 | "Color 5 non-null object\n",
499 | "Make 5 non-null object\n",
500 | "dtypes: object(3)\n",
501 | "memory usage: 320.0+ bytes\n"
502 | ]
503 | }
504 | ],
505 | "source": [
506 | "df1.info() # All 3 series are currently dtype \"object\" (here, Python strings)"
507 | ]
508 | },
509 | {
510 | "cell_type": "code",
511 | "execution_count": 8,
512 | "metadata": {},
513 | "outputs": [
514 | {
515 | "name": "stdout",
516 | "output_type": "stream",
517 | "text": [
518 | "\n",
519 | "Index: 5 entries, Jane to Jay\n",
520 | "Data columns (total 3 columns):\n",
521 | "Capacity 5 non-null float64\n",
522 | "Color 5 non-null category\n",
523 | "Make 5 non-null object\n",
524 | "dtypes: category(1), float64(1), object(1)\n",
525 | "memory usage: 477.0+ bytes\n"
526 | ]
527 | }
528 | ],
529 | "source": [
530 | "df1[\"Color\"] = df1[\"Color\"].astype(\"category\") # Convert color to a categorical variable\n",
531 | "df1[\"Capacity\"] = df1[\"Capacity\"].astype(\"float\") # Convert capacity to a float\n",
532 | "\n",
533 | "df1.info() # Here we see the results of the updates - the data didn't change\n",
534 | " # but the format did, so that now we can e.g. perform calcs on Capacity\n",
535 | " # which we could not have done while it was classified as an Object"
536 | ]
537 | },
538 | {
539 | "cell_type": "markdown",
540 | "metadata": {},
541 | "source": [
542 | "### Using pd.to_numeric()"
543 | ]
544 | },
545 | {
546 | "cell_type": "code",
547 | "execution_count": 9,
548 | "metadata": {},
549 | "outputs": [
550 | {
551 | "data": {
611 | "text/plain": [
612 | " Capacity Color Make\n",
613 | "Jane 1.6l Orange Ford\n",
614 | "John 2 Silver BMW\n",
615 | "June 1.6 Red Mini\n",
616 | "Jim 2.2 White Mercedes\n",
617 | "Jay 1.2 White Toyota"
618 | ]
619 | },
620 | "execution_count": 9,
621 | "metadata": {},
622 | "output_type": "execute_result"
623 | }
624 | ],
625 | "source": [
626 | "df1[\"Capacity\"] = df1[\"Capacity\"].astype(\"object\") # Now let's put Capacity back\n",
627 | "df1.loc[\"Jane\", \"Capacity\"] = \"1.6l\" # And then introduce a non-numeric value in one field\n",
628 | "df1"
629 | ]
630 | },
631 | {
632 | "cell_type": "code",
633 | "execution_count": null,
634 | "metadata": {},
635 | "outputs": [],
636 | "source": [
637 | "df1[\"Capacity\"] = df1[\"Capacity\"].astype(\"float\") # This line now raises a ValueError\n",
638 | " # because the \"l\" in \"1.6l\" can't be converted to a float"
639 | ]
640 | },
641 | {
642 | "cell_type": "code",
643 | "execution_count": 10,
644 | "metadata": {},
645 | "outputs": [
646 | {
647 | "data": {
707 | "text/plain": [
708 | " Capacity Color Make\n",
709 | "Jane NaN Orange Ford\n",
710 | "John 2.0 Silver BMW\n",
711 | "June 1.6 Red Mini\n",
712 | "Jim 2.2 White Mercedes\n",
713 | "Jay 1.2 White Toyota"
714 | ]
715 | },
716 | "execution_count": 10,
717 | "metadata": {},
718 | "output_type": "execute_result"
719 | }
720 | ],
721 | "source": [
722 | "df1[\"Capacity\"] = pd.to_numeric(df1[\"Capacity\"], errors = \"coerce\")\n",
723 | "df1\n",
724 | "# This is an alternative method to convert data types - the argument errors = \"coerce\" \n",
725 | "# is pretty handy if your data has some exceptions in it, fills NaN where no conversion is possible\n",
726 | "# pd.to_datetime() also has errors = \"coerce\" which can be useful"
727 | ]
728 | },
729 | {
730 | "cell_type": "markdown",
731 | "metadata": {},
732 | "source": [
733 | "## Dealing with null values"
734 | ]
735 | },
736 | {
737 | "cell_type": "markdown",
738 | "metadata": {},
739 | "source": [
740 | "### Filling with another default value"
741 | ]
742 | },
743 | {
744 | "cell_type": "code",
745 | "execution_count": 11,
746 | "metadata": {},
747 | "outputs": [
748 | {
749 | "data": {
809 | "text/plain": [
810 | " Capacity Color Make\n",
811 | "Jane 0.0 Orange Ford\n",
812 | "John 2.0 Silver BMW\n",
813 | "June 1.6 Red Mini\n",
814 | "Jim 2.2 White Mercedes\n",
815 | "Jay 1.2 White Toyota"
816 | ]
817 | },
818 | "execution_count": 11,
819 | "metadata": {},
820 | "output_type": "execute_result"
821 | }
822 | ],
823 | "source": [
824 | "df1[\"Capacity\"] = df1[\"Capacity\"].fillna(0) # Fills the NaN values in the series with the specified value;\n",
825 | " # fillna() also works on an entire dataframe (which would seldom make sense!)\n",
826 | "df1"
827 | ]
828 | },
829 | {
830 | "cell_type": "markdown",
831 | "metadata": {},
832 | "source": [
833 | "### Removing rows with NaN"
834 | ]
835 | },
836 | {
837 | "cell_type": "code",
838 | "execution_count": 12,
839 | "metadata": {},
840 | "outputs": [],
841 | "source": [
842 | "# Here are some additional ways to deal with them:\n",
843 | "# df.dropna() Removes ALL rows in the entire dataframe with \n",
844 | "# one or more null values\n",
845 | "# df.dropna(how = 'all') Removes only rows where all columns contain null values\n",
846 | "# df.dropna(subset = [\"Column name\"]) Removes rows only where there is a null value in the \n",
847 | "# specified column name"
848 | ]
849 | },
850 | {
851 | "cell_type": "markdown",
852 | "metadata": {},
853 | "source": [
854 | "## Altering the shape of the dataframe"
855 | ]
856 | },
857 | {
858 | "cell_type": "markdown",
859 | "metadata": {},
860 | "source": [
861 | "### Adding columns quickly"
862 | ]
863 | },
864 | {
865 | "cell_type": "code",
866 | "execution_count": 13,
867 | "metadata": {},
868 | "outputs": [
869 | {
870 | "data": {
936 | "text/plain": [
937 | " Capacity Color Make Model\n",
938 | "Jane 0.0 Orange Ford \n",
939 | "John 2.0 Silver BMW \n",
940 | "June 1.6 Red Mini \n",
941 | "Jim 2.2 White Mercedes \n",
942 | "Jay 1.2 White Toyota "
943 | ]
944 | },
945 | "execution_count": 13,
946 | "metadata": {},
947 | "output_type": "execute_result"
948 | }
949 | ],
950 | "source": [
951 | "df1[\"Model\"] = \"\" # Assigning a single value fills every row of the new column\n",
952 | "df1"
953 | ]
954 | },
955 | {
956 | "cell_type": "markdown",
957 | "metadata": {},
958 | "source": [
959 | "### Adding columns at a specified location"
960 | ]
961 | },
962 | {
963 | "cell_type": "code",
964 | "execution_count": 14,
965 | "metadata": {},
966 | "outputs": [
967 | {
968 | "data": {
1040 | "text/plain": [
1041 | " Capacity Service Interval Color Make Model\n",
1042 | "Jane 0.0 20000km Orange Ford \n",
1043 | "John 2.0 20000km Silver BMW \n",
1044 | "June 1.6 20000km Red Mini \n",
1045 | "Jim 2.2 20000km White Mercedes \n",
1046 | "Jay 1.2 20000km White Toyota "
1047 | ]
1048 | },
1049 | "execution_count": 14,
1050 | "metadata": {},
1051 | "output_type": "execute_result"
1052 | }
1053 | ],
1054 | "source": [
1055 | "df1.insert(1, \"Service Interval\", \"20000km\")\n",
1056 | "df1"
1057 | ]
1058 | },
1059 | {
1060 | "cell_type": "markdown",
1061 | "metadata": {},
1062 | "source": [
1063 | "### Adding rows"
1064 | ]
1065 | },
1066 | {
1067 | "cell_type": "code",
1068 | "execution_count": 15,
1069 | "metadata": {},
1070 | "outputs": [
1071 | {
1072 | "data": {
1152 | "text/plain": [
1153 | " Capacity Service Interval Color Make Model\n",
1154 | "Jane 0.0 20000km Orange Ford \n",
1155 | "John 2.0 20000km Silver BMW \n",
1156 | "June 1.6 20000km Red Mini \n",
1157 | "Jim 2.2 20000km White Mercedes \n",
1158 | "Jay 1.2 20000km White Toyota \n",
1159 | "James 1.6 NaN Blue Honda Jazz"
1160 | ]
1161 | },
1162 | "execution_count": 15,
1163 | "metadata": {},
1164 | "output_type": "execute_result"
1165 | }
1166 | ],
1167 | "source": [
1168 | "extra_row = pd.Series({\"Capacity\": 1.6, \"Color\": \"Blue\", \"Make\": \"Honda\", \"Model\": \"Jazz\"})\n",
1169 | "extra_row.name = \"James\"\n",
1170 | "df1 = pd.concat([df1, extra_row.to_frame().T]) # DataFrame.append() was removed in pandas 2.0; pd.concat() replaces it\n",
1171 | "df1"
1172 | ]
1173 | },
1174 | {
1175 | "cell_type": "markdown",
1176 | "metadata": {},
1177 | "source": [
1178 | "### Removing columns"
1179 | ]
1180 | },
1181 | {
1182 | "cell_type": "code",
1183 | "execution_count": 16,
1184 | "metadata": {},
1185 | "outputs": [
1186 | {
1187 | "data": {
1260 | "text/plain": [
1261 | " Capacity Color Make Model\n",
1262 | "Jane 0.0 Orange Ford \n",
1263 | "John 2.0 Silver BMW \n",
1264 | "June 1.6 Red Mini \n",
1265 | "Jim 2.2 White Mercedes \n",
1266 | "Jay 1.2 White Toyota \n",
1267 | "James 1.6 Blue Honda Jazz"
1268 | ]
1269 | },
1270 | "execution_count": 16,
1271 | "metadata": {},
1272 | "output_type": "execute_result"
1273 | }
1274 | ],
1275 | "source": [
1276 | "df1.drop(\"Service Interval\", axis = 1, inplace = True) # axis = 1 says you're looking at columns\n",
1277 | "df1"
1278 | ]
1279 | },
1280 | {
1281 | "cell_type": "markdown",
1282 | "metadata": {},
1283 | "source": [
1284 | "### Removing rows"
1285 | ]
1286 | },
1287 | {
1288 | "cell_type": "code",
1289 | "execution_count": 17,
1290 | "metadata": {},
1291 | "outputs": [
1292 | {
1293 | "data": {
1359 | "text/plain": [
1360 | " Capacity Color Make Model\n",
1361 | "Jane 0.0 Orange Ford \n",
1362 | "June 1.6 Red Mini \n",
1363 | "Jim 2.2 White Mercedes \n",
1364 | "Jay 1.2 White Toyota \n",
1365 | "James 1.6 Blue Honda Jazz"
1366 | ]
1367 | },
1368 | "execution_count": 17,
1369 | "metadata": {},
1370 | "output_type": "execute_result"
1371 | }
1372 | ],
1373 | "source": [
1374 | "df1.drop(\"John\", axis = 0, inplace = True) # axis = 0 says you're looking at rows\n",
1375 | "df1"
1376 | ]
1377 | },
1378 | {
1379 | "cell_type": "code",
1380 | "execution_count": 18,
1381 | "metadata": {},
1382 | "outputs": [
1383 | {
1384 | "data": {
1436 | "text/plain": [
1437 | " Capacity Color Make Model\n",
1438 | "Jane 0.0 Orange Ford \n",
1439 | "June 1.6 Red Mini \n",
1440 | "James 1.6 Blue Honda Jazz"
1441 | ]
1442 | },
1443 | "execution_count": 18,
1444 | "metadata": {},
1445 | "output_type": "execute_result"
1446 | }
1447 | ],
1448 | "source": [
1449 | "df1 = df1[df1[\"Color\"] != \"White\"] # a simple way to remove rows based on a condition\n",
1450 | "df1"
1451 | ]
1452 | },
1453 | {
1454 | "cell_type": "markdown",
1455 | "metadata": {},
1456 | "source": [
1457 | "### Transposing the data"
1458 | ]
1459 | },
1460 | {
1461 | "cell_type": "code",
1462 | "execution_count": 19,
1463 | "metadata": {},
1464 | "outputs": [
1465 | {
1466 | "data": {
1520 | "text/plain": [
1521 | " Jane June James\n",
1522 | "Capacity 0 1.6 1.6\n",
1523 | "Color Orange Red Blue\n",
1524 | "Make Ford Mini Honda\n",
1525 | "Model Jazz"
1526 | ]
1527 | },
1528 | "execution_count": 19,
1529 | "metadata": {},
1530 | "output_type": "execute_result"
1531 | }
1532 | ],
1533 | "source": [
1534 | "df1 = df1.transpose() # flips the data around if it's more convenient\n",
1535 | "df1"
1536 | ]
1537 | }
1538 | ],
1539 | "metadata": {
1540 | "kernelspec": {
1541 | "display_name": "Python 3",
1542 | "language": "python",
1543 | "name": "python3"
1544 | },
1545 | "language_info": {
1546 | "codemirror_mode": {
1547 | "name": "ipython",
1548 | "version": 3
1549 | },
1550 | "file_extension": ".py",
1551 | "mimetype": "text/x-python",
1552 | "name": "python",
1553 | "nbconvert_exporter": "python",
1554 | "pygments_lexer": "ipython3",
1555 | "version": "3.6.5"
1556 | }
1557 | },
1558 | "nbformat": 4,
1559 | "nbformat_minor": 2
1560 | }
1561 |
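The cleaning steps in the notebook above (coercing to numeric, handling the resulting NaN, adding a row) can be sketched as one standalone script. This is a minimal sketch, not the notebook's own code: the sample table is re-created by hand, and the names `dropped`, `filled`, and `extended` are hypothetical. It assumes a recent pandas release, where `pd.concat()` replaces the removed `DataFrame.append()`.

```python
import pandas as pd

# Hypothetical sample data mirroring the notebook's cars table
df1 = pd.DataFrame(
    {
        "Make": ["Ford", "BMW", "Mini", "Mercedes", "Toyota"],
        "Color": ["Orange", "Silver", "Red", "White", "White"],
        "Capacity": ["1.6l", "2", "1.6", "2.2", "1.2"],
    },
    index=["Jane", "John", "June", "Jim", "Jay"],
)

# errors="coerce" turns values that cannot be parsed ("1.6l") into NaN
df1["Capacity"] = pd.to_numeric(df1["Capacity"], errors="coerce")

# Option 1: drop rows that have a null in the chosen column
dropped = df1.dropna(subset=["Capacity"])

# Option 2: fill the nulls with a default value
# (plain assignment instead of a chained inplace call)
filled = df1.copy()
filled["Capacity"] = filled["Capacity"].fillna(0)

# Adding a row: build a one-row DataFrame and concatenate it
extra_row = pd.DataFrame(
    {"Make": ["Honda"], "Color": ["Blue"], "Capacity": [1.6]},
    index=["James"],
)
extended = pd.concat([filled, extra_row])

print(extended)
```

Whether to drop or fill depends on the data: filling with 0 keeps the row but plants a value that would distort an average of Capacity, whereas dropping loses the row's other columns.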
--------------------------------------------------------------------------------
/How it works - Pandas, data selection.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# How it works - Pandas, data selection"
8 | ]
9 | },
10 | {
11 | "cell_type": "code",
12 | "execution_count": 1,
13 | "metadata": {},
14 | "outputs": [],
15 | "source": [
16 | "import pandas as pd"
17 | ]
18 | },
19 | {
20 | "cell_type": "code",
21 | "execution_count": 2,
22 | "metadata": {},
23 | "outputs": [],
24 | "source": [
25 | "list1 = [\"Jane\", \"John\", \"June\", \"Jim\", \"Jay\"]\n",
26 | "list2 = [\"Ford\", \"BMW\", \"Mini\", \"Mercedes\", \"Toyota\"]\n",
27 | "list3 = [\"Blue\", \"Grey\", \"Red\", \"White\", \"White\"]\n",
28 | "list4 = [\"1.6l\", \"2.0l\", \"1.6l\", \"2.2l\", \"1.2l\"]\n",
29 | "df = pd.DataFrame({\"Make\":list2, \"Color\":list3, \"Capacity\":list4}, \n",
30 | " index = list1)"
31 | ]
32 | },
33 | {
34 | "cell_type": "code",
35 | "execution_count": 3,
36 | "metadata": {},
37 | "outputs": [
38 | {
39 | "data": {
99 | "text/plain": [
100 | " Make Color Capacity\n",
101 | "Jane Ford Blue 1.6l\n",
102 | "John BMW Grey 2.0l\n",
103 | "June Mini Red 1.6l\n",
104 | "Jim Mercedes White 2.2l\n",
105 | "Jay Toyota White 1.2l"
106 | ]
107 | },
108 | "execution_count": 3,
109 | "metadata": {},
110 | "output_type": "execute_result"
111 | }
112 | ],
113 | "source": [
114 | "df"
115 | ]
116 | },
117 | {
118 | "cell_type": "markdown",
119 | "metadata": {},
120 | "source": [
121 | "## Selection by column"
122 | ]
123 | },
124 | {
125 | "cell_type": "code",
126 | "execution_count": 4,
127 | "metadata": {},
128 | "outputs": [
129 | {
130 | "data": {
131 | "text/plain": [
132 | "Jane 1.6l\n",
133 | "John 2.0l\n",
134 | "June 1.6l\n",
135 | "Jim 2.2l\n",
136 | "Jay 1.2l\n",
137 | "Name: Capacity, dtype: object"
138 | ]
139 | },
140 | "execution_count": 4,
141 | "metadata": {},
142 | "output_type": "execute_result"
143 | }
144 | ],
145 | "source": [
146 | "df[\"Capacity\"] # Selection of the SERIES by the column name (note the single bracket)"
147 | ]
148 | },
149 | {
150 | "cell_type": "code",
151 | "execution_count": 5,
152 | "metadata": {},
153 | "outputs": [
154 | {
155 | "data": {
203 | "text/plain": [
204 | " Capacity\n",
205 | "Jane 1.6l\n",
206 | "John 2.0l\n",
207 | "June 1.6l\n",
208 | "Jim 2.2l\n",
209 | "Jay 1.2l"
210 | ]
211 | },
212 | "execution_count": 5,
213 | "metadata": {},
214 | "output_type": "execute_result"
215 | }
216 | ],
217 | "source": [
218 | "df[[\"Capacity\"]] # Selection of the DATAFRAME by column name (note the double bracket)"
219 | ]
220 | },
221 | {
222 | "cell_type": "code",
223 | "execution_count": 6,
224 | "metadata": {},
225 | "outputs": [
226 | {
227 | "data": {
281 | "text/plain": [
282 | " Color Capacity\n",
283 | "Jane Blue 1.6l\n",
284 | "John Grey 2.0l\n",
285 | "June Red 1.6l\n",
286 | "Jim White 2.2l\n",
287 | "Jay White 1.2l"
288 | ]
289 | },
290 | "execution_count": 6,
291 | "metadata": {},
292 | "output_type": "execute_result"
293 | }
294 | ],
295 | "source": [
296 | "df[[\"Color\", \"Capacity\"]] # Selection by multiple column names (note the new listed order)"
297 | ]
298 | },
299 | {
300 | "cell_type": "code",
301 | "execution_count": 7,
302 | "metadata": {},
303 | "outputs": [
304 | {
305 | "data": {
359 | "text/plain": [
360 | " Color Capacity\n",
361 | "Jane Blue 1.6l\n",
362 | "John Grey 2.0l\n",
363 | "June Red 1.6l\n",
364 | "Jim White 2.2l\n",
365 | "Jay White 1.2l"
366 | ]
367 | },
368 | "execution_count": 7,
369 | "metadata": {},
370 | "output_type": "execute_result"
371 | }
372 | ],
373 | "source": [
374 | "df.loc[:,[\"Color\", \"Capacity\"]] # Selection by column names using the .loc method (columns is the 2nd argument)"
375 | ]
376 | },
377 | {
378 | "cell_type": "code",
379 | "execution_count": 8,
380 | "metadata": {},
381 | "outputs": [
382 | {
383 | "data": {
437 | "text/plain": [
438 | " Color Make\n",
439 | "Jane Blue Ford\n",
440 | "John Grey BMW\n",
441 | "June Red Mini\n",
442 | "Jim White Mercedes\n",
443 | "Jay White Toyota"
444 | ]
445 | },
446 | "execution_count": 8,
447 | "metadata": {},
448 | "output_type": "execute_result"
449 | }
450 | ],
451 | "source": [
452 | "df.iloc[:,[1, 0]] # Selection by column indices using the .iloc method (columns is the 2nd argument)"
453 | ]
454 | },
455 | {
456 | "cell_type": "markdown",
457 | "metadata": {},
458 | "source": [
459 | "## Selection by row"
460 | ]
461 | },
462 | {
463 | "cell_type": "code",
464 | "execution_count": 9,
465 | "metadata": {},
466 | "outputs": [
467 | {
468 | "data": {
510 | "text/plain": [
511 | " Make Color Capacity\n",
512 | "Jane Ford Blue 1.6l\n",
513 | "June Mini Red 1.6l"
514 | ]
515 | },
516 | "execution_count": 9,
517 | "metadata": {},
518 | "output_type": "execute_result"
519 | }
520 | ],
521 | "source": [
522 | "df.loc[[\"Jane\", \"June\"],:] # Selection by row names using the .loc method (rows is the 1st argument)"
523 | ]
524 | },
525 | {
526 | "cell_type": "code",
527 | "execution_count": 10,
528 | "metadata": {},
529 | "outputs": [
530 | {
531 | "data": {
573 | "text/plain": [
574 | " Make Color Capacity\n",
575 | "Jane Ford Blue 1.6l\n",
576 | "June Mini Red 1.6l"
577 | ]
578 | },
579 | "execution_count": 10,
580 | "metadata": {},
581 | "output_type": "execute_result"
582 | }
583 | ],
584 | "source": [
585 | "df.iloc[[0,2],:] # Selection by row indices using the .iloc method (rows is the 1st argument)"
586 | ]
587 | },
588 | {
589 | "cell_type": "markdown",
590 | "metadata": {},
591 | "source": [
592 | "## Selection by row and column"
593 | ]
594 | },
595 | {
596 | "cell_type": "code",
597 | "execution_count": 11,
598 | "metadata": {},
599 | "outputs": [
600 | {
601 | "data": {
640 | "text/plain": [
641 | " Capacity Make\n",
642 | "Jane 1.6l Ford\n",
643 | "June 1.6l Mini"
644 | ]
645 | },
646 | "execution_count": 11,
647 | "metadata": {},
648 | "output_type": "execute_result"
649 | }
650 | ],
651 | "source": [
652 | "df.loc[[\"Jane\", \"June\"],[\"Capacity\", \"Make\"]] # Selection by row and column labels using the .loc method (usually the more readable choice)"
653 | ]
654 | },
655 | {
656 | "cell_type": "code",
657 | "execution_count": 12,
658 | "metadata": {},
659 | "outputs": [
660 | {
661 | "data": {
700 | "text/plain": [
701 | " Make Capacity\n",
702 | "Jane Ford 1.6l\n",
703 | "John BMW 2.0l"
704 | ]
705 | },
706 | "execution_count": 12,
707 | "metadata": {},
708 | "output_type": "execute_result"
709 | }
710 | ],
711 | "source": [
712 | "df.iloc[[0,1],[0,2]] # Selection by row and column positions using the .iloc method"
713 | ]
714 | },
715 | {
716 | "cell_type": "markdown",
717 | "metadata": {},
718 | "source": [
719 | "## Selection by filter"
720 | ]
721 | },
722 | {
723 | "cell_type": "code",
724 | "execution_count": 13,
725 | "metadata": {},
726 | "outputs": [
727 | {
728 | "data": {
770 | "text/plain": [
771 | " Make Color Capacity\n",
772 | "Jim Mercedes White 2.2l\n",
773 | "Jay Toyota White 1.2l"
774 | ]
775 | },
776 | "execution_count": 13,
777 | "metadata": {},
778 | "output_type": "execute_result"
779 | }
780 | ],
781 | "source": [
782 | "df[df[\"Color\"] == \"White\"] # Notice that we essentially filter on a series and then apply the result to the df\n",
783 | " # df[\"Color\"] is the series\n",
784 | " # we look for values == \"White\" in that series\n",
785 | " # apply the result to the df\n",
786 | " # When in doubt... build the code from the inside out!"
787 | ]
788 | },
789 | {
790 | "cell_type": "code",
791 | "execution_count": 14,
792 | "metadata": {},
793 | "outputs": [
794 | {
795 | "data": {
843 | "text/plain": [
844 | " Make Color Capacity\n",
845 | "Jane Ford Blue 1.6l\n",
846 | "Jim Mercedes White 2.2l\n",
847 | "Jay Toyota White 1.2l"
848 | ]
849 | },
850 | "execution_count": 14,
851 | "metadata": {},
852 | "output_type": "execute_result"
853 | }
854 | ],
855 | "source": [
856 | "df[(df[\"Color\"] == \"White\") | (df[\"Color\"] == \"Blue\")] # One can apply this technique with boolean operators too"
857 | ]
858 | },
859 | {
860 | "cell_type": "markdown",
861 | "metadata": {},
862 | "source": [
863 | "## Selection by mask\n",
864 | "This mechanism will be quite handy for re-usability"
865 | ]
866 | },
867 | {
868 | "cell_type": "code",
869 | "execution_count": 15,
870 | "metadata": {},
871 | "outputs": [],
872 | "source": [
873 | "mask = df[\"Color\"] == \"White\""
874 | ]
875 | },
876 | {
877 | "cell_type": "code",
878 | "execution_count": 16,
879 | "metadata": {},
880 | "outputs": [
881 | {
882 | "data": {
924 | "text/plain": [
925 | " Make Color Capacity\n",
926 | "Jim Mercedes White 2.2l\n",
927 | "Jay Toyota White 1.2l"
928 | ]
929 | },
930 | "execution_count": 16,
931 | "metadata": {},
932 | "output_type": "execute_result"
933 | }
934 | ],
935 | "source": [
936 | "df[mask]"
937 | ]
938 | },
939 | {
940 | "cell_type": "markdown",
941 | "metadata": {},
942 | "source": [
943 | "## Selection using the .isin() method\n",
944 | "This is pretty handy where you have larger lists of values that you want to check for"
945 | ]
946 | },
947 | {
948 | "cell_type": "code",
949 | "execution_count": 17,
950 | "metadata": {},
951 | "outputs": [
952 | {
953 | "data": {
1007 | "text/plain": [
1008 | " Make Color Capacity\n",
1009 | "Jane Ford Blue 1.6l\n",
1010 | "June Mini Red 1.6l\n",
1011 | "Jim Mercedes White 2.2l\n",
1012 | "Jay Toyota White 1.2l"
1013 | ]
1014 | },
1015 | "execution_count": 17,
1016 | "metadata": {},
1017 | "output_type": "execute_result"
1018 | }
1019 | ],
1020 | "source": [
1021 | "required_vals = [\"White\", \"Blue\", \"Red\"] # Make a list of all the values you want included\n",
1022 | "df[df[\"Color\"].isin(required_vals)] # Use this list as a filter with the .isin() method"
1023 | ]
1024 | },
1025 | {
1026 | "cell_type": "code",
1027 | "execution_count": 18,
1028 | "metadata": {},
1029 | "outputs": [
1030 | {
1031 | "data": {
1067 | "text/plain": [
1068 | " Make Color Capacity\n",
1069 | "John BMW Grey 2.0l"
1070 | ]
1071 | },
1072 | "execution_count": 18,
1073 | "metadata": {},
1074 | "output_type": "execute_result"
1075 | }
1076 | ],
1077 | "source": [
1078 | "df[~df[\"Color\"].isin(required_vals)] # Use the handy ~ notation to change it to \"isnotin\"!"
1079 | ]
1080 | },
1081 | {
1082 | "cell_type": "code",
1083 | "execution_count": 25,
1084 | "metadata": {},
1085 | "outputs": [
1086 | {
1087 | "data": {
1129 | "text/plain": [
1130 | " Make Color Capacity\n",
1131 | "Jim Mercedes White 2.2l\n",
1132 | "Jay Toyota White 1.2l"
1133 | ]
1134 | },
1135 | "execution_count": 25,
1136 | "metadata": {},
1137 | "output_type": "execute_result"
1138 | }
1139 | ],
1140 | "source": [
1141 | "df[df.Color.str.contains(\"te\")]"
1142 | ]
1143 | }
1144 | ],
1145 | "metadata": {
1146 | "kernelspec": {
1147 | "display_name": "Python 3",
1148 | "language": "python",
1149 | "name": "python3"
1150 | },
1151 | "language_info": {
1152 | "codemirror_mode": {
1153 | "name": "ipython",
1154 | "version": 3
1155 | },
1156 | "file_extension": ".py",
1157 | "mimetype": "text/x-python",
1158 | "name": "python",
1159 | "nbconvert_exporter": "python",
1160 | "pygments_lexer": "ipython3",
1161 | "version": "3.6.5"
1162 | }
1163 | },
1164 | "nbformat": 4,
1165 | "nbformat_minor": 2
1166 | }
1167 |
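The selection techniques in the notebook above (inline boolean filter, reusable mask, `.isin()`, `~` negation) can be combined. The following is a minimal sketch, assuming a DataFrame rebuilt from the rows visible in the outputs above; the row order and the names `is_white`, `is_small`, `white_and_small` and `not_white_small` are assumptions for illustration, not part of the original notebook.

```python
import pandas as pd

# Rows reconstructed from the outputs shown above; the row order is an assumption.
df = pd.DataFrame(
    {
        "Make": ["BMW", "Ford", "Mini", "Mercedes", "Toyota"],
        "Color": ["Grey", "Blue", "Red", "White", "White"],
        "Capacity": ["2.0l", "1.6l", "1.6l", "2.2l", "1.2l"],
    },
    index=["John", "Jane", "June", "Jim", "Jay"],
)

# Each comparison yields a boolean Series aligned on the DataFrame index.
is_white = df["Color"] == "White"
is_small = df["Capacity"].isin(["1.2l", "1.6l"])

# Named masks combine with & (and), | (or) and ~ (not).
# When comparisons are written inline they need parentheses,
# because & and | bind more tightly than ==.
white_and_small = df[is_white & is_small]
not_white_small = df[~is_white & is_small]

print(white_and_small)
print(not_white_small)
```

Keeping each condition in its own named mask tends to make longer filters easier to read and to reuse than one long inline expression.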
--------------------------------------------------------------------------------
/How it works - Pandas, groupby method.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": null, "metadata": {}, "outputs": [],
6 | "source": [
7 | "import pandas as pd"
8 | ]
9 | },
10 | {
11 | "cell_type": "code",
12 | "execution_count": 2,
13 | "metadata": {},
14 | "outputs": [],
15 | "source": [
16 | "names = [\"Jane\", \"June\", \"Jenny\", \"Jacky\", \"Johnny\", \"Jack\", \"Jeremy\"]\n",
17 | "cars = [\"Ford\", \"Fiat\", \"Ford\", \"Ford\", \"BMW\", \"Mercedes\", \"Honda\"]\n",
18 | "colors = [\"black\", \"blue\", \"white\", \"white\", \"white\", \"white\", \"blue\"]"
19 | ]
20 | },
21 | {
22 | "cell_type": "code",
23 | "execution_count": 3,
24 | "metadata": {
25 | "scrolled": true
26 | },
27 | "outputs": [],
28 | "source": [
29 | "df = pd.DataFrame(data = {\"Names\": names, \"Cars\": cars, \"Colours\": colors})\n",
30 | "df = df[[\"Names\", \"Cars\", \"Colours\"]]"
31 | ]
32 | },
33 | {
34 | "cell_type": "code",
35 | "execution_count": 4,
36 | "metadata": {},
37 | "outputs": [
38 | {
39 | "data": {
111 | "text/plain": [
112 | " Names Cars Colours\n",
113 | "0 Jane Ford black\n",
114 | "1 June Fiat blue\n",
115 | "2 Jenny Ford white\n",
116 | "3 Jacky Ford white\n",
117 | "4 Johnny BMW white\n",
118 | "5 Jack Mercedes white\n",
119 | "6 Jeremy Honda blue"
120 | ]
121 | },
122 | "execution_count": 4,
123 | "metadata": {},
124 | "output_type": "execute_result"
125 | }
126 | ],
127 | "source": [
128 | "df"
129 | ]
130 | },
131 | {
132 | "cell_type": "code",
133 | "execution_count": 5,
134 | "metadata": {},
135 | "outputs": [],
136 | "source": [
137 | "summary = df.groupby(by = [\"Cars\", \"Colours\"])[\"Names\"].count()"
138 | ]
139 | },
140 | {
141 | "cell_type": "code",
142 | "execution_count": 11,
143 | "metadata": {},
144 | "outputs": [
145 | {
146 | "data": {
147 | "text/plain": [
148 | "Cars Colours\n",
149 | "BMW white 1\n",
150 | "Fiat blue 1\n",
151 | "Ford black 1\n",
152 | " white 2\n",
153 | "Honda blue 1\n",
154 | "Mercedes white 1\n",
155 | "Name: Names, dtype: int64"
156 | ]
157 | },
158 | "execution_count": 11,
159 | "metadata": {},
160 | "output_type": "execute_result"
161 | }
162 | ],
163 | "source": [
164 | "summary"
165 | ]
166 | },
167 | {
168 | "cell_type": "code",
169 | "execution_count": 13,
170 | "metadata": {},
171 | "outputs": [
172 | {
173 | "data": {
174 | "text/plain": [
175 | "Cars Colours\n",
176 | "BMW white 1\n",
177 | "Ford white 2\n",
178 | "Name: Names, dtype: int64"
179 | ]
180 | },
181 | "execution_count": 13,
182 | "metadata": {},
183 | "output_type": "execute_result"
184 | }
185 | ],
186 | "source": [
187 | "summary.loc[[('BMW', 'white'), ('Ford', 'white')]]"
188 | ]
189 | },
190 | {
191 | "cell_type": "code",
192 | "execution_count": 36,
193 | "metadata": {},
194 | "outputs": [
195 | {
196 | "data": {
262 | "text/plain": [
263 | " Cars Colours Names\n",
264 | "0 BMW white 1\n",
265 | "1 Fiat blue 1\n",
266 | "2 Ford black 1\n",
267 | "3 Ford white 2\n",
268 | "4 Honda blue 1\n",
269 | "5 Mercedes white 1"
270 | ]
271 | },
272 | "execution_count": 36,
273 | "metadata": {},
274 | "output_type": "execute_result"
275 | }
276 | ],
277 | "source": [
278 | "summary.reset_index(level=[\"Cars\", \"Colours\"])"
279 | ]
280 | }
281 | ],
282 | "metadata": {
283 | "kernelspec": {
284 | "display_name": "Python 3",
285 | "language": "python",
286 | "name": "python3"
287 | },
288 | "language_info": {
289 | "codemirror_mode": {
290 | "name": "ipython",
291 | "version": 3
292 | },
293 | "file_extension": ".py",
294 | "mimetype": "text/x-python",
295 | "name": "python",
296 | "nbconvert_exporter": "python",
297 | "pygments_lexer": "ipython3",
298 | "version": "3.6.5"
299 | }
300 | },
301 | "nbformat": 4,
302 | "nbformat_minor": 2
303 | }
304 |
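The groupby notebook above counts names per (car, colour) pair and then reads the result back through its two-level index. The sketch below repeats that with the same data and adds two lookups that may be useful; the column name `Owners` and the variable names `ford_white`, `white_by_car` and `flat` are hypothetical choices made for this illustration.

```python
import pandas as pd

names = ["Jane", "June", "Jenny", "Jacky", "Johnny", "Jack", "Jeremy"]
cars = ["Ford", "Fiat", "Ford", "Ford", "BMW", "Mercedes", "Honda"]
colors = ["black", "blue", "white", "white", "white", "white", "blue"]
df = pd.DataFrame({"Names": names, "Cars": cars, "Colours": colors})

# Grouping on two columns gives a Series with a two-level MultiIndex.
summary = df.groupby(["Cars", "Colours"])["Names"].count()

# A single tuple selects one group and returns its count.
ford_white = summary.loc[("Ford", "white")]

# xs() takes a cross-section on one index level, here every white car.
white_by_car = summary.xs("white", level="Colours")

# reset_index() turns the index levels back into columns;
# name= labels the count column (hypothetical label "Owners").
flat = summary.reset_index(name="Owners")

print(ford_white)
print(white_by_car)
print(flat)
```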
--------------------------------------------------------------------------------
/How it works - Pandas, mapping series values.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 1,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "import pandas as pd"
10 | ]
11 | },
12 | {
13 | "cell_type": "code",
14 | "execution_count": 2,
15 | "metadata": {},
16 | "outputs": [],
17 | "source": [
18 | "names = [\"Liza\", \"Lisa\", \"Lizzy\", \"Lynne\", \"Lisbeth\", \"Lana\"]\n",
19 | "sizes = [16, 12, 14, 10, 8, 14]"
20 | ]
21 | },
22 | {
23 | "cell_type": "code",
24 | "execution_count": 3,
25 | "metadata": {},
26 | "outputs": [],
27 | "source": [
28 | "s_name = pd.Series(names)"
29 | ]
30 | },
31 | {
32 | "cell_type": "code",
33 | "execution_count": 4,
34 | "metadata": {},
35 | "outputs": [
36 | {
37 | "data": {
38 | "text/plain": [
39 | "0 Liza\n",
40 | "1 Lisa\n",
41 | "2 Lizzy\n",
42 | "3 Lynne\n",
43 | "4 Lisbeth\n",
44 | "5 Lana\n",
45 | "dtype: object"
46 | ]
47 | },
48 | "execution_count": 4,
49 | "metadata": {},
50 | "output_type": "execute_result"
51 | }
52 | ],
53 | "source": [
54 | "s_name"
55 | ]
56 | },
57 | {
58 | "cell_type": "code",
59 | "execution_count": 5,
60 | "metadata": {},
61 | "outputs": [],
62 | "source": [
63 | "s_sizes = pd.Series(sizes, names)"
64 | ]
65 | },
66 | {
67 | "cell_type": "code",
68 | "execution_count": 6,
69 | "metadata": {},
70 | "outputs": [
71 | {
72 | "data": {
73 | "text/plain": [
74 | "Liza 16\n",
75 | "Lisa 12\n",
76 | "Lizzy 14\n",
77 | "Lynne 10\n",
78 | "Lisbeth 8\n",
79 | "Lana 14\n",
80 | "dtype: int64"
81 | ]
82 | },
83 | "execution_count": 6,
84 | "metadata": {},
85 | "output_type": "execute_result"
86 | }
87 | ],
88 | "source": [
89 | "s_sizes"
90 | ]
91 | },
92 | {
93 | "cell_type": "code",
94 | "execution_count": 10,
95 | "metadata": {},
96 | "outputs": [
97 | {
98 | "data": {
99 | "text/plain": [
100 | "0 16\n",
101 | "1 12\n",
102 | "2 14\n",
103 | "3 10\n",
104 | "4 8\n",
105 | "5 14\n",
106 | "dtype: int64"
107 | ]
108 | },
109 | "execution_count": 10,
110 | "metadata": {},
111 | "output_type": "execute_result"
112 | }
113 | ],
114 | "source": [
115 | "s_name.map(s_sizes)"
116 | ]
117 | }
118 | ],
119 | "metadata": {
120 | "kernelspec": {
121 | "display_name": "Python 3",
122 | "language": "python",
123 | "name": "python3"
124 | },
125 | "language_info": {
126 | "codemirror_mode": {
127 | "name": "ipython",
128 | "version": 3
129 | },
130 | "file_extension": ".py",
131 | "mimetype": "text/x-python",
132 | "name": "python",
133 | "nbconvert_exporter": "python",
134 | "pygments_lexer": "ipython3",
135 | "version": "3.6.5"
136 | }
137 | },
138 | "nbformat": 4,
139 | "nbformat_minor": 2
140 | }
141 |
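In the mapping notebook above, `s_name.map(s_sizes)` looks up each value of `s_name` in the index of `s_sizes`. A minimal sketch of the same idea follows, with one addition: mapping through a plain dict that covers only some of the names, to show what happens to unmatched values. The variable names `mapped` and `partial` are assumptions for this illustration.

```python
import pandas as pd

names = ["Liza", "Lisa", "Lizzy", "Lynne", "Lisbeth", "Lana"]
sizes = [16, 12, 14, 10, 8, 14]

s_name = pd.Series(names)
s_sizes = pd.Series(sizes, index=names)

# Each value in s_name is used as a key into the index of s_sizes.
mapped = s_name.map(s_sizes)

# A dict works as the lookup too; values with no matching key become NaN.
partial = s_name.map({"Liza": 16, "Lisa": 12})

print(mapped)
print(partial)
```

Because unmatched values turn into NaN rather than raising an error, checking `partial.isna()` after a map is a cheap way to spot keys missing from the lookup.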
--------------------------------------------------------------------------------
/How it works - Pandas, merge method.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 1,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "import pandas as pd"
10 | ]
11 | },
12 | {
13 | "cell_type": "code",
14 | "execution_count": 2,
15 | "metadata": {},
16 | "outputs": [],
17 | "source": [
18 | "mothers = [\"Evelyn\", \"Eve\", \"Eugenia\", \"Elizabeth\"]\n",
19 | "daughters = [\"Jenny\", \"Joan\", \"June\", \"Julia\"]"
20 | ]
21 | },
22 | {
23 | "cell_type": "code",
24 | "execution_count": 3,
25 | "metadata": {},
26 | "outputs": [],
27 | "source": [
28 | "kids1 = [\"Jenny\", \"Jenny\", \"Joan\", \"June\", \"Julia\", \"Julia\", \"Julia\"]\n",
29 | "kids2 = [\"Freddy\", \"Johnny\", \"Michael\", \"Robin\", \"Nadia\", \"Freddy\", \"Mark\"]"
30 | ]
31 | },
32 | {
33 | "cell_type": "code",
34 | "execution_count": 5,
35 | "metadata": {},
36 | "outputs": [],
37 | "source": [
38 | "senior = pd.DataFrame(data = {\"Grandmothers\": mothers, \"Mothers\": daughters})"
39 | ]
40 | },
41 | {
42 | "cell_type": "code",
43 | "execution_count": 16,
44 | "metadata": {},
45 | "outputs": [],
46 | "source": [
47 | "junior = pd.DataFrame(data = {\"Mothers\": kids1, \"Children\": kids2})\n",
48 | "junior = junior[[\"Mothers\", \"Children\"]]"
49 | ]
50 | },
51 | {
52 | "cell_type": "code",
53 | "execution_count": 17,
54 | "metadata": {},
55 | "outputs": [
56 | {
57 | "data": {
106 | "text/plain": [
107 | " Grandmothers Mothers\n",
108 | "0 Evelyn Jenny\n",
109 | "1 Eve Joan\n",
110 | "2 Eugenia June\n",
111 | "3 Elizabeth Julia"
112 | ]
113 | },
114 | "execution_count": 17,
115 | "metadata": {},
116 | "output_type": "execute_result"
117 | }
118 | ],
119 | "source": [
120 | "senior"
121 | ]
122 | },
123 | {
124 | "cell_type": "code",
125 | "execution_count": 18,
126 | "metadata": {},
127 | "outputs": [
128 | {
129 | "data": {
193 | "text/plain": [
194 | " Mothers Children\n",
195 | "0 Jenny Freddy\n",
196 | "1 Jenny Johnny\n",
197 | "2 Joan Michael\n",
198 | "3 June Robin\n",
199 | "4 Julia Nadia\n",
200 | "5 Julia Freddy\n",
201 | "6 Julia Mark"
202 | ]
203 | },
204 | "execution_count": 18,
205 | "metadata": {},
206 | "output_type": "execute_result"
207 | }
208 | ],
209 | "source": [
210 | "junior"
211 | ]
212 | },
213 | {
214 | "cell_type": "code",
215 | "execution_count": 19,
216 | "metadata": {},
217 | "outputs": [],
218 | "source": [
219 | "master = pd.merge(senior, junior, how = \"right\")"
220 | ]
221 | },
222 | {
223 | "cell_type": "code",
224 | "execution_count": 21,
225 | "metadata": {},
226 | "outputs": [
227 | {
228 | "data": {
300 | "text/plain": [
301 | " Grandmothers Mothers Children\n",
302 | "0 Evelyn Jenny Freddy\n",
303 | "1 Evelyn Jenny Johnny\n",
304 | "2 Eve Joan Michael\n",
305 | "3 Eugenia June Robin\n",
306 | "4 Elizabeth Julia Nadia\n",
307 | "5 Elizabeth Julia Freddy\n",
308 | "6 Elizabeth Julia Mark"
309 | ]
310 | },
311 | "execution_count": 21,
312 | "metadata": {},
313 | "output_type": "execute_result"
314 | }
315 | ],
316 | "source": [
317 | "master"
318 | ]
319 | },
320 | {
321 | "cell_type": "code",
322 | "execution_count": null,
323 | "metadata": {},
324 | "outputs": [],
325 | "source": []
326 | }
327 | ],
328 | "metadata": {
329 | "kernelspec": {
330 | "display_name": "Python 3",
331 | "language": "python",
332 | "name": "python3"
333 | },
334 | "language_info": {
335 | "codemirror_mode": {
336 | "name": "ipython",
337 | "version": 3
338 | },
339 | "file_extension": ".py",
340 | "mimetype": "text/x-python",
341 | "name": "python",
342 | "nbconvert_exporter": "python",
343 | "pygments_lexer": "ipython3",
344 | "version": "3.6.5"
345 | }
346 | },
347 | "nbformat": 4,
348 | "nbformat_minor": 2
349 | }
350 |
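The merge notebook above calls `pd.merge(senior, junior, how = "right")` without naming a key, so pandas joins on the one column the two frames share, `Mothers`. The sketch below spells the key out and adds two optional checks; using `validate` and `indicator` here is a suggestion for illustration, not something the original notebook does.

```python
import pandas as pd

senior = pd.DataFrame({
    "Grandmothers": ["Evelyn", "Eve", "Eugenia", "Elizabeth"],
    "Mothers": ["Jenny", "Joan", "June", "Julia"],
})
junior = pd.DataFrame({
    "Mothers": ["Jenny", "Jenny", "Joan", "June", "Julia", "Julia", "Julia"],
    "Children": ["Freddy", "Johnny", "Michael", "Robin", "Nadia", "Freddy", "Mark"],
})

master = pd.merge(
    senior,
    junior,
    on="Mothers",            # explicit join key instead of relying on shared column names
    how="right",             # keep every row of junior
    validate="one_to_many",  # raise if a mother appears more than once in senior
    indicator=True,          # add a _merge column showing where each row came from
)

print(master)
```

With this data every row of `junior` finds a match in `senior`, so the `_merge` column reads `both` throughout; a `right_only` value would flag a child whose mother is missing from `senior`.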
--------------------------------------------------------------------------------
/How it works - basic lists.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 1,
6 | "metadata": {
7 | "collapsed": true
8 | },
9 | "outputs": [],
10 | "source": [
11 | "List = [3, 4, 6, 8, 9, 11, 12, 14, 15, 16, 17, 20, 21]"
12 | ]
13 | },
14 | {
15 | "cell_type": "code",
16 | "execution_count": 2,
17 | "metadata": {},
18 | "outputs": [
19 | {
20 | "name": "stdout",
21 | "output_type": "stream",
22 | "text": [
23 | "[3, 4, 6, 8, 9, 11, 12, 14, 15, 16, 17, 20, 21]\n"
24 | ]
25 | }
26 | ],
27 | "source": [
28 | "print(List)"
29 | ]
30 | },
31 | {
32 | "cell_type": "code",
33 | "execution_count": 8,
34 | "metadata": {},
35 | "outputs": [
36 | {
37 | "name": "stdout",
38 | "output_type": "stream",
39 | "text": [
40 | "3 is a Multiple of 3\n",
41 | "4\n",
42 | "6 is a Multiple of 3\n",
43 | "8\n",
44 | "9 is a Multiple of 3\n",
45 | "11\n",
46 | "12 is a Multiple of 3\n",
47 | "14\n",
48 | "15 is a Multiple of 3\n",
49 | "16\n",
50 | "17\n",
51 | "20\n",
52 | "21 is a Multiple of 3\n"
53 | ]
54 | }
55 | ],
56 | "source": [
57 | "for i in List:\n",
58 | " if i % 3 == 0:\n",
59 | " print(i, \"is a Multiple of 3\")\n",
60 | " else:\n",
61 | " print(i)"
62 | ]
63 | },
64 | {
65 | "cell_type": "code",
66 | "execution_count": 10,
67 | "metadata": {},
68 | "outputs": [
69 | {
70 | "name": "stdout",
71 | "output_type": "stream",
72 | "text": [
73 | "[3, 6, 9, 12, 15, 21]\n"
74 | ]
75 | }
76 | ],
77 | "source": [
78 | "List_of_3 = []\n",
79 | "for i in List:\n",
80 | " if i % 3 == 0:\n",
81 | " List_of_3.append(i)\n",
82 | "print(List_of_3)"
83 | ]
84 | },
85 | {
86 | "cell_type": "code",
87 | "execution_count": 11,
88 | "metadata": {},
89 | "outputs": [
90 | {
91 | "name": "stdout",
92 | "output_type": "stream",
93 | "text": [
94 | "o\n"
95 | ]
96 | }
97 | ],
98 | "source": [
99 | "words = \"Hello world\"\n",
100 | "print(words[4])"
101 | ]
102 | },
103 | {
104 | "cell_type": "code",
105 | "execution_count": 12,
106 | "metadata": {},
107 | "outputs": [
108 | {
109 | "name": "stdout",
110 | "output_type": "stream",
111 | "text": [
112 | "Hello worldHello worldHello world\n"
113 | ]
114 | }
115 | ],
116 | "source": [
117 | "print(words * 3)"
118 | ]
119 | },
120 | {
121 | "cell_type": "code",
122 | "execution_count": 13,
123 | "metadata": {},
124 | "outputs": [
125 | {
126 | "data": {
127 | "text/plain": [
128 | "True"
129 | ]
130 | },
131 | "execution_count": 13,
132 | "metadata": {},
133 | "output_type": "execute_result"
134 | }
135 | ],
136 | "source": [
137 | "\"Hello\" in words"
138 | ]
139 | },
140 | {
141 | "cell_type": "code",
142 | "execution_count": 14,
143 | "metadata": {},
144 | "outputs": [
145 | {
146 | "data": {
147 | "text/plain": [
148 | "False"
149 | ]
150 | },
151 | "execution_count": 14,
152 | "metadata": {},
153 | "output_type": "execute_result"
154 | }
155 | ],
156 | "source": [
157 | "\"Goodbye\" in words"
158 | ]
159 | },
160 | {
161 | "cell_type": "code",
162 | "execution_count": 15,
163 | "metadata": {},
164 | "outputs": [
165 | {
166 | "data": {
167 | "text/plain": [
168 | "False"
169 | ]
170 | },
171 | "execution_count": 15,
172 | "metadata": {},
173 | "output_type": "execute_result"
174 | }
175 | ],
176 | "source": [
177 | "\"Hello\" not in words"
178 | ]
179 | },
180 | {
181 | "cell_type": "code",
182 | "execution_count": 16,
183 | "metadata": {},
184 | "outputs": [
185 | {
186 | "data": {
187 | "text/plain": [
188 | "True"
189 | ]
190 | },
191 | "execution_count": 16,
192 | "metadata": {},
193 | "output_type": "execute_result"
194 | }
195 | ],
196 | "source": [
197 | "\"Goodbye\" not in words"
198 | ]
199 | },
200 | {
201 | "cell_type": "code",
202 | "execution_count": null,
203 | "metadata": {
204 | "collapsed": true
205 | },
206 | "outputs": [],
207 | "source": []
208 | }
209 | ],
210 | "metadata": {
211 | "kernelspec": {
212 | "display_name": "Python 3",
213 | "language": "python",
214 | "name": "python3"
215 | },
216 | "language_info": {
217 | "codemirror_mode": {
218 | "name": "ipython",
219 | "version": 3
220 | },
221 | "file_extension": ".py",
222 | "mimetype": "text/x-python",
223 | "name": "python",
224 | "nbconvert_exporter": "python",
225 | "pygments_lexer": "ipython3",
226 | "version": "3.6.5"
227 | }
228 | },
229 | "nbformat": 4,
230 | "nbformat_minor": 2
231 | }
232 |
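The basic lists notebook above uses `in` on a string. The same operator behaves differently on a list, which the short sketch below tries to make visible; it reuses the notebook's data, and the variable names are assumptions for this illustration.

```python
numbers = [3, 4, 6, 8, 9, 11, 12, 14, 15, 16, 17, 20, 21]
words = "Hello world"

# On a list, `in` tests whether a whole element is present.
has_nine = 9 in numbers
has_ten = 10 in numbers

# On a string, `in` tests for a substring, so a fragment of a word matches.
fragment_in_string = "Hell" in words

# Splitting into a list of words switches to whole-element matching.
fragment_in_word_list = "Hell" in words.split()

# `x not in y` is the idiomatic spelling of `not x in y`.
goodbye_absent = "Goodbye" not in words

print(has_nine, has_ten)
print(fragment_in_string, fragment_in_word_list)
print(goodbye_absent)
```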
--------------------------------------------------------------------------------
/How it works - list comprehensions.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 1,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "# Tips from https://www.kaggle.com/colinmorris/learn-python-challenge-day-5\n",
10 | "# Thank you https://www.kaggle.com/colinmorris :)"
11 | ]
12 | },
13 | {
14 | "cell_type": "code",
15 | "execution_count": 8,
16 | "metadata": {},
17 | "outputs": [
18 | {
19 | "name": "stdout",
20 | "output_type": "stream",
21 | "text": [
22 | "[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]\n"
23 | ]
24 | }
25 | ],
26 | "source": [
27 | "# Basically creating a list by specifying some rule for its construction\n",
28 | "squares = [n**2 for n in range(10)]\n",
29 | "print(squares)\n",
30 | "# One can see the beauty of this by comparing it to less satisfactory methods below..."
31 | ]
32 | },
33 | {
34 | "cell_type": "code",
35 | "execution_count": 9,
36 | "metadata": {},
37 | "outputs": [],
38 | "source": [
39 | "# At worst we could have done this\n",
40 | "squares = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]"
41 | ]
42 | },
43 | {
44 | "cell_type": "code",
45 | "execution_count": 10,
46 | "metadata": {},
47 | "outputs": [
48 | {
49 | "name": "stdout",
50 | "output_type": "stream",
51 | "text": [
52 | "[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]\n"
53 | ]
54 | }
55 | ],
56 | "source": [
57 | "# And more tediously we could have done this\n",
58 | "squares = []\n",
59 | "for n in range(10):\n",
60 | " squares.append(n**2)\n",
61 | "print(squares)"
62 | ]
63 | },
64 | {
65 | "cell_type": "code",
66 | "execution_count": 17,
67 | "metadata": {},
68 | "outputs": [],
69 | "source": [
70 | "# You can do some nice fancy footwork - let's get an arbitrary list of integers\n",
71 | "my_list = [5, -1, -2, 0, 3]"
72 | ]
73 | },
74 | {
75 | "cell_type": "code",
76 | "execution_count": 24,
77 | "metadata": {},
78 | "outputs": [
79 | {
80 | "data": {
81 | "text/plain": [
82 | "2"
83 | ]
84 | },
85 | "execution_count": 24,
86 | "metadata": {},
87 | "output_type": "execute_result"
88 | }
89 | ],
90 | "source": [
91 | "# I can easily count the negative numbers now - without an explicit loop!\n",
92 | "len([num for num in my_list if num < 0])"
93 | ]
94 | },
95 | {
96 | "cell_type": "code",
97 | "execution_count": 26,
98 | "metadata": {},
99 | "outputs": [
100 | {
101 | "data": {
102 | "text/plain": [
103 | "2"
104 | ]
105 | },
106 | "execution_count": 26,
107 | "metadata": {},
108 | "output_type": "execute_result"
109 | }
110 | ],
111 | "source": [
112 | "# Another option: True counts as 1 and False as 0, so summing the comparisons gives the same count\n",
113 | "sum([num < 0 for num in my_list])"
114 | ]
115 | },
116 | {
117 | "cell_type": "code",
118 | "execution_count": 27,
119 | "metadata": {},
120 | "outputs": [],
121 | "source": [
122 | "# Some more examples\n",
123 | "planets = [\"Mercury\", \"Venus\", \"Earth\", \"Mars\", \"Jupiter\", \"Saturn\", \"Uranus\", \"Neptune\", \"Pluto\"]"
124 | ]
125 | },
126 | {
127 | "cell_type": "code",
128 | "execution_count": 28,
129 | "metadata": {},
130 | "outputs": [
131 | {
132 | "data": {
133 | "text/plain": [
134 | "['VENUS!', 'EARTH!', 'MARS!', 'PLUTO!']"
135 | ]
136 | },
137 | "execution_count": 28,
138 | "metadata": {},
139 | "output_type": "execute_result"
140 | }
141 | ],
142 | "source": [
143 | "# Notice how you can lay it out nicely and readably over several lines\n",
144 | "[\n",
145 | " planet.upper() + '!' \n",
146 | " for planet in planets \n",
147 | " if len(planet) < 6\n",
148 | "]"
149 | ]
150 | }
151 | ],
152 | "metadata": {
153 | "kernelspec": {
154 | "display_name": "Python 3",
155 | "language": "python",
156 | "name": "python3"
157 | },
158 | "language_info": {
159 | "codemirror_mode": {
160 | "name": "ipython",
161 | "version": 3
162 | },
163 | "file_extension": ".py",
164 | "mimetype": "text/x-python",
165 | "name": "python",
166 | "nbconvert_exporter": "python",
167 | "pygments_lexer": "ipython3",
168 | "version": "3.6.5"
169 | }
170 | },
171 | "nbformat": 4,
172 | "nbformat_minor": 2
173 | }
174 |
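The list comprehension notebook above places its condition after the `for`, which filters items out. A condition can also sit before the `for`, where it transforms every item instead. The sketch below contrasts the two and adds a generator expression for counting; the variable names are assumptions for this illustration.

```python
my_list = [5, -1, -2, 0, 3]

# Condition after `for`: filters, so the result can be shorter than the input.
positives = [num for num in my_list if num > 0]

# Conditional expression before `for`: transforms, so the length is unchanged.
clipped = [num if num >= 0 else 0 for num in my_list]

# Generator expression: counts the negatives without building a temporary list.
negative_count = sum(1 for num in my_list if num < 0)

print(positives)
print(clipped)
print(negative_count)
```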
--------------------------------------------------------------------------------
/How it works - lists vs arrays.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 1,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "import numpy as np"
10 | ]
11 | },
12 | {
13 | "cell_type": "code",
14 | "execution_count": 2,
15 | "metadata": {},
16 | "outputs": [
17 | {
18 | "name": "stdout",
19 | "output_type": "stream",
20 | "text": [
21 | "[12, 24, 65, 19, 242]\n"
22 | ]
23 | }
24 | ],
25 | "source": [
26 | "List1 = [12, 24, 65 , 19, 242]\n",
27 | "print(List1)"
28 | ]
29 | },
30 | {
31 | "cell_type": "code",
32 | "execution_count": 3,
33 | "metadata": {},
34 | "outputs": [
35 | {
36 | "name": "stdout",
37 | "output_type": "stream",
38 | "text": [
39 | "[12, 24, 65, 19]\n"
40 | ]
41 | }
42 | ],
43 | "source": [
44 | "List2 = List1[0:4]\n",
45 | "print(List2)"
46 | ]
47 | },
48 | {
49 | "cell_type": "code",
50 | "execution_count": 4,
51 | "metadata": {},
52 | "outputs": [],
53 | "source": [
54 | "List2[0] = 15"
55 | ]
56 | },
57 | {
58 | "cell_type": "code",
59 | "execution_count": 5,
60 | "metadata": {},
61 | "outputs": [
62 | {
63 | "name": "stdout",
64 | "output_type": "stream",
65 | "text": [
66 | "List1 contains: [12, 24, 65, 19, 242]\n",
67 | "List2 contains: [15, 24, 65, 19]\n"
68 | ]
69 | }
70 | ],
71 | "source": [
72 | "# Notice that while we changed a value in List2, the value in List1 remains the same\n",
73 | "# The 2 lists are independent of one another\n",
74 | "print(\"List1 contains: \", List1)\n",
75 | "print(\"List2 contains: \", List2)"
76 | ]
77 | },
78 | {
79 | "cell_type": "code",
80 | "execution_count": 6,
81 | "metadata": {},
82 | "outputs": [
83 | {
84 | "name": "stdout",
85 | "output_type": "stream",
86 | "text": [
87 | "[ 12 24 65 19 242]\n"
88 | ]
89 | }
90 | ],
91 | "source": [
92 | "Array1 = np.array(List1)\n",
93 | "print(Array1)"
94 | ]
95 | },
96 | {
97 | "cell_type": "code",
98 | "execution_count": 7,
99 | "metadata": {},
100 | "outputs": [
101 | {
102 | "name": "stdout",
103 | "output_type": "stream",
104 | "text": [
105 | "[12 24 65 19]\n"
106 | ]
107 | }
108 | ],
109 | "source": [
110 | "Array2 = Array1[0:4]\n",
111 | "print(Array2)"
112 | ]
113 | },
114 | {
115 | "cell_type": "code",
116 | "execution_count": 8,
117 | "metadata": {},
118 | "outputs": [],
119 | "source": [
120 | "Array2[0] = 15"
121 | ]
122 | },
123 | {
124 | "cell_type": "code",
125 | "execution_count": 9,
126 | "metadata": {},
127 | "outputs": [
128 | {
129 | "name": "stdout",
130 | "output_type": "stream",
131 | "text": [
132 | "Array1 contains: [ 15 24 65 19 242]\n",
133 | "Array2 contains: [15 24 65 19]\n"
134 | ]
135 | }
136 | ],
137 | "source": [
138 | "# Notice that because Array2 is a slice of Array1, it is a view onto the same data, so when you change\n",
139 | "# a value in one, the value in the other changes too!\n",
140 | "print(\"Array1 contains: \", Array1)\n",
141 | "print(\"Array2 contains: \", Array2)"
142 | ]
143 | },
144 | {
145 | "cell_type": "code",
146 | "execution_count": 10,
147 | "metadata": {},
148 | "outputs": [
149 | {
150 | "name": "stdout",
151 | "output_type": "stream",
152 | "text": [
153 | "[ 12 24 65 19 242]\n"
154 | ]
155 | }
156 | ],
157 | "source": [
158 | "Array1 = np.array(List1)\n",
159 | "print(Array1)"
160 | ]
161 | },
162 | {
163 | "cell_type": "code",
164 | "execution_count": 11,
165 | "metadata": {},
166 | "outputs": [
167 | {
168 | "name": "stdout",
169 | "output_type": "stream",
170 | "text": [
171 | "[12 24 65 19]\n"
172 | ]
173 | }
174 | ],
175 | "source": [
176 | "# If this is not the desired effect use .copy() to make an independent copy\n",
177 | "Array2 = Array1[0:4].copy()\n",
178 | "print(Array2)"
179 | ]
180 | },
181 | {
182 | "cell_type": "code",
183 | "execution_count": 12,
184 | "metadata": {},
185 | "outputs": [],
186 | "source": [
187 | "Array2[0] = 15"
188 | ]
189 | },
190 | {
191 | "cell_type": "code",
192 | "execution_count": 13,
193 | "metadata": {},
194 | "outputs": [
195 | {
196 | "name": "stdout",
197 | "output_type": "stream",
198 | "text": [
199 | "Array1 contains: [ 12 24 65 19 242]\n",
200 | "Array2 contains: [15 24 65 19]\n"
201 | ]
202 | }
203 | ],
204 | "source": [
205 | "# Notice now that although we changed a value in Array2, Array1 has remained unaffected\n",
206 | "print(\"Array1 contains: \", Array1)\n",
207 | "print(\"Array2 contains: \", Array2)"
208 | ]
209 | },
210 | {
211 | "cell_type": "code",
212 | "execution_count": 14,
213 | "metadata": {},
214 | "outputs": [],
215 | "source": [
216 | "List1 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])"
217 | ]
218 | },
219 | {
220 | "cell_type": "code",
221 | "execution_count": 15,
222 | "metadata": {},
223 | "outputs": [
224 | {
225 | "name": "stdout",
226 | "output_type": "stream",
227 | "text": [
228 | "[[1 2 3]\n",
229 | " [4 5 6]\n",
230 | " [7 8 9]]\n",
231 | "\n"
232 | ]
233 | }
234 | ],
235 | "source": [
236 | "List_Table = np.reshape(List1, (3,3))\n",
237 | "print(List_Table)\n",
238 | "print(type(List_Table))"
239 | ]
240 | },
241 | {
242 | "cell_type": "code",
243 | "execution_count": 16,
244 | "metadata": {},
245 | "outputs": [
246 | {
247 | "name": "stdout",
248 | "output_type": "stream",
249 | "text": [
250 | "3\n",
251 | "[2 5 8]\n"
252 | ]
253 | }
254 | ],
255 | "source": [
256 | "print(List_Table[0,2])\n",
257 | "print(List_Table[:,1])"
258 | ]
259 | },
260 | {
261 | "cell_type": "code",
262 | "execution_count": 17,
263 | "metadata": {},
264 | "outputs": [],
265 | "source": [
266 | "# Object-oriented style: reshape can also be called as a method on the array"
267 | ]
268 | },
269 | {
270 | "cell_type": "code",
271 | "execution_count": 18,
272 | "metadata": {},
273 | "outputs": [
274 | {
275 | "data": {
276 | "text/plain": [
277 | "array([[1, 2, 3],\n",
278 | " [4, 5, 6],\n",
279 | " [7, 8, 9]])"
280 | ]
281 | },
282 | "execution_count": 18,
283 | "metadata": {},
284 | "output_type": "execute_result"
285 | }
286 | ],
287 | "source": [
288 | "List1.reshape(3,3)"
289 | ]
290 | },
291 | {
292 | "cell_type": "code",
293 | "execution_count": 19,
294 | "metadata": {},
295 | "outputs": [
296 | {
297 | "name": "stdout",
298 | "output_type": "stream",
299 | "text": [
300 | "[[1 2 3]\n",
301 | " [4 5 6]\n",
302 | " [7 8 9]]\n"
303 | ]
304 | }
305 | ],
306 | "source": [
307 | "print(List1.reshape(3,3))"
308 | ]
309 | },
310 | {
311 | "cell_type": "code",
312 | "execution_count": 20,
313 | "metadata": {},
314 | "outputs": [],
315 | "source": [
316 | "#We played Scrabble in Jan, Feb & Mar, here are the scores:"
317 | ]
318 | },
319 | {
320 | "cell_type": "code",
321 | "execution_count": 21,
322 | "metadata": {},
323 | "outputs": [],
324 | "source": [
325 | "Lisa_Mitford = [108, 215, 99]\n",
326 | "Barry_Benjamin = [260, 212, 220]\n",
327 | "Geoff_Louw = [176, 98, 232]\n",
328 | "Denise_Louw = [102, 89, 276]"
329 | ]
330 | },
331 | {
332 | "cell_type": "code",
333 | "execution_count": 22,
334 | "metadata": {},
335 | "outputs": [],
336 | "source": [
337 | "#Insert this lot into an array"
338 | ]
339 | },
340 | {
341 | "cell_type": "code",
342 | "execution_count": 23,
343 | "metadata": {},
344 | "outputs": [
345 | {
346 | "name": "stdout",
347 | "output_type": "stream",
348 | "text": [
349 | "[[108 215 99]\n",
350 | " [260 212 220]\n",
351 | " [176 98 232]\n",
352 | " [102 89 276]]\n"
353 | ]
354 | }
355 | ],
356 | "source": [
357 | "Scores = np.array([Lisa_Mitford, Barry_Benjamin, Geoff_Louw, Denise_Louw])\n",
358 | "print(Scores)"
359 | ]
360 | },
361 | {
362 | "cell_type": "code",
363 | "execution_count": 24,
364 | "metadata": {},
365 | "outputs": [],
366 | "source": [
367 | "#Create a player dictionary"
368 | ]
369 | },
370 | {
371 | "cell_type": "code",
372 | "execution_count": 25,
373 | "metadata": {},
374 | "outputs": [],
375 | "source": [
376 | "PDict = {'Lisa_Mitford':0, 'Barry_Benjamin':1, 'Geoff_Louw':2, 'Denise_Louw':3}"
377 | ]
378 | },
379 | {
380 | "cell_type": "code",
381 | "execution_count": 26,
382 | "metadata": {},
383 | "outputs": [],
384 | "source": [
385 | "#Create a months dictionary\n",
386 | "MDict = {'Jan':0, 'Feb':1, 'Mar':2}"
387 | ]
388 | },
389 | {
390 | "cell_type": "code",
391 | "execution_count": 27,
392 | "metadata": {},
393 | "outputs": [],
394 | "source": [
395 | "#To retrieve an individual item from the matrix"
396 | ]
397 | },
398 | {
399 | "cell_type": "code",
400 | "execution_count": 28,
401 | "metadata": {},
402 | "outputs": [
403 | {
404 | "name": "stdout",
405 | "output_type": "stream",
406 | "text": [
407 | "232\n"
408 | ]
409 | }
410 | ],
411 | "source": [
412 | "print(Scores[PDict['Geoff_Louw'], MDict['Mar']])"
413 | ]
414 | },
415 | {
416 | "cell_type": "code",
417 | "execution_count": 29,
418 | "metadata": {},
419 | "outputs": [],
420 | "source": [
421 | "#To retrieve a whole row from the matrix"
422 | ]
423 | },
424 | {
425 | "cell_type": "code",
426 | "execution_count": 30,
427 | "metadata": {},
428 | "outputs": [
429 | {
430 | "name": "stdout",
431 | "output_type": "stream",
432 | "text": [
433 | "[176 98 232]\n"
434 | ]
435 | }
436 | ],
437 | "source": [
438 | "print(Scores[PDict['Geoff_Louw']])"
439 | ]
440 | },
441 | {
442 | "cell_type": "code",
443 | "execution_count": 31,
444 | "metadata": {},
445 | "outputs": [
446 | {
447 | "name": "stdout",
448 | "output_type": "stream",
449 | "text": [
450 | "[ 99 220 232 276]\n"
451 | ]
452 | }
453 | ],
454 | "source": [
455 | "#Or to retrieve a whole column from the matrix\n",
456 | "print(Scores[:,MDict['Mar']])"
457 | ]
458 | }
459 | ],
460 | "metadata": {
461 | "kernelspec": {
462 | "display_name": "Python 3",
463 | "language": "python",
464 | "name": "python3"
465 | },
466 | "language_info": {
467 | "codemirror_mode": {
468 | "name": "ipython",
469 | "version": 3
470 | },
471 | "file_extension": ".py",
472 | "mimetype": "text/x-python",
473 | "name": "python",
474 | "nbconvert_exporter": "python",
475 | "pygments_lexer": "ipython3",
476 | "version": "3.6.5"
477 | }
478 | },
479 | "nbformat": 4,
480 | "nbformat_minor": 2
481 | }
482 |
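The view-versus-copy behaviour demonstrated in the notebook above can also be checked directly rather than inferred from printed values. The following is a minimal sketch, assuming only NumPy; the names `base`, `view` and `independent` are illustrative and do not appear in the notebook.

```python
import numpy as np

# Same starting values as Array1 in the notebook
base = np.array([12, 24, 65, 19, 242])

# Basic slicing returns a view: no data is copied
view = base[0:4]

# .copy() allocates separate storage
independent = base[0:4].copy()

# np.shares_memory reports whether two arrays overlap in memory
print("view shares memory with base:       ", np.shares_memory(base, view))
print("independent shares memory with base:", np.shares_memory(base, independent))

# Writing through the view changes the original array...
view[0] = 15
print("base after writing to view:       ", base)

# ...while writing to the copy leaves the original untouched
independent[1] = 99
print("base after writing to independent:", base)
```

Checking with `np.shares_memory` before mutating a slice is a cheap way to find out whether `.copy()` is needed.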
--------------------------------------------------------------------------------
/How it works - positive & negative indexation.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 1,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "# Let's start with the basics"
10 | ]
11 | },
12 | {
13 | "cell_type": "code",
14 | "execution_count": 2,
15 | "metadata": {},
16 | "outputs": [
17 | {
18 | "data": {
19 | "text/plain": [
20 | "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]"
21 | ]
22 | },
23 | "execution_count": 2,
24 | "metadata": {},
25 | "output_type": "execute_result"
26 | }
27 | ],
28 | "source": [
29 | "simple = list(range(1,19))\n",
30 | "simple"
31 | ]
32 | },
33 | {
34 | "cell_type": "code",
35 | "execution_count": 3,
36 | "metadata": {},
37 | "outputs": [
38 | {
39 | "data": {
40 | "text/plain": [
41 | "1"
42 | ]
43 | },
44 | "execution_count": 3,
45 | "metadata": {},
46 | "output_type": "execute_result"
47 | }
48 | ],
49 | "source": [
50 | "# Select the 1st item using positive indexation\n",
51 | "simple[0]"
52 | ]
53 | },
54 | {
55 | "cell_type": "code",
56 | "execution_count": 4,
57 | "metadata": {},
58 | "outputs": [
59 | {
60 | "data": {
61 | "text/plain": [
62 | "1"
63 | ]
64 | },
65 | "execution_count": 4,
66 | "metadata": {},
67 | "output_type": "execute_result"
68 | }
69 | ],
70 | "source": [
71 | "# Select the 1st item using negative indexation\n",
72 | "simple[-18]"
73 | ]
74 | },
75 | {
76 | "cell_type": "code",
77 | "execution_count": 5,
78 | "metadata": {},
79 | "outputs": [
80 | {
81 | "data": {
82 | "text/plain": [
83 | "18"
84 | ]
85 | },
86 | "execution_count": 5,
87 | "metadata": {},
88 | "output_type": "execute_result"
89 | }
90 | ],
91 | "source": [
92 | "# Select the last item using positive indexation\n",
93 | "simple[17]"
94 | ]
95 | },
96 | {
97 | "cell_type": "code",
98 | "execution_count": 6,
99 | "metadata": {},
100 | "outputs": [
101 | {
102 | "data": {
103 | "text/plain": [
104 | "18"
105 | ]
106 | },
107 | "execution_count": 6,
108 | "metadata": {},
109 | "output_type": "execute_result"
110 | }
111 | ],
112 | "source": [
113 | "# Select the last item using negative indexation\n",
114 | "simple[-1]"
115 | ]
116 | },
117 | {
118 | "cell_type": "code",
119 | "execution_count": 7,
120 | "metadata": {},
121 | "outputs": [
122 | {
123 | "data": {
124 | "text/plain": [
125 | "[1, 2, 3, 4, 5, 6, 7]"
126 | ]
127 | },
128 | "execution_count": 7,
129 | "metadata": {},
130 | "output_type": "execute_result"
131 | }
132 | ],
133 | "source": [
134 | "# Select a range of items with positive indexation\n",
135 | "simple[0:7]"
136 | ]
137 | },
138 | {
139 | "cell_type": "code",
140 | "execution_count": 8,
141 | "metadata": {},
142 | "outputs": [
143 | {
144 | "data": {
145 | "text/plain": [
146 | "[1, 2, 3, 4, 5, 6, 7]"
147 | ]
148 | },
149 | "execution_count": 8,
150 | "metadata": {},
151 | "output_type": "execute_result"
152 | }
153 | ],
154 | "source": [
155 | "# Select a range of items with negative indexation\n",
156 | "simple[-18:-11]"
157 | ]
158 | },
159 | {
160 | "cell_type": "code",
161 | "execution_count": 9,
162 | "metadata": {},
163 | "outputs": [
164 | {
165 | "data": {
166 | "text/plain": [
167 | "[2, 4, 6]"
168 | ]
169 | },
170 | "execution_count": 9,
171 | "metadata": {},
172 | "output_type": "execute_result"
173 | }
174 | ],
175 | "source": [
176 | "# Select items from index 1 up to (but not including) index 7 in increments of 2\n",
177 | "simple[1:7:2]"
178 | ]
179 | },
180 | {
181 | "cell_type": "code",
182 | "execution_count": 10,
183 | "metadata": {},
184 | "outputs": [
185 | {
186 | "data": {
187 | "text/plain": [
188 | "[6, 4, 2]"
189 | ]
190 | },
191 | "execution_count": 10,
192 | "metadata": {},
193 | "output_type": "execute_result"
194 | }
195 | ],
196 | "source": [
197 | "# Select the same range of items between 1 and 7 in increments of -2 (backwards)\n",
198 | "simple[-13:-18:-2]"
199 | ]
200 | },
201 | {
202 | "cell_type": "code",
203 | "execution_count": 11,
204 | "metadata": {},
205 | "outputs": [
206 | {
207 | "data": {
208 | "text/plain": [
209 | "[]"
210 | ]
211 | },
212 | "execution_count": 11,
213 | "metadata": {},
214 | "output_type": "execute_result"
215 | }
216 | ],
217 | "source": [
218 | "# Note how the sign of the step matters - this returns an empty list because it says start at 1, \n",
219 | "# stop before 7 and use increments of negative 2, but stepping backwards from 1 moves away from 7,\n",
220 | "# so no items are selected\n",
221 | "simple[1:7:-2]"
222 | ]
223 | },
224 | {
225 | "cell_type": "code",
226 | "execution_count": 12,
227 | "metadata": {},
228 | "outputs": [
229 | {
230 | "data": {
231 | "text/plain": [
232 | "[]"
233 | ]
234 | },
235 | "execution_count": 12,
236 | "metadata": {},
237 | "output_type": "execute_result"
238 | }
239 | ],
240 | "source": [
241 | "# Similarly, here we are saying start at -13 and go forwards by 2 until -18, but -18 lies behind -13,\n",
242 | "# so stepping forwards never reaches it and the result is again an empty list\n",
243 | "simple[-13:-18:2]"
244 | ]
245 | },
246 | {
247 | "cell_type": "code",
248 | "execution_count": 13,
249 | "metadata": {},
250 | "outputs": [],
251 | "source": [
252 | "# Now replace a list item with a new value (6 -> 99)\n",
253 | "simple[-13] = 99"
254 | ]
255 | },
256 | {
257 | "cell_type": "code",
258 | "execution_count": 14,
259 | "metadata": {},
260 | "outputs": [
261 | {
262 | "data": {
263 | "text/plain": [
264 | "[99, 4, 2]"
265 | ]
266 | },
267 | "execution_count": 14,
268 | "metadata": {},
269 | "output_type": "execute_result"
270 | }
271 | ],
272 | "source": [
273 | "# And check what it looks like now\n",
274 | "simple[-13:-18:-2]"
275 | ]
276 | },
277 | {
278 | "cell_type": "code",
279 | "execution_count": 15,
280 | "metadata": {},
281 | "outputs": [
282 | {
283 | "data": {
284 | "text/plain": [
285 | "[1, 2, 3, 4, 5, 99, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 909]"
286 | ]
287 | },
288 | "execution_count": 15,
289 | "metadata": {},
290 | "output_type": "execute_result"
291 | }
292 | ],
293 | "source": [
294 | "# Add a number at the end of the list\n",
295 | "simple.append(909)\n",
296 | "simple"
297 | ]
298 | },
299 | {
300 | "cell_type": "code",
301 | "execution_count": 16,
302 | "metadata": {},
303 | "outputs": [
304 | {
305 | "data": {
306 | "text/plain": [
307 | "[1, 2, 3, 4, 5, 6, 99, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 909]"
308 | ]
309 | },
310 | "execution_count": 16,
311 | "metadata": {},
312 | "output_type": "execute_result"
313 | }
314 | ],
315 | "source": [
316 | "# Add a number in the middle of the list (add number 6 just before position 5)\n",
317 | "simple.insert(5, 6)\n",
318 | "simple"
319 | ]
320 | },
321 | {
322 | "cell_type": "code",
323 | "execution_count": 17,
324 | "metadata": {},
325 | "outputs": [
326 | {
327 | "data": {
328 | "text/plain": [
329 | "True"
330 | ]
331 | },
332 | "execution_count": 17,
333 | "metadata": {},
334 | "output_type": "execute_result"
335 | }
336 | ],
337 | "source": [
338 | "# Quickly check if a number is somewhere in the list\n",
339 | "99 in simple"
340 | ]
341 | },
342 | {
343 | "cell_type": "code",
344 | "execution_count": 18,
345 | "metadata": {},
346 | "outputs": [
347 | {
348 | "data": {
349 | "text/plain": [
350 | "6"
351 | ]
352 | },
353 | "execution_count": 18,
354 | "metadata": {},
355 | "output_type": "execute_result"
356 | }
357 | ],
358 | "source": [
359 | "# And then check at which index position it occurs in the list\n",
360 | "simple.index(99)"
361 | ]
362 | },
363 | {
364 | "cell_type": "code",
365 | "execution_count": 19,
366 | "metadata": {},
367 | "outputs": [
368 | {
369 | "data": {
370 | "text/plain": [
371 | "909"
372 | ]
373 | },
374 | "execution_count": 19,
375 | "metadata": {},
376 | "output_type": "execute_result"
377 | }
378 | ],
379 | "source": [
380 | "# What is the biggest number in the list?\n",
381 | "max(simple)"
382 | ]
383 | },
384 | {
385 | "cell_type": "code",
386 | "execution_count": 20,
387 | "metadata": {},
388 | "outputs": [
389 | {
390 | "data": {
391 | "text/plain": [
392 | "1"
393 | ]
394 | },
395 | "execution_count": 20,
396 | "metadata": {},
397 | "output_type": "execute_result"
398 | }
399 | ],
400 | "source": [
401 | "# And the smallest?\n",
402 | "min(simple)"
403 | ]
404 | },
405 | {
406 | "cell_type": "code",
407 | "execution_count": 21,
408 | "metadata": {},
409 | "outputs": [],
410 | "source": [
411 | "# Some more finicky examples"
412 | ]
413 | },
414 | {
415 | "cell_type": "code",
416 | "execution_count": 22,
417 | "metadata": {},
418 | "outputs": [
419 | {
420 | "name": "stdout",
421 | "output_type": "stream",
422 | "text": [
423 | "Fortune 500\n",
424 | "Think Bike\n",
425 | "Make Love\n",
426 | "Be Careful\n"
427 | ]
428 | }
429 | ],
430 | "source": [
431 | "List_1 = [\"Be\", \"Make\",\"Think\", \"Fortune\"]\n",
432 | "List_2 = [\"Careful\", \"Love\", \"Bike\", \"500\"]\n",
433 | "if len(List_1) < len(List_2):\n",
434 | " x = len(List_1) + 1\n",
435 | "else:\n",
436 | " x = len(List_2) + 1\n",
437 | "for i in range(1,x):\n",
438 | " print(List_1[-i] + \" \" + List_2[-i])"
439 | ]
440 | },
441 | {
442 | "cell_type": "code",
443 | "execution_count": 23,
444 | "metadata": {},
445 | "outputs": [
446 | {
447 | "name": "stdout",
448 | "output_type": "stream",
449 | "text": [
450 | "Be Careful\n",
451 | "Fortune 500\n",
452 | "Think Bike\n",
453 | "Make Love\n"
454 | ]
455 | }
456 | ],
457 | "source": [
458 | "List_1 = [\"Be\", \"Make\",\"Think\", \"Fortune\"]\n",
459 | "List_2 = [\"Careful\", \"Love\", \"Bike\", \"500\"]\n",
460 | "if len(List_1) < len(List_2):\n",
461 | " x = len(List_1)\n",
462 | "else:\n",
463 | " x = len(List_2)\n",
464 | "for i in range(x):\n",
465 | " print(List_1[-i] + \" \" + List_2[-i])"
466 | ]
467 | },
468 | {
469 | "cell_type": "code",
470 | "execution_count": 24,
471 | "metadata": {
472 | "scrolled": true
473 | },
474 | "outputs": [
475 | {
476 | "name": "stdout",
477 | "output_type": "stream",
478 | "text": [
479 | "Fortune 500\n",
480 | "Think Bike\n",
481 | "Make Love\n",
482 | "Be Careful\n"
483 | ]
484 | }
485 | ],
486 | "source": [
487 | "List_1 = [\"Be\", \"Make\",\"Think\", \"Fortune\"]\n",
488 | "List_2 = [\"Careful\", \"Love\", \"Bike\", \"500\"]\n",
489 | "if len(List_1) < len(List_2):\n",
490 | " x = len(List_1)\n",
491 | "else:\n",
492 | " x = len(List_2)\n",
493 | "for i in range(-1,(-x-1),-1):\n",
494 | " print(List_1[i] + \" \" + List_2[i])"
495 | ]
496 | }
497 | ],
498 | "metadata": {
499 | "kernelspec": {
500 | "display_name": "Python 3",
501 | "language": "python",
502 | "name": "python3"
503 | },
504 | "language_info": {
505 | "codemirror_mode": {
506 | "name": "ipython",
507 | "version": 3
508 | },
509 | "file_extension": ".py",
510 | "mimetype": "text/x-python",
511 | "name": "python",
512 | "nbconvert_exporter": "python",
513 | "pygments_lexer": "ipython3",
514 | "version": "3.6.5"
515 | }
516 | },
517 | "nbformat": 4,
518 | "nbformat_minor": 2
519 | }
520 |
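The slicing cells in the notebook above can be cross-checked by asking Python which concrete indices a slice resolves to. This is a minimal sketch using the built-in `slice.indices` method; the helper name `resolve` and the lowercase list names are illustrative and not part of the notebook. It also shows `zip` with `reversed` as one possible way to pair items from the end of two lists without computing negative indices by hand, which avoids the `-0 == 0` surprise seen in the second "finicky" example.

```python
simple = list(range(1, 19))


def resolve(start, stop, step, length=len(simple)):
    """Return the non-negative (start, stop, step) Python uses for this slice."""
    return slice(start, stop, step).indices(length)


# The three slices from the notebook: one that works and two that come back empty
for bounds in [(-13, -18, -2), (1, 7, -2), (-13, -18, 2)]:
    start, stop, step = resolve(*bounds)
    picked = [simple[i] for i in range(start, stop, step)]
    # The resolved range selects exactly what the slice selects
    assert picked == simple[slice(*bounds)]
    print(bounds, "->", (start, stop, step), picked)

# Pairing items from the end of two lists without index arithmetic
list_1 = ["Be", "Make", "Think", "Fortune"]
list_2 = ["Careful", "Love", "Bike", "500"]
pairs = [a + " " + b for a, b in zip(reversed(list_1), reversed(list_2))]
for pair in pairs:
    print(pair)
```

A slice is empty whenever the step points away from the stop position, which is what the resolved triples make visible for the two empty cases.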
--------------------------------------------------------------------------------
/Practice Run - Linear Regression Age vs Blood Pressure.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "Small dataset for linear regression from:\n",
8 | "http://people.sc.fsu.edu/~jburkardt/datasets/regression/x03.txt"
9 | ]
10 | },
11 | {
12 | "cell_type": "code",
13 | "execution_count": 1,
14 | "metadata": {},
15 | "outputs": [],
16 | "source": [
17 | "import numpy as np\n",
18 | "from scipy import stats\n",
19 | "import pandas as pd\n",
20 | "import matplotlib.pyplot as plt\n",
21 | "%matplotlib inline"
22 | ]
23 | },
24 | {
25 | "cell_type": "code",
26 | "execution_count": 2,
27 | "metadata": {},
28 | "outputs": [],
29 | "source": [
30 | "x03 = pd.read_excel(\"x03.xlsx\")\n",
31 | "# I, the index;\n",
32 | "# A0, the constant 1;\n",
33 | "# A1, the age;\n",
34 | "# B, the systolic blood pressure."
35 | ]
36 | },
37 | {
38 | "cell_type": "code",
39 | "execution_count": 3,
40 | "metadata": {},
41 | "outputs": [
42 | {
43 | "data": {
95 | "text/plain": [
96 | " I A0 A1 B\n",
97 | "0 1 1 39 144\n",
98 | "1 2 1 47 220\n",
99 | "2 3 1 45 138"
100 | ]
101 | },
102 | "execution_count": 3,
103 | "metadata": {},
104 | "output_type": "execute_result"
105 | }
106 | ],
107 | "source": [
108 | "x03.head(3)"
109 | ]
110 | },
111 | {
112 | "cell_type": "code",
113 | "execution_count": 4,
114 | "metadata": {},
115 | "outputs": [
116 | {
117 | "data": {
118 | "text/plain": [
119 | ""
120 | ]
121 | },
122 | "execution_count": 4,
123 | "metadata": {},
124 | "output_type": "execute_result"
125 | },
126 | {
127 | "data": {
128 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAD8CAYAAAB5Pm/hAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAFKdJREFUeJzt3X+MZeV93/H3x4DpOm46dlhHMLt0ocJbk+CwZIJJaVOHNF6cWoBQIoGcGDlWV0lQZSKHmLWlRqmKIKFyEiuKpW1MbSQKJTEmKHGCiXHqxgqggcX8MN56E2yzu8SsRddOyxbB5ts/5kwY1jPMj3vvzLnPfb+k0dz73HPvPs/cs5858z3PeW6qCklSu16z0R2QJI2WQS9JjTPoJalxBr0kNc6gl6TGGfSS1DiDXpIaZ9BLUuMMeklq3Ikb3QGAU045pbZt27bR3ZCksfLQQw99q6o2L7ddL4J+27ZtzM7ObnQ3JGmsJPn6SrazdCNJjTPoJalxBr0kNc6gl6TGGfSS1Lhlgz7J1iSfT/JkkieSvL9rvynJV5I8muTTSaYWPGd3kv1J9iXZOcoBSOPorr0HufDG+zjjuj/hwhvv4669Bze6S2rYSo7oXwI+UFVvAS4Ark5yNnAv8INV9VbgfwG7AbrHrgB+ALgY+L0kJ4yi89I4umvvQXbf+RgHjxylgINHjrL7zscMe43MskFfVc9U1cPd7b8DngSmq+qzVfVSt9n9wJbu9qXA7VX1QlU9BewHzh9+16XxdNM9+zj64rFXtB198Rg33bNvg3qk1q2qRp9kG7ADeOC4h34e+NPu9jTw9ILHDnRtx7/WriSzSWYPHz68mm5IY+3QkaOrapcGteKgT/J64FPANVX1nQXtH2auvHPrfNMiT/+uTyCvqj1VNVNVM5s3L3sFr9SM06Y2rapdGtSKgj7JScyF/K1VdeeC9quAdwHvrqr5MD8AbF3w9C3AoeF0Vxp/1+7czqaTXnnaatNJJ3Dtzu0b1CO1biWzbgJ8HHiyqj6yoP1i4IPAJVX1/IKn3A1ckeTkJGcAZwEPDrfb0vi6bMc0N1x+DtNTmwgwPbWJGy4/h8t2fFeFUxqKlSxqdiHwc8BjSR7p2j4EfBQ4Gbh37ncB91fVL1TVE0nuAL7MXEnn6qo6tsjrShPrsh3TBrvWzbJBX1V/yeJ198+8ynOuB64foF+SpCHxylhJapxBL0mNM+glqXEGvSQ1zqCXpMYZ9JLUOINekhpn0EtS4wx6SWqcQS9JjTPoJalxBr0kNc6gl6TGGfSS1DiDXpIaZ9BLUuMMeklqnEEvSY0z6CWpcQa9JDXOoJekxhn0ktQ4g16SGmfQS1LjDHpJapxBL0mNM+glqXHLBn2SrUk+n+TJJE8keX/X/sYk9yb5avf9DV17knw0yf4kjyY5b9SDkCQtbSVH9C8BH6iqtwAXAFcnORu4DvhcVZ0FfK67D/BO4KzuaxfwsaH3WpK0YssGfVU9U1UPd7f/DngSmAYuBT7ZbfZJ4LLu9qXALTXnfmAqyalD77kkaUVWVaNPsg3YATwAfH9VPQNzvwyAN3WbTQNPL3jaga7t+NfalWQ2yezhw4dX33NJ0oqsOOiTvB74FHBNVX3n1TZdpK2+q6FqT1XNVNXM5s2bV9oNSdIqrSjok5zEXMjfWlV3ds3fnC/JdN+f7doPAFsXPH0LcGg43ZUkrdZKZt0E+DjwZFV9ZMFDdwNXdbevAv5oQft7utk3FwDfni/xSJLW34kr2OZC4OeAx5I80rV9CLgRuCPJ+4BvAD/TPfYZ4KeA/cDzwHuH2mNJ0qosG/RV9ZcsXncH+IlFti/g6gH7JUkaEq+MlaTGGfSS1DiDXpIaZ9BLUuMMeklqnEEvSY0z6CWpcQa9JDXOoJekxhn0ktQ4g16SGmfQS1LjDHpJapxBL0mNM+glqXEGvSQ1zqCXpMYZ9JLUOINe
khpn0EtS4wx6SWqcQS9JjTPoJalxBr0kNc6gl6TGGfSS1Lhlgz7JzUmeTfL4grZzk9yf5JEks0nO79qT5KNJ9id5NMl5o+y8JGl5Kzmi/wRw8XFtvwn8elWdC/yH7j7AO4Gzuq9dwMeG001J0lotG/RV9QXgueObge/tbv8T4FB3+1LglppzPzCV5NRhdVaStHonrvF51wD3JPnPzP2y+Bdd+zTw9ILtDnRtz6y5h5Kkgaz1ZOwvAr9cVVuBXwY+3rVnkW1rsRdIsqur788ePnx4jd2QJC1nrUF/FXBnd/sPgPO72weArQu228LLZZ1XqKo9VTVTVTObN29eYzckSctZa9AfAv51d/si4Kvd7buB93Szby4Avl1Vlm0kaQMtW6NPchvwduCUJAeAXwP+HfA7SU4E/h9zM2wAPgP8FLAfeB547wj6LElahWWDvqquXOKhH15k2wKuHrRTkqTh8cpYSWrcWqdXSpIGcNfeg9x0zz4OHTnKaVObuHbndi7bMT2Sf8ugl6R1dtfeg+y+8zGOvngMgINHjrL7zscARhL2lm4kaZ3ddM++fwj5eUdfPMZN9+wbyb9n0EvSOjt05Oiq2gdl0EvSOjttatOq2gdl0EvSOrt253Y2nXTCK9o2nXQC1+7cPpJ/z5OxkrTO5k+4OutGkhp22Y7pkQX78SzdSFLjPKKXpCFYzwugVsugl6QBrfcFUKtl6UaSBrTeF0CtlkEvSQNa7wugVsugl6QBrfcFUKtl0EvSgNb7AqjV8mSsJA1ovS+AWi2DXpKGYD0vgFotSzeS1DiDXpIaZ9BLUuMMeklqnEEvSY0z6CWpcU6vlDTRllp1ss+rUa6WQS9pYi216uTs15/jUw8d7O1qlKtl6UbSxFpq1cnbHni616tRrtayQZ/k5iTPJnn8uPZ/n2RfkieS/OaC9t1J9neP7RxFpyVpGJZaXfJY1aq277uVlG4+AfwucMt8Q5IfBy4F3lpVLyR5U9d+NnAF8APAacCfJ3lzVR37rleVxlBLdVvNrS55cJHwPiFZNOz7shrlai17RF9VXwCeO675F4Ebq+qFbptnu/ZLgdur6oWqegrYD5w/xP5KG2a+nnvwyFGKl+u2d+09uNFd0xotterklW/b2uvVKFdrrTX6NwP/KskDSf5Hkh/p2qeBpxdsd6Brk8Ze3z9FSKt32Y5pbrj8HKanNhFgemoTN1x+Dv/psnMWbR/Xv97WOuvmROANwAXAjwB3JDkTyCLbLlrsSrIL2AVw+umnr7Eb0vrp+6cIaW2WWnWyz6tRrtZag/4AcGdVFfBgkr8HTunaty7YbgtwaLEXqKo9wB6AmZmZxc98SD2yVD13XOu28zzvMBx9/jmutXRzF3ARQJI3A68FvgXcDVyR5OQkZwBnAQ8Oo6PSRuv7pwithecdhqPvP8eVTK+8DfgrYHuSA0neB9wMnNlNubwduKrmPAHcAXwZ+DPgamfcqBVL1XP7ctS2Fp53GI6+/xyXLd1U1ZVLPPSzS2x/PXD9IJ2S+qqlui143mFY+v5z9MpYaYItdX5h3M87rLe+/xwNemmCtXjeYSP0/efoombSBJsvQ/V1tsi46PvPMbXEmg7raWZmpmZnZze6G5IG0Ofpha1K8lBVzSy3nUf0kga21HK/MJ7L+rbGGr2kgfV9euGkM+glDazv0wsnnUEvaWB9n1446Qx6SQPr+/TCSefJWEkD6/v0wknnEb0kNc4jekkDc3plv3lEL2lgTq/sN4Ne0sCcXtlvBr2kgTm9st8MekkDc3plv3kyVtLAnF7Zbwa9pKFo7dO3WmLpRpIa5xG9tAFcu13ryaCX1pkXF2m9GfQaW+N6VPxqFxeNQ//H2bjuM4My6DWWxvmo2IuLNsY47zOD8mSsxtI4X3LvxUUbY5z3mUF5RK+xNC5HxYuVCq7duf0VR5aw9ouLJrUUsRbjss+Mgkf0GkvjcFQ8Xyo4eOQoxStLBTdcfg7TU5sIMD21iRsuP2fV
Ab3U69+19+DQx9KCcdhnRsWg11gah0vulzvp+sXrLuKpG/8tX7zuojUdhU9yKWItxmGfGZVlgz7JzUmeTfL4Io/9SpJKckp3P0k+mmR/kkeTnDeKTkuX7ZgeylHxKI26VDDJpYi1GId9ZlRWUqP/BPC7wC0LG5NsBX4S+MaC5ncCZ3VfbwM+1n2Xhq7vl9yfNrWJg4uE7rBKBaN+/Y0w6nMOfd9nRmXZI/qq+gLw3CIP/Rbwq0AtaLsUuKXm3A9MJTl1KD2VxsyoSwWtlSI85zA6a6rRJ7kEOFhVXzruoWng6QX3D3Rt0sQZdamgtVKE5xxGZ9XTK5O8Dvgw8I7FHl6krRZpI8kuYBfA6aefvtpuSGNh1KWClkoRnnMYnbXMo/9nwBnAl5IAbAEeTnI+c0fwWxdsuwU4tNiLVNUeYA/AzMzMor8MpHHhfPbBtXjOoS9WXbqpqseq6k1Vta2qtjEX7udV1d8CdwPv6WbfXAB8u6qeGW6XpX6xtjwcrZ1z6JOVTK+8DfgrYHuSA0ne9yqbfwb4G2A/8F+AXxpKL6Ues7Y8HK2dc+iTZUs3VXXlMo9vW3C7gKsH75Za11Kpw9ry8LR0zqFPvDJW6661UsckX1qv8WDQa921Vuq4dud2TnrNKyecnfSaWFtWbxj0WndNljqOn1i82ERjaYMY9Fp3rZU6brpnHy8ee+UM4ReP1dj+haL2GPRad61No2vyLxQ1xaDXumttGl1rf6GoPX7ClDZES9PohvmJUdIoGPTSgOZ/YbVyXYDaY9BLQ9DSXyhqjzV6SWqcR/QNaml5gUnje6dRMOgbM7+8wPyJwfnlBQADo+d87zQqlm4a09ryApPE906jYtA3xot3xpfvnUbF0k1j/JSe1etLXdz3TqPiEX1jWlteYNT6tGSy751GxaBvTGvLC4xan+rivncaFUs3DfLinZXrW13c906j4BG9JpoLkmkSGPQb4K69B7nwxvs447o/4cIb7xvbj9BrgXVxTQJLN+vMi2L6xQXJNAkmJuj7MoXu1U7+GS6SRmEigr5PR9F9O/k36fq0b0ijMhE1+j5NofPkX7/0ad+QRmUigr5PR9Ge/OuXPu0b0qhMRND36Sjai2L6pU/7hjQqE1Gj79tnenpRTH/0bd+QRmHZI/okNyd5NsnjC9puSvKVJI8m+XSSqQWP7U6yP8m+JDtH1fHV8ChaS3Hf0CRIVb36BsmPAf8HuKWqfrBrewdwX1W9lOQ3AKrqg0nOBm4DzgdOA/4ceHNVHVv81efMzMzU7OzswIMZpr5Mx9TSfI806ZI8VFUzy2237BF9VX0BeO64ts9W1Uvd3fuBLd3tS4Hbq+qFqnoK2M9c6I+VPq1oqMX5HkkrN4yTsT8P/Gl3exp4esFjB7q2seKUu/7zPZJWbqCgT/Jh4CXg1vmmRTZbtDaUZFeS2SSzhw8fHqQbQ+eUu/7zPZJWbs1Bn+Qq4F3Au+vlQv8BYOuCzbYAhxZ7flXtqaqZqprZvHnzWrsxEk656z/fI2nl1hT0SS4GPghcUlXPL3jobuCKJCcnOQM4C3hw8G6uLy9q6j/fI2nllp1Hn+Q24O3AKUkOAL8G7AZOBu5NAnB/Vf1CVT2R5A7gy8yVdK5ebsZNH7miYf/5Hkkrt+z0yvWwkdMrnaInaVytdHrlRFwZuxRXLpQ0CSZirZulOEVP0iSY6KB3ip6kSTC2pZth1NZPm9rEwUVC3Sl6kloylkf0w7r83Sl6kibBWAb9sGrrrlwoaRKMZelmmLV114aX1LqxPKL38ndJWrmxDHpr65K0cmNZuvHyd0laubEMerC23jcuJSH119gGvfrDpSSkfhvLGr36xaUkpH4z6DUwl5KQ+s2g18Cc7ir1m0GvgTndVeo3T8ZqYE53lfrNoNdQON1V6i9LN5LUOINekhpn0EtS4wx6SWqcQS9JjUtVbXQfSHIY+PoGd+MU4Fsb3If1MiljnZRxwuSMdVLGCSsb6z+tqs3LvVAv
gr4PksxW1cxG92M9TMpYJ2WcMDljnZRxwnDHaulGkhpn0EtS4wz6l+3Z6A6so0kZ66SMEyZnrJMyThjiWK3RS1LjPKKXpMZNXNAn2Zrk80meTPJEkvd37W9Mcm+Sr3bf37DRfR1Ukn+U5MEkX+rG+utd+xlJHujG+t+TvHaj+zoMSU5IsjfJH3f3Wx3n15I8luSRJLNdW3P7L0CSqSR/mOQr3f/ZH21trEm2d+/l/Nd3klwzzHFOXNADLwEfqKq3ABcAVyc5G7gO+FxVnQV8rrs/7l4ALqqqHwLOBS5OcgHwG8BvdWP938D7NrCPw/R+4MkF91sdJ8CPV9W5C6bftbj/AvwO8GdV9c+BH2Lu/W1qrFW1r3svzwV+GHge+DTDHGdVTfQX8EfATwL7gFO7tlOBfRvdtyGP83XAw8DbmLsI48Su/UeBeza6f0MY35buP8NFwB8DaXGc3Vi+BpxyXFtz+y/wvcBTdOcSWx7rgrG9A/jisMc5iUf0/yDJNmAH8ADw/VX1DED3/U0b17Ph6coZjwDPAvcCfw0cqaqXuk0OAC0sJP/bwK8Cf9/d/z7aHCdAAZ9N8lCSXV1bi/vvmcBh4L92JbnfT/I9tDnWeVcAt3W3hzbOiQ36JK8HPgVcU1Xf2ej+jEpVHau5Pwm3AOcDb1lss/Xt1XAleRfwbFU9tLB5kU3HepwLXFhV5wHvZK70+GMb3aERORE4D/hYVe0A/i9jXqZ5Nd05pEuAPxj2a09k0Cc5ibmQv7Wq7uyav5nk1O7xU5k7Am5GVR0B/oK58xJTSeY/XWwLcGij+jUkFwKXJPkacDtz5Zvfpr1xAlBVh7rvzzJXyz2fNvffA8CBqnqgu/+HzAV/i2OFuV/cD1fVN7v7QxvnxAV9kgAfB56sqo8seOhu4Kru9lXM1e7HWpLNSaa625uAf8PcyazPAz/dbTb2Y62q3VW1paq2Mfen731V9W4aGydAku9J8o/nbzNX032cBvffqvpb4Okk858y/xPAl2lwrJ0reblsA0Mc58RdMJXkXwL/E3iMl+u5H2KuTn8HcDrwDeBnquq5DenkkCR5K/BJ4ATmfqnfUVX/McmZzB35vhHYC/xsVb2wcT0dniRvB36lqt7V4ji7MX26u3si8N+q6vok30dj+y9AknOB3wdeC/wN8F66fZmGxprkdcDTwJlV9e2ubWjv6cQFvSRNmokr3UjSpDHoJalxBr0kNc6gl6TGGfSS1DiDXpIaZ9BLUuMMeklq3P8HX7aglmPVgOUAAAAASUVORK5CYII=\n",
129 | "text/plain": [
130 | ""
131 | ]
132 | },
133 | "metadata": {},
134 | "output_type": "display_data"
135 | }
136 | ],
137 | "source": [
138 | "plt.scatter(x03[\"A1\"], x03[\"B\"])"
139 | ]
140 | },
141 | {
142 | "cell_type": "code",
143 | "execution_count": 5,
144 | "metadata": {},
145 | "outputs": [
146 | {
147 | "data": {
148 | "text/plain": [
149 | "array([[1. , 0.65756728],\n",
150 | " [0.65756728, 1. ]])"
151 | ]
152 | },
153 | "execution_count": 5,
154 | "metadata": {},
155 | "output_type": "execute_result"
156 | }
157 | ],
158 | "source": [
159 | "np.corrcoef(x03[\"A1\"], x03[\"B\"])"
160 | ]
161 | },
162 | {
163 | "cell_type": "code",
164 | "execution_count": 6,
165 | "metadata": {},
166 | "outputs": [
167 | {
168 | "name": "stdout",
169 | "output_type": "stream",
170 | "text": [
171 | "0.9708703514427236 98.71471813821842 0.4323947319275954\n"
172 | ]
173 | }
174 | ],
175 | "source": [
176 | "# The linregress() function returns 5 values, which we name here for convenience:\n",
177 | "slope, intercept, r_value, p_value, std_err = stats.linregress(x03[\"A1\"], x03[\"B\"])\n",
178 | "\n",
179 | "# Slope = m from our equation, Intercept = b, r_value = our correlation coefficient (printed squared, i.e. R^2)\n",
180 | "print(slope, intercept, r_value**2)"
181 | ]
182 | },
183 | {
184 | "cell_type": "code",
185 | "execution_count": 7,
186 | "metadata": {},
187 | "outputs": [
188 | {
189 | "data": {
190 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAD8CAYAAAB5Pm/hAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAGvBJREFUeJzt3XuUVeV5x/HvIxcd42VURoMzkMFEEaqJ6JTSEJOoCaClQrSu6IoNNTSkCbnoijdierFLCwaXTdKmMdhggrVYaxCJN4KitbGiGUCDCEQSVGYwDokiCUwQhqd/nD2cM8yZc7/s857fZ61Zc+bd+5x598zhN5tnv++7zd0REZFwHVLtDoiISHkp6EVEAqegFxEJnIJeRCRwCnoRkcAp6EVEAqegFxEJnIJeRCRwCnoRkcANrnYHAIYNG+atra3V7oaISE1ZvXr1b9y9Kdt+sQj61tZW2tvbq90NEZGaYmav5rKfSjciIoFT0IuIBE5BLyISOAW9iEjgFPQiIoHLGvRmNsLMnjCzDWa23sy+ErXPN7ONZvZzM7vfzBpTnjPHzDab2SYzm1zOAxCpRUvXdjJx3kpGXf8QE+etZOnazmp3SQKWyxn9PuCr7j4GmADMNrOxwArgNHd/P/ALYA5AtO1S4I+AKcC/mdmgcnRepBYtXdvJnCXr6NzRjQOdO7qZs2Sdwl7KJmvQu/vr7r4mevw7YAPQ7O4/cfd90W6rgJbo8TTgHnff4+5bgM3A+NJ3XaQ2zV++ie69PX3auvf2MH/5pir1SEKXV43ezFqBccCzB236DPBI9LgZ2JqyrSNqO/i1ZplZu5m1b9++PZ9uiNS0bTu682oXKVbOQW9mRwA/Aq50950p7TeQKO/c3duU5un97kDu7gvcvc3d25qass7gFQnGiY0NebWLFCunoDezISRC/m53X5LSPgOYCnzK3XvDvAMYkfL0FmBbaborUvuumTyahiF9L1s1DBnENZNHV6lHErpcRt0Y8H1gg7vfltI+BbgOuNDdd6c8ZRlwqZkdamajgJOB50rbbZHaNX1cM3MvOp3mxgYMaG5sYO5FpzN9XL8Kp0hJ5LKo2UTgL4F1ZvZ81PY14NvAocCKxN8CVrn737j7ejO7F3iJRElntrv3pHldkbo1fVyzgl0qJmvQu/tPSV93fzjDc24Gbi6iXyIiUiKaGSsiEjgFvYhI4BT0IiKBU9CLiAROQS8iEjgFvYhI4BT0IiKBU9CLiAROQS8iEjgFvYhI4BT0IiKBU9CLiAROQS8iEjgFvYhI4BT0IiKBU9CLiAROQS8iEjgFvYhI4BT0IiKBU9CLiAROQS8iEjgFvYhI4BT0IiKBU9CLiAROQS8iEjgFvYhI4BT0IiKByxr0ZjbCzJ4wsw1mtt7MvhK1H2tmK8zs5ejzMVG7mdm3zWyzmf3czM4s90GIiMjAcjmj3wd81d3HABOA2WY2FrgeeNzdTwYej74GOB84OfqYBXy35L0WEZGcZQ16d3/d3ddEj38HbACagWnAD6PdfghMjx5PAxZ5wiqg0cyGl7znIiKSk7xq9GbWCowDngVOcPfXIfHHADg+2q0Z2JrytI6o7eDXmmVm7WbWvn379vx7LiIiOck56M3sCOBHwJXuvjPTrmnavF+D+wJ3b3P3tqamply7ISIiecop6M1sCImQv9vdl0TNb/SWZKLPXVF7BzAi5ektwLbSdFdERPKVy6gbA74PbHD321I2LQNmRI9nAA+ktH86Gn0zAXi7t8QjIiKVNziHfSYCfwmsM7Pno7avAfOAe81sJvAacEm07WHgAmAzsBu4oqQ9FhGRvGQNenf/Kenr7gDnpdnfgdlF9ktEREpEM2NFRAKnoBcRCZyCXkQkcAp6EZHAKehFRAKnoBcRCZyCXkQkcAp6EZHAKehFRAKnoBcRCZyCXkQkcAp6EZHAKehFRAKnoBcRCZyCXkQkcAp6
EZHAKehFRAKnoBcRCZyCXkQkcAp6EZHAKehFRAKnoBcRCZyCXkQkcAp6EZHAKehFRAKnoBcRCVzWoDezhWbWZWYvprSdYWarzOx5M2s3s/FRu5nZt81ss5n93MzOLGfnRUQku1zO6H8ATDmo7RvAje5+BvB30dcA5wMnRx+zgO+WppsiIlKorEHv7k8Bbx7cDBwVPT4a2BY9ngYs8oRVQKOZDS9VZ0VEJH+DC3zelcByM7uVxB+LD0btzcDWlP06orbXC+6hiIgUpdCLsZ8HrnL3EcBVwPejdkuzr6d7ATObFdX327dv315gN0REJJtCg34GsCR6/N/A+OhxBzAiZb8WkmWdPtx9gbu3uXtbU1NTgd0QEZFsCg36bcBHosfnAi9Hj5cBn45G30wA3nZ3lW1ERKooa43ezBYDHwWGmVkH8PfAZ4Fvmdlg4A8kRtgAPAxcAGwGdgNXlKHPIiKSh6xB7+6XDbDprDT7OjC72E6JiEjpaGasiEjgCh1eKSIiRVi6tpP5yzexbUc3JzY2cM3k0Uwf11yW76WgFxGpsKVrO5mzZB3de3sA6NzRzZwl6wDKEvYq3YiIVNj85ZsOhHyv7r09zF++qSzfT0EvIlJh23Z059VeLAW9iEiFndjYkFd7sRT0IiIVds3k0TQMGdSnrWHIIK6ZPLos308XY0VEKqz3gqtG3YiIBGz6uOayBfvBVLoREQmczuhFREqgkhOg8qWgFxEpUqUnQOVLpRsRkSJVegJUvhT0IiJFqvQEqHwp6EVEilTpCVD5UtCLiBSp0hOg8qWLsSIiRar0BKh8KehFREqgkhOg8qXSjYhI4BT0IiLVsn8/vPNO2b+Ngl5EpJJ27YIHHoDPfhZaWuCOO8r+LVWjFxEpt1dfhYceggcfhJUrYc8eOOoomDIFTj217N9eQS8iUmo9PfDcc4lg//GPYV1iOQTe9z74whdg6lT40Idg6NCKdEdBLyJSCo89Bh//OAweDMccA9u3w6BBiUC/9dZEuJ9yCphVvGsKehGpawOtOpnTapQXXwxLlvRt27cPJk1KBPvkyYnQrzIFvYjUrYFWnWx/9U1+tLqzX/uQt9/iz845PfOL/va3cOyx5e56XhT0IlK3Blp1cvGzW+lxB2DKpqe5fencxMab0rzIxRfDffeVuafFyRr0ZrYQmAp0uftpKe1fAr4I7AMecvdro/Y5wEygB/iyuy8vR8dFRIqVdnVJd355y9TMT1yxAj72sfJ0qgxyOaP/AfCvwKLeBjM7B5gGvN/d95jZ8VH7WOBS4I+AE4HHzOwUd+/p96oiNSjOdxGS/J3Y2EDnjm7GdP2KR+78csZ9x1x1H8cefwxPX39uhXpXOlmD3t2fMrPWg5o/D8xz9z3RPl1R+zTgnqh9i5ltBsYDz5SsxyJVEve7CEmeLryQp3/84wE3bzx+FFOu+JcDX8dpNcp8FToz9hTgbDN71sz+x8z+OGpvBram7NcRtYnUvLjfRUiy6OlJDG3s/UgT8lfPuJmlazrAnY2P/i/NjQ0Y0NzYwNyLTq/ZP+iFXowdDBwDTAD+GLjXzE4C0g0Q9XQvYGazgFkAI0eOLLAbIpUT97sISRrLlsG0aZn3+f3v4V3vAuDWlOY4r0aZr0KDvgNY4u4OPGdm+4FhUfuIlP1agG3pXsDdFwALANra2tL+MRCJk956brr2WhbcdYdcJiR56SMnzj/HQks3S4FzAczsFGAo8BtgGXCpmR1qZqOAk4HnStFRkWqL+12ECtF73aFzRzdO8rrD0rWd1e5a7rq7+5Zk0rnrrkS4936UWNx/jlmD3swWk7iYOtrMOsxsJrAQOMnMXgTuAWZ4wnrgXuAl4FFgtkbcSCimj2tm7kWnB1O3hRq+7vC97yWD/fDD0++zZ08y2C+/vKzdifvPMZdRN5cNsCntT87dbwZuLqZTInEVUt0Wauy6Q7aSzJFHws6dlenLQeL+c9R69CJ1bKDrC7G47vDmm9lLMitWJM/a
qxTyEPOfIwp6kboWu+sOs2Ylg/2449Lvs39/MtxjMjs1dj/Hg2itG5E61luGqupokWwlmbPOgvb2yvSlQLH4OWZgXoYr0Plqa2vz9pj/IkUks5yHF/7yl4kbcGTy9NPwwQ+Wp6MBMbPV7t6WbT+d0YtI0bIuD3HOOfDkk5lfJAYnnaFS0ItI0dINL9xw0/npl/XtNWECPKNlsCpBQS8iRdu2o5tRb3byxB2fy7zjxo0wOh4XKOuJgl5ECjdjBixaxJZM+6gkU3UKehHJnTscknlU9n2nncfVf3YVDUMGJWYOV6hrMjAFvYhk9rOfwfjxmfd56y2Wbtl1YNRNc8yGF9Y7Bb2I9JdtlIxZYuJSH7vK2SMpgmbGigjs29d3uYF0Ib94cXJG6kEhH/fVG+udgl6kXi1fngz2IUPS79PdnQz3Sy8d8KXivnpjvVPpRqSevOc98NprA29/3/vg5Zfzftm4r95Y73RGLxKy3bv7lmTShXzqCpAFhDzEf/XGeqegFwnNXXclgz26F2o/+/aVdAXIuK/eWO9UuhEJQbYVICdNStTkyyTuqzfWOwW9SC3q6oITTsi8z5o1MG5cZfpDeHffColKNyK14qqrkiWZgUI+9aYcFQx5iTed0YtUQc5rt2cryZx3Hjz2WHk6KcFQ0ItUWMa124fugNNOy/wC69fD2LHl7qYEREEvNSvns+KYOXhy0cN3fomxXVsyr92uFSBLolbfM8VS0EtNynpHoxjb9tZuXvnGn2fe6fLLE8MkpWRq+T1TLF2MlZpUc1PuH330wIXULQOE/MXX3p28kKqQL7mae8+UkM7opSbVxJT7ww6DPXsy7tJ63YMAB9Zuz1e9liIKURPvmTJR0EtNOrGxgc40/0CrOuV+/34YNCjjLo+M/TB7/uM/gcQZphUR0PVciihELN8zFaKgl5p0zeTRfUIOqjTl/pFH4IILMu7ygS8v5u2GIw983bx8E09ff27RYZypFKGg7y8275kqyBr0ZrYQmAp0uftpB227GpgPNLn7b8zMgG8BFwC7gb9y9zWl77bUu6pOuR86FPbuzbyPO6Ouf4h0Y2VKVSqo51JEIep5mYZczuh/APwrsCi10cxGAB8HUpfDOx84Ofr4E+C70WeRkqvYlPs9exL19kwWLoQrrujTVO5SQYiliHJfc6jXZRqyjrpx96eAN9Ns+mfgWuhz0jINWOQJq4BGMxtekp6KVNKddyaXGxgo5P/wh+QomYNCHsq/omNoK0bqLlXlU1CN3swuBDrd/QXrO0W7Gdia8nVH1PZ6wT0UqZRsyw0MGQLvvJPzy5W7VBBaKULXHMon76A3s8OBG4BJ6TanaUs7pc/MZgGzAEaOHJlvN0SK97vfwVFHZd7n4Yfh/PML/hblLhWEVIrQNYfyKWTC1HuBUcALZvYK0AKsMbN3kziDH5GybwuwLd2LuPsCd29z97ampqYCuiFSgAULkiWZgUK+pydZkskx5Jeu7WTivJWMuv4hJs5bqXJDAXSXqvLJO+jdfZ27H+/ure7eSiLcz3T3XwPLgE9bwgTgbXdX2UaqK/VWep/7XP/tF1+cDHZ3OCS/fxaqLZdGaNcc4iTrO9rMFgPPAKPNrMPMZmbY/WHgV8Bm4A7gCyXppUg+urr6hns6L76YDPb77ivq29Xz1PpSmj6umbkXnU5zYwMGNDc2MPei04MpTVVT1hq9u1+WZXtrymMHZhffLQldyYfR3Xgj/MM/ZN6nTCtAqrZcOiFdc4gTzYyViivZ1P1so2Suugpuu63QbuYsxPHsEhatXikVV3Cp49VXs5dkXnstWZKpQMhDorY85JC+/RlyiKm2LLGhoJeKy6vUce21yWBvbU3/gqkXUkeMSL9PuR38dyfLfzZEKklBLxWXcRide9+z9vnz++94++19w73K5i/fxN6evv3Y2+O6GCuxoRq9VNzBqwiO3v4Kyxd+MbFxzgBPeustaGysTAfzpIuxEncKeqm46eOaOeuGLzLi
kaUD7/Tud8PrtTEFQxdjJe5UupHK6OnpU5JJG/JLliTLMTUS8qCJPhJ/OqOX8vnpT+HsszPv092dfQngmAttcTEJj4JeSuvssxMBP5AJE+CZZyrXnwrRRB+JMwW9FOedd+DQQzPv8+ST8JGPVKQ7ItKfgj5A5b5LT04lmX37st4oW/or++9O6pIuxgambCspXnJJ8mJqupC/8ca+Y9sV8nnTKphSLjqjD0zJ7tKzaxcccUTmfTZvhve+t4BeSjq6w5KUi87oA1PU5J2HHkqetacL+SOPhP37k2ftCvmS0sQrKRcFfWDyvkvPxInJcJ86tf/273wnGew7d2ZfMbIGxeXuULrDkpSLgj4wWSfv/Pa3fdeS+b//6/8i27Ylw/0LYd87Jk51cU28knJR0Acm3V16/mPoJqaf2ZII9mHD+j/p1FP7XkgdPrzi/a6WON0dSndYknLRxdgATR/XzPRPTEys3z6Qe+6BT36ycp2KqbjVxTXxSspBQR+Kzk5oacm8T4xXgKwWLUgm9UClmyoo2cW/b34zWWtPF/Lnnde3JKOQ70d1cakHOqOvsKLul9o7ESnTzTaWL4dJk0rV3eBpQTKpB3UT9HGZWp73pJhf/AJGZzm73L0bGlRqEJH06qJ0E6chdDld/LvhhmRJJl3IX35535KMQr5gcXpviJRLXQR9nIbQpbvIN2h/D1tumZoM93/6p/5PXLUqGex33VWBntaHOL03RMqlLko3cRpC13u/1KauDp5a8NnMO+/dC4Pr4ldUNXF6b4iUS12c0cdmavnttzP9zBY23HR++pC/+uq+JRmFfNnF5r0hUkZ1EfRVG0K3dy+ccUayJPP5z/ffZ/36ZLDPn1/e/kg/Gl4p9SBr0JvZQjPrMrMXU9rmm9lGM/u5md1vZo0p2+aY2WYz22Rmk8vV8XxUdGr5iy8mg33oUHjhhb7bP/GJvitAjh1b+j5IzrTsgNQD80xjsgEz+zDwe2CRu58WtU0CVrr7PjO7BcDdrzOzscBiYDxwIvAYcIq796R/9YS2tjZvb28v+mBKKa/hmPPmwZw5A7/Y/ffD9Onl6Wgdi8uQWZFqMbPV7t6Wbb+sRWB3f8rMWg9q+0nKl6uAv4geTwPucfc9wBYz20wi9GvqbtBZJzV1d8Mpp0BHx8Av0tUFTU2V6G5dKmrimUidKUWN/jPAI9HjZmBryraOqK2mpBtyN/rVl5IrQB5+eP+Qnzmzb0lGIV9WGhYpkruihnWY2Q3APuDu3qY0u6WtDZnZLGAWwMiRI4vpRsn1Dq37zM8e4O9W3jHwjo8/DueeW6FeSSoNixTJXcFBb2YzgKnAeZ4s9HcAI1J2awG2pXu+uy8AFkCiRl9oP0pqzx649Va23PL1tJt3Dz2Mw7t+DUcfXeGOycG06qRI7goq3ZjZFOA64EJ3352yaRlwqZkdamajgJOB54rvZhk9/nhylMxhh8HXkyH/h8FD+c6ES2i97kHGfP0RfrJqs0I+JjQsUiR3Wc/ozWwx8FFgmJl1AH8PzAEOBVZY4h6iq9z9b9x9vZndC7xEoqQzO9uIm4pzh7/+a1i4MP32yy6D+fNZ2pVc0bBZIzpiR6tOiuQu6/DKSij78Mpdu+Dmm2Hu3LSbX288gXM+828c19SosBCRmlGy4ZU1a+NG+NKX4LHH0m+fN4+lky7XED0RCV44SyC4J+6DevTRiXr7mDF9Q37mTHjjjeTwx+uu0xA9EakLtX9Gf9NN8Ld/m37b976XCPhBg9Ju1hA9EakHNXtGv3RtJ5+8elHfkB8/Htrbk2fts2YNGPKglQtFpD7U5Bn9genvg47l4k99g5eHjWTvkUfnvRhV79rwqeUbDdETkdDU5Bn9gdq6GatbxrLzsCMKqq1r5UIRqQc1eUZfytr69HHNCnYRCVpNntGrti4ikruaDHpNfxcRyV1Nlm40/V1EJHc1GfSg2nrc6G5PIvFVs0Ev8aG7PYnEW03W6CVetJSESLwp6KVoWkpC
JN4U9FI0DXcViTcFvRRNw11F4k0XY6VoGu4qEm8KeikJDXcViS+VbkREAqegFxEJnIJeRCRwCnoRkcAp6EVEAmfuXu0+YGbbgVer3I1hwG+q3IdKqZdjrZfjhPo51no5TsjtWN/j7k3ZXigWQR8HZtbu7m3V7kcl1Mux1stxQv0ca70cJ5T2WFW6EREJnIJeRCRwCvqkBdXuQAXVy7HWy3FC/RxrvRwnlPBYVaMXEQmczuhFRAJXd0FvZiPM7Akz22Bm683sK1H7sWa2wsxejj4fU+2+FsvMDjOz58zshehYb4zaR5nZs9Gx/peZDa12X0vBzAaZ2VozezD6OtTjfMXM1pnZ82bWHrUF9/4FMLNGM7vPzDZG/2b/NLRjNbPR0e+y92OnmV1ZyuOsu6AH9gFfdfcxwARgtpmNBa4HHnf3k4HHo69r3R7gXHf/AHAGMMXMJgC3AP8cHetbwMwq9rGUvgJsSPk61OMEOMfdz0gZfhfi+xfgW8Cj7n4q8AESv9+gjtXdN0W/yzOAs4DdwP2U8jjdva4/gAeAjwObgOFR23BgU7X7VuLjPBxYA/wJiUkYg6P2PwWWV7t/JTi+lugfw7nAg4CFeJzRsbwCDDuoLbj3L3AUsIXoWmLIx5pybJOAp0t9nPV4Rn+AmbUC44BngRPc/XWA6PPx1etZ6UTljOeBLmAF8Etgh7vvi3bpAEJYSP6bwLXA/ujr4wjzOAEc+ImZrTazWVFbiO/fk4DtwJ1RSe7fzexdhHmsvS4FFkePS3acdRv0ZnYE8CPgSnffWe3+lIu793jiv4QtwHhgTLrdKtur0jKzqUCXu69ObU6za00fZ4qJ7n4mcD6J0uOHq92hMhkMnAl8193HAbuo8TJNJtE1pAuB/y71a9dl0JvZEBIhf7e7L4ma3zCz4dH24STOgIPh7juAJ0lcl2g0s967i7UA26rVrxKZCFxoZq8A95Ao33yT8I4TAHffFn3uIlHLHU+Y798OoMPdn42+vo9E8Id4rJD4w73G3d+Ivi7ZcdZd0JuZAd8HNrj7bSmblgEzosczSNTua5qZNZlZY/S4AfgYiYtZTwB/Ee1W88fq7nPcvcXdW0n813elu3+KwI4TwMzeZWZH9j4mUdN9kQDfv+7+a2CrmfXeZf484CUCPNbIZSTLNlDC46y7CVNm9iHgf4F1JOu5XyNRp78XGAm8Blzi7m9WpZMlYmbvB34IDCLxR/1ed/9HMzuJxJnvscBa4HJ331O9npaOmX0UuNrdp4Z4nNEx3R99ORj4T3e/2cyOI7D3L4CZnQH8OzAU+BVwBdF7mYCO1cwOB7YCJ7n721FbyX6ndRf0IiL1pu5KNyIi9UZBLyISOAW9iEjgFPQiIoFT0IuIBE5BLyISOAW9iEjgFPQiIoH7fw7+ncQsZ/xnAAAAAElFTkSuQmCC\n",
191 | "text/plain": [
192 | ""
193 | ]
194 | },
195 | "metadata": {},
196 | "output_type": "display_data"
197 | }
198 | ],
199 | "source": [
200 | "def get_B():\n",
201 | " return slope * x03[\"A1\"] + intercept  # predicted B on the fitted line for each A1\n",
202 | "\n",
203 | "plt.scatter(x03[\"A1\"], x03[\"B\"])\n",
204 | "plt.plot(x03[\"A1\"], get_B(), c='r')\n",
205 | "plt.show()"
206 | ]
207 | }
208 | ],
209 | "metadata": {
210 | "kernelspec": {
211 | "display_name": "Python 3",
212 | "language": "python",
213 | "name": "python3"
214 | },
215 | "language_info": {
216 | "codemirror_mode": {
217 | "name": "ipython",
218 | "version": 3
219 | },
220 | "file_extension": ".py",
221 | "mimetype": "text/x-python",
222 | "name": "python",
223 | "nbconvert_exporter": "python",
224 | "pygments_lexer": "ipython3",
225 | "version": "3.6.5"
226 | }
227 | },
228 | "nbformat": 4,
229 | "nbformat_minor": 2
230 | }
231 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # how-to-python
2 | Knowledge is often hard-won and then quickly forgotten! This repository aims to explain Python concepts simply, and then to serve as an aid to memory.
3 |
--------------------------------------------------------------------------------
/customer_product_list.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/customer_product_list.xlsx
--------------------------------------------------------------------------------
/data_structures.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/data_structures.png
--------------------------------------------------------------------------------
/flattening.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/flattening.png
--------------------------------------------------------------------------------
/img_5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/img_5.png
--------------------------------------------------------------------------------
/multiplication.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/multiplication.png
--------------------------------------------------------------------------------
/sea_picture.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/sea_picture.jpg
--------------------------------------------------------------------------------
/sorted.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/sorted.jpg
--------------------------------------------------------------------------------
/take1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/take1.png
--------------------------------------------------------------------------------
/take2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/take2.png
--------------------------------------------------------------------------------
/tensor_shape_2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/tensor_shape_2.png
--------------------------------------------------------------------------------
/unsorted.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/unsorted.jpg
--------------------------------------------------------------------------------
/weight_matrix_detail_2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/shotleft/how-to-python/9b71a76742b3896d0017a1fcc6cdbd64ad421b1b/weight_matrix_detail_2.png
--------------------------------------------------------------------------------