├── LICENSE ├── README.md ├── check_env.py ├── exercises ├── dow_selection │ ├── dow.csv │ ├── dow_selection.py │ └── dow_selection_solution.py ├── filter_image │ ├── dc_metro.png │ ├── filter_image.py │ └── filter_image_solution.py ├── plotting │ ├── dc_metro.JPG │ ├── my_plots.png │ ├── plotting.py │ ├── plotting_bonus_solution.py │ ├── plotting_solution.py │ └── sample_plots.png └── wind_statistics │ ├── wind.data │ ├── wind.desc │ ├── wind_statistics.py │ └── wind_statistics_solution.py └── slides.pdf /LICENSE: -------------------------------------------------------------------------------- 1 | © 2001-2019, Enthought, Inc. 2 | All Rights Reserved. Use only permitted under license. Copying, sharing, redistributing or other unauthorized use strictly prohibited. 3 | All trademarks and registered trademarks are the property of their respective owners. 4 | Enthought, Inc. 5 | 200 W Cesar Chavez Suite 202 6 | Austin, TX 78701 7 | www.enthought.com 8 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # SciPy 2019 Tutorial: Introduction to Numerical Computing With NumPy 2 | 3 | #### Presented by: Alexandre Chabot-Leclerc, [Enthought, Inc.](https://www.enthought.com) 4 | 5 | This repository contains all the material needed by students registered for the NumPy tutorial of SciPy 2019 on Monday, July 8th 2019. 6 | 7 | For a smooth experience, you will need to make sure that you install or update your Python distribution and download the tutorial material _before_ the day of the tutorial as the Wi-Fi at the AT&T center can be flaky. 
8 | 9 | ## Install Python 10 | 11 | If you don't already have a working python distribution, you may download Enthought Canopy ([https://store.enthought.com/downloads](https://store.enthought.com/downloads)), Anaconda Python ([https://www.anaconda.com/download/](https://www.anaconda.com/download/)) or Python.org ([https://www.python.org/downloads/](https://www.python.org/downloads/)). 12 | 13 | 14 | ## Install Packages 15 | 16 | To be able to run the examples, demos and exercises, you must have the 17 | following packages installed: 18 | 19 | - numpy 1.15+ 20 | - matplotlib 2.0+ 21 | - ipython (for running, experimenting and doing exercises) 22 | 23 | With Canopy, the required packages are already installed. If you are using Python from python.org or your system, you can install the necessary packages with: 24 | 25 | ```sh 26 | $ pip install -U numpy matplotlib ipython 27 | ``` 28 | 29 | If you are using Anaconda, you can create an environment with the necessary packages with: 30 | 31 | ```sh 32 | $ conda create -n numpy-tutorial numpy matplotlib ipython 33 | ``` 34 | 35 | To test your installation, please execute the `check_env.py` script in the environment where you've installed the requirements: 36 | 37 | ```sh 38 | $ python check_env.py 39 | ``` 40 | 41 | You should see a window pop up with a plot that looks vaguely like a smiley face. 42 | 43 | ## Download Tutorial Materials 44 | 45 | This GitHub repository is all that is needed in terms of tutorial content. 
The simplest solution is to download the material using this link: 46 | 47 | https://github.com/enthought/Numpy-Tutorial-SciPyConf-2019/archive/master.zip 48 | 49 | If you're familiar with Git, you can also clone this repository with: 50 | 51 | ```sh 52 | $ git clone https://github.com/enthought/Numpy-Tutorial-SciPyConf-2019.git 53 | ``` 54 | 55 | It will create a new folder named `Numpy-Tutorial-SciPyConf-2019/` with all the content you will need: the slides I will go through (`slides.pdf`), and a folder of exercises. 56 | 57 | 58 | ## Questions? Problems? 59 | 60 | You may post messages to the `#numpy` Slack channel for this tutorial in the official Slack team: [https://scipy2019.slack.com](https://scipy2019.slack.com). 61 | -------------------------------------------------------------------------------- /check_env.py: -------------------------------------------------------------------------------- 1 | """ Run this file to check your python installation. 2 | """ 3 | import numpy as np 4 | import matplotlib.pyplot as plt 5 | 6 | assert np.allclose(np.array([3.3], dtype='float32'), np.array([1.65], dtype='float32')*2) 7 | fig, ax = plt.subplots() 8 | ax.scatter(x=[-3, -2, -1, 0, 1, 2, 3], y=[0, -1, -1.5, -1.75, -1.5, -1, 0]) 9 | ax.scatter(x=[-1.5, 1.5], y=[2, 2], s=1000) 10 | ax.set_ylim((-3, 3)) 11 | plt.show() 12 | -------------------------------------------------------------------------------- /exercises/dow_selection/dow.csv: -------------------------------------------------------------------------------- 1 | 13261.82,13338.23,12969.42,13043.96,3452650000,13043.96 2 | 13044.12,13197.43,12968.44,13056.72,3429500000,13056.72 3 | 13046.56,13049.65,12740.51,12800.18,4166000000,12800.18 4 | 12801.15,12984.95,12640.44,12827.49,4221260000,12827.49 5 | 12820.9,12998.11,12511.03,12589.07,4705390000,12589.07 6 | 12590.21,12814.97,12431.53,12735.31,5351030000,12735.31 7 | 12733.11,12931.29,12632.15,12853.09,5170490000,12853.09 8 | 
12850.74,12863.34,12495.91,12606.3,4495840000,12606.3 9 | 12613.78,12866.1,12596.95,12778.15,3682090000,12778.15 10 | 12777.5,12777.5,12425.92,12501.11,4601640000,12501.11 11 | 12476.81,12699.05,12294.48,12466.16,5440620000,12466.16 12 | 12467.05,12597.85,12089.38,12159.21,5303130000,12159.21 13 | 12159.94,12441.85,11953.71,12099.3,6004840000,12099.3 14 | 12092.72,12167.42,11508.74,11971.19,6544690000,11971.19 15 | 11969.08,12339.1,11530.12,12270.17,3241680000,12270.17 16 | 12272.69,12522.82,12114.83,12378.61,5735300000,12378.61 17 | 12391.7,12590.69,12103.61,12207.17,4882250000,12207.17 18 | 12205.71,12423.81,12061.42,12383.89,4100930000,12383.89 19 | 12385.19,12604.92,12262.29,12480.3,4232960000,12480.3 20 | 12480.14,12715.96,12311.55,12442.83,4742760000,12442.83 21 | 12438.28,12734.74,12197.09,12650.36,4970290000,12650.36 22 | 12638.17,12841.88,12510.05,12743.19,4650770000,12743.19 23 | 12743.11,12810.34,12557.61,12635.16,3495780000,12635.16 24 | 12631.85,12631.85,12234.97,12265.13,4315740000,12265.13 25 | 12257.25,12436.33,12142.14,12200.1,4008120000,12200.1 26 | 12196.2,12366.99,12045,12247,4589160000,12247 27 | 12248.47,12330.97,12058.01,12182.13,3768490000,12182.13 28 | 12181.89,12332.76,12006.79,12240.01,3593140000,12240.01 29 | 12241.56,12524.12,12207.9,12373.41,4044640000,12373.41 30 | 12368.12,12627.76,12354.22,12552.24,3856420000,12552.24 31 | 12551.51,12611.26,12332.03,12376.98,3644760000,12376.98 32 | 12376.66,12441.2,12216.68,12348.21,3583300000,12348.21 33 | 12349.59,12571.11,12276.81,12337.22,3613550000,12337.22 34 | 12333.31,12489.29,12159.42,12427.26,3870520000,12427.26 35 | 12426.85,12545.79,12225.36,12284.3,3696660000,12284.3 36 | 12281.09,12429.05,12116.92,12381.02,3572660000,12381.02 37 | 12380.77,12612.47,12292.03,12570.22,3866350000,12570.22 38 | 12569.48,12771.14,12449.08,12684.92,4096060000,12684.92 39 | 12683.54,12815.59,12527.64,12694.28,3904700000,12694.28 40 | 12689.28,12713.99,12463.32,12582.18,3938580000,12582.18 41 | 
12579.58,12579.58,12210.3,12266.39,4426730000,12266.39 42 | 12264.36,12344.71,12101.29,12258.9,4117570000,12258.9 43 | 12259.14,12291.22,11991.06,12213.8,4757180000,12213.8 44 | 12204.93,12392.74,12105.36,12254.99,4277710000,12254.99 45 | 12254.59,12267.86,12010.03,12040.39,4323460000,12040.39 46 | 12039.09,12131.33,11778.66,11893.69,4565410000,11893.69 47 | 11893.04,11993.75,11691.47,11740.15,4261240000,11740.15 48 | 11741.33,12205.98,11741.33,12156.81,5109080000,12156.81 49 | 12148.61,12360.58,12037.79,12110.24,4414280000,12110.24 50 | 12096.49,12242.29,11832.88,12145.74,5073360000,12145.74 51 | 12146.39,12249.86,11781.43,11951.09,5153780000,11951.09 52 | 11946.45,12119.69,11650.44,11972.25,5683010000,11972.25 53 | 11975.92,12411.63,11975.92,12392.66,5335630000,12392.66 54 | 12391.52,12525.19,12077.27,12099.66,1203830000,12099.66 55 | 12102.43,12434.34,12024.68,12361.32,6145220000,12361.32 56 | 12361.97,12687.61,12346.17,12548.64,4499000000,12548.64 57 | 12547.34,12639.82,12397.62,12532.6,4145120000,12532.6 58 | 12531.79,12531.79,12309.62,12422.86,4055670000,12422.86 59 | 12421.88,12528.13,12264.76,12302.46,4037930000,12302.46 60 | 12303.92,12441.67,12164.22,12216.4,3686980000,12216.4 61 | 12215.92,12384.84,12095.18,12262.89,4188990000,12262.89 62 | 12266.64,12693.93,12266.64,12654.36,4745120000,12654.36 63 | 12651.67,12790.28,12488.22,12608.92,4320440000,12608.92 64 | 12604.69,12734.97,12455.04,12626.03,3920100000,12626.03 65 | 12626.35,12738.3,12489.4,12609.42,3703100000,12609.42 66 | 12612.59,12786.83,12550.22,12612.43,3747780000,12612.43 67 | 12602.66,12664.38,12440.55,12576.44,3602500000,12576.44 68 | 12574.65,12686.93,12416.53,12527.26,3556670000,12527.26 69 | 12526.78,12705.9,12447.96,12581.98,3686150000,12581.98 70 | 12579.78,12579.78,12280.89,12325.42,3723790000,12325.42 71 | 12324.77,12430.86,12208.42,12302.06,3565020000,12302.06 72 | 12303.6,12459.36,12223.97,12362.47,3581230000,12362.47 73 | 12371.51,12670.56,12371.51,12619.27,4260370000,12619.27 74 | 
12617.4,12725.93,12472.71,12620.49,3713880000,12620.49 75 | 12626.76,12965.47,12626.76,12849.36,4222380000,12849.36 76 | 12850.91,12902.69,12666.08,12825.02,3420570000,12825.02 77 | 12825.02,12870.86,12604.53,12720.23,3821900000,12720.23 78 | 12721.45,12883.8,12627,12763.22,4103610000,12763.22 79 | 12764.68,12979.88,12651.51,12848.95,4461660000,12848.95 80 | 12848.38,12987.29,12703.7,12891.86,3891150000,12891.86 81 | 12890.76,13015.62,12791.55,12871.75,3607000000,12871.75 82 | 12870.37,12970.27,12737.82,12831.94,3815320000,12831.94 83 | 12831.45,13052.91,12746.45,12820.13,4508890000,12820.13 84 | 12818.34,13079.94,12721.94,13010,4448780000,13010 85 | 13012.53,13191.49,12931.35,13058.2,3953030000,13058.2 86 | 13056.57,13105.75,12896.5,12969.54,3410090000,12969.54 87 | 12968.89,13071.07,12817.53,13020.83,3924100000,13020.83 88 | 13010.82,13097.77,12756.14,12814.35,4075860000,12814.35 89 | 12814.84,12965.95,12727.56,12866.78,3827550000,12866.78 90 | 12860.68,12871.75,12648.09,12745.88,3518620000,12745.88 91 | 12768.38,12903.33,12746.36,12876.05,3370630000,12876.05 92 | 12872.08,12957.65,12716.16,12832.18,4018590000,12832.18 93 | 12825.12,13037.44,12806.21,12898.38,3979370000,12898.38 94 | 12891.29,13028.16,12798.39,12992.66,3836480000,12992.66 95 | 12992.74,13069.52,12860.6,12986.8,3842590000,12986.8 96 | 12985.41,13170.97,12899.19,13028.16,3683970000,13028.16 97 | 13026.04,13026.04,12742.29,12828.68,3854320000,12828.68 98 | 12824.94,12926.71,12550.39,12601.19,4517990000,12601.19 99 | 12597.69,12743.68,12515.78,12625.62,3955960000,12625.62 100 | 12620.9,12637.43,12420.2,12479.63,3516380000,12479.63 101 | 12479.63,12626.84,12397.56,12548.35,3588860000,12548.35 102 | 12542.9,12693.77,12437.38,12594.03,3927240000,12594.03 103 | 12593.87,12760.21,12493.47,12646.22,3894440000,12646.22 104 | 12647.36,12750.84,12555.6,12638.32,3845630000,12638.32 105 | 12637.67,12645.4,12385.76,12503.82,3714320000,12503.82 106 | 12503.2,12620.98,12317.61,12402.85,4396380000,12402.85 107 | 
12391.86,12540.37,12283.74,12390.48,4338640000,12390.48 108 | 12388.81,12652.81,12358.07,12604.45,4350790000,12604.45 109 | 12602.74,12602.74,12180.5,12209.81,4771660000,12209.81 110 | 12210.13,12406.36,12102.5,12280.32,4404570000,12280.32 111 | 12277.71,12425.98,12116.58,12289.76,4635070000,12289.76 112 | 12286.34,12317.2,12029.46,12083.77,4779980000,12083.77 113 | 12089.63,12337.72,12041.43,12141.58,4734240000,12141.58 114 | 12144.59,12376.72,12096.23,12307.35,4080420000,12307.35 115 | 12306.86,12381.44,12139.79,12269.08,3706940000,12269.08 116 | 12269.65,12378.67,12114.14,12160.3,3801960000,12160.3 117 | 12158.68,12212.33,11947.07,12029.06,4573570000,12029.06 118 | 12022.54,12188.31,11881.03,12063.09,4811670000,12063.09 119 | 12062.19,12078.23,11785.04,11842.69,5324900000,11842.69 120 | 11843.83,11986.94,11731.06,11842.36,4186370000,11842.36 121 | 11842.36,11962.37,11668.53,11807.43,4705050000,11807.43 122 | 11805.31,12008.7,11683.75,11811.83,4825640000,11811.83 123 | 11808.57,11808.57,11431.92,11453.42,5231280000,11453.42 124 | 11452.85,11556.33,11248.48,11346.51,6208260000,11346.51 125 | 11345.7,11504.55,11226.34,11350.01,5032330000,11350.01 126 | 11344.64,11465.79,11106.65,11382.26,5846290000,11382.26 127 | 11382.34,11510.41,11180.58,11215.51,5276090000,11215.51 128 | 11297.33,11336.49,11158.02,11288.53,3247590000,11288.53 129 | 11289.19,11477.52,11094.44,11231.96,5265420000,11231.96 130 | 11225.03,11459.52,11101.19,11384.21,6034110000,11384.21 131 | 11381.93,11505.12,11115.61,11147.44,5181000000,11147.44 132 | 11148.01,11351.24,11006.01,11229.02,5840430000,11229.02 133 | 11226.17,11292.04,10908.64,11100.54,6742200000,11100.54 134 | 11103.64,11299.7,10972.63,11055.19,5434860000,11055.19 135 | 11050.8,11201.67,10731.96,10962.54,7363640000,10962.54 136 | 10961.89,11308.41,10831.61,11239.28,6738630400,11239.28 137 | 11238.39,11538.5,11118.46,11446.66,7365209600,11446.66 138 | 11436.56,11599.57,11290.5,11496.57,5653280000,11496.57 139 | 
11495.02,11663.4,11339.02,11467.34,4630640000,11467.34 140 | 11457.9,11692.79,11273.32,11602.5,6180230000,11602.5 141 | 11603.39,11820.21,11410.02,11632.38,6705830000,11632.38 142 | 11630.34,11714.21,11288.79,11349.28,6127980000,11349.28 143 | 11341.14,11540.78,11252.47,11370.69,4672560000,11370.69 144 | 11369.47,11439.25,11094.76,11131.08,4282960000,11131.08 145 | 11133.44,11444.05,11086.13,11397.56,5414240000,11397.56 146 | 11397.56,11681.47,11328.68,11583.69,5631330000,11583.69 147 | 11577.99,11631.16,11317.69,11378.02,5346050000,11378.02 148 | 11379.89,11512.61,11205.41,11326.32,4684870000,11326.32 149 | 11326.32,11449.67,11144.59,11284.15,4562280000,11284.15 150 | 11286.02,11652.24,11286.02,11615.77,1219310000,11615.77 151 | 11603.64,11745.71,11454.64,11656.07,4873420000,11656.07 152 | 11655.42,11680.5,11355.63,11431.43,5319380000,11431.43 153 | 11432.09,11808.49,11344.23,11734.32,4966810000,11734.32 154 | 11729.67,11933.55,11580.19,11782.35,5067310000,11782.35 155 | 11781.7,11830.39,11541.43,11642.47,4711290000,11642.47 156 | 11632.81,11689.05,11377.37,11532.96,4787600000,11532.96 157 | 11532.07,11744.33,11399.84,11615.93,4064000000,11615.93 158 | 11611.21,11776.41,11540.05,11659.9,4041820000,11659.9 159 | 11659.65,11744.49,11410.18,11479.39,3829290000,11479.39 160 | 11478.09,11501.45,11260.53,11348.55,4159760000,11348.55 161 | 11345.94,11511.06,11240.18,11417.43,4555030000,11417.43 162 | 11415.23,11501.29,11263.63,11430.21,4032590000,11430.21 163 | 11426.79,11684,11426.79,11628.06,3741070000,11628.06 164 | 11626.19,11626.19,11336.82,11386.25,3420600000,11386.25 165 | 11383.56,11483.62,11284.47,11412.87,3587570000,11412.87 166 | 11412.46,11575.14,11349.69,11502.51,3499610000,11502.51 167 | 11499.87,11756.46,11493.72,11715.18,3854280000,11715.18 168 | 11713.23,11730.49,11508.78,11543.55,3288120000,11543.55 169 | -------------------------------------------------------------------------------- /exercises/dow_selection/dow_selection.py: 
-------------------------------------------------------------------------------- 1 | """ 2 | Dow Selection 3 | ------------- 4 | 5 | Topics: Boolean array operators, sum function, where function, plotting. 6 | 7 | The array 'dow' is a 2-D array with each row holding the 8 | daily performance of the Dow Jones Industrial Average from the 9 | beginning of 2008 (dates have been removed for exercise simplicity). 10 | The array has the following structure:: 11 | 12 | OPEN HIGH LOW CLOSE VOLUME ADJ_CLOSE 13 | 13261.82 13338.23 12969.42 13043.96 3452650000 13043.96 14 | 13044.12 13197.43 12968.44 13056.72 3429500000 13056.72 15 | 13046.56 13049.65 12740.51 12800.18 4166000000 12800.18 16 | 12801.15 12984.95 12640.44 12827.49 4221260000 12827.49 17 | 12820.9 12998.11 12511.03 12589.07 4705390000 12589.07 18 | 12590.21 12814.97 12431.53 12735.31 5351030000 12735.31 19 | 20 | 0. The data has been loaded from a .csv file for you. 21 | 1. Create a "mask" array that indicates which rows have a volume 22 | greater than 5.5 billion. 23 | 2. How many are there? (hint: use sum). 24 | 3. Find the index of every row (or day) where the volume is greater 25 | than 5.5 billion. hint: look at the where() command. 26 | 27 | Bonus 28 | ~~~~~ 29 | 30 | 1. Plot the adjusted close for *every* day in 2008. 31 | 2. Now over-plot this plot with a 'red dot' marker for every 32 | day where the volume was greater than 5.5 billion. 33 | 34 | See :ref:`dow-selection-solution`. 35 | """ 36 | 37 | from numpy import loadtxt, sum, where 38 | import matplotlib.pyplot as plt 39 | # Constants that indicate what data is held in each column of 40 | # the 'dow' array. 41 | OPEN = 0 42 | HIGH = 1 43 | LOW = 2 44 | CLOSE = 3 45 | VOLUME = 4 46 | ADJ_CLOSE = 5 47 | 48 | # 0. The data has been loaded from a .csv file for you. 49 | 50 | # 'dow' is our NumPy array that we will manipulate. 51 | dow = loadtxt('dow.csv', delimiter=',') 52 | 53 | # 1. 
Create a "mask" array that indicates which rows have a volume 54 | # greater than 5.5 billion. 55 | 56 | 57 | # 2. How many are there? (hint: use sum). 58 | 59 | # 3. Find the index of every row (or day) where the volume is greater 60 | # than 5.5 billion. hint: look at the where() command. 61 | 62 | # BONUS: 63 | # a. Plot the adjusted close for EVERY day in 2008. 64 | # b. Now over-plot this plot with a 'red dot' marker for every 65 | # day where the volume was greater than 5.5 billion. 66 | -------------------------------------------------------------------------------- /exercises/dow_selection/dow_selection_solution.py: -------------------------------------------------------------------------------- 1 | """ 2 | 3 | Topics: Boolean array operators, sum function, where function, plotting. 4 | 5 | The array 'dow' is a 2-D array with each row holding the 6 | daily performance of the Dow Jones Industrial Average from the 7 | beginning of 2008 (dates have been removed for exercise simplicity). 8 | The array has the following structure:: 9 | 10 | OPEN HIGH LOW CLOSE VOLUME ADJ_CLOSE 11 | 13261.82 13338.23 12969.42 13043.96 3452650000 13043.96 12 | 13044.12 13197.43 12968.44 13056.72 3429500000 13056.72 13 | 13046.56 13049.65 12740.51 12800.18 4166000000 12800.18 14 | 12801.15 12984.95 12640.44 12827.49 4221260000 12827.49 15 | 12820.9 12998.11 12511.03 12589.07 4705390000 12589.07 16 | 12590.21 12814.97 12431.53 12735.31 5351030000 12735.31 17 | 18 | 0. The data has been loaded from a .csv file for you. 19 | 1. Create a "mask" array that indicates which rows have a volume 20 | greater than 5.5 billion. 21 | 2. How many are there? (hint: use sum). 22 | 3. Find the index of every row (or day) where the volume is greater 23 | than 5.5 billion. hint: look at the where() command. 24 | 25 | Bonus 26 | ~~~~~ 27 | 28 | 1. Plot the adjusted close for *every* day in 2008. 29 | 2. 
Now over-plot this plot with a 'red dot' marker for every 30 | day where the volume was greater than 5.5 billion. 31 | 32 | """ 33 | 34 | from __future__ import print_function 35 | from numpy import loadtxt, sum, where 36 | import matplotlib.pyplot as plt 37 | # Constants that indicate what data is held in each column of 38 | # the 'dow' array. 39 | OPEN = 0 40 | HIGH = 1 41 | LOW = 2 42 | CLOSE = 3 43 | VOLUME = 4 44 | ADJ_CLOSE = 5 45 | 46 | # 0. The data has been loaded from a .csv file for you. 47 | 48 | # 'dow' is our NumPy array that we will manipulate. 49 | dow = loadtxt('dow.csv', delimiter=',') 50 | 51 | 52 | # 1. Create a "mask" array that indicates which rows have a volume 53 | # greater than 5.5 billion. 54 | high_volume_mask = dow[:, VOLUME] > 5.5e9 55 | 56 | # 2. How many are there? (hint: use sum). 57 | high_volume_days = sum(high_volume_mask) 58 | print("The dow volume has been above 5.5 billion on" \ 59 | " %d days this year." % high_volume_days) 60 | 61 | # 3. Find the index of every row (or day) where the volume is greater 62 | # than 5.5 billion. hint: look at the where() command. 63 | high_vol_index = where(high_volume_mask)[0] 64 | 65 | # BONUS: 66 | # 1. Plot the adjusted close for EVERY day in 2008. 67 | # 2. Now over-plot this plot with a 'red dot' marker for every 68 | # day where the volume was greater than 5.5 billion. 69 | 70 | # Create a new plot. 71 | plt.figure() 72 | 73 | # Plot the adjusted close for every day of the year as a blue line. 74 | # In the format string 'b-', 'b' means blue and '-' indicates a line. 75 | plt.plot(dow[:, ADJ_CLOSE], 'b-') 76 | 77 | # Plot the days where the volume was high with red dots... 78 | plt.plot(high_vol_index, dow[high_vol_index, ADJ_CLOSE], 'ro') 79 | 80 | # Scripts must call the "plt.show" command to display the plot 81 | # to the screen. 
82 | plt.show() 83 | -------------------------------------------------------------------------------- /exercises/filter_image/dc_metro.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/enthought/Numpy-Tutorial-SciPyConf-2019/823def12cea444d9d3e9e04535c900bba30c789c/exercises/filter_image/dc_metro.png -------------------------------------------------------------------------------- /exercises/filter_image/filter_image.py: -------------------------------------------------------------------------------- 1 | """ 2 | Filter Image 3 | ------------ 4 | 5 | Read in the "dc_metro" image and use an averaging filter 6 | to "smooth" the image. Use a "5 point stencil" where 7 | you average the current pixel with its neighboring pixels:: 8 | 9 | 0 0 0 0 0 0 0 10 | 0 0 0 x 0 0 0 11 | 0 0 x x x 0 0 12 | 0 0 0 x 0 0 0 13 | 0 0 0 0 0 0 0 14 | 15 | Plot the image, the smoothed image, and the difference between the 16 | two. 17 | 18 | Bonus 19 | ~~~~~ 20 | 21 | Re-filter the image by passing the result image through the filter again. Do 22 | this 50 times and plot the resulting image. 23 | 24 | See :ref:`filter-image-solution`. 25 | """ 26 | 27 | import matplotlib.pyplot as plt 28 | 29 | img = plt.imread('dc_metro.png') 30 | 31 | plt.imshow(img, cmap=plt.cm.hot) 32 | plt.show() 33 | -------------------------------------------------------------------------------- /exercises/filter_image/filter_image_solution.py: -------------------------------------------------------------------------------- 1 | """ 2 | Filter Image 3 | ------------ 4 | 5 | Read in the "dc_metro" image and use an averaging filter 6 | to "smooth" the image. Use a "5 point stencil" where 7 | you average the current pixel with its neighboring pixels:: 8 | 9 | 0 0 0 0 0 0 0 10 | 0 0 0 x 0 0 0 11 | 0 0 x x x 0 0 12 | 0 0 0 x 0 0 0 13 | 0 0 0 0 0 0 0 14 | 15 | Plot the image, the smoothed image, and the difference between the 16 | two. 
17 | 18 | Bonus 19 | ~~~~~ 20 | 21 | Re-filter the image by passing the result image through the filter again. Do 22 | this 50 times and plot the resulting image. 23 | 24 | """ 25 | import numpy as np 26 | import matplotlib.pyplot as plt 27 | 28 | def smooth(img): 29 | avg_img =( img[1:-1 ,1:-1] # center 30 | + img[ :-2 ,1:-1] # top 31 | + img[2: ,1:-1] # bottom 32 | + img[1:-1 , :-2] # left 33 | + img[1:-1 ,2: ] # right 34 | ) / 5.0 35 | return avg_img 36 | 37 | 38 | def smooth_loop(img): 39 | smoothed = np.zeros((img.shape[0]-2, img.shape[1]-2)) 40 | for r in range(0, img.shape[0]-2): 41 | for c in range(0, img.shape[1]-2): 42 | smoothed[r, c] = ( img[r+1, c+1] # center 43 | + img[r , c+1] # top 44 | + img[r+2, c+1] # bottom 45 | + img[r+1, c ] # left 46 | + img[r+1, c+2] # right 47 | ) / 5.0 48 | return smoothed 49 | 50 | 51 | img = plt.imread('dc_metro.png') 52 | avg_img = smooth(img) 53 | 54 | plt.figure() 55 | # Set colormap so that images are plotted in gray scale. 56 | plt.gray() 57 | # Plot the original image first 58 | plt.subplot(1,3,1) 59 | plt.imshow(img) 60 | plt.title('original') 61 | 62 | # Now the filtered image. 63 | plt.subplot(1,3,2) 64 | plt.imshow(avg_img) 65 | plt.title('smoothed once') 66 | 67 | # And finally the difference between the two. 68 | plt.subplot(1,3,3) 69 | plt.imshow(img[1:-1,1:-1] - avg_img) 70 | plt.title('difference') 71 | 72 | 73 | # Bonus: Re-filter the image by passing the result image 74 | # through the filter again. Do this 50 times and plot 75 | # the resulting image. 76 | 77 | for num in range(50): 78 | avg_img = smooth(avg_img) 79 | 80 | # Plot the original image first 81 | plt.figure() 82 | plt.subplot(1,2,1) 83 | plt.imshow(img) 84 | plt.title('original') 85 | 86 | # Now the filtered image. 
87 | plt.subplot(1,2,2) 88 | plt.imshow(avg_img) 89 | plt.title('smoothed 50 times') 90 | 91 | assert np.allclose(smooth(img), smooth_loop(img)) 92 | 93 | plt.show() 94 | -------------------------------------------------------------------------------- /exercises/plotting/dc_metro.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/enthought/Numpy-Tutorial-SciPyConf-2019/823def12cea444d9d3e9e04535c900bba30c789c/exercises/plotting/dc_metro.JPG -------------------------------------------------------------------------------- /exercises/plotting/my_plots.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/enthought/Numpy-Tutorial-SciPyConf-2019/823def12cea444d9d3e9e04535c900bba30c789c/exercises/plotting/my_plots.png -------------------------------------------------------------------------------- /exercises/plotting/plotting.py: -------------------------------------------------------------------------------- 1 | """ 2 | Plotting 3 | -------- 4 | 5 | Create a plot display that looks like the following: 6 | 7 | .. image:: plotting/sample_plots.png 8 | 9 | `Photo credit: David Fettig 10 | `_ 11 | 12 | 13 | This is a 2x2 layout, with 3 slots occupied. 14 | 15 | 1. Sine function, with blue solid line; cosine with red '+' markers; the 16 | extents fit the plot exactly. Hint: see the plt.axis() function for setting the 17 | extents. 18 | 2. Sine function, with gridlines, axis labels, and title; the extents fit the 19 | plot exactly. 20 | 3. Image with color map; the extents run from -10 to 10, rather than the 21 | default. 22 | 23 | Save the resulting plot image to a file. (Use a different file name, so you 24 | don't overwrite the sample.) 25 | 26 | The color map in the example is 'winter'; use 'plt.cm.' to list the available 27 | ones, and experiment to find one you like. 
28 | 29 | Start with the following statements:: 30 | 31 | import matplotlib.pyplot as plt 32 | 33 | x = linspace(0, 2*pi, 101) 34 | s = sin(x) 35 | c = cos(x) 36 | 37 | img = plt.imread('dc_metro.JPG') 38 | 39 | Tip: If you find that the label of one plot overlaps another, try adding 40 | a call to `plt.tight_layout()` to your script. 41 | 42 | Bonus 43 | ~~~~~ 44 | 45 | 4. The `plt.subplot()` function returns an axes object, which can be 46 | assigned to the `sharex` and `sharey` keyword arguments of another 47 | plt.subplot() function call. E.g.:: 48 | 49 | ax1 = plt.subplot(2,2,1) 50 | ... 51 | plt.subplot(2,2,2, sharex=ax1, sharey=ax1) 52 | 53 | Make this modification to your script, and explore the consequences. 54 | Hint: try panning and zooming in the subplots. 55 | 56 | """ 57 | 58 | 59 | # The following imports are *not* needed in PyLab, but are needed in this file. 60 | from numpy import linspace, pi, sin, cos 61 | import matplotlib.pyplot as plt 62 | 63 | x = linspace(0, 2 * pi, 101) 64 | s = sin(x) 65 | c = cos(x) 66 | 67 | img = plt.imread('dc_metro.JPG') 68 | -------------------------------------------------------------------------------- /exercises/plotting/plotting_bonus_solution.py: -------------------------------------------------------------------------------- 1 | """ 2 | Plotting 3 | -------- 4 | 5 | Create a plot display that looks like the following: 6 | 7 | .. image:: plotting/sample_plots.png 8 | 9 | `Photo credit: David Fettig 10 | `_ 11 | 12 | 13 | This is a 2x2 layout, with 3 slots occupied. 14 | 15 | 1. Sine function, with blue solid line; cosine with red '+' markers; the 16 | extents fit the plot exactly. Hint: see the plt.axis() function for setting the 17 | extents. 18 | 2. Sine function, with gridlines, axis labels, and title; the extents fit the 19 | plot exactly. 20 | 3. Image with color map; the extents run from -10 to 10, rather than the 21 | default. 22 | 23 | Save the resulting plot image to a file. 
(Use a different file name, so you 24 | don't overwrite the sample.) 25 | 26 | The color map in the example is 'winter'; use 'plt.cm.' to list the available 27 | ones, and experiment to find one you like. 28 | 29 | Start with the following statements:: 30 | 31 | import matplotlib.pyplot as plt 32 | 33 | x = linspace(0, 2*pi, 101) 34 | s = sin(x) 35 | c = cos(x) 36 | 37 | img = plt.imread('dc_metro.JPG') 38 | 39 | Tip: If you find that the label of one plot overlaps another, try adding 40 | a call to `plt.tight_layout()` to your script. 41 | 42 | Bonus 43 | ~~~~~ 44 | 45 | 4. The `plt.subplot()` function returns an axes object, which can be 46 | assigned to the `sharex` and `sharey` keyword arguments of another 47 | plt.subplot() function call. E.g.:: 48 | 49 | ax1 = plt.subplot(2,2,1) 50 | ... 51 | plt.subplot(2,2,2, sharex=ax1, sharey=ax1) 52 | 53 | Make this modification to your script, and explore the consequences. 54 | Hint: try panning and zooming in the subplots. 55 | 56 | """ 57 | 58 | 59 | # The following imports are *not* needed in PyLab, but are needed in this file. 
60 | from numpy import linspace, pi, sin, cos 61 | import matplotlib.pyplot as plt 62 | 63 | x = linspace(0, 2 * pi, 101) 64 | s = sin(x) 65 | c = cos(x) 66 | 67 | img = plt.imread('dc_metro.JPG') 68 | 69 | plt.close('all') 70 | # 2x2 layout, first plot: sin and cos 71 | ax1 = plt.subplot(2, 2, 1) 72 | plt.plot(x, s, 'b-', x, c, 'r+') 73 | plt.axis('tight') 74 | 75 | # 2nd plot: gridlines, labels 76 | plt.subplot(2, 2, 2, sharex=ax1, sharey=ax1) 77 | plt.plot(x, s) 78 | plt.grid() 79 | plt.xlabel('radians') 80 | plt.ylabel('amplitude') 81 | plt.title('sin(x)') 82 | plt.axis('tight') 83 | 84 | # 3rd plot, image 85 | plt.subplot(2, 2, 3) 86 | plt.imshow(img, extent=[-10, 10, -10, 10], cmap=plt.cm.winter) 87 | 88 | plt.tight_layout() 89 | 90 | # Save before plt.show(): once the window is closed, savefig may write an empty figure. 91 | plt.savefig('my_plots.png') 92 | 93 | plt.show() 94 | -------------------------------------------------------------------------------- /exercises/plotting/plotting_solution.py: -------------------------------------------------------------------------------- 1 | """ 2 | Plotting 3 | -------- 4 | 5 | Create a plot display that looks like the following: 6 | 7 | .. image:: plotting/sample_plots.png 8 | 9 | `Photo credit: David Fettig 10 | `_ 11 | 12 | 13 | This is a 2x2 layout, with 3 slots occupied. 14 | 15 | 1. Sine function, with blue solid line; cosine with red '+' markers; the 16 | extents fit the plot exactly. Hint: see the plt.axis() function for setting the 17 | extents. 18 | 2. Sine function, with gridlines, axis labels, and title; the extents fit the 19 | plot exactly. 20 | 3. Image with color map; the extents run from -10 to 10, rather than the 21 | default. 22 | 23 | Save the resulting plot image to a file. (Use a different file name, so you 24 | don't overwrite the sample.) 25 | 26 | The color map in the example is 'winter'; use 'plt.cm.' to list the available 27 | ones, and experiment to find one you like. 
28 | 29 | Start with the following statements:: 30 | 31 | import matplotlib.pyplot as plt 32 | 33 | x = linspace(0, 2*pi, 101) 34 | s = sin(x) 35 | c = cos(x) 36 | 37 | img = plt.imread('dc_metro.JPG') 38 | 39 | Tip: If you find that the label of one plot overlaps another, try adding 40 | a call to `plt.tight_layout()` to your script. 41 | 42 | Bonus 43 | ~~~~~ 44 | 45 | 4. The `plt.subplot()` function returns an axes object, which can be 46 | assigned to the `sharex` and `sharey` keyword arguments of another 47 | plt.subplot() function call. E.g.:: 48 | 49 | ax1 = plt.subplot(2,2,1) 50 | ... 51 | plt.subplot(2,2,2, sharex=ax1, sharey=ax1) 52 | 53 | Make this modification to your script, and explore the consequences. 54 | Hint: try panning and zooming in the subplots. 55 | 56 | """ 57 | 58 | 59 | # The following imports are *not* needed in PyLab, but are needed in this file. 60 | from numpy import linspace, pi, sin, cos 61 | import matplotlib.pyplot as plt 62 | 63 | x = linspace(0, 2*pi, 101) 64 | s = sin(x) 65 | c = cos(x) 66 | 67 | img = plt.imread('dc_metro.JPG') 68 | 69 | plt.close('all') 70 | # 2x2 layout, first plot: sin and cos 71 | plt.subplot(2, 2, 1) 72 | plt.plot(x, s, 'b-', x, c, 'r+') 73 | plt.axis('tight') 74 | 75 | # 2nd plot: gridlines, labels 76 | plt.subplot(2, 2, 2) 77 | plt.plot(x, s) 78 | plt.grid() 79 | plt.xlabel('radians') 80 | plt.ylabel('amplitude') 81 | plt.title('sin(x)') 82 | plt.axis('tight') 83 | 84 | # 3rd plot, image 85 | plt.subplot(2, 2, 3) 86 | plt.imshow(img, extent=[-10, 10, -10, 10], cmap=plt.cm.winter) 87 | 88 | plt.tight_layout() 89 | 90 | # Save before plt.show(): once the window is closed, savefig may write an empty figure. 91 | plt.savefig('my_plots.png') 92 | 93 | plt.show() 94 | -------------------------------------------------------------------------------- /exercises/plotting/sample_plots.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/enthought/Numpy-Tutorial-SciPyConf-2019/823def12cea444d9d3e9e04535c900bba30c789c/exercises/plotting/sample_plots.png
--------------------------------------------------------------------------------
/exercises/wind_statistics/wind.desc:
--------------------------------------------------------------------------------
wind   daily average wind speeds for 1961-1978 at 12 synoptic meteorological
       stations in the Republic of Ireland (Haslett and Raftery 1989).

These data were analyzed in detail in the following article:
   Haslett, J. and Raftery, A. E. (1989). Space-time Modelling with
   Long-memory Dependence: Assessing Ireland's Wind Power Resource
   (with Discussion). Applied Statistics 38, 1-50.

Each line corresponds to one day of data in the following format:
year, month, day, average wind speed at each of the stations in the order given
in Fig. 4 of Haslett and Raftery:
RPT, VAL, ROS, KIL, SHA, BIR, DUB, CLA, MUL, CLO, BEL, MAL

Fortran format : ( i2, 2i3, 12f6.2)

The data are in knots, not in m/s.

Permission granted for unlimited distribution.

Please report all anomalies to fraley@stat.washington.edu

Be aware that the dataset is 532494 bytes long (that's over half a
megabyte). Please be sure you want the data before you request it.
--------------------------------------------------------------------------------
/exercises/wind_statistics/wind_statistics.py:
--------------------------------------------------------------------------------
"""
Wind Statistics
----------------

Topics: Using array methods over different axes, fancy indexing.

1. The data in 'wind.data' has the following format::

        61  1  1 15.04 14.96 13.17  9.29 13.96  9.87 13.67 10.25 10.83 12.58 18.50 15.04
        61  1  2 14.71 16.88 10.83  6.50 12.62  7.67 11.50 10.04  9.79  9.67 17.54 13.83
        61  1  3 18.50 16.88 12.33 10.13 11.17  6.17 11.25  8.04  8.50  7.67 12.75 12.71

   The first three columns are year, month and day. The
   remaining 12 columns are average windspeeds in knots at 12
   locations in Ireland on that day.

   Use the 'loadtxt' function from numpy to read the data into
   an array.

2. Calculate the min, max and mean windspeeds and standard deviation of the
   windspeeds over all the locations and all the times (a single set of numbers
   for the entire dataset).

3. Calculate the min, max and mean windspeeds and standard deviations of the
   windspeeds at each location over all the days (a different set of numbers
   for each location).

4. Calculate the min, max and mean windspeed and standard deviations of the
   windspeeds across all the locations at each day (a different set of numbers
   for each day).

5. Find the location which has the greatest windspeed on each day (an integer
   column number for each day).

6. Find the year, month and day on which the greatest windspeed was recorded.

7. Find the average windspeed in January for each location.

You should be able to perform all of these operations without using a for
loop or other looping construct.

Bonus
~~~~~

1. Calculate the mean windspeed for each month in the dataset. Treat
   January 1961 and January 1962 as *different* months. (Hint: first find a
   way to create an identifier unique for each month. The second step might
   require a for loop.)

2. Calculate the min, max and mean windspeeds and standard deviations of the
   windspeeds across all locations for each week (assume that the first week
   starts on January 1 1961) for the first 52 weeks. This can be done without
   any for loop.

Bonus Bonus
~~~~~~~~~~~

Calculate the mean windspeed for each month without using a for loop.
(Hint: look at `searchsorted` and `add.reduceat`.)

Notes
~~~~~

These data were analyzed in detail in the following article:

   Haslett, J. and Raftery, A. E. (1989). Space-time Modelling with
   Long-memory Dependence: Assessing Ireland's Wind Power Resource
   (with Discussion). Applied Statistics 38, 1-50.


See :ref:`wind-statistics-solution`.
"""

from numpy import loadtxt

--------------------------------------------------------------------------------
/exercises/wind_statistics/wind_statistics_solution.py:
--------------------------------------------------------------------------------
"""
Wind Statistics
----------------

Topics: Using array methods over different axes, fancy indexing.

1. The data in 'wind.data' has the following format::

        61  1  1 15.04 14.96 13.17  9.29 13.96  9.87 13.67 10.25 10.83 12.58 18.50 15.04
        61  1  2 14.71 16.88 10.83  6.50 12.62  7.67 11.50 10.04  9.79  9.67 17.54 13.83
        61  1  3 18.50 16.88 12.33 10.13 11.17  6.17 11.25  8.04  8.50  7.67 12.75 12.71

   The first three columns are year, month and day. The
   remaining 12 columns are average windspeeds in knots at 12
   locations in Ireland on that day.

   Use the 'loadtxt' function from numpy to read the data into
   an array.

2. Calculate the min, max and mean windspeeds and standard deviation of the
   windspeeds over all the locations and all the times (a single set of numbers
   for the entire dataset).

3. Calculate the min, max and mean windspeeds and standard deviations of the
   windspeeds at each location over all the days (a different set of numbers
   for each location).

4. Calculate the min, max and mean windspeed and standard deviations of the
   windspeeds across all the locations at each day (a different set of numbers
   for each day).

5. Find the location which has the greatest windspeed on each day (an integer
   column number for each day).

6. Find the year, month and day on which the greatest windspeed was recorded.

7. Find the average windspeed in January for each location.

You should be able to perform all of these operations without using a for
loop or other looping construct.

Bonus
~~~~~

1. Calculate the mean windspeed for each month in the dataset. Treat
   January 1961 and January 1962 as *different* months.

2. Calculate the min, max and mean windspeeds and standard deviations of the
   windspeeds across all locations for each week (assume that the first week
   starts on January 1 1961) for the first 52 weeks.

Bonus Bonus
~~~~~~~~~~~

Calculate the mean windspeed for each month without using a for loop.
(Hint: look at `searchsorted` and `add.reduceat`.)

Notes
~~~~~

These data were analyzed in detail in the following article:

   Haslett, J. and Raftery, A. E. (1989). Space-time Modelling with
   Long-memory Dependence: Assessing Ireland's Wind Power Resource
   (with Discussion). Applied Statistics 38, 1-50.

"""
from __future__ import print_function
from numpy import (loadtxt, arange, searchsorted, add, zeros, unravel_index,
                   where)

wind_data = loadtxt('wind.data')

data = wind_data[:, 3:]

print('2. Statistics over all values')
print(' min:', data.min())
print(' max:', data.max())
print(' mean:', data.mean())
print(' standard deviation:', data.std())
print()

print('3. Statistics over all days at each location')
print(' min:', data.min(axis=0))
print(' max:', data.max(axis=0))
print(' mean:', data.mean(axis=0))
print(' standard deviation:', data.std(axis=0))
print()

print('4. Statistics over all locations for each day')
print(' min:', data.min(axis=1))
print(' max:', data.max(axis=1))
print(' mean:', data.mean(axis=1))
print(' standard deviation:', data.std(axis=1))
print()

print('5. Location of daily maximum')
print(' daily max location:', data.argmax(axis=1))
print()

daily_max = data.max(axis=1)
max_row = daily_max.argmax()
# Note: Another way to do this would be to use the unravel_index function
# which takes a linear index and converts it to a location given the shape
# of the array:
max_row, max_col = unravel_index(data.argmax(), data.shape)
# Or you could use "where", which identifies *all* the places where the max
# occurs, rather than just the first. Note that "where" returns two arrays in
# this case, instead of two integers, so the result gets its own names and the
# integer max_row above is kept for the indexing below.
max_rows, max_cols = where(data == data.max())


print('6. Day of maximum reading')
print(' Year:', int(wind_data[max_row, 0]))
print(' Month:', int(wind_data[max_row, 1]))
print(' Day:', int(wind_data[max_row, 2]))
print()

january_indices = wind_data[:, 1] == 1
january_data = data[january_indices]

print('7. Statistics for January')
print(' mean:', january_data.mean(axis=0))
print()

# Bonus

# compute the month number for each day in the dataset
months = (wind_data[:, 0] - 61) * 12 + wind_data[:, 1] - 1

# we're going to use the month values as indices, so we need
# them to be integers
months = months.astype(int)

# get set of unique months
month_values = set(months)

# initialize an array to hold the result
monthly_means = zeros(len(month_values))

for month in month_values:
    # find the rows that correspond to the current month
    day_indices = (months == month)

    # extract the data for the current month using fancy indexing
    month_data = data[day_indices]

    # find the mean
    monthly_means[month] = month_data.mean()

    # Note: experts might do this all in one
    # monthly_means[month] = data[months==month].mean()

# In fact the whole for loop could reduce to the following one-liner
# (which also needs "array" imported from numpy)
# monthly_means = array([data[months==month].mean() for month in month_values])


print("Bonus 1.")
print(" mean:", monthly_means)
print()

# Bonus 2.
# Extract the data for the first 52 weeks. Then reshape the array to put
# on the same line 7 days worth of data for all locations. Let NumPy
# figure out the number of lines needed to do so
weekly_data = data[:52 * 7].reshape(-1, 7 * 12)

print('Bonus 2. Weekly statistics over all locations')
print(' min:', weekly_data.min(axis=1))
print(' max:', weekly_data.max(axis=1))
print(' mean:', weekly_data.mean(axis=1))
print(' standard deviation:', weekly_data.std(axis=1))
print()

# Bonus Bonus : this is really tricky...

# compute the month number for each day in the dataset
months = (wind_data[:, 0] - 61) * 12 + wind_data[:, 1] - 1

# find the indices for the start of each month
# this is a useful trick - we use range from 0 to the
# number of months + 1 and searchsorted to find the insertion
# points for each.
month_indices = searchsorted(months, arange(months[-1] + 2))

# now use add.reduceat to get the sum at each location
monthly_loc_totals = add.reduceat(data, month_indices[:-1])

# now use add to find the sum across all locations for each month
monthly_totals = monthly_loc_totals.sum(axis=1)

# now find total number of measurements for each month
month_days = month_indices[1:] - month_indices[:-1]
measurement_count = month_days * 12

# compute the mean
monthly_means = monthly_totals / measurement_count

print("Bonus Bonus")
print(" mean:", monthly_means)

# Notes: this method relies on the fact that the months are contiguous in the
# data set - the method used in the bonus section works for non-contiguous
# days.
--------------------------------------------------------------------------------
/slides.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/enthought/Numpy-Tutorial-SciPyConf-2019/823def12cea444d9d3e9e04535c900bba30c789c/slides.pdf
--------------------------------------------------------------------------------
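Note on the "Bonus Bonus" solution in wind_statistics_solution.py: the `searchsorted` plus `add.reduceat` combination may be easier to follow on a toy array. The sketch below is not part of the tutorial material; `month_ids` and `values` are made-up stand-ins for the `months` and `data` arrays, and it assumes, as the solution does, that the month identifiers are sorted and contiguous.

```python
import numpy as np

# Hypothetical toy data: 7 "days" at 2 "locations".
# month_ids plays the role of the sorted `months` array in the solution.
month_ids = np.array([0, 0, 0, 1, 1, 2, 2])
values = np.array([[1.0, 3.0],
                   [2.0, 4.0],
                   [3.0, 5.0],
                   [10.0, 20.0],
                   [30.0, 40.0],
                   [5.0, 5.0],
                   [7.0, 7.0]])

# Row index where each month starts, plus one final entry equal to the
# number of rows (the insertion point of a month past the last one).
starts = np.searchsorted(month_ids, np.arange(month_ids[-1] + 2))

# Sum each contiguous block of rows. The final entry is dropped because
# reduceat only accepts valid row indices as block starts.
block_sums = np.add.reduceat(values, starts[:-1])

# Days per month are the gaps between consecutive start indices.
days_per_month = np.diff(starts)

# Mean over all locations and all days of each month.
monthly_means = block_sums.sum(axis=1) / (days_per_month * values.shape[1])

print(starts)
print(monthly_means)
```

For this toy input the three monthly means come out as 3, 25 and 6, which matches averaging the rows of each month by hand. If the month identifiers were not sorted, `searchsorted` would no longer return block boundaries and the loop-based approach from Bonus 1 would be the safer choice.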