├── .gitignore
├── LICENSE
├── README.md
├── check_env.py
├── exercises
    ├── calc_derivative
    │   ├── calc_derivative.py
    │   └── calc_derivative_solution.py
    ├── dow_selection
    │   ├── dow.csv
    │   ├── dow_selection.py
    │   └── dow_selection_solution.py
    ├── load_text
    │   ├── complex_data_file.txt
    │   ├── float_data.txt
    │   ├── float_data_with_header.txt
    │   ├── load_text.py
    │   └── load_text_solution.py
    ├── plotting
    │   ├── dc_metro.JPG
    │   ├── my_plots.png
    │   ├── plotting.py
    │   ├── plotting_bonus_solution.py
    │   ├── plotting_solution.py
    │   └── sample_plots.png
    ├── structured_array
    │   ├── short_logs.crv
    │   ├── structured_array.py
    │   └── structured_array_solution.py
    └── wind_statistics
        ├── wind.data
        ├── wind.desc
        ├── wind_statistics.py
        └── wind_statistics_solution.py
└── slides.pdf


/.gitignore:
--------------------------------------------------------------------------------
 1 | # Byte-compiled / optimized / DLL files
 2 | __pycache__/
 3 | *.py[cod]
 4 | 
 5 | # C extensions
 6 | *.so
 7 | 
 8 | # Distribution / packaging
 9 | .Python
10 | env/
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | *.egg-info/
23 | .installed.cfg
24 | *.egg
25 | 
26 | # PyInstaller
27 | #  Usually these files are written by a python script from a template
28 | #  before PyInstaller builds the exe, so as to inject date/other infos into it.
29 | *.manifest
30 | *.spec
31 | 
32 | # Installer logs
33 | pip-log.txt
34 | pip-delete-this-directory.txt
35 | 
36 | # Unit test / coverage reports
37 | htmlcov/
38 | .tox/
39 | .coverage
40 | .coverage.*
41 | .cache
42 | nosetests.xml
43 | coverage.xml
44 | *,cover
45 | 
46 | # Translations
47 | *.mo
48 | *.pot
49 | 
50 | # Django stuff:
51 | *.log
52 | 
53 | # Sphinx documentation
54 | docs/_build/
55 | 
56 | # PyBuilder
57 | target/
58 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | © 2001-2016, Enthought, Inc.
2 | All Rights Reserved. Use only permitted under license. Copying, sharing, redistributing or other unauthorized use strictly prohibited.
3 | All trademarks and registered trademarks are the property of their respective owners.
4 | Enthought, Inc.
5 | 200 W Cesar Chavez Suite 202
6 | Austin, TX 78701
7 | www.enthought.com
8 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # SciPy2016 tutorial: Introduction to NumPy
 2 | 
 3 | This repository contains all the material needed by students registered for the
 4 | NumPy tutorial of SciPy 2016 on Monday, July 11th 2016.
 5 | 
 6 | For a smooth experience, you will need to make sure that you install or update
 7 | your Python distribution and download the tutorial material _before_ the day
 8 | of the tutorial as the Wi-Fi at the AT&T center can be flaky.
 9 | 
10 | 
11 | ## Python distribution and Packages needed
12 | 
13 | If you don't already have a working Python distribution, by far the easiest
14 | way to get everything you need for this tutorial is to download Enthought
15 | Canopy ([https://store.enthought.com/](https://store.enthought.com/),
16 | the free version is sufficient), or Continuum's Anaconda
17 | ([http://continuum.io/downloads](http://continuum.io/downloads)).
18 | 
19 | If you have the choice, I recommend using a Python 2.7 distribution, which
20 | is what I will be using and what my material has been tested with. If you have
21 | a Python 3.4+ version, you should be fine, though you might have to replace a
22 | print statement (`print a`) with the print function (`print(a)`) in some of the
23 | solution files.
24 | 
25 | To be able to run the examples, demos and exercises, you must have the
26 | following packages installed:
27 | 
28 | - numpy 1.10+
29 | - matplotlib 1.5+
30 | - ipython 4.0+ (for running, experimenting and doing exercises)
31 | - nose (only to test your distribution, see below)
32 | 
33 | If you use Canopy, everything you need will be installed by default. If you
34 | use `conda`, you can create a new environment using the following command:
35 | 
36 |     $ conda create -n numpy-tutorial python=2 numpy matplotlib nose ipython
37 | 
38 | To test your installation, please execute the `check_env.py` script. The
39 | output should look something like this:
40 | 
41 |     $ python check_env.py
42 |     ....
43 |     ----------------------------------------------------------------------
44 |     Ran 4 tests in 0.162 s
45 | 
46 |     OK
47 | 
48 | 
49 | ## Content needed
50 | 
51 | This GitHub repository is all that is needed in terms of tutorial content. The simplest solution is to download the material using this link:
52 | 
53 | https://github.com/enthought/Numpy-Tutorial-SciPyConf-2016/archive/master.zip
54 | 
55 | If you're familiar with Git, you can also clone this repository with:
56 | 
57 |     $ git clone https://github.com/enthought/Numpy-Tutorial-SciPyConf-2016.git
58 | 
59 | It will create a new folder named Numpy-Tutorial-SciPyConf-2016/ with all the
60 | content you will need: the slides I will go through (`slides.pdf`), and a folder
61 | of exercises.
62 | 
63 | As you get closer to the day of the tutorial, it is highly recommended to
64 | update this repository, as I will be improving it this week. To update it, open
65 | a command prompt, move **into** the Numpy-Tutorial-SciPyConf-2016/ folder and run:
66 | 
67 |     $ git pull
68 | 
69 | 
70 | ## Questions? Problems?
71 | 
72 | Don't wait, shoot me and the rest of the group an email on
73 | the tutorial mailing list: https://groups.google.com/forum/#!forum/scipy-2016-numpy-tutorial
74 | 


--------------------------------------------------------------------------------
/check_env.py:
--------------------------------------------------------------------------------
 1 | """ Run this file to check your python installation.
 2 | """
 3 | from numpy.testing import assert_array_equal
 4 | 
 5 | 
 6 | def test_import_numpy():
 7 |     import numpy
 8 | 
 9 | 
10 | def test_numpy_version():
11 |     import numpy
12 |     version_found = numpy.__version__.split(".")[:2]  # major, minor only
13 |     version_found = tuple(int(num) for num in version_found)
14 |     assert version_found >= (1, 10)
15 | 
16 | 
17 | def test_import_matplotlib():
18 |     from matplotlib.pyplot import plot
19 | 
20 | 
21 | def test_slicing():
22 |     from numpy import array
23 |     x = array([[1, 2, 3], [4, 5, 6]])
24 |     assert_array_equal(x[:, ::2], array([[1, 3], [4, 6]]))
25 | 
26 | 
27 | if __name__ == "__main__":
28 |     import nose
29 |     nose.run(defaultTest=__name__)
30 | 


--------------------------------------------------------------------------------
/exercises/calc_derivative/calc_derivative.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2016 Enthought, Inc. All Rights Reserved
 2 | """
 3 | Calculate Derivative
 4 | --------------------
 5 | 
 6 | Topics: NumPy array indexing and array math.
 7 | 
 8 | Use array slicing and math operations to calculate the
 9 | numerical derivative of ``sin`` from 0 to ``2*pi``.  There is no
10 | need to use a 'for' loop for this.
11 | 
12 | Plot the resulting values and compare to ``cos``.
13 | 
14 | Bonus
15 | ~~~~~
16 | 
17 | Implement integration of the same function using Riemann sums or the
18 | trapezoidal rule.
19 | 
20 | See :ref:`calc-derivative-solution`.
21 | """
22 | from numpy import linspace, pi, sin, cos, cumsum
23 | from matplotlib.pyplot import plot, show, subplot, legend, title
24 | 
25 | # calculate the sin() function on evenly spaced data.
26 | x = linspace(0,2*pi,101)
27 | y = sin(x)
28 | 
29 | plot(x,y)
30 | show()
31 | 


--------------------------------------------------------------------------------
/exercises/calc_derivative/calc_derivative_solution.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2016 Enthought, Inc. All Rights Reserved
 2 | """
 3 | Topics: NumPy array indexing and array math.
 4 | 
 5 | Use array slicing and math operations to calculate the
 6 | numerical derivative of ``sin`` from 0 to ``2*pi``.  There is no
 7 | need to use a for loop for this.
 8 | 
 9 | Plot the resulting values and compare to ``cos``.
10 | 
11 | Bonus
12 | ~~~~~
13 | 
14 | Implement integration of the same function using Riemann sums or the
15 | trapezoidal rule.
16 | 
17 | """
18 | from numpy import linspace, pi, sin, cos, cumsum
19 | from matplotlib.pyplot import plot, show, subplot, legend, title
20 | 
21 | # calculate the sin() function on evenly spaced data.
22 | x = linspace(0,2*pi,101)
23 | y = sin(x)
24 | 
25 | # calculate the derivative dy/dx numerically.
26 | # First, calculate the distance between adjacent pairs of
27 | # x and y values.
28 | dy = y[1:]-y[:-1]
29 | dx = x[1:]-x[:-1]
30 | 
31 | # Now divide to get "rise" over "run" for each interval.
32 | dy_dx = dy/dx
33 | 
34 | # Assuming central differences, these derivative values are
35 | # centered in between our original sample points.
36 | centers_x = (x[1:]+x[:-1])/2.0
37 | 
38 | # Plot our derivative calculation.  It should match up
39 | # with the cos function since the derivative of sin is
40 | # cos.
41 | subplot(1,2,1)
42 | plot(centers_x, dy_dx,'rx', centers_x, cos(centers_x),'b-')
43 | title(r"$\rm{Derivative\ of}\ sin(x)$")
44 | 
45 | # Trapezoidal rule integration.
46 | avg_height = (y[1:]+y[:-1])/2.0
47 | int_sin = cumsum(dx * avg_height)
48 | 
49 | # Plot our integration against -cos(x) - -cos(0)
50 | closed_form = -cos(x)+cos(0)
51 | subplot(1,2,2)
52 | plot(x[1:], int_sin,'rx', x, closed_form,'b-')
53 | legend(('numerical', 'actual'))
54 | title(r"$\int \, \sin(x) \, dx$")
55 | show()
56 | 
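As a cross-check on the slicing approach above, the following sketch recomputes the same difference quotients with `numpy.diff` and measures how far they are from `cos` at the interval midpoints. The names `dy_dx_slices`, `dy_dx_diff` and `max_error` are illustrative and not part of the original solution.

```python
import numpy as np

# Same sample points as the solution above.
x = np.linspace(0, 2 * np.pi, 101)
y = np.sin(x)

# Difference quotient built from slices, as in the solution.
dy_dx_slices = (y[1:] - y[:-1]) / (x[1:] - x[:-1])

# numpy.diff computes the same adjacent differences in one call.
dy_dx_diff = np.diff(y) / np.diff(x)

# Each quotient approximates the derivative at the midpoint of its interval.
centers_x = (x[1:] + x[:-1]) / 2.0
max_error = np.abs(dy_dx_diff - np.cos(centers_x)).max()

print("identical results:", np.allclose(dy_dx_slices, dy_dx_diff))
print("largest deviation from cos: %.2e" % max_error)
```

The deviation should shrink roughly with the square of the grid spacing, so doubling the number of sample points is a quick way to see the approximation improve.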


--------------------------------------------------------------------------------
/exercises/dow_selection/dow.csv:
--------------------------------------------------------------------------------
  1 | 13261.82,13338.23,12969.42,13043.96,3452650000,13043.96
  2 | 13044.12,13197.43,12968.44,13056.72,3429500000,13056.72
  3 | 13046.56,13049.65,12740.51,12800.18,4166000000,12800.18
  4 | 12801.15,12984.95,12640.44,12827.49,4221260000,12827.49
  5 | 12820.9,12998.11,12511.03,12589.07,4705390000,12589.07
  6 | 12590.21,12814.97,12431.53,12735.31,5351030000,12735.31
  7 | 12733.11,12931.29,12632.15,12853.09,5170490000,12853.09
  8 | 12850.74,12863.34,12495.91,12606.3,4495840000,12606.3
  9 | 12613.78,12866.1,12596.95,12778.15,3682090000,12778.15
 10 | 12777.5,12777.5,12425.92,12501.11,4601640000,12501.11
 11 | 12476.81,12699.05,12294.48,12466.16,5440620000,12466.16
 12 | 12467.05,12597.85,12089.38,12159.21,5303130000,12159.21
 13 | 12159.94,12441.85,11953.71,12099.3,6004840000,12099.3
 14 | 12092.72,12167.42,11508.74,11971.19,6544690000,11971.19
 15 | 11969.08,12339.1,11530.12,12270.17,3241680000,12270.17
 16 | 12272.69,12522.82,12114.83,12378.61,5735300000,12378.61
 17 | 12391.7,12590.69,12103.61,12207.17,4882250000,12207.17
 18 | 12205.71,12423.81,12061.42,12383.89,4100930000,12383.89
 19 | 12385.19,12604.92,12262.29,12480.3,4232960000,12480.3
 20 | 12480.14,12715.96,12311.55,12442.83,4742760000,12442.83
 21 | 12438.28,12734.74,12197.09,12650.36,4970290000,12650.36
 22 | 12638.17,12841.88,12510.05,12743.19,4650770000,12743.19
 23 | 12743.11,12810.34,12557.61,12635.16,3495780000,12635.16
 24 | 12631.85,12631.85,12234.97,12265.13,4315740000,12265.13
 25 | 12257.25,12436.33,12142.14,12200.1,4008120000,12200.1
 26 | 12196.2,12366.99,12045,12247,4589160000,12247
 27 | 12248.47,12330.97,12058.01,12182.13,3768490000,12182.13
 28 | 12181.89,12332.76,12006.79,12240.01,3593140000,12240.01
 29 | 12241.56,12524.12,12207.9,12373.41,4044640000,12373.41
 30 | 12368.12,12627.76,12354.22,12552.24,3856420000,12552.24
 31 | 12551.51,12611.26,12332.03,12376.98,3644760000,12376.98
 32 | 12376.66,12441.2,12216.68,12348.21,3583300000,12348.21
 33 | 12349.59,12571.11,12276.81,12337.22,3613550000,12337.22
 34 | 12333.31,12489.29,12159.42,12427.26,3870520000,12427.26
 35 | 12426.85,12545.79,12225.36,12284.3,3696660000,12284.3
 36 | 12281.09,12429.05,12116.92,12381.02,3572660000,12381.02
 37 | 12380.77,12612.47,12292.03,12570.22,3866350000,12570.22
 38 | 12569.48,12771.14,12449.08,12684.92,4096060000,12684.92
 39 | 12683.54,12815.59,12527.64,12694.28,3904700000,12694.28
 40 | 12689.28,12713.99,12463.32,12582.18,3938580000,12582.18
 41 | 12579.58,12579.58,12210.3,12266.39,4426730000,12266.39
 42 | 12264.36,12344.71,12101.29,12258.9,4117570000,12258.9
 43 | 12259.14,12291.22,11991.06,12213.8,4757180000,12213.8
 44 | 12204.93,12392.74,12105.36,12254.99,4277710000,12254.99
 45 | 12254.59,12267.86,12010.03,12040.39,4323460000,12040.39
 46 | 12039.09,12131.33,11778.66,11893.69,4565410000,11893.69
 47 | 11893.04,11993.75,11691.47,11740.15,4261240000,11740.15
 48 | 11741.33,12205.98,11741.33,12156.81,5109080000,12156.81
 49 | 12148.61,12360.58,12037.79,12110.24,4414280000,12110.24
 50 | 12096.49,12242.29,11832.88,12145.74,5073360000,12145.74
 51 | 12146.39,12249.86,11781.43,11951.09,5153780000,11951.09
 52 | 11946.45,12119.69,11650.44,11972.25,5683010000,11972.25
 53 | 11975.92,12411.63,11975.92,12392.66,5335630000,12392.66
 54 | 12391.52,12525.19,12077.27,12099.66,1203830000,12099.66
 55 | 12102.43,12434.34,12024.68,12361.32,6145220000,12361.32
 56 | 12361.97,12687.61,12346.17,12548.64,4499000000,12548.64
 57 | 12547.34,12639.82,12397.62,12532.6,4145120000,12532.6
 58 | 12531.79,12531.79,12309.62,12422.86,4055670000,12422.86
 59 | 12421.88,12528.13,12264.76,12302.46,4037930000,12302.46
 60 | 12303.92,12441.67,12164.22,12216.4,3686980000,12216.4
 61 | 12215.92,12384.84,12095.18,12262.89,4188990000,12262.89
 62 | 12266.64,12693.93,12266.64,12654.36,4745120000,12654.36
 63 | 12651.67,12790.28,12488.22,12608.92,4320440000,12608.92
 64 | 12604.69,12734.97,12455.04,12626.03,3920100000,12626.03
 65 | 12626.35,12738.3,12489.4,12609.42,3703100000,12609.42
 66 | 12612.59,12786.83,12550.22,12612.43,3747780000,12612.43
 67 | 12602.66,12664.38,12440.55,12576.44,3602500000,12576.44
 68 | 12574.65,12686.93,12416.53,12527.26,3556670000,12527.26
 69 | 12526.78,12705.9,12447.96,12581.98,3686150000,12581.98
 70 | 12579.78,12579.78,12280.89,12325.42,3723790000,12325.42
 71 | 12324.77,12430.86,12208.42,12302.06,3565020000,12302.06
 72 | 12303.6,12459.36,12223.97,12362.47,3581230000,12362.47
 73 | 12371.51,12670.56,12371.51,12619.27,4260370000,12619.27
 74 | 12617.4,12725.93,12472.71,12620.49,3713880000,12620.49
 75 | 12626.76,12965.47,12626.76,12849.36,4222380000,12849.36
 76 | 12850.91,12902.69,12666.08,12825.02,3420570000,12825.02
 77 | 12825.02,12870.86,12604.53,12720.23,3821900000,12720.23
 78 | 12721.45,12883.8,12627,12763.22,4103610000,12763.22
 79 | 12764.68,12979.88,12651.51,12848.95,4461660000,12848.95
 80 | 12848.38,12987.29,12703.7,12891.86,3891150000,12891.86
 81 | 12890.76,13015.62,12791.55,12871.75,3607000000,12871.75
 82 | 12870.37,12970.27,12737.82,12831.94,3815320000,12831.94
 83 | 12831.45,13052.91,12746.45,12820.13,4508890000,12820.13
 84 | 12818.34,13079.94,12721.94,13010,4448780000,13010
 85 | 13012.53,13191.49,12931.35,13058.2,3953030000,13058.2
 86 | 13056.57,13105.75,12896.5,12969.54,3410090000,12969.54
 87 | 12968.89,13071.07,12817.53,13020.83,3924100000,13020.83
 88 | 13010.82,13097.77,12756.14,12814.35,4075860000,12814.35
 89 | 12814.84,12965.95,12727.56,12866.78,3827550000,12866.78
 90 | 12860.68,12871.75,12648.09,12745.88,3518620000,12745.88
 91 | 12768.38,12903.33,12746.36,12876.05,3370630000,12876.05
 92 | 12872.08,12957.65,12716.16,12832.18,4018590000,12832.18
 93 | 12825.12,13037.44,12806.21,12898.38,3979370000,12898.38
 94 | 12891.29,13028.16,12798.39,12992.66,3836480000,12992.66
 95 | 12992.74,13069.52,12860.6,12986.8,3842590000,12986.8
 96 | 12985.41,13170.97,12899.19,13028.16,3683970000,13028.16
 97 | 13026.04,13026.04,12742.29,12828.68,3854320000,12828.68
 98 | 12824.94,12926.71,12550.39,12601.19,4517990000,12601.19
 99 | 12597.69,12743.68,12515.78,12625.62,3955960000,12625.62
100 | 12620.9,12637.43,12420.2,12479.63,3516380000,12479.63
101 | 12479.63,12626.84,12397.56,12548.35,3588860000,12548.35
102 | 12542.9,12693.77,12437.38,12594.03,3927240000,12594.03
103 | 12593.87,12760.21,12493.47,12646.22,3894440000,12646.22
104 | 12647.36,12750.84,12555.6,12638.32,3845630000,12638.32
105 | 12637.67,12645.4,12385.76,12503.82,3714320000,12503.82
106 | 12503.2,12620.98,12317.61,12402.85,4396380000,12402.85
107 | 12391.86,12540.37,12283.74,12390.48,4338640000,12390.48
108 | 12388.81,12652.81,12358.07,12604.45,4350790000,12604.45
109 | 12602.74,12602.74,12180.5,12209.81,4771660000,12209.81
110 | 12210.13,12406.36,12102.5,12280.32,4404570000,12280.32
111 | 12277.71,12425.98,12116.58,12289.76,4635070000,12289.76
112 | 12286.34,12317.2,12029.46,12083.77,4779980000,12083.77
113 | 12089.63,12337.72,12041.43,12141.58,4734240000,12141.58
114 | 12144.59,12376.72,12096.23,12307.35,4080420000,12307.35
115 | 12306.86,12381.44,12139.79,12269.08,3706940000,12269.08
116 | 12269.65,12378.67,12114.14,12160.3,3801960000,12160.3
117 | 12158.68,12212.33,11947.07,12029.06,4573570000,12029.06
118 | 12022.54,12188.31,11881.03,12063.09,4811670000,12063.09
119 | 12062.19,12078.23,11785.04,11842.69,5324900000,11842.69
120 | 11843.83,11986.94,11731.06,11842.36,4186370000,11842.36
121 | 11842.36,11962.37,11668.53,11807.43,4705050000,11807.43
122 | 11805.31,12008.7,11683.75,11811.83,4825640000,11811.83
123 | 11808.57,11808.57,11431.92,11453.42,5231280000,11453.42
124 | 11452.85,11556.33,11248.48,11346.51,6208260000,11346.51
125 | 11345.7,11504.55,11226.34,11350.01,5032330000,11350.01
126 | 11344.64,11465.79,11106.65,11382.26,5846290000,11382.26
127 | 11382.34,11510.41,11180.58,11215.51,5276090000,11215.51
128 | 11297.33,11336.49,11158.02,11288.53,3247590000,11288.53
129 | 11289.19,11477.52,11094.44,11231.96,5265420000,11231.96
130 | 11225.03,11459.52,11101.19,11384.21,6034110000,11384.21
131 | 11381.93,11505.12,11115.61,11147.44,5181000000,11147.44
132 | 11148.01,11351.24,11006.01,11229.02,5840430000,11229.02
133 | 11226.17,11292.04,10908.64,11100.54,6742200000,11100.54
134 | 11103.64,11299.7,10972.63,11055.19,5434860000,11055.19
135 | 11050.8,11201.67,10731.96,10962.54,7363640000,10962.54
136 | 10961.89,11308.41,10831.61,11239.28,6738630400,11239.28
137 | 11238.39,11538.5,11118.46,11446.66,7365209600,11446.66
138 | 11436.56,11599.57,11290.5,11496.57,5653280000,11496.57
139 | 11495.02,11663.4,11339.02,11467.34,4630640000,11467.34
140 | 11457.9,11692.79,11273.32,11602.5,6180230000,11602.5
141 | 11603.39,11820.21,11410.02,11632.38,6705830000,11632.38
142 | 11630.34,11714.21,11288.79,11349.28,6127980000,11349.28
143 | 11341.14,11540.78,11252.47,11370.69,4672560000,11370.69
144 | 11369.47,11439.25,11094.76,11131.08,4282960000,11131.08
145 | 11133.44,11444.05,11086.13,11397.56,5414240000,11397.56
146 | 11397.56,11681.47,11328.68,11583.69,5631330000,11583.69
147 | 11577.99,11631.16,11317.69,11378.02,5346050000,11378.02
148 | 11379.89,11512.61,11205.41,11326.32,4684870000,11326.32
149 | 11326.32,11449.67,11144.59,11284.15,4562280000,11284.15
150 | 11286.02,11652.24,11286.02,11615.77,1219310000,11615.77
151 | 11603.64,11745.71,11454.64,11656.07,4873420000,11656.07
152 | 11655.42,11680.5,11355.63,11431.43,5319380000,11431.43
153 | 11432.09,11808.49,11344.23,11734.32,4966810000,11734.32
154 | 11729.67,11933.55,11580.19,11782.35,5067310000,11782.35
155 | 11781.7,11830.39,11541.43,11642.47,4711290000,11642.47
156 | 11632.81,11689.05,11377.37,11532.96,4787600000,11532.96
157 | 11532.07,11744.33,11399.84,11615.93,4064000000,11615.93
158 | 11611.21,11776.41,11540.05,11659.9,4041820000,11659.9
159 | 11659.65,11744.49,11410.18,11479.39,3829290000,11479.39
160 | 11478.09,11501.45,11260.53,11348.55,4159760000,11348.55
161 | 11345.94,11511.06,11240.18,11417.43,4555030000,11417.43
162 | 11415.23,11501.29,11263.63,11430.21,4032590000,11430.21
163 | 11426.79,11684,11426.79,11628.06,3741070000,11628.06
164 | 11626.19,11626.19,11336.82,11386.25,3420600000,11386.25
165 | 11383.56,11483.62,11284.47,11412.87,3587570000,11412.87
166 | 11412.46,11575.14,11349.69,11502.51,3499610000,11502.51
167 | 11499.87,11756.46,11493.72,11715.18,3854280000,11715.18
168 | 11713.23,11730.49,11508.78,11543.55,3288120000,11543.55
169 | 


--------------------------------------------------------------------------------
/exercises/dow_selection/dow_selection.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2016 Enthought, Inc. All Rights Reserved
 2 | """
 3 | Dow Selection
 4 | -------------
 5 | 
 6 | Topics: Boolean array operators, sum function, where function, plotting.
 7 | 
 8 | The array 'dow' is a 2-D array with each row holding the
 9 | daily performance of the Dow Jones Industrial Average from the
10 | beginning of 2008 (dates have been removed for exercise simplicity).
11 | The array has the following structure::
12 | 
13 |        OPEN      HIGH      LOW       CLOSE     VOLUME      ADJ_CLOSE
14 |        13261.82  13338.23  12969.42  13043.96  3452650000  13043.96
15 |        13044.12  13197.43  12968.44  13056.72  3429500000  13056.72
16 |        13046.56  13049.65  12740.51  12800.18  4166000000  12800.18
17 |        12801.15  12984.95  12640.44  12827.49  4221260000  12827.49
18 |        12820.9   12998.11  12511.03  12589.07  4705390000  12589.07
19 |        12590.21  12814.97  12431.53  12735.31  5351030000  12735.31
20 | 
21 | 0. The data has been loaded from a .csv file for you.
22 | 1. Create a "mask" array that indicates which rows have a volume
23 |    greater than 5.5 billion.
24 | 2. How many are there?  (hint: use sum).
25 | 3. Find the index of every row (or day) where the volume is greater
26 |    than 5.5 billion. hint: look at the where() command.
27 | 
28 | Bonus
29 | ~~~~~
30 | 
31 | 1. Plot the adjusted close for *every* day in 2008.
32 | 2. Now over-plot this plot with a 'red dot' marker for every
33 |    day where the volume was greater than 5.5 billion.
34 | 
35 | See :ref:`dow-selection-solution`.
36 | """
37 | 
38 | from numpy import loadtxt, sum, where
39 | from matplotlib.pyplot import figure, plot, show
40 | 
41 | # Constants that indicate what data is held in each column of
42 | # the 'dow' array.
43 | OPEN = 0
44 | HIGH = 1
45 | LOW = 2
46 | CLOSE = 3
47 | VOLUME = 4
48 | ADJ_CLOSE = 5
49 | 
50 | # 0. The data has been loaded from a .csv file for you.
51 | 
52 | # 'dow' is our NumPy array that we will manipulate.
53 | dow = loadtxt('dow.csv', delimiter=',')
54 | 
55 | # 1. Create a "mask" array that indicates which rows have a volume
56 | #    greater than 5.5 billion.
57 | 
58 | 
59 | # 2. How many are there?  (hint: use sum).
60 | 
61 | # 3. Find the index of every row (or day) where the volume is greater
62 | #    than 5.5 billion. hint: look at the where() command.
63 | 
64 | # BONUS:
65 | # a. Plot the adjusted close for EVERY day in 2008.
66 | # b. Now over-plot this plot with a 'red dot' marker for every
67 | #    day where the volume was greater than 5.5 billion.
68 | 


--------------------------------------------------------------------------------
/exercises/dow_selection/dow_selection_solution.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2016 Enthought, Inc. All Rights Reserved
 2 | """
 3 | 
 4 | Topics: Boolean array operators, sum function, where function, plotting.
 5 | 
 6 | The array 'dow' is a 2-D array with each row holding the
 7 | daily performance of the Dow Jones Industrial Average from the
 8 | beginning of 2008 (dates have been removed for exercise simplicity).
 9 | The array has the following structure::
10 | 
11 |        OPEN      HIGH      LOW       CLOSE     VOLUME      ADJ_CLOSE
12 |        13261.82  13338.23  12969.42  13043.96  3452650000  13043.96
13 |        13044.12  13197.43  12968.44  13056.72  3429500000  13056.72
14 |        13046.56  13049.65  12740.51  12800.18  4166000000  12800.18
15 |        12801.15  12984.95  12640.44  12827.49  4221260000  12827.49
16 |        12820.9   12998.11  12511.03  12589.07  4705390000  12589.07
17 |        12590.21  12814.97  12431.53  12735.31  5351030000  12735.31
18 | 
19 | 0. The data has been loaded from a .csv file for you.
20 | 1. Create a "mask" array that indicates which rows have a volume
21 |    greater than 5.5 billion.
22 | 2. How many are there?  (hint: use sum).
23 | 3. Find the index of every row (or day) where the volume is greater
24 |    than 5.5 billion. hint: look at the where() command.
25 | 
26 | Bonus
27 | ~~~~~
28 | 
29 | 1. Plot the adjusted close for *every* day in 2008.
30 | 2. Now over-plot this plot with a 'red dot' marker for every
31 |    day where the volume was greater than 5.5 billion.
32 | 
33 | """
34 | 
35 | from numpy import loadtxt, sum, where
36 | from matplotlib.pyplot import figure, plot, show
37 | 
38 | # Constants that indicate what data is held in each column of
39 | # the 'dow' array.
40 | OPEN = 0
41 | HIGH = 1
42 | LOW = 2
43 | CLOSE = 3
44 | VOLUME = 4
45 | ADJ_CLOSE = 5
46 | 
47 | # 0. The data has been loaded from a csv file for you.
48 | 
49 | # 'dow' is our NumPy array that we will manipulate.
50 | dow = loadtxt('dow.csv', delimiter=',')
51 | 
52 | 
53 | # 1. Create a "mask" array that indicates which rows have a volume
54 | #    greater than 5.5 billion.
55 | high_volume_mask = dow[:, VOLUME] > 5.5e9
56 | 
57 | # 2. How many are there?  (hint: use sum).
58 | high_volume_days = sum(high_volume_mask)
59 | print("The dow volume has been above 5.5 billion on"
60 |       " %d days this year." % high_volume_days)
61 | 
62 | # 3. Find the index of every row (or day) where the volume is greater
63 | #    than 5.5 billion. hint: look at the where() command.
64 | high_vol_index = where(high_volume_mask)[0]
65 | 
66 | # BONUS:
67 | # 1. Plot the adjusted close for EVERY day in 2008.
68 | # 2. Now over-plot this plot with a 'red dot' marker for every
69 | #    day where the volume was greater than 5.5 billion.
70 | 
71 | # Create a new plot.
72 | figure()
73 | 
74 | # Plot the adjusted close for every day of the year as a blue line.
75 | # In the format string 'b-', 'b' means blue and '-' indicates a line.
76 | plot(dow[:, ADJ_CLOSE], 'b-')
77 | 
78 | # Plot the days where the volume was high with red dots...
79 | plot(high_vol_index, dow[high_vol_index, ADJ_CLOSE], 'ro')
80 | 
81 | # Scripts must call the plot "show" command to display the plot
82 | # to the screen.
83 | show()
84 | 
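The mask, count, and index steps can be tried on a tiny array without loading `dow.csv`. In this sketch the `volume` values are made up for illustration, and `numpy.count_nonzero` and `numpy.flatnonzero` are shown only as alternative spellings, not as what the solution uses.

```python
import numpy as np

# Made-up volumes (not taken from dow.csv), just to exercise the logic.
volume = np.array([3.4e9, 5.7e9, 4.1e9, 6.0e9, 5.5e9])

# Boolean mask: True where the volume is strictly above 5.5 billion.
mask = volume > 5.5e9

# True counts as 1, so summing the mask counts the matching rows.
count = int(mask.sum())

# where() returns a tuple with one index array per dimension.
index = np.where(mask)[0]

# Equivalent spellings for the count and the indices.
assert count == np.count_nonzero(mask)
assert np.array_equal(index, np.flatnonzero(mask))

print(count, index)
```

Note that the last value, exactly 5.5 billion, is excluded because the comparison is strict.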


--------------------------------------------------------------------------------
/exercises/load_text/complex_data_file.txt:
--------------------------------------------------------------------------------
 1 | -- THIS IS THE BEGINNING OF THE FILE --
 2 | % This is a more complex file to read!
 3 | 
 4 | % Day,  Month,  Year, Useless Col, Avg Power
 5 |    01,     01,  2000,      ad766,         30 
 6 |    02,     01,  2000,       t873,         41
 7 | % we don't have Jan 03rd!
 8 |    04,     01,  2000,       r441,         55
 9 |    05,     01,  2000,       s345,         78
10 |    06,     01,  2000,       x273,        134 % that day was crazy
11 |    07,     01,  2000,       x355,         42
12 | 
13 | %-- THIS IS THE END OF THE FILE --
14 | 


--------------------------------------------------------------------------------
/exercises/load_text/float_data.txt:
--------------------------------------------------------------------------------
1 | 1 2 3 4
2 | 5 6 7 8


--------------------------------------------------------------------------------
/exercises/load_text/float_data_with_header.txt:
--------------------------------------------------------------------------------
1 | c1 c2 c3 c4
2 | 1   2  3  4
3 | 5   6  7  8


--------------------------------------------------------------------------------
/exercises/load_text/load_text.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2016 Enthought, Inc. All Rights Reserved
 2 | """
 3 | Load Array from Text File
 4 | -------------------------
 5 | 
 6 | 0. From the IPython prompt, type::
 7 | 
 8 |         In [1]: loadtxt?
 9 | 
10 |    to see the options on how to use the loadtxt command.
11 | 
12 | 
13 | 1. Use loadtxt to load in a 2D array of floating point values from
14 |    'float_data.txt'.  The data in the file looks like::
15 | 
16 |         1 2 3 4
17 |         5 6 7 8
18 | 
19 |    The resulting data should be a 2x4 array of floating point values.
20 | 
21 | 2. In the second example, the file 'float_data_with_header.txt' has
22 |    strings as column names in the first row::
23 | 
24 |         c1 c2 c3 c4
25 |          1  2  3  4
26 |          5  6  7  8
27 | 
28 |    Ignore these column names, and read the remainder of the data into
29 |    a 2D array.
30 | 
31 |    Later on, we'll learn how to create a "structured array" using
32 |    these column names to create fields within an array.
33 | 
34 | Bonus
35 | ~~~~~
36 | 
37 | 3. A third example is more involved (the file is called
38 |    'complex_data_file.txt'). It contains comments in multiple
39 |    locations, uses multiple formats, and includes a useless column to
40 |    skip::
41 | 
42 |     -- THIS IS THE BEGINNING OF THE FILE --
43 |     % This is a more complex file to read!
44 | 
45 |     % Day,  Month,  Year, Useless Col, Avg Power
46 |        01,     01,  2000,      ad766,         30
47 |        02,     01,  2000,       t873,         41
48 |     % we don't have Jan 03rd!
49 |        04,     01,  2000,       r441,         55
50 |        05,     01,  2000,       s345,         78
51 |        06,     01,  2000,       x273,        134 % that day was crazy
52 |        07,     01,  2000,       x355,         42
53 | 
54 |     %-- THIS IS THE END OF THE FILE --
55 | 
56 | 
57 | See :ref:`load-text-solution`
58 | """
59 | 
60 | from numpy import loadtxt
61 | 


--------------------------------------------------------------------------------
/exercises/load_text/load_text_solution.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2016 Enthought, Inc. All Rights Reserved
 2 | """
 3 | Load Array from Text File
 4 | -------------------------
 5 | 
 6 | 0. From the IPython prompt, type::
 7 | 
 8 |         In [1]: loadtxt?
 9 | 
10 |    to see the options on how to use the loadtxt command.
11 | 
12 | 
13 | 1. Use loadtxt to load in a 2D array of floating point values from
14 |    'float_data.txt'.  The data in the file looks like::
15 | 
16 |         1 2 3 4
17 |         5 6 7 8
18 | 
19 |    The resulting data should be a 2x4 array of floating point values.
20 | 
21 | 2. In the second example, the file 'float_data_with_header.txt' has
22 |    strings as column names in the first row::
23 | 
24 |         c1 c2 c3 c4
25 |          1  2  3  4
26 |          5  6  7  8
27 | 
28 |    Ignore these column names, and read the remainder of the data into
29 |    a 2D array.
30 | 
31 |    Later on, we'll learn how to create a "structured array" using
32 |    these column names to create fields within an array.
33 | 
34 | Bonus
35 | ~~~~~
36 | 
37 | 3. A third example is more involved. It contains comments in multiple
38 |    locations, uses multiple formats, and includes a useless column to
39 |    skip::
40 | 
41 |     -- THIS IS THE BEGINNING OF THE FILE --
42 |     % This is a more complex file to read!
43 | 
44 |     % Day,  Month,  Year, Useless Col, Avg Power
45 |        01,     01,  2000,      ad766,         30
46 |        02,     01,  2000,       t873,         41
47 |     % we don't have Jan 03rd!
48 |        04,     01,  2000,       r441,         55
49 |        05,     01,  2000,       s345,         78
50 |        06,     01,  2000,       x273,        134 % that day was crazy
51 |        07,     01,  2000,       x355,         42
52 | 
53 |     %-- THIS IS THE END OF THE FILE --
54 | """
55 | 
56 | from numpy import loadtxt
57 | 
58 | #############################################################################
59 | # 1. Simple example loading a 2x4 array of floats from a file.
60 | #############################################################################
61 | ary1 = loadtxt('float_data.txt')
62 | 
63 | print('example 1:')
64 | print(ary1)
65 | 
66 | 
67 | #############################################################################
68 | # 2. Same example, but skipping the first row of column headers
69 | #############################################################################
70 | ary2 = loadtxt('float_data_with_header.txt', skiprows=1)
71 | 
72 | print('example 2:')
73 | print(ary2)
74 | 
75 | #############################################################################
76 | # 3. More complex example with comments and columns to skip
77 | #############################################################################
78 | ary3 = loadtxt("complex_data_file.txt", delimiter=",", comments="%",
79 |                usecols=(0, 1, 2, 4), dtype=int, skiprows=1)
80 | 
81 | print('example 3:')
82 | print(ary3)
83 | 
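A minimal sketch of how the `comments`, `delimiter`, `usecols` and `skiprows` options used in example 3 interact. It reads from an in-memory `StringIO` object rather than 'complex_data_file.txt'; the shortened text and the name `power` are assumptions made for illustration only.

```python
from io import StringIO

import numpy as np

# Assumed stand-in for the complex data file: a banner line, '%' comments,
# blank lines, a text column to skip, and one trailing inline comment.
text = StringIO(
    "-- THIS IS THE BEGINNING OF THE FILE --\n"
    "% This is a more complex file to read!\n"
    "\n"
    "% Day,  Month,  Year, Useless Col, Avg Power\n"
    "   01,     01,  2000,      ad766,         30\n"
    "   02,     01,  2000,       t873,         41\n"
    "% we don't have Jan 03rd!\n"
    "   06,     01,  2000,       x273,        134 % that day was crazy\n"
    "\n"
    "%-- THIS IS THE END OF THE FILE --\n"
)

# skiprows=1 drops the banner, comments="%" drops comment text,
# and usecols leaves out column 3, which is not numeric.
power = np.loadtxt(text, delimiter=",", comments="%",
                   usecols=(0, 1, 2, 4), dtype=int, skiprows=1)

print(power)
```

The banner line does not start with the comment character, which is why `skiprows=1` is still needed alongside `comments="%"`.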


--------------------------------------------------------------------------------
/exercises/plotting/dc_metro.JPG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/enthought/Numpy-Tutorial-SciPyConf-2016/8e9e8cbb57f8976a4572800fe808aedc74775a81/exercises/plotting/dc_metro.JPG


--------------------------------------------------------------------------------
/exercises/plotting/my_plots.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/enthought/Numpy-Tutorial-SciPyConf-2016/8e9e8cbb57f8976a4572800fe808aedc74775a81/exercises/plotting/my_plots.png


--------------------------------------------------------------------------------
/exercises/plotting/plotting.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2016 Enthought, Inc. All Rights Reserved
 2 | """
 3 | Plotting
 4 | --------
 5 | 
 6 | In PyLab, create a plot display that looks like the following:
 7 | 
 8 | .. image:: plotting/sample_plots.png
 9 | 
10 | `Photo credit: David Fettig
11 | <http://www.publicdomainpictures.net/view-image.php?image=507>`_
12 | 
13 | 
14 | This is a 2x2 layout, with 3 slots occupied.
15 | 
16 | 1. Sine function, with blue solid line; cosine with red '+' markers; the
17 |    extents fit the plot exactly. Hint: see the axis() function for setting the
18 |    extents.
19 | 2. Sine function, with gridlines, axis labels, and title; the extents fit the
20 |    plot exactly.
21 | 3. Image with color map; the extents run from -10 to 10, rather than the
22 |    default.
23 | 
24 | Save the resulting plot image to a file. (Use a different file name, so you
25 | don't overwrite the sample.)
26 | 
27 | The color map in the example is 'winter'; use 'cm.<tab>' to list the available
28 | ones, and experiment to find one you like.
29 | 
30 | Start with the following statements::
31 | 
32 |     from matplotlib.pyplot import imread
33 | 
34 |     x = linspace(0, 2*pi, 101)
35 |     s = sin(x)
36 |     c = cos(x)
37 | 
38 |     img = imread('dc_metro.JPG')
39 | 
40 | Tip: If you find that the label of one plot overlaps another plot, try adding
41 | a call to `tight_layout()` to your script.
42 | 
43 | Bonus
44 | ~~~~~
45 | 
46 | 4. The `subplot()` function returns an axes object, which can be assigned to
47 |    the `sharex` and `sharey` keyword arguments of another subplot() function
48 |    call.  E.g.::
49 | 
50 |        ax1 = subplot(2,2,1)
51 |        ...
52 |        subplot(2,2,2, sharex=ax1, sharey=ax1)
53 | 
54 |    Make this modification to your script, and explore the consequences.
55 |    Hint: try panning and zooming in the subplots.
56 | 
57 | See :ref:`plotting-solution`.
58 | """
59 | 
60 | 
61 | # The following imports are *not* needed in PyLab, but are needed in this file.
62 | from numpy import linspace, pi, sin, cos
63 | from matplotlib.pyplot import (plot, subplot, cm, imread, imshow, xlabel,
64 |                                ylabel, title, grid, axis, show, savefig, gcf,
65 |                                figure, close, tight_layout)
66 | 
67 | x = linspace(0, 2 * pi, 101)
68 | s = sin(x)
69 | c = cos(x)
70 | 
71 | img = imread('dc_metro.JPG')
72 | 


--------------------------------------------------------------------------------
/exercises/plotting/plotting_bonus_solution.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2016 Enthought, Inc. All Rights Reserved
 2 | """
 3 | Plotting
 4 | --------
 5 | 
 6 | In PyLab, create a plot display that looks like the following:
 7 | 
 8 | .. image:: plotting/sample_plots.png
 9 | 
10 | `Photo credit: David Fettig
11 |     <http://www.publicdomainpictures.net/view-image.php?image=507>`_
12 | 
13 | 
14 | This is a 2x2 layout, with 3 slots occupied.
15 | 
16 | 1. Sine function, with blue solid line; cosine with red '+' markers; the
17 |    extents fit the plot exactly. Hint: see the axis() function for setting the
18 |    extents.
19 | 2. Sine function, with gridlines, axis labels, and title; the extents fit the
20 |    plot exactly.
21 | 3. Image with color map; the extents run from -10 to 10, rather than the
22 |    default.
23 | 
24 | Save the resulting plot image to a file. (Use a different file name, so you
25 | don't overwrite the sample.)
26 | 
27 | The color map in the example is 'winter'; use 'cm.<tab>' to list the available
28 | ones, and experiment to find one you like.
29 | 
30 | Start with the following statements::
31 | 
32 |     from matplotlib.pyplot import imread
33 | 
34 |     x = linspace(0, 2*pi, 101)
35 |     s = sin(x)
36 |     c = cos(x)
37 | 
38 |     img = imread('dc_metro.JPG')
39 | 
40 | Tip: If you find that the label of one plot overlaps another plot, try adding
41 | a call to `tight_layout()` to your script.
42 | 
43 | Bonus
44 | ~~~~~
45 | 
46 | 4. The `subplot()` function returns an axes object, which can be assigned to
47 |    the `sharex` and `sharey` keyword arguments of another subplot() function
48 |    call.  E.g.::
49 | 
50 |        ax1 = subplot(2,2,1)
51 |        ...
52 |        subplot(2,2,2, sharex=ax1, sharey=ax1)
53 | 
54 |    Make this modification to your script, and explore the consequences.
55 |    Hint: try panning and zooming in the subplots.
56 | 
57 | """
58 | 
59 | 
60 | # The following imports are *not* needed in PyLab, but are needed in this file.
61 | from numpy import linspace, pi, sin, cos
62 | from matplotlib.pyplot import (plot, subplot, cm, imread, imshow, xlabel,
63 |                                ylabel, title, grid, axis, show, savefig, gcf,
64 |                                figure, close, tight_layout)
65 | 
66 | x = linspace(0, 2 * pi, 101)
67 | s = sin(x)
68 | c = cos(x)
69 | 
70 | img = imread('dc_metro.JPG')
71 | 
72 | close('all')
73 | # 2x2 layout, first plot: sin and cos
74 | ax1 = subplot(2, 2, 1)
75 | plot(x, s, 'b-', x, c, 'r+')
76 | axis('tight')
77 | 
78 | # 2nd plot: gridlines, labels
79 | subplot(2, 2, 2, sharex=ax1, sharey=ax1)
80 | plot(x, s)
81 | grid()
82 | xlabel('radians')
83 | ylabel('amplitude')
84 | title('sin(x)')
85 | axis('tight')
86 | 
87 | # 3rd plot, image
88 | subplot(2, 2, 3)
89 | imshow(img, extent=[-10, 10, -10, 10], cmap=cm.winter)
90 | 
91 | tight_layout()
92 | 
93 | # Save before show(): once the window is closed the figure may be empty.
94 | savefig('my_plots.png')
95 | 
96 | show()
97 | 
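A small sketch of what sharing axes in the bonus step does, written so that it can run without a display. The `Agg` backend and the names `ax2` and `shared_limits` are assumptions for illustration; in an interactive session the same effect appears when panning or zooming one subplot.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 101)

# Two subplots that share both axes, as in the bonus exercise.
ax1 = plt.subplot(2, 2, 1)
ax1.plot(x, np.sin(x), 'b-')
ax2 = plt.subplot(2, 2, 2, sharex=ax1, sharey=ax1)
ax2.plot(x, np.cos(x), 'r+')

# Changing the x limits on the first axes is mirrored on the second.
ax1.set_xlim(0, np.pi)
shared_limits = ax2.get_xlim()
print(shared_limits)

plt.close('all')
```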


--------------------------------------------------------------------------------
/exercises/plotting/plotting_solution.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2016 Enthought, Inc. All Rights Reserved
 2 | """
 3 | Plotting
 4 | --------
 5 | 
 6 | In PyLab, create a plot display that looks like the following:
 7 | 
 8 | .. image:: plotting/sample_plots.png
 9 | 
10 | `Photo credit: David Fettig
11 | <http://www.publicdomainpictures.net/view-image.php?image=507>`_
12 | 
13 | 
14 | This is a 2x2 layout, with 3 slots occupied.
15 | 
16 | 1. Sine function, with blue solid line; cosine with red '+' markers; the
17 |    extents fit the plot exactly. Hint: see the axis() function for setting the
18 |    extents.
19 | 2. Sine function, with gridlines, axis labels, and title; the extents fit the
20 |    plot exactly.
21 | 3. Image with color map; the extents run from -10 to 10, rather than the
22 |    default.
23 | 
24 | Save the resulting plot image to a file. (Use a different file name, so you
25 | don't overwrite the sample.)
26 | 
27 | The color map in the example is 'winter'; use 'cm.<tab>' to list the available
28 | ones, and experiment to find one you like.
29 | 
30 | Start with the following statements::
31 | 
32 |     from matplotlib.pyplot import imread
33 | 
34 |     x = linspace(0, 2*pi, 101)
35 |     s = sin(x)
36 |     c = cos(x)
37 | 
38 |     img = imread('dc_metro.JPG')
39 | 
40 | Tip: If you find that the label of one plot overlaps another plot, try adding
41 | a call to `tight_layout()` to your script.
42 | 
43 | Bonus
44 | ~~~~~
45 | 
46 | 4. The `subplot()` function returns an axes object, which can be assigned to
47 |    the `sharex` and `sharey` keyword arguments of another subplot() function
48 |    call.  E.g.::
49 | 
50 |        ax1 = subplot(2,2,1)
51 |        ...
52 |        subplot(2,2,2, sharex=ax1, sharey=ax1)
53 | 
54 |    Make this modification to your script, and explore the consequences.
55 |    Hint: try panning and zooming in the subplots.
56 | 
57 | """
58 | 
59 | 
60 | # The following imports are *not* needed in PyLab, but are needed in this file.
61 | from numpy import linspace, pi, sin, cos
62 | from matplotlib.pyplot import (plot, subplot, cm, imread, imshow, xlabel,
63 |                                ylabel, title, grid, axis, show, savefig, gcf,
64 |                                figure, close, tight_layout)
65 | 
66 | x = linspace(0, 2*pi, 101)
67 | s = sin(x)
68 | c = cos(x)
69 | 
70 | img = imread('dc_metro.JPG')
71 | 
72 | close('all')
73 | # 2x2 layout, first plot: sin and cos
74 | subplot(2, 2, 1)
75 | plot(x, s, 'b-', x, c, 'r+')
76 | axis('tight')
77 | 
78 | # 2nd plot: gridlines, labels
79 | subplot(2, 2, 2)
80 | plot(x, s)
81 | grid()
82 | xlabel('radians')
83 | ylabel('amplitude')
84 | title('sin(x)')
85 | axis('tight')
86 | 
87 | # 3rd plot, image
88 | subplot(2, 2, 3)
89 | imshow(img, extent=[-10, 10, -10, 10], cmap=cm.winter)
90 | 
91 | tight_layout()
92 | 
93 | # Save before show(): once the window is closed the figure may be empty.
94 | savefig('my_plots.png')
95 | 
96 | show()
97 | 


--------------------------------------------------------------------------------
/exercises/plotting/sample_plots.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/enthought/Numpy-Tutorial-SciPyConf-2016/8e9e8cbb57f8976a4572800fe808aedc74775a81/exercises/plotting/sample_plots.png


--------------------------------------------------------------------------------
/exercises/structured_array/structured_array.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2016 Enthought, Inc. All Rights Reserved
 2 | """
 3 | Structured Array
 4 | ----------------
 5 | 
 6 | In this exercise you will read columns of data into a structured array using
 7 | loadtxt and combine that array with a regular 2D view of the same data to
 8 | analyze it and learn how the pressure velocity evolves as a function of the
 9 | shear velocity in sound waves in the Earth.
10 | 
11 | 1. The data in 'short_logs.crv' has the following format::
12 | 
13 |        DEPTH          CALI       S-SONIC   ...
14 |        8744.5000   -999.2500   -999.2500   ...
15 |        8745.0000   -999.2500   -999.2500   ...
16 |        8745.5000   -999.2500   -999.2500   ...
17 | 
18 |    Here the first row defines a set of names for the columns
19 |    of data in the file.  Use these column names to define a
20 |    dtype for a structured array that will have fields 'DEPTH',
21 |    'CALI', etc.  Assume all the data is of the float64 data
22 |    format.
23 | 
24 | 2. Use the 'loadtxt' function from numpy to read the data from
25 |    the file into a structured array with the dtype created
26 |    in (1).  Name this array 'logs'.
27 | 
28 | 3. The 'logs' array is nice for retrieving columns from the data.
29 |    For example, logs['DEPTH'] returns the values from the DEPTH
30 |    column of the data.  For row-based or array-wide operations,
31 |    it is more convenient to have a 2D view into the data, as if it
32 |    is a simple 2D array of float64 values.
33 | 
34 |    Create a 2D array called 'logs_2d' using the view operation.
35 |    Be sure the 2D array has the same number of columns as in the
36 |    data file.
37 | 
38 | 4. -999.25 is a "special" value in this data set.  It is
39 |    intended to represent missing data.  Replace all of these
40 |    values with NaNs.  Is this easier with the 'logs' array
41 |    or the 'logs_2d' array?
42 | 
43 | 5. Create a mask for all the "complete" rows in the array.
44 |    A complete row is one that doesn't have any NaN values measured
45 |    in that row.
46 | 
47 |    HINT: The ``all`` function is also useful here.
48 | 
49 | 6. Plot the VP vs VS logs for the "complete" rows.
50 | 
51 | See :ref:`structured-array-solution`.
52 | """
53 | from numpy import dtype, loadtxt, float64, nan, isfinite, all
54 | from matplotlib.pyplot import plot, show, xlabel, ylabel
55 | 
56 | # Open the file.
57 | log_file = open('short_logs.crv')
58 | 
59 | # The first line is a header that has all the log names.
60 | header = log_file.readline()
61 | log_names = header.split()
62 | 


--------------------------------------------------------------------------------
/exercises/structured_array/structured_array_solution.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2016 Enthought, Inc. All Rights Reserved
 2 | """
 3 | Structured Array
 4 | ----------------
 5 | 
 6 | In this exercise you will read columns of data into a structured array using
 7 | loadtxt and combine that array with a regular 2D view of the same data to
 8 | analyze it and learn how the pressure velocity evolves as a function of the
 9 | shear velocity in sound waves in the Earth.
10 | 
11 |     1. The data in 'short_logs.crv' has the following format::
12 | 
13 |            DEPTH          CALI       S-SONIC   ...
14 |            8744.5000   -999.2500   -999.2500   ...
15 |            8745.0000   -999.2500   -999.2500   ...
16 |            8745.5000   -999.2500   -999.2500   ...
17 | 
18 |        Here the first row defines a set of names for the columns
19 |        of data in the file.  Use these column names to define a
20 |        dtype for a structured array that will have fields 'DEPTH',
21 |        'CALI', etc.  Assume all the data is of the float64 data
22 |        format.
23 | 
24 |     2. Use the 'loadtxt' function from numpy to read the data from
25 |        the file into a structured array with the dtype created
26 |        in (1).  Name this array 'logs'.
27 | 
28 |     3. The 'logs' array is nice for retrieving columns from the data.
29 |        For example, logs['DEPTH'] returns the values from the DEPTH
30 |        column of the data.  For row-based or array-wide operations,
31 |        it is more convenient to have a 2D view into the data, as if it
32 |        is a simple 2D array of float64 values.
33 | 
34 |        Create a 2D array called 'logs_2d' using the view operation.
35 |        Be sure the 2D array has the same number of columns as in the
36 |        data file.
37 | 
38 |     4. -999.25 is a "special" value in this data set.  It is
39 |        intended to represent missing data.  Replace all of these
40 |        values with NaNs.  Is this easier with the 'logs' array
41 |        or the 'logs_2d' array?
42 | 
43 |     5. Create a mask for all the "complete" rows in the array.
44 |        A complete row is one that doesn't have any NaN values measured
45 |        in that row.
46 | 
47 |        HINT: The ``all`` function is also useful here.
48 | 
49 |     6. Plot the VP vs VS logs for the "complete" rows.
50 | """
51 | from numpy import dtype, loadtxt, float64, nan, isfinite, all
52 | from matplotlib.pyplot import plot, show, xlabel, ylabel
53 | 
54 | # Open the file.
55 | log_file = open('short_logs.crv')
56 | 
57 | # 1. Create a dtype from the names in the file header.
58 | header = log_file.readline()
59 | log_names = header.split()
60 | 
61 | # Construct the array "dtype" that describes the data.  All fields are
62 | # 8 byte (64 bit) floats.  list() is needed because zip is lazy on Python 3.
63 | fields = list(zip(log_names, ['f8']*len(log_names)))
64 | fields_dtype = dtype(fields)
65 | 
66 | # 2. Use loadtxt to load the data into a structured array.
67 | logs = loadtxt(log_file, dtype=fields_dtype)
68 | 
69 | # 3. Make a 2D, float64 view of the data.
70 | #    The -1 value for the row shape means that numpy should
71 | #    make this dimension whatever it needs to be so that
72 | #    rows*cols = size for the array.
73 | values = logs.view(float64)
74 | values.shape = -1, len(fields)
75 | 
76 | # 4. Replace any values that are -999.25 with NaNs.
77 | values[values == -999.25] = nan
78 | 
79 | # 5. Make a mask for all the rows that don't have any missing values.
80 | #    Pull out these samples from the logs array into a separate array.
81 | data_mask = all(isfinite(values), axis=-1)
82 | good_logs = logs[data_mask]
83 | 
84 | 
85 | # 6. Plot VP vs. VS for the "complete" rows.
86 | plot(good_logs['VS'], good_logs['VP'], 'o')
87 | xlabel('VS')
88 | ylabel('VP')
89 | show()
90 | 
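The steps of this exercise can be tried on a tiny hand-made array before touching 'short_logs.crv'. The following is only a sketch: the three field names and the four rows are made-up sample values, not data from the file.

```python
import numpy as np

# Hypothetical subset of log names; every field is float64 ('f8').
names = ['DEPTH', 'VP', 'VS']
logs = np.array([(8744.5, -999.25, -999.25),
                 (8745.0, 3.1, 1.6),
                 (8745.5, 3.2, -999.25),
                 (8746.0, 3.3, 1.7)],
                dtype=[(name, 'f8') for name in names])

# 2D float64 view onto the same memory: one row per record.
values = logs.view(np.float64).reshape(-1, len(names))

# Writing through the view also changes the structured array.
values[values == -999.25] = np.nan

# A row is complete when every value in it is finite.
complete = np.isfinite(values).all(axis=1)
good_logs = logs[complete]

print(good_logs['VS'], good_logs['VP'])
```

Because `values` is a view, replacing the sentinel there is enough; `logs` never has to be edited field by field.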


--------------------------------------------------------------------------------
/exercises/wind_statistics/wind.desc:
--------------------------------------------------------------------------------
 1 | wind   daily average wind speeds for 1961-1978 at 12 synoptic meteorological 
 2 |        stations in the Republic of Ireland (Haslett and Raftery 1989).
 3 | 
 4 | These data were analyzed in detail in the following article:
 5 |    Haslett, J. and Raftery, A. E. (1989). Space-time Modelling with
 6 |    Long-memory Dependence: Assessing Ireland's Wind Power Resource
 7 |    (with Discussion). Applied Statistics 38, 1-50.
 8 | 
 9 | Each line corresponds to one day of data in the following format:
10 | year, month, day, average wind speed at each of the stations in the order given
11 | in Fig.4 of Haslett and Raftery : 
12 |  RPT, VAL, ROS, KIL, SHA, BIR, DUB, CLA, MUL, CLO, BEL, MAL
13 | 
14 | Fortran format : ( i2, 2i3, 12f6.2) 
15 | 
16 | The data are in knots, not in m/s.
17 | 
18 | Permission granted for unlimited distribution.
19 | 
20 | Please report all anomalies to fraley@stat.washington.edu
21 | 
22 | Be aware that the dataset is 532494 bytes long (that's over half a
23 | Megabyte).  Please be sure you want the data before you request it.
24 | 


--------------------------------------------------------------------------------
/exercises/wind_statistics/wind_statistics.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2016 Enthought, Inc. All Rights Reserved
 2 | """
 3 | Wind Statistics
 4 | ----------------
 5 | 
 6 | Topics: Using array methods over different axes, fancy indexing.
 7 | 
 8 | 1. The data in 'wind.data' has the following format::
 9 | 
10 |         61  1  1 15.04 14.96 13.17  9.29 13.96  9.87 13.67 10.25 10.83 12.58 18.50 15.04
11 |         61  1  2 14.71 16.88 10.83  6.50 12.62  7.67 11.50 10.04  9.79  9.67 17.54 13.83
12 |         61  1  3 18.50 16.88 12.33 10.13 11.17  6.17 11.25  8.04  8.50  7.67 12.75 12.71
13 | 
14 |    The first three columns are year, month and day.  The
15 |    remaining 12 columns are average windspeeds in knots at 12
16 |    locations in Ireland on that day.
17 | 
18 |    Use the 'loadtxt' function from numpy to read the data into
19 |    an array.
20 | 
21 | 2. Calculate the min, max and mean windspeeds and standard deviation of the
22 |    windspeeds over all the locations and all the times (a single set of numbers
23 |    for the entire dataset).
24 | 
25 | 3. Calculate the min, max and mean windspeeds and standard deviations of the
26 |    windspeeds at each location over all the days (a different set of numbers
27 |    for each location)
28 | 
29 | 4. Calculate the min, max and mean windspeed and standard deviations of the
30 |    windspeeds across all the locations at each day (a different set of numbers
31 |    for each day)
32 | 
33 | 5. Find the location which has the greatest windspeed on each day (an integer
34 |    column number for each day).
35 | 
36 | 6. Find the year, month and day on which the greatest windspeed was recorded.
37 | 
38 | 7. Find the average windspeed in January for each location.
39 | 
40 | You should be able to perform all of these operations without using a for
41 | loop or other looping construct.
42 | 
43 | Bonus
44 | ~~~~~
45 | 
46 | 1. Calculate the mean windspeed for each month in the dataset.  Treat
47 |    January 1961 and January 1962 as *different* months. (hint: first find a
48 |    way to create an identifier unique for each month. The second step might
49 |    require a for loop.)
50 | 
51 | 2. Calculate the min, max and mean windspeeds and standard deviations of the
52 |    windspeeds across all locations for each week (assume that the first week
53 |    starts on January 1 1961) for the first 52 weeks. This can be done without
54 |    any for loop.
55 | 
56 | Bonus Bonus
57 | ~~~~~~~~~~~
58 | 
59 | Calculate the mean windspeed for each month without using a for loop.
60 | (Hint: look at `searchsorted` and `add.reduceat`.)
61 | 
62 | Notes
63 | ~~~~~
64 | 
65 | These data were analyzed in detail in the following article:
66 | 
67 |    Haslett, J. and Raftery, A. E. (1989). Space-time Modelling with
68 |    Long-memory Dependence: Assessing Ireland's Wind Power Resource
69 |    (with Discussion). Applied Statistics 38, 1-50.
70 | 
71 | 
72 | See :ref:`wind-statistics-solution`.
73 | """
74 | 
75 | from numpy import loadtxt
76 | 
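As a warm-up for the axis-based questions, here is a sketch on a toy array with one row per day and one column per location. The numbers and variable names are invented for illustration and are unrelated to 'wind.data'.

```python
import numpy as np

# Hypothetical readings: 3 days (rows) at 2 locations (columns).
speeds = np.array([[5.0, 9.0],
                   [7.0, 3.0],
                   [6.0, 6.0]])

# axis=0 collapses the rows: one number per location.
per_location_mean = speeds.mean(axis=0)

# axis=1 collapses the columns: one number per day.
per_day_max = speeds.max(axis=1)

# Column index of the strongest reading on each day
# (the first column wins when there is a tie).
windiest_location = speeds.argmax(axis=1)

# Row index of the day holding the overall maximum.
windiest_day = per_day_max.argmax()

print(per_location_mean, per_day_max, windiest_location, windiest_day)
```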


--------------------------------------------------------------------------------
/exercises/wind_statistics/wind_statistics_solution.py:
--------------------------------------------------------------------------------
  1 | # Copyright 2016 Enthought, Inc. All Rights Reserved
  2 | """
  3 | Wind Statistics
  4 | ----------------
  5 | 
  6 | Topics: Using array methods over different axes, fancy indexing.
  7 | 
  8 | 1. The data in 'wind.data' has the following format::
  9 | 
 10 |         61  1  1 15.04 14.96 13.17  9.29 13.96  9.87 13.67 10.25 10.83 12.58 18.50 15.04
 11 |         61  1  2 14.71 16.88 10.83  6.50 12.62  7.67 11.50 10.04  9.79  9.67 17.54 13.83
 12 |         61  1  3 18.50 16.88 12.33 10.13 11.17  6.17 11.25  8.04  8.50  7.67 12.75 12.71
 13 | 
 14 |    The first three columns are year, month and day.  The
 15 |    remaining 12 columns are average windspeeds in knots at 12
 16 |    locations in Ireland on that day.
 17 | 
 18 |    Use the 'loadtxt' function from numpy to read the data into
 19 |    an array.
 20 | 
 21 | 2. Calculate the min, max and mean windspeeds and standard deviation of the
 22 |    windspeeds over all the locations and all the times (a single set of numbers
 23 |    for the entire dataset).
 24 | 
 25 | 3. Calculate the min, max and mean windspeeds and standard deviations of the
 26 |    windspeeds at each location over all the days (a different set of numbers
 27 |    for each location)
 28 | 
 29 | 4. Calculate the min, max and mean windspeed and standard deviations of the
 30 |    windspeeds across all the locations at each day (a different set of numbers
 31 |    for each day)
 32 | 
 33 | 5. Find the location which has the greatest windspeed on each day (an integer
 34 |    column number for each day).
 35 | 
 36 | 6. Find the year, month and day on which the greatest windspeed was recorded.
 37 | 
 38 | 7. Find the average windspeed in January for each location.
 39 | 
 40 | You should be able to perform all of these operations without using a for
 41 | loop or other looping construct.
 42 | 
 43 | Bonus
 44 | ~~~~~
 45 | 
 46 | 1. Calculate the mean windspeed for each month in the dataset.  Treat
 47 |    January 1961 and January 1962 as *different* months.
 48 | 
 49 | 2. Calculate the min, max and mean windspeeds and standard deviations of the
 50 |    windspeeds across all locations for each week (assume that the first week
 51 |    starts on January 1 1961) for the first 52 weeks.
 52 | 
 53 | Bonus Bonus
 54 | ~~~~~~~~~~~
 55 | 
 56 | Calculate the mean windspeed for each month without using a for loop.
 57 | (Hint: look at `searchsorted` and `add.reduceat`.)
 58 | 
 59 | Notes
 60 | ~~~~~
 61 | 
 62 | These data were analyzed in detail in the following article:
 63 | 
 64 |    Haslett, J. and Raftery, A. E. (1989). Space-time Modelling with
 65 |    Long-memory Dependence: Assessing Ireland's Wind Power Resource
 66 |    (with Discussion). Applied Statistics 38, 1-50.
 67 | 
 68 | """
 69 | from __future__ import print_function
 70 | from numpy import loadtxt, arange, searchsorted, add, zeros
 71 | 
 72 | wind_data = loadtxt('wind.data')
 73 | 
 74 | data = wind_data[:, 3:]
 75 | 
 76 | print('2. Statistics over all values')
 77 | print('  min:', data.min())
 78 | print('  max:', data.max())
 79 | print('  mean:', data.mean())
 80 | print('  standard deviation:', data.std())
 81 | print()
 82 | 
 83 | print('3. Statistics over all days at each location')
 84 | print('  min:', data.min(axis=0))
 85 | print('  max:', data.max(axis=0))
 86 | print('  mean:', data.mean(axis=0))
 87 | print('  standard deviation:', data.std(axis=0))
 88 | print()
 89 | 
 90 | print('4. Statistics over all locations for each day')
 91 | print('  min:', data.min(axis=1))
 92 | print('  max:', data.max(axis=1))
 93 | print('  mean:', data.mean(axis=1))
 94 | print('  standard deviation:', data.std(axis=1))
 95 | print()
 96 | 
 97 | print('5. Location of daily maximum')
 98 | print('  daily max location:', data.argmax(axis=1))
 99 | print()
100 | 
101 | daily_max = data.max(axis=1)
102 | max_row = daily_max.argmax()
103 | 
104 | print('6. Day of maximum reading')
105 | print('  Year:', int(wind_data[max_row, 0]))
106 | print('  Month:', int(wind_data[max_row, 1]))
107 | print('  Day:', int(wind_data[max_row, 2]))
108 | print()
109 | 
110 | january_indices = wind_data[:, 1] == 1
111 | january_data = data[january_indices]
112 | 
113 | print('7. Statistics for January')
114 | print('  mean:', january_data.mean(axis=0))
115 | print()
116 | 
117 | # Bonus
118 | 
119 | # compute the integer month number for each day in the dataset
120 | months = ((wind_data[:, 0] - 61) * 12 + wind_data[:, 1] - 1).astype(int)
121 | 
122 | # get set of unique months
123 | month_values = set(months)
124 | 
125 | # initialize an array to hold the result
126 | monthly_means = zeros(len(month_values))
127 | 
128 | for month in month_values:
129 |     # find the rows that correspond to the current month
130 |     day_indices = (months == month)
131 | 
132 |     # extract the data for the current month using fancy indexing
133 |     month_data = data[day_indices]
134 | 
135 |     # find the mean
136 |     monthly_means[month] = month_data.mean()
137 | 
138 |     # Note: experts might do this all in one line:
139 |     # monthly_means[month] = data[months==month].mean()
140 | 
141 | # In fact the whole for loop could reduce to one line (after importing array):
142 | # monthly_means = array([data[months==m].mean() for m in sorted(month_values)])
143 | 
144 | 
145 | print("Bonus 1.")
146 | print("  mean:", monthly_means)
147 | print()
148 | 
149 | # Bonus 2.
150 | # Extract the data for the first 52 weeks. Then reshape the array to put
151 | # on the same line 7 days worth of data for all locations. Let Numpy
152 | # figure out the number of lines needed to do so
153 | weekly_data = data[:52 * 7].reshape(-1, 7 * 12)
154 | 
155 | print('Bonus 2. Weekly statistics over all locations')
156 | print('  min:', weekly_data.min(axis=1))
157 | print('  max:', weekly_data.max(axis=1))
158 | print('  mean:', weekly_data.mean(axis=1))
159 | print('  standard deviation:', weekly_data.std(axis=1))
160 | print()
161 | 
162 | # Bonus Bonus : this is really tricky...
163 | 
164 | # compute the month number for each day in the dataset
165 | months = (wind_data[:, 0] - 61) * 12 + wind_data[:, 1] - 1
166 | 
167 | # find the indices for the start of each month
168 | # this is a useful trick - we use range from 0 to the
169 | # number of months + 1 and searchsorted to find the insertion
170 | # points for each.
171 | month_indices = searchsorted(months, arange(months[-1] + 2))
172 | 
173 | # now use add.reduceat to get the sum at each location
174 | monthly_loc_totals = add.reduceat(data, month_indices[:-1])
175 | 
176 | # now sum across all locations for each month
177 | monthly_totals = monthly_loc_totals.sum(axis=1)
178 | 
179 | # now find total number of measurements for each month
180 | month_days = month_indices[1:] - month_indices[:-1]
181 | measurement_count = month_days * 12
182 | 
183 | # compute the mean
184 | monthly_means = monthly_totals / measurement_count
185 | 
186 | print("Bonus Bonus")
187 | print("  mean:", monthly_means)
188 | 
189 | # Notes: this method relies on the fact that the months are contiguous in the
190 | # data set - the method used in the bonus section works for non-contiguous
191 | # days.
192 | 
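The `searchsorted` and `add.reduceat` trick is easier to follow on a few rows. The sketch below uses made-up month numbers and readings (6 days, 2 locations) and assumes, as the notes above say, that the rows are sorted by month.

```python
import numpy as np

# Hypothetical month id for each day (sorted) and readings at 2 locations.
months = np.array([0, 0, 0, 1, 1, 2])
data = np.array([[1.0, 3.0],
                 [2.0, 4.0],
                 [3.0, 5.0],
                 [10.0, 20.0],
                 [30.0, 40.0],
                 [7.0, 9.0]])

# Index of the first row of each month, plus one end marker.
starts = np.searchsorted(months, np.arange(months[-1] + 2))

# Sum the rows of each month per location, then across locations.
totals = np.add.reduceat(data, starts[:-1]).sum(axis=1)

# Number of measurements per month: days in the month times locations.
counts = np.diff(starts) * data.shape[1]

monthly_means = totals / counts
print(monthly_means)
```

Each entry matches what `data[months == m].mean()` would give for that month, so the loop-based bonus solution is a handy cross-check.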


--------------------------------------------------------------------------------
/slides.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/enthought/Numpy-Tutorial-SciPyConf-2016/8e9e8cbb57f8976a4572800fe808aedc74775a81/slides.pdf


--------------------------------------------------------------------------------