├── .gitignore ├── .gitmodules ├── README.rst ├── asm └── sum.s ├── benchmarks ├── c++ │ ├── Makefile │ └── bench │ │ ├── main.cc │ │ ├── main.d │ │ ├── malloc.cc │ │ ├── memory_order.cc │ │ ├── run │ │ ├── strided_sum.cc │ │ └── strided_sum.d └── python │ ├── improper_alignment.py │ ├── loop.py │ ├── numpy_sum.py │ └── proper_alignment.py ├── etc └── setup-env ├── exercises ├── numpy │ ├── .ipynb_checkpoints │ │ ├── 1-Finding Functions and Documentation in Jupyter-checkpoint.ipynb │ │ ├── 2-Creating and Reshaping Arrays-checkpoint.ipynb │ │ ├── 3-Universal Functions-checkpoint.ipynb │ │ ├── 4-Selections-checkpoint.ipynb │ │ ├── 5-Reductions-checkpoint.ipynb │ │ └── 6-Broadcasting-checkpoint.ipynb │ ├── 1-Finding Functions and Documentation in Jupyter.ipynb │ ├── 2-Creating and Reshaping Arrays.ipynb │ ├── 3-Universal Functions.ipynb │ ├── 4-Selections.ipynb │ ├── 5-Reductions.ipynb │ ├── 6-Broadcasting.ipynb │ ├── 7-Strided Memory and Convolutions.ipynb │ ├── images │ │ ├── bezier.gif │ │ └── bezier2.gif │ ├── prices.csv │ ├── solutions │ │ ├── 1-Finding Functions and Documentation (Solutions).ipynb │ │ ├── 2-Creating and Reshaping Arrays (Solutions).ipynb │ │ ├── 3-Universal Functions (Solutions).ipynb │ │ ├── 4-Selections (Solutions).ipynb │ │ ├── 5-Reductions (Solutions).ipynb │ │ ├── 6-Broadcasting (Solutions).ipynb │ │ └── images │ │ │ ├── bezier.gif │ │ │ └── bezier2.gif │ └── volumes.csv └── profiling │ ├── replay-parsing.stats │ ├── rolling.py │ └── solutions │ └── rolling.py └── tutorial ├── Makefile ├── deploy.py └── source ├── _static ├── 1-byte-value-array.png ├── 2d-array.png ├── adders.png ├── addition-dereferences.png ├── cache-0.png ├── cache-1.png ├── column-order-strides.png ├── column-order.png ├── column-slice-strides.png ├── memory-cells.png ├── multi-byte-value-array.png ├── row-order-strides.png ├── row-order.png └── struct.png ├── appendix.rst ├── arrays-and-structs.rst ├── bits.rst ├── conf.py ├── how-to-optimize-code.rst ├── index.rst ├── 
low-level-computation.rst ├── memory-locality.rst ├── memory-management.rst ├── numpy-overview.rst ├── profiling.rst └── python-overview.rst /.gitignore: -------------------------------------------------------------------------------- 1 | # Compiled python files 2 | *.py[co] 3 | 4 | # Packages 5 | *.egg 6 | *.egg-info 7 | dist 8 | build 9 | eggs 10 | parts 11 | bin 12 | sdist 13 | 14 | # C Extension artifacts 15 | *.o 16 | *.so 17 | *.out 18 | 19 | # pypi 20 | MANIFEST 21 | 22 | # pytest 23 | .cache 24 | 25 | htmlcov 26 | 27 | venv/ 28 | 29 | .vagrant 30 | 31 | # PyCharm settings dir 32 | .idea 33 | 34 | tutorial/build/* 35 | venv/* 36 | 37 | *.gdb_history 38 | -------------------------------------------------------------------------------- /.gitmodules: -------------------------------------------------------------------------------- 1 | [submodule "benchmarks/c++/benchmark"] 2 | path = benchmarks/c++/benchmark 3 | url = https://github.com/google/benchmark 4 | -------------------------------------------------------------------------------- /README.rst: -------------------------------------------------------------------------------- 1 | Principles of Performance 2 | ========================= 3 | 4 | Today the quant finance and fintech sectors attract top talent from a wide range 5 | of quantitative academic backgrounds, often outside of traditional computer 6 | science. While this diversity has been advantageous in many respects, these 7 | students and professionals can often find themselves mired in performance 8 | related issues when they take their analyses from the classroom setting to the 9 | marketplace and have to interface with real world data structures at scale. This 10 | tutorial will cover the basic foundational concepts needed to effectively design 11 | and implement scalable algorithms like those common to quant finance or related 12 | financial technology applications.
13 | 14 | Many common performance issues are rooted in a lack of knowledge about how a 15 | computer actually performs computation. We will begin by covering how modern 16 | computers physically execute code. This low level information will help us 17 | reason about the behavior of our high level code later. We will then look at 18 | Python's execution model with our new understanding of the machine. Next, we will 19 | discuss how numpy allows us to take full advantage of the power of our computer 20 | while staying in Python. Finally, we will look at tools for analyzing the 21 | performance of Python programs and cover common issues and fixes. 22 | 23 | By the end of this session, attendees will: 24 | 25 | - Have a general understanding of how the processor works. 26 | - Know about memory management and cache locality. 27 | - Understand the reason Python is "slow". 28 | - Understand how numpy makes Python fast. 29 | - Be familiar with using cProfile to analyze programs. 30 | 31 | Install Steps 32 | ------------- 33 | 34 | Prior to the tutorial, attendees need to install git and Python 3.6. After 35 | that, run the following commands. 36 | 37 | .. code-block:: bash 38 | 39 | $ git clone --recursive https://github.com/llllllllll/principles-of-performance.git 40 | $ cd principles-of-performance 41 | $ source etc/setup-env 42 | 43 | The ``setup-env`` script will attempt to download the needed packages. 44 | The ``setup-env`` script should print a lot of output to the terminal. You can ignore 45 | most of it, but the last line should be: 46 | 47 | .. code-block:: text 48 | 49 | Environment is setup correctly! 50 | 51 | Viewing the Tutorial 52 | -------------------- 53 | 54 | The tutorial is structured as a sphinx project. This allows the tutorial to be 55 | viewed from a standard browser or hosted online. 56 | 57 | The material can be viewed in a browser by opening 58 | ``tutorial/build/html/index.html``, for example: 59 | 60 | ..
code-block:: bash 61 | 62 | $ ${BROWSER} tutorial/build/html/index.html 63 | 64 | Acknowledgments 65 | --------------- 66 | 67 | The numpy examples are sourced from Scott Sanderson's 68 | https://github.com/ssanderson/foundations-of-numerical-computing 69 | -------------------------------------------------------------------------------- /asm/sum.s: -------------------------------------------------------------------------------- 1 | .data 2 | array: /* declare an array of 64 bit integers */ 3 | .quad 2,9,9,8,1,1,3,8,6,6 4 | 5 | .text 6 | 7 | /* sum(%rdi size, %rsi array) 8 | 9 | sum an array of int64 values whose length is size. 10 | */ 11 | sum: 12 | xorq %rax, %rax /* zero the sum */ 13 | xorq %rbx, %rbx /* zero the loop counter */ 14 | .Lloop_start: 15 | cmpq %rbx, %rdi /* compare the loop counter to len */ 16 | je .Lloop_end /* exit the loop when %rbx == %rdi */ 17 | /* load the value of the array at index %rbx into %rcx */ 18 | movq (%rsi, %rbx, 8), %rcx 19 | addq %rcx, %rax /* increment %rax by %rcx */ 20 | incq %rbx /* increment %rbx by 1 */ 21 | jmp .Lloop_start /* jump to the top of the loop */ 22 | .Lloop_end: 23 | ret 24 | 25 | .global _start 26 | _start: 27 | movq $10, %rdi 28 | movq $array, %rsi 29 | call sum 30 | movq %rax, %rdi 31 | movl $60, %eax /* exit syscall number; exits with the value of %rdi */ 32 | syscall 33 | -------------------------------------------------------------------------------- /benchmarks/c++/Makefile: -------------------------------------------------------------------------------- 1 | GBENCHMARK_DIR := benchmark 2 | GBENCHMARK_HEADERS := $(wildcard $(GBENCHMARK_DIR)/src/*.h) 3 | GBENCHMARK_SRCS := $(wildcard $(GBENCHMARK_DIR)/src/*.cc) 4 | LIBGBENCHMARK := $(GBENCHMARK_DIR)/build/src/libbenchmark.a 5 | 6 | BENCH_SOURCES := $(wildcard bench/*.cc) 7 | BENCH_DFILES := $(BENCH_SOURCES:.cc=.d) 8 | BENCH_OBJECTS := $(BENCH_SOURCES:.cc=.o) 9 | BENCH_HEADERS := $(wildcard bench/*.h) $(GBENCHMARK_HEADERS) 10 | BENCH_INCLUDE := -I bench -I
$(GBENCHMARK_DIR)/include 11 | BENCHRUNNER := bench/run 12 | 13 | .PHONY: benchmark 14 | benchmark: $(BENCHRUNNER) 15 | @LD_LIBRARY_PATH=. $< 16 | 17 | 18 | bench/%.o: bench/%.cc 19 | $(CXX) $(CXXFLAGS) $(INCLUDE) $(BENCH_INCLUDE) -c $< -o $@ 20 | 21 | 22 | $(LIBGBENCHMARK): $(GBENCHMARK_SRCS) $(GBENCHMARK_HEADERS) 23 | cd $(GBENCHMARK_DIR) && mkdir -p build 24 | cd $(GBENCHMARK_DIR)/build && cmake .. \ 25 | -DCMAKE_BUILD_TYPE=RELEASE \ 26 | -DBENCHMARK_ENABLE_GTEST_TESTS=OFF 27 | make -C $(GBENCHMARK_DIR)/build 28 | 29 | 30 | $(BENCHRUNNER): $(LIBGBENCHMARK) $(BENCH_OBJECTS) 31 | $(CXX) -o $@ $(BENCH_OBJECTS) $(BENCH_LDFLAGS) $(BENCH_INCLUDE) \ 32 | $(LIBGBENCHMARK) -lpthread -L. $(LDFLAGS) 33 | -------------------------------------------------------------------------------- /benchmarks/c++/bench/main.cc: -------------------------------------------------------------------------------- 1 | #include "benchmark/benchmark.h" 2 | 3 | BENCHMARK_MAIN(); 4 | -------------------------------------------------------------------------------- /benchmarks/c++/bench/main.d: -------------------------------------------------------------------------------- 1 | bench/main.o: bench/main.cc /usr/include/stdc-predef.h \ 2 | benchmark/include/benchmark/benchmark.h \ 3 | /usr/lib/gcc/x86_64-pc-linux-gnu/8.2.1/include/stdint.h \ 4 | /usr/include/stdint.h /usr/include/bits/libc-header-start.h \ 5 | /usr/include/features.h /usr/include/sys/cdefs.h \ 6 | /usr/include/bits/wordsize.h /usr/include/bits/long-double.h \ 7 | /usr/include/gnu/stubs.h /usr/include/gnu/stubs-64.h \ 8 | /usr/include/bits/types.h /usr/include/bits/typesizes.h \ 9 | /usr/include/bits/wchar.h /usr/include/bits/stdint-intn.h \ 10 | /usr/include/bits/stdint-uintn.h /usr/include/c++/8.2.1/algorithm \ 11 | /usr/include/c++/8.2.1/utility \ 12 | /usr/include/c++/8.2.1/x86_64-pc-linux-gnu/bits/c++config.h \ 13 | /usr/include/c++/8.2.1/x86_64-pc-linux-gnu/bits/os_defines.h \ 14 | 
/usr/include/c++/8.2.1/x86_64-pc-linux-gnu/bits/cpu_defines.h \ 15 | /usr/include/c++/8.2.1/bits/stl_relops.h \ 16 | /usr/include/c++/8.2.1/bits/stl_pair.h \ 17 | /usr/include/c++/8.2.1/bits/move.h \ 18 | /usr/include/c++/8.2.1/bits/concept_check.h \ 19 | /usr/include/c++/8.2.1/type_traits \ 20 | /usr/include/c++/8.2.1/initializer_list \ 21 | /usr/include/c++/8.2.1/bits/stl_algobase.h \ 22 | /usr/include/c++/8.2.1/bits/functexcept.h \ 23 | /usr/include/c++/8.2.1/bits/exception_defines.h \ 24 | /usr/include/c++/8.2.1/bits/cpp_type_traits.h \ 25 | /usr/include/c++/8.2.1/ext/type_traits.h \ 26 | /usr/include/c++/8.2.1/ext/numeric_traits.h \ 27 | /usr/include/c++/8.2.1/bits/stl_iterator_base_types.h \ 28 | /usr/include/c++/8.2.1/bits/stl_iterator_base_funcs.h \ 29 | /usr/include/c++/8.2.1/debug/assertions.h \ 30 | /usr/include/c++/8.2.1/bits/stl_iterator.h \ 31 | /usr/include/c++/8.2.1/bits/ptr_traits.h \ 32 | /usr/include/c++/8.2.1/debug/debug.h \ 33 | /usr/include/c++/8.2.1/bits/predefined_ops.h \ 34 | /usr/include/c++/8.2.1/bits/stl_algo.h /usr/include/c++/8.2.1/cstdlib \ 35 | /usr/include/stdlib.h \ 36 | /usr/lib/gcc/x86_64-pc-linux-gnu/8.2.1/include/stddef.h \ 37 | /usr/include/bits/waitflags.h /usr/include/bits/waitstatus.h \ 38 | /usr/include/bits/floatn.h /usr/include/bits/floatn-common.h \ 39 | /usr/include/bits/types/locale_t.h /usr/include/bits/types/__locale_t.h \ 40 | /usr/include/sys/types.h /usr/include/bits/types/clock_t.h \ 41 | /usr/include/bits/types/clockid_t.h /usr/include/bits/types/time_t.h \ 42 | /usr/include/bits/types/timer_t.h /usr/include/endian.h \ 43 | /usr/include/bits/endian.h /usr/include/bits/byteswap.h \ 44 | /usr/include/bits/uintn-identity.h /usr/include/sys/select.h \ 45 | /usr/include/bits/select.h /usr/include/bits/types/sigset_t.h \ 46 | /usr/include/bits/types/__sigset_t.h \ 47 | /usr/include/bits/types/struct_timeval.h \ 48 | /usr/include/bits/types/struct_timespec.h \ 49 | /usr/include/bits/pthreadtypes.h 
/usr/include/bits/thread-shared-types.h \ 50 | /usr/include/bits/pthreadtypes-arch.h /usr/include/alloca.h \ 51 | /usr/include/bits/stdlib-float.h /usr/include/c++/8.2.1/bits/std_abs.h \ 52 | /usr/include/c++/8.2.1/bits/algorithmfwd.h \ 53 | /usr/include/c++/8.2.1/bits/stl_heap.h \ 54 | /usr/include/c++/8.2.1/bits/stl_tempbuf.h \ 55 | /usr/include/c++/8.2.1/bits/stl_construct.h /usr/include/c++/8.2.1/new \ 56 | /usr/include/c++/8.2.1/exception /usr/include/c++/8.2.1/bits/exception.h \ 57 | /usr/include/c++/8.2.1/bits/exception_ptr.h \ 58 | /usr/include/c++/8.2.1/bits/cxxabi_init_exception.h \ 59 | /usr/include/c++/8.2.1/typeinfo /usr/include/c++/8.2.1/bits/hash_bytes.h \ 60 | /usr/include/c++/8.2.1/bits/nested_exception.h \ 61 | /usr/include/c++/8.2.1/ext/alloc_traits.h \ 62 | /usr/include/c++/8.2.1/bits/alloc_traits.h \ 63 | /usr/include/c++/8.2.1/bits/memoryfwd.h \ 64 | /usr/include/c++/8.2.1/bits/uniform_int_dist.h \ 65 | /usr/include/c++/8.2.1/limits /usr/include/c++/8.2.1/cassert \ 66 | /usr/include/assert.h /usr/include/c++/8.2.1/cstddef \ 67 | /usr/include/c++/8.2.1/iosfwd /usr/include/c++/8.2.1/bits/stringfwd.h \ 68 | /usr/include/c++/8.2.1/bits/postypes.h /usr/include/c++/8.2.1/cwchar \ 69 | /usr/include/wchar.h \ 70 | /usr/lib/gcc/x86_64-pc-linux-gnu/8.2.1/include/stdarg.h \ 71 | /usr/include/bits/types/wint_t.h /usr/include/bits/types/mbstate_t.h \ 72 | /usr/include/bits/types/__mbstate_t.h /usr/include/bits/types/__FILE.h \ 73 | /usr/include/bits/types/FILE.h /usr/include/c++/8.2.1/map \ 74 | /usr/include/c++/8.2.1/bits/stl_tree.h \ 75 | /usr/include/c++/8.2.1/bits/allocator.h \ 76 | /usr/include/c++/8.2.1/x86_64-pc-linux-gnu/bits/c++allocator.h \ 77 | /usr/include/c++/8.2.1/ext/new_allocator.h \ 78 | /usr/include/c++/8.2.1/bits/stl_function.h \ 79 | /usr/include/c++/8.2.1/backward/binders.h \ 80 | /usr/include/c++/8.2.1/ext/aligned_buffer.h \ 81 | /usr/include/c++/8.2.1/bits/stl_map.h /usr/include/c++/8.2.1/tuple \ 82 | /usr/include/c++/8.2.1/array 
/usr/include/c++/8.2.1/stdexcept \ 83 | /usr/include/c++/8.2.1/string /usr/include/c++/8.2.1/bits/char_traits.h \ 84 | /usr/include/c++/8.2.1/cstdint /usr/include/c++/8.2.1/bits/localefwd.h \ 85 | /usr/include/c++/8.2.1/x86_64-pc-linux-gnu/bits/c++locale.h \ 86 | /usr/include/c++/8.2.1/clocale /usr/include/locale.h \ 87 | /usr/include/bits/locale.h /usr/include/c++/8.2.1/cctype \ 88 | /usr/include/ctype.h /usr/include/c++/8.2.1/bits/ostream_insert.h \ 89 | /usr/include/c++/8.2.1/bits/cxxabi_forced.h \ 90 | /usr/include/c++/8.2.1/bits/range_access.h \ 91 | /usr/include/c++/8.2.1/bits/basic_string.h \ 92 | /usr/include/c++/8.2.1/ext/atomicity.h \ 93 | /usr/include/c++/8.2.1/x86_64-pc-linux-gnu/bits/gthr.h \ 94 | /usr/include/c++/8.2.1/x86_64-pc-linux-gnu/bits/gthr-default.h \ 95 | /usr/include/pthread.h /usr/include/sched.h /usr/include/bits/sched.h \ 96 | /usr/include/bits/types/struct_sched_param.h /usr/include/bits/cpu-set.h \ 97 | /usr/include/time.h /usr/include/bits/time.h /usr/include/bits/timex.h \ 98 | /usr/include/bits/types/struct_tm.h \ 99 | /usr/include/bits/types/struct_itimerspec.h /usr/include/bits/setjmp.h \ 100 | /usr/include/c++/8.2.1/x86_64-pc-linux-gnu/bits/atomic_word.h \ 101 | /usr/include/c++/8.2.1/ext/string_conversions.h \ 102 | /usr/include/c++/8.2.1/cstdio /usr/include/stdio.h \ 103 | /usr/include/bits/types/__fpos_t.h /usr/include/bits/types/__fpos64_t.h \ 104 | /usr/include/bits/types/struct_FILE.h \ 105 | /usr/include/bits/types/cookie_io_functions_t.h \ 106 | /usr/include/bits/stdio_lim.h /usr/include/bits/sys_errlist.h \ 107 | /usr/include/c++/8.2.1/cerrno /usr/include/errno.h \ 108 | /usr/include/bits/errno.h /usr/include/linux/errno.h \ 109 | /usr/include/asm/errno.h /usr/include/asm-generic/errno.h \ 110 | /usr/include/asm-generic/errno-base.h /usr/include/bits/types/error_t.h \ 111 | /usr/include/c++/8.2.1/bits/functional_hash.h \ 112 | /usr/include/c++/8.2.1/bits/basic_string.tcc \ 113 | 
/usr/include/c++/8.2.1/bits/uses_allocator.h \ 114 | /usr/include/c++/8.2.1/bits/invoke.h \ 115 | /usr/include/c++/8.2.1/bits/stl_multimap.h /usr/include/c++/8.2.1/set \ 116 | /usr/include/c++/8.2.1/bits/stl_set.h \ 117 | /usr/include/c++/8.2.1/bits/stl_multiset.h /usr/include/c++/8.2.1/vector \ 118 | /usr/include/c++/8.2.1/bits/stl_uninitialized.h \ 119 | /usr/include/c++/8.2.1/bits/stl_vector.h \ 120 | /usr/include/c++/8.2.1/bits/stl_bvector.h \ 121 | /usr/include/c++/8.2.1/bits/vector.tcc 122 | -------------------------------------------------------------------------------- /benchmarks/c++/bench/malloc.cc: -------------------------------------------------------------------------------- 1 | #include <cstdint> 2 | #include <cstdlib> 3 | #include <vector> 4 | 5 | #include <benchmark/benchmark.h> 6 | 7 | constexpr std::size_t size = 1000000; 8 | using type = std::int64_t; 9 | 10 | void bench_malloc(benchmark::State& state) { 11 | std::size_t size = state.range(0); 12 | std::vector<void*> ptrs(1 << 30); 13 | std::size_t ix = 0; 14 | 15 | for (auto _ : state) { 16 | void* p = std::malloc(size); 17 | benchmark::DoNotOptimize(p); 18 | state.PauseTiming(); 19 | if (ix == ptrs.size()) { 20 | for (void* p : ptrs) { 21 | std::free(p); 22 | } 23 | ix = 0; 24 | } 25 | ptrs[ix++] = p; 26 | state.ResumeTiming(); 27 | } 28 | } 29 | BENCHMARK(bench_malloc)->Range(64, 8096); -------------------------------------------------------------------------------- /benchmarks/c++/bench/memory_order.cc: -------------------------------------------------------------------------------- 1 | #include <algorithm> 2 | #include <numeric> 3 | #include <random> 4 | #include <vector> 5 | 6 | #include <benchmark/benchmark.h> 7 | 8 | constexpr std::size_t size = 1000000; 9 | using type = std::int64_t; 10 | 11 | void bench_random_access(benchmark::State& state) { 12 | std::vector<type> values(size); 13 | std::iota(values.begin(), values.end(), 0); 14 | 15 | std::vector<type> indices = values; 16 | std::random_device rd; 17 | std::mt19937 g(rd()); 18 | std::shuffle(indices.begin(), indices.end(), g); 19 | 20 | for (auto _ : state) { 21 | for (const type ix :
indices) { 22 | benchmark::DoNotOptimize(values[ix]); 23 | } 24 | } 25 | } 26 | BENCHMARK(bench_random_access); 27 | 28 | 29 | void bench_forward_linear_access(benchmark::State& state) { 30 | std::vector<type> values(size); 31 | std::iota(values.begin(), values.end(), 0); 32 | 33 | std::vector<type> indices = values; 34 | 35 | for (auto _ : state) { 36 | for (const type ix : indices) { 37 | benchmark::DoNotOptimize(values[ix]); 38 | } 39 | } 40 | } 41 | BENCHMARK(bench_forward_linear_access); 42 | 43 | 44 | void bench_reverse_linear_access(benchmark::State& state) { 45 | std::vector<type> values(size); 46 | std::iota(values.begin(), values.end(), 0); 47 | 48 | std::vector<type> indices(values.rbegin(), values.rend()); 49 | 50 | for (auto _ : state) { 51 | for (const type ix : indices) { 52 | benchmark::DoNotOptimize(values[ix]); 53 | } 54 | } 55 | } 56 | BENCHMARK(bench_reverse_linear_access); 57 | -------------------------------------------------------------------------------- /benchmarks/c++/bench/run: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/llllllllll/principles-of-performance/80d1d18ef4bc039fbed2a92b88e5b85d654c10a6/benchmarks/c++/bench/run -------------------------------------------------------------------------------- /benchmarks/c++/bench/strided_sum.cc: -------------------------------------------------------------------------------- 1 | #include <benchmark/benchmark.h> 2 | 3 | std::vector<double> make_array(std::size_t size) { 4 | std::vector<double> out(size); 5 | for (double& item : out) { 6 | item = std::rand(); 7 | } 8 | return out; 9 | } 10 | 11 | template <typename T> 12 | T sum(const std::vector<T>& arr, std::size_t stride) { 13 | T sum = 0; 14 | 15 | for (std::size_t ix = 0; ix < arr.size(); ix += stride) { 16 | sum += arr[ix]; 17 | } 18 | 19 | return sum; 20 | } 21 | 22 | void bench_sum(benchmark::State& state) { 23 | auto arr = make_array(10000000); 24 | std::size_t stride = state.range(0); 25 | 26 | for (auto _ : state) { 27 |
benchmark::DoNotOptimize(sum(arr, stride)); 28 | } 29 | } 30 | 31 | BENCHMARK(bench_sum)->Apply([](benchmark::internal::Benchmark* b) { 32 | for (std::size_t n = 1; n < 64; ++n) { 33 | b->Arg(n); 34 | } 35 | }); 36 | -------------------------------------------------------------------------------- /benchmarks/c++/bench/strided_sum.d: -------------------------------------------------------------------------------- 1 | bench/strided_sum.o: bench/strided_sum.cc /usr/include/stdc-predef.h \ 2 | benchmark/include/benchmark/benchmark.h \ 3 | /usr/lib/gcc/x86_64-pc-linux-gnu/8.2.1/include/stdint.h \ 4 | /usr/include/stdint.h /usr/include/bits/libc-header-start.h \ 5 | /usr/include/features.h /usr/include/sys/cdefs.h \ 6 | /usr/include/bits/wordsize.h /usr/include/bits/long-double.h \ 7 | /usr/include/gnu/stubs.h /usr/include/gnu/stubs-64.h \ 8 | /usr/include/bits/types.h /usr/include/bits/typesizes.h \ 9 | /usr/include/bits/wchar.h /usr/include/bits/stdint-intn.h \ 10 | /usr/include/bits/stdint-uintn.h /usr/include/c++/8.2.1/algorithm \ 11 | /usr/include/c++/8.2.1/utility \ 12 | /usr/include/c++/8.2.1/x86_64-pc-linux-gnu/bits/c++config.h \ 13 | /usr/include/c++/8.2.1/x86_64-pc-linux-gnu/bits/os_defines.h \ 14 | /usr/include/c++/8.2.1/x86_64-pc-linux-gnu/bits/cpu_defines.h \ 15 | /usr/include/c++/8.2.1/bits/stl_relops.h \ 16 | /usr/include/c++/8.2.1/bits/stl_pair.h \ 17 | /usr/include/c++/8.2.1/bits/move.h \ 18 | /usr/include/c++/8.2.1/bits/concept_check.h \ 19 | /usr/include/c++/8.2.1/type_traits \ 20 | /usr/include/c++/8.2.1/initializer_list \ 21 | /usr/include/c++/8.2.1/bits/stl_algobase.h \ 22 | /usr/include/c++/8.2.1/bits/functexcept.h \ 23 | /usr/include/c++/8.2.1/bits/exception_defines.h \ 24 | /usr/include/c++/8.2.1/bits/cpp_type_traits.h \ 25 | /usr/include/c++/8.2.1/ext/type_traits.h \ 26 | /usr/include/c++/8.2.1/ext/numeric_traits.h \ 27 | /usr/include/c++/8.2.1/bits/stl_iterator_base_types.h \ 28 | /usr/include/c++/8.2.1/bits/stl_iterator_base_funcs.h \ 29 | 
/usr/include/c++/8.2.1/debug/assertions.h \ 30 | /usr/include/c++/8.2.1/bits/stl_iterator.h \ 31 | /usr/include/c++/8.2.1/bits/ptr_traits.h \ 32 | /usr/include/c++/8.2.1/debug/debug.h \ 33 | /usr/include/c++/8.2.1/bits/predefined_ops.h \ 34 | /usr/include/c++/8.2.1/bits/stl_algo.h /usr/include/c++/8.2.1/cstdlib \ 35 | /usr/include/stdlib.h \ 36 | /usr/lib/gcc/x86_64-pc-linux-gnu/8.2.1/include/stddef.h \ 37 | /usr/include/bits/waitflags.h /usr/include/bits/waitstatus.h \ 38 | /usr/include/bits/floatn.h /usr/include/bits/floatn-common.h \ 39 | /usr/include/bits/types/locale_t.h /usr/include/bits/types/__locale_t.h \ 40 | /usr/include/sys/types.h /usr/include/bits/types/clock_t.h \ 41 | /usr/include/bits/types/clockid_t.h /usr/include/bits/types/time_t.h \ 42 | /usr/include/bits/types/timer_t.h /usr/include/endian.h \ 43 | /usr/include/bits/endian.h /usr/include/bits/byteswap.h \ 44 | /usr/include/bits/uintn-identity.h /usr/include/sys/select.h \ 45 | /usr/include/bits/select.h /usr/include/bits/types/sigset_t.h \ 46 | /usr/include/bits/types/__sigset_t.h \ 47 | /usr/include/bits/types/struct_timeval.h \ 48 | /usr/include/bits/types/struct_timespec.h \ 49 | /usr/include/bits/pthreadtypes.h /usr/include/bits/thread-shared-types.h \ 50 | /usr/include/bits/pthreadtypes-arch.h /usr/include/alloca.h \ 51 | /usr/include/bits/stdlib-float.h /usr/include/c++/8.2.1/bits/std_abs.h \ 52 | /usr/include/c++/8.2.1/bits/algorithmfwd.h \ 53 | /usr/include/c++/8.2.1/bits/stl_heap.h \ 54 | /usr/include/c++/8.2.1/bits/stl_tempbuf.h \ 55 | /usr/include/c++/8.2.1/bits/stl_construct.h /usr/include/c++/8.2.1/new \ 56 | /usr/include/c++/8.2.1/exception /usr/include/c++/8.2.1/bits/exception.h \ 57 | /usr/include/c++/8.2.1/bits/exception_ptr.h \ 58 | /usr/include/c++/8.2.1/bits/cxxabi_init_exception.h \ 59 | /usr/include/c++/8.2.1/typeinfo /usr/include/c++/8.2.1/bits/hash_bytes.h \ 60 | /usr/include/c++/8.2.1/bits/nested_exception.h \ 61 | /usr/include/c++/8.2.1/ext/alloc_traits.h \ 62 | 
/usr/include/c++/8.2.1/bits/alloc_traits.h \ 63 | /usr/include/c++/8.2.1/bits/memoryfwd.h \ 64 | /usr/include/c++/8.2.1/bits/uniform_int_dist.h \ 65 | /usr/include/c++/8.2.1/limits /usr/include/c++/8.2.1/cassert \ 66 | /usr/include/assert.h /usr/include/c++/8.2.1/cstddef \ 67 | /usr/include/c++/8.2.1/iosfwd /usr/include/c++/8.2.1/bits/stringfwd.h \ 68 | /usr/include/c++/8.2.1/bits/postypes.h /usr/include/c++/8.2.1/cwchar \ 69 | /usr/include/wchar.h \ 70 | /usr/lib/gcc/x86_64-pc-linux-gnu/8.2.1/include/stdarg.h \ 71 | /usr/include/bits/types/wint_t.h /usr/include/bits/types/mbstate_t.h \ 72 | /usr/include/bits/types/__mbstate_t.h /usr/include/bits/types/__FILE.h \ 73 | /usr/include/bits/types/FILE.h /usr/include/c++/8.2.1/map \ 74 | /usr/include/c++/8.2.1/bits/stl_tree.h \ 75 | /usr/include/c++/8.2.1/bits/allocator.h \ 76 | /usr/include/c++/8.2.1/x86_64-pc-linux-gnu/bits/c++allocator.h \ 77 | /usr/include/c++/8.2.1/ext/new_allocator.h \ 78 | /usr/include/c++/8.2.1/bits/stl_function.h \ 79 | /usr/include/c++/8.2.1/backward/binders.h \ 80 | /usr/include/c++/8.2.1/ext/aligned_buffer.h \ 81 | /usr/include/c++/8.2.1/bits/stl_map.h /usr/include/c++/8.2.1/tuple \ 82 | /usr/include/c++/8.2.1/array /usr/include/c++/8.2.1/stdexcept \ 83 | /usr/include/c++/8.2.1/string /usr/include/c++/8.2.1/bits/char_traits.h \ 84 | /usr/include/c++/8.2.1/cstdint /usr/include/c++/8.2.1/bits/localefwd.h \ 85 | /usr/include/c++/8.2.1/x86_64-pc-linux-gnu/bits/c++locale.h \ 86 | /usr/include/c++/8.2.1/clocale /usr/include/locale.h \ 87 | /usr/include/bits/locale.h /usr/include/c++/8.2.1/cctype \ 88 | /usr/include/ctype.h /usr/include/c++/8.2.1/bits/ostream_insert.h \ 89 | /usr/include/c++/8.2.1/bits/cxxabi_forced.h \ 90 | /usr/include/c++/8.2.1/bits/range_access.h \ 91 | /usr/include/c++/8.2.1/bits/basic_string.h \ 92 | /usr/include/c++/8.2.1/ext/atomicity.h \ 93 | /usr/include/c++/8.2.1/x86_64-pc-linux-gnu/bits/gthr.h \ 94 | /usr/include/c++/8.2.1/x86_64-pc-linux-gnu/bits/gthr-default.h \ 95 | 
/usr/include/pthread.h /usr/include/sched.h /usr/include/bits/sched.h \ 96 | /usr/include/bits/types/struct_sched_param.h /usr/include/bits/cpu-set.h \ 97 | /usr/include/time.h /usr/include/bits/time.h /usr/include/bits/timex.h \ 98 | /usr/include/bits/types/struct_tm.h \ 99 | /usr/include/bits/types/struct_itimerspec.h /usr/include/bits/setjmp.h \ 100 | /usr/include/c++/8.2.1/x86_64-pc-linux-gnu/bits/atomic_word.h \ 101 | /usr/include/c++/8.2.1/ext/string_conversions.h \ 102 | /usr/include/c++/8.2.1/cstdio /usr/include/stdio.h \ 103 | /usr/include/bits/types/__fpos_t.h /usr/include/bits/types/__fpos64_t.h \ 104 | /usr/include/bits/types/struct_FILE.h \ 105 | /usr/include/bits/types/cookie_io_functions_t.h \ 106 | /usr/include/bits/stdio_lim.h /usr/include/bits/sys_errlist.h \ 107 | /usr/include/c++/8.2.1/cerrno /usr/include/errno.h \ 108 | /usr/include/bits/errno.h /usr/include/linux/errno.h \ 109 | /usr/include/asm/errno.h /usr/include/asm-generic/errno.h \ 110 | /usr/include/asm-generic/errno-base.h /usr/include/bits/types/error_t.h \ 111 | /usr/include/c++/8.2.1/bits/functional_hash.h \ 112 | /usr/include/c++/8.2.1/bits/basic_string.tcc \ 113 | /usr/include/c++/8.2.1/bits/uses_allocator.h \ 114 | /usr/include/c++/8.2.1/bits/invoke.h \ 115 | /usr/include/c++/8.2.1/bits/stl_multimap.h /usr/include/c++/8.2.1/set \ 116 | /usr/include/c++/8.2.1/bits/stl_set.h \ 117 | /usr/include/c++/8.2.1/bits/stl_multiset.h /usr/include/c++/8.2.1/vector \ 118 | /usr/include/c++/8.2.1/bits/stl_uninitialized.h \ 119 | /usr/include/c++/8.2.1/bits/stl_vector.h \ 120 | /usr/include/c++/8.2.1/bits/stl_bvector.h \ 121 | /usr/include/c++/8.2.1/bits/vector.tcc 122 | -------------------------------------------------------------------------------- /benchmarks/python/improper_alignment.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import pycallgrind 3 | 4 | arr = np.arange(10000 * 10000).reshape(10000, 10000) 5 | 6 | with 
pycallgrind.callgrind('loop'): 7 | sum_ = arr.sum(axis=1) 8 | -------------------------------------------------------------------------------- /benchmarks/python/loop.py: -------------------------------------------------------------------------------- 1 | import pycallgrind 2 | 3 | arr = list(range(10000)) 4 | 5 | with pycallgrind.callgrind('loop'): 6 | sum_ = 0 7 | for element in arr: 8 | sum_ += element 9 | -------------------------------------------------------------------------------- /benchmarks/python/numpy_sum.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import pycallgrind 3 | 4 | arr = np.arange(10000) 5 | 6 | with pycallgrind.callgrind('loop'): 7 | sum_ = arr.sum() 8 | -------------------------------------------------------------------------------- /benchmarks/python/proper_alignment.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import pycallgrind 3 | 4 | arr = np.arange(10000 * 10000).reshape(10000, 10000) 5 | 6 | with pycallgrind.callgrind('loop'): 7 | sum_ = arr.sum(axis=0) 8 | -------------------------------------------------------------------------------- /etc/setup-env: -------------------------------------------------------------------------------- 1 | ( 2 | PYTHON_EXE=python3.6 3 | 4 | # create a new virtualenv 5 | $PYTHON_EXE -m venv venv 6 | 7 | # activate our venv 8 | source venv/bin/activate 9 | 10 | # pip install the things needed to build the docs; ipython is for people 11 | # to use during the exercises 12 | pip install \ 13 | ipython \ 14 | jupyter \ 15 | sphinx \ 16 | sphinx-rtd-theme \ 17 | numpy \ 18 | pandas \ 19 | matplotlib 20 | 21 | # build the sphinx project 22 | pushd tutorial 23 | make html 24 | popd 25 | 26 | ROOT=$PWD 27 | 28 | PYTHON_ASSERTION=" 29 | import numpy 30 | import pandas 31 | " 32 | if python -c "$PYTHON_ASSERTION";then 33 | printf "\nvirtual environment created successfully: $(find 
$ROOT -maxdepth 1 -name venv)\n" 34 | printf '\n\nEnvironment is setup correctly!\n' 35 | fi 36 | ) 37 | 38 | if [ $? -eq 0 ];then 39 | # only activate the venv if the install steps worked, otherwise we mask the 40 | # error 41 | source venv/bin/activate 42 | else 43 | # clear the failed venv 44 | rm -r venv 45 | 46 | BOLD=$(tput bold) 47 | RED=$(tput setaf 1) 48 | NORMAL=$(tput sgr0) 49 | if [ $? -ne 0 ];then 50 | # don't fail to print at all because of tput 51 | BOLD='' 52 | RED='' 53 | NORMAL='' 54 | fi 55 | printf "\n\nEnvironment is $BOLD$RED**not**$NORMAL setup correctly!\n" 56 | fi 57 | -------------------------------------------------------------------------------- /exercises/numpy/.ipynb_checkpoints/1-Finding Functions and Documentation in Jupyter-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Finding Functions with NumPy and Jupyter" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "NumPy is a large complex library with hundreds of useful functions. 
Often the hardest part of solving a problem with NumPy is simply finding the right function to use, or figuring out whether a function you've found can solve your problem.\n", 15 | "\n", 16 | "The official docs for NumPy and SciPy are excellent resources:\n", 17 | "\n", 18 | "- The official NumPy documentation: [https://docs.scipy.org/doc/numpy-1.13.0/reference/](https://docs.scipy.org/doc/numpy-1.13.0/reference/)\n", 19 | "- The official SciPy documentation: [https://docs.scipy.org/doc/scipy/reference/](https://docs.scipy.org/doc/scipy/reference/)\n", 20 | "\n", 21 | "There are also a few tricks you can use to learn about NumPy functions without having to leave the notebook:" 22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "execution_count": null, 27 | "metadata": {}, 28 | "outputs": [], 29 | "source": [ 30 | "import numpy as np" 31 | ] 32 | }, 33 | { 34 | "cell_type": "markdown", 35 | "metadata": {}, 36 | "source": [ 37 | "Suppose, for example, that we want to find the [eigenvalues](https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors) of a matrix. The first thing we might try is to use Jupyter's tab-completion feature to see if there's a top-level function with a name like \"eigenvalues\":" 38 | ] 39 | }, 40 | { 41 | "cell_type": "code", 42 | "execution_count": null, 43 | "metadata": {}, 44 | "outputs": [], 45 | "source": [ 46 | "# Place your cursor immediately after the `e` and hit TAB to see the top-level numpy functions\n", 47 | "# that start with the letter \"e\"\n", 48 | "np.e" 49 | ] 50 | }, 51 | { 52 | "cell_type": "markdown", 53 | "metadata": {}, 54 | "source": [ 55 | "Unfortunately for us, none of the functions that appear look like they're related to eigenvalues.
The next thing we can try is to use [`np.lookfor`](https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.lookfor.html) to search for \"eigenvalue\" by keyword:" 56 | ] 57 | }, 58 | { 59 | "cell_type": "code", 60 | "execution_count": null, 61 | "metadata": {}, 62 | "outputs": [], 63 | "source": [ 64 | "np.lookfor(\"eigenvalue\")" 65 | ] 66 | }, 67 | { 68 | "cell_type": "markdown", 69 | "metadata": {}, 70 | "source": [ 71 | "That looks more promising, but now we have to figure out which function to use. For that, we can use Jupyter's `?` operator. Running a cell containing `function_name?` will bring up a window containing documentation about the function.\n", 72 | "\n", 73 | "Execute the following cell and read the documentation for `eigvals`. In particular, notice that if you scroll down, there are **\"Examples\"** and **\"See Also\"** sections. These sections are often at least as useful as the description of what a function does." 74 | ] 75 | }, 76 | { 77 | "cell_type": "code", 78 | "execution_count": null, 79 | "metadata": {}, 80 | "outputs": [], 81 | "source": [ 82 | "np.linalg.eigvals?" 83 | ] 84 | }, 85 | { 86 | "cell_type": "markdown", 87 | "metadata": {}, 88 | "source": [ 89 | "For cases where we just want to check a docstring quickly, it can be more ergonomic to bring up documentation in-line using Shift+Tab:\n", 90 | "\n", 91 | "Place your cursor after \"eigvals\", hold Shift, and then press Tab. You should see the signature and the first line of the function's documentation. You can see the rest of the documentation by pressing Tab again without letting go of Shift." 
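Once `np.linalg.eigvals` turns up in the search, a quick sanity check confirms it does what we want. A standalone sketch (the matrix here is just an illustration):

```python
import numpy as np

# The eigenvalues of a diagonal matrix are its diagonal entries,
# which makes this an easy correctness check for np.linalg.eigvals.
m = np.diag([2.0, 3.0])
vals = np.sort(np.linalg.eigvals(m))
print(vals)  # [2. 3.]
```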
92 | ] 93 | }, 94 | { 95 | "cell_type": "code", 96 | "execution_count": null, 97 | "metadata": {}, 98 | "outputs": [], 99 | "source": [ 100 | "np.linalg.eigvals" 101 | ] 102 | }, 103 | { 104 | "cell_type": "markdown", 105 | "metadata": {}, 106 | "source": [ 107 | "## Exercise: Finding Functions\n", 108 | "\n", 109 | "Use `np.lookfor` and `?` to find functions that do the following:\n", 110 | "\n", 111 | "- Compute the largest value in an array.\n", 112 | "- Compute the smallest value in an array.\n", 113 | "- Compute the value at the 90th percentile of an array.\n", 114 | "- Sort an array." 115 | ] 116 | }, 117 | { 118 | "cell_type": "code", 119 | "execution_count": null, 120 | "metadata": {}, 121 | "outputs": [], 122 | "source": [] 123 | }, 124 | { 125 | "cell_type": "markdown", 126 | "metadata": {}, 127 | "source": [ 128 | "## Exercise: Finding Functions (continued)\n", 129 | "\n", 130 | "Continue using `np.lookfor` and `?` to find functions that do the following:" 131 | ] 132 | }, 133 | { 134 | "cell_type": "markdown", 135 | "metadata": {}, 136 | "source": [ 137 | "- Find a value in a sorted array.\n", 138 | "- Compute the [natural logarithm](https://en.wikipedia.org/wiki/Natural_logarithm) of each element in an array.\n", 139 | "- Compute the [Correlation Coefficient](https://en.wikipedia.org/wiki/Correlation_coefficient) between two arrays.\n", 140 | "- Fit coefficients of a polynomial function.\n", 141 | "- Compute a [Covariance Matrix](https://en.wikipedia.org/wiki/Covariance_matrix)." 
142 | ] 143 | }, 144 | { 145 | "cell_type": "code", 146 | "execution_count": null, 147 | "metadata": {}, 148 | "outputs": [], 149 | "source": [] 150 | } 151 | ], 152 | "metadata": { 153 | "kernelspec": { 154 | "display_name": "Python 3", 155 | "language": "python", 156 | "name": "python3" 157 | }, 158 | "language_info": { 159 | "codemirror_mode": { 160 | "name": "ipython", 161 | "version": 3 162 | }, 163 | "file_extension": ".py", 164 | "mimetype": "text/x-python", 165 | "name": "python", 166 | "nbconvert_exporter": "python", 167 | "pygments_lexer": "ipython3", 168 | "version": "3.6.6" 169 | } 170 | }, 171 | "nbformat": 4, 172 | "nbformat_minor": 2 173 | } 174 | -------------------------------------------------------------------------------- /exercises/numpy/.ipynb_checkpoints/3-Universal Functions-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Universal Functions\n", 8 | "\n", 9 | "The exercises in this notebook will teach you how to use universal functions (ufuncs) to apply vectorized operations over arrays." 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": null, 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "import numpy as np" 19 | ] 20 | }, 21 | { 22 | "cell_type": "markdown", 23 | "metadata": {}, 24 | "source": [ 25 | "## UFunc Basics: Unary Functions\n", 26 | "\n", 27 | "Numpy provides many functions that can be applied over an entire array as a single operation." 
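As a quick illustration of that idea (using a throwaway array rather than the `data` defined below):

```python
import numpy as np

values = np.array([-2.0, -1.0, 3.0])

# A single ufunc call applies the operation to every element -- no Python loop.
result = np.abs(values)
print(result)  # [2. 1. 3.]
```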
28 | ] 29 | }, 30 | { 31 | "cell_type": "code", 32 | "execution_count": null, 33 | "metadata": {}, 34 | "outputs": [], 35 | "source": [ 36 | "data = np.linspace(1, 3, 15)\n", 37 | "data" 38 | ] 39 | }, 40 | { 41 | "cell_type": "markdown", 42 | "metadata": {}, 43 | "source": [ 44 | "**Exercise:** Use `np.sqrt` to compute the square-root of every element in `data`.\n", 45 | "\n", 46 | "(**Hint:** You should only need to call `sqrt` once)." 47 | ] 48 | }, 49 | { 50 | "cell_type": "code", 51 | "execution_count": null, 52 | "metadata": {}, 53 | "outputs": [], 54 | "source": [ 55 | "np.sqrt?" 56 | ] 57 | }, 58 | { 59 | "cell_type": "markdown", 60 | "metadata": {}, 61 | "source": [ 62 | "**Exercise:** Use `np.exp` to compute $e^x$ for each element of `data`." 63 | ] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "execution_count": null, 68 | "metadata": {}, 69 | "outputs": [], 70 | "source": [ 71 | "np.exp?" 72 | ] 73 | }, 74 | { 75 | "cell_type": "markdown", 76 | "metadata": {}, 77 | "source": [ 78 | "**Exercise:** Use `np.log` to compute the [natural logarithm](https://en.wikipedia.org/wiki/Natural_logarithm) of every value in `data`." 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": null, 84 | "metadata": {}, 85 | "outputs": [], 86 | "source": [ 87 | "np.log?" 88 | ] 89 | }, 90 | { 91 | "cell_type": "markdown", 92 | "metadata": {}, 93 | "source": [ 94 | "## UFunc Basics: Binary Operators\n", 95 | "\n", 96 | "Most of Python's binary operators can also be used as UFuncs. \n", 97 | "\n", 98 | "Binary operators work on like-shaped (array, array) pairs, as well as (array, scalar) pairs." 
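Because the next few cells are exercises, here is a standalone sketch of both pairing styles on throwaway arrays (not the `x` and `y` defined below):

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([2, 3, 4])

# (array, scalar): the scalar is combined with every element.
shifted = a + 10
print(shifted)  # [11 12 13]

# (array, array): arrays of the same shape combine element-wise.
total = a + b
print(total)    # [3 5 7]
```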
99 | ] 100 | }, 101 | { 102 | "cell_type": "code", 103 | "execution_count": null, 104 | "metadata": {}, 105 | "outputs": [], 106 | "source": [ 107 | "x = np.linspace(0, 10, 11)\n", 108 | "print(\"x:\", x)\n", 109 | "y = np.linspace(0, 1, 11)\n", 110 | "print(\"y:\", y)" 111 | ] 112 | }, 113 | { 114 | "cell_type": "markdown", 115 | "metadata": {}, 116 | "source": [ 117 | "### Combining Arrays and Scalars" 118 | ] 119 | }, 120 | { 121 | "cell_type": "markdown", 122 | "metadata": {}, 123 | "source": [ 124 | "**Exercise:** Compute an array containing each value from `x` incremented by 1." 125 | ] 126 | }, 127 | { 128 | "cell_type": "code", 129 | "execution_count": null, 130 | "metadata": {}, 131 | "outputs": [], 132 | "source": [] 133 | }, 134 | { 135 | "cell_type": "markdown", 136 | "metadata": {}, 137 | "source": [ 138 | "**Exercise:** Compute an array containing each value from `y` multiplied by 2." 139 | ] 140 | }, 141 | { 142 | "cell_type": "code", 143 | "execution_count": null, 144 | "metadata": {}, 145 | "outputs": [], 146 | "source": [] 147 | }, 148 | { 149 | "cell_type": "markdown", 150 | "metadata": {}, 151 | "source": [ 152 | "**Exercise:** Compute an array containing each value from ``x`` squared." 153 | ] 154 | }, 155 | { 156 | "cell_type": "code", 157 | "execution_count": null, 158 | "metadata": {}, 159 | "outputs": [], 160 | "source": [] 161 | }, 162 | { 163 | "cell_type": "markdown", 164 | "metadata": {}, 165 | "source": [ 166 | "### Combining Arrays with Arrays" 167 | ] 168 | }, 169 | { 170 | "cell_type": "markdown", 171 | "metadata": {}, 172 | "source": [ 173 | "**Exercise:** Compute an array containing the element-wise sum of values drawn from `x` and `y`. 
\n", 174 | "\n", 175 | "(For example, the element-wise sum of `[1, 2, 3]` and `[2, 3, 4]` would be `[3, 5, 7]`.)" 176 | ] 177 | }, 178 | { 179 | "cell_type": "code", 180 | "execution_count": null, 181 | "metadata": {}, 182 | "outputs": [], 183 | "source": [] 184 | }, 185 | { 186 | "cell_type": "markdown", 187 | "metadata": {}, 188 | "source": [ 189 | "**Exercise:** Compute an array containing the element-wise product of values drawn from `x` and `y`." 190 | ] 191 | }, 192 | { 193 | "cell_type": "code", 194 | "execution_count": null, 195 | "metadata": {}, 196 | "outputs": [], 197 | "source": [] 198 | }, 199 | { 200 | "cell_type": "markdown", 201 | "metadata": {}, 202 | "source": [ 203 | "**Exercise:** Compute an array containing each element in `x` raised to the power of the corresponding element in `y`." 204 | ] 205 | }, 206 | { 207 | "cell_type": "code", 208 | "execution_count": null, 209 | "metadata": {}, 210 | "outputs": [], 211 | "source": [] 212 | }, 213 | { 214 | "cell_type": "markdown", 215 | "metadata": {}, 216 | "source": [ 217 | "## Exercise: Plotting Functions with Numpy and Matplotlib\n", 218 | "\n", 219 | "Numpy can be used as powerful graphing calculator when combined with a plotting library like [matplotlib](https://matplotlib.org/index.html)." 220 | ] 221 | }, 222 | { 223 | "cell_type": "code", 224 | "execution_count": null, 225 | "metadata": {}, 226 | "outputs": [], 227 | "source": [ 228 | "x = np.linspace(-np.pi, np.pi, 500)" 229 | ] 230 | }, 231 | { 232 | "cell_type": "markdown", 233 | "metadata": {}, 234 | "source": [ 235 | "You can plot arrays of X and Y values by passing them to `matplotlib.pyplot.plot`." 236 | ] 237 | }, 238 | { 239 | "cell_type": "code", 240 | "execution_count": null, 241 | "metadata": {}, 242 | "outputs": [], 243 | "source": [ 244 | "import matplotlib.pyplot as plt\n", 245 | "\n", 246 | "# Tell matplotlib to output images to our notebook. 
\n", 247 | "# Without this line, matplotlib would construct a figure in memory, but wouldn't show it to us.\n", 248 | "%matplotlib inline" 249 | ] 250 | }, 251 | { 252 | "cell_type": "code", 253 | "execution_count": null, 254 | "metadata": {}, 255 | "outputs": [], 256 | "source": [ 257 | "# Tell matplotlib to use a larger default figure size.\n", 258 | "plt.rc('figure', figsize=(12, 7))" 259 | ] 260 | }, 261 | { 262 | "cell_type": "markdown", 263 | "metadata": {}, 264 | "source": [ 265 | "**Example:** Plot the graph of $y = x^2$." 266 | ] 267 | }, 268 | { 269 | "cell_type": "code", 270 | "execution_count": null, 271 | "metadata": {}, 272 | "outputs": [], 273 | "source": [ 274 | "plt.plot(x, x ** 2);" 275 | ] 276 | }, 277 | { 278 | "cell_type": "markdown", 279 | "metadata": {}, 280 | "source": [ 281 | "**Exercise:** Use numpy and matplotlib to plot graphs of the following functions:\n", 282 | "\n", 283 | "- $y = x^2 + 2x + 1$\n", 284 | "- $y = \\sqrt{|x|}$\n", 285 | "- $y = \\sin{\\frac{1}{x}}$" 286 | ] 287 | }, 288 | { 289 | "cell_type": "code", 290 | "execution_count": null, 291 | "metadata": {}, 292 | "outputs": [], 293 | "source": [] 294 | }, 295 | { 296 | "cell_type": "code", 297 | "execution_count": null, 298 | "metadata": {}, 299 | "outputs": [], 300 | "source": [] 301 | }, 302 | { 303 | "cell_type": "code", 304 | "execution_count": null, 305 | "metadata": {}, 306 | "outputs": [], 307 | "source": [] 308 | }, 309 | { 310 | "cell_type": "markdown", 311 | "metadata": {}, 312 | "source": [ 313 | "Numpy and matplotlib can also be used to plot functions of multiple arguments.\n", 314 | "\n", 315 | "The easiest way to plot a function of two arguments is to use `np.meshgrid` to generate grids of `x` and `y` coordinates, and to use `matplotlib.pyplot.imshow` to draw a density plot." 316 | ] 317 | }, 318 | { 319 | "cell_type": "markdown", 320 | "metadata": {}, 321 | "source": [ 322 | "**Example:** Draw a density plot of $z = \\sin{x} + \\cos{y}$." 
323 | ] 324 | }, 325 | { 326 | "cell_type": "code", 327 | "execution_count": null, 328 | "metadata": {}, 329 | "outputs": [], 330 | "source": [ 331 | "# np.meshgrid takes arrays of X and Y values, and returns a pair of 2D arrays \n", 332 | "# containing all pairs of (X, Y) coordinates from the input arrays.\n", 333 | "xvals = np.linspace(-5, 5, 9)\n", 334 | "yvals = np.linspace(-5, 5, 9)\n", 335 | "print(\"X Values:\", xvals)\n", 336 | "print(\"Y Values:\", yvals)\n", 337 | "\n", 338 | "xcoords, ycoords = np.meshgrid(xvals, yvals)\n", 339 | "print(\"X Coordinates:\\n\", xcoords)\n", 340 | "print(\"Y Coordinates:\\n\", ycoords)" 341 | ] 342 | }, 343 | { 344 | "cell_type": "code", 345 | "execution_count": null, 346 | "metadata": {}, 347 | "outputs": [], 348 | "source": [ 349 | "plt.imshow(np.sin(xcoords) + np.cos(ycoords), extent=[xvals[0], xvals[-1], yvals[0], yvals[-1]]);" 350 | ] 351 | }, 352 | { 353 | "cell_type": "markdown", 354 | "metadata": {}, 355 | "source": [ 356 | "The plot looks better if we use a few more samples:" 357 | ] 358 | }, 359 | { 360 | "cell_type": "code", 361 | "execution_count": null, 362 | "metadata": {}, 363 | "outputs": [], 364 | "source": [ 365 | "X, Y = np.meshgrid(np.linspace(-5, 5, 100), np.linspace(-5, 5, 100))\n", 366 | "plt.imshow(np.sin(X) + np.cos(Y), extent=[-5, 5, -5, 5]);" 367 | ] 368 | }, 369 | { 370 | "cell_type": "markdown", 371 | "metadata": {}, 372 | "source": [ 373 | "**Exercise:** Generate a density plot of the function $z = \\sqrt{x^2 + y^2}$." 
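Before tackling the exercise, it can help to sanity-check how `np.meshgrid` shapes its outputs. A standalone sketch with deliberately unequal sample counts:

```python
import numpy as np

xvals = np.linspace(-5, 5, 3)  # 3 x-samples
yvals = np.linspace(-5, 5, 4)  # 4 y-samples
X, Y = np.meshgrid(xvals, yvals)

# With the default indexing, both grids have shape (len(yvals), len(xvals)):
# rows vary in y, columns vary in x.
print(X.shape, Y.shape)  # (4, 3) (4, 3)
```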
374 | ] 375 | }, 376 | { 377 | "cell_type": "code", 378 | "execution_count": null, 379 | "metadata": {}, 380 | "outputs": [], 381 | "source": [] 382 | }, 383 | { 384 | "cell_type": "markdown", 385 | "metadata": {}, 386 | "source": [ 387 | "**Exercise:** Generate a density plot of the function $z = e^x - e^y$" 388 | ] 389 | }, 390 | { 391 | "cell_type": "code", 392 | "execution_count": null, 393 | "metadata": {}, 394 | "outputs": [], 395 | "source": [] 396 | } 397 | ], 398 | "metadata": { 399 | "kernelspec": { 400 | "display_name": "Python 3", 401 | "language": "python", 402 | "name": "python3" 403 | }, 404 | "language_info": { 405 | "codemirror_mode": { 406 | "name": "ipython", 407 | "version": 3 408 | }, 409 | "file_extension": ".py", 410 | "mimetype": "text/x-python", 411 | "name": "python", 412 | "nbconvert_exporter": "python", 413 | "pygments_lexer": "ipython3", 414 | "version": "3.6.6" 415 | } 416 | }, 417 | "nbformat": 4, 418 | "nbformat_minor": 2 419 | } 420 | -------------------------------------------------------------------------------- /exercises/numpy/.ipynb_checkpoints/4-Selections-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Selections\n", 8 | "\n", 9 | "Often when we're working with numpy we're only interested in a portion of the data in our arrays. The `[]` on `ndarray` allows us to select portions of the data in the array in a variety of interesting ways.\n", 10 | "\n", 11 | "The exercises in this notebook will teach you how to select elements out of arrays in a variety of ways." 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": null, 17 | "metadata": {}, 18 | "outputs": [], 19 | "source": [ 20 | "import numpy as np\n", 21 | "\n", 22 | "rand = np.random.RandomState(42) # Use a deterministic seed." 
23 | ] 24 | }, 25 | { 26 | "cell_type": "markdown", 27 | "metadata": {}, 28 | "source": [ 29 | "## Exercise: 1-dimensional selection\n", 30 | "\n", 31 | "Write expressions to select the following elements from the array:\n", 32 | "\n", 33 | "1. first element\n", 34 | "1. second element\n", 35 | "1. last element\n", 36 | "1. second to last element\n", 37 | "1. first 5 elements\n", 38 | "1. last 5 elements\n", 39 | "1. elements at indices 1, 4, 7, and 13\n", 40 | "1. elements with even indices\n", 41 | "1. the entire array, in reverse order\n", 42 | "1. every other element, starting at index 3 (inclusive) ending at index 17 (exclusive)" 43 | ] 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": null, 48 | "metadata": {}, 49 | "outputs": [], 50 | "source": [ 51 | "array = np.arange(20)" 52 | ] 53 | }, 54 | { 55 | "cell_type": "code", 56 | "execution_count": null, 57 | "metadata": {}, 58 | "outputs": [], 59 | "source": [ 60 | "array[FILL_ME_IN]" 61 | ] 62 | }, 63 | { 64 | "cell_type": "markdown", 65 | "metadata": {}, 66 | "source": [ 67 | "## Exercise: 2-dimensional selection\n", 68 | "\n", 69 | "Write expressions to select the following elements from the array.\n", 70 | "\n", 71 | "1. scalar value at coordinates `[3, 6]`\n", 72 | "1. top-left scalar value\n", 73 | "1. first row\n", 74 | "1. first column\n", 75 | "1. second column\n", 76 | "1. last column\n", 77 | "1. first 5 columns\n", 78 | "1. last 5 columns\n", 79 | "1. top-left 2 x 2 square\n", 80 | "1. top-right 2 x 2 square\n", 81 | "1. 
last 5 rows from every other column" 82 | ] 83 | }, 84 | { 85 | "cell_type": "code", 86 | "execution_count": null, 87 | "metadata": {}, 88 | "outputs": [], 89 | "source": [ 90 | "array = np.arange(20 * 20).reshape(20, 20)" 91 | ] 92 | }, 93 | { 94 | "cell_type": "code", 95 | "execution_count": null, 96 | "metadata": {}, 97 | "outputs": [], 98 | "source": [ 99 | "array[FILL_ME_IN]" 100 | ] 101 | }, 102 | { 103 | "cell_type": "markdown", 104 | "metadata": {}, 105 | "source": [ 106 | "## Exercise: N-dimensional selection" 107 | ] 108 | }, 109 | { 110 | "cell_type": "markdown", 111 | "metadata": {}, 112 | "source": [ 113 | "## Exercise: Selections with boolean arrays." 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": null, 119 | "metadata": {}, 120 | "outputs": [], 121 | "source": [ 122 | "array = rand.normal(0, 1, 50)" 123 | ] 124 | }, 125 | { 126 | "cell_type": "markdown", 127 | "metadata": {}, 128 | "source": [ 129 | "Write an expression to select the positive values from the array." 130 | ] 131 | }, 132 | { 133 | "cell_type": "code", 134 | "execution_count": null, 135 | "metadata": {}, 136 | "outputs": [], 137 | "source": [] 138 | }, 139 | { 140 | "cell_type": "markdown", 141 | "metadata": {}, 142 | "source": [ 143 | "Write an expression to select the values less than -1 **or** greater than 1.5." 144 | ] 145 | }, 146 | { 147 | "cell_type": "code", 148 | "execution_count": null, 149 | "metadata": {}, 150 | "outputs": [], 151 | "source": [ 152 | "array = rand.normal(0, 1, 50)" 153 | ] 154 | }, 155 | { 156 | "cell_type": "markdown", 157 | "metadata": {}, 158 | "source": [ 159 | "Write an expression that produces the value from `array` if the value is positive, and produces the **square** of the value if it's negative." 
160 | ] 161 | }, 162 | { 163 | "cell_type": "code", 164 | "execution_count": null, 165 | "metadata": {}, 166 | "outputs": [], 167 | "source": [ 168 | "array = np.arange(-5, 5)" 169 | ] 170 | }, 171 | { 172 | "cell_type": "code", 173 | "execution_count": null, 174 | "metadata": {}, 175 | "outputs": [], 176 | "source": [ 177 | "np.where?" 178 | ] 179 | }, 180 | { 181 | "cell_type": "markdown", 182 | "metadata": {}, 183 | "source": [ 184 | "## Exercise: \"FizzBuzz\"\n", 185 | "\n", 186 | "Write an expression that converts `array` into a new array of the same shape according to the following rules:\n", 187 | "\n", 188 | "At each index `[i]`:\n", 189 | "\n", 190 | "- if `array[i]` is divisible by 15: `result[i]` should hold -3\n", 191 | "- else, if `array[i]` is divisible by 3: `result[i]` should hold -1\n", 192 | "- else, if `array[i]` is divisible by 5: `result[i]` should hold -2\n", 193 | "- otherwise: `result[i]` should hold `array[i]`\n", 194 | "\n", 195 | "(**Hint:** `np.select` works like `np.where`, but it can select from more than two arrays. Condition order matters: check the divisible-by-15 case first, since those values are also divisible by 3 and by 5.)" 196 | ] 197 | }, 198 | { 199 | "cell_type": "code", 200 | "execution_count": null, 201 | "metadata": {}, 202 | "outputs": [], 203 | "source": [ 204 | "array = np.arange(1, 100)\n", 205 | "np.select?" 206 | ] 207 | }, 208 | { 209 | "cell_type": "markdown", 210 | "metadata": {}, 211 | "source": [ 212 | "## Exercise: N-dimensional FizzBuzz\n", 213 | "\n", 214 | "Same rules as above, but on a 3-dimensional array. 
(HINT: It's possible to write a solution that works for this exercise and the previous one.)" 215 | ] 216 | }, 217 | { 218 | "cell_type": "code", 219 | "execution_count": null, 220 | "metadata": {}, 221 | "outputs": [], 222 | "source": [ 223 | "array = np.arange(1, 100).reshape(3, 11, 3)" 224 | ] 225 | } 226 | ], 227 | "metadata": { 228 | "kernelspec": { 229 | "display_name": "Python 3", 230 | "language": "python", 231 | "name": "python3" 232 | }, 233 | "language_info": { 234 | "codemirror_mode": { 235 | "name": "ipython", 236 | "version": 3 237 | }, 238 | "file_extension": ".py", 239 | "mimetype": "text/x-python", 240 | "name": "python", 241 | "nbconvert_exporter": "python", 242 | "pygments_lexer": "ipython3", 243 | "version": "3.6.6" 244 | } 245 | }, 246 | "nbformat": 4, 247 | "nbformat_minor": 2 248 | } 249 | -------------------------------------------------------------------------------- /exercises/numpy/.ipynb_checkpoints/5-Reductions-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Reductions\n", 8 | "\n", 9 | "Reductions allow us to compute summary statistics and other useful aggregations over our data.\n", 10 | "\n", 11 | "The dataset for these exercises contains minutely price and volume observations for 5 stocks for October of 2017." 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": 1, 17 | "metadata": {}, 18 | "outputs": [ 19 | { 20 | "data": { 21 | "text/html": [ 22 | "
\n", 23 | "\n", 36 | "\n", 37 | " \n", 38 | " \n", 39 | " \n", 40 | " \n", 41 | " \n", 42 | " \n", 43 | " \n", 44 | " \n", 45 | " \n", 46 | " \n", 47 | " \n", 48 | " \n", 49 | " \n", 50 | " \n", 51 | " \n", 52 | " \n", 53 | " \n", 54 | " \n", 55 | " \n", 56 | " \n", 57 | " \n", 58 | " \n", 59 | " \n", 60 | " \n", 61 | " \n", 62 | " \n", 63 | " \n", 64 | " \n", 65 | " \n", 66 | " \n", 67 | " \n", 68 | " \n", 69 | " \n", 70 | " \n", 71 | " \n", 72 | " \n", 73 | " \n", 74 | " \n", 75 | " \n", 76 | " \n", 77 | " \n", 78 | " \n", 79 | " \n", 80 | " \n", 81 | " \n", 82 | " \n", 83 | " \n", 84 | " \n", 85 | " \n", 86 | " \n", 87 | " \n", 88 | " \n", 89 | " \n", 90 | " \n", 91 | " \n", 92 | " \n", 93 | " \n", 94 | " \n", 95 | " \n", 96 | " \n", 97 | "
AAPLMSFTTSLAMCDBK
dt
2017-10-02 13:31:00154.3474.880342.330156.38052.736
2017-10-02 13:32:00154.0774.832341.480156.66052.686
2017-10-02 13:33:00153.7274.835341.830156.32452.756
2017-10-02 13:34:00153.6974.890341.240156.66052.726
2017-10-02 13:35:00153.4574.810341.873156.67052.706
\n", 98 | "
" 99 | ], 100 | "text/plain": [ 101 | " AAPL MSFT TSLA MCD BK\n", 102 | "dt \n", 103 | "2017-10-02 13:31:00 154.34 74.880 342.330 156.380 52.736\n", 104 | "2017-10-02 13:32:00 154.07 74.832 341.480 156.660 52.686\n", 105 | "2017-10-02 13:33:00 153.72 74.835 341.830 156.324 52.756\n", 106 | "2017-10-02 13:34:00 153.69 74.890 341.240 156.660 52.726\n", 107 | "2017-10-02 13:35:00 153.45 74.810 341.873 156.670 52.706" 108 | ] 109 | }, 110 | "metadata": {}, 111 | "output_type": "display_data" 112 | }, 113 | { 114 | "data": { 115 | "text/html": [ 116 | "
\n", 117 | "\n", 130 | "\n", 131 | " \n", 132 | " \n", 133 | " \n", 134 | " \n", 135 | " \n", 136 | " \n", 137 | " \n", 138 | " \n", 139 | " \n", 140 | " \n", 141 | " \n", 142 | " \n", 143 | " \n", 144 | " \n", 145 | " \n", 146 | " \n", 147 | " \n", 148 | " \n", 149 | " \n", 150 | " \n", 151 | " \n", 152 | " \n", 153 | " \n", 154 | " \n", 155 | " \n", 156 | " \n", 157 | " \n", 158 | " \n", 159 | " \n", 160 | " \n", 161 | " \n", 162 | " \n", 163 | " \n", 164 | " \n", 165 | " \n", 166 | " \n", 167 | " \n", 168 | " \n", 169 | " \n", 170 | " \n", 171 | " \n", 172 | " \n", 173 | " \n", 174 | " \n", 175 | " \n", 176 | " \n", 177 | " \n", 178 | " \n", 179 | " \n", 180 | " \n", 181 | " \n", 182 | " \n", 183 | " \n", 184 | " \n", 185 | " \n", 186 | " \n", 187 | " \n", 188 | " \n", 189 | " \n", 190 | " \n", 191 | "
AAPLMSFTTSLAMCDBK
dt
2017-10-02 13:31:00420042.0409211.049907.085774.030276.0
2017-10-02 13:32:00161960.049207.018480.06866.04511.0
2017-10-02 13:33:00118283.024043.047039.03000.03001.0
2017-10-02 13:34:00103544.062383.013444.04364.0900.0
2017-10-02 13:35:0088012.040175.036556.0820.01500.0
\n", 192 | "
" 193 | ], 194 | "text/plain": [ 195 | " AAPL MSFT TSLA MCD BK\n", 196 | "dt \n", 197 | "2017-10-02 13:31:00 420042.0 409211.0 49907.0 85774.0 30276.0\n", 198 | "2017-10-02 13:32:00 161960.0 49207.0 18480.0 6866.0 4511.0\n", 199 | "2017-10-02 13:33:00 118283.0 24043.0 47039.0 3000.0 3001.0\n", 200 | "2017-10-02 13:34:00 103544.0 62383.0 13444.0 4364.0 900.0\n", 201 | "2017-10-02 13:35:00 88012.0 40175.0 36556.0 820.0 1500.0" 202 | ] 203 | }, 204 | "metadata": {}, 205 | "output_type": "display_data" 206 | } 207 | ], 208 | "source": [ 209 | "import numpy as np\n", 210 | "import pandas as pd\n", 211 | "from IPython.display import display\n", 212 | "import matplotlib.pyplot as plt\n", 213 | "\n", 214 | "prices_df = pd.read_csv('prices.csv', index_col='dt', parse_dates=['dt'])\n", 215 | "volumes_df = pd.read_csv('volumes.csv', index_col='dt', parse_dates=['dt'])\n", 216 | "\n", 217 | "display(prices_df.head())\n", 218 | "display(volumes_df.head())" 219 | ] 220 | }, 221 | { 222 | "cell_type": "code", 223 | "execution_count": null, 224 | "metadata": {}, 225 | "outputs": [], 226 | "source": [ 227 | "prices = prices_df.values\n", 228 | "volumes = volumes_df.values\n", 229 | "timestamps = prices_df.index.values" 230 | ] 231 | }, 232 | { 233 | "cell_type": "code", 234 | "execution_count": null, 235 | "metadata": {}, 236 | "outputs": [], 237 | "source": [ 238 | "timestamps" 239 | ] 240 | }, 241 | { 242 | "cell_type": "markdown", 243 | "metadata": {}, 244 | "source": [ 245 | "**Exercise:** Compute the average price for each stock." 246 | ] 247 | }, 248 | { 249 | "cell_type": "code", 250 | "execution_count": null, 251 | "metadata": {}, 252 | "outputs": [], 253 | "source": [] 254 | }, 255 | { 256 | "cell_type": "markdown", 257 | "metadata": {}, 258 | "source": [ 259 | "**Exercise:** Compute the average volume for each stock." 
260 | ] 261 | }, 262 | { 263 | "cell_type": "code", 264 | "execution_count": null, 265 | "metadata": {}, 266 | "outputs": [], 267 | "source": [] 268 | }, 269 | { 270 | "cell_type": "markdown", 271 | "metadata": {}, 272 | "source": [ 273 | "**Exercise:** Compute the number of times that each stock's price increased." 274 | ] 275 | }, 276 | { 277 | "cell_type": "code", 278 | "execution_count": null, 279 | "metadata": {}, 280 | "outputs": [], 281 | "source": [ 282 | "np.diff?" 283 | ] 284 | }, 285 | { 286 | "cell_type": "markdown", 287 | "metadata": {}, 288 | "source": [ 289 | "**Exercise:** Compute the volume-weighted average price of all 5 stocks." 290 | ] 291 | }, 292 | { 293 | "cell_type": "code", 294 | "execution_count": null, 295 | "metadata": {}, 296 | "outputs": [], 297 | "source": [ 298 | "np.average?" 299 | ] 300 | }, 301 | { 302 | "cell_type": "markdown", 303 | "metadata": {}, 304 | "source": [ 305 | "**Exercise:** Compute the timestamps where the lowest price occurred for each stock." 306 | ] 307 | }, 308 | { 309 | "cell_type": "code", 310 | "execution_count": null, 311 | "metadata": {}, 312 | "outputs": [], 313 | "source": [ 314 | "np.argmin?" 
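Two of the exercises above lean on `np.average`; as a reminder of how its `weights` argument works before applying it to the real dataset (toy numbers, not the prices above):

```python
import numpy as np

toy_prices = np.array([10.0, 11.0, 12.0])
toy_volumes = np.array([1.0, 1.0, 2.0])

# With `weights`, np.average computes sum(p * w) / sum(w),
# which is exactly a volume-weighted average price.
vwap = np.average(toy_prices, weights=toy_volumes)
print(vwap)  # (10 + 11 + 24) / 4 = 11.25
```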
315 | ] 316 | }, 317 | { 318 | "cell_type": "markdown", 319 | "metadata": {}, 320 | "source": [ 321 | "**Exercise:** Compute the average volume for each minute of the day, aggregated across all 5 stocks.\n", 322 | "\n", 323 | "(**HINT:** There are exactly 390 trading minutes in each day in this dataset.)" 324 | ] 325 | }, 326 | { 327 | "cell_type": "code", 328 | "execution_count": null, 329 | "metadata": {}, 330 | "outputs": [], 331 | "source": [ 332 | "plt.plot(volumes[:, 0].reshape(22, 390).mean(axis=0))" 333 | ] 334 | }, 335 | { 336 | "cell_type": "code", 337 | "execution_count": null, 338 | "metadata": {}, 339 | "outputs": [], 340 | "source": [ 341 | "volumes" 342 | ] 343 | }, 344 | { 345 | "cell_type": "code", 346 | "execution_count": null, 347 | "metadata": {}, 348 | "outputs": [], 349 | "source": [ 350 | "plt.plot(volumes.sum(axis=1).reshape(22, 390).mean(axis=0))" 351 | ] 352 | }, 353 | { 354 | "cell_type": "code", 355 | "execution_count": null, 356 | "metadata": {}, 357 | "outputs": [], 358 | "source": [] 359 | } 360 | ], 361 | "metadata": { 362 | "kernelspec": { 363 | "display_name": "Python 3", 364 | "language": "python", 365 | "name": "python3" 366 | }, 367 | "language_info": { 368 | "codemirror_mode": { 369 | "name": "ipython", 370 | "version": 3 371 | }, 372 | "file_extension": ".py", 373 | "mimetype": "text/x-python", 374 | "name": "python", 375 | "nbconvert_exporter": "python", 376 | "pygments_lexer": "ipython3", 377 | "version": "3.6.6" 378 | } 379 | }, 380 | "nbformat": 4, 381 | "nbformat_minor": 2 382 | } 383 | -------------------------------------------------------------------------------- /exercises/numpy/.ipynb_checkpoints/6-Broadcasting-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Broadcasting\n", 8 | "\n", 9 | "In these exercises, we'll practice using broadcasting to combine arrays of 
different dimensions." 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": null, 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "import numpy as np\n", 19 | "import pandas as pd\n", 20 | "from IPython.display import display\n", 21 | "import matplotlib.pyplot as plt\n", 22 | "plt.rc('figure', figsize=(12, 7))" 23 | ] 24 | }, 25 | { 26 | "cell_type": "markdown", 27 | "metadata": {}, 28 | "source": [ 29 | "## Bezier Curves\n", 30 | "\n", 31 | "A [Bezier Curve](https://en.wikipedia.org/wiki/B%C3%A9zier_curve) is a way to define a two-dimensional curve in terms of a sequence of \"control points\". Intuitively, each control point \"pulls\" the path traced by the curve toward itself.\n", 32 | "\n", 33 | "Mathematically, a bezier curve defines a function $B(t)$ from the interval $[0, 1]$ to a two-dimensional point $p$." 34 | ] 35 | }, 36 | { 37 | "cell_type": "markdown", 38 | "metadata": {}, 39 | "source": [ 40 | "Here's an example of a fourth-order bezier curve. You can see that as t moves from 0 to 1, the control points exert different amounts of \"force\", pulling the final point closer toward themselves." 41 | ] 42 | }, 43 | { 44 | "cell_type": "markdown", 45 | "metadata": {}, 46 | "source": [ 47 | "![images/bezier.gif](images/bezier.gif)" 48 | ] 49 | }, 50 | { 51 | "cell_type": "markdown", 52 | "metadata": {}, 53 | "source": [ 54 | "## Linear Bezier Curves\n", 55 | "\n", 56 | "The simplest form of bezier curve is a \"first-order\" bezier curve with two control points $p_0$ and $p_1$.\n", 57 | "\n", 58 | "The formula for a first-order bezier curve is:\n", 59 | "\n", 60 | "$B(t) = (1 - t)p_0 + tp_1$\n", 61 | "\n", 62 | "The curve traced out by a first-order bezier curve is simply the line connecting $p_0$ and $p_1$. For any value of $t \\in [0, 1]$, $B(t)$ evaluates to the point \"t percent\" of the way between $p_0$ and $p_1$." 
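The formula can be checked directly with a couple of NumPy one-liners (this mirrors the assertion cell further down; it is not a full solution to the exercises):

```python
import numpy as np

p0 = np.array([0.0, 0.0])
p1 = np.array([2.0, 1.0])

# B(t) = (1 - t) * p0 + t * p1; at t = 0.5 this is the midpoint.
t = 0.5
b = (1 - t) * p0 + t * p1
print(b)  # [1.  0.5]
```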
63 | ] 64 | }, 65 | { 66 | "cell_type": "markdown", 67 | "metadata": {}, 68 | "source": [ 69 | "**Exercise:** Implement a function that evaluates a Bezier curve **at a single point**. \n", 70 | "\n", 71 | "Your function should take the following arguments:\n", 72 | "\n", 73 | "- `p0`, a length-2 array containing (x, y) coordinates of the first control point.\n", 74 | "- `p1`, a length-2 array containing (x, y) coordinates of the second control point.\n", 75 | "- `t`, a scalar value between 0 and 1." 76 | ] 77 | }, 78 | { 79 | "cell_type": "code", 80 | "execution_count": null, 81 | "metadata": {}, 82 | "outputs": [], 83 | "source": [ 84 | "def evaluate_linear_bezier_curve(p0, p1, t):\n", 85 | " raise NotImplementedError(\"IMPLEMENT ME!\")" 86 | ] 87 | }, 88 | { 89 | "cell_type": "code", 90 | "execution_count": null, 91 | "metadata": {}, 92 | "outputs": [], 93 | "source": [ 94 | "p0 = np.array([0, 0])\n", 95 | "p1 = np.array([2, 1])\n", 96 | "\n", 97 | "halfway = evaluate_linear_bezier_curve(p0, p1, 0.5)\n", 98 | "three_quarters = evaluate_linear_bezier_curve(p0, p1, 0.75)\n", 99 | "\n", 100 | "# If your implementation is correct, these assertions shouldn't trigger.\n", 101 | "np.testing.assert_almost_equal(halfway, [1.0, 0.5])\n", 102 | "np.testing.assert_almost_equal(three_quarters, [1.5, 0.75])" 103 | ] 104 | }, 105 | { 106 | "cell_type": "markdown", 107 | "metadata": {}, 108 | "source": [ 109 | "Implement a function that computes **an array of samples from a linear bezier curve**. Your function should take the following arguments:\n", 110 | "\n", 111 | "- `p0`, a length-2 array containing (x, y) coordinates of the first control point. (Same as above.)\n", 112 | "- `p1`, a length-2 array containing (x, y) coordinates of the second control point. 
(Same as above.)\n", 113 | "- `ts`, a 1d array of unknown length containing sample values between 0 and 1.\n", 114 | "\n", 115 | "Your function should return a `len(ts) x 2` array containing (x, y) coordinates of the requested samples.\n", 116 | "\n", 117 | "Once you think you have a solution, you can check your implementation by running `draw_linear_bezier_curve`, which will draw a bezier curve using samples generated by your `compute_linear_bezier_curve` function." 118 | ] 119 | }, 120 | { 121 | "cell_type": "code", 122 | "execution_count": null, 123 | "metadata": {}, 124 | "outputs": [], 125 | "source": [ 126 | "def draw_linear_bezier_curve(p0, p1):\n", 127 | " \"\"\"You shouldn't need to change anything in this function.\n", 128 | " \"\"\"\n", 129 | " ts = np.linspace(0, 1, 50)\n", 130 | " samples = compute_linear_bezier_curve(p0, p1, ts)\n", 131 | " X = samples[:, 0]\n", 132 | " Y = samples[:, 1]\n", 133 | " \n", 134 | " plt.plot(X, Y)\n", 135 | " plt.scatter([p0[0], p1[0]], [p0[1], p1[1]], color='red')\n", 136 | "\n", 137 | "def compute_linear_bezier_curve(p0, p1, ts):\n", 138 | " raise NotImplementedError(\"IMPLEMENT ME!\")" 139 | ] 140 | }, 141 | { 142 | "cell_type": "code", 143 | "execution_count": null, 144 | "metadata": {}, 145 | "outputs": [], 146 | "source": [ 147 | "draw_linear_bezier_curve(np.array([1, 1]), np.array([2, 3]));" 148 | ] 149 | }, 150 | { 151 | "cell_type": "markdown", 152 | "metadata": {}, 153 | "source": [ 154 | "## Exercise: Quadratic Bezier Curve\n", 155 | "\n", 156 | "The next-simplest form of Bezier Curve is a second-order (also known as \"quadratic\") curve. A second-order Bezier Curve has three control points, and can be implemented using the following formula:\n", 157 | "\n", 158 | "$b(t) = (1 - t)^2p_0 + 2(1 - t)tp_1 + t^2p_2$\n", 159 | "\n", 160 | "Implement a function with the same signature as above, but accepting three control points, `p0`, `p1`, and `p2`.
A correct implementation will draw a line that starts at $p_0$, curves toward $p_1$, and finishes at $p_2$." 161 | ] 162 | }, 163 | { 164 | "cell_type": "code", 165 | "execution_count": null, 166 | "metadata": {}, 167 | "outputs": [], 168 | "source": [ 169 | "def draw_quadratic_bezier_curve(p0, p1, p2):\n", 170 | " \"\"\"You shouldn't need to change anything in this function.\n", 171 | " \"\"\"\n", 172 | " ts = np.linspace(0, 1, 50)\n", 173 | " samples = compute_quadratic_bezier_curve(p0, p1, p2, ts)\n", 174 | " \n", 175 | " X = samples[:, 0]\n", 176 | " Y = samples[:, 1]\n", 177 | " plt.plot(X, Y)\n", 178 | " \n", 179 | " points = np.vstack([p0, p1, p2])\n", 180 | " plt.scatter(points[:, 0], points[:, 1], color='red')\n", 181 | " plt.plot(points[:, 0], points[:, 1], linestyle='--', color='red')\n", 182 | " \n", 183 | "\n", 184 | "def compute_quadratic_bezier_curve(p0, p1, p2, ts):\n", 185 | " raise NotImplementedError(\"IMPLEMENT ME!\")" 186 | ] 187 | }, 188 | { 189 | "cell_type": "code", 190 | "execution_count": null, 191 | "metadata": {}, 192 | "outputs": [], 193 | "source": [ 194 | "p0 = np.array([0, 0])\n", 195 | "p1 = np.array([1, 1])\n", 196 | "p2 = np.array([2, -2])\n", 197 | "draw_quadratic_bezier_curve(p0, p1, p2)" 198 | ] 199 | }, 200 | { 201 | "cell_type": "markdown", 202 | "metadata": {}, 203 | "source": [ 204 | "## Exercise: Cubic Bezier Curve\n", 205 | "\n", 206 | "A third-order (a.k.a. cubic) Bezier Curve has four control points, and has the following formula:\n", 207 | "\n", 208 | "$b(t) = (1 - t)^3p_0 + 3(1 - t)^2tp_1 + 3(1 - t)t^2p_2 + t^3p_3$\n", 209 | "\n", 210 | "Implement an evaluator for a cubic bezier curve following the same pattern as above."
211 | ] 212 | }, 213 | { 214 | "cell_type": "code", 215 | "execution_count": null, 216 | "metadata": {}, 217 | "outputs": [], 218 | "source": [ 219 | "def draw_cubic_bezier_curve(p0, p1, p2, p3):\n", 220 | " \"\"\"You shouldn't need to change anything in this function.\n", 221 | " \"\"\"\n", 222 | " ts = np.linspace(0, 1, 50)\n", 223 | " samples = compute_cubic_bezier_curve(p0, p1, p2, p3, ts)\n", 224 | " X = samples[:, 0]\n", 225 | " Y = samples[:, 1]\n", 226 | " plt.plot(X, Y)\n", 227 | " \n", 228 | " points = np.vstack([p0, p1, p2, p3])\n", 229 | " plt.scatter(points[:, 0], points[:, 1], color='red')\n", 230 | " plt.plot(points[:, 0], points[:, 1], linestyle='--', color='red')\n", 231 | " \n", 232 | "\n", 233 | "def compute_cubic_bezier_curve(p0, p1, p2, p3, ts):\n", 234 | " raise NotImplementedError()" 235 | ] 236 | }, 237 | { 238 | "cell_type": "code", 239 | "execution_count": null, 240 | "metadata": {}, 241 | "outputs": [], 242 | "source": [ 243 | "p0 = np.array([0, 0])\n", 244 | "p1 = np.array([1, 1])\n", 245 | "p2 = np.array([2, -2])\n", 246 | "p3 = np.array([3, 3])\n", 247 | "draw_cubic_bezier_curve(p0, p1, p2, p3)" 248 | ] 249 | }, 250 | { 251 | "cell_type": "markdown", 252 | "metadata": {}, 253 | "source": [ 254 | "## Exercise: Generalized Bezier Curve\n", 255 | "\n", 256 | "You may have started to notice a pattern in the coefficients of each control point's contribution to the curve. We're getting [Binomial Coefficients](https://en.wikipedia.org/wiki/Binomial_coefficient)!\n", 257 | "\n", 258 | "The general formula for an order-$n$ Bezier curve, which has $n + 1$ control points $p_0, \\ldots, p_n$, is:\n", 259 | "\n", 260 | "$b(t) = \\sum_{i=0}^n \\binom{n}{i}(1 - t)^{n - i}t^ip_i$\n", 261 | "\n", 262 | "where $\\binom{n}{i}$ is the binomial coefficient of $n$ and $i$.\n", 263 | "\n", 264 | "Implement a function that computes samples from a generalized bezier curve.
It should take a 2d array of shape (npoints x 2) and a 1d array of samples, and it should return a (len(t) x 2) array of evaluated samples. \n", 265 | "\n", 266 | "**Hint:** You can use `scipy.special.comb` to evaluate binomial coefficients.\n", 267 | "\n", 268 | "**NOTE:** This exercise is hard. If you get stuck here, there's a solutions notebook next door." 269 | ] 270 | }, 271 | { 272 | "cell_type": "code", 273 | "execution_count": null, 274 | "metadata": {}, 275 | "outputs": [], 276 | "source": [ 277 | "from scipy.special import comb\n", 278 | "\n", 279 | "def draw_bezier_curve(points):\n", 280 | " ts = np.linspace(0, 1, 50)\n", 281 | " samples = compute_bezier_curve(points, ts)\n", 282 | " X = samples[:, 0]\n", 283 | " Y = samples[:, 1]\n", 284 | " plt.plot(X, Y)\n", 285 | " \n", 286 | " plt.scatter(points[:, 0], points[:, 1], color='red')\n", 287 | " plt.plot(points[:, 0], points[:, 1], linestyle='--', color='red')\n", 288 | " \n", 289 | "def compute_bezier_curve(points, t):\n", 290 | " raise NotImplementedError()" 291 | ] 292 | }, 293 | { 294 | "cell_type": "code", 295 | "execution_count": null, 296 | "metadata": { 297 | "scrolled": false 298 | }, 299 | "outputs": [], 300 | "source": [ 301 | "draw_bezier_curve(np.vstack([p0, p1, p2, p3]))" 302 | ] 303 | }, 304 | { 305 | "cell_type": "markdown", 306 | "metadata": {}, 307 | "source": [ 308 | "## Exercise: Estimate the Length of a Bezier Curve\n", 309 | "\n", 310 | "Write a function that estimates the length of a bezier curve by computing an array of sample points and summing the lengths of the differences between each successive pair of points."
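,
    "\n",
    "That is, for sample parameters $t_0 < t_1 < \\dots < t_{n-1}$, the estimate is the sum of the distances $\\lVert B(t_{i+1}) - B(t_i) \\rVert$ between consecutive sample points."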
311 | ] 312 | }, 313 | { 314 | "cell_type": "code", 315 | "execution_count": null, 316 | "metadata": {}, 317 | "outputs": [], 318 | "source": [ 319 | "def estimate_curve_length(points, nsamples):\n", 320 | " raise NotImplementedError()" 321 | ] 322 | }, 323 | { 324 | "cell_type": "code", 325 | "execution_count": null, 326 | "metadata": {}, 327 | "outputs": [], 328 | "source": [ 329 | "estimate_curve_length([p0, p1, p2, p3], 5)" 330 | ] 331 | }, 332 | { 333 | "cell_type": "code", 334 | "execution_count": null, 335 | "metadata": {}, 336 | "outputs": [], 337 | "source": [ 338 | "for i in range(3, 10):\n", 339 | " print(estimate_curve_length([p0, p1, p2, p3], i))" 340 | ] 341 | } 342 | ], 343 | "metadata": { 344 | "kernelspec": { 345 | "display_name": "Python 3", 346 | "language": "python", 347 | "name": "python3" 348 | }, 349 | "language_info": { 350 | "codemirror_mode": { 351 | "name": "ipython", 352 | "version": 3 353 | }, 354 | "file_extension": ".py", 355 | "mimetype": "text/x-python", 356 | "name": "python", 357 | "nbconvert_exporter": "python", 358 | "pygments_lexer": "ipython3", 359 | "version": "3.6.6" 360 | } 361 | }, 362 | "nbformat": 4, 363 | "nbformat_minor": 2 364 | } 365 | -------------------------------------------------------------------------------- /exercises/numpy/1-Finding Functions and Documentation in Jupyter.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Finding Functions with NumPy and Jupyter" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "NumPy is a large complex library with hundreds of useful functions. 
Often the hardest part of solving a problem with NumPy is simply finding the right function to use, or figuring out whether a function you've found can solve your problem.\n", 15 | "\n", 16 | "The official NumPy and SciPy docs are excellent resources:\n", 17 | "\n", 18 | "- The official NumPy documentation: [https://docs.scipy.org/doc/numpy-1.13.0/reference/](https://docs.scipy.org/doc/numpy-1.13.0/reference/)\n", 19 | "- The official SciPy documentation: [https://docs.scipy.org/doc/scipy/reference/](https://docs.scipy.org/doc/scipy/reference/)\n", 20 | "\n", 21 | "There are also a few tricks you can use to learn about NumPy functions without having to leave the notebook:" 22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "execution_count": null, 27 | "metadata": {}, 28 | "outputs": [], 29 | "source": [ 30 | "import numpy as np" 31 | ] 32 | }, 33 | { 34 | "cell_type": "markdown", 35 | "metadata": {}, 36 | "source": [ 37 | "Suppose, for example, that we want to find the [eigenvalues](https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors) of a matrix. The first thing we might try is to use Jupyter's tab-completion feature to see if there's a top-level function with a name like \"eigenvalues\":" 38 | ] 39 | }, 40 | { 41 | "cell_type": "code", 42 | "execution_count": null, 43 | "metadata": {}, 44 | "outputs": [], 45 | "source": [ 46 | "# Place your cursor immediately after the `e` and hit TAB to see the top-level numpy functions\n", 47 | "# that start with the letter \"e\"\n", 48 | "np.e" 49 | ] 50 | }, 51 | { 52 | "cell_type": "markdown", 53 | "metadata": {}, 54 | "source": [ 55 | "Unfortunately for us, none of the functions that appear look like they're related to eigenvalues.
The next thing we can try is to use [`np.lookfor`](https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.lookfor.html) to search for \"eigenvalue\" by keyword:" 56 | ] 57 | }, 58 | { 59 | "cell_type": "code", 60 | "execution_count": null, 61 | "metadata": {}, 62 | "outputs": [], 63 | "source": [ 64 | "np.lookfor(\"eigenvalue\")" 65 | ] 66 | }, 67 | { 68 | "cell_type": "markdown", 69 | "metadata": {}, 70 | "source": [ 71 | "That looks more promising, but now we have to figure out which function to use. For that, we can use Jupyter's `?` operator. Running a cell containing `function_name?` will bring up a window containing documentation about the function.\n", 72 | "\n", 73 | "Execute the following cell and read the documentation for `eigvals`. In particular, notice that if you scroll down, there are **\"Examples\"** and **\"See Also\"** sections. These sections are often as useful as, or more useful than, the description of what the function does." 74 | ] 75 | }, 76 | { 77 | "cell_type": "code", 78 | "execution_count": null, 79 | "metadata": {}, 80 | "outputs": [], 81 | "source": [ 82 | "np.linalg.eigvals?" 83 | ] 84 | }, 85 | { 86 | "cell_type": "markdown", 87 | "metadata": {}, 88 | "source": [ 89 | "For cases where we just want to check a docstring quickly, it can be more ergonomic to bring up documentation in-line using Shift+Tab:\n", 90 | "\n", 91 | "Place your cursor after \"eigvals\", hold Shift, and then press Tab. You should see the signature and the first line of the function's documentation. You can see the rest of the documentation by pressing Tab again without letting go of Shift."
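,
    "\n",
    "If you're working outside of Jupyter, where `?` and Shift+Tab aren't available, `np.info` prints similar documentation as plain text:\n",
    "\n",
    "```python\n",
    "np.info(np.linalg.eigvals)\n",
    "```"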
92 | ] 93 | }, 94 | { 95 | "cell_type": "code", 96 | "execution_count": null, 97 | "metadata": {}, 98 | "outputs": [], 99 | "source": [ 100 | "np.linalg.eigvals" 101 | ] 102 | }, 103 | { 104 | "cell_type": "markdown", 105 | "metadata": {}, 106 | "source": [ 107 | "## Exercise: Finding Functions\n", 108 | "\n", 109 | "Use `np.lookfor` and `?` to find functions that do the following:\n", 110 | "\n", 111 | "- Compute the largest value in an array.\n", 112 | "- Compute the smallest value in an array.\n", 113 | "- Compute the value at the 90th percentile of an array.\n", 114 | "- Sort an array." 115 | ] 116 | }, 117 | { 118 | "cell_type": "code", 119 | "execution_count": null, 120 | "metadata": {}, 121 | "outputs": [], 122 | "source": [] 123 | }, 124 | { 125 | "cell_type": "markdown", 126 | "metadata": {}, 127 | "source": [ 128 | "## Exercise: Finding Functions (continued)\n", 129 | "\n", 130 | "Continue using `np.lookfor` and `?` to find functions that do the following:" 131 | ] 132 | }, 133 | { 134 | "cell_type": "markdown", 135 | "metadata": {}, 136 | "source": [ 137 | "- Find a value in a sorted array.\n", 138 | "- Compute the [natural logarithm](https://en.wikipedia.org/wiki/Natural_logarithm) of each element in an array.\n", 139 | "- Compute the [Correlation Coefficient](https://en.wikipedia.org/wiki/Correlation_coefficient) between two arrays.\n", 140 | "- Fit coefficients of a polynomial function.\n", 141 | "- Compute a [Covariance Matrix](https://en.wikipedia.org/wiki/Covariance_matrix)." 
142 | ] 143 | }, 144 | { 145 | "cell_type": "code", 146 | "execution_count": null, 147 | "metadata": {}, 148 | "outputs": [], 149 | "source": [] 150 | } 151 | ], 152 | "metadata": { 153 | "kernelspec": { 154 | "display_name": "Python 3", 155 | "language": "python", 156 | "name": "python3" 157 | }, 158 | "language_info": { 159 | "codemirror_mode": { 160 | "name": "ipython", 161 | "version": 3 162 | }, 163 | "file_extension": ".py", 164 | "mimetype": "text/x-python", 165 | "name": "python", 166 | "nbconvert_exporter": "python", 167 | "pygments_lexer": "ipython3", 168 | "version": "3.6.6" 169 | } 170 | }, 171 | "nbformat": 4, 172 | "nbformat_minor": 2 173 | } 174 | -------------------------------------------------------------------------------- /exercises/numpy/3-Universal Functions.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Universal Functions\n", 8 | "\n", 9 | "The exercises in this notebook will teach you how to use universal functions (ufuncs) to apply vectorized operations over arrays." 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": null, 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "import numpy as np" 19 | ] 20 | }, 21 | { 22 | "cell_type": "markdown", 23 | "metadata": {}, 24 | "source": [ 25 | "## UFunc Basics: Unary Functions\n", 26 | "\n", 27 | "Numpy provides many functions that can be applied over an entire array as a single operation." 28 | ] 29 | }, 30 | { 31 | "cell_type": "code", 32 | "execution_count": null, 33 | "metadata": {}, 34 | "outputs": [], 35 | "source": [ 36 | "data = np.linspace(1, 3, 15)\n", 37 | "data" 38 | ] 39 | }, 40 | { 41 | "cell_type": "markdown", 42 | "metadata": {}, 43 | "source": [ 44 | "**Exercise:** Use `np.sqrt` to compute the square-root of every element in `data`.\n", 45 | "\n", 46 | "(**Hint:** You should only need to call `sqrt` once)." 
47 | ] 48 | }, 49 | { 50 | "cell_type": "code", 51 | "execution_count": null, 52 | "metadata": {}, 53 | "outputs": [], 54 | "source": [ 55 | "np.sqrt?" 56 | ] 57 | }, 58 | { 59 | "cell_type": "markdown", 60 | "metadata": {}, 61 | "source": [ 62 | "**Exercise:** Use `np.exp` to compute $e^x$ for each element of `data`." 63 | ] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "execution_count": null, 68 | "metadata": {}, 69 | "outputs": [], 70 | "source": [ 71 | "np.exp?" 72 | ] 73 | }, 74 | { 75 | "cell_type": "markdown", 76 | "metadata": {}, 77 | "source": [ 78 | "**Exercise:** Use `np.log` to compute the [natural logarithm](https://en.wikipedia.org/wiki/Natural_logarithm) of every value in `data`." 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": null, 84 | "metadata": {}, 85 | "outputs": [], 86 | "source": [ 87 | "np.log?" 88 | ] 89 | }, 90 | { 91 | "cell_type": "markdown", 92 | "metadata": {}, 93 | "source": [ 94 | "## UFunc Basics: Binary Operators\n", 95 | "\n", 96 | "Most of Python's binary operators can also be used as UFuncs. \n", 97 | "\n", 98 | "Binary operators work on like-shaped (array, array) pairs, as well as (array, scalar) pairs." 99 | ] 100 | }, 101 | { 102 | "cell_type": "code", 103 | "execution_count": null, 104 | "metadata": {}, 105 | "outputs": [], 106 | "source": [ 107 | "x = np.linspace(0, 10, 11)\n", 108 | "print(\"x:\", x)\n", 109 | "y = np.linspace(0, 1, 11)\n", 110 | "print(\"y:\", y)" 111 | ] 112 | }, 113 | { 114 | "cell_type": "markdown", 115 | "metadata": {}, 116 | "source": [ 117 | "### Combining Arrays and Scalars" 118 | ] 119 | }, 120 | { 121 | "cell_type": "markdown", 122 | "metadata": {}, 123 | "source": [ 124 | "**Exercise:** Compute an array containing each value from `x` incremented by 1."
125 | ] 126 | }, 127 | { 128 | "cell_type": "code", 129 | "execution_count": null, 130 | "metadata": {}, 131 | "outputs": [], 132 | "source": [] 133 | }, 134 | { 135 | "cell_type": "markdown", 136 | "metadata": {}, 137 | "source": [ 138 | "**Exercise:** Compute an array containing each value from `y` multiplied by 2." 139 | ] 140 | }, 141 | { 142 | "cell_type": "code", 143 | "execution_count": null, 144 | "metadata": {}, 145 | "outputs": [], 146 | "source": [] 147 | }, 148 | { 149 | "cell_type": "markdown", 150 | "metadata": {}, 151 | "source": [ 152 | "**Exercise:** Compute an array containing each value from ``x`` squared." 153 | ] 154 | }, 155 | { 156 | "cell_type": "code", 157 | "execution_count": null, 158 | "metadata": {}, 159 | "outputs": [], 160 | "source": [] 161 | }, 162 | { 163 | "cell_type": "markdown", 164 | "metadata": {}, 165 | "source": [ 166 | "### Combining Arrays with Arrays" 167 | ] 168 | }, 169 | { 170 | "cell_type": "markdown", 171 | "metadata": {}, 172 | "source": [ 173 | "**Exercise:** Compute an array containing the element-wise sum of values drawn from `x` and `y`. \n", 174 | "\n", 175 | "(For example, the element-wise sum of `[1, 2, 3]` and `[2, 3, 4]` would be `[3, 5, 7]`.)" 176 | ] 177 | }, 178 | { 179 | "cell_type": "code", 180 | "execution_count": null, 181 | "metadata": {}, 182 | "outputs": [], 183 | "source": [] 184 | }, 185 | { 186 | "cell_type": "markdown", 187 | "metadata": {}, 188 | "source": [ 189 | "**Exercise:** Compute an array containing the element-wise product of values drawn from `x` and `y`." 190 | ] 191 | }, 192 | { 193 | "cell_type": "code", 194 | "execution_count": null, 195 | "metadata": {}, 196 | "outputs": [], 197 | "source": [] 198 | }, 199 | { 200 | "cell_type": "markdown", 201 | "metadata": {}, 202 | "source": [ 203 | "**Exercise:** Compute an array containing each element in `x` raised to the power of the corresponding element in `y`." 
204 | ] 205 | }, 206 | { 207 | "cell_type": "code", 208 | "execution_count": null, 209 | "metadata": {}, 210 | "outputs": [], 211 | "source": [] 212 | }, 213 | { 214 | "cell_type": "markdown", 215 | "metadata": {}, 216 | "source": [ 217 | "## Exercise: Plotting Functions with Numpy and Matplotlib\n", 218 | "\n", 219 | "NumPy can be used as a powerful graphing calculator when combined with a plotting library like [matplotlib](https://matplotlib.org/index.html)." 220 | ] 221 | }, 222 | { 223 | "cell_type": "code", 224 | "execution_count": null, 225 | "metadata": {}, 226 | "outputs": [], 227 | "source": [ 228 | "x = np.linspace(-np.pi, np.pi, 500)" 229 | ] 230 | }, 231 | { 232 | "cell_type": "markdown", 233 | "metadata": {}, 234 | "source": [ 235 | "You can plot arrays of X and Y values by passing them to `matplotlib.pyplot.plot`." 236 | ] 237 | }, 238 | { 239 | "cell_type": "code", 240 | "execution_count": null, 241 | "metadata": {}, 242 | "outputs": [], 243 | "source": [ 244 | "import matplotlib.pyplot as plt\n", 245 | "\n", 246 | "# Tell matplotlib to output images to our notebook. \n", 247 | "# Without this line, matplotlib would construct a figure in memory, but wouldn't show it to us.\n", 248 | "%matplotlib inline" 249 | ] 250 | }, 251 | { 252 | "cell_type": "code", 253 | "execution_count": null, 254 | "metadata": {}, 255 | "outputs": [], 256 | "source": [ 257 | "# Tell matplotlib to use a larger default figure size.\n", 258 | "plt.rc('figure', figsize=(12, 7))" 259 | ] 260 | }, 261 | { 262 | "cell_type": "markdown", 263 | "metadata": {}, 264 | "source": [ 265 | "**Example:** Plot the graph of $y = x^2$."
266 | ] 267 | }, 268 | { 269 | "cell_type": "code", 270 | "execution_count": null, 271 | "metadata": {}, 272 | "outputs": [], 273 | "source": [ 274 | "plt.plot(x, x ** 2);" 275 | ] 276 | }, 277 | { 278 | "cell_type": "markdown", 279 | "metadata": {}, 280 | "source": [ 281 | "**Exercise:** Use numpy and matplotlib to plot graphs of the following functions:\n", 282 | "\n", 283 | "- $y = x^2 + 2x + 1$\n", 284 | "- $y = \\sqrt{|x|}$\n", 285 | "- $y = \\sin{\\frac{1}{x}}$" 286 | ] 287 | }, 288 | { 289 | "cell_type": "code", 290 | "execution_count": null, 291 | "metadata": {}, 292 | "outputs": [], 293 | "source": [] 294 | }, 295 | { 296 | "cell_type": "code", 297 | "execution_count": null, 298 | "metadata": {}, 299 | "outputs": [], 300 | "source": [] 301 | }, 302 | { 303 | "cell_type": "code", 304 | "execution_count": null, 305 | "metadata": {}, 306 | "outputs": [], 307 | "source": [] 308 | }, 309 | { 310 | "cell_type": "markdown", 311 | "metadata": {}, 312 | "source": [ 313 | "Numpy and matplotlib can also be used to plot functions of multiple arguments.\n", 314 | "\n", 315 | "The easiest way to plot a function of two arguments is to use `np.meshgrid` to generate grids of `x` and `y` coordinates, and to use `matplotlib.pyplot.imshow` to draw a density plot." 316 | ] 317 | }, 318 | { 319 | "cell_type": "markdown", 320 | "metadata": {}, 321 | "source": [ 322 | "**Example:** Draw a density plot of $z = \\sin{x} + \\cos{y}$." 
323 | ] 324 | }, 325 | { 326 | "cell_type": "code", 327 | "execution_count": null, 328 | "metadata": {}, 329 | "outputs": [], 330 | "source": [ 331 | "# np.meshgrid takes arrays of X and Y values, and returns a pair of 2D arrays \n", 332 | "# containing all pairs of (X, Y) coordinates from the input arrays.\n", 333 | "xvals = np.linspace(-5, 5, 9)\n", 334 | "yvals = np.linspace(-5, 5, 9)\n", 335 | "print(\"X Values:\", xvals)\n", 336 | "print(\"Y Values:\", yvals)\n", 337 | "\n", 338 | "xcoords, ycoords = np.meshgrid(xvals, yvals)\n", 339 | "print(\"X Coordinates:\\n\", xcoords)\n", 340 | "print(\"Y Coordinates:\\n\", ycoords)" 341 | ] 342 | }, 343 | { 344 | "cell_type": "code", 345 | "execution_count": null, 346 | "metadata": {}, 347 | "outputs": [], 348 | "source": [ 349 | "plt.imshow(np.sin(xcoords) + np.cos(ycoords), extent=[xvals[0], xvals[-1], yvals[0], yvals[-1]]);" 350 | ] 351 | }, 352 | { 353 | "cell_type": "markdown", 354 | "metadata": {}, 355 | "source": [ 356 | "The plot looks better if we use a few more samples:" 357 | ] 358 | }, 359 | { 360 | "cell_type": "code", 361 | "execution_count": null, 362 | "metadata": {}, 363 | "outputs": [], 364 | "source": [ 365 | "X, Y = np.meshgrid(np.linspace(-5, 5, 100), np.linspace(-5, 5, 100))\n", 366 | "plt.imshow(np.sin(X) + np.cos(Y), extent=[-5, 5, -5, 5]);" 367 | ] 368 | }, 369 | { 370 | "cell_type": "markdown", 371 | "metadata": {}, 372 | "source": [ 373 | "**Exercise:** Generate a density plot of the function $z = \\sqrt{x^2 + y^2}$."
374 | ] 375 | }, 376 | { 377 | "cell_type": "code", 378 | "execution_count": null, 379 | "metadata": {}, 380 | "outputs": [], 381 | "source": [] 382 | }, 383 | { 384 | "cell_type": "markdown", 385 | "metadata": {}, 386 | "source": [ 387 | "**Exercise:** Generate a density plot of the function $z = e^x - e^y$" 388 | ] 389 | }, 390 | { 391 | "cell_type": "code", 392 | "execution_count": null, 393 | "metadata": {}, 394 | "outputs": [], 395 | "source": [] 396 | } 397 | ], 398 | "metadata": { 399 | "kernelspec": { 400 | "display_name": "Python 3", 401 | "language": "python", 402 | "name": "python3" 403 | }, 404 | "language_info": { 405 | "codemirror_mode": { 406 | "name": "ipython", 407 | "version": 3 408 | }, 409 | "file_extension": ".py", 410 | "mimetype": "text/x-python", 411 | "name": "python", 412 | "nbconvert_exporter": "python", 413 | "pygments_lexer": "ipython3", 414 | "version": "3.6.6" 415 | } 416 | }, 417 | "nbformat": 4, 418 | "nbformat_minor": 2 419 | } 420 | -------------------------------------------------------------------------------- /exercises/numpy/4-Selections.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Selections\n", 8 | "\n", 9 | "Often when we're working with numpy we're only interested in a portion of the data in our arrays. The `[]` operator on `ndarray` allows us to select portions of the data in the array in a variety of interesting ways.\n", 10 | "\n", 11 | "The exercises in this notebook will teach you how to select elements out of arrays in a variety of ways." 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": null, 17 | "metadata": {}, 18 | "outputs": [], 19 | "source": [ 20 | "import numpy as np\n", 21 | "\n", 22 | "rand = np.random.RandomState(42) # Use a deterministic seed."
23 | ] 24 | }, 25 | { 26 | "cell_type": "markdown", 27 | "metadata": {}, 28 | "source": [ 29 | "## Exercise: 1-dimensional selection\n", 30 | "\n", 31 | "Write expressions to select the following elements from the array:\n", 32 | "\n", 33 | "1. first element\n", 34 | "1. second element\n", 35 | "1. last element\n", 36 | "1. second to last element\n", 37 | "1. first 5 elements\n", 38 | "1. last 5 elements\n", 39 | "1. elements at indices 1, 4, 7, and 13\n", 40 | "1. elements with even indices\n", 41 | "1. the entire array, in reverse order\n", 42 | "1. every other element, starting at index 3 (inclusive) ending at index 17 (exclusive)" 43 | ] 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": null, 48 | "metadata": {}, 49 | "outputs": [], 50 | "source": [ 51 | "array = np.arange(20)" 52 | ] 53 | }, 54 | { 55 | "cell_type": "code", 56 | "execution_count": null, 57 | "metadata": {}, 58 | "outputs": [], 59 | "source": [ 60 | "array[FILL_ME_IN]" 61 | ] 62 | }, 63 | { 64 | "cell_type": "markdown", 65 | "metadata": {}, 66 | "source": [ 67 | "## Exercise: 2-dimensional selection\n", 68 | "\n", 69 | "Write expressions to select the following elements from the array.\n", 70 | "\n", 71 | "1. scalar value at coordinates `[3, 6]`\n", 72 | "1. top-left scalar value\n", 73 | "1. first row\n", 74 | "1. first column\n", 75 | "1. second column\n", 76 | "1. last column\n", 77 | "1. first 5 columns\n", 78 | "1. last 5 columns\n", 79 | "1. top-left 2 x 2 square\n", 80 | "1. top-right 2 x 2 square\n", 81 | "1. 
last 5 rows from every other column" 82 | ] 83 | }, 84 | { 85 | "cell_type": "code", 86 | "execution_count": null, 87 | "metadata": {}, 88 | "outputs": [], 89 | "source": [ 90 | "array = np.arange(20 * 20).reshape(20, 20)" 91 | ] 92 | }, 93 | { 94 | "cell_type": "code", 95 | "execution_count": null, 96 | "metadata": {}, 97 | "outputs": [], 98 | "source": [ 99 | "array[FILL_ME_IN]" 100 | ] 101 | }, 102 | { 103 | "cell_type": "markdown", 104 | "metadata": {}, 105 | "source": [ 106 | "## Exercise: N-dimensional selection" 107 | ] 108 | }, 109 | { 110 | "cell_type": "markdown", 111 | "metadata": {}, 112 | "source": [ 113 | "## Exercise: Selections with boolean arrays." 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": null, 119 | "metadata": {}, 120 | "outputs": [], 121 | "source": [ 122 | "array = rand.normal(0, 1, 50)" 123 | ] 124 | }, 125 | { 126 | "cell_type": "markdown", 127 | "metadata": {}, 128 | "source": [ 129 | "Write an expression to select the positive values from the array." 130 | ] 131 | }, 132 | { 133 | "cell_type": "code", 134 | "execution_count": null, 135 | "metadata": {}, 136 | "outputs": [], 137 | "source": [] 138 | }, 139 | { 140 | "cell_type": "markdown", 141 | "metadata": {}, 142 | "source": [ 143 | "Write an expression to select the values less than -1 **or** greater than 1.5." 144 | ] 145 | }, 146 | { 147 | "cell_type": "code", 148 | "execution_count": null, 149 | "metadata": {}, 150 | "outputs": [], 151 | "source": [ 152 | "array = rand.normal(0, 1, 50)" 153 | ] 154 | }, 155 | { 156 | "cell_type": "markdown", 157 | "metadata": {}, 158 | "source": [ 159 | "Write an expression that produces the value from `array` if the value is positive, and produces the **square** of the value if it's negative." 
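,
    "\n",
    "(**Hint:** `np.where(condition, a, b)` selects elementwise from `a` where `condition` is true, and from `b` elsewhere. A tiny illustration, unrelated to this exercise's data:)\n",
    "\n",
    "```python\n",
    "np.where(np.array([True, False, True]), [1, 2, 3], [10, 20, 30])  # picks [1, 20, 3] elementwise\n",
    "```"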
160 | ] 161 | }, 162 | { 163 | "cell_type": "code", 164 | "execution_count": null, 165 | "metadata": {}, 166 | "outputs": [], 167 | "source": [ 168 | "array = np.arange(-5, 5)" 169 | ] 170 | }, 171 | { 172 | "cell_type": "code", 173 | "execution_count": null, 174 | "metadata": {}, 175 | "outputs": [], 176 | "source": [ 177 | "np.where?" 178 | ] 179 | }, 180 | { 181 | "cell_type": "markdown", 182 | "metadata": {}, 183 | "source": [ 184 | "## Exercise: \"FizzBuzz\"\n", 185 | "\n", 186 | "Write an expression that converts `array` into a new array of the same shape according to the following rules:\n", 187 | "\n", 188 | "At each index `[i]`\n", 189 | "\n", 190 | "- if `array[i]` is divisible by 3: `result[i]` should hold -1\n", 191 | "- if `array[i]` is divisible by 5: `result[i]` should hold -2\n", 192 | "- if `array[i]` is divisible by 15, `result[i]` should hold -3\n", 193 | "- otherwise: `result[i]` should hold `array[i]`\n", 194 | "\n", 195 | "(**Hint:** `np.select` works like `np.where`, but it can select from more than two arrays. Multiples of 15 are also multiples of 3 and 5, so think about the order in which your conditions are checked.)" 196 | ] 197 | }, 198 | { 199 | "cell_type": "code", 200 | "execution_count": null, 201 | "metadata": {}, 202 | "outputs": [], 203 | "source": [ 204 | "array = np.arange(1, 100)\n", 205 | "np.select?" 206 | ] 207 | }, 208 | { 209 | "cell_type": "markdown", 210 | "metadata": {}, 211 | "source": [ 212 | "## Exercise: N-dimensional FizzBuzz\n", 213 | "\n", 214 | "Same rules as above, but on a 3-dimensional array.
(HINT: It's possible to write a solution that works for this exercise and the previous one.)" 215 | ] 216 | }, 217 | { 218 | "cell_type": "code", 219 | "execution_count": null, 220 | "metadata": {}, 221 | "outputs": [], 222 | "source": [ 223 | "array = np.arange(1, 100).reshape(3, 11, 3)" 224 | ] 225 | } 226 | ], 227 | "metadata": { 228 | "kernelspec": { 229 | "display_name": "Python 3", 230 | "language": "python", 231 | "name": "python3" 232 | }, 233 | "language_info": { 234 | "codemirror_mode": { 235 | "name": "ipython", 236 | "version": 3 237 | }, 238 | "file_extension": ".py", 239 | "mimetype": "text/x-python", 240 | "name": "python", 241 | "nbconvert_exporter": "python", 242 | "pygments_lexer": "ipython3", 243 | "version": "3.6.6" 244 | } 245 | }, 246 | "nbformat": 4, 247 | "nbformat_minor": 2 248 | } 249 | -------------------------------------------------------------------------------- /exercises/numpy/5-Reductions.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Reductions\n", 8 | "\n", 9 | "Reductions allow us to compute summary statistics and other useful aggregations over our data.\n", 10 | "\n", 11 | "The dataset for these exercises contains minutely price and volume observations for 5 stocks for October of 2017." 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": 1, 17 | "metadata": {}, 18 | "outputs": [ 19 | { 20 | "data": { 21 | "text/html": [ 22 | "
\n", 23 | "\n", 36 | "\n", 37 | " \n", 38 | " \n", 39 | " \n", 40 | " \n", 41 | " \n", 42 | " \n", 43 | " \n", 44 | " \n", 45 | " \n", 46 | " \n", 47 | " \n", 48 | " \n", 49 | " \n", 50 | " \n", 51 | " \n", 52 | " \n", 53 | " \n", 54 | " \n", 55 | " \n", 56 | " \n", 57 | " \n", 58 | " \n", 59 | " \n", 60 | " \n", 61 | " \n", 62 | " \n", 63 | " \n", 64 | " \n", 65 | " \n", 66 | " \n", 67 | " \n", 68 | " \n", 69 | " \n", 70 | " \n", 71 | " \n", 72 | " \n", 73 | " \n", 74 | " \n", 75 | " \n", 76 | " \n", 77 | " \n", 78 | " \n", 79 | " \n", 80 | " \n", 81 | " \n", 82 | " \n", 83 | " \n", 84 | " \n", 85 | " \n", 86 | " \n", 87 | " \n", 88 | " \n", 89 | " \n", 90 | " \n", 91 | " \n", 92 | " \n", 93 | " \n", 94 | " \n", 95 | " \n", 96 | " \n", 97 | "
AAPLMSFTTSLAMCDBK
dt
2017-10-02 13:31:00154.3474.880342.330156.38052.736
2017-10-02 13:32:00154.0774.832341.480156.66052.686
2017-10-02 13:33:00153.7274.835341.830156.32452.756
2017-10-02 13:34:00153.6974.890341.240156.66052.726
2017-10-02 13:35:00153.4574.810341.873156.67052.706
\n", 98 | "
" 99 | ], 100 | "text/plain": [ 101 | " AAPL MSFT TSLA MCD BK\n", 102 | "dt \n", 103 | "2017-10-02 13:31:00 154.34 74.880 342.330 156.380 52.736\n", 104 | "2017-10-02 13:32:00 154.07 74.832 341.480 156.660 52.686\n", 105 | "2017-10-02 13:33:00 153.72 74.835 341.830 156.324 52.756\n", 106 | "2017-10-02 13:34:00 153.69 74.890 341.240 156.660 52.726\n", 107 | "2017-10-02 13:35:00 153.45 74.810 341.873 156.670 52.706" 108 | ] 109 | }, 110 | "metadata": {}, 111 | "output_type": "display_data" 112 | }, 113 | { 114 | "data": { 115 | "text/html": [ 116 | "
\n", 117 | "\n", 130 | "\n", 131 | " \n", 132 | " \n", 133 | " \n", 134 | " \n", 135 | " \n", 136 | " \n", 137 | " \n", 138 | " \n", 139 | " \n", 140 | " \n", 141 | " \n", 142 | " \n", 143 | " \n", 144 | " \n", 145 | " \n", 146 | " \n", 147 | " \n", 148 | " \n", 149 | " \n", 150 | " \n", 151 | " \n", 152 | " \n", 153 | " \n", 154 | " \n", 155 | " \n", 156 | " \n", 157 | " \n", 158 | " \n", 159 | " \n", 160 | " \n", 161 | " \n", 162 | " \n", 163 | " \n", 164 | " \n", 165 | " \n", 166 | " \n", 167 | " \n", 168 | " \n", 169 | " \n", 170 | " \n", 171 | " \n", 172 | " \n", 173 | " \n", 174 | " \n", 175 | " \n", 176 | " \n", 177 | " \n", 178 | " \n", 179 | " \n", 180 | " \n", 181 | " \n", 182 | " \n", 183 | " \n", 184 | " \n", 185 | " \n", 186 | " \n", 187 | " \n", 188 | " \n", 189 | " \n", 190 | " \n", 191 | "
AAPLMSFTTSLAMCDBK
dt
2017-10-02 13:31:00420042.0409211.049907.085774.030276.0
2017-10-02 13:32:00161960.049207.018480.06866.04511.0
2017-10-02 13:33:00118283.024043.047039.03000.03001.0
2017-10-02 13:34:00103544.062383.013444.04364.0900.0
2017-10-02 13:35:0088012.040175.036556.0820.01500.0
\n", 192 | "
" 193 | ], 194 | "text/plain": [ 195 | " AAPL MSFT TSLA MCD BK\n", 196 | "dt \n", 197 | "2017-10-02 13:31:00 420042.0 409211.0 49907.0 85774.0 30276.0\n", 198 | "2017-10-02 13:32:00 161960.0 49207.0 18480.0 6866.0 4511.0\n", 199 | "2017-10-02 13:33:00 118283.0 24043.0 47039.0 3000.0 3001.0\n", 200 | "2017-10-02 13:34:00 103544.0 62383.0 13444.0 4364.0 900.0\n", 201 | "2017-10-02 13:35:00 88012.0 40175.0 36556.0 820.0 1500.0" 202 | ] 203 | }, 204 | "metadata": {}, 205 | "output_type": "display_data" 206 | } 207 | ], 208 | "source": [ 209 | "import numpy as np\n", 210 | "import pandas as pd\n", 211 | "from IPython.display import display\n", 212 | "import matplotlib.pyplot as plt\n", 213 | "\n", 214 | "prices_df = pd.read_csv('prices.csv', index_col='dt', parse_dates=['dt'])\n", 215 | "volumes_df = pd.read_csv('volumes.csv', index_col='dt', parse_dates=['dt'])\n", 216 | "\n", 217 | "display(prices_df.head())\n", 218 | "display(volumes_df.head())" 219 | ] 220 | }, 221 | { 222 | "cell_type": "code", 223 | "execution_count": null, 224 | "metadata": {}, 225 | "outputs": [], 226 | "source": [ 227 | "prices = prices_df.values\n", 228 | "volumes = volumes_df.values\n", 229 | "timestamps = prices_df.index.values" 230 | ] 231 | }, 232 | { 233 | "cell_type": "code", 234 | "execution_count": null, 235 | "metadata": {}, 236 | "outputs": [], 237 | "source": [ 238 | "timestamps" 239 | ] 240 | }, 241 | { 242 | "cell_type": "markdown", 243 | "metadata": {}, 244 | "source": [ 245 | "**Exercise:** Compute the average price for each stock." 246 | ] 247 | }, 248 | { 249 | "cell_type": "code", 250 | "execution_count": null, 251 | "metadata": {}, 252 | "outputs": [], 253 | "source": [] 254 | }, 255 | { 256 | "cell_type": "markdown", 257 | "metadata": {}, 258 | "source": [ 259 | "**Exercise:** Compute the average volume for each stock." 
260 | ] 261 | }, 262 | { 263 | "cell_type": "code", 264 | "execution_count": null, 265 | "metadata": {}, 266 | "outputs": [], 267 | "source": [] 268 | }, 269 | { 270 | "cell_type": "markdown", 271 | "metadata": {}, 272 | "source": [ 273 | "**Exercise:** Compute the number of times that each stock's price increased." 274 | ] 275 | }, 276 | { 277 | "cell_type": "code", 278 | "execution_count": null, 279 | "metadata": {}, 280 | "outputs": [], 281 | "source": [ 282 | "np.diff?" 283 | ] 284 | }, 285 | { 286 | "cell_type": "markdown", 287 | "metadata": {}, 288 | "source": [ 289 | "**Exercise:** Compute the volume-weighted average price of all 5 stocks." 290 | ] 291 | }, 292 | { 293 | "cell_type": "code", 294 | "execution_count": null, 295 | "metadata": {}, 296 | "outputs": [], 297 | "source": [ 298 | "np.average?" 299 | ] 300 | }, 301 | { 302 | "cell_type": "markdown", 303 | "metadata": {}, 304 | "source": [ 305 | "**Exercise:** Compute the timestamps where the lowest price occurred for each stock." 306 | ] 307 | }, 308 | { 309 | "cell_type": "code", 310 | "execution_count": null, 311 | "metadata": {}, 312 | "outputs": [], 313 | "source": [ 314 | "np.argmin?" 
315 | ] 316 | }, 317 | { 318 | "cell_type": "markdown", 319 | "metadata": {}, 320 | "source": [ 321 | "**Exercise:** Compute the average volume for each minute of the day, aggregated across all 5 stocks.\n", 322 | "\n", 323 | "(**HINT:** There are exactly 390 trading minutes in each day in this dataset.)" 324 | ] 325 | }, 326 | { 327 | "cell_type": "code", 328 | "execution_count": null, 329 | "metadata": {}, 330 | "outputs": [], 331 | "source": [ 332 | "plt.plot(volumes[:, 0].reshape(22, 390).mean(axis=0))" 333 | ] 334 | }, 335 | { 336 | "cell_type": "code", 337 | "execution_count": null, 338 | "metadata": {}, 339 | "outputs": [], 340 | "source": [ 341 | "volumes" 342 | ] 343 | }, 344 | { 345 | "cell_type": "code", 346 | "execution_count": null, 347 | "metadata": {}, 348 | "outputs": [], 349 | "source": [ 350 | "plt.plot(volumes.sum(axis=1).reshape(22, 390).mean(axis=0))" 351 | ] 352 | }, 353 | { 354 | "cell_type": "code", 355 | "execution_count": null, 356 | "metadata": {}, 357 | "outputs": [], 358 | "source": [] 359 | } 360 | ], 361 | "metadata": { 362 | "kernelspec": { 363 | "display_name": "Python 3", 364 | "language": "python", 365 | "name": "python3" 366 | }, 367 | "language_info": { 368 | "codemirror_mode": { 369 | "name": "ipython", 370 | "version": 3 371 | }, 372 | "file_extension": ".py", 373 | "mimetype": "text/x-python", 374 | "name": "python", 375 | "nbconvert_exporter": "python", 376 | "pygments_lexer": "ipython3", 377 | "version": "3.6.6" 378 | } 379 | }, 380 | "nbformat": 4, 381 | "nbformat_minor": 2 382 | } 383 | -------------------------------------------------------------------------------- /exercises/numpy/6-Broadcasting.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Broadcasting\n", 8 | "\n", 9 | "In these exercises, we'll practice using broadcasting to combine arrays of different dimensions." 
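The core broadcasting rule these exercises rely on is that axes of size 1 stretch to match their counterparts. It can be sketched with a pair of toy arrays; the values and names below (`ts`, `p0`, `p1`) are my own illustration, not part of the exercise data:

```python
import numpy as np

# A column of interpolation weights (shape (3, 1)) combined with two
# 2-d points (each shape (2,)).  Broadcasting stretches the size-1
# axis, so the result has shape (3, 2): one interpolated point per weight.
ts = np.linspace(0, 1, 3).reshape(-1, 1)   # [[0.0], [0.5], [1.0]]
p0 = np.array([0.0, 0.0])
p1 = np.array([2.0, 1.0])

points = (1 - ts) * p0 + ts * p1           # shape (3, 2)
```

Here the middle row is the point halfway between `p0` and `p1`, namely `[1.0, 0.5]`; the same column-reshape trick carries through the exercises that follow.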
10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": null, 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "import numpy as np\n", 19 | "import pandas as pd\n", 20 | "from IPython.display import display\n", 21 | "import matplotlib.pyplot as plt\n", 22 | "plt.rc('figure', figsize=(12, 7))" 23 | ] 24 | }, 25 | { 26 | "cell_type": "markdown", 27 | "metadata": {}, 28 | "source": [ 29 | "## Bezier Curves\n", 30 | "\n", 31 | "A [Bezier Curve](https://en.wikipedia.org/wiki/B%C3%A9zier_curve) is a way to define a two-dimensional curve in terms of a sequence of \"control points\". Intuitively, each control point \"pulls\" the path traced by the curve toward itself.\n", 32 | "\n", 33 | "Mathematically, a bezier curve defines a function $B(t)$ from the interval $[0, 1]$ to a two-dimensional point $p$." 34 | ] 35 | }, 36 | { 37 | "cell_type": "markdown", 38 | "metadata": {}, 39 | "source": [ 40 | "Here's an example of a fourth-order bezier curve. You can see that as $t$ moves from 0 to 1, the control points exert different amounts of \"force\", pulling the final point closer toward themselves." 41 | ] 42 | }, 43 | { 44 | "cell_type": "markdown", 45 | "metadata": {}, 46 | "source": [ 47 | "![images/bezier.gif](images/bezier.gif)" 48 | ] 49 | }, 50 | { 51 | "cell_type": "markdown", 52 | "metadata": {}, 53 | "source": [ 54 | "## Linear Bezier Curves\n", 55 | "\n", 56 | "The simplest form of bezier curve is a \"first-order\" bezier curve with two control points $p_0$ and $p_1$.\n", 57 | "\n", 58 | "The formula for a first-order bezier curve is:\n", 59 | "\n", 60 | "$B(t) = (1 - t)p_0 + tp_1$\n", 61 | "\n", 62 | "The curve traced out by a first-order bezier curve is simply the line connecting $p_0$ and $p_1$. For any value of $t \in [0, 1]$, $B(t)$ evaluates to the point \"t percent\" of the way between $p_0$ and $p_1$." 
63 | ] 64 | }, 65 | { 66 | "cell_type": "markdown", 67 | "metadata": {}, 68 | "source": [ 69 | "**Exercise:** Implement a function that evaluates a Bezier curve **at a single point**. \n", 70 | "\n", 71 | "Your function should take the following arguments:\n", 72 | "\n", 73 | "- `p0`, a length-2 array containing (x, y) coordinates of the first control point.\n", 74 | "- `p1`, a length-2 array containing (x, y) coordinates of the second control point.\n", 75 | "- `t`, a scalar value between 0 and 1." 76 | ] 77 | }, 78 | { 79 | "cell_type": "code", 80 | "execution_count": null, 81 | "metadata": {}, 82 | "outputs": [], 83 | "source": [ 84 | "def evaluate_linear_bezier_curve(p0, p1, t):\n", 85 | " raise NotImplementedError(\"IMPLEMENT ME!\")" 86 | ] 87 | }, 88 | { 89 | "cell_type": "code", 90 | "execution_count": null, 91 | "metadata": {}, 92 | "outputs": [], 93 | "source": [ 94 | "p0 = np.array([0, 0])\n", 95 | "p1 = np.array([2, 1])\n", 96 | "\n", 97 | "halfway = evaluate_linear_bezier_curve(p0, p1, 0.5)\n", 98 | "three_quarters = evaluate_linear_bezier_curve(p0, p1, 0.75)\n", 99 | "\n", 100 | "# If your implementation is correct, these assertions shouldn't trigger.\n", 101 | "np.testing.assert_almost_equal(halfway, [1.0, 0.5])\n", 102 | "np.testing.assert_almost_equal(three_quarters, [1.5, 0.75])" 103 | ] 104 | }, 105 | { 106 | "cell_type": "markdown", 107 | "metadata": {}, 108 | "source": [ 109 | "Implement a function that computes **an array of samples from a linear bezier curve**. Your function should take the following arguments:\n", 110 | "\n", 111 | "- `p0`, a length-2 array containing (x, y) coordinates of the first control point. (Same as above.)\n", 112 | "- `p1`, a length-2 array containing (x, y) coordinates of the second control point. 
(Same as above.)\n", 113 | "- `ts`, a 1d array of unknown length containing sample values between 0 and 1.\n", 114 | "\n", 115 | "Your function should return a `len(ts) x 2` array containing (x, y) coordinates of the requested samples.\n", 116 | "\n", 117 | "Once you think you have a solution, you can check your implementation by running `draw_linear_bezier_curve`, which will draw a bezier curve using samples generated by your `compute_linear_bezier_curve` function." 118 | ] 119 | }, 120 | { 121 | "cell_type": "code", 122 | "execution_count": null, 123 | "metadata": {}, 124 | "outputs": [], 125 | "source": [ 126 | "def draw_linear_bezier_curve(p0, p1):\n", 127 | " \"\"\"You shouldn't need to change anything in this function.\n", 128 | " \"\"\"\n", 129 | " ts = np.linspace(0, 1, 50)\n", 130 | " samples = compute_linear_bezier_curve(p0, p1, ts)\n", 131 | " X = samples[:, 0]\n", 132 | " Y = samples[:, 1]\n", 133 | " \n", 134 | " plt.plot(X, Y)\n", 135 | " plt.scatter([p0[0], p1[0]], [p0[1], p1[1]], color='red')\n", 136 | "\n", 137 | "def compute_linear_bezier_curve(p0, p1, ts):\n", 138 | " raise NotImplementedError(\"IMPLEMENT ME!\")" 139 | ] 140 | }, 141 | { 142 | "cell_type": "code", 143 | "execution_count": null, 144 | "metadata": {}, 145 | "outputs": [], 146 | "source": [ 147 | "draw_linear_bezier_curve(np.array([1, 1]), np.array([2, 3]));" 148 | ] 149 | }, 150 | { 151 | "cell_type": "markdown", 152 | "metadata": {}, 153 | "source": [ 154 | "## Exercise: Quadratic Bezier Curve\n", 155 | "\n", 156 | "The next-simplest form of Bezier Curve is a second-order (also known as \"quadratic\") curve. A second-order Bezier Curve has three control points, and can be implemented using the following formula:\n", 157 | "\n", 158 | "$b(t) = (1 - t)^2p_0 + 2(1 - t)tp_1 + t^2p_2$\n", 159 | "\n", 160 | "Implement a function with the same signature as above, but accept three control points, p0, p1, and p2. 
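If you get stuck, here is one possible vectorized shape for a quadratic evaluator. `quadratic_bezier_sketch` is a hypothetical name of mine, and this is a sketch of the formula above under the same argument conventions, not the official solution:

```python
import numpy as np

def quadratic_bezier_sketch(p0, p1, p2, ts):
    # Hypothetical illustration, not the notebook's official solution.
    # Reshaping ts to a column makes each scalar coefficient broadcast
    # against the length-2 control points, yielding a (len(ts), 2) array.
    t = np.asarray(ts, dtype=float).reshape(-1, 1)
    return (1 - t) ** 2 * p0 + 2 * (1 - t) * t * p1 + t ** 2 * p2

pts = quadratic_bezier_sketch(
    np.array([0, 0]), np.array([1, 1]), np.array([2, -2]), np.linspace(0, 1, 5)
)
```

At `t = 0` and `t = 1` the curve passes through `p0` and `p2` exactly, which is a quick sanity check for any implementation.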
A correct implementation will draw a line that starts at $p0$, curves toward $p1$, and finishes at $p2$." 161 | ] 162 | }, 163 | { 164 | "cell_type": "code", 165 | "execution_count": null, 166 | "metadata": {}, 167 | "outputs": [], 168 | "source": [ 169 | "def draw_quadratic_bezier_curve(p0, p1, p2):\n", 170 | " \"\"\"You shouldn't need to change anything in this function.\n", 171 | " \"\"\"\n", 172 | " ts = np.linspace(0, 1, 50)\n", 173 | " samples = compute_quadratic_bezier_curve(p0, p1, p2, ts)\n", 174 | " \n", 175 | " X = samples[:, 0]\n", 176 | " Y = samples[:, 1]\n", 177 | " plt.plot(X, Y)\n", 178 | " \n", 179 | " points = np.vstack([p0, p1, p2])\n", 180 | " plt.scatter(points[:, 0], points[:, 1], color='red')\n", 181 | " plt.plot(points[:, 0], points[:, 1], linestyle='--', color='red')\n", 182 | " \n", 183 | "\n", 184 | "def compute_quadratic_bezier_curve(p0, p1, p2, ts):\n", 185 | " raise NotImplementedError(\"IMPLEMENT ME!\")" 186 | ] 187 | }, 188 | { 189 | "cell_type": "code", 190 | "execution_count": null, 191 | "metadata": {}, 192 | "outputs": [], 193 | "source": [ 194 | "p0 = np.array([0, 0])\n", 195 | "p1 = np.array([1, 1])\n", 196 | "p2 = np.array([2, -2])\n", 197 | "draw_quadratic_bezier_curve(p0, p1, p2)" 198 | ] 199 | }, 200 | { 201 | "cell_type": "markdown", 202 | "metadata": {}, 203 | "source": [ 204 | "## Exercise: Cubic Bezier Curve\n", 205 | "\n", 206 | "A third-order (aka, cubic) Bezier Curve has four control points, and has the following formula:\n", 207 | "\n", 208 | "$b(t) = (1 - t)^3p_0 + 3(1 - t)^2tp_1 + 3(1 - t)t^2p_2 + t^3p_3$\n", 209 | "\n", 210 | "Implement an evaluator for a cubic bezier curve following the same pattern as above." 
211 | ] 212 | }, 213 | { 214 | "cell_type": "code", 215 | "execution_count": null, 216 | "metadata": {}, 217 | "outputs": [], 218 | "source": [ 219 | "def draw_cubic_bezier_curve(p0, p1, p2, p3):\n", 220 | " \"\"\"You shouldn't need to change anything in this function.\n", 221 | " \"\"\"\n", 222 | " ts = np.linspace(0, 1, 50)\n", 223 | " samples = compute_cubic_bezier_curve(p0, p1, p2, p3, ts)\n", 224 | " X = samples[:, 0]\n", 225 | " Y = samples[:, 1]\n", 226 | " plt.plot(X, Y)\n", 227 | " \n", 228 | " points = np.vstack([p0, p1, p2, p3])\n", 229 | " plt.scatter(points[:, 0], points[:, 1], color='red')\n", 230 | " plt.plot(points[:, 0], points[:, 1], linestyle='--', color='red')\n", 231 | " \n", 232 | "\n", 233 | "def compute_cubic_bezier_curve(p0, p1, p2, p3, ts):\n", 234 | " raise NotImplementedError()" 235 | ] 236 | }, 237 | { 238 | "cell_type": "code", 239 | "execution_count": null, 240 | "metadata": {}, 241 | "outputs": [], 242 | "source": [ 243 | "p0 = np.array([0, 0])\n", 244 | "p1 = np.array([1, 1])\n", 245 | "p2 = np.array([2, -2])\n", 246 | "p3 = np.array([3, 3])\n", 247 | "draw_cubic_bezier_curve(p0, p1, p2, p3)" 248 | ] 249 | }, 250 | { 251 | "cell_type": "markdown", 252 | "metadata": {}, 253 | "source": [ 254 | "## Exercise: Generalized Bezier Curve\n", 255 | "\n", 256 | "You may have started to notice a pattern in the coefficients of each control point's contribution to the curve. We're getting [Binomial Coefficients](https://en.wikipedia.org/wiki/Binomial_coefficient)!\n", 257 | "\n", 258 | "The general formula for a Bezier curve with $n$ control points is:\n", 259 | "\n", 260 | "$b(t) = \\sum_{i=0}^n \\binom{n}{i}(1 - t)^{n - i}t^ip_i$\n", 261 | "\n", 262 | "where $\\binom{n}{i}$ is the binomial coefficient of $n$ and $i$.\n", 263 | "\n", 264 | "Implement a function that computes samples from a generalized bezier curve. 
It should take a 2d array of (npoints x 2) and a 1d array of samples, and it should return a (len(t) x 2) array of evaluated samples. \n", 265 | "\n", 266 | "**Hint:** You can use `scipy.special.comb` to evaluate binomial coefficients.\n", 267 | "\n", 268 | "**NOTE:** This exercise is hard. If you get stuck here, there's a solutions notebook nextdoor." 269 | ] 270 | }, 271 | { 272 | "cell_type": "code", 273 | "execution_count": null, 274 | "metadata": {}, 275 | "outputs": [], 276 | "source": [ 277 | "from scipy.special import comb\n", 278 | "\n", 279 | "def draw_bezier_curve(points):\n", 280 | " ts = np.linspace(0, 1, 50)\n", 281 | " samples = compute_bezier_curve(points, ts)\n", 282 | " X = samples[:, 0]\n", 283 | " Y = samples[:, 1]\n", 284 | " plt.plot(X, Y)\n", 285 | " \n", 286 | " plt.scatter(points[:, 0], points[:, 1], color='red')\n", 287 | " plt.plot(points[:, 0], points[:, 1], linestyle='--', color='red')\n", 288 | " \n", 289 | "def compute_bezier_curve(points, t):\n", 290 | " raise NotImplementedError()" 291 | ] 292 | }, 293 | { 294 | "cell_type": "code", 295 | "execution_count": null, 296 | "metadata": { 297 | "scrolled": false 298 | }, 299 | "outputs": [], 300 | "source": [ 301 | "draw_bezier_curve(np.vstack([p0, p1, p2, p3]))" 302 | ] 303 | }, 304 | { 305 | "cell_type": "markdown", 306 | "metadata": {}, 307 | "source": [ 308 | "## Exercise: Estimate the Length of a Bezier Curve\n", 309 | "\n", 310 | "Write a function that estimates the length of a bezier curve by computing an array of sample points and summing lengths of the differences between each successive point." 
311 | ] 312 | }, 313 | { 314 | "cell_type": "code", 315 | "execution_count": null, 316 | "metadata": {}, 317 | "outputs": [], 318 | "source": [ 319 | "def estimate_curve_length(points, nsamples):\n", 320 | " raise NotImplementedError()" 321 | ] 322 | }, 323 | { 324 | "cell_type": "code", 325 | "execution_count": null, 326 | "metadata": {}, 327 | "outputs": [], 328 | "source": [ 329 | "estimate_curve_length([p0, p1, p2, p3], 5)" 330 | ] 331 | }, 332 | { 333 | "cell_type": "code", 334 | "execution_count": null, 335 | "metadata": {}, 336 | "outputs": [], 337 | "source": [ 338 | "for i in range(3, 10):\n", 339 | " print(estimate_curve_length([p0, p1, p2, p3], i))" 340 | ] 341 | } 342 | ], 343 | "metadata": { 344 | "kernelspec": { 345 | "display_name": "Python 3", 346 | "language": "python", 347 | "name": "python3" 348 | }, 349 | "language_info": { 350 | "codemirror_mode": { 351 | "name": "ipython", 352 | "version": 3 353 | }, 354 | "file_extension": ".py", 355 | "mimetype": "text/x-python", 356 | "name": "python", 357 | "nbconvert_exporter": "python", 358 | "pygments_lexer": "ipython3", 359 | "version": "3.6.6" 360 | } 361 | }, 362 | "nbformat": 4, 363 | "nbformat_minor": 2 364 | } 365 | -------------------------------------------------------------------------------- /exercises/numpy/images/bezier.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/llllllllll/principles-of-performance/80d1d18ef4bc039fbed2a92b88e5b85d654c10a6/exercises/numpy/images/bezier.gif -------------------------------------------------------------------------------- /exercises/numpy/images/bezier2.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/llllllllll/principles-of-performance/80d1d18ef4bc039fbed2a92b88e5b85d654c10a6/exercises/numpy/images/bezier2.gif -------------------------------------------------------------------------------- 
/exercises/numpy/solutions/1-Finding Functions and Documentation (Solutions).ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Finding Functions with NumPy and Jupyter" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "NumPy is a large, complex library with hundreds of useful functions. Often the hardest part of solving a problem with NumPy is simply finding the right function to use, or figuring out whether a function you've found can solve your problem.\n", 15 | "\n", 16 | "The official documentation for NumPy and SciPy is an excellent resource:\n", 17 | "\n", 18 | "- The official NumPy documentation: [https://docs.scipy.org/doc/numpy-1.13.0/reference/](https://docs.scipy.org/doc/numpy-1.13.0/reference/)\n", 19 | "- The official SciPy documentation: [https://docs.scipy.org/doc/scipy/reference/](https://docs.scipy.org/doc/scipy/reference/)\n", 20 | "\n", 21 | "There are also a few tricks you can use to learn about NumPy functions without having to leave the notebook:" 22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "execution_count": 1, 27 | "metadata": {}, 28 | "outputs": [], 29 | "source": [ 30 | "import numpy as np" 31 | ] 32 | }, 33 | { 34 | "cell_type": "markdown", 35 | "metadata": {}, 36 | "source": [ 37 | "Suppose, for example, that we want to find the [eigenvalues](https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors) of a matrix. 
The first thing we might try is to use Jupyter's tab-completion feature to see if there's a top-level function with a name like \"eigenvalues\":" 38 | ] 39 | }, 40 | { 41 | "cell_type": "code", 42 | "execution_count": null, 43 | "metadata": {}, 44 | "outputs": [], 45 | "source": [ 46 | "# Place your cursor immediately after the `e` and hit TAB to see the top-level numpy functions\n", 47 | "# that start with the letter \"e\"\n", 48 | "np.e" 49 | ] 50 | }, 51 | { 52 | "cell_type": "markdown", 53 | "metadata": {}, 54 | "source": [ 55 | "Unfortunately for us, none of the functions that appear look like they're related to eigenvalues. The next thing we can try is to use [`np.lookfor`](https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.lookfor.html) to search for \"eigenvalue\" by keyword:" 56 | ] 57 | }, 58 | { 59 | "cell_type": "code", 60 | "execution_count": 3, 61 | "metadata": {}, 62 | "outputs": [ 63 | { 64 | "name": "stdout", 65 | "output_type": "stream", 66 | "text": [ 67 | "Search results for 'eigenvalue'\n", 68 | "-------------------------------\n", 69 | "numpy.linalg.eig\n", 70 | " Compute the eigenvalues and right eigenvectors of a square array.\n", 71 | "numpy.linalg.eigh\n", 72 | " Return the eigenvalues and eigenvectors of a Hermitian or symmetric matrix.\n", 73 | "numpy.linalg.eigvals\n", 74 | " Compute the eigenvalues of a general matrix.\n", 75 | "numpy.linalg.eigvalsh\n", 76 | " Compute the eigenvalues of a Hermitian or real symmetric matrix.\n", 77 | "numpy.roots\n", 78 | " Return the roots of a polynomial with coefficients given in p.\n", 79 | "numpy.linalg.svd\n", 80 | " Singular Value Decomposition.\n", 81 | "numpy.linalg._umath_linalg.eig\n", 82 | " eig on the last two dimension and broadcast to the rest.\n", 83 | "numpy.polynomial.Hermite._roots\n", 84 | " Compute the roots of a Hermite series.\n", 85 | "numpy.polynomial.HermiteE._roots\n", 86 | " Compute the roots of a HermiteE series.\n", 87 | "numpy.polynomial.Laguerre._roots\n", 
88 | " Compute the roots of a Laguerre series.\n", 89 | "numpy.polynomial.Legendre._roots\n", 90 | " Compute the roots of a Legendre series.\n", 91 | "numpy.polynomial.Chebyshev._roots\n", 92 | " Compute the roots of a Chebyshev series.\n", 93 | "numpy.linalg._umath_linalg.eigh_lo\n", 94 | " eigh on the last two dimension and broadcast to the rest, using lower triangle\n", 95 | "numpy.linalg._umath_linalg.eigh_up\n", 96 | " eigh on the last two dimension and broadcast to the rest, using upper triangle.\n", 97 | "numpy.linalg._umath_linalg.eigvals\n", 98 | " eigvals on the last two dimension and broadcast to the rest.\n", 99 | "numpy.polynomial.Polynomial._roots\n", 100 | " Compute the roots of a polynomial.\n", 101 | "numpy.linalg._umath_linalg.eigvalsh_lo\n", 102 | " eigh on the last two dimension and broadcast to the rest, using lower triangle.\n", 103 | "numpy.linalg._umath_linalg.eigvalsh_up\n", 104 | " eigvalsh on the last two dimension and broadcast to the rest, using upper triangle.\n", 105 | "numpy.polynomial.hermite.hermcompanion\n", 106 | " Return the scaled companion matrix of c.\n", 107 | "numpy.polynomial.legendre.legcompanion\n", 108 | " Return the scaled companion matrix of c.\n", 109 | "numpy.polynomial.chebyshev.chebcompanion\n", 110 | " Return the scaled companion matrix of c.\n", 111 | "numpy.polynomial.hermite_e.hermecompanion\n", 112 | " Return the scaled companion matrix of c." 113 | ] 114 | } 115 | ], 116 | "source": [ 117 | "np.lookfor(\"eigenvalue\")" 118 | ] 119 | }, 120 | { 121 | "cell_type": "markdown", 122 | "metadata": {}, 123 | "source": [ 124 | "That looks more promising, but now we have to figure out which function to use. For that, we can use Jupyter's `?` operator. Running a cell containing `function_name?` will bring up a window containing documentation about the function.\n", 125 | "\n", 126 | "Execute the following cell and read the documentation for `eigvals`. 
In particular, notice that if you scroll down, there are **\"Examples\"** and **\"See Also\"** sections. These sections are often as useful as, or more useful than, the description of what a function does." 127 | ] 128 | }, 129 | { 130 | "cell_type": "code", 131 | "execution_count": 4, 132 | "metadata": {}, 133 | "outputs": [], 134 | "source": [ 135 | "np.linalg.eigvals?" 136 | ] 137 | }, 138 | { 139 | "cell_type": "markdown", 140 | "metadata": {}, 141 | "source": [ 142 | "For cases where we just want to check a docstring quickly, it can be more ergonomic to bring up documentation in-line using Shift+Tab:\n", 143 | "\n", 144 | "Place your cursor after \"eigvals\", hold Shift, and then press Tab. You should see the signature and the first line of the function's documentation. You can see the rest of the documentation by pressing Tab again without letting go of Shift." 145 | ] 146 | }, 147 | { 148 | "cell_type": "code", 149 | "execution_count": 5, 150 | "metadata": {}, 151 | "outputs": [ 152 | { 153 | "data": { 154 | "text/plain": [ 155 | "" 156 | ] 157 | }, 158 | "execution_count": 5, 159 | "metadata": {}, 160 | "output_type": "execute_result" 161 | } 162 | ], 163 | "source": [ 164 | "np.linalg.eigvals" 165 | ] 166 | }, 167 | { 168 | "cell_type": "markdown", 169 | "metadata": {}, 170 | "source": [ 171 | "## Exercise: Finding Functions\n", 172 | "\n", 173 | "Use `np.lookfor` and `?` to find functions that do the following:\n", 174 | "\n", 175 | "- Compute the largest value in an array. (`np.max`; note that `np.maximum` is the *elementwise* maximum of two arrays)\n", 176 | "- Compute the smallest value in an array. (`np.min`; likewise, `np.minimum` is elementwise)\n", 177 | "- Compute the value at the 90th percentile of an array. (`np.percentile`)\n", 178 | "- Sort an array. 
(`np.sort`)" 179 | ] 180 | }, 181 | { 182 | "cell_type": "markdown", 183 | "metadata": {}, 184 | "source": [ 185 | "## Exercise: Finding Functions (continued)\n", 186 | "\n", 187 | "Continue using `np.lookfor` and `?` to find functions that do the following:" 188 | ] 189 | }, 190 | { 191 | "cell_type": "markdown", 192 | "metadata": {}, 193 | "source": [ 194 | "- Find a value in a sorted array. (`np.searchsorted`)\n", 195 | "- Compute the [natural logarithm](https://en.wikipedia.org/wiki/Natural_logarithm) of each element in an array. (`np.log`)\n", 196 | "- Compute the [Correlation Coefficient](https://en.wikipedia.org/wiki/Correlation_coefficient) between two arrays. (`np.corrcoef`)\n", 197 | "- Fit coefficients of a polynomial function. (`np.polyfit`)\n", 198 | "- Compute a [Covariance Matrix](https://en.wikipedia.org/wiki/Covariance_matrix). (`np.cov`)" 199 | ] 200 | }, 201 | { 202 | "cell_type": "code", 203 | "execution_count": null, 204 | "metadata": {}, 205 | "outputs": [], 206 | "source": [] 207 | } 208 | ], 209 | "metadata": { 210 | "kernelspec": { 211 | "display_name": "Python 3", 212 | "language": "python", 213 | "name": "python3" 214 | }, 215 | "language_info": { 216 | "codemirror_mode": { 217 | "name": "ipython", 218 | "version": 3 219 | }, 220 | "file_extension": ".py", 221 | "mimetype": "text/x-python", 222 | "name": "python", 223 | "nbconvert_exporter": "python", 224 | "pygments_lexer": "ipython3", 225 | "version": "3.5.2" 226 | } 227 | }, 228 | "nbformat": 4, 229 | "nbformat_minor": 2 230 | } 231 | -------------------------------------------------------------------------------- /exercises/numpy/solutions/images/bezier.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/llllllllll/principles-of-performance/80d1d18ef4bc039fbed2a92b88e5b85d654c10a6/exercises/numpy/solutions/images/bezier.gif -------------------------------------------------------------------------------- 
/exercises/numpy/solutions/images/bezier2.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/llllllllll/principles-of-performance/80d1d18ef4bc039fbed2a92b88e5b85d654c10a6/exercises/numpy/solutions/images/bezier2.gif -------------------------------------------------------------------------------- /exercises/profiling/replay-parsing.stats: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/llllllllll/principles-of-performance/80d1d18ef4bc039fbed2a92b88e5b85d654c10a6/exercises/profiling/replay-parsing.stats -------------------------------------------------------------------------------- /exercises/profiling/rolling.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | 4 | def rolling_sum(window_size, array): 5 | out = [] 6 | for n in range(len(array) - window_size + 1): 7 | window = array[n:n + window_size] 8 | out.append(np.sum(window)) 9 | 10 | return np.array(out) 11 | 12 | 13 | if __name__ == '__main__': 14 | import cProfile 15 | 16 | array = np.random.random(10000) 17 | window_size = 20 18 | 19 | p = cProfile.Profile() 20 | p.enable() 21 | 22 | rolling_sum(20, array) 23 | 24 | p.disable() 25 | p.dump_stats('rolling_sum.stats') 26 | -------------------------------------------------------------------------------- /exercises/profiling/solutions/rolling.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | 4 | def _rolling_windows(window_size, array): 5 | orig_shape = array.shape 6 | if not orig_shape: 7 | raise IndexError("Can't restride a scalar.") 8 | elif orig_shape[0] <= window_size: 9 | raise IndexError( 10 | "Can't restride array of shape {shape} with" 11 | " a window length of {len}".format( 12 | shape=orig_shape, 13 | len=window_size, 14 | ) 15 | ) 16 | 17 | num_windows = (orig_shape[0] - window_size + 1) 
18 | new_shape = (num_windows, window_size) + orig_shape[1:] 19 | 20 | new_strides = (array.strides[0],) + array.strides 21 | 22 | return np.ndarray( 23 | dtype=array.dtype, 24 | shape=new_shape, 25 | buffer=array, 26 | strides=new_strides, 27 | ) 28 | 29 | 30 | def rolling_sum(window_size, array): 31 | windows = _rolling_windows(window_size, array) 32 | return np.sum(windows, axis=1) 33 | 34 | 35 | if __name__ == '__main__': 36 | import cProfile 37 | 38 | array = np.random.random(10000) 39 | window_size = 20 40 | 41 | p = cProfile.Profile() 42 | p.enable() 43 | 44 | rolling_sum(20, array) 45 | 46 | p.disable() 47 | p.dump_stats('rolling_sum.stats') 48 | -------------------------------------------------------------------------------- /tutorial/Makefile: -------------------------------------------------------------------------------- 1 | # Minimal makefile for Sphinx documentation 2 | # 3 | 4 | # You can set these variables from the command line. 5 | SPHINXOPTS = 6 | SPHINXBUILD = sphinx-build 7 | SPHINXPROJ = c-extension-tutorial 8 | SOURCEDIR = source 9 | BUILDDIR = build 10 | 11 | # Put it first so that "make" without argument is like "make help". 12 | help: 13 | @$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) 14 | 15 | .PHONY: help Makefile 16 | 17 | # Catch-all target: route all unknown targets to Sphinx using the new 18 | # "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS). 19 | %: Makefile 20 | @$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) 21 | 22 | livehtml: 23 | sphinx-autobuild -b html $(SPHINXOPTS) $(SOURCEDIR) $(BUILDDIR)/html 24 | -------------------------------------------------------------------------------- /tutorial/deploy.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # This file is taken from the Zipline project with minor modifications: 3 | # github.com/quantopian/zipline 4 | # 5 | # Copyright 2016 Quantopian, Inc. 
6 | # 7 | # Licensed under the Apache License, Version 2.0 (the "License"); 8 | # you may not use this file except in compliance with the License. 9 | # You may obtain a copy of the License at 10 | # 11 | # http://www.apache.org/licenses/LICENSE-2.0 12 | # 13 | # Unless required by applicable law or agreed to in writing, software 14 | # distributed under the License is distributed on an "AS IS" BASIS, 15 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 16 | # See the License for the specific language governing permissions and 17 | # limitations under the License. 18 | 19 | from contextlib import contextmanager 20 | from glob import glob 21 | import os 22 | from os.path import abspath, basename, dirname, exists, isfile 23 | from shutil import move, rmtree 24 | from subprocess import check_call 25 | 26 | HERE = dirname(abspath(__file__)) 27 | TUTORIAL_ROOT = dirname(HERE) 28 | TEMP_LOCATION = '/tmp/tutorial-doc' 29 | TEMP_LOCATION_GLOB = TEMP_LOCATION + '/*' 30 | 31 | 32 | @contextmanager 33 | def removing(path): 34 | try: 35 | yield 36 | finally: 37 | rmtree(path) 38 | 39 | 40 | def ensure_not_exists(path): 41 | if not exists(path): 42 | return 43 | if isfile(path): 44 | os.unlink(path) 45 | else: 46 | rmtree(path) 47 | 48 | 49 | def main(): 50 | old_dir = os.getcwd() 51 | print("Moving to %s." 
% HERE) 52 | os.chdir(HERE) 53 | 54 | try: 55 | print("Building docs with 'make html'") 56 | check_call(['make', 'html']) 57 | 58 | print("Clearing temp location '%s'" % TEMP_LOCATION) 59 | rmtree(TEMP_LOCATION, ignore_errors=True) 60 | 61 | with removing(TEMP_LOCATION): 62 | print("Copying built files to temp location.") 63 | move('build/html', TEMP_LOCATION) 64 | 65 | print("Moving to '%s'" % TUTORIAL_ROOT) 66 | os.chdir(TUTORIAL_ROOT) 67 | 68 | print("Checking out gh-pages branch.") 69 | check_call( 70 | [ 71 | 'git', 'branch', '-f', 72 | '--track', 'gh-pages', 'origin/gh-pages' 73 | ] 74 | ) 75 | check_call(['git', 'checkout', 'gh-pages']) 76 | check_call(['git', 'reset', '--hard', 'origin/gh-pages']) 77 | 78 | print("Copying built files:") 79 | for file_ in glob(TEMP_LOCATION_GLOB): 80 | base = basename(file_) 81 | 82 | print("%s -> %s" % (file_, base)) 83 | ensure_not_exists(base) 84 | move(file_, '.') 85 | finally: 86 | os.chdir(old_dir) 87 | 88 | print() 89 | print("Updated documentation branch in directory %s" % TUTORIAL_ROOT) 90 | print("If you are happy with these changes, commit and push to gh-pages.") 91 | 92 | 93 | if __name__ == '__main__': 94 | main() 95 | -------------------------------------------------------------------------------- /tutorial/source/_static/1-byte-value-array.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/llllllllll/principles-of-performance/80d1d18ef4bc039fbed2a92b88e5b85d654c10a6/tutorial/source/_static/1-byte-value-array.png -------------------------------------------------------------------------------- /tutorial/source/_static/2d-array.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/llllllllll/principles-of-performance/80d1d18ef4bc039fbed2a92b88e5b85d654c10a6/tutorial/source/_static/2d-array.png -------------------------------------------------------------------------------- 
/tutorial/source/_static/adders.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/llllllllll/principles-of-performance/80d1d18ef4bc039fbed2a92b88e5b85d654c10a6/tutorial/source/_static/adders.png -------------------------------------------------------------------------------- /tutorial/source/_static/addition-dereferences.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/llllllllll/principles-of-performance/80d1d18ef4bc039fbed2a92b88e5b85d654c10a6/tutorial/source/_static/addition-dereferences.png -------------------------------------------------------------------------------- /tutorial/source/_static/cache-0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/llllllllll/principles-of-performance/80d1d18ef4bc039fbed2a92b88e5b85d654c10a6/tutorial/source/_static/cache-0.png -------------------------------------------------------------------------------- /tutorial/source/_static/cache-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/llllllllll/principles-of-performance/80d1d18ef4bc039fbed2a92b88e5b85d654c10a6/tutorial/source/_static/cache-1.png -------------------------------------------------------------------------------- /tutorial/source/_static/column-order-strides.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/llllllllll/principles-of-performance/80d1d18ef4bc039fbed2a92b88e5b85d654c10a6/tutorial/source/_static/column-order-strides.png -------------------------------------------------------------------------------- /tutorial/source/_static/column-order.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/llllllllll/principles-of-performance/80d1d18ef4bc039fbed2a92b88e5b85d654c10a6/tutorial/source/_static/column-order.png -------------------------------------------------------------------------------- /tutorial/source/_static/column-slice-strides.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/llllllllll/principles-of-performance/80d1d18ef4bc039fbed2a92b88e5b85d654c10a6/tutorial/source/_static/column-slice-strides.png -------------------------------------------------------------------------------- /tutorial/source/_static/memory-cells.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/llllllllll/principles-of-performance/80d1d18ef4bc039fbed2a92b88e5b85d654c10a6/tutorial/source/_static/memory-cells.png -------------------------------------------------------------------------------- /tutorial/source/_static/multi-byte-value-array.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/llllllllll/principles-of-performance/80d1d18ef4bc039fbed2a92b88e5b85d654c10a6/tutorial/source/_static/multi-byte-value-array.png -------------------------------------------------------------------------------- /tutorial/source/_static/row-order-strides.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/llllllllll/principles-of-performance/80d1d18ef4bc039fbed2a92b88e5b85d654c10a6/tutorial/source/_static/row-order-strides.png -------------------------------------------------------------------------------- /tutorial/source/_static/row-order.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/llllllllll/principles-of-performance/80d1d18ef4bc039fbed2a92b88e5b85d654c10a6/tutorial/source/_static/row-order.png 
--------------------------------------------------------------------------------
/tutorial/source/_static/struct.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/llllllllll/principles-of-performance/80d1d18ef4bc039fbed2a92b88e5b85d654c10a6/tutorial/source/_static/struct.png
--------------------------------------------------------------------------------
/tutorial/source/appendix.rst:
--------------------------------------------------------------------------------
1 | Appendix
2 | ========
3 |
4 | Processor
5 | ---------
6 |
7 | .. _CPU:
8 |
9 | CPU
10 | ~~~
11 |
12 | The CPU is the device that physically executes :ref:`instructions <instruction>`
13 | to perform computation.
14 |
15 | .. _mmu:
16 |
17 | MMU
18 | ~~~
19 |
20 | The MMU, which stands for "Memory Management Unit", is a component that
21 | intercepts every memory read or write coming from the program and remaps it from
22 | the :ref:`virtual address <virtual-memory>` to the physical :ref:`main memory
23 | <main-memory>` or :ref:`processor cache <cache>`. The MMU is also responsible
24 | for enforcing that programs only access the memory they have been given so that
25 | programs do not use other programs' memory.
26 |
27 | .. _virtual-memory:
28 |
29 | Virtual Memory
30 | ~~~~~~~~~~~~~~
31 |
32 | Virtual memory is a way of isolating the memory used by different programs
33 | running on the same machine at the same time. Basically, each program sees a
34 | subset of the total memory available on the machine. The process can only read or
35 | write addresses inside this space.
36 |
37 | .. _x86:
38 |
39 | x86
40 | ~~~
41 |
42 | By far the most common instruction set architecture for personal computers and
43 | servers. The instruction set architecture defines which :ref:`instructions
44 | <instruction>` exist for the machine.
45 |
46 | CPU
47 | ~~~
48 |
49 | The CPU is the device that reads :ref:`instructions <instruction>`, interprets
50 | the meaning, and performs the given computation.
This is the component of the
51 | computer that implements all of the logic.
52 |
53 | .. _instruction:
54 |
55 | Instruction
56 | ~~~~~~~~~~~
57 |
58 | A single low-level step that the computer can execute: an operation paired
59 | with the arguments to that operation.
60 |
61 | .. _instruction-pointer:
62 |
63 | Instruction Pointer
64 | ~~~~~~~~~~~~~~~~~~~
65 |
66 | A special :ref:`register <register>` which stores the :ref:`address
 <address>`
67 | of the next instruction to execute. The programmer cannot directly manipulate
68 | this register.
69 |
70 | .. _opcode:
71 |
72 | Opcode
73 | ~~~~~~
74 |
75 | A particular operation that the computer can execute decoupled from the
76 | arguments. For example ``add`` or ``sub``.
77 |
78 | .. _register:
79 |
80 | Register
81 | ~~~~~~~~
82 |
83 | A location to store a small value while performing operations. Except for
84 | :ref:`mov`, most :ref:`instructions <instruction>` take registers for all inputs
85 | and outputs. Registers live on the :ref:`CPU` itself, not in :ref:`memory
86 | <main-memory>`.
87 |
88 | .. _bitness:
89 |
90 | Bitness
91 | ~~~~~~~
92 |
93 | The number of bits in a value.
94 |
95 | .. _word-size:
96 |
97 | Word Size
98 | ~~~~~~~~~
99 |
100 | The word size is the size of a :ref:`register <register>` on the
101 | machine. However, "word" is often overloaded to mean 16 bit value. This comes
102 | from the fact that the original :ref:`x86` processors were 16 bit. This is the
103 | root of the terms:
104 |
105 | - ``dword``: double word, 32 bit value
106 | - ``qword``: quadruple word, 64 bit value
107 |
108 | .. _mov:
109 |
110 | ``mov``
111 | ~~~~~~~
112 |
113 | ``mov`` is the :ref:`instruction <instruction>` that can load data from
114 | :ref:`memory <main-memory>` into a :ref:`register <register>` or write data
115 | from a :ref:`register <register>` back to :ref:`memory <main-memory>`.
116 |
117 | Memory
118 | ------
119 |
120 | .. _bit:
121 |
122 | Bit
123 | ~~~
124 |
125 | The most primitive unit for storing information, either true or false. Bits are
126 | often denoted using 1 for true or 0 for false.
127 |
128 | .. _byte:
129 |
130 | Byte
131 | ~~~~
132 |
133 | The smallest addressable number of :ref:`bits <bit>`. This is almost always
134 | eight bits.
135 |
136 | .. _memory-management:
137 |
138 | Memory Management
139 | ~~~~~~~~~~~~~~~~~
140 |
141 | Memory management is tracking which :ref:`addresses
 <address>` are in use so
142 | that data is not overwritten while it is being used. This involves properly
143 | :ref:`allocating <allocation>` and :ref:`deallocating <deallocation>` memory.
144 |
145 | .. _allocation:
146 |
147 | Allocation
148 | ~~~~~~~~~~
149 |
150 | To allocate memory is to reserve a section of memory for some period of time.
151 |
152 | .. note:: See Also
153 |
154 |    :ref:`deallocation`
155 |
156 | .. _deallocation:
157 |
158 | Deallocation
159 | ~~~~~~~~~~~~
160 |
161 | To deallocate memory is to mark that a previously :ref:`allocated <allocation>`
162 | region of memory is no longer needed and can be reused in the future.
163 |
164 | .. note:: See Also
165 |
166 |    :ref:`allocation`
167 |
168 | .. _address:
169 |
170 | Address
171 | ~~~~~~~
172 |
173 | An address is an integer which corresponds to a location in :ref:`memory
174 | <main-memory>`.
175 |
176 | .. _pointer:
177 |
178 | Pointer
179 | ~~~~~~~
180 |
181 | A pointer is a value that stores a :ref:`memory address
 <address>`. This can
182 | either be the address of a single value, an :ref:`array <array>`, or a
183 | :ref:`struct <struct>`.
184 |
185 | .. _dereference:
186 |
187 | Dereference
188 | ~~~~~~~~~~~
189 |
190 | To "dereference" means to read the value stored at a particular :ref:`address
191 |
 <address>`. For example, given memory that looks like:
192 |
193 | .. code-block:: python
194 |
195 |    memory = [1, 5, 3, 4, 5, 2, 8, 3]
196 |
197 | Dereferencing address 4 (0-indexed) would be: ``memory[4] == 5``.
198 |
199 | .. _bit-width:
200 |
201 | Integer Width
202 | ~~~~~~~~~~~~~
203 |
204 | The fixed number of :ref:`bits <bit>` in an integer. The common widths for
205 | integers are: 8, 16, 32, and 64.
206 |
207 | .. _main-memory:
208 |
209 | Main Memory
210 | ~~~~~~~~~~~
211 |
212 | Main memory, also just called "memory" or "RAM", is ephemeral storage available
213 | to the processor for storing results of computations. This does not include
214 | persistent storage like hard drives.
215 |
216 | .. _l1:
217 | .. _cache:
218 |
219 | Processor Cache
220 | ~~~~~~~~~~~~~~~
221 |
222 | The processor cache is a series of caches that reside on the :ref:`CPU`
223 | itself. These caches are arranged from smallest and fastest to access to largest
224 | and slowest to access. The common naming convention is:
225 |
226 | - ``L1``: smallest and fastest
227 | - ``L3`` or ``L4`` (depending on CPU): largest and slowest
228 | - ``LL``: Last Level, always refers to the last level regardless of how many
229 |   levels exist.
230 |
231 | Often the ``L1`` cache is split into two distinct caches: one for instructions
232 | and one for data.
233 |
234 | .. _LL:
235 |
236 | ``LL``
237 | ~~~~~~
238 |
239 | ``LL``, short for "Last Level", always refers to the largest and slowest
240 | :ref:`memory cache <cache>` level for a given machine.
241 |
242 | .. _cache-line:
243 |
244 | Cache Line
245 | ~~~~~~~~~~
246 |
247 | A cache line is the unit of data transfer between :ref:`main memory
248 | <main-memory>` and the :ref:`processor cache <cache>`, or between levels of the
249 | cache. Instead of moving one byte at a time, data is transferred a full cache line at a time, which makes accessing adjacent values much cheaper.
250 |
251 | Data Structures
252 | ---------------
253 |
254 | .. _array:
255 |
256 | Array
257 | ~~~~~
258 |
259 | An array is an ordered sequence of values.
The defining characteristic of an
260 | array is that the values are laid out next to each other in :ref:`memory
261 | <main-memory>`. For example, if the elements are 4 byte integers, and the first
262 | element has an address of ``addr``, then the second element will have an address
263 | of ``addr + 4``, the third element will have an address of ``addr + 8``, and so
264 | on. The address of element ``n`` of any array ``a`` is ``a + sizeof(element) *
265 | n``. Element ``n`` of array ``a`` is often denoted as: ``a[n]``.
266 |
267 | .. _struct:
268 |
269 | Struct
270 | ~~~~~~
271 |
272 | A struct, short for "structure", is a fixed-size collection of potentially
273 | unrelated types. In a structure, the elements are laid out in a fixed order; for
274 | example, imagine the struct:
275 |
276 | .. code-block:: c
277 |
278 |    {
279 |        int32 a;
280 |        int8 b;
281 |        int16 c;
282 |    }
283 |
284 | The bytes could be laid out like:
285 |
286 | .. code-block:: python
287 |
288 |    [a[0], a[1], a[2], a[3], b[0], c[0], c[1]]
289 |
290 | though, for alignment reasons, it could also be laid out like:
291 |
292 | .. code-block:: python
293 |
294 |    [a[0], a[1], a[2], a[3], b[0], padding, c[0], c[1]]
295 |
296 | where ``padding`` is a wasted byte that serves to make the address of ``c`` a
297 | multiple of 2.
298 |
299 | Miscellaneous
300 | -------------
301 |
302 | .. _object:
303 |
304 | Object
305 | ~~~~~~
306 |
307 | An object, in terms of object oriented programming, is a value paired with the
308 | set of operations that may be performed on the value.
309 |
310 | .. _profiler:
311 |
312 | Profiler
313 | ~~~~~~~~
314 |
315 | A profiler is a tool that tracks the execution of a program to better understand
316 | the behavior and performance of the program.
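As a concrete illustration of the definition above, Python's built-in profiler, ``cProfile`` (the same one used in the profiling exercises in this repository), can be pointed at any function. This is a minimal sketch; ``slow_sum`` is a made-up function just to give the profiler something to measure:

```python
import cProfile
import io
import pstats


def slow_sum(n):
    # A deliberately naive loop so the profiler has something to record.
    total = 0
    for i in range(n):
        total += i
    return total


profiler = cProfile.Profile()
profiler.enable()
slow_sum(100_000)
profiler.disable()

# Render the collected statistics, sorted by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats('cumulative').print_stats(5)
print(stream.getvalue())
```

The report lists, for each function observed, how many times it was called and how much time was spent in it, which is exactly the "tracking" this glossary entry describes.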
317 |
--------------------------------------------------------------------------------
/tutorial/source/arrays-and-structs.rst:
--------------------------------------------------------------------------------
1 | Arrays and Structs
2 | ==================
3 |
4 | Arrays
5 | ------
6 |
7 | In most real world programs, we are less interested in storing a single number
8 | at a time and more interested in storing a sequence of numbers. The easiest way
9 | to store more than one number is to just stick them next to each other in
10 | :ref:`memory <main-memory>`. A region of memory which holds a sequence of values
11 | of the same type is called an :ref:`"array" <array>`. In memory, the array of 1
12 | byte integers ``[108, 109, 97, 111]`` would look like:
13 |
14 | .. image:: _static/1-byte-value-array.png
15 |
16 | In order to access the elements of the array, we would just add the offset to
17 | the base address of the array. This is where the 0-indexed convention for lists
18 | comes from: the first element is 0 elements away from the start.
19 |
20 | The advantage of packing the elements next to each other in this way is that we
21 | can now refer to the entire sequence with just a :ref:`pointer <pointer>` to the
22 | first element. This means that we can semantically move a lot of data just by
23 | moving a single integer.
24 |
25 | Multi-Byte Values in Arrays
26 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~
27 |
28 | Arrays may contain multi-byte values. Just like arrays of single-byte values,
29 | the elements are laid out next to each other in memory.
30 |
31 | .. image:: _static/multi-byte-value-array.png
32 |
33 | In order to access element :math:`n` of an array at address :math:`p` with
34 | elements of size :math:`s` we just read the value at the address: :math:`p +
35 | sn`.
36 |
37 | Structs
38 | -------
39 |
40 | Another common low-level data structure is called a :ref:`"struct" <struct>`,
41 | short for "structure".
Structures are a fixed-length,
42 | ordered collection of potentially different types of values. A structure works
43 | similarly to an array: we just pack the values next to each other in memory.
44 | Structures are useful because, just like with :ref:`arrays <array>`, they allow
45 | us to semantically move many related values with a single :ref:`pointer <pointer>`.
46 |
47 | Imagine we want to represent a structure with the following fields:
48 |
49 | .. code-block:: c
50 |
51 |    {
52 |        int32 a;
53 |        int8 b;
54 |        int16 c;
55 |    }
56 |
57 | The memory for a single instance of this struct may look like:
58 |
59 | .. image:: _static/struct.png
60 |
61 | Just like we can compute the address of any given element in an array from the
62 | address of the first value and the index, we can compute the address of any
63 | member of the structure from just the address of the first value. Instead of
64 | accounting for an index, we just need to know through some side-channel method
65 | what the expected offset is. The idea is that if we lay out the data at a
66 | particular offset from the first element when we write it, then code that
67 | requires just the ``c`` field can be hard coded to read :math:`p + 6` and the
68 | value will be there. There is nothing that enforces that this is true; it is
69 | just a convention for arranging memory.
70 |
71 | The reason that the 6th byte is unused is that we may want to keep all of our
72 | ``int16`` (2 byte integer) values at an address that is a multiple of 2. Some
73 | hardware operations are faster when reading values that are "aligned" to a
74 | multiple of their size. We will discuss this more later.
75 |
76 | Multi-Dimensional Arrays
77 | ------------------------
78 |
79 | Memory is intrinsically one dimensional because the machine natively addresses
80 | memory with a scalar integer.
Therefore, in order to store a semantically
81 | multi-dimensional array we need to store some flattening of the data and design
82 | a mapping from each coordinate to a single address. The general way to do this
83 | is to think of a multi-dimensional array as an "array of arrays", or an array
84 | whose elements are themselves arrays. An array just stores its values one after
85 | the other in memory, so a multi-dimensional array can be stored by laying out a
86 | sequence of arrays of progressively smaller dimensionality.
87 |
88 | For example, let's consider a 2d array of shape ``(6, 3)``, or 6 rows by 3
89 | columns:
90 |
91 | .. image:: _static/2d-array.png
92 |
93 | There are two ways we can define this as an "array of arrays":
94 |
95 | - a length 6 array of length 3 arrays: a collection of rows
96 | - a length 3 array of length 6 arrays: a collection of columns
97 |
98 | Row Order
99 | ~~~~~~~~~
100 |
101 | Row order is where the array is arranged in memory as a collection of rows. This
102 | is also called "C order" because the C programming language often uses arrays
103 | arranged this way.
104 |
105 | .. image:: _static/row-order.png
106 |
107 | The formula to get element :math:`(r, c)` is:
108 |
109 | .. math::
110 |
111 |    getitem(r, c) = r \cdot N_{columns} + c
112 |
113 | Column Order
114 | ~~~~~~~~~~~~
115 |
116 | Column order is where the array is arranged in memory as a collection of
117 | columns. This is also called "F order" or "Fortran order" because the Fortran
118 | programming language often uses arrays arranged in this way.
119 |
120 | .. image:: _static/column-order.png
121 |
122 | The formula to get element :math:`(r, c)` is:
123 |
124 | .. math::
125 |
126 |    getitem(r, c) = c \cdot N_{rows} + r
127 |
128 | .. warning::
129 |
130 |    Be careful not to confuse "C order" with column order: they are opposites!
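The two formulas above can be checked directly against NumPy, which supports both layouts. This is a small sketch using the same 6 rows by 3 columns shape as the example; the helper function names are invented for illustration:

```python
import numpy as np


def row_order_index(r, c, num_columns):
    # "C order": a full row of num_columns elements separates consecutive rows.
    return r * num_columns + c


def column_order_index(r, c, num_rows):
    # "F order": a full column of num_rows elements separates consecutive columns.
    return c * num_rows + r


a = np.arange(18).reshape(6, 3)   # 6 rows by 3 columns
flat_c = a.ravel(order='C')       # row order (C order) layout
flat_f = a.ravel(order='F')       # column order (Fortran order) layout

# Both formulas recover the same element from the two flattenings.
r, c = 4, 2
assert flat_c[row_order_index(r, c, num_columns=3)] == a[r, c]
assert flat_f[column_order_index(r, c, num_rows=6)] == a[r, c]
```

Walking every ``(r, c)`` pair through both formulas is a quick way to convince yourself that the two layouts really are transposes of each other in memory.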
131 |
--------------------------------------------------------------------------------
/tutorial/source/bits.rst:
--------------------------------------------------------------------------------
1 | Bits
2 | ====
3 |
4 | A :ref:`"bit" <bit>` is the most fundamental unit for storing information. It
5 | may either be true, often denoted as 1, or false, often denoted as 0. While we
6 | can only store two states in a single bit, if we group them together we can
7 | store exponentially more information.
8 |
9 | 1 bit (2 states):
10 |
11 | - 0
12 | - 1
13 |
14 | 2 bits (4 states):
15 |
16 | - 00
17 | - 01
18 | - 10
19 | - 11
20 |
21 | 3 bits (8 states):
22 |
23 | - 000
24 | - 001
25 | - 010
26 | - 011
27 | - 100
28 | - 101
29 | - 110
30 | - 111
31 |
32 | The number of states representable by :math:`n` bits is :math:`2^{n}`.
33 |
34 | Representing Non-Negative Integers with Bits
35 | --------------------------------------------
36 |
37 | We can uniquely represent non-negative integers using just bits with the
38 | following formula:
39 |
40 | Let :math:`s` be the number of bits and let :math:`B` be the sequence of bits:
41 |
42 | .. math::
43 |
44 |    n = \sum_{i=1}^{s}{B_i2^{i - 1}}
45 |
46 |
47 | For example, to represent 13 in 8 bits, we would need:
48 |
49 | .. math::
50 |
51 |    0(2^{7}) + 0(2^{6}) + 0(2^{5}) + 0(2^{4}) + 1(2^{3}) + 1(2^{2}) + 0(2^{1}) +
52 |    1(2^{0})
53 |
54 | which can be written out as: ``00001101``.
55 |
56 | Addition with Bits
57 | ------------------
58 |
59 | We can use the same algorithm for addition that we learned in elementary school.
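Before walking through it by hand, here is the same carry-based algorithm sketched in Python. ``add_bits`` is a helper invented for this example, not part of the tutorial code:

```python
def add_bits(a, b):
    # Add two binary strings with the elementary-school carry algorithm.
    width = max(len(a), len(b))
    a = a.zfill(width)   # pad the shorter number with leading zeroes
    b = b.zfill(width)

    digits = []
    carry = 0
    # Work from the least significant (rightmost) column to the left.
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        digits.append(str(total % 2))  # the digit we write down
        carry = total // 2             # the digit we carry
    if carry:
        digits.append('1')
    return ''.join(reversed(digits))


print(add_bits('1101', '100'))  # 13 + 4 = 17, printed as '10001'
```

Each loop iteration is one column of the worked example below: add the two bits and the incoming carry, write down the low bit, and carry the high bit into the next column.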
60 |
61 | To walk through an example, we will try to add 13 and 4:
62 |
63 | ::
64 |
65 |       1101
66 |    +   100
67 |    -------
68 |
69 | ::
70 |
71 |       1101
72 |    +   100
73 |    -------
74 |          1  (0 + 1) = 1
75 |
76 | ::
77 |
78 |       1101
79 |    +   100
80 |    -------
81 |         01  (0 + 1) = 1
82 |             (0 + 0) = 0
83 |
84 | ::
85 |
86 |       1
87 |       1101
88 |    +   100
89 |    -------
90 |        001  (0 + 1) = 1
91 |             (0 + 0) = 0
92 |             (1 + 1) = 10 = 0 carry 1
93 |
94 | ::
95 |
96 |      11
97 |       1101
98 |    +   100
99 |    -------
100 |       0001  (0 + 1) = 1
101 |             (0 + 0) = 0
102 |             (1 + 1) = 10 = 0 carry 1
103 |             (1 + 1) = 10 = 0 carry 1
104 |
105 | ::
106 |
107 |      11
108 |       1101
109 |    +   100
110 |    -------
111 |      10001  (0 + 1) = 1
112 |             (0 + 0) = 0
113 |             (1 + 1) = 10 = 0 carry 1
114 |             (1 + 1) = 10 = 0 carry 1
115 |             (1 + 0) = 1
116 |
117 |
118 | Representing Negative Numbers
119 | -----------------------------
120 |
121 | Signed Magnitude
122 | ~~~~~~~~~~~~~~~~
123 |
124 | Given that there are only two states for the sign of a number (positive or
125 | negative), you could reserve a particular bit to denote the sign, and then use
126 | the remaining bits to denote the magnitude of the value. For example, using 0 to
127 | denote positive and 1 to denote negative, in 8 bits we can represent values in
128 | the range:
129 |
130 | ::
131 |
132 |    01111111 = +127
133 |
134 |    11111111 = -127
135 |
136 | Unfortunately, this representation allows multiple representations for 0:
137 |
138 | ::
139 |
140 |    00000000 = +0
141 |    10000000 = -0
142 |
143 | This representation also complicates the algorithms needed to perform
144 | arithmetic with signed numbers.
145 |
146 | Two's Complement
147 | ~~~~~~~~~~~~~~~~
148 |
149 | In order to simplify arithmetic and produce a unique representation for each
150 | number, most computers use two's complement to represent negative integers. To
151 | find the two's complement representation of a negative number, take the absolute
152 | value, do a binary negation, and then add one.
A binary negation is where you
153 | flip every true bit to false, and every false bit to true. Because there could
154 | be an infinite number of leading zeroes to negate, we need to decide ahead of
155 | time the number of bits we will be working with. This is known as the
156 | :ref:`"width" <bit-width>` of an integer.
157 |
158 | For example, to represent -17 in 8 bits using two's complement:
159 |
160 | ::
161 |
162 |    x = -17
163 |    abs(x) = 17
164 |    bin(17) = 00010001
165 |    ~00010001 = 11101110
166 |    11101110 + 1 = 11101111
167 |
168 |
169 | The two's complement representation allows us to implement subtraction as
170 | addition of the unsigned interpretation of the two's
171 | complement value. For example, let's subtract 4 from 13. We will do this by
172 | adding -4 to 13.
173 |
174 | ::
175 |
176 |    bin(13) = 00001101
177 |    bin(4) = 00000100
178 |
179 | Negating a number in two's complement representation (either positive or
180 | negative) is done by inverting all of the bits and then adding one. So to get -4
181 | we would do:
182 |
183 | ::
184 |
185 |    bin(4) = 00000100
186 |    ~bin(4) = 11111011
187 |    ~bin(4) + 1 = 11111100
188 |
189 | Then we would do our addition from before:
190 |
191 | ::
192 |
193 |     111111
194 |      00001101
195 |    + 11111100
196 |    ----------
197 |     100001001
198 |
199 | which, when interpreted as an unsigned number, is 265 in base ten. If we take
200 | the last 8 bits of the result we get: ``00001001 = 9``. We know to take the
201 | last 8 bits because that is the bit width of the integers.
--------------------------------------------------------------------------------
/tutorial/source/conf.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | #
4 | # principles-of-performance documentation build configuration file, created by
5 | # sphinx-quickstart on Sat Apr 15 15:29:30 2018.
6 | # 7 | # This file is execfile()d with the current directory set to its 8 | # containing dir. 9 | # 10 | # Note that not all possible configuration values are present in this 11 | # autogenerated file. 12 | # 13 | # All configuration values have a default; values that are commented out 14 | # serve to show the default. 15 | 16 | # If extensions (or modules to document with autodoc) are in another directory, 17 | # add these directories to sys.path here. If the directory is relative to the 18 | # documentation root, use os.path.abspath to make it absolute, like shown here. 19 | # 20 | # import os 21 | # import sys 22 | # sys.path.insert(0, os.path.abspath('.')) 23 | 24 | 25 | # -- General configuration ------------------------------------------------ 26 | 27 | # If your documentation needs a minimal Sphinx version, state it here. 28 | # 29 | # needs_sphinx = '1.0' 30 | 31 | # Add any Sphinx extension module names here, as strings. They can be 32 | # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom 33 | # ones. 34 | extensions = [ 35 | 'sphinx.ext.autodoc', 36 | 'sphinx.ext.intersphinx', 37 | 'sphinx.ext.todo', 38 | 'IPython.sphinxext.ipython_console_highlighting', 39 | 'IPython.sphinxext.ipython_directive', 40 | 'matplotlib.sphinxext.plot_directive', 41 | ] 42 | 43 | # Add any paths that contain templates here, relative to this directory. 44 | templates_path = ['_templates'] 45 | 46 | # The suffix(es) of source filenames. 47 | # You can specify multiple suffix as a list of string: 48 | # 49 | # source_suffix = ['.rst', '.md'] 50 | source_suffix = '.rst' 51 | 52 | # The master toctree document. 53 | master_doc = 'index' 54 | 55 | # General information about the project. 
56 | project = 'principles-of-performance' 57 | copyright = '2018, Joe Jevnik' 58 | author = 'Joe Jevnik' 59 | 60 | # The version info for the project you're documenting, acts as replacement for 61 | # |version| and |release|, also used in various other places throughout the 62 | # built documents. 63 | # 64 | # The short X.Y version. 65 | version = '' 66 | # The full version, including alpha/beta/rc tags. 67 | release = '' 68 | 69 | # The language for content autogenerated by Sphinx. Refer to documentation 70 | # for a list of supported languages. 71 | # 72 | # This is also used if you do content translation via gettext catalogs. 73 | # Usually you set "language" from the command line for these cases. 74 | language = None 75 | 76 | # List of patterns, relative to source directory, that match files and 77 | # directories to ignore when looking for source files. 78 | # This patterns also effect to html_static_path and html_extra_path 79 | exclude_patterns = [] 80 | 81 | # If true, `todo` and `todoList` produce output, else they produce nothing. 82 | todo_include_todos = True 83 | 84 | 85 | # -- Options for HTML output ---------------------------------------------- 86 | 87 | # The theme to use for HTML and HTML Help pages. See the documentation for 88 | # a list of builtin themes. 89 | # 90 | html_theme = 'sphinx_rtd_theme' 91 | 92 | # Theme options are theme-specific and customize the look and feel of a theme 93 | # further. For a list of options available for each theme, see the 94 | # documentation. 95 | # 96 | # html_theme_options = {} 97 | 98 | # Add any paths that contain custom static files (such as style sheets) here, 99 | # relative to this directory. They are copied after the builtin static files, 100 | # so a file named "default.css" will overwrite the builtin "default.css". 
101 | html_static_path = ['_static'] 102 | 103 | 104 | # -- Options for HTMLHelp output ------------------------------------------ 105 | 106 | # Output file base name for HTML help builder. 107 | htmlhelp_basename = 'principles-of-performancedoc' 108 | 109 | 110 | # -- Options for LaTeX output --------------------------------------------- 111 | 112 | latex_elements = { 113 | # The paper size ('letterpaper' or 'a4paper'). 114 | # 115 | # 'papersize': 'letterpaper', 116 | 117 | # The font size ('10pt', '11pt' or '12pt'). 118 | # 119 | # 'pointsize': '10pt', 120 | 121 | # Additional stuff for the LaTeX preamble. 122 | # 123 | # 'preamble': '', 124 | 125 | # Latex figure (float) alignment 126 | # 127 | # 'figure_align': 'htbp', 128 | } 129 | 130 | # Grouping the document tree into LaTeX files. List of tuples 131 | # (source start file, target name, title, 132 | # author, documentclass [howto, manual, or own class]). 133 | latex_documents = [ 134 | (master_doc, 'principles-of-performance.tex', 'principles-of-performance Documentation', 135 | 'Joe Jevnik', 'manual'), 136 | ] 137 | 138 | 139 | # -- Options for manual page output --------------------------------------- 140 | 141 | # One entry per manual page. List of tuples 142 | # (source start file, name, description, authors, manual section). 143 | man_pages = [ 144 | (master_doc, 'principles-of-performance', 'principles-of-performance Documentation', 145 | [author], 1) 146 | ] 147 | 148 | 149 | # -- Options for Texinfo output ------------------------------------------- 150 | 151 | # Grouping the document tree into Texinfo files. 
List of tuples 152 | # (source start file, target name, title, author, 153 | # dir menu entry, description, category) 154 | texinfo_documents = [ 155 | (master_doc, 'principles-of-performance', 'principles-of-performance Documentation', 156 | author, 'principles-of-performance', 'One line description of project.', 157 | 'Miscellaneous'), 158 | ] 159 | 160 | 161 | intersphinx_mapping = { 162 | 'http://docs.python.org/dev': None, 163 | 'numpy': ('http://docs.scipy.org/doc/numpy/', None), 164 | 'scipy': ('http://docs.scipy.org/doc/scipy/reference/', None), 165 | 'pandas': ('http://pandas.pydata.org/pandas-docs/stable/', None), 166 | } 167 | -------------------------------------------------------------------------------- /tutorial/source/how-to-optimize-code.rst: -------------------------------------------------------------------------------- 1 | How to Optimize Code 2 | ==================== 3 | 4 | When considering the performance of code, the things to worry about in order of 5 | importance: 6 | 7 | 1. Algorithmic complexity. 8 | 2. Allocations and copies. 9 | 3. Memory access and cache performance. 10 | 4. Number of instructions. 11 | 12 | Most programs will achieve acceptable performance by only considering the first 13 | two points, however, when doing computationally intensive tasks like the numeric 14 | programming required for quantitative finance, the 3rd and 4th point may become 15 | important. 16 | 17 | Algorithmic Complexity 18 | ---------------------- 19 | 20 | Algorithmic complexity is a measure of the relationship between input size and 21 | computation time. 22 | 23 | *Big-O: how code slows as data grows.* 24 | 25 | Ned Batchelder 26 | 27 | The common technique for describing the time complexity of an algorithm is to 28 | compute the *worst case* performance of an algorithm. The standard way to 29 | communicate the worst case performance is through "Big O Notation". 

The intuition is to count the number of operations that happen for every
element of the input. For example:

.. code-block:: python

   def contains(haystack, needle):
       for value in haystack:
           if value == needle:
               return True

       return False

In this example, for each element in ``haystack``, we will perform 1 comparison,
meaning there is a linear relationship between the time spent in ``contains`` and
the length of ``haystack``. This means that ``contains`` is :math:`O(n)` with
relation to ``haystack``.

.. code-block:: python

   def contains_sorted(haystack, needle):
       lo, hi = 0, len(haystack)
       while lo < hi:
           mid_ix = (lo + hi) // 2
           midpoint = haystack[mid_ix]
           if midpoint == needle:
               return True
           if midpoint < needle:
               lo = mid_ix + 1
           else:
               hi = mid_ix
       return False

In this example, we cut the portion of ``haystack`` being searched in half in
each step of the loop. (Note that we narrow a pair of indices rather than
slicing: an expression like ``haystack[mid_ix + 1:]`` would allocate and copy a
new list at every step.) This means that for a haystack of 16 elements, in the
worst case (the value is not in the haystack), we will consider 16, 8, 4, 2, 1
values. This means we had 5 operations for 16 inputs. If we check 32, we will
get 32, 16, 8, 4, 2, 1, or 6 operations. This function actually scales
logarithmically with the size of ``haystack``. This means that
``contains_sorted`` is :math:`O(\log n)` with relation to ``haystack``.

The second function requires that the input is pre-sorted, but will perform much
better for large ``haystacks`` given that constraint.


Reality Check
`````````````

Algorithmic complexity is a good tool for quickly evaluating how an algorithm
will scale as the data **approaches infinity**. However, in the real world, we
are often working with finite data sets (even big data is finite).
When working
with finite data, it is important to remember the constants that get erased and
the derivative of the scaling function.

For example, here are three functions:

.. code-block:: python

   from time import sleep

   def f1(xs):
       for x in xs:
           pass

   def f2(xs):
       for x in xs:
           pass

       for x in xs:
           pass

   def f3(xs):
       # slow down, the data isn't going anywhere
       sleep(100)

The first function performs one operation, ``pass``, per element. Therefore this
function is linear. The second function performs 2 operations per element by
looping twice, so it is also linear. The third function performs 0 operations per
element, so it has constant time scaling. This function will run in the same
time regardless of the size of the input. However, the constant is 100
seconds, which may very well be slower than the linear solution for "small"
``xs``.

This plot shows a linear function, a logarithmic function, and a quadratic
function. Because of the particulars of these functions, the quadratic function
would be faster until around 600 elements. If the expected input size was less
than 600, then the quadratic algorithm would actually be the best choice!

.. plot::

   import pandas as pd
   import numpy as np

   xy = np.arange(1, 1001)
   log = np.log2(xy)
   x2 = (xy / 200) ** 2

   pd.DataFrame({
       '$O(n)$': xy / 50,
       r'$O(\log n)$': log,
       '$O(n^2)$': x2,
   }).plot()


Allocations and Copies
----------------------

:ref:`Memory allocations ` can be very expensive. Allocating memory
is itself potentially expensive because of the interaction with the operating
system as well as the bookkeeping needed to track the newly allocated
memory.
The other, less obvious reason why allocations are bad is that they
further spread your data across more distinct addresses, meaning you will get
worse cache locality with the data.

Copies have all of the same problems as allocations, with the addition of an
:math:`O(n)` operation to traverse the values being copied. Scanning a large
region of memory can evict the entire working set from the :ref:`L1 ` cache
because it is touching a lot of memory at once.

This isn't to say that you shouldn't allocate any memory. Programs sometimes
need to store their results in new allocations; however, be careful about it.

The Fastest Operation
---------------------

One of the most important tricks is to note that the fastest operation you can
do is nothing. If you are struggling to improve the performance of some code,
step back and think, "do I need to be doing this at all?" It can be easy to fall
into the trap of optimizing an algorithm as it exists, which may only be a
locally optimal solution, when the globally optimal solution is to call the
function fewer times, or not at all.

--------------------------------------------------------------------------------
/tutorial/source/index.rst:
--------------------------------------------------------------------------------

.. include:: ../../README.rst

..
toctree:: 4 | :maxdepth: 2 5 | :caption: Contents: 6 | 7 | bits 8 | low-level-computation 9 | arrays-and-structs 10 | memory-locality 11 | memory-management 12 | how-to-optimize-code 13 | python-overview 14 | numpy-overview 15 | profiling 16 | appendix 17 | 18 | 19 | Indices and tables 20 | ================== 21 | 22 | * :ref:`genindex` 23 | * :ref:`modindex` 24 | * :ref:`search` 25 | -------------------------------------------------------------------------------- /tutorial/source/low-level-computation.rst: -------------------------------------------------------------------------------- 1 | Low Level Computation 2 | ===================== 3 | 4 | Main Memory 5 | ----------- 6 | 7 | The memory of a computer is an ordered sequence of bits, which is broken up into 8 | small fixed sized pieces called :ref:`"bytes" `. A :ref:`byte ` is 9 | the smallest addressable unit of bits. This is eight :ref:`bits `, except 10 | for very old computers or some specialty devices. These :ref:`bytes ` are 11 | laid out as a flat sequence which are named according to their index. When 12 | discussing memory, an index is called an :ref:`"address"
`. To 13 | :ref:`"dereference" ` an address is to read the value stored at 14 | that address. 15 | 16 | .. image:: _static/memory-cells.png 17 | 18 | Computation Circuits 19 | -------------------- 20 | 21 | A general purpose computer can perform many different tasks, even tasks not 22 | known when the device was built. To do this, the computer has many small 23 | computation circuits that implement very low level functions, for example: add 24 | two numbers together. Using "add" as an example, at a high level this circuit 25 | will take two numbers and produce the sum. To do that, it needs a physical 26 | location to store the input bits and the output bits. These circuits take up 27 | physical space and material so it would be prohibitively expensive to build 28 | custom hardware to special case adding every address to every other address 29 | (quadratic in the number of addresses). This problem is magnified by the desire 30 | to perform many operations, like subtract, multiply, and probably others. To 31 | reduce the number of specific circuits needed, the computer provides a small set 32 | of locations to read and write data and provides operations for moving data from 33 | :ref:`main memory ` to and from these standard locations. These 34 | locations are individually called :ref:`"registers" `. For example, 35 | the ``add`` implementation for a 3 :ref:`register ` machine would 36 | require 9 adder circuits, where one of the inputs gets overwritten with the 37 | output. 38 | 39 | .. image:: _static/adders.png 40 | :alt: The adder circuits needed for a three register machine. 41 | 42 | Bitness 43 | ------- 44 | 45 | A :ref:`register ` is not limited to a single byte. If it were, our 46 | memory space would be limited to :math:`2^{8} = 256` bytes because that would be 47 | the largest address we could store in a single register. The number of bits in a 48 | register is the :ref:`"bitness" ` of the machine. 
This is what it means 49 | when you see "64 bit processor" or "32 bit processor". 50 | 51 | Increasing the bitness of the :ref:`registers ` is advantageous 52 | because it means the machine has special circuits for operating on larger 53 | integers and it increases the total addressable memory space. For example, the 54 | theoretical size of the memory space is: 55 | 56 | - 8 bits: :math:`2^{8} = 256` bytes 57 | - 16 bits: :math:`2^{16} = 65536B = 64KiB` 58 | - 32 bits: :math:`2^{32} = 4294967296B = 4GiB` 59 | - 64 bits: :math:`2^{64} = 18446744073709551616B = 16EiB` 60 | 61 | The downsides to increasing the bitness of the :ref:`registers ` is 62 | that it requires physically more material and space to build. Electricity only moves 63 | about one foot in a nanosecond, so routing the electricity around a lot of 64 | physical space takes time. Making things smaller also makes them harder to keep 65 | cool enough to function properly. Another downside is that in exchange for a 66 | larger memory space, the size of every address goes up. This trade-off is okay 67 | if you actually have more memory, but can lead to some complications. 68 | 69 | 70 | Sequencing Low Level Operations 71 | ------------------------------- 72 | 73 | Given our pile of bytes and very specific computation circuits, how can we 74 | perform useful computations? We need to first decompose our problem into these 75 | small atomic steps, and then instruct the computer which atomic steps we want to 76 | execute and in what order. To do this, we could either: 77 | 78 | 1. Build a custom circuit to sequence the low level circuits. 79 | 2. Encode the steps as data and store the sequence in memory. 80 | 81 | The problem with option 1 is that we either need to know what function we want 82 | to perform when we build the hardware, or we would need some sort of mechanism 83 | for re-synthesizing the circuits after the device is built. 
There is actually a 84 | type of device that does this, called an FPGA (Field Programmable Gate Array), 85 | but that is not the option most modern general purpose computers use. 86 | 87 | Option two requires that we can encode all of the tasks we want into bits and 88 | get the computer to read them when we want. 89 | 90 | Encoding Operations as Numbers 91 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 92 | 93 | A single encoded operation is referred to as an :ref:`"instruction" 94 | `. An instruction refers to both the function to perform paired 95 | with the arguments to act on. 96 | 97 | Encoding our low level operations as numbers is a reasonably straightforward 98 | task. Given that we have a finite number of atomic operations, we could just 99 | enumerate them and use the index as the value to store in memory. The downside 100 | with that technique is that it forces all instructions to be the same size, 101 | which means all instructions are as large as the largest possible 102 | instruction. In practice, not all instructions need the same amount of 103 | information. For example: ``inc``, which increments a value by one, is composed 104 | of two parts: 105 | 106 | 1. Something to indicate that this is an ``inc``. 107 | 2. Something to denote which register should be incremented. 108 | 109 | Where ``add`` requires 3 pieces: 110 | 111 | 1. Something to indicate that this is an ``add``. 112 | 2. Something to denote the register to read the first addend from. 113 | 3. Something to denote the register to read the second addend from. This 114 | register will then hold the result. 115 | 116 | In general, each instruction is encoded as an :ref:`"opcode" ` followed 117 | by a variable amount of space depending on the number of arguments needed. The 118 | processor knows how many bytes to read after the opcode because the number of 119 | arguments is fixed for any particular opcode. For example, in some fictional 120 | encoding we could encode: 121 | 122 | .. 
code-block:: asm

   inc %r1
   add %r2, %r3

as:

::

   00000001 00000001
   inc = 1  r1 = 1

   00000010 00000010 00000011
   add = 2  r2 = 2   r3 = 3

Given the complexity of all of the operations modern X86-64 computers can
perform, the actual encoding is very complicated and a single instruction can
span anywhere from 1 byte to 15 bytes!

Telling the Computer Where the Program Is
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

So we can now encode a computation as a series of atomic steps that our computer
can execute, but how does the computer read that?

When the computer launches, there is a small program hard coded into the device
that reads some startup code from your persistent storage and loads it into
memory at a known location. The processor then knows to read instructions
starting at this location and moving forward one instruction at a time. The
computer stores the current :ref:`address
` where the program is being
read in a special :ref:`register ` called the :ref:`"instruction
pointer" `.

The general execution flow for a program is:

1. :ref:`Dereference ` the :ref:`instruction pointer
   `.
2. Parse the given :ref:`instruction ` by reading the :ref:`opcode
   ` and any arguments.
3. Execute the :ref:`instruction `.
4. Increment the :ref:`instruction pointer ` by the size of
   the :ref:`instruction `.
5. Go to step 1.

.. note::

   There are cases where step 4 is altered or skipped. This happens when the
   instruction itself changes the instruction pointer.

--------------------------------------------------------------------------------
/tutorial/source/memory-locality.rst:
--------------------------------------------------------------------------------

Memory Locality
===============

The way objects are arranged in memory can have a dramatic impact on the
performance of a program. This is because, somewhat surprisingly, not all memory
accesses are equivalent. Putting related data near each other in memory is
very convenient as a programmer, like with :ref:`arrays ` and
:ref:`structs `; therefore, hardware has been optimized for this use
case.

Cache Hierarchy
---------------

Remember, electricity can move about 1 foot in a nanosecond, so as we add more
and more physical locations to store memory, the device grows in size and
complexity, and it takes more time to access the memory. The solution to this is
simply to maintain smaller copies of the memory closer to, or on, the CPU
itself. Putting the data physically closer makes it *much* faster to access
than normal memory.

The names for these caches start with ``L1``, which is the smallest and fastest
level, and grow upwards like ``L2`` and ``L3``. Most computers only have 2 or 3
levels total.
There is also a special term :ref:`LL` which refers to the largest
and slowest level, regardless of how many levels exist on the particular device.

For example, on an Intel Core i7-6600U, there are 3 cache levels:

- ``L1``: 64KiB (split 32KiB data and 32KiB instructions)
- ``L2``: 256KiB
- ``L3``: 4096KiB

This is expected to be a very small fraction of the total :ref:`main memory
`. On my computer, I have 16GiB of :ref:`main memory
`, which means that, as a fraction of my total memory, the levels
can store:

- ``L1``: :math:`1/262144`
- ``L2``: :math:`1/65536`
- ``L3``: :math:`1/4096`

Propagating Values
~~~~~~~~~~~~~~~~~~

The way the cache works is that instead of reading memory in units of one byte,
memory actually gets read in larger chunks called a :ref:`"cache line"
`. Modern processors use a cache line size of 64 bytes, or some power
of 2 around that size. Whenever you want to read data, you actually grab the
value along with a little bit of the data around it.

For example, assuming a cache line size of 8 bytes (instead of the normal 64),
let's say we want to read the yellow 2 byte value:

.. image:: _static/cache-0.png

Instead of just moving this two byte value to the cache, we move the entire
cache line that the value is in to the cache:

.. image:: _static/cache-1.png

Notice that this doesn't pull evenly from both sides of the value; instead, it
just treats each multiple of the cache line size as one atomic unit.

.. note::

   Remember when we defined the struct:

   .. code-block:: c

      {
          int32 a;
          int8 b;
          int16 c;
      }

   The 6th byte was unused for performance reasons. One reason to keep the
   values aligned is that you don't want to run the risk of having the two
   bytes of ``c`` in different cache lines.
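
As a quick sanity check of that padding, the standard-library ``struct`` module
can compute the size of this layout both with native alignment and fully
packed (a small sketch; the format characters ``i``, ``b``, and ``h``
correspond to int32, int8, and int16 on common platforms):

```python
import struct

# Native alignment ('@'): int32 a, int8 b, int16 c.
# The int16 must start on a 2-byte boundary, so one padding byte
# is inserted after b, giving 4 + 1 + 1 (pad) + 2 = 8 bytes.
print(struct.calcsize('@ibh'))  # 8 on typical platforms

# Packed ('='): no padding, the fields are laid out back to back.
print(struct.calcsize('=ibh'))  # 7
```

With 64-byte cache lines, that padding byte is what guarantees ``c`` never
straddles a cache line boundary when the struct itself starts at an aligned
address.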

Performance Impact
~~~~~~~~~~~~~~~~~~

To show the impact firsthand, you can time a simple function which loops
through an array touching every element in different orders.

On my Intel Core i7-6600U, touching 1000000 int64 values (8 bytes each) takes:

::

   -------------------------------------------------------------------
   Benchmark                            Time             CPU Iterations
   -------------------------------------------------------------------
   bench_random_access          49937987 ns     49752328 ns         15
   bench_forward_linear_access  12688049 ns     12675668 ns         55
   bench_reverse_linear_access  13032238 ns     13019142 ns         54

.. note::

   This timing does not include the generation of the random numbers. The source
   is in ``benchmarks/c++/bench/memory_order.cc``.

This shows that traversing a real array in random order is almost 4 times slower
than traversing the *same array* in linear order.

Multi-Dimensional Arrays
------------------------

Remember that we have some choices for how we lay out multi-dimensional arrays
in memory. For 2d arrays, we can either use:

Row Order:

.. image:: _static/row-order.png

Column Order:

.. image:: _static/column-order.png

As you can see, this affects which values are closer to each other in memory,
which we now know affects performance.

Example
-------

Given a 2d array of shape (10000, 10000), think about which orientation would be
best for:

- sum the columns (sum along axis 0)
- sum the rows (sum along axis 1)
- sum the whole array

..
code-block:: ipython 133 | 134 | In [1]: import numpy as np 135 | 136 | In [2]: row_major = np.random.random((10000, 10000)) 137 | 138 | In [3]: column_major = row_major.copy(order='F') 139 | 140 | In [4]: %timeit row_major.sum(axis=0) 141 | 60.8 ms ± 984 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) 142 | 143 | In [5]: %timeit row_major.sum(axis=1) 144 | 44.9 ms ± 431 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) 145 | 146 | In [6]: %timeit column_major.sum(axis=0) 147 | 47.4 ms ± 2.26 ms per loop (mean ± std. dev. of 7 runs, 10 loops each) 148 | 149 | In [7]: %timeit column_major.sum(axis=1) 150 | 61.3 ms ± 460 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) 151 | 152 | In [8]: %timeit row_major.sum() 153 | 42.9 ms ± 417 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) 154 | 155 | In [9]: %timeit column_major.sum() 156 | 42.1 ms ± 304 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) 157 | -------------------------------------------------------------------------------- /tutorial/source/memory-management.rst: -------------------------------------------------------------------------------- 1 | Memory Management 2 | ================= 3 | 4 | When dealing with memory, one of the things to do is keep track of which memory 5 | is in use at any given time. Basically, when a program decides that it would 6 | like to store some data, it needs to find a region in memory large enough to 7 | store the given value or values, but is not currently being used to store a 8 | value that we would like to read later. The term for this problem is 9 | :ref:`"memory management" `, and it is broken into two 10 | related parts: 11 | 12 | - :ref:`allocation `: reserving a region in memory for use 13 | - :ref:`deallocation `: marking that a formerly allocated region 14 | is now again free to be used for a future allocation. This is also called 15 | "freeing memory". 16 | 17 | .. 
note::

   There are many techniques for tracking this information with different
   performance trade-offs; for the rest of this content we will treat all
   allocators (algorithms for managing allocated and freed memory) as
   equivalent.

Virtual Memory
--------------

So far we have been discussing programs as though they are the only things
running; however, modern computers allow many programs to be run seemingly at
the same time. In order to prevent programs from reading or writing memory in
use by another program, modern CPUs support a feature called :ref:`"virtual
memory" `. The way virtual memory works is that a device intercepts
every memory read or write coming from the program and remaps it to a different
address in the physical :ref:`main memory ` or :ref:`processor
cache `. This device is referred to as an :ref:`"MMU" `, which
stands for "Memory Management Unit". Nowadays, this device is built directly
into the CPU itself because the two are so tightly coupled.

The operating system issues instructions that tell the :ref:`MMU` which virtual
memory space the program will be operating in. Once that is done, the operating
system moves the :ref:`instruction pointer ` to the program and
your program begins executing. The :ref:`MMU` will prevent the program from
reading any address which the program has not been assigned, and will issue a
hardware fault, which brings execution back to the operating system, if an
invalid memory access occurs. This is what prevents any random program from
reading your browser's memory to steal your password. The :ref:`MMU` can also be
used to enforce read only or no execute (memory cannot be used to store
instructions) on given regions of memory.

Due to virtual memory, two programs running at the same time on the same machine
may believe that they have been given the same :ref:`address
`. The :ref:`MMU` will know that process :math:`A` address :math:`N`
maps to physical address :math:`P`, but process :math:`B` address :math:`N` maps
to some different physical address :math:`Q`.

Two Tiers of Allocation
-----------------------

Because processes can only access the memory that the :ref:`MMU` and operating
system have given them, a program needs some way of requesting memory from the
operating system. Every operating system exposes this functionality to programs
in some way. It is expensive to switch execution between the program and the
operating system, so often programs request large blocks of memory at once. The
program will then implement its own :ref:`allocation ` algorithm to
distribute this memory as it needs.

When the process requests a large block of memory to distribute internally, it
may not release that memory right away. Just like :ref:`allocation `,
:ref:`deallocation ` requires telling the operating system and
:ref:`MMU` that the process is done with the memory. This is similarly
expensive to allocation. Therefore, processes often defer this if possible. This
can make it very difficult to tell how much memory a complicated program like
Python or R is using.

--------------------------------------------------------------------------------
/tutorial/source/numpy-overview.rst:
--------------------------------------------------------------------------------

Numpy Overview
==============

Python :ref:`allocates ` every object on the heap and only refers to
objects through pointers, requiring that we do at least one :ref:`memory
dereference ` to access the value at all. Python also requires many
dereferences to perform even simple operations like addition or
multiplication.
Both of these things are a massive drain on performance, so why
does Python get used in performance-sensitive fields like numerical computing?


*This is the paradox that we have to work with when we're doing scientific or
numerically-intensive Python. What makes Python fast for development -- this
high-level, interpreted, and dynamically-typed aspect of the language -- is
exactly what makes it slow for code execution.*

    Jake VanderPlas, Losing Your Loops: Fast Numerical Computing with NumPy,
    PyCon 2015

Numpy is a Python library which is built around a single core data structure,
the ``ndarray``, short for *N-dimensional array*. This single data structure is
what allows Python to be as fast or faster than other languages for doing
numeric computing, even with all of the other downsides Python has.

``ndarray``
-----------

An ``ndarray`` is composed of the following parts:

- ``dtype``
- shape
- memory buffer
- strides

``dtype``
~~~~~~~~~

The :class:`~numpy.dtype`, short for "data type", is an object which represents
the type of the elements to store in the array. For example,
``np.dtype('int32')`` represents 32 bit (4 byte) integers. Unlike a Python list,
all elements of an ``ndarray`` must be the same type.

Shape
~~~~~

The shape is a small array which represents the number of values along each
axis. For example:

- ``(10,)``: 1-dimensional array of length 10
- ``(5, 3)``: 2-dimensional array with 5 rows and 3 columns
- ``(2, 4, 6)``: 3-dimensional array
- ``()``: the empty sequence represents a scalar value; this comes up every so
  often but is not normally explicitly stated.

Axes are named by their index in the shape. For 2-dimensional arrays, axis 0
means rows and axis 1 means columns.
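
To make those conventions concrete, here is a small sketch using numpy itself
(the array contents are arbitrary):

```python
import numpy as np

a = np.arange(15).reshape(5, 3)  # 5 rows, 3 columns

print(a.shape)                # (5, 3)
print(a.sum(axis=0).shape)    # (3,) -- collapsing the rows leaves one value per column
print(a.sum(axis=1).shape)    # (5,) -- collapsing the columns leaves one value per row
print(np.float64(1.0).shape)  # () -- scalar values have the empty shape
```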
57 | 58 | Memory Buffer and Strides 59 | ~~~~~~~~~~~~~~~~~~~~~~~~~ 60 | 61 | The memory buffer is a low-level :ref:`array ` which holds the actual 62 | values for the array. This is deeply tied to the strides, which is an array 63 | representing the number of bytes needed to move one step along each axis. This 64 | is a generalization of the "row order" and "column order" layouts discussed 65 | earlier. It is possible to implement both row and column major orders by 66 | changing your data and strides. For example: 67 | 68 | Row Order: 69 | 70 | .. image:: _static/row-order-strides.png 71 | 72 | Column Order: 73 | 74 | .. image:: _static/column-order-strides.png 75 | 76 | The strides allow us to represent another very important concept, "strided 77 | views". Imagine we have a 2d array in row major order, but we want to take a 78 | view over a particular column. By shifting the base pointer and playing with the 79 | strides, it is possible to create an array which produces the value for each 80 | column but doesn't need to copy the data. 81 | 82 | .. image:: _static/column-slice-strides.png 83 | 84 | Operations 85 | ---------- 86 | 87 | Just storing data efficiently is not that exciting by itself. What makes numpy 88 | very powerful is that it gives efficient operations that act on the 89 | array. Because the data in an :class:`~numpy.ndarray` is both homogeneously typed 90 | and stored in a (usually) contiguous buffer, numpy can implement operations in a 91 | lower level language which optimize for memory access, cache utilization, and 92 | instruction count. 93 | 94 | Python features like iterators or ``sum`` helped optimize our loop because we 95 | could reduce the number of times we asked objects "how do you retrieve an 96 | element at an index?". Because :class:`~numpy.ndarray` objects hold 97 | homogeneously typed elements, we can ask "how do you do X operation with the 98 | given inputs" exactly once for the entire array. 
At this point, we can decide if
the operation is valid and, if it is, execute an optimized loop to perform that
operation given the operands.

For example, let's look at "sum". When summing a Python list of Python integers,
we need to ask every int how to add itself to another int. At every step we
need to re-check the operands and find the proper implementation of "add". Also,
we will need to :ref:`allocate ` a new integer object for every
intermediate sum. Using an :class:`~numpy.ndarray`, we already know up front
that the elements are all the same type. If we try to sum, we would check once,
"can elements of this dtype be added together?". If they can, it would find the
optimized implementation of "add" for the dtype once. Then it would jump into
the loop and start performing the sum. Because numpy doesn't need to keep all of
its state in Python objects, it can re-use the memory storing the
intermediate sum, which further removes :ref:`allocations `.

To show that this really adds up, let's compare a Python dot product with
numpy's implementation of dot product:

.. code-block:: ipython

   In [1]: import random

   In [2]: xs = [random.random() for _ in range(10000)]

   In [3]: ys = [random.random() for _ in range(10000)]

   In [4]: %timeit pythonic_dot(xs, ys)
   552 µs ± 8.65 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


.. code-block:: ipython

   In [1]: import numpy as np

   In [2]: xs = np.random.random(10000)

   In [3]: xs.dtype
   Out[3]: dtype('float64')

   In [4]: xs.shape
   Out[4]: (10000,)

   In [5]: xs.strides
   Out[5]: (8,)

   In [6]: ys = np.random.random(10000)

   In [7]: %timeit np.dot(xs, ys)
   2.45 µs ± 16.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

..
note:: 148 | 149 | These are both :math:`O(n)` implementations. 150 | 151 | Broadcasting 152 | ------------ 153 | 154 | Not only is numpy efficient, it is pleasant to use. One of the most pleasant 155 | features of numpy is called "broadcasting". Broadcasting happens when you want 156 | to perform a function with 2 or more arguments. A function here could also be an 157 | operator like ``+`` or ``*``. Broadcasting is a set of rules that allow us to 158 | align two array-like inputs so that we can formally define the operation. The 159 | steps for broadcasting are: 160 | 161 | 1. Align the *shape* of the two elements by left-extending with 1. 162 | 2. Compare the shapes axis by axis, if they are not equal, one side must be 1, 163 | otherwise the shapes are **not** compatible. 164 | 3. Convert any value equal to 1 in the shape to the max value on that dimension 165 | by repeating along that axis. At this point, the arrays will be the same 166 | shape. 167 | 4. Apply the scalar function element-wise. 168 | 169 | .. note:: 170 | 171 | Remember that the shape of a scalar is ``()``. 172 | 173 | Examples 174 | ~~~~~~~~ 175 | 176 | ``np.array([1, 2, 3]) + 10`` 177 | ```````````````````````````` 178 | 179 | let ``lhs = np.array([1, 2, 3])`` 180 | let ``rhs = 10`` 181 | 182 | ``lhs.shape == (3,)``, ``rhs.shape == ()`` 183 | 184 | 1. Left extend the shape of ``rhs`` with 1, giving us ``rhs.shape = (1,)`` and 185 | ``rhs = np.array([10])``. 186 | 2. Compare the shapes, ``3 != 1``; however, one of the values is 1. 187 | 3. Convert the ``1`` in the rhs shape to ``3`` by repeating along the axis. Now 188 | ``rhs = np.array([10, 10, 10])``. 189 | 4. 
Apply the scalar function element-wise: ``[1 + 10, 2 + 10, 3 + 10]`` 190 | 191 | Result: ``np.array([11, 12, 13])`` 192 | 193 | ``np.array([1, 2, 3]) * np.array([2, 3, 4])`` 194 | ````````````````````````````````````````````` 195 | 196 | let ``lhs = np.array([1, 2, 3])`` 197 | let ``rhs = np.array([2, 3, 4])`` 198 | 199 | ``lhs.shape == (3,)``, ``rhs.shape == (3,)`` 200 | 201 | 1. The shapes are already aligned. 202 | 2. The shapes are equal. 203 | 3. There are no ``1`` values in the shape. 204 | 4. Apply the scalar function element-wise: ``[1 * 2, 2 * 3, 3 * 4]`` 205 | 206 | Result: ``np.array([ 2, 6, 12])`` 207 | 208 | ``np.array([1, 2, 3]) / np.array([2, 4])`` 209 | `````````````````````````````````````````` 210 | 211 | let ``lhs = np.array([1, 2, 3])`` 212 | let ``rhs = np.array([2, 4])`` 213 | 214 | ``lhs.shape == (3,)``, ``rhs.shape == (2,)`` 215 | 216 | 1. The shapes are already aligned. 217 | 2. Along axis 0, ``3 != 2``. Neither ``3`` nor ``2`` is equal to ``1``. This 218 | means the shapes are **not** compatible. 219 | 220 | Result: Exception, these shapes are not compatible. 221 | 222 | ``np.array([2, 3, 4]) ** np.array([[1 / 2], [1 / 3]])`` 223 | ```````````````````````````````````````````````````````````````` 224 | 225 | let ``lhs = np.array([2, 3, 4])`` 226 | let ``rhs = np.array([[1 / 2], [1 / 3]])`` 227 | 228 | ``lhs.shape == (3,)``, ``rhs.shape == (2, 1)`` 229 | 230 | 1. Align the shapes by left extending the ``lhs`` with 1: ``lhs.shape == (1, 231 | 3)``. 232 | 2. Compare the shapes, ``1 != 2`` but there is a one. ``3 != 1`` but there is a 233 | one. 234 | 3. Convert the ``1`` values to the maximum value along that axis. This gives us: 235 | ``lhs.shape == (2, 3)`` and ``rhs.shape == (2, 3)``. 236 | 4. Apply the function element-wise: 237 | 238 | .. code-block:: python 239 | 240 | [[2 ** (1 / 2)], [3 ** (1 / 2)], [4 ** (1 / 2)], 241 | [2 ** (1 / 3)], [3 ** (1 / 3)], [4 ** (1 / 3)]] 242 | 243 | Result: 244 | 245 | .. 
code-block:: python 246 | 247 | array([[1.41421356, 1.73205081, 2. ], 248 | [1.25992105, 1.44224957, 1.58740105]]) 249 | 250 | .. note:: 251 | 252 | This algorithm is just an abstract representation of how alignment 253 | happens. In practice, numpy does not materialize the extended and aligned 254 | arrays, it just acts with the input data as is and plays tricks with the 255 | indexing. This is to reduce the number of allocations and copies improving 256 | performance. 257 | -------------------------------------------------------------------------------- /tutorial/source/profiling.rst: -------------------------------------------------------------------------------- 1 | Profiling 2 | ========= 3 | 4 | A :ref:`profiler ` is a tool that tracks the execution of a 5 | program. This is most commonly used to understand and measure the performance of 6 | a program. 7 | 8 | 9 | cProfile 10 | -------- 11 | 12 | cProfile is a :ref:`profiler ` that comes as part of the Python 13 | standard library. This means it is always available and does not need to be 14 | installed separately. cProfile is designed to trace the execution of a program 15 | collection information about the call graph. cProfile operates at the 16 | granularity of a function call. Here how to invoke cProfile: 17 | 18 | .. code-block:: python 19 | 20 | import cProfile 21 | 22 | p = cProfile.Profile() 23 | p.enable() 24 | 25 | # code to trace 26 | # ... 27 | 28 | p.disable() 29 | p.dump_stats('/path/to/save/results') 30 | 31 | 32 | pstats 33 | ------ 34 | 35 | cProfile only collects data, in order to analyze it we use a second tool called 36 | ``pstats``. pstats is an interactive command line tool for helping you explore 37 | the call graph and relative timings of sections. 
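The interactive prompt can also be driven programmatically through the
``pstats`` API, which is handy for scripting. A minimal sketch, assuming the
``out.stats`` filename and the profiled expression are just stand-ins for
real code:

```python
import cProfile
import pstats

# Profile a toy workload and dump the stats to a file, mirroring the
# enable/disable flow shown in the cProfile section above.
p = cProfile.Profile()
p.enable()
total = sum(i * i for i in range(100000))
p.disable()
p.dump_stats("out.stats")

# Load the dump and print the 10 most expensive entries by cumulative
# time -- the programmatic equivalent of sorting by cumtime and running
# "stats 10" at the prompt.
stats = pstats.Stats("out.stats")
stats.sort_stats("cumtime").print_stats(10)
```

The same dump can be explored interactively with ``python -m pstats
out.stats``, which starts the prompt used in the examples below.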
The key operations in pstats:

- stats
- sorting
- callees
- callers

stats
~~~~~

The ``stats`` command displays either the stats for a single function or the
top n values based on the current sort.

::

   out.stats% stats 10

prints the top 10 values based on the current sort.

::

   out.stats% stats pattern1 pattern2 ...

prints the stats for functions whose ``filepath.py:line-number(function)``
entry matches all of the patterns, where the patterns are regular
expressions.

For example:

::

   Fri Oct 19 05:29:01 2018    out.stats

            314925129 function calls (314813999 primitive calls) in 352.943 seconds

      Ordered by: call count
      List reduced from 344 to 2 due to restriction <'bit_enum.py'>

      ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    29290879   32.323    0.000   32.323    0.000 /home/joe/projects/python/slider/slider/bit_enum.py:47()
    29290879   96.887    0.000  144.744    0.000 /home/joe/projects/python/slider/slider/bit_enum.py:33(unpack)


   out.stats% stats bit_enum.py:47
   Fri Oct 19 05:29:01 2018    out.stats

            314925129 function calls (314813999 primitive calls) in 352.943 seconds

      Ordered by: call count
      List reduced from 344 to 1 due to restriction <'bit_enum.py:47'>

      ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    29290879   32.323    0.000   32.323    0.000 /home/joe/projects/python/slider/slider/bit_enum.py:47()


   out.stats% stats bit_enum.py unpack
   Fri Oct 19 05:29:01 2018    out.stats

            314925129 function calls (314813999 primitive calls) in 352.943 seconds

      Ordered by: call count
      List reduced from 344 to 2 due to restriction <'bit_enum.py'>
      List reduced from 2 to 1 due to restriction <'unpack'>

      ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    29290879   96.887    0.000  144.744    0.000 /home/joe/projects/python/slider/slider/bit_enum.py:33(unpack)


sorting
~~~~~~~

pstats allows you to sort functions by the following criteria:

- ``tottime``: The total time spent in this function, excluding functions
  called by this function.
- ``cumtime``: The cumulative time spent in this function, including the time
  spent in all functions called by this function.
- ``ncalls``: The total number of calls to the function.

Sorting by ``tottime`` is useful for finding the meaty functions where a lot
of work is actually being done. The highest ``tottime`` functions are worth
looking over to see if there are easy optimizations to make.

Sorting by ``cumtime`` is useful for getting a sense of which high level
operations are taking a long time. This helps you see the chain of events
that leads to the most time being spent.

Sorting by ``ncalls`` is useful for identifying algorithmic issues. If you
see a function with a much higher than expected call count, it may indicate
that your high level algorithm is implemented incorrectly. Functions with a
high ``tottime`` and a high ``ncalls`` are especially important to look out
for. For example:

::

     ncalls  tottime  percall
   29290879   96.887    0.000

Here we are spending almost no time at all in any individual call, but
summing those near-zero values grows to a very large amount of time. Trying
to micro-optimize this function may or may not help; you should first try to
evaluate *why* the function is being called so many times.

callees
~~~~~~~

The ``callees`` command prints the stats for all the functions called by the
target function. The callee functions are printed in the order of the
currently active sort. This command is useful for understanding where a
function's cumulative time comes from.

Example:

::

   out.stats% callees _consume_actions
      Ordered by: cumulative time
      List reduced from 344 to 1 due to restriction <'_consume_actions'>

   Function                                                                 called...
                                                                                ncalls  tottime  cumtime
   /home/joe/projects/python/slider/slider/replay.py:132(_consume_actions) -> 29287456   96.881  144.685 /home/joe/projects/python/slider/slider/bit_enum.py:33(unpack)
                                                                             29287456   12.551   12.551 /home/joe/projects/python/slider/slider/replay.py:43(__init__)
                                                                                 3423    0.002    0.004 /home/joe/projects/python/slider/slider/replay.py:75(_consume_int)
                                                                                 3423    0.025    8.845 /usr/lib64/python3.6/lzma.py:322(decompress)
                                                                             29287456    7.258   13.943 :12(__new__)
                                                                             29287456    2.627    2.627 {method 'append' of 'list' objects}
                                                                             29290879    8.439    8.439 {method 'split' of 'bytes' objects}

callers
~~~~~~~

The ``callers`` command prints the functions which called the target
function. The functions are printed in the order of the currently active
sort. The ``callers`` command is useful if you want to understand where a
high ``ncalls`` function is being called from.

Example::

   out.stats% callers unpack
      Ordered by: call count
      List reduced from 344 to 1 due to restriction <'unpack'>

   Function                                                        was called by...
                                                                       ncalls  tottime  cumtime
   /home/joe/projects/python/slider/slider/bit_enum.py:33(unpack) <- 29287456   96.881  144.685 /home/joe/projects/python/slider/slider/replay.py:132(_consume_actions)
                                                                         3423    0.006    0.059 /home/joe/projects/python/slider/slider/replay.py:626(parse)

Here we can see that ``unpack`` is called from 2 places in ``replay.py``;
however, with ``callers`` it is clear that almost all of the calls are coming
from ``_consume_actions``.
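Putting it all together: the sketch below profiles a pair of made-up
functions (``parse`` and ``unpack_action`` are hypothetical stand-ins, not
the slider code from the examples above) and then asks the same callers and
callees questions programmatically:

```python
import cProfile
import pstats

def unpack_action(raw):
    # Hypothetical hot helper: cheap per call, but called once per element.
    return int(raw) * 2

def parse(data):
    # Hypothetical high-level entry point that drives the hot helper.
    return [unpack_action(x) for x in data]

# Profile one parse over 50000 elements and dump the stats.
p = cProfile.Profile()
p.enable()
parse(range(50000))
p.disable()
p.dump_stats("parse.stats")

stats = pstats.Stats("parse.stats")
stats.sort_stats("ncalls")
stats.print_callers("unpack_action")  # which call sites produce the high ncalls?
stats.print_callees("parse")          # where does parse's cumtime actually go?
```

Because ``unpack_action`` runs once per element, it dominates ``ncalls``, and
``print_callers`` pins all of those calls on ``parse`` -- the same diagnosis
the interactive ``callers`` command gave for ``_consume_actions`` above.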
--------------------------------------------------------------------------------
/tutorial/source/python-overview.rst:
--------------------------------------------------------------------------------

Python Overview
===============

.. note::

   Some of this content is specific to CPython, which is the most common
   implementation of Python. This is the program that runs when you run
   ``$ python`` in a terminal.

Python is a high level programming language, meaning that it is designed to
abstract away the details of the machine and instead present a simpler
interface. The Python programming language provides a few important features
that make programming easier:

1. Automatic :ref:`Memory Management `.
2. Object oriented programming.
3. Dynamic typing.

Python Objects
--------------

In object oriented programming (OOP), :ref:`objects ` are values paired with
their operations. For example, to execute the code ``a + b``, we need to
inspect ``a`` at runtime and ask ``a`` how it performs addition.

To store this behavior, Python objects need to carry around a collection of
implementations for all of the operations they support, in a way that may be
accessed dynamically. One way to do this, given a fixed set of operations
which may be performed, would be to use a :ref:`struct ` to store the
addresses of the operations at fixed offsets from some base address. For
example:

.. code-block:: c

   struct type {
       pointer add_address;
       pointer sub_address;
       pointer mul_address;
       pointer div_address;
       ...
   };

Then, values can be a struct like:

.. code-block:: c

   struct value {
       pointer type;
       // type specific data goes here
   };

Because the size of each value differs depending on the type, which cannot be
known ahead of time because Python is dynamically typed, Python just refers
to all objects through a pointer to the ``value`` struct. All we know is that
the first member of the ``value`` struct will be a pointer to some collection
of functions which are designed to know the true size of the object and how
to interpret the data. This means that, at minimum, we must do one
:ref:`memory dereference ` to perform any operation on a Python object.

Using this model, let's walk through the execution of:

.. code-block:: python

   a + b

1. :ref:`Dereference ` ``a``.
2. :ref:`Dereference ` ``a``\'s type.
3. Jump to the implementation of ``add`` for the type of ``a``.
4. :ref:`Dereference ` ``b``.
5. Check if the type of ``b`` can be added to the type of ``a``.

   - If not, throw an exception.

6. :ref:`Allocate ` memory to store the result of the addition.
7. Perform the addition and store the result in the newly allocated memory.

Here is what all of the memory :ref:`dereferences ` look like for ``5 + 3``.

.. image:: _static/addition-dereferences.png

Overhead
~~~~~~~~

All of this extra "pointer chasing", runtime type checking, and allocation
*really* adds up. For example, let's inspect a simple dot product function:

.. code-block:: python

   def dot(xs, ys):
       out = 0
       ix = 0
       while ix < len(xs):
           x = xs[ix]
           y = ys[ix]
           out += x * y
           ix += 1
       return out

Let's see how this function performs:

.. code-block:: ipython

   In [2]: xs = [1, 2, 3]

   In [3]: ys = [4, 5, 6]

   In [4]: dot(xs, ys)
   Out[4]: 32

   In [5]: 1 * 4 + 2 * 5 + 3 * 6
   Out[5]: 32

   In [7]: xs = [random.random() for _ in range(10000)]

   In [8]: ys = [random.random() for _ in range(10000)]

   In [9]: dot(xs, ys)
   Out[9]: 2493.0449981169236

   In [10]: %timeit dot(xs, ys)
   1.52 ms ± 15.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

1.5 milliseconds to take the dot product of 10000 elements seems pretty
quick, but what about a more pythonic implementation of ``dot``?

.. code-block:: python

   def pythonic_dot(xs, ys):
       return sum(x * y for x, y in zip(xs, ys))

.. code-block:: ipython

   In [12]: %timeit pythonic_dot(xs, ys)
   552 µs ± 8.65 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

This function is shorter and better expresses our intent. It is also
considerably faster. Why is that? In short, Python's built-in functions like
``zip`` and ``sum`` take advantage of the repetition of accessing elements.
Instead of constantly asking the object, "how should I retrieve elements from
you?", they ask the question once and re-use the answer many times. This
reduces the overall number of memory accesses and instructions needed.
--------------------------------------------------------------------------------
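The ``%timeit`` comparison above can be reproduced outside IPython with the
standard ``timeit`` module. The two dot product definitions are lightly
condensed from the tutorial; the absolute numbers will vary by machine, but
on CPython the ``zip``/``sum`` version is typically the faster of the two:

```python
import random
import timeit

def dot(xs, ys):
    # The explicit index-based loop from the tutorial.
    out = 0
    ix = 0
    while ix < len(xs):
        out += xs[ix] * ys[ix]
        ix += 1
    return out

def pythonic_dot(xs, ys):
    # The version built on zip/sum, which hoists the per-element checks.
    return sum(x * y for x, y in zip(xs, ys))

xs = [random.random() for _ in range(10000)]
ys = [random.random() for _ in range(10000)]

# Both implementations sum the same products in the same order,
# so they agree exactly.
assert dot(xs, ys) == pythonic_dot(xs, ys)

n = 200
loop_us = timeit.timeit(lambda: dot(xs, ys), number=n) / n * 1e6
zipsum_us = timeit.timeit(lambda: pythonic_dot(xs, ys), number=n) / n * 1e6
print(f"manual loop: {loop_us:.0f} µs per call")
print(f"zip/sum:     {zipsum_us:.0f} µs per call")
```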