├── README.md
├── utils.pyx
├── setup.py
├── 10 minutes to cython.md
├── Python multi process.md
├── .gitignore
├── Python multi threads.md
├── More efficient pandas.md
├── 10 minutes to cython.ipynb
├── Python multi process.ipynb
├── Python Standard Library.md
├── Python coroutines.ipynb
├── More efficient array.md
├── Itertools for efficient looping.md
├── Python multi threads.ipynb
├── More efficient pandas.ipynb
├── Python Standard Library.ipynb
├── Built-in method.md
├── More efficient array.ipynb
├── Using C++ in Cython.md
├── Itertools for efficient looping.ipynb
├── Using C++ in Cython.ipynb
└── Built-in method.ipynb


/README.md:
--------------------------------------------------------------------------------
1 | # flying-python
2 | How to write fast and efficient Python code.
3 | 


--------------------------------------------------------------------------------
/utils.pyx:
--------------------------------------------------------------------------------
 1 | import cython
 2 | import numpy as np
 3 | cimport numpy as cnp
 4 | ctypedef cnp.int_t DTYPE_t
 5 | 
 6 | 
 7 | @cython.boundscheck(False)
 8 | @cython.wraparound(False)
 9 | cpdef cnp.ndarray[DTYPE_t] _transform(cnp.ndarray[DTYPE_t] arr):
10 |     cdef:
11 |         int i = 0
12 |         int n = arr.shape[0]
13 |         int x
14 |         cnp.ndarray[DTYPE_t] new_arr = np.empty_like(arr)
15 | 
16 |     while i < n:
17 |         x = arr[i]
18 |         if x % 2:
19 |             new_arr[i] = x + 1
20 |         else:
21 |             new_arr[i] = x - 1
22 |         i += 1
23 |     return new_arr
24 | 


--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
 1 | """
 2 | @Author: tushushu
 3 | @Date: 2019-06-20 10:32:30
 4 | """
 5 | 
 6 | from setuptools import setup
 7 | from Cython.Build import cythonize
 8 | import numpy
 9 | 
10 | 
11 | def compile_file(file_name: str):
12 |     """Compile pyx file."""
13 | 
14 |     ext_modules = cythonize(file_name)
15 |     name = file_name.split(".")[0] if "." in file_name else file_name
16 |     setup(name=name, ext_modules=ext_modules, include_dirs=[numpy.get_include()])
17 | 
18 | 
19 | if __name__ == "__main__":
20 |     compile_file("utils.pyx")
21 | 
22 | # source activate py36
23 | # python setup.py build_ext --inplace
24 | 


--------------------------------------------------------------------------------
/10 minutes to cython.md:
--------------------------------------------------------------------------------
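 1 | # Cython in 10 Minutes
 2 | Author: tushushu  
 3 | Project: https://github.com/tushushu/flying-python
 4 | 
 5 | ## 1. What is Cython?
 6 | Cython is a programming language that uses Python-like syntax to write C extensions that Python can call. It keeps Python's fast development cycle, lets code run at close to C speed, and makes it easy to call C libraries.
 7 | 
 8 | ## 2. How do I install Cython?
 9 | Unlike most Python libraries, Cython needs a C compiler, and the setup differs from platform to platform.
10 | ### 2.1 Set up gcc
11 | - **Windows**  
12 | Install the MinGW-w64 compiler: ``conda install libpython m2w64-toolchain -c msys2``  
13 | In the Python installation directory, find the \Lib\distutils folder and create distutils.cfg with the following content:  
14 | ``[build]`` on the first line, ``compiler=mingw32`` on the second
15 | 
16 | - **macOS**   
17 | Install Xcode.  
18 | 
19 | - **Linux**  
20 | gcc is usually already set up; if it is not, run this command (Debian/Ubuntu): ``sudo apt-get install build-essential``  
21 | 
22 | 
23 | ### 2.2 Install the cython package
24 | - Without Anaconda: ``pip install cython`` 
25 | - With Anaconda: ``conda install cython``
26 | 
27 | ## 3. Using Cython in Jupyter Notebook
28 | - First load the Cython extension with the magic command ``%load_ext Cython``
29 | - Then run Cython code with the cell magic ``%%cython``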
30 | 
31 | 
32 | ```python
33 | %load_ext Cython
34 | ```
35 | 
36 | 
37 | ```cython
38 | %%cython
39 | # Sum the natural numbers from 1 to 100
40 | total = 0
41 | for i in range(1, 101):
42 |     total += i
43 | print(total)
44 | ```
45 | 
46 |     5050
47 | 
48 | 
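49 | ## 4. How fast is Cython?
50 | - Python function: 261 ns per call
51 | - Cython function: 44.1 ns per call  
52 | 
53 | The run time drops to roughly one sixth of the original. The key is that the parameter x is declared with the static type int, which avoids the overhead of run-time type checks.
54 | 
55 | ### 4.1 Python function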
56 | 
57 | 
58 | ```python
59 | def f(x):
60 |     return x ** 2 - x
61 | ```
62 | 
63 | 
64 | ```python
65 | %timeit f(100)
66 | ```
67 | 
68 |     261 ns ± 8.78 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
69 | 
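Outside Jupyter, a comparable measurement can be taken with the standard-library `timeit` module. The sketch below times only the pure-Python function `f` from this section; the helper name `ns_per_call` and its default arguments are assumptions made for illustration, and the figure it prints depends on the machine.

```python
import timeit


def f(x):
    return x ** 2 - x


def ns_per_call(func, arg, number=100000, repeat=5):
    """Return the best observed time per call, in nanoseconds."""
    timer = timeit.Timer(lambda: func(arg))
    # Each entry is the total time for `number` calls; keep the fastest run.
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e9


print("f(100) takes about %.1f ns per call" % ns_per_call(f, 100))
```

Taking the minimum over several repeats reduces the influence of background noise, whereas `%timeit` reports a mean and standard deviation, so the two figures are not directly interchangeable.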
70 | 
71 | ### 4.2 Cython function
72 | 
73 | 
74 | ```cython
75 | %%cython
76 | def g(int x):
77 |     return x ** 2 - x
78 | ```
79 | 
80 | 
81 | ```python
82 | %timeit g(100)
83 | ```
84 | 
85 |     44.1 ns ± 1.09 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
86 | 
87 | 
88 | ## References
89 | Parts of this article are drawn from the [official Cython documentation](http://docs.cython.org/en/latest/index.html)
90 | 
91 | 
95 | 


--------------------------------------------------------------------------------
/Python multi process.md:
--------------------------------------------------------------------------------
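 1 | # Multitasking in Python (Multiprocessing)
 2 | Author: tushushu  
 3 | Project: https://github.com/tushushu/flying-python
 4 | 
 5 | ## Using multiple processes for CPU-bound tasks
 6 | CPU-bound tasks do a large amount of computation and consume CPU resources, for example computing pi or decoding HD video; they depend entirely on the CPU's processing power. While a thread runs a CPU-bound task the CPU stays busy, and CPython makes it hand the GIL to other threads after each switch interval (5 ms by default since Python 3.2). With the added cost of switching threads, the result can be slower than serial code. In Multitasking in Python (Multithreading), we tried to run CPU-bound tasks with multiple threads and saw no performance gain. Now let's try multiple processes instead.
 7 | 
 8 | ### 1. Create a process pool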
 9 | 
10 | 
11 | ```python
12 | from concurrent.futures import ProcessPoolExecutor
13 | from time import sleep, time
14 | import os
15 | print("Number of CPU cores: %s" % os.cpu_count())
16 | ```
17 | 
18 |     Number of CPU cores: 8
19 | 
20 | 
21 | 
22 | ```python
23 | # Number of workers
24 | N = 8
25 | # Create the process pool
26 | pool = ProcessPoolExecutor(max_workers=N)
27 | ```
28 | 
29 | ### 2. Define a CPU-bound function
30 | The function sums the integers in [1, x].
31 | 
32 | 
33 | ```python
34 | def cpu_bound_func(x):
35 |     tot = 0
36 |     a = 1
37 |     while a <= x:
38 |         tot += a
39 |         a += 1
40 |     print("Finish sum from 1 to %d!" % x)
41 |     return tot
42 | ```
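A quick way to check a summation loop like this one is to compare it with the closed form 1 + 2 + ... + x = x(x + 1)/2. In the sketch below, `sum_to` is a hypothetical stand-in that leaves out the print call; it is meant as a sanity check, not as part of the benchmark.

```python
def sum_to(x):
    """Sum the integers in [1, x] with an explicit loop."""
    tot = 0
    a = 1
    while a <= x:
        tot += a  # add the loop counter, not the upper bound
        a += 1
    return tot


x = 1000
# The loop result should match the closed-form formula x * (x + 1) / 2.
print(sum_to(x) == x * (x + 1) // 2)
```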
43 | 
44 | ### 3. Serial processing
45 | Iterate over every element of a list and call cpu_bound_func on each.
46 | 
47 | 
48 | ```python
49 | def process_array(arr):
50 |     for x in arr:
51 |         cpu_bound_func(x)
52 | ```
53 | 
54 | ### 4. Multiprocess processing
55 | The process pool's map method applies the same function to every element of the list.
56 | 
57 | 
58 | ```python
59 | def fast_process_array(arr):
60 |     for x in pool.map(cpu_bound_func, arr):
61 |         pass
62 | ```
63 | 
64 | ### 5. Measure the run time
65 | - Serial version: 5.7 seconds
66 | - Multiprocess version: 1.6 seconds
67 | 
68 | 
69 | ```python
70 | def time_it(fn, *args):
71 |     start = time()
72 |     fn(*args)
73 |     print("%s version ran in %.5f seconds!" % (fn.__name__, time() - start))
74 | ```
75 | 
76 | 
77 | ```python
78 | time_it(process_array, [10**7 for _ in range(8)])
79 | ```
80 | 
81 |     Finish sum from 1 to 10000000!
82 |     Finish sum from 1 to 10000000!
83 |     Finish sum from 1 to 10000000!
84 |     Finish sum from 1 to 10000000!
85 |     Finish sum from 1 to 10000000!
86 |     Finish sum from 1 to 10000000!
87 |     Finish sum from 1 to 10000000!
88 |     Finish sum from 1 to 10000000!
89 |     process_array version ran in 5.74394 seconds!
90 | 
91 | 
92 | 
93 | ```python
94 | time_it(fast_process_array, [10**7 for _ in range(8)])
95 | ```
96 | 
97 |     fast_process_array version ran in 1.62266 seconds!
98 | 
99 | 


--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
  1 | # Byte-compiled / optimized / DLL files
  2 | __pycache__/
  3 | *.py[cod]
  4 | *$py.class
  5 | 
  6 | # C extensions
  7 | *.so
  8 | 
  9 | # Distribution / packaging
 10 | .Python
 11 | build/
 12 | develop-eggs/
 13 | dist/
 14 | downloads/
 15 | eggs/
 16 | .eggs/
 17 | lib/
 18 | lib64/
 19 | parts/
 20 | sdist/
 21 | var/
 22 | wheels/
 23 | pip-wheel-metadata/
 24 | share/python-wheels/
 25 | *.egg-info/
 26 | .installed.cfg
 27 | *.egg
 28 | MANIFEST
 29 | 
 30 | # PyInstaller
 31 | #  Usually these files are written by a python script from a template
 32 | #  before PyInstaller builds the exe, so as to inject date/other infos into it.
 33 | *.manifest
 34 | *.spec
 35 | 
 36 | # Installer logs
 37 | pip-log.txt
 38 | pip-delete-this-directory.txt
 39 | 
 40 | # Unit test / coverage reports
 41 | htmlcov/
 42 | .tox/
 43 | .nox/
 44 | .coverage
 45 | .coverage.*
 46 | .cache
 47 | nosetests.xml
 48 | coverage.xml
 49 | *.cover
 50 | .hypothesis/
 51 | .pytest_cache/
 52 | 
 53 | # Translations
 54 | *.mo
 55 | *.pot
 56 | 
 57 | # Django stuff:
 58 | *.log
 59 | local_settings.py
 60 | db.sqlite3
 61 | db.sqlite3-journal
 62 | 
 63 | # Flask stuff:
 64 | instance/
 65 | .webassets-cache
 66 | 
 67 | # Scrapy stuff:
 68 | .scrapy
 69 | 
 70 | # Sphinx documentation
 71 | docs/_build/
 72 | 
 73 | # PyBuilder
 74 | target/
 75 | 
 76 | # Jupyter Notebook
 77 | .ipynb_checkpoints
 78 | 
 79 | # IPython
 80 | profile_default/
 81 | ipython_config.py
 82 | 
 83 | # pyenv
 84 | .python-version
 85 | 
 86 | # pipenv
 87 | #   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
 88 | #   However, in case of collaboration, if having platform-specific dependencies or dependencies
 89 | #   having no cross-platform support, pipenv may install dependencies that don't work, or not
 90 | #   install all needed dependencies.
 91 | #Pipfile.lock
 92 | 
 93 | # celery beat schedule file
 94 | celerybeat-schedule
 95 | 
 96 | # SageMath parsed files
 97 | *.sage.py
 98 | 
 99 | # Environments
100 | .env
101 | .venv
102 | env/
103 | venv/
104 | ENV/
105 | env.bak/
106 | venv.bak/
107 | 
108 | # Spyder project settings
109 | .spyderproject
110 | .spyproject
111 | 
112 | # Rope project settings
113 | .ropeproject
114 | 
115 | # mkdocs documentation
116 | /site
117 | 
118 | # mypy
119 | .mypy_cache/
120 | .dmypy.json
121 | dmypy.json
122 | 
123 | # Pyre type checker
124 | .pyre/
125 | 
126 | # VS Code
127 | .vscode/
128 | .ipynb_checkpoints/
129 | 


--------------------------------------------------------------------------------
/Python multi threads.md:
--------------------------------------------------------------------------------
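  1 | # Multitasking in Python (Multithreading)
  2 | Author: tushushu  
  3 | Project: https://github.com/tushushu/flying-python
  4 | 
  5 | ## 1. GIL
  6 | 
  7 | As anyone familiar with Python knows, the Python interpreter written in C (CPython) has a global interpreter lock (GIL). Because of this lock, the interpreter can run the code of only one thread at any given moment, which greatly limits the performance of Python multithreading. For historical reasons, the lock is now very hard to remove. 
  8 |   
  9 | The GIL affects multithreaded performance because a thread can run only while it holds the global lock, and there is only one such lock. With Python threads, only one thread executes at any instant, so even on a multi-core machine the program gets only single-core performance. 
 10 | 
 11 | 
 12 | ## 2. Using multiple threads for IO-bound tasks
 13 | In an IO-bound task the CPU is much faster than the disk or memory, so most of the time the CPU is waiting for I/O (disk/memory) reads and writes to finish, and CPU load stays low. Tasks that involve network or disk IO are IO-bound. While one thread waits on IO the CPU is idle, so the GIL is released to other threads, which shortens the total waiting time.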
 14 | 
 15 | 
 16 | ```python
 17 | from concurrent.futures import ThreadPoolExecutor
 18 | from time import sleep, time
 19 | ```
 20 | 
 21 | 
 22 | ```python
 23 | # Number of workers
 24 | N = 4
 25 | # Create the thread pool
 26 | pool = ThreadPoolExecutor(max_workers=N)
 27 | ```
 28 | 
 29 | ### 2.1 Define an IO-bound function
 30 | The function "sleeps" for x seconds.
 31 | 
 32 | 
 33 | ```python
 34 | def io_bound_func(x):
 35 |     sleep(x)
 36 |     print("Sleep for %d seconds." % x)
 37 | ```
 38 | 
 39 | ### 2.2 Serial processing
 40 | Iterate over every element of a list and call io_bound_func on each.
 41 | 
 42 | 
 43 | ```python
 44 | def process_array(arr):
 45 |     for x in arr:
 46 |         io_bound_func(x)
 47 | ```
 48 | 
 49 | ### 2.3 Multithreaded processing
 50 | The thread pool's map method applies the same function to every element of the list.
 51 | 
 52 | 
 53 | ```python
 54 | def fast_process_array(arr):
 55 |     for x in pool.map(io_bound_func, arr):
 56 |         pass
 57 | ```
 58 | 
 59 | ### 2.4 Measure the run time
 60 | - Serial version: 1 + 2 + 3 = 6 seconds  
 61 | - Multithreaded version: max(1, 2, 3) = 3 seconds
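
The sum-versus-max arithmetic above can be reproduced in a self-contained sketch. The delays are scaled down to tenths of a second so it finishes quickly; the helper names `timed`, `run_serial` and `run_threaded` are assumptions made for illustration and do not appear in the original notebook.

```python
from concurrent.futures import ThreadPoolExecutor
from time import perf_counter, sleep


def io_bound_func(x):
    """Simulate an IO wait of x seconds."""
    sleep(x)
    return x


def timed(fn, *args):
    """Return the wall-clock seconds taken by fn(*args)."""
    start = perf_counter()
    fn(*args)
    return perf_counter() - start


def run_serial(delays):
    for d in delays:
        io_bound_func(d)


def run_threaded(delays):
    # One worker per task, so all the sleeps overlap.
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        list(pool.map(io_bound_func, delays))


delays = [0.1, 0.2, 0.3]  # scaled-down stand-ins for 1, 2 and 3 seconds
serial = timed(run_serial, delays)      # roughly sum(delays)
threaded = timed(run_threaded, delays)  # roughly max(delays)
print("serial: %.2f s, threaded: %.2f s" % (serial, threaded))
```

Using the pool as a context manager waits for all workers to finish before the timer stops, so the threaded figure includes the longest sleep.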
 62 | 
 63 | 
 64 | ```python
 65 | def time_it(fn, *args):
 66 |     start = time()
 67 |     fn(*args)
 68 |     print("%s version ran in %.5f seconds!" % (fn.__name__, time() - start))
 69 | ```
 70 | 
 71 | 
 72 | ```python
 73 | time_it(process_array, [1, 2, 3])
 74 | ```
 75 | 
 76 |     Sleep for 1 seconds.
 77 |     Sleep for 2 seconds.
 78 |     Sleep for 3 seconds.
 79 |     process_array version ran in 6.00883 seconds!
 80 | 
 81 | 
 82 | 
 83 | ```python
 84 | time_it(fast_process_array, [1, 2, 3])
 85 | ```
 86 | 
 87 |     Sleep for 1 seconds.
 88 |     Sleep for 2 seconds.
 89 |     Sleep for 3 seconds.
 90 |     fast_process_array version ran in 3.00300 seconds!
 91 | 
 92 | 
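 93 | ## 3. Using multiple threads for CPU-bound tasks
 94 | CPU-bound tasks do a large amount of computation and consume CPU resources, for example computing pi or decoding HD video; they depend entirely on the CPU's processing power. While a thread runs a CPU-bound task the CPU stays busy, and CPython makes it hand the GIL to other threads after each switch interval (5 ms by default since Python 3.2). With the added cost of switching threads, the result can be slower than serial code.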
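
In CPython 3.2 and later, the GIL hand-off between threads is driven by a time interval rather than a bytecode count, and the current value can be read at run time. A minimal sketch, assuming a standard CPython interpreter (the printed value depends on its configuration):

```python
import sys

# How long a running thread may keep the GIL before CPython asks it to yield.
interval = sys.getswitchinterval()
print("GIL switch interval: %.3f s" % interval)
```

The interval can be changed with `sys.setswitchinterval`, but for CPU-bound work that only changes how often threads take turns; it does not let them run in parallel.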
 95 | 
 96 | ### 3.1 Define a CPU-bound function
 97 | The function sums the integers in [1, x].
 98 | 
 99 | 
100 | ```python
101 | def cpu_bound_func(x):
102 |     tot = 0
103 |     a = 1
104 |     while a <= x:
105 |         tot += a
106 |         a += 1
107 |     print("Finish sum from 1 to %d!" % x)
108 |     return tot
109 | ```
110 | 
111 | ### 3.2 Serial processing
112 | Iterate over every element of a list and call cpu_bound_func on each.
113 | 
114 | 
115 | ```python
116 | def process_array(arr):
117 |     for x in arr:
118 |         cpu_bound_func(x)
119 | ```
120 | 
121 | ### 3.3 Multithreaded processing
122 | The thread pool's map method applies the same function to every element of the list.
123 | 
124 | 
125 | ```python
126 | def fast_process_array(arr):
127 |     for x in pool.map(cpu_bound_func, arr):
128 |         pass
129 | ```
130 | 
131 | ### 3.4 Measure the run time
132 | - Serial version: 2.1 seconds
133 | - Multithreaded version: 2.2 seconds
134 | 
135 | 
136 | ```python
137 | def time_it(fn, *args):
138 |     start = time()
139 |     fn(*args)
140 |     print("%s version ran in %.5f seconds!" % (fn.__name__, time() - start))
141 | ```
142 | 
143 | 
144 | ```python
145 | time_it(process_array, [10**7, 10**7, 10**7])
146 | ```
147 | 
148 |     Finish sum from 1 to 10000000!
149 |     Finish sum from 1 to 10000000!
150 |     Finish sum from 1 to 10000000!
151 |     process_array version ran in 2.10489 seconds!
152 | 
153 | 
154 | 
155 | ```python
156 | time_it(fast_process_array, [10**7, 10**7, 10**7])
157 | ```
158 | 
159 |     Finish sum from 1 to 10000000!
160 |     Finish sum from 1 to 10000000!
161 |     Finish sum from 1 to 10000000!
162 |     fast_process_array version ran in 2.20897 seconds!
163 | 
164 | 
165 | ## References
166 | https://www.jianshu.com/p/c75ed8a6e9af  
167 | https://www.cnblogs.com/tusheng/articles/10630662.html
168 | 
169 | 
173 | 


--------------------------------------------------------------------------------
/More efficient pandas.md:
--------------------------------------------------------------------------------
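  1 | # Making Pandas DataFrame 40x Faster
  2 | 
  3 | ## 1. Warm-up
  4 | The well-known Pandas library is a powerful tool for data analysis. Sometimes we need to run very complex processing on tens or even hundreds of millions of rows, and then run time cannot be ignored. Take the simple example below: we randomly generate 1 million rows and process the 'val' column, subtracting 1 from even values and adding 1 to odd values. Real data analysis is far more complex than this, but since we (mainly I) do not have that much time to wait for results, we will cut a corner here. The transform function takes 284 ms on average.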
  5 | 
  6 | 
  7 | ```python
  8 | import pandas as pd
  9 | import numpy as np
 10 | 
 11 | def gen_data(size):
 12 |     d = dict()
 13 |     d["genre"] = np.random.choice(["A", "B", "C", "D"], size=size)
 14 |     d["val"] = np.random.randint(low=0, high=100, size=size)
 15 |     return pd.DataFrame(d)
 16 | 
 17 | data = gen_data(1000000)
 18 | data.head()
 19 | ```
 20 | 
 21 | 
 22 | 
 23 | 
 24 | <div>
 25 | <style scoped>
 26 |     .dataframe tbody tr th:only-of-type {
 27 |         vertical-align: middle;
 28 |     }
 29 | 
 30 |     .dataframe tbody tr th {
 31 |         vertical-align: top;
 32 |     }
 33 | 
 34 |     .dataframe thead th {
 35 |         text-align: right;
 36 |     }
 37 | </style>
 38 | <table border="1" class="dataframe">
 39 |   <thead>
 40 |     <tr style="text-align: right;">
 41 |       <th></th>
 42 |       <th>genre</th>
 43 |       <th>val</th>
 44 |     </tr>
 45 |   </thead>
 46 |   <tbody>
 47 |     <tr>
 48 |       <th>0</th>
 49 |       <td>C</td>
 50 |       <td>54</td>
 51 |     </tr>
 52 |     <tr>
 53 |       <th>1</th>
 54 |       <td>A</td>
 55 |       <td>5</td>
 56 |     </tr>
 57 |     <tr>
 58 |       <th>2</th>
 59 |       <td>D</td>
 60 |       <td>0</td>
 61 |     </tr>
 62 |     <tr>
 63 |       <th>3</th>
 64 |       <td>D</td>
 65 |       <td>42</td>
 66 |     </tr>
 67 |     <tr>
 68 |       <th>4</th>
 69 |       <td>C</td>
 70 |       <td>91</td>
 71 |     </tr>
 72 |   </tbody>
 73 | </table>
 74 | </div>
 75 | 
 76 | 
 77 | 
 78 | 
 79 | ```python
 80 | def transform(data):
 81 |     data.loc[:, "new_val"] = data.val.apply(lambda x: x + 1 if x % 2 else x - 1)
 82 | ```
 83 | 
 84 | 
 85 | ```python
 86 | %timeit -n 1 transform(data)
 87 | ```
 88 | 
 89 |     284 ms ± 8.95 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
 90 | 
 91 | 
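 92 | ## 2. Write a C extension in Cython
 93 | Let's use our old friend Cython to write the function `x + 1 if x % 2 else x - 1`. The average run time drops to 202 ms, so it is indeed faster. That is about a 1.4x speedup, still a long way from the 40x target (facepalm).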
 94 | 
 95 | 
 96 | ```python
 97 | %load_ext cython
 98 | ```
 99 | 
100 | 
101 | ```cython
102 | %%cython
103 | cpdef int _transform(int x):
104 |     if x % 2:
105 |         return x + 1
106 |     return x - 1
107 | 
108 | def transform(data):
109 |     data.loc[:, "new_val"] = data.val.apply(_transform)
110 | ```
111 | 
112 | 
113 | ```python
114 | %timeit -n 1 transform(data)
115 | ```
116 | 
117 |     202 ms ± 13.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
118 | 
119 | 
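120 | ## 3. Reduce type conversions
121 | To reduce type conversions between C and Python, we pass the 'val' column to the Cython function directly as a NumPy array; note the difference between cnp (the compile-time cimport) and np (the run-time import). The average run time drops straight to 10.8 ms, about a 26x speedup, so there is a glimmer of hope.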
122 | 
123 | 
124 | ```cython
125 | %%cython
126 | import numpy as np
127 | cimport numpy as cnp
128 | ctypedef cnp.int_t DTYPE_t
129 | 
130 | cpdef cnp.ndarray[DTYPE_t] _transform(cnp.ndarray[DTYPE_t] arr):
131 |     cdef:
132 |         int i = 0
133 |         int n = arr.shape[0]
134 |         int x
135 |         cnp.ndarray[DTYPE_t] new_arr = np.empty_like(arr)
136 | 
137 |     while i < n:
138 |         x = arr[i]
139 |         if x % 2:
140 |             new_arr[i] = x + 1
141 |         else:
142 |             new_arr[i] = x - 1
143 |         i += 1
144 |     return new_arr
145 | 
146 | def transform(data):
147 |     data.loc[:, "new_val"] = _transform(data.val.values)
148 | ```
149 | 
150 | 
151 | ```python
152 | %timeit -n 1 transform(data)
153 | ```
154 | 
155 |     10.8 ms ± 512 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
156 | 
157 | 
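158 | ## 4. Use unchecked arrays
159 | The @cython.boundscheck(False) and @cython.wraparound(False) decorators turn off array bounds checking and negative-index handling, which brings the average run time to 6.76 ms. That is about a 42x speedup, so the goal is reached.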
160 | 
161 | 
162 | ```cython
163 | %%cython
164 | import cython
165 | import numpy as np
166 | cimport numpy as cnp
167 | ctypedef cnp.int_t DTYPE_t
168 | 
169 | 
170 | @cython.boundscheck(False)
171 | @cython.wraparound(False)
172 | cpdef cnp.ndarray[DTYPE_t] _transform(cnp.ndarray[DTYPE_t] arr):
173 |     cdef:
174 |         int i = 0
175 |         int n = arr.shape[0]
176 |         int x
177 |         cnp.ndarray[DTYPE_t] new_arr = np.empty_like(arr)
178 | 
179 |     while i < n:
180 |         x = arr[i]
181 |         if x % 2:
182 |             new_arr[i] = x + 1
183 |         else:
184 |             new_arr[i] = x - 1
185 |         i += 1
186 |     return new_arr
187 | 
188 | def transform(data):
189 |     data.loc[:, "new_val"] = _transform(data.val.values)
190 | ```
191 | 
192 | 
193 | ```python
194 | %timeit -n 1 transform(data)
195 | ```
196 | 
197 |     6.76 ms ± 545 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
198 | 
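For comparison, the same even/odd rule can also be written as a vectorized NumPy expression without Cython. This is a different technique from the ones benchmarked above and was not timed in this article. `transform_values` is a hypothetical name, and the five sample values are the `val` entries shown in the `data.head()` output.

```python
import numpy as np


def transform_values(vals):
    """Add 1 to odd values and subtract 1 from even values, element-wise."""
    # np.where picks from the second argument where the condition holds,
    # and from the third argument elsewhere.
    return np.where(vals % 2 != 0, vals + 1, vals - 1)


vals = np.array([54, 5, 0, 42, 91])
print(transform_values(vals))
```

On a DataFrame, the equivalent call would pass `data.val.values`, in the same way as the Cython versions above.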
199 | 


--------------------------------------------------------------------------------
/10 minutes to cython.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# Cython in 10 Minutes\n",
  8 |     "Author: tushushu  \n",
  9 |     "Project: https://github.com/tushushu/flying-python"
 10 |    ]
 11 |   },
 12 |   {
 13 |    "cell_type": "markdown",
 14 |    "metadata": {},
 15 |    "source": [
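 16 |     "## 1. What is Cython? \n",
 17 |     "Cython is a programming language that uses Python-like syntax to write C extensions that Python can call. It keeps Python's fast development cycle, lets code run at close to C speed, and makes it easy to call C libraries."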
 18 |    ]
 19 |   },
 20 |   {
 21 |    "cell_type": "markdown",
 22 |    "metadata": {},
 23 |    "source": [
 24 |     "## 2. How do I install Cython?\n",
 25 |     "Unlike most Python libraries, Cython needs a C compiler, and the setup differs from platform to platform.\n",
 26 |     "### 2.1 Set up gcc\n",
 27 |     "- **Windows**  \n",
 28 |     "Install the MinGW-w64 compiler: ``conda install libpython m2w64-toolchain -c msys2``  \n",
 29 |     "In the Python installation directory, find the \\Lib\\distutils folder and create distutils.cfg with the following content:  \n",
 30 |     "``[build]`` on the first line, ``compiler=mingw32`` on the second\n",
 31 |     "\n",
 32 |     "- **macOS**   \n",
 33 |     "Install Xcode.  \n",
 34 |     "\n",
 35 |     "- **Linux**  \n",
 36 |     "gcc is usually already set up; if it is not, run this command (Debian/Ubuntu): ``sudo apt-get install build-essential``  \n",
 37 |     "\n",
 38 |     "\n",
 39 |     "### 2.2 Install the cython package\n",
 40 |     "- Without Anaconda: ``pip install cython`` \n",
 41 |     "- With Anaconda: ``conda install cython``"
 42 |    ]
 43 |   },
 44 |   {
 45 |    "cell_type": "markdown",
 46 |    "metadata": {},
 47 |    "source": [
 48 |     "## 3. Using Cython in Jupyter Notebook \n",
 49 |     "- First load the Cython extension with the magic command ``%load_ext Cython``\n",
 50 |     "- Then run Cython code with the cell magic ``%%cython``"
 51 |    ]
 52 |   },
 53 |   {
 54 |    "cell_type": "code",
 55 |    "execution_count": 1,
 56 |    "metadata": {},
 57 |    "outputs": [],
 58 |    "source": [
 59 |     "%load_ext Cython"
 60 |    ]
 61 |   },
 62 |   {
 63 |    "cell_type": "code",
 64 |    "execution_count": 2,
 65 |    "metadata": {},
 66 |    "outputs": [
 67 |     {
 68 |      "name": "stdout",
 69 |      "output_type": "stream",
 70 |      "text": [
 71 |       "5050\n"
 72 |      ]
 73 |     }
 74 |    ],
 75 |    "source": [
 76 |     "%%cython\n",
 77 |     "# Sum the natural numbers from 1 to 100\n",
 78 |     "total = 0\n",
 79 |     "for i in range(1, 101):\n",
 80 |     "    total += i\n",
 81 |     "print(total)"
 82 |    ]
 83 |   },
 84 |   {
 85 |    "cell_type": "markdown",
 86 |    "metadata": {},
 87 |    "source": [
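 88 |     "## 4. How fast is Cython?\n",
 89 |     "- Python function: 261 ns per call\n",
 90 |     "- Cython function: 44.1 ns per call  \n",
 91 |     "\n",
 92 |     "The run time drops to roughly one sixth of the original. The key is that the parameter x is declared with the static type int, which avoids the overhead of run-time type checks."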
 93 |    ]
 94 |   },
 95 |   {
 96 |    "cell_type": "markdown",
 97 |    "metadata": {},
 98 |    "source": [
 99 |     "### 4.1 Python function"
100 |    ]
101 |   },
102 |   {
103 |    "cell_type": "code",
104 |    "execution_count": 3,
105 |    "metadata": {},
106 |    "outputs": [],
107 |    "source": [
108 |     "def f(x):\n",
109 |     "    return x ** 2 - x"
110 |    ]
111 |   },
112 |   {
113 |    "cell_type": "code",
114 |    "execution_count": 4,
115 |    "metadata": {},
116 |    "outputs": [
117 |     {
118 |      "name": "stdout",
119 |      "output_type": "stream",
120 |      "text": [
121 |       "261 ns ± 8.78 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)\n"
122 |      ]
123 |     }
124 |    ],
125 |    "source": [
126 |     "%timeit f(100)"
127 |    ]
128 |   },
129 |   {
130 |    "cell_type": "markdown",
131 |    "metadata": {},
132 |    "source": [
133 |     "### 4.2 Cython function"
134 |    ]
135 |   },
136 |   {
137 |    "cell_type": "code",
138 |    "execution_count": 5,
139 |    "metadata": {},
140 |    "outputs": [],
141 |    "source": [
142 |     "%%cython\n",
143 |     "def g(int x):\n",
144 |     "    return x ** 2 - x"
145 |    ]
146 |   },
147 |   {
148 |    "cell_type": "code",
149 |    "execution_count": 6,
150 |    "metadata": {},
151 |    "outputs": [
152 |     {
153 |      "name": "stdout",
154 |      "output_type": "stream",
155 |      "text": [
156 |       "44.1 ns ± 1.09 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)\n"
157 |      ]
158 |     }
159 |    ],
160 |    "source": [
161 |     "%timeit g(100)"
162 |    ]
163 |   },
164 |   {
165 |    "cell_type": "markdown",
166 |    "metadata": {},
167 |    "source": [
168 |     "## References\n",
169 |     "Parts of this article are drawn from the [official Cython documentation](http://docs.cython.org/en/latest/index.html)"
170 |    ]
171 |   },
172 |   {
173 |    "cell_type": "code",
174 |    "execution_count": null,
175 |    "metadata": {},
176 |    "outputs": [],
177 |    "source": []
178 |   }
179 |  ],
180 |  "metadata": {
181 |   "kernelspec": {
182 |    "display_name": "Python 3",
183 |    "language": "python",
184 |    "name": "python3"
185 |   },
186 |   "language_info": {
187 |    "codemirror_mode": {
188 |     "name": "ipython",
189 |     "version": 3
190 |    },
191 |    "file_extension": ".py",
192 |    "mimetype": "text/x-python",
193 |    "name": "python",
194 |    "nbconvert_exporter": "python",
195 |    "pygments_lexer": "ipython3",
196 |    "version": "3.6.6"
197 |   }
198 |  },
199 |  "nbformat": 4,
200 |  "nbformat_minor": 2
201 | }
202 | 


--------------------------------------------------------------------------------
/Python multi process.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# Multitasking in Python (Multiprocessing)\n",
  8 |     "Author: tushushu  \n",
  9 |     "Project: https://github.com/tushushu/flying-python"
 10 |    ]
 11 |   },
 12 |   {
 13 |    "cell_type": "markdown",
 14 |    "metadata": {},
 15 |    "source": [
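 16 |     "## Using multiple processes for CPU-bound tasks\n",
 17 |     "CPU-bound tasks do a large amount of computation and consume CPU resources, for example computing pi or decoding HD video; they depend entirely on the CPU's processing power. While a thread runs a CPU-bound task the CPU stays busy, and CPython makes it hand the GIL to other threads after each switch interval (5 ms by default since Python 3.2). With the added cost of switching threads, the result can be slower than serial code. In Multitasking in Python (Multithreading), we tried to run CPU-bound tasks with multiple threads and saw no performance gain. Now let's try multiple processes instead."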
 18 |    ]
 19 |   },
 20 |   {
 21 |    "cell_type": "markdown",
 22 |    "metadata": {},
 23 |    "source": [
 24 |     "### 1. Create a process pool"
 25 |    ]
 26 |   },
 27 |   {
 28 |    "cell_type": "code",
 29 |    "execution_count": 10,
 30 |    "metadata": {},
 31 |    "outputs": [
 32 |     {
 33 |      "name": "stdout",
 34 |      "output_type": "stream",
 35 |      "text": [
 36 |       "Number of CPU cores: 8\n"
 37 |      ]
 38 |     }
 39 |    ],
 40 |    "source": [
 41 |     "from concurrent.futures import ProcessPoolExecutor\n",
 42 |     "from time import sleep, time\n",
 43 |     "import os\n",
 44 |     "print(\"Number of CPU cores: %s\" % os.cpu_count())"
 45 |    ]
 46 |   },
 47 |   {
 48 |    "cell_type": "code",
 49 |    "execution_count": 11,
 50 |    "metadata": {},
 51 |    "outputs": [],
 52 |    "source": [
 53 |     "# Number of workers\n",
 54 |     "N = 8\n",
 55 |     "# Create the process pool\n",
 56 |     "pool = ProcessPoolExecutor(max_workers=N)"
 57 |    ]
 58 |   },
 59 |   {
 60 |    "cell_type": "markdown",
 61 |    "metadata": {},
 62 |    "source": [
 63 |     "### 2. Define a CPU-bound function\n",
 64 |     "The function sums the integers in [1, x]."
 65 |    ]
 66 |   },
 67 |   {
 68 |    "cell_type": "code",
 69 |    "execution_count": 12,
 70 |    "metadata": {},
 71 |    "outputs": [],
 72 |    "source": [
 73 |     "def cpu_bound_func(x):\n",
 74 |     "    tot = 0\n",
 75 |     "    a = 1\n",
 76 |     "    while a <= x:\n",
 77 |     "        tot += a\n",
 78 |     "        a += 1\n",
 79 |     "    print(\"Finish sum from 1 to %d!\" % x)\n",
 80 |     "    return tot"
 81 |    ]
 82 |   },
 83 |   {
 84 |    "cell_type": "markdown",
 85 |    "metadata": {},
 86 |    "source": [
 87 |     "### 3. Serial processing\n",
 88 |     "Iterate over every element of a list and call cpu_bound_func on each."
 89 |    ]
 90 |   },
 91 |   {
 92 |    "cell_type": "code",
 93 |    "execution_count": 13,
 94 |    "metadata": {},
 95 |    "outputs": [],
 96 |    "source": [
 97 |     "def process_array(arr):\n",
 98 |     "    for x in arr:\n",
 99 |     "        cpu_bound_func(x)"
100 |    ]
101 |   },
102 |   {
103 |    "cell_type": "markdown",
104 |    "metadata": {},
105 |    "source": [
106 |     "### 4. Multiprocess processing\n",
107 |     "The process pool's map method applies the same function to every element of the list."
108 |    ]
109 |   },
110 |   {
111 |    "cell_type": "code",
112 |    "execution_count": 14,
113 |    "metadata": {},
114 |    "outputs": [],
115 |    "source": [
116 |     "def fast_process_array(arr):\n",
117 |     "    for x in pool.map(cpu_bound_func, arr):\n",
118 |     "        pass"
119 |    ]
120 |   },
121 |   {
122 |    "cell_type": "markdown",
123 |    "metadata": {},
124 |    "source": [
125 |     "### 5. Measure the run time\n",
126 |     "- Serial version: 5.7 seconds\n",
127 |     "- Multiprocess version: 1.6 seconds"
128 |    ]
129 |   },
130 |   {
131 |    "cell_type": "code",
132 |    "execution_count": 15,
133 |    "metadata": {},
134 |    "outputs": [],
135 |    "source": [
136 |     "def time_it(fn, *args):\n",
137 |     "    start = time()\n",
138 |     "    fn(*args)\n",
139 |     "    print(\"%s version ran in %.5f seconds!\" % (fn.__name__, time() - start))"
140 |    ]
141 |   },
142 |   {
143 |    "cell_type": "code",
144 |    "execution_count": 20,
145 |    "metadata": {},
146 |    "outputs": [
147 |     {
148 |      "name": "stdout",
149 |      "output_type": "stream",
150 |      "text": [
151 |       "Finish sum from 1 to 10000000!\n",
152 |       "Finish sum from 1 to 10000000!\n",
153 |       "Finish sum from 1 to 10000000!\n",
154 |       "Finish sum from 1 to 10000000!\n",
155 |       "Finish sum from 1 to 10000000!\n",
156 |       "Finish sum from 1 to 10000000!\n",
157 |       "Finish sum from 1 to 10000000!\n",
158 |       "Finish sum from 1 to 10000000!\n",
159 |       "process_array version ran in 5.74394 seconds!\n"
160 |      ]
161 |     }
162 |    ],
163 |    "source": [
164 |     "time_it(process_array, [10**7 for _ in range(8)])"
165 |    ]
166 |   },
167 |   {
168 |    "cell_type": "code",
169 |    "execution_count": 21,
170 |    "metadata": {},
171 |    "outputs": [
172 |     {
173 |      "name": "stdout",
174 |      "output_type": "stream",
175 |      "text": [
176 |       "fast_process_array version ran in 1.62266 seconds!\n"
177 |      ]
178 |     }
179 |    ],
180 |    "source": [
181 |     "time_it(fast_process_array, [10**7 for _ in range(8)])"
182 |    ]
183 |   }
184 |  ],
185 |  "metadata": {
186 |   "kernelspec": {
187 |    "display_name": "Python 3",
188 |    "language": "python",
189 |    "name": "python3"
190 |   },
191 |   "language_info": {
192 |    "codemirror_mode": {
193 |     "name": "ipython",
194 |     "version": 3
195 |    },
196 |    "file_extension": ".py",
197 |    "mimetype": "text/x-python",
198 |    "name": "python",
199 |    "nbconvert_exporter": "python",
200 |    "pygments_lexer": "ipython3",
201 |    "version": "3.6.6"
202 |   }
203 |  },
204 |  "nbformat": 4,
205 |  "nbformat_minor": 2
206 | }
207 | 


--------------------------------------------------------------------------------
/Python Standard Library.md:
--------------------------------------------------------------------------------
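  1 | # Writing Efficient Code with the Python Standard Library
  2 | Author: tushushu  
  3 | Project: https://github.com/tushushu/flying-python
  4 | 
  5 | ## 1. bisect - binary search
  6 | Given a list, we want to look up a target element and return its index in the list.  
  7 | * The first thing that comes to mind is the list's index method. Build an ascending list of length 10000 and write a search function that uses index to look up every element once. The average run time is 437 ms.
  8 | * Use bisect_left from the bisect module, which is the familiar binary search. Write a fast_search function. The average run time is 3.94 ms, about 110 times as fast!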
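
Note that `bisect_left` returns an insertion point, so a lookup that must also handle missing elements needs one extra comparison. A minimal sketch, assuming the list is sorted in ascending order; `index_of` is a hypothetical helper name, not part of the `bisect` module.

```python
import bisect


def index_of(sorted_nums, target):
    """Return the index of target in an ascending list, or -1 if absent."""
    i = bisect.bisect_left(sorted_nums, target)
    # bisect_left gives the leftmost position where target could be inserted;
    # target is present only if that position already holds it.
    if i < len(sorted_nums) and sorted_nums[i] == target:
        return i
    return -1


arr = list(range(0, 20, 2))  # ascending: 0, 2, 4, ..., 18
print(index_of(arr, 8))  # 8 is present at index 4
print(index_of(arr, 7))  # 7 is absent, so -1
```

Each lookup takes O(log n) comparisons instead of the O(n) scan done by `list.index`, which is where the speedup comes from.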
  9 | 
 10 | 
 11 | ```python
 12 | import bisect
 13 | ```
 14 | 
 15 | 
 16 | ```python
 17 | def search(nums):
 18 |     for x in nums:
 19 |         nums.index(x)
 20 | ```
 21 | 
 22 | 
 23 | ```python
 24 | def fast_search(nums):
 25 |     for x in nums:
 26 |         bisect.bisect_left(nums, x)
 27 | ```
 28 | 
 29 | 
 30 | ```python
 31 | arr = list(range(10000))
 32 | ```
 33 | 
 34 | 
 35 | ```python
 36 | %timeit -n 1 search(arr)
 37 | ```
 38 | 
 39 |     437 ms ± 12.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
 40 | 
 41 | 
 42 | 
 43 | ```python
 44 | %timeit -n 1 fast_search(arr)
 45 | ```
 46 | 
 47 |     3.94 ms ± 407 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
 48 | 
 49 | 
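 50 | ## 2. Counter - efficient counting
 51 | Given a list, we want to count how many times each distinct element occurs and return a dict.  
 52 | * Create a list of length 10000 whose elements are random integers between 1 and 3. Write a count function that creates an empty dict, loops over the list with a for loop, and writes the counts into the dict. The average run time is 937 µs.
 53 | * Use Counter from the collections module and write a fast_count function in a single line of code. The average run time is 494 µs, almost twice as fast.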
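
The two approaches should produce the same counts, which is easy to check on a small input. In the sketch below, `count` is a compact variant of the hand-written loop (using `dict.get`, an assumption made to keep it short), and the sample list is illustrative.

```python
from collections import Counter


def count(nums):
    """Count occurrences with a plain dict and a for loop."""
    res = {}
    for x in nums:
        res[x] = res.get(x, 0) + 1  # a first sighting counts as 1
    return res


nums = [1, 3, 2, 3, 3, 1]
counts = Counter(nums)
print(dict(counts) == count(nums))  # both approaches agree
print(counts.most_common(1))        # the most frequent element and its count
```

`Counter` is a `dict` subclass, so it can be used wherever the hand-built dictionary was, with extras such as `most_common`.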
 54 | 
 55 | 
 56 | ```python
 57 | from collections import Counter
 58 | from random import randint
 59 | ```
 60 | 
 61 | 
 62 | ```python
 63 | def count(nums):
 64 |     res = dict()
 65 |     for x in nums:
 66 |         if x in res:
 67 |             res[x] += 1
 68 |         else:
 69 |             res[x] = 1
 70 |     return res
 71 | ```
 72 | 
 73 | 
 74 | ```python
 75 | def fast_count(nums):
 76 |     return Counter(nums)
 77 | ```
 78 | 
 79 | 
 80 | ```python
 81 | nums = [randint(1, 3) for _ in range(10000)]
 82 | ```
 83 | 
 84 | 
 85 | ```python
 86 | %timeit -n 1 count(nums)
 87 | ```
 88 | 
 89 |     937 µs ± 153 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
 90 | 
 91 | 
 92 | 
 93 | ```python
 94 | %timeit -n 1 fast_count(nums)
 95 | ```
 96 | 
 97 |     494 µs ± 240 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
 98 | 
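Counter offers more than a fast constructor. The short sketch below, on made-up sample data, shows three behaviours that a hand-written dict loop would have to implement itself.

```python
from collections import Counter

nums = [1, 3, 2, 1, 3, 1]
counts = Counter(nums)

# Counter is a dict subclass, so it converts cleanly to a plain dict.
as_dict = dict(counts)

# most_common(k) returns the k most frequent (element, count) pairs.
top_two = counts.most_common(2)

# A missing key reports a count of 0 instead of raising KeyError.
absent = counts[42]
```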
 99 | 
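100 | ## 3. heapq - heaps
101 | Given a list, return its 3 smallest elements.
102 | * Create a list of length 10000 and shuffle it. Write a top_3 function that sorts the list and returns the first 3 elements. The average run time is 2.03 ms.
103 | * Use the heapq module, the familiar heap, to write fast_top_3. The average run time is 296 µs, about 6.8 times faster.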
104 | 
105 | 
106 | ```python
107 | import heapq
108 | from random import shuffle
109 | ```
110 | 
111 | 
112 | ```python
113 | def top_3(nums):
114 |     return sorted(nums)[:3]
115 | ```
116 | 
117 | 
118 | ```python
119 | def fast_top_3(nums):
120 |     return heapq.nsmallest(3, nums)
121 | ```
122 | 
123 | 
124 | ```python
125 | nums = list(range(10000))
126 | shuffle(nums)
127 | ```
128 | 
129 | 
130 | ```python
131 | %timeit -n 1 top_3(nums)
132 | ```
133 | 
134 |     2.03 ms ± 236 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
135 | 
136 | 
137 | 
138 | ```python
139 | %timeit -n 1 fast_top_3(nums)
140 | ```
141 | 
142 |     296 µs ± 56.2 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
143 | 
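heapq.nsmallest has a sibling, heapq.nlargest, and both accept a key function in the same way sorted does. A small sketch on made-up data:

```python
import heapq

nums = [7, 2, 9, 4, 1, 8]
k = 3

smallest = heapq.nsmallest(k, nums)   # same result as sorted(nums)[:k]
largest = heapq.nlargest(k, nums)     # same result as sorted(nums, reverse=True)[:k]

# key works like the key argument of sorted().
words = ["pear", "fig", "banana", "kiwi"]
shortest = heapq.nsmallest(2, words, key=len)
```

These functions pay off when k is small relative to the list; when k approaches the list length, plain sorting is usually the better choice.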
144 | 
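145 | ## 4. itemgetter - fetching items in bulk
146 | Given a dict and a list containing one or more of the dict's keys, return the corresponding values.
147 | * Create a dict with 100,000 entries and randomly sample 10,000 of its keys (with replacement) to form a list of length 10,000. Write a get_items function; the average run time is 1.12 ms.
148 | * Use itemgetter to read these items in bulk in fast_get_items. The average run time is 836 µs, about 1.3 times as fast.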
149 | 
150 | 
151 | 
152 | ```python
153 | from operator import itemgetter
154 | from random import choices
155 | ```
156 | 
157 | 
158 | ```python
159 | def get_items(data, keys):
160 |     return [data[x] for x in keys]
161 | ```
162 | 
163 | 
164 | ```python
165 | def fast_get_items(data, keys):
166 |     return itemgetter(*keys)(data)
167 | ```
168 | 
169 | 
170 | ```python
171 | data = dict(enumerate(range(100000)))
172 | keys = choices(list(data.keys()), k=10000)
173 | ```
174 | 
175 | 
176 | ```python
177 | %timeit -n 5 get_items(data, keys)
178 | ```
179 | 
180 |     1.12 ms ± 354 µs per loop (mean ± std. dev. of 7 runs, 5 loops each)
181 | 
182 | 
183 | 
184 | ```python
185 | %timeit -n 5 fast_get_items(data, keys)
186 | ```
187 | 
188 |     836 µs ± 287 µs per loop (mean ± std. dev. of 7 runs, 5 loops each)
189 | 
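One caveat when swapping a list comprehension for itemgetter: with several keys it returns a tuple, with exactly one key it returns the bare value, and with no keys the constructor raises TypeError. The sketch below smooths this over; `get_items_as_list` is a hypothetical wrapper written for illustration.

```python
from operator import itemgetter

data = {"a": 1, "b": 2, "c": 3}

many = itemgetter("a", "c")(data)   # tuple of values
single = itemgetter("b")(data)      # bare value, not a 1-tuple


def get_items_as_list(data, keys):
    """Always return a list, whatever the number of keys."""
    if not keys:
        return []                    # itemgetter() with no arguments raises TypeError
    if len(keys) == 1:
        return [data[keys[0]]]       # avoid the bare-value special case
    return list(itemgetter(*keys)(data))
```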
190 | 
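191 | ## 5. lru_cache - trading space for time
192 | Given a number n, return the first n terms of the Fibonacci sequence.
193 | * Write a recursive fib function and call it in a loop from fib_seq. With n = 20, the average run time is 3.28 ms.
194 | * Use the @lru_cache decorator to cache results that have already been computed. For example, once fib(4) is cached, computing fib(5) can reuse it directly. The average run time is 144 µs, about 22 times faster.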
195 | 
196 | 
197 | ```python
198 | from functools import lru_cache
199 | ```
200 | 
201 | 
202 | ```python
203 | def fib(n):
204 |     if n < 2:
205 |         return n
206 |     return fib(n-1) + fib(n-2)
207 | 
208 | def fib_seq(n):
209 |     return [fib(x) for x in range(n)]
210 | ```
211 | 
212 | 
213 | ```python
214 | @lru_cache(maxsize=None)
215 | def fast_fib(n):
216 |     if n < 2:
217 |         return n
218 |     return fast_fib(n-1) + fast_fib(n-2)
219 | 
220 | def fast_fib_seq(n):
221 |     return [fast_fib(x) for x in range(n)]
222 | ```
223 | 
224 | 
225 | ```python
226 | %timeit -n 5 fib_seq(20)
227 | ```
228 | 
229 |     3.28 ms ± 220 µs per loop (mean ± std. dev. of 7 runs, 3 loops each)
230 | 
231 | 
232 | 
233 | ```python
234 | %timeit -n 5 fast_fib_seq(20)
235 | ```
236 | 
237 |     The slowest run took 524.07 times longer than the fastest. This could mean that an intermediate result is being cached.
238 |     144 µs ± 347 µs per loop (mean ± std. dev. of 7 runs, 3 loops each)
239 | 
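The "intermediate result is being cached" warning in the output above is expected: the cache outlives each call, so only the first timing loop pays for the computation. The sketch below shows how to inspect and reset the cache with cache_info and cache_clear. `cached_fib` is a stand-in name used for this illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def cached_fib(n):
    if n < 2:
        return n
    # Recursing through the decorated name is what lets inner calls hit the cache.
    return cached_fib(n - 1) + cached_fib(n - 2)


value = cached_fib(20)
info = cached_fib.cache_info()      # named tuple: hits, misses, maxsize, currsize

cached_fib.cache_clear()            # reset the cache, e.g. before timing a cold run
cleared = cached_fib.cache_info()
```

With maxsize=None the cache grows without bound, which is fine for a small range of n but worth keeping in mind for functions with many distinct arguments.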
240 | 
241 | 
245 | 


--------------------------------------------------------------------------------
/Python coroutines.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# Multitasking in Python (coroutines)\n",
  8 |     "Author: tushushu  \n",
  9 |     "Project: https://github.com/tushushu/flying-python"
 10 |    ]
 11 |   },
 12 |   {
 13 |    "cell_type": "markdown",
 14 |    "metadata": {},
 15 |    "source": [
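 16 |     "## 1. Coroutines\n",
 17 |     "A coroutine, sometimes called a micro-thread, is a lightweight thread that runs in user space. Unlike threads and processes, coroutines do not need a context switch in the operating system kernel; the program itself decides when a coroutine switches context. Coroutines are declared with the async/await syntax, which is the recommended way to write asynchronous applications.\n"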
 18 |    ]
 19 |   },
 20 |   {
 21 |    "cell_type": "code",
 22 |    "execution_count": 1,
 23 |    "metadata": {},
 24 |    "outputs": [],
 25 |    "source": [
 26 |     "import asyncio\n",
 27 |     "\n",
 28 |     "async def hello_world():\n",
 29 |     "    print('hello')\n",
 30 |     "    await asyncio.sleep(1)\n",
 31 |     "    print('world')\n",
 32 |     "\n",
 33 |     "# asyncio.run(hello_world())  # asyncio.run requires Python 3.7+"
 34 |    ]
 35 |   },
 36 |   {
 37 |    "cell_type": "markdown",
 38 |    "metadata": {},
 39 |    "source": [
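 40 |     "## 2. Using coroutines for IO-bound tasks\n",
 41 |     "An IO-bound task is one where the CPU is much faster than the disk or memory it depends on, so most of the time the CPU is waiting for I/O (disk/memory) reads and writes and its load stays low. Tasks that involve network or disk IO are IO-bound."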
 42 |    ]
 43 |   },
 44 |   {
 45 |    "cell_type": "code",
 46 |    "execution_count": 1,
 47 |    "metadata": {},
 48 |    "outputs": [],
 49 |    "source": [
 50 |     "import requests\n",
 51 |     "import time"
 52 |    ]
 53 |   },
 54 |   {
 55 |    "cell_type": "markdown",
 56 |    "metadata": {},
 57 |    "source": [
 58 |     "### 2.1 Define an IO-bound function"
 59 |    ]
 60 |   },
 61 |   {
 62 |    "cell_type": "code",
 63 |    "execution_count": 2,
 64 |    "metadata": {},
 65 |    "outputs": [],
 66 |    "source": [
 67 |     "def _sleep(n):\n",
 68 |     "    time.sleep(n)\n",
 69 |     "    print(\"Sleep for %d seconds.\" % n)"
 70 |    ]
 71 |   },
 72 |   {
 73 |    "cell_type": "markdown",
 74 |    "metadata": {},
 75 |    "source": [
 76 |     "### 2.2 Serial version"
 77 |    ]
 78 |   },
 79 |   {
 80 |    "cell_type": "code",
 81 |    "execution_count": 3,
 82 |    "metadata": {},
 83 |    "outputs": [],
 84 |    "source": [
 85 |     "def sleep(m, n):\n",
 86 |     "    for i in range(m):\n",
 87 |     "        _sleep(n)"
 88 |    ]
 89 |   },
 90 |   {
 91 |    "cell_type": "markdown",
 92 |    "metadata": {},
 93 |    "source": [
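 94 |     "### 2.3 Coroutine version\n",
 95 |     "At the time of writing, Jupyter Notebook cannot run this coroutine code directly because of a bug that has not been fixed yet, so the code is written to tmp.py and run as a separate script."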
 96 |    ]
 97 |   },
 98 |   {
 99 |    "cell_type": "code",
100 |    "execution_count": 4,
101 |    "metadata": {},
102 |    "outputs": [],
103 |    "source": [
104 |     "f = open(\"tmp.py\", \"w\")\n",
105 |     "f.write(\n",
106 |     "\"\"\"\n",
107 |     "import time\n",
108 |     "import asyncio\n",
109 |     "\n",
110 |     "\n",
111 |     "async def _sleep(n):\n",
112 |     "    await asyncio.sleep(n)\n",
113 |     "    print(\"Sleep for %d seconds.\" % n)\n",
114 |     "\n",
115 |     "\n",
116 |     "def sleep(m, n):\n",
117 |     "    loop = asyncio.get_event_loop()\n",
118 |     "    loop.run_until_complete(asyncio.gather(*[_sleep(n) for _ in range(m)]))\n",
119 |     "    loop.close()\n",
120 |     "\n",
121 |     "\n",
122 |     "if __name__ == '__main__':\n",
123 |     "    start = time.perf_counter()\n",
124 |     "    m = 3\n",
125 |     "    n = 1\n",
126 |     "    sleep(m, n)\n",
127 |     "    print(\"%s函数的运行时间为 %.5f 秒!\" % (sleep.__name__, time.perf_counter() - start))\"\"\"\n",
128 |     ")\n",
129 |     "f.close()"
130 |    ]
131 |   },
132 |   {
133 |    "cell_type": "markdown",
134 |    "metadata": {},
135 |    "source": [
136 |     "### 2.4 Measure the run time\n",
137 |     "- Serial version run time = 1 + 1 + 1 = 3 seconds  \n",
138 |     "- Coroutine version run time = max(1, 1, 1) = 1 second"
139 |    ]
140 |   },
141 |   {
142 |    "cell_type": "code",
143 |    "execution_count": 5,
144 |    "metadata": {},
145 |    "outputs": [],
146 |    "source": [
147 |     "def time_it(fn, *args):\n",
148 |     "    start = time.perf_counter()\n",
149 |     "    fn(*args)\n",
150 |     "    print(\"%s函数的运行时间为 %.5f 秒!\" % (fn.__name__, time.perf_counter() - start))"
151 |    ]
152 |   },
153 |   {
154 |    "cell_type": "code",
155 |    "execution_count": 6,
156 |    "metadata": {},
157 |    "outputs": [
158 |     {
159 |      "name": "stdout",
160 |      "output_type": "stream",
161 |      "text": [
162 |       "Sleep for 1 seconds.\n",
163 |       "Sleep for 1 seconds.\n",
164 |       "Sleep for 1 seconds.\n",
165 |       "sleep函数的运行时间为 3.01054 秒!\n"
166 |      ]
167 |     }
168 |    ],
169 |    "source": [
170 |     "time_it(sleep, 3, 1)"
171 |    ]
172 |   },
173 |   {
174 |    "cell_type": "code",
175 |    "execution_count": 7,
176 |    "metadata": {},
177 |    "outputs": [
178 |     {
179 |      "name": "stdout",
180 |      "output_type": "stream",
181 |      "text": [
182 |       "Sleep for 1 seconds.\n",
183 |       "Sleep for 1 seconds.\n",
184 |       "Sleep for 1 seconds.\n",
185 |       "sleep函数的运行时间为 1.00305 秒!\n",
186 |       "\n"
187 |      ]
188 |     }
189 |    ],
190 |    "source": [
191 |     "import subprocess\n",
192 |     "print(str(subprocess.check_output(\"python tmp.py\", shell=True), encoding = \"utf-8\"))"
193 |    ]
194 |   },
195 |   {
196 |    "cell_type": "markdown",
197 |    "metadata": {},
198 |    "source": [
199 |     "## References\n",
200 |     "https://docs.python.org/zh-cn/3.7/library/asyncio-task.html"
201 |    ]
202 |   },
203 |   {
204 |    "cell_type": "code",
205 |    "execution_count": null,
206 |    "metadata": {},
207 |    "outputs": [],
208 |    "source": []
209 |   }
210 |  ],
211 |  "metadata": {
212 |   "kernelspec": {
213 |    "display_name": "Python 3",
214 |    "language": "python",
215 |    "name": "python3"
216 |   },
217 |   "language_info": {
218 |    "codemirror_mode": {
219 |     "name": "ipython",
220 |     "version": 3
221 |    },
222 |    "file_extension": ".py",
223 |    "mimetype": "text/x-python",
224 |    "name": "python",
225 |    "nbconvert_exporter": "python",
226 |    "pygments_lexer": "ipython3",
227 |    "version": "3.6.10"
228 |   }
229 |  },
230 |  "nbformat": 4,
231 |  "nbformat_minor": 2
232 | }
233 | 


--------------------------------------------------------------------------------
/More efficient array.md:
--------------------------------------------------------------------------------
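  1 | # 4 ways to make Python arrays more efficient
  2 | 
  3 | ## 1. Why Python lists are slow
  4 | A Python list is a dynamic array: its size can change, and the array stores pointers (PyObject*) to the individual list elements. The elements can have different types, for example my_list = ['a', 1, True]; the array really holds three pointers, one to each of these elements. So why is a Python list slow compared with arrays in other languages? There are two main reasons:
  5 | 1. Python is dynamically typed, so type checks cost extra time at run time.
  6 | 2. Python, or more precisely CPython, had no JIT optimizer at the time of writing.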
  7 | 
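The pointer-per-element layout described above can be observed with the standard library alone. The sketch below compares a list of integers with the array module's typed array, used here only as a stand-in for a C-style array. The byte counts depend on the platform and Python version, so only their relative size matters.

```python
import sys
from array import array

# A list may mix types because it only stores references to objects.
my_list = ['a', 1, True]
types = [type(x).__name__ for x in my_list]

nums = list(range(1000))
typed = array('q', nums)   # 'q' = signed 64-bit integers stored inline

# List cost: the pointer array plus one int object per element.
list_bytes = sys.getsizeof(nums) + sum(sys.getsizeof(x) for x in nums)
# Typed array cost: one contiguous buffer of 8-byte values plus a small header.
typed_bytes = sys.getsizeof(typed)
```

Numpy arrays follow the same idea as the typed array: one data type, one contiguous buffer.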
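  8 | ## 2. How to do fast array computation in Python
  9 | The mainstream solutions today are:
 10 | 1. Numpy - a Numpy array is closer to a C/C++ array: all elements share one data type, and array methods such as sum are implemented in C.
 11 | 2. Numba - uses JIT compilation to speed up Numpy code. Both calls to Numpy methods and for loops over Numpy arrays can benefit.
 12 | 3. Numexpr - avoids allocating memory for Numpy's intermediate results; it is mainly used to evaluate expressions over large arrays.
 13 | 4. Cython - lets you write C/C++ extensions for Python.
 14 | 
 15 | The next two examples show how to use these four tools.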
 16 | 
 17 | ## 3. Sum of squares of an array
 18 | 
 19 | 
 20 | ```python
 21 | arr = [x for x in range(10000)]
 22 | ```
 23 | 
 24 | ### 3.1 for loop
 25 | 
 26 | 
 27 | ```python
 28 | def sqr_sum(arr):
 29 |     total = 0
 30 |     for x in arr:
 31 |         total += x ** 2
 32 |     return total
 33 | 
 34 | print("The result is:", sqr_sum(arr))
 35 | %timeit sqr_sum(arr)
 36 | ```
 37 | 
 38 |     The result is: 333283335000
 39 |     2.53 ms ± 91.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
 40 | 
 41 | 
 42 | ### 3.2 Numpy
 43 | 
 44 | 
 45 | ```python
 46 | import numpy as np
 47 | ```
 48 | 
 49 | 
 50 | ```python
 51 | def sqr_sum(arr):
 52 |     return (arr ** 2).sum()
 53 | 
 54 | arr = np.array(arr)
 55 | print("The result is:", sqr_sum(arr))
 56 | %timeit sqr_sum(arr)
 57 | ```
 58 | 
 59 |     The result is: 333283335000
 60 |     9.66 µs ± 275 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
 61 | 
 62 | 
 63 | ### 3.3 Numba
 64 | 
 65 | 
 66 | ```python
 67 | from numba import jit
 68 | ```
 69 | 
 70 | 
 71 | ```python
 72 | @jit(nopython=True)
 73 | def sqr_sum(arr):
 74 |     return (arr ** 2).sum()
 75 | 
 76 | arr = np.array(arr)
 77 | print("The result is:", sqr_sum(arr))
 78 | %timeit sqr_sum(arr)
 79 | ```
 80 | 
 81 |     The result is: 333283335000
 82 |     3.39 µs ± 57.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
 83 | 
 84 | 
 85 | ### 3.4 Numexpr
 86 | 
 87 | 
 88 | ```python
 89 | import numexpr as ne
 90 | ```
 91 | 
 92 | 
 93 | ```python
 94 | def sqr_sum(arr):
 95 |     return ne.evaluate("sum(arr * arr)")
 96 | 
 97 | arr = np.array(arr)
 98 | print("The result is:", sqr_sum(arr))
 99 | %timeit sqr_sum(arr)
100 | ```
101 | 
102 |     The result is: 333283335000
103 |     14.9 µs ± 144 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
104 | 
105 | 
106 | ### 3.5 Cython
107 | 
108 | 
109 | ```python
110 | %load_ext cython
111 | ```
112 | 
113 | 
114 | ```cython
115 | %%cython
116 | cimport numpy as np
117 | ctypedef np.int_t DTYPE_t
118 | 
119 | def sqr_sum(np.ndarray[DTYPE_t] arr):
120 |     cdef:
121 |         DTYPE_t total = 0
122 |         DTYPE_t x
123 |         int i = 0
124 |         int n = len(arr)
125 |     while i < n:
126 |         total += arr[i] ** 2
127 |         i += 1
128 |     return total
129 | ```
130 | 
131 | 
132 | ```python
133 | arr = np.array(arr, dtype="int")
134 | print("The result is:", sqr_sum(arr))
135 | %timeit sqr_sum(arr)
136 | ```
137 | 
138 |     The result is: 333283335000
139 |     5.51 µs ± 62.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
140 | 
141 | 
142 | ## 4. Transforming an array
143 | 
144 | 
145 | ```python
146 | arr = [x for x in range(1000000)]
147 | ```
148 | 
149 | ### 4.1 for loop
150 | 
151 | 
152 | ```python
153 | def transform(arr):
154 |     return [x * 2 + 1 for x in arr]
155 | 
156 | print("The result is:", transform(arr)[:5], "...")
157 | %timeit transform(arr)
158 | ```
159 | 
160 |     The result is: [1, 3, 5, 7, 9] ...
161 |     84.5 ms ± 381 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
162 | 
163 | 
164 | ### 4.2 Numpy
165 | 
166 | 
167 | ```python
168 | import numpy as np
169 | ```
170 | 
171 | 
172 | ```python
173 | def transform(arr):
174 |     return arr * 2 + 1
175 | 
176 | arr = np.array(arr)
177 | print("The result is:", transform(arr)[:5], "...")
178 | %timeit transform(arr)
179 | ```
180 | 
181 |     The result is: [1 3 5 7 9] ...
182 |     803 µs ± 11.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
183 | 
184 | 
185 | ### 4.3 Numba
186 | 
187 | 
188 | ```python
189 | from numba import jit
190 | ```
191 | 
192 | 
193 | ```python
194 | @jit(nopython=True)
195 | def transform(arr):
196 |     return arr * 2 + 1
197 | 
198 | arr = np.array(arr)
199 | print("The result is:", transform(arr)[:5], "...")
200 | %timeit transform(arr)
201 | ```
202 | 
203 |     The result is: [1 3 5 7 9] ...
204 |     498 µs ± 8.71 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
205 | 
206 | 
207 | ### 4.4 Numexpr
208 | 
209 | 
210 | ```python
211 | import numexpr as ne
212 | ```
213 | 
214 | 
215 | ```python
216 | def transform(arr):
217 |     return ne.evaluate("arr * 2 + 1")
218 | 
219 | arr = np.array(arr)
220 | print("The result is:", transform(arr)[:5], "...")
221 | %timeit transform(arr)
222 | ```
223 | 
224 |     The result is: [1 3 5 7 9] ...
225 |     369 µs ± 13.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
226 | 
227 | 
228 | ### 4.5 Cython
229 | 
230 | 
231 | ```python
232 | %load_ext cython
233 | ```
234 | 
235 |     The cython extension is already loaded. To reload it, use:
236 |       %reload_ext cython
237 | 
238 | 
239 | 
240 | ```cython
241 | %%cython
242 | import numpy as np
243 | cimport numpy as np
244 | ctypedef np.int_t DTYPE_t
245 | 
246 | def transform(np.ndarray[DTYPE_t] arr):
247 |     cdef:
248 |         np.ndarray[DTYPE_t] new_arr = np.empty_like(arr)
249 |         int i = 0
250 |         int n = len(arr)
251 |     while i < n:
252 |         new_arr[i] = arr[i] * 2 + 1
253 |         i += 1
254 |     return new_arr
255 | ```
256 | 
257 | 
258 | ```python
259 | arr = np.array(arr)
260 | print("The result is:", transform(arr)[:5], "...")
261 | %timeit transform(arr)
262 | ```
263 | 
264 |     The result is: [1 3 5 7 9] ...
265 |     887 µs ± 29.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
266 | 
267 | 
268 | ## 5. References
269 | [How does python have different data types in an array?](https://stackoverflow.com/questions/10558670/how-does-python-have-different-data-types-in-an-array)  
270 | [Why are Python Programs often slower than the Equivalent Program Written in C or C++?](https://stackoverflow.com/questions/3033329/why-are-python-programs-often-slower-than-the-equivalent-program-written-in-c-or)  
271 | [How Fast Numpy Really is and Why?](https://towardsdatascience.com/how-fast-numpy-really-is-e9111df44347)
272 | 
273 | 
277 | 


--------------------------------------------------------------------------------
/Itertools for efficient looping.md:
--------------------------------------------------------------------------------
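  1 | # Python Itertools - efficient looping
  2 | Author: tushushu  
  3 | Project: https://github.com/tushushu/flying-python  
  4 | 
  5 | The official Python documentation describes the itertools module as tools for "efficient looping". Some of the tools do improve performance; others are no faster and only save some development time, and overusing them can hurt readability. Let's put the members of itertools through their paces.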
  6 | 
  7 | 
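  8 | ## 1. Running sum
  9 | Given a list An, return its running sums Sn.
 10 | Example:
 11 | * Input: [1, 2, 3, 4, 5]
 12 | * Output: [1, 3, 6, 10, 15]  
 13 | 
 14 | With accumulate, the run time drops from 61 µs to 21.3 µs, about 2.9 times faster.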
 15 | 
 16 | 
 17 | ```python
 18 | from itertools import accumulate
 19 | ```
 20 | 
 21 | 
 22 | ```python
 23 | def _accumulate_list(arr):
 24 |     tot = 0
 25 |     for x in arr:
 26 |         tot += x
 27 |         yield tot
 28 | 
 29 | def accumulate_list(arr):
 30 |     return list(_accumulate_list(arr))
 31 | ```
 32 | 
 33 | 
 34 | ```python
 35 | def fast_accumulate_list(arr):
 36 |     return list(accumulate(arr))
 37 | ```
 38 | 
 39 | 
 40 | ```python
 41 | arr = list(range(1000))
 42 | ```
 43 | 
 44 | 
 45 | ```python
 46 | %timeit accumulate_list(arr)
 47 | ```
 48 | 
 49 |     61 µs ± 2.91 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
 50 | 
 51 | 
 52 | 
 53 | ```python
 54 | %timeit fast_accumulate_list(arr)
 55 | ```
 56 | 
 57 |     21.3 µs ± 811 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
 58 | 
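accumulate is not limited to sums: it accepts an optional two-argument function as its second parameter. A brief sketch on made-up data:

```python
import operator
from itertools import accumulate

arr = [3, 1, 4, 1, 5]

running_sum = list(accumulate(arr))                  # default operation is addition
running_max = list(accumulate(arr, max))             # largest value seen so far
running_prod = list(accumulate(arr, operator.mul))   # cumulative product
```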
 59 | 
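 60 | ## 2. Selecting data
 61 | Given a list data and a list selectors of 0/1 flags, return the selected data.
 62 | Example:
 63 | * Input: [1, 2, 3, 4, 5], [0, 1, 0, 1, 0]
 64 | * Output: [2, 4]  
 65 | 
 66 | With compress, the run time drops from 341 µs to 130 µs, about 2.6 times faster.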
 67 | 
 68 | 
 69 | ```python
 70 | from itertools import compress
 71 | from random import randint
 72 | ```
 73 | 
 74 | 
 75 | ```python
 76 | def select_data(data, selectors):
 77 |     return [x for x, y in zip(data, selectors) if y]
 78 | ```
 79 | 
 80 | 
 81 | ```python
 82 | def fast_select_data(data, selectors):
 83 |     return list(compress(data, selectors))
 84 | ```
 85 | 
 86 | 
 87 | ```python
 88 | data = list(range(10000))
 89 | selectors = [randint(0, 1) for _ in range(10000)]
 90 | ```
 91 | 
 92 | 
 93 | ```python
 94 | %timeit select_data(data, selectors)
 95 | ```
 96 | 
 97 |     341 µs ± 17.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
 98 | 
 99 | 
100 | 
101 | ```python
102 | %timeit fast_select_data(data, selectors)
103 | ```
104 | 
105 |     130 µs ± 3.19 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
106 | 
107 | 
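108 | ## 3. Permutations
109 | Given a list arr and a number k, return every ordered arrangement of k elements chosen from arr.
110 | Example:
111 | * Input: [1, 2, 3], 2
112 | * Output: [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)] 
113 | 
114 | With permutations, performance improves by about 10 times.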
115 | 
116 | 
117 | ```python
118 | from itertools import permutations
119 | ```
120 | 
121 | 
122 | ```python
123 | def _get_permutations(arr, k, i):
124 |     if i == k:
125 |         return [arr[:k]]
126 |     res = []
127 |     for j in range(i, len(arr)):
128 |         arr_cpy = arr.copy()
129 |         arr_cpy[i], arr_cpy[j] = arr_cpy[j], arr_cpy[i]
130 |         res += _get_permutations(arr_cpy, k, i + 1)
131 |     return res
132 |     
133 | def get_permutations(arr, k):
134 |     return _get_permutations(arr, k, 0)
135 | ```
136 | 
137 | 
138 | ```python
139 | def fast_get_permutations(arr, k):
140 |     return list(permutations(arr, k))
141 | ```
142 | 
143 | 
144 | ```python
145 | arr = list(range(10))
146 | k = 5
147 | ```
148 | 
149 | 
150 | ```python
151 | %timeit -n 1 get_permutations(arr, k)
152 | ```
153 | 
154 |     15.5 ms ± 1.96 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
155 | 
156 | 
157 | 
158 | ```python
159 | %timeit -n 1 fast_get_permutations(arr, k)
160 | ```
161 | 
162 |     1.56 ms ± 284 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
163 | 
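Permutations and combinations are easy to confuse, so here is a short side-by-side sketch. permutations treats (1, 2) and (2, 1) as different results, while combinations ignores order.

```python
from itertools import combinations, permutations

arr = [1, 2, 3]

ordered = list(permutations(arr, 2))     # order matters: 3 * 2 = 6 results
unordered = list(combinations(arr, 2))   # order ignored: 3 results
```

If the order of the chosen elements does not matter for the task at hand, combinations produces fewer results and is the more direct fit.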
164 | 
165 | ## 4. Filtering data
166 | Given a list arr, keep only the even numbers.
167 | Example:
168 | * Input: [3, 1, 4, 5, 9, 2]
169 | * Output: [4, 2] 
170 | 
171 | With filterfalse, performance actually gets worse, so do not trust itertools blindly.
172 | 
173 | 
174 | ```python
175 | from itertools import filterfalse
176 | ```
177 | 
178 | 
179 | ```python
180 | def get_even_nums(arr):
181 |     return [x for x in arr if x % 2 == 0]
182 | ```
183 | 
184 | 
185 | ```python
186 | def fast_get_even_nums(arr):
187 |     return list(filterfalse(lambda x: x % 2, arr))
188 | ```
189 | 
190 | 
191 | ```python
192 | arr = list(range(10000))
193 | ```
194 | 
195 | 
196 | ```python
197 | %timeit get_even_nums(arr)
198 | ```
199 | 
200 |     417 µs ± 18.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
201 | 
202 | 
203 | 
204 | ```python
205 | %timeit fast_get_even_nums(arr)
206 | ```
207 | 
208 |     823 µs ± 22.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
209 | 
210 | 
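211 | ## 5. Conditional termination
212 | Given a list arr, sum its numbers in order and stop as soon as an element is greater than target; return the sum so far.
213 | Example:
214 | * Input: [1, 2, 3, 4, 5], 3
215 | * Output: 6 (4 > 3, so the sum stops)
216 | 
217 | With takewhile, performance actually gets worse, so do not trust itertools blindly.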
218 | 
219 | 
220 | ```python
221 | from itertools import takewhile
222 | ```
223 | 
224 | 
225 | ```python
226 | def cond_sum(arr, target):
227 |     res = 0
228 |     for x in arr:
229 |         if x > target:
230 |             break
231 |         res += x
232 |     return res
233 | ```
234 | 
235 | 
236 | ```python
237 | def fast_cond_sum(arr, target):
238 |     return sum(takewhile(lambda x: x <= target, arr))
239 | ```
240 | 
241 | 
242 | ```python
243 | arr = list(range(10000))
244 | target = 5000
245 | ```
246 | 
247 | 
248 | ```python
249 | %timeit cond_sum(arr, target)
250 | ```
251 | 
252 |     245 µs ± 11.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
253 | 
254 | 
255 | 
256 | ```python
257 | %timeit fast_cond_sum(arr, target)
258 | ```
259 | 
260 |     404 µs ± 13.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
261 | 
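Even though takewhile is slower in this benchmark, its semantics differ from plain filtering, and that difference matters on unsorted input. A small sketch on made-up data:

```python
from itertools import takewhile

arr = [1, 2, 5, 3, 1]
target = 3

# takewhile stops at the first element that fails the test (here 5).
prefix = list(takewhile(lambda x: x <= target, arr))

# A filtering comprehension scans the whole list and keeps later matches too.
kept = [x for x in arr if x <= target]
```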
262 | 
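263 | ## 6. Nested loops
264 | Given lists arr1 and arr2, return the sums of every pair formed by one element from each list.
265 | Example:
266 | * Input: [1, 2], [4, 5]
267 | * Output: [1 + 4, 1 + 5, 2 + 4, 2 + 5]
268 | 
269 | With product, the run time drops from 484 µs to 373 µs, about 1.3 times faster.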
270 | 
271 | 
272 | ```python
273 | from itertools import product
274 | ```
275 | 
276 | 
277 | ```python
278 | def _cross_sum(arr1, arr2):
279 |     for x in arr1:
280 |         for y in arr2:
281 |             yield x + y
282 | 
283 | def cross_sum(arr1, arr2):
284 |     return list(_cross_sum(arr1, arr2))
285 | ```
286 | 
287 | 
288 | ```python
289 | def fast_cross_sum(arr1, arr2):
290 |     return [x + y for x, y in product(arr1, arr2)]
291 | ```
292 | 
293 | 
294 | ```python
295 | arr1 = list(range(100))
296 | arr2 = list(range(100))
297 | ```
298 | 
299 | 
300 | ```python
301 | %timeit cross_sum(arr1, arr2)
302 | ```
303 | 
304 |     484 µs ± 16.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
305 | 
306 | 
307 | 
308 | ```python
309 | %timeit fast_cross_sum(arr1, arr2)
310 | ```
311 | 
312 |     373 µs ± 11.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
313 | 
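product yields its pairs in the same order as the equivalent nested loops, with the rightmost input varying fastest. The sketch below checks this on small inputs and also shows the repeat argument.

```python
from itertools import product

arr1, arr2 = [1, 2], [4, 5]

pairs = list(product(arr1, arr2))
nested = [(x, y) for x in arr1 for y in arr2]   # same pairs, same order

# repeat=n takes the product of an iterable with itself n times.
bits = list(product([0, 1], repeat=2))
```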
314 | 
315 | ## 7. Flattening a 2D list
316 | Given a 2D list arr, convert it to a 1D list.
317 | Example:
318 | * Input: [[1, 2], [3, 4]]
319 | * Output: [1, 2, 3, 4]
320 | 
321 | With chain, performance improves by about 6 times.
322 | 
323 | 
324 | ```python
325 | from itertools import chain
326 | ```
327 | 
328 | 
329 | ```python
330 | def _flatten(arr2d):
331 |     for arr in arr2d:
332 |         for x in arr:
333 |             yield x
334 | 
335 | def flatten(arr2d):
336 |     return list(_flatten(arr2d))
337 | ```
338 | 
339 | 
340 | ```python
341 | def fast_flatten(arr2d):
342 |     return list(chain(*arr2d))
343 | ```
344 | 
345 | 
346 | ```python
347 | arr2d = [[x + y * 100 for x in range(100)] for y in range(100)]
348 | ```
349 | 
350 | 
351 | ```python
352 | %timeit flatten(arr2d)
353 | ```
354 | 
355 |     379 µs ± 15.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
356 | 
357 | 
358 | 
359 | ```python
360 | %timeit fast_flatten(arr2d)
361 | ```
362 | 
363 |     66.9 µs ± 3.43 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
364 | 
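chain(*arr2d) unpacks the outer list into arguments before chaining. chain.from_iterable is an alternative that takes the outer iterable directly, so it also works when the rows are produced lazily. A short sketch on made-up data:

```python
from itertools import chain

arr2d = [[1, 2], [3, 4], [5]]

flat_star = list(chain(*arr2d))                # unpacks the outer list first
flat_iter = list(chain.from_iterable(arr2d))   # consumes the outer iterable lazily

# from_iterable also accepts a generator of rows.
rows = ([i, i * 10] for i in range(3))
flat_lazy = list(chain.from_iterable(rows))
```

Both forms flatten exactly one level of nesting.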
365 | 
366 | 
370 | 


--------------------------------------------------------------------------------
/Python multi threads.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# Multitasking in Python (multithreading)\n",
  8 |     "Author: tushushu  \n",
  9 |     "Project: https://github.com/tushushu/flying-python"
 10 |    ]
 11 |   },
 12 |   {
 13 |    "cell_type": "markdown",
 14 |    "metadata": {},
 15 |    "source": [
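 16 |     "## 1. GIL\n",
 17 |     "\n",
 18 |     "Anyone familiar with Python knows that the interpreter written in C (CPython) has a global interpreter lock (GIL). Because of the GIL, the interpreter can run the code of only one thread at any given time, which severely limits the performance of Python multithreading. For historical reasons, the lock is now very hard to remove. \n",
 19 |     "  \n",
 20 |     "The GIL hurts multithreaded performance because a thread can run its code only while it holds the global lock, and there is only one such lock. With Python threads, only one thread runs at any moment, so even on a multi-core machine the program gets only single-core performance. \n"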
 21 |    ]
 22 |   },
 23 |   {
 24 |    "cell_type": "markdown",
 25 |    "metadata": {},
 26 |    "source": [
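 27 |     "## 2. Using threads for IO-bound tasks\n",
 28 |     "An IO-bound task is one where the CPU is much faster than the disk or memory it depends on, so most of the time the CPU is waiting for I/O (disk/memory) reads and writes and its load stays low. Tasks that involve network or disk IO are IO-bound. While a thread is waiting on IO, the CPU is idle, so the GIL is released to other threads, which shortens the total waiting time."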
 29 |    ]
 30 |   },
 31 |   {
 32 |    "cell_type": "code",
 33 |    "execution_count": 1,
 34 |    "metadata": {},
 35 |    "outputs": [],
 36 |    "source": [
 37 |     "from concurrent.futures import ThreadPoolExecutor\n",
 38 |     "from time import sleep, time"
 39 |    ]
 40 |   },
 41 |   {
 42 |    "cell_type": "code",
 43 |    "execution_count": 2,
 44 |    "metadata": {},
 45 |    "outputs": [],
 46 |    "source": [
 47 |     "# Number of workers\n",
 48 |     "N = 4\n",
 49 |     "# Create the thread pool\n",
 50 |     "pool = ThreadPoolExecutor(max_workers=N)"
 51 |    ]
 52 |   },
 53 |   {
 54 |    "cell_type": "markdown",
 55 |    "metadata": {},
 56 |    "source": [
 57 |     "### 2.1 Define an IO-bound function\n",
 58 |     "The function \"sleeps\" for x seconds."
 59 |    ]
 60 |   },
 61 |   {
 62 |    "cell_type": "code",
 63 |    "execution_count": 3,
 64 |    "metadata": {},
 65 |    "outputs": [],
 66 |    "source": [
 67 |     "def io_bound_func(x):\n",
 68 |     "    sleep(x)\n",
 69 |     "    print(\"Sleep for %d seconds.\" % x)"
 70 |    ]
 71 |   },
 72 |   {
 73 |    "cell_type": "markdown",
 74 |    "metadata": {},
 75 |    "source": [
 76 |     "### 2.2 Serial version\n",
 77 |     "Iterate over all elements of a list and call io_bound_func on each."
 78 |    ]
 79 |   },
 80 |   {
 81 |    "cell_type": "code",
 82 |    "execution_count": 4,
 83 |    "metadata": {},
 84 |    "outputs": [],
 85 |    "source": [
 86 |     "def process_array(arr):\n",
 87 |     "    for x in arr:\n",
 88 |     "        io_bound_func(x)"
 89 |    ]
 90 |   },
 91 |   {
 92 |    "cell_type": "markdown",
 93 |    "metadata": {},
 94 |    "source": [
 95 |     "### 2.3 Multithreaded version\n",
 96 |     "The thread pool's map method applies the same function to every element of the list."
 97 |    ]
 98 |   },
 99 |   {
100 |    "cell_type": "code",
101 |    "execution_count": 5,
102 |    "metadata": {},
103 |    "outputs": [],
104 |    "source": [
105 |     "def fast_process_array(arr):\n",
106 |     "    for x in pool.map(io_bound_func, arr):\n",
107 |     "        pass"
108 |    ]
109 |   },
110 |   {
111 |    "cell_type": "markdown",
112 |    "metadata": {},
113 |    "source": [
114 |     "### 2.4 Measure the run time\n",
115 |     "- Serial version run time = 1 + 2 + 3 = 6 seconds  \n",
116 |     "- Multithreaded version run time = max(1, 2, 3) = 3 seconds"
117 |    ]
118 |   },
119 |   {
120 |    "cell_type": "code",
121 |    "execution_count": 6,
122 |    "metadata": {},
123 |    "outputs": [],
124 |    "source": [
125 |     "def time_it(fn, *args):\n",
126 |     "    start = time()\n",
127 |     "    fn(*args)\n",
128 |     "    print(\"%s版本的运行时间为 %.5f 秒!\" % (fn.__name__, time() - start))"
129 |    ]
130 |   },
131 |   {
132 |    "cell_type": "code",
133 |    "execution_count": 7,
134 |    "metadata": {},
135 |    "outputs": [
136 |     {
137 |      "name": "stdout",
138 |      "output_type": "stream",
139 |      "text": [
140 |       "Sleep for 1 seconds.\n",
141 |       "Sleep for 2 seconds.\n",
142 |       "Sleep for 3 seconds.\n",
143 |       "process_array版本的运行时间为 6.00883 秒!\n"
144 |      ]
145 |     }
146 |    ],
147 |    "source": [
148 |     "time_it(process_array, [1, 2, 3])"
149 |    ]
150 |   },
151 |   {
152 |    "cell_type": "code",
153 |    "execution_count": 8,
154 |    "metadata": {},
155 |    "outputs": [
156 |     {
157 |      "name": "stdout",
158 |      "output_type": "stream",
159 |      "text": [
160 |       "Sleep for 1 seconds.\n",
161 |       "Sleep for 2 seconds.\n",
162 |       "Sleep for 3 seconds.\n",
163 |       "fast_process_array版本的运行时间为 3.00300 秒!\n"
164 |      ]
165 |     }
166 |    ],
167 |    "source": [
168 |     "time_it(fast_process_array, [1, 2, 3])"
169 |    ]
170 |   },
171 |   {
172 |    "cell_type": "markdown",
173 |    "metadata": {},
174 |    "source": [
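175 |     "## 3. Using threads for CPU-bound tasks\n",
176 |     "A CPU-bound task does a large amount of computation and consumes CPU resources, for example computing pi or decoding HD video; it depends entirely on the CPU's computing power. While a thread runs a CPU-bound task, the CPU stays busy and the thread gives up the GIL to other threads only periodically (in CPython 3.2 and later, after a switch interval of 5 ms by default; older versions switched after a fixed number of bytecode instructions). Once the cost of switching threads is added, the multithreaded version may be even slower than the serial code."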
177 |    ]
178 |   },
179 |   {
180 |    "cell_type": "markdown",
181 |    "metadata": {},
182 |    "source": [
183 |     "### 3.1 Define a CPU-bound function\n",
184 |     "The function sums the integers in [1, x]."
185 |    ]
186 |   },
187 |   {
188 |    "cell_type": "code",
189 |    "execution_count": 9,
190 |    "metadata": {},
191 |    "outputs": [],
192 |    "source": [
193 |     "def cpu_bound_func(x):\n",
194 |     "    tot = 0\n",
195 |     "    a = 1\n",
196 |     "    while a <= x:\n",
197 |     "        tot += a\n",
198 |     "        a += 1\n",
199 |     "    print(\"Finish sum from 1 to %d!\" % x)\n",
200 |     "    return tot"
201 |    ]
202 |   },
203 |   {
204 |    "cell_type": "markdown",
205 |    "metadata": {},
206 |    "source": [
207 |     "### 3.2 Serial version\n",
208 |     "Iterate over all elements of a list and call cpu_bound_func on each."
209 |    ]
210 |   },
211 |   {
212 |    "cell_type": "code",
213 |    "execution_count": 10,
214 |    "metadata": {},
215 |    "outputs": [],
216 |    "source": [
217 |     "def process_array(arr):\n",
218 |     "    for x in arr:\n",
219 |     "        cpu_bound_func(x)"
220 |    ]
221 |   },
222 |   {
223 |    "cell_type": "markdown",
224 |    "metadata": {},
225 |    "source": [
226 |     "### 3.3 Multithreaded version\n",
227 |     "The thread pool's map method applies the same function to every element of the list."
228 |    ]
229 |   },
230 |   {
231 |    "cell_type": "code",
232 |    "execution_count": 11,
233 |    "metadata": {},
234 |    "outputs": [],
235 |    "source": [
236 |     "def fast_process_array(arr):\n",
237 |     "    for x in pool.map(cpu_bound_func, arr):\n",
238 |     "        pass"
239 |    ]
240 |   },
241 |   {
242 |    "cell_type": "markdown",
243 |    "metadata": {},
244 |    "source": [
245 |     "### 3.4 Measure the run time\n",
246 |     "- Serial version run time: 2.1 seconds\n",
247 |     "- Multithreaded version run time: 2.2 seconds"
248 |    ]
249 |   },
250 |   {
251 |    "cell_type": "code",
252 |    "execution_count": 12,
253 |    "metadata": {},
254 |    "outputs": [],
255 |    "source": [
256 |     "def time_it(fn, *args):\n",
257 |     "    start = time()\n",
258 |     "    fn(*args)\n",
259 |     "    print(\"%s版本的运行时间为 %.5f 秒!\" % (fn.__name__, time() - start))"
260 |    ]
261 |   },
262 |   {
263 |    "cell_type": "code",
264 |    "execution_count": 13,
265 |    "metadata": {},
266 |    "outputs": [
267 |     {
268 |      "name": "stdout",
269 |      "output_type": "stream",
270 |      "text": [
271 |       "Finish sum from 1 to 10000000!\n",
272 |       "Finish sum from 1 to 10000000!\n",
273 |       "Finish sum from 1 to 10000000!\n",
274 |       "process_array版本的运行时间为 2.10489 秒!\n"
275 |      ]
276 |     }
277 |    ],
278 |    "source": [
279 |     "time_it(process_array, [10**7, 10**7, 10**7])"
280 |    ]
281 |   },
282 |   {
283 |    "cell_type": "code",
284 |    "execution_count": 14,
285 |    "metadata": {},
286 |    "outputs": [
287 |     {
288 |      "name": "stdout",
289 |      "output_type": "stream",
290 |      "text": [
291 |       "Finish sum from 1 to 10000000!\n",
292 |       "Finish sum from 1 to 10000000!\n",
293 |       "Finish sum from 1 to 10000000!\n",
294 |       "fast_process_array版本的运行时间为 2.20897 秒!\n"
295 |      ]
296 |     }
297 |    ],
298 |    "source": [
299 |     "time_it(fast_process_array, [10**7, 10**7, 10**7])"
300 |    ]
301 |   },
302 |   {
303 |    "cell_type": "markdown",
304 |    "metadata": {},
305 |    "source": [
306 |     "## References\n",
307 |     "https://www.jianshu.com/p/c75ed8a6e9af  \n",
308 |     "https://www.cnblogs.com/tusheng/articles/10630662.html"
309 |    ]
310 |   },
311 |   {
312 |    "cell_type": "code",
313 |    "execution_count": null,
314 |    "metadata": {},
315 |    "outputs": [],
316 |    "source": []
317 |   }
318 |  ],
319 |  "metadata": {
320 |   "kernelspec": {
321 |    "display_name": "Python 3",
322 |    "language": "python",
323 |    "name": "python3"
324 |   },
325 |   "language_info": {
326 |    "codemirror_mode": {
327 |     "name": "ipython",
328 |     "version": 3
329 |    },
330 |    "file_extension": ".py",
331 |    "mimetype": "text/x-python",
332 |    "name": "python",
333 |    "nbconvert_exporter": "python",
334 |    "pygments_lexer": "ipython3",
335 |    "version": "3.6.6"
336 |   }
337 |  },
338 |  "nbformat": 4,
339 |  "nbformat_minor": 2
340 | }
341 | 
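
The comparison in section 3 of the multi-threading notebook can also be reproduced outside Jupyter. The following is a minimal sketch rather than the notebook's exact code. It assumes the pool is a `concurrent.futures.ThreadPoolExecutor` with 3 workers (the notebook's own `pool` object is defined outside this excerpt), it adds `a` inside the loop so that the result really is the sum from 1 to `x`, and `max_workers` and the workload size are illustrative choices.

```python
from concurrent.futures import ThreadPoolExecutor
from time import perf_counter


def cpu_bound_func(x):
    """Sum the integers from 1 to x with a pure-Python loop."""
    tot = 0
    a = 1
    while a <= x:
        tot += a
        a += 1
    return tot


def process_array(arr):
    """Serial version: one call after another."""
    return [cpu_bound_func(x) for x in arr]


def fast_process_array(arr, max_workers=3):
    """Thread-pool version (assumed pool type, see the note above)."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(cpu_bound_func, arr))


def time_it(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = perf_counter()
    result = fn(*args)
    return result, perf_counter() - start


if __name__ == "__main__":
    work = [10**5] * 3  # smaller than the notebook's 10**7, to keep the run short
    serial_result, serial_secs = time_it(process_array, work)
    threaded_result, threaded_secs = time_it(fast_process_array, work)
    assert serial_result == threaded_result
    print("serial:   %.5f s" % serial_secs)
    print("threaded: %.5f s" % threaded_secs)
```

On CPython the two timings are expected to be close, since the GIL serializes the pure-Python loop. Threads pay off mainly when the work waits on I/O.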


--------------------------------------------------------------------------------
/More efficient pandas.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# Making Pandas DataFrame 40x faster"
  8 |    ]
  9 |   },
 10 |   {
 11 |    "cell_type": "markdown",
 12 |    "metadata": {},
 13 |    "source": [
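 14 |     "## 1. A first attempt\n",
 15 |     "Pandas is a well-known tool for data analysis. Sometimes we need to run complex processing over tens or even hundreds of millions of rows, and then running time becomes a problem we cannot ignore. Take the simple example below: we randomly generate one million rows and process the 'val' column, subtracting 1 from even values and adding 1 to odd values. Real data analysis is far more complex than this example, but since we (mostly I) do not have much time to wait for results, let's keep it simple. The transform function takes 284 ms on average."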
 16 |    ]
 17 |   },
 18 |   {
 19 |    "cell_type": "code",
 20 |    "execution_count": 1,
 21 |    "metadata": {},
 22 |    "outputs": [
 23 |     {
 24 |      "data": {
 25 |       "text/html": [
 26 |        "<div>\n",
 27 |        "<style scoped>\n",
 28 |        "    .dataframe tbody tr th:only-of-type {\n",
 29 |        "        vertical-align: middle;\n",
 30 |        "    }\n",
 31 |        "\n",
 32 |        "    .dataframe tbody tr th {\n",
 33 |        "        vertical-align: top;\n",
 34 |        "    }\n",
 35 |        "\n",
 36 |        "    .dataframe thead th {\n",
 37 |        "        text-align: right;\n",
 38 |        "    }\n",
 39 |        "</style>\n",
 40 |        "<table border=\"1\" class=\"dataframe\">\n",
 41 |        "  <thead>\n",
 42 |        "    <tr style=\"text-align: right;\">\n",
 43 |        "      <th></th>\n",
 44 |        "      <th>genre</th>\n",
 45 |        "      <th>val</th>\n",
 46 |        "    </tr>\n",
 47 |        "  </thead>\n",
 48 |        "  <tbody>\n",
 49 |        "    <tr>\n",
 50 |        "      <th>0</th>\n",
 51 |        "      <td>C</td>\n",
 52 |        "      <td>54</td>\n",
 53 |        "    </tr>\n",
 54 |        "    <tr>\n",
 55 |        "      <th>1</th>\n",
 56 |        "      <td>A</td>\n",
 57 |        "      <td>5</td>\n",
 58 |        "    </tr>\n",
 59 |        "    <tr>\n",
 60 |        "      <th>2</th>\n",
 61 |        "      <td>D</td>\n",
 62 |        "      <td>0</td>\n",
 63 |        "    </tr>\n",
 64 |        "    <tr>\n",
 65 |        "      <th>3</th>\n",
 66 |        "      <td>D</td>\n",
 67 |        "      <td>42</td>\n",
 68 |        "    </tr>\n",
 69 |        "    <tr>\n",
 70 |        "      <th>4</th>\n",
 71 |        "      <td>C</td>\n",
 72 |        "      <td>91</td>\n",
 73 |        "    </tr>\n",
 74 |        "  </tbody>\n",
 75 |        "</table>\n",
 76 |        "</div>"
 77 |       ],
 78 |       "text/plain": [
 79 |        "  genre  val\n",
 80 |        "0     C   54\n",
 81 |        "1     A    5\n",
 82 |        "2     D    0\n",
 83 |        "3     D   42\n",
 84 |        "4     C   91"
 85 |       ]
 86 |      },
 87 |      "execution_count": 1,
 88 |      "metadata": {},
 89 |      "output_type": "execute_result"
 90 |     }
 91 |    ],
 92 |    "source": [
 93 |     "import pandas as pd\n",
 94 |     "import numpy as np\n",
 95 |     "\n",
 96 |     "def gen_data(size):\n",
 97 |     "    d = dict()\n",
 98 |     "    d[\"genre\"] = np.random.choice([\"A\", \"B\", \"C\", \"D\"], size=size)\n",
 99 |     "    d[\"val\"] = np.random.randint(low=0, high=100, size=size)\n",
100 |     "    return pd.DataFrame(d)\n",
101 |     "\n",
102 |     "data = gen_data(1000000)\n",
103 |     "data.head()"
104 |    ]
105 |   },
106 |   {
107 |    "cell_type": "code",
108 |    "execution_count": 2,
109 |    "metadata": {},
110 |    "outputs": [],
111 |    "source": [
112 |     "def transform(data):\n",
113 |     "    data.loc[:, \"new_val\"] = data.val.apply(lambda x: x + 1 if x % 2 else x - 1)"
114 |    ]
115 |   },
116 |   {
117 |    "cell_type": "code",
118 |    "execution_count": 3,
119 |    "metadata": {},
120 |    "outputs": [
121 |     {
122 |      "name": "stdout",
123 |      "output_type": "stream",
124 |      "text": [
125 |       "284 ms ± 8.95 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
126 |      ]
127 |     }
128 |    ],
129 |    "source": [
130 |     "%timeit -n 1 transform(data)"
131 |    ]
132 |   },
133 |   {
134 |    "cell_type": "markdown",
135 |    "metadata": {},
136 |    "source": [
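137 |     "## 2. Writing a C extension with Cython\n",
138 |     "Let's write the function `x + 1 if x % 2 else x - 1` with our old friend Cython. The average running time drops to 202 ms, so it is indeed faster. That is only about a 1.4x improvement, though, still a long way from the 40x target."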
139 |    ]
140 |   },
141 |   {
142 |    "cell_type": "code",
143 |    "execution_count": 4,
144 |    "metadata": {},
145 |    "outputs": [],
146 |    "source": [
147 |     "%load_ext cython"
148 |    ]
149 |   },
150 |   {
151 |    "cell_type": "code",
152 |    "execution_count": 5,
153 |    "metadata": {},
154 |    "outputs": [],
155 |    "source": [
156 |     "%%cython\n",
157 |     "cpdef int _transform(int x):\n",
158 |     "    if x % 2:\n",
159 |     "        return x + 1\n",
160 |     "    return x - 1\n",
161 |     "\n",
162 |     "def transform(data):\n",
163 |     "    data.loc[:, \"new_val\"] = data.val.apply(_transform)"
164 |    ]
165 |   },
166 |   {
167 |    "cell_type": "code",
168 |    "execution_count": 6,
169 |    "metadata": {},
170 |    "outputs": [
171 |     {
172 |      "name": "stdout",
173 |      "output_type": "stream",
174 |      "text": [
175 |       "202 ms ± 13.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
176 |      ]
177 |     }
178 |    ],
179 |    "source": [
180 |     "%timeit -n 1 transform(data)"
181 |    ]
182 |   },
183 |   {
184 |    "cell_type": "markdown",
185 |    "metadata": {},
186 |    "source": [
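187 |     "## 3. Reducing type conversions\n",
188 |     "To reduce type conversions between C and Python, we pass the 'val' column to the Cython function directly as a NumPy array. Note the difference between cnp and np. The average running time drops to 10.8 ms, roughly a 26x improvement, and the target comes into sight."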
189 |    ]
190 |   },
191 |   {
192 |    "cell_type": "code",
193 |    "execution_count": 7,
194 |    "metadata": {},
195 |    "outputs": [],
196 |    "source": [
197 |     "%%cython\n",
198 |     "import numpy as np\n",
199 |     "cimport numpy as cnp\n",
200 |     "ctypedef cnp.int_t DTYPE_t\n",
201 |     "\n",
202 |     "cpdef cnp.ndarray[DTYPE_t] _transform(cnp.ndarray[DTYPE_t] arr):\n",
203 |     "    cdef:\n",
204 |     "        int i = 0\n",
205 |     "        int n = arr.shape[0]\n",
206 |     "        int x\n",
207 |     "        cnp.ndarray[DTYPE_t] new_arr = np.empty_like(arr)\n",
208 |     "\n",
209 |     "    while i < n:\n",
210 |     "        x = arr[i]\n",
211 |     "        if x % 2:\n",
212 |     "            new_arr[i] = x + 1\n",
213 |     "        else:\n",
214 |     "            new_arr[i] = x - 1\n",
215 |     "        i += 1\n",
216 |     "    return new_arr\n",
217 |     "\n",
218 |     "def transform(data):\n",
219 |     "    data.loc[:, \"new_val\"] = _transform(data.val.values)"
220 |    ]
221 |   },
222 |   {
223 |    "cell_type": "code",
224 |    "execution_count": 8,
225 |    "metadata": {},
226 |    "outputs": [
227 |     {
228 |      "name": "stdout",
229 |      "output_type": "stream",
230 |      "text": [
231 |       "10.8 ms ± 512 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
232 |      ]
233 |     }
234 |    ],
235 |    "source": [
236 |     "%timeit -n 1 transform(data)"
237 |    ]
238 |   },
239 |   {
240 |    "cell_type": "markdown",
241 |    "metadata": {},
242 |    "source": [
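243 |     "## 4. Using unsafe arrays\n",
244 |     "The @cython.boundscheck(False) and @cython.wraparound(False) decorators turn off array bounds checking and negative-index handling, which brings the average running time to 6.76 ms. That is roughly a 42x improvement over the original 284 ms, so the target is met."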
245 |    ]
246 |   },
247 |   {
248 |    "cell_type": "code",
249 |    "execution_count": 9,
250 |    "metadata": {},
251 |    "outputs": [],
252 |    "source": [
253 |     "%%cython\n",
254 |     "import cython\n",
255 |     "import numpy as np\n",
256 |     "cimport numpy as cnp\n",
257 |     "ctypedef cnp.int_t DTYPE_t\n",
258 |     "\n",
259 |     "\n",
260 |     "@cython.boundscheck(False)\n",
261 |     "@cython.wraparound(False)\n",
262 |     "cpdef cnp.ndarray[DTYPE_t] _transform(cnp.ndarray[DTYPE_t] arr):\n",
263 |     "    cdef:\n",
264 |     "        int i = 0\n",
265 |     "        int n = arr.shape[0]\n",
266 |     "        int x\n",
267 |     "        cnp.ndarray[DTYPE_t] new_arr = np.empty_like(arr)\n",
268 |     "\n",
269 |     "    while i < n:\n",
270 |     "        x = arr[i]\n",
271 |     "        if x % 2:\n",
272 |     "            new_arr[i] = x + 1\n",
273 |     "        else:\n",
274 |     "            new_arr[i] = x - 1\n",
275 |     "        i += 1\n",
276 |     "    return new_arr\n",
277 |     "\n",
278 |     "def transform(data):\n",
279 |     "    data.loc[:, \"new_val\"] = _transform(data.val.values)"
280 |    ]
281 |   },
282 |   {
283 |    "cell_type": "code",
284 |    "execution_count": 10,
285 |    "metadata": {},
286 |    "outputs": [
287 |     {
288 |      "name": "stdout",
289 |      "output_type": "stream",
290 |      "text": [
291 |       "6.76 ms ± 545 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
292 |      ]
293 |     }
294 |    ],
295 |    "source": [
296 |     "%timeit -n 1 transform(data)"
297 |    ]
298 |   }
299 |  ],
300 |  "metadata": {
301 |   "kernelspec": {
302 |    "display_name": "Python 3",
303 |    "language": "python",
304 |    "name": "python3"
305 |   },
306 |   "language_info": {
307 |    "codemirror_mode": {
308 |     "name": "ipython",
309 |     "version": 3
310 |    },
311 |    "file_extension": ".py",
312 |    "mimetype": "text/x-python",
313 |    "name": "python",
314 |    "nbconvert_exporter": "python",
315 |    "pygments_lexer": "ipython3",
316 |    "version": "3.6.6"
317 |   }
318 |  },
319 |  "nbformat": 4,
320 |  "nbformat_minor": 2
321 | }
322 | 


--------------------------------------------------------------------------------
/Python Standard Library.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# Writing efficient code with the Python standard library\n",
  8 |     "Author: tushushu  \n",
  9 |     "Project: https://github.com/tushushu/flying-python"
 10 |    ]
 11 |   },
 12 |   {
 13 |    "cell_type": "markdown",
 14 |    "metadata": {},
 15 |    "source": [
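 16 |     "## 1. bisect - binary search\n",
 17 |     "Given a list, we want to look up a target element and return its index in the list.  \n",
 18 |     "* The first thing that comes to mind is the list's index method. Build an ascending list of length 10000 and write a search function that looks up every element with index. The average running time is 437 ms.\n",
 19 |     "* Use bisect_left from the bisect module, i.e. the familiar binary search, which requires the list to be sorted. The fast_search function takes 3.94 ms on average, a 110x improvement!"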
 20 |    ]
 21 |   },
 22 |   {
 23 |    "cell_type": "code",
 24 |    "execution_count": 1,
 25 |    "metadata": {},
 26 |    "outputs": [],
 27 |    "source": [
 28 |     "import bisect"
 29 |    ]
 30 |   },
 31 |   {
 32 |    "cell_type": "code",
 33 |    "execution_count": 2,
 34 |    "metadata": {},
 35 |    "outputs": [],
 36 |    "source": [
 37 |     "def search(nums):\n",
 38 |     "    for x in nums:\n",
 39 |     "        nums.index(x)"
 40 |    ]
 41 |   },
 42 |   {
 43 |    "cell_type": "code",
 44 |    "execution_count": 3,
 45 |    "metadata": {},
 46 |    "outputs": [],
 47 |    "source": [
 48 |     "def fast_search(nums):\n",
 49 |     "    for x in nums:\n",
 50 |     "        bisect.bisect_left(nums, x)"
 51 |    ]
 52 |   },
 53 |   {
 54 |    "cell_type": "code",
 55 |    "execution_count": 4,
 56 |    "metadata": {},
 57 |    "outputs": [],
 58 |    "source": [
 59 |     "arr = list(range(10000))"
 60 |    ]
 61 |   },
 62 |   {
 63 |    "cell_type": "code",
 64 |    "execution_count": 5,
 65 |    "metadata": {},
 66 |    "outputs": [
 67 |     {
 68 |      "name": "stdout",
 69 |      "output_type": "stream",
 70 |      "text": [
 71 |       "437 ms ± 12.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
 72 |      ]
 73 |     }
 74 |    ],
 75 |    "source": [
 76 |     "%timeit -n 1 search(arr)"
 77 |    ]
 78 |   },
 79 |   {
 80 |    "cell_type": "code",
 81 |    "execution_count": 6,
 82 |    "metadata": {},
 83 |    "outputs": [
 84 |     {
 85 |      "name": "stdout",
 86 |      "output_type": "stream",
 87 |      "text": [
 88 |       "3.94 ms ± 407 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
 89 |      ]
 90 |     }
 91 |    ],
 92 |    "source": [
 93 |     "%timeit -n 1 fast_search(arr)"
 94 |    ]
 95 |   },
 96 |   {
 97 |    "cell_type": "markdown",
 98 |    "metadata": {},
 99 |    "source": [
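100 |     "## 2. Counter - efficient counting\n",
101 |     "Given a list, we want to count how many times each distinct element occurs and return a dict.  \n",
102 |     "* Create a list of length 10000 whose elements are random integers from 1 to 3. Write a count function that creates an empty dict, iterates over the list with a for loop, and writes the counts into the dict. The average running time is 937 µs.\n",
103 |     "* Use Counter from the collections module to write fast_count in a single line. The average running time is 494 µs, almost twice as fast as the original."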
104 |    ]
105 |   },
106 |   {
107 |    "cell_type": "code",
108 |    "execution_count": 7,
109 |    "metadata": {},
110 |    "outputs": [],
111 |    "source": [
112 |     "from collections import Counter\n",
113 |     "from random import randint"
114 |    ]
115 |   },
116 |   {
117 |    "cell_type": "code",
118 |    "execution_count": 8,
119 |    "metadata": {},
120 |    "outputs": [],
121 |    "source": [
122 |     "def count(nums):\n",
123 |     "    res = dict()\n",
124 |     "    for x in nums:\n",
125 |     "        if x in res:\n",
126 |     "            res[x] += 1\n",
127 |     "        else:\n",
128 |     "            res[x] = 1\n",
129 |     "    return res"
130 |    ]
131 |   },
132 |   {
133 |    "cell_type": "code",
134 |    "execution_count": 9,
135 |    "metadata": {},
136 |    "outputs": [],
137 |    "source": [
138 |     "def fast_count(nums):\n",
139 |     "    return Counter(nums)"
140 |    ]
141 |   },
142 |   {
143 |    "cell_type": "code",
144 |    "execution_count": 10,
145 |    "metadata": {},
146 |    "outputs": [],
147 |    "source": [
148 |     "nums = [randint(1, 3) for _ in range(10000)]"
149 |    ]
150 |   },
151 |   {
152 |    "cell_type": "code",
153 |    "execution_count": 11,
154 |    "metadata": {},
155 |    "outputs": [
156 |     {
157 |      "name": "stdout",
158 |      "output_type": "stream",
159 |      "text": [
160 |       "937 µs ± 153 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
161 |      ]
162 |     }
163 |    ],
164 |    "source": [
165 |     "%timeit -n 1 count(nums)"
166 |    ]
167 |   },
168 |   {
169 |    "cell_type": "code",
170 |    "execution_count": 12,
171 |    "metadata": {},
172 |    "outputs": [
173 |     {
174 |      "name": "stdout",
175 |      "output_type": "stream",
176 |      "text": [
177 |       "494 µs ± 240 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
178 |      ]
179 |     }
180 |    ],
181 |    "source": [
182 |     "%timeit -n 1 fast_count(nums)"
183 |    ]
184 |   },
185 |   {
186 |    "cell_type": "markdown",
187 |    "metadata": {},
188 |    "source": [
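189 |     "## 3. heapq - heaps\n",
190 |     "Given a list, return its 3 smallest elements.\n",
191 |     "* Create a list of length 10000 and shuffle it. Write a top_3 function that sorts the list and returns the first 3 elements. The average running time is 2.03 ms.\n",
192 |     "* Use the heapq module, i.e. the familiar heap, to write fast_top_3. The average running time is 296 µs, a 6.8x improvement."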
193 |    ]
194 |   },
195 |   {
196 |    "cell_type": "code",
197 |    "execution_count": 13,
198 |    "metadata": {},
199 |    "outputs": [],
200 |    "source": [
201 |     "import heapq\n",
202 |     "from random import shuffle"
203 |    ]
204 |   },
205 |   {
206 |    "cell_type": "code",
207 |    "execution_count": 14,
208 |    "metadata": {},
209 |    "outputs": [],
210 |    "source": [
211 |     "def top_3(nums):\n",
212 |     "    return sorted(nums)[:3]"
213 |    ]
214 |   },
215 |   {
216 |    "cell_type": "code",
217 |    "execution_count": 15,
218 |    "metadata": {},
219 |    "outputs": [],
220 |    "source": [
221 |     "def fast_top_3(nums):\n",
222 |     "    return heapq.nsmallest(3, nums)"
223 |    ]
224 |   },
225 |   {
226 |    "cell_type": "code",
227 |    "execution_count": 16,
228 |    "metadata": {},
229 |    "outputs": [],
230 |    "source": [
231 |     "nums = list(range(10000))\n",
232 |     "shuffle(nums)"
233 |    ]
234 |   },
235 |   {
236 |    "cell_type": "code",
237 |    "execution_count": 17,
238 |    "metadata": {},
239 |    "outputs": [
240 |     {
241 |      "name": "stdout",
242 |      "output_type": "stream",
243 |      "text": [
244 |       "2.03 ms ± 236 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
245 |      ]
246 |     }
247 |    ],
248 |    "source": [
249 |     "%timeit -n 1 top_3(nums)"
250 |    ]
251 |   },
252 |   {
253 |    "cell_type": "code",
254 |    "execution_count": 18,
255 |    "metadata": {},
256 |    "outputs": [
257 |     {
258 |      "name": "stdout",
259 |      "output_type": "stream",
260 |      "text": [
261 |       "296 µs ± 56.2 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
262 |      ]
263 |     }
264 |    ],
265 |    "source": [
266 |     "%timeit -n 1 fast_top_3(nums)"
267 |    ]
268 |   },
269 |   {
270 |    "cell_type": "markdown",
271 |    "metadata": {},
272 |    "source": [
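273 |     "## 4. itemgetter - fetching items in bulk\n",
274 |     "Given a dict and a list containing one or more of the dict's keys, return the corresponding values.\n",
275 |     "* Create a dict with 100,000 entries and randomly sample 10,000 of its keys (with replacement) to form a list of length 10,000. The get_items function takes 1.12 ms on average.\n",
276 |     "* Use itemgetter to read these elements in bulk. The fast_get_items function takes 836 µs on average, about 1.3x as fast as the original.\n"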
277 |    ]
278 |   },
279 |   {
280 |    "cell_type": "code",
281 |    "execution_count": 19,
282 |    "metadata": {},
283 |    "outputs": [],
284 |    "source": [
285 |     "from operator import itemgetter\n",
286 |     "from random import choices"
287 |    ]
288 |   },
289 |   {
290 |    "cell_type": "code",
291 |    "execution_count": 20,
292 |    "metadata": {},
293 |    "outputs": [],
294 |    "source": [
295 |     "def get_items(data, keys):\n",
296 |     "    return [data[x] for x in keys]"
297 |    ]
298 |   },
299 |   {
300 |    "cell_type": "code",
301 |    "execution_count": 21,
302 |    "metadata": {},
303 |    "outputs": [],
304 |    "source": [
305 |     "def fast_get_items(data, keys):\n",
306 |     "    return itemgetter(*keys)(data)"
307 |    ]
308 |   },
309 |   {
310 |    "cell_type": "code",
311 |    "execution_count": 22,
312 |    "metadata": {},
313 |    "outputs": [],
314 |    "source": [
315 |     "data= dict(enumerate(range(100000)))\n",
316 |     "keys = choices(list(data.keys()), k=10000)"
317 |    ]
318 |   },
319 |   {
320 |    "cell_type": "code",
321 |    "execution_count": 23,
322 |    "metadata": {},
323 |    "outputs": [
324 |     {
325 |      "name": "stdout",
326 |      "output_type": "stream",
327 |      "text": [
328 |       "1.12 ms ± 354 µs per loop (mean ± std. dev. of 7 runs, 5 loops each)\n"
329 |      ]
330 |     }
331 |    ],
332 |    "source": [
333 |     "%timeit -n 5 get_items(data, keys)"
334 |    ]
335 |   },
336 |   {
337 |    "cell_type": "code",
338 |    "execution_count": 24,
339 |    "metadata": {},
340 |    "outputs": [
341 |     {
342 |      "name": "stdout",
343 |      "output_type": "stream",
344 |      "text": [
345 |       "836 µs ± 287 µs per loop (mean ± std. dev. of 7 runs, 5 loops each)\n"
346 |      ]
347 |     }
348 |    ],
349 |    "source": [
350 |     "%timeit -n 5 fast_get_items(data, keys)"
351 |    ]
352 |   },
353 |   {
354 |    "cell_type": "markdown",
355 |    "metadata": {},
356 |    "source": [
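357 |     "## 5. lru_cache - trading space for time\n",
358 |     "Given a number n, return the Fibonacci sequence of length n.\n",
359 |     "* Write a recursive fib function and call it in a loop from fib_seq. With n equal to 20, the average running time is 3.28 ms.\n",
360 |     "* Use the @lru_cache decorator to cache results that have already been computed. For example, computing fib(5) can reuse the cached fib(4). The average running time is 144 µs, a 22x improvement."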
361 |    ]
362 |   },
363 |   {
364 |    "cell_type": "code",
365 |    "execution_count": 25,
366 |    "metadata": {},
367 |    "outputs": [],
368 |    "source": [
369 |     "from functools import lru_cache"
370 |    ]
371 |   },
372 |   {
373 |    "cell_type": "code",
374 |    "execution_count": 36,
375 |    "metadata": {},
376 |    "outputs": [],
377 |    "source": [
378 |     "def fib(n):\n",
379 |     "    if n < 2:\n",
380 |     "        return n\n",
381 |     "    return fib(n-1) + fib(n-2)\n",
382 |     "\n",
383 |     "def fib_seq(n):\n",
384 |     "    return [fib(x) for x in range(n)]"
385 |    ]
386 |   },
387 |   {
388 |    "cell_type": "code",
389 |    "execution_count": 27,
390 |    "metadata": {},
391 |    "outputs": [],
392 |    "source": [
393 |     "@lru_cache(maxsize=None)\n",
394 |     "def fast_fib(n):\n",
395 |     "    if n < 2:\n",
396 |     "        return n\n",
397 |     "    return fast_fib(n-1) + fast_fib(n-2)\n",
398 |     "\n",
399 |     "def fast_fib_seq(n):\n",
400 |     "    return [fast_fib(x) for x in range(n)]"
401 |    ]
402 |   },
403 |   {
404 |    "cell_type": "code",
405 |    "execution_count": 28,
406 |    "metadata": {},
407 |    "outputs": [
408 |     {
409 |      "name": "stdout",
410 |      "output_type": "stream",
411 |      "text": [
412 |       "3.28 ms ± 220 µs per loop (mean ± std. dev. of 7 runs, 3 loops each)\n"
413 |      ]
414 |     }
415 |    ],
416 |    "source": [
417 |     "%timeit -n 5 fib_seq(20)"
418 |    ]
419 |   },
420 |   {
421 |    "cell_type": "code",
422 |    "execution_count": 29,
423 |    "metadata": {},
424 |    "outputs": [
425 |     {
426 |      "name": "stdout",
427 |      "output_type": "stream",
428 |      "text": [
429 |       "The slowest run took 524.07 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
430 |       "144 µs ± 347 µs per loop (mean ± std. dev. of 7 runs, 3 loops each)\n"
431 |      ]
432 |     }
433 |    ],
434 |    "source": [
435 |     "%timeit -n 5 fast_fib_seq(20)"
436 |    ]
437 |   },
438 |   {
439 |    "cell_type": "code",
440 |    "execution_count": null,
441 |    "metadata": {},
442 |    "outputs": [],
443 |    "source": []
444 |   }
445 |  ],
446 |  "metadata": {
447 |   "kernelspec": {
448 |    "display_name": "Python 3",
449 |    "language": "python",
450 |    "name": "python3"
451 |   },
452 |   "language_info": {
453 |    "codemirror_mode": {
454 |     "name": "ipython",
455 |     "version": 3
456 |    },
457 |    "file_extension": ".py",
458 |    "mimetype": "text/x-python",
459 |    "name": "python",
460 |    "nbconvert_exporter": "python",
461 |    "pygments_lexer": "ipython3",
462 |    "version": "3.6.6"
463 |   }
464 |  },
465 |  "nbformat": 4,
466 |  "nbformat_minor": 2
467 | }
468 | 
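
Each fast version in the standard-library notebook should return the same values as the slow one it replaces. The following is a minimal sketch of such checks, using smaller inputs than the notebook. `index_of` is a hypothetical helper (not part of the notebook) that turns `bisect_left` into a true lookup, and `count` is written so that it returns the counting dict.

```python
import bisect
import heapq
from collections import Counter
from functools import lru_cache
from operator import itemgetter
from random import choices, randint, shuffle


def index_of(sorted_nums, x):
    """Return the index of x in an ascending list, found by binary search."""
    i = bisect.bisect_left(sorted_nums, x)
    if i < len(sorted_nums) and sorted_nums[i] == x:
        return i
    raise ValueError("%r is not in list" % (x,))


def count(nums):
    """Count occurrences with a plain dict."""
    res = {}
    for x in nums:
        res[x] = res.get(x, 0) + 1
    return res


@lru_cache(maxsize=None)
def fast_fib(n):
    """Memoized Fibonacci; the recursion calls the cached function itself."""
    if n < 2:
        return n
    return fast_fib(n - 1) + fast_fib(n - 2)


# 1. bisect: same index as list.index on an ascending list.
arr = list(range(1000))
assert all(index_of(arr, x) == arr.index(x) for x in arr)

# 2. Counter: same counts as the dict loop.
nums = [randint(1, 3) for _ in range(1000)]
assert dict(Counter(nums)) == count(nums)

# 3. heapq: same three smallest values as a full sort.
shuffled = list(range(1000))
shuffle(shuffled)
assert heapq.nsmallest(3, shuffled) == sorted(shuffled)[:3]

# 4. itemgetter: with several keys it returns a tuple, not a list.
data = dict(enumerate(range(1000)))
keys = choices(list(data), k=100)
assert itemgetter(*keys)(data) == tuple(data[k] for k in keys)

# 5. lru_cache: the cached function still produces the Fibonacci numbers.
assert [fast_fib(x) for x in range(10)] == [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Note that `itemgetter` called with a single key returns the value itself rather than a one-element tuple, so callers that need a sequence have to handle that case separately.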


--------------------------------------------------------------------------------
/Built-in method.md:
--------------------------------------------------------------------------------
  1 | # Optimizing Python performance with built-in methods
  2 | Author: tushushu  
  3 | Project: https://github.com/tushushu/flying-python
  4 | 
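  5 | One possible reason a Python program runs too slowly is that it does not use built-in methods as much as it could. The five examples below show how built-in methods can improve the performance of Python programs.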
  6 | 
  7 | ## 1. Sum of squares of a list
  8 | Given a list, compute the sum of the squares of its numbers. The final version is about 1.4x as fast as the first.
  9 | 
 10 | First create a list of length 10000.
 11 | 
 12 | 
 13 | ```python
 14 | arr = list(range(10000))
 15 | ```
 16 | 
 17 | ### 1.1 The most conventional approach
 18 | Traverse the list with a while loop and accumulate the squares. Average running time: 2.97 ms.
 19 | 
 20 | 
 21 | ```python
 22 | def sum_sqr_0(arr):
 23 |     res = 0
 24 |     n = len(arr)
 25 |     i = 0
 26 |     while i < n:
 27 |         res += arr[i] ** 2
 28 |         i += 1
 29 |     return res
 30 | ```
 31 | 
 32 | 
 33 | ```python
 34 | %timeit sum_sqr_0(arr)
 35 | ```
 36 | 
 37 |     2.97 ms ± 36.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
 38 | 
 39 | 
 40 | ### 1.2 for range instead of the while loop
 41 | Avoids the extra overhead of the type checks that i += 1 incurs. Average running time: 2.9 ms.
 42 | 
 43 | 
 44 | ```python
 45 | def sum_sqr_1(arr):
 46 |     res = 0
 47 |     for i in range(len(arr)):
 48 |         res += arr[i] ** 2
 49 |     return res
 50 | ```
 51 | 
 52 | 
 53 | ```python
 54 | %timeit sum_sqr_1(arr)
 55 | ```
 56 | 
 57 |     2.9 ms ± 137 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
 58 | 
 59 | 
 60 | ### 1.3 for x in arr instead of for range
 61 | Avoids the extra overhead of the type checks that arr[i] incurs. Average running time: 2.59 ms.
 62 | 
 63 | 
 64 | ```python
 65 | def sum_sqr_2(arr):
 66 |     res = 0
 67 |     for x in arr:
 68 |         res += x ** 2
 69 |     return res
 70 | ```
 71 | 
 72 | 
 73 | ```python
 74 | %timeit sum_sqr_2(arr)
 75 | ```
 76 | 
 77 |     2.59 ms ± 89 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
 78 | 
 79 | 
 80 | ### 1.4 sum with map
 81 | Average running time: 2.36 ms.
 82 | 
 83 | 
 84 | ```python
 85 | def sum_sqr_3(arr):
 86 |     return sum(map(lambda x: x**2, arr))
 87 | ```
 88 | 
 89 | 
 90 | ```python
 91 | %timeit sum_sqr_3(arr)
 92 | ```
 93 | 
 94 |     2.36 ms ± 15.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
 95 | 
 96 | 
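 97 | ### 1.5 sum with a generator expression
 98 | When a generator expression is the only argument of a function call, its enclosing parentheses can be omitted. Average running time: 2.35 ms.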
 99 | 
100 | 
101 | ```python
102 | def sum_sqr_4(arr):
103 |     return sum(x ** 2 for x in arr)
104 | ```
105 | 
106 | 
107 | ```python
108 | %timeit sum_sqr_4(arr)
109 | ```
110 | 
111 |     2.35 ms ± 107 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
112 | 
113 | 
114 | ### 1.6 sum with a list comprehension
115 | Average running time: 2.06 ms.
116 | 
117 | 
118 | ```python
119 | def sum_sqr_5(arr):
120 |     return sum([x ** 2 for x in arr])
121 | ```
122 | 
123 | 
124 | ```python
125 | %timeit sum_sqr_5(arr)
126 | ```
127 | 
128 |     2.06 ms ± 27.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
129 | 
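
`%timeit` is an IPython magic, so the measurements above only run inside IPython or Jupyter. In a plain Python script, the standard `timeit` module can play the same role. The following is a minimal sketch: the helper `best_ms` and its `repeat` and `number` defaults are illustrative assumptions, and it reports the best batch rather than the mean ± standard deviation that `%timeit` prints.

```python
import timeit


def sum_sqr_loop(arr):
    """Index-based while loop, as in 1.1."""
    res = 0
    i = 0
    n = len(arr)
    while i < n:
        res += arr[i] ** 2
        i += 1
    return res


def sum_sqr_listcomp(arr):
    """sum over a list comprehension, as in 1.6."""
    return sum([x ** 2 for x in arr])


def best_ms(fn, arg, repeat=3, number=10):
    """Best per-call time in milliseconds over `repeat` batches of `number` calls."""
    batches = timeit.repeat(lambda: fn(arg), repeat=repeat, number=number)
    return min(batches) / number * 1000


arr = list(range(10000))

# Check that both versions agree before comparing their speed.
assert sum_sqr_loop(arr) == sum_sqr_listcomp(arr)

for fn in (sum_sqr_loop, sum_sqr_listcomp):
    print("%s: %.3f ms" % (fn.__name__, best_ms(fn, arr)))
```

Absolute numbers depend on the machine and the Python version, so only the relative ordering is comparable with the timings quoted in this section.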
130 | 
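131 | ## 2. String concatenation
132 | Given a list of strings, concatenate the first 3 characters of each string into a single string. The final version is about 2.1x as fast as the first.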
133 | 
134 | First create a list of 10000 strings with random lengths and contents.
135 | 
136 | 
137 | ```python
138 | from random import randint
139 | 
140 | def random_letter():
141 |     return chr(ord('a') + randint(0, 25))
142 | 
143 | def random_letters(n):
144 |     return "".join([random_letter() for _ in range(n)])
145 | 
146 | strings = [random_letters(randint(1, 10)) for _ in range(10000)]
147 | ```
148 | 
149 | ### 2.1 The most conventional approach
150 | Traverse the list with a while loop and concatenate the strings. Average running time: 1.86 ms.
151 | 
152 | 
153 | ```python
154 | def concat_strings_0(strings):
155 |     res = ""
156 |     n = len(strings)
157 |     i = 0
158 |     while i < n:
159 |         res += strings[i][:3]
160 |         i += 1
161 |     return res
162 | ```
163 | 
164 | 
165 | ```python
166 | %timeit concat_strings_0(strings)
167 | ```
168 | 
169 |     1.86 ms ± 74.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
170 | 
171 | 
172 | ### 2.2 for range instead of the while loop
173 | Avoids the extra overhead of the type checks that i += 1 incurs. Average running time: 1.55 ms.
174 | 
175 | 
176 | ```python
177 | def concat_strings_1(strings):
178 |     res = ""
179 |     for i in range(len(strings)):
180 |         res += strings[i][:3]
181 |     return res
182 | ```
183 | 
184 | 
185 | ```python
186 | %timeit concat_strings_1(strings)
187 | ```
188 | 
189 |     1.55 ms ± 32.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
190 | 
191 | 
192 | ### 2.3 for x in strings instead of for range
193 | Avoids the extra overhead of the type checks that strings[i] incurs. Average running time: 1.32 ms.
194 | 
195 | 
196 | ```python
197 | def concat_strings_2(strings):
198 |     res = ""
199 |     for x in strings:
200 |         res += x[:3]
201 |     return res
202 | ```
203 | 
204 | 
205 | ```python
206 | %timeit concat_strings_2(strings)
207 | ```
208 | 
209 |     1.32 ms ± 19.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
210 | 
211 | 
212 | ### 2.4 .join with a generator expression
213 | Average running time: 1.06 ms.
214 | 
215 | 
216 | ```python
217 | def concat_strings_3(strings):
218 |     return "".join(x[:3] for x in strings)
219 | ```
220 | 
221 | 
222 | ```python
223 | %timeit concat_strings_3(strings)
224 | ```
225 | 
226 |     1.06 ms ± 15.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
227 | 
228 | 
229 | ### 2.5 .join with a list comprehension
230 | Average running time: 0.858 ms.
231 | 
232 | 
233 | ```python
234 | def concat_strings_4(strings):
235 |     return "".join([x[:3] for x in strings])
236 | ```
237 | 
238 | 
239 | ```python
240 | %timeit concat_strings_4(strings)
241 | ```
242 | 
243 |     858 µs ± 14.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
244 | 
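
Before trusting the speedup, it is worth confirming that the `+=` loop and the `.join` version build the same string. The following is a minimal sketch of that check. The function names, the fixed seed, and the smaller input size are illustrative choices, not part of the original benchmark.

```python
import random


def concat_loop(strings):
    """Repeated += on a str, as in 2.3."""
    res = ""
    for x in strings:
        res += x[:3]
    return res


def concat_join(strings):
    """A single str.join over a list comprehension, as in 2.5."""
    return "".join([x[:3] for x in strings])


random.seed(0)  # fixed seed so the check is repeatable
letters = "abcdefghijklmnopqrstuvwxyz"
strings = [
    "".join(random.choice(letters) for _ in range(random.randint(1, 10)))
    for _ in range(1000)
]

assert concat_loop(strings) == concat_join(strings)

# Strings shorter than 3 characters contribute all of their characters.
assert concat_join(["ab", "cdefg", "h"]) == "abcdeh"
```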
245 | 
246 | ## 3. Filtering odd numbers
247 | 
248 | Given a list, select all of its odd numbers. The final version is about 3.6x as fast as the first.
249 | 
250 | First create a list of length 10000.
251 | 
252 | 
253 | ```python
254 | arr = list(range(10000))
255 | ```
256 | 
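257 | ### 3.1 The most conventional approach
258 | Create an empty list res, traverse the list with a while loop, and append the odd numbers to res. Average running time: 1.03 ms.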
259 | 
260 | 
261 | ```python
262 | def filter_odd_0(arr):
263 |     res = []
264 |     i = 0
265 |     n = len(arr)
266 |     while i < n:
267 |         if arr[i] % 2:
268 |             res.append(arr[i])
269 |         i += 1
270 |     return res
271 | ```
272 | 
273 | 
274 | ```python
275 | %timeit filter_odd_0(arr)
276 | ```
277 | 
278 |     1.03 ms ± 34.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
279 | 
280 | 
281 | ### 3.2 for range instead of the while loop
282 | Avoids the extra overhead of the type checks that i += 1 incurs. Average running time: 0.965 ms.
283 | 
284 | 
285 | ```python
286 | def filter_odd_1(arr):
287 |     res = []
288 |     for i in range(len(arr)):
289 |         if arr[i] % 2:
290 |             res.append(arr[i])
292 |     return res
293 | ```
294 | 
295 | 
296 | ```python
297 | %timeit filter_odd_1(arr)
298 | ```
299 | 
300 |     965 µs ± 4.02 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
301 | 
302 | 
303 | ### 3.3 for x in arr instead of for range
304 | Avoids the extra overhead of the type checks that arr[i] incurs. Average running time: 0.430 ms.
305 | 
306 | 
307 | ```python
308 | def filter_odd_2(arr):
309 |     res = []
310 |     for x in arr:
311 |         if x % 2:
312 |             res.append(x)
313 |     return res
314 | ```
315 | 
316 | 
317 | ```python
318 | %timeit filter_odd_2(arr)
319 | ```
320 | 
321 |     430 µs ± 9.25 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
322 | 
323 | 
324 | ### 3.4 list with filter
325 | Average running time: 0.763 ms. Note that filter is slow here, slower than the plain for loop in 3.3, so in Python 3.6 it is of little use.
326 | 
327 | 
328 | ```python
329 | def filter_odd_3(arr):
330 |     return list(filter(lambda x: x % 2, arr))
331 | ```
332 | 
333 | 
334 | ```python
335 | %timeit filter_odd_3(arr)
336 | ```
337 | 
338 |     763 µs ± 15.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
339 | 
340 | 
341 | ### 3.5 list with a generator expression
342 | Average running time: 0.398 ms.
343 | 
344 | 
345 | ```python
346 | def filter_odd_4(arr):
347 |     return list(x for x in arr if x % 2)
348 | ```
349 | 
350 | 
351 | ```python
352 | %timeit filter_odd_4(arr)
353 | ```
354 | 
355 |     398 µs ± 16.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
356 | 
357 | 
358 | ### 3.6 List comprehension with a condition
359 | Average running time: 0.290 ms.
360 | 
361 | 
362 | ```python
363 | def filter_odd_5(arr):
364 |     return [x for x in arr if x % 2]
365 | ```
366 | 
367 | 
368 | ```python
369 | %timeit filter_odd_5(arr)
370 | ```
371 | 
372 |     290 µs ± 5.54 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
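
The `%timeit` magic used throughout this section only works inside IPython or Jupyter. As a rough sketch for plain Python scripts, the standard-library `timeit` module can take comparable measurements. The helper name `bench`, the function names, and the repeat counts below are illustrative assumptions, and absolute timings will differ by machine and Python version.

```python
import timeit

arr = list(range(10000))


def filter_odd_loop(arr):
    """Plain for loop with append, as in section 3.3."""
    res = []
    for x in arr:
        if x % 2:
            res.append(x)
    return res


def filter_odd_comprehension(arr):
    """List comprehension with a condition, as in section 3.6."""
    return [x for x in arr if x % 2]


def bench(func, *args, number=100, repeat=3):
    """Return the best per-call time in seconds over `repeat` runs."""
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number


# Both variants must agree before their timings are worth comparing.
assert filter_odd_loop(arr) == filter_odd_comprehension(arr)

for func in (filter_odd_loop, filter_odd_comprehension):
    print(f"{func.__name__}: {bench(func, arr) * 1e6:.1f} us per call")
```

Taking the minimum over several repeats is a common way to reduce noise from other processes; `%timeit` reports the mean and standard deviation instead.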
373 | 
374 | 
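375 | ## 4. Adding two arrays
376 | 
377 | Given two lists of the same length, compute the sum of the numbers at each position and return a list of the same length as the inputs. The final version is roughly 2.7 times as fast as the first.
378 | 
379 | First, create two lists of length 10000.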
380 | 
381 | 
382 | ```python
383 | arr1 = list(range(10000))
384 | arr2 = list(range(10000))
385 | ```
386 | 
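387 | ### 4.1 The most conventional approach
388 | Create an empty list res, walk through the lists with a while loop, and append the sum of each pair of corresponding elements to res. Mean run time: 1.23 ms.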
389 | 
390 | 
391 | ```python
392 | def arr_sum_0(arr1, arr2):
393 |     i = 0
394 |     n = len(arr1)
395 |     res = []
396 |     while i < n:
397 |         res.append(arr1[i] + arr2[i])
398 |         i += 1
399 |     return res
400 | ```
401 | 
402 | 
403 | ```python
404 | %timeit arr_sum_0(arr1, arr2)
405 | ```
406 | 
407 |     1.23 ms ± 3.77 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
408 | 
409 | 
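410 | ### 4.2 Replace the while loop with for range
411 | This avoids the interpreter overhead, including dynamic type checks, of executing `i += 1` on every iteration. Mean run time: 0.997 ms.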
412 | 
413 | 
414 | ```python
415 | def arr_sum_1(arr1, arr2):
416 |     res = []
417 |     for i in range(len(arr1)):
418 |         res.append(arr1[i] + arr2[i])
419 |     return res
420 | ```
421 | 
422 | 
423 | ```python
424 | %timeit arr_sum_1(arr1, arr2)
425 | ```
426 | 
427 |     997 µs ± 7.42 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
428 | 
429 | 
430 | ### 4.3 Replace for range with for i, x in enumerate
431 | This partially avoids the extra overhead, including dynamic type checks, of evaluating `arr[i]`. Mean run time: 0.799 ms.
432 | 
433 | 
434 | ```python
435 | def arr_sum_2(arr1, arr2):
436 |     res = arr1.copy()
437 |     for i, x in enumerate(arr2):
438 |         res[i] += x
439 |     return res
440 | ```
441 | 
442 | 
443 | ```python
444 | %timeit arr_sum_2(arr1, arr2)
445 | ```
446 | 
447 |     799 µs ± 16.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
448 | 
449 | 
450 | ### 4.4 Replace for range with for x, y in zip
451 | This avoids the extra overhead, including dynamic type checks, of evaluating `arr[i]`. Mean run time: 0.769 ms.
452 | 
453 | 
454 | ```python
455 | def arr_sum_3(arr1, arr2):
456 |     res = []
457 |     for x, y in zip(arr1, arr2):
458 |         res.append(x + y)
459 |     return res
460 | ```
461 | 
462 | 
463 | ```python
464 | %timeit arr_sum_3(arr1, arr2)
465 | ```
466 | 
467 |     769 µs ± 12.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
468 | 
469 | 
470 | ### 4.5 List comprehension over zip
471 | Mean run time: 0.462 ms.
472 | 
473 | 
474 | ```python
475 | def arr_sum_4(arr1, arr2):
476 |     return [x + y for x, y in zip(arr1, arr2)]
477 | ```
478 | 
479 | 
480 | ```python
481 | %timeit arr_sum_4(arr1, arr2)
482 | ```
483 | 
484 |     462 µs ± 3.43 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
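
The task statement assumes both inputs have the same length. One difference worth knowing is that `zip` silently stops at the shorter input, whereas the index-based versions above would raise `IndexError` or ignore trailing elements. The sketch below adds an explicit length check; the name `arr_sum_checked` is hypothetical and not part of the original benchmarks.

```python
def arr_sum_checked(arr1, arr2):
    """Element-wise sum that rejects inputs of different lengths."""
    if len(arr1) != len(arr2):
        raise ValueError("arr1 and arr2 must have the same length")
    return [x + y for x, y in zip(arr1, arr2)]


print(arr_sum_checked([1, 2, 3], [10, 20, 30]))  # [11, 22, 33]

# Without the check, zip silently drops the unmatched tail:
print([x + y for x, y in zip([1, 2, 3], [10, 20])])  # [11, 22]
```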
485 | 
486 | 
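487 | ## 5. Counting the elements two lists have in common
488 | Given two lists, count the elements that appear in both. The elements within each list are unique. The final version is roughly 5000 times as fast as the first.
489 | 
490 | First, create two lists and shuffle their elements.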
491 | 
492 | 
493 | ```python
494 | from random import shuffle
495 | arr1 = list(range(2000))
496 | shuffle(arr1)
497 | arr2 = list(range(1000, 3000))
498 | shuffle(arr2)
499 | ```
500 | 
501 | ### 5.1 The most conventional approach
502 | Nested while loops that test whether arr1[i] equals arr2[j]. Mean run time: 338 ms.
503 | 
504 | 
505 | ```python
506 | def n_common_0(arr1, arr2):
507 |     res = 0
508 |     i = 0
509 |     m = len(arr1)
510 |     n = len(arr2)
511 |     while i < m:
512 |         j = 0
513 |         while j < n:
514 |             if arr1[i] == arr2[j]:
515 |                 res += 1
516 |             j += 1
517 |         i += 1
518 |     return res
519 | ```
520 | 
521 | 
522 | ```python
523 | %timeit n_common_0(arr1, arr2)
524 | ```
525 | 
526 |     338 ms ± 7.81 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
527 | 
528 | 
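529 | ### 5.2 Replace the while loops with for range
530 | This avoids the interpreter overhead, including dynamic type checks, of executing `i += 1` on every iteration. Mean run time: 233 ms.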
531 | 
532 | 
533 | ```python
534 | def n_common_1(arr1, arr2):
535 |     res = 0
536 |     for i in range(len(arr1)):
537 |         for j in range(len(arr2)):
538 |             if arr1[i] == arr2[j]:
539 |                 res += 1
540 |     return res
541 | ```
542 | 
543 | 
544 | ```python
545 | %timeit n_common_1(arr1, arr2)
546 | ```
547 | 
548 |     233 ms ± 10.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
549 | 
550 | 
551 | ### 5.3 Replace for range with for x in arr
552 | This avoids the extra overhead, including dynamic type checks, of evaluating `arr[i]`. Mean run time: 84.8 ms.
553 | 
554 | 
555 | ```python
556 | def n_common_2(arr1, arr2):
557 |     res = 0
558 |     for x in arr1:
559 |         for y in arr2:
560 |             if x == y:
561 |                 res += 1
562 |     return res
563 | ```
564 | 
565 | 
566 | ```python
567 | %timeit n_common_2(arr1, arr2)
568 | ```
569 | 
570 |     84.8 ms ± 1.38 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
571 | 
572 | 
573 | ### 5.4 Replace the inner loop with if x in arr2
574 | Mean run time: 24.9 ms.
575 | 
576 | 
577 | ```python
578 | def n_common_3(arr1, arr2):
579 |     res = 0
580 |     for x in arr1:
581 |         if x in arr2:
582 |             res += 1
583 |     return res
584 | ```
585 | 
586 | 
587 | ```python
588 | %timeit n_common_3(arr1, arr2)
589 | ```
590 | 
591 |     24.9 ms ± 1.39 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
592 | 
593 | 
594 | ### 5.5 Use a faster algorithm
595 | Sort both lists with the .sort method, then walk through them in a single loop. This lowers the time complexity from O(n^2) to O(n log n). Mean run time: 0.329 ms.
596 | 
597 | 
598 | ```python
599 | def n_common_4(arr1, arr2):
600 |     arr1.sort()  # note: sorts in place, so the caller's lists are modified
601 |     arr2.sort()
602 |     res = i = j = 0
603 |     m, n = len(arr1), len(arr2)
604 |     while i < m and j < n:
605 |         if arr1[i] == arr2[j]:
606 |             res += 1
607 |             i += 1
608 |             j += 1
609 |         elif arr1[i] > arr2[j]:
610 |             j += 1
611 |         else:
612 |             i += 1
613 |     return res
614 | ```
615 | 
616 | 
617 | ```python
618 | %timeit n_common_4(arr1, arr2)
619 | ```
620 | 
621 |     329 µs ± 12.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
622 | 
623 | 
624 | ### 5.6 Use a better data structure
625 | Convert the lists to sets and take the size of their intersection. Mean run time: 0.067 ms.
626 | 
627 | 
628 | ```python
629 | def n_common_5(arr1, arr2):
630 |     return len(set(arr1) & set(arr2))
631 | ```
632 | 
633 | 
634 | ```python
635 | %timeit n_common_5(arr1, arr2)
636 | ```
637 | 
638 |     67.2 µs ± 755 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
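
One caveat about `n_common_4`: it sorts its arguments in place, so the caller's lists are reordered, and every `%timeit` iteration after the first sees input that is already sorted. The sketch below is a non-mutating variant that works on sorted copies and is checked against the set-based version. The name `n_common_sorted` is made up for this illustration.

```python
from random import shuffle


def n_common_sorted(arr1, arr2):
    """Count common elements with a two-pointer walk over sorted copies."""
    a, b = sorted(arr1), sorted(arr2)  # copies; the inputs stay untouched
    res = i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            res += 1
            i += 1
            j += 1
        elif a[i] > b[j]:
            j += 1
        else:
            i += 1
    return res


arr1 = list(range(2000))
shuffle(arr1)
arr2 = list(range(1000, 3000))
shuffle(arr2)
snapshot = arr1.copy()

print(n_common_sorted(arr1, arr2))  # 1000
print(len(set(arr1) & set(arr2)))   # 1000
print(arr1 == snapshot)             # True: the input order is preserved
```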
639 | 
640 | 
645 | 


--------------------------------------------------------------------------------
/More efficient array.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# 4 ways to make Python arrays more efficient"
  8 |    ]
  9 |   },
 10 |   {
 11 |    "cell_type": "markdown",
 12 |    "metadata": {},
 13 |    "source": [
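 14 |     "## 1. Why Python lists are slow\n",
 15 |     "A Python list is a dynamic array, meaning its size can change, and the array stores pointers (PyObject*) to the individual list elements. The elements can have different types, for example my_list = ['a', 1, True]; the array actually holds three pointers, one to each of these three elements. So why are Python lists slow compared with arrays in other languages? There are two main reasons:\n",
 16 |     "1. Python is a dynamically typed language, so type checks cost extra time at run time.\n",
 17 |     "2. Python, or more precisely CPython, has no JIT optimizer."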
 18 |    ]
 19 |   },
 20 |   {
 21 |    "cell_type": "markdown",
 22 |    "metadata": {},
 23 |    "source": [
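 24 |     "## 2. How to do fast array computation in Python\n",
 25 |     "The mainstream solutions at present are:\n",
 26 |     "1. Numpy - a Numpy array is more like a C/C++ array: all elements share one data type, and array methods (such as sum) are implemented in C.\n",
 27 |     "2. Numba - uses JIT compilation to speed up Numpy code. Both calls to Numpy methods and for loops over Numpy arrays can benefit.\n",
 28 |     "3. Numexpr - avoids allocating memory for Numpy's intermediate results, which speeds up Numpy; it is mainly used to evaluate expressions on large arrays.\n",
 29 |     "4. Cython - write C/C++ extensions for Python.\n",
 30 |     "\n",
 31 |     "The next two examples show how to use these four tools."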
 32 |    ]
 33 |   },
 34 |   {
 35 |    "cell_type": "markdown",
 36 |    "metadata": {},
 37 |    "source": [
 38 |     "## 3. Sum of squares of an array"
 39 |    ]
 40 |   },
 41 |   {
 42 |    "cell_type": "code",
 43 |    "execution_count": 1,
 44 |    "metadata": {},
 45 |    "outputs": [],
 46 |    "source": [
 47 |     "arr = [x for x in range(10000)]"
 48 |    ]
 49 |   },
 50 |   {
 51 |    "cell_type": "markdown",
 52 |    "metadata": {},
 53 |    "source": [
 54 |     "### 3.1 for loop"
 55 |    ]
 56 |   },
 57 |   {
 58 |    "cell_type": "code",
 59 |    "execution_count": 2,
 60 |    "metadata": {},
 61 |    "outputs": [
 62 |     {
 63 |      "name": "stdout",
 64 |      "output_type": "stream",
 65 |      "text": [
 66 |       "The result is: 333283335000\n",
 67 |       "2.53 ms ± 91.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
 68 |      ]
 69 |     }
 70 |    ],
 71 |    "source": [
 72 |     "def sqr_sum(arr):\n",
 73 |     "    total = 0\n",
 74 |     "    for x in arr:\n",
 75 |     "        total += x ** 2\n",
 76 |     "    return total\n",
 77 |     "\n",
 78 |     "print(\"The result is:\", sqr_sum(arr))\n",
 79 |     "%timeit sqr_sum(arr)"
 80 |    ]
 81 |   },
 82 |   {
 83 |    "cell_type": "markdown",
 84 |    "metadata": {},
 85 |    "source": [
 86 |     "### 3.2 Numpy"
 87 |    ]
 88 |   },
 89 |   {
 90 |    "cell_type": "code",
 91 |    "execution_count": 3,
 92 |    "metadata": {},
 93 |    "outputs": [],
 94 |    "source": [
 95 |     "import numpy as np"
 96 |    ]
 97 |   },
 98 |   {
 99 |    "cell_type": "code",
100 |    "execution_count": 4,
101 |    "metadata": {},
102 |    "outputs": [
103 |     {
104 |      "name": "stdout",
105 |      "output_type": "stream",
106 |      "text": [
107 |       "The result is: 333283335000\n",
108 |       "9.66 µs ± 275 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n"
109 |      ]
110 |     }
111 |    ],
112 |    "source": [
113 |     "def sqr_sum(arr):\n",
114 |     "    return (arr ** 2).sum()\n",
115 |     "\n",
116 |     "arr = np.array(arr)\n",
117 |     "print(\"The result is:\", sqr_sum(arr))\n",
118 |     "%timeit sqr_sum(arr)"
119 |    ]
120 |   },
121 |   {
122 |    "cell_type": "markdown",
123 |    "metadata": {},
124 |    "source": [
125 |     "### 3.3 Numba"
126 |    ]
127 |   },
128 |   {
129 |    "cell_type": "code",
130 |    "execution_count": 5,
131 |    "metadata": {},
132 |    "outputs": [],
133 |    "source": [
134 |     "from numba import jit"
135 |    ]
136 |   },
137 |   {
138 |    "cell_type": "code",
139 |    "execution_count": 6,
140 |    "metadata": {},
141 |    "outputs": [
142 |     {
143 |      "name": "stdout",
144 |      "output_type": "stream",
145 |      "text": [
146 |       "The result is: 333283335000\n",
147 |       "3.39 µs ± 57.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n"
148 |      ]
149 |     }
150 |    ],
151 |    "source": [
152 |     "@jit(nopython=True)\n",
153 |     "def sqr_sum(arr):\n",
154 |     "    return (arr ** 2).sum()\n",
155 |     "\n",
156 |     "arr = np.array(arr)\n",
157 |     "print(\"The result is:\", sqr_sum(arr))\n",
158 |     "%timeit sqr_sum(arr)"
159 |    ]
160 |   },
161 |   {
162 |    "cell_type": "markdown",
163 |    "metadata": {},
164 |    "source": [
165 |     "### 3.4 Numexpr"
166 |    ]
167 |   },
168 |   {
169 |    "cell_type": "code",
170 |    "execution_count": 7,
171 |    "metadata": {},
172 |    "outputs": [],
173 |    "source": [
174 |     "import numexpr as ne"
175 |    ]
176 |   },
177 |   {
178 |    "cell_type": "code",
179 |    "execution_count": 8,
180 |    "metadata": {},
181 |    "outputs": [
182 |     {
183 |      "name": "stdout",
184 |      "output_type": "stream",
185 |      "text": [
186 |       "The result is: 333283335000\n",
187 |       "14.9 µs ± 144 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n"
188 |      ]
189 |     }
190 |    ],
191 |    "source": [
192 |     "def sqr_sum(arr):\n",
193 |     "    return ne.evaluate(\"sum(arr * arr)\")\n",
194 |     "\n",
195 |     "arr = np.array(arr)\n",
196 |     "print(\"The result is:\", sqr_sum(arr))\n",
197 |     "%timeit sqr_sum(arr)"
198 |    ]
199 |   },
200 |   {
201 |    "cell_type": "markdown",
202 |    "metadata": {},
203 |    "source": [
204 |     "### 3.5 Cython"
205 |    ]
206 |   },
207 |   {
208 |    "cell_type": "code",
209 |    "execution_count": 9,
210 |    "metadata": {},
211 |    "outputs": [],
212 |    "source": [
213 |     "%load_ext cython"
214 |    ]
215 |   },
216 |   {
217 |    "cell_type": "code",
218 |    "execution_count": 10,
219 |    "metadata": {},
220 |    "outputs": [],
221 |    "source": [
222 |     "%%cython\n",
223 |     "cimport numpy as np\n",
224 |     "ctypedef np.int_t DTYPE_t\n",
225 |     "\n",
226 |     "def sqr_sum(np.ndarray[DTYPE_t] arr):\n",
227 |     "    cdef:\n",
228 |     "        DTYPE_t total = 0\n",
229 |     "        DTYPE_t x\n",
230 |     "        int i = 0\n",
231 |     "        int n = len(arr)\n",
232 |     "    while i < n:\n",
233 |     "        total += arr[i] ** 2\n",
234 |     "        i += 1\n",
235 |     "    return total"
236 |    ]
237 |   },
238 |   {
239 |    "cell_type": "code",
240 |    "execution_count": 11,
241 |    "metadata": {},
242 |    "outputs": [
243 |     {
244 |      "name": "stdout",
245 |      "output_type": "stream",
246 |      "text": [
247 |       "The result is: 333283335000\n",
248 |       "5.51 µs ± 62.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n"
249 |      ]
250 |     }
251 |    ],
252 |    "source": [
253 |     "arr = np.array(arr, dtype=\"int\")\n",
254 |     "print(\"The result is:\", sqr_sum(arr))\n",
255 |     "%timeit sqr_sum(arr)"
256 |    ]
257 |   },
258 |   {
259 |    "cell_type": "markdown",
260 |    "metadata": {},
261 |    "source": [
262 |     "## 4. Array transformation"
263 |    ]
264 |   },
265 |   {
266 |    "cell_type": "code",
267 |    "execution_count": 12,
268 |    "metadata": {},
269 |    "outputs": [],
270 |    "source": [
271 |     "arr = [x for x in range(1000000)]"
272 |    ]
273 |   },
274 |   {
275 |    "cell_type": "markdown",
276 |    "metadata": {},
277 |    "source": [
278 |     "### 4.1 for loop"
279 |    ]
280 |   },
281 |   {
282 |    "cell_type": "code",
283 |    "execution_count": 13,
284 |    "metadata": {},
285 |    "outputs": [
286 |     {
287 |      "name": "stdout",
288 |      "output_type": "stream",
289 |      "text": [
290 |       "The result is: [1, 3, 5, 7, 9] ...\n",
291 |       "84.5 ms ± 381 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
292 |      ]
293 |     }
294 |    ],
295 |    "source": [
296 |     "def transform(arr):\n",
297 |     "    return [x * 2 + 1 for x in arr]\n",
298 |     "\n",
299 |     "print(\"The result is:\", transform(arr)[:5], \"...\")\n",
300 |     "%timeit transform(arr)"
301 |    ]
302 |   },
303 |   {
304 |    "cell_type": "markdown",
305 |    "metadata": {},
306 |    "source": [
307 |     "### 4.2 Numpy"
308 |    ]
309 |   },
310 |   {
311 |    "cell_type": "code",
312 |    "execution_count": 14,
313 |    "metadata": {},
314 |    "outputs": [],
315 |    "source": [
316 |     "import numpy as np"
317 |    ]
318 |   },
319 |   {
320 |    "cell_type": "code",
321 |    "execution_count": 15,
322 |    "metadata": {},
323 |    "outputs": [
324 |     {
325 |      "name": "stdout",
326 |      "output_type": "stream",
327 |      "text": [
328 |       "The result is: [1 3 5 7 9] ...\n",
329 |       "803 µs ± 11.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
330 |      ]
331 |     }
332 |    ],
333 |    "source": [
334 |     "def transform(arr):\n",
335 |     "    return arr * 2 + 1\n",
336 |     "\n",
337 |     "arr = np.array(arr)\n",
338 |     "print(\"The result is:\", transform(arr)[:5], \"...\")\n",
339 |     "%timeit transform(arr)"
340 |    ]
341 |   },
342 |   {
343 |    "cell_type": "markdown",
344 |    "metadata": {},
345 |    "source": [
346 |     "### 4.3 Numba"
347 |    ]
348 |   },
349 |   {
350 |    "cell_type": "code",
351 |    "execution_count": 16,
352 |    "metadata": {},
353 |    "outputs": [],
354 |    "source": [
355 |     "from numba import jit"
356 |    ]
357 |   },
358 |   {
359 |    "cell_type": "code",
360 |    "execution_count": 17,
361 |    "metadata": {},
362 |    "outputs": [
363 |     {
364 |      "name": "stdout",
365 |      "output_type": "stream",
366 |      "text": [
367 |       "The result is: [1 3 5 7 9] ...\n",
368 |       "498 µs ± 8.71 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
369 |      ]
370 |     }
371 |    ],
372 |    "source": [
373 |     "@jit(nopython=True)\n",
374 |     "def transform(arr):\n",
375 |     "    return arr * 2 + 1\n",
376 |     "\n",
377 |     "arr = np.array(arr)\n",
378 |     "print(\"The result is:\", transform(arr)[:5], \"...\")\n",
379 |     "%timeit transform(arr)"
380 |    ]
381 |   },
382 |   {
383 |    "cell_type": "markdown",
384 |    "metadata": {},
385 |    "source": [
386 |     "### 4.4 Numexpr"
387 |    ]
388 |   },
389 |   {
390 |    "cell_type": "code",
391 |    "execution_count": 18,
392 |    "metadata": {},
393 |    "outputs": [],
394 |    "source": [
395 |     "import numexpr as ne"
396 |    ]
397 |   },
398 |   {
399 |    "cell_type": "code",
400 |    "execution_count": 19,
401 |    "metadata": {},
402 |    "outputs": [
403 |     {
404 |      "name": "stdout",
405 |      "output_type": "stream",
406 |      "text": [
407 |       "The result is: [1 3 5 7 9] ...\n",
408 |       "369 µs ± 13.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
409 |      ]
410 |     }
411 |    ],
412 |    "source": [
413 |     "def transform(arr):\n",
414 |     "    return ne.evaluate(\"arr * 2 + 1\")\n",
415 |     "\n",
416 |     "arr = np.array(arr)\n",
417 |     "print(\"The result is:\", transform(arr)[:5], \"...\")\n",
418 |     "%timeit transform(arr)"
419 |    ]
420 |   },
421 |   {
422 |    "cell_type": "markdown",
423 |    "metadata": {},
424 |    "source": [
425 |     "### 4.5 Cython"
426 |    ]
427 |   },
428 |   {
429 |    "cell_type": "code",
430 |    "execution_count": 20,
431 |    "metadata": {},
432 |    "outputs": [
433 |     {
434 |      "name": "stdout",
435 |      "output_type": "stream",
436 |      "text": [
437 |       "The cython extension is already loaded. To reload it, use:\n",
438 |       "  %reload_ext cython\n"
439 |      ]
440 |     }
441 |    ],
442 |    "source": [
443 |     "%load_ext cython"
444 |    ]
445 |   },
446 |   {
447 |    "cell_type": "code",
448 |    "execution_count": 21,
449 |    "metadata": {},
450 |    "outputs": [],
451 |    "source": [
452 |     "%%cython\n",
453 |     "import numpy as np\n",
454 |     "cimport numpy as np\n",
455 |     "ctypedef np.int_t DTYPE_t\n",
456 |     "\n",
457 |     "def transform(np.ndarray[DTYPE_t] arr):\n",
458 |     "    cdef:\n",
459 |     "        np.ndarray[DTYPE_t] new_arr = np.empty_like(arr)\n",
460 |     "        int i = 0\n",
461 |     "        int n = len(arr)\n",
462 |     "    while i < n:\n",
463 |     "        new_arr[i] = arr[i] * 2 + 1\n",
464 |     "        i += 1\n",
465 |     "    return new_arr"
466 |    ]
467 |   },
468 |   {
469 |    "cell_type": "code",
470 |    "execution_count": 22,
471 |    "metadata": {},
472 |    "outputs": [
473 |     {
474 |      "name": "stdout",
475 |      "output_type": "stream",
476 |      "text": [
477 |       "The result is: [1 3 5 7 9] ...\n",
478 |       "887 µs ± 29.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
479 |      ]
480 |     }
481 |    ],
482 |    "source": [
483 |     "arr = np.array(arr)\n",
484 |     "print(\"The result is:\", transform(arr)[:5], \"...\")\n",
485 |     "%timeit transform(arr)"
486 |    ]
487 |   },
488 |   {
489 |    "cell_type": "markdown",
490 |    "metadata": {},
491 |    "source": [
492 |     "## 5. References\n",
493 |     "[How does python have different data types in an array?](https://stackoverflow.com/questions/10558670/how-does-python-have-different-data-types-in-an-array)  \n",
494 |     "[Why are Python Programs often slower than the Equivalent Program Written in C or C++?](https://stackoverflow.com/questions/3033329/why-are-python-programs-often-slower-than-the-equivalent-program-written-in-c-or)  \n",
495 |     "[How Fast Numpy Really is and Why?](https://towardsdatascience.com/how-fast-numpy-really-is-e9111df44347)"
496 |    ]
497 |   },
498 |   {
499 |    "cell_type": "code",
500 |    "execution_count": null,
501 |    "metadata": {},
502 |    "outputs": [],
503 |    "source": []
504 |   }
505 |  ],
506 |  "metadata": {
507 |   "kernelspec": {
508 |    "display_name": "Python 3",
509 |    "language": "python",
510 |    "name": "python3"
511 |   },
512 |   "language_info": {
513 |    "codemirror_mode": {
514 |     "name": "ipython",
515 |     "version": 3
516 |    },
517 |    "file_extension": ".py",
518 |    "mimetype": "text/x-python",
519 |    "name": "python",
520 |    "nbconvert_exporter": "python",
521 |    "pygments_lexer": "ipython3",
522 |    "version": "3.6.6"
523 |   }
524 |  },
525 |  "nbformat": 4,
526 |  "nbformat_minor": 4
527 | }
528 | 


--------------------------------------------------------------------------------
/Using C++ in Cython.md:
--------------------------------------------------------------------------------
  1 | ## Using C++ in Cython
  2 | Author: tushushu  
  3 | Project: https://github.com/tushushu/flying-python
  4 | 
  5 | ## 1. Using C++ in Jupyter Notebook
  6 | - First, load the Cython extension with the magic command  ``%load_ext Cython``
  7 | - Next, run Cython code with the magic command  ``%%cython --cplus``
  8 | - On macOS, use the magic command  ``%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++`` instead. For details, see https://stackoverflow.com/questions/57367764/cant-import-cpplist-into-cython
  9 | 
 10 | 
 11 | ```python
 12 | %load_ext Cython
 13 | ```
 14 | 
 15 | 
 16 | ```cython
 17 | %%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
 18 | # Note: use 'cimport' rather than 'import'
 19 | from libcpp.string cimport string
 20 | cdef string s
 21 | s = b"Hello world!"
 22 | print(s.decode("utf-8"))
 23 | ```
 24 | 
 25 |     Hello world!
 26 | 
 27 | 
 28 | ## 2. Converting between C++ and Python types
 29 | 
 30 | | From Python type | C++ type | To Python type |
 31 | | ------ | ------ | ------ |
 32 | | bytes | std::string | bytes |
 33 | |iterable|std::vector|list|
 34 | |iterable|std::list|list|
 35 | |iterable|std::set|set|
 36 | |iterable (len 2)|std::pair|tuple (len 2)|
 37 | 
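 38 | ## 3. Using the C++ STL
 39 | 
 40 | ### 3.1 Using C++ Vector
 41 | It can replace Python's list.
 42 | 1. Initialization - initialize from a Python iterable; the element type must be declared
 43 | 2. Iteration - increment an index inside a while loop
 44 | 3. Access - use the '[]' operator to access elements, as in Python
 45 | 4. Append - use the C++ vector's push_back method, which is similar to the append method of a Python list
 46 | 
 47 | Finally, we compare performance by implementing an element-counting function in both Python and C++. The C++ version is more than 200 times faster (29.9 µs versus 130 ns in the run shown below).  
 48 | Note: to keep the comparison fair, the functions take no arguments and instead access variables defined outside the function body. This keeps the cost of converting the Python list into a C++ vector out of the C++ timing. If that cost is included, the C++ version is only about 4 times faster.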
 49 | 
 50 | 
 51 | ```cython
 52 | %%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
 53 | from libcpp.vector cimport vector
 54 | # Initialize from a Python object
 55 | cdef vector[int] vec = range(5)
 56 | # Iterate
 57 | cdef:
 58 |     int i = 0
 59 |     int n = vec.size()
 60 | print("Start iterating...")
 61 | while i < n:
 62 |     # Access
 63 |     print("\tThe element at position %d is %d" % (i, vec[i]))
 64 |     i += 1
 65 | print()
 66 | # Append
 67 | vec.push_back(5)
 68 | print("After appending, vec is", vec)
 69 | ```
 70 | 
 71 |     Start iterating...
 72 |     	The element at position 0 is 0
 73 |     	The element at position 1 is 1
 74 |     	The element at position 2 is 2
 75 |     	The element at position 3 is 3
 76 |     	The element at position 4 is 4
 77 |     
 78 |     After appending, vec is [0, 1, 2, 3, 4, 5]
 79 | 
 80 | 
 81 | 
 82 | ```python
 83 | arr = [x // 100 for x in range(1000)]
 84 | target = 6
 85 | 
 86 | def count_py():
 87 |     return sum(1 for x in arr if x == target)
 88 | 
 89 | print("Python implementation, result: %d!"% count_py())
 90 | ```
 91 | 
 92 |     Python implementation, result: 100!
 93 | 
 94 | 
 95 | 
 96 | ```cython
 97 | %%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
 98 | from libcpp.vector cimport vector
 99 | 
100 | cdef:
101 |     int target = 6
102 |     vector[int] v = [x // 100 for x in range(1000)]
103 | 
104 | cdef int _count_cpp():
105 |     cdef:
106 |         int i = 0
107 |         int n = v.size()
108 |         int ret = 0
109 |     while i < n:
110 |         if v[i] == target:
111 |             ret += 1
112 |         i += 1
113 |     return ret
114 | 
115 | def count_cpp():
116 |     return _count_cpp()
117 | 
118 | print("Cython (C++) implementation, result: %d!"% count_cpp())
119 | ```
120 | 
121 |     Cython (C++) implementation, result: 100!
122 | 
123 | 
124 | 
125 | ```python
126 | print("Comparing the performance of the Python and C++ versions...")
127 | %timeit count_py()
128 | %timeit count_cpp()
129 | ```
130 | 
131 |     Comparing the performance of the Python and C++ versions...
132 |     29.9 µs ± 995 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
133 |     130 ns ± 2.91 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
134 | 
135 | 
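136 | ### 3.2 Using C++ Unordered Map
137 | It can replace Python's dict.
138 | 1. Initialization - initialize from a Python object (here a dict); the key and value types must be declared
139 | 2. Iteration - increment an iterator inside a while loop
140 | 3. Access - dereference with deref (the '*' operator in C++), which returns a pair object; use .first for the key and .second for the value
141 | 4. Lookup - use unordered_map.count, which returns 1 or 0; or use unordered_map.find, which returns an iterator that equals unordered_map.end if the key was not found.
142 | 5. Insert/update - unordered_map[key] = value. If the key does not exist, the '[]' operator inserts it with a default value such as 0.0. So unless you are sure this cannot cause errors, check that the key exists before modifying its value. This is somewhat similar to Python's defaultdict.  
143 | 
144 | Finally, we compare performance by implementing a conditional sum over a map in both Python and C++. The C++ version is roughly 40 times faster.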
145 | 
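
The comparison with Python's `defaultdict` in point 5 can be checked from the Python side: merely reading a missing key inserts it with the default value, much as the `[]` operator does on a C++ `unordered_map`. This is a small sketch in plain Python, with illustrative variable names.

```python
from collections import defaultdict

my_map = defaultdict(float, {i: i / 10 for i in range(10)})

print(-2 in my_map)    # False
value = my_map[-2]     # a plain read, yet it inserts -2 with the default 0.0
print(value)           # 0.0
print(-2 in my_map)    # True

# To look a key up without inserting it, test membership first or use .get().
print(my_map.get(-3))  # None
print(-3 in my_map)    # False
```

Testing membership before indexing plays the same role here as calling `count` or `find` before using `[]` on the C++ map.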
146 | 
147 | ```cython
148 | %%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
149 | from cython.operator cimport dereference as deref, preincrement as inc
150 | from libcpp.unordered_map cimport unordered_map
151 | # Initialize from a Python object
152 | cdef unordered_map[int, float] mymap = {i: i/10 for i in range(10)}
153 | # Iterate
154 | cdef:
155 |     unordered_map[int, float].iterator it = mymap.begin()
156 |     unordered_map[int, float].iterator end = mymap.end()
157 | print("Start iterating...")
158 | while it != end:
159 |     # Access
160 |     print("\tKey is %d, Value is %.1f" % (deref(it).first, deref(it).second))
161 |     inc(it)
162 | print()
163 | 
164 | # Lookup
165 | print("Start lookup...")
166 | if mymap.count(-2):
167 |     print("\tElement -2 exists!")
168 | else:
169 |     print("\tElement -2 does not exist!")
170 | 
171 | it = mymap.find(3)
172 | if it != end:
173 |     print("\tElement 3 exists, its value is %.1f!" % deref(it).second)
174 | else:
175 |     print("\tElement 3 does not exist!")
176 | print()
177 | 
178 | # Update
179 | print("Updating elements...")
180 | if mymap.count(3):
181 |     mymap[3] += 1.0
182 | mymap[-2]  # Key -2 does not exist, so it is inserted with the default value 0.0
183 | print("\tKey is 3, Value is %.1f" % mymap[3])
184 | print("\tKey is -2, Value is %.1f" % mymap[-2])
185 | ```
186 | 
187 |     Start iterating...
188 |     	Key is 0, Value is 0.0
189 |     	Key is 1, Value is 0.1
190 |     	Key is 2, Value is 0.2
191 |     	Key is 3, Value is 0.3
192 |     	Key is 4, Value is 0.4
193 |     	Key is 5, Value is 0.5
194 |     	Key is 6, Value is 0.6
195 |     	Key is 7, Value is 0.7
196 |     	Key is 8, Value is 0.8
197 |     	Key is 9, Value is 0.9
198 |     
199 |     Start lookup...
200 |     	Element -2 does not exist!
201 |     	Element 3 exists, its value is 0.3!
202 |     
203 |     Updating elements...
204 |     	Key is 3, Value is 1.3
205 |     	Key is -2, Value is 0.0
206 | 
207 | 
208 | 
209 | ```python
210 | my_map = {x: x for x in range(100)}
211 | target = 50
212 | 
213 | def sum_lt_py():
214 |     return sum(my_map[x] for x in my_map if x < target)
215 | 
216 | print("Python implementation, result: %d!"% sum_lt_py())
217 | ```
218 | 
219 |     Python implementation, result: 1225!
220 | 
221 | 
222 | 
223 | ```cython
224 | %%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
225 | from libcpp.unordered_map cimport unordered_map
226 | from cython.operator cimport dereference as deref, preincrement as inc
227 | 
228 | cdef:
229 |     unordered_map[int, int] my_map = {x: x for x in range(100)}
230 |     int target = 50
231 | 
232 | cdef int _sum_lt_cpp():
233 |     cdef:
234 |         unordered_map[int, int].iterator it = my_map.begin()
235 |         int ret = 0  # initialize the accumulator; an uninitialized C int holds garbage
236 |     while it != my_map.end():
237 |         if deref(it).first < target:
238 |             ret += deref(it).second
239 |         inc(it)
240 |     return ret
241 | 
242 | def sum_lt_cpp():
243 |     return _sum_lt_cpp()
244 | 
245 | print("Cython (C++) implementation, result: %d!"% sum_lt_cpp())
246 | ```
247 | 
248 |     Cython (C++) implementation, result: 1225!
249 | 
250 | 
251 | 
252 | ```python
253 | print("Comparing the performance of the Python and C++ versions...")
254 | %timeit sum_lt_py()
255 | %timeit sum_lt_cpp()
256 | ```
257 | 
258 |     Comparing the performance of the Python and C++ versions...
259 |     6.56 µs ± 117 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
260 |     162 ns ± 6.29 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
261 | 
262 | 
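263 | ### 3.3 Using C++ Unordered Set
264 | It can replace Python's set.  
265 | 1. Initialization - initialize from a Python iterable; the element type must be declared
266 | 2. Iteration - increment an iterator inside a while loop
267 | 3. Access - dereference with deref (the '*' operator in C++)
268 | 4. Lookup - use unordered_set.count, which returns 1 or 0
269 | 5. Insert - use unordered_set.insert; if the element already exists, it is not added again
270 | 6. Intersection, union, difference - as far as I know, these operations on unordered_set have to be implemented by the developer, which is less convenient than Python's set.
271 |  
272 | Finally, we compare performance by implementing set intersection in both Python and C++. The C++ version is roughly 20 times **slower**. For details, see https://stackoverflow.com/questions/54763112/how-to-improve-stdset-intersection-performance-in-c  
273 | If we only count the elements the two sets have in common, the C++ version is roughly 6 times faster than Python. A likely explanation is that lookups in a C++ unordered set are fast, but building one is slow.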
274 | 
275 | 
276 | ```cython
277 | %%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
278 | from cython.operator cimport dereference as deref, preincrement as inc
279 | from libcpp.unordered_set cimport unordered_set
280 | # Initialize from a Python object
281 | cdef unordered_set[int] myset = {i for i in range(5)}
282 | # Iterate
283 | cdef:
284 |     unordered_set[int].iterator it = myset.begin()
285 |     unordered_set[int].iterator end = myset.end()
286 | print("Start iterating...")
287 | while it != end:
288 |     # Access
289 |     print("\tValue is %d" % deref(it))
290 |     inc(it)
291 | print()
292 | 
293 | # Lookup
294 | print("Start lookup...")
295 | if myset.count(-2):
296 |     print("\tElement -2 exists!")
297 | else:
298 |     print("\tElement -2 does not exist!")
299 | 
300 | print()
301 | 
302 | # Insert
303 | print("Inserting elements...")
304 | myset.insert(0)
305 | myset.insert(-1)
306 | 
307 | print("\tMyset is: ", myset)
308 | ```
309 | 
310 |     Start iterating...
311 |     	Value is 0
312 |     	Value is 1
313 |     	Value is 2
314 |     	Value is 3
315 |     	Value is 4
316 |     
317 |     Start lookup...
318 |     	Element -2 does not exist!
319 |     
320 |     Inserting elements...
321 |     	Myset is:  {0, 1, 2, 3, 4, -1}
322 | 
323 | 
324 | 
325 | ```python
326 | myset1 = {x for x in range(100)}
327 | myset2 = {x for x in range(50, 60)}
328 | 
329 | def intersection_py():
330 |     return myset1 & myset2
331 | 
332 | print("Python implementation, result: %s!"% intersection_py())
333 | ```
334 | 
335 |     Python implementation, result: {50, 51, 52, 53, 54, 55, 56, 57, 58, 59}!
336 | 
337 | 
338 | 
339 | ```cython
340 | %%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
341 | from cython.operator cimport dereference as deref, preincrement as inc
342 | from libcpp.unordered_set cimport unordered_set
343 | 
344 | cdef:
345 |     unordered_set[int] myset1 = {x for x in range(100)}
346 |     unordered_set[int] myset2 = {x for x in range(50, 60)}
347 | 
348 | cdef unordered_set[int] _intersection_cpp():
349 |     cdef:
350 |         unordered_set[int].iterator it = myset1.begin()
351 |         unordered_set[int] ret
352 |     while it != myset1.end():
353 |         if myset2.count(deref(it)):
354 |             ret.insert(deref(it))
355 |         inc(it)
356 |     return ret
357 | 
358 | def intersection_cpp():
359 |     return _intersection_cpp()
360 | 
361 | print("Cython (C++) implementation, result: %s!"% intersection_cpp())
362 | ```
363 | 
364 |     Cython (C++) implementation, result: {50, 51, 52, 53, 54, 55, 56, 57, 58, 59}!
365 | 
366 | 
367 | 
368 | ```python
369 | print("Comparing the performance of the Python and C++ versions...")
370 | %timeit intersection_py()
371 | %timeit intersection_cpp()
372 | ```
373 | 
374 |     Comparing the performance of the Python and C++ versions...
375 |     274 ns ± 13.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
376 |     5.28 µs ± 220 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
377 | 
378 | 
379 | 
380 | ```python
381 | myset1 = {x for x in range(100)}
382 | myset2 = {x for x in range(50, 60)}
383 | 
384 | def count_common_py():
385 |     return len(myset1 & myset2)
386 | 
387 | print("Python implementation, result: %s!"% count_common_py())
388 | ```
389 | 
390 |     Python implementation, result: 10!
391 | 
392 | 
393 | 
394 | ```cython
395 | %%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
396 | from cython.operator cimport dereference as deref, preincrement as inc
397 | from libcpp.unordered_set cimport unordered_set
398 | 
399 | cdef:
400 |     unordered_set[int] myset2 = {x for x in range(100)}
401 |     unordered_set[int] myset1 = {x for x in range(50, 60)}
402 | 
403 | cdef int _count_common_cpp():
404 |     if myset1.size() > myset2.size():
405 |         myset1.swap(myset2)
406 |     cdef:
407 |         unordered_set[int].iterator it = myset1.begin()
408 |         int ret = 0
409 |     while it != myset1.end():
410 |         if myset2.count(deref(it)):
411 |             ret += 1
412 |         inc(it)
413 |     return ret
414 | 
415 | def count_common_cpp():
416 |     return _count_common_cpp()
417 | 
418 | print("Cython (C++) implementation, result: %s" % count_common_cpp())
419 | ```
420 | 
421 |     Cython (C++) implementation, result: 10
422 | 
423 | 
424 | 
425 | ```python
426 | print("Comparing the performance of the Python and C++ versions...")
427 | %timeit count_common_py()
428 | %timeit count_common_cpp()
429 | ```
430 | 
431 |     Comparing the performance of the Python and C++ versions...
432 |     295 ns ± 5.91 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
433 |     46.1 ns ± 0.785 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
434 | 
435 | 
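436 | ## 4. Pass by value vs. pass by reference
437 | A Python function always receives a reference to the object it is given. Mutable containers (such as list or set) can therefore be modified in place by the callee, while immutable objects (such as int or float) behave as if they were passed by value. If a function must not modify a container, make a copy of it first, for example with deepcopy.  
438 | In C++, arguments are passed by value by default; passing by reference has to be declared explicitly.
439 | Taking a vector of int as an example, v1 is left unchanged by pass_value but is modified by pass_reference.
440 | - Pass by value uses  ``vector[int]``: pass_value receives only a copy of v1, so the function cannot modify v1.
441 | - Pass by reference uses  ``vector[int]&``: pass_reference receives a reference to v1, so the function can modify v1.  
442 | 
443 | The two code blocks below show how Python and C++ differ.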
444 | 
445 | 
446 | ```python
447 | from copy import deepcopy
448 | 
449 | def pass_value(v):
450 |     v = deepcopy(v)
451 |     v[0] = -1
452 | 
453 | def pass_reference(v):
454 |     v[0] = -1
455 | 
456 | v1 = [0, 0, 0]
457 | print("Initial value of v1: %s" % v1)
458 | pass_value(v1)
459 | print("v1 after pass_value: %s" % v1)
460 | pass_reference(v1)
461 | print("v1 after pass_reference: %s" % v1)
462 | ```
463 | 
464 |     Initial value of v1: [0, 0, 0]
465 |     v1 after pass_value: [0, 0, 0]
466 |     v1 after pass_reference: [-1, 0, 0]
467 | 
468 | 
469 | 
470 | ```cython
471 | %%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
472 | 
473 | from libcpp.vector cimport vector
474 | 
475 | cdef void pass_value(vector[int] v):
476 |     v[0] = -1
477 | 
478 | cdef void pass_reference(vector[int]& v):
479 |     v[0] = -1
480 | 
481 | cdef vector[int] v1 = [0, 0, 0]
482 | print("Initial value of v1: %s" % v1)
483 | pass_value(v1)
484 | print("v1 after pass_value: %s" % v1)
485 | pass_reference(v1)
486 | print("v1 after pass_reference: %s" % v1)
487 | ```
488 | 
489 |     Initial value of v1: [0, 0, 0]
490 |     v1 after pass_value: [0, 0, 0]
491 |     v1 after pass_reference: [-1, 0, 0]
492 | 
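The deepcopy advice in this section matters most for nested containers. As a minimal sketch (the helper `set_first_cell` and the variable names are made up for illustration and are not part of the project), a shallow copy still shares its inner lists with the original, while deepcopy does not:

```python
from copy import copy, deepcopy


def set_first_cell(grid):
    """Mutate the nested list in place (hypothetical helper)."""
    grid[0][0] = -1


# Shallow copy: a new outer list, but the inner lists are shared.
original = [[0, 0], [0, 0]]
shallow = copy(original)
set_first_cell(shallow)
shallow_leaked = original[0][0] == -1

# Deep copy: the inner lists are copied as well.
original = [[0, 0], [0, 0]]
deep = deepcopy(original)
set_first_cell(deep)
deep_leaked = original[0][0] == -1

print("shallow copy leaked the change:", shallow_leaked)
print("deep copy leaked the change:", deep_leaked)
```

For a flat list of ints such as v1, a shallow copy would already be enough; deepcopy is the safer default when the element types are not known in advance.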
493 | 
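494 | ## 5. Numeric ranges
495 | Python has a single int type whose range is effectively unbounded, limited only by available memory, so Python users rarely worry about numeric overflow. C++ requires more care. Typical sizes and ranges of the C++ numeric types are listed below; the exact values depend on the platform and compiler:  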
496 | 
497 | 
498 | |Type |Typical Width |Typical Range|
499 | | ------ | ------ | ------ |
500 | |char |1 byte |-128 to 127 or 0 to 255|
501 | |unsigned char |1 byte |0 to 255|
502 | |signed char |1 byte |-128 to 127|
503 | |int |4 bytes |-2,147,483,648 to 2,147,483,647|
504 | |unsigned int |4 bytes |0 to 4,294,967,295|
505 | |signed int |4 bytes |-2,147,483,648 to 2,147,483,647|
506 | |short int |2 bytes |-32,768 to 32,767|
507 | |unsigned short int |2 bytes |0 to 65,535|
508 | |signed short int |2 bytes |-32,768 to 32,767|
509 | |long int |4 or 8 bytes |-2,147,483,648 to 2,147,483,647 (4 bytes), or same as long long int (8 bytes)|
510 | |signed long int |4 or 8 bytes |same as long int|
511 | |unsigned long int |4 or 8 bytes |0 to 4,294,967,295 (4 bytes), or same as unsigned long long int (8 bytes)|
512 | |long long int |8 bytes |-(2^63) to (2^63)-1|
513 | |unsigned long long int |8 bytes |0 to 18,446,744,073,709,551,615|
514 | |float |4 bytes |about ±3.4e38, roughly 7 significant decimal digits|
515 | |double |8 bytes |about ±1.8e308, roughly 15 significant decimal digits|
516 | |long double |8, 12 or 16 bytes |platform-dependent|
517 | |wchar_t |2 or 4 bytes |1 wide character|
518 | 
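Since the widths in the table are only typical, it can help to check them on the machine at hand. The following is a small sketch using the standard ctypes module; the dictionary `C_TYPES` and the helpers `type_sizes` and `signed_range` are names invented for this example:

```python
import ctypes

# Map C/C++ type names to the matching ctypes types.
C_TYPES = {
    "char": ctypes.c_char,
    "short": ctypes.c_short,
    "int": ctypes.c_int,
    "long": ctypes.c_long,
    "long long": ctypes.c_longlong,
    "float": ctypes.c_float,
    "double": ctypes.c_double,
    "long double": ctypes.c_longdouble,
    "wchar_t": ctypes.c_wchar,
}


def type_sizes():
    """Return the size in bytes of each type on the current platform."""
    return {name: ctypes.sizeof(ctype) for name, ctype in C_TYPES.items()}


def signed_range(n_bytes):
    """Range of a two's-complement signed integer with n_bytes bytes."""
    bits = 8 * n_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


sizes = type_sizes()
for name, size in sizes.items():
    print(f"{name:12s} {size} bytes")
print("int range:", signed_range(sizes["int"]))
```

The sizes of `long` and `long double` are the ones most likely to differ between operating systems, which is why the table lists several values for them.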
519 | 
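520 | For example, the Cython functions below silently produce wrong results when a value does not fit the declared C++ type.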
521 | 
522 | 
523 | ```cython
524 | %%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
525 | def sum_py(num1, num2):
526 |     print("The result by python is:", num1 + num2)
527 | 
528 | cdef int _sum_cpp(int num1, int num2):  # overflows for large inputs; a wider type such as long long avoids this
529 |     return num1 + num2
530 | 
531 | def sum_cpp(num1, num2):
532 |     print("The result by cpp is:", _sum_cpp(num1, num2))
533 | ```
534 | 
535 | 
536 | ```python
537 | sum_py(2**31-1, 1)
538 | sum_cpp(2**31-1, 1)
539 | ```
540 | 
541 |     The result by python is: 2147483648
542 |     The result by cpp is: -2147483648
543 | 
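The wrapped result can be reproduced in pure Python. The sketch below assumes the common case of a 32-bit two's-complement int; strictly speaking, signed overflow is undefined behaviour in C++, so wraparound is what is usually observed, not something the language guarantees. The helpers `add_as_int32` and `fits_in_int32` are hypothetical names for this illustration:

```python
import ctypes

INT_MAX = 2**31 - 1


def add_as_int32(a, b):
    """Add two Python ints, then keep only the low 32 bits as a signed value."""
    # ctypes integer types do no overflow checking, so the value wraps.
    return ctypes.c_int32(a + b).value


def fits_in_int32(value):
    """Check whether a Python int is representable as a 32-bit signed int."""
    return -2**31 <= value <= INT_MAX


wrapped = add_as_int32(INT_MAX, 1)
print("Wrapped sum:", wrapped)
print("Exact sum fits in int32:", fits_in_int32(INT_MAX + 1))
```

A range check like `fits_in_int32` on the Python side is one way to fail loudly before handing values to a typed Cython function.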
544 | 
545 | 
546 | ```cython
547 | %%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++
548 | from libcpp cimport bool
549 | 
550 | def lt_py(num1, num2):
551 |     print("The result by python is:", num1 < num2)
552 | 
553 | cdef bool _lt_cpp(float num1, float num2):  # float loses precision here; declaring double avoids this
554 |     return num1 < num2
555 | 
556 | def lt_cpp(num1, num2):
557 |     print("The result by cpp is:", _lt_cpp(num1, num2))
558 | ```
559 | 
560 | 
561 | ```python
562 | lt_py(1234567890.0, 1234567891.0)
563 | lt_cpp(1234567890.0, 1234567891.0)
564 | ```
565 | 
566 |     The result by python is: True
567 |     The result by cpp is: False
568 | 
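The comparison fails because both arguments round to the same 32-bit float. A minimal way to see this from Python is to round-trip the values through the standard struct module; `to_float32` is a helper name made up for this sketch:

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


a = to_float32(1234567890.0)
b = to_float32(1234567891.0)
print(a, b)
print("a < b:", a < b)
```

Near 1.2e9 adjacent 32-bit floats are 128 apart, so two inputs that differ by 1 become indistinguishable once they are stored as float.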
569 | 
570 | 
574 | 


--------------------------------------------------------------------------------
/Itertools for efficient looping.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
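  7 |     "# Python Itertools - Efficient Looping\n",
  8 |     "Author: tushushu  \n",
  9 |     "Project: https://github.com/tushushu/flying-python  \n",
 10 |     "\n",
 11 |     "The official Python documentation describes the itertools module as tools for \"efficient looping\". Some of the tools bring a real performance gain; others are no faster and only save some development time, and overusing them can hurt readability. Let's put the members of itertools through their paces.\n"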
 12 |    ]
 13 |   },
 14 |   {
 15 |    "cell_type": "markdown",
 16 |    "metadata": {},
 17 |    "source": [
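 18 |     "## 1. Cumulative sum\n",
 19 |     "Given a list An, return the list of its running sums Sn.\n",
 20 |     "For example:\n",
 21 |     "* Input: [1, 2, 3, 4, 5]\n",
 22 |     "* Output: [1, 3, 6, 10, 15]  \n",
 23 |     "\n",
 24 |     "Using accumulate is about 2.9 times faster (61 µs vs. 21.3 µs)."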
 25 |    ]
 26 |   },
 27 |   {
 28 |    "cell_type": "code",
 29 |    "execution_count": 1,
 30 |    "metadata": {},
 31 |    "outputs": [],
 32 |    "source": [
 33 |     "from itertools import accumulate"
 34 |    ]
 35 |   },
 36 |   {
 37 |    "cell_type": "code",
 38 |    "execution_count": 2,
 39 |    "metadata": {},
 40 |    "outputs": [],
 41 |    "source": [
 42 |     "def _accumulate_list(arr):\n",
 43 |     "    tot = 0\n",
 44 |     "    for x in arr:\n",
 45 |     "        tot += x\n",
 46 |     "        yield tot\n",
 47 |     "\n",
 48 |     "def accumulate_list(arr):\n",
 49 |     "    return list(_accumulate_list(arr))"
 50 |    ]
 51 |   },
 52 |   {
 53 |    "cell_type": "code",
 54 |    "execution_count": 3,
 55 |    "metadata": {},
 56 |    "outputs": [],
 57 |    "source": [
 58 |     "def fast_accumulate_list(arr):\n",
 59 |     "    return list(accumulate(arr))"
 60 |    ]
 61 |   },
 62 |   {
 63 |    "cell_type": "code",
 64 |    "execution_count": 4,
 65 |    "metadata": {},
 66 |    "outputs": [],
 67 |    "source": [
 68 |     "arr = list(range(1000))"
 69 |    ]
 70 |   },
 71 |   {
 72 |    "cell_type": "code",
 73 |    "execution_count": 5,
 74 |    "metadata": {},
 75 |    "outputs": [
 76 |     {
 77 |      "name": "stdout",
 78 |      "output_type": "stream",
 79 |      "text": [
 80 |       "61 µs ± 2.91 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)\n"
 81 |      ]
 82 |     }
 83 |    ],
 84 |    "source": [
 85 |     "%timeit accumulate_list(arr)"
 86 |    ]
 87 |   },
 88 |   {
 89 |    "cell_type": "code",
 90 |    "execution_count": 6,
 91 |    "metadata": {},
 92 |    "outputs": [
 93 |     {
 94 |      "name": "stdout",
 95 |      "output_type": "stream",
 96 |      "text": [
 97 |       "21.3 µs ± 811 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)\n"
 98 |      ]
 99 |     }
100 |    ],
101 |    "source": [
102 |     "%timeit fast_accumulate_list(arr)"
103 |    ]
104 |   },
105 |   {
106 |    "cell_type": "markdown",
107 |    "metadata": {},
108 |    "source": [
109 |     "## 2. Selecting data\n",
110 |     "Given a list data and a list selectors of 0/1 flags, return the selected items.\n",
111 |     "For example:\n",
112 |     "* Input: [1, 2, 3, 4, 5], [0, 1, 0, 1, 0]\n",
113 |     "* Output: [2, 4]  \n",
114 |     "\n",
115 |     "Using compress is about 2.6 times faster (341 µs vs. 130 µs)."
116 |    ]
117 |   },
118 |   {
119 |    "cell_type": "code",
120 |    "execution_count": 7,
121 |    "metadata": {},
122 |    "outputs": [],
123 |    "source": [
124 |     "from itertools import compress\n",
125 |     "from random import randint"
126 |    ]
127 |   },
128 |   {
129 |    "cell_type": "code",
130 |    "execution_count": 8,
131 |    "metadata": {},
132 |    "outputs": [],
133 |    "source": [
134 |     "def select_data(data, selectors):\n",
135 |     "    return [x for x, y in zip(data, selectors) if y]"
136 |    ]
137 |   },
138 |   {
139 |    "cell_type": "code",
140 |    "execution_count": 9,
141 |    "metadata": {},
142 |    "outputs": [],
143 |    "source": [
144 |     "def fast_select_data(data, selectors):\n",
145 |     "    return list(compress(data, selectors))"
146 |    ]
147 |   },
148 |   {
149 |    "cell_type": "code",
150 |    "execution_count": 10,
151 |    "metadata": {},
152 |    "outputs": [],
153 |    "source": [
154 |     "data = list(range(10000))\n",
155 |     "selectors = [randint(0, 1) for _ in range(10000)]"
156 |    ]
157 |   },
158 |   {
159 |    "cell_type": "code",
160 |    "execution_count": 11,
161 |    "metadata": {},
162 |    "outputs": [
163 |     {
164 |      "name": "stdout",
165 |      "output_type": "stream",
166 |      "text": [
167 |       "341 µs ± 17.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
168 |      ]
169 |     }
170 |    ],
171 |    "source": [
172 |     "%timeit select_data(data, selectors)"
173 |    ]
174 |   },
175 |   {
176 |    "cell_type": "code",
177 |    "execution_count": 12,
178 |    "metadata": {},
179 |    "outputs": [
180 |     {
181 |      "name": "stdout",
182 |      "output_type": "stream",
183 |      "text": [
184 |       "130 µs ± 3.19 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)\n"
185 |      ]
186 |     }
187 |    ],
188 |    "source": [
189 |     "%timeit fast_select_data(data, selectors)"
190 |    ]
191 |   },
192 |   {
193 |    "cell_type": "markdown",
194 |    "metadata": {},
195 |    "source": [
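196 |     "## 3. Permutations\n",
197 |     "Given a list arr and a number k, return every ordered arrangement of k elements chosen from arr.\n",
198 |     "For example:\n",
199 |     "* Input: [1, 2, 3], 2\n",
200 |     "* Output: [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)] \n",
201 |     "\n",
202 |     "Using permutations is about 10 times faster (15.5 ms vs. 1.56 ms)."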
203 |    ]
204 |   },
205 |   {
206 |    "cell_type": "code",
207 |    "execution_count": 13,
208 |    "metadata": {},
209 |    "outputs": [],
210 |    "source": [
211 |     "from itertools import permutations"
212 |    ]
213 |   },
214 |   {
215 |    "cell_type": "code",
216 |    "execution_count": 14,
217 |    "metadata": {},
218 |    "outputs": [],
219 |    "source": [
220 |     "def _get_permutations(arr, k, i):\n",
221 |     "    if i == k:\n",
222 |     "        return [arr[:k]]\n",
223 |     "    res = []\n",
224 |     "    for j in range(i, len(arr)):\n",
225 |     "        arr_cpy = arr.copy()\n",
226 |     "        arr_cpy[i], arr_cpy[j] = arr_cpy[j], arr_cpy[i]\n",
227 |     "        res += _get_permutations(arr_cpy, k, i + 1)\n",
228 |     "    return res\n",
229 |     "    \n",
230 |     "def get_permutations(arr, k):\n",
231 |     "    return _get_permutations(arr, k, 0)"
232 |    ]
233 |   },
234 |   {
235 |    "cell_type": "code",
236 |    "execution_count": 15,
237 |    "metadata": {},
238 |    "outputs": [],
239 |    "source": [
240 |     "def fast_get_permutations(arr, k):\n",
241 |     "    return list(permutations(arr, k))"
242 |    ]
243 |   },
244 |   {
245 |    "cell_type": "code",
246 |    "execution_count": 16,
247 |    "metadata": {},
248 |    "outputs": [],
249 |    "source": [
250 |     "arr = list(range(10))\n",
251 |     "k = 5"
252 |    ]
253 |   },
254 |   {
255 |    "cell_type": "code",
256 |    "execution_count": 17,
257 |    "metadata": {},
258 |    "outputs": [
259 |     {
260 |      "name": "stdout",
261 |      "output_type": "stream",
262 |      "text": [
263 |       "15.5 ms ± 1.96 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
264 |      ]
265 |     }
266 |    ],
267 |    "source": [
268 |     "%timeit -n 1 get_permutations(arr, k)"
269 |    ]
270 |   },
271 |   {
272 |    "cell_type": "code",
273 |    "execution_count": 18,
274 |    "metadata": {},
275 |    "outputs": [
276 |     {
277 |      "name": "stdout",
278 |      "output_type": "stream",
279 |      "text": [
280 |       "1.56 ms ± 284 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
281 |      ]
282 |     }
283 |    ],
284 |    "source": [
285 |     "%timeit -n 1 fast_get_permutations(arr, k)"
286 |    ]
287 |   },
288 |   {
289 |    "cell_type": "markdown",
290 |    "metadata": {},
291 |    "source": [
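292 |     "## 4. Filtering data\n",
293 |     "Given a list arr, keep only the even numbers.\n",
294 |     "For example:\n",
295 |     "* Input: [3, 1, 4, 5, 9, 2]\n",
296 |     "* Output: [4, 2] \n",
297 |     "\n",
298 |     "Using filterfalse is actually slower here (823 µs vs. 417 µs), so do not trust itertools blindly."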
299 |    ]
300 |   },
301 |   {
302 |    "cell_type": "code",
303 |    "execution_count": 19,
304 |    "metadata": {},
305 |    "outputs": [],
306 |    "source": [
307 |     "from itertools import filterfalse"
308 |    ]
309 |   },
310 |   {
311 |    "cell_type": "code",
312 |    "execution_count": 20,
313 |    "metadata": {},
314 |    "outputs": [],
315 |    "source": [
316 |     "def get_even_nums(arr):\n",
317 |     "    return [x for x in arr if x % 2 == 0]"
318 |    ]
319 |   },
320 |   {
321 |    "cell_type": "code",
322 |    "execution_count": 21,
323 |    "metadata": {},
324 |    "outputs": [],
325 |    "source": [
326 |     "def fast_get_even_nums(arr):\n",
327 |     "    return list(filterfalse(lambda x: x % 2, arr))"
328 |    ]
329 |   },
330 |   {
331 |    "cell_type": "code",
332 |    "execution_count": 22,
333 |    "metadata": {},
334 |    "outputs": [],
335 |    "source": [
336 |     "arr = list(range(10000))"
337 |    ]
338 |   },
339 |   {
340 |    "cell_type": "code",
341 |    "execution_count": 23,
342 |    "metadata": {},
343 |    "outputs": [
344 |     {
345 |      "name": "stdout",
346 |      "output_type": "stream",
347 |      "text": [
348 |       "417 µs ± 18.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
349 |      ]
350 |     }
351 |    ],
352 |    "source": [
353 |     "%timeit get_even_nums(arr)"
354 |    ]
355 |   },
356 |   {
357 |    "cell_type": "code",
358 |    "execution_count": 24,
359 |    "metadata": {},
360 |    "outputs": [
361 |     {
362 |      "name": "stdout",
363 |      "output_type": "stream",
364 |      "text": [
365 |       "823 µs ± 22.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
366 |      ]
367 |     }
368 |    ],
369 |    "source": [
370 |     "%timeit fast_get_even_nums(arr)"
371 |    ]
372 |   },
373 |   {
374 |    "cell_type": "markdown",
375 |    "metadata": {},
376 |    "source": [
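377 |     "## 5. Conditional termination\n",
378 |     "Given a list arr, sum its numbers in order and stop as soon as an element is greater than target; return the sum accumulated so far.\n",
379 |     "For example:\n",
380 |     "* Input: [1, 2, 3, 4, 5], 3\n",
381 |     "* Output: 6 (4 > 3, so the summation stops)\n",
382 |     "\n",
383 |     "Using takewhile is actually slower here (404 µs vs. 245 µs), so do not trust itertools blindly."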
384 |    ]
385 |   },
386 |   {
387 |    "cell_type": "code",
388 |    "execution_count": 25,
389 |    "metadata": {},
390 |    "outputs": [],
391 |    "source": [
392 |     "from itertools import takewhile"
393 |    ]
394 |   },
395 |   {
396 |    "cell_type": "code",
397 |    "execution_count": 26,
398 |    "metadata": {},
399 |    "outputs": [],
400 |    "source": [
401 |     "def cond_sum(arr, target):\n",
402 |     "    res = 0\n",
403 |     "    for x in arr:\n",
404 |     "        if x > target:\n",
405 |     "            break\n",
406 |     "        res += x\n",
407 |     "    return res"
408 |    ]
409 |   },
410 |   {
411 |    "cell_type": "code",
412 |    "execution_count": 27,
413 |    "metadata": {},
414 |    "outputs": [],
415 |    "source": [
416 |     "def fast_cond_sum(arr, target):\n",
417 |     "    return sum(takewhile(lambda x: x <= target, arr))"
418 |    ]
419 |   },
420 |   {
421 |    "cell_type": "code",
422 |    "execution_count": 28,
423 |    "metadata": {},
424 |    "outputs": [],
425 |    "source": [
426 |     "arr = list(range(10000))\n",
427 |     "target = 5000"
428 |    ]
429 |   },
430 |   {
431 |    "cell_type": "code",
432 |    "execution_count": 29,
433 |    "metadata": {},
434 |    "outputs": [
435 |     {
436 |      "name": "stdout",
437 |      "output_type": "stream",
438 |      "text": [
439 |       "245 µs ± 11.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
440 |      ]
441 |     }
442 |    ],
443 |    "source": [
444 |     "%timeit cond_sum(arr, target)"
445 |    ]
446 |   },
447 |   {
448 |    "cell_type": "code",
449 |    "execution_count": 30,
450 |    "metadata": {},
451 |    "outputs": [
452 |     {
453 |      "name": "stdout",
454 |      "output_type": "stream",
455 |      "text": [
456 |       "404 µs ± 13.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
457 |      ]
458 |     }
459 |    ],
460 |    "source": [
461 |     "%timeit fast_cond_sum(arr, target)"
462 |    ]
463 |   },
464 |   {
465 |    "cell_type": "markdown",
466 |    "metadata": {},
467 |    "source": [
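468 |     "## 6. Nested loops\n",
469 |     "Given lists arr1 and arr2, return the sums of all pairs formed by taking one element from each list.\n",
470 |     "For example:\n",
471 |     "* Input: [1, 2], [4, 5]\n",
472 |     "* Output: [1 + 4, 1 + 5, 2 + 4, 2 + 5]\n",
473 |     "\n",
474 |     "Using product is about 1.3 times faster (484 µs vs. 373 µs)."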
475 |    ]
476 |   },
477 |   {
478 |    "cell_type": "code",
479 |    "execution_count": 31,
480 |    "metadata": {},
481 |    "outputs": [],
482 |    "source": [
483 |     "from itertools import product"
484 |    ]
485 |   },
486 |   {
487 |    "cell_type": "code",
488 |    "execution_count": 32,
489 |    "metadata": {},
490 |    "outputs": [],
491 |    "source": [
492 |     "def _cross_sum(arr1, arr2):\n",
493 |     "    for x in arr1:\n",
494 |     "        for y in arr2:\n",
495 |     "            yield x + y\n",
496 |     "\n",
497 |     "def cross_sum(arr1, arr2):\n",
498 |     "    return list(_cross_sum(arr1, arr2))"
499 |    ]
500 |   },
501 |   {
502 |    "cell_type": "code",
503 |    "execution_count": 33,
504 |    "metadata": {},
505 |    "outputs": [],
506 |    "source": [
507 |     "def fast_cross_sum(arr1, arr2):\n",
508 |     "    return [x + y for x, y in product(arr1, arr2)]"
509 |    ]
510 |   },
511 |   {
512 |    "cell_type": "code",
513 |    "execution_count": 34,
514 |    "metadata": {},
515 |    "outputs": [],
516 |    "source": [
517 |     "arr1 = list(range(100))\n",
518 |     "arr2 = list(range(100))"
519 |    ]
520 |   },
521 |   {
522 |    "cell_type": "code",
523 |    "execution_count": 35,
524 |    "metadata": {},
525 |    "outputs": [
526 |     {
527 |      "name": "stdout",
528 |      "output_type": "stream",
529 |      "text": [
530 |       "484 µs ± 16.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
531 |      ]
532 |     }
533 |    ],
534 |    "source": [
535 |     "%timeit cross_sum(arr1, arr2)"
536 |    ]
537 |   },
538 |   {
539 |    "cell_type": "code",
540 |    "execution_count": 36,
541 |    "metadata": {},
542 |    "outputs": [
543 |     {
544 |      "name": "stdout",
545 |      "output_type": "stream",
546 |      "text": [
547 |       "373 µs ± 11.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
548 |      ]
549 |     }
550 |    ],
551 |    "source": [
552 |     "%timeit fast_cross_sum(arr1, arr2)"
553 |    ]
554 |   },
555 |   {
556 |    "cell_type": "markdown",
557 |    "metadata": {},
558 |    "source": [
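559 |     "## 7. Flattening a 2D list\n",
560 |     "Given a two-dimensional list arr, convert it to a one-dimensional list.\n",
561 |     "For example:\n",
562 |     "* Input: [[1, 2], [3, 4]]\n",
563 |     "* Output: [1, 2, 3, 4]\n",
564 |     "\n",
565 |     "Using chain is about 5.7 times faster (379 µs vs. 66.9 µs)."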
566 |    ]
567 |   },
568 |   {
569 |    "cell_type": "code",
570 |    "execution_count": 37,
571 |    "metadata": {},
572 |    "outputs": [],
573 |    "source": [
574 |     "from itertools import chain"
575 |    ]
576 |   },
577 |   {
578 |    "cell_type": "code",
579 |    "execution_count": 38,
580 |    "metadata": {},
581 |    "outputs": [],
582 |    "source": [
583 |     "def _flatten(arr2d):\n",
584 |     "    for arr in arr2d:\n",
585 |     "        for x in arr:\n",
586 |     "            yield x\n",
587 |     "\n",
588 |     "def flatten(arr2d):\n",
589 |     "    return list(_flatten(arr2d))"
590 |    ]
591 |   },
592 |   {
593 |    "cell_type": "code",
594 |    "execution_count": 39,
595 |    "metadata": {},
596 |    "outputs": [],
597 |    "source": [
598 |     "def fast_flatten(arr2d):\n",
599 |     "    return list(chain(*arr2d))"
600 |    ]
601 |   },
602 |   {
603 |    "cell_type": "code",
604 |    "execution_count": 40,
605 |    "metadata": {},
606 |    "outputs": [],
607 |    "source": [
608 |     "arr2d = [[x + y * 100 for x in range(100)] for y in range(100)]"
609 |    ]
610 |   },
611 |   {
612 |    "cell_type": "code",
613 |    "execution_count": 41,
614 |    "metadata": {},
615 |    "outputs": [
616 |     {
617 |      "name": "stdout",
618 |      "output_type": "stream",
619 |      "text": [
620 |       "379 µs ± 15.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
621 |      ]
622 |     }
623 |    ],
624 |    "source": [
625 |     "%timeit flatten(arr2d)"
626 |    ]
627 |   },
628 |   {
629 |    "cell_type": "code",
630 |    "execution_count": 42,
631 |    "metadata": {},
632 |    "outputs": [
633 |     {
634 |      "name": "stdout",
635 |      "output_type": "stream",
636 |      "text": [
637 |       "66.9 µs ± 3.43 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)\n"
638 |      ]
639 |     }
640 |    ],
641 |    "source": [
642 |     "%timeit fast_flatten(arr2d)"
643 |    ]
644 |   },
645 |   {
646 |    "cell_type": "code",
647 |    "execution_count": null,
648 |    "metadata": {},
649 |    "outputs": [],
650 |    "source": []
651 |   }
652 |  ],
653 |  "metadata": {
654 |   "kernelspec": {
655 |    "display_name": "Python 3",
656 |    "language": "python",
657 |    "name": "python3"
658 |   },
659 |   "language_info": {
660 |    "codemirror_mode": {
661 |     "name": "ipython",
662 |     "version": 3
663 |    },
664 |    "file_extension": ".py",
665 |    "mimetype": "text/x-python",
666 |    "name": "python",
667 |    "nbconvert_exporter": "python",
668 |    "pygments_lexer": "ipython3",
669 |    "version": "3.6.6"
670 |   }
671 |  },
672 |  "nbformat": 4,
673 |  "nbformat_minor": 2
674 | }
675 | 


--------------------------------------------------------------------------------
/Using C++ in Cython.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "## Using C++ in Cython\n",
  8 |     "Author: tushushu  \n",
  9 |     "Project: https://github.com/tushushu/flying-python"
 10 |    ]
 11 |   },
 12 |   {
 13 |    "cell_type": "markdown",
 14 |    "metadata": {},
 15 |    "source": [
 16 |     "## 1. Using C++ in Jupyter Notebook \n",
 17 |     "- First, load the Cython extension with the magic command  ``%load_ext Cython``\n",
 18 |     "- Then run Cython code with the magic command  ``%%cython --cplus``\n",
 19 |     "- On macOS, use the magic command  ``%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++``; for details see https://stackoverflow.com/questions/57367764/cant-import-cpplist-into-cython"
 20 |    ]
 21 |   },
 22 |   {
 23 |    "cell_type": "code",
 24 |    "execution_count": 1,
 25 |    "metadata": {},
 26 |    "outputs": [],
 27 |    "source": [
 28 |     "%load_ext Cython"
 29 |    ]
 30 |   },
 31 |   {
 32 |    "cell_type": "code",
 33 |    "execution_count": 2,
 34 |    "metadata": {},
 35 |    "outputs": [
 36 |     {
 37 |      "name": "stdout",
 38 |      "output_type": "stream",
 39 |      "text": [
 40 |       "Hello world!\n"
 41 |      ]
 42 |     }
 43 |    ],
 44 |    "source": [
 45 |     "%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++\n",
 46 |     "# Note: use 'cimport' rather than 'import'\n",
 47 |     "from libcpp.string cimport string\n",
 48 |     "cdef string s\n",
 49 |     "s = b\"Hello world!\"\n",
 50 |     "print(s.decode(\"utf-8\"))"
 51 |    ]
 52 |   },
 53 |   {
 54 |    "cell_type": "markdown",
 55 |    "metadata": {},
 56 |    "source": [
 57 |     "## 2. Converting between C++ and Python types"
 58 |    ]
 59 |   },
 60 |   {
 61 |    "cell_type": "markdown",
 62 |    "metadata": {},
 63 |    "source": [
 64 |     "| From Python type | C++ type | To Python type |\n",
 65 |     "| ------ | ------ | ------ |\n",
 66 |     "| bytes | std::string | bytes |\n",
 67 |     "|iterable|std::vector|list|\n",
 68 |     "|iterable|std::list|list|\n",
 69 |     "|iterable|std::set|set|\n",
 70 |     "|iterable (len 2)|std::pair|tuple (len 2)|"
 71 |    ]
 72 |   },
 73 |   {
 74 |    "cell_type": "markdown",
 75 |    "metadata": {},
 76 |    "source": [
 77 |     "## 3. Using the C++ STL"
 78 |    ]
 79 |   },
 80 |   {
 81 |    "cell_type": "markdown",
 82 |    "metadata": {},
 83 |    "source": [
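 84 |     "### 3.1 Using the C++ vector\n",
 85 |     "It can replace a Python list.\n",
 86 |     "1. Initialization - initialize from a Python iterable; the element type must be declared\n",
 87 |     "2. Traversal - increment an index in a while loop\n",
 88 |     "3. Access - use the '[]' operator to access elements, as in Python\n",
 89 |     "4. Append - use the vector's push_back method, similar to the append method of a Python list\n",
 90 |     "\n",
 91 |     "Finally, we compare performance by implementing an element-counting function in both Python and C++; the C++ version is about 230 times faster (29.9 µs vs. 130 ns).  \n",
 92 |     "Note: to keep the comparison fair, the functions take no arguments and read variables defined outside the function body. This keeps the cost of converting a Python list to a C++ vector out of the C++ timing. If that cost is included, the C++ version is only about 4 times faster."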
 93 |    ]
 94 |   },
 95 |   {
 96 |    "cell_type": "code",
 97 |    "execution_count": 3,
 98 |    "metadata": {},
 99 |    "outputs": [
100 |     {
101 |      "name": "stdout",
102 |      "output_type": "stream",
103 |      "text": [
104 |       "Start traversal...\n",
105 |       "\tElement at position 0 is 0\n",
106 |       "\tElement at position 1 is 1\n",
107 |       "\tElement at position 2 is 2\n",
108 |       "\tElement at position 3 is 3\n",
109 |       "\tElement at position 4 is 4\n",
110 |       "\n",
111 |       "After appending, vec is [0, 1, 2, 3, 4, 5]\n"
112 |      ]
113 |     }
114 |    ],
115 |    "source": [
116 |     "%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++\n",
117 |     "from libcpp.vector cimport vector\n",
118 |     "# Initialize from a Python object\n",
119 |     "cdef vector[int] vec = range(5)\n",
120 |     "# Traverse\n",
121 |     "cdef:\n",
122 |     "    int i = 0\n",
123 |     "    int n = vec.size()\n",
124 |     "print(\"Start traversal...\")\n",
125 |     "while i < n:\n",
126 |     "    # Access\n",
127 |     "    print(\"\\tElement at position %d is %d\" % (i, vec[i]))\n",
128 |     "    i += 1\n",
129 |     "print()\n",
130 |     "# Append\n",
131 |     "vec.push_back(5)\n",
132 |     "print(\"After appending, vec is\", vec)"
133 |    ]
134 |   },
135 |   {
136 |    "cell_type": "code",
137 |    "execution_count": 4,
138 |    "metadata": {},
139 |    "outputs": [
140 |     {
141 |      "name": "stdout",
142 |      "output_type": "stream",
143 |      "text": [
144 |       "Python implementation, result: 100\n"
145 |      ]
146 |     }
147 |    ],
148 |    "source": [
149 |     "arr = [x // 100 for x in range(1000)]\n",
150 |     "target = 6\n",
151 |     "\n",
152 |     "def count_py():\n",
153 |     "    return sum(1 for x in arr if x == target)\n",
154 |     "\n",
155 |     "print(\"Python implementation, result: %d\" % count_py())"
156 |    ]
157 |   },
158 |   {
159 |    "cell_type": "code",
160 |    "execution_count": 5,
161 |    "metadata": {},
162 |    "outputs": [
163 |     {
164 |      "name": "stdout",
165 |      "output_type": "stream",
166 |      "text": [
167 |       "Cython (C++) implementation, result: 100\n"
168 |      ]
169 |     }
170 |    ],
171 |    "source": [
172 |     "%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++\n",
173 |     "from libcpp.vector cimport vector\n",
174 |     "\n",
175 |     "cdef:\n",
176 |     "    int target = 6\n",
177 |     "    vector[int] v = [x // 100 for x in range(1000)]\n",
178 |     "\n",
179 |     "cdef int _count_cpp():\n",
180 |     "    cdef:\n",
181 |     "        int i = 0\n",
182 |     "        int n = v.size()\n",
183 |     "        int ret = 0\n",
184 |     "    while i < n:\n",
185 |     "        if v[i] == target:\n",
186 |     "            ret += 1\n",
187 |     "        i += 1\n",
188 |     "    return ret\n",
189 |     "\n",
190 |     "def count_cpp():\n",
191 |     "    return _count_cpp()\n",
192 |     "\n",
193 |     "print(\"Cython (C++) implementation, result: %d\" % count_cpp())"
194 |    ]
195 |   },
196 |   {
197 |    "cell_type": "code",
198 |    "execution_count": 6,
199 |    "metadata": {},
200 |    "outputs": [
201 |     {
202 |      "name": "stdout",
203 |      "output_type": "stream",
204 |      "text": [
205 |       "Comparing the performance of the Python and C++ versions...\n",
206 |       "29.9 µs ± 995 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)\n",
207 |       "130 ns ± 2.91 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)\n"
208 |      ]
209 |     }
210 |    ],
211 |    "source": [
212 |     "print(\"Comparing the performance of the Python and C++ versions...\")\n",
213 |     "%timeit count_py()\n",
214 |     "%timeit count_cpp()"
215 |    ]
216 |   },
217 |   {
218 |    "cell_type": "markdown",
219 |    "metadata": {},
220 |    "source": [
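221 |     "### 3.2 Using the C++ unordered_map\n",
222 |     "It can replace a Python dict.\n",
223 |     "1. Initialization - initialize from a Python object such as a dict; the key and value types must be declared\n",
224 |     "2. Traversal - increment an iterator in a while loop\n",
225 |     "3. Access - dereference the iterator with deref (the '*' operator in C++), which yields a pair; use .first for the key and .second for the value\n",
226 |     "4. Lookup - use unordered_map.count, which returns 1 or 0, or unordered_map.find, which returns an iterator; if the iterator equals unordered_map.end, the key was not found.\n",
227 |     "5. Insert/update - unordered_map[key] = value. If the key does not exist, the '[]' operator inserts it with a default value such as 0.0. So unless you are sure this cannot cause errors, check that the key exists before modifying its value. This is somewhat similar to Python's defaultdict.  \n",
228 |     "\n",
229 |     "Finally, we compare performance by implementing a conditional sum over a map in both Python and C++; the C++ version is about 40 times faster."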
230 |    ]
231 |   },
232 |   {
233 |    "cell_type": "code",
234 |    "execution_count": 7,
235 |    "metadata": {},
236 |    "outputs": [
237 |     {
238 |      "name": "stdout",
239 |      "output_type": "stream",
240 |      "text": [
241 |       "Start traversal...\n",
242 |       "\tKey is 0, Value is 0.0\n",
243 |       "\tKey is 1, Value is 0.1\n",
244 |       "\tKey is 2, Value is 0.2\n",
245 |       "\tKey is 3, Value is 0.3\n",
246 |       "\tKey is 4, Value is 0.4\n",
247 |       "\tKey is 5, Value is 0.5\n",
248 |       "\tKey is 6, Value is 0.6\n",
249 |       "\tKey is 7, Value is 0.7\n",
250 |       "\tKey is 8, Value is 0.8\n",
251 |       "\tKey is 9, Value is 0.9\n",
252 |       "\n",
253 |       "Start lookup...\n",
254 |       "\tElement -2 does not exist!\n",
255 |       "\tElement 3 exists, its value is 0.3!\n",
256 |       "\n",
257 |       "Modify elements...\n",
258 |       "\tKey is 3, Value is 1.3\n",
259 |       "\tKey is -2, Value is 0.0\n"
260 |      ]
261 |     }
262 |    ],
263 |    "source": [
264 |     "%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++\n",
265 |     "from cython.operator cimport dereference as deref, preincrement as inc\n",
266 |     "from libcpp.unordered_map cimport unordered_map\n",
267 |     "# Initialize from a Python object\n",
268 |     "cdef unordered_map[int, float] mymap = {i: i/10 for i in range(10)}\n",
269 |     "# Traverse\n",
270 |     "cdef:\n",
271 |     "    unordered_map[int, float].iterator it = mymap.begin()\n",
272 |     "    unordered_map[int, float].iterator end = mymap.end()\n",
273 |     "print(\"Start traversal...\")\n",
274 |     "while it != end:\n",
275 |     "    # Access\n",
276 |     "    print(\"\\tKey is %d, Value is %.1f\" % (deref(it).first, deref(it).second))\n",
277 |     "    inc(it)\n",
278 |     "print()\n",
279 |     "\n",
280 |     "# Lookup\n",
281 |     "print(\"Start lookup...\")\n",
282 |     "if mymap.count(-2):\n",
283 |     "    print(\"\\tElement -2 exists!\")\n",
284 |     "else:\n",
285 |     "    print(\"\\tElement -2 does not exist!\")\n",
286 |     "\n",
287 |     "it = mymap.find(3)\n",
288 |     "if it != end:\n",
289 |     "    print(\"\\tElement 3 exists, its value is %.1f!\" % deref(it).second)\n",
290 |     "else:\n",
291 |     "    print(\"\\tElement 3 does not exist!\")\n",
292 |     "print()\n",
293 |     "\n",
294 |     "# Modify\n",
295 |     "print(\"Modify elements...\")\n",
296 |     "if mymap.count(3):\n",
297 |     "    mymap[3] += 1.0\n",
298 |     "mymap[-2]  # Key -2 does not exist, so it is inserted with the default value 0.0\n",
299 |     "print(\"\\tKey is 3, Value is %.1f\" % mymap[3])\n",
300 |     "print(\"\\tKey is -2, Value is %.1f\" % mymap[-2])"
301 |    ]
302 |   },
303 |   {
304 |    "cell_type": "code",
305 |    "execution_count": 8,
306 |    "metadata": {},
307 |    "outputs": [
308 |     {
309 |      "name": "stdout",
310 |      "output_type": "stream",
311 |      "text": [
312 |       "Python implementation, result is 1225!\n"
313 |      ]
314 |     }
315 |    ],
316 |    "source": [
317 |     "my_map = {x: x for x in range(100)}\n",
318 |     "target = 50\n",
319 |     "\n",
320 |     "def sum_lt_py():\n",
321 |     "    return sum(my_map[x] for x in my_map if x < target)\n",
322 |     "\n",
323 |     "print(\"Python implementation, result is %d!\" % sum_lt_py())"
324 |    ]
325 |   },
326 |   {
327 |    "cell_type": "code",
328 |    "execution_count": 9,
329 |    "metadata": {},
330 |    "outputs": [
331 |     {
332 |      "name": "stdout",
333 |      "output_type": "stream",
334 |      "text": [
335 |       "Cython (C++) implementation, result is 1225!\n"
336 |      ]
337 |     }
338 |    ],
339 |    "source": [
340 |     "%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++\n",
341 |     "from libcpp.unordered_map cimport unordered_map\n",
342 |     "from cython.operator cimport dereference as deref, preincrement as inc\n",
343 |     "\n",
344 |     "cdef:\n",
345 |     "    unordered_map[int, int] my_map = {x: x for x in range(100)}\n",
346 |     "    int target = 50\n",
347 |     "\n",
348 |     "cdef _sum_lt_cpp():\n",
349 |     "    cdef:\n",
350 |     "        unordered_map[int, int].iterator it = my_map.begin()\n",
351 |     "        int ret = 0  # C locals are not zero-initialized, so set the accumulator explicitly\n",
352 |     "    while it != my_map.end():\n",
353 |     "        if deref(it).first < target:\n",
354 |     "            ret += deref(it).second\n",
355 |     "        inc(it)\n",
356 |     "    return ret\n",
357 |     "\n",
358 |     "def sum_lt_cpp():\n",
359 |     "    return _sum_lt_cpp()\n",
360 |     "\n",
361 |     "print(\"Cython (C++) implementation, result is %d!\" % sum_lt_cpp())"
362 |    ]
363 |   },
364 |   {
365 |    "cell_type": "code",
366 |    "execution_count": 10,
367 |    "metadata": {},
368 |    "outputs": [
369 |     {
370 |      "name": "stdout",
371 |      "output_type": "stream",
372 |      "text": [
373 |       "Comparing the performance of the Python and C++ versions...\n",
374 |       "6.56 µs ± 117 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n",
375 |       "162 ns ± 6.29 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)\n"
376 |      ]
377 |     }
378 |    ],
379 |    "source": [
380 |     "print(\"Comparing the performance of the Python and C++ versions...\")\n",
381 |     "%timeit sum_lt_py()\n",
382 |     "%timeit sum_lt_cpp()"
383 |    ]
384 |   },
385 |   {
386 |    "cell_type": "markdown",
387 |    "metadata": {},
388 |    "source": [
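389 |     "### 3.3 Using C++ Unordered Set\n",
390 |     "It can replace Python's set.  \n",
391 |     "1. Initialization - initialize from a Python iterable; the element type of the variable must be declared\n",
392 |     "2. Iteration - increment the iterator inside a while loop\n",
393 |     "3. Access - use deref (the '*' operator in C++) to dereference the iterator\n",
394 |     "4. Search - use unordered_set.count, which returns 1 or 0\n",
395 |     "5. Insertion - use unordered_set.insert; if the element already exists, it is not inserted again\n",
396 |     "6. Intersection, union, difference - as far as I know, these operations have to be implemented by the developer for unordered_set, which is less convenient than Python's set.\n",
397 |     " \n",
398 |     "Finally, we compare performance by implementing set intersection in both Python and C++; the C++ version is about 20 times **slower**. For details, see https://stackoverflow.com/questions/54763112/how-to-improve-stdset-intersection-performance-in-c  \n",
399 |     "If we only count the elements that the two sets have in common, the C++ version is about 6 times faster than Python. This suggests that lookups in a C++ unordered set are fast, but building one is slow."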
400 |    ]
401 |   },
402 |   {
403 |    "cell_type": "code",
404 |    "execution_count": 11,
405 |    "metadata": {},
406 |    "outputs": [
407 |     {
408 |      "name": "stdout",
409 |      "output_type": "stream",
410 |      "text": [
411 |       "Iterating...\n",
412 |       "\tValue is 0\n",
413 |       "\tValue is 1\n",
414 |       "\tValue is 2\n",
415 |       "\tValue is 3\n",
416 |       "\tValue is 4\n",
417 |       "\n",
418 |       "Searching...\n",
419 |       "\tElement -2 does not exist!\n",
420 |       "\n",
421 |       "Inserting elements...\n",
422 |       "\tMyset is:  {0, 1, 2, 3, 4, -1}\n"
423 |      ]
424 |     }
425 |    ],
426 |    "source": [
427 |     "%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++\n",
428 |     "from cython.operator cimport dereference as deref, preincrement as inc\n",
429 |     "from libcpp.unordered_set cimport unordered_set\n",
430 |     "# Initialize from a Python object\n",
431 |     "cdef unordered_set[int] myset = {i for i in range(5)}\n",
432 |     "# Iterate\n",
433 |     "cdef:\n",
434 |     "    unordered_set[int].iterator it = myset.begin()\n",
435 |     "    unordered_set[int].iterator end = myset.end()\n",
436 |     "print(\"Iterating...\")\n",
437 |     "while it != end:\n",
438 |     "    # Access\n",
439 |     "    print(\"\\tValue is %d\" % deref(it))\n",
440 |     "    inc(it)\n",
441 |     "print()\n",
442 |     "\n",
443 |     "# Search\n",
444 |     "print(\"Searching...\")\n",
445 |     "if myset.count(-2):\n",
446 |     "    print(\"\\tElement -2 exists!\")\n",
447 |     "else:\n",
448 |     "    print(\"\\tElement -2 does not exist!\")\n",
449 |     "\n",
450 |     "print()\n",
451 |     "\n",
452 |     "# Insert\n",
453 |     "print(\"Inserting elements...\")\n",
454 |     "myset.insert(0)\n",
455 |     "myset.insert(-1)\n",
456 |     "\n",
457 |     "print(\"\\tMyset is: \", myset)"
458 |    ]
459 |   },
460 |   {
461 |    "cell_type": "code",
462 |    "execution_count": 12,
463 |    "metadata": {},
464 |    "outputs": [
465 |     {
466 |      "name": "stdout",
467 |      "output_type": "stream",
468 |      "text": [
469 |       "Python implementation, result is {50, 51, 52, 53, 54, 55, 56, 57, 58, 59}!\n"
470 |      ]
471 |     }
472 |    ],
473 |    "source": [
474 |     "myset1 = {x for x in range(100)}\n",
475 |     "myset2 = {x for x in range(50, 60)}\n",
476 |     "\n",
477 |     "def intersection_py():\n",
478 |     "    return myset1 & myset2\n",
479 |     "\n",
480 |     "print(\"Python implementation, result is %s!\" % intersection_py())"
481 |    ]
482 |   },
483 |   {
484 |    "cell_type": "code",
485 |    "execution_count": 13,
486 |    "metadata": {},
487 |    "outputs": [
488 |     {
489 |      "name": "stdout",
490 |      "output_type": "stream",
491 |      "text": [
492 |       "Cython (C++) implementation, result is {50, 51, 52, 53, 54, 55, 56, 57, 58, 59}!\n"
493 |      ]
494 |     }
495 |    ],
496 |    "source": [
497 |     "%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++\n",
498 |     "from cython.operator cimport dereference as deref, preincrement as inc\n",
499 |     "from libcpp.unordered_set cimport unordered_set\n",
500 |     "\n",
501 |     "cdef:\n",
502 |     "    unordered_set[int] myset1 = {x for x in range(100)}\n",
503 |     "    unordered_set[int] myset2 = {x for x in range(50, 60)}\n",
504 |     "\n",
505 |     "cdef unordered_set[int] _intersection_cpp():\n",
506 |     "    cdef:\n",
507 |     "        unordered_set[int].iterator it = myset1.begin()\n",
508 |     "        unordered_set[int] ret\n",
509 |     "    while it != myset1.end():\n",
510 |     "        if myset2.count(deref(it)):\n",
511 |     "            ret.insert(deref(it))\n",
512 |     "        inc(it)\n",
513 |     "    return ret\n",
514 |     "\n",
515 |     "def intersection_cpp():\n",
516 |     "    return _intersection_cpp()\n",
517 |     "\n",
518 |     "print(\"Cython (C++) implementation, result is %s!\" % intersection_cpp())"
519 |    ]
520 |   },
521 |   {
522 |    "cell_type": "code",
523 |    "execution_count": 14,
524 |    "metadata": {},
525 |    "outputs": [
526 |     {
527 |      "name": "stdout",
528 |      "output_type": "stream",
529 |      "text": [
530 |       "Comparing the performance of the Python and C++ versions...\n",
531 |       "274 ns ± 13.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)\n",
532 |       "5.28 µs ± 220 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n"
533 |      ]
534 |     }
535 |    ],
536 |    "source": [
537 |     "print(\"Comparing the performance of the Python and C++ versions...\")\n",
538 |     "%timeit intersection_py()\n",
539 |     "%timeit intersection_cpp()"
540 |    ]
541 |   },
542 |   {
543 |    "cell_type": "code",
544 |    "execution_count": 15,
545 |    "metadata": {},
546 |    "outputs": [
547 |     {
548 |      "name": "stdout",
549 |      "output_type": "stream",
550 |      "text": [
551 |       "Python implementation, result is 10!\n"
552 |      ]
553 |     }
554 |    ],
555 |    "source": [
556 |     "myset1 = {x for x in range(100)}\n",
557 |     "myset2 = {x for x in range(50, 60)}\n",
558 |     "\n",
559 |     "def count_common_py():\n",
560 |     "    return len(myset1 & myset2)\n",
561 |     "\n",
562 |     "print(\"Python implementation, result is %s!\" % count_common_py())"
563 |    ]
564 |   },
565 |   {
566 |    "cell_type": "code",
567 |    "execution_count": 16,
568 |    "metadata": {},
569 |    "outputs": [
570 |     {
571 |      "name": "stdout",
572 |      "output_type": "stream",
573 |      "text": [
574 |       "Cython (C++) implementation, result is 10!\n"
575 |      ]
576 |     }
577 |    ],
578 |    "source": [
579 |     "%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++\n",
580 |     "from cython.operator cimport dereference as deref, preincrement as inc\n",
581 |     "from libcpp.unordered_set cimport unordered_set\n",
582 |     "\n",
583 |     "cdef:\n",
584 |     "    unordered_set[int] myset2 = {x for x in range(100)}\n",
585 |     "    unordered_set[int] myset1 = {x for x in range(50, 60)}\n",
586 |     "\n",
587 |     "cdef int _count_common_cpp():\n",
588 |     "    if myset1.size() > myset2.size():\n",
589 |     "        myset1.swap(myset2)\n",
590 |     "    cdef:\n",
591 |     "        unordered_set[int].iterator it = myset1.begin()\n",
592 |     "        int ret = 0\n",
593 |     "    while it != myset1.end():\n",
594 |     "        if myset2.count(deref(it)):\n",
595 |     "            ret += 1\n",
596 |     "        inc(it)\n",
597 |     "    return ret\n",
598 |     "\n",
599 |     "def count_common_cpp():\n",
600 |     "    return _count_common_cpp()\n",
601 |     "\n",
602 |     "print(\"Cython (C++) implementation, result is %s!\" % count_common_cpp())"
603 |    ]
604 |   },
605 |   {
606 |    "cell_type": "code",
607 |    "execution_count": 17,
608 |    "metadata": {},
609 |    "outputs": [
610 |     {
611 |      "name": "stdout",
612 |      "output_type": "stream",
613 |      "text": [
614 |       "Comparing the performance of the Python and C++ versions...\n",
615 |       "295 ns ± 5.91 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)\n",
616 |       "46.1 ns ± 0.785 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)\n"
617 |      ]
618 |     }
619 |    ],
620 |    "source": [
621 |     "print(\"Comparing the performance of the Python and C++ versions...\")\n",
622 |     "%timeit count_common_py()\n",
623 |     "%timeit count_common_cpp()"
624 |    ]
625 |   },
626 |   {
627 |    "cell_type": "markdown",
628 |    "metadata": {},
629 |    "source": [
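630 |     "## 4. Pass by Value and Pass by Reference\n",
631 |     "Python always passes function arguments as references to objects. A mutable container (such as a list or set) can therefore be modified in place by the function, while immutable objects (such as int or float) cannot, so they behave as if passed by value. If you do not want a function to modify a container, make a copy of it with deepcopy first.  \n",
632 |     "In C++, however, arguments are passed by value by default, and passing by reference has to be declared explicitly.\n",
633 |     "Taking a vector of int as an example, v1 is not modified by pass_value but is modified by pass_reference.\n",
634 |     "- Pass by value uses ``vector[int]``: pass_value only receives a copy of v1, so the function cannot modify v1\n",
635 |     "- Pass by reference uses ``vector[int]&``: pass_reference receives a reference to v1, so the function can modify v1.  \n",
636 |     "\n",
637 |     "The two code blocks below show how Python and C++ differ."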
638 |    ]
639 |   },
640 |   {
641 |    "cell_type": "code",
642 |    "execution_count": 18,
643 |    "metadata": {},
644 |    "outputs": [
645 |     {
646 |      "name": "stdout",
647 |      "output_type": "stream",
648 |      "text": [
649 |       "The initial value of v1 is [0, 0, 0]\n",
650 |       "After calling pass_value, v1 is [0, 0, 0]\n",
651 |       "After calling pass_reference, v1 is [-1, 0, 0]\n"
652 |      ]
653 |     }
654 |    ],
655 |    "source": [
656 |     "from copy import deepcopy\n",
657 |     "\n",
658 |     "def pass_value(v):\n",
659 |     "    v = deepcopy(v)\n",
660 |     "    v[0] = -1\n",
661 |     "\n",
662 |     "def pass_reference(v):\n",
663 |     "    v[0] = -1\n",
664 |     "\n",
665 |     "v1 = [0, 0, 0]\n",
666 |     "print(\"The initial value of v1 is %s\" % v1)\n",
667 |     "pass_value(v1)\n",
668 |     "print(\"After calling pass_value, v1 is %s\" % v1)\n",
669 |     "pass_reference(v1)\n",
670 |     "print(\"After calling pass_reference, v1 is %s\" % v1)"
671 |    ]
672 |   },
673 |   {
674 |    "cell_type": "code",
675 |    "execution_count": 19,
676 |    "metadata": {},
677 |    "outputs": [
678 |     {
679 |      "name": "stdout",
680 |      "output_type": "stream",
681 |      "text": [
682 |       "The initial value of v1 is [0, 0, 0]\n",
683 |       "After calling pass_value, v1 is [0, 0, 0]\n",
684 |       "After calling pass_reference, v1 is [-1, 0, 0]\n"
685 |      ]
686 |     }
687 |    ],
688 |    "source": [
689 |     "%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++\n",
690 |     "\n",
691 |     "from libcpp.vector cimport vector\n",
692 |     "\n",
693 |     "cdef void pass_value(vector[int] v):\n",
694 |     "    v[0] = -1\n",
695 |     "\n",
696 |     "cdef void pass_reference(vector[int]& v):\n",
697 |     "    v[0] = -1\n",
698 |     "\n",
699 |     "cdef vector[int] v1 = [0, 0, 0]\n",
700 |     "print(\"The initial value of v1 is %s\" % v1)\n",
701 |     "pass_value(v1)\n",
702 |     "print(\"After calling pass_value, v1 is %s\" % v1)\n",
703 |     "pass_reference(v1)\n",
704 |     "print(\"After calling pass_reference, v1 is %s\" % v1)"
705 |    ]
706 |   },
707 |   {
708 |    "cell_type": "markdown",
709 |    "metadata": {},
710 |    "source": [
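711 |     "## 5. Numeric Ranges\n",
712 |     "Python has a single integer type, int, whose range is effectively unlimited as long as memory allows, so Python users rarely worry about numeric overflow. C++ requires more care. The typical ranges of the C++ numeric types are listed below:  \n",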
713 |     "\n",
714 |     "\n",
715 |     "|Type\t|Typical Bit Width\t|Typical Range|\n",
716 |     "| ------ | ------ | ------ |\n",
717 |     "|char\t|1byte\t|-128 to 127 or 0 to 255|\n",
718 |     "|unsigned char\t|1byte\t|0 to 255|\n",
719 |     "|signed char\t|1byte\t|-128 to 127|\n",
720 |     "|int\t|4bytes\t|-2147483648 to 2147483647|\n",
721 |     "|unsigned int\t|4bytes\t|0 to 4294967295|\n",
722 |     "|signed int\t|4bytes\t|-2147483648 to 2147483647|\n",
723 |     "|short int\t|2bytes\t|-32768 to 32767|\n",
724 |     "|unsigned short int\t|2bytes\t|0 to 65,535|\n",
725 |     "|signed short int\t|2bytes\t|-32768 to 32767|\n",
726 |     "|long int\t|4bytes\t|-2,147,483,648 to 2,147,483,647|\n",
727 |     "|signed long int\t|4bytes\t|same as long int|\n",
728 |     "|unsigned long int\t|4bytes\t|0 to 4,294,967,295|\n",
729 |     "|long long int\t|8bytes\t|-(2^63) to (2^63)-1|\n",
730 |     "|unsigned long long int\t|8bytes\t|0 to 18,446,744,073,709,551,615|\n",
731 |     "|float\t|4bytes\t||\n",
732 |     "|double\t|8bytes\t||\n",
733 |     "|long double\t|12bytes||\t\n",
734 |     "|wchar_t\t|2 or 4 bytes\t|1 wide character|\n",
735 |     "\n",
736 |     "\n",
737 |     "For example, the functions below produce wrong results."
738 |    ]
739 |   },
740 |   {
741 |    "cell_type": "code",
742 |    "execution_count": 20,
743 |    "metadata": {},
744 |    "outputs": [],
745 |    "source": [
746 |     "%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++\n",
747 |     "def sum_py(num1, num2):\n",
748 |     "    print(\"The result by python is:\", num1 + num2)\n",
749 |     "\n",
750 |     "cdef int _sum_cpp(int num1, int num2):  # int -> long int\n",
751 |     "    return num1 + num2\n",
752 |     "\n",
753 |     "def sum_cpp(num1, num2):\n",
754 |     "    print(\"The result by cpp is:\", _sum_cpp(num1, num2))"
755 |    ]
756 |   },
757 |   {
758 |    "cell_type": "code",
759 |    "execution_count": 21,
760 |    "metadata": {},
761 |    "outputs": [
762 |     {
763 |      "name": "stdout",
764 |      "output_type": "stream",
765 |      "text": [
766 |       "The result by python is: 2147483648\n",
767 |       "The result by cpp is: -2147483648\n"
768 |      ]
769 |     }
770 |    ],
771 |    "source": [
772 |     "sum_py(2**31-1, 1)\n",
773 |     "sum_cpp(2**31-1, 1)"
774 |    ]
775 |   },
776 |   {
777 |    "cell_type": "code",
778 |    "execution_count": 22,
779 |    "metadata": {},
780 |    "outputs": [],
781 |    "source": [
782 |     "%%cython --cplus --compile-args=-stdlib=libc++ --link-args=-stdlib=libc++\n",
783 |     "from libcpp cimport bool\n",
784 |     "\n",
785 |     "def lt_py(num1, num2):\n",
786 |     "    print(\"The result by python is:\", num1 < num2)\n",
787 |     "\n",
788 |     "cdef bool _lt_cpp(float num1, float num2):  # float -> double\n",
789 |     "    return num1 < num2\n",
790 |     "\n",
791 |     "def lt_cpp(num1, num2):\n",
792 |     "    print(\"The result by cpp is:\", _lt_cpp(num1, num2))"
793 |    ]
794 |   },
795 |   {
796 |    "cell_type": "code",
797 |    "execution_count": 23,
798 |    "metadata": {},
799 |    "outputs": [
800 |     {
801 |      "name": "stdout",
802 |      "output_type": "stream",
803 |      "text": [
804 |       "The result by python is: True\n",
805 |       "The result by cpp is: False\n"
806 |      ]
807 |     }
808 |    ],
809 |    "source": [
810 |     "lt_py(1234567890.0, 1234567891.0)\n",
811 |     "lt_cpp(1234567890.0, 1234567891.0)"
812 |    ]
813 |   },
814 |   {
815 |    "cell_type": "code",
816 |    "execution_count": null,
817 |    "metadata": {},
818 |    "outputs": [],
819 |    "source": []
820 |   }
821 |  ],
822 |  "metadata": {
823 |   "kernelspec": {
824 |    "display_name": "Python 3",
825 |    "language": "python",
826 |    "name": "python3"
827 |   },
828 |   "language_info": {
829 |    "codemirror_mode": {
830 |     "name": "ipython",
831 |     "version": 3
832 |    },
833 |    "file_extension": ".py",
834 |    "mimetype": "text/x-python",
835 |    "name": "python",
836 |    "nbconvert_exporter": "python",
837 |    "pygments_lexer": "ipython3",
838 |    "version": "3.6.6"
839 |   }
840 |  },
841 |  "nbformat": 4,
842 |  "nbformat_minor": 2
843 | }
844 | 


--------------------------------------------------------------------------------
/Built-in method.ipynb:
--------------------------------------------------------------------------------
   1 | {
   2 |  "cells": [
   3 |   {
   4 |    "cell_type": "markdown",
   5 |    "metadata": {},
   6 |    "source": [
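   7 |     "# Optimizing Python Performance with Built-in Methods\n",
   8 |     "Author: tushushu  \n",
   9 |     "Project: https://github.com/tushushu/flying-python\n",
  10 |     "\n",
  11 |     "One possible reason a Python program runs too slowly is that it does not use built-in methods wherever it can. The 5 examples below show how built-in methods can improve the performance of a Python program."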
  12 |    ]
  13 |   },
  14 |   {
  15 |    "cell_type": "markdown",
  16 |    "metadata": {},
  17 |    "source": [
  18 |     "## 1. Sum of Squares of an Array\n",
  19 |     "Given a list, compute the sum of the squares of its numbers. The overall speedup is about 1.4x."
  20 |    ]
  21 |   },
  22 |   {
  23 |    "cell_type": "markdown",
  24 |    "metadata": {},
  25 |    "source": [
  26 |     "First, create a list of length 10000."
  27 |    ]
  28 |   },
  29 |   {
  30 |    "cell_type": "code",
  31 |    "execution_count": 1,
  32 |    "metadata": {},
  33 |    "outputs": [],
  34 |    "source": [
  35 |     "arr = list(range(10000))"
  36 |    ]
  37 |   },
  38 |   {
  39 |    "cell_type": "markdown",
  40 |    "metadata": {},
  41 |    "source": [
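  42 |     "### 1.1 The most conventional approach\n",
  43 |     "Iterate over the list with a while loop and accumulate the sum of squares. Average running time: 2.97 ms."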
  44 |    ]
  45 |   },
  46 |   {
  47 |    "cell_type": "code",
  48 |    "execution_count": 2,
  49 |    "metadata": {},
  50 |    "outputs": [],
  51 |    "source": [
  52 |     "def sum_sqr_0(arr):\n",
  53 |     "    res = 0\n",
  54 |     "    n = len(arr)\n",
  55 |     "    i = 0\n",
  56 |     "    while i < n:\n",
  57 |     "        res += arr[i] ** 2\n",
  58 |     "        i += 1\n",
  59 |     "    return res"
  60 |    ]
  61 |   },
  62 |   {
  63 |    "cell_type": "code",
  64 |    "execution_count": 3,
  65 |    "metadata": {},
  66 |    "outputs": [
  67 |     {
  68 |      "name": "stdout",
  69 |      "output_type": "stream",
  70 |      "text": [
  71 |       "2.97 ms ± 36.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
  72 |      ]
  73 |     }
  74 |    ],
  75 |    "source": [
  76 |     "%timeit sum_sqr_0(arr)"
  77 |    ]
  78 |   },
  79 |   {
  80 |    "cell_type": "markdown",
  81 |    "metadata": {},
  82 |    "source": [
  83 |     "### 1.2 Replace the while loop with for range\n",
  84 |     "This avoids the extra overhead of the type checks on i += 1. Average running time: 2.9 ms."
  85 |    ]
  86 |   },
  87 |   {
  88 |    "cell_type": "code",
  89 |    "execution_count": 4,
  90 |    "metadata": {},
  91 |    "outputs": [],
  92 |    "source": [
  93 |     "def sum_sqr_1(arr):\n",
  94 |     "    res = 0\n",
  95 |     "    for i in range(len(arr)):\n",
  96 |     "        res += arr[i] ** 2\n",
  97 |     "    return res"
  98 |    ]
  99 |   },
 100 |   {
 101 |    "cell_type": "code",
 102 |    "execution_count": 5,
 103 |    "metadata": {},
 104 |    "outputs": [
 105 |     {
 106 |      "name": "stdout",
 107 |      "output_type": "stream",
 108 |      "text": [
 109 |       "2.9 ms ± 137 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
 110 |      ]
 111 |     }
 112 |    ],
 113 |    "source": [
 114 |     "%timeit sum_sqr_1(arr)"
 115 |    ]
 116 |   },
 117 |   {
 118 |    "cell_type": "markdown",
 119 |    "metadata": {},
 120 |    "source": [
 121 |     "### 1.3 Replace for range with for x in arr\n",
 122 |     "This avoids the extra overhead of the type checks on arr[i]. Average running time: 2.59 ms."
 123 |    ]
 124 |   },
 125 |   {
 126 |    "cell_type": "code",
 127 |    "execution_count": 6,
 128 |    "metadata": {},
 129 |    "outputs": [],
 130 |    "source": [
 131 |     "def sum_sqr_2(arr):\n",
 132 |     "    res = 0\n",
 133 |     "    for x in arr:\n",
 134 |     "        res += x ** 2\n",
 135 |     "    return res"
 136 |    ]
 137 |   },
 138 |   {
 139 |    "cell_type": "code",
 140 |    "execution_count": 7,
 141 |    "metadata": {},
 142 |    "outputs": [
 143 |     {
 144 |      "name": "stdout",
 145 |      "output_type": "stream",
 146 |      "text": [
 147 |       "2.59 ms ± 89 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
 148 |      ]
 149 |     }
 150 |    ],
 151 |    "source": [
 152 |     "%timeit sum_sqr_2(arr)"
 153 |    ]
 154 |   },
 155 |   {
 156 |    "cell_type": "markdown",
 157 |    "metadata": {},
 158 |    "source": [
 159 |     "### 1.4 sum with map\n",
 160 |     "Average running time: 2.36 ms."
 161 |    ]
 162 |   },
 163 |   {
 164 |    "cell_type": "code",
 165 |    "execution_count": 8,
 166 |    "metadata": {},
 167 |    "outputs": [],
 168 |    "source": [
 169 |     "def sum_sqr_3(arr):\n",
 170 |     "    return sum(map(lambda x: x**2, arr))"
 171 |    ]
 172 |   },
 173 |   {
 174 |    "cell_type": "code",
 175 |    "execution_count": 9,
 176 |    "metadata": {},
 177 |    "outputs": [
 178 |     {
 179 |      "name": "stdout",
 180 |      "output_type": "stream",
 181 |      "text": [
 182 |       "2.36 ms ± 15.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
 183 |      ]
 184 |     }
 185 |    ],
 186 |    "source": [
 187 |     "%timeit sum_sqr_3(arr)"
 188 |    ]
 189 |   },
 190 |   {
 191 |    "cell_type": "markdown",
 192 |    "metadata": {},
 193 |    "source": [
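 194 |     "### 1.5 sum with a generator expression\n",
 195 |     "When a generator expression is the sole argument of a function call, its enclosing () can be omitted. Average running time: 2.35 ms."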
 196 |    ]
 197 |   },
 198 |   {
 199 |    "cell_type": "code",
 200 |    "execution_count": 10,
 201 |    "metadata": {},
 202 |    "outputs": [],
 203 |    "source": [
 204 |     "def sum_sqr_4(arr):\n",
 205 |     "    return sum(x ** 2 for x in arr)"
 206 |    ]
 207 |   },
 208 |   {
 209 |    "cell_type": "code",
 210 |    "execution_count": 11,
 211 |    "metadata": {},
 212 |    "outputs": [
 213 |     {
 214 |      "name": "stdout",
 215 |      "output_type": "stream",
 216 |      "text": [
 217 |       "2.35 ms ± 107 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
 218 |      ]
 219 |     }
 220 |    ],
 221 |    "source": [
 222 |     "%timeit sum_sqr_4(arr)"
 223 |    ]
 224 |   },
 225 |   {
 226 |    "cell_type": "markdown",
 227 |    "metadata": {},
 228 |    "source": [
 229 |     "### 1.6 sum with a list comprehension\n",
 230 |     "Average running time: 2.06 ms."
 231 |    ]
 232 |   },
 233 |   {
 234 |    "cell_type": "code",
 235 |    "execution_count": 12,
 236 |    "metadata": {},
 237 |    "outputs": [],
 238 |    "source": [
 239 |     "def sum_sqr_5(arr):\n",
 240 |     "    return sum([x ** 2 for x in arr])"
 241 |    ]
 242 |   },
 243 |   {
 244 |    "cell_type": "code",
 245 |    "execution_count": 13,
 246 |    "metadata": {},
 247 |    "outputs": [
 248 |     {
 249 |      "name": "stdout",
 250 |      "output_type": "stream",
 251 |      "text": [
 252 |       "2.06 ms ± 27.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
 253 |      ]
 254 |     }
 255 |    ],
 256 |    "source": [
 257 |     "%timeit sum_sqr_5(arr)"
 258 |    ]
 259 |   },
 260 |   {
 261 |    "cell_type": "markdown",
 262 |    "metadata": {},
 263 |    "source": [
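 264 |     "## 2. String Concatenation\n",
 265 |     "Given a list of strings, concatenate the first 3 characters of each string into a single string. The overall speedup is about 2.1x."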
 266 |    ]
 267 |   },
 268 |   {
 269 |    "cell_type": "markdown",
 270 |    "metadata": {},
 271 |    "source": [
 272 |     "First, create a list of 10000 strings with random lengths and contents."
 273 |    ]
 274 |   },
 275 |   {
 276 |    "cell_type": "code",
 277 |    "execution_count": 1,
 278 |    "metadata": {},
 279 |    "outputs": [],
 280 |    "source": [
 281 |     "from random import randint\n",
 282 |     "\n",
 283 |     "def random_letter():\n",
 284 |     "    return chr(ord('a') + randint(0, 25))\n",
 285 |     "\n",
 286 |     "def random_letters(n):\n",
 287 |     "    return \"\".join([random_letter() for _ in range(n)])\n",
 288 |     "\n",
 289 |     "strings = [random_letters(randint(1, 10)) for _ in range(10000)]"
 290 |    ]
 291 |   },
 292 |   {
 293 |    "cell_type": "markdown",
 294 |    "metadata": {},
 295 |    "source": [
 296 |     "### 2.1 The most conventional approach\n",
 297 |     "Iterate over the list with a while loop and concatenate the strings. Average running time: 1.86 ms."
 298 |    ]
 299 |   },
 300 |   {
 301 |    "cell_type": "code",
 302 |    "execution_count": 2,
 303 |    "metadata": {},
 304 |    "outputs": [],
 305 |    "source": [
 306 |     "def concat_strings_0(strings):\n",
 307 |     "    res = \"\"\n",
 308 |     "    n = len(strings)\n",
 309 |     "    i = 0\n",
 310 |     "    while i < n:\n",
 311 |     "        res += strings[i][:3]\n",
 312 |     "        i += 1\n",
 313 |     "    return res"
 314 |    ]
 315 |   },
 316 |   {
 317 |    "cell_type": "code",
 318 |    "execution_count": 3,
 319 |    "metadata": {},
 320 |    "outputs": [
 321 |     {
 322 |      "name": "stdout",
 323 |      "output_type": "stream",
 324 |      "text": [
 325 |       "1.86 ms ± 74.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
 326 |      ]
 327 |     }
 328 |    ],
 329 |    "source": [
 330 |     "%timeit concat_strings_0(strings)"
 331 |    ]
 332 |   },
 333 |   {
 334 |    "cell_type": "markdown",
 335 |    "metadata": {},
 336 |    "source": [
 337 |     "### 2.2 Replace the while loop with for range\n",
 338 |     "This avoids the extra overhead of the type checks on i += 1. Average running time: 1.55 ms."
 339 |    ]
 340 |   },
 341 |   {
 342 |    "cell_type": "code",
 343 |    "execution_count": 4,
 344 |    "metadata": {},
 345 |    "outputs": [],
 346 |    "source": [
 347 |     "def concat_strings_1(strings):\n",
 348 |     "    res = \"\"\n",
 349 |     "    for i in range(len(strings)):\n",
 350 |     "        res += strings[i][:3]\n",
 351 |     "    return res"
 352 |    ]
 353 |   },
 354 |   {
 355 |    "cell_type": "code",
 356 |    "execution_count": 5,
 357 |    "metadata": {},
 358 |    "outputs": [
 359 |     {
 360 |      "name": "stdout",
 361 |      "output_type": "stream",
 362 |      "text": [
 363 |       "1.55 ms ± 32.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
 364 |      ]
 365 |     }
 366 |    ],
 367 |    "source": [
 368 |     "%timeit concat_strings_1(strings)"
 369 |    ]
 370 |   },
 371 |   {
 372 |    "cell_type": "markdown",
 373 |    "metadata": {},
 374 |    "source": [
 375 |     "### 2.3 Replace for range with for x in strings\n",
 376 |     "This avoids the extra overhead of the type checks on strings[i]. Average running time: 1.32 ms."
 377 |    ]
 378 |   },
 379 |   {
 380 |    "cell_type": "code",
 381 |    "execution_count": 6,
 382 |    "metadata": {},
 383 |    "outputs": [],
 384 |    "source": [
 385 |     "def concat_strings_2(strings):\n",
 386 |     "    res = \"\"\n",
 387 |     "    for x in strings:\n",
 388 |     "        res += x[:3]\n",
 389 |     "    return res"
 390 |    ]
 391 |   },
 392 |   {
 393 |    "cell_type": "code",
 394 |    "execution_count": 7,
 395 |    "metadata": {},
 396 |    "outputs": [
 397 |     {
 398 |      "name": "stdout",
 399 |      "output_type": "stream",
 400 |      "text": [
 401 |       "1.32 ms ± 19.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
 402 |      ]
 403 |     }
 404 |    ],
 405 |    "source": [
 406 |     "%timeit concat_strings_2(strings)"
 407 |    ]
 408 |   },
 409 |   {
 410 |    "cell_type": "markdown",
 411 |    "metadata": {},
 412 |    "source": [
 413 |     "### 2.4 .join with a generator expression\n",
 414 |     "Average running time: 1.06 ms."
 415 |    ]
 416 |   },
 417 |   {
 418 |    "cell_type": "code",
 419 |    "execution_count": 8,
 420 |    "metadata": {},
 421 |    "outputs": [],
 422 |    "source": [
 423 |     "def concat_strings_3(strings):\n",
 424 |     "    return \"\".join(x[:3] for x in strings)"
 425 |    ]
 426 |   },
 427 |   {
 428 |    "cell_type": "code",
 429 |    "execution_count": 9,
 430 |    "metadata": {},
 431 |    "outputs": [
 432 |     {
 433 |      "name": "stdout",
 434 |      "output_type": "stream",
 435 |      "text": [
 436 |       "1.06 ms ± 15.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
 437 |      ]
 438 |     }
 439 |    ],
 440 |    "source": [
 441 |     "%timeit concat_strings_3(strings)"
 442 |    ]
 443 |   },
 444 |   {
 445 |    "cell_type": "markdown",
 446 |    "metadata": {},
 447 |    "source": [
 448 |     "### 2.5 .join with a list comprehension\n",
 449 |     "Average running time: 0.858 ms."
 450 |    ]
 451 |   },
 452 |   {
 453 |    "cell_type": "code",
 454 |    "execution_count": 10,
 455 |    "metadata": {},
 456 |    "outputs": [],
 457 |    "source": [
 458 |     "def concat_strings_4(strings):\n",
 459 |     "    return \"\".join([x[:3] for x in strings])"
 460 |    ]
 461 |   },
 462 |   {
 463 |    "cell_type": "code",
 464 |    "execution_count": 11,
 465 |    "metadata": {},
 466 |    "outputs": [
 467 |     {
 468 |      "name": "stdout",
 469 |      "output_type": "stream",
 470 |      "text": [
 471 |       "858 µs ± 14.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
 472 |      ]
 473 |     }
 474 |    ],
 475 |    "source": [
 476 |     "%timeit concat_strings_4(strings)"
 477 |    ]
 478 |   },
 479 |   {
 480 |    "cell_type": "markdown",
 481 |    "metadata": {},
 482 |    "source": [
 483 |     "## 3. Filtering Odd Numbers"
 484 |    ]
 485 |   },
 486 |   {
 487 |    "cell_type": "markdown",
 488 |    "metadata": {},
 489 |    "source": [
 490 |     "Given a list, select all the odd numbers in it. The overall speedup is about 3.6x."
 491 |    ]
 492 |   },
 493 |   {
 494 |    "cell_type": "markdown",
 495 |    "metadata": {},
 496 |    "source": [
 497 |     "First, create a list of length 10000."
 498 |    ]
 499 |   },
 500 |   {
 501 |    "cell_type": "code",
 502 |    "execution_count": 26,
 503 |    "metadata": {},
 504 |    "outputs": [],
 505 |    "source": [
 506 |     "arr = list(range(10000))"
 507 |    ]
 508 |   },
 509 |   {
 510 |    "cell_type": "markdown",
 511 |    "metadata": {},
 512 |    "source": [
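 513 |     "### 3.1 The most conventional approach\n",
 514 |     "Create an empty list res, iterate over the list with a while loop, and append each odd number to res. Average running time: 1.03 ms."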
 515 |    ]
 516 |   },
 517 |   {
 518 |    "cell_type": "code",
 519 |    "execution_count": 27,
 520 |    "metadata": {},
 521 |    "outputs": [],
 522 |    "source": [
 523 |     "def filter_odd_0(arr):\n",
 524 |     "    res = []\n",
 525 |     "    i = 0\n",
 526 |     "    n = len(arr)\n",
 527 |     "    while i < n:\n",
 528 |     "        if arr[i] % 2:\n",
 529 |     "            res.append(arr[i])\n",
 530 |     "        i += 1\n",
 531 |     "    return res"
 532 |    ]
 533 |   },
 534 |   {
 535 |    "cell_type": "code",
 536 |    "execution_count": 28,
 537 |    "metadata": {},
 538 |    "outputs": [
 539 |     {
 540 |      "name": "stdout",
 541 |      "output_type": "stream",
 542 |      "text": [
 543 |       "1.03 ms ± 34.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
 544 |      ]
 545 |     }
 546 |    ],
 547 |    "source": [
 548 |     "%timeit filter_odd_0(arr)"
 549 |    ]
 550 |   },
 551 |   {
 552 |    "cell_type": "markdown",
 553 |    "metadata": {},
 554 |    "source": [
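 555 |     "### 3.2 Replace the while loop with for range\n",
 556 |     "This avoids the per-iteration overhead of executing i += 1, including its dynamic type checks, in Python code. Average running time: 0.965 ms."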
 557 |    ]
 558 |   },
 559 |   {
 560 |    "cell_type": "code",
 561 |    "execution_count": 29,
 562 |    "metadata": {},
 563 |    "outputs": [],
 564 |    "source": [
 565 |     "def filter_odd_1(arr):\n",
 566 |     "    res = []\n",
 567 |     "    for i in range(len(arr)):\n",
 568 |     "        if arr[i] % 2:\n",
 569 |     "            res.append(arr[i])\n",
 571 |     "    return res"
 572 |    ]
 573 |   },
 574 |   {
 575 |    "cell_type": "code",
 576 |    "execution_count": 30,
 577 |    "metadata": {},
 578 |    "outputs": [
 579 |     {
 580 |      "name": "stdout",
 581 |      "output_type": "stream",
 582 |      "text": [
 583 |       "965 µs ± 4.02 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
 584 |      ]
 585 |     }
 586 |    ],
 587 |    "source": [
 588 |     "%timeit filter_odd_1(arr)"
 589 |    ]
 590 |   },
 591 |   {
 592 |    "cell_type": "markdown",
 593 |    "metadata": {},
 594 |    "source": [
 595 |     "### 3.3 Replace for range with for x in arr\n",
 596 |     "This avoids the overhead of indexing with arr[i], including its type checks. Average running time: 0.430 ms."
 597 |    ]
 598 |   },
 599 |   {
 600 |    "cell_type": "code",
 601 |    "execution_count": 31,
 602 |    "metadata": {},
 603 |    "outputs": [],
 604 |    "source": [
 605 |     "def filter_odd_2(arr):\n",
 606 |     "    res = []\n",
 607 |     "    for x in arr:\n",
 608 |     "        if x % 2:\n",
 609 |     "            res.append(x)\n",
 610 |     "    return res"
 611 |    ]
 612 |   },
 613 |   {
 614 |    "cell_type": "code",
 615 |    "execution_count": 32,
 616 |    "metadata": {},
 617 |    "outputs": [
 618 |     {
 619 |      "name": "stdout",
 620 |      "output_type": "stream",
 621 |      "text": [
 622 |       "430 µs ± 9.25 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
 623 |      ]
 624 |     }
 625 |    ],
 626 |    "source": [
 627 |     "%timeit filter_odd_2(arr)"
 628 |    ]
 629 |   },
 630 |   {
 631 |    "cell_type": "markdown",
 632 |    "metadata": {},
 633 |    "source": [
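 634 |     "### 3.4 Wrap filter in list\n",
 635 |     "Average running time: 0.763 ms. Note that filter with a lambda is slow here, because it calls a Python function for every element; in Python 3.6 it offers little benefit for this task."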
 636 |    ]
 637 |   },
 638 |   {
 639 |    "cell_type": "code",
 640 |    "execution_count": 33,
 641 |    "metadata": {},
 642 |    "outputs": [],
 643 |    "source": [
 644 |     "def filter_odd_3(arr):\n",
 645 |     "    return list(filter(lambda x: x % 2, arr))"
 646 |    ]
 647 |   },
 648 |   {
 649 |    "cell_type": "code",
 650 |    "execution_count": 34,
 651 |    "metadata": {},
 652 |    "outputs": [
 653 |     {
 654 |      "name": "stdout",
 655 |      "output_type": "stream",
 656 |      "text": [
 657 |       "763 µs ± 15.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
 658 |      ]
 659 |     }
 660 |    ],
 661 |    "source": [
 662 |     "%timeit filter_odd_3(arr)"
 663 |    ]
 664 |   },
 665 |   {
 666 |    "cell_type": "markdown",
 667 |    "metadata": {},
 668 |    "source": [
 669 |     "### 3.5 Wrap a generator expression in list\n",
 670 |     "Average running time: 0.398 ms."
 671 |    ]
 672 |   },
 673 |   {
 674 |    "cell_type": "code",
 675 |    "execution_count": 35,
 676 |    "metadata": {},
 677 |    "outputs": [],
 678 |    "source": [
 679 |     "def filter_odd_4(arr):\n",
 680 |     "    return list((x for x in arr if x % 2))"
 681 |    ]
 682 |   },
 683 |   {
 684 |    "cell_type": "code",
 685 |    "execution_count": 36,
 686 |    "metadata": {},
 687 |    "outputs": [
 688 |     {
 689 |      "name": "stdout",
 690 |      "output_type": "stream",
 691 |      "text": [
 692 |       "398 µs ± 16.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
 693 |      ]
 694 |     }
 695 |    ],
 696 |    "source": [
 697 |     "%timeit filter_odd_4(arr)"
 698 |    ]
 699 |   },
 700 |   {
 701 |    "cell_type": "markdown",
 702 |    "metadata": {},
 703 |    "source": [
 704 |     "### 3.6 List comprehension with a condition\n",
 705 |     "Average running time: 0.290 ms."
 706 |    ]
 707 |   },
 708 |   {
 709 |    "cell_type": "code",
 710 |    "execution_count": 37,
 711 |    "metadata": {},
 712 |    "outputs": [],
 713 |    "source": [
 714 |     "def filter_odd_5(arr):\n",
 715 |     "    return [x for x in arr if x % 2]"
 716 |    ]
 717 |   },
 718 |   {
 719 |    "cell_type": "code",
 720 |    "execution_count": 38,
 721 |    "metadata": {},
 722 |    "outputs": [
 723 |     {
 724 |      "name": "stdout",
 725 |      "output_type": "stream",
 726 |      "text": [
 727 |       "290 µs ± 5.54 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
 728 |      ]
 729 |     }
 730 |    ],
 731 |    "source": [
 732 |     "%timeit filter_odd_5(arr)"
 733 |    ]
 734 |   },
 735 |   {
 736 |    "cell_type": "markdown",
 737 |    "metadata": {},
 738 |    "source": [
 739 |     "## 4. Adding two arrays"
 740 |    ]
 741 |   },
 742 |   {
 743 |    "cell_type": "markdown",
 744 |    "metadata": {},
 745 |    "source": [
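 746 |     "Given two lists of the same length, compute the sum of the numbers at each position and return a list of the same length as the inputs. The fastest version is about 2.7 times faster than the baseline."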
 747 |    ]
 748 |   },
 749 |   {
 750 |    "cell_type": "markdown",
 751 |    "metadata": {},
 752 |    "source": [
 753 |     "First, generate two lists of length 10000."
 754 |    ]
 755 |   },
 756 |   {
 757 |    "cell_type": "code",
 758 |    "execution_count": 40,
 759 |    "metadata": {},
 760 |    "outputs": [],
 761 |    "source": [
 762 |     "arr1 = list(range(10000))\n",
 763 |     "arr2 = list(range(10000))"
 764 |    ]
 765 |   },
 766 |   {
 767 |    "cell_type": "markdown",
 768 |    "metadata": {},
 769 |    "source": [
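 770 |     "### 4.1 The most conventional approach\n",
 771 |     "Create an empty list res, traverse the lists with a while loop, and append the sum of each pair of corresponding elements to res. Average running time: 1.23 ms."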
 772 |    ]
 773 |   },
 774 |   {
 775 |    "cell_type": "code",
 776 |    "execution_count": 41,
 777 |    "metadata": {},
 778 |    "outputs": [],
 779 |    "source": [
 780 |     "def arr_sum_0(arr1, arr2):\n",
 781 |     "    i = 0\n",
 782 |     "    n = len(arr1)\n",
 783 |     "    res = []\n",
 784 |     "    while i < n:\n",
 785 |     "        res.append(arr1[i] + arr2[i])\n",
 786 |     "        i += 1\n",
 787 |     "    return res"
 788 |    ]
 789 |   },
 790 |   {
 791 |    "cell_type": "code",
 792 |    "execution_count": 42,
 793 |    "metadata": {},
 794 |    "outputs": [
 795 |     {
 796 |      "name": "stdout",
 797 |      "output_type": "stream",
 798 |      "text": [
 799 |       "1.23 ms ± 3.77 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
 800 |      ]
 801 |     }
 802 |    ],
 803 |    "source": [
 804 |     "%timeit arr_sum_0(arr1, arr2)"
 805 |    ]
 806 |   },
 807 |   {
 808 |    "cell_type": "markdown",
 809 |    "metadata": {},
 810 |    "source": [
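 811 |     "### 4.2 Replace the while loop with for range\n",
 812 |     "This avoids the per-iteration overhead of executing i += 1, including its dynamic type checks, in Python code. Average running time: 0.997 ms."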
 813 |    ]
 814 |   },
 815 |   {
 816 |    "cell_type": "code",
 817 |    "execution_count": 43,
 818 |    "metadata": {},
 819 |    "outputs": [],
 820 |    "source": [
 821 |     "def arr_sum_1(arr1, arr2):\n",
 822 |     "    res = []\n",
 823 |     "    for i in range(len(arr1)):\n",
 824 |     "        res.append(arr1[i] + arr2[i])\n",
 825 |     "    return res"
 826 |    ]
 827 |   },
 828 |   {
 829 |    "cell_type": "code",
 830 |    "execution_count": 44,
 831 |    "metadata": {},
 832 |    "outputs": [
 833 |     {
 834 |      "name": "stdout",
 835 |      "output_type": "stream",
 836 |      "text": [
 837 |       "997 µs ± 7.42 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
 838 |      ]
 839 |     }
 840 |    ],
 841 |    "source": [
 842 |     "%timeit arr_sum_1(arr1, arr2)"
 843 |    ]
 844 |   },
 845 |   {
 846 |    "cell_type": "markdown",
 847 |    "metadata": {},
 848 |    "source": [
 849 |     "### 4.3 Replace for range with for i, x in enumerate\n",
 850 |     "This partially avoids the overhead of indexing with arr[i], including its type checks. Average running time: 0.799 ms."
 851 |    ]
 852 |   },
 853 |   {
 854 |    "cell_type": "code",
 855 |    "execution_count": 45,
 856 |    "metadata": {},
 857 |    "outputs": [],
 858 |    "source": [
 859 |     "def arr_sum_2(arr1, arr2):\n",
 860 |     "    res = arr1.copy()\n",
 861 |     "    for i, x in enumerate(arr2):\n",
 862 |     "        res[i] += x\n",
 863 |     "    return res"
 864 |    ]
 865 |   },
 866 |   {
 867 |    "cell_type": "code",
 868 |    "execution_count": 46,
 869 |    "metadata": {},
 870 |    "outputs": [
 871 |     {
 872 |      "name": "stdout",
 873 |      "output_type": "stream",
 874 |      "text": [
 875 |       "799 µs ± 16.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
 876 |      ]
 877 |     }
 878 |    ],
 879 |    "source": [
 880 |     "%timeit arr_sum_2(arr1, arr2)"
 881 |    ]
 882 |   },
 883 |   {
 884 |    "cell_type": "markdown",
 885 |    "metadata": {},
 886 |    "source": [
 887 |     "### 4.4 Replace for range with for x, y in zip\n",
 888 |     "This avoids the overhead of indexing with arr[i], including its type checks. Average running time: 0.769 ms."
 889 |    ]
 890 |   },
 891 |   {
 892 |    "cell_type": "code",
 893 |    "execution_count": 47,
 894 |    "metadata": {},
 895 |    "outputs": [],
 896 |    "source": [
 897 |     "def arr_sum_3(arr1, arr2):\n",
 898 |     "    res = []\n",
 899 |     "    for x, y in zip(arr1, arr2):\n",
 900 |     "        res.append(x + y)\n",
 901 |     "    return res"
 902 |    ]
 903 |   },
 904 |   {
 905 |    "cell_type": "code",
 906 |    "execution_count": 48,
 907 |    "metadata": {},
 908 |    "outputs": [
 909 |     {
 910 |      "name": "stdout",
 911 |      "output_type": "stream",
 912 |      "text": [
 913 |       "769 µs ± 12.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
 914 |      ]
 915 |     }
 916 |    ],
 917 |    "source": [
 918 |     "%timeit arr_sum_3(arr1, arr2)"
 919 |    ]
 920 |   },
 921 |   {
 922 |    "cell_type": "markdown",
 923 |    "metadata": {},
 924 |    "source": [
 925 |     "### 4.5 List comprehension with zip\n",
 926 |     "Average running time: 0.462 ms."
 927 |    ]
 928 |   },
 929 |   {
 930 |    "cell_type": "code",
 931 |    "execution_count": 49,
 932 |    "metadata": {},
 933 |    "outputs": [],
 934 |    "source": [
 935 |     "def arr_sum_4(arr1, arr2):\n",
 936 |     "    return [x + y for x, y in zip(arr1, arr2)]"
 937 |    ]
 938 |   },
 939 |   {
 940 |    "cell_type": "code",
 941 |    "execution_count": 50,
 942 |    "metadata": {},
 943 |    "outputs": [
 944 |     {
 945 |      "name": "stdout",
 946 |      "output_type": "stream",
 947 |      "text": [
 948 |       "462 µs ± 3.43 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
 949 |      ]
 950 |     }
 951 |    ],
 952 |    "source": [
 953 |     "%timeit arr_sum_4(arr1, arr2)"
 954 |    ]
 955 |   },
 956 |   {
 957 |    "cell_type": "markdown",
 958 |    "metadata": {},
 959 |    "source": [
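 960 |     "## 5. Number of common elements in two lists\n",
 961 |     "Given two lists, count the elements they have in common. The elements within each list are unique. The fastest version is about 5000 times faster than the baseline."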
 962 |    ]
 963 |   },
 964 |   {
 965 |    "cell_type": "markdown",
 966 |    "metadata": {},
 967 |    "source": [
 968 |     "First, create two lists and shuffle their elements."
 969 |    ]
 970 |   },
 971 |   {
 972 |    "cell_type": "code",
 973 |    "execution_count": 51,
 974 |    "metadata": {},
 975 |    "outputs": [],
 976 |    "source": [
 977 |     "from random import shuffle\n",
 978 |     "arr1 = list(range(2000))\n",
 979 |     "shuffle(arr1)\n",
 980 |     "arr2 = list(range(1000, 3000))\n",
 981 |     "shuffle(arr2)"
 982 |    ]
 983 |   },
 984 |   {
 985 |    "cell_type": "markdown",
 986 |    "metadata": {},
 987 |    "source": [
 988 |     "### 5.1 The most conventional approach\n",
 989 |     "Use nested while loops to test whether arr1[i] equals arr2[j]. Average running time: 338 ms."
 990 |    ]
 991 |   },
 992 |   {
 993 |    "cell_type": "code",
 994 |    "execution_count": 52,
 995 |    "metadata": {},
 996 |    "outputs": [],
 997 |    "source": [
 998 |     "def n_common_0(arr1, arr2):\n",
 999 |     "    res = 0\n",
1000 |     "    i = 0\n",
1001 |     "    m = len(arr1)\n",
1002 |     "    n = len(arr2)\n",
1003 |     "    while i < m:\n",
1004 |     "        j = 0\n",
1005 |     "        while j < n:\n",
1006 |     "            if arr1[i] == arr2[j]:\n",
1007 |     "                res += 1\n",
1008 |     "            j += 1\n",
1009 |     "        i += 1\n",
1010 |     "    return res"
1011 |    ]
1012 |   },
1013 |   {
1014 |    "cell_type": "code",
1015 |    "execution_count": 53,
1016 |    "metadata": {},
1017 |    "outputs": [
1018 |     {
1019 |      "name": "stdout",
1020 |      "output_type": "stream",
1021 |      "text": [
1022 |       "338 ms ± 7.81 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
1023 |      ]
1024 |     }
1025 |    ],
1026 |    "source": [
1027 |     "%timeit n_common_0(arr1, arr2)"
1028 |    ]
1029 |   },
1030 |   {
1031 |    "cell_type": "markdown",
1032 |    "metadata": {},
1033 |    "source": [
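1034 |     "### 5.2 Replace the while loops with for range\n",
1035 |     "This avoids the per-iteration overhead of executing i += 1, including its dynamic type checks, in Python code. Average running time: 233 ms."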
1036 |    ]
1037 |   },
1038 |   {
1039 |    "cell_type": "code",
1040 |    "execution_count": 54,
1041 |    "metadata": {},
1042 |    "outputs": [],
1043 |    "source": [
1044 |     "def n_common_1(arr1, arr2):\n",
1045 |     "    res = 0\n",
1046 |     "    for i in range(len(arr1)):\n",
1047 |     "        for j in range(len(arr2)):\n",
1048 |     "            if arr1[i] == arr2[j]:\n",
1049 |     "                res += 1\n",
1050 |     "    return res"
1051 |    ]
1052 |   },
1053 |   {
1054 |    "cell_type": "code",
1055 |    "execution_count": 55,
1056 |    "metadata": {},
1057 |    "outputs": [
1058 |     {
1059 |      "name": "stdout",
1060 |      "output_type": "stream",
1061 |      "text": [
1062 |       "233 ms ± 10.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
1063 |      ]
1064 |     }
1065 |    ],
1066 |    "source": [
1067 |     "%timeit n_common_1(arr1, arr2)"
1068 |    ]
1069 |   },
1070 |   {
1071 |    "cell_type": "markdown",
1072 |    "metadata": {},
1073 |    "source": [
1074 |     "### 5.3 Replace for range with for x in arr\n",
1075 |     "This avoids the overhead of indexing with arr[i], including its type checks. Average running time: 84.8 ms."
1076 |    ]
1077 |   },
1078 |   {
1079 |    "cell_type": "code",
1080 |    "execution_count": 56,
1081 |    "metadata": {},
1082 |    "outputs": [],
1083 |    "source": [
1084 |     "def n_common_2(arr1, arr2):\n",
1085 |     "    res = 0\n",
1086 |     "    for x in arr1:\n",
1087 |     "        for y in arr2:\n",
1088 |     "            if x == y:\n",
1089 |     "                res += 1\n",
1090 |     "    return res"
1091 |    ]
1092 |   },
1093 |   {
1094 |    "cell_type": "code",
1095 |    "execution_count": 57,
1096 |    "metadata": {},
1097 |    "outputs": [
1098 |     {
1099 |      "name": "stdout",
1100 |      "output_type": "stream",
1101 |      "text": [
1102 |       "84.8 ms ± 1.38 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
1103 |      ]
1104 |     }
1105 |    ],
1106 |    "source": [
1107 |     "%timeit n_common_2(arr1, arr2)"
1108 |    ]
1109 |   },
1110 |   {
1111 |    "cell_type": "markdown",
1112 |    "metadata": {},
1113 |    "source": [
1114 |     "### 5.4 Replace the inner loop with if x in arr2\n",
1115 |     "Average running time: 24.9 ms."
1116 |    ]
1117 |   },
1118 |   {
1119 |    "cell_type": "code",
1120 |    "execution_count": 58,
1121 |    "metadata": {},
1122 |    "outputs": [],
1123 |    "source": [
1124 |     "def n_common_3(arr1, arr2):\n",
1125 |     "    res = 0\n",
1126 |     "    for x in arr1:\n",
1127 |     "        if x in arr2:\n",
1128 |     "            res += 1\n",
1129 |     "    return res"
1130 |    ]
1131 |   },
1132 |   {
1133 |    "cell_type": "code",
1134 |    "execution_count": 59,
1135 |    "metadata": {},
1136 |    "outputs": [
1137 |     {
1138 |      "name": "stdout",
1139 |      "output_type": "stream",
1140 |      "text": [
1141 |       "24.9 ms ± 1.39 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
1142 |      ]
1143 |     }
1144 |    ],
1145 |    "source": [
1146 |     "%timeit n_common_3(arr1, arr2)"
1147 |    ]
1148 |   },
1149 |   {
1150 |    "cell_type": "markdown",
1151 |    "metadata": {},
1152 |    "source": [
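1153 |     "### 5.5 Use a faster algorithm\n",
1154 |     "Sort both arrays with the .sort method, then traverse them in a single loop. This reduces the time complexity from O(n^2) to O(n log n). Average running time: 0.329 ms. Note that .sort works in place, so the input lists are modified and repeated %timeit runs operate on already sorted lists."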
1155 |    ]
1156 |   },
1157 |   {
1158 |    "cell_type": "code",
1159 |    "execution_count": 60,
1160 |    "metadata": {},
1161 |    "outputs": [],
1162 |    "source": [
1163 |     "def n_common_4(arr1, arr2):\n",
1164 |     "    arr1.sort()\n",
1165 |     "    arr2.sort()\n",
1166 |     "    res = i = j = 0\n",
1167 |     "    m, n = len(arr1), len(arr2)\n",
1168 |     "    while i < m and j < n:\n",
1169 |     "        if arr1[i] == arr2[j]:\n",
1170 |     "            res += 1\n",
1171 |     "            i += 1\n",
1172 |     "            j += 1\n",
1173 |     "        elif arr1[i] > arr2[j]:\n",
1174 |     "            j += 1\n",
1175 |     "        else:\n",
1176 |     "            i += 1\n",
1177 |     "    return res"
1178 |    ]
1179 |   },
1180 |   {
1181 |    "cell_type": "code",
1182 |    "execution_count": 61,
1183 |    "metadata": {},
1184 |    "outputs": [
1185 |     {
1186 |      "name": "stdout",
1187 |      "output_type": "stream",
1188 |      "text": [
1189 |       "329 µs ± 12.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
1190 |      ]
1191 |     }
1192 |    ],
1193 |    "source": [
1194 |     "%timeit n_common_4(arr1, arr2)"
1195 |    ]
1196 |   },
1197 |   {
1198 |    "cell_type": "markdown",
1199 |    "metadata": {},
1200 |    "source": [
1201 |     "### 5.6 Use a better data structure\n",
1202 |     "Convert the arrays to sets and take the size of their intersection. Average running time: 0.067 ms."
1203 |    ]
1204 |   },
1205 |   {
1206 |    "cell_type": "code",
1207 |    "execution_count": 62,
1208 |    "metadata": {},
1209 |    "outputs": [],
1210 |    "source": [
1211 |     "def n_common_5(arr1, arr2):\n",
1212 |     "    return len(set(arr1) & set(arr2))"
1213 |    ]
1214 |   },
1215 |   {
1216 |    "cell_type": "code",
1217 |    "execution_count": 63,
1218 |    "metadata": {},
1219 |    "outputs": [
1220 |     {
1221 |      "name": "stdout",
1222 |      "output_type": "stream",
1223 |      "text": [
1224 |       "67.2 µs ± 755 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)\n"
1225 |      ]
1226 |     }
1227 |    ],
1228 |    "source": [
1229 |     "%timeit n_common_5(arr1, arr2)"
1230 |    ]
1231 |   },
1232 |   {
1233 |    "cell_type": "code",
1234 |    "execution_count": null,
1235 |    "metadata": {},
1236 |    "outputs": [],
1237 |    "source": []
1238 |   }
1239 |  ],
1240 |  "metadata": {
1241 |   "kernelspec": {
1242 |    "display_name": "Python 3",
1243 |    "language": "python",
1244 |    "name": "python3"
1245 |   },
1246 |   "language_info": {
1247 |    "codemirror_mode": {
1248 |     "name": "ipython",
1249 |     "version": 3
1250 |    },
1251 |    "file_extension": ".py",
1252 |    "mimetype": "text/x-python",
1253 |    "name": "python",
1254 |    "nbconvert_exporter": "python",
1255 |    "pygments_lexer": "ipython3",
1256 |    "version": "3.6.6"
1257 |   }
1258 |  },
1259 |  "nbformat": 4,
1260 |  "nbformat_minor": 2
1261 | }
1262 | 


--------------------------------------------------------------------------------