├── .ipynb_checkpoints
│   ├── 01. Numpy和原生Python用于数组计算的性能对比-checkpoint.ipynb
│   ├── 02. Numpy的核心array对象以及创建array的方法-checkpoint.ipynb
│   ├── 03. Numpy对数组按索引查询-checkpoint.ipynb
│   ├── 04. Numpy常用random随机函数汇总-checkpoint.ipynb
│   ├── 05. Numpy的数学统计函数-checkpoint.ipynb
│   ├── 06. Numpy计算数组中满足条件元素个数-checkpoint.ipynb
│   ├── 07. Numpy怎样给数组增加一个维度-checkpoint.ipynb
│   ├── 08. Numpy实现K折交叉验证的数据划分-checkpoint.ipynb
│   ├── 09. Numpy非常有用的数组合并操作-checkpoint.ipynb
│   ├── 10. Numpy怎样对数组排序-checkpoint.ipynb
│   ├── 11. Numpy中数组的乘法-checkpoint.ipynb
│   ├── 12. Numpy中重要的广播概念-checkpoint.ipynb
│   ├── 13. Numpy求解线性方程组-checkpoint.ipynb
│   ├── 14. Numpy实现SVD矩阵分解-checkpoint.ipynb
│   ├── 15. Numpy实现多项式曲线拟合-checkpoint.ipynb
│   ├── 16. Numpy使用Matplotlib实现可视化绘图-checkpoint.ipynb
│   ├── 17. Numpy计算逆矩阵求解线性方程组-checkpoint.ipynb
│   ├── 18. Numpy怎样将数组读写到文件-checkpoint.ipynb
│   ├── 19. Numpy的结构化数组-checkpoint.ipynb
│   ├── 20. Numpy与Pandas数据的相互转换-checkpoint.ipynb
│   ├── 21. Numpy数据输入给Scikit-learn实现模型训练-checkpoint.ipynb
│   ├── Untitled-checkpoint.ipynb
│   └── Untitled1-checkpoint.ipynb
├── 01. Numpy和原生Python用于数组计算的性能对比.ipynb
├── 02. Numpy的核心array对象以及创建array的方法.ipynb
├── 03. Numpy对数组按索引查询.ipynb
├── 04. Numpy常用random随机函数汇总.ipynb
├── 05. Numpy的数学统计函数.ipynb
├── 06. Numpy计算数组中满足条件元素个数.ipynb
├── 07. Numpy怎样给数组增加一个维度.ipynb
├── 08. Numpy实现K折交叉验证的数据划分.ipynb
├── 09. Numpy非常有用的数组合并操作.ipynb
├── 10. Numpy怎样对数组排序.ipynb
├── 11. Numpy中数组的乘法.ipynb
├── 12. Numpy中重要的广播概念.ipynb
├── 13. Numpy求解线性方程组.ipynb
├── 14. Numpy实现SVD矩阵分解.ipynb
├── 15. Numpy实现多项式曲线拟合.ipynb
├── 16. Numpy使用Matplotlib实现可视化绘图.ipynb
├── 17. Numpy计算逆矩阵求解线性方程组.ipynb
├── 18. Numpy怎样将数组读写到文件.ipynb
├── 19. Numpy的结构化数组.ipynb
├── 20. Numpy与Pandas数据的相互转换.ipynb
├── 21. Numpy数据输入给Scikit-learn实现模型训练.ipynb
├── README.md
├── Untitled.ipynb
├── Untitled1.ipynb
├── arr_a.npy
├── arr_ab.npz
├── arr_ab_compressed.npz
└── other_files
    ├── numpy-array-inv.jpg
    ├── numpy-kfold-validation.jpg
    ├── numpy-kfold-validation.png
    └── numpy_random_functions.png
/.ipynb_checkpoints/06. Numpy计算数组中满足条件元素个数-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
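7 | "## Numpy: counting the elements of an array that satisfy a condition\n",
8 | "\n",
9 | "Task: given a very large array (for example, 100 million numbers), count how many of them are greater than 5000"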
10 | ]
11 | },
12 | {
13 | "cell_type": "markdown",
14 | "metadata": {},
15 | "source": [
16 | "### 1. Generate 100 million numbers with numpy's random module"
17 | ]
18 | },
19 | {
20 | "cell_type": "code",
21 | "execution_count": 1,
22 | "metadata": {},
23 | "outputs": [],
24 | "source": [
25 | "import numpy as np"
26 | ]
27 | },
28 | {
29 | "cell_type": "code",
30 | "execution_count": 2,
31 | "metadata": {},
32 | "outputs": [],
33 | "source": [
34 | "arr = np.random.randint(1, 10000, size=int(1e8))"
35 | ]
36 | },
37 | {
38 | "cell_type": "code",
39 | "execution_count": 3,
40 | "metadata": {},
41 | "outputs": [
42 | {
43 | "data": {
44 | "text/plain": [
45 | "array([8855, 6014, 4193, 7830, 355, 9469, 1661, 6569, 7647, 5907])"
46 | ]
47 | },
48 | "execution_count": 3,
49 | "metadata": {},
50 | "output_type": "execute_result"
51 | }
52 | ],
53 | "source": [
54 | "arr[:10]"
55 | ]
56 | },
57 | {
58 | "cell_type": "code",
59 | "execution_count": 4,
60 | "metadata": {},
61 | "outputs": [
62 | {
63 | "data": {
64 | "text/plain": [
65 | "100000000"
66 | ]
67 | },
68 | "execution_count": 4,
69 | "metadata": {},
70 | "output_type": "execute_result"
71 | }
72 | ],
73 | "source": [
74 | "arr.size"
75 | ]
76 | },
77 | {
78 | "cell_type": "markdown",
79 | "metadata": {},
80 | "source": [
81 | "### 2. Implementation in plain Python"
82 | ]
83 | },
84 | {
85 | "cell_type": "code",
86 | "execution_count": 5,
87 | "metadata": {},
88 | "outputs": [],
89 | "source": [
90 | "pyarr = list(arr)"
91 | ]
92 | },
93 | {
94 | "cell_type": "code",
95 | "execution_count": 6,
96 | "metadata": {},
97 | "outputs": [
98 | {
99 | "data": {
100 | "text/plain": [
101 | "50001207"
102 | ]
103 | },
104 | "execution_count": 6,
105 | "metadata": {},
106 | "output_type": "execute_result"
107 | }
108 | ],
109 | "source": [
110 | "# Compute the result first, to check that both approaches agree\n",
111 | "len([x for x in pyarr if x>5000])"
112 | ]
113 | },
114 | {
115 | "cell_type": "code",
116 | "execution_count": 7,
117 | "metadata": {},
118 | "outputs": [
119 | {
120 | "name": "stdout",
121 | "output_type": "stream",
122 | "text": [
123 | "16.6 s ± 252 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
124 | ]
125 | }
126 | ],
127 | "source": [
128 | "# Time it\n",
129 | "%timeit len([x for x in pyarr if x>5000])"
130 | ]
131 | },
132 | {
133 | "cell_type": "markdown",
134 | "metadata": {},
135 | "source": [
136 | "### 3. Implementation with numpy's vectorized operations"
137 | ]
138 | },
139 | {
140 | "cell_type": "code",
141 | "execution_count": 8,
142 | "metadata": {},
143 | "outputs": [
144 | {
145 | "data": {
146 | "text/plain": [
147 | "50001207"
148 | ]
149 | },
150 | "execution_count": 8,
151 | "metadata": {},
152 | "output_type": "execute_result"
153 | }
154 | ],
155 | "source": [
156 | "# Compute the result first, to check that both approaches agree\n",
157 | "arr[arr>5000].size"
158 | ]
159 | },
160 | {
161 | "cell_type": "code",
162 | "execution_count": 9,
163 | "metadata": {},
164 | "outputs": [
165 | {
166 | "data": {
167 | "text/plain": [
168 | "array([ True, True, False, True, False, True, False, True, True,\n",
169 | " True])"
170 | ]
171 | },
172 | "execution_count": 9,
173 | "metadata": {},
174 | "output_type": "execute_result"
175 | }
176 | ],
177 | "source": [
178 | "(arr>5000)[:10]"
179 | ]
180 | },
181 | {
182 | "cell_type": "code",
183 | "execution_count": 10,
184 | "metadata": {},
185 | "outputs": [
186 | {
187 | "name": "stdout",
188 | "output_type": "stream",
189 | "text": [
190 | "556 ms ± 3.45 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
191 | ]
192 | }
193 | ],
194 | "source": [
195 | "# Time it\n",
196 | "%timeit arr[arr>5000].size"
197 | ]
198 | },
199 | {
200 | "cell_type": "markdown",
201 | "metadata": {},
202 | "source": [
203 | "### 4. Compare the timings"
204 | ]
205 | },
206 | {
207 | "cell_type": "code",
208 | "execution_count": 12,
209 | "metadata": {},
210 | "outputs": [
211 | {
212 | "data": {
213 | "text/plain": [
214 | "29.90990990990991"
215 | ]
216 | },
217 | "execution_count": 12,
218 | "metadata": {},
219 | "output_type": "execute_result"
220 | }
221 | ],
222 | "source": [
223 | "16.6*1000 / 555 "
224 | ]
225 | },
226 | {
227 | "cell_type": "code",
228 | "execution_count": null,
229 | "metadata": {},
230 | "outputs": [],
231 | "source": []
232 | }
233 | ],
234 | "metadata": {
235 | "kernelspec": {
236 | "display_name": "Python 3",
237 | "language": "python",
238 | "name": "python3"
239 | },
240 | "language_info": {
241 | "codemirror_mode": {
242 | "name": "ipython",
243 | "version": 3
244 | },
245 | "file_extension": ".py",
246 | "mimetype": "text/x-python",
247 | "name": "python",
248 | "nbconvert_exporter": "python",
249 | "pygments_lexer": "ipython3",
250 | "version": "3.7.6"
251 | }
252 | },
253 | "nbformat": 4,
254 | "nbformat_minor": 4
255 | }
256 |
--------------------------------------------------------------------------------
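A hedged aside on notebook 06 above: `arr[arr>5000].size` builds a filtered copy of the array only to read its length. The sketch below, assuming only the count is needed, compares that approach with `np.count_nonzero` and `sum` applied to the boolean mask, and checks all of them against a plain-Python count. The array size and the seed are illustrative choices, not the notebook's values.

```python
import numpy as np

# Illustrative data: much smaller than the notebook's 1e8 elements, fixed seed.
rng = np.random.default_rng(0)
arr = rng.integers(1, 10000, size=1_000_000)

# Boolean mask: True where the element is greater than 5000.
mask = arr > 5000

# Approach from the notebook: filter, then take the size of the copy.
count_by_filter = arr[mask].size

# Alternatives that count True values without building the filtered copy.
count_by_nonzero = int(np.count_nonzero(mask))
count_by_sum = int(mask.sum())

# Plain-Python reference count for comparison.
count_python = sum(1 for x in arr.tolist() if x > 5000)

print(count_by_filter, count_by_nonzero, count_by_sum, count_python)
```

All four counts should agree; which vectorized variant is fastest depends on the machine and the NumPy version, so it is worth timing them with `%timeit` rather than assuming.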
/.ipynb_checkpoints/07. Numpy怎样给数组增加一个维度-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Numpy: how to add a dimension to an array"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
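14 | "***Background:***  \n",
15 | "Many computations work on two- or three-dimensional data, so a one-dimensional input often has to be promoted to two dimensions to make the shapes match\n",
16 | "\n",
17 | "***Requirement:***  \n",
18 | "Add a dimension to an array without changing its data (note in this example that the dimensions change but the data does not)  \n",
19 | "Original array: the 1-D array arr=[1,2,3,4] has shape (4,), and its elements are arr[0], arr[1], arr[2], arr[3]  \n",
20 | "Reshaped array: the 2-D array arr=[[1,2,3,4]] has shape (1,4), and its elements are arr[0,0], arr[0,1], arr[0,2], arr[0,3]\n",
21 | "\n",
22 | "***In practice:***  \n",
23 | "It often helps to sketch array shapes on paper to check whether different arrays match, and whether a dimension needs to be added or removed\n",
24 | "\n",
25 | "***Three methods:***  \n",
26 | "* np.newaxis: a constant (an alias for None) used inside the indexing syntax to add a dimension\n",
27 | "* np.expand_dims(arr, axis): a function with the same effect as np.newaxis; it inserts a new dimension into arr at position axis\n",
28 | "* np.reshape(a, newshape): a function; setting one dimension to 1 adds a dimension"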
29 | ]
30 | },
31 | {
32 | "cell_type": "code",
33 | "execution_count": 1,
34 | "metadata": {},
35 | "outputs": [],
36 | "source": [
37 | "import numpy as np"
38 | ]
39 | },
40 | {
41 | "cell_type": "code",
42 | "execution_count": 3,
43 | "metadata": {},
44 | "outputs": [
45 | {
46 | "data": {
47 | "text/plain": [
48 | "array([0, 1, 2, 3, 4])"
49 | ]
50 | },
51 | "execution_count": 3,
52 | "metadata": {},
53 | "output_type": "execute_result"
54 | }
55 | ],
56 | "source": [
57 | "arr = np.arange(5)\n",
58 | "arr"
59 | ]
60 | },
61 | {
62 | "cell_type": "code",
63 | "execution_count": 5,
64 | "metadata": {},
65 | "outputs": [
66 | {
67 | "data": {
68 | "text/plain": [
69 | "(5,)"
70 | ]
71 | },
72 | "execution_count": 5,
73 | "metadata": {},
74 | "output_type": "execute_result"
75 | }
76 | ],
77 | "source": [
78 | "# Note: this is currently a 1-D vector\n",
79 | "arr.shape"
80 | ]
81 | },
82 | {
83 | "cell_type": "markdown",
84 | "metadata": {},
85 | "source": [
86 | "### Method 1: the np.newaxis constant"
87 | ]
88 | },
89 | {
90 | "cell_type": "markdown",
91 | "metadata": {},
92 | "source": [
93 | "#### Note: np.newaxis is simply an alias for None"
94 | ]
95 | },
96 | {
97 | "cell_type": "code",
98 | "execution_count": 6,
99 | "metadata": {},
100 | "outputs": [
101 | {
102 | "data": {
103 | "text/plain": [
104 | "True"
105 | ]
106 | },
107 | "execution_count": 6,
108 | "metadata": {},
109 | "output_type": "execute_result"
110 | }
111 | ],
112 | "source": [
113 | "np.newaxis is None"
114 | ]
115 | },
116 | {
117 | "cell_type": "code",
118 | "execution_count": 7,
119 | "metadata": {},
120 | "outputs": [
121 | {
122 | "data": {
123 | "text/plain": [
124 | "True"
125 | ]
126 | },
127 | "execution_count": 7,
128 | "metadata": {},
129 | "output_type": "execute_result"
130 | }
131 | ],
132 | "source": [
133 | "np.newaxis == None"
134 | ]
135 | },
136 | {
137 | "cell_type": "markdown",
138 | "metadata": {},
139 | "source": [
140 | "So every np.newaxis below can be replaced by None"
141 | ]
142 | },
143 | {
144 | "cell_type": "markdown",
145 | "metadata": {},
146 | "source": [
147 | "#### Add a row dimension to a 1-D vector"
148 | ]
149 | },
150 | {
151 | "cell_type": "code",
152 | "execution_count": 8,
153 | "metadata": {},
154 | "outputs": [
155 | {
156 | "data": {
157 | "text/plain": [
158 | "array([[0, 1, 2, 3, 4]])"
159 | ]
160 | },
161 | "execution_count": 8,
162 | "metadata": {},
163 | "output_type": "execute_result"
164 | }
165 | ],
166 | "source": [
167 | "arr[np.newaxis, :]"
168 | ]
169 | },
170 | {
171 | "cell_type": "code",
172 | "execution_count": 9,
173 | "metadata": {
174 | "scrolled": true
175 | },
176 | "outputs": [
177 | {
178 | "data": {
179 | "text/plain": [
180 | "(1, 5)"
181 | ]
182 | },
183 | "execution_count": 9,
184 | "metadata": {},
185 | "output_type": "execute_result"
186 | }
187 | ],
188 | "source": [
189 | "arr[np.newaxis, :].shape"
190 | ]
191 | },
192 | {
193 | "cell_type": "markdown",
194 | "metadata": {},
195 | "source": [
196 | "The data is now 1 row by 5 columns; no element was added or removed, there is just one more level of brackets"
197 | ]
198 | },
199 | {
200 | "cell_type": "markdown",
201 | "metadata": {},
202 | "source": [
203 | "#### Add a column dimension to a 1-D vector"
204 | ]
205 | },
206 | {
207 | "cell_type": "code",
208 | "execution_count": 10,
209 | "metadata": {},
210 | "outputs": [
211 | {
212 | "data": {
213 | "text/plain": [
214 | "array([[0],\n",
215 | " [1],\n",
216 | " [2],\n",
217 | " [3],\n",
218 | " [4]])"
219 | ]
220 | },
221 | "execution_count": 10,
222 | "metadata": {},
223 | "output_type": "execute_result"
224 | }
225 | ],
226 | "source": [
227 | "arr[:, np.newaxis]"
228 | ]
229 | },
230 | {
231 | "cell_type": "code",
232 | "execution_count": 11,
233 | "metadata": {},
234 | "outputs": [
235 | {
236 | "data": {
237 | "text/plain": [
238 | "(5, 1)"
239 | ]
240 | },
241 | "execution_count": 11,
242 | "metadata": {},
243 | "output_type": "execute_result"
244 | }
245 | ],
246 | "source": [
247 | "arr[:, np.newaxis].shape"
248 | ]
249 | },
250 | {
251 | "cell_type": "markdown",
252 | "metadata": {},
253 | "source": [
254 | "The data is now 5 rows by 1 column"
255 | ]
256 | },
257 | {
258 | "cell_type": "markdown",
259 | "metadata": {},
260 | "source": [
261 | "### Method 2: the np.expand_dims function"
262 | ]
263 | },
264 | {
265 | "cell_type": "markdown",
266 | "metadata": {},
267 | "source": [
268 | "np.expand_dims produces exactly the same result as indexing with np.newaxis"
269 | ]
270 | },
271 | {
272 | "cell_type": "code",
273 | "execution_count": 13,
274 | "metadata": {},
275 | "outputs": [
276 | {
277 | "data": {
278 | "text/plain": [
279 | "array([0, 1, 2, 3, 4])"
280 | ]
281 | },
282 | "execution_count": 13,
283 | "metadata": {},
284 | "output_type": "execute_result"
285 | }
286 | ],
287 | "source": [
288 | "arr"
289 | ]
290 | },
291 | {
292 | "cell_type": "markdown",
293 | "metadata": {},
294 | "source": [
295 | "#### Add a row dimension to a 1-D array"
296 | ]
297 | },
298 | {
299 | "cell_type": "markdown",
300 | "metadata": {},
301 | "source": [
302 | "Equivalent to arr[np.newaxis, :]"
303 | ]
304 | },
305 | {
306 | "cell_type": "code",
307 | "execution_count": 14,
308 | "metadata": {},
309 | "outputs": [
310 | {
311 | "data": {
312 | "text/plain": [
313 | "array([[0, 1, 2, 3, 4]])"
314 | ]
315 | },
316 | "execution_count": 14,
317 | "metadata": {},
318 | "output_type": "execute_result"
319 | }
320 | ],
321 | "source": [
322 | "np.expand_dims(arr, axis=0)"
323 | ]
324 | },
325 | {
326 | "cell_type": "code",
327 | "execution_count": 15,
328 | "metadata": {},
329 | "outputs": [
330 | {
331 | "data": {
332 | "text/plain": [
333 | "(1, 5)"
334 | ]
335 | },
336 | "execution_count": 15,
337 | "metadata": {},
338 | "output_type": "execute_result"
339 | }
340 | ],
341 | "source": [
342 | "np.expand_dims(arr, axis=0).shape"
343 | ]
344 | },
345 | {
346 | "cell_type": "markdown",
347 | "metadata": {},
348 | "source": [
349 | "#### Add a column dimension to a 1-D array"
350 | ]
351 | },
352 | {
353 | "cell_type": "markdown",
354 | "metadata": {},
355 | "source": [
356 | "Equivalent to arr[:, np.newaxis]"
357 | ]
358 | },
359 | {
360 | "cell_type": "code",
361 | "execution_count": 16,
362 | "metadata": {},
363 | "outputs": [
364 | {
365 | "data": {
366 | "text/plain": [
367 | "array([[0],\n",
368 | " [1],\n",
369 | " [2],\n",
370 | " [3],\n",
371 | " [4]])"
372 | ]
373 | },
374 | "execution_count": 16,
375 | "metadata": {},
376 | "output_type": "execute_result"
377 | }
378 | ],
379 | "source": [
380 | "np.expand_dims(arr, axis=1)"
381 | ]
382 | },
383 | {
384 | "cell_type": "code",
385 | "execution_count": 17,
386 | "metadata": {},
387 | "outputs": [
388 | {
389 | "data": {
390 | "text/plain": [
391 | "(5, 1)"
392 | ]
393 | },
394 | "execution_count": 17,
395 | "metadata": {},
396 | "output_type": "execute_result"
397 | }
398 | ],
399 | "source": [
400 | "np.expand_dims(arr, axis=1).shape"
401 | ]
402 | },
403 | {
404 | "cell_type": "markdown",
405 | "metadata": {},
406 | "source": [
407 | "### Method 3: the np.reshape function"
408 | ]
409 | },
410 | {
411 | "cell_type": "markdown",
412 | "metadata": {},
413 | "source": [
414 | "#### Add a row dimension to a 1-D array"
415 | ]
416 | },
417 | {
418 | "cell_type": "code",
419 | "execution_count": 23,
420 | "metadata": {},
421 | "outputs": [
422 | {
423 | "data": {
424 | "text/plain": [
425 | "array([0, 1, 2, 3, 4])"
426 | ]
427 | },
428 | "execution_count": 23,
429 | "metadata": {},
430 | "output_type": "execute_result"
431 | }
432 | ],
433 | "source": [
434 | "arr"
435 | ]
436 | },
437 | {
438 | "cell_type": "code",
439 | "execution_count": 32,
440 | "metadata": {},
441 | "outputs": [
442 | {
443 | "data": {
444 | "text/plain": [
445 | "array([[0, 1, 2, 3, 4]])"
446 | ]
447 | },
448 | "execution_count": 32,
449 | "metadata": {},
450 | "output_type": "execute_result"
451 | }
452 | ],
453 | "source": [
454 | "np.reshape(arr, (1, 5))"
455 | ]
456 | },
457 | {
458 | "cell_type": "code",
459 | "execution_count": 33,
460 | "metadata": {},
461 | "outputs": [
462 | {
463 | "data": {
464 | "text/plain": [
465 | "array([[0, 1, 2, 3, 4]])"
466 | ]
467 | },
468 | "execution_count": 33,
469 | "metadata": {},
470 | "output_type": "execute_result"
471 | }
472 | ],
473 | "source": [
474 | "np.reshape(arr, (1, -1))"
475 | ]
476 | },
477 | {
478 | "cell_type": "code",
479 | "execution_count": 34,
480 | "metadata": {},
481 | "outputs": [
482 | {
483 | "data": {
484 | "text/plain": [
485 | "(1, 5)"
486 | ]
487 | },
488 | "execution_count": 34,
489 | "metadata": {},
490 | "output_type": "execute_result"
491 | }
492 | ],
493 | "source": [
494 | "np.reshape(arr, (1, -1)).shape"
495 | ]
496 | },
497 | {
498 | "cell_type": "markdown",
499 | "metadata": {},
500 | "source": [
501 | "#### Add a column dimension to a 1-D array"
502 | ]
503 | },
504 | {
505 | "cell_type": "code",
506 | "execution_count": 35,
507 | "metadata": {},
508 | "outputs": [
509 | {
510 | "data": {
511 | "text/plain": [
512 | "array([[0],\n",
513 | " [1],\n",
514 | " [2],\n",
515 | " [3],\n",
516 | " [4]])"
517 | ]
518 | },
519 | "execution_count": 35,
520 | "metadata": {},
521 | "output_type": "execute_result"
522 | }
523 | ],
524 | "source": [
525 | "np.reshape(arr, (-1, 1))"
526 | ]
527 | },
528 | {
529 | "cell_type": "code",
530 | "execution_count": 36,
531 | "metadata": {},
532 | "outputs": [
533 | {
534 | "data": {
535 | "text/plain": [
536 | "(5, 1)"
537 | ]
538 | },
539 | "execution_count": 36,
540 | "metadata": {},
541 | "output_type": "execute_result"
542 | }
543 | ],
544 | "source": [
545 | "np.reshape(arr, (-1, 1)).shape"
546 | ]
547 | }
548 | ],
549 | "metadata": {
550 | "kernelspec": {
551 | "display_name": "Python 3",
552 | "language": "python",
553 | "name": "python3"
554 | },
555 | "language_info": {
556 | "codemirror_mode": {
557 | "name": "ipython",
558 | "version": 3
559 | },
560 | "file_extension": ".py",
561 | "mimetype": "text/x-python",
562 | "name": "python",
563 | "nbconvert_exporter": "python",
564 | "pygments_lexer": "ipython3",
565 | "version": "3.7.6"
566 | }
567 | },
568 | "nbformat": 4,
569 | "nbformat_minor": 4
570 | }
571 |
--------------------------------------------------------------------------------
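As a compact recap of notebook 07 above, the following sketch gathers the three methods in one place and confirms that they produce the same shapes and values. The broadcasting example at the end is an added illustration of why the extra axis is useful; it is not taken from the notebook.

```python
import numpy as np

arr = np.arange(5)  # 1-D array, shape (5,)

# Three ways (plus the None spelling) to add a leading "row" axis: shape (1, 5).
row_variants = [
    arr[np.newaxis, :],
    arr[None, :],
    np.expand_dims(arr, axis=0),
    np.reshape(arr, (1, -1)),
]

# Three ways to add a trailing "column" axis: shape (5, 1).
col_variants = [
    arr[:, np.newaxis],
    np.expand_dims(arr, axis=1),
    np.reshape(arr, (-1, 1)),
]

row_shapes = {v.shape for v in row_variants}
col_shapes = {v.shape for v in col_variants}
print(row_shapes, col_shapes)

# Illustration (an assumption about typical use): a (5, 1) column and a
# (1, 5) row broadcast to a (5, 5) table where table[i, j] == i + j.
table = arr[:, np.newaxis] + arr[np.newaxis, :]
print(table.shape)
```

Which spelling to use is mostly a matter of taste: indexing with `np.newaxis` is terse, while `np.expand_dims` states the target axis explicitly.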
/.ipynb_checkpoints/08. Numpy实现K折交叉验证的数据划分-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Numpy: splitting data for K-fold cross-validation"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
14 | "This example uses Numpy array slicing to split data for K-fold cross-validation"
15 | ]
16 | },
17 | {
18 | "cell_type": "markdown",
19 | "metadata": {},
20 | "source": [
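21 | "### Background: K-fold cross-validation\n",
22 | "\n",
23 | "***Why is it needed?***  \n",
24 | "In machine learning, K-fold cross-validation gives a better estimate of model quality, for two reasons:\n",
25 | "1. When samples are scarce, splitting them into a training set and a test set leaves even less data for training;\n",
26 | "2. Different train/test splits can yield different model performance results;\n",
27 | "\n",
28 | "\n",
29 | "***What is K-fold validation?***  \n",
30 | "K-fold validation splits the data into K partitions of equal size.  \n",
31 | "For each partition i, train the model on the remaining K-1 partitions and evaluate it on partition i.  \n",
32 | "The final score is the mean of the K scores; averaging removes the influence of any particular train/test split;\n",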
33 | "\n",
34 | ""
35 | ]
36 | },
37 | {
38 | "cell_type": "markdown",
39 | "metadata": {},
40 | "source": [
41 | "### 1. Build a mock sample set"
42 | ]
43 | },
44 | {
45 | "cell_type": "code",
46 | "execution_count": 1,
47 | "metadata": {},
48 | "outputs": [],
49 | "source": [
50 | "import numpy as np"
51 | ]
52 | },
53 | {
54 | "cell_type": "code",
55 | "execution_count": 2,
56 | "metadata": {},
57 | "outputs": [
58 | {
59 | "data": {
60 | "text/plain": [
61 | "array([[ 0, 1, 2, 3],\n",
62 | " [ 4, 5, 6, 7],\n",
63 | " [ 8, 9, 10, 11],\n",
64 | " [12, 13, 14, 15],\n",
65 | " [16, 17, 18, 19],\n",
66 | " [20, 21, 22, 23],\n",
67 | " [24, 25, 26, 27],\n",
68 | " [28, 29, 30, 31],\n",
69 | " [32, 33, 34, 35]])"
70 | ]
71 | },
72 | "execution_count": 2,
73 | "metadata": {},
74 | "output_type": "execute_result"
75 | }
76 | ],
77 | "source": [
78 | "data = np.arange(36).reshape(9,4)\n",
79 | "data"
80 | ]
81 | },
82 | {
83 | "cell_type": "markdown",
84 | "metadata": {},
85 | "source": [
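86 | "Reading the data array in terms of samples:\n",
87 | "* It is a 2-D matrix: each row is a sample and each column is a feature\n",
88 | "* There are 9 samples here, each with 4 features\n",
89 | "\n",
90 | "This is the standard input format for training scikit-learn models"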
91 | ]
92 | },
93 | {
94 | "cell_type": "markdown",
95 | "metadata": {},
96 | "source": [
97 | "### 2. Produce the K splits with Numpy"
98 | ]
99 | },
100 | {
101 | "cell_type": "code",
102 | "execution_count": 3,
103 | "metadata": {},
104 | "outputs": [],
105 | "source": [
106 | "# We want 4-fold cross-validation\n",
107 | "k = 4"
108 | ]
109 | },
110 | {
111 | "cell_type": "code",
112 | "execution_count": 4,
113 | "metadata": {},
114 | "outputs": [
115 | {
116 | "data": {
117 | "text/plain": [
118 | "2"
119 | ]
120 | },
121 | "execution_count": 4,
122 | "metadata": {},
123 | "output_type": "execute_result"
124 | }
125 | ],
126 | "source": [
127 | "# Number of samples per fold (integer division: 9 // 4 = 2, so the last sample is never used for validation)\n",
128 | "k_samples_count = data.shape[0]//k\n",
129 | "k_samples_count"
130 | ]
131 | },
132 | {
133 | "cell_type": "code",
134 | "execution_count": 5,
135 | "metadata": {
136 | "scrolled": false
137 | },
138 | "outputs": [
139 | {
140 | "name": "stdout",
141 | "output_type": "stream",
142 | "text": [
143 | "\n",
144 | "#####第0折#####\n",
145 | "验证集:\n",
146 | " [[0 1 2 3]\n",
147 | " [4 5 6 7]]\n",
148 | "训练集:\n",
149 | " [[ 8 9 10 11]\n",
150 | " [12 13 14 15]\n",
151 | " [16 17 18 19]\n",
152 | " [20 21 22 23]\n",
153 | " [24 25 26 27]\n",
154 | " [28 29 30 31]\n",
155 | " [32 33 34 35]]\n",
156 | "\n",
157 | "#####第1折#####\n",
158 | "验证集:\n",
159 | " [[ 8 9 10 11]\n",
160 | " [12 13 14 15]]\n",
161 | "训练集:\n",
162 | " [[ 0 1 2 3]\n",
163 | " [ 4 5 6 7]\n",
164 | " [16 17 18 19]\n",
165 | " [20 21 22 23]\n",
166 | " [24 25 26 27]\n",
167 | " [28 29 30 31]\n",
168 | " [32 33 34 35]]\n",
169 | "\n",
170 | "#####第2折#####\n",
171 | "验证集:\n",
172 | " [[16 17 18 19]\n",
173 | " [20 21 22 23]]\n",
174 | "训练集:\n",
175 | " [[ 0 1 2 3]\n",
176 | " [ 4 5 6 7]\n",
177 | " [ 8 9 10 11]\n",
178 | " [12 13 14 15]\n",
179 | " [24 25 26 27]\n",
180 | " [28 29 30 31]\n",
181 | " [32 33 34 35]]\n",
182 | "\n",
183 | "#####第3折#####\n",
184 | "验证集:\n",
185 | " [[24 25 26 27]\n",
186 | " [28 29 30 31]]\n",
187 | "训练集:\n",
188 | " [[ 0 1 2 3]\n",
189 | " [ 4 5 6 7]\n",
190 | " [ 8 9 10 11]\n",
191 | " [12 13 14 15]\n",
192 | " [16 17 18 19]\n",
193 | " [20 21 22 23]\n",
194 | " [32 33 34 35]]\n"
195 | ]
196 | }
197 | ],
198 | "source": [
199 | "for fold in range(k):\n",
200 | "    validation_begin = k_samples_count*fold\n",
201 | "    validation_end = k_samples_count*(fold+1)\n",
202 | "    \n",
203 | "    validation_data = data[validation_begin:validation_end]\n",
204 | "    \n",
205 | "    # np.vstack stacks arrays vertically (row-wise)\n",
206 | "    train_data = np.vstack([\n",
207 | "        data[:validation_begin], \n",
208 | "        data[validation_end:]\n",
209 | "    ])\n",
210 | "    \n",
211 | "    print()\n",
212 | "    print(f\"#####第{fold}折#####\")\n",
213 | "    print(\"验证集:\\n\", validation_data)\n",
214 | "    print(\"训练集:\\n\", train_data)"
215 | ]
216 | },
217 | {
218 | "cell_type": "markdown",
219 | "metadata": {},
220 | "source": [
221 | "scikit-learn already ships a packaged implementation:  \n",
222 | "from sklearn.model_selection import cross_val_score"
223 | ]
224 | }
225 | ],
226 | "metadata": {
227 | "kernelspec": {
228 | "display_name": "Python 3",
229 | "language": "python",
230 | "name": "python3"
231 | },
232 | "language_info": {
233 | "codemirror_mode": {
234 | "name": "ipython",
235 | "version": 3
236 | },
237 | "file_extension": ".py",
238 | "mimetype": "text/x-python",
239 | "name": "python",
240 | "nbconvert_exporter": "python",
241 | "pygments_lexer": "ipython3",
242 | "version": "3.7.6"
243 | }
244 | },
245 | "nbformat": 4,
246 | "nbformat_minor": 4
247 | }
248 |
--------------------------------------------------------------------------------
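One caveat about notebook 08 above: with 9 samples and `k = 4`, integer division gives folds of 2 samples, so the last row never appears in a validation set. The sketch below shows one possible way to cover every sample, using `np.array_split`, which allows folds of unequal size. The helper name `kfold_splits` is hypothetical and is not part of the notebook or of NumPy.

```python
import numpy as np


def kfold_splits(data, k):
    """Yield (train, validation) pairs; hypothetical helper for illustration."""
    indices = np.arange(data.shape[0])
    # np.array_split tolerates a remainder: the first folds get one extra sample.
    for val_idx in np.array_split(indices, k):
        # Every index that is not in the validation fold goes to training.
        train_idx = np.setdiff1d(indices, val_idx)
        yield data[train_idx], data[val_idx]


data = np.arange(36).reshape(9, 4)  # same mock sample set as the notebook
folds = list(kfold_splits(data, 4))

val_sizes = [val.shape[0] for _, val in folds]
print(val_sizes)

# Stacking all validation folds back together recovers the full data set.
all_validation = np.vstack([val for _, val in folds])
print(np.array_equal(all_validation, data))
```

The folds are contiguous and unshuffled here, as in the notebook; shuffling the indices first (for example with `np.random.default_rng().permutation`) is a common variation when the row order carries meaning.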
/.ipynb_checkpoints/09. Numpy非常有用的数组合并操作-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Numpy: important and useful array-merging operations"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
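14 | "Background: when preparing data for machine learning, data from different sources often has to be merged.\n",
15 | "\n",
16 | "Two scenarios:\n",
17 | "1. Append rows to existing data, for example to add more samples;\n",
18 | "2. Append columns to existing data, for example to add more features;\n",
19 | "\n",
20 | "All of the following operations merge arrays:\n",
21 | "* np.concatenate(array_list, axis=0/1): merge arrays along the given axis\n",
22 | "* np.vstack or np.row_stack(array_list): merge vertically, row-wise (np.row_stack is an alias of np.vstack that is deprecated as of NumPy 2.0)\n",
23 | "* np.hstack or np.column_stack(array_list): merge horizontally, column-wise (the two agree for 2-D inputs; for 1-D inputs np.column_stack stacks them as columns instead)"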
24 | ]
25 | },
26 | {
27 | "cell_type": "code",
28 | "execution_count": 1,
29 | "metadata": {},
30 | "outputs": [],
31 | "source": [
32 | "import numpy as np"
33 | ]
34 | },
35 | {
36 | "cell_type": "markdown",
37 | "metadata": {},
38 | "source": [
39 | "### 1. How to append new rows to the data"
40 | ]
41 | },
42 | {
43 | "cell_type": "code",
44 | "execution_count": 2,
45 | "metadata": {},
46 | "outputs": [],
47 | "source": [
48 | "a = np.arange(6).reshape(2,3)\n",
49 | "b = np.random.randint(10,20,size=(4,3))"
50 | ]
51 | },
52 | {
53 | "cell_type": "code",
54 | "execution_count": 3,
55 | "metadata": {},
56 | "outputs": [
57 | {
58 | "data": {
59 | "text/plain": [
60 | "array([[0, 1, 2],\n",
61 | " [3, 4, 5]])"
62 | ]
63 | },
64 | "execution_count": 3,
65 | "metadata": {},
66 | "output_type": "execute_result"
67 | }
68 | ],
69 | "source": [
70 | "a"
71 | ]
72 | },
73 | {
74 | "cell_type": "code",
75 | "execution_count": 4,
76 | "metadata": {},
77 | "outputs": [
78 | {
79 | "data": {
80 | "text/plain": [
81 | "array([[12, 11, 14],\n",
82 | " [18, 15, 10],\n",
83 | " [11, 15, 15],\n",
84 | " [19, 16, 10]])"
85 | ]
86 | },
87 | "execution_count": 4,
88 | "metadata": {},
89 | "output_type": "execute_result"
90 | }
91 | ],
92 | "source": [
93 | "b"
94 | ]
95 | },
96 | {
97 | "cell_type": "code",
98 | "execution_count": 5,
99 | "metadata": {},
100 | "outputs": [
101 | {
102 | "data": {
103 | "text/plain": [
104 | "array([[ 0, 1, 2],\n",
105 | " [ 3, 4, 5],\n",
106 | " [12, 11, 14],\n",
107 | " [18, 15, 10],\n",
108 | " [11, 15, 15],\n",
109 | " [19, 16, 10]])"
110 | ]
111 | },
112 | "execution_count": 5,
113 | "metadata": {},
114 | "output_type": "execute_result"
115 | }
116 | ],
117 | "source": [
118 | "# Method 1:\n",
119 | "np.concatenate([a,b])"
120 | ]
121 | },
122 | {
123 | "cell_type": "code",
124 | "execution_count": 6,
125 | "metadata": {},
126 | "outputs": [
127 | {
128 | "data": {
129 | "text/plain": [
130 | "array([[ 0, 1, 2],\n",
131 | " [ 3, 4, 5],\n",
132 | " [12, 11, 14],\n",
133 | " [18, 15, 10],\n",
134 | " [11, 15, 15],\n",
135 | " [19, 16, 10]])"
136 | ]
137 | },
138 | "execution_count": 6,
139 | "metadata": {},
140 | "output_type": "execute_result"
141 | }
142 | ],
143 | "source": [
144 | "# Method 2\n",
145 | "np.vstack([a,b])"
146 | ]
147 | },
148 | {
149 | "cell_type": "code",
150 | "execution_count": 7,
151 | "metadata": {},
152 | "outputs": [
153 | {
154 | "data": {
155 | "text/plain": [
156 | "array([[ 0, 1, 2],\n",
157 | " [ 3, 4, 5],\n",
158 | " [12, 11, 14],\n",
159 | " [18, 15, 10],\n",
160 | " [11, 15, 15],\n",
161 | " [19, 16, 10]])"
162 | ]
163 | },
164 | "execution_count": 7,
165 | "metadata": {},
166 | "output_type": "execute_result"
167 | }
168 | ],
169 | "source": [
170 | "# Method 3\n",
171 | "np.row_stack([a, b])"
172 | ]
173 | },
174 | {
175 | "cell_type": "markdown",
176 | "metadata": {},
177 | "source": [
178 | "### 2. How to append new columns to the data"
179 | ]
180 | },
181 | {
182 | "cell_type": "code",
183 | "execution_count": 8,
184 | "metadata": {},
185 | "outputs": [],
186 | "source": [
187 | "a = np.arange(12).reshape(3,4)\n",
188 | "b = np.random.randint(10,20,size=(3,2))"
189 | ]
190 | },
191 | {
192 | "cell_type": "code",
193 | "execution_count": 9,
194 | "metadata": {},
195 | "outputs": [
196 | {
197 | "data": {
198 | "text/plain": [
199 | "array([[ 0, 1, 2, 3],\n",
200 | " [ 4, 5, 6, 7],\n",
201 | " [ 8, 9, 10, 11]])"
202 | ]
203 | },
204 | "execution_count": 9,
205 | "metadata": {},
206 | "output_type": "execute_result"
207 | }
208 | ],
209 | "source": [
210 | "a"
211 | ]
212 | },
213 | {
214 | "cell_type": "code",
215 | "execution_count": 10,
216 | "metadata": {},
217 | "outputs": [
218 | {
219 | "data": {
220 | "text/plain": [
221 | "array([[16, 10],\n",
222 | " [12, 17],\n",
223 | " [12, 10]])"
224 | ]
225 | },
226 | "execution_count": 10,
227 | "metadata": {},
228 | "output_type": "execute_result"
229 | }
230 | ],
231 | "source": [
232 | "b"
233 | ]
234 | },
235 | {
236 | "cell_type": "code",
237 | "execution_count": 11,
238 | "metadata": {},
239 | "outputs": [
240 | {
241 | "data": {
242 | "text/plain": [
243 | "array([[ 0, 1, 2, 3, 16, 10],\n",
244 | " [ 4, 5, 6, 7, 12, 17],\n",
245 | " [ 8, 9, 10, 11, 12, 10]])"
246 | ]
247 | },
248 | "execution_count": 11,
249 | "metadata": {},
250 | "output_type": "execute_result"
251 | }
252 | ],
253 | "source": [
254 | "# Method 1\n",
255 | "np.concatenate([a,b], axis=1)"
256 | ]
257 | },
258 | {
259 | "cell_type": "code",
260 | "execution_count": 12,
261 | "metadata": {},
262 | "outputs": [
263 | {
264 | "data": {
265 | "text/plain": [
266 | "array([[ 0, 1, 2, 3, 16, 10],\n",
267 | " [ 4, 5, 6, 7, 12, 17],\n",
268 | " [ 8, 9, 10, 11, 12, 10]])"
269 | ]
270 | },
271 | "execution_count": 12,
272 | "metadata": {},
273 | "output_type": "execute_result"
274 | }
275 | ],
276 | "source": [
277 | "# Method 2\n",
278 | "np.hstack([a,b])"
279 | ]
280 | },
281 | {
282 | "cell_type": "code",
283 | "execution_count": 13,
284 | "metadata": {},
285 | "outputs": [
286 | {
287 | "data": {
288 | "text/plain": [
289 | "array([[ 0, 1, 2, 3, 16, 10],\n",
290 | " [ 4, 5, 6, 7, 12, 17],\n",
291 | " [ 8, 9, 10, 11, 12, 10]])"
292 | ]
293 | },
294 | "execution_count": 13,
295 | "metadata": {},
296 | "output_type": "execute_result"
297 | }
298 | ],
299 | "source": [
300 | "# Method 3\n",
301 | "np.column_stack([a,b])"
302 | ]
303 | }
304 | ],
305 | "metadata": {
306 | "kernelspec": {
307 | "display_name": "Python 3",
308 | "language": "python",
309 | "name": "python3"
310 | },
311 | "language_info": {
312 | "codemirror_mode": {
313 | "name": "ipython",
314 | "version": 3
315 | },
316 | "file_extension": ".py",
317 | "mimetype": "text/x-python",
318 | "name": "python",
319 | "nbconvert_exporter": "python",
320 | "pygments_lexer": "ipython3",
321 | "version": "3.7.6"
322 | }
323 | },
324 | "nbformat": 4,
325 | "nbformat_minor": 4
326 | }
327 |
--------------------------------------------------------------------------------
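Notebook 09 above only merges 2-D arrays, where `np.hstack` and `np.column_stack` give the same result. The sketch below, with made-up 1-D inputs, illustrates where the two differ, and checks that `np.concatenate` with an explicit `axis` matches the stacking helpers in the 2-D case.

```python
import numpy as np

# Made-up 1-D inputs (not from the notebook).
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

h = np.hstack([x, y])        # joins end to end: shape (6,)
c = np.column_stack([x, y])  # treats each 1-D input as a column: shape (3, 2)
v = np.vstack([x, y])        # treats each 1-D input as a row: shape (2, 3)
print(h.shape, c.shape, v.shape)

# For 2-D inputs with matching row counts, the column-wise helpers agree.
a = np.arange(12).reshape(3, 4)
b = np.arange(100, 106).reshape(3, 2)
same_columns = (
    np.array_equal(np.hstack([a, b]), np.concatenate([a, b], axis=1))
    and np.array_equal(np.column_stack([a, b]), np.concatenate([a, b], axis=1))
)

# Row-wise: vstack matches concatenate along axis 0 when column counts match.
a2 = np.arange(6).reshape(2, 3)
b2 = np.arange(10, 22).reshape(4, 3)
same_rows = np.array_equal(np.vstack([a2, b2]), np.concatenate([a2, b2], axis=0))
print(same_columns, same_rows)
```

When feature vectors arrive as separate 1-D arrays, `np.column_stack` is usually the helper that yields the (samples, features) layout directly.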
/.ipynb_checkpoints/10. Numpy怎样对数组排序-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
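7 | "## Numpy: how to sort an array\n",
8 | "\n",
9 | "Three ways to sort an array with Numpy:  \n",
10 | "* numpy.sort: returns a sorted copy of the array\n",
11 | "* array.sort: sorts the array in place instead of returning a copy\n",
12 | "* numpy.argsort: indirect sort; returns the indices that would sort the array\n",
13 | "\n",
14 | "All three accept a kind parameter, which takes one of these values:\n",
15 | "* quicksort: quick sort, O(n log n) on average, not stable\n",
16 | "* mergesort: merge sort, O(n log n) on average, stable\n",
17 | "* heapsort: heap sort, O(n log n) on average, not stable\n",
18 | "* stable: a stable sort\n",
19 | "\n",
20 | "kind defaults to quicksort, which is the fastest on average, so the default is usually fine"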
21 | ]
22 | },
23 | {
24 | "cell_type": "code",
25 | "execution_count": 1,
26 | "metadata": {},
27 | "outputs": [],
28 | "source": [
29 | "import numpy as np"
30 | ]
31 | },
32 | {
33 | "cell_type": "markdown",
34 | "metadata": {},
35 | "source": [
36 | "### 1. np.sort returns the sorted array"
37 | ]
38 | },
39 | {
40 | "cell_type": "code",
41 | "execution_count": 2,
42 | "metadata": {},
43 | "outputs": [],
44 | "source": [
45 | "arr = np.array([3, 2, 4, 5, 1, 9, 7, 8, 6])"
46 | ]
47 | },
48 | {
49 | "cell_type": "code",
50 | "execution_count": 3,
51 | "metadata": {},
52 | "outputs": [
53 | {
54 | "data": {
55 | "text/plain": [
56 | "array([1, 2, 3, 4, 5, 6, 7, 8, 9])"
57 | ]
58 | },
59 | "execution_count": 3,
60 | "metadata": {},
61 | "output_type": "execute_result"
62 | }
63 | ],
64 | "source": [
65 | "# Returns a sorted copy of the array\n",
66 | "np.sort(arr)"
67 | ]
68 | },
69 | {
70 | "cell_type": "code",
71 | "execution_count": 4,
72 | "metadata": {},
73 | "outputs": [
74 | {
75 | "data": {
76 | "text/plain": [
77 | "array([3, 2, 4, 5, 1, 9, 7, 8, 6])"
78 | ]
79 | },
80 | "execution_count": 4,
81 | "metadata": {},
82 | "output_type": "execute_result"
83 | }
84 | ],
85 | "source": [
86 | "arr"
87 | ]
88 | },
89 | {
90 | "cell_type": "markdown",
91 | "metadata": {},
92 | "source": [
93 | "### 2. array.sort sorts in place"
94 | ]
95 | },
96 | {
97 | "cell_type": "code",
98 | "execution_count": 5,
99 | "metadata": {},
100 | "outputs": [],
101 | "source": [
102 | "arr2 = arr.copy()"
103 | ]
104 | },
105 | {
106 | "cell_type": "code",
107 | "execution_count": 6,
108 | "metadata": {},
109 | "outputs": [
110 | {
111 | "data": {
112 | "text/plain": [
113 | "array([3, 2, 4, 5, 1, 9, 7, 8, 6])"
114 | ]
115 | },
116 | "execution_count": 6,
117 | "metadata": {},
118 | "output_type": "execute_result"
119 | }
120 | ],
121 | "source": [
122 | "arr2"
123 | ]
124 | },
125 | {
126 | "cell_type": "code",
127 | "execution_count": 7,
128 | "metadata": {},
129 | "outputs": [],
130 | "source": [
131 | "arr2.sort()"
132 | ]
133 | },
134 | {
135 | "cell_type": "code",
136 | "execution_count": 8,
137 | "metadata": {},
138 | "outputs": [
139 | {
140 | "data": {
141 | "text/plain": [
142 | "array([1, 2, 3, 4, 5, 6, 7, 8, 9])"
143 | ]
144 | },
145 | "execution_count": 8,
146 | "metadata": {},
147 | "output_type": "execute_result"
148 | }
149 | ],
150 | "source": [
151 | "arr2"
152 | ]
153 | },
154 | {
155 | "cell_type": "markdown",
156 | "metadata": {},
157 | "source": [
158 | "### 3. np.argsort returns the indices of the sorted elements"
159 | ]
160 | },
161 | {
162 | "cell_type": "code",
163 | "execution_count": 9,
164 | "metadata": {},
165 | "outputs": [
166 | {
167 | "data": {
168 | "text/plain": [
169 | "array([3, 2, 4, 5, 1, 9, 7, 8, 6])"
170 | ]
171 | },
172 | "execution_count": 9,
173 | "metadata": {},
174 | "output_type": "execute_result"
175 | }
176 | ],
177 | "source": [
178 | "arr"
179 | ]
180 | },
181 | {
182 | "cell_type": "code",
183 | "execution_count": 10,
184 | "metadata": {},
185 | "outputs": [
186 | {
187 | "data": {
188 | "text/plain": [
189 | "array([4, 1, 0, 2, 3, 8, 6, 7, 5], dtype=int64)"
190 | ]
191 | },
192 | "execution_count": 10,
193 | "metadata": {},
194 | "output_type": "execute_result"
195 | }
196 | ],
197 | "source": [
198 | "# Get the indices that would sort the array\n",
199 | "indices = np.argsort(arr)\n",
200 | "indices"
201 | ]
202 | },
203 | {
204 | "cell_type": "code",
205 | "execution_count": 11,
206 | "metadata": {
207 | "scrolled": true
208 | },
209 | "outputs": [
210 | {
211 | "data": {
212 | "text/plain": [
213 | "array([1, 2, 3, 4, 5, 6, 7, 8, 9])"
214 | ]
215 | },
216 | "execution_count": 11,
217 | "metadata": {},
218 | "output_type": "execute_result"
219 | }
220 | ],
221 | "source": [
222 | "# Index with them to get the sorted values directly\n",
223 | "arr[indices]"
224 | ]
225 | },
226 | {
227 | "cell_type": "markdown",
228 | "metadata": {},
229 | "source": [
230 | "### 4. Performance: Python's built-in sorted vs. np.sort"
231 | ]
232 | },
233 | {
234 | "cell_type": "code",
235 | "execution_count": 12,
236 | "metadata": {},
237 | "outputs": [],
238 | "source": [
239 | "arr_np = np.random.randint(0, 100, 100*10000)"
240 | ]
241 | },
242 | {
243 | "cell_type": "code",
244 | "execution_count": 13,
245 | "metadata": {},
246 | "outputs": [
247 | {
248 | "name": "stdout",
249 | "output_type": "stream",
250 | "text": [
251 | "24 ms ± 2.14 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
252 | ]
253 | }
254 | ],
255 | "source": [
256 | "%timeit np.sort(arr_np)"
257 | ]
258 | },
259 | {
260 | "cell_type": "code",
261 | "execution_count": 14,
262 | "metadata": {},
263 | "outputs": [],
264 | "source": [
265 | "# Convert the NumPy array to a Python list\n",
266 | "arr_py = arr_np.tolist()"
267 | ]
268 | },
269 | {
270 | "cell_type": "code",
271 | "execution_count": 15,
272 | "metadata": {},
273 | "outputs": [
274 | {
275 | "name": "stdout",
276 | "output_type": "stream",
277 | "text": [
278 | "90.1 ms ± 726 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
279 | ]
280 | }
281 | ],
282 | "source": [
283 | "%timeit sorted(arr_py)"
284 | ]
285 | },
286 | {
287 | "cell_type": "code",
288 | "execution_count": null,
289 | "metadata": {},
290 | "outputs": [],
291 | "source": []
292 | }
293 | ],
294 | "metadata": {
295 | "kernelspec": {
296 | "display_name": "Python 3",
297 | "language": "python",
298 | "name": "python3"
299 | },
300 | "language_info": {
301 | "codemirror_mode": {
302 | "name": "ipython",
303 | "version": 3
304 | },
305 | "file_extension": ".py",
306 | "mimetype": "text/x-python",
307 | "name": "python",
308 | "nbconvert_exporter": "python",
309 | "pygments_lexer": "ipython3",
310 | "version": "3.7.6"
311 | }
312 | },
313 | "nbformat": 4,
314 | "nbformat_minor": 4
315 | }
316 |
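The kind parameter described in this notebook matters most when equal values must keep their original relative order. The sketch below is one illustrative way to see that with np.argsort; the array `scores` and the other variable names are made up for this example and do not come from the notebook.

```python
import numpy as np

# Hypothetical data with repeated values, so stability is observable.
scores = np.array([3, 1, 3, 2, 1])

# A stable sort keeps equal elements in their original order:
# the two 1s stay as (index 1, index 4), the two 3s as (index 0, index 2).
order = np.argsort(scores, kind="stable")
sorted_scores = scores[order]

# Descending order: sort the negated values, then take the first two indices.
top2 = np.argsort(-scores, kind="stable")[:2]

print(order)          # indices that sort the array
print(sorted_scores)  # the values in ascending order
print(top2)           # indices of the two largest values
```

Negating the array before calling argsort is a common idiom for descending order; it assumes a numeric dtype.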
--------------------------------------------------------------------------------
/.ipynb_checkpoints/12. Numpy中重要的广播概念-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Broadcasting: an important NumPy concept"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
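14 | "***Broadcasting:*** \n",
15 | "Put simply, a set of rules for applying binary universal functions (addition, subtraction, multiplication, and so on) to arrays of different shapes\n",
16 | "\n",
17 | "***Broadcasting rules:***\n",
18 | "1. If the two arrays have a different number of dimensions, the shape of the array with fewer dimensions is padded with 1s on the left\n",
19 | "2. If the sizes differ in some dimension but one of them is 1, the size-1 dimension is stretched to match the other array;\n",
20 | "3. If the sizes differ in some dimension and neither of them is 1, broadcasting fails and an error is raised;"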
21 | ]
22 | },
23 | {
24 | "cell_type": "code",
25 | "execution_count": 1,
26 | "metadata": {},
27 | "outputs": [],
28 | "source": [
29 | "import numpy as np"
30 | ]
31 | },
32 | {
33 | "cell_type": "markdown",
34 | "metadata": {},
35 | "source": [
36 | "### Example 1: adding a 1-D array to a 2-D array"
37 | ]
38 | },
39 | {
40 | "cell_type": "code",
41 | "execution_count": 2,
42 | "metadata": {},
43 | "outputs": [
44 | {
45 | "data": {
46 | "text/plain": [
47 | "array([[1., 1., 1.],\n",
48 | " [1., 1., 1.]])"
49 | ]
50 | },
51 | "execution_count": 2,
52 | "metadata": {},
53 | "output_type": "execute_result"
54 | }
55 | ],
56 | "source": [
57 | "a = np.ones((2,3))\n",
58 | "a"
59 | ]
60 | },
61 | {
62 | "cell_type": "code",
63 | "execution_count": 3,
64 | "metadata": {},
65 | "outputs": [
66 | {
67 | "data": {
68 | "text/plain": [
69 | "array([0, 1, 2])"
70 | ]
71 | },
72 | "execution_count": 3,
73 | "metadata": {},
74 | "output_type": "execute_result"
75 | }
76 | ],
77 | "source": [
78 | "b = np.arange(3)\n",
79 | "b"
80 | ]
81 | },
82 | {
83 | "cell_type": "code",
84 | "execution_count": 4,
85 | "metadata": {},
86 | "outputs": [
87 | {
88 | "data": {
89 | "text/plain": [
90 | "((2, 3), (3,))"
91 | ]
92 | },
93 | "execution_count": 4,
94 | "metadata": {},
95 | "output_type": "execute_result"
96 | }
97 | ],
98 | "source": [
99 | "a.shape, b.shape"
100 | ]
101 | },
102 | {
103 | "cell_type": "code",
104 | "execution_count": 5,
105 | "metadata": {},
106 | "outputs": [
107 | {
108 | "data": {
109 | "text/plain": [
110 | "array([[1., 2., 3.],\n",
111 | " [1., 2., 3.]])"
112 | ]
113 | },
114 | "execution_count": 5,
115 | "metadata": {},
116 | "output_type": "execute_result"
117 | }
118 | ],
119 | "source": [
120 | "# The shapes differ, but the arrays can still be added\n",
121 | "a + b"
122 | ]
123 | },
124 | {
125 | "cell_type": "markdown",
126 | "metadata": {},
127 | "source": [
128 | "***Analysis: a.shape=(2, 3), b.shape=(3,)***\n",
129 | "1. By rule 1, b.shape becomes (1, 3)\n",
130 | "2. By rule 2, b.shape is then stretched to (2, 3), as if its single row were copied\n",
131 | "3. The shapes now match"
132 | ]
133 | },
134 | {
135 | "cell_type": "markdown",
136 | "metadata": {},
137 | "source": [
138 | "### Example 2: both arrays need broadcasting"
139 | ]
140 | },
141 | {
142 | "cell_type": "code",
143 | "execution_count": 6,
144 | "metadata": {},
145 | "outputs": [
146 | {
147 | "data": {
148 | "text/plain": [
149 | "array([[0],\n",
150 | " [1],\n",
151 | " [2]])"
152 | ]
153 | },
154 | "execution_count": 6,
155 | "metadata": {},
156 | "output_type": "execute_result"
157 | }
158 | ],
159 | "source": [
160 | "a = np.arange(3).reshape((3, 1))\n",
161 | "a"
162 | ]
163 | },
164 | {
165 | "cell_type": "code",
166 | "execution_count": 7,
167 | "metadata": {},
168 | "outputs": [
169 | {
170 | "data": {
171 | "text/plain": [
172 | "array([0, 1, 2])"
173 | ]
174 | },
175 | "execution_count": 7,
176 | "metadata": {},
177 | "output_type": "execute_result"
178 | }
179 | ],
180 | "source": [
181 | "b = np.arange(3)\n",
182 | "b"
183 | ]
184 | },
185 | {
186 | "cell_type": "code",
187 | "execution_count": 8,
188 | "metadata": {},
189 | "outputs": [
190 | {
191 | "data": {
192 | "text/plain": [
193 | "((3, 1), (3,))"
194 | ]
195 | },
196 | "execution_count": 8,
197 | "metadata": {},
198 | "output_type": "execute_result"
199 | }
200 | ],
201 | "source": [
202 | "a.shape, b.shape"
203 | ]
204 | },
205 | {
206 | "cell_type": "code",
207 | "execution_count": 9,
208 | "metadata": {
209 | "scrolled": true
210 | },
211 | "outputs": [
212 | {
213 | "data": {
214 | "text/plain": [
215 | "array([[0, 1, 2],\n",
216 | " [1, 2, 3],\n",
217 | " [2, 3, 4]])"
218 | ]
219 | },
220 | "execution_count": 9,
221 | "metadata": {},
222 | "output_type": "execute_result"
223 | }
224 | ],
225 | "source": [
226 | "a + b"
227 | ]
228 | },
229 | {
230 | "cell_type": "markdown",
231 | "metadata": {},
232 | "source": [
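233 | "***Analysis: a.shape is (3, 1), b.shape is (3,)***\n",
234 | "1. By rule 1, b.shape becomes (1, 3)\n",
235 | "2. By rule 2, b.shape is then stretched to (3, 3), as if its single row were copied\n",
236 | "3. By rule 2, a.shape is also stretched to (3, 3), as if its single column were copied\n",
237 | "4. The shapes now match"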
238 | ]
239 | },
240 | {
241 | "cell_type": "markdown",
242 | "metadata": {},
243 | "source": [
244 | "### Example 3: incompatible shapes"
245 | ]
246 | },
247 | {
248 | "cell_type": "code",
249 | "execution_count": 10,
250 | "metadata": {},
251 | "outputs": [
252 | {
253 | "data": {
254 | "text/plain": [
255 | "array([[1., 1.],\n",
256 | " [1., 1.],\n",
257 | " [1., 1.]])"
258 | ]
259 | },
260 | "execution_count": 10,
261 | "metadata": {},
262 | "output_type": "execute_result"
263 | }
264 | ],
265 | "source": [
266 | "a = np.ones((3,2))\n",
267 | "a"
268 | ]
269 | },
270 | {
271 | "cell_type": "code",
272 | "execution_count": 11,
273 | "metadata": {},
274 | "outputs": [
275 | {
276 | "data": {
277 | "text/plain": [
278 | "array([0, 1, 2])"
279 | ]
280 | },
281 | "execution_count": 11,
282 | "metadata": {},
283 | "output_type": "execute_result"
284 | }
285 | ],
286 | "source": [
287 | "b = np.arange(3)\n",
288 | "b"
289 | ]
290 | },
291 | {
292 | "cell_type": "code",
293 | "execution_count": 12,
294 | "metadata": {},
295 | "outputs": [
296 | {
297 | "data": {
298 | "text/plain": [
299 | "((3, 2), (3,))"
300 | ]
301 | },
302 | "execution_count": 12,
303 | "metadata": {},
304 | "output_type": "execute_result"
305 | }
306 | ],
307 | "source": [
308 | "a.shape, b.shape"
309 | ]
310 | },
311 | {
312 | "cell_type": "code",
313 | "execution_count": 13,
314 | "metadata": {},
315 | "outputs": [
316 | {
317 | "ename": "ValueError",
318 | "evalue": "operands could not be broadcast together with shapes (3,2) (3,) ",
319 | "output_type": "error",
320 | "traceback": [
321 | "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
322 | "\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)",
323 | "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0ma\u001b[0m \u001b[1;33m+\u001b[0m \u001b[0mb\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
324 | "\u001b[1;31mValueError\u001b[0m: operands could not be broadcast together with shapes (3,2) (3,) "
325 | ]
326 | }
327 | ],
328 | "source": [
329 | "a + b"
330 | ]
331 | },
332 | {
333 | "cell_type": "markdown",
334 | "metadata": {},
335 | "source": [
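336 | "***Analysis: a.shape is (3, 2), b.shape is (3,)***\n",
337 | "1. By rule 1, b.shape becomes (1, 3)\n",
338 | "2. By rule 2, b.shape is then stretched to (3, 3), as if its single row were copied\n",
339 | "3. By rule 3, the last dimensions are 2 and 3 and neither is 1, so broadcasting fails with an error"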
340 | ]
341 | }
342 | ],
343 | "metadata": {
344 | "kernelspec": {
345 | "display_name": "Python 3",
346 | "language": "python",
347 | "name": "python3"
348 | },
349 | "language_info": {
350 | "codemirror_mode": {
351 | "name": "ipython",
352 | "version": 3
353 | },
354 | "file_extension": ".py",
355 | "mimetype": "text/x-python",
356 | "name": "python",
357 | "nbconvert_exporter": "python",
358 | "pygments_lexer": "ipython3",
359 | "version": "3.7.6"
360 | }
361 | },
362 | "nbformat": 4,
363 | "nbformat_minor": 4
364 | }
365 |
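Example 3 in this notebook fails because b is padded on the left to (1, 3). If the intent is to add b down the rows of a instead, one possible fix is to give b an explicit trailing axis so that rule 2 applies. The sketch below assumes that intent; the names `failed` and `result` are illustrative only.

```python
import numpy as np

a = np.ones((3, 2))
b = np.arange(3)

# a + b fails: after padding, the shapes are (3, 2) and (1, 3),
# and the last dimensions (2 and 3) differ with neither equal to 1.
try:
    a + b
    failed = False
except ValueError:
    failed = True

# Turning b into a column of shape (3, 1) lets its size-1 axis stretch to 2.
result = a + b[:, np.newaxis]

print(failed)
print(result.shape)
```

Reshaping with np.newaxis does not copy the data; it only changes how the existing values are indexed.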
--------------------------------------------------------------------------------
/.ipynb_checkpoints/13. Numpy求解线性方程组-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Solving a system of linear equations with NumPy\n",
8 | "\n",
9 | "Given Ax=b with A and b known, how do we compute x?"
10 | ]
11 | },
12 | {
13 | "cell_type": "markdown",
14 | "metadata": {},
15 | "source": [
16 | "### 1. Import the package"
17 | ]
18 | },
19 | {
20 | "cell_type": "code",
21 | "execution_count": 1,
22 | "metadata": {},
23 | "outputs": [],
24 | "source": [
25 | "import numpy as np"
26 | ]
27 | },
28 | {
29 | "cell_type": "markdown",
30 | "metadata": {},
31 | "source": [
32 | "### 2. Solve"
33 | ]
34 | },
35 | {
36 | "cell_type": "code",
37 | "execution_count": 3,
38 | "metadata": {},
39 | "outputs": [
40 | {
41 | "data": {
42 | "text/plain": [
43 | "array([[ 1, -2, 1],\n",
44 | " [ 0, 2, -8],\n",
45 | " [-4, 5, 9]])"
46 | ]
47 | },
48 | "execution_count": 3,
49 | "metadata": {},
50 | "output_type": "execute_result"
51 | }
52 | ],
53 | "source": [
54 | "A = np.array(\n",
55 | " [\n",
56 | " [1, -2, 1],\n",
57 | " [0, 2, -8],\n",
58 | " [-4, 5, 9]\n",
59 | " ]\n",
60 | ")\n",
61 | "A"
62 | ]
63 | },
64 | {
65 | "cell_type": "code",
66 | "execution_count": 5,
67 | "metadata": {},
68 | "outputs": [
69 | {
70 | "data": {
71 | "text/plain": [
72 | "array([ 0, 8, -9])"
73 | ]
74 | },
75 | "execution_count": 5,
76 | "metadata": {},
77 | "output_type": "execute_result"
78 | }
79 | ],
80 | "source": [
81 | "b = np.array([0, 8, -9])\n",
82 | "b"
83 | ]
84 | },
85 | {
86 | "cell_type": "code",
87 | "execution_count": 7,
88 | "metadata": {},
89 | "outputs": [
90 | {
91 | "data": {
92 | "text/plain": [
93 | "array([29., 16., 3.])"
94 | ]
95 | },
96 | "execution_count": 7,
97 | "metadata": {},
98 | "output_type": "execute_result"
99 | }
100 | ],
101 | "source": [
102 | "# Call solve to compute x directly\n",
103 | "x = np.linalg.solve(A, b)\n",
104 | "x"
105 | ]
106 | },
107 | {
108 | "cell_type": "markdown",
109 | "metadata": {},
110 | "source": [
111 | "### 3. Verify"
112 | ]
113 | },
114 | {
115 | "cell_type": "code",
116 | "execution_count": 9,
117 | "metadata": {},
118 | "outputs": [
119 | {
120 | "data": {
121 | "text/plain": [
122 | "8.0"
123 | ]
124 | },
125 | "execution_count": 9,
126 | "metadata": {},
127 | "output_type": "execute_result"
128 | }
129 | ],
130 | "source": [
131 | "# Check a single equation\n",
132 | "A[1].dot(x)"
133 | ]
134 | },
135 | {
136 | "cell_type": "code",
137 | "execution_count": 11,
138 | "metadata": {},
139 | "outputs": [
140 | {
141 | "data": {
142 | "text/plain": [
143 | "array([ True, True, True])"
144 | ]
145 | },
146 | "execution_count": 11,
147 | "metadata": {},
148 | "output_type": "execute_result"
149 | }
150 | ],
151 | "source": [
152 | "# Check the whole system\n",
153 | "A.dot(x) == b"
154 | ]
155 | }
156 | ],
157 | "metadata": {
158 | "kernelspec": {
159 | "display_name": "Python 3",
160 | "language": "python",
161 | "name": "python3"
162 | },
163 | "language_info": {
164 | "codemirror_mode": {
165 | "name": "ipython",
166 | "version": 3
167 | },
168 | "file_extension": ".py",
169 | "mimetype": "text/x-python",
170 | "name": "python",
171 | "nbconvert_exporter": "python",
172 | "pygments_lexer": "ipython3",
173 | "version": "3.7.6"
174 | }
175 | },
176 | "nbformat": 4,
177 | "nbformat_minor": 4
178 | }
179 |
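The verification in this notebook compares A.dot(x) with b using ==, which happens to succeed here but can fail on other inputs because of floating-point rounding. A more tolerant check, sketched below with the same A and b, uses np.allclose; the names `ok` and `residual` are illustrative and not part of the notebook.

```python
import numpy as np

A = np.array([[1, -2, 1],
              [0, 2, -8],
              [-4, 5, 9]])
b = np.array([0, 8, -9])

x = np.linalg.solve(A, b)

# Compare within a small tolerance instead of requiring exact equality.
ok = np.allclose(A @ x, b)

# The residual norm gives a single number for how far A @ x is from b.
residual = np.linalg.norm(A @ x - b)

print(x, ok, residual)
```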
--------------------------------------------------------------------------------
/.ipynb_checkpoints/14. Numpy实现SVD矩阵分解-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## SVD matrix decomposition with NumPy"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
14 | "### 1. Import the package"
15 | ]
16 | },
17 | {
18 | "cell_type": "code",
19 | "execution_count": 1,
20 | "metadata": {},
21 | "outputs": [],
22 | "source": [
23 | "import numpy as np"
24 | ]
25 | },
26 | {
27 | "cell_type": "markdown",
28 | "metadata": {},
29 | "source": [
30 | "### 2. Decompose the matrix"
31 | ]
32 | },
33 | {
34 | "cell_type": "code",
35 | "execution_count": 2,
36 | "metadata": {},
37 | "outputs": [],
38 | "source": [
39 | "A = np.random.randint(1, 10, (8, 4))"
40 | ]
41 | },
42 | {
43 | "cell_type": "code",
44 | "execution_count": 3,
45 | "metadata": {},
46 | "outputs": [
47 | {
48 | "data": {
49 | "text/plain": [
50 | "array([[6, 5, 1, 5],\n",
51 | " [1, 7, 9, 7],\n",
52 | " [7, 2, 4, 2],\n",
53 | " [6, 4, 3, 5],\n",
54 | " [2, 8, 8, 6],\n",
55 | " [5, 2, 8, 6],\n",
56 | " [7, 8, 2, 3],\n",
57 | " [1, 3, 6, 9]])"
58 | ]
59 | },
60 | "execution_count": 3,
61 | "metadata": {},
62 | "output_type": "execute_result"
63 | }
64 | ],
65 | "source": [
66 | "A"
67 | ]
68 | },
69 | {
70 | "cell_type": "code",
71 | "execution_count": 4,
72 | "metadata": {},
73 | "outputs": [],
74 | "source": [
75 | "# Compute the SVD (the third value returned is the transpose of V)\n",
76 | "U, S, V = np.linalg.svd(A, full_matrices=False)"
77 | ]
78 | },
79 | {
80 | "cell_type": "code",
81 | "execution_count": 5,
82 | "metadata": {},
83 | "outputs": [
84 | {
85 | "data": {
86 | "text/plain": [
87 | "((8, 4), (4,), (4, 4))"
88 | ]
89 | },
90 | "execution_count": 5,
91 | "metadata": {},
92 | "output_type": "execute_result"
93 | }
94 | ],
95 | "source": [
96 | "U.shape, S.shape, V.shape"
97 | ]
98 | },
99 | {
100 | "cell_type": "code",
101 | "execution_count": 6,
102 | "metadata": {},
103 | "outputs": [
104 | {
105 | "data": {
106 | "text/plain": [
107 | "array([[-0.28611227, -0.38768744, -0.07088588, -0.47757145],\n",
108 | " [-0.44374671, 0.40390585, -0.25458601, 0.20383531],\n",
109 | " [-0.24657791, -0.34884357, 0.43054458, 0.4062272 ],\n",
110 | " [-0.30673084, -0.27495123, 0.14797683, -0.2218886 ],\n",
111 | " [-0.43671345, 0.23339125, -0.39431663, 0.27599841],\n",
112 | " [-0.37257929, 0.10313032, 0.59362412, 0.23542645],\n",
113 | " [-0.33314069, -0.52514475, -0.41727103, 0.07285924],\n",
114 | " [-0.35472167, 0.38520663, 0.20225001, -0.61580222]])"
115 | ]
116 | },
117 | "execution_count": 6,
118 | "metadata": {},
119 | "output_type": "execute_result"
120 | }
121 | ],
122 | "source": [
123 | "U"
124 | ]
125 | },
126 | {
127 | "cell_type": "code",
128 | "execution_count": 7,
129 | "metadata": {},
130 | "outputs": [
131 | {
132 | "data": {
133 | "text/plain": [
134 | "array([28.44730142, 10.24874824, 6.39012419, 4.56952014])"
135 | ]
136 | },
137 | "execution_count": 7,
138 | "metadata": {},
139 | "output_type": "execute_result"
140 | }
141 | ],
142 | "source": [
143 | "# S is a diagonal matrix, so only its diagonal is returned, as a 1-D array\n",
144 | "S"
145 | ]
146 | },
147 | {
148 | "cell_type": "code",
149 | "execution_count": 8,
150 | "metadata": {},
151 | "outputs": [
152 | {
153 | "data": {
154 | "text/plain": [
155 | "array([[28.44730142, 0. , 0. , 0. ],\n",
156 | " [ 0. , 10.24874824, 0. , 0. ],\n",
157 | " [ 0. , 0. , 6.39012419, 0. ],\n",
158 | " [ 0. , 0. , 0. , 4.56952014]])"
159 | ]
160 | },
161 | "execution_count": 8,
162 | "metadata": {},
163 | "output_type": "execute_result"
164 | }
165 | ],
166 | "source": [
167 | "np.diag(S)"
168 | ]
169 | },
170 | {
171 | "cell_type": "code",
172 | "execution_count": 9,
173 | "metadata": {},
174 | "outputs": [
175 | {
176 | "data": {
177 | "text/plain": [
178 | "array([[-0.39194862, -0.50004828, -0.54329548, -0.54877866],\n",
179 | " [-0.81202147, -0.18350883, 0.48594814, 0.26608277],\n",
180 | " [ 0.41980592, -0.84227439, 0.27814277, 0.19228481],\n",
181 | " [ 0.10373231, 0.08276523, 0.62555658, -0.76880979]])"
182 | ]
183 | },
184 | "execution_count": 9,
185 | "metadata": {},
186 | "output_type": "execute_result"
187 | }
188 | ],
189 | "source": [
190 | "V"
191 | ]
192 | },
193 | {
194 | "cell_type": "markdown",
195 | "metadata": {},
196 | "source": [
197 | "### 3. Reconstruct the matrix from its factors"
198 | ]
199 | },
200 | {
201 | "cell_type": "code",
202 | "execution_count": 10,
203 | "metadata": {},
204 | "outputs": [
205 | {
206 | "data": {
207 | "text/plain": [
208 | "array([[6., 5., 1., 5.],\n",
209 | " [1., 7., 9., 7.],\n",
210 | " [7., 2., 4., 2.],\n",
211 | " [6., 4., 3., 5.],\n",
212 | " [2., 8., 8., 6.],\n",
213 | " [5., 2., 8., 6.],\n",
214 | " [7., 8., 2., 3.],\n",
215 | " [1., 3., 6., 9.]])"
216 | ]
217 | },
218 | "execution_count": 10,
219 | "metadata": {},
220 | "output_type": "execute_result"
221 | }
222 | ],
223 | "source": [
224 | "U @ np.diag(S) @ V"
225 | ]
226 | },
227 | {
228 | "cell_type": "code",
229 | "execution_count": null,
230 | "metadata": {},
231 | "outputs": [],
232 | "source": []
233 | }
234 | ],
235 | "metadata": {
236 | "kernelspec": {
237 | "display_name": "Python 3",
238 | "language": "python",
239 | "name": "python3"
240 | },
241 | "language_info": {
242 | "codemirror_mode": {
243 | "name": "ipython",
244 | "version": 3
245 | },
246 | "file_extension": ".py",
247 | "mimetype": "text/x-python",
248 | "name": "python",
249 | "nbconvert_exporter": "python",
250 | "pygments_lexer": "ipython3",
251 | "version": "3.7.6"
252 | }
253 | },
254 | "nbformat": 4,
255 | "nbformat_minor": 4
256 | }
257 |
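Beyond exact reconstruction, a common use of the factors is a low-rank approximation that keeps only the largest singular values. The sketch below reuses the 8x4 matrix printed in this notebook; the rank k = 2 is an arbitrary choice for illustration, and the name `Vh` (for the transposed factor that the notebook calls V) is this example's own.

```python
import numpy as np

# The matrix shown in the notebook output, as floats.
A = np.array([[6, 5, 1, 5],
              [1, 7, 9, 7],
              [7, 2, 4, 2],
              [6, 4, 3, 5],
              [2, 8, 8, 6],
              [5, 2, 8, 6],
              [7, 8, 2, 3],
              [1, 3, 6, 9]], dtype=float)

U, S, Vh = np.linalg.svd(A, full_matrices=False)

# Keep only the k largest singular values and the matching vectors.
k = 2
A_k = U[:, :k] @ np.diag(S[:k]) @ Vh[:k, :]

# The Frobenius error equals the root of the sum of the discarded
# squared singular values.
err = np.linalg.norm(A - A_k)
expected = np.sqrt(np.sum(S[k:] ** 2))

print(err, expected)
```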
--------------------------------------------------------------------------------
/.ipynb_checkpoints/17. Numpy计算逆矩阵求解线性方程组-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Solving a linear system with the inverse matrix in NumPy"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
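14 | "For a system of linear equations such as:\n",
15 | "* x + y + z = 6\n",
16 | "* 2y + 5z = -4\n",
17 | "* 2x + 5y - z = 27\n",
18 | "\n",
19 | "it can be written in matrix form:\n",
20 | "$$\\begin{bmatrix}1 & 1 & 1\\\\ 0 & 2 & 5\\\\ 2 & 5 & -1\\end{bmatrix}\\begin{bmatrix}x\\\\ y\\\\ z\\end{bmatrix}=\\begin{bmatrix}6\\\\ -4\\\\ 27\\end{bmatrix}$$\n",
21 | "\n",
22 | "As a formula this is Ax=b, where A is a matrix and x and b are column vectors\n",
23 | "\n",
24 | "***Definition of the inverse matrix:*** \n",
25 | "Let A be an n×n matrix over a number field. If there exists another n×n matrix B such that AB=BA=E, then B is called the inverse of A, and A is called an invertible matrix. Note: E is the identity matrix.\n",
26 | "\n",
27 | "***Solving a linear system with the inverse matrix:*** \n",
28 | "Multiply both sides on the left by $A^{-1}$ to get $A^{-1}$Ax=$A^{-1}$b. Since $A^{-1}$A=E and multiplying by the identity matrix leaves x unchanged, x=$A^{-1}$b"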
29 | ]
30 | },
31 | {
32 | "cell_type": "code",
33 | "execution_count": 1,
34 | "metadata": {},
35 | "outputs": [],
36 | "source": [
37 | "import numpy as np"
38 | ]
39 | },
40 | {
41 | "cell_type": "markdown",
42 | "metadata": {},
43 | "source": [
44 | "### 1. Compute the inverse matrix"
45 | ]
46 | },
47 | {
48 | "cell_type": "code",
49 | "execution_count": 2,
50 | "metadata": {},
51 | "outputs": [
52 | {
53 | "data": {
54 | "text/plain": [
55 | "array([[ 1, 1, 1],\n",
56 | " [ 0, 2, 5],\n",
57 | " [ 2, 5, -1]])"
58 | ]
59 | },
60 | "execution_count": 2,
61 | "metadata": {},
62 | "output_type": "execute_result"
63 | }
64 | ],
65 | "source": [
66 | "A = np.array([\n",
67 | " [1,1,1],\n",
68 | " [0,2,5],\n",
69 | " [2,5,-1]\n",
70 | "])\n",
71 | "A"
72 | ]
73 | },
74 | {
75 | "cell_type": "code",
76 | "execution_count": 3,
77 | "metadata": {},
78 | "outputs": [
79 | {
80 | "data": {
81 | "text/plain": [
82 | "array([[ 1.28571429, -0.28571429, -0.14285714],\n",
83 | " [-0.47619048, 0.14285714, 0.23809524],\n",
84 | " [ 0.19047619, 0.14285714, -0.0952381 ]])"
85 | ]
86 | },
87 | "execution_count": 3,
88 | "metadata": {},
89 | "output_type": "execute_result"
90 | }
91 | ],
92 | "source": [
93 | "# B is the inverse of A\n",
94 | "B = np.linalg.inv(A)\n",
95 | "B"
96 | ]
97 | },
98 | {
99 | "cell_type": "markdown",
100 | "metadata": {},
101 | "source": [
102 | "### 2. Verify that the matrix times its inverse is the identity matrix"
103 | ]
104 | },
105 | {
106 | "cell_type": "code",
107 | "execution_count": 6,
108 | "metadata": {},
109 | "outputs": [
110 | {
111 | "data": {
112 | "text/plain": [
113 | "array([[ 1.00000000e+00, -2.77555756e-17, 2.77555756e-17],\n",
114 | " [ 0.00000000e+00, 1.00000000e+00, 0.00000000e+00],\n",
115 | " [-2.22044605e-16, 5.55111512e-17, 1.00000000e+00]])"
116 | ]
117 | },
118 | "execution_count": 6,
119 | "metadata": {},
120 | "output_type": "execute_result"
121 | }
122 | ],
123 | "source": [
124 | "A@B"
125 | ]
126 | },
127 | {
128 | "cell_type": "code",
129 | "execution_count": 7,
130 | "metadata": {},
131 | "outputs": [
132 | {
133 | "data": {
134 | "text/plain": [
135 | "array([[ 1.00000000e+00, -2.77555756e-17, 2.77555756e-17],\n",
136 | " [ 0.00000000e+00, 1.00000000e+00, 0.00000000e+00],\n",
137 | " [-2.22044605e-16, 5.55111512e-17, 1.00000000e+00]])"
138 | ]
139 | },
140 | "execution_count": 7,
141 | "metadata": {},
142 | "output_type": "execute_result"
143 | }
144 | ],
145 | "source": [
146 | "np.matmul(A, B)"
147 | ]
148 | },
149 | {
150 | "cell_type": "markdown",
151 | "metadata": {},
152 | "source": [
153 | "### 3. Verify the linear system"
154 | ]
155 | },
156 | {
157 | "cell_type": "code",
158 | "execution_count": 8,
159 | "metadata": {},
160 | "outputs": [],
161 | "source": [
162 | "# Build b from Ax=b\n",
163 | "b = np.array([6, -4, 27])"
164 | ]
165 | },
166 | {
167 | "cell_type": "code",
168 | "execution_count": 9,
169 | "metadata": {},
170 | "outputs": [],
171 | "source": [
172 | "# Solve for x with the inverse matrix\n",
173 | "x = B@b"
174 | ]
175 | },
176 | {
177 | "cell_type": "code",
178 | "execution_count": 10,
179 | "metadata": {},
180 | "outputs": [
181 | {
182 | "data": {
183 | "text/plain": [
184 | "array([ 5., 3., -2.])"
185 | ]
186 | },
187 | "execution_count": 10,
188 | "metadata": {},
189 | "output_type": "execute_result"
190 | }
191 | ],
192 | "source": [
193 | "x"
194 | ]
195 | },
196 | {
197 | "cell_type": "code",
198 | "execution_count": 12,
199 | "metadata": {},
200 | "outputs": [
201 | {
202 | "data": {
203 | "text/plain": [
204 | "array([ 6., -4., 27.])"
205 | ]
206 | },
207 | "execution_count": 12,
208 | "metadata": {},
209 | "output_type": "execute_result"
210 | }
211 | ],
212 | "source": [
213 | "# Check that A@x equals b\n",
214 | "A@x"
215 | ]
216 | }
217 | ],
218 | "metadata": {
219 | "kernelspec": {
220 | "display_name": "Python 3",
221 | "language": "python",
222 | "name": "python3"
223 | },
224 | "language_info": {
225 | "codemirror_mode": {
226 | "name": "ipython",
227 | "version": 3
228 | },
229 | "file_extension": ".py",
230 | "mimetype": "text/x-python",
231 | "name": "python",
232 | "nbconvert_exporter": "python",
233 | "pygments_lexer": "ipython3",
234 | "version": "3.7.6"
235 | }
236 | },
237 | "nbformat": 4,
238 | "nbformat_minor": 4
239 | }
240 |
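The product A@B in this notebook is only approximately the identity matrix: the off-diagonal entries are on the order of 1e-16 because of floating-point rounding. The sketch below shows one way to check this with a tolerance and to cross-check the inverse-based solution against np.linalg.solve; the names `is_identity`, `x_inv` and `x_solve` are illustrative only.

```python
import numpy as np

A = np.array([[1, 1, 1],
              [0, 2, 5],
              [2, 5, -1]])
b = np.array([6, -4, 27])

B = np.linalg.inv(A)

# Compare A @ B with the identity matrix within a small tolerance.
is_identity = np.allclose(A @ B, np.eye(3))

# Solve the system two ways and compare the results.
x_inv = B @ b
x_solve = np.linalg.solve(A, b)

print(is_identity, x_inv, x_solve)
```

For solving a single system, np.linalg.solve is generally preferred over forming the inverse explicitly, since it avoids an extra matrix product and is usually more accurate.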
--------------------------------------------------------------------------------
/.ipynb_checkpoints/18. Numpy怎样将数组读写到文件-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
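7 | "## How to read and write NumPy arrays to files\n",
8 | "\n",
9 | "This notebook covers how NumPy writes arrays to files, and loads them back, using its own built-in binary format;\n",
10 | "\n",
11 | "For text or tabular data, pandas is usually used for loading and processing rather than NumPy\n",
12 | "\n",
13 | "The main functions:\n",
14 | "1. np.load(filename): loads NumPy arrays from a .npy or .npz file\n",
15 | "2. np.save(filename, arr): saves a single NumPy array to a .npy file\n",
16 | "3. np.savez(filename, arra=arra, arrb=arrb): saves several NumPy arrays to an uncompressed .npz file\n",
17 | "4. np.savez_compressed(filename, arra=arra, arrb=arrb): saves several NumPy arrays to a compressed .npz file\n",
18 | "\n",
19 | ".npy and .npz are both binary formats, so they look like garbage when opened in a plain-text editor"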
20 | ]
21 | },
22 | {
23 | "cell_type": "code",
24 | "execution_count": 1,
25 | "metadata": {},
26 | "outputs": [],
27 | "source": [
28 | "import numpy as np"
29 | ]
30 | },
31 | {
32 | "cell_type": "markdown",
33 | "metadata": {},
34 | "source": [
35 | "### 1. Save and load a single array with np.save and np.load"
36 | ]
37 | },
38 | {
39 | "cell_type": "code",
40 | "execution_count": 2,
41 | "metadata": {},
42 | "outputs": [
43 | {
44 | "data": {
45 | "text/plain": [
46 | "array([[ 0, 1, 2, 3],\n",
47 | " [ 4, 5, 6, 7],\n",
48 | " [ 8, 9, 10, 11]])"
49 | ]
50 | },
51 | "execution_count": 2,
52 | "metadata": {},
53 | "output_type": "execute_result"
54 | }
55 | ],
56 | "source": [
57 | "a = np.arange(12).reshape(3,4)\n",
58 | "a"
59 | ]
60 | },
61 | {
62 | "cell_type": "code",
63 | "execution_count": 3,
64 | "metadata": {},
65 | "outputs": [],
66 | "source": [
67 | "np.save(\"arr_a.npy\", a)"
68 | ]
69 | },
70 | {
71 | "cell_type": "code",
72 | "execution_count": null,
73 | "metadata": {},
74 | "outputs": [],
75 | "source": []
76 | }
77 | ],
78 | "metadata": {
79 | "kernelspec": {
80 | "display_name": "Python 3",
81 | "language": "python",
82 | "name": "python3"
83 | },
84 | "language_info": {
85 | "codemirror_mode": {
86 | "name": "ipython",
87 | "version": 3
88 | },
89 | "file_extension": ".py",
90 | "mimetype": "text/x-python",
91 | "name": "python",
92 | "nbconvert_exporter": "python",
93 | "pygments_lexer": "ipython3",
94 | "version": "3.7.6"
95 | }
96 | },
97 | "nbformat": 4,
98 | "nbformat_minor": 4
99 | }
100 |
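The four functions listed at the top of this notebook can be exercised in one round trip. The sketch below is a minimal illustration, not the notebook's own code: it writes into a temporary directory (an assumption made so that no files are left behind), and the second array `b` and the keyword names `a` and `b` passed to np.savez are made up for the example.

```python
import os
import tempfile

import numpy as np

a = np.arange(12).reshape(3, 4)
b = np.linspace(0.0, 1.0, 5)  # hypothetical second array

with tempfile.TemporaryDirectory() as tmp:
    # Single array: .npy
    npy_path = os.path.join(tmp, "arr_a.npy")
    np.save(npy_path, a)
    a_loaded = np.load(npy_path)

    # Several arrays: .npz, each stored under the keyword it was saved with
    npz_path = os.path.join(tmp, "arr_ab.npz")
    np.savez(npz_path, a=a, b=b)
    with np.load(npz_path) as data:
        keys = sorted(data.files)
        b_loaded = data["b"]

print(a_loaded)
print(keys)
```

Loading an .npz file returns a lazy, dictionary-like object, so opening it in a with block makes sure the underlying file is closed.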
--------------------------------------------------------------------------------
/.ipynb_checkpoints/19. Numpy的结构化数组-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## NumPy structured arrays"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
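14 | "In general, every element of a NumPy array has the same data type, such as int or float; \n",
15 | "this is part of why NumPy is fast: the data is stored compactly in memory and is quick to read; \n",
16 | "\n",
17 | "But NumPy can also hold heterogeneous records, such as the data below: \n",
18 | "<table>\n",
19 | "    <tr>\n",
20 | "        <td>Name</td>\n",
21 | "        <td>Age</td>\n",
22 | "        <td>Weight</td>\n",
23 | "    </tr>\n",
24 | "    <tr>\n",
25 | "        <td>xiaowang</td>\n",
26 | "        <td>30</td>\n",
27 | "        <td>80.5</td>\n",
28 | "    </tr>\n",
29 | "    <tr>\n",
30 | "        <td>xiaoli</td>\n",
31 | "        <td>28</td>\n",
32 | "        <td>70.3</td>\n",
33 | "    </tr>\n",
34 | "    <tr>\n",
35 | "        <td>xiaotian</td>\n",
36 | "        <td>29</td>\n",
37 | "        <td>78.6</td>\n",
38 | "    </tr>\n",
39 | "</table>\n",
40 | "\n",
41 | "This is the NumPy structured array feature covered in this section; "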
42 | ]
43 | },
44 | {
45 | "cell_type": "code",
46 | "execution_count": 1,
47 | "metadata": {},
48 | "outputs": [],
49 | "source": [
50 | "import numpy as np"
51 | ]
52 | },
53 | {
54 | "cell_type": "markdown",
55 | "metadata": {},
56 | "source": [
57 | "### 1. A regular NumPy array has a single dtype"
58 | ]
59 | },
60 | {
61 | "cell_type": "code",
62 | "execution_count": 2,
63 | "metadata": {},
64 | "outputs": [
65 | {
66 | "data": {
67 | "text/plain": [
68 | "(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), dtype('int32'))"
69 | ]
70 | },
71 | "execution_count": 2,
72 | "metadata": {},
73 | "output_type": "execute_result"
74 | }
75 | ],
76 | "source": [
77 | "arr = np.arange(10)\n",
78 | "arr, arr.dtype"
79 | ]
80 | },
81 | {
82 | "cell_type": "code",
83 | "execution_count": 3,
84 | "metadata": {},
85 | "outputs": [
86 | {
87 | "data": {
88 | "text/plain": [
89 | "(array([[0.13813273, 0.69213455, 0.2869116 , 0.64065806],\n",
90 | " [0.5972653 , 0.42803843, 0.84914465, 0.0502318 ],\n",
91 | " [0.31351949, 0.87095862, 0.52867948, 0.83884873]]),\n",
92 | " dtype('float64'))"
93 | ]
94 | },
95 | "execution_count": 3,
96 | "metadata": {},
97 | "output_type": "execute_result"
98 | }
99 | ],
100 | "source": [
101 | "arr = np.random.rand(3, 4)\n",
102 | "arr, arr.dtype"
103 | ]
104 | },
105 | {
106 | "cell_type": "markdown",
107 | "metadata": {},
108 | "source": [
109 | "### 2. How to represent heterogeneous data in NumPy"
110 | ]
111 | },
112 | {
113 | "cell_type": "code",
114 | "execution_count": 4,
115 | "metadata": {},
116 | "outputs": [
117 | {
118 | "data": {
119 | "text/plain": [
120 | "dtype([('name', '= 29]"
314 | ]
315 | },
316 | {
317 | "cell_type": "code",
318 | "execution_count": 13,
319 | "metadata": {},
320 | "outputs": [
321 | {
322 | "data": {
323 | "text/plain": [
324 | "array([('xiaowang', 30, 80.5)],\n",
325 | " dtype=[('name', '= 29) & (my_arr[\"weight\"] > 80)]"
336 | ]
337 | },
338 | {
339 | "cell_type": "markdown",
340 | "metadata": {},
341 | "source": [
342 | "#### Element-wise computation on a single column"
343 | ]
344 | },
345 | {
346 | "cell_type": "code",
347 | "execution_count": 14,
348 | "metadata": {},
349 | "outputs": [
350 | {
351 | "data": {
352 | "text/plain": [
353 | "array([30, 28, 29])"
354 | ]
355 | },
356 | "execution_count": 14,
357 | "metadata": {},
358 | "output_type": "execute_result"
359 | }
360 | ],
361 | "source": [
362 | "my_arr[\"age\"]"
363 | ]
364 | },
365 | {
366 | "cell_type": "code",
367 | "execution_count": 15,
368 | "metadata": {},
369 | "outputs": [],
370 | "source": [
371 | "my_arr[\"age\"] += 1"
372 | ]
373 | },
374 | {
375 | "cell_type": "code",
376 | "execution_count": 16,
377 | "metadata": {},
378 | "outputs": [
379 | {
380 | "data": {
381 | "text/plain": [
382 | "array([31, 29, 30])"
383 | ]
384 | },
385 | "execution_count": 16,
386 | "metadata": {},
387 | "output_type": "execute_result"
388 | }
389 | ],
390 | "source": [
391 | "my_arr[\"age\"]"
392 | ]
393 | },
394 | {
395 | "cell_type": "markdown",
396 | "metadata": {},
397 | "source": [
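398 | "A final note: \n",
399 | "* For “heterogeneous data” like this, where each column has a different type, Pandas is the better tool;\n",
400 | "* NumPy structured arrays are still worth learning: you may never use them yourself, but you should be able to read code that does"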
401 | ]
402 | }
403 | ],
404 | "metadata": {
405 | "kernelspec": {
406 | "display_name": "Python 3",
407 | "language": "python",
408 | "name": "python3"
409 | },
410 | "language_info": {
411 | "codemirror_mode": {
412 | "name": "ipython",
413 | "version": 3
414 | },
415 | "file_extension": ".py",
416 | "mimetype": "text/x-python",
417 | "name": "python",
418 | "nbconvert_exporter": "python",
419 | "pygments_lexer": "ipython3",
420 | "version": "3.7.6"
421 | }
422 | },
423 | "nbformat": 4,
424 | "nbformat_minor": 4
425 | }
426 |
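The structured-array notebook above filters records with boolean masks on fields and then updates one field in place. A minimal standalone sketch of those two steps follows; the field names and the `xiaowang` row come from the notebook output, while the dtype codes and the other two rows are assumptions made up for illustration.

```python
import numpy as np

# Field names follow the notebook; the dtype codes and the second and third
# rows are hypothetical values chosen for this sketch.
person_dtype = np.dtype([("name", "U10"), ("age", "i4"), ("weight", "f8")])
my_arr = np.array(
    [("xiaowang", 30, 80.5), ("xiaoli", 28, 60.0), ("xiaozhang", 29, 75.0)],
    dtype=person_dtype,
)

# A boolean mask built from one or more fields selects whole records.
older = my_arr[my_arr["age"] >= 29]
older_and_heavy = my_arr[(my_arr["age"] >= 29) & (my_arr["weight"] > 80)]

# Accessing a single field returns a view, so in-place arithmetic
# modifies the underlying structured array.
my_arr["age"] += 1

print(older_and_heavy["name"], my_arr["age"])
```

Note that `older` was produced by boolean indexing, which copies, so it keeps the ages from before the in-place update.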
--------------------------------------------------------------------------------
/.ipynb_checkpoints/20. Numpy与Pandas数据的相互转换-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
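7 | "## Converting between NumPy and Pandas data\n",
8 | "\n",
9 | "Pandas is a very popular data analysis library built on top of NumPy; \n",
10 | "it provides powerful tools for processing and analyzing heterogeneous, tabular data.\n",
11 | "\n",
12 | "This section covers how to convert between NumPy and Pandas: \n",
13 | "1. how to pass NumPy arrays to a Pandas Series or DataFrame;\n",
14 | "2. how to convert a Pandas Series or DataFrame into a NumPy array"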
15 | ]
16 | },
17 | {
18 | "cell_type": "code",
19 | "execution_count": 1,
20 | "metadata": {},
21 | "outputs": [],
22 | "source": [
23 | "import numpy as np\n",
24 | "import pandas as pd"
25 | ]
26 | },
27 | {
28 | "cell_type": "markdown",
29 | "metadata": {},
30 | "source": [
31 | "### How to convert NumPy arrays into Pandas data structures"
32 | ]
33 | },
34 | {
35 | "cell_type": "markdown",
36 | "metadata": {},
37 | "source": [
38 | "#### How to turn a 1-D NumPy array into a Pandas Series"
39 | ]
40 | },
41 | {
42 | "cell_type": "code",
43 | "execution_count": 2,
44 | "metadata": {},
45 | "outputs": [
46 | {
47 | "data": {
48 | "text/plain": [
49 | "array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])"
50 | ]
51 | },
52 | "execution_count": 2,
53 | "metadata": {},
54 | "output_type": "execute_result"
55 | }
56 | ],
57 | "source": [
58 | "arr = np.arange(10)\n",
59 | "arr"
60 | ]
61 | },
62 | {
63 | "cell_type": "code",
64 | "execution_count": 3,
65 | "metadata": {},
66 | "outputs": [
67 | {
68 | "data": {
69 | "text/plain": [
70 | "0 0\n",
71 | "1 1\n",
72 | "2 2\n",
73 | "3 3\n",
74 | "4 4\n",
75 | "5 5\n",
76 | "6 6\n",
77 | "7 7\n",
78 | "8 8\n",
79 | "9 9\n",
80 | "dtype: int32"
81 | ]
82 | },
83 | "execution_count": 3,
84 | "metadata": {},
85 | "output_type": "execute_result"
86 | }
87 | ],
88 | "source": [
89 | "series = pd.Series(arr)\n",
90 | "series"
91 | ]
92 | },
93 | {
94 | "cell_type": "markdown",
95 | "metadata": {},
96 | "source": [
97 | "#### How to turn a 2-D NumPy array into a Pandas DataFrame"
98 | ]
99 | },
100 | {
101 | "cell_type": "code",
102 | "execution_count": 4,
103 | "metadata": {},
104 | "outputs": [
105 | {
106 | "data": {
107 | "text/plain": [
108 | "array([[3, 9, 6, 3],\n",
109 | " [4, 1, 8, 1],\n",
110 | " [2, 4, 4, 7],\n",
111 | " [4, 8, 4, 7],\n",
112 | " [8, 3, 9, 8]])"
113 | ]
114 | },
115 | "execution_count": 4,
116 | "metadata": {},
117 | "output_type": "execute_result"
118 | }
119 | ],
120 | "source": [
121 | "arr = np.random.randint(1, 10, size=(5, 4))\n",
122 | "arr"
123 | ]
124 | },
125 | {
126 | "cell_type": "code",
127 | "execution_count": 5,
128 | "metadata": {},
129 | "outputs": [
130 | {
131 | "data": {
197 | "text/plain": [
198 | " ca cb cc cd\n",
199 | "0 3 9 6 3\n",
200 | "1 4 1 8 1\n",
201 | "2 2 4 4 7\n",
202 | "3 4 8 4 7\n",
203 | "4 8 3 9 8"
204 | ]
205 | },
206 | "execution_count": 5,
207 | "metadata": {},
208 | "output_type": "execute_result"
209 | }
210 | ],
211 | "source": [
212 | "df = pd.DataFrame(arr, columns = [\"ca\", \"cb\", \"cc\", \"cd\"])\n",
213 | "df"
214 | ]
215 | },
216 | {
217 | "cell_type": "code",
218 | "execution_count": 6,
219 | "metadata": {},
220 | "outputs": [
221 | {
222 | "data": {
260 | "text/plain": [
261 | " ca cb cc cd\n",
262 | "4 8 3 9 8"
263 | ]
264 | },
265 | "execution_count": 6,
266 | "metadata": {},
267 | "output_type": "execute_result"
268 | }
269 | ],
270 | "source": [
271 | "df[df[\"ca\"] > 4]"
272 | ]
273 | },
274 | {
275 | "cell_type": "markdown",
276 | "metadata": {},
277 | "source": [
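278 | "### How to convert Pandas data structures into NumPy arrays\n",
279 | "\n",
280 | "* Method 1: the .values attribute\n",
281 | "* Method 2: the .to_numpy() method\n",
282 | "\n",
283 | "Use case: \n",
284 | "Scikit-Learn models, for example, take NumPy arrays as input. \n",
285 | "You can do the heavy preprocessing of the raw data in Pandas, then convert the result into a NumPy array to feed the model"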
286 | ]
287 | },
288 | {
289 | "cell_type": "markdown",
290 | "metadata": {},
291 | "source": [
292 | "#### Converting a Series to a NumPy array"
293 | ]
294 | },
295 | {
296 | "cell_type": "code",
297 | "execution_count": 7,
298 | "metadata": {},
299 | "outputs": [
300 | {
301 | "data": {
302 | "text/plain": [
303 | "0 0\n",
304 | "1 1\n",
305 | "2 2\n",
306 | "3 3\n",
307 | "4 4\n",
308 | "5 5\n",
309 | "6 6\n",
310 | "7 7\n",
311 | "8 8\n",
312 | "9 9\n",
313 | "dtype: int64"
314 | ]
315 | },
316 | "execution_count": 7,
317 | "metadata": {},
318 | "output_type": "execute_result"
319 | }
320 | ],
321 | "source": [
322 | "series = pd.Series(range(10))\n",
323 | "series"
324 | ]
325 | },
326 | {
327 | "cell_type": "code",
328 | "execution_count": 8,
329 | "metadata": {},
330 | "outputs": [
331 | {
332 | "data": {
333 | "text/plain": [
334 | "array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int64)"
335 | ]
336 | },
337 | "execution_count": 8,
338 | "metadata": {},
339 | "output_type": "execute_result"
340 | }
341 | ],
342 | "source": [
343 | "series.values"
344 | ]
345 | },
346 | {
347 | "cell_type": "code",
348 | "execution_count": 9,
349 | "metadata": {},
350 | "outputs": [
351 | {
352 | "data": {
353 | "text/plain": [
354 | "array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int64)"
355 | ]
356 | },
357 | "execution_count": 9,
358 | "metadata": {},
359 | "output_type": "execute_result"
360 | }
361 | ],
362 | "source": [
363 | "series.to_numpy()"
364 | ]
365 | },
366 | {
367 | "cell_type": "markdown",
368 | "metadata": {},
369 | "source": [
370 | "#### Converting a DataFrame to a NumPy array"
371 | ]
372 | },
373 | {
374 | "cell_type": "code",
375 | "execution_count": 10,
376 | "metadata": {},
377 | "outputs": [
378 | {
379 | "data": {
433 | "text/plain": [
434 | " feature_a feature_b feature_c\n",
435 | "0 11 12.23 45.23\n",
436 | "1 21 22.23 55.23\n",
437 | "2 31 32.23 65.23\n",
438 | "3 41 42.23 75.23"
439 | ]
440 | },
441 | "execution_count": 10,
442 | "metadata": {},
443 | "output_type": "execute_result"
444 | }
445 | ],
446 | "source": [
447 | "df = pd.DataFrame(\n",
448 | " [\n",
449 | " [11, 12.23, 45.23],\n",
450 | " [21, 22.23, 55.23],\n",
451 | " [31, 32.23, 65.23],\n",
452 | " [41, 42.23, 75.23]\n",
453 | " ],\n",
454 | " columns = [\"feature_a\", \"feature_b\", \"feature_c\"]\n",
455 | ")\n",
456 | "df"
457 | ]
458 | },
459 | {
460 | "cell_type": "code",
461 | "execution_count": 11,
462 | "metadata": {},
463 | "outputs": [
464 | {
465 | "data": {
466 | "text/plain": [
467 | "array([[11. , 12.23, 45.23],\n",
468 | " [21. , 22.23, 55.23],\n",
469 | " [31. , 32.23, 65.23],\n",
470 | " [41. , 42.23, 75.23]])"
471 | ]
472 | },
473 | "execution_count": 11,
474 | "metadata": {},
475 | "output_type": "execute_result"
476 | }
477 | ],
478 | "source": [
479 | "df.values"
480 | ]
481 | },
482 | {
483 | "cell_type": "code",
484 | "execution_count": 12,
485 | "metadata": {},
486 | "outputs": [
487 | {
488 | "data": {
489 | "text/plain": [
490 | "array([[11. , 12.23, 45.23],\n",
491 | " [21. , 22.23, 55.23],\n",
492 | " [31. , 32.23, 65.23],\n",
493 | " [41. , 42.23, 75.23]])"
494 | ]
495 | },
496 | "execution_count": 12,
497 | "metadata": {},
498 | "output_type": "execute_result"
499 | }
500 | ],
501 | "source": [
502 | "df.to_numpy()"
503 | ]
504 | },
505 | {
506 | "cell_type": "code",
507 | "execution_count": null,
508 | "metadata": {},
509 | "outputs": [],
510 | "source": []
511 | }
512 | ],
513 | "metadata": {
514 | "kernelspec": {
515 | "display_name": "Python 3",
516 | "language": "python",
517 | "name": "python3"
518 | },
519 | "language_info": {
520 | "codemirror_mode": {
521 | "name": "ipython",
522 | "version": 3
523 | },
524 | "file_extension": ".py",
525 | "mimetype": "text/x-python",
526 | "name": "python",
527 | "nbconvert_exporter": "python",
528 | "pygments_lexer": "ipython3",
529 | "version": "3.7.6"
530 | }
531 | },
532 | "nbformat": 4,
533 | "nbformat_minor": 4
534 | }
535 |
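The conversion notebook above uses both `.values` and `.to_numpy()`. The sketch below, using a shortened version of the notebook's DataFrame, is meant to make two points explicit: `.values` is an attribute rather than a method, and columns of mixed numeric types are upcast to one common dtype in the resulting array. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# First two rows of the notebook's DataFrame: one integer column, two float columns.
df = pd.DataFrame(
    [[11, 12.23, 45.23], [21, 22.23, 55.23]],
    columns=["feature_a", "feature_b", "feature_c"],
)

# .values is an attribute (no parentheses); .to_numpy() is a method.
from_values = df.values
from_method = df.to_numpy()

# A NumPy array has a single dtype, so the int column is upcast to float64.
print(from_method.dtype, np.array_equal(from_values, from_method))

# Round trip: the 2-D array becomes a DataFrame again once column names are supplied.
df_again = pd.DataFrame(from_method, columns=df.columns)
print(df_again.shape)
```

Because of the upcast, the round trip does not restore the original integer dtype of `feature_a`; if that matters, the column would need to be cast back explicitly.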
--------------------------------------------------------------------------------
/.ipynb_checkpoints/21. Numpy数据输入给Scikit-learn实现模型训练-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
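7 | "## Feeding NumPy data to scikit-learn for model training\n",
8 | "\n",
9 | "***The goal of this video is to demonstrate:*** \n",
10 | "how NumPy arrays interact with sklearn models, including splitting training and test sets, feeding data to a model, evaluating it, and making predictions\n",
11 | "\n",
12 | "For your own tasks, preprocess the data into this NumPy format first, then feed it to an sklearn model"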
13 | ]
14 | },
15 | {
16 | "cell_type": "code",
17 | "execution_count": 1,
18 | "metadata": {},
19 | "outputs": [],
20 | "source": [
21 | "import numpy as np\n",
22 | "# Use a dataset bundled with sklearn; these datasets are already NumPy arrays\n",
23 | "# Our own data can be processed into the same format and then fed to a model\n",
24 | "from sklearn import datasets\n",
25 | "# train_test_split splits the data into a training set and a test set\n",
26 | "from sklearn.model_selection import train_test_split\n",
27 | "# Use LinearRegression to train a linear regression model\n",
28 | "from sklearn.linear_model import LinearRegression"
29 | ]
30 | },
31 | {
32 | "cell_type": "markdown",
33 | "metadata": {},
34 | "source": [
35 | "### 1. Load the Boston housing dataset"
36 | ]
37 | },
38 | {
39 | "cell_type": "code",
40 | "execution_count": 2,
41 | "metadata": {},
42 | "outputs": [],
43 | "source": [
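44 | "# Load the dataset into the feature matrix data and the target vector target (load_boston was removed in scikit-learn 1.2, so this needs an older version)\n",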
45 | "data, target = datasets.load_boston(return_X_y=True)"
46 | ]
47 | },
48 | {
49 | "cell_type": "code",
50 | "execution_count": 3,
51 | "metadata": {},
52 | "outputs": [
53 | {
54 | "data": {
55 | "text/plain": [
56 | "(numpy.ndarray, numpy.ndarray)"
57 | ]
58 | },
59 | "execution_count": 3,
60 | "metadata": {},
61 | "output_type": "execute_result"
62 | }
63 | ],
64 | "source": [
65 | "type(data), type(target)"
66 | ]
67 | },
68 | {
69 | "cell_type": "code",
70 | "execution_count": 4,
71 | "metadata": {},
72 | "outputs": [
73 | {
74 | "data": {
75 | "text/plain": [
76 | "((506, 13), (506,))"
77 | ]
78 | },
79 | "execution_count": 4,
80 | "metadata": {},
81 | "output_type": "execute_result"
82 | }
83 | ],
84 | "source": [
85 | "data.shape, target.shape"
86 | ]
87 | },
88 | {
89 | "cell_type": "code",
90 | "execution_count": 5,
91 | "metadata": {},
92 | "outputs": [
93 | {
94 | "data": {
95 | "text/plain": [
96 | "array([[6.3200e-03, 1.8000e+01, 2.3100e+00, 0.0000e+00, 5.3800e-01,\n",
97 | " 6.5750e+00, 6.5200e+01, 4.0900e+00, 1.0000e+00, 2.9600e+02,\n",
98 | " 1.5300e+01, 3.9690e+02, 4.9800e+00],\n",
99 | " [2.7310e-02, 0.0000e+00, 7.0700e+00, 0.0000e+00, 4.6900e-01,\n",
100 | " 6.4210e+00, 7.8900e+01, 4.9671e+00, 2.0000e+00, 2.4200e+02,\n",
101 | " 1.7800e+01, 3.9690e+02, 9.1400e+00],\n",
102 | " [2.7290e-02, 0.0000e+00, 7.0700e+00, 0.0000e+00, 4.6900e-01,\n",
103 | " 7.1850e+00, 6.1100e+01, 4.9671e+00, 2.0000e+00, 2.4200e+02,\n",
104 | " 1.7800e+01, 3.9283e+02, 4.0300e+00]])"
105 | ]
106 | },
107 | "execution_count": 5,
108 | "metadata": {},
109 | "output_type": "execute_result"
110 | }
111 | ],
112 | "source": [
113 | "# Look at the features of the first three houses\n",
114 | "data[:3]"
115 | ]
116 | },
117 | {
118 | "cell_type": "code",
119 | "execution_count": 6,
120 | "metadata": {},
121 | "outputs": [
122 | {
123 | "data": {
124 | "text/plain": [
125 | "array([24. , 21.6, 34.7])"
126 | ]
127 | },
128 | "execution_count": 6,
129 | "metadata": {},
130 | "output_type": "execute_result"
131 | }
132 | ],
133 | "source": [
134 | "# Look at the first three house prices\n",
135 | "target[:3]"
136 | ]
137 | },
138 | {
139 | "cell_type": "markdown",
140 | "metadata": {},
141 | "source": [
142 | "### 2. Split into training and test sets"
143 | ]
144 | },
145 | {
146 | "cell_type": "code",
147 | "execution_count": 7,
148 | "metadata": {},
149 | "outputs": [],
150 | "source": [
151 | "# Split into training and test sets (by default 25% of the rows go to the test set)\n",
152 | "X_train, X_test, y_train, y_test = train_test_split(data, target)"
153 | ]
154 | },
155 | {
156 | "cell_type": "code",
157 | "execution_count": 8,
158 | "metadata": {},
159 | "outputs": [
160 | {
161 | "data": {
162 | "text/plain": [
163 | "((379, 13), (379,))"
164 | ]
165 | },
166 | "execution_count": 8,
167 | "metadata": {},
168 | "output_type": "execute_result"
169 | }
170 | ],
171 | "source": [
172 | "# Shapes of the training set\n",
173 | "X_train.shape, y_train.shape"
174 | ]
175 | },
176 | {
177 | "cell_type": "code",
178 | "execution_count": 9,
179 | "metadata": {},
180 | "outputs": [
181 | {
182 | "data": {
183 | "text/plain": [
184 | "((127, 13), (127,))"
185 | ]
186 | },
187 | "execution_count": 9,
188 | "metadata": {},
189 | "output_type": "execute_result"
190 | }
191 | ],
192 | "source": [
193 | "# Shapes of the test set\n",
194 | "X_test.shape, y_test.shape"
195 | ]
196 | },
197 | {
198 | "cell_type": "markdown",
199 | "metadata": {},
200 | "source": [
201 | "### 3. Train a linear regression model"
202 | ]
203 | },
204 | {
205 | "cell_type": "code",
206 | "execution_count": 10,
207 | "metadata": {},
208 | "outputs": [],
209 | "source": [
210 | "# Construct the linear regression object; the default parameters are fine\n",
211 | "clf = LinearRegression()"
212 | ]
213 | },
214 | {
215 | "cell_type": "code",
216 | "execution_count": 11,
217 | "metadata": {},
218 | "outputs": [
219 | {
220 | "data": {
221 | "text/plain": [
222 | "LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)"
223 | ]
224 | },
225 | "execution_count": 11,
226 | "metadata": {},
227 | "output_type": "execute_result"
228 | }
229 | ],
230 | "source": [
231 | "# Run the training\n",
232 | "clf.fit(X_train, y_train)"
233 | ]
234 | },
235 | {
236 | "cell_type": "code",
237 | "execution_count": 12,
238 | "metadata": {},
239 | "outputs": [
240 | {
241 | "data": {
242 | "text/plain": [
243 | "0.7290997955432121"
244 | ]
245 | },
246 | "execution_count": 12,
247 | "metadata": {},
248 | "output_type": "execute_result"
249 | }
250 | ],
251 | "source": [
252 | "# Score on the training set (score returns the R² coefficient of determination)\n",
253 | "clf.score(X_train, y_train)"
254 | ]
255 | },
256 | {
257 | "cell_type": "markdown",
258 | "metadata": {},
259 | "source": [
260 | "### 4. Evaluate and use the model"
261 | ]
262 | },
263 | {
264 | "cell_type": "code",
265 | "execution_count": 13,
266 | "metadata": {},
267 | "outputs": [
268 | {
269 | "data": {
270 | "text/plain": [
271 | "0.7658281007291711"
272 | ]
273 | },
274 | "execution_count": 13,
275 | "metadata": {},
276 | "output_type": "execute_result"
277 | }
278 | ],
279 | "source": [
280 | "# Score on the test set to evaluate the model\n",
281 | "clf.score(X_test, y_test)"
282 | ]
283 | },
284 | {
285 | "cell_type": "code",
286 | "execution_count": 14,
287 | "metadata": {},
288 | "outputs": [
289 | {
290 | "data": {
291 | "text/plain": [
292 | "array([36.1889043 , 17.05681981, 26.1238293 ])"
293 | ]
294 | },
295 | "execution_count": 14,
296 | "metadata": {},
297 | "output_type": "execute_result"
298 | }
299 | ],
300 | "source": [
301 | "# Predict house prices for just the first three test samples\n",
302 | "clf.predict(X_test[:3])"
303 | ]
304 | },
305 | {
306 | "cell_type": "code",
307 | "execution_count": 15,
308 | "metadata": {},
309 | "outputs": [
310 | {
311 | "data": {
312 | "text/plain": [
313 | "array([50. , 23.1, 22.8])"
314 | ]
315 | },
316 | "execution_count": 15,
317 | "metadata": {},
318 | "output_type": "execute_result"
319 | }
320 | ],
321 | "source": [
322 | "# Compare with the actual prices\n",
323 | "y_test[:3]"
324 | ]
325 | },
326 | {
327 | "cell_type": "code",
328 | "execution_count": null,
329 | "metadata": {},
330 | "outputs": [],
331 | "source": []
332 | }
333 | ],
334 | "metadata": {
335 | "kernelspec": {
336 | "display_name": "Python 3",
337 | "language": "python",
338 | "name": "python3"
339 | },
340 | "language_info": {
341 | "codemirror_mode": {
342 | "name": "ipython",
343 | "version": 3
344 | },
345 | "file_extension": ".py",
346 | "mimetype": "text/x-python",
347 | "name": "python",
348 | "nbconvert_exporter": "python",
349 | "pygments_lexer": "ipython3",
350 | "version": "3.7.6"
351 | }
352 | },
353 | "nbformat": 4,
354 | "nbformat_minor": 4
355 | }
356 |
--------------------------------------------------------------------------------
/.ipynb_checkpoints/Untitled-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [],
3 | "metadata": {},
4 | "nbformat": 4,
5 | "nbformat_minor": 4
6 | }
7 |
--------------------------------------------------------------------------------
/.ipynb_checkpoints/Untitled1-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [],
3 | "metadata": {},
4 | "nbformat": 4,
5 | "nbformat_minor": 4
6 | }
7 |
--------------------------------------------------------------------------------
/06. Numpy计算数组中满足条件元素个数.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
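7 | "## Counting array elements that satisfy a condition with NumPy\n",
8 | "\n",
9 | "Task: given a very large array, say 100 million numbers, count how many of them are greater than 5000"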
10 | ]
11 | },
12 | {
13 | "cell_type": "markdown",
14 | "metadata": {},
15 | "source": [
16 | "### 1. Generate 100 million numbers with NumPy's random module"
17 | ]
18 | },
19 | {
20 | "cell_type": "code",
21 | "execution_count": 1,
22 | "metadata": {},
23 | "outputs": [],
24 | "source": [
25 | "import numpy as np"
26 | ]
27 | },
28 | {
29 | "cell_type": "code",
30 | "execution_count": 2,
31 | "metadata": {},
32 | "outputs": [],
33 | "source": [
34 | "arr = np.random.randint(1, 10000, size=int(1e8))"
35 | ]
36 | },
37 | {
38 | "cell_type": "code",
39 | "execution_count": 3,
40 | "metadata": {},
41 | "outputs": [
42 | {
43 | "data": {
44 | "text/plain": [
45 | "array([5682, 8924, 7737, 7717, 8871, 2469, 1807, 6847, 8138, 1779])"
46 | ]
47 | },
48 | "execution_count": 3,
49 | "metadata": {},
50 | "output_type": "execute_result"
51 | }
52 | ],
53 | "source": [
54 | "arr[:10]"
55 | ]
56 | },
57 | {
58 | "cell_type": "code",
59 | "execution_count": 4,
60 | "metadata": {},
61 | "outputs": [
62 | {
63 | "data": {
64 | "text/plain": [
65 | "100000000"
66 | ]
67 | },
68 | "execution_count": 4,
69 | "metadata": {},
70 | "output_type": "execute_result"
71 | }
72 | ],
73 | "source": [
74 | "arr.size"
75 | ]
76 | },
77 | {
78 | "cell_type": "markdown",
79 | "metadata": {},
80 | "source": [
81 | "### 2. Implementation in plain Python"
82 | ]
83 | },
84 | {
85 | "cell_type": "code",
86 | "execution_count": 5,
87 | "metadata": {},
88 | "outputs": [],
89 | "source": [
90 | "pyarr = list(arr)"
91 | ]
92 | },
93 | {
94 | "cell_type": "code",
95 | "execution_count": 6,
96 | "metadata": {},
97 | "outputs": [
98 | {
99 | "data": {
100 | "text/plain": [
101 | "49997444"
102 | ]
103 | },
104 | "execution_count": 6,
105 | "metadata": {},
106 | "output_type": "execute_result"
107 | }
108 | ],
109 | "source": [
110 | "# Compute the result once, so the two approaches can be checked against each other\n",
111 | "len([x for x in pyarr if x>5000])"
112 | ]
113 | },
114 | {
115 | "cell_type": "code",
116 | "execution_count": 7,
117 | "metadata": {},
118 | "outputs": [
119 | {
120 | "name": "stdout",
121 | "output_type": "stream",
122 | "text": [
123 | "16.8 s ± 204 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
124 | ]
125 | }
126 | ],
127 | "source": [
128 | "# Time it\n",
129 | "%timeit len([x for x in pyarr if x>5000])"
130 | ]
131 | },
132 | {
133 | "cell_type": "markdown",
134 | "metadata": {},
135 | "source": [
136 | "### 3. Implementation with NumPy vectorized operations"
137 | ]
138 | },
139 | {
140 | "cell_type": "code",
141 | "execution_count": 8,
142 | "metadata": {},
143 | "outputs": [
144 | {
145 | "data": {
146 | "text/plain": [
147 | "49997444"
148 | ]
149 | },
150 | "execution_count": 8,
151 | "metadata": {},
152 | "output_type": "execute_result"
153 | }
154 | ],
155 | "source": [
156 | "# Compute the result once, so the two approaches can be checked against each other\n",
157 | "arr[arr>5000].size"
158 | ]
159 | },
160 | {
161 | "cell_type": "code",
162 | "execution_count": 9,
163 | "metadata": {},
164 | "outputs": [
165 | {
166 | "data": {
167 | "text/plain": [
168 | "array([ True, True, True, True, True, False, False, True, True,\n",
169 | " False])"
170 | ]
171 | },
172 | "execution_count": 9,
173 | "metadata": {},
174 | "output_type": "execute_result"
175 | }
176 | ],
177 | "source": [
178 | "(arr>5000)[:10]"
179 | ]
180 | },
181 | {
182 | "cell_type": "code",
183 | "execution_count": 10,
184 | "metadata": {},
185 | "outputs": [
186 | {
187 | "name": "stdout",
188 | "output_type": "stream",
189 | "text": [
190 | "590 ms ± 33.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
191 | ]
192 | }
193 | ],
194 | "source": [
195 | "# Time it\n",
196 | "%timeit arr[arr>5000].size"
197 | ]
198 | },
199 | {
200 | "cell_type": "markdown",
201 | "metadata": {},
202 | "source": [
203 | "### 4. Compare the timings"
204 | ]
205 | },
206 | {
207 | "cell_type": "code",
208 | "execution_count": 11,
209 | "metadata": {},
210 | "outputs": [
211 | {
212 | "data": {
213 | "text/plain": [
214 | "28.47457627118644"
215 | ]
216 | },
217 | "execution_count": 11,
218 | "metadata": {},
219 | "output_type": "execute_result"
220 | }
221 | ],
222 | "source": [
223 | "16.8*1000 / 590 "
224 | ]
225 | },
226 | {
227 | "cell_type": "code",
228 | "execution_count": null,
229 | "metadata": {},
230 | "outputs": [],
231 | "source": []
232 | }
233 | ],
234 | "metadata": {
235 | "kernelspec": {
236 | "display_name": "Python 3",
237 | "language": "python",
238 | "name": "python3"
239 | },
240 | "language_info": {
241 | "codemirror_mode": {
242 | "name": "ipython",
243 | "version": 3
244 | },
245 | "file_extension": ".py",
246 | "mimetype": "text/x-python",
247 | "name": "python",
248 | "nbconvert_exporter": "python",
249 | "pygments_lexer": "ipython3",
250 | "version": "3.7.6"
251 | }
252 | },
253 | "nbformat": 4,
254 | "nbformat_minor": 4
255 | }
256 |
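The counting notebook above uses `arr[arr>5000].size`, which first builds a filtered copy of the array. As a hedged alternative, the boolean mask can be counted directly with `np.count_nonzero` or by summing it. The sketch below uses a much smaller array than the notebook's 100 million elements; the seed, the size, and the variable names are arbitrary choices for illustration, and no timing claims are made.

```python
import numpy as np

# A much smaller array than the notebook's 1e8 elements so this runs quickly;
# the seed and the size are arbitrary.
rng = np.random.default_rng(0)
arr = rng.integers(1, 10000, size=1_000_000)

# Boolean array with one True/False entry per element.
mask = arr > 5000

# The notebook's approach: boolean indexing builds a filtered copy, then .size.
count_by_indexing = arr[mask].size

# Alternatives that count the True entries without building the filtered copy.
count_by_nonzero = np.count_nonzero(mask)
count_by_sum = int(mask.sum())  # True counts as 1 when summed

print(count_by_indexing, count_by_nonzero, count_by_sum)
```

All three expressions give the same count; which one is fastest depends on the array size and NumPy version, so it is worth measuring with `%timeit` as the notebook does.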
--------------------------------------------------------------------------------
/07. Numpy怎样给数组增加一个维度.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## How to add a dimension to a NumPy array"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
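14 | "***Background:*** \n",
15 | "Many computations operate on 2-D or 3-D data, so a 1-D input often has to be promoted to 2-D to make the shapes match\n",
16 | "\n",
17 | "***Requirement:*** \n",
18 | "Add a dimension to an array without changing its data (note in this example that the dimensions change but the data does not) \n",
19 | "Original array: the 1-D array arr=[1,2,3,4] has shape (4,), and its elements are arr[0], arr[1], arr[2], arr[3] \n",
20 | "Reshaped array: the 2-D array arr=[[1,2,3,4]] has shape (1,4), and its elements are arr[0,0], arr[0,1], arr[0,2], arr[0,3]\n",
21 | "\n",
22 | "***In practice:*** \n",
23 | "It often helps to sketch the array shapes on paper to check whether different arrays match and whether a dimension needs to be added or removed\n",
24 | "\n",
25 | "***Three methods:*** \n",
26 | "* np.newaxis: a constant (an alias for None) used inside indexing syntax to add a dimension\n",
27 | "* np.expand_dims(arr, axis): a function that does the same as np.newaxis, adding a dimension to arr at position axis\n",
28 | "* np.reshape(a, newshape): a function; set one dimension to 1 to add a dimension"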
29 | ]
30 | },
31 | {
32 | "cell_type": "code",
33 | "execution_count": 1,
34 | "metadata": {},
35 | "outputs": [],
36 | "source": [
37 | "import numpy as np"
38 | ]
39 | },
40 | {
41 | "cell_type": "code",
42 | "execution_count": 2,
43 | "metadata": {},
44 | "outputs": [
45 | {
46 | "data": {
47 | "text/plain": [
48 | "array([0, 1, 2, 3, 4])"
49 | ]
50 | },
51 | "execution_count": 2,
52 | "metadata": {},
53 | "output_type": "execute_result"
54 | }
55 | ],
56 | "source": [
57 | "arr = np.arange(5)\n",
58 | "arr"
59 | ]
60 | },
61 | {
62 | "cell_type": "code",
63 | "execution_count": 3,
64 | "metadata": {},
65 | "outputs": [
66 | {
67 | "data": {
68 | "text/plain": [
69 | "(5,)"
70 | ]
71 | },
72 | "execution_count": 3,
73 | "metadata": {},
74 | "output_type": "execute_result"
75 | }
76 | ],
77 | "source": [
78 | "# Note: at this point it is a 1-D vector\n",
79 | "arr.shape"
80 | ]
81 | },
82 | {
83 | "cell_type": "markdown",
84 | "metadata": {},
85 | "source": [
86 | "### Method 1: np.newaxis"
87 | ]
88 | },
89 | {
90 | "cell_type": "markdown",
91 | "metadata": {},
92 | "source": [
93 | "#### Note: np.newaxis is simply an alias for None"
94 | ]
95 | },
96 | {
97 | "cell_type": "code",
98 | "execution_count": 4,
99 | "metadata": {},
100 | "outputs": [
101 | {
102 | "data": {
103 | "text/plain": [
104 | "True"
105 | ]
106 | },
107 | "execution_count": 4,
108 | "metadata": {},
109 | "output_type": "execute_result"
110 | }
111 | ],
112 | "source": [
113 | "np.newaxis is None"
114 | ]
115 | },
116 | {
117 | "cell_type": "code",
118 | "execution_count": 5,
119 | "metadata": {},
120 | "outputs": [
121 | {
122 | "data": {
123 | "text/plain": [
124 | "True"
125 | ]
126 | },
127 | "execution_count": 5,
128 | "metadata": {},
129 | "output_type": "execute_result"
130 | }
131 | ],
132 | "source": [
133 | "np.newaxis == None"
134 | ]
135 | },
136 | {
137 | "cell_type": "markdown",
138 | "metadata": {},
139 | "source": [
140 | "In other words, every np.newaxis below can be replaced with None"
141 | ]
142 | },
143 | {
144 | "cell_type": "markdown",
145 | "metadata": {},
146 | "source": [
147 | "#### Add a row dimension to a 1-D vector"
148 | ]
149 | },
150 | {
151 | "cell_type": "code",
152 | "execution_count": 6,
153 | "metadata": {},
154 | "outputs": [
155 | {
156 | "data": {
157 | "text/plain": [
158 | "array([[0, 1, 2, 3, 4]])"
159 | ]
160 | },
161 | "execution_count": 6,
162 | "metadata": {},
163 | "output_type": "execute_result"
164 | }
165 | ],
166 | "source": [
167 | "arr[np.newaxis, :]"
168 | ]
169 | },
170 | {
171 | "cell_type": "code",
172 | "execution_count": 7,
173 | "metadata": {
174 | "scrolled": true
175 | },
176 | "outputs": [
177 | {
178 | "data": {
179 | "text/plain": [
180 | "(1, 5)"
181 | ]
182 | },
183 | "execution_count": 7,
184 | "metadata": {},
185 | "output_type": "execute_result"
186 | }
187 | ],
188 | "source": [
189 | "arr[np.newaxis, :].shape"
190 | ]
191 | },
192 | {
193 | "cell_type": "markdown",
194 | "metadata": {},
195 | "source": [
196 | "The data is now 1 row × 5 columns; nothing was added or removed, there is just one more level of brackets"
197 | ]
198 | },
199 | {
200 | "cell_type": "markdown",
201 | "metadata": {},
202 | "source": [
203 | "#### Add a column dimension to a 1-D vector"
204 | ]
205 | },
206 | {
207 | "cell_type": "code",
208 | "execution_count": 8,
209 | "metadata": {},
210 | "outputs": [
211 | {
212 | "data": {
213 | "text/plain": [
214 | "array([[0],\n",
215 | " [1],\n",
216 | " [2],\n",
217 | " [3],\n",
218 | " [4]])"
219 | ]
220 | },
221 | "execution_count": 8,
222 | "metadata": {},
223 | "output_type": "execute_result"
224 | }
225 | ],
226 | "source": [
227 | "arr[:, np.newaxis]"
228 | ]
229 | },
230 | {
231 | "cell_type": "code",
232 | "execution_count": 9,
233 | "metadata": {},
234 | "outputs": [
235 | {
236 | "data": {
237 | "text/plain": [
238 | "(5, 1)"
239 | ]
240 | },
241 | "execution_count": 9,
242 | "metadata": {},
243 | "output_type": "execute_result"
244 | }
245 | ],
246 | "source": [
247 | "arr[:, np.newaxis].shape"
248 | ]
249 | },
250 | {
251 | "cell_type": "markdown",
252 | "metadata": {},
253 | "source": [
254 | "The data is now 5 rows × 1 column"
255 | ]
256 | },
257 | {
258 | "cell_type": "markdown",
259 | "metadata": {},
260 | "source": [
261 | "### Method 2: np.expand_dims"
262 | ]
263 | },
264 | {
265 | "cell_type": "markdown",
266 | "metadata": {},
267 | "source": [
268 | "np.expand_dims produces exactly the same result as np.newaxis"
269 | ]
270 | },
271 | {
272 | "cell_type": "code",
273 | "execution_count": 10,
274 | "metadata": {},
275 | "outputs": [
276 | {
277 | "data": {
278 | "text/plain": [
279 | "array([0, 1, 2, 3, 4])"
280 | ]
281 | },
282 | "execution_count": 10,
283 | "metadata": {},
284 | "output_type": "execute_result"
285 | }
286 | ],
287 | "source": [
288 | "arr"
289 | ]
290 | },
291 | {
292 | "cell_type": "markdown",
293 | "metadata": {},
294 | "source": [
295 | "#### Add a row dimension to a 1-D array"
296 | ]
297 | },
298 | {
299 | "cell_type": "markdown",
300 | "metadata": {},
301 | "source": [
302 | "Equivalent to arr[np.newaxis, :]"
303 | ]
304 | },
305 | {
306 | "cell_type": "code",
307 | "execution_count": 11,
308 | "metadata": {},
309 | "outputs": [
310 | {
311 | "data": {
312 | "text/plain": [
313 | "array([[0, 1, 2, 3, 4]])"
314 | ]
315 | },
316 | "execution_count": 11,
317 | "metadata": {},
318 | "output_type": "execute_result"
319 | }
320 | ],
321 | "source": [
322 | "np.expand_dims(arr, axis=0)"
323 | ]
324 | },
325 | {
326 | "cell_type": "code",
327 | "execution_count": 12,
328 | "metadata": {},
329 | "outputs": [
330 | {
331 | "data": {
332 | "text/plain": [
333 | "(1, 5)"
334 | ]
335 | },
336 | "execution_count": 12,
337 | "metadata": {},
338 | "output_type": "execute_result"
339 | }
340 | ],
341 | "source": [
342 | "np.expand_dims(arr, axis=0).shape"
343 | ]
344 | },
345 | {
346 | "cell_type": "markdown",
347 | "metadata": {},
348 | "source": [
349 | "#### Add a column dimension to a 1-D array"
350 | ]
351 | },
352 | {
353 | "cell_type": "markdown",
354 | "metadata": {},
355 | "source": [
356 | "Equivalent to arr[:, np.newaxis]"
357 | ]
358 | },
359 | {
360 | "cell_type": "code",
361 | "execution_count": 13,
362 | "metadata": {},
363 | "outputs": [
364 | {
365 | "data": {
366 | "text/plain": [
367 | "array([[0],\n",
368 | " [1],\n",
369 | " [2],\n",
370 | " [3],\n",
371 | " [4]])"
372 | ]
373 | },
374 | "execution_count": 13,
375 | "metadata": {},
376 | "output_type": "execute_result"
377 | }
378 | ],
379 | "source": [
380 | "np.expand_dims(arr, axis=1)"
381 | ]
382 | },
383 | {
384 | "cell_type": "code",
385 | "execution_count": 14,
386 | "metadata": {},
387 | "outputs": [
388 | {
389 | "data": {
390 | "text/plain": [
391 | "(5, 1)"
392 | ]
393 | },
394 | "execution_count": 14,
395 | "metadata": {},
396 | "output_type": "execute_result"
397 | }
398 | ],
399 | "source": [
400 | "np.expand_dims(arr, axis=1).shape"
401 | ]
402 | },
403 | {
404 | "cell_type": "markdown",
405 | "metadata": {},
406 | "source": [
407 | "### Method 3: np.reshape"
408 | ]
409 | },
410 | {
411 | "cell_type": "markdown",
412 | "metadata": {},
413 | "source": [
414 | "#### Add a row dimension to a 1-D array"
415 | ]
416 | },
417 | {
418 | "cell_type": "code",
419 | "execution_count": 15,
420 | "metadata": {},
421 | "outputs": [
422 | {
423 | "data": {
424 | "text/plain": [
425 | "array([0, 1, 2, 3, 4])"
426 | ]
427 | },
428 | "execution_count": 15,
429 | "metadata": {},
430 | "output_type": "execute_result"
431 | }
432 | ],
433 | "source": [
434 | "arr"
435 | ]
436 | },
437 | {
438 | "cell_type": "code",
439 | "execution_count": 16,
440 | "metadata": {},
441 | "outputs": [
442 | {
443 | "data": {
444 | "text/plain": [
445 | "array([[0, 1, 2, 3, 4]])"
446 | ]
447 | },
448 | "execution_count": 16,
449 | "metadata": {},
450 | "output_type": "execute_result"
451 | }
452 | ],
453 | "source": [
454 | "np.reshape(arr, (1, 5))"
455 | ]
456 | },
457 | {
458 | "cell_type": "code",
459 | "execution_count": 17,
460 | "metadata": {},
461 | "outputs": [
462 | {
463 | "data": {
464 | "text/plain": [
465 | "array([[0, 1, 2, 3, 4]])"
466 | ]
467 | },
468 | "execution_count": 17,
469 | "metadata": {},
470 | "output_type": "execute_result"
471 | }
472 | ],
473 | "source": [
474 | "np.reshape(arr, (1, -1))"
475 | ]
476 | },
477 | {
478 | "cell_type": "code",
479 | "execution_count": 18,
480 | "metadata": {},
481 | "outputs": [
482 | {
483 | "data": {
484 | "text/plain": [
485 | "(1, 5)"
486 | ]
487 | },
488 | "execution_count": 18,
489 | "metadata": {},
490 | "output_type": "execute_result"
491 | }
492 | ],
493 | "source": [
494 | "np.reshape(arr, (1, -1)).shape"
495 | ]
496 | },
497 | {
498 | "cell_type": "markdown",
499 | "metadata": {},
500 | "source": [
501 | "#### Add a column dimension to a 1-D array"
502 | ]
503 | },
504 | {
505 | "cell_type": "code",
506 | "execution_count": null,
507 | "metadata": {},
508 | "outputs": [],
509 | "source": [
510 | "np.reshape(arr, (-1, 1))"
511 | ]
512 | },
513 | {
514 | "cell_type": "code",
515 | "execution_count": null,
516 | "metadata": {},
517 | "outputs": [],
518 | "source": [
519 | "np.reshape(arr, (-1, 1)).shape"
520 | ]
521 | }
522 | ],
523 | "metadata": {
524 | "kernelspec": {
525 | "display_name": "Python 3",
526 | "language": "python",
527 | "name": "python3"
528 | },
529 | "language_info": {
530 | "codemirror_mode": {
531 | "name": "ipython",
532 | "version": 3
533 | },
534 | "file_extension": ".py",
535 | "mimetype": "text/x-python",
536 | "name": "python",
537 | "nbconvert_exporter": "python",
538 | "pygments_lexer": "ipython3",
539 | "version": "3.7.6"
540 | }
541 | },
542 | "nbformat": 4,
543 | "nbformat_minor": 4
544 | }
545 |
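The notebook above walks through several ways of adding a dimension. As a minimal sketch, the variants can be compared side by side; the variable names (`row_a`, `col_a`, and so on) are illustrative and do not appear in the notebook.

```python
import numpy as np

arr = np.arange(5)  # shape (5,)

# Row dimension: three equivalent spellings, each giving shape (1, 5)
row_a = arr[np.newaxis, :]
row_b = np.expand_dims(arr, axis=0)
row_c = np.reshape(arr, (1, -1))   # -1 lets NumPy infer the length

# Column dimension: three equivalent spellings, each giving shape (5, 1)
col_a = arr[:, np.newaxis]
col_b = np.expand_dims(arr, axis=1)
col_c = np.reshape(arr, (-1, 1))

print(row_a.shape, col_a.shape)
```

Using `-1` in `reshape` avoids hard-coding the array length, which is why the notebook's `(1, -1)` form is usually preferable to `(1, 5)`.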
--------------------------------------------------------------------------------
/08. Numpy实现K折交叉验证的数据划分.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## K-fold cross-validation data splitting with NumPy"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
14 | "This example uses NumPy array slicing to split a dataset for K-fold cross-validation."
15 | ]
16 | },
17 | {
18 | "cell_type": "markdown",
19 | "metadata": {},
20 | "source": [
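21 | "### Background: K-fold cross-validation\n",
22 | "\n",
23 | "***Why is it needed?*** \n",
24 | "In machine learning, K-fold cross-validation gives a more reliable estimate of model performance, for two reasons:\n",
25 | "1. When samples are scarce, setting aside a fixed test set leaves even less data for training;\n",
26 | "2. Different train/test splits can produce noticeably different performance figures.\n",
27 | "\n",
28 | "\n",
29 | "***What is K-fold validation?*** \n",
30 | "K-fold validation splits the data into K partitions of equal size. \n",
31 | "For each partition i, train the model on the remaining K-1 partitions and evaluate it on partition i. \n",
32 | "The final score is the mean of the K scores; averaging reduces the effect of any single train/validation split.\n",
33 | "\n",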
34 | ""
35 | ]
36 | },
37 | {
38 | "cell_type": "markdown",
39 | "metadata": {},
40 | "source": [
41 | "### 1. Build a simulated sample set"
42 | ]
43 | },
44 | {
45 | "cell_type": "code",
46 | "execution_count": 1,
47 | "metadata": {},
48 | "outputs": [],
49 | "source": [
50 | "import numpy as np"
51 | ]
52 | },
53 | {
54 | "cell_type": "code",
55 | "execution_count": 2,
56 | "metadata": {},
57 | "outputs": [
58 | {
59 | "data": {
60 | "text/plain": [
61 | "array([[ 0, 1, 2, 3],\n",
62 | " [ 4, 5, 6, 7],\n",
63 | " [ 8, 9, 10, 11],\n",
64 | " [12, 13, 14, 15],\n",
65 | " [16, 17, 18, 19],\n",
66 | " [20, 21, 22, 23],\n",
67 | " [24, 25, 26, 27],\n",
68 | " [28, 29, 30, 31],\n",
69 | " [32, 33, 34, 35]])"
70 | ]
71 | },
72 | "execution_count": 2,
73 | "metadata": {},
74 | "output_type": "execute_result"
75 | }
76 | ],
77 | "source": [
78 | "data = np.arange(36).reshape(9,4)\n",
79 | "data"
80 | ]
81 | },
82 | {
83 | "cell_type": "markdown",
84 | "metadata": {},
85 | "source": [
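86 | "Reading the data array as a set of samples:\n",
87 | "* It is a 2-D matrix: each row is one sample and each column is one feature\n",
88 | "* There are 9 samples with 4 features each\n",
89 | "\n",
90 | "This is the standard input layout for training scikit-learn models"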
91 | ]
92 | },
93 | {
94 | "cell_type": "markdown",
95 | "metadata": {},
96 | "source": [
97 | "### 2. Perform the K splits with NumPy"
98 | ]
99 | },
100 | {
101 | "cell_type": "code",
102 | "execution_count": 3,
103 | "metadata": {},
104 | "outputs": [],
105 | "source": [
106 | "# We want 4-fold cross-validation\n",
107 | "k = 4"
108 | ]
109 | },
110 | {
111 | "cell_type": "code",
112 | "execution_count": 4,
113 | "metadata": {},
114 | "outputs": [
115 | {
116 | "data": {
117 | "text/plain": [
118 | "2"
119 | ]
120 | },
121 | "execution_count": 4,
122 | "metadata": {},
123 | "output_type": "execute_result"
124 | }
125 | ],
126 | "source": [
127 | "# Number of samples per fold (9 // 4 = 2, so the last sample is never used for validation)\n",
128 | "k_samples_count = data.shape[0]//k\n",
129 | "k_samples_count"
130 | ]
131 | },
132 | {
133 | "cell_type": "code",
134 | "execution_count": 5,
135 | "metadata": {
136 | "scrolled": false
137 | },
138 | "outputs": [
139 | {
140 | "name": "stdout",
141 | "output_type": "stream",
142 | "text": [
143 | "\n",
144 | "##### Fold 0 #####\n",
145 | "Validation set:\n",
146 | " [[0 1 2 3]\n",
147 | " [4 5 6 7]]\n",
148 | "Training set:\n",
149 | " [[ 8 9 10 11]\n",
150 | " [12 13 14 15]\n",
151 | " [16 17 18 19]\n",
152 | " [20 21 22 23]\n",
153 | " [24 25 26 27]\n",
154 | " [28 29 30 31]\n",
155 | " [32 33 34 35]]\n",
156 | "\n",
157 | "##### Fold 1 #####\n",
158 | "Validation set:\n",
159 | " [[ 8 9 10 11]\n",
160 | " [12 13 14 15]]\n",
161 | "Training set:\n",
162 | " [[ 0 1 2 3]\n",
163 | " [ 4 5 6 7]\n",
164 | " [16 17 18 19]\n",
165 | " [20 21 22 23]\n",
166 | " [24 25 26 27]\n",
167 | " [28 29 30 31]\n",
168 | " [32 33 34 35]]\n",
169 | "\n",
170 | "##### Fold 2 #####\n",
171 | "Validation set:\n",
172 | " [[16 17 18 19]\n",
173 | " [20 21 22 23]]\n",
174 | "Training set:\n",
175 | " [[ 0 1 2 3]\n",
176 | " [ 4 5 6 7]\n",
177 | " [ 8 9 10 11]\n",
178 | " [12 13 14 15]\n",
179 | " [24 25 26 27]\n",
180 | " [28 29 30 31]\n",
181 | " [32 33 34 35]]\n",
182 | "\n",
183 | "##### Fold 3 #####\n",
184 | "Validation set:\n",
185 | " [[24 25 26 27]\n",
186 | " [28 29 30 31]]\n",
187 | "Training set:\n",
188 | " [[ 0 1 2 3]\n",
189 | " [ 4 5 6 7]\n",
190 | " [ 8 9 10 11]\n",
191 | " [12 13 14 15]\n",
192 | " [16 17 18 19]\n",
193 | " [20 21 22 23]\n",
194 | " [32 33 34 35]]\n"
195 | ]
196 | }
197 | ],
198 | "source": [
199 | "for fold in range(k):\n",
200 | "    validation_begin = k_samples_count*fold\n",
201 | "    validation_end = k_samples_count*(fold+1)\n",
202 | "    \n",
203 | "    validation_data = data[validation_begin:validation_end]\n",
204 | "    \n",
205 | "    # np.vstack stacks arrays vertically (row-wise)\n",
206 | "    train_data = np.vstack([\n",
207 | "        data[:validation_begin], \n",
208 | "        data[validation_end:]\n",
209 | "    ])\n",
210 | "    \n",
211 | "    print()\n",
212 | "    print(f\"##### Fold {fold} #####\")\n",
213 | "    print(\"Validation set:\\n\", validation_data)\n",
214 | "    print(\"Training set:\\n\", train_data)"
215 | ]
216 | },
217 | {
218 | "cell_type": "markdown",
219 | "metadata": {},
220 | "source": [
221 | "If you use scikit-learn, a ready-made implementation already exists: \n",
222 | "from sklearn.model_selection import cross_val_score"
223 | ]
224 | }
225 | ],
226 | "metadata": {
227 | "kernelspec": {
228 | "display_name": "Python 3",
229 | "language": "python",
230 | "name": "python3"
231 | },
232 | "language_info": {
233 | "codemirror_mode": {
234 | "name": "ipython",
235 | "version": 3
236 | },
237 | "file_extension": ".py",
238 | "mimetype": "text/x-python",
239 | "name": "python",
240 | "nbconvert_exporter": "python",
241 | "pygments_lexer": "ipython3",
242 | "version": "3.7.6"
243 | }
244 | },
245 | "nbformat": 4,
246 | "nbformat_minor": 4
247 | }
248 |
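In the notebook above, `data.shape[0]//k` gives 2 samples per fold, so the ninth sample never lands in a validation set. One possible variant, sketched here under the assumption that uneven folds are acceptable, splits the row indices with `np.array_split` so every sample is validated exactly once. The helper name `kfold_splits` is hypothetical and not part of the notebook or of NumPy.

```python
import numpy as np

def kfold_splits(data, k):
    """Yield (train, validation) pairs; hypothetical helper for illustration."""
    n_samples = data.shape[0]
    indices = np.arange(n_samples)
    # np.array_split tolerates a sample count that is not a multiple of k
    for val_idx in np.array_split(indices, k):
        train_mask = np.ones(n_samples, dtype=bool)
        train_mask[val_idx] = False          # exclude the validation rows
        yield data[train_mask], data[val_idx]

data = np.arange(36).reshape(9, 4)
folds = list(kfold_splits(data, 4))
for fold, (train, val) in enumerate(folds):
    print(fold, train.shape, val.shape)
```

A boolean mask is used here instead of `np.vstack` on two slices; both approaches produce the same training rows when the validation block is contiguous.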
--------------------------------------------------------------------------------
/09. Numpy非常有用的数组合并操作.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Important and useful array-merging operations in NumPy"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
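14 | "Background: when preparing data for machine learning, data from different sources often has to be merged.\n",
15 | "\n",
16 | "Two typical scenarios:\n",
17 | "1. Adding rows to existing data, for example appending more samples;\n",
18 | "2. Adding columns to existing data, for example appending more features.\n",
19 | "\n",
20 | "All of the following operations merge arrays:\n",
21 | "* np.concatenate(array_list, axis=0/1): merge arrays along the given axis\n",
22 | "* np.vstack or np.row_stack(array_list): merge vertically, row-wise (row_stack is an alias of vstack and is deprecated as of NumPy 2.0)\n",
23 | "* np.hstack or np.column_stack(array_list): merge horizontally, column-wise (the two agree for 2-D inputs; column_stack additionally turns 1-D inputs into columns)"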
24 | ]
25 | },
26 | {
27 | "cell_type": "code",
28 | "execution_count": 1,
29 | "metadata": {},
30 | "outputs": [],
31 | "source": [
32 | "import numpy as np"
33 | ]
34 | },
35 | {
36 | "cell_type": "markdown",
37 | "metadata": {},
38 | "source": [
39 | "### 1. How to add new rows to the data"
40 | ]
41 | },
42 | {
43 | "cell_type": "code",
44 | "execution_count": 2,
45 | "metadata": {},
46 | "outputs": [],
47 | "source": [
48 | "a = np.arange(6).reshape(2,3)\n",
49 | "b = np.random.randint(10,20,size=(4,3))"
50 | ]
51 | },
52 | {
53 | "cell_type": "code",
54 | "execution_count": 3,
55 | "metadata": {},
56 | "outputs": [
57 | {
58 | "data": {
59 | "text/plain": [
60 | "array([[0, 1, 2],\n",
61 | " [3, 4, 5]])"
62 | ]
63 | },
64 | "execution_count": 3,
65 | "metadata": {},
66 | "output_type": "execute_result"
67 | }
68 | ],
69 | "source": [
70 | "a"
71 | ]
72 | },
73 | {
74 | "cell_type": "code",
75 | "execution_count": 4,
76 | "metadata": {},
77 | "outputs": [
78 | {
79 | "data": {
80 | "text/plain": [
81 | "array([[13, 16, 13],\n",
82 | " [17, 15, 14],\n",
83 | " [19, 13, 19],\n",
84 | " [10, 13, 10]])"
85 | ]
86 | },
87 | "execution_count": 4,
88 | "metadata": {},
89 | "output_type": "execute_result"
90 | }
91 | ],
92 | "source": [
93 | "b"
94 | ]
95 | },
96 | {
97 | "cell_type": "code",
98 | "execution_count": 5,
99 | "metadata": {},
100 | "outputs": [
101 | {
102 | "data": {
103 | "text/plain": [
104 | "array([[ 0, 1, 2],\n",
105 | " [ 3, 4, 5],\n",
106 | " [13, 16, 13],\n",
107 | " [17, 15, 14],\n",
108 | " [19, 13, 19],\n",
109 | " [10, 13, 10]])"
110 | ]
111 | },
112 | "execution_count": 5,
113 | "metadata": {},
114 | "output_type": "execute_result"
115 | }
116 | ],
117 | "source": [
118 | "# Method 1\n",
119 | "np.concatenate([a,b])"
120 | ]
121 | },
122 | {
123 | "cell_type": "code",
124 | "execution_count": 6,
125 | "metadata": {},
126 | "outputs": [
127 | {
128 | "data": {
129 | "text/plain": [
130 | "array([[ 0, 1, 2],\n",
131 | " [ 3, 4, 5],\n",
132 | " [13, 16, 13],\n",
133 | " [17, 15, 14],\n",
134 | " [19, 13, 19],\n",
135 | " [10, 13, 10]])"
136 | ]
137 | },
138 | "execution_count": 6,
139 | "metadata": {},
140 | "output_type": "execute_result"
141 | }
142 | ],
143 | "source": [
144 | "# Method 2\n",
145 | "np.vstack([a,b])"
146 | ]
147 | },
148 | {
149 | "cell_type": "code",
150 | "execution_count": 7,
151 | "metadata": {},
152 | "outputs": [
153 | {
154 | "data": {
155 | "text/plain": [
156 | "array([[ 0, 1, 2],\n",
157 | " [ 3, 4, 5],\n",
158 | " [13, 16, 13],\n",
159 | " [17, 15, 14],\n",
160 | " [19, 13, 19],\n",
161 | " [10, 13, 10]])"
162 | ]
163 | },
164 | "execution_count": 7,
165 | "metadata": {},
166 | "output_type": "execute_result"
167 | }
168 | ],
169 | "source": [
170 | "# Method 3\n",
171 | "np.row_stack([a, b])"
172 | ]
173 | },
174 | {
175 | "cell_type": "markdown",
176 | "metadata": {},
177 | "source": [
178 | "### 2. How to add new columns to the data"
179 | ]
180 | },
181 | {
182 | "cell_type": "code",
183 | "execution_count": 8,
184 | "metadata": {},
185 | "outputs": [],
186 | "source": [
187 | "a = np.arange(12).reshape(3,4)\n",
188 | "b = np.random.randint(10,20,size=(3,2))"
189 | ]
190 | },
191 | {
192 | "cell_type": "code",
193 | "execution_count": 9,
194 | "metadata": {},
195 | "outputs": [
196 | {
197 | "data": {
198 | "text/plain": [
199 | "array([[ 0, 1, 2, 3],\n",
200 | " [ 4, 5, 6, 7],\n",
201 | " [ 8, 9, 10, 11]])"
202 | ]
203 | },
204 | "execution_count": 9,
205 | "metadata": {},
206 | "output_type": "execute_result"
207 | }
208 | ],
209 | "source": [
210 | "a"
211 | ]
212 | },
213 | {
214 | "cell_type": "code",
215 | "execution_count": 10,
216 | "metadata": {},
217 | "outputs": [
218 | {
219 | "data": {
220 | "text/plain": [
221 | "array([[12, 16],\n",
222 | " [18, 12],\n",
223 | " [12, 12]])"
224 | ]
225 | },
226 | "execution_count": 10,
227 | "metadata": {},
228 | "output_type": "execute_result"
229 | }
230 | ],
231 | "source": [
232 | "b"
233 | ]
234 | },
235 | {
236 | "cell_type": "code",
237 | "execution_count": 11,
238 | "metadata": {},
239 | "outputs": [
240 | {
241 | "data": {
242 | "text/plain": [
243 | "array([[ 0, 1, 2, 3, 12, 16],\n",
244 | " [ 4, 5, 6, 7, 18, 12],\n",
245 | " [ 8, 9, 10, 11, 12, 12]])"
246 | ]
247 | },
248 | "execution_count": 11,
249 | "metadata": {},
250 | "output_type": "execute_result"
251 | }
252 | ],
253 | "source": [
254 | "# Method 1\n",
255 | "np.concatenate([a,b], axis=1)"
256 | ]
257 | },
258 | {
259 | "cell_type": "code",
260 | "execution_count": null,
261 | "metadata": {},
262 | "outputs": [],
263 | "source": [
264 | "# Method 2\n",
265 | "np.hstack([a,b])"
266 | ]
267 | },
268 | {
269 | "cell_type": "code",
270 | "execution_count": null,
271 | "metadata": {},
272 | "outputs": [],
273 | "source": [
274 | "# Method 3\n",
275 | "np.column_stack([a,b])"
276 | ]
277 | }
278 | ],
279 | "metadata": {
280 | "kernelspec": {
281 | "display_name": "Python 3",
282 | "language": "python",
283 | "name": "python3"
284 | },
285 | "language_info": {
286 | "codemirror_mode": {
287 | "name": "ipython",
288 | "version": 3
289 | },
290 | "file_extension": ".py",
291 | "mimetype": "text/x-python",
292 | "name": "python",
293 | "nbconvert_exporter": "python",
294 | "pygments_lexer": "ipython3",
295 | "version": "3.7.6"
296 | }
297 | },
298 | "nbformat": 4,
299 | "nbformat_minor": 4
300 | }
301 |
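The notebook above only merges 2-D arrays, where `np.hstack` and `np.column_stack` give the same result. The sketch below illustrates where they differ, assuming the new feature arrives as a 1-D array. The array `b` and the result names are made up for this example.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)   # 3 samples, 4 features
b = np.array([10, 20, 30])        # one new feature as a 1-D array, shape (3,)

# column_stack treats a 1-D input as a column -> shape (3, 5)
wide = np.column_stack([a, b])

# hstack needs the column made explicit; np.hstack([a, b]) would raise ValueError
same = np.hstack([a, b[:, np.newaxis]])

# With only 1-D inputs the two functions behave differently
flat = np.hstack([b, b])          # joined end to end -> shape (6,)
cols = np.column_stack([b, b])    # one column per input -> shape (3, 2)

print(wide.shape, flat.shape, cols.shape)
```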
--------------------------------------------------------------------------------
/10. Numpy怎样对数组排序.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
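7 | "## How to sort arrays in NumPy\n",
8 | "\n",
9 | "NumPy offers three ways to sort an array: \n",
10 | "* numpy.sort: returns a sorted copy of the array\n",
11 | "* array.sort: sorts the array in place instead of returning a copy\n",
12 | "* numpy.argsort: indirect sort, returns the indices that would sort the array\n",
13 | "\n",
14 | "All three accept a kind parameter, which can take one of these values:\n",
15 | "* quicksort: quick sort, O(n log n) on average, not stable\n",
16 | "* mergesort: merge sort, O(n log n) on average, stable\n",
17 | "* heapsort: heap sort, O(n log n) on average, not stable\n",
18 | "* stable: a stable sort\n",
19 | "\n",
20 | "The default kind is quicksort, which is the fastest on average, so the default is usually fine"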
21 | ]
22 | },
23 | {
24 | "cell_type": "code",
25 | "execution_count": 1,
26 | "metadata": {},
27 | "outputs": [],
28 | "source": [
29 | "import numpy as np"
30 | ]
31 | },
32 | {
33 | "cell_type": "markdown",
34 | "metadata": {},
35 | "source": [
36 | "### 1. np.sort returns a sorted array"
37 | ]
38 | },
39 | {
40 | "cell_type": "code",
41 | "execution_count": 2,
42 | "metadata": {},
43 | "outputs": [],
44 | "source": [
45 | "arr = np.array([3, 2, 4, 5, 1, 9, 7, 8, 6])"
46 | ]
47 | },
48 | {
49 | "cell_type": "code",
50 | "execution_count": 3,
51 | "metadata": {},
52 | "outputs": [
53 | {
54 | "data": {
55 | "text/plain": [
56 | "array([1, 2, 3, 4, 5, 6, 7, 8, 9])"
57 | ]
58 | },
59 | "execution_count": 3,
60 | "metadata": {},
61 | "output_type": "execute_result"
62 | }
63 | ],
64 | "source": [
65 | "# Returns a sorted copy; arr itself is left unchanged\n",
66 | "np.sort(arr)"
67 | ]
68 | },
69 | {
70 | "cell_type": "code",
71 | "execution_count": 4,
72 | "metadata": {},
73 | "outputs": [
74 | {
75 | "data": {
76 | "text/plain": [
77 | "array([3, 2, 4, 5, 1, 9, 7, 8, 6])"
78 | ]
79 | },
80 | "execution_count": 4,
81 | "metadata": {},
82 | "output_type": "execute_result"
83 | }
84 | ],
85 | "source": [
86 | "arr"
87 | ]
88 | },
89 | {
90 | "cell_type": "markdown",
91 | "metadata": {},
92 | "source": [
93 | "### 2. array.sort sorts in place"
94 | ]
95 | },
96 | {
97 | "cell_type": "code",
98 | "execution_count": 5,
99 | "metadata": {},
100 | "outputs": [],
101 | "source": [
102 | "arr2 = arr.copy()"
103 | ]
104 | },
105 | {
106 | "cell_type": "code",
107 | "execution_count": 6,
108 | "metadata": {},
109 | "outputs": [
110 | {
111 | "data": {
112 | "text/plain": [
113 | "array([3, 2, 4, 5, 1, 9, 7, 8, 6])"
114 | ]
115 | },
116 | "execution_count": 6,
117 | "metadata": {},
118 | "output_type": "execute_result"
119 | }
120 | ],
121 | "source": [
122 | "arr2"
123 | ]
124 | },
125 | {
126 | "cell_type": "code",
127 | "execution_count": 7,
128 | "metadata": {},
129 | "outputs": [],
130 | "source": [
131 | "arr2.sort()"
132 | ]
133 | },
134 | {
135 | "cell_type": "code",
136 | "execution_count": 8,
137 | "metadata": {},
138 | "outputs": [
139 | {
140 | "data": {
141 | "text/plain": [
142 | "array([1, 2, 3, 4, 5, 6, 7, 8, 9])"
143 | ]
144 | },
145 | "execution_count": 8,
146 | "metadata": {},
147 | "output_type": "execute_result"
148 | }
149 | ],
150 | "source": [
151 | "arr2"
152 | ]
153 | },
154 | {
155 | "cell_type": "markdown",
156 | "metadata": {},
157 | "source": [
158 | "### 3. np.argsort returns the indices of the sorted elements"
159 | ]
160 | },
161 | {
162 | "cell_type": "code",
163 | "execution_count": 9,
164 | "metadata": {},
165 | "outputs": [
166 | {
167 | "data": {
168 | "text/plain": [
169 | "array([3, 2, 4, 5, 1, 9, 7, 8, 6])"
170 | ]
171 | },
172 | "execution_count": 9,
173 | "metadata": {},
174 | "output_type": "execute_result"
175 | }
176 | ],
177 | "source": [
178 | "arr"
179 | ]
180 | },
181 | {
182 | "cell_type": "code",
183 | "execution_count": 10,
184 | "metadata": {},
185 | "outputs": [
186 | {
187 | "data": {
188 | "text/plain": [
189 | "array([4, 1, 0, 2, 3, 8, 6, 7, 5], dtype=int64)"
190 | ]
191 | },
192 | "execution_count": 10,
193 | "metadata": {},
194 | "output_type": "execute_result"
195 | }
196 | ],
197 | "source": [
198 | "# Get the indices that put the elements in sorted order\n",
199 | "indices = np.argsort(arr)\n",
200 | "indices"
201 | ]
202 | },
203 | {
204 | "cell_type": "code",
205 | "execution_count": 11,
206 | "metadata": {
207 | "scrolled": true
208 | },
209 | "outputs": [
210 | {
211 | "data": {
212 | "text/plain": [
213 | "array([1, 2, 3, 4, 5, 6, 7, 8, 9])"
214 | ]
215 | },
216 | "execution_count": 11,
217 | "metadata": {},
218 | "output_type": "execute_result"
219 | }
220 | ],
221 | "source": [
222 | "# Indexing with them yields the sorted data directly\n",
223 | "arr[indices]"
224 | ]
225 | },
226 | {
227 | "cell_type": "markdown",
228 | "metadata": {},
229 | "source": [
230 | "### 4. Performance of Python's built-in sorted vs. np.sort"
231 | ]
232 | },
233 | {
234 | "cell_type": "code",
235 | "execution_count": 12,
236 | "metadata": {},
237 | "outputs": [],
238 | "source": [
239 | "arr_np = np.random.randint(0, 100, 100*10000)"
240 | ]
241 | },
242 | {
243 | "cell_type": "code",
244 | "execution_count": 13,
245 | "metadata": {},
246 | "outputs": [
247 | {
248 | "name": "stdout",
249 | "output_type": "stream",
250 | "text": [
251 | "24 ms ± 2.14 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
252 | ]
253 | }
254 | ],
255 | "source": [
256 | "%timeit np.sort(arr_np)"
257 | ]
258 | },
259 | {
260 | "cell_type": "code",
261 | "execution_count": 14,
262 | "metadata": {},
263 | "outputs": [],
264 | "source": [
265 | "# Convert the NumPy array to a Python list\n",
266 | "arr_py = arr_np.tolist()"
267 | ]
268 | },
269 | {
270 | "cell_type": "code",
271 | "execution_count": 15,
272 | "metadata": {},
273 | "outputs": [
274 | {
275 | "name": "stdout",
276 | "output_type": "stream",
277 | "text": [
278 | "90.1 ms ± 726 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
279 | ]
280 | }
281 | ],
282 | "source": [
283 | "%timeit sorted(arr_py)"
284 | ]
285 | },
286 | {
287 | "cell_type": "code",
288 | "execution_count": null,
289 | "metadata": {},
290 | "outputs": [],
291 | "source": []
292 | }
293 | ],
294 | "metadata": {
295 | "kernelspec": {
296 | "display_name": "Python 3",
297 | "language": "python",
298 | "name": "python3"
299 | },
300 | "language_info": {
301 | "codemirror_mode": {
302 | "name": "ipython",
303 | "version": 3
304 | },
305 | "file_extension": ".py",
306 | "mimetype": "text/x-python",
307 | "name": "python",
308 | "nbconvert_exporter": "python",
309 | "pygments_lexer": "ipython3",
310 | "version": "3.7.6"
311 | }
312 | },
313 | "nbformat": 4,
314 | "nbformat_minor": 4
315 | }
316 |
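The notebook above introduces `np.argsort` and the `kind` parameter on a plain 1-D array. As a rough sketch of why the index array is useful, the example below reorders whole rows of a 2-D array by one column and shows what a stable sort preserves. The `scores` and `keys` arrays are invented for illustration.

```python
import numpy as np

scores = np.array([[3, 70],
                   [1, 90],
                   [2, 80]])            # columns: id, score

order = np.argsort(scores[:, 1])        # indices that sort by the score column
ascending = scores[order]               # rows reordered, lowest score first
descending = scores[order[::-1]]        # reverse the indices for highest first

# kind="stable" keeps equal keys in their original relative order
keys = np.array([1, 0, 1, 0])
stable_order = np.argsort(keys, kind="stable")

print(ascending)
print(stable_order)
```

Sorting through indices keeps each row intact, whereas calling `np.sort` on the 2-D array would sort each row or column independently.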
--------------------------------------------------------------------------------
/11. Numpy中数组的乘法.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
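7 | "## Array multiplication in NumPy\n",
8 | "\n",
9 | "Depending on the dimensions of the two operands A and B, there are the following cases:\n",
10 | "1. A number times a 1-D/2-D array;\n",
11 | "2. A 1-D array times a 1-D array;\n",
12 | "3. A 2-D array times a 1-D array;\n",
13 | "4. A 2-D array times a 2-D array.\n",
14 | "\n",
15 | "**NumPy provides the following multiplication operations:** \n",
16 | "1. The * operator or np.multiply: element-wise multiplication of elements at matching positions; the shapes must be equal or broadcastable\n",
17 | "2. The @ operator or np.matmul: matrix multiplication; the shapes must satisfy (n,k),(k,m)->(n,m)\n",
18 | "3. np.dot: dot product\n",
19 | "\n",
20 | "**Note: the dot product is also called the inner product or scalar product** \n",
21 | "For two vectors a = [a1, a2,…, an] and b = [b1, b2,…, bn] it is defined as: \n",
22 | "a·b=a1b1+a2b2+……+anbn."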
23 | ]
24 | },
25 | {
26 | "cell_type": "code",
27 | "execution_count": 1,
28 | "metadata": {},
29 | "outputs": [],
30 | "source": [
31 | "import numpy as np"
32 | ]
33 | },
34 | {
35 | "cell_type": "markdown",
36 | "metadata": {},
37 | "source": [
38 | "### 1. A number times a 1-D/2-D array"
39 | ]
40 | },
41 | {
42 | "cell_type": "markdown",
43 | "metadata": {},
44 | "source": [
45 | "#### 1-D array"
46 | ]
47 | },
48 | {
49 | "cell_type": "code",
50 | "execution_count": 2,
51 | "metadata": {},
52 | "outputs": [
53 | {
54 | "data": {
55 | "text/plain": [
56 | "array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])"
57 | ]
58 | },
59 | "execution_count": 2,
60 | "metadata": {},
61 | "output_type": "execute_result"
62 | }
63 | ],
64 | "source": [
65 | "A = np.arange(10)\n",
66 | "A"
67 | ]
68 | },
69 | {
70 | "cell_type": "code",
71 | "execution_count": 3,
72 | "metadata": {},
73 | "outputs": [
74 | {
75 | "data": {
76 | "text/plain": [
77 | "array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5])"
78 | ]
79 | },
80 | "execution_count": 3,
81 | "metadata": {},
82 | "output_type": "execute_result"
83 | }
84 | ],
85 | "source": [
86 | "# * means element-wise multiplication\n",
87 | "A * 0.5"
88 | ]
89 | },
90 | {
91 | "cell_type": "markdown",
92 | "metadata": {},
93 | "source": [
94 | "#### 2-D array"
95 | ]
96 | },
97 | {
98 | "cell_type": "code",
99 | "execution_count": 4,
100 | "metadata": {},
101 | "outputs": [
102 | {
103 | "data": {
104 | "text/plain": [
105 | "array([[ 0, 1, 2, 3],\n",
106 | " [ 4, 5, 6, 7],\n",
107 | " [ 8, 9, 10, 11]])"
108 | ]
109 | },
110 | "execution_count": 4,
111 | "metadata": {},
112 | "output_type": "execute_result"
113 | }
114 | ],
115 | "source": [
116 | "B = np.arange(12).reshape(3, 4)\n",
117 | "B"
118 | ]
119 | },
120 | {
121 | "cell_type": "code",
122 | "execution_count": 5,
123 | "metadata": {},
124 | "outputs": [
125 | {
126 | "data": {
127 | "text/plain": [
128 | "array([[0. , 0.5, 1. , 1.5],\n",
129 | " [2. , 2.5, 3. , 3.5],\n",
130 | " [4. , 4.5, 5. , 5.5]])"
131 | ]
132 | },
133 | "execution_count": 5,
134 | "metadata": {},
135 | "output_type": "execute_result"
136 | }
137 | ],
138 | "source": [
139 | "B * 0.5"
140 | ]
141 | },
142 | {
143 | "cell_type": "markdown",
144 | "metadata": {},
145 | "source": [
146 | "### 2. A 1-D array times a 1-D array"
147 | ]
148 | },
149 | {
150 | "cell_type": "code",
151 | "execution_count": 6,
152 | "metadata": {},
153 | "outputs": [
154 | {
155 | "data": {
156 | "text/plain": [
157 | "array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])"
158 | ]
159 | },
160 | "execution_count": 6,
161 | "metadata": {},
162 | "output_type": "execute_result"
163 | }
164 | ],
165 | "source": [
166 | "A = np.arange(1, 11)\n",
167 | "A"
168 | ]
169 | },
170 | {
171 | "cell_type": "code",
172 | "execution_count": 7,
173 | "metadata": {},
174 | "outputs": [
175 | {
176 | "data": {
177 | "text/plain": [
178 | "array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])"
179 | ]
180 | },
181 | "execution_count": 7,
182 | "metadata": {},
183 | "output_type": "execute_result"
184 | }
185 | ],
186 | "source": [
187 | "B = np.arange(1, 11) * 0.1\n",
188 | "B"
189 | ]
190 | },
191 | {
192 | "cell_type": "markdown",
193 | "metadata": {},
194 | "source": [
195 | "#### Element-wise multiplication"
196 | ]
197 | },
198 | {
199 | "cell_type": "code",
200 | "execution_count": 8,
201 | "metadata": {},
202 | "outputs": [
203 | {
204 | "data": {
205 | "text/plain": [
206 | "array([ 0.1, 0.4, 0.9, 1.6, 2.5, 3.6, 4.9, 6.4, 8.1, 10. ])"
207 | ]
208 | },
209 | "execution_count": 8,
210 | "metadata": {},
211 | "output_type": "execute_result"
212 | }
213 | ],
214 | "source": [
215 | "np.multiply(A, B)"
216 | ]
217 | },
218 | {
219 | "cell_type": "code",
220 | "execution_count": 9,
221 | "metadata": {},
222 | "outputs": [
223 | {
224 | "data": {
225 | "text/plain": [
226 | "array([ 0.1, 0.4, 0.9, 1.6, 2.5, 3.6, 4.9, 6.4, 8.1, 10. ])"
227 | ]
228 | },
229 | "execution_count": 9,
230 | "metadata": {},
231 | "output_type": "execute_result"
232 | }
233 | ],
234 | "source": [
235 | "A * B"
236 | ]
237 | },
238 | {
239 | "cell_type": "markdown",
240 | "metadata": {},
241 | "source": [
242 | "#### Dot product / inner product / scalar product"
243 | ]
244 | },
245 | {
246 | "cell_type": "code",
247 | "execution_count": 11,
248 | "metadata": {},
249 | "outputs": [
250 | {
251 | "data": {
252 | "text/plain": [
253 | "38.5"
254 | ]
255 | },
256 | "execution_count": 11,
257 | "metadata": {},
258 | "output_type": "execute_result"
259 | }
260 | ],
261 | "source": [
262 | "A@B"
263 | ]
264 | },
265 | {
266 | "cell_type": "code",
267 | "execution_count": 12,
268 | "metadata": {},
269 | "outputs": [
270 | {
271 | "data": {
272 | "text/plain": [
273 | "38.5"
274 | ]
275 | },
276 | "execution_count": 12,
277 | "metadata": {},
278 | "output_type": "execute_result"
279 | }
280 | ],
281 | "source": [
282 | "np.matmul(A, B)"
283 | ]
284 | },
285 | {
286 | "cell_type": "code",
287 | "execution_count": 10,
288 | "metadata": {},
289 | "outputs": [
290 | {
291 | "data": {
292 | "text/plain": [
293 | "38.5"
294 | ]
295 | },
296 | "execution_count": 10,
297 | "metadata": {},
298 | "output_type": "execute_result"
299 | }
300 | ],
301 | "source": [
302 | "np.dot(A, B)"
303 | ]
304 | },
305 | {
306 | "cell_type": "code",
307 | "execution_count": 13,
308 | "metadata": {},
309 | "outputs": [
310 | {
311 | "data": {
312 | "text/plain": [
313 | "38.5"
314 | ]
315 | },
316 | "execution_count": 13,
317 | "metadata": {},
318 | "output_type": "execute_result"
319 | }
320 | ],
321 | "source": [
322 | "# The three calls above are equivalent to\n",
323 | "np.sum(A*B)"
324 | ]
325 | },
326 | {
327 | "cell_type": "markdown",
328 | "metadata": {},
329 | "source": [
330 | "### 3. A 2-D array times a 1-D array"
331 | ]
332 | },
333 | {
334 | "cell_type": "code",
335 | "execution_count": 14,
336 | "metadata": {},
337 | "outputs": [
338 | {
339 | "data": {
340 | "text/plain": [
341 | "array([[ 1, 2, 3, 4],\n",
342 | " [ 5, 6, 7, 8],\n",
343 | " [ 9, 10, 11, 12],\n",
344 | " [13, 14, 15, 16],\n",
345 | " [17, 18, 19, 20]])"
346 | ]
347 | },
348 | "execution_count": 14,
349 | "metadata": {},
350 | "output_type": "execute_result"
351 | }
352 | ],
353 | "source": [
354 | "A = np.arange(1, 21).reshape(5, 4)\n",
355 | "A"
356 | ]
357 | },
358 | {
359 | "cell_type": "code",
360 | "execution_count": 15,
361 | "metadata": {},
362 | "outputs": [
363 | {
364 | "data": {
365 | "text/plain": [
366 | "array([0.1, 0.2, 0.3, 0.4])"
367 | ]
368 | },
369 | "execution_count": 15,
370 | "metadata": {},
371 | "output_type": "execute_result"
372 | }
373 | ],
374 | "source": [
375 | "B = np.arange(1, 5) * 0.1\n",
376 | "B"
377 | ]
378 | },
379 | {
380 | "cell_type": "markdown",
381 | "metadata": {},
382 | "source": [
383 | "#### Element-wise multiplication"
384 | ]
385 | },
386 | {
387 | "cell_type": "code",
388 | "execution_count": 16,
389 | "metadata": {},
390 | "outputs": [
391 | {
392 | "data": {
393 | "text/plain": [
394 | "array([[0.1, 0.4, 0.9, 1.6],\n",
395 | " [0.5, 1.2, 2.1, 3.2],\n",
396 | " [0.9, 2. , 3.3, 4.8],\n",
397 | " [1.3, 2.8, 4.5, 6.4],\n",
398 | " [1.7, 3.6, 5.7, 8. ]])"
399 | ]
400 | },
401 | "execution_count": 16,
402 | "metadata": {},
403 | "output_type": "execute_result"
404 | }
405 | ],
406 | "source": [
407 | "A*B"
408 | ]
409 | },
410 | {
411 | "cell_type": "code",
412 | "execution_count": 17,
413 | "metadata": {},
414 | "outputs": [
415 | {
416 | "data": {
417 | "text/plain": [
418 | "array([[0.1, 0.4, 0.9, 1.6],\n",
419 | " [0.5, 1.2, 2.1, 3.2],\n",
420 | " [0.9, 2. , 3.3, 4.8],\n",
421 | " [1.3, 2.8, 4.5, 6.4],\n",
422 | " [1.7, 3.6, 5.7, 8. ]])"
423 | ]
424 | },
425 | "execution_count": 17,
426 | "metadata": {},
427 | "output_type": "execute_result"
428 | }
429 | ],
430 | "source": [
431 | "np.multiply(A, B)"
432 | ]
433 | },
434 | {
435 | "cell_type": "markdown",
436 | "metadata": {},
437 | "source": [
438 | "#### Matrix multiplication"
439 | ]
440 | },
441 | {
442 | "cell_type": "code",
443 | "execution_count": 18,
444 | "metadata": {},
445 | "outputs": [
446 | {
447 | "data": {
448 | "text/plain": [
449 | "array([ 3., 7., 11., 15., 19.])"
450 | ]
451 | },
452 | "execution_count": 18,
453 | "metadata": {},
454 | "output_type": "execute_result"
455 | }
456 | ],
457 | "source": [
458 | "A@B"
459 | ]
460 | },
461 | {
462 | "cell_type": "code",
463 | "execution_count": 19,
464 | "metadata": {},
465 | "outputs": [
466 | {
467 | "data": {
468 | "text/plain": [
469 | "array([ 3., 7., 11., 15., 19.])"
470 | ]
471 | },
472 | "execution_count": 19,
473 | "metadata": {},
474 | "output_type": "execute_result"
475 | }
476 | ],
477 | "source": [
478 | "np.matmul(A, B)"
479 | ]
480 | },
481 | {
482 | "cell_type": "code",
483 | "execution_count": 20,
484 | "metadata": {},
485 | "outputs": [
486 | {
487 | "data": {
488 | "text/plain": [
489 | "array([ 3., 7., 11., 15., 19.])"
490 | ]
491 | },
492 | "execution_count": 20,
493 | "metadata": {},
494 | "output_type": "execute_result"
495 | }
496 | ],
497 | "source": [
498 | "np.dot(A, B)"
499 | ]
500 | },
501 | {
502 | "cell_type": "markdown",
503 | "metadata": {},
504 | "source": [
505 | "### 4. A and B are both 2-D arrays: matrix multiplication"
506 | ]
507 | },
508 | {
509 | "cell_type": "code",
510 | "execution_count": 21,
511 | "metadata": {},
512 | "outputs": [
513 | {
514 | "data": {
515 | "text/plain": [
516 | "array([[ 0, 1, 2, 3],\n",
517 | " [ 4, 5, 6, 7],\n",
518 | " [ 8, 9, 10, 11]])"
519 | ]
520 | },
521 | "execution_count": 21,
522 | "metadata": {},
523 | "output_type": "execute_result"
524 | }
525 | ],
526 | "source": [
527 | "A = np.arange(12).reshape(3, 4)\n",
528 | "A"
529 | ]
530 | },
531 | {
532 | "cell_type": "code",
533 | "execution_count": 22,
534 | "metadata": {},
535 | "outputs": [
536 | {
537 | "data": {
538 | "text/plain": [
539 | "array([[ 0, 1, 2, 3, 4],\n",
540 | " [ 5, 6, 7, 8, 9],\n",
541 | " [10, 11, 12, 13, 14],\n",
542 | " [15, 16, 17, 18, 19]])"
543 | ]
544 | },
545 | "execution_count": 22,
546 | "metadata": {},
547 | "output_type": "execute_result"
548 | }
549 | ],
550 | "source": [
551 | "B = np.arange(20).reshape(4, 5)\n",
552 | "B"
553 | ]
554 | },
555 | {
556 | "cell_type": "code",
557 | "execution_count": 23,
558 | "metadata": {
559 | "scrolled": true
560 | },
561 | "outputs": [
562 | {
563 | "data": {
564 | "text/plain": [
565 | "array([[ 70, 76, 82, 88, 94],\n",
566 | " [190, 212, 234, 256, 278],\n",
567 | " [310, 348, 386, 424, 462]])"
568 | ]
569 | },
570 | "execution_count": 23,
571 | "metadata": {},
572 | "output_type": "execute_result"
573 | }
574 | ],
575 | "source": [
576 | "A@B"
577 | ]
578 | },
579 | {
580 | "cell_type": "code",
581 | "execution_count": 24,
582 | "metadata": {},
583 | "outputs": [
584 | {
585 | "data": {
586 | "text/plain": [
587 | "array([[ 70, 76, 82, 88, 94],\n",
588 | " [190, 212, 234, 256, 278],\n",
589 | " [310, 348, 386, 424, 462]])"
590 | ]
591 | },
592 | "execution_count": 24,
593 | "metadata": {},
594 | "output_type": "execute_result"
595 | }
596 | ],
597 | "source": [
598 | "np.matmul(A, B)"
599 | ]
600 | },
601 | {
602 | "cell_type": "code",
603 | "execution_count": 25,
604 | "metadata": {},
605 | "outputs": [
606 | {
607 | "data": {
608 | "text/plain": [
609 | "array([[ 70, 76, 82, 88, 94],\n",
610 | " [190, 212, 234, 256, 278],\n",
611 | " [310, 348, 386, 424, 462]])"
612 | ]
613 | },
614 | "execution_count": 25,
615 | "metadata": {},
616 | "output_type": "execute_result"
617 | }
618 | ],
619 | "source": [
620 | "np.dot(A, B)"
621 | ]
622 | },
623 | {
624 | "cell_type": "code",
625 | "execution_count": null,
626 | "metadata": {},
627 | "outputs": [],
628 | "source": []
629 | }
630 | ],
631 | "metadata": {
632 | "kernelspec": {
633 | "display_name": "Python 3",
634 | "language": "python",
635 | "name": "python3"
636 | },
637 | "language_info": {
638 | "codemirror_mode": {
639 | "name": "ipython",
640 | "version": 3
641 | },
642 | "file_extension": ".py",
643 | "mimetype": "text/x-python",
644 | "name": "python",
645 | "nbconvert_exporter": "python",
646 | "pygments_lexer": "ipython3",
647 | "version": "3.7.6"
648 | }
649 | },
650 | "nbformat": 4,
651 | "nbformat_minor": 4
652 | }
653 |
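The last three cells above compute the same product in three ways. As a minimal sketch, assuming the same `A` and `B` as in those cells, the agreement can be checked programmatically instead of by eye:

```python
import numpy as np

A = np.arange(12).reshape(3, 4)
B = np.arange(20).reshape(4, 5)

# For 2-D arrays, the @ operator, np.matmul and np.dot all compute the matrix product.
C_at = A @ B
C_matmul = np.matmul(A, B)
C_dot = np.dot(A, B)

# Compare the three results element by element.
same = np.array_equal(C_at, C_matmul) and np.array_equal(C_at, C_dot)
print(C_at.shape, same)
```

The equivalence holds for 2-D inputs like these; for arrays with more than two dimensions, `np.dot` and `np.matmul` follow different rules, so `@` or `np.matmul` is usually the safer choice for matrix products.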
--------------------------------------------------------------------------------
/12. Numpy中重要的广播概念.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## The important concept of broadcasting in NumPy"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
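14 | "***Broadcasting:*** \n",
15 | "A set of rules for applying binary universal functions (addition, subtraction, multiplication, and so on) to arrays of different sizes\n",
16 | "\n",
17 | "***The broadcasting rules:***\n",
18 | "1. If the two arrays have a different number of dimensions (ndim), the shape of the array with fewer dimensions is padded with 1s on its left side;\n",
19 | "2. If the shapes differ in some dimension and one of the arrays has size 1 in that dimension, that dimension is stretched to match the other array;\n",
20 | "3. If the shapes differ in some dimension and neither array has size 1 in that dimension, broadcasting fails and an error is raised;"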
21 | ]
22 | },
23 | {
24 | "cell_type": "code",
25 | "execution_count": 1,
26 | "metadata": {},
27 | "outputs": [],
28 | "source": [
29 | "import numpy as np"
30 | ]
31 | },
32 | {
33 | "cell_type": "markdown",
34 | "metadata": {},
35 | "source": [
36 | "### Example 1: adding a 1-D array to a 2-D array"
37 | ]
38 | },
39 | {
40 | "cell_type": "code",
41 | "execution_count": 2,
42 | "metadata": {},
43 | "outputs": [
44 | {
45 | "data": {
46 | "text/plain": [
47 | "array([[1., 1., 1.],\n",
48 | " [1., 1., 1.]])"
49 | ]
50 | },
51 | "execution_count": 2,
52 | "metadata": {},
53 | "output_type": "execute_result"
54 | }
55 | ],
56 | "source": [
57 | "a = np.ones((2,3))\n",
58 | "a"
59 | ]
60 | },
61 | {
62 | "cell_type": "code",
63 | "execution_count": 3,
64 | "metadata": {},
65 | "outputs": [
66 | {
67 | "data": {
68 | "text/plain": [
69 | "array([0, 1, 2])"
70 | ]
71 | },
72 | "execution_count": 3,
73 | "metadata": {},
74 | "output_type": "execute_result"
75 | }
76 | ],
77 | "source": [
78 | "b = np.arange(3)\n",
79 | "b"
80 | ]
81 | },
82 | {
83 | "cell_type": "code",
84 | "execution_count": 4,
85 | "metadata": {},
86 | "outputs": [
87 | {
88 | "data": {
89 | "text/plain": [
90 | "((2, 3), (3,))"
91 | ]
92 | },
93 | "execution_count": 4,
94 | "metadata": {},
95 | "output_type": "execute_result"
96 | }
97 | ],
98 | "source": [
99 | "a.shape, b.shape"
100 | ]
101 | },
102 | {
103 | "cell_type": "code",
104 | "execution_count": 5,
105 | "metadata": {},
106 | "outputs": [
107 | {
108 | "data": {
109 | "text/plain": [
110 | "array([[1., 2., 3.],\n",
111 | " [1., 2., 3.]])"
112 | ]
113 | },
114 | "execution_count": 5,
115 | "metadata": {},
116 | "output_type": "execute_result"
117 | }
118 | ],
119 | "source": [
120 | "# The shapes differ, but the arrays can still be added\n",
121 | "a + b"
122 | ]
123 | },
124 | {
125 | "cell_type": "markdown",
126 | "metadata": {},
127 | "source": [
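128 | "***Analysis: a.shape=(2, 3), b.shape=(3,)***\n",
129 | "1. By rule 1, b.shape becomes (1, 3)\n",
130 | "2. By rule 2, b.shape is then stretched to (2, 3), which is equivalent to repeating b along the rows\n",
131 | "3. The shapes now match"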
132 | ]
133 | },
134 | {
135 | "cell_type": "markdown",
136 | "metadata": {},
137 | "source": [
138 | "### Example 2: both arrays need broadcasting"
139 | ]
140 | },
141 | {
142 | "cell_type": "code",
143 | "execution_count": 6,
144 | "metadata": {},
145 | "outputs": [
146 | {
147 | "data": {
148 | "text/plain": [
149 | "array([[0],\n",
150 | " [1],\n",
151 | " [2]])"
152 | ]
153 | },
154 | "execution_count": 6,
155 | "metadata": {},
156 | "output_type": "execute_result"
157 | }
158 | ],
159 | "source": [
160 | "a = np.arange(3).reshape((3, 1))\n",
161 | "a"
162 | ]
163 | },
164 | {
165 | "cell_type": "code",
166 | "execution_count": 7,
167 | "metadata": {},
168 | "outputs": [
169 | {
170 | "data": {
171 | "text/plain": [
172 | "array([0, 1, 2])"
173 | ]
174 | },
175 | "execution_count": 7,
176 | "metadata": {},
177 | "output_type": "execute_result"
178 | }
179 | ],
180 | "source": [
181 | "b = np.arange(3)\n",
182 | "b"
183 | ]
184 | },
185 | {
186 | "cell_type": "code",
187 | "execution_count": 8,
188 | "metadata": {},
189 | "outputs": [
190 | {
191 | "data": {
192 | "text/plain": [
193 | "((3, 1), (3,))"
194 | ]
195 | },
196 | "execution_count": 8,
197 | "metadata": {},
198 | "output_type": "execute_result"
199 | }
200 | ],
201 | "source": [
202 | "a.shape, b.shape"
203 | ]
204 | },
205 | {
206 | "cell_type": "code",
207 | "execution_count": 9,
208 | "metadata": {
209 | "scrolled": true
210 | },
211 | "outputs": [
212 | {
213 | "data": {
214 | "text/plain": [
215 | "array([[0, 1, 2],\n",
216 | " [1, 2, 3],\n",
217 | " [2, 3, 4]])"
218 | ]
219 | },
220 | "execution_count": 9,
221 | "metadata": {},
222 | "output_type": "execute_result"
223 | }
224 | ],
225 | "source": [
226 | "a + b"
227 | ]
228 | },
229 | {
230 | "cell_type": "markdown",
231 | "metadata": {},
232 | "source": [
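233 | "***Analysis: a.shape is (3, 1), b.shape is (3,)***\n",
234 | "1. By rule 1, b.shape becomes (1, 3)\n",
235 | "2. By rule 2, b.shape is then stretched to (3, 3), which is equivalent to repeating b along the rows\n",
236 | "3. By rule 2, a.shape is also stretched to (3, 3), which is equivalent to repeating a along the columns\n",
237 | "4. The shapes now match"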
238 | ]
239 | },
240 | {
241 | "cell_type": "markdown",
242 | "metadata": {},
243 | "source": [
244 | "### Example 3: shapes that cannot be broadcast"
245 | ]
246 | },
247 | {
248 | "cell_type": "code",
249 | "execution_count": 10,
250 | "metadata": {},
251 | "outputs": [
252 | {
253 | "data": {
254 | "text/plain": [
255 | "array([[1., 1.],\n",
256 | " [1., 1.],\n",
257 | " [1., 1.]])"
258 | ]
259 | },
260 | "execution_count": 10,
261 | "metadata": {},
262 | "output_type": "execute_result"
263 | }
264 | ],
265 | "source": [
266 | "a = np.ones((3,2))\n",
267 | "a"
268 | ]
269 | },
270 | {
271 | "cell_type": "code",
272 | "execution_count": 11,
273 | "metadata": {},
274 | "outputs": [
275 | {
276 | "data": {
277 | "text/plain": [
278 | "array([0, 1, 2])"
279 | ]
280 | },
281 | "execution_count": 11,
282 | "metadata": {},
283 | "output_type": "execute_result"
284 | }
285 | ],
286 | "source": [
287 | "b = np.arange(3)\n",
288 | "b"
289 | ]
290 | },
291 | {
292 | "cell_type": "code",
293 | "execution_count": 12,
294 | "metadata": {},
295 | "outputs": [
296 | {
297 | "data": {
298 | "text/plain": [
299 | "((3, 2), (3,))"
300 | ]
301 | },
302 | "execution_count": 12,
303 | "metadata": {},
304 | "output_type": "execute_result"
305 | }
306 | ],
307 | "source": [
308 | "a.shape, b.shape"
309 | ]
310 | },
311 | {
312 | "cell_type": "code",
313 | "execution_count": 13,
314 | "metadata": {},
315 | "outputs": [
316 | {
317 | "ename": "ValueError",
318 | "evalue": "operands could not be broadcast together with shapes (3,2) (3,) ",
319 | "output_type": "error",
320 | "traceback": [
321 | "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
322 | "\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)",
323 | "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0ma\u001b[0m \u001b[1;33m+\u001b[0m \u001b[0mb\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
324 | "\u001b[1;31mValueError\u001b[0m: operands could not be broadcast together with shapes (3,2) (3,) "
325 | ]
326 | }
327 | ],
328 | "source": [
329 | "a + b"
330 | ]
331 | },
332 | {
333 | "cell_type": "markdown",
334 | "metadata": {},
335 | "source": [
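336 | "***Analysis: a.shape is (3, 2), b.shape is (3,)***\n",
337 | "1. By rule 1, b.shape becomes (1, 3)\n",
338 | "2. By rule 2, the first dimension of b is stretched, so b.shape becomes (3, 3), which is equivalent to repeating b along the rows\n",
339 | "3. By rule 3, the last dimensions still differ (2 versus 3) and neither is 1, so broadcasting fails with an error"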
340 | ]
341 | }
342 | ],
343 | "metadata": {
344 | "kernelspec": {
345 | "display_name": "Python 3",
346 | "language": "python",
347 | "name": "python3"
348 | },
349 | "language_info": {
350 | "codemirror_mode": {
351 | "name": "ipython",
352 | "version": 3
353 | },
354 | "file_extension": ".py",
355 | "mimetype": "text/x-python",
356 | "name": "python",
357 | "nbconvert_exporter": "python",
358 | "pygments_lexer": "ipython3",
359 | "version": "3.7.6"
360 | }
361 | },
362 | "nbformat": 4,
363 | "nbformat_minor": 4
364 | }
365 |
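The three examples in this notebook can be replayed in one place. The sketch below is illustrative only: `try_add` is a hypothetical helper (not part of NumPy) that returns the result shape of `a + b`, or `None` when NumPy raises the broadcasting `ValueError`. The last line shows one common way to make Example 3 work, assuming the intent was to add one value of `b` to each row of `a`.

```python
import numpy as np

def try_add(a, b):
    """Hypothetical helper: shape of a + b, or None if the shapes cannot be broadcast."""
    try:
        return (a + b).shape
    except ValueError:
        return None

shape1 = try_add(np.ones((2, 3)), np.arange(3))             # Example 1: (2, 3) + (3,)
shape2 = try_add(np.arange(3).reshape(3, 1), np.arange(3))  # Example 2: (3, 1) + (3,)
shape3 = try_add(np.ones((3, 2)), np.arange(3))             # Example 3: (3, 2) + (3,)
print(shape1, shape2, shape3)

# Giving b an explicit trailing axis turns (3,) into (3, 1), which broadcasts against (3, 2).
fixed = try_add(np.ones((3, 2)), np.arange(3)[:, np.newaxis])
print(fixed)
```

Adding the axis explicitly with `np.newaxis` (or `reshape`) is often clearer than relying on rule 1, because rule 1 only ever pads on the left.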
--------------------------------------------------------------------------------
/13. Numpy求解线性方程组.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Solving a system of linear equations with NumPy\n",
8 | "\n",
9 | "Given A and b in Ax=b, how do we compute x?"
10 | ]
11 | },
12 | {
13 | "cell_type": "markdown",
14 | "metadata": {},
15 | "source": [
16 | "### 1. Import the package"
17 | ]
18 | },
19 | {
20 | "cell_type": "code",
21 | "execution_count": 1,
22 | "metadata": {},
23 | "outputs": [],
24 | "source": [
25 | "import numpy as np"
26 | ]
27 | },
28 | {
29 | "cell_type": "markdown",
30 | "metadata": {},
31 | "source": [
32 | "### 2. Solve"
33 | ]
34 | },
35 | {
36 | "cell_type": "code",
37 | "execution_count": 2,
38 | "metadata": {},
39 | "outputs": [
40 | {
41 | "data": {
42 | "text/plain": [
43 | "array([[ 1, -2, 1],\n",
44 | " [ 0, 2, -8],\n",
45 | " [-4, 5, 9]])"
46 | ]
47 | },
48 | "execution_count": 2,
49 | "metadata": {},
50 | "output_type": "execute_result"
51 | }
52 | ],
53 | "source": [
54 | "A = np.array(\n",
55 | " [\n",
56 | " [1, -2, 1],\n",
57 | " [0, 2, -8],\n",
58 | " [-4, 5, 9]\n",
59 | " ]\n",
60 | ")\n",
61 | "A"
62 | ]
63 | },
64 | {
65 | "cell_type": "code",
66 | "execution_count": 3,
67 | "metadata": {},
68 | "outputs": [
69 | {
70 | "data": {
71 | "text/plain": [
72 | "array([ 0, 8, -9])"
73 | ]
74 | },
75 | "execution_count": 3,
76 | "metadata": {},
77 | "output_type": "execute_result"
78 | }
79 | ],
80 | "source": [
81 | "b = np.array([0, 8, -9])\n",
82 | "b"
83 | ]
84 | },
85 | {
86 | "cell_type": "code",
87 | "execution_count": 4,
88 | "metadata": {},
89 | "outputs": [
90 | {
91 | "data": {
92 | "text/plain": [
93 | "array([29., 16., 3.])"
94 | ]
95 | },
96 | "execution_count": 4,
97 | "metadata": {},
98 | "output_type": "execute_result"
99 | }
100 | ],
101 | "source": [
102 | "# Call np.linalg.solve to solve the system directly\n",
103 | "x = np.linalg.solve(A, b)\n",
104 | "x"
105 | ]
106 | },
107 | {
108 | "cell_type": "markdown",
109 | "metadata": {},
110 | "source": [
111 | "### Verification"
112 | ]
113 | },
114 | {
115 | "cell_type": "code",
116 | "execution_count": 5,
117 | "metadata": {},
118 | "outputs": [
119 | {
120 | "data": {
121 | "text/plain": [
122 | "8.0"
123 | ]
124 | },
125 | "execution_count": 5,
126 | "metadata": {},
127 | "output_type": "execute_result"
128 | }
129 | ],
130 | "source": [
131 | "# Check a single equation (the second row of A)\n",
132 | "A[1].dot(x)"
133 | ]
134 | },
135 | {
136 | "cell_type": "code",
137 | "execution_count": 6,
138 | "metadata": {},
139 | "outputs": [
140 | {
141 | "data": {
142 | "text/plain": [
143 | "array([ True, True, True])"
144 | ]
145 | },
146 | "execution_count": 6,
147 | "metadata": {},
148 | "output_type": "execute_result"
149 | }
150 | ],
151 | "source": [
152 | "# Check the whole system (exact == works here; np.allclose is more robust for floats in general)\n",
153 | "A.dot(x) == b"
154 | ]
155 | },
156 | {
157 | "cell_type": "code",
158 | "execution_count": null,
159 | "metadata": {},
160 | "outputs": [],
161 | "source": []
162 | }
163 | ],
164 | "metadata": {
165 | "kernelspec": {
166 | "display_name": "Python 3",
167 | "language": "python",
168 | "name": "python3"
169 | },
170 | "language_info": {
171 | "codemirror_mode": {
172 | "name": "ipython",
173 | "version": 3
174 | },
175 | "file_extension": ".py",
176 | "mimetype": "text/x-python",
177 | "name": "python",
178 | "nbconvert_exporter": "python",
179 | "pygments_lexer": "ipython3",
180 | "version": "3.7.6"
181 | }
182 | },
183 | "nbformat": 4,
184 | "nbformat_minor": 4
185 | }
186 |
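The verification above compares floating-point results with `==`, which happens to succeed for this system. A minimal sketch of a more tolerant check, using the same `A` and `b` as the notebook and `np.allclose` for the comparison:

```python
import numpy as np

A = np.array([
    [1, -2, 1],
    [0, 2, -8],
    [-4, 5, 9],
])
b = np.array([0, 8, -9])

# Solve Ax = b directly.
x = np.linalg.solve(A, b)

# Compare A @ x with b up to floating-point tolerance instead of exact equality.
residual_ok = np.allclose(A @ x, b)
print(x, residual_ok)
```

`np.linalg.solve` raises `numpy.linalg.LinAlgError` when `A` is singular, so in less controlled settings the call is usually wrapped in error handling.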
--------------------------------------------------------------------------------
/14. Numpy实现SVD矩阵分解.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## SVD matrix decomposition with NumPy"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
14 | "### 1. Import the package"
15 | ]
16 | },
17 | {
18 | "cell_type": "code",
19 | "execution_count": 1,
20 | "metadata": {},
21 | "outputs": [],
22 | "source": [
23 | "import numpy as np"
24 | ]
25 | },
26 | {
27 | "cell_type": "markdown",
28 | "metadata": {},
29 | "source": [
30 | "### 2. Compute the decomposition"
31 | ]
32 | },
33 | {
34 | "cell_type": "code",
35 | "execution_count": 2,
36 | "metadata": {},
37 | "outputs": [],
38 | "source": [
39 | "A = np.random.randint(1, 10, (8, 4))"
40 | ]
41 | },
42 | {
43 | "cell_type": "code",
44 | "execution_count": 3,
45 | "metadata": {},
46 | "outputs": [
47 | {
48 | "data": {
49 | "text/plain": [
50 | "array([[6, 5, 1, 5],\n",
51 | " [1, 7, 9, 7],\n",
52 | " [7, 2, 4, 2],\n",
53 | " [6, 4, 3, 5],\n",
54 | " [2, 8, 8, 6],\n",
55 | " [5, 2, 8, 6],\n",
56 | " [7, 8, 2, 3],\n",
57 | " [1, 3, 6, 9]])"
58 | ]
59 | },
60 | "execution_count": 3,
61 | "metadata": {},
62 | "output_type": "execute_result"
63 | }
64 | ],
65 | "source": [
66 | "A"
67 | ]
68 | },
69 | {
70 | "cell_type": "code",
71 | "execution_count": 4,
72 | "metadata": {},
73 | "outputs": [],
74 | "source": [
75 | "# Compute the SVD; the third return value is already V transposed (often written Vh)\n",
76 | "U, S, V = np.linalg.svd(A, full_matrices=False)"
77 | ]
78 | },
79 | {
80 | "cell_type": "code",
81 | "execution_count": 5,
82 | "metadata": {},
83 | "outputs": [
84 | {
85 | "data": {
86 | "text/plain": [
87 | "((8, 4), (4,), (4, 4))"
88 | ]
89 | },
90 | "execution_count": 5,
91 | "metadata": {},
92 | "output_type": "execute_result"
93 | }
94 | ],
95 | "source": [
96 | "U.shape, S.shape, V.shape"
97 | ]
98 | },
99 | {
100 | "cell_type": "code",
101 | "execution_count": 6,
102 | "metadata": {},
103 | "outputs": [
104 | {
105 | "data": {
106 | "text/plain": [
107 | "array([[-0.28611227, -0.38768744, -0.07088588, -0.47757145],\n",
108 | " [-0.44374671, 0.40390585, -0.25458601, 0.20383531],\n",
109 | " [-0.24657791, -0.34884357, 0.43054458, 0.4062272 ],\n",
110 | " [-0.30673084, -0.27495123, 0.14797683, -0.2218886 ],\n",
111 | " [-0.43671345, 0.23339125, -0.39431663, 0.27599841],\n",
112 | " [-0.37257929, 0.10313032, 0.59362412, 0.23542645],\n",
113 | " [-0.33314069, -0.52514475, -0.41727103, 0.07285924],\n",
114 | " [-0.35472167, 0.38520663, 0.20225001, -0.61580222]])"
115 | ]
116 | },
117 | "execution_count": 6,
118 | "metadata": {},
119 | "output_type": "execute_result"
120 | }
121 | ],
122 | "source": [
123 | "U"
124 | ]
125 | },
126 | {
127 | "cell_type": "code",
128 | "execution_count": 7,
129 | "metadata": {},
130 | "outputs": [
131 | {
132 | "data": {
133 | "text/plain": [
134 | "array([28.44730142, 10.24874824, 6.39012419, 4.56952014])"
135 | ]
136 | },
137 | "execution_count": 7,
138 | "metadata": {},
139 | "output_type": "execute_result"
140 | }
141 | ],
142 | "source": [
143 | "# S is a diagonal matrix, so only its diagonal (the singular values) is returned, as a 1-D array\n",
144 | "S"
145 | ]
146 | },
147 | {
148 | "cell_type": "code",
149 | "execution_count": 8,
150 | "metadata": {},
151 | "outputs": [
152 | {
153 | "data": {
154 | "text/plain": [
155 | "array([[28.44730142, 0. , 0. , 0. ],\n",
156 | " [ 0. , 10.24874824, 0. , 0. ],\n",
157 | " [ 0. , 0. , 6.39012419, 0. ],\n",
158 | " [ 0. , 0. , 0. , 4.56952014]])"
159 | ]
160 | },
161 | "execution_count": 8,
162 | "metadata": {},
163 | "output_type": "execute_result"
164 | }
165 | ],
166 | "source": [
167 | "np.diag(S)"
168 | ]
169 | },
170 | {
171 | "cell_type": "code",
172 | "execution_count": 9,
173 | "metadata": {},
174 | "outputs": [
175 | {
176 | "data": {
177 | "text/plain": [
178 | "array([[-0.39194862, -0.50004828, -0.54329548, -0.54877866],\n",
179 | " [-0.81202147, -0.18350883, 0.48594814, 0.26608277],\n",
180 | " [ 0.41980592, -0.84227439, 0.27814277, 0.19228481],\n",
181 | " [ 0.10373231, 0.08276523, 0.62555658, -0.76880979]])"
182 | ]
183 | },
184 | "execution_count": 9,
185 | "metadata": {},
186 | "output_type": "execute_result"
187 | }
188 | ],
189 | "source": [
190 | "V"
191 | ]
192 | },
193 | {
194 | "cell_type": "markdown",
195 | "metadata": {},
196 | "source": [
197 | "### 3. Reconstruct the matrix from its factors"
198 | ]
199 | },
200 | {
201 | "cell_type": "code",
202 | "execution_count": 10,
203 | "metadata": {},
204 | "outputs": [
205 | {
206 | "data": {
207 | "text/plain": [
208 | "array([[6., 5., 1., 5.],\n",
209 | " [1., 7., 9., 7.],\n",
210 | " [7., 2., 4., 2.],\n",
211 | " [6., 4., 3., 5.],\n",
212 | " [2., 8., 8., 6.],\n",
213 | " [5., 2., 8., 6.],\n",
214 | " [7., 8., 2., 3.],\n",
215 | " [1., 3., 6., 9.]])"
216 | ]
217 | },
218 | "execution_count": 10,
219 | "metadata": {},
220 | "output_type": "execute_result"
221 | }
222 | ],
223 | "source": [
224 | "U @ np.diag(S) @ V"
225 | ]
226 | },
227 | {
228 | "cell_type": "code",
229 | "execution_count": null,
230 | "metadata": {},
231 | "outputs": [],
232 | "source": []
233 | }
234 | ],
235 | "metadata": {
236 | "kernelspec": {
237 | "display_name": "Python 3",
238 | "language": "python",
239 | "name": "python3"
240 | },
241 | "language_info": {
242 | "codemirror_mode": {
243 | "name": "ipython",
244 | "version": 3
245 | },
246 | "file_extension": ".py",
247 | "mimetype": "text/x-python",
248 | "name": "python",
249 | "nbconvert_exporter": "python",
250 | "pygments_lexer": "ipython3",
251 | "version": "3.7.6"
252 | }
253 | },
254 | "nbformat": 4,
255 | "nbformat_minor": 4
256 | }
257 |
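The reconstruction `U @ np.diag(S) @ V` works because `np.linalg.svd` returns the right factor already transposed. A minimal sketch that checks this, using the matrix printed in the notebook as fixed input (the notebook itself generates `A` randomly) and naming the third factor `Vh` to make the transpose explicit:

```python
import numpy as np

# The 8x4 matrix shown in the notebook output, hard-coded so the sketch is reproducible.
A = np.array([
    [6, 5, 1, 5],
    [1, 7, 9, 7],
    [7, 2, 4, 2],
    [6, 4, 3, 5],
    [2, 8, 8, 6],
    [5, 2, 8, 6],
    [7, 8, 2, 3],
    [1, 3, 6, 9],
])

# Reduced SVD: U is (8, 4), S is (4,), Vh is (4, 4).
U, S, Vh = np.linalg.svd(A, full_matrices=False)

# Rebuild A from its factors and compare up to floating-point tolerance.
reconstructed_ok = np.allclose(U @ np.diag(S) @ Vh, A)

# The columns of U and the rows of Vh are orthonormal.
u_orthonormal = np.allclose(U.T @ U, np.eye(4))
vh_orthonormal = np.allclose(Vh @ Vh.T, np.eye(4))

# Singular values come back sorted in descending order.
sorted_desc = bool(np.all(np.diff(S) <= 0))
print(reconstructed_ok, u_orthonormal, vh_orthonormal, sorted_desc)
```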
--------------------------------------------------------------------------------
/17. Numpy计算逆矩阵求解线性方程组.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Solving a system of linear equations with the inverse matrix in NumPy"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
14 | "Consider the following system of linear equations:\n",
15 | "* x + y + z = 6\n",
16 | "* 2y + 5z = -4\n",
17 | "* 2x + 5y - z = 27\n",
18 | "\n",
19 | "It can be written in matrix form:\n",
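20 | "$$\\begin{bmatrix} 1 & 1 & 1 \\\\ 0 & 2 & 5 \\\\ 2 & 5 & -1 \\end{bmatrix} \\begin{bmatrix} x \\\\ y \\\\ z \\end{bmatrix} = \\begin{bmatrix} 6 \\\\ -4 \\\\ 27 \\end{bmatrix}$$\n",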
21 | "\n",
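22 | "As a formula this is Ax=b, where A is a matrix and x and b are both column vectors\n",
23 | "\n",
24 | "***Definition of the inverse matrix:*** \n",
25 | "Let A be an n×n matrix over a number field. If there exists another n×n matrix B such that AB=BA=E, then B is called the inverse of A, and A is called an invertible matrix. Note: E is the identity matrix.\n",
26 | "\n",
27 | "***Solving a linear system with the inverse matrix:*** \n",
28 | "Multiply both sides on the left by $A^{-1}$ to get $A^{-1}$Ax=$A^{-1}$b. Since $A^{-1}$A=E, and multiplying by the identity matrix leaves a matrix or vector unchanged, x=$A^{-1}$b"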
29 | ]
30 | },
31 | {
32 | "cell_type": "code",
33 | "execution_count": 1,
34 | "metadata": {},
35 | "outputs": [],
36 | "source": [
37 | "import numpy as np"
38 | ]
39 | },
40 | {
41 | "cell_type": "markdown",
42 | "metadata": {},
43 | "source": [
44 | "### 1. Compute the inverse matrix"
45 | ]
46 | },
47 | {
48 | "cell_type": "code",
49 | "execution_count": 2,
50 | "metadata": {},
51 | "outputs": [
52 | {
53 | "data": {
54 | "text/plain": [
55 | "array([[ 1, 1, 1],\n",
56 | " [ 0, 2, 5],\n",
57 | " [ 2, 5, -1]])"
58 | ]
59 | },
60 | "execution_count": 2,
61 | "metadata": {},
62 | "output_type": "execute_result"
63 | }
64 | ],
65 | "source": [
66 | "A = np.array([\n",
67 | " [1,1,1],\n",
68 | " [0,2,5],\n",
69 | " [2,5,-1]\n",
70 | "])\n",
71 | "A"
72 | ]
73 | },
74 | {
75 | "cell_type": "code",
76 | "execution_count": 3,
77 | "metadata": {},
78 | "outputs": [
79 | {
80 | "data": {
81 | "text/plain": [
82 | "array([[ 1.28571429, -0.28571429, -0.14285714],\n",
83 | " [-0.47619048, 0.14285714, 0.23809524],\n",
84 | " [ 0.19047619, 0.14285714, -0.0952381 ]])"
85 | ]
86 | },
87 | "execution_count": 3,
88 | "metadata": {},
89 | "output_type": "execute_result"
90 | }
91 | ],
92 | "source": [
93 | "# B is the inverse of A\n",
94 | "B = np.linalg.inv(A)\n",
95 | "B"
96 | ]
97 | },
98 | {
99 | "cell_type": "markdown",
100 | "metadata": {},
101 | "source": [
102 | "### 2. Verify that the product of the matrix and its inverse is the identity matrix"
103 | ]
104 | },
105 | {
106 | "cell_type": "code",
107 | "execution_count": 4,
108 | "metadata": {},
109 | "outputs": [
110 | {
111 | "data": {
112 | "text/plain": [
113 | "array([[ 1.00000000e+00, -2.77555756e-17, 2.77555756e-17],\n",
114 | " [ 0.00000000e+00, 1.00000000e+00, 0.00000000e+00],\n",
115 | " [-2.22044605e-16, 5.55111512e-17, 1.00000000e+00]])"
116 | ]
117 | },
118 | "execution_count": 4,
119 | "metadata": {},
120 | "output_type": "execute_result"
121 | }
122 | ],
123 | "source": [
124 | "A@B"
125 | ]
126 | },
127 | {
128 | "cell_type": "code",
129 | "execution_count": 5,
130 | "metadata": {},
131 | "outputs": [
132 | {
133 | "data": {
134 | "text/plain": [
135 | "array([[ 1.00000000e+00, -2.77555756e-17, 2.77555756e-17],\n",
136 | " [ 0.00000000e+00, 1.00000000e+00, 0.00000000e+00],\n",
137 | " [-2.22044605e-16, 5.55111512e-17, 1.00000000e+00]])"
138 | ]
139 | },
140 | "execution_count": 5,
141 | "metadata": {},
142 | "output_type": "execute_result"
143 | }
144 | ],
145 | "source": [
146 | "np.matmul(A, B)"
147 | ]
148 | },
149 | {
150 | "cell_type": "markdown",
151 | "metadata": {},
152 | "source": [
153 | "### 3. Solve and verify the linear system"
154 | ]
155 | },
156 | {
157 | "cell_type": "code",
158 | "execution_count": 6,
159 | "metadata": {},
160 | "outputs": [],
161 | "source": [
162 | "# Build the right-hand side b of Ax=b\n",
163 | "b = np.array([6, -4, 27])"
164 | ]
165 | },
166 | {
167 | "cell_type": "code",
168 | "execution_count": 7,
169 | "metadata": {},
170 | "outputs": [],
171 | "source": [
172 | "# Solve for x using the inverse matrix\n",
173 | "x = B@b"
174 | ]
175 | },
176 | {
177 | "cell_type": "code",
178 | "execution_count": 8,
179 | "metadata": {},
180 | "outputs": [
181 | {
182 | "data": {
183 | "text/plain": [
184 | "array([ 5., 3., -2.])"
185 | ]
186 | },
187 | "execution_count": 8,
188 | "metadata": {},
189 | "output_type": "execute_result"
190 | }
191 | ],
192 | "source": [
193 | "x"
194 | ]
195 | },
196 | {
197 | "cell_type": "code",
198 | "execution_count": 9,
199 | "metadata": {},
200 | "outputs": [
201 | {
202 | "data": {
203 | "text/plain": [
204 | "array([ 6., -4., 27.])"
205 | ]
206 | },
207 | "execution_count": 9,
208 | "metadata": {},
209 | "output_type": "execute_result"
210 | }
211 | ],
212 | "source": [
213 | "# Verify that A@x equals b\n",
214 | "A@x"
215 | ]
216 | },
217 | {
218 | "cell_type": "code",
219 | "execution_count": null,
220 | "metadata": {},
221 | "outputs": [],
222 | "source": []
223 | }
224 | ],
225 | "metadata": {
226 | "kernelspec": {
227 | "display_name": "Python 3",
228 | "language": "python",
229 | "name": "python3"
230 | },
231 | "language_info": {
232 | "codemirror_mode": {
233 | "name": "ipython",
234 | "version": 3
235 | },
236 | "file_extension": ".py",
237 | "mimetype": "text/x-python",
238 | "name": "python",
239 | "nbconvert_exporter": "python",
240 | "pygments_lexer": "ipython3",
241 | "version": "3.7.6"
242 | }
243 | },
244 | "nbformat": 4,
245 | "nbformat_minor": 4
246 | }
247 |
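The product `A@B` above is only approximately the identity matrix (entries such as `-2.77555756e-17` are rounding noise), so checks of this kind are better done with a tolerance. A minimal sketch, using the notebook's `A` and `b`, that verifies the inverse with `np.allclose` and compares the inverse-based solution with `np.linalg.solve`:

```python
import numpy as np

A = np.array([
    [1, 1, 1],
    [0, 2, 5],
    [2, 5, -1],
])
b = np.array([6, -4, 27])

# Inverse-based solution, as in the notebook: x = A^{-1} b.
A_inv = np.linalg.inv(A)
x_inv = A_inv @ b

# A @ A_inv equals the identity only up to floating-point rounding.
identity_ok = np.allclose(A @ A_inv, np.eye(3))

# Direct solution without forming the inverse explicitly.
x_solve = np.linalg.solve(A, b)

agree = np.allclose(x_inv, x_solve)
print(x_inv, identity_ok, agree)
```

Forming the inverse is useful for illustrating the algebra; when only `x` is needed, `np.linalg.solve` is generally preferred because it avoids the extra work and rounding error of computing the inverse.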
--------------------------------------------------------------------------------
/18. Numpy怎样将数组读写到文件.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
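7 | "## Reading and writing arrays to files with NumPy\n",
8 | "\n",
9 | "This document shows how NumPy writes arrays to files, and loads them back, using its own built-in binary formats;\n",
10 | "\n",
11 | "For text or tabular data, the pandas library is normally used for loading and processing instead of NumPy\n",
12 | "\n",
13 | "The relevant functions:\n",
14 | "1. np.load(filename): loads NumPy arrays from a .npy or .npz file \n",
15 | "A .npy file yields a single array; a .npz file yields a dict-like object (NpzFile) holding several arrays\n",
16 | "2. np.save(filename, arr): saves a single NumPy array to a .npy file\n",
17 | "3. np.savez(filename, arra=arra, arrb=arrb): saves several NumPy arrays to an uncompressed .npz file\n",
18 | "4. np.savez_compressed(filename, arra=arra, arrb=arrb): saves several NumPy arrays to a compressed .npz file\n",
19 | "\n",
20 | "Both .npy and .npz are binary formats; opened in a plain-text editor they show up as gibberish"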
21 | ]
22 | },
23 | {
24 | "cell_type": "code",
25 | "execution_count": 1,
26 | "metadata": {},
27 | "outputs": [],
28 | "source": [
29 | "import numpy as np"
30 | ]
31 | },
32 | {
33 | "cell_type": "markdown",
34 | "metadata": {},
35 | "source": [
36 | "### 1. Saving and loading a single array with np.save and np.load"
37 | ]
38 | },
39 | {
40 | "cell_type": "code",
41 | "execution_count": 2,
42 | "metadata": {},
43 | "outputs": [
44 | {
45 | "data": {
46 | "text/plain": [
47 | "array([[ 0, 1, 2, 3],\n",
48 | " [ 4, 5, 6, 7],\n",
49 | " [ 8, 9, 10, 11]])"
50 | ]
51 | },
52 | "execution_count": 2,
53 | "metadata": {},
54 | "output_type": "execute_result"
55 | }
56 | ],
57 | "source": [
58 | "a = np.arange(12).reshape(3,4)\n",
59 | "a"
60 | ]
61 | },
62 | {
63 | "cell_type": "code",
64 | "execution_count": 3,
65 | "metadata": {},
66 | "outputs": [],
67 | "source": [
68 | "# Save a single array to a .npy file\n",
69 | "np.save(\"arr_a.npy\", a)"
70 | ]
71 | },
72 | {
73 | "cell_type": "code",
74 | "execution_count": 4,
75 | "metadata": {},
76 | "outputs": [
77 | {
78 | "data": {
79 | "text/plain": [
80 | "array([[ 0, 1, 2, 3],\n",
81 | " [ 4, 5, 6, 7],\n",
82 | " [ 8, 9, 10, 11]])"
83 | ]
84 | },
85 | "execution_count": 4,
86 | "metadata": {},
87 | "output_type": "execute_result"
88 | }
89 | ],
90 | "source": [
91 | "# Load a single array from the .npy file\n",
92 | "b = np.load(\"arr_a.npy\")\n",
93 | "b"
94 | ]
95 | },
96 | {
97 | "cell_type": "markdown",
98 | "metadata": {},
99 | "source": [
100 | "### 2. Saving and loading several arrays with np.savez and np.load"
101 | ]
102 | },
103 | {
104 | "cell_type": "code",
105 | "execution_count": 5,
106 | "metadata": {},
107 | "outputs": [
108 | {
109 | "data": {
110 | "text/plain": [
111 | "array([[ 0, 1, 2, 3],\n",
112 | " [ 4, 5, 6, 7],\n",
113 | " [ 8, 9, 10, 11]])"
114 | ]
115 | },
116 | "execution_count": 5,
117 | "metadata": {},
118 | "output_type": "execute_result"
119 | }
120 | ],
121 | "source": [
122 | "a"
123 | ]
124 | },
125 | {
126 | "cell_type": "code",
127 | "execution_count": 6,
128 | "metadata": {},
129 | "outputs": [
130 | {
131 | "data": {
132 | "text/plain": [
133 | "array([[0.06355473, 0.69576567, 0.17754786],\n",
134 | " [0.28343315, 0.29994149, 0.76737219]])"
135 | ]
136 | },
137 | "execution_count": 6,
138 | "metadata": {},
139 | "output_type": "execute_result"
140 | }
141 | ],
142 | "source": [
143 | "b = np.random.rand(2, 3)\n",
144 | "b"
145 | ]
146 | },
147 | {
148 | "cell_type": "code",
149 | "execution_count": 7,
150 | "metadata": {},
151 | "outputs": [],
152 | "source": [
153 | "# Save several arrays to one file\n",
154 | "np.savez(\"arr_ab.npz\", a=a, b=b)"
155 | ]
156 | },
157 | {
158 | "cell_type": "code",
159 | "execution_count": 8,
160 | "metadata": {},
161 | "outputs": [
162 | {
163 | "data": {
164 | "text/plain": [
165 | ""
166 | ]
167 | },
168 | "execution_count": 8,
169 | "metadata": {},
170 | "output_type": "execute_result"
171 | }
172 | ],
173 | "source": [
174 | "# Load several arrays from the .npz file; the result is a dict-like object\n",
175 | "data = np.load(\"arr_ab.npz\")\n",
176 | "data"
177 | ]
178 | },
179 | {
180 | "cell_type": "code",
181 | "execution_count": 9,
182 | "metadata": {},
183 | "outputs": [
184 | {
185 | "data": {
186 | "text/plain": [
187 | "array([[ 0, 1, 2, 3],\n",
188 | " [ 4, 5, 6, 7],\n",
189 | " [ 8, 9, 10, 11]])"
190 | ]
191 | },
192 | "execution_count": 9,
193 | "metadata": {},
194 | "output_type": "execute_result"
195 | }
196 | ],
197 | "source": [
198 | "data[\"a\"]"
199 | ]
200 | },
201 | {
202 | "cell_type": "code",
203 | "execution_count": 10,
204 | "metadata": {},
205 | "outputs": [
206 | {
207 | "data": {
208 | "text/plain": [
209 | "array([[0.06355473, 0.69576567, 0.17754786],\n",
210 | " [0.28343315, 0.29994149, 0.76737219]])"
211 | ]
212 | },
213 | "execution_count": 10,
214 | "metadata": {},
215 | "output_type": "execute_result"
216 | }
217 | ],
218 | "source": [
219 | "data[\"b\"]"
220 | ]
221 | },
222 | {
223 | "cell_type": "markdown",
224 | "metadata": {},
225 | "source": [
226 | "### 3. Saving and loading several arrays in a compressed file with np.savez_compressed and np.load"
227 | ]
228 | },
229 | {
230 | "cell_type": "code",
231 | "execution_count": 11,
232 | "metadata": {},
233 | "outputs": [
234 | {
235 | "data": {
236 | "text/plain": [
237 | "array([[ 0, 1, 2, 3],\n",
238 | " [ 4, 5, 6, 7],\n",
239 | " [ 8, 9, 10, 11]])"
240 | ]
241 | },
242 | "execution_count": 11,
243 | "metadata": {},
244 | "output_type": "execute_result"
245 | }
246 | ],
247 | "source": [
248 | "a"
249 | ]
250 | },
251 | {
252 | "cell_type": "code",
253 | "execution_count": 12,
254 | "metadata": {},
255 | "outputs": [
256 | {
257 | "data": {
258 | "text/plain": [
259 | "array([[0.06355473, 0.69576567, 0.17754786],\n",
260 | " [0.28343315, 0.29994149, 0.76737219]])"
261 | ]
262 | },
263 | "execution_count": 12,
264 | "metadata": {},
265 | "output_type": "execute_result"
266 | }
267 | ],
268 | "source": [
269 | "b"
270 | ]
271 | },
272 | {
273 | "cell_type": "code",
274 | "execution_count": 13,
275 | "metadata": {},
276 | "outputs": [],
277 | "source": [
278 | "# Save several arrays to a compressed file\n",
279 | "np.savez_compressed(\"arr_ab_compressed.npz\", a=a, b=b)"
280 | ]
281 | },
282 | {
283 | "cell_type": "code",
284 | "execution_count": 14,
285 | "metadata": {},
286 | "outputs": [],
287 | "source": [
288 | "# The .npz file is loaded with np.load in the same way\n",
289 | "data = np.load(\"arr_ab_compressed.npz\")"
290 | ]
291 | },
292 | {
293 | "cell_type": "code",
294 | "execution_count": 15,
295 | "metadata": {},
296 | "outputs": [
297 | {
298 | "data": {
299 | "text/plain": [
300 | "array([[ 0, 1, 2, 3],\n",
301 | " [ 4, 5, 6, 7],\n",
302 | " [ 8, 9, 10, 11]])"
303 | ]
304 | },
305 | "execution_count": 15,
306 | "metadata": {},
307 | "output_type": "execute_result"
308 | }
309 | ],
310 | "source": [
311 | "data[\"a\"]"
312 | ]
313 | },
314 | {
315 | "cell_type": "code",
316 | "execution_count": 16,
317 | "metadata": {},
318 | "outputs": [
319 | {
320 | "data": {
321 | "text/plain": [
322 | "array([[0.06355473, 0.69576567, 0.17754786],\n",
323 | " [0.28343315, 0.29994149, 0.76737219]])"
324 | ]
325 | },
326 | "execution_count": 16,
327 | "metadata": {},
328 | "output_type": "execute_result"
329 | }
330 | ],
331 | "source": [
332 | "data[\"b\"]"
333 | ]
334 | },
335 | {
336 | "cell_type": "code",
337 | "execution_count": null,
338 | "metadata": {},
339 | "outputs": [],
340 | "source": []
341 | }
342 | ],
343 | "metadata": {
344 | "kernelspec": {
345 | "display_name": "Python 3",
346 | "language": "python",
347 | "name": "python3"
348 | },
349 | "language_info": {
350 | "codemirror_mode": {
351 | "name": "ipython",
352 | "version": 3
353 | },
354 | "file_extension": ".py",
355 | "mimetype": "text/x-python",
356 | "name": "python",
357 | "nbconvert_exporter": "python",
358 | "pygments_lexer": "ipython3",
359 | "version": "3.7.6"
360 | }
361 | },
362 | "nbformat": 4,
363 | "nbformat_minor": 4
364 | }
365 |
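The save and load calls in this notebook can be combined into one round-trip check. The sketch below is an illustration under a few assumptions: it writes into a temporary directory (so it does not touch the `arr_a.npy` and `arr_ab_compressed.npz` files in the repository) and opens the `.npz` file in a `with` block so the underlying file handle is closed afterwards.

```python
import os
import tempfile

import numpy as np

a = np.arange(12).reshape(3, 4)
b = np.random.rand(2, 3)

with tempfile.TemporaryDirectory() as tmp_dir:
    npy_path = os.path.join(tmp_dir, "arr_a.npy")
    npz_path = os.path.join(tmp_dir, "arr_ab_compressed.npz")

    # Single array: .npy round trip.
    np.save(npy_path, a)
    a_loaded = np.load(npy_path)

    # Several arrays: compressed .npz round trip.
    np.savez_compressed(npz_path, a=a, b=b)
    # np.load returns an NpzFile for .npz input; the with block closes it when done.
    with np.load(npz_path) as data:
        keys = sorted(data.files)   # names given as keyword arguments when saving
        a_from_npz = data["a"]
        b_from_npz = data["b"]

print(keys, np.array_equal(a, a_loaded), np.array_equal(b, b_from_npz))
```

Arrays read from the `NpzFile` inside the `with` block are ordinary in-memory arrays, so they remain usable after the file is closed and the temporary directory is removed.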
--------------------------------------------------------------------------------
/19. Numpy的结构化数组.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Structured arrays in NumPy"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
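14 | "Normally, all elements of a NumPy array share the same data type, such as int or float; \n",
15 | "this is one reason NumPy is fast: the data is stored compactly in memory and is very quick to read; \n",
16 | "\n",
17 | "However, NumPy can also store heterogeneous records, such as the following data: \n",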
18 | "<table>\n",
19 | " <tr>\n",
20 | " <th>Name</th>\n",
21 | " <th>Age</th>\n",
22 | " <th>Weight</th>\n",
23 | " </tr>\n",
24 | " <tr>\n",
25 | " <td>Xiao Wang</td>\n",
26 | " <td>30</td>\n",
27 | " <td>80.5</td>\n",
28 | " </tr>\n",
29 | " <tr>\n",
30 | " <td>Xiao Li</td>\n",
31 | " <td>28</td>\n",
32 | " <td>70.3</td>\n",
33 | " </tr>\n",
34 | " <tr>\n",
35 | " <td>Xiao Tian</td>\n",
36 | " <td>29</td>\n",
37 | " <td>78.6</td>\n",
38 | " </tr>\n",
39 | "</table>\n",
40 | "\n",
41 | "This is the “NumPy structured array” feature introduced in this section; "
42 | ]
43 | },
44 | {
45 | "cell_type": "code",
46 | "execution_count": 1,
47 | "metadata": {},
48 | "outputs": [],
49 | "source": [
50 | "import numpy as np"
51 | ]
52 | },
53 | {
54 | "cell_type": "markdown",
55 | "metadata": {},
56 | "source": [
57 | "### 1. An ordinary NumPy array has a single dtype"
58 | ]
59 | },
60 | {
61 | "cell_type": "code",
62 | "execution_count": 2,
63 | "metadata": {},
64 | "outputs": [
65 | {
66 | "data": {
67 | "text/plain": [
68 | "(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), dtype('int32'))"
69 | ]
70 | },
71 | "execution_count": 2,
72 | "metadata": {},
73 | "output_type": "execute_result"
74 | }
75 | ],
76 | "source": [
77 | "arr = np.arange(10)\n",
78 | "arr, arr.dtype"
79 | ]
80 | },
81 | {
82 | "cell_type": "code",
83 | "execution_count": 3,
84 | "metadata": {},
85 | "outputs": [
86 | {
87 | "data": {
88 | "text/plain": [
89 | "(array([[0.13813273, 0.69213455, 0.2869116 , 0.64065806],\n",
90 | " [0.5972653 , 0.42803843, 0.84914465, 0.0502318 ],\n",
91 | " [0.31351949, 0.87095862, 0.52867948, 0.83884873]]),\n",
92 | " dtype('float64'))"
93 | ]
94 | },
95 | "execution_count": 3,
96 | "metadata": {},
97 | "output_type": "execute_result"
98 | }
99 | ],
100 | "source": [
101 | "arr = np.random.rand(3, 4)\n",
102 | "arr, arr.dtype"
103 | ]
104 | },
105 | {
106 | "cell_type": "markdown",
107 | "metadata": {},
108 | "source": [
109 | "### 2. How to represent heterogeneous data with NumPy"
110 | ]
111 | },
112 | {
113 | "cell_type": "code",
114 | "execution_count": 4,
115 | "metadata": {},
116 | "outputs": [
117 | {
118 | "data": {
119 | "text/plain": [
120 | "dtype([('name', '= 29]"
314 | ]
315 | },
316 | {
317 | "cell_type": "code",
318 | "execution_count": 13,
319 | "metadata": {},
320 | "outputs": [
321 | {
322 | "data": {
323 | "text/plain": [
324 | "array([('xiaowang', 30, 80.5)],\n",
325 | " dtype=[('name', '= 29) & (my_arr[\"weight\"] > 80)]"
336 | ]
337 | },
338 | {
339 | "cell_type": "markdown",
340 | "metadata": {},
341 | "source": [
342 | "#### Element-wise computation on a single column"
343 | ]
344 | },
345 | {
346 | "cell_type": "code",
347 | "execution_count": 14,
348 | "metadata": {},
349 | "outputs": [
350 | {
351 | "data": {
352 | "text/plain": [
353 | "array([30, 28, 29])"
354 | ]
355 | },
356 | "execution_count": 14,
357 | "metadata": {},
358 | "output_type": "execute_result"
359 | }
360 | ],
361 | "source": [
362 | "my_arr[\"age\"]"
363 | ]
364 | },
365 | {
366 | "cell_type": "code",
367 | "execution_count": 15,
368 | "metadata": {},
369 | "outputs": [],
370 | "source": [
371 | "my_arr[\"age\"] += 1"
372 | ]
373 | },
374 | {
375 | "cell_type": "code",
376 | "execution_count": 16,
377 | "metadata": {},
378 | "outputs": [
379 | {
380 | "data": {
381 | "text/plain": [
382 | "array([31, 29, 30])"
383 | ]
384 | },
385 | "execution_count": 16,
386 | "metadata": {},
387 | "output_type": "execute_result"
388 | }
389 | ],
390 | "source": [
391 | "my_arr[\"age\"]"
392 | ]
393 | },
394 | {
395 | "cell_type": "markdown",
396 | "metadata": {},
397 | "source": [
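398 | "A final note: \n",
399 | "* For this kind of heterogeneous data, where each column has a different type, Pandas is the better tool;\n",
400 | "* NumPy structured arrays are still worth learning: you may never use them yourself, but you should be able to read other people's code that does"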
401 | ]
402 | }
403 | ],
404 | "metadata": {
405 | "kernelspec": {
406 | "display_name": "Python 3",
407 | "language": "python",
408 | "name": "python3"
409 | },
410 | "language_info": {
411 | "codemirror_mode": {
412 | "name": "ipython",
413 | "version": 3
414 | },
415 | "file_extension": ".py",
416 | "mimetype": "text/x-python",
417 | "name": "python",
418 | "nbconvert_exporter": "python",
419 | "pygments_lexer": "ipython3",
420 | "version": "3.7.6"
421 | }
422 | },
423 | "nbformat": 4,
424 | "nbformat_minor": 4
425 | }
426 |
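The workflow this notebook walks through (define a compound dtype, build the records, then filter and update by field name) can be sketched as below. This is a minimal sketch under stated assumptions: the type codes `U32`, `i4`, and `f8` and the romanized names `xiaoli` and `xiaotian` are chosen for illustration; only `xiaowang` and the numeric values appear in the notebook itself.

```python
import numpy as np

# Compound dtype: one (field name, type code) pair per column.
# The type codes are assumptions: 32-char unicode, int32, float64.
person_dtype = np.dtype([("name", "U32"), ("age", "i4"), ("weight", "f8")])

# Each tuple is one record; the numbers come from the table in the notebook.
my_arr = np.array(
    [("xiaowang", 30, 80.5), ("xiaoli", 28, 70.3), ("xiaotian", 29, 78.6)],
    dtype=person_dtype,
)

# Boolean masks built from fields select whole records (as copies).
older = my_arr[my_arr["age"] >= 29]
older_and_heavier = my_arr[(my_arr["age"] >= 29) & (my_arr["weight"] > 80)]

# Field access returns a view, so in-place arithmetic updates the records.
my_arr["age"] += 1
```

Because `my_arr["age"]` is a view onto the records, the `+= 1` changes the stored ages, which is consistent with the ages moving from 30, 28, 29 to 31, 29, 30 in the notebook output.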
--------------------------------------------------------------------------------
/20. Numpy与Pandas数据的相互转换.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
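7 | "## Converting between NumPy and Pandas data\n",
8 | "\n",
9 | "Pandas is a very popular data analysis library built on top of NumPy; \n",
10 | "it provides powerful tools for processing and analyzing heterogeneous, tabular data.\n",
11 | "\n",
12 | "This section covers conversion between NumPy and Pandas: \n",
13 | "1. how to pass NumPy arrays to a Pandas Series or DataFrame;\n",
14 | "2. how to convert a Pandas Series or DataFrame into a NumPy array"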
15 | ]
16 | },
17 | {
18 | "cell_type": "code",
19 | "execution_count": 1,
20 | "metadata": {},
21 | "outputs": [],
22 | "source": [
23 | "import numpy as np\n",
24 | "import pandas as pd"
25 | ]
26 | },
27 | {
28 | "cell_type": "markdown",
29 | "metadata": {},
30 | "source": [
31 | "### Converting NumPy arrays to Pandas data structures"
32 | ]
33 | },
34 | {
35 | "cell_type": "markdown",
36 | "metadata": {},
37 | "source": [
38 | "#### Turning a one-dimensional NumPy array into a Pandas Series"
39 | ]
40 | },
41 | {
42 | "cell_type": "code",
43 | "execution_count": 2,
44 | "metadata": {},
45 | "outputs": [
46 | {
47 | "data": {
48 | "text/plain": [
49 | "array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])"
50 | ]
51 | },
52 | "execution_count": 2,
53 | "metadata": {},
54 | "output_type": "execute_result"
55 | }
56 | ],
57 | "source": [
58 | "arr = np.arange(10)\n",
59 | "arr"
60 | ]
61 | },
62 | {
63 | "cell_type": "code",
64 | "execution_count": 3,
65 | "metadata": {},
66 | "outputs": [
67 | {
68 | "data": {
69 | "text/plain": [
70 | "0 0\n",
71 | "1 1\n",
72 | "2 2\n",
73 | "3 3\n",
74 | "4 4\n",
75 | "5 5\n",
76 | "6 6\n",
77 | "7 7\n",
78 | "8 8\n",
79 | "9 9\n",
80 | "dtype: int32"
81 | ]
82 | },
83 | "execution_count": 3,
84 | "metadata": {},
85 | "output_type": "execute_result"
86 | }
87 | ],
88 | "source": [
89 | "series = pd.Series(arr)\n",
90 | "series"
91 | ]
92 | },
93 | {
94 | "cell_type": "markdown",
95 | "metadata": {},
96 | "source": [
97 | "#### Turning a two-dimensional NumPy array into a Pandas DataFrame"
98 | ]
99 | },
100 | {
101 | "cell_type": "code",
102 | "execution_count": 4,
103 | "metadata": {},
104 | "outputs": [
105 | {
106 | "data": {
107 | "text/plain": [
108 | "array([[3, 9, 6, 3],\n",
109 | " [4, 1, 8, 1],\n",
110 | " [2, 4, 4, 7],\n",
111 | " [4, 8, 4, 7],\n",
112 | " [8, 3, 9, 8]])"
113 | ]
114 | },
115 | "execution_count": 4,
116 | "metadata": {},
117 | "output_type": "execute_result"
118 | }
119 | ],
120 | "source": [
121 | "arr = np.random.randint(1, 10, size=(5, 4))\n",
122 | "arr"
123 | ]
124 | },
125 | {
126 | "cell_type": "code",
127 | "execution_count": 5,
128 | "metadata": {},
129 | "outputs": [
130 | {
131 | "data": {
132 | "text/html": [
133 | "<div>\n",
147 | "<table border=\"1\" class=\"dataframe\">\n",
148 | "  <thead>\n",
149 | "    <tr style=\"text-align: right;\">\n",
150 | "      <th></th>\n",
151 | "      <th>ca</th>\n",
152 | "      <th>cb</th>\n",
153 | "      <th>cc</th>\n",
154 | "      <th>cd</th>\n",
155 | "    </tr>\n",
156 | "  </thead>\n",
157 | "  <tbody>\n",
158 | "    <tr>\n",
159 | "      <th>0</th>\n",
160 | "      <td>3</td>\n",
161 | "      <td>9</td>\n",
162 | "      <td>6</td>\n",
163 | "      <td>3</td>\n",
164 | "    </tr>\n",
165 | "    <tr>\n",
166 | "      <th>1</th>\n",
167 | "      <td>4</td>\n",
168 | "      <td>1</td>\n",
169 | "      <td>8</td>\n",
170 | "      <td>1</td>\n",
171 | "    </tr>\n",
172 | "    <tr>\n",
173 | "      <th>2</th>\n",
174 | "      <td>2</td>\n",
175 | "      <td>4</td>\n",
176 | "      <td>4</td>\n",
177 | "      <td>7</td>\n",
178 | "    </tr>\n",
179 | "    <tr>\n",
180 | "      <th>3</th>\n",
181 | "      <td>4</td>\n",
182 | "      <td>8</td>\n",
183 | "      <td>4</td>\n",
184 | "      <td>7</td>\n",
185 | "    </tr>\n",
186 | "    <tr>\n",
187 | "      <th>4</th>\n",
188 | "      <td>8</td>\n",
189 | "      <td>3</td>\n",
190 | "      <td>9</td>\n",
191 | "      <td>8</td>\n",
192 | "    </tr>\n",
193 | "  </tbody>\n",
194 | "</table>\n",
195 | "</div>"
196 | ],
197 | "text/plain": [
198 | " ca cb cc cd\n",
199 | "0 3 9 6 3\n",
200 | "1 4 1 8 1\n",
201 | "2 2 4 4 7\n",
202 | "3 4 8 4 7\n",
203 | "4 8 3 9 8"
204 | ]
205 | },
206 | "execution_count": 5,
207 | "metadata": {},
208 | "output_type": "execute_result"
209 | }
210 | ],
211 | "source": [
212 | "df = pd.DataFrame(arr, columns = [\"ca\", \"cb\", \"cc\", \"cd\"])\n",
213 | "df"
214 | ]
215 | },
216 | {
217 | "cell_type": "code",
218 | "execution_count": 6,
219 | "metadata": {},
220 | "outputs": [
221 | {
222 | "data": {
223 | "text/html": [
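224 | "<div>\n",
238 | "<table border=\"1\" class=\"dataframe\">\n",
239 | "  <thead>\n",
240 | "    <tr style=\"text-align: right;\">\n",
241 | "      <th></th>\n",
242 | "      <th>ca</th>\n",
243 | "      <th>cb</th>\n",
244 | "      <th>cc</th>\n",
245 | "      <th>cd</th>\n",
246 | "    </tr>\n",
247 | "  </thead>\n",
248 | "  <tbody>\n",
249 | "    <tr>\n",
250 | "      <th>4</th>\n",
251 | "      <td>8</td>\n",
252 | "      <td>3</td>\n",
253 | "      <td>9</td>\n",
254 | "      <td>8</td>\n",
255 | "    </tr>\n",
256 | "  </tbody>\n",
257 | "</table>\n",
258 | "</div>"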
259 | ],
260 | "text/plain": [
261 | " ca cb cc cd\n",
262 | "4 8 3 9 8"
263 | ]
264 | },
265 | "execution_count": 6,
266 | "metadata": {},
267 | "output_type": "execute_result"
268 | }
269 | ],
270 | "source": [
271 | "df[df[\"ca\"] > 4]"
272 | ]
273 | },
274 | {
275 | "cell_type": "markdown",
276 | "metadata": {},
277 | "source": [
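278 | "### Converting Pandas data structures to NumPy arrays\n",
279 | "\n",
280 | "* Method 1: the `.values` attribute\n",
281 | "* Method 2: the `.to_numpy()` method\n",
282 | "\n",
283 | "Use case: \n",
284 | "Scikit-Learn models, for example, take NumPy arrays as input; \n",
285 | "you can do the heavy data processing in Pandas, then convert the result to a NumPy array and use it as model input. "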
286 | ]
287 | },
288 | {
289 | "cell_type": "markdown",
290 | "metadata": {},
291 | "source": [
292 | "#### Converting a Series to a NumPy array"
293 | ]
294 | },
295 | {
296 | "cell_type": "code",
297 | "execution_count": 7,
298 | "metadata": {},
299 | "outputs": [
300 | {
301 | "data": {
302 | "text/plain": [
303 | "0 0\n",
304 | "1 1\n",
305 | "2 2\n",
306 | "3 3\n",
307 | "4 4\n",
308 | "5 5\n",
309 | "6 6\n",
310 | "7 7\n",
311 | "8 8\n",
312 | "9 9\n",
313 | "dtype: int64"
314 | ]
315 | },
316 | "execution_count": 7,
317 | "metadata": {},
318 | "output_type": "execute_result"
319 | }
320 | ],
321 | "source": [
322 | "series = pd.Series(range(10))\n",
323 | "series"
324 | ]
325 | },
326 | {
327 | "cell_type": "code",
328 | "execution_count": 8,
329 | "metadata": {},
330 | "outputs": [
331 | {
332 | "data": {
333 | "text/plain": [
334 | "array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int64)"
335 | ]
336 | },
337 | "execution_count": 8,
338 | "metadata": {},
339 | "output_type": "execute_result"
340 | }
341 | ],
342 | "source": [
343 | "series.values"
344 | ]
345 | },
346 | {
347 | "cell_type": "code",
348 | "execution_count": 9,
349 | "metadata": {},
350 | "outputs": [
351 | {
352 | "data": {
353 | "text/plain": [
354 | "array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int64)"
355 | ]
356 | },
357 | "execution_count": 9,
358 | "metadata": {},
359 | "output_type": "execute_result"
360 | }
361 | ],
362 | "source": [
363 | "series.to_numpy()"
364 | ]
365 | },
366 | {
367 | "cell_type": "markdown",
368 | "metadata": {},
369 | "source": [
370 | "#### Converting a DataFrame to a NumPy array"
371 | ]
372 | },
373 | {
374 | "cell_type": "code",
375 | "execution_count": 10,
376 | "metadata": {},
377 | "outputs": [
378 | {
379 | "data": {
380 | "text/html": [
381 | "<div>\n",
395 | "<table border=\"1\" class=\"dataframe\">\n",
396 | "  <thead>\n",
397 | "    <tr style=\"text-align: right;\">\n",
398 | "      <th></th>\n",
399 | "      <th>feature_a</th>\n",
400 | "      <th>feature_b</th>\n",
401 | "      <th>feature_c</th>\n",
402 | "    </tr>\n",
403 | "  </thead>\n",
404 | "  <tbody>\n",
405 | "    <tr>\n",
406 | "      <th>0</th>\n",
407 | "      <td>11</td>\n",
408 | "      <td>12.23</td>\n",
409 | "      <td>45.23</td>\n",
410 | "    </tr>\n",
411 | "    <tr>\n",
412 | "      <th>1</th>\n",
413 | "      <td>21</td>\n",
414 | "      <td>22.23</td>\n",
415 | "      <td>55.23</td>\n",
416 | "    </tr>\n",
417 | "    <tr>\n",
418 | "      <th>2</th>\n",
419 | "      <td>31</td>\n",
420 | "      <td>32.23</td>\n",
421 | "      <td>65.23</td>\n",
422 | "    </tr>\n",
423 | "    <tr>\n",
424 | "      <th>3</th>\n",
425 | "      <td>41</td>\n",
426 | "      <td>42.23</td>\n",
427 | "      <td>75.23</td>\n",
428 | "    </tr>\n",
429 | "  </tbody>\n",
430 | "</table>\n",
431 | "</div>"
432 | ],
433 | "text/plain": [
434 | " feature_a feature_b feature_c\n",
435 | "0 11 12.23 45.23\n",
436 | "1 21 22.23 55.23\n",
437 | "2 31 32.23 65.23\n",
438 | "3 41 42.23 75.23"
439 | ]
440 | },
441 | "execution_count": 10,
442 | "metadata": {},
443 | "output_type": "execute_result"
444 | }
445 | ],
446 | "source": [
447 | "df = pd.DataFrame(\n",
448 | " [\n",
449 | " [11, 12.23, 45.23],\n",
450 | " [21, 22.23, 55.23],\n",
451 | " [31, 32.23, 65.23],\n",
452 | " [41, 42.23, 75.23]\n",
453 | " ],\n",
454 | " columns = [\"feature_a\", \"feature_b\", \"feature_c\"]\n",
455 | ")\n",
456 | "df"
457 | ]
458 | },
459 | {
460 | "cell_type": "code",
461 | "execution_count": 11,
462 | "metadata": {},
463 | "outputs": [
464 | {
465 | "data": {
466 | "text/plain": [
467 | "array([[11. , 12.23, 45.23],\n",
468 | " [21. , 22.23, 55.23],\n",
469 | " [31. , 32.23, 65.23],\n",
470 | " [41. , 42.23, 75.23]])"
471 | ]
472 | },
473 | "execution_count": 11,
474 | "metadata": {},
475 | "output_type": "execute_result"
476 | }
477 | ],
478 | "source": [
479 | "df.values"
480 | ]
481 | },
482 | {
483 | "cell_type": "code",
484 | "execution_count": 12,
485 | "metadata": {},
486 | "outputs": [
487 | {
488 | "data": {
489 | "text/plain": [
490 | "array([[11. , 12.23, 45.23],\n",
491 | " [21. , 22.23, 55.23],\n",
492 | " [31. , 32.23, 65.23],\n",
493 | " [41. , 42.23, 75.23]])"
494 | ]
495 | },
496 | "execution_count": 12,
497 | "metadata": {},
498 | "output_type": "execute_result"
499 | }
500 | ],
501 | "source": [
502 | "df.to_numpy()"
503 | ]
504 | },
505 | {
506 | "cell_type": "code",
507 | "execution_count": null,
508 | "metadata": {},
509 | "outputs": [],
510 | "source": []
511 | }
512 | ],
513 | "metadata": {
514 | "kernelspec": {
515 | "display_name": "Python 3",
516 | "language": "python",
517 | "name": "python3"
518 | },
519 | "language_info": {
520 | "codemirror_mode": {
521 | "name": "ipython",
522 | "version": 3
523 | },
524 | "file_extension": ".py",
525 | "mimetype": "text/x-python",
526 | "name": "python",
527 | "nbconvert_exporter": "python",
528 | "pygments_lexer": "ipython3",
529 | "version": "3.7.6"
530 | }
531 | },
532 | "nbformat": 4,
533 | "nbformat_minor": 4
534 | }
535 |
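The round trip covered in this notebook can be condensed into one short sketch. It assumes a deterministic `np.arange` array in place of the notebook's random integers, and a small two-column frame named `mixed` (a hypothetical name) to show what happens when integer and float columns are converted together.

```python
import numpy as np
import pandas as pd

# NumPy -> Pandas: a 1-D array becomes a Series, a 2-D array a DataFrame.
arr_1d = np.arange(10)
series = pd.Series(arr_1d)

arr_2d = np.arange(20).reshape(5, 4)  # deterministic stand-in for random data
df = pd.DataFrame(arr_2d, columns=["ca", "cb", "cc", "cd"])

# Pandas -> NumPy: .values is an attribute, .to_numpy() is a method.
back_1d = series.to_numpy()
back_2d = df.to_numpy()

# Integer and float columns are converted to one common dtype (float64 here).
mixed = pd.DataFrame({"feature_a": [11, 21], "feature_b": [12.23, 22.23]})
mixed_arr = mixed.to_numpy()
```

The last step explains why `feature_a` shows up as `11.`, `21.`, and so on in the `df.values` output above: a NumPy array has a single dtype, so the integer column is upcast to float.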
--------------------------------------------------------------------------------
/21. Numpy数据输入给Scikit-learn实现模型训练.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
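7 | "## Feeding NumPy data to Sklearn for model training\n",
8 | "\n",
9 | "***The goal of this video is to demonstrate:*** \n",
10 | "how NumPy arrays interact with sklearn models: splitting into training and test sets, fitting a model, evaluating it, and making predictions\n",
11 | "\n",
12 | "For your own tasks, you can preprocess the data into this NumPy format and then pass it to an sklearn model"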
13 | ]
14 | },
15 | {
16 | "cell_type": "code",
17 | "execution_count": 1,
18 | "metadata": {},
19 | "outputs": [],
20 | "source": [
21 | "import numpy as np\n",
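22 | "# Use a dataset bundled with sklearn; these datasets are NumPy arrays\n",
23 | "# Our own data can be processed into the same format and then passed to a model\n",
24 | "from sklearn import datasets\n",
25 | "# train_test_split splits the data into training and test sets\n",
26 | "from sklearn.model_selection import train_test_split\n",
27 | "# LinearRegression trains a linear regression model\n",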
28 | "from sklearn.linear_model import LinearRegression"
29 | ]
30 | },
31 | {
32 | "cell_type": "markdown",
33 | "metadata": {},
34 | "source": [
35 | "### 1. Load the Boston housing dataset"
36 | ]
37 | },
38 | {
39 | "cell_type": "code",
40 | "execution_count": 2,
41 | "metadata": {},
42 | "outputs": [],
43 | "source": [
44 | "# Load the dataset into the feature matrix data and the target vector target (load_boston was removed in scikit-learn 1.2)\n",
45 | "data, target = datasets.load_boston(return_X_y=True)"
46 | ]
47 | },
48 | {
49 | "cell_type": "code",
50 | "execution_count": 3,
51 | "metadata": {},
52 | "outputs": [
53 | {
54 | "data": {
55 | "text/plain": [
56 | "(numpy.ndarray, numpy.ndarray)"
57 | ]
58 | },
59 | "execution_count": 3,
60 | "metadata": {},
61 | "output_type": "execute_result"
62 | }
63 | ],
64 | "source": [
65 | "type(data), type(target)"
66 | ]
67 | },
68 | {
69 | "cell_type": "code",
70 | "execution_count": 4,
71 | "metadata": {},
72 | "outputs": [
73 | {
74 | "data": {
75 | "text/plain": [
76 | "((506, 13), (506,))"
77 | ]
78 | },
79 | "execution_count": 4,
80 | "metadata": {},
81 | "output_type": "execute_result"
82 | }
83 | ],
84 | "source": [
85 | "data.shape, target.shape"
86 | ]
87 | },
88 | {
89 | "cell_type": "code",
90 | "execution_count": 5,
91 | "metadata": {},
92 | "outputs": [
93 | {
94 | "data": {
95 | "text/plain": [
96 | "array([[6.3200e-03, 1.8000e+01, 2.3100e+00, 0.0000e+00, 5.3800e-01,\n",
97 | " 6.5750e+00, 6.5200e+01, 4.0900e+00, 1.0000e+00, 2.9600e+02,\n",
98 | " 1.5300e+01, 3.9690e+02, 4.9800e+00],\n",
99 | " [2.7310e-02, 0.0000e+00, 7.0700e+00, 0.0000e+00, 4.6900e-01,\n",
100 | " 6.4210e+00, 7.8900e+01, 4.9671e+00, 2.0000e+00, 2.4200e+02,\n",
101 | " 1.7800e+01, 3.9690e+02, 9.1400e+00],\n",
102 | " [2.7290e-02, 0.0000e+00, 7.0700e+00, 0.0000e+00, 4.6900e-01,\n",
103 | " 7.1850e+00, 6.1100e+01, 4.9671e+00, 2.0000e+00, 2.4200e+02,\n",
104 | " 1.7800e+01, 3.9283e+02, 4.0300e+00]])"
105 | ]
106 | },
107 | "execution_count": 5,
108 | "metadata": {},
109 | "output_type": "execute_result"
110 | }
111 | ],
112 | "source": [
113 | "# Show the features of the first three houses\n",
114 | "data[:3]"
115 | ]
116 | },
117 | {
118 | "cell_type": "code",
119 | "execution_count": 6,
120 | "metadata": {},
121 | "outputs": [
122 | {
123 | "data": {
124 | "text/plain": [
125 | "array([24. , 21.6, 34.7])"
126 | ]
127 | },
128 | "execution_count": 6,
129 | "metadata": {},
130 | "output_type": "execute_result"
131 | }
132 | ],
133 | "source": [
134 | "# Show the first three house prices\n",
135 | "target[:3]"
136 | ]
137 | },
138 | {
139 | "cell_type": "markdown",
140 | "metadata": {},
141 | "source": [
142 | "### 2. Split into training and test sets"
143 | ]
144 | },
145 | {
146 | "cell_type": "code",
147 | "execution_count": 7,
148 | "metadata": {},
149 | "outputs": [],
150 | "source": [
151 | "# Split into training and test sets\n",
152 | "X_train, X_test, y_train, y_test = train_test_split(data, target)"
153 | ]
154 | },
155 | {
156 | "cell_type": "code",
157 | "execution_count": 8,
158 | "metadata": {},
159 | "outputs": [
160 | {
161 | "data": {
162 | "text/plain": [
163 | "((379, 13), (379,))"
164 | ]
165 | },
166 | "execution_count": 8,
167 | "metadata": {},
168 | "output_type": "execute_result"
169 | }
170 | ],
171 | "source": [
172 | "# Training set\n",
173 | "X_train.shape, y_train.shape"
174 | ]
175 | },
176 | {
177 | "cell_type": "code",
178 | "execution_count": 9,
179 | "metadata": {},
180 | "outputs": [
181 | {
182 | "data": {
183 | "text/plain": [
184 | "((127, 13), (127,))"
185 | ]
186 | },
187 | "execution_count": 9,
188 | "metadata": {},
189 | "output_type": "execute_result"
190 | }
191 | ],
192 | "source": [
193 | "# Test set\n",
194 | "X_test.shape, y_test.shape"
195 | ]
196 | },
197 | {
198 | "cell_type": "markdown",
199 | "metadata": {},
200 | "source": [
201 | "### 3. Train a linear regression model"
202 | ]
203 | },
204 | {
205 | "cell_type": "code",
206 | "execution_count": 10,
207 | "metadata": {},
208 | "outputs": [],
209 | "source": [
210 | "# Create the linear regression object; the default parameters are fine\n",
211 | "clf = LinearRegression()"
212 | ]
213 | },
214 | {
215 | "cell_type": "code",
216 | "execution_count": 11,
217 | "metadata": {},
218 | "outputs": [
219 | {
220 | "data": {
221 | "text/plain": [
222 | "LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)"
223 | ]
224 | },
225 | "execution_count": 11,
226 | "metadata": {},
227 | "output_type": "execute_result"
228 | }
229 | ],
230 | "source": [
231 | "# Fit the model\n",
232 | "clf.fit(X_train, y_train)"
233 | ]
234 | },
235 | {
236 | "cell_type": "code",
237 | "execution_count": 12,
238 | "metadata": {},
239 | "outputs": [
240 | {
241 | "data": {
242 | "text/plain": [
243 | "0.7290997955432121"
244 | ]
245 | },
246 | "execution_count": 12,
247 | "metadata": {},
248 | "output_type": "execute_result"
249 | }
250 | ],
251 | "source": [
252 | "# Score (R^2) on the training set\n",
253 | "clf.score(X_train, y_train)"
254 | ]
255 | },
256 | {
257 | "cell_type": "markdown",
258 | "metadata": {},
259 | "source": [
260 | "### 4. Evaluate and use the model"
261 | ]
262 | },
263 | {
264 | "cell_type": "code",
265 | "execution_count": 13,
266 | "metadata": {},
267 | "outputs": [
268 | {
269 | "data": {
270 | "text/plain": [
271 | "0.7658281007291711"
272 | ]
273 | },
274 | "execution_count": 13,
275 | "metadata": {},
276 | "output_type": "execute_result"
277 | }
278 | ],
279 | "source": [
280 | "# Score (R^2) on the test set\n",
281 | "clf.score(X_test, y_test)"
282 | ]
283 | },
284 | {
285 | "cell_type": "code",
286 | "execution_count": 14,
287 | "metadata": {},
288 | "outputs": [
289 | {
290 | "data": {
291 | "text/plain": [
292 | "array([36.1889043 , 17.05681981, 26.1238293 ])"
293 | ]
294 | },
295 | "execution_count": 14,
296 | "metadata": {},
297 | "output_type": "execute_result"
298 | }
299 | ],
300 | "source": [
301 | "# Predict house prices for the first three test samples only\n",
302 | "clf.predict(X_test[:3])"
303 | ]
304 | },
305 | {
306 | "cell_type": "code",
307 | "execution_count": 15,
308 | "metadata": {},
309 | "outputs": [
310 | {
311 | "data": {
312 | "text/plain": [
313 | "array([50. , 23.1, 22.8])"
314 | ]
315 | },
316 | "execution_count": 15,
317 | "metadata": {},
318 | "output_type": "execute_result"
319 | }
320 | ],
321 | "source": [
322 | "# Compare with the actual house prices\n",
323 | "y_test[:3]"
324 | ]
325 | },
326 | {
327 | "cell_type": "code",
328 | "execution_count": null,
329 | "metadata": {},
330 | "outputs": [],
331 | "source": []
332 | }
333 | ],
334 | "metadata": {
335 | "kernelspec": {
336 | "display_name": "Python 3",
337 | "language": "python",
338 | "name": "python3"
339 | },
340 | "language_info": {
341 | "codemirror_mode": {
342 | "name": "ipython",
343 | "version": 3
344 | },
345 | "file_extension": ".py",
346 | "mimetype": "text/x-python",
347 | "name": "python",
348 | "nbconvert_exporter": "python",
349 | "pygments_lexer": "ipython3",
350 | "version": "3.7.6"
351 | }
352 | },
353 | "nbformat": 4,
354 | "nbformat_minor": 4
355 | }
356 |
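The same split, fit, score, and predict steps can be reproduced without `load_boston`, which is no longer available in scikit-learn 1.2 and later. The sketch below is an assumption-laden stand-in rather than the notebook's own data: it generates a synthetic 506 x 13 feature matrix with NumPy, and `true_coef` is a hypothetical set of weights used only to build the target.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the housing data: 506 samples, 13 features.
rng = np.random.default_rng(0)
data = rng.normal(size=(506, 13))
true_coef = rng.normal(size=13)  # hypothetical ground-truth weights
target = data @ true_coef + rng.normal(scale=0.1, size=506)

# The default split keeps 75% for training and 25% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    data, target, random_state=0
)

model = LinearRegression()
model.fit(X_train, y_train)

train_r2 = model.score(X_train, y_train)  # R^2 on the training set
test_r2 = model.score(X_test, y_test)     # R^2 on held-out data
preds = model.predict(X_test[:3])         # predictions for three samples
```

With 506 samples, the default split yields 379 training rows and 127 test rows, the same shapes shown in the notebook. Any data you can arrange as a 2-D feature array and a 1-D target array can be passed in the same way.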
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # ant-learn-numpy
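2 | Code examples for NumPy, the Python scientific computing library
3 |
4 | You are also welcome to follow my WeChat official account, where I share many videos on learning Python
5 | Topics: Python basics, web scraping, data analysis, big data processing, machine learning, recommender systems, and more
6 |
7 | Official account name: 蚂蚁学Python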
8 |
9 |
--------------------------------------------------------------------------------
/Untitled.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [],
3 | "metadata": {},
4 | "nbformat": 4,
5 | "nbformat_minor": 4
6 | }
7 |
--------------------------------------------------------------------------------
/Untitled1.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [],
3 | "metadata": {},
4 | "nbformat": 4,
5 | "nbformat_minor": 4
6 | }
7 |
--------------------------------------------------------------------------------
/arr_a.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/peiss/ant-learn-numpy/dca4191fcf9c762902632c59cb49c73b12332397/arr_a.npy
--------------------------------------------------------------------------------
/arr_ab.npz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/peiss/ant-learn-numpy/dca4191fcf9c762902632c59cb49c73b12332397/arr_ab.npz
--------------------------------------------------------------------------------
/arr_ab_compressed.npz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/peiss/ant-learn-numpy/dca4191fcf9c762902632c59cb49c73b12332397/arr_ab_compressed.npz
--------------------------------------------------------------------------------
/other_files/numpy-array-inv.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/peiss/ant-learn-numpy/dca4191fcf9c762902632c59cb49c73b12332397/other_files/numpy-array-inv.jpg
--------------------------------------------------------------------------------
/other_files/numpy-kfold-validation.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/peiss/ant-learn-numpy/dca4191fcf9c762902632c59cb49c73b12332397/other_files/numpy-kfold-validation.jpg
--------------------------------------------------------------------------------
/other_files/numpy-kfold-validation.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/peiss/ant-learn-numpy/dca4191fcf9c762902632c59cb49c73b12332397/other_files/numpy-kfold-validation.png
--------------------------------------------------------------------------------
/other_files/numpy_random_functions.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/peiss/ant-learn-numpy/dca4191fcf9c762902632c59cb49c73b12332397/other_files/numpy_random_functions.png
--------------------------------------------------------------------------------