├── .gitignore
├── Code
    ├── Chapter02 Python基础知识.ipynb
    ├── Chapter03 Pandas数据结构.ipynb
    ├── Chapter04 获取数据源.ipynb
    ├── Chapter05 数据预处理.ipynb
    ├── Chapter06 数据选择.ipynb
    ├── Chapter07 数值操作.ipynb
    ├── Chapter08 数据运算.ipynb
    ├── Chapter09 时间序列.ipynb
    ├── Chapter10 数据分组 数据透视表.ipynb
    ├── Chapter11 多表拼接.ipynb
    ├── Chapter12 结果导出.ipynb
    ├── Chapter13 数据可视化.ipynb
    ├── Chapter14 典型数据分析案例.ipynb
    └── Chapter15 NumPy数组.ipynb
├── Data
    ├── Chapter04.1.csv
    ├── Chapter04.csv
    ├── Chapter04.txt
    ├── Chapter04.xlsx
    ├── Chapter05.xlsx
    ├── Chapter06.xlsx
    ├── Chapter07.xlsx
    ├── Chapter08.xlsx
    ├── Chapter10.xlsx
    ├── Chapter11.xlsx
    ├── Chapter12.xlsx
    ├── fillna.xlsx
    ├── loan.csv
    ├── order-14.1.csv
    ├── order-14.3.csv
    ├── train-pivot.csv
    └── 数据集使用说明.txt
├── Note
    ├── Git Fork开源项目如何同步更新.pdf
    ├── Markdown常用标签.pdf
    ├── jupyter notebook导出pdf并支持中文.md
    ├── pandas填充缺失值fillna()函数.ipynb
    ├── 如何给 github 的开源项目提交 pull request.pdf
    └── 常见的Python代码报错及解决方案.pdf
├── Other
    ├── 01 Pyecharts渲染图表 .ipynb
    ├── Pyecharts.xlsx
    └── html
    │   ├── Gauge01.html
    │   ├── Gauge02.html
    │   ├── WordCloud.html
    │   ├── bar01.html
    │   ├── dark.html
    │   ├── images
    │       ├── Gauge01.png
    │       ├── Gauge02.png
    │       ├── WordCloud.png
    │       ├── bar.png
    │       ├── dark.png
    │       ├── pie.png
    │       └── start.png
    │   ├── pie.html
    │   └── start.html
└── README.md


/.gitignore:
--------------------------------------------------------------------------------
1 | Code/.ipynb_checkpoints/
2 | Note/.ipynb_checkpoints/
3 | 


--------------------------------------------------------------------------------
/Code/Chapter03 Pandas数据结构.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "**Pandas Data Structures**  \n",
  8 |     "Data analysis in Python relies mainly on the Pandas, NumPy, and matplotlib modules, which must be imported before use."
  9 |    ]
 10 |   },
 11 |   {
 12 |    "cell_type": "code",
 13 |    "execution_count": 1,
 14 |    "metadata": {},
 15 |    "outputs": [],
 16 |    "source": [
 17 |     "# Import the modules\n",
 18 |     "import pandas as pd\n",
 19 |     "import numpy as np\n",
 20 |     "import matplotlib.pyplot as plt"
 21 |    ]
 22 |   },
 23 |   {
 24 |    "cell_type": "markdown",
 25 |    "metadata": {},
 26 |    "source": [
 27 |     "## The Series Data Structure"
 28 |    ]
 29 |   },
 30 |   {
 31 |    "cell_type": "markdown",
 32 |    "metadata": {},
 33 |    "source": [
 34 |     "### What Is a Series?  \n",
 35 |     "A Series is a one-dimensional, array-like object made up of a set of data and an associated set of data labels (its index)."
 36 |    ]
 37 |   },
 38 |   {
 39 |    "cell_type": "markdown",
 40 |    "metadata": {},
 41 |    "source": [
 42 |     "### Creating a Series\n",
 43 |     "Create a Series with pd.Series(); passing different kinds of objects to pd.Series() produces Series built from them."
 44 |    ]
 45 |   },
 46 |   {
 47 |    "cell_type": "code",
 48 |    "execution_count": 2,
 49 |    "metadata": {},
 50 |    "outputs": [
 51 |     {
 52 |      "data": {
 53 |       "text/plain": [
 54 |        "0    a\n",
 55 |        "1    b\n",
 56 |        "2    c\n",
 57 |        "3    d\n",
 58 |        "dtype: object"
 59 |       ]
 60 |      },
 61 |      "execution_count": 2,
 62 |      "metadata": {},
 63 |      "output_type": "execute_result"
 64 |     }
 65 |    ],
 66 |    "source": [
 67 |     "# Pass in a list\n",
 68 |     "import pandas as pd\n",
 69 |     "S1 = pd.Series([\"a\",\"b\",\"c\",\"d\"])\n",
 70 |     "S1"
 71 |    ]
 72 |   },
 73 |   {
 74 |    "cell_type": "code",
 75 |    "execution_count": 3,
 76 |    "metadata": {},
 77 |    "outputs": [
 78 |     {
 79 |      "data": {
 80 |       "text/plain": [
 81 |        "a    1\n",
 82 |        "b    2\n",
 83 |        "c    3\n",
 84 |        "d    4\n",
 85 |        "dtype: int64"
 86 |       ]
 87 |      },
 88 |      "execution_count": 3,
 89 |      "metadata": {},
 90 |      "output_type": "execute_result"
 91 |     }
 92 |    ],
 93 |    "source": [
 94 |     "# Specify the index\n",
 95 |     "S2 = pd.Series([1,2,3,4],index = [\"a\",\"b\",\"c\",\"d\"])\n",
 96 |     "S2"
 97 |    ]
 98 |   },
 99 |   {
100 |    "cell_type": "code",
101 |    "execution_count": 4,
102 |    "metadata": {},
103 |    "outputs": [
104 |     {
105 |      "data": {
106 |       "text/plain": [
107 |        "a    1\n",
108 |        "b    2\n",
109 |        "c    3\n",
110 |        "d    4\n",
111 |        "dtype: int64"
112 |       ]
113 |      },
114 |      "execution_count": 4,
115 |      "metadata": {},
116 |      "output_type": "execute_result"
117 |     }
118 |    ],
119 |    "source": [
120 |     "# Pass in a dict\n",
121 |     "S3 = pd.Series({\"a\":1,\"b\":2,\"c\":3,\"d\":4})\n",
122 |     "S3"
123 |    ]
124 |   },
125 |   {
126 |    "cell_type": "markdown",
127 |    "metadata": {},
128 |    "source": [
129 |     "### Getting a Series' index with the index attribute"
130 |    ]
131 |   },
132 |   {
133 |    "cell_type": "code",
134 |    "execution_count": 5,
135 |    "metadata": {},
136 |    "outputs": [
137 |     {
138 |      "data": {
139 |       "text/plain": [
140 |        "RangeIndex(start=0, stop=4, step=1)"
141 |       ]
142 |      },
143 |      "execution_count": 5,
144 |      "metadata": {},
145 |      "output_type": "execute_result"
146 |     }
147 |    ],
148 |    "source": [
149 |     "S1.index"
150 |    ]
151 |   },
152 |   {
153 |    "cell_type": "code",
154 |    "execution_count": 6,
155 |    "metadata": {},
156 |    "outputs": [
157 |     {
158 |      "data": {
159 |       "text/plain": [
160 |        "Index(['a', 'b', 'c', 'd'], dtype='object')"
161 |       ]
162 |      },
163 |      "execution_count": 6,
164 |      "metadata": {},
165 |      "output_type": "execute_result"
166 |     }
167 |    ],
168 |    "source": [
169 |     "S2.index"
170 |    ]
171 |   },
172 |   {
173 |    "cell_type": "markdown",
174 |    "metadata": {},
175 |    "source": [
176 |     "### Getting a Series' values with the values attribute"
177 |    ]
178 |   },
179 |   {
180 |    "cell_type": "code",
181 |    "execution_count": 10,
182 |    "metadata": {},
183 |    "outputs": [
184 |     {
185 |      "data": {
186 |       "text/plain": [
187 |        "array(['a', 'b', 'c', 'd'], dtype=object)"
188 |       ]
189 |      },
190 |      "execution_count": 10,
191 |      "metadata": {},
192 |      "output_type": "execute_result"
193 |     }
194 |    ],
195 |    "source": [
196 |     "S1.values"
197 |    ]
198 |   },
199 |   {
200 |    "cell_type": "code",
201 |    "execution_count": 11,
202 |    "metadata": {},
203 |    "outputs": [
204 |     {
205 |      "data": {
206 |       "text/plain": [
207 |        "array([1, 2, 3, 4], dtype=int64)"
208 |       ]
209 |      },
210 |      "execution_count": 11,
211 |      "metadata": {},
212 |      "output_type": "execute_result"
213 |     }
214 |    ],
215 |    "source": [
216 |     "S2.values"
217 |    ]
218 |   },
219 |   {
220 |    "cell_type": "markdown",
221 |    "metadata": {},
222 |    "source": [
223 |     "## The DataFrame Tabular Data Structure"
224 |    ]
225 |   },
226 |   {
227 |    "cell_type": "markdown",
228 |    "metadata": {},
229 |    "source": [
230 |     "### What Is a DataFrame?  \n",
231 |     "A DataFrame is a tabular data structure made up of a set of data and a pair of indexes (a row index and a column index)."
232 |    ]
233 |   },
234 |   {
235 |    "cell_type": "markdown",
236 |    "metadata": {},
237 |    "source": [
238 |     "### Creating a DataFrame  \n",
239 |     "Create a DataFrame with pd.DataFrame() by passing in an object such as a list or a dict."
240 |    ]
241 |   },
242 |   {
243 |    "cell_type": "code",
244 |    "execution_count": 14,
245 |    "metadata": {},
246 |    "outputs": [
247 |     {
248 |      "data": {
249 |       "text/html": [
250 |        "<div>\n",
251 |        "<style scoped>\n",
252 |        "    .dataframe tbody tr th:only-of-type {\n",
253 |        "        vertical-align: middle;\n",
254 |        "    }\n",
255 |        "\n",
256 |        "    .dataframe tbody tr th {\n",
257 |        "        vertical-align: top;\n",
258 |        "    }\n",
259 |        "\n",
260 |        "    .dataframe thead th {\n",
261 |        "        text-align: right;\n",
262 |        "    }\n",
263 |        "</style>\n",
264 |        "<table border=\"1\" class=\"dataframe\">\n",
265 |        "  <thead>\n",
266 |        "    <tr style=\"text-align: right;\">\n",
267 |        "      <th></th>\n",
268 |        "      <th>0</th>\n",
269 |        "    </tr>\n",
270 |        "  </thead>\n",
271 |        "  <tbody>\n",
272 |        "    <tr>\n",
273 |        "      <th>0</th>\n",
274 |        "      <td>a</td>\n",
275 |        "    </tr>\n",
276 |        "    <tr>\n",
277 |        "      <th>1</th>\n",
278 |        "      <td>b</td>\n",
279 |        "    </tr>\n",
280 |        "    <tr>\n",
281 |        "      <th>2</th>\n",
282 |        "      <td>c</td>\n",
283 |        "    </tr>\n",
284 |        "    <tr>\n",
285 |        "      <th>3</th>\n",
286 |        "      <td>d</td>\n",
287 |        "    </tr>\n",
288 |        "  </tbody>\n",
289 |        "</table>\n",
290 |        "</div>"
291 |       ],
292 |       "text/plain": [
293 |        "   0\n",
294 |        "0  a\n",
295 |        "1  b\n",
296 |        "2  c\n",
297 |        "3  d"
298 |       ]
299 |      },
300 |      "execution_count": 14,
301 |      "metadata": {},
302 |      "output_type": "execute_result"
303 |     }
304 |    ],
305 |    "source": [
306 |     "# Pass in a list\n",
307 |     "import pandas as pd\n",
308 |     "df1 = pd.DataFrame([\"a\",\"b\",\"c\",\"d\"])\n",
309 |     "df1"
310 |    ]
311 |   },
312 |   {
313 |    "cell_type": "code",
314 |    "execution_count": 36,
315 |    "metadata": {},
316 |    "outputs": [
317 |     {
318 |      "data": {
319 |       "text/html": [
320 |        "<div>\n",
321 |        "<style scoped>\n",
322 |        "    .dataframe tbody tr th:only-of-type {\n",
323 |        "        vertical-align: middle;\n",
324 |        "    }\n",
325 |        "\n",
326 |        "    .dataframe tbody tr th {\n",
327 |        "        vertical-align: top;\n",
328 |        "    }\n",
329 |        "\n",
330 |        "    .dataframe thead th {\n",
331 |        "        text-align: right;\n",
332 |        "    }\n",
333 |        "</style>\n",
334 |        "<table border=\"1\" class=\"dataframe\">\n",
335 |        "  <thead>\n",
336 |        "    <tr style=\"text-align: right;\">\n",
337 |        "      <th></th>\n",
338 |        "      <th>0</th>\n",
339 |        "      <th>1</th>\n",
340 |        "    </tr>\n",
341 |        "  </thead>\n",
342 |        "  <tbody>\n",
343 |        "    <tr>\n",
344 |        "      <th>0</th>\n",
345 |        "      <td>a</td>\n",
346 |        "      <td>A</td>\n",
347 |        "    </tr>\n",
348 |        "    <tr>\n",
349 |        "      <th>1</th>\n",
350 |        "      <td>b</td>\n",
351 |        "      <td>B</td>\n",
352 |        "    </tr>\n",
353 |        "    <tr>\n",
354 |        "      <th>2</th>\n",
355 |        "      <td>c</td>\n",
356 |        "      <td>C</td>\n",
357 |        "    </tr>\n",
358 |        "    <tr>\n",
359 |        "      <th>3</th>\n",
360 |        "      <td>d</td>\n",
361 |        "      <td>D</td>\n",
362 |        "    </tr>\n",
363 |        "  </tbody>\n",
364 |        "</table>\n",
365 |        "</div>"
366 |       ],
367 |       "text/plain": [
368 |        "   0  1\n",
369 |        "0  a  A\n",
370 |        "1  b  B\n",
371 |        "2  c  C\n",
372 |        "3  d  D"
373 |       ]
374 |      },
375 |      "execution_count": 36,
376 |      "metadata": {},
377 |      "output_type": "execute_result"
378 |     }
379 |    ],
380 |    "source": [
381 |     "# Pass in a nested list\n",
382 |     "df2 = pd.DataFrame([[\"a\",\"A\"],[\"b\",\"B\"],[\"c\",\"C\"],[\"d\",\"D\"]])\n",
383 |     "df2"
384 |    ]
385 |   },
386 |   {
387 |    "cell_type": "markdown",
388 |    "metadata": {},
389 |    "source": [
390 |     "**Specifying row and column indexes**  \n",
391 |     "- The columns parameter sets custom column labels\n",
392 |     "- The index parameter sets custom row labels"
393 |    ]
394 |   },
395 |   {
396 |    "cell_type": "code",
397 |    "execution_count": 22,
398 |    "metadata": {},
399 |    "outputs": [
400 |     {
401 |      "data": {
402 |       "text/html": [
403 |        "<div>\n",
404 |        "<style scoped>\n",
405 |        "    .dataframe tbody tr th:only-of-type {\n",
406 |        "        vertical-align: middle;\n",
407 |        "    }\n",
408 |        "\n",
409 |        "    .dataframe tbody tr th {\n",
410 |        "        vertical-align: top;\n",
411 |        "    }\n",
412 |        "\n",
413 |        "    .dataframe thead th {\n",
414 |        "        text-align: right;\n",
415 |        "    }\n",
416 |        "</style>\n",
417 |        "<table border=\"1\" class=\"dataframe\">\n",
418 |        "  <thead>\n",
419 |        "    <tr style=\"text-align: right;\">\n",
420 |        "      <th></th>\n",
421 |        "      <th>小写</th>\n",
422 |        "      <th>大写</th>\n",
423 |        "    </tr>\n",
424 |        "  </thead>\n",
425 |        "  <tbody>\n",
426 |        "    <tr>\n",
427 |        "      <th>0</th>\n",
428 |        "      <td>a</td>\n",
429 |        "      <td>A</td>\n",
430 |        "    </tr>\n",
431 |        "    <tr>\n",
432 |        "      <th>1</th>\n",
433 |        "      <td>b</td>\n",
434 |        "      <td>B</td>\n",
435 |        "    </tr>\n",
436 |        "    <tr>\n",
437 |        "      <th>2</th>\n",
438 |        "      <td>c</td>\n",
439 |        "      <td>C</td>\n",
440 |        "    </tr>\n",
441 |        "    <tr>\n",
442 |        "      <th>3</th>\n",
443 |        "      <td>d</td>\n",
444 |        "      <td>D</td>\n",
445 |        "    </tr>\n",
446 |        "  </tbody>\n",
447 |        "</table>\n",
448 |        "</div>"
449 |       ],
450 |       "text/plain": [
451 |        "  小写 大写\n",
452 |        "0  a  A\n",
453 |        "1  b  B\n",
454 |        "2  c  C\n",
455 |        "3  d  D"
456 |       ]
457 |      },
458 |      "execution_count": 22,
459 |      "metadata": {},
460 |      "output_type": "execute_result"
461 |     }
462 |    ],
463 |    "source": [
464 |     "# Set the column index\n",
465 |     "df31 = pd.DataFrame([[\"a\",\"A\"],[\"b\",\"B\"],[\"c\",\"C\"],[\"d\",\"D\"]],columns = [\"小写\",\"大写\"])\n",
466 |     "df31"
467 |    ]
468 |   },
469 |   {
470 |    "cell_type": "code",
471 |    "execution_count": 24,
472 |    "metadata": {},
473 |    "outputs": [
474 |     {
475 |      "data": {
476 |       "text/html": [
477 |        "<div>\n",
478 |        "<style scoped>\n",
479 |        "    .dataframe tbody tr th:only-of-type {\n",
480 |        "        vertical-align: middle;\n",
481 |        "    }\n",
482 |        "\n",
483 |        "    .dataframe tbody tr th {\n",
484 |        "        vertical-align: top;\n",
485 |        "    }\n",
486 |        "\n",
487 |        "    .dataframe thead th {\n",
488 |        "        text-align: right;\n",
489 |        "    }\n",
490 |        "</style>\n",
491 |        "<table border=\"1\" class=\"dataframe\">\n",
492 |        "  <thead>\n",
493 |        "    <tr style=\"text-align: right;\">\n",
494 |        "      <th></th>\n",
495 |        "      <th>0</th>\n",
496 |        "      <th>1</th>\n",
497 |        "    </tr>\n",
498 |        "  </thead>\n",
499 |        "  <tbody>\n",
500 |        "    <tr>\n",
501 |        "      <th>一</th>\n",
502 |        "      <td>a</td>\n",
503 |        "      <td>A</td>\n",
504 |        "    </tr>\n",
505 |        "    <tr>\n",
506 |        "      <th>二</th>\n",
507 |        "      <td>b</td>\n",
508 |        "      <td>B</td>\n",
509 |        "    </tr>\n",
510 |        "    <tr>\n",
511 |        "      <th>三</th>\n",
512 |        "      <td>c</td>\n",
513 |        "      <td>C</td>\n",
514 |        "    </tr>\n",
515 |        "    <tr>\n",
516 |        "      <th>四</th>\n",
517 |        "      <td>d</td>\n",
518 |        "      <td>D</td>\n",
519 |        "    </tr>\n",
520 |        "  </tbody>\n",
521 |        "</table>\n",
522 |        "</div>"
523 |       ],
524 |       "text/plain": [
525 |        "   0  1\n",
526 |        "一  a  A\n",
527 |        "二  b  B\n",
528 |        "三  c  C\n",
529 |        "四  d  D"
530 |       ]
531 |      },
532 |      "execution_count": 24,
533 |      "metadata": {},
534 |      "output_type": "execute_result"
535 |     }
536 |    ],
537 |    "source": [
538 |     "# Set the row index\n",
539 |     "df32 = pd.DataFrame([[\"a\",\"A\"],[\"b\",\"B\"],[\"c\",\"C\"],[\"d\",\"D\"]],index = [\"一\",\"二\",\"三\",\"四\"])\n",
540 |     "df32"
541 |    ]
542 |   },
543 |   {
544 |    "cell_type": "code",
545 |    "execution_count": 37,
546 |    "metadata": {},
547 |    "outputs": [
548 |     {
549 |      "data": {
550 |       "text/html": [
551 |        "<div>\n",
552 |        "<style scoped>\n",
553 |        "    .dataframe tbody tr th:only-of-type {\n",
554 |        "        vertical-align: middle;\n",
555 |        "    }\n",
556 |        "\n",
557 |        "    .dataframe tbody tr th {\n",
558 |        "        vertical-align: top;\n",
559 |        "    }\n",
560 |        "\n",
561 |        "    .dataframe thead th {\n",
562 |        "        text-align: right;\n",
563 |        "    }\n",
564 |        "</style>\n",
565 |        "<table border=\"1\" class=\"dataframe\">\n",
566 |        "  <thead>\n",
567 |        "    <tr style=\"text-align: right;\">\n",
568 |        "      <th></th>\n",
569 |        "      <th>小写</th>\n",
570 |        "      <th>大写</th>\n",
571 |        "    </tr>\n",
572 |        "  </thead>\n",
573 |        "  <tbody>\n",
574 |        "    <tr>\n",
575 |        "      <th>一</th>\n",
576 |        "      <td>a</td>\n",
577 |        "      <td>A</td>\n",
578 |        "    </tr>\n",
579 |        "    <tr>\n",
580 |        "      <th>二</th>\n",
581 |        "      <td>b</td>\n",
582 |        "      <td>B</td>\n",
583 |        "    </tr>\n",
584 |        "    <tr>\n",
585 |        "      <th>三</th>\n",
586 |        "      <td>c</td>\n",
587 |        "      <td>C</td>\n",
588 |        "    </tr>\n",
589 |        "    <tr>\n",
590 |        "      <th>四</th>\n",
591 |        "      <td>d</td>\n",
592 |        "      <td>D</td>\n",
593 |        "    </tr>\n",
594 |        "  </tbody>\n",
595 |        "</table>\n",
596 |        "</div>"
597 |       ],
598 |       "text/plain": [
599 |        "  小写 大写\n",
600 |        "一  a  A\n",
601 |        "二  b  B\n",
602 |        "三  c  C\n",
603 |        "四  d  D"
604 |       ]
605 |      },
606 |      "execution_count": 37,
607 |      "metadata": {},
608 |      "output_type": "execute_result"
609 |     }
610 |    ],
611 |    "source": [
612 |     "# Set the row and column indexes together\n",
613 |     "df33 = pd.DataFrame([[\"a\",\"A\"],[\"b\",\"B\"],[\"c\",\"C\"],[\"d\",\"D\"]],columns = [\"小写\",\"大写\"],index = [\"一\",\"二\",\"三\",\"四\"])\n",
614 |     "df33"
615 |    ]
616 |   },
617 |   {
618 |    "cell_type": "code",
619 |    "execution_count": 38,
620 |    "metadata": {},
621 |    "outputs": [
622 |     {
623 |      "data": {
624 |       "text/html": [
625 |        "<div>\n",
626 |        "<style scoped>\n",
627 |        "    .dataframe tbody tr th:only-of-type {\n",
628 |        "        vertical-align: middle;\n",
629 |        "    }\n",
630 |        "\n",
631 |        "    .dataframe tbody tr th {\n",
632 |        "        vertical-align: top;\n",
633 |        "    }\n",
634 |        "\n",
635 |        "    .dataframe thead th {\n",
636 |        "        text-align: right;\n",
637 |        "    }\n",
638 |        "</style>\n",
639 |        "<table border=\"1\" class=\"dataframe\">\n",
640 |        "  <thead>\n",
641 |        "    <tr style=\"text-align: right;\">\n",
642 |        "      <th></th>\n",
643 |        "      <th>小写</th>\n",
644 |        "      <th>大写</th>\n",
645 |        "    </tr>\n",
646 |        "  </thead>\n",
647 |        "  <tbody>\n",
648 |        "    <tr>\n",
649 |        "      <th>0</th>\n",
650 |        "      <td>a</td>\n",
651 |        "      <td>A</td>\n",
652 |        "    </tr>\n",
653 |        "    <tr>\n",
654 |        "      <th>1</th>\n",
655 |        "      <td>b</td>\n",
656 |        "      <td>B</td>\n",
657 |        "    </tr>\n",
658 |        "    <tr>\n",
659 |        "      <th>2</th>\n",
660 |        "      <td>c</td>\n",
661 |        "      <td>C</td>\n",
662 |        "    </tr>\n",
663 |        "    <tr>\n",
664 |        "      <th>3</th>\n",
665 |        "      <td>d</td>\n",
666 |        "      <td>D</td>\n",
667 |        "    </tr>\n",
668 |        "  </tbody>\n",
669 |        "</table>\n",
670 |        "</div>"
671 |       ],
672 |       "text/plain": [
673 |        "  小写 大写\n",
674 |        "0  a  A\n",
675 |        "1  b  B\n",
676 |        "2  c  C\n",
677 |        "3  d  D"
678 |       ]
679 |      },
680 |      "execution_count": 38,
681 |      "metadata": {},
682 |      "output_type": "execute_result"
683 |     }
684 |    ],
685 |    "source": [
686 |     "# Pass in a dict\n",
687 |     "data = {\"小写\":[\"a\",\"b\",\"c\",\"d\"],\"大写\":[\"A\",\"B\",\"C\",\"D\"]}\n",
688 |     "df41 = pd.DataFrame(data)\n",
689 |     "df41"
690 |    ]
691 |   },
692 |   {
693 |    "cell_type": "markdown",
694 |    "metadata": {},
695 |    "source": [
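696 |     "- When a dict is passed to DataFrame, its keys become the column index. If no row index is given, it defaults to integers starting from 0; to set one, use the index parameter."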
697 |    ]
698 |   },
699 |   {
700 |    "cell_type": "code",
701 |    "execution_count": 28,
702 |    "metadata": {},
703 |    "outputs": [
704 |     {
705 |      "data": {
706 |       "text/html": [
707 |        "<div>\n",
708 |        "<style scoped>\n",
709 |        "    .dataframe tbody tr th:only-of-type {\n",
710 |        "        vertical-align: middle;\n",
711 |        "    }\n",
712 |        "\n",
713 |        "    .dataframe tbody tr th {\n",
714 |        "        vertical-align: top;\n",
715 |        "    }\n",
716 |        "\n",
717 |        "    .dataframe thead th {\n",
718 |        "        text-align: right;\n",
719 |        "    }\n",
720 |        "</style>\n",
721 |        "<table border=\"1\" class=\"dataframe\">\n",
722 |        "  <thead>\n",
723 |        "    <tr style=\"text-align: right;\">\n",
724 |        "      <th></th>\n",
725 |        "      <th>小写</th>\n",
726 |        "      <th>大写</th>\n",
727 |        "    </tr>\n",
728 |        "  </thead>\n",
729 |        "  <tbody>\n",
730 |        "    <tr>\n",
731 |        "      <th>一</th>\n",
732 |        "      <td>a</td>\n",
733 |        "      <td>A</td>\n",
734 |        "    </tr>\n",
735 |        "    <tr>\n",
736 |        "      <th>二</th>\n",
737 |        "      <td>b</td>\n",
738 |        "      <td>B</td>\n",
739 |        "    </tr>\n",
740 |        "    <tr>\n",
741 |        "      <th>三</th>\n",
742 |        "      <td>c</td>\n",
743 |        "      <td>C</td>\n",
744 |        "    </tr>\n",
745 |        "    <tr>\n",
746 |        "      <th>四</th>\n",
747 |        "      <td>d</td>\n",
748 |        "      <td>D</td>\n",
749 |        "    </tr>\n",
750 |        "  </tbody>\n",
751 |        "</table>\n",
752 |        "</div>"
753 |       ],
754 |       "text/plain": [
755 |        "  小写 大写\n",
756 |        "一  a  A\n",
757 |        "二  b  B\n",
758 |        "三  c  C\n",
759 |        "四  d  D"
760 |       ]
761 |      },
762 |      "execution_count": 28,
763 |      "metadata": {},
764 |      "output_type": "execute_result"
765 |     }
766 |    ],
767 |    "source": [
768 |     "# Set a row index for data passed in as a dict\n",
769 |     "data = {\"小写\":[\"a\",\"b\",\"c\",\"d\"],\"大写\":[\"A\",\"B\",\"C\",\"D\"]}\n",
770 |     "df42 = pd.DataFrame(data,index = [\"一\",\"二\",\"三\",\"四\"])\n",
771 |     "df42"
772 |    ]
773 |   },
774 |   {
775 |    "cell_type": "markdown",
776 |    "metadata": {},
777 |    "source": [
778 |     "### Getting a DataFrame's row and column indexes  \n",
779 |     "- Use the columns attribute to get the DataFrame's column index\n",
780 |     "- Use the index attribute to get the DataFrame's row index"
781 |    ]
782 |   },
783 |   {
784 |    "cell_type": "code",
785 |    "execution_count": 29,
786 |    "metadata": {},
787 |    "outputs": [
788 |     {
789 |      "data": {
790 |       "text/plain": [
791 |        "RangeIndex(start=0, stop=2, step=1)"
792 |       ]
793 |      },
794 |      "execution_count": 29,
795 |      "metadata": {},
796 |      "output_type": "execute_result"
797 |     }
798 |    ],
799 |    "source": [
800 |     "# Get the DataFrame's column index\n",
801 |     "df2.columns"
802 |    ]
803 |   },
804 |   {
805 |    "cell_type": "code",
806 |    "execution_count": 33,
807 |    "metadata": {},
808 |    "outputs": [
809 |     {
810 |      "data": {
811 |       "text/plain": [
812 |        "Index(['小写', '大写'], dtype='object')"
813 |       ]
814 |      },
815 |      "execution_count": 33,
816 |      "metadata": {},
817 |      "output_type": "execute_result"
818 |     }
819 |    ],
820 |    "source": [
821 |     "df33.columns"
822 |    ]
823 |   },
824 |   {
825 |    "cell_type": "code",
826 |    "execution_count": 34,
827 |    "metadata": {},
828 |    "outputs": [
829 |     {
830 |      "data": {
831 |       "text/plain": [
832 |        "RangeIndex(start=0, stop=4, step=1)"
833 |       ]
834 |      },
835 |      "execution_count": 34,
836 |      "metadata": {},
837 |      "output_type": "execute_result"
838 |     }
839 |    ],
840 |    "source": [
841 |     "# Get the DataFrame's row index\n",
842 |     "df2.index"
843 |    ]
844 |   },
845 |   {
846 |    "cell_type": "code",
847 |    "execution_count": 35,
848 |    "metadata": {},
849 |    "outputs": [
850 |     {
851 |      "data": {
852 |       "text/plain": [
853 |        "Index(['一', '二', '三', '四'], dtype='object')"
854 |       ]
855 |      },
856 |      "execution_count": 35,
857 |      "metadata": {},
858 |      "output_type": "execute_result"
859 |     }
860 |    ],
861 |    "source": [
862 |     "df33.index"
863 |    ]
864 |   },
865 |   {
866 |    "cell_type": "markdown",
867 |    "metadata": {},
868 |    "source": [
869 |     "## Getting a DataFrame's values\n",
870 |     "Covered in Chapter 6."
871 |    ]
872 |   }
873 |  ],
874 |  "metadata": {
875 |   "kernelspec": {
876 |    "display_name": "Python 3",
877 |    "language": "python",
878 |    "name": "python3"
879 |   },
880 |   "language_info": {
881 |    "codemirror_mode": {
882 |     "name": "ipython",
883 |     "version": 3
884 |    },
885 |    "file_extension": ".py",
886 |    "mimetype": "text/x-python",
887 |    "name": "python",
888 |    "nbconvert_exporter": "python",
889 |    "pygments_lexer": "ipython3",
890 |    "version": "3.7.0"
891 |   },
892 |   "toc": {
893 |    "base_numbering": 1,
894 |    "nav_menu": {},
895 |    "number_sections": true,
896 |    "sideBar": true,
897 |    "skip_h1_title": false,
898 |    "title_cell": "Table of Contents",
899 |    "title_sidebar": "Chapter 3 Pandas Data Structures",
900 |    "toc_cell": false,
901 |    "toc_position": {
902 |     "height": "calc(100% - 180px)",
903 |     "left": "10px",
904 |     "top": "150px",
905 |     "width": "320px"
906 |    },
907 |    "toc_section_display": true,
908 |    "toc_window_display": true
909 |   }
910 |  },
911 |  "nbformat": 4,
912 |  "nbformat_minor": 2
913 | }
914 | 
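
The Series and DataFrame constructors walked through in the notebook above can be condensed into one standalone script. This is only a minimal sketch, assuming pandas is installed; the labels `lower`, `upper`, and `r1` to `r4` are hypothetical English stand-ins for the notebook's Chinese labels, not names from the original.

```python
import pandas as pd

# A Series from a list (default integer index), from a list with explicit
# labels, and from a dict (the dict keys become the index).
s_list = pd.Series(["a", "b", "c", "d"])
s_labeled = pd.Series([1, 2, 3, 4], index=["a", "b", "c", "d"])
s_dict = pd.Series({"a": 1, "b": 2, "c": 3, "d": 4})

# A DataFrame from a nested list with explicit row and column labels
# (hypothetical labels), and from a dict whose keys become the column labels.
rows = [["a", "A"], ["b", "B"], ["c", "C"], ["d", "D"]]
df_nested = pd.DataFrame(
    rows, columns=["lower", "upper"], index=["r1", "r2", "r3", "r4"]
)
df_dict = pd.DataFrame(
    {"lower": ["a", "b", "c", "d"], "upper": ["A", "B", "C", "D"]}
)

# index, columns and values are attributes, so they take no parentheses.
print(list(s_list.index))
print(list(s_labeled.index))
print(s_labeled.equals(s_dict))
print(list(df_nested.columns))
print(list(df_nested.index))
print(list(df_dict.index))
```

Building the labeled Series and the dict-based Series side by side shows that the two routes give the same result here, and that a dict-based DataFrame falls back to a default integer row index when `index` is omitted.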


--------------------------------------------------------------------------------
/Code/Chapter05 数据预处理.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# Data Preprocessing"
  8 |    ]
  9 |   },
 10 |   {
 11 |    "cell_type": "markdown",
 12 |    "metadata": {},
 13 |    "source": [
 14 |     "## Handling Missing Values\n",
 15 |     "### Inspecting Missing Values"
 16 |    ]
 17 |   },
 18 |   {
 19 |    "cell_type": "code",
 20 |    "execution_count": 1,
 21 |    "metadata": {},
 22 |    "outputs": [
 23 |     {
 24 |      "name": "stdout",
 25 |      "output_type": "stream",
 26 |      "text": [
 27 |       "<class 'pandas.core.frame.DataFrame'>\n",
 28 |       "RangeIndex: 5 entries, 0 to 4\n",
 29 |       "Data columns (total 4 columns):\n",
 30 |       "编号      4 non-null object\n",
 31 |       "年龄      4 non-null float64\n",
 32 |       "性别      3 non-null object\n",
 33 |       "注册时间    4 non-null datetime64[ns]\n",
 34 |       "dtypes: datetime64[ns](1), float64(1), object(2)\n",
 35 |       "memory usage: 240.0+ bytes\n"
 36 |      ]
 37 |     }
 38 |    ],
 39 |    "source": [
 40 |     "import pandas as pd\n",
 41 |     "df = pd.read_excel(r\"..\\Data\\Chapter05.xlsx\")\n",
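 42 |     "df.head(20).info()  # head() returns the first 5 rows by default; head(20) returns up to 20\n",
 43 |     "# df.info()  # info() reports each column's dtype and non-null count, which reveals missing data"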
 44 |    ]
 45 |   },
 46 |   {
 47 |    "cell_type": "markdown",
 48 |    "metadata": {},
 49 |    "source": [
 50 |     "### Dropping Missing Values"
 51 |    ]
 52 |   },
 53 |   {
 54 |    "cell_type": "code",
 55 |    "execution_count": 2,
 56 |    "metadata": {},
 57 |    "outputs": [
 58 |     {
 59 |      "data": {
 60 |       "text/html": [
 61 |        "<div>\n",
 62 |        "<style scoped>\n",
 63 |        "    .dataframe tbody tr th:only-of-type {\n",
 64 |        "        vertical-align: middle;\n",
 65 |        "    }\n",
 66 |        "\n",
 67 |        "    .dataframe tbody tr th {\n",
 68 |        "        vertical-align: top;\n",
 69 |        "    }\n",
 70 |        "\n",
 71 |        "    .dataframe thead th {\n",
 72 |        "        text-align: right;\n",
 73 |        "    }\n",
 74 |        "</style>\n",
 75 |        "<table border=\"1\" class=\"dataframe\">\n",
 76 |        "  <thead>\n",
 77 |        "    <tr style=\"text-align: right;\">\n",
 78 |        "      <th></th>\n",
 79 |        "      <th>编号</th>\n",
 80 |        "      <th>年龄</th>\n",
 81 |        "      <th>性别</th>\n",
 82 |        "      <th>注册时间</th>\n",
 83 |        "    </tr>\n",
 84 |        "  </thead>\n",
 85 |        "  <tbody>\n",
 86 |        "    <tr>\n",
 87 |        "      <th>0</th>\n",
 88 |        "      <td>A1</td>\n",
 89 |        "      <td>54.0</td>\n",
 90 |        "      <td>男</td>\n",
 91 |        "      <td>2018-08-08</td>\n",
 92 |        "    </tr>\n",
 93 |        "    <tr>\n",
 94 |        "      <th>1</th>\n",
 95 |        "      <td>A2</td>\n",
 96 |        "      <td>16.0</td>\n",
 97 |        "      <td>NaN</td>\n",
 98 |        "      <td>2018-08-09</td>\n",
 99 |        "    </tr>\n",
100 |        "    <tr>\n",
101 |        "      <th>3</th>\n",
102 |        "      <td>A3</td>\n",
103 |        "      <td>47.0</td>\n",
104 |        "      <td>女</td>\n",
105 |        "      <td>2018-08-10</td>\n",
106 |        "    </tr>\n",
107 |        "    <tr>\n",
108 |        "      <th>4</th>\n",
109 |        "      <td>A4</td>\n",
110 |        "      <td>41.0</td>\n",
111 |        "      <td>男</td>\n",
112 |        "      <td>2018-08-11</td>\n",
113 |        "    </tr>\n",
114 |        "  </tbody>\n",
115 |        "</table>\n",
116 |        "</div>"
117 |       ],
118 |       "text/plain": [
119 |        "   编号    年龄   性别       注册时间\n",
120 |        "0  A1  54.0    男 2018-08-08\n",
121 |        "1  A2  16.0  NaN 2018-08-09\n",
122 |        "3  A3  47.0    女 2018-08-10\n",
123 |        "4  A4  41.0    男 2018-08-11"
124 |       ]
125 |      },
126 |      "execution_count": 2,
127 |      "metadata": {},
128 |      "output_type": "execute_result"
129 |     }
130 |    ],
131 |    "source": [
132 |     "import pandas as pd\n",
133 |     "df = pd.read_excel(r\"..\\Data\\Chapter05.xlsx\")\n",
134 |     "df.dropna()  # dropna() drops every row that contains a missing value\n",
135 |     "df.dropna(how = \"all\")  # drop only the rows in which every column is missing"
136 |    ]
137 |   },
138 |   {
139 |    "cell_type": "markdown",
140 |    "metadata": {},
141 |    "source": [
142 |     "### Filling Missing Values"
143 |    ]
144 |   },
145 |   {
146 |    "cell_type": "code",
147 |    "execution_count": 5,
148 |    "metadata": {},
149 |    "outputs": [
150 |     {
151 |      "data": {
152 |       "text/html": [
153 |        "<div>\n",
154 |        "<style scoped>\n",
155 |        "    .dataframe tbody tr th:only-of-type {\n",
156 |        "        vertical-align: middle;\n",
157 |        "    }\n",
158 |        "\n",
159 |        "    .dataframe tbody tr th {\n",
160 |        "        vertical-align: top;\n",
161 |        "    }\n",
162 |        "\n",
163 |        "    .dataframe thead th {\n",
164 |        "        text-align: right;\n",
165 |        "    }\n",
166 |        "</style>\n",
167 |        "<table border=\"1\" class=\"dataframe\">\n",
168 |        "  <thead>\n",
169 |        "    <tr style=\"text-align: right;\">\n",
170 |        "      <th></th>\n",
171 |        "      <th>编号</th>\n",
172 |        "      <th>年龄</th>\n",
173 |        "      <th>性别</th>\n",
174 |        "      <th>注册时间</th>\n",
175 |        "    </tr>\n",
176 |        "  </thead>\n",
177 |        "  <tbody>\n",
178 |        "    <tr>\n",
179 |        "      <th>0</th>\n",
180 |        "      <td>A1</td>\n",
181 |        "      <td>54.0</td>\n",
182 |        "      <td>男</td>\n",
183 |        "      <td>2018-08-08</td>\n",
184 |        "    </tr>\n",
185 |        "    <tr>\n",
186 |        "      <th>1</th>\n",
187 |        "      <td>A2</td>\n",
188 |        "      <td>16.0</td>\n",
189 |        "      <td>男</td>\n",
190 |        "      <td>2018-08-09</td>\n",
191 |        "    </tr>\n",
192 |        "    <tr>\n",
193 |        "      <th>2</th>\n",
194 |        "      <td>A3</td>\n",
195 |        "      <td>30.0</td>\n",
196 |        "      <td>女</td>\n",
197 |        "      <td>2018-08-10</td>\n",
198 |        "    </tr>\n",
199 |        "    <tr>\n",
200 |        "      <th>3</th>\n",
201 |        "      <td>A4</td>\n",
202 |        "      <td>41.0</td>\n",
203 |        "      <td>男</td>\n",
204 |        "      <td>2018-08-11</td>\n",
205 |        "    </tr>\n",
206 |        "  </tbody>\n",
207 |        "</table>\n",
208 |        "</div>"
209 |       ],
210 |       "text/plain": [
211 |        "   编号    年龄 性别       注册时间\n",
212 |        "0  A1  54.0  男 2018-08-08\n",
213 |        "1  A2  16.0  男 2018-08-09\n",
214 |        "2  A3  30.0  女 2018-08-10\n",
215 |        "3  A4  41.0  男 2018-08-11"
216 |       ]
217 |      },
218 |      "execution_count": 5,
219 |      "metadata": {},
220 |      "output_type": "execute_result"
221 |     }
222 |    ],
223 |    "source": [
224 |     "import pandas as pd\n",
225 |     "df = pd.read_excel(r\"..\\Data\\Chapter05.xlsx\",sheet_name=1)\n",
226 |     "df.fillna(0)  # fillna() replaces every missing value with 0\n",
227 |     "df.fillna({\"性别\":\"男\",\"年龄\":30})  # fill the 性别 and 年龄 columns with different values\n"
228 |    ]
229 |   },
230 |   {
231 |    "cell_type": "markdown",
232 |    "metadata": {},
233 |    "source": [
234 |     "## Handling duplicate data"
235 |    ]
236 |   },
237 |   {
238 |    "cell_type": "code",
239 |    "execution_count": 9,
240 |    "metadata": {},
241 |    "outputs": [
242 |     {
243 |      "data": {
244 |       "text/html": [
245 |        "<div>\n",
246 |        "<style scoped>\n",
247 |        "    .dataframe tbody tr th:only-of-type {\n",
248 |        "        vertical-align: middle;\n",
249 |        "    }\n",
250 |        "\n",
251 |        "    .dataframe tbody tr th {\n",
252 |        "        vertical-align: top;\n",
253 |        "    }\n",
254 |        "\n",
255 |        "    .dataframe thead th {\n",
256 |        "        text-align: right;\n",
257 |        "    }\n",
258 |        "</style>\n",
259 |        "<table border=\"1\" class=\"dataframe\">\n",
260 |        "  <thead>\n",
261 |        "    <tr style=\"text-align: right;\">\n",
262 |        "      <th></th>\n",
263 |        "      <th>订单编号</th>\n",
264 |        "      <th>客户姓名</th>\n",
265 |        "      <th>唯一识别码</th>\n",
266 |        "      <th>成交时间</th>\n",
267 |        "    </tr>\n",
268 |        "  </thead>\n",
269 |        "  <tbody>\n",
270 |        "    <tr>\n",
271 |        "      <th>0</th>\n",
272 |        "      <td>A1</td>\n",
273 |        "      <td>张通</td>\n",
274 |        "      <td>101</td>\n",
275 |        "      <td>2018-08-08</td>\n",
276 |        "    </tr>\n",
277 |        "    <tr>\n",
278 |        "      <th>1</th>\n",
279 |        "      <td>A2</td>\n",
280 |        "      <td>李谷</td>\n",
281 |        "      <td>102</td>\n",
282 |        "      <td>2018-08-09</td>\n",
283 |        "    </tr>\n",
284 |        "    <tr>\n",
285 |        "      <th>3</th>\n",
286 |        "      <td>A3</td>\n",
287 |        "      <td>孙凤</td>\n",
288 |        "      <td>103</td>\n",
289 |        "      <td>2018-08-10</td>\n",
290 |        "    </tr>\n",
291 |        "    <tr>\n",
292 |        "      <th>5</th>\n",
293 |        "      <td>A5</td>\n",
294 |        "      <td>赵恒</td>\n",
295 |        "      <td>104</td>\n",
296 |        "      <td>2018-08-11</td>\n",
297 |        "    </tr>\n",
298 |        "  </tbody>\n",
299 |        "</table>\n",
300 |        "</div>"
301 |       ],
302 |       "text/plain": [
303 |        "  订单编号 客户姓名  唯一识别码       成交时间\n",
304 |        "0   A1   张通    101 2018-08-08\n",
305 |        "1   A2   李谷    102 2018-08-09\n",
306 |        "3   A3   孙凤    103 2018-08-10\n",
307 |        "5   A5   赵恒    104 2018-08-11"
308 |       ]
309 |      },
310 |      "execution_count": 9,
311 |      "metadata": {},
312 |      "output_type": "execute_result"
313 |     }
314 |    ],
315 |    "source": [
316 |     "import pandas as pd\n",
317 |     "df = pd.read_excel(r\"..\\Data\\Chapter05.xlsx\",sheet_name=2)\n",
318 |     "df.drop_duplicates()  # drop duplicate rows (all columns are compared)\n",
319 |     "df.drop_duplicates(subset = \"唯一识别码\")  # compare only the specified column\n",
320 |     "df.drop_duplicates(subset = [\"客户姓名\",\"唯一识别码\"])  # compare several columns\n",
321 |     "df.drop_duplicates(subset = [\"客户姓名\",\"唯一识别码\"],keep = \"last\")  # keep ('first' or 'last') sets which duplicate is kept\n"
322 |    ]
323 |   },
324 |   {
325 |    "cell_type": "markdown",
326 |    "metadata": {},
327 |    "source": [
328 |     "## Detecting and handling outliers"
329 |    ]
330 |   },
331 |   {
332 |    "cell_type": "markdown",
333 |    "metadata": {},
334 |    "source": [
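335 |     "Outliers are generally handled in one of the following ways:\n",
336 |     "- Delete them; this is the most common approach.\n",
337 |     "- Treat them as missing values and fill them in.\n",
338 |     "- Treat them as special cases and investigate why they occurred."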
339 |    ]
340 |   },
341 |   {
342 |    "cell_type": "markdown",
343 |    "metadata": {},
344 |    "source": [
345 |     "## Data type conversion"
346 |    ]
347 |   },
348 |   {
349 |    "cell_type": "markdown",
350 |    "metadata": {},
351 |    "source": [
352 |     "### Data types"
353 |    ]
354 |   },
355 |   {
356 |    "cell_type": "markdown",
357 |    "metadata": {},
358 |    "source": [
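359 |     "Type | Description\n",
360 |     "---|---\n",
361 |     "int | Integer\n",
362 |     "float | Floating-point number, i.e. a number with a decimal point\n",
363 |     "object | Python object type, denoted by O\n",
364 |     "string_ | String type, usually denoted by S; S10 is a string of length 10\n",
365 |     "unicode_ | Fixed-length unicode type, defined the same way as the string type\n",
366 |     "datetime64[ns] | Date/time type"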
367 |    ]
368 |   },
369 |   {
370 |    "cell_type": "code",
371 |    "execution_count": 6,
372 |    "metadata": {},
373 |    "outputs": [
374 |     {
375 |      "name": "stdout",
376 |      "output_type": "stream",
377 |      "text": [
378 |       "<class 'pandas.core.frame.DataFrame'>\n",
379 |       "RangeIndex: 6 entries, 0 to 5\n",
380 |       "Data columns (total 4 columns):\n",
381 |       "订单编号     6 non-null object\n",
382 |       "客户姓名     6 non-null object\n",
383 |       "唯一识别码    6 non-null int64\n",
384 |       "成交时间     6 non-null datetime64[ns]\n",
385 |       "dtypes: datetime64[ns](1), int64(1), object(2)\n",
386 |       "memory usage: 272.0+ bytes\n"
387 |      ]
388 |     },
389 |     {
390 |      "data": {
391 |       "text/plain": [
392 |        "dtype('int64')"
393 |       ]
394 |      },
395 |      "execution_count": 6,
396 |      "metadata": {},
397 |      "output_type": "execute_result"
398 |     }
399 |    ],
400 |    "source": [
401 |     "import pandas as pd\n",
402 |     "df = pd.read_excel(r\"..\\Data\\Chapter05.xlsx\",sheet_name = 2)\n",
403 |     "df.info()  # info() reports the dtype of every column\n",
404 |     "df[\"订单编号\"].dtype  # dtype of the 订单编号 column\n",
405 |     "df[\"唯一识别码\"].dtype  # dtype of the 唯一识别码 column"
406 |    ]
407 |   },
408 |   {
409 |    "cell_type": "markdown",
410 |    "metadata": {},
411 |    "source": [
412 |     "### Type conversion"
413 |    ]
414 |   },
415 |   {
416 |    "cell_type": "code",
417 |    "execution_count": 17,
418 |    "metadata": {},
419 |    "outputs": [
420 |     {
421 |      "data": {
422 |       "text/plain": [
423 |        "0    101.0\n",
424 |        "1    102.0\n",
425 |        "2    103.0\n",
426 |        "3    103.0\n",
427 |        "4    104.0\n",
428 |        "5    104.0\n",
429 |        "Name: 唯一识别码, dtype: float64"
430 |       ]
431 |      },
432 |      "execution_count": 17,
433 |      "metadata": {},
434 |      "output_type": "execute_result"
435 |     }
436 |    ],
437 |    "source": [
438 |     "import pandas as pd\n",
439 |     "df = pd.read_excel(r\"..\\Data\\Chapter05.xlsx\",sheet_name = 2)\n",
440 |     "df[\"唯一识别码\"].dtype  # check the current dtype\n",
441 |     "df[\"唯一识别码\"].astype(\"float64\")  # convert 唯一识别码 from int to float"
442 |    ]
443 |   },
444 |   {
445 |    "cell_type": "markdown",
446 |    "metadata": {},
447 |    "source": [
448 |     "## Setting the index"
449 |    ]
450 |   },
451 |   {
452 |    "cell_type": "markdown",
453 |    "metadata": {},
454 |    "source": [
455 |     "### Adding an index to a table without one"
456 |    ]
457 |   },
458 |   {
459 |    "cell_type": "code",
460 |    "execution_count": 49,
461 |    "metadata": {},
462 |    "outputs": [
463 |     {
464 |      "data": {
465 |       "text/html": [
466 |        "<div>\n",
467 |        "<style scoped>\n",
468 |        "    .dataframe tbody tr th:only-of-type {\n",
469 |        "        vertical-align: middle;\n",
470 |        "    }\n",
471 |        "\n",
472 |        "    .dataframe tbody tr th {\n",
473 |        "        vertical-align: top;\n",
474 |        "    }\n",
475 |        "\n",
476 |        "    .dataframe thead th {\n",
477 |        "        text-align: right;\n",
478 |        "    }\n",
479 |        "</style>\n",
480 |        "<table border=\"1\" class=\"dataframe\">\n",
481 |        "  <thead>\n",
482 |        "    <tr style=\"text-align: right;\">\n",
483 |        "      <th></th>\n",
484 |        "      <th>订单编号</th>\n",
485 |        "      <th>客户姓名</th>\n",
486 |        "      <th>唯一识别码</th>\n",
487 |        "      <th>成交时间</th>\n",
488 |        "    </tr>\n",
489 |        "  </thead>\n",
490 |        "  <tbody>\n",
491 |        "    <tr>\n",
492 |        "      <th>1</th>\n",
493 |        "      <td>A1</td>\n",
494 |        "      <td>张通</td>\n",
495 |        "      <td>101</td>\n",
496 |        "      <td>2018-08-08</td>\n",
497 |        "    </tr>\n",
498 |        "    <tr>\n",
499 |        "      <th>2</th>\n",
500 |        "      <td>A2</td>\n",
501 |        "      <td>李谷</td>\n",
502 |        "      <td>102</td>\n",
503 |        "      <td>2018-08-09</td>\n",
504 |        "    </tr>\n",
505 |        "    <tr>\n",
506 |        "      <th>3</th>\n",
507 |        "      <td>A3</td>\n",
508 |        "      <td>孙凤</td>\n",
509 |        "      <td>103</td>\n",
510 |        "      <td>2018-08-10</td>\n",
511 |        "    </tr>\n",
512 |        "    <tr>\n",
513 |        "      <th>4</th>\n",
514 |        "      <td>A4</td>\n",
515 |        "      <td>赵恒</td>\n",
516 |        "      <td>104</td>\n",
517 |        "      <td>2018-08-11</td>\n",
518 |        "    </tr>\n",
519 |        "    <tr>\n",
520 |        "      <th>5</th>\n",
521 |        "      <td>A5</td>\n",
522 |        "      <td>赵恒</td>\n",
523 |        "      <td>104</td>\n",
524 |        "      <td>2018-08-11</td>\n",
525 |        "    </tr>\n",
526 |        "  </tbody>\n",
527 |        "</table>\n",
528 |        "</div>"
529 |       ],
530 |       "text/plain": [
531 |        "  订单编号 客户姓名  唯一识别码       成交时间\n",
532 |        "1   A1   张通    101 2018-08-08\n",
533 |        "2   A2   李谷    102 2018-08-09\n",
534 |        "3   A3   孙凤    103 2018-08-10\n",
535 |        "4   A4   赵恒    104 2018-08-11\n",
536 |        "5   A5   赵恒    104 2018-08-11"
537 |       ]
538 |      },
539 |      "execution_count": 49,
540 |      "metadata": {},
541 |      "output_type": "execute_result"
542 |     }
543 |    ],
544 |    "source": [
545 |     "import pandas as pd\n",
546 |     "df = pd.read_excel(r\"..\\Data\\Chapter05.xlsx\",sheet_name = 3,header= None)\n",
547 |     "df.columns = [\"订单编号\",\"客户姓名\",\"唯一识别码\",\"成交时间\"]  # header=None is needed in read_excel above; otherwise the first data row is consumed as the header\n",
548 |     "df.index = [1,2,3,4,5]\n",
549 |     "df\n"
550 |    ]
551 |   },
552 |   {
553 |    "cell_type": "markdown",
554 |    "metadata": {},
555 |    "source": [
556 |     "### Setting a column as the index"
557 |    ]
558 |   },
559 |   {
560 |    "cell_type": "code",
561 |    "execution_count": 66,
562 |    "metadata": {},
563 |    "outputs": [
564 |     {
565 |      "data": {
566 |       "text/html": [
567 |        "<div>\n",
568 |        "<style scoped>\n",
569 |        "    .dataframe tbody tr th:only-of-type {\n",
570 |        "        vertical-align: middle;\n",
571 |        "    }\n",
572 |        "\n",
573 |        "    .dataframe tbody tr th {\n",
574 |        "        vertical-align: top;\n",
575 |        "    }\n",
576 |        "\n",
577 |        "    .dataframe thead th {\n",
578 |        "        text-align: right;\n",
579 |        "    }\n",
580 |        "</style>\n",
581 |        "<table border=\"1\" class=\"dataframe\">\n",
582 |        "  <thead>\n",
583 |        "    <tr style=\"text-align: right;\">\n",
584 |        "      <th></th>\n",
585 |        "      <th>客户姓名</th>\n",
586 |        "      <th>唯一识别码</th>\n",
587 |        "      <th>成交时间</th>\n",
588 |        "    </tr>\n",
589 |        "    <tr>\n",
590 |        "      <th>订单编号</th>\n",
591 |        "      <th></th>\n",
592 |        "      <th></th>\n",
593 |        "      <th></th>\n",
594 |        "    </tr>\n",
595 |        "  </thead>\n",
596 |        "  <tbody>\n",
597 |        "    <tr>\n",
598 |        "      <th>A1</th>\n",
599 |        "      <td>张通</td>\n",
600 |        "      <td>101</td>\n",
601 |        "      <td>2018-08-08</td>\n",
602 |        "    </tr>\n",
603 |        "    <tr>\n",
604 |        "      <th>A2</th>\n",
605 |        "      <td>李谷</td>\n",
606 |        "      <td>102</td>\n",
607 |        "      <td>2018-08-09</td>\n",
608 |        "    </tr>\n",
609 |        "    <tr>\n",
610 |        "      <th>A3</th>\n",
611 |        "      <td>孙凤</td>\n",
612 |        "      <td>103</td>\n",
613 |        "      <td>2018-08-10</td>\n",
614 |        "    </tr>\n",
615 |        "    <tr>\n",
616 |        "      <th>A3</th>\n",
617 |        "      <td>孙凤</td>\n",
618 |        "      <td>103</td>\n",
619 |        "      <td>2018-08-10</td>\n",
620 |        "    </tr>\n",
621 |        "    <tr>\n",
622 |        "      <th>A4</th>\n",
623 |        "      <td>赵恒</td>\n",
624 |        "      <td>104</td>\n",
625 |        "      <td>2018-08-11</td>\n",
626 |        "    </tr>\n",
627 |        "    <tr>\n",
628 |        "      <th>A5</th>\n",
629 |        "      <td>赵恒</td>\n",
630 |        "      <td>104</td>\n",
631 |        "      <td>2018-08-11</td>\n",
632 |        "    </tr>\n",
633 |        "  </tbody>\n",
634 |        "</table>\n",
635 |        "</div>"
636 |       ],
637 |       "text/plain": [
638 |        "     客户姓名  唯一识别码       成交时间\n",
639 |        "订单编号                       \n",
640 |        "A1     张通    101 2018-08-08\n",
641 |        "A2     李谷    102 2018-08-09\n",
642 |        "A3     孙凤    103 2018-08-10\n",
643 |        "A3     孙凤    103 2018-08-10\n",
644 |        "A4     赵恒    104 2018-08-11\n",
645 |        "A5     赵恒    104 2018-08-11"
646 |       ]
647 |      },
648 |      "execution_count": 66,
649 |      "metadata": {},
650 |      "output_type": "execute_result"
651 |     }
652 |    ],
653 |    "source": [
654 |     "import pandas as pd\n",
655 |     "df = pd.read_excel(r\"..\\Data\\Chapter05.xlsx\",sheet_name = 2)\n",
656 |     "df.set_index(\"订单编号\")  # set_index() makes the given column the index"
657 |    ]
658 |   },
659 |   {
660 |    "cell_type": "markdown",
661 |    "metadata": {},
662 |    "source": [
663 |     "### Renaming the index"
664 |    ]
665 |   },
666 |   {
667 |    "cell_type": "code",
668 |    "execution_count": 82,
669 |    "metadata": {},
670 |    "outputs": [
671 |     {
672 |      "data": {
673 |       "text/html": [
674 |        "<div>\n",
675 |        "<style scoped>\n",
676 |        "    .dataframe tbody tr th:only-of-type {\n",
677 |        "        vertical-align: middle;\n",
678 |        "    }\n",
679 |        "\n",
680 |        "    .dataframe tbody tr th {\n",
681 |        "        vertical-align: top;\n",
682 |        "    }\n",
683 |        "\n",
684 |        "    .dataframe thead th {\n",
685 |        "        text-align: right;\n",
686 |        "    }\n",
687 |        "</style>\n",
688 |        "<table border=\"1\" class=\"dataframe\">\n",
689 |        "  <thead>\n",
690 |        "    <tr style=\"text-align: right;\">\n",
691 |        "      <th></th>\n",
692 |        "      <th>新订单编号</th>\n",
693 |        "      <th>新客户姓名</th>\n",
694 |        "      <th>唯一识别码</th>\n",
695 |        "      <th>成交时间</th>\n",
696 |        "    </tr>\n",
697 |        "  </thead>\n",
698 |        "  <tbody>\n",
699 |        "    <tr>\n",
700 |        "      <th>一</th>\n",
701 |        "      <td>A1</td>\n",
702 |        "      <td>张通</td>\n",
703 |        "      <td>101</td>\n",
704 |        "      <td>2018-08-08</td>\n",
705 |        "    </tr>\n",
706 |        "    <tr>\n",
707 |        "      <th>二</th>\n",
708 |        "      <td>A2</td>\n",
709 |        "      <td>李谷</td>\n",
710 |        "      <td>102</td>\n",
711 |        "      <td>2018-08-09</td>\n",
712 |        "    </tr>\n",
713 |        "    <tr>\n",
714 |        "      <th>三</th>\n",
715 |        "      <td>A3</td>\n",
716 |        "      <td>孙凤</td>\n",
717 |        "      <td>103</td>\n",
718 |        "      <td>2018-08-10</td>\n",
719 |        "    </tr>\n",
720 |        "    <tr>\n",
721 |        "      <th>四</th>\n",
722 |        "      <td>A4</td>\n",
723 |        "      <td>赵恒</td>\n",
724 |        "      <td>104</td>\n",
725 |        "      <td>2018-08-11</td>\n",
726 |        "    </tr>\n",
727 |        "    <tr>\n",
728 |        "      <th>5</th>\n",
729 |        "      <td>A5</td>\n",
730 |        "      <td>赵恒</td>\n",
731 |        "      <td>104</td>\n",
732 |        "      <td>2018-08-12</td>\n",
733 |        "    </tr>\n",
734 |        "  </tbody>\n",
735 |        "</table>\n",
736 |        "</div>"
737 |       ],
738 |       "text/plain": [
739 |        "  新订单编号 新客户姓名  唯一识别码       成交时间\n",
740 |        "一    A1    张通    101 2018-08-08\n",
741 |        "二    A2    李谷    102 2018-08-09\n",
742 |        "三    A3    孙凤    103 2018-08-10\n",
743 |        "四    A4    赵恒    104 2018-08-11\n",
744 |        "5    A5    赵恒    104 2018-08-12"
745 |       ]
746 |      },
747 |      "execution_count": 82,
748 |      "metadata": {},
749 |      "output_type": "execute_result"
750 |     }
751 |    ],
752 |    "source": [
753 |     "import pandas as pd\n",
754 |     "df = pd.read_excel(r\"..\\Data\\Chapter05.xlsx\",sheet_name = 4)\n",
755 |     "df.index = [1,2,3,4,5]  # assign a row index\n",
756 |     "df.rename(columns={\"订单编号\":\"新订单编号\",\"客户姓名\":\"新客户姓名\"})  # rename column labels\n",
757 |     "df.rename(index = {1:\"一\",2:\"二\",3:\"三\"})  # rename row labels\n",
758 |     "df.rename(columns={\"订单编号\":\"新订单编号\",\"客户姓名\":\"新客户姓名\"},index = {1:\"一\",2:\"二\",3:\"三\",4:'四'})  # rename column and row labels in one call"
759 |    ]
760 |   },
761 |   {
762 |    "cell_type": "markdown",
763 |    "metadata": {},
764 |    "source": [
765 |     "### Resetting the index"
766 |    ]
767 |   },
768 |   {
769 |    "cell_type": "code",
770 |    "execution_count": 7,
771 |    "metadata": {},
772 |    "outputs": [
773 |     {
774 |      "data": {
775 |       "text/html": [
776 |        "<div>\n",
777 |        "<style scoped>\n",
778 |        "    .dataframe tbody tr th:only-of-type {\n",
779 |        "        vertical-align: middle;\n",
780 |        "    }\n",
781 |        "\n",
782 |        "    .dataframe tbody tr th {\n",
783 |        "        vertical-align: top;\n",
784 |        "    }\n",
785 |        "\n",
786 |        "    .dataframe thead th {\n",
787 |        "        text-align: right;\n",
788 |        "    }\n",
789 |        "</style>\n",
790 |        "<table border=\"1\" class=\"dataframe\">\n",
791 |        "  <thead>\n",
792 |        "    <tr style=\"text-align: right;\">\n",
793 |        "      <th></th>\n",
794 |        "      <th>level_0</th>\n",
795 |        "      <th>level_1</th>\n",
796 |        "      <th>C1</th>\n",
797 |        "      <th>C2</th>\n",
798 |        "    </tr>\n",
799 |        "  </thead>\n",
800 |        "  <tbody>\n",
801 |        "    <tr>\n",
802 |        "      <th>0</th>\n",
803 |        "      <td>Z1</td>\n",
804 |        "      <td>Z2</td>\n",
805 |        "      <td>NaN</td>\n",
806 |        "      <td>NaN</td>\n",
807 |        "    </tr>\n",
808 |        "    <tr>\n",
809 |        "      <th>1</th>\n",
810 |        "      <td>A</td>\n",
811 |        "      <td>a</td>\n",
812 |        "      <td>1.0</td>\n",
813 |        "      <td>2.0</td>\n",
814 |        "    </tr>\n",
815 |        "    <tr>\n",
816 |        "      <th>2</th>\n",
817 |        "      <td>NaN</td>\n",
818 |        "      <td>b</td>\n",
819 |        "      <td>3.0</td>\n",
820 |        "      <td>4.0</td>\n",
821 |        "    </tr>\n",
822 |        "    <tr>\n",
823 |        "      <th>3</th>\n",
824 |        "      <td>B</td>\n",
825 |        "      <td>a</td>\n",
826 |        "      <td>5.0</td>\n",
827 |        "      <td>6.0</td>\n",
828 |        "    </tr>\n",
829 |        "    <tr>\n",
830 |        "      <th>4</th>\n",
831 |        "      <td>NaN</td>\n",
832 |        "      <td>b</td>\n",
833 |        "      <td>7.0</td>\n",
834 |        "      <td>8.0</td>\n",
835 |        "    </tr>\n",
836 |        "  </tbody>\n",
837 |        "</table>\n",
838 |        "</div>"
839 |       ],
840 |       "text/plain": [
841 |        "  level_0 level_1   C1   C2\n",
842 |        "0      Z1      Z2  NaN  NaN\n",
843 |        "1       A       a  1.0  2.0\n",
844 |        "2     NaN       b  3.0  4.0\n",
845 |        "3       B       a  5.0  6.0\n",
846 |        "4     NaN       b  7.0  8.0"
847 |       ]
848 |      },
849 |      "execution_count": 7,
850 |      "metadata": {},
851 |      "output_type": "execute_result"
852 |     }
853 |    ],
854 |    "source": [
855 |     "import pandas as pd\n",
856 |     "df = pd.read_excel(r\"..\\Data\\Chapter05.xlsx\",sheet_name=5)\n",
857 |     "df.reset_index()\n",
858 |     "# See Chapter 10 for details"
859 |    ]
860 |   }
861 |  ],
862 |  "metadata": {
863 |   "kernelspec": {
864 |    "display_name": "Python 3",
865 |    "language": "python",
866 |    "name": "python3"
867 |   },
868 |   "language_info": {
869 |    "codemirror_mode": {
870 |     "name": "ipython",
871 |     "version": 3
872 |    },
873 |    "file_extension": ".py",
874 |    "mimetype": "text/x-python",
875 |    "name": "python",
876 |    "nbconvert_exporter": "python",
877 |    "pygments_lexer": "ipython3",
878 |    "version": "3.7.0"
879 |   },
880 |   "toc": {
881 |    "base_numbering": 1,
882 |    "nav_menu": {},
883 |    "number_sections": true,
884 |    "sideBar": true,
885 |    "skip_h1_title": false,
886 |    "title_cell": "Table of Contents",
887 |    "title_sidebar": "Chapter 5 Data Preprocessing",
888 |    "toc_cell": false,
889 |    "toc_position": {},
890 |    "toc_section_display": true,
891 |    "toc_window_display": true
892 |   }
893 |  },
894 |  "nbformat": 4,
895 |  "nbformat_minor": 2
896 | }
897 | 
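
The outlier section of the Chapter 05 notebook lists three handling strategies but shows no code. Below is a minimal sketch of the first two (deleting, and treating as missing then filling). The sample data, the 1.5 * IQR detection rule, and the median fill are all illustrative assumptions; none of them comes from the notebook or from Chapter05.xlsx.

```python
import pandas as pd

# Hypothetical data; the 年龄 values are made up for illustration.
df = pd.DataFrame({
    "编号": ["A1", "A2", "A3", "A4", "A5"],
    "年龄": [54.0, 30.0, 47.0, 41.0, 240.0],
})

# Assumed detection rule: flag values more than 1.5 * IQR outside the quartiles.
q1 = df["年龄"].quantile(0.25)
q3 = df["年龄"].quantile(0.75)
iqr = q3 - q1
is_outlier = (df["年龄"] < q1 - 1.5 * iqr) | (df["年龄"] > q3 + 1.5 * iqr)

# Strategy 1: delete the rows that hold outliers.
dropped = df[~is_outlier]

# Strategy 2: treat outliers as missing values, then fill them
# (here with the median of the remaining values).
masked = df["年龄"].mask(is_outlier)
filled = df.copy()
filled["年龄"] = masked.fillna(masked.median())
```

Which strategy fits depends on the data; the third option in the list, investigating why the outlier occurred, is an analysis step rather than a transformation.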


--------------------------------------------------------------------------------
/Code/Chapter06 数据选择.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "## Selecting columns"
  8 |    ]
  9 |   },
 10 |   {
 11 |    "cell_type": "markdown",
 12 |    "metadata": {},
 13 |    "source": [
 14 |     "### Selecting one or several columns"
 15 |    ]
 16 |   },
 17 |   {
 18 |    "cell_type": "code",
 19 |    "execution_count": 8,
 20 |    "metadata": {
 21 |     "scrolled": true
 22 |    },
 23 |    "outputs": [
 24 |     {
 25 |      "data": {
 26 |       "text/html": [
 27 |        "<div>\n",
 28 |        "<style scoped>\n",
 29 |        "    .dataframe tbody tr th:only-of-type {\n",
 30 |        "        vertical-align: middle;\n",
 31 |        "    }\n",
 32 |        "\n",
 33 |        "    .dataframe tbody tr th {\n",
 34 |        "        vertical-align: top;\n",
 35 |        "    }\n",
 36 |        "\n",
 37 |        "    .dataframe thead th {\n",
 38 |        "        text-align: right;\n",
 39 |        "    }\n",
 40 |        "</style>\n",
 41 |        "<table border=\"1\" class=\"dataframe\">\n",
 42 |        "  <thead>\n",
 43 |        "    <tr style=\"text-align: right;\">\n",
 44 |        "      <th></th>\n",
 45 |        "      <th>订单编号</th>\n",
 46 |        "      <th>唯一识别码</th>\n",
 47 |        "    </tr>\n",
 48 |        "  </thead>\n",
 49 |        "  <tbody>\n",
 50 |        "    <tr>\n",
 51 |        "      <th>0</th>\n",
 52 |        "      <td>A1</td>\n",
 53 |        "      <td>101</td>\n",
 54 |        "    </tr>\n",
 55 |        "    <tr>\n",
 56 |        "      <th>1</th>\n",
 57 |        "      <td>A2</td>\n",
 58 |        "      <td>102</td>\n",
 59 |        "    </tr>\n",
 60 |        "    <tr>\n",
 61 |        "      <th>2</th>\n",
 62 |        "      <td>A3</td>\n",
 63 |        "      <td>103</td>\n",
 64 |        "    </tr>\n",
 65 |        "    <tr>\n",
 66 |        "      <th>3</th>\n",
 67 |        "      <td>A3</td>\n",
 68 |        "      <td>103</td>\n",
 69 |        "    </tr>\n",
 70 |        "    <tr>\n",
 71 |        "      <th>4</th>\n",
 72 |        "      <td>A4</td>\n",
 73 |        "      <td>104</td>\n",
 74 |        "    </tr>\n",
 75 |        "    <tr>\n",
 76 |        "      <th>5</th>\n",
 77 |        "      <td>A5</td>\n",
 78 |        "      <td>104</td>\n",
 79 |        "    </tr>\n",
 80 |        "  </tbody>\n",
 81 |        "</table>\n",
 82 |        "</div>"
 83 |       ],
 84 |       "text/plain": [
 85 |        "  订单编号  唯一识别码\n",
 86 |        "0   A1    101\n",
 87 |        "1   A2    102\n",
 88 |        "2   A3    103\n",
 89 |        "3   A3    103\n",
 90 |        "4   A4    104\n",
 91 |        "5   A5    104"
 92 |       ]
 93 |      },
 94 |      "execution_count": 8,
 95 |      "metadata": {},
 96 |      "output_type": "execute_result"
 97 |     }
 98 |    ],
 99 |    "source": [
100 |     "import pandas as pd\n",
101 |     "df = pd.read_excel(r\"..\\Data\\Chapter06.xlsx\",sheet_name = 0)\n",
102 |     "# Selecting data by passing column names is called plain (label) indexing\n",
103 |     "df\n",
104 |     "df['客户姓名']\n",
105 |     "df[['订单编号','客户姓名']]\n",
106 |     "# Selecting data by passing explicit positions is called positional indexing\n",
107 |     "df.iloc[:,[0,2]]  # get the 1st and 3rd columns; ':' selects all rows"
108 |    ]
109 |   },
110 |   {
111 |    "cell_type": "markdown",
112 |    "metadata": {},
113 |    "source": [
114 |     "### Selecting a contiguous range of columns"
115 |    ]
116 |   },
117 |   {
118 |    "cell_type": "code",
119 |    "execution_count": 11,
120 |    "metadata": {},
121 |    "outputs": [
122 |     {
123 |      "data": {
124 |       "text/html": [
125 |        "<div>\n",
126 |        "<style scoped>\n",
127 |        "    .dataframe tbody tr th:only-of-type {\n",
128 |        "        vertical-align: middle;\n",
129 |        "    }\n",
130 |        "\n",
131 |        "    .dataframe tbody tr th {\n",
132 |        "        vertical-align: top;\n",
133 |        "    }\n",
134 |        "\n",
135 |        "    .dataframe thead th {\n",
136 |        "        text-align: right;\n",
137 |        "    }\n",
138 |        "</style>\n",
139 |        "<table border=\"1\" class=\"dataframe\">\n",
140 |        "  <thead>\n",
141 |        "    <tr style=\"text-align: right;\">\n",
142 |        "      <th></th>\n",
143 |        "      <th>订单编号</th>\n",
144 |        "      <th>客户姓名</th>\n",
145 |        "      <th>唯一识别码</th>\n",
146 |        "    </tr>\n",
147 |        "  </thead>\n",
148 |        "  <tbody>\n",
149 |        "    <tr>\n",
150 |        "      <th>0</th>\n",
151 |        "      <td>A1</td>\n",
152 |        "      <td>张通</td>\n",
153 |        "      <td>101</td>\n",
154 |        "    </tr>\n",
155 |        "    <tr>\n",
156 |        "      <th>1</th>\n",
157 |        "      <td>A2</td>\n",
158 |        "      <td>李谷</td>\n",
159 |        "      <td>102</td>\n",
160 |        "    </tr>\n",
161 |        "    <tr>\n",
162 |        "      <th>2</th>\n",
163 |        "      <td>A3</td>\n",
164 |        "      <td>孙凤</td>\n",
165 |        "      <td>103</td>\n",
166 |        "    </tr>\n",
167 |        "    <tr>\n",
168 |        "      <th>3</th>\n",
169 |        "      <td>A3</td>\n",
170 |        "      <td>孙凤</td>\n",
171 |        "      <td>103</td>\n",
172 |        "    </tr>\n",
173 |        "    <tr>\n",
174 |        "      <th>4</th>\n",
175 |        "      <td>A4</td>\n",
176 |        "      <td>赵恒</td>\n",
177 |        "      <td>104</td>\n",
178 |        "    </tr>\n",
179 |        "    <tr>\n",
180 |        "      <th>5</th>\n",
181 |        "      <td>A5</td>\n",
182 |        "      <td>赵恒</td>\n",
183 |        "      <td>104</td>\n",
184 |        "    </tr>\n",
185 |        "  </tbody>\n",
186 |        "</table>\n",
187 |        "</div>"
188 |       ],
189 |       "text/plain": [
190 |        "  订单编号 客户姓名  唯一识别码\n",
191 |        "0   A1   张通    101\n",
192 |        "1   A2   李谷    102\n",
193 |        "2   A3   孙凤    103\n",
194 |        "3   A3   孙凤    103\n",
195 |        "4   A4   赵恒    104\n",
196 |        "5   A5   赵恒    104"
197 |       ]
198 |      },
199 |      "execution_count": 11,
200 |      "metadata": {},
201 |      "output_type": "execute_result"
202 |     }
203 |    ],
204 |    "source": [
205 |     "# Selecting data by passing a position range is called slice indexing\n",
206 |     "df.iloc[:,0:3]  # select columns 1 through 3 (the slice includes the 1st column but excludes the 4th)"
207 |    ]
208 |   },
209 |   {
210 |    "cell_type": "markdown",
211 |    "metadata": {},
212 |    "source": [
213 |     "## Selecting rows"
214 |    ]
215 |   },
216 |   {
217 |    "cell_type": "markdown",
218 |    "metadata": {},
219 |    "source": [
220 |     "### Selecting one or several rows"
221 |    ]
222 |   },
223 |   {
224 |    "cell_type": "code",
225 |    "execution_count": 18,
226 |    "metadata": {},
227 |    "outputs": [
228 |     {
229 |      "data": {
230 |       "text/html": [
231 |        "<div>\n",
232 |        "<style scoped>\n",
233 |        "    .dataframe tbody tr th:only-of-type {\n",
234 |        "        vertical-align: middle;\n",
235 |        "    }\n",
236 |        "\n",
237 |        "    .dataframe tbody tr th {\n",
238 |        "        vertical-align: top;\n",
239 |        "    }\n",
240 |        "\n",
241 |        "    .dataframe thead th {\n",
242 |        "        text-align: right;\n",
243 |        "    }\n",
244 |        "</style>\n",
245 |        "<table border=\"1\" class=\"dataframe\">\n",
246 |        "  <thead>\n",
247 |        "    <tr style=\"text-align: right;\">\n",
248 |        "      <th></th>\n",
249 |        "      <th>订单编号</th>\n",
250 |        "      <th>客户姓名</th>\n",
251 |        "      <th>唯一识别码</th>\n",
252 |        "      <th>成交时间</th>\n",
253 |        "    </tr>\n",
254 |        "  </thead>\n",
255 |        "  <tbody>\n",
256 |        "    <tr>\n",
257 |        "      <th>一</th>\n",
258 |        "      <td>A1</td>\n",
259 |        "      <td>张通</td>\n",
260 |        "      <td>101</td>\n",
261 |        "      <td>2018-08-08</td>\n",
262 |        "    </tr>\n",
263 |        "    <tr>\n",
264 |        "      <th>二</th>\n",
265 |        "      <td>A2</td>\n",
266 |        "      <td>李谷</td>\n",
267 |        "      <td>102</td>\n",
268 |        "      <td>2018-08-09</td>\n",
269 |        "    </tr>\n",
270 |        "  </tbody>\n",
271 |        "</table>\n",
272 |        "</div>"
273 |       ],
274 |       "text/plain": [
275 |        "  订单编号 客户姓名  唯一识别码       成交时间\n",
276 |        "一   A1   张通    101 2018-08-08\n",
277 |        "二   A2   李谷    102 2018-08-09"
278 |       ]
279 |      },
280 |      "execution_count": 18,
281 |      "metadata": {},
282 |      "output_type": "execute_result"
283 |     }
284 |    ],
285 |    "source": [
286 |     "# Label-based (plain) indexing with the loc indexer\n",
287 |     "df.index = [\"一\",\"二\",\"三\",\"四\",\"五\",\"六\"]\n",
288 |     "df.loc[\"一\"]\n",
289 |     "df.loc[[\"一\",\"二\"]]\n",
290 |     "# Positional indexing with the iloc indexer\n",
291 |     "df.iloc[0]\n",
292 |     "df.iloc[[0,1]]  # select the first and second rows"
293 |    ]
294 |   },
295 |   {
296 |    "cell_type": "markdown",
297 |    "metadata": {},
298 |    "source": [
299 |     "### Selecting a contiguous range of rows"
300 |    ]
301 |   },
302 |   {
303 |    "cell_type": "code",
304 |    "execution_count": 19,
305 |    "metadata": {
306 |     "scrolled": true
307 |    },
308 |    "outputs": [
309 |     {
310 |      "data": {
311 |       "text/html": [
312 |        "<div>\n",
313 |        "<style scoped>\n",
314 |        "    .dataframe tbody tr th:only-of-type {\n",
315 |        "        vertical-align: middle;\n",
316 |        "    }\n",
317 |        "\n",
318 |        "    .dataframe tbody tr th {\n",
319 |        "        vertical-align: top;\n",
320 |        "    }\n",
321 |        "\n",
322 |        "    .dataframe thead th {\n",
323 |        "        text-align: right;\n",
324 |        "    }\n",
325 |        "</style>\n",
326 |        "<table border=\"1\" class=\"dataframe\">\n",
327 |        "  <thead>\n",
328 |        "    <tr style=\"text-align: right;\">\n",
329 |        "      <th></th>\n",
330 |        "      <th>订单编号</th>\n",
331 |        "      <th>客户姓名</th>\n",
332 |        "      <th>唯一识别码</th>\n",
333 |        "      <th>成交时间</th>\n",
334 |        "    </tr>\n",
335 |        "  </thead>\n",
336 |        "  <tbody>\n",
337 |        "    <tr>\n",
338 |        "      <th>一</th>\n",
339 |        "      <td>A1</td>\n",
340 |        "      <td>张通</td>\n",
341 |        "      <td>101</td>\n",
342 |        "      <td>2018-08-08</td>\n",
343 |        "    </tr>\n",
344 |        "    <tr>\n",
345 |        "      <th>二</th>\n",
346 |        "      <td>A2</td>\n",
347 |        "      <td>李谷</td>\n",
348 |        "      <td>102</td>\n",
349 |        "      <td>2018-08-09</td>\n",
350 |        "    </tr>\n",
351 |        "    <tr>\n",
352 |        "      <th>三</th>\n",
353 |        "      <td>A3</td>\n",
354 |        "      <td>孙凤</td>\n",
355 |        "      <td>103</td>\n",
356 |        "      <td>2018-08-10</td>\n",
357 |        "    </tr>\n",
358 |        "  </tbody>\n",
359 |        "</table>\n",
360 |        "</div>"
361 |       ],
362 |       "text/plain": [
363 |        "  订单编号 客户姓名  唯一识别码       成交时间\n",
364 |        "一   A1   张通    101 2018-08-08\n",
365 |        "二   A2   李谷    102 2018-08-09\n",
366 |        "三   A3   孙凤    103 2018-08-10"
367 |       ]
368 |      },
369 |      "execution_count": 19,
370 |      "metadata": {},
371 |      "output_type": "execute_result"
372 |     }
373 |    ],
374 |    "source": [
375 |     "df.iloc[0:3]  # rows at positions 0-2 (the first three rows); the end position 3 is excluded"
376 |    ]
377 |   },
378 |   {
379 |    "cell_type": "markdown",
380 |    "metadata": {},
381 |    "source": [
382 |     "### Selecting rows that satisfy a condition"
383 |    ]
384 |   },
385 |   {
386 |    "cell_type": "code",
387 |    "execution_count": 21,
388 |    "metadata": {},
389 |    "outputs": [
390 |     {
391 |      "data": {
392 |       "text/html": [
393 |        "<div>\n",
394 |        "<style scoped>\n",
395 |        "    .dataframe tbody tr th:only-of-type {\n",
396 |        "        vertical-align: middle;\n",
397 |        "    }\n",
398 |        "\n",
399 |        "    .dataframe tbody tr th {\n",
400 |        "        vertical-align: top;\n",
401 |        "    }\n",
402 |        "\n",
403 |        "    .dataframe thead th {\n",
404 |        "        text-align: right;\n",
405 |        "    }\n",
406 |        "</style>\n",
407 |        "<table border=\"1\" class=\"dataframe\">\n",
408 |        "  <thead>\n",
409 |        "    <tr style=\"text-align: right;\">\n",
410 |        "      <th></th>\n",
411 |        "      <th>订单编号</th>\n",
412 |        "      <th>客户姓名</th>\n",
413 |        "      <th>唯一识别码</th>\n",
414 |        "      <th>年龄</th>\n",
415 |        "      <th>成交时间</th>\n",
416 |        "    </tr>\n",
417 |        "  </thead>\n",
418 |        "  <tbody>\n",
419 |        "    <tr>\n",
420 |        "      <th>0</th>\n",
421 |        "      <td>A1</td>\n",
422 |        "      <td>张通</td>\n",
423 |        "      <td>101.0</td>\n",
424 |        "      <td>31.0</td>\n",
425 |        "      <td>2018-08-08</td>\n",
426 |        "    </tr>\n",
427 |        "  </tbody>\n",
428 |        "</table>\n",
429 |        "</div>"
430 |       ],
431 |       "text/plain": [
432 |        "  订单编号 客户姓名  唯一识别码    年龄       成交时间\n",
433 |        "0   A1   张通  101.0  31.0 2018-08-08"
434 |       ]
435 |      },
436 |      "execution_count": 21,
437 |      "metadata": {},
438 |      "output_type": "execute_result"
439 |     }
440 |    ],
441 |    "source": [
442 |     "import pandas as pd\n",
443 |     "df = pd.read_excel(r\"../Data/Chapter06.xlsx\",sheet_name=3)\n",
444 |     "df\n",
445 |     "# Select rows where 年龄 (age) is less than 200\n",
446 |     "df[df['年龄']<200]\n",
447 |     "# Select rows where 年龄 < 200 and 唯一识别码 (unique ID) < 102; wrap each condition in parentheses\n",
448 |     "df[(df['年龄']<200) & (df['唯一识别码']<102)]"
449 |    ]
450 |   },
451 |   {
452 |    "cell_type": "markdown",
453 |    "metadata": {},
454 |    "source": [
455 |     "## Selecting rows and columns together"
456 |    ]
457 |   },
458 |   {
459 |    "cell_type": "markdown",
460 |    "metadata": {},
461 |    "source": [
462 |     "### Label index + label index: selecting specific rows and columns"
463 |    ]
464 |   },
465 |   {
466 |    "cell_type": "code",
467 |    "execution_count": 20,
468 |    "metadata": {},
469 |    "outputs": [
470 |     {
471 |      "data": {
472 |       "text/html": [
473 |        "<div>\n",
474 |        "<style scoped>\n",
475 |        "    .dataframe tbody tr th:only-of-type {\n",
476 |        "        vertical-align: middle;\n",
477 |        "    }\n",
478 |        "\n",
479 |        "    .dataframe tbody tr th {\n",
480 |        "        vertical-align: top;\n",
481 |        "    }\n",
482 |        "\n",
483 |        "    .dataframe thead th {\n",
484 |        "        text-align: right;\n",
485 |        "    }\n",
486 |        "</style>\n",
487 |        "<table border=\"1\" class=\"dataframe\">\n",
488 |        "  <thead>\n",
489 |        "    <tr style=\"text-align: right;\">\n",
490 |        "      <th></th>\n",
491 |        "      <th>订单编号</th>\n",
492 |        "      <th>客户姓名</th>\n",
493 |        "      <th>唯一识别码</th>\n",
494 |        "    </tr>\n",
495 |        "  </thead>\n",
496 |        "  <tbody>\n",
497 |        "    <tr>\n",
498 |        "      <th>一</th>\n",
499 |        "      <td>A1</td>\n",
500 |        "      <td>张通</td>\n",
501 |        "      <td>101</td>\n",
502 |        "    </tr>\n",
503 |        "    <tr>\n",
504 |        "      <th>二</th>\n",
505 |        "      <td>A2</td>\n",
506 |        "      <td>李谷</td>\n",
507 |        "      <td>102</td>\n",
508 |        "    </tr>\n",
509 |        "  </tbody>\n",
510 |        "</table>\n",
511 |        "</div>"
512 |       ],
513 |       "text/plain": [
514 |        "  订单编号 客户姓名  唯一识别码\n",
515 |        "一   A1   张通    101\n",
516 |        "二   A2   李谷    102"
517 |       ]
518 |      },
519 |      "execution_count": 20,
520 |      "metadata": {},
521 |      "output_type": "execute_result"
522 |     }
523 |    ],
524 |    "source": [
525 |     "import pandas as pd\n",
526 |     "df = pd.read_excel(r\"../Data/Chapter06.xlsx\",sheet_name=4)\n",
527 |     "df.index = [\"一\",\"二\",\"三\",\"四\",\"五\"]\n",
528 |     "# Pass row and column labels to loc\n",
529 |     "df.loc[[\"一\",\"二\"],[\"订单编号\",\"客户姓名\",\"唯一识别码\"]]"
530 |    ]
531 |   },
532 |   {
533 |    "cell_type": "markdown",
534 |    "metadata": {},
535 |    "source": [
536 |     "### Positional index + positional index: selecting specific rows and columns"
537 |    ]
538 |   },
539 |   {
540 |    "cell_type": "code",
541 |    "execution_count": 16,
542 |    "metadata": {},
543 |    "outputs": [
544 |     {
545 |      "data": {
546 |       "text/html": [
547 |        "<div>\n",
548 |        "<style scoped>\n",
549 |        "    .dataframe tbody tr th:only-of-type {\n",
550 |        "        vertical-align: middle;\n",
551 |        "    }\n",
552 |        "\n",
553 |        "    .dataframe tbody tr th {\n",
554 |        "        vertical-align: top;\n",
555 |        "    }\n",
556 |        "\n",
557 |        "    .dataframe thead th {\n",
558 |        "        text-align: right;\n",
559 |        "    }\n",
560 |        "</style>\n",
561 |        "<table border=\"1\" class=\"dataframe\">\n",
562 |        "  <thead>\n",
563 |        "    <tr style=\"text-align: right;\">\n",
564 |        "      <th></th>\n",
565 |        "      <th>订单编号</th>\n",
566 |        "      <th>唯一识别码</th>\n",
567 |        "    </tr>\n",
568 |        "  </thead>\n",
569 |        "  <tbody>\n",
570 |        "    <tr>\n",
571 |        "      <th>一</th>\n",
572 |        "      <td>A1</td>\n",
573 |        "      <td>101</td>\n",
574 |        "    </tr>\n",
575 |        "    <tr>\n",
576 |        "      <th>二</th>\n",
577 |        "      <td>A2</td>\n",
578 |        "      <td>102</td>\n",
579 |        "    </tr>\n",
580 |        "  </tbody>\n",
581 |        "</table>\n",
582 |        "</div>"
583 |       ],
584 |       "text/plain": [
585 |        "  订单编号  唯一识别码\n",
586 |        "一   A1    101\n",
587 |        "二   A2    102"
588 |       ]
589 |      },
590 |      "execution_count": 16,
591 |      "metadata": {},
592 |      "output_type": "execute_result"
593 |     }
594 |    ],
595 |    "source": [
596 |     "# Pass row and column positions to iloc\n",
597 |     "df.iloc[[0,1],[0,2]]"
598 |    ]
599 |   },
600 |   {
601 |    "cell_type": "markdown",
602 |    "metadata": {},
603 |    "source": [
604 |     "### Boolean index + label index: selecting specific rows and columns"
605 |    ]
606 |   },
607 |   {
608 |    "cell_type": "code",
609 |    "execution_count": 12,
610 |    "metadata": {},
611 |    "outputs": [
612 |     {
613 |      "data": {
614 |       "text/html": [
615 |        "<div>\n",
616 |        "<style scoped>\n",
617 |        "    .dataframe tbody tr th:only-of-type {\n",
618 |        "        vertical-align: middle;\n",
619 |        "    }\n",
620 |        "\n",
621 |        "    .dataframe tbody tr th {\n",
622 |        "        vertical-align: top;\n",
623 |        "    }\n",
624 |        "\n",
625 |        "    .dataframe thead th {\n",
626 |        "        text-align: right;\n",
627 |        "    }\n",
628 |        "</style>\n",
629 |        "<table border=\"1\" class=\"dataframe\">\n",
630 |        "  <thead>\n",
631 |        "    <tr style=\"text-align: right;\">\n",
632 |        "      <th></th>\n",
633 |        "      <th>订单编号</th>\n",
634 |        "      <th>年龄</th>\n",
635 |        "    </tr>\n",
636 |        "  </thead>\n",
637 |        "  <tbody>\n",
638 |        "    <tr>\n",
639 |        "      <th>一</th>\n",
640 |        "      <td>A1</td>\n",
641 |        "      <td>31</td>\n",
642 |        "    </tr>\n",
643 |        "    <tr>\n",
644 |        "      <th>二</th>\n",
645 |        "      <td>A2</td>\n",
646 |        "      <td>45</td>\n",
647 |        "    </tr>\n",
648 |        "    <tr>\n",
649 |        "      <th>三</th>\n",
650 |        "      <td>A3</td>\n",
651 |        "      <td>23</td>\n",
652 |        "    </tr>\n",
653 |        "  </tbody>\n",
654 |        "</table>\n",
655 |        "</div>"
656 |       ],
657 |       "text/plain": [
658 |        "  订单编号  年龄\n",
659 |        "一   A1  31\n",
660 |        "二   A2  45\n",
661 |        "三   A3  23"
662 |       ]
663 |      },
664 |      "execution_count": 12,
665 |      "metadata": {},
666 |      "output_type": "execute_result"
667 |     }
668 |    ],
669 |    "source": [
670 |     "# Filter rows with a boolean mask first, then select columns by label\n",
671 |     "df[df[\"年龄\"]<200][[\"订单编号\",\"年龄\"]]"
672 |    ]
673 |   },
674 |   {
675 |    "cell_type": "markdown",
676 |    "metadata": {},
677 |    "source": [
678 |     "### Slice index + slice index: selecting specific rows and columns"
679 |    ]
680 |   },
681 |   {
682 |    "cell_type": "code",
683 |    "execution_count": 17,
684 |    "metadata": {},
685 |    "outputs": [
686 |     {
687 |      "data": {
688 |       "text/html": [
689 |        "<div>\n",
690 |        "<style scoped>\n",
691 |        "    .dataframe tbody tr th:only-of-type {\n",
692 |        "        vertical-align: middle;\n",
693 |        "    }\n",
694 |        "\n",
695 |        "    .dataframe tbody tr th {\n",
696 |        "        vertical-align: top;\n",
697 |        "    }\n",
698 |        "\n",
699 |        "    .dataframe thead th {\n",
700 |        "        text-align: right;\n",
701 |        "    }\n",
702 |        "</style>\n",
703 |        "<table border=\"1\" class=\"dataframe\">\n",
704 |        "  <thead>\n",
705 |        "    <tr style=\"text-align: right;\">\n",
706 |        "      <th></th>\n",
707 |        "      <th>客户姓名</th>\n",
708 |        "      <th>唯一识别码</th>\n",
709 |        "      <th>年龄</th>\n",
710 |        "    </tr>\n",
711 |        "  </thead>\n",
712 |        "  <tbody>\n",
713 |        "    <tr>\n",
714 |        "      <th>一</th>\n",
715 |        "      <td>张通</td>\n",
716 |        "      <td>101</td>\n",
717 |        "      <td>31</td>\n",
718 |        "    </tr>\n",
719 |        "    <tr>\n",
720 |        "      <th>二</th>\n",
721 |        "      <td>李谷</td>\n",
722 |        "      <td>102</td>\n",
723 |        "      <td>45</td>\n",
724 |        "    </tr>\n",
725 |        "    <tr>\n",
726 |        "      <th>三</th>\n",
727 |        "      <td>孙凤</td>\n",
728 |        "      <td>103</td>\n",
729 |        "      <td>23</td>\n",
730 |        "    </tr>\n",
731 |        "  </tbody>\n",
732 |        "</table>\n",
733 |        "</div>"
734 |       ],
735 |       "text/plain": [
736 |        "  客户姓名  唯一识别码  年龄\n",
737 |        "一   张通    101  31\n",
738 |        "二   李谷    102  45\n",
739 |        "三   孙凤    103  23"
740 |       ]
741 |      },
742 |      "execution_count": 17,
743 |      "metadata": {},
744 |      "output_type": "execute_result"
745 |     }
746 |    ],
747 |    "source": [
748 |     "import pandas as pd\n",
749 |     "df = pd.read_excel(r\"../Data/Chapter06.xlsx\",sheet_name=4)\n",
750 |     "df.index = [\"一\",\"二\",\"三\",\"四\",\"五\"]\n",
751 |     "# The first argument to iloc is the row range; the second is the column range\n",
752 |     "df.iloc[0:3,1:4]\n"
753 |    ]
754 |   },
755 |   {
756 |    "cell_type": "markdown",
757 |    "metadata": {},
758 |    "source": [
759 |     "### Slice index + label index: selecting specific rows and columns"
760 |    ]
761 |   },
762 |   {
763 |    "cell_type": "code",
764 |    "execution_count": 19,
765 |    "metadata": {},
766 |    "outputs": [
767 |     {
768 |      "name": "stderr",
769 |      "output_type": "stream",
770 |      "text": [
771 |       "D:\\Anaconda3\\lib\\site-packages\\ipykernel_launcher.py:2: DeprecationWarning: \n",
772 |       ".ix is deprecated. Please use\n",
773 |       ".loc for label based indexing or\n",
774 |       ".iloc for positional indexing\n",
775 |       "\n",
776 |       "See the documentation here:\n",
777 |       "http://pandas.pydata.org/pandas-docs/stable/indexing.html#ix-indexer-is-deprecated\n",
778 |       "  \n"
779 |      ]
780 |     },
781 |     {
782 |      "data": {
783 |       "text/html": [
784 |        "<div>\n",
785 |        "<style scoped>\n",
786 |        "    .dataframe tbody tr th:only-of-type {\n",
787 |        "        vertical-align: middle;\n",
788 |        "    }\n",
789 |        "\n",
790 |        "    .dataframe tbody tr th {\n",
791 |        "        vertical-align: top;\n",
792 |        "    }\n",
793 |        "\n",
794 |        "    .dataframe thead th {\n",
795 |        "        text-align: right;\n",
796 |        "    }\n",
797 |        "</style>\n",
798 |        "<table border=\"1\" class=\"dataframe\">\n",
799 |        "  <thead>\n",
800 |        "    <tr style=\"text-align: right;\">\n",
801 |        "      <th></th>\n",
802 |        "      <th>客户姓名</th>\n",
803 |        "      <th>唯一识别码</th>\n",
804 |        "    </tr>\n",
805 |        "  </thead>\n",
806 |        "  <tbody>\n",
807 |        "    <tr>\n",
808 |        "      <th>一</th>\n",
809 |        "      <td>张通</td>\n",
810 |        "      <td>101</td>\n",
811 |        "    </tr>\n",
812 |        "    <tr>\n",
813 |        "      <th>二</th>\n",
814 |        "      <td>李谷</td>\n",
815 |        "      <td>102</td>\n",
816 |        "    </tr>\n",
817 |        "    <tr>\n",
818 |        "      <th>三</th>\n",
819 |        "      <td>孙凤</td>\n",
820 |        "      <td>103</td>\n",
821 |        "    </tr>\n",
822 |        "  </tbody>\n",
823 |        "</table>\n",
824 |        "</div>"
825 |       ],
826 |       "text/plain": [
827 |        "  客户姓名  唯一识别码\n",
828 |        "一   张通    101\n",
829 |        "二   李谷    102\n",
830 |        "三   孙凤    103"
831 |       ]
832 |      },
833 |      "execution_count": 19,
834 |      "metadata": {},
835 |      "output_type": "execute_result"
836 |     }
837 |    ],
838 |    "source": [
839 |     "df\n",
840 |     "df.loc[df.index[0:3],[\"客户姓名\",\"唯一识别码\"]]  # .loc replaces the deprecated .ix (removed in pandas 1.0)\n",
841 |     "df.iloc[0:3][[\"客户姓名\",\"唯一识别码\"]]"
842 |    ]
843 |   }
844 |  ],
845 |  "metadata": {
846 |   "kernelspec": {
847 |    "display_name": "Python 3",
848 |    "language": "python",
849 |    "name": "python3"
850 |   },
851 |   "language_info": {
852 |    "codemirror_mode": {
853 |     "name": "ipython",
854 |     "version": 3
855 |    },
856 |    "file_extension": ".py",
857 |    "mimetype": "text/x-python",
858 |    "name": "python",
859 |    "nbconvert_exporter": "python",
860 |    "pygments_lexer": "ipython3",
861 |    "version": "3.7.0"
862 |   },
863 |   "toc": {
864 |    "base_numbering": 1,
865 |    "nav_menu": {},
866 |    "number_sections": true,
867 |    "sideBar": true,
868 |    "skip_h1_title": false,
869 |    "title_cell": "Table of Contents",
870 |    "title_sidebar": "第6章 数据选择",
871 |    "toc_cell": false,
872 |    "toc_position": {
873 |     "height": "calc(100% - 180px)",
874 |     "left": "10px",
875 |     "top": "150px",
876 |     "width": "320px"
877 |    },
878 |    "toc_section_display": true,
879 |    "toc_window_display": true
880 |   }
881 |  },
882 |  "nbformat": 4,
883 |  "nbformat_minor": 2
884 | }
885 | 
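
Note on the notebook's final cell: it mixes a positional row slice with column labels. The sketch below shows two ways to express that selection using only `.loc` or only `.iloc`. The DataFrame is a hypothetical stand-in for the Chapter06.xlsx sheet (the last two rows are invented for illustration), and the names `cols`, `by_label`, and `by_position` are assumptions, not taken from the notebook.

```python
import pandas as pd

# Hypothetical stand-in for the Chapter06.xlsx sheet; values are illustrative only.
df = pd.DataFrame(
    {
        "订单编号": ["A1", "A2", "A3", "A4", "A5"],
        "客户姓名": ["张通", "李谷", "孙凤", "赵恒", "王娜"],
        "唯一识别码": [101, 102, 103, 104, 105],
    },
    index=["一", "二", "三", "四", "五"],
)

cols = ["客户姓名", "唯一识别码"]

# Label-based: turn the positional row slice into row labels first.
by_label = df.loc[df.index[0:3], cols]

# Position-based: turn the column labels into integer positions first.
by_position = df.iloc[0:3, df.columns.get_indexer(cols)]

# Both selections cover the first three rows and the two named columns.
print(by_label.equals(by_position))
```

Either form avoids chained indexing such as `df.iloc[0:3][cols]`, which works for reading but is best avoided when assigning values.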


--------------------------------------------------------------------------------
/Code/Chapter08 数据运算.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# Data Operations"
  8 |    ]
  9 |   },
 10 |   {
 11 |    "cell_type": "markdown",
 12 |    "metadata": {},
 13 |    "source": [
 14 |     "## Arithmetic operations"
 15 |    ]
 16 |   },
 17 |   {
 18 |    "cell_type": "code",
 19 |    "execution_count": 1,
 20 |    "metadata": {},
 21 |    "outputs": [
 22 |     {
 23 |      "data": {
 24 |       "text/plain": [
 25 |        "S1     3\n",
 26 |        "S2     9\n",
 27 |        "S3    15\n",
 28 |        "dtype: int64"
 29 |       ]
 30 |      },
 31 |      "execution_count": 1,
 32 |      "metadata": {},
 33 |      "output_type": "execute_result"
 34 |     }
 35 |    ],
 36 |    "source": [
 37 |     "# Add two columns\n",
 38 |     "import pandas as pd\n",
 39 |     "df = pd.read_excel(r\"../Data/Chapter08.xlsx\",sheet_name = 0)\n",
 40 |     "# Set the row index labels\n",
 41 |     "df.index=[\"S1\",\"S2\",\"S3\"]\n",
 42 |     "df[\"C1\"]+df[\"C2\"]"
 43 |    ]
 44 |   },
 45 |   {
 46 |    "cell_type": "code",
 47 |    "execution_count": 2,
 48 |    "metadata": {},
 49 |    "outputs": [
 50 |     {
 51 |      "data": {
 52 |       "text/plain": [
 53 |        "S1   -1\n",
 54 |        "S2   -1\n",
 55 |        "S3   -1\n",
 56 |        "dtype: int64"
 57 |       ]
 58 |      },
 59 |      "execution_count": 2,
 60 |      "metadata": {},
 61 |      "output_type": "execute_result"
 62 |     }
 63 |    ],
 64 |    "source": [
 65 |     "# Subtract one column from another\n",
 66 |     "df[\"C1\"]-df[\"C2\"]"
 67 |    ]
 68 |   },
 69 |   {
 70 |    "cell_type": "code",
 71 |    "execution_count": 15,
 72 |    "metadata": {},
 73 |    "outputs": [
 74 |     {
 75 |      "data": {
 76 |       "text/plain": [
 77 |        "S1     2\n",
 78 |        "S2    20\n",
 79 |        "S3    56\n",
 80 |        "dtype: int64"
 81 |       ]
 82 |      },
 83 |      "execution_count": 15,
 84 |      "metadata": {},
 85 |      "output_type": "execute_result"
 86 |     }
 87 |    ],
 88 |    "source": [
 89 |     "# Multiply two columns\n",
 90 |     "df[\"C1\"]*df[\"C2\"]"
 91 |    ]
 92 |   },
 93 |   {
 94 |    "cell_type": "code",
 95 |    "execution_count": 4,
 96 |    "metadata": {},
 97 |    "outputs": [
 98 |     {
 99 |      "data": {
100 |       "text/plain": [
101 |        "S1    0.500\n",
102 |        "S2    0.800\n",
103 |        "S3    0.875\n",
104 |        "dtype: float64"
105 |       ]
106 |      },
107 |      "execution_count": 4,
108 |      "metadata": {},
109 |      "output_type": "execute_result"
110 |     }
111 |    ],
112 |    "source": [
113 |     "# Divide one column by another\n",
114 |     "df[\"C1\"]/df[\"C2\"]"
115 |    ]
116 |   },
117 |   {
118 |    "cell_type": "code",
119 |    "execution_count": 5,
120 |    "metadata": {},
121 |    "outputs": [
122 |     {
123 |      "data": {
124 |       "text/plain": [
125 |        "S1    0\n",
126 |        "S2    3\n",
127 |        "S3    6\n",
128 |        "Name: C1, dtype: int64"
129 |       ]
130 |      },
131 |      "execution_count": 5,
132 |      "metadata": {},
133 |      "output_type": "execute_result"
134 |     }
135 |    ],
136 |    "source": [
137 |     "# Add a constant to a column, or subtract one from it\n",
138 |     "df[\"C1\"]+1\n",
139 |     "df[\"C1\"]-1"
140 |    ]
141 |   },
142 |   {
143 |    "cell_type": "markdown",
144 |    "metadata": {},
145 |    "source": [
146 |     "## Comparison operators"
147 |    ]
148 |   },
149 |   {
150 |    "cell_type": "code",
151 |    "execution_count": 8,
152 |    "metadata": {},
153 |    "outputs": [
154 |     {
155 |      "data": {
156 |       "text/plain": [
157 |        "S1    True\n",
158 |        "S2    True\n",
159 |        "S3    True\n",
160 |        "dtype: bool"
161 |       ]
162 |      },
163 |      "execution_count": 8,
164 |      "metadata": {},
165 |      "output_type": "execute_result"
166 |     }
167 |    ],
168 |    "source": [
169 |     "import pandas as pd\n",
170 |     "df = pd.read_excel(r\"../Data/Chapter08.xlsx\",sheet_name = 0)\n",
171 |     "# Set the row index labels\n",
172 |     "df.index=[\"S1\",\"S2\",\"S3\"]\n",
173 |     "df\n",
174 |     "df[\"C1\"] > df[\"C2\"]\n",
175 |     "df[\"C1\"] < df[\"C2\"]\n",
176 |     "df[\"C1\"] != df[\"C2\"]"
177 |    ]
178 |   },
179 |   {
180 |    "cell_type": "markdown",
181 |    "metadata": {},
182 |    "source": [
183 |     "## Aggregation operations"
184 |    ]
185 |   },
186 |   {
187 |    "cell_type": "markdown",
188 |    "metadata": {},
189 |    "source": [
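190 |     "**count(): counting non-null values**  \n",
191 |     "count() returns the number of non-null values in a region.  \n",
192 |     "By default it counts the non-null values in each column.  \n",
193 |     "Pass axis=1 to count the non-null values in each row instead."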
194 |    ]
195 |   },
196 |   {
197 |    "cell_type": "code",
198 |    "execution_count": 9,
199 |    "metadata": {},
200 |    "outputs": [
201 |     {
202 |      "data": {
203 |       "text/plain": [
204 |        "C1    3\n",
205 |        "C2    3\n",
206 |        "C3    3\n",
207 |        "dtype: int64"
208 |       ]
209 |      },
210 |      "execution_count": 9,
211 |      "metadata": {},
212 |      "output_type": "execute_result"
213 |     }
214 |    ],
215 |    "source": [
216 |     "import pandas as pd\n",
217 |     "df = pd.read_excel(r\"../Data/Chapter08.xlsx\",sheet_name = 0)\n",
218 |     "# Set the row index labels\n",
219 |     "df.index=[\"S1\",\"S2\",\"S3\"]\n",
220 |     "# Count the non-null values in each column\n",
221 |     "df.count()"
222 |    ]
223 |   },
224 |   {
225 |    "cell_type": "code",
226 |    "execution_count": 10,
227 |    "metadata": {},
228 |    "outputs": [
229 |     {
230 |      "data": {
231 |       "text/plain": [
232 |        "S1    3\n",
233 |        "S2    3\n",
234 |        "S3    3\n",
235 |        "dtype: int64"
236 |       ]
237 |      },
238 |      "execution_count": 10,
239 |      "metadata": {},
240 |      "output_type": "execute_result"
241 |     }
242 |    ],
243 |    "source": [
244 |     "# Count the non-null values in each row\n",
245 |     "df.count(axis =1)"
246 |    ]
247 |   },
248 |   {
249 |    "cell_type": "markdown",
250 |    "metadata": {},
251 |    "source": [
252 |     "**sum(): sum**"
253 |    ]
254 |   },
255 |   {
256 |    "cell_type": "code",
257 |    "execution_count": 11,
258 |    "metadata": {},
259 |    "outputs": [
260 |     {
261 |      "data": {
262 |       "text/plain": [
263 |        "C1    12\n",
264 |        "C2    15\n",
265 |        "C3    18\n",
266 |        "dtype: int64"
267 |       ]
268 |      },
269 |      "execution_count": 11,
270 |      "metadata": {},
271 |      "output_type": "execute_result"
272 |     }
273 |    ],
274 |    "source": [
275 |     "# By default, sum each column\n",
276 |     "df.sum()"
277 |    ]
278 |   },
279 |   {
280 |    "cell_type": "code",
281 |    "execution_count": 12,
282 |    "metadata": {},
283 |    "outputs": [
284 |     {
285 |      "data": {
286 |       "text/plain": [
287 |        "S1     6\n",
288 |        "S2    15\n",
289 |        "S3    24\n",
290 |        "dtype: int64"
291 |       ]
292 |      },
293 |      "execution_count": 12,
294 |      "metadata": {},
295 |      "output_type": "execute_result"
296 |     }
297 |    ],
298 |    "source": [
299 |     "# Pass axis=1 to sum each row\n",
300 |     "df.sum(axis = 1)"
301 |    ]
302 |   },
303 |   {
304 |    "cell_type": "code",
305 |    "execution_count": 13,
306 |    "metadata": {},
307 |    "outputs": [
308 |     {
309 |      "data": {
310 |       "text/plain": [
311 |        "12"
312 |       ]
313 |      },
314 |      "execution_count": 13,
315 |      "metadata": {},
316 |      "output_type": "execute_result"
317 |     }
318 |    ],
319 |    "source": [
320 |     "# Sum one specific column\n",
321 |     "df[\"C1\"].sum()"
322 |    ]
323 |   },
324 |   {
325 |    "cell_type": "markdown",
326 |    "metadata": {},
327 |    "source": [
328 |     "**mean(): mean**  \n",
329 |     "mean() computes the arithmetic mean of all values in a region."
330 |    ]
331 |   },
332 |   {
333 |    "cell_type": "code",
334 |    "execution_count": 14,
335 |    "metadata": {},
336 |    "outputs": [
337 |     {
338 |      "data": {
339 |       "text/plain": [
340 |        "C1    4.0\n",
341 |        "C2    5.0\n",
342 |        "C3    6.0\n",
343 |        "dtype: float64"
344 |       ]
345 |      },
346 |      "execution_count": 14,
347 |      "metadata": {},
348 |      "output_type": "execute_result"
349 |     }
350 |    ],
351 |    "source": [
352 |     "# By default, compute the mean of each column\n",
353 |     "df.mean()"
354 |    ]
355 |   },
356 |   {
357 |    "cell_type": "code",
358 |    "execution_count": 15,
359 |    "metadata": {},
360 |    "outputs": [
361 |     {
362 |      "data": {
363 |       "text/plain": [
364 |        "S1    2.0\n",
365 |        "S2    5.0\n",
366 |        "S3    8.0\n",
367 |        "dtype: float64"
368 |       ]
369 |      },
370 |      "execution_count": 15,
371 |      "metadata": {},
372 |      "output_type": "execute_result"
373 |     }
374 |    ],
375 |    "source": [
376 |     "# Compute the mean of each row\n",
377 |     "df.mean( axis =1)"
378 |    ]
379 |   },
380 |   {
381 |    "cell_type": "code",
382 |    "execution_count": 16,
383 |    "metadata": {},
384 |    "outputs": [
385 |     {
386 |      "data": {
387 |       "text/plain": [
388 |        "4.0"
389 |       ]
390 |      },
391 |      "execution_count": 16,
392 |      "metadata": {},
393 |      "output_type": "execute_result"
394 |     }
395 |    ],
396 |    "source": [
397 |     "# Compute the mean of one specific column\n",
398 |     "df[\"C1\"].mean()"
399 |    ]
400 |   },
401 |   {
402 |    "cell_type": "markdown",
403 |    "metadata": {},
404 |    "source": [
405 |     "**max(): maximum**"
406 |    ]
407 |   },
408 |   {
409 |    "cell_type": "code",
410 |    "execution_count": 17,
411 |    "metadata": {},
412 |    "outputs": [
413 |     {
414 |      "data": {
415 |       "text/plain": [
416 |        "C1    7\n",
417 |        "C2    8\n",
418 |        "C3    9\n",
419 |        "dtype: int64"
420 |       ]
421 |      },
422 |      "execution_count": 17,
423 |      "metadata": {},
424 |      "output_type": "execute_result"
425 |     }
426 |    ],
427 |    "source": [
428 |     "# By default, return the maximum of each column\n",
429 |     "df.max()"
430 |    ]
431 |   },
432 |   {
433 |    "cell_type": "code",
434 |    "execution_count": 18,
435 |    "metadata": {},
436 |    "outputs": [
437 |     {
438 |      "data": {
439 |       "text/plain": [
440 |        "S1    3\n",
441 |        "S2    6\n",
442 |        "S3    9\n",
443 |        "dtype: int64"
444 |       ]
445 |      },
446 |      "execution_count": 18,
447 |      "metadata": {},
448 |      "output_type": "execute_result"
449 |     }
450 |    ],
451 |    "source": [
452 |     "# Return the maximum of each row\n",
453 |     "df.max( axis =1)"
454 |    ]
455 |   },
456 |   {
457 |    "cell_type": "code",
458 |    "execution_count": 19,
459 |    "metadata": {},
460 |    "outputs": [
461 |     {
462 |      "data": {
463 |       "text/plain": [
464 |        "7"
465 |       ]
466 |      },
467 |      "execution_count": 19,
468 |      "metadata": {},
469 |      "output_type": "execute_result"
470 |     }
471 |    ],
472 |    "source": [
473 |     "# Return the maximum of one specific column\n",
474 |     "df[\"C1\"].max()"
475 |    ]
476 |   },
477 |   {
478 |    "cell_type": "markdown",
479 |    "metadata": {},
480 |    "source": [
481 |     "**min(): minimum; used the same way as max()**"
482 |    ]
483 |   },
484 |   {
485 |    "cell_type": "markdown",
486 |    "metadata": {},
487 |    "source": [
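488 |     "**median(): median**  \n",
489 |     "The median is the value in the middle position once a sequence X of n values is sorted in ascending order (for even n, the mean of the two middle values). It is used the same way as the other functions."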
490 |    ]
491 |   },
492 |   {
493 |    "cell_type": "code",
494 |    "execution_count": 20,
495 |    "metadata": {},
496 |    "outputs": [
497 |     {
498 |      "data": {
499 |       "text/plain": [
500 |        "C1    4.0\n",
501 |        "C2    5.0\n",
502 |        "C3    6.0\n",
503 |        "dtype: float64"
504 |       ]
505 |      },
506 |      "execution_count": 20,
507 |      "metadata": {},
508 |      "output_type": "execute_result"
509 |     }
510 |    ],
511 |    "source": [
512 |     "df.median()"
513 |    ]
514 |   },
515 |   {
516 |    "cell_type": "markdown",
517 |    "metadata": {},
518 |    "source": [
519 |     "**mode(): mode**  \n",
520 |     "The mode is the value that occurs most often in a set of data. It is used the same way as the other functions."
521 |    ]
522 |   },
523 |   {
524 |    "cell_type": "code",
525 |    "execution_count": 26,
526 |    "metadata": {},
527 |    "outputs": [
528 |     {
529 |      "data": {
530 |       "text/html": [
531 |        "<div>\n",
532 |        "<style scoped>\n",
533 |        "    .dataframe tbody tr th:only-of-type {\n",
534 |        "        vertical-align: middle;\n",
535 |        "    }\n",
536 |        "\n",
537 |        "    .dataframe tbody tr th {\n",
538 |        "        vertical-align: top;\n",
539 |        "    }\n",
540 |        "\n",
541 |        "    .dataframe thead th {\n",
542 |        "        text-align: right;\n",
543 |        "    }\n",
544 |        "</style>\n",
545 |        "<table border=\"1\" class=\"dataframe\">\n",
546 |        "  <thead>\n",
547 |        "    <tr style=\"text-align: right;\">\n",
548 |        "      <th></th>\n",
549 |        "      <th>C1</th>\n",
550 |        "      <th>C2</th>\n",
551 |        "      <th>C3</th>\n",
552 |        "    </tr>\n",
553 |        "  </thead>\n",
554 |        "  <tbody>\n",
555 |        "    <tr>\n",
556 |        "      <th>0</th>\n",
557 |        "      <td>1</td>\n",
558 |        "      <td>1</td>\n",
559 |        "      <td>3</td>\n",
560 |        "    </tr>\n",
561 |        "  </tbody>\n",
562 |        "</table>\n",
563 |        "</div>"
564 |       ],
565 |       "text/plain": [
566 |        "   C1  C2  C3\n",
567 |        "0   1   1   3"
568 |       ]
569 |      },
570 |      "execution_count": 26,
571 |      "metadata": {},
572 |      "output_type": "execute_result"
573 |     }
574 |    ],
575 |    "source": [
576 |     "import pandas as pd\n",
577 |     "df = pd.read_excel(r\"../Data/Chapter08.xlsx\",sheet_name=1)\n",
578 |     "df.index=[\"S1\",\"S2\",\"S3\"]\n",
579 |     "df.mode()"
580 |    ]
581 |   },
582 |   {
583 |    "cell_type": "markdown",
584 |    "metadata": {},
585 |    "source": [
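586 |     "**var(): variance✩**  \n",
587 |     "Variance measures how dispersed a set of data is. It is used the same way as the other functions.  \n",
588 |     "**std(): standard deviation✩**  \n",
589 |     "The standard deviation is the square root of the variance; both describe the dispersion of the data. It is used the same way as the other functions."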
590 |    ]
591 |   },
592 |   {
593 |    "cell_type": "markdown",
594 |    "metadata": {},
595 |    "source": [
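596 |     "**quantile(): quantiles**  \n",
597 |     "Quantiles are position-based statistics that generalize the median. The common ones are the first quartile (0.25), the second quartile (0.5), and the third quartile (0.75); the second quartile is the median.\n"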
598 |    ]
599 |   },
600 |   {
601 |    "cell_type": "code",
602 |    "execution_count": 27,
603 |    "metadata": {},
604 |    "outputs": [
605 |     {
606 |      "data": {
607 |       "text/plain": [
608 |        "C1    4.0\n",
609 |        "C2    5.0\n",
610 |        "C3    6.0\n",
611 |        "Name: 0.25, dtype: float64"
612 |       ]
613 |      },
614 |      "execution_count": 27,
615 |      "metadata": {},
616 |      "output_type": "execute_result"
617 |     }
618 |    ],
619 |    "source": [
620 |     "import pandas as pd\n",
621 |     "df = pd.read_excel(r\"../Data/Chapter08.xlsx\",sheet_name=2)\n",
622 |     "df.index=[\"S1\",\"S2\",\"S3\",\"S4\",\"S5\"]\n",
623 |     "df\n",
624 |     "df.quantile(0.25)  # first quartile (0.25 quantile) of each column"
625 |    ]
626 |   },
627 |   {
628 |    "cell_type": "code",
629 |    "execution_count": 13,
630 |    "metadata": {},
631 |    "outputs": [
632 |     {
633 |      "data": {
634 |       "text/plain": [
635 |        "C1    10.0\n",
636 |        "C2    11.0\n",
637 |        "C3    12.0\n",
638 |        "Name: 0.75, dtype: float64"
639 |       ]
640 |      },
641 |      "execution_count": 13,
642 |      "metadata": {},
643 |      "output_type": "execute_result"
644 |     }
645 |    ],
646 |    "source": [
647 |     "df.quantile(0.75)  # third quartile (0.75 quantile) of each column"
648 |    ]
649 |   },
650 |   {
651 |    "cell_type": "code",
652 |    "execution_count": 15,
653 |    "metadata": {},
654 |    "outputs": [
655 |     {
656 |      "data": {
657 |       "text/plain": [
658 |        "S1     1.5\n",
659 |        "S2     4.5\n",
660 |        "S3     7.5\n",
661 |        "S4    10.5\n",
662 |        "S5    13.5\n",
663 |        "Name: 0.25, dtype: float64"
664 |       ]
665 |      },
666 |      "execution_count": 15,
667 |      "metadata": {},
668 |      "output_type": "execute_result"
669 |     }
670 |    ],
671 |    "source": [
672 |     "df.quantile(0.25, axis=1)  # first quartile of each row"
673 |    ]
674 |   },
675 |   {
676 |    "cell_type": "markdown",
677 |    "metadata": {},
678 |    "source": [
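679 |     "## Correlation operations✩\n",
680 |     "Correlation is commonly used to measure how strongly two variables are related. Use the corr() function, which returns the pairwise Pearson correlation coefficients by default."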
681 |    ]
682 |   },
683 |   {
684 |    "cell_type": "code",
685 |    "execution_count": 17,
686 |    "metadata": {},
687 |    "outputs": [
688 |     {
689 |      "data": {
690 |       "text/html": [
691 |        "<div>\n",
692 |        "<style scoped>\n",
693 |        "    .dataframe tbody tr th:only-of-type {\n",
694 |        "        vertical-align: middle;\n",
695 |        "    }\n",
696 |        "\n",
697 |        "    .dataframe tbody tr th {\n",
698 |        "        vertical-align: top;\n",
699 |        "    }\n",
700 |        "\n",
701 |        "    .dataframe thead th {\n",
702 |        "        text-align: right;\n",
703 |        "    }\n",
704 |        "</style>\n",
705 |        "<table border=\"1\" class=\"dataframe\">\n",
706 |        "  <thead>\n",
707 |        "    <tr style=\"text-align: right;\">\n",
708 |        "      <th></th>\n",
709 |        "      <th>C1</th>\n",
710 |        "      <th>C2</th>\n",
711 |        "      <th>C3</th>\n",
712 |        "    </tr>\n",
713 |        "  </thead>\n",
714 |        "  <tbody>\n",
715 |        "    <tr>\n",
716 |        "      <th>C1</th>\n",
717 |        "      <td>1.0</td>\n",
718 |        "      <td>1.0</td>\n",
719 |        "      <td>1.0</td>\n",
720 |        "    </tr>\n",
721 |        "    <tr>\n",
722 |        "      <th>C2</th>\n",
723 |        "      <td>1.0</td>\n",
724 |        "      <td>1.0</td>\n",
725 |        "      <td>1.0</td>\n",
726 |        "    </tr>\n",
727 |        "    <tr>\n",
728 |        "      <th>C3</th>\n",
729 |        "      <td>1.0</td>\n",
730 |        "      <td>1.0</td>\n",
731 |        "      <td>1.0</td>\n",
732 |        "    </tr>\n",
733 |        "  </tbody>\n",
734 |        "</table>\n",
735 |        "</div>"
736 |       ],
737 |       "text/plain": [
738 |        "     C1   C2   C3\n",
739 |        "C1  1.0  1.0  1.0\n",
740 |        "C2  1.0  1.0  1.0\n",
741 |        "C3  1.0  1.0  1.0"
742 |       ]
743 |      },
744 |      "execution_count": 17,
745 |      "metadata": {},
746 |      "output_type": "execute_result"
747 |     }
748 |    ],
749 |    "source": [
750 |     "df.corr()"
751 |    ]
752 |   }
753 |  ],
754 |  "metadata": {
755 |   "kernelspec": {
756 |    "display_name": "Python 3",
757 |    "language": "python",
758 |    "name": "python3"
759 |   },
760 |   "language_info": {
761 |    "codemirror_mode": {
762 |     "name": "ipython",
763 |     "version": 3
764 |    },
765 |    "file_extension": ".py",
766 |    "mimetype": "text/x-python",
767 |    "name": "python",
768 |    "nbconvert_exporter": "python",
769 |    "pygments_lexer": "ipython3",
770 |    "version": "3.7.0"
771 |   },
772 |   "toc": {
773 |    "base_numbering": 1,
774 |    "nav_menu": {},
775 |    "number_sections": true,
776 |    "sideBar": true,
777 |    "skip_h1_title": false,
778 |    "title_cell": "Table of Contents",
779 |    "title_sidebar": "Chapter 8: Data Operations",
780 |    "toc_cell": false,
781 |    "toc_position": {
782 |     "height": "calc(100% - 180px)",
783 |     "left": "10px",
784 |     "top": "150px",
785 |     "width": "320px"
786 |    },
787 |    "toc_section_display": true,
788 |    "toc_window_display": true
789 |   }
790 |  },
791 |  "nbformat": 4,
792 |  "nbformat_minor": 2
793 | }
794 | 
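
The quantile, variance, and correlation calls in the Chapter08 notebook above read their data from `Chapter08.xlsx`, which is not reproduced here. As a minimal, self-contained sketch, the hypothetical DataFrame below is reconstructed from the outputs shown (an assumption, not the actual spreadsheet contents) and reproduces the same quartile and correlation results:

```python
import pandas as pd

# Hypothetical stand-in for sheet 2 of Chapter08.xlsx, inferred from the
# notebook outputs; the real file may contain different values.
df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]],
    columns=["C1", "C2", "C3"],
    index=["S1", "S2", "S3", "S4", "S5"],
)

# Column-wise quartiles (axis=0 is the default).
q1_by_column = df.quantile(0.25)   # C1 4.0, C2 5.0, C3 6.0
q3_by_column = df.quantile(0.75)   # C1 10.0, C2 11.0, C3 12.0

# Row-wise first quartile.
q1_by_row = df.quantile(0.25, axis=1)

# The 0.5 quantile is the median.
median_by_column = df.quantile(0.5)

# Dispersion: sample variance (ddof=1) and its square root.
variance = df.var()
std_dev = df.std()

# Pairwise Pearson correlation between columns.
corr_matrix = df.corr()

print(q1_by_column)
print(q1_by_row)
print(corr_matrix)
```

Because each column in this stand-in data is an exact linear function of the others, every entry of the correlation matrix is 1.0, matching the output in the notebook.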


--------------------------------------------------------------------------------
/Code/Chapter09 时间序列.ipynb:
--------------------------------------------------------------------------------
   1 | {
   2 |  "cells": [
   3 |   {
   4 |    "cell_type": "markdown",
   5 |    "metadata": {},
   6 |    "source": [
   7 |     "# Time Series"
   8 |    ]
   9 |   },
  10 |   {
  11 |    "cell_type": "markdown",
  12 |    "metadata": {},
  13 |    "source": [
  14 |     "## Getting the Current Time"
  15 |    ]
  16 |   },
  17 |   {
  18 |    "cell_type": "markdown",
  19 |    "metadata": {},
  20 |    "source": [
  21 |     "**Return the current date and time**"
  22 |    ]
  23 |   },
  24 |   {
  25 |    "cell_type": "code",
  26 |    "execution_count": 76,
  27 |    "metadata": {},
  28 |    "outputs": [
  29 |     {
  30 |      "data": {
  31 |       "text/plain": [
  32 |        "datetime.datetime(2019, 3, 14, 15, 57, 43, 307645)"
  33 |       ]
  34 |      },
  35 |      "execution_count": 76,
  36 |      "metadata": {},
  37 |      "output_type": "execute_result"
  38 |     }
  39 |    ],
  40 |    "source": [
  41 |     "from datetime import datetime\n",
  42 |     "datetime.now()"
  43 |    ]
  44 |   },
  45 |   {
  46 |    "cell_type": "markdown",
  47 |    "metadata": {},
  48 |    "source": [
  49 |     "**Return the current year, month, and day separately**"
  50 |    ]
  51 |   },
  52 |   {
  53 |    "cell_type": "code",
  54 |    "execution_count": 77,
  55 |    "metadata": {},
  56 |    "outputs": [
  57 |     {
  58 |      "data": {
  59 |       "text/plain": [
  60 |        "2019"
  61 |       ]
  62 |      },
  63 |      "execution_count": 77,
  64 |      "metadata": {},
  65 |      "output_type": "execute_result"
  66 |     }
  67 |    ],
  68 |    "source": [
  69 |     "datetime.now().year "
  70 |    ]
  71 |   },
  72 |   {
  73 |    "cell_type": "code",
  74 |    "execution_count": 3,
  75 |    "metadata": {},
  76 |    "outputs": [
  77 |     {
  78 |      "data": {
  79 |       "text/plain": [
  80 |        "3"
  81 |       ]
  82 |      },
  83 |      "execution_count": 3,
  84 |      "metadata": {},
  85 |      "output_type": "execute_result"
  86 |     }
  87 |    ],
  88 |    "source": [
  89 |     "datetime.now().month"
  90 |    ]
  91 |   },
  92 |   {
  93 |    "cell_type": "code",
  94 |    "execution_count": 78,
  95 |    "metadata": {},
  96 |    "outputs": [
  97 |     {
  98 |      "data": {
  99 |       "text/plain": [
 100 |        "14"
 101 |       ]
 102 |      },
 103 |      "execution_count": 78,
 104 |      "metadata": {},
 105 |      "output_type": "execute_result"
 106 |     }
 107 |    ],
 108 |    "source": [
 109 |     "datetime.now().day"
 110 |    ]
 111 |   },
 112 |   {
 113 |    "cell_type": "markdown",
 114 |    "metadata": {},
 115 |    "source": [
 116 |     "**Return the current day of the week and week number**"
 117 |    ]
 118 |   },
 119 |   {
 120 |    "cell_type": "code",
 121 |    "execution_count": 6,
 122 |    "metadata": {},
 123 |    "outputs": [
 124 |     {
 125 |      "data": {
 126 |       "text/plain": [
 127 |        "7"
 128 |       ]
 129 |      },
 130 |      "execution_count": 6,
 131 |      "metadata": {},
 132 |      "output_type": "execute_result"
 133 |     }
 134 |    ],
 135 |    "source": [
 136 |     "# Day of the week: weekday() counts from 0 (Monday), so add 1 to get 1-7\n",
 137 |     "datetime.now().weekday()+1"
 138 |    ]
 139 |   },
 140 |   {
 141 |    "cell_type": "code",
 142 |    "execution_count": 9,
 143 |    "metadata": {},
 144 |    "outputs": [
 145 |     {
 146 |      "data": {
 147 |       "text/plain": [
 148 |        "(2019, 10, 7)"
 149 |       ]
 150 |      },
 151 |      "execution_count": 9,
 152 |      "metadata": {},
 153 |      "output_type": "execute_result"
 154 |     }
 155 |    ],
 156 |    "source": [
 157 |     "# isocalendar() returns (ISO year, ISO week number, ISO weekday)\n",
 158 |     "datetime.now().isocalendar()\n",
 159 |     "# Here: day 7 of week 10 of 2019"
 160 |    ]
 161 |   },
 162 |   {
 163 |    "cell_type": "code",
 164 |    "execution_count": 11,
 165 |    "metadata": {},
 166 |    "outputs": [
 167 |     {
 168 |      "data": {
 169 |       "text/plain": [
 170 |        "10"
 171 |       ]
 172 |      },
 173 |      "execution_count": 11,
 174 |      "metadata": {},
 175 |      "output_type": "execute_result"
 176 |     }
 177 |    ],
 178 |    "source": [
 179 |     "# Week number only\n",
 180 |     "datetime.now().isocalendar()[1]"
 181 |    ]
 182 |   },
 183 |   {
 184 |    "cell_type": "markdown",
 185 |    "metadata": {},
 186 |    "source": [
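 187 |     "**Specifying date and time formats**  \n",
 188 |     "- date() returns only the date part  \n",
 189 |     "- time() returns only the time part  \n",
 190 |     "- strftime() formats the date and time with a custom format string"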
 191 |    ]
 192 |   },
 193 |   {
 194 |    "cell_type": "code",
 195 |    "execution_count": 12,
 196 |    "metadata": {},
 197 |    "outputs": [
 198 |     {
 199 |      "data": {
 200 |       "text/plain": [
 201 |        "datetime.date(2019, 3, 10)"
 202 |       ]
 203 |      },
 204 |      "execution_count": 12,
 205 |      "metadata": {},
 206 |      "output_type": "execute_result"
 207 |     }
 208 |    ],
 209 |    "source": [
 210 |     "datetime.now().date()"
 211 |    ]
 212 |   },
 213 |   {
 214 |    "cell_type": "code",
 215 |    "execution_count": 13,
 216 |    "metadata": {
 217 |     "scrolled": true
 218 |    },
 219 |    "outputs": [
 220 |     {
 221 |      "data": {
 222 |       "text/plain": [
 223 |        "datetime.time(22, 11, 41, 36684)"
 224 |       ]
 225 |      },
 226 |      "execution_count": 13,
 227 |      "metadata": {},
 228 |      "output_type": "execute_result"
 229 |     }
 230 |    ],
 231 |    "source": [
 232 |     "datetime.now().time()"
 233 |    ]
 234 |   },
 235 |   {
 236 |    "cell_type": "markdown",
 237 |    "metadata": {},
 238 |    "source": [
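 239 |     "Format codes accepted by strftime()  \n",
 240 |     "\n",
 241 |     "Code | Description\n",
 242 |     "---|---\n",
 243 |     "%H | Hour (24-hour clock) [00,23]\n",
 244 |     "%I | Hour (12-hour clock) [01,12]\n",
 245 |     "%M | Two-digit minute [00,59]\n",
 246 |     "%S | Second \\[00,61] (60 and 61 allow for leap seconds)\n",
 247 |     "%w | Day of the week as an integer, starting from 0 (Sunday)\n",
 248 |     "%U | Week number of the year, with Sunday as the first day of the week\n",
 249 |     "%W | Week number of the year, with Monday as the first day of the week\n",
 250 |     "%F | Shorthand for %Y-%m-%d, e.g. 2018-04-18\n",
 251 |     "%D | Shorthand for %m/%d/%y, e.g. 04/18/18"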
 252 |    ]
 253 |   },
 254 |   {
 255 |    "cell_type": "code",
 256 |    "execution_count": 14,
 257 |    "metadata": {},
 258 |    "outputs": [
 259 |     {
 260 |      "data": {
 261 |       "text/plain": [
 262 |        "'2019-03-10'"
 263 |       ]
 264 |      },
 265 |      "execution_count": 14,
 266 |      "metadata": {},
 267 |      "output_type": "execute_result"
 268 |     }
 269 |    ],
 270 |    "source": [
 271 |     "datetime.now().strftime(\"%Y-%m-%d\")"
 272 |    ]
 273 |   },
 274 |   {
 275 |    "cell_type": "markdown",
 276 |    "metadata": {},
 277 |    "source": [
 278 |     "## Converting Between Strings and Datetime Objects"
 279 |    ]
 280 |   },
 281 |   {
 282 |    "cell_type": "markdown",
 283 |    "metadata": {},
 284 |    "source": [
 285 |     "**Convert a datetime to a string**  \n",
 286 |     "Use the str() function"
 287 |    ]
 288 |   },
 289 |   {
 290 |    "cell_type": "code",
 291 |    "execution_count": 79,
 292 |    "metadata": {},
 293 |    "outputs": [
 294 |     {
 295 |      "data": {
 296 |       "text/plain": [
 297 |        "str"
 298 |       ]
 299 |      },
 300 |      "execution_count": 79,
 301 |      "metadata": {},
 302 |      "output_type": "execute_result"
 303 |     }
 304 |    ],
 305 |    "source": [
 306 |     "from datetime import datetime\n",
 307 |     "now = datetime.now()\n",
 308 |     "now\n",
 309 |     "type(now)\n",
 310 |     "type(str(now))"
 311 |    ]
 312 |   },
 313 |   {
 314 |    "cell_type": "markdown",
 315 |    "metadata": {},
 316 |    "source": [
 317 |     "**Convert a string to a datetime**  \n",
 318 |     "Use the parse() function from dateutil.parser"
 319 |    ]
 320 |   },
 321 |   {
 322 |    "cell_type": "code",
 323 |    "execution_count": 11,
 324 |    "metadata": {},
 325 |    "outputs": [
 326 |     {
 327 |      "data": {
 328 |       "text/plain": [
 329 |        "datetime.datetime"
 330 |       ]
 331 |      },
 332 |      "execution_count": 11,
 333 |      "metadata": {},
 334 |      "output_type": "execute_result"
 335 |     }
 336 |    ],
 337 |    "source": [
 338 |     "from dateutil.parser import parse\n",
 339 |     "str_time = \"2019-03-11\"\n",
 340 |     "type(str_time)\n",
 341 |     "parse(str_time)\n",
 342 |     "type(parse(str_time))"
 343 |    ]
 344 |   },
 345 |   {
 346 |    "cell_type": "markdown",
 347 |    "metadata": {},
 348 |    "source": [
 349 |     "## Time Indexing  \n",
 350 |     "Time indexing is a way of selecting data from datetime-typed fields based on time."
 351 |    ]
 352 |   },
 353 |   {
 354 |    "cell_type": "code",
 355 |    "execution_count": 4,
 356 |    "metadata": {},
 357 |    "outputs": [
 358 |     {
 359 |      "data": {
 360 |       "text/html": [
 361 |        "<div>\n",
 362 |        "<style scoped>\n",
 363 |        "    .dataframe tbody tr th:only-of-type {\n",
 364 |        "        vertical-align: middle;\n",
 365 |        "    }\n",
 366 |        "\n",
 367 |        "    .dataframe tbody tr th {\n",
 368 |        "        vertical-align: top;\n",
 369 |        "    }\n",
 370 |        "\n",
 371 |        "    .dataframe thead th {\n",
 372 |        "        text-align: right;\n",
 373 |        "    }\n",
 374 |        "</style>\n",
 375 |        "<table border=\"1\" class=\"dataframe\">\n",
 376 |        "  <thead>\n",
 377 |        "    <tr style=\"text-align: right;\">\n",
 378 |        "      <th></th>\n",
 379 |        "      <th>num</th>\n",
 380 |        "    </tr>\n",
 381 |        "  </thead>\n",
 382 |        "  <tbody>\n",
 383 |        "    <tr>\n",
 384 |        "      <th>2018-01-01</th>\n",
 385 |        "      <td>1</td>\n",
 386 |        "    </tr>\n",
 387 |        "    <tr>\n",
 388 |        "      <th>2018-01-02</th>\n",
 389 |        "      <td>2</td>\n",
 390 |        "    </tr>\n",
 391 |        "    <tr>\n",
 392 |        "      <th>2018-01-03</th>\n",
 393 |        "      <td>3</td>\n",
 394 |        "    </tr>\n",
 395 |        "    <tr>\n",
 396 |        "      <th>2018-01-04</th>\n",
 397 |        "      <td>4</td>\n",
 398 |        "    </tr>\n",
 399 |        "    <tr>\n",
 400 |        "      <th>2018-01-05</th>\n",
 401 |        "      <td>5</td>\n",
 402 |        "    </tr>\n",
 403 |        "    <tr>\n",
 404 |        "      <th>2018-01-06</th>\n",
 405 |        "      <td>6</td>\n",
 406 |        "    </tr>\n",
 407 |        "    <tr>\n",
 408 |        "      <th>2018-01-07</th>\n",
 409 |        "      <td>7</td>\n",
 410 |        "    </tr>\n",
 411 |        "    <tr>\n",
 412 |        "      <th>2018-01-08</th>\n",
 413 |        "      <td>8</td>\n",
 414 |        "    </tr>\n",
 415 |        "    <tr>\n",
 416 |        "      <th>2018-01-09</th>\n",
 417 |        "      <td>9</td>\n",
 418 |        "    </tr>\n",
 419 |        "    <tr>\n",
 420 |        "      <th>2018-01-10</th>\n",
 421 |        "      <td>10</td>\n",
 422 |        "    </tr>\n",
 423 |        "  </tbody>\n",
 424 |        "</table>\n",
 425 |        "</div>"
 426 |       ],
 427 |       "text/plain": [
 428 |        "            num\n",
 429 |        "2018-01-01    1\n",
 430 |        "2018-01-02    2\n",
 431 |        "2018-01-03    3\n",
 432 |        "2018-01-04    4\n",
 433 |        "2018-01-05    5\n",
 434 |        "2018-01-06    6\n",
 435 |        "2018-01-07    7\n",
 436 |        "2018-01-08    8\n",
 437 |        "2018-01-09    9\n",
 438 |        "2018-01-10   10"
 439 |       ]
 440 |      },
 441 |      "execution_count": 4,
 442 |      "metadata": {},
 443 |      "output_type": "execute_result"
 444 |     }
 445 |    ],
 446 |    "source": [
 447 |     "import pandas as pd\n",
 448 |     "import numpy as np\n",
 449 |     "index = pd.DatetimeIndex(['2018-01-01','2018-01-02','2018-01-03','2018-01-04','2018-01-05',\n",
 450 |     "                          '2018-01-06','2018-01-07','2018-01-08','2018-01-09','2018-01-10'])\n",
 451 |     "data = pd.DataFrame(np.arange(1,11),columns =[\"num\"],index = index)\n",
 452 |     "data"
 453 |    ]
 454 |   },
 455 |   {
 456 |    "cell_type": "code",
 457 |    "execution_count": 5,
 458 |    "metadata": {},
 459 |    "outputs": [
 460 |     {
 461 |      "data": {
 462 |       "text/html": [
 463 |        "<div>\n",
 464 |        "<style scoped>\n",
 465 |        "    .dataframe tbody tr th:only-of-type {\n",
 466 |        "        vertical-align: middle;\n",
 467 |        "    }\n",
 468 |        "\n",
 469 |        "    .dataframe tbody tr th {\n",
 470 |        "        vertical-align: top;\n",
 471 |        "    }\n",
 472 |        "\n",
 473 |        "    .dataframe thead th {\n",
 474 |        "        text-align: right;\n",
 475 |        "    }\n",
 476 |        "</style>\n",
 477 |        "<table border=\"1\" class=\"dataframe\">\n",
 478 |        "  <thead>\n",
 479 |        "    <tr style=\"text-align: right;\">\n",
 480 |        "      <th></th>\n",
 481 |        "      <th>num</th>\n",
 482 |        "    </tr>\n",
 483 |        "  </thead>\n",
 484 |        "  <tbody>\n",
 485 |        "    <tr>\n",
 486 |        "      <th>2018-01-01</th>\n",
 487 |        "      <td>1</td>\n",
 488 |        "    </tr>\n",
 489 |        "    <tr>\n",
 490 |        "      <th>2018-01-02</th>\n",
 491 |        "      <td>2</td>\n",
 492 |        "    </tr>\n",
 493 |        "    <tr>\n",
 494 |        "      <th>2018-01-03</th>\n",
 495 |        "      <td>3</td>\n",
 496 |        "    </tr>\n",
 497 |        "    <tr>\n",
 498 |        "      <th>2018-01-04</th>\n",
 499 |        "      <td>4</td>\n",
 500 |        "    </tr>\n",
 501 |        "    <tr>\n",
 502 |        "      <th>2018-01-05</th>\n",
 503 |        "      <td>5</td>\n",
 504 |        "    </tr>\n",
 505 |        "    <tr>\n",
 506 |        "      <th>2018-01-06</th>\n",
 507 |        "      <td>6</td>\n",
 508 |        "    </tr>\n",
 509 |        "    <tr>\n",
 510 |        "      <th>2018-01-07</th>\n",
 511 |        "      <td>7</td>\n",
 512 |        "    </tr>\n",
 513 |        "    <tr>\n",
 514 |        "      <th>2018-01-08</th>\n",
 515 |        "      <td>8</td>\n",
 516 |        "    </tr>\n",
 517 |        "    <tr>\n",
 518 |        "      <th>2018-01-09</th>\n",
 519 |        "      <td>9</td>\n",
 520 |        "    </tr>\n",
 521 |        "    <tr>\n",
 522 |        "      <th>2018-01-10</th>\n",
 523 |        "      <td>10</td>\n",
 524 |        "    </tr>\n",
 525 |        "  </tbody>\n",
 526 |        "</table>\n",
 527 |        "</div>"
 528 |       ],
 529 |       "text/plain": [
 530 |        "            num\n",
 531 |        "2018-01-01    1\n",
 532 |        "2018-01-02    2\n",
 533 |        "2018-01-03    3\n",
 534 |        "2018-01-04    4\n",
 535 |        "2018-01-05    5\n",
 536 |        "2018-01-06    6\n",
 537 |        "2018-01-07    7\n",
 538 |        "2018-01-08    8\n",
 539 |        "2018-01-09    9\n",
 540 |        "2018-01-10   10"
 541 |       ]
 542 |      },
 543 |      "execution_count": 5,
 544 |      "metadata": {},
 545 |      "output_type": "execute_result"
 546 |     }
 547 |    ],
 548 |    "source": [
 549 |     "# Select the rows from 2018 (use .loc; data[\"2018\"] is deprecated for row selection in newer pandas)\n",
 550 |     "data.loc[\"2018\"]"
 551 |    ]
 552 |   },
 553 |   {
 554 |    "cell_type": "code",
 555 |    "execution_count": 6,
 556 |    "metadata": {},
 557 |    "outputs": [
 558 |     {
 559 |      "data": {
 560 |       "text/html": [
 561 |        "<div>\n",
 562 |        "<style scoped>\n",
 563 |        "    .dataframe tbody tr th:only-of-type {\n",
 564 |        "        vertical-align: middle;\n",
 565 |        "    }\n",
 566 |        "\n",
 567 |        "    .dataframe tbody tr th {\n",
 568 |        "        vertical-align: top;\n",
 569 |        "    }\n",
 570 |        "\n",
 571 |        "    .dataframe thead th {\n",
 572 |        "        text-align: right;\n",
 573 |        "    }\n",
 574 |        "</style>\n",
 575 |        "<table border=\"1\" class=\"dataframe\">\n",
 576 |        "  <thead>\n",
 577 |        "    <tr style=\"text-align: right;\">\n",
 578 |        "      <th></th>\n",
 579 |        "      <th>num</th>\n",
 580 |        "    </tr>\n",
 581 |        "  </thead>\n",
 582 |        "  <tbody>\n",
 583 |        "    <tr>\n",
 584 |        "      <th>2018-01-01</th>\n",
 585 |        "      <td>1</td>\n",
 586 |        "    </tr>\n",
 587 |        "    <tr>\n",
 588 |        "      <th>2018-01-02</th>\n",
 589 |        "      <td>2</td>\n",
 590 |        "    </tr>\n",
 591 |        "    <tr>\n",
 592 |        "      <th>2018-01-03</th>\n",
 593 |        "      <td>3</td>\n",
 594 |        "    </tr>\n",
 595 |        "    <tr>\n",
 596 |        "      <th>2018-01-04</th>\n",
 597 |        "      <td>4</td>\n",
 598 |        "    </tr>\n",
 599 |        "    <tr>\n",
 600 |        "      <th>2018-01-05</th>\n",
 601 |        "      <td>5</td>\n",
 602 |        "    </tr>\n",
 603 |        "    <tr>\n",
 604 |        "      <th>2018-01-06</th>\n",
 605 |        "      <td>6</td>\n",
 606 |        "    </tr>\n",
 607 |        "    <tr>\n",
 608 |        "      <th>2018-01-07</th>\n",
 609 |        "      <td>7</td>\n",
 610 |        "    </tr>\n",
 611 |        "    <tr>\n",
 612 |        "      <th>2018-01-08</th>\n",
 613 |        "      <td>8</td>\n",
 614 |        "    </tr>\n",
 615 |        "    <tr>\n",
 616 |        "      <th>2018-01-09</th>\n",
 617 |        "      <td>9</td>\n",
 618 |        "    </tr>\n",
 619 |        "    <tr>\n",
 620 |        "      <th>2018-01-10</th>\n",
 621 |        "      <td>10</td>\n",
 622 |        "    </tr>\n",
 623 |        "  </tbody>\n",
 624 |        "</table>\n",
 625 |        "</div>"
 626 |       ],
 627 |       "text/plain": [
 628 |        "            num\n",
 629 |        "2018-01-01    1\n",
 630 |        "2018-01-02    2\n",
 631 |        "2018-01-03    3\n",
 632 |        "2018-01-04    4\n",
 633 |        "2018-01-05    5\n",
 634 |        "2018-01-06    6\n",
 635 |        "2018-01-07    7\n",
 636 |        "2018-01-08    8\n",
 637 |        "2018-01-09    9\n",
 638 |        "2018-01-10   10"
 639 |       ]
 640 |      },
 641 |      "execution_count": 6,
 642 |      "metadata": {},
 643 |      "output_type": "execute_result"
 644 |     }
 645 |    ],
 646 |    "source": [
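 647 |     "# Select the rows from January 2018 (use .loc; data[\"2018-01\"] is deprecated for row selection in newer pandas)\n",
 648 |     "data.loc[\"2018-01\"]"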
 649 |    ]
 650 |   },
 651 |   {
 652 |    "cell_type": "code",
 653 |    "execution_count": 8,
 654 |    "metadata": {},
 655 |    "outputs": [
 656 |     {
 657 |      "data": {
 658 |       "text/html": [
 659 |        "<div>\n",
 660 |        "<style scoped>\n",
 661 |        "    .dataframe tbody tr th:only-of-type {\n",
 662 |        "        vertical-align: middle;\n",
 663 |        "    }\n",
 664 |        "\n",
 665 |        "    .dataframe tbody tr th {\n",
 666 |        "        vertical-align: top;\n",
 667 |        "    }\n",
 668 |        "\n",
 669 |        "    .dataframe thead th {\n",
 670 |        "        text-align: right;\n",
 671 |        "    }\n",
 672 |        "</style>\n",
 673 |        "<table border=\"1\" class=\"dataframe\">\n",
 674 |        "  <thead>\n",
 675 |        "    <tr style=\"text-align: right;\">\n",
 676 |        "      <th></th>\n",
 677 |        "      <th>num</th>\n",
 678 |        "    </tr>\n",
 679 |        "  </thead>\n",
 680 |        "  <tbody>\n",
 681 |        "    <tr>\n",
 682 |        "      <th>2018-01-01</th>\n",
 683 |        "      <td>1</td>\n",
 684 |        "    </tr>\n",
 685 |        "    <tr>\n",
 686 |        "      <th>2018-01-02</th>\n",
 687 |        "      <td>2</td>\n",
 688 |        "    </tr>\n",
 689 |        "    <tr>\n",
 690 |        "      <th>2018-01-03</th>\n",
 691 |        "      <td>3</td>\n",
 692 |        "    </tr>\n",
 693 |        "    <tr>\n",
 694 |        "      <th>2018-01-04</th>\n",
 695 |        "      <td>4</td>\n",
 696 |        "    </tr>\n",
 697 |        "    <tr>\n",
 698 |        "      <th>2018-01-05</th>\n",
 699 |        "      <td>5</td>\n",
 700 |        "    </tr>\n",
 701 |        "  </tbody>\n",
 702 |        "</table>\n",
 703 |        "</div>"
 704 |       ],
 705 |       "text/plain": [
 706 |        "            num\n",
 707 |        "2018-01-01    1\n",
 708 |        "2018-01-02    2\n",
 709 |        "2018-01-03    3\n",
 710 |        "2018-01-04    4\n",
 711 |        "2018-01-05    5"
 712 |       ]
 713 |      },
 714 |      "execution_count": 8,
 715 |      "metadata": {},
 716 |      "output_type": "execute_result"
 717 |     }
 718 |    ],
 719 |    "source": [
 720 |     "# Select the rows from 2018-01-01 through 2018-01-05 (both endpoints included)\n",
 721 |     "data[\"2018-01-01\":\"2018-01-05\"]"
 722 |    ]
 723 |   },
 724 |   {
 725 |    "cell_type": "code",
 726 |    "execution_count": 3,
 727 |    "metadata": {},
 728 |    "outputs": [
 729 |     {
 730 |      "data": {
 731 |       "text/html": [
 732 |        "<div>\n",
 733 |        "<style scoped>\n",
 734 |        "    .dataframe tbody tr th:only-of-type {\n",
 735 |        "        vertical-align: middle;\n",
 736 |        "    }\n",
 737 |        "\n",
 738 |        "    .dataframe tbody tr th {\n",
 739 |        "        vertical-align: top;\n",
 740 |        "    }\n",
 741 |        "\n",
 742 |        "    .dataframe thead th {\n",
 743 |        "        text-align: right;\n",
 744 |        "    }\n",
 745 |        "</style>\n",
 746 |        "<table border=\"1\" class=\"dataframe\">\n",
 747 |        "  <thead>\n",
 748 |        "    <tr style=\"text-align: right;\">\n",
 749 |        "      <th></th>\n",
 750 |        "      <th>订单编号</th>\n",
 751 |        "      <th>客户姓名</th>\n",
 752 |        "      <th>唯一识别码</th>\n",
 753 |        "      <th>年龄</th>\n",
 754 |        "      <th>成交时间</th>\n",
 755 |        "      <th>销售ID</th>\n",
 756 |        "    </tr>\n",
 757 |        "  </thead>\n",
 758 |        "  <tbody>\n",
 759 |        "    <tr>\n",
 760 |        "      <th>1</th>\n",
 761 |        "      <td>A2</td>\n",
 762 |        "      <td>李谷</td>\n",
 763 |        "      <td>102</td>\n",
 764 |        "      <td>45</td>\n",
 765 |        "      <td>2018-08-09</td>\n",
 766 |        "      <td>2</td>\n",
 767 |        "    </tr>\n",
 768 |        "    <tr>\n",
 769 |        "      <th>2</th>\n",
 770 |        "      <td>A3</td>\n",
 771 |        "      <td>孙凤</td>\n",
 772 |        "      <td>103</td>\n",
 773 |        "      <td>23</td>\n",
 774 |        "      <td>2018-08-10</td>\n",
 775 |        "      <td>1</td>\n",
 776 |        "    </tr>\n",
 777 |        "    <tr>\n",
 778 |        "      <th>3</th>\n",
 779 |        "      <td>A4</td>\n",
 780 |        "      <td>赵恒</td>\n",
 781 |        "      <td>104</td>\n",
 782 |        "      <td>240</td>\n",
 783 |        "      <td>2018-08-11</td>\n",
 784 |        "      <td>2</td>\n",
 785 |        "    </tr>\n",
 786 |        "    <tr>\n",
 787 |        "      <th>4</th>\n",
 788 |        "      <td>A5</td>\n",
 789 |        "      <td>王娜</td>\n",
 790 |        "      <td>105</td>\n",
 791 |        "      <td>21</td>\n",
 792 |        "      <td>2018-08-11</td>\n",
 793 |        "      <td>3</td>\n",
 794 |        "    </tr>\n",
 795 |        "  </tbody>\n",
 796 |        "</table>\n",
 797 |        "</div>"
 798 |       ],
 799 |       "text/plain": [
 800 |        "  订单编号 客户姓名  唯一识别码   年龄       成交时间  销售ID\n",
 801 |        "1   A2   李谷    102   45 2018-08-09     2\n",
 802 |        "2   A3   孙凤    103   23 2018-08-10     1\n",
 803 |        "3   A4   赵恒    104  240 2018-08-11     2\n",
 804 |        "4   A5   王娜    105   21 2018-08-11     3"
 805 |       ]
 806 |      },
 807 |      "execution_count": 3,
 808 |      "metadata": {},
 809 |      "output_type": "execute_result"
 810 |     }
 811 |    ],
 812 |    "source": [
 813 |     "import pandas as pd\n",
 814 |     "from datetime import datetime\n",
 815 |     "df = pd.read_excel(r\"../Data/Chapter06.xlsx\",sheet_name = 4)\n",
 816 |     "df[df[\"成交时间\"]>datetime(2018,8,8)]"
 817 |    ]
 818 |   },
 819 |   {
 820 |    "cell_type": "code",
 821 |    "execution_count": 4,
 822 |    "metadata": {},
 823 |    "outputs": [
 824 |     {
 825 |      "data": {
 826 |       "text/html": [
 827 |        "<div>\n",
 828 |        "<style scoped>\n",
 829 |        "    .dataframe tbody tr th:only-of-type {\n",
 830 |        "        vertical-align: middle;\n",
 831 |        "    }\n",
 832 |        "\n",
 833 |        "    .dataframe tbody tr th {\n",
 834 |        "        vertical-align: top;\n",
 835 |        "    }\n",
 836 |        "\n",
 837 |        "    .dataframe thead th {\n",
 838 |        "        text-align: right;\n",
 839 |        "    }\n",
 840 |        "</style>\n",
 841 |        "<table border=\"1\" class=\"dataframe\">\n",
 842 |        "  <thead>\n",
 843 |        "    <tr style=\"text-align: right;\">\n",
 844 |        "      <th></th>\n",
 845 |        "      <th>订单编号</th>\n",
 846 |        "      <th>客户姓名</th>\n",
 847 |        "      <th>唯一识别码</th>\n",
 848 |        "      <th>年龄</th>\n",
 849 |        "      <th>成交时间</th>\n",
 850 |        "      <th>销售ID</th>\n",
 851 |        "    </tr>\n",
 852 |        "  </thead>\n",
 853 |        "  <tbody>\n",
 854 |        "    <tr>\n",
 855 |        "      <th>0</th>\n",
 856 |        "      <td>A1</td>\n",
 857 |        "      <td>张通</td>\n",
 858 |        "      <td>101</td>\n",
 859 |        "      <td>31</td>\n",
 860 |        "      <td>2018-08-08</td>\n",
 861 |        "      <td>1</td>\n",
 862 |        "    </tr>\n",
 863 |        "  </tbody>\n",
 864 |        "</table>\n",
 865 |        "</div>"
 866 |       ],
 867 |       "text/plain": [
 868 |        "  订单编号 客户姓名  唯一识别码  年龄       成交时间  销售ID\n",
 869 |        "0   A1   张通    101  31 2018-08-08     1"
 870 |       ]
 871 |      },
 872 |      "execution_count": 4,
 873 |      "metadata": {},
 874 |      "output_type": "execute_result"
 875 |     }
 876 |    ],
 877 |    "source": [
 878 |     "df[df[\"成交时间\"] == datetime(2018,8,8)]"
 879 |    ]
 880 |   },
 881 |   {
 882 |    "cell_type": "code",
 883 |    "execution_count": 26,
 884 |    "metadata": {},
 885 |    "outputs": [
 886 |     {
 887 |      "data": {
 888 |       "text/html": [
 889 |        "<div>\n",
 890 |        "<style scoped>\n",
 891 |        "    .dataframe tbody tr th:only-of-type {\n",
 892 |        "        vertical-align: middle;\n",
 893 |        "    }\n",
 894 |        "\n",
 895 |        "    .dataframe tbody tr th {\n",
 896 |        "        vertical-align: top;\n",
 897 |        "    }\n",
 898 |        "\n",
 899 |        "    .dataframe thead th {\n",
 900 |        "        text-align: right;\n",
 901 |        "    }\n",
 902 |        "</style>\n",
 903 |        "<table border=\"1\" class=\"dataframe\">\n",
 904 |        "  <thead>\n",
 905 |        "    <tr style=\"text-align: right;\">\n",
 906 |        "      <th></th>\n",
 907 |        "      <th>订单编号</th>\n",
 908 |        "      <th>客户姓名</th>\n",
 909 |        "      <th>唯一识别码</th>\n",
 910 |        "      <th>年龄</th>\n",
 911 |        "      <th>成交时间</th>\n",
 912 |        "      <th>销售ID</th>\n",
 913 |        "    </tr>\n",
 914 |        "  </thead>\n",
 915 |        "  <tbody>\n",
 916 |        "    <tr>\n",
 917 |        "      <th>0</th>\n",
 918 |        "      <td>A1</td>\n",
 919 |        "      <td>张通</td>\n",
 920 |        "      <td>101</td>\n",
 921 |        "      <td>31</td>\n",
 922 |        "      <td>2018-08-08</td>\n",
 923 |        "      <td>1</td>\n",
 924 |        "    </tr>\n",
 925 |        "  </tbody>\n",
 926 |        "</table>\n",
 927 |        "</div>"
 928 |       ],
 929 |       "text/plain": [
 930 |        "  订单编号 客户姓名  唯一识别码  年龄       成交时间  销售ID\n",
 931 |        "0   A1   张通    101  31 2018-08-08     1"
 932 |       ]
 933 |      },
 934 |      "execution_count": 26,
 935 |      "metadata": {},
 936 |      "output_type": "execute_result"
 937 |     }
 938 |    ],
 939 |    "source": [
 940 |     "df[df[\"成交时间\"]<datetime(2018,8,9)]"
 941 |    ]
 942 |   },
 943 |   {
 944 |    "cell_type": "code",
 945 |    "execution_count": 29,
 946 |    "metadata": {},
 947 |    "outputs": [
 948 |     {
 949 |      "data": {
 950 |       "text/html": [
 951 |        "<div>\n",
 952 |        "<style scoped>\n",
 953 |        "    .dataframe tbody tr th:only-of-type {\n",
 954 |        "        vertical-align: middle;\n",
 955 |        "    }\n",
 956 |        "\n",
 957 |        "    .dataframe tbody tr th {\n",
 958 |        "        vertical-align: top;\n",
 959 |        "    }\n",
 960 |        "\n",
 961 |        "    .dataframe thead th {\n",
 962 |        "        text-align: right;\n",
 963 |        "    }\n",
 964 |        "</style>\n",
 965 |        "<table border=\"1\" class=\"dataframe\">\n",
 966 |        "  <thead>\n",
 967 |        "    <tr style=\"text-align: right;\">\n",
 968 |        "      <th></th>\n",
 969 |        "      <th>订单编号</th>\n",
 970 |        "      <th>客户姓名</th>\n",
 971 |        "      <th>唯一识别码</th>\n",
 972 |        "      <th>年龄</th>\n",
 973 |        "      <th>成交时间</th>\n",
 974 |        "      <th>销售ID</th>\n",
 975 |        "    </tr>\n",
 976 |        "  </thead>\n",
 977 |        "  <tbody>\n",
 978 |        "    <tr>\n",
 979 |        "      <th>1</th>\n",
 980 |        "      <td>A2</td>\n",
 981 |        "      <td>李谷</td>\n",
 982 |        "      <td>102</td>\n",
 983 |        "      <td>45</td>\n",
 984 |        "      <td>2018-08-09</td>\n",
 985 |        "      <td>2</td>\n",
 986 |        "    </tr>\n",
 987 |        "    <tr>\n",
 988 |        "      <th>2</th>\n",
 989 |        "      <td>A3</td>\n",
 990 |        "      <td>孙凤</td>\n",
 991 |        "      <td>103</td>\n",
 992 |        "      <td>23</td>\n",
 993 |        "      <td>2018-08-10</td>\n",
 994 |        "      <td>1</td>\n",
 995 |        "    </tr>\n",
 996 |        "  </tbody>\n",
 997 |        "</table>\n",
 998 |        "</div>"
 999 |       ],
1000 |       "text/plain": [
1001 |        "  订单编号 客户姓名  唯一识别码  年龄       成交时间  销售ID\n",
1002 |        "1   A2   李谷    102  45 2018-08-09     2\n",
1003 |        "2   A3   孙凤    103  23 2018-08-10     1"
1004 |       ]
1005 |      },
1006 |      "execution_count": 29,
1007 |      "metadata": {},
1008 |      "output_type": "execute_result"
1009 |     }
1010 |    ],
1011 |    "source": [
1012 |     "df[(df[\"成交时间\"]>datetime(2018,8,8))&(df[\"成交时间\"]< datetime(2018,8,11))]"
1013 |    ]
1014 |   },
1015 |   {
1016 |    "cell_type": "markdown",
1017 |    "metadata": {},
1018 |    "source": [
1019 |     "## Time arithmetic"
1020 |    ]
1021 |   },
1022 |   {
1023 |    "cell_type": "markdown",
1024 |    "metadata": {},
1025 |    "source": [
1026 |     "**Difference between two datetimes**"
1027 |    ]
1028 |   },
1029 |   {
1030 |    "cell_type": "code",
1031 |    "execution_count": 30,
1032 |    "metadata": {},
1033 |    "outputs": [
1034 |     {
1035 |      "data": {
1036 |       "text/plain": [
1037 |        "datetime.timedelta(days=2, seconds=83880)"
1038 |       ]
1039 |      },
1040 |      "execution_count": 30,
1041 |      "metadata": {},
1042 |      "output_type": "execute_result"
1043 |     }
1044 |    ],
1045 |    "source": [
1046 |     "cha = datetime(2018,5,21,19,50)-datetime(2018,5,18,20,32)\n",
1047 |     "cha"
1048 |    ]
1049 |   },
1050 |   {
1051 |    "cell_type": "code",
1052 |    "execution_count": 31,
1053 |    "metadata": {},
1054 |    "outputs": [
1055 |     {
1056 |      "data": {
1057 |       "text/plain": [
1058 |        "2"
1059 |       ]
1060 |      },
1061 |      "execution_count": 31,
1062 |      "metadata": {},
1063 |      "output_type": "execute_result"
1064 |     }
1065 |    ],
1066 |    "source": [
1067 |     "# Whole days in the difference\n",
1068 |     "cha.days"
1069 |    ]
1070 |   },
1071 |   {
1072 |    "cell_type": "code",
1073 |    "execution_count": 33,
1074 |    "metadata": {},
1075 |    "outputs": [
1076 |     {
1077 |      "data": {
1078 |       "text/plain": [
1079 |        "83880"
1080 |       ]
1081 |      },
1082 |      "execution_count": 33,
1083 |      "metadata": {},
1084 |      "output_type": "execute_result"
1085 |     }
1086 |    ],
1087 |    "source": [
1088 |     "# Seconds left over after the whole days (not the total number of seconds)\n",
1089 |     "cha.seconds"
1090 |    ]
1091 |   },
1092 |   {
1093 |    "cell_type": "code",
1094 |    "execution_count": 35,
1095 |    "metadata": {},
1096 |    "outputs": [
1097 |     {
1098 |      "data": {
1099 |       "text/plain": [
1100 |        "23.3"
1101 |       ]
1102 |      },
1103 |      "execution_count": 35,
1104 |      "metadata": {},
1105 |      "output_type": "execute_result"
1106 |     }
1107 |    ],
1108 |    "source": [
1109 |     "# Leftover seconds converted to hours; the whole days in cha.days are not included\n",
1110 |     "cha.seconds/3600"
1111 |    ]
1112 |   },
1113 |   {
1114 |    "cell_type": "markdown",
1115 |    "metadata": {},
1116 |    "source": [
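1117 |     "**Time offsets**\n",
1118 |     "- timedelta stores only days, seconds, and microseconds (its constructor also accepts weeks, hours, minutes, and milliseconds)\n",
1119 |     "- pandas date offsets shift a datetime directly by days, hours, or minutes"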
1120 |    ]
1121 |   },
1122 |   {
1123 |    "cell_type": "markdown",
1124 |    "metadata": {},
1125 |    "source": [
1126 |     "**timedelta**"
1127 |    ]
1128 |   },
1129 |   {
1130 |    "cell_type": "code",
1131 |    "execution_count": 43,
1132 |    "metadata": {},
1133 |    "outputs": [
1134 |     {
1135 |      "data": {
1136 |       "text/plain": [
1137 |        "datetime.datetime(2019, 3, 14, 15, 39, 55, 130084)"
1138 |       ]
1139 |      },
1140 |      "execution_count": 43,
1141 |      "metadata": {},
1142 |      "output_type": "execute_result"
1143 |     }
1144 |    ],
1145 |    "source": [
1146 |     "from datetime import timedelta,datetime\n",
1147 |     "date = datetime.now()\n",
1148 |     "date"
1149 |    ]
1150 |   },
1151 |   {
1152 |    "cell_type": "code",
1153 |    "execution_count": 51,
1154 |    "metadata": {},
1155 |    "outputs": [
1156 |     {
1157 |      "data": {
1158 |       "text/plain": [
1159 |        "datetime.datetime(2019, 3, 15, 15, 39, 55, 130084)"
1160 |       ]
1161 |      },
1162 |      "execution_count": 51,
1163 |      "metadata": {},
1164 |      "output_type": "execute_result"
1165 |     }
1166 |    ],
1167 |    "source": [
1168 |     "# Shift forward one day\n",
1169 |     "date+timedelta(days =1)"
1170 |    ]
1171 |   },
1172 |   {
1173 |    "cell_type": "code",
1174 |    "execution_count": 50,
1175 |    "metadata": {},
1176 |    "outputs": [
1177 |     {
1178 |      "data": {
1179 |       "text/plain": [
1180 |        "datetime.datetime(2019, 3, 14, 15, 40, 55, 130084)"
1181 |       ]
1182 |      },
1183 |      "execution_count": 50,
1184 |      "metadata": {},
1185 |      "output_type": "execute_result"
1186 |     }
1187 |    ],
1188 |    "source": [
1189 |     "# Shift forward 60 seconds\n",
1190 |     "date+timedelta(seconds = 60)"
1191 |    ]
1192 |   },
1193 |   {
1194 |    "cell_type": "code",
1195 |    "execution_count": 52,
1196 |    "metadata": {},
1197 |    "outputs": [
1198 |     {
1199 |      "data": {
1200 |       "text/plain": [
1201 |        "datetime.datetime(2019, 3, 13, 15, 39, 55, 130084)"
1202 |       ]
1203 |      },
1204 |      "execution_count": 52,
1205 |      "metadata": {},
1206 |      "output_type": "execute_result"
1207 |     }
1208 |    ],
1209 |    "source": [
1210 |     "# Shift back one day\n",
1211 |     "date - timedelta(days =1)"
1212 |    ]
1213 |   },
1214 |   {
1215 |    "cell_type": "markdown",
1216 |    "metadata": {},
1217 |    "source": [
1218 |     "**date offset**"
1219 |    ]
1220 |   },
1221 |   {
1222 |    "cell_type": "code",
1223 |    "execution_count": 74,
1224 |    "metadata": {},
1225 |    "outputs": [
1226 |     {
1227 |      "data": {
1228 |       "text/plain": [
1229 |        "datetime.datetime(2019, 3, 14, 15, 57, 32, 786664)"
1230 |       ]
1231 |      },
1232 |      "execution_count": 74,
1233 |      "metadata": {},
1234 |      "output_type": "execute_result"
1235 |     }
1236 |    ],
1237 |    "source": [
1238 |     "from pandas.tseries.offsets import Hour,Minute,Day,MonthEnd\n",
1239 |     "date = datetime.now()\n",
1240 |     "date"
1241 |    ]
1242 |   },
1243 |   {
1244 |    "cell_type": "code",
1245 |    "execution_count": 67,
1246 |    "metadata": {},
1247 |    "outputs": [
1248 |     {
1249 |      "data": {
1250 |       "text/plain": [
1251 |        "Timestamp('2019-03-15 15:54:23.875623')"
1252 |       ]
1253 |      },
1254 |      "execution_count": 67,
1255 |      "metadata": {},
1256 |      "output_type": "execute_result"
1257 |     }
1258 |    ],
1259 |    "source": [
1260 |     "# Shift forward one day\n",
1261 |     "date+Day(1)"
1262 |    ]
1263 |   },
1264 |   {
1265 |    "cell_type": "code",
1266 |    "execution_count": 70,
1267 |    "metadata": {},
1268 |    "outputs": [
1269 |     {
1270 |      "data": {
1271 |       "text/plain": [
1272 |        "Timestamp('2019-03-14 16:54:23.875623')"
1273 |       ]
1274 |      },
1275 |      "execution_count": 70,
1276 |      "metadata": {},
1277 |      "output_type": "execute_result"
1278 |     }
1279 |    ],
1280 |    "source": [
1281 |     "# Shift forward one hour\n",
1282 |     "date+Hour(1)"
1283 |    ]
1284 |   },
1285 |   {
1286 |    "cell_type": "code",
1287 |    "execution_count": 71,
1288 |    "metadata": {},
1289 |    "outputs": [
1290 |     {
1291 |      "data": {
1292 |       "text/plain": [
1293 |        "Timestamp('2019-03-14 16:04:23.875623')"
1294 |       ]
1295 |      },
1296 |      "execution_count": 71,
1297 |      "metadata": {},
1298 |      "output_type": "execute_result"
1299 |     }
1300 |    ],
1301 |    "source": [
1302 |     "# Shift forward 10 minutes\n",
1303 |     "date+Minute(10)"
1304 |    ]
1305 |   },
1306 |   {
1307 |    "cell_type": "code",
1308 |    "execution_count": 75,
1309 |    "metadata": {},
1310 |    "outputs": [
1311 |     {
1312 |      "data": {
1313 |       "text/plain": [
1314 |        "Timestamp('2019-03-31 15:57:32.786664')"
1315 |       ]
1316 |      },
1317 |      "execution_count": 75,
1318 |      "metadata": {},
1319 |      "output_type": "execute_result"
1320 |     }
1321 |    ],
1322 |    "source": [
1323 |     "# Roll forward to the end of the month\n",
1324 |     "date+MonthEnd(1)"
1325 |    ]
1326 |   }
1327 |  ],
1328 |  "metadata": {
1329 |   "kernelspec": {
1330 |    "display_name": "Python 3",
1331 |    "language": "python",
1332 |    "name": "python3"
1333 |   },
1334 |   "language_info": {
1335 |    "codemirror_mode": {
1336 |     "name": "ipython",
1337 |     "version": 3
1338 |    },
1339 |    "file_extension": ".py",
1340 |    "mimetype": "text/x-python",
1341 |    "name": "python",
1342 |    "nbconvert_exporter": "python",
1343 |    "pygments_lexer": "ipython3",
1344 |    "version": "3.7.0"
1345 |   },
1346 |   "toc": {
1347 |    "base_numbering": 1,
1348 |    "nav_menu": {},
1349 |    "number_sections": true,
1350 |    "sideBar": true,
1351 |    "skip_h1_title": false,
1352 |    "title_cell": "Table of Contents",
1353 |    "title_sidebar": "第9章 时间序列",
1354 |    "toc_cell": false,
1355 |    "toc_position": {},
1356 |    "toc_section_display": true,
1357 |    "toc_window_display": true
1358 |   }
1359 |  },
1360 |  "nbformat": 4,
1361 |  "nbformat_minor": 2
1362 | }
1363 | 


--------------------------------------------------------------------------------
/Code/Chapter12 结果导出.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "## Export to an .xlsx file"
  8 |    ]
  9 |   },
 10 |   {
 11 |    "cell_type": "markdown",
 12 |    "metadata": {},
 13 |    "source": [
 14 |     "**Set the export path**"
 15 |    ]
 16 |   },
 17 |   {
 18 |    "cell_type": "code",
 19 |    "execution_count": 47,
 20 |    "metadata": {},
 21 |    "outputs": [],
 22 |    "source": [
 23 |     "import pandas as pd\n",
 24 |     "df = pd.read_excel(r\"../Data/Chapter12.xlsx\",sheet_name =0 )\n",
 25 |     "df.to_excel(excel_writer = r\"C:\\Users\\Administrator\\Excel-Python\\Data\\测试文档01.xlsx\")"
 26 |    ]
 27 |   },
 28 |   {
 29 |    "cell_type": "markdown",
 30 |    "metadata": {},
 31 |    "source": [
 32 |     "**Set the sheet name**"
 33 |    ]
 34 |   },
 35 |   {
 36 |    "cell_type": "code",
 37 |    "execution_count": 48,
 38 |    "metadata": {},
 39 |    "outputs": [],
 40 |    "source": [
 41 |     "df.to_excel(excel_writer = r\"C:\\Users\\Administrator\\Excel-Python\\Data\\测试文档02.xlsx\",\n",
 42 |     "            sheet_name =\"测试\")"
 43 |    ]
 44 |   },
 45 |   {
 46 |    "cell_type": "markdown",
 47 |    "metadata": {},
 48 |    "source": [
 49 |     "**Set the index option**"
 50 |    ]
 51 |   },
 52 |   {
 53 |    "cell_type": "code",
 54 |    "execution_count": 46,
 55 |    "metadata": {},
 56 |    "outputs": [],
 57 |    "source": [
 58 |     "df.to_excel(excel_writer = r\"C:\\Users\\Administrator\\Excel-Python\\Data\\测试文档03.xlsx\",\n",
 59 |     "            index = False)"
 60 |    ]
 61 |   },
 62 |   {
 63 |    "cell_type": "markdown",
 64 |    "metadata": {},
 65 |    "source": [
 66 |     "**Choose the columns to export**"
 67 |    ]
 68 |   },
 69 |   {
 70 |    "cell_type": "code",
 71 |    "execution_count": 45,
 72 |    "metadata": {},
 73 |    "outputs": [],
 74 |    "source": [
 75 |     "df = pd.read_excel(r\"../Data/Chapter12.xlsx\",sheet_name =0 )\n",
 76 |     "df.to_excel(excel_writer = r\"C:\\Users\\Administrator\\Excel-Python\\Data\\测试文档04.xlsx\",\n",
 77 |     "            sheet_name = \"测试文档\",\n",
 78 |     "            index=False,columns = [\"用户ID\",\"7月销量\",\"8月销量\",\"9月销量\"])"
 79 |    ]
 80 |   },
 81 |   {
 82 |    "cell_type": "markdown",
 83 |    "metadata": {},
 84 |    "source": [
 85 |     "**Set the encoding**"
 86 |    ]
 87 |   },
 88 |   {
 89 |    "cell_type": "code",
 90 |    "execution_count": 43,
 91 |    "metadata": {},
 92 |    "outputs": [],
 93 |    "source": [
 94 |     "df.to_excel(excel_writer = r\"C:\\Users\\Administrator\\Excel-Python\\Data\\测试文档05.xlsx\",\n",
 95 |     "           sheet_name = \"测试文档\",\n",
 96 |     "           index = False,\n",
 97 |     "           encoding = \"utf-8\")  # to_excel no longer accepts encoding in pandas 2.0+; drop this argument there"
 98 |    ]
 99 |   },
100 |   {
101 |    "cell_type": "markdown",
102 |    "metadata": {},
103 |    "source": [
104 |     "**Handle missing values**"
105 |    ]
106 |   },
107 |   {
108 |    "cell_type": "code",
109 |    "execution_count": 42,
110 |    "metadata": {},
111 |    "outputs": [],
112 |    "source": [
113 |     "df = pd.read_excel(r\"../Data/Chapter12.xlsx\",sheet_name =2)\n",
114 |     "df.to_excel(excel_writer = r\"C:\\Users\\Administrator\\Excel-Python\\Data\\测试文档06.xlsx\",\n",
115 |     "           sheet_name=\"测试文档\",\n",
116 |     "           index = False,\n",
117 |     "           encoding = \"utf-8\",\n",
118 |     "           na_rep = 0  # write missing values as 0\n",
119 |     "           )"
120 |    ]
121 |   },
122 |   {
123 |    "cell_type": "markdown",
124 |    "metadata": {},
125 |    "source": [
126 |     "**Handle infinite values**"
127 |    ]
128 |   },
129 |   {
130 |    "cell_type": "code",
131 |    "execution_count": 55,
132 |    "metadata": {},
133 |    "outputs": [],
134 |    "source": [
135 |     "df = pd.read_excel(r\"../Data/Chapter12.xlsx\",sheet_name =1)\n",
136 |     "df.to_excel(excel_writer = r\"C:\\Users\\Administrator\\Excel-Python\\Data\\测试文档07.xlsx\",\n",
137 |     "           sheet_name = \"测试文档\",\n",
138 |     "           index = False,\n",
139 |     "           encoding = \"utf-8\",\n",
140 |     "           na_rep = 0,\n",
141 |     "           inf_rep = 0  # write infinite values as 0\n",
142 |     "           )"
143 |    ]
144 |   },
145 |   {
146 |    "cell_type": "markdown",
147 |    "metadata": {},
148 |    "source": [
149 |     "## Export to a .csv file"
150 |    ]
151 |   },
152 |   {
153 |    "cell_type": "markdown",
154 |    "metadata": {},
155 |    "source": [
156 |     "**Set the export path**"
157 |    ]
158 |   },
159 |   {
160 |    "cell_type": "code",
161 |    "execution_count": 82,
162 |    "metadata": {},
163 |    "outputs": [],
164 |    "source": [
165 |     "df = pd.read_excel(r\"../Data/Chapter12.xlsx\",sheet_name =0)\n",
166 |     "df.to_csv(path_or_buf = r\"C:\\Users\\Administrator\\Excel-Python\\Data\\测试文档01.csv\" )"
167 |    ]
168 |   },
169 |   {
170 |    "cell_type": "markdown",
171 |    "metadata": {},
172 |    "source": [
173 |     "**Set the index option**"
174 |    ]
175 |   },
176 |   {
177 |    "cell_type": "code",
178 |    "execution_count": 64,
179 |    "metadata": {},
180 |    "outputs": [],
181 |    "source": [
182 |     "df.to_csv(path_or_buf = r\"C:\\Users\\Administrator\\Excel-Python\\Data\\测试文档02.csv\",\n",
183 |     "          index = False )"
184 |    ]
185 |   },
186 |   {
187 |    "cell_type": "markdown",
188 |    "metadata": {},
189 |    "source": [
190 |     "**Choose the columns to export**"
191 |    ]
192 |   },
193 |   {
194 |    "cell_type": "code",
195 |    "execution_count": 83,
196 |    "metadata": {},
197 |    "outputs": [],
198 |    "source": [
199 |     "df.to_csv(path_or_buf = r\"C:\\Users\\Administrator\\Excel-Python\\Data\\测试文档03.csv\" ,\n",
200 |     "          index= False,\n",
201 |     "          columns = [\"用户ID\",\"7月销量\",\"8月销量\",\"9月销量\"])"
202 |    ]
203 |   },
204 |   {
205 |    "cell_type": "markdown",
206 |    "metadata": {},
207 |    "source": [
208 |     "**Set the separator**"
209 |    ]
210 |   },
211 |   {
212 |    "cell_type": "code",
213 |    "execution_count": 77,
214 |    "metadata": {},
215 |    "outputs": [],
216 |    "source": [
217 |     "df.to_csv(path_or_buf = r\"C:\\Users\\Administrator\\Excel-Python\\Data\\测试文档04.csv\" ,\n",
218 |     "          index= False,\n",
219 |     "          columns = [\"用户ID\",\"7月销量\",\"8月销量\",\"9月销量\"],\n",
220 |     "          sep=\",\")"
221 |    ]
222 |   },
223 |   {
224 |    "cell_type": "markdown",
225 |    "metadata": {},
226 |    "source": [
227 |     "**Handle missing values**"
228 |    ]
229 |   },
230 |   {
231 |    "cell_type": "code",
232 |    "execution_count": 75,
233 |    "metadata": {},
234 |    "outputs": [],
235 |    "source": [
236 |     "df = pd.read_excel(r\"../Data/Chapter12.xlsx\",sheet_name =2)\n",
237 |     "df.to_csv(path_or_buf = r\"C:\\Users\\Administrator\\Excel-Python\\Data\\测试文档05.csv\" ,\n",
238 |     "          index= False,\n",
239 |     "          columns = [\"用户ID\",\"7月销量\",\"8月销量\",\"9月销量\"],\n",
240 |     "          sep=\",\",\n",
241 |     "          na_rep = 0)"
242 |    ]
243 |   },
244 |   {
245 |    "cell_type": "markdown",
246 |    "metadata": {},
247 |    "source": [
248 |     "**Set the encoding**"
249 |    ]
250 |   },
251 |   {
252 |    "cell_type": "code",
253 |    "execution_count": 81,
254 |    "metadata": {},
255 |    "outputs": [],
256 |    "source": [
257 |     "df = pd.read_excel(r\"../Data/Chapter12.xlsx\",sheet_name =2)\n",
258 |     "df.to_csv(path_or_buf = r\"C:\\Users\\Administrator\\Excel-Python\\Data\\测试文档06.csv\" ,\n",
259 |     "          index= False,\n",
260 |     "          columns = [\"用户ID\",\"7月销量\",\"8月销量\",\"9月销量\"],\n",
261 |     "          sep=\",\",\n",
262 |     "          na_rep = 0,\n",
263 |     "          encoding = \"gbk\"  # use gbk or utf-8-sig\n",
264 |     "         )"
265 |    ]
266 |   },
267 |   {
268 |    "cell_type": "markdown",
269 |    "metadata": {},
270 |    "source": [
271 |     "## Export to multiple sheets"
272 |    ]
273 |   },
274 |   {
275 |    "cell_type": "code",
276 |    "execution_count": 80,
277 |    "metadata": {},
278 |    "outputs": [],
279 |    "source": [
280 |     "df1 = pd.read_excel(r\"../Data/Chapter12.xlsx\",sheet_name =0)\n",
281 |     "df2 = pd.read_excel(r\"../Data/Chapter12.xlsx\",sheet_name =1)\n",
282 |     "df3 = pd.read_excel(r\"../Data/Chapter12.xlsx\",sheet_name =2)\n",
283 |     "# Create an ExcelWriter object\n",
284 |     "writer = pd.ExcelWriter(r\"C:\\Users\\Administrator\\Excel-Python\\Data\\test02.xlsx\",\n",
285 |     "                        engine = \"xlsxwriter\")\n",
286 |     "# Write df1, df2, and df3 to three sheets of the same workbook\n",
287 |     "# and name the sheets 表1, 表2, 表3\n",
288 |     "df1.to_excel(writer,sheet_name =\"表1\")\n",
289 |     "df2.to_excel(writer,sheet_name =\"表2\")\n",
290 |     "df3.to_excel(writer,sheet_name =\"表3\")\n",
291 |     "# Save and close the workbook; close() also works on pandas 2.0+, where save() was removed\n",
292 |     "writer.close()"
293 |    ]
294 |   }
295 |  ],
296 |  "metadata": {
297 |   "kernelspec": {
298 |    "display_name": "Python 3",
299 |    "language": "python",
300 |    "name": "python3"
301 |   },
302 |   "language_info": {
303 |    "codemirror_mode": {
304 |     "name": "ipython",
305 |     "version": 3
306 |    },
307 |    "file_extension": ".py",
308 |    "mimetype": "text/x-python",
309 |    "name": "python",
310 |    "nbconvert_exporter": "python",
311 |    "pygments_lexer": "ipython3",
312 |    "version": "3.7.0"
313 |   },
314 |   "toc": {
315 |    "base_numbering": 1,
316 |    "nav_menu": {},
317 |    "number_sections": true,
318 |    "sideBar": true,
319 |    "skip_h1_title": false,
320 |    "title_cell": "Table of Contents",
321 |    "title_sidebar": "第12章 结果导出",
322 |    "toc_cell": false,
323 |    "toc_position": {},
324 |    "toc_section_display": true,
325 |    "toc_window_display": true
326 |   }
327 |  },
328 |  "nbformat": 4,
329 |  "nbformat_minor": 2
330 | }
331 | 


--------------------------------------------------------------------------------
/Data/Chapter04.1.csv:
--------------------------------------------------------------------------------
1 | ﻿编号 年龄 性别 注册时间
2 | A1 54 男 2018/8/8
3 | A2 16 女 2018/8/9
4 | A3 47 女 2018/8/10
5 | A4 41 男 2018/8/11
6 | 


--------------------------------------------------------------------------------
/Data/Chapter04.csv:
--------------------------------------------------------------------------------
1 | ﻿编号,年龄,性别,注册时间
2 | A1,54,男,2018/8/8
3 | A2,16,女,2018/8/9
4 | A3,47,女,2018/8/10
5 | A4,41,男,2018/8/11
6 | 


--------------------------------------------------------------------------------
/Data/Chapter04.txt:
--------------------------------------------------------------------------------
1 | ﻿编号,年龄,性别,注册时间
2 | A1,54,男,2018/8/8
3 | A2,16,女,2018/8/9
4 | A3,47,女,2018/8/10
5 | A4,41,男,2018/8/11
6 | 


--------------------------------------------------------------------------------
/Data/Chapter04.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Data/Chapter04.xlsx


--------------------------------------------------------------------------------
/Data/Chapter05.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Data/Chapter05.xlsx


--------------------------------------------------------------------------------
/Data/Chapter06.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Data/Chapter06.xlsx


--------------------------------------------------------------------------------
/Data/Chapter07.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Data/Chapter07.xlsx


--------------------------------------------------------------------------------
/Data/Chapter08.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Data/Chapter08.xlsx


--------------------------------------------------------------------------------
/Data/Chapter10.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Data/Chapter10.xlsx


--------------------------------------------------------------------------------
/Data/Chapter11.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Data/Chapter11.xlsx


--------------------------------------------------------------------------------
/Data/Chapter12.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Data/Chapter12.xlsx


--------------------------------------------------------------------------------
/Data/fillna.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Data/fillna.xlsx


--------------------------------------------------------------------------------
/Data/loan.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Data/loan.csv


--------------------------------------------------------------------------------
/Data/order-14.1.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Data/order-14.1.csv


--------------------------------------------------------------------------------
/Data/order-14.3.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Data/order-14.3.csv


--------------------------------------------------------------------------------
/Data/train-pivot.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Data/train-pivot.csv


--------------------------------------------------------------------------------
/Data/数据集使用说明.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Data/数据集使用说明.txt


--------------------------------------------------------------------------------
/Note/Git Fork开源项目如何同步更新.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Note/Git Fork开源项目如何同步更新.pdf


--------------------------------------------------------------------------------
/Note/Markdown常用标签.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Note/Markdown常用标签.pdf


--------------------------------------------------------------------------------
/Note/jupyter notebook导出pdf并支持中文.md:
--------------------------------------------------------------------------------
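 1 | **Jupyter Notebook** is a great environment for data science work. I use it for almost all of my data analysis projects and small exercises (it used to be called IPython Notebook, which is why the file extension is still .ipynb). I rarely needed the PDF export before, and ran into a lot of pitfalls when I finally did. Jupyter can export to .py, .html, .md, .pdf, and other formats.
 2 | 
 3 | ![Export formats supported by Jupyter Notebook](https://upload-images.jianshu.io/upload_images/2473543-b37f85b5584364b9.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
 4 | 
 5 | In terms of appearance, the notebook rendered in the browser looks best, while the exported HTML badly distorts code and hyperlinks. When I clicked *Download as -> PDF via LaTeX* in the web interface, it first complained that Pandoc was missing, so I ran pip install pandoc. After that the missing-library message went away, but I got
 6 |  nbconvert failed: pdflatex not found on PATH or nbconvert failed: PDF creating failed, captured latex output instead. After some research I switched to the command line. To avoid the error *'xelatex' is not recognized as an internal or external command, operable program or batch file*, install MiKTeX first. Download it from the [official site](https://miktex.org/download); on Windows, clicking Next through the installer is enough. The installer is about 190 MB and the installation takes a while. Once it is installed, the steps are: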
 7 | 
 8 | ### 1, Convert the ipynb file to tex
 9 | In a terminal, change to the directory containing the notebook you want to convert and run
10 |  **jupyter nbconvert --to latex yourNotebookName.ipynb**
11 | 
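12 | ![Converting the ipynb file to a LaTeX file](https://upload-images.jianshu.io/upload_images/2473543-3066970796a6043b.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
13 | A LaTeX file named **yourNotebookName.tex** now appears in that directory.
14 | ### 2, Edit the LaTeX file by hand
15 | To support Chinese in the output, the tex file needs a small change. Open the generated LaTeX file in an editor (I use Notepad++),
16 | and insert the following lines after **\documentclass{article}** (if that line is not present, insert them after \documentclass[11pt]{ctexart} instead):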
17 | ```latex
18 | \usepackage{fontspec, xunicode, xltxtra}
19 | \setmainfont{Microsoft YaHei}
20 | \usepackage{ctex}
21 | ```
22 | ![Editing the LaTeX file](https://upload-images.jianshu.io/upload_images/2473543-898fdf8271689505.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
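
The manual edit in step 2 could also be scripted. The following is a minimal sketch, not part of the original workflow: the helper name `add_cjk_preamble` is hypothetical, and it assumes the generated file contains an uncommented `\documentclass` line. It inserts the three preamble lines shown above directly after that line.

```python
import re

# The three preamble lines from step 2 of this note.
CJK_PREAMBLE = (
    "\\usepackage{fontspec, xunicode, xltxtra}\n"
    "\\setmainfont{Microsoft YaHei}\n"
    "\\usepackage{ctex}"
)


def add_cjk_preamble(tex: str) -> str:
    """Return tex with the CJK font setup inserted after the first \\documentclass line.

    Hypothetical helper for illustration; it only edits the text it is given.
    """
    if "\\usepackage{ctex}" in tex:
        return tex  # already patched, leave unchanged
    # Match the first line that starts with \documentclass (ignoring leading blanks).
    match = re.search(r"^[ \t]*\\documentclass.*$", tex, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no \\documentclass line found")
    end = match.end()
    return tex[:end] + "\n" + CJK_PREAMBLE + tex[end:]
```

To use it, read yourNotebookName.tex as UTF-8 text, pass the text through `add_cjk_preamble`, and write the result back before running xelatex.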
23 | 
24 | ### 3, Convert the LaTeX file to PDF
25 | Then run the following in the terminal (my demo file is GeoCluster.tex):
26 | ```
27 | xelatex yourNotebookName.tex
28 | ```
29 | ![Converting LaTeX to PDF from the command line](https://upload-images.jianshu.io/upload_images/2473543-6624da52f9d4d9d1.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
30 | If xelatex has never been run before, the first run installs some dependencies and is slower. When it finishes:
31 | ![After the xelatex command finishes](https://upload-images.jianshu.io/upload_images/2473543-192ac8f3fe434b96.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
32 | The output files appear in the folder:
33 | ![Final contents of the folder](https://upload-images.jianshu.io/upload_images/2473543-c7f89da3bad6866f.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
34 | - .ipynb is the Jupyter notebook file
35 | - .tex is generated from the Jupyter notebook file
36 | - .pdf is the final target file, generated from the .tex file
37 | - .log, .out, and .aux are output and log files that LaTeX produces while building the PDF
38 | 
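39 | To summarize, generating a PDF from a Jupyter notebook has quite a few dependencies; on Windows, MiKTeX must be installed before the xelatex command is available. The steps are: convert the ipynb file to LaTeX, edit the tex file to support Chinese, then convert it to PDF.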
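
The two command-line steps can be driven from Python as well. This is only a sketch under assumptions: the function names are hypothetical, and it presumes `jupyter` and `xelatex` are both on PATH. It uses the same two commands given in steps 1 and 3 and leaves a marked spot for the manual edit of step 2.

```python
import subprocess
from pathlib import Path
from typing import List


def nbconvert_command(notebook: Path) -> List[str]:
    """Command from step 1: convert the notebook to a .tex file."""
    return ["jupyter", "nbconvert", "--to", "latex", notebook.name]


def xelatex_command(notebook: Path) -> List[str]:
    """Command from step 3: compile the generated .tex file."""
    return ["xelatex", notebook.with_suffix(".tex").name]


def export_pdf(notebook_path: str) -> Path:
    """Run both commands in the notebook's directory (hypothetical helper)."""
    notebook = Path(notebook_path).resolve()
    workdir = notebook.parent
    subprocess.run(nbconvert_command(notebook), cwd=workdir, check=True)
    # Step 2 belongs here: edit the generated .tex file to add the Chinese font setup.
    subprocess.run(xelatex_command(notebook), cwd=workdir, check=True)
    return notebook.with_suffix(".pdf")
```

`check=True` makes a failing command raise `subprocess.CalledProcessError`, so a missing xelatex shows up as an error rather than a silently absent PDF.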
40 | 
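41 | The final result is shown below. It still does not match the notebook rendered directly in the browser, but it works better as a presentation format than the exported HTML and other formats.
42 | ![The generated PDF](https://upload-images.jianshu.io/upload_images/2473543-036c476dcddbbca0.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)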
43 | 
44 | ps:
45 | - Looking back, the download and installation part is a bit brief; I may expand it later;
46 | - Original post on [Jianshu](https://www.jianshu.com/p/6b84a9631f8a)
47 | - [A solution for Chinese support in MiKTeX](https://jingyan.baidu.com/article/ff411625e229d512e482379c.html)
48 | - [Exporting a PDF that contains Chinese from ipython notebook](https://blog.csdn.net/weixin_42114013/article/details/81106797)


--------------------------------------------------------------------------------
/Note/pandas填充缺失值fillna()函数.ipynb:
--------------------------------------------------------------------------------
   1 | {
   2 |  "cells": [
   3 |   {
   4 |    "cell_type": "markdown",
   5 |    "metadata": {},
   6 |    "source": [
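   7 |     "## **The pandas fillna() function for filling missing values**   \n",
   8 |     "Filling missing values is a routine part of data processing. This note covers eight common ways to use fillna():\n",
   9 |     "- Fill with a constant\n",
  10 |     "- Fill with a dict\n",
  11 |     "- Fill with a computed value\n",
  12 |     "- Fill from a specific column\n",
  13 |     "- Fill with the previous/next value\n",
  14 |     "- Limit the number of fills\n",
  15 |     "- Set the fill direction\n",
  16 |     "- Modify the source data in place"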
  17 |    ]
  18 |   },
  19 |   {
  20 |    "cell_type": "code",
  21 |    "execution_count": 12,
  22 |    "metadata": {},
  23 |    "outputs": [
  24 |     {
  25 |      "data": {
  26 |       "text/html": [
  27 |        "<div>\n",
  28 |        "<style scoped>\n",
  29 |        "    .dataframe tbody tr th:only-of-type {\n",
  30 |        "        vertical-align: middle;\n",
  31 |        "    }\n",
  32 |        "\n",
  33 |        "    .dataframe tbody tr th {\n",
  34 |        "        vertical-align: top;\n",
  35 |        "    }\n",
  36 |        "\n",
  37 |        "    .dataframe thead th {\n",
  38 |        "        text-align: right;\n",
  39 |        "    }\n",
  40 |        "</style>\n",
  41 |        "<table border=\"1\" class=\"dataframe\">\n",
  42 |        "  <thead>\n",
  43 |        "    <tr style=\"text-align: right;\">\n",
  44 |        "      <th></th>\n",
  45 |        "      <th>名次</th>\n",
  46 |        "      <th>姓名</th>\n",
  47 |        "      <th>语文</th>\n",
  48 |        "      <th>数学</th>\n",
  49 |        "      <th>外语</th>\n",
  50 |        "    </tr>\n",
  51 |        "  </thead>\n",
  52 |        "  <tbody>\n",
  53 |        "    <tr>\n",
  54 |        "      <th>0</th>\n",
  55 |        "      <td>1</td>\n",
  56 |        "      <td>郭靖</td>\n",
  57 |        "      <td>90.0</td>\n",
  58 |        "      <td>80.0</td>\n",
  59 |        "      <td>76</td>\n",
  60 |        "    </tr>\n",
  61 |        "    <tr>\n",
  62 |        "      <th>1</th>\n",
  63 |        "      <td>2</td>\n",
  64 |        "      <td>黄蓉</td>\n",
  65 |        "      <td>100.0</td>\n",
  66 |        "      <td>100.0</td>\n",
  67 |        "      <td>98</td>\n",
  68 |        "    </tr>\n",
  69 |        "    <tr>\n",
  70 |        "      <th>2</th>\n",
  71 |        "      <td>3</td>\n",
  72 |        "      <td>黄药师</td>\n",
  73 |        "      <td>NaN</td>\n",
  74 |        "      <td>98.0</td>\n",
  75 |        "      <td>100</td>\n",
  76 |        "    </tr>\n",
  77 |        "    <tr>\n",
  78 |        "      <th>3</th>\n",
  79 |        "      <td>4</td>\n",
  80 |        "      <td>欧阳锋</td>\n",
  81 |        "      <td>NaN</td>\n",
  82 |        "      <td>95.0</td>\n",
  83 |        "      <td>85</td>\n",
  84 |        "    </tr>\n",
  85 |        "    <tr>\n",
  86 |        "      <th>4</th>\n",
  87 |        "      <td>5</td>\n",
  88 |        "      <td>洪七公</td>\n",
  89 |        "      <td>98.0</td>\n",
  90 |        "      <td>NaN</td>\n",
  91 |        "      <td>96</td>\n",
  92 |        "    </tr>\n",
  93 |        "    <tr>\n",
  94 |        "      <th>5</th>\n",
  95 |        "      <td>5</td>\n",
  96 |        "      <td>周伯通</td>\n",
  97 |        "      <td>88.0</td>\n",
  98 |        "      <td>91.0</td>\n",
  99 |        "      <td>88</td>\n",
 100 |        "    </tr>\n",
 101 |        "  </tbody>\n",
 102 |        "</table>\n",
 103 |        "</div>"
 104 |       ],
 105 |       "text/plain": [
 106 |        "   名次   姓名     语文     数学   外语\n",
 107 |        "0   1   郭靖   90.0   80.0   76\n",
 108 |        "1   2   黄蓉  100.0  100.0   98\n",
 109 |        "2   3  黄药师    NaN   98.0  100\n",
 110 |        "3   4  欧阳锋    NaN   95.0   85\n",
 111 |        "4   5  洪七公   98.0    NaN   96\n",
 112 |        "5   5  周伯通   88.0   91.0   88"
 113 |       ]
 114 |      },
 115 |      "execution_count": 12,
 116 |      "metadata": {},
 117 |      "output_type": "execute_result"
 118 |     }
 119 |    ],
 120 |    "source": [
 121 |     "import pandas as pd\n",
 122 |     "df = pd.read_excel(\"../Data/fillna.xlsx\")\n",
 123 |     "df"
 124 |    ]
 125 |   },
 126 |   {
 127 |    "cell_type": "markdown",
 128 |    "metadata": {},
 129 |    "source": [
 130 |     "### Fill with a constant"
 131 |    ]
 132 |   },
 133 |   {
 134 |    "cell_type": "code",
 135 |    "execution_count": 13,
 136 |    "metadata": {},
 137 |    "outputs": [
 138 |     {
 139 |      "data": {
 140 |       "text/html": [
 141 |        "<div>\n",
 142 |        "<style scoped>\n",
 143 |        "    .dataframe tbody tr th:only-of-type {\n",
 144 |        "        vertical-align: middle;\n",
 145 |        "    }\n",
 146 |        "\n",
 147 |        "    .dataframe tbody tr th {\n",
 148 |        "        vertical-align: top;\n",
 149 |        "    }\n",
 150 |        "\n",
 151 |        "    .dataframe thead th {\n",
 152 |        "        text-align: right;\n",
 153 |        "    }\n",
 154 |        "</style>\n",
 155 |        "<table border=\"1\" class=\"dataframe\">\n",
 156 |        "  <thead>\n",
 157 |        "    <tr style=\"text-align: right;\">\n",
 158 |        "      <th></th>\n",
 159 |        "      <th>名次</th>\n",
 160 |        "      <th>姓名</th>\n",
 161 |        "      <th>语文</th>\n",
 162 |        "      <th>数学</th>\n",
 163 |        "      <th>外语</th>\n",
 164 |        "    </tr>\n",
 165 |        "  </thead>\n",
 166 |        "  <tbody>\n",
 167 |        "    <tr>\n",
 168 |        "      <th>0</th>\n",
 169 |        "      <td>1</td>\n",
 170 |        "      <td>郭靖</td>\n",
 171 |        "      <td>90.0</td>\n",
 172 |        "      <td>80.0</td>\n",
 173 |        "      <td>76</td>\n",
 174 |        "    </tr>\n",
 175 |        "    <tr>\n",
 176 |        "      <th>1</th>\n",
 177 |        "      <td>2</td>\n",
 178 |        "      <td>黄蓉</td>\n",
 179 |        "      <td>100.0</td>\n",
 180 |        "      <td>100.0</td>\n",
 181 |        "      <td>98</td>\n",
 182 |        "    </tr>\n",
 183 |        "    <tr>\n",
 184 |        "      <th>2</th>\n",
 185 |        "      <td>3</td>\n",
 186 |        "      <td>黄药师</td>\n",
 187 |        "      <td>0.0</td>\n",
 188 |        "      <td>98.0</td>\n",
 189 |        "      <td>100</td>\n",
 190 |        "    </tr>\n",
 191 |        "    <tr>\n",
 192 |        "      <th>3</th>\n",
 193 |        "      <td>4</td>\n",
 194 |        "      <td>欧阳锋</td>\n",
 195 |        "      <td>0.0</td>\n",
 196 |        "      <td>95.0</td>\n",
 197 |        "      <td>85</td>\n",
 198 |        "    </tr>\n",
 199 |        "    <tr>\n",
 200 |        "      <th>4</th>\n",
 201 |        "      <td>5</td>\n",
 202 |        "      <td>洪七公</td>\n",
 203 |        "      <td>98.0</td>\n",
 204 |        "      <td>0.0</td>\n",
 205 |        "      <td>96</td>\n",
 206 |        "    </tr>\n",
 207 |        "    <tr>\n",
 208 |        "      <th>5</th>\n",
 209 |        "      <td>5</td>\n",
 210 |        "      <td>周伯通</td>\n",
 211 |        "      <td>88.0</td>\n",
 212 |        "      <td>91.0</td>\n",
 213 |        "      <td>88</td>\n",
 214 |        "    </tr>\n",
 215 |        "  </tbody>\n",
 216 |        "</table>\n",
 217 |        "</div>"
 218 |       ],
 219 |       "text/plain": [
 220 |        "   名次   姓名     语文     数学   外语\n",
 221 |        "0   1   郭靖   90.0   80.0   76\n",
 222 |        "1   2   黄蓉  100.0  100.0   98\n",
 223 |        "2   3  黄药师    0.0   98.0  100\n",
 224 |        "3   4  欧阳锋    0.0   95.0   85\n",
 225 |        "4   5  洪七公   98.0    0.0   96\n",
 226 |        "5   5  周伯通   88.0   91.0   88"
 227 |       ]
 228 |      },
 229 |      "execution_count": 13,
 230 |      "metadata": {},
 231 |      "output_type": "execute_result"
 232 |     }
 233 |    ],
 234 |    "source": [
 235 |     "df.fillna(0)"
 236 |    ]
 237 |   },
 238 |   {
 239 |    "cell_type": "markdown",
 240 |    "metadata": {},
 241 |    "source": [
 242 |     "### Fill with a dictionary"
 243 |    ]
 244 |   },
 245 |   {
 246 |    "cell_type": "code",
 247 |    "execution_count": 23,
 248 |    "metadata": {},
 249 |    "outputs": [
 250 |     {
 251 |      "data": {
 252 |       "text/html": [
 253 |        "<div>\n",
 254 |        "<style scoped>\n",
 255 |        "    .dataframe tbody tr th:only-of-type {\n",
 256 |        "        vertical-align: middle;\n",
 257 |        "    }\n",
 258 |        "\n",
 259 |        "    .dataframe tbody tr th {\n",
 260 |        "        vertical-align: top;\n",
 261 |        "    }\n",
 262 |        "\n",
 263 |        "    .dataframe thead th {\n",
 264 |        "        text-align: right;\n",
 265 |        "    }\n",
 266 |        "</style>\n",
 267 |        "<table border=\"1\" class=\"dataframe\">\n",
 268 |        "  <thead>\n",
 269 |        "    <tr style=\"text-align: right;\">\n",
 270 |        "      <th></th>\n",
 271 |        "      <th>名次</th>\n",
 272 |        "      <th>姓名</th>\n",
 273 |        "      <th>语文</th>\n",
 274 |        "      <th>数学</th>\n",
 275 |        "      <th>外语</th>\n",
 276 |        "    </tr>\n",
 277 |        "  </thead>\n",
 278 |        "  <tbody>\n",
 279 |        "    <tr>\n",
 280 |        "      <th>0</th>\n",
 281 |        "      <td>1</td>\n",
 282 |        "      <td>郭靖</td>\n",
 283 |        "      <td>90.0</td>\n",
 284 |        "      <td>80.0</td>\n",
 285 |        "      <td>76</td>\n",
 286 |        "    </tr>\n",
 287 |        "    <tr>\n",
 288 |        "      <th>1</th>\n",
 289 |        "      <td>2</td>\n",
 290 |        "      <td>黄蓉</td>\n",
 291 |        "      <td>100.0</td>\n",
 292 |        "      <td>100.0</td>\n",
 293 |        "      <td>98</td>\n",
 294 |        "    </tr>\n",
 295 |        "    <tr>\n",
 296 |        "      <th>2</th>\n",
 297 |        "      <td>3</td>\n",
 298 |        "      <td>黄药师</td>\n",
 299 |        "      <td>80.0</td>\n",
 300 |        "      <td>98.0</td>\n",
 301 |        "      <td>100</td>\n",
 302 |        "    </tr>\n",
 303 |        "    <tr>\n",
 304 |        "      <th>3</th>\n",
 305 |        "      <td>4</td>\n",
 306 |        "      <td>欧阳锋</td>\n",
 307 |        "      <td>80.0</td>\n",
 308 |        "      <td>95.0</td>\n",
 309 |        "      <td>85</td>\n",
 310 |        "    </tr>\n",
 311 |        "    <tr>\n",
 312 |        "      <th>4</th>\n",
 313 |        "      <td>5</td>\n",
 314 |        "      <td>洪七公</td>\n",
 315 |        "      <td>98.0</td>\n",
 316 |        "      <td>90.0</td>\n",
 317 |        "      <td>96</td>\n",
 318 |        "    </tr>\n",
 319 |        "    <tr>\n",
 320 |        "      <th>5</th>\n",
 321 |        "      <td>5</td>\n",
 322 |        "      <td>周伯通</td>\n",
 323 |        "      <td>88.0</td>\n",
 324 |        "      <td>91.0</td>\n",
 325 |        "      <td>88</td>\n",
 326 |        "    </tr>\n",
 327 |        "  </tbody>\n",
 328 |        "</table>\n",
 329 |        "</div>"
 330 |       ],
 331 |       "text/plain": [
 332 |        "   名次   姓名     语文     数学   外语\n",
 333 |        "0   1   郭靖   90.0   80.0   76\n",
 334 |        "1   2   黄蓉  100.0  100.0   98\n",
 335 |        "2   3  黄药师   80.0   98.0  100\n",
 336 |        "3   4  欧阳锋   80.0   95.0   85\n",
 337 |        "4   5  洪七公   98.0   90.0   96\n",
 338 |        "5   5  周伯通   88.0   91.0   88"
 339 |       ]
 340 |      },
 341 |      "execution_count": 23,
 342 |      "metadata": {},
 343 |      "output_type": "execute_result"
 344 |     }
 345 |    ],
 346 |    "source": [
 347 |     "df.fillna({\"语文\":80,\"数学\":90})"
 348 |    ]
 349 |   },
 350 |   {
 351 |    "cell_type": "markdown",
 352 |    "metadata": {},
 353 |    "source": [
 354 |     "### Fill with a computed value"
 355 |    ]
 356 |   },
 357 |   {
 358 |    "cell_type": "code",
 359 |    "execution_count": 24,
 360 |    "metadata": {},
 361 |    "outputs": [
 362 |     {
 363 |      "data": {
 364 |       "text/html": [
 365 |        "<div>\n",
 366 |        "<style scoped>\n",
 367 |        "    .dataframe tbody tr th:only-of-type {\n",
 368 |        "        vertical-align: middle;\n",
 369 |        "    }\n",
 370 |        "\n",
 371 |        "    .dataframe tbody tr th {\n",
 372 |        "        vertical-align: top;\n",
 373 |        "    }\n",
 374 |        "\n",
 375 |        "    .dataframe thead th {\n",
 376 |        "        text-align: right;\n",
 377 |        "    }\n",
 378 |        "</style>\n",
 379 |        "<table border=\"1\" class=\"dataframe\">\n",
 380 |        "  <thead>\n",
 381 |        "    <tr style=\"text-align: right;\">\n",
 382 |        "      <th></th>\n",
 383 |        "      <th>名次</th>\n",
 384 |        "      <th>姓名</th>\n",
 385 |        "      <th>语文</th>\n",
 386 |        "      <th>数学</th>\n",
 387 |        "      <th>外语</th>\n",
 388 |        "    </tr>\n",
 389 |        "  </thead>\n",
 390 |        "  <tbody>\n",
 391 |        "    <tr>\n",
 392 |        "      <th>0</th>\n",
 393 |        "      <td>1</td>\n",
 394 |        "      <td>郭靖</td>\n",
 395 |        "      <td>90.0</td>\n",
 396 |        "      <td>80.0</td>\n",
 397 |        "      <td>76</td>\n",
 398 |        "    </tr>\n",
 399 |        "    <tr>\n",
 400 |        "      <th>1</th>\n",
 401 |        "      <td>2</td>\n",
 402 |        "      <td>黄蓉</td>\n",
 403 |        "      <td>100.0</td>\n",
 404 |        "      <td>100.0</td>\n",
 405 |        "      <td>98</td>\n",
 406 |        "    </tr>\n",
 407 |        "    <tr>\n",
 408 |        "      <th>2</th>\n",
 409 |        "      <td>3</td>\n",
 410 |        "      <td>黄药师</td>\n",
 411 |        "      <td>94.0</td>\n",
 412 |        "      <td>98.0</td>\n",
 413 |        "      <td>100</td>\n",
 414 |        "    </tr>\n",
 415 |        "    <tr>\n",
 416 |        "      <th>3</th>\n",
 417 |        "      <td>4</td>\n",
 418 |        "      <td>欧阳锋</td>\n",
 419 |        "      <td>94.0</td>\n",
 420 |        "      <td>95.0</td>\n",
 421 |        "      <td>85</td>\n",
 422 |        "    </tr>\n",
 423 |        "    <tr>\n",
 424 |        "      <th>4</th>\n",
 425 |        "      <td>5</td>\n",
 426 |        "      <td>洪七公</td>\n",
 427 |        "      <td>98.0</td>\n",
 428 |        "      <td>92.8</td>\n",
 429 |        "      <td>96</td>\n",
 430 |        "    </tr>\n",
 431 |        "    <tr>\n",
 432 |        "      <th>5</th>\n",
 433 |        "      <td>5</td>\n",
 434 |        "      <td>周伯通</td>\n",
 435 |        "      <td>88.0</td>\n",
 436 |        "      <td>91.0</td>\n",
 437 |        "      <td>88</td>\n",
 438 |        "    </tr>\n",
 439 |        "  </tbody>\n",
 440 |        "</table>\n",
 441 |        "</div>"
 442 |       ],
 443 |       "text/plain": [
 444 |        "   名次   姓名     语文     数学   外语\n",
 445 |        "0   1   郭靖   90.0   80.0   76\n",
 446 |        "1   2   黄蓉  100.0  100.0   98\n",
 447 |        "2   3  黄药师   94.0   98.0  100\n",
 448 |        "3   4  欧阳锋   94.0   95.0   85\n",
 449 |        "4   5  洪七公   98.0   92.8   96\n",
 450 |        "5   5  周伯通   88.0   91.0   88"
 451 |       ]
 452 |      },
 453 |      "execution_count": 24,
 454 |      "metadata": {},
 455 |      "output_type": "execute_result"
 456 |     }
 457 |    ],
 458 |    "source": [
 459 |     "df.fillna(df.mean(numeric_only=True))"
 460 |    ]
 461 |   },
 462 |   {
 463 |    "cell_type": "code",
 464 |    "execution_count": 25,
 465 |    "metadata": {},
 466 |    "outputs": [
 467 |     {
 468 |      "data": {
 469 |       "text/html": [
 470 |        "<div>\n",
 471 |        "<style scoped>\n",
 472 |        "    .dataframe tbody tr th:only-of-type {\n",
 473 |        "        vertical-align: middle;\n",
 474 |        "    }\n",
 475 |        "\n",
 476 |        "    .dataframe tbody tr th {\n",
 477 |        "        vertical-align: top;\n",
 478 |        "    }\n",
 479 |        "\n",
 480 |        "    .dataframe thead th {\n",
 481 |        "        text-align: right;\n",
 482 |        "    }\n",
 483 |        "</style>\n",
 484 |        "<table border=\"1\" class=\"dataframe\">\n",
 485 |        "  <thead>\n",
 486 |        "    <tr style=\"text-align: right;\">\n",
 487 |        "      <th></th>\n",
 488 |        "      <th>名次</th>\n",
 489 |        "      <th>姓名</th>\n",
 490 |        "      <th>语文</th>\n",
 491 |        "      <th>数学</th>\n",
 492 |        "      <th>外语</th>\n",
 493 |        "    </tr>\n",
 494 |        "  </thead>\n",
 495 |        "  <tbody>\n",
 496 |        "    <tr>\n",
 497 |        "      <th>0</th>\n",
 498 |        "      <td>1</td>\n",
 499 |        "      <td>郭靖</td>\n",
 500 |        "      <td>90.0</td>\n",
 501 |        "      <td>80.0</td>\n",
 502 |        "      <td>76</td>\n",
 503 |        "    </tr>\n",
 504 |        "    <tr>\n",
 505 |        "      <th>1</th>\n",
 506 |        "      <td>2</td>\n",
 507 |        "      <td>黄蓉</td>\n",
 508 |        "      <td>100.0</td>\n",
 509 |        "      <td>100.0</td>\n",
 510 |        "      <td>98</td>\n",
 511 |        "    </tr>\n",
 512 |        "    <tr>\n",
 513 |        "      <th>2</th>\n",
 514 |        "      <td>3</td>\n",
 515 |        "      <td>黄药师</td>\n",
 516 |        "      <td>376.0</td>\n",
 517 |        "      <td>98.0</td>\n",
 518 |        "      <td>100</td>\n",
 519 |        "    </tr>\n",
 520 |        "    <tr>\n",
 521 |        "      <th>3</th>\n",
 522 |        "      <td>4</td>\n",
 523 |        "      <td>欧阳锋</td>\n",
 524 |        "      <td>376.0</td>\n",
 525 |        "      <td>95.0</td>\n",
 526 |        "      <td>85</td>\n",
 527 |        "    </tr>\n",
 528 |        "    <tr>\n",
 529 |        "      <th>4</th>\n",
 530 |        "      <td>5</td>\n",
 531 |        "      <td>洪七公</td>\n",
 532 |        "      <td>98.0</td>\n",
 533 |        "      <td>464.0</td>\n",
 534 |        "      <td>96</td>\n",
 535 |        "    </tr>\n",
 536 |        "    <tr>\n",
 537 |        "      <th>5</th>\n",
 538 |        "      <td>5</td>\n",
 539 |        "      <td>周伯通</td>\n",
 540 |        "      <td>88.0</td>\n",
 541 |        "      <td>91.0</td>\n",
 542 |        "      <td>88</td>\n",
 543 |        "    </tr>\n",
 544 |        "  </tbody>\n",
 545 |        "</table>\n",
 546 |        "</div>"
 547 |       ],
 548 |       "text/plain": [
 549 |        "   名次   姓名     语文     数学   外语\n",
 550 |        "0   1   郭靖   90.0   80.0   76\n",
 551 |        "1   2   黄蓉  100.0  100.0   98\n",
 552 |        "2   3  黄药师  376.0   98.0  100\n",
 553 |        "3   4  欧阳锋  376.0   95.0   85\n",
 554 |        "4   5  洪七公   98.0  464.0   96\n",
 555 |        "5   5  周伯通   88.0   91.0   88"
 556 |       ]
 557 |      },
 558 |      "execution_count": 25,
 559 |      "metadata": {},
 560 |      "output_type": "execute_result"
 561 |     }
 562 |    ],
 563 |    "source": [
 564 |     "df.fillna(df.sum())"
 565 |    ]
 566 |   },
 567 |   {
 568 |    "cell_type": "markdown",
 569 |    "metadata": {},
 570 |    "source": [
 571 |     "### Fill with a value computed from a specific column"
 572 |    ]
 573 |   },
 574 |   {
 575 |    "cell_type": "code",
 576 |    "execution_count": 17,
 577 |    "metadata": {},
 578 |    "outputs": [
 579 |     {
 580 |      "data": {
 581 |       "text/html": [
 582 |        "<div>\n",
 583 |        "<style scoped>\n",
 584 |        "    .dataframe tbody tr th:only-of-type {\n",
 585 |        "        vertical-align: middle;\n",
 586 |        "    }\n",
 587 |        "\n",
 588 |        "    .dataframe tbody tr th {\n",
 589 |        "        vertical-align: top;\n",
 590 |        "    }\n",
 591 |        "\n",
 592 |        "    .dataframe thead th {\n",
 593 |        "        text-align: right;\n",
 594 |        "    }\n",
 595 |        "</style>\n",
 596 |        "<table border=\"1\" class=\"dataframe\">\n",
 597 |        "  <thead>\n",
 598 |        "    <tr style=\"text-align: right;\">\n",
 599 |        "      <th></th>\n",
 600 |        "      <th>名次</th>\n",
 601 |        "      <th>姓名</th>\n",
 602 |        "      <th>语文</th>\n",
 603 |        "      <th>数学</th>\n",
 604 |        "      <th>外语</th>\n",
 605 |        "    </tr>\n",
 606 |        "  </thead>\n",
 607 |        "  <tbody>\n",
 608 |        "    <tr>\n",
 609 |        "      <th>0</th>\n",
 610 |        "      <td>1</td>\n",
 611 |        "      <td>郭靖</td>\n",
 612 |        "      <td>90.0</td>\n",
 613 |        "      <td>80.0</td>\n",
 614 |        "      <td>76</td>\n",
 615 |        "    </tr>\n",
 616 |        "    <tr>\n",
 617 |        "      <th>1</th>\n",
 618 |        "      <td>2</td>\n",
 619 |        "      <td>黄蓉</td>\n",
 620 |        "      <td>100.0</td>\n",
 621 |        "      <td>100.0</td>\n",
 622 |        "      <td>98</td>\n",
 623 |        "    </tr>\n",
 624 |        "    <tr>\n",
 625 |        "      <th>2</th>\n",
 626 |        "      <td>3</td>\n",
 627 |        "      <td>黄药师</td>\n",
 628 |        "      <td>90.5</td>\n",
 629 |        "      <td>98.0</td>\n",
 630 |        "      <td>100</td>\n",
 631 |        "    </tr>\n",
 632 |        "    <tr>\n",
 633 |        "      <th>3</th>\n",
 634 |        "      <td>4</td>\n",
 635 |        "      <td>欧阳锋</td>\n",
 636 |        "      <td>90.5</td>\n",
 637 |        "      <td>95.0</td>\n",
 638 |        "      <td>85</td>\n",
 639 |        "    </tr>\n",
 640 |        "    <tr>\n",
 641 |        "      <th>4</th>\n",
 642 |        "      <td>5</td>\n",
 643 |        "      <td>洪七公</td>\n",
 644 |        "      <td>98.0</td>\n",
 645 |        "      <td>90.5</td>\n",
 646 |        "      <td>96</td>\n",
 647 |        "    </tr>\n",
 648 |        "    <tr>\n",
 649 |        "      <th>5</th>\n",
 650 |        "      <td>5</td>\n",
 651 |        "      <td>周伯通</td>\n",
 652 |        "      <td>88.0</td>\n",
 653 |        "      <td>91.0</td>\n",
 654 |        "      <td>88</td>\n",
 655 |        "    </tr>\n",
 656 |        "  </tbody>\n",
 657 |        "</table>\n",
 658 |        "</div>"
 659 |       ],
 660 |       "text/plain": [
 661 |        "   名次   姓名     语文     数学   外语\n",
 662 |        "0   1   郭靖   90.0   80.0   76\n",
 663 |        "1   2   黄蓉  100.0  100.0   98\n",
 664 |        "2   3  黄药师   90.5   98.0  100\n",
 665 |        "3   4  欧阳锋   90.5   95.0   85\n",
 666 |        "4   5  洪七公   98.0   90.5   96\n",
 667 |        "5   5  周伯通   88.0   91.0   88"
 668 |       ]
 669 |      },
 670 |      "execution_count": 17,
 671 |      "metadata": {},
 672 |      "output_type": "execute_result"
 673 |     }
 674 |    ],
 675 |    "source": [
 676 |     "df.fillna(df['外语'].mean())"
 677 |    ]
 678 |   },
 679 |   {
 680 |    "cell_type": "markdown",
 681 |    "metadata": {},
 682 |    "source": [
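 683 |     "### Fill with the previous/next value  \n",
 684 |     "Set the method parameter to choose the direction:  \n",
 685 |     "- method = \"ffill\" / \"pad\": fill a missing value with the previous non-missing value\n",
 686 |     "- method = \"bfill\" / \"backfill\": fill a missing value with the next non-missing value"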
 687 |    ]
 688 |   },
 689 |   {
 690 |    "cell_type": "code",
 691 |    "execution_count": 18,
 692 |    "metadata": {},
 693 |    "outputs": [
 694 |     {
 695 |      "data": {
 696 |       "text/html": [
 697 |        "<div>\n",
 698 |        "<style scoped>\n",
 699 |        "    .dataframe tbody tr th:only-of-type {\n",
 700 |        "        vertical-align: middle;\n",
 701 |        "    }\n",
 702 |        "\n",
 703 |        "    .dataframe tbody tr th {\n",
 704 |        "        vertical-align: top;\n",
 705 |        "    }\n",
 706 |        "\n",
 707 |        "    .dataframe thead th {\n",
 708 |        "        text-align: right;\n",
 709 |        "    }\n",
 710 |        "</style>\n",
 711 |        "<table border=\"1\" class=\"dataframe\">\n",
 712 |        "  <thead>\n",
 713 |        "    <tr style=\"text-align: right;\">\n",
 714 |        "      <th></th>\n",
 715 |        "      <th>名次</th>\n",
 716 |        "      <th>姓名</th>\n",
 717 |        "      <th>语文</th>\n",
 718 |        "      <th>数学</th>\n",
 719 |        "      <th>外语</th>\n",
 720 |        "    </tr>\n",
 721 |        "  </thead>\n",
 722 |        "  <tbody>\n",
 723 |        "    <tr>\n",
 724 |        "      <th>0</th>\n",
 725 |        "      <td>1</td>\n",
 726 |        "      <td>郭靖</td>\n",
 727 |        "      <td>90.0</td>\n",
 728 |        "      <td>80.0</td>\n",
 729 |        "      <td>76</td>\n",
 730 |        "    </tr>\n",
 731 |        "    <tr>\n",
 732 |        "      <th>1</th>\n",
 733 |        "      <td>2</td>\n",
 734 |        "      <td>黄蓉</td>\n",
 735 |        "      <td>100.0</td>\n",
 736 |        "      <td>100.0</td>\n",
 737 |        "      <td>98</td>\n",
 738 |        "    </tr>\n",
 739 |        "    <tr>\n",
 740 |        "      <th>2</th>\n",
 741 |        "      <td>3</td>\n",
 742 |        "      <td>黄药师</td>\n",
 743 |        "      <td>100.0</td>\n",
 744 |        "      <td>98.0</td>\n",
 745 |        "      <td>100</td>\n",
 746 |        "    </tr>\n",
 747 |        "    <tr>\n",
 748 |        "      <th>3</th>\n",
 749 |        "      <td>4</td>\n",
 750 |        "      <td>欧阳锋</td>\n",
 751 |        "      <td>100.0</td>\n",
 752 |        "      <td>95.0</td>\n",
 753 |        "      <td>85</td>\n",
 754 |        "    </tr>\n",
 755 |        "    <tr>\n",
 756 |        "      <th>4</th>\n",
 757 |        "      <td>5</td>\n",
 758 |        "      <td>洪七公</td>\n",
 759 |        "      <td>98.0</td>\n",
 760 |        "      <td>95.0</td>\n",
 761 |        "      <td>96</td>\n",
 762 |        "    </tr>\n",
 763 |        "    <tr>\n",
 764 |        "      <th>5</th>\n",
 765 |        "      <td>5</td>\n",
 766 |        "      <td>周伯通</td>\n",
 767 |        "      <td>88.0</td>\n",
 768 |        "      <td>91.0</td>\n",
 769 |        "      <td>88</td>\n",
 770 |        "    </tr>\n",
 771 |        "  </tbody>\n",
 772 |        "</table>\n",
 773 |        "</div>"
 774 |       ],
 775 |       "text/plain": [
 776 |        "   名次   姓名     语文     数学   外语\n",
 777 |        "0   1   郭靖   90.0   80.0   76\n",
 778 |        "1   2   黄蓉  100.0  100.0   98\n",
 779 |        "2   3  黄药师  100.0   98.0  100\n",
 780 |        "3   4  欧阳锋  100.0   95.0   85\n",
 781 |        "4   5  洪七公   98.0   95.0   96\n",
 782 |        "5   5  周伯通   88.0   91.0   88"
 783 |       ]
 784 |      },
 785 |      "execution_count": 18,
 786 |      "metadata": {},
 787 |      "output_type": "execute_result"
 788 |     }
 789 |    ],
 790 |    "source": [
 791 |     "df.fillna(method=\"ffill\")"
 792 |    ]
 793 |   },
 794 |   {
 795 |    "cell_type": "markdown",
 796 |    "metadata": {},
 797 |    "source": [
 798 |     "### Limit the number of values filled"
 799 |    ]
 800 |   },
 801 |   {
 802 |    "cell_type": "code",
 803 |    "execution_count": 26,
 804 |    "metadata": {},
 805 |    "outputs": [
 806 |     {
 807 |      "data": {
 808 |       "text/html": [
 809 |        "<div>\n",
 810 |        "<style scoped>\n",
 811 |        "    .dataframe tbody tr th:only-of-type {\n",
 812 |        "        vertical-align: middle;\n",
 813 |        "    }\n",
 814 |        "\n",
 815 |        "    .dataframe tbody tr th {\n",
 816 |        "        vertical-align: top;\n",
 817 |        "    }\n",
 818 |        "\n",
 819 |        "    .dataframe thead th {\n",
 820 |        "        text-align: right;\n",
 821 |        "    }\n",
 822 |        "</style>\n",
 823 |        "<table border=\"1\" class=\"dataframe\">\n",
 824 |        "  <thead>\n",
 825 |        "    <tr style=\"text-align: right;\">\n",
 826 |        "      <th></th>\n",
 827 |        "      <th>名次</th>\n",
 828 |        "      <th>姓名</th>\n",
 829 |        "      <th>语文</th>\n",
 830 |        "      <th>数学</th>\n",
 831 |        "      <th>外语</th>\n",
 832 |        "    </tr>\n",
 833 |        "  </thead>\n",
 834 |        "  <tbody>\n",
 835 |        "    <tr>\n",
 836 |        "      <th>0</th>\n",
 837 |        "      <td>1</td>\n",
 838 |        "      <td>郭靖</td>\n",
 839 |        "      <td>90.0</td>\n",
 840 |        "      <td>80.0</td>\n",
 841 |        "      <td>76</td>\n",
 842 |        "    </tr>\n",
 843 |        "    <tr>\n",
 844 |        "      <th>1</th>\n",
 845 |        "      <td>2</td>\n",
 846 |        "      <td>黄蓉</td>\n",
 847 |        "      <td>100.0</td>\n",
 848 |        "      <td>100.0</td>\n",
 849 |        "      <td>98</td>\n",
 850 |        "    </tr>\n",
 851 |        "    <tr>\n",
 852 |        "      <th>2</th>\n",
 853 |        "      <td>3</td>\n",
 854 |        "      <td>黄药师</td>\n",
 855 |        "      <td>NaN</td>\n",
 856 |        "      <td>98.0</td>\n",
 857 |        "      <td>100</td>\n",
 858 |        "    </tr>\n",
 859 |        "    <tr>\n",
 860 |        "      <th>3</th>\n",
 861 |        "      <td>4</td>\n",
 862 |        "      <td>欧阳锋</td>\n",
 863 |        "      <td>98.0</td>\n",
 864 |        "      <td>95.0</td>\n",
 865 |        "      <td>85</td>\n",
 866 |        "    </tr>\n",
 867 |        "    <tr>\n",
 868 |        "      <th>4</th>\n",
 869 |        "      <td>5</td>\n",
 870 |        "      <td>洪七公</td>\n",
 871 |        "      <td>98.0</td>\n",
 872 |        "      <td>91.0</td>\n",
 873 |        "      <td>96</td>\n",
 874 |        "    </tr>\n",
 875 |        "    <tr>\n",
 876 |        "      <th>5</th>\n",
 877 |        "      <td>5</td>\n",
 878 |        "      <td>周伯通</td>\n",
 879 |        "      <td>88.0</td>\n",
 880 |        "      <td>91.0</td>\n",
 881 |        "      <td>88</td>\n",
 882 |        "    </tr>\n",
 883 |        "  </tbody>\n",
 884 |        "</table>\n",
 885 |        "</div>"
 886 |       ],
 887 |       "text/plain": [
 888 |        "   名次   姓名     语文     数学   外语\n",
 889 |        "0   1   郭靖   90.0   80.0   76\n",
 890 |        "1   2   黄蓉  100.0  100.0   98\n",
 891 |        "2   3  黄药师    NaN   98.0  100\n",
 892 |        "3   4  欧阳锋   98.0   95.0   85\n",
 893 |        "4   5  洪七公   98.0   91.0   96\n",
 894 |        "5   5  周伯通   88.0   91.0   88"
 895 |       ]
 896 |      },
 897 |      "execution_count": 26,
 898 |      "metadata": {},
 899 |      "output_type": "execute_result"
 900 |     }
 901 |    ],
 902 |    "source": [
 903 |     "df.fillna(method='bfill', limit=1)"
 904 |    ]
 905 |   },
 906 |   {
 907 |    "cell_type": "markdown",
 908 |    "metadata": {},
 909 |    "source": [
 910 |     "### Fill from the left or right by setting the axis parameter"
 911 |    ]
 912 |   },
 913 |   {
 914 |    "cell_type": "code",
 915 |    "execution_count": 21,
 916 |    "metadata": {},
 917 |    "outputs": [
 918 |     {
 919 |      "data": {
 920 |       "text/html": [
 921 |        "<div>\n",
 922 |        "<style scoped>\n",
 923 |        "    .dataframe tbody tr th:only-of-type {\n",
 924 |        "        vertical-align: middle;\n",
 925 |        "    }\n",
 926 |        "\n",
 927 |        "    .dataframe tbody tr th {\n",
 928 |        "        vertical-align: top;\n",
 929 |        "    }\n",
 930 |        "\n",
 931 |        "    .dataframe thead th {\n",
 932 |        "        text-align: right;\n",
 933 |        "    }\n",
 934 |        "</style>\n",
 935 |        "<table border=\"1\" class=\"dataframe\">\n",
 936 |        "  <thead>\n",
 937 |        "    <tr style=\"text-align: right;\">\n",
 938 |        "      <th></th>\n",
 939 |        "      <th>名次</th>\n",
 940 |        "      <th>姓名</th>\n",
 941 |        "      <th>语文</th>\n",
 942 |        "      <th>数学</th>\n",
 943 |        "      <th>外语</th>\n",
 944 |        "    </tr>\n",
 945 |        "  </thead>\n",
 946 |        "  <tbody>\n",
 947 |        "    <tr>\n",
 948 |        "      <th>0</th>\n",
 949 |        "      <td>1</td>\n",
 950 |        "      <td>郭靖</td>\n",
 951 |        "      <td>90</td>\n",
 952 |        "      <td>80</td>\n",
 953 |        "      <td>76</td>\n",
 954 |        "    </tr>\n",
 955 |        "    <tr>\n",
 956 |        "      <th>1</th>\n",
 957 |        "      <td>2</td>\n",
 958 |        "      <td>黄蓉</td>\n",
 959 |        "      <td>100</td>\n",
 960 |        "      <td>100</td>\n",
 961 |        "      <td>98</td>\n",
 962 |        "    </tr>\n",
 963 |        "    <tr>\n",
 964 |        "      <th>2</th>\n",
 965 |        "      <td>3</td>\n",
 966 |        "      <td>黄药师</td>\n",
 967 |        "      <td>98</td>\n",
 968 |        "      <td>98</td>\n",
 969 |        "      <td>100</td>\n",
 970 |        "    </tr>\n",
 971 |        "    <tr>\n",
 972 |        "      <th>3</th>\n",
 973 |        "      <td>4</td>\n",
 974 |        "      <td>欧阳锋</td>\n",
 975 |        "      <td>95</td>\n",
 976 |        "      <td>95</td>\n",
 977 |        "      <td>85</td>\n",
 978 |        "    </tr>\n",
 979 |        "    <tr>\n",
 980 |        "      <th>4</th>\n",
 981 |        "      <td>5</td>\n",
 982 |        "      <td>洪七公</td>\n",
 983 |        "      <td>98</td>\n",
 984 |        "      <td>96</td>\n",
 985 |        "      <td>96</td>\n",
 986 |        "    </tr>\n",
 987 |        "    <tr>\n",
 988 |        "      <th>5</th>\n",
 989 |        "      <td>5</td>\n",
 990 |        "      <td>周伯通</td>\n",
 991 |        "      <td>88</td>\n",
 992 |        "      <td>91</td>\n",
 993 |        "      <td>88</td>\n",
 994 |        "    </tr>\n",
 995 |        "  </tbody>\n",
 996 |        "</table>\n",
 997 |        "</div>"
 998 |       ],
 999 |       "text/plain": [
1000 |        "  名次   姓名   语文   数学   外语\n",
1001 |        "0  1   郭靖   90   80   76\n",
1002 |        "1  2   黄蓉  100  100   98\n",
1003 |        "2  3  黄药师   98   98  100\n",
1004 |        "3  4  欧阳锋   95   95   85\n",
1005 |        "4  5  洪七公   98   96   96\n",
1006 |        "5  5  周伯通   88   91   88"
1007 |       ]
1008 |      },
1009 |      "execution_count": 21,
1010 |      "metadata": {},
1011 |      "output_type": "execute_result"
1012 |     }
1013 |    ],
1014 |    "source": [
1015 |     "df.fillna(method='bfill', axis=1)"
1016 |    ]
1017 |   },
1018 |   {
1019 |    "cell_type": "markdown",
1020 |    "metadata": {},
1021 |    "source": [
1022 |     "### Modifying the source data with inplace=True  \n",
1023 |     "None of the 7 parameter settings shown above modifies the source data; to modify it in place, add the parameter inplace=True."
1024 |    ]
1025 |   }
1026 |  ],
1027 |  "metadata": {
1028 |   "kernelspec": {
1029 |    "display_name": "Python 3",
1030 |    "language": "python",
1031 |    "name": "python3"
1032 |   },
1033 |   "language_info": {
1034 |    "codemirror_mode": {
1035 |     "name": "ipython",
1036 |     "version": 3
1037 |    },
1038 |    "file_extension": ".py",
1039 |    "mimetype": "text/x-python",
1040 |    "name": "python",
1041 |    "nbconvert_exporter": "python",
1042 |    "pygments_lexer": "ipython3",
1043 |    "version": "3.7.1"
1044 |   },
1045 |   "toc": {
1046 |    "base_numbering": 1,
1047 |    "nav_menu": {},
1048 |    "number_sections": true,
1049 |    "sideBar": true,
1050 |    "skip_h1_title": false,
1051 |    "title_cell": "Table of Contents",
1052 |    "title_sidebar": "Contents",
1053 |    "toc_cell": false,
1054 |    "toc_position": {},
1055 |    "toc_section_display": true,
1056 |    "toc_window_display": false
1057 |   }
1058 |  },
1059 |  "nbformat": 4,
1060 |  "nbformat_minor": 2
1061 | }
1062 | 


--------------------------------------------------------------------------------
/Note/如何给 github 的开源项目提交 pull request.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Note/如何给 github 的开源项目提交 pull request.pdf


--------------------------------------------------------------------------------
/Note/常见的Python代码报错及解决方案.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Note/常见的Python代码报错及解决方案.pdf


--------------------------------------------------------------------------------
/Other/01 Pyecharts渲染图表 .ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "**Basic usage of the pyecharts library**"
  8 |    ]
  9 |   },
 10 |   {
 11 |    "cell_type": "markdown",
 12 |    "metadata": {},
 13 |    "source": [
 14 |     "## Installing pyecharts  \n",
 15 |     "pip install pyecharts  "
 16 |    ]
 17 |   },
 18 |   {
 19 |    "cell_type": "markdown",
 20 |    "metadata": {},
 21 |    "source": [
 22 |     "## Getting started"
 23 |    ]
 24 |   },
 25 |   {
 26 |    "cell_type": "code",
 27 |    "execution_count": 64,
 28 |    "metadata": {},
 29 |    "outputs": [],
 30 |    "source": [
 31 |     "import pandas as pd\n",
 32 |     "from pyecharts import Bar\n",
 33 |     "\n",
 34 |     "df = pd.read_excel(r\"./Pyecharts.xlsx\")\n",
 35 |     "brands = df['品牌'].values\n",
 36 |     "solds = df['已售'].values\n",
 37 |     "bar = Bar(\"汽车各品牌销量\", \"这里是测试数据\")\n",
 38 |     "bar.add(\"销量\", brands, solds)\n",
 39 |     "# bar.print_echarts_options()  # prints the chart options; useful only for debugging\n",
 40 |     "bar.render(\"./html/start.html\")    # generate a local HTML file"
 41 |    ]
 42 |   },
 43 |   {
 44 |    "cell_type": "markdown",
 45 |    "metadata": {},
 46 |    "source": [
 47 |     "![image](html/images/start.png)"
 48 |    ]
 49 |   },
 50 |   {
 51 |    "cell_type": "markdown",
 52 |    "metadata": {},
 53 |    "source": [
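 54 |     "- add(): the main method; adds the chart's data and sets its configuration options\n",
 55 |     "- print_echarts_options(): prints all of the chart's configuration options\n",
 56 |     "- render(): by default generates a file named render.html in the current working directory; it accepts a path parameter that sets where the file is saved, e.g. render(r\"e:\\my_first_chart.html\"). Open the file in a browser.  \n",
 57 |     "**Note:** you can click the download button on the right side of the chart to save the image locally; to show more utility buttons, set is_more_utils to True in add()"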
 58 |    ]
 59 |   },
 60 |   {
 61 |    "cell_type": "markdown",
 62 |    "metadata": {},
 63 |    "source": [
 64 |     "## Using themes"
 65 |    ]
 66 |   },
 67 |   {
 68 |    "cell_type": "code",
 69 |    "execution_count": 31,
 70 |    "metadata": {},
 71 |    "outputs": [],
 72 |    "source": [
 73 |     "import pandas as pd\n",
 74 |     "from pyecharts import Bar\n",
 75 |     "\n",
 76 |     "df = pd.read_excel(r\"./Pyecharts.xlsx\")\n",
 77 |     "brands = df['品牌'].values\n",
 78 |     "solds = df['已售'].values\n",
 79 |     "bar = Bar(\"汽车各品牌销量\", \"这里是测试数据\")\n",
 80 |     "bar.use_theme('dark')\n",
 81 |     "bar.add(\"销量\", brands, solds)\n",
 82 |     "bar.render(\"./html/dark.html\")"
 83 |    ]
 84 |   },
 85 |   {
 86 |    "cell_type": "markdown",
 87 |    "metadata": {},
 88 |    "source": [
 89 |     "![image](html/images/dark.png)"
 90 |    ]
 91 |   },
 92 |   {
 93 |    "cell_type": "markdown",
 94 |    "metadata": {},
 95 |    "source": [
 96 |     "## Using the pyecharts-snapshot plugin  \n",
 97 |     "To save a chart directly as a png, pdf, or gif file, use pyecharts-snapshot. This plugin requires a Node.js environment on your system.  \n",
 98 |     "- Install phantomjs: \\$ npm install -g phantomjs-prebuilt  \n",
 99 |     "- Install pyecharts-snapshot: $ pip install pyecharts-snapshot  \n",
100 |     "- Call the render method: bar.render(path='snapshot.png'). The file extension may be svg/jpeg/png/pdf/gif. Note that svg output requires setting renderer='svg' when initializing bar.\n"
101 |    ]
102 |   },
103 |   {
104 |    "cell_type": "markdown",
105 |    "metadata": {},
106 |    "source": [
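107 |     "## Chart drawing workflow\n",
108 |     "- Instantiate a chart object of a specific type: chart = FooChart()\n",
109 |     "- Add general configuration, such as a theme: chart.use_theme()\n",
110 |     "- Add chart-specific configuration: geo.add_coordinate()\n",
111 |     "- Add data and configuration options: chart.add()\n",
112 |     "- Generate a local file (html/svg/jpeg/png/pdf/gif): chart.render()"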
113 |    ]
114 |   },
115 |   {
116 |    "cell_type": "markdown",
117 |    "metadata": {},
118 |    "source": [
119 |     "## Basic charts"
120 |    ]
121 |   },
122 |   {
123 |    "cell_type": "markdown",
124 |    "metadata": {},
125 |    "source": [
126 |     "### Bar (column/bar chart)\n",
127 |     ">A column/bar chart represents the magnitude of the data through the height of the columns or the width of the bars.  \n",
128 |     "\n",
129 |     "Bar.add() method signature  \n",
130 |     "```python\n",
131 |     "add(name, x_axis, y_axis,\n",
132 |     "    is_stack=False,\n",
133 |     "    bar_category_gap='20%', **kwargs)\n",
134 |     "```  \n",
135 |     "- name -> str  \n",
136 |     "Legend name\n",
137 |     "- x_axis -> list  \n",
138 |     "x-axis data\n",
139 |     "- y_axis -> list  \n",
140 |     "y-axis data\n",
141 |     "- is_stack -> bool  \n",
142 |     "Whether to stack the data; defaults to False  \n",
143 |     "Series on the same category axis that share the same stack value are stacked\n",
144 |     "- bar_category_gap -> int/str  \n",
145 |     "Gap between bars on the category axis; defaults to '20%'  \n",
146 |     "When set to 0 the bars touch each other (histogram style)\n",
147 |     "- `**kwargs`  \n",
148 |     "Additional common options passed through, such as is_more_utils"
149 |    ]
150 |   },
151 |   {
152 |    "cell_type": "code",
153 |    "execution_count": 26,
154 |    "metadata": {},
155 |    "outputs": [],
156 |    "source": [
157 |     "import pandas as pd\n",
158 |     "from pyecharts import Bar\n",
159 |     "\n",
160 |     "df = pd.read_excel(r\"./Pyecharts.xlsx\")\n",
161 |     "\n",
162 |     "brands = df['品牌'].values\n",
163 |     "solds = df['已售'].values\n",
164 |     "schedules = df['已预订'].values\n",
165 |     "bar = Bar(\"汽车各品牌销量\")\n",
166 |     "bar.add(\"已售\", brands, solds, is_stack=True)\n",
167 |     "bar.add(\"已预订\", brands, schedules, is_stack=True)\n",
168 |     "bar.render(\"./html/bar01.html\")"
169 |    ]
170 |   },
171 |   {
172 |    "cell_type": "markdown",
173 |    "metadata": {},
174 |    "source": [
175 |     "![image](html/images/bar.png)"
176 |    ]
177 |   },
178 |   {
179 |    "cell_type": "markdown",
180 |    "metadata": {},
181 |    "source": [
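182 |     "### Pie (pie chart)\n",
183 |     ">A pie chart mainly shows the share of each category in the total. The arc of each sector represents the proportion of its value.  \n",
184 |     "\n",
185 |     "Pie.add() method signature\n",
186 |     "```python\n",
187 |     "add(name, attr, value,\n",
188 |     "    radius=None,\n",
189 |     "    center=None,\n",
190 |     "    rosetype=None, **kwargs)\n",
191 |     "```    \n",
192 |     "- name -> str   \n",
193 |     "Legend name\n",
194 |     "- attr -> list  \n",
195 |     "Attribute names\n",
196 |     "- value -> list   \n",
197 |     "Values corresponding to the attributes\n",
198 |     "- radius -> list  \n",
199 |     "Radius of the pie: the first item is the inner radius and the second the outer radius; defaults to [0, 75]   \n",
200 |     "Interpreted as percentages by default, relative to half of the smaller of the container's height and width\n",
201 |     "- center -> list   \n",
202 |     "Center coordinates of the pie: the first item is the x coordinate and the second the y coordinate; defaults to [50, 50]   \n",
203 |     "Interpreted as percentages by default: the first item is relative to the container width, the second to the container height\n",
204 |     "- rosetype -> str  \n",
205 |     "Whether to draw a Nightingale rose chart, which distinguishes data magnitude by radius; the two modes are 'radius' and 'area'. Defaults to 'radius'  \n",
206 |     "radius: the central angle of a sector shows the percentage of the data, and the radius shows its magnitude  \n",
207 |     "area: all sectors have the same central angle, and only the radius shows the data magnitude"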
208 |    ]
209 |   },
210 |   {
211 |    "cell_type": "code",
212 |    "execution_count": 54,
213 |    "metadata": {},
214 |    "outputs": [],
215 |    "source": [
216 |     "from pyecharts import Pie\n",
217 |     "import pandas as pd\n",
218 |     "df = pd.read_excel(r\"./Pyecharts.xlsx\")\n",
219 |     "brands = df[\"品牌\"].values\n",
220 |     "Sales = df[\"总计\"].values\n",
221 |     "pie = Pie(\"汽车各品牌销量\")\n",
222 |     "pie.add(\"\", brands , Sales, is_label_show=True)\n",
223 |     "pie.render(\"./html/pie.html\")"
224 |    ]
225 |   },
226 |   {
227 |    "cell_type": "markdown",
228 |    "metadata": {},
229 |    "source": [
230 |     "![pie](./html/images/pie.png)"
231 |    ]
232 |   },
233 |   {
234 |    "cell_type": "markdown",
235 |    "metadata": {},
236 |    "source": [
237 |     "### WordCloud (word cloud)  \n",
238 |     "WordCloud.add() method signature  \n",
239 |     "```python\n",
240 |     "add(name, attr, value,\n",
241 |     "    shape=\"circle\",\n",
242 |     "    word_gap=20,\n",
243 |     "    word_size_range=None,\n",
244 |     "    rotate_step=45)\n",
245 |     "```\n",
246 |     "- name -> str  \n",
247 |     "Legend name\n",
248 |     "- attr -> list  \n",
249 |     "Attribute names\n",
250 |     "- value -> list  \n",
251 |     "Values corresponding to the attributes\n",
252 |     "- shape -> str  \n",
253 |     "Outline of the word cloud; one of 'circle', 'cardioid', 'diamond', 'triangle-forward', 'triangle', 'pentagon', 'star'\n",
254 |     "- word_gap -> int  \n",
255 |     "Gap between words; defaults to 20.\n",
256 |     "- word_size_range -> list  \n",
257 |     "Range of word font sizes; defaults to [12, 60].\n",
258 |     "- rotate_step -> int  \n",
259 |     "Rotation angle step for words; defaults to 45"
260 |    ]
261 |   },
262 |   {
263 |    "cell_type": "code",
264 |    "execution_count": 52,
265 |    "metadata": {},
266 |    "outputs": [],
267 |    "source": [
268 |     "from pyecharts import WordCloud\n",
269 |     "import pandas as pd\n",
270 |     "df = pd.read_excel(r\"./Pyecharts.xlsx\",sheet_name=1)\n",
271 |     "brands = df[\"品牌\"].values\n",
272 |     "sales = df[\"总计\"].values\n",
273 |     "wordcloud = WordCloud(width=1300, height=620)\n",
274 |     "wordcloud.add(\"\", brands, sales, word_size_range=[20, 100])\n",
275 |     "wordcloud.render(\"./html/WordCloud.html\")"
276 |    ]
277 |   },
278 |   {
279 |    "cell_type": "markdown",
280 |    "metadata": {},
281 |    "source": [
282 |     "![WordCloud](./html/images/WordCloud.png)"
283 |    ]
284 |   },
285 |   {
286 |    "cell_type": "markdown",
287 |    "metadata": {},
288 |    "source": [
289 |     "### Gauge (gauge chart)  \n",
290 |     "Gauge.add() method signature  \n",
291 |     "```python\n",
292 |     "add(name, attr, value,\n",
293 |     "    scale_range=None,\n",
294 |     "    angle_range=None, **kwargs)\n",
295 |     "```\n",
296 |     "- name -> str  \n",
297 |     "Legend name\n",
298 |     "- attr -> list  \n",
299 |     "Attribute names\n",
300 |     "- value -> list  \n",
301 |     "Values corresponding to the attributes  \n",
302 |     "- scale_range -> list  \n",
303 |     "Data range of the gauge; defaults to [0, 100]\n",
304 |     "- angle_range -> list  \n",
305 |     "Angle range of the gauge; defaults to [225, -45]"
306 |    ]
307 |   },
308 |   {
309 |    "cell_type": "code",
310 |    "execution_count": 62,
311 |    "metadata": {},
312 |    "outputs": [],
313 |    "source": [
314 |     "from pyecharts import Gauge\n",
315 |     "\n",
316 |     "gauge = Gauge(\"仪表盘示例\")\n",
317 |     "gauge.add(\"业务指标\", \"完成率\", 66.66)\n",
318 |     "gauge.render(\"./html/Gauge01.html\")"
319 |    ]
320 |   },
321 |   {
322 |    "cell_type": "markdown",
323 |    "metadata": {},
324 |    "source": [
325 |     "![Gauge01](./html/images/Gauge01.png)"
326 |    ]
327 |   },
328 |   {
329 |    "cell_type": "code",
330 |    "execution_count": 56,
331 |    "metadata": {},
332 |    "outputs": [],
333 |    "source": [
334 |     "gauge = Gauge(\"仪表盘示例\")\n",
335 |     "gauge.add(\n",
336 |     "    \"业务指标\",\n",
337 |     "    \"完成率\",\n",
338 |     "    166.66,\n",
339 |     "    angle_range=[180, 0],\n",
340 |     "    scale_range=[0, 200],\n",
341 |     "    is_legend_show=False,\n",
342 |     ")\n",
343 |     "gauge.render(\"./html/Gauge02.html\")"
344 |    ]
345 |   },
346 |   {
347 |    "cell_type": "markdown",
348 |    "metadata": {},
349 |    "source": [
350 |     "![Gauge02](./html/images/Gauge02.png)"
351 |    ]
352 |   }
353 |  ],
354 |  "metadata": {
355 |   "kernelspec": {
356 |    "display_name": "Python 3",
357 |    "language": "python",
358 |    "name": "python3"
359 |   },
360 |   "language_info": {
361 |    "codemirror_mode": {
362 |     "name": "ipython",
363 |     "version": 3
364 |    },
365 |    "file_extension": ".py",
366 |    "mimetype": "text/x-python",
367 |    "name": "python",
368 |    "nbconvert_exporter": "python",
369 |    "pygments_lexer": "ipython3",
370 |    "version": "3.7.1"
371 |   },
372 |   "toc": {
373 |    "base_numbering": 1,
374 |    "nav_menu": {},
375 |    "number_sections": true,
376 |    "sideBar": true,
377 |    "skip_h1_title": false,
378 |    "title_cell": "Table of Contents",
379 |    "title_sidebar": "Contents",
380 |    "toc_cell": false,
381 |    "toc_position": {},
382 |    "toc_section_display": true,
383 |    "toc_window_display": false
384 |   }
385 |  },
386 |  "nbformat": 4,
387 |  "nbformat_minor": 2
388 | }
389 | 


--------------------------------------------------------------------------------
/Other/Pyecharts.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Other/Pyecharts.xlsx


--------------------------------------------------------------------------------
/Other/html/images/Gauge01.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Other/html/images/Gauge01.png


--------------------------------------------------------------------------------
/Other/html/images/Gauge02.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Other/html/images/Gauge02.png


--------------------------------------------------------------------------------
/Other/html/images/WordCloud.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Other/html/images/WordCloud.png


--------------------------------------------------------------------------------
/Other/html/images/bar.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Other/html/images/bar.png


--------------------------------------------------------------------------------
/Other/html/images/dark.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Other/html/images/dark.png


--------------------------------------------------------------------------------
/Other/html/images/pie.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Other/html/images/pie.png


--------------------------------------------------------------------------------
/Other/html/images/start.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xmaniu/Excel-Python/26408c8a29d6eafb0bb83ac7532e6fd58140af00/Other/html/images/start.png


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
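 1 | This repository contains reading notes for the book 『对比Excel，轻松学习Python数据分析』 (Learning Python Data Analysis Easily by Comparison with Excel)
 2 | 
 3 | [Detailed introduction to the book](https://github.com/junhongzhang/Excel-Python-DA/blob/master/%E6%9C%AC%E4%B9%A6%E8%AF%A6%E7%BB%86%E4%BB%8B%E7%BB%8D.md)
 4 | 
 5 | [Errata for the book](https://github.com/junhongzhang/Excel-Python-DA/blob/master/%E5%8B%98%E8%AF%AF%E8%A1%A8.md)
 6 | 
 7 | **Notes**
 8 | - The Code folder contains summaries of key points and the book's example code
 9 | - The Data folder contains the base data used by the book's code examples
10 | - The Note folder contains articles written by me and articles shared by other readers
11 | - Personal WeChat: net3330. Feel free to get in touch to study and discuss together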
12 | 
13 | 


--------------------------------------------------------------------------------