├── README.md ├── materials ├── 32201184-python科学计算生态教学进度表.docx └── python科学计算生态教学大纲-20019.docx └── 课件 ├── 10_mangle_data.ipynb ├── 11_file_io_and_structured_text_files.ipynb ├── 12_system_management.ipynb ├── 13_regular_expressions.ipynb ├── 14_sort.ipynb ├── 15_else-and-copy.ipynb ├── 16_numpy.ipynb ├── 1_a_taste_of_python.pdf ├── 2_python_ingredients.ipynb ├── 3_strings.ipynb ├── 4_py_filling.ipynb ├── 5_code_structure.ipynb ├── 6_exceptions.ipynb ├── 7_function.ipynb ├── 8_modules_pacakges_programs.ipynb ├── 9_objects_and_classes.ipynb ├── README.md └── images └── python_object.png /README.md: -------------------------------------------------------------------------------- 1 | # 《Python科学计算生态》课程资料 2 | 3 | * [教学大纲](./materials) 4 | * [课件](./课件) 5 | 1. [课程介绍&Python语言初识](./课件/1_a_taste_of_python.pdf) 6 | 2. [Python变量和数据类型](./课件/2_python_ingredients.ipynb) 7 | 3. [Python变量和数据类型—字符串](./课件/3_strings.ipynb) 8 | 4. [Python基本数据结构](./课件/4_py_filling.ipynb) 9 | 5. [Python代码结构](./课件/5_code_structure.ipynb) 10 | 6. [Python错误和异常](./课件/6_exceptions.ipynb) 11 | 7. [Python函数](./课件/7_function.ipynb) 12 | 8. [Python模块、包和程序](./课件/8_modules_pacakges_programs.ipynb) 13 | 9. [Python对象和类](./课件/9_objects_and_classes.ipynb) 14 | 10. [Python数据操作](./课件/10_mangle_data.ipynb) 15 | 11. [Python文件IO](./课件/11_file_io_and_structured_text_files.ipynb) 16 | 12. [Python系统管理](./课件/12_system_management.ipynb) 17 | 13. [Python正则表达式](./课件/13_regular_expressions.ipynb) 18 | 14. [Python排序](./课件/14_sort.ipynb) 19 | 15. [Python的else块和复制](./课件/15_else-and-copy.ipynb) 20 | 16. 
[NumPy基础](./课件/16_numpy.ipynb) 21 | -------------------------------------------------------------------------------- /materials/32201184-python科学计算生态教学进度表.docx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/edu2act/course-PySCE/335a5ccd782d57a6641fb5e7861413f645cc93c9/materials/32201184-python科学计算生态教学进度表.docx -------------------------------------------------------------------------------- /materials/python科学计算生态教学大纲-20019.docx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/edu2act/course-PySCE/335a5ccd782d57a6641fb5e7861413f645cc93c9/materials/python科学计算生态教学大纲-20019.docx -------------------------------------------------------------------------------- /课件/12_system_management.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "ein.tags": "worksheet-0", 7 | "slideshow": { 8 | "slide_type": "-" 9 | } 10 | }, 11 | "source": [ 12 | "# Python 系统管理\n", 13 | "\n", 14 | "## 文件系统\n", 15 | "\n", 16 | "Python的`os`模块实现了操作文件系统的接口。这些操作包括遍历目录树,删除/重命名文件等。此外`os.path`模块可以实现一些针对路径名的操作。\n", 17 | "\n", 18 | "\n", 19 | "### `os`模块的函数\n", 20 | "\n", 21 | "#### 文件处理\n", 22 | "\n", 23 | "* `remove()` / `unlink()` 删除文件\n", 24 | "* `rename()` / `renames()` 重命名文件\n", 25 | "* `stat()` 返回文件信息\n", 26 | "* `symlink()` 创建符号链接\n", 27 | "* `utime()` 更新时间戳\n", 28 | "* `walk()` 生成一个目录树下的所有文件名\n", 29 | "\n", 30 | "#### 目录/文件夹\n", 31 | "\n", 32 | "* `mkdir()` / `makedirs()` 创建目录/创建多层目录\n", 33 | "* `rmdir()` / `removedirs()` 删除目录/删除多层目录\n", 34 | "* `listdir()` 列出指定目录的文件\n", 35 | "* `chdir()` / `fchdir()` 改变当前工作目录 / 通过一个文件描述符改变当前目录\n", 36 | "* `chroot()` 改变当前进程的根目录\n", 37 | "* `getcwd()` / `getcwdb()` 返回当前工作目录 / 功能相同,但返回`bytes`对象\n", 38 | "\n", 39 | "#### 访问/权限\n", 40 | "\n", 41 | "* `access()` 检验权限模式\n", 42 | "* `chmod()` 改变权限模式\n", 43 | "* `chown()` / 
`lchown()` 改变owner和group ID / 功能相同,但不会跟踪链接\n", 44 | "* `umask()` 设置默认权限模式\n", 45 | "\n", 46 | "\n", 47 | "### `os.path`模块中的路径名访问函数\n", 48 | "\n", 49 | "#### 分隔\n", 50 | "\n", 51 | "* `basename()` 去掉目录路径,返回文件名\n", 52 | "* `dirname()` 去掉文件名,返回目录路径\n", 53 | "* `join()` 将分隔的部分组合成路径\n", 54 | "* `split()` 返回`(dirname(), basename())`元组\n", 55 | "* `splitdrive()` 返回`(drivename, pathname)`元组\n", 56 | "* `splitext()` 返回`(filename, extension)`元组\n", 57 | "\n", 58 | "#### 信息\n", 59 | "\n", 60 | "* `getatime()` 返回最近访问时间\n", 61 | "* `getctime()` 返回文件创建时间\n", 62 | "* `getmtime()` 返回最近文件修改时间\n", 63 | "* `getsize()` 返回文件大小\n", 64 | "\n", 65 | "#### 查询\n", 66 | "\n", 67 | "* `exists()` 指定路径(文件或者目录)是否存在\n", 68 | "* `isabs()` 指定路径是否为绝对路径\n", 69 | "* `isdir()` 指定路径是否存在且是一个目录\n", 70 | "* `isfile()` 指定路径是否存在且是一个文件\n", 71 | "* `islink()` 指定路径是否存在且是一个符号链接\n", 72 | "* `ismount()` 指定路径是否存在且是一个挂载点\n", 73 | "* `samefile()` 两个路径名是否指向同一个文件" 74 | ] 75 | }, 76 | { 77 | "cell_type": "code", 78 | "execution_count": null, 79 | "metadata": { 80 | "autoscroll": false, 81 | "collapsed": false, 82 | "ein.hycell": false, 83 | "ein.tags": "worksheet-0", 84 | "slideshow": { 85 | "slide_type": "-" 86 | } 87 | }, 88 | "outputs": [], 89 | "source": [ 90 | "import os\n", 91 | "os.path.exists('12_system_management.ipynb') # True\n", 92 | "os.path.isfile('12_system_management.ipynb') # True\n", 93 | "os.path.isdir('12_system_management.ipynb') # False\n", 94 | "os.path.isabs('12_system_management.ipynb') # False" 95 | ] 96 | }, 97 | { 98 | "cell_type": "code", 99 | "execution_count": null, 100 | "metadata": { 101 | "autoscroll": false, 102 | "collapsed": false, 103 | "ein.hycell": false, 104 | "ein.tags": "worksheet-0", 105 | "slideshow": { 106 | "slide_type": "-" 107 | } 108 | }, 109 | "outputs": [], 110 | "source": [ 111 | "list(os.walk('.'))" 112 | ] 113 | }, 114 | { 115 | "cell_type": "markdown", 116 | "metadata": { 117 | "ein.tags": "worksheet-0", 118 | "slideshow": { 119 | "slide_type": "-" 120 | } 121 | }, 
122 | "source": [ 123 | "### 使用`glob`模块列出匹配文件\n", 124 | "\n", 125 | "`glob.glob()`函数会使用Unix shell的规则来匹配文件或者目录:\n", 126 | "\n", 127 | "* `*` 匹配任意名称(`re`中是`.*`)\n", 128 | "* `?` 匹配一个字符\n", 129 | "* `[abc]` 匹配字符`a`、`b`或`c`中的任意一个\n", 130 | "* `[!abc]` 匹配除了`a`、`b`和`c`之外的任意一个字符" 131 | ] 132 | }, 133 | { 134 | "cell_type": "code", 135 | "execution_count": null, 136 | "metadata": { 137 | "autoscroll": false, 138 | "collapsed": false, 139 | "ein.hycell": false, 140 | "ein.tags": "worksheet-0", 141 | "slideshow": { 142 | "slide_type": "-" 143 | } 144 | }, 145 | "outputs": [], 146 | "source": [ 147 | "import glob\n", 148 | "\n", 149 | "glob.glob('*.ipynb')" 150 | ] 151 | }, 152 | { 153 | "cell_type": "markdown", 154 | "metadata": { 155 | "ein.tags": "worksheet-0", 156 | "slideshow": { 157 | "slide_type": "-" 158 | } 159 | }, 160 | "source": [ 161 | "## 日期和时间\n", 162 | "\n", 163 | "### `datetime`模块\n", 164 | "\n", 165 | "其定义了4个主要的对象,每个对象处理的内容:\n", 166 | "\n", 167 | "* `date` 处理年、月、日\n", 168 | "* `time` 处理时、分、秒和微秒\n", 169 | "* `datetime` 处理日期和时间同时出现的情况\n", 170 | "* `timedelta` 处理日期和(或)时间间隔\n" 171 | ] 172 | }, 173 | { 174 | "cell_type": "code", 175 | "execution_count": null, 176 | "metadata": { 177 | "autoscroll": false, 178 | "collapsed": false, 179 | "ein.hycell": false, 180 | "ein.tags": "worksheet-0", 181 | "slideshow": { 182 | "slide_type": "-" 183 | } 184 | }, 185 | "outputs": [], 186 | "source": [ 187 | "from datetime import date\n", 188 | "\n", 189 | "\n", 190 | "some_day = date(2017, 4, 21)\n", 191 | "some_day\n", 192 | "print(some_day.day, some_day.month, some_day.year)\n", 193 | "some_day.isoformat()\n" 194 | ] 195 | }, 196 | { 197 | "cell_type": "markdown", 198 | "metadata": { 199 | "ein.tags": "worksheet-0", 200 | "slideshow": { 201 | "slide_type": "-" 202 | } 203 | }, 204 | "source": [ 205 | "**iso**是指ISO 8601,一种表示日期和时间的国际标准。这个标准的显示顺序是从一般(年)到特殊(日)。其可用来对日期进行正确的排序:先按照年,然后是月,最后是日。" 206 | ] 207 | }, 208 | { 209 | "cell_type": "code", 210 | "execution_count": null, 211 | 
"metadata": { 212 | "autoscroll": false, 213 | "collapsed": false, 214 | "ein.hycell": false, 215 | "ein.tags": "worksheet-0", 216 | "slideshow": { 217 | "slide_type": "-" 218 | } 219 | }, 220 | "outputs": [], 221 | "source": [ 222 | "now = date.today()\n", 223 | "now" 224 | ] 225 | }, 226 | { 227 | "cell_type": "code", 228 | "execution_count": null, 229 | "metadata": { 230 | "autoscroll": false, 231 | "collapsed": false, 232 | "ein.hycell": false, 233 | "ein.tags": "worksheet-0", 234 | "slideshow": { 235 | "slide_type": "-" 236 | } 237 | }, 238 | "outputs": [], 239 | "source": [ 240 | "from datetime import timedelta\n", 241 | "\n", 242 | "one_day = timedelta(days=1)\n", 243 | "tomorrow = now + one_day\n", 244 | "print(tomorrow)\n", 245 | "print(now + 17 * one_day)\n", 246 | "yesterday = now - one_day\n", 247 | "print(yesterday)\n", 248 | "\n", 249 | "from datetime import datetime\n", 250 | "print(repr(datetime.resolution))" 251 | ] 252 | }, 253 | { 254 | "cell_type": "markdown", 255 | "metadata": { 256 | "ein.tags": "worksheet-0", 257 | "slideshow": { 258 | "slide_type": "-" 259 | } 260 | }, 261 | "source": [ 262 | "date的范围是`date.min`到`date.max`。" 263 | ] 264 | }, 265 | { 266 | "cell_type": "code", 267 | "execution_count": null, 268 | "metadata": { 269 | "autoscroll": false, 270 | "collapsed": false, 271 | "ein.hycell": false, 272 | "ein.tags": "worksheet-0", 273 | "slideshow": { 274 | "slide_type": "-" 275 | } 276 | }, 277 | "outputs": [], 278 | "source": [ 279 | "print(date.min)\n", 280 | "print(date.max)" 281 | ] 282 | }, 283 | { 284 | "cell_type": "markdown", 285 | "metadata": { 286 | "ein.tags": "worksheet-0", 287 | "slideshow": { 288 | "slide_type": "-" 289 | } 290 | }, 291 | "source": [ 292 | "`datetime`模块中的`time`对象用来表示一天中的时间:" 293 | ] 294 | }, 295 | { 296 | "cell_type": "code", 297 | "execution_count": null, 298 | "metadata": { 299 | "autoscroll": false, 300 | "collapsed": false, 301 | "ein.hycell": false, 302 | "ein.tags": "worksheet-0", 303 | 
"slideshow": { 304 | "slide_type": "-" 305 | } 306 | }, 307 | "outputs": [], 308 | "source": [ 309 | "from datetime import time\n", 310 | "\n", 311 | "noon = time(12, 0, 0)\n", 312 | "print(noon)\n", 313 | "print(noon.hour, noon.minute, noon.second, sep=':')\n", 314 | "print(noon.microsecond)" 315 | ] 316 | }, 317 | { 318 | "cell_type": "markdown", 319 | "metadata": { 320 | "ein.tags": "worksheet-0", 321 | "slideshow": { 322 | "slide_type": "-" 323 | } 324 | }, 325 | "source": [ 326 | "参数的顺序按照时间单位从大到小排列(时、分、秒、微秒)。没有参数的话,`time`会默认使用0。\n", 327 | "\n", 328 | "注意,时间不一定时精确的,对于**微秒**和**秒**。" 329 | ] 330 | }, 331 | { 332 | "cell_type": "code", 333 | "execution_count": null, 334 | "metadata": { 335 | "autoscroll": false, 336 | "collapsed": false, 337 | "ein.hycell": false, 338 | "ein.tags": "worksheet-0", 339 | "slideshow": { 340 | "slide_type": "-" 341 | } 342 | }, 343 | "outputs": [], 344 | "source": [ 345 | "from datetime import datetime\n", 346 | "\n", 347 | "def print_repr(obj):\n", 348 | " print(repr(obj))\n", 349 | "\n", 350 | "some_day = datetime(2017, 4, 21, 2, 43, 50, 7)\n", 351 | "print_repr(some_day.isoformat())\n", 352 | "\n", 353 | "right_now = datetime.now()\n", 354 | "print_repr(right_now)\n", 355 | "\n", 356 | "from datetime import time, date\n", 357 | "noon = time(12)\n", 358 | "this_day = date.today()\n", 359 | "noon_today = datetime.combine(this_day, noon)\n", 360 | "print_repr(noon_today)\n", 361 | "\n", 362 | "print_repr(noon_today.date())\n", 363 | "print_repr(noon_today.time())" 364 | ] 365 | }, 366 | { 367 | "cell_type": "markdown", 368 | "metadata": { 369 | "ein.tags": "worksheet-0", 370 | "slideshow": { 371 | "slide_type": "-" 372 | } 373 | }, 374 | "source": [ 375 | "下面的代码展示计算一个月份的开始日到结束日中间的日期范围:" 376 | ] 377 | }, 378 | { 379 | "cell_type": "code", 380 | "execution_count": null, 381 | "metadata": { 382 | "autoscroll": false, 383 | "collapsed": false, 384 | "ein.hycell": false, 385 | "ein.tags": "worksheet-0", 386 | "slideshow": { 387 | 
"slide_type": "-" 388 | } 389 | }, 390 | "outputs": [], 391 | "source": [ 392 | "from datetime import datetime, date, timedelta\n", 393 | "import calendar\n", 394 | "\n", 395 | "\n", 396 | "def get_month_range(start_date=None):\n", 397 | " if start_date is None:\n", 398 | " start_date = date.today().replace(day=1)\n", 399 | " _, days_in_month = calendar.monthrange(start_date.year, start_date.month)\n", 400 | " end_date = start_date + timedelta(days=days_in_month)\n", 401 | " return (start_date, end_date)\n", 402 | "\n", 403 | "\n", 404 | "a_day = timedelta(days=1)\n", 405 | "first_day, last_day = get_month_range()\n", 406 | "while first_day < last_day:\n", 407 | " print(first_day)\n", 408 | " first_day += a_day\n" 409 | ] 410 | }, 411 | { 412 | "cell_type": "markdown", 413 | "metadata": { 414 | "ein.tags": "worksheet-0", 415 | "slideshow": { 416 | "slide_type": "-" 417 | } 418 | }, 419 | "source": [ 420 | "上面的`get_month_range()`函数接受一个`datetime`对象并返回一个由当前月份开始日和下个月开始日组成的元组对象。\n", 421 | "\n", 422 | "计算出一个对应月份第一天的日期,一种快速的方法就是使用`date`或`datetime`对象的`replace()`方法简单地将`days`属性设置成`1`即可。\n", 423 | "\n", 424 | "使用`calendar.monthrange()`来获得该月的总天数。任何时候只要你想获得日历信息,可以使用`calendar`模块。" 425 | ] 426 | }, 427 | { 428 | "cell_type": "markdown", 429 | "metadata": { 430 | "ein.tags": "worksheet-0", 431 | "slideshow": { 432 | "slide_type": "-" 433 | } 434 | }, 435 | "source": [ 436 | "### `time`模块\n", 437 | "\n", 438 | "一种表示绝对时间的方法时计算从某个起始点开始的秒数。Unix时间使用从1970年1月1日0点开始的秒数。这个值通常被成为**纪元**(epoch),它是不同系统之间最简单的交换日期时间的方法。" 439 | ] 440 | }, 441 | { 442 | "cell_type": "code", 443 | "execution_count": null, 444 | "metadata": { 445 | "autoscroll": false, 446 | "collapsed": false, 447 | "ein.hycell": false, 448 | "ein.tags": "worksheet-0", 449 | "slideshow": { 450 | "slide_type": "-" 451 | } 452 | }, 453 | "outputs": [], 454 | "source": [ 455 | "import time\n", 456 | "\n", 457 | "# time() 返回当前时间的纪元值\n", 458 | "now = time.time()\n", 459 | "print_repr(now)\n", 460 | "\n", 461 | "# ctime() 
将纪元值转换成一个字符串\n", 462 | "print_repr(time.ctime(now))\n", 463 | "\n", 464 | "# localtime() 返回当前系统时区下的时间\n", 465 | "print_repr(time.localtime(now))\n", 466 | "\n", 467 | "# gmtime() 返回UTC时间\n", 468 | "print_repr(time.gmtime(now))\n", 469 | "\n", 470 | "print_repr(time.localtime())\n", 471 | "print_repr(time.gmtime())\n", 472 | "\n", 473 | "# mktime() 将 struct_time 对象转换回纪元值\n", 474 | "print_repr(time.mktime(time.localtime()))" 475 | ] 476 | }, 477 | { 478 | "cell_type": "markdown", 479 | "metadata": { 480 | "ein.tags": "worksheet-0", 481 | "slideshow": { 482 | "slide_type": "-" 483 | } 484 | }, 485 | "source": [ 486 | "`localtime()`和`gmtime()`返回的是一个`struct_time`对象(命名元组)。其结构如下:\n", 487 | "\n", 488 | "| Index | Attribute | Values |\n", 489 | "|-------|-----------|--------------------------------------------------|\n", 490 | "| 0 | tm_year | (for example, 1993) |\n", 491 | "| 1 | tm_mon | range [1, 12] |\n", 492 | "| 2 | tm_mday | range [1, 31] |\n", 493 | "| 3 | tm_hour | range [0, 23] |\n", 494 | "| 4 | tm_min | range [0, 59] |\n", 495 | "| 5 | tm_sec | range [0, 61] |\n", 496 | "| 6 | tm_wday | range [0, 6], Monday is 0 |\n", 497 | "| 7 | tm_yday | range [1, 366] |\n", 498 | "| 8 | tm_isdst | 0, 1 or -1 |\n", 499 | "| N/A | tm_zone | abbreviation of timezone name |\n", 500 | "| N/A | tm_gmtoff | offset east of UTC in seconds |\n", 501 | "\n", 502 | "\n", 503 | "**建议:**\n", 504 | "\n", 505 | "* 尽量使用UTC而不是本地时区,特别是将服务器设置为UTC时间,不要使用本地时间。\n", 506 | "* 尽可能不要使用夏令时。\n", 507 | "\n", 508 | "\n", 509 | "### 读写日期和时间\n", 510 | "\n", 511 | "使用`strftime()`将日期和时间转换成字符串,`datetime`、`date`、`time`对象和`time`模块中都包含此方法。`strftime()`使用格式化字符串来指定输出,见下表:\n", 512 | "\n", 513 | "| 格式化字符串 | 日期/时间单元 | 范围 |\n", 514 | "|--------------|----------------|-------------|\n", 515 | "| Y | 年 | 1900-... |\n", 516 | "| m | 月 | 01-12 |\n", 517 | "| B | 月名 | January,... |\n", 518 | "| b | 月名简写 | Jan,... |\n", 519 | "| d | 日 | 01-31 |\n", 520 | "| A | 星期 | Sunday,... |\n", 521 | "| a | 星期缩写 | Sun,... 
|\n", 522 | "| H | 时(24小时制) | 00-23 |\n", 523 | "| I | 时(12小时制) | 01-12 |\n", 524 | "| p | 上午/下午 | AM,PM |\n", 525 | "| M | 分 | 00-59 |\n", 526 | "| S | 秒 | 00-59 |\n", 527 | "\n", 528 | "数字左侧都是补零。更多内容请参考[官方文档](https://docs.python.org/3.6/library/datetime.html#strftime-strptime-behavior)。" 529 | ] 530 | }, 531 | { 532 | "cell_type": "code", 533 | "execution_count": null, 534 | "metadata": { 535 | "autoscroll": false, 536 | "collapsed": false, 537 | "ein.hycell": false, 538 | "ein.tags": "worksheet-0", 539 | "slideshow": { 540 | "slide_type": "-" 541 | } 542 | }, 543 | "outputs": [], 544 | "source": [ 545 | "import time\n", 546 | "\n", 547 | "fmt = \"It's %A, %B %d, %Y, local time %I:%M:%S%p\"\n", 548 | "t = time.localtime()\n", 549 | "print_repr(t)\n", 550 | "print(time.strftime(fmt, t))\n", 551 | "\n", 552 | "\n", 553 | "from datetime import date\n", 554 | "\n", 555 | "some_day = date(2017, 4, 21)\n", 556 | "print(some_day.strftime(fmt)) # 只能获取日期部分,时间默认是午夜\n", 557 | "\n", 558 | "\n", 559 | "from datetime import time\n", 560 | "\n", 561 | "some_time = time(10, 35)\n", 562 | "print(some_time.strftime(fmt)) # 只会转换时间部分" 563 | ] 564 | }, 565 | { 566 | "cell_type": "markdown", 567 | "metadata": { 568 | "ein.tags": "worksheet-0", 569 | "slideshow": { 570 | "slide_type": "-" 571 | } 572 | }, 573 | "source": [ 574 | "使用`strptime()`可以将格式化的字符串转换为日期或时间。不能使用正则表达式,字符串的非格式化部分必须完全匹配。" 575 | ] 576 | }, 577 | { 578 | "cell_type": "code", 579 | "execution_count": null, 580 | "metadata": { 581 | "autoscroll": false, 582 | "collapsed": false, 583 | "ein.hycell": false, 584 | "ein.tags": "worksheet-0", 585 | "slideshow": { 586 | "slide_type": "-" 587 | } 588 | }, 589 | "outputs": [], 590 | "source": [ 591 | "import time\n", 592 | "\n", 593 | "fmt = '%Y-%m-%d'\n", 594 | "print_repr(time.strptime('2017-04-21', fmt))\n", 595 | "print_repr(time.strptime('2017-04-31', fmt)) # ValueError" 596 | ] 597 | }, 598 | { 599 | "cell_type": "markdown", 600 | "metadata": { 601 | "ein.tags": 
"worksheet-0", 602 | "slideshow": { 603 | "slide_type": "-" 604 | } 605 | }, 606 | "source": [ 607 | "名称可以通过操作系统中的`locale`进行设置。如果要打印不同的月和日名称,可通过`setlocale()`来设置,其第一个参数是`locale.LC_TIME`,表示设置的是日期和时间,第二个参数是一个结合了**语言**和**国家名称**的缩写字符串。" 608 | ] 609 | }, 610 | { 611 | "cell_type": "code", 612 | "execution_count": null, 613 | "metadata": { 614 | "autoscroll": false, 615 | "collapsed": false, 616 | "ein.hycell": false, 617 | "ein.tags": "worksheet-0", 618 | "slideshow": { 619 | "slide_type": "-" 620 | } 621 | }, 622 | "outputs": [], 623 | "source": [ 624 | "import locale\n", 625 | "help(locale.setlocale)" 626 | ] 627 | }, 628 | { 629 | "cell_type": "code", 630 | "execution_count": null, 631 | "metadata": { 632 | "autoscroll": false, 633 | "collapsed": false, 634 | "ein.hycell": false, 635 | "ein.tags": "worksheet-0", 636 | "slideshow": { 637 | "slide_type": "-" 638 | } 639 | }, 640 | "outputs": [], 641 | "source": [ 642 | "import locale\n", 643 | "from datetime import date\n", 644 | "\n", 645 | "halloween = date(2014, 10, 31)\n", 646 | "for lang_country in ['en_us', 'fr_fr', 'de_de', 'zh_cn']:\n", 647 | " locale.setlocale(locale.LC_TIME, lang_country)\n", 648 | " print(halloween.strftime('%A, %B %d'))" 649 | ] 650 | }, 651 | { 652 | "cell_type": "code", 653 | "execution_count": null, 654 | "metadata": { 655 | "autoscroll": false, 656 | "collapsed": false, 657 | "ein.hycell": false, 658 | "ein.tags": "worksheet-0", 659 | "slideshow": { 660 | "slide_type": "-" 661 | } 662 | }, 663 | "outputs": [], 664 | "source": [ 665 | "import locale\n", 666 | "names = locale.locale_alias.keys()\n", 667 | "good_names = [name for name in names\n", 668 | " if len(name) == 5 and name[2] == '_']\n", 669 | "for name in list(good_names)[-5:]:\n", 670 | " print(name)\n", 671 | "\n", 672 | "zh = [name for name in good_names if name.startswith('zh')]\n", 673 | "print_repr(zh)" 674 | ] 675 | }, 676 | { 677 | "cell_type": "markdown", 678 | "metadata": { 679 | "ein.tags": "worksheet-0", 680 | 
"slideshow": { 681 | "slide_type": "-" 682 | } 683 | }, 684 | "source": [ 685 | "#### 其他操作日期和时间的类库\n", 686 | "\n", 687 | "* [arrow](https://github.com/crsmithdev/arrow):更好的 Python 日期时间操作类库。\n", 688 | "* [maya](https://github.com/kennethreitz/maya):Timestamps for humans.\n", 689 | "* [Chronyk](https://github.com/KoffeinFlummi/Chronyk):Python 3 的类库,用于解析手写格式的时间和日期。\n", 690 | "* [dateutil](https://pypi.python.org/pypi/python-dateutil):Python datetime 模块的扩展。\n", 691 | "* [delorean](https://github.com/myusuf3/delorean/):解决 Python 中有关日期处理的棘手问题的库。\n", 692 | "* [moment](https://github.com/zachwill/moment):一个用来处理时间和日期的Python库。灵感来自于Moment.js。\n", 693 | "* [PyTime](https://github.com/shinux/PyTime):一个简单易用的Python模块,用于通过字符串来操作日期/时间。\n", 694 | "* [pytz](https://launchpad.net/pytz):现代以及历史版本的世界时区定义。将时区数据库引入Python。\n", 695 | "* [when.py](https://github.com/dirn/When.py):提供用户友好的函数来帮助用户进行常用的日期和时间操作。" 696 | ] 697 | } 698 | ], 699 | "metadata": { 700 | "kernelspec": { 701 | "display_name": "Python 3", 702 | "name": "python3" 703 | }, 704 | "name": "12_system_management.ipynb" 705 | }, 706 | "nbformat": 4, 707 | "nbformat_minor": 2 708 | } 709 | -------------------------------------------------------------------------------- /课件/13_regular_expressions.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "ein.tags": "worksheet-0", 7 | "slideshow": { 8 | "slide_type": "-" 9 | } 10 | }, 11 | "source": [ 12 | "# Python正则表达式\n", 13 | "\n", 14 | "正则表达式(regular expression)发源于与计算机密切相关的两个领域:计算理论和形式语言。其主要功能是从字符串中通过特定的模式(pattern)搜索想要的内容。\n", 15 | "\n", 16 | "给定一个正则表达式和另一个字符串,我们可以达到如下的目的:\n", 17 | "\n", 18 | "1. 给定的字符串是否符合正则表达式的过滤逻辑(称作“匹配”);\n", 19 | "2. 可以通过正则表达式,从字符串中获取我们想要的特定部分。\n", 20 | "\n", 21 | "## 正则表达式的特点\n", 22 | "\n", 23 | "1. 灵活性、逻辑性和功能性非常的强\n", 24 | "2. 可以迅速地用极简单的方式达到对字符串的复杂控制\n", 25 | "3. 
对于刚接触的人来说,比较晦涩难懂\n", 26 | "\n", 27 | "## 正则表达式的语法\n", 28 | "\n", 29 | "正则表达式由一些**普通字符**和一些**元字符**组成。普通字符就是我们平时常见的字母、数字之类的,当然也包括一些常见的符号,等等。而元字符则可以理解为正则表达式引擎的保留字符,就像很多计算机语言中的保留字符一样,它们在正则引擎中有特殊的意义。\n", 30 | "\n", 31 | "## 字符组\n", 32 | "\n", 33 | "字符组就是一组字符,在正则表达式中,其表示“在同一个位置可能出现的各种字符”,其写法是在一对方括号`[`和`]`之间列出所有可能出现的字符,简单的字符组比如`[ab]`、`[314]`、`[#.?]`。" 34 | ] 35 | }, 36 | { 37 | "cell_type": "code", 38 | "execution_count": null, 39 | "metadata": { 40 | "autoscroll": false, 41 | "collapsed": false, 42 | "ein.hycell": false, 43 | "ein.tags": "worksheet-0", 44 | "slideshow": { 45 | "slide_type": "-" 46 | } 47 | }, 48 | "outputs": [], 49 | "source": [ 50 | "# 用正则表达式判断字符串中是否包含数字字符\n", 51 | "import re\n", 52 | "\n", 53 | "\n", 54 | "has_digit = lambda astring: re.search('[0123456789]', astring) is not None\n", 55 | "has_digit('1234')" 56 | ] 57 | }, 58 | { 59 | "cell_type": "markdown", 60 | "metadata": { 61 | "ein.tags": "worksheet-0", 62 | "slideshow": { 63 | "slide_type": "-" 64 | } 65 | }, 66 | "source": [ 67 | "`re.search()` 是Python提供的正则表达式操作函数,表示“进行正则表达式匹配”;`astring`是需要判断的字符串,而`[0123456789]`则是以字符串形式给出的正则表达式,它是一个字符组,表示这里可以是0、1、2、…、8、9中的任意一个字符。只要`astring`包含其中任何一个字符,就会得到一个匹配对象(Match object),否则返回`None`。\n", 68 | "字符组中的字符排列顺序并不影响字符组的功能,出现重复字符也不会影响,所以`[0123456789]`和`[9876543210]`、`[998877654332210]`完全等价。\n", 69 | "\n", 70 | "正则表达式提供了**-范围表示法(range)**,它更直观,能进一步简化字符组。其形式为`[x-y]`,表示**x到y整个范围内的字符**。这样`[0123456789]`就可以表示为`[0-9]`,`[abcdefghijklmnopqrstuvwxyz]`就可以表示为`[a-z]`。一般情况下,字符组的范围表示法都表示一类字符(数字字符或者字母字符等)。\n", 71 | "\n", 72 | "字符组中可以同时并列多个**-范围表示法(range)**,字符组`[0-9a-zA-Z]`可以匹配数字、大写字母或小写字母;字符组`[0-9a-fA-F]`可以匹配数字、大小写形式的a~f,它可以用来验证十六进制字符。\n", 73 | "\n", 74 | "### 元字符与转义\n", 75 | "\n", 76 | "字符组的开方括号`[`、闭方括号`]`以及`^`、`$`都算元字符。在匹配中,它们都有特殊含义。有时候只需要表示普通字符,就必须做特殊处理。\n", 77 | "\n", 78 | "字符组中的`-`,如果其紧邻着字符组中的开方括号`[`,那么它就是普通字符,其他情况下都是元字符;而对于其他元字符,取消特殊含义的做法都是转义,即在其前面加上反斜线字符`\\`。\n", 79 | "\n", 80 | "如果要在字符组内部使用横线`-`,最好的办法是将它排列在字符组的最开头。`[-09]`就是包含三个字符`-`、`0`、`9`的字符组。" 81 | ] 82 | }, 83 | { 84 | 
"cell_type": "code", 85 | "execution_count": null, 86 | "metadata": { 87 | "autoscroll": false, 88 | "collapsed": false, 89 | "ein.hycell": false, 90 | "ein.tags": "worksheet-0", 91 | "slideshow": { 92 | "slide_type": "-" 93 | } 94 | }, 95 | "outputs": [], 96 | "source": [ 97 | "re.search(\"^[-09]$\", \"3\") != None # => False\n", 98 | "re.search(\"^[-09]$\", \"-\") != None # => True\n", 99 | "\n", 100 | "re.search(\"^[0\\\\-9]$\", \"3\") != None # => False\n", 101 | "re.search(\"^[0\\\\-9]$\", \"-\") != None # => True" 102 | ] 103 | }, 104 | { 105 | "cell_type": "markdown", 106 | "metadata": { 107 | "ein.tags": "worksheet-0", 108 | "slideshow": { 109 | "slide_type": "-" 110 | } 111 | }, 112 | "source": [ 113 | "Python提供了**原生字符串**(Raw String),其非常适合正则表达式:正则表达式是怎样,原生字符串就是怎样,完全不需要考虑正则表达式之外的转义(只有双引号字符是例外,原生字符串内的双引号字符必须转义写成`\\\"`)。原生字符串的形式是`r\"string\"`。" 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": null, 119 | "metadata": { 120 | "autoscroll": false, 121 | "collapsed": false, 122 | "ein.hycell": false, 123 | "ein.tags": "worksheet-0", 124 | "slideshow": { 125 | "slide_type": "-" 126 | } 127 | }, 128 | "outputs": [], 129 | "source": [ 130 | "r\"^[0\\-9]$\" == \"^[0\\\\-9]$\" # => True" 131 | ] 132 | }, 133 | { 134 | "cell_type": "markdown", 135 | "metadata": { 136 | "ein.tags": "worksheet-0", 137 | "slideshow": { 138 | "slide_type": "-" 139 | } 140 | }, 141 | "source": [ 142 | "### 排除型字符组\n", 143 | "\n", 144 | "排除型字符组(Negated Character Class)非常类似普通字符组`[...]`,只是在开方括号`[`之后紧跟一个脱字符`^`,写作`[^...]`,表示“在当前位置,匹配一个没有列出的字符”。" 145 | ] 146 | }, 147 | { 148 | "cell_type": "code", 149 | "execution_count": null, 150 | "metadata": { 151 | "autoscroll": false, 152 | "collapsed": false, 153 | "ein.hycell": false, 154 | "ein.tags": "worksheet-0", 155 | "slideshow": { 156 | "slide_type": "-" 157 | } 158 | }, 159 | "outputs": [], 160 | "source": [ 161 | "# 第一个不是数字第二个是数字\n", 162 | "re.search(r\"^[^0-9][0-9]$\", \"A8\") != None # => True\n", 163 | 
"re.search(r\"^[^0-9][0-9]$\", \"08\") != None # => False" 164 | ] 165 | }, 166 | { 167 | "cell_type": "markdown", 168 | "metadata": { 169 | "ein.tags": "worksheet-0", 170 | "slideshow": { 171 | "slide_type": "-" 172 | } 173 | }, 174 | "source": [ 175 | "“在当前位置,匹配一个没有列出的字符”和“在当前位置不要匹配列出的字符”是不同的,后者暗示“这里不出现任何字符也可以”。排除型字符组必须匹配一个字符。" 176 | ] 177 | }, 178 | { 179 | "cell_type": "code", 180 | "execution_count": null, 181 | "metadata": { 182 | "autoscroll": false, 183 | "collapsed": false, 184 | "ein.hycell": false, 185 | "ein.tags": "worksheet-0", 186 | "slideshow": { 187 | "slide_type": "-" 188 | } 189 | }, 190 | "outputs": [], 191 | "source": [ 192 | "re.search(r\"^[^0-9][0-9]$\", \"8\") != None # => False" 193 | ] 194 | }, 195 | { 196 | "cell_type": "markdown", 197 | "metadata": { 198 | "ein.tags": "worksheet-0", 199 | "slideshow": { 200 | "slide_type": "-" 201 | } 202 | }, 203 | "source": [ 204 | "在排除型字符组中,如果需要表示横线字符`-`,那么`-`应紧跟在`^`之后。" 205 | ] 206 | }, 207 | { 208 | "cell_type": "code", 209 | "execution_count": null, 210 | "metadata": { 211 | "autoscroll": false, 212 | "collapsed": false, 213 | "ein.hycell": false, 214 | "ein.tags": "worksheet-0", 215 | "slideshow": { 216 | "slide_type": "-" 217 | } 218 | }, 219 | "outputs": [], 220 | "source": [ 221 | "# 匹配一个-、0、9之外的字符\n", 222 | "re.search(r\"^[^-09]$\", \"-\") != None # => False" 223 | ] 224 | }, 225 | { 226 | "cell_type": "markdown", 227 | "metadata": { 228 | "ein.tags": "worksheet-0", 229 | "slideshow": { 230 | "slide_type": "-" 231 | } 232 | }, 233 | "source": [ 234 | "### 字符组简记法\n", 235 | "\n", 236 | "* `\\d` 等价于`[0-9]`,`d`代表“数字(digit)”\n", 237 | "* `\\w` 等价于`[0-9a-zA-Z_]`,`w`代表“单词字符(word)”\n", 238 | "* `\\s` 等价于`[ \\t\\r\\n\\v\\f]`(第一个字符是空格),`s`表示“空白字符(space)”\n", 239 | "\n", 240 | "注意:字符组简记法中的“单词字符”不只有大小写单词,还包括数字字符和下划线`_`。\n", 241 | "\n", 242 | "“空白字符”可以是空格字符、制表符`\\t`、回车符`\\r`、换行符`\\n`等各种“空白”字符。\n", 243 | "\n", 244 | "字符组简记法可以单独出现,也可以使用在字符组中,比如`[0-9a-zA-Z]`也可以写作`[\\da-zA-Z]`,`[^0-9a-zA-Z_]`可以写作`[^\\w]`。\n", 
245 | "\n", 246 | "相对于`\\d`、`\\w`和`\\s`这三个普通字符组简记法,正则表达式也提供了对应排除型字符组的简记法:`\\D`、`\\W`和`\\S`——字母完全一样,只是改为大写。这些简记法匹配的字符互补:`\\s`能匹配的字符,`\\S`一定不能匹配;`\\w`能匹配的字符,`\\W`一定不能匹配;`\\d`能匹配的字符,`\\D`一定不能匹配。\n", 247 | "\n", 248 | "利用这种互补的属性,就能得到巧妙的效果:`[\\s\\S]`、`[\\w\\W]`、`[\\d\\D]`匹配的就是“所有的字符”(或者叫“任意字符”)。\n", 249 | "\n", 250 | "\n", 251 | "## 量词\n", 252 | "\n", 253 | "匹配确定的长度或者不确定的长度。\n", 254 | "\n", 255 | "* `prev{m}` 限定之前的元素出现m次。\n", 256 | "* `prev{m,n}` 限定之前的元素最少出现m次,最多出现n次(均为闭区间)。\n", 257 | "\n", 258 | "如果不确定长度的上限,也可以省略,只指定下限,写成`prev{m,}`。\n", 259 | "\n", 260 | "### 常用量词\n", 261 | "\n", 262 | "* `*`,等价于`{0,}`,可能出现,也可能不出现,出现次数没有上限\n", 263 | "* `+`,等价于`{1,}`,至少出现1次,出现次数没有上限\n", 264 | "* `?`,等价于`{0,1}`,至多出现一次,也可能不出现\n", 265 | "\n", 266 | "使用正则表达式的一条根本规律:使用合适的结构(包括字符组和量词),精确表达自己的意图,界定能匹配的文本。" 267 | ] 268 | }, 269 | { 270 | "cell_type": "markdown", 271 | "metadata": { 272 | "ein.tags": "worksheet-0", 273 | "slideshow": { 274 | "slide_type": "-" 275 | } 276 | }, 277 | "source": [ 278 | "## Python的`re`模块" 279 | ] 280 | }, 281 | { 282 | "cell_type": "code", 283 | "execution_count": null, 284 | "metadata": { 285 | "autoscroll": false, 286 | "collapsed": false, 287 | "ein.hycell": false, 288 | "ein.tags": "worksheet-0", 289 | "slideshow": { 290 | "slide_type": "-" 291 | } 292 | }, 293 | "outputs": [], 294 | "source": [ 295 | "import re\n", 296 | "\n", 297 | "result = re.match(r'^travell?er$', 'traveler')\n", 298 | "print(result)\n", 299 | "result.group()" 300 | ] 301 | }, 302 | { 303 | "cell_type": "markdown", 304 | "metadata": { 305 | "ein.tags": "worksheet-0", 306 | "slideshow": { 307 | "slide_type": "-" 308 | } 309 | }, 310 | "source": [ 311 | "`match()`函数用于查看源(source)字符串是否以模式(pattern)字符串开头。" 312 | ] 313 | }, 314 | { 315 | "cell_type": "code", 316 | "execution_count": null, 317 | "metadata": { 318 | "autoscroll": false, 319 | "collapsed": false, 320 | "ein.hycell": false, 321 | "ein.tags": "worksheet-0", 322 | "slideshow": { 323 | "slide_type": "-" 324 | } 325 | }, 326 | "outputs": 
[], 327 | "source": [ 328 | "pattern = re.compile(r'travell?er')\n", 329 | "pattern.match('traveler')" 330 | ] 331 | }, 332 | { 333 | "cell_type": "markdown", 334 | "metadata": { 335 | "ein.tags": "worksheet-0", 336 | "slideshow": { 337 | "slide_type": "-" 338 | } 339 | }, 340 | "source": [ 341 | "`re`模块其他可用的方法:\n", 342 | "* `search()` 返回第一次成功匹配,如果存在的话;\n", 343 | "* `findall()` 返回所有不重叠的匹配,如果存在的话;\n", 344 | "* `split()` 会根据pattern将source切分成若干段,返回由这些片段组成的列表;\n", 345 | "* `sub()` 需要一个额外的参数`replacement`,它会把source中所有匹配的pattern改成replacement。\n" 346 | ] 347 | }, 348 | { 349 | "cell_type": "code", 350 | "execution_count": null, 351 | "metadata": { 352 | "autoscroll": false, 353 | "collapsed": false, 354 | "ein.hycell": false, 355 | "ein.tags": "worksheet-0", 356 | "slideshow": { 357 | "slide_type": "-" 358 | } 359 | }, 360 | "outputs": [], 361 | "source": [ 362 | "# search() 寻找首次匹配\n", 363 | "m = pattern.search('traveller')\n", 364 | "if m:\n", 365 | " print(m.group())" 366 | ] 367 | }, 368 | { 369 | "cell_type": "code", 370 | "execution_count": null, 371 | "metadata": { 372 | "autoscroll": false, 373 | "collapsed": false, 374 | "ein.hycell": false, 375 | "ein.tags": "worksheet-0", 376 | "slideshow": { 377 | "slide_type": "-" 378 | } 379 | }, 380 | "outputs": [], 381 | "source": [ 382 | "# findall() 寻找所有匹配\n", 383 | "m = pattern.findall('traveller and traveler')\n", 384 | "m" 385 | ] 386 | }, 387 | { 388 | "cell_type": "code", 389 | "execution_count": null, 390 | "metadata": { 391 | "autoscroll": false, 392 | "collapsed": false, 393 | "ein.hycell": false, 394 | "ein.tags": "worksheet-0", 395 | "slideshow": { 396 | "slide_type": "-" 397 | } 398 | }, 399 | "outputs": [], 400 | "source": [ 401 | "# split() 按匹配切分\n", 402 | "astring = 'a3b2c1d4e10'\n", 403 | "\n", 404 | "def uncompress(astring):\n", 405 | " a = re.split(r'\\d+', astring)[:-1]\n", 406 | " d = re.split(r'[a-zA-Z]', astring)[1:]\n", 407 | " return ''.join([a[i] * int(d[i]) for i in range(len(a))])\n", 408 | "\n", 
409 | "uncompress(astring)" 410 | ] 411 | }, 412 | { 413 | "cell_type": "markdown", 414 | "metadata": { 415 | "ein.tags": "worksheet-0", 416 | "slideshow": { 417 | "slide_type": "-" 418 | } 419 | }, 420 | "source": [ 421 | "## 模式\n", 422 | "\n", 423 | "* 普通的文本值代表自身,用于匹配非特殊字符;\n", 424 | "* 使用`.`代表任意除`\\n`外的字符;\n", 425 | "* 使用`*`表示前一个元素重复任意多次(包括0次);\n", 426 | "* 使用`?`表示前一个元素可选(出现0次或1次)。" 427 | ] 428 | }, 429 | { 430 | "cell_type": "code", 431 | "execution_count": null, 432 | "metadata": { 433 | "autoscroll": false, 434 | "collapsed": false, 435 | "ein.hycell": false, 436 | "ein.tags": "worksheet-0", 437 | "slideshow": { 438 | "slide_type": "-" 439 | } 440 | }, 441 | "outputs": [], 442 | "source": [ 443 | "import re\n", 444 | "# 换行符的匹配\n", 445 | "re.search(r'^.$', '\\n') != None # => False\n", 446 | "# 单行模式\n", 447 | "re.search(r'(?s)^.$', '\\n') != None # => True\n", 448 | "# 自制“通配字符组”\n", 449 | "re.search(r'^[\\s\\S]$', '\\n') != None # => True" 450 | ] 451 | }, 452 | { 453 | "cell_type": "markdown", 454 | "metadata": { 455 | "ein.tags": "worksheet-0", 456 | "slideshow": { 457 | "slide_type": "-" 458 | } 459 | }, 460 | "source": [ 461 | "### 特殊字符(参考教材p140)\n", 462 | "\n", 463 | "**正则表达式不仅仅适用于ASCII字符,还适用于Unicode的字符。**\n" 464 | ] 465 | }, 466 | { 467 | "cell_type": "code", 468 | "execution_count": null, 469 | "metadata": { 470 | "autoscroll": false, 471 | "collapsed": false, 472 | "ein.hycell": false, 473 | "ein.tags": "worksheet-0", 474 | "slideshow": { 475 | "slide_type": "-" 476 | } 477 | }, 478 | "outputs": [], 479 | "source": [ 480 | "x = 'abc-/*\u00ea\u0115'\n", 481 | "re.findall(r'\\w', x)" 482 | ] 483 | }, 484 | { 485 | "cell_type": "markdown", 486 | "metadata": { 487 | "ein.tags": "worksheet-0", 488 | "slideshow": { 489 | "slide_type": "-" 490 | } 491 | }, 492 | "source": [ 493 | "### 模式标识符\n", 494 | "\n", 495 | "|模式 | 匹配 |\n", 496 | "|----------------|--------------|\n", 497 | "|`abc` | 文本值abc|\n", 498 | "|`(expr)` | expr|\n", 499 | "|`expr1\\|expr2` | 
`expr1` or `expr2`|\n", 500 | "|`.` | any character except `\\n`|\n", 501 | "|`^` | the start of the source string|\n", 502 | "|`$` | the end of the source string|\n", 503 | "|`prev?` | zero or one `prev`|\n", 504 | "|`prev*` | zero or more `prev`, as many as possible|\n", 505 | "|`prev*?` | zero or more `prev`, as few as possible|\n", 506 | "|`prev+` | one or more `prev`, as many as possible|\n", 507 | "|`prev+?` | one or more `prev`, as few as possible|\n", 508 | "|`prev{m}` | exactly m consecutive `prev`|\n", 509 | "|`prev{m,n}` | m to n consecutive `prev`, as many as possible|\n", 510 | "|`prev{m,n}?` | m to n consecutive `prev`, as few as possible|\n", 511 | "|`[abc]` | a or b or c; equivalent to `a\\|b\\|c`|\n", 512 | "|`[^abc]` | not (a or b or c)|\n", 513 | "|`prev(?=next)` | `prev`, only if it is followed by `next`|\n", 514 | "|`prev(?!next)` | `prev`, only if it is not followed by `next`|\n", 515 | "|`(?<=prev)next` | `next`, only if it is preceded by `prev`|\n", 516 | "|`(?<!prev)next` | `next`, only if it is not preceded by `prev`|\n", "\n", "`(?P<name>expr)` matches `expr` and stores the match in a group named `name`. A named group still keeps its numeric index." 618 | ] 619 | }, 620 | { 621 | "cell_type": "code", 622 | "execution_count": null, 623 | "metadata": { 624 | "autoscroll": false, 625 | "collapsed": false, 626 | "ein.hycell": false, 627 | "ein.tags": "worksheet-0", 628 | "slideshow": { 629 | "slide_type": "-" 630 | } 631 | }, 632 | "outputs": [], 633 | "source": [ 634 | "m = re.search(r'(?P<year>\\d{4})-(?P<month>\\d{2})-(?P<day>\\d{2})', '2017-04-07')\n", 635 | "print(m.group())\n", 636 | "print(m.group(0))\n", 637 | "print(m.groups())\n", 638 | "print(m.group(1))\n", 639 | "print(m.group('year'))\n", 640 | "print(m.group(2))\n", 641 | "print(m.group('month'))\n", 642 | "print(m.group(3))\n", 643 | "print(m.group('day'))" 644 | ] 645 | }, 646 | { 647 | "cell_type": "markdown", 648 | "metadata": { 649 | "ein.tags": "worksheet-0", 650 | "slideshow": { 651 | "slide_type": "-" 652 | } 653 | }, 654 | "source": [ 655 | "Note: be careful to get the group structure right!" 
656 | ] 657 | }, 658 | { 659 | "cell_type": "code", 660 | "execution_count": null, 661 | "metadata": { 662 | "autoscroll": false, 663 | "collapsed": false, 664 | "ein.hycell": false, 665 | "ein.tags": "worksheet-0", 666 | "slideshow": { 667 | "slide_type": "-" 668 | } 669 | }, 670 | "outputs": [], 671 | "source": [ 672 | "astring = '2017-04-07'\n", 673 | "print(re.search(r'(\\d{4})-(\\d{2})-(\\d{2})', astring).group(1))\n", 674 | "print(re.search(r'(\\d){4}-(\\d{2})-(\\d{2})', astring).group(1))" 675 | ] 676 | }, 677 | { 678 | "cell_type": "markdown", 679 | "metadata": { 680 | "ein.tags": "worksheet-0", 681 | "slideshow": { 682 | "slide_type": "-" 683 | } 684 | }, 685 | "source": [ 686 | "In the second expression, group 1 is `(\\d)`, which matches a single digit. Because it is followed by the quantifier `{4}`, the whole parenthesised group is treated as one element that must repeat 4 times, always as group 1, and each repetition overwrites the previous capture. During the match, group 1 therefore captures 2, 0, 1 and 7 in turn, and the final result is 7." 687 | ] 688 | }, 689 | { 690 | "cell_type": "markdown", 691 | "metadata": { 692 | "ein.tags": "worksheet-0", 693 | "slideshow": { 694 | "slide_type": "-" 695 | } 696 | }, 697 | "source": [ 698 | "## Substitution with regular expressions\n", 699 | "\n", 700 | "Text captured by groups is useful not only for extracting data but also for substitution. In the example above, for instance, a date in **YYYY-MM-DD** format can be turned into **MM/DD/YYYY** with a regular-expression substitution.\n", 701 | "\n", 702 | "The substitution function: `re.sub(pattern, replacement, string)`" 703 | ] 704 | }, 705 | { 706 | "cell_type": "code", 707 | "execution_count": null, 708 | "metadata": { 709 | "autoscroll": false, 710 | "collapsed": false, 711 | "ein.hycell": false, 712 | "ein.tags": "worksheet-0", 713 | "slideshow": { 714 | "slide_type": "-" 715 | } 716 | }, 717 | "outputs": [], 718 | "source": [ 719 | "print(re.sub(r'[a-z]', ' ', 'a3b2c1d4'))\n", 720 | "print(re.sub(r'[0-9]', ' ', 'a3b2c1d4'))" 721 | ] 722 | }, 723 | { 724 | "cell_type": "markdown", 725 | "metadata": { 726 | "ein.tags": "worksheet-0", 727 | "slideshow": { 728 | "slide_type": "-" 729 | } 730 | }, 731 | "source": [ 732 | "Groups can also be referenced inside `replacement`, in the form `\\num`, where `num` is the number of the group. `replacement` is an ordinary string, so it should also be written as a raw string." 733 | ] 734 | }, 735 | { 736 | "cell_type": "code", 737 | 
"execution_count": null, 738 | "metadata": { 739 | "autoscroll": false, 740 | "collapsed": false, 741 | "ein.hycell": false, 742 | "ein.tags": "worksheet-0", 743 | "slideshow": { 744 | "slide_type": "-" 745 | } 746 | }, 747 | "outputs": [], 748 | "source": [ 749 | "print(re.sub(r'(\\d{4})-(\\d{2})-(\\d{2})', r'\\2/\\3/\\1', '2017-04-07'))\n", 750 | "print(re.sub(r'(\\d{4})-(\\d{2})-(\\d{2})', r'\\1年\\2月\\3日', '2017-04-07'))" 751 | ] 752 | }, 753 | { 754 | "cell_type": "markdown", 755 | "metadata": { 756 | "ein.tags": "worksheet-0", 757 | "slideshow": { 758 | "slide_type": "-" 759 | } 760 | }, 761 | "source": [ 762 | "To refer to the text matched by the whole expression inside `replacement`, wrap the whole expression in a pair of parentheses and refer to it with `\\1`." 763 | ] 764 | }, 765 | { 766 | "cell_type": "code", 767 | "execution_count": null, 768 | "metadata": { 769 | "autoscroll": false, 770 | "collapsed": false, 771 | "ein.hycell": false, 772 | "ein.tags": "worksheet-0", 773 | "slideshow": { 774 | "slide_type": "-" 775 | } 776 | }, 777 | "outputs": [], 778 | "source": [ 779 | "re.sub(r'((\\d{4})-(\\d{2})-(\\d{2}))', r'[\\1]', '2017-04-07')" 780 | ] 781 | }, 782 | { 783 | "cell_type": "markdown", 784 | "metadata": { 785 | "ein.tags": "worksheet-0", 786 | "slideshow": { 787 | "slide_type": "-" 788 | } 789 | }, 790 | "source": [ 791 | "### Back-references\n", 792 | "\n", 793 | "How can we check whether a word contains a doubled letter (for example, shoot or beep)?\n", 794 | "\n", 795 | "Would `[a-z][a-z]` work?\n", 796 | "\n", 797 | "No: which character counts as doubled depends on what the first `[a-z]` matched at run time and cannot be fixed in advance, so the pattern has to know exactly what was matched earlier.\n", 798 | "\n", 799 | "A back-reference lets a regular expression refer to the text matched by an earlier capturing group (one to its left). It is also written `\\num`, where `num` is the number of the referenced group; groups are numbered by the same rule described earlier." 800 | ] 801 | }, 802 | { 803 | "cell_type": "code", 804 | "execution_count": null, 805 | "metadata": { 806 | "autoscroll": false, 807 | "collapsed": false, 808 | "ein.hycell": false, 809 | "ein.tags": "worksheet-0", 810 | "slideshow": { 811 | "slide_type": "-" 812 | } 813 | }, 814 | "outputs": [], 815 | "source": [ 816 | "re.search(r'^([a-z])\\1$', 'aa') != None # => True\n", 817 | "re.search(r'^([a-z])\\1$', 'ac') != 
None # => False" 818 | ] 819 | }, 820 | { 821 | "cell_type": "code", 822 | "execution_count": null, 823 | "metadata": { 824 | "autoscroll": false, 825 | "collapsed": false, 826 | "ein.hycell": false, 827 | "ein.tags": "worksheet-0", 828 | "slideshow": { 829 | "slide_type": "-" 830 | } 831 | }, 832 | "outputs": [], 833 | "source": [ 834 | "# use a back-reference to match paired tags (the sample tags below are illustrative)\n", 835 | "paired_tag_regex = r'<([^>]+)>[\\s\\S]*?</\\1>'\n", 836 | "re.search(paired_tag_regex, '<b>text</b>') != None # => True\n", 837 | "re.search(paired_tag_regex, '<h1>text</h2>') != None # => False" 838 | ] 839 | }, 840 | { 841 | "cell_type": "markdown", 842 | "metadata": { 843 | "ein.tags": "worksheet-0", 844 | "slideshow": { 845 | "slide_type": "-" 846 | } 847 | }, 848 | "source": [ 849 | "A back-reference repeats the text matched by the corresponding capturing group, not the sub-expression itself.\n", 850 | "\n", 851 | "#### Ambiguous back-references" 852 | ] 853 | }, 854 | { 855 | "cell_type": "code", 856 | "execution_count": null, 857 | "metadata": { 858 | "autoscroll": false, 859 | "collapsed": false, 860 | "ein.hycell": false, 861 | "ein.tags": "worksheet-0", 862 | "slideshow": { 863 | "slide_type": "-" 864 | } 865 | }, 866 | "outputs": [], 867 | "source": [ 868 | "re.sub(r'(\\d)', r'\\10', '123')\n", 869 | "# error: invalid group reference 10 at position 1" 870 | ] 871 | }, 872 | { 873 | "cell_type": "markdown", 874 | "metadata": { 875 | "ein.tags": "worksheet-0", 876 | "slideshow": { 877 | "slide_type": "-" 878 | } 879 | }, 880 | "source": [ 881 | "Python provides the `\\g<num>` notation: `\\10` can be written as `\\g<1>0`, which removes the ambiguity. The same notation also gets around the fact that `\\0` cannot be used in a replacement: `\\g<0>` refers to the whole match." 882 | ] 883 | }, 884 | { 885 | "cell_type": "code", 886 | "execution_count": null, 887 | "metadata": { 888 | "autoscroll": false, 889 | "collapsed": false, 890 | "ein.hycell": false, 891 | "ein.tags": "worksheet-0", 892 | "slideshow": { 893 | "slide_type": "-" 894 | } 895 | }, 896 | "outputs": [], 897 | "source": [ 898 | "re.sub(r'(\\d)', r'\\g<1>0', '123')" 899 | ] 900 | }, 901 | { 902 | "cell_type": "markdown", 903 | "metadata": { 904 | "ein.tags": "worksheet-0", 905 | "slideshow": { 906 | "slide_type": "-" 907 | } 908 | }, 909 | "source": [ 910 | "### Referring to named groups\n", 911 | "\n", 912 | "When named groups are used, a back-reference inside the expression is written `(?P=name)`, while in a replacement string it is written `\\g<name>`, where `name` is the name of the group." 913 | ] 914 | }, 915 | { 916 | "cell_type": "code", 917 | "execution_count": null, 918 | "metadata": { 919 | "autoscroll": false, 920 | "collapsed": false, 921 | "ein.hycell": false, 922 | "ein.tags": "worksheet-0", 923 | "slideshow": { 924 | "slide_type": "-" 925 | } 926 | }, 927 | "outputs": [], 928 | "source": [ 929 | 
"re.search(r'^(?P<char>[a-z])(?P=char)$', 'aa') != None # => True\n", 930 | "\n", 931 | "re.sub(r'(?P<digit>\\d)', r'\\g<digit>0', '123') # => '102030' (the group name digit is illustrative)" 932 | ] 933 | }, 934 | { 935 | "cell_type": "markdown", 936 | "metadata": { 937 | "ein.tags": "worksheet-0", 938 | "slideshow": { 939 | "slide_type": "-" 940 | } 941 | }, 942 | "source": [ 943 | "## Regular expression tools\n", 944 | "\n", 945 | "* https://regexper.com\n", 946 | "* https://www.debuggex.com" 947 | ] 948 | } 949 | ], 950 | "metadata": { 951 | "kernelspec": { 952 | "display_name": "Python 3", 953 | "name": "python3" 954 | }, 955 | "language_info": { 956 | "codemirror_mode": { 957 | "name": "ipython", 958 | "version": 3 959 | }, 960 | "file_extension": ".py", 961 | "mimetype": "text/x-python", 962 | "name": "python", 963 | "nbconvert_exporter": "python", 964 | "pygments_lexer": "ipython3", 965 | "version": "3.6.1" 966 | }, 967 | "name": "11_regular_expressions.ipynb" 968 | }, 969 | "nbformat": 4, 970 | "nbformat_minor": 2 971 | } 972 | -------------------------------------------------------------------------------- /课件/14_sort.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "ein.tags": "worksheet-0", 7 | "slideshow": { 8 | "slide_type": "-" 9 | } 10 | }, 11 | "source": [ 12 | "# The `list.sort` method and the `sorted` function" 13 | ] 14 | }, 15 | { 16 | "cell_type": "markdown", 17 | "metadata": { 18 | "ein.tags": "worksheet-0", 19 | "slideshow": { 20 | "slide_type": "-" 21 | } 22 | }, 23 | "source": [ 24 | "The `list.sort` method sorts a list **in place**, so its return value is `None`.\n", 25 | "\n", 26 | "**If a function or method changes an object in place, it should return `None` so that the caller knows the argument was modified and no new object was created.**\n", 27 | "\n", 28 | "The built-in function `sorted`, by contrast, builds a new list and returns it. It accepts any iterable as its argument, including immutable sequences and generators, and it always returns a list regardless of what it was given.\n", 29 | "\n", 30 | "Both `list.sort` and `sorted` take two optional keyword arguments:\n", 31 | "\n", 32 | "- **`reverse`:** if set to `True`, the items are returned in descending order (as if every comparison were reversed). The default is `False`.\n", 33 | "- **`key`:** 
a one-argument function that is applied to every item in the sequence; the values it returns are the keys the sort compares. By default the identity function is used, i.e. the items themselves are compared." 34 | ] 35 | }, 36 | { 37 | "cell_type": "code", 38 | "execution_count": null, 39 | "metadata": { 40 | "autoscroll": false, 41 | "collapsed": false, 42 | "ein.hycell": false, 43 | "ein.tags": "worksheet-0", 44 | "slideshow": { 45 | "slide_type": "-" 46 | } 47 | }, 48 | "outputs": [], 49 | "source": [ 50 | "fruits = ['apple', 'banana', 'pear', 'raspberry', 'strawberry']\n", 51 | "print(sorted(fruits))\n", 52 | "print(sorted(fruits, reverse=True))\n", 53 | "print(sorted(fruits, key=len))\n", 54 | "print(sorted(fruits, key=len, reverse=True))\n", 55 | "print(fruits)\n", 56 | "fruits.sort()\n", 57 | "print(fruits)" 58 | ] 59 | }, 60 | { 61 | "cell_type": "markdown", 62 | "metadata": { 63 | "ein.tags": "worksheet-0", 64 | "slideshow": { 65 | "slide_type": "-" 66 | } 67 | }, 68 | "source": [ 69 | "## Sorting a list of dictionaries by a key\n", 70 | "\n", 71 | "Given a list of dictionaries like the one below, sort it by one or more of the dictionary fields:\n" 72 | ] 73 | }, 74 | { 75 | "cell_type": "code", 76 | "execution_count": null, 77 | "metadata": { 78 | "autoscroll": false, 79 | "collapsed": false, 80 | "ein.hycell": false, 81 | "ein.tags": "worksheet-0", 82 | "slideshow": { 83 | "slide_type": "-" 84 | } 85 | }, 86 | "outputs": [], 87 | "source": [ 88 | "rows = [\n", 89 | "    {'first_name': 'Brian', 'last_name': 'Jones', 'uid': 1003},\n", 90 | "    {'first_name': 'David', 'last_name': 'Beazley', 'uid': 1002},\n", 91 | "    {'first_name': 'John', 'last_name': 'Cleese', 'uid': 1001},\n", 92 | "    {'first_name': 'Big', 'last_name': 'Jones', 'uid': 1004}\n", 93 | "]" 94 | ] 95 | }, 96 | { 97 | "cell_type": "markdown", 98 | "metadata": { 99 | "ein.tags": "worksheet-0", 100 | "slideshow": { 101 | "slide_type": "-" 102 | } 103 | }, 104 | "source": [ 105 | "You can use the `itemgetter` function from the `operator` module.\n", 106 | "```\n", 107 | "itemgetter(item, ...) 
--> itemgetter object\n", 108 | "\n", 109 | "Return a callable object that fetches the given item(s) from its operand.\n", 110 | "After f = itemgetter(2), the call f(r) returns r[2].\n", 111 | "After g = itemgetter(2, 5, 3), the call g(r) returns (r[2], r[5], r[3])\n", 112 | "```" 113 | ] 114 | }, 115 | { 116 | "cell_type": "code", 117 | "execution_count": null, 118 | "metadata": { 119 | "autoscroll": false, 120 | "collapsed": false, 121 | "ein.hycell": false, 122 | "ein.tags": "worksheet-0", 123 | "slideshow": { 124 | "slide_type": "-" 125 | } 126 | }, 127 | "outputs": [], 128 | "source": [ 129 | "from operator import itemgetter\n", 130 | "\n", 131 | "rows_by_fname = sorted(rows, key=itemgetter('first_name'))\n", 132 | "rows_by_uid = sorted(rows, key=itemgetter('uid'))\n", 133 | "rows_by_lfname = sorted(rows, key=itemgetter('last_name', 'first_name'))\n", 134 | "print(rows_by_fname)\n", 135 | "print(rows_by_uid)\n", 136 | "print(rows_by_lfname)" 137 | ] 138 | }, 139 | { 140 | "cell_type": "markdown", 141 | "metadata": { 142 | "ein.tags": "worksheet-0", 143 | "slideshow": { 144 | "slide_type": "-" 145 | } 146 | }, 147 | "source": [ 148 | "The `key` argument of `sorted` is a callable that takes a single item from the list and returns the value used for sorting.\n", 149 | "`itemgetter()` creates exactly such a callable.\n", 150 | "\n", 151 | "`operator.itemgetter()` takes the index used to look up a value in each record of the list being sorted.\n", 152 | "It can be a dictionary key, an integer, or any value that can be passed to an object's `__getitem__()` method.\n", 153 | "If you pass several indices to `itemgetter()`, the callable it creates returns a tuple\n", 154 | "of all the corresponding values, and `sorted()` orders the records by comparing those tuples element by element.\n", 155 | "This is useful when you want to sort on several fields at once (by last name and then first name,\n", 156 | "as in the example).\n", 157 | "\n", 158 | "A lambda expression can be used instead of `itemgetter()`, for example:" 159 | ] 160 | }, 161 | { 162 | "cell_type": "code", 163 | "execution_count": null, 164 | "metadata": { 165 | "autoscroll": false, 166 | "collapsed": false, 167 | "ein.hycell": false, 168 | "ein.tags": "worksheet-0", 169 | "slideshow": { 170 | "slide_type": "-" 171 | } 172 | }, 173 | "outputs": [], 174 | "source": [ 175 | "rows_by_fname = sorted(rows, 
key=lambda r: r['first_name'])\n", 176 | "rows_by_lfname = sorted(rows, key=lambda r: (r['last_name'], r['first_name']))" 177 | ] 178 | }, 179 | { 180 | "cell_type": "markdown", 181 | "metadata": { 182 | "ein.tags": "worksheet-0", 183 | "slideshow": { 184 | "slide_type": "-" 185 | } 186 | }, 187 | "source": [ 188 | "# Managing sorted sequences with the `bisect` module" 189 | ] 190 | }, 191 | { 192 | "cell_type": "markdown", 193 | "metadata": { 194 | "ein.tags": "worksheet-0", 195 | "slideshow": { 196 | "slide_type": "-" 197 | } 198 | }, 199 | "source": [ 200 | "A sorted sequence can be searched quickly; the standard library's `bisect` module provides a binary search algorithm.\n", 201 | "\n", 202 | "The `bisect` module offers two main functions, `bisect` and `insort`,\n", 203 | "both of which use binary search to find or insert items in a sorted sequence." 204 | ] 205 | }, 206 | { 207 | "cell_type": "markdown", 208 | "metadata": { 209 | "ein.tags": "worksheet-0", 210 | "slideshow": { 211 | "slide_type": "-" 212 | } 213 | }, 214 | "source": [ 215 | "## Searching with `bisect`\n", 216 | "\n", 217 | "`bisect(haystack, needle)` searches `haystack` for the position of `needle` such that inserting `needle` at that position keeps `haystack` in ascending order; in other words, every item before the returned position is less than or equal to `needle`. `haystack` must be a sorted sequence.\n", 218 | "\n", 219 | "You can call `bisect(haystack, needle)` to find the position `index` and then `haystack.insert(index, needle)` to insert the new value, or use `insort` to do both in one step, which is a little faster.\n", 220 | "\n", 221 | "```python\n", 222 | "# bisect_demo.py\n", 223 | "\n", 224 | "import bisect\n", 225 | "import sys\n", 226 | "\n", 227 | "HAYSTACK = [1, 4, 5, 6, 8, 12, 15, 20, 21, 23, 23, 26, 29, 30]\n", 228 | "NEEDLES = [0, 1, 2, 5, 8, 10, 22, 23, 29, 30, 31]\n", 229 | "\n", 230 | "ROW_FMT = '{0:2d} @ {1:2d} {2}{0:<2d}'\n", 231 | "\n", 232 | "\n", 233 | "def demo(bisect_fn):\n", 234 | "    for needle in reversed(NEEDLES):\n", 235 | "        position = bisect_fn(HAYSTACK, needle)\n", 236 | "        offset = position * ' |'\n", 237 | "        print(ROW_FMT.format(needle, position, offset))\n", 238 | "\n", 239 | "\n", 240 | "if __name__ == '__main__':\n", 241 | "    if sys.argv[-1] == 'left':\n", 242 | "        bisect_fn = bisect.bisect_left\n", 243 | "    else:\n", 244 | "        bisect_fn = bisect.bisect\n", 245 | "\n", 246 | "    print('DEMO:', 
bisect_fn.__name__)\n", 247 | "    print('haystack ->', ' '.join('%2d' % n for n in HAYSTACK))\n", 248 | "    demo(bisect_fn)\n", 249 | "```\n", 250 | "\n", 251 | "```\n", 252 | "$ python3 bisect_demo.py\n", 253 | "DEMO: bisect\n", 254 | "haystack -> 1 4 5 6 8 12 15 20 21 23 23 26 29 30\n", 255 | "31 @ 14 | | | | | | | | | | | | | |31\n", 256 | "30 @ 14 | | | | | | | | | | | | | |30\n", 257 | "29 @ 13 | | | | | | | | | | | | |29\n", 258 | "23 @ 11 | | | | | | | | | | |23\n", 259 | "22 @ 9 | | | | | | | | |22\n", 260 | "10 @ 5 | | | | |10\n", 261 | " 8 @ 5 | | | | |8\n", 262 | " 5 @ 3 | | |5\n", 263 | " 2 @ 1 |2\n", 264 | " 1 @ 1 |1\n", 265 | " 0 @ 0 0\n", 266 | "```\n", 267 | " \n", 268 | "```\n", 269 | "$ python3 bisect_demo.py left\n", 270 | "DEMO: bisect_left\n", 271 | "haystack -> 1 4 5 6 8 12 15 20 21 23 23 26 29 30\n", 272 | "31 @ 14 | | | | | | | | | | | | | |31\n", 273 | "30 @ 13 | | | | | | | | | | | | |30\n", 274 | "29 @ 12 | | | | | | | | | | | |29\n", 275 | "23 @ 9 | | | | | | | | |23\n", 276 | "22 @ 9 | | | | | | | | |22\n", 277 | "10 @ 5 | | | | |10\n", 278 | " 8 @ 4 | | | |8\n", 279 | " 5 @ 2 | |5\n", 280 | " 2 @ 1 |2\n", 281 | " 1 @ 0 1\n", 282 | " 0 @ 0 0\n", 283 | "```\n", 284 | "\n", 285 | "The behaviour of `bisect` can be tuned in two ways.\n", 286 | "\n", 287 | "1. Its two optional arguments, `lo` and `hi`, narrow the range that is searched. `lo` defaults to 0 and `hi` defaults to the length of the sequence.\n", 288 | "2. 
`bisect` is actually an alias for `bisect_right`; its counterpart is `bisect_left`. The two differ only when `needle` is already present: `bisect_right` returns the position after the equal items, `bisect_left` the position before them.\n", 289 | "\n", 290 | "`bisect` can be used to build a lookup table indexed by numbers, for example to map test scores to letter grades.\n" 291 | ] 292 | }, 293 | { 294 | "cell_type": "code", 295 | "execution_count": null, 296 | "metadata": { 297 | "autoscroll": false, 298 | "collapsed": false, 299 | "ein.hycell": false, 300 | "ein.tags": "worksheet-0", 301 | "slideshow": { 302 | "slide_type": "-" 303 | } 304 | }, 305 | "outputs": [], 306 | "source": [ 307 | "import bisect\n", 308 | "\n", 309 | "def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'):\n", 310 | "    i = bisect.bisect(breakpoints, score)\n", 311 | "    return grades[i]\n", 312 | "\n", 313 | "print([grade(score) for score in [33, 99, 77, 70, 89, 90, 100]])" 314 | ] 315 | }, 316 | { 317 | "cell_type": "markdown", 318 | "metadata": { 319 | "ein.tags": "worksheet-0", 320 | "slideshow": { 321 | "slide_type": "-" 322 | } 323 | }, 324 | "source": [ 325 | "## Inserting new items with `bisect.insort`\n", 326 | "\n", 327 | "`insort(seq, item)` inserts `item` into the sequence `seq`\n", 328 | "while keeping `seq` in ascending order.\n" 329 | ] 330 | }, 331 | { 332 | "cell_type": "code", 333 | "execution_count": null, 334 | "metadata": { 335 | "autoscroll": false, 336 | "collapsed": false, 337 | "ein.hycell": false, 338 | "ein.tags": "worksheet-0", 339 | "slideshow": { 340 | "slide_type": "-" 341 | } 342 | }, 343 | "outputs": [], 344 | "source": [ 345 | "import bisect\n", 346 | "import random\n", 347 | "\n", 348 | "SIZE = 7\n", 349 | "\n", 350 | "random.seed(1730)\n", 351 | "\n", 352 | "my_list = []\n", 353 | "for i in range(SIZE):\n", 354 | "    new_item = random.randrange(SIZE*2)\n", 355 | "    bisect.insort(my_list, new_item)\n", 356 | "    print('%2d ->' % new_item, my_list)" 357 | ] 358 | }, 359 | { 360 | "cell_type": "markdown", 361 | "metadata": { 362 | "ein.tags": "worksheet-0", 363 | "slideshow": { 364 | "slide_type": "-" 365 | } 366 | }, 367 | "source": [ 368 | "Like `bisect`, `insort` accepts the optional `lo` and `hi` arguments to limit the range searched. It also has a variant named `insort_left`, which uses `bisect_left` under the hood.\n" 369 | ] 370 | } 371 | ], 372 | "metadata": { 373 | 
"kernelspec": { 374 | "display_name": "Python 3", 375 | "name": "python3" 376 | }, 377 | "name": "14_sort.ipynb" 378 | }, 379 | "nbformat": 4, 380 | "nbformat_minor": 2 381 | } 382 | -------------------------------------------------------------------------------- /课件/15_else-and-copy.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "ein.tags": "worksheet-0", 7 | "slideshow": { 8 | "slide_type": "-" 9 | } 10 | }, 11 | "source": [ 12 | "# `else` blocks beyond the `if` statement\n", 13 | "\n", 14 | "The `else` clause can be used not only with `if`\n", 15 | "but also with `for`, `while` and `try` statements.\n", 16 | "How the `else` clause behaves:\n", 17 | "\n", 18 | "- **`for`:** the `else` block runs only when the `for` loop runs to completion\n", 19 | "  (that is, the loop was not terminated by a `break` statement)." 20 | ] 21 | }, 22 | { 23 | "cell_type": "code", 24 | "execution_count": null, 25 | "metadata": { 26 | "autoscroll": false, 27 | "collapsed": false, 28 | "ein.hycell": false, 29 | "ein.tags": "worksheet-0", 30 | "slideshow": { 31 | "slide_type": "-" 32 | } 33 | }, 34 | "outputs": [], 35 | "source": [ 36 | "from random import randrange\n", 37 | "\n", 38 | "\n", 39 | "def insertion_sort(seq):\n", 40 | "    if len(seq) <= 1:\n", 41 | "        return seq\n", 42 | "\n", 43 | "    _sorted = seq[:1]\n", 44 | "    for i in seq[1:]:\n", 45 | "        inserted = False\n", 46 | "        for j in range(len(_sorted)):\n", 47 | "            if i < _sorted[j]:\n", 48 | "                _sorted = [*_sorted[:j], i, *_sorted[j:]]\n", 49 | "                inserted = True\n", 50 | "                break\n", 51 | "        if not inserted:\n", 52 | "            _sorted.append(i)\n", 53 | "    return _sorted\n", 54 | "\n", 55 | "\n", 56 | "print(insertion_sort([randrange(1, 100) for i in range(10)]))" 57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "execution_count": null, 62 | "metadata": { 63 | "autoscroll": false, 64 | "collapsed": false, 65 | "ein.hycell": false, 66 | "ein.tags": "worksheet-0", 67 | "slideshow": { 68 | "slide_type": "-" 69 | } 70 | }, 71 | "outputs": [], 72 | "source": [ 73 | "from random import randrange\n", 74 | "\n", 
75 | "\n", 76 | "def insertion_sort(seq):\n", 77 | "    if len(seq) <= 1:\n", 78 | "        return seq\n", 79 | "\n", 80 | "    _sorted = seq[:1]\n", 81 | "    for i in seq[1:]:\n", 82 | "        for j in range(len(_sorted)):\n", 83 | "            if i < _sorted[j]:\n", 84 | "                _sorted = [*_sorted[:j], i, *_sorted[j:]]\n", 85 | "                break\n", 86 | "        else:\n", 87 | "            _sorted.append(i)\n", 88 | "    return _sorted\n", 89 | "\n", 90 | "\n", 91 | "print(insertion_sort([randrange(1, 100) for i in range(10)]))" 92 | ] 93 | }, 94 | { 95 | "cell_type": "markdown", 96 | "metadata": { 97 | "ein.tags": "worksheet-0", 98 | "slideshow": { 99 | "slide_type": "-" 100 | } 101 | }, 102 | "source": [ 103 | "- **`while`:** the `else` block runs only when the `while` loop exits because its condition became **falsy**\n", 104 | "  (that is, the loop was not terminated by a `break` statement)." 105 | ] 106 | }, 107 | { 108 | "cell_type": "code", 109 | "execution_count": null, 110 | "metadata": { 111 | "autoscroll": false, 112 | "collapsed": false, 113 | "ein.hycell": false, 114 | "ein.tags": "worksheet-0", 115 | "slideshow": { 116 | "slide_type": "-" 117 | } 118 | }, 119 | "outputs": [], 120 | "source": [ 121 | "while False:\n", 122 | "    print('Will never print!')\n", 123 | "else:\n", 124 | "    print('Loop failed!')" 125 | ] 126 | }, 127 | { 128 | "cell_type": "markdown", 129 | "metadata": { 130 | "ein.tags": "worksheet-0", 131 | "slideshow": { 132 | "slide_type": "-" 133 | } 134 | }, 135 | "source": [ 136 | "- **`try`:** the `else` block runs only when no exception is raised in the `try` block." 137 | ] 138 | }, 139 | { 140 | "cell_type": "code", 141 | "execution_count": null, 142 | "metadata": { 143 | "autoscroll": false, 144 | "collapsed": false, 145 | "ein.hycell": false, 146 | "ein.tags": "worksheet-0", 147 | "slideshow": { 148 | "slide_type": "-" 149 | } 150 | }, 151 | "outputs": [], 152 | "source": [ 153 | "def divide(x, y):\n", 154 | "    try:\n", 155 | "        result = x / y\n", 156 | "    except ZeroDivisionError:\n", 157 | "        print(\"division by 0!\")\n", 158 | "    else:\n", 159 | "        print(\"result = {}\".format(result))\n", 160 | "    finally:\n", 161 | "        print(\"divide 
finished!\")\n", 162 | "\n", 163 | "\n", 164 | "divide(2, 1)\n", 165 | "print('-' * 16)\n", 166 | "divide(2, 0)" 167 | ] 168 | }, 169 | { 170 | "cell_type": "markdown", 171 | "metadata": { 172 | "ein.tags": "worksheet-0", 173 | "slideshow": { 174 | "slide_type": "-" 175 | } 176 | }, 177 | "source": [ 178 | "In all cases, if an exception or a `return`, `break` or\n", 179 | "`continue` statement transfers control out of the main block of the compound statement,\n", 180 | "the `else` clause is skipped as well." 181 | ] 182 | }, 183 | { 184 | "cell_type": "markdown", 185 | "metadata": { 186 | "ein.tags": "worksheet-0", 187 | "slideshow": { 188 | "slide_type": "-" 189 | } 190 | }, 191 | "source": [ 192 | "# Shallow and deep copies\n", 193 | "\n", 194 | "The easiest way to copy a list (or most built-in mutable collections) is to use the type's own constructor." 195 | ] 196 | }, 197 | { 198 | "cell_type": "code", 199 | "execution_count": null, 200 | "metadata": { 201 | "autoscroll": false, 202 | "collapsed": false, 203 | "ein.hycell": false, 204 | "ein.tags": "worksheet-0", 205 | "slideshow": { 206 | "slide_type": "-" 207 | } 208 | }, 209 | "outputs": [], 210 | "source": [ 211 | "l1 = [3, [55, 44], (7, 8, 9)]\n", 212 | "l2 = list(l1) # l2 = l1[:]\n", 213 | "print(l2)\n", 214 | "print(l2 == l1)\n", 215 | "print(l2 is l1)" 216 | ] 217 | }, 218 | { 219 | "cell_type": "markdown", 220 | "metadata": { 221 | "ein.tags": "worksheet-0", 222 | "slideshow": { 223 | "slide_type": "-" 224 | } 225 | }, 226 | "source": [ 227 | "The constructor and `[:]` both make a **shallow copy**\n", 228 | "(the outermost container is duplicated, but the copy is filled with references to the same items held by the original container)." 229 | ] 230 | }, 231 | { 232 | "cell_type": "code", 233 | "execution_count": null, 234 | "metadata": { 235 | "autoscroll": false, 236 | "collapsed": false, 237 | "ein.hycell": false, 238 | "ein.tags": "worksheet-0", 239 | "slideshow": { 240 | "slide_type": "-" 241 | } 242 | }, 243 | "outputs": [], 244 | "source": [ 245 | "l1 = [3, [55, 44], (7, 8, 9)]\n", 246 | "l2 = list(l1)\n", 247 | "l1.append(100)\n", 248 | "l1[1].remove(55)\n", 249 | "print('l1:', l1)\n", 250 | "print('l2:', l2)\n", 251 | "l2[1] += [33, 22]\n", 252 | "l2[2] += (10, 11)\n", 253 | "print('l1:', l1)\n", 254 | "print('l2:', l2)" 
255 | ] 256 | }, 257 | { 258 | "cell_type": "markdown", 259 | "metadata": { 260 | "ein.tags": "worksheet-0", 261 | "slideshow": { 262 | "slide_type": "-" 263 | } 264 | }, 265 | "source": [ 266 | "## Deep and shallow copies of arbitrary objects\n", 267 | "\n", 268 | "Sometimes we need a deep copy (a copy that shares no references to inner objects with the original).\n", 269 | "The `copy` module provides the `deepcopy` and `copy` functions, which make deep and shallow copies of arbitrary objects.\n" 270 | ] 271 | }, 272 | { 273 | "cell_type": "code", 274 | "execution_count": null, 275 | "metadata": { 276 | "autoscroll": false, 277 | "collapsed": false, 278 | "ein.hycell": false, 279 | "ein.tags": "worksheet-0", 280 | "slideshow": { 281 | "slide_type": "-" 282 | } 283 | }, 284 | "outputs": [], 285 | "source": [ 286 | "class Bus:\n", 287 | "\n", 288 | "    def __init__(self, passengers=None):\n", 289 | "        if passengers is None:\n", 290 | "            self.passengers = []\n", 291 | "        else:\n", 292 | "            self.passengers = list(passengers)\n", 293 | "\n", 294 | "    def pick(self, name):\n", 295 | "        self.passengers.append(name)\n", 296 | "\n", 297 | "    def drop(self, name):\n", 298 | "        self.passengers.remove(name)" 299 | ] 300 | }, 301 | { 302 | "cell_type": "code", 303 | "execution_count": null, 304 | "metadata": { 305 | "autoscroll": false, 306 | "collapsed": false, 307 | "ein.hycell": false, 308 | "ein.tags": "worksheet-0", 309 | "slideshow": { 310 | "slide_type": "-" 311 | } 312 | }, 313 | "outputs": [], 314 | "source": [ 315 | "import copy\n", 316 | "\n", 317 | "bus1 = Bus(['Alice', 'Bill', 'Claire', 'David'])\n", 318 | "bus2 = copy.copy(bus1)\n", 319 | "bus3 = copy.deepcopy(bus1)\n", 320 | "print(id(bus1), id(bus2), id(bus3))\n", 321 | "bus1.drop('Bill')\n", 322 | "print(bus2.passengers)\n", 323 | "print(id(bus1.passengers), id(bus2.passengers), id(bus3.passengers))\n", 324 | "print(bus3.passengers)" 325 | ] 326 | } 327 | ], 328 | "metadata": { 329 | "kernelspec": { 330 | "display_name": "Python 3", 331 | "name": "python3" 332 | }, 333 | "name": "15-else-and-copy.ipynb" 334 | }, 335 | "nbformat": 4, 336 | "nbformat_minor": 2 337 | } 338 | 
-------------------------------------------------------------------------------- /课件/1_a_taste_of_python.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/edu2act/course-PySCE/335a5ccd782d57a6641fb5e7861413f645cc93c9/课件/1_a_taste_of_python.pdf -------------------------------------------------------------------------------- /课件/2_python_ingredients.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "collapsed": false, 7 | "ein.tags": "worksheet-0", 8 | "slideshow": { 9 | "slide_type": "-" 10 | } 11 | }, 12 | "source": [ 13 | "# Python变量和数据类型" 14 | ] 15 | }, 16 | { 17 | "cell_type": "markdown", 18 | "metadata": { 19 | "collapsed": false, 20 | "ein.tags": "worksheet-0", 21 | "slideshow": { 22 | "slide_type": "-" 23 | } 24 | }, 25 | "source": [ 26 | "Python使用对象(object)模型存储数据,因此所构造的任何类型的值都以对象形式存在。事实上,当我们在解释器中直接键入一个数字或者字符串并按下回车后,就创建了一个“对象”。\n", 27 | "Python内置的4种最基本的数据类型,包括:\n", 28 | "\n", 29 | "* **布尔型**(用来表示真假,仅包含`True`和`False`两种取值)\n", 30 | "\n", 31 | "* **整型** (整数,例如`42`、`100000000`)\n", 32 | "\n", 33 | "* **浮点型**(小数,例如`3.14159`、`1.0e8`、`100000000.0`)\n", 34 | "\n", 35 | "* **字符串型**(字符序列)" 36 | ] 37 | }, 38 | { 39 | "cell_type": "markdown", 40 | "metadata": { 41 | "collapsed": false, 42 | "ein.tags": "worksheet-0", 43 | "slideshow": { 44 | "slide_type": "-" 45 | } 46 | }, 47 | "source": [ 48 | "## 对象和变量\n", 49 | "**Python中的一切都是对象。** 所有的对象都具有三个特征:\n", 50 | "\n", 51 | "* **身份**:每个对象唯一身份标识,可以使用内建函数 `id()` 查看,\n", 52 | " 如 `id(43)`,`id(obj_name)`,所得到的值可以认为是该对象的内存地址。\n", 53 | "\n", 54 | "* **类型**:决定了该对象可以保持什么类型的值,使用内建函数 `type(obj_name)` 可以查看。\n", 55 | " 注意,该函数返回的是类型对象,而非字符串对象。对象类型决定了可以对它进行怎样的操作,\n", 56 | " 还决定了其包装的值是否允许被修改。\n", 57 | " 对象的类型无法改变,所以Python是强类型的(strongly typed)。\n", 58 | "\n", 59 | "* **值**:对象表示的数据项。\n", 60 | "\n", 61 | "某些对象有属性、值和方法,和C++一样,可以使用 `.` 操作符访问。\n" 62 | ] 63 | }, 64 | 
{ 65 | "cell_type": "markdown", 66 | "metadata": { 67 | "collapsed": false, 68 | "ein.tags": "worksheet-0", 69 | "slideshow": { 70 | "slide_type": "-" 71 | } 72 | }, 73 | "source": [ 74 | "### 对象值的比较\n", 75 | "\n", 76 | "操作符:`<`, `<=`, `>`, `>=`, `==`, `!=`, `<>`\n", 77 | "\n", 78 | "返回值:布尔值 `True`, `False`\n", 79 | "\n", 80 | "注:`<>`在Python3中已经被废弃,但是并未移除,可以通过导入语句\n", 81 | "`from __future__ import barry_as_FLUFL`使用。\n", 82 | "详细内容请参考[PEP 401](https://www.python.org/dev/peps/pep-0401/)。" 83 | ] 84 | }, 85 | { 86 | "cell_type": "code", 87 | "execution_count": null, 88 | "metadata": { 89 | "autoscroll": false, 90 | "collapsed": false, 91 | "ein.hycell": false, 92 | "ein.tags": "worksheet-0", 93 | "slideshow": { 94 | "slide_type": "-" 95 | } 96 | }, 97 | "outputs": [], 98 | "source": [ 99 | "2 == 2\n", 100 | "3.14 <= 10\n", 101 | "'welcome' > 'sssaf' # True, 'w' > 's', ord('w') > ord('s')\n", 102 | "[3, 'abc'] != ['abc', 3]\n", 103 | "3 < 4 < 7 # True, equivalent to 3 < 4 and 4 < 7\n", 104 | "4 > 3 == 3 # True, equivalent to 4 > 3 and 3 == 3" 105 | ] 106 | }, 107 | { 108 | "cell_type": "markdown", 109 | "metadata": { 110 | "collapsed": false, 111 | "ein.tags": "worksheet-0", 112 | "slideshow": { 113 | "slide_type": "-" 114 | } 115 | }, 116 | "source": [ 117 | "从最后两个例子可以看出,Python中的表达式更加的灵活自然。" 118 | ] 119 | }, 120 | { 121 | "cell_type": "markdown", 122 | "metadata": { 123 | "collapsed": false, 124 | "ein.tags": "worksheet-0", 125 | "slideshow": { 126 | "slide_type": "-" 127 | } 128 | }, 129 | "source": [ 130 | "### 对象身份的比较\n", 131 | "\n", 132 | "先谈一下引用计数:" 133 | ] 134 | }, 135 | { 136 | "cell_type": "code", 137 | "execution_count": null, 138 | "metadata": { 139 | "autoscroll": false, 140 | "collapsed": false, 141 | "ein.hycell": false, 142 | "ein.tags": "worksheet-0", 143 | "slideshow": { 144 | "slide_type": "-" 145 | } 146 | }, 147 | "outputs": [], 148 | "source": [ 149 | "foo1 = foo2 = 4.3" 150 | ] 151 | }, 152 | { 153 | "cell_type": "markdown", 154 | "metadata": { 
155 | "collapsed": false, 156 | "ein.tags": "worksheet-0", 157 | "slideshow": { 158 | "slide_type": "-" 159 | } 160 | }, 161 | "source": [ 162 | "这条语句的实质是:一个值为`4.3`的数字对象被创建,`foo1`和`foo2`这两个变量名字共同\n", 163 | "指向了此对象(**变量**仅仅是一个名字,是对对象的**引用**而不是对象本身。),\n", 164 | "即`foo1`和`foo2`是同一个对象的两个引用:" 165 | ] 166 | }, 167 | { 168 | "cell_type": "code", 169 | "execution_count": null, 170 | "metadata": { 171 | "autoscroll": false, 172 | "collapsed": false, 173 | "ein.hycell": false, 174 | "ein.tags": "worksheet-0", 175 | "slideshow": { 176 | "slide_type": "-" 177 | } 178 | }, 179 | "outputs": [], 180 | "source": [ 181 | "foo1 = 4.3\n", 182 | "foo2 = foo1" 183 | ] 184 | }, 185 | { 186 | "cell_type": "markdown", 187 | "metadata": { 188 | "collapsed": false, 189 | "ein.tags": "worksheet-0", 190 | "slideshow": { 191 | "slide_type": "-" 192 | } 193 | }, 194 | "source": [ 195 | "第一句话使得值为`4.3`的数字对象被创建,然后其引用被赋值给`foo1`,\n", 196 | "第二句话使得`foo2`借助`foo1`同样指向了值为`4.3`的对象,这里和上一个例子的实质是相同的。" 197 | ] 198 | }, 199 | { 200 | "cell_type": "code", 201 | "execution_count": null, 202 | "metadata": { 203 | "autoscroll": false, 204 | "collapsed": false, 205 | "ein.hycell": false, 206 | "ein.tags": "worksheet-0", 207 | "slideshow": { 208 | "slide_type": "-" 209 | } 210 | }, 211 | "outputs": [], 212 | "source": [ 213 | "foo2 = 1.3 + 3" 214 | ] 215 | }, 216 | { 217 | "cell_type": "markdown", 218 | "metadata": { 219 | "collapsed": false, 220 | "ein.tags": "worksheet-0", 221 | "slideshow": { 222 | "slide_type": "-" 223 | } 224 | }, 225 | "source": [ 226 | "值为`1.3`的数字对象和值为`3`的数字对象被创建,相加后,得到一个新的值为`4.3`的对象\n", 227 | "(**此对象与上面代码中的值`4.3`对象不是同一个!**),然后`foo2`指向了这个新对象。\n", 228 | "\n", 229 | "现在,foo1和foo2指向了两个值相同,但身份不相同的数字对象。" 230 | ] 231 | }, 232 | { 233 | "cell_type": "code", 234 | "execution_count": null, 235 | "metadata": { 236 | "autoscroll": false, 237 | "collapsed": false, 238 | "ein.hycell": false, 239 | "ein.tags": "worksheet-0", 240 | "slideshow": { 241 | "slide_type": "-" 242 | } 243 | }, 244 | "outputs": 
[], 245 | "source": [ 246 | "foo1 = 4.3\n", 247 | "foo2 = 4.3" 248 | ] 249 | }, 250 | { 251 | "cell_type": "markdown", 252 | "metadata": { 253 | "collapsed": false, 254 | "ein.tags": "worksheet-0", 255 | "slideshow": { 256 | "slide_type": "-" 257 | } 258 | }, 259 | "source": [ 260 | "Likewise, this typically creates two number objects with the same value but different identities, referenced by `foo1` and `foo2` respectively.\n", 261 | "\n", 262 | "When in doubt, you can check with the built-in function `id()`. In CPython, `id` returns the **memory address of the object**, so you can think of it as a **pointer**; comparing ids is a direct way to tell objects apart. For example:" 263 | ] 264 | }, 265 | { 266 | "cell_type": "code", 267 | "execution_count": null, 268 | "metadata": { 269 | "autoscroll": false, 270 | "collapsed": false, 271 | "ein.hycell": false, 272 | "ein.tags": "worksheet-0", 273 | "slideshow": { 274 | "slide_type": "-" 275 | } 276 | }, 277 | "outputs": [], 278 | "source": [ 279 | "foo1 = foo2 = 4.3\n", 280 | "id(foo1) == id(foo2) # returns True" 281 | ] 282 | }, 283 | { 284 | "cell_type": "code", 285 | "execution_count": null, 286 | "metadata": { 287 | "autoscroll": false, 288 | "collapsed": false, 289 | "ein.hycell": false, 290 | "ein.tags": "worksheet-0", 291 | "slideshow": { 292 | "slide_type": "-" 293 | } 294 | }, 295 | "outputs": [], 296 | "source": [ 297 | "bar1 = 4.3\n", 298 | "bar2 = 4.3\n", 299 | "id(bar1) == id(bar2) # usually False in an interactive session (implementation-dependent)" 300 | ] 301 | }, 302 | { 303 | "cell_type": "markdown", 304 | "metadata": { 305 | "collapsed": false, 306 | "ein.tags": "worksheet-0", 307 | "slideshow": { 308 | "slide_type": "-" 309 | } 310 | }, 311 | "source": [ 312 | "In practice `id()` is rarely used; the `is` and `is not` operators are the preferred way to compare identity:" 313 | ] 314 | }, 315 | { 316 | "cell_type": "code", 317 | "execution_count": null, 318 | "metadata": { 319 | "autoscroll": false, 320 | "collapsed": false, 321 | "ein.hycell": false, 322 | "ein.tags": "worksheet-0", 323 | "slideshow": { 324 | "slide_type": "-" 325 | } 326 | }, 327 | "outputs": [], 328 | "source": [ 329 | "foo1 = foo2 = 4.3\n", 330 | "foo1 is foo2 # returns True\n", 331 | "foo1 is not foo2 # returns False" 332 | ] 333 | }, 334 | { 335 | "cell_type": "markdown", 336 | "metadata": { 337 | "collapsed": false, 
338 | "ein.tags": "worksheet-0", 339 | "slideshow": { 340 | "slide_type": "-" 341 | } 342 | }, 343 | "source": [ 344 | "There are special cases, which are often confusing. For example:" 345 | ] 346 | }, 347 | { 348 | "cell_type": "code", 349 | "execution_count": null, 350 | "metadata": { 351 | "autoscroll": false, 352 | "collapsed": false, 353 | "ein.hycell": false, 354 | "ein.tags": "worksheet-0", 355 | "slideshow": { 356 | "slide_type": "-" 357 | } 358 | }, 359 | "outputs": [], 360 | "source": [ 361 | "a = 4\n", 362 | "b = 4\n", 363 | "a is b" 364 | ] 365 | }, 366 | { 367 | "cell_type": "code", 368 | "execution_count": null, 369 | "metadata": { 370 | "autoscroll": false, 371 | "collapsed": false, 372 | "ein.hycell": false, 373 | "ein.tags": "worksheet-0", 374 | "slideshow": { 375 | "slide_type": "-" 376 | } 377 | }, 378 | "outputs": [], 379 | "source": [ 380 | "c = 1000\n", 381 | "d = 1000\n", 382 | "c is d" 383 | ] 384 | }, 385 | { 386 | "cell_type": "markdown", 387 | "metadata": { 388 | "collapsed": false, 389 | "ein.tags": "worksheet-0", 390 | "slideshow": { 391 | "slide_type": "-" 392 | } 393 | }, 394 | "source": [ 395 | "The second result is easy to understand: the two `1000`s are different objects. But what about the first?\n", 396 | "__Small integers are used very frequently in programs, so for efficiency CPython caches the integer objects from **-5 to 256** and never creates them twice.__\n", 397 | "That is why `a is b` returns `True` in the example above.\n", 398 | "**Every object has an internal counter that records how many references point to it;\n", 399 | "when the count drops to 0 the object is reclaimed. This reference counting is one of the basic mechanisms of Python's automatic memory management.**\n" 400 | ] 401 | }, 402 | { 403 | "cell_type": "markdown", 404 | "metadata": { 405 | "collapsed": false, 406 | "ein.tags": "worksheet-0", 407 | "slideshow": { 408 | "slide_type": "-" 409 | } 410 | }, 411 | "source": [ 412 | "### Getting an object's type with `type()`\n", 413 | "\n", 414 | "Usage: `type(obj_name)`" 415 | ] 416 | }, 417 | { 418 | "cell_type": "code", 419 | "execution_count": null, 420 | "metadata": { 421 | "autoscroll": false, 422 | "collapsed": false, 423 | "ein.hycell": false, 424 | "ein.tags": "worksheet-0", 425 | "slideshow": { 426 | "slide_type": "-" 427 | } 428 | }, 429 | "outputs": [], 430 | "source": [ 431 | "type(4) # returns int\n", 432 | 
"type(4.0) # 返回float\n", 433 | "type('abc') # 返回str\n", 434 | "\n", 435 | "# type返回的不是字符串,而是类型对象,如\n", 436 | "type('abc').__name__ # 返回 'str' ,__name__是返回对象的属性\n", 437 | "type(type('abc')) # 返回type,Python的内建元类" 438 | ] 439 | }, 440 | { 441 | "cell_type": "markdown", 442 | "metadata": { 443 | "collapsed": false, 444 | "ein.tags": "worksheet-0", 445 | "slideshow": { 446 | "slide_type": "-" 447 | } 448 | }, 449 | "source": [ 450 | "类(class)是对象的定义。" 451 | ] 452 | }, 453 | { 454 | "cell_type": "markdown", 455 | "metadata": { 456 | "collapsed": false, 457 | "ein.tags": "worksheet-0", 458 | "slideshow": { 459 | "slide_type": "-" 460 | } 461 | }, 462 | "source": [ 463 | "### 使用`isinstance()`检查一个对象是否是某类型的实例" 464 | ] 465 | }, 466 | { 467 | "cell_type": "code", 468 | "execution_count": null, 469 | "metadata": { 470 | "autoscroll": false, 471 | "collapsed": false, 472 | "ein.hycell": false, 473 | "ein.tags": "worksheet-0", 474 | "slideshow": { 475 | "slide_type": "-" 476 | } 477 | }, 478 | "outputs": [], 479 | "source": [ 480 | "def display_num_type(num):\n", 481 | " print(num, 'is', end=' ')\n", 482 | " if isinstance(num, (int, float, complex)):\n", 483 | " print('a number of type: ', type(num).__name__)\n", 484 | " else:\n", 485 | " print('not a number at all!')\n", 486 | "\n", 487 | "display_num_type(-69)\n", 488 | "display_num_type(98.6)\n", 489 | "display_num_type(234+2j)\n", 490 | "display_num_type('xxx')" 491 | ] 492 | }, 493 | { 494 | "cell_type": "markdown", 495 | "metadata": { 496 | "collapsed": false, 497 | "ein.tags": "worksheet-0", 498 | "slideshow": { 499 | "slide_type": "-" 500 | } 501 | }, 502 | "source": [ 503 | "### 变量赋值" 504 | ] 505 | }, 506 | { 507 | "cell_type": "code", 508 | "execution_count": null, 509 | "metadata": { 510 | "autoscroll": false, 511 | "collapsed": false, 512 | "ein.hycell": false, 513 | "ein.tags": "worksheet-0", 514 | "slideshow": { 515 | "slide_type": "-" 516 | } 517 | }, 518 | "outputs": [], 519 | "source": [ 520 | "# 普通赋值方式\n", 521 | 
"int_example = 211\n", 522 | "string_example = 'So easy!'\n", 523 | "\n", 524 | "# 增量赋值,与C语言中的算数自反赋值运算一样\n", 525 | "int_example *= 3\n", 526 | "string_example += 'Try it!'\n", 527 | "print(int_example)\n", 528 | "print(string_example)\n", 529 | "\n", 530 | "# 多重赋值\n", 531 | "x = y = z = 1\n", 532 | "print(x, y, z)\n", 533 | "# 多元赋值,很有用,用起来效率很高,括号是可选的,但保留可以增强代码可读性\n", 534 | "(x, y, z) = (1, 2, 'a string')\n", 535 | "print(x, y, z)\n", 536 | "(x, y) = (y, x) # 对x, y的值做交换,不需要第三个辅助变量了\n", 537 | "# 不要过多考虑顺序和优先级,注重功能逻辑" 538 | ] 539 | }, 540 | { 541 | "cell_type": "markdown", 542 | "metadata": { 543 | "collapsed": false, 544 | "ein.tags": "worksheet-0", 545 | "slideshow": { 546 | "slide_type": "-" 547 | } 548 | }, 549 | "source": [ 550 | "### 变量名(标志符)\n", 551 | "\n", 552 | "1. **规则**\n", 553 | "\n", 554 | " 和C语言相似,没有长度限制,具体如下:\n", 555 | "\n", 556 | " * **只能包含以下字符:**\n", 557 | "\n", 558 | " * 小写字母(a-z)\n", 559 | " * 大写字母(A-Z)\n", 560 | " * 数字(0-9)\n", 561 | " * 下滑线(_)\n", 562 | "\n", 563 | " * **不允许以数字开头。**\n", 564 | "\n", 565 | " * **Python中以下划线开头的名字有特殊的含义。**\n", 566 | "\n", 567 | "2. **关键字**\n", 568 | "\n", 569 | " ```\n", 570 | " False await else import pass\n", 571 | " None break except in raise\n", 572 | " True class finally is return\n", 573 | " and continue for lambda try\n", 574 | " as def from nonlocal while\n", 575 | " assert del global not with\n", 576 | " async elif if or yield\n", 577 | " ```\n", 578 | "\n", 579 | "3. 
**builtins**\n", 580 | "\n", 581 | "   When the interpreter starts, the names in the `builtins` module are available automatically. The module holds a set of predefined names,\n", 582 | "   such as `open` and `input`. They are not reserved, but in general your own identifiers should not shadow them.\n" 583 | ] 584 | }, 585 | { 586 | "cell_type": "code", 587 | "execution_count": null, 588 | "metadata": { 589 | "autoscroll": false, 590 | "collapsed": false, 591 | "ein.hycell": false, 592 | "ein.tags": "worksheet-0", 593 | "slideshow": { 594 | "slide_type": "-" 595 | } 596 | }, 597 | "outputs": [], 598 | "source": [ 599 | "import builtins\n", 600 | "dir(builtins) # list all built-in names in the module" 601 | ] 602 | }, 603 | { 604 | "cell_type": "markdown", 605 | "metadata": { 606 | "collapsed": false, 607 | "ein.tags": "worksheet-0", 608 | "slideshow": { 609 | "slide_type": "-" 610 | } 611 | }, 612 | "source": [ 613 | "Name identifiers in a consistent, established style rather than arbitrarily, and stay away from keywords, built-in names and names with special meaning." 614 | ] 615 | }, 616 | { 617 | "cell_type": "markdown", 618 | "metadata": { 619 | "collapsed": false, 620 | "ein.tags": "worksheet-0", 621 | "slideshow": { 622 | "slide_type": "-" 623 | } 624 | }, 625 | "source": [ 626 | "## Numbers\n", 627 | "\n", 628 | "Numbers are accessed directly and are immutable, indivisible atomic types.\n", 629 | "Immutable means that changing the value of a number actually creates a new object.\n", 630 | "Python natively supports integers and floating-point numbers (and complex numbers), and its integer type can hold integers of arbitrary size\n", 631 | "(limited only by the available memory), which makes Python well suited to big-number arithmetic.\n" 632 | ] 633 | }, 634 | { 635 | "cell_type": "markdown", 636 | "metadata": { 637 | "collapsed": false, 638 | "ein.tags": "worksheet-0", 639 | "slideshow": { 640 | "slide_type": "-" 641 | } 642 | }, 643 | "source": [ 644 | "### Creating and assigning number objects" 645 | ] 646 | }, 647 | { 648 | "cell_type": "code", 649 | "execution_count": null, 650 | "metadata": { 651 | "autoscroll": false, 652 | "collapsed": false, 653 | "ein.hycell": false, 654 | "ein.tags": "worksheet-0", 655 | "slideshow": { 656 | "slide_type": "-" 657 | } 658 | }, 659 | "outputs": [], 660 | "source": [ 661 | "# As in most scripting languages, there is no need to declare a type\n", 662 | "an_int = 1\n", 663 | "a_float = 3.1415\n", 664 | "a_complex = 1.2 + 3.3j" 665 | ] 666 | }, 667 | { 668 | "cell_type": "markdown", 669 | "metadata": { 670 | "collapsed": false, 671 | "ein.tags": 
"worksheet-0", 672 | "slideshow": { 673 | "slide_type": "-" 674 | } 675 | }, 676 | "source": [ 677 | "### 布尔型\n", 678 | "布尔型只有两个值,`True`和`False`。事实上,布尔型是整型的子类,对应整型的1和0。\n", 679 | "使用内建函数`bool`返回布尔对象。" 680 | ] 681 | }, 682 | { 683 | "cell_type": "code", 684 | "execution_count": null, 685 | "metadata": { 686 | "autoscroll": false, 687 | "collapsed": false, 688 | "ein.hycell": false, 689 | "ein.tags": "worksheet-0", 690 | "slideshow": { 691 | "slide_type": "-" 692 | } 693 | }, 694 | "outputs": [], 695 | "source": [ 696 | "bool() # 返回False\n", 697 | "bool(1) # 返回True\n", 698 | "bool(0) # 返回False\n", 699 | "bool(True) # 返回True\n", 700 | "bool(False) # 返回False\n", 701 | "True + True # 返回2,因bool值实质是整型" 702 | ] 703 | }, 704 | { 705 | "cell_type": "markdown", 706 | "metadata": { 707 | "collapsed": false, 708 | "ein.tags": "worksheet-0", 709 | "slideshow": { 710 | "slide_type": "-" 711 | } 712 | }, 713 | "source": [ 714 | "### 布尔运算\n", 715 | "\n", 716 | "布尔运算符有三个:`and`, `or`, `not`。善于使用括号以避免优先级和结合性导致的问题。\n", 717 | "\n", 718 | "优先级由高到低依次为: `not`, `and`, `or`。" 719 | ] 720 | }, 721 | { 722 | "cell_type": "markdown", 723 | "metadata": { 724 | "collapsed": false, 725 | "ein.tags": "worksheet-0", 726 | "slideshow": { 727 | "slide_type": "-" 728 | } 729 | }, 730 | "source": [ 731 | "### 复数\n", 732 | "语法:`real + imag j`\n", 733 | "\n", 734 | "实数部分和虚数部分都是浮点型,虚数部分结尾必须是`j`或`J`。\n", 735 | "\n", 736 | "复数包含两个浮点属性:`real`(实数部分),`imag`(虚数部分),\n", 737 | "还有一个方法: *conjugate()* ,用以获取其共轭复数。" 738 | ] 739 | }, 740 | { 741 | "cell_type": "code", 742 | "execution_count": null, 743 | "metadata": { 744 | "autoscroll": false, 745 | "collapsed": false, 746 | "ein.hycell": false, 747 | "ein.tags": "worksheet-0", 748 | "slideshow": { 749 | "slide_type": "-" 750 | } 751 | }, 752 | "outputs": [], 753 | "source": [ 754 | "a_complex = 3.5 + 2.9j\n", 755 | "a_complex # 返回(3.5+2.9j)\n", 756 | "a_complex.real # 返回3.5\n", 757 | "a_complex.imag # 返回2.9\n", 758 | "a_complex.conjugate() # 返回(3.5-2.9j)" 759 | ] 760 
| }, 761 | { 762 | "cell_type": "markdown", 763 | "metadata": { 764 | "collapsed": false, 765 | "ein.tags": "worksheet-0", 766 | "slideshow": { 767 | "slide_type": "-" 768 | } 769 | }, 770 | "source": [ 771 | "### Updating a number object (that is, rebinding the name; what really happens is that a new object is created)" 772 | ] 773 | }, 774 | { 775 | "cell_type": "code", 776 | "execution_count": null, 777 | "metadata": { 778 | "autoscroll": false, 779 | "collapsed": false, 780 | "ein.hycell": false, 781 | "ein.tags": "worksheet-0", 782 | "slideshow": { 783 | "slide_type": "-" 784 | } 785 | }, 786 | "outputs": [], 787 | "source": [ 788 | "an_int += 1\n", 789 | "a_float = 3.1415926" 790 | ] 791 | }, 792 | { 793 | "cell_type": "markdown", 794 | "metadata": { 795 | "collapsed": false, 796 | "ein.tags": "worksheet-0", 797 | "slideshow": { 798 | "slide_type": "-" 799 | } 800 | }, 801 | "source": [ 802 | "### “Deleting” a number object" 803 | ] 804 | }, 805 | { 806 | "cell_type": "code", 807 | "execution_count": null, 808 | "metadata": { 809 | "autoscroll": false, 810 | "collapsed": false, 811 | "ein.hycell": false, 812 | "ein.tags": "worksheet-0", 813 | "slideshow": { 814 | "slide_type": "-" 815 | } 816 | }, 817 | "outputs": [], 818 | "source": [ 819 | "del an_int" 820 | ] 821 | }, 822 | { 823 | "cell_type": "markdown", 824 | "metadata": { 825 | "collapsed": false, 826 | "ein.tags": "worksheet-0", 827 | "slideshow": { 828 | "slide_type": "-" 829 | } 830 | }, 831 | "source": [ 832 | "Note: we have only deleted a reference to the object, not the object itself\n", 833 | "(in effect the object's reference count is decremented by 1). The name `an_int` is now unbound and no longer refers to any object.\n", 834 | "The object itself is destroyed by Python's internal memory management." 835 | ] 836 | }, 837 | { 838 | "cell_type": "markdown", 839 | "metadata": { 840 | "collapsed": false, 841 | "ein.tags": "worksheet-0", 842 | "slideshow": { 843 | "slide_type": "-" 844 | } 845 | }, 846 | "source": [ 847 | "### Arithmetic operations supported by Python\n", 848 | "| Operator | Description | Example | Result |\n", 849 | "|--------|------------|---------|------|\n", 850 | "| + | Addition | 5 + 8 | 13 |\n", 851 | "| - | Subtraction | 90 - 10 | 80 |\n", 852 | "| * | Multiplication | 4 * 7 | 28 |\n", 853 | "| / | Floating-point division | 7 / 2 | 3.5 |\n", 
854 | "| // | Integer (floor) division | 7 // 2 | 3 |\n", 855 | "| % | Modulus (remainder) | 7 % 3 | 1 |\n", 856 | "| ** | Exponentiation | 3 ** 4 | 81 |\n" 857 | ] 858 | }, 859 | { 860 | "cell_type": "markdown", 861 | "metadata": { 862 | "collapsed": false, 863 | "ein.tags": "worksheet-0", 864 | "slideshow": { 865 | "slide_type": "-" 866 | } 867 | }, 868 | "source": [ 869 | "### Combining an operation with assignment" 870 | ] 871 | }, 872 | { 873 | "cell_type": "code", 874 | "execution_count": null, 875 | "metadata": { 876 | "autoscroll": false, 877 | "collapsed": false, 878 | "ein.hycell": false, 879 | "ein.tags": "worksheet-0", 880 | "slideshow": { 881 | "slide_type": "-" 882 | } 883 | }, 884 | "outputs": [], 885 | "source": [ 886 | "a = 95\n", 887 | "a -= 3\n", 888 | "a += 8\n", 889 | "a *= 2\n", 890 | "a /= 3\n", 891 | "a //= 2\n", 892 | "a" 893 | ] 894 | }, 895 | { 896 | "cell_type": "markdown", 897 | "metadata": { 898 | "collapsed": false, 899 | "ein.tags": "worksheet-0", 900 | "slideshow": { 901 | "slide_type": "-" 902 | } 903 | }, 904 | "source": [ 905 | "### Division\n", 906 | "\n", 907 | "* Use `/` for **floating-point** (true) division\n", 908 | "\n", 909 | "  **Even when both operands are integers, `/` produces a float result.**\n", 910 | "\n", 911 | "* Use `//` for **integer** (floor) division" 912 | ] 913 | }, 914 | { 915 | "cell_type": "code", 916 | "execution_count": null, 917 | "metadata": { 918 | "autoscroll": false, 919 | "collapsed": false, 920 | "ein.hycell": false, 921 | "ein.tags": "worksheet-0", 922 | "slideshow": { 923 | "slide_type": "-" 924 | } 925 | }, 926 | "outputs": [], 927 | "source": [ 928 | "1 / 2 # 0.5\n", 929 | "1.0 / 2 # 0.5\n", 930 | "1.0 // 2 # 0.0\n", 931 | "\n", 932 | "9 / 5 # 1.8\n", 933 | "9 // 5 # 1" 934 | ] 935 | }, 936 | { 937 | "cell_type": "markdown", 938 | "metadata": { 939 | "collapsed": false, 940 | "ein.tags": "worksheet-0", 941 | "slideshow": { 942 | "slide_type": "-" 943 | } 944 | }, 945 | "source": [ 946 | "If the divisor is **0**, division raises a `ZeroDivisionError` exception." 947 | ] 948 | }, 949 | { 950 | "cell_type": "code", 951 | "execution_count": null, 952 | "metadata": { 953 | "autoscroll": 
false, 954 | "collapsed": false, 955 | "ein.hycell": false, 956 | "ein.tags": "worksheet-0", 957 | "slideshow": { 958 | "slide_type": "-" 959 | } 960 | }, 961 | "outputs": [], 962 | "source": [ 963 | "5 / 0  # raises ZeroDivisionError" 964 | ] 965 | }, 966 | { 967 | "cell_type": "markdown", 968 | "metadata": { 969 | "collapsed": false, 970 | "ein.tags": "worksheet-0", 971 | "slideshow": { 972 | "slide_type": "-" 973 | } 974 | }, 975 | "source": [ 976 | "### Bases\n", 977 | "\n", 978 | "Besides decimal, Python supports integer literals in three other bases:\n", 979 | "* `0b` or `0B` for binary (base 2)\n", 980 | "* `0o` or `0O` for octal (base 8)\n", 981 | "* `0x` or `0X` for hexadecimal (base 16)" 982 | ] 983 | }, 984 | { 985 | "cell_type": "code", 986 | "execution_count": null, 987 | "metadata": { 988 | "autoscroll": false, 989 | "collapsed": false, 990 | "ein.hycell": false, 991 | "ein.tags": "worksheet-0", 992 | "slideshow": { 993 | "slide_type": "-" 994 | } 995 | }, 996 | "outputs": [], 997 | "source": [ 998 | "0b10  # 2\n", 999 | "0o10  # 8\n", 1000 | "0x10  # 16" 1001 | ] 1002 | }, 1003 | { 1004 | "cell_type": "markdown", 1005 | "metadata": { 1006 | "collapsed": false, 1007 | "ein.tags": "worksheet-0", 1008 | "slideshow": { 1009 | "slide_type": "-" 1010 | } 1011 | }, 1012 | "source": [ 1013 | "### Type conversion\n", 1014 | "\n", 1015 | "When an operation involves two number objects of different types, Python first converts one of them\n", 1016 | "and then performs the operation, much like the implicit conversions in C.\n", 1017 | "Basic rule: **integers are converted to floats, and non-complex numbers are converted to complex.**\n", 1018 | "In short: **the narrower type is converted to the wider one.**\n", 1019 | "\n", 1020 | "A failed explicit conversion, such as `int()` on an invalid string, raises a `ValueError` exception." 1021 | ] 1022 | }, 1023 | { 1024 | "cell_type": "code", 1025 | "execution_count": null, 1026 | "metadata": { 1027 | "autoscroll": false, 1028 | "collapsed": false, 1029 | "ein.hycell": false, 1030 | "ein.tags": "worksheet-0", 1031 | "slideshow": { 1032 | "slide_type": "-" 1033 | } 1034 | }, 1035 | "outputs": [], 1036 | "source": [ 1037 | "int('10a')  # raises ValueError\n", 1038 | "int('98.6')  # also raises ValueError: int() does not parse float strings" 1039 | ] 1040 | }, 1041 | { 1042 | "cell_type": "markdown", 1043 | "metadata": { 1044 | "collapsed": false, 1045 | "ein.tags": "worksheet-0", 1046 | "slideshow": { 1047 | 
"slide_type": "-" 1048 | } 1049 | }, 1050 | "source": [ 1051 | "### 运算优先级\n", 1052 | "参考教材[1]附录F(p.380)\n", 1053 | "\n", 1054 | "**使用括号来保证运算顺序与期望的一致。**" 1055 | ] 1056 | }, 1057 | { 1058 | "cell_type": "markdown", 1059 | "metadata": { 1060 | "collapsed": false, 1061 | "ein.tags": "worksheet-0", 1062 | "slideshow": { 1063 | "slide_type": "-" 1064 | } 1065 | }, 1066 | "source": [ 1067 | "### 数学函数\n", 1068 | "#### 转换工厂函数\n", 1069 | "\n", 1070 | "**注意**:转换是表现,实质是创建新对象。\n", 1071 | "\n", 1072 | "* `int()`\n", 1073 | "* `float()`\n", 1074 | "* `complex()`\n", 1075 | "* `bool()`" 1076 | ] 1077 | }, 1078 | { 1079 | "cell_type": "code", 1080 | "execution_count": null, 1081 | "metadata": { 1082 | "autoscroll": false, 1083 | "collapsed": false, 1084 | "ein.hycell": false, 1085 | "ein.tags": "worksheet-0", 1086 | "slideshow": { 1087 | "slide_type": "-" 1088 | } 1089 | }, 1090 | "outputs": [], 1091 | "source": [ 1092 | "int(4.225) # 返回4,实质是生产了一个int类型对象\n", 1093 | "float(4) # 返回4.0\n", 1094 | "complex(11, 9.0) # 返回(11+9j)\n", 1095 | "bool(0.000001) # 返回True" 1096 | ] 1097 | }, 1098 | { 1099 | "cell_type": "markdown", 1100 | "metadata": { 1101 | "collapsed": false, 1102 | "ein.tags": "worksheet-0", 1103 | "slideshow": { 1104 | "slide_type": "-" 1105 | } 1106 | }, 1107 | "source": [ 1108 | "#### 功能函数\n", 1109 | "\n", 1110 | "* `abs()`\n", 1111 | "\n", 1112 | " 返回绝对值,如果参数是整型,返回整型,如果是浮点型,返回浮点类型,\n", 1113 | " 同样也可用于复数绝对值的计算,即返回实部和虚部平方和的二次方根。\n", 1114 | "\n", 1115 | "* `divmod()`\n", 1116 | "\n", 1117 | " 此函数将除法和求余结合起来,返回一个包含商和余数的元组:\n" 1118 | ] 1119 | }, 1120 | { 1121 | "cell_type": "code", 1122 | "execution_count": null, 1123 | "metadata": { 1124 | "autoscroll": false, 1125 | "collapsed": false, 1126 | "ein.hycell": false, 1127 | "ein.tags": "worksheet-0", 1128 | "slideshow": { 1129 | "slide_type": "-" 1130 | } 1131 | }, 1132 | "outputs": [], 1133 | "source": [ 1134 | "divmod(10, 3) # 返回(3, 1)\n", 1135 | "divmod(2.5, 10) # 返回(0.0, 2.5)" 1136 | ] 1137 | }, 1138 | { 1139 | "cell_type": 
"markdown", 1140 | "metadata": { 1141 | "collapsed": false, 1142 | "ein.tags": "worksheet-0", 1143 | "slideshow": { 1144 | "slide_type": "-" 1145 | } 1146 | }, 1147 | "source": [ 1148 | "* `pow()`\n", 1149 | "\n", 1150 | " 此函数的功能和 `**` 一样,实现幂运算:" 1151 | ] 1152 | }, 1153 | { 1154 | "cell_type": "code", 1155 | "execution_count": null, 1156 | "metadata": { 1157 | "autoscroll": false, 1158 | "collapsed": false, 1159 | "ein.hycell": false, 1160 | "ein.tags": "worksheet-0", 1161 | "slideshow": { 1162 | "slide_type": "-" 1163 | } 1164 | }, 1165 | "outputs": [], 1166 | "source": [ 1167 | "pow(2, 5) # 返回32\n", 1168 | "pow(5, 2) # 返回25" 1169 | ] 1170 | }, 1171 | { 1172 | "cell_type": "markdown", 1173 | "metadata": { 1174 | "collapsed": false, 1175 | "ein.tags": "worksheet-0", 1176 | "slideshow": { 1177 | "slide_type": "-" 1178 | } 1179 | }, 1180 | "source": [ 1181 | "* `round()`\n", 1182 | "\n", 1183 | " `round()` 做真正的四舍五入!可以用第二个参数指定精确到小数点后第几位:\n" 1184 | ] 1185 | }, 1186 | { 1187 | "cell_type": "code", 1188 | "execution_count": null, 1189 | "metadata": { 1190 | "autoscroll": false, 1191 | "collapsed": false, 1192 | "ein.hycell": false, 1193 | "ein.tags": "worksheet-0", 1194 | "slideshow": { 1195 | "slide_type": "-" 1196 | } 1197 | }, 1198 | "outputs": [], 1199 | "source": [ 1200 | "round(4.499) # 返回4\n", 1201 | "round(4.499, 1) # 返回4.5\n", 1202 | "round(4.5) # 返回4" 1203 | ] 1204 | }, 1205 | { 1206 | "cell_type": "markdown", 1207 | "metadata": { 1208 | "collapsed": false, 1209 | "ein.tags": "worksheet-0", 1210 | "slideshow": { 1211 | "slide_type": "-" 1212 | } 1213 | }, 1214 | "source": [ 1215 | "### 高级数学运算\n", 1216 | "\n", 1217 | "参考教材[1]附录C(p.320)。" 1218 | ] 1219 | } 1220 | ], 1221 | "metadata": { 1222 | "kernelspec": { 1223 | "display_name": "Python 3", 1224 | "name": "python3" 1225 | }, 1226 | "language_info": { 1227 | "codemirror_mode": { 1228 | "name": "ipython", 1229 | "version": 3 1230 | }, 1231 | "file_extension": ".py", 1232 | "mimetype": "text/x-python", 1233 | 
"name": "python", 1234 | "nbconvert_exporter": "python", 1235 | "pygments_lexer": "ipython3", 1236 | "version": "3.6.6" 1237 | }, 1238 | "name": "2_python_ingredients.ipynb", 1239 | "toc": { 1240 | "colors": { 1241 | "hover_highlight": "#DAA520", 1242 | "running_highlight": "#FF0000", 1243 | "selected_highlight": "#FFD700" 1244 | }, 1245 | "moveMenuLeft": true, 1246 | "nav_menu": { 1247 | "height": "480px", 1248 | "width": "252px" 1249 | }, 1250 | "navigate_menu": true, 1251 | "number_sections": false, 1252 | "sideBar": true, 1253 | "threshold": 4, 1254 | "toc_cell": false, 1255 | "toc_section_display": "block", 1256 | "toc_window_display": false, 1257 | "widenNotebook": false 1258 | } 1259 | }, 1260 | "nbformat": 4, 1261 | "nbformat_minor": 2 1262 | } 1263 | -------------------------------------------------------------------------------- /课件/3_strings.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "collapsed": false, 7 | "ein.tags": "worksheet-0", 8 | "slideshow": { 9 | "slide_type": "-" 10 | } 11 | }, 12 | "source": [ 13 | "# Python变量和数据类型" 14 | ] 15 | }, 16 | { 17 | "cell_type": "markdown", 18 | "metadata": { 19 | "collapsed": false, 20 | "ein.tags": "worksheet-0", 21 | "slideshow": { 22 | "slide_type": "-" 23 | } 24 | }, 25 | "source": [ 26 | "## 字符串\n", 27 | "\n", 28 | "Python中对字符串的定义:\n", 29 | "\n", 30 | "> Textual data in Python is handled with `str` objects, or **strings**. 
Strings are immutable sequences of Unicode code points.\n", 31 | "\n", 32 | "In other words, a string is an **immutable sequence** of Unicode code points." 33 | ] 34 | }, 35 | { 36 | "cell_type": "code", 37 | "execution_count": null, 38 | "metadata": { 39 | "autoscroll": false, 40 | "collapsed": false, 41 | "ein.hycell": false, 42 | "ein.tags": "worksheet-0", 43 | "slideshow": { 44 | "slide_type": "-" 45 | } 46 | }, 47 | "outputs": [], 48 | "source": [ 49 | "('S' 'T' 'R' 'I' 'N' 'G')" 50 | ] 51 | }, 52 | { 53 | "cell_type": "markdown", 54 | "metadata": { 55 | "collapsed": false, 56 | "ein.tags": "worksheet-0", 57 | "slideshow": { 58 | "slide_type": "-" 59 | } 60 | }, 61 | "source": [ 62 | "For now, think of Unicode as a very large map that records essentially every symbol in the world's writing systems; a code point is the coordinate of a symbol on that map." 63 | ] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "execution_count": null, 68 | "metadata": { 69 | "autoscroll": false, 70 | "collapsed": false, 71 | "ein.hycell": false, 72 | "ein.tags": "worksheet-0", 73 | "slideshow": { 74 | "slide_type": "-" 75 | } 76 | }, 77 | "outputs": [], 78 | "source": [ 79 | "s = '春分'\n", 80 | "print(s)\n", 81 | "print(len(s))\n", 82 | "print(s.encode())" 83 | ] 84 | }, 85 | { 86 | "cell_type": "markdown", 87 | "metadata": { 88 | "collapsed": false, 89 | "ein.tags": "worksheet-0", 90 | "slideshow": { 91 | "slide_type": "-" 92 | } 93 | }, 94 | "source": [ 95 | "The built-in function `len()` returns the length of a string.\n", 96 | "\n", 97 | "**Immutable** means that the string itself cannot be modified in place:" 98 | ] 99 | }, 100 | { 101 | "cell_type": "code", 102 | "execution_count": null, 103 | "metadata": { 104 | "autoscroll": false, 105 | "collapsed": false, 106 | "ein.hycell": false, 107 | "ein.tags": "worksheet-0", 108 | "slideshow": { 109 | "slide_type": "-" 110 | } 111 | }, 112 | "outputs": [], 113 | "source": [ 114 | "s = 'Hello'\n", 115 | "print(s[3])\n", 116 | "s[3] = 'o'  # raises TypeError: 'str' object does not support item assignment" 117 | ] 118 | }, 119 | { 120 | "cell_type": "markdown", 121 | "metadata": { 122 | "collapsed": false, 123 | "ein.tags": "worksheet-0", 124 | "slideshow": { 125 | "slide_type": "-" 126 | } 127 | 
}, 128 | "source": [ 129 | "**Sequence** means that strings support the operations common to all sequence types (such as `list`, `tuple` and `range`):" 130 | ] 131 | }, 132 | { 133 | "cell_type": "code", 134 | "execution_count": null, 135 | "metadata": { 136 | "autoscroll": false, 137 | "collapsed": false, 138 | "ein.hycell": false, 139 | "ein.tags": "worksheet-0", 140 | "slideshow": { 141 | "slide_type": "-" 142 | } 143 | }, 144 | "outputs": [], 145 | "source": [ 146 | "list1 = []\n", 147 | "for i in 'Hello':\n", 148 | "    list1.append(i.upper())\n", 149 | "\n", 150 | "# list2 = [i.upper() for i in 'Hello']\n", 151 | "# list1 == list2" 152 | ] 153 | }, 154 | { 155 | "cell_type": "markdown", 156 | "metadata": { 157 | "collapsed": false, 158 | "ein.tags": "worksheet-0", 159 | "slideshow": { 160 | "slide_type": "-" 161 | } 162 | }, 163 | "source": [ 164 | "A **sequence** is a container type whose members stand in an ordered line. We label the members starting from 0 (0, 1, 2, 3, ...), so one or more members can be accessed by **index**, just like an array in C. Let us first get to know sequences.\n", 165 | "\n", 166 | "### Sequence operators\n", 167 | "\n", 168 | "**Note: the following operators apply to sequence types in general (`range` does not support `+` and `*`).**\n", 169 | "\n", 170 | "#### Membership operators (`in`, `not in`)" 171 | ] 172 | }, 173 | { 174 | "cell_type": "code", 175 | "execution_count": null, 176 | "metadata": { 177 | "autoscroll": false, 178 | "collapsed": false, 179 | "ein.hycell": false, 180 | "ein.tags": "worksheet-0", 181 | "slideshow": { 182 | "slide_type": "-" 183 | } 184 | }, 185 | "outputs": [], 186 | "source": [ 187 | "'x' in 'china' # returns False\n", 188 | "'e' not in 'pity' # returns True\n", 189 | "12 in [13, 32, 4, 0] # returns False\n", 190 | "'green' not in ['red', 'yellow', 'white'] # returns True" 191 | ] 192 | }, 193 | { 194 | "cell_type": "markdown", 195 | "metadata": { 196 | "collapsed": false, 197 | "ein.tags": "worksheet-0", 198 | "slideshow": { 199 | "slide_type": "-" 200 | } 201 | }, 202 | "source": [ 203 | "#### Concatenation operator (`+`)\n", 204 | "\n", 205 | "Note: it can only join sequences of the same type." 206 | ] 207 | }, 208 | { 209 | "cell_type": "code", 210 | "execution_count": null, 211 | "metadata": { 212 | "autoscroll": false, 213 | "collapsed": false, 214 | "ein.hycell": false, 
215 | "ein.tags": "worksheet-0", 216 | "slideshow": { 217 | "slide_type": "-" 218 | } 219 | }, 220 | "outputs": [], 221 | "source": [ 222 | "str1 = 'aaa'\n", 223 | "str2 = 'bbb'\n", 224 | "str2 = str1 + str2\n", 225 | "str2 # returns 'aaabbb'; str2 now refers to a newly created object\n", 226 | "     # because strings are immutable; you can check with id()" 227 | ] 228 | }, 229 | { 230 | "cell_type": "code", 231 | "execution_count": null, 232 | "metadata": { 233 | "autoscroll": false, 234 | "collapsed": false, 235 | "ein.hycell": false, 236 | "ein.tags": "worksheet-0", 237 | "slideshow": { 238 | "slide_type": "-" 239 | } 240 | }, 241 | "outputs": [], 242 | "source": [ 243 | "num_list = [1, 3, 5]\n", 244 | "num_list += [7, 9]\n", 245 | "num_list # returns [1, 3, 5, 7, 9]; num_list still refers to the\n", 246 | "         # original object because a list is a mutable container; you can check with id()" 247 | ] 248 | }, 249 | { 250 | "cell_type": "code", 251 | "execution_count": null, 252 | "metadata": { 253 | "autoscroll": false, 254 | "collapsed": false, 255 | "ein.hycell": false, 256 | "ein.tags": "worksheet-0", 257 | "slideshow": { 258 | "slide_type": "-" 259 | } 260 | }, 261 | "outputs": [], 262 | "source": [ 263 | "(1, 3) + (5, 7) # returns (1, 3, 5, 7); note that tuples are immutable containers" 264 | ] 265 | }, 266 | { 267 | "cell_type": "markdown", 268 | "metadata": { 269 | "collapsed": false, 270 | "ein.tags": "worksheet-0", 271 | "slideshow": { 272 | "slide_type": "-" 273 | } 274 | }, 275 | "source": [ 276 | "#### Repetition operator (`*`)\n", 277 | "\n", 278 | "Usage: `s * n` or `n * s`\n", 279 | "\n", 280 | "`*` repeats a sequence the given number of times, for example:" 281 | ] 282 | }, 283 | { 284 | "cell_type": "code", 285 | "execution_count": null, 286 | "metadata": { 287 | "autoscroll": false, 288 | "collapsed": false, 289 | "ein.hycell": false, 290 | "ein.tags": "worksheet-0", 291 | "slideshow": { 292 | "slide_type": "-" 293 | } 294 | }, 295 | "outputs": [], 296 | "source": [ 297 | "astr = 'hello'\n", 298 | "astr *= 3\n", 299 | "astr # returns 'hellohellohello'" 300 | ] 301 | }, 302 | { 303 | "cell_type": "code", 304 | "execution_count": null, 305 | "metadata": { 306 | "autoscroll": false, 307 | 
"collapsed": false, 308 | "ein.hycell": false, 309 | "ein.tags": "worksheet-0", 310 | "slideshow": { 311 | "slide_type": "-" 312 | } 313 | }, 314 | "outputs": [], 315 | "source": [ 316 | "alpha_list = ['a', 'b', 'c']\n", 317 | "alpha_list *= 2\n", 318 | "alpha_list # 返回['a', 'b', 'c', 'a', 'b', 'c']" 319 | ] 320 | }, 321 | { 322 | "cell_type": "code", 323 | "execution_count": null, 324 | "metadata": { 325 | "autoscroll": false, 326 | "collapsed": false, 327 | "ein.hycell": false, 328 | "ein.tags": "worksheet-0", 329 | "slideshow": { 330 | "slide_type": "-" 331 | } 332 | }, 333 | "outputs": [], 334 | "source": [ 335 | "('ha', 'ya') * 3 # 返回('ha', 'ya', 'ha', 'ya', 'ha', 'ya')" 336 | ] 337 | }, 338 | { 339 | "cell_type": "markdown", 340 | "metadata": { 341 | "collapsed": false, 342 | "ein.tags": "worksheet-0", 343 | "slideshow": { 344 | "slide_type": "-" 345 | } 346 | }, 347 | "source": [ 348 | "**当`n`的值小于0的时候都按照`n = 0`对待(结果将返回一个和`s`类型相同的空序列)。**" 349 | ] 350 | }, 351 | { 352 | "cell_type": "code", 353 | "execution_count": null, 354 | "metadata": { 355 | "autoscroll": false, 356 | "collapsed": false, 357 | "ein.hycell": false, 358 | "ein.tags": "worksheet-0", 359 | "slideshow": { 360 | "slide_type": "-" 361 | } 362 | }, 363 | "outputs": [], 364 | "source": [ 365 | "'a' * -2" 366 | ] 367 | }, 368 | { 369 | "cell_type": "markdown", 370 | "metadata": { 371 | "collapsed": false, 372 | "ein.tags": "worksheet-0", 373 | "slideshow": { 374 | "slide_type": "-" 375 | } 376 | }, 377 | "source": [ 378 | "**另外需要注意的是序列`s`中的元素并没有被复制,它们只是被引用了多次。**这个需要Python初学者特别注意,比如:" 379 | ] 380 | }, 381 | { 382 | "cell_type": "code", 383 | "execution_count": null, 384 | "metadata": { 385 | "autoscroll": false, 386 | "collapsed": false, 387 | "ein.hycell": false, 388 | "ein.tags": "worksheet-0", 389 | "slideshow": { 390 | "slide_type": "-" 391 | } 392 | }, 393 | "outputs": [], 394 | "source": [ 395 | "lists = [[]] * 3\n", 396 | "lists" 397 | ] 398 | }, 399 | { 400 | "cell_type": "code", 401 | 
"execution_count": null, 402 | "metadata": { 403 | "autoscroll": false, 404 | "collapsed": false, 405 | "ein.hycell": false, 406 | "ein.tags": "worksheet-0", 407 | "slideshow": { 408 | "slide_type": "-" 409 | } 410 | }, 411 | "outputs": [], 412 | "source": [ 413 | "lists[0].append(3)\n", 414 | "lists" 415 | ] 416 | }, 417 | { 418 | "cell_type": "markdown", 419 | "metadata": { 420 | "collapsed": false, 421 | "ein.tags": "worksheet-0", 422 | "slideshow": { 423 | "slide_type": "-" 424 | } 425 | }, 426 | "source": [ 427 | "`[[]]`是一个单元素列表,包含一个空的列表,所以`[[]] * 3`中的3个元素都引用这个空的列表。修改其中的任意一个元素都会修改这个空的列表。你可以用下面的方式创建一个不同列表构成的列表:" 428 | ] 429 | }, 430 | { 431 | "cell_type": "code", 432 | "execution_count": null, 433 | "metadata": { 434 | "autoscroll": false, 435 | "collapsed": false, 436 | "ein.hycell": false, 437 | "ein.tags": "worksheet-0", 438 | "slideshow": { 439 | "slide_type": "-" 440 | } 441 | }, 442 | "outputs": [], 443 | "source": [ 444 | "lists = [[] for i in range(3)]\n", 445 | "lists[0].append(3)\n", 446 | "lists[1].append(5)\n", 447 | "lists[2].append(7)\n", 448 | "lists" 449 | ] 450 | }, 451 | { 452 | "cell_type": "markdown", 453 | "metadata": { 454 | "collapsed": false, 455 | "ein.tags": "worksheet-0", 456 | "slideshow": { 457 | "slide_type": "-" 458 | } 459 | }, 460 | "source": [ 461 | "更多的解释可以参考Python官方文档:[How do I create a multidimensional list?](https://docs.python.org/3.6/faq/programming.html#faq-multidimensional-list)\n", 462 | "\n", 463 | "#### 切片操作符(`[]`、`[:]`、`[::]`)\n", 464 | "\n", 465 | "通过切片功能可以访问序列的一个或者多个成员。和C一样,在访问单个成员时你要保证你访问下标的成员是存在的,否则会引发`IndexError`异常(C中叫做数组越界)。\n", 466 | "\n", 467 | "##### 索引——访问单个成员`[]`\n" 468 | ] 469 | }, 470 | { 471 | "cell_type": "code", 472 | "execution_count": null, 473 | "metadata": { 474 | "autoscroll": false, 475 | "collapsed": false, 476 | "ein.hycell": false, 477 | "ein.tags": "worksheet-0", 478 | "slideshow": { 479 | "slide_type": "-" 480 | } 481 | }, 482 | "outputs": [], 483 | "source": [ 484 | "astr = 'Python'\n", 
485 | "astr[0]\n", 486 | "astr[3]" 487 | ] 488 | }, 489 | { 490 | "cell_type": "code", 491 | "execution_count": null, 492 | "metadata": { 493 | "autoscroll": false, 494 | "collapsed": false, 495 | "ein.hycell": false, 496 | "ein.tags": "worksheet-0", 497 | "slideshow": { 498 | "slide_type": "-" 499 | } 500 | }, 501 | "outputs": [], 502 | "source": [ 503 | "astr[6]" 504 | ] 505 | }, 506 | { 507 | "cell_type": "code", 508 | "execution_count": null, 509 | "metadata": { 510 | "autoscroll": false, 511 | "collapsed": false, 512 | "ein.hycell": false, 513 | "ein.tags": "worksheet-0", 514 | "slideshow": { 515 | "slide_type": "-" 516 | } 517 | }, 518 | "outputs": [], 519 | "source": [ 520 | "astr[-1] # 'n'\n", 521 | "astr[-6] # 'P'" 522 | ] 523 | }, 524 | { 525 | "cell_type": "markdown", 526 | "metadata": { 527 | "collapsed": false, 528 | "ein.tags": "worksheet-0", 529 | "slideshow": { 530 | "slide_type": "-" 531 | } 532 | }, 533 | "source": [ 534 | "注意,因为`-0`等于`0`,负数的索引从`-1`开始。\n", 535 | "\n", 536 | "##### 切片——访问连续的多个成员`[starting_index : ending_index]`\n", 537 | "\n", 538 | "切片索引有默认值,默认的起始索引是`0`,默认的终止索引是所要切片的字符串的长度。\n", 539 | "参考[Common Sequence Operations](https://docs.python.org/3/library/stdtypes.html#common-sequence-operations) Notes 4:\n", 540 | "> The slice of s from i to j is defined as the sequence of items with index k such that `i <= k < j`. \n", 541 | "> If i or j is greater than `len(s)`, use `len(s)`. If i is omitted or `None`, use `0`. \n", 542 | "> If j is omitted or `None`, use `len(s)`. \n", 543 | "> If i is greater than or equal to j, the slice is empty." 
544 | ] 545 | }, 546 | { 547 | "cell_type": "code", 548 | "execution_count": null, 549 | "metadata": { 550 | "autoscroll": false, 551 | "collapsed": false, 552 | "ein.hycell": false, 553 | "ein.tags": "worksheet-0", 554 | "slideshow": { 555 | "slide_type": "-" 556 | } 557 | }, 558 | "outputs": [], 559 | "source": [ 560 | "astr[:] # 'Python'\n", 561 | "astr[0:] # 'Python'\n", 562 | "astr[:6] # 'Python'\n", 563 | "astr[:5] # 'Pytho'\n", 564 | "astr[5:] # 'n'\n", 565 | "astr[:-1] # 'Pytho'\n", 566 | "astr[1:-1] # 'ytho'" 567 | ] 568 | }, 569 | { 570 | "cell_type": "markdown", 571 | "metadata": { 572 | "collapsed": false, 573 | "ein.tags": "worksheet-0", 574 | "slideshow": { 575 | "slide_type": "-" 576 | } 577 | }, 578 | "source": [ 579 | "Note that the start index is included and the end index is excluded. As a result, **`s[:i] + s[i:]` always equals `s`.**" 580 | ] 581 | }, 582 | { 583 | "cell_type": "code", 584 | "execution_count": null, 585 | "metadata": { 586 | "autoscroll": false, 587 | "collapsed": false, 588 | "ein.hycell": false, 589 | "ein.tags": "worksheet-0", 590 | "slideshow": { 591 | "slide_type": "-" 592 | } 593 | }, 594 | "outputs": [], 595 | "source": [ 596 | "astr[:2] + astr[2:] # 'Python'\n", 597 | "astr[:4] + astr[4:] # 'Python'" 598 | ] 599 | }, 600 | { 601 | "cell_type": "markdown", 602 | "metadata": { 603 | "collapsed": false, 604 | "ein.tags": "worksheet-0", 605 | "slideshow": { 606 | "slide_type": "-" 607 | } 608 | }, 609 | "source": [ 610 | "One way to remember how slices work is to think of the indices as pointing between characters: the left edge of the first character is position `0`, and the right edge of the last character is the length of the string. For example:\n", 611 | "\n", 612 | "```\n", 613 | " +---+---+---+---+---+---+\n", 614 | " | P | y | t | h | o | n |\n", 615 | " +---+---+---+---+---+---+\n", 616 | " 0   1   2   3   4   5   6\n", 617 | "-6  -5  -4  -3  -2  -1\n", 618 | "```\n", 619 | "\n", 620 | "Also note that out-of-range indices are handled gracefully when slicing several consecutive members: they are clamped to the bounds of the sequence instead of raising an error." 621 | ] 622 | }, 623 | { 624 | "cell_type": "markdown", 625 | "metadata": { 626 | "collapsed": false, 627 | "ein.tags": "worksheet-0", 628 | "slideshow": { 629 | "slide_type": "-" 630 | } 631 | }, 632 | "source": [ 633 | "##### Extended slicing with a step `[starting_index 
: ending_index : step_length]`" 634 | ] 635 | }, 636 | { 637 | "cell_type": "markdown", 638 | "metadata": { 639 | "collapsed": false, 640 | "ein.tags": "worksheet-0", 641 | "slideshow": { 642 | "slide_type": "-" 643 | } 644 | }, 645 | "source": [ 646 | "A positive `step_length` slices from left to right and a negative one from right to left; items are then taken at each index in turn.\n", 647 | "\n", 648 | "* If `step_length` is positive, slicing goes from left to right; if `starting_index >= ending_index`, the result is empty\n", 649 | "* If `step_length` is negative, slicing goes from right to left; if `starting_index <= ending_index`, the result is empty\n", 650 | "* `starting_index` and `ending_index` may be omitted: an omitted start means **the very beginning** and an omitted end means **through the very last item**; omitting both selects everything.\n", 651 | " Which end counts as the beginning depends on the sign of `step_length`.\n", 652 | "\n", 653 | "\n", 654 | "See note 5 of [Common Sequence Operations](https://docs.python.org/3/library/stdtypes.html#common-sequence-operations):\n", 655 | "\n", 656 | "> The slice of s from i to j with step k is defined as the sequence of items with index `x = i + n*k` \n", 657 | "> such that `0 <= n < (j-i)/k`. In other words, the indices are `i`, `i+k`, `i+2*k`, `i+3*k` and so on, \n", 658 | "> stopping when j is reached (but never including j). When k is positive, i and j are reduced to \n", 659 | "> `len(s)` if they are greater. When k is negative, i and j are reduced to `len(s) - 1` if they are greater. \n", 660 | "> If i or j are omitted or `None`, they become “end” values (which end depends on the sign of k). \n", 661 | "> Note, k cannot be zero. If k is `None`, it is treated like `1`." 
662 | ] 663 | }, 664 | { 665 | "cell_type": "code", 666 | "execution_count": null, 667 | "metadata": { 668 | "autoscroll": false, 669 | "collapsed": false, 670 | "ein.hycell": false, 671 | "ein.tags": "worksheet-0", 672 | "slideshow": { 673 | "slide_type": "-" 674 | } 675 | }, 676 | "outputs": [], 677 | "source": [ 678 | "(1, 2, 3, 4, 5, 6)[0:6:2] # returns (1, 3, 5)\n", 679 | "bstr = \"abcdefg\"\n", 680 | "bstr[::-1] # returns 'gfedcba': reverses the string; omitted bounds default to the two ends\n", 681 | "bstr[::] # returns 'abcdefg': every omitted parameter takes its default value" 682 | ] 683 | }, 684 | { 685 | "cell_type": "code", 686 | "execution_count": null, 687 | "metadata": { 688 | "autoscroll": false, 689 | "collapsed": false, 690 | "ein.hycell": false, 691 | "ein.tags": "worksheet-0", 692 | "slideshow": { 693 | "slide_type": "-" 694 | } 695 | }, 696 | "outputs": [], 697 | "source": [ 698 | "cstr = '012345'\n", 699 | "cstr[len(cstr)-1::-1] # '543210'\n", 700 | "cstr[None::-1] # '543210'\n", 701 | "cstr[-1::-1] # '543210'\n", 702 | "cstr[-1:None:-1] # '543210'\n", 703 | "cstr[-1:len(cstr)-1:-1] # ''\n", 704 | "cstr[:0:-1] # '54321'" 705 | ] 706 | }, 707 | { 708 | "cell_type": "markdown", 709 | "metadata": { 710 | "collapsed": false, 711 | "ein.tags": "worksheet-0", 712 | "slideshow": { 713 | "slide_type": "-" 714 | } 715 | }, 716 | "source": [ 717 | "#### Built-in functions for sequences\n", 718 | "\n", 719 | "* `max()` returns the largest item of a sequence\n", 720 | "* `min()` returns the smallest item of a sequence\n", 721 | "* `sum()` returns the sum of the items of a numeric sequence\n", 722 | "* `enumerate(iter)` takes an iterable and returns an enumerate object that yields an `(index, item)` tuple for each member of `iter`\n", 723 | "* `reversed(seq)` returns a reverse iterator over a sequence\n", 724 | "* `sorted()` sorts a sequence and returns a new sorted list; the ordering can be customized\n", 725 | "* `zip()` returns a zip object whose members are tuples" 726 | ] 727 | }, 728 | { 729 | "cell_type": "code", 730 | "execution_count": null, 731 | "metadata": { 732 | "autoscroll": false, 733 | "collapsed": false, 734 | "ein.hycell": false, 735 | "ein.tags": "worksheet-0", 736 | "slideshow": { 737 | "slide_type": "-" 738 | } 739 | }, 740 | "outputs": [], 741 | "source": [ 742 | "list_demo = [1, 43, 4, 54]\n", 743 | 
"max(list_demo) # 54\n", 744 | "min(list_demo) # 1\n", 745 | "sum(list_demo) # 102\n", 746 | "list(reversed(list_demo)) # [54, 4, 43, 1]\n", 747 | "sorted(list_demo) # [1, 4, 43, 54]\n", 748 | "list(zip(list_demo, list_demo[::-1])) # [(1, 54), (43, 4), (4, 43), (54, 1)]" 749 | ] 750 | }, 751 | { 752 | "cell_type": "markdown", 753 | "metadata": { 754 | "collapsed": false, 755 | "ein.tags": "worksheet-0", 756 | "slideshow": { 757 | "slide_type": "-" 758 | } 759 | }, 760 | "source": [ 761 | "上面是对序列简单的介绍,接着我们回到字符串上来。\n", 762 | "\n", 763 | "### 字符串的创建\n", 764 | "\n", 765 | "3种方式创建字符串字面量:\n", 766 | "\n", 767 | "1. 单引号:`'allows embedded \"double\" quotes'`\n", 768 | "2. 双引号:`\"allows embedded 'single' quotes\"`\n", 769 | "3. 三引号:`'''Three single quotes'''` `\"\"\"Three double quotes\"\"\"`\n", 770 | "\n", 771 | "其中,三引号创建的字符串可以跨越多行,其中的空白(例如每行的换行符以及行首或行末的空格)会被包含进所创建的字符串字面量。\n", 772 | "\n", 773 | "Python允许空字符串`''`,它不包含任何字符但完全合法。空字符串是其他任何字符串的子串。\n", 774 | "\n", 775 | "字符串字面量是一个单独的表达式,如果多个字符串字面量中间仅包含空白,则它们将被隐性地转换为一个单一的字符串字面量。所以,`(\"spam\" \"eggs\") == \"spameggs\"`。" 776 | ] 777 | }, 778 | { 779 | "cell_type": "code", 780 | "execution_count": null, 781 | "metadata": { 782 | "autoscroll": false, 783 | "collapsed": false, 784 | "ein.hycell": false, 785 | "ein.tags": "worksheet-0", 786 | "slideshow": { 787 | "slide_type": "-" 788 | } 789 | }, 790 | "outputs": [], 791 | "source": [ 792 | "url_str = ('http://' # protocol\n", 793 | " 'localhost' # hostname\n", 794 | " ':8080' # port\n", 795 | " '/index.html') # file\n", 796 | "url_str # 'http://localhost:8080/index.html'" 797 | ] 798 | }, 799 | { 800 | "cell_type": "markdown", 801 | "metadata": { 802 | "collapsed": false, 803 | "ein.tags": "worksheet-0", 804 | "slideshow": { 805 | "slide_type": "-" 806 | } 807 | }, 808 | "source": [ 809 | "另外,你还可以使用`str`构造器将其它对象转换为字符串。" 810 | ] 811 | }, 812 | { 813 | "cell_type": "code", 814 | "execution_count": null, 815 | "metadata": { 816 | "autoscroll": false, 817 | "collapsed": false, 818 | 
"ein.hycell": false, 819 | "ein.tags": "worksheet-0", 820 | "slideshow": { 821 | "slide_type": "-" 822 | } 823 | }, 824 | "outputs": [], 825 | "source": [ 826 | "str(98.6) # '98.6'\n", 827 | "str(True) # 'True'" 828 | ] 829 | }, 830 | { 831 | "cell_type": "markdown", 832 | "metadata": { 833 | "collapsed": false, 834 | "ein.tags": "worksheet-0", 835 | "slideshow": { 836 | "slide_type": "-" 837 | } 838 | }, 839 | "source": [ 840 | "### 使用`\\`转义" 841 | ] 842 | }, 843 | { 844 | "cell_type": "markdown", 845 | "metadata": { 846 | "collapsed": false, 847 | "ein.tags": "worksheet-0", 848 | "slideshow": { 849 | "slide_type": "-" 850 | } 851 | }, 852 | "source": [ 853 | "常见的转义符:`\\n`(换行符)、`\\t`(Tab制表符)、`\\r`(回车)、`\\'`(单引号)、`\\\"`(双引号)、`\\\\`(反斜线)\n", 854 | "\n", 855 | "### 字符串操作符\n", 856 | "\n", 857 | "参考上面**序列类型操作符**,不再赘述。\n", 858 | "\n", 859 | "### 用于字符串的内建函数\n", 860 | "\n", 861 | "* `input()` 获取用户输入,返回一个字符串" 862 | ] 863 | }, 864 | { 865 | "cell_type": "code", 866 | "execution_count": null, 867 | "metadata": { 868 | "autoscroll": false, 869 | "collapsed": false, 870 | "ein.hycell": false, 871 | "ein.tags": "worksheet-0", 872 | "slideshow": { 873 | "slide_type": "-" 874 | } 875 | }, 876 | "outputs": [], 877 | "source": [ 878 | "name = input(\"What's your name: \")\n", 879 | "print(\"Your name is %s.\" % name)" 880 | ] 881 | }, 882 | { 883 | "cell_type": "code", 884 | "execution_count": null, 885 | "metadata": { 886 | "autoscroll": false, 887 | "collapsed": false, 888 | "ein.hycell": false, 889 | "ein.tags": "worksheet-0", 890 | "slideshow": { 891 | "slide_type": "-" 892 | } 893 | }, 894 | "outputs": [], 895 | "source": [ 896 | "ord(chr(64))" 897 | ] 898 | }, 899 | { 900 | "cell_type": "code", 901 | "execution_count": null, 902 | "metadata": { 903 | "autoscroll": false, 904 | "collapsed": false, 905 | "ein.hycell": false, 906 | "ein.tags": "worksheet-0", 907 | "slideshow": { 908 | "slide_type": "-" 909 | } 910 | }, 911 | "outputs": [], 912 | "source": [ 913 | "dir(str)" 914 
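| ] 915 | }, 916 | { 917 | "cell_type": "markdown", 918 | "metadata": { 919 | "collapsed": false, 920 | "ein.tags": "worksheet-0", 921 | "slideshow": { 922 | "slide_type": "-" 923 | } 924 | }, 925 | "source": [ 926 | "* `chr()` takes an integer and returns the corresponding Unicode character\n", 927 | "* `ord()` is the inverse of `chr()`: it takes a single character and returns its Unicode code point" 928 | ] 929 | }, 930 | { 931 | "cell_type": "markdown", 932 | "metadata": { 933 | "collapsed": false, 934 | "ein.tags": "worksheet-0", 935 | "slideshow": { 936 | "slide_type": "-" 937 | } 938 | }, 939 | "source": [ 940 | "### String methods\n", 941 | "\n", 942 | "Usage: `string_object.method(arguments)`\n", 943 | "\n", 944 | "Strings have many methods; use `help(str)` or `dir(str)` to explore them.\n", 945 | "\n", 946 | "Some commonly used methods:\n", 947 | "\n", 948 | "* `split()` splits a string into a list of substrings around a **separator**; if no separator is given, it splits on runs of whitespace.\n", 949 | "* `join()` is the inverse of `split()`: it joins the substrings of a list into one large string, inserting the **glue string** it is called on between them.\n", 950 | "* `find()` returns the position (offset) of the first occurrence of a substring, or `-1` if it is not found.\n", 951 | "* `index()` is like `find()`, but raises a `ValueError` exception when the search fails.\n", 952 | "* `rfind()` is like `find()`, but returns the position of the last occurrence of the substring.\n", 953 | "* `startswith()` tests whether the string starts with a given prefix.\n", 954 | "* `endswith()` tests whether the string ends with a given suffix.\n", 955 | "* `count()` counts the non-overlapping occurrences of a substring in the string.\n", 956 | "* `is*` methods test whether the characters of the string belong to a given class or follow a given rule.\n", 957 | "* `strip()` returns the string with leading and trailing whitespace removed; if an argument is given, the characters it contains are stripped from both ends instead.\n", 958 | "* `upper()` `lower()` `swapcase()` convert all letters to upper case, convert all letters to lower case, and swap the case of every letter, respectively.\n", 959 | "* `title()` capitalizes the first letter of every word in the string.\n", 960 | "* `capitalize()` capitalizes the first character of the string and lower-cases the rest.\n", 961 | "* `center()` `ljust()` `rjust()` center, left-justify and right-justify the string within a given width, respectively.\n", 962 | "* `replace()` performs simple substring replacement; its arguments are the substring to replace, the replacement substring, and optionally the maximum number of replacements." 963 | ] 964 | }, 965 | { 966 | "cell_type": "code", 967 | "execution_count": null, 968 | "metadata": { 969 | "autoscroll": false, 970 | "collapsed": false, 971 | "ein.hycell": false, 972 | "ein.tags": "worksheet-0", 973 | "slideshow": { 974 | "slide_type": "-" 975 | } 976 | }, 977 | "outputs": [], 978 | "source": [ 979 | "s = \"Hello, world!\"\n", 980 | "s.split() # ['Hello,', 'world!']\n", 981 | "' '.join(s.split()) # 'Hello, world!'\n", 982 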
| "s.find('world') # 7\n", 983 | "#s.index('~') # will cause ValueError\n", 984 | "s.is" 985 | ] 986 | }, 987 | { 988 | "cell_type": "code", 989 | "execution_count": null, 990 | "metadata": { 991 | "autoscroll": false, 992 | "collapsed": false, 993 | "ein.hycell": false, 994 | "ein.tags": "worksheet-0", 995 | "slideshow": { 996 | "slide_type": "-" 997 | } 998 | }, 999 | "outputs": [], 1000 | "source": [ 1001 | "s.rfind('o') # 8\n", 1002 | "s.startswith('m') # False\n", 1003 | "s.endswith('!') # True\n", 1004 | "\n", 1005 | "s.isalnum() # False\n", 1006 | "s.isdigit() # False\n", 1007 | "s.islower() # False\n", 1008 | "s.istitle() # False\n", 1009 | "s.isalpha() # False\n", 1010 | "\n", 1011 | "s.strip('!d') # 'Hello, worl'\n", 1012 | "\n", 1013 | "s.upper() # 'HELLO, WORLD!'\n", 1014 | "s.lower() # 'hello, world!'\n", 1015 | "s.swapcase() # 'hELLO, WORLD!'\n", 1016 | "\n", 1017 | "s.title() # 'Hello, World!'\n", 1018 | "s.capitalize() # 'Hello, world!'\n", 1019 | "\n", 1020 | "s.center(20) # ' Hello, world! '\n", 1021 | "s.ljust(20) # 'Hello, world! 
'\n", 1022 | "s.rjust(20) # ' Hello, world!'\n", 1023 | "\n", 1024 | "s.replace('o', 'O', 1) # 'HellO, world!'" 1025 | ] 1026 | }, 1027 | { 1028 | "cell_type": "markdown", 1029 | "metadata": { 1030 | "collapsed": false 1031 | }, 1032 | "source": [ 1033 | "练习:" 1034 | ] 1035 | } 1036 | ], 1037 | "metadata": { 1038 | "kernelspec": { 1039 | "display_name": "Python 3", 1040 | "language": "python", 1041 | "name": "python3" 1042 | }, 1043 | "language_info": { 1044 | "codemirror_mode": { 1045 | "name": "ipython", 1046 | "version": 3 1047 | }, 1048 | "file_extension": ".py", 1049 | "mimetype": "text/x-python", 1050 | "name": "python", 1051 | "nbconvert_exporter": "python", 1052 | "pygments_lexer": "ipython3", 1053 | "version": "3.6.6" 1054 | }, 1055 | "name": "3_strings.ipynb", 1056 | "toc": { 1057 | "colors": { 1058 | "hover_highlight": "#DAA520", 1059 | "running_highlight": "#FF0000", 1060 | "selected_highlight": "#FFD700" 1061 | }, 1062 | "moveMenuLeft": true, 1063 | "nav_menu": { 1064 | "height": "273.783px", 1065 | "width": "252px" 1066 | }, 1067 | "navigate_menu": true, 1068 | "number_sections": false, 1069 | "sideBar": true, 1070 | "threshold": 4, 1071 | "toc_cell": false, 1072 | "toc_section_display": "block", 1073 | "toc_window_display": false, 1074 | "widenNotebook": false 1075 | } 1076 | }, 1077 | "nbformat": 4, 1078 | "nbformat_minor": 2 1079 | } 1080 | -------------------------------------------------------------------------------- /课件/5_code_structure.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "ein.tags": "worksheet-0", 7 | "slideshow": { 8 | "slide_type": "-" 9 | } 10 | }, 11 | "source": [ 12 | "# Python代码结构\n", 13 | "讲解如何组织代码和数据。\n", 14 | "\n", 15 | "## 语法和句法\n", 16 | "\n", 17 | "### 注释\n", 18 | "\n", 19 | "在Python中使用`#`字符标记注释,从`#`开始到当前行结束的部分都是注释。可以把注释作为单独的一行或者把注释和代码放在同一行。" 20 | ] 21 | }, 22 | { 23 | "cell_type": "code", 24 | 
"execution_count": null, 25 | "metadata": { 26 | "autoscroll": false, 27 | "collapsed": false, 28 | "ein.hycell": false, 29 | "ein.tags": "worksheet-0", 30 | "slideshow": { 31 | "slide_type": "-" 32 | } 33 | }, 34 | "outputs": [], 35 | "source": [ 36 | "print(\"Hello, World!\") # 这是一个单行注释" 37 | ] 38 | }, 39 | { 40 | "cell_type": "markdown", 41 | "metadata": { 42 | "ein.tags": "worksheet-0", 43 | "slideshow": { 44 | "slide_type": "-" 45 | } 46 | }, 47 | "source": [ 48 | "### 把语句分成多行写\n", 49 | "有时候一行代码太长,我们想分开,或者是出于美观、清晰的需要,可以用`\\`来连接多行。" 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "execution_count": null, 55 | "metadata": { 56 | "autoscroll": false, 57 | "collapsed": false, 58 | "ein.hycell": false, 59 | "ein.tags": "worksheet-0", 60 | "slideshow": { 61 | "slide_type": "-" 62 | } 63 | }, 64 | "outputs": [], 65 | "source": [ 66 | "if (1 == 1) and \\\n", 67 | " (2 == 2):\n", 68 | " print('Frivolous!')" 69 | ] 70 | }, 71 | { 72 | "cell_type": "markdown", 73 | "metadata": { 74 | "ein.tags": "worksheet-0", 75 | "slideshow": { 76 | "slide_type": "-" 77 | } 78 | }, 79 | "source": [ 80 | "### 使用`:`将代码块的头和身体分开" 81 | ] 82 | }, 83 | { 84 | "cell_type": "code", 85 | "execution_count": null, 86 | "metadata": { 87 | "autoscroll": false, 88 | "collapsed": false, 89 | "ein.hycell": false, 90 | "ein.tags": "worksheet-0", 91 | "slideshow": { 92 | "slide_type": "-" 93 | } 94 | }, 95 | "outputs": [], 96 | "source": [ 97 | "if (user == 'green'): # 头\n", 98 | " print('hello, green') # 身体" 99 | ] 100 | }, 101 | { 102 | "cell_type": "markdown", 103 | "metadata": { 104 | "ein.tags": "worksheet-0", 105 | "slideshow": { 106 | "slide_type": "-" 107 | } 108 | }, 109 | "source": [ 110 | "### 缩进\n", 111 | "\n", 112 | "前面讲过,Python使用缩进来表示代码的逻辑结构,建议使用4个空格进行缩进(可参考[PEP8 Python编码规范](https://www.python.org/dev/peps/pep-0008/))。\n", 113 | "\n", 114 | "### 同一行写多个语句\n", 115 | "\n", 116 | "同一行写多个语句,使用`;`分隔。" 117 | ] 118 | }, 119 | { 120 | "cell_type": "code", 121 | "execution_count": null, 122 | 
"metadata": { 123 | "autoscroll": false, 124 | "collapsed": false, 125 | "ein.hycell": false, 126 | "ein.tags": "worksheet-0", 127 | "slideshow": { 128 | "slide_type": "-" 129 | } 130 | }, 131 | "outputs": [], 132 | "source": [ 133 | "import sys; x = 'foo'; sys.stdout.write(x + '\\n')\n", 134 | "# 这样可行但并不好,降低了代码的可读性" 135 | ] 136 | }, 137 | { 138 | "cell_type": "markdown", 139 | "metadata": { 140 | "ein.tags": "worksheet-0", 141 | "slideshow": { 142 | "slide_type": "-" 143 | } 144 | }, 145 | "source": [ 146 | "## `if`-`elif`-`else`语句" 147 | ] 148 | }, 149 | { 150 | "cell_type": "code", 151 | "execution_count": null, 152 | "metadata": { 153 | "autoscroll": false, 154 | "collapsed": false, 155 | "ein.hycell": false, 156 | "ein.tags": "worksheet-0", 157 | "slideshow": { 158 | "slide_type": "-" 159 | } 160 | }, 161 | "outputs": [], 162 | "source": [ 163 | "today = input(\"Input: \")\n", 164 | "\n", 165 | "if today == 'Mon':\n", 166 | " print('Today is Monday')\n", 167 | "elif today == 'Tue':\n", 168 | " print('Today is Tuesday')\n", 169 | "elif today == 'Wed':\n", 170 | " print('Today is Wednesday')\n", 171 | "else:\n", 172 | " print('Today is a boring day')" 173 | ] 174 | }, 175 | { 176 | "cell_type": "markdown", 177 | "metadata": { 178 | "ein.tags": "worksheet-0", 179 | "slideshow": { 180 | "slide_type": "-" 181 | } 182 | }, 183 | "source": [ 184 | "`elif` 和C语言中的`else if`是等效的,C语言中的`switch case`结构在Python中没有等效的实现。\n", 185 | "\n", 186 | "C语言中使用`if`和`else`最常见的一个问题就是**else悬挂**,很多时候初学者弄不清除哪个`else`和哪个`if`是一对(C语言中,任何一个`else`与其上方最近的一个`if`是一对)。这个问题在Python中不存在,之前我们说过,Python中的缩进所起到的作用不仅仅上是排版层面上的,更是逻辑层面上的。没有了大括号,使用缩进来控制流程的层次深度,便使得程序员无需考虑悬挂问题,只要**保证同样逻辑层次的语句有相同的缩进量**即可:" 187 | ] 188 | }, 189 | { 190 | "cell_type": "code", 191 | "execution_count": null, 192 | "metadata": { 193 | "autoscroll": false, 194 | "collapsed": false, 195 | "ein.hycell": false, 196 | "ein.tags": "worksheet-0", 197 | "slideshow": { 198 | "slide_type": "-" 199 | } 200 | }, 201 | "outputs": [], 202 | "source": [ 
203 | "if username_is_right:\n", 204 | " if passwd_is_right:\n", 205 | " if cash > 0:\n", 206 | " print(\"you have extra money\")\n", 207 | " else:\n", 208 | " print(\"you are in debt\")\n", 209 | " else:\n", 210 | " print(\"wrong password\")\n", 211 | "else:\n", 212 | " print(\"username is not existed!\")" 213 | ] 214 | }, 215 | { 216 | "cell_type": "markdown", 217 | "metadata": { 218 | "ein.tags": "worksheet-0", 219 | "slideshow": { 220 | "slide_type": "-" 221 | } 222 | }, 223 | "source": [ 224 | "### 条件表达式(三元操作符)\n", 225 | "\n", 226 | "语法:`X if C else Y`" 227 | ] 228 | }, 229 | { 230 | "cell_type": "code", 231 | "execution_count": null, 232 | "metadata": { 233 | "autoscroll": false, 234 | "collapsed": false, 235 | "ein.hycell": false, 236 | "ein.tags": "worksheet-0", 237 | "slideshow": { 238 | "slide_type": "-" 239 | } 240 | }, 241 | "outputs": [], 242 | "source": [ 243 | "x, y = 3, 4\n", 244 | "small = x if x < y else y\n", 245 | "small" 246 | ] 247 | }, 248 | { 249 | "cell_type": "markdown", 250 | "metadata": { 251 | "ein.tags": "worksheet-0", 252 | "slideshow": { 253 | "slide_type": "-" 254 | } 255 | }, 256 | "source": [ 257 | "### 多重比较\n", 258 | "\n", 259 | "如果想同时进行多重比较判断,可使用布尔操作符`and`、`or`或者`not`连接来决定最终表达式的布尔取值。布尔操作符的优先级没有比较表达式的代码段高,即,表达式要先计算然后再比较。" 260 | ] 261 | }, 262 | { 263 | "cell_type": "code", 264 | "execution_count": null, 265 | "metadata": { 266 | "autoscroll": false, 267 | "collapsed": false, 268 | "ein.hycell": false, 269 | "ein.tags": "worksheet-0", 270 | "slideshow": { 271 | "slide_type": "-" 272 | } 273 | }, 274 | "outputs": [], 275 | "source": [ 276 | "x = 7\n", 277 | "x > 5 and x < 10 # True" 278 | ] 279 | }, 280 | { 281 | "cell_type": "markdown", 282 | "metadata": { 283 | "ein.tags": "worksheet-0", 284 | "slideshow": { 285 | "slide_type": "-" 286 | } 287 | }, 288 | "source": [ 289 | "避免混淆的办法是加圆括号。" 290 | ] 291 | }, 292 | { 293 | "cell_type": "code", 294 | "execution_count": null, 295 | "metadata": { 296 | "autoscroll": false, 297 | 
"collapsed": false, 298 | "ein.hycell": false, 299 | "ein.tags": "worksheet-0", 300 | "slideshow": { 301 | "slide_type": "-" 302 | } 303 | }, 304 | "outputs": [], 305 | "source": [ 306 | "(x > 5) or not (x < 10) # True" 307 | ] 308 | }, 309 | { 310 | "cell_type": "markdown", 311 | "metadata": { 312 | "ein.tags": "worksheet-0", 313 | "slideshow": { 314 | "slide_type": "-" 315 | } 316 | }, 317 | "source": [ 318 | "### 什么是真值\n", 319 | "\n", 320 | "下面的情况会被认为是**False**:\n", 321 | "\n", 322 | "| 布尔 | False |\n", 323 | "|----------|---------|\n", 324 | "| null类型 | `None` |\n", 325 | "| 整型 | `0` |\n", 326 | "| 浮点型 | `0.0` |\n", 327 | "| 空字符串 | `''` |\n", 328 | "| 空列表 | `[]` |\n", 329 | "| 空元组 | `()` |\n", 330 | "| 空字典 | `{}` |\n", 331 | "| 空集合 | `set()` |\n", 332 | "\n", 333 | "剩下的都会被认为是`True`。\n", 334 | "\n", 335 | "如果是在判断一个表达式,Python会先计算表达式的值,然后返回布尔型结果。" 336 | ] 337 | }, 338 | { 339 | "cell_type": "markdown", 340 | "metadata": { 341 | "ein.tags": "worksheet-0", 342 | "slideshow": { 343 | "slide_type": "-" 344 | } 345 | }, 346 | "source": [ 347 | "## `while`循环\n", 348 | "语法:\n", 349 | "```python\n", 350 | "while expression:\n", 351 | " loop code\n", 352 | "```\n", 353 | "\n", 354 | "无限循环:\n", 355 | "```python\n", 356 | "white True:\n", 357 | " ...\n", 358 | "```" 359 | ] 360 | }, 361 | { 362 | "cell_type": "markdown", 363 | "metadata": { 364 | "ein.tags": "worksheet-0", 365 | "slideshow": { 366 | "slide_type": "-" 367 | } 368 | }, 369 | "source": [ 370 | "## `for`循环\n", 371 | "\n", 372 | "`for`循环是Python中最强大也最使用广泛的循环类型。\n", 373 | "它可以遍历序列成员,由此可用于列表解析和生成器表达式中。\n", 374 | "\n", 375 | "语法:\n", 376 | "```python\n", 377 | "for item_var in iterable: # iterable是可迭代的对象\n", 378 | " code to process item_var\n", 379 | "```\n", 380 | "\n", 381 | "一些例子:\n" 382 | ] 383 | }, 384 | { 385 | "cell_type": "code", 386 | "execution_count": null, 387 | "metadata": { 388 | "autoscroll": false, 389 | "collapsed": false, 390 | "ein.hycell": false, 391 | "ein.tags": "worksheet-0", 392 | "slideshow": 
{ 393 | "slide_type": "-" 394 | } 395 | }, 396 | "outputs": [], 397 | "source": [ 398 | "nameList = ['Walter', 'Nicole', 'Steven']\n", 399 | "# 使用序列项迭代\n", 400 | "for eachName in nameList:\n", 401 | " print(eachName, 'Lim')\n", 402 | "\n", 403 | "# 使用序列索引迭代\n", 404 | "for nameIndex in range(len(nameList)):\n", 405 | " print(\"Liu,\", nameList[nameIndex])\n", 406 | "\n", 407 | "# 使用项和索引迭代\n", 408 | "for i, eachLee in enumerate(nameList):\n", 409 | " print(\"%d %s Lee\" % (i+1, eachLee))" 410 | ] 411 | }, 412 | { 413 | "cell_type": "markdown", 414 | "metadata": { 415 | "ein.tags": "worksheet-0", 416 | "slideshow": { 417 | "slide_type": "-" 418 | } 419 | }, 420 | "source": [ 421 | "## `break`, `continue`, `pass`\n", 422 | "和C语言中一样,`break`用于终止循环,可以使得流程跳出所在的`while`或者`for`循环。\n", 423 | "\n", 424 | "和C语言中一样,`continue`用于终止当前循环,忽略剩下的语句,\n", 425 | "回到循环开始(即终止当前循环,开始下一次循环)。\n", 426 | "\n", 427 | "顾名思义,pass就是什么都不做,通常用作占位符,在函数原型设计中很有用:\n", 428 | "\n", 429 | "```python\n", 430 | "# main function\n", 431 | "def main():\n", 432 | " pass # 以后有功夫再来完善吧\n", 433 | "```" 434 | ] 435 | }, 436 | { 437 | "cell_type": "markdown", 438 | "metadata": { 439 | "ein.tags": "worksheet-0", 440 | "slideshow": { 441 | "slide_type": "-" 442 | } 443 | }, 444 | "source": [ 445 | "## 迭代器和`iter()`函数\n", 446 | "\n", 447 | "### 什么是迭代器\n", 448 | "\n", 449 | "迭代器是一种特殊的数据结构,在Python中,它也是以对象的形式存在的。\n", 450 | "迭代器提供了一种遍历类序列对象的方法。\n", 451 | "对于一般的序列类型,可以利用索引从0一直迭代到序列的最后一个元素;\n", 452 | "对于字典、文件、自定义对象类型等,可以自定义迭代方式,从而实现对这些对象类型的遍历。\n", 453 | "\n", 454 | "可以这样理解:对于一个集体中的每一个元素,要进行遍历,\n", 455 | "那么针对这个集体的迭代器定义了遍历每一个元素的顺序或者方法。\n", 456 | "例如:0, 1, 2, 3, 4...,或者1, 3, 5, 7, 9...\n", 457 | "\n", 458 | "### 如何迭代\n", 459 | "\n", 460 | "迭代器有一个`__next__()`方法,每次调用这个方法而实现计数(计数不是通过索引实现的),\n", 461 | "在循环中,如果要获得下一个对象,迭代器自己调用`__next__()`方法,这个过程是透明的。\n", 462 | "当遍历结束后(集合中再无未访问的项)会遇到`StopIteration`异常,从而结束循环。\n", 463 | "\n", 464 | "迭代器有一些限制,只能向前迭代,不能后退,即**迭代是单向的**,当然,\n", 465 | "可以独立创建一个反向的迭代器。迭代器不能复制,一旦需要重新迭代某个对象,\n", 466 | 
"必须重新创建一个该对象的迭代器。\n", 467 | "\n", 468 | "`reversed()`返回一个序列对象的逆序迭代器。\n", 469 | "`enumerate()`也可以返回迭代器。\n", 470 | "\n", 471 | "### 使用迭代器" 472 | ] 473 | }, 474 | { 475 | "cell_type": "code", 476 | "execution_count": null, 477 | "metadata": { 478 | "autoscroll": false, 479 | "collapsed": false, 480 | "ein.hycell": false, 481 | "ein.tags": "worksheet-0", 482 | "slideshow": { 483 | "slide_type": "-" 484 | } 485 | }, 486 | "outputs": [], 487 | "source": [ 488 | "mylist = ['green', 'red', 'blue', 'white']\n", 489 | "i = iter(mylist)\n", 490 | "i.__next__()\n", 491 | "i.__next__()\n", 492 | "type(i.__next__())\n", 493 | "\n", 494 | "ri = reversed(mylist) # reversed返回一个逆序迭代器\n", 495 | "ri.__next__()" 496 | ] 497 | }, 498 | { 499 | "cell_type": "code", 500 | "execution_count": null, 501 | "metadata": { 502 | "autoscroll": false, 503 | "collapsed": false, 504 | "ein.hycell": false, 505 | "ein.tags": "worksheet-0", 506 | "slideshow": { 507 | "slide_type": "-" 508 | } 509 | }, 510 | "outputs": [], 511 | "source": [ 512 | "i = iter(mylist)\n", 513 | "while True:\n", 514 | " try: # try-except用来捕获异常\n", 515 | " print(i.__next__())\n", 516 | " except StopIteration:\n", 517 | " break" 518 | ] 519 | }, 520 | { 521 | "cell_type": "markdown", 522 | "metadata": { 523 | "ein.tags": "worksheet-0", 524 | "slideshow": { 525 | "slide_type": "-" 526 | } 527 | }, 528 | "source": [ 529 | "**注意:** **在迭代一个对象的过程中最好不要去修改对象本身**,\n", 530 | "否则会发生难以预料的迭代异常,使得代码具有潜在的缺陷。" 531 | ] 532 | }, 533 | { 534 | "cell_type": "markdown", 535 | "metadata": { 536 | "ein.tags": "worksheet-0", 537 | "slideshow": { 538 | "slide_type": "-" 539 | } 540 | }, 541 | "source": [ 542 | "### `iter()`函数\n", 543 | "\n", 544 | "1. `iter(obj)` 如果`obj`是一个序列类型,那么可以根据其索引从0开始迭代。\n", 545 | "\n", 546 | "2. 
`iter(callable, sentinel)`: each step of the iteration calls `callable` with no arguments, and iteration stops as soon as the returned value equals `sentinel`." 547 | ] 548 | }, 549 | { 550 | "cell_type": "markdown", 551 | "metadata": { 552 | "ein.tags": "worksheet-0", 553 | "slideshow": { 554 | "slide_type": "-" 555 | } 556 | }, 557 | "source": [ 558 | "## Comprehensions\n", 559 | "\n", 560 | "A comprehension is a quick, concise way to build a data structure from one or more iterators.\n", 561 | "Python borrowed the idea from the Haskell programming language; it combines loops with conditional tests and so avoids verbose code.\n", 562 | "\n", 563 | "### List comprehensions\n", 564 | "\n", 565 | "Syntax: `[expr for item_var in iterable]`\n", 566 | "\n", 567 | "The core of this construct is a `for` loop:" 568 | ] 569 | }, 570 | { 571 | "cell_type": "code", 572 | "execution_count": null, 573 | "metadata": { 574 | "autoscroll": false, 575 | "collapsed": false, 576 | "ein.hycell": false, 577 | "ein.tags": "worksheet-0", 578 | "slideshow": { 579 | "slide_type": "-" 580 | } 581 | }, 582 | "outputs": [], 583 | "source": [ 584 | "print([x**2 for x in range(9)])\n", 585 | "print([ss[::-1] for ss in ('green', 'red', 'blue')])" 586 | ] 587 | }, 588 | { 589 | "cell_type": "markdown", 590 | "metadata": { 591 | "ein.tags": "worksheet-0", 592 | "slideshow": { 593 | "slide_type": "-" 594 | } 595 | }, 596 | "source": [ 597 | "The extended syntax adds a filter condition: `[expr for item_var in iterable if cond_expr]`\n", 598 | "\n", 599 | "A conditional expression with `else` can also be used in the output position: `[expr if cond_expr else else_expr for item_var in iterable]`" 600 | ] 601 | }, 602 | { 603 | "cell_type": "code", 604 | "execution_count": null, 605 | "metadata": { 606 | "autoscroll": false, 607 | "collapsed": false, 608 | "ein.hycell": false, 609 | "ein.tags": "worksheet-0", 610 | "slideshow": { 611 | "slide_type": "-" 612 | } 613 | }, 614 | "outputs": [], 615 | "source": [ 616 | "print([x for x in range(100) if x % 17 == 0]) # multiples of 17 below 100\n", 617 | "\n", 618 | "# a comprehension may contain several for clauses\n", 619 | "print([(x, y) for x in range(9) for y in range(9)]) # builds a Cartesian product" 620 | ] 621 | }, 622 | { 623 | "cell_type": "markdown", 624 | "metadata": { 625 | "ein.tags": "worksheet-0", 626 | "slideshow": { 627 | "slide_type": "-" 628 | } 629 | }, 630 | "source": [ 
631 | "### 字典推导\n", 632 | "\n", 633 | "语法:`{key_expr: value_expr for expr in iterable}`\n", 634 | "\n", 635 | "类似于列表推导,字典推导也有`if`条件判断以及多个`for`循环迭代语句。" 636 | ] 637 | }, 638 | { 639 | "cell_type": "code", 640 | "execution_count": null, 641 | "metadata": { 642 | "autoscroll": false, 643 | "collapsed": false, 644 | "ein.hycell": false, 645 | "ein.tags": "worksheet-0", 646 | "slideshow": { 647 | "slide_type": "-" 648 | } 649 | }, 650 | "outputs": [], 651 | "source": [ 652 | "word = 'letters'\n", 653 | "letter_counts = {letter: word.count(letter) for letter in set(word)}\n", 654 | "letter_counts" 655 | ] 656 | }, 657 | { 658 | "cell_type": "markdown", 659 | "metadata": { 660 | "ein.tags": "worksheet-0", 661 | "slideshow": { 662 | "slide_type": "-" 663 | } 664 | }, 665 | "source": [ 666 | "### 集合推导\n", 667 | "\n", 668 | "与上面列表推导类似。\n", 669 | "\n", 670 | "### 生成器表达式\n", 671 | "\n", 672 | "生成器表达式是对列表解析的扩展,列表解析有一个不足,它一次性生成所有数据,\n", 673 | "用以创建列表,这样在数据量很大,内存很有限的时候便会造成问题。\n", 674 | "生成器表达式解决了这个问题,它一次生成一个数据,然后暂停,当下一次使用时,\n", 675 | "生产出下一个数据,工作的过程就像迭代器一样。\n", 676 | "\n", 677 | "语法很简单:`(expr for item_var in iterable if cond_expr)`\n", 678 | "\n", 679 | "也就是将列表的`[]`换成`()`即可:" 680 | ] 681 | }, 682 | { 683 | "cell_type": "code", 684 | "execution_count": null, 685 | "metadata": { 686 | "autoscroll": false, 687 | "collapsed": false, 688 | "ein.hycell": false, 689 | "ein.tags": "worksheet-0", 690 | "slideshow": { 691 | "slide_type": "-" 692 | } 693 | }, 694 | "outputs": [], 695 | "source": [ 696 | "type((x for x in range(10000) if x * x == (x - 3) * (x + 4))) # generator\n", 697 | "num = (x for x in range(10000) if x * x == (x - 3) * (x + 4))\n", 698 | "num.__next__()" 699 | ] 700 | }, 701 | { 702 | "cell_type": "markdown", 703 | "metadata": { 704 | "ein.tags": "worksheet-0", 705 | "slideshow": { 706 | "slide_type": "-" 707 | } 708 | }, 709 | "source": [ 710 | "**注意:**一个生成器只能运行一次,其只在运行中产生值,不会被存下来,\n", 711 | "所以不能重新使用或者备份一个生成器。" 712 | ] 713 | } 714 | ], 715 | "metadata": { 716 | 
"kernelspec": { 717 | "display_name": "Python 3", 718 | "name": "python3" 719 | }, 720 | "language_info": { 721 | "codemirror_mode": { 722 | "name": "ipython", 723 | "version": 3 724 | }, 725 | "file_extension": ".py", 726 | "mimetype": "text/x-python", 727 | "name": "python", 728 | "nbconvert_exporter": "python", 729 | "pygments_lexer": "ipython3", 730 | "version": "3.6.8" 731 | }, 732 | "name": "5_code_structure.ipynb", 733 | "toc": { 734 | "colors": { 735 | "hover_highlight": "#ddd", 736 | "running_highlight": "#FF0000", 737 | "selected_highlight": "#ccc" 738 | }, 739 | "moveMenuLeft": true, 740 | "nav_menu": { 741 | "height": "463.85px", 742 | "width": "251.833px" 743 | }, 744 | "navigate_menu": true, 745 | "number_sections": false, 746 | "sideBar": true, 747 | "threshold": 4, 748 | "toc_cell": false, 749 | "toc_section_display": "block", 750 | "toc_window_display": false, 751 | "widenNotebook": false 752 | } 753 | }, 754 | "nbformat": 4, 755 | "nbformat_minor": 2 756 | } 757 | -------------------------------------------------------------------------------- /课件/6_exceptions.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "deletable": true, 7 | "editable": true, 8 | "ein.tags": "worksheet-0", 9 | "slideshow": { 10 | "slide_type": "-" 11 | } 12 | }, 13 | "source": [ 14 | "# Python代码结构\n", 15 | "\n", 16 | "## 错误和异常\n", 17 | "\n", 18 | "### 什么是异常\n", 19 | "\n", 20 | "#### 错误\n", 21 | "\n", 22 | "错误可以分为两种,语法上的和逻辑上的,当检测到一个错误时,Python解释器会指出当前流已经无法继续执行下去。这个时候就出现了异常(Exception)。\n", 23 | "\n", 24 | "#### 异常\n", 25 | "\n", 26 | "对异常的最好描述是:它是因为程序出现了错误而在正常控制流之外采取的行为,这个行为分为两个阶段。\n", 27 | "\n", 28 | "* 引起异常发生的错误\n", 29 | "* 检测(和采取可能措施)阶段\n", 30 | "\n", 31 | "当程序运行过程中遇到错误时就会抛出一个异常(异常也可以由程序员主动触发)。其可以被异常控制语句捕捉并予以处理。异常如果不能被捕捉和处理,就会以错误的形式呈现出来。\n", 32 | "\n", 33 | 
"在类C语言(包括C语言)的程序设计中,程序员必须尽量考虑到各种错误情况,从而通过代码设计增强代码健壮性,然而总是有些特殊的会引发错误的情况无法考虑到,这就导致了错误发生时,只能眼睁睁的看着程序崩溃掉。异常的引入改变了这一情况,错误引发的异常可以被分类捕捉并在程序主流程以外加以处理,异常处理后,程序重新回到主流程中运行。对于像交换机、路由器这样的需要不间断运行的设备,出现错误并挂掉是不允许的,异常处理便显得极为重要。\n", 34 | "\n", 35 | "\n", 36 | "### Python中的几种常见异常\n", 37 | "\n", 38 | "* `NameError`:尝试访问一个未声明的变量" 39 | ] 40 | }, 41 | { 42 | "cell_type": "code", 43 | "execution_count": null, 44 | "metadata": { 45 | "autoscroll": false, 46 | "collapsed": false, 47 | "deletable": true, 48 | "editable": true, 49 | "ein.tags": "worksheet-0", 50 | "slideshow": { 51 | "slide_type": "-" 52 | } 53 | }, 54 | "outputs": [], 55 | "source": [ 56 | "foo" 57 | ] 58 | }, 59 | { 60 | "cell_type": "markdown", 61 | "metadata": { 62 | "deletable": true, 63 | "editable": true, 64 | "ein.tags": "worksheet-0", 65 | "slideshow": { 66 | "slide_type": "-" 67 | } 68 | }, 69 | "source": [ 70 | "* `ZeroDivisionError`:除数为0" 71 | ] 72 | }, 73 | { 74 | "cell_type": "code", 75 | "execution_count": null, 76 | "metadata": { 77 | "autoscroll": false, 78 | "collapsed": false, 79 | "deletable": true, 80 | "editable": true, 81 | "ein.tags": "worksheet-0", 82 | "slideshow": { 83 | "slide_type": "-" 84 | } 85 | }, 86 | "outputs": [], 87 | "source": [ 88 | "1/0" 89 | ] 90 | }, 91 | { 92 | "cell_type": "markdown", 93 | "metadata": { 94 | "deletable": true, 95 | "editable": true, 96 | "ein.tags": "worksheet-0", 97 | "slideshow": { 98 | "slide_type": "-" 99 | } 100 | }, 101 | "source": [ 102 | "* `SyntaxError`:解释器语法错误" 103 | ] 104 | }, 105 | { 106 | "cell_type": "code", 107 | "execution_count": null, 108 | "metadata": { 109 | "autoscroll": false, 110 | "collapsed": false, 111 | "deletable": true, 112 | "editable": true, 113 | "ein.tags": "worksheet-0", 114 | "slideshow": { 115 | "slide_type": "-" 116 | } 117 | }, 118 | "outputs": [], 119 | "source": [ 120 | "for" 121 | ] 122 | }, 123 | { 124 | "cell_type": "markdown", 125 | "metadata": { 126 | "deletable": true, 127 | "editable": true, 128 | "ein.tags": 
"worksheet-0", 129 | "slideshow": { 130 | "slide_type": "-" 131 | } 132 | }, 133 | "source": [ 134 | "* `IndexError`:请求的索引超出序列范围" 135 | ] 136 | }, 137 | { 138 | "cell_type": "code", 139 | "execution_count": null, 140 | "metadata": { 141 | "autoscroll": false, 142 | "collapsed": false, 143 | "deletable": true, 144 | "editable": true, 145 | "ein.tags": "worksheet-0", 146 | "slideshow": { 147 | "slide_type": "-" 148 | } 149 | }, 150 | "outputs": [], 151 | "source": [ 152 | "alist = [0, 1, 2]\n", 153 | "alist[4]" 154 | ] 155 | }, 156 | { 157 | "cell_type": "markdown", 158 | "metadata": { 159 | "deletable": true, 160 | "editable": true, 161 | "ein.tags": "worksheet-0", 162 | "slideshow": { 163 | "slide_type": "-" 164 | } 165 | }, 166 | "source": [ 167 | "* `KeyError`:请求一个不存在的字典键" 168 | ] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "execution_count": null, 173 | "metadata": { 174 | "autoscroll": false, 175 | "collapsed": false, 176 | "deletable": true, 177 | "editable": true, 178 | "ein.tags": "worksheet-0", 179 | "slideshow": { 180 | "slide_type": "-" 181 | } 182 | }, 183 | "outputs": [], 184 | "source": [ 185 | "adict = {'a': 1, 'b': 2}\n", 186 | "adict['c']" 187 | ] 188 | }, 189 | { 190 | "cell_type": "markdown", 191 | "metadata": { 192 | "deletable": true, 193 | "editable": true, 194 | "ein.tags": "worksheet-0", 195 | "slideshow": { 196 | "slide_type": "-" 197 | } 198 | }, 199 | "source": [ 200 | "* `FileNotFoundError`:输入/输出错误" 201 | ] 202 | }, 203 | { 204 | "cell_type": "code", 205 | "execution_count": null, 206 | "metadata": { 207 | "autoscroll": false, 208 | "collapsed": false, 209 | "deletable": true, 210 | "editable": true, 211 | "ein.tags": "worksheet-0", 212 | "slideshow": { 213 | "slide_type": "-" 214 | } 215 | }, 216 | "outputs": [], 217 | "source": [ 218 | "afile = open('not_exists.txt')" 219 | ] 220 | }, 221 | { 222 | "cell_type": "markdown", 223 | "metadata": { 224 | "deletable": true, 225 | "editable": true, 226 | "ein.tags": "worksheet-0", 227 | 
"slideshow": { 228 | "slide_type": "-" 229 | } 230 | }, 231 | "source": [ 232 | "* `AttributeError`:尝试访问未知的对象属性" 233 | ] 234 | }, 235 | { 236 | "cell_type": "code", 237 | "execution_count": null, 238 | "metadata": { 239 | "autoscroll": false, 240 | "collapsed": false, 241 | "deletable": true, 242 | "editable": true, 243 | "ein.tags": "worksheet-0", 244 | "slideshow": { 245 | "slide_type": "-" 246 | } 247 | }, 248 | "outputs": [], 249 | "source": [ 250 | "'abc'.good()" 251 | ] 252 | }, 253 | { 254 | "cell_type": "markdown", 255 | "metadata": { 256 | "deletable": true, 257 | "editable": true, 258 | "ein.tags": "worksheet-0", 259 | "slideshow": { 260 | "slide_type": "-" 261 | } 262 | }, 263 | "source": [ 264 | "### 检测和处理异常\n", 265 | "\n", 266 | "#### `try-except`语句\n", 267 | "\n", 268 | "语法:\n", 269 | "\n", 270 | "```\n", 271 | "try:\n", 272 | " try_suite # 监控这里的异常\n", 273 | "except exceptiontype as name:\n", 274 | " except_suite # 异常处理代码\n", 275 | "```" 276 | ] 277 | }, 278 | { 279 | "cell_type": "code", 280 | "execution_count": null, 281 | "metadata": { 282 | "autoscroll": false, 283 | "collapsed": false, 284 | "deletable": true, 285 | "editable": true, 286 | "ein.tags": "worksheet-0", 287 | "slideshow": { 288 | "slide_type": "-" 289 | } 290 | }, 291 | "outputs": [], 292 | "source": [ 293 | "try:\n", 294 | " f = open('not_exists.txt')\n", 295 | "except FileNotFoundError as err:\n", 296 | " print('File does not exist.')" 297 | ] 298 | }, 299 | { 300 | "cell_type": "markdown", 301 | "metadata": { 302 | "deletable": true, 303 | "editable": true, 304 | "ein.tags": "worksheet-0", 305 | "slideshow": { 306 | "slide_type": "-" 307 | } 308 | }, 309 | "source": [ 310 | "程序执行时,尝试执行`try`中的代码,如果代码块中没有发现任何异常,则忽略`except`代码块中的内容,反之,如果发生的异常和`except`中要捕捉的异常类型相同。则调用其代码块中的代码对异常进行处理。处理后,返回程序的主流程中。\n", 311 | "\n", 312 | "如果`try`中产生的异常在`except`中无法被捕捉(类型不匹配),异常就会被递交至上一级,也就是由该段代码的调用者去处理。如果最后还是无法解决的话,就会出现错误,导致程序崩溃。\n", 313 | "\n", 314 | "#### `try-except-except-...`语句(多`except`)\n", 315 | 
"\n", 316 | "语法:\n", 317 | "\n", 318 | "```\n", 319 | "try:\n", 320 | " try_suite # 监控这里的异常\n", 321 | "except exceptiontype1 as name1:\n", 322 | " except_suite1 # 异常处理代码\n", 323 | "except exceptiontype2 as name2:\n", 324 | " except_suite2 # 异常处理代码\n", 325 | "...\n", 326 | "```\n", 327 | "\n", 328 | "有时候`try`代码中可能发生的异常有多种类型,我们可以针对不同类型的异常分类捕捉并予以处理。下面的例子定义了一个浮点类型转换函数,对于各种非法参数可以返回“标准的waring”:" 329 | ] 330 | }, 331 | { 332 | "cell_type": "code", 333 | "execution_count": null, 334 | "metadata": { 335 | "autoscroll": false, 336 | "collapsed": false, 337 | "deletable": true, 338 | "editable": true, 339 | "ein.tags": "worksheet-0", 340 | "slideshow": { 341 | "slide_type": "-" 342 | } 343 | }, 344 | "outputs": [], 345 | "source": [ 346 | "def safe_float(obj):\n", 347 | " try:\n", 348 | " retval = float(obj)\n", 349 | " except ValueError: # 异常:类型正确,值不正确\n", 350 | " retval = 'could not convert non-number to float'\n", 351 | " except TypeError: # 异常:类型不正确\n", 352 | " retval = 'object type cannot be converted to float'\n", 353 | " return retval\n", 354 | "\n", 355 | "print(safe_float('xyz'))\n", 356 | "print(safe_float(()))\n", 357 | "print(safe_float(200))" 358 | ] 359 | }, 360 | { 361 | "cell_type": "markdown", 362 | "metadata": { 363 | "deletable": true, 364 | "editable": true, 365 | "ein.tags": "worksheet-0", 366 | "slideshow": { 367 | "slide_type": "-" 368 | } 369 | }, 370 | "source": [ 371 | "#### 处理多个异常的`except`语句\n", 372 | "\n", 373 | "语法:\n", 374 | "```\n", 375 | "try:\n", 376 | " try_suite\n", 377 | "except (Exc1,[, Exc2[, ... 
ExcN]]) [as reason]:\n", 378 | "    suite_for_exception1_to_excN\n", 379 | "```\n", 380 | "\n", 381 | "This syntax lets us handle several exception types in the same way. `safe_float()` rewritten with it looks like this:" 382 | ] 383 | }, 384 | { 385 | "cell_type": "code", 386 | "execution_count": null, 387 | "metadata": { 388 | "autoscroll": false, 389 | "collapsed": false, 390 | "deletable": true, 391 | "editable": true, 392 | "ein.tags": "worksheet-0", 393 | "slideshow": { 394 | "slide_type": "-" 395 | } 396 | }, 397 | "outputs": [], 398 | "source": [ 399 | "def safe_float(obj):\n", 400 | "    try:\n", 401 | "        retval = float(obj)\n", 402 | "    except (ValueError, TypeError):\n", 403 | "        retval = 'arguments must be a number or numeric string'\n", 404 | "    return retval\n", 405 | "\n", 406 | "print(safe_float('xyz'))\n", 407 | "print(safe_float(()))\n", 408 | "print(safe_float(200))" 409 | ] 410 | }, 411 | { 412 | "cell_type": "markdown", 413 | "metadata": { 414 | "deletable": true, 415 | "editable": true, 416 | "ein.tags": "worksheet-0", 417 | "slideshow": { 418 | "slide_type": "-" 419 | } 420 | }, 421 | "source": [ 422 | "Now the different kinds of error all return the same error message string.\n", 423 | "\n", 424 | "#### Catching all exceptions\n", 425 | "\n", 426 | "Syntax:\n", 427 | "\n", 428 | "```\n", 429 | "try:\n", 430 | "    ...\n", 431 | "except [Exception] [as reason]:\n", 432 | "    ...\n", 433 | "```\n", 434 | "\n", 435 | "In the exception inheritance tree, `Exception` is the base class of all exceptions raised by errors. That excludes `SystemExit` (the application wants to exit), `KeyboardInterrupt` (the user pressed Ctrl+C), and `GeneratorExit`.\n", 436 | "\n", 437 | "The true base class of exceptions is `BaseException`. Its direct subclasses include `Exception`, `SystemExit`, `KeyboardInterrupt`, and `GeneratorExit`.\n", 438 | "\n", 439 | "So to catch truly every exception, the `except` clause should read `except BaseException as e:`.\n", 440 | "\n", 441 | "In practice, code is therefore usually structured like this:\n", 442 | "\n", 443 | "```\n", 444 | "try:\n", 445 | "    ...\n", 446 | "except (KeyboardInterrupt, SystemExit):\n", 447 | "    # the user wants to quit\n", 448 | "    raise  # pass the exception up to the caller\n", 449 | "except Exception:\n", 450 | "    # handle real errors\n", 451 | "```\n", 452 | "\n", 453 | "#### Exception arguments\n", 454 | "\n", 455 | 
"异常有参数,前面的例子中,在`except`后你经常看到有个`e`,这个`e`就是异常参数,它是指示了异常原因的一个字符串。虽然在前面的很多例子中异常参数都被忽略了,但在实际编码过程中,将参数字符串和自己设定的错误信息一起输出是一个好的习惯。\n", 456 | "\n", 457 | "```\n", 458 | "...\n", 459 | "except ValueError as e:\n", 460 | " print('invalid value', e)\n", 461 | "```\n", 462 | "\n", 463 | "#### `else`子句\n", 464 | "\n", 465 | "```\n", 466 | "try:\n", 467 | " ...\n", 468 | "except:\n", 469 | " ...\n", 470 | "else:\n", 471 | " ...\n", 472 | "```\n", 473 | "\n", 474 | "`try`代码块中没有异常检测到时,执行`else`子句。\n", 475 | "\n", 476 | "\n", 477 | "#### `finally`子句\n", 478 | "\n", 479 | "```\n", 480 | "try:\n", 481 | " ...\n", 482 | "except MyException:\n", 483 | " ...\n", 484 | "else:\n", 485 | " ...\n", 486 | "finally:\n", 487 | " ...\n", 488 | "```\n", 489 | "\n", 490 | "**无论异常是否发生,无论异常是否被捕捉到,`finally`后的语句块一定会被执行。**诸如关闭文件,断开数据库连接之类的语句理所当然应当放在这一部分。\n", 491 | "\n", 492 | "#### `try-finally`子句\n", 493 | "\n", 494 | "```\n", 495 | "try:\n", 496 | " try_suite\n", 497 | "finally:\n", 498 | " finally_suite\n", 499 | "```\n", 500 | "\n", 501 | "这种用法和`try-except`的区别主要在于它不是用来捕捉异常的,而是用于保证无论异常是否发生,`finally`代码段都会执行。\n", 502 | "\n", 503 | "#### 综合\n", 504 | "\n", 505 | "```\n", 506 | "try:\n", 507 | " try_suite\n", 508 | "except Exception1:\n", 509 | " suite_for_exception1\n", 510 | "except (Exception2, Exc3, Exc4):\n", 511 | " suite_for_exc2_to_4\n", 512 | "except (Exc5, Exc6) as argu_for_56:\n", 513 | " suite_for_exc5_to_6\n", 514 | "except:\n", 515 | " suite_for_other_exceptions\n", 516 | "else:\n", 517 | " no_exception_detected_suite\n", 518 | "finally:\n", 519 | " always_excute_suite\n", 520 | "```\n", 521 | "\n", 522 | "### 触发异常\n", 523 | "\n", 524 | "到目前为止,看到的异常都是由Python解释器引起的,由执行期间的错误引发。很多情况下,你编写自己的类或者API,需要在遇到错误输入的时候触发异常,Python提供了`raise`语句来实现这一机制。\n", 525 | "\n", 526 | "语法:\n", 527 | "```\n", 528 | "raise [expression [from expression]]\n", 529 | "```\n", 530 | "\n", 531 | "如果`raise`后面没有跟表达式,`raise`将重新引发当前作用域中的最后一个异常。如果当前作用域中没有异常,将引发一个`RuntimeError`来提示有错误发生。\n", 532 | "\n", 533 | 
"其他情况下,`raise`将执行第一个表达式作为异常对象。它必须是`BaseException`的子类或者一个实例。如果它是一个类,那么将通过不带任何参数的初始化这个类来获取其实例对象。\n", 534 | "\n", 535 | "`from`语句被用来进行异常链追踪,其后的表达式必须是另外一个异常类或者实例。" 536 | ] 537 | }, 538 | { 539 | "cell_type": "code", 540 | "execution_count": null, 541 | "metadata": { 542 | "autoscroll": false, 543 | "collapsed": false, 544 | "deletable": true, 545 | "editable": true, 546 | "ein.tags": "worksheet-0", 547 | "slideshow": { 548 | "slide_type": "-" 549 | } 550 | }, 551 | "outputs": [], 552 | "source": [ 553 | "try:\n", 554 | " print(1 / 0)\n", 555 | "except Exception as exc:\n", 556 | " raise RuntimeError(\"Something bad happened\") from exc" 557 | ] 558 | }, 559 | { 560 | "cell_type": "code", 561 | "execution_count": null, 562 | "metadata": { 563 | "autoscroll": false, 564 | "collapsed": false, 565 | "deletable": true, 566 | "editable": true, 567 | "ein.tags": "worksheet-0", 568 | "slideshow": { 569 | "slide_type": "-" 570 | } 571 | }, 572 | "outputs": [], 573 | "source": [ 574 | "try:\n", 575 | " print(1 / 0)\n", 576 | "except:\n", 577 | " raise RuntimeError(\"Something bad happened\")" 578 | ] 579 | }, 580 | { 581 | "cell_type": "markdown", 582 | "metadata": { 583 | "deletable": true, 584 | "editable": true, 585 | "ein.tags": "worksheet-0", 586 | "slideshow": { 587 | "slide_type": "-" 588 | } 589 | }, 590 | "source": [ 591 | "### 创建异常\n", 592 | "当系统的内建异常不能满足你的需求时,就需要动手创建自己的异常。异常的创建实质是异常类的设计。这里涉及到了面向对象编程中类的设计,因此不再这里讲述,在后面会讲到。" 593 | ] 594 | }, 595 | { 596 | "cell_type": "code", 597 | "execution_count": null, 598 | "metadata": { 599 | "autoscroll": false, 600 | "collapsed": false, 601 | "deletable": true, 602 | "editable": true, 603 | "ein.tags": "worksheet-0", 604 | "slideshow": { 605 | "slide_type": "-" 606 | } 607 | }, 608 | "outputs": [], 609 | "source": [ 610 | "class UppercaseException(Exception):\n", 611 | " pass\n", 612 | "\n", 613 | "words = ['eenie', 'meenie', 'miny', 'MO']\n", 614 | "for word in words:\n", 615 | " if word.isupper():\n", 616 | " raise 
UppercaseException(word)" 617 | ] 618 | }, 619 | { 620 | "cell_type": "markdown", 621 | "metadata": { 622 | "deletable": true, 623 | "editable": true, 624 | "ein.tags": "worksheet-0", 625 | "slideshow": { 626 | "slide_type": "-" 627 | } 628 | }, 629 | "source": [ 630 | "学习异常就是学习如何解决程序中的各种有意、无意引入的错误,这对于提高程序健壮性,减少bugs是非常重要的。程序设计中,使用异常框架来管理诸如数据库连接,文件操作,GUI响应等极容易发生异常的过程是必须的。" 631 | ] 632 | } 633 | ], 634 | "metadata": { 635 | "kernelspec": { 636 | "display_name": "Python 3", 637 | "name": "python3" 638 | }, 639 | "language_info": { 640 | "codemirror_mode": { 641 | "name": "ipython", 642 | "version": 3 643 | }, 644 | "file_extension": ".py", 645 | "mimetype": "text/x-python", 646 | "name": "python", 647 | "nbconvert_exporter": "python", 648 | "pygments_lexer": "ipython3", 649 | "version": "3.6.1" 650 | }, 651 | "name": "6_exceptions.ipynb" 652 | }, 653 | "nbformat": 4, 654 | "nbformat_minor": 2 655 | } 656 | -------------------------------------------------------------------------------- /课件/8_modules_pacakges_programs.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "ein.tags": "worksheet-0", 7 | "slideshow": { 8 | "slide_type": "-" 9 | } 10 | }, 11 | "source": [ 12 | "# Pyhon模块、包和程序\n", 13 | "\n", 14 | "本章学习如何写出实用的大型Python程序。\n", 15 | "\n", 16 | "## 之前的独立程序\n", 17 | "\n", 18 | "```python\n", 19 | "#!/usr/bin/env python3.6\n", 20 | "\n", 21 | "\"\"\"\n", 22 | "hello_world.py\n", 23 | "\"\"\"\n", 24 | "\n", 25 | "\n", 26 | "def main():\n", 27 | " print('hello, world!')\n", 28 | "\n", 29 | "\n", 30 | "if __name__ = '__main__':\n", 31 | " main()\n", 32 | "```\n", 33 | "\n", 34 | "## 命令行工具\n", 35 | "\n", 36 | "Python 作为一种脚本语言,可以非常方便地用于系统(尤其是*nix系统)命令行工具的开发。Python 自身也集成了一些标准库,专门用于处理命令行相关的问题。\n", 37 | "\n", 38 | "\n", 39 | "### 标准输入输出\n", 40 | "\n", 41 | "*nix 系统中,一切皆为文件,因此标准输入、输出可以完全可以看做是对文件的操作。标准化输入可以通过管道(pipe)或重定向(redirect)的方式传递:" 42 | ] 43 | }, 44 | { 45 | 
"cell_type": "code", 46 | "execution_count": null, 47 | "metadata": { 48 | "autoscroll": false, 49 | "collapsed": false, 50 | "ein.hycell": false, 51 | "ein.tags": "worksheet-0", 52 | "slideshow": { 53 | "slide_type": "-" 54 | } 55 | }, 56 | "outputs": [], 57 | "source": [ 58 | "#!/usr/bin/env python3.6\n", 59 | "\n", 60 | "# reverse.py\n", 61 | "\n", 62 | "import sys\n", 63 | "\n", 64 | "for l in sys.stdin.readlines():\n", 65 | " sys.stdout.write(l[::-1])" 66 | ] 67 | }, 68 | { 69 | "cell_type": "markdown", 70 | "metadata": { 71 | "ein.tags": "worksheet-0", 72 | "slideshow": { 73 | "slide_type": "-" 74 | } 75 | }, 76 | "source": [ 77 | "将上面的代码保存为`reverse.py`,通过管道`|`传递:\n", 78 | "\n", 79 | "```sh\n", 80 | "chmod +x reverse.py\n", 81 | "cat reverse.py | ./reverse.py\n", 82 | "```\n", 83 | "\n", 84 | "```\n", 85 | "6.3nohtyp vne/nib/rsu/!#\n", 86 | "\n", 87 | "yp.esrever #\n", 88 | "\n", 89 | "sys tropmi\n", 90 | "\n", 91 | ":)(senildaer.nidts.sys ni l rof\n", 92 | ")]1-::[l(etirw.tuodts.sys\n", 93 | "```\n", 94 | "\n", 95 | "通过重定向`<`传递:\n", 96 | "\n", 97 | "```sh\n", 98 | "./reverse.py < reverse.py\n", 99 | "# 输出结果同上\n", 100 | "```" 101 | ] 102 | }, 103 | { 104 | "cell_type": "markdown", 105 | "metadata": { 106 | "ein.tags": "worksheet-0", 107 | "slideshow": { 108 | "slide_type": "-" 109 | } 110 | }, 111 | "source": [ 112 | "### 命令行参数\n", 113 | "\n", 114 | "一般在命令行后追加的参数可以通过`sys.argv`获取,`sys.argv`是一个列表,其中第一个元素为当前脚本的文件名:" 115 | ] 116 | }, 117 | { 118 | "cell_type": "code", 119 | "execution_count": null, 120 | "metadata": { 121 | "autoscroll": false, 122 | "collapsed": false, 123 | "ein.hycell": false, 124 | "ein.tags": "worksheet-0", 125 | "slideshow": { 126 | "slide_type": "-" 127 | } 128 | }, 129 | "outputs": [], 130 | "source": [ 131 | "#!/usr/bin/env python3.6\n", 132 | "\n", 133 | "import sys\n", 134 | "\n", 135 | "print(sys.argv) #下面返回的是Jupyter运行的结果" 136 | ] 137 | }, 138 | { 139 | "cell_type": "markdown", 140 | "metadata": { 141 | "ein.tags": "worksheet-0", 142 
| "slideshow": { 143 | "slide_type": "-" 144 | } 145 | }, 146 | "source": [ 147 | "运行上面的脚本:\n", 148 | "\n", 149 | "```sh\n", 150 | "chmod +x argv.py\n", 151 | "./argv.py hello world\n", 152 | "# ['./argv.py', 'hello', 'world']\n", 153 | "python argv.py hello world\n", 154 | "# ['argv.py', 'hello', 'world']\n", 155 | "```\n", 156 | "\n", 157 | "对于比较复杂的命令行参数,例如通过`--option`传递的选项参数,如果是对`sys.argv`逐项进行解析会很麻烦,Python 提供标准库`argparse`(旧的库为`optparse`,已经停止维护)专门解析命令行参数。" 158 | ] 159 | }, 160 | { 161 | "cell_type": "markdown", 162 | "metadata": { 163 | "ein.tags": "worksheet-0", 164 | "slideshow": { 165 | "slide_type": "-" 166 | } 167 | }, 168 | "source": [ 169 | "## 模块和包\n", 170 | "\n", 171 | "模块和包是大型项目的核心,如何组织包、将大型模块分解成多个文件对于合理组织代码结构非常重要。\n", 172 | "\n", 173 | "一个**模块**就是一个Python代码文件。把多个模块组织成文件层次,称之为**包**。\n", 174 | "\n", 175 | "### 导入模块\n", 176 | "\n", 177 | "使用`import`语句来进行模块导入,模块是不带`.py`拓展的另外一个Python文件的文件名。先来看下面的例子,这里先将代码组织成由很多分层模块构成的包。在文件系统上组织代码,并确保每一个目录都定义了一个`__init__.py`文件,这个文件可以是空的,其目的是要包含不同运行级别的包的可选的初始化代码。举个例子,如果你执行了语句`import graphics`,文件`graphics/__init__.py`将被导入,建立`graphics`命名空间的内容。像`import graphics.formats.jpg`这样导入,文件`graphics/__init__.py`和文件`graphics/graphics/formats/__init__.py`将在文件`graphics/formats/jpg.py`导入之前导入。\n", 178 | "\n", 179 | "```\n", 180 | "graphics/\n", 181 | " __init__.py\n", 182 | " primitive/\n", 183 | " __init__.py\n", 184 | " line.py\n", 185 | " fill.py\n", 186 | " text.py\n", 187 | " formats/\n", 188 | " __init__.py\n", 189 | " png.py\n", 190 | " jpg.py\n", 191 | "```\n", 192 | "\n", 193 | "之后,就可以使用各种`import`语句来导入模块。\n", 194 | "\n", 195 | "```python\n", 196 | "import graphics.primitive.line # 直接导入模块,须按照“模块名. 
”的用法来使用\n", 197 | "from graphics.primitive import line # 导入模块的一部分\n", 198 | "import graphics.formats.jpg as jpg # 使用别名导入模块\n", 199 | "from graphics import * # 导入模块的全部内容\n", 200 | "```\n", 201 | "\n", 202 | "### 在函数内部导入模块" 203 | ] 204 | }, 205 | { 206 | "cell_type": "code", 207 | "execution_count": null, 208 | "metadata": { 209 | "autoscroll": false, 210 | "collapsed": false, 211 | "ein.hycell": false, 212 | "ein.tags": "worksheet-0", 213 | "slideshow": { 214 | "slide_type": "-" 215 | } 216 | }, 217 | "outputs": [], 218 | "source": [ 219 | "def get_random():\n", 220 | " import random\n", 221 | " possibilities = ['a', 'b', 'c', 'd']\n", 222 | " return random.choice(possibilities)" 223 | ] 224 | }, 225 | { 226 | "cell_type": "markdown", 227 | "metadata": { 228 | "ein.tags": "worksheet-0", 229 | "slideshow": { 230 | "slide_type": "-" 231 | } 232 | }, 233 | "source": [ 234 | "大家都比较习惯的在函数外部导入:" 235 | ] 236 | }, 237 | { 238 | "cell_type": "code", 239 | "execution_count": null, 240 | "metadata": { 241 | "autoscroll": false, 242 | "collapsed": false, 243 | "ein.hycell": false, 244 | "ein.tags": "worksheet-0", 245 | "slideshow": { 246 | "slide_type": "-" 247 | } 248 | }, 249 | "outputs": [], 250 | "source": [ 251 | "import random\n", 252 | "def get_random():\n", 253 | " possibilities = ['a', 'b', 'c', 'd']\n", 254 | " return random.choice(possibilities)" 255 | ] 256 | }, 257 | { 258 | "cell_type": "markdown", 259 | "metadata": { 260 | "ein.tags": "worksheet-0", 261 | "slideshow": { 262 | "slide_type": "-" 263 | } 264 | }, 265 | "source": [ 266 | "如果被导入的代码被多个地方多次使用,就应该考虑在函数外部导入;如果被导入的代码只在函数内部使用,就在函数内部导入。" 267 | ] 268 | }, 269 | { 270 | "cell_type": "markdown", 271 | "metadata": { 272 | "ein.tags": "worksheet-0", 273 | "slideshow": { 274 | "slide_type": "-" 275 | } 276 | }, 277 | "source": [ 278 | "### 模块搜索路径\n", 279 | "\n", 280 | "Python使用存储在`sys`模块下的目录名和zip压缩文件列表作为`path`变量,这个列表可以被读取和修改。" 281 | ] 282 | }, 283 | { 284 | "cell_type": "code", 285 | "execution_count": null, 286 | 
"metadata": { 287 | "autoscroll": false, 288 | "collapsed": false, 289 | "ein.hycell": false, 290 | "ein.tags": "worksheet-0", 291 | "slideshow": { 292 | "slide_type": "-" 293 | } 294 | }, 295 | "outputs": [], 296 | "source": [ 297 | "import sys\n", 298 | "\n", 299 | "for pth in sys.path:\n", 300 | " print(pth)" 301 | ] 302 | }, 303 | { 304 | "cell_type": "markdown", 305 | "metadata": { 306 | "ein.tags": "worksheet-0", 307 | "slideshow": { 308 | "slide_type": "-" 309 | } 310 | }, 311 | "source": [ 312 | "最开始的空白输出行是空字符串'',代表当前目录。如果空字符串是在`sys.path`的开始位置,Python会先搜索当前目录。Python会根据列表元素的位置优先导入前面目录中存在的模块,如果模块同名,后面路径中出现的模块则不会被导入。" 313 | ] 314 | }, 315 | { 316 | "cell_type": "markdown", 317 | "metadata": { 318 | "ein.tags": "worksheet-0", 319 | "slideshow": { 320 | "slide_type": "-" 321 | } 322 | }, 323 | "source": [ 324 | "### 使用相对路径导入模块\n", 325 | "\n", 326 | "在包内,既可以使用绝对路径来导入也可以使用相对路径来导入。可以使用包的相对导入,使一个的模块导入同一个包的另一个模块。比如下面的例子,假设在文件系上有`mypackage`包,组织如下:\n", 327 | "\n", 328 | "```\n", 329 | "mypackage/\n", 330 | " __init__.py\n", 331 | " A/\n", 332 | " __init__.py\n", 333 | " spam.py\n", 334 | " grok.py\n", 335 | " B/\n", 336 | " __init__.py\n", 337 | " bar.py\n", 338 | "```\n", 339 | "\n", 340 | "如果模块`mypackage.A.spam`要导入同目录下的模块`grok`,它应该包括的`import`语句如下:\n", 341 | "\n", 342 | "```python\n", 343 | "# mypackage/A/spam.py\n", 344 | "\n", 345 | "from . import grok\n", 346 | "```\n", 347 | "\n", 348 | "如果模块`mypackage.A.spam`要导入不同目录下的模块`B.bar`,它应该使用的`import`语句如下:\n", 349 | "\n", 350 | "```python\n", 351 | "# mypackage/A/spam.py\n", 352 | "\n", 353 | "from ..B import bar\n", 354 | "```\n", 355 | "\n", 356 | "两个`import`语句都没包含顶层包名,而是使用了`spam.py`的相对路径。\n", 357 | "\n", 358 | "下面是同时使用绝对路径导入和相对路径导入的例子:\n", 359 | "\n", 360 | "```python\n", 361 | "# mypackage/A/spam.py\n", 362 | "\n", 363 | "from mypackage.A import grok # OK\n", 364 | "from . 
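import grok # OK\n", 365 | "import grok # ImportError (not found)\n", 366 | "```\n", 367 | "\n", 368 | "The drawback of absolute names such as `mypackage.A` is that they hard-code the top-level package name into your source. That makes the code more brittle and harder to reorganize. For example, if you change the package name, you have to go through every file and fix the source. Hard-coded names also make it difficult to move code around. For example, someone may want to install two different versions of a package and tell them apart only by name. With relative imports everything works, whereas absolute names are likely to break.\n", 369 | "\n", 370 | "The `.` and `..` syntax in `import` statements resembles the current and parent directories in a shell; think of them as naming directories. `.` means look in the current directory, and `..B` means look in the `../B` directory. This syntax can be used only in imports of the form `from xx import yy`.\n", 371 | "\n", 372 | "```python\n", 373 | "from . import grok # OK\n", 374 | "import .grok # ERROR\n", 375 | "```\n", 376 | "\n", 377 | "Relative imports may not escape the directory in which the package is defined; using a combination of dots to reach a directory that is not a Python package results in an import error.\n", 378 | "\n", 379 | "Also, relative imports work only under certain conditions: the module must live inside a proper package. In particular, modules in the top-level directory of a script cannot use relative imports.\n", 380 | "\n", 381 | "Likewise, if part of a package is executed directly as a script, relative imports cannot be used.\n", 382 | "\n", 383 | "For example:\n", 384 | "\n", 385 | "```sh\n", 386 | "$ python3 mypackage/A/spam.py # relative imports fail\n", 387 | "```\n", 388 | "\n", 389 | "However, the script above can be run with the `-m` option:\n", 390 | "\n", 391 | "```sh\n", 392 | "$ python3 -m mypackage.A.spam # relative imports work\n", 393 | "```" 394 | ] 395 | }, 396 | { 397 | "cell_type": "markdown", 398 | "metadata": { 399 | "ein.tags": "worksheet-0", 400 | "slideshow": { 401 | "slide_type": "-" 402 | } 403 | }, 404 | "source": [ 405 | "## The Python standard library\n", 406 | "\n", 407 | "Python has an extensive standard library which, together with the language core, makes up the Python language. Python provides official documentation for each standard library module ( https://docs.python.org/3/library ) and a tutorial tour ( https://docs.python.org/3.6/tutorial/stdlib.html ). In addition, *Doug Hellmann*'s website *Python 3 Module of the Week* ( https://pymotw.com/3/ ) and his book *The Python 3 Standard Library by Example* (the Chinese edition, 《Python 3标准库》, was published by China Machine Press on October 11, 2018) are very helpful guides that present a large number of code examples for the standard library modules.\n", 408 | "\n", 409 | "Some commonly used standard library modules shown in the textbook:\n", 410 | "\n", 411 | "* `collections.defaultdict` creates a dictionary with default values for keys. Its argument is a function (which may be a lambda) that returns the value assigned to a missing key." 412 | ] 413 | }, 414 | { 415 | "cell_type": "code", 416 | "execution_count": null, 417 | "metadata": { 418 | "autoscroll": false, 419 | "collapsed": false, 420 | "ein.hycell": false, 421 | 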
"ein.tags": "worksheet-0", 422 | "slideshow": { 423 | "slide_type": "-" 424 | } 425 | }, 426 | "outputs": [], 427 | "source": [ 428 | "from collections import defaultdict\n", 429 | "food_counter = defaultdict(int)\n", 430 | "for food in ['spam', 'spam', 'eggs', 'spam']:\n", 431 | " food_counter[food] += 1\n", 432 | "\n", 433 | "for food, count in food_counter.items():\n", 434 | " print(food, count)" 435 | ] 436 | }, 437 | { 438 | "cell_type": "markdown", 439 | "metadata": { 440 | "ein.tags": "worksheet-0", 441 | "slideshow": { 442 | "slide_type": "-" 443 | } 444 | }, 445 | "source": [ 446 | "* `collections.Counter` 提供了计数器功能。" 447 | ] 448 | }, 449 | { 450 | "cell_type": "code", 451 | "execution_count": null, 452 | "metadata": { 453 | "autoscroll": false, 454 | "collapsed": false, 455 | "ein.hycell": false, 456 | "ein.tags": "worksheet-0", 457 | "slideshow": { 458 | "slide_type": "-" 459 | } 460 | }, 461 | "outputs": [], 462 | "source": [ 463 | "from collections import Counter\n", 464 | "alpha_counter = Counter('aaabbbccc')\n", 465 | "alpha_counter" 466 | ] 467 | }, 468 | { 469 | "cell_type": "markdown", 470 | "metadata": { 471 | "ein.tags": "worksheet-0", 472 | "slideshow": { 473 | "slide_type": "-" 474 | } 475 | }, 476 | "source": [ 477 | "函数`most_common()`以降序返回所有元素,可以给其指定一个数字参数,返回排名在该数字之前的元素。" 478 | ] 479 | }, 480 | { 481 | "cell_type": "code", 482 | "execution_count": null, 483 | "metadata": { 484 | "autoscroll": false, 485 | "collapsed": false, 486 | "ein.hycell": false, 487 | "ein.tags": "worksheet-0", 488 | "slideshow": { 489 | "slide_type": "-" 490 | } 491 | }, 492 | "outputs": [], 493 | "source": [ 494 | "alpha_counter.most_common(2)" 495 | ] 496 | }, 497 | { 498 | "cell_type": "markdown", 499 | "metadata": { 500 | "ein.tags": "worksheet-0", 501 | "slideshow": { 502 | "slide_type": "-" 503 | } 504 | }, 505 | "source": [ 506 | "另外,可以针对两个或多个计数器进行组合,其也支持类似集合元算的求交、并、差运算。" 507 | ] 508 | }, 509 | { 510 | "cell_type": "code", 511 | "execution_count": null, 512 | 
"metadata": { 513 | "autoscroll": false, 514 | "collapsed": false, 515 | "ein.hycell": false, 516 | "ein.tags": "worksheet-0", 517 | "slideshow": { 518 | "slide_type": "-" 519 | } 520 | }, 521 | "outputs": [], 522 | "source": [ 523 | "alpha_counter2 = Counter('abcddd')\n", 524 | "alpha_counter + alpha_counter2 # Counter({'a': 4, 'b': 4, 'c': 4, 'd': 3}),计数组合\n", 525 | "alpha_counter - alpha_counter2 # Counter({'a': 2, 'b': 2, 'c': 2}),计数相减\n", 526 | "alpha_counter & alpha_counter2 # Counter({'a': 1, 'b': 1, 'c': 1}),两者交集中取两者中较小计数\n", 527 | "alpha_counter | alpha_counter2 # Counter({'a': 3, 'b': 3, 'c': 3, 'd': 3}), 两者并集中取两者中较大计数" 528 | ] 529 | }, 530 | { 531 | "cell_type": "markdown", 532 | "metadata": { 533 | "ein.tags": "worksheet-0", 534 | "slideshow": { 535 | "slide_type": "-" 536 | } 537 | }, 538 | "source": [ 539 | "* `collections.OrderedDict` 有序字典,记忆字典键添加的顺序,然后从一个迭代器按照相同的顺序返回。" 540 | ] 541 | }, 542 | { 543 | "cell_type": "code", 544 | "execution_count": null, 545 | "metadata": { 546 | "autoscroll": false, 547 | "collapsed": false, 548 | "ein.hycell": false, 549 | "ein.tags": "worksheet-0", 550 | "slideshow": { 551 | "slide_type": "-" 552 | } 553 | }, 554 | "outputs": [], 555 | "source": [ 556 | "from collections import OrderedDict\n", 557 | "quotes = OrderedDict([\n", 558 | " ('Moe', 'A wise guy, huh?'),\n", 559 | " ('Larry', 'Ow!'),\n", 560 | " ('Curly', 'Nyuk nyuk!'),\n", 561 | "])\n", 562 | "for stooge in quotes:\n", 563 | " print(stooge)" 564 | ] 565 | }, 566 | { 567 | "cell_type": "code", 568 | "execution_count": null, 569 | "metadata": { 570 | "autoscroll": false, 571 | "collapsed": false, 572 | "ein.hycell": false, 573 | "ein.tags": "worksheet-0", 574 | "slideshow": { 575 | "slide_type": "-" 576 | } 577 | }, 578 | "outputs": [], 579 | "source": [ 580 | "quotes = dict([\n", 581 | " ('Moe', 'A wise guy, huh?'),\n", 582 | " ('Larry', 'Ow!'),\n", 583 | " ('Curly', 'Nyuk nyuk!'),\n", 584 | "])\n", 585 | "for stooge in quotes:\n", 586 | " print(stooge)" 587 
| ] 588 | }, 589 | { 590 | "cell_type": "markdown", 591 | "metadata": { 592 | "ein.tags": "worksheet-0", 593 | "slideshow": { 594 | "slide_type": "-" 595 | } 596 | }, 597 | "source": [ 598 | "* `collections.deque` is a double-ended queue that combines the features of a stack and a queue; items can be added to and removed from either end of the sequence. The method `popleft()` removes and returns the leftmost item, and `pop()` removes and returns the rightmost item." 599 | ] 600 | }, 601 | { 602 | "cell_type": "code", 603 | "execution_count": null, 604 | "metadata": { 605 | "autoscroll": false, 606 | "collapsed": false, 607 | "ein.hycell": false, 608 | "ein.tags": "worksheet-0", 609 | "slideshow": { 610 | "slide_type": "-" 611 | } 612 | }, 613 | "outputs": [], 614 | "source": [ 615 | "def palindrome(word):\n", 616 | "    \"\"\"Check whether word is a palindrome.\"\"\"\n", 617 | "    from collections import deque\n", 618 | "    dq = deque(word)\n", 619 | "    while len(dq) > 1:\n", 620 | "        if dq.popleft() != dq.pop():\n", 621 | "            return False\n", 622 | "    return True\n", 623 | "\n", 624 | "\n", 625 | "def another_palindrome(word):\n", 626 | "    \"\"\"Check whether word is a palindrome.\"\"\"\n", 627 | "    return word == word[::-1]\n", 628 | "print(palindrome('racecar'))\n", 629 | "print(another_palindrome('racecar'))" 630 | ] 631 | }, 632 | { 633 | "cell_type": "markdown", 634 | "metadata": { 635 | "ein.tags": "worksheet-0", 636 | "slideshow": { 637 | "slide_type": "-" 638 | } 639 | }, 640 | "source": [ 641 | "* `itertools` provides iterator functions; each returns one item at a time and remembers its state between calls." 642 | ] 643 | }, 644 | { 645 | "cell_type": "code", 646 | "execution_count": null, 647 | "metadata": { 648 | "autoscroll": false, 649 | "collapsed": false, 650 | "ein.hycell": false, 651 | "ein.tags": "worksheet-0", 652 | "slideshow": { 653 | "slide_type": "-" 654 | } 655 | }, 656 | "outputs": [], 657 | "source": [ 658 | "import itertools\n", 659 | "help(itertools)" 660 | ] 661 | }, 662 | { 663 | "cell_type": "code", 664 | "execution_count": null, 665 | "metadata": { 666 | "autoscroll": false, 667 | "collapsed": false, 668 | "ein.hycell": false, 669 | "ein.tags": "worksheet-0", 670 | "slideshow": { 671 | "slide_type": "-" 672 | } 673 | }, 674 | "outputs": [], 675 | 
"source": [ 676 | "for item in itertools.chain([1, 2], ['a', 'b']):\n", 677 | " print(item)" 678 | ] 679 | }, 680 | { 681 | "cell_type": "code", 682 | "execution_count": null, 683 | "metadata": { 684 | "autoscroll": false, 685 | "collapsed": false, 686 | "ein.hycell": false, 687 | "ein.tags": "worksheet-0", 688 | "slideshow": { 689 | "slide_type": "-" 690 | } 691 | }, 692 | "outputs": [], 693 | "source": [ 694 | "for item in itertools.accumulate([1, 2, 3, 4], lambda x, y: x * y):\n", 695 | " print(item)" 696 | ] 697 | }, 698 | { 699 | "cell_type": "markdown", 700 | "metadata": { 701 | "ein.tags": "worksheet-0", 702 | "slideshow": { 703 | "slide_type": "-" 704 | } 705 | }, 706 | "source": [ 707 | "* `pprint` 友好打印" 708 | ] 709 | }, 710 | { 711 | "cell_type": "code", 712 | "execution_count": null, 713 | "metadata": { 714 | "autoscroll": false, 715 | "collapsed": false, 716 | "ein.hycell": false, 717 | "ein.tags": "worksheet-0", 718 | "slideshow": { 719 | "slide_type": "-" 720 | } 721 | }, 722 | "outputs": [], 723 | "source": [ 724 | "quotes = OrderedDict([\n", 725 | " ('Moe', 'A wise guy, huh?'),\n", 726 | " ('Larry', 'Ow!'),\n", 727 | " ('Curly', 'Nyuk nyuk!'),\n", 728 | "])\n", 729 | "print(quotes)" 730 | ] 731 | }, 732 | { 733 | "cell_type": "code", 734 | "execution_count": null, 735 | "metadata": { 736 | "autoscroll": false, 737 | "collapsed": false, 738 | "ein.hycell": false, 739 | "ein.tags": "worksheet-0", 740 | "slideshow": { 741 | "slide_type": "-" 742 | } 743 | }, 744 | "outputs": [], 745 | "source": [ 746 | "from pprint import pprint\n", 747 | "pprint(quotes)" 748 | ] 749 | }, 750 | { 751 | "cell_type": "markdown", 752 | "metadata": { 753 | "ein.tags": "worksheet-0", 754 | "slideshow": { 755 | "slide_type": "-" 756 | } 757 | }, 758 | "source": [ 759 | "### 获取第三方Python代码\n", 760 | "\n", 761 | "* **Pypi** (https://pypi.python.org)\n", 762 | "* **Github** (https://github.com)\n", 763 | "* **ReadTheDocs** (https://readthedocs.org)\n", 764 | "* **activestate** 
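(http://code.activestate.com/recipes/langs/python)" 765 | ] 766 | } 767 | ], 768 | "metadata": { 769 | "kernelspec": { 770 | "display_name": "Python 3", 771 | "name": "python3" 772 | }, 773 | "language_info": { 774 | "codemirror_mode": { 775 | "name": "ipython", 776 | "version": 3 777 | }, 778 | "file_extension": ".py", 779 | "mimetype": "text/x-python", 780 | "name": "python", 781 | "nbconvert_exporter": "python", 782 | "pygments_lexer": "ipython3", 783 | "version": "3.6.1" 784 | }, 785 | "name": "8_modules_pacakges_programs.ipynb" 786 | }, 787 | "nbformat": 4, 788 | "nbformat_minor": 2 789 | } 790 | -------------------------------------------------------------------------------- /课件/9_objects_and_classes.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "ein.tags": "worksheet-0", 7 | "slideshow": { 8 | "slide_type": "-" 9 | } 10 | }, 11 | "source": [ 12 | "# Objects and Classes\n", 13 | "\n", 14 | "This chapter covers how to use custom data structures: **objects**.\n", 15 | "\n", 16 | "## What is an object\n", 17 | "\n", 18 | "\"Objects are Python's abstraction for data. All data in a Python program is represented by objects or by relations between objects.\"\n", 19 | "\n", 20 | "In Python, every object has three properties: **id**, **type**, and **value**:\n", 21 | "\n", 22 | "![python_objects.png](images/python_object.png)\n", 23 | "\n", 24 | "The **id** represents the object's memory address (in CPython) and can be inspected with the built-in function `id()`. The **type** is the object's category, which determines the attributes and methods the object has, and can be inspected with the built-in `type()`.\n", 25 | "\n", 26 | "An object contains both data (attributes) and code (functions, also called methods). It is a specific instance of some class of things.\n", 27 | "\n", 28 | "As the basic unit of Python, objects can be created, named, or deleted. You rarely need to delete objects by hand, because the garbage collector reclaims objects that are no longer in use; if needed, the `del` statement removes a variable. Naming means attaching a name tag to an object so that it is easy to refer to, that is, declaring or assigning a variable.\n", 29 | "\n", 30 | "Objects of some built-in types can be created with dedicated syntax: **numbers** are written as numeric literals, **strings** use quotes `''`, **lists** use `[]`, **dictionaries** use `{}`, **functions** use the `def` syntax, and so on. The types of these objects are built into Python.\n", 31 | "\n", 32 | "To create an object of any other type, you must first define a class that specifies what objects of that type contain (attributes and methods).\n", 33 | "\n", 34 | "## Classes and instances\n", 35 | "\n", 36 | "In Python a class (a class object) is usually defined with the `class` statement. Unlike other objects, an object defined by `class` (a class) can be used to produce new objects (instances)." 37 | ] 38 | }, 39 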
| { 40 | "cell_type": "code", 41 | "execution_count": null, 42 | "metadata": { 43 | "autoscroll": false, 44 | "collapsed": false, 45 | "ein.hycell": false, 46 | "ein.tags": "worksheet-0", 47 | "slideshow": { 48 | "slide_type": "-" 49 | } 50 | }, 51 | "outputs": [], 52 | "source": [ 53 | "class Person():\n", 54 | "    pass\n", 55 | "\n", 56 | "\n", 57 | "def who(obj):\n", 58 | "    print(id(obj), type(obj))\n", 59 | "\n", 60 | "\n", 61 | "someone = Person()\n", 62 | "who(someone)" 63 | ] 64 | }, 65 | { 66 | "cell_type": "markdown", 67 | "metadata": { 68 | "ein.tags": "worksheet-0", 69 | "slideshow": { 70 | "slide_type": "-" 71 | } 72 | }, 73 | "source": [ 74 | "In the example above, `Person` is a new class that we created. Calling `Person()` returns an instance of type `Person`, and assigning it to `someone` gives us an object `someone` whose type differs from every built-in type: its type is `__main__.Person`. At this point, all objects in Python can be divided into two kinds:\n", 75 | "\n", 76 | "1. Classes, which can be used to produce new objects, including the built-in `int` and `str` as well as the `Person` defined above;\n", 77 | "2. Instances produced from classes, including numbers and strings of built-in types as well as `someone`, whose type is `__main__.Person`.\n", 78 | "\n", 79 | "Some practical details also have to be considered:\n", 80 | "\n", 81 | "1. We need convenient mechanisms for object-oriented features such as inheritance and overloading;\n", 82 | "2. 
We need a fixed procedure that lets us perform specific operations while an instance is being created.\n", 83 | "\n", 84 | "## Methods of a class\n", 85 | "\n", 86 | "Inside a class, the `def` keyword defines a method. Unlike an ordinary function definition, an instance method must take `self` as its first parameter.\n", 87 | "\n", 88 | "Python uses the special initialization method `__init__` (often loosely called the constructor), which is called whenever an instance of the class is created, similar to a constructor in C++. To redefine the `Person` class above, for example to add a `name` parameter, you can write:" 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": null, 94 | "metadata": { 95 | "autoscroll": false, 96 | "collapsed": false, 97 | "ein.hycell": false, 98 | "ein.tags": "worksheet-0", 99 | "slideshow": { 100 | "slide_type": "-" 101 | } 102 | }, 103 | "outputs": [], 104 | "source": [ 105 | "class Person():\n", 106 | "    def __init__(self, name):\n", 107 | "        self.name = name\n", 108 | "\n", 109 | "someone = Person('小明')\n", 110 | "print(someone)\n", 111 | "print(someone.name)" 112 | ] 113 | }, 114 | { 115 | "cell_type": "markdown", 116 | "metadata": { 117 | "ein.tags": "worksheet-0", 118 | "slideshow": { 119 | "slide_type": "-" 120 | } 121 | }, 122 | "source": [ 123 | "Execution flow of the code above:\n", 124 | "1. Look up the definition of the `Person` class;\n", 125 | "2. Instantiate (create) a new object in memory;\n", 126 | "3. Call the object's `__init__` method, passing the newly created object as `self` and the other argument ('小明') as `name`;\n", 127 | "4. Store the value of `name` in the object;\n", 128 | "5. Return the new object;\n", 129 | "6. 
Bind the name `someone` to this object." 130 | ] 131 | }, 132 | { 133 | "cell_type": "code", 134 | "execution_count": null, 135 | "metadata": { 136 | "autoscroll": false, 137 | "collapsed": false, 138 | "ein.hycell": false, 139 | "ein.tags": "worksheet-0", 140 | "slideshow": { 141 | "slide_type": "-" 142 | } 143 | }, 144 | "outputs": [], 145 | "source": [ 146 | "class Person():\n", 147 | "    def __init__(self, name, age):\n", 148 | "        self.name = name\n", 149 | "        self.age = age\n", 150 | "\n", 151 | "    def get_name(self):\n", 152 | "        return self.name\n", 153 | "\n", 154 | "    def set_name(self, name):\n", 155 | "        self.name = name\n", 156 | "\n", 157 | "    def get_age(self):\n", 158 | "        return self.age\n", 159 | "\n", 160 | "    def set_age(self, age):\n", 161 | "        self.age = age\n", 162 | "\n", 163 | "\n", 164 | "someone = Person('小明', 14)\n", 165 | "print(someone.name)\n", 166 | "print(someone.get_name())\n", 167 | "someone.set_name('李小明')\n", 168 | "print(someone.name)\n", 169 | "print(someone.age)\n", 170 | "someone.age = 15\n", 171 | "print(someone.get_age())" 172 | ] 173 | }, 174 | { 175 | "cell_type": "markdown", 176 | "metadata": { 177 | "ein.tags": "worksheet-0", 178 | "slideshow": { 179 | "slide_type": "-" 180 | } 181 | }, 182 | "source": [ 183 | "## Inheritance\n", 184 | "\n", 185 | "Deriving a new class from an existing one, and adding or modifying some functionality, improves code reuse. A class created through inheritance automatically gets all the methods of the old class without copying them.\n", 186 | "\n", 187 | "In the new class you can define whatever extra methods you need, or modify the inherited methods as required." 188 | ] 189 | }, 190 | { 191 | "cell_type": "code", 192 | "execution_count": null, 193 | "metadata": { 194 | "autoscroll": false, 195 | "collapsed": false, 196 | "ein.hycell": false, 197 | "ein.tags": "worksheet-0", 198 | "slideshow": { 199 | "slide_type": "-" 200 | } 201 | }, 202 | "outputs": [], 203 | "source": [ 204 | "class A():\n", 205 | "    def foo(self):\n", 206 | "        print('A.foo')\n", 207 | "\n", 208 | "class B(A):\n", 209 | "    def foo(self):\n", 210 | "        \"\"\"Override the method of the parent class\n", 211 | "        \"\"\"\n", 212 | "        # Call the parent class's method with super()\n", 213 | "        super().foo()\n", 214 | "        print('B.foo')\n", 215 | "\n", 
216 | "    def bar(self):\n", 217 | "        \"\"\"Add a new method that the parent class does not have\n", 218 | "        \"\"\"\n", 219 | "        print('B.bar')\n", 220 | "\n", 221 | "\n", 222 | "a = A()\n", 223 | "a.foo()\n", 224 | "b = B()\n", 225 | "b.foo()\n", 226 | "b.bar()" 227 | ] 228 | }, 229 | { 230 | "cell_type": "markdown", 231 | "metadata": { 232 | "ein.tags": "worksheet-0", 233 | "slideshow": { 234 | "slide_type": "-" 235 | } 236 | }, 237 | "source": [ 238 | "The example above shows how to override a parent-class method in a subclass, how to call the parent's method from the subclass, and how to add a new method that the parent does not have.\n", 239 | "\n", 240 | "\n", 241 | "## Encapsulating names in a class\n", 242 | "\n", 243 | "All attributes of a Python class are public. Rather than relying on language features to encapsulate data, Python programmers follow naming conventions for attributes and methods to achieve the same effect.\n", 244 | "\n", 245 | "The first convention is that any name starting with a single underscore `_` should be treated as internal implementation." 246 | ] 247 | }, 248 | { 249 | "cell_type": "code", 250 | "execution_count": null, 251 | "metadata": { 252 | "autoscroll": false, 253 | "collapsed": false, 254 | "ein.hycell": false, 255 | "ein.tags": "worksheet-0", 256 | "slideshow": { 257 | "slide_type": "-" 258 | } 259 | }, 260 | "outputs": [], 261 | "source": [ 262 | "class A():\n", 263 | "    def __init__(self):\n", 264 | "        self._internal = 0 # An internal attribute\n", 265 | "        self.public = 1 # A public attribute\n", 266 | "\n", 267 | "    def public_method(self):\n", 268 | "        \"\"\"A public method\n", 269 | "        \"\"\"\n", 270 | "        pass\n", 271 | "\n", 272 | "    def _internal_method(self):\n", 273 | "        pass" 274 | ] 275 | }, 276 | { 277 | "cell_type": "markdown", 278 | "metadata": { 279 | "ein.tags": "worksheet-0", 280 | "slideshow": { 281 | "slide_type": "-" 282 | } 283 | }, 284 | "source": [ 285 | "Python does not actually prevent anyone from accessing internal names. Doing so is bad practice, though, and can lead to fragile code. Note also that the leading-underscore convention applies to module names and module-level functions as well.\n", 286 | "\n", 287 | "In addition, a name starting with a double underscore `__` is accessed under a rewritten name (name mangling)." 288 | ] 289 | }, 290 | { 291 | "cell_type": "code", 292 | "execution_count": null, 293 | "metadata": { 294 | "autoscroll": false, 295 | "collapsed": false, 296 | "ein.hycell": false, 297 | "ein.tags": "worksheet-0", 298 | "slideshow": { 299 | "slide_type": "-" 300 | } 301 | }, 302 | "outputs": [], 303 | 
"source": [ 304 | "class B():\n", 305 | " def __init__(self):\n", 306 | " self.__private = 0\n", 307 | "\n", 308 | " def __private_method(self):\n", 309 | " pass\n", 310 | "\n", 311 | " def public_method(self):\n", 312 | " self.__private_method()\n", 313 | " pass" 314 | ] 315 | }, 316 | { 317 | "cell_type": "markdown", 318 | "metadata": { 319 | "ein.tags": "worksheet-0", 320 | "slideshow": { 321 | "slide_type": "-" 322 | } 323 | }, 324 | "source": [ 325 | "上面的类B中,私有特性和方法会被重命名为`_B__private`和`_B__private_method`。这样做的目的就是为了继承时无法被覆盖。例如:" 326 | ] 327 | }, 328 | { 329 | "cell_type": "code", 330 | "execution_count": null, 331 | "metadata": { 332 | "autoscroll": false, 333 | "collapsed": false, 334 | "ein.hycell": false, 335 | "ein.tags": "worksheet-0", 336 | "slideshow": { 337 | "slide_type": "-" 338 | } 339 | }, 340 | "outputs": [], 341 | "source": [ 342 | "class C(B):\n", 343 | " def __init__(self):\n", 344 | " super().__init__()\n", 345 | " self.__private = 1 # does not override B.__private\n", 346 | "\n", 347 | " # does not override B.__private_method()\n", 348 | " def __private_method(self):\n", 349 | " pass" 350 | ] 351 | }, 352 | { 353 | "cell_type": "markdown", 354 | "metadata": { 355 | "ein.tags": "worksheet-0", 356 | "slideshow": { 357 | "slide_type": "-" 358 | } 359 | }, 360 | "source": [ 361 | "上面示例中,私有名称`__private`和`__private_method`会被重命名为`_C__private`和`_C__private_method`,这个跟父类B中的名称是完全不同的。\n", 362 | "\n", 363 | "上面提到的两种不同的编码约定(单下划线和双下划线)来命名私有属性应该选择哪一种?大多数而言,你应该让你的非公共名称以单下划线开头。但是,如果你清楚你的代码会涉及到子类,并且有些内部属性应该在子类中隐藏起来,那么你需要考虑使用双下划线方案。\n", 364 | "\n", 365 | "还有一点,有时候你定义的一个变量和某个保留关键字冲突,这时候可以使用单下划线作为后缀:" 366 | ] 367 | }, 368 | { 369 | "cell_type": "code", 370 | "execution_count": null, 371 | "metadata": { 372 | "autoscroll": false, 373 | "collapsed": false, 374 | "ein.hycell": false, 375 | "ein.tags": "worksheet-0", 376 | "slideshow": { 377 | "slide_type": "-" 378 | } 379 | }, 380 | "outputs": [], 381 | "source": [ 382 | "lambda_ = 2.0 # trailing _ to avoid clash 
with lambda keyword" 383 | ] 384 | }, 385 | { 386 | "cell_type": "markdown", 387 | "metadata": { 388 | "ein.tags": "worksheet-0", 389 | "slideshow": { 390 | "slide_type": "-" 391 | } 392 | }, 393 | "source": [ 394 | "A single-underscore prefix is not used here because it would obscure the intent: a reader could take the name for a private one when the goal is only to avoid a naming clash. A single-underscore suffix avoids this problem.\n", 395 | "\n", 396 | "\n", 397 | "## Creating managed attributes\n", 398 | "\n", 399 | "Sometimes we want to control access to an instance attribute, for example to make it read-only or to add type checking or validation. In that case we can define it as a property. For example, the code below defines a property that adds a simple type check to an attribute." 400 | ] 401 | }, 402 | { 403 | "cell_type": "code", 404 | "execution_count": null, 405 | "metadata": { 406 | "autoscroll": false, 407 | "collapsed": false, 408 | "ein.hycell": false, 409 | "ein.tags": "worksheet-0", 410 | "slideshow": { 411 | "slide_type": "-" 412 | } 413 | }, 414 | "outputs": [], 415 | "source": [ 416 | "class Person():\n", 417 | "    def __init__(self, first_name):\n", 418 | "        self.first_name = first_name\n", 419 | "\n", 420 | "    def get_first_name(self):\n", 421 | "        return self._first_name\n", 422 | "\n", 423 | "    def set_first_name(self, value):\n", 424 | "        if not isinstance(value, str):\n", 425 | "            raise TypeError('Expected a string')\n", 426 | "        self._first_name = value\n", 427 | "\n", 428 | "    def del_first_name(self):\n", 429 | "        raise AttributeError(\"Can't delete attribute\")\n", 430 | "\n", 431 | "    first_name = property(get_first_name, set_first_name, del_first_name)\n", 432 | "\n", 433 | "\n", 434 | "a = Person('Mark')\n", 435 | "print(a.first_name)\n", 436 | "a.first_name = 42  # raises TypeError: Expected a string" 437 | ] 438 | }, 439 | { 440 | "cell_type": "code", 441 | "execution_count": null, 442 | "metadata": { 443 | "autoscroll": false, 444 | "collapsed": false, 445 | "ein.hycell": false, 446 | "ein.tags": "worksheet-0", 447 | "slideshow": { 448 | "slide_type": "-" 449 | } 450 | }, 451 | "outputs": [], 452 | "source": [ 453 | "del a.first_name  # raises AttributeError: Can't delete attribute" 454 | ] 455 | }, 456 | { 457 | "cell_type": "markdown", 458 | "metadata": { 459 | "ein.tags": "worksheet-0", 460 | "slideshow": { 461 | 
"slide_type": "-" 462 | } 463 | }, 464 | "source": [ 465 | "上例中,使用`property()`定义了一个属性`first_name`,`property()`的第一个参数是**getter**方法,第二个参数是**setter**方法,第三个参数是**deleter**方法。属性(property)的一个关键特征是它看上去跟普通的特性(attribute)没什么两样,但是访问它的时候会自动触发**getter**、**setter**、**deleter**方法。\n", 466 | "\n", 467 | "在实现一个property的时候,底层数据(如果有的话)仍然需要存储在某个地方。因此,在**getter**和**setter**方法中,你会看到对`_firse_name`的操作,这也是实际数据保存的地方。另外,你可能还会问为什么`__init__()`方法中设置了`self.first_name`而不是`self._first_name`。在这个例子中,创建一个property的目的就是在设置attribute的时候进行检查。因此,你可能想在初始化的时候也进行这种类型检查。通过设置`self.first_name`,自动调用**setter**方法,这个方法里面会进行参数的检查,否则就是直接访问`self._first_name`了。\n", 468 | "\n", 469 | "另一种定义属性的方法是使用装饰器。例如,上面的例子可以写成这样:" 470 | ] 471 | }, 472 | { 473 | "cell_type": "code", 474 | "execution_count": null, 475 | "metadata": { 476 | "autoscroll": false, 477 | "collapsed": false, 478 | "ein.hycell": false, 479 | "ein.tags": "worksheet-0", 480 | "slideshow": { 481 | "slide_type": "-" 482 | } 483 | }, 484 | "outputs": [], 485 | "source": [ 486 | "class Person():\n", 487 | " def __init__(self, first_name):\n", 488 | " self.first_name = first_name\n", 489 | "\n", 490 | " @property\n", 491 | " def first_name(self):\n", 492 | " return self._first_name\n", 493 | "\n", 494 | " @first_name.setter\n", 495 | " def first_name(self, value):\n", 496 | " if not isinstance(value, str):\n", 497 | " raise TypeError('Expected a string')\n", 498 | " self._first_name = value\n", 499 | "\n", 500 | " @first_name.deleter\n", 501 | " def first_name(self):\n", 502 | " raise AttributeError(\"Can't delete attribute\")\n", 503 | "\n", 504 | "\n", 505 | "a = Person('Mark')\n", 506 | "print(a.first_name)\n", 507 | "a.first_name = 42" 508 | ] 509 | }, 510 | { 511 | "cell_type": "code", 512 | "execution_count": null, 513 | "metadata": { 514 | "autoscroll": false, 515 | "collapsed": false, 516 | "ein.hycell": false, 517 | "ein.tags": "worksheet-0", 518 | "slideshow": { 519 | "slide_type": "-" 520 | } 521 | }, 522 | "outputs": [], 523 | "source": [ 524 | "del 
a.first_name  # raises AttributeError: Can't delete attribute" 525 | ] 526 | }, 527 | { 528 | "cell_type": "markdown", 529 | "metadata": { 530 | "ein.tags": "worksheet-0", 531 | "slideshow": { 532 | "slide_type": "-" 533 | } 534 | }, 535 | "source": [ 536 | "The code above has three related methods, and all three must share the same name. `@property` marks the **getter** method and makes `first_name` a property. `@first_name.setter` marks the **setter** method, and `@first_name.deleter` marks the **deleter** method. Note that the decorators `@first_name.setter` and `@first_name.deleter` can only be used after the `first_name` property has been created.\n", 537 | "\n", 538 | "*Note:* Do not write properties that do no extra work.\n", 539 | "\n", 540 | "A property is also a way to define dynamically computed attributes. Such attributes are not actually stored; they are computed on demand." 541 | ] 542 | }, 543 | { 544 | "cell_type": "code", 545 | "execution_count": null, 546 | "metadata": { 547 | "autoscroll": false, 548 | "collapsed": false, 549 | "ein.hycell": false, 550 | "ein.tags": "worksheet-0", 551 | "slideshow": { 552 | "slide_type": "-" 553 | } 554 | }, 555 | "outputs": [], 556 | "source": [ 557 | "import math\n", 558 | "\n", 559 | "\n", 560 | "class Circle():\n", 561 | "    def __init__(self, radius):\n", 562 | "        self.radius = radius\n", 563 | "\n", 564 | "    @property\n", 565 | "    def area(self):\n", 566 | "        return math.pi * self.radius ** 2\n", 567 | "\n", 568 | "    @property\n", 569 | "    def perimeter(self):\n", 570 | "        return 2 * math.pi * self.radius\n", 571 | "\n", 572 | "\n", 573 | "c = Circle(4.0)\n", 574 | "print(c.radius)\n", 575 | "print(c.area)\n", 576 | "print(c.perimeter)" 577 | ] 578 | }, 579 | { 580 | "cell_type": "markdown", 581 | "metadata": { 582 | "ein.tags": "worksheet-0", 583 | "slideshow": { 584 | "slide_type": "-" 585 | } 586 | }, 587 | "source": [ 588 | "Here, using properties unifies the access interface: radius, perimeter, and area are all accessed as attributes, exactly like simple attributes. Otherwise the code would have to mix plain attribute access with method calls.\n", 589 | "\n", 590 | "If you do not define a **setter** for a property (`@area.setter`), its value cannot be assigned. This is useful for read-only attributes:" 591 | ] 592 | }, 593 | { 594 | "cell_type": "code", 595 | "execution_count": null, 596 | "metadata": { 597 | "autoscroll": false, 598 | "collapsed": false, 599 | "ein.hycell": false, 600 | 
"ein.tags": "worksheet-0", 601 | "slideshow": { 602 | "slide_type": "-" 603 | } 604 | }, 605 | "outputs": [], 606 | "source": [ 607 | "c.area = 10 # will raise AttributeError" 608 | ] 609 | }, 610 | { 611 | "cell_type": "markdown", 612 | "metadata": { 613 | "ein.tags": "worksheet-0", 614 | "slideshow": { 615 | "slide_type": "-" 616 | } 617 | }, 618 | "source": [ 619 | "另外,使用property时,如果你改变了某个特性的定义,只需要在类定义里修改相关代码即可,不需要在每一处调用修改。" 620 | ] 621 | }, 622 | { 623 | "cell_type": "code", 624 | "execution_count": null, 625 | "metadata": { 626 | "autoscroll": false, 627 | "collapsed": false, 628 | "ein.hycell": false, 629 | "ein.tags": "worksheet-0", 630 | "slideshow": { 631 | "slide_type": "-" 632 | } 633 | }, 634 | "outputs": [], 635 | "source": [ 636 | "class Circle():\n", 637 | " def __init__(self, diameter):\n", 638 | " self.radius = diameter / 2\n", 639 | "\n", 640 | " @property\n", 641 | " def area(self):\n", 642 | " return math.pi * self.radius ** 2\n", 643 | "\n", 644 | " @property\n", 645 | " def perimeter(self):\n", 646 | " return 2 * math.pi * self.radius\n", 647 | "\n", 648 | "\n", 649 | "c = Circle(8.0)\n", 650 | "print(c.radius)\n", 651 | "print(c.area)\n", 652 | "print(c.perimeter)" 653 | ] 654 | }, 655 | { 656 | "cell_type": "markdown", 657 | "metadata": { 658 | "ein.tags": "worksheet-0", 659 | "slideshow": { 660 | "slide_type": "-" 661 | } 662 | }, 663 | "source": [ 664 | "## 方法类型\n", 665 | "\n", 666 | "在类中定义的方法有3种,分别是:\n", 667 | "\n", 668 | "* **实例方法**(instance method),以`self`作为第一个参数,当其被调用时,Python会把调用该方法的对象作为`self`参数传入。\n", 669 | "* **类方法**(class method),用`@classmethod`装饰器指定,第一个参数是类本身,通常写作`cls`,作用于整个类,对类作出的任何改变会对它的所有实例对象产生影响。\n", 670 | "* **静态方法**(static method),用`@staticmethod`修饰,其参数既不需要`self`,也不需要`cls`,其功能既不影响类也不影响类的实例,仅仅为了组织代码的方便。" 671 | ] 672 | }, 673 | { 674 | "cell_type": "code", 675 | "execution_count": null, 676 | "metadata": { 677 | "autoscroll": false, 678 | "collapsed": false, 679 | "ein.hycell": false, 680 | "ein.tags": "worksheet-0", 681 | 
"slideshow": { 682 | "slide_type": "-" 683 | } 684 | }, 685 | "outputs": [], 686 | "source": [ 687 | "class A():\n", 688 | " count = 0\n", 689 | " def __init__(self):\n", 690 | " A.count += 1\n", 691 | "\n", 692 | " def exclaim(self):\n", 693 | " print('I am an A!')\n", 694 | "\n", 695 | " @classmethod\n", 696 | " def kids(cls):\n", 697 | " print(\"A has\", cls.count, \"little objects.\")\n", 698 | "\n", 699 | " @staticmethod\n", 700 | " def static_method():\n", 701 | " print('Static method called!')\n", 702 | "\n", 703 | "\n", 704 | "easy_a = A()\n", 705 | "breezy_a = A()\n", 706 | "wheezy_a = A()\n", 707 | "A.kids()\n", 708 | "easy_a.exclaim()\n", 709 | "A.exclaim(easy_a)\n", 710 | "A.static_method()" 711 | ] 712 | }, 713 | { 714 | "cell_type": "markdown", 715 | "metadata": { 716 | "ein.tags": "worksheet-0", 717 | "slideshow": { 718 | "slide_type": "-" 719 | } 720 | }, 721 | "source": [ 722 | "## 鸭子类型\n", 723 | "\n", 724 | "在程序设计中,鸭子类型(duck typing)是动态类型的一种风格。在这种风格中,一个对象有效的语义,不是由继承自特定的类或实现特定的接口,而是由\"当前方法和属性的集合\"决定。支持“鸭子类型”的语言的解释器/编译器会在解释或编译时推断对象的类型。在鸭子类型中,关注的不是对象的类型本身,而是它是如何使用的。例如,在不使用鸭子类型的语言中,我们可以编写一个函数,它接受一个类型为“鸭子”的对象,并调用它的“走”和“叫”方法。在使用鸭子类型的语言中,这样的一个函数可以接受一个任意类型的对象,并调用它的“走”和“叫”方法。如果这些需要被调用的方法不存在,那么将引发一个运行时错误。任何拥有这样的正确的“走”和“叫”方法的对象都可被函数接受的这种行为引出了以上表述,这种决定类型的方式因此得名。\n", 725 | "\n", 726 | "在动态语言设计中,可以解释为无论一个对象是什么类型的,只要它具有某类型的行为(方法),则它就是这一类型的实例,而不在于它是否显示的实现或者继承。比如,如果一个对象具备迭代器所具有的所有行为特征,那它就是迭代器了。而如何保证一个对象实现某一种类型的所有特征,则依靠**协议**。\n", 727 | "\n", 728 | "\n", 729 | "## 特殊方法(魔法方法)\n", 730 | "\n", 731 | "特殊方法是指Python类中以双下划线`__`开头和结尾的方法,比如`__init__`,根据类的定义以及传入的参数对新创建的对象进行初始化。\n", 732 | "\n", 733 | "### 构造和初始化\n", 734 | "\n", 735 | "但是当调用`x = SomeClass()`的时候,`__init__`并不是第一个被调用的方法。实际上,还有一个叫做`__new__`的方法,来构造这个实例。然后给在开始创建时候的初始化函数来传递参数。在对象生命周期的另一端,也有一个`__del__`方法(如果`__new__`和`__init__`是对象的构造器的话,那么`__del__`就是析构器),它定义的是当一个对象进行垃圾回收时候的行为。当一个对象在删除的时需要更多的清洁工作的时候此方法会很有用,比如套接字对象或者是文件对象。" 736 | ] 737 | }, 738 | { 739 | "cell_type": "markdown", 740 | "metadata": { 741 | "ein.tags": 
"worksheet-0", 742 | "slideshow": { 743 | "slide_type": "-" 744 | } 745 | }, 746 | "source": [ 747 | "```python\n", 748 | "from os.path import join\n", 749 | "\n", 750 | "class FileObject:\n", 751 | " '''给文件对象进行包装从而确认在删除时关闭文件流'''\n", 752 | "\n", 753 | " def __init__(self, filepath='~', filename='sample.txt'):\n", 754 | " # 读写模式打开一个文件\n", 755 | " self.file = open(join(filepath, filename), 'r+')\n", 756 | "\n", 757 | " def __del__(self):\n", 758 | " self.file.close()\n", 759 | " del self.file\n", 760 | "```" 761 | ] 762 | }, 763 | { 764 | "cell_type": "markdown", 765 | "metadata": { 766 | "ein.tags": "worksheet-0", 767 | "slideshow": { 768 | "slide_type": "-" 769 | } 770 | }, 771 | "source": [ 772 | "### 用于比较的魔术方法\n", 773 | "\n", 774 | "```\n", 775 | "__lt__(self, other) self < other\n", 776 | "__le__(self, other) self <= other\n", 777 | "__eq__(self, other) self == other\n", 778 | "__ne__(self, other) self != other\n", 779 | "__gt__(self, other) self > other\n", 780 | "__ge__(self, other) self >= other\n", 781 | "```\n", 782 | "\n", 783 | "举一个例子,创建一个类来表示一个词语。我们也许会想要比较单词的字典序(通过字母表),通过默认的字符串比较的方法就可以实现,但是我们也想要通过一些其他的标准来实现,比如单词长度或者音节数量。在这个例子中,我们来比较长度实现。以下是实现代码:" 784 | ] 785 | }, 786 | { 787 | "cell_type": "code", 788 | "execution_count": null, 789 | "metadata": { 790 | "autoscroll": false, 791 | "collapsed": false, 792 | "ein.hycell": false, 793 | "ein.tags": "worksheet-0", 794 | "slideshow": { 795 | "slide_type": "-" 796 | } 797 | }, 798 | "outputs": [], 799 | "source": [ 800 | "class Word(str):\n", 801 | " \"\"\"\n", 802 | " 存储单词的类,定义比较单词的几种方法\n", 803 | " \"\"\"\n", 804 | "\n", 805 | " def __new__(cls, word):\n", 806 | " # 注意我们必须要用到__new__方法,因为str是不可变类型\n", 807 | " # 所以我们必须在创建的时候将它初始化\n", 808 | " if ' ' in word:\n", 809 | " print(\"Value contains spaces. 
Truncating to first space.\")\n", 810 | "            word = word[:word.index(' ')]  # the word is everything before the first space\n", 811 | "        return str.__new__(cls, word)\n", 812 | "\n", 813 | "    def __gt__(self, other):\n", 814 | "        return len(self) > len(other)\n", 815 | "    def __lt__(self, other):\n", 816 | "        return len(self) < len(other)\n", 817 | "    def __ge__(self, other):\n", 818 | "        return len(self) >= len(other)\n", 819 | "    def __le__(self, other):\n", 820 | "        return len(self) <= len(other)\n", 821 | "\n", 822 | "foo = Word('foo')\n", 823 | "bar = Word('bar')\n", 824 | "foo > bar # False\n", 825 | "foo < bar # False\n", 826 | "foo >= bar # True\n", 827 | "foo <= bar # True" 828 | ] 829 | }, 830 | { 831 | "cell_type": "markdown", 832 | "metadata": { 833 | "ein.tags": "worksheet-0", 834 | "slideshow": { 835 | "slide_type": "-" 836 | } 837 | }, 838 | "source": [ 839 | "### Common arithmetic operators\n", 840 | "\n", 841 | "```\n", 842 | "__add__(self, other)  addition +\n", 843 | "__sub__(self, other)  subtraction -\n", 844 | "__mul__(self, other)  multiplication *\n", 845 | "__floordiv__(self, other)  floor division //\n", 846 | "__truediv__(self, other)  true division /\n", 847 | "__mod__(self, other)  modulo %\n", 848 | "__divmod__(self, other)  built-in divmod() function\n", 849 | "__pow__(self, other)  exponentiation **\n", 850 | "__lshift__(self, other)  left shift <<\n", 851 | "__rshift__(self, other)  right shift >>\n", 852 | "__and__(self, other)  bitwise AND &\n", 853 | "__or__(self, other)  bitwise OR |\n", 854 | "__xor__(self, other)  bitwise XOR ^\n", 855 | "```\n", 856 | "\n", 857 | "### Other magic methods\n", 858 | "\n", 859 | "* `__str__(self)` implements `str(self)` and defines how the object is printed; `print()`, `str()`, and string formatting all use `__str__()`.\n", 860 | "* `__repr__(self)` implements `repr(self)`; the interactive interpreter uses it to display values.\n", 861 | "* `__len__(self)` implements `len(self)`." 862 | ] 863 | }, 864 | { 865 | "cell_type": "code", 866 | "execution_count": null, 867 | "metadata": { 868 | "autoscroll": false, 869 | "collapsed": false, 870 | "ein.hycell": false, 871 | "ein.tags": "worksheet-0", 872 | "slideshow": { 873 | "slide_type": "-" 874 | } 875 | }, 876 | "outputs": [], 877 | "source": [ 878 
| "class Word():\n", 879 | "    def __init__(self, text):\n", 880 | "        self.text = text\n", 881 | "    def __eq__(self, word2):\n", 882 | "        return self.text.lower() == word2.text.lower()\n", 883 | "    def __str__(self):\n", 884 | "        return self.text\n", 885 | "    def __repr__(self):\n", 886 | "        return 'Word(\"' + self.text + '\")'\n", 887 | "\n", 888 | "first = Word('ha')\n", 889 | "print(first)\n", 890 | "first" 891 | ] 892 | }, 893 | { 894 | "cell_type": "markdown", 895 | "metadata": { 896 | "ein.tags": "worksheet-0", 897 | "slideshow": { 898 | "slide_type": "-" 899 | } 900 | }, 901 | "source": [ 902 | "For more on magic methods, see the Python online documentation ( https://docs.python.org/3/reference/datamodel.html#special-method-names ).\n", 903 | "\n", 904 | "## Composition\n", 905 | "\n", 906 | "Two kinds of relationship between classes:\n", 907 | "\n", 908 | "* **is-a**: a subclass is a special case of its parent class\n", 909 | "* **has-a**: one type contains another type" 910 | ] 911 | }, 912 | { 913 | "cell_type": "code", 914 | "execution_count": null, 915 | "metadata": { 916 | "autoscroll": false, 917 | "collapsed": false, 918 | "ein.hycell": false, 919 | "ein.tags": "worksheet-0", 920 | "slideshow": { 921 | "slide_type": "-" 922 | } 923 | }, 924 | "outputs": [], 925 | "source": [ 926 | "class Bill():\n", 927 | "    def __init__(self, description):\n", 928 | "        self.description = description\n", 929 | "\n", 930 | "\n", 931 | "class Tail():\n", 932 | "    def __init__(self, length):\n", 933 | "        self.length = length\n", 934 | "\n", 935 | "\n", 936 | "class Duck():\n", 937 | "    def __init__(self, bill, tail):\n", 938 | "        self.bill = bill\n", 939 | "        self.tail = tail\n", 940 | "\n", 941 | "    def about(self):\n", 942 | "        print('This duck has a', self.bill.description,\n", 943 | "              'bill and a', self.tail.length, 'tail.')\n", 944 | "\n", 945 | "\n", 946 | "tail = Tail('long')\n", 947 | "bill = Bill('wide orange')\n", 948 | "duck = Duck(bill, tail)\n", 949 | "duck.about()" 950 | ] 951 | }, 952 | { 953 | "cell_type": "markdown", 954 | "metadata": { 955 | "ein.tags": "worksheet-0", 956 | "slideshow": { 957 | 
"slide_type": "-" 958 | } 959 | }, 960 | "source": [ 961 | "## 命名元组\n", 962 | "\n", 963 | "命名元组是元组的子类,其允许使用名称而不是只是索引访问其元素的元组!" 964 | ] 965 | }, 966 | { 967 | "cell_type": "code", 968 | "execution_count": null, 969 | "metadata": { 970 | "autoscroll": false, 971 | "collapsed": false, 972 | "ein.hycell": false, 973 | "ein.tags": "worksheet-0", 974 | "slideshow": { 975 | "slide_type": "-" 976 | } 977 | }, 978 | "outputs": [], 979 | "source": [ 980 | "from collections import namedtuple\n", 981 | "\n", 982 | "# 创建命名元组\n", 983 | "Student = namedtuple('Student', 'first_name last_name grade')\n", 984 | "# Student = namedtuple('Student', ['first_name', 'last_name', 'grade'])\n", 985 | "astudent = Student('Lisa', 'Simpson', 'A')\n", 986 | "# 使用字典构造命名元组\n", 987 | "# intro = {'first_name': 'Lisa', 'last_name': 'Simpson', 'grade': 'A'}\n", 988 | "# astudent = Student(**intro)\n", 989 | "print(type(Student))\n", 990 | "print(type(astudent))\n", 991 | "print(astudent)\n", 992 | "print(astudent.first_name)\n", 993 | "print(astudent.last_name)\n", 994 | "print(astudent.grade)" 995 | ] 996 | }, 997 | { 998 | "cell_type": "markdown", 999 | "metadata": { 1000 | "ein.tags": "worksheet-0", 1001 | "slideshow": { 1002 | "slide_type": "-" 1003 | } 1004 | }, 1005 | "source": [ 1006 | "将命名元组以有序字典(OrderedDict)形式返回:" 1007 | ] 1008 | }, 1009 | { 1010 | "cell_type": "code", 1011 | "execution_count": null, 1012 | "metadata": { 1013 | "autoscroll": false, 1014 | "collapsed": false, 1015 | "ein.hycell": false, 1016 | "ein.tags": "worksheet-0", 1017 | "slideshow": { 1018 | "slide_type": "-" 1019 | } 1020 | }, 1021 | "outputs": [], 1022 | "source": [ 1023 | "astudent._asdict()" 1024 | ] 1025 | }, 1026 | { 1027 | "cell_type": "markdown", 1028 | "metadata": { 1029 | "ein.tags": "worksheet-0", 1030 | "slideshow": { 1031 | "slide_type": "-" 1032 | } 1033 | }, 1034 | "source": [ 1035 | "命名元组是不可变的,但可替换其中某些域的值并返回一个新命名元组:" 1036 | ] 1037 | }, 1038 | { 1039 | "cell_type": "code", 1040 | "execution_count": null, 
1041 | "metadata": { 1042 | "autoscroll": false, 1043 | "collapsed": false, 1044 | "ein.hycell": false, 1045 | "ein.tags": "worksheet-0", 1046 | "slideshow": { 1047 | "slide_type": "-" 1048 | } 1049 | }, 1050 | "outputs": [], 1051 | "source": [ 1052 | "bstudent = astudent._replace(first_name='Bart', grade='C')\n", 1053 | "bstudent" 1054 | ] 1055 | }, 1056 | { 1057 | "cell_type": "markdown", 1058 | "metadata": { 1059 | "ein.tags": "worksheet-0", 1060 | "slideshow": { 1061 | "slide_type": "-" 1062 | } 1063 | }, 1064 | "source": [ 1065 | "Benefits of named tuples:\n", 1066 | "\n", 1067 | "* They can be used as immutable objects;\n", 1068 | "* Attributes are accessed with dot notation `.`, which makes code clearer and simpler;\n", 1069 | "* Compared with ordinary objects, named tuples are more efficient in time and space." 1070 | ] 1071 | }, 1072 | { 1073 | "cell_type": "markdown", 1074 | "metadata": { 1075 | "ein.tags": "worksheet-0", 1076 | "slideshow": { 1077 | "slide_type": "-" 1078 | } 1079 | }, 1080 | "source": [ 1081 | "## Classes, objects, or modules?\n", 1082 | "\n", 1083 | "Solve the problem in the simplest way. Built-in data types such as tuples (named tuples), dictionaries, and lists are simpler to use than modules, and classes are more complex still." 1084 | ] 1085 | } 1086 | ], 1087 | "metadata": { 1088 | "kernelspec": { 1089 | "display_name": "Python 3", 1090 | "name": "python3" 1091 | }, 1092 | "language_info": { 1093 | "codemirror_mode": { 1094 | "name": "ipython", 1095 | "version": 3 1096 | }, 1097 | "file_extension": ".py", 1098 | "mimetype": "text/x-python", 1099 | "name": "python", 1100 | "nbconvert_exporter": "python", 1101 | "pygments_lexer": "ipython3", 1102 | "version": "3.6.1" 1103 | }, 1104 | "name": "9_objects_and_classes.ipynb" 1105 | }, 1106 | "nbformat": 4, 1107 | "nbformat_minor": 2 1108 | } 1109 | -------------------------------------------------------------------------------- /课件/README.md: -------------------------------------------------------------------------------- 1 | # "Python Scientific Computing Ecosystem" Course Slides 2 | 3 | 1. [Course Introduction & A First Taste of Python](./1_a_taste_of_python.pdf) 4 | 2. [Python Variables and Data Types](./2_python_ingredients.ipynb) 5 | 3. [Python Variables and Data Types: Strings](./3_strings.ipynb) 6 | 4. [Python Basic Data Structures](./4_py_filling.ipynb) 7 | 5. 
[Python Code Structure](./5_code_structure.ipynb) 8 | 6. [Python Errors and Exceptions](./6_exceptions.ipynb) 9 | 7. [Python Functions](./7_function.ipynb) 10 | 8. [Python Modules, Packages, and Programs](./8_modules_pacakges_programs.ipynb) 11 | 9. [Python Objects and Classes](./9_objects_and_classes.ipynb) 12 | 10. [Python Data Manipulation](./10_mangle_data.ipynb) 13 | 11. [Python File I/O](./11_file_io_and_structured_text_files.ipynb) 14 | 12. [Python System Management](./12_system_management.ipynb) 15 | 13. [Python Regular Expressions](./13_regular_expressions.ipynb) 16 | 14. [Python Sorting](./14_sort.ipynb) 17 | 15. [Python else Blocks and Copying](./15_else-and-copy.ipynb) 18 | 16. [NumPy Basics](./16_numpy.ipynb) 19 | -------------------------------------------------------------------------------- /课件/images/python_object.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/edu2act/course-PySCE/335a5ccd782d57a6641fb5e7861413f645cc93c9/课件/images/python_object.png --------------------------------------------------------------------------------