├── 01章-Python的数据模型.md ├── 04章-文本与字节.md ├── 05章-一级函数.md ├── 06章-.md ├── 07章-函数装饰器和闭包.md ├── 08章-对象的引用、可变性和回收.md ├── 11章-抽象基类.md ├── 12章-继承该如何是好.md ├── 13章-运算符重载.md ├── 14章-可迭代之迭代器和生成器.md ├── 15章-上下文管理和else语句块.md ├── 16章-协程.md ├── 17章.md ├── 18-asyncio.md ├── 19章-属性和特性.md ├── 20章-属性描述符.md ├── LICENSE ├── README.md ├── appendix └── _collection_abc.py └── images ├── 14-1.png ├── c16_1.png ├── c16_2.png ├── c7_1.png └── the_qrcode_for_qq_group.png /01章-Python的数据模型.md: -------------------------------------------------------------------------------- 1 | Chapter 1. The Python Data Model 2 | 第一章-Python数据模型 3 | ******************************** 4 | 5 | *Guido’s sense of the aesthetics of language design is amazing. I’ve met many fine language designers who could build theoretically beautiful languages that no one would ever use, but Guido is one of those rare people who can build a language that is just slightly less theoretically beautiful but thereby is a joy to write programs in[3].* 6 | 7 | *— Jim Hugunin creator of Jython, co-creator of AspectJ, architect of the .Net DLR* 8 | 9 | One of the best qualities of Python is its consistency. After working with Python for a while, you are able to start making informed, correct guesses about features that are new to you. 10 | 11 | Python的一个最佳特质就是其具有的一致性。在使用Python一段时间后,你就能够知道并正确地猜测要使用的新功能。   12 | 13 | However, if you learned another object oriented language before Python, you may have found it strange to spell `len(collection)` instead of `collection.len()`. This apparent oddity is the tip of an iceberg which, when properly understood, is the key to everything we call `Pythonic`. The iceberg is called the Python Data Model, and it describes the API that you can use to make your own objects play well with the most idiomatic language features. 14 | 15 | 不过,要是你在学些Python之前,学习过其他的面向对象语言,你会奇怪地发现Python使用的是*len(collection)*而不是*collection.len().*这只是明显奇怪的的做法只是冰山一角,当你能够正确理解后,这也是我们称之“Python范儿”的关键。而冰山则称为Python数据模型,它描述了能够让你使用它来构建符合大多数语言特性的个人项目。 16 | 17 | You can think of the Data Model as a description of Python as a framework. It formalizes the interfaces of the building blocks of the language itself, such as sequences, iterators, functions, classes, context managers and so on. 18 | 19 | 你可以认为数据模型作为一个使用Python语言描述的框架。它将语言自身所构建的语句块接口正式化了,比如队列,迭代器,函数,类,上下文管理器等等。 20 | 21 | While coding with any framework, you spend a lot of time implementing methods that are called by the framework. The same happens when you leverage the Python Data Model. The Python interpreter invokes special methods to perform basic object operations, often triggered by special syntax. The special method names are always spelled with leading and trailing double underscores, i.e. `__getitem__`. For example, the syntax `obj[key]` is supported by the `__getitem__` special method. To evaluate `my_collection[key]`, the interpreter calls `my_collection.__getitem__(key)`. 
22 | 23 | 在使用框架编写代码时,实际上你花了很多时间来实现可以被框架调用的方法。同样的事情也发生在你处理Python数据模型时。Python解释器调用会调用特殊方法以执行基本的对象操作,这个操作经常是通过对特殊语法的触发来实现。特殊方法名称常常利用首位的双下划线拼写而成,例如,*__getitem__*。例如,语法*obj[key]*由特殊方法*__getitem__*提供支持。为了计算*my_collection[key]*,解释器需要调用*my_collection.__getitem__(key)*。 24 | 25 | The special method names allow your objects to implement, support and interact with basic language constructs such as: 26 | 27 | 特殊方法名称可以使对象实现并支持与基本语言结构的交互,比如: 28 | 29 | - iteration; 30 | - collections; 31 | - attribute access; 32 | - operator overloading; 33 | - function and method invocation; 34 | - object creation and destruction; 35 | - string representation and formatting; 36 | - managed contexts (i.e. *with* blocks); 37 | 38 | - 迭代; 39 | - 集合; 40 | - 属性访问; 41 | - 运算符重载; 42 | - 函数及方法的调用; 43 | - 对象创建与销毁; 44 | - 字符串的重表示与格式化; 45 | - 管理上下文(例如,with语句块); 46 | 47 | 48 | ##### MAGIC AND DUNDER 魔法和双下划线 49 | The term magic method is slang for special method, but when talking about a specific method like `__getitem__`, some Python developers take the shortcut of saying “under-under-getitem” which is ambiguous, since the syntax __x has another special meaning[4]. But being precise and pronouncing “under-under-getitem-under-under” is tiresome, so I follow the lead of author and teacher Steve Holden and say “dunder-getitem”. All experienced Pythonistas understand that shortcut. As a result, the special methods are also known as dunder methods [5]. 50 | 51 | 术语魔法方式是特殊方法的俚语,当我们谈论一个类似`__getitem__`的专有方法时,有时候Python开发者会简称为“under-under-getitem”,这让其他人感到困惑,因为语法__x拥有特别的意义。 52 | 53 | ## A Pythonic Card Deck Python风格纸牌游戏 54 | The following is a very simple example, but it demonstrates the power of implementing just two special methods, `__getitem__` and `__len__`. 55 | 56 | 下面是一个非常简单的例子,但它足以说明只实现了`__getitem__` 和`__len__`特殊方法的强大之处: 57 | 58 | Example 1-1 is a class to represent a deck of playing cards: 59 | 60 | 例子1-1是一个表示一副纸牌游戏的例子: 61 | 62 | *Example 1-1. A deck as a sequence of cards.* 63 | 例子1-1. 一副按照顺序排列的纸牌。 64 | 65 | ```python 66 | import collections 67 | 68 | Card = collections.namedtuple('Card', ['rank', 'suit']) 69 | 70 | 71 | class FrenchDeck: 72 | ranks = [str(n) for n in range(2, 11)] + list('JQKA') 73 | suits = 'spades diamonds clubs hearts'.split() 74 | 75 | def __init__(self): 76 | self._cards = [Card(rank, suit) for suit in self.suits 77 | for rank in self.ranks] 78 | 79 | def __len__(self): 80 | return len(self._cards) 81 | 82 | def __getitem__(self, position): 83 | return self._cards[position] 84 | ``` 85 | 86 | The first thing to note is the use of collections.namedtuple to construct a simple class to represent individual cards. Since Python 2.6, namedtuple can be used to build classes of objects that are just bundles of attributes with no custom methods, like a database record. In the example we use it to provide a nice representation for the cards in the deck, as shown in the console session: 87 | 88 | 要注意的第一件事情是使用collections.namedtuple构建一个简单的类来表示每张牌。从Python2.6起,namedtuple可以向数据库记录一样,只使用一组没有自定义方法的属性来构建对象的类。在这个例子中我们用它来为整副牌提供一个好看的外观,一如控制台会话所示: 89 | 90 | ```python 91 | >>> beer_card = Card('7', 'diamonds') 92 | >>> beer_card 93 | Card(rank='7', suit='diamonds') 94 | ``` 95 | 96 | But the point of this example is the `FrenchDeck` class. It’s short, but it packs a punch. First, like any standard Python collection, a deck responds to the `len()` function by returning the number of cards in it. 
97 | 98 | ```python 99 | >>> deck = FrenchDeck() 100 | >>> len(deck) 101 | 52 102 | ``` 103 | 104 | 但是这个例子的关键点在于类`FrenchDeck`。这个类的代码略少,但是它直中要害。首先,和任何其他的Python集合一样,deck通过返回纸牌的数量已响应`len()`函数: 105 | 106 | Reading specific cards from the deck, say, the first or the last, should be as easy as `deck[0]` or `deck[-1]`, and this is what the `__getitem__` method provides. 107 | 108 | 从整副牌中读取指定的牌,我是说第一张或者最后一张,应该简单的使用`deck[0]` 或者 `deck[-1]`,这也正是`__getitem__`方法所提供的功能。 109 | 110 | ```python 111 | >>> deck[0] 112 | Card(rank='2', suit='spades') 113 | >>> deck[-1] 114 | Card(rank='A', suit='hearts') 115 | ``` 116 | 117 | Should we create a method to pick a random card? No need. Python already has a function to get a random item from a sequence: random.choice. We can just use it on a deck instance: 118 | 119 | ```python 120 | >>> from random import choice 121 | >>> choice(deck) 122 | Card(rank='3', suit='hearts') 123 | >>> choice(deck) 124 | Card(rank='K', suit='spades') 125 | >>> choice(deck) 126 | Card(rank='2', suit='clubs') 127 | ``` 128 | 129 | We’ve just seen two advantages of using special methods to leverage the Python Data Model: 130 | 131 | 1. The users of your classes don’t have to memorize arbitrary method names for standard operations (“How to get the number of items? Is it .size() .length() or what?”) 132 | 133 | 2. It’s easier to benefit from the rich Python standard library and avoid reinventing the wheel, like the random.choice function. 134 | 135 | 136 | But it gets better. 137 | 138 | Because our `__getitem__ `delegates to the `[]` operator of `self._cards`, our deck automatically supports slicing. Here’s how we look at the top three cards from a brand new deck, and then pick just the aces by starting on index 12 and skipping 13 cards at a time: 139 | 140 | ```python 141 | >>> deck[:3] 142 | [Card(rank='2', suit='spades'), Card(rank='3', suit='spades'), 143 | Card(rank='4', suit='spades')] 144 | >>> deck[12::13] 145 | [Card(rank='A', suit='spades'), Card(rank='A', suit='diamonds'), 146 | Card(rank='A', suit='clubs'), Card(rank='A', suit='hearts')] 147 | ``` 148 | 149 | Just by implementing the `__getitem__` special method, our deck is also iterable: 150 | ```python 151 | >>> for card in deck: # doctest: +ELLIPSIS 152 | ... print(card) 153 | Card(rank='2', suit='spades') 154 | Card(rank='3', suit='spades') 155 | Card(rank='4', suit='spades') 156 | ... 157 | ``` 158 | 159 | The deck can also be iterated in reverse: 160 | 161 | ```python 162 | >>> for card in reversed(deck): # doctest: +ELLIPSIS 163 | ... print(card) 164 | Card(rank='A', suit='hearts') 165 | Card(rank='K', suit='hearts') 166 | Card(rank='Q', suit='hearts') 167 | ... 168 | ``` 169 | 170 | ##### ELLIPSIS IN DOCTESTS 171 | Whenever possible, the Python console listings in this book were extracted from doctests to insure accuracy. When the output was too long, the elided part is marked by an ellipsis ... like in the last line above. In such cases, we used the `#` doctest: +ELLIPSIS directive to make the doctest pass. If you are trying these examples in the interactive console, you may omit the doctest directives altogether. 172 | 173 | Iteration is often implicit. If a collection has no `__contains__` method, the in operator does a sequential scan. Case in point: in works with our FrenchDeck class because it is iterable. Check it out: 174 | 175 | ```python 176 | >>> Card('Q', 'hearts') in deck 177 | True 178 | >>> Card('7', 'beasts') in deck 179 | False 180 | ``` 181 | 182 | How about sorting? 
A common system of ranking cards is by rank (with aces being highest), then by suit in the order: spades (highest), then hearts, diamonds and clubs (lowest). Here is a function that ranks cards by that rule, returning 0 for the 2 of clubs and 51 for the ace of spades: 183 | 184 | ```python 185 | suit_values = dict(spades=3, hearts=2, diamonds=1, clubs=0) 186 | 187 | def spades_high(card): 188 | rank_value = FrenchDeck.ranks.index(card.rank) 189 | return rank_value * len(suit_values) + suit_values[card.suit] 190 | ``` 191 | 192 | Given `spades_high`, we can now list our deck in order of increasing rank: 193 | ```python 194 | >>> for card in sorted(deck, key=spades_high): # doctest: +ELLIPSIS 195 | ... print(card) 196 | Card(rank='2', suit='clubs') 197 | Card(rank='2', suit='diamonds') 198 | Card(rank='2', suit='hearts') 199 | ... (46 cards ommitted) 200 | Card(rank='A', suit='diamonds') 201 | Card(rank='A', suit='hearts') 202 | Card(rank='A', suit='spades') 203 | ``` 204 | 205 | Although FrenchDeck implicitly inherits from `object`[6], its functionality is not inherited, but comes from leveraging the Data Model and composition. By implementing the special methods `__len__` and `__getitem__` our `FrenchDeck` behaves like a standard Python sequence, allowing it to benefit from core language features—like iteration and slicing—and from the standard library, as shown by the examples using `random.choice`, `reversed` and `sorted`. Thanks to composition, the `__len__` and `__getitem__` implementations can hand off all the work to a `list` object, `self._cards`. 206 | 207 | ##### HOW ABOUT SHUFFLING? 208 | As implemented so far, a `FrenchDeck` cannot be shuffled, because it is *immutable*: the cards and their positions cannot be changed, except by violating encapsulation and handling the `_cards` attribute directly. In Chapter 11 that will be fixed by adding a one-line `__setitem__` method. 209 | 210 | ## How special methods are used 211 | The first thing to know about special methods is that they are meant to be called by the Python interpreter, and not by you. You don’t write `my_object.__len__()`. You write `len(my_object)` and, if `my_object` is an instance of a user defined class, then Python calls the `__len__` instance method you implemented. 212 | 213 | But for built-in types like `list`, `str`, `bytearray` etc., the interpreter takes a shortcut: the CPython implementation of `len()` actually returns the value of the `ob_size` field in the `PyVarObject C` struct that represents any variable-sized built-in object in memory. This is much faster than calling a method. 214 | 215 | More often than not, the special method call is implicit. For example, the statement `for i in x`: actually causes the invocation of `iter(x)` which in turn may call `x.__iter__()` if that is available. 216 | 217 | Normally, your code should not have many direct calls to special methods. Unless you are doing a lot of metaprogramming, you should be implementing special methods more often than invoking them explicitly. The only special method that is frequently called by user code directly is `__init__`, to invoke the initializer of the superclass in your own `__init__` implementation. 218 | If you need to invoke a special method, it is usually better to call the related built-in function, such as `len`, `iter`, `str` etc. These built-ins call the corresponding special method, but often provide other services and—for built-in types—are faster than method calls. 
See for example `A closer look at the iter function` in `Chapter 14`. 219 | 220 | Avoid creating arbitrary, custom attributes with the `__foo__` syntax because such names may acquire special meanings in the future, even if they are unused today. 221 | 222 | ### Emulating numeric types 223 | Several special methods allow user objects to respond to operators such as +. We will cover that in more detail in Chapter 13, but here our goal is to further illustrate the use of special methods through another simple example. 224 | 225 | We will implement a class to represent 2-dimensional vectors, i.e. Euclidean vectors like those used in math and physics (see Figure 1-1). 226 | 227 | 图片:略 228 | 229 | Figure 1-1. Example of 2D vector addition. Vector(2, 4) + Vector(2, 1) results in Vector(4, 5). 230 | 231 | ##### TIP 232 | The built-in complex type can be used to represent 2D vectors, but our class can be extended to represent n-dimensional vectors. We will do that in Chapter 14. 233 | 234 | We will start by designing the `API` for such a class by writing a simulated console session which we can use later as doctest. The following snippet tests the vector addition pictured in Figure 1-1: 235 | 236 | ```python 237 | >>> v1 = Vector(2, 4) 238 | >>> v2 = Vector(2, 1) 239 | >>> v1 + v2 240 | Vector(4, 5) 241 | ``` 242 | 243 | Note how the + operator produces a `Vector` result which is displayed in a friendly manner in the console. 244 | 245 | The `abs` built-in function returns the absolute value of integers and floats, and the magnitude of `complex` numbers, so to be consistent our `API` also uses `abs` to calculate the magnitude of a vector: 246 | 247 | ```python 248 | >>> v = Vector(3, 4) 249 | >>> abs(v) 250 | 5.0 251 | ``` 252 | 253 | We can also implement the `*` operator to perform scalar multiplication, i.e. multiplying a vector by a number to produce a new vector with the same direction and a multiplied magnitude: 254 | 255 | ```python 256 | >>> v * 3 257 | Vector(9, 12) 258 | >>> abs(v * 3) 259 | 15.0 260 | ``` 261 | 262 | Example 1-2 is a `Vector` class implementing the operations just described, through the use of the special methods `__repr__`, `__abs__`, `__add__` and `__mul__`: 263 | 264 | *Example 1-2. A simple 2D vector class.* 265 | 266 | ```python 267 | from math import hypot 268 | 269 | class Vector: 270 | 271 | def __init__(self, x=0, y=0): 272 | self.x = x 273 | self.y = y 274 | 275 | def __repr__(self): 276 | return 'Vector(%r, %r)' % (self.x, self.y) 277 | 278 | def __abs__(self): 279 | return hypot(self.x, self.y) 280 | 281 | def __bool__(self): 282 | return bool(abs(self)) 283 | 284 | def __add__(self, other): 285 | x = self.x + other.x 286 | y = self.y + other.y 287 | return Vector(x, y) 288 | 289 | def __mul__(self, scalar): 290 | return Vector(self.x * scalar, self.y * scalar) 291 | ``` 292 | 293 | Note that although we implemented four special methods (apart from `__init__`), none of them is directly called within the class or in the typical usage of the class illustrated by the console listings. As mentioned before, the Python interpreter is the only frequent caller of most special methods. In the next sections we discuss the code for each special method. 294 | 295 | ### String representation 296 | The `__repr__` special method is called by the `repr` built-in to get string representation of the object for inspection. 
If we did not implement `__repr__`, vector instances would be shown in the console like `.` 297 | 298 | The interactive console and debugger call `repr` on the results of the expressions evaluated, as does the `'%r'` place holder in classic formatting with `%` operator, and the `!r` conversion field in the new Format String Syntax used in the `str.format` method[7]. 299 | 300 | Note that in our `__repr__` implementation we used `%r` to obtain the standard representation of the attributes to be displayed. This is good practice, as it shows the crucial difference between `Vector(1, 2)` and `Vector('1', '2')`—the latter would not work in the context of this example, because the constructors arguments must be numbers, not `str`. 301 | 302 | The string returned by `__repr__` should be unambiguous and, if possible, match the source code necessary to recreate the object being represented. That is why our chosen representation looks like calling the constructor of the class, e.g. `Vector(3, 4)`. 303 | 304 | Contrast `__repr__` with with `__str__`, which is called by the `str()` constructor and implicitly used by the print function. `__str__` should return a string suitable for display to end-users. 305 | 306 | If you only implement one of these special methods, choose `__repr__`, because when no custom `__str__` is available, Python will call `__repr__` as a fallback. 307 | 308 | ##### TIP 309 | Difference between `__str__` and `__repr__` in Python is a StackOverflow question with excellent contributions from Pythonistas Alex Martelli and Martijn Pieters. 310 | 311 | ### Arithmetic operators 312 | Example 1-2 implements two operators: `+` and `*`, to show basic usage of `__add__` and `__mul__`. Note that in both cases, the methods create and return a new instance of Vector, and do not modify either operand—`self` or `other` are merely read. This is the expected behavior of infix operators: to create new objects and not touch their operands. I will have a lot more to say about that in Chapter 13. 313 | 314 | ##### WARNING 315 | As implemented, Example 1-2 allows multiplying a `Vector` by a number, but not a number by a `Vector`, which violates the commutative property of multiplication. We will fix that with the special method `__rmul__` in Chapter 13. 316 | 317 | ### Boolean value of a custom type 自定义类型的布尔值 318 | Although Python has a `bool` type, it accepts any object in a boolean context, such as the expression controlling an `if` or `while` statement, or as operands to `and,` `or` and `not`. To determine whether a value `x` is `truthy` or `falsy`, Python applies `bool(x)`, which always returns `True` or `False`. 319 | 320 | By default, instances of user-defined classes are considered truthy, unless either `__bool__` or `__len__` is implemented. Basically, `bool(x)` calls`x.__bool__()` and uses the result. If `__bool__` is not implemented, Python tries to invoke `x.__len__()`, and if that returns `zero`, `bool` returns `False`. Otherwise `bool` returns `True`. 321 | 322 | Our implementation of `__bool__` is conceptually simple: it returns `False` if the magnitude of the vector is `zero`, `True` otherwise. We convert the magnitude to a boolean using `bool(abs(self))` because `__bool__` is expected to return a boolean. 323 | 324 | Note how the special method `__bool__` allows your objects to be consistent with the truth value testing rules defined in the Built-in Types chapter of the Python Standard Library documentation. 
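For instance, with the `Vector` class from Example 1-2, truth testing works as you would expect. The following console listing is a minimal sketch, not one of the book's numbered examples, and assumes the class is defined exactly as shown above:

```python
>>> v = Vector(3, 4)
>>> bool(v)             # magnitude is 5.0, a truthy float
True
>>> bool(Vector(0, 0))  # magnitude is 0.0, so the vector is falsy
False
```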
325 | 326 | ##### NOTE 327 | A faster implementation of `Vector.__bool__` is this: 328 | 329 | ```python 330 | def __bool__(self): 331 | return bool(self.x or self.y) 332 | ``` 333 | 334 | This is harder to read, but avoids the trip through `abs`, `__abs__`, the squares and square root. The explicit conversion to `bool` is needed because `__bool__` must return a boolean and or returns either operand as is: `x` or `y` evaluates to `x` if that is `truthy`, otherwise the result is `y`, whatever that is. 335 | 336 | ## Overview of special methods 337 | The `Data Model` page of the Python Language Reference lists 83 special method names, 47 of which are used to implement arithmetic, bitwise and comparison operators. 338 | 339 | As an overview of what is available, see Table 1-1 and Table 1-2. 340 | 341 | ##### NOTE 342 | The grouping shown in the following tables is not exactly the same as in the official documentation. 343 | 344 | *Table 1-1. Special method names (operators excluded).* 345 | 346 | |category |method names | 347 | |---------------------------|:---------------------------------------------| 348 | string/bytes representation |`__repr__`,`__str__`,`__format__`,`__bytes__` 349 | 350 | 351 | -------------------------------------------------------------------------------- /04章-文本与字节.md: -------------------------------------------------------------------------------- 1 | 第四章-文本与字节 2 | **************** 3 | 4 | >人类使用文本交流,而计算机则使用字节通信。———埃斯特。纳姆,和特雷斯。吠舍。*字符解码与Python中的Unicode* 5 | 6 | Python 3在人类可读文本字符串和原始字节之间引入了非常明显的差异。含混的字节序列转换到Unicode文本已经成为往事。本章着眼于Unicode字符串,二进制序列以及用于转换前两者的编码。 7 | 8 | 依据Python编程的上下文不同,更深入的理解Unicode对你来说或许很重要或许也不是很重要。最后,本章中遇到的大多数问题都不会影响到那些仅处理ASCII文本的程序员。不过要是你遇到此类问题,也没有必要去讲str转义为byte分开来。这样做带来的好处是,你会发现专门的二进制序列类型提供了Python 2 中的str类型没有提供的“万能”功能。 9 | 10 | 在这一章我们会谈到以下话题: 11 | 12 | - characters, code points and byte representations; 13 | - 二进制序列独一无二的功能以及早期字符集合; 14 | - 避免并处理编码错误; 15 | - 处理文本的最佳实践; 16 | - 默认的编码陷阱与标准I/O问题; 17 | - 安全的Unicode文本正规化比较; 18 | - 正规化的多用途函数,case folding and brute-force diacritic removal; 19 | - proper sorting of Unicode text with locale and the PyUCA library; 20 | - Unicode数据库中的字符元类; 21 | - 能够处理str和bytes的双模式API; 22 | 23 | Let’s start with the characters, code points and bytes. 24 | 25 | ## 字符的问题 26 | “字符串“的概念太简单了:字符串是一个字符列表。而问题则出现在”字符“定义中。 27 | 28 | 在2014年我们所知道的最佳“字符”定义就是Unicode字符。因此,你从Python 3 的str得到项便是Unicode字符,就像从Python 2中得到的unicode对象一样————而且不会是从Python str中得到的原始字节。 29 | 30 | - The identity of a character—its code point—is a number from 0 to 1,114,111 (base 10), shown in the Unicode standard as 4 to 6 hexadecimal digits with a “U+” prefix. For example, the code point for the letter A is U+0041, the Euro sign is U+20AC and the musical symbol G clef is assigned to code point U+1D11E. About 10% of the valid code points have characters assigned to them in Unicode 6.3, the standard used in Python 3.4. 31 | - The actual bytes that represent a character depend on the encoding in use. An encoding is an algorithm that converts code points to byte sequences and vice-versa. The code point for A (U+0041) is encoded as the single byte \x41 in the UTF-8 encoding, or as the bytes \x41\x00 in UTF-16LE encoding. As another example, the Euro sign (U+20AC) becomes three bytes in UTF-8—\xe2\x82\xac—but in UTF-16LE it is encoded as two bytes: \xac\x20. 32 | 33 | 转换代码片段到字节就是*编码*;由字节转换为代码片段便是*解码*。见例子4-1. 34 | 35 | 例子4-1. 
编码与解码。 36 | ******************* 37 | 38 | ```python 39 | >>> s = 'café' 40 | >>> len(s) # 1 41 | 4 42 | >>> b = s.encode('utf8') # 2 43 | >>> b 44 | b'caf\xc3\xa9' # 3 45 | >>> len(b) # 4 46 | 5 47 | >>> b.decode('utf8') # 5 48 | 'café' 49 | ``` 50 | 51 | `#1str 'café'拥有4个Unicode字符。` 52 | `#2: 利用UTF-8编码将str编码为bytes` 53 | `#3: bytes literals start with a b prefix.` 54 | `#4: bytes b拥有5个字节(在UTF-8中代码片段“é”被编码为两个字节)。` 55 | `#5 利用UTF-8编码将bytes解码为str。` 56 | 57 | >##### Tips 58 | >If you need a memory aid to help distinguish .decode() from .encode(), convince yourself that byte sequences can be cryptic machine core dumps while Unicode str objects are “hu‐ man” text. Therefore, it makes sense that we decode bytes to str to get human-readable text, and we encode str to bytes for stor‐ age or transmission. 59 | 60 | Although the Python 3 str is pretty much the Python 2 unicode type with a new name, the Python 3 bytes is not simply the old str renamed, and there is also the closely related bytearray type. So it is worthwhile to take a look at the binary sequence types before advancing to encoding/decoding issues. 61 | 62 | ## Byte Essentials 63 | The new binary sequence types are unlike the Python 2 str in many regards. The first thing to know is that there are two basic built-in types for binary sequences: the im‐ mutable bytes type introduced in Python 3 and the mutable bytearray, added in Python 2.6. (Python 2.6 also introduced bytes, but it’s just an alias to the str type, and does not behave like the Python 3 bytes type.) 64 | 65 | Each item in bytes or bytearray is an integer from 0 to 255, and not a one-character string like in the Python 2 str. However, a slice of a binary sequence always produces a binary sequence of the same type—including slices of length 1. See Example 4-2. 66 | 67 | Example 4-2. A five-byte sequence as bytes and as bytearray 68 | 69 | ```shell 70 | >>> cafe = bytes('café', encoding='utf_8') 71 | >>> cafe 72 | b'caf\xc3\xa9' 73 | >>> cafe[0] 74 | 99 75 | >>> cafe[:1] 76 | b'c' 77 | >>> cafe_arr = bytearray(cafe) 78 | >>> cafe_arr 79 | bytearray(b'caf\xc3\xa9') 80 | >>> cafe_arr[-1:] bytearray(b'\xa9') 81 | ``` 82 | 83 | bytes can be built from a str, given an encoding. 84 | Each item is an integer in range(256). 85 | Slices of bytes are also bytes—even slices of a single byte. 86 | There is no literal syntax for bytearray: they are shown as bytearray() with a bytes literal as argument. 87 | A slice of bytearray is also a bytearray. 88 | 89 | >The fact that my_bytes[0] retrieves an int but my_bytes[:1] returns a bytes object of length 1 should not be surprising. The only sequence type where s[0] == s[:1] is the str type. Al‐ though practical, this behavior of str is exceptional. For every other sequence, s[i] returns one item, and s[i:i+1] returns a sequence of the same type with the s[1] item inside it. 90 | 91 | Although binary sequences are really sequences of integers, their literal notation reflects the fact that ASCII text is often embedded in them. Therefore, three different displays are used, depending on each byte value: 92 | 93 | - For bytes in the printable ASCII range—from space to ~—the ASCII character itself is used. 94 | - For bytes corresponding to tab, newline, carriage return, and \, the escape sequences \t, \n, \r, and \\ are used. 95 | - For every other byte value, a hexadecimal escape sequence is used (e.g., \x00 is the null byte). 
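A minimal console sketch (not one of the book's numbered examples) shows all three display forms in a single literal: byte 65 is printable ASCII, byte 9 is a tab, and byte 233 falls outside the printable ASCII range:

```python
>>> bytes([65, 9, 233])   # 65 is shown as 'A', 9 as \t, 233 as \xe9
b'A\t\xe9'
```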
96 | 97 | That is why in Example 4-2 you see b'caf\xc3\xa9': the first three bytes b'caf' are in the printable ASCII range, the last two are not. 98 | 99 | Both bytes and bytearray support every str method except those that do formatting (format, format_map) and a few others that depend on Unicode data, including case fold, isdecimal, isidentifier, isnumeric, isprintable, and encode. This means that you can use familiar string methods like endswith, replace, strip, translate, upper, and dozens of others with binary sequences—only using bytes and not str arguments. In addition, the regular expression functions in the re module also work on binary sequences, if the regex is compiled from a binary sequence instead of a str. The % operator does not work with binary sequences in Python 3.0 to 3.4, but should be sup‐ported in version 3.5 according to PEP 461 — Adding % formatting to bytes and byte‐ array. 100 | 101 | Binary sequences have a class method thatstrdoesn’t have, calledfromhex, which builds a binary sequence by parsing pairs of hex digits optionally separated by spaces: 102 | 103 | ```shell 104 | >>> bytes.fromhex('31 4B CE A9') b'1K\xce\xa9' 105 | ``` 106 | 107 | The other ways of building bytes or bytearray instances are calling their constructors with: 108 | 109 | - A str and an encoding keyword argument. 110 | - An iterable providing items with values from 0 to 255. 111 | - A single integer, to create a binary sequence of that size initialized with null bytes. (This signature will be deprecated in Python 3.5 and removed in Python 3.6. See PEP 467 — Minor API improvements for binary sequences.) 112 | - An object that implements the buffer protocol (e.g., bytes, bytearray, memory view, array.array); this copies the bytes from the source object to the newly cre‐ ated binary sequence. 113 | 114 | Building a binary sequence from a buffer-like object is a low-level operation that may involve type casting. See a demonstration in Example 4-3. 115 | 116 | Example 4-3. Initializing bytes from the raw data of an array 117 | 118 | ```shell 119 | >>> import array 120 | >>> numbers = array.array('h', [-2, -1, 0, 1, 2]) 121 | >>> octets = bytes(numbers) 122 | >>> octets b'\xfe\xff\xff\xff\x00\x00\x01\x00\x02\x00' 123 | ``` 124 | 125 | Typecode 'h' creates an array of short integers (16 bits). octets holds a copy of the bytes that make up numbers. These are the 10 bytes that represent the five short integers. 126 | 127 | Creating a bytes or bytearray object from any buffer-like source will always copy the bytes. In contrast, memoryview objects let you share memory between binary data struc‐ tures. To extract structured information from binary sequences, the struct module is invaluable. We’ll see it working along with bytes and memoryview in the next section. 128 | 129 | ### Structs and Memory Views 130 | The struct module provides functions to parse packed bytes into a tuple of fields of different types and to perform the opposite conversion, from a tuple into packed bytes. struct is used with bytes, bytearray, and memoryview objects. 131 | 132 | As we’ve seen in “Memory Views” on page 51, the memoryview class does not let you create or store byte sequences, but provides shared memory access to slices of data from other binary sequences, packed arrays, and buffers such as Python Imaging Library (PIL) images,2 without copying the bytes. 133 | 134 | Example 4-4 shows the use of memoryview and struct together to extract the width and height of a GIF image. 135 | 136 | Example 4-4. 
Using memoryview and struct to inspect a GIF image header 137 | 138 | ```shell 139 | >>> import struct 140 | >>> fmt = '<3s3sHH' # 141 | >>> with open('filter.gif', 'rb') as fp: ... img = memoryview(fp.read()) # ... 142 | >>> header = img[:10] # 143 | >>> bytes(header) # b'GIF89a+\x02\xe6\x00' 144 | >>> struct.unpack(fmt, header) # (b'GIF', b'89a', 555, 230) 145 | >>> del header # 146 | >>> del img 147 | ``` 148 | 149 | struct format: < little-endian; 3s3s two sequences of 3 bytes; HH two 16-bit integers. 150 | Create memoryview from file contents in memory... 151 | ...then another memoryview by slicing the first one; no bytes are copied here. 152 | Convert to bytes for display only; 10 bytes are copied here. 153 | Unpack memoryview into tuple of: type, version, width, and height. 154 | Delete references to release the memory associated with the memoryview instances. 155 | 156 | Note that slicing a memoryview returns a new memoryview, without copying bytes (Leo‐ nardo Rochael—one of the technical reviewers—pointed out that even less byte copying would happen if I used the mmap module to open the image as a memory-mapped file. 157 | 158 | I will not cover mmap in this book, but if you read and change binary files frequently, learning more about mmap — Memory-mapped file support will be very fruitful). 159 | 160 | We will not go deeper into memoryview or the struct module in this book, but if you work with binary data, you’ll find it worthwhile to study their docs: Built-in Types » Memory Views and struct — Interpret bytes as packed binary data. 161 | 162 | After this brief exploration of binary sequence types in Python, let’s see how they are converted to/from strings. 163 | 164 | ## Basic Encoders/Decoders 165 | The Python distribution bundles more than 100 codecs (encoder/decoder) for text to byte conversion and vice versa. Each codec has a name, like 'utf_8', and often aliases, such as 'utf8', 'utf-8', and 'U8', which you can use as the encoding argument in functions like open(), str.encode(), bytes.decode(), and so on. Example 4-5 shows the same text encoded as three different byte sequences. 166 | 167 | Example 4-5. The string “El Niño” encoded with three codecs producing very different byte sequences 168 | 169 | ``` 170 | >>> for codec in ['latin_1', 'utf_8', 'utf_16']: 171 | ... print(codec, 'El Niño'.encode(codec), sep='\t') 172 | ... 173 | latin_1 b'El Ni\xf1o' 174 | utf_8 b'El Ni\xc3\xb1o' 175 | utf_16 b'\xff\xfeE\x00l\x00 \x00N\x00i\x00\xf1\x00o\x00' 176 | ``` 177 | 178 | Figure 4-1 demonstrates a variety of codecs generating bytes from characters like the letter “A” through the G-clef musical symbol. Note that the last three encodings are variable-length, multibyte encodings. 179 | 180 | ![img](images/) 181 | Figure 4-1. Twelve characters, their code points, and their byte representation (in hex) in seven different encodings (asterisks indicate that the character cannot be represented in that encoding) 182 | 183 | All those asterisks in Figure 4-1 make clear that some encodings, like ASCII and even the multibyte GB2312, cannot represent every Unicode character. The UTF encodings, however, are designed to handle every Unicode code point. 184 | 185 | The encodings shown in Figure 4-1 were chosen as a representative sample: 186 | 187 | latin1 a.k.a. iso8859_1 188 | Important because it is the basis for other encodings, such as cp1252 and Unicode itself (note how the latin1 byte values appear in the cp1252 bytes and even in the code points). 
189 | cp1252 190 | A latin1 superset by Microsoft, adding useful symbols like curly quotes and the € (euro); some Windows apps call it “ANSI,” but it was never a real ANSI standard. 191 | cp437 192 | The original character set of the IBM PC, with box drawing characters. Incompat‐ ible with latin1, which appeared later. 193 | gb2312 194 | Legacy standard to encode the simplified Chinese ideographs used in mainland China; one of several widely deployed multibyte encodings for Asian languages. 195 | utf-8 196 | The most common 8-bit encoding on the Web, by far;3 backward-compatible with ASCII (pure ASCII text is valid UTF-8). 197 | utf-16le 198 | One form of the UTF-16 16-bit encoding scheme; all UTF-16 encodings support code points beyond U+FFFF through escape sequences called “surrogate pairs.” 199 | 200 | >#### Caution 201 | >UTF-16 superseded the original 16-bit Unicode 1.0 encoding— UCS-2—way back in 1996. UCS-2 is still deployed in many sys‐ tems, but it only supports code points up to U+FFFF. As of Uni‐ code 6.3, more than 50% of the allocated code points are above U +10000, including the increasingly popular emoji pictographs. 202 | 203 | With this overview of common encodings now complete, we move to handling issues in encoding and decoding operations. 204 | 205 | ## Understanding Encode/Decode Problems 206 | Although there is a generic UnicodeError exception, the error reported is almost always more specific: either a UnicodeEncodeError (when converting str to binary sequences) or a UnicodeDecodeError (when reading binary sequences into str). Loading Python modules may also generate a SyntaxError when the source encoding is unexpected. We’ll show how to handle all of these errors in the next sections. 207 | 208 | >### Tips 209 | >The first thing to note when you get a Unicode error is the exact type of the exception. Is it a UnicodeEncodeError, a UnicodeDeco deError, or some other error (e.g., SyntaxError) that mentions an encoding problem? To solve the problem, you have to under‐ stand it first. 210 | 211 | ### Coping with UnicodeEncodeError 212 | Most non-UTF codecs handle only a small subset of the Unicode characters. When converting text to bytes, if a character is not defined in the target encoding, UnicodeEn codeError will be raised, unless special handling is provided by passing an errors argument to the encoding method or function. The behavior of the error handlers is shown in Example 4-6. 213 | 214 | [3]. As of September, 2014, W3Techs: Usage of Character Encodings for Websites claims that 81.4% of sites use UTF-8, while Built With: Encoding Usage Statistics estimates 79.4%. 215 | 216 | Example 4-6. Encoding to bytes: success and error handling 217 | 218 | ```python 219 | >>> city = 'São Paulo' 220 | >>> city.encode('utf_8') 221 | b'S\xc3\xa3o Paulo' 222 | >>> city.encode('utf_16') 223 | b'\xff\xfeS\x00\xe3\x00o\x00 \x00P\x00a\x00u\x00l\x00o\x00' 224 | >>> city.encode('iso8859_1') 225 | b'S\xe3o Paulo' 226 | >>> city.encode('cp437') Traceback (most recent call last): 227 | File "", line 1, in 228 | File "/.../lib/python3.4/encodings/cp437.py", line 12, in encode 229 | return codecs.charmap_encode(input,errors,encoding_map) UnicodeEncodeError: 'charmap' codec can't encode character '\xe3' in position 1: character maps to 230 | >>> city.encode('cp437', errors='ignore') 231 | b'So Paulo' 232 | >>> city.encode('cp437', errors='replace') 233 | b'S?o Paulo' 234 | >>> city.encode('cp437', errors='xmlcharrefreplace') 235 | b'São Paulo' 236 | ``` 237 | 238 | The 'utf_?' 
encodings handle any str. 'iso8859_1' also works for the 'São Paulo' str. 239 | 'cp437' can’t encode the 'ã' (“a” with tilde). The default error handler —'strict'—raises UnicodeEncodeError. 240 | The error='ignore' handler silently skips characters that cannot be encoded; this is usually a very bad idea. 241 | When encoding, error='replace' substitutes unencodable characters with '?'; data is lost, but users will know something is amiss. 242 | 'xmlcharrefreplace' replaces unencodable characters with an XML entity. 243 | 244 | >#### Tips 245 | >The codecs error handling is extensible. You may register extra strings for the errors argument by passing a name and an error handling function to the codecs.register_error function. See the codecs.register_error documentation. 246 | 247 | ### Coping with UnicodeDecodeError 248 | Not every byte holds a valid ASCII character, and not every byte sequence is valid UTF-8 or UTF-16; therefore, when you assume one of these encodings while converting a binary sequence to text, you will get a UnicodeDecodeError if unexpected bytes are found. 249 | 250 | On the other hand, many legacy 8-bit encodings like 'cp1252', 'iso8859_1', and 'koi8_r' are able to decode any stream of bytes, including random noise, without generating errors. Therefore, if your program assumes the wrong 8-bit encoding, it will silently decode garbage. 251 | 252 | 253 | -------------------------------------------------------------------------------- /06章-.md: -------------------------------------------------------------------------------- 1 | CHAPTER 6 Design Patterns with First-Class Functions 2 | Conformity to patterns is not a measure of goodness.1 3 | — Ralph Johnson 4 | Coauthor of the Design Patterns classic 5 | -------------------------------------------------------------------------------- /08章-对象的引用、可变性和回收.md: -------------------------------------------------------------------------------- 1 | CHAPTER 8 Object References, Mutability, and Recycling 2 | =================== 3 | 4 | *‘You are sad,’ the Knight said in an anxious tone: ‘let me sing you a song to comfort you. [...] The name of the song is called “HADDOCKS’ EYES”.’ ‘Oh, that’s the name of the song, is it?’ Alice said, trying to feel interested.‘No, you don’t understand,’ the Knight said, looking a little vexed. ‘That’s what the name is CALLED. The name really IS “THE AGED AGED MAN."’ (adapted from Chapter VIII. ‘It’s my own Invention’). 5 | — Lewis Carroll Through the Looking-Glass, and What Alice Found There * 6 | 7 | 8 | Alice and the Knight set the tone of what we will see in this chapter. The theme is the distinction between objects and their names. A name is not the object; a name is a separate thing. 9 | 10 | We start the chapter by presenting a metaphor for variables in Python: variables are labels, not boxes. If reference variables are old news to you, the analogy may still be handy if you need to explain aliasing issues to others. 11 | 12 | 我们由当下对Python中变量的隐喻开始本章:变量是标签,而不是盒子。 13 | 14 | We then discuss the concepts of object identity, value, and aliasing. A surprising trait of tuples is revealed: they are immutable but their values may change. This leads to a discussion of shallow and deep copies. References and function parameters are our next theme: the problem with mutable parameter defaults and the safe handling of mutable arguments passed by clients of our functions. 
15 | 16 | 然后我们讨论对象、值和别名概念的本体。 17 | 18 | The last sections of the chapter cover garbage collection, the del command, and how to use weak references to “remember” objects without keeping them alive. 19 | 20 | 本章的最后一节讨论了垃圾回收,del命令,以及如何使用弱引用去“记住”对象而不用一直保留它们。 21 | 22 | This is a rather dry chapter, but its topics lie at the heart of many subtle bugs in real Python programs. 23 | 24 | 这是相当相当枯燥的的章节,但是 25 | 26 | Let’s start by unlearning that a variable is like a box where you store data. 27 | 28 | ## Variables Are Not Boxes 变量不是个盒子 29 | In 1997, I took a summer course on Java at MIT. The professor, Lynn Andrea Stein— an award-winning computer science educator who currently teaches at Olin College of Engineering—made the point that the usual “variables as boxes” metaphor actually hinders the understanding of reference variables in OO languages. Python variables are like reference variables in Java, so it’s better to think of them as labels attached to objects. 30 | 31 | Example 8-1 is a simple interaction that the “variables as boxes” idea cannot explain. Figure 8-1 illustrates why the box metaphor is wrong for Python, while sticky notes provide a helpful picture of how variables actually work. 32 | 33 | 例子8-1是一个简单的“变量即盒子”思想不能解释的交互命令。图表8-1 34 | 35 | Example 8-1. Variables a and b hold references to the same list, not copies of the list 36 | 37 | ```python 38 | >>>a=[1,2,3] 39 | >>>b=a 40 | >>> a.append(4) 41 | >>> b 42 | [1, 2, 3, 4] 43 | ``` 44 | 45 | ![img](images/c8_1.png) 46 | 47 | Figure 8-1. If you imagine variables are like boxes, you can’t make sense of assignment in Python; instead, think of variables as sticky notes—[Example 8-1]() then becomes easy to explain 48 | 49 | 图表8-1. 如果你想象变量跟盒子一样,你就无法理解Python中的赋值;相反, 50 | 51 | Prof. Stein also spoke about assignment in a very deliberate way. For example, when talking about a seesaw object in a simulation, she would say: “Variable s is assigned to the seesaw,” but never “The seesaw is assigned to variable s.” With reference variables, it makes much more sense to say that the variable is assigned to an object, and not the other way around. After all, the object is created before the assignment. Example 8-2 proves that the righthand side of an assignment happens first. 52 | 53 | Stein教授也讲过关于赋值 54 | 55 | Example 8-2. Variables are assigned to objects only after the objects are created 56 | 57 | 例子8-2.变量赋值到对象仅在创建被创建之后。 58 | 59 | ```python 60 | >>> class Gizmo: 61 | ... def __init__(self): 62 | ... print('Gizmo id: %d' % id(self)) 63 | ... 64 | >>> x = Gizmo() 65 | Gizmo id: 4303555472 66 | >>> y = Gizmo() * 10 67 | Gizmo id: 4303555544 68 | Traceback (most recent call last): 69 | File "", line 1, in 70 | TypeError: unsupported operand type(s) for *: 'instance' and 'int' 71 | >>> dir() 72 | ['Gizmo', '__builtins__', '__doc__', '__name__', '__package__', 'x'] 73 | ``` 74 | 75 | 1. The output Gizmo id: ... is a side effect of creating a Gizmo instance. 76 | 2. Multiplying a Gizmo instance will raise an exception. 77 | 3. Here is proof that a second Gizmo was actually instantiated before the multiplication was attempted. 78 | 4. But variable y was never created, because the exception happened while the right- hand side of the assignment was being evaluated. 79 | 80 | 1. Gizmo的输出id: 81 | 2. 用整数乘以Gizmo将抛出一个异常 82 | 3. 83 | 84 | >#### Tips 提示 85 | >To understand an assignment in Python, always read the right- hand side first: that’s where the object is created or retrieved. 
Af‐ ter that, the variable on the left is bound to the object, like a label stuck to it. Just forget about the boxes. 86 | 87 | >要理解Python中赋值,你要一直从右边读起来:此处为对象被创建或者重新取回。之后,左边的变量被绑定到对象,就像粘上去的标签一样。所以,把盒子给忘掉吧。 88 | 89 | Because variables are mere labels, nothing prevents an object from having several labels assigned to it. When that happens, you have aliasing, our next topic. 90 | 91 | 因为变量仅仅是标签而已,没有什么能阻止一个对象给自己使用多个标签。 92 | 93 | ## Identity, Equality, and Aliases 94 | Lewis Carroll is the pen name of Prof. Charles Lutwidge Dodgson. Mr. Carroll is not only equal to Prof. Dodgson: they are one and the same. Example 8-3 expresses this idea in Python. 95 | 96 | Lewis Carroll是Charles Lutwidge Dodgson教授的笔名。Carroll不仅等于教授Dodgson:它们就是同一个完全一样的东西。例子8-3表现了Python中的这种思想。 97 | 98 | Example 8-3. charles and lewis refer to the same object 99 | 例子8-3. charles和lewis引用了相同的对象。 100 | 101 | ```python 102 | >>> charles = {'name': 'Charles L. Dodgson', 'born': 1832} 103 | >>> lewis = charles #1 104 | >>> lewis is charles 105 | True 106 | >>> id(charles), id(lewis) #2 107 | (4522426448, 4522426448) 108 | >>> lewis['balance'] = 950 #3 109 | >>> charles 110 | {'born': 1832, 'balance': 950, 'name': 'Charles L. Dodgson'} 111 | ``` 112 | 113 | 1. lewis is an alias for charles. 114 | 2. The is operator and the id function confirm it. 115 | 3. Adding an item to lewis is the same as adding an item to charles. 116 | 117 | 1. lewis是一个charles的别名。 118 | 2. is运算符和id函数对它进行确认。 119 | 3. 120 | 121 | However, suppose an impostor—let’s call him Dr. Alexander Pedachenko—claims he is Charles L. Dodgson, born in 1832. His credentials may be the same, but Dr. Pedachenko is not Prof. Dodgson. Figure 8-2 illustrates this scenario. 122 | 123 | ![Flowers](/flowers.jpeg) 124 | *Figure 8-2. charles and lewis are bound to the same object; alex is bound to a separate object of equal contents* 125 | 126 | Example 8-4 implements and tests the alex object depicted in Figure 8-2. 127 | 128 | *Example 8-4. alex and charles compare equal, but alex is not charles* 129 | 130 | 131 | ```python 132 | >>> alex = {'name': 'Charles L. Dodgson', 'born': 1832, 'balance': 950} #1 133 | >>> alex == charles #2 134 | True 135 | >>> alex is not charles #3 136 | True 137 | ``` 138 | 139 | 1. alex refers to an object that is a replica of the object assigned to charles. 140 | 2. The objects compare equal, because of the `__eq__` implementation in the dict 141 | class. 142 | 3. But they are distinct objects. This is the Pythonic way of writing the negative 143 | identity comparison: a is not b. 144 | 145 | 1. alex引用了一个赋值给charles对象的复制对象 146 | 2. 对象比较结果相等,因为在dict类中使用了`__eq__` 147 | 3. 但它们是不同的对象。这是一种具有Python风格的编写负一致性比较的写法:a is not b。 148 | 149 | Example 8-3 is an example of aliasing. In that code, lewis and charles are aliases: two 150 | variables bound to the same object. On the other hand, alex is not an alias for charles: 151 | these variables are bound to distinct objects. The objects bound to alex and charles have 152 | the same value—that’s what == compares—but they have different iden‐ tities. 153 | 154 | 例子8-3是一个 155 | 156 | In The Python Language Reference, “3.1. Objects, values and types” states: 157 | 158 | Every object has an identity, a type and a value. An object’s identity never changes once it has been created; you may think of it as the object’s address in memory. Theisoperator compares the identity of two objects; the id() function returns an integer representing its identity. 159 | 160 | The real meaning of an object’s ID is implementation-dependent. 
In CPython, id() returns the memory address of the object, but it may be something else in another Python interpreter. The key point is that the ID is guaranteed to be a unique numeric label, and it will never change during the life of the object. 161 | 162 | In practice, we rarely use the id() function while programming. Identity checks are most often done with the is operator, and not by comparing IDs. Next, we’ll talk about is versus ==. 163 | 164 | ## Choosing Between == and is 165 | The == operator compares the values of objects (the data they hold), while is compares their identities. 166 | 167 | We often care about values and not identities, so == appears more frequently than is in Python code. 168 | 169 | However, if you are comparing a variable to a singleton, then it makes sense to use is. By far, the most common case is checking whether a variable is bound to None. This is the recommended way to do it: 170 | 171 | ```python 172 | x is None 173 | ``` 174 | 175 | And the proper way to write its negation is: 176 | 177 | ```python 178 | x is not None 179 | ``` 180 | 181 | The `is` operator is faster than `==`, because it cannot be overloaded, so Python does not have to find and invoke special methods to evaluate it, and computing is as simple as comparing two integer IDs. In contrast, `a == b` is syntactic sugar for `a.__eq__(b)`. The `__eq__` method inherited from object compares object IDs, so it produces the same result as is. But most built-in types override `__eq__` with more meaningful implemen‐ tations that actually take into account the values of the object attributes. Equality may involve a lot of processing—for example, when comparing large collections or deeply nested structures. 182 | 183 | 运算符`is`快于`==`,因为它不能够重载,所以Python不能够找到并调用特殊方法以计算它,计算简单到比较两个整数ID。为了对比,`a == b`是`a.__eq__(b)`的语法糖。`__eq__`方法继承自 184 | 185 | 186 | -------------------------------------------------------------------------------- /12章-继承该如何是好.md: -------------------------------------------------------------------------------- 1 | # Inheritance: for good or for worse 第十二章 继承该如何是好 2 | 3 | >[我们]推动了继承思想,使其成为新手也可以构建框架的一种方法,而原先只有专家才可以设计的框架。 4 | >— 阿兰。凯《Smalltalk的早期历史》 5 | 6 | 本章有关于继承和子类化,其中有两个针对Python不同的重点内容: 7 | 8 | * 从内建类型中的子类化陷阱 9 | * 多重继承以及方法解析顺序 10 | 11 | 很多人认为多重继承带来的麻烦远大于其自身带来好处。 12 | 13 | 然而,由于Java特别的成功及其带来的影响力,这就意味着,在实际操作中很多程序员并没有见到多重继承。这就是为什么我们通过两个重要的项目来阐明多重继承的适应范围:`Tkinter GUI`套件,以及Django web 框架。 14 | 15 | 我们从内建子类化的问题开始。余下的章节会用案例研究来学习多重继承,并讨论在构建类的分层设计时所遇到的问题。 16 | 17 | ## 技巧之-子类化内建类型 18 | 19 | 在Python2.2之前,子类化`list`或者`dict`这样的内建类型是不可能的。打那以后,Python虽然可以做到子类化内建类型,但是仍然要面对的重要警告是:内建的代码(由C语言重写)并不会调用被通过用户自定义类所覆盖的特殊方法。 20 | 21 | 对问题的准确描述都放在了`PyPy`文档,以及内建类型的子类化一节中的`PyPy和CPython之间差异`: 22 | 23 | >正式地来说,Cpython对完全地重写内建类型的子类方法时是否要显式调用上显得毫无规则可言。大略上,这些方法从来没有被其他的相同对象的内建方法所调用。例如,`dict`子类中的重写`__getitem__()`不会被`get()`这样的内建方法调用。 24 | 25 | 例子12-1则说明了此问题。 26 | 27 | *例子12-1。重写的`__setitem__`被`dict`的`__init__`和`__update__`方法所忽略。* 28 | 29 | ************************ 30 | 31 | ```python 32 | >>> class DoppelDict(dict): 33 | ... def __setitem__(self, key, value): 34 | ... super(DoppelDict, self).__setitem__(key, [value] * 2) # 1... 
35 | >>> dd = DoppelDict(one=1) # 2 36 | >>> dd 37 | {'one': 1} 38 | >>> dd['two'] = 2 # 3 39 | >>> dd 40 | {'one': 1, 'two': [2, 2]} 41 | >>> dd.update(three=3) # 4> 42 | >> dd 43 | {'three': 3, 'one': 1, 'two': [2, 2]} 44 | ``` 45 | 46 | 1:存储时`DoppelDict.__setitem__`会使值重复(由于这个不好原因,因此必须有可见的效果)。它在委托到超类时才会正常运行。 47 | 48 | 2:继承自`dict`的`__init__`方法,明确地忽略了重写的`__setitem__`:`'one'`的值并没有重复。 49 | 50 | 3:`[]`运算符调用`__setitem__`,并如所希望的那样运行:`'two'`映射到了重复的值`[2, 2]`。 51 | 52 | 4:`dict`的`update`方法也没有使用我们定义的`__setitem__`:值`'three'`没有被重复。 53 | 54 | 该内建行为违反了面向对象的基本准则:方法的搜索应该总是从目标实例(`self`)的类开始,甚至是调用发生在以超类实现的方法之内部。在这样的悲观的情形下, 55 | 56 | 问题是在一个实例内部没有调用的限制,例如,不论`self.get()`是否调用`self.__getitem__()`,都会出现会被内建方法所调用其他类的方法被重写。下面是改编自`PyPy文档`的例子: 57 | 58 | 例子12-2。`AnswerDict`的`__getitem__`被`dict.update`所忽略。 59 | 60 | ```python 61 | >>> class AnswerDict(dict): 62 | ... def __getitem__(self, key): # 1... 63 | return 42 64 | ... 65 | >>> ad = AnswerDict(a='foo') # 2 66 | >>> ad['a'] # 3 67 | 42 68 | >>> d = {} 69 | >>> d.update(ad) # 4 70 | >>> d['a'] # 5 71 | 'foo' 72 | >>> d 73 | {'a': 'foo'} 74 | ``` 75 | 76 | 1:`AnserDict.__getitem__`总是返回`42`,不论键是什么。 77 | 78 | 2:`ad`是一个带有键值对`('a', 'foo')`的`AnswerDict`。 79 | 80 | 3:`ad['a']`如所期望的那样返回42。 81 | 82 | 4:`d`是一个普通使用`ad`更新的`dict`实例。 83 | 84 | 5:`dict.update`方法忽略了`AnserDict.__getitem__`。 85 | 86 | >##### 警告 87 | 直接地子类化类似`dict`或者`list`或者`str`这样的内建类型非常容易出错,因为大多数的内建方法会忽略用户所定义的重写方法。你应该从被设计成易于扩展的`collections`模块的`UserDict`,`UserList`和`UserString`派生类,而不是子类化内建对象。 88 | 89 | 如果你子类化`collections.UserDict`而不是`dict`,那么例子12-1和例子12-2中的问题都会被该解决。见例子12-3。 90 | 91 | *例子12-3。`DoppelDict2`和`AnswerDict2`一如所希望的运行,因为它们扩展的是UserDict而不是dict。* 92 | *************** 93 | 94 | ```python 95 | >>> import collections 96 | >>> 97 | >>> class DoppelDict2(collections.UserDict): 98 | ... def __setitem__(self, key, value): 99 | ... super().__setitem__(key, [value] * 2) 100 | ... 101 | >>> dd = DoppelDict2(one=1) 102 | >>> dd 103 | {'one': [1, 1]} 104 | >>> dd['two'] = 2 105 | >>> dd 106 | {'two': [2, 2], 'one': [1, 1]} 107 | >>> dd.update(three=3) 108 | >>> dd 109 | {'two': [2, 2], 'three': [3, 3], 'one': [1, 1]} 110 | >>> 111 | >>> class AnswerDict2(collections.UserDict): 112 | ... def __getitem__(self, key): 113 | ... return 42 114 | ... 115 | >>> ad = AnswerDict2(a='foo') 116 | >>> ad['a'] 117 | 42 118 | >>> d = {} 119 | >>> d.update(ad) 120 | >>> d['a'] 121 | 42 122 | >>> d 123 | {'a': 42} 124 | ``` 125 | 126 | 为了估量内建的子类工作所要求体验,我重写了例子3-8中`StrKeyDict`类。继承自`collections.UserDict`的原始版本,由三种方法实现:`__missing__`,`___contains__`和`__setitem__`。 127 | 128 | 总结:本节所描述的问题仅应用于在C语言内的方法委托实现内建类型,而且仅对用户定义的派生自这些的类型的类有效果。如果你在Python中子类化类编程,比如,`UserDict`或者`MutableMapping`,你不会遇到麻烦的。 129 | 130 | 还有问题就是,有关继承,特别地的多重继承:Python如何确定哪一个属性应该使用,如果超类来自并行分支定义相同的名称的属性,答案在下面一节。 131 | 132 | ### 多重继承以及方法解析顺序 133 | 134 | 当不关联的祖先类实现相同名称的方法时,任何语言实现多重继承都需要解决潜在的命名冲突。这称做“钻石问题”,一如图表12-1和例子12-4所描述。 135 | 136 | 图片:略 137 | 138 | 139 | 图表12-1.左边:UML类图表阐明了“钻石问题”。右边:虚线箭头为例子12-4描绘了Python MRO(方法解析顺序). 140 | 例子12-4. 
diamond.py:类A,B, C,和D构成了图表12-1中的图。 141 | 142 | ```python 143 | class A: 144 | def ping(self): 145 | print('ping:', self) 146 | 147 | 148 | class B(A): 149 | def pong(self): 150 | print('pong:', self) 151 | 152 | 153 | class C(A): 154 | def pong(self): 155 | print('PONG:', self) 156 | 157 | 158 | class D(B, C): 159 | 160 | def ping(self): 161 | super().ping() 162 | print('post-ping:', self) 163 | 164 | def pingpong(self): 165 | self.ping() 166 | super().ping() 167 | self.pong() 168 | super().pong() 169 | C.pong(self) 170 | 171 | ``` 172 | 173 | 注意类`B`和`C`都实现了`pong`方法。唯一的不同是`C.pong`输出大写的单词`PONG`。 174 | 175 | 如果你对实例`D`调用`d.pong()`,实际上哪一个`pong`方法会运行呢?对于C++程序员来说他们必须具有使用类名称调用方法,以解决这个模棱两可的问题。这样的问题在Python中也能够解决。看下例子12-5就知道了。 176 | 177 | 例子12-5.对类D的实例的pong方法调用的两种形式。 178 | 179 | ```python 180 | >>> from diamond import * 181 | >>> d = D() 182 | >>> d.pong() # 1 183 | pong: 184 | >>> C.pong(d) # 2 185 | PONG: 186 | ``` 187 | 1: 简单地调用`d.pong`导致B的运行。 188 | 2: 你可以总是直接地对调用超类的方法,传递实例作为明确的参数。 189 | 190 | 像`d.pong()`这样的模棱两可的调用得以解决,因为Python在穿越继承图时,遵循一个特定的顺序。这个顺序就叫做MRO:方法解析顺序。类有一个被称为`__mro__`的属性,它拥有使用MRO顺序的超类的引用元组,即,当前的类的所有到`object`类的路径。拿类`D`来说明什么是`__mro__`(参见 图表12-1): 191 | 192 | ```python 193 | >>> D.__mro__ 194 | (, , , 195 | , ) 196 | ``` 197 | 198 | 推荐的调用超类的委托方法就是内建的`super()`函数,这样做是因为在Python3中较易使用,就像例子12-4中的类D的`pingpong`方法所阐述的那样。不过,有时候忽略MRO,对超类直接地调用方法也是也可以的,而且很方便。例如,`D.ping`方法可以这样写: 199 | 200 | ```python 201 | def ping(self): 202 | A.ping(self) # instead of super().ping() 203 | print('post-ping:', self) 204 | ``` 205 | 206 | 注意,当调用直接调用一个类的实例时,你必须明确地传递`self`,因为你访问的是`unbound method`。 207 | 208 | 不过,这是最安全的而且更未来化的使用`super()`,特别是在调用一个框架的方法时,或者任何不受你控制的类继承时。例子12-6演示了在调用方法时`super()`对MRO的遵循。 209 | 210 | 例子12-6。使用`super()`去调用`ping`(源码见例子12-4)。 211 | 212 | ```python 213 | >>> from diamond import D 214 | >>> d = D() 215 | >>> d.ping() # 1 216 | ping: # 2 217 | post-ping: # 3 218 | ``` 219 | 220 | 1: The `ping` of `D` makes two calls: 221 | 1: D的ping进行了两次调用: 222 | 223 | 2: The first call is super().ping(); the super delegates the ping call to class A; A.ping outputs this line. 224 | 2: 第一次调用了`super().ping()`;super委托ping去调用类A;A.ping输出内容。 225 | 226 | 3: The second call is print('post-ping:', self) which outputs this line. 227 | 3: 第二次调用的是`print('post-ping:', self)`输出本行内容。 228 | 229 | Now let’s see what happens when pingpong is called on an instance of D. 230 | 231 | 现在让我们看一看调用D的实例上的pingpong到底发生哪些事情。 232 | 233 | Example 12-7. The five calls made by pingpong (source code in Example 12-4)。 234 | 235 | 例子12-7.由`pingpong`发起的五次调用(源码见例子12-4)。 236 | 237 | ```python 238 | >>> from diamond import D 239 | >>> d = D() 240 | >>> d.pingpong() 241 | >>> d.pingpong() 242 | ping: # ① 243 | post-ping: 244 | ping: # ② 245 | pong: # ③ 246 | pong: # ④ 247 | PONG: # ⑤ 248 | ``` 249 | 250 | ① Call #1 is`self.ping()` runs the ping method of D, which outputs this line and the next one. 251 | ② Call #2 is super.ping() which bypasses the ping in D and finds the ping method in A. 252 | ③ Call #3 is self.pong() which finds the B implementation of pong, according to the __mro__. 253 | ④ Call #4 is super.pong() which finds the same B.pong implementation, also following the __mro__. 254 | ⑤ Call #5 is C.pong(self) which finds the C.pong implementation, ignoring the __mro__. 
255 | 256 | ① 第一次调用的是``self.ping()`运行的是D的ping方法,它输出了本行以及下面一行 257 | ② 第二次调用的是super.ping(),它忽略了D中的ping方法然后找到了A中的ping方法。 258 | ③ 第三次调用的是self.ping()它通过`__mro__`找到了B的pong方法。 259 | ④ 第四次调用的是`super.pong()`它同样是通过`__mro__`找到了B.pong。 260 | ⑤ 第五次调用的是`C.pong(self)`,它忽略了`__mro__`找到是C.pong。 261 | 262 | The MRO takes into account not only the inheritance graph but also the order in which superclasses are listed in a subclass declaration. In other words, if in diamond.py (Example 12-4) the D class was declared as class D(C, B):, the __mro__ of class D would be different: C would be searched before B. 263 | 264 | MRO 265 | 266 | I often check the`__mro__` of classes interactively when I am studying them. Example 12-8 has some examples using familiar classes. 267 | 268 | 在我研究多重继承时,我常交互式地去检查类的`__mro__`。例子12-8就使用了熟悉的类。 269 | 270 | *Example 12-8. Inspecting the `__mro__` attribute in several classes* 271 | *例子 12-8.在多个类中检查`__mro__`属性* 272 | 273 | ```python 274 | >>> bool.__mro__ # 1 275 | (, , ) 276 | >>> def print_mro(cls): # 2 277 | ... print(', '.join(c.__name__ for c in cls.__mro__)) 278 | ... 279 | >>> print_mro(bool) 280 | bool, int, object 281 | >>> from frenchdeck2 import FrenchDeck2 282 | >>> print_mro(FrenchDeck2) # 3 283 | FrenchDeck2, MutableSequence, Sequence, Sized, Iterable, Container, object 284 | >>> import numbers 285 | >>> print_mro(numbers.Integral) # 4 286 | Integral, Rational, Real, Complex, Number, object 287 | >>> import io # 5 288 | >>> print_mro(io.BytesIO) 289 | BytesIO, _BufferedIOBase, _IOBase, object 290 | >>> print_mro(io.TextIOWrapper) 291 | TextIOWrapper, _TextIOBase, _IOBase, object 292 | ``` 293 | 294 | 1: bool inherits methods and attributes from int and object. 295 | 2: print_mro produces more compact displays of the MRO. 296 | 3: The ancestors of FrenchDeck2 include several ABCs from the collections.abc module. 297 | 4: These are the numeric ABCs provided by the numbers module. 298 | 5: The io module includes ABCs (those with the …Base suffix) and concrete classes like BytesIO and TextIOWrapper which are the types of binary and text file objects returned by open(), depending on the mode argument. 299 | 300 | >#####Note 301 | >The MRO is computed using an algorithm called C3. The canonical paper on the Python MRO explaining C3 is Michele Simionato’s The Python 2.3 Method Resolution Order. If you are interested in the subtleties of the MRO, Further reading has other pointers. But don’t fret too much about this, the algorithm is sensible and Simionato wrote: 302 | […] unless you make strong use of multiple inheritance and you have non-trivial hierarchies, you don’t need to understand the C3 algorithm, and you can easily skip this paper. 303 | 304 | To wrap up this discussion of the MRO, Figure 12-2 illustrates part of the complex multiple inheritance graph of the Tkinter GUI toolkit from the Python standard library. To study the picture, start at the Text class at the bottom. The Text class implements a full featured, multiline editable text widget. It has rich functionality of its own, but also inherits many methods from other classes. The left side shows a plain UML class dia‐ gram. On the right, it’s decorated with arrows showing the MRO, as listed here with the help of the print_mro convenience function defined in Example 12-8: 305 | 306 | ```python 307 | >>> import tkinter 308 | >>> print_mro(tkinter.Text) 309 | Text, Widget, BaseWidget, Misc, Pack, Place, Grid, XView, YView, object 310 | ``` 311 | 312 | 图片:略 313 | 314 | Figure 12-2. 
Left: UML class diagram of the Tkinter Text widget class and its superclasses. Right: Dashed arrows depict Text.mro.

In the next section, we'll discuss the pros and cons of multiple inheritance, with examples from real frameworks that use it.

接下来的小节中,我们要讨论多重继承的优点与缺点,并以真实使用它的框架作为例子。

## Multiple Inheritance in the Real World 现实世界中的多重继承

It is possible to put multiple inheritance to good use. The Adapter pattern in the Design Patterns book uses multiple inheritance, so it can't be completely wrong to do it (the remaining 22 patterns in the book use single inheritance only, so multiple inheritance is clearly not a cure-all).

把多重继承用好是有可能的。《设计模式》一书中的适配器模式就使用了多重继承,所以使用它不可能是完全错误的做法(书中余下的22个模式只使用单继承,因此多重继承显然不是什么灵丹妙药)。

In the Python standard library, the most visible use of multiple inheritance is the collections.abc package. That is not controversial: after all, even Java supports multiple inheritance of interfaces, and ABCs are interface declarations that may optionally provide concrete method implementations.[5]

在Python标准库中,多重继承最显眼的用法是collections.abc包。这并不存在争议:毕竟,就连Java也支持接口的多重继承,而ABC本质上就是接口声明,并且可以有选择地提供具体的方法实现。[5]

[5].*As previously mentioned, Java 8 allows interfaces to provide method implementations as well. The new feature is called Default Methods in the official Java Tutorial.*

An extreme example of multiple inheritance in the standard library is the Tkinter GUI toolkit (module tkinter: Python interface to Tcl/Tk). I used part of the Tkinter widget hierarchy to illustrate the MRO in Figure 12-2, but Figure 12-3 shows all the widget classes in the tkinter base package (there are more widgets in the tkinter.ttk subpackage).

标准库中多重继承的一个极端例子便是Tkinter图形套件(tkinter模块:面向Tcl/Tk的Python接口)。我在图表12-2中用Tkinter部件层级的一部分来说明MRO,而图表12-3则展示了tkinter基础包中的全部部件类(tkinter.ttk子包中还有更多的部件)。

图片:略

Figure 12-3. Summary UML diagram for the Tkinter GUI class hierarchy; classes tagged «mixin» are designed to provide concrete methods to other classes via multiple inheritance

Tkinter is 20 years old as I write this, and is not an example of current best practices. But it shows how multiple inheritance was used when coders did not appreciate its drawbacks. And it will serve as a counter-example when we cover some good practices in the next section.

当我写这段文字时,Tkinter已经有20年的历史了,它并不是当前最佳实践的例子。但它展示了在程序员尚未认识到多重继承缺点的年代,多重继承是如何被使用的。在下一节介绍一些好的实践时,它还会作为反例出现。

Consider these classes from Figure 12-3:

➊ Toplevel: The class of a top-level window in a Tkinter application.
➋ Widget: The superclass of every visible object that can be placed on a window.
➌ Button: A plain button widget.
➍ Entry: A single-line editable text field.
➎ Text: A multiline editable text field.
351 | 352 | Here are the MROs of those classes, displayed by the print_mro function from Example 12-8: 353 | 354 | ```python 355 | >>> import tkinter 356 | >>> print_mro(tkinter.Toplevel) 357 | Toplevel, BaseWidget, Misc, Wm, object 358 | >>> print_mro(tkinter.Widget) 359 | Widget, BaseWidget, Misc, Pack, Place, Grid, object 360 | >>> print_mro(tkinter.Button) 361 | Button, Widget, BaseWidget, Misc, Pack, Place, Grid, object 362 | >>> print_mro(tkinter.Entry) 363 | Entry, Widget, BaseWidget, Misc, Pack, Place, Grid, XView, object 364 | >>> print_mro(tkinter.Text) 365 | Text, Widget, BaseWidget, Misc, Pack, Place, Grid, XView, YView, object 366 | ``` 367 | 368 | Things to note about how these classes relate to others: 369 | 370 | • Toplevel is the only graphical class that does not inherit from Widget, because it is the top-level window and does not behave like a widget—for example, it cannot be attached to a window or frame. Toplevel inherits from Wm, which provides direct access functions of the host window manager, like setting the window title and configuring its borders. 371 | 372 | • Widget inherits directly from BaseWidget and from Pack, Place, and Grid. These last three classes are geometry managers: they are responsible for arranging widgets inside a window or frame. Each encapsulates a different layout strategy and widget placement API. 373 | 374 | • Button, like most widgets, descends only from Widget, but indirectly from Misc, which provides dozens of methods to every widget. 375 | 376 | • Entry subclasses Widget and XView, the class that implements horizontal scrolling. • Text subclasses from Widget, XView, and YView, which provides vertical scrolling 377 | functionality. 378 | 379 | We’ll now discuss some good practices of multiple inheritance and see whether Tkinter goes along with them. 380 | 381 | ### Coping with Multiple Inheritance 382 | 383 | [...] we needed a better theory about inheritance entirely (and still do). For example, inheritance and instancing (which is a kind of inheritance) muddles both pragmatics (such as factoring code to save space) and semantics (used for way too many tasks such as: specialization, generalization, speciation, etc.). 384 | — Alan Kay The Early History of Smalltalk 385 | 386 | As Alan Kay wrote, inheritance is used for different reasons, and multiple inheritance adds alternatives and complexity. It’s easy to create incomprehensible and brittle designs using multiple inheritance. Because we don’t have a comprehensive theory, here are a few tips to avoid spaghetti class graphs. 387 | 388 | #### 1. Distinguish Interface Inheritance from Implementation Inheritance 389 | 390 | When dealing with multiple inheritance, it’s useful to keep straight the reasons why subclassing is done in the first place. The main reasons are: 391 | 392 | • Inheritance of interface creates a subtype, implying an “is-a” relationship. 393 | • Inheritance of implementation avoids code duplication by reuse. 394 | 395 | In practice, both uses are often simultaneous, but whenever you can make the intent clear, do it. Inheritance for code reuse is an implementation detail, and it can often be replaced by composition and delegation. On the other hand, interface inheritance is the backbone of a framework. 396 | 397 | #### 2. Make Interfaces Explicit with ABCs 398 | 399 | In modern Python, if a class is designed to define an interface, it should be an explicit ABC. 
In Python ≥ 3.4, this means: subclass abc.ABC or another ABC (see “ABC Syntax Details” on page 328 if you need to support older Python versions). 400 | 401 | #### 3. Use Mixins for Code Reuse 使用Mixin实现代码复用 402 | 403 | If a class is designed to provide method implementations for reuse by multiple unrelated subclasses, without implying an “is-a” relationship, it should be an explicit mixin class. Conceptually, a mixin does not define a new type; it merely bundles methods for reuse. A mixin should never be instantiated, and concrete classes should not inherit only from a mixin. Each mixin should provide a single specific behavior, implementing few and very closely related methods. 404 | 405 | #### 4. Make Mixins Explicit by Naming 406 | 407 | There is no formal way in Python to state that a class is a mixin, so it is highly recom‐ mended that they are named with a ...Mixin suffix. Tkinter does not follow this advice, but if it did, XView would be XViewMixin, Pack would be PackMixin, and so on with all the classes where I put the «mixin» tag in Figure 12-3. 408 | 409 | #### 5. An ABC May Also Be a Mixin; The Reverse Is Not True 410 | 411 | Because an ABC can implement concrete methods, it works as a mixin as well. An ABC also defines a type, which a mixin does not. And an ABC can be the sole base class of any other class, while a mixin should never be subclassed alone except by another, more specialized mixin—not a common arrangement in real code. 412 | 413 | One restriction applies to ABCs and not to mixins: the concrete methods implemented in an ABC should only collaborate with methods of the same ABC and its superclasses. This implies that concrete methods in an ABC are always for convenience, because everything they do, a user of the class can also do by calling other methods of the ABC. 414 | 415 | #### 6. Don’t Subclass from More Than One Concrete Class 不要子类化一个以上的具象类 416 | 417 | Concrete classes should have zero or at most one concrete superclass.[6] In other words, all but one of the superclasses of a concrete class should be ABCs or mixins. For example, in the following code, if Alpha is a concrete class, then Beta and Gamma must be ABCs or mixins: 418 | 419 | 具象类应当拥有零个或者最多一个具象子类。换言之, 420 | 421 | ```python 422 | class MyConcreteClass(Alpha, Beta, Gamma): 423 | """This is a concrete class: it can be instantiated.""" # ... more code ... 424 | ``` 425 | 426 | #### 7. Provide Aggregate Classes to Users 提供聚合类 427 | 428 | If some combination of ABCs or mixins is particularly useful to client code, provide a class that brings them together in a sensible way. Grady Booch calls this an aggregate class.[7] 429 | 430 | For example, here is the complete source code for tkinter.Widget: 431 | 432 | ```python 433 | class Widget(BaseWidget, Pack, Place, Grid): 434 | """Internal class. 435 | Base class for a widget which can be positioned with the geometry managers Pack, Place or Grid.""" 436 | pass 437 | ``` 438 | 439 | [6]. In “Waterfowl and ABCs” on page 314, Alex Martelli quotes Scott Meyer’s More Effective C++, which goes even further: “all non-leaf classes should be abstract” (i.e., concrete classes should not have concrete super‐ classes at all). 440 | 441 | [7]. “A class that is constructed primarily by inheriting from mixins and does not add its own structure or behavior is called an aggregate class.”, Grady Booch et al., Object Oriented Analysis and Design, 3E (Addison-Wesley, 2007), p. 109. 
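To make guidelines 3 through 7 concrete before returning to tkinter.Widget, here is a minimal, hypothetical sketch; the names Persistable, JSONSerializableMixin, Document, and JSONDocument are invented for illustration and come from neither tkinter nor the standard library:

```python
import abc
import json


class Persistable(abc.ABC):
    """ABC: defines an interface (and therefore a type); may provide concrete helpers."""

    @abc.abstractmethod
    def save(self):
        """Persist the object somewhere."""

    def save_twice(self):
        # Concrete convenience method: collaborates only with the ABC's own interface.
        self.save()
        self.save()


class JSONSerializableMixin:
    """Mixin: bundles one narrow behavior for reuse, defines no new type,
    is never instantiated on its own, and uses the ...Mixin naming convention."""

    def to_json(self):
        return json.dumps(self.__dict__, default=str)


class Document:
    """A concrete class: at most one concrete class may appear among the bases below."""

    def __init__(self, text):
        self.text = text

    def save(self):
        print('saving:', self.text)


class JSONDocument(JSONSerializableMixin, Document, Persistable):
    """Aggregate-style class: brings the mixin, the concrete class, and the ABC
    together so users don't have to remember the bases or their order."""


if __name__ == '__main__':
    doc = JSONDocument('hello')
    doc.save_twice()       # concrete method inherited from the ABC
    print(doc.to_json())   # behavior contributed by the mixin
```

The aggregate-style JSONDocument provides the same kind of service tkinter.Widget does: users inherit from one conveniently named class instead of remembering the right combination of bases.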
442 | 443 | 444 | The body of Widget is empty, but the class provides a useful service: it brings together four superclasses so that anyone who needs to create a new widget does not need to remember all those mixins, or wonder if they need to be declared in a certain order in a class statement. A better example of this is the Django ListView class, which we’ll discuss shortly, in “A Modern Example: Mixins in Django Generic Views” on page 362. 445 | 446 | #### 8. “Favor Object Composition Over Class Inheritance.” 447 | 448 | This quote comes straight the Design Patterns book,(*#8*) and is the best advice I can offer here. Once you get comfortable with inheritance, it’s too easy to overuse it. Placing objects in a neat hierarchy appeals to our sense of order; programmers do it just for fun. 449 | 450 | ``` 451 | #8. Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, Introduction, p. 20. 452 | ``` 453 | 454 | However, favoring composition leads to more flexible designs. For example, in the case of the tkinter.Widget class, instead of inheriting the methods from all geometry man‐ agers, widget instances could hold a reference to a geometry manager, and invoke its methods. After all, a Widget should not “be” a geometry manager, but could use the services of one via delegation. Then you could add a new geometry manager without touching the widget class hierarchy and without worrying about name clashes. Even with single inheritance, this principle enhances flexibility, because subclassing is a form of tight coupling, and tall inheritance trees tend to be brittle. 455 | 456 | Composition and delegation can replace the use of mixins to make behaviors available to different classes, but cannot replace the use of interface inheritance to define a hier‐ archy of types. 457 | 458 | We will now analyze Tkinter from the point of view of these recommendations. 459 | 460 | ### Tkinter: The Good, the Bad, and the Ugly 461 | 462 | >Keep in mind that Tkinter has been part of the standard library since Python 1.1 was released in 1994. Tkinter is a layer on top of the excellent Tk GUI toolkit of the Tcl language. The Tcl/Tk combo is not originally object oriented, so the Tk API is basical‐ ly a vast catalog of functions. However, the toolkit is very object oriented in its concepts, if not in its implementation. 463 | 464 | Most advice in the previous section is not followed by Tkinter, with #7 being a notable exception. Even then, it’s not a great example, because composition would probably work better for integrating the geometry managers into Widget, as discussed in #8. 465 | 466 | The docstring of tkinter.Widget starts with the words “Internal class.” This suggests that Widget should probably be an ABC. Although Widget has no methods of its own, it does define an interface. Its message is: “You can count on every Tkinter widget pro‐ viding basic widget methods (__init__, destroy, and dozens of Tk API functions), in addition to the methods of all three geometry managers.” We can agree that this is not a great interface definition (it’s just too broad), but it is an interface, and Widget “defines” it as the union of the interfaces of its superclasses. 467 | 468 | The Tk class, which encapsulates the GUI application logic, inherits from Wm and Misc, neither of which are abstract or mixin (Wm is not proper mixin because TopLevel sub‐ classes only from it). The name of the Misc class is—by itself—a very strong code smell. 
Misc has more than 100 methods, and all widgets inherit from it. Why is it nec‐ essary that every single widget has methods for clipboard handling, text selection, timer management, and the like? You can’t really paste into a button or select text from a scrollbar. Misc should be split into several specialized mixin classes, and not all widgets should inherit from every one of those mixins. 469 | 470 | To be fair, as a Tkinter user, you don’t need to know or use multiple inheritance at all. It’s an implementation detail hidden behind the widget classes that you will instantiate or subclass in your own code. But you will suffer the consequences of excessive multiple inheritance when you type dir(tkinter.Button) and try to find the method you need among the 214 attributes listed. 471 | 472 | Despite the problems, Tkinter is stable, flexible, and not necessarily ugly. The legacy (and default) Tk widgets are not themed to match modern user interfaces, but the tkinter.ttk package provides pretty, native-looking widgets, making professional GUI development viable since Python 3.1 (2009). Also, some of the legacy widgets, like Canvas and Text, are incredibly powerful. With just a little coding, you can turn a Canvas object into a simple drag-and-drop drawing application. Tkinter and Tcl/Tk are defi‐ nitely worth a look if you are interested in GUI programming. 473 | 474 | However, our theme here is not GUI programming, but the practice of multiple inher‐ itance. A more up-to-date example with explicit mixin classes can be found in Django. 475 | 476 | ## A Modern Example: Mixins in Django Generic Views 477 | 478 | >You don’t need to know Django to follow this section. I am just using a small part of the framework as a practical example of multiple inheritance, and I will try to give all the necessary back‐ ground, assuming you have some experience with server-side web development in another language or framework. 479 | 480 | In Django, a view is a callable object that takes, as argument, an object representing an HTTP request and returns an object representing an HTTP response. The different responses are what interests us in this discussion. They can be as simple as a redirect response, with no content body, or as complex as a catalog page in an online store, rendered from an HTML template and listing multiple merchandise with buttons for buying and links to detail pages. 481 | 482 | Originally, Django provided a set of functions, called generic views, that implemented some common use cases. For example, many sites need to show search results that include information from numerous items, with the listing spanning multiple pages, and for each item a link to a page with detailed information about it. In Django, a list view and a detail view are designed to work together to solve this problem: a list view renders search results, and a detail view produces pages for individual items. 483 | 484 | However, the original generic views were functions, so they were not extensible. If you needed to do something similar but not exactly like a generic list view, you’d have to start from scratch. 485 | 486 | In Django 1.3, the concept of class-based views was introduced, along with a set of generic view classes organized as base classes, mixins, and ready-to-use concrete classes. The base classes and mixins are in the base module of the django.views.generic package, pictured in Figure 12-4. 
At the top of the diagram we see two classes that take care of very distinct responsibilities: View and TemplateResponseMixin. 487 | 488 | >A great resource to study these classes is the Classy Class-Based Views website, where you can easily navigate through them, see all methods in each class (inherited, overridden, and added meth‐ ods), view diagrams, browse their documentation, and jump to their source code on GitHub. 489 | 490 | View is the base class of all views (it could be an ABC), and it provides core functionality like the dispatch method, which delegates to “handler” methods like get, head, post, etc., implemented by concrete subclasses to handle the different HTTP verbs.(*#9*) The RedirectView class inherits only from View, and you can see that it implements get, head, post, etc. 491 | 492 | ``` 493 | #9. Django programmers know that the as_view class method is the most visible part of the View interface, but it’s not relevant to us here. 494 | ``` 495 | 496 | Concrete subclasses of View are supposed to implement the handler methods, so why aren’t they part of the View interface? The reason: subclasses are free to implement just the handlers they want to support. A TemplateView is used only to display content, so it only implements get. If an HTTP POST request is sent to a TemplateView, the inherited View.dispatch method checks that there is no post handler, and produces an HTTP 405 Method Not Allowed response.(*#10*) 497 | 498 | ``` 499 | #10. If you are into design patterns, you’ll notice that the Django dispatch mechanism is a dynamic variation of the Template Method pattern. It’s dynamic because the View class does not force subclasses to implement all handlers, but dispatch checks at runtime if a concrete handler is available for the specific request. 500 | ``` 501 | 502 | image:bypass 503 | 504 | Figure 12-4. UML class diagram for the django.views.generic.base module 505 | 506 | The TemplateResponseMixin provides functionality that is of interest only to views that need to use a template. A RedirectView, for example, has no content body, so it has no need of a template and it does not inherit from this mixin. TemplateResponseMixin provides behaviors to TemplateView and other template-rendering views, such as List View, DetailView, etc., defined in other modules of the django.views.generic package. Figure 12-5 depicts the django.views.generic.list module and part of the base module. 507 | 508 | iamge:pass 509 | 510 | Figure 12-5. UML class diagram for the django.views.generic.list module. Here the three classes of the base module are collapsed (see Figure 12-4). The ListView class has no methods or attributes: it’s an aggregate class. 511 | 512 | For Django users, the most important class in Figure 12-5 is ListView, which is an aggregate class, with no code at all (its body is just a docstring). When instantiated, a ListView has an object_list instance attribute through which the template can iterate to show the page contents, usually the result of a database query returning multiple objects. All the functionality related to generating this iterable of objects comes from the MultipleObjectMixin. That mixin also provides the complex pagination logic—to display part of the results in one page and links to more pages. 513 | 514 | Suppose you want to create a view that will not render a template, but will produce a list of objects in JSON format. Thats’ why the BaseListView exists. 
It provides an easy- to-use extension point that brings together View and MultipleObjectMixin function‐ ality, without the overhead of the template machinery. 515 | 516 | The Django class-based views API is a better example of multiple inheritance than Tkinter. In particular, it is easy to make sense of its mixin classes: each has a well-defined purpose, and they are all named with the ...Mixin suffix. 517 | 518 | Class-based views were not universally embraced by Django users. Many do use them in a limited way, as black boxes, but when it’s necessary to create something new, a lot of Django coders continue writing monolithic view functions that take care of all those responsibilities, instead of trying to reuse the base views and mixins. 519 | 520 | It does take some time to learn how to leverage class-based views and how to extend them to fulfill specific application needs, but I found that it was worthwhile to study them: they eliminate a lot of boilerplate code, make it easier to reuse solutions, and even improve team communication—for example, by defining standard names to templates, and to the variables passed to template contexts. Class-based views are Django views “on rails.” 521 | 522 | This concludes our tour of multiple inheritance and mixin classes. 523 | 524 | ## Chapter Summary 本章总结 525 | 526 | We started our coverage of inheritance explaining the problem with subclassing built- in types: their native methods implemented in C do not call overridden methods in subclasses, except in very few special cases. That’s why, when we need a custom list, dict, or str type, it’s easier to subclass UserList, UserDict, or UserString—all defined in the collections module, which actually wraps the built-in types and delegate op‐ erations to them—three examples of favoring composition over inheritance in the stan‐ dard library. If the desired behavior is very different from what the built-ins offer, it may be easier to subclass the appropriate ABC from collections.abc and write your own implementation. 527 | 528 | 我们从使用子类化内建类型来解释继承的问题:这些原生方法使用C实现, 529 | 530 | The rest of the chapter was devoted to the double-edged sword of multiple inheritance. First we saw how the method resolution order, encoded in the __mro__ class attribute, addresses the problem of potential naming conflicts in inherited methods. We also saw how the super() built-in follows the __mro__ to call a method on a superclass. We then studied how multiple inheritance is used in the Tkinter GUI toolkit that comes with the Python standard library. Tkinter is not an example of current best practices, so we discussed some ways of coping with multiple inheritance, including careful use of mixin classes and avoiding multiple inheritance altogether by using composition instead. After considering how multiple inheritance is abused in Tkinter, we wrapped up by studying the core parts of the Django class-based views hierarchy, which I consider a better ex‐ ample of mixin usage. 531 | Lennart Regebro—a very experienced Pythonista and one of this book’s technical re‐ viewers—finds the design of Django’s mixin views hierarchy confusing. But he also wrote: 532 | 533 | 章节余下的内容用来专门讨论多重继承的双刃剑问题。首先,我们看到了方法解析顺序,如何被编码在类的__mro__属性中,解 534 | 535 |    The dangers and badness of multiple inheritance are greatly overblown. I’ve actually never had a    real big problem with it. 536 | 537 | In the end, each of us may have different opinions about how to use multiple inheritance, or whether to use it at all in our own projects. 
But often we don’t have a choice: the frameworks we must use impose their own choices. 538 | 539 | 最后,我们每个人对于如何使用多重继承都有自己的不同观点,抑或是无视情况都将它用在自己的项目中。然而有时候我们也是没得选择: 540 | 541 | ## Further Reading 深入阅读 542 | 543 | When using ABCs, multiple inheritance is not only common but practically inevitable, because each of the most fundamental collection ABCs (Sequence, Mapping, and Set) extend multiple ABCs. The source code for collections.abc (Lib/_collections_abc.py) is a good example of multiple inheritance with ABCs—many of which are also mixin classes. 544 | 545 | 在使用ABC时,多重继承并不常见但实践不可避免,因为 546 | 547 | Raymond Hettinger’s post Python’s super() considered super! explains the workings of super and multiple inheritance in Python from a positive perspective. It was written in response to Python’s Super is nifty, but you can’t use it (a.k.a. Python’s Super Considered Harmful) by James Knight. 548 | 549 | Raymond Hettinger的文章《Python的super()过人之处!》从一个正面的视角解释了super的工作原理和在Pythno中的多重继承。写就该文以回应James Knight的文章《Python的super非常好,但你却无法使用它》(也被称为《Python中super的坏处》)。 550 | 551 | Despite the titles of those posts, the problem is not really the super built-in—which in Python 3 is not as ugly as it was in Python 2. The real issue is multiple inheritance, which is inherently complicated and tricky. Michele Simionato goes beyond criticizing and actually offers a solution in his Setting Multiple Inheritance Straight: he implements traits, a constrained form of mixins that originated in the Self language. Simionato has a long series of illuminating blog posts about multiple inheritance in Python, including The wonders of cooperative inheritance, or using super in Python 3; Mixins considered harmful, part 1 and part 2; and Things to Know About Python Super, part 1, part 2 and part 3. The oldest posts use the Python 2 super syntax, but are still relevant. 552 | 553 | 抛开这些文章的标题,内建super的问题不在于Python3中它比Python2中好看多少。真正问题在于多重继承,它在继承上面的复杂和奇技淫巧。Michele Simionato 554 | 555 | I read the first edition of Grady Booch’s Object Oriented Analysis and Design, 3E (Addison-Wesley, 2007), and highly recommend it as a general primer on object ori‐ ented thinking, independent of programming language. It is a rare book that covers multiple inheritance without prejudice. 556 | 557 | 我读过Gray Booch的第一版面向对象分析和设计,以及第三版(由Addison-Wesley出版社2007出版), 558 | 559 | >####Soapbox 560 | >#####Think About the Classes You Really Need 561 | >The vast majority of programmers write applications, not frameworks. Even those who do write frameworks are likely to spend a lot (if not most) of their time writing appli‐ cations. When we write applications, we normally don’t need to code class hierarchies. At most, we write classes that subclass from ABCs or other classes provided by the framework. As application developers, it’s very rare that we need to write a class that will act as the superclass of another. The classes we code are almost always leaf classes (i.e., leaves of the inheritance tree). 562 | 563 | >If, while working as an application developer, you find yourself building multilevel class hierarchies, it’s likely that one or more of the following applies: 564 | 565 | >• You are reinventing the wheel. Go look for a framework or library that provides components you can reuse in your application. 566 | 567 | >• You are using a badly designed framework. Go look for an alternative. 568 | 569 | >• You are overengineering. Remember the KISS principle. 570 | 571 | >• You became bored coding applications and decided to start a new framework. 
Congratulations and good luck! 572 | It’s also possible that all of the above apply to your situation: you became bored and decided to reinvent the wheel by building your own overengineered and badly designed framework, which is forcing you to code class after class to solve trivial problems. Hopefully you are having fun, or at least getting paid for it. 573 | 574 | >##### Misbehaving Built-ins: Bug or Feature? 575 | >The built-in dict, list, and str types are essential building blocks of Python itself, so they must be fast—any performance issues in them would severely impact pretty much everything else. That’s why CPython adopted the shortcuts that cause their built-in methods to misbehave by not cooperating with methods overridden by subclasses. A possible way out of this dilemma would be to offer two implementations for each of those types: one “internal,” optimized for use by the interpreter and an external, easily extensible one. 576 | 577 | >But wait, this is what we have: UserDict, UserList, and UserString are not as fast as the built-ins but are easily extensible. The pragmatic approach taken by CPython means we also get to use, in our own applications, the highly optimized implementations that are hard to subclass. Which makes sense, considering that it’s not so often that we need a custom mapping, list, or string, but we use dict, list and str every day. We just need to be aware of the trade-offs involved. 578 | 579 | >##### Inheritance Across Languages 580 | Alan Kay coined the term “object oriented,” and Smalltalk had only single inheritance, although there are forks with various forms of multiple inheritance support, including the modern Squeak and Pharo Smalltalk dialects that support traits—a language con‐ struct that fulfills the role of a mixin class, while avoiding some of the issues with multiple inheritance. 581 | 582 | >The first popular language to implement multiple inheritance was C++, and the feature was abused enough that Java—intended as a C++ replacement—was designed without support for multiple inheritance of implementation (i.e., no mixin classes). That is, until Java 8 introduced default methods that make interfaces very similar to the abstract classes used to define interfaces in C++ and in Python. Except that Java interfaces cannot have state—a key distinction. After Java, probably the most widely deployed JVM lan‐ guage is Scala, and it implements traits. Other languages supporting traits are the latest stable versions of PHP and Groovy, and the under-construction languages Rust and Perl 6—so it’s fair to say that traits are trendy as I write this. 583 | 584 | >Ruby offers an original take on multiple inheritance: it does not support it, but intro‐ duces mixins as a language feature. A Ruby class can include a module in its body, so the methods defined in the module become part of the class implementation. This is a “pure” form of mixin, with no inheritance involved, and it’s clear that a Ruby mixin has no influence on the type of the class where it’s used. This provides the benefits of mixins, while avoiding many of its usual problems. 585 | 586 | >Two recent languages that are getting a lot of traction severely limit inheritance: Go and Julia. Go has no inheritance at all, but it implements interfaces in a way that resembles a static form of duck typing (see “Soapbox” on page 343 for more about this). 
Julia avoids the terms “classes” and has only “types.” Julia has a type hierarchy but subtypes cannot inherit structure, only behaviors, and only abstract types can be subtyped. In addition, Julia methods are implemented using multiple dispatch—a more advanced form of the mechanism we saw in “Generic Functions with Single Dispatch” on page 202. 587 | 588 |  589 | >演讲台 590 | 591 | -------------------------------------------------------------------------------- /13章-运算符重载.md: -------------------------------------------------------------------------------- 1 | # CHAPTER 13 Operator Overloading: Doing It Right 2 | 3 | *There are some things that I kind of feel torn about, like operator overloading. I left out operator overloading as a fairly personal choice because I had seen too many people abuse it in C++.[1] — James Gosling Creator of Java* 4 | 5 | Operator overloading allows user-defined objects to interoperate with infix operators such as+and|or unary operators like-and~. More generally, function invocation (()), attribute access (.), and item access/slicing ([]) are also operators in Python, but this chapter covers unary and infix operators. 6 | 7 | 运算符重载 8 | 9 | In “Emulating Numeric Types” on page 9 (Chapter 1) we saw some trivial implemen‐ tations of operators in a bare bones Vector class. The __add__ and __mul__ methods in Example 1-2 were written to show how special methods support operator overload‐ ing, but there are subtle problems in their implementations that we overlooked. Also, in Example 9-2, we noted that the Vector2d.__eq__ method considers this to be True: Vector(3, 4) == [3, 4]—which may or not make sense. We will address those matters in this chapter. 10 | 11 | In the following sections, we will cover: 12 | 13 | 接下来的小节中,我们涉及的内容有: 14 | 15 | - How Python supports infix operators with operands of different types 16 | - Using duck typing or explicit type checks to deal with operands of various types 17 | - How an infix operator method should signal it cannot handle an operand 18 | - The special behavior of the rich comparison operators (e.g., ==, >, <=, etc.) 19 | 20 | - Python如何使用不同类型的运算对象来支持中缀运算符 21 | - 使用鸭子类型或者明确的类型检查来出处理多种类型的运算对象 22 | - 23 | - 24 | 25 | [1. Source: “The C Family of Languages: Interview with Dennis Ritchie, Bjarne Stroustrup, and James Gosling”. 371] 26 | 27 | - The default handling of augmented assignment operators, like +=, and how to over‐ load them 28 | 29 | ## Operator Overloading 101 运算符重载基础 30 | 31 | Operator overloading has a bad name in some circles. It is a language feature that can be (and has been) abused, resulting in programmer confusion, bugs, and unexpected performance bottlenecks. But if well used, it leads to pleasurable APIs and readable code. Python strikes a good balance between flexibility, usability, and safety by imposing some limitations: 32 | 33 | 在某些圈子里运算符重载声名狼藉。 34 | 35 | - We cannot overload operators for the built-in types. 36 | - We cannot create new operators, only overload existing ones. 37 | - A few operators can’t be overloaded: is, and, or, not (but the bitwise &, |, ~, can). 38 | 39 | - 我们不能够重载内建类型运算符 40 | - 我们不能够创建新的运算符,只能重载已有的。 41 | - 部分运算符是不能够被重载的:is和or,not(但是比特位运算符却可以,&, |, ~ 。) 42 | 43 | In Chapter 10, we already had one infix operator in Vector: ==, supported by the __eq__ method. In this chapter, we’ll improve the implementation of __eq__ to better handle operands of types other than Vector. 
However, the rich comparison operators (==, !=, >, <, >=, <=) are special cases in operator overloading, so we’ll start by overloading four arithmetic operators in Vector: the unary - and +, followed by the infix + and *. 44 | 45 | 在第十章,我们在Vector中已经拥有一个中缀运算符:==,由`__eq__`方法提供支持。在本章,我们 46 | 47 | Let’s start with the easiest topic: unary operators. 48 | 49 | ## Unary Operators 一元运算符 50 | 51 | In The Python Language Reference, “6.5. Unary arithmetic and bitwise operations” lists three unary operators, shown here with their associated special methods: 52 | 53 | 在Python语言手册中, 54 | 55 | `- (__neg__)` 56 | Arithmetic unary negation. If x is -2 then -x == 2. 57 | 58 | `+ (__pos__)` 59 | Arithmetic unary plus. Usually x == +x, but there are a few cases when that’s not true. See “When x and +x Are Not Equal” on page 373 if you’re curious. 60 | 61 | `~ (__invert__)` 62 | Bitwise inverse of an integer, defined as ~x == -(x+1). If x is 2 then ~x == -3. 63 | 64 | The Data Model” chapter of The Python Language Reference also lists the abs(...) built- in function as a unary operator. The associated special method is __abs__, as we’ve seen before, starting with “Emulating Numeric Types” on page 9. 65 | 66 | It’s easy to support the unary operators. Simply implement the appropriate special method, which will receive just one argument: self. Use whatever logic makes sense in your class, but stick to the fundamental rule of operators: always return a new object. In other words, do not modify self, but create and return a new instance of a suitable type. 67 | 68 | 支持一元运算符很简单。简单地实现只接受一个参数的特殊方法即可:self。在类中使用任意能理解的逻辑,但是要 69 | 70 | In the case of - and +, the result will probably be an instance of the same class as self; for +, returning a copy of self is the best approach most of the time. For abs(...), the result should be a scalar number. As for ~, it’s difficult to say what would be a sensible result if you’re not dealing with bits in an integer, but in an ORM it could make sense to return the negation of an SQL WHERE clause, for example. 71 | 72 | 在 - 和 + 的情景中,结果可能是同一个类的实例 73 | 74 | As promised before, we’ll implement several new operators on the Vector class from Chapter 10. Example 13-1 shows the __abs__ method we already had in Example 10-16, and the newly added __neg__ and __pos__ unary operator method. 75 | 76 | 就像之前承诺过的那样,我们会基于第十章的Vector类智商实现多个新的运算。例子13-1展示了我们在例子10-16中已有的`__abs__`方法,并新增了一元运算符方法`__neg__`和`__pos__`。 77 | 78 | Example 13-1. vector_v6.py: unary operators - and + added to Example 10-16 79 | 80 | 例子13-1。 81 | 82 | ```python 83 | def __abs__(self): 84 | return math.sqrt(sum(x * x for x in self)) 85 | 86 | def __neg__(self): 87 | return Vector(-x for x in self) # 1 88 | 89 | def __pos__(self): # 2 90 | return Vector(self) 91 | ``` 92 | 93 | 1. To compute -v, build a new Vector with every component of self negated. 94 | 2. To compute +v, build a new Vector with every component of self. 95 | 96 | 1. 为了计算-v, 97 | 2. 98 | 99 | Recall that Vector instances are iterable, and the Vector.__init__ takes an iterable 100 | argument, so the implementations of __neg__ and __pos__ are short and sweet. 101 | 102 | 回一下,Vector的实例是可变的,而且Vector.__init__接受一个可变参数,所以 103 | 104 | We’ll not implement __invert__, so if the user tries ~v on a Vector instance, Python 105 | will raise TypeError with a clear message: “bad operand type for unary ~: 'Vector'.” The following sidebar covers a curiosity that may help you win a bet about unary + 106 | someday. 
The next important topic is “Overloading + for Vector Addition” on page 375. 107 | 108 | 我们不会实现__invert__,所以对Vector的实例操作~v,Python会抛出包含明确消息的TypeError:“bad operand type for unary ~: 'Vector'.”。 109 | 110 | >#### When x and +x Are Not Equal 111 | >Everybody expects that x == +x, and that is true almost all the time in Python, but I found two cases in the standard library where x != +x. 112 | 每个人都期待x == +x,并且大多数时候在Python中都是true,但是我在标准库中发现有两种情况, x != +x。 113 | >The first case involves the decimal.Decimal class. You can have x != +x if x is a Deci mal instance created in an arithmetic context and +x is then evaluated in a context with different settings. For example, x is calculated in a context with a certain precision, but the precision of the context is changed and then +x is evaluated. See Example 13-2 for a demonstration. 114 | 第一中情况涉及decimal.Decimal类。你可以 115 | 116 | > Example 13-2. A change in the arithmetic context precision may cause x to differ from +x 117 | 例子13-2. 数字上下文精度的改变可能引起x与+x的不同 118 | 119 | > ```python 120 | >>> import decimal 121 | >>> ctx = decimal.getcontext() 122 | >>> ctx.prec = 40 123 | >>> one_third = decimal.Decimal('1') / decimal.Decimal('3') 124 | >>> one_third Decimal('0.3333333333333333333333333333333333333333') 125 | >>> one_third == +one_third 126 | True 127 | >>> ctx.prec = 28 128 | >>> one_third == +one_third 129 | False 130 | >>> +one_third 131 | Decimal('0.3333333333333333333333333333') 132 | ``` 133 | 134 | 1. Get a reference to the current global arithmetic context. 135 | 2. Set the precision of the arithmetic context to 40. 136 | 3. Compute 1/3 using the current precision. 137 | 4. Inspect the result; there are 40 digits after the decimal point. 138 | 5. one_third == +one_third is True. 139 | 5. Lower precision to 28—the default for Decimal arithmetic in Python 3.4. Now one_third == +one_third is False. 140 | 6. Inspect +one_third; there are 28 digits after the '.' here. 141 | 142 | 1. 143 | 144 | >The fact is that each occurrence of the expression +one_third produces a new Deci mal instance from the value of one_third, but using the precision of the current arith‐ metic context. 145 | 146 | >The second case where x != +x you can find in the collections.Counter documen‐ tation. The Counter class implements several arithmetic operators, including infix + to add the tallies from two Counter instances. However, for practical reasons, Counter addition discards from the result any item with a negative or zero count. And the prefix + is a shortcut for adding an empty Counter, therefore it produces a new Counter preserving only the tallies that are greater than zero. See Example 13-3. 147 | 148 | Example 13-3. Unary + produces a new Counter without zeroed or negative tallies 149 | 150 | ```shell 151 | >>> ct = Counter('abracadabra') 152 | >>> ct 153 | Counter({'a': 5, 'r': 2, 'b': 2, 'd': 1, 'c': 1}) >>> ct['r'] = -3 154 | >>> ct['d'] = 0 155 | >>> ct 156 | ``` 157 | Counter({'a': 5, 'b': 2, 'c': 1, 'd': 0, 'r': -3}) >>> +ct 158 | Counter({'a': 5, 'b': 2, 'c': 1}) 159 | Now, back to our regularly scheduled programming. 160 | 161 | 162 | 163 | ## Overloading + for Vector Addition 164 | The Vector class is a sequence type, and the section “3.3.6. Em‐ ulating container types” in the “Data Model” chapter says sequen‐ ces should support the + operator for concatenation and * for repetition. However, here we will implement + and * as mathe‐ matical vector operations, which are a bit harder but more mean‐ ingful for a Vector type. 
Vector类是一个序列类型,而且"数据模型"一章的"3.3.6 模拟容器类型"小节说过,序列应该支持 + 运算符用于拼接,支持 * 运算符用于重复。不过,这里我们会把 + 和 * 实现为数学上的矢量运算,这样做难度稍大,但对Vector类型来说更有意义。

Adding two Euclidean vectors results in a new vector in which the components are the pairwise additions of the components of the addends. To illustrate:

两个欧几里德矢量相加得到一个新矢量,新矢量的各个分量是两个加数对应分量之和。举例说明:

```shell
>>> v1 = Vector([3, 4, 5])
>>> v2 = Vector([6, 7, 8])
>>> v1 + v2
Vector([9.0, 11.0, 13.0])
>>> v1 + v2 == Vector([3+6, 4+7, 5+8])
True
```

What happens if we try to add two Vector instances of different lengths? We could raise an error, but considering practical applications (such as information retrieval), it's better to fill out the shortest Vector with zeros. This is the result we want:

如果我们尝试把两个长度不同的Vector实例相加,会发生什么?我们可以抛出错误,但考虑到实际应用(比如信息检索),更好的做法是用零填充较短的那个Vector。这就是我们想要的结果:

```shell
>>> v1 = Vector([3, 4, 5, 6])
>>> v3 = Vector([1, 2])
>>> v1 + v3
Vector([4.0, 6.0, 5.0, 6.0])
```

Given these basic requirements, the implementation of __add__ is short and sweet, as shown in Example 13-4.

Example 13-4. Vector.__add__ method, take #1

```python
# inside the Vector class
def __add__(self, other):
    pairs = itertools.zip_longest(self, other, fillvalue=0.0)  # 1
    return Vector(a + b for a, b in pairs)  # 2
```

1. pairs is a generator that will produce tuples (a, b) where a is from self, and b is from other. If self and other have different lengths, fillvalue is used to supply the missing values for the shortest iterable.
2. A new Vector is built from a generator expression producing one sum for each item in pairs.

Note how __add__ returns a new Vector instance, and does not affect self or other.

>Special methods implementing unary or infix operators should never change their operands. Expressions with such operators are expected to produce results by creating new objects. Only augmented assignment operators may change the first operand (self), as discussed in "Augmented Assignment Operators" on page 388.

Example 13-4 allows adding Vector to a Vector2d, and Vector to a tuple or to any iterable that produces numbers, as Example 13-5 proves.

Example 13-5. Vector.__add__ take #1 supports non-Vector objects, too

```shell
>>> v1 = Vector([3, 4, 5])
>>> v1 + (10, 20, 30)
Vector([13.0, 24.0, 35.0])
>>> from vector2d_v3 import Vector2d
>>> v2d = Vector2d(1, 2)
>>> v1 + v2d
Vector([4.0, 6.0, 5.0])
```

Both additions in Example 13-5 work because __add__ uses zip_longest(...), which can consume any iterable, and the generator expression to build the new Vector merely performs a + b with the pairs produced by zip_longest(...), so an iterable producing any numeric items will do.

However, if we swap the operands (Example 13-6), the mixed-type additions fail.

Example 13-6. Vector.__add__ take #1 fails with non-Vector left operands

```python
>>> v1 = Vector([3, 4, 5])
>>> (10, 20, 30) + v1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate tuple (not "Vector") to tuple
>>> from vector2d_v3 import Vector2d
>>> v2d = Vector2d(1, 2)
>>> v2d + v1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'Vector2d' and 'Vector'
```

To support operations involving objects of different types, Python implements a special dispatching mechanism for the infix operator special methods. Given an expression a + b, the interpreter will perform these steps (also see Figure 13-1):

1. If a has __add__, call a.__add__(b) and return result unless it's NotImplemented.
2. If a doesn't have __add__, or calling it returns NotImplemented, check if b has __radd__, then call b.__radd__(a) and return result unless it's NotImplemented.
3. If b doesn't have __radd__, or calling it returns NotImplemented, raise TypeError with an unsupported operand types message.

img:

Figure 13-1. Flowchart for computing a + b with __add__ and __radd__

The __radd__ method is called the "reflected" or "reversed" version of __add__. I prefer to call them "reversed" special methods.[2] Three of this book's technical reviewers—Alex, Anna, and Leo—told me they like to think of them as the "right" special methods, because they are called on the righthand operand. Whatever "r"-word you prefer, that's what the "r" prefix stands for in __radd__, __rsub__, and the like.

Therefore, to make the mixed-type additions in Example 13-6 work, we need to implement the Vector.__radd__ method, which Python will invoke as a fallback if the left operand does not implement __add__ or if it does but returns NotImplemented to signal that it doesn't know how to handle the right operand.

Do not confuse NotImplemented with NotImplementedError. The first, NotImplemented, is a special singleton value that an infix operator special method should return to tell the interpreter it cannot handle a given operand. In contrast, NotImplementedError is an exception that stub methods in abstract classes raise to warn that they must be overwritten by subclasses.

The simplest possible __radd__ that works is shown in Example 13-7.

Example 13-7. Vector.__add__ and __radd__ methods

```python
# inside the Vector class
def __add__(self, other):  # 1
    pairs = itertools.zip_longest(self, other, fillvalue=0.0)
    return Vector(a + b for a, b in pairs)

def __radd__(self, other):  # 2
    return self + other
```

1. No changes to __add__ from Example 13-4; listed here because __radd__ uses it.
2. __radd__ just delegates to __add__.

Often, __radd__ can be as simple as that: just invoke the proper operator, therefore delegating to __add__ in this case. This applies to any commutative operator; + is commutative when dealing with numbers or our vectors, but it's not commutative when concatenating sequences in Python.

[注释2. The Python documentation uses both terms. The "Data Model" chapter uses "reflected," but "9.1.2.2.
Imple‐ menting the arithmetic operations” in the numbers module docs mention “forward” and “reverse” methods, and I find this terminology better, because “forward” and “reversed” clearly name each of the directions, while “reflected” doesn’t have an obvious opposite. ] 280 | 281 | 282 | 283 | -------------------------------------------------------------------------------- /14章-可迭代之迭代器和生成器.md: -------------------------------------------------------------------------------- 1 | # CHAPTER 14 Iterables, iterators and generators 2 | 3 | *When I see patterns in my programs, I consider it a sign of trouble. The shape of a program should reflect only the problem it needs to solve. Any other regularity in the code is a sign, to me at least, that I’m using abstractions that aren’t powerful enough — often that I’m generating by hand the expansions of some macro that I need to write [1]. 4 | — Paul Graham 5 | Lisp hacker and venture capitalist* 6 | 7 | 当我审视自己程序中的模式时,我视其为问题的发端。程序的式样应该只反映出它需要解决的。就我而言,代码中任何其他规律都是一种标志,说明我使用的抽象还不够多,也就是说我需要写某些宏扩展通常是手工写出来【1】。 8 | 9 | Iteration is fundamental to data processing. And when scanning datasets that don’t fit in memory, we need a way to fetch the items lazily, that is, one at a time and on demand. This is what the Iterator pattern is about. This chapter shows how the Iterator pattern is built into the Python language so you never need to implement it by hand. 10 | 11 | 迭代是数据处理的基础。当扫描数据集时不要放进内存,我们需要一种`惰性`获取项的方法,即,特定时刻而且是按需的。这就是迭代器模式所要表现的内容。本章展示了迭代器模式如何被构建到了Python语言中,因此你也绝对不需要自己手工实现它。 12 | 13 | Python does not have macros like Lisp (Paul Graham’s favorite language), so abstracting away the Iterator pattern required changing the language: the yield keyword was added in Python 2.2 (2001)[2]. The yield keyword allows the construction of generators, which work as iterators. 14 | 15 | 不像Lisp,Python没有宏,所以抽象出迭代器模式就要求改变语言:在Python 2.2(2001)【注释2】中加入的yield关键子。yield关键字实现了生成器的构建,它能够像迭代器一样使用。 16 | 17 | >####Note 18 | >Every generator is an iterator: generators fully implement the iterator interface. But an iterator — as defined in the GoF book — re‐trieves items from a collection, while a generator can produce items “out of thin air”. That’s why the Fibonacci sequence generator is a common example: an infinite series of numbers cannot be stored in a collection. However, be aware that the Python community treats iterator and generator as synonyms most of the time. 19 | 20 | 21 | >####注释 22 | >每个生成器都是一个迭代器:生成器完全实现了迭代器接口。但迭代器在GoF这本书中定义为:从collection中重新取回项,而生成器则是“凭空”产生项。这也是斐波那契序列生成器作为常见示例的原因:一个无限的数字列是不能够存储在collection中。不过,要主要在Python社区中很多时候都把迭代器和生成器当做同义词。 23 | 24 | [1]. From Revenge of the Nerds, a blog post. 25 | [2]. Python 2.2 users could use yield with the directive from __future__ import generators; yield became available by default in Python 2.3. 26 | 27 | [1]. 来自博文《呆瓜的复仇》。 28 | [2]. Python 2.2 用户能够利用命令`from __future__ import generators`运用yield;在Python 2.3中yield默认可用。 29 | 30 | Python 3 uses generators in many places. Even the range() built-in now returns a generator-like object instead of full-blown lists like before. If you must build a list from range, you have to be explicit, e.g. list(range(100)). 
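As a quick illustration of the note above—an endless series that no collection could hold, produced lazily, one value at a time—here is a minimal Fibonacci generator sketch; it is not one of the book's numbered examples:

```python
from itertools import islice


def fibonacci():
    """Generate the infinite Fibonacci series lazily, one value at a time."""
    a, b = 0, 1
    while True:         # an endless series; values are produced on demand, never stored
        yield a
        a, b = b, a + b


# Take only what is needed; the full series is never materialized in memory.
print(list(islice(fibonacci(), 10)))  # -> [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```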
31 | 32 | Python 3在很多地方使用了生成器。甚至内建的range()现在也返回一个类生成器对象,而不是之前的完整列表。如果你必须用range构建出一个列表,那么你必须明确的指明,比如,list(range(100))。 33 | 34 | Every collection in Python is iterable, and iterators are used internally to support: 35 | 36 | Python中的每一个collection都是可迭代的,迭代器用于集合的内部以支持以下动作: 37 | 38 | - for loops; 39 | - collection types construction and extension; 40 | - looping over text files line by line; 41 | - list, dict and set comprehensions; 42 | - tuple unpacking; 43 | - unpacking actual parameters with * in function calls. 44 | 45 | - 支持循环; 46 | - collection类型构建以及扩展; 47 | - 一行接一行的循环文本文件; 48 | - 列表、字典和集合解析式; 49 | - 元组解包; 50 | - 在函数调用中使用 * 解包实参。 51 | 52 | This chapter covers the following topics: 53 | 54 | 本章讨论以下话题: 55 | 56 | - How the iter(...) built-in function is used internally to handle iterable objects. 57 | - How to implement the classic Iterator pattern in Python. 58 | - How a generator function works in detail, with line by line descriptions. 59 | - How the classic Iterator can be replaced by a generator function or generator ex‐ pression. 60 | - Leveraging the general purpose generator functions in the standard library. 61 | - Using the new yield from statement to combine generators. 62 | - A case study: using generator functions in a database conversion utility designed to work with large data sets. 63 | - Why generators and coroutines look alike but are actually very different and should not be mixed. 64 | 65 | - 内建函数iter(...)是如何用在内部去处理可迭代对象的。 66 | - 在Python如何实现典型的迭代器模式 67 | - 生成器函数具体是如何工作的,我们一字一句得来说一说 68 | - 改进标准库中的普通用途生成器函数 69 | - 在语句中使用新的yield来合并生成器 70 | - 案例研究:在数据库中使用生成器函数转变实用设计以便用大数据集。 71 | - 为什么生成器和协程看起相似,但实际上有着很大不同,所以不要把它们给弄混淆了。 72 | 73 | We’ll get started studying how the iter(...) function makes sequences iterable. 74 | 75 | 我们从研习iter(...)函数如何使序列可迭代开始。 76 | 77 | ## Sentence take #1: a sequence of words 78 | 79 | We’ll start our exploration of iterables by implementing a Sentence class: you give its constructor a string with some text, and then you can iterate word by word. The first version will implement the sequence protocol, and it’s iterable because all sequences are iterable, as we’ve seen before, but now we’ll see exactly why. 80 | 81 | 我们通过实现的Sentence类开始对可迭代的探究:你对类的构造器赋值一些文本,然后你就可以一个词接着一个词的迭代。第一个版本会将实现序列接口,一如我们之前所见,它是可迭代的,因为所有的序列都是可迭代的,现在我们就来看看到底为什么是这样。 82 | 83 | Example 14-1 shows a Sentence class that extracts words from a text by index. 84 | 85 | 例子14-1 展示了一个通过索引从文本提取单词的Sentence类。 86 | 87 | *Example 14-1. sentence.py: A Sentence as a sequence of words.* 88 | *例子14-1.sentence.py:由单词序列组成的Sentence* 89 | 90 | ```python 91 | import re 92 | import reprlib 93 | 94 | 95 | RE_WORD = re.compile('\w+') 96 | 97 | 98 | class Sentence: 99 | def __init__(self, text): 100 | self.text = text 101 | self.words = RE_WORD.findall(text) # 1 102 | 103 | def __getitem__(self, index): 104 | return self.words[index] # 2 105 | 106 | def __len__(self): # 3 107 | return len(self.words) 108 | 109 | def __repr__(self): 110 | return 'Sentence(%s)' % reprlib.repr(self.text) # 4 111 | ``` 112 | 113 | - `re.findall` returns a list with all non-overlapping matches of the regular 114 | expression, as a list of strings. 115 | - `self.words` holds the result of `.findall`, so we simply return the word at the given index. 116 | - To complete the sequence protocol, we implement `__len__` — but it is not needed to make an iterable object. 117 | - `reprlib.repr` is a utility function to generate abbreviated string representations of data structures that can be very large [3]. 118 | 119 | 1. 
`re.findall`返回一个正则表达式的所有不重叠匹配的字符串列表。 120 | 2. `self.words`包含了`.findall`的结果,所以我们就简单地返回指定索引的单词。 121 | 3. 为了完成序列协议,我们实现了`__len__`,但是对于生成可迭代对象来所,这并不是必须的。 122 | 4. `reprlib.repr`是一个用来生成大量数据的字符串显示缩写的实用函数【注释3】。 123 | 124 | [注释3] We first used reprlib in “Vectortake#1:Vector2dcompatible” on page 278.` 125 | 126 | By default, `reprlib.repr` limits the generated string to 30 characters. See the following console session to see how `Sentence` is used: 127 | 128 | 默认情况下,`reprlib.repr`限制生成的字符串为30个字符串。在下列终端中的会话你可以看到Sentence是如何使用的: 129 | 130 | *Example 14-2. Testing iteration on a Sentence instance.* 131 | 132 | ```python 133 | >>> s = Sentence('"The time has come," the Walrus said,') # 1 134 | >>> s 135 | Sentence('"The time ha... Walrus said,') # 2 136 | >>>for word in s: # 3 137 | ... print(word) 138 | The 139 | time 140 | has 141 | come 142 | the 143 | Walrus 144 | said 145 | >>> list(s) # 4 146 | ['The', 'time', 'has', 'come', 'the', 'Walrus', 'said'] 147 | ``` 148 | 149 | - A sentence is created from a string. 150 | - Note the output of `__repr__` using `...` generated by reprlib.repr. 151 | - Sentence instances are iterable, we’ll see why in a moment. 152 | - Being iterable, Sentence objects can be used as input to build lists and other iterable types. 153 | 154 | 1. 创建了一个字符串句子。 155 | 2. 注意`__repr__`的输出,它使用了通过由reprlib.repr生成的`...`。 156 | 3. Sentence实例是可迭代的,稍后我们来看看为什么。 157 | 4. 因为可迭代,所以Sentence对象可以用于输入以构造列表和其他可迭代类型。 158 | 159 | In the next pages, we’ll develop other Sentence classes that pass the tests in Example 14-2. However, the implementation in Example 14-1 is different from all the others because it’s also a sequence, so you can get words by index: 160 | 161 | 在下一页,我们会编写其他的Sentence在例子14-2中通过测试的类。不过,在例子14-1中实现的例子和其他的所有例子都不相同,因为结果就是一个序列,所以能通过索引获取单词: 162 | 163 | ```python 164 | >>> s[0] 165 | 'The' 166 | >>> s[5] 167 | 'Walrus' 168 | >>> s[-1] 169 | 'said' 170 | ``` 171 | 172 | Every Python programmer knows that sequences are iterable. Now we’ll see precisely why. 173 | 174 | 每个Python程序员都知道序列是可迭代的。现在我们来仔细的一探究竟。 175 | 176 | ## Why sequences are iterable: the iter function 为什么序列是可迭代的之iter函数 177 | 178 | Whenever the interpreter needs to iterate over an object x, it automatically calls iter(x). The iter built-in function: 179 | 180 | 不论何时当解释器需要迭代对象x时,它会自动地调用iter(x)。iter为内建函数: 181 | 182 | 1. Checks whether the object implements, `__iter__`, and calls that to obtain an iterator; 183 | 2. If `__iter__` is not implemented, but `__getitem__` is implemented, Python creates an iterator that attempts to fetch items in order, starting from index 0 (zero); 184 | 3. If that fails, Python raises TypeError, usually saying "'C' object is not itera ble", where C is the class of the target object. 185 | 186 | - 检查对象是否实现`__iter__`,并调用它来获得一个迭代器; 187 | - 如果没有实现`__iter__`,而是实现了`__getitem__`,Python会创建一个迭代器,从索引0开始,尝试按顺序去获取项; 188 | - 如果以上方法调用都失败了,Python抛出TypeError,通常是“C object is not iterable”,这里C是目标对象的类。 189 | 190 | That is why any Python sequence is iterable: they all implement `__getitem__`. In fact, the standard sequences also implement `__iter__`, and yours should too, because the special handling of `__getitem__` exists for backward compatibility reasons and may be gone in the future (although it is not deprecated as I write this). 
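
As an aside, to make the "special handling of `__getitem__`" concrete, here is a minimal sketch using a toy class of my own (not from the book): it defines no `__iter__` at all, yet `for` loops, `list()` and `iter()` all work on it, precisely because `__getitem__` accepts 0-based integer indexes and raises IndexError when the items run out.

```python
>>> class Squares:                # toy class: only __getitem__, no __iter__
...     def __getitem__(self, index):
...         if index > 3:
...             raise IndexError(index)
...         return index * index
...
>>> list(Squares())               # iteration still works via the legacy protocol
[0, 1, 4, 9]
>>> it = iter(Squares())          # iter() builds an iterator from __getitem__
>>> next(it)
0
```
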
191 | 192 | 这就是为什么Python序列是可迭代的:因为它们都实现了`__getitem__`。实际上,标准序列也实现了`__iter__`,而且你的自定义序列也应当如此,因为现有的为了向后兼容性的特殊处理 `__getitem__`,在未来或许会消失(尽管在我写本书时还没有移除)。 193 | 194 | As mentioned in “Python digs sequences” on page 312, this is an extreme form of duck typing: an object is considered iterable not only when it implements the special method `__iter__`, but also when it implements `__getitem__`, as long as `__getitem__` accepts int keys starting from 0. 195 | 196 | 就像在312页中提及的“深入Python序列”那样,这就是一个鸭子类型的一个极致表现:一个对象被认为是可迭代的不仅仅是它实现了特殊方法`__iter__`,还实现了`__getitem__`,只要`__getitem__`能够接受从0开始的整数键。 197 | 198 | In the goose-typing approach, the definition for an iterable is simpler but not as flexible: an object is considered iterable if it implements the `__iter__` method. No subclassing or registration is required, because abc.Iterable implements the `__subclasshook__`, as seen in “Geese can behave as ducks” on page 340. Here is a demonstration: 199 | 200 | 在鹅类型方法中,可迭代的定义更简单了,但却不那么灵活:如果一个对象实现了`__iter__`方法便认为是可迭代的。没有子类化或者注册上的要求,因为abc.Iterable实现了`__subclasshook__`,你可以在340页中的“鹅的行为可以像鸭子”中见到。这里对其说明: 201 | 202 | ```python 203 | >>> class Foo: 204 | ... def __iter__(self): 205 | ... pass 206 | ... 207 | >>> from collections import abc 208 | >>> issubclass(Foo, abc.Iterable) 209 | True 210 | >>> f = Foo() 211 | >>> isinstance(f, abc.Iterable) 212 | True 213 | ``` 214 | 215 | However, note that our initial Sentence class does not pass the issubclass(Sentence, abc.Iterable) test, even though it is iterable in practice. 216 | 217 | 不过,值得注意的是第一版Sentence类并没有通过issubclass(Sentence, abc.Iterable)测试,即便实际上它是可迭代的。 218 | 219 | >**Note** 220 | >As of Python 3.4, the most accurate way to check whether an object x is iterable is to call iter(x) and handle a TypeError exception if it isn’t. This is more accurate than using isinstance(x, abc.Iterable), because iter(x) also considers the legacy `__getitem__` method, while the Iterable ABC does not. 221 | >**注释** 222 | >从Python 3.4开始,检查对象x是否是可迭代的最精确方法为调用iter(x),并在对象不可迭代时处理TypeError异常。这比使用isinstance(x, abc.Iterable)更为精准,因为iter(x)还使用了早期的`__getitem__`方法,而Iterable ABC则不然。 223 | 224 | Explicitly checking whether an object is iterable may not be worthwhile if right after the check you are going to iterate over the object. After all, when the iteration is attempted on a noniterable, the exception Python raises is clear enough: TypeError: 'C' object is not iterable . If you can do better than just raising TypeError, then do so in a try/except block instead of doing an explicit check. The explicit check may make sense if you are holding on to the object to iterate over it later; in this case, catching the error early may be useful. 225 | 226 | 如果这之后你立刻作出迭代玩整个对象,而明确地检查一个对象是否可迭代有些得不偿失。毕竟,当迭代的尝试发生在一个不可迭代身上时,Python抛出的异常足够清楚了:TypeError: 'C' object is not iterable。假如你想将代码写的更好些而不仅仅抛出TypeError,那么你可以在try/except语句块中这样做,而不是做显式的检查。如果你在对一个稍后需要迭代的对象,显式地检查可以让人易于理解;于此情况下,早些捕捉错误会更有用。 227 | 228 | The next section makes explicit the relationship between iterables and iterators. 229 | 230 | 下一节将明确一下可迭代和迭代器之间的关系。 231 | 232 | ### Iterables Versus Iterators 可迭代与迭代器 233 | 234 | From the explanation in “Why Sequences Are Iterable: The iter Function” on page 404 we can extrapolate a definition: 235 | 236 | 从404页中的定义“为什么序列是可迭代:iter函数”中我们可以推断定义: 237 | 238 | *iterable* 239 | Any object from which the iter built-in function can obtain an iterator. Objects implementing an `__iter__` method returning an iterator are iterable. 
Sequences are always iterable; as are objects implementing a `__getitem__` method that takes 0-based indexes. 240 | 241 | *可迭代* 242 | 任何来自内建iter函数的对象都可以的迭代器。对象实现了`__iter__`方法,能够返回一个迭代器便是可迭代的。序列总是可迭代的;因为对象实现了能够接受从0开始的索引的`__getitem__`方法。 243 | 244 | It’s important to be clear about the relationship between iterables and iterators: Python obtains iterators from iterables. 245 | 246 | 重要的是要弄清楚可迭代和迭代器之间的关系:Python从可迭代中获得迭代器。 247 | 248 | Here is a simple for loop iterating over a str. The str 'ABC' is the iterable here. You don’t see it, but there is an iterator behind the curtain: 249 | 250 | 这里有一个简单的迭代字符串的for循环。此处的'ABC'是可迭代的。你虽然看不见它,但其背后有一个迭代器在工作: 251 | 252 | ```python 253 | >>> s = 'ABC' 254 | >>> for char in s: 255 | ... print(char) 256 | ... 257 | A 258 | B 259 | C 260 | ``` 261 | 262 | If there was no for statement and we had to emulate the for machinery by hand with a while loop, this is what we’d have to write: 263 | 264 | 如果没有for语句,那么我们就得自己手动使用while循环来模拟for机制,这里就是我们必须去编写的: 265 | 266 | ```python 267 | >>> s = 'ABC' 268 | >>> it = iter(s) # 1 269 | >>> while True: # 2 270 | ... try: 271 | ... print(next(it)) 272 | ... except StopIteration: #3 273 | ... del it # 4 274 | ... break # 5 275 | ... 276 | A 277 | B 278 | C 279 | ``` 280 | 281 | 1. Build an iterator it from the iterable. 282 | 2. Repeatedly call next on the iterator to obtain the next item. 283 | 3. The iterator raises StopIteration when there are no further items. 284 | 4. Release reference to it—the iterator object is discarded. 285 | 5. Exit the loop. 286 | 287 | - 从可迭代对象中构造一个迭代器。 288 | - 对迭代器重复地调用next以获取下一个项。 289 | - 在没有更多的项时迭代器抛出StopIteration。 290 | - 释放对it的引用,即迭代器被丢弃。 291 | - 退出循环 292 | 293 | StopIteration signals that the iterator is exhausted. This exception is handled inter‐nally in for loops and other iteration contexts like list comprehensions, tuple unpacking, etc. 294 | 295 | StopIteration指示迭代器迭代完了。这个异常是在for循环内部被处理的,以及其他的迭代上下文,比如列表解析式、元组解包等等。 296 | 297 | The standard interface for an iterator has two methods: 298 | 299 | 迭代器的标准接口有两个方法: 300 | 301 | `__next__` 302 | Returns the next available item, raising StopIteration when there are no more items. 303 | 返回下一个可用的项,在没有更多可用项时抛出StopIteration。 304 | 305 | `__iter__` 306 | Returns self; this allows iterators to be used where an iterable is expected, for example, in a for loop. 307 | 返回self;允许将迭代器用在期望可迭代的地方,例如,在循环中。 308 | 309 | This is formalized in the collections.abc.Iterator ABC, which defines the `__next__` abstract method, and subclasses Iterable—where the abstract `__iter__` method is defined. See Figure 14-1. 310 | 311 | 在collections.abc.Iterator ABC中这是标准化用法,它定义了 `__next__` 抽象方法,以及可迭代的子类——即定义抽象`__iter__`方法的地方。参见图表14-1. 312 | 313 | ![img](images/14-1.png) 314 | 315 | *Figure 14-1. The Iterable and Iterator ABCs. Methods in italic are abstract. A concrete Iterable.iter should return a new Iterator instance. A concrete Iterator must implement next. The Iterator.iter method just returns the instance itself.* 316 | 317 | *图表14-1. 可迭代和迭代器ABC。以斜体字显示的方法是抽象的。具体的Iterable.iter应该返回一个新的迭代器实例。具体的迭代器必须使用next方法。Iterator.iter方法仅仅返回了自身。* 318 | 319 | The Iterator ABC implements `__iter__` by doing return self. This allows an iterator to be used wherever an iterable is required. The source code for abc.Iterator is in Example 14-3. 320 | 321 | 迭代器ABC通过返回自身实现了`__iter__`。这就让迭代器可以用在任何有可迭代需求的地方。abc.Iterator的源码在例子14-3中。 322 | 323 | *Example 14-3. abc.Iterator class; extracted from Lib/_collections_abc.py* 324 | *例子14-3. 
类abc.Iterator;提取自Lib/_collections_abc.py* 325 | 326 | ```python 327 | class Iterator(Iterable): 328 | __slots__ = () 329 | 330 | @abstractmethod 331 | def __next__(self): 332 | 'Return the next item from the iterator. When exhausted, raise StopIteration' 333 | raise StopIteration 334 | 335 | def __iter__(self): 336 | return self 337 | 338 | @classmethod 339 | def __subclasshook__(cls, C): 340 | if cls is Iterator: 341 | if (any("__next__" in B.__dict__ for B in C.__mro__) and any("__iter__" in B.__dict__ for in C.__mro__)): 342 | return True 343 | return NotImplemented 344 | ``` 345 | 346 | >####Warning 347 | >The Iterator ABC abstract method is `it.__next__()` in Python 3 and `it.next()` in Python 2. As usual, you should avoid calling special methods directly. Just use the `next(it)`: this built-in func‐ tion does the right thing in Python 2 and 3. 348 | >####警告⚠️ 349 | >迭代器ABC的抽象方法在Python 3中是`it.__next__()`在Python 2中是`it.next()`. 350 | 351 | The `Lib/types.py` module source code in Python 3.4 has a comment that says: 352 | Iterators in Python aren't a matter of type but of protocol. A large 353 | and changing number of builtin types implement *some* flavor of 354 | iterator. Don't check the type! Use hasattr to check for both 355 | "__iter__" and "__next__" attributes instead. 356 | 357 | Python 3.4中模块`Lib/types.py`的源码注释如是: 358 | 359 | - Python中的迭代器与类型有关而不是接口。 360 | - 大量的处于改变中的内建类型实现了“某些”风格的迭代器。 361 | - 不要检查类型!作为替代请使用hasattr来检查属性"__iter__" 和 "__next__"。 362 | 363 | In fact, that’s exactly what the `__subclasshook__` method of the abc.Iterator ABC does (see Example 14-3). 364 | 365 | 实际上,这正是abc.Iterator的ABC的`__subclasshook__`所做的事情(参见例子14-3)。 366 | 367 | >**Note** 368 | >Taking into account the advice from `Lib/types.py` and the logic implemented in `Lib/_collections_abc.py`, the best way to check if an object x is an iterator is to call isinstance(x, abc.Iterator). Thanks to `Iterator.__subclasshook__`, this test works even if the class of x is not a real or virtual subclass of Iterator. 369 | >**注释** 370 | >考虑到`Lib/types.py`的建议,以及`Lib/_collections_abc.py`中实现的逻辑,检查一个对象x是否是一个迭代器的最佳办法是去调用isinstance(x, abc.Iterator)。要感谢`Iterator.__subclasshook__`,这个测试可以正常工作,即便类x不是真的抑或不是Iterator的虚子类。 371 | 372 | Back to our Sentence class from Example 14-1, you can clearly see how the iterator is built by iter(...) and consumed by next(...) using the Python console: 373 | 374 | 回到我们的例子14-1中来,你可以清楚地看到迭代器是如何由iter(...)构造,然后被next(...)在Python终端中使用: 375 | 376 | ```python 377 | >>> s3 = 'ABC' # 1 378 | >>> it = iter(s) # 1 379 | >>> while True: 380 | ... try: 381 | ... print(next(it)) # 2 382 | ... except StopIteration: # 3 383 | ... del it # 4 384 | ... break # 5 385 | ... 386 | A 387 | B 388 | C 389 | ``` 390 | 391 | - Create a sentence s3 with three words. 392 | - Obtain an iterator from s3. 393 | - next(it) fetches the next word. 394 | - There are no more words, so the iterator raises a StopIteration exception. Once exhausted, an iterator becomes useless. 395 | - To go over the sentence again, a new iterator must be built. 396 | 397 | 1. 创建一个包含三个单词的句子s3 398 | 2. 从s3中获取迭代器 399 | 3. 调用next(it)取得下一个单词 400 | 4. 没有更多的单词了,所以迭代器抛出一个StopIteration异常。一旦可迭代元素用完,迭代器就没有用了。 401 | 5. 为了重新把句子再过一遍,必须构建一个新的迭代器。 402 | 403 | Because the only methods required of an iterator are `__next__` and `__iter__`, there is no way to check whether there are remaining items, other than to call next() and catch StopInteration. Also, it’s not possible to “reset” an iterator. If you need to start over, you need to call iter(...) 
on the iterable that built the iterator in the first place. Calling iter(...) on the iterator itself won’t help, because—as mentioned—`Iterator.__iter__` is implemented by returning self, so this will not reset a depleted iter‐ ator. 404 | 405 | 因为迭代器唯一要求的方法是`__next__`和`__iter__`,除了调用next() 以捕捉StopInteration之外,而且也没有办法检查剩余的项。而且,也没有可能去“重设”迭代器。如果你需要重新开始,那么怒 406 | 407 | To wrap up this section, here is a definition for iterator: 408 | 409 | 为了结束本节,这里是迭代器的定义: 410 | 411 | **iterator** 412 | Any object that implements the `__next__` no-argument method that returns the next item in a series or raises StopIteration when there are no more items. Python iterators also implement the `__iter__` method so they are iterable as well. 413 | 414 | **迭代器** 415 | 任何实现了不含参数的`__next__`方法,其返回序列中的下一个项或者在没有项时抛出StopIteration。Python迭代器还实现了`__iter__`方法,所以它们也是可迭代的。 416 | 417 | This first version of Sentence was iterable thanks to the special treatment the iter(...) built-in gives to sequences. Now we’ll implement the standard iterable protocol. 418 | 419 | Sentence的第一个版本能够迭代,要感谢对分配给语句的内建iter(...)的特殊对待。现在我们要实现标准的迭代接口。 420 | 421 | ## Sentence Take #2: A Classic Iterator 经典迭代器 422 | 423 | The next Sentence class is built according to the classic Iterator design pattern following the blueprint in the GoF book. Note that this is not idiomatic Python, as the next re‐ factorings will make very clear. But it serves to make explicit the relationship between the iterable collection and the iterator object. 424 | 425 | 接下来的Sentence类的构造应用了遵循了GoF这本书中的蓝图中经典迭代器设计模式。 426 | 427 | Example 14-4 shows an implementation of a Sentence that is iterable because it implements the `__iter__` special method, which builds and returns a Sentence Iterator. This is how the Iterator design pattern is described in the original Design Patterns book. 428 | 429 | 例子14-4展示了一个可迭代的Sentence实现,因为它实现了特殊方法`__iter__`,构造并返回了Sentence迭代器。这便是原始设计模式书中所描述的迭代器设计模式。 430 | 431 | We are doing it this way here just to make clear the crucial distinction between an iterable and an iterator and how they are connected. 432 | 433 | 我们这样做的仅仅是为了区分可迭代和迭代器之间的重要区别,以及它们之间的是如何联系的。 434 | 435 | *Example 14-4. sentence_iter.py: Sentence implemented using the Iterator pattern* 436 | 例子14-4. sentence_iter.py:使用了迭代器模式实现的Sentence 437 | 438 | ```python 439 | import re 440 | import reprlib 441 | RE_WORD = re.compile('\w+') 442 | 443 | class Sentence: 444 | def __init__(self, text): 445 | self.text = text 446 | self.words = RE_WORD.findall(text) 447 | 448 | def __repr__(self): 449 | return 'Sentence(%s)' % reprlib.repr(self.text) 450 | 451 | def __iter__(self): 452 | return SentenceIterator(self.words) 453 | 454 | 455 | class SentenceIterator: 456 | def __init__(self, words): 457 | self.words = words 458 | self.index = 0 459 | 460 | def __next__(self): 461 | try: 462 | word = self.words[self.index] 463 | except IndexError: 464 | raise StopIteration() 465 | self.index += 1 466 | return word 467 | 468 | def __iter__(self): 469 | return self 470 | ``` 471 | 472 | - The __iter__ method is the only addition to the previous Sentence implementation. 473 | - This version has no __getitem__, to make it clear that the class is iterable because it implements __iter__. 474 | - __iter__ fulfills the iterable protocol by instantiating and returning an iterator. 475 | - SentenceIterator holds a reference to the list of words. 476 | - self.index is used to determine the next word to fetch. 477 | - Get the word at self.index. 478 | - If there is no word at self.index, raise StopIteration. Increment self.index. 
479 | - Return the word. 480 | - Implement self.__iter__. 481 | 482 | 1. 相对于之前的Sentence实现,`__iter__`方法是唯一的添加。 483 | 2. 这个版本没有`__getitem__`, 484 | 3. `__iter__`填充了 485 | 4. SentenceIterator包含了一个对words列表的引用 486 | 5. self.index用来确定后面要获取的单词 487 | 488 | The code in Example 14-4 passes the tests in Example 14-2. 489 | 490 | Note that implementing __iter__ in SentenceIterator is not actually needed for this example to work, but the it’s the right thing to do: iterators are supposed to implement both __next__ and __iter__, and doing so makes our iterator pass the issubclass(Sen tenceInterator, abc.Iterator) test. If we had subclassed SentenceIterator from abc.Iterator, we’d inherit the concrete abc.Iterator.__iter__ method. 491 | 492 | 注意 493 | 494 | That is a lot of work (for us lazy Python programmers, anyway). Note how most code in SentenceIterator deals with managing the internal state of the iterator. Soon we’ll see how to make it shorter. But first, a brief detour to address an implementation shortcut that may be tempting, but is just wrong. 495 | 496 | 做了这么多(总而言之,对于我们这些懒惰的Python程序员来说)。注意 497 | 498 | ### Making Sentence an Iterator: Bad Idea 499 | 500 | A common cause of errors in building iterables and iterators is to confuse the two. To be clear: iterables have an __iter__ method that instantiates a new iterator every time. Iterators implement a __next__ method that returns individual items, and an __iter__ method that returns self. 501 | 502 | 构建可迭代过程中常见的一个错误起因 503 | 504 | Therefore, iterators are also iterable, but iterables are not iterators. 505 | 506 | 因此,迭代器同时是可迭代的,但可迭代并不能成为迭代器。 507 | 508 | It may be tempting to implement __next__ in addition to __iter__ in the Sentence class, making each Sentence instance at the same time an iterable and iterator over itself. But this is a terrible idea. It’s also a common anti-pattern, according to Alex Martelli who has a lot of experience with Python code reviews. 509 | 510 | 除了Sentence类中的`__iter__`,实现`__next__`也让人浮想联翩,让每个Sentence实例自身在同一时间成为可迭代和迭代器。然而这是一个糟糕的想法。依照在Python代码评审上具有丰富经验的Alex Martelli来看,这也是一种常见的反模式。 511 | 512 | The “Applicability” section[4] of the Iterator design pattern in the GoF book says: 513 | 514 | [4.] Gamma et. al., Design Patterns: Elements of Reusable Object-Oriented Software, p. 259. 515 | 516 |    Use the Iterator pattern 517 | 518 | • to access an aggregate object’s contents without exposing its internal representation. 519 | • to support multiple traversals of aggregate objects. 520 | • to provide a uniform interface for traversing different aggregate structures (that is, to support polymorphic iteration). 521 | 522 | To “support multiple traversals” it must be possible to obtain multiple independent iterators from the same iterable instance, and each iterator must keep its own internal state, so a proper implementation of the pattern requires each call to iter(my_itera ble) to create a new, independent, iterator. That is why we need the SentenceItera tor class in this example. 523 | 524 | 为了“支持多次遍历”必须有从相同可迭代实例中获得多个独立的迭代的可能,而且没个迭代器必须保留自己的内部状态,所以之前的视线 525 | 526 | >####Tips 527 | >An iterable should never act as an iterator over itself. In other words, iterables must implement __iter__, but not __next__. 528 | >On the other hand, for convenience, iterators should be iterable. An iterator’s __iter__ should just return self. 
529 | >####提示 530 | >可迭代永远不应当把自己作为迭代器。换句话来说,可迭代必须实现`__iter__`,而不是`__next__`。 531 | >就另外一方面而言,为了方便,迭代器应该可迭代的。迭代器的`__iter__`方法应当返回其自身。 532 | 533 | Now that the classic Iterator pattern is properly demonstrated, we can get let it go. The next section presents a more idiomatic implementation of Sentence. 534 | 535 | 现在,经典的迭代器模式已经完全说明白了,我们继续。接下来的部分会出现一个特别的Sentence实现。 536 | 537 | ## Sentence Take #3: A Generator Function Sentence的第三种实现:生成器函数 538 | 539 | A Pythonic implementation of the same functionality uses a generator function to replace the SequenceIterator class. A proper explanation of the generator function comes right after Example 14-5. 540 | 541 | 相同功能的Python风格实现为使用生成器函数来替换SequenceIterator类。对生成器函数的准确的解释就在例子14-5之后。 542 | 543 | Example 14-5. sentence_gen.py: Sentence implemented using a generator function 544 | 545 | 例子14-5. sentence_gen.py: 546 | 547 | ```python 548 | import re 549 | import reprlib 550 | 551 | RE_WORD = re.compile('\w+') 552 | 553 | 554 | class Sentence: 555 | 556 | def __init__(self, text): 557 | self.text = text 558 | self.words = RE_WORD.findall(text) 559 | 560 | def __repr__(self): 561 | return 'Sentence(%s)' % reprlib.repr(self.text) 562 | 563 | def __iter__(self): 564 | for word in self.words: 565 | yield word 566 | return 567 | # done! 568 | ``` 569 | 570 | 1. Iterate over self.word. 571 | 2. Yield the current word. 572 | 3. This return is not needed; the function can just “fall-through” and return automatically. Either way, a generator function doesn’t raise StopIteration: it simply exits when it’s done producing values.[Note-5] 573 | 4. No need for a separate iterator class! 574 | 575 | - 迭代整个 self.word。 576 | - 生成当前的单词。 577 | - 该return并不需要; 578 | - 不需要 579 | 580 | >[Note-5] When reviewing this code, Alex Martelli suggested the body of this method could simply be return iter(self.words). He is correct, of course: the result of calling __iter__ would also be an iterator, as it should be. However, I used a for loop with yield here to introduce the syntax of a generator function, which will be covered in detail in the next section. 581 | 582 | 【注释5】 583 | 584 | Here again we have a different implementation of Sentence that passes the tests in 585 | Example 14-2. 586 | 587 | 这里我们 588 | 589 | Back in the Sentence code in Example 14-4, __iter__ called the SentenceIterator constructor to build an iterator and return it. Now the iterator in Example 14-5 is in fact a generator object, built automatically when the __iter__ method is called, because __iter__ here is a generator function. 590 | 591 | 回到例子14-4中的Sentence代码, 592 | 593 | A full explanation of generator functions follows. 594 | 595 | ### How a Generator Function Works 生成器函数的工作原理 596 | 597 | Any Python function that has the yield keyword in its body is a generator function: a function which, when called, returns a generator object. In other words, a generator function is a generator factory. 598 | 599 | 任何在语句体中拥有yield关键的Python函数都是生成器函数:在调用时,该函数返回一个生成器函数。换句话说,生成器函数是生成器工厂。 600 | 601 | >####Tips 602 | >The only syntax distinguishing a plain function from a generator function is the fact that the latter has a yield keyword some‐ where in its body. Some argued that a new keyword like gen should be used for generator functions instead of def, but Guido did not agree. His arguments are in PEP 255 — Simple Generators.[6] 603 | >####提示 604 | > 605 | >[6]Sometimes I add a gen prefix or suffix when naming generator functions, but this is not a common prac‐ tice. 
And you can’t do that if you’re implementing an iterable, of course: the necessary special method must be named __iter__. 606 | 607 | Here is the simplest function useful to demonstrate the behavior of a generator:[7] 608 | 609 | 这里是一个简单的对于阐明生成器行为很有用的函数【注释7】: 610 | 611 | >[7]Thanks to David Kwast for suggesting this example. 612 | 【注释7】感谢David Kwast对于这个例子的建议。 613 | 614 | ```python 615 | >>> def gen_123(): # 616 | ... yield 1# 617 | ... yield 2 618 | ... yield 3 619 | ... 620 | >>> gen_123 # 621 | doctest: +ELLIPSIS # 622 | >>> gen_123() # 623 | doctest: +ELLIPSIS # 624 | >>> for i in gen_123(): # 625 | ... print(i) 626 | 1 627 | 2 628 | 3 629 | >>> g = gen_123() # 630 | >>> next(g) # 631 | 1 632 | >>> next(g) 633 | 2 634 | >>> next(g) 635 | 3 636 | >>> next(g) # 637 | Traceback (most recent call last): 638 | ... 639 | StopIteration 640 | ``` 641 | 642 | 1. Any Python function that contains the yield keyword is a generator function. 643 | 2. Usually the body of a generator function has loop, but not necessarily; here I just repeat yield three times. 644 | 3. Looking closely, we see gen_123 is a function object. 645 | 4. But when invoked, gen_123() returns a generator object. 646 | 5. Generators are iterators that produce the values of the expressions passed to yield. 647 | 6. For closer inspection, we assign the generator object to g. 648 | 7. Because g is an iterator, calling next(g) fetches the next item produced by yield. 649 | 8. When the body of the function completes, the generator object raises a StopIt eration. 650 | 651 | - 通常生成器函数主体包含循环,但并不是必须的;这里我只是把yield重复了三次。 652 | - 进一步来看,我们可以看到gen_123是一个函数对象。 653 | - 但是在调用时,gen_123()返回一个生成器对象。 654 | 655 | A generator function builds a generator object that wraps the body of the function. When we invoke next(...) on the generator object, execution advances to the next yield in the function body, and the next(...) call evaluates to the value yielded when the func‐ tion body is suspended. Finally, when the function body returns, the enclosing generator object raises StopIteration, in accordance with the Iterator protocol. 656 | 657 | 生成器函数构建了一个包含函数主体的生成器对象。当我们对生成器对象调用next(...)时, 658 | 659 | >####Tips 660 | >I find it helpful to be strict when talking about the results ob‐ tained from a generator: I say that a generator yields or produces values. But it’s confusing to say a generator “returns” values. Func‐ tions return values. Calling a generator function returns a gener‐ ator. A generator yields or produces values. A generator doesn’t “return” values in the usual way: the return statement in the body of a generator function causes StopIteration to be raised by the generator object.[8] 661 | >#####提示 662 | > 663 | 664 | [8] *Prior to Python 3.3, it was an error to provide a value with the return statement in a generator function. Now that is legal, but the return still causes a StopIteration exception to be raised. The caller can retrieve the return value from the exception object. However, this is only relevant when using a generator func‐ tion as a coroutine, as we’ll see in “Returning a Value from a Coroutine” on page 475.* 665 | 666 | 【注释8】*在Python3.3之前,在生成器函数的为return语句中提供值是件错误的事情,但是reurn仍旧引发StopIteration异常的抛出。调用者可以在exception对象中重新去回值。不过,* 667 | 668 | Example 14-6 makes the interaction between a for loop and the body of the function more explicit. 669 | 670 | 例子14-6, 671 | 672 | Example 14-6. A generator function that prints messages when it runs 673 | 674 | 例子14-6. 一个在运行时答应消息的函数 675 | 676 | ```python 677 | >>> def gen_AB(): # 678 | ... 
print('start') 679 | ... yield 'A' # 680 | ... print('continue') 681 | ... yield 'B' # 682 | ... print('end.') # 683 | ... 684 | >>> for c in gen_AB(): # 685 | ... print('-->', c) # 686 | ... 687 | start 688 | --> A 689 | continue 690 | --> B 691 | end. 692 | >>> 693 | ``` 694 | 695 | 1. The generator function is defined like any function, but uses yield. 696 | 生成器函数和其他任意函数一样来定义,除了使用yield之外。 697 | 698 | 2. The first implicit call to next() in the for loop at will print 'start' and 699 | stop at the first yield, producing the value 'A'. 700 | 在for循环中第一次明确调用next()会打印 701 | 702 | 3. The second implicit call to next() in the for loop will print 'continue' and 703 | stop at the second yield, producing the value 'B'. 704 | for循环中隐式地调用next() 705 | 706 | 707 | 4. The third call to next() will print 'end.' and fall through the end of the function 708 | body, causing the generator object to raise StopIteration. 709 | To iterate, the for machinery does the equivalent of g = iter(gen_AB()) to get a generator object, and then next(g) at each iteration. 710 | The loop block prints --> and the value returned by next(g). But this output will be seen only after the output of the print calls inside the generator function. 711 | The string 'start' appears as a result of print('start') in the generator function body. 712 | yield 'A' in the generator function body produces the value A consumed by the for loop, which gets assigned to the c variable and results in the output -- > A. 713 | Iteration continues with a second call next(g), advancing the generator function body from yield 'A' to yield 'B'. The text continue is output because of the second print in the generator function body. 714 | yield 'B' produces the value B consumed by the for loop, which gets assigned to the c loop variable, so the loop prints --> B. 715 | Iteration continues with a third call next(it), advancing to the end of the body of the function. The text end. appears in the output because of the third print in the generator function body. 716 | 12. When the generator function body runs to the end, the generator object raises StopIteration. The for loop machinery catches that exception, and the loop terminates cleanly. 717 | 718 | Now hopefully it’s clear how Sentence.__iter__ in Example 14-5 works: __iter__ is a generator function which, when called, builds a generator object that implements the iterator interface, so the SentenceIterator class is no longer needed. 719 | 720 | 现在 721 | 722 | This second version of Sentence is much shorter than the first, but it’s not as lazy as it could be. Nowadays, laziness is considered a good trait, at least in programming lan‐ guages and APIs. A lazy implementation postpones producing values to the last possible moment. This saves memory and may avoid useless processing as well. 723 | 724 | We’ll build a lazy Sentence class next. 725 | 726 | ## Sentence Take #4: A Lazy Implementation 惰性实现 727 | 728 | The Iterator interface is designed to be lazy: next(my_iterator) produces one item at a time. The opposite of lazy is eager: lazy evaluation and eager evaluation are actual technical terms in programming language theory. 729 | 730 | Our Sentence implementations so far have not been lazy because the __init__ eagerly builds a list of all words in the text, binding it to the self.words attribute. This will entail processing the entire text, and the list may use as much memory as the text itself (probably more; it depends on how many nonword characters are in the text). 
Most of this work will be in vain if the user only iterates over the first couple words. 731 | 732 | Whenever you are using Python 3 and start wondering “Is there a lazy way of doing this?”, often the answer is “Yes.” 733 | 734 | The re.finditer function is a lazy version of re.findall which, instead of a list, re‐ turns a generator producing re.MatchObject instances on demand. If there are many matches, re.finditer saves a lot of memory. Using it, our third version of Sentence is now lazy: it only produces the next word when it is needed. The code is in Example 14-7. 735 | 736 | Example 14-7. sentence_gen2.py: Sentence implemented using a generator function calling the re.finditer generator function 737 | 738 | ```python 739 | import re 740 | import reprlib 741 | RE_WORD = re.compile('\w+') 742 | 743 | class Sentence: 744 | def __init__(self, text): 745 | self.text = text 746 | 747 | def __repr__(self): 748 | return 'Sentence(%s)' % reprlib.repr(self.text) 749 | 750 | def __iter__(self): 751 | for match in RE_WORD.finditer(self.text): 752 | yield match.group() 753 | ``` 754 | 755 | No need to have a words list. 756 | finditer builds an iterator over the matches of RE_WORD on self.text, yielding MatchObject instances. 757 | match.group() extracts the actual matched text from the MatchObject instance. 758 | 759 | Generator functions are an awesome shortcut, but the code can be made even shorter 760 | with a generator expression. 761 | 762 | ## Sentence Take #5: A Generator Expression 生成器表达式 763 | 764 | Simple generator functions like the one in the previous Sentence class (Example 14-7) can be replaced by a generator expression. 765 | 766 | A generator expression can be understood as a lazy version of a list comprehension: it does not eagerly build a list, but returns a generator that will lazily produce the items on demand. In other words, if a list comprehension is a factory of lists, a generator expression is a factory of generators. 767 | 768 | Example 14-8 is a quick demo of a generator expression, comparing it to a list compre‐ hension. 769 | 770 | Example 14-8. The gen_AB generator function is used by a list comprehension, then by a generator expression 771 | 772 | ``` 773 | 774 | >>> def gen_AB(): # 775 | ... print('start') 776 | ... yield 'A' 777 | ... print('continue') 778 | ... yield 'B' 779 | ... print('end.') 780 | ... 781 | >>> res1 = [x*3 for x in gen_AB()] # start 782 | continue 783 | end. 784 | >>>foriinres1: # 785 | ... print('-->', i) 786 | ... 787 | --> AAA 788 | --> BBB 789 | >>> res2 = (x*3 for x in gen_AB()) # 790 | >>> res2 # 791 | at 0x10063c240> 792 | >>>foriinres2: # 793 | ... print('-->', i) 794 | ... 795 | start 796 | --> AAA 797 | continue 798 | --> BBB 799 | end. 800 | ``` 801 | 802 | 1. This is the same gen_AB function from Example 14-6. 803 | The list comprehension eagerly iterates over the items yielded by the generator object produced by calling gen_AB(): 'A' and 'B'. Note the output in the next lines: start, continue, end. 804 | This for loop is iterating over the res1 list produced by the list comprehension. The generator expression returns res2. The call to gen_AB() is made, but that 805 | call returns a generator, which is not consumed here. res2 is a generator object. 806 | 6. Only when the for loop iterates over res2, the body of gen_AB actually executes. Each iteration of the for loop implicitly calls next(res2), advancing gen_AB to the next yield. Note the output of gen_AB with the output of the print in the for loop. 
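
To make item 6 above explicit, here is a small console sketch of my own (not from the book; it assumes gen_AB is still defined as in Example 14-6): driving the generator expression by hand with next() shows that each value is computed only when requested, and that the resulting generator is exhausted after a single pass.

```python
>>> res3 = (x*3 for x in gen_AB())  # gen_AB() is called, but its body has not run yet
>>> next(res3)                      # first next() advances gen_AB to its first yield
start
'AAA'
>>> next(res3)
continue
'BBB'
>>> next(res3)                      # gen_AB falls through its end; StopIteration propagates
end.
Traceback (most recent call last):
  ...
StopIteration
```
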
807 | 808 | So, a generator expression produces a generator, and we can use it to further reduce the code in the Sentence class. See Example 14-9. 809 | 810 | Example 14-9. sentence_genexp.py: Sentence implemented using a generator expression 811 | 812 | ```python 813 | import re 814 | import reprlib 815 | RE_WORD = re.compile('\w+') 816 | 817 | class Sentence: 818 | def __init__(self, text): 819 | self.text = text 820 | 821 | def __repr__(self): 822 | return 'Sentence(%s)' % reprlib.repr(self.text) 823 | 824 | def __iter__(self): 825 | return (match.group() for match in RE_WORD.finditer(self.text)) 826 | ``` 827 | 828 | The only difference from Example 14-7 is the __iter__ method, which here is not a generator function (it has no yield) but uses a generator expression to build a generator and then returns it. The end result is the same: the caller of __iter__ gets a generator object. 829 | 830 | Generator expressions are syntactic sugar: they can always be replaced by generator functions, but sometimes are more convenient. The next section is about generator expression usage. 831 | 832 | ## Generator Expressions: When to Use Them 833 | I used several generator expressions when implementing the Vector class in Example 10-16. Each of the methods __eq__, __hash__, __abs__, angle, angles, format, __add__, and __mul__ has a generator expression. In all those methods, a list comprehension would also work, at the cost of using more memory to store the inter‐ mediate list values. 834 | 835 | In Example 14-9, we saw that a generator expression is a syntactic shortcut to create a generator without defining and calling a function. On the other hand, generator func‐tions are much more flexible: you can code complex logic with multiple statements, and can even use them as coroutines (see Chapter 16). 836 | 837 | For the simpler cases, a generator expression will do, and it’s easier to read at a glance, as the Vector example shows. 838 | 839 | My rule of thumb in choosing the syntax to use is simple: if the generator expression spans more than a couple of lines, I prefer to code a generator function for the sake of readability. Also, because generator functions have a name, they can be reused. You can always name a generator expression and use it later by assigning it to a variable, of course, but that is stretching its intended usage as a one-off generator. 840 | 841 | >#### Syntax Tip 842 | >When a generator expression is passed as the single argument to a function or constructor, you don’t need to write a set of paren‐ theses for the function call and another to enclose the generator expression. A single pair will do, like in the Vector call from the __mul__ method in Example 10-16, reproduced here. However, if there are more function arguments after the generator expres‐ sion, you need to enclose it in parentheses to avoid a SyntaxError: 843 | >def __mul__(self, scalar): 844 | if isinstance(scalar, numbers.Real): 845 | return Vector(n * scalar for n in self) else: 846 | return NotImplemented 847 | 848 | The Sentence examples we’ve seen exemplify the use of generators playing the role of classic iterators: retrieving items from a collection. But generators can also be used to produce values independent of a data source. The next section shows an example of that. 849 | 850 | ## Another Example: Arithmetic Progression Generator 851 | The classic Iterator pattern is all about traversal: navigating some data structure. 
But a standard interface based on a method to fetch the next item in a series is also useful when the items are produced on the fly, instead of retrieved from a collection. For example, the range built-in generates a bounded arithmetic progression (AP) of inte‐ gers, and the itertools.count function generates a boundless AP. 852 | 853 | We’ll cover itertools.count in the next section, but what if you need to generate a bounded AP of numbers of any type? 854 | 855 | Example 14-10 shows a few console tests of an ArithmeticProgression class we will see in a moment. The signature of the constructor in Example 14-10 is Arithmetic Progression(begin, step[, end]). The range() function is similar to the ArithmeticProgression here, but its full signature is range(start, stop[, step]). I chose to implement a different signature because for an arithmetic progression the step is mandatory but end is optional. I also changed the argument names from start/stop to begin/end to make it very clear that I opted for a different signature. In each test in Example 14-10 I call list() on the result to inspect the generated values. 856 | 857 | Example 14-10. Demonstration of an ArithmeticProgression class 858 | 859 | ``` 860 | >>> ap = ArithmeticProgression(0, 1, 3) 861 | >>> list(ap) 862 | [0, 1, 2] 863 | >>> ap = ArithmeticProgression(1, .5, 3) 864 | >>> list(ap) 865 | [1.0, 1.5, 2.0, 2.5] 866 | >>> ap = ArithmeticProgression(0, 1/3, 1) 867 | >>> list(ap) 868 | [0.0, 0.3333333333333333, 0.6666666666666666] 869 | >>> from fractions import Fraction 870 | >>> ap = ArithmeticProgression(0, Fraction(1, 3), 1) >>> list(ap) 871 | [Fraction(0, 1), Fraction(1, 3), Fraction(2, 3)] >>> from decimal import Decimal 872 | >>> ap = ArithmeticProgression(0, Decimal('.1'), .3) >>> list(ap) 873 | [Decimal('0.0'), Decimal('0.1'), Decimal('0.2')] 874 | 875 | ``` 876 | 877 | Note that type of the numbers in the resulting arithmetic progression follows the type of begin or step, according to the numeric coercion rules of Python arithmetic. In Example 14-10, you see lists of int, float, Fraction, and Decimal numbers. 878 | 879 | Example 14-11 lists the implementation of the ArithmeticProgression class. 880 | 881 | Example 14-11. The ArithmeticProgression class 882 | 883 | ```python 884 | class ArithmeticProgression: 885 | 886 | def __init__(self, begin, step, end=None): 887 | self.begin = begin 888 | self.step = step 889 | self.end = end # None -> "infinite" series 890 | 891 | def __iter__(self): 892 | result = type(self.begin + self.step)(self.begin) forever = self.end is None 893 | index = 0 894 | while forever or result < self.end: 895 | yield result 896 | index += 1 897 | result = self.begin + self.step * index 898 | ``` 899 | 900 | 1. __init__ requires two arguments: begin and step. end is optional, if it’s None, the series will be unbounded. 901 | This line produces a result value equal to self.begin, but coerced to the type of the subsequent additions.9 902 | For readability, the forever flag will be True if the self.end attribute is None, resulting in an unbounded series. 903 | This loop runs forever or until the result matches or exceeds self.end. When this loop exits, so does the function. 904 | The current result is produced. 905 | 6. The next potential result is calculated. It may never be yielded, because the while 906 | loop may terminate. 
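
The coercion trick in the first line of `__iter__` is easy to miss. This short console sketch (mine, not the book's) spells out what `type(self.begin + self.step)(self.begin)` evaluates to for two of the argument types used in Example 14-10:

```python
>>> begin, step = 0, 1.5
>>> type(begin + step)(begin)       # int + float is float, so begin is coerced to 0.0
0.0
>>> from fractions import Fraction
>>> begin, step = 0, Fraction(1, 3)
>>> type(begin + step)(begin)       # int + Fraction is Fraction
Fraction(0, 1)
```
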
907 | 908 | In the last line of Example 14-11, instead of simply incrementing the result with self.step iteratively, I opted to use an index variable and calculate each result by adding self.begin to self.step multiplied by index to reduce the cumulative effect of errors when working with with floats. 909 | 910 | 在例子14-11的最后一行, 911 | 912 | The ArithmeticProgression class from Example 14-11 works as intended, and is a clear example of the use of a generator function to implement the __iter__ special method. However, if the whole point of a class is to build a generator by implementing __iter__, the class can be reduced to a generator function. A generator function is, after all, a generator factory. 913 | 914 | Example 14-12 shows a generator function called aritprog_gen that does the same job as ArithmeticProgression but with less code. The tests in Example 14-10 all pass if you just call aritprog_gen instead of ArithmeticProgression.[10] 915 | 916 | Example 14-12. The aritprog_gen generator function 917 | 918 | -------------------------------------------------------------------------------- /15章-上下文管理和else语句块.md: -------------------------------------------------------------------------------- 1 | # CHAPTER 15 Context Managers and else Blocks 2 | 3 | Context managers may end up being almost as important as the subroutine itself. We’ve only scratched the surface with them. [...] Basic has a with statement, there are with statements in lots of languages. But they don’t do the same thing, they all do something very shallow, they save you from repeated dotted [attribute] lookups, they don’t do setup and tear down. Just because it’s the same name don’t think it’s the same thing. The with statement is a very big deal. [1] 4 | — Raymond Hettinger Eloquent Python evangelist 5 | 6 | [#1]PyCon US 2013 keynote: “What Makes Python Awesome”; the part about with starts at 23:00 and ends at 26:15. 7 | 8 | In this chapter, we will discuss control flow features that are not so common in other languages, and for this reason tend to be overlooked or underused in Python. They are: 9 | 10 | 本章,我们会讨论在其他语言中并不常见的`控制流`功能, 11 | 12 | • The with statement and context managers 13 | • The else clause in for, while, and try statements 14 | 15 | - with语句和上下文管理器 16 | - for、while和try语句中的else子句 17 | 18 | The with statement sets up a temporary context and reliably tears it down, under the control of a context manager object. This prevents errors and reduces boilerplate code, making APIs at the same time safer and easier to use. Python programmers are finding lots of uses for with blocks beyond automatic file closing. 19 | 20 | with语句建立一个临时的上下文,并在上下文管理器对象的控制下可靠的销毁它。 21 | 22 | The else clause is completely unrelated to with. But this is Part V, and I couldn’t find another place for covering else, and I wouldn’t have a one-page chapter about it, so here it is. 23 | 24 | else子句和with完全没有联系。但是 25 | 26 | Let’s review the smaller topic to get to the real substance of this chapter. 
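
(For readers who want to see the shape of the protocol right away, here is a minimal sketch of my own, not from the book: a context manager is any object with `__enter__` and `__exit__`, and the with statement guarantees that `__exit__` runs when the block ends, whether it ends normally or with an exception.)

```python
class LoggedContext:
    """Minimal context manager sketch showing the setup/tear-down pair."""

    def __enter__(self):
        print('setup')        # runs when the with block is entered
        return self           # bound to the target of "as", if one is given

    def __exit__(self, exc_type, exc_value, traceback):
        print('tear down')    # runs on the way out, even if the block raised
        return False          # do not suppress exceptions


with LoggedContext():
    print('inside the block')
# running this prints: setup / inside the block / tear down
```
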
27 | 28 | 29 | -------------------------------------------------------------------------------- /17章.md: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cundi/fluent-python/50418e5ffa373cc2503eab91d8488e5e613f389d/17章.md -------------------------------------------------------------------------------- /18-asyncio.md: -------------------------------------------------------------------------------- 1 | # CHAPTER 18 Concurrency with asyncio 2 | 3 | `Concurrency is about dealing with lots of things at once. 4 | Parallelism is about doing lots of things at once. 5 | Not the same, but related. 6 | One is about structure, one is about execution. 7 | Concurrency provides a way to structure a solution to solve a problem that may (but not 8 | necessarily) be parallelizable. [1] 9 | — Rob Pike 10 | Co-inventor of the Go language` 11 | 12 | 【注释1】 13 | 14 | Professor Imre Simon [2] liked to say there are two major sins in science: using different 15 | words to mean the same thing and using one word to mean different things. If you do 16 | any research on concurrent or parallel programming you will find different definitions 17 | for “concurrency” and “parallelism.” I will adopt the informal definitions by Rob Pike, 18 | quoted above. 19 | 20 | 【注释2】 21 | 22 | For real parallelism, you must have multiple cores. A modern laptop has four CPU cores 23 | but is routinely running more than 100 processes at any given time under normal, casual 24 | use. So, in practice, most processing happens concurrently and not in parallel. The 25 | computer is constantly dealing with 100+ processes, making sure each has an oppor‐ 26 | tunity to make progress, even if the CPU itself can’t do more than four things at once. 27 | Ten years ago we used machines that were also able to handle 100 processes concurrently, 28 | but on a single core. That’s why Rob Pike titled that talk “Concurrency Is Not Parallelism 29 | (It’s Better).” 30 | 31 | 就真并行来说,你必须使用多核心。 32 | 33 | This chapter introduces asyncio , a package that implements concurrency with corou‐ 34 | tines driven by an event loop. It’s one of the largest and most ambitious libraries ever 35 | added to Python. Guido van Rossum developed asyncio outside of the Python repos‐ 36 | itory and gave the project a code name of “Tulip”—so you’ll see references to that flower 37 | when researching this topic online. For example, the main discussion group is still called 38 | python-tulip. 39 | 40 | Tulip was renamed to asyncio when it was added to the standard library in Python 3.4. 41 | It’s also compatible with Python 3.3—you can find it on PyPI under the new official 42 | name. Because it uses yield from expressions extensively, asyncio is incompatible with 43 | older versions of Python. 44 | 45 | >The Trollius project—also named after a flower—is a backport of asyncio to Python 2.6 and newer, 46 | > replacing yield from with 47 | > yield and clever callables named From and Return . A yield from ... expression becomes yield From(...) ; and when a coroutine needs to return a result, you write raise Return(result) instead ofreturn result . Trollius is led by Victor Stinner, who is also an asyncio core developer, and who kindly agreed to review this chapter as this book was going into production. 
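
The mapping described in the note is mechanical. Purely as an illustration (a hypothetical sketch, not from the book, and Trollius itself is long unmaintained), here is how a coroutine like the slow_function used later in this chapter would read in the Trollius dialect; the names From, Return, and trollius.sleep are the ones the project documents, but treat this as untested pseudocode for an obsolete library.

```python
# Hypothetical Trollius-dialect version of an asyncio coroutine (illustration only).
import trollius
from trollius import From, Return

@trollius.coroutine
def slow_function():
    yield From(trollius.sleep(3))   # asyncio: yield from asyncio.sleep(3)
    raise Return(42)                # asyncio: return 42
```
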
48 | 49 | In this chapter we’ll see: 50 | • A comparison between a simple threaded program and the asyncio equivalent, 51 | showing the relationship between threads and asynchronous tasks 52 | • How the asyncio.Future class differs from concurrent.futures.Future 53 | • Asynchronous versions of the flag download examples from Chapter 17 54 | • How asynchronous programming manages high concurrency in network applica‐ 55 | tions, without using threads or processes 56 | • How coroutines are a major improvement over callbacks for asynchronous pro‐ 57 | gramming 58 | • How to avoid blocking the event loop by offloading blocking operations to a thread 59 | pool 60 | • Writing asyncio servers, and how to rethink web applications for high concurrency 61 | • Why asyncio is poised to have a big impact in the Python ecosystem 62 | Let’s get started with the simple example contrasting threading and asyncio . 63 | 64 | ## Thread Versus Coroutine: A Comparison 线程与协程的比较 65 | 66 | During a discussion about threads and the GIL, Michele Simionato posted a simple but 67 | fun example using multiprocessing to display an animated spinner made with the 68 | ASCII characters "|/-\" on the console while some long computation is running. 69 | I adapted Simionato’s example to use a thread with the Threading module and then a 70 | coroutine with asyncio , so you can see the two examples side by side and understand 71 | how to code concurrent behavior without threads. 72 | 73 | 在讨论线程和GIL时, 74 | 75 | The output shown in Examples 18-1 and 18-2 is animated, so you really should run the 76 | scripts to see what happens. If you’re in the subway (or somewhere else without a WiFi 77 | connection), take a look at Figure 18-1 and imagine the \ bar before the word “thinking” 78 | is spinning. 79 | 80 | ![images]() 81 | 82 | Figure 18-1. The scripts spinner_thread.py and spinner_asyncio.py produce similar 83 | output: the repr of a spinner object and the text Answer: 42. In the screenshot, spin‐ 84 | ner_asyncio.py is still running, and the spinner message \ thinking! is shown; when the 85 | script ends, that line will be replaced by the Answer: 42. 86 | 87 | Let’s review the spinner_thread.py script first (Example 18-1). 88 | 89 | Example 18-1. spinner_thread.py: animating a text spinner with a thread 90 | 91 | ```python 92 | import threading, itertools, time, sys 93 | 94 | 95 | class Signal: # 1 96 | go = True 97 | 98 | 99 | def spin(msg, signal): # 2 100 | write, flush = sys.stdout.write, sys.stdout.flush 101 | for char in itertools.cycle('|/-\\'): # 3 102 | status = char + ' ' + msg 103 | write(status) 104 | flush() 105 | write('\x08' * len(status)) # 4 106 | time.sleep(.1) 107 | if not signal.go: # 5 108 | break 109 | write(' ' * len(status) + '\x08' * len(status)) # 6 110 | 111 | 112 | def slow_function(): 113 | # pretend waiting a long time for I/O 114 | time.sleep(3) 115 | return 42 116 | 117 | 118 | def supervisor(): 119 | signal = Signal() 120 | spinner = threading.Thread(target=spin, args=('thinking!', signal)) 121 | print('spinner object:', spinner) 122 | spinner.start() 123 | result = slow_function() 124 | signal.go = False 125 | spinner.join() 126 | return result 127 | 128 | 129 | def main(): 130 | result = supervisor() # 15 131 | print('Answer:', result) 132 | 133 | 134 | if __name__ == '__main__': 135 | main() 136 | ``` 137 | 138 | 1. This class defines a simple mutable object with a `go` attribute we’ll use to control the thread from outside. 139 | 140 | 2. This function will run in a separate thread. 
The `signal` argument is an instance of the `Signal` class just defined. 141 | 142 | 3. This is actually an infinite loop because `itertools.cycle` produces items cycling from the given sequence forever. 143 | 144 | 4. The trick to do text-mode animation: move the cursor back with backspace characters ( `\x08` ). 145 | 146 | 5. If the `go` attribute is no longer `True` , exit the loop. 147 | 148 | 6. Clear the status line by overwriting with spaces and moving the cursor back to the beginning. 149 | 150 | 7. Imagine this is some costly computation. 151 | 152 | 8. Calling ``sleep` will block the main thread, but crucially, the GIL will be released so the secondary thread will proceed. 153 | 154 | 9. This function sets up the secondary thread, displays the thread object, runs the slow computation, and kills the thread. 155 | 156 | 10. Display the secondary thread object. The output looks like `` . 157 | 158 | 11. Start the secondary thread. 159 | 160 | 12. Run `slow_function`; this blocks the main thread. Meanwhile, the spinner is animated by the secondary thread. 161 | 162 | 13. Change the state of the `signal`; this will terminate the `for` loop inside the `spin` function. 163 | 164 | 14. Wait until the `spinner` thread finishes. 165 | 166 | 15. Run the `supervisor` function. 167 | 168 | Note that, by design, there is no API for terminating a thread in Python. You must send 169 | it a message to shut down. Here I used the signal.go attribute: when the main thread 170 | sets it to false, the spinner thread will eventually notice and exit cleanly. 171 | 172 | Now let’s see how the same behavior can be achieved with an @asyncio.coroutine 173 | instead of a thread. 174 | 175 | >### Note 176 | >As noted in the “Chapter Summary” on page 498 (Chapter 16), 177 | >asyncio uses a stricter definition of “coroutine.” A coroutine 178 | >suitable for use with the asyncio API must use yield from and 179 | >not yield in its body. Also, an asyncio coroutine should be driv‐ 180 | >en by a caller invoking it through yield from or by passing the 181 | >coroutine to one of the asyncio functions such as asyncio.async(...) 182 | >and others covered in this chapter. Finally, the 183 | >@asyncio.coroutine decorator should be applied to coroutines, 184 | >as shown in the examples. 185 | 186 | Take a look at Example 18-2. 187 | 188 | `Example 18-2. spinner_asyncio.py: animating a text spinner with a coroutine` 189 | 190 | ```python 191 | import asyncio 192 | import itertools 193 | import sys 194 | 195 | @asyncio.coroutine #1 196 | def spin(msg): #2 197 | write, flush = sys.stdout.write, sys.stdout.flush 198 | for char in itertools.cycle('|/-\\'): 199 | status = char + ' ' + msg 200 | write(status) 201 | flush() 202 | write('\x08' * len(status)) 203 | try: 204 | yield from asyncio.sleep(.1) #3 205 | except asyncio.CancelledError: #4 206 | break 207 | write(' ' * len(status) + '\x08' * len(status)) 208 | 209 | @asyncio.coroutine 210 | def slow_function(): 211 | # pretend waiting a long time for I/O 212 | yield from asyncio.sleep(3) 213 | return 42 214 | 215 | @asyncio.coroutine 216 | def supervisor(): 217 | spinner = asyncio.async(spin('thinking!')) 218 | print('spinner object:', spinner) 219 | result = yield from slow_function() 220 | spinner.cancel() 221 | return result 222 | 223 | def main(): 224 | loop = asyncio.get_event_loop() 225 | result = loop.run_until_complete(supervisor()) 226 | loop.close() 227 | print('Answer:', result) 228 | ``` 229 | 230 | 1. 
Coroutines intended for use with asyncio should be decorated with @asyncio.coroutine . This not mandatory, but is highly advisable. See explanation following this listing. 231 | 232 | 2. Here we don’t need the signal argument that was used to shut down the thread in the spin function of Example 18-1. 233 | 234 | 3. Use yield from asyncio.sleep(.1) instead of just time.sleep(.1) , to sleep without blocking the event loop. 235 | 236 | 4. If asyncio.CancelledError is raised after spin wakes up, it’s because cancellation was requested, so exit the loop. 237 | 238 | 5. `slow_function` is now a coroutine, and uses yield from to let the event loop proceed while this coroutine pretends to do I/O by sleeping. 239 | 240 | 6. The yield from asyncio.sleep(3) expression handles the control flow to the main loop, which will resume this coroutine after the sleep delay. 241 | 242 | 7. `supervisor` is now a coroutine as well, so it can drive `slow_function` with `yield from`. 243 | 244 | 8. `asyncio.async(...)` schedules the `spin` coroutine to run, wrapping it in a `Task` object, which is returned immediately. 245 | 246 | 9. Display the `Task` object. The output looks like `>` . 247 | 248 | 10. Drive the `slow_function()` . When that is done, get the returned value. Meanwhile, the event loop will continue running because `slow_function` ultimately uses `yield from asyncio.sleep(3)` to hand control back to the main loop. 249 | 250 | 11. A `Task` object can be cancelled;this raises `asyncio.CancelledError` at the `yield` line where the coroutine is currently suspended. The coroutine may catch the exception and delay or even refuse to cancel. 251 | 252 | 12. Get a reference to the event loop. 253 | 254 | 13. Drive the `supervisor` coroutine to completion; the return value of the coroutine is the return value of this call. 255 | 256 | >### Cautious 257 | >Never use `time.sleep(...)` in `asyncio` coroutines unless you want 258 | >to block the main thread, therefore freezing the event loop and 259 | >probably the whole application as well. If a coroutine needs to 260 | >spend some time doing nothing, it should `yield from asyncio.sleep(DELAY)` . 261 | 262 | The use of the `@asyncio.coroutine` decorator is not mandatory, but highly recom‐ 263 | mended: it makes the coroutines stand out among regular functions, and helps with 264 | debugging by issuing a warning when a coroutine is garbage collected without being 265 | yielded from—which means some operation was left unfinished and is likely a bug. This 266 | is not a `priming decorator`. 267 | 268 | Note that the line count of `spinner_thread.py` and `spinner_asyncio.py` is nearly the same. 269 | The `supervisor` functions are the heart of these examples. Let’s compare them in detail. 270 | Example 18-3 lists only the `supervisor` from the `Threading` example. 271 | 272 | `Example 18-3. spinner_thread.py: the threaded supervisor function` 273 | 274 | ```python 275 | def supervisor(): 276 | signal = Signal() 277 | spinner = threading.Thread(target=spin, 278 | args=('thinking!', signal)) 279 | print('spinner object:', spinner) 280 | spinner.start() 281 | result = slow_function() 282 | signal.go = False 283 | spinner.join() 284 | return result 285 | ``` 286 | 287 | For comparison, Example 18-4 shows the supervisor coroutine. 288 | 289 | `Example 18-4. 
spinner_asyncio.py: the asynchronous supervisor coroutine`

```python
@asyncio.coroutine
def supervisor():
    spinner = asyncio.async(spin('thinking!'))
    print('spinner object:', spinner)
    result = yield from slow_function()
    spinner.cancel()
    return result
```

Here is a summary of the main differences to note between the two `supervisor` implementations:

- An `asyncio.Task` is roughly the equivalent of a `threading.Thread`. Victor Stinner, special technical reviewer for this chapter, points out that "a Task is like a green thread in libraries that implement cooperative multitasking, such as `gevent`."

- A `Task` drives a coroutine, and a `Thread` invokes a callable.

- You don't instantiate `Task` objects yourself, you get them by passing a coroutine to `asyncio.async(...)` or `loop.create_task(...)`.

- When you get a `Task` object, it is already scheduled to run (e.g., by `asyncio.async`); a `Thread` instance must be explicitly told to run by calling its `start` method.

- In the threaded `supervisor`, the `slow_function` is a plain function and is directly invoked by the thread. In the asyncio `supervisor`, `slow_function` is a coroutine driven by `yield from`.

- There's no API to terminate a thread from the outside, because a thread could be interrupted at any point, leaving the system in an invalid state. For tasks, there is the `Task.cancel()` instance method, which raises `asyncio.CancelledError` inside the coroutine; the coroutine can then handle the cancellation at the `yield` where it is suspended and perform any cleanup it needs.

-------------------------------------------------------------------------------- /19章-属性和特性.md: --------------------------------------------------------------------------------

Part VI. Metaprogramming
第四部分 元编程
************************

Chapter 19. Dynamic attributes and properties
第十九章 动态属性和特性
*********************************************

```
The crucial importance of properties is that their existence makes it perfectly safe and indeed advisable for you to expose public data attributes as part of your class’s public interface[186].

— Alex Martelli, Python contributor and book author
```

Data attributes and methods are collectively known as attributes in Python: a method is just an attribute that is callable. Besides data attributes and methods, we can also create properties, which can be used to replace a public data attribute with accessor methods (i.e. getter/setter), without changing the class interface. This agrees with the Uniform access principle:

数据属性和方法在Python中统称为属性：方法只是可调用的属性。除了数据属性和方法，我们还可以创建特性，在不改变类接口的前提下，用访问器方法（即getter/setter）替换公共的数据属性。此做法与统一访问原则一致：

```
All services offered by a module should be available through a uniform notation, which does not betray whether they are implemented through storage or through computation[187].

模块提供的所有服务都应当通过统一的表示法来使用，这种表示法不会暴露服务是通过存储还是通过计算实现的。
```

Besides properties, Python provides a rich API for controlling attribute access and implementing dynamic attributes. The interpreter calls special methods such as `__getattr__` and `__setattr__` to evaluate attribute access using dot notation, e.g. `obj.attr`. A user-defined class implementing `__getattr__` can implement “virtual attributes” by computing values on the fly whenever somebody tries to read a nonexistent attribute like `obj.no_such_attribute`.

除了特性，Python还提供了丰富的API，用于控制属性访问以及实现动态属性。解释器在计算`obj.attr`这样的点号属性访问时，会调用`__getattr__`和`__setattr__`等特殊方法。实现了`__getattr__`的用户定义类可以实现“虚拟属性”：每当有人尝试读取`obj.no_such_attribute`这样不存在的属性时，即时计算出属性的值。
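
To make the idea of virtual attributes concrete, here is a minimal sketch of a class whose `__getattr__` serves lookups that would otherwise fail (the `Monitor` class and its `_readings` dictionary are invented for illustration; they are not part of the OSCON examples that follow):

为了让“虚拟属性”的概念更具体，下面给出一个最小示例：类的`__getattr__`会处理原本会失败的属性查找（其中的`Monitor`类及其`_readings`字典只是为了演示而虚构的，并不属于后文的OSCON示例）：

```python
class Monitor:
    """Expose the keys of an internal dict as if they were attributes."""

    def __init__(self, readings):
        self._readings = dict(readings)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, i.e. when `name`
        # is neither a real data attribute nor a method of the instance.
        try:
            return self._readings[name]
        except KeyError:
            raise AttributeError(name)


m = Monitor({'temperature': 21.5, 'humidity': 0.40})
print(m.temperature)  # 21.5 -- no such data attribute exists; __getattr__ supplies it
print(m.humidity)     # 0.4
```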

Coding dynamic attributes is the kind of metaprogramming that framework authors do. However, in Python the basic techniques are so straightforward that anyone can put them to work, even for everyday data wrangling tasks. That's how we'll start this chapter.

编写动态属性是框架作者所做的那种元编程。不过，在Python中，相关的基础技术非常简单直接，任何人都可以拿来使用，即便只是用于日常的数据处理任务。本章就从这样的任务讲起。

## Data wrangling with dynamic attributes 利用动态属性处理数据

In the next few examples we'll leverage dynamic attributes to work with a JSON data feed published by O'Reilly for the OSCON 2014 conference[188].

在接下来的几个例子中，我们将利用动态属性来处理O'Reilly为OSCON 2014大会发布的JSON数据源[188]。

Example 19-1. Sample records from osconfeed.json; some field contents abbreviated.

例子19-1. osconfeed.json中的示例记录；部分字段内容有删节。

    { "Schedule":
        { "conferences": [{"serial": 115 }],
          "events": [
            { "serial": 34505,
              "name": "Why Schools Don't Use Open Source to Teach Programming",
              "event_type": "40-minute conference session",
              "time_start": "2014-07-23 11:30:00",
              "time_stop": "2014-07-23 12:10:00",
              "venue_serial": 1462,
              "description": "Aside from the fact that high school programming...",
              "website_url": "http://oscon.com/oscon2014/public/schedule/detail/34505",
              "speakers": [157509],
              "categories": ["Education"] }
          ],
          "speakers": [
            { "serial": 157509,
              "name": "Robert Lefkowitz",
              "photo": null,
              "url": "http://sharewave.com/",
              "position": "CTO",
              "affiliation": "Sharewave",
              "twitter": "sharewaveteam",
              "bio": "Robert 'r0ml' Lefkowitz is the CTO at Sharewave, a startup..." }
          ],
          "venues": [
            { "serial": 1462,
              "name": "F151",
              "category": "Conference Venues" }
          ]
        }
    }

Example 19-1 shows 4 out of the 895 records in the JSON feed. As you can see, the entire data set is a single JSON object with the key "Schedule", and its value is another mapping with four keys: "conferences", "events", "speakers" and "venues". Each of those four keys is paired with a list of records. In Example 19-1 each list has one record, but in the full dataset those lists have dozens or hundreds of records—except "conferences" which holds just the single record shown. Every item in those four lists has a "serial" field which is a unique identifier within the list.

例子19-1展示了JSON数据源中895条记录里的4条。可以看到，整个数据集是一个JSON对象，键为"Schedule"，其值是另一个映射，包含四个键："conferences"、"events"、"speakers"和"venues"。这四个键分别对应一个记录列表。在例子19-1中每个列表只有一条记录，但在完整的数据集中，这些列表有几十甚至几百条记录；只有"conferences"例外，它只有示例中给出的那一条记录。这四个列表中的每一项都带有一个"serial"字段，它是记录在所属列表内的唯一标识符。

The first script I wrote to deal with the OSCON feed simply downloads the feed, avoiding unnecessary traffic by checking if there is a local copy. This makes sense because OSCON 2014 is history now, so that feed will not be updated.

我为处理OSCON数据源编写的第一个脚本只是下载数据源，并通过检查本地是否已有副本来避免不必要的网络流量。这样做是合理的，因为OSCON 2014已经结束，该数据源不会再更新了。

There is no metaprogramming in Example 19-2, pretty much everything boils down to this expression: json.load(fp), but that's enough to let us explore the dataset. The osconfeed.load function will be used in the next several examples.

例子19-2中没有用到元编程，几乎所有工作都归结为一个表达式：json.load(fp)，但这已经足以让我们探索这个数据集了。osconfeed.load函数将在后面的几个例子中用到。

*Example 19-2. osconfeed.py: Downloading osconfeed.json. Doctests are in Example 19-3.*

例子19-2. osconfeed.py：下载osconfeed.json。doctest见例子19-3。

    from urllib.request import urlopen
    import warnings
    import os
    import json

    URL = "http://www.oreilly.com/pub/sc/osconfeed"
    JSON = "data/osconfeed.json"


    def load():
        if not os.path.exists(JSON):
            msg = "downloading {} to {}".format(URL, JSON)
            warnings.warn(msg)  `1`
            with urlopen(URL) as remote, open(JSON, "wb") as local:  `2`
                local.write(remote.read())

        with open(JSON) as fp:
            return json.load(fp)

`1` Issue a warning if a new download will be made.

`2` Use two context managers in one `with` statement to read the remote file and save the bytes locally.

-------------------------------------------------------------------------------- /20章-属性描述符.md: --------------------------------------------------------------------------------

# CHAPTER 20 Attribute descriptors
*Learning about descriptors not only provides access to a larger toolset, it creates a deeper understanding of how Python works and an appreciation for the elegance of its design [1]
— Raymond Hettinger
Python core developer and guru*

[1]: 1. Raymond Hettinger, Descriptor HowTo Guide.

Descriptors are a way of reusing the same access logic in multiple attributes. For example, field types in ORMs such as the Django ORM and SQLAlchemy are descriptors, managing the flow of data from the fields in a database record to Python object attributes and vice-versa.

描述符是一种在多个属性中复用同一套访问逻辑的方式。例如，Django ORM和SQLAlchemy这类ORM中的字段类型就是描述符，它们管理着数据在数据库记录的字段与Python对象属性之间的双向流动。

A descriptor is a class which implements a protocol consisting of the `__get__`, `__set__` and `__delete__` methods. The property class implements the full descriptor protocol. As usual with protocols, partial implementations are OK. In fact, most descriptors we see in real code implement only `__get__` and `__set__`, and many implement only one of these methods.

描述符是实现了特定协议的类，这个协议由`__get__`、`__set__`和`__delete__`方法组成。property类实现了完整的描述符协议。与通常的协议一样，只实现其中一部分也是可以的。事实上，我们在真实代码中见到的大多数描述符只实现了`__get__`和`__set__`，还有很多只实现了这两个方法中的一个。

Descriptors are a distinguishing feature of Python, deployed not only at the application level but also in the language infrastructure. Besides properties, other Python features that leverage descriptors are methods and the classmethod and staticmethod decorators. Understanding descriptors is key to Python mastery. This is what this chapter is about.

描述符是Python的一个显著特征，它不仅用在应用层面，也用在语言的基础设施中。除了特性之外，方法以及classmethod和staticmethod装饰器也是利用描述符实现的Python功能。理解描述符是精通Python的关键所在。这正是本章要详细说明的内容。

## Descriptor example: attribute validation 描述符示例：属性验证

As we saw in "Coding a property factory" on page 613, a property factory is a way to avoid repetitive coding of getters and setters by applying functional programming patterns. A property factory is a higher-order function that creates a parametrized set of accessor functions and builds a custom property instance from them, with closures to hold settings like the storage_name. The object oriented way of solving the same problem is a descriptor class.

We'll continue the series of LineItem examples where we left it, in "Coding a property factory" on page 613, by refactoring the quantity property factory into a Quantity descriptor class.
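
As a preview of where that refactoring leads, here is a minimal sketch of such a `Quantity` descriptor validating `LineItem` attributes (the names follow the property factory examples; the listing actually developed in this chapter may differ in its details):

作为上述重构方向的预览，下面给出这样一个`Quantity`描述符的最小示例，用来校验`LineItem`的属性（命名沿用特性工厂的例子；本章正式给出的代码可能与此略有出入）：

```python
class Quantity:
    """A descriptor that rejects values that are not greater than zero."""

    def __init__(self, storage_name):
        self.storage_name = storage_name  # name of the instance attribute that holds the value

    def __set__(self, instance, value):
        if value > 0:
            # Store in the instance __dict__, bypassing the descriptor itself.
            instance.__dict__[self.storage_name] = value
        else:
            raise ValueError('value must be > 0')


class LineItem:
    weight = Quantity('weight')  # each managed attribute needs its own descriptor instance
    price = Quantity('price')

    def __init__(self, description, weight, price):
        self.description = description
        self.weight = weight  # goes through Quantity.__set__
        self.price = price

    def subtotal(self):
        return self.weight * self.price


melon = LineItem('melon', 10, 1.95)
print(melon.subtotal())      # 19.5
# LineItem('melon', 0, 1.95) # would raise ValueError: value must be > 0
```

Because `Quantity` defines no `__get__`, reading `melon.weight` simply returns the value stored under the same name in the instance `__dict__`.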
23 | 24 | 25 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | CC0 1.0 Universal 2 | 3 | Statement of Purpose 4 | 5 | The laws of most jurisdictions throughout the world automatically confer 6 | exclusive Copyright and Related Rights (defined below) upon the creator and 7 | subsequent owner(s) (each and all, an "owner") of an original work of 8 | authorship and/or a database (each, a "Work"). 9 | 10 | Certain owners wish to permanently relinquish those rights to a Work for the 11 | purpose of contributing to a commons of creative, cultural and scientific 12 | works ("Commons") that the public can reliably and without fear of later 13 | claims of infringement build upon, modify, incorporate in other works, reuse 14 | and redistribute as freely as possible in any form whatsoever and for any 15 | purposes, including without limitation commercial purposes. These owners may 16 | contribute to the Commons to promote the ideal of a free culture and the 17 | further production of creative, cultural and scientific works, or to gain 18 | reputation or greater distribution for their Work in part through the use and 19 | efforts of others. 20 | 21 | For these and/or other purposes and motivations, and without any expectation 22 | of additional consideration or compensation, the person associating CC0 with a 23 | Work (the "Affirmer"), to the extent that he or she is an owner of Copyright 24 | and Related Rights in the Work, voluntarily elects to apply CC0 to the Work 25 | and publicly distribute the Work under its terms, with knowledge of his or her 26 | Copyright and Related Rights in the Work and the meaning and intended legal 27 | effect of CC0 on those rights. 28 | 29 | 1. Copyright and Related Rights. A Work made available under CC0 may be 30 | protected by copyright and related or neighboring rights ("Copyright and 31 | Related Rights"). Copyright and Related Rights include, but are not limited 32 | to, the following: 33 | 34 | i. the right to reproduce, adapt, distribute, perform, display, communicate, 35 | and translate a Work; 36 | 37 | ii. moral rights retained by the original author(s) and/or performer(s); 38 | 39 | iii. publicity and privacy rights pertaining to a person's image or likeness 40 | depicted in a Work; 41 | 42 | iv. rights protecting against unfair competition in regards to a Work, 43 | subject to the limitations in paragraph 4(a), below; 44 | 45 | v. rights protecting the extraction, dissemination, use and reuse of data in 46 | a Work; 47 | 48 | vi. database rights (such as those arising under Directive 96/9/EC of the 49 | European Parliament and of the Council of 11 March 1996 on the legal 50 | protection of databases, and under any national implementation thereof, 51 | including any amended or successor version of such directive); and 52 | 53 | vii. other similar, equivalent or corresponding rights throughout the world 54 | based on applicable law or treaty, and any national implementations thereof. 55 | 56 | 2. Waiver. 
To the greatest extent permitted by, but not in contravention of, 57 | applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and 58 | unconditionally waives, abandons, and surrenders all of Affirmer's Copyright 59 | and Related Rights and associated claims and causes of action, whether now 60 | known or unknown (including existing as well as future claims and causes of 61 | action), in the Work (i) in all territories worldwide, (ii) for the maximum 62 | duration provided by applicable law or treaty (including future time 63 | extensions), (iii) in any current or future medium and for any number of 64 | copies, and (iv) for any purpose whatsoever, including without limitation 65 | commercial, advertising or promotional purposes (the "Waiver"). Affirmer makes 66 | the Waiver for the benefit of each member of the public at large and to the 67 | detriment of Affirmer's heirs and successors, fully intending that such Waiver 68 | shall not be subject to revocation, rescission, cancellation, termination, or 69 | any other legal or equitable action to disrupt the quiet enjoyment of the Work 70 | by the public as contemplated by Affirmer's express Statement of Purpose. 71 | 72 | 3. Public License Fallback. Should any part of the Waiver for any reason be 73 | judged legally invalid or ineffective under applicable law, then the Waiver 74 | shall be preserved to the maximum extent permitted taking into account 75 | Affirmer's express Statement of Purpose. In addition, to the extent the Waiver 76 | is so judged Affirmer hereby grants to each affected person a royalty-free, 77 | non transferable, non sublicensable, non exclusive, irrevocable and 78 | unconditional license to exercise Affirmer's Copyright and Related Rights in 79 | the Work (i) in all territories worldwide, (ii) for the maximum duration 80 | provided by applicable law or treaty (including future time extensions), (iii) 81 | in any current or future medium and for any number of copies, and (iv) for any 82 | purpose whatsoever, including without limitation commercial, advertising or 83 | promotional purposes (the "License"). The License shall be deemed effective as 84 | of the date CC0 was applied by Affirmer to the Work. Should any part of the 85 | License for any reason be judged legally invalid or ineffective under 86 | applicable law, such partial invalidity or ineffectiveness shall not 87 | invalidate the remainder of the License, and in such case Affirmer hereby 88 | affirms that he or she will not (i) exercise any of his or her remaining 89 | Copyright and Related Rights in the Work or (ii) assert any associated claims 90 | and causes of action with respect to the Work, in either case contrary to 91 | Affirmer's express Statement of Purpose. 92 | 93 | 4. Limitations and Disclaimers. 94 | 95 | a. No trademark or patent rights held by Affirmer are waived, abandoned, 96 | surrendered, licensed or otherwise affected by this document. 97 | 98 | b. Affirmer offers the Work as-is and makes no representations or warranties 99 | of any kind concerning the Work, express, implied, statutory or otherwise, 100 | including without limitation warranties of title, merchantability, fitness 101 | for a particular purpose, non infringement, or the absence of latent or 102 | other defects, accuracy, or the present or absence of errors, whether or not 103 | discoverable, all to the greatest extent permissible under applicable law. 104 | 105 | c. 
Affirmer disclaims responsibility for clearing rights of other persons 106 | that may apply to the Work or any use thereof, including without limitation 107 | any person's Copyright and Related Rights in the Work. Further, Affirmer 108 | disclaims responsibility for obtaining any necessary consents, permissions 109 | or other rights required for any use of the Work. 110 | 111 | d. Affirmer understands and acknowledges that Creative Commons is not a 112 | party to this document and has no duty or obligation with respect to this 113 | CC0 or use of the Work. 114 | 115 | For more information, please see 116 | 117 | 118 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # fluent-python 2 | 3 | 《流畅的Python》 4 | -------------------------------------------------------------------------------- /appendix/_collection_abc.py: -------------------------------------------------------------------------------- 1 | # Copyright 2007 Google, Inc. All Rights Reserved. 2 | # Licensed to PSF under a Contributor Agreement. 3 | 4 | """Abstract Base Classes (ABCs) for collections, according to PEP 3119. 5 | 6 | Unit tests are in test_collections. 7 | """ 8 | 9 | from abc import ABCMeta, abstractmethod 10 | import sys 11 | 12 | __all__ = ["Hashable", "Iterable", "Iterator", 13 | "Sized", "Container", "Callable", 14 | "Set", "MutableSet", 15 | "Mapping", "MutableMapping", 16 | "MappingView", "KeysView", "ItemsView", "ValuesView", 17 | "Sequence", "MutableSequence", 18 | "ByteString", 19 | ] 20 | 21 | # This module has been renamed from collections.abc to _collections_abc to 22 | # speed up interpreter startup. Some of the types such as MutableMapping are 23 | # required early but collections module imports a lot of other modules. 24 | # See issue #19218 25 | __name__ = "collections.abc" 26 | 27 | # Private list of types that we want to register with the various ABCs 28 | # so that they will pass tests like: 29 | # it = iter(somebytearray) 30 | # assert isinstance(it, Iterable) 31 | # Note: in other implementations, these types many not be distinct 32 | # and they make have their own implementation specific types that 33 | # are not included on this list. 34 | bytes_iterator = type(iter(b'')) 35 | bytearray_iterator = type(iter(bytearray())) 36 | #callable_iterator = ??? 
37 | dict_keyiterator = type(iter({}.keys())) 38 | dict_valueiterator = type(iter({}.values())) 39 | dict_itemiterator = type(iter({}.items())) 40 | list_iterator = type(iter([])) 41 | list_reverseiterator = type(iter(reversed([]))) 42 | range_iterator = type(iter(range(0))) 43 | set_iterator = type(iter(set())) 44 | str_iterator = type(iter("")) 45 | tuple_iterator = type(iter(())) 46 | zip_iterator = type(iter(zip())) 47 | ## views ## 48 | dict_keys = type({}.keys()) 49 | dict_values = type({}.values()) 50 | dict_items = type({}.items()) 51 | ## misc ## 52 | mappingproxy = type(type.__dict__) 53 | 54 | 55 | ### ONE-TRICK PONIES ### 56 | 57 | class Hashable(metaclass=ABCMeta): 58 | 59 | __slots__ = () 60 | 61 | @abstractmethod 62 | def __hash__(self): 63 | return 0 64 | 65 | @classmethod 66 | def __subclasshook__(cls, C): 67 | if cls is Hashable: 68 | for B in C.__mro__: 69 | if "__hash__" in B.__dict__: 70 | if B.__dict__["__hash__"]: 71 | return True 72 | break 73 | return NotImplemented 74 | 75 | 76 | class Iterable(metaclass=ABCMeta): 77 | 78 | __slots__ = () 79 | 80 | @abstractmethod 81 | def __iter__(self): 82 | while False: 83 | yield None 84 | 85 | @classmethod 86 | def __subclasshook__(cls, C): 87 | if cls is Iterable: 88 | if any("__iter__" in B.__dict__ for B in C.__mro__): 89 | return True 90 | return NotImplemented 91 | 92 | 93 | class Iterator(Iterable): 94 | 95 | __slots__ = () 96 | 97 | @abstractmethod 98 | def __next__(self): 99 | 'Return the next item from the iterator. When exhausted, raise StopIteration' 100 | raise StopIteration 101 | 102 | def __iter__(self): 103 | return self 104 | 105 | @classmethod 106 | def __subclasshook__(cls, C): 107 | if cls is Iterator: 108 | if (any("__next__" in B.__dict__ for B in C.__mro__) and 109 | any("__iter__" in B.__dict__ for B in C.__mro__)): 110 | return True 111 | return NotImplemented 112 | 113 | Iterator.register(bytes_iterator) 114 | Iterator.register(bytearray_iterator) 115 | #Iterator.register(callable_iterator) 116 | Iterator.register(dict_keyiterator) 117 | Iterator.register(dict_valueiterator) 118 | Iterator.register(dict_itemiterator) 119 | Iterator.register(list_iterator) 120 | Iterator.register(list_reverseiterator) 121 | Iterator.register(range_iterator) 122 | Iterator.register(set_iterator) 123 | Iterator.register(str_iterator) 124 | Iterator.register(tuple_iterator) 125 | Iterator.register(zip_iterator) 126 | 127 | class Sized(metaclass=ABCMeta): 128 | 129 | __slots__ = () 130 | 131 | @abstractmethod 132 | def __len__(self): 133 | return 0 134 | 135 | @classmethod 136 | def __subclasshook__(cls, C): 137 | if cls is Sized: 138 | if any("__len__" in B.__dict__ for B in C.__mro__): 139 | return True 140 | return NotImplemented 141 | 142 | 143 | class Container(metaclass=ABCMeta): 144 | 145 | __slots__ = () 146 | 147 | @abstractmethod 148 | def __contains__(self, x): 149 | return False 150 | 151 | @classmethod 152 | def __subclasshook__(cls, C): 153 | if cls is Container: 154 | if any("__contains__" in B.__dict__ for B in C.__mro__): 155 | return True 156 | return NotImplemented 157 | 158 | 159 | class Callable(metaclass=ABCMeta): 160 | 161 | __slots__ = () 162 | 163 | @abstractmethod 164 | def __call__(self, *args, **kwds): 165 | return False 166 | 167 | @classmethod 168 | def __subclasshook__(cls, C): 169 | if cls is Callable: 170 | if any("__call__" in B.__dict__ for B in C.__mro__): 171 | return True 172 | return NotImplemented 173 | 174 | 175 | ### SETS ### 176 | 177 | 178 | class Set(Sized, Iterable, Container): 
179 | 180 | """A set is a finite, iterable container. 181 | 182 | This class provides concrete generic implementations of all 183 | methods except for __contains__, __iter__ and __len__. 184 | 185 | To override the comparisons (presumably for speed, as the 186 | semantics are fixed), redefine __le__ and __ge__, 187 | then the other operations will automatically follow suit. 188 | """ 189 | 190 | __slots__ = () 191 | 192 | def __le__(self, other): 193 | if not isinstance(other, Set): 194 | return NotImplemented 195 | if len(self) > len(other): 196 | return False 197 | for elem in self: 198 | if elem not in other: 199 | return False 200 | return True 201 | 202 | def __lt__(self, other): 203 | if not isinstance(other, Set): 204 | return NotImplemented 205 | return len(self) < len(other) and self.__le__(other) 206 | 207 | def __gt__(self, other): 208 | if not isinstance(other, Set): 209 | return NotImplemented 210 | return len(self) > len(other) and self.__ge__(other) 211 | 212 | def __ge__(self, other): 213 | if not isinstance(other, Set): 214 | return NotImplemented 215 | if len(self) < len(other): 216 | return False 217 | for elem in other: 218 | if elem not in self: 219 | return False 220 | return True 221 | 222 | def __eq__(self, other): 223 | if not isinstance(other, Set): 224 | return NotImplemented 225 | return len(self) == len(other) and self.__le__(other) 226 | 227 | @classmethod 228 | def _from_iterable(cls, it): 229 | '''Construct an instance of the class from any iterable input. 230 | 231 | Must override this method if the class constructor signature 232 | does not accept an iterable for an input. 233 | ''' 234 | return cls(it) 235 | 236 | def __and__(self, other): 237 | if not isinstance(other, Iterable): 238 | return NotImplemented 239 | return self._from_iterable(value for value in other if value in self) 240 | 241 | __rand__ = __and__ 242 | 243 | def isdisjoint(self, other): 244 | 'Return True if two sets have a null intersection.' 245 | for value in other: 246 | if value in self: 247 | return False 248 | return True 249 | 250 | def __or__(self, other): 251 | if not isinstance(other, Iterable): 252 | return NotImplemented 253 | chain = (e for s in (self, other) for e in s) 254 | return self._from_iterable(chain) 255 | 256 | __ror__ = __or__ 257 | 258 | def __sub__(self, other): 259 | if not isinstance(other, Set): 260 | if not isinstance(other, Iterable): 261 | return NotImplemented 262 | other = self._from_iterable(other) 263 | return self._from_iterable(value for value in self 264 | if value not in other) 265 | 266 | def __rsub__(self, other): 267 | if not isinstance(other, Set): 268 | if not isinstance(other, Iterable): 269 | return NotImplemented 270 | other = self._from_iterable(other) 271 | return self._from_iterable(value for value in other 272 | if value not in self) 273 | 274 | def __xor__(self, other): 275 | if not isinstance(other, Set): 276 | if not isinstance(other, Iterable): 277 | return NotImplemented 278 | other = self._from_iterable(other) 279 | return (self - other) | (other - self) 280 | 281 | __rxor__ = __xor__ 282 | 283 | def _hash(self): 284 | """Compute the hash value of a set. 285 | 286 | Note that we don't define __hash__: not all sets are hashable. 287 | But if you define a hashable set type, its __hash__ should 288 | call this function. 289 | 290 | This must be compatible __eq__. 
291 | 292 | All sets ought to compare equal if they contain the same 293 | elements, regardless of how they are implemented, and 294 | regardless of the order of the elements; so there's not much 295 | freedom for __eq__ or __hash__. We match the algorithm used 296 | by the built-in frozenset type. 297 | """ 298 | MAX = sys.maxsize 299 | MASK = 2 * MAX + 1 300 | n = len(self) 301 | h = 1927868237 * (n + 1) 302 | h &= MASK 303 | for x in self: 304 | hx = hash(x) 305 | h ^= (hx ^ (hx << 16) ^ 89869747) * 3644798167 306 | h &= MASK 307 | h = h * 69069 + 907133923 308 | h &= MASK 309 | if h > MAX: 310 | h -= MASK + 1 311 | if h == -1: 312 | h = 590923713 313 | return h 314 | 315 | Set.register(frozenset) 316 | 317 | 318 | class MutableSet(Set): 319 | """A mutable set is a finite, iterable container. 320 | 321 | This class provides concrete generic implementations of all 322 | methods except for __contains__, __iter__, __len__, 323 | add(), and discard(). 324 | 325 | To override the comparisons (presumably for speed, as the 326 | semantics are fixed), all you have to do is redefine __le__ and 327 | then the other operations will automatically follow suit. 328 | """ 329 | 330 | __slots__ = () 331 | 332 | @abstractmethod 333 | def add(self, value): 334 | """Add an element.""" 335 | raise NotImplementedError 336 | 337 | @abstractmethod 338 | def discard(self, value): 339 | """Remove an element. Do not raise an exception if absent.""" 340 | raise NotImplementedError 341 | 342 | def remove(self, value): 343 | """Remove an element. If not a member, raise a KeyError.""" 344 | if value not in self: 345 | raise KeyError(value) 346 | self.discard(value) 347 | 348 | def pop(self): 349 | """Return the popped value. Raise KeyError if empty.""" 350 | it = iter(self) 351 | try: 352 | value = next(it) 353 | except StopIteration: 354 | raise KeyError 355 | self.discard(value) 356 | return value 357 | 358 | def clear(self): 359 | """This is slow (creates N new iterators!) but effective.""" 360 | try: 361 | while True: 362 | self.pop() 363 | except KeyError: 364 | pass 365 | 366 | def __ior__(self, it): 367 | for value in it: 368 | self.add(value) 369 | return self 370 | 371 | def __iand__(self, it): 372 | for value in (self - it): 373 | self.discard(value) 374 | return self 375 | 376 | def __ixor__(self, it): 377 | if it is self: 378 | self.clear() 379 | else: 380 | if not isinstance(it, Set): 381 | it = self._from_iterable(it) 382 | for value in it: 383 | if value in self: 384 | self.discard(value) 385 | else: 386 | self.add(value) 387 | return self 388 | 389 | def __isub__(self, it): 390 | if it is self: 391 | self.clear() 392 | else: 393 | for value in it: 394 | self.discard(value) 395 | return self 396 | 397 | MutableSet.register(set) 398 | 399 | 400 | ### MAPPINGS ### 401 | 402 | 403 | class Mapping(Sized, Iterable, Container): 404 | 405 | __slots__ = () 406 | 407 | """A Mapping is a generic container for associating key/value 408 | pairs. 409 | 410 | This class provides concrete generic implementations of all 411 | methods except for __getitem__, __iter__, and __len__. 412 | 413 | """ 414 | 415 | @abstractmethod 416 | def __getitem__(self, key): 417 | raise KeyError 418 | 419 | def get(self, key, default=None): 420 | 'D.get(k[,d]) -> D[k] if k in D, else d. d defaults to None.' 
421 | try: 422 | return self[key] 423 | except KeyError: 424 | return default 425 | 426 | def __contains__(self, key): 427 | try: 428 | self[key] 429 | except KeyError: 430 | return False 431 | else: 432 | return True 433 | 434 | def keys(self): 435 | "D.keys() -> a set-like object providing a view on D's keys" 436 | return KeysView(self) 437 | 438 | def items(self): 439 | "D.items() -> a set-like object providing a view on D's items" 440 | return ItemsView(self) 441 | 442 | def values(self): 443 | "D.values() -> an object providing a view on D's values" 444 | return ValuesView(self) 445 | 446 | def __eq__(self, other): 447 | if not isinstance(other, Mapping): 448 | return NotImplemented 449 | return dict(self.items()) == dict(other.items()) 450 | 451 | Mapping.register(mappingproxy) 452 | 453 | 454 | class MappingView(Sized): 455 | 456 | def __init__(self, mapping): 457 | self._mapping = mapping 458 | 459 | def __len__(self): 460 | return len(self._mapping) 461 | 462 | def __repr__(self): 463 | return '{0.__class__.__name__}({0._mapping!r})'.format(self) 464 | 465 | 466 | class KeysView(MappingView, Set): 467 | 468 | @classmethod 469 | def _from_iterable(self, it): 470 | return set(it) 471 | 472 | def __contains__(self, key): 473 | return key in self._mapping 474 | 475 | def __iter__(self): 476 | yield from self._mapping 477 | 478 | KeysView.register(dict_keys) 479 | 480 | 481 | class ItemsView(MappingView, Set): 482 | 483 | @classmethod 484 | def _from_iterable(self, it): 485 | return set(it) 486 | 487 | def __contains__(self, item): 488 | key, value = item 489 | try: 490 | v = self._mapping[key] 491 | except KeyError: 492 | return False 493 | else: 494 | return v == value 495 | 496 | def __iter__(self): 497 | for key in self._mapping: 498 | yield (key, self._mapping[key]) 499 | 500 | ItemsView.register(dict_items) 501 | 502 | 503 | class ValuesView(MappingView): 504 | 505 | def __contains__(self, value): 506 | for key in self._mapping: 507 | if value == self._mapping[key]: 508 | return True 509 | return False 510 | 511 | def __iter__(self): 512 | for key in self._mapping: 513 | yield self._mapping[key] 514 | 515 | ValuesView.register(dict_values) 516 | 517 | 518 | class MutableMapping(Mapping): 519 | 520 | __slots__ = () 521 | 522 | """A MutableMapping is a generic container for associating 523 | key/value pairs. 524 | 525 | This class provides concrete generic implementations of all 526 | methods except for __getitem__, __setitem__, __delitem__, 527 | __iter__, and __len__. 528 | 529 | """ 530 | 531 | @abstractmethod 532 | def __setitem__(self, key, value): 533 | raise KeyError 534 | 535 | @abstractmethod 536 | def __delitem__(self, key): 537 | raise KeyError 538 | 539 | __marker = object() 540 | 541 | def pop(self, key, default=__marker): 542 | '''D.pop(k[,d]) -> v, remove specified key and return the corresponding value. 543 | If key is not found, d is returned if given, otherwise KeyError is raised. 544 | ''' 545 | try: 546 | value = self[key] 547 | except KeyError: 548 | if default is self.__marker: 549 | raise 550 | return default 551 | else: 552 | del self[key] 553 | return value 554 | 555 | def popitem(self): 556 | '''D.popitem() -> (k, v), remove and return some (key, value) pair 557 | as a 2-tuple; but raise KeyError if D is empty. 558 | ''' 559 | try: 560 | key = next(iter(self)) 561 | except StopIteration: 562 | raise KeyError 563 | value = self[key] 564 | del self[key] 565 | return key, value 566 | 567 | def clear(self): 568 | 'D.clear() -> None. Remove all items from D.' 
569 | try: 570 | while True: 571 | self.popitem() 572 | except KeyError: 573 | pass 574 | 575 | def update(*args, **kwds): 576 | ''' D.update([E, ]**F) -> None. Update D from mapping/iterable E and F. 577 | If E present and has a .keys() method, does: for k in E: D[k] = E[k] 578 | If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v 579 | In either case, this is followed by: for k, v in F.items(): D[k] = v 580 | ''' 581 | if not args: 582 | raise TypeError("descriptor 'update' of 'MutableMapping' object " 583 | "needs an argument") 584 | self, *args = args 585 | if len(args) > 1: 586 | raise TypeError('update expected at most 1 arguments, got %d' % 587 | len(args)) 588 | if args: 589 | other = args[0] 590 | if isinstance(other, Mapping): 591 | for key in other: 592 | self[key] = other[key] 593 | elif hasattr(other, "keys"): 594 | for key in other.keys(): 595 | self[key] = other[key] 596 | else: 597 | for key, value in other: 598 | self[key] = value 599 | for key, value in kwds.items(): 600 | self[key] = value 601 | 602 | def setdefault(self, key, default=None): 603 | 'D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if k not in D' 604 | try: 605 | return self[key] 606 | except KeyError: 607 | self[key] = default 608 | return default 609 | 610 | MutableMapping.register(dict) 611 | 612 | 613 | ### SEQUENCES ### 614 | 615 | 616 | class Sequence(Sized, Iterable, Container): 617 | 618 | """All the operations on a read-only sequence. 619 | 620 | Concrete subclasses must override __new__ or __init__, 621 | __getitem__, and __len__. 622 | """ 623 | 624 | __slots__ = () 625 | 626 | @abstractmethod 627 | def __getitem__(self, index): 628 | raise IndexError 629 | 630 | def __iter__(self): 631 | i = 0 632 | try: 633 | while True: 634 | v = self[i] 635 | yield v 636 | i += 1 637 | except IndexError: 638 | return 639 | 640 | def __contains__(self, value): 641 | for v in self: 642 | if v == value: 643 | return True 644 | return False 645 | 646 | def __reversed__(self): 647 | for i in reversed(range(len(self))): 648 | yield self[i] 649 | 650 | def index(self, value): 651 | '''S.index(value) -> integer -- return first index of value. 652 | Raises ValueError if the value is not present. 653 | ''' 654 | for i, v in enumerate(self): 655 | if v == value: 656 | return i 657 | raise ValueError 658 | 659 | def count(self, value): 660 | 'S.count(value) -> integer -- return number of occurrences of value' 661 | return sum(1 for v in self if v == value) 662 | 663 | Sequence.register(tuple) 664 | Sequence.register(str) 665 | Sequence.register(range) 666 | Sequence.register(memoryview) 667 | 668 | 669 | class ByteString(Sequence): 670 | 671 | """This unifies bytes and bytearray. 672 | 673 | XXX Should add all their methods. 674 | """ 675 | 676 | __slots__ = () 677 | 678 | ByteString.register(bytes) 679 | ByteString.register(bytearray) 680 | 681 | 682 | class MutableSequence(Sequence): 683 | 684 | __slots__ = () 685 | 686 | """All the operations on a read-write sequence. 687 | 688 | Concrete subclasses must provide __new__ or __init__, 689 | __getitem__, __setitem__, __delitem__, __len__, and insert(). 
690 | 691 | """ 692 | 693 | @abstractmethod 694 | def __setitem__(self, index, value): 695 | raise IndexError 696 | 697 | @abstractmethod 698 | def __delitem__(self, index): 699 | raise IndexError 700 | 701 | @abstractmethod 702 | def insert(self, index, value): 703 | 'S.insert(index, value) -- insert value before index' 704 | raise IndexError 705 | 706 | def append(self, value): 707 | 'S.append(value) -- append value to the end of the sequence' 708 | self.insert(len(self), value) 709 | 710 | def clear(self): 711 | 'S.clear() -> None -- remove all items from S' 712 | try: 713 | while True: 714 | self.pop() 715 | except IndexError: 716 | pass 717 | 718 | def reverse(self): 719 | 'S.reverse() -- reverse *IN PLACE*' 720 | n = len(self) 721 | for i in range(n//2): 722 | self[i], self[n-i-1] = self[n-i-1], self[i] 723 | 724 | def extend(self, values): 725 | 'S.extend(iterable) -- extend sequence by appending elements from the iterable' 726 | for v in values: 727 | self.append(v) 728 | 729 | def pop(self, index=-1): 730 | '''S.pop([index]) -> item -- remove and return item at index (default last). 731 | Raise IndexError if list is empty or index is out of range. 732 | ''' 733 | v = self[index] 734 | del self[index] 735 | return v 736 | 737 | def remove(self, value): 738 | '''S.remove(value) -- remove first occurrence of value. 739 | Raise ValueError if the value is not present. 740 | ''' 741 | del self[self.index(value)] 742 | 743 | def __iadd__(self, values): 744 | self.extend(values) 745 | return self 746 | 747 | MutableSequence.register(list) 748 | MutableSequence.register(bytearray) # Multiply inheriting, see ByteString 749 | -------------------------------------------------------------------------------- /images/14-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cundi/fluent-python/50418e5ffa373cc2503eab91d8488e5e613f389d/images/14-1.png -------------------------------------------------------------------------------- /images/c16_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cundi/fluent-python/50418e5ffa373cc2503eab91d8488e5e613f389d/images/c16_1.png -------------------------------------------------------------------------------- /images/c16_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cundi/fluent-python/50418e5ffa373cc2503eab91d8488e5e613f389d/images/c16_2.png -------------------------------------------------------------------------------- /images/c7_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cundi/fluent-python/50418e5ffa373cc2503eab91d8488e5e613f389d/images/c7_1.png -------------------------------------------------------------------------------- /images/the_qrcode_for_qq_group.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cundi/fluent-python/50418e5ffa373cc2503eab91d8488e5e613f389d/images/the_qrcode_for_qq_group.png --------------------------------------------------------------------------------