├── README.md
└── zh-cn
    └── README.md


/README.md:
--------------------------------------------------------------------------------
# The Elements of Python Style

This document goes beyond PEP8 to cover the core of what I think of as great Python style. It is opinionated, but not too opinionated. It goes beyond mere issues of syntax and module layout, and into areas of paradigm, organization, and architecture. I hope it can be a kind of condensed ["Strunk & White"][strunk-white] for Python code.

[strunk-white]: https://en.wikipedia.org/wiki/The_Elements_of_Style

# Table of Contents

* [The Elements of Python Style](#the-elements-of-python-style)
* [Follow Most PEP8 Guidelines](#follow-most-pep8-guidelines)
* [Flexibility on Line Length](#flexibility-on-line-length)
* [Consistent Naming](#consistent-naming)
* [Nitpicks That Aren't Worth It](#nitpicks-that-arent-worth-it)
* [Writing Good Docstrings](#writing-good-docstrings)
* [Paradigms and Patterns](#paradigms-and-patterns)
* [A Little Zen for Your Code Style](#a-little-zen-for-your-code-style)
* [Six of One, Half a Dozen of the Other](#six-of-one-half-a-dozen-of-the-other)
* [Standard Tools and Project Structure](#standard-tools-and-project-structure)
* [Some Inspiration](#some-inspiration)
* [Contributors](#contributors)

## Follow Most [PEP8 Guidelines][pep8]

... but, be flexible on naming and line length.

PEP8 covers lots of mundane stuff like whitespace, line breaks between functions/classes/methods, imports, and warning against use of deprecated functionality. Pretty much everything in there is good.

The best tool to enforce these rules, while also helping you catch silly Python syntax errors, is [flake8][flake8].

PEP8 is meant as a set of guidelines, not rules to be strictly, or religiously, followed.
Make sure to read the section of PEP8 that is titled: "A Foolish Consistency is the Hobgoblin of Little Minds." Also see Raymond Hettinger's excellent talk, ["Beyond PEP8"](https://www.youtube.com/watch?v=wf-BqAjZb8M) for more on this.

The only rules that seem to cause a disproportionate amount of controversy are those around line length and naming. These can be easily tweaked.

## Flexibility on Line Length

If the strict 79-character line length rule in `flake8` bothers you, feel free to ignore or adjust that rule. It's probably still a good rule-of-thumb -- like a "rule" that says English sentences should have 50 or fewer words, or that paragraphs should have fewer than 10 sentences. Here's the link to [flake8 config][f8config]; see the `max-line-length` config option. Note also that often a `# noqa` comment can be added to a line to have a `flake8` check ignored, but please use these sparingly.

90%+ of your lines should be 79 characters or fewer, though, for the simple reason that "Flat is better than nested". If you find a function where all the lines are longer than this, something else is wrong, and you should look at your code rather than at your flake8 settings.

[pep8]: https://www.python.org/dev/peps/pep-0008/
[flake8]: https://flake8.readthedocs.org
[f8config]: https://flake8.readthedocs.org/en/latest/config.html

## Consistent Naming

On naming, following some simple rules can prevent a whole lot of team-wide grief.

### Preferred Naming Rules

Many of these were adapted from [the Pocoo team][pocoo].

- Class names: `CamelCase`, and capitalize acronyms: `HTTPWriter`, not `HttpWriter`.
- Variable names: `lower_with_underscores`.
- Method and function names: `lower_with_underscores`.
- Modules: `lower_with_underscores.py`. (But, prefer names that don't need underscores!)
- Constants: `UPPER_WITH_UNDERSCORES`.
- Precompiled regular expressions: `name_re`.

[pocoo]: https://flask.palletsprojects.com/en/1.1.x/styleguide/

You should generally follow these rules, unless you are mirroring some other tool's naming convention, like a database schema or message format.

You can also choose to use `CamelCase` for things that are class-like but not quite classes -- the main benefit of `CamelCase` is calling attention to something as a "global noun", rather than a local label or a verb. Notice that Python names `True`, `False`, and `None` use `CamelCase` even though they are not classes.

### Avoid Name Adornments

... like `_prefix` or `suffix_`. Functions and methods can have a `_prefix` notation to indicate "private", but this should be used sparingly and only for APIs that are expected to be widely used, and where the `_private` indicator assists with [information hiding][infohiding].

[infohiding]: http://c2.com/cgi/wiki?InformationHiding

PEP8 suggests using a trailing underscore to avoid aliasing a built-in, e.g.

```python
sum_ = sum(some_long_list)
print(sum_)
```

This is OK in a pinch, but it might be better to just choose a different name.

You should rarely use `__mangled` double-underscore prefixes for class/instance/method labels, which have special [name mangling behavior][mangling] -- it's rarely necessary. Never create your own names using `__dunder__` adornments unless you are implementing a Python standard protocol, like `__len__`; this is a namespace specifically reserved for Python's internal protocols and shouldn't be co-opted for your own stuff.

[mangling]: https://docs.python.org/3/tutorial/classes.html#private-variables

### Avoid One-Character Names

There are some one-character label names that are common and acceptable.

With `lambda`, using `x` for single-argument functions is OK.
For example:

```python
encode = lambda x: x.encode("utf-8", "ignore")
```

With tuple unpacking, using `_` as a throwaway label is also OK. For example:

```python
_, url, urlref = data
```

This basically means, "ignore the first element."

Similar to `lambda`, inside list/dict/set comprehensions, generator expressions, or very short (1-2 line) for loops, a single-char iteration label can be used. This is also typically `x`, e.g.

```python
sum(x for x in items if x > 0)
```

to sum all positive integers in the sequence `items`.

It is also very common to use `i` as shorthand for "index", and commonly with the `enumerate` built-in. For example:

```python
for i, item in enumerate(items):
    print("%4s: %s" % (i, item))
```

Outside of these cases, you should rarely, perhaps **never**, use single-character label/argument/method names. This is because it just makes it impossible to `grep` for stuff.

### Use `self` and similar conventions

You should:

- always name a method's first argument `self`
- always name `@classmethod`'s first argument `cls`
- always use `*args` and `**kwargs` for variable argument lists

## Nitpicks That Aren't Worth It

There's nothing to gain from not following these rules, so you should just follow them.

### Always [inherit from `object`][newstyle] and use new-style classes

```python
# bad
class JSONWriter:
    pass

# good
class JSONWriter(object):
    pass
```

In Python 2, it's important to follow this rule. In Python 3, all classes implicitly inherit from `object` and this rule isn't necessary any longer.
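
The Python 3 behavior described above can be checked by inspecting a class's bases. This is only an illustrative sketch that assumes a Python 3 interpreter; `ImplicitWriter` and `ExplicitWriter` are made-up names, not part of any library.

```python
# Assumes Python 3; both class names are hypothetical.
class ImplicitWriter:
    pass


class ExplicitWriter(object):
    pass


# Under Python 3, both spellings yield a class whose only base is `object`.
print(ImplicitWriter.__bases__)
print(ExplicitWriter.__bases__)
print(issubclass(ImplicitWriter, object))
```

Under Python 2 the first class would instead be an old-style class, which is why the explicit form matters there.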

### Don't repeat instance labels in the class

```python
# bad
class JSONWriter(object):
    handler = None
    def __init__(self, handler):
        self.handler = handler

# good
class JSONWriter(object):
    def __init__(self, handler):
        self.handler = handler
```

### Prefer [list/dict/set comprehensions][mapfilter] over map/filter.

```python
# bad
map(truncate, filter(lambda x: len(x) > 30, items))

# good
[truncate(x) for x in items if len(x) > 30]
```

Though you should prefer comprehensions for most of the simple cases, there are occasions where `map()` or `filter()` will be more readable, so use your judgment.

### Use parens `(...)` for continuations

```python
# bad
from itertools import groupby, chain, \
                      izip, islice

# good
from itertools import (groupby, chain,
                       izip, islice)
```

### Use parens `(...)` for fluent APIs

```python
# bad
response = Search(using=client) \
           .filter("term", cat="search") \
           .query("match", title="python")

# good
response = (Search(using=client)
            .filter("term", cat="search")
            .query("match", title="python"))
```

### Use implicit continuations in function calls

```python
# bad -- simply unnecessary backslash
return set((key.lower(), val.lower()) \
           for key, val in mapping.iteritems())

# good
return set((key.lower(), val.lower())
           for key, val in mapping.iteritems())
```

### Use `isinstance(obj, cls)`, not `type(obj) == cls`

This is because `isinstance` covers way more cases, including subclasses and ABCs. Also, rarely use `isinstance` at all, since you should usually be doing duck typing, instead!
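
To make the difference concrete, here is a minimal sketch, assuming Python 3 for the `collections.abc` import. The `Writer`, `JSONWriter`, and `total_length` names are hypothetical and exist only for illustration.

```python
from collections.abc import Sequence


# Hypothetical classes, used only to show the subclass case.
class Writer(object):
    pass


class JSONWriter(Writer):
    pass


writer = JSONWriter()

# An exact type comparison misses subclasses...
print(type(writer) == Writer)      # False
# ...while isinstance() follows the inheritance chain.
print(isinstance(writer, Writer))  # True
# isinstance() also understands abstract base classes.
print(isinstance([1, 2, 3], Sequence))  # True


def total_length(items):
    """Duck typing: accept any iterable whose elements support len()."""
    return sum(len(item) for item in items)


# No type checks at all: a list of strings and a tuple of lists both work.
print(total_length(["ab", "cde"]))
print(total_length(([1], [2, 3])))
```

The last function is usually the better design: it never asks what type it was given, only that the argument behaves as needed.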

### Use `with` for files and locks

The `with` statement subtly handles file closing and lock releasing even in the case of exceptions being raised. So:

```python
# bad
somefile = open("somefile.txt", "w")
somefile.write("sometext")
return

# good
with open("somefile.txt", "w") as somefile:
    somefile.write("sometext")
return
```

### Use `is` when comparing to `None`

The `None` value is a singleton but when you're checking for `None`, you rarely want to actually call `__eq__` on the LHS argument. So:

```python
# bad
if item == None:
    continue

# good
if item is None:
    continue
```

Not only is the good form faster, it's also more correct. It's no more concise to use `==`, so just remember this rule!

### Avoid `sys.path` hacks

It can be tempting to do `sys.path.insert(0, "../")` and similar to control Python's import approach, but you should avoid these like the plague.

Python has a somewhat-complex, but very comprehensible, approach to module path resolution. You can adjust how Python loads modules via `PYTHONPATH` or via tricks like `setup.py develop`. You can also run Python using `-m` to good effect, e.g. `python -m mypkg.mymodule` rather than `python mypkg/mymodule.py`. Your code should not depend on the current working directory you run Python from in order to work properly. David Beazley saves the day once more with his PDF slides which are worth a skim, ["Modules and Packages: Live and Let Die!"][modules]

[modules]: http://www.dabeaz.com/modulepackage/ModulePackage.pdf

### Rarely create [your own exception types][exceptiontypes]

... and when you must, don't make too many.

```python
# bad
class ArgumentError(Exception):
    pass
...
raise ArgumentError(url)

# good
raise ValueError("bad value for url: %s" % url)
```

Note that Python includes [a rich set of built-in exception classes][ex-tree]. Leverage these appropriately, and you should "customize" them simply by instantiating them with string messages that describe the specific error condition you hit. It is most common to raise `ValueError` (bad argument), `LookupError` (bad key), or `AssertionError` (via the `assert` statement) in user code.

A good rule of thumb for whether you should create your own exception type is to figure out whether a caller should catch it **every time** they call your function. If so, you probably **should** make your own type. But this is relatively rare. A good example of an exception type that clearly had to exist is [tornado.web.HTTPError][http-error]. But notice how Tornado did not go overboard: there is one exception class for **all** HTTP errors raised by the framework or user code.

[ex-tree]: https://docs.python.org/2/library/exceptions.html#exception-hierarchy
[http-error]: http://www.tornadoweb.org/en/stable/web.html#tornado.web.HTTPError

### Short docstrings are proper one-line sentences

```python
# bad
def reverse_sort(items):
    """
    sort items in reverse order
    """

# good
def reverse_sort(items):
    """Sort items in reverse order."""
```

Keep the triple quotes `"""` on the same line as the text, capitalize the first letter, and include a period. Four lines become two, the `__doc__` attribute doesn't have crufty newlines, and the pedants are pleased!

### Use [reST for docstrings][docstrings]

It's done by the stdlib and most open source projects. It's supported out-of-the-box by Sphinx. Just do it! The Python `requests` module uses these to extremely good effect. See the [`requests.api`][requests-api] module, for example.

### Strip trailing whitespace

This is perhaps the ultimate nitpick, but if you don't do it, it will drive people crazy. There is no shortage of tools that will do this for you in your text editor automatically; here's [a link to the one I use for vim][whitespace].

[whitespace]: https://github.com/amontalenti/home/blob/master/.vim/bundle/whitespace/plugin/whitespace.vim

## Writing Good Docstrings

Here's a quick reference to using Sphinx-style reST in your function docstrings:

```python
def get(url, qsargs=None, timeout=5.0):
    """Send an HTTP GET request.

    :param url: URL for the new request.
    :type url: str
    :param qsargs: Converted to query string arguments.
    :type qsargs: dict
    :param timeout: In seconds.
    :rtype: mymodule.Response
    """
    return request('get', url, qsargs=qsargs, timeout=timeout)
```

Don't document for the sake of documenting. The way to think about this is:

```python
good_names + explicit_defaults > verbose_docs + type_specs
```

That is, in the example above, there is no need to say `timeout` is a `float`, because the default value is `5.0`, which is clearly a `float`. It is useful to indicate in the documentation that the semantic meaning is "seconds", thus `5.0` means 5 seconds. Meanwhile, the caller has no clue what `qsargs` should be, so we give a hint with the `type` annotation, and the caller also has no clue what to expect back from the function, so an `rtype` annotation is appropriate.

One last point. Guido once said that his key insight for Python is that, "code is read much more often than it is written." Well, a corollary of this is that **some documentation helps, but too much documentation hurts**.

You should basically only document functions you expect to be widely re-used.
If you document every function in an internal module, you'll just end up with a less maintainable module, since the documentation needs to be refactored when the code is refactored. Don't "cargo cult" your docstrings and definitely don't auto-generate them with tooling!

[newstyle]: https://docs.python.org/2/reference/datamodel.html#new-style-and-classic-classes
[mapfilter]: http://www.artima.com/weblogs/viewpost.jsp?thread=98196
[exceptiontypes]: https://twitter.com/amontalenti/status/665338326396727297
[docstrings]: https://www.python.org/dev/peps/pep-0287/
[requests-api]: https://github.com/kennethreitz/requests/blob/master/requests/api.py

## Paradigms and Patterns

### Functions vs classes

You should usually prefer functions to classes. Functions and modules are the basic units of code re-use in Python, and they are the most flexible form. Classes are an "upgrade path" for certain Python facilities, such as implementing containers, proxies, descriptors, type systems, and more. But usually, functions are a better option.

Some might like the code organization benefits of grouping related functions together into classes. But this is a mistake. You should group related functions together into **modules**.

Though sometimes classes can act as a helpful "mini namespace" (e.g. with `@staticmethod`), more often a group of methods should be contributing to the internal operation of an object, rather than merely being a behavior grouping.

It's always better to have a `lib.time` module for time-related functions than to have a `TimeHelper` class with a bunch of methods you are forced to subclass in order to use! Classes proliferate other classes, which proliferates complexity and decreases readability.
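
As a rough sketch of that contrast: the `TimeHelper` class and the `to_iso` and `days_between` functions below are hypothetical, and the functions are shown in one file only so the example is self-contained. In a real project they would live in a module such as `lib/time.py`.

```python
import datetime as dt


# Discouraged: a class used purely as a bag of related functions.
class TimeHelper(object):
    def to_iso(self, moment):
        return moment.isoformat()

    def days_between(self, start, end):
        return (end - start).days


# Preferred: plain functions that a module groups together.
def to_iso(moment):
    """Format a date or datetime as an ISO 8601 string."""
    return moment.isoformat()


def days_between(start, end):
    """Return the whole number of days from start to end."""
    return (end - start).days


start = dt.date(2020, 1, 1)
end = dt.date(2020, 1, 31)

# The class version needs an instance that carries no state at all.
print(TimeHelper().days_between(start, end))
# The function version is called directly.
print(days_between(start, end))
print(to_iso(start))
```

Callers of the function version import only what they need, and nothing has to be instantiated or subclassed first.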

### Generators and iterators

Generators and iterators are Python's most powerful features -- you should master the iterator protocol, the `yield` keyword, and generator expressions.

Not only are generators important for any function that needs to be called over a large stream of data, but they also have the effect of simplifying code by making it easy for you to write your own iterators. Refactoring code to generators often simplifies it while making it work in more scenarios.

Luciano Ramalho, author of "Fluent Python", has a 30-minute presentation, ["Iterators & Generators: the Python Way"](https://www.youtube.com/watch?v=z4P6hSa6K9g), which gives an excellent, fast-paced overview. David Beazley, author of "Python Essential Reference" and "Python Cookbook", has a mind-bending three-hour video tutorial entitled ["Generators: The Final Frontier"](https://www.youtube.com/watch?v=5-qadlG7tWo) that is a satisfying exposition of generator use cases. Mastering this topic is worth it because it applies everywhere.

### Declarative vs imperative

You should prefer declarative to imperative programming. This is code that says **what** you want to do, rather than code that describes **how** to do it. Python's [functional programming guide][func] includes some good details and examples of how to use this style effectively.

[func]: https://docs.python.org/3/howto/functional.html

You should use lightweight data structures like `list`, `dict`, `tuple`, and `set` to your advantage. It's always better to lay out your data, and then write some code to transform it, than to build up data by repeatedly calling mutating functions/methods.

An example of this is the common list comprehension refactoring:

```python
# bad
filtered = []
for x in items:
    if x.endswith(".py"):
        filtered.append(x)
return filtered
```

This should be rewritten as:

```python
# good
return [x
        for x in items
        if x.endswith(".py")]
```

But another good example is rewriting an `if`/`elif`/`else` chain as a `dict` lookup.

### Prefer "pure" functions and generators

This is a concept that we can borrow from the functional programming community. These kinds of functions and generators are alternatively described as "side-effect free", "referentially transparent", or as having "immutable inputs/outputs".

As a simple example, you should avoid code like this:

```python
# bad
def dedupe(items):
    """Remove dupes in-place, return items and # of dupes."""
    seen = set()
    dupe_positions = []
    for i, item in enumerate(items):
        if item in seen:
            dupe_positions.append(i)
        else:
            seen.add(item)
    num_dupes = len(dupe_positions)
    for idx in reversed(dupe_positions):
        items.pop(idx)
    return items, num_dupes
```

This same function can be written as follows:

```python
# good
def dedupe(items):
    """Return deduped items and # of dupes."""
    deduped = set(items)
    num_dupes = len(items) - len(deduped)
    return deduped, num_dupes
```

This is a somewhat shocking example. In addition to making this function pure, we also made it much, much shorter. It's not only shorter: it's better. Its purity means `assert dedupe(items) == dedupe(items)` always holds true for the "good" version. In the "bad" version, `num_dupes` will **always** be `0` on the second call, which can lead to subtle bugs when using the function.
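
One caveat worth noting: the "good" version returns a `set`, so the original ordering is lost. If ordering matters, a pure variant is still possible. The sketch below is one way to do it, assuming Python 3.7+ (where `dict` preserves insertion order) and hashable items; `dedupe_ordered` is a made-up name.

```python
def dedupe_ordered(items):
    """Return deduped items in first-seen order and # of dupes."""
    # dict.fromkeys keeps the first occurrence of each key, in order.
    deduped = list(dict.fromkeys(items))
    num_dupes = len(items) - len(deduped)
    return deduped, num_dupes


items = ["b", "a", "b", "c", "a"]
print(dedupe_ordered(items))
# The input is left untouched, so repeated calls agree with each other.
print(dedupe_ordered(items) == dedupe_ordered(items))
print(items)
```

Like the set-based version, this never mutates its argument, so calling it twice on the same list gives the same answer.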

This also illustrates imperative vs declarative style: the function now reads like a description of what we need, rather than a set of instructions to build up what we need.

### Prefer simple argument and return types

Functions should operate on data, rather than on custom objects, wherever possible. Prefer simple argument types like `dict`, `set`, `tuple`, `list`, `int`, `float`, and `bool`. Upgrade from there to standard library types like `datetime`, `timedelta`, `array`, `Decimal`, and `Future`. Only upgrade to your own custom types when absolutely necessary.

As a good rule of thumb for whether your function is simple enough, ask yourself whether its arguments and return values could always be JSON-serializable. It turns out, this rule of thumb matters more than you might think: JSON-serializability is often a prerequisite to make the functions usable in parallel computing contexts. But, for the purpose of this document, the main benefits are: readability, testability, and overall function simplicity.

### Avoid "traditional" OOP

In "traditional OOP languages" like Java and C++, code re-use is achieved through class hierarchies and polymorphism, or so those languages claim. In Python, though we have the ability to subclass and to do class-based polymorphism, in practice, these capabilities are used rarely in idiomatic Python programs.

It's more common to achieve re-use through modules and functions, and it's more common to achieve dynamic dispatch through duck typing. If you find yourself using super classes as a form of code re-use, stop what you're doing and reconsider. If you find yourself using lots of polymorphism, consider whether one of Python's dunder protocols or duck typing strategies might apply better.

See also the excellent Python talk, ["Stop Writing Classes"][stop-classes], by a Python core contributor.
In it, the presenter suggests that if you have built a class with a single method that is named like a class (e.g. `Runnable.run()`), then what you've done is modeled a function as a class, and you should just stop. Since in Python, functions are "first-class", there is **no reason** to do this!

[stop-classes]: https://www.youtube.com/watch?v=o9pEzgHorH0

### Mixins are sometimes OK

One way to do class-based re-use without going overboard on type hierarchies is to use Mixins. Don't overuse these, though. "Flat is better than nested" applies to type hierarchies, too, so you should avoid introducing needless required layers of hierarchy just to decompose behavior.

Mixins are not actually a Python language feature, but are possible thanks to its support for multiple inheritance. You can create base classes that "inject" functionality into your subclass without forming an "important" part of a type hierarchy, simply by listing that base class as the first entry in the `bases` list. An example:

```python
class APIHandler(AuthMixin, RequestHandler):
    """Handle HTTP/JSON requests with security."""
```

The order matters, so you may as well remember the rule: `bases` forms a hierarchy bottom-to-top. One readability benefit here is that everything you need to know about this class is contained in the `class` definition itself: "it mixes in auth behavior and is a specialized Tornado RequestHandler."

### Be careful with frameworks

Python has a slew of frameworks for web, databases, and more. One of the joys of the language is that it's easy to create your own frameworks. When using an open source framework, you should be careful not to couple your "core code" too closely to the framework itself.

When considering building your own framework for your code, you should err on the side of caution.
The standard library has a lot of stuff built-in, PyPI has even more, and usually, [YAGNI applies][yagni].

[yagni]: http://c2.com/cgi/wiki?YouArentGonnaNeedIt

### Respect metaprogramming

Python supports "metaprogramming" via a number of features, including decorators, context managers, descriptors, import hooks, metaclasses and AST transformations.

You should feel comfortable using and understanding these features -- they are a core part of the language and are fully supported by it. But you should realize that when you use these features, you are opening yourself up to complex failure scenarios. Thus, treat the creation of metaprogramming facilities for your code similarly to the decision to "build your own framework". They amount to the same thing. When and if you do it, make the facilities into their own modules and document them well!

### Don't be afraid of "dunder" methods

Many people conflate Python's metaprogramming facilities with its support for "double-underscore" or "dunder" methods, such as `__getattr__`.

As described in the blog post, ["Python double-under, double-wonder"][dunder], there is nothing "special" about dunders. They are nothing more than a lightweight namespace the Python core developers picked for all of Python's internal protocols. After all, `__init__` is a dunder, and there's nothing magic about it.

It's true that some dunders can create more confusing results than others -- for example, it's probably not a good idea to overload operators without good reason. But many of them, such as `__repr__`, `__str__`, `__len__`, and `__call__` are really full parts of the language you should be leveraging in idiomatic Python code. Don't shy away!
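
As a small illustration of how a few dunders let a custom object plug into built-ins like `len()`, `repr()`, and `for` loops, consider the sketch below. The `Playlist` class is invented for this example and is not taken from any library.

```python
class Playlist(object):
    """A hypothetical container that implements a few standard protocols."""

    def __init__(self, name, tracks=()):
        self.name = name
        self.tracks = list(tracks)

    def __repr__(self):
        # Used by repr() and the interactive prompt; handy when debugging.
        return "Playlist(%r, %r)" % (self.name, self.tracks)

    def __len__(self):
        # Lets callers use the built-in len().
        return len(self.tracks)

    def __iter__(self):
        # Lets callers loop over the playlist directly.
        return iter(self.tracks)


playlist = Playlist("road trip", ["intro", "anthem"])
print(repr(playlist))
print(len(playlist))
print([track.upper() for track in playlist])
```

None of this is metaprogramming: each method simply opts the class in to a protocol the language already defines.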

[dunder]: http://www.pixelmonkey.org/2013/04/11/python-double-under-double-wonder

## A Little Zen for Your Code Style

Barry Warsaw, one of the core Python developers, once said that it frustrated him that "The Zen of Python" ([PEP 20][pep20]) is used as a style guide for Python code, since it was originally written as a poem about Python's **internal** design. That is, the design of the language and language implementation itself. One can acknowledge that, but a few of the lines from PEP 20 serve as pretty good guidelines for idiomatic Python code, so we'll just go with it.

[pep20]: https://www.python.org/dev/peps/pep-0020/
[pocoo-naming]: http://www.pocoo.org/internal/styleguide/#naming-conventions

### Beautiful is better than ugly

This one is subjective, but what it usually amounts to is this: will the person who inherits this code from you be impressed or disappointed? What if that person is you, three years later?

### Explicit is better than implicit

Sometimes in the name of refactoring out repetition in our code, we also get a little bit abstract with it. It should be possible to translate the code into plain English and basically understand what's going on. There shouldn't be an excessive amount of "magic".

### Flat is better than nested

This one is really easy to understand. The best functions have no nesting, neither by loops nor `if` statements. Second best is one level of nesting. Two or more levels of nesting, and you should probably start refactoring to smaller functions.

Also, don't be afraid to refactor a nested if statement into a multi-part boolean conditional.
For example:

```python
# bad
if response:
    if response.get("data"):
        return len(response["data"])
```

is better written as:

```python
# good
if response and response.get("data"):
    return len(response["data"])
```

### Readability counts

Don't be afraid to add line-comments with `#`. Don't go overboard on these or over-document, but a little explanation, line-by-line, often helps a whole lot. Don't be afraid to pick a slightly longer name because it's more descriptive. No one wins any points for shortening "`response`" to "`rsp`". Use doctest-style examples to illustrate edge cases in docstrings. Keep it simple!

### Errors should never pass silently

The biggest offender here is the bare `except: pass` clause. Never use these. Suppressing **all** exceptions is simply dangerous. Scope your exception handling to single lines of code, and always scope your `except` handler to a specific type. Also, get comfortable with the `logging` module and `log.exception(...)`.

### If the implementation is hard to explain, it's a bad idea

This is a general software engineering principle -- but applies very well to Python code. Most Python functions and objects can have an easy-to-explain implementation. If it's hard to explain, it's probably a bad idea. Usually you can make a hard-to-explain function easier-to-explain via "divide and conquer" -- split it into several functions.

### Testing is one honking great idea

OK, we took liberty on this one -- in "The Zen of Python", it's actually "namespaces" that's the honking great idea.

But seriously: beautiful code without tests is simply worse than even the ugliest tested code. At least the ugly code can be refactored to be beautiful, but the beautiful code can't be refactored to be verifiably correct, at least not without writing the tests!
So, write tests! Please!

## Six of One, Half a Dozen of the Other

This is a section for arguments we'd rather not settle. Don't rewrite other people's code because of this stuff. Feel free to use these forms interchangeably.

### `str.format` vs overloaded format `%`

`str.format` is more robust, yet `%` with `"%s %s"` printf-style strings is more concise. Both will be around forever.

Remember to use unicode strings for your format pattern, if you need to preserve unicode:

```python
u"%s %s" % (dt.datetime.utcnow().isoformat(), line)
```

If you do end up using `%`, you should consider the `"%(name)s"` syntax which allows you to use a dictionary rather than a tuple, e.g.

```python
u"%(time)s %(line)s" % {"time": dt.datetime.utcnow().isoformat(), "line": line}
```

Also, don't re-invent the wheel. One thing `str.format` does unequivocally better is support various [formatting modes][str-format], such as humanized numbers and percentages. Use them.

But use whichever one you please. We choose not to care.

[str-format]: https://docs.python.org/2/library/string.html#formatspec

### `if item` vs `if item is not None`

This is unrelated to the earlier rule on `==` vs `is` for `None`. In this case, we are actually taking advantage of Python's "truthiness rules" to our benefit in `if item`, e.g. as a shorthand "item is not None or empty string."

Truthiness is a [tad complicated][truth-values] in Python and certainly the latter is safer against some classes of bugs. The former, however, is very common in much Python code, and it's shorter. We choose not to care.
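
To show where the two forms actually diverge, here is a short sketch using a hypothetical `describe` helper. The values that differ are the "falsy but not `None`" ones, such as `0`, `""`, and empty containers.

```python
def describe(item):
    """Classify a value by how the two `if` forms would treat it."""
    if item:
        return "truthy"
    if item is not None:
        # Reached only by values like 0, "", [] and {}.
        return "falsy but not None"
    return "None"


for value in [None, 0, "", [], "text", 42]:
    print("%r -> %s" % (value, describe(value)))
```

If `0` or an empty string is a legitimate value in your code path, `if item is not None` is the form that keeps it.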

[truth-values]: https://docs.python.org/2/library/stdtypes.html#truth-value-testing

### Implicit multi-line strings vs triple-quote `"""`

Python's compiler will automatically join multiple quoted strings together into a single string during the parse phase if it finds nothing in between them, e.g.

```python
msg = ("Hello, wayward traveler!\n"
       "What shall we do today?\n"
       "=>")
print(msg)
```

This is roughly equivalent to:

```python
msg = """Hello, wayward traveler!
What shall we do today?
=>"""
print(msg)
```

In the former's case, you keep the indentation clean, but need the ugly newline characters. In the latter case, you don't need the newlines, but break indentation. We choose not to care.

### Using `raise` with classes vs instances

It turns out Python lets you pass either an exception **class** or an exception **instance** to the `raise` statement. For example, these two lines are roughly equivalent:

```python
raise ValueError
raise ValueError()
```

Essentially, Python turns the [first line into the second automatically][raises]. You should probably prefer the second form, if for no other reason than to **actually provide a useful argument**, like a helpful message about why the `ValueError` occurred. But these two lines **are** equivalent and you shouldn't rewrite one style into the other just because. We choose not to care.

[raises]: https://docs.python.org/2.7/reference/simple_stmts.html#the-raise-statement

## Standard Tools and Project Structure

We've made some choices on "best-of-breed" tools for things, as well as the very minimal starting structure for a proper Python project.

### The Standard Library

- `import datetime as dt`: always import `datetime` this way
- `dt.datetime.utcnow()`: preferred to `.now()`, which does local time
- `import json`: the standard for data interchange
- `from collections import namedtuple`: use for lightweight data types
- `from collections import defaultdict`: use for counting/grouping
- `from collections import deque`: a fast double-ended queue
- `from itertools import groupby, chain`: for declarative style
- `from functools import wraps`: use for writing well-behaved decorators
- `argparse`: for "robust" CLI tool building
- `fileinput`: to create quick UNIX pipe-friendly tools
- `log = logging.getLogger(__name__)`: good enough for logging
- `from __future__ import absolute_import`: fixes import aliasing

### Common Third-Party Libraries

- `python-dateutil` for datetime parsing and calendars
- `pytz` for timezone handling
- `tldextract` for better URL handling
- `msgpack-python` for a more compact encoding than JSON
- `futures` for Future/pool concurrency primitives
- `docopt` for quick throwaway CLI tools
- `py.test` for unit tests, along with `mock` and `hypothesis`

### Local Development Project Skeleton

For all Python packages and libraries:

- no `__init__.py` in root folder: give your package a folder name!
- `mypackage/__init__.py` preferred to `src/mypackage/__init__.py`
- `mypackage/lib/__init__.py` preferred to `lib/__init__.py`
- `mypackage/settings.py` preferred to `settings.py`
- `README.rst` describes the repo for a newcomer; use reST
- `setup.py` for simple facilities like `setup.py develop`
- `requirements.txt` describes package dependencies for `pip`
- `dev-requirements.txt` additional dependencies for tests/local
- `Makefile` for simple (!!!)
build/lint/test/run steps

Also, always [pin your requirements](http://nvie.com/posts/better-package-management/).

## Some Inspiration

The following links may give you some inspiration about the core of writing Python code with great style and taste.

- Python's stdlib [`Counter` class][Counter], implemented by Raymond Hettinger
- The [`rq.queue` module][rq], originally by Vincent Driessen
- This document's author also wrote [this blog post on "Pythonic" code][idiomatic]

Go forth and be Pythonic!

```
$ python
>>> import antigravity
```

[Counter]: https://github.com/python/cpython/blob/57b569d8af2b3263c5d9e6d75fb308f89ea17ac6/Lib/collections/__init__.py#L446-L841
[rq]: https://github.com/nvie/rq/blob/master/rq/queue.py
[idiomatic]: http://www.pixelmonkey.org/2010/11/03/pythonic-means-idiomatic-and-tasteful

## Contributors

- Andrew Montalenti ([twitter][amontalenti]): original author
- Vincent Driessen ([twitter][nvie]): edits and suggestions
- William Feng ([github][williamfzc]): translation to zh-cn

[amontalenti]: https://twitter.com/amontalenti
[nvie]: https://twitter.com/nvie
[williamfzc]: https://github.com/williamfzc

---

Like good Python style? Then perhaps you'll enjoy this style guide author's [past blog posts on Python][amonpy].
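
[amonpy]: https://amontalenti.com/?s=python

--------------------------------------------------------------------------------
/zh-cn/README.md:
--------------------------------------------------------------------------------
# A Guide to Writing Python #
> This article is a translated and polished version of [The Elements of Python Style](https://github.com/amontalenti/elements-of-python-style/blob/master/README.md#paradigms-and-patterns). My knowledge is limited, so shortcomings are inevitable; please point out any errors or omissions.

> Translator: [@williamfzc](https://github.com/williamfzc)
> For errors, omissions, or suggestions, please open an issue to reach me.

This article builds on PEP8 and covers syntax, module layout, paradigms, architecture, and more, presenting what I consider good Python coding style.

Introduction from the original:
This document goes beyond PEP8 to cover the core of what I think of as great Python style. It is opinionated, but not too opinionated. It goes beyond mere issues of syntax and module layout, and into areas of paradigm, organization, and architecture. I hope it can be a kind of condensed ["Strunk & White"][strunk-white] for Python code.

[strunk-white]: https://en.wikipedia.org/wiki/The_Elements_of_Style

# Table of Contents

* [A Guide to Writing Python](#a-guide-to-writing-python)
* [Follow Most PEP8 Guidelines](#follow-most-pep8-guidelines)
* [On Maximum Line Length](#on-maximum-line-length)
* [Consistent Naming](#consistent-naming)
* [Nitpicks Not Worth Arguing Over](#nitpicks-not-worth-arguing-over)
* [Writing Good Docstrings](#writing-good-docstrings)
* [Paradigms and Design Patterns](#paradigms-and-design-patterns)
* [Applying "The Zen of Python" to Your Code](#applying-the-zen-of-python-to-your-code)
* [Subjective Choices with No Right Answer](#subjective-choices-with-no-right-answer)
* [Useful Libraries and Project Structure](#useful-libraries-and-project-structure)
* [Some Inspiration](#some-inspiration)
* [Contributors](#contributors)

## Follow Most PEP8 Guidelines

PEP8 covers most of what users deal with day to day, such as whitespace, line breaks between functions/classes/methods, module imports, and warnings against deprecated functionality, and it handles them very well.

For putting the PEP8 guidelines into practice, there is an excellent helper tool, flake8, which can detect syntax errors and shortcomings in your Python code.

PEP8 is a set of guidelines; in principle, it is not a set of rules that must be followed 100% strictly.

Following a single set of guidelines, however, easily leads to disputes in areas they do not spell out. This article proposes what I consider reasonable solutions to those points of contention, for your reference.

## On Maximum Line Length

As for flake8's strict limit of 79 characters per line: when it occasionally gets in your way, don't worry about it, just ignore it. In most cases, though, if your code frequently exceeds this limit, there is probably a design problem somewhere in your code.

In principle, more than 90% of your lines should be 79 characters or fewer.

## Consistent Naming

If a team agrees to follow some simple naming rules, it will avoid a lot of unnecessary collaboration problems.

### Recommended Naming Rules
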
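Note the first rule: `HTTPWriter`, not `HttpWriter`. Acronyms such as API stay uppercase.

- Class names: `CamelCase`, and capitalize acronyms: `HTTPWriter`, not `HttpWriter`.
- Variable names: `lower_with_underscores`.
- Method and function names: `lower_with_underscores`.
- Modules: `lower_with_underscores.py`. (But prefer names that don't even need underscores.)
- Constants: `UPPER_WITH_UNDERSCORES`.
- Precompiled regular expressions: `name_re`.

For more details, see [the Pocoo team][pocoo].

[pocoo]: https://flask.palletsprojects.com/en/1.1.x/styleguide/

### Use of Underscores

This covers both leading and trailing underscores (`_prefix` or `suffix_`).

Functions are often prefixed with a single underscore to mark them as private; this is practically an established convention. But the rule should be used with care, arguably only in the following two cases:

- inside an API that will be widely used;
- where [information hiding](http://wiki.c2.com/?InformationHiding) is involved.

PEP8 suggests using a trailing single underscore to avoid clashing with built-in names:

    sum_ = sum(some_long_list)
    print(sum_)

In practice, though, rather than adding an underscore, you should probably first consider picking a different word.

A leading double underscore should, in principle, be used **only** when you need [name mangling behavior](https://docs.python.org/2/tutorial/classes.html#private-variables-and-class-local-references), and in most cases you don't.
Names with double underscores on both sides (such as `__len__`) should be used only when implementing Python's standard protocols. That namespace is reserved for Python's internal protocols and should not be used casually in team code.

**Translator's note**
A few words on name mangling. Inside a class, Python rewrites names that start with a double underscore by prefixing the class name; for example, the variable `__b` in class `A` becomes `_A__b`. This exists because Python has no `private` keyword like other languages, so this is the only way to mark a variable that way. It lets subclasses override a parent's methods without breaking method calls inside the parent class.
But this mechanism only keeps honest people out: even with a double underscore, others can still reach the variable through a name like `_A__b`. In everyday development, the common convention is to prefix the variable with a single underscore instead of relying on name mangling, which serves as a warning rather than a hard barrier.

### Avoid Overly Short Names

- Names such as `i`, `x`, and `_` are fine in simple constructs such as lambdas and list comprehensions.
- Outside those cases, don't use them casually; they make the code much harder for others to read.

### Widely Accepted Naming Conventions

The following rules should, in principle, be followed strictly:

- The first parameter of an instance method is always named `self`.
- The first parameter of a class method (`@classmethod`) is always named `cls`.
- Variable arguments are always named `*args` and `**kwargs`.

## Nitpicks Not Worth Arguing Over

This section lists some rules where breaking them gains you little or nothing. So we shouldn't waste time splitting hairs over them; just follow them.

### Always Inherit from `object` When Declaring a Base Class

    # bad
    class JSONWriter:
        pass

    # good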
    class JSONWriter(object):
        pass

In Python 2, following this rule is very important, because old-style and new-style classes behave differently. In Python 3, every class inherits from `object` by default, so all classes are new-style and the rule matters much less.

### Don't Reuse Variable Names

    # bad
    class JSONWriter(object):
        handler = None
        def __init__(self, handler):
            self.handler = handler

    # good
    class JSONWriter(object):
        def __init__(self, handler):
            self.handler = handler

### Prefer List Comprehensions to map/filter

    # bad
    map(truncate, filter(lambda x: len(x) > 30, items))

    # good
    [truncate(x) for x in items if len(x) > 30]

In team work, your code should prioritize readability. In some cases `map` and `filter` may be more readable than a list comprehension; in those cases, use your own judgment.

### Use Parentheses to Break Long Lines

The guiding principle is to **avoid backslashes wherever possible**. Two examples:

    # bad
    from itertools import groupby, chain, \
                          izip, islice

    # good
    from itertools import (groupby, chain,
                           izip, islice)

    # bad
    response = Search(using=client) \
               .filter("term", cat="search") \
               .query("match", title="python")

    # good
    response = (Search(using=client)
                .filter("term", cat="search")
                .query("match", title="python"))

### Use `isinstance` Rather Than `type` for Type Comparisons

`isinstance` covers more cases than `type`, such as subclasses and abstract base classes.
Prefer `isinstance` in most situations.
Of course, if you need an exact type check or have some other special case, choose whichever fits.

### Use `with` for Files and Locks

The `with` statement takes care of closing files and releasing locks, even when an exception is raised during execution. It is far more convenient than a `try/finally` pair.

    # bad
    somefile = open("somefile.txt", "w")
    somefile.write("sometext")
    return

    # good
    with open("somefile.txt", "w") as somefile:
        somefile.write("sometext")
    return

### Compare to `None` with `is`, Not `==`

`None` is the sole instance of `NoneType`, so every reference to it points to the same object.
There is therefore no need to complicate the comparison by going through `__eq__` (`==` calls the object's `__eq__` method to decide equality); simply use `is` to check whether the two are the same object.

    # bad
    if item == None:
        continue

    # good
    if item is None:
        continue

When comparing against `None`, using `is` instead of `==` is both faster and more robust.

### Avoid `sys.path` Hacks

**Translator's note**
The original is not very clear in this part, so I have added some of my own understanding here. Refer to the original if needed.

Many people use something like `sys.path.insert(0, "../")` to make imports work, but this badly hurts readability: others cannot easily tell what you are trying to do.

Python already has a very complete module system. You can adjust which modules are loaded by changing `PYTHONPATH`, or run code as a module with `-m`. In addition, `import` and mechanisms such as `__import__` effectively handle loading modules dynamically.

A module should not depend on the file layout of the working directory to run correctly; that crosses abstraction levels. Modules should resolve their dependencies at the module level (for example, with `import`), not at the file level.

### Rarely, If Ever, Define Custom Exception Types

> If you must, at least don't define too many.

Python already provides a rich set of exception types to choose from. These built-in types cover the vast majority of everyday development needs.

One way to decide whether you need a custom exception type is to ask whether users would need to catch this exception every time they call the function. If so, you probably should define one, but that situation is comparatively rare.

An example: [tornado.web.HTTPError](http://www.tornadoweb.org/en/stable/web.html#tornado.web.HTTPError)

This exception covers all HTTP errors raised either within the framework or in user code.

## Writing Good Docstrings

### Keep Short Docstrings on One Line

    # bad
    def reverse_sort(items):
        """
        sort items in reverse order
        """

    # good
    def reverse_sort(items):
        """Sort items in reverse order."""

### Write Docstrings in reST Style

    def get(url, qsargs=None, timeout=5.0):
        """Send an HTTP GET request.

        :param url: URL for the new request.
        :type url: str
        :param qsargs: Converted to query string arguments.
        :type qsargs: dict
        :param timeout: In seconds.
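        :rtype: mymodule.Response
        """
        return request('get', url, qsargs=qsargs, timeout=timeout)

In the example above, the type of `timeout` can be omitted because its default value already makes it obvious that it is a `float`. `qsargs` gives no hint of its type, so the docstring states it. `rtype` indicates the return type.

Code is read far more often than it is modified; docstrings effectively lower the cost of reading it, so they are essential.

Keep in mind, though, that well-placed documentation helps development while too much of it becomes a burden. In principle, developers should document widely reused code (functions) first, and update the docstring every time the function changes.

## Paradigms and Design Patterns

### Functions vs Classes

- Prefer functions to classes. Functions and modules are the basic units of code reuse, and they are the most flexible.
- Classes are more "advanced" than functions and are typically used for larger pieces of functionality such as containers, proxies, and type systems. But advanced also means a correspondingly higher maintenance cost. In most cases a class is probably overkill, and functions alone are enough.
- Some people like to group related functions into a class, but this is wrong. To achieve the same goal, use a module, not a class. Although a class can sometimes act like a miniature namespace (using `@staticmethod`), the methods of a class should generally revolve around operations on the object's internals, not merely group a set of behaviors.
- Calling functions through a module is much clearer than calling them through a class. Suppose you need to manage a set of time-related functions:
  - with a module named `lib.time`, you can import and rename things flexibly as needed;
  - with a class named `TimeHelper`, you may even have to create a subclass just to use its methods.

**Translator's note**
In other words, in Python, modules take over much of the work that classes, so central in other languages, would otherwise do, because a class is too heavyweight compared with a module. In lightweight scenarios, using a module will make your code more Pythonic.
That said, there is no denying that in many cases a class handles complex logic better than a module, so choose according to the actual situation.

### Imperative vs Declarative Programming

#### Concepts ####

- Imperative programming: relies heavily on mutable objects and instructions. We habitually create objects or variables and modify their state or values, or give the program a sequence of instructions to carry out.

- Declarative programming: you no longer need to spell out explicit instructions; the detailed steps are encapsulated by libraries, and all you do is state what you want and declare your intent.

#### Conclusions ####

- In Python, use the declarative style wherever possible.
- Code should express what it wants to accomplish, not describe how to accomplish it.

An example with lists, where the second form reduces complexity and improves both efficiency and readability:

    # bad
    filtered = []
    for x in items:
        if x.endswith(".py"):
            filtered.append(x)
    return filtered

    # good
    return [x
            for x in items
            if x.endswith(".py")]

**Translator's note**
I think what the author means to convey with this example is: make good use of Python's built-in features (or helper functions of your own) to simplify the implementation and reduce the part of the code that "describes how to accomplish it" (which, in effect, is spread out to other places).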
For instance, the list comprehension in the example is a feature provided by Python, and its implementation is handled by Python. When reading the code, what we care about is not how it is implemented but what result the function produces. This matters enormously in team work.

### For the Same Functionality, Cleaner and Shorter Is Better

    # bad
    def dedupe(items):
        """Remove dupes in-place, return items and # of dupes."""
        seen = set()
        dupe_positions = []
        for i, item in enumerate(items):
            if item in seen:
                dupe_positions.append(i)
            else:
                seen.add(item)
        num_dupes = len(dupe_positions)
        for idx in reversed(dupe_positions):
            items.pop(idx)
        return items, num_dupes

This same function can be written as follows:

    # good
    def dedupe(items):
        """Return deduped items and # of dupes."""
        deduped = set(items)
        num_dupes = len(items) - len(deduped)
        return deduped, num_dupes

- In this function, the latter's advantage over the former is not just line count:
- the logic is clearer;
- it is much easier to debug;
- it produces fewer bugs.

### Prefer Simple Types for Arguments and Return Values

- Functions should operate on data; prefer simple types over custom objects.
- Simple types: `set`, `tuple`, `list`, `int`, `float`, `bool`.
- Consider more complex types only when the types above are not enough.

**Translator's note**
Using basic types for function inputs and outputs also reduces overall coupling to some degree.

### Avoid Traditional OOP Thinking

- In Java and C++, code reuse is achieved mostly through class inheritance and polymorphism.
- In Python, we can do the same, but it is not very Pythonic.
- In Python, the most common way to reuse code is through modules and functions. A Python core developer once gave a talk, ["Stop Writing Classes"][stop-classes], criticizing the overuse of classes. Functions are first-class citizens in Python, and much of the time there is no need to build a class. If you find yourself using classes for code reuse, think again, especially when the class name closely resembles its method name. `(e.g.
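Runnable.run())`
- For polymorphism, use duck typing. In Python, we care not about what an object is but about how it can be used.

[stop-classes]: https://www.youtube.com/watch?v=o9pEzgHorH0

**Translator's note**
The author is not saying here that OOP is bad, only that Python's way of doing OOP may differ from other languages'.

### Mixins Are Fine, but Don't Overuse Them

For the concept of mixins, see [here](http://blog.csdn.net/gzlaiyonghao/article/details/1656969).

- Mixins offer a simple, effective way to reuse base-class behavior without digging into a deep type hierarchy.
- Too much layering, however, makes the code much harder to read; before using them, briefly consider whether you are over-engineering.

A convention for using them:

    class APIHandler(AuthMixin, RequestHandler):
        """Handle HTTP/JSON requests with security."""

With this design, the class declaration itself says: "this class mixes in auth behavior, and it is a RequestHandler."

**Translator's note**
Conveying what a function does directly in its name is very elegant. Function design should follow this principle so that other users can use the function correctly from its name and docstring alone.

### Be Careful with Frameworks

- Python has a huge number of third-party libraries that make it easy to build your own framework and implement the features you need.
- When using them, take great care not to conflict with other libraries (in naming, duplicated functionality, and so on).

A bad example:

    from something import *
    # Never do this unless you know exactly what it contains and that it won't clash with your own names

### Respect Metaprogramming

- Metaprogramming is a core part of Python, and many features are built on it, including decorators, context managers, descriptors, and imports.
- Knowing metaprogramming lets you handle complex and varied scenarios and customize how objects behave; it is the first step toward writing your own framework.

### Don't Be Afraid of Double-Underscore Methods (Internal Methods)

- They are simply a namespace chosen early on for implementing internal protocols; there is nothing especially magical about them.
- Admittedly, they can produce more confusing results than ordinary functions.
- Overriding them without careful thought is not a good idea unless you have a truly good reason.

## Applying "The Zen of Python" to Your Code

### Beautiful is better than ugly
Aim to write beautiful code.

**Translator's note**
In team work, the goal of programming is absolutely not just to implement the functionality correctly, especially in a high-level language like Python that is already so close to natural language.

### Explicit is better than implicit

Code should translate as directly as possible into plain English (human language), so that a reader can basically understand what will happen when it runs.

### Flat is better than nested

Two examples:

    # bad
    if response:
        if response.get("data"):
            return len(response["data"])

    # good
    if response and response.get("data"):
        return len(response["data"])

### Readability counts

- Avoid abbreviations where possible.
- Don't be afraid of long parameter names.
- Add plenty of comments.

### Errors should never pass silently

- The biggest offender is `except: pass`, which can easily swallow strange errors and make later debugging hard. Never use it.
- If an error is not handled, at least log that it happened.
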
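
As a rough sketch of the alternative to `except: pass`, the example below guards a single line, catches one specific exception type, and records the failure with `log.exception(...)`. The `parse_payload` function, and its choice to return `None` on bad input, are assumptions made for illustration and are not part of the original guide.

```python
import json
import logging

log = logging.getLogger(__name__)


def parse_payload(raw):
    """Parse a JSON string; log the failure and return None on bad input."""
    try:
        payload = json.loads(raw)  # only this one line is guarded
    except ValueError:  # malformed JSON; JSONDecodeError is a ValueError
        # log.exception records the message together with the traceback.
        log.exception("could not parse payload: %r", raw)
        return None
    return payload
```

Any other exception, such as a `TypeError` from passing a non-string, still propagates, so unexpected bugs stay visible instead of being swallowed.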
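### All code should be easy to explain

- Most Python functions and objects can have an implementation that is easy to explain.
- If others struggle to understand your implementation, consider refactoring it.
- Splitting it up usually solves the problem.

### Tests are necessary

Correctness undoubtedly matters more than whether the code is beautiful or ugly.
Beautiful code whose correctness can't be verified < ugly code whose correctness can.

## Subjective Choices with No Right Answer

The items below are disagreements that may arise from engineers' differing habits. None of them is a matter of right or wrong, and the differences in usage have little effect on team work.

**Translator's note**
Where reasonable and permitted by the team, every engineer's habits should be respected; a difference in personal habits is no reason to rewrite someone else's code.

### `str.format` vs `%`

The former is more robust; the latter is more concise.

### `if item` vs `if item is not None`

- First, be clear that the two are not the same.
- `if item` also treats empty strings and empty lists as false, whereas the latter only checks whether the value is `None` itself.
- If the difference doesn't affect correctness, use either freely.

### Multi-line Strings

Two examples:

    msg = ("Hello, wayward traveler!\n"
           "What shall we do today?\n"
           )
    print(msg)


    msg = """Hello, wayward traveler!
    What shall we do today?
    """
    print(msg)

Take your pick.

If the multi-line string is placed inside a function, it has to become:

    def abc():
        msg = """Hello, wayward traveler!
    What shall we do today?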
    """
        print(msg)

Rather ugly.

### Should `raise` Use a Class or an Instance?

    raise ValueError
    raise ValueError()

- In the case above, Python automatically turns the class into an instance.
- The second form is still recommended, because you can add a helpful message to ease later debugging.
- But it's not a hard requirement, since the two do end up behaving the same.

## Useful Libraries and Project Structure

### The Standard Library

- `import datetime as dt`: import `datetime` this way
- `dt.datetime.utcnow()`: preferred to `.now()`, which returns local time; this one returns UTC
- `import json`: use JSON for data interchange; it is widely regarded as a good interchange format
- `from collections import namedtuple`: quickly define a lightweight data type, much like a lightweight class without methods
- `from collections import defaultdict`: a dictionary with default values
- `from collections import deque`: a fast double-ended queue
- `from itertools import groupby, chain`: `chain` combines iterators; `groupby` groups data
- `from functools import wraps`: essential for writing well-behaved decorators
- `argparse`: for handling command-line arguments
- `fileinput`: for reading files, with support for in-place editing; personally, I don't find it very convenient
- `log = logging.getLogger(__name__)`: log management
- `from __future__ import absolute_import`: resolves possible conflicts between your own modules and standard library modules on import (the default behavior in Python 3)

### Common Third-Party Libraries

- `python-dateutil` for datetime parsing and calendars
- `pytz` for timezone handling
- `tldextract` for better URL handling
- `msgpack-python` for a more compact encoding than JSON
- `futures` for Future/pool concurrency primitives
- `docopt` for quick throwaway CLI tools
- `py.test` for unit tests, along with `mock` and `hypothesis`

### Local Development Project Skeleton

For all Python projects and libraries, do not put an `__init__.py` file in the root folder!

A project is best laid out like this:

- `mypackage/__init__.py` preferred to `src/mypackage/__init__.py`
- `mypackage/lib/__init__.py` preferred to `lib/__init__.py`
- `mypackage/settings.py` preferred to `settings.py`

Note: "root folder" here means the project root.

Other files:

- `README.rst`/`README.md` describes what the project does and introduces it to newcomers, in reST or Markdown.
- `setup.py` for installing the package on different machines.
- `requirements.txt` declares the project's dependencies, for use with `pip`.
- `dev-requirements.txt` unlike the above, manages dependencies needed only during development and debugging.
- `Makefile` for simple build/lint/test/run steps.

Also, always [pin your requirements](http://nvie.com/posts/better-package-management/).


## Some Inspiration

The links below may give you some inspiration for writing code with better style that is more Pythonic.

- Python's stdlib [`Counter` class][Counter], implemented by Raymond Hettinger
- The [`rq.queue` module][rq], originally by Vincent Driessen
- This document's author also wrote [this blog post on "Pythonic" code][idiomatic]

Go forth and be Pythonic!

```
$ python
>>> import antigravity
```

[Counter]: https://github.com/python/cpython/blob/57b569d8af2b3263c5d9e6d75fb308f89ea17ac6/Lib/collections/__init__.py#L446-L841
[rq]: https://github.com/nvie/rq/blob/master/rq/queue.py
[idiomatic]: http://www.pixelmonkey.org/2010/11/03/pythonic-means-idiomatic-and-tasteful

## Contributors

- Andrew Montalenti ([@amontalenti][amontalenti]): original author
- Vincent Driessen ([@nvie][nvie]): edits and suggestions

[amontalenti]: http://twitter.com/amontalenti
[nvie]: http://twitter.com/nvie

---

Like good Python style? Then perhaps you'd like to [work on our team of Pythonistas][tweet] at Parse.ly!

[tweet]: https://twitter.com/amontalenti/status/682968375702716416
--------------------------------------------------------------------------------