├── README.md
└── _config.yml

/README.md:
--------------------------------------------------------------------------------
# How to Shoot Yourself in the Foot with Python

## Common pitfalls and misunderstandings

![Snake Fail](http://cl.jroo.me/z3/e/D/s/e/a.baa-snake-will-eat-itself.jpg)

This document doesn't list _bugs_ in Python, but rather unexpected behaviours. Of course, "unexpected behaviour" depends a lot on what you expect Python to do.

Please feel free to add corrections, clarifications and more common pitfalls by sending a pull request or opening an issue!

### Contents

- [Arithmetic Fail](#arithmetic-fail)
- [Class Property Fail](#class-property-fail)
- [Scope Fail](#scope-fail)
- [Oscar Speech Fail / Immutables Part I](#oscar-speech-fail--immutables-part-i)
- [Immutable Fail, part II](#immutable-fail-part-ii)
- [Cooking the Books Fail](#cooking-the-books-fail)
- [Integer Division Fail](#integer-division-fail)
- [Closure Fail](#closure-fail)

### Arithmetic Fail

Let's do some first-grade arithmetic:

```python
>>> a = 2
>>> a * a is 4
True
```

Works as advertised. Let's see if Python can handle slightly larger numbers, too:

```python
>>> a = 20
>>> a * a is 400
False
```

__What's happening here?__ Remember that everything in Python is an object, even numbers. Also remember that `is` checks for _identity_, not _equality_. So `2 * 2 is 4` is the same as `id(2 * 2) == id(4)`. The reason this works for small numbers is that CPython creates singletons for the integers from -5 to 256 on start-up because they're frequently used. This is an implementation detail of CPython, not a language feature. However, when we compute `20 * 20`, a new object with value 400 is created, which is a different object from the one created for the literal `400`.
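
To make the difference between identity and equality concrete, here is a minimal sketch. The variable names (`product`, `parsed`, `missing`) are made up for illustration, and both integers are deliberately built at runtime so that the interpreter cannot merge them into one constant. What the identity check prints is an implementation detail, so no particular output is promised for it:

```python
a = 20
product = a * a        # an int with value 400, built at runtime
parsed = int("400")    # another int with value 400, built independently

values_equal = product == parsed   # compares values: this is what we want
same_object = product is parsed    # compares identities: depends on the implementation

print("equal:", values_equal)
print("identical:", same_object)

# Identity checks are fine for singletons such as None.
missing = None
print("missing is None:", missing is None)
```

The `==` comparison is the one that is guaranteed by the language; the `is` comparison should be treated as unpredictable for numbers.
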

> __How to avoid this issue:__ only use `is` to check if things are `True`, `False` or `None`. These are singletons (i.e. every `False` in your code is the same object).

### Class Property Fail

"In the wild, life is a constant battle to find enough to eat..."

```python
class Mammal(object):
    awkwardness = 0

class Platypus(Mammal):
    pass

class Dolphin(Mammal):
    pass
```

We create a mammal class and two sub-classes.

```python
>>> print(Mammal.awkwardness, Platypus.awkwardness, Dolphin.awkwardness)
0 0 0
```

Nothing too unexpected. Let's set the awkwardness of the platypus to a well-deserved 10:

```python
>>> Platypus.awkwardness = 10
>>> print(Mammal.awkwardness, Platypus.awkwardness, Dolphin.awkwardness)
0 10 0
```

All as expected. Now remember that all mammals are basically __tubes__, and feel very self-conscious about being a mammal, too. Let's bump the awkwardness of mammals to 3:

```python
>>> Mammal.awkwardness = 3
>>> print(Mammal.awkwardness, Platypus.awkwardness, Dolphin.awkwardness)
3 10 3
```

__Why did the awkwardness of dolphins change? Dolphins are *cute*!__ We're dealing with class properties (class attributes) here. As long as a subclass doesn't set the attribute itself, looking it up on the subclass simply falls through to the parent class. When we set `Platypus.awkwardness = 10` we create a __new__ class property on the platypus class, which shadows the parent's. `Dolphin` never got its own, so it still reports whatever `Mammal` currently has.

### Scope Fail

Here's one of my favourite Python party tricks (I'm an unpopular party guest). The setup:

```python
answer = 42

def ultimate_question_of_life():
    print(answer)
```

Now for the easy part:

```python
>>> ultimate_question_of_life()
42
```

Right on. But what if we try to one-up Douglas Adams?

```python
answer = 42

def ultimate_question_of_life():
    print(answer)
    answer += 1

ultimate_question_of_life()
```

Ouch:

```python
---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
 in ()
----> 7 ultimate_question_of_life()

 in ultimate_question_of_life()
      3 def ultimate_question_of_life():
----> 4     print(answer)
      5     answer += 1

UnboundLocalError: local variable 'answer' referenced before assignment
```

__Alright, this fails.__ But wait a second, where does it fail? At the print statement that used to succeed in the example above! By adding a line _after_ a perfectly innocuous statement we make this statement suddenly break things! Madness!!

The problem here is that Python is, contrary to common misconception, not interpreted line-by-line. Instead, before any code runs (e.g. when a module is imported), Python compiles it and works out the scope of every block: which names are local to it and which are looked up outside. Since we assign to `answer` inside `ultimate_question_of_life` (note that `+=` doesn't change the integer that `answer` refers to, but creates a new object and rebinds the name!), `answer` is treated as a local variable throughout the function, and we can't refer to the `answer` that's defined outside that scope anymore.

### Oscar Speech Fail / Immutables Part I

As any Academy Award-winning director knows, the most unforgivable of all faux pas is to forget to thank your spouse. Let's write a Python script that takes care of our Oscar® speech:

```python
def oscar_speech(people_to_thank=[]):
    people_to_thank.append("my wife")
    for person in people_to_thank:
        print("I want to thank {}".format(person))
```

Alright, ready for the spotlight?

```python
>>> oscar_speech()
I want to thank my wife
>>> oscar_speech(["The Academy", "Lars von Trier"])
I want to thank The Academy
I want to thank Lars von Trier
I want to thank my wife
```

Great. Let's practice some more:

```python
>>> oscar_speech()
I want to thank my wife
I want to thank my wife
>>> oscar_speech()
I want to thank my wife
I want to thank my wife
I want to thank my wife
```

__Huh?__ The problem is that the list we pass as the default argument only gets created once, when the function is defined (typically at import time), not every time we call the function. So we end up appending our wife to the same list over and over again. This piece of code is equivalent to the one above and clarifies the issue:

```python
default_list = []  # created once, shared by every call that omits the argument

def oscar_speech(people_to_thank=default_list):
    people_to_thank.append("my wife")
    for person in people_to_thank:
        print("I want to thank {}".format(person))
```

### Immutable Fail, part II

The pledge:

```python
flying_circus = ["Eric Idle", "Terry Gilliam"]

def casting_a():
    flying_circus.append("John Cleese")
    return flying_circus

def casting_b():
    flying_circus += ["Terry Jones"]
    return flying_circus
```

The turn:

```python
>>> casting_a()
['Eric Idle', 'Terry Gilliam', 'John Cleese']
```

The prestige:

```python
>>> casting_b()
---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
 in ()
----> 1 casting_b()

 in casting_b()
      7 def casting_b():
----> 8     flying_circus += ["Terry Jones"]
      9     return flying_circus

UnboundLocalError: local variable 'flying_circus' referenced before assignment
```

Why does `list.append` succeed, but `list += [...]`
fail? Because `list.append` is just a method call on the existing object, whereas `+=` is an assignment. Remember our scope fail above: since `flying_circus` is assigned inside `casting_b`, Python treats it as a local variable for the whole function body. `flying_circus += ["Terry Jones"]` first has to read `flying_circus` in order to extend it, and only then binds the result back to the name. At the time of the read, the local name has no value yet. (For lists, `+=` actually extends the list in place rather than building a new one, but it still rebinds the name, and that is what makes the name local.) For comparison,

```python
def casting_c():
    flying_circus_new = flying_circus + ["Terry Jones"]
    return flying_circus_new
```

will work perfectly fine.

### Cooking the Books Fail

Let's turn our attention to the use of Python in the scientific community. A frequent problem many scientists encounter is that their data doesn't _quite_ match the hypothesis. Instead of going through the arduous step of refining our hypothesis, we can just, you know, _tweak_ the data a little bit until it looks like what it was _supposed_ to look like to start with.

```python
data = {
    'x': [0,1,2,3],
    'y': [1,3,9,16]
}
```

So, obviously the effect here is quadratic, right? And the `3` on the y-axis is just a tiny perturbation in our measurements. Let's fix that! But just to be safe, let's work on a copy of our data and not touch the original:

```python
>>> baked_data = data.copy()
>>> baked_data['y'][1] = 4
>>> print(baked_data)
{'y': [1, 4, 9, 16], 'x': [0, 1, 2, 3]}
```

Much better! Let's just make sure our original data is still the same.

```python
>>> print(data)
{'y': [1, 4, 9, 16], 'x': [0, 1, 2, 3]}
```

__Damn.__ When we created a copy of our data, we actually created a so-called __shallow copy__. This means that we create a new `dict` object, but we only copy the references to the keys and values.
So the list we're altering in `baked_data` is actually the same list as the one in the original `data`.

Similarly, copying a list with `[:]`, as in `my_list = [[1, 2], 3, 4, 5]; new_list = my_list[:]`, only creates a shallow copy and can lead to similar unexpected effects.

> __How to avoid this issue:__ Use `deepcopy` from the `copy` module, e.g. `baked_data = copy.deepcopy(data)`.

### Integer Division Fail

Here's something that works in Python2, but is inadvisable.

```python
ducks = ["Donald", "Huey", "Dewey", "Louie"]
middle = len(ducks) / 2
print(ducks[middle])
```

As any adventurous and brave pythonista does these days, you upgrade your code to Python3, and suddenly:

```python
# In Python3
ducks = ["Donald", "Huey", "Dewey", "Louie"]
middle = len(ducks) / 2
print(ducks[middle])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
 in
----> 1 print(ducks[middle])

TypeError: list indices must be integers, not float
```

___Why?___ Because in Python2, `/` has different meanings depending on whether you feed in floats or integers. If both left and right side are integers, the result will also be an integer. In Python3, `/` will __always__ produce a float, and of course you can't index a list with floats.

> __How to avoid this issue:__ Use `//` for integer division.

### Closure Fail

This is a real-life example from production code I once wrote and now feel very ashamed of.

```python
def valid_password(pwd):
    return False  # In production, we'd do actual password validation here

def wrong_password_prompts():
    return [lambda pwd: "Password {} incorrect - {} attempts left".format(pwd, 3-i) for i in range(3)]

def get_password():
    for bad_attempt in wrong_password_prompts():
        pwd = input()
        if not valid_password(pwd):
            print(bad_attempt(pwd))
        else:
            return True
    return False
```

Let this sink in for a second. The crucial and most shameful part is `wrong_password_prompts`, where we return a list of three anonymous functions. The first function should return `"Password xyz incorrect - 3 attempts left"` when called with password `"xyz"`. The second function should return `"Password xyz incorrect - 2 attempts left"` and so on. Let's see what happens:

```python
>>> get_password()
xyz
Password xyz incorrect - 1 attempts left
swordfish
Password swordfish incorrect - 1 attempts left
shibboleth
Password shibboleth incorrect - 1 attempts left
```

__Why is there always only one attempt left?__ Because the string we return only gets formatted when we call the anonymous functions. The lambdas don't store the value `i` had when they were created; they look up the variable `i` at call time. By then the loop over `range(3)` is done and `i` is stuck at its last value, `2`, so every prompt reports `3 - 2 = 1` attempts left.

> __How to avoid this issue:__ Bind the current value when the lambda is created, for example with a default argument: `lambda pwd, i=i: ...`.

--------------------------------------------------------------------------------
/_config.yml:
--------------------------------------------------------------------------------
theme: jekyll-theme-cayman
--------------------------------------------------------------------------------