├── Cool-stuff
    ├── Built-ins.md
    ├── Development-tricks.md
    ├── Float-secrets.md
    ├── FunnyEggs.md
    ├── Hypothesis-test.md
    ├── Memory-Secrets.md
    ├── Merging-lists.md
    ├── Python-Tricks.md
    ├── Sql-Tricks.md
    └── Sql-tricks2.md
├── Python
    ├── 01. Variables.md
    ├── 02. If-else.md
    ├── 03. Number-strings.md
    ├── 04. Container-types.md
    ├── 05. Functions.md
    ├── 06. Decorators.md
    └── 07. Exception-handling.md
└── README.md


/Cool-stuff/Built-ins.md:
--------------------------------------------------------------------------------
# Built-in Functions

## 1. set()
**Using sets**
```python
obj = ['a','b','c','b','a']

print(set(obj))
# output: {'b', 'c', 'a'} (element order may vary)
```
**union, intersection and set difference**
```python
A = set('hello')
B = set('world')

A.union(B)
# output: {'d', 'e', 'h', 'l', 'o', 'r', 'w'}

A.intersection(B)
# output: {'l', 'o'}

A.difference(B)
# output: {'e', 'h'}
```

## 2. eval()
**String expression**
```python
a = eval('[1,2,3]')
print(type(a))
# output: <class 'list'>

b = eval('max([2,4,5])')
print(b)
# output: 5
```
**Methods and functions**
```python
def divide_by_two(x):
    return x//2

eval('divide_by_two(8)')
# output: 4
```
## 3. 
sorted()

**Ascending order**
```python
a = sorted([2,4,3,7,1,9])

print(a)
# output: [1, 2, 3, 4, 7, 9]
```
**Reverse**
```python
a = sorted((4,1,9,6), reverse=True)

print(a)
# output: [9, 6, 4, 1]
```
**By length as parameter**
```python
chars = ['apple','papaya','pear','banana']
a = sorted(chars, key=lambda x: len(x))

print(a)
# output: ['pear', 'apple', 'papaya', 'banana']
```
**Custom rules**
```python
tuple_list = [('A', 1,5), ('B', 3,2), ('C', 2,6)]
a = sorted(tuple_list, key=lambda x: x[1])

print(a)
# output: [('A', 1, 5), ('C', 2, 6), ('B', 3, 2)]
```
## 4. reversed()
```python
a = reversed('abcde')
print(list(a))
# output: ['e', 'd', 'c', 'b', 'a']

b = reversed([2,3,4,5])
print(list(b))
# output: [5, 4, 3, 2]
```
## 5. map()
**Uppercase**
```python
chars = ['apple','papaya','pear','banana']
a = map(lambda x: x.upper(), chars)

print(list(a))
# output: ['APPLE', 'PAPAYA', 'PEAR', 'BANANA']
```
**Square each number**
```python
nums = [1,2,3,4]
a = map(lambda x: x*x, nums)

print(list(a))
# output: [1, 4, 9, 16]
```
## 6. bin()
```python
number = 5

print(bin(number))
# output: 0b101
```
## 7. filter()
**Remove even**
```python
nums = [1,2,3,4,5,6]
a = filter(lambda x: x%2 != 0, nums)

print(list(a))
# output: [1, 3, 5]
```
**Words starting with 'b'**
```python
chars = ['apple','papaya','pear','banana']
a = filter(lambda x: x.startswith('b'), chars)

print(list(a))
# output: ['banana']
```
## 8. 
enumerate()

**Enumeration example**
```python
chars = ['apple','papaya','pear','banana']
for i,j in enumerate(chars):
    print(i,j)

'''
output:
0 apple
1 papaya
2 pear
3 banana
'''
```
**Enumerating strings**
```python
a = enumerate('abc')

print(list(a))
# output: [(0, 'a'), (1, 'b'), (2, 'c')]
```
**Using next()**
```python
chars = ['apple', 'papaya', 'pear', 'banana']
enumerate_object = enumerate(chars)

next(enumerate_object)
# output: (0, 'apple')

next(enumerate_object)
# output: (1, 'papaya')
```

--------------------------------------------------------------------------------
/Cool-stuff/Development-tricks.md:
--------------------------------------------------------------------------------
# Development Tricks

## 01. View Source Code
```python
# demo.py
import inspect


def add(x, y):
    return x + y

print("===================")
print(inspect.getsource(add))
```

## 02. View package path

### Slow(pprint)
```python
import sys

from pprint import pprint
pprint(sys.path)
```

### Faster
```python
# In terminal
# >> python3 -m site
```

## 03. Single line loop

### Nested one
```python
list1 = range(1,3)
list2 = range(4,6)
list3 = range(7,9)

for item1 in list1:
    for item2 in list2:
        for item3 in list3:
            print(item1+item2+item3)
```

### Using itertools.product
```python
from itertools import product

list1 = range(1,3)
list2 = range(4,6)
list3 = range(7,9)

for item1,item2,item3 in product(list1, list2, list3):
    print(item1+item2+item3)
```

## 04. 
Use print to write to a log file
```python
>>> with open('test.log', mode='w') as f:
...     print('hello, python', file=f, flush=True)
>>> exit()
```
## 05. Function running time

### Using `time` (lame way)
```python
import time
start = time.time()

# run the function

end = time.time()
print(end-start)
```
### Using `timeit`
```python
import time
import timeit

def run_sleep(second):
    print(second)
    time.sleep(second)

print(timeit.timeit(lambda: run_sleep(2), number=5))
```

## 06. Using `lru_cache`

### `lru_cache`
```python
@functools.lru_cache(maxsize=None, typed=False)
```

### Example as decorator
```python
from functools import lru_cache

@lru_cache(None)
def add(x, y):
    print("calculating: %s + %s" % (x, y))
    return x + y

print(add(1, 2))
print(add(1, 2))
print(add(2, 3))
```

### Example for recursive call (Fibonacci)
```python
import timeit
from functools import lru_cache

@lru_cache(None)
def fib(n):
    if n < 2:
        return n
    return fib(n - 2) + fib(n - 1)

print(timeit.timeit(lambda: fib(500), number=1))
# output: 0.0004921059880871326
```

## 07. Executing code before the program exits
```python
import atexit

@atexit.register
def clean():
    print("Hello")

def main():
    1/0

main()
```
## 08. Turn off exception chaining
```python
try:
    print(1 / 0)
except Exception as exc:
    # "from None" hides the original ZeroDivisionError from the traceback
    raise RuntimeError("bad thing") from None
```
## 09. 
Delayed calls
```python
import contextlib

def callback():
    print('B')

with contextlib.ExitStack() as stack:
    stack.callback(callback)
    print('A')
```
## 10. Stream read files

### Using with...open...(x)
```python
with open("big_file.txt", "r") as fp:
    content = fp.read()
```

### Using readline as generator
```python
def read_from_file(filename):
    with open(filename, "r") as fp:
        # readline() returns "" at end of file, which stops the iteration
        for line in iter(fp.readline, ""):
            yield line
```

### In chunks
```python
def read_from_file(filename, block_size = 1024 * 8):
    with open(filename, "r") as fp:
        while True:
            chunk = fp.read(block_size)
            if not chunk:
                break

            yield chunk
```

### Using `functools.partial` for further optimization
```python
from functools import partial

def read_from_file(filename, block_size = 1024 * 8):
    with open(filename, "r") as fp:
        for chunk in iter(partial(fp.read, block_size), ""):
            yield chunk
```

--------------------------------------------------------------------------------
/Cool-stuff/Float-secrets.md:
--------------------------------------------------------------------------------
# Floating Points

## 1. Not Equal Alignments

```python
>> a = (float('nan'),)
>> b = a

>> a, b
Output: ((nan,), (nan,))

>> type(a), type(b)
Output: (<class 'tuple'>, <class 'tuple'>)

>> a == b
Output: True

>> a is b # (i.e. id(a)==id(b))
Output: True

>> a[0] == b[0]
Output: False
```
#### Parsing Rules

```python
sign ::= "+" | "-"
infinity ::= "Infinity" | "inf"
nan ::= "nan"
numeric_value ::= floatnumber | infinity | nan
numeric_string ::= [sign] numeric_value
```

#### Inf Slice
```python
>> a = (float('inf'),)
>> b = a

>> a # (inf,)
>> b # (inf,)

>> a == b # True
>> a is b # True
>> a[0] == b[0] # True
```

## 2. Abnormal Hash Results
```python
>> a = float('inf')
>> b = float('inf')

>> c = float('nan')
>> d = float('nan')

>> a == b # True
>> c == d # False

>> hash(float('nan')) == hash(float('nan'))
Output: True  # on Python < 3.10, where hash(nan) is always 0; later versions hash NaN by object identity
```

#### Dictionary Hash
```python
>> a = {float('nan'): 1, float('nan'): 2}
>> a # {nan: 1, nan: 2}

>> b = {float('inf'): 1, float('inf'): 2}
>> b # {inf: 2}
```

--------------------------------------------------------------------------------
/Cool-stuff/FunnyEggs.md:
--------------------------------------------------------------------------------
## 1. Hello World
```
>>> import __hello__
Hello World...
```
## 2. April Fool
```
>>> from __future__ import barry_as_FLUFL
>>> 1<>2
True
```

## 3. Convert from C++ to Python
```
>>> from __future__ import braces
SyntaxError: not a chance
```

## 4. Python Zen
```
>>> import this
```

## 5. "Anti-gravity" dialogue
```
>>> import antigravity
```

--------------------------------------------------------------------------------
/Cool-stuff/Hypothesis-test.md:
--------------------------------------------------------------------------------
# Hypothesis Testing

## Z-test

```python
import pandas as pd
iris = pd.read_csv("Iris.csv")
iris.sample(50)
```

```python
from statsmodels.stats.weightstats import ztest

z, pval = ztest(iris['PetalLengthCm'], value=4.2)
print(z, pval)
```

## For graphs

### Creating binomial distribution
```python
from scipy.stats import binom

lst = []
for i in range(11):
    lst.append(round(binom.pmf(k=i, n=10, p=0.5), 5))
```

## Graph 1
```python
import plotly.graph_objects as go

fig = go.Figure(go.Bar(
    y=lst,
    x=[i for i in range(11)]))

fig.update_layout(
    title={
        'text': "Fair Distribution",
        'y': 0.9,
        'x': 0.5,
        'xanchor': 'center',
        'yanchor': 'top'
    },

    xaxis=dict(
        title='Number of Heads',
        tickmode='linear'),

    yaxis=dict(
        title='Probability'))

fig.update_layout({
    'plot_bgcolor': 'rgba(0, 0, 0, 0)'})

fig.update_traces(marker_color='rgb(158,202,225)',
                  marker_line_color='rgb(8,48,107)',
                  marker_line_width=2.5, opacity=0.6)

fig.show()
```

## Graph 2
```python
import plotly.graph_objects as go

fig = go.Figure(go.Bar(
    y=lst,
    x=[i for i in range(11)]))

fig.update_layout(
    title={
        'text': "Fair Distribution",
        'y': 0.9,
        'x': 0.5,
        'xanchor': 'center',
        'yanchor': 'top'},

    xaxis=dict(
        title='Number of Heads',
        tickmode='linear'),

    yaxis=dict(
        title='Probability'))

fig.update_layout({
    'plot_bgcolor': 'rgba(0, 0, 0, 0)'})

fig.update_traces(marker_color='rgb(158,202,225)',
                  marker_line_color='rgb(8,48,107)',
                  marker_line_width=2.5, opacity=0.6)

fig.add_annotation(
    x=8,
    y=-0.005,
    xref="x",
    yref="y",
    text="Heads",
    showarrow=True,
    font=dict(
        family="Open Sans",
        size=13,
        color="black"
    ),
    align="center",
    arrowhead=2,
    arrowsize=1,
    arrowwidth=2,
    arrowcolor="#636363",
    bordercolor="#c7c7c7",
    borderwidth=2,
    borderpad=4,
    ax=0,
    ay=50,
    opacity=0.8
)

fig.show()
```

## Graph 3
```python
colors = ['rgb(158,202,225)'] * 11
colors[8],colors[10],colors[9] = 'darkblue', 'darkblue', 'darkblue'

import plotly.graph_objects as go

fig = go.Figure(go.Bar(
    y=lst,
    x=[i for i in range(11)],
    marker_color=colors))

fig.update_layout(
    title={
        'text': "Fair Distribution",
        'y': 0.9,
        'x': 0.5,
        'xanchor': 'center',
        'yanchor': 'top'},

    xaxis=dict(
        title='Number of Heads',
        tickmode='linear'),

    yaxis=dict(
        title='Probability')
)
fig.update_layout({
    'plot_bgcolor': 'rgba(0, 0, 0, 0)'})

fig.update_traces(marker_line_color='rgb(8,48,107)',
                  marker_line_width=2.5,
                  opacity=0.6)

fig.show()
```

--------------------------------------------------------------------------------
/Cool-stuff/Memory-Secrets.md:
--------------------------------------------------------------------------------
# Memory Allocation Tricks

## Intro trick

```python
import sys

a = [1,2]
b = [a,a]

sys.getsizeof(a) # 72
sys.getsizeof(b) # 72
```

## 1. 
Empty is not empty

```python
import sys

# Sizes are in bytes; exact values depend on the Python version and platform
sys.getsizeof("") # 49
sys.getsizeof([]) # 56
sys.getsizeof(()) # 40
sys.getsizeof(set()) # 216
sys.getsizeof(dict()) # 232

sys.getsizeof(1) # 28
sys.getsizeof(True) # 28
```

## 2. Memory Expansion

```python
import sys
letters = "abcdefghijklmnopqrstuvwxyz"

a = []
for i in letters:
    a.append(i)
    print(f'{len(a)}, sys.getsizeof(a) = {sys.getsizeof(a)}')

b = set()
for j in letters:
    b.add(j)
    print(f'{len(b)}, sys.getsizeof(b) = {sys.getsizeof(b)}')

c = dict()
for k in letters:
    c[k]=k
    print(f'{len(c)}, sys.getsizeof(c) = {sys.getsizeof(c)}')
```

## 3. No memory Release

```python
import sys

a = [1,2,3,4]
sys.getsizeof(a) # 88

a.append(5)
sys.getsizeof(a) # 120

a.pop()
sys.getsizeof(a) # 120
```

## 4. No empty dictionary

```python
import sys

a = [1,2,3]
b = {1,2,3}
c = {'a':1, 'b':2, 'c':3}

sys.getsizeof(a) # 80
sys.getsizeof(b) # 216
sys.getsizeof(c) # 232

a.clear()
b.clear()
c.clear()
```
```python
sys.getsizeof(a) # 56
sys.getsizeof(b) # 216
sys.getsizeof(c) # 64
```

--------------------------------------------------------------------------------
/Cool-stuff/Merging-lists.md:
--------------------------------------------------------------------------------
# Ways to Merge lists
## 1. Addition
```Python
list01 = [1,2,3]
list02 = [4,5,6]
list03 = [7,8,9]

list01 + list02 + list03
# Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
```
## 2. 
Itertools
```Python
from itertools import chain

list01 = [1,2,3]
list02 = [4,5,6]
list03 = [7,8,9]

list(chain(list01, list02, list03))
# Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
```
## 3. * to unpack
```Python
list01 = [1,2,3]
list02 = [4,5,6]

[*list01, *list02]
# Output: [1, 2, 3, 4, 5, 6]
```
## 4. Extend
```Python
list01 = [1,2,3]
list02 = [4,5,6]

list01.extend(list02)
list01

# Output: [1, 2, 3, 4, 5, 6]
```
## 5. List comprehension
```Python
list01 = [1,2,3]
list02 = [4,5,6]
list03 = [7,8,9]

[x for l in (list01, list02, list03) for x in l]
# Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

## 6. Heapq
```Python
list01 = [1,2,3]
list02 = [4,5,6]
list03 = [7,8,9]

# heapq.merge expects each input to be sorted already
from heapq import merge
list(merge(list01, list02, list03))

# Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

## 7. Magic methods
```Python
list01 = [1,2,3]
list02 = [4,5,6]

list01 + list02
# Output: [1, 2, 3, 4, 5, 6]

list01.__add__(list02)
# Output: [1, 2, 3, 4, 5, 6]
```
```python
list01 = [1,2,3]
list02 = [4,5,6]
list03 = [7,8,9]

from functools import reduce
reduce(list.__add__, (list01, list02, list03))

# Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
```
## 8. yield from
```Python
list01 = [1,2,3]
list02 = [4,5,6]
list03 = [7,8,9]

def merge(*lists):
    for l in lists:
        yield from l

list(merge(list01, list02, list03))
# Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

--------------------------------------------------------------------------------
/Cool-stuff/Python-Tricks.md:
--------------------------------------------------------------------------------
# Python Optimization Tricks

## 1. 
Avoid global variables

**Wrong way**
```python
import math
size = 10000
for x in range(size):
    for y in range(size):
        z = math.sqrt(x) + math.sqrt(y)
```
**Correct way**
```python
import math
def main():
    size = 10000
    for x in range(size):
        for y in range(size):
            z = math.sqrt(x) + math.sqrt(y)
main()
```
## 2.1 Avoid access to module and function properties
**Wrong way**
```python
import math
def computeSqrt(size: int):
    result = []
    for i in range(size):
        result.append(math.sqrt(i))
    return result
def main():
    size = 10000
    for _ in range(size):
        result = computeSqrt(size)
main()
```
**Correct Way**
```python
from math import sqrt
def computeSqrt(size: int):
    result = []
    for i in range(size):
        result.append(sqrt(i))
    return result
def main():
    size = 10000
    for _ in range(size):
        result = computeSqrt(size)
main()
```
## 2.2 Avoid intra-class attribute access
**Wrong way**
```python
import math
from typing import List
class DemoClass:
    def __init__(self, value: int):
        self._value = value

    def computeSqrt(self, size: int) -> List[float]:
        result = []
        append = result.append
        sqrt = math.sqrt
        for _ in range(size):
            append(sqrt(self._value))
        return result
def main():
    size = 10000
    for _ in range(size):
        demo_instance = DemoClass(size)
        result = demo_instance.computeSqrt(size)
main()
```
**Correct way**
```python
import math
from typing import List
class DemoClass:
    def __init__(self, value: int):
        self._value = value

    def computeSqrt(self, size: int) -> List[float]:
        result = []
        append = result.append
        sqrt = math.sqrt
        value = self._value
        for _ in range(size):
            append(sqrt(value))
        return result
def main():
    size = 10000
    for _ in range(size):
        demo_instance = DemoClass(size)
        demo_instance.computeSqrt(size)
main()
```
## 3. Avoid unnecessary abstractions
**Wrong way**
```python
class DemoClass:
    def __init__(self, value: int):
        self.value = value
    @property
    def value(self) -> int:
        return self._value
    @value.setter
    def value(self, x: int):
        self._value = x
def main():
    size = 1000000
    for i in range(size):
        demo_instance = DemoClass(size)
        value = demo_instance.value
        demo_instance.value = i
main()
```
**Correct way**
```python
class DemoClass:
    def __init__(self, value: int):
        self.value = value
def main():
    size = 1000000
    for i in range(size):
        demo_instance = DemoClass(size)
        value = demo_instance.value
        demo_instance.value = i
main()
```
## 4.1 Avoid meaningless data copying
**Wrong Way**
```python
def main():
    size = 10000
    for _ in range(size):
        value = range(size)
        value_list = [x for x in value]
        square_list = [x * x for x in value_list]
main()
```
**Correct Way**
```python
def main():
    size = 10000
    for _ in range(size):
        value = range(size)
        square_list = [x * x for x in value]
main()
```
## 4.2 Swap values without an intermediate variable
**Wrong Way**
```python
def main():
    size = 1000000
    for _ in range(size):
        a = 3
        b = 5
        temp = a
        a = b
        b = temp
main()
```
**Correct Way**
```python
def main():
    size = 1000000
    for _ in range(size):
        a = 3
        b = 5
        a, b = b, a
main()
```
## 4.3 Use `join` instead of `+` for string concatenation
**Wrong Way**
```python
import string
from typing import List
def concatString(string_list: List[str]) -> str:
    result = ''
    for str_i in string_list:
        result += str_i
    return result
def main():
    string_list = list(string.ascii_letters * 100)
    for _ in range(10000):
        result = concatString(string_list)
main()
```
**Correct Way**
```python
import string
from typing import List
def concatString(string_list: List[str]) -> str:
    return ''.join(string_list)
def main():
    string_list = list(string.ascii_letters * 100)
    for _ in range(10000):
        result = concatString(string_list)
main()
```
## 5.1 Loop optimization
**Wrong way**
```python
def computeSum(size: int) -> int:
    sum_ = 0
    i = 0
    while i < size:
        sum_ += i
        i += 1
    return sum_
def main():
    size = 10000
    for _ in range(size):
        sum_ = computeSum(size)
main()
```
**Correct way**
```python
def computeSum(size: int) -> int:
    sum_ = 0
    for i in range(size):
        sum_ += i
    return sum_
def main():
    size = 10000
    for _ in range(size):
        sum_ = computeSum(size)
main()
```
## 5.3 Reduce calculations in the inner loop
**Wrong Way**
```python
from math import sqrt
def main():
    size = 10000
    for x in range(size):
        for y in range(size):
            z = sqrt(x) + sqrt(y)
main()
```
**Correct Way**
```python
from math import sqrt
def main():
    size = 10000
    for x in range(size):
        sqrt_x = sqrt(x)  # computed once per outer iteration
        for y in range(size):
            z = sqrt_x + sqrt(y)
main()
```
## 6. 
Use Numba.jit
```python
import numba
@numba.jit
def computeSum(size: int) -> int:
    sum_ = 0
    for i in range(size):
        sum_ += i
    return sum_
def main():
    size = 10000
    for _ in range(size):
        sum_ = computeSum(size)
main()
```

--------------------------------------------------------------------------------
/Cool-stuff/Sql-Tricks.md:
--------------------------------------------------------------------------------
# SQL Optimization Tricks

### 1. Try not to use select * to query SQL, but select specific fields.
**Counter-example**
```SQL
select * from employee;
```
**Positive example**
```SQL
select id,name from employee;
```
### 2. If you know that there is only one query result, it is recommended to use limit 1
**Counter-example**
```SQL
select id,name from employee where name='jay';
```
**Positive example**
```SQL
select id,name from employee where name='jay' limit 1;
```
### 3. Try to avoid using or in the where clause to join conditions
**Counter-example**
```SQL
select * from user where userid = 1 or age = 18;
```
**Positive example**
```SQL
-- Use union all
select * from user where userid = 1
union all
select * from user where age = 18;

-- Or write two separate SQL statements
select * from user where userid = 1;
select * from user where age = 18;
```
### 4. Optimize limit paging
**Counter-example**
```SQL
select id,name,age from employee limit 10000,10;
```
**Positive example**
```SQL
-- Solution 1: filter on the largest id (offset) returned by the previous query
select id,name from employee where id > 10000 limit 10;

-- Solution 2: order by + index
select id,name from employee order by id limit 10000,10;
```
### 5. 
Optimize your like statement

**Counter-example**
```SQL
select userId,name from user where userId like '%123';
```
**Positive example**
```SQL
select userId,name from user where userId like '123%';
```
### 6. Use where conditions to limit the data to be queried to avoid returning extra rows
**Counter-example**
```java
List userIds = sqlMap.queryList("select userId from user where isVip=1");
boolean isVip = userIds.contains(userId);
```
**Positive example**
```java
Long userId = sqlMap.queryObject("select userId from user where userId='userId' and isVip='1' ");
boolean isVip = userId != null;
```
### 7. Avoid the != or <> operator in the where clause where possible; otherwise the engine may give up using the index and perform a full table scan
**Counter-example**
```SQL
select age,name from user where age <> 18;
```
**Positive example**
```SQL
-- Consider writing two separate queries

select age,name from user where age < 18;
select age,name from user where age > 18;
```
### 8. If you insert too much data, consider bulk insertion
**Counter-example**
```SQL
for(User u :list){
    INSERT into user(name,age) values(#name#,#age#)
}
```
**Positive example**
```SQL
-- One batch of 500 inserts, carried out in batches

insert into user(name,age) values

(#{item.name},#{item.age})

```
### 9. Use the distinct keyword with caution
**Counter-example**
```SQL
SELECT DISTINCT * from user;
```
**Positive example**
```SQL
select DISTINCT name from user;
```
### 10. 
Remove redundant and duplicate indexes

**Counter-example**
```SQL
KEY `idx_userId` (`userId`)
KEY `idx_userId_age` (`userId`,`age`)
```
**Positive example**
```SQL
-- Delete the userId index, because the combined index (A, B)
-- is equivalent to creating the (A) and (A, B) indexes

KEY `idx_userId_age` (`userId`,`age`)
```
### 11. If the amount of data is large, optimize your modify/delete statement
**Counter-example**
```SQL
-- Delete 100,000 or 1 million+ at a time?
delete from user where id < 100000;

-- Or use single cycle operation, low efficiency and long time
for(User user:list){
    delete from user; }
```
**Positive example**
```SQL
-- Delete in batches, such as 500 each time

delete from user where id < 500;
delete from user where id >= 500 and id < 1000;
```
### 12. Consider using default values instead of null in the where clause
**Counter-example**
```SQL
select * from user where age is not null;
```
**Positive example**
```SQL
select * from user where age > 0; -- Set 0 as default
```
### 13. Try to replace union with union all
**Counter-example**
```SQL
select * from user where userid = 1
union
select * from user where age = 10;
```
**Positive example**
```SQL
select * from user where userid = 1
union all
select * from user where age = 10;
```
### 14. Use numeric fields as much as possible. If the fields only contain numeric information, try not to design them as a character type.
**Counter-example**
```SQL
`king_id` varchar(20) NOT NULL
```
**Positive example**
```SQL
`king_id` int(11) NOT NULL;
```
### 15. 
Use varchar/nvarchar instead of char/nchar whenever possible

**Counter-example**
```SQL
`deptName` char(100) DEFAULT NULL;
```
**Positive example**
```SQL
`deptName` varchar(100) DEFAULT NULL;
```
### 16. Use explain to analyze your SQL plan
```SQL
explain select * from user where userid = 10086 or age = 18;
```

--------------------------------------------------------------------------------
/Cool-stuff/Sql-tricks2.md:
--------------------------------------------------------------------------------
# SQL Optimization Tricks Part 2

### 1. Get advice from PROCEDURE ANALYSE()

Note: `PROCEDURE ANALYSE()` was removed in MySQL 8.0, so this applies to earlier versions only.

```SQL
SELECT … FROM … WHERE … PROCEDURE ANALYSE([max_elements,[max_memory]])

-- Example
SELECT col1, col2 FROM table1 PROCEDURE ANALYSE(10, 2000);
```

### 2. Always set an ID for each table

**Counter-example**
```SQL
CREATE TABLE subs (
    email varchar(20) NOT NULL,
    name varchar(20)
);
```
**Positive example**
```SQL
CREATE TABLE subs (
    id int(5) NOT NULL AUTO_INCREMENT,
    email varchar(20) NOT NULL,
    name varchar(20),
    PRIMARY KEY (id) -- an AUTO_INCREMENT column must be defined as a key
);
```
### 3. Use ENUM instead of VARCHAR

**Counter-example**
```SQL
CREATE TABLE Persons (
    PersonID int,
    Status varchar(25)
);
```

**Positive example**
```SQL
CREATE TABLE Persons (
    PersonID int,
    Status enum('Married', 'Single') NOT NULL
);
```

### 4. Optimize your query by caching

Note: the query cache was removed in MySQL 8.0, so these settings apply to earlier versions only.

```
sudo nano /etc/mysql/my.cnf
```

```
/etc/mysql/my.cnf
...
[mysqld]
query_cache_type=1
query_cache_size = 10M
query_cache_limit=256K
```

### 6. Make use of Prepared Statements

```
* PREPARE – prepare a statement for execution.
* EXECUTE – execute a prepared statement prepared by the PREPARE statement.
* DEALLOCATE PREPARE – release a prepared statement.
```

```SQL
-- 1. Prepare
PREPARE item1 FROM
'SELECT itemcode, itemname
FROM items
WHERE itemcode = ?';

-- ic stands for itemcode
SET @ic = 'i012';

-- 2. Execute
EXECUTE item1 USING @ic;

-- 3. Deallocate
DEALLOCATE PREPARE item1;
```

### 7. Use the alias of the table and prefix the alias on each column, so that the semantics are more clear

**Counter-example**
```SQL
select * from A
inner join B
on A.deptId = B.deptId;
```

**Positive example**
```SQL
select member.name, deptment.deptName from A member
inner join B deptment
on member.deptId = deptment.deptId;
```

### 8. If the field type is a string, it must be enclosed in quotation marks

**Counter-example**
```SQL
select * from user where userid = 123;
```

**Positive example**
```SQL
select * from user where userid = '123';
```

### 9. When using a joint index, pay attention to the order of the index columns, generally following the left-most matching principle

**Counter-example**
```SQL
select * from user where age = 10;
```

**Positive example**
```SQL
-- Complies with the left-most matching principle
select * from user where userid = 10 and age = 10;

-- Complies with the left-most matching principle
select * from user where userid = 10;
```


### 10. 
Prefer inner joins; if a left join is used, keep the result of the left table as small as possible 133 | 134 | **Counter-example** 135 | ```SQL 136 | select * from 137 | table1 t1 left join table2 t2 138 | on t1.size = t2.size 139 | where t1.id>2; 140 | ``` 141 | 142 | **Positive example** 143 | ```SQL 144 | select * from 145 | (select * from table1 where id >2) 146 | t1 left join table2 t2 147 | on t1.size = t2.size; 148 | ``` 149 | 150 | ### 12. Save the IP address as UNSIGNED INT 151 | 152 | **Counter-example** 153 | ```SQL 154 | CREATE TABLE classes ( 155 | id INT AUTO_INCREMENT PRIMARY KEY, 156 | ipadd VARCHAR(15) NOT NULL 157 | ); 158 | ``` 159 | 160 | **Positive example** 161 | ```SQL 162 | CREATE TABLE classes ( 163 | id INT AUTO_INCREMENT PRIMARY KEY, 164 | ipadd INT(4) UNSIGNED NOT NULL 165 | ); 166 | ``` 167 | -------------------------------------------------------------------------------- /Python/01. Variables.md: -------------------------------------------------------------------------------- 1 | # Variables Guide 2 | 3 | ## Variables and code quality 4 | 5 | As the first article in the "Python Craftsman" series, I want to talk about variables first, because how to define and use variables has always been one of the first skills to learn in any programming language. 6 | 7 | Whether variables are used well or not matters a great deal to the quality of the code. Among the many questions about variables, giving them a good name is especially important. 8 | 9 | ## How to name variables 10 | 11 | In the field of computer science, there is a famous (and playful) saying: 12 | 13 | > There are only two hard things in Computer Science: cache invalidation and naming things. 14 | > 15 | > 16 | > - Phil Karlton 17 | 18 | Needless to say, the difficulty of the first, the "cache invalidation problem", will be understood by anyone who has used a cache.
As for the difficulty of the second, "naming things", I also understand it deeply. One of the darkest afternoons of my career was spent sitting in front of the monitor, scratching my head to come up with a suitable name for a new project. 19 | 20 | The things we name most often when programming are variables. It is important to give a variable a good name, **because good variable naming can greatly improve the overall readability of the code.** 21 | 22 | The following points are the basic principles I have summarized for naming variables. 23 | 24 | ### 1. Variable names must be descriptive, not too broad 25 | 26 | Within an **acceptable length**, a variable name should describe the content it points to as accurately as possible. So, try not to use overly broad words as your variable names: 27 | 28 | - **BAD**: `day`, `host`, `cards`, `temp` 29 | - **GOOD**: `day_of_week`, `hosts_to_reboot`, `expired_cards` 30 | 31 | ### 2. Variable names should let people guess the type 32 | 33 | Everyone who learns Python knows that Python is a dynamically typed language, and it (at least before [PEP 484](https://www.python.org/dev/peps/pep-0484/) appeared) has no variable type declarations. So when you see a variable, you can't easily know what type it is except by guessing from the context. 34 | 35 | However, people usually have some intuitive conventions about the relationship between variable names and variable types. I summarize them below. 36 | 37 | #### "What kind of name will be treated as bool?" 38 | 39 | The defining characteristic of a Boolean variable is that it has only two possible values: **『Yes』** or **『No』**. Therefore, variable names qualified with black-and-white words such as `is` and `has` are a good choice. The principle is: **make people who read the variable name think that this variable can only have two values, "yes" or "no"**.
40 | 41 | Here are a few good examples: 42 | 43 | - `is_superuser`: "Whether it is a superuser", there will only be two values: yes/no 44 | - `has_error`: "Whether there is an error", there will only be two values: yes/no 45 | - `allow_vip`: "Whether VIP is allowed", there will only be two values: allowed/not allowed 46 | - `use_msgpack`: "Whether to use msgpack", there will only be two values: use/not use 47 | - `debug`: "Whether to enable debug mode", regarded as bool mainly because of convention 48 | 49 | #### What kind of names will be treated as int/float? 50 | 51 | When people see names related to numbers, they assume they are of type int/float. The following are the more common ones: 52 | 53 | - All words interpreted as numbers, such as: `port (port number)`, `age (age)`, `radius (radius)`, etc. 54 | - Words ending in _id, for example: `user_id`, `host_id` 55 | - Words starting or ending with length/count, for example: `length_of_username`, `max_length`, `users_count` 56 | 57 | **Note**: Do not use a plain plural noun to name an int variable, such as `apples` or `trips`; it is better to use `number_of_apples` or `trips_count` instead. 58 | 59 | #### Other types 60 | 61 | For complex types such as str, list, tuple, and dict, it is difficult to have a unified rule that lets us guess the variable type from the name. For example, `headers` may be either a list of header information or a dict containing header information. 62 | 63 | For variables of these types, the most recommended approach is to write well-formed documentation: in the docstrings of functions and methods, use the Sphinx format ([the documentation tool used by the official Python documentation](http://www.sphinx-doc.org/en/stable/)) to annotate the types of all variables. 64 | 65 | ### 3. 
Proper use of "Hungarian Nomenclature" 66 | 67 | The first time I knew [Hungarian Nomenclature](https://en.wikipedia.org/wiki/Hungarian_notation) was in [Joel on Software's blog post](http://www.joelonsoftware.com/articles/Wrong.html). In short, the Hungarian nomenclature is to abbreviate the "type" of the variable and put it at the front of the variable name. 68 | 69 | The key is that the variable "type" mentioned here does not refer to the traditional int/str/list type, but refers to those types that are related to the business logic of your code. 70 | 71 | For example, there are two variables in your code: `students` and `teachers`, and they all point to a list containing Person objects. After using "Hungarian Nomenclature", these two names can be rewritten as follows: 72 | 73 | students -> `pl_students` 74 | teachers -> `pl_teachers` 75 | 76 | Where pl is the acronym for **person list**. When the variable name is prefixed, if you see the variable that starts with `pl_`, you can know the type of value it points to. 77 | 78 | In many cases, using the "Hungarian nomenclature" is a good idea, because it can improve the readability of your code, especially when those variables are many and the same type appears multiple times. Just be careful not to abuse it. 79 | 80 | ### 4. Variable names should be as short as possible, but never too short 81 | 82 | Earlier, we mentioned that variable names should be descriptive. If you do not put any restrictions on this principle, then you are likely to write such a highly descriptive variable name: `how_much_points_need_for_level2`. If the code is full of such long variable names, it is a disaster for code readability. 83 | 84 | A good variable name, the length should be controlled at **two to three words**. For example, the above name can be abbreviated as `points_level2`. 
85 | 86 | **In most cases, you should avoid short names with only one or two letters**, such as the three musketeers of array indexing, `i`, `j`, and `k`. Replacing them with names that have a clear meaning, such as `person_index`, is always better. 87 | 88 | #### Exceptions using short names 89 | 90 | Sometimes there are exceptions to the above principle. When a variable name with a clear meaning but considerable length appears repeatedly, it is perfectly fine to use a short abbreviation to keep the code concise. But to reduce the cost of understanding, it is best not to use too many such short names in the same piece of code. 91 | 92 | For example, when importing modules in Python, short names are often used as aliases. The `gettext` function commonly used for Django i18n translation is usually abbreviated to `_` *(from django.utils.translation import ugettext as _)*. 93 | 94 | ### 5. Other considerations 95 | 96 | Some other considerations for naming variables: 97 | 98 | - Do not use overly similar variable names in the same piece of code, such as the sequence `users`, `users1`, `user3` 99 | - Do not use variable names with negative meanings; use `is_special` instead of `is_not_normal` 100 | 101 | ## Better use of variables 102 | 103 | Earlier I talked about how to give a variable a good name. Now let's talk about some small details to pay attention to when using variables day to day. 104 | 105 | ### 1. Maintain consistency 106 | 107 | If you call the picture variable `photo` in one method, don’t change it to `image` elsewhere. This will only confuse readers of the code: "Are `image` and `photo` really the same thing?" 108 | 109 | In addition, although Python is a dynamically typed language, that does not mean you can use the same variable name to hold a str for a while and then rebind it to a list. **The variable types referred to by the same variable name also need to be consistent. 
** 110 | 111 | ### 2. Try not to use globals()/locals() 112 | 113 | Maybe you have just discovered the built-in functions globals()/locals(), got very excited, and can’t wait to write the following extremely “simple” code: 114 | 115 | ```python 116 | def render_trip_page(request, user_id, trip_id): 117 | user = User.objects.get(id=user_id) 118 | trip = get_object_or_404(Trip, pk=trip_id) 119 | is_suggested = is_suggested(user, trip) 120 | # Using locals() to save three lines of code, I am a genius! 121 | return render(request,'trip.html', locals()) 122 | ``` 123 | 124 | Don't do this. It will only make people who read this code (including yourself three months from now) hate you, because they need to remember every variable defined in this function (imagine what happens when the function grows to two hundred lines), not to mention that locals() will also pass along some unnecessary variables. 125 | 126 | Besides, [The Zen of Python](https://www.python.org/dev/peps/pep-0020/) says it clearly: **Explicit is better than implicit.** So, let's honestly write the code like this: 127 | 128 | ```python 129 | return render(request,'trip.html', { 130 | 'user': user, 131 | 'trip': trip, 132 | 'is_suggested': is_suggested 133 | }) 134 | ``` 135 | 136 | ### 3. Define variables as close to their use as possible 137 | 138 | This principle is a commonplace one. Many people (including me) have a habit when they first start learning programming: writing all the variable definitions together at the top of the function or method. 139 | 140 | ```python 141 | def generate_trip_png(trip): 142 | path = [] 143 | markers = [] 144 | photo_markers = [] 145 | text_markers = [] 146 | marker_count = 0 147 | point_count = 0 148 | ... 149 | ``` 150 | 151 | Doing so only makes your code "look neat"; it does not improve its readability. 152 | 153 | A better approach is to **define variables as close to where they are used as possible**.
So when you read the code, you can better understand the logic of the code, rather than trying to think about what and where this variable is defined? 154 | 155 | ### 4. Reasonably use namedtuple/dict to make the function return multiple values 156 | 157 | Python functions can return multiple values: 158 | 159 | ```python 160 | def latlon_to_address(lat, lon): 161 | return country, province, city 162 | 163 | # Use multiple return values to unpack and define multiple variables at once 164 | country, province, city = latlon_to_address(lat, lon) 165 | ``` 166 | 167 | However, this usage will create a small problem: what if the `latlon_to_address` function needs to return "District" on a certain day? 168 | 169 | If it is written as above, you need to find all the places where `latlon_to_address` is called, and make up the extra variable, otherwise *ValueError: too many values ​​to unpack* will find you: 170 | 171 | ```python 172 | country, province, city, district = latlon_to_address(lat, lon) 173 | 174 | # Or use _ to ignore the extra return value 175 | country, province, city, _ = latlon_to_address(lat, lon) 176 | ``` 177 | 178 | For this potentially variable multi-return value function, it is more convenient to use namedtuple/dict. When you add a return value, it will not have any destructive effect on the previous function call: 179 | 180 | ```python 181 | # 1. Use dict 182 | def latlon_to_address(lat, lon): 183 | return { 184 | 'country': country, 185 | 'province': province, 186 | 'city': city 187 | } 188 | 189 | addr_dict = latlon_to_address(lat, lon) 190 | 191 | # 2. 
Use namedtuple 192 | from collections import namedtuple 193 | 194 | Address = namedtuple("Address", ['country','province','city']) 195 | 196 | def latlon_to_address(lat, lon): 197 | return Address( 198 | country=country, 199 | province=province, 200 | city=city 201 | ) 202 | 203 | addr = latlon_to_address(lat, lon) 204 | ``` 205 | 206 | However, this has a cost: the code becomes more tolerant of change, but you give up unpacking the result into multiple variables at once with `x, y = f()`. The choice is yours. 207 | 208 | ### 5. Control the number of variables in a single function 209 | 210 | The capacity of the human brain is limited. Studies have shown that human short-term memory can hold no more than ten names at a time. So, when one of your functions is too long (in general, a function longer than one screen is considered a bit too long) and contains too many variables, split it into several smaller functions promptly. 211 | 212 | ### 6. Delete those useless variables in time 213 | 214 | This principle is very simple and easy to follow. But if it is not followed, it deals a devastating blow to the quality of your code and makes people who read it feel fooled. 215 | 216 | ```python 217 | def fancy_func(): 218 | # Reader Psychology: Well, a fancy_vars is defined here 219 | fancy_vars = get_fancy() 220 | # ... (after a lot of code) 221 | 222 | # Reader Psychology: Is this the end? Where did the previous fancy_vars go? Was it eaten by the cat? 223 | return result 224 | ``` 225 | 226 | Therefore, please turn on your IDE's inspection hints and promptly clean up variables that are defined but never used. 227 | 228 | ### 7. 
Define temporary variables to improve readability 229 | 230 | Sometimes, complex expressions appear in our code, like this: 231 | 232 | ```python 233 | # Distribute 10,000 gold coins for all active users whose gender is female or whose level is greater than 3 234 | if user.is_active and (user.sex == 'female' or user.level > 3): 235 | user.add_coins(10000) 236 | return 237 | ``` 238 | 239 | See the long expression after `if`? It's a little hard to read, right? But if we assign it to a temporary variable, 240 | we give readers a mental buffer and improve readability: 241 | 242 | ```python 243 | # Distribute 10,000 gold coins for all active users whose gender is female or whose level is greater than 3 244 | user_is_eligible = user.is_active and (user.sex == 'female' or user.level > 3) 245 | 246 | if user_is_eligible: 247 | user.add_coins(10000) 248 | return 249 | ``` 250 | 251 | Defining temporary variables can improve readability. But sometimes, assigning unnecessary things to temporary variables makes the code look verbose: 252 | 253 | ```python 254 | def get_best_trip_by_user_id(user_id): 255 | 256 | # Psychological activity: "Well, this value may be modified/reused in the future", let us first define it as a variable! 257 | user = get_user(user_id) 258 | trip = get_best_trip(user_id) 259 | result = { 260 | 'user': user, 261 | 'trip': trip 262 | } 263 | return result 264 | ``` 265 | 266 | In fact, the "future" you have in mind will never come. The three temporary variables in this code can be removed completely, leaving this: 267 | 268 | ```python 269 | def get_best_trip_by_user_id(user_id): 270 | return { 271 | 'user': get_user(user_id), 272 | 'trip': get_best_trip(user_id) 273 | } 274 | ``` 275 | 276 | There is no need to sacrifice the current readability of the code for changes that might happen. If a variable is needed in the future, add it then.
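
The eligibility check from this section can be tried out with a minimal, self-contained sketch. The `User` dataclass and the `reward_if_eligible` function below are hypothetical stand-ins invented for illustration; the article itself never defines the user model.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical stand-in for the user model assumed by the article."""
    is_active: bool
    sex: str
    level: int
    coins: int = 0

    def add_coins(self, amount):
        self.coins += amount


def reward_if_eligible(user, amount=10000):
    """Give coins to active users who are female or above level 3."""
    # The temporary variable names the rule, so the `if` below reads like a sentence.
    user_is_eligible = user.is_active and (user.sex == 'female' or user.level > 3)

    if user_is_eligible:
        user.add_coins(amount)
    return user_is_eligible
```

Returning the flag is a choice made only for this sketch, so that callers and tests can see which branch was taken; the point being illustrated is just the named intermediate value.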
277 | -------------------------------------------------------------------------------- /Python/02. If-else.md: -------------------------------------------------------------------------------- 1 | # If-Else Guide 2 | 3 | ## Preface 4 | 5 | Writing conditional branch code is an integral part of the coding process. 6 | 7 | If we use roads as a metaphor, the code in the real world is never a straight highway, but more like a map of an urban area composed of countless forks. Our coder is like a driver, we need to tell our program whether we need to turn left or right at the next intersection. 8 | 9 | Writing good conditional branch code is very important, because bad, complicated branch handling is very confusing, which reduces the quality of the code. Therefore, this article will focus on the points that should be noted when writing branch code in Python. 10 | 11 | ### Branch code in Python 12 | 13 | Python supports the most common `if/else` conditional branch statement, but it lacks the `switch/case` statement common in other programming languages. 14 | 15 | In addition, Python also provides the else branch for `for/while` loops and `try/except` statements. In some special scenarios, they can show their talents. 16 | 17 | Below I will talk about how to write excellent conditional branch code from three aspects: `best practice`, `common tips` and `common traps`. 18 | 19 | ## Best Practices 20 | 21 | ### 1. Avoid nesting multiple branches 22 | 23 | If this article can only be cut down to a sentence and then ended, then that sentence must be **"We must do our best to avoid branch nesting"**. 24 | 25 | Too deep branch nesting is one of the easiest mistakes for many novice programmers. If a novice JavaScript programmer wrote a lot of branches and nesting, then you might see layers of braces: `if {if {if {... }}}`. Commonly known as **Nested If Statement Hell**. 
26 | 27 | But because Python uses indentation instead of `{}`, nested branches that are too deep can have more serious consequences than other languages. For example, too many indentation levels can easily cause the code to exceed the word limit per line specified in [PEP8](https://www.python.org/dev/peps/pep-0008/). Let's look at this code: 28 | 29 | ```Python 30 | def buy_fruit(nerd, store): 31 | """ 32 | - Go to the fruit shop to buy apples 33 | - First check if the store is open 34 | - If you have an apple, buy one 35 | - If you don’t have enough money, go home and get the money again 36 | """ 37 | 38 | if store.is_open(): 39 | if store.has_stocks("apple"): 40 | if nerd.can_afford(store.price("apple", amount=1)): 41 | nerd.buy(store, "apple", amount=1) 42 | return 43 | else: 44 | nerd.go_home_and_get_money() 45 | return buy_fruit(nerd, store) 46 | else: 47 | raise MadAtNoFruit("no apple in store!") 48 | else: 49 | raise MadAtNoFruit("store is closed!") 50 | ``` 51 | 52 | The biggest problem with the above code is that the original conditional branch requirements are too directly translated, resulting in just a dozen lines of code containing three levels of nested branches. 53 | 54 | Such code is poorly readable and maintainable. But we can use a very simple trick: **"end early"** to optimize this code: 55 | 56 | ```python 57 | def buy_fruit(nerd, store): 58 | if not store.is_open(): 59 | raise MadAtNoFruit("store is closed!") 60 | 61 | if not store.has_stocks("apple"): 62 | raise MadAtNoFruit("no apple in store!") 63 | 64 | if nerd.can_afford(store.price("apple", amount=1)): 65 | nerd.buy(store, "apple", amount=1) 66 | return 67 | else: 68 | nerd.go_home_and_get_money() 69 | return buy_fruit(nerd, store) 70 | ``` 71 | 72 | **"End early"** means: Use `return` or `raise` statements in the function to end the function in the branch in advance. 
For example, in the new `buy_fruit` function, when a branch condition is not met, we directly throw an exception and end this code branch. Such code has no nested branches and is more straightforward and easier to read. 73 | 74 | ### 2. Encapsulate logic judgments that are too complex 75 | 76 | If the expression in a conditional branch is too complicated and contains too many `not/and/or`, the readability of the code drops sharply, as in the following code: 77 | 78 | ```python 79 | # If the activity is still open and has more than 10 remaining places, 80 | # give 10,000 coins to all active users who are female or whose level is greater than 3 81 | if activity.is_active and activity.remaining > 10 and \ 82 | user.is_active and (user.sex == 'female' or user.level > 3): 83 | user.add_coins(10000) 84 | return 85 | ``` 86 | 87 | For such code, we can consider encapsulating the specific branch logic into a function or method to simplify the code: 88 | 89 | ```python 90 | if activity.allow_new_user() and user.match_activity_condition(): 91 | user.add_coins(10000) 92 | return 93 | ``` 94 | 95 | In fact, after rewriting the code, the earlier comment can be removed, **because the code is now self-explanatory.** As for the specific question of **what kind of users meet the activity's conditions?**, it should be answered by the `match_activity_condition()` method itself. 96 | 97 | > **Hint:** Proper encapsulation not only directly improves the readability of the code. If the above activity judgment logic appears more than once in the code, encapsulation is even more necessary; otherwise the repeated code will badly damage the maintainability of this logic. 98 | 99 | ### 3. 
Pay attention to the repeated code under different branches 100 | 101 | Repeated code is the natural enemy of code quality, and conditional branch statements can easily become the hardest hit by repeated code. Therefore, when we write conditional branch statements, we need to pay special attention not to produce unnecessary repetitive code. 102 | 103 | Let's take a look at this example: 104 | 105 | ```python 106 | # For new users, create a new user profile, otherwise update the old profile 107 | if user.no_profile_exists: 108 | create_user_profile( 109 | username=user.username, 110 | email=user.email, 111 | age=user.age, 112 | address=user.address, 113 | # For new users, set the user's points to 0 114 | points=0, 115 | created=now(), 116 | ) 117 | else: 118 | update_user_profile( 119 | username=user.username, 120 | email=user.email, 121 | age=user.age, 122 | address=user.address, 123 | updated=now(), 124 | ) 125 | ``` 126 | 127 | In the above code, we can see at a glance that under different branches, the program calls different functions and does different things. However, because of the existence of repetitive code, it is difficult for us to easily distinguish the difference between the two. 128 | 129 | In fact, thanks to the dynamic nature of Python, we can simply rewrite the above code, so that the readability can be significantly improved: 130 | 131 | ```python 132 | if user.no_profile_exists: 133 | profile_func = create_user_profile 134 | extra_args = {'points': 0,'created': now()} 135 | else: 136 | profile_func = update_user_profile 137 | extra_args = {'updated': now()} 138 | 139 | profile_func( 140 | username=user.username, 141 | email=user.email, 142 | age=user.age, 143 | address=user.address, 144 | **extra_args 145 | ) 146 | ``` 147 | 148 | When you write branch code, please pay attention to **repetitive code blocks produced by branches**, if you can simply eliminate them, then don’t hesitate. 149 | 150 | ### 4. 
Use ternary expressions with caution 151 | 152 | The ternary expression is a syntax supported only since Python 2.5. Before that, the Python community considered ternary expressions unnecessary, and we had to simulate them with `x and a or b`. 153 | 154 | The fact is that in many cases, code that uses ordinary `if/else` statements is indeed more readable. Blindly pursuing ternary expressions can easily entice you to write complex, poorly readable code. 155 | 156 | So, remember to use simple ternary expressions only to handle simple logical branches. 157 | 158 | ```python 159 | language = "python" if you.favor("dynamic") else "golang" 160 | ``` 161 | 162 | For the vast majority of cases, use ordinary `if/else` statements. 163 | 164 | ## Common Tips 165 | 166 | ### 1. Use "De Morgan's Law" 167 | 168 | When doing branch judgments, we sometimes write code like this: 169 | 170 | ```python 171 | # If the user is not logged in or the user is not using chrome, refuse to provide services 172 | if not user.has_logged_in or not user.is_from_chrome: 173 | return "our service is only available for chrome logged in user" 174 | ``` 175 | 176 | When you first see the code, do you need to think about it for a while to understand what it wants to do? This is because two `not` and one `or` appear in the above logical expression, and we humans are just not good at handling too many "negation" and "or" relations at once. 177 | 178 | This is where [De Morgan's Law](https://en.wikipedia.org/wiki/De_Morgan%27s_laws) comes into play. In layman's terms, De Morgan's law says that `not A or not B` is equivalent to `not (A and B)`. With this conversion, the above code can be rewritten as follows: 179 | 180 | ```python 181 | if not (user.has_logged_in and user.is_from_chrome): 182 | return "our service is only available for chrome logged in user" 183 | ``` 184 | 185 | Isn't the code a lot easier to read now?
Remember De Morgan's Law, which is very useful for simplifying the logic of code in conditional branches. 186 | 187 | ### 2. "Boolean true and false" of custom objects 188 | 189 | We often say that in Python, "everything is an object". In fact, not only is everything an object, we can also use many magic methods (called in the document: [user-defined method](https://docs.python.org/3/reference/datamodel.html)) to customize various behaviors of objects. This lets us influence how code executes in magical ways that are not possible in many other languages. 190 | 191 | For example, all objects in Python have their own "Boolean true and false": 192 | 193 | - Objects with false boolean values: `None`, `0`, `False`, `[]`, `()`, `{}`, `set()`, `frozenset()`, ... 194 | - Objects with true boolean values: non-zero numbers, `True`, non-empty sequences and tuples, ordinary user class instances, ... 195 | 196 | Through the built-in function `bool()`, you can easily check whether an object is boolean true or false. Python uses this value when making conditional branch judgments: 197 | 198 | ```python 199 | >>> bool(object()) 200 | True 201 | ``` 202 | 203 | Here comes the key point: although instances of user classes are true by default, Python provides a way to change this behavior: **the `__bool__` magic method of custom classes** *(`__nonzero__` in Python 2.X)*. When a class defines the `__bool__` method, its return value is treated as the boolean value of the class instance. 204 | 205 | In addition, `__bool__` is not the only way to affect the boolean truth of an instance. If the class does not define the `__bool__` method, Python will also try to call the `__len__` method *(that is, what is called when you apply the `len` function to any sequence object)*, and judge whether the instance is true or false based on whether the result is `0`. 206 | 207 | So what's the use of this feature?
Look at the following code: 208 | 209 | ```python 210 | class UserCollection(object): 211 | 212 | def __init__(self, users): 213 | self._users = users 214 | 215 | 216 | users = UserCollection([piglei, raymond]) 217 | 218 | if len(users._users)> 0: 219 | print("There's some users in collection!") 220 | ``` 221 | 222 | In the above code, the length of `users._users` is used to determine whether there is content in `UserCollection`. In fact, by adding `__len__` magic method to `UserCollection`, the above branch can be made simpler: 223 | 224 | ```python 225 | class UserCollection: 226 | 227 | def __init__(self, users): 228 | self._users = users 229 | 230 | def __len__(self): 231 | return len(self._users) 232 | 233 | 234 | users = UserCollection([piglei, raymond]) 235 | 236 | # After defining the __len__ method, the UserCollection object itself can be used for Boolean judgment 237 | if users: 238 | print("There's some users in collection!") 239 | ``` 240 | 241 | By defining the magic methods `__len__` and `__bool__`, we can let the class control the boolean true and false values ​​we want to show, and make the code more pythonic. 242 | 243 | ### 3. Use all() / any() in conditional judgment 244 | 245 | The two functions `all()` and `any()` are very suitable for use in conditional judgment. 
These two functions accept an iterable object and return a Boolean value, where: 246 | 247 | - `all(seq)`: return `True` only if all objects in `seq` are truthy, otherwise return `False` 248 | - `any(seq)`: return `True` as long as any object in `seq` is truthy, otherwise return `False` 249 | 250 | Suppose we have the following code: 251 | 252 | ```python 253 | def all_numbers_gt_10(numbers): 254 | """ returns True only if all numbers in the sequence are greater than 10 255 | """ 256 | if not numbers: 257 | return False 258 | 259 | for n in numbers: 260 | if n <= 10: 261 | return False 262 | return True 263 | ``` 264 | 265 | If you use the `all()` built-in function, combined with a simple generator expression, the above code can be written like this: 266 | 267 | ```python 268 | def all_numbers_gt_10_2(numbers): 269 | return bool(numbers) and all(n > 10 for n in numbers) 270 | ``` 271 | 272 | Simple, efficient, and just as usable. 273 | 274 | ### 4. Use the else branch of try/while/for 275 | 276 | Let's look at this function: 277 | 278 | ```python 279 | def do_stuff(): 280 | first_thing_successed = False 281 | try: 282 | do_the_first_thing() 283 | first_thing_successed = True 284 | except Exception as e: 285 | print("Error while calling do_the_first_thing") 286 | return 287 | 288 | # Only when first_thing completes successfully, do the second thing 289 | if first_thing_successed: 290 | return do_the_second_thing() 291 | ``` 292 | 293 | In the function `do_stuff`, we want to make the second function call only after `do_the_first_thing()` has been called successfully *(that is, no exception is thrown)*. To do this, we need to define an additional variable, `first_thing_successed`, as a marker.
294 | 295 | In fact, we can achieve the same effect with a simpler method: 296 | 297 | ```python 298 | def do_stuff(): 299 | try: 300 | do_the_first_thing() 301 | except Exception as e: 302 | print("Error while calling do_the_first_thing") 303 | return 304 | else: 305 | return do_the_second_thing() 306 | ``` 307 | 308 | After appending an `else` branch to the end of the `try` statement block, `do_the_second_thing()` under that branch is executed only after **all statements under `try` have completed normally (that is, no exception, and no `return`, `break`, etc.)**. 309 | 310 | Similarly, the `for/while` loops in Python also support an `else` branch. The code in that branch is executed only when the iterable used by the `for` loop is exhausted normally, or the condition of the `while` loop becomes false, in other words when the loop was not left via `break`. 311 | 312 | ## Common pitfalls 313 | 314 | ### 1. Comparison with None 315 | 316 | In Python, there are two ways to compare variables: `==` and `is`, which are fundamentally different in meaning: 317 | 318 | - `==`: indicates whether the **values** the two refer to are equal 319 | - `is`: indicates whether the two refer to the same object in memory, that is, whether id(x) equals id(y) 320 | 321 | `None` is a singleton object in the Python language. If you want to determine whether a variable is None, remember to use `is` instead of `==`, because only `is` expresses, in the strict sense, whether a variable is None. 322 | 323 | Otherwise, the following situation may occur: 324 | 325 | ```python 326 | >>> class Foo(object): 327 | ... def __eq__(self, other): 328 | ... return True 329 | ... 330 | >>> foo = Foo() 331 | >>> foo == None 332 | True 333 | ``` 334 | 335 | In the above code, the Foo class easily satisfies the condition `== None` by customizing the `__eq__` magic method. 336 | 337 | **So, when you want to judge whether a variable is None, please use `is` instead of `==`.** 338 | 339 | ### 2. 
Pay attention to the operation priority of and and or 340 | 341 | Take a look at the following two expressions and guess whether they have the same value? 342 | 343 | ```python 344 | >>> (True or False) and False 345 | >>> True or False and False 346 | ``` 347 | 348 | The answer is: different, their values are `False` and `True`, did you guess right? 349 | 350 | The key to the problem is: **The `and` operator has a higher priority than `or`**. Therefore, the second expression above actually appears to Python as `True or (False and False)`. So the result is `True` instead of `False`. 351 | 352 | When writing expressions that contain multiple `and` and `or`, please pay extra attention to the arithmetic priority of `and` and `or`. Even if the execution priority is exactly what you need, you can add extra brackets to make the code clearer. 353 | -------------------------------------------------------------------------------- /Python/03. Number-strings.md: -------------------------------------------------------------------------------- 1 | # Number and Strings 2 | 3 | ## Preface 4 | 5 | Numbers are the most basic data type in almost all programming languages. It is the foundation of our connection to the real world through code. There are three numeric types in Python: integer (int), floating point (float), and complex (complex). In most cases, we only need to deal with the first two. 6 | 7 | Integers are more worry-free in Python because they do not distinguish between unsigned and never overflow. But the floating point type still has the same accuracy problems as most other programming languages, which often confuses many newcomers who have just entered the programming world: ["Why Are Floating Point Numbers Inaccurate?"](https://stackoverflow.com/questions/21895756/why-are-floating-point-numbers-inaccurate). 8 | 9 | Compared to numbers, strings in Python are much more complicated. To master it, you must first figure out the difference between bytes and str. 
If you are unlucky enough to still be a Python 2 user, the Unicode and character encoding problems alone will give you plenty of trouble *(hurry up and move to Python 3, today!)*.

However, none of the topics mentioned above are the subject of this article. If you are interested, you can find plenty of related material online. In this article, we will discuss some of the more subtle and less common programming practices, to help you write better Python code.


## Best Practices

### 1. Use fewer numeric literals

A "numeric literal" is a number that appears directly in the code. They are scattered all over the code; for example, the `0` in `del users[0]` is a numeric literal. They are simple and practical, and everyone writes them every day. **However, when certain specific literals appear repeatedly in your code, your "code quality warning light" should turn yellow.**

For example, suppose you have just joined a company you have long admired, and the project your colleagues hand over to you contains this function:

```python
def mark_trip_as_featured(trip):
    """Add a trip to the recommended section
    """
    if trip.source == 11:
        do_some_thing(trip)
    elif trip.source == 12:
        do_some_other_thing(trip)
    ...
    return
```

What does this function do? You try to understand it, but what does `trip.source == 11` mean? What about `== 12`? These two lines of code are very simple and use no magic features. But if this is your first encounter with the code, you may need **a whole afternoon** to understand what they mean.

**The problem lies in the numeric literals.** The person who originally wrote this function may be a veteran programmer who joined when the company was founded, and he knows exactly what those numbers mean.
But if you are a newcomer to this code, it is another matter entirely.

#### Improve code with the enum type

So, how can this code be improved? The most direct way is to add comments to the two conditional branches. But here, "add comments" is obviously not the best way to improve readability *(in fact, it rarely is)*. We need to replace these literals with meaningful names, and the enumeration type (enum) is the most appropriate tool here.

`enum` is a standard-library module introduced in Python 3.4. If you are using an earlier version, you can install it via `pip install enum34`. Here is the sample code using enum:

```python
# -*- coding: utf-8 -*-
from enum import IntEnum

class TripSource(IntEnum):
    FROM_WEBSITE = 11
    FROM_IOS_CLIENT = 12


def mark_trip_as_featured(trip):
    if trip.source == TripSource.FROM_WEBSITE:
        do_some_thing(trip)
    elif trip.source == TripSource.FROM_IOS_CLIENT:
        do_some_other_thing(trip)
    ...
    return
```

Defining repeated numeric literals as an enumeration type not only improves the readability of the code, but also reduces the chance of bugs.

Imagine mistyping `11` as `111` in one of the branch conditions. We make such mistakes often, and they are particularly hard to spot early on. Putting all these numeric literals into an enumeration type helps avoid such problems. Rewriting string literals as enums brings the same benefits.

The benefits of using enumerated types instead of literals:

- **Improve code readability**: no one needs to remember what a magic number represents
- **Improve code correctness**: reduce the chance of bugs caused by mistyped numbers or letters

Of course, you don't need to change all the literals in the code to enumerated types.
**Literals in the code are fine as long as they are easy to understand in their context.** For example, the `0` and `-1` that often appear as numeric subscripts are completely okay, because everyone knows what they mean.

### 2. Don't go too far with bare string processing

What is "bare string processing"? **In this article, it refers to manipulating strings using only basic addition, subtraction, multiplication, division, loops, and built-in functions/methods to obtain the result we need.**

Everyone has written such code. Sometimes we need to stitch together a long alert message to send to users, and sometimes we need to construct a long SQL query to send to the database, like this:

```python
def fetch_users(conn, min_level=None, gender=None, has_membership=False, sort_field="created"):
    """Get User List

    :param int min_level: the required minimum user level, the default is all levels
    :param int gender: filter user gender, the default is all genders
    :param int has_membership: filter all members/non-members, default non-members
    :param str sort_field: sort field, default is by created "user creation date"
    :returns: List: [(User ID, User Name), ...]
    """
    # An ancient SQL stitching technique, using "WHERE 1=1" to simplify string stitching
    # Pass query params separately to avoid SQL injection problems
    statement = "SELECT id, name FROM users WHERE 1=1"
    params = []
    if min_level is not None:
        statement += " AND level >= ?"
        params.append(min_level)
    if gender is not None:
        statement += " AND gender >= ?"
        params.append(gender)
    if has_membership:
        statement += " AND has_membership == true"
    else:
        statement += " AND has_membership == false"

    statement += " ORDER BY ?"
    params.append(sort_field)
    return list(conn.execute(statement, params))
```

We concatenate the required string this way (here, an SQL statement) because it is simple, direct, and intuitive. But the biggest problem is: **as the function logic grows more complex, this stitched-together code becomes error-prone and hard to extend.** In fact, the demo code above only **looks like** it has no obvious bugs *(who knows whether other problems are hiding in it)*.

For structured, regular strings like SQL statements, it is better to construct and edit them in an object-oriented way. The following code uses the [SQLAlchemy](https://www.sqlalchemy.org/) module to accomplish the same thing:

```python
from sqlalchemy import select

# `users` is assumed to be a SQLAlchemy Table object defined elsewhere.
# Passing a list to select() is the SQLAlchemy 1.x calling style.
def fetch_users_v2(conn, min_level=None, gender=None, has_membership=False, sort_field="created"):
    """Get User List
    """
    query = select([users.c.id, users.c.name])
    if min_level is not None:
        query = query.where(users.c.level >= min_level)
    if gender is not None:
        query = query.where(users.c.gender == gender)
    query = query.where(users.c.has_membership == has_membership).order_by(users.c[sort_field])
    return list(conn.execute(query))
```

The `fetch_users_v2` function above is shorter and easier to maintain, and you no longer need to worry about SQL injection. So, when complex bare string processing logic appears in your code, try to replace it as follows:

`Q: Is the target/source string structured, following some fixed format?`

- Yes: look for an existing open-source object-oriented module to handle it, or write one yourself
    - SQL: SQLAlchemy
    - XML: lxml
    - JSON, YAML...
- No: try a template engine instead of complex string processing logic
    - Jinja2
    - Mako
    - Mustache

### 3. There is no need to pre-calculate literal expressions

Occasionally some more complex numbers appear in our code, like this:

```python
def f1(delta_seconds):
    # If more than 11 days have passed, do nothing
    if delta_seconds > 950400:
        return
```

To be clear up front, there is nothing wrong with the code above.

First, we worked it out in a notebook (of course, smart people like me would use IPython): how many seconds are there in 11 days? Then we put the resulting magic number `950400` into the code, and finally added a comment above it to tell everyone where the magic number came from.

What I want to ask is: **"Why don't we just write the code as `if delta_seconds > 11 * 24 * 3600:`?"**

**"Performance", the answer must be "performance"**. We all know that Python is an interpreted language, so we pre-calculated `950400` precisely because we don't want every call to `f1` to carry the overhead of this calculation. But the fact is: *even if we change the code to `if delta_seconds > 11 * 24 * 3600:`, the function will not have any extra overhead.*

When Python code is executed, the interpreter first compiles it into bytecode, and the truth is hidden in the bytecode.
Let's take a look with the `dis` module:

```python
def f1(delta_seconds):
    if delta_seconds > 11 * 24 * 3600:
        return

import dis
dis.dis(f1)

# dis execution result
  5           0 LOAD_FAST                0 (delta_seconds)
              2 LOAD_CONST               1 (950400)
              4 COMPARE_OP               4 (>)
              6 POP_JUMP_IF_FALSE       12

  6           8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
        >>   12 LOAD_CONST               0 (None)
             14 RETURN_VALUE
```

Do you see `2 LOAD_CONST 1 (950400)` above? It means that when the Python interpreter compiles the source code into bytecode, it evaluates the whole expression `11 * 24 * 3600` and replaces it with `950400`.

So, **when complex literals need to appear in our code, keep the whole formula. It has no impact on performance and makes the code more readable.**

> Hint: Besides pre-calculating numeric literal expressions, the Python interpreter performs similar optimizations on strings and lists. It is all for performance. After all, who told you to keep complaining that Python is slow?

## Practical tips

### 1. Boolean values are actually "numbers"

In Python, the two Boolean values `True` and `False` can be used directly as the integers `1` and `0`, like this:

```python
>>> True + 1
2
>>> 1 / False
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero
```

So what's the point of remembering this?
First of all, Booleans can be combined with the `sum` function to simplify counting how many items satisfy a condition:

```python
>>> l = [1, 2, 4, 5, 7]
>>> sum(i % 2 == 0 for i in l)
2
```

In addition, if you use a Boolean expression as a list subscript, you get something similar to a ternary expression:

```python
# Similar to the ternary expression: "Javascript" if 2 > 1 else "Python"
>>> ["Python", "Javascript"][2 > 1]
'Javascript'
```

### 2. Improve the readability of super long strings

A single line of code should not be too long. PEP8, for example, recommends that each line should not exceed **79** characters. In practice, most teams settle on a maximum line length somewhere between 79 and 119 characters. For plain code this requirement is relatively easy to meet, but what if the code needs to contain an extra long string?

In this case, besides using the backslash `\` and the plus sign `+` to split the long string into several pieces, there is an easier way: **wrap the long string in parentheses, and then you can break lines freely**:

```python
def main():
    logger.info(("There is something really bad happened during the process. "
                 "Please contact your administrator."))
```

#### When multi-line strings appear in multi-level indentation

There is a more troublesome situation in daily coding: a multi-line string literal has to be inserted into code that is already indented.
Because the multi-line string must not include the indentation of the surrounding code, we need to write it like this:

```python
def main():
    if user.is_active:
        message = """Welcome, today's movie list:
- Jaw (1975)
- The Shining (1980)
- Saw (2004)"""
```

But this style breaks the visual indentation of the whole code block and looks very abrupt. There are many ways to improve it; for example, we can extract the multi-line string into a variable at the top level of the module. However, if using a literal fits your code logic better, you can also use the standard library module `textwrap` to solve the problem:

```python
from textwrap import dedent

def main():
    if user.is_active:
        # dedent removes the common leading whitespace from every line of the text
        message = dedent("""\
            Welcome, today's movie list:
            - Jaw (1975)
            - The Shining (1980)
            - Saw (2004)""")
```

#### Large numbers can also become more readable

For very large numbers, you can improve readability by adding underscores between the digits
([PEP515](https://www.python.org/dev/peps/pep-0515/), requires Python 3.6+).

For example:

```python
>>> 10_000_000.0  # Group the digits by thousands
10000000.0
>>> 0xCAFE_F00D  # Hexadecimal numbers work too; groups of 4 are easier to read
3405705229
>>> 0b_0011_1111_0100_1110  # Binary works too
16206
>>> int('0b_1111_0000', 2)  # Underscores are also handled correctly when parsing strings
240
```

### 3. Don't forget the built-in string methods starting with "r"

Python strings have many practical built-in methods; the most commonly used include `.strip()` and `.split()`. Most of these built-in methods process the string from left to right.
But there are also mirrored methods that start with `r` and process the string from right to left. For certain logic, using them lets you get twice the result with half the effort.

Suppose we need to parse some access logs whose format is `"{user_agent}" {content_length}`:

    >>> log_line = '"AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36" 47632'


If we use `.split()` to split the log line into `(user_agent, content_length)`, we need to write:

```python
>>> l = log_line.split()
>>> " ".join(l[:-1]), l[-1]
('"AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"', '47632')
```

But if you use `.rsplit()`, the processing logic is more straightforward:

```python
>>> log_line.rsplit(None, 1)
['"AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"', '47632']
```


### 4. Use "infinity" float("inf")

If someone asks you: *"What is the largest/smallest number in Python?"*, what should you answer? Does such a thing exist?

The answer is: "Yes, they are `float("inf")` and `float("-inf")`". They correspond to positive and negative infinity in mathematics. When compared with any number, they satisfy this rule: `float("-inf") < any number < float("inf")`.

Because of this property, they come in handy in some scenarios:

```python
# A. Sort by age in ascending order; users without an age go last
>>> users = {"tom": 19, "jenny": 13, "jack": None, "andrew": 43}
>>> sorted(users.keys(), key=lambda user: users.get(user) or float('inf'))
['jenny', 'tom', 'andrew', 'jack']

# B. As the initial value of a loop, simplifying the first comparison
>>> max_num = float('-inf')

>>> # Find the largest number in the list
>>> for i in [23, 71, 3, 21, 8]:
...     if i > max_num:
...         max_num = i
...
>>> max_num
71
```

## Common Misunderstandings

### 1. "value += 1" is not thread safe

When we write multi-threaded programs, we often need to deal with complex shared variables and race conditions.

"Thread safety" is usually used to describe a behavior or a data structure that can be shared in a multi-threaded environment and still produce the expected results. **A typical module that satisfies "thread safety" is the [queue module](https://docs.python.org/3/library/queue.html)**.

The `value += 1` operation we perform so often is easily assumed to be "thread safe", because it looks like an atomic operation *(the smallest unit of execution, into which no other operation can be inserted while it runs)*. However, that is not the case. From the point of view of Python code, `value += 1` does seem atomic, but by the time the Python interpreter finally executes it, it is no longer "atomic".

We can use the aforementioned `dis` module to verify:

```python
def incr(value):
    value += 1


# Use the dis module to view bytecode
import dis

dis.dis(incr)
      0 LOAD_FAST                0 (value)
      2 LOAD_CONST               1 (1)
      4 INPLACE_ADD
      6 STORE_FAST               0 (value)
      8 LOAD_CONST               0 (None)
     10 RETURN_VALUE
```

In the output above, you can see that this simple increment statement is compiled into several separate steps, including loading the value and storing it back. In a multi-threaded environment, any other thread may cut in at one of those steps and prevent you from getting the right result.

**Therefore, do not rely on intuition to judge whether an operation is "thread safe". Otherwise, when the program shows strange bugs under high concurrency, you will pay a painful price for that intuition.**

### 2. String concatenation is not slow

When I first came into contact with Python, I read a statement on some website: *"Strings in Python are immutable, so every time strings are concatenated a new object is created, which causes a new memory allocation, so it is very inefficient."* I was convinced of this.

So, for a long time, I tried to avoid using `+=` to concatenate strings and used `"".join(str_list)` and similar methods instead.

However, by chance, I ran a simple performance test on Python string concatenation and found: **Python string concatenation is not slow at all!** After consulting some material, I finally found out the truth.

String concatenation in Python 2.2 and earlier really was slow, consistent with the claim I first read. But because this operation is so common, later versions optimized it specifically, greatly improving its execution efficiency.

Nowadays, concatenating strings with `+=` is very close in efficiency to `"".join(str_list)`. So go ahead and concatenate when you need to, without worrying about performance.

> Hint: If you want to know more details, you can read this article: [Python - Efficient String Concatenation in Python (2016 edition) - smcl](http://blog.mclemon.io/python-efficient-string-concatenation-in-python-2016-edition)

--------------------------------------------------------------------------------
/Python/04. Container-types.md:
--------------------------------------------------------------------------------
# Containers

## Preface

The word "container" is rarely mentioned in Python technical articles. On seeing "container", most people think of the little blue whale, *Docker*, but this article has nothing to do with it.
A container is an abstract concept in Python: a general term for data types that are specifically used to hold other objects.

In Python, there are four common built-in container types: `list`, `tuple`, `dict` and `set`. By using them individually or in combination, many things can be accomplished efficiently.

The internal implementation of the Python language itself is also closely tied to these container types. For example, the instance attributes of Python classes, the global variables in `globals()`, and so on are all stored in dictionaries.

In this article, I will start from the definition of container types and try to summarize some best practices for daily coding. After that, I will share some programming tips built around the special features of each container type.


### When we talk about containers, what are we talking about?

I gave a simple definition of "container" above: *a container is something specifically used to hold other objects*. But this definition is too broad to offer any guidance for our daily programming. To truly master containers in Python, you need to approach them from two levels:

- **Low-level implementation**: What data structure does a built-in container type use? How does a given operation work?
- **High-level abstraction**: What determines whether an object is a container? What behaviors define a container?

Below, let us stand on these two levels in turn and get to know containers again.


## Looking at containers from the bottom

Python is a high-level programming language, and **the built-in container types it provides are all the result of heavy encapsulation and abstraction**. Unlike names such as "linked list", "red-black tree", and "hash table", the names of Python's built-in types describe only what each type does. From these names alone, nobody can learn even a little about the internal details.

This is one of the advantages of the Python programming language. Compared with languages such as C, which are closer to the hardware, Python has redesigned and implemented programmer-friendly built-in container types that hide extra work such as memory management and give us a better development experience.

But if this is an advantage of the Python language, why should we bother to understand the implementation details of the container types? The answer is: **attention to detail helps us write faster code.**

### Write faster code

#### 1. Avoid frequently expanding lists/creating new lists

None of the built-in container types limits its capacity. If you want, you can keep stuffing ever-increasing numbers into an empty list until you exhaust the memory of the whole machine.

In the implementation of the Python language, a list's memory is allocated on demand. When the memory a list currently owns is not enough, the expansion logic is triggered, and allocating memory is an expensive operation. In most cases this has no serious impact on the performance of your program. But when the amount of data you process is particularly large, memory allocation can easily drag down the performance of the entire program.

Fortunately, Python has been aware of this problem for a long time and provides an official way to deal with it: **"be lazy"**.

How should "be lazy" be understood? The evolution of the `range()` function is a very good example.

In Python 2, if you call `range(100000000)`, you need to wait a few seconds for the result, because it has to return a huge list and spends a lot of time on memory allocation and calculation. But in Python 3, the same call returns immediately.
That is because the function no longer returns a list but a lazy object of type `range`, which gives you the actual numbers only when you iterate over it or index into it.

**So, to improve performance, the built-in function `range` became "lazy".** To avoid overly frequent memory allocation, our own functions should also become lazy in daily coding. This includes:

- Using the `yield` keyword more often to return generator objects
- Preferring generator expressions over list comprehensions
    - Generator expression: `(i for i in range(100))` 👍
    - List comprehension: `[i for i in range(100)]`
- Preferring the lazy objects that modules provide:
    - Use `re.finditer` instead of `re.findall`
    - Iterate over file objects directly: `for line in fp` instead of `for line in fp.readlines()`

#### 2. Use deque when there are many operations at the head of a list

The list is implemented on top of an array structure. When you insert a new member at the head of a list (`list.insert(0, item)`), all the members behind it have to be moved, so the time complexity of the operation is `O(n)`. This makes inserting at the head of a list much slower than appending at the end (`list.append(item)` has a time complexity of `O(1)`).

If your code needs to perform such operations many times, consider using the [collections.deque](https://docs.python.org/3.7/library/collections.html#collections.deque) type instead of a list. Because deque is a double-ended queue, appending elements at either the head or the tail has a time complexity of `O(1)`.

#### 3. Use sets/dictionaries to determine whether members exist

When you need to determine whether a member exists in a container, a set is more appropriate than a list.
That is because the time complexity of `item in [...]` is `O(n)`, while the time complexity of `item in {...}` is `O(1)` on average. This is because dictionaries and sets are implemented on top of the hash table data structure.

```python
# This example is not ideal, because when the target set is very small, using a set
# versus a list makes almost no difference in efficiency
# But this is not the point :)
VALID_NAMES = ["piglei", "raymond", "bojack", "caroline"]

# Convert to a set specifically for membership tests
VALID_NAMES_SET = set(VALID_NAMES)


def validate_name(name):
    if name not in VALID_NAMES_SET:
        # The f-strings feature added in Python 3.6 is used here
        raise ValueError(f"{name} is not a valid name!")
```

> Hint: I strongly recommend reading [TimeComplexity - Python Wiki](https://wiki.python.org/moin/TimeComplexity) to learn more about the time complexity of common container types.
>
> If you are interested in the implementation details of the dictionary, I strongly recommend watching Raymond Hettinger's talk [Modern Dictionaries (YouTube)](https://www.youtube.com/watch?v=p33CVV29OG8&t=1403s)

## Looking at containers from the top

Python is a "[duck typing](https://en.wikipedia.org/wiki/Duck_typing)" language: *"When you see a bird that walks like a duck, swims like a duck, and quacks like a duck, that bird can be called a duck."* So, when we say what type an object is, we basically mean: **this object satisfies the interface specification of that type and can be used as that type.** The same is true for all built-in container types.
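
As a small illustration of the duck typing idea (a sketch that is not part of the original article; the `Playlist` class and the `first_and_last` function are hypothetical names invented for this example): the class below is not a list and inherits from nothing special, yet because it implements `__len__` and `__getitem__`, it can be passed to a function that only relies on those two behaviors.

```python
class Playlist:
    """Hypothetical class, used only to illustrate duck typing."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __len__(self):
        return len(self._songs)

    def __getitem__(self, index):
        return self._songs[index]


def first_and_last(seq):
    # Relies only on len() and indexing, not on any concrete type
    return seq[0], seq[len(seq) - 1]


playlist = Playlist(["intro", "verse", "outro"])
print(first_and_last(playlist))          # a custom class works
print(first_and_last(["a", "b", "c"]))   # so does a list
print(first_and_last(("x", "y")))        # and a tuple
```

The function never checks what type it receives; any object that supports `len()` and indexing is accepted.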

Open the [abc](https://docs.python.org/3/library/collections.abc.html) submodule *(short for "Abstract Base Classes")* under the [collections](https://docs.python.org/3.7/library/collections.html) module, and you can find the definitions of all the container-related interfaces (abstract classes). Let's take a look at the interfaces that the built-in container types satisfy:

- **List (list)**: satisfies `Iterable`, `Sequence`, `MutableSequence`, etc.
- **Tuple (tuple)**: satisfies `Iterable`, `Sequence`
- **Dictionary (dict)**: satisfies `Iterable`, `Mapping`, `MutableMapping` [[Note 3]]
- **Set (set)**: satisfies `Iterable`, `Set`, `MutableSet`

Each built-in container type is in fact a combination that satisfies several interface definitions. For example, all container types satisfy the `Iterable` interface, which means they are all "iterable". But the reverse does not hold: not every "iterable" object is a container. A string can be iterated, for instance, yet we usually don't think of it as a "container".

With this in mind, we can **take a fresh look at** one of the most important principles of object-oriented programming in Python: **program to interfaces rather than to concrete implementations.**

Let's use an example to see how to understand "programming to interfaces" in Python.

### Write more extensible code

One day, we received a request: *There is a list containing many user comments. To display them properly on the page, the part of any comment that exceeds a certain length needs to be replaced with an ellipsis*.

This requirement is easy to implement, and we soon wrote the first version of the code:

```python
# Note: To make the sample code more illustrative, some code snippets in this article
# use the type hinting feature added in Python 3.5
import typing


def add_ellipsis(comments: typing.List[str], max_length: int = 12):
    """If a comment in the list is longer than max_length, replace the remaining characters with an ellipsis
    """
    index = 0
    for comment in comments:
        comment = comment.strip()
        if len(comment) > max_length:
            comments[index] = comment[:max_length] + '...'
        index += 1
    return comments


comments = [
    "Implementation note",
    "Changed",
    "ABC for generator",
]
print("\n".join(add_ellipsis(comments)))
# OUTPUT:
# Implementati...
# Changed
# ABC for gene...
```

In the code above, the `add_ellipsis` function takes a list as its parameter, iterates through it, and replaces the members that need to be modified. All this seems reasonable, because the original request we received was: "There is a **list**, containing...". **But what if one day the comments we get are no longer in a list, but in an immutable tuple?**

In that case, the existing function design forces us to write slow and ugly code such as `add_ellipsis(list(comments))`. 😨

#### Programming to the container interface

We need to improve the function to avoid this problem. Because `add_ellipsis` depends heavily on the list type, the current function no longer works when the parameter becomes a tuple *(reason: assigning to `comments[index]` raises a `TypeError`)*. How can the design of this part be improved?
The secret is: **let the function depend on the abstract concept of an "iterable object" rather than on the concrete list type.**

Using the generator feature, the function can be changed to this:

```python
import typing


def add_ellipsis_gen(comments: typing.Iterable[str], max_length: int = 12):
    """If a comment in the iterable is longer than max_length, replace the rest with an ellipsis
    """
    for comment in comments:
        comment = comment.strip()
        if len(comment) > max_length:
            yield comment[:max_length] + '...'
        else:
            yield comment


print("\n".join(add_ellipsis_gen(comments)))
```

In the new function, we changed the parameter type it depends on from a list to the iterable abstract class. This has many benefits. The most obvious one is that the new function handles comments equally well whether they come from a list, a tuple, or a file:

```python
# Handle comments placed in a tuple
comments = ("Implementation note", "Changed", "ABC for generator")
print("\n".join(add_ellipsis_gen(comments)))

# Handle comments placed in a file
with open("comments") as fp:
    for comment in add_ellipsis_gen(fp):
        print(comment)
```

After changing the dependency from a specific container type to an abstract interface, the function applies to many more situations. The new function is also lazy: it yields one comment at a time instead of building a whole list in memory. Now let us return to the earlier question. **From a high-level perspective, what defines a container?**

The answer is: **the interface protocols implemented by each container type define the container.** In our eyes, different container types should be combinations of features such as `is it iterable?`, `can it be modified?`, and `does it have a length?`.
When writing relevant code, we should pay more attention to the abstract properties of a container than to the container type itself, which helps us write more elegant and scalable code.

> Hint: The [itertools](https://docs.python.org/3/library/itertools.html) and [more-itertools](https://pypi.org/project/more-itertools/) modules contain many more treasures for handling iterable objects.

## Common Tips

### 1. Use tuples to improve branch code

Sometimes, `if/else` chains with more than three branches appear in our code, like this:

```python
import time


def from_now(ts):
    """Receive a past timestamp and return a text description of the time relative to now
    """
    now = time.time()
    seconds_delta = int(now - ts)
    if seconds_delta < 1:
        return "less than 1 second ago"
    elif seconds_delta < 60:
        return "{} seconds ago".format(seconds_delta)
    elif seconds_delta < 3600:
        return "{} minutes ago".format(seconds_delta // 60)
    elif seconds_delta < 3600 * 24:
        return "{} hours ago".format(seconds_delta // 3600)
    else:
        return "{} days ago".format(seconds_delta // (3600 * 24))


now = time.time()
print(from_now(now))
print(from_now(now - 24))
print(from_now(now - 600))
print(from_now(now - 7500))
print(from_now(now - 87500))
# OUTPUT:
# less than 1 second ago
# 24 seconds ago
# 10 minutes ago
# 2 hours ago
# 1 days ago
```

There is not much to fault in the function above, and many people write similar code. However, if you look closely, you can find some obvious "**boundaries**" in the branches. For example, when the function decides whether a time should be displayed in seconds, it uses `60`. When it decides whether minutes should be used, it uses `3600`.

**Extracting rules from these boundaries is the key to optimizing this code.** If we put all the boundaries in a sorted tuple and combine it with the binary search module [bisect](https://docs.python.org/3.7/library/bisect.html), the control flow of the whole function can be greatly simplified:

```python
import bisect
import time


# BREAKPOINTS must be sorted, otherwise binary search is not possible
BREAKPOINTS = (1, 60, 3600, 3600 * 24)
TMPLS = (
    # unit, template
    (1, "less than 1 second ago"),
    (1, "{units} seconds ago"),
    (60, "{units} minutes ago"),
    (3600, "{units} hours ago"),
    (3600 * 24, "{units} days ago"),
)


def from_now(ts):
    """Receive a past timestamp and return a text description of the time relative to now
    """
    seconds_delta = int(time.time() - ts)
    unit, tmpl = TMPLS[bisect.bisect(BREAKPOINTS, seconds_delta)]
    return tmpl.format(units=seconds_delta // unit)
```

Besides using tuples to tame long `if/else` chains, in some cases dictionaries can do the same job. The key is to spot the repeated logic and patterns in the existing code, and to experiment.

### 2. Use dynamic unpacking in more places

Dynamic unpacking refers to "unwrapping" an iterable object with the `*` or `**` operator. In the Python 2 era, this operation could only be used in function arguments, with strict requirements on order and number of occurrences, so its usage scenarios were very limited.

```python
def calc(a, b, multiplier=1):
    return (a + b) * multiplier


# Python 2 only supports dynamic unpacking in function arguments
# (note the Python 2 print statement)
print calc(*[1, 2], **{"multiplier": 10})
# OUTPUT: 30
```

However, since Python 3, and especially since version 3.5, the usage scenarios of `*` and `**` have been greatly expanded. For example, in Python 2, merging two dictionaries requires something like this:

```python
def merge_dict(d1, d2):
    # A dictionary is mutable, so make a shallow copy of d1 first
    # to avoid modifying the original object
    result = d1.copy()
    result.update(d2)
    return result

user = merge_dict({"name": "piglei"}, {"movies": ["Fight Club"]})
```

But in Python 3.5 and later, you can use the `**` operator to merge dictionaries quickly:

```python
user = {**{"name": "piglei"}, **{"movies": ["Fight Club"]}}
```

In addition, you can use the `*` operator in ordinary assignment statements to unpack iterable objects dynamically. If you want to learn more, read the PEPs recommended below.

> Hint: Two PEPs that expanded the scenarios for dynamic unpacking:
> - [PEP 3132 - Extended Iterable Unpacking | Python.org](https://www.python.org/dev/peps/pep-3132/)
> - [PEP 448 - Additional Unpacking Generalizations | Python.org](https://www.python.org/dev/peps/pep-0448/)

### 3. It is better to need neither "asking for permission" nor "asking for forgiveness"

This subheading may be a little confusing, so let me explain briefly: "asking for permission" and "asking for forgiveness" are two different programming styles.
If we take a classic requirement, "count the number of occurrences of each element in a list", as an example, code in the two styles would look like this:

```python
# AF: Ask for Forgiveness
# Just do it; if it raises an exception, handle the exception
def counter_af(l):
    result = {}
    for key in l:
        try:
            result[key] += 1
        except KeyError:
            result[key] = 1
    return result


# AP: Ask for Permission
# Before doing something, ask whether it is allowed; only then do it
def counter_ap(l):
    result = {}
    for key in l:
        if key in result:
            result[key] += 1
        else:
            result[key] = 1
    return result
```

The Python community has a clear preference for the first, *Ask for Forgiveness*, exception-catching style. There are several reasons for this. First, exceptions in Python are relatively cheap. Second, the first approach can also perform better when most keys already exist, because it does not need an extra membership check on every iteration.

However, both pieces of code in the example are rare in the real world. Why? Because if you want to count occurrences, you can simply use `collections.defaultdict`:

```python
from collections import defaultdict


def counter_by_collections(l):
    result = defaultdict(int)
    for key in l:
        result[key] += 1
    return result
```

Such code requires neither "asking for permission" nor "asking for forgiveness". **The control flow of the whole code becomes clearer and more natural.** So, if possible, try to omit **non-core** exception-catching logic.
Some tips:

- When manipulating dictionary members: use the `collections.defaultdict` type
    - Or use the `dict.setdefault` method: `dict[key] = dict.setdefault(key, 0) + 1`
- When removing a dictionary member without caring whether it exists:
    - Set a default value when calling the pop method, such as `dict.pop(key, None)`
- Specify a default value when getting members from a dictionary: `dict.get(key, default_value)`
- Slicing a list beyond its range does not raise an `IndexError`: `["foo"][100:200]`

### 4. Use next() function

`next()` is a very useful built-in function that takes an iterator as a parameter and returns its next element. Combined with a generator expression, it can efficiently implement requirements such as *"find the first member in the list that meets a condition"*.

```python
numbers = [3, 7, 8, 2, 21]
# Get the first even number in the list and **return immediately**
print(next(i for i in numbers if i % 2 == 0))
# OUTPUT: 8
```

### 5. Use an ordered dictionary to deduplicate

The structure of dictionaries and sets guarantees that their members are unique, so they are often used for deduplication. However, deduplicating with a set loses the order of the original list. This is determined by the underlying "hash table" data structure.

```python
>>> l = [10, 2, 3, 21, 10, 3]
# Deduplicated, but the order is lost
>>> set(l)
{3, 10, 2, 21}
```

What if you need to remove duplicates and you must preserve order?
We can use the `collections.OrderedDict` class:

```python
>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys(l).keys())
[10, 2, 3, 21]
```

> Hint: In Python 3.6, the implementation of the default dictionary type changed and it became ordered. In Python 3.7, this went from a **language implementation detail** to a **reliable official language feature**.
>
> But I think it will take some time for the whole Python community to get used to this. After all, "dictionaries are unordered" is still printed in countless Python books. Therefore, I still recommend using OrderedDict wherever an ordered dictionary is needed.

## Common Misunderstandings

### 1. Beware of exhausted iterators

Earlier in the article, we mentioned the benefits of using "lazy" generators. However, everything has two sides. One of the biggest disadvantages of generators is that **they get exhausted**. Once you have traversed one completely, traversing it again yields nothing.

```python
numbers = [1, 2, 3]
numbers = (i * 2 for i in numbers)

# The first loop will output 2, 4, 6
for number in numbers:
    print(number)

# Nothing will be output in this loop because the iterator is exhausted
for number in numbers:
    print(number)
```

This applies not only to generator expressions: the built-in `map` and `filter` functions in Python 3 behave the same way. Ignoring this can easily lead to bugs that are hard to detect.

Instagram encountered this problem while migrating its project from Python 2 to Python 3. They shared the story of dealing with it at PyCon 2017.
Visit the article [Summary of Instagram's speech at PyCon 2017](https://www.zlovezl.cn/articles/instagram-pycon-2017/) and search for "Iterator" to see the details.

### 2. Don't modify the iterated object in the loop body

This is a mistake many Python beginners make. For example, suppose we need a function that deletes all even numbers from a list:

```python
def remove_even(numbers):
    """Remove all even numbers in the list
    """
    for i, number in enumerate(numbers):
        if number % 2 == 0:
            # Problematic code
            del numbers[i]


numbers = [1, 2, 7, 4, 8, 11]
remove_even(numbers)
print(numbers)
# OUTPUT: [1, 7, 8, 11]
```

Notice the extra "8" in the result? This is what happens when you modify a list while iterating over it, because the iterated object `numbers` changes during the loop. **The traversal index keeps growing while the list itself keeps shrinking, so some members of the list are never visited at all.**

So for such operations, use a new empty list to hold the result, or use yield to return a generator, instead of modifying the list or dictionary being iterated.

## Summary

In this article, we started from the definition of "container type" and discussed container types at both the low level and the high level. Then, following the tradition of this series, we provided some tips for writing container-related code.
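Before the recap, here is a minimal sketch of the two safe alternatives mentioned for the last pitfall: building a new list, and yielding lazily. The names `without_even` and `iter_odd` are chosen here for illustration and are not part of the original article.

```python
from typing import Iterable, Iterator, List


def without_even(numbers: Iterable[int]) -> List[int]:
    """Return a new list without the even numbers; the input is left untouched."""
    return [n for n in numbers if n % 2 != 0]


def iter_odd(numbers: Iterable[int]) -> Iterator[int]:
    """Lazily yield the odd numbers one at a time."""
    for n in numbers:
        if n % 2 != 0:
            yield n


numbers = [1, 2, 7, 4, 8, 11]
print(without_even(numbers))    # [1, 7, 11]
print(list(iter_odd(numbers)))  # [1, 7, 11]
print(numbers)                  # [1, 2, 7, 4, 8, 11] (unchanged)
```

Because neither function mutates its argument, both also accept tuples or other iterables, which ties back to the interface-oriented advice earlier in the article.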

Let us conclude with the main points:

- Understanding the low-level implementation of container types helps you write better-performing code
- Extract the abstract concepts from requirements, and program towards interfaces rather than implementations
- Use more "lazy" objects and fewer "eager" lists
- Tuples and dictionaries can simplify branching code
- Using `next()` with iterators can accomplish many things efficiently, but watch out for the "exhaustion" problem
- There are many useful tools in the collections and itertools modules. Go check them out!

--------------------------------------------------------------------------------
/Python/05. Functions.md:
--------------------------------------------------------------------------------

# Functions

## Preface

There is no doubt that functions are one of the most important concepts in the Python language. When programming, we break big real-world problems into small ones and then solve them one by one. Functions are the nemesis of repeated code and the best weapon against code complexity.

Just as most stories have an ending, most functions end by **returning a result**. The way a function returns its result shapes the experience of calling it. Therefore, understanding how to make a function return results elegantly is essential to writing good functions.

### Python function return method

A Python function returns its result with the `return` statement. Use `return value` to return a single value, and `return value1, value2` to return multiple values at once.

If there is no `return` statement in a function body, then the return value of this function defaults to `None`.
In addition to returning content through the `return` statement, a function can also "return a result" by raising an exception *(raise Exception)*.

Next, I will list some common programming suggestions related to function returns.

## Programming suggestions

### 1. Do not return multiple types for a single function

Python is very flexible, and we can use it to easily accomplish things that are difficult in other languages. For example: *letting one function return different types of results*, to build a "multifunctional function" that looks very practical.

Just like this:

```python
# `User` is assumed to be a model class defined elsewhere
def get_users(user_id=None):
    if user_id is not None:
        return User.get(user_id)
    else:
        return User.filter(is_active=True)


# Returns a single user
get_users(user_id=1)
# Returns multiple users
get_users()
```

When we need a single user, we pass the `user_id` parameter. Otherwise we pass nothing and get a list of all active users. Everything is done by one function, `get_users`. This design seems reasonable.

However, in the world of functions, writing a "multifunctional" Swiss Army knife function is nothing to be proud of. This is because a good function should follow the ["single responsibility"](https://en.wikipedia.org/wiki/Single_responsibility_principle) principle. **Single responsibility means that one function does only one thing, with a clear purpose.** Such functions are also less likely to need modification when requirements change.

A function that returns multiple types almost always violates the "single responsibility" principle. **The best functions provide a stable return value to minimize the caller's handling cost.**

As in the example above, we should write two separate functions, `get_user_by_id(user_id)` and `get_active_users()`, instead.
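As a rough sketch of that split, the two functions could look like the following. The `User` dataclass and the in-memory `_USERS` store are hypothetical stand-ins for the `User` model in the example above, used only so that the snippet runs on its own.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class User:
    user_id: int
    name: str
    is_active: bool = True


# Hypothetical in-memory store standing in for a real user model or database.
_USERS: Dict[int, User] = {
    1: User(1, "piglei"),
    2: User(2, "cotton", is_active=False),
    3: User(3, "roland"),
}


def get_user_by_id(user_id: int) -> User:
    """Always return exactly one User; raise KeyError if the id is unknown."""
    return _USERS[user_id]


def get_active_users() -> List[User]:
    """Always return a list (possibly empty) of active users."""
    return [user for user in _USERS.values() if user.is_active]


print(get_user_by_id(1).name)                          # piglei
print([user.name for user in get_active_users()])      # ['piglei', 'roland']
```

Each function now has one return type, so callers never need to check whether they received a single object or a collection.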

### 2. Use partial to construct a new function

Imagine this scenario: your code has a function `A` with many parameters that is widely applicable. Another function `B` does its work entirely by calling `A`, so it is effectively a shortcut.

For example, here the `double` function is computed entirely through `multiply`:

```python
def multiply(x, y):
    return x * y


def double(value):
    # Return the result of another function call
    return multiply(2, value)
```

For this scenario, we can use the [`partial()`](https://docs.python.org/3.6/library/functools.html#functools.partial) function in the `functools` module to simplify it.

`partial(func, *args, **kwargs)` constructs a new function from the given function and positional/keyword arguments. **Every call to the new function is delegated to the original function after the call-time arguments are merged with the arguments given at construction.**

With `partial`, the definition of `double` above becomes a single-line expression, which is more concise and direct.

```python
import functools

double = functools.partial(multiply, 2)
```

> Suggested reading: [partial function official documentation](https://docs.python.org/3.6/library/functools.html#functools.partial)

### 3. Throw exceptions instead of returning results and errors

As I mentioned earlier, functions in Python can return multiple values.
Based on this ability, we can write a special class of functions: **functions that return both a result and error information.**

```python
# MAX_LENGTH_OF_NAME, MAX_ITEMS_QUOTA, CURRENT_ITEMS and Item are assumed to be defined elsewhere
def create_item(name):
    if len(name) > MAX_LENGTH_OF_NAME:
        return None, 'name of item is too long'
    if len(CURRENT_ITEMS) > MAX_ITEMS_QUOTA:
        return None, 'items is full'
    return Item(name=name), ''


def create_from_input():
    name = input()
    item, err_msg = create_item(name)
    if err_msg:
        print(f'create item failed: {err_msg}')
    else:
        print(f'item<{name}> created')
```

In the example, `create_item` creates a new Item object. To give the caller error details when something goes wrong, it uses multiple return values to return the error message as the second result.

At first glance, this approach looks natural, especially to those with programming experience in Go. But in the Python world, it is not the best way to solve such problems, because it increases the caller's error-handling cost, especially when many functions follow this convention and calls are nested several layers deep.

Python has a mature *exception (Exception) mechanism, and to some extent encourages us to use exceptions ([Official documentation about EAFP](https://docs.python.org/3/glossary.html#term-eafp)).
Therefore, using exceptions to handle error flows is more idiomatic.*

After introducing a custom exception, the above code can be rewritten as follows:

```python
class CreateItemError(Exception):
    """Exception raised when creating an Item fails"""


def create_item(name):
    """Create a new Item

    :raises: CreateItemError when the item cannot be created
    """
    if len(name) > MAX_LENGTH_OF_NAME:
        raise CreateItemError('name of item is too long')
    if len(CURRENT_ITEMS) > MAX_ITEMS_QUOTA:
        raise CreateItemError('items is full')
    return Item(name=name)


def create_from_input():
    name = input()
    try:
        item = create_item(name)
    except CreateItemError as e:
        # Use the caught exception itself as the error message
        print(f'create item failed: {e}')
    else:
        print(f'item<{name}> created')
```

After replacing "return (result, error message)" with "raise an exception", the error handling does not seem to change much at first glance, but it is in fact very different. Some details:

- The new version of the function has a more stable return type: it always returns an `Item` or raises an exception
- Although I encourage the use of exceptions here, exceptions always bring **surprises**, so it is best to state the exception types that may be raised in the function's documentation
- Unlike a return value, an exception keeps propagating up the call stack until it is caught. Therefore, the direct caller of `create_item` can omit exception handling and leave it to an upper layer. This gives us more flexibility, but it also brings greater risk.

> Hint: How to handle errors in a programming language is a subject that is still controversial.
> For example, the multiple-return approach discouraged above is the core error-handling mechanism in Go, which has no exceptions. In addition, exception mechanisms themselves differ between programming languages.
>
> Exceptions or no exceptions is the result of many choices made by the language designers, and most of the time there is no absolute right or wrong. **However, as far as Python is concerned, using exceptions to express errors is undoubtedly more in line with the Python philosophy and should be respected.**

### 4. Use None with caution

The `None` value is usually used to indicate **"something that should exist but is missing"**, and there is exactly one `None` object in Python. Many programming languages have designs similar to None, such as `null` in JavaScript and `nil` in Go. Because of its unique *null* temperament, None is often used as a function return value.

When we use None as a function return value, it is usually one of the following three cases.

#### 1. As the default return value of the operation function

When an operation function does not need to return anything, it usually returns None. None is also the default return value of a function without any `return` statement.

For this kind of function, using None is no problem. `list.append()` and `os.chdir()` in the standard library belong to this category.

#### 2. As some "expected" values that may not be available

Some functions exist to try something tentatively. Depending on the situation, there may or may not be a result in the end. **And for the caller, "no result" is completely expected.** For this type of function, it is reasonable to use None as the return value when there is no result.
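A minimal sketch of such a "tentative" function is shown below. `find_first_even` is a hypothetical helper written for this illustration; it uses the two-argument form of the built-in `next()` so that an empty search returns `None` instead of raising `StopIteration`.

```python
from typing import Iterable, Optional


def find_first_even(numbers: Iterable[int]) -> Optional[int]:
    """Return the first even number, or None when there is none."""
    return next((n for n in numbers if n % 2 == 0), None)


print(find_first_even([3, 7, 8, 2, 21]))  # 8
print(find_first_even([1, 3, 5]))         # None

# Compare against None explicitly: 0 is a valid (falsy) result.
result = find_first_even([0, 1])
if result is not None:
    print("found", result)                # found 0
```

The name starts with "find", which already tells the caller that coming back empty-handed is a normal outcome.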

In the Python standard library, the `re.search` and `re.match` functions of the regular expression module `re` belong to this category. They return a `re.Match` object when a match is found, and `None` otherwise.

#### 3. As the value representing "wrong result" when the call fails

Sometimes, `None` is also used as the default return value when a function fails, as in the following function:

```python
def create_user_from_name(username):
    """Create a User instance by user name"""
    if validate_username(username):
        return User.from_username(username)
    else:
        return None


user = create_user_from_name(username)
if user:
    user.do_something()
```

When the username is invalid, `create_user_from_name` returns None. But in this scenario, that is actually not a good choice.

However, you may think this function is completely reasonable, and even that it is very similar to the previous "no result" usage. So how do we distinguish between these two different situations?
The key is: **does the function signature (name and parameters) hint that a return value of None is "expected"?**

Let me explain. Whenever you make a function return None, **read the function name carefully** and ask yourself: *if I were a user of this function, judging from its name, is "I may not get any result" part of what the name means?*

Take these two functions as examples:

- `re.search()`: judging from the name, `search` means to **search** the target string for a match, and a search may or may not find anything, so this function is suitable for returning None
- `create_user_from_name()`: judging from the name, it builds a user from a name, and nothing in the name suggests that it "may or may not return a result". So it is not suitable for returning None

For functions whose name gives no hint of None, there are two ways to fix the problem. First, if you insist on returning None, change the function name. For example, you can rename `create_user_from_name()` to `create_user_or_none()`.

The second method is much more common: replace the None return value with raising an exception *(raise Exception)*. If failing to return a normal result is not part of the function's meaning, then the function has hit an *"unexpected condition"*, and that is exactly the domain of exceptions.

Example after rewriting with an exception:

```python
class UnableToCreateUser(Exception):
    """Raised when the user cannot be created"""


def create_user_from_name(username):
    """Create a User instance by username

    :raises: UnableToCreateUser is raised when the user cannot be created
    """
    if validate_username(username):
        return User.from_username(username)
    else:
        raise UnableToCreateUser(f'unable to create user from {username}')


try:
    user = create_user_from_name(username)
except UnableToCreateUser:
    # Error handling goes here
    pass
else:
    user.do_something()
```

Compared with returning None, raising an exception has an additional advantage besides those mentioned in the previous scenario: **you can provide the reason for the unexpected result in the exception message**, which is impossible when only a None value is returned.

### 5. Reasonable use of the "null object pattern"

I mentioned earlier that functions can use `None` or exceptions to signal erroneous results, but both methods share a drawback: wherever you use the function's return value, you must add an `if` or `try/except` guard to check whether the result is normal.

Let's see a complete, runnable example:

```python
import decimal


class CreateAccountError(Exception):
    """Unable to create an account error"""


class Account:
    """A virtual bank account"""

    def __init__(self, username, balance):
        self.username = username
        self.balance = balance

    @classmethod
    def from_string(cls, s):
        """Initialize an account from a string"""
        try:
            username, balance = s.split()
            balance = decimal.Decimal(float(balance))
        except ValueError:
            raise CreateAccountError('input must follow pattern "{ACCOUNT_NAME} {BALANCE}"')

        if balance < 0:
            raise CreateAccountError('balance can not be negative')
        return cls(username=username, balance=balance)


def caculate_total_balance(accounts_data):
    """Calculate the total balance of all accounts
    """
    result = 0
    for account_string in accounts_data:
        try:
            user = Account.from_string(account_string)
        except CreateAccountError:
            pass
        else:
            result += user.balance
    return result


accounts_data = [
    'piglei 96.5',
    'cotton 21',
    'invalid_data',
    'roland $invalid_balance',
    'alfred -3',
]

print(caculate_total_balance(accounts_data))
```

In this example, whenever we call `Account.from_string`, we must use `try/except` to catch possible exceptions. If the function is called many times in a project, this becomes very tedious. For this situation, the ["Null object pattern"](https://en.wikipedia.org/wiki/Null_object_pattern) can be used to improve the control flow.

Martin Fowler explained this pattern in detail in a chapter of his classic book ["Refactoring"](https://martinfowler.com/books/refactoring.html).
To put it simply, **it uses a "null type" that conforms to the normal result's interface to replace returning None or raising an exception, in order to reduce the caller's cost of handling the result.**

After introducing the "null object pattern", the above example can be modified to look like this:

```python
class Account:
    # def __init__ has been omitted...

    @classmethod
    def from_string(cls, s):
        """Initialize an account from a string

        :returns: If the input is valid, return the Account object, otherwise return NullAccount
        """
        try:
            username, balance = s.split()
            balance = decimal.Decimal(float(balance))
        except ValueError:
            return NullAccount()

        if balance < 0:
            return NullAccount()
        return cls(username=username, balance=balance)


class NullAccount:
    username = ''
    balance = 0

    @classmethod
    def from_string(cls, s):
        raise NotImplementedError
```

In the new version of the code, I defined a new type, `NullAccount`, which is returned as the error result when `from_string` fails. The biggest change after this modification shows up in `caculate_total_balance`:

```python
def caculate_total_balance(accounts_data):
    """Calculate the total balance of all accounts
    """
    return sum(Account.from_string(s).balance for s in accounts_data)
```

After the adjustment, the caller no longer has to use a try statement explicitly to handle errors. It can assume that `Account.from_string` always returns a valid Account object, which greatly simplifies the calculation logic.

> Hint: In the Python world, the "null object pattern" is not uncommon. For example, [AnonymousUser](https://docs.djangoproject.com/en/2.1/ref/contrib/auth/#anonymoususer-object) in the famous Django framework is a typical null object.
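To see the pieces work together, here is a condensed, self-contained sketch that combines the classes above with the sample data from the earlier example. `total_balance` is a name chosen here for illustration; everything else follows the article's code.

```python
import decimal


class NullAccount:
    """Null object: exposes the same attributes as Account with neutral values."""
    username = ''
    balance = 0


class Account:
    def __init__(self, username, balance):
        self.username = username
        self.balance = balance

    @classmethod
    def from_string(cls, s):
        try:
            username, balance = s.split()
            balance = decimal.Decimal(float(balance))
        except ValueError:
            # Wrong number of fields or a non-numeric balance
            return NullAccount()
        if balance < 0:
            return NullAccount()
        return cls(username=username, balance=balance)


def total_balance(accounts_data):
    # No try/except needed: every result has a usable .balance
    return sum(Account.from_string(s).balance for s in accounts_data)


accounts_data = [
    'piglei 96.5',
    'cotton 21',
    'invalid_data',
    'roland $invalid_balance',
    'alfred -3',
]
print(total_balance(accounts_data))  # 117.5
```

Note the trade-off: the three invalid rows silently contribute `0`. That is convenient for a sum, but if the caller needs to know that some input was rejected, an exception or an explicit check for `NullAccount` is still the better fit.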
333 | 334 | ### 6. Use generator functions instead of returning lists 335 | 336 | It is particularly common to return lists in functions. Usually, we will first initialize a list `results = []`, then fill it with the `results.append(item)` function in the loop body, and finally return at the end of the function. 337 | 338 | For such patterns, we can use generator functions to simplify it. To put it crudely, replace the append statement with yield item. Functions that use generators are usually more concise and more versatile. 339 | 340 | ```python 341 | def foo_func(items): 342 | for item in items: 343 | # ... Use item to return directly after processing item 344 | yield item 345 | ``` 346 | 347 | I have analyzed this model in detail in [The fourth article in the series "The Doorway of the Container"](https://www.zlovezl.cn/articles/mastering-container-types/). For more details, you can visit the article and search for " Write more scalable code" view. 348 | 349 | ### 7. Limit the use of recursion 350 | 351 | When the function returns its own call, that is, when `recursion` occurs. Recursion is a programming technique that is very useful in certain scenarios, but the bad news is: Python has very limited support for recursion. 352 | 353 | This "limited support" is reflected in many ways. First, the Python language does not support ["tail recursive optimization"](https://en.wikipedia.org/wiki/Tail_call). In addition, Python also has strict limits on the maximum number of recursion levels. 354 | 355 | So I suggest: **Try to write less recursion**. If you want to use recursion to solve a problem, first think about whether it can be easily replaced with a loop. If the answer is yes, then rewrite it with a loop. 
If you must use recursion, please consider the following points: 356 | 357 | - Is the input data size of the function stable, and is it guaranteed not to exceed the maximum depth reported by `sys.getrecursionlimit()`? 358 | - Is it possible to reduce the number of recursive calls by using caching helpers like [functools.lru_cache](https://docs.python.org/3/library/functools.html#functools.lru_cache)? 359 | 360 | ## To sum up 361 | 362 | In this article, I simulated some scenarios related to the return values of Python functions, and provided my optimization suggestions for each scenario. Finally, here are the main points: 363 | 364 | - Let the function have a stable return value; a function should only do one thing 365 | - Use `functools.partial` to define shortcut functions 366 | - Throwing an exception is also a way to return a result; use it instead of returning an error message 367 | - Whether a function should return None is determined by the "meaning" of the function signature 368 | - Use the "Null Object Pattern" to simplify the caller's error handling logic 369 | - Use more generator functions, and try to use loops instead of recursion 370 | -------------------------------------------------------------------------------- /Python/06. Decorators.md: -------------------------------------------------------------------------------- 1 | # Tips for using decorators 2 | 3 | ## Foreword 4 | 5 | 6 | A decorator is a special tool in Python that gives us a flexible way to modify a function from outside the function. It's a bit like a magic hat with a unique `@` symbol: once it is worn on top of a function, it can silently change the behavior of the function itself. 7 | 8 | You may have dealt with decorators a lot. When doing object-oriented programming, we often use two built-in decorators, @staticmethod and @classmethod. 
In addition, if you have used the [click](https://click.palletsprojects.com/en/7.x/) module, decorators will not be unfamiliar to you. The best-known parameter definition interface in click, `@click.option(...)`, is implemented using decorators. 9 | 10 | In addition to using decorators, we often need to write some decorators ourselves. In this article, I will share some tips about decorators from two angles: `best practice` and `common errors`. 11 | 12 | ## Best Practices 13 | 14 | ### 1. Try to implement decorators with classes 15 | 16 | Most decorators are based on functions and [closures](https://en.wikipedia.org/wiki/Closure_(computer_programming)), but this is not the only way to make decorators. In fact, Python has only one requirement for whether an object can be used in the form of a decorator (`@decorator`): **decorator must be a "callable" object**. 17 | 18 | ```python 19 | # Use callable to detect whether an object is "callable" 20 | >>> def foo(): pass 21 | ... 22 | >>> type(foo) 23 | <class 'function'> 24 | >>> callable(foo) 25 | True 26 | ``` 27 | 28 | Functions are naturally "callable" objects. But in addition to functions, we can also make any class "callable". The method is very simple: all it takes is implementing the `__call__` magic method on the custom class. 29 | 30 | ``` 31 | class Foo: 32 | def __call__(self): 33 | print("Hello, __call__") 34 | 35 | foo = Foo() 36 | 37 | # OUTPUT: True 38 | print(callable(foo)) 39 | # Call foo instance 40 | # OUTPUT: Hello, __call__ 41 | foo() 42 | ``` 43 | 44 | Based on this feature, we can easily use classes to implement decorators. 45 | 46 | The following code will define a decorator named `@delay(duration)`. The function decorated with it will wait an additional `duration` seconds before each execution. At the same time, we also want to provide users with an `eager_call` interface that executes immediately, without waiting. 
47 | 48 | ```python 49 | import time 50 | import functools 51 | 52 | 53 | class DelayFunc: 54 | def __init__(self, duration, func): 55 | self.duration = duration 56 | self.func = func 57 | 58 | def __call__(self, *args, **kwargs): 59 | print(f'Wait for {self.duration} seconds...') 60 | time.sleep(self.duration) 61 | return self.func(*args, **kwargs) 62 | 63 | def eager_call(self, *args, **kwargs): 64 | print('Call without delay') 65 | return self.func(*args, **kwargs) 66 | 67 | 68 | def delay(duration): 69 | """Decorator: postpone the execution of a function. Provide the .eager_call method for immediate execution 70 | """ 71 | # To avoid defining additional functions, use functools.partial directly to help construct 72 | # DelayFunc example 73 | return functools.partial(DelayFunc, duration) 74 | ``` 75 | 76 | Sample code for how to use the decorator: 77 | 78 | ``` 79 | @delay(duration=2) 80 | def add(a, b): 81 | return a + b 82 | 83 | 84 | # This call will be delayed by 2 seconds 85 | add(1, 2) 86 | # This call will be executed immediately 87 | add.eager_call(1, 2) 88 | ``` 89 | 90 | `@delay(duration)` is a class-based decorator. Of course, if you are very familiar with functions and closures in Python, the above `delay` decorator can actually be implemented using only functions. So why do we use classes to do this? 
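For comparison, a function-only version of `delay` might look like the sketch below. This is an illustration rather than code from the article: the inner names `decorator` and `decorated` are arbitrary, and the `eager_call` interface has to be attached to the function object by hand.

```python
import functools
import time


def delay(duration):
    """Decorator: wait `duration` seconds before each call (closure-based variant)."""
    def decorator(func):
        @functools.wraps(func)
        def decorated(*args, **kwargs):
            time.sleep(duration)
            return func(*args, **kwargs)

        # The extra interface is bolted onto the function object as an attribute
        decorated.eager_call = func
        return decorated
    return decorator


@delay(duration=0.1)
def add(a, b):
    return a + b


print(add(1, 2))             # waits roughly 0.1 seconds first
print(add.eager_call(1, 2))  # runs immediately
```

It works, but the state (`duration`) lives in closure variables and the extension interface is an ad hoc attribute, which is the kind of detail a class keeps in one visible place.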
91 | 92 | Compared with pure functions, I think that decorators implemented using classes have several advantages in **specific scenarios**: 93 | 94 | - When implementing stateful decorators, working with instance attributes is more intuitive and less error-prone than working with closure variables 95 | - When the decorator needs to extend the function with extra interfaces, wrapping the function in a class is easier to maintain than appending attributes directly to the function object 96 | - It is easier to implement an object that is compatible with both the decorator and the context manager protocol (refer to [unittest.mock.patch](https://docs.python.org/3/library/unittest.mock.html#unittest.mock.patch)) 97 | 98 | ### 2. Write a flatter decorator using the wrapt module 99 | 100 | In the process of writing decorators, have you run into anything uncomfortable? Whether you have or not, I certainly have. When writing code, I was particularly bothered by the following two things: 101 | 102 | 1. When implementing decorators with parameters, the nested function code is particularly difficult to write and to read 103 | 2. Because of the difference between functions and class methods, decorators written for the former often cannot be directly applied to the latter 104 | 105 | For example, in the following example, I implemented a decorator that generates random numbers and injects them as function parameters. 
106 | 107 | ```python 108 | import random 109 | 110 | 111 | def provide_number(min_num, max_num): 112 | """Decorator: Randomly generate an integer in the range [min_num, max_num], injected as the first positional parameter of the function 113 | """ 114 | def wrapper(func): 115 | def decorated(*args, **kwargs): 116 | num = random.randint(min_num, max_num) 117 | # Add num as the first parameter and call the function 118 | return func(num, *args, **kwargs) 119 | return decorated 120 | return wrapper 121 | 122 | 123 | 124 | @provide_number(1, 100) 125 | def print_random_number(num): 126 | print(num) 127 | 128 | # Output a random integer from 1-100 129 | # OUTPUT: 72 130 | print_random_number() 131 | ``` 132 | 133 | The `@provide_number` decorator looks good, but it has the two problems I mentioned earlier: **the nesting is deep, and it cannot be used on class methods.** If you use it directly to decorate a class method, the following will occur: 134 | 135 | ``` 136 | class Foo: 137 | @provide_number(1, 100) 138 | def print_random_number(self, num): 139 | print(num) 140 | 141 | # OUTPUT: <__main__.Foo object at 0x104047278> 142 | Foo().print_random_number() 143 | ``` 144 | 145 | The `print_random_number` method of the `Foo` instance outputs the instance `self` instead of the random number `num` we expect. 146 | 147 | This happens because methods and functions work slightly differently: when a method is called, the instance is passed implicitly as the first positional argument. To fix this problem, the `provide_number` decorator must skip the `self` argument hidden in `*args` when modifying the positional parameters of a method, so that `num` is injected in the correct position. 148 | 149 | This is where the [wrapt](https://pypi.org/project/wrapt/) module comes in. The `wrapt` module is a tool library dedicated to helping you write decorators. 
Using it, we can easily modify the `provide_number` decorator to solve both problems: *"deep nesting"* and *"cannot be used universally"*. 150 | 151 | ```python 152 | import wrapt 153 | 154 | def provide_number(min_num, max_num): 155 | @wrapt.decorator 156 | def wrapper(wrapped, instance, args, kwargs): 157 | # Parameter meaning: 158 | # 159 | # - wrapped: the decorated function or class method 160 | # - instance: 161 | # - If the decorated object is an ordinary instance method, the value is the class instance 162 | # - If the decorated object is a classmethod, the value is the class 163 | # - If the decorated object is a class/function/static method, the value is None 164 | # 165 | # - args: positional parameters when calling (note that there is no * symbol) 166 | # - kwargs: keyword parameters when calling (note that there is no ** symbol) 167 | # 168 | num = random.randint(min_num, max_num) 169 | # No need to care whether wrapped is a class method or an ordinary function, just prepend the parameter 170 | args = (num,) + args 171 | return wrapped(*args, **kwargs) 172 | return wrapper 173 | 174 | # <... the code that applies the decorator is omitted ...> 175 | 176 | # OUTPUT: 48 177 | Foo().print_random_number() 178 | ``` 179 | 180 | Decorators written using the `wrapt` module have the following advantages over the original: 181 | 182 | - Fewer nesting levels: Use `@wrapt.decorator` to reduce two levels of nesting to one level 183 | - Simpler: when dealing with positional and keyword parameters, you can ignore special cases such as class instances 184 | - More flexibility: by checking the value of `instance`, it is easier to make the decorator universal 185 | 186 | ## Common errors 187 | 188 | ### 1. "Decorator" is not "Decorator Pattern" 189 | 190 | ["Design Pattern"](https://en.wikipedia.org/wiki/Software_design_pattern) is a word well-known in the computer world. 
If you are a Java programmer and you don’t understand the design patterns at all, then I bet you will be going through a very difficult job interview process. 191 | 192 | But when writing Python, we rarely talk about "design patterns." Although Python is also an object-oriented programming language, its [duck type](https://en.wikipedia.org/wiki/Duck_typing) design and excellent dynamic characteristics determine that most of the design patterns come to us It is not a necessity. Therefore, many Python programmers may not have actually applied several design patterns after working for a long time. 193 | 194 | However, ["Decorator Pattern"](https://en.wikipedia.org/wiki/Decorator_pattern) is an exception. Because Python's "decorator" and "decorator pattern" have the same name, I have heard more than once that they are regarded as the same thing, thinking that using "decorator" is to practice the "decorator pattern". But in fact, **they are two completely different things.** 195 | 196 | "Decorator pattern" is a programming technique derived entirely from "object-oriented". It has several key components: **a unified interface definition**, **several classes that follow the interface**, and layer-by-layer packaging between classes and classes. In the end, they form a "decoration" effect together. 197 | 198 | The "decorator" and "object-oriented" in Python do not have any direct connection, **it can just be a trick between functions and functions.** In fact, the "decorator" does not provide some irreplaceable function, it is just a ["syntactic sugar"](https://en.wikipedia.org/wiki/Syntactic_sugar). 
The following code uses the decorator: 199 | 200 | ```python 201 | @log_time 202 | @cache_result 203 | def foo(): pass 204 | ``` 205 | 206 | It is essentially equivalent to the following: 207 | 208 | ``` 209 | def foo(): pass 210 | 211 | foo = log_time(cache_result(foo)) 212 | ``` 213 | 214 | **The biggest contribution of the decorator is that it allows us to write more intuitive and easy-to-read code in certain scenarios**. It is just "sugar", not a complicated programming pattern from the object-oriented world. 215 | 216 | > Hint: There is an [Example of Implementing Decorator Pattern](https://wiki.python.org/moin/DecoratorPattern) on the official Python wiki. You can read this example to better understand it. 217 | 218 | ### 2. Remember to decorate inner functions with functools.wraps() 219 | 220 | The following is a simple decorator designed to print how long a function call takes: 221 | 222 | ```python 223 | import random 224 | import time 225 | 226 | def timer(wrapped): 227 | """Decorator: record and print how long the function takes""" 228 | def decorated(*args, **kwargs): 229 | st = time.time() 230 | ret = wrapped(*args, **kwargs) 231 | print('execution take: {} seconds'.format(time.time()-st)) 232 | return ret 233 | return decorated 234 | 235 | 236 | @timer 237 | def random_sleep(): 238 | """Random sleep for a little while""" 239 | time.sleep(random.random()) 240 | ``` 241 | 242 | Although there is no error in the `timer` decorator, using it to decorate a function destroys the function's original metadata. In other words, you can no longer get the name and docstring of the `random_sleep` function correctly. They are all replaced by the values of the inner function `decorated`: 243 | 244 | ```python 245 | print(random_sleep.__name__) 246 | # Output 'decorated' 247 | print(random_sleep.__doc__) 248 | # Output None 249 | ``` 250 | 251 | Although this is only a minor problem, it can sometimes lead to bugs that are hard to detect. 
Fortunately, the standard library module functools provides a solution. You only need to decorate the inner `decorated` function with another decorator when you define your decorator. 252 | 253 | It sounds a bit ridiculous, but it is actually just one new line of code: 254 | 255 | ```python 256 | def timer(wrapped): 257 | # Copy the real name and docstring of the wrapped function onto decorated 258 | @functools.wraps(wrapped) 259 | def decorated(*args, **kwargs): 260 | # <...> omitted 261 | return decorated 262 | ``` 263 | 264 | After this change, the `timer` decorator no longer affects the metadata of the function it decorates. 265 | 266 | ```python 267 | print(random_sleep.__name__) 268 | # Output 'random_sleep' 269 | print(random_sleep.__doc__) 270 | # Output 'Random sleep for a little while' 271 | ``` 272 | 273 | ### 3. Remember to use nonlocal when modifying outer variables 274 | 275 | Decorators are an advanced application of function objects. In the process of writing decorators, you will often encounter situations where inner functions need to modify outer function variables. Just like this decorator: 276 | 277 | ```python 278 | import functools 279 | 280 | def counter(func): 281 | """Decorator: record and print the number of calls """ 282 | count = 0 283 | @functools.wraps(func) 284 | def decorated(*args, **kwargs): 285 | # Accumulation of times 286 | count += 1 287 | print(f"Count: {count}") 288 | return func(*args, **kwargs) 289 | return decorated 290 | 291 | @counter 292 | def foo(): 293 | pass 294 | 295 | foo() 296 | ``` 297 | 298 | In order to count the number of function calls, we need to modify the value of the `count` variable defined by the outer function inside the `decorated` function. 
However, the above code is problematic, and the interpreter will report an error when executing it: 299 | 300 | ```raw 301 | Traceback (most recent call last): 302 | File "counter.py", line 22, in <module> 303 | foo() 304 | File "counter.py", line 11, in decorated 305 | count += 1 306 | UnboundLocalError: local variable 'count' referenced before assignment 307 | ``` 308 | 309 | This error is caused by the nested scopes of the `counter` and `decorated` functions. 310 | 311 | Because `count` is assigned inside `decorated` (`count += 1`), the interpreter treats it as a local variable of `decorated` and does not know that it refers to the variable defined in the outer scope. When the statement runs, it looks for `count` in the local scope, finds that it has no value yet, and throws the error. 312 | 313 | To solve this problem, we need to tell the interpreter through the `nonlocal` keyword: **"the count variable does not belong to the current local scope, go outside and find it"**. With that, the previous error goes away. 314 | 315 | ```python 316 | def decorated(*args, **kwargs): 317 | nonlocal count 318 | count += 1 319 | # <... omitted...> 320 | ``` 321 | 322 | > Hint: If you want to learn more about the history of the nonlocal keyword, you can refer to [PEP-3104](https://www.python.org/dev/peps/pep-3104/) 323 | 324 | ## To sum up 325 | 326 | In this article, I shared with you some tips and small pieces of knowledge about decorators. 
327 | 328 | Summary of some points: 329 | 330 | - All callable objects can be used to implement decorators 331 | - Mix functions and classes to better implement decorators 332 | - The wrapt module is very useful, it can help us write complex decorators with simpler code 333 | - "Decorator" is just syntactic sugar, it is not "Decorator mode" 334 | - The decorator will change the original signature of the function, you need `functools.wraps` 335 | - When modifying variables of outer functions in inner functions, you need to use the `nonlocal` keyword 336 | -------------------------------------------------------------------------------- /Python/07. Exception-handling.md: -------------------------------------------------------------------------------- 1 | # Exception handling 2 | 3 | If you program in Python, you cannot avoid exceptions, because exceptions are everywhere in the language. For example, when you press `ctrl+c` to exit during script execution, the interpreter will generate a `KeyboardInterrupt` exception. And `KeyError`, `ValueError`, `TypeError`, etc. are old friends everywhere in daily programming. 4 | 5 | Exception handling consists of two parts: "capture" and "throw". "Capture" refers to the use of `try...except` to wrap a specific sentence to properly complete the error process. The proper use of `raise` to actively "throw" exceptions is an essential part of elegant code. 6 | 7 | In this article, I will share 3 good habits related to exception handling. 8 | 9 | ## Three good habits 10 | 11 | ### 1. Only do the most accurate exception catching 12 | 13 | If you don't understand the abnormal mechanism enough, you will inevitably feel a natural fear of it. You may think: *Exception is a bad thing. A good program should catch all exceptions and let everything run smoothly.* The code written with this idea usually contains a large amount of ambiguous exception catching logic. 
14 | 15 | Let's use an executable script as an example: 16 | 17 | ```python 18 | # -*- coding: utf-8 -*- 19 | import requests 20 | import re 21 | 22 | 23 | def save_website_title(url, filename): 24 | """Get the page title of an address, and then write it to a file 25 | 26 | :returns: True if saved successfully, otherwise print error, return False 27 | """ 28 | try: 29 | resp = requests.get(url) 30 | obj = re.search(r'<title>(.*)</title>', resp.text) 31 | if not obj: 32 | print('save failed: title tag not found in page content') 33 | return False 34 | 35 | title = obj.grop(1) 36 | with open(filename, 'w') as fp: 37 | fp.write(title) 38 | return True 39 | except Exception: 40 | print(f'save failed: unable to save title of {url} to {filename}') 41 | return False 42 | 43 | 44 | def main(): 45 | save_website_title('https://www.qq.com', 'qq_title.txt') 46 | 47 | 48 | if __name__ == '__main__': 49 | main() 50 | ``` 51 | 52 | The `save_website_title` function in the script does several things. It first fetches the content of the webpage over the network, then extracts the title with a regular expression, and finally writes the title to a local file. Two of these steps are error-prone: the **network request** and the **local file operation**. So in the code, we use one big `try... except` block to wrap all of these steps. **Safety first**. 53 | 54 | So, what's the problem with this concise and easy-to-understand code? 55 | 56 | If you happen to have a computer with Python installed nearby, try running the above script. You will find that it cannot be executed successfully. You will also find that no matter how you change the URL and the target file, the program still reports *"save failed: unable to..."*. Why? 57 | 58 | The problem lies in this huge try...except block. If you look at the code very carefully, you will find that when writing the function, I made a **small mistake**. 
I mistyped the method that returns the matched string as `obj.grop(1)`, which is missing a `u` (it should be `obj.group(1)`). 59 | 60 | But because of the overly large and ambiguous exception catch, the `AttributeError` that should have been thrown for the wrong method name was swallowed. This adds unnecessary trouble to our debugging process. 61 | 62 | The purpose of exception catching is not to catch as many exceptions as possible. If we had insisted from the beginning on **only doing the most accurate exception catching**, such a problem would not have happened at all. Accurate catching means: 63 | 64 | - Always wrap only those statements that may actually throw exceptions 65 | - Try to catch only the exact exception type, not the ambiguous `Exception` 66 | 67 | According to this principle, our example should be changed to this: 68 | 69 | ```python 70 | from requests.exceptions import RequestException 71 | 72 | 73 | def save_website_title(url, filename): 74 | try: 75 | resp = requests.get(url) 76 | except RequestException as e: 77 | print(f'save failed: unable to get page content: {e}') 78 | return False 79 | 80 | # The regex search itself should not throw an exception, so it needs no try block. 81 | # If group is mistyped as grop, the program will tell us immediately 82 | # through an AttributeError. 83 | obj = re.search(r'<title>(.*)</title>', resp.text) 84 | if not obj: 85 | print('save failed: title tag not found in page content') 86 | return False 87 | title = obj.group(1) 88 | 89 | try: 90 | with open(filename, 'w') as fp: 91 | fp.write(title) 92 | except IOError as e: 93 | print(f'save failed: unable to write to file {filename}: {e}') 94 | return False 95 | else: 96 | return True 97 | ``` 98 | 99 | ### 2. Don't let exceptions destroy abstract consistency 100 | 101 | About four or five years ago, I was working on a back-end API project for a mobile application. 
If you also have experience in developing back-end APIs, then you must know that such systems need to develop a set of ""API error code specifications"" to provide convenience for clients to handle call errors. 102 | 103 | An error code returned probably looks like this: 104 | 105 | ```javascript 106 | // HTTP Status Code: 400 107 | // Content-Type: application/json 108 | { 109 | "code": "UNABLE_TO_UPVOTE_YOUR_OWN_REPLY", 110 | "detail": "You cannot recommend your reply" 111 | } 112 | ``` 113 | 114 | After formulating the error code specification, the next task is how to implement it. The project at that time used the Django framework, and Django's error page was implemented using an exception mechanism. For example, if you want a request to return a 404 status code, then just execute `raise Http404` during the request processing. 115 | 116 | Therefore, we naturally get inspiration from Django. First, we defined the error code exception class in the project: `APIErrorCode`. Then according to the "error code specification", write a lot of error codes that inherit this class. When you need to return an error message to the user, you only need to do a `raise` to get it. 117 | 118 | ```python 119 | raise error_codes.UNABLE_TO_UPVOTE 120 | raise error_codes.USER_HAS_BEEN_BANNED 121 | ... 122 | ``` 123 | 124 | Unsurprisingly, everyone likes to return error codes in this way. Because it is very convenient to use, no matter how deep the call stack is, as long as you want to return an error code to the user, just call `raise error_codes.ANY_THING`. 125 | 126 | Over time, the project has become larger and larger, and there are more and more places where APIErrorCode is thrown. One day, when I was about to reuse a low-level image processing function, I suddenly encountered a problem. 
127 | 128 | I saw a piece of code that made me very tangled: 129 | 130 | ```python 131 | # Inside a module that processes images 132 | # /util/image/processor.py 133 | def process_image(...): 134 | try: 135 | image = Image.open(fp) 136 | except Exception: 137 | # Description (non-project original comment): The exception will be caught by Django's middleware and returned to the front end 138 | # "The format of the uploaded image is incorrect" message 139 | raise error_codes.INVALID_IMAGE_UPLOADED 140 | ... 141 | ``` 142 | 143 | The `process_image` function will try to parse a file object. If the object cannot be opened normally as an image, it will throw an `error_codes.INVALID_IMAGE_UPLOADED (APIErrorCode subclass)` exception, thereby returning the error code JSON to the caller. 144 | 145 | Let me give you this code from scratch. When I originally wrote `process_image`, although I put it in the `util.image` module, the only place where I called this function at the time was "Process POST request for user to upload images". In order to be lazy, I let the function directly throw an APIErrorCode exception to complete the error handling work. 146 | 147 | Let me talk about the problem at that time. At that time I needed to write a batch image script that was running in the background, and it just happened to reuse the functions implemented by the `process_image` function. 
But this is where things went wrong. If I wanted to reuse the function, then: 148 | 149 | - I have to catch an exception named `INVALID_IMAGE_UPLOADED` 150 | - **Even if my pictures are not from user uploads at all** 151 | - I must introduce the `APIErrorCode` exception class as a dependency to catch exceptions 152 | - **Even if my script has nothing to do with Django API** 153 | 154 | **This is the result of an exception class being used at an inconsistent level of abstraction.** The purpose of the APIErrorCode exception class is to express an "error code" that can be directly recognized and consumed by the end user (a person). **It belongs to the highest level of abstraction in the entire project.** But for convenience, we imported and threw it in a low-level module. This breaks the abstract consistency of the `image.processor` module and hurts its reusability and maintainability. 155 | 156 | This kind of situation can be described as "the module throws an exception that is above the abstraction level it belongs to". To avoid such errors, you need to pay attention to the following points: 157 | 158 | - Let the module only throw exceptions that are consistent with the current abstraction level 159 | - For example, the `image.processor` module should throw its own `ImageOpenError` exception 160 | - Wrap and convert exceptions where necessary 161 | - For example, the image processing module's low-level `ImageOpenError` should be converted to the high-level `APIErrorCode` exception close to the high-level abstraction (the view function) 162 | 163 | Modified code: 164 | 165 | ```python 166 | # /util/image/processor.py 167 | class ImageOpenError(Exception): 168 | pass 169 | 170 | 171 | def process_image(...): 172 | try: 173 | image = Image.open(fp) 174 | except Exception as e: 175 | raise ImageOpenError(str(e)) from e 176 | ... 
177 | 178 | # /app/views.py 179 | def foo_view_function(request): 180 | try: 181 | process_image(fp) 182 | except ImageOpenError: 183 | raise error_codes.INVALID_IMAGE_UPLOADED 184 | ``` 185 | 186 | In addition to avoiding exceptions that are higher than the current abstraction level, we should also avoid leaking exceptions that are lower than the current abstraction level. 187 | 188 | If you have used the `requests` module, you may have noticed that the exception it throws when a page request fails is not the original exception of the `urllib3` module it uses underneath, but one that has been wrapped by `requests.exceptions`. 189 | 190 | ```python 191 | >>> try: 192 | ... requests.get('https://www.invalid-host-foo.com') 193 | ... except Exception as e: 194 | ... print(type(e)) 195 | ... 196 | <class 'requests.exceptions.ConnectionError'> 197 | ``` 198 | 199 | This also serves to keep the exception classes at a consistent level of abstraction. The urllib3 module is an underlying implementation detail that the requests module depends on, and this detail may change in future versions. Therefore, the exceptions it throws must be properly wrapped, so that future low-level changes do not affect the error handling code written by users of requests. 200 | 201 | ### 3. Exception handling should not overwhelm 202 | 203 | Earlier we mentioned that exception catching must be precise and the level of abstraction must be consistent. But in the real world, if you strictly follow these rules, you are likely to encounter another problem: **there is so much exception handling logic that the core logic of the code is drowned out**. In practice, the code ends up full of `try`, `except`, and `raise` statements, making the core logic difficult to recognize. 
204 | 205 | Let us look at an example: 206 | 207 | ```python 208 | def upload_avatar(request): 209 | """User uploaded new avatar""" 210 | try: 211 | avatar_file = request.FILES['avatar'] 212 | except KeyError: 213 | raise error_codes.AVATAR_FILE_NOT_PROVIDED 214 | 215 | try: 216 | resized_avatar_file = resize_avatar(avatar_file) 217 | except FileTooLargeError as e: 218 | raise error_codes.AVATAR_FILE_TOO_LARGE 219 | except ResizeAvatarError as e: 220 | raise error_codes.AVATAR_FILE_INVALID 221 | 222 | try: 223 | request.user.avatar = resized_avatar_file 224 | request.user.save() 225 | except Exception: 226 | raise error_codes.INTERNAL_SERVER_ERROR 227 | return HttpResponse({}) 228 | ``` 229 | 230 | This is a view function that processes the user's uploaded avatar. There are three things done in this function, and exception catches are done for each thing. If an exception occurs while doing something, a user-friendly error is returned to the front end. 231 | 232 | Although such a processing flow is reasonable, it is clear that the exception handling logic in the code is a bit "overwhelming." At first glance, the entire code is indented, it is difficult to extract the core logic of the code. 233 | 234 | As early as the 2.5 version, the Python language has provided a tool to deal with such scenarios: "context manager (context manager)". The context manager is a special Python object used with the `with` statement, through which you can make exception handling more convenient. 235 | 236 | So, how to use the context manager to improve our exception handling process? Let's look directly at the code. 

```python
class raise_api_error:
    """Capture the specified exception and raise an ApiErrorCode instead

    :raises: AttributeError if code_name is not valid
    """
    def __init__(self, captures, code_name):
        self.captures = captures
        self.code = getattr(error_codes, code_name)

    def __enter__(self):
        # This method is called when entering the context
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # This method is called when exiting the context.
        # exc_type, exc_val, exc_tb are the type, value, and traceback
        # of the exception raised inside the context (None if there was none)
        if exc_type is None:
            return False

        # Use issubclass rather than ==, so that capturing a base class
        # such as Exception also matches its subclasses
        if issubclass(exc_type, self.captures):
            raise self.code from exc_val
        return False
```

In the code above, we define a context manager named `raise_api_error`. It does nothing when entering the context, but when exiting it checks whether an exception of type `self.captures` was raised inside the context, and if so, replaces it with an `APIErrorCode` exception.
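
The same translation can also be written with the standard library's `contextlib.contextmanager` decorator. The sketch below is a generator-based alternative, not the article's implementation. `ApiError`, `translate_errors`, and `read_avatar` are hypothetical names used only for illustration, with `ApiError` standing in for the article's `error_codes` objects.

```python
from contextlib import contextmanager


class ApiError(Exception):
    """Hypothetical stand-in for the article's APIErrorCode exceptions."""


@contextmanager
def translate_errors(captures, api_error):
    """Re-raise `captures` (or any subclass of it) as `api_error`."""
    try:
        yield
    except captures as exc:
        # `from exc` keeps the original exception available as __cause__
        raise api_error from exc


def read_avatar(files):
    with translate_errors(KeyError, ApiError("AVATAR_FILE_NOT_PROVIDED")):
        return files["avatar"]


try:
    read_avatar({})
except ApiError as err:
    result = (str(err), type(err.__cause__).__name__)

print(result)
# ('AVATAR_FILE_NOT_PROVIDED', 'KeyError')
```

Since `except captures` follows normal `except` semantics, passing a base class such as `Exception` also translates its subclasses, and the original exception stays attached for debugging.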

With this context manager, the whole function becomes clearer and more concise:

```python
def upload_avatar(request):
    """User uploaded new avatar"""
    with raise_api_error(KeyError, 'AVATAR_FILE_NOT_PROVIDED'):
        avatar_file = request.FILES['avatar']

    with raise_api_error(ResizeAvatarError, 'AVATAR_FILE_INVALID'),\
            raise_api_error(FileTooLargeError, 'AVATAR_FILE_TOO_LARGE'):
        resized_avatar_file = resize_avatar(avatar_file)

    with raise_api_error(Exception, 'INTERNAL_SERVER_ERROR'):
        request.user.avatar = resized_avatar_file
        request.user.save()
    return HttpResponse({})
```

> Hint: I suggest reading [PEP 343 - The "with" Statement | Python.org](https://www.python.org/dev/peps/pep-0343/) to learn more about context managers.
>
> The [contextlib](https://docs.python.org/3/library/contextlib.html) module also provides many utility functions and examples for writing context managers.

## To sum up

In this article, I shared three suggestions related to exception handling.
Finally, here are the main points:

- Wrap only the statements that may raise exceptions in `try`, and avoid vague catch-all logic
- Keep a module's exception classes at a consistent abstraction level, wrapping lower-level exceptions where necessary
- Use context managers to simplify repetitive exception handling logic

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Python
| About | Code + Article |
| ------------- | ------------- |
| Variables Guide | [Link](https://github.com/pawangeek/Python-Resources/blob/master/Python/01.%20Variables.md) |
| If-else Guide | [Link](https://github.com/pawangeek/Python-Resources/blob/master/Python/02.%20If-else.md) |
| String and Numbers | [Link](https://github.com/pawangeek/Python-Resources/blob/master/Python/03.%20Number-strings.md) |
| Containers | [Link](https://github.com/pawangeek/Python-Resources/blob/master/Python/04.%20Container-types.md) |
| Functions | [Link](https://github.com/pawangeek/Python-Resources/blob/master/Python/05.%20Functions.md) |
| Decorators | [Link](https://github.com/pawangeek/Python-Resources/blob/master/Python/06.%20Decorators.md) |
| Exception handling | [Link](https://github.com/pawangeek/Python-Resources/blob/master/Python/07.%20Exception-handling.md) |


# Python-Resources

| About | Code | Article |
| ------------- | ------------- | ----------- |
| Python Tricks | [Code](https://github.com/pawangeek/Python-Resources/blob/master/Cool-stuff/Python-Tricks.md) | [Article](https://towardsdatascience.com/do-you-have-these-python-speedup-skills-3fd9e7758765) |
| Funny Tricks | [Code](https://github.com/pawangeek/Python-Resources/blob/master/Cool-stuff/FunnyEggs.md) | [Article](https://medium.com/@pawanjain.432/do-you-know-python-funny-eggs-decd12a0e06c) (Undistributed) |
| Memory Management | [Code](https://github.com/pawangeek/Python-Resources/blob/master/Cool-stuff/Memory-Secrets.md) | [Article](https://towardsdatascience.com/dark-secrets-of-python-memory-allocation-c5d22879d8a9) |
| Merge Lists | [Code](https://github.com/pawangeek/Python-Resources/tree/master/Cool-stuff/Merging-lists.md) | [Article](https://towardsdatascience.com/i-know-eight-ways-to-merge-lists-3f80259bb74f) |
| Built-in Functions | [Code](https://github.com/pawangeek/Python-Resources/blob/master/Cool-stuff/Built-ins.md) | [Article](https://towardsdatascience.com/8-super-easy-to-use-python-built-in-functions-fa873e66085) |
| Float secrets | [Code](https://github.com/pawangeek/Python-Resources/blob/master/Cool-stuff/Float-secrets.md) | [Article](https://towardsdatascience.com/dark-secrets-of-python-floats-dbe2d1b8a85c) |
| Development tricks | [Code](https://github.com/pawangeek/Python-Resources/blob/master/Cool-stuff/Development-tricks.md) | [Article](https://towardsdatascience.com/ten-python-development-skills-998a52f8f7c0) |

# Sql-Resources

| About | Code | Article |
| ------------- | ------------- | ----------- |
| SQL Tricks | [Code](https://github.com/pawangeek/Python-Resources/blob/master/Cool-stuff/Sql-Tricks.md) | [Article](https://towardsdatascience.com/how-to-optimize-sql-queries-742177cd5cc6) |
| SQL Tricks 2 | [Code](https://github.com/pawangeek/Python-Resources/blob/master/Cool-stuff/Sql-tricks2.md) | [Article](https://towardsdatascience.com/how-to-optimize-sql-queries-part-ii-407311784112) |


# Data-Science Related

| About | Code | Article |
| ------------- | ------------- | ----------- |
| Hypothesis Testing | [Code](https://github.com/pawangeek/Python-Resources/blob/master/Cool-stuff/Hypothesis-test.md) | [Article](https://towardsdatascience.com/informative-and-practical-guide-of-hypothesis-testing-9c8a3d9472f7) |

## More to come, just stay tuned 😊

# Upcoming Articles

| About |
| ------------- |
| Pythonic code snippets |
| Activation functions |
| Python Black magics |
| Optics Clustering |
| Python Shared economy |
--------------------------------------------------------------------------------