├── 2Doptimization └── Readme.md ├── 3wayQuickSort └── Readme.md ├── BFS └── Readme.md ├── BackTrackingSubsets └── Readme.md ├── BellmanFord └── Readme.md ├── BinaryIndexTree └── Readme.md ├── BinarySearch └── Readme.md ├── Bit └── Readme.md ├── Container_all_lang └── Readme.md ├── DP ├── 198.py ├── 740.py ├── 746.py ├── 764.py ├── 787.py └── 801.py ├── DP_optimization └── Readme.md ├── DP_sequence_operation └── Readme.md ├── DP_two_players └── Readme.md ├── Decomposition_DFS_BFS └── Readme.md ├── EulerPath └── Readme.md ├── FenwickTree └── Readme.md ├── FloydWarshall_Allpairshortest └── Readme.md ├── Geometry └── Readme.md ├── HungarianAlgorithm └── Readme.md ├── Intervals └── Readme.md ├── KMP └── Readme.md ├── Knapsack └── Readme.md ├── ML_SystemDesign └── Readme.md ├── MaxFlow └── Readme.md ├── MergeSort └── Readme.md ├── MonotonicQueue └── Readme.md ├── MorrisTraversal └── Readme.md ├── Parentheses └── Readme.md ├── ParseStrings └── Readme.md ├── PinterestSystemDesign └── Readme.md ├── Queue └── Readme.md ├── README.md ├── Random └── Readme.md ├── Regex └── Readme.md ├── SegmentTree └── Readme.md ├── ShortPythonHacks └── Readme.md ├── ShortestDistance └── Readme.md ├── SparseTable └── Readme.md ├── Stack └── Readme.md ├── StronglyConnectedComponent ├── Kosaraju.png └── Readme.md ├── SuffixTree └── Readme.md ├── SystemDesign └── Readme.md ├── TopK_Greedy └── Readme.md ├── TopologicalSort └── Readme.md ├── TreeDP └── Readme.md ├── TreeTraversal └── Readme.md ├── Trie └── Readme.md ├── TwoPointers └── Readme.md ├── UnionFind └── Readme.md ├── WaveletTree └── Readme.md ├── backtrack └── Readme.md ├── heap ├── 215.py ├── 239.py ├── 313.py └── Readme.md ├── kSum ├── 11.py ├── 135.py ├── 161.py ├── 209MinimumSizeSubarraySum.py ├── 228.py ├── 243.py ├── 26.py ├── 2sum.py ├── 2sum_III.py ├── 325MaximumSizeSubarraySumEqualsk.py ├── 3sum.py ├── 3sum_closest.py ├── 3sum_smaller.py ├── 42.py ├── 4sum.py ├── 4sum_II.py ├── 80.py ├── 88.py └── Readme.md ├── language ├── Cplusplus │ └── Readme.md ├── Go │ └── Readme.md ├── Java │ └── Readme.md ├── Javascript │ └── Readme.md ├── Python │ └── Readme.md └── Readme.md └── other_thoughts └── Readme.md /2Doptimization/Readme.md: -------------------------------------------------------------------------------- 1 | optimization problem with multiple goals (2-D optimization) 2 | === 3 | 4 | Some problems require optimization using multi-dimensional input. 5 | 6 | **LC857. Minimum Cost to Hire K Workers** 7 | 8 | The optimization problem is `max(ratio) * sum(quality)` 9 | 10 | where `ratio` is the wage/quality ratio of workers and `quality` is their quality. 11 | 12 | We start with the K lowest-quality workers first. 13 | Each time, we use the new lowest-quality worker to replace the previous highest-ratio worker. 14 | 15 | **LC787. Cheapest Flights Within K Stops** 16 | 17 | This is Dijisktra with a K-stop constraint. It is not that straightforward because the optimization requires two tasks: 18 | 19 | `min(distance)` and `min(steps)`. 20 | 21 | Dijisktra cares only about the shortest distance. It pops up the node with min distance. 22 | 23 | However, the next node with less steps but **longer** distance should also be considered. 24 | 25 | We create a `mark` set to identify the nodes `(steps, node)`, and push `(distance, steps, node)` into the priority queue each time. 26 | 27 | **LC354. Russian Doll Envelopes** 28 | 29 | The task wants a longest sequence with both `x` and `y` dimension **strictly** increasing. 
30 | 31 | We sort the input list of tuples by its first dimension. The problem becomes the `Longest Increasing Sequence` of finding longest increasing sequence in the second dimension. 32 | 33 | A trick is to sorted by `(x,-y)` in case the some successive `x`s are the same. 34 | 35 | Summary 36 | --- 37 | For optimization problem with multiple goal, we usually apply a greedy-style search: 38 | 39 | * Sort the input by one dimension. 40 | * Cut-and-paste (greedy algorithm) on the second dimension with help of a heap or other tricks. 41 | 42 | which make sures we skip searching all the sub-optimal states. 43 | 44 | -------------------------------------------------------------------------------- /3wayQuickSort/Readme.md: -------------------------------------------------------------------------------- 1 | Quick Sort 2 | === 3 | An array may have many redundant elements - we want to put elements equal to pivot in the middle 4 | 5 | https://www.geeksforgeeks.org/3-way-quicksort-dutch-national-flag/ 6 | 7 | In 3 Way QuickSort, an array arr[l:r+1] is divided in 3 parts: 8 | * arr[l:i] elements less than pivot. 9 | * arr[i:j] elements equal to pivot. 10 | * arr[j:r+1] elements greater than pivot. 11 | 12 | Dutch National Flag Algorithm: 13 | 14 | between `low` and `mid` are the redundant elements equal to the pivot 15 | 16 | ``` 17 | # pivot = 4 18 | # 19 | # 1 2 3 4 4 4 1 2 12 3 7 8 9 20 | # ^ ^ ^ 21 | # low mid high 22 | # 23 | int mid = low; 24 | int pivot = a[high]; 25 | while (mid <= high) 26 | { 27 | if (a[mid]pivot) 32 | swap(&a[mid], &a[high--]); 33 | } 34 | ``` 35 | 36 | **LC 324. Wiggle Sort II** 37 | Given an unsorted array nums, reorder it such that `nums[0] < nums[1] > nums[2] < nums[3]....` 38 | 39 | Solution with [virtual indexing](https://leetcode.com/problems/wiggle-sort-ii/discuss/77677/O(n)%2BO(1)-after-median-Virtual-Indexing) 40 | 41 | ``` 42 | def wiggleSort(self, nums): 43 | """ 44 | :type nums: List[int] 45 | :rtype: void Do not return anything, modify nums in-place instead. 46 | """ 47 | N = len(nums) 48 | if not nums: return [] 49 | 50 | def w(i): 51 | return (1+2*i) % (N|1) 52 | 53 | def swap(i, j): 54 | nums[w(i)], nums[w(j)] = nums[w(j)], nums[w(i)] 55 | 56 | def partition(nums, l, r, x): 57 | i, j, k = 0, 0, r 58 | while j <= k: 59 | if x < nums[w(j)]: 60 | swap(i, j) 61 | i += 1 62 | j += 1 63 | elif nums[w(j)] < x: 64 | swap(j, k) 65 | k -= 1 66 | else: 67 | j += 1 68 | return i, j 69 | 70 | # find the k-th "smallest" element 71 | def find(nums, l, r, k): 72 | if l == r: return nums[w(l)] 73 | if k <= r - l + 1: 74 | x = random.randint(l, r) 75 | i, j = partition(nums, l, r, nums[w(x)]) 76 | if k <= i - l: 77 | return find(nums, l, i - 1, k) 78 | elif k > j - l: 79 | return find(nums, j, r, k - (j - l)) 80 | else: 81 | return nums[w(i)] 82 | return float('inf') 83 | 84 | median = find(nums, 0, N - 1, (N + 1)//2) 85 | 86 | partition(nums, 0, N - 1, median) 87 | ``` 88 | -------------------------------------------------------------------------------- /BFS/Readme.md: -------------------------------------------------------------------------------- 1 | # Breath First Search 2 | ## Basics 3 | There are two styles. 
Straightforward list: 4 | ``` 5 | bfs = [(source1,0),...,(sourceN,0)] 6 | for node,distance in bfs: 7 | if node in destinations: 8 | return 9 | for neighbor in neighborhood(node): 10 | if neighbor not in set(visisted): 11 | bfs += (neighbor, distance+1), 12 | ``` 13 | Using deque (Note that Dijkstra uses a heap instead of a deque) 14 | ``` 15 | Q = collections.deque(sources) 16 | vis = set() 17 | distance = 1 18 | while Q: 19 | n = len(Q) 20 | 21 | for _ in range(n): 22 | node = Q.popleft() 23 | 24 | if node in destinations: 25 | return 26 | 27 | for neighbour in neighborhood(node): 28 | if neighbour not in vis: 29 | vis.add(neighbor) 30 | Q.append(neighbor) 31 | distance += 1 32 | ``` 33 | Both Dijkstra and BFS need to keep a visisted nodes set, both modify the list/deque inside the iterations, both check if visisted node is THE destination as the FIRST step inside the iterations. 34 | 35 | **Multiple sources & destinations** 36 | The code above (both styles) can be trivially extended to multiple sources and multiple destinations case. So, don't run the algorithm for each source, you simply add the all sources to the initial list/deque. 37 | 38 | ## Bi-directional 39 | BFS starting from source and destination at the same time. The time complexity decreases from `O(k^d)` to `O(k^(d/2) + k^(d/2))` where `k` is the average node degree and `d` is the depth of the one-directional BFS search. 40 | 41 | Notes: 42 | * Check intersection at **boundry** only (all the visited nodes can be discarded) 43 | * Use `set()` instead of `list` to store the nodes at each layer, for the convenience of checking intersection. 44 | * Always expand from the side with fewer nodes to save time 45 | 46 | **LC 127. Word Ladder** 47 | 48 | Given two words (beginWord and endWord), and a dictionary's word list, find the length of shortest transformation sequence from beginWord to endWord, such that: 49 | 50 | Only one letter can be changed at a time. 51 | Each transformed word must exist in the word list. Note that beginWord is not a transformed word. 52 | 53 | ``` 54 | def ladderLength(self, beginWord, endWord, wordList): 55 | def adj_word(w): 56 | w2 = list(w) 57 | for i in range(len(w)): 58 | for j in range(26): 59 | w2[i] = chr(ord('a') + j) 60 | w2_str = "".join(w2) 61 | if w2_str != w: 62 | yield w2_str 63 | w2[i] = w[i] 64 | 65 | vis = set([beginWord, endWord]) 66 | allword = set(wordList) 67 | 68 | if endWord not in allword: return 0 69 | 70 | # Use two sets to store the layers (instead of list) 71 | Q1, Q2 = set([beginWord]), set([endWord]) 72 | 73 | steps = 0 74 | while Q1 and Q2: 75 | tmp = set() 76 | for w in Q1: 77 | for W in adj_word(w): 78 | if W in allword: 79 | # Check intersection at the boundary only!! 80 | if W in Q2: 81 | return steps + 2 82 | if W not in vis: 83 | vis.add(W) 84 | tmp.add(W) 85 | Q1 = tmp 86 | steps += 1 87 | 88 | # Always expand the side with fewer nodes!! 89 | if len(Q2) < len(Q1): 90 | Q1, Q2 = Q2, Q1 91 | return 0 92 | ``` 93 | -------------------------------------------------------------------------------- /BackTrackingSubsets/Readme.md: -------------------------------------------------------------------------------- 1 | ## A general approach to backtracking questions (Subsets, Permutations, Combination Sum, Palindrome Partitioning etc.) 2 | 3 | Many problems ask for generating all subsets of a set, all permutations or all combinations of candidates etc. 4 | The idea is to use backtracking. Here is a template to do this quickly. 5 | 6 | **LC 78.** Subsets. 
return all possible subsets (the power set). 7 | 8 | Python: 9 | ``` 10 | def subsets(self, nums): 11 | ret = [] 12 | def add(path, nums, i): 13 | ret.append(path.copy()) # append the current set 14 | while i < len(nums): 15 | path.append(nums[i]) # if nums[i] is in the list 16 | add(path, nums, i + 1) 17 | path.pop() # if nums[i] is not in the list 18 | i += 1 19 | add([], nums, 0) 20 | return ret 21 | ``` 22 | 23 | C++ 24 | ``` 25 | vector> subsets(vector& nums) { 26 | vector path; 27 | vector> ret; 28 | find(nums, path, ret, 0); 29 | return ret; 30 | } 31 | 32 | void find(vector& nums, vector& path, vector>& ret, int i) { 33 | ret.push_back(vector(path.begin(), path.end())); 34 | for (int j = i; j < nums.size(); ++j) { 35 | path.push_back(nums[j]); 36 | find(nums, path, ret, j + 1); 37 | path.pop_back(); 38 | } 39 | } 40 | ``` 41 | 42 | **LC 47. Permutations II** Return all possible unique permutations of a list which might contain duplicates. 43 | ``` 44 | def permuteUnique(self, nums): 45 | ret = [] 46 | N = len(nums) 47 | nums = sorted(nums) 48 | 49 | def add(path, nums, vis): 50 | if len(path) == N: ret.append(path.copy()) 51 | else: 52 | for i in range(N): 53 | # If the number is a duplicate, it's left number must have been added. 54 | # So we avoid adding k out of n numbers (k < n) for comb(n, k) times. 55 | # Instead, only the leftmost k are added into the results. 56 | if vis[i] or i > 0 and nums[i] == nums[i-1] and not vis[i - 1]: continue 57 | 58 | path.append(nums[i]) 59 | vis[i] = True 60 | 61 | add(path, nums, vis) 62 | 63 | vis[i] = False 64 | path.pop() 65 | add([], nums, [False] * N) 66 | return ret 67 | ``` 68 | 69 | **LC 39. Combination Sum** Find all unique combinations in candidates where the candidate numbers sums to target. 70 | ``` 71 | def combinationSum(self, nums, target): 72 | nums = sorted(nums) 73 | ret = [] 74 | def find(path, cur_sum, j): 75 | if cur_sum == target: 76 | ret.append(path.copy()) 77 | elif cur_sum > target: 78 | return 79 | for i in range(j, len(nums)): 80 | path.append(nums[i]) 81 | find(path, cur_sum + nums[i], i) # allow access to the same candidate for multiple times 82 | path.pop() 83 | find([], 0, 0) 84 | return ret 85 | ``` 86 | 87 | **LC 131. Palindrome Partitioning** Return every partition a string such that each substring of the partition is a palindrome. Ex. "abb" => `['a', 'bb']` + `['a', 'b', 'b']` 88 | ``` 89 | def partition(self, s): 90 | ret = [] 91 | def find(s, path): 92 | if not s: 93 | ret.append(path.copy()) 94 | for i in range(1, len(s) + 1): 95 | if s[:i] == s[i-1::-1]: 96 | find(s[i:], path + [s[:i]]) 97 | find(s, []) 98 | return ret 99 | ``` 100 | 101 | This problem has another version: return the minimum number of cuts s.t. each cut is a palindrome. DP solution: keep extending the last cut, and update the dp[j] if last cut reaches index j. 102 | 103 | **LC 784. Letter Case Permutation** Transform every letter individually to be lowercase or uppercase, return all combinations. 104 | Input: S = "a1b2" 105 | Output: `["a1b2", "a1B2", "A1b2", "A1B2"]` 106 | 107 | ``` 108 | def letterCasePermutation(self, S): 109 | ret = [] 110 | def change(ret, S, i): 111 | while i < len(S) and S[i].isdigit(): i += 1 112 | ret.append("".join(S)) 113 | for j in range(i, len(S)): 114 | if S[j].isalpha(): 115 | S[j] = S[j].upper() 116 | change(ret, S, j + 1) 117 | S[j] = S[j].lower() 118 | change(ret, list(S.lower()), 0) 119 | return ret 120 | ``` 121 | 122 | **LC 320. 
Generalized Abbreviation** 123 | Input: "word" 124 | Output:`["word", "1ord", "w1rd", "wo1d", "wor1", "2rd", "w2d", "wo2", "1o1d", "1or1", "w1r1", "1o2", "2r1", "3d", "w3", "4"]` 125 | ``` 126 | def generateAbbreviations(self, word): 127 | def dfs(ret, path, word, i): 128 | if i == len(word): 129 | ret.append("".join(path)) 130 | else: 131 | # add a letter 132 | path += word[i], 133 | dfs(ret, path, word, i + 1) 134 | path.pop() 135 | # add a number only at the beginning or with a previous letter (no adjacent numbers) 136 | if not path or path[-1].isalpha(): 137 | for j in range(i, len(word)): 138 | path += str(j - i + 1), 139 | dfs(ret, path, word, j + 1) 140 | path.pop() 141 | ret = [] 142 | dfs(ret, [], word, 0) 143 | return ret 144 | ``` 145 | -------------------------------------------------------------------------------- /BellmanFord/Readme.md: -------------------------------------------------------------------------------- 1 | Bellman Ford Algorithm 2 | === 3 | Single source shortest path with complexity of O(VE) 4 | 5 | The naive implementation: 6 | ``` 7 | for (int i = 0; i < V; i++) 8 | dist[i] = INT_MAX; 9 | dist[src] = 0; 10 | 11 | for (int i = 1; i <= V-1; i++) 12 | { 13 | for (int j = 0; j < E; j++) 14 | { 15 | int u = graph->edge[j].src; 16 | int v = graph->edge[j].dest; 17 | int weight = graph->edge[j].weight; 18 | if (dist[u] != INT_MAX && dist[u] + weight < dist[v]) 19 | dist[v] = dist[u] + weight; 20 | } 21 | } 22 | 23 | //check for negative-weight cycles 24 | ... 25 | ``` 26 | 27 | A better implementation: 28 | 29 | we can push the nodes into a [deque](https://blog.csdn.net/u014800748/article/details/44059993) 30 | Use `inq` to mark nodes in the deque. 31 | Push new nodes into the deque only if its dist gets "relaxed" and it's not in the deque. 32 | 33 | ``` 34 | for(int i=0;iq; 40 | q.push(s); 41 | while(!q.empty()) 42 | { 43 | int u=q.front();q.pop(); 44 | inq[u]=0; 45 | for(int i=0;ie.flow&&d[e.to]>d[u]+e.cost)//松弛操作 49 | { 50 | d[e.to]=d[u]+e.cost; 51 | p[e.to]=G[u][i];//记录父边 52 | a[e.to]=min(a[u],e.cap-e.flow);//更新可改进量,等于min{到达u时候的可改进量,e边的残量} 53 | if(!inq[e.to]){q.push(e.to);inq[e.to]=1;} 54 | } 55 | } 56 | } 57 | ``` 58 | -------------------------------------------------------------------------------- /BinaryIndexTree/Readme.md: -------------------------------------------------------------------------------- 1 | ## Binary Indexed Trees 2 | BIT computes any prefix sum in `O(log n)` time. 3 | 4 | The idea to store the numbers in the way that 5 | * sum of '1', '10', '100' is located at '100' 6 | * sum of '101', '110' is located at '110' 7 | 8 | There are update and sum views. 9 | * The sum view iterates through i -= (i & -i) which removes the last digit 1 in binary presentation. 10 | * The update view iterates through i += (i & -i) which adds the last digit 1 in binary presentation. 11 | For example, 12 | ``` 13 | ^ for update, * for sum 14 | 0(root) 15 | 1, 10, 100*, 1000^ ... 16 | 11, 101^, 110*^, 1001, 1010, ... 17 | 111* 18 | ``` 19 | the update of element with index ‘101’ would be reflected at position 101, 110, 1000, … 20 | 21 | When you sum up from index such as ‘111’ through the `*` path, the update of ‘101’ would have been reflected at ‘110’. 22 | 23 | Essentially, At `100` stores the sum of the left triangle 24 | ``` 25 | 1, 10, 26 | 11, 27 | ``` 28 | At `1000` stores the sum of left triangle 29 | ``` 30 | 1, 10, 100, 31 | 11, 101, 110, 32 | 111 33 | ``` 34 | so on so forth. 35 | 36 | C++ code: leetcode 307. 
Range Sum Query - Mutable 37 | 38 | ``` 39 | class NumArray { 40 | public: 41 | vector bi; 42 | int n; 43 | NumArray(vector nums) { 44 | n = nums.size(); 45 | bi.resize(n + 1, 0); 46 | fill(bi.begin(), bi.end(), 0); 47 | for (int i=0;i= target: # NOTE when not target found, 20 | # equal make sure search ending up with an element GREATER than target. 21 | r = mid 22 | else: 23 | l = mid + 1 24 | # NOW, l=r and array[r]==target 25 | return l 26 | ``` 27 | The first style solves most problems, while another style allows access to `array[r]`. 28 | Useful for problem like "min element in rotated array". 29 | ``` 30 | array.sort() 31 | l, r = 0, len(array) - 1 32 | while l <= r: 33 | mid = (l + r) // 2 34 | 35 | if array[mid]==target: 36 | return mid # return when found target 37 | 38 | # NOTE you can access array[r] in this case, useful for problem like "min element in rotated array" 39 | 40 | if array[mid] > target: 41 | r = mid - 1 42 | else: 43 | l += mid + 1 44 | 45 | # NOW, l=r+1 and target is not found 46 | return l 47 | ``` 48 | 49 | ## Balanced Binary Search Tree 50 | To keep a binary search tree balanced, it should be implemented as Red-Black tree, AVL tree or even skipping lists (probablistics data structure) to maintain the same heights of the branches. 51 | 52 | **LC 220. Contains Duplicate III** 53 | Given an array of integers, find out whether there are two distinct indices i and j in the array such that the absolute difference between `nums[i]` and `nums[j]` is at most t and the absolute difference between i and j is at most k. 54 | 55 | Due to the lack of equivalent container, Python code needs to use Buckets (using collections.OrderedDict). 56 | C++ and Java are good for this problem thanks to `TreeSet` (Java) and `set` (C++). 57 | 58 | C++ solution: 59 | ``` 60 | bool containsNearbyAlmostDuplicate(vector& nums, int k, int t) { 61 | set win; 62 | for (int i = 0; i < nums.size(); ++i) { 63 | if (i > k) win.erase(nums[i - k - 1]); 64 | auto lower = win.lower_bound((long)nums[i] - t); 65 | 66 | # If *lower == nums[i] - t, then almost duplicate due to the lower bound 67 | # Since *lower >= nums[i] - t, so if *lower < nums[i] + t, 68 | # then almost duplicate due to the upper bound 69 | if (lower != win.end() && abs(*lower - nums[i]) <= t) return true; 70 | win.insert(nums[i]); 71 | } 72 | return false; 73 | } 74 | ``` 75 | 76 | Another important note about balanced BST is that the predecessor and successor can be reached in `O(log n)` time. You just start from the root to find the target node, going left or right down to the leaves. The **last** left turn would be associated with the successor and **last** right turn is associated with the predecessor. 77 | 78 | **LC 285. Inorder Successor in BST** 79 | 80 | Given a binary search tree and a node in it, find the in-order successor of that node in the BST. 81 | ``` 82 | def find(root, target, lastleft = None): 83 | if not root: return lastleft 84 | if root.val <= target.val: # the equality here skips the target node itself 85 | return find(root.right, target, lastleft) 86 | return find(root.left, target, root) 87 | ``` 88 | 89 | ## More advanced 90 | Binary search is a universal treatment for problems with **monotonic solutions**. The KEY is to identify the monotonic natural of these problems. Usually, if the solution is among a ordered list, the answer would be `Yes` before a certain number and `N` after a certain number. And you job to find the last `Y` or the first `N`. 
91 | ``` 92 | 1 2 3 4 5 6 7 93 | Y Y Y N N N N 94 | ``` 95 | It is usually fast to check the correctness of your solution. So you can binary-search the solution, if wrong, just jump to the middle one. 96 | 97 | **"410. Split Array Largest Sum"**: given the largest sum, you can check if spliting the array into k subarrays are possible in `O(n)` time. 98 | 99 | **"786. K-th Smallest Prime Fraction"**: given the smallest prime fraction, you can check if there are `K` pairs whose prime fraction is smaller than the given value in `O(n)` time. 100 | 101 | **"373. Find K Pairs with Smallest Sums"**: given the smallest sum, you can check if there are `K` pairs whose sum is smaller in `O(n)` time. 102 | 103 | Other problems: 104 | * LC774 Minimize Max Distance to Gas Station 105 | * LC378 Kth Smallest Element in a Sorted Matrix 106 | * LC668 Kth Smallest Number in Multiplication Table 107 | * LC719 Find K-th Smallest Pair Distance 108 | 109 | On the other hand, Priority Queue can be used for many of such problems. Details see this post . 110 | 111 | ## Some details about `bisect_right` and `bisect_left` 112 | Source code 113 | ``` 114 | # bisect_left() 115 | while lo < hi: 116 | mid = (lo+hi)//2 117 | if a[mid] < x: lo = mid+1 118 | # MOVE hi in the case of equality 119 | # 0 1 2 3 3 3 4 5 120 | # ^ 121 | # hi would end up staying here when searching for 3. 122 | else: hi = mid 123 | return lo 124 | ``` 125 | `bisect_right()` is similar to `bisect_left()`, but returns an insertion point which comes after (to the right of) any existing entries of x in a. 126 | ``` 127 | # bisect_right() 128 | while lo < hi: 129 | mid = (lo+hi)//2 130 | # MOVE lo in the case of equality 131 | if a[mid] <= x: lo = mid+1 132 | # 0 1 2 3 3 3 4 5 133 | # ^ 134 | # hi would end up staying here when searching for 3. 135 | else: hi = mid 136 | return lo 137 | ``` 138 | 139 | ## Some thinking about the binary search "space" 140 | 141 | **LC300. Longest Increasing Subsequence** 142 | Given an unsorted array of integers, find the length of longest increasing subsequence. 143 | 144 | Natually, we would keep a sorted list of the previous elements, and find a location to insert the new element. But indeed, it is not easy to see the binary search solution at the first glampse. 145 | 146 | The key is to think about the solution space that we do NOT need to search. Like in the two-pointer problems, a certain set of solutions are inferior than the solutions we have searched. So we just skip them to save time. 147 | 148 | ``` 149 | INPUT: 1 3 5 4 2 150 | ^ 151 | Before: 1 3 5 (replace 5 by 4) 152 | After: 1 3 4 153 | 154 | INPUT: 1 3 5 4 2 155 | ^ 156 | Before: 1 3 4 157 | After: ?? 158 | ``` 159 | 160 | When 4 comes in, 5 can be replaced, because 134 has same length as 135, but with 4 at the end is better for future match. During the search time, 5 is skipped to save time. 161 | 162 | However, when 2 comes in, where should we put it? Replace 3 and 4?? No, we can not compare 134 or 12 now. So we should keep them both. It means we should keep 1, 12, 134. 163 | 164 | We keep a list of ending points for longest increasing subsequences (LIS) of length 1, 2, .... When a new number comes in, we locate the number it can replace by binary search or just append it as the end of a new LIS. 165 | 166 | ``` 167 | int lengthOfLIS(vector& nums) { 168 | vector res; 169 | for(int i=0; i K` and minimize `i - j`. 199 | 200 | Given `B[i1]`, we find `B[j]` which is smaller than `B[i1]-K`, i.e. the `******` part. 
201 | Among all these `j`s, we want the max `j` to make the subarray shortest. 202 | 203 | ``` 204 | index j with an increasing order of B[j] 205 | 206 | B[i1] 207 | V 208 | *****######..... <--- B[i2] (the new input with i2 > i1) 209 | ^ ^ 210 | | B[i1]-K 211 | | 212 | B[i2] - K, if B[i2] < B[i1], we skip it because i2 - max(***) > i1 - max(*****) 213 | otherwise, the chance is we find a better j in the ##### part 214 | 215 | ****** part for B[j] < B[i1] - K 216 | ...... part for B[j] > B[i1] 217 | ``` 218 | 219 | Suppose we have found the max `j` in the `******` part for `B[i1]`, and `B[i2]` comes in now. 220 | 221 | The key is to skip the inferior solution space that we do NOT need to search. 222 | 223 | * For all `******` part, if `B[i2] < B[i1]`, then `B[i2] - K` points to somewhere in the `******` part, say the prefix `***` are those `j`s such that `B[j] < B[i2] - K`. The max of `j` in `***` must be smaller than the max of j in `*****`, which means the distance from them to `i2` can only be longer; Otherwise, if `B[i2] >= B[i1]`, the target `j` would be somewhere in the `#####` part. So we do not need to search `******` part. 224 | 225 | * The `j`s in `.....` part are worse than `i1`, because such `B[j]`s are larger than `B[i1]` and further away from index `i2`. Skip them too. 226 | 227 | So essentially, only the `#####` part is worth searching for `B[i2]`. All the other parts are irrelavent. A deque fits our purpose, it pops the `*****` and `.....` part as `B[i1]` comes in at the first place. 228 | -------------------------------------------------------------------------------- /Bit/Readme.md: -------------------------------------------------------------------------------- 1 | # Bit Operations 2 | 3 | ## Basic operations: 4 | * 1. Flip `~x`. WARNING: Python returns negative numbers because of the proceding 1s. 5 | 6 | * 2. Get the rightmost **set bit** `x & -x` 7 | 8 | * 3. Check if all bit are set: 9 | DO NOT TRY `~x == 0` because you got extra proceding bits. 10 | Do x & (x + 1) INSTEAD!!! 11 | 12 | * 4. Test if `1000..00`, `x&(x-1)==0` 13 | 14 | * 5. Remove the last set bit `x&(x-1)` 15 | 16 | * 6. Number of set bit `__builtin_popcount(int x)` (C++ CPU specific instruction) 17 | 18 | * 7. Remove some bit `A &= ~(1 << bit)` 19 | 20 | * 8. Get all 1-bits `~0` 21 | 22 | * 9. Set a bit `x |= 1 << bits`, Fill all the postitions by `1`s 23 | ``` 24 | x |= x >> 16 25 | x |= x >> 8 26 | x |= x >> 4 27 | x |= x >> 2 28 | x |= x >> 1 29 | ``` 30 | * 10. Get the left-most set bit step3 + `x ^ (x >> 1)` 31 | 32 | WARNING: Python 3 integers aren't represented using the internal CPU representation, so you have to determine the sign yourself! 33 | 34 | **`signbit = int(n < 0)`** 35 | 36 | ## Templates: 37 | 38 | ### A really nice summary 39 | 40 | ### Single number problem by XOR 41 | Given an array of integers, every element appears `k (k > 1)` times except for one, which appears `p` times `(p >= 1, p % k != 0)`. Find that single one. 42 | 43 | We count the number of occurrence of `1`s on each digit. Say, for a particular digit, the input array contains `w*k + p` ONEs at this digit, then, the number which repeats `p` times has a One at this digit. Otherwise, the input array contains `w*k` ONEs, then, the number which repeats `p` times has a ZERO at this digit. 44 | 45 | Note that we do not need to count ZEROs because the input array should has a size mod `k` equal to `p`. 46 | 47 | Java Code for the case `k = 3, p = 1`. 
We need two 32-bits variable to store the occurrence of `1`s on each digit, which ranges from `0b00` to `0b10` (`00->01->10->00`). Note `11` is `00` here. 48 | 49 | ***LC 137. Single Number II*** 50 | 51 | ``` 52 | x1 = 0 53 | x2 = 0 54 | for (int i : nums) { 55 | x2 ^= x1 & i; 56 | x1 ^= i; 57 | mask = ~(x1 & x2); 58 | x2 &= mask; 59 | x1 &= mask; 60 | } 61 | return x1 62 | ``` 63 | 64 | IF there are two numbers, we can divide the input array into two groups. If we know `x1^x2 > 0`, we know they must be different at one bit (ex. the rightmost set bit of `x1^x2`). Then we go ahead to XOR each group. 65 | 66 | ***LC 260. Single Number III*** 67 | 68 | ``` 69 | diff = reduce(lambda x,y: x^y, nums) 70 | diff &= -diff 71 | x1, x2 = 0, 0 72 | for n in nums: 73 | if n & diff: 74 | x1 ^= n 75 | else: 76 | x2 ^= n 77 | return [x1, x2] 78 | ``` 79 | 80 | IF there are three unique numbers `a`, `b`, `c`, and other number appears twice, let `x` be the XOR of all elements, 81 | The last set bit of the XOR of the last bit of `x^a` and `x^b` and `x^c` be the M-th bit. 82 | Then we can show only one of `x^a`, `x^b` and `x^c` has a set M-th bit, the other two have unset M-th bit. (Why?) 83 | Then we can identify a list of elements contain one target number, 84 | and solve the sub-problem of finding the other two among the remaining elements. 85 | 86 | See: 87 | http://zhedahht.blog.163.com/blog/static/25411174201283084246412/ 88 | 89 | ### Bit Masks 90 | Mask allows you to have a small subset (no more than 32 elements for intergers) but could be larger using C++ bit_set or larger numbers in Python. 91 | 92 | Three convenient operations: 93 | ``` 94 | Set union A | B 95 | Set intersection A & B 96 | Set subtraction A & ~B 97 | ``` 98 | First, it allows dynamic programming to know the states quickly. Secondly, the shift of bits usually can be treated as a `O(1)` operation. 99 | 100 | **LC 318. Maximum Product of Word Lengths** Find the maximum value of `length(word[i]) * length(word[j])` where the two words do NOT share common letters. 101 | ``` 102 | def maxProduct(self, words): 103 | d = {} 104 | for w in words: 105 | mask = 0 106 | for c in set(w): 107 | mask |= 1 << (ord(c) - 90) 108 | d[mask] = max(d.get(mask, 0), len(w)) 109 | return max([d[x] * d[y] for x in d for y in d if not x&y] or [0]) 110 | ``` 111 | **LC 187. Repeated DNA Sequences** Find all the 10-letter-long sequences (substrings) that occur more than once in a DNA. 112 | Given s = `"AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"`, 113 | Return: `["AAAAACCCCC", "CCCCCAAAAA"]`. 114 | 115 | So here we have a **shifting mask** to indicate the sub-sequences. 116 | ``` 117 | vector findRepeatedDnaSequences(string s) { 118 | vector ret; 119 | map d; 120 | if (s.size() <= 10) return ret; 121 | int v = 0; 122 | for (int j = 0; j <= 8; ++j) { 123 | v <<= 2; 124 | v |= s[j] == 'A'? 0 : s[j] == 'T'? 1 : s[j] == 'C'? 2 : 3; 125 | } 126 | for (int i = 9; i < s.size(); ++i) { 127 | v = v << 2 & 0xfffff; 128 | v |= s[i] == 'A'? 0 : s[i] == 'T'? 1 : s[i] == 'C'? 2 : 3; 129 | if (d[v]++==1) ret.push_back(s.substr(i - 9, 10)); 130 | } 131 | return ret; 132 | } 133 | ``` 134 | 135 | **847. Shortest Path Visiting All Nodes** Return shortest path that visits every node in a graph (NP-hard problem), but you can visit node, edges multiple times. 
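A standard way to search this state space (a hedged sketch, independent of the solution code that follows, and written as a standalone function rather than the LeetCode method signature) is BFS over `(node, visited-mask)` states, since BFS reaches the fully-visited mask with the fewest edges:

```
from collections import deque

def shortest_path_length(graph):
    # BFS over (node, mask) states; mask records which nodes have been visited.
    n = len(graph)
    if n <= 1:
        return 0
    full = (1 << n) - 1
    # Start from every node at once (multi-source BFS).
    Q = deque((i, 1 << i) for i in range(n))
    vis = set(Q)
    steps = 0
    while Q:
        for _ in range(len(Q)):
            node, mask = Q.popleft()
            if mask == full:
                return steps
            for nxt in graph[node]:
                state = (nxt, mask | (1 << nxt))
                if state not in vis:
                    vis.add(state)
                    Q.append(state)
        steps += 1
    return -1
```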
136 | My blog: 137 | 138 | ``` 139 | def shortestPathLength(self, graph): 140 | df = {} 141 | def find(state, i): 142 | if (state, i) in df: return df[(state, i)] 143 | if state & (state - 1) == 0: return 0 144 | df[(state, i)] = sys.maxsize 145 | for j in graph[i]: 146 | if state & (1<0: 21 | for j in graph[i]: 22 | heapq.heappush(heap, (price+graph[i][j],j,k-1)) 23 | return -1 24 | -------------------------------------------------------------------------------- /DP/801.py: -------------------------------------------------------------------------------- 1 | class Solution: 2 | def minSwap(self, A, B): 3 | """ 4 | :type A: List[int] 5 | :type B: List[int] 6 | :rtype: int 7 | """ 8 | ret = 0 9 | swap = [1]*len(A) 10 | noswap = [0]*len(A) 11 | for i in range(1,len(A)): 12 | x = noswap[i-1]+1 if A[i]>B[i-1] and B[i]>A[i-1] else len(A)+1 13 | y = swap[i-1]+1 if A[i]>A[i-1] and B[i]>B[i-1] else len(A)+1 14 | swap[i] = min(x, y) 15 | print(i,".",swap[i]) 16 | x = noswap[i-1] if A[i]>A[i-1] and B[i]>B[i-1] else len(A)+1 17 | y = swap[i-1] if A[i]>B[i-1] and B[i]>A[i-1] else len(A)+1 18 | noswap[i] = min(x, y) 19 | print(i,".",noswap[i]) 20 | return min(swap[len(A)-1],noswap[len(A)-1]) 21 | 22 | print( Solution().minSwap([0,3,5,8,9],[2,1,4,6,9]) ) 23 | -------------------------------------------------------------------------------- /DP_optimization/Readme.md: -------------------------------------------------------------------------------- 1 | # Dynamic Programming 2 | 3 | After you figure out the reduction rule, you should try to optimize the DP: 4 | 5 | ## Rolling array to save the memory 6 | 7 | You should save memory cost whenever possible. For 2D DP problem, keep asking yourself if the memory cost could be actually 1D. 8 | Try to simulate the DP process in a table and find out the way to do 1D. 9 | 10 | **LC 64. Minimum Path Sum** Given a m x n grid filled with non-negative numbers, find a path from top left to bottom right which minimizes the sum of all numbers along its path. 11 | 12 | The basic DP reduction is 2D 13 | ``` 14 | dp[i,j] = min(dp[i-1,j], dp[i,j-1]) + grid[i][j] 15 | ``` 16 | But actually we just need 1D vector the store the states because 17 | ``` 18 | # A B C This is the old row memo 19 | # V V 20 | # a->b->c This is the mew row memo 21 | # You update b using a and B 22 | ``` 23 | 24 | Python Code: 25 | ``` 26 | if not grid: return 0 27 | m = len(grid[0]) 28 | memo = [0] + [2**31] * (m-1) 29 | for row in grid: 30 | memo[0] = memo[0] + row[0] 31 | for j in range(m-1): 32 | memo[j+1] = min(memo[j:j+2]) + row[j+1] 33 | return memo[-1] 34 | ``` 35 | 36 | **LC 174. Dungeon Game** 37 | A knight gains/loses health entering a room. He dies if health <= 0. Return the knight's minimum initial health to get (N, M) from (0, 0). 38 | 39 | So reversively, we compute the min health at `(i,j)` (before gains/lose health) so the knight can reach (N, M). Obviously, `dp[N-1, M-1] = 1 - dungeon[i, j]`. The reduction rule is 40 | 41 | ``` 42 | dp[i, j] = max(1, min(dp[i+1,j],dp[i,j+1]) - dungeon[i, j] ) 43 | ``` 44 | 45 | After the simplification, we get 46 | ``` 47 | if not dungeon: return 0 48 | n, m = len(dungeon), len(dungeon[0]) 49 | memo = [2**31] * (m-1) + [1] 50 | for i in range(n)[::-1]: 51 | for j in range(m)[::-1]: 52 | memo[j] = max(1, min(memo[j:j+2])-dungeon[i][j]) 53 | return memo[0] 54 | ``` 55 | Yes, we can always use standard DP, but actually one row memo would satisfy the needs. 56 | 57 | **LC 714. 
Best Time to Buy and Sell Stock with Transaction Fee** (and the series of stock transaction problems.) 58 | 59 | Given the daily prices of a stock, you may buy and sell on each day (only sell after you buy, and buy after you sell). 60 | You may complete as many transactions as you like, but you need to pay the transaction fee for each transaction `fee`. 61 | 62 | The max profit at day `i` is 63 | ``` 64 | dp[i] = max(dp[i-1], max(p[i] - p[j] - fee + dp[j-1], for all 0 <= j < i) ) 65 | ``` 66 | which can be simpilified as 67 | ``` 68 | dp[i] = max(dp[i-1], p[i] - fee + max( dp[j-1] - p[j] ) ) 69 | ``` 70 | Let `sell = dp[i]` and `buy = max( dp[j-1] - p[j] )`, only two variables are enough for the DP. 71 | 72 | Python Code: 73 | ``` 74 | def maxProfit(self, prices, fee): 75 | 76 | sell, buy = 0, -float('inf') 77 | for p in prices: 78 | buy = max(buy, sell - p) 79 | sell = max(sell, buy + p - fee) 80 | return sell 81 | ``` 82 | 83 | ## Reduce time complexity by "differentiating" the states 84 | 85 | In some cases, `dp[i]` can be related to `dp[i-1]`. So it's better to use `dp[i] - dp[i-1]` as the real dp state. 86 | After we have derived the reduction rule, we should question that **"How are the dp states related to each other"**? 87 | 88 | 89 | As an extension to **LC 45. Jump Game II**, see **Lintcode climbing-stairs-iii** 90 | 91 | 92 | The question asks how many ways one can jump to n-th steps from 0th step, assuming at the i-th steps, she can jump `[1, n[i]]` steps. 93 | 94 | Suppose there are `x[i]` ways to reach i-th step. Then `x[i] = sum(x[j] for any j < i that j + n[j] >= i)`. 95 | 96 | If we use `x[i]` as the DP state directly, the time complexity would be `O(n^2)` because it involves a ***range operation*** - for each position `j`, we should add `x[j]` to all the `x[i]`s in the range `j+1 <= i <= j+n[j]`. 97 | 98 | But if we differentiate `x[i+1]` and `x[i]`, it is suprisingly simple that `x[i+1]-x[i]` is equal to 99 | * `+x[i]` because pos i can always reach pos i + 1 ( `x[i+1] = x[i] + etc.`) 100 | * `-x[k]` for any pos `k` that can reach `i` but not `i+1` 101 | 102 | So if we use `dp[i+1] = x[i+1] - x[i]`, we just need to add the current `x[i]` to `dp[i + 1]` and add `-x[k]` to some pos `k+n[k]+1` because this is the first position we can't reach from `k` (while pos `k` can reach ``k+n[k]`). 103 | 104 | Note that `x[i] = dp[i] + dp[i-1] + .. + dp[0]` can by maintained dynamically. 105 | 106 | `O(N)` time Python solution: 107 | ``` 108 | def climb_stairs_iii(self, n): 109 | N = len(n) 110 | dp = [1] + [0] * N 111 | xi = 1 112 | for i in range(1, N + 1): 113 | xi = xi + dp[i] 114 | dp[i + 1] += xi 115 | dp[i + n[i] + 1] -= xi 116 | return xi 117 | ``` 118 | 119 | More complicated case, see the [`O(n^2)` solution](https://leetcode.com/problems/guess-number-higher-or-lower-ii/discuss/84826/An-O(n2)-DP-Solution-Quite-Hard.) for **LC 45. Jump Game II** 120 | 121 | 122 | -------------------------------------------------------------------------------- /DP_sequence_operation/Readme.md: -------------------------------------------------------------------------------- 1 | Sequence Operation (Dynamic Programming) 2 | ==== 3 | 4 | Some questions ask for the optimal stragety to operate on an input sequence, which includes: 5 | * **LC312. Burst Balloons** 6 | * **LC546. Remove Boxes** 7 | * **LC488. Zuma Game** 8 | 9 | The key to solve these problems is to reduce from a sequence `dp[i][j]` to a shorter sequence `dp[i][w]` and `dp[w+1][j]`. 
10 | But there are some tricks: 11 | 12 | 13 | **LC312. Burst Balloons** 14 | ---- 15 | 16 | Given n balloons, indexed from 0 to n-1. Each balloon is painted with a number on it represented by array nums. 17 | You are asked to burst all the balloons. 18 | If the you burst balloon i you will get `nums[left] * nums[i] * nums[right]` coins. 19 | Here left and right are adjacent indices of i. 20 | After the burst, the left and right then becomes adjacent. 21 | 22 | Find the maximum coins you can collect by bursting the balloons wisely. 23 | 24 | > Input: [3,1,5,8] 25 | > Output: 167 26 | 27 | ``` 28 | Explanation: nums = [3,1,5,8] --> [3,5,8] --> [3,8] --> [8] --> [] 29 | coins = 3*1*5 + 3*5*8 + 1*3*8 + 1*8*1 = 167 30 | ``` 31 | 32 | Suppose ballon `i` burst last. The left and right ballon at index `l` and `r` would impact it. 33 | ``` 34 | def maxCoins(self, nums): 35 | # l i r 36 | # [1 3 1 5 8 1] 37 | # l i r 38 | # dp[l][r] = max (. , dp[l][i] + dp[i][r] + n[l]*n[i]*n[r]) 39 | 40 | if not nums: return 0 41 | n = [1] + nums + [1] 42 | N = len(n) 43 | dp = [[0] * N for _ in range(N)] 44 | for r in range(N): 45 | for l in range(r-2, -1, -1): 46 | for i in range(l+1, r): # i is neither l nor r. 47 | dp[l][r] = max(dp[l][r], dp[l][i] + dp[i][r] + n[l]*n[i]*n[r]) 48 | return dp[0][N-1] 49 | ``` 50 | 51 | **LC546. Remove Boxes** 52 | --- 53 | 54 | Given several boxes with different colors represented by different positive numbers. 55 | You may experience several rounds to remove boxes until there is no box left. Each time you can choose some continuous boxes with the same color (composed of k boxes, k >= 1), remove them and get k*k points. 56 | Find the maximum points you can get. 57 | 58 | ``` 59 | > [1, 3, 2, 2, 2, 3, 4, 3, 1] 60 | > ----> [1, 3, 3, 4, 3, 1] (3*3=9 points) 61 | > ----> [1, 3, 3, 3, 1] (1*1=1 points) 62 | > ----> [1, 1] (3*3=9 points) 63 | > ----> [] (2*2=4 points) 64 | ``` 65 | 66 | If the sequence `i,j` can be split into non-empty left and right parts, then reduction is easy. 67 | But if `b[i]==b[j]`, we have to consider all segments splitted by `b[i]`. 68 | 69 | ``` 70 | > i w1 w2 j 71 | > [1, 3, 2, 3, 2, 3, 4, 3, 1] 72 | > [*] [*] [*] <== three segments for DP reduction 73 | ``` 74 | 75 | The last removal in this case could be the four 3s in [i,j]. Then DP reduces to 3 segments. Too complicated. 76 | 77 | So, why not just consider the right most 3 at index `w2`, and reduce one 3 at a time? 78 | We need one extra value to store the number of following 3s in each DP state. 
79 | 80 | > `dp[i][j][0] = max(dp[i][w2][1] + dp[w2+1][j-1][0]` 81 | > `,dp[i][w1][1] + dp[w1+1][j-1][0])` 82 | 83 | ``` 84 | def removeBoxes(self, b): 85 | if not b: return 0 86 | N = len(b) 87 | dp = [[[0] * N for j in range(N)] for i in range(N)] 88 | def find(i, j, k): 89 | if j < i: return 0 90 | if i == j: return (k + 1) * (k + 1) 91 | if dp[i][j][k] > 0: return dp[i][j][k] 92 | while j > i and b[j-1] == b[j]: 93 | j -= 1 94 | k += 1 95 | dp[i][j][k] = find(i, j - 1, 0) + (k + 1) * (k + 1) 96 | for w in range(j-1, i-1, -1): 97 | if b[w] == b[j]: 98 | dp[i][j][k] = max(dp[i][j][k], find(i, w, k+1) + find(w+1, j-1, 0)) 99 | return dp[i][j][k] 100 | return find(0, N-1, 0) 101 | ``` 102 | 103 | 104 | -------------------------------------------------------------------------------- /DP_two_players/Readme.md: -------------------------------------------------------------------------------- 1 | DP with two players 2 | === 3 | 4 | There are simulation questions in which two players operate in turns, each optimizing its own goal. 5 | 6 | Reduction rule 7 | --- 8 | If two players' optimization goal are "symmetrical", then only one reduction rule is needed. 9 | 10 | The input range forms a valid DP state. No need to prepare two dp table. One dp table can be used for both players because they are "symmetrical". 11 | 12 | For example, if `dp[i][j]` is the state from `i` to `j` (inclusive-inclusive), a player can only operate on `i` or `j`. 13 | ``` 14 | dp[i][j] = max( - dp[i+1][j], - dp[i][j-1]) 15 | ``` 16 | 17 | LC 1690. Stone Game VII 18 | --- 19 | Alice and Bob take turns to remove either the leftmost stone or the rightmost stone from the row and receive points equal to the sum of the remaining stones' values in the row. 20 | 21 | ``` 22 | def stoneGameVII(self, stones: List[int]) -> int: 23 | L = [0] + list(itertools.accumulate(stones)) 24 | A = [[0 for _ in range(len(stones))] for __ in range(len(stones))] 25 | def a(i, j): 26 | if i == j: return 0 27 | if A[i][j] == 0: 28 | s = L[j + 1] - L[i] 29 | A[i][j] = max(s - stones[i] - a(i + 1, j), s - stones[j] - a(i, j - 1)) 30 | return A[i][j] 31 | return a(0, len(stones) - 1) 32 | ``` 33 | -------------------------------------------------------------------------------- /Decomposition_DFS_BFS/Readme.md: -------------------------------------------------------------------------------- 1 | Decomposition Problem using DFS/BFS 2 | === 3 | 4 | Some problem asks for an optimal way to decomposite the input as a "sum" of many terms 5 | 6 | LC279. Perfect Squares 7 | --- 8 | Given a positive integer n, find the least number of perfect square numbers (for example, 1, 4, 9, 16, ...) which sum to n. 9 | 10 | BFS solution 11 | ``` 12 | def numSquares(self, n): 13 | i, f = 1, [] 14 | while i * i <= n: 15 | f.append(i * i) 16 | i += 1 17 | 18 | bfs = [(0, n)] 19 | vis = set([n]) 20 | for step, num in bfs: 21 | if num == 0: return step 22 | for sqr in f: 23 | if num - sqr >= 0 and num - sqr not in vis: 24 | vis.add(num - sqr) 25 | bfs.append((step + 1, num - sqr)) 26 | return -1 27 | ``` 28 | 29 | DFS solution 30 | ``` 31 | def numSquares(self, n): 32 | dp = {0:0} 33 | def dfs(num): 34 | if num in dp: return dp[num] 35 | tmp = num 36 | i = 2 37 | while i * i <= num: 38 | tmp = min(tmp, 1 + dfs(num - i * i)) 39 | i += 1 40 | dp[num] = tmp 41 | return dp[num] 42 | 43 | return dfs(n) 44 | ``` 45 | Note that this DFS solution works but got TLE in Leetcode. Why? This problem asks for min number of squares, instead of a specific destination. 
Some paths are too long (say 471 = 1 + 1 + 1 + .. + 1, a total of 471 ones sum up) to search. So BFS is better than DFS. 46 | 47 | LC691. Stickers to Spell Word 48 | --- 49 | 50 | You would like to spell out the given target string by cutting individual letters from your collection of stickers and rearranging them. 51 | 52 | What is the minimum number of stickers that you need to spell out the target? 53 | 54 | BFS solution 55 | ``` 56 | def minStickers(self, stickers, target): 57 | mp = [Counter(w) for w in stickers] 58 | 59 | vis = set() 60 | Q = [(0, target)] 61 | for dist, t in Q: 62 | if t == "": return dist 63 | 64 | for cnt in mp: 65 | # As the sequence does not matter, we can force matching the first letter in target 66 | if t[0] not in cnt: continue 67 | 68 | tmp = cnt.copy() 69 | s = "" 70 | for ch in t: 71 | if tmp[ch] > 0: 72 | tmp[ch] -= 1 73 | else: 74 | s += ch 75 | if s not in vis: 76 | vis.add(s) 77 | Q.append((dist + 1, s)) 78 | return -1 79 | ``` 80 | 81 | DFS solution 82 | ``` 83 | def minStickers(self, stickers, target): 84 | mp = [Counter(w) for w in stickers] 85 | dp = {"":0} 86 | 87 | def dfs(target): 88 | if target in dp: return dp[target] 89 | res = float('inf') 90 | for cnt in mp: 91 | if cnt[target[0]] > 0: 92 | tmp = cnt.copy() 93 | s = "" 94 | for l in target: 95 | if tmp[l] > 0: 96 | tmp[l] -= 1 97 | else: 98 | s += l 99 | res = min(res, 1 + dfs(s)) 100 | dp[target] = res 101 | return res 102 | 103 | res = dfs(target) 104 | if res >= float('inf'): return -1 105 | return res 106 | ``` 107 | BFS(260ms) is still faster then DFS(360ms). 108 | 109 | It looks like **LC964. Least Operators to Express Number** can be solved in similar way. But there is actually a shortcut using DP there. 110 | 111 | 112 | -------------------------------------------------------------------------------- /EulerPath/Readme.md: -------------------------------------------------------------------------------- 1 | # Euler Path & Euler Circuit 2 | 3 | Euler path is a path visiting every edge exactly once. 4 | 5 | Euler cycle (Euler tour) is a Euler path which starts and ends on the same vertex. 6 | 7 | A graph is called Eulerian if it has an Eulerian Cycle and called Semi-Eulerian if it has an Eulerian Path. 8 | 9 | Given a **connected** graph, 10 | * Euler cycle exists if all vertices have even degree 11 | * Euler path exists if zero or two vertices have odd degree and all other vertices have even degree 12 | 13 | To find out the Euler path/tour, we can do a **post-order DFS while removing visited edges** 14 | 15 | It is quite like DFS, with a little change: 16 | ``` 17 | vector E 18 | dfs (v): 19 | color[v] = gray 20 | for u in adj[v]: 21 | erase the edge v-u and dfs(u) 22 | color[v] = black 23 | push v at the end of e 24 | ``` 25 | e is the answer. 26 | 27 | **LC 332. Reconstruct Itinerary** 28 | 29 | Given a list of airline tickets represented by pairs of departure and arrival airports [from, to], reconstruct the itinerary in order. 
30 | 31 | ``` 32 | def findItinerary(self, tickets): 33 | graph = collections.defaultdict(list) 34 | 35 | for s, t in tickets: 36 | graph[s].append(t) 37 | 38 | # Visit the airports with lower lexi-order first 39 | for s in graph: 40 | graph[s] = sorted(graph[s])[::-1] 41 | 42 | def dfs(s, ret): 43 | while graph[s]: 44 | # once an edge has been visited, remove it 45 | # it is like mark the visited vertex in normal DFS 46 | # pop() is easiest way to mark edge 47 | t = graph[s].pop() 48 | dfs(t, ret) 49 | # Now, All out-going edges from s have been removed 50 | # it is time to add s into the path 51 | ret.append(s) 52 | 53 | ret = [] 54 | dfs("JFK", ret) 55 | return ret[::-1] 56 | ``` 57 | 58 | **LC 753. Cracking the Safe** 59 | 60 | Output a string formed by `n` letters, each letter is among `[0, 1, 2, .., k-1]` 61 | The length-n substrings should contains all possible strings `[000, 00..1, 00..2, ...]`. 62 | For example, 63 | Input: n = 2, k = 2 64 | Output: "00110" contains '00', '01', '11', '10', which are all the length-2 substrings formed by letters of '0' and '1' 65 | 66 | The problem is to find a Euler cycle visiting all the edges in a directed graph: 67 | * Every vertex is a possible string with a length of `n-1`. 68 | * Every vertex has `k` out-going edges for letters `0`, `1`, ..., `k-1`. 69 | 70 | ``` 71 | def crackSafe(self, n, k): 72 | curr = "0" * (n - 1) 73 | graph = collections.defaultdict(lambda : k - 1) 74 | ret = [] 75 | def dfs(s, ret): 76 | t = graph[s] 77 | if t < 0: return 78 | graph[s] -= 1 79 | dfs( (s + str(t))[len(s) + 2 - n:] , ret) 80 | # Post-order here, actually pre-order also works for Euler cycle 81 | # (but pre-order does NOT work for Euler path problems) 82 | ret.append(t) 83 | dfs(curr, ret) 84 | return curr + "".join(str(i) for i in ret[::-1]) 85 | ``` 86 | 87 | Another iterative implementation: we can just record every edges of DFS, 88 | because no matter which route we choose, there will always be a Euler cycle, 89 | as long as we post-pone visiting the starting vertex `000..0` here. 90 | ``` 91 | def crackSafe(self, n, k): 92 | ret = curr = "0" * (n - 1) 93 | graph = collections.defaultdict(lambda : k - 1) 94 | for _ in range(k**n): 95 | ret += str(graph[curr]) 96 | graph[curr] -= 1 97 | curr = ret[len(ret) - n + 1:] 98 | return ret 99 | ``` 100 | -------------------------------------------------------------------------------- /FenwickTree/Readme.md: -------------------------------------------------------------------------------- 1 | ## Binary Indexed Trees (Fenwick Tree) 2 | 3 | Fenwick tree supports the query of array prefix sums in `O(log N)` time. Specifically, given an array `A` of length `N`, Fenwick tree returns the prefix sum of `A[:k]` for any `0<=k<=N` in `O(log N)` time, while the elements of `A` can also be updated dynamically in `O(log N)` time. 4 | 5 | The idea is to maintains the partial sums. 6 | 7 | To sum up all elements with index less than '0b111' (presented in binary, 1-indexed instead of 0-indexed), we need to know: 8 | * sum of the elements at '1', '10', '11', '100', which is stored at node '100' 9 | * sum of the elements at '101', '110', which is stored at node '110' 10 | 11 | The sum of these partial sums would be the sum of `A[0b1] + A[0b10] + ... + A[0b111]`. 12 | 13 | Since there are at most `O(log N)` such partial sums, each query takes `O(log N)` time. 
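Before the structural notes and the C++ solution below, here is a compact Python sketch of the same two operations (point update and prefix-sum query); it assumes the 1-indexed layout with a dummy node 0 described below:

```
class Fenwick:
    def __init__(self, n):
        # tree[0] is the dummy root; elements A[0..n-1] live at indices 1..n
        self.tree = [0] * (n + 1)

    def update(self, i, delta):
        # add delta to A[i-1]: climb up by adding the last set bit
        while i < len(self.tree):
            self.tree[i] += delta
            i += i & -i

    def query(self, i):
        # prefix sum A[0:i]: climb down by removing the last set bit
        s = 0
        while i > 0:
            s += self.tree[i]
            i -= i & -i
        return s

# range sum A[l:r] = query(r) - query(l)
```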
14 | 15 | NOTE, the node `0` is a dummy node and the tree is 1-indexed (instead of 0-indexed) 16 | 17 | ``` 18 | // marked (*) nodes stores the prefix sum for elements with index less than `0b111` 19 | root 0 20 | |--------------------- ... 21 | | | | | 22 | level1 1 10 100* 1000 23 | | |---- |--------- ... 24 | | | | | | 25 | level2 11 101 110* 1001 1010 ... 26 | | | 27 | 111* 28 | ``` 29 | At `111` stores the partial sum of `111`. 30 | At `110` stores the partial sum of `110`, `101`. 31 | At `100` stores the partial sum of `100`, `11`, `10`, `1`. 32 | 33 | Sum up these partial sums (marked by `*`), and we will obtain prefix sum `A[0b1]+A[0b10]+...+A[0b111]`. 34 | 35 | Two Operations: 36 | 37 | * To obtain the prefix sum `A[0:i]`, we start from index `i` and iteratively remove the last digit 1 of `i` by `i -= (i & -i)` until `i = 0`. This step is illustrated by the diagram above. 38 | 39 | * To update `A[i-1]`, we start from index `i` (instead of `i-1` because the tree is 1-indexed), and then iteratively adds the last digit 1 of `i` by `i += (i & -i)` until `i > N`. We update the associated partial sums along this path by the change of `A[i-1]`. 40 | 41 | **LC 307. Range Sum Query - Mutable** 42 | 43 | C++ code: 44 | ``` 45 | class NumArray { 46 | public: 47 | vector bi; 48 | int n; 49 | NumArray(vector nums) { 50 | n = nums.size(); 51 | bi.resize(n + 1, 0); 52 | fill(bi.begin(), bi.end(), 0); 53 | for (int i=0;i `|a x0 + b y0 + c |` 17 | > `------------------` 18 | > `sqrt(a * a + b * b)` 19 | 20 | 6. Angle between two 2D vectors 21 | 22 | > `|a - b|^2 = |a|^2 + |b|^2 - 2 |a| |b| cos(theta) ` 23 | 24 | where `||` denotes the L2 (Euclidean) norm `|(x ,y)| = sqrt(x*x + y*y)` 25 | 26 | so we can calculate the `theta` by this equation. 27 | 28 | > `cross(a, b) = | a | | b | sin(theta) n` 29 | 30 | where `cross()` denotes the cross product, and the unit vector `n` can be found by the right-hand rule. 31 | 32 | ![Right hand rule](https://upload.wikimedia.org/wikipedia/commons/thumb/d/d2/Right_hand_rule_cross_product.svg/220px-Right_hand_rule_cross_product.svg.png) 33 | 34 | Note that if `cross(a, b)` is positive then the `b` is at the anti-crosswise direction of `a` within 180 degree. (right-hand rule) 35 | 36 | 37 | 38 | **LC 469. Convex Polygon** 39 | --- 40 | Given a list of points that form a polygon when joined sequentially, find if this polygon is convex. 41 | 42 | Every boundary edge should turn towards the same direction, either all clockwise or all counter-clockwise. 43 | Use right-hand rule (cross product) to check that. 44 | 45 | ``` 46 | def isConvex(self, points): 47 | N = len(points) 48 | clock, aclock = 0, 0 49 | for i in range(N): 50 | j = (i + 1) % N 51 | k = (i + 2) % N 52 | ax, ay = points[j][0] - points[i][0], points[j][1] - points[i][1] 53 | bx, by = points[k][0] - points[j][0], points[k][1] - points[j][1] 54 | if ax * by - bx * ay < 0: 55 | aclock += 1 56 | if ax * by - bx * ay > 0: 57 | clock += 1 58 | 59 | if aclock > 0 and clock > 0: return False 60 | return True 61 | ``` 62 | 63 | **LC 587. Erect the Fence** 64 | --- 65 | Find convex hull given a set of points. 66 | 67 | Monotone Chain Algorithm (sort the points by `(x, -y)`). 
68 | ``` 69 | def outerTrees(self, points): 70 | """ 71 | :type points: List[Point] 72 | :rtype: List[Point] 73 | """ 74 | if not points: return [] 75 | 76 | p = sorted([(_.x, _.y) for _ in points], key=lambda x: (x[0], -x[1])) 77 | n = len(points) 78 | 79 | def cross(a, b, x, y): 80 | #print(a, b, x, y) 81 | return a * y - b * x 82 | 83 | L = [] 84 | for i in range(n): 85 | while len(L) > 1 and \ 86 | cross(L[-1][0] - L[-2][0], L[-1][1] - L[-2][1], p[i][0] - L[-2][0], p[i][1] - L[-2][1]) < 0: 87 | L.pop() 88 | L.append(p[i]) 89 | 90 | H = [] 91 | for i in range(n-1, -1, -1): 92 | while len(H) > 1 and \ 93 | cross(H[-1][0] - H[-2][0], H[-1][1] - H[-2][1], p[i][0] - H[-2][0], p[i][1] - H[-2][1]) < 0: 94 | H.pop() 95 | H.append(p[i]) 96 | 97 | return [Point(x, y) for x, y in set(L + H)] 98 | ``` 99 | 100 | -------------------------------------------------------------------------------- /HungarianAlgorithm/Readme.md: -------------------------------------------------------------------------------- 1 | Hungarian Algorithm 2 | === 3 | 4 | Problem: 5 | 6 | * Input: n workers and m tasks, worker i gets paid `cost(i, j)` to finish task j 7 | * Output: min cost to finish all tasks 8 | 9 | Key observation: the optimal assignment given original input is still optimal 10 | if a number is subtracted from one row or column in the cost matrix 11 | 12 | [blog](https://www.hackerearth.com/fr/practice/algorithms/graphs/minimum-cost-maximum-flow/tutorial/) 13 | ``` 14 | function: HungarianAlgorithm(Matrix C): 15 | Copy C into X 16 | for i from 1 to N: 17 | subtract elements in row(i) of X with min(row(i)) 18 | for j from 1 to N: 19 | subtract elements in column(j) of X with min(column(j)) 20 | L = minimum number of lines(horizontal or vertical) to join all 0s in X 21 | while L != N: 22 | M = minimum number among the cells that are not crossed by the lines 23 | Subtract M from all the cells that are not crossed by lines 24 | Add M to all cells that have intersection of lines 25 | L = minimum number of lines(horizontal or vertical) to join all 0s in X 26 | return FindMinCost(X,C) 27 | ``` 28 | FindMinCost does an optimal selection of s in matrix such that cells are selected and non of them lie in same row or column 29 | 30 | [Slides](http://www.math.harvard.edu/archive/20_spring_05/handouts/assignment_overheads.pdf) 31 | 32 | minimum weight perfect bipartite matching 33 | --- 34 | The minimum weight perfect bipartite matching is the minimum cost max flow problem 35 | ![](https://upload.wikimedia.org/wikipedia/commons/thumb/4/48/Minimum_weight_bipartite_matching.pdf/page1-330px-Minimum_weight_bipartite_matching.pdf.jpg) 36 | 37 | Further readings 38 | --- 39 | [Google OR tools](https://developers.google.com/optimization/assignment/simple_assignment) 40 | 41 | 42 | 43 | -------------------------------------------------------------------------------- /KMP/Readme.md: -------------------------------------------------------------------------------- 1 | # KMP algorithm 2 | 3 | KMP checks if a `T` is a sub-string of `S` in **linear** time. 4 | 5 | The idea is that, if `T[:i]==S[j-i:j]` and `T[i]` fails to match `S[j]`, we do not need to re-start from matching the beginning of `T`, i.e. `T[0]` with `S[j-i+1]`. This is because we already have `T[:i] == S[j-i:j]`, we can preprocess `T[:i]` to know where to re-start then. 6 | 7 | * Preprocessing: obtain the next matching point `b[i]` so when the attempt of matching `T[i]` fails, we roll back to `T[b[i]]` to restart the matching. 
8 | 9 | * Searching: When `T[i]` mis-matches, we roll back to `T[b[i]]` to restart. 10 | 11 | The most important note is that `T[:b[i]]` is the longest "border" of `T[:i]`. Here, "border" is defined as the string which is both a prefix and a suffix of a string. 12 | 13 | [Why it works?](http://www.inf.fh-flensburg.de/lang/algorithmen/pattern/kmpen.htm) 14 | 15 | If border `r` is both suffix and prefix of `x`, then `r+a` is a "border" of `x+a`. 16 | 17 | ![alt text](http://www.inf.fh-flensburg.de/lang/algorithmen/pattern/rand2.gif) 18 | 19 | Hence, the key is to find index `b[i]` for index `i` such that we can expand the matched string before them. 20 | 21 | ![alt text](http://www.inf.fh-flensburg.de/lang/algorithmen/pattern/rand4.gif) 22 | 23 | where `b[i]` indicates the index of next potential match of char at `array[i]`. 24 | 25 | The searching phase of the Knuth-Morris-Pratt algorithm is in `O(n)`. A preprocessing of the pattern is necessary in order to analyze its structure. The preprocessing phase has a complexity of O(m). Since `m <= n`, the overall complexity of the Knuth-Morris-Pratt algorithm is in `O(n)`. 26 | 27 | A **C++** implementation of preprocessing: 28 | 29 | Note that the `b[0]=-1`. In the worst case, we starting matching from index `b[i]=0` - matching the first char. 30 | 31 | ``` 32 | void kmpPreprocess(vector& array) 33 | { 34 | vector b(array.size()); 35 | int i=0, j=-1; 36 | b[i]=j; 37 | while (i < array.size()) 38 | { 39 | while (j>=0 && t[i]!=t[j]) j=b[j]; // retreat when no matching 40 | b[++i]=++j; 41 | } 42 | } 43 | ``` 44 | 45 | Given the preprocessed `b`, do searching: 46 | ``` 47 | void kmpSearch() 48 | { 49 | int i=0, j=0; 50 | while (i=0 && t[i]!=s[j]) j=b[j]; 54 | i++; j++; 55 | 56 | // succeess 57 | if (j==m) 58 | { 59 | report(i-j); 60 | // roll back 61 | j=b[j]; 62 | } 63 | } 64 | } 65 | ``` 66 | 67 | 68 | **LC 214. Shortest Palindrome** 69 | --- 70 | 71 | Prepend letters at the front of a string to make a palindrome. 72 | What's the smallest numer of letters you need to prepend? 73 | 74 | This question essentially asks for the longest palindrome prefix of a string. With a new string 75 | 76 | `s = g + '|' + reverse(g)` 77 | 78 | If a proper prefix of `s` is also a suffix, it is a prefix palindrome of `g`. So we look for the **longest** proper prefix (LPP) of `s` which is also a suffix. Notice that `|` should not appear in `g` so the LPP must be inside the `g` part before `|`. 79 | 80 | ``` 81 | def shortestPalindrome(self, g): 82 | s = g + "|" + g[::-1] 83 | b = [0] * (len(s) + 1) 84 | l, r = -1, 0 85 | b[r] = l 86 | while r < len(s): 87 | while l >= 0 and s[l] != s[r]: l = b[l] 88 | l, r = l + 1, r + 1 89 | b[r] = l 90 | return g[b[-1]:][::-1] + g 91 | ``` 92 | 93 | **LC 5. Longest Palindromic Substring** 94 | --- 95 | Find the longest palindromic substring in `s`. 96 | 97 | Test all the suffix of `s[:i]`, which is also a proper prefix of it. KMP linear time for each suffix and a total of `O(N^2)` time. 
98 | 99 | ``` 100 | def longestPalindrome(self, s): 101 | def LPS(s): 102 | lps = [0] * len(s) 103 | i, l = 1, 0 104 | while i < len(s): 105 | if s[i] == s[l]: 106 | l += 1 107 | lps[i] = l 108 | i += 1 109 | else: 110 | if l > 0: 111 | l = lps[l - 1] 112 | else: 113 | lps[i] = 0 114 | i += 1 115 | return l 116 | ret = (1,1) 117 | for i in range(1, len(s) + 1): 118 | l = LPS(s[:i][::-1] + '|' + s[:i]) 119 | if l > ret[1]: 120 | ret = (i, l) 121 | i, l = ret 122 | return s[i-l:i] 123 | ``` 124 | 125 | There exists a O(N) time algorithm called Manchester algorithm. And DP also solves the problem in `O(N^2)` time. 126 | 127 | 128 | **LINTCODE 1365. Minimum Cycle Section** 129 | --- 130 | Given an array of int, find the length of the minimum cycle section. ex. `[1,2,3,1,2,3]` has the longest cycle section of `[1,2,3]`, so return 2 as the minimum number of cycle sections. 131 | 132 | Idea: input `abc abc abc` would have the longest proper prefix `abc abc` which is also a suffix. So the cycle section would be `N - b[N]` where `array[0:b[N]] == array[N-b[N]:N]`. 133 | 134 | **C++** solution: 135 | 136 | ``` 137 | int minimumCycleSection(vector &array) { 138 | int i = 0, j = -1, N = array.size(); 139 | vector b(N + 1, -1); 140 | while (i < N) { 141 | while (j >= 0 && array[i] != array[j]) j = b[j]; 142 | b[++i] = ++j; 143 | } 144 | return i - b[i]; 145 | } 146 | ``` 147 | 148 | [Another idea](https://code.dennyzhang.com/minimum-cycle-section) is to have an array `dp[i]` storing the longest cycle length at index `i`. 149 | When `array[i]` does not match `array[j%dp[i]]`, reset the `dp[i] = i`; otherwise `dp[i] = dp[i-1]` because repeating pattern continues. 150 | -------------------------------------------------------------------------------- /Knapsack/Readme.md: -------------------------------------------------------------------------------- 1 | # Knapsack 2 | 3 | **LC 691. Stickers to Spell Word** 4 | 5 | We are given N different types of stickers. Each sticker has a lowercase English word on it. 6 | 7 | You would like to spell out the given target string by cutting individual letters from your collection of stickers and rearranging them. 8 | 9 | You can use each sticker more than once if you want, and you have infinite quantities of each sticker. 10 | 11 | What is the minimum number of stickers that you need to spell out the target? If the task is impossible, return -1. 
12 | 13 | ``` 14 | def minStickers(self, s, t): 15 | """ 16 | :type stickers: List[str] 17 | :type target: str 18 | :rtype: int 19 | """ 20 | mp = [Counter(w) for w in s] 21 | dp = {"":0} 22 | def find(t): 23 | if t in dp: return dp[t] 24 | res = sys.maxsize 25 | for i in range(len(s)): 26 | if mp[i][t[0]] > 0: 27 | tmp = mp[i].copy() 28 | r = "" 29 | for j in range(len(t)): 30 | if tmp[t[j]] > 0: tmp[t[j]] -= 1 31 | else: r += t[j] 32 | res = min(res, 1 + find(r)) 33 | dp[t] = res 34 | return dp[t] 35 | R = find(t) 36 | return R if R < sys.maxsize else -1 37 | ``` 38 | -------------------------------------------------------------------------------- /ML_SystemDesign/Readme.md: -------------------------------------------------------------------------------- 1 | Machine learning system design 2 | === 3 | 4 | # Formulate loss functions 5 | 6 | | **Problem Type** | **Activation + Loss Function Combo** | **Math (in LaTeX)** | 7 | |--------------------------------------|---------------------------------------------------------------|------------------------------------------------------------------------------------------------------| 8 | | **Binary Classification** | Sigmoid + Binary Cross-Entropy | $L = - y log(p) + (1 - y) log(1 - p)$ | 9 | | **Imbalanced Binary Classification** | Sigmoid + Weighted Binary Cross-Entropy | $L = - [ w_1 y log(p) + w_2 (1 - y) log(1 - p) ]$ | 10 | | **Multi-Class Classification** | Softmax + Categorical Cross-Entropy | $L = - \sum_{i} y_i log(p_i)$, allow multiple classes, but $y_i$s should sum to 1 | 11 | | **Multi-Class Classification** | Log-Softmax + Negative Log-Likelihood (NLL) | $L = - \log(p_i)$ where $y_i = 1$ only for the "true" class $i$ | 12 | | **Multi-Label Classification** | Sigmoid (per class) + Binary Cross-Entropy (per class) | $L = - [ y log(p) + (1 - y) log(1 - p) ]$, there is one loss per output activation | 13 | | **Regression** | None (Linear) + Mean Squared Error (MSE) | $L = 1/N \sum_{i} (y_i - hat{y}_i)^2$, L2 penalize outliers | 14 | | **Regression** | None (Linear) + Mean Absolute error (MAE) | $L = 1/N \sum_{i} \|y_i - hat{y}_i\|$, L1 encourage sparsity | 15 | | **Autogression/Next-token prediction** | Softmax + Categorical Cross-Entropy | $L = - \sum_{i} y_i log(p_i)$ | 16 | | **Ranking/Pairwise Learning** | None (Linear/ Sigmoid) + Contrastive Loss / Triplet Loss | $y_i > y_j$ or $\|y_i - y_j\| > \|y_k - y_j\|$ | 17 | 18 | # Feature 19 | 20 | - feature engineering: 21 | - domain-specific, use human common sense 22 | - unsupervised feature representation learning (cold start, new/fresh item recommendations): 23 | - word2vec style -- similar embeddings for similar items, learn from item-to-item interactions. 24 | - autoencoder (VAE) 25 | 26 | # Explotaion and Exploration 27 | - fairness, diversity 28 | - use temperature in softmax to smooth the output 29 | - add a probablistic layer in the neural net 30 | - Multi-arm Bandit 31 | - episoly greedy: randomly select someone by small prob. 
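As a rough illustration of the epsilon-greedy bandit idea in the last bullet (a minimal sketch; the arm/reward bookkeeping and the parameter value here are made-up placeholders, not part of the original note):

```
import random

def epsilon_greedy_pick(avg_reward, epsilon=0.05):
    # avg_reward: arm id -> running average reward (e.g. CTR of a candidate source)
    # with small probability epsilon, explore a random arm; otherwise exploit the best one
    if random.random() < epsilon:
        return random.choice(list(avg_reward))
    return max(avg_reward, key=avg_reward.get)

def update_reward(avg_reward, counts, arm, reward):
    # incremental running-mean update after observing the reward of the chosen arm
    counts[arm] += 1
    avg_reward[arm] += (reward - avg_reward[arm]) / counts[arm]
```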
32 | 33 | -------------------------------------------------------------------------------- /MaxFlow/Readme.md: -------------------------------------------------------------------------------- 1 | Max Flow 2 | === 3 | 4 | Problem: 5 | 6 | * Input: a graph with edge capacity, source `s` and destination `t` 7 | * Output: max flow from s to t (the flow over every edge is no more than its capacity) 8 | 9 | [Edmonds-Karp implementation](http://theory.stanford.edu/~tim/w16/l/l1.pdf) of the Ford-Fulkerson algorithm: 10 | 11 | * repeat forever 12 | * Create a residual graph `rG`, for an edge `u -> v` in original graph `G` 13 | * rG[u][v] = G[u][v] 14 | * rG[v][u] = 0 15 | 16 | * BFS to find a shortest path from s to t with postive flow 17 | 18 | * If no such path: break 19 | 20 | * For each edge in this path: 21 | * Remove the bottleneck flow in the forward edges 22 | * Add the bottleneck flow in the reverse edges 23 | 24 | Max Matching 25 | --- 26 | Maximum flow can be used for max matching in bi-parite graph if we create a fake source S and fake destination T, 27 | each connecting to the nodes of one layer in the bi-parite graph. For example, 28 | 29 | > Given the preference of 4 nodes `x0~x3`: `[[y0,y1,y2,y3],[y0,y1,y2],[y0,y1],[y1,y2]]` 30 | > max matching would be `x0->y3, x1->y1, x2->y0, x3->y2` 31 | 32 | ``` 33 | from collections import defaultdict 34 | from collections import deque 35 | 36 | def maxflow(G, s, t): 37 | # Given graph G, return the max flow from source s to destination t 38 | rG = defaultdict(dict) 39 | for u in G: 40 | for v in G[u]: 41 | rG[u][v] = G[u][v] 42 | rG[v][u] = 0 43 | N = len(rG) 44 | 45 | max_flow = 0 46 | while True: 47 | # bfs to find shortest path with postive flow 48 | parent = [-1] * N 49 | vis = set([s]) 50 | bfs = deque([s]) 51 | while bfs: 52 | u = bfs.popleft() 53 | if u == t: 54 | break 55 | for v in rG[u]: 56 | if v not in vis and rG[u][v] > 0: 57 | vis.add(v) 58 | parent[v] = u 59 | bfs.append(v) 60 | 61 | # terminate when no augment path found 62 | if t not in vis: break 63 | 64 | # find bottleneck 65 | v = t 66 | bottleneck = float('inf') 67 | while v != s: 68 | u = parent[v] 69 | print(v, end = "->") 70 | bottleneck = min(bottleneck, rG[u][v]) 71 | v = u 72 | print(s, ":", bottleneck) 73 | 74 | # remove bottleneck flow 75 | v = t 76 | while v != s: 77 | u = parent[v] 78 | rG[u][v] -= bottleneck 79 | rG[v][u] += bottleneck 80 | v = u 81 | 82 | max_flow += bottleneck 83 | return max_flow 84 | 85 | # preference [[0,1,2,3],[0,1,2],[0,1],[1,2]] 86 | # max matching is x0->y3, x1->y1, x2->y0, x3->y2 87 | 88 | preference = [[0,1,2,3],[0,1,2],[0,1],[1,2]] 89 | N = len(preference) 90 | G = defaultdict(dict) 91 | s, t = 0, 1 92 | for i in range(N): 93 | G[s][2 + i] = 1 # source to layer one 94 | G[2 + N + i][t] = 1 # layer two to destination 95 | for i, row in enumerate(preference): 96 | for r in row: 97 | G[2 + i][2 + N + r] = 1 98 | print(maxflow(G, s, t)) 99 | ``` 100 | 101 | According to [DarthPrince's blog](https://codeforces.com/blog/entry/16221), there is a shortcut. 102 | When `v` is matched with `u` by edge `(u->v)`, then in the residual graph, there is actually a path from `v` to `u`. 103 | Hence, we can obtain an augment path along `(v->u)`. 104 | 105 | Keep finding augment path by DFS. Each DFS only visits each node once. 
106 | ``` 107 | def max_matching(G): 108 | N = len(G) 109 | 110 | match = [-1] * N 111 | 112 | def dfs(u, mark): 113 | if mark[u]: return False 114 | mark[u] = True 115 | for v in G[u]: 116 | if match[v] == -1 or dfs(match[v]): 117 | match[v], match[u] = u, v 118 | return True 119 | return False 120 | 121 | for i in G: 122 | if match[i] == -1: 123 | mark = [False] * N 124 | dfs(i, mark) 125 | return match 126 | ``` 127 | 128 | See this example for the same problem: 129 | [https://www.geeksforgeeks.org/channel-assignment-problem/](https://www.geeksforgeeks.org/channel-assignment-problem/) 130 | 131 | Minimum-cost maximum flow (MCMF) 132 | --- 133 | The max flow problem is a special case of MCMF where the cost of unit flow in every edge is ONE. The goal is to 134 | 135 | * max the flow 136 | * while the flow is maximized, minimize cost = `sum(cost_e * f_e for all edge e)` 137 | 138 | We just replace the BFS in max flow by **weighted graph shortest-distance algorithm** like Bellman-Ford to find the path with min sum of cost. 139 | 140 | Mathematically, it works because the bottleneck flow through a path brings cost = `sum(cost_e * bf)` where `bf` is the bottleneck flow. So given the `bf`, min cost = `min( sum(cost_e) ) * bf`. 141 | 142 | **Application for min-cost matching** 143 | The MCMF algorithm can solve the problem for Hungarian Algorithm: given N workers and M tasks, the i-th worker costs `W[i,j]` to finish the j-th tasks - return the min cost to finish all M tasks. 144 | 145 | Create fake source and destination nodes, source connects to all worker nodes, and all task nodes connect to the destination. The capacities for the edges connecting fake nodes to original nodes are all Ones. `W[i,j]` is the cost of edges between worker and task nodes. So the maximum flow with minimum cost is the result. 146 | 147 | **Futher Readings** 148 | http://acm.pku.edu.cn/summerschool/acm-icpc%E6%9A%91%E6%9C%9F%E8%AF%BE_%E7%BD%91%E7%BB%9C%E6%B5%81.pdf 149 | 150 | 151 | 152 | 153 | 154 | -------------------------------------------------------------------------------- /MergeSort/Readme.md: -------------------------------------------------------------------------------- 1 | ## Merge Sort 2 | Merge Sort is suitable for problems which look for some pairs `i, j` 3 | such that `i < j,` and `nums[i], nums[j]` satisfy some constraints. 4 | 5 | We can find such pairs during merge sort. In each recursion, before we merge two sorted subarrays, the `i` is in the left sorted subarray, and the `j` is in the right sorted subarray. So, we can just go through both sorted subarray to find the valid `i` and `j` pairs. As long as this step is `O(n)`, the total time complexity would be `O(n log n)`. 6 | 7 | ``` 8 | // [l, r) is the interval to be sorted 9 | int sort_count(iterator l, iterator r) { 10 | if (r - l <= 1) return; 11 | // step 1. find the middle 12 | iterator m = l + (r - l) / 2; 13 | // step 2. sort left and right subarray 14 | int count = sort_count(l, m) + sort_count(m, r); 15 | /* step 3. write your code here for counting the pairs (i, j).*/ 16 | 17 | // step 4. call inplace_merge to merge 18 | inplace_merge(l, m, r); 19 | return count; 20 | } 21 | ``` 22 | 23 | Such problems do not care about the order of `j`s as long as `j > i`. The `j` could be possibly anywhere after `i` as long as `nums[j]` and `nums[i]` satisfy the given constraint. This is the KEY feature because we can sort the `nums[j]`s. 
Balanced Binary Search Tree would work well for this purpose, because we can insert the elements one-by-one, store the inserted element arrays in BST and search for those valid `j`s in BST given `i`. But it requires extra code to build the BST and also keep it balanced to avoid `O(N^2)` worst time complexity. For problems like these, Segment Tree and Binary Indexed Tree are also good choices and give `O(n log n)` time complexity. 24 | 25 | Merge sort avoids extra data structure. For code interviews, merge sort code seems to be easier and you can just mention how to use BST, Segment Tree and Binary Indexed Tree to get more credits for the interview. 26 | 27 | C++ provides built-ins for merge sort including: 28 | * `merge(l1.begin(), l1.end(), l2.begin(), l2.end(), result.begin());` which stores the merged array in `result` 29 | * `inplace_merge(l.begin(), l.middle, l.end())` where array `[begin, middle)` is merged with array `[middle, end)`. 30 | 31 | **NOTE:** Python's `sorted()` use **Timesort**, which actually search for sorted subsequence first and then apply merge-sort. So, you can treated it as inplace_merge_sort. LoL. 32 | 33 | LeetCode 315. Count of Smaller Numbers After Self. Return the number of `j`s such that `i < j` and `nums[j] < nums[i]`. 34 | ``` 35 | #define iterator vector>::iterator 36 | void sort_count(iterator l, iterator r, vector& count) { 37 | if (r - l <= 1) return; 38 | iterator m = l + (r - l) / 2; 39 | sort_count(l, m, count); 40 | sort_count(m, r, count); 41 | for (iterator i = l, j = m; i < m; i++) { 42 | while (j < r && (*i)[0] > (*j)[0]) j++; 43 | count[(*i)[1]] += j - m; // add the number of valid "j"s to the indices of *i 44 | } 45 | inplace_merge(l, m, r); 46 | } 47 | vector countSmaller(vector& nums) { 48 | vector> hold; 49 | int n = nums.size(); 50 | for (int i = 0; i < n; ++i) hold.push_back(vector({nums[i], i})); // "zip" the nums with their indices 51 | vector count(n, 0); 52 | sort_count(hold.begin(), hold.end(), count); 53 | return count; 54 | } 55 | ``` 56 | 57 | LeetCode 493. Reverse Pairs. Return the number of reverse pairs s.t. `i < j` and `nums[i] > 2*nums[j]`. 58 | ``` 59 | int sort_count(vector::iterator begin, vector::iterator end) { 60 | if (end - begin <= 1) return 0; 61 | vector::iterator middle = begin + (end - begin) / 2; 62 | int count = 0; 63 | if (begin < middle) count += sort_count(begin, middle); 64 | if (middle < end) count += sort_count(middle, end); 65 | vector::iterator i, j; 66 | for (i = begin, j = middle; i < middle; ++i) { // double pointers trick 67 | while (j < end && *i > 2L * *j) { 68 | j++; 69 | } 70 | count += j - middle; 71 | } 72 | inplace_merge(begin, middle, end); 73 | return count; 74 | } 75 | int reversePairs(vector& nums) { 76 | return sort_count(nums.begin(), nums.end()); 77 | } 78 | ``` 79 | 80 | LeetCode 327. Count of Range Sum. Return the number of range sums that lie in `[lower, upper]` inclusive-inclusive. Let prefix-array sum be `sums[0..n+1]`, the problem is to find pairs of `i` and `j` such that `lower <= sums[j] - sums[i] <= upper`. 
81 | ``` 82 | int countRangeSum(vector& nums, int lower, int upper) { 83 | int n = nums.size(); 84 | vector sums(n + 1, 0); 85 | for (int i = 0; i < n; ++i) sums[i + 1] = sums[i] + nums[i]; 86 | return sort_count(sums, 0, n + 1, lower, upper); 87 | } 88 | 89 | int sort_count(vector& sums, int l, int r, int lower, int upper) { 90 | if (r - l <= 1) return 0; 91 | int m = (l + r) / 2, i, j1, j2; 92 | int count = sort_count(sums, l, m, lower, upper) + sort_count(sums, m, r, lower, upper); 93 | for (i = l, j1 = j2 = m; i < m; ++i) { 94 | // we have two j pointers now and one i pointer, but still linear time 95 | while (j1 < r && sums[j1] - sums[i] < lower) j1++; 96 | while (j2 < r && sums[j2] - sums[i] <= upper) j2++; 97 | count += j2 - j1; 98 | } 99 | inplace_merge(sums.begin() + l, sums.begin() + m, sums.begin() + r); 100 | return count; 101 | } 102 | ``` 103 | 104 | 105 | -------------------------------------------------------------------------------- /MonotonicQueue/Readme.md: -------------------------------------------------------------------------------- 1 | Monotonic Queue 2 | === 3 | The following question can be solved by monotonic queue: 4 | * **LC84. Largest Rectangle in Histogram** 5 | * **LC239. Sliding Window Maximum** 6 | * **LC739. Daily Temperatures** 7 | * **LC862. Shortest Subarray with Sum at Least K** 8 | * **LC901. Online Stock Span** 9 | * **LC907. Sum of Subarray Minimums** 10 | * [Frog Jump II](https://anthony-huang.github.io/competitiveprogramming/2016/06/06/monotonic-queue.html): K steps at most with cost `A[i]` if landing at position `i` 11 | 12 | In general, the following "prototype" problems can be solved by monotonic queue: 13 | 14 | Sliding Max 15 | --- 16 | 17 | Any DP problem where `S[i] = max(A[j:k]) + C` where `j < k <= i` and `C` is a constant. 18 | 19 | The sliding max/min window problem belongs to this type. 20 | 21 | Problem statement: return the max elements in a sliding window of certain length. 22 | 23 | Key observation: Given input array `A`, when `A[l] < A[r]` for `l < r`, then `A[l]` should never be retuned as the sliding max, if `A[r]` has entered the sliding window. 24 | 25 | So we maintain a monotonic array with index **increasing** and value **decreasing**. 26 | 27 | For example, with sliding window of fixed length 3, 28 | > `A = [3, 1, 4, 3, 8] => [3], [3, 1], [4], [4, 3], [8]` 29 | > when element `4` enters, we remove `[3, 1]` because they are on the left and smaller than `4`, no chance being chosen as the max element. 30 | 31 | The head of the increasing queue is the sliding max! 32 | 33 | As simple as it is, we have a sliding window of elements, 34 | the only unique thing here is that we can keep the elements in the window sorted. It brings great benefits because it takes O(1) to obtain the min/max element in the window. 35 | 36 | That's why any DP problem where `S[i] = max(A[j:k]) + C` for `j < k <= i` can be solved by Monotonic Queue. 37 | 38 | Given a element `A[i]`, find the nearest element larger/smaller than it 39 | --- 40 | Given element `A[i]`, the task is to find the maximum index `j < i` such that `A[j] > A[i]`. Namely, `A[j]` is the nearest larger element on the left of `A[i]`. 41 | 42 | Key observation: given `A[k] < A[j] > A[i]` for `k < j < i`, `A[k]` never become the **nearest** element larger than `A[i]` because of `A[j]`. 43 | 44 | So we should have a decreasing monotonic queue here. The arrow indicates that the mapping from element on the right to the nearest element on the left larger than it. 
The elements in the valley are ignored. 45 | 46 | ![alt text](https://imgur.com/ZfQSOag.png) 47 | 48 | **LC 85. Maximal Rectangle** 49 | 50 | Given a 2D binary matrix filled with 0's and 1's, find the largest rectangle containing only 1's and return its area. 51 | 52 | Idea: convert 2D matrix to 1D height array. The task becomes **LC84. Largest Rectangle in Histogram** which is essentially "finding the index of the nearest previous value smaller than itself". 53 | 54 | ``` 55 | if not matrix: return 0 56 | N, M = len(matrix), len(matrix[0]) 57 | dp = [0] * (M + 1) 58 | area = 0 59 | 60 | for i in range(N): 61 | for j in range(M): 62 | # obtain the height based on each row 63 | if matrix[i][j] == '1': 64 | dp[j] += 1 65 | else: 66 | dp[j] = 0 67 | 68 | s = [] 69 | for j in range(M + 1): # IMPORTANT: note that the last ZERO should pop out all remaining heights 70 | if not s: s.append(j) 71 | else: 72 | while s and dp[s[-1]] >= dp[j]: 73 | x = s.pop() 74 | if s: area = max(area, dp[x]*(j - s[-1] - 1)) 75 | else: area = max(area, dp[x]*j) 76 | s.append(j) 77 | 78 | return area 79 | ``` 80 | 81 | **LC862. Shortest Subarray with Sum at Least K** 82 | 83 | Return the length of the shortest, non-empty, contiguous subarray of A with sum at least K. 84 | 85 | Key observation: If we accumulate array A to obtain B, then `B[l] <= B[r] - K` indicates `sum(A[l:r]) >= K`. Given `B[r]`, the problem is equivalent to finding the **nearest** previous element `B[l]` such that `B[l] <= B[r] - K`. 86 | 87 | We maintain a **increasing queue** here because, given a new `B[i]`, the larger element on the left are inferior than `B[i]` as a candidate to make some future element `B[j] >= B[i] + K` (`j > i`). 88 | 89 | One extra optimization learnt from [@lee215](https://leetcode.com/problems/shortest-subarray-with-sum-at-least-k/discuss/143726/C%2B%2BJavaPython-O(N)-Using-Deque) is that we can also pop up the element on the left side `<= B[i] - K` of the **increasing** queue because, given current element `B[i]`, if a future element `B[j] > B[i]`, then `B[j] - K` would be within the queue after the removal of such elements `<= B[i] - K`; Otherwise, if a future element `B[j] > B[i]` then it never appears in the final results. 90 | 91 | ``` 92 | Q = collections.deque([]) 93 | 94 | B = [0] 95 | for a in A: B.append(B[-1] + a) 96 | 97 | res = float('inf') 98 | for i, b in enumerate(B): 99 | if not Q: Q.append(i) 100 | else: 101 | while Q and B[Q[-1]] > b: Q.pop() 102 | while Q and B[Q[0]] <= b - K: 103 | res = min(res, i - Q[0]) 104 | Q.popleft() 105 | Q.append(i) 106 | return res if res < float('inf') else -1 107 | ``` 108 | 109 | Frog Jump II 110 | --- 111 | 112 | A Frog jumps K steps at most. It costs `A[i]` to stays at `i`. 113 | Return the min cost to get to `N-1` from `0` 114 | 115 | ``` 116 | #K = 2 117 | #A = [0, 3, 2, 7, 1, 4] 118 | #cost 0 3 2 9 3 119 | # 0 120 | # 0 3 121 | # 0 2 122 | # - 2 9 123 | # 2 - 3 # remove 9 because 3 < 9 and it's on the right 124 | # 3 7 125 | # 126 | ``` 127 | 128 | Pay attention to the two `while` loop. 129 | ``` 130 | from collections import deque 131 | def min_cost(A, K): 132 | Q = deque([(0, A[0])]) 133 | for i in range(1, len(A)): 134 | 135 | # keep sliding width == K steps 136 | while Q and Q[0][0] < i - K: 137 | Q.popleft() 138 | 139 | # remove inferior elements at the tail 140 | while Q and Q[-1][1] > A[i] + Q[0][1]: 141 | Q.pop() 142 | 143 | Q.append((i, A[i] + Q[0][1])) 144 | return Q[-1][1] 145 | ``` 146 | 147 | **LC975. 
Odd Even Jump** 148 | --- 149 | Find a `j` of each `i` such that `j > i` and `A[j] = min(A[x] > A[i])` for all `x > i`. 150 | 151 | This problem may not sound that straight-forward. But, if we sort `A` and arrange the indices by the increasing order of `A[i]` as `B`. 152 | For example, `A=[5,2,1,3,4]`, then the indices would be `B=[2,1,3,4,0]`. 153 | 154 | The corresponding `j` of `i` would be the **next indice in B larger than i**, which gets projected to the monotonic queue problem. 155 | 156 | 157 | Codeforces 487B Strip 158 | === 159 | 160 | One extreme case is (Codeforces 487B Strip)[http://codeforces.com/contest/487/problem/B] 161 | 162 | 题意:给你一个数组,现在让你分割这个数组,要求是数组每一块最少 L 个数 且 每一块中 极值 不超过 s 163 | 164 | (Multiset C++ solution)[https://www.cnblogs.com/zyue/p/4360175.html] 165 | 166 | 167 | 487B - Strip 168 | We can use dynamic programming to solve this problem. 169 | 170 | Let f[i] denote the minimal number of pieces that the first i numbers can be split into. g[i] denote the maximal length of substrip whose right border is i(included) and it satisfy the condition. 171 | 172 | Then `f[i] = min(f[k]) + 1`, where `i - g[i] ≤ k ≤ i - l`. 173 | 174 | We can use monotonic queue to calculate g[i] and f[i]. And this can be implemented in `O(n)` !!! 175 | 176 | We can also use sparse table to solve the problem, the time complexity is O(n logn). 177 | 178 | ``` 179 | from collections import deque 180 | 181 | # f[i] = min(f[k]) + 1 for g[i] - 1 <= k <= i - l 182 | # [g[i], i] is a valid seg 183 | # sliding window of max/min elements 184 | 185 | def find(N, S, L, A): 186 | big, small = deque(), deque() 187 | 188 | F = [N + 1] * N 189 | Fsmall = deque() 190 | 191 | j = 0 192 | for i, n in enumerate(A): 193 | # insert (i, n) 194 | while big and big[-1][1] <= n: 195 | big.pop() 196 | big.append((i, n)) 197 | while small and small[-1][1] >= n: 198 | small.pop() 199 | small.append((i, n)) 200 | 201 | # increment j until max-min <= S 202 | while j <= i and big and small and big[0][1] - small[0][1] > S: 203 | while big and big[0][0] <= j: big.popleft() 204 | while small and small[0][0] <= j: small.popleft() 205 | j += 1 206 | 207 | # [j,i] is the longest segment now 208 | # j - 1 <= k <= i - L 209 | if i >= L: 210 | # insert i-L 211 | while Fsmall and Fsmall[-1][1] >= F[i - L]: 212 | Fsmall.pop() 213 | Fsmall.append((i-L, F[i - L])) 214 | 215 | if j == 0 and i - j + 1 >= L: 216 | F[i] = 1 217 | else: 218 | # remove elements before j-1 219 | while Fsmall and Fsmall[0][0] < j - 1: 220 | Fsmall.popleft() 221 | if Fsmall: 222 | F[i] = min(F[i], Fsmall[0][1] + 1) 223 | 224 | return F[-1] if F[-1] <= N else -1 225 | 226 | 227 | if __name__ == "__main__": 228 | N, S, L = (int(_) for _ in raw_input().split(' ')) 229 | A = [int(_) for _ in raw_input().split(' ')] 230 | print find(N, S, L, A) 231 | ``` 232 | 233 | LC 375. Guess Number Higher or Lower II 234 | --- 235 | I pick a number from 1 to n. You have to guess which number I picked. 236 | 237 | Every time you guess wrong, I'll tell you whether the number I picked is higher or lower. 238 | 239 | However, when you guess a particular number x, and you guess wrong, you pay $x. You win the game when you guess the number I picked. 240 | 241 | Given a particular n ≥ 1, find out how much money you need to have to guarantee a win. 242 | 243 | Say we select `k`, then `dp[l][k-1]` is the cost of the left remaining segment and `dp[k+1][r]` is the cost of the right. 
244 | 245 | > `dp[l][r] = max(dp[l][k-1], dp[k+1][r]) + k` for `l < k < r` 246 | 247 | As `k` increase, `dp[l][k-1] + k` goes down, `dp[k+1][r] + k` grows up. 248 | When `dp[l][k0-1] >= dp[k0+1][r]` for the first `k = k0`, we just need to compare 249 | `dp[l][k0-1] + k0` with `dp[k+1][r] + k` for `k < k0`. 250 | 251 | As a sub-routine, we need to find `dp[k+1][r] + k` for `k < k0` here. This is what Monotonic Queue does in `O(1)` time. 252 | 253 | [O(1) time to find sliding minimum of the first `k0` elements!!](https://artofproblemsolving.com/community/c296841h1273742) 254 | 255 | LC 1425. Constrained Subsequence Sum 256 | --- 257 | Maximum sum of a non-empty subsequence `A[i]`, every two consecutive element are **at most** `k` steps away. 258 | 259 | DP: the max sum of such subsequence ending with `A[i]` is `S[i] = max(S[i-k:i]) + A[i]`. Answer is `max(S)`. 260 | 261 | We use `deque` to maintain a decreasing queue with `max(S[i-k:i])` at the head of the queue. 262 | 263 | ``` 264 | def constrainedSubsetSum(self, A, k): 265 | deque = collections.deque() 266 | S = [0] * len(A) 267 | for i in xrange(len(A)): 268 | S[i] = A[i] + (deque[0] if deque else 0) 269 | while len(deque) and S[i] > deque[-1]: 270 | deque.pop() 271 | if S[i] > 0: 272 | # Note: S[i] == A[i] when max(S[i-k:i]) <= 0 273 | # so no need to enqueue non-positive S[i] into the deque 274 | deque.append(S[i]) 275 | if i >= k and deque and deque[0] == S[i - k]: 276 | # Note: elements in the deque are all unique 277 | # so this is a way to avoid checking the indices of elements which are k steps away. 278 | deque.popleft() 279 | return max(S) 280 | ``` 281 | -------------------------------------------------------------------------------- /MorrisTraversal/Readme.md: -------------------------------------------------------------------------------- 1 | Morris Traversal 2 | === 3 | 4 | O(N) time O(1) space tree traversal. Works for pre-, post-, in-order traversal of binary trees. 5 | 6 | ``` 7 | # We start from the input tree and end up with the input tree 8 | # Go Down: the clock-wise shift at the first visit of each node (pre-order) 9 | # Go Up: the anti-clockwise shift at the second visit of each node (in-order) 10 | # 11 | # | ^ 12 | # V | 13 | # 14 | # parent 15 | # \ 16 | # current 17 | # / \ 18 | # lkid rtreeA 19 | # / \ 20 | # ltree rtreeB 21 | # 22 | # | ^ 23 | # V | # After clock-wise shift, current becomes the right kid of the right-most node rooted by lkid 24 | # 25 | # parent 26 | # | 27 | # | lkid <--------- 28 | # | / \ | 29 | # | ltree rtreeB | 30 | # | \ | 31 | # |-------> current---- # note that current's left kid is still lkid, 32 | # \ # so we can recover the original tree after Morris traversal 33 | # rtreeA 34 | # 35 | # | ^ 36 | # V | 37 | # 38 | ``` 39 | 40 | ``` 41 | current = root 42 | while current: 43 | if current.left is None: 44 | # visit left-most node in every branch 45 | current = current.right 46 | else: 47 | prev = current.left 48 | while prev.right and prev.right != current: 49 | prev = prev.right 50 | 51 | if prev.right == current: # recovery, go up 52 | # inorder here 53 | prev.right = None 54 | current = current.right 55 | else: # visit, go down 56 | # pre-order here 57 | prev.right = current 58 | current = current.left 59 | ``` 60 | 61 | The post-order visit can be done in a similar fashion as in this post https://github.com/xiaoylu/leetcode_category/blob/master/TreeTraversal/Readme.md. 
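As a concrete, runnable version of the pseudocode above (a sketch assuming a standard `TreeNode` with `val`, `left` and `right`, which is not defined in this note), an in-order Morris traversal that collects values looks like:

```
def morris_inorder(root):
    # O(N) time, O(1) extra space; the threads are removed, so the tree is restored
    result, current = [], root
    while current:
        if current.left is None:
            result.append(current.val)            # visit
            current = current.right
        else:
            prev = current.left
            while prev.right and prev.right is not current:
                prev = prev.right
            if prev.right is current:             # second visit: cut the thread, go up
                prev.right = None
                result.append(current.val)        # in-order position
                current = current.right
            else:                                 # first visit: create the thread, go down
                prev.right = current
                current = current.left
    return result
```

Moving the `append` into the branch that creates the thread gives the pre-order variant, matching the `# pre-order here` comment above.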

-------------------------------------------------------------------------------- /Parentheses/Readme.md: --------------------------------------------------------------------------------
# The parentheses problem is essentially a stack problem.

**LC 301. Remove Invalid Parentheses**
```
def removeInvalidParentheses(self, s):
    # when you have one extra ")", remove it
    # caution: remove a ")" only at or after the position of the last removal to avoid duplicates
    def fix(s, start, last_removal):
        counter = 0
        for i in range(start, len(s)):
            if s[i]==')': counter -= 1
            elif s[i]=='(': counter += 1
            if counter < 0: # fix
                ret = []
                for j in range(last_removal, i+1):
                    if s[j]==')' and (j==0 or s[j-1] != s[j]):
                        ret += fix(s[:j]+s[j+1:], i, j)
                return ret
        return [s]

    # (() => )(( => ())
    def reverse(s):
        ss = ""
        for c in s[::-1]:
            if '('==c: ss += ")"
            elif ')'==c: ss += "("
            else: ss += c
        return ss

    right = fix(s, 0, 0)
    left = []
    for s in right:
        left += fix(reverse(s),0,0)
    return [reverse(s) for s in left]
```
-------------------------------------------------------------------------------- /ParseStrings/Readme.md: --------------------------------------------------------------------------------
Parse Strings
===

To parse an input with brackets, symbols and numbers, the tricks below may help:

1. Use stacks to store numbers or symbols.
2. Read chunks of the input at a time. For example, here we read a contiguous run of digits as one decimal number.

```
while i < N:
    j = i
    while j < N and S[j].isdigit():
        j += 1
    stack.append( int(S[i:j]) )
    i = j
```

3. Operator priority: when should we combine terms?

Be lazy: carry out the lower-priority operations only when you have to (i.e., when a higher-priority operation is encountered).
-------------------------------------------------------------------------------- /PinterestSystemDesign/Readme.md: --------------------------------------------------------------------------------
A case study of Pinterest system design
===

Smart Feed
---
https://medium.com/@Pinterest_Engineering/building-a-smarter-home-feed-ad1918fdfbe3

Smart Feed Worker
* Puts collected repins, related pins and interest pins into three pools
* Each pool is a priority queue sorted on score and belongs to a single user. (via key-based sorting of HBase)

Smart Feed Content Generator
* fetch from each pool

Smart Feed Service
* the materialized feed represents a frozen view of the feed as it was the last time the user viewed it.

17 | * when no new feed, the smart feed service will return the content contained in the materialized feed 18 | 19 | Dynamic and Responsive Pinterest 20 | --- 21 | https://medium.com/pinterest-engineering/building-a-dynamic-and-responsive-pinterest-7d410e99f0a9 22 | 23 | Online: Display 24 | Offline: Generating the recommendations 25 | * find candidates 26 | * Feed Generator sends to Polaris the board list and the bloom filter consisting of the impression history of the user 27 | * 28 | * find features 29 | * scoring 30 | * display 31 | 32 | Pixie 33 | --- 34 | https://medium.com/@Pinterest_Engineering/introducing-pixie-an-advanced-graph-based-recommendation-system-e7b4229b664b 35 | * periodically loads into memory an offline-generated graph consisting of boards and Pins 36 | * when recommended boards are requested for a user, a random walk is simulated in the Pixie graph by using the Pins engaged by the user as starting points. 37 | 38 | Graph Convolutional Neural Networks for Web-Scale Recommender Systems (KDD 2018) 39 | --- 40 | https://arxiv.org/pdf/1806.01973.pdf 41 | 42 | * PinSage algorithm performs efficient, localized convolutions by sampling the neighborhood around a node and dynamically constructing a computation graph from this sampled neighborhood. 43 | * Basic idea: transform the representations of u’s neighbors through a dense neural network and then apply a aggregator/pooling fuction 44 | * We then concatenate the aggregated neighborhood vector nu with u’s current representation and transform the **concatenated** vector through another dense 45 | neural network layer 46 | * Supervised algorithm: the goal of the training phase is to optimize the PinSage parameters so that the output embeddings of pairs in the labeled set are close together. 47 | 48 | 49 | -------------------------------------------------------------------------------- /Queue/Readme.md: -------------------------------------------------------------------------------- 1 | # Priority Queue 2 | 3 | ## Sort + Priortiy Queue 4 | 5 | **LC 857. Minimum Cost to Hire K Workers** 6 | 7 | There are N workers. The i-th worker has a `quality[i]` and a minimum wage expectation `wage[i]`. 8 | 9 | Now we want to hire exactly K workers to form a paid group. When hiring a group of K workers, we must pay them according to the following rules: 10 | 11 | * Every worker in the paid group should be paid in the ratio of their quality compared to other workers in the paid group. 12 | * Every worker in the paid group must be paid at least their minimum wage expectation. 13 | 14 | Return the least amount of money needed to form a paid group satisfying the above conditions. 15 | 16 | We compute the ratio of money/quality for every worker. Given a group, the maximum money/quality ratio should be paid. 17 | 18 | So the least amount of money would be `max(r1, r2, .. rk) * sum(q1, q2, .. qk)` where `ri` is the money/quality ratio and `qi` is the quality. Note that if we already have `K` workers with the smallest sum of quality, the only way to save money is to replace the worker with highest pay. 19 | 20 | Dynamic Programming view: 21 | * Sort the works by their qualities 22 | * The optimal `K` workers in the first `i-1` workers are known 23 | * If the `i`-th worker joins, then we must replace the worker with highest ratio in the first `i-1` workers; Otherwise, both the sum of quality and highest ratio would increase as `i`-th worker joins. 
24 | 25 | ``` 26 | def mincostToHireWorkers(self, quality, wage, K): 27 | # q 10 20 5 28 | # r 7 2.5 6 29 | # 5 10 20 30 30 | # 6 7 2.5 99 31 | # max(r) * sum(q) 32 | # start with smallest sum(q), 33 | # we have to increase sum(q) while reducing max(r) 34 | 35 | if not quality or K==0: return 0 36 | 37 | rate = [] 38 | for q, w in zip(quality, wage): 39 | rate.append(w/q) 40 | A = sorted((q, r) for q, r in zip(quality, rate)) 41 | 42 | hp = [(-r, q) for q, r in A[:K]] 43 | heapq.heapify(hp) 44 | sumq, maxr = sum(q for q, r in A[:K]), max(r for q, r in A[:K]) 45 | 46 | res = sumq * maxr 47 | for q, r in A[K:]: 48 | _R, Q = heapq.heappop(hp) 49 | heapq.heappush(hp, (-r, q)) 50 | 51 | sumq = sumq - Q + q 52 | maxr = -hp[0][0] 53 | res = min(res, sumq * maxr) 54 | return res 55 | ``` 56 | 57 | ## Two queues for medium 58 | **LC 295. Find Median from Data Stream** 59 | 60 | Design a data structure that supports the following two operations: 61 | 62 | * `void addNum(int num)` - Add a integer number from the data stream to the data structure. 63 | * `double findMedian()` - Return the median of all elements so far. 64 | 65 | ``` 66 | def addNum(self, num): 67 | # the size of hi queue is equal to or one more than the size of lo queue 68 | # len(self.hi) == len(self.lo) 69 | # or len(self.hi) == len(self.lo) + 1 70 | if not self.hi: 71 | heappush(self.hi, num) 72 | elif num >= self.hi[0]: 73 | heappush(self.hi, num) 74 | if len(self.hi) > len(self.lo) + 1: 75 | heappush(self.lo, -heappop(self.hi)) 76 | else: 77 | heappush(self.lo, -num) 78 | if len(self.hi) < len(self.lo): 79 | heappush(self.hi, -heappop(self.lo)) 80 | ``` 81 | 82 | **LC 480. Sliding Window Median** 83 | 84 | See 85 | 86 | 87 | Insert the i-th element (try self.hi first, if fails, then try self.lo) 88 | Remove the (i-k)-th element (try self.hi first, if fails, then try self.lo) 89 | 90 | Then the size of **effective** elements in two queues can only differ by 0 or 2. 91 | * If differ by 2, move one element from one queue to the other 92 | * If differ by 0, do nothing. 93 | 94 | Before you re-balance two heaps, make sure at the top of a heap is an effective element. 95 | Before you compute the median, make sure both tops are effective elements. 96 | 97 | ``` 98 | from heapq import heappush, heappop 99 | class Solution: 100 | def medianSlidingWindow(self, nums, k): 101 | """ 102 | :type nums: List[int] 103 | :type k: int 104 | :rtype: List[float] 105 | """ 106 | if not nums or k > len(nums) or k == 0: return [] 107 | 108 | def clean(lo, hi, count): 109 | while lo and count[-lo[0]] > 0: count[-lo[0]] -= 1; heappop(lo) 110 | while hi and count[hi[0]] > 0: count[hi[0]] -= 1; heappop(hi) 111 | 112 | lo, hi = [], [] 113 | count = collections.defaultdict(int) 114 | 115 | for i in range(k): heappush(hi, nums[i]) 116 | for _ in range(k//2): heappush(lo, -heappop(hi)) 117 | 118 | # now we have len(h) >= len(l) 119 | res = [] 120 | for i in range(k, len(nums)): 121 | 122 | if k & 1: res.append(float(hi[0])) 123 | else: res.append((hi[0] - lo[0]) / 2.) 
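            # 'balance' below tracks how the effective sizes of hi and lo drift:
            # +1 when hi gains an element or lo loses one, -1 in the opposite case,
            # so after one insertion and one lazy removal it ends up as -2, 0 or +2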
124 | 125 | balance = 0 126 | 127 | # push the current element 128 | if nums[i] >= hi[0]: balance += 1; heappush(hi, nums[i]) 129 | else: balance -= 1; heappush(lo, -nums[i]) 130 | 131 | # pop the element k steps before 132 | if nums[i-k] >= hi[0]: 133 | balance -= 1 134 | if nums[i-k] == hi[0]: heappop(hi) 135 | else: count[nums[i-k]] += 1 136 | else: 137 | balance += 1 138 | if nums[i-k] == -lo[0]: heappop(lo) 139 | else: count[nums[i-k]] += 1 140 | 141 | clean(lo, hi, count) 142 | 143 | # at the top of hi/lo must be an effective element?? 144 | if balance > 0: heappush(lo, -heappop(hi)) 145 | elif balance < 0: heappush(hi, -heappop(lo)) 146 | 147 | clean(lo, hi, count) 148 | 149 | if k & 1: res.append(float(hi[0])) 150 | else: res.append((hi[0] - lo[0]) / 2.0) 151 | 152 | return res 153 | ``` 154 | 155 | ## Lazy deletion 156 | 157 | Unlike Java, Python heapq does not support removal, you can do lazy deletion which removes a element only when it's at the top. Specifically, 158 | * Use a hash table to count the number of removals of each element 159 | * When such element emerges to the top of a queue, remove it, decrease the count of removal until zero 160 | 161 | It is illustrated in LC 480 above. To implement your own removal, for binary heap, you sift up the element (choose the min of parent and two kids, and make it the parent, then traverse upward...) and then sift down the element. Two many lines for interview, so just do lazy deletion. 162 | 163 | **LC 716. Max Stack** 164 | 165 | Design a max stack that supports push, pop, top, peekMax and popMax. 166 | 167 | 1. push(x) -- Push element x onto stack. 168 | 2. pop() -- Remove the element on top of the stack and return it. 169 | 3. top() -- Get the element on the top. 170 | 4. peekMax() -- Retrieve the maximum element in the stack. 171 | 5. popMax() -- Retrieve the maximum element in the stack, and remove it. If you find more than one maximum elements, only remove the top-most one. 172 | 173 | ``` 174 | class MaxStack: 175 | def __init__(self): 176 | self.stack = [] 177 | self.queue = [] 178 | self.t = 0 179 | # Use the hash of current time 180 | # self.t = hash(time.time()) 181 | 182 | self.remove_stack = set() 183 | self.remove_queue = set() 184 | 185 | def push(self, x): 186 | self.stack.append((x, self.t)) 187 | heappush(self.queue, (-x, -self.t)) 188 | self.t += 1 189 | # Use the hash of current time 190 | # self.t = hash(time.time()) 191 | 192 | def pop(self): 193 | self.top() 194 | value, t = self.stack.pop() 195 | self.remove_queue.add(t) 196 | return value 197 | 198 | def top(self): 199 | while self.stack and self.stack[-1][1] in self.remove_stack: 200 | self.remove_stack.remove(self.stack[-1][1]) 201 | self.stack.pop() 202 | return self.stack[-1][0] 203 | 204 | def peekMax(self): 205 | while self.queue and -self.queue[0][1] in self.remove_queue: 206 | self.remove_queue.remove(-self.queue[0][1]) 207 | heappop(self.queue) 208 | return -self.queue[0][0] 209 | 210 | def popMax(self): 211 | self.peekMax() 212 | value, t = heappop(self.queue) 213 | self.remove_stack.add(-t) 214 | return -value 215 | ``` 216 | 217 | 218 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Leetcode by Category 2 | === 3 | 4 | In this repo, I maintain my notes about Leetcode problems. I also included some notes about Lintcode and CodeForce problems. 

Topics:
---

Basic:
* DFS
* BFS
* Back Tracking
* Binary Search
* Bit Operation
* Dynamic Programming (DP)
* Knapsack Problems
* Intervals
* Parentheses Problems
* Merge Sort
* Shortest Distance
* Random Number Generation
* Regex Matching
* Tree Traversal
* Two Pointers Problems
* Heap
* Stack
* Queue
* kSum Problems

Intermediate:

* Euler Path/Tour
* Geometry Problems
* DP of sequence operations
* Monotonic Queue
* Sparse Table
* Strongly Connected Component
* Topological Sort
* Trie
* Union Find

Advanced:
* KMP
* Morris Traversal
* Segment Tree
* Binary Index Tree (BIT)
* Max-flow (and min-cost max-flow)
* Wavelet Tree
* Suffix Tree (Ukkonen's algorithm)

Others:
* System Design
* [Javascript basics](language/Javascript/Readme.md)
* [C++](language/Cplusplus/Readme.md)
* [Java](language/Java/Readme.md)
* [Go](language/Go/Readme.md)
-------------------------------------------------------------------------------- /Random/Readme.md: --------------------------------------------------------------------------------
# Random Sampling

**LC 382. Linked List Random Node** Pick a random element from a linked list.

The problem is a little ambiguous. In an interview, you should ask whether the list length is unknown but static, or unknown and changing dynamically. In the first case, you can simply precompute the length and generate random indices based on it, which is faster than the reservoir sampling solution.

If the list length changes dynamically, reservoir sampling is a good choice. If you are not familiar with it, check here.

```
import random

class Solution(object):

    def __init__(self, head):
        self.head = head

    def getRandom(self):
        result, node, index = self.head, self.head.next, 1
        while node:
            if random.randint(0, index) == 0:
                result = node
            node = node.next
            index += 1
        return result.val
```
-------------------------------------------------------------------------------- /Regex/Readme.md: --------------------------------------------------------------------------------
# Regular Expression Match

The regular expression matching problem can be abstracted as a DP problem. `dp[i][j]` indicates the match between the string `s[:i+1]` and the pattern `p[:j+1]`:

* Given the wildcard `p[j] == "*"`, we can reduce the current match between `s[:i]` and `p[:j]` to sub-states:
  * Match `s[i]` with the `*` at `p[j]` and remove this `*`, so `dp[i][j] => dp[i-1][j-1]`
  * Match `s[i]` with the `*` at `p[j]`, but the `*` remains, so `dp[i][j] => dp[i-1][j]`
  * We do not match the `*`, so `dp[i][j] => dp[i][j-1]`

* Given the wildcard `p[j] == "."`, we must match it with the char at `s[i]`

**LC 44. Wildcard Matching**, implement wildcard pattern matching with support for '?' and '*'.
13 | 14 | ``` 15 | def isMatch(self, s, p): 16 | if not s and not p: return True 17 | if not s: return len(p.replace("*", ""))==0 18 | N, M = len(s), len(p) 19 | dp = [[False] * (M + 1) for _ in range(N + 1)] 20 | dp[0][0] = True # dp[i][j] is the match with length-i of s and length-j of p 21 | for j in range(1, M + 1): 22 | if p[j - 1] == '*': 23 | dp[0][j] = dp[0][j-1] 24 | for i in range(1, N + 1): 25 | for j in range(1, M + 1): 26 | if p[j - 1] == '*': 27 | # three cases: zero one duo 28 | dp[i][j] = dp[i][j-1] or dp[i-1][j-1] or dp[i-1][j] 29 | elif p[j - 1] == '?' or p[j - 1] == s[i - 1]: 30 | dp[i][j] = dp[i-1][j-1] 31 | else: 32 | dp[i][j] = False 33 | return dp[N][M] 34 | ``` 35 | 36 | **LC 10. Regular Expression Matching** implement regular expression matching with support for '.' and '*'. 37 | 38 | * '.' Matches any single character. 39 | * '*' Matches zero or more of the preceding element. 40 | 41 | The only difference is that we need to look back two steps. 42 | 43 | ``` 44 | def isMatch(self, s, p): 45 | if not s and not p: return True 46 | N, M = len(s), len(p) 47 | dp = [[False] * (M + 1) for _ in range(N + 1)] 48 | dp[0][0] = True 49 | for j in range(1, M + 1): 50 | if j >= 2 and p[j - 1] == '*': 51 | dp[0][j] = dp[0][j - 2] 52 | if not s: return dp[0][M] 53 | for i in range(1, N + 1): 54 | for j in range(1, M + 1): 55 | if p[j - 1] == '.': 56 | dp[i][j] = dp[i - 1][j - 1] 57 | elif p[j - 1] == '*': 58 | if j - 2 < 0: dp[i][j] = False 59 | elif p[j - 2] == '.' or p[j - 2] == s[i - 1]: 60 | # empty match one match duo 61 | dp[i][j] = dp[i][j-2] or dp[i-1][j-2] or dp[i-1][j] 62 | else: 63 | dp[i][j] = dp[i][j - 2] 64 | elif p[j - 1] == s[i - 1]: 65 | dp[i][j] = dp[i - 1][j - 1] 66 | else: 67 | dp[i][j] = False 68 | return dp[N][M] 69 | ``` 70 | -------------------------------------------------------------------------------- /SegmentTree/Readme.md: -------------------------------------------------------------------------------- 1 | ## Segment tree (supporting range operation) 2 | Segment tree supports the update and query of sum/max/min value within a given range of elements. 3 | 4 | * The "best" C++ introduction [by (DarthPrince)](http://codeforces.com/blog/entry/15729) 5 | * Really concise C++ code 6 | * Construct tree dynamically using [Java](https://leetcode.com/problems/my-calendar-iii/discuss/109568/Java-Solution-O(n-log(len))-beats-100-Segment-Tree) 7 | 8 | You can even do binary search in segment trees using the sum value at each segment tree node! 9 | 10 | Segment Tree supports range operations. We go from root to leaves, when one interval is **completed covered** by the query range, we can stop there and save the updates as lazy value. The next time when we need to go down through that node, we can carry such lazy value down to the children. In this way, every level of the tree has only two active nodes. The time complexity for range operation is the tree depth `O(log N)` where `N` is the size of the range. 11 | 12 | **LC732. My Calendar III** 13 | Return an integer K representing the largest integer such that there exists a K-booking in the calendar. (There are K books overlapping at the same period.) 
14 | 15 | Concise Python: 16 | ``` 17 | class MyCalendarThree(object): 18 | 19 | def __init__(self): 20 | self.seg = collections.defaultdict(int) 21 | self.lazy = collections.defaultdict(int) 22 | 23 | def book(self, start, end): 24 | def update(s, e, l = 0, r = 10**9, ID = 1): 25 | if r <= s or e <= l: return 26 | if s <= l < r <= e: 27 | self.seg[ID] += 1 28 | self.lazy[ID] += 1 29 | else: 30 | m = (l + r) // 2 31 | update(s, e, l, m, 2 * ID) 32 | update(s, e, m, r, 2*ID+1) 33 | self.seg[ID] = self.lazy[ID] + max(self.seg[2*ID], self.seg[2*ID+1]) 34 | update(start, end) 35 | return self.seg[1] + self.lazy[1] 36 | ``` 37 | The usage of ID is similar to the index of array element in a heap -- 2ID and 2ID+1 are the indices of the two kids. 38 | ex. 1 is the root, 2, 3 are the left and right kids of root, so on so forth. 39 | 40 | Time complexity O(1) - In each update, only two nodes can be active at every level of the segment tree. Since the segment tree has max range `[0, 10**9)`, the depth of segment tree is `log(10**9) = O(1)`. 41 | 42 | **LC 307. Range Sum Query - Mutable** 43 | ``` 44 | class NumArray(object): 45 | def __init__(self, nums): 46 | self.seg = collections.defaultdict(int) 47 | for i, n in enumerate(nums): self.update(i, n) 48 | 49 | def update(self, i, val, l = 0, r = 10**9, ID = 1): 50 | if l == i == r - 1: 51 | self.seg[ID] = val 52 | else: 53 | m = (l + r) // 2 54 | if i < m: self.update(i, val, l, m, 2 * ID) 55 | else: self.update(i, val, m, r, 2*ID + 1) 56 | self.seg[ID] = self.seg[2*ID] + self.seg[2*ID+1] 57 | 58 | def sumRange(self, i, j, l = 0, r = 10**9, ID = 1): 59 | if ID == 1: j += 1 # make the input range inclusive-exclusive [i, j] --> [i,j) 60 | 61 | if j <= l or r <= i: return 0 62 | if i <= l < r <= j: return self.seg[ID] 63 | m = (l + r) // 2 64 | return self.sumRange(i, j, l, m, 2*ID) + self.sumRange(i, j, m, r, 2*ID+1) 65 | ``` 66 | 67 | ## Range sum with lazy operation 68 | C++ code from [C++ by (DarthPrince)](http://codeforces.com/blog/entry/15729) 69 | ``` 70 | // A function to update a node : 71 | void upd(int id,int l,int r,int x){// increase all members in this interval by x 72 | lazy[id] += x; 73 | s[id] += (r - l) * x; 74 | } 75 | 76 | // A function to pass the update information to its children : 77 | void shift(int id,int l,int r){//pass update information to the children 78 | int mid = (l+r)/2; 79 | upd(id * 2, l, mid, lazy[id]); 80 | upd(id * 2 + 1, mid, r, lazy[id]); 81 | lazy[id] = 0;// passing is done 82 | } 83 | 84 | // A function to perform increase queries : 85 | void increase(int x,int y,int v,int id = 1,int l = 0,int r = n){ 86 | if(x >= r or l >= y) return ; 87 | if(x <= l && r <= y){ 88 | upd(id, l, r, v); 89 | return ; 90 | } 91 | shift(id, l, r); 92 | int mid = (l+r)/2; 93 | increase(x, y, v, id * 2, l, mid); 94 | increase(x, y, v, id*2+1, mid, r); 95 | s[id] = s[id * 2] + s[id * 2 + 1]; 96 | } 97 | ``` 98 | Note that `shift` should be called at every level right **before** the function `increase` traverses to the children! 99 | 100 | **LC 699. Falling Squares** 101 | 102 | 俄罗斯方块堆箱子,求每一块放下后的最高高度 103 | 104 | 输入格式为[[左边界,宽度],......] 105 | 106 | Input: [[1, 2], [2, 3], [6, 1]] 107 | 108 | Output: [2, 5, 5] 109 | 110 | Explanation: 111 | 112 | After the first drop of positions[0] = [1, 2]: 113 | ``` 114 | _aa 115 | _aa 116 | ------- 117 | The maximum height of any square is 2. 
118 | ``` 119 | 120 | After the second drop of positions[1] = [2, 3]: 121 | ``` 122 | __aaa 123 | __aaa 124 | __aaa 125 | _aa__ 126 | _aa__ 127 | -------------- 128 | ``` 129 | The maximum height of any square is 5. 130 | 131 | ``` 132 | class Solution: 133 | def fallingSquares(self, positions): 134 | v = collections.defaultdict(int) 135 | lazy = collections.defaultdict(int) 136 | 137 | def query(x, y, l = 0, r = 2046, ID = 1): 138 | if y <= l or x >= r: return float('-inf') 139 | if x <= l < r <= y: return v[ID] 140 | # note the lazy propagation before quering kids 141 | shift(l, r, ID) 142 | m = (l + r) // 2 143 | return max(query(x, y, l, m, 2 * ID), query(x, y, m, r, 2*ID+1)) 144 | 145 | def upd(l, r, ID, val): 146 | v[ID] = max(v[ID], val) 147 | lazy[ID] = max(lazy[ID], val) 148 | 149 | def shift(l, r, ID): 150 | m = (l + r) // 2 151 | upd(l, m, 2*ID, lazy[ID]) 152 | upd(m, r, 2*ID+1, lazy[ID]) 153 | lazy[ID] = float('-inf') 154 | 155 | def add(x, y, val, l = 0, r = 2046, ID = 1): 156 | if y <= l or x >= r: return 157 | if x <= l < r <= y: 158 | upd(l, r, ID, val) 159 | return 160 | 161 | shift(l, r, ID) 162 | m = (l + r) // 2 163 | add(x, y, val, l, m, 2 * ID) 164 | add(x, y, val, m, r, 2*ID+1) 165 | v[ID] = max(v[2*ID], v[2*ID+1]) 166 | 167 | L = list(set([j for i, side in positions for j in [i, i+side]])) 168 | L.sort() 169 | indice = {x:i for i, x in enumerate(L)} 170 | 171 | R = [] 172 | for i, side in positions: 173 | x, y = indice[i], indice[i+side] 174 | add(x, y, side + query(x, y)) 175 | R.append(v[1]) 176 | return R 177 | ``` 178 | 179 | ## Two-dimensional Segment Tree 180 | 181 | A segment tree over `x` coordinates. Each node of the x-coordinate (outer) segment tree stores a segment tree on `y` coordinates that corresponds to "strip" `[xl, xr]`, meaning that it will store all the sums of rectangles `[xl,xr]×[yl,yr]` where `[yl,yr]` is a valid atomic segment tree segment. 182 | 183 | While answering queries you basically first traverse by outer tree until you find `O(logn)` segments that fits your x-part-of-a-query and then for each such segment you are traversing in corresponding segment tree until you find a set of segments that match your y-part-of-a-query. 184 | 185 | "Before writing a 2D-segment tree you should consider using partial sums, 2D-Fenwick tree or O(1)-static-rmq for 2D since all of them are like 10× shorter and usually faster." 
---- cited from a Quora answer by an ACMer 186 | -------------------------------------------------------------------------------- /ShortPythonHacks/Readme.md: -------------------------------------------------------------------------------- 1 | Short Python Hacks 2 | === 3 | 4 | Find shortest string 5 | > `min([s0, s1, s2], key=len)` 6 | 7 | Find replacing segment in a string `abcabc => abc * 2` 8 | > `i = (s + s).find(s, 1)` 9 | 10 | Get columns of mattrix represented as a list of lists 11 | > `for col in zip(*A): print(col)` 12 | 13 | Append a list of characters to the end of a list of strings: 14 | > `zip([['abc','efg'], ['d', 'h']])` 15 | 16 | Get alphabetics from a list of words: 17 | > `chars = set(''.join(words))` 18 | 19 | Transpose a matrix: 20 | > `z = list(map(list, zip(*A)))` 21 | -------------------------------------------------------------------------------- /ShortestDistance/Readme.md: -------------------------------------------------------------------------------- 1 | # Shortest Distance 2 | 3 | ## Dijkstra's algorithm 4 | 5 | The time complexity depends on how to support the operations 6 | * **return the min key** to find the next nearest node (T1 time) 7 | * **decrease some key** to update the distance to the neighbors of the nearest node (T2 time) 8 | 9 | We return every vertex with min distance once, then decrease neighbors' distance along its out-going edges. 10 | 11 | The time complexity is: `O(E T2 + V T1)`. 12 | 13 | If you use a binary heap or BST, `T1=T2=O(log V)` 14 | Elif you use a Fibonacci heap, `T2=O(1)` and `T1 = O(log V)`. 15 | 16 | 17 | See for algorithm pseudo code 18 | and for compaison between heap implementations. 19 | 20 | But, how to quickly finish the code in interviews? One trick is to skip the decrease key operation and just insert the value directly. 21 | This step would increase the size of the queue from `V` to `E`, leading to `O(E log E)` time, but it simplifies your code. You should make sure the interviewee knows that you know the Fibonacci heap implementation later. 22 | 23 | Using C++ `std::priority_queue` [DarthPrince's blog](https://codeforces.com/blog/entry/16221): 24 | ``` 25 | bool mark[MAXN]; 26 | void dijkstra(int v){ 27 | fill(d,d + n, inf); 28 | fill(mark, mark + n, false); 29 | d[v] = 0; 30 | int u; 31 | priority_queue,vector >, less > > pq; 32 | pq.push({d[v], v}); 33 | while(!pq.empty()){ 34 | u = pq.top().second; // pop the shortest distance to u 35 | pq.pop(); 36 | if(mark[u]) 37 | continue; 38 | mark[u] = true; 39 | for(auto p : adj[u]) //adj[v][i] = pair(vertex, weight) 40 | if(d[p.first] > d[u] + p.second){ 41 | d[p.first] = d[u] + p.second; 42 | pq.push({d[p.first], p.first}); 43 | } 44 | } 45 | } 46 | ``` 47 | 48 | Note that a node might be poped up from the queue multiple times... with further distance from the source node. We eliminate them by the `mark`s. 49 | 50 | It is quite interesting that we can skip `mark`s in the problem below: 51 | 52 | **LC 787. 
Cheapest Flights Within K Stops** Find the shortest path with at most `k+2` nodes 53 | 54 | ``` 55 | def findCheapestPrice(self, n, flights, src, dst, K): 56 | graph = collections.defaultdict(list) 57 | for i, j, w in flights: 58 | graph[i].append((j, w)) 59 | Q = [(0, src, 0)] 60 | while Q: 61 | dist, i, k = heapq.heappop(Q) 62 | if i == dst: return dist 63 | if k <= K: 64 | for j, w in graph[i]: 65 | heapq.heappush(Q, (dist + w, j, k + 1)) 66 | return -1 67 | ``` 68 | 69 | The queue size is `O(E)` in the worst case because every update might insert one element into queue. The time complexity becomes `O(E log E)`. But we should eliminate the loop here. See example, 70 | 71 | ``` 72 | 0----2-----3----5 73 | \ / \ / 74 | 1 4 75 | ``` 76 | To get to 2, 0-1-2 might be shorter, say distance(0-1-2)=100, but 0-2 takes less steps, distance(0-2)=200. 77 | However, we can go 0-2-3-4-5, but can not go 0-1-2-3-4-5, if K = 3. 78 | So, we should insert 0-2 into the queue. 79 | 80 | In short, a node should be updated here if (i) shorter distance (ii) less number of steps. 81 | 82 | The heap solution checks (i) and consider all the possible number of steps. This is also what Dijkstra does because it assumes any steps is alright. 83 | 84 | We can prune the search space using the second measure also. (Record the current least number of steps to reach a pop-up node, and update only if less No. of steps is found) 85 | 86 | ## Searching states in a graph 87 | 88 | Many problems can be viewed as a shortest distance problem. It might need some attention to realize that the essence of problem. But these problems are common in the way that: 89 | * The process can be divided in primitive segments, which is a jump/edge in the states graph 90 | * The goal is to transit from a starting state to some desired states. 91 | 92 | Problems like this include: 93 | 94 | * **LC 753. Cracking the Safe** The sequence (3 digits for example) 000,001,010,011,100,101,110,111 95 | can be treated as the edges: 96 | ``` 97 | 10 98 | / \ 99 | 00 11 100 | \ / 101 | 01 102 | ``` 103 | 000 is a self-edge from 00 to 00, 001 is an edge from 00 to 01, so on so forth... 104 | The essence of this problem is to find out a shortest path visiting all edges of the graph. We might visit an edge multiple times in a graph, but for our graph, there exist a Eular path visiting every edge eaxctly once because every node has out-degree TWO. 105 | 106 | * **LC 864. Shortest Path to Get All Keys** 107 | We consider the lock, key, start point @ as interest points. Since we won't touch the walls, the problem is to find a sequence of interests point pairs which are combined to be the final route. The state is `(keys, i, j)` where `keys` is the set of obtained keys, `i, j` is a pair of interest points. The edge goes from `(keys, i, j)` to `(keys, j, k)` if `j` is not a key; otherwise to `(keys+{j}, j, k)` if `j` is a key. The corresponding weight of an edge is the shortest distance from `j` to `k`, which can be found by BFS. So, we start with no key, and end up with all keys, the problem asks for the shortest distance in the weighted states graph. 108 | 109 | 110 | 111 | 112 | 113 | -------------------------------------------------------------------------------- /SparseTable/Readme.md: -------------------------------------------------------------------------------- 1 | Sparse Table 2 | === 3 | 4 | > `O(n logn)` time pre-processing and `O(1)` time query. 
5 | > `d[i][j]` stores the max/min element in `[ i, i+(1< DP construction time of `d[i][j]` for at most `O(logn)` values of `j`s 7 | > When query, just return the max/min of `d[l][j]` and `d[r-(1< N: break 16 | d[i][j] = min(d[i][j-1], d[i + (1<<(j-1))][j-1]) if j > 0 else A[i] 17 | 18 | def query(u, d, l, r): # [l, r) inclusive-exclusive 19 | j = int(math.log(r - l, 2)) 20 | return min(d[l][j], d[r-(1< u`, then `u -> v`. 44 | 45 | Proof by contradiction: otherwise, assume no path `u -> v`, then `v` should be above `u` in the stack as `v -> u`, but `v` is below `u` actually. 46 | 47 | Implementation trick: 48 | 49 | The first and second DFS can share the same `visited` array. The dfs1 starts with all-false `visited` and set the visited nodes as `true`. And dfs2 starts with the all-true `visited` and set the visited nodes as `false`. 50 | 51 | ``` 52 | N = len(graph) 53 | stack = [] 54 | vis = set() 55 | SCC = {} 56 | 57 | rgraph = [[] for _ in range(N)] 58 | for i in range(N): 59 | for j in graph[i]: 60 | rgraph[j].append(i) 61 | 62 | def dfs1(i): 63 | if i not in vis: 64 | vis.add(i) 65 | for j in graph[i]: 66 | dfs1(j) 67 | stack.append(i) 68 | 69 | for i in range(N): dfs1(i) 70 | 71 | def dfs2(i, scc): 72 | if i in vis: 73 | vis.remove(i) 74 | SCC[i] = scc 75 | for j in rG[i]: 76 | dfs2(j, scc) 77 | 78 | while stack: 79 | i = stack.pop() 80 | dfs2(i, i) 81 | 82 | print(SCC) 83 | ``` 84 | -------------------------------------------------------------------------------- /SuffixTree/Readme.md: -------------------------------------------------------------------------------- 1 | Suffix Tree 2 | === 3 | 4 | A good post about suffix tree 5 | [Ukkonen's suffix tree algorithm in plain English](https://stackoverflow.com/questions/9452701/ukkonens-suffix-tree-algorithm-in-plain-english) 6 | 7 | Suffix tree is a "trie" which stores all the suffices of a string. 8 | 9 | The Ukkonen's construction algorithm insert one character at a time (from left to right). 10 | 11 | Many thanks to the [visualization](http://brenden.github.io/ukkonen-animation/) tool by Brenden 12 | 13 | Implicit Tree 14 | --- 15 | Given a string `A`, each leaf edge of the implicit tree represents `A[i:#]`, where `#` indicates the current last char; when we insert a new char, the leaf edges **implicitly** extend, no operation is needed. 16 | 17 | Split 18 | --- 19 | Example: `A=abcabx` 20 | 21 | Inserting `abcab`: the implicit tree automatically grows the edges `abcab`, `bcab`, `cab` from root. 22 | 23 | Note the second `ab` matches the first `ab`. So we just move the active point to `ab|cab`. 24 | 25 | ![Imgur](https://i.imgur.com/n7c2xx8.png) 26 | 27 | Now, when we insert `x`, there is no matching letter, so we need to split at the current active point 28 | ![Imgur](https://i.imgur.com/BGIgKA5.png) 29 | 30 | We maintain a variable `remainder` which tells us how many additional inserts we need to make. 31 | 32 | Case in point: we deal with `abx` already, so `remainder = 2` because it is the turn of `bx` and `x`. 33 | 34 | To insert `bx`, we restart from the root and move the active point to `b|cabx`. 35 | 36 | Suffix link 37 | --- 38 | Due to the `x`, a split at the active point `b` is needed. After the split, the previous active point is linked to the current active point. The link illustrated by the dashed line below is called **suffix link**. 
39 | 40 | ![Imgur](https://i.imgur.com/EmRp5Rf.png) 41 | 42 | 43 | **Why suffix link?** 44 | 45 | Suffix links enable us to reset the active point so we can make the next remaining insert 46 | 47 | For example, when dealing with suffix `abc...` and `bc..`, we know that, if a split occurs at the active point at `ab|...`, then another split must be done at the active point `b|...`. 48 | 49 | Instead of restarting from the root to match the `b`, we can follow the suffix link to get to the next active point. 50 | 51 | For example: insert the remaining `by` for `abcabxaby`. 52 | 53 | we also have the suffix link 54 | 55 | ![Imgur](https://i.imgur.com/3OtL7xK.png) 56 | 57 | Follow the suffix link to insert `by` 58 | * set the node to which suffix link points as the next active point 59 | * make a split there 60 | 61 | ![Imgur](https://i.imgur.com/nr6LGOa.png) 62 | 63 | Summarization 64 | --- 65 | Variables: 66 | * active_node: the node where we start matching the letters 67 | * active_edge: the letter indicating which kid node to follow 68 | * active_length: the number of letters that are already matched 69 | * remainder: the number of remaining inserts 70 | 71 | We always seek to match as many new letters in the tree as possible utill a split is inevitable. 72 | In such case, we only change `active_node`, `active_edge`, `active_length` and increment `remainder`. The tree does not change. 73 | 74 | When a split is needed, we add a new kid to the current active node, connect the previous active node to the current active node by a suffix link. (except when the previous active node is the root.) 75 | 76 | After the split at the active node, follow the suffix link of the current active node, if there is any, to go to the next active node. (`active_edge` and `active_length` stay unchanged.) 77 | Otherwise, if there is no suffix link from the current active node, restart from the root. 78 | 79 | Code 80 | --- 81 | C++ code can be found here. 82 | https://github.com/ADJA/algos/blob/master/Strings/UkkonenSuffixTree.cpp 83 | 84 | Java code 85 | https://gist.github.com/makagonov/22ab3675e3fc0031314e8535ffcbee2c 86 | -------------------------------------------------------------------------------- /SystemDesign/Readme.md: -------------------------------------------------------------------------------- 1 | System Design Notes 2 | === 3 | 4 | Template 5 | === 6 | * Requirements and Goals of the System 7 | * Write Heavy or Read Heavy? 8 | * Please prioritize functionality 9 | * What is NOT in scope? 10 | * Capacity Estimation and Constraints: 11 | * max requests/day, QPS (query per second) 12 | * storage size, latency, availability 13 | * High Level 14 | * System APIs 15 | * who call it? 16 | * Data Model 17 | * what will be saved? 18 | * seperate meta-data and hosting data! 19 | * Workflow/User Case 20 | * go through the steps 21 | * Details 22 | * Choice of database 23 | * Image/Video hosting: distributed file storage system (HDFS) 24 | * CA: RDBMS 25 | * CP: HBase, Redis, MongoDB 26 | * AP: Cassendra(wide-column), Dynamno(key-value store) 27 | * Data Partition 28 | * How many shards we need? 29 | * Shard by which ID 30 | * when user/post/ become popular? 31 | * **consistent hashing** to balance the load between servers!! 
32 | * Draw Diagram: 33 | * seperate web server and application server 34 | * try Aggregation servers and Cache server (be creative with the names) 35 | * Cache 36 | * Memcached 37 | * Least Recently Used (LRU) 38 | * Least Frequent Used (LFU) 39 | * Load Balancer 40 | * Round Robin (may have overloading problem) 41 | * Detect dead servers 42 | * more intelligent ways 43 | 44 | 45 | 46 | 47 | 48 | 49 | Domain Knowledge 50 | === 51 | http://www.mitbbs.com/article_t/JobHunting/32777529.html 52 | 53 | Sharding 54 | --- 55 | Pros: 56 | * Split the burden of data storage 57 | * ID generation is simplistic 58 | 59 | Cons: 60 | * send requests to all data resources to get the responses 61 | * unbalanced splitting (hot items) 62 | * solution 1: consistent hashing (rind model) 63 | * solution 2: find a better sharding ID (ex. Twitter use tweet ID instead of user ID) 64 | 65 | ACID 66 | --- 67 | * Atomicity: either succeeds completely, or fails completely 68 | * Consistentcy: valid transactions 69 | * Isolation: concurrency control 70 | * Durability: power outage 71 | 72 | Eventual vs Strong Consistency: 73 | --- 74 | https://cloud.google.com/datastore/docs/articles/balancing-strong-and-eventual-consistency-with-google-cloud-datastore/ 75 | * Eventual consistency: eventually the system converges to the same state 76 | * Strong consistency: data **viewed after an update** will be consistent for all observers of the entity 77 | 78 | Choices of NoSQL via CAP 79 | --- 80 | 81 | http://blog.nahurst.com/visual-guide-to-nosql-systems 82 | * CA: RDBMS 83 | * CP: BigTable, HBase, Redis, MongoDB 84 | * HBase gets frequent writes get cached and write only once when buffer is full (out-performs MongoDB) 85 | 86 | * AP: Dynamo, Cassendra 87 | * good for most content distribution platforms because consistency is not important here 88 | * Dynamo is a key value store where cassandra is a column wide store; Cassendra > Redis 89 | 90 | Cassendra 91 | --- 92 | A must read for developers 93 | http://abiasforaction.net/cassandra-architecture/ 94 | 95 | * Consistent Hashing (both virtual server replicas and keys are mapped to a ring.) 96 | * determining a node on which a specific piece of data should reside on 97 | * minimising data movement when adding or removing nodes. 98 | 99 | * Gossip Protocol – exchanging state information about themselves and a maximum of 3 other nodes they know about. Over a period of time state information about every node propagates throughout the cluster. The gossip protocol facilitates failure detection. 100 | 101 | Google Project 102 | --- 103 | * Hadoop <--> Mapreduce 104 | * Hadoop Distributed File System (HDFS) <--> Google File System 105 | * HBase <--> Bigtable 106 | 107 | Long Term Storage: Hive 108 | 109 | Fan-out Write/Read (Push or Pull) 110 | --- 111 | http://massivetechinterview.blogspot.com/2015/06/itint5.html 112 | 113 | 114 | 同样是timeline, twitter用fan-out-write(将new feed直接写到follower的timeline里),而Facebook却用fan-out-read(在读的时候实时抓取相关用户的feeds并merge/rank) 115 | 116 | Twitter has apparently seen great performance improvements by disabling fanout for high profile users and instead loading their tweets during reads (pull). 117 | 118 | Redis Vs Cassandra 119 | --- 120 | http://highscalability.com/blog/2013/10/28/design-decisions-for-scaling-your-high-traffic-feeds.html 121 | 122 | Instagram started out with Redis but eventually switched to Cassandra. 123 | 124 | Redis however has a few limitations: 125 | * all of your data needs to be stored in RAM which eventually becomes expensive. 
126 | * no support for sharding built into Redis. Sharding across nodes is quite easy, but moving data when you add or remove nodes is a pain. 127 | 128 | CDN 129 | --- 130 | Store data physcially close to its consumers 131 | 132 | Read Heavy or Write Heavy 133 | --- 134 | read heavy那么用cache会提升performance之类的 同时知道应该避免什 135 | 么东西 比如避免single point of failure 再比如时间和空间的tradeoff在read 136 | heavy的时候应该倾向于时间 Write heavy的时候倾向于空间等等 137 | 138 | 139 | Further Reading 140 | --- 141 | http://www.mitbbs.com/article_t/JobHunting/32777529.html 142 | 143 | https://blog.csdn.net/sigh1988/article/details/9790337 144 | 145 | http://blog.bittiger.io/%E9%9D%A2%E8%AF%95%E5%BF%85%E8%80%83%EF%BC%9A%E5%A6%82%E4%BD%95%E8%AE%BE%E8%AE%A1%E6%89%BF%E8%BD%BD%E5%8D%83%E4%B8%87%E7%94%A8%E6%88%B7%E7%9A%84uber%E5%AE%9E%E6%97%B6%E6%9E%B6%E6%9E%84/ 146 | 147 | https://github.com/Vonng/ddia 148 | 149 | https://github.com/donnemartin/system-design-primer 150 | 151 | -------------------------------------------------------------------------------- /TopK_Greedy/Readme.md: -------------------------------------------------------------------------------- 1 | Greedy Algorithm - Top K 2 | === 3 | 4 | Problem: given array `A`, return the `K` elements with max sum, i.e. the top-K elements. 5 | 6 | First Intuition: sort `A`, return the largest `K` elements. 7 | 8 | Key Observation: **The sequence of top-`K` elements does not matter**. We can avoid the time for sorting. 9 | Just replace the min element in the current top-`K` when new element comes in. 10 | 11 | Python Solution: 12 | 13 | ``` 14 | def topK(A): 15 | Q = [] 16 | for a in A: 17 | if len(Q) < K: 18 | heapq.heappush(Q, a) 19 | else: 20 | if Q[0] < a: 21 | heapq.heappop(Q) 22 | heapq.heappush(Q, a) 23 | return Q 24 | ``` 25 | 26 | Understanding this problem from the view of Dynamic Programming (DP): 27 | 28 | * Given the top-K elements within `[0, i]` 29 | * if `i` is not in the final top-K 30 | * we need top-K elements within `[0, i-1]` 31 | * if `i` is in the final top-K 32 | * we just need top-`K-1` elements within `[0, i-1]` 33 | * note that the top-`K-1` within `[0, i-1]` must be a set of top-K within `[0, i-1]` 34 | * i.e. `top-K == top-(K-1) + the k-th largest element` 35 | * so, replace the smallest element of the top-K in `[0, i-1]` by the new element `A[i]` 36 | 37 | Follow-up I: Non-overlapping Top-K 38 | --- 39 | 40 | Given an array `A`, return the `K` elements with max sum, which are at least `w` elements away from each other. 41 | 42 | For example, `K = 3, w = 2, A = [3, 7, 9, 2, 5] ==> [3,9,5]`; here `[7,9,5]` is invalid because `7`,`9` are adjacent. 43 | 44 | Idea: 45 | 46 | keep `w` heaps storing the top-K within range `A[:i+1]`, ..., `A[:i+w]` 47 | 48 | when `A[i+w]` comes in, replace min element in the **first heap corresponding to `A[:i+1]`**. 49 | 50 | then make this heap the last heap. 51 | 52 | Deal with `A[i+w+1]` and the next heap corresponding to `A[:i+2]`, so on so forth. 53 | 54 | Follow-up II: constrainted Top-K with sequence 55 | --- 56 | In such problems, the **sequence matters**, but replacement works fine as long as we **sort the input** properly. 57 | 58 | LC 630. Course Schedule III 59 | --- 60 | 61 | Each course has length `t` and closed on `d`-th day. One must take courses one-by-one before their closing date. 62 | 63 | Given `n` online courses represented by pairs `(t,d)`, return the maximal number of courses that can be taken. 
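For concreteness, a minimal Python sketch of the heap-based greedy (the reasoning is spelled out in the key observation and solution steps below; `courses` is assumed to be the list of `(t, d)` pairs, and the function name is just for illustration):

```
import heapq

def scheduleCourse(courses):
    courses.sort(key=lambda c: c[1])   # sort by closing date d, so later courses may replace earlier ones
    taken, total = [], 0               # max-heap of taken durations (stored negated), and their running sum
    for t, d in courses:
        heapq.heappush(taken, -t)
        total += t
        if total > d:                  # overshot the closing date: drop the longest course taken so far
            total += heapq.heappop(taken)   # popped value is negative, so this subtracts its duration
    return len(taken)
```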
64 | 65 | Key observation: 66 | 67 | There are two goals to achieve: 68 | * The current ``top-K'' courses have min `sum(t_i)`. 69 | * A course with later closing date `d`, shorter duration `t` can replace a course with earlier closing date `d` and longer durtion `t` without violating the closing dates. ==> A top-K problem with replacement. 70 | 71 | Solution: 72 | * arrange courses that `d` increases, so that later courses can replace earlier courses. 73 | * when a new course comes in, push it into the heap 74 | * pop up courses with longest duration until `sum(t_i)` smaller than current d 75 | * return final heap size 76 | 77 | 78 | **871. Minimum Number of Refueling Stops** 79 | 80 | A car go from location 0 to `target` with initial fuel `s`. The car can refuel at every station located at `stations[i][0]` with `stations[i][1]` gas. Return the min number of refuels. 81 | 82 | Observation: in the top-K problem, the goal is to find K elements with largest sum. Here the idea is the same but no constraint on K and every station added to the result must be reachable. So we maintain the longest distance the car can reach, i.e. `sofar`, and replace the gas station with min gas to refuel iteratively. 83 | 84 | ``` 85 | def minRefuelStops(self, t, s, stations): 86 | Q, i, sofar, res = [], 0, s, 0 87 | while sofar < t: 88 | while i < len(stations) and stations[i][0] <= sofar: 89 | heapq.heappush(Q, -stations[i][1]) 90 | i += 1 91 | if not Q: return -1 92 | gas = -heapq.heappop(Q) 93 | sofar += gas 94 | res += 1 95 | return res 96 | ``` 97 | 98 | 99 | Comparison with Longest-Increasing-SubSequence (LIS) 100 | --- 101 | 102 | In the Top-K problems, the top-(K-1) elements are always a subset of the top-K set. 103 | For example, 104 | ``` 105 | [3, 7, 9, 2, 5] 106 | K=1 9 107 | 2 7 +9 108 | 3 7 +9 +5 109 | ``` 110 | So **replacement works**. 111 | 112 | 113 | In **LC 300. Longest Increasing Subsequence**, the length-`l-1` subsequence with earlies ending time is NOT a sub-sequence of length-`l` subsequence with earlies ending time. 114 | 115 | So replacement does NOT work, you have to store all states in DP (ending time of subsequence length `l=1,2,3,...`) 116 | 117 | 118 | 119 | -------------------------------------------------------------------------------- /TopologicalSort/Readme.md: -------------------------------------------------------------------------------- 1 | # Topological Sort 2 | To find out the order of nodes in a [DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph) 3 | * remove the node with ZERO in-degree/out-degree, record this node 4 | * update the degree of other node pointing to this removed node 5 | 6 | Keep doing this you will obtain a topological order of the nodes 7 | 8 | The application of topological sort: 9 | * Find cycles in a graph. The steps above will remove every node until only some loop of nodes remain. 10 | * Rank prerequisites to find a total order. 11 | * Find the "most inner" nodes in graph, like **LC 310. Minimum Height Trees**. 12 | 13 | **LC 310. Minimum Height Trees** 14 | Find the leaves of the tree, and remove the deg 1 nodes iteratively until no more than two nodes remain. 15 | The remaining node(s) would be the root of minimum height tree. 
16 | ``` 17 | def findMinHeightTrees(self, n, edges): 18 | if not edges: return [0] 19 | graph = collections.defaultdict(list) 20 | for i, j in edges: 21 | graph[i].append(j) 22 | graph[j].append(i) 23 | q = [] 24 | vis = set() 25 | for i, row in graph.items(): 26 | if len(row) == 1: 27 | q.append(i) 28 | n = len(graph) 29 | while n > 2: 30 | newq = [] 31 | for i in q: 32 | j = graph[i].pop() 33 | graph[j].remove(i) 34 | n -= 1 35 | if len(graph[j]) == 1: 36 | newq += j, 37 | q = newq 38 | return q 39 | ``` 40 | 41 | **329. Longest Increasing Path in a Matrix** 42 | First, build a graph out of the matrix elements, then topological sort it. Record the number of layers. 43 | ``` 44 | def longestIncreasingPath(self, matrix): 45 | if not matrix: return 0 46 | N = len(matrix) 47 | M = len(matrix[0]) 48 | larger = collections.defaultdict(int) 49 | smaller = collections.defaultdict(list) 50 | for i in range(N): 51 | for j in range(M): 52 | for I, J in [(i-1, j),(i+1, j),(i, j-1),(i, j+1)]: 53 | if 0 <= I < N and 0 <= J < M and matrix[I][J] > matrix[i][j]: 54 | larger[(i, j)] += 1 55 | smaller[(I, J)].append((i, j)) 56 | deg0 = [(i, j) for i in range(N) for j in range(M) if not larger[(i,j)]] 57 | cnt = 0 58 | while deg0: 59 | cnt += 1 60 | newdeg0 = [] 61 | for i, j in deg0: 62 | for I, J in smaller[(i ,j)]: 63 | larger[(I, J)] -= 1 64 | if larger[(I, J)] == 0: 65 | newdeg0.append((I, J)) 66 | deg0 = newdeg0 67 | return cnt 68 | ``` 69 | 70 | Another way for topological sort is DFS. 71 | 72 | The post order visit would always put a root **after** its children. So a post-order visit of DFS should give us a order. 73 | 74 | Starting from every possible roots (whose in-degrees are zeros), we keep the maximum post-visiting time (MPT) of each node (if a node has been visited before with smaller MPT, then store the larger MPT). 75 | 76 | ``` 77 | # construct graph here 78 | # initialize MPT = {node:0 for node in graph} 79 | 80 | def dfs(root): 81 | for kid in graph[root]: 82 | if not kid in vis: 83 | vis.add(kid) 84 | dfs(kid) 85 | MPT[root] = max(MPT[kid] + 1, MPT[root]) 86 | 87 | # for each node x with zero in-degree 88 | # dfs(x) 89 | ``` 90 | 91 | It ensures that the root's MPT would higher than the children's MPT. So you have one total order of nodes (ranked by MPT) here. 92 | 93 | 94 | -------------------------------------------------------------------------------- /TreeDP/Readme.md: -------------------------------------------------------------------------------- 1 | Tree DP 2 | === 3 | 4 | Always think by rooting the sub-tree. 5 | 6 | Often DFS visit a tree -- at post order, update state `dp[v]` of the current node `v`, using its children `u`'s state `dp[u]`. 7 | 8 | LC543. Return the longest path between **any** two nodes in a binary tree. 9 | --- 10 | 11 | At the post-order visit of the sub-root, return the longest path going through this sub-root. 12 | ```python 13 | def diameterOfBinaryTree(self, root): 14 | self.ans = 0 15 | def f(r): 16 | if not r: return -1 17 | left = f(r.left) 18 | right = f(r.right) 19 | self.ans = max(self.ans, left + right + 2) 20 | return max(left, right) + 1 21 | f(root) 22 | return self.ans 23 | ``` 24 | 25 | Problem: find number of different sub trees of size less than or equal to K. 26 | --- 27 | 28 | Assume a subroot node `u` has children `v_1`, `v_2`, ... `v_n`: 29 | 30 | - `f[v_i][k]` is the number of sub trees with `k` nodes and `v_i` as root. 
31 | - For such `u` as sub-root 32 | - `dp[i][j]` is the number of sub-trees rooted by `u`'s first `i` children with a total of `j` nodes. 33 | 34 | So we have, 35 | 36 | `dp[i][j] = sum_k ( dp[i - 1][j - k] * f[v_i][k] )` 37 | `f[v][k] = dp[n][k]` 38 | 39 | To save memory, we can use a rolling one-dim array to store the `dp` (rather than a two-dim matrix). 40 | 41 | The final result is `sum_v( sum(f[v][:K]) )`. 42 | 43 | 44 | 45 | -------------------------------------------------------------------------------- /TreeTraversal/Readme.md: -------------------------------------------------------------------------------- 1 | # Iterative Tree Traversal - Pre-order, In-order and Post-order 2 | 3 | 4 | **LC 144. Binary Tree Preorder Traversal** 5 | **LC 94. Binary Tree Inorder Traversal** 6 | Solution: 7 | ``` 8 | def inorderTraversal(self, root): 9 | res, stack = [], [] 10 | node = root 11 | while node or stack: 12 | # go all the way to the left, exactly what recursion inorder does 13 | while node: 14 | stack.append(node) # pre-order here 15 | node = node.left 16 | 17 | node = stack.pop() 18 | res.append(node.val) # in-order here 19 | 20 | # then go right, exactly what recursion inorder does 21 | node = node.right 22 | return res 23 | ``` 24 | 25 | **Simplification for pre-order** 26 | 27 | The pre-order traversal can be simplified by pushing the right kid, instead of the root, into the stack. 28 | ``` 29 | def preorderTraversal(self, root): 30 | node = root 31 | stack, res = [], [] 32 | while node or stack: 33 | while node: 34 | res += node.val, 35 | stack += node.right, # push the right kid 36 | node = node.left 37 | node = stack.pop() 38 | #node = node.right # this line can be removed now 39 | return res 40 | ``` 41 | 42 | **Post-order** is reversed pre-order (not to simply reverse the result, but we should reverse the order at every level.) We keep visiting the right branch in prior to the left branch. 43 | 44 | The solution would be: 45 | * push current node into deque from its left 46 | * go right whenever possible 47 | * we pop up a node from stack, the right branch of this node has been fully explored 48 | * so go left 49 | 50 | ``` 51 | def postorderTraversal(self, root): 52 | res = collections.deque() 53 | stack = [] 54 | node = root 55 | while node or stack: 56 | while node: 57 | res.appendleft(node.val) # unlike pre-order, insert node into a deque instead of stack 58 | stack.append(node) 59 | node = node.right 60 | node = stack.pop() 61 | node = node.left 62 | return list(res) 63 | ``` 64 | 65 | **A more general solution** 66 | Of course, we can mark a node by the times you visited it. Since every node is visited three times in DFS, `vis[node]` is set as -1 initially for the pre-order visit. Change it to 0 in the in-order visit. Finally, the post-order makes `vis[node]` 1. 67 | 68 | Dynamically release memory of `vis` so that max memory usage is height of tree. 69 | ``` 70 | vis, node, stack = {}, root, [] 71 | 72 | while node or stack: 73 | while node and node not in vis: 74 | # do sth. in pre-order 75 | vis[node] = -1 76 | stack.append(node) 77 | node = node.left 78 | 79 | if not stack: break 80 | node = stack[-1] 81 | 82 | if vis[node] < 0: 83 | # do sth. in in-order 84 | vis[node] += 1 85 | node = node.right 86 | else: 87 | # do sth. 
in post-order 88 | vis[node] += 1 89 | if node.left: del vis[node.left] #IMPORTANT: release node's kids memory here 90 | if node.right: del vis[node.right] # but not the node itself, let its parent release its memory 91 | stack.pop() 92 | ``` 93 | 94 | ## N-ary Tree 95 | The same idea works for tree with multiple kids: 96 | * Pre-order: visit the left-most kid first, while pushing all its siblings into the stack from left to right, and append the current node to result 97 | * Post-order: visit the right-most kid first, while pushing all its siblings into the stack from right to left, and push the current node to the left side of result (deque) 98 | 99 | **LC 341. Flatten Nested List Iterator** 100 | 101 | Given a nested list of integers, implement an iterator to flatten it. 102 | 103 | Each element is either an integer, or a list -- whose elements may also be integers or other lists. 104 | 105 | Example 1: 106 | Given the list `[[1,1],2,[1,1]]`, 107 | 108 | By calling next repeatedly until hasNext returns false, the order of elements returned by next should be: `[1,1,2,1,1]`. 109 | 110 | ``` 111 | class NestedIterator(object): 112 | 113 | def __init__(self, nestedList): 114 | # reverse the input list 115 | self.stack = nestedList[::-1] 116 | 117 | def next(self): 118 | # pop the one at the top 119 | if self.hasNext(): 120 | return self.stack.pop().getInteger() 121 | return -1 122 | 123 | def hasNext(self): 124 | # keep going left, meanwhile, push the sibling into the stack, the right-most is inserted first. 125 | while len(self.stack) > 0 and not self.stack[-1].isInteger(): 126 | x = self.stack.pop() 127 | self.stack.extend(x.getList()[::-1]) 128 | return len(self.stack) > 0 129 | ``` 130 | -------------------------------------------------------------------------------- /Trie/Readme.md: -------------------------------------------------------------------------------- 1 | ## Trie 2 | 3 | Trie a tree structure which faciliates searching and storing prefixes. Given a set of strings, we can store the i-th letter as a node on the i-th layers of the tree. 4 | 5 | 1. Easy creation of Prefix Tree 6 | ``` 7 | self.root = {} 8 | node = self.root 9 | for c in word+"$": 10 | node = node.setdefault(c, {}) 11 | ``` 12 | where the `$` symbol indicates the end of a string `word`. Or equivalently 13 | ``` 14 | T = lambda: collections.defaultdict(T) 15 | self.root = T() 16 | reduce(dict.__getitem__, word, self.root)['$'] = True 17 | ``` 18 | 19 | 2. Easy travesal of Prefix Tree 20 | ``` 21 | # Recursive 22 | def visit(node, word, prefix): 23 | if not word: 24 | # Do sth. 25 | if '$' in node: 26 | # Do sth. 27 | c, w = word[0], word[1:] 28 | if c in node: 29 | # Do sth. 30 | visit(node[c], w, prefix+c) 31 | ``` 32 | Note that within any recursion call, `prefix+word` is the complete input string. 33 | 34 | 3. C++ short implementation, using a counter `next` to create new nodes. 35 | ``` 36 | map > x; 37 | int next = 1; 38 | 39 | void build(const string& w) { 40 | int cur = 0; 41 | for (const auto& ch : w) { 42 | if (x[cur].find(ch) == x[cur].end()) cur = x[cur][ch] = next++; 43 | else cur = x[cur][ch]; 44 | } 45 | x[cur]['#'] = -1; 46 | } 47 | ``` 48 | 49 | **LC 336. Palindrome Pairs** 50 | 51 | Given a list of unique words, find all pairs of distinct indices (i, j) in the given list, so that the concatenation of the two words, i.e. `words[i] + words[j]` is a palindrome. 
52 | 53 | The idea is to use a Trie storing all the prefixes, a pair of strings `ab???` and `ba` matches, so you can reverse `ba` into `ab`, and search Trie to end up with some suffix `???`. If `???` is palindrome, then `ab???` and `ba` make a pair. 54 | 55 | ``` 56 | class TrieNode: 57 | def __init__(self): 58 | self.kids = {} 59 | self.end = -1 60 | self.pal = [] 61 | 62 | # create the Trie, recording the index of a string in case of palindrome suffix 63 | def insert(self, i, s): 64 | if not s: 65 | self.end = i 66 | else: 67 | if s[::-1] == s: 68 | self.pal += i, 69 | if s[0] not in self.kids: 70 | self.kids[s[0]] = TrieNode() 71 | self.kids[s[0]].insert(i, s[1:]) 72 | 73 | # search the Trie, collect the pairs of strings if concatenation is palindrome string 74 | def collect(self, i, s, ret): 75 | if not s: 76 | # Reach `ab?` with the input `ab` 77 | ret.extend([j, i] for j in self.pal) 78 | # Reach `ab` with the input `ab` 79 | if self.end not in [-1, i]: 80 | ret.append([self.end, i]) 81 | else: 82 | # Reach `ab` with the input `abs` 83 | if self.end != -1 and s[::-1] == s: 84 | ret.append([self.end, i]) 85 | if s[0] in self.kids: 86 | self.kids[s[0]].collect(i, s[1:], ret) 87 | 88 | class Solution(object): 89 | def palindromePairs(self, words): 90 | root = TrieNode() 91 | for i, w in enumerate(words): 92 | root.insert(i, w) 93 | ret = [] 94 | for i, w in enumerate(words): 95 | root.collect(i, w[::-1], ret) 96 | return ret 97 | ``` 98 | 99 | **LC 676. Implement Magic Dictionary** 100 | Find a string in the given dict which becomes the input string after modifying exact one character. 101 | We traverse the Trie recursively with a `flag` storing the state if a character has been modified. 102 | 103 | Pay attention to the two termination states: (i) run out of the input string (ii) reach the bottom of Trie 104 | 105 | ``` 106 | class MagicDictionary { 107 | public: 108 | map > x; 109 | int next = 1; 110 | 111 | /** Initialize your data structure here. */ 112 | MagicDictionary() { 113 | next = 1; 114 | x.clear(); 115 | } 116 | void build(const string& w) { 117 | int cur = 0; 118 | for (const auto& ch : w) { 119 | if (x[cur].find(ch) == x[cur].end()) cur = x[cur][ch] = next++; 120 | else cur = x[cur][ch]; 121 | } 122 | x[cur]['#'] = -1; 123 | } 124 | /** Build a dictionary through a list of words */ 125 | void buildDict(vector dict) { 126 | for (const auto& w : dict) build(w); 127 | } 128 | 129 | bool _search(string word, int cur, int i, bool flag) { 130 | //cout << word << "," << word[i] << "," << flag << endl; 131 | 132 | if (i == word.size()) { 133 | if (!flag && x[cur].find('#') != x[cur].end()) return true; 134 | return false; 135 | } 136 | 137 | if (flag) { 138 | for (auto it : x[cur]) if (it.first != '#') { 139 | if (_search(word, it.second, i + 1, (word[i] == it.first))) 140 | return true; 141 | } 142 | return false; 143 | } 144 | else { 145 | return x[cur].find(word[i]) == x[cur].end() ? 
146 | false : _search(word, x[cur][word[i]], i + 1, false); 147 | } 148 | } 149 | 150 | /** Returns if there is any word in the trie that equals to the given word after modifying exactly one character */ 151 | bool search(string word) { 152 | if (word.empty()) return false; 153 | return _search(word, 0, 0, true); 154 | } 155 | }; 156 | ``` 157 | -------------------------------------------------------------------------------- /TwoPointers/Readme.md: -------------------------------------------------------------------------------- 1 | # Two Pointers 2 | 3 | Array 4 | --- 5 | 6 | Given an array, if all the **sub-optimal solutions** are **continuous subarrays** bounded by indices `l` and `r`, then we can shift `l`, `r` to search for the solutions. 7 | 8 | C++ template: 9 | ``` 10 | int i, j; 11 | for (i = 0, j = 0; i < N; ++i) { 12 | // add A[i] here 13 | 14 | while (j <= i && some condition is satisfied) { 15 | // update result 16 | res = {j, i}; 17 | // remove A[j] here 18 | // note that you should update result before removal! 19 | // and the update must be done WITHIN the while loop 20 | j++; 21 | } 22 | 23 | if (some condition is satisfied) { 24 | return res; 25 | } 26 | } 27 | ``` 28 | 29 | **LC 76. Minimum Window Substring** 30 | Given a string `S` and a string `T`, find the minimum window in `S` which will contain all the characters in `T` in complexity `O(n)`. 31 | 32 | Input: S = "ADOBECODEBANC", T = "ABC" 33 | 34 | Output: "BANC" 35 | 36 | ``` 37 | i = j = 0 38 | start = end = -1 39 | missing = len(t) 40 | need = collections.defaultdict(int) 41 | for ch in t: need[ch] += 1 42 | 43 | while i < len(s): 44 | if s[i] in need: 45 | need[s[i]] -= 1 46 | if need[s[i]] >= 0: missing -= 1 47 | while missing == 0: 48 | if end < 0 or i - j < end - start: 49 | start, end = j, i 50 | if s[j] in need: 51 | need[s[j]] += 1 52 | if need[s[j]] >= 1: missing += 1 53 | j += 1 54 | i += 1 55 | if end < 0: return "" 56 | return s[start:end + 1] 57 | ``` 58 | 59 | **LC 567. Permutation in String** 60 | 61 | Check if the sub-string of `s2` is a permutation of `s1`. 62 | Since the question asks for permutation, the order of `s1` does not matter. 63 | If we got too many letters in `s2[l:r]` than `s1[:]`, 64 | then we increase `l` and remove `s2[l]`; Otherwise, we got insufficent letters, 65 | we should increase `r` and insert `s2[r]`. 66 | 67 | ``` 68 | def checkInclusion(self, s1, s2): 69 | d = collections.defaultdict(int) 70 | for c in s1: d[c] += 1 71 | tmp = collections.defaultdict(int) 72 | cnt, j = 0, 0 73 | for i in range(len(s2)): 74 | tmp[s2[i]] += 1 75 | cnt += 1 76 | while j <= i and tmp[s2[i]] > d[s2[i]]: # becase tmp[c] is always <= d[c] 77 | tmp[s2[j]] -= 1 78 | cnt -= 1 79 | j += 1 80 | if cnt == len(s1): return True # only possilbe if tmp[c]==d[c] for all c 81 | return False 82 | ``` 83 | 84 | Linked List 85 | --- 86 | Pointers `slow` and `fast` move at the different speeds. 87 | 88 | There is one trick here, you can create one extra pointer `prev` storing the previous value of `slow`. 89 | 90 | ``` 91 | ListNode *slow = head, *fast = head, *prev = NULL; 92 | while (fast && fast->next) { 93 | prev = slow; 94 | slow = slow->next; 95 | fast = fast->next->next; 96 | } 97 | prev->next = NULL; 98 | ``` 99 | 100 | So, when `prev->next == slow`. In the corner cases `slow->next == fast == NULL`, 101 | ``` 102 | head->prev->slow->fast 103 | ^ 104 | NULL 105 | ``` 106 | you can still split the linked list into two parts by setting `prev->next = NULL`. 
The two linked lists have heads `head` and `slow` respectively. 107 | 108 | 109 | 110 | -------------------------------------------------------------------------------- /UnionFind/Readme.md: -------------------------------------------------------------------------------- 1 | # Disjoint Sets Union ("UnionFind") 2 | ## `O(log* n)` time 3 | To achieve the iterated `O(log* n)` time complexity, we need to 4 | * Path Compression which flatten the tree 5 | ``` 6 | def find(parent, i): 7 | if parent[i] != i: 8 | parent[i] = find(parent, parent[i]) 9 | return parent[i] 10 | ``` 11 | * Union by Rank 12 | ``` 13 | def union_by_rank(rank, parent, i, j): 14 | pi, pj = find(parent, i), find(parent, j) 15 | if rank[pi] < rank[pj]: parent[pi] = pj 16 | elif rank[pj] < rank[pi]: parent[pj] = pi 17 | else: 18 | parent[pi] = pj 19 | rank[pj] += 1 20 | ``` 21 | The proof of time complexity is about showing there are at most `n/2^r` nodes with rank `r` 22 | because the branch rooted by rank-r node has at most `2^r` nodes. 23 | 24 | A smarter way to do this (also see C++ practive below): 25 | * `par[i] = j` if node j is the parent of node i; 26 | * `par[i] = -n` if node i is the root, and node i has a total of n descendents (including itself). 27 | 28 | Initially, `par = [-1 for _ in range(n)]` because each node is a root. 29 | 30 | ```python 31 | def find(par, i): 32 | if par[i] < 0: return i 33 | par[i] = self.find(par, par[i]) 34 | return par[i] 35 | 36 | def union(par, i, j): 37 | i, j = self.find(par, i), self.find(par, j) 38 | if i == j: return 39 | if par[j] < par[i]: i, j = j, i 40 | par[i], par[j] = par[i] + par[j], i 41 | ``` 42 | 43 | To conduct `n` find operations: 44 | --- 45 | 1. No optimization: `O(n^2)` 46 | 2. Path compression alone: `O(n log n)` 47 | 3. Path compression + union by rank: `O(n log*(n))~O(n)` (iterated log can be treated as a constant) 48 | 49 | See [wiki](https://en.wikipedia.org/wiki/Disjoint-set_data_structure#Time_complexity): 50 | 51 | With neither path compression (or a variant), union by rank, nor union by size, the height of trees can grow unchecked as `O(n)`, which implies that Find and Union operations will take `O(n)` time. 52 | 53 | Using path compression alone gives a worst-case running time of `O(n log n)` for n find operations on n elements. 54 | 55 | Using union by rank alone gives a running-time of `O(n logn)` for n operations on n elements. 56 | 57 | Using both path compression, splitting, or halving and union by rank or size ensures that the amortized time per operation is only O(1), so the disjoint-set operations take place in essentially constant time. 58 | 59 | The best practice (C++): 60 | --- 61 | 62 | For each root v, `par[v]` equals the negative of number of nodes in its rooted tree. 63 | 64 | For other nodes u, `par[u]` equals the parent. 65 | 66 | ``` 67 | int root(int v){return par[v] < 0 ? v : (par[v] = root(par[v]));} 68 | void merge(int x,int y){ // x and y are some tools (vertices) 69 | if((x = root(x)) == (y = root(y)) return ; 70 | if(par[y] < par[x]) // balancing the height of the tree 71 | swap(x, y); 72 | par[x] += par[y]; 73 | par[y] = x; 74 | } 75 | ``` 76 | 77 | Or we can even use `vector` to store the elements in the same 'set'. We merge the `vector`s. Any node can be merged at most `O(log n)` times, so the total time complexity would be `O(n logn)`. 78 | 79 | # Directed vs Undirect graph 80 | 81 | For undirected graph, path compression + union-by-rank ==> O(1) time find/union operation. 
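For the undirected case, a quick usage sketch of the `par`-array idea above (counting connected components; the function name and edge list here are hypothetical):

```
def count_components(n, edges):
    par = [-1] * n                        # every node starts as its own root of size 1

    def find(i):
        if par[i] < 0: return i
        par[i] = find(par[i])             # path compression
        return par[i]

    def union(i, j):
        i, j = find(i), find(j)
        if i == j: return
        if par[j] < par[i]: i, j = j, i   # union by size: keep i as the larger root
        par[i] += par[j]
        par[j] = i

    for a, b in edges:
        union(a, b)
    return sum(1 for p in par if p < 0)   # remaining roots = number of components

# e.g. count_components(5, [(0, 1), (1, 2), (3, 4)]) == 2
```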
82 | 
83 | For a directed graph, neither trick works as-is, because an edge `(i, j)` forces `parent[j] = i`.
84 | Path compression would make every node point directly to the very root as its `parent`.
85 | If that is acceptable for the specific problem, then union-find still works.
86 | 
87 | See this problem:
88 | 
89 | ## Redundant Connection
90 | 
91 | **LC 685. Redundant Connection II**
92 | 
93 | Find a redundant edge in a directed tree. The directed edges of the tree point from a parent to its kid.
94 | 
95 | [Good solution](https://leetcode.com/problems/redundant-connection-ii/discuss/108045/C++Java-Union-Find-with-explanation-O(n))
96 | 
97 | Three types of redundant edge:
98 | * double-parent node, no loop (easy: remove one of the two edges into this node)
99 | * double-parent node, loop (remove the double-parent edge that lies inside the loop)
100 | * no double-parent node, loop (remove the current edge once a loop is detected)
101 | 
102 | Store the two edges pointing to the double-parent node, remove the second edge (if it exists) causing the double parents, then check:
103 | * if the tree is valid now, return the second edge as the result
104 | * elif we found a loop (by running union-find on the directed graph)
105 |     1. if the first edge exists, return the first edge
106 |     2. else return the current edge causing the loop
107 | 
108 | 
109 | 
--------------------------------------------------------------------------------
/WaveletTree/Readme.md:
--------------------------------------------------------------------------------
1 | Wavelet Tree
2 | ===
3 | 
4 | What is a Wavelet Tree capable of?
5 | ---
6 | 
7 | For any given L and R, a wavelet tree can answer queries on the subarray A[L:R] such as
8 | * the number of elements smaller than x,
9 | * the number of elements equal to x,
10 | * or the k-th smallest element,
11 | 
12 | each in `O(log N)` time.
13 | 
14 | Binary Tree
15 | ---
16 | [Post by Rachit Jain](http://rachitiitr.blogspot.com/2017/06/wavelet-trees-wavelet-trees-editorial.html)
17 | [Paper](https://users.dcc.uchile.cl/~jperez/papers/ioiconf16.pdf)
18 | [Slides](https://users.dcc.uchile.cl/~jperez/talks/ioi16.pdf)
19 | 
20 | A wavelet tree is a binary tree in which every node is associated with a sub-sequence of the input array.
21 | 
22 | The root is associated with the input array itself. As in quick sort, the elements `<= mid` are assigned to the left kid,
23 | while the elements `> mid` are assigned to the right kid.
24 | 
25 | After the split, the elements in each sub-sequence keep their original relative order.
26 | 
27 | Each tree node stores `b[i]`, which indicates the number of elements in A[:i] assigned to the left kid.
28 | 
29 | So `i - b[i]` is the number of elements in `A[:i]` that get assigned to the right kid.
30 | 
31 | > For example, the root is associated with `A=[1,3,2,5,2]`. If `mid = 2`, then
32 | > `leftkid = [1,2,2]`
33 | > `rightkid = [3,5]`
34 | > thus, for the root node,
35 | > `b=[0,1,1,2,2,3]`
36 | 
37 | Query
38 | ---
39 | 
40 | Find the K-th smallest element in A[L:R]
41 | * `b[R] - b[L]` is the number of elements in A[L:R] assigned to the left kid of the root
42 | * if `K <= b[R] - b[L]`, go to the left kid, with the range mapped to `[b[L], b[R]]`
43 | * else, go to the right kid, with `K` reduced by `b[R] - b[L]` and the range mapped to `[L - b[L], R - b[R]]`
44 | 
45 | Number of occurrences of x in A[L:R]
46 | * if `x <= mid`, go left; else go right, until we hit x
47 | * return the number of x stored at the current tree node
48 | 
49 | 
50 | 
51 | 
52 | 
53 | 
--------------------------------------------------------------------------------
/backtrack/Readme.md:
--------------------------------------------------------------------------------
1 | ## NOTES
2 | 1.
Two styles of backtracking (Example: N-Queen) 3 | **recursive** 4 | ``` 5 | def nqueens(n): 6 | def valid(a,b,i,j): 7 | return (a!=i) and (b!=j) and (abs(a-i)!=abs(b-j)) 8 | 9 | self.ret = 0 10 | def dfs(path): 11 | if len(path)==n: 12 | self.ret += 1 13 | return 14 | row = len(path) 15 | for col in range(n): 16 | if all(valid(row,col,i,j) for i,j in enumerate(path)): 17 | dfs(path+[col]) 18 | dfs([]) 19 | return self.ret 20 | ``` 21 | 22 | **Iterative** 23 | ``` 24 | ans =[[]] 25 | for row in range(n): 26 | new_ans = [] 27 | for path in ans: 28 | for col in range(n): 29 | if all(valid(row,col,i,j) for i,j in enumerate(path)): 30 | new_ans.append(path+[col]) 31 | ans = new_ans 32 | return len(ans) 33 | ``` 34 | 35 | 2. How to avoid duplicates in backtracking? 36 | ``` 37 | # 47. Permutations II 38 | # Given a collection of numbers that might contain duplicates, 39 | # return all possible unique permutations. 40 | ret = [[]] 41 | for n in nums: 42 | new_ret = [] 43 | for row in ret: 44 | for i in range(len(row)+1): 45 | new_ret.append(row[:i]+[n]+row[i:]) 46 | # Note: avoid inserting a number before any of its duplicates!! 47 | if i < len(row) and row[i]==n: 48 | break 49 | ret = new_ret 50 | return ret 51 | ``` 52 | 53 | ``` 54 | # 40. Combination Sum II 55 | # Find all unique combinations in candidates where the candidate numbers sums to target 56 | candidates.sort() 57 | ret = [] 58 | def dfs(target, path, start): 59 | if target < 0 or (start==len(candidates) and target != 0): 60 | return 61 | if target == 0: 62 | ret.append(path) 63 | return 64 | for i in range(start, len(candidates)): 65 | # NOTE: the key to avoid duplicates!! 66 | if i > start and candidates[i]==candidates[i-1]: continue 67 | dfs(target-candidates[i], path+[candidates[i]], i+1) 68 | dfs(target, [], 0) 69 | return ret 70 | ``` 71 | -------------------------------------------------------------------------------- /heap/215.py: -------------------------------------------------------------------------------- 1 | from heapq import heapreplace, heappush 2 | class Solution: 3 | def findKthLargest(self, nums, k): 4 | """ 5 | :type nums: List[int] 6 | :type k: int 7 | :rtype: int 8 | """ 9 | h = [] 10 | for n in nums: 11 | if len(h)h[0]: 14 | heapreplace(h,n) 15 | return h[0] 16 | -------------------------------------------------------------------------------- /heap/239.py: -------------------------------------------------------------------------------- 1 | from collections import deque 2 | class Solution: 3 | def maxSlidingWindow(self, nums, k): 4 | """ 5 | :type nums: List[int] 6 | :type k: int 7 | :rtype: List[int] 8 | """ 9 | out = [] 10 | d = deque() 11 | if k==0: return [] 12 | for i,n in enumerate(nums): 13 | while d and nums[d[-1]]=k-1: 19 | out.append(nums[d[0]]) 20 | return out 21 | -------------------------------------------------------------------------------- /heap/313.py: -------------------------------------------------------------------------------- 1 | #class Solution: 2 | # def nthSuperUglyNumber(self, n, primes): 3 | # """ 4 | # :type n: int 5 | # :type primes: List[int] 6 | # :rtype: int 7 | # """ 8 | # ug = [1] 9 | # def gen(prime): 10 | # for u in ug: 11 | # yield u*prime 12 | # h = heapq.merge(*map(gen, primes)) 13 | # while len(ug)h[0]: 20 | heapreplace(h,n) 21 | return h[0] 22 | ``` 23 | 24 | The heapq is a min-heap (smallest element at `h[0]`). `heapreplace(heap, item)` pop and return the smallest item, and also push the new item. The heap size doesn’t change. 
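A tiny illustration of that size-preserving behavior (values here are arbitrary):

```
import heapq

h = [3, 5, 9]                        # already a valid min-heap
smallest = heapq.heapreplace(h, 7)   # pops and returns 3, then pushes 7 -- a single O(log n) operation
# smallest == 3, and h still has 3 elements: a heap over [5, 7, 9]
```

Note that `heapreplace` pushes the new item unconditionally, even if it is smaller than the popped one, which is why the k-th-largest snippet above first compares the new value against `h[0]` before replacing.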
25 | 
26 | **LC 313 Super Ugly Number**
27 | Write a program to find the n-th super ugly number.
28 | Super ugly numbers are positive numbers all of whose prime factors are in the given prime list `primes` of size k.
29 | 
30 | For each prime `p_i`, the products `x*p_i` keep the same relative order as the ugly numbers `x` themselves. So there is no need to sort the products: whenever `x*p_i` is consumed, we simply advance `x` to the next ugly number (the smallest one greater than the current `x`), and the products `x*p_i` come out in ascending order automatically.
31 | 
32 | Therefore, we maintain an index `idx_i` for each prime `p_i`, pointing to the `idx_i`-th ugly number `ug[idx_i]`.
33 | 
34 | Given all primes, the next ugly number is `min(ug[idx_i] * p_i for all i)`.
35 | 
36 | Append this next ugly number to the list `ug`; then replace its heap entry by `ug[idx_i + 1] * p_i` and increment `idx_i`. If several primes produce the same value, advance each of them, which also removes duplicates.
37 | 
38 | ```
39 | from heapq import heappush, heappop, heapify
40 | class Solution:
41 |     def nthSuperUglyNumber(self, n: int, primes: List[int]) -> int:
42 |         ug = [1]
43 |         idx = {p: 0 for p in primes}
44 |         hp = [(p * ug[idx[p]], p) for p in primes]
45 |         heapify(hp)
46 |         while len(ug) < n:
47 |             nxt_val = hp[0][0]
48 |             ug.append(nxt_val)
49 |             while hp and nxt_val == hp[0][0]:
50 |                 val, p = heappop(hp)
51 |                 idx[p] += 1
52 |                 heappush(hp, (p * ug[idx[p]], p))
53 |         return ug[-1]
54 | ```
55 | 
56 | P.S. Defining a custom comparator for a Python heap
57 | ---
58 | ```
59 | @functools.total_ordering
60 | class Element:
61 |     def __init__(self, word, n):
62 |         self.word = word
63 |         self.n = n
64 | 
65 |     def __lt__(self, other):
66 |         if self.n == other.n:
67 |             return self.word > other.word
68 |         return self.n < other.n
69 | 
70 |     def __eq__(self, other):
71 |         return self.n == other.n and self.word == other.word
72 | ```
73 | and `heapq.heappush(hp, Element(word, n))` would do the job.
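As a usage sketch, a hypothetical top-k-frequent-words routine built on the `Element` wrapper above (it assumes that class is in scope; the function name is just for illustration):

```
import collections, heapq

def top_k_words(words, k):
    hp = []
    for word, n in collections.Counter(words).items():
        heapq.heappush(hp, Element(word, n))
        if len(hp) > k:
            heapq.heappop(hp)    # evict the "smallest": lowest count, ties broken by the larger word
    # most frequent first; within equal counts, the alphabetically smaller word comes first
    return [e.word for e in sorted(hp, reverse=True)]
```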
74 | -------------------------------------------------------------------------------- /kSum/11.py: -------------------------------------------------------------------------------- 1 | class Solution: 2 | def maxArea(self, height): 3 | """ 4 | :type height: List[int] 5 | :rtype: int 6 | """ 7 | i = 0 8 | j = len(height)-1 9 | ret = 0 10 | while i ratings[i-1] else 1 10 | rbase = 1 11 | for j in range(len(ratings)-2,-1,-1): 12 | rbase = rbase + 1 if ratings[j] > ratings[j+1] else 1 13 | ret[j] = max(rbase, ret[j]) 14 | return sum(ret) 15 | -------------------------------------------------------------------------------- /kSum/161.py: -------------------------------------------------------------------------------- 1 | class Solution: 2 | def isOneEditDistance(self, s, t): 3 | """ 4 | :type s: str 5 | :type t: str 6 | :rtype: bool 7 | """ 8 | d = len(s) - len(t) 9 | for i, (a, b) in enumerate(zip(s,t)): 10 | if a != b: 11 | return s[i+(d>=0):]==t[i+(d<=0):] 12 | return abs(d)==1 13 | -------------------------------------------------------------------------------- /kSum/209MinimumSizeSubarraySum.py: -------------------------------------------------------------------------------- 1 | class Solution: 2 | def minSubArrayLen(self, s, nums): 3 | """ 4 | :type s: int 5 | :type nums: List[int] 6 | :rtype: int 7 | """ 8 | if len(nums)==1 and nums[0]= s: 15 | min_l = min(min_l, j-i+1) 16 | cur -= nums[i] 17 | i += 1 18 | return min_l if min_l <= len(nums) else 0 19 | -------------------------------------------------------------------------------- /kSum/228.py: -------------------------------------------------------------------------------- 1 | class Solution: 2 | def summaryRanges(self, nums): 3 | """ 4 | :type nums: List[int] 5 | :rtype: List[str] 6 | """ 7 | if len(nums)<1: return [] 8 | j = 0 9 | ret = [] 10 | for i in range(1,len(nums)+1): 11 | if i==len(nums) or nums[i] != nums[i-1]+1: 12 | ret.append( "%d" %nums[j] if i-1==j else "%d->%d" % (nums[j], nums[i-1]) ) 13 | j = i 14 | return ret 15 | 16 | class Solution: 17 | def summaryRanges(self, nums): 18 | """ 19 | :type nums: List[int] 20 | :rtype: List[str] 21 | """ 22 | ret, r = [], [] 23 | for n in nums: 24 | if n-1 not in r: 25 | r = [] 26 | ret.append(r) 27 | r[1:] = n, 28 | return ['->'.join(map(str, r)) for r in ret] 29 | -------------------------------------------------------------------------------- /kSum/243.py: -------------------------------------------------------------------------------- 1 | class Solution: 2 | def shortestDistance(self, words, word1, word2): 3 | """ 4 | :type words: List[str] 5 | :type word1: str 6 | :type word2: str 7 | :rtype: int 8 | """ 9 | i1 = i2 = -len(words) 10 | ret = len(words) 11 | for idx, word in enumerate(words): 12 | if word1==word: 13 | ret = min(ret, idx-i2) 14 | i1 = idx 15 | elif word2==word: 16 | ret = min(ret, idx-i1) 17 | i2 = idx 18 | return ret 19 | -------------------------------------------------------------------------------- /kSum/26.py: -------------------------------------------------------------------------------- 1 | #26. 
Remove Duplicates from Sorted Array 2 | class Solution: 3 | def removeDuplicates(self, nums): 4 | """ 5 | :type nums: List[int] 6 | :rtype: int 7 | """ 8 | if len(nums)<1: return 0 9 | i = 1 10 | prev = nums[0] 11 | for j in range(len(nums)): 12 | if nums[j] != prev: 13 | nums[i] = nums[j] 14 | i += 1 15 | prev = nums[j] 16 | return i 17 | 18 | -------------------------------------------------------------------------------- /kSum/2sum.py: -------------------------------------------------------------------------------- 1 | class Solution: 2 | def twoSum(self, nums, target): 3 | """ 4 | :type nums: List[int] 5 | :type target: int 6 | :rtype: List[int] 7 | """ 8 | dictMap = {} 9 | for idx, val in enumerate(nums): 10 | if target - val in dictMap: 11 | return dictMap[target-val], idx 12 | dictMap[val] = idx 13 | -------------------------------------------------------------------------------- /kSum/2sum_III.py: -------------------------------------------------------------------------------- 1 | class TwoSum: 2 | 3 | # initialize your data structure here 4 | def __init__(self): 5 | self.table = dict() 6 | 7 | # @return nothing 8 | def add(self, number): 9 | self.table[number] = self.table.get(number, 0) + 1; 10 | 11 | # @param value, an integer 12 | # @return a Boolean 13 | def find(self, value): 14 | for i in self.table.keys(): 15 | j = value - i 16 | if i == j and self.table.get(i) > 1 or i != j and self.table.get(j, 0) > 0: 17 | return True 18 | return False 19 | -------------------------------------------------------------------------------- /kSum/325MaximumSizeSubarraySumEqualsk.py: -------------------------------------------------------------------------------- 1 | class Solution: 2 | def maxSubArrayLen(self, nums, k): 3 | """ 4 | :type nums: List[int] 5 | :type k: int 6 | :rtype: int 7 | """ 8 | acc = 0 9 | ret = 0 10 | loc = {0:-1} 11 | for idx, val in enumerate(nums): 12 | acc += val 13 | if acc not in loc: 14 | loc[acc] = idx 15 | if acc-k in loc: 16 | ret = max(ret, idx-loc[acc-k]) 17 | return ret 18 | -------------------------------------------------------------------------------- /kSum/3sum.py: -------------------------------------------------------------------------------- 1 | class Solution: 2 | def threeSum(self, nums): 3 | """ 4 | :type nums: List[int] 5 | :rtype: List[List[int]] 6 | """ 7 | if (len(nums)<=2): return [] 8 | nums.sort() 9 | i = 0 10 | ret = [] 11 | while(i-nums[i]: 18 | k-=1 19 | else: 20 | ret.append([nums[i],nums[j],nums[k]]) 21 | j+=1 22 | k-=1 23 | while jtarget: k-=1 17 | elif nums[i]+nums[j]+nums[k] nums[j-2]: 11 | nums[j] = n 12 | j += 1 13 | return j 14 | -------------------------------------------------------------------------------- /kSum/88.py: -------------------------------------------------------------------------------- 1 | class Solution: 2 | def merge(self, nums1, m, nums2, n): 3 | """ 4 | :type nums1: List[int] 5 | :type m: int 6 | :type nums2: List[int] 7 | :type n: int 8 | :rtype: void Do not return anything, modify nums1 in-place instead. 9 | """ 10 | while n > 0: 11 | if m == 0 or nums1[m-1]=nums2[n-1]: 15 | nums1[m+n-1] = nums1[m-1] 16 | m -= 1 17 | -------------------------------------------------------------------------------- /kSum/Readme.md: -------------------------------------------------------------------------------- 1 | ## Notes 2 | 1. Sort First: nums.sort() 3 | 2. Move `j,k` pointers for each i: 4 | `while (j0: r[1] = n 32 | 33 | ## 88. Merge Sorted Array 34 | 1. 
In-place merge at the end of array, stop when run out of nums2 35 | 36 | -------------------------------------------------------------------------------- /language/Go/Readme.md: -------------------------------------------------------------------------------- 1 | Go 2 | === 3 | 4 | Is Go Object-Oriented? 5 | --- 6 | Go does not have class implemenations. 7 | But class == struct types + methods in Go 8 | * struct only holds the state, not the behavior 9 | * method changes the state of struct types 10 | 11 | Access level in a package 12 | * capitalized fields, methods and functions are public 13 | * lower case fields, methods and functions are package private 14 | 15 | "Constructors" in Go 16 | * declare lower case struct type, thus making it private 17 | * and use a capitalized function `New` which return an object of this struct type 18 | * This is a factory pattern by default 19 | 20 | Composition over Inheritence 21 | * Go prefers embedding a struct inside the other 22 | * Ask yourself what is inheritence? 23 | * base class's data members are in the derived class ==> Go uses composition 24 | * base class's non-private interfaces can be invoked inside derived class 25 | * ===> Go uses interface and package-level access level control 26 | * polymorphism ===> Go uses interface values such as `var x = ('hello world')` 27 | * so any interface can be associated with any type 28 | 29 | Type 30 | --- 31 | * golang is very serious about variable types 32 | * `var x int = 1` 33 | * `y := x` the compiler infers **new** variable y's type 34 | * which is the same as `x`'s 35 | * all cast must be explicit (unlike C++) 36 | * constant `const x = 1.1` 37 | * define your own `struct type` 38 | ``` 39 | type MyFloat float64 40 | type Person struct { 41 | name string 42 | age int 43 | } 44 | ``` 45 | * struct embedding 46 | * Composition over inheritance 47 | ```go 48 | // copied from https://flaviocopes.com/golang-is-go-object-oriented/ 49 | type Dog struct { 50 | Animal // Composition by struct embedding 51 | } 52 | type Animal struct { 53 | Age int 54 | } 55 | func (a *Animal) Move() { 56 | fmt.Println("Animal moved") 57 | } 58 | func main() { 59 | d := Dog{} 60 | d.Age = 3 // Age automatically becomes part of Dog 61 | d.Move() // call Animal's method directly 62 | } 63 | ``` 64 | 65 | Function 66 | --- 67 | * `func needInt(x int) int { return x*10 + 1 }` 68 | * `func needInt(x int) (int, int) { return 1, 2 }` 69 | * bind the return variable 70 | ```go 71 | func needInt(x int) (x, y int) { 72 | x++ 73 | y += x 74 | return 75 | } 76 | ``` 77 | * closure `adder` is bound to its own variable `sum` 78 | ```go 79 | func adder() func(int) int { 80 | sum := 0 81 | return func(x int) int { 82 | sum += x 83 | return sum 84 | } 85 | } 86 | x = adder() // sum == 0 now 87 | x(1) // sum == 1 now 88 | x(2) // sum == 3 now 89 | ``` 90 | * functions can be 91 | * stored as struct fields 92 | * passed as arguments to other functions 93 | * returned from a function or methods 94 | 95 | Method 96 | --- 97 | * there is no class in Go 98 | * a method is a function with a receiver 99 | ```go 100 | // Abs() has v as its receiver 101 | func (v Vertex) Abs() float64 { 102 | return math.Sqrt(v.X*v.X + v.Y*v.Y) 103 | } 104 | ``` 105 | * methods can only be attached to a type in the same package where this type is defined 106 | * Use **pointer receiver** to allow modification 107 | ```go 108 | func (v *Vertex) Scale(f float64) { 109 | v.X = v.X * f 110 | v.Y = v.Y * f 111 | } 112 | ``` 113 | 114 | Interface 115 | --- 116 | See this 
[Post](https://research.swtch.com/interfaces) 117 | 118 | * interface is **a set of methods** 119 | * interface type (`I interface{}`) is **a type** 120 | * `var i I interface{} = "hello"` 121 | 122 | * all types implement at least zero methods 123 | * thus they are all of the empty interface type, i.e. `interface{}` 124 | ```go 125 | func DoSomething(v interface{}) { 126 | // ... 127 | } 128 | ``` 129 | will accept any parameter `v` whatsoever. But they will convert any type 130 | to the interface type. 131 | 132 | * Example of a type `string` implementing an interface `I` 133 | ```go 134 | type I interface { 135 | M() 136 | } 137 | 138 | // we can say: interface `I` holds the concrete type string 139 | // or: M() has a receiver of type string 140 | func (s string) M() { /* whatever */ } 141 | 142 | // create an interface value 143 | var i I = string("hello") 144 | // we will be able to call the method 145 | i.M() 146 | ``` 147 | 148 | * a type can implement different interfaces 149 | * an interface can hold many different types 150 | 151 | * `t, ok := i.(T)` checks if interface value `i` holds a type `T` 152 | * `ok == true` if yes; otherwise, `ok == false` 153 | * `t` will be the underlying value if yes 154 | 155 | * an interface value `i` provides Polymorphism 156 | * for example, every type can have its own way to `fmt.Printf()` if 157 | * interface `Stringer`'s method `String() string` holds this type 158 | * `func (t SomeType) String() string { return }` 159 | 160 | * Polymorphism 161 | ```go 162 | func do(i interface{}) { 163 | switch v := i.(type) { 164 | case T: ... 165 | case S: ... 166 | default: ... 167 | } 168 | } 169 | ``` 170 | 171 | 172 | Error 173 | --- 174 | * built-in interface 175 | ```go 176 | type error interface { 177 | Error() string 178 | } 179 | ``` 180 | * the customized Error type implements an `error` interface 181 | * `func (e MyError) Error() string { /* handle error */ }` 182 | 183 | Logic 184 | --- 185 | * one extra statement before the if condition 186 | ```go 187 | if v := math.Pow(x, n); v < lim { 188 | return v 189 | } 190 | // v is only in the scope of the if condition 191 | ``` 192 | * for loop can ignore end condition `for { /* forever */ }` 193 | * `switch` automatically add `break` for every `case` (unlike C++) 194 | 195 | `defer` 196 | --- 197 | * function executes after the current function returns 198 | * `defer` functions are pushed to a stack 199 | * their executions order is the reversed push order 200 | 201 | Pointer 202 | --- 203 | * No pointer operations!! ==> We do not need the `->` like in C++, always use `.` to access fields. 
204 | * No pointer operation does not imply no pointer 205 | * there is [no pass-by-reference](https://dave.cheney.net/2017/04/29/there-is-no-pass-by-reference-in-go) in Go; one always pass by value (this can avoid many bugs) 206 | * pointer saves the cost of copy in pass-by-value 207 | ``` 208 | type Vertex struct { 209 | X, Y int 210 | } 211 | p := &Vertex{1, 2} // p is a pointer which allows modification of the struct 212 | p.X = 100 213 | ``` 214 | 215 | Array 216 | --- 217 | * `var a [2]string` 218 | * `primes := [6]int{2, 3, 5, 7, 11, 13}` 219 | * `var s []int = primes[1:4]` 220 | * slices are like Python, `a := names[0:2]` 221 | * `a` is a **reference** of the array 222 | * `len(a)` is length of slice 223 | * `cap(a)` is the allocated memory starting from `a[0]` 224 | * slice `x == nil` if `len(x) == 0` 225 | * create slice `a := make([]int, 5)` 226 | * slice of slices 227 | ``` 228 | board := [][]string{ 229 | []string{"_", "_", "_"}, 230 | []string{"_", "_", "_"}, 231 | []string{"_", "_", "_"}, 232 | } 233 | ``` 234 | * `range` iterates through (index, value) pairs, like Python's `enumerate()` 235 | * `for i := range pow { /* index i*/ } ` 236 | * `for i, val := range pow { /* index i*/ } ` 237 | 238 | Map 239 | --- 240 | * `var m map[]` 241 | * Insert by `m[key] = value` 242 | * Obtain value by `value = m[key]` 243 | * Check exist by `val, exit = m[key]` 244 | * `exit == false` if `key` is not in `m` 245 | * Remove by `delete(m, key)` 246 | 247 | Closure 248 | --- 249 | How to implement a local `static` variable inside a function? Use Closure 250 | 251 | ```go 252 | func main() { 253 | counter := newCounter() 254 | counter() // return 1 255 | counter() // return 2 256 | } 257 | 258 | // Here newCounter() returns an anonymous function 259 | // which has access to n even after it exists 260 | func newCounter() func() int { 261 | n := 0 262 | return func() int { 263 | n += 1 264 | return n 265 | } 266 | } 267 | ``` 268 | [Common ways to use closure](https://www.calhoun.io/5-useful-ways-to-use-closures-in-go/) 269 | 270 | Test 271 | --- 272 | Run `go test -v` at the package where the test file has a filename ending with `_test.go`. 273 | 274 | ``` 275 | func TestSum(t *testing.T) { 276 | t.Run("[1,2,3,4,5]", testSumFunc([]int{1, 2, 3, 4, 5}, 15)) 277 | t.Run("[1,2,3,4,-5]", testSumFunc([]int{1, 2, 3, 4, -5}, 5)) 278 | } 279 | 280 | func testSumFunc(numbers []int, expected int) func(*testing.T) { 281 | return func(t *testing.T) { 282 | actual := Sum(numbers) 283 | if actual != expected { 284 | t.Error(fmt.Sprintf("Expected the sum of %v to be %d but instead got %d!", numbers, expected, actual)) 285 | } 286 | } 287 | } 288 | ``` 289 | 290 | [More](https://www.calhoun.io/how-to-test-with-go/) 291 | -------------------------------------------------------------------------------- /language/Javascript/Readme.md: -------------------------------------------------------------------------------- 1 | Javascript Basics 2 | === 3 | AirBnB [Javascript Style Guide](https://github.com/airbnb/javascript) 4 | 5 | Google [Javascript Style Guide](https://google.github.io/styleguide/jsguide.html) 6 | 7 | and this [tutorial](https://www.dofactory.com/tutorial/javascript) 8 | 9 | Variables 10 | --- 11 | Boolean values and numbers are value-based types, whereas strings, objects, arrays, and functions are reference types. 12 | 13 | Value types are copied, passed, and compared by value. Reference types, are copied, passed, and compared by reference. 14 | 15 | JavaScript strings are immutable. 
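For instance (a small, hedged sketch of the point above; the variable names are illustrative only):

```
// Strings behave like values: "modifying" one yields a brand-new string.
const greeting = 'hello';
const shouted = greeting.toUpperCase(); // returns a new string
console.log(greeting); // 'hello'  (unchanged)
console.log(shouted);  // 'HELLO'

// Objects and arrays are reference types: copies share the same underlying data.
const original = { count: 1 };
const alias = original;      // copies the reference, not the object
alias.count = 2;
console.log(original.count); // 2
```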
Modifying the string actually generates a new string. 16 | 17 | Google+AirBnB Javascript style guide actually forbids the usage of `var`. 18 | 19 | > Declare all local variables with either const or let. Use const by default, unless a variable needs to be reassigned. The var keyword must not be used. 20 | 21 | Both `let` and `const` are block-scoped. The difference is `const` does not allow re-assignment and re-declaration. The `const` variable is a constant pointer and it can be optimized by browser. `const` is always preferred when appropriate. 22 | 23 | ``` 24 | // Wrong: `i` is redefined (not reassigned) on each loop step. 25 | for (let i in [1, 2, 3]) { 26 | console.log(i); 27 | } 28 | 29 | // Correct: `i` gets a new binding each iteration 30 | for (const i in [1, 2, 3]) { 31 | console.log(i); 32 | } 33 | ``` 34 | 35 | Array 36 | --- 37 | 38 | Destructuring 39 | 40 | ``` 41 | let [, b,, d] = someArray; 42 | function optionalDestructuring([a = 4, b = 2] = []) { … }; 43 | 44 | const [a, b, c, ...rest] = generateResults(); 45 | 46 | // Spread operator to FLATTEN elements out of one or more other iterables 47 | [...foo] // preferred over Array.prototype.slice.call(foo) 48 | [...foo, ...bar] // preferred over foo.concat(bar) 49 | 50 | // Example 51 | function myFunction(...elements) {} 52 | myFunction(...array, ...iterable, ...generator()); 53 | ``` 54 | 55 | Objects 56 | --- 57 | 58 | JavaScript objects are mutable. The properties can be other objects. 59 | 60 | ``` 61 | var rect = { 62 | width: 20, 63 | height: 10, 64 | color: { red: 0, green: 255, blue: 128 }, // object property 65 | getArea: function() { // method property 66 | return this.width * this.height; 67 | } 68 | }; 69 | ``` 70 | 71 | Dot notation is used more often to access properties, while bracket notation allows you to use the property names that are variables. 72 | 73 | To get a list of property names from an object use the for-in loop. 74 | 75 | ``` 76 | var car = { make: "Toyota", model: "Camry" }; 77 | for (var prop in car) { 78 | // => make: Toyota, and model: Camry 79 | alert(prop + ": " + car[prop]); 80 | } 81 | ``` 82 | 83 | Class Methods 84 | --- 85 | 86 | Shorthands 87 | 88 | ``` 89 | // class example 90 | class { 91 | getObjectLiteral() { 92 | this.stuff = 'fruit'; 93 | return { 94 | stuff: 'candy', 95 | method: () => this.stuff, // Returns 'fruit' 96 | }; 97 | } 98 | } 99 | 100 | // obj example 101 | const foo = 1; 102 | const bar = 2; 103 | const obj = { 104 | foo, 105 | bar, 106 | method() { return this.foo + this.bar; }, 107 | }; 108 | ``` 109 | 110 | Constructor Function 111 | --- 112 | * Constructor functions are capitalized by convention 113 | * Calling a constructor function without `new` is like calling an ordinary function. Doing this pollutes the global namespace!! 114 | * With `new`, you create an object, so the keyword `this` in the constructor function refers to this newly created object. 115 | 116 | ``` 117 | function Book(isbn) { 118 | this.isbn = isbn; 119 | this.getIsbn = function () { 120 | return "Isbn is " + this.isbn; 121 | }; 122 | } 123 | var book = new Book("901-3865"); 124 | ``` 125 | 126 | The code above creates `getIsbn` function each time when `Book()` is called. 127 | We can use a single `getIsbn` function to skip the creation of function `getIsbn`. 128 | See below. 
129 | ``` 130 | function Book(isbn) { 131 | this.isbn = isbn; 132 | } 133 | Book.prototype.getIsbn = function () { 134 | return "Isbn is " + this.isbn; 135 | }; 136 | var book = new Book("901-3865"); 137 | ``` 138 | 139 | `this` 140 | --- 141 | 142 | Calling `this` would find the local object (go up until it hits the global object, i.e. the "window" object, which is just like finding the variables declared by the `var` keyword.) 143 | 144 | Only use this in class constructors and methods, or in arrow functions defined within class constructors and methods. You play with `this` to develop a framework but avoid it for development at the application level. 145 | 146 | ``` 147 | var name = 'First'; 148 | var student = { 149 | name: 'Middle', 150 | detail: { 151 | name: 'Last', 152 | getName: function() { 153 | alert(this.name); 154 | } 155 | } 156 | } 157 | var result = student.detail.getName; 158 | result(); // => 'First' (global scope) 159 | student.detail.getName(); // => 'Last' (detail scope) 160 | ``` 161 | 162 | Class 163 | --- 164 | [Understanding Classes in JavaScript By Tania Rascia](https://www.digitalocean.com/community/tutorials/understanding-classes-in-javascript) 165 | 166 | Extending a class: subclass constructors must call super() before setting any fields or otherwise accessing `this`. 167 | 168 | ``` 169 | // Initializing a class 170 | class Hero { 171 | constructor(name, level) { 172 | this.name = name; 173 | this.level = level; 174 | } 175 | 176 | // Adding a method to the constructor 177 | greet() { 178 | return `${this.name} says hello.`; 179 | } 180 | } 181 | 182 | // Creating a new class from the parent 183 | class Mage extends Hero { 184 | constructor(name, level, spell) { 185 | // Chain constructor with super 186 | super(name, level); 187 | 188 | // Add a new property 189 | this.spell = spell; 190 | } 191 | } 192 | ``` 193 | 194 | The class keyword allows clearer and more readable class definitions than defining `prototype` properties. 195 | 196 | Prototypal inheritance 197 | --- 198 | Setting prototypes to an object is done by setting an object's prototype attribute to a prototype object. 199 | A prototype is just a single object and derived object instances hold only references to their prototype. 
200 | ``` 201 | var account = { 202 | bank: "Bank of America", // just the default value 203 | getBank: function() { 204 | return this.bank; 205 | } 206 | }; 207 | function createObject (p) { 208 | var F = function () {}; // Create a new and empty function 209 | F.prototype = p; // The function has a prototype property (Yes, one can play with the prototype before actually `new function`) 210 | return new F(); // new instantiate this object 211 | } 212 | var savings = createObject(account); 213 | 214 | alert(savings.getBank()); // => Bank of America 215 | savings.bank = "JP Morgan Chase"; 216 | alert(savings.getBank()); // => JP Morgan Chase 217 | ``` 218 | 219 | Anonymous Immediate Function 220 | --- 221 | The anonymous immediate function is the function wrapped in parentheses 222 | ``` 223 | var module = (function() { 224 | … 225 | … 226 | }()) 227 | ``` 228 | 229 | * it has no function name 230 | * it gets executed immediately when JavaScript encounters it 231 | * Withint such function, variables declared with `var` are private 232 | 233 | Private member shared by prototype 234 | --- 235 | ``` 236 | function Book(author) { 237 | var author = author; // private instance variable 238 | this.getAuthor = function () { 239 | return author; // privileged instance method 240 | }; 241 | } 242 | Book.prototype = (function () { 243 | var label = "Author: "; // private prototype variable 244 | return { 245 | getLabel: function () { // privileged prototype method 246 | return label; 247 | } 248 | }; 249 | }()); 250 | var book1 = new Book('James Joyce'); 251 | alert(book1.getLabel() + book1.getAuthor()); // => Author: James Joyce 252 | var book2 = new Book('Virginia Woolf'); 253 | alert(book2.getLabel() + book2.getAuthor()); // => Author: Virginia Woolf 254 | ``` 255 | 256 | Here both `book1` and `book2` adopts a prototype which has the same private variable `label`. 257 | 258 | Functions 259 | --- 260 | 261 | Functions are copied by reference. 262 | 263 | 264 | nested function closures 265 | --- 266 | 267 | `typeof new function(){} => "object"` 268 | 269 | When you hold a reference to a function with a variable by 270 | ``` 271 | function counter() { 272 | var index = 0; 273 | function increment() { 274 | return ++index; 275 | } 276 | return increment; 277 | } 278 | var userIncrement = counter(); // a reference to inner increment() 279 | var adminIncrement = counter(); // a reference to inner increment() 280 | userIncrement(); // => 1 281 | userIncrement(); // => 2 282 | adminIncrement(); // => 1 283 | adminIncrement(); // => 2 284 | adminIncrement(); // => 3 285 | ``` 286 | In such cases, Javascript will maintain a second, but hidden, reference to its closure which will NOT be destroyed after this function returns. (But every execution will create its own copy of the closure, for example, `counter()` executes twice here.). 287 | 288 | Even after the function returns, these local `var` like `index` in the closure will NOT be destroyed. 289 | 290 | For loop 291 | --- 292 | 293 | Use `for (let i in x)` for dict `x` such as the object keys; Use `for (let i of y)` for iterable `y` such as `maps`, `sets`, `generators` and `array`. 294 | 295 | Prefer `for of` over `for in` whenever possible. 296 | 297 | `for-in` loops may only be used on dict-style objects and should not be used to iterate over an array. 
298 | 299 | 300 | 301 | 302 | 303 | 304 | 305 | 306 | 307 | 308 | 309 | 310 | 311 | 312 | 313 | -------------------------------------------------------------------------------- /language/Python/Readme.md: -------------------------------------------------------------------------------- 1 | Python 2 | === 3 | 4 | MRO (Method Resolution Order) 5 | --- 6 | The class C inherits class A and B. 7 | 8 | If both A and B define the same parameter `val`, 9 | a risk is that both A and B's `__init__` modifies this `val` in an unknown order. 10 | 11 | The `super` method makes sure only one copy of `val` is preserved, 12 | but it needs a MRO to determine eventually which class's method gets called. 13 | 14 | ``` 15 | # Base 16 | # / \ 17 | # A B 18 | # \ / 19 | # C 20 | # every class defines a variable val 21 | 22 | class C(A, B): 23 | def show(self): 24 | print(self.val) 25 | 26 | def __init__(self, val): 27 | super(C, self).__init__(val) 28 | ``` 29 | 30 | `__init__.py` 31 | --- 32 | `__init__.py` prevents directories with a common name from unintentionally hiding valid modules that occur later (deeper) on the module search path. It marks a package from module so you can import modules (i.e. the .py files) by their path in the directory. 33 | 34 | 35 | 36 | -------------------------------------------------------------------------------- /language/Readme.md: -------------------------------------------------------------------------------- 1 | Programming Languages Basics 2 | === 3 | 4 | [Java](Java/Readme.md) 5 | [C++](Cplusplus/Readme.md) 6 | [JavaScript](Javascript/Readme.md) 7 | [Go](Go/Readme.md) 8 | -------------------------------------------------------------------------------- /other_thoughts/Readme.md: -------------------------------------------------------------------------------- 1 | ## Notes 2 | 1. Fully exploit the order of your input: 3 | Some questions ask for the maximum sth. while the input is somehow sorted. 4 | You can use the ordering of sorted input. 5 | For example: 6 | ``` 7 | # Leetcode 720: longest word with the smallest lexicographical order. 8 | # NOTE: lexicographical order is ensured by the sort! 9 | words.sort() 10 | ret = '' 11 | tmp = set(['']) 12 | for word in words: 13 | if word[:-1] in tmp: 14 | if len(word) > len(ret): 15 | ret = word 16 | tmp.add(word) 17 | return ret 18 | ``` 19 | And 20 | * Problem 745 Prefix and Suffix Search - latter word has larger weight 21 | * Problem 332, return the itinerary that has the smallest lexical order 22 | ``` 23 | def findItinerary(self, tickets): 24 | targets = collections.defaultdict(list) 25 | for a, b in sorted(tickets)[::-1]: # sort the tickets so for the same src, dst is sorted reversely here. 26 | targets[a] += b, 27 | route, stack = [], ['JFK'] 28 | while stack: 29 | while targets[stack[-1]]: 30 | stack += targets[stack[-1]].pop(), 31 | route += stack.pop(), 32 | return route[::-1] 33 | ``` 34 | 35 | --------------------------------------------------------------------------------