├── queue
│   ├── images
│   │   └── queue.png
│   └── README.md
├── stack
│   ├── images
│   │   ├── stack.png
│   │   ├── linked_list.png
│   │   ├── python_linked_list.png
│   │   └── linked_list_implementation.png
│   └── README.md
├── mergesort
│   ├── images
│   │   ├── mergesort_tree.png
│   │   ├── bottom-up_mergesort.png
│   │   ├── mergesort_animation.gif
│   │   └── top-down_mergesort.png
│   ├── mergesort.py
│   └── README.md
├── graph
│   ├── images
│   │   ├── graph_adjacency_list.png
│   │   └── graph_connected_components.png
│   └── README.md
├── breadth-first-search
│   ├── images
│   │   ├── bfs_maze.png
│   │   ├── bfs_result.png
│   │   └── bfs_trace.png
│   ├── breadth_first_search.py
│   └── README.md
├── depth-first-search
│   ├── images
│   │   ├── dfs_result.png
│   │   ├── dfs_trace.png
│   │   └── dfs_maze_definition.png
│   ├── dept_first_search.py
│   ├── dfs_connected_components.py
│   └── README.md
├── disjoint-set
│   ├── images
│   │   ├── quick-find_trace.png
│   │   ├── quick-find_overview.png
│   │   ├── quick-union_trace.png
│   │   ├── quick-union_overview.png
│   │   ├── performance_comparison.png
│   │   ├── weighted-quick-union_trace.png
│   │   └── weighted-quick-union_overview.png
│   ├── ex1_successor.py
│   ├── ex2_canonical_element.py
│   ├── disjoint_set.py
│   └── README.md
├── quicksort
│   ├── images
│   │   ├── quicksort_animation.gif
│   │   └── quicksort_overview.png
│   ├── quicksort.py
│   └── README.md
├── priority-queue
│   ├── images
│   │   ├── pq_bottom-up_swim.png
│   │   ├── pq_top-down_sink.png
│   │   ├── pq_binary_heap_insert.png
│   │   └── pq_binary_heap_remove_maximum.png
│   └── README.md
├── binary-search-trees
│   ├── images
│   │   └── bst_example_tree.png
│   └── README.md
├── digraph
│   ├── images
│   │   ├── graph_connected_components_code.png
│   │   ├── digraph_data_structure_properties.png
│   │   ├── digraph_topological_sort_postorder.png
│   │   ├── digraph_topological_sort_postorder_complete.png
│   │   ├── digraph_strong_connected_components_kosaraju_sharir_code.png
│   │   └── digraph_strong_connected_components_kosaraju_sharir_algorithm.png
│   ├── postorder.py
│   └── README.md
├── minimum-spanning-trees
│   ├── images
│   │   ├── mst_tree_example.png
│   │   ├── mst_greedy_algorithm.png
│   │   ├── mst_kruskal_algorithm.png
│   │   ├── mst_prim_algorithm_0.png
│   │   ├── mst_prim_algorithm_1.png
│   │   ├── mst_prim_algorithm_2.png
│   │   ├── mst_prim_algorithm_6.png
│   │   ├── mst_prim_algorithm_7.png
│   │   ├── mst_kruskal_complexity.png
│   │   └── mst_prim_algorithm_end.png
│   └── README.md
├── balanced-search-trees
│   ├── images
│   │   ├── bst_2-3_tree_search_it.png
│   │   └── bst_insert_2-node_3-node.png
│   └── README.md
├── red-black-balanced-search-trees
│   ├── images
│   │   ├── comparison_complexity.png
│   │   ├── rbbst_3-node_to_two_2-nodes.png
│   │   └── rbbst_insert_into_a_3-node.png
│   └── README.md
├── LICENSE
├── shortest-path
│   └── README.md
├── .gitignore
├── symbol-tables
│   └── README.md
├── oo-programming
│   └── README.md
├── README.md
├── interview
│   └── google.md
└── hash-functions
    └── README.md

(The binary image files listed above are not inlined; each resolves to
https://raw.githubusercontent.com/ashishpatel26/algorithms/master/<path-to-image>.)
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2017 Massimiliano Patacchiola

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/depth-first-search/dept_first_search.py:
--------------------------------------------------------------------------------
# Creating the graph manually.
# The graph is the same used in Chapter 4.1 (page 533) of the book "Algorithms II"
vertex_list = [[6, 2, 1], [0], [0], [5, 4], [5, 6, 3], [3, 4, 0], [0, 4]]
# List used to mark the visited vertices
marked_list = [False, False, False, False, False, False, False]
# List used to record from which node each vertex was first visited;
# this list can be used to move backward from a given node.
edgeto_list = [-1, -1, -1, -1, -1, -1, -1]

def depth_first_search(v):
    if marked_list[v] == False:
        marked_list[v] = True  # mark the node
        print(v)  # print the marked node
        adjacent_list = vertex_list[v]  # get adjacent nodes
        for v_adj in adjacent_list:
            if marked_list[v_adj] == False:
                edgeto_list[v_adj] = v
                depth_first_search(v_adj)  # recursive call

print("Starting DFS algorithm...")
print("Marked list:")
print(marked_list)
print("edgeto list:")
print(edgeto_list)
depth_first_search(0)
print("Marked list:")
print(marked_list)
print("edgeto list:")
print(edgeto_list)
--------------------------------------------------------------------------------
/shortest-path/README.md:
--------------------------------------------------------------------------------

In graph theory, the [shortest path problem](https://en.wikipedia.org/wiki/Shortest_path_problem) is the problem of finding a path between two vertices (or nodes) in a graph such that the sum of the weights of its constituent edges is minimized. There are many possible solutions to the problem; the most famous are:

- **Dijkstra**'s algorithm solves the single-source shortest path problem when the edge weights are non-negative.
- **Bellman–Ford** algorithm solves the single-source problem even if edge weights may be negative.
- **A*** search algorithm solves the single-pair shortest path problem, using heuristics to try to speed up the search.
- **Viterbi** algorithm solves the shortest stochastic path problem with an additional probabilistic weight on each node.

In a weighted graph the **time complexity** of Dijkstra's algorithm is *O(V^2)* with a simple array-based implementation (a binary-heap priority queue brings it down to *O((V+E) log V)*). In an unweighted graph, breadth-first search can be used to solve the problem with *O(V+E)* time complexity.


Implementation
---------------
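A minimal sketch of Dijkstra's algorithm with a binary-heap priority queue (Python's `heapq`), written from the description above rather than taken from the repository; the adjacency-list format of `(neighbor, weight)` pairs is an assumption, since the other modules here store unweighted adjacency lists.

```Python
import heapq

def dijkstra(adj, s):
    '''Single-source shortest paths with non-negative weights.

    adj[v] is a list of (neighbor, weight) pairs; returns the list of
    shortest distances from s to every vertex.
    '''
    dist = [float('inf')] * len(adj)
    dist[s] = 0
    pq = [(0, s)]  # priority queue of (distance, vertex)
    while pq:
        d, v = heapq.heappop(pq)
        if d > dist[v]:
            continue  # stale entry: the vertex was already relaxed
        for w, weight in adj[v]:
            if dist[v] + weight < dist[w]:  # edge relaxation
                dist[w] = dist[v] + weight
                heapq.heappush(pq, (dist[w], w))
    return dist

# Example on a small 4-vertex graph
adj = [[(1, 5), (2, 1)], [(3, 1)], [(1, 2), (3, 7)], []]
print(dijkstra(adj, 0))  # [0, 3, 1, 4]
```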

Methods
--------


Applications
------------


Quiz
-----


Material
--------
- **Coursera Algorithms Part 2**: week
- **Algorithms**, Sedgewick and Wayne (2014): Chapter ""
--------------------------------------------------------------------------------
/digraph/postorder.py:
--------------------------------------------------------------------------------
# Creating the graph manually.
# The graph is the same used in Chapter 4.1 (page 533) of the book "Algorithms II"
vertex_list = [[1, 5, 2], [4], [], [4, 5, 2, 6], [], [2], [0, 4]]
# List used to mark the visited vertices
marked_list = [False, False, False, False, False, False, False]
# List used to record from which node each vertex was first visited;
# this list can be used to move backward from a given node.
edgeto_list = [-1, -1, -1, -1, -1, -1, -1]
# The stack where we are going to store the postorder values.
# The reverse of this list gives a topological order of the digraph.
postorder_list = list()

def depth_first_search(v):
    if marked_list[v] == False:
        marked_list[v] = True  # mark the node
        #print(v)  # print the marked node
        adjacent_list = vertex_list[v]  # get adjacent nodes
        for v_adj in adjacent_list:
            if marked_list[v_adj] == False:
                edgeto_list[v_adj] = v
                depth_first_search(v_adj)  # recursive call
        postorder_list.append(v)
        print(v)  # print the postorder node

print("Starting DFS algorithm...")
print("Marked list:")
print(marked_list)
print("postorder list:")
print(postorder_list)
depth_first_search(0)
depth_first_search(3)
print("Marked list:")
print(marked_list)
print("postorder list:")
print(postorder_list)
--------------------------------------------------------------------------------
/breadth-first-search/breadth_first_search.py:
--------------------------------------------------------------------------------
from collections import deque

# Creating the graph manually.
# The graph is the same used in Chapter 4.1 (page 533) of the book "Algorithms II"
vertex_list = [[2, 1, 5], [0, 2], [0, 1, 3, 4], [2, 5, 4], [2, 3], [3, 0]]
# List used to mark the visited vertices
marked_list = [False, False, False, False, False, False]
# List used to record from which node each vertex was first visited;
# this list can be used to move backward from a given node.
edgeto_list = [-1, -1, -1, -1, -1, -1]

def breadth_first_search(s):
    queue = deque()
    queue.append(s)
    marked_list[s] = True  # mark the starting node
    print(s)  # print the starting node
    while len(queue) != 0:
        v = queue.popleft()  # important to pop-left (FIFO)
        adjacent_list = vertex_list[v]  # get adjacent nodes
        for v_adj in adjacent_list:
            if marked_list[v_adj] == False:
                queue.append(v_adj)
                marked_list[v_adj] = True
                edgeto_list[v_adj] = v
                print(v_adj)  # print the marked node

print("Starting BFS algorithm...")
print("Marked list:")
print(marked_list)
print("edgeto list:")
print(edgeto_list)
breadth_first_search(0)
print("Marked list:")
print(marked_list)
print("edgeto list:")
print(edgeto_list)
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
*~

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# dotenv
.env

# virtualenv
.venv
venv/
ENV/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
--------------------------------------------------------------------------------
/depth-first-search/dfs_connected_components.py:
--------------------------------------------------------------------------------
# Creating the graph manually.
# The graph is the same used in Chapter 4.1 (page 533) of the book "Algorithms II"
vertex_list = [[6,2,1,5], [0], [0], [5,4], [5,6,3], [3,4,0], [0,4], [8], [7], [10,12,11], [9], [9,12], [11,9]]
# List used to mark the visited vertices
marked_list = [False, False, False, False, False, False, False, False, False, False, False, False, False]
# List used to record from which node each vertex was first visited;
# this list can be used to move backward from a given node.
edgeto_list = [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1]
# Variables for the connected components
cc_counter = 0
cc_list = [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1]

def depth_first_search(v):
    if marked_list[v] == False:
        marked_list[v] = True  # mark the node
        print(v)  # print the marked node
        adjacent_list = vertex_list[v]  # get adjacent nodes
        for v_adj in adjacent_list:
            if marked_list[v_adj] == False:
                edgeto_list[v_adj] = v
                depth_first_search(v_adj)  # recursive call
                # Additional code for connected components:
                # every vertex reached from the same starting node
                # gets the same component identifier.
                cc_list[v_adj] = cc_counter

print("Starting DFS algorithm...")
print("Marked list:")
print(marked_list)
print("edgeto list:")
print(edgeto_list)
print("cc list:")
print(cc_list)

# Run a DFS from every unmarked vertex:
# each run discovers exactly one connected component.
for v in range(len(marked_list)):
    if marked_list[v] == False:
        cc_list[v] = cc_counter  # assign the ID to the starting node
        depth_first_search(v)  # depth search from the starting node
        cc_counter += 1  # increment the ID counter

print("Marked list:")
print(marked_list)
print("edgeto list:")
print(edgeto_list)
print("cc list:")
print(cc_list)
--------------------------------------------------------------------------------
/queue/README.md:
--------------------------------------------------------------------------------


A [queue](https://en.wikipedia.org/wiki/Queue_(abstract_data_type)) is an [abstract data type](https://en.wikipedia.org/wiki/Abstract_data_type) where the entities in the collection are kept in order and the principal (or only) operations on the collection are the addition of entities to the rear terminal position, known as enqueue, and the removal of entities from the front terminal position, known as dequeue. This makes the queue a First-In-First-Out (FIFO) data structure.

Implementation
--------------

1. **[Linked list](https://en.wikipedia.org/wiki/Linked_list)**: each element is defined by the data and a reference to the next element in the queue. It is necessary to keep two additional references which store the head and the tail of the queue. When one of them is missing, the time of the enqueue and/or dequeue operations increases. For instance, if the tail reference is missing, when adding an element (enqueue) it is necessary to traverse all the elements in order to arrive at the last one (which takes linear time) and only then link the new element. If the reference to the tail is present, it is only necessary to follow this reference and then attach the new element (which takes constant time).

2. **Resizing array:** fixed-length arrays are limited in capacity, but items do not need to be copied towards the head of the queue as elements are dequeued. The simple trick of turning the array into a closed circle and letting the head and tail drift around endlessly in that circle makes it unnecessary to ever move items stored in the array: if *N* is the size of the array, computing indices modulo *N* turns the array into a circle. When the maximum capacity of the array is exceeded, the content is copied into a larger array.

Methods
--------

- `enqueue()`: it adds an element at the tail of the queue.
- `dequeue()`: it removes and returns the element at the head of the queue.
- `isEmpty()`: it returns True if the queue is empty.
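This module ships no Python file; the following is a minimal sketch (not the author's implementation) of the resizing circular-array strategy described above, with the head index and the implicit tail index wrapping modulo the array capacity.

```Python
class ArrayQueue:
    def __init__(self, capacity=4):
        self._a = [None] * capacity
        self._head = 0  # index of the front element
        self._size = 0

    def isEmpty(self):
        return self._size == 0

    def enqueue(self, item):
        if self._size == len(self._a):
            self._resize(2 * len(self._a))  # double the array when full
        tail = (self._head + self._size) % len(self._a)  # wrap around
        self._a[tail] = item
        self._size += 1

    def dequeue(self):
        if self.isEmpty():
            raise IndexError("dequeue from empty queue")
        item = self._a[self._head]
        self._a[self._head] = None  # avoid loitering
        self._head = (self._head + 1) % len(self._a)
        self._size -= 1
        return item

    def _resize(self, capacity):
        # Copy the circular content into a new array starting at index 0.
        self._a = [self._a[(self._head + i) % len(self._a)]
                   for i in range(self._size)] + [None] * (capacity - self._size)
        self._head = 0

q = ArrayQueue()
for x in [1, 2, 3, 4, 5]:  # the fifth enqueue triggers a resize
    q.enqueue(x)
print(q.dequeue(), q.dequeue())  # 1 2
```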


Material
--------
- **Coursera Algorithms Part 1**: week 2
- **Algorithms**, Sedgewick and Wayne (2014): Chapter 1.3
--------------------------------------------------------------------------------
/quicksort/quicksort.py:
--------------------------------------------------------------------------------
from random import shuffle

class Quicksort():
    '''
    Quicksort is a recursive program that sorts a subarray a[lo..hi] by using a partition() method
    that puts a[j] into its final position and arranges the rest of the entries such that the
    recursive calls finish the sort.
    '''

    def sort(self, a):
        '''it sorts the input array.

        The random shuffle protects against the quadratic worst case
        on an already-sorted input.
        '''
        shuffle(a)
        self._sort(a, 0, len(a)-1)

    def _sort(self, a, lo, hi):
        if hi <= lo: return
        j = self._partition(a, lo, hi)
        self._sort(a, lo, j-1)  # sort the left part a[lo .. j-1]
        self._sort(a, j+1, hi)  # sort the right part a[j+1 .. hi]

    def _partition(self, a, lo, hi):
        '''it partitions a[lo..hi] around the pivot a[lo] and returns the pivot's final index.'''
        v = a[lo]    # partitioning item (pivot)
        i = lo       # left scan index
        j = hi + 1   # right scan index
        while True:
            # Scan right: find an item not smaller than the pivot
            i += 1
            while a[i] < v:
                if i == hi: break
                i += 1
            # Scan left: find an item not greater than the pivot
            j -= 1
            while v < a[j]:
                if j == lo: break
                j -= 1
            # Stop when the scan indices cross
            if i >= j: break
            a[i], a[j] = a[j], a[i]  # exchange a[i] and a[j]
        a[lo], a[j] = a[j], a[lo]  # put the pivot into its final position
        return j


def main():
    a = [6, 5, 1, 3, 8, 7, 2, 4]
    my_sorter = Quicksort()
    print(a)
    my_sorter.sort(a)
    print(a)

if __name__ == "__main__":
    main()
--------------------------------------------------------------------------------
/binary-search-trees/README.md:
--------------------------------------------------------------------------------

[Binary search trees](https://en.wikipedia.org/wiki/Binary_search_tree) are a particular type of [symbol table](https://en.wikipedia.org/wiki/Symbol_table) data structure that combines the flexibility of insertion in a linked list with the efficiency of search in an ordered array. Specifically, using two links per node (instead of the one link per node found in linked lists) leads to a very efficient implementation. In a binary search tree each node has a key, a value, a left link, and a right link.
![binary search tree example](images/bst_example_tree.png)

In the standard implementation of a binary search tree, a node with a lower key is linked on the left, and a node with a higher key is linked on the right.
Differently from the [binary heap](https://en.wikipedia.org/wiki/Binary_heap) data structure studied in the priority queue module, in a binary search tree a parent node can have a lower key than one of its children. For example, in a binary heap the root must always be the largest element; this is not true for a binary search tree.

Implementation
--------------

1. **Linked nodes (binary search)**: each node stores a key, a value, and links to its left and right subtrees, and `get()`/`put()` walk down from the root comparing keys. The main problem is with the deletion operation (Hibbard deletion), which is still the method commonly used today: Hibbard deletion unbalances the tree, leading to a sqrt(N) height after many insertions and deletions. A sketch of the search and insert operations is given below.
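This module ships no Python file; the following is a minimal, hypothetical sketch of the `get()`/`put()` pair on linked nodes (iterative search, recursive insert), not the book's reference implementation.

```Python
class Node:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.left = None
        self.right = None

class BST:
    def __init__(self):
        self.root = None

    def get(self, key):
        # Walk down from the root: go left for smaller keys, right for larger.
        node = self.root
        while node is not None:
            if key < node.key:
                node = node.left
            elif key > node.key:
                node = node.right
            else:
                return node.value
        return None  # the key is not in the tree

    def put(self, key, value):
        self.root = self._put(self.root, key, value)

    def _put(self, node, key, value):
        if node is None:
            return Node(key, value)  # insertion point reached
        if key < node.key:
            node.left = self._put(node.left, key, value)
        elif key > node.key:
            node.right = self._put(node.right, key, value)
        else:
            node.value = value  # existing key: overwrite the value
        return node

bst = BST()
for k, v in [(5, "five"), (2, "two"), (8, "eight")]:
    bst.put(k, v)
print(bst.get(2))  # two
```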

Methods
--------

`put(key, value)`: insert a new pair of key-value. It must not be allowed to associate a `None` (Python) value.

`get(key)`: return the value associated with the key. If the key does not exist it is possible to return `None`.

`remove(key)`: remove the key and the associated value.

`rank(key, lo, hi)`: (ordered array) the method is used in an ordered array to search for a specific key using binary search. If the key is in the array it returns the index.

Applications
------------

1. Dictionaries: which is the application that also gives the name to the data structure (key=word, value=definition)
2. Account management: it can be used to process transactions (key=account-id, value=transaction detail)
3. Web search: find relevant pages based on keywords (key=keyword, value=web-pages)

Quiz
-----


Material
--------
- **Coursera Algorithms Part 1**: week 4
- **Algorithms**, Sedgewick and Wayne (2014): Chapter 3.2 "Binary Search Trees"
--------------------------------------------------------------------------------
/disjoint-set/ex1_successor.py:
--------------------------------------------------------------------------------
'''
-----------------------------------------------------------------------------------------------------------
Successor with delete.
Given a set of n integers S={0,1,...,n-1} and a sequence of requests of the following form:

1. Remove x from S
2. Find the successor of x: the smallest y in S such that y>=x.

design a data type so that all operations (except construction) take logarithmic time or better in the worst case.
-----------------------------------------------------------------------------------------------------------

To tackle this problem I suppose that the set S contains positive integers, sorted, with no duplicates.
Based on this assumption I can model the set S as an array of integers where each index represents an element
of S and the value is a two-element list containing the predecessor and successor of that element.

For a set with 5 elements, S={0,1,2,3,4}, I generate the array [[0, 1], [0, 2], [1, 3], [2, 4], [3, 4]].
When an element is removed it is possible to access the element, get the index of the predecessor, go to the
predecessor and make its successor point to the element's successor, then go to the element's successor and
make its predecessor point to the element's predecessor.

To mark the element as removed it is possible to set the associated list to [] (empty).
The complexity of each operation is O(1) since it only requires access to array elements, which
can be obtained in constant time.
'''

def remove(S, x):
    if S[x] == []: return False
    pre = S[x][0]  # get the element's predecessor
    suc = S[x][1]  # get the element's successor
    S[pre][1] = suc  # the predecessor's successor now points to the element's successor
    S[suc][0] = pre  # the successor's predecessor now points to the element's predecessor
    S[x] = []  # set the element to empty
    return True

def find(S, x):
    if x < 0 or x >= len(S) or S[x] == []: return -1
    return S[x][1]  # return the successor

def main():

    N = 10
    S = [[n-1, n+1] for n in range(N)]
    S[0][0] = 0
    S[N-1][1] = N-1

    print("S = " + str(S))
    print("remove(8) = " + str(remove(S, 8)) + "; S=" + str(S))
    print("remove(7) = " + str(remove(S, 7)) + "; S=" + str(S))
    print("find(5) = " + str(find(S, 5)))
    print("find(6) = " + str(find(S, 6)))
    print("find(9) = " + str(find(S, 9)))

if __name__ == "__main__":
    main()
--------------------------------------------------------------------------------
/mergesort/mergesort.py:
--------------------------------------------------------------------------------
class Mergesort():

    def TDsort(self, a, lo, hi):
        '''it sorts the input array in top-down order.

        Based on the lo and hi variables it sorts the values inside a.
        The array a is divided in two halves and sorted recursively.
        '''
        if hi <= lo: return  # termination condition for the recursion
        mid = int(lo + (hi - lo) / 2)  # estimate the mid point
        self.TDsort(a, lo, mid)  # sort the left half
        self.TDsort(a, mid+1, hi)  # sort the right half
        self._merge(a, lo, mid, hi)  # merge the results

    def BUsort(self, a):
        '''it sorts the input array in bottom-up order.

        It merges sub-arrays of size 1, then 2, then 4, and so on,
        doubling the sub-array size at every pass (log N passes).
        '''
        N = len(a)
        sz = 1  # sub-array size
        while sz < N:
            lo = 0  # sub-array index
            while lo < N - sz:
                mid = lo + sz - 1
                hi = min(lo + sz + sz - 1, N - 1)
                self._merge(a, lo, mid, hi)
                lo += sz + sz  # jump to the next pair of sub-arrays
            sz += sz  # double the sub-array size
    def _merge(self, a, lo, mid, hi):
        '''it merges two sorted halves of the input array.

        Based on the lo, mid and hi indices it merges the sorted
        halves a[lo..mid] and a[mid+1..hi] into sorted order.
        '''
        i = lo
        j = mid + 1
        # clone the list
        aux = a[:]
        # here the elements of a[] are replaced with the smallest
        # remaining element from the two halves of aux[]
        for k in range(lo, hi+1):
            if i > mid:
                a[k] = aux[j]
                j += 1  # left half exhausted
            elif j > hi:
                a[k] = aux[i]
                i += 1  # right half exhausted
            elif aux[j] < aux[i]:
                a[k] = aux[j]
                j += 1  # right value less than left value
            else:
                a[k] = aux[i]
                i += 1  # right value greater or equal to left value


def main():
    a = [6, 5, 1, 3, 8, 7, 2, 4]
    my_sorter = Mergesort()
    lo = 0
    hi = len(a) - 1  # hi ==> 8-1 = 7
    print(a)
    #my_sorter.TDsort(a, lo, hi)
    my_sorter.BUsort(a)
    print(a)

if __name__ == "__main__":
    main()
--------------------------------------------------------------------------------
/mergesort/README.md:
--------------------------------------------------------------------------------

The [mergesort](https://en.wikipedia.org/wiki/Merge_sort) is a divide-and-conquer sorting algorithm invented by John von Neumann in 1945. The first step of the algorithm consists in dividing the list in *n* sublists (each containing 1 element). The second step consists in merging the sublists, producing new sorted sublists with more elements. The process finishes when only 1 sublist remains. As the name suggests there are two main methods in this algorithm, the method `merge()` and the method `sort()`. The input is an array `a[]` containing comparable elements. An auxiliary array called `aux[]` is used in order to store the temporary ordered elements. The algorithm has N*log(N) complexity because it divides the array to sort in sub-arrays, reducing (or increasing) the size by powers of 2. Moreover this complexity is kept also in the worst case, while other algorithms such as quicksort do not have this property.
Mergesort is not only efficient but also **stable**. The stability of a sorting algorithm is its ability to preserve the order of equal elements when sorting by different criteria. For example, when sorting by name and then by place, a non-stable algorithm will lose the first sorting when the second is applied.
Finally mergesort is highly parallelizable using the Three Hungarians' Algorithm.


Implementation
--------------

1. **Top-Down:** A `mid` point variable is estimated, dividing the input array in half. Each half is then divided in half again, and so on. The last couples of elements are sorted and then merged back up.

2. **Bottom-Up:** Starting from sub-arrays of size 1, the `merge()` method is applied to each consecutive couple. When all the couples are exhausted, sub-arrays of 2 elements are merged into sub-arrays of 4 elements, then 8, and so on. The size of the sub-arrays considered is doubled every time (N log(N) complexity).


Methods
--------

- `merge()`: it merges two sorted halves moving along them and selecting the smallest element
- `sort()`: it sorts the input array

Applications
------------

1. Sorting a list of numbers
2. Sort a list of music songs by name
3. Sort polar coordinates of points in a polar plot

Quiz
-----

1. Merging with smaller auxiliary array. Suppose that the subarray `a[0]` to `a[n−1]` is sorted and the subarray `a[n]` to `a[2*n−1]` is sorted. How can you merge the two subarrays so that `a[0]` to `a[2*n−1]` is sorted using an auxiliary array of length `n` (instead of `2n`)?

2. Counting inversions. An inversion in an array `a[]` is a pair of entries `a[i]` and `a[j]` such that `i<j` but `a[i]>a[j]`. Given an array, design a linearithmic algorithm to count the number of inversions (one possible approach is sketched below).
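A possible linearithmic approach for Quiz 2 (a sketch, one of several valid solutions): piggyback on mergesort and count, during each merge, how many elements are still pending in the left half whenever an element of the right half is copied out first.

```Python
def count_inversions(a):
    # Sort a copy of the array with mergesort, counting the inversions
    # discovered during each merge step.
    def sort(a):
        if len(a) <= 1:
            return a, 0
        mid = len(a) // 2
        left, inv_left = sort(a[:mid])
        right, inv_right = sort(a[mid:])
        merged = []
        inv = inv_left + inv_right
        i = j = 0
        while i < len(left) and j < len(right):
            if right[j] < left[i]:
                # right[j] jumps over every element still in the left half
                inv += len(left) - i
                merged.append(right[j]); j += 1
            else:
                merged.append(left[i]); i += 1
        merged += left[i:] + right[j:]
        return merged, inv
    return sort(a)[1]

print(count_inversions([6, 5, 1, 3, 8, 7, 2, 4]))  # 15
```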

Material
--------
- **Coursera Algorithms Part 1**: week 3
- **Algorithms**, Sedgewick and Wayne (2014): Chapter 2.2
--------------------------------------------------------------------------------
/symbol-tables/README.md:
--------------------------------------------------------------------------------

A [Symbol Table](https://en.wikipedia.org/wiki/Symbol_table) data structure associates a key to a value. They are also known as associative arrays, **maps**, or **dictionaries**. The client can insert key-value pairs in the table and can later retrieve the value using the key.
It is not possible to have **duplicated keys**: when a new element associated with an existing key is added, the old key is kept and the associated value is overwritten.

Implementation
--------------

1. **Linked list**: The keys are stored in a sequence of linked nodes. To find the value associated to a key it is necessary to scan the list, comparing the input key with the current key (sequential search). The same mechanism is used for both `get()` and `put()`. In particular, in the `put()` method it is necessary to scan the list looking for the key; if the key is found its value can be overwritten, otherwise the input key is added at the head of the linked list, referring to the next element. The main problem of this approach is the time complexity, which is O(N) for both `get()` and `put()`.

2. **Ordered array:** using an ordered array for the keys to be stored, it is possible to use *binary search* for `get()`, reducing the time complexity to O(log N) (see the `rank()` sketch below). Using an ordered array also makes it extremely fast to return the maximum and minimum keys (operations that are often used). Moreover it makes it possible to manage particular keys, such as times and dates. The problem remains for the method `put()`, since to add a new element it is necessary to find the insertion point (this can be done with binary search) and then shift by one position all the elements that are greater than the key. In the worst case it will be necessary to shift all the keys (time complexity: O(N)). Those problems are solved using [binary search trees](https://en.wikipedia.org/wiki/Binary_search_tree), which are introduced in the next lesson.
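A minimal sketch of the binary-search `rank()` used by the ordered-array implementation (a hypothetical helper, not shipped in this module): it returns the number of keys smaller than `key`, which is the index of the key when present and its insertion point when absent.

```Python
def rank(keys, key, lo, hi):
    # Binary search on the sorted list keys[lo..hi].
    # Returns the number of keys strictly smaller than key.
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if key < keys[mid]:
            hi = mid - 1
        elif key > keys[mid]:
            lo = mid + 1
        else:
            return mid  # key found: exactly mid keys are smaller
    return lo  # key not found: lo is the insertion point

keys = [2, 4, 7, 9]
print(rank(keys, 7, 0, len(keys) - 1))  # 2 (index of key 7)
print(rank(keys, 5, 0, len(keys) - 1))  # 2 (insertion point for 5)
```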

Methods
--------

`put(key, value)`: insert a new pair of key-value. It must not be allowed to associate a `None` (Python) value.

`get(key)`: return the value associated with the key. If the key does not exist it is possible to return `None`.

`remove(key)`: remove the key and the associated value.

`rank(key, lo, hi)`: (ordered array) the method is used in an ordered array to search for a specific key using binary search. If the key is in the array it returns the index.

Applications
------------

1. Dictionaries: which is the application that also gives the name to the data structure (key=word, value=definition)
2. Account management: it can be used to process transactions (key=account id, value=transaction detail)
3. Web search: find relevant pages based on keywords (key=keyword, value=web pages)

Quiz
-----


Material
--------
- **Coursera Algorithms Part 1**: week 4
- **Algorithms**, Sedgewick and Wayne (2014): Chapter 3.1 "Symbol Tables"
--------------------------------------------------------------------------------
/disjoint-set/README.md:
--------------------------------------------------------------------------------


A [disjoint-set](https://en.wikipedia.org/wiki/Disjoint-set_data_structure) data structure, also called a **union–find** data structure or **merge–find set**, is a data structure that keeps track of a set of elements partitioned into a number of disjoint (non-overlapping) subsets. It provides near-constant-time operations to add new sets, to merge existing sets, and to determine whether elements are in the same set. It plays a key role in Kruskal's algorithm for finding the minimum spanning tree of a graph.

Implementation
---------------

1. **Quick-Find**: In this implementation of the disjoint-set data structure the `find()` operation is quick because it only needs to access one element of a list. The problem with this implementation is that `union()` needs to scan through the whole `id[]` array for each input pair in order to apply the union operator when needed. This operation must be done in all cases.

2. **Quick-Union**: In this implementation of the disjoint-set data structure we focus on speeding up the `union()` operation. To implement `find()`, we start at the given site, follow its link to another site, follow that site's link to yet another site, and so forth, following links until reaching a root, a site that has a link to itself (which is guaranteed to happen). Two sites are in the same component if and only if this process leads them to the same root. The quick-union algorithm would seem to be faster than the quick-find algorithm, because it does not have to go through the entire array for each input pair; however in the worst case the tree degenerates into a chain, for instance a structure like `1>2>3>4>5>6>7>8`: when the function `union(7,8)` is called it has to follow links all the way back to the root 1, and a sequence of such unions leads to O(N^2) complexity overall. You can regard quick-union as an improvement over quick-find because it removes quick-find's main liability (that `union()` always takes linear time). This difference certainly represents an improvement for typical data, but quick-union still has the liability that we cannot guarantee it to be substantially faster than quick-find in every case.

3. **Weighted-Union**: In this implementation of the disjoint-set data structure we focus on optimizing the union operation when two trees are joined in a common set. By merging the shorter tree into the larger one, it is possible to keep the depth of the resulting tree contained. This simple trick reduces the complexity of `find()` and `union()` to O(log N). The improvement requires an additional array in order to store the size of each set (a sketch is given below, after the list of methods).

Methods
--------

- `find()`: it returns the identifier (root) of the set containing a given element.
- `union()`: it merges two components and decrements the set counter.
- `connected()`: it determines if two elements are in the same set.
- `count()`: it returns the number of current sets.
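The full implementation lives in `disjoint_set.py` (not shown in this extract); the following is a minimal sketch of the weighted quick-union idea, written from the description above rather than copied from that file.

```Python
class WeightedQuickUnion:
    def __init__(self, n):
        self.id = list(range(n))   # parent links (self-loop at each root)
        self.sz = [1] * n          # size of the tree rooted at each node
        self.count = n             # number of disjoint sets

    def find(self, p):
        while self.id[p] != p:     # follow links until reaching a root
            p = self.id[p]
        return p

    def connected(self, p, q):
        return self.find(p) == self.find(q)

    def union(self, p, q):
        root_p, root_q = self.find(p), self.find(q)
        if root_p == root_q:
            return
        # Always hang the smaller tree under the larger one,
        # which keeps the tree height logarithmic.
        if self.sz[root_p] < self.sz[root_q]:
            root_p, root_q = root_q, root_p
        self.id[root_q] = root_p
        self.sz[root_p] += self.sz[root_q]
        self.count -= 1

uf = WeightedQuickUnion(10)
uf.union(1, 2); uf.union(6, 2); uf.union(9, 1)
print(uf.connected(6, 9), uf.count)  # True 7
```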

Material
--------
- **Coursera Algorithms Part 1**: week 1
- **Algorithms**, Sedgewick and Wayne (2014): Chapter 1.5

--------------------------------------------------------------------------------
/oo-programming/README.md:
--------------------------------------------------------------------------------


[Object-oriented programming (OOP)](https://en.wikipedia.org/wiki/Object-oriented_programming) is a programming paradigm based on the concept of objects, which may contain data, in the form of fields, often known as *attributes*; and code, in the form of procedures, often known as *methods*. This is often compared to [procedural programming](https://en.wikipedia.org/wiki/Procedural_programming), which structures a program like a recipe in that it provides a set of steps and is organized in routines or functions. Examples of procedural languages are Fortran, ALGOL, COBOL, BASIC, Pascal, C and Ada. Go is an example of a more modern procedural language, first published in 2009.

Main properties
---------------

1. **Encapsulation**: in some OOP languages both attributes and methods are kept safe against misuse. For instance in C++ it is possible to declare methods as public, private, or protected. In other languages such as Python the encapsulation is only obtained by convention, adding an underscore in front of the methods that should not be used from outside the class.

2. **Inheritance**: objects can relate to each other with relationships like *has-a*, *uses-a* or *is-a*. A particular kind of super-class is the *abstract* class used in C++ or C#. Abstract classes cannot be directly instantiated; they can only be used as a super-class for other classes that extend them. This is also related to the concept of encapsulation, since an abstract class is not accessible.

3. **Polymorphism**: poly-morphism means many-forms. Polymorphism manifests itself by having multiple methods all with the same name, but different functionality. There are two types of polymorphism, overriding and overloading. In *overriding* the method used is decided at runtime, based on the actual type of the object the method is invoked on. It is a language feature that allows a subclass to override a specific implementation of a method that is already provided by one of its super-classes. On the other hand, *overloading* determines the method used at compile time. It is the ability to define several methods all with the same name but different parameter lists. An example are operators like plus or minus, which are treated as polymorphic functions and as such have different behaviours depending on the types of the arguments. At compile time the types of the operands are inspected and the correct method is associated to the operator.

4. **Abstraction**: it places the emphasis on what an object is or does rather than how it is represented or how it works. It reduces complexity by hiding irrelevant detail. It is a programming (and design) technique that relies on the separation of interface and implementation. An example could be a computer: you can use the keyboard to write a text file, you can regulate the monitor brightness, but the internal functioning is hidden inside the case. You can use the interface, but the implementation is not directly accessible. In OOP the classes provide a great level of abstraction, exposing methods to the outside world without actually showing how the class has been designed internally.
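A small illustrative sketch of overriding in Python (hypothetical classes, chosen only for illustration): the same call `animal.speak()` is dispatched at runtime based on the actual type of each object.

```Python
class Animal:
    def speak(self):
        return "..."

class Dog(Animal):
    def speak(self):  # overrides Animal.speak
        return "woof"

class Cat(Animal):
    def speak(self):  # overrides Animal.speak
        return "meow"

# The same call resolves to a different method at runtime,
# depending on the concrete type of the object.
for animal in [Dog(), Cat(), Animal()]:
    print(animal.speak())  # woof, meow, ...
```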


--------------------------------------------------------------------------------
/disjoint-set/ex2_canonical_element.py:
--------------------------------------------------------------------------------
'''
------------------------------------------------------------------------------------------------
Union-find with specific canonical element.
Add a method find() to the union-find data type so that find(i) returns the largest element in the connected component containing i.
The operations union(), connected(), and find() should all take logarithmic time or better.

For example, if one of the connected components is {1,2,6,9}, then the find() method should return 9 for each of the four elements in the connected component.
-------------------------------------------------------------------------------------------------

By modifying the find() method it is possible to insert a check for the maximum element along the branch.
When the maximum element is found it is associated with the root of that set in a separate list.

When asked to return the maximum element associated to an element p, it is necessary
to find the root associated to p, accessing the id[] array, and then get the maximum from the hg[] array.

When two sets are joined, the largest value between the two root maxima is assigned to both roots in hg[].
'''


class WeightedUnion():
    def __init__(self, N):
        self.N = N
        self.id_list = [n for n in range(N)]
        self.sz_list = [1 for _ in range(N)]
        self.hg_list = [n for n in range(N)]
        self.count = N

    def find(self, p):
        # Here the maximum element along the path is found and returned
        max_element = p
        while self.id_list[p] != p:
            p = self.id_list[p]
            if p > max_element: max_element = p
        return p, max_element

    def find_max(self, p):
        return self.hg_list[self.id_list[p]]

    def union(self, p, q):
        root_p, p_max = self.find(p)
        root_q, q_max = self.find(q)
        if root_p == root_q: return False

        # checking the hg values for the two roots
        self.hg_list[root_p] = p_max
        self.hg_list[root_q] = q_max
        # Here the largest hg is assigned to both roots
        if p_max >= q_max: self.hg_list[root_q] = p_max
        else: self.hg_list[root_p] = q_max

        if self.sz_list[root_p] < self.sz_list[root_q]:
            self.id_list[root_p] = root_q
            self.sz_list[root_q] += self.sz_list[root_p]
        else:
            self.id_list[root_q] = self.id_list[root_p]
            self.sz_list[root_p] += self.sz_list[root_q]
        self.count -= 1
        return True


def main():
    N = 10
    my_disjoint = WeightedUnion(N)

    my_disjoint.union(1, 2)
    my_disjoint.union(6, 2)
    my_disjoint.union(9, 1)
    print("find(0) " + str(my_disjoint.find(0)))
    print("find(1) " + str(my_disjoint.find(1)))
    print("find(2) " + str(my_disjoint.find(2)))
    print("find(3) " + str(my_disjoint.find(3)))
    print("find(6) " + str(my_disjoint.find(6)))
    print("find(9) " + str(my_disjoint.find(9)))
    print("find_max(0) " + str(my_disjoint.find_max(0)))
    print("find_max(1) " + str(my_disjoint.find_max(1)))
    print("find_max(2) " + str(my_disjoint.find_max(2)))

if __name__ == "__main__":
    main()
--------------------------------------------------------------------------------
/breadth-first-search/README.md:
-------------------------------------------------------------------------------- 1 | 2 | The [breadth-first search](https://en.wikipedia.org/wiki/Breadth-first_search) is an algorithm for traversing or searching tree or graph data structures. It starts at a specific node called the **search key** and explores all the neighbour nodes first, before moving to the next-level neighbours. Differently from the depth-first search, this algorithm is **not recursive**: it uses a queue to store all the adjacent nodes and mark them. Considering the same maze example used in the previous module, we can think of breadth-first search as a group of searchers exploring by fanning out in all directions, each unrolling his or her own ball of string. When more than one passage needs to be explored, we imagine that the searchers split up to explore all of them; when two groups of searchers meet up, they join forces (using the ball of string held by the one getting there first). 3 | 4 |
5 | *[image: breadth-first search exploring a maze]* 6 |
7 | 8 | The idea is to add the unvisited node into a FIFO queue and mark it as visited. Then it is necessary to pop an element from the queue, look at all its neighbours, add the unvisited ones to the queue and mark them. This process is repeated until all the nodes are marked. The algorithm solves the problem of finding the **shortest path** between the starting node and all the other nodes. The time complexity needed for finding the shortest path is proportional to E+V. 9 | 10 | 11 | Implementation 12 | --------------- 13 | 14 | As in the previous module we suppose that the graph is stored as an adjacency-list. We use the `edgeto_list[]` to save the incoming connections to a node (like in depth-first) and a `distto_list[]` data structure to keep track of the distance. We also need a FIFO `queue` object to store the marked nodes. In Python we can use a `deque()` data structure (from the `collections` module) and use the method `append()` to insert an element in the queue and `popleft()` to get back the first element that was inserted. As usual we keep track of the marked nodes in a list `marked_list[]`. 15 | 16 | ```Python 17 | def breadth_first_search(s): 18 | queue = deque() 19 | queue.append(s) 20 | marked_list[s] = True #mark the starting node 21 | print(s) #print marked node 22 | while(len(queue) != 0): 23 | v = queue.popleft() #important to pop-left (FIFO) 24 | adjacent_list = vertex_list[v] #get adjacent nodes 25 | for v_adj in adjacent_list: 26 | if marked_list[v_adj] == False: 27 | queue.append(v_adj) 28 | marked_list[v_adj] = True 29 | edgeto_list[v_adj] = v 30 | distto_list[v_adj] = distto_list[v] + 1 #distance from the source 31 | print(v_adj) #print marked node 32 | 33 | ``` 34 | 35 | The following is a representation of the trace of the algorithm on a given graph: 36 | 37 |
38 | *[image: trace of breadth-first search on a sample graph]* 39 |
40 | 41 | The result of calling the algorithm from the vertex 0 of a graph is the following: 42 | 43 |
44 | *[image: breadth-first search result starting from vertex 0]* 45 |
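As a quick sanity check, here is a minimal driver for the function above (it assumes the global lists used by `breadth_first_search()`; the small sample graph is hypothetical, chosen only for illustration):

```Python
from collections import deque

#Adjacency list of a small sample graph with 6 vertices
vertex_list = [[1, 2], [0, 3], [0, 4], [1], [2, 5], [4]]
marked_list = [False] * 6
edgeto_list = [0] * 6
distto_list = [0] * 6

breadth_first_search(0)  #prints the vertices in order of increasing distance
print(distto_list)       #e.g. distto_list[5] is the length of the shortest path 0->5
```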
46 | 47 | Methods 48 | -------- 49 | 50 | `breadth_first_search(v)`: iterative function that performs a breadth-first search on an adjacency-list of vertices. 51 | 52 | 53 | Applications 54 | ------------ 55 | 56 | 1. routing: find the shortest path between nodes 57 | 2. social networks: minimum degrees of separation between people 58 | 59 | Quiz 60 | ----- 61 | 62 | 63 | 64 | 65 | Material 66 | -------- 67 | - **Coursera Algorithms Part 1**: week 4 68 | - **Algorithms**, Sedgewick and Wayne (2014): Chapter 4.1 "Undirected Graph" 69 | -------------------------------------------------------------------------------- /quicksort/README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | The [quicksort](https://en.wikipedia.org/wiki/Quicksort) algorithm was developed by Tony Hoare in 1959. 4 | The first step of the algorithm is to shuffle the array, then take a random element called the **pivot**. The second step consists in reordering the array so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the **partition operation**. The above steps are then recursively applied to the sub-array of elements with smaller values and, separately, to the sub-array of elements with greater values. 5 | 6 | Complexity: the algorithm has the same complexity as mergesort (N log N), however it does not use an auxiliary array (it is called an **in-place** algorithm), which makes it faster because it does not need to move elements between two arrays. The worst case is quadratic (1/2 N^2) and it happens, for instance, when the input array is already sorted (in direct or reverse order). To avoid this case it is possible to randomly shuffle the array before running the sorting. 7 | 8 | Pivot: in the very early versions of quicksort, the leftmost element of the partition would often be chosen as the pivot element. Unfortunately, this causes worst-case behavior on already sorted arrays, which is a rather common use-case. The problem was easily solved by choosing either a random index for the pivot, choosing the middle index of the partition, or (especially for longer partitions) choosing the median of the first, middle and last element of the partition for the pivot (as recommended by Sedgewick). 9 | 10 | Selection: a modified version of quicksort, called **quickselect**, allows finding the *k*-th smallest element of the array in linear time *N* (on average). 11 | 12 | Duplicated-keys: when there are a lot of duplicated elements in the array the standard quicksort algorithm is not effective, because it keeps partitioning sub-arrays made of equal keys. Using a **three-way partitioning** it is possible to handle this case effectively. 13 | 14 | Implementation 15 | -------------- 16 | 17 |
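A minimal sketch of the shuffle-partition-recurse scheme described above (see quicksort.py in this folder for the repository version; the code below is only an illustration):

```Python
import random

def quicksort(a, lo=0, hi=None):
    if hi is None:
        random.shuffle(a)  #shuffling avoids the quadratic worst case on sorted input
        hi = len(a) - 1
    if lo >= hi: return
    pivot = a[lo]  #after the shuffle the first element is a random pivot
    i, j = lo + 1, hi
    while True:
        while i <= hi and a[i] < pivot: i += 1   #find an element >= pivot
        while a[j] > pivot: j -= 1               #find an element <= pivot
        if i >= j: break
        a[i], a[j] = a[j], a[i]                  #swap the two misplaced elements
        i += 1; j -= 1
    a[lo], a[j] = a[j], a[lo]  #the pivot goes to its final position
    quicksort(a, lo, j - 1)    #recursively sort the left partition
    quicksort(a, j + 1, hi)    #recursively sort the right partition

my_list = [54, 26, 93, 17, 77, 31, 44, 55, 20]
quicksort(my_list)
print(my_list)  #[17, 20, 26, 31, 44, 54, 55, 77, 93]
```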
18 | 19 | Methods 20 | -------- 21 | 22 | - `sort()`: it sorts the elements, dividing the array in two partitions by calling `partition()` 23 | - `partition()`: it divides the array in two partitions around a pivot, then elements are swapped between the two partitions 24 | 25 | Applications 26 | ------------ 27 | 28 | 1. Find duplicates 29 | 2. Apply sorting then binary search for a specific element 30 | 3. Identify outliers or the median 31 | 32 | Quiz 33 | ----- 34 | 35 | 1. Nuts and bolts. A disorganized carpenter has a mixed pile of n nuts and n bolts. The goal is to find the corresponding pairs of nuts and bolts. Each nut fits exactly one bolt and each bolt fits exactly one nut. By fitting a nut and a bolt together, the carpenter can see which one is bigger (but the carpenter cannot compare two nuts or two bolts directly). Design an algorithm for the problem that uses *n log n* compares (probabilistically). 36 | 37 | 2. Decimal dominants. Given an array with n keys, design an algorithm to find all values that occur more than *n/10* times. The expected running time of your algorithm should be linear. 38 | 39 | 3. Selection in two sorted arrays. Given two sorted arrays *a[]* and *b[]*, of sizes *n1* and *n2*, respectively, design an algorithm to find the kth largest key. The order of growth of the worst case running time of your algorithm should be *log n*, where *n=n1+n2*. 40 | Version 1: *n1=n2* and *k=n/2*. Version 2: *k=n/2*. Version 3: no restrictions 41 | 42 | 43 | 44 | 45 | Material 46 | -------- 47 | - **Coursera Algorithms Part 1**: week 3 48 | - **Algorithms**, Sedgewick and Wayne (2014): Chapter 2.3 49 | -------------------------------------------------------------------------------- /graph/README.md: -------------------------------------------------------------------------------- 1 | 2 | A [graph](https://en.wikipedia.org/wiki/Graph_(abstract_data_type)) is a set of vertices and a collection of edges that each connect a pair of vertices. 3 | 4 | A **path** in a graph is a sequence of vertices connected by edges. A simple path is one with no repeated vertices. A **cycle** is a path with at least one edge whose first and last vertices are the same. A **simple cycle** is a cycle with no repeated edges or vertices (except the requisite repetition of the first and last vertices). The **length** of a path or a cycle is its number of edges. 5 | 6 | A **tree** is an acyclic connected graph. A disjoint set of trees is called a **forest**. A **spanning tree** of a connected graph is a subgraph that contains all of that graph's vertices and is a single tree. A **spanning forest** of a graph is the union of spanning trees of its connected components. 7 | 8 | A graph is **connected** if there is a path from every vertex to every other vertex in the graph. A graph that is **not connected** consists of a set of **connected components**, which are maximal connected subgraphs. The following is a graph with 3 connected components: 9 | 10 |
11 | *[image: graph with three connected components]* 12 |
13 | 14 | It is possible to find the number of connected components using the **union-find** discussed in the first module, or using the depth-first search algorithm discussed in the next modules. 15 | 16 | Implementation 17 | --------------- 18 | 19 | To store graphs we need to keep in mind space and time constraints. 20 | First we must have the **space** to accommodate the types of graphs that we are likely to encounter in applications. Second we want to develop **time** efficient implementations of Graph instance methods. There are three possible ways to represent graphs: 21 | 22 | 1. **adjacency matrix**: we can represent connections between V nodes with a matrix of size V*V. A particular entry in the matrix identifies the presence of a connection between two nodes. The problem with this representation is the space: we need V^2 space to store the graph. 23 | 24 | 2. **array of edges**: we can keep track of all the edges of the graph storing them into an array. For instance we can have a list of tuples `[(2,5), (0,3), (5,4), (0,1)]`. The positive thing is that the space usage is proportional to the number of edges E. 25 | The problem with this representation is the time: we need to iterate over the entire array to do the most common operations, like adding and removing an edge. 26 | 27 | 3. **array of adjacency lists**: we define an array of size V where each entry represents a specific vertex. In this array we store lists, representing the neighbours of that vertex. This representation is a tradeoff between space and time and it is the best one. We can add new edges in constant time and iterate through adjacent vertices in constant time per adjacent vertex. Space usage is proportional to V+E. 28 | The following image is an adjacency-list example: 29 | 30 |
31 | *[image: adjacency-list representation of a graph]* 32 |
33 | 34 | A simple `Graph` object based on the adjacency-list criteria can be created in Python: 35 | 36 | ```Python 37 | class Graph(): 38 | def __init__(self, V): 39 | self.vertex_list = [[] for _ in range(V)] 40 | 41 | def add_vertex(self): 42 | self.vertex_list.append([]) 43 | 44 | def add_edge(self, s, v): 45 | self.vertex_list[s].append(v) 46 | ``` 47 | 48 | 49 | Methods 50 | -------- 51 | 52 | `add_edge(s, v)`: add an edge between two nodes 53 | 54 | Applications 55 | ------------ 56 | 57 | 58 | Quiz 59 | ----- 60 | 61 | 62 | 63 | 64 | Material 65 | -------- 66 | - **Coursera Algorithms Part 1**: week 4 67 | - **Algorithms**, Sedgewick and Wayne (2014): Chapter 4.1 "Undirected Graph" 68 | -------------------------------------------------------------------------------- /stack/README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | A [stack](https://en.wikipedia.org/wiki/Stack_(abstract_data_type)) is an [abstract data type](https://en.wikipedia.org/wiki/Abstract_data_type). An abstract data type is different from a classical data structure. Data structures are concrete representations of data, and are the point of view of an implementer, not a user. A **data type** is defined by its behavior (semantics) from the point of view of a user of the data, specifically in terms of possible values, possible operations on data of this type, and the behavior of these operations. The stack is based on two principal operations: push and pop. **Push** adds an element to the collection and **pop** removes the most recently added element. For this reason stacks are considered LIFO (last in, first out). 4 | 5 | It is important to make the implementation **generic**, meaning that the `class Stack` could contain any kind of object. In **C++** it can be done using templates. The class is based on a `vector` object. When declaring a new stack the template incorporates the type, `Stack<int> my_int_stack` or `Stack<string> my_string_stack`, which is forwarded to the vector as `vector<int> vector_stack` or `vector<string> vector_stack`. In **Python** it is possible to use an `array` object. The object `array` is like a list but it contains only objects of a specific type, which is specified at object creation time. The class can be imported using `from array import array`, and declared as `int_stack = array('i')` where the string `'i'` identifies the integer type, `'d'` is a double, and `'f'` is a float (other types are available). 6 | 7 | 8 | Implementation 9 | -------------- 10 | 11 | 1. **[Linked list](https://en.wikipedia.org/wiki/Linked_list)**: each element is defined by the data and a reference to the next element in the stack. The main advantage of a linked list for a stack is that it is not required to adjust the dimension of the stack, because there is no size limit. However, accessing an arbitrary item of a linked list takes linear time, because it is necessary to iterate over the elements one after the other. This implementation can be selected if there is a stream of data coming in and losing some packets has a high cost. In Python the linked list is not represented by any built-in data structure. The Python list object is an array of references which stores a reference to an object in each position of the list. 12 | 13 | 2. **Static array**: using an array is another way to implement a stack. The static implementation has a maximum capacity declared when the stack is created. It is necessary to monitor the size of the array because if the maximum capacity is exceeded then a [stack overflow](https://en.wikipedia.org/wiki/Stack_buffer_overflow) can occur. 14 | 15 | 3. **Resizing array**: in this implementation the size of the array is increased when a certain number of elements has been added. The resizing avoids the buffer overflow problem but can be quite slow, because it requires copying all the elements into a new array. A smart implementation will double the size of the array every time the `resize()` method is called: in this way the next resize will happen less and less often. In particular, since the array doubles, `resize()` will be called only a logarithmic number of times. A minimal sketch of this idea is shown below.
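A possible Python sketch of the resizing-array strategy (simplified and only illustrative: a plain list is used as the fixed-capacity buffer, and shrinking is omitted):

```Python
class ResizingArrayStack():
    def __init__(self):
        self.capacity = 1
        self.array = [None] * self.capacity
        self.size = 0

    def resize(self, new_capacity):
        #Copy all the elements into a larger array (linear time, but rarely called)
        new_array = [None] * new_capacity
        for i in range(self.size):
            new_array[i] = self.array[i]
        self.array = new_array
        self.capacity = new_capacity

    def push(self, item):
        if self.size == self.capacity:
            self.resize(2 * self.capacity)  #double the capacity when the array is full
        self.array[self.size] = item
        self.size += 1

    def pop(self):
        self.size -= 1
        item = self.array[self.size]
        self.array[self.size] = None
        return item

    def isEmpty(self):
        return self.size == 0
```

Pushing N elements triggers only about log N calls to `resize()`, so each push costs constant amortized time.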
16 | 17 | Methods 18 | -------- 19 | 20 | - `push()`: it adds an element to the stack. 21 | - `pop()`: it removes and returns the most recently added element. 22 | - `isEmpty()`: returns True if the stack is empty. 23 | - `resize()`: used only in the *resizing array* implementation. 24 | 25 | Applications 26 | -------------- 27 | 28 | - Dijkstra's (or Shunting-yard) two-stack algorithm for computing arithmetic operations [[link]](https://en.wikipedia.org/wiki/Shunting-yard_algorithm) 29 | 30 | - UNDO operations, implemented by accumulating every action in a stack. 31 | 32 | 33 | Material 34 | -------- 35 | - **Coursera Algorithms Part 1**: week 2 36 | - **Algorithms**, Sedgewick and Wayne (2014): Chapter 1.3 37 | -------------------------------------------------------------------------------- /priority-queue/README.md: -------------------------------------------------------------------------------- 1 | 2 | Definition: the [Priority-Queue](https://en.wikipedia.org/wiki/Priority_queue) data structure associates a key to a value. An appropriate data type in such an environment supports two operations: remove the maximum and insert. Such a data type is called a priority queue. Using priority queues is similar to using queues (remove the oldest) and stacks (remove the newest), but implementing them efficiently is more challenging. 3 | 4 | 5 | Implementation 6 | -------------- 7 | 8 | 1. **Ordered array**: based on an array of ordered values that is adjusted every time a new element is inserted. For instance, starting from the array `[45, 49, 68, 89]`, if we push a new element `54` we have to find the place where it must be inserted, and this can be done using an *insertion sort*-like mechanism and the method `less()`. In our example the element is added to the end of the queue and then, using the method `less()`, compared with the element on its left; the two elements are switched until `less()` returns true. Starting from `[45, 49, 68, 89, 54]` we perform `less(89, 54) -> False` and switch the last two elements, obtaining `[45, 49, 68, 54, 89]`; then we compare again, `less(68, 54) -> False`, and switch, obtaining `[45, 49, 54, 68, 89]`; finally we get `less(49, 54) -> True`, meaning that we reached the correct position. The complexity of this method in the worst case is O(N) for insertion, but since the array is ordered it takes only O(1) to return the largest element (the last one in the array). 9 | 10 | 2. **Unordered array**: this is the lazy approach. The array of values is kept unordered and a new element is added at the end. Differently from the ordered solution, here adding a new element is very easy and only takes O(1) time complexity.
The problem comes when we have to find the largest element, because this operation takes O(N) time complexity, requiring a scan of the entire array. 11 | 12 | 3. **[binary heap](https://en.wikipedia.org/wiki/Binary_heap)**: using this data structure it is possible to obtain a complexity of O(log N) for both insertion and removal of the maximum. The binary heap is a tree where *each node has at most two children* that are smaller than the parent. The root node of the tree is the largest element in the set. The elements are stored into an array, and each parent at index `k` has two children at indices `2k` and `2k+1` (conversely, each child in position `k` has a parent in position `k/2`). 13 | To use the binary heap we have to implement two operations needed for restoring the heap order (reheapify): top-down (sink) and bottom-up (swim). The **top-down** operation is used when a node key becomes smaller than one (or both) of the children's keys. What is done is to use a switching mechanism similar to the *insertion sort*, where we exchange parent and child until the correct order is obtained. The top-down is used when the `delMax()` method is called: the root is removed, the last element of the array is moved to the top, then the sink is called and the order restored. 14 | On the other hand, the **bottom-up** operation proceeds in the reverse order. If a child has a priority that is higher than the parent, it is necessary to exchange child and parent until the correct order is obtained. The bottom-up is used when `insert()` is called and a new element is added to the array: the element is added in the last position of the array and the swim is performed to reorder the tree. A sketch of the two operations is shown below. 15 | 16 |
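A minimal Python sketch of the two reheapify operations, using 1-based indexing on a plain list `pq` where `pq[0]` is left unused (an illustration, not the repository code):

```Python
def less(pq, i, j):
    return pq[i] < pq[j]

def swim(pq, k):
    #Bottom-up: exchange the child with its parent until the order is restored
    while k > 1 and less(pq, k // 2, k):
        pq[k // 2], pq[k] = pq[k], pq[k // 2]
        k = k // 2

def sink(pq, k, n):
    #Top-down: exchange the parent with its largest child until the order is restored
    while 2 * k <= n:
        j = 2 * k
        if j < n and less(pq, j, j + 1): j += 1  #pick the largest of the two children
        if not less(pq, k, j): break
        pq[k], pq[j] = pq[j], pq[k]
        k = j

def insert(pq, value):
    pq.append(value)          #add the new element in the last position
    swim(pq, len(pq) - 1)     #restore the heap order bottom-up

def delMax(pq):
    maximum = pq[1]           #the root is the largest element
    pq[1] = pq[-1]            #move the last element to the top
    pq.pop()
    sink(pq, 1, len(pq) - 1)  #restore the heap order top-down
    return maximum

pq = [None]                   #index 0 is left unused
for v in [45, 89, 68, 49, 54]: insert(pq, v)
print(delMax(pq), delMax(pq)) #89 68
```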
17 | Methods 18 | -------- 19 | 20 | `delMax()`: removes the element with the largest priority and returns it 21 | 22 | `insert()`: inserts a new element in the queue 23 | 24 | `less()`: compares two elements and returns a boolean 25 | 26 | `sink()`: (binary heap) implements the top-down reordering of the nodes 27 | 28 | `swim()`: (binary heap) implements the bottom-up reordering of the nodes 29 | 30 | Applications 31 | ------------ 32 | 33 | 1. System process priority management. Smartphone application priority (e.g. a phone call has higher priority than a game) 34 | 35 | 36 | Quiz 37 | ----- 38 | 39 | 40 | 41 | 42 | Material 43 | -------- 44 | - **Coursera Algorithms Part 1**: week 4 45 | - **Algorithms**, Sedgewick and Wayne (2014): Chapter 2.4 "Priority Queues" 46 | 47 | 48 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | Introduction 3 | ------------- 4 | 5 | 1. **Object-oriented programming** [[link]](./oo-programming) 6 | 7 | 8 | Algorithms 9 | ----------- 10 | 11 | Collection of algorithms and data structures implemented in Python and C++. For an optimal learning experience it is recommended to study the material in the following order: 12 | 13 | 1. **Disjoint-set**: (data structure) [[link]](./disjoint-set) 14 | 15 | 2. **Stack**: (abstract data structure) [[link]](./stack) 16 | 17 | 3. **Queue**: (abstract data structure) [[link]](./queue) 18 | 19 | 4. **Elementary sort**: (sorting algorithm) TBD [selection sort, insertion sort, shellsort] 20 | 21 | 5. **Mergesort**: (sorting algorithm) [[link]](./mergesort) 22 | 23 | 6. **Quicksort**: (sorting algorithm) [[link]](./quicksort) 24 | 25 | 7. **Priority-Queue**: (data structure) similarly to a Queue it stores an array of values, but when `pop()` is called the largest value (higher priority) is returned [[link]](./priority-queue) 26 | 27 | 8. **Symbol Tables**: (data structure) the primary purpose is to associate a key to a value. The best standard implementation is based on ordered arrays and has O(log N) time complexity for `get()` and O(N) time complexity for `put()` [[link]](./symbol-tables) 28 | 29 | 9. **Binary Search Trees**: (data structure) it is a smart implementation of symbol tables. It solves the problem given by the standard symbol table implementations, giving O(log N) time complexity for both `get()` and `put()`. [[link]](./binary-search-trees) 30 | 31 | 10. **Balanced Search Trees**: (data structure) it solves the problem of unbalanced trees, given by standard binary trees. It is also known as a [self-balancing](https://en.wikipedia.org/wiki/Self-balancing_binary_search_tree) binary search tree. The problem with this data structure is the overhead in keeping track of different types of nodes and links [[link]](./balanced-search-trees) 32 | 33 | 11. **Red-Black Balanced Search Trees**: (data structure) it solves the overhead problem of standard balanced trees. The implementation guarantees O(log N) time complexity in all the operations. The resulting tree is not perfectly balanced in some particular cases [[link]](./red-black-balanced-search-trees) 34 | 35 | 12. **Hash tables**: (data structure) using a hash function that maps any integer into a subset, it is possible to store a large number of values into a limited number of bins [[link]](./hash-functions) 36 | 37 | 13. **Graph**: (data structure) the graph data structure is very important and it can be implemented in three ways. The first way is using a matrix of size V*V where each value identifies a connection between two vertices. The second way is using an array of edges. The third way is using an adjacency-list, meaning a list of lists where to each vertex is associated a list of neighbours. [[link]](./graph) 38 | 39 | 14. **Depth-First Search**: (search algorithm) using a depth search it is possible to look for paths and vertices in a graph. After a preprocessing pass, it can be used to verify in constant time if two nodes are connected. [[link]](./depth-first-search) 40 | 41 | 15. **Breadth-First Search**: (search algorithm) using a breadth search it is possible to look for paths and vertices in a graph. It is used to find the shortest path between two nodes. [[link]](./breadth-first-search) 42 | 43 | 16. **Digraph**: (data structure) a directed graph or digraph has directed edges. The depth-first and breadth-first algorithms still work. However, finding strongly connected components requires a slightly more complex algorithm. [[link]](./digraph) 44 | 45 | 17. **Minimum Spanning Trees**: (data structure and search algorithm) sometimes it is necessary to find the minimum set of edges connecting all the nodes, where each edge has an associated weight. The minimum spanning tree corresponds to the tree having the edges with the lowest weight connecting all the nodes. Here are discussed three implementations: greedy, Kruskal, and Prim. [[link]](./minimum-spanning-trees) 46 | 47 | 18. **Shortest path**: (search algorithm) finding the shortest path between two nodes in an efficient way is not easy.
Here are discussed different solutions and the Dijkstra algorithm [[link]](./shortest-path) 48 | 49 | Official requirements (of different companies) for passing a coding interview: 50 | 51 | - Google [[readme]](./interview/google.md) 52 | -------------------------------------------------------------------------------- /interview/google.md: -------------------------------------------------------------------------------- 1 | Tips and tricks taken from the official Google guide to interviews. There are a lot of hyperlinks to find out the meaning of the most important terms. 2 | 3 | 4 | **Coding practice**: You can find sample coding questions on sites like [CodeLab](https://codelabs.developers.google.com/), Quora, and Stack Overflow. The book [Cracking the Coding Interview](http://www.crackingthecodinginterview.com/) is also a good resource. On some sites, you'll have the option to code on either a Chromebook or a whiteboard, to offer a more natural coding environment. (Ask your recruiter what's available so you can practice.) Be sure to test your code and ensure it's easily readable without bugs. Don't stress about small syntactical errors like which substring to use for a given method (e.g. start, end or start, length) — just pick one and let your interviewer know. 5 | 6 | **Coding**: You should know at least one programming language really well, preferably C++, Java, Python, Go, or C. You will be expected to know APIs, [Object Oriented](https://en.wikipedia.org/wiki/Object-oriented_programming) Design and Programming, how to test your code, as well as come up with [corner cases](https://en.wikipedia.org/wiki/Corner_case) and [edge cases](https://en.wikipedia.org/wiki/Edge_case) for code. Note that we focus on conceptual understanding rather than memorization. 7 | 8 | **Algorithms**: Approach the problem with both bottom-up and top-down algorithms. You will be expected to know the *complexity* of an algorithm and how you can improve/change it. Algorithms that are used to solve Google problems include *sorting* (plus searching and *binary search*), [divide-and-conquer](https://en.wikipedia.org/wiki/Divide_and_conquer_algorithm), [dynamic programming](https://en.wikipedia.org/wiki/Dynamic_programming)/[memoization](https://en.wikipedia.org/wiki/Memoization), [greediness](https://en.wikipedia.org/wiki/Greedy_algorithm), [recursion](https://en.wikipedia.org/wiki/Recursion_(computer_science)) or algorithms linked to a specific data structure. Know Big-O notations (e.g. run time) and be ready to discuss complex algorithms like [Dijkstra](https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm) and [A*](https://en.wikipedia.org/wiki/A*_search_algorithm). We recommend discussing or outlining the algorithm you have in mind before writing code. 9 | 10 | **Sorting**: Be familiar with common sorting functions and with the kind of input data they are efficient on or not. Think about what efficiency means in terms of runtime and space used. For example, in exceptional cases [insertion-sort](https://en.wikipedia.org/wiki/Insertion_sort) or [radix-sort](https://en.wikipedia.org/wiki/Radix_sort) are much better than the generic QuickSort/MergeSort/HeapSort answers. 11 | 12 | **Data Structures**: You should study up on as many data structures as possible. Data structures most frequently used are arrays, linked lists, stacks, queues, hash-sets, [hash-maps, hash-tables](https://en.wikipedia.org/wiki/Hash_table), [dictionary](https://en.wikipedia.org/wiki/Associative_array), trees and binary trees, heaps and graphs.
You should know the data structures inside out, and know what algorithms tend to go along with each data structure. 13 | 14 | **Mathematics**: Some interviewers ask basic discrete math questions. This is more prevalent at Google than at other companies because counting problems, probability problems and other [Discrete Math 101](https://en.wikipedia.org/wiki/Outline_of_discrete_mathematics) situations surround us. Spend some time before the interview refreshing your memory on (or teaching yourself) the essentials of elementary *probability theory* and *combinatorics*. You should be familiar with [n-choose-k](https://en.wikipedia.org/wiki/Binomial_coefficient) problems and their ilk. 15 | 16 | **Graphs**: Consider whether a problem can be solved with graph algorithms like distance, search, connectivity, [cycle-detection](https://en.wikipedia.org/wiki/Cycle_(graph_theory)#Cycle_detection), etc. There are three basic ways to represent a graph in memory (objects and pointers, matrix, and adjacency list) — familiarize yourself with each representation and its pros and cons. You should know the [basic graph traversal algorithms](https://en.wikipedia.org/wiki/Graph_traversal), [breadth-first search](https://en.wikipedia.org/wiki/Breadth-first_search) and [depth-first search](https://en.wikipedia.org/wiki/Depth-first_search). Know their computational complexity, their tradeoffs and how to implement them in real code. 17 | 18 | **Recursion**: Many coding problems involve thinking recursively and potentially coding a [recursive](https://en.wikipedia.org/wiki/Recursion_(computer_science)) solution. Use recursion to find more elegant solutions to problems that can be solved iteratively. 19 | 20 | 21 | **Resources:** 22 | 23 | - Grow Your Technical Skills with Google [[website]](https://techdevguide.withgoogle.com/#courses) 24 | 25 | - Foundations path [[website]](https://techdevguide.withgoogle.com/paths/foundational/) 26 | 27 | - Resource Library [[website]](https://techdevguide.withgoogle.com/resources/) 28 | 29 | 30 | 31 | -------------------------------------------------------------------------------- /balanced-search-trees/README.md: -------------------------------------------------------------------------------- 1 | 2 | The balanced search trees, a.k.a. [self-balancing binary search trees](https://en.wikipedia.org/wiki/Self-balancing_binary_search_tree), are a particular data structure used to solve the problem of unbalanced trees that arises when using standard binary trees. Ideally, a perfect binary tree should have log(N) height, and the time taken to find a node in the worst case should be log(N). 3 | The time taken to reach a node is proportional to the height of the tree; for this reason it is desirable to keep the height small. In standard binary trees it is possible to have degenerated trees of extreme height. For example, if the items are inserted in sorted key order, the resulting tree degenerates into a linked list of height N. The list `[54,55,56,57,58]` contains elements disposed in sorted order: when the elements are inserted in a standard binary tree each node is linked on the right side of the previous one, leading to a tree of depth 5 (as the sketch below shows). The balanced trees solve the problem performing some operations during the key insertion, in order to keep the height proportional to log(N). Unfortunately, maintaining a perfectly balanced tree has a high overhead and it is necessary to find a compromise between balance requirement and performance.
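The degeneration can be reproduced with a few lines of Python (a naive, illustrative insert, not the repository implementation):

```Python
class Node():
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    #Naive binary-search-tree insertion, no rebalancing
    if root is None: return Node(key)
    if key < root.key: root.left = insert(root.left, key)
    else: root.right = insert(root.right, key)
    return root

def height(root):
    if root is None: return 0
    return 1 + max(height(root.left), height(root.right))

root = None
for key in [54, 55, 56, 57, 58]:  #sorted insertion order
    root = insert(root, key)
print(height(root))  #prints 5: the tree has degenerated into a linked list
```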
4 | 5 | Implementation 6 | --------------- 7 | 8 | **2-3 balanced search trees**: this type of tree can have 2-nodes (1 key, 2 links) or 3-nodes (2 keys, 3 links). 9 | As usual, a link to an empty node is called a null link. The **search** procedure in the case of a 2-node is the same used in standard binary trees. In the particular case of a 3-node we must move to the left if the searched value is less than the left key of the node, and move to the right if the searched value is greater than the right key of the node. It is necessary to move to the centre if the searched value is in between the two keys associated to the node. 10 | 11 |
12 | *[image: search in a 2-3 tree]* 13 |
14 | 15 | The operation of **insert** into a 2-node is easy to do: we scan the tree until a null link is reached (meaning the key is not in the tree), then we simply add the key to the 2-node, making it a 3-node. However, when the terminal node is a 3-node there is more work to do, and we can differentiate between three different cases: 16 | 17 | 1. *tree having only a single 3-node as root*: this is the easiest case. We put the new key inside the 3-node, creating a temporary 4-node (3 keys, 4 links). Then we take the central key and we use it to create a parent node, whereas the side keys become the children. 18 | 19 | 2. *the 3-node has a 2-node as parent*: in this case we can proceed similarly to the previous case. The 3-node becomes a 4-node and the central key is moved up and integrated into the parent 2-node; the original 3-node is then split, generating two 2-nodes. After the operation the original parent node (that was a 2-node) is now a 3-node, whereas the child (that was a 3-node) is now represented by two separate 2-nodes. 21 | 22 |
23 | *[image: inserting into a 3-node with a 2-node parent]* 24 |
25 | 26 | 3. *the 3-node has a 3-node as parent*: in this case the process described in the previous case is repeated multiple times, until a 2-node parent is found. This process can generate a problem when the root node is reached: the temporary 4-node is at the root and it cannot be absorbed by an upper level. To solve the issue it is possible to split the 4-node into a parent node (that becomes the new root) and two children, similarly to what has been done in case 1. The root splitting is the only case in which the *height of the tree increases by one level*, and it happens when all the nodes along the search path are 3-nodes. 27 | 28 | The use of all these local transformations preserves the global properties that the tree is balanced and ordered. Unlike standard binary trees, the 2-3 trees grow up from the bottom: the starting root node is pushed to the bottom and new nodes are added on top of it. Using a 2-3 tree, the height resulting from inserting 1 billion keys is between 19 and 30. 29 | 30 | The **drawback** associated to the 2-3 trees is that there is a lot of code involved, and writing a 2-3 tree data structure is not easy because many different cases should be considered. For instance, having different node types (2-node, 3-node, 4-node) is cumbersome, there are many splitting cases to handle, and it is necessary to move up the tree when the parent is a 4-node. 31 | It would be better to have a balanced tree with less overhead: this is possible using the **Red-Black balanced trees** that are described in the next module. 32 | 33 | Methods 34 | -------- 35 | 36 | `put(key, value)`: insert a new pair of key-value. It must not be allowed to associate a `None` (Python) value. 37 | 38 | `get(key)`: return the value associated with the key. If the key does not exist it is possible to return `None`. 39 | 40 | `remove(key)`: remove the key and the associated value. 41 | 42 | 43 | Applications 44 | ------------ 45 | 46 | 1. Dictionaries: which is the application that also gives the name to the data structure (key=word, value=definition) 47 | 2. Account management: it can be used to process transactions (key=account-id, value=transaction detail) 48 | 3. Web search: find relevant pages based on keywords (key=keyword, value=web-pages) 49 | 50 | Quiz 51 | ----- 52 | 53 | 54 | 55 | 56 | Material 57 | -------- 58 | - **Coursera Algorithms Part 1**: week 4 59 | - **Algorithms**, Sedgewick and Wayne (2014): Chapter 3.3 "Balanced Search Trees" 60 | -------------------------------------------------------------------------------- /hash-functions/README.md: -------------------------------------------------------------------------------- 1 | 2 | A [hash function](https://en.wikipedia.org/wiki/Hash_function) is any function that can be used to map data of arbitrary size to data of fixed size. Hash functions are used in hash tables to solve the problem of storing a large number of values. There are two main problems solved by the hash function. The **first problem** is the **space** constraint that emerges when we want to store a large number of values: if we do not have any space limit we do not need a hash function, and we can simply store a value associating the key to the index of an array. The **second problem** is the search **time**: the hash allows searching with complexity O(1), whereas a linked list is O(N). The main problem with hash functions is **collisions**, which can happen when different keys are assigned to the same index in the array.
This problem is solved in two ways: separate chaining and linear probing. 3 | 4 | **Modular hashing**: a prime number M is used to divide the hash value (taking the remainder). In this way the large integer returned by the hash is compressed into a well defined space that goes from 0 to M-1. In probabilistic terms it is like throwing balls uniformly at random into M bins. Sometimes the hash can return a negative value, and for this reason it is necessary to take the absolute value of the hash before taking the remainder through M. Thinking in probability terms (with combinatorial analysis) tells us that we expect to have two balls in the same bin after *sqrt(pi M / 2)* tosses (birthday problem), and that every bin has >= 1 balls after *M ln M* tosses (coupon collector). It is important to notice that returning the hash function of a string takes time proportional to the length of the string, because hashing involves performing arithmetic operations on each character of the string. It may be necessary to mask the negative bit of the hash instead of using a simple `abs()` method, because in roughly one case out of four billion (the smallest negative integer, whose absolute value overflows) a negative value is returned even after applying the absolute value. 5 | 6 | Implementation 7 | --------------- 8 | 9 | 1. **separate chaining**: this implementation was invented in 1953 by Luhn at IBM. It consists in adding a linked list at every position of the array: if there are collisions, the colliding keys are chained in the same position. When looking for a specific key we access the location and then we do a linear search on the linked list. Under the uniform hashing assumption (the hash uniformly spreads the integers across the bins), the cost of a search is proportional to N/M, where N is the number of keys stored and M is the number of bins mapped through the hash function. For instance, if we have N=10^4 keys, larger than the available number of bins (let's suppose M=10^3), the cost for searching a key will be only 10 (10^4 / 10^3). In particular, the distribution of the list sizes is binomial. The choice of M is important, and in general it is necessary to choose a value that is not too small (otherwise the linked list associated to each bin is large) and not too large (otherwise there are a lot of empty bins). The ideal size generally chosen is M=N/5. The *main advantage* of the separate chaining implementation is constant time in case of a search miss, due to the fact that empty bins do not have any linked list associated. A minimal sketch of this implementation is shown below. 10 | 11 | 2. **linear probing**: the idea behind linear probing is to have an array where the number of bins M is larger than the number of keys N. When a value is added and it collides, it is possible to look at position *i+1* and, if it is free, store the key there. Every time there is a collision we look at higher locations and we insert the key at the first available one. When the end of the array is reached it is necessary to restart from the first position, and always look for a free place. When we search for a key we can use the same idea: we start from the index pointed by the hash and then we do a linear search increasing *i*. We get a search miss if an empty place is found along the path, meaning that the key was not stored before that point. Also in this case the right choice of M is important: if M is small then the search time for an item blows up (because the search can take a long time), whereas if it is too large then there are too many empty positions and a waste of memory. The array dimension must be monitored, and when more than half of it has been filled it is necessary to increase the dimension to make room for new keys. A particular note is needed when the `remove()` method is called: in this case it is necessary to find the key, remove it and then reinsert all the keys after that one. If the size of M is correct then not many keys have to be reinserted. An alternative is to flag the key to delete, reuse it again at the next insert and skip it during a search.
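A minimal sketch of a separate-chaining hash table (an illustration only: Python lists are used in place of linked lists, and the method names follow the Methods section below):

```Python
class SeparateChainingHashTable():
    def __init__(self, M=97):
        self.M = M                          #number of bins (ideally around N/5)
        self.bins = [[] for _ in range(M)]  #one chain per bin

    def _hash(self, key):
        return abs(hash(key)) % self.M      #modular hashing: compress the hash into [0, M-1]

    def put(self, key, value):
        chain = self.bins[self._hash(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:                    #the key already exists: replace the value
                chain[i] = (key, value)
                return
        chain.append((key, value))

    def get(self, key):
        for k, v in self.bins[self._hash(key)]:
            if k == key: return v
        return None                         #search miss: the chain does not contain the key

table = SeparateChainingHashTable()
table.put("algorithms", 2014)
print(table.get("algorithms"))  #2014
print(table.get("missing"))     #None
```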
12 | 13 | Methods 14 | -------- 15 | 16 | `put(key, value)`: insert a new pair of key-value. It must not be allowed to associate a `None` (Python) value. 17 | 18 | `get(key)`: return the value associated with the key. If the key does not exist it is possible to return `None`. 19 | 20 | `remove(key)`: remove the key and the associated value. 21 | 22 | `hash(key)`: internal function that returns the integer associated with the key value. 23 | 24 | `equals(key, key)`: used to search for a key into the array 25 | 26 | Applications 27 | ------------ 28 | 29 | 1. This is not a real application but a vulnerability. If an attacker discovers the hash function used, it is possible to send a range of values that always generate the same index, leading to a pile of keys allocated in the same array bin. For instance in Linux 2.4 the problem arose when saving files with some specific names. 30 | 31 | Quiz 32 | ----- 33 | 34 | 35 | 36 | 37 | Material 38 | -------- 39 | - **Coursera Algorithms Part 1** 40 | - **Algorithms**, Sedgewick and Wayne (2014): Chapter 3.4 "Hash tables" 41 | -------------------------------------------------------------------------------- /depth-first-search/README.md: -------------------------------------------------------------------------------- 1 | 2 | The [depth-first search](https://en.wikipedia.org/wiki/Depth-first_search) is an algorithm for **searching** or for traversing a graph. It is possible to understand how the algorithm works using the maze example. Let's suppose that a maze can be represented with nodes (intersection points) and edges (paths between nodes), as follows: 3 | 4 |
5 | *[image: maze represented as nodes and edges]* 6 |
7 | 8 | The [Trémaux's algorithm](https://en.wikipedia.org/wiki/Maze_solving_algorithm) can be used to find the exit. This algorithm is similar to the one used by Theseus in the Minotaur maze. The rules are three: first, unroll a ball of string behind you; second, mark each visited intersection and passage; third, retrace your steps when no unvisited options are available. The main idea in depth-first search is to mimic the maze exploration. There are two main steps: first mark the current vertex `v` as visited, second *recursively* visit all unmarked vertices adjacent to `v`. The **recursion** is the engine of the algorithm. 9 | 10 | The **time complexity**: marking all the vertices connected to `s` takes time proportional to the sum of their degrees. The degree is the number of adjacent nodes of a vertex. The time to find a path from `s` to `v` is proportional to its length. 11 | 12 | It is also possible to use depth-first search to efficiently find the **connected components** of a graph. As discussed in the graph module, a graph is **connected** if there is a path from every vertex to every other vertex in the graph. A graph that is **not connected** consists of a set of **connected components**, which are maximal connected subgraphs. 13 | 14 | 15 | Implementation 16 | --------------- 17 | 18 | It is good practice to implement a `Graph()` object that is passed to the algorithm: in this way the graph is decoupled from the algorithms used on it. After the class `Graph()` is ready (see previous module) it is possible to create a second object called `Path(Graph G, vertex s)` that is used to find paths in the graph once a given starting vertex (represented as an integer) is defined. The path class has a method `hasPathTo(vertex v)` returning `True` if there is a route between the starting vertex `s` and the end vertex `v`. 19 | 20 | The graph can be represented with all the methods described in the previous module. Here however we hypothesise that an adjacency-list representation is used. A list of lists is defined, for instance in a graph having 5 nodes: `vertex_list[[], [], [], [], []]`. Each sub-list contains the vertices connected to that particular node. For instance, with the syntax `vertex_list[2][3]` we are accessing the connection 3 of the vertex 2. The information regarding the fact that a vertex has been visited can be stored in an array of booleans `marked_list[False, False, False, False, False]`. The recursive method marks the given vertex and calls itself for any unmarked vertices on its adjacency list. It is possible to use another list called `edgeto_list[]` to store the incoming connection to any vertex. This list is useful if we want to move back from a given node to the starting node. 21 | 22 | ```Python 23 | def depth_first_search(v): 24 | if marked_list[v] == False: 25 | marked_list[v] = True #mark the node 26 | print(v) #print marked node 27 | adjacent_list = vertex_list[v] #get adjacent nodes 28 | for v_adj in adjacent_list: 29 | if marked_list[v_adj] == False: 30 | edgeto_list[v_adj] = v 31 | depth_first_search(v_adj) #recursive call 32 | 33 | ``` 34 | 35 | The following is a representation of the trace of the algorithm on a given graph: 36 | 37 |
38 | *[image: depth-first search trace]* 39 |
40 | 41 | The result of calling the algorithm from the vertex 0 of a graph divided in three groups is the following: 42 | 43 |
44 | *[image: depth-first search result with three connected components]* 45 |
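A minimal driver reproducing this kind of run (it assumes the `depth_first_search()` function and the global lists defined above; the small graph with two components is a hypothetical example, not the one in the original figure):

```Python
#Adjacency list: vertices 0-3 form one component, vertices 4-5 another
vertex_list = [[1, 2], [0, 3], [0], [1], [5], [4]]
marked_list = [False] * 6
edgeto_list = [0] * 6

depth_first_search(0)  #prints 0 1 3 2: only the component of vertex 0 is reached
print(marked_list)     #[True, True, True, True, False, False]
```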
46 | 47 | It is possible to use depth-first search to find the **connected components** of a graph. In this case we keep a connected-component counter `cc_counter` and a list `cc_list[]`. We iterate through the list of vertices stored in `vertex_list[]`, and we run the depth-search if the node is unmarked. When the search finishes we increment the `cc_counter` and we search for the next unmarked node in `vertex_list[]`. To each unmarked vertex encountered along the way, we assign the current index given by the `cc_counter`. The main loop is as follows: 48 | 49 | ```Python 50 | for v in range(V): 51 | if marked_list[v] == False: 52 | cc_list[v] = cc_counter #assign the ID to the starting node 53 | depth_first_search(v) #depth search from the starting node 54 | cc_counter += 1 #increment the ID counter 55 | ``` 56 | 57 | A single line of code must be added at the end of the depth-search function, in order to assign the component id to the vertex: 58 | 59 | ```Python 60 | if marked_list[v_adj] == False: 61 | edgeto_list[v_adj] = v 62 | depth_first_search(v_adj) #recursive call 63 | #Additional code for connected components 64 | cc_list[v_adj] = cc_counter 65 | ``` 66 | 67 | Methods 68 | -------- 69 | 70 | `depth_first_search(v)`: recursive function to depth-search in an adjacency-list of vertices. 71 | 72 | 73 | Applications 74 | ------------ 75 | 76 | 1. financial transactions between partners 77 | 2. hyperlink classification of complex networks 78 | 79 | Quiz 80 | ----- 81 | 82 | - Find if a given graph is bipartite. Bipartite graphs are defined as graphs in which every edge connects two nodes assigned to two different colours (e.g. black and red). 83 | 84 | - Euler tour. Find if there is a cycle that uses each edge exactly once. Based on the [Euler solution](https://en.wikipedia.org/wiki/Eulerian_path) to the [seven Bridges of Königsberg](https://en.wikipedia.org/wiki/Seven_Bridges_of_K%C3%B6nigsberg) problem. 85 | 86 | 87 | Material 88 | -------- 89 | - **Coursera Algorithms Part 1**: week 4 90 | - **Algorithms**, Sedgewick and Wayne (2014): Chapter 4.1 "Undirected Graph" 91 | -------------------------------------------------------------------------------- /red-black-balanced-search-trees/README.md: -------------------------------------------------------------------------------- 1 | 2 | The [red-black](https://en.wikipedia.org/wiki/Red%E2%80%93black_tree) balanced search trees are a particular data structure used to solve the problem of unbalanced trees and to avoid the overhead of 2-3 balanced trees. Red-black trees are at the same time 2-3 trees and standard binary trees: from standard trees they take the simplicity of the search method, whereas from the 2-3 trees they take the balanced insertion method. However, with respect to standard trees, the red-black trees do not get unbalanced if the keys are inserted in increasing order, and they do not have the problems we found in the 2-3 trees (e.g. there are no multiple node types and splitting cases). Comparing this data structure with the ones studied in previous modules we can see the advantages in terms of complexity: 3 | 4 |
5 | *[image: complexity comparison between search-tree implementations]* 6 |
7 | 8 | This data structure guarantees a **time complexity** of O(log N) in all the operations of search, insert, and delete. The **space complexity** is almost identical to a standard tree, because the only additional information required is the colour of the link, which can be stored in a single bit. 9 | 10 | We think of the links as being of two different types: **red links**, which bind together two 2-nodes to represent 3-nodes, and **black links**, which bind together the 2-3 tree. In 2008, Sedgewick introduced a simpler version of the red–black tree, called the left-leaning red–black tree, obtained by eliminating a previously unspecified degree of freedom in the implementation. In this implementation (the same discussed here) all red links must lean left, except during inserts and deletes. 12 | 13 |
14 | *[image: a red link binds two 2-nodes to represent a 3-node]* 15 |
16 | 17 | There are three main properties that must hold in order to have a red-black tree: 18 | 19 | - Red links lean left. 20 | - No node has two red links connected to it. 21 | - The tree has perfect black balance: every path from the root to a null link has the same number of black links. 22 | 23 | The red-black trees, while not perfectly balanced, are always nearly so, regardless of the order in which the keys are inserted. This fact immediately follows from the correspondence with 2-3 trees and the defining property of 2-3 trees (perfect balance). The **height** of a red-black tree with N nodes is no more than 2 log N. The **worst case** is a 2-3 tree that is all 2-nodes except that the leftmost path is made up of 3-nodes: the path taking left links from the root is twice as long as the paths of length log N that involve just 2-nodes. However, the **average length** of a path from the root to a node is ~1.00 log N. 24 | 25 | Implementation 26 | --------------- 27 | 28 | An additional Boolean property called `colour` can be added to the standard code of a balanced tree. Since each node has one and only one incoming link, it is possible to embed the property inside the node itself: if the incoming link is red then `node.colour=Red` (where `Red` is the boolean `True`). An additional method called `isRed(node)` can be used to return the colour of the node. By convention the null links are black. The adjustments to the tree are governed by three fundamental operations (sketched in code after this list): 29 | 30 | 1. **left rotation**: it is used in order to correct misplaced red links (right-leaning red links). The rotation switches the orientation of red links through the method `rotateLeft()`. The left rotation is used when we have a right-leaning red link that needs to be rotated to lean to the left. The idea is to switch the head node `h` right link with the left link of its child node `x`, then assign to the empty left link of `x` the parent node `h`. After those operations it is necessary to reset the colours associated with the two nodes. The colour associated to `x` becomes equal to the previous colour associated to `h` (we set `x.colour=h.colour`). The incoming connection to `h` is red for sure (we set `h.colour=Red`), because `h` has been placed on the left side of `x` and red links lean left by convention. 31 | 32 | 2. **right rotation**: it is done in special cases using the method `rotateRight()`. It is important to notice that this is a temporary operation, and the red link is only temporarily moved to the right. This operation is used when there are two red links in a row on the left side of a tree. A right rotation is used to temporarily adjust the tree, moving the node that is located in between the two red links from the left side to the right side (see the third case in the image below). 33 | 34 | 3. **flip colours**: this operation is done when both children are red. In this case they are both flipped to black, and the incoming link is flipped as well (to red if black, and to black if red). 34 |
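A minimal Python sketch of the three operations (an illustration, not the repository code; it assumes a `Node` with `left`, `right` and `colour` attributes, where `Red = True` and `Black = False`):

```Python
Red, Black = True, False

def isRed(node):
    return node is not None and node.colour == Red  #null links are black by convention

def rotateLeft(h):
    #Turn a right-leaning red link into a left-leaning one
    x = h.right
    h.right = x.left       #the head node h takes the left subtree of x
    x.left = h             #h becomes the left child of x
    x.colour = h.colour    #x takes the old colour of h
    h.colour = Red         #the new left-leaning link to h is red
    return x

def rotateRight(h):
    #Temporarily move a left-leaning red link to the right
    x = h.left
    h.left = x.right
    x.right = h
    x.colour = h.colour
    h.colour = Red
    return x

def flipColours(h):
    #Both children are red: flip them to black and the incoming link to red
    h.colour = Red
    h.left.colour = Black
    h.right.colour = Black
```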
35 | The rotations and flipping operations are all used during the creation of the tree. Here we can see three cases: 36 | 37 | 38 | *[image: the three insertion cases]* 39 |
40 | 41 | All the above operations are only used in the method `put()` and in the method `remove()`; the other methods are the same used in standard binary trees. In particular, it is important to notice that the `get()` method does not examine the node colour, so the balancing mechanism adds no overhead. Search is faster than in elementary binary trees because the tree is balanced. 42 | 43 | Methods 44 | -------- 45 | 46 | `put(key, value)`: insert a new pair of key-value. It must not be allowed to associate a `None` (Python) value. 47 | 48 | `get(key)`: return the value associated with the key. If the key does not exist it is possible to return `None`. 49 | 50 | `remove(key)`: remove the key and the associated value. 51 | 52 | `isRed()`: check the colour of the node 53 | 54 | `rotateLeft()`: implementation of the rotate left operation 55 | 56 | `rotateRight()`: implementation of the rotate right operation 57 | 58 | Applications 59 | ------------ 60 | 61 | 1. [computational geometry](https://en.wikipedia.org/wiki/Computational_geometry): many data structures used in computational geometry can be based on red–black trees. 62 | 2. scheduler: the [Completely Fair Scheduler](https://en.wikipedia.org/wiki/Completely_Fair_Scheduler) used in current Linux kernels uses red–black trees. 63 | 64 | Quiz 65 | ----- 66 | 67 | 68 | 69 | 70 | Material 71 | -------- 72 | - **Coursera Algorithms Part 1**: week 4 73 | - **Algorithms**, Sedgewick and Wayne (2014): Chapter 3.3 "Balanced Search Trees" 74 | -------------------------------------------------------------------------------- /disjoint-set/disjoint_set.py: -------------------------------------------------------------------------------- 1 | 2 | #We shall consider three different implementations, all based on 3 | #using the site-indexed id[] array, to determine whether two sites are in the same connected component. 4 | #The integer inside the array id[] represents the component. For instance if nine elements are grouped in 5 | #three components then inside the array id[] we can have something like: id[0,3,3,0,0,8,8,0,3] 6 | #The numbers 0,3,8 represent the label associated to each component. 7 | # 8 | # 1. Quick-find 9 | # 2. Quick-union 10 | # 3. Weighted-union 11 | 12 | 13 | class QuickFind(): 14 | ''' 15 | In this implementation of the disjoint-set data structure the find() 16 | operation is quick because it only needs to access an element of a list. 17 | The problem with this implementation is that union() needs to scan 18 | through the whole id[] array for each input pair in order to apply the 19 | union operator when needed. This operation must be done in all the cases. 20 | ''' 21 | 22 | def __init__(self, N): 23 | #At the beginning the number of components is equal to the number of elements. 24 | self.id_list = [n for n in range(N)] 25 | self.count = N 26 | 27 | def find(self, p): 28 | #The find operation is really fast. 29 | return self.id_list[p] 30 | 31 | def union(self, p, q): 32 | ''' 33 | This operation can be interpreted as: 34 | I put p and all the elements that are in the same component of p, 35 | into the component of q. 36 | For instance: union(3,8) means that the 37 | element 3 (and all the others in the same component) 38 | must be added into the component of 8. 39 | ''' 40 | p_id = self.find(p) 41 | q_id = self.find(q) 42 | 43 | #we first check whether they are already in the same component, 44 | #in which case there is nothing to do.
45 | if p_id == q_id: return False 46 | 47 | #Here we are faced with the situation that all of the self.id_list entries 48 | #corresponding to sites in the same component as p have one value and 49 | #all of the self.id_list entries corresponding to sites in the same component as q have another 50 | #value. To combine the two components into one, we have to make all of the id[] entries 51 | #corresponding to both sets of sites the same value. 52 | for i, _id in enumerate(self.id_list): 53 | if _id == p_id: self.id_list[i] = q_id 54 | self.count -= 1 55 | return True 56 | 57 | 58 | class QuickUnion(): 59 | ''' 60 | In this implementation of the disjoint-set data structure we focus on speeding 61 | up the union() operation. 62 | 63 | To implement find(), we start at the given site, follow its link to another 64 | site, follow that site's link to yet another site, and so forth, following 65 | links until reaching a root, a site that has a link to itself (which is guaranteed 66 | to happen). Two sites are in the same component if and only if this process 67 | leads them to the same root. 68 | 69 | The quick-union algorithm would seem to be faster than the quick-find algorithm, 70 | because it does not have to go through the entire array for each input pair; 71 | however in the worst case scenario it will have to iterate through the entire 72 | array before finding a root node, for instance in a structure like: 73 | 1>2>3>4>5>6>7>8 when the function union(7,8) is called it has to iterate back 74 | to the root 1 and scan the entire array, leading to complexity O(N^2). 75 | 76 | You can regard quick-union as an improvement over quick-find because it 77 | removes quick-find's main liability (that union() always takes linear time). 78 | This difference certainly represents an improvement for typical data, but 79 | quick-union still has the liability that we cannot guarantee it to be 80 | substantially faster than quick-find in every case. 81 | ''' 82 | 83 | def __init__(self, N): 84 | self.N = N 85 | self.id_list = [n for n in range(N)] 86 | self.count = N 87 | 88 | def find(self, p): 89 | #The find operation is slower than in the previous implementation. 90 | while(self.id_list[p] != p): 91 | p = self.id_list[p] 92 | return p 93 | 94 | def union(self, p, q): 95 | ''' 96 | The union operation is fast, 97 | it only needs to call the find() method and get the root 98 | nodes of both the elements, and use this information to 99 | decide if the elements can be joined (if they are in 100 | different sets) or not (they already are in the same set). 101 | ''' 102 | root_p = self.find(p) 103 | root_q = self.find(q) 104 | #Elements already in the same set 105 | if root_p == root_q: return False 106 | #Elements in different sets: the root of p points to the root of q 107 | self.id_list[root_p] = root_q 108 | self.count -= 1 109 | return True 110 | 111 | 112 | class WeightedUnion(): 113 | ''' 114 | In this implementation of the disjoint-set data structure we focus on optimizing 115 | the union operation when two trees are joined in a common set. Merging the short tree 116 | into the large one, it is possible to keep contained the depth of the resulting tree. 117 | This simple trick allows reducing the complexity to O(log N). 118 | 119 | This improvement requires a new list in order to store the size of the set.
--------------------------------------------------------------------------------
/digraph/README.md:
--------------------------------------------------------------------------------

A directed graph or [digraph](https://en.wikipedia.org/wiki/Directed_graph) is a graph that is a set of vertices connected by edges, where the edges have a direction associated with them. The implementation strategy is the same seen for graphs, and the best option remains to store the connections in a list of lists (adjacency-list). The value stored for each vertex is the list of nodes to which it points (outgoing edges). For this reason the time complexity to find all the connections going to a node (incoming edges) is V+E, because it is necessary to scan all the vertices and all their associated nodes. This image represents the properties of each data-structure implementation for digraphs (a minimal code sketch of the adjacency-list representation follows the figure):


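As a concrete reference for the rest of this section, the following is a minimal adjacency-list digraph sketch (class and attribute names are illustrative, not taken from the repository code); it also includes the `reverse()` method used later by the Kosaraju-Sharir algorithm:

```python
class Digraph():
    def __init__(self, V):
        self.V = V
        self.vertex_list = [[] for _ in range(V)]  #adjacency-list: only outgoing edges are stored

    def add_edge(self, s, v):
        #directed edge s->v: stored only in the list of s
        self.vertex_list[s].append(v)

    def reverse(self):
        #return a new digraph with all the edges switched
        reversed_graph = Digraph(self.V)
        for s in range(self.V):
            for v in self.vertex_list[s]:
                reversed_graph.add_edge(v, s)
        return reversed_graph
```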
**Search**: the first problem we analyse is finding all the vertices reachable from a starting node `s` along directed paths. To solve this problem we can use **depth-first search** and the same code implemented for graphs. We can also reuse the code of **breadth-first search**, since it is exactly the same. Using breadth-first search we can find the shortest path. For instance, if we have a set of vertices {3, 4, 10} and we want to find which one has the shortest path to the vertex 6, we can run the breadth-first algorithm from each of them.

**Topological sort**: this is another application and consists in finding a topological order in a digraph. The edges represent constraints and relationships between the nodes. It is important that the digraph has no cycle: if there is a cycle somewhere then it is not possible to establish a hierarchy between the nodes. To solve this problem it is possible to run a depth-first search and then return the vertices in reverse postorder. The postorder is simply a list containing the nodes in the order in which the depth-first search terminates on them. For instance, starting from 0>1>4 we reach a dead end and we push 4 onto the postorder stack `[4]`. Then we go back to the previous node 1 and check whether we can continue the depth-first search from there. It turns out we cannot, meaning that we can also push 1 onto the postorder stack `[4, 1]`. We can now go back to 0, where the depth-first search can continue because there are other unvisited nodes connected to 0, so for the moment we cannot push 0 onto the postorder stack.


The search continues until all the nodes have been marked. The stack is then complete and contains the order to follow (read from the top of the stack, i.e. in reverse postorder).


The implementation of the algorithm is straightforward and only requires adding a single line of code to the standard depth-first search algorithm. The last line of the method is the one responsible for storing the dead node onto the stack. It runs once the loop over `adjacent_list` is exhausted, meaning that the node `v` has no more unmarked children and can be stored in postorder:

```Python
def depth_first_search(v):
    if marked_list[v] == False:
        marked_list[v] = True
        adjacent_list = vertex_list[v]
        for v_adj in adjacent_list:
            if marked_list[v_adj] == False:
                edgeto_list[v_adj] = v
                depth_first_search(v_adj)
        postorder_list.append(v) #this line does the postorder
```

The topological order is then obtained by reading `postorder_list` in reverse.

**Strong components**: two vertices `v` and `w` are said to be strongly connected if there is a path from `v` to `w` and a path from `w` to `v`. A strong component is a set of vertices that are mutually strongly connected. There may be multiple strong components in the same digraph, depending on its structure. There are different algorithms to solve this problem, but one of the simplest is the **Kosaraju-Sharir algorithm**, which is based on depth-first search. For this algorithm we will need the method `reverse()`, which switches all the edges of a digraph to the opposite direction. An important property of strong components is that they remain the same in the reverse graph. The algorithm is represented in this image (a compact end-to-end sketch follows the figure):


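A compact, self-contained sketch of the whole procedure (building on the hypothetical `Digraph` class sketched earlier; `depth_first_postorder` and `kosaraju_sharir` are illustrative names, not repository code):

```python
def depth_first_postorder(graph):
    #standard depth-first search, storing each vertex
    #once all of its children have been explored (postorder)
    marked_list = [False] * graph.V
    postorder_list = []
    def depth_first_search(v):
        marked_list[v] = True
        for v_adj in graph.vertex_list[v]:
            if marked_list[v_adj] == False:
                depth_first_search(v_adj)
        postorder_list.append(v)
    for v in range(graph.V):
        if marked_list[v] == False:
            depth_first_search(v)
    return postorder_list

def kosaraju_sharir(graph):
    #Phase 1: depth-first postorder of the REVERSED digraph
    postorder_list = depth_first_postorder(graph.reverse())
    #Phase 2: depth-first search on the original digraph, visiting
    #the vertices in reverse postorder; every search tree found
    #this way is one strongly connected component
    cc_list = [None] * graph.V
    cc_counter = 0
    def depth_first_mark(v):
        cc_list[v] = cc_counter
        for v_adj in graph.vertex_list[v]:
            if cc_list[v_adj] is None:
                depth_first_mark(v_adj)
    for v in reversed(postorder_list):
        if cc_list[v] is None:
            depth_first_mark(v)
            cc_counter += 1
    return cc_list
```

Vertices with the same value in the returned `cc_list` belong to the same strong component; the text below walks through the two phases in detail.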
The first phase consists in running a topological sort, using the postorder algorithm, on the reverse digraph. The second phase consists in running a depth-first search from each node of the reverse postorder list, this time on the original graph (not the reversed one). The starting node taken from the list can have multiple unmarked children connected; all of them are assigned to the same component. When the search along one path is finished, it is possible to pass to the next unmarked node of the list and repeat the process, incrementing the component counter. This algorithm is easy to implement, but it has some bottlenecks: it is necessary to run the depth-first search twice, and it is also necessary to compute the reverse graph. The implementation is very similar to the one used for the connected components of an undirected graph. The only difference is that the iteration is done on the `postorder_list[]` (in reverse order) instead of the original vertex list:

```Python
for v in reversed(postorder_list):
    if marked_list[v] == False:
        cc_list[v] = cc_counter   #assign the ID to the starting node
        depth_first_search(v)     #depth-first search from the starting node
        cc_counter += 1           #increment the ID counter
```

Obviously, to obtain the `postorder_list[]` it is first necessary to reverse the graph and find the topological order. It is important to notice that it is also possible to run the first phase on the original graph and the second phase on the inverted graph. This gives the strong connected components of the reverse graph; however, as said above, a graph and its reverse have the same strong connected components.

Implementation
---------------

Methods
--------

`add_edge(s, v)`: add a directed edge between two nodes.

`reverse()`: return the digraph with all the edges switched.

Applications
------------

- Garbage collection: in many programming languages the garbage collector treats memory as a digraph, where each accessible reference (e.g. an object) is a root and all the references reachable from it (e.g. object properties) are sub-nodes. Using the mark-sweep algorithm it is possible to mark all reachable nodes and reclaim the ones that have not been marked during the sweep (they are garbage).

- Web page hyperlinks: they are organised as a digraph, where each website points to other websites.

Quiz
-----




Material
--------
- **Coursera Algorithms Part 1**: week 4
- **Algorithms**, Sedgewick and Wayne (2014): Chapter 4.2 "Directed Graphs"
--------------------------------------------------------------------------------
/minimum-spanning-trees/README.md:
--------------------------------------------------------------------------------

A [minimum spanning tree](https://en.wikipedia.org/wiki/Minimum_spanning_tree), or minimum weight spanning tree, is a subset of the edges of a connected, edge-weighted undirected graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight (a minimal weighted-graph sketch used by the pseudocode in this section follows the figure).


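The pseudocode later in this section assumes a minimal weighted graph stored as an adjacency list of `(neighbour, weight)` tuples. A hedged sketch (illustrative names, not the repository code):

```python
class WeightedGraph():
    def __init__(self, V):
        self.V = V
        self.vertex_list = [[] for _ in range(V)]  #adjacency-list of (neighbour, weight) tuples

    def add_edge(self, s, v, w):
        #undirected weighted edge: stored once for each endpoint
        self.vertex_list[s].append((v, w))
        self.vertex_list[v].append((s, w))

    def edges(self):
        #yield each edge exactly once as a (s, v, w) tuple
        for s in range(self.V):
            for (v, w) in self.vertex_list[s]:
                if s < v:
                    yield (s, v, w)
```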
To understand the implementation of the algorithms it is necessary to define some properties. A **cut** in a graph is a partition of its vertices into two non-empty sets. A **crossing edge** is any edge connecting a node in one set with a node in the other set. Cuts and crossing edges are particularly important in the greedy algorithm.


Implementation
---------------


1. **Greedy algorithm**: this algorithm only works in the particular case of a graph whose edges all have different weights. The idea is to repeatedly look for the crossing edge having the lowest weight and save it. The property at the core of the algorithm is the following: given any cut, the crossing edge of minimum weight is part of the minimum spanning tree. The algorithm starts with a graph with no edges marked black. The first step is to compute a random cut. The crossing edge with minimum weight is marked black and added to the minimum spanning tree. Then it is necessary to find a new cut that does not have any marked crossing edge. Once this new cut has been found, we can take its crossing edge of minimum weight and add it to the tree. The process continues until V-1 edges have been marked black, at which point the minimum spanning tree is complete. The following image represents the computation of the last edge (4-5) of the minimum spanning tree (a code sketch of a single greedy step follows the figure):


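As a concrete illustration of a single greedy step, here is a hedged sketch of selecting the minimum-weight crossing edge for a given cut (using the hypothetical `WeightedGraph` sketched above; `cut_set` is the set of vertices on one side of the cut):

```python
def min_crossing_edge(graph, cut_set):
    #return the minimum-weight edge with exactly one endpoint in cut_set
    best_edge = None
    for (s, v, w) in graph.edges():
        if (s in cut_set) != (v in cut_set):  #crossing edge: endpoints on opposite sides
            if best_edge is None or w < best_edge[2]:
                best_edge = (s, v, w)
    return best_edge
```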
2. **Kruskal's algorithm**: this is an old algorithm (1956) based on a simple principle. The first step is to sort all the edges by ascending weight. The second step consists of a scan through the edges in that order, marking every edge that does not create a cycle with the previously marked edges. The first edge can be marked and directly included in the minimum spanning tree, because it is the edge with the lowest weight and is certainly part of the tree. The following image shows the result of applying the algorithm:


The grey edges are the ones that have been discarded, whereas the black edges are the ones that have been marked because they did not create cycles. The proof of correctness of the algorithm demonstrates that it is a special case of the greedy algorithm. When an edge is marked black, it is because it is a crossing edge in some cut (an edge that creates a cycle is not a crossing edge). Now there is a question: how can we be sure that the marked edge is the one with the lowest weight among the crossing edges? We can be sure because the edges are processed in increasing order of weight, so each marked edge is certainly the minimum-weight crossing edge of its cut.

The most efficient data structure for processing the edges in increasing order of weight is a priority queue. The edges are added to the queue, which will automatically return them starting from the lowest weight.
The most efficient data structure for checking whether adding an edge will create a cycle is the **union-find** discussed in the first module. This data structure allows a complexity of log(V) for checking the cycle, whereas using a depth-first search would take complexity V (see the table below for the complexity required by each step of the algorithm). The idea is to keep a set for each connected component. When a new connection is checked, it is necessary to look whether the two nodes are already in the same component: if they are, the connection would create a cycle. The following is a python-like pseudocode of the algorithm (here made runnable with `heapq`, assuming the `WeightedGraph` sketch above stored in `graph` and a `union_find` object with the interface described in the disjoint-set module):

```python
import heapq

#First we add the edges (s, v, w) to the priority queue,
#keyed on the weight w so that the smallest weight is popped first
priority_queue = []
for (s, v, w) in graph.edges():
    heapq.heappush(priority_queue, (w, s, v))

mst_list = []
while len(priority_queue) > 0:
    w, s, v = heapq.heappop(priority_queue)     #dequeue the minimum-weight edge
    if union_find.is_connected(s, v) == False:  #check whether the vertices are in different components
        union_find.connect(s, v)                #merge the two components
        mst_list.append((s, v, w))              #add the edge to the minimum spanning tree
```

The time complexity of the algorithm is *E log(E)*, and it is easy to understand why by looking at a table with the complexity required by each step:


It is possible to see that the dominant cost is the *log(E)* required by the delete-min operation of the priority queue; this operation must be repeated up to *E* times, leading to *E log(E)* total complexity.


3. **Prim's algorithm**: this is another algorithm for computing the minimum spanning tree. To use the algorithm we need a priority queue data structure, which is used to store temporary edges. Using the priority queue, the edges are automatically sorted from minimum to maximum weight. The idea is to add to the queue only the edges having exactly **one endpoint** in the minimum spanning tree that we are growing. This principle is very important and is the core of the algorithm. We start the search from node 0 of the given graph *G*. We consider all the edges that connect 0 to its neighbours and select the one having minimum weight (in this case 0-7). This edge is added to the tree *T*:


The next step consists in considering the connections from node 7 to its neighbours. All the edges from 7 are added to the priority queue, which is then asked to return the minimum-weight edge. In this case the edge with minimum weight is the one going to node 1:


The edge 1-7 is added to the minimum spanning tree. The search continues, pushing onto the priority queue all the edges connecting node 1 to its neighbours:


The next step is interesting, because the minimum-weight edge in the queue is 0-2, meaning that we have to turn our attention to node 2, which has not been marked yet. The edge 0-2 is added to the minimum spanning tree:


When we mark node 2, some edges become obsolete and could be removed from the queue, because they no longer respect the one-endpoint principle. However, here we are implementing a **lazy** version of the algorithm and we keep these connections in the queue (it does not affect the correctness of the algorithm). The algorithm continues until the last node (in our example node 6) is added to the minimum spanning tree.


For the last node, some connections are already in the queue (solid red) while others have just been added (6-4, dashed red). It is important to notice that before arriving at the next minimum-weight edge (6-2) it will be necessary to remove some obsolete edges from the priority queue (1-2, 4-7, 0-4).


The following is a python-like pseudocode for the implementation of the lazy Prim's algorithm (here made runnable with `heapq`; it assumes the `WeightedGraph` sketch above, stored in `graph`):

```python
import heapq

marked_list = [False] * graph.V  #list of marked nodes
mst_list = []                    #will contain the edges of the minimum spanning tree
priority_queue = []              #min-heap for the temporary edges, keyed on the weight

def add_to_queue(v, priority_queue):
    #Mark the node and push all the edges towards its unmarked neighbours
    marked_list[v] = True
    for (v_adj, weight) in graph.vertex_list[v]:
        if not marked_list[v_adj]:
            heapq.heappush(priority_queue, (weight, v, v_adj))

#The node 0 edges are added to the queue
add_to_queue(0, priority_queue)

#Main loop
while len(priority_queue) > 0:
    #pop the minimum-weight edge from the queue
    (weight, v, w) = heapq.heappop(priority_queue)
    #Lazy step: discard obsolete edges whose endpoints are both already marked
    if marked_list[v] and marked_list[w]:
        continue
    mst_list.append((v, w, weight))  #the edge can be added to the tree
    #Add the neighbours of whichever endpoint is not marked yet
    if not marked_list[v]: add_to_queue(v, priority_queue)
    if not marked_list[w]: add_to_queue(w, priority_queue)
```

The *time complexity* of the lazy version of Prim's algorithm is in the worst case *E log(E)*, because every edge may be inserted into and removed from the binary heap (priority queue), and each heap operation costs *log(E)*. There is a more efficient **eager** version of the algorithm, based on an indexed priority queue, which keeps at most one candidate edge per vertex and achieves *E log(V)*.

Methods
--------

A rough solution to implement a weighted graph is our usual adjacency list, where for each node we store a list of tuples `(s, v, w)` representing the starting node, the end node and the connection weight.

To implement this kind of weighted graph it is possible to define a new class called `Edge()` that takes as parameters the starting vertex `s`, the end vertex `v` and the weight `w`. Some methods can be associated with this class. For instance, a method `return_either()` that returns the head vertex of the edge, and a method `return_other(v)` that returns the other vertex contained in the edge (if passed `s` it returns `v`, if passed `v` it returns `s`). The either and other methods are useful because we are going to store each edge twice, associating it with both of its vertices in the adjacency list. A method `compare_weight(Edge e)` can be implemented in order to compare the weight of the current object with the weight of another edge.

The graph itself can be implemented in a class called `WeightedGraph(V)` that takes in input the total number of vertices. A method called `add_edge(Edge e)` will include an edge object in the class. The edge can be stored in our usual adjacency list; the edge object must be associated with both the vertices that are part of it.

Applications
------------

1.

Quiz
-----




Material
--------
- **Coursera Algorithms Part 2**: week
- **Algorithms**, Sedgewick and Wayne (2014): Chapter 4.3 "Minimum Spanning Trees"
--------------------------------------------------------------------------------